绘制Vanilla Emacs里面所有函数的引用关系图。
这是我大概一个月前想到的emacs lisp趣味挑战,原本想在论坛上办一个挑战比赛。但是因为紫薯布丁被吃完了,没有合适的奖品,所以只好作罢。
我的想法很简单,穷搜所有的函数符号。如果一个函数是字节码,想办法找到它所在的文件,加载那个文件就可以得到它原本的定义了。对于宏展开和词法绑定依赖我没有考虑那么多,因为其实有很多字节码是追溯不到源头的。虽然反汇编字节码也不是不能做,但是做全套有点超出我的能力了。因此我的图实际上几乎并不包含隐蔽的依赖(那是调试器级别的工作),更多的是一些表面的符号依赖关系。
我选定的工具是graphviz dot,虽然它不支持动态演示和并行处理,但是可以产生很大的图片。这个体量的数据使用d3.js处理起来应该也足够了,但是用elisp生产json比较麻烦,所以我弃了。
我的代码生成了一个大约10M的dot文件,使用sfdp layout,生成了一个172M的svg文件。普通的图片查看器根本加载不了这么大的svg文件,一打开就崩了,librsvg也处理不了超过100000行的svg,inkscape可以打开(我这台32G内存的机器内存爆满了)但很卡。然后我发现,inkscape居然不支持小于1%的缩放比,此需求至今仍未实现。所以最后,我放弃了矢量图,输出了一张大小为76M的png(上图是压缩过的)
以下贴上代码实现,感兴趣的坛友可以自己挑战一下(无奖),或者抄我的作业做点更漂亮的图。
;;; roots-of-lisp.el --- Generate a relationship graph of Elisp functions -*- lexical-binding: t -*-
;; generate a dependency deps of all lisp calls
;; tested in Vanilla Emacs 30.1-4
;;
;; emacs -q --batch -l roots-of-lisp.el
(defun build-function-call-graph ()
"Build a deps of function calls from all Elisp functions."
(let ((deps (make-hash-table :test 'equal))
(type (make-hash-table :test 'equal)))
(mapatoms (lambda (symbol)
(when (fboundp symbol)
(classify-function symbol deps type)
;; debug print
;;(message "%S :: %S <- %S" symbol (gethash symbol type) (gethash symbol deps))
)))
(visualize-function-call-graph-dot deps type)))
(defun classify-function (symbol deps type)
"Find all function depends of SYMBOL, store a list of them in DEPS, keep a tag in TYPE."
(unless (gethash symbol type)
(let ((func (symbol-function symbol)))
(cond
((symbolp func)
(puthash symbol 'symbol type)
(puthash symbol (list func) deps))
((primitive-function-p func)
(puthash symbol 'primitive-function type))
((special-form-p func)
(puthash symbol 'special-form type))
((and (listp func)
(equal (car func) 'autoload))
(if (gethash symbol deps)
;; autoload does not help
(puthash symbol 'ill-autoload-form type)
;; if it is not loaded yet, try to autoload it first
(condition-case nil
(autoload-do-load func)
(:success
(message "searching %S" symbol)
(puthash symbol (find-func-deps nil func) deps)
(classify-function symbol deps type))
(t (puthash symbol 'load-error type))))
)
((closurep func)
(cond
((interpreted-function-p func)
(puthash symbol 'interpreted type))
((commandp symbol)
(puthash symbol 'command type))
(t
(puthash symbol 'closure type))
)
;; parse it
(let ((args (aref func 0))
(body (aref func 1)))
(if (listp body)
(progn
(message "searching %S" symbol)
(puthash symbol (find-func-deps args body) deps)
)
(puthash symbol 'byte-compiled-macro-generated type)))
)
((macrop func)
(puthash symbol 'macro type)
(let ((args (aref (cdr func) 0))
(body (aref (cdr func) 1)))
(if (listp body)
(progn
(message "searching %S" symbol)
(puthash symbol (find-func-deps args body) deps)
)
(puthash symbol 'byte-compiled-macro-generated type))))
((or (byte-code-function-p func)
(native-comp-function-p func)
(subrp func))
;; try to load it again from source code
(let ((def (find-symbol-definition-exclusive symbol)))
(cond
((equal def 'C-source)
(puthash symbol 'C-source type))
((closurep def)
(classify-function symbol deps type))
(t
(puthash symbol 'subr type))
))
)
(t ;; unknow type
(puthash symbol 'unknown type))
))))
(defvar also-C-source nil
"Whether search in Emacs C source code.")
(defun find-symbol-definition (symbol)
"Find Lisp definition of given SYMBOL of TYPE, for test."
(let* ((func (symbol-function symbol))
(def (find-definition-noselect
symbol nil (find-lisp-object-file-name
symbol func also-C-source))))
(with-current-buffer (car def)
(goto-char (cdr def))
(eval-expression (sexp-at-point))))
(symbol-function symbol))
(defvar library-blacklist
'(;; macroexpansion exceed
"/usr/share/emacs/30.1/lisp/emacs-lisp/cconv.el.gz"
;; macroexpansion exceed
"/usr/share/emacs/30.1/lisp/emacs-lisp/macroexp.el.gz"
;; depends on specific network env
"/usr/share/emacs/30.1/lisp/net/eudc-bob.el.gz"
;; should not be loaded twice
"/usr/share/emacs/30.1/lisp/international/characters.el.gz"
;; emacs will crash when loading this
"/usr/share/emacs/30.1/lisp/erc/erc.el.gz"
;; BUG: failed to search `latin1-display-ucs-per-lynx'
"/usr/share/emacs/30.1/lisp/international/latin1-disp.el.gz"
;; BUG: failed to search `comp--add-cond-cstrs'
"/usr/share/emacs/30.1/lisp/emacs-lisp/comp.el.gz"
)
"Library paths that will not be loaded.
Some Emacs libraries contains malformed `cl-defmacro' which may cause
stackoverflow or other dangerous forms. We had better skip them before
get prepared for them.")
(defun find-symbol-definition-exclusive (symbol)
"Find Lisp definition of given SYMBOL exclusively."
(let* ((func (symbol-function symbol))
(def (find-lisp-object-file-name symbol func also-C-source)))
(cond
((and (stringp def)
(string-match "\\.el\\(\\.gz\\)?$" def))
(unless (file-regular-p def)
;; you may get gzipped elisp libraries.
(setq def (concat def ".gz")))
(when (and (file-regular-p def)
(not (member def library-blacklist)))
(condition-case nil
(load-file def)
(:success
;; symbol may get redefined on loading!
(when symbol (symbol-function symbol)))
(t 'load-error)))
)
(t def))))
(defun find-func-deps (args body)
;; Here we simply try to find symbols that are functions.
;; Neglect inner dynamic bindings in lambda expressions.
;; For more complexed macro forms, it's hard to go further.
;; Doing macroexpansion will not solve it completely
;; because outer lexical bindings may be not parseable.
;; to completely parse a function, you will need something more.
(cond
((null body) nil)
((listp body)
(if (listp (cdr body))
;; regular list
;; first try to match a lambda form
(if (and (equal (car body) 'lambda)
(listp (cadr body))
(caddr body))
(find-func-deps (append args (cadr body))
(caddr body))
;; we do DFS on each entry
;; this may capture some explicit apply forms
;; but may cause problems when there is an infinite list generator
;; after all, let's forget about macros first.
(cl-remove-duplicates
(cl-loop for item in body append
(find-func-deps args item))))
;; dotted pair
(cl-remove-duplicates
(append
(find-func-deps args (car body))
(find-func-deps args (cdr body)))))
)
(t ;; an atom
(when (and (symbolp body)
(fboundp body)
(not (member body args)))
(list body)))))
(defun visualize-function-call-graph-dot (deps type)
"Visualize the function call deps GRAPH using Graphviz."
(with-temp-buffer
(insert "digraph G {\n")
(insert " node [colorscheme=accent8];\n\n")
(let ((nodemap (make-hash-table))
(cnt 0))
(maphash (lambda (func type)
(insert
(format " f%d[label=\"%s\",color=%s];\n" cnt func
(pcase type
('C-source "1")
('primitive-function "2")
('special-form "3")
('interpreted-function "4")
('closure "5")
('macro "6")
('byte-compiled-macro-generated "7")
('command "8")
(_ "\"red\""))))
(puthash func cnt nodemap)
(setq cnt (1+ cnt)))
type)
(maphash (lambda (func calls)
(dolist (call calls)
(when (gethash call nodemap)
(insert (format " f%d -> f%d;\n"
(gethash func nodemap)
(gethash call nodemap))))))
deps))
(insert "}\n")
(write-region (point-min) (point-max) "function-call-graph.dot")
;; this could take hours
;; (shell-command "dot -Ksfdp -Tpng function-call-graph.dot -o function-call-graph.png")
))
(build-function-call-graph)
执行上面这份代码大概需要5-10分钟。
sfdp -x -Tpng function-call-graph.dot -o function-call-graph.png
大概需要5分钟。