The code behind load-path-filter-function is as follows:
(defvar load-path-filter--cache nil
  "A cache used by `load-path-filter-cache-directory-files'.
The value is an alist. The car of each entry is a list of load suffixes,
such as returned by `get-load-suffixes'. The cdr of each entry is a
cons whose car is a regex matching those suffixes
at the end of a string, and whose cdr is a hash-table mapping directories
to files in those directories which end with one of the suffixes.
These can also be nil, in which case no filtering will happen.
The files named in the hash-table can be of any kind,
including subdirectories.
The hash-table uses `equal' as its key comparison function.")
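If you add (setq load-path-filter-function #'load-path-filter-cache-directory-files) to init.el, this cache variable holds information cached during loading. It is an alist: the car of each entry is a list of load suffixes, and the cdr is a cons whose car is a regexp matching those suffixes and whose cdr is a hash table. The hash table's keys are directories, and each value is the list of files in that directory whose names match the suffix regexp. Here is part of one entry from my load-path-filter--cache: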
((".dll" ".elc" ".elc.gz" ".el" ".el.gz")
"\\(?:\\.\\(?:dll\\|el\\(?:\\.gz\\|c\\(?:\\.gz\\)?\\)?\\)\\)\\'" .
#s(hash-table test equal data
("d:/_D/msys64/home/26633/.emacs.d/elpa/activities-0.7.2" ("activities.elc" "activities.el" "activities-tabs.elc" "activities-tabs.el" "activities-pkg.el" "activities-list.elc" "activities-list.el" "activities-autoloads.el")
"d:/_D/msys64/home/26633/.emacs.d/elpa/asmd-0.1" ("asmd.elc" "asmd.el" "asmd-pkg.el" "asmd-autoloads.el")
"d:/_D/msys64/home/26633/.emacs.d/elpa/bison-mode-20210527.717" ("bison-mode.elc" "bison-mode.el" "bison-mode-pkg.el" "bison-mode-autoloads.el"))))
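To make the shape of an entry concrete, here is a minimal sketch that builds one entry by hand and looks it up by its suffix list. The names my/demo-cache and my/demo-lookup and the directory "/tmp/demo-pkg" are hypothetical, invented for illustration; the real cache is filled by load-path-filter-cache-directory-files, not by code like this.

```elisp
;; Hypothetical names: `my/demo-cache', `my/demo-lookup' and the
;; directory "/tmp/demo-pkg" are made up for illustration.
(defvar my/demo-cache nil
  "Stand-in with the same shape as `load-path-filter--cache'.")

(let* ((suffixes '(".elc" ".el"))
       (table (make-hash-table :test #'equal)))
  ;; Hash-table value: names of the files found in the directory.
  (puthash "/tmp/demo-pkg" '("demo.elc" "demo.el") table)
  ;; One alist entry: (SUFFIXES . (REGEXP . HASH-TABLE)).
  (setq my/demo-cache
        (list (cons suffixes
                    (cons (concat (regexp-opt suffixes) "\\'") table)))))

(defun my/demo-lookup (suffixes dir)
  "Return the files cached for DIR under SUFFIXES in `my/demo-cache'."
  ;; The key is a list of strings, so the lookup must compare with `equal'.
  (let ((entry (alist-get suffixes my/demo-cache nil nil #'equal)))
    (and entry (gethash dir (cdr entry)))))

(my/demo-lookup '(".elc" ".el") "/tmp/demo-pkg")
```

Because the alist key is a list of strings, the default eq comparison of alist-get would never find it, which is why the real code also passes #'equal as the test function.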
(defun load-path-filter-cache-directory-files (path file suffixes)
  "Filter PATH to leave only directories which might contain FILE with SUFFIXES.
PATH should be a list of directories such as `load-path'.
Returns a copy of PATH with any directories that cannot contain FILE
with SUFFIXES removed from it.
Doesn't filter PATH if FILE is an absolute file name or if FILE is
a relative file name with leading directories.
Caches contents of directories in `load-path-filter--cache'.
This function is called from `load' via `load-path-filter-function'."
  (if (file-name-directory file)
      ;; FILE has more than one component, don't bother filtering.
      path
    (pcase-let
        ((`(,rx . ,ht)
          (with-memoization (alist-get suffixes load-path-filter--cache
                                       nil nil #'equal)
            (if (member "" suffixes)
                '(nil ;; Optimize the filtering.
                  ;; Don't bother filtering if "" is among the suffixes.
                  ;; It's a much less common use-case and it would use
                  ;; more memory to keep the corresponding info.
                  . nil)
              (cons (concat (regexp-opt suffixes) "\\'")
                    (make-hash-table :test #'equal))))))
      (if (null ht)
          path
        (let ((completion-regexp-list nil))
          (seq-filter
           (lambda (dir)
             (when (file-directory-p dir)
               (try-completion
                file
                (with-memoization (gethash dir ht)
                  (directory-files dir nil rx t)))))
           path))))))
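First, if FILE contains a directory component (that is, it is a path rather than a bare file name), no filtering is done and PATH is returned unchanged. Otherwise, (alist-get suffixes load-path-filter--cache nil nil #'equal) looks up the cache entry for the given suffix list, and with-memoization creates and stores a new entry if none exists yet. The ht bound by pcase-let is the hash table described above.
With the hash table in hand, seq-filter filters the path list (a list like load-path). For each directory, try-completion checks whether any name in that directory's file list starts with file. The file list comes from with-memoization: if the directory already has an entry in the hash table, the cached list is returned; otherwise directory-files is run to collect the matching file names, and the result is stored in the hash table.
This filtering can greatly reduce the number of directories that load has to search. Previously load might probe every directory in load-path; filtering first and then searching the much smaller subset is considerably faster, because once the cache is populated the filter works from in-memory lists and, apart from the file-directory-p check on each directory, does not need to touch the file system.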
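The two building blocks of that step, prefix matching with try-completion and caching with with-memoization, can be tried in isolation. The sketch below assumes Emacs 29 or later (where with-memoization is available from subr-x) and replaces directory-files with a hypothetical stand-in, my/demo-list-files, that returns a fixed, made-up listing; all my/demo- names are invented for illustration.

```elisp
(require 'subr-x) ; provides `with-memoization' (Emacs 29 and later)

(defvar my/demo-listing-calls 0
  "Counts how often the hypothetical `my/demo-list-files' really runs.")

(defun my/demo-list-files (_dir)
  "Stand-in for `directory-files'; return a fixed, made-up listing."
  (setq my/demo-listing-calls (1+ my/demo-listing-calls))
  '("foo.el" "foo.elc" "foo-autoloads.el"))

(defun my/demo-might-contain-p (file dir table)
  "Non-nil if DIR's listing cached in TABLE has a name starting with FILE."
  (let ((completion-regexp-list nil))
    (try-completion file
                    ;; Reuse the cached listing for DIR when present,
                    ;; otherwise compute it once and store it in TABLE.
                    (with-memoization (gethash dir table)
                      (my/demo-list-files dir)))))

(let ((table (make-hash-table :test #'equal)))
  (list (my/demo-might-contain-p "foo" "/tmp/demo" table)  ; some name starts with "foo"
        (my/demo-might-contain-p "bar" "/tmp/demo" table)  ; no name starts with "bar"
        my/demo-listing-calls))                            ; how often the listing ran
```

Since try-completion matches prefixes, a directory holding only foo-autoloads.el would also pass the filter for "foo". The filter therefore keeps directories that might contain the file, and the real lookup is still done afterwards on the reduced list.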
In lread.c, the patch also modified the definition of load (e5218df), adding the following:
Lisp_Object load_path = Vload_path;
if (FUNCTIONP (Vload_path_filter_function))
load_path = calln (Vload_path_filter_function, load_path, file, suffixes);
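Following the logic above, the first call to load caches the contents of every directory in path. Later calls to load can then use this information to filter out as many directories as possible, which is what makes loading fast. The discussion of this implementation took place in February, April and May of 2025: Speeding up loading when load-path has many packages.
Incidentally, my load-path has 130 entries, and running directory-files over all of them with mapc takes about 8.6 ms.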
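To see how much a single lookup is narrowed down, the filter can also be called directly. This is only a sketch: my/demo-filtered-count is a hypothetical helper, and the fboundp guard is there because the filter function does not exist in Emacs versions that predate this patch.

```elisp
(defun my/demo-filtered-count (file)
  "Return (TOTAL . KEPT) directory counts for FILE, or nil if unsupported.
TOTAL is the length of `load-path'; KEPT is how many directories
survive the filter for FILE with the standard load suffixes."
  (when (fboundp 'load-path-filter-cache-directory-files)
    (cons (length load-path)
          (length (load-path-filter-cache-directory-files
                   load-path file (get-load-suffixes))))))

;; Example call; the numbers depend entirely on the local setup.
(my/demo-filtered-count "subr-x")
```

The first such call fills the cache for this suffix list, so calling it again with another file name exercises the cached path.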
(length load-path)
130
(benchmark-run (mapc (lambda (x) (directory-files x nil)) load-path))
(0.008635 0 0.0)