How to search by pinyin initials in helm

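Today I saw that someone online got helm to filter the file list by pinyin, using the pinyin-search package. I tried to do the same for the switch-buffer part, but my skills are probably not up to it and I could not get it working. So here is the other author's work.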

Original author's article: helm-mode打开文件支持中文搜索 (Chinese search support when opening files in helm-mode)

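;; First, this feature relies on functions from pinyin-search to build
;; regexps that map pinyin letters to Chinese characters.
(setq helm-pinyin-search-p t)
(when helm-pinyin-search-p
  (require 'pinyin-search))


;; Pinyin search in helm-find-files
;; Change the regexp builder from an ASCII-only one to a mixed Chinese/English one
;; (the real character classes contain many more characters than shown here):
;; helm--mapconcat-pattern:
;; he => [^h]*h[^e]*e
;; helm--mapconcat-pinyin-pattern:
;; he => [^h哈]*[h哈][^e额]*[e额]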


(defsubst helm--mapconcat-pinyin-pattern (pattern)
    "Transform string PATTERN in regexp for further fuzzy matching.
e.g helm.el$
    => \"[^h哈]*[h哈][^e额]*[e额][^l]*l[^m]*m[^.]*[.][^e]*e[^l]*l$\"
    ^helm.el$
    => \"helm[.]el$\"."
    (let ((ls (split-string-and-unquote pattern "")))
      (if (string= "^" (car ls))
          ;; Exact match.
          (mapconcat (lambda (c)
                       (if (and (string= c "$")
                                (string-match "$\\'" pattern))
                           c (regexp-quote c)))
                     (cdr ls) "")
        ;; Fuzzy match.
        (mapconcat (lambda (c)
                     (if (and (string= c "$")
                              (string-match "$\\'" pattern))
                         c (let ((pinyin-pattern (pinyinlib-build-regexp-string c)))
                             (if (< (length pinyin-pattern) 3)
                                 c
                               (format "[^%s]*%s" (substring pinyin-pattern 1 -1) pinyin-pattern)))))
                   ls ""))))
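
To make the `he => [^h哈]*[h哈][^e额]*[e额]` transformation easier to follow, here is a minimal sketch that runs without helm or pinyinlib. Everything named `my/toy-...` is hypothetical, and the two-entry table is only an assumption for illustration; in the code above the real character classes come from `pinyinlib-build-regexp-string`. The sketch only handles plain letters, not regexp special characters.

```emacs-lisp
;; Hypothetical toy table; pinyinlib ships the real, much larger one.
(defvar my/toy-pinyin-table '((?h . "哈") (?e . "额"))
  "Toy map from a pinyin initial to the Chinese characters it stands for.")

(defun my/toy-char-class (char)
  "Return the characters CHAR may stand for, as a string without brackets."
  (concat (string char) (alist-get char my/toy-pinyin-table "")))

(defun my/toy-fuzzy-pattern (input)
  "Build a fuzzy regexp from INPUT, one \"[^X]*[X]\" group per character."
  (mapconcat (lambda (char)
               (let ((class (my/toy-char-class char)))
                 (format "[^%s]*[%s]" class class)))
             input ""))

(my/toy-fuzzy-pattern "he")
;; => "[^h哈]*[h哈][^e额]*[e额]"
```

With this pattern, typing `he` matches a name such as `哈x额.txt` as well as `helm.el`.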


;; Then replace helm--mapconcat-pattern with helm--mapconcat-pinyin-pattern in the file-finding function:

(defun helm-ff--transform-pattern-for-completion (pattern)
    "Maybe return PATTERN with its basename modified as a regexp.
This happens only when `helm-ff-fuzzy-matching' is enabled.
This provides a behavior similar to `ido-enable-flex-matching'.
See also `helm--mapconcat-pinyin-pattern'.
If PATTERN is a URL, return it unmodified.
When PATTERN contains a space, fall back to multi-match.
If basename contains one or more spaces, fall back to multi-match.
If PATTERN is a valid directory name, return PATTERN unchanged."
    ;; handle bad filenames containing a backslash.
    (setq pattern (helm-ff-handle-backslash pattern))
    (let ((bn      (helm-basename pattern))
          (bd      (or (helm-basedir pattern) ""))
          ;; Trigger tramp connection with file-directory-p.
          (dir-p   (file-directory-p pattern))
          (tramp-p (cl-loop for (m . f) in tramp-methods
                            thereis (string-match m pattern))))
      ;; Always regexp-quote base directory name to handle
      ;; crap dirnames such e.g bookmark+
      (cond
       ((or (and dir-p tramp-p (string-match ":\\'" pattern))
            (string= pattern "")
            (and dir-p (<= (length bn) 2))
            ;; Fix Issue #541 when BD have a subdir similar
            ;; to BN, don't switch to match plugin
            ;; which will match both.
            (and dir-p (string-match (regexp-quote bn) bd)))
        ;; Use full PATTERN on e.g "/ssh:host:".
        (regexp-quote pattern))
       ;; Prefixing BN with a space call multi-match completion.
       ;; This allow showing all files/dirs matching BN (Issue #518).
       ;; FIXME: some multi-match methods may not work here.
       (dir-p (concat (regexp-quote bd) " " (regexp-quote bn)))
       ((or (not (helm-ff-fuzzy-matching-p))
            (string-match "\\s-" bn))    ; Fall back to multi-match.
        (concat (regexp-quote bd) bn))
       ((or (string-match "[*][.]?.*" bn) ; Allow entering wildcard.
            (string-match "/$" pattern)     ; Allow mkdir.
            (string-match helm-ff-url-regexp pattern)
            (and (string= helm-ff-default-directory "/") tramp-p))
        ;; Don't treat wildcards ("*") as regexp char.
        ;; (e.g ./foo/*.el => ./foo/[*].el)
        (concat (regexp-quote bd)
                (replace-regexp-in-string "[*]" "[*]" bn)))
       (t (concat (regexp-quote bd)
                  (if (>= (length bn) 2) ; wait 2nd char before concating.
                      (progn
                        ;; (print (helm--mapconcat-pinyin-pattern bn))
                        (helm--mapconcat-pinyin-pattern bn))
                    (concat ".*" (regexp-quote bn))))))))


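;; 4 Pinyin search in helm-multi-files and helm-projectile
;; Searching works differently in these two commands because the candidates contain the full path.
;; 4.1 match
;; Decides whether the pattern matches a string.
;; 4.2 search is the function that does the actual searching and filtering
;; It searches and filters the candidates.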


(cl-defun helm-mm-3-match (str &optional (pattern helm-pattern))
    "Check if PATTERN match STR.
When PATTERN contains a space, it is split and matching is done
with the several resulting regexps against STR.
e.g \"bar foo\" will match \"foobar\" and \"barfoo\".
Argument PATTERN, a string, is transformed in a list of
cons cell with `helm-mm-3-get-patterns' if it contain a space.
e.g \"foo bar\"=>((identity . \"foo\") (identity . \"bar\")).
Then each predicate of cons cell(s) is called with regexp of same
cons cell against STR (a candidate).
i.e (identity (string-match \"foo\" \"foo bar\")) => t."
    (let ((pat (helm-mm-3-get-patterns pattern)))
      (let ((source-name (assoc-default 'name (helm-get-current-source))))
        ;; (print (concat "8 " source-name))
        (if (string= source-name "Recentf")
            (cl-loop for (predicate . regexp) in pat
                     always (funcall predicate
                                     (condition-case _err
                                         ;; FIXME: Probably do nothing when
                                         ;; using fuzzy leaving the job
                                         ;; to the fuzzy fn.
                                         (string-match
                                          (concat "\\(" regexp "\\)\\|\\(" (pinyin-search--pinyin-to-regexp regexp) "\\)") str)
                                       (invalid-regexp nil))))
          (cl-loop for (predicate . regexp) in pat
                   always (funcall predicate
                                   (condition-case _err
                                       ;; FIXME: Probably do nothing when
                                       ;; using fuzzy leaving the job
                                       ;; to the fuzzy fn.
                                       (string-match regexp str)
                                     (invalid-regexp nil))))))))

If you mean helm-buffers-list, you can try the code below (it uses pinyinlib):

(require 'pinyinlib)

;; Set to nil to fall back to plain regexp matching.
(setq zjy/helm-match-in-pinyin t)

(defun zjy/pinyin-match (pattern candidate)
  "Match CANDIDATE against PATTERN, expanding letters to pinyin when enabled."
  (let ((case-fold-search t))
    (if zjy/helm-match-in-pinyin
        (string-match (pinyinlib-build-regexp-string pattern) candidate)
      (string-match pattern candidate))))

(defun zjy/helm-buffer--match-pattern (pattern candidate &optional nofuzzy)
  "Replacement for `helm-buffer--match-pattern' that matches with `zjy/pinyin-match'."
  (let ((bfn (if (and helm-buffers-fuzzy-matching
                      (not nofuzzy)
                      (not helm-migemo-mode)
                      (not (string-match "\\`\\^" pattern)))
                 #'helm-buffer--memo-pattern
               #'identity))
        (mfn (if helm-migemo-mode
                 #'helm-mm-migemo-string-match #'zjy/pinyin-match)))
    (if (string-match "\\`!" pattern)
        ;; A leading "!" negates the match.
        (not (funcall mfn (funcall bfn (substring pattern 1))
                      candidate))
      (funcall mfn (funcall bfn pattern) candidate))))

(advice-add #'helm-buffer--match-pattern :override #'zjy/helm-buffer--match-pattern)
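
For anyone who wants to see the leading-`!` handling in isolation, here is a minimal sketch that needs neither helm nor pinyinlib. `my/match-with-negation` is a hypothetical name, and `string-match-p` is only a stand-in for the pinyin matcher.

```emacs-lisp
(defun my/match-with-negation (match-fn pattern candidate)
  "Match CANDIDATE with MATCH-FN; a PATTERN starting with \"!\" inverts the result."
  (if (string-prefix-p "!" pattern)
      ;; Strip the "!" and negate whatever the matcher says.
      (not (funcall match-fn (substring pattern 1) candidate))
    ;; Normalize the match position to t.
    (and (funcall match-fn pattern candidate) t)))

(my/match-with-negation #'string-match-p "scratch" "*scratch*")   ; => t
(my/match-with-negation #'string-match-p "!scratch" "*scratch*")  ; => nil
(my/match-with-negation #'string-match-p "!scratch" "init.el")    ; => t
```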

PS. As far as pinyin filtering goes, ivy/selectrum are much more convenient: it is easy to swap out the filter function.

Thanks, I'll try it when I get back.

I have always used iswitchb-pinyin for this; it lets every helm window match on pinyin initials.

(require 'iswitchb-pinyin)
;; Match on Chinese pinyin initials; this makes helm-find-files match too many candidates.
(cl-defun helm-mm-3-match/around (orig-fn str &rest args)
    (apply orig-fn (concat str "|" (str-unicode-to-pinyin-initial str)) args))
(advice-add 'helm-mm-3-match :around #'helm-mm-3-match/around)
;; Add a space to the initial input by default to work around the over-matching.
(defun helm-find-files-1/around (orig-fn fname &rest args)
    (apply orig-fn (concat fname " ") args))
(advice-add 'helm-find-files-1 :around #'helm-find-files-1/around)
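
The idea behind the first advice is to append the pinyin-initial form of each candidate after a `|`, so that an ordinary regexp can hit either half. Here is a minimal sketch that runs without iswitchb-pinyin; the `my/toy-...` names and the two-entry table are hypothetical stand-ins for `str-unicode-to-pinyin-initial`, whose real behavior I have not reproduced.

```emacs-lisp
;; Hypothetical two-entry table; a real library covers far more characters.
(defvar my/toy-initials '((?中 . ?z) (?文 . ?w))
  "Toy map from a Chinese character to its pinyin initial.")

(defun my/toy-to-initials (str)
  "Replace each character of STR found in the toy table with its initial."
  (concat (mapcar (lambda (c) (alist-get c my/toy-initials c)) str)))

(defun my/toy-augment (str)
  "Return STR followed by \"|\" and its pinyin-initial form."
  (concat str "|" (my/toy-to-initials str)))

(my/toy-augment "中文.txt")                        ; => "中文.txt|zw.txt"
(string-match-p "zw" "中文.txt")                   ; => nil
(string-match-p "zw" (my/toy-augment "中文.txt"))  ; => 7
```

Because every candidate gets longer and contains its name twice, it is plausible that more candidates match than expected, which fits the over-matching noted in the comment above.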

I don't know if it's just my impression, but pinyinlib seems a bit faster than iswitchb-pinyin.

It probably is a bit slower, but it feels fine for this use. The original poster's approach is probably more complete.

The real cause of the slowness is not whether the regexp is generated with pinyinlib or in some other way.

I just went through helm's find-files and switch-buffer logic and found a lot of problems. There are many repeated match operations; no wonder it is slower than ivy.

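Pinyin support should not live in helm-mm-3-match. That function sits at the innermost end of the loop, which means that if 1000 files are waiting to be filtered, the pinyin regexp is regenerated 1000 times. That is where the time goes.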

But helm's internals are a tangled mess, and improving performance would take fairly large changes.
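
The cost difference can be shown with a small sketch. This is not a patch for helm; all `my/...` names are hypothetical, and `regexp-quote` is just a cheap stand-in for the pinyin regexp generation so the number of builds can be counted.

```emacs-lisp
(require 'cl-lib)

(defvar my/regexp-build-count 0
  "How many times the stand-in regexp builder has run.")

(defun my/build-regexp (pattern)
  "Stand-in for pinyin regexp generation; counts its own calls."
  (cl-incf my/regexp-build-count)
  (regexp-quote pattern))

(defun my/filter-slow (pattern candidates)
  "Rebuild the regexp for every candidate, like a per-candidate matcher."
  (cl-remove-if-not
   (lambda (cand) (string-match-p (my/build-regexp pattern) cand))
   candidates))

(defun my/filter-fast (pattern candidates)
  "Build the regexp once, then reuse it for every candidate."
  (let ((regexp (my/build-regexp pattern)))
    (cl-remove-if-not
     (lambda (cand) (string-match-p regexp cand))
     candidates)))

(let ((files (mapcar (lambda (i) (format "file-%d.txt" i))
                     (number-sequence 1 1000))))
  (setq my/regexp-build-count 0)
  (my/filter-slow "file-1" files)
  (message "slow: %d builds" my/regexp-build-count)   ; slow: 1000 builds
  (setq my/regexp-build-count 0)
  (my/filter-fast "file-1" files)
  (message "fast: %d builds" my/regexp-build-count))  ; fast: 1 builds
```

Both functions return the same candidates; only the number of regexp builds differs.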

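Would you consider maintaining helm? Or would you recommend migrating to ivy? I only just started using helm because I learned it is quite feature-complete, and spacemacs currently seems to depend on it fairly heavily. So I moved from ivy to helm, and it feels fine so far.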

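I haven't studied helm much, so I'm not considering taking it over for now, and the original author doesn't seem to be looking for someone to take over either.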

I'll write a patch for my own use first and release it once it is stable, or fork the source to fix things and add the features I want.

Between ivy and helm, are you actually still using helm? Or are you using both?

I use helm. At one point I wanted to migrate to ivy, but it never felt comfortable.

:point_right: helm-pinyin: add pinyin search to helm