Does anyone have experience using Emacs as a mail client on a Mac?

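Xapian's CJK search is rather weak: it can only match two-character words, so a search for “中国人” fails. I wrote a small piece of code that automatically splits each word into two-character pieces and then runs the search.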

;;
;; Xapian, the search engine behind mu, has poor support for CJK characters:
;; only queries containing no more than 2 CJK characters work.
;;
;; https://researchmap.jp/?page_id=457
;;
;; This workaround breaks any CJK word longer than 2 characters into
;; a combination of bi-grams. Example: 我爱你 -> (我爱 爱你)
;;
(defun mu4e-goodies~break-cjk-word (word)
  "Break CJK WORD into a string of bi-grams, e.g. 我爱你 -> 我爱 爱你.
WORD is returned unchanged if it is short, pure ASCII, or mixes
ASCII with other characters."
  (let ((pos (string-match ":" word)))
    (cond
     ;; Nothing to do: at most 2 characters, or ASCII only
     ;; (character count equals byte count).
     ((or (<= (length word) 2)
          (= (length word) (string-bytes word)))
      word)
     ;; Field prefix such as "s:abc": keep the prefix, break the rest.
     (pos
      (concat (substring word 0 (1+ pos))
              (mu4e-goodies~break-cjk-word (substring word (1+ pos)))))
     ;; ASCII mixed with other characters, e.g. abcあいう: leave as is.
     ((memq 'ascii (find-charset-string word))
      word)
     ;; Pure non-ASCII word: join the overlapping bi-grams with spaces.
     (t
      (let ((chars (split-string word "" t))
            (bigrams nil))
        (while (cdr chars)
          (push (concat (car chars) (cadr chars)) bigrams)
          (setq chars (cdr chars)))
        (mapconcat #'identity (nreverse bigrams) " "))))))

(defun mu4e-goodies~break-cjk-query (expr)
  "Break the CJK strings in query EXPR into bi-grams."
  ;; Rewrite each space-separated word, then join the words back together.
  (mapconcat #'mu4e-goodies~break-cjk-word
             (split-string expr " " t)
             " "))

;; Have mu4e pass every search expression through the function above.
(setq mu4e-query-rewrite-function #'mu4e-goodies~break-cjk-query)
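
To see what the bi-gram step produces without touching your mu4e setup, something like the following can be evaluated in a scratch buffer or `M-x ielm`. It is only a minimal standalone sketch, not the code from the project: `my/cjk-bigrams` is a made-up helper name, and it only covers the pure-CJK case (no `field:` prefix, no mixed ASCII).

```elisp
;; Hypothetical helper, for illustration only: split a string into
;; overlapping two-character pieces (bi-grams).
(defun my/cjk-bigrams (word)
  "Return the list of overlapping bi-grams of WORD.
A WORD of two characters or fewer is returned as a one-element list."
  (if (<= (length word) 2)
      (list word)
    (let ((grams nil))
      ;; A word of N characters has N-1 overlapping bi-grams.
      (dotimes (i (1- (length word)))
        (push (substring word i (+ i 2)) grams))
      (nreverse grams))))

(my/cjk-bigrams "中国人")                              ; => ("中国" "国人")
(mapconcat #'identity (my/cjk-bigrams "我爱你") " ")   ; => "我爱 爱你"
```

So a query for 中国人 would be sent to mu as the two terms 中国 and 国人, each short enough for Xapian to match.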

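The `mu4e-goodies~break-cjk-query` workaround is included in a GitHub project of mine that collects quite a few mu4e extensions I have written.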
