Continuing the discussion from "How to combine some Vim operations with Chinese (e.g. word segmentation)":
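I find that double-click word selection in Google Chrome and the three-finger tap lookup (Dictionary) on macOS both work reasonably well (I am talking about Chinese text here, obviously). So although Chinese word segmentation is said to be hard, that does not stop us from building simple applications on top of it, such as picking the word at point. "Picking a word" is a guess at what the user is about to select, so the occasional wrong guess is entirely expected. Either way, having it beats not having it. I just made a tentative attempt at implementing a word-picking command: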
;; For testing:
;; 如何实现中文取词
(defvar mark-chinese-word--words '("如何" "实现" "中文" "取词")
  "Toy dictionary of known Chinese words, used only for testing.")
(defun mark-chinese-word--substrings (string nth)
  "Return all substrings of STRING that contain the character at index NTH.
Each element has the form (SUBSTRING BEFORE . AFTER), where BEFORE is the
number of characters taken before index NTH and AFTER the number taken
from index NTH onward."
  (let ((before-bound (1+ nth))
        (after-bound (1+ (- (length string) nth)))
        (before 0)
        after
        result)
    (while (< before before-bound)
      (setq after 1)
      (while (< after after-bound)
        (push (cons (substring string (- nth before) (+ nth after))
                    (cons before after))
              result)
        (setq after (1+ after)))
      (setq before (1+ before)))
    result))
;; (mark-chinese-word--substrings "中文取词" 2)
;; => (("中文取词" 2 . 2) ("中文取" 2 . 1) ("文取词" 1 . 2) ("文取" 1 . 1) ("取词" 0 . 2) ("取" 0 . 1))
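(require 'cl-lib) ; for `cl-loop'

(defun mark-chinese-word ()
  "Mark a Chinese word at point."
  (interactive)
  (let* ((str (thing-at-point 'word t))
         (bounds (bounds-of-thing-at-point 'word))
         ;; Offset of point inside the run of characters Emacs treats as a word.
         (nth (and bounds (- (point) (car bounds))))
         ;; First candidate substring that is a known dictionary word.
         (word (and str nth
                    (cl-loop for s in (mark-chinese-word--substrings str nth)
                             when (member (car s) mark-chinese-word--words)
                             return s))))
    (if word
        (progn (set-mark (- (point) (cadr word)))
               (goto-char (+ (point) (cddr word))))
      ;; No dictionary match: fall back to marking a single character.
      (set-mark (point))
      (forward-char 1))))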
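To make the "wrong guesses are expected" point concrete, here is a minimal sketch that lists every dictionary word covering a given position. The helper `my-chinese-word-candidates` is hypothetical and not part of the command above, and "文取" is a made-up dictionary entry added only to force an ambiguity.

```elisp
;; Hypothetical helper for illustration; not part of `mark-chinese-word'.
(defun my-chinese-word-candidates (string nth words)
  "Return every entry of WORDS that occurs in STRING and covers index NTH.
Each result has the form (WORD START . END), where START and END are
indices into STRING."
  (let (result)
    (dolist (word words)
      ;; Skip empty entries, which would match at every position.
      (when (> (length word) 0)
        (let ((start 0)
              pos)
          ;; Visit every literal occurrence of WORD in STRING.
          (while (setq pos (string-match (regexp-quote word) string start))
            (when (and (<= pos nth) (< nth (+ pos (length word))))
              (push (cons word (cons pos (+ pos (length word)))) result))
            (setq start (1+ pos))))))
    (nreverse result)))

;; Index 5 of the test sentence is "文".
;; "文取" is a made-up entry, added only to create a second candidate.
(my-chinese-word-candidates "如何实现中文取词" 5 '("中文" "文取" "取词"))
;; => (("中文" 4 . 6) ("文取" 5 . 7))
```

When more than one candidate comes back, any single answer is only a guess. The command above simply takes the first dictionary hit in the order produced by `mark-chinese-word--substrings`.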
What do you all think?