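When a web page translated by a plugin such as Immersive Translate (沉浸式翻译) is saved locally, the English and Chinese text end up with no line break between them, which makes the page hard to read. So I wrote a small function to handle this case: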
(defun split-english-chinese ()
  "Insert one line break per line where English text meets Chinese text.
After breaking a line, move on to the next line and keep scanning."
  (interactive)
  (save-excursion
    (goto-char (point-min))
    (let ((count 0))
      (while (not (eobp))
        (let ((line-end (line-end-position))
              (found-boundary nil))
          ;; Look for an English-to-Chinese boundary on the current line.
          (while (and (< (point) line-end)
                      (not found-boundary))
            (let ((char (char-after))
                  (prev-char (if (> (point) (point-min)) (char-before) nil)))
              ;; A boundary is a Chinese character preceded by an English one.
              (when (and char prev-char
                         ;; The current character is Chinese.
                         (or (and (>= char #x4e00) (<= char #x9fff))    ; CJK Unified Ideographs
                             (and (>= char #x3400) (<= char #x4dbf))    ; Extension A
                             (and (>= char #x20000) (<= char #x2a6df))  ; Extension B
                             (and (>= char #xf900) (<= char #xfaff)))   ; CJK Compatibility Ideographs
                         ;; The previous character is not Chinese, not CJK or
                         ;; full-width punctuation, not a newline, and not a space.
                         (not (or (and (>= prev-char #x4e00) (<= prev-char #x9fff))
                                  (and (>= prev-char #x3400) (<= prev-char #x4dbf))
                                  (and (>= prev-char #x20000) (<= prev-char #x2a6df))
                                  (and (>= prev-char #xf900) (<= prev-char #xfaff))
                                  (and (>= prev-char #x3000) (<= prev-char #x303f))
                                  (and (>= prev-char #xff00) (<= prev-char #xffef))
                                  (eq prev-char ?\n)
                                  (eq prev-char ?\s))))
                ;; Insert the line break and stop scanning this line.
                (insert "\n")
                (setq count (1+ count))
                (setq found-boundary t)))
            (unless found-boundary
              (forward-char 1))))
        (forward-line 1))
      ;; Report how many line breaks were inserted.
      (message "Inserted %d line break(s)" count))))
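
The same rule can probably also be written as one regexp search instead of a character-by-character loop. What follows is a minimal sketch, assuming the same character ranges as the function above. The names `my/han-chars`, `my/boundary-regexp` and `my/split-english-chinese-regexp` are hypothetical and only used for illustration.

```elisp
;; Sketch only: every my/... name here is hypothetical.
(defconst my/han-chars
  "\u4e00-\u9fff\u3400-\u4dbf\U00020000-\U0002a6df\uf900-\ufaff"
  "Han ranges from the function above, for use inside a regexp bracket expression.")

(defconst my/boundary-regexp
  (concat
   ;; One character that is not Han, not CJK or full-width punctuation,
   ;; not a space and not a newline ...
   "[^" my/han-chars "\u3000-\u303f\uff00-\uffef \n]"
   ;; ... immediately followed by a Han character (group 1).
   "\\([" my/han-chars "]\\)")
  "Regexp matching the point where non-Chinese text meets a Han character.")

(defun my/split-english-chinese-regexp ()
  "Break each line once where non-Chinese text meets a Han character."
  (interactive)
  (save-excursion
    (goto-char (point-min))
    (while (re-search-forward my/boundary-regexp nil t)
      ;; Go back to just before the Han character and break the line there.
      (goto-char (match-beginning 1))
      (insert "\n")
      ;; Skip the rest of the line, so each original line gets at most one break.
      (forward-line 1))))

;; Usage on a scratch buffer:
(with-temp-buffer
  (insert "Hello world你好世界\nfoo bar\nEmacs 很好用，really真的\n")
  (my/split-english-chinese-regexp)
  (buffer-string))
;; => "Hello world\n你好世界\nfoo bar\nEmacs 很好用，really\n真的\n"
```

In the third sample line, no break is inserted before 很 because the preceding character is a space, which is the same exclusion the original function applies.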