How do I access regex capture groups in the second argument of replace-regexp-in-string?

m2fox · January 3, 2019, 12:16

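Problem description

I have the following requirement: in a string, every run of digits that immediately follows letters should get a $ in front of it and a # after it, while runs of digits anywhere else are left untouched.

As a simple example, the string "ab12, cde3, 4" should become "ab$12#, cde$3#, 4".

This can be done as follows: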

(setq str "ab12, cde3, 4")
(replace-regexp-in-string "\\b\\([a-zA-Z]+\\)\\([0-9]+\\)\\b"
			  "\\1$\\2#"
			  str)

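Referring to the regex capture groups in the replacement string (\\1$\\2#) gets me there very easily.

However, I would like to pass an anonymous function as the second argument of replace-regexp-in-string instead (in other, similar situations the matched text may need much more complex processing that only a function can do), like this: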

(setq str "ab12, cde3, 4")
(replace-regexp-in-string "\\b\\([a-zA-Z]+\\)\\([0-9]+\\)\\b"
			  (lambda (m) (concat "$" m "#"))  ;; Key question: how do I access the capture groups here and process them?
			  str)

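The problem is that I don't know how to get at the capture groups inside the lambda, so this does not do what I want: the code above returns "$ab12#, $cde3#, 4", which is not the intended result.

I hope someone can help clear this up. Thanks!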

xuchunyang · January 3, 2019, 12:50

When REP is called, the match data are the result of matching REGEXP against a substring of STRING, the same substring that is the actual text of the match which is passed to REP as its argument.

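Just as with an ordinary search, the match data is already set up for you when REP is called, so you can use match-string to access the groups. Pass REP's own argument as the second argument of match-string.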

(replace-regexp-in-string
 "\\b\\([a-zA-Z]+\\)\\([0-9]+\\)\\b"
 (lambda (s)
   (concat (match-string 1 s) "$" (match-string 2 s) "#"))
 "ab12, cde3, 4")
;; => "ab$12#, cde$3#, 4"
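
For the more complex processing mentioned in the question, the same approach can be sketched roughly as below. The function name my/tag-trailing-digits and the upcase/doubling transformation are made up purely for illustration and are not from this thread. The sketch also passes t for the optional FIXEDCASE and LITERAL arguments, on the assumption that the function's return value should be inserted verbatim, with no case conversion and no special handling of backslashes.

```elisp
;; Hypothetical helper, for illustration only.
(defun my/tag-trailing-digits (string)
  "Upcase each letter run in STRING and double the number that follows it.
The resulting number is wrapped in $ and #."
  (replace-regexp-in-string
   "\\b\\([a-zA-Z]+\\)\\([0-9]+\\)\\b"
   (lambda (s)
     ;; S is the matched text.  The match data refers to S, so S must
     ;; be passed as the second argument of `match-string'.
     (let ((letters (match-string 1 s))
           (digits (match-string 2 s)))
       (concat (upcase letters)
               "$"
               (number-to-string (* 2 (string-to-number digits)))
               "#")))
   string
   t    ; FIXEDCASE: do not adjust the case of the replacement
   t))  ; LITERAL: insert the return value as is

(my/tag-trailing-digits "ab12, cde3, 4")
;; => "AB$24#, CDE$6#, 4"
```

Binding the groups with let first keeps the body readable once the processing grows beyond a single concat.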