Using replace-regexp to remove everything inside angle brackets

I want to remove everything inside the angle brackets in the following string, including the angle brackets themselves.

(defvar temp-string "<a href=\"/locus/S000004071\">GAL2</a> encodes a high affinity <a href=\"/go/5354\">galactose permease</a> that is also able to <a href=\"/go/5355\">transport glucose</a> (<span data-tooltip aria-haspopup=\"true\" class=\"has-tip\" title=\"Tschopp JF, et al. (1986)\"><a href=\"/reference/S000054945\">6</a></span>, <span data-tooltip aria-haspopup=\"true\" class=\"has-tip\" title=\"Maier A, et al. (2002)\"><a href=\"/reference/S000073003\">4</a></span>).")

(with-temp-buffer
  (insert temp-string)
  (goto-char (point-min))
  (replace-regexp "<*>" "\&")
  (buffer-string))

The result it returns is:

<a href="/locus/S000004071"&GAL2</a& encodes a high affinity <a href="/go/5354"&galactose permease</a& that is also able to <a href="/go/5355"&transport glucose</a& (<span data-tooltip aria-haspopup="true" class="has-tip" title="Tschopp JF, et al. (1986)"&<a href="/reference/S000054945"&6</a&</span&, <span data-tooltip aria-haspopup="true" class="has-tip" title="Maier A, et al. (2002)"&<a href="/reference/S000073003"&4</a&</span&).

The documentation says \& stands for the entire matched string, but this code only replaced each > with &. Am I misunderstanding something?

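You need to escape the backslash. Inside a Lisp string, "\&" is read as a plain "&", so the replacement has to be written as "\\&":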

(replace-regexp "<*>" "\\&")

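P.S. From what you describe, the regexp should be <.*?>. As written, <*> means "zero or more < followed by >", so in temp-string it only ever matches the > characters.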
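To make both points concrete, here is a minimal sketch you can evaluate in *scratch*. The variable my-sample is a made-up, shortened stand-in for temp-string, not something from the original post.

```elisp
;; A tiny sample string (hypothetical, much shorter than temp-string).
(defvar my-sample "<a href>GAL2</a>")

;; In a Lisp string "\&" is just "&"; "\\&" is a backslash followed by "&".
(length "\&")    ; one character
(length "\\&")   ; two characters

;; "<*>" means: zero or more "<", then ">".
;; The first thing it can match in my-sample is the lone ">".
(when (string-match "<*>" my-sample)
  (match-string 0 my-sample))      ; the text ">"

;; "<.*?>" means: "<", then as few characters as possible, then ">".
(when (string-match "<.*?>" my-sample)
  (match-string 0 my-sample))      ; the text "<a href>"
```

So the original call had two separate problems: the pattern matched only > characters, and the replacement string that reached replace-regexp was a bare &.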

Thanks for the tip! After tinkering a bit more, I found that using a non-greedy regexp

(with-temp-buffer
  (insert temp-string)
  (goto-char (point-min))
  (replace-regexp "<.*?>" "")
  (buffer-string))

returns the correct value:

GAL2 encodes a high affinity galactose permease that is also able to transport glucose (6, 4).
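As a side note, the docstring of replace-regexp says the command is meant for interactive use, and the byte-compiler warns when it is called from Lisp. When the input is already a string, one possible alternative is replace-regexp-in-string, which needs no temporary buffer. The helper name my-strip-tags below is made up for illustration.

```elisp
(defun my-strip-tags (string)
  "Return STRING with every <...> span removed.
Hypothetical helper; uses the same non-greedy regexp as above."
  (replace-regexp-in-string "<.*?>" "" string))

;; Usage on a fragment of temp-string:
(my-strip-tags "<a href=\"/go/5354\">galactose permease</a>")
;; returns "galactose permease"
```

Note that . does not match a newline in Emacs regexps, so a tag that spans several lines would be left alone by this pattern.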

It seems "\\&" isn't needed here; using it actually gives the wrong result. Does anyone know why?

(I feel I really need to sit down and learn regexps properly.)
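A likely explanation: in the replacement string, \& means "insert the text that was just matched". Replacing each tag with "\\&" therefore puts the same tag back, and the buffer ends up unchanged. To delete the match, the replacement has to be the empty string. A small sketch, using replace-regexp-in-string and a made-up sample string my-tagged:

```elisp
;; Hypothetical sample, not from the original post.
(defvar my-tagged "<b>GAL2</b> encodes")

;; "\\&" re-inserts each match, so the string comes back unchanged.
(replace-regexp-in-string "<.*?>" "\\&" my-tagged)
;; returns "<b>GAL2</b> encodes"

;; \& is useful when wrapping the match in something else.
(replace-regexp-in-string "<.*?>" "[\\&]" my-tagged)
;; returns "[<b>]GAL2[</b>] encodes"

;; An empty replacement deletes each match.
(replace-regexp-in-string "<.*?>" "" my-tagged)
;; returns "GAL2 encodes"
```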