[Trick] Emulating negative lookahead (?!pattern) with Emacs regexps

(defun regexp-not (s)
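  "Return a regexp matching text that does not begin with string S.

The result emulates a negative lookahead, but it consumes the
characters it inspects, so capture the part you need in a group.

Example:

  (let ((r (concat \"^\\\\(foo\\\\)\" (regexp-not \"abc\"))))
    (cl-assert (not (string-match-p r \"fooabc\")))
    (cl-assert (not (string-match-p r \"fooabcd\")))
    (cl-assert      (string-match-p r \"fooabar\"))
    (cl-assert      (string-match-p r \"fooaqux\")))"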
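  (let ((i (length s)) rx)
    ;; Build the regexp from the last character of S back to the first.
    (while (<= 0 (setq i (1- i)))
      (let* ((ch (char-to-string (aref s i)))
             ;; Quote only for use outside brackets; inside "[^...]" a
             ;; backslash is literal, so the raw character goes there.
             (c (regexp-quote ch)))
        (setq rx (concat "\\(?:"
                         ;; Once a proper prefix of S has matched, end of string is acceptable.
                         (unless (zerop i) "\\'\\|")
                         "[^" ch "]"
                         (when rx (concat "\\|" c rx))
                         "\\)"))))
    rx))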

(let ((r (concat "^\\(foo\\)" (regexp-not "abc"))))
  (mapcar (lambda (s)
            (list :input s
                  :match-1 (when (string-match r s)
                             (match-string 1 s))))
          '("fooabc"
            "fooabcd"
            "foobar"
            "fooqux")))
;; => ((:input "fooabc"  :match-1 nil)
;;     (:input "fooabcd" :match-1 nil)
;;     (:input "foobar"  :match-1 "foo")
;;     (:input "fooqux"  :match-1 "foo"))
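
The example reads group 1 rather than the whole match for a reason: unlike a real lookahead, the generated regexp consumes the character that proves the text is not "abc". A minimal sketch of the difference, assuming the regexp is written out by hand as what `(concat "^\\(foo\\)" (regexp-not "abc"))` should expand to; `my/foo-not-abc-match` is a hypothetical helper name, not part of the original post:

```elisp
;; `my/foo-not-abc-match' is a hypothetical helper for this sketch only.
;; The regexp literal is the hand-expanded form of
;; (concat "^\\(foo\\)" (regexp-not "abc")), so the block runs on its own.
(defun my/foo-not-abc-match (s)
  "Return match details for S, or nil when S does not match."
  (let ((r "^\\(foo\\)\\(?:[^a]\\|a\\(?:\\'\\|[^b]\\|b\\(?:\\'\\|[^c]\\)\\)\\)"))
    (when (string-match r s)
      (list :match-0 (match-string 0 s)   ; whole match, incl. consumed text
            :match-1 (match-string 1 s)   ; the part a lookahead would return
            :end-0   (match-end 0)
            :end-1   (match-end 1)))))

(my/foo-not-abc-match "foobar")
;; => (:match-0 "foob" :match-1 "foo" :end-0 4 :end-1 3)
(my/foo-not-abc-match "foo")
;; => nil
```

Note the second result: a bare "foo" at the end of the string does not match, whereas `foo(?!abc)` in a PCRE-style engine would. The first alternative always needs at least one character after "foo".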

Here are two screenshots for comparison:

[Image: emacs-regexp-negative-lookahead]

[Image: regex101-negative-lookahead]

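Emacs regexps are complicated, and it is hard to write them by hand without mistakes. Even with rx it is easy to lose track of the nesting: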

(xr (concat "^\\(foo\\)" (regexp-not "abc")))
;; => (seq bol
;;        (group "foo")
;;        (or (not (any "a"))
;;            (seq "a" (or eos (not (any "b"))
;;                      (seq "b" (or eos (not (any "c"))))))))
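
For comparison, here is roughly what writing that form by hand in rx looks like. This is only a sketch that copies the structure of the xr output above; `my/foo-not-abc` is a hypothetical constant name, and the check is on matching behaviour, not on the exact regexp string rx produces:

```elisp
(require 'rx)

;; Hand-written rx form mirroring the xr output for
;; (concat "^\\(foo\\)" (regexp-not "abc")).
;; `my/foo-not-abc' is a hypothetical name used only in this sketch.
(defconst my/foo-not-abc
  (rx bol
      (group "foo")
      (or (not (any "a"))
          (seq "a" (or eos
                       (not (any "b"))
                       (seq "b" (or eos
                                    (not (any "c")))))))))

;; Same four inputs as the earlier example; t means the input matched.
(mapcar (lambda (s) (and (string-match-p my/foo-not-abc s) t))
        '("fooabc" "fooabcd" "foobar" "fooqux"))
;; => (nil nil t t)
```

Every extra character in the excluded string adds one more `(seq ... (or eos ...))` level, which is why generating the regexp with a function is less error-prone than typing it out.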