性能优化求大神指导

下面代码可能非常长,我也不知道可以怎么优化,关键是 profiler 的结果比较平均,想问问有没有大神看哪段写法比较难受的,可以指出问题。

最近有点想重写 parinfer, 不过感觉依然不知道怎么得到比较好的性能。 下面是一个完成度不怎么高的代码,我通常打开一个 lisp 的 buffer 然后用 parinfer-benchparinfer-report 来测试性能。

对于一个600行的 clojure 代码(其它 lisp 应该也一样),一次处理要 180ms,byte compile 之后 20ms。这个性能还是有点惨,想优化但是没有什么头绪,求指导。

;;; -*- lexical-binding: t -*-

(require 'cl-lib)
(require 'dash)

;;; Errors

(defconst pi/error-eol-backslash
  "Line cannot end in a hanging character escaping.")

(defconst pi/error-quote-danger
  "Quotes must balanced inside comment blocks.")

(defconst pi/error-unclosed-quote
  "String is missing a closing quote.")

(defconst pi/error-unclosed-paren
  "Unclosed open-paren.")

(defconst pi/error-unmatched-close-paren
  "Unmatched close-paren.")

(defconst pi/error-leading-close-paren
  "Line cannot lead with a close-paren.")

;;; Faces

(defface parinfer-trail-face
  '((t (:foreground "grey40")))
  "Face for dimming trails."
  :group 'parinfer)

(defface parinfer-error-face
  '((t (:underline (:style wave :color "red"))))
  "Face for parsing error."
  :group 'parinfer)

(defvar pi/mode 'indent
  "Can be indent, paren or smart.")

(defvar pi/cursor-line-no nil
  "Line number of cursor.")

(defvar pi/cursor-line-x nil
  "Line x of cursor.")

(defvar pi/protect-range nil
  "The range of content that won't be changed.")

(defvar pi/current-indent 0
  "Current indentation.")

(defvar pi/changes nil
  "The changes to made.")

(defvar pi/line-x 0
  "Current line x.")

(defvar pi/line-no 0
  "Current line number.")

(defvar pi/quote-danger nil
  "If we are quote danger.")

(defvar pi/error nil
  "A list of errors.

Has structure of (error . (beg . end)) ")

(defvar pi/openers nil
  "A stack of openers, we will push element in when we meet an opener.

Each element has a structure of (type . point)")

(defvar pi/closers nil
  "A stack of closers, we will push element in when we meet a closer,

Each element has a structure of (type . point)")

(defvar pi/current-trail nil
  "Current tail, has structure of (beg . end)")

(defvar pi/trails nil
  "A list of trails where we can insert or remove parens.

The trail should start with a closer, contains zero or more whitespace and closers, till the end of line. But contains no code characters.
Each element has a structure of (beg . end).
When we apply modifications to buffer, all whitespace in the trails will be deleted.
All closers in the trail will be apply a dim face.")

(defvar pi/indents nil
  "A list of indent to make.

Each element has a structure of (line-number . shift-x)")

(defvar pi/parse-state nil
  "Current state when parsing, can be one of: nil, 'string, 'comment, 'escape.")

(make-variable-buffer-local
 (defvar pi/escape-char nil))

(setq-default pi/escape-char 92)

;;; Debug

(defvar pi/debug t)
(setq pi/debug nil)

(defvar pi/log-buffer-name "*parinfer-log*")

;;; Processing

(defun pi/delete (pos)
  (push (cons nil pos) pi/changes))

(defun pi/insert (ch pos)
  (push (cons ch pos) pi/changes))

(defun pi/get-closer-ch (opener-type)
  (cl-case opener-type
    ((round) ")")
    ((square) "]")
    ((curly) "}")))

(defun pi/init-current-trail (pos)
  "Init current trail at POS."
  (setq pi/current-trail (cons pos pos)))

(defun pi/extend-current-trail ()
  "Extend current trail by one point."
  (setq pi/current-trail
        (cons (car pi/current-trail)
              (1+ (cdr pi/current-trail)))))

(defun pi/delete-trail-whitespaces (from to)
  "Delete whitespaces in trail."
  (-let (((beg . end) pi/protect-range))
    (pi/log "protect range" (list :from from :protect pi/protect-range))
    (while (< from to)
      (unless (and (<= beg from)
                   (< from end))
        (pi/delete from))
      (setq from (1+ from)))))

(defun pi/handle-indent-end ()
  (pi/log "handle indent end" (list :current-trail pi/current-trail
                                    :closers pi/closers
                                    :openers pi/openers))

  (when pi/current-trail
    (-let (((trail-beg . trail-end) pi/current-trail)
           (closers))

      ;; Pop trail closers from variable `pi/closers',
      ;; put them into variable `closers' for later usage.
      (while (let ((pos (cadar pi/closers)))
               (and pos (>= pos trail-beg)))
        (push (pop pi/closers) closers))

      (pi/log "the closers in trail" closers)

      ;; Closing openers
      (-let ((break)
             ;; We use this cnt variable to iterate chars in trails.
             ;; If there's no closer with pos equals (+ trail-beg cnt)
             ;; means there is a whitespace.
             ;; We remove these whitespace, except when cursor is after it in the same line.
             (cnt 0))
        (while (not break)
          (-let* ((opener (car pi/openers))
                  ((opener-type opener-pos opener-indent) opener))
            (if (or (not opener)
                    (< opener-indent pi/current-indent))
                (setq break t)
              (pi/log "adjust trail for opener" opener)
              (pop pi/openers)
              (-let* ((closer (pop closers))
                      ((closer-type closer-pos closer-indent) closer))
                ;; remove whitespaces
                (when closer
                  (pi/delete-trail-whitespaces (+ trail-beg cnt) closer-pos)
                  (setq cnt (- closer-pos trail-beg)))
                ;; match closer
                (cond
                 ((not closer)
                  (let ((ch (pi/get-closer-ch opener-type)))
                    (pi/log "no closer found, insert" (cons ch (cons cnt trail-beg)))
                    (pi/insert ch (+ cnt trail-beg))))
                 ((eq closer-type opener-type)
                  (setq cnt (1+ cnt))
                  (pi/log "closer is correct" nil))
                 (t
                  (let ((ch (pi/get-closer-ch opener-type)))
                    (pi/log "incorrect closer, change it to " ch)
                    (setq cnt (1+ cnt))
                    (pi/insert ch closer-pos)
                    (pi/delete closer-pos))))))))

        ;; Remove extra closers
        (dolist (closer closers)
          (-let (((_ closer-pos _) closer))
            (pi/delete-trail-whitespaces (+ trail-beg cnt) closer-pos)
            (pi/delete closer-pos)
            (setq cnt (1+ cnt)))))

      ;; Add this trail to variable `pi/trails'
      ;; (push pi/current-trail pi/trails)
      ))
  (setq pi/current-indent nil
        pi/current-trail nil))

(defun pi/close-current-trail (pos)
  "When we start a new trail at POS to replace the current one."
  (when pi/closers
    (pi/log "replace trail" (list :pos pos :closers pi/closers :openers pi/openers))
    (dolist (closer (nreverse pi/closers))
      (if (not pi/openers)
          (progn
            (pi/log "extra closer, remove it." pos)
            (pi/delete (cadr closer)))
        (-let* (((closer-type closer-pos closer-indent) closer)
                (opener (pop pi/openers))
                ((opener-type opener-pos opener-indent) opener))
          (if (eq closer-type opener-type)
              (pi/log "match paren" (list :opener opener :closer closer))
            (pi/log "! unmatched close paren"
                    (list :opener opener :closer closer))
            (setq pi/error (cons pi/error-unmatched-close-paren
                                 (cons opener-pos (1+ closer-pos))))))))
    (setq pi/closers nil)))

(defun pi/handle-trail-end (indent)
  "End current trail with current trail."
  ())

(defun pi/process-opener (type pos)
  "Process a opener with `type' at `pos'.
Update openers.
Init current trail at pos. "
  (if pi/current-indent
      (pi/handle-indent-end)
    (pi/close-current-trail pos)
    (setq pi/current-trail (cons pos pos)))
  (let ((opener `(,type ,(1- pos) ,pi/line-x)))
    (pi/log "opener" opener)
    (push opener pi/openers)
    (pi/init-current-trail pos)
    ;; (pi/log "openers stack" pi/openers)
    ))

(defun pi/process-closer (type pos)
  "Process a closer with `type' at `pos'.
Update openers.
Extend current trail at pos. "
  (let ((closer `(,type ,(1- pos) ,pi/line-x)))
    (pi/log "closer" closer)
    (if pi/current-indent
        (setq pi/error (cons pi/error-leading-close-paren
                             (cons (line-beginning-position) pos)))
      (push closer pi/closers)
      (pi/extend-current-trail))))

(defun pi/process-newline (pos)
  "Process a newline at `pos'.
Increament pi/line-no by 1.
Set pi/current-indent to 0. "
  (pi/log "newline" pos)
  (setq pi/line-no (1+ pi/line-no)
        pi/line-x -1
        pi/current-indent 0))

(defun pi/process-whitespace (pos)
  "Process a whitespace at `pos'.
When at indent,
  Increment pi/current-indent by 1.
When not,
  Extend pi/current-trail."
  (cond
   (pi/current-indent (setq pi/current-indent (1+ pi/current-indent)))
   (t
    (pi/log "extend trail" pos)
    (pi/extend-current-trail))))

(defun pi/process-delimiter (pos)
  (pi/process-code 34 pos)
  (setq pi/parse-state 'string)
  (if pi/current-indent
      (pi/handle-indent-end)
    (pi/close-current-trail pos)))

(defun pi/process-commenter (pos)
  (setq pi/current-indent nil)
  (setq pi/parse-state 'comment))

(defun pi/process-code (ch pos)
  "Process code at `pos'.
Init current trail at pos."
  (pi/log "code" (cons (concat "\"" (char-to-string ch) "\"") pos))
  (if pi/current-indent
      (pi/handle-indent-end)
    (pi/close-current-trail pos)
    (setq pi/current-trail (cons pos pos)))
  (setq pi/current-indent nil)
  (pi/init-current-trail pos))

(defun pi/process-after-escape (ch pos)
  (if (= ch 10)
      (setq pi/error (cons pi/error-eol-backslash (cons (1- pos) pos)))
    (pi/log "escaped char" ch)
    (setq pi/parse-state nil)
    (pi/process-code ch pos)))

(defun pi/process-escape (pos)
  (pi/log "escape enable" pos)
  (setq pi/parse-state 'escape)
  (pi/process-code ch pos))

(defun pi/process-in-string (ch pos)
  (if (not (= ch 34))
      (pi/log "in string" pos)
    (pi/log "end string" pos)
    (setq pi/parse-state nil)
    (pi/init-current-trail pos)))

(defun pi/process-in-comment (ch pos)
  (cond
   ((= ch 10)
    (if pi/quote-danger
        (setq pi/error (cons pi/error-quote-danger (cons pi/quote-danger pos)))
      (pi/log "end comment" pos)
      (pi/process-newline pos)
      (setq pi/parse-state nil)))
   ((= ch 34)
    (if pi/quote-danger
        (setq pi/quote-danger nil)
      (setq pi/quote-danger pos))
    (pi/log "toggle quote danger" pos))
   (t
    (pi/log "in comment" pos))))

(defun pi/process-eof ()
  (cond
   ((eq pi/parse-state 'escape)
    (setq pi/error (cons pi/error-eol-backslash
                         (cons (1- (point-max)) (point-max)))))
   ((eq pi/parse-state 'string)
    (setq pi/error (cons pi/error-unclosed-quote
                         (cons (1- (point-max)) (point-max)))))
   (t
    (setq pi/line-x 0
          pi/line-no (1+ pi/line-no)
          pi/current-indent 0)
    (pi/handle-indent-end))))

(defun pi/parse-1 (ch pos)
  "Parse on a single pos."
  (setq pi/line-x (1+ pi/line-x))

  (cond
   ((eq pi/parse-state 'escape)
    (pi/process-after-escape))

   ((= ch pi/escape-char)
    (pi/process-escape pos))

   ((eq pi/parse-state 'string)
    (pi/process-in-string ch pos))

   ((eq pi/parse-state 'comment)
    (pi/process-in-comment ch pos))

   (t
    (case ch
      (32  (pi/process-whitespace pos))
      (10  (pi/process-newline pos))
      (40  (pi/process-opener 'round pos))
      (41  (pi/process-closer 'round pos))
      (91  (pi/process-opener 'square pos))
      (93  (pi/process-closer 'square pos))
      (123 (pi/process-opener 'curly pos))
      (125 (pi/process-closer 'curly pos))
      (34  (pi/process-delimiter pos))
      (59  (pi/process-commenter pos))
      (t (pi/process-code ch pos))))))

(defun pi/get-protect-range ()
  "Return the protect range.

It begins at line beggining, and ends at cursor position."
  (cons (line-beginning-position)
        (point)))

(defun pi/reset-state ()
  "Reset all parser state."
  (setq pi/cursor-line-no (line-number-at-pos (point) t)
        pi/cursor-line-x (- (point) (line-beginning-position))
        pi/protect-range (pi/get-protect-range)
        pi/parse-state nil
        pi/quote-danger nil
        pi/openers nil
        pi/closers nil
        pi/trails nil
        pi/indents nil
        pi/line-no 0
        pi/line-x -1
        pi/error nil
        pi/current-indent 0
        pi/current-trail nil
        pi/changes nil))

(defun pi/parse ()
  (pi/reset-state)
  (let ((max (point-max))
        (pos (point-min)))
    (while (and (not pi/error)
                (not (= pos max)))
      (setq pos (1+ pos))
      (pi/parse-1 (char-before pos) pos)))
  ;; We already at the end of buffer.
  ;; Deal with unhandled trails.
  (pi/log "eof parse" nil)
  (pi/process-eof))

;;; Apply changes

(defun pi/apply-changes ()
  (save-mark-and-excursion
    (let ((inhibit-x-resources)))
    (dolist (change pi/changes)
      (-let (((ch . pos) change))
        (goto-char pos)
        (if (not ch)
            (progn
              (delete-char 1))
          (insert-char ch))))))

;;; Highlight

(defun pi/mark-error ()
  (remove-overlays (point-min) (point-max) 'name 'parinfer-error)
  (when pi/error
    (-let* (((error-message . (beg . end)) pi/error)
            (ov (make-overlay beg end)))
      (overlay-put ov 'face 'parinfer-error-face)
      (overlay-put ov 'name 'parinfer-error)
      (message error-message))))

(defun pi/mark-trails ()
  "Mark trails for current buffer."
  (remove-overlays (point-min) (point-max) 'name 'parinfer-trail)
  (dolist (p pi/trails)
    (when p
      (-let* (((beg . _) p)
              (ov (make-overlay beg (save-mark-and-excursion
                                      (goto-char beg)
                                      (line-end-position)))))
        (overlay-put ov 'face 'parinfer-trail-face)
        (overlay-put ov 'name 'parinfer-trail)))))

(defun pi/log-1 (pre msg)
  (with-current-buffer (get-buffer-create pi/log-buffer-name)
    (insert (format "[ NO:%5d X:%5d IDT:%5s] %s -> %s\n" pi/line-no pi/line-x pi/current-indent pre msg))))

(defun pi/log-state-1 ()
  (pi/log "-------------" nil)
  (pi/log "Openers" pi/openers)
  (pi/log "Closers" pi/closers)
  (pi/log "Trails" pi/trails)
  (pi/log "Changes" pi/changes)
  (pi/log "Error" pi/error)
  (when (window-live-p (get-buffer-window pi/log-buffer-name))
    (with-selected-window (get-buffer-window pi/log-buffer-name)
      (goto-char (point-max)))))

(defun pi/log-clr ()
  (with-current-buffer (get-buffer-create pi/log-buffer-name)
    (delete-region (point-min) (point-max))
    (text-mode)))

(defmacro pi/log (pre msg)
  (if pi/debug
      `(pi/log-1 ,pre ,msg)
    nil))

(defun pi/process-buffer ()
  (interactive)
  (pi/log-clr)
  (pi/parse)
  (unless pi/error
    (pi/apply-changes))
  (pi/mark-error)
  (pi/log-state))

(global-set-key (kbd "C-z") #'pi/process-buffer)

(defun pi/process ()
  (interactive)
  (pi/parse)
  (unless pi/error
    (pi/apply-changes))
  (pi/mark-error))

(defun pi/enable ()
  (add-hook 'post-command-hook 'pi/process nil t))

(defun pi/disable ()
  (remove-hook 'post-command-hook 'pi/process t))

(defmacro parinfer--measure-time (&rest body)
  "Measure the time it takes to evaluate BODY."
  `(let ((time (current-time)))
     ,@body
     (message "%.06fms" (* (float-time (time-since time)) 1000))))

(defun parinfer-bench ()
  (interactive)
  (parinfer--measure-time
   (pi/process)))

(defun parinfer-report ()
  (interactive)
  (profiler-start 'cpu)
  (parinfer--measure-time
   (dotimes (_ 10)
     (pi/process)))
  (profiler-report)
  (profiler-stop))

(define-minor-mode parinfer-mode
  "Simpler Lisp Edit."
  nil
  nil
  nil
  (if parinfer-mode
      (pi/enable)
    (pi/disable)))

(provide 'parinfer)

非常感谢!

现在可不可用?可用的话,卡不卡?不可用的话,不需要现在考虑性能问题。

实现的时候应该有留意到耗时的操作?不是百分百肯定,有值得怀疑的地方?如重复计算缺少缓存。

这段代码是可用的,会按缩进平衡整个 buffer 的括号。因为是每一次按键或每一个命令之后要执行一次的逻辑,所以时间希望在千行代码上达到 8ms 以下的耗时。

只需要考虑当前 top-level 的 sexp 就行了吧,而不是整个 Buffer,一个 sexp 一般也就几十行。

那样不完美,一个是出现不关闭的字符串的时候会有问题。 另一个是有些很长的配置本身就只有一个 top-level 的 sexp。

不清楚 parinfer,ParEdit 括号总是平衡的,用不着考虑不平衡的情况。一般的 sexp (比如一个函数)也就几十行。

paredit 很容易造成不平衡的情况,比如复制内容里面不平衡。

但这不是重点,我想知道我写的这个代码怎么能有更好的性能。而不是 paredit 和 parinfer 的关系、区别、优劣和必要性之类的。

setq 1+不写成incf吗?

代码看起来很low-level了,不知道还有什么优化空间。

buffer有必要从头parse到尾?

函数调用的开销?改写成inline?

谢谢,我不知道有incf这个函数。

从头 parse 到尾是为了解决没有关闭的字符串之类的问题。就算取一段来处理,我也觉得要保证在 1000 行这个数量上性能比较好才行。

说一个比较琐碎的,pcasecl-case 比较数字用的是eql,这个函数没有自己的byte op,性能不是最快的。可以用 =(确定lhs和rhs都是数字)或者eq (确定要比较一个long int)来做数字比较

(defmacro my/number-case (form &rest clauses)
  (declare (indent 1) (debug cl-case))
  (macroexp-let2 macroexp-copyable-p form form
    `(cond ,@(mapcar (lambda (c)
                       (pcase-exhaustive c
                         (`(,head . ,handlers)
                           (if (memq head '(t otherwise))
                               `(t ,@handlers)
                             `((= ,head ,form)
                               ,@handlers)))))
                     clauses))))

(my/number-case 1
  (1 (+ 1 2))
  (3 4))
1 个赞

谢谢,但是不知道为什么换了这个方式之后反而更慢了。差30%左右的样子。

性能和 cond 里面都用 = 是一样,不知道为什么 cl-case 反而还快。

但我明白了确实一些细微的地方就可以有比较大的影响。

那可以试试用eq,eq是比=快的。是我疏忽了,因为=要分别判断 浮点,整数,大整这几个不同的数字类型。eq就是比较指针。另外一方面,你有没有把macro编译之后再做bench?macro展开也是需要时间的。

(defmacro my/number-case (form &rest clauses)
  (declare (indent 1) (debug cl-case))
  (macroexp-let2 macroexp-copyable-p form form
    `(cond ,@(mapcar (lambda (c)
                       (pcase-exhaustive c
                         (`(,head . ,handlers)
                           (if (memq head '(t otherwise))
                               `(t ,@handlers)
                             `((eq ,head ,form)
                               ,@handlers)))))
                     clauses))))

(my/number-case 1
  (1 (+ 1 2))
  (3 4))

这样的话似乎和 cl-case 的性能是一样的,可能这个地方的影响不是很大。 虽然代码里面有个很长的 case。

你是什么环境?我在Archlinux Emacs 27.0.90里,用原编译后的代码(无更改)在 flycheck.el 直接开parinfer-bench,耗时3.288438ms

flycheck 共11521行代码

ArchLinux Emacs 27.0.90 是不是你的电脑性能太好了?

你试的 flycheck.el?

有可能是中间遇到了对于 parinfer 来说的错误,停止 parse 了。 我试下


是的,有 parse 中的错误。在我这大约 10ms

原来一大片下划线是parinfer标错啊,我还以为flycheck坏了……不过就算标错我也只用了2ms,这样我觉得可能是你GC阈值太低影响性能了。

(setq gc-cons-threshold (* 20 1024 1024))

是的,遇到错误就会停下来了。我决定先放一放,等有想法了再继续。

也许能够获得一个范围是最好的办法,不过这个范围很难算。

这个函数如果用的太多是很慢的,因为他的原理是跳到point-min一行行数……不过既然你profile了没有发现,应该不是大头。


GC很多会不会是你overlay太多了?太多的话可以不用以后立即删掉overlay,而不是等着GC。

是的,初始化的时候一共只用一次。

我感觉是push和pop太频繁了,初步的一个想法就是不要一个字符一个字符的处理,看在哪些情况下可以跳过一些内容。我改了 whitespace 的逻辑,在缩进那段区域直接用 back-to-indentation 跳过去,但是好像效果不是很好。

在考虑遇到 symbol 的时候直接跳过去。

代码上好像没啥问题,再优化估计要靠算法了。