[分享] 使用elisp很方便地打印html

我在使用orgMode搭建静态博客时,需要直接使用elisp生成部分html代码,于是写了个 print-html 函数。主要有两个功能 print-html-formatedprint-html-unformated ,前者生成格式化的html,后者生成的html排在一行。

1 使用方法:

如下列表:

(setq list '(div :class "post-container"
		      (div :class "post-div"
		        (h1 (a :href "url" "text of link"))
		        (p "content of paragraph...")
		        (img :src "src-url" :alt "alt-name")
		        (code (span :class "post-date" "date")))))

经过 (print-html-formated list) 后生成:

<div class="post-container">
  <div class="post-div">
    <h1>
      <a ref="url">text of link</a>
    </h1>
    <p>content of paragraph...</p>
    <img src="src-url" alt="alt-name"/>
    <code>
      <span class="post-date">date</span>
    </code>
  </div>
</div>

2 解析思路:

一个典型的输入列表格式为:

'(tagname :attr1 value1 :attr2 value2 ...
	  text
	  (tagname ...)
	  ...)

每个列表的 第一个元素必须存在且只能是标签名 ,标签名为symbol类型。除了标签名,其余部分都是可选的。如果该html标签存在一些属性,比如 class style 等, 那么这些 属性以属性列表(Plist)的形式排列在标签后面 。至此便构建好了一个带属性的html标签。

那么,列表余下的内容便是该html标签内部的内容。这些内容可能是文本信息,也可能是子级html标签(文本与子标签的先后顺序由html具体结构决定)。子级html的列表同样按照以上思路解析,以此类推….直到整个列表解析完毕。根据以上思路,不难想到解析列表使用了 递归 思想。

输入列表的设计思路:

使用Plist表示html的属性键值比Alist少写了许多括号,视觉上更简洁;用列表嵌套来表示html的层级关系,在书写时结构清晰。

3 应用场景

print-html 的初衷是为了生成静态博客的首页、归档页、分类页等具有特定格式的页面。因为对这些页面的布局有要求,所以orgMode导出的html不能满足需求。使用elisp生成的html可以嵌入在orgMode中,博客的可定制性便高了许多。

除此之外,我还没有想到其他应用场景。所以此番折腾更多的是为了熟悉elisp编程和递归思想。如果还有其他应用场景或拓展的思路,还望大家留言讨论~

4 完整代码

点击查看代码实现
(defvar print-html--single-tag-list
  '("img" "br" "hr" "input" "meta" "link" "param"))

(defun print-html--get-plist (list)
  "get attributes of a tag"
  (let* ((i 0)
	 (plist nil))
    (while (and (nth i list) (symbolp (nth i list)))
      (setq key (nth i list))
      (setq value (nth (1+ i) list))
      (setq plist (append plist (list key value)))
      (incf i 2))
    plist))

(defun print-html--get-inner (list)
  "get inner html of a tag"
  (let* ((i 0)
	 (inner nil)
	 (plist (print-html--get-plist list)))
    (if (null plist)
	(setq inner list)
      (dolist (p plist)
	(setq inner (remove p list))
	(setq list inner)))
    inner))

(defun print-html--plist->alist (plist)
  "convert plist to alist"
  (if (null plist)
      '()
    (cons
     (list (car plist) (cadr plist))
     (print-html--plist->alist (cddr plist)))))

(defun print-html--insert-html-tag (tag &optional attrs)
  "insert html tag and attributes"
  (let ((tag (symbol-name tag))
	(attrs (print-html--plist->alist attrs)))
    (if (member tag print-html--single-tag-list)
	(progn
	  (insert (concat "<" tag "/>"))
	  (backward-char 2)
	  (dolist (attr attrs)
	    (insert (concat " " (substring (symbol-name (car attr)) 1) "=" "\"" (cadr attr) "\"")))
	  (forward-char 2))
      (progn
	(insert (concat "<" tag ">" "</" tag ">"))
	(backward-char (+ 4 (length tag)))
	(dolist (attr attrs)
	  (insert (concat " " (substring (symbol-name (car attr)) 1) "=" "\"" (cadr attr) "\"")))
	(forward-char 1)))
    ))

(defun print-html--jump-outside (tag)
  "jump outside of html tag"
  (let ((tag (symbol-name tag)))
    (if (member tag print-html--single-tag-list)
	(forward-char 0)
      (forward-char (+ 3 (length tag))))))
;;----------------------------------------
(defun print-html--parse-list-unformated (list)
  "parse elisp to unformated html"
  (let* ((tag (car list))
	 (left (cdr list))
	 (plist (print-html--get-plist left))
	 (inner (print-html--get-inner left))
	 (html ""))
    (with-current-buffer (get-buffer-create "*print html*")
      (print-html--insert-html-tag tag plist)
      (dolist (item inner)
	(if (listp item)
	    (print-html--parse-list-unformated item)
	  (insert item)))
      (print-html--jump-outside tag)
      (setq html (buffer-substring-no-properties (point-min) (point-max))))
    html))
;;----------------------------------------
(defun print-html--if-no-child (inner)
  "judge if html tag has child-tag"
  (let ((no-child t))
    (dolist (item inner)
      (if (listp item)
	  (setq no-child nil)))
    no-child))

(defun print-html--format-html (tag inner)
  "format html tag, tag which has no child show in one line, others are well formated by default. change this function to redesign the format rule."
  (let ((tag (symbol-name tag)))
    (if (member tag print-html--single-tag-list)
	(insert "")
      (progn
	(if (not (print-html--if-no-child inner))
	    (insert "\n"))))))

(defun print-html--parse-list-formated (list)
  "parse elisp to formated html"
  (let* ((tag (car list))
	 (left (cdr list))
	 (plist (print-html--get-plist left))
	 (inner (print-html--get-inner left))
	 (html ""))
    (with-current-buffer (get-buffer-create "*print html*")
      (print-html--insert-html-tag tag plist)
      (print-html--format-html tag inner)
      (dolist (item inner)
	(if (listp item)
	    (print-html--parse-list-formated item)
	  (progn
	    (insert item)
	    (print-html--format-html tag inner))))
      (print-html--jump-outside tag)
      (insert "\n")
      (setq html (buffer-substring-no-properties (point-min) (point-max))))
    html))
;;---------------------------------------
(defun print-html-unformated (LIST)
  (let ((html (print-html--parse-list-unformated LIST)))
    (kill-buffer "*print html*")
    html))

(defun print-html-formated (LIST)
  (let ((html (print-html--parse-list-formated LIST)))
    (kill-buffer "*print html*")
    html))

(defun print-html (FORMATED LIST)
  "print html with elisp. the first elem of LIST is always a html tag, others could be attributes or text content or child tag. the three mentioned above are all optional. if has, attributes must be in first place, followed by text content and child tag."
  (let ((html ""))
    (if FORMATED
	(print-html-formated LIST)
	(print-html-unformated LIST)
	)))

也可以看 这里

4赞

非泼冷水:内置已有xml-print把sexp输出成XML。还有dom.el操作xml node

格式 (TAG ((ATTR . VAL) ...) BODY...)

(require 'xml)
(with-temp-buffer
  (xml-print '((html ((attr1 . "val1")
                      (attr2 . "val2"))
                "body")))
  (buffer-string))
;; => "<html attr1=\"val1\" attr2=\"val2\">body</html>"

另外生成xml叫parse挺奇怪的,常规的叫法应该是pp(pretty-print)

1赞

谢谢分享,我不知道有现成的实现。不过我的主要目的还是练手,尝试使用elisp将一个想法从零实现。

另外 xml print 使用的是alist,括号太多了,所以我在构思的时候用了plist,写法更简洁。 你的例子我的写法就是:

(require 'print-html)
(with-temp-buffer
  (print-html t '(html :attr1 "val1" :attr2 "val2" "body")
  (buffer-string))
;; => "<html attr1=\"val1\" attr2=\"val2\">body</html>"

使用plist增加了实现难度,不过还是搞定了,一番折腾收获更多,尤其封装与递归的思想。

命名确实有点奇怪,感觉意思都反了 :joy:,改成了 print-html。

有点像 Clojure 里面的 Hiccup。

[:div#id.cls1.cls2 {:attr ""}
  [:span ...]]
1赞

名字改成了 print-html

我觉得挺好,不同于传统 S-Expression 的 SXML 的模型,plist 这里用的很好啊,赞一个

1赞