dom-select.el: a CSS-style selector built on dom.el

It is built on dom.el and currently supports only the following selectors and combinators:

| Syntax | Example | Example description |
|---|---|---|
| `.class` | `.intro` | Select all elements with class="intro" |
| `.class1.class2` | `.name1.name2` | Select all elements that have both name1 and name2 in their class attribute |
| `#id` | `#firstname` | Select all elements with id="firstname" |
| `tag` | `p` | Select all `<p>` elements |
| `[attr=fullstring]` | `[target=_blank]` | Select all elements with target="_blank" |
| `[attr\|=prefix]` | `[lang\|=en]` | Select all elements whose lang attribute value equals “en” or starts with “en-” |
| `[attr~=word]` | `[title~=flower]` | Select all elements whose title attribute contains the word “flower” |
| `[attr^=prefix]` | `a[href^=https]` | Select every `<a>` element whose href attribute value begins with “https” |
| `[attr*=substring]` | `a[href*=example]` | Select every `<a>` element whose href attribute value contains “example” |
| `[attr$=suffix]` | `a[href$=.pdf]` | Select every `<a>` element whose href attribute value ends with “.pdf” |
| `ancestor descendant` | `div p` | Select all `<p>` elements inside `<div>` elements |
| `parent > child` | `div > p` | Select all `<p>` elements whose parent is a `<div>` element |
| `prev + adjacent` | `div + p` | Select all `<p>` elements that are placed immediately after `<div>` elements |
| `prev ~ siblings` | `p ~ ul` | Select every `<ul>` element that is preceded by a sibling `<p>` element |
| `parent < child` | `div < p` | Select all `<div>` elements that contain at least one `<p>` element |
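
To make the six attribute operators in the table concrete, here is a minimal sketch of how each one could be tested against a single attribute value. The function name `my/attr-match-p` is hypothetical and this is not taken from dom-select.el; edge cases such as an empty target string are ignored.

```elisp
;; Hypothetical helper, not part of dom-select.el or dom.el.
(defun my/attr-match-p (op value target)
  "Return non-nil if attribute VALUE matches TARGET under operator OP.
OP is one of the strings \"=\", \"|=\", \"~=\", \"^=\", \"*=\", \"$=\"."
  (pcase op
    ;; [attr=target]: the whole value must be equal.
    ("="  (string= value target))
    ;; [attr|=target]: equal, or starts with target followed by a dash.
    ("|=" (or (string= value target)
              (string-prefix-p (concat target "-") value)))
    ;; [attr~=target]: target is one of the whitespace-separated words.
    ("~=" (member target (split-string value)))
    ;; [attr^=target]: value begins with target.
    ("^=" (string-prefix-p target value))
    ;; [attr*=target]: target occurs anywhere in value.
    ("*=" (string-match-p (regexp-quote target) value))
    ;; [attr$=target]: value ends with target.
    ("$=" (string-suffix-p target value))))
```

For example, `(my/attr-match-p "|=" "english" "en")` is nil because the value neither equals "en" nor starts with "en-", while `(my/attr-match-p "^=" "english" "en")` is non-nil.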

Usage:

(dom-select dom "selector string")
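
A slightly fuller sketch of the call in context. The sample DOM and the variable name `my/sample-dom` are made up for illustration; the DOM is written by hand in the shape that `libxml-parse-html-region` returns, and the `dom-select` call is guarded because it only works once dom-select.el is loaded. The built-in dom.el lookups are shown alongside for comparison.

```elisp
(require 'dom)

;; Hypothetical sample data, hand-written in dom.el's (TAG ATTRS . CHILDREN) form.
(defvar my/sample-dom
  '(html nil
         (body nil
               (div ((class . "intro"))
                    (p ((id . "firstname")) "Alice"))
               (p nil "outside")))
  "A small DOM used only for this example.")

;; Built-in dom.el lookups handle one criterion at a time.
;; Note that their second argument is a regexp, not a CSS selector.
(dom-by-class my/sample-dom "intro")    ; the <div class="intro"> node
(dom-by-id my/sample-dom "firstname")   ; the <p id="firstname"> node

;; The selector-string form from this post combines criteria in one call.
;; Evaluated only when dom-select.el has been loaded.
(when (fboundp 'dom-select)
  (dom-select my/sample-dom "div p"))
```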

Similar projects:

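  • elquery: returns its own custom structure, so later processing cannot use the built-in dom- functions. While trying it out I also found that it sometimes fails to select elements.
  • doom.el: a JS-style interface. The project has been discontinued and its repository deleted, though a fork exists.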

There is also esxml-query.el

It looks quite complete. Why didn't I find it earlier?

Completeness comparison (🚩 marks a point scored):

| Name | esxml-query | dom-select | Syntax |
|---|---|---|---|
| Namespaces | No | No | `foo` |
| Commas | Yes🚩 | No | `foo,bar` |
| Descendant combinator | Yes | Yes | `foo bar` |
| Child combinator | Yes | Yes | `foo>bar` |
| Parent combinator | No | Yes🚩 | `foo<bar` |
| Adjacent sibling combinator | No | Yes🚩 | `foo+bar` |
| General sibling combinator | No | Yes🚩 | `foo~bar` |
| Universal selector | Yes🚩 | No | `*` |
| Type selector | Yes | Yes | `tag` |
| ID selector | Yes | Yes | `#foo` |
| Class selector | Yes | Yes | `.foo` |
| Attribute selector | Yes | Yes | `[foo]` |
| Exact match attribute selector | Yes | Yes | `[foo=bar]` |
| Prefix match attribute selector | Yes | Yes | `[foo^=bar]` |
| Suffix match attribute selector | Yes | Yes | `[foo$=bar]` |
| Substring match attribute selector | Yes | Yes | `[foo*=bar]` |
| Include match attribute selector | Yes | Yes | `[foo~=bar]` |
| Dash match attribute selector | Yes | Yes | `[foo\|=bar]` |
| Attribute selector modifiers | No | No | `[foo=bar i]` |
| Pseudo elements | No | No | `::foo` |
| Pseudo classes | No | No | `:foo` |

(The Parent combinator is not part of the standard; it borrows the idea from jQuery's :has selector.)
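
To illustrate what `div < p` means, here is a rough sketch using only built-in dom.el functions. The name `my/dom-select-has` is hypothetical and this is not how dom-select.el implements it. It assumes, following the table description ("contain at least one `<p>` element") and jQuery's `:has`, that the `<p>` may be any descendant rather than only a direct child.

```elisp
(require 'dom)
(require 'cl-lib)

;; Hypothetical helper, not part of dom-select.el.
(defun my/dom-select-has (dom parent-tag child-tag)
  "Return elements of PARENT-TAG in DOM that contain a CHILD-TAG descendant.
PARENT-TAG and CHILD-TAG are symbols such as `div' and `p'."
  (cl-remove-if-not
   (lambda (node)
     ;; `dom-by-tag' also matches the node it is called on, so search
     ;; each child subtree instead of NODE itself.  Text children are
     ;; strings and are skipped.
     (cl-some (lambda (child)
                (and (consp child)
                     (dom-by-tag child child-tag)))
              (dom-children node)))
   (dom-by-tag dom parent-tag)))
```

Calling `(my/dom-select-has dom 'div 'p)` then plays the role of the selector string `"div < p"`.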

After this comparison I feel a little relieved: for now it looks like I didn't write it for nothing.

Also, I have doubts about its implementation of `[foo]`. For example, in the following case it fails to select the element:
(with-temp-buffer
  (insert "<p disabled>disabled element</p>")
  (let ((dom (libxml-parse-xml-region (point-min) (point-max))))
    (esxml-query "[disabled]" dom)))
;; => nil

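That is because libxml-parse-xml-region simply drops attributes that have no value. I struggled with this for a long time, and it is the reason I finally decided not to implement [attr].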

It works with libxml-parse-html-region:

(with-temp-buffer
  (insert "<p disabled>disabled element</p>")
  (libxml-parse-html-region (point-min) (point-max)))
;; => (html nil (body nil (p ((disabled . "disabled")) "disabled element")))

(with-temp-buffer
  (insert "<p disabled>disabled element</p>")
  (libxml-parse-xml-region (point-min) (point-max)))
;; => nil

Only built-in attributes such as disabled and readonly (see html-tag-alist for more) work; custom attributes do not:

(with-temp-buffer
  (insert "<p foo></p>")
  (libxml-parse-html-region (point-min) (point-max)))
;; => (html nil (body nil (p nil)))

Wouldn't it be better to return an empty string when there is no value?

(with-temp-buffer
  (insert "<p foo=\"\"></p>")
  (libxml-parse-html-region (point-min) (point-max)))
;; => (html nil (body nil (p ((foo . "")))))

Reading readonly in the Google Chrome console returns exactly an empty string:

> $$('input')[0].getAttribute('readonly');
""
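
One reason an empty string would be convenient: in Emacs Lisp `""` is non-nil, and the attribute still appears in the node's alist, so a presence test stays trivial. Below is a rough sketch of an `[attr]` lookup over a dom.el tree. The name `my/dom-by-attr-presence` is hypothetical and this is not dom-select.el's actual code; the sample DOMs in the usage lines are the parse results quoted above.

```elisp
(require 'dom)

;; Hypothetical helper, not part of dom-select.el or dom.el.
(defun my/dom-by-attr-presence (dom attr)
  "Return elements in DOM that carry attribute ATTR, whatever its value.
ATTR is a symbol such as `disabled'."
  (let (matches)
    ;; Collect matches from child elements first; text children are strings.
    (dolist (child (dom-children dom))
      (unless (stringp child)
        (setq matches
              (append matches (my/dom-by-attr-presence child attr)))))
    ;; `assq' finds the attribute even when its value is the empty string.
    (if (assq attr (dom-attributes dom))
        (cons dom matches)
      matches)))

;; Attribute kept with an empty value: the <p> node is found.
(my/dom-by-attr-presence '(html nil (body nil (p ((foo . ""))))) 'foo)

;; Attribute dropped by the parser: nothing is left to match.
(my/dom-by-attr-presence '(html nil (body nil (p nil))) 'foo)
```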

[attr] is now supported as well.