gk-roam: a lightweight roam plugin

icezzz · 2020-09-11 08:11

rx-form: Unknown rx form ‘anychar’. It looks like the rx regexp library has no definition for anychar. How should I deal with this?

Kinney · 2020-09-11 08:25

Which version of Emacs are you using? rx is a library that ships with Emacs. I'm on 27.1.

icezzz · 2020-09-11 08:35

Sure enough, mine is 26.3.

icezzz · 2020-09-11 08:38

Manjaro currently ships 26.3; emacs-git is 28, which is unstable.

Kinney · 2020-09-11 08:47

Run C-h f rx and paste the documentation here, so I can see what 26.3 uses in place of anychar.

icezzz · 2020-09-11 08:55

rx is an autoloaded Lisp macro in ‘rx.el’.

(rx &rest REGEXPS)

Translate regular expressions REGEXPS in sexp form to a regexp string. REGEXPS is a non-empty sequence of forms of the sort listed below.

Note that ‘rx’ is a Lisp macro; when used in a Lisp program being compiled, the translation is performed by the compiler. See ‘rx-to-string’ for how to do such a translation at run-time.

The following are valid subforms of regular expressions in sexp notation.

STRING matches string STRING literally.

CHAR matches character CHAR literally.

‘not-newline’, ‘nonl’ matches any character except a newline.

‘anything’ matches any character

‘(any SET …)’ ‘(in SET …)’ ‘(char SET …)’ matches any character in SET … SET may be a character or string. Ranges of characters can be specified as ‘A-Z’ in strings. Ranges may also be specified as conses like ‘(?A . ?Z)’.

 SET may also be the name of a character class: ‘digit’,
 ‘control’, ‘hex-digit’, ‘blank’, ‘graph’, ‘print’, ‘alnum’,
 ‘alpha’, ‘ascii’, ‘nonascii’, ‘lower’, ‘punct’, ‘space’, ‘upper’,
 ‘word’, or one of their synonyms.

‘(not (any SET …))’ matches any character not in SET …

‘line-start’, ‘bol’ matches the empty string, but only at the beginning of a line in the text being matched

‘line-end’, ‘eol’ is similar to ‘line-start’ but matches only at the end of a line

‘string-start’, ‘bos’, ‘bot’ matches the empty string, but only at the beginning of the string being matched against.

‘string-end’, ‘eos’, ‘eot’ matches the empty string, but only at the end of the string being matched against.

‘buffer-start’ matches the empty string, but only at the beginning of the buffer being matched against. Actually equivalent to ‘string-start’.

‘buffer-end’ matches the empty string, but only at the end of the buffer being matched against. Actually equivalent to ‘string-end’.

‘point’ matches the empty string, but only at point.

‘word-start’, ‘bow’ matches the empty string, but only at the beginning of a word.

‘word-end’, ‘eow’ matches the empty string, but only at the end of a word.

‘word-boundary’ matches the empty string, but only at the beginning or end of a word.

‘(not word-boundary)’ ‘not-word-boundary’ matches the empty string, but not at the beginning or end of a word.

‘symbol-start’ matches the empty string, but only at the beginning of a symbol.

‘symbol-end’ matches the empty string, but only at the end of a symbol.

‘digit’, ‘numeric’, ‘num’ matches 0 through 9.

‘control’, ‘cntrl’ matches ASCII control characters.

‘hex-digit’, ‘hex’, ‘xdigit’ matches 0 through 9, a through f and A through F.

‘blank’ matches horizontal whitespace, as defined by Annex C of the Unicode Technical Standard #18. In particular, it matches spaces, tabs, and other characters whose Unicode ‘general-category’ property indicates they are spacing separators.

‘graphic’, ‘graph’ matches graphic characters–everything except whitespace, ASCII and non-ASCII control characters, surrogates, and codepoints unassigned by Unicode.

‘printing’, ‘print’ matches whitespace and graphic characters.

‘alphanumeric’, ‘alnum’ matches alphabetic characters and digits. For multibyte characters, it matches characters whose Unicode ‘general-category’ property indicates they are alphabetic or decimal number characters.

‘letter’, ‘alphabetic’, ‘alpha’ matches alphabetic characters. For multibyte characters, it matches characters whose Unicode ‘general-category’ property indicates they are alphabetic characters.

‘ascii’ matches ASCII (unibyte) characters.

‘nonascii’ matches non-ASCII (multibyte) characters.

‘lower’, ‘lower-case’ matches anything lower-case, as determined by the current case table. If ‘case-fold-search’ is non-nil, this also matches any upper-case letter.

‘upper’, ‘upper-case’ matches anything upper-case, as determined by the current case table. If ‘case-fold-search’ is non-nil, this also matches any lower-case letter.

‘punctuation’, ‘punct’ matches punctuation. (But at present, for multibyte characters, it matches anything that has non-word syntax.)

‘space’, ‘whitespace’, ‘white’ matches anything that has whitespace syntax.

‘word’, ‘wordchar’ matches anything that has word syntax.

‘not-wordchar’ matches anything that has non-word syntax.

‘(syntax SYNTAX)’ matches a character with syntax SYNTAX. SYNTAX must be one of the following symbols, or a symbol corresponding to the syntax character, e.g. ‘.’ for ‘\s.’.

 ‘whitespace’		(\s- in string notation)
 ‘punctuation’		(\s.)
 ‘word’			(\sw)
 ‘symbol’			(\s_)
 ‘open-parenthesis’		(\s()
 ‘close-parenthesis’	(\s))
 ‘expression-prefix’	(\s’)
 ‘string-quote’		(\s")
 ‘paired-delimiter’		(\s$)
 ‘escape’			(\s\)
 ‘character-quote’		(\s/)
 ‘comment-start’		(\s<)
 ‘comment-end’		(\s>)
 ‘string-delimiter’		(\s|)
 ‘comment-delimiter’	(\s!)

‘(not (syntax SYNTAX))’ matches a character that doesn’t have syntax SYNTAX.

‘(category CATEGORY)’ matches a character with category CATEGORY. CATEGORY must be either a character to use for C, or one of the following symbols.

 ‘consonant’			(\c0 in string notation)
 ‘base-vowel’			(\c1)
 ‘upper-diacritical-mark’		(\c2)
 ‘lower-diacritical-mark’		(\c3)
 ‘tone-mark’		        (\c4)
 ‘symbol’			        (\c5)
 ‘digit’			        (\c6)
 ‘vowel-modifying-diacritical-mark’	(\c7)
 ‘vowel-sign’			(\c8)
 ‘semivowel-lower’			(\c9)
 ‘not-at-end-of-line’		(\c<)
 ‘not-at-beginning-of-line’		(\c>)
 ‘alpha-numeric-two-byte’		(\cA)
 ‘chinese-two-byte’			(\cC)
 ‘greek-two-byte’			(\cG)
 ‘japanese-hiragana-two-byte’	(\cH)
 ‘indian-two-byte’			(\cI)
 ‘japanese-katakana-two-byte’	(\cK)
 ‘korean-hangul-two-byte’		(\cN)
 ‘cyrillic-two-byte’		(\cY)
 ‘combining-diacritic’		(\c^)
 ‘ascii’				(\ca)
 ‘arabic’				(\cb)
 ‘chinese’				(\cc)
 ‘ethiopic’				(\ce)
 ‘greek’				(\cg)
 ‘korean’				(\ch)
 ‘indian’				(\ci)
 ‘japanese’				(\cj)
 ‘japanese-katakana’		(\ck)
 ‘latin’				(\cl)
 ‘lao’				(\co)
 ‘tibetan’				(\cq)
 ‘japanese-roman’			(\cr)
 ‘thai’				(\ct)
 ‘vietnamese’			(\cv)
 ‘hebrew’				(\cw)
 ‘cyrillic’				(\cy)
 ‘can-break’			(\c|)

‘(not (category CATEGORY))’ matches a character that doesn’t have category CATEGORY.

‘(and SEXP1 SEXP2 …)’ ‘(: SEXP1 SEXP2 …)’ ‘(seq SEXP1 SEXP2 …)’ ‘(sequence SEXP1 SEXP2 …)’ matches what SEXP1 matches, followed by what SEXP2 matches, etc.

‘(submatch SEXP1 SEXP2 …)’ ‘(group SEXP1 SEXP2 …)’ like ‘and’, but makes the match accessible with ‘match-end’, ‘match-beginning’, and ‘match-string’.

‘(submatch-n N SEXP1 SEXP2 …)’ ‘(group-n N SEXP1 SEXP2 …)’ like ‘group’, but make it an explicitly-numbered group with group number N.

‘(or SEXP1 SEXP2 …)’ ‘(| SEXP1 SEXP2 …)’ matches anything that matches SEXP1 or SEXP2, etc. If all args are strings, use ‘regexp-opt’ to optimize the resulting regular expression.

‘(minimal-match SEXP)’ produce a non-greedy regexp for SEXP. Normally, regexps matching zero or more occurrences of something are “greedy” in that they match as much as they can, as long as the overall regexp can still match. A non-greedy regexp matches as little as possible.

‘(maximal-match SEXP)’ produce a greedy regexp for SEXP. This is the default.

Below, ‘SEXP …’ represents a sequence of regexp forms, treated as if enclosed in ‘(and …)’.

‘(zero-or-more SEXP …)’ ‘(0+ SEXP …)’ matches zero or more occurrences of what SEXP … matches.

‘(* SEXP …)’ like ‘zero-or-more’, but always produces a greedy regexp, independent of ‘rx-greedy-flag’.

‘(*? SEXP …)’ like ‘zero-or-more’, but always produces a non-greedy regexp, independent of ‘rx-greedy-flag’.

‘(one-or-more SEXP …)’ ‘(1+ SEXP …)’ matches one or more occurrences of SEXP …

‘(+ SEXP …)’ like ‘one-or-more’, but always produces a greedy regexp.

‘(+? SEXP …)’ like ‘one-or-more’, but always produces a non-greedy regexp.

‘(zero-or-one SEXP …)’ ‘(optional SEXP …)’ ‘(opt SEXP …)’ matches zero or one occurrences of A.

‘(? SEXP …)’ like ‘zero-or-one’, but always produces a greedy regexp.

‘(?? SEXP …)’ like ‘zero-or-one’, but always produces a non-greedy regexp.

‘(repeat N SEXP)’ ‘(= N SEXP …)’ matches N occurrences.

‘(>= N SEXP …)’ matches N or more occurrences.

‘(repeat N M SEXP)’ ‘(** N M SEXP …)’ matches N to M occurrences.

‘(backref N)’ matches what was matched previously by submatch N.

‘(eval FORM)’ evaluate FORM and insert result. If result is a string, ‘regexp-quote’ it.

‘(regexp REGEXP)’ include REGEXP in string notation in the result.

Kinney · 2020-09-11 09:08

OK, I'll change it later.

icezzz · 2020-09-11 09:09

Thanks!!!! Overall it feels clearer than org-roam and fits the way I think better. Looking forward to it.

icezzz · 2020-09-11 09:57

Found another problem: if a link is used inside a headline, the Linked Reference section gets messed up.

Kinney · 2020-09-11 10:26

Yes, headlines are not supported yet; this is mentioned in the docs.

Kinney · 2020-09-11 22:56

The latest update no longer uses anychar.
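
For anyone who hits the same error in their own code on Emacs 26.3: the docstring pasted above lists ‘anything’ as the form that matches any character, so a minimal sketch of a pattern that avoids ‘anychar’ could look like the following. The constant names are made up for illustration and are not part of gk-roam.

```elisp
(require 'rx)

;; Hypothetical names, for illustration only; not gk-roam code.
(defconst my-any-char-block-regexp
  ;; `anything' appears in the 26.3 docstring above; unlike
  ;; `not-newline' it also matches newline characters.
  (rx string-start (one-or-more anything) string-end)
  "Match a non-empty string of any characters, newlines included.")

(defconst my-single-line-regexp
  (rx string-start (one-or-more not-newline) string-end)
  "Match a non-empty string that contains no newline.")

;; The first pattern spans the newline, the second cannot.
(string-match-p my-any-char-block-regexp "line 1\nline 2") ; non-nil
(string-match-p my-single-line-regexp "line 1\nline 2")    ; nil
```

The exact regexp string that ‘rx’ generates for ‘anything’ may differ between Emacs versions, so it is safer to rely on the matching behaviour than on the generated string.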

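Kinney · 2020-09-11 23:11

Update 20200911

Page file names are now a date-digit string with the title removed. The file name is used only to identify the page.
Added gk-roam-daily, which quickly opens today's daily notes page.
The page link format is now {[title]}, and a new hashtag format, #{[title]}, has been added.
After typing brackets or a hashtag, page names are completed automatically.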
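
To make the two formats above concrete, here is a minimal sketch of a regexp that recognises both {[title]} and #{[title]} in a string. It only uses rx forms listed in the 26.3 docstring earlier in the thread. The names `my-roam-link-regexp` and `my-roam-collect-links` are hypothetical, and this is not necessarily how gk-roam itself parses links.

```elisp
(require 'rx)

;; Hypothetical names, for illustration only; not gk-roam code.
(defconst my-roam-link-regexp
  (rx (group (opt "#"))                 ; group 1: "#" for a hashtag, else ""
      "{["
      (group (+ (not (any "]"))))       ; group 2: the page title
      "]}")
  "Match a {[title]} link or a #{[title]} hashtag.")

(defun my-roam-collect-links (text)
  "Return a list of (TYPE . TITLE) for every link or hashtag in TEXT.
TYPE is the symbol `link' or `hashtag'."
  (let ((start 0)
        links)
    (while (string-match my-roam-link-regexp text start)
      (push (cons (if (string= (match-string 1 text) "#") 'hashtag 'link)
                  (match-string 2 text))
            links)
      ;; Continue searching after the end of the current match.
      (setq start (match-end 0)))
    (nreverse links)))

(my-roam-collect-links "see {[Emacs]} and #{[todo]}")
;; => ((link . "Emacs") (hashtag . "todo"))
```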

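Note

The latest master branch uses specially formatted bracket links and hashtags, so links are not resolved correctly when publishing to HTML; they need to be converted to org links. (Not solved yet.)
Because the latest update changed both the file name format and the link format, users of the old version need to rename their files by hand, delete their existing links, and insert them again.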
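
Until the publishing issue is solved, one possible workaround is to rewrite the bracket links as org file links before export. The sketch below is only an assumption about how that could be done: `my-roam-links-to-org` and the title-to-file alist are hypothetical, and the file name in the usage example is invented. Only `replace-regexp-in-string` and the org `[[file:...][description]]` syntax are standard.

```elisp
;; Hypothetical helper, for illustration only; not gk-roam code.
(defun my-roam-links-to-org (text title-to-file)
  "Replace {[title]} and #{[title]} in TEXT with org file links.
TITLE-TO-FILE is an alist mapping page titles to file names.
Links whose title is not in the alist are left unchanged."
  (replace-regexp-in-string
   "#?{\\[\\([^]]+\\)]}"
   (lambda (match)
     ;; Inside this function the match data refers to MATCH itself.
     (let* ((title (match-string 1 match))
            (file (cdr (assoc title title-to-file))))
       (if file
           (format "[[file:%s][%s]]" file title)
         match)))
   text
   t    ; FIXEDCASE: do not alter the case of the replacement
   t))  ; LITERAL: insert the replacement text verbatim

(my-roam-links-to-org "see {[Emacs]} and #{[todo]}"
                      '(("Emacs" . "20200911.org")))
;; => "see [[file:20200911.org][Emacs]] and #{[todo]}"
```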

Youmu · 2020-09-12 04:59

When will headline references be supported?

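Kinney · 2020-09-12 06:15

I want to get the basic features done first and then think about headlines. I haven't worked out how headline references should be handled. In the roam way of thinking, every page is an independent unit: you create pages freely, and referencing a page is all it takes to connect different units. Headlines break this idea. A page is no longer independent but becomes a file that can contain many sub-units, so when creating a headline link we have to think about which file it belongs to, and that adds a burden to the workflow.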

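zbelial · 2020-09-13 02:20

Isn't headline support mostly about migrating existing org files to roam? My feeling is that headlines may add mental overhead. I'm using org-roam now with one file per topic; I only care about the connections between topics, not about how the files are organized, and it works quite smoothly.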

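VagrantJoker · 2020-09-13 02:59

roam-research supports block references, which are somewhat similar to org headlines.

As for the added burden, org-roam's approach of giving each headline an id works quite well. You don't need to think about which file the headline belongs to.

~~The single-file approach is indeed convenient to work with, but once the number of files grows, performance problems are unavoidable. org-roam is facing exactly this situation right now~~.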

Kinney · 2020-09-13 03:07

Block references are really useful. I plan to look into how to implement them.

Kinney · 2020-09-13 04:16

I've just released a new version and reorganized the content. Please move the discussion to the new thread.

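zbelial · 2020-09-13 09:51

I read through the discussion on this issue, and it seems the problem is not the number of files. Quite the opposite: it is the size of the files.

With many files, the poor performance comes first from deft and second from company, not from org-roam. (This is the issue reporter's own feedback.)

With large files, the poor performance comes from having to parse the org file, so a later PR stops parsing it into an AST and does the lookup with regular expressions instead.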
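
As a rough illustration of the regexp-based lookup idea (not the code from that PR, which is not shown in this thread), the sketch below collects the targets of bracket links by scanning the text with `re-search-forward` instead of building a full syntax tree. The function name and the simplified pattern are assumptions, and the pattern does not cover every link form org supports.

```elisp
;; Hypothetical helper, for illustration only; not org-roam code.
(defun my-collect-bracket-link-targets (org-text)
  "Return the targets of all [[target]] / [[target][desc]] links in ORG-TEXT.
The text is scanned with a regexp; it is never parsed into an AST."
  (with-temp-buffer
    (insert org-text)
    (goto-char (point-min))
    (let (targets)
      ;; "[[" followed by the target (no "]" or newline) and a closing "]".
      (while (re-search-forward "\\[\\[\\([^]\n]+\\)\\]" nil t)
        (push (match-string 1) targets))
      (nreverse targets))))

(my-collect-bracket-link-targets
 "* Heading\nSee [[file:a.org][A]] and [[id:123]].")
;; => ("file:a.org" "id:123")
```

A single linear scan like this does work proportional to the file size and builds no parse tree, which is presumably why it helps on large files.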

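VagrantJoker · 2020-09-13 10:16

I didn't read it carefully, sorry.

Mainly, when I first tried it I tested with test-org-files (about 1,000 files), and the performance was not great, so I jumped to conclusions and forgot about file size as a factor.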