org #+include URL

For sites that aren't covered, I usually use rss-bridge or FreshRSS and write a CSS selector or XPath for them.

I gave it a try: at the moment org-feed does not convert HTML to Org.

For simple cases pandoc also works; the output formatting is not great, but it is convenient: C-u M-! pandoc -f html URL -t org -o-
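
The pandoc one-liner can also be called from elisp instead of through M-!. A minimal sketch, assuming the pandoc executable is on PATH and is allowed to fetch the URL itself; the function name my/url-to-org-string is hypothetical, not part of Org or pandoc:

#+begin_src emacs-lisp
  (defun my/url-to-org-string (url)
    "Return the page at URL converted to Org text by pandoc.
  Hypothetical helper; assumes the `pandoc' executable is installed."
    (with-temp-buffer
      ;; Same arguments as the shell form: pandoc -f html URL -t org -o-
      (let ((status (call-process "pandoc" nil t nil
                                  "-f" "html" url "-t" "org" "-o" "-")))
        (unless (eq status 0)
          (error "pandoc exited with status %s" status))
        (buffer-string))))
#+end_src

Passing the arguments as a list to call-process avoids shell quoting problems with characters such as ? and & in the URL.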

Going a step further, I reworked x-include so that it can not only fetch remote code and execute it, but also expand remote code:

#+name: elisp-2025-03-26-18-21
#+begin_src emacs-lisp
  (defun org-babel--src-block (&optional there name header-args expand-only)
    "Execute or expand source block defined in `THERE'.
  Source block is specified by `NAME' which is a string or a number index.
  `THERE' could be a buffer or buffer-name or file-name or http/https url.
  `HEADER-ARGS' is a string with format like \":var foo=\"bar\" :eval no\".
  Return the execution result or the expanding source of the source block.
  "
    (when-let*
        ((org-babel--src-block-prog
          #'(lambda (&optional name params)
              (when-let*
                  ((name (or name 0))
                   (info (cond
                          ((stringp name) (org-babel-lob--src-info name))
                          ((numberp name)
                           (catch 'break
                             (org-babel-map-src-blocks nil
                               (when (= name 0)
                                 (throw 'break (org-babel-get-src-block-info t)))
                               (setf name (1- name)))
                             ;; Index out of range: return nil explicitly.
                             nil))))
                   (_ (progn
                        (cl-callf org-babel-merge-params (nth 2 info) params)
                        (cl-callf org-babel-process-params (nth 2 info))))
                   (body (org-babel--expand-body info))
                   (lang (nth 0 info))
                   (cmd (let ((fn (intern (concat "org-babel-execute:" lang))))
                          ;; `intern' never returns nil, so test `fboundp' explicitly.
                          (if (fboundp fn)
                              fn
                            `(lambda (&rest _)
                               (error
                                "No org-babel-execute function for %s!" ,lang))))))
                `(,cmd ,body ',(nth 2 info)))))
         (buf (or (and (or (null there) (string-empty-p there)) (current-buffer))
                  (get-buffer there)
                  (find-buffer-visiting there)
                  (and (file-exists-p there) (find-file-noselect there))
                  (when-let* ;; try URL, support only http/https currently.
                      ((url (if (url-p there) there (url-generic-parse-url there)))
                       (_ (member (url-type url) `("http" "https")))
                       (buf (url-retrieve-synchronously url))
                       (text (with-current-buffer buf
                               (dom-texts (libxml-parse-html-region nil nil)
                                          "\n@@@\n"))))
                    (with-current-buffer (generate-new-buffer "*temp-text*" t)
                      (save-excursion (org-mode) (insert text) (current-buffer)))))))
      (with-current-buffer buf
        (let* ((params (org-babel-parse-header-arguments header-args))
               (prog (funcall org-babel--src-block-prog name params))
               (results (if prog (if expand-only (nth 1 prog) (eval prog)) "")))
          (if (string= (buffer-name) "*temp-text*") (kill-buffer buf))
          results))))
#+end_src
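
When NAME is a number, the function above walks the buffer's source blocks in order and stops at that index. A minimal standalone sketch of just that lookup, assuming a zero-based index as in the code above; the helper name my/nth-src-block-body is hypothetical:

#+begin_src emacs-lisp
  (require 'org)
  (require 'ob-core)

  (defun my/nth-src-block-body (n)
    "Return the body of the Nth (zero-based) source block in the current buffer.
  Return nil when the buffer has fewer than N+1 source blocks."
    (catch 'found
      (org-babel-map-src-blocks nil
        ;; Point is at the start of the current block here.
        (when (= n 0)
          (throw 'found (nth 1 (org-babel-get-src-block-info t))))
        (setq n (1- n)))
      ;; Ran out of blocks before reaching index N.
      nil))
#+end_src

Called in an Org buffer, (my/nth-src-block-body 0) gives the body of the first block and (my/nth-src-block-body 1) the second, which is the numbering that name=1 and name=2 rely on further down.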

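With that in place, there are several ways to write a converter. First: the converter is just a single source block, and arguments are passed through org-babel header-args: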

#+name: elisp-2025-03-26-16-30
#+begin_src emacs-lisp
  (let* ((dom (with-temp-buffer
                ;; Quote the URL so the shell does not interpret ? or &.
                (shell-command (concat "curl -s " (shell-quote-argument url))
                               (current-buffer))
                (libxml-parse-html-region (point-min) (point-max)))))
    (s-join "\n"
            (-map #'dom-texts
                  (dom-non-text-children (dom-by-class dom "^post$")))))
#+end_src

It is invoked like this:

#+begin_src emacs-lisp :results drawer
  (org-babel--src-block
   "https://emacs-china.org/t/org-include-url/29242" "elisp-2025-03-26-16-30"
   ":var url=\"https://emacs-china.org/t/org-include-url/29242\"")
#+end_src

Second: the converter is a source block whose content is the sexp text of a lambda function:

#+begin_src emacs-lisp :noweb yes
  (lambda (url)
    <<elisp-2025-03-26-16-30>>)
#+end_src

It is invoked like this (the sexp text is first read into a lambda, which is then called with its argument via funcall):

#+begin_src emacs-lisp :results drawer
  (funcall
   (read
    (org-babel--src-block
     "https://emacs-china.org/t/org-include-url/29242" nil nil t))
   "https://emacs-china.org/t/org-include-url/29242")
#+end_src
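
The read-then-funcall step can be tried offline, without fetching anything. A minimal sketch in which a literal string stands in for the expanded block body; the helper name my/call-sexp-string is hypothetical:

#+begin_src emacs-lisp
  (defun my/call-sexp-string (sexp-text &rest args)
    "Read SEXP-TEXT as a lambda form and call it with ARGS."
    ;; `read' turns the text into the list (lambda ...), which can be called.
    (apply (read sexp-text) args))

  (my/call-sexp-string "(lambda (x) (* 2 x))" 21) ; => 42
#+end_src

Keep in mind that this runs whatever code the text contains, so with a remote page it is only as trustworthy as that page.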

Building on org-babel--src-block, provide an org-babel function that wraps it, so that source blocks located anywhere can be expanded with the <<expand(there="foo",name="bar")>> syntax.

#+name: elisp-2025-03-26-17-50
#+begin_src emacs-lisp
  (unless (assoc 'expand org-babel-library-of-babel)
    (with-temp-buffer
      (org-mode)
      (insert
       "#+name: expand\n"
       "#+begin_src emacs-lisp\n"
       "(org-babel--src-block\n"
       "  (if (boundp 'there) there) (if (boundp 'name) name) nil t)\n"
       "#+end_src")
      (org-babel-lob-ingest)))
#+end_src

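Take the following block as an example: when you press C-c C-c on it, expand inserts the block named name from the location there into this block; the expanded code is then executed, and its result is output.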

#+begin_src emacs-lisp :noweb yes :var a="b"
<<expand(there="https://emacs-china.org/t/org-include-url/29242",name="elisp-2025-03-26-17-29-test")>>
#+end_src

This gives a third way to write the converter:

#+begin_src emacs-lisp :noweb yes :results drawer
  (funcall
   <<expand(there="https://emacs-china.org/t/org-include-url/29242")>>
   "https://emacs-china.org/t/org-include-url/29242")
#+end_src

The complete code for this experiment:

#+begin_src emacs-lisp :noweb yes :tangle ~/2025-03-26-18-25.el
  <<expand(there="https://emacs-china.org/t/org-include-url/29242/23",name="elisp-2025-03-26-18-21")>>
  <<expand(there="https://emacs-china.org/t/org-include-url/29242/23",name="elisp-2025-03-26-17-50")>>
#+end_src

Another application:

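First define the elisp function org-babel--src-block from the elisp-2025-03-26-18-21 block above; then define the org-babel function expand from the elisp-2025-03-26-17-50 block above; finally copy the block below into some buffer and press C-c C-v t.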

#+begin_src emacs-lisp :noweb yes :tangle ~/org/test.el
  <<expand(there="https://orgmode.org/manual/Working-with-Source-Code.html")>>
  <<expand(there="https://orgmode.org/manual/Literal-Examples.html",name=1)>>
  <<expand(there="https://orgmode.org/manual/Literal-Examples.html",name=2)>>
#+end_src

After that, the tangled file ~/org/test.el will contain:

#+begin_src emacs-lisp
(defun org-xor (a b)
   "Exclusive or."
   (if a (not b) b))
;; This exports with line number 20.
(message "This is line 21")
;; This is listed as line 31.
(message "This is line 32")
#+end_src