Start Emacs with emacs -Q, paste the following code into *scratch*, then run eval-buffer:
;;; -*- lexical-binding:t -*-
(require 'cl-lib)
(require 'url)
(cl-defun my-test-user-agent ()
  (interactive)
  (setq url-debug t)
  (let ((url-user-agent "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Safari/537.36 Edg/110.0.1587.57"))
    (url-retrieve "https://httpbin.org/get?a=b" (lambda (_)))))
Then run M-x my-test-user-agent and open *URL-DEBUG*: the User-Agent has not changed to the string bound in the let.
So how should url-user-agent be changed?
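This has nothing to do with cl-defun versus defun.
url-retrieve is asynchronous. By the time the request is actually built and url-user-agent is read, execution may already have left the scope of the let.
At that point url-user-agent is back to its original value, so the binding looks as if it had no effect.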
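The scoping effect described here can be reproduced without any network access. The following is only a minimal sketch: my-demo-agent, my-demo-make-reader and my-demo-results are hypothetical names, standing in for url-user-agent and for code that url runs later; they are not part of the url library.

```elisp
;;; -*- lexical-binding: t -*-

;; A special (dynamically scoped) variable, standing in for `url-user-agent'.
(defvar my-demo-agent "default")

(defun my-demo-make-reader ()
  "Return a function that reads `my-demo-agent' at the time it is called."
  ;; A closure does not capture the value of a special variable.
  (lambda () my-demo-agent))

(defvar my-demo-results
  (let (reader inside)
    (let ((my-demo-agent "curl"))
      (setq reader (my-demo-make-reader))
      ;; Called while the `let' is still active: sees the binding.
      (setq inside (funcall reader)))
    ;; Called after the `let' has exited: the binding is gone.
    (list inside (funcall reader))))

my-demo-results
;; => ("curl" "default")
```

The second call plays the role of a request that is built only after url-retrieve has returned.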
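That is not how to tell whether url-user-agent took effect; you have to look at what the server received.
Testing with https://ifconfig.net/: when the user-agent is curl it returns the IP, otherwise it returns HTML: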
(let ((url-user-agent "curl"))
  (url-retrieve
   "https://ifconfig.net"
   (lambda (_)
     (with-current-buffer (current-buffer)
       (when (re-search-forward "\n\n" nil t)
         (print (buffer-substring (point) (point-max))))))))
;; => xx.xx.xx.xxx
(let ((url-user-agent "curl"))
  (with-current-buffer (url-retrieve-synchronously "https://ifconfig.net/")
    (when (re-search-backward "\n\n" nil t)
      (buffer-substring (point) (point-max)))))
;; => xx.xx.xx.xxx
EDIT:
In fact, *URL-DEBUG* also shows the modified user-agent:
http -> Found existing connection: ifconfig.net:443 #
http -> Reusing existing connection: ifconfig.net:443
http -> Marking connection as busy: ifconfig.net:443 #
http -> getting referer from buffer: buffer:# target-url:#s(url "https" nil nil "ifconfig.net" nil "" nil nil t nil t t) lastloc:nil
http -> Request is:
GET / HTTP/1.1
MIME-Version: 1.0
Connection: keep-alive
Host: ifconfig.net
Accept-encoding: gzip
Accept: */*
User-Agent: curl
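The first time you request "https://ifconfig.net", url-user-agent is not changed successfully.
Later requests to the same site reuse the connection, so the request has already been sent before url-retrieve returns, and url-user-agent is changed as expected.
With your code above, the first run returned HTML and later runs returned the IP.
The code in url that handles a first connection and a reused connection: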
⋊> emacs -Q --eval "\
(progn
  (defvar data-buffer nil)
  (let ((url-user-agent \"curl\")
        (url-debug t)
        (proc-buffer
         (url-retrieve
          \"https://ifconfig.net\"
          (lambda (_)
            (with-current-buffer (setq data-buffer (current-buffer))
              (when (re-search-forward \"\n\n\" nil t)
                (message \"==> Response buffer\")
                (message \"==> %s\" (buffer-substring (point) (point-max)))))))))
    (while (not data-buffer)
      (sit-for 0.1))
    (with-current-buffer \"*URL-DEBUG*\"
      (message \"==> *URL-DEBUG*\")
      (goto-char (point-min))
      (while (re-search-forward \"^User-Agent:.*\" nil t)
        (message \"==> %s\" (buffer-substring (match-beginning 0) (match-end 0)))))))" --batch
Contacting host: ifconfig.net:443
==> Response buffer
==> xx.xx.xx.xxx
==> *URL-DEBUG*
==> User-Agent: curl
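Try moving the sit-for outside the let scope. In the code above the whole request happens inside the let scope, so url-user-agent is bound to take effect.
That way there is indeed a problem.
The documentation says: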
The variables ‘url-request-data’, ‘url-request-method’ and
‘url-request-extra-headers’ can be dynamically bound around the
request; dynamic binding of other variables doesn’t necessarily
take effect.
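I think it would be more reasonable for dynamic bindings of the other url-* variables to take effect as well.
But (let ((url-request-extra-headers '(("User-Agent" . "curl")))) ...) does not take effect either, so this part of the url code probably has a problem.
In principle url-request-extra-headers could be used to bind User-Agent dynamically, but when the HTTP request is built the User-Agent header comes from a call to (url-http-user-agent-string), regardless of what url-request-extra-headers contains: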
(with-emacs
  (defvar data-buffer nil)
  (setq url-debug t)
  (let ((url-request-extra-headers '(("User-Agent" . "curl"))))
    (url-retrieve
     "https://ifconfig.net"
     (lambda (_)
       (with-current-buffer (current-buffer)
         (when (re-search-forward "\n\n" nil t)
           (message "==> Response callback")
           (message "%s" (truncate-string-to-width
                          (buffer-substring (point) (point-max))
                          100 nil nil t))))
       (setq data-buffer t))))
  (while (not data-buffer)
    (sit-for 0.1))
  (with-current-buffer "*URL-DEBUG*"
    (message "==> *URL-DEBUG*")
    (goto-char (point-min))
    (while (re-search-forward "^User-Agent:.*" nil t)
      (message "%s" (buffer-substring (match-beginning 0) (match-end 0))))))
;; ==> Response callback
;; <!DOCTYPE html>
;; <html lang="en">
;; <head>
;; <meta charset="utf-8" />
;; <title>What is my IP addre...
;; ==> *URL-DEBUG*
;; User-Agent: URL/Emacs Emacs/29.0.60 (TTY; x86_64-apple-darwin17.7.0)
;; User-Agent: curl
*URL-DEBUG* shows two different User-Agent headers.
Setting url-user-agent again as a buffer-local variable in the buffer returned by url-retrieve fixes it:
(with-emacs
  (defvar data-buffer nil)
  (setq url-debug t)
  (let ((url-user-agent "curl"))
    (with-current-buffer
        (url-retrieve
         "https://ifconfig.net"
         (lambda (_)
           (with-current-buffer (current-buffer)
             (when (re-search-forward "\n\n" nil t)
               (message "==> Response callback")
               (message "%s" (truncate-string-to-width
                              (buffer-substring (point) (point-max))
                              100 nil nil t))))
           (setq data-buffer t)))
      (set (make-local-variable 'url-user-agent) url-user-agent))) ;; +++
  (while (not data-buffer)
    (sit-for 0.1))
  (with-current-buffer "*URL-DEBUG*"
    (message "==> *URL-DEBUG*")
    (goto-char (point-min))
    (while (re-search-forward "^User-Agent:.*" nil t)
      (message "%s" (buffer-substring (match-beginning 0) (match-end 0))))))
Output:
Contacting host: ifconfig.net:443
==> Response callback
xx.xx.xx.xxx
==> *URL-DEBUG*
User-Agent: curl
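The buffer-local workaround could be wrapped in a small helper. This is only a sketch based on the behaviour observed in this thread: my-url-retrieve-with-user-agent is a hypothetical name, and it assumes that url builds the delayed request with the buffer returned by url-retrieve as the current buffer, which is what the experiment above suggests but which I have not checked against every Emacs version.

```elisp
;;; -*- lexical-binding: t -*-
(require 'url)

(defun my-url-retrieve-with-user-agent (url user-agent callback)
  "Retrieve URL asynchronously, asking url to send USER-AGENT.
CALLBACK is passed to `url-retrieve' unchanged.  Return whatever
`url-retrieve' returns, normally the response buffer."
  (let* ((url-user-agent user-agent)   ; seen by requests sent before return
         (buffer (url-retrieve url callback)))
    ;; Seen by requests that are built later, after this `let*' has exited.
    (when (buffer-live-p buffer)
      (with-current-buffer buffer
        (setq-local url-user-agent user-agent)))
    buffer))
```

It would be called like (my-url-retrieve-with-user-agent "https://ifconfig.net" "curl" (lambda (_status) ...)), with the callback reading the response from the current buffer as in the examples above.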