Help implementing better out-of-the-box xelatex export in Org

Hi,

Pedro Andres Aranda Gutierrez has recently been working on better LaTeX export of non-Latin Org documents, including those written in Chinese. As part of this work, we are looking for a suitable standard LaTeX preamble that will suit most Chinese documents. Apart from setting appropriate Unicode fonts, we also need to load some packages like \usepackage[CJKspace]{xeCJK}.

Another package we often see in various online forums is \usepackage{xpinyin}, but we are not 100% sure if it is really something that is commonly needed by native Chinese Org users. Could you please share your experience?

The ongoing discussion of the WIP feature branch is in the thread "Status of the all-tex-fonts feature branch". The branch itself is at https://cgit.git.savannah.gnu.org/cgit/emacs/org-mode.git/log/?h=feature%2Fall-tex-fonts



The popular, universal all-in-one solution for writing Chinese with LaTeX is the ctex package (CTAN: Package ctex), loaded either with

\usepackage{ctex}

or by using one of the ctexart, ctexrep, ctexbook, or ctexbeamer document classes, which auto-detect the TeX engine and load appropriate support packages such as xeCJK for XeTeX or LuaTeX-ja for LuaTeX.
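As a minimal sketch of the document-class route (the file below is illustrative and not taken from an actual Org export), a complete document could look like this. It assumes a TeX distribution with ctex and suitable Chinese fonts installed, and should compile with either xelatex or lualatex:

```latex
% Minimal Chinese document using a ctex document class.
% ctexart picks the CJK support package for the engine in use.
\documentclass{ctexart}
\begin{document}
你好，世界。
\end{document}
```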

xpinyin

No, it is for adding phonetic annotations (pinyin), which is only relevant in very specific scenarios such as preparing exam papers for elementary school.


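Let me piggyback on this thread to ask: when using the ctex package, bold text in Org is converted to \textbf{} in the intermediate tex file. When the content of {} is Chinese, extra whitespace (spaces) appears on both sides in the PDF. I searched around and there does not seem to be a good solution.

This is what happens when a markup language (or, if Org is not a markup language, its markup feature) is not designed robustly enough.

In a sense, Org may be worse than Markdown at doing the (core?) job of markup.

I seem to remember a workaround being posted on the forum before.

Use the XeTeX engine with the ctex option space=auto.

That does not seem to have any effect.

Then org-mode itself would have to be modified.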

Can someone please create a bug report about this? We have discussed potential solutions for Org markup when using Chinese on the list, but that was rather theoretical, without input from native speakers.

One of the possibilities I do recall is using zero-width space. You can get bold inside Org buffers that way: 你好​*你好*​你好. (you can copy-paste into Org buffer and see that “hello” in the middle is fontified as bold). This, however does not work with latex export as latex typesets zero-width spaces as normal spaces by default. So, we further discussed a possibility to cleanup zero-width spaces around markup during export. Or, alternatively, there might be a way to do the same in LaTeX:

#+LANGUAGE: zh
#+LATEX_HEADER: \usepackage[UTF8]{ctex}
#+LATEX_HEADER: \usepackage{newunicodechar}
#+LATEX_HEADER: \newunicodechar{​}{}
#+LATEX_COMPILER: xelatex
* This is test
你好​*你好*​你好.
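Because the zero-width space is invisible, the snippet above is hard to read and copy. As a sketch of what the LaTeX side of this idea could look like as a standalone file, U+200B can be written with TeX's ^^^^200b notation so that it shows up in the source. That newunicodechar and xeCJK cooperate this way is an assumption here, not a confirmed fix:

```latex
% Compile with xelatex.
\documentclass{article}
\usepackage[UTF8]{ctex}
\usepackage{newunicodechar}
% ^^^^200b is TeX notation for U+200B (zero-width space).
% Define it to expand to nothing so that it is not typeset.
\newunicodechar{^^^^200b}{}
\begin{document}
你好^^^^200b\textbf{你好}^^^^200b你好.
\end{document}
```

If the assumption holds, the bold text should be set flush against its neighbours, without the extra space.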

sure.

This problem should be ctex's, not org-mode's, because the intermediate tex file contains neither a zero-width space nor a regular space.

There will be zero-width characters if you use the suggested 你好​*你好*​你好. Here, I used the official way Org supports inline emphasis (Escape Character (The Org Manual)) - adding zero-width spaces around the emphasis markers. These markers, however, are also preserved in LaTeX, where they are rendered as full-width spaces.


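If the intermediate TeX really contained no zero-width space or regular space, there should not be any space in the PDF either.

So it is unlikely to be a ctex problem.

I think we should just add regular spaces and then use an export hook to delete the unwanted spaces from the generated tex file.

A package I tinkered with many years ago, not sure whether it still works: GitHub - tumashu/org2ctex: Export org to ctex (a latex macro for Chinese)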

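What the Org developers now seem to be planning is to use ZWS (zero-width space); they are currently working on CJK support and on how to get TeX to handle (or rather, ignore) this space properly. [1]

That is not a bad solution either. I think there was also an approach posted on the forum earlier that automatically inserts a ZWS when the cursor leaves the current line.

[1] Because xeCJK renders it as a full-width space.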

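Honestly, I feel that inserting ZWS is still too hacky. It would be better to provide a setting that enables recognition of markup that has no spaces on either side.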

That would be too breaking and also against the original Org markup design. We have discussed alternative approaches to provide intraword markup in the past: (1) adding multiple markers **bold**; (2) adding a more verbose version of the markup @*{bold} (a special kind of inline special block).

