<h3 id="toc">Table of Contents</h3>
<h3 id="partial">Full / partial translation</h3>
  • Entire document

    bash
    pdf2zh example.pdf
    
  • Part of the document

    bash
    pdf2zh example.pdf -p 1-3,5
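
To make the page specification concrete, the following is a minimal sketch of how a value such as 1-3,5 can be expanded into individual pages. It assumes that ranges are inclusive and that page numbers are 1-based; parse_page_spec is a hypothetical helper for illustration, not part of the pdf2zh API.

```python
def parse_page_spec(spec: str) -> list[int]:
    """Expand a spec such as "1-3,5" into a sorted list of page numbers.

    Assumption: ranges are inclusive and pages are 1-based.
    """
    pages: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            # "1-3" covers pages 1, 2 and 3.
            start, end = part.split("-", 1)
            pages.update(range(int(start), int(end) + 1))
        else:
            pages.add(int(part))
    return sorted(pages)


print(parse_page_spec("1-3,5"))  # [1, 2, 3, 5]
```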
    

⬆️ Back to top


<h3 id="language">Specify source and target languages</h3>

Use -li to set the source language and -lo to set the target language. See Google Language Codes and DeepL Language Codes for the supported codes.

bash
pdf2zh example.pdf -li en -lo ja

⬆️ Back to top


<h3 id="services">Translate with different services</h3>

The table below lists the environment variables required by each translation service. Set them before using the respective service.

| Translator | Service | Environment Variables | Default Values | Notes |
|---|---|---|---|---|
| Google (Default) | google | None | N/A | None |
| Bing | bing | None | N/A | None |
| 302.AI | 302ai | X302AI_API_KEY, X302AI_MODEL | [Your Key], Gemma-7B | See 302.AI |
| OpenAI | openai | OPENAI_BASE_URL, OPENAI_API_KEY, OPENAI_MODEL | https://api.openai.com/v1, [Your Key], gpt-4o-mini | See OpenAI |
| DeepL | deepl | DEEPL_AUTH_KEY | [Your Key] | See DeepL |
| DeepLX | deeplx | DEEPLX_ENDPOINT | https://api.deepl.com/translate | See DeepLX |
| Ollama | ollama | OLLAMA_HOST, OLLAMA_MODEL | http://127.0.0.1:11434, gemma2 | See Ollama |
| Xinference | xinference | XINFERENCE_HOST, XINFERENCE_MODEL | http://127.0.0.1:9997, gemma-2-it | See Xinference |
| AzureOpenAI | azure-openai | AZURE_OPENAI_BASE_URL, AZURE_OPENAI_API_KEY, AZURE_OPENAI_MODEL | [Your Endpoint], [Your Key], gpt-4o-mini | See Azure OpenAI |
| Zhipu | zhipu | ZHIPU_API_KEY, ZHIPU_MODEL | [Your Key], glm-4-flash | See Zhipu |
| ModelScope | modelscope | MODELSCOPE_API_KEY, MODELSCOPE_MODEL | [Your Key], Qwen/Qwen2.5-Coder-32B-Instruct | See ModelScope |
| Silicon | silicon | SILICON_API_KEY, SILICON_MODEL | [Your Key], Qwen/Qwen2.5-7B-Instruct | See SiliconCloud |
| Gemini | gemini | GEMINI_API_KEY, GEMINI_MODEL | [Your Key], gemini-1.5-flash | See Gemini |
| Azure | azure | AZURE_ENDPOINT, AZURE_API_KEY | https://api.translator.azure.cn, [Your Key] | See Azure |
| Tencent | tencent | TENCENTCLOUD_SECRET_ID, TENCENTCLOUD_SECRET_KEY | [Your ID], [Your Key] | See Tencent |
| Dify | dify | DIFY_API_URL, DIFY_API_KEY | [Your DIFY URL], [Your Key] | See Dify. Three variables, lang_out, lang_in, and text, need to be defined in Dify's workflow input. |
| AnythingLLM | anythingllm | AnythingLLM_URL, AnythingLLM_APIKEY | [Your AnythingLLM URL], [Your Key] | See anything-llm |
| Argos Translate | argos | | | See argos-translate |
| Grok | grok | GORK_API_KEY, GORK_MODEL | [Your GORK_API_KEY], grok-2-1212 | See Grok |
| Groq | groq | GROQ_API_KEY, GROQ_MODEL | [Your GROQ_API_KEY], llama-3-3-70b-versatile | See Groq |
| DeepSeek | deepseek | DEEPSEEK_API_KEY, DEEPSEEK_MODEL | [Your DEEPSEEK_API_KEY], deepseek-chat | See DeepSeek |
| OpenAI-Liked | openailiked | OPENAILIKED_BASE_URL, OPENAILIKED_API_KEY, OPENAILIKED_MODEL | url, [Your Key], model name | None |
| Ali Qwen Translation | qwen-mt | ALI_MODEL, ALI_API_KEY, ALI_DOMAINS | qwen-mt-turbo, [Your Key], scientific paper | Traditional Chinese is not yet supported; it will be translated into Simplified Chinese. For more, see Qwen MT. |

For large language models that are compatible with the OpenAI API but not listed in the table above, you can set environment variables using the same method outlined for OpenAI in the table.

Use -s service or -s service:model to specify the service and, optionally, the model:

bash
pdf2zh example.pdf -s openai:gpt-4o-mini

Or specify the model with an environment variable:

bash
# On Windows cmd, use: set OPENAI_MODEL=gpt-4o-mini
export OPENAI_MODEL=gpt-4o-mini
pdf2zh example.pdf -s openai

For PowerShell users:

shell
$env:OPENAI_MODEL = "gpt-4o-mini"
pdf2zh example.pdf -s openai
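
The service:model form accepted by -s can be pictured as a split on the first colon. The sketch below is an illustration only: split_service_spec is a hypothetical helper, and splitting on the first colon (so that a model name may itself contain colons) is an assumption rather than documented behaviour.

```python
from typing import Optional, Tuple


def split_service_spec(spec: str) -> Tuple[str, Optional[str]]:
    """Split "service" or "service:model" into (service, model).

    Assumption: only the first colon separates the service from the model.
    """
    service, sep, model = spec.partition(":")
    return service, (model if sep else None)


print(split_service_spec("openai:gpt-4o-mini"))  # ('openai', 'gpt-4o-mini')
print(split_service_spec("openai"))              # ('openai', None)
```

When no model is given, as in the second call, the model would come from the environment variable listed for that service in the table above.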

⬆️ Back to top


<h3 id="exceptions">Translate with exceptions</h3>

Use regex to specify formula fonts and characters that need to be preserved:

bash
pdf2zh example.pdf -f "(CM[^RT].*|MS.*|.*Ital)" -c "(\(|\||\)|\+|=|\d|[\u0080-\ufaff])"

By default, LaTeX, Mono, Code, Italic, Symbol and Math fonts are preserved, which is equivalent to:

bash
pdf2zh example.pdf -f "(CM[^R]|MS.M|XY|MT|BL|RM|EU|LA|RS|LINE|LCIRCLE|TeX-|rsfs|txsy|wasy|stmary|.*Mono|.*Code|.*Ital|.*Sym|.*Math)"
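
Before running a translation, it can help to check which font names and characters a pattern would preserve. The sketch below tests the -f and -c patterns from the first example with Python's re module. It assumes the patterns are applied with re.match, that is, anchored at the start of the font name or character; the helper names and the sample font names are illustrative.

```python
import re

# Patterns copied from the first -f / -c example in this section.
FONT_PATTERN = r"(CM[^RT].*|MS.*|.*Ital)"
CHAR_PATTERN = r"(\(|\||\)|\+|=|\d|[\u0080-\ufaff])"


def is_preserved_font(font_name: str) -> bool:
    """True if the font name matches the formula-font pattern."""
    return re.match(FONT_PATTERN, font_name) is not None


def is_preserved_char(char: str) -> bool:
    """True if the character matches the preserved-character pattern."""
    return re.match(CHAR_PATTERN, char) is not None


for name in ["CMMI10", "CMR10", "MSBM10", "Times-Italic", "Helvetica"]:
    print(name, is_preserved_font(name))

for char in ["=", "7", "a", "α"]:
    print(char, is_preserved_char(char))
```

Here CMMI10, MSBM10 and Times-Italic match, while CMR10 and Helvetica do not, because CM[^RT] excludes the Computer Modern roman and typewriter families.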

⬆️ Back to top


<h3 id="threads">Multi-threads</h3>

Use -t to specify how many threads to use in translation:

bash
pdf2zh example.pdf -t 1

⬆️ Back to top


<h3 id="prompt">Custom prompt</h3>

Note: System prompt is currently not supported. See this change.

Use --prompt to specify the file containing the prompt to send to the LLM:

bash
pdf2zh example.pdf --prompt prompt.txt

For example:

txt
You are a professional, authentic machine translation engine. Only Output the translated text, do not include any other text.

Translate the following markdown source text to ${lang_out}. Keep the formula notation {v*} unchanged. Output translation directly without any additional text.

Source Text: ${text}

Translated Text:

Three variables can be used in a custom prompt file:

| Variable | Comment |
|---|---|
| lang_in | input language |
| lang_out | output language |
| text | text to be translated |
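
The ${...} placeholders follow the same syntax as Python's string.Template, so a prompt file can be previewed locally before use. The following sketch assumes that substitution behaves like string.Template; the language codes and sample text are illustrative values.

```python
from string import Template

# A shortened version of the example prompt above.
prompt_template = Template(
    "Translate the following markdown source text to ${lang_out}. "
    "Keep the formula notation {v*} unchanged.\n\n"
    "Source Text: ${text}\n\n"
    "Translated Text:"
)

# safe_substitute leaves unknown placeholders untouched instead of raising.
prompt = prompt_template.safe_substitute(
    lang_in="en", lang_out="ja", text="Hello, world."
)
print(prompt)
```

Only $-prefixed names are substituted, so the formula marker {v*} passes through unchanged.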

⬆️ Back to top


<h3 id="auth">Authorization</h3>

Use --authorized to restrict which users can access the Web UI and to customize the login page:

bash
pdf2zh example.pdf --authorized users.txt auth.html

Example users.txt. Each line contains a username and a password, separated by a comma:

admin,123456
user1,password1
user2,abc123
guest,guest123
test,test123
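
A users file in this format can be checked with a few lines of Python. The sketch below is an illustration under stated assumptions: parse_users is a hypothetical helper, blank lines are skipped, and each line is split on the first comma only, so a password could itself contain commas.

```python
def parse_users(text: str) -> list[tuple[str, str]]:
    """Parse "username,password" lines into (username, password) pairs."""
    users = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Assumption: only the first comma separates the two fields.
        username, password = line.split(",", 1)
        users.append((username.strip(), password.strip()))
    return users


sample = "admin,123456\nuser1,password1\n"
print(parse_users(sample))  # [('admin', '123456'), ('user1', 'password1')]
```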

Example auth.html:

html
<!DOCTYPE html>
<html>
<head>
    <title>Simple HTML</title>
</head>
<body>
    <h1>Hello, World!</h1>
    <p>Welcome to my simple HTML page.</p>
</body>
</html>

⬆️ Back to top


<h3 id="cofig">Custom configuration file</h3>

Use --config to specify the configuration file for PDFMathTranslate:

bash
pdf2zh example.pdf --config config.json

bash
pdf2zh -i --config config.json

Example config.json:

json
{
    "USE_MODELSCOPE": "0",
    "PDF2ZH_LANG_FROM": "English",
    "PDF2ZH_LANG_TO": "Simplified Chinese",
    "NOTO_FONT_PATH": "/app/SourceHanSerifCN-Regular.ttf",
    "translators": [
        {
            "name": "deeplx",
            "envs": {
                "DEEPLX_ENDPOINT": "http://localhost:1188/translate/",
                "DEEPLX_ACCESS_TOKEN": null
            }
        },
        {
            "name": "ollama",
            "envs": {
                "OLLAMA_HOST": "http://127.0.0.1:11434",
                "OLLAMA_MODEL": "gemma2"
            }
        }
    ]
}

By default, the configuration file is stored at ~/.config/PDFMathTranslate/config.json. On startup, the program first reads config.json and then reads the environment variables. When an environment variable is set, its value takes precedence and the file is updated with it.
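
The precedence described above can be sketched as follows. This is an illustration, not the project's implementation: resolve_setting is a hypothetical helper, and it handles only top-level keys such as PDF2ZH_LANG_TO, not the nested translators entries.

```python
import json
import os
import tempfile
from pathlib import Path


def resolve_setting(config_path: Path, name: str, environ=os.environ):
    """Return the value for `name`, preferring the environment over the file."""
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    if name in environ:
        # The environment variable wins and is written back to the file.
        config[name] = environ[name]
        config_path.parent.mkdir(parents=True, exist_ok=True)
        config_path.write_text(json.dumps(config, indent=4))
    return config.get(name)


# Demonstration in a temporary directory, so no real configuration is touched.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "config.json"
    path.write_text(json.dumps({"PDF2ZH_LANG_TO": "Simplified Chinese"}))
    print(resolve_setting(path, "PDF2ZH_LANG_TO", environ={}))
    print(resolve_setting(path, "PDF2ZH_LANG_TO", environ={"PDF2ZH_LANG_TO": "Japanese"}))
    print(json.loads(path.read_text())["PDF2ZH_LANG_TO"])
```

The first call returns the value from the file; the second returns the environment value and leaves the file updated to match.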

⬆️ Back to top


<h3 id="font-subset">Font subsetting</h3>

By default, PDFMathTranslate uses font subsetting to reduce the size of output files. Use the --skip-subset-fonts option to disable font subsetting if you encounter compatibility issues.

bash
pdf2zh example.pdf --skip-subset-fonts

⬆️ Back to top


<h3 id="cache">Translation cache</h3>

PDFMathTranslate caches translated text to increase speed and to avoid unnecessary API calls for identical content. Use the --ignore-cache option to ignore the translation cache and force retranslation.

bash
pdf2zh example.pdf --ignore-cache
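
The effect of the cache and of --ignore-cache can be illustrated with a small in-memory sketch. Everything here is hypothetical: the TranslationCache class, the cache key made of service, languages and text, and the fake translator are assumptions for illustration and do not describe how PDFMathTranslate actually stores its cache.

```python
from typing import Callable, Dict, Tuple


class TranslationCache:
    """In-memory cache keyed by (service, lang_in, lang_out, text)."""

    def __init__(self) -> None:
        self._store: Dict[Tuple[str, str, str, str], str] = {}

    def translate(
        self,
        service: str,
        lang_in: str,
        lang_out: str,
        text: str,
        translate_fn: Callable[[str], str],
        ignore_cache: bool = False,
    ) -> str:
        key = (service, lang_in, lang_out, text)
        if not ignore_cache and key in self._store:
            return self._store[key]  # cache hit: no API call
        result = translate_fn(text)  # cache miss or forced retranslation
        self._store[key] = result
        return result


calls = []


def fake_translate(text: str) -> str:
    """Stand-in for a real translation API call."""
    calls.append(text)
    return text.upper()


cache = TranslationCache()
cache.translate("google", "en", "ja", "hello", fake_translate)
cache.translate("google", "en", "ja", "hello", fake_translate)  # served from the cache
cache.translate("google", "en", "ja", "hello", fake_translate, ignore_cache=True)
print(len(calls))  # 2
```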

⬆️ Back to top


<h3 id="public-services">Deployment as a public service</h3>

PDFMathTranslate can restrict the available services and hide backend information when deployed publicly. Enable these features by setting ENABLED_SERVICES and HIDDEN_GRADIO_DETAILS in the configuration file:

  • ENABLED_SERVICES enables only the listed services, limiting the services that users can choose from.
  • HIDDEN_GRADIO_DETAILS hides the real API_KEY in the web interface, preventing users from obtaining server-side keys.

A usable configuration is as follows:

json
{
    "USE_MODELSCOPE": "0",
    "translators": [
        {
            "name": "grok",
            "envs": {
                "GORK_API_KEY": null,
                "GORK_MODEL": "grok-2-1212"
            }
        },
        {
            "name": "openai",
            "envs": {
                "OPENAI_BASE_URL": "https://api.openai.com/v1",
                "OPENAI_API_KEY": "sk-xxxx",
                "OPENAI_MODEL": "gpt-4o-mini"
            }
        }
    ],
    "ENABLED_SERVICES": [
        "OpenAI",
        "Grok"
    ],
    "HIDDEN_GRADIO_DETAILS": true,
    "PDF2ZH_LANG_FROM": "English",
    "PDF2ZH_LANG_TO": "Simplified Chinese",
    "NOTO_FONT_PATH": "/app/SourceHanSerifCN-Regular.ttf"
}

⬆️ Back to top


<h3 id="mcp">MCP</h3>

PDFMathTranslate can run as an MCP server. To use it, run uv pip install pdf2zh and configure claude_desktop_config.json. An example configuration is as follows:

json
{
    "mcpServers": {
        "filesystem": {
            "command": "npx",
            "args": [
                "-y",
                "@modelcontextprotocol/server-filesystem",
                "/path/to/Document"
            ]
        },
        "translate_pdf": {
            "command": "uv",
            "args": [
                "run",
                "pdf2zh",
                "--mcp"
            ]
        }
    }
}

filesystem is a required MCP server used to find the PDF file, and translate_pdf is the PDFMathTranslate MCP server.

To test whether the MCP server works, open Claude Desktop and ask:

find the `test.pdf` in my Document folder and translate it to Chinese