
This project is v2 of https://github.com/PatchyVideo/MMDOCR-HighPerformance.
Note: This project is still in the early stages of development and has many shortcomings. We need your help to improve it!
The following examples may not be frequently updated and may not represent the effect of the current main branch version.
Official demo site (maintained by zyddnys): <[]>
Browser script (maintained by QiroNT): <[]>
https://github.com/frederik-uni/manga-image-translator-rust may be easier to set up, as it is a compiled binary.
Using Pip/venv (Recommended)
```bash
# First, ensure you have Python 3.10 or later installed on your machine
# The very latest version of Python might not be compatible with some PyTorch libraries yet
$ python --version
Python 3.10.6

# Clone this repository and enter it
$ git clone https://github.com/zyddnys/manga-image-translator.git
$ cd manga-image-translator

# Create a venv (optional, but recommended)
$ python -m venv venv

# Activate the venv
$ source venv/bin/activate

# If you want to use the --use-gpu option, please visit https://pytorch.org/get-started/locally/
# to install PyTorch, which needs to correspond to your CUDA version.
# If you did not use venv to create a virtual environment, you need to add
# --upgrade --force-reinstall to the pip command to overwrite the currently installed PyTorch version.

# Install dependencies
$ pip install -r requirements.txt
```
Models will be automatically downloaded to the ./models directory at runtime.
Notes for Windows Users:
Please install Microsoft C++ Build Tools (Download, Instructions) before performing the pip install, as some pip dependencies need it to compile. (See https://github.com/zyddnys/manga-image-translator/issues/114).
To use CUDA (https://developer.nvidia.com/cuda-downloads?target_os=Windows&target_arch=x86_64) on Windows, install the correct PyTorch version as described on <[***]>.
Requirements:
demo/doc folder)

This project supports Docker, with the image being zyddnys/manga-image-translator:main.
This Docker image contains all the dependencies and models required for the project.
Please note that this image is quite large (~15GB).
Run Web Server
You can start the Web Server (CPU) using the following command:
Note that you need to add the required environment variables using -e or --env.

```bash
docker run \
  --name manga_image_translator_cpu \
  -p 5003:5003 \
  --ipc=host \
  --entrypoint python \
  --rm \
  -v /demo/doc/../../result:/app/result \
  -v /demo/doc/../../server/main.py:/app/server/main.py \
  -v /demo/doc/../../server/instance.py:/app/server/instance.py \
  -e OPENAI_API_KEY='' \
  -e OPENAI_API_BASE='' \
  -e OPENAI_MODEL='' \
  zyddnys/manga-image-translator:main \
  server/main.py --verbose --start-instance --host=0.0.0.0 --port=5003
```
Or use the compose file
Note that you need to add the required environment variables in the file first
```bash
docker-compose -f demo/doc/docker-compose-web-with-cpu.yml up
```
The Web Server listens on port 8000 by default (the docker run command above overrides this with --port=5003), and the translation results are saved in the /result folder.
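Once the container is running, one quick way to confirm that the published port accepts connections is a plain TCP check. The sketch below is only illustrative: is_port_open is a hypothetical helper, not part of this project, and 5003 is the port published by the docker run command above. A successful connection only shows that something is listening, not that the translator has finished loading.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # 5003 matches the -p 5003:5003 mapping used in the command above.
    print("web server reachable:", is_port_open("127.0.0.1", 5003))
```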
Using Nvidia GPU
To use a supported GPU, please read the Docker section above first. You will need some special dependencies.
You can start the Web Server (GPU) using the following command:
Note that you need to add the required environment variables using -e or --env.

```bash
docker run \
  --name manga_image_translator_gpu \
  -p 5003:5003 \
  --ipc=host \
  --gpus all \
  --entrypoint python \
  --rm \
  -v /demo/doc/../../result:/app/result \
  -v /demo/doc/../../server/main.py:/app/server/main.py \
  -v /demo/doc/../../server/instance.py:/app/server/instance.py \
  -e OPENAI_API_KEY='' \
  -e OPENAI_API_BASE='' \
  -e OPENAI_MODEL='' \
  -e OPENAI_HTTP_PROXY='' \
  zyddnys/manga-image-translator:main \
  server/main.py --verbose --start-instance --host=0.0.0.0 --port=5003 --use-gpu
```
Or use the compose file (for Web Server + GPU):
Note that you need to add the required environment variables in the file first
```bash
docker-compose -f demo/doc/docker-compose-web-with-gpu.yml up
```
Use as CLI
To use Docker via CLI (i.e., Batch Mode):
Some translation services require API keys to run; pass them to your Docker container as environment variables.

```bash
docker run \
  --env="DEEPL_AUTH_KEY=xxx" \
  -v <targetFolder>:/app/<targetFolder> \
  -v <targetFolder>-translated:/app/<targetFolder>-translated \
  --ipc=host \
  --rm \
  zyddnys/manga-image-translator:main \
  local -i=/app/<targetFolder> <cli flags>
```
Note: If you need to reference files on your host, you will need to mount the relevant files as volumes into the /app folder inside the container. The CLI paths will need to be the internal Docker path /app/... and not the path on your host.
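The host-to-container path mapping described in the note can be sketched in Python. This is a minimal illustration, not part of the project: build_batch_command is a hypothetical helper, and it assumes the folder is mounted under /app using its base name, which is one way to read the <targetFolder> placeholders in the command above. Only flags that appear in that command are used.

```python
import os
import shlex


def build_batch_command(host_folder, env=None, cli_flags=()):
    """Build a `docker run` argument list for batch mode.

    The input folder and a sibling "<folder>-translated" folder are mounted
    under /app, and the CLI receives the container-side path, not the host path.
    """
    host_folder = os.path.abspath(host_folder)
    name = os.path.basename(host_folder)
    container_in = f"/app/{name}"  # assumption: mount under the folder's base name

    cmd = ["docker", "run"]
    for key, value in (env or {}).items():
        cmd.append(f"--env={key}={value}")
    cmd += [
        "-v", f"{host_folder}:{container_in}",
        "-v", f"{host_folder}-translated:{container_in}-translated",
        "--ipc=host",
        "--rm",
        "zyddnys/manga-image-translator:main",
        "local",
        f"-i={container_in}",
    ]
    cmd += list(cli_flags)
    return cmd


if __name__ == "__main__":
    command = build_batch_command("/data/manga", env={"DEEPL_AUTH_KEY": "xxx"})
    # Print a copy-pasteable shell command instead of running it.
    print(shlex.join(command))
```

Building the command as a list and printing it with shlex.join keeps quoting correct for folder names that contain spaces.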
Build Locally
To build the Docker image locally, run the following command (you need to have the make tool installed on your machine):

```bash
make build-image
```

Then, to test the built image, run the command below. Some translation services require API keys to run; pass them to your Docker container as environment variables. Add the environment variables in the Dockerfile.

```bash
make run-web-server
```
```bash
# Replace <path> with the path to your image folder or file.
$ python -m manga_translator local -v -i <path>
# The results can be found in `<path_to_image_folder>-translated`.
```
Old UI
```bash
# Start a web server.
$ cd server
$ python main.py --use-gpu
# The web demo service address is http://127.0.0.1:8000
```
New UI
Documentation
```bash
# Start a web server.
$ cd server
$ python main.py --use-gpu
# The API service address is http://127.0.0.1:8001
```
API Documentation
Read the OpenAPI documentation at 127.0.0.1:8000/docs
FastAPI-html
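To get a quick overview of the available endpoints without opening the browser, the OpenAPI document can be read programmatically. The sketch below rests on an assumption: that the server exposes the schema at /openapi.json, which is FastAPI's default location but may have been changed in this project. list_operations and fetch_spec are hypothetical helper names.

```python
import json
from urllib.request import urlopen

HTTP_METHODS = {"get", "put", "post", "delete", "patch", "options", "head", "trace"}


def list_operations(spec):
    """Return sorted (METHOD, path) pairs from an OpenAPI document."""
    operations = []
    for path, item in spec.get("paths", {}).items():
        for method in item:
            # Path items may also hold non-method keys such as "parameters".
            if method.lower() in HTTP_METHODS:
                operations.append((method.upper(), path))
    return sorted(operations)


def fetch_spec(base_url="http://127.0.0.1:8000"):
    """Download the OpenAPI document (assumes FastAPI's default /openapi.json)."""
    with urlopen(f"{base_url}/openapi.json", timeout=10) as response:
        return json.load(response)


if __name__ == "__main__":
    try:
        for method, path in list_operations(fetch_spec()):
            print(method, path)
    except (OSError, ValueError) as exc:
        # OSError covers URLError when the server is not running.
        print("could not fetch the OpenAPI document:", exc)
```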
```bash
python -m manga_translator config-help
```
Detector:
{"detector":{"detector": "ctd"}} can increase the number of text lines detected
Update: Actual testing shows that default works better with related parameter adjustments in black and white comics.
OCR:
Translator:
Inpainter: lama_large
Colorizer: mc2
Tips to Improve Translation Quality
- upscale_ratio 2 or any other value
- font_size_offset or use the --manga2eng renderer, which will try to fit the detected text bubble rather than detected textline area.
- --font-path fonts/anime_ace_3.ttf for example
- mask_dilation_offset to 10~30 to increase the mask coverage and better wrap the source text
- box_threshold can help filter out gibberish from OCR error detection to some extent.
- OpenaiTranslator to load the glossary file (custom_openai cannot load it)
- detection_size, otherwise it may cause some sentences to be missed. The opposite is true when the image resolution is high.
- inpainting_size, otherwise it may not completely cover the mask, resulting in source text leakage. In other cases, you can increase kernel_size to reduce the accuracy of text removal so that the model gets a larger field of view (Note: Judge whether the text leakage is caused by inpainting based on the consistency between the source text and the translated text. If consistent, it is caused by inpainting, otherwise it is caused by text detection and OCR)

Basic Options
```text
-h, --help                     show this help message and exit
-v, --verbose                  print debug messages and save intermediate images in results folder
--attempts ATTEMPTS            Number of attempts when an error occurs. -1 for infinite attempts.
--ignore-errors                Skip images when an error occurs.
--model-dir MODEL_DIR          Model directory (defaults to ./models in the project root)
--use-gpu                      Turns on/off GPU (automatically switches between mps and cuda)
--use-gpu-limited              Turns on/off GPU (excluding offline translators)
--font-path FONT_PATH          Path to the font file
--pre-dict PRE_DICT            Path to the pre-translation replacement dictionary file
--post-dict POST_DICT          Path to the post-translation replacement dictionary file
--kernel-size KERNEL_SIZE      Set the kernel size for the convolution of text erasure area to
                               completely clear residual text
--context-size                 Pages of context are needed for translating the current page.
                               currently, this only applies to openaitranslator.
```
Additional Options
Local Mode Options
```text
local                            run in batch translation mode
-i, --input INPUT [INPUT ...]    Image folder path (required)
-o, --dest DEST                  Destination folder path for translated images (default: '')
-f, --format FORMAT              Output format for the translation. Options: [List OUTPUT_FORMATS
                                 here, png,webp,jpg,jpeg,xcf,psd,pdf]
--overwrite                      Overwrite already translated images
--skip-no-text                   Skip images with no text (won't be saved).
--use-mtpe                       Turn on/off Machine Translation Post-Editing (MTPE) on the command
                                 line (currently Linux only)
--save-text                      Save extracted text and translations to a text file.
--load-text                      Load extracted text and translations from a text file.
--save-text-file SAVE_TEXT_FILE  Similar to --save-text, but with a specified file path. (default: '')
--prep-manual                    Prepare for manual typesetting by outputting blanked, inpainted
                                 images, and copies of the original image for reference
--save-quality SAVE_QUALITY      Quality of saved JPEG images, from 0 to 100 where 100 is best
                                 (default: 100)
--config-file CONFIG_FILE        Path to a configuration file (default: None)
```
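When batch runs are scripted, it can help to validate a few of these options before launching the translator. The sketch below is a hypothetical wrapper, not part of the project: build_local_args only assembles an argument list from the flags listed above, checks the format against the listed output formats, and checks the documented 0 to 100 range for --save-quality.

```python
import shlex

# Output formats as listed in the option help above.
OUTPUT_FORMATS = ("png", "webp", "jpg", "jpeg", "xcf", "psd", "pdf")


def build_local_args(inputs, dest=None, fmt=None, save_quality=None, overwrite=False):
    """Build an argument list for `python -m manga_translator local`."""
    if not inputs:
        raise ValueError("at least one input path is required")
    if fmt is not None and fmt not in OUTPUT_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    if save_quality is not None and not 0 <= save_quality <= 100:
        raise ValueError("save_quality must be between 0 and 100")

    args = ["python", "-m", "manga_translator", "local", "-i", *inputs]
    if dest is not None:
        args += ["-o", dest]
    if fmt is not None:
        args += ["-f", fmt]
    if save_quality is not None:
        args += ["--save-quality", str(save_quality)]
    if overwrite:
        args.append("--overwrite")
    return args


if __name__ == "__main__":
    # Print the command; pass the list to subprocess.run to execute it.
    print(shlex.join(build_local_args(["./pages"], fmt="jpg", save_quality=90)))
```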
WebSocket Mode Options
```text
ws                       run in WebSocket mode
--host HOST              Host of the WebSocket service (default: 127.0.0.1)
--port PORT              Port of the WebSocket service (default: 5003)
--nonce NONCE            Nonce used to secure internal WebSocket communication
--ws-url WS_URL          Server URL for WebSocket mode (default: ws://localhost:5000)
--models-ttl MODELS_TTL  Time in seconds to keep models in memory after last use (0 means forever)
```
API Mode Options
```text
shared                   run in API mode
--host HOST              Host of the API service (default: 127.0.0.1)
--port PORT              Port of the API service (default: 5003)
--nonce NONCE            Nonce used to secure internal API server communication, set to "None" to disable
--report REPORT          Report to server to register instance (default: None)
--models-ttl MODELS_TTL  TTL of models in memory in seconds (0 means forever)
```
Web Mode Options (missing some basic options, still needs to be added)
```text
--host HOST              Host address (default: 127.0.0.1)
--port PORT              Port number (default: 8000)
--start-instance         Whether an instance of the translator should be started automatically
--nonce NONCE            Nonce used to secure internal Web Server communication, set to "None" to disable
--models-ttl MODELS_TTL  Time in seconds to keep models in memory after last use (0 means forever)
```
Run python -m manga_translator config-help >> config-info.json to see the documentation for the JSON schema
An example config file can be found in example/config-example.json
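A config file can also be generated from a script. The sketch below is a minimal example under stated assumptions: it uses the nested {"detector": {...}} layout shown in the Detector recommendation earlier, takes the detector names and default values from this section's config-help schema, and covers only three detector fields. make_detector_config is a hypothetical helper, not part of the project.

```python
import json

# Detector names copied from the "Detector" enum in the config-help schema.
DETECTORS = ("default", "dbconvnext", "ctd", "craft", "paddle", "none")


def make_detector_config(detector="default", detection_size=2048, box_threshold=0.75):
    """Return a config dict containing only detector settings."""
    if detector not in DETECTORS:
        raise ValueError(f"unknown detector: {detector!r}")
    return {
        "detector": {
            "detector": detector,
            "detection_size": detection_size,
            "box_threshold": box_threshold,
        }
    }


if __name__ == "__main__":
    # Print the JSON; save it to a file to use it with --config-file.
    print(json.dumps(make_detector_config("ctd"), indent=2))
```

The printed JSON can be saved to a file and passed with --config-file; fields left out are expected to fall back to the defaults shown in the schema.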
```json
{
  "$defs": {
    "Alignment": {
      "enum": ["auto", "left", "center", "right"],
      "title": "Alignment",
      "type": "string"
    },
    "Colorizer": {
      "enum": ["none", "mc2"],
      "title": "Colorizer",
      "type": "string"
    },
    "ColorizerConfig": {
      "properties": {
        "colorization_size": { "default": 576, "title": "Colorization Size", "type": "integer" },
        "denoise_sigma": { "default": 30, "title": "Denoise Sigma", "type": "integer" },
        "colorizer": { "$ref": "#/$defs/Colorizer", "default": "none" }
      },
      "title": "ColorizerConfig",
      "type": "object"
    },
    "Detector": {
      "enum": ["default", "dbconvnext", "ctd", "craft", "paddle", "none"],
      "title": "Detector",
      "type": "string"
    },
    "DetectorConfig": {
      "properties": {
        "detector": { "$ref": "#/$defs/Detector", "default": "default" },
        "detection_size": { "default": 2048, "title": "Detection Size", "type": "integer" },
        "text_threshold": { "default": 0.5, "title": "Text Threshold", "type": "number" },
        "det_rotate": { "default": false, "title": "Det Rotate", "type": "boolean" },
        "det_auto_rotate": { "default": false, "title": "Det Auto Rotate", "type": "boolean" },
        "det_invert": { "default": false, "title": "Det Invert", "type": "boolean" },
        "det_gamma_correct": { "default": false, "title": "Det Gamma Correct", "type": "boolean" },
        "box_threshold": { "default": 0.75, "title": "Box Threshold", "type": "number" },
        "unclip_ratio": { "default": 2.3, "title": "Unclip Ratio", "type": "number" }
      },
      "title": "DetectorConfig",
      "type": "object"
    },
    "Direction": {
      "enum": ["auto", "horizontal", "vertical"],
      "title": "Direction",
      "type": "string"
    },
    "InpaintPrecision": {
      "enum": ["fp32", "fp16", "bf16"],
      "title": "InpaintPrecision",
      "type": "string"
    },
    "Inpainter": {
      "enum": ["default", "lama_large", "lama_mpe", "sd", "none", "original"],
      "title": "Inpainter",
      "type": "string"
    },
    "InpainterConfig": {
      "properties": {
        "inpainter": { "$ref": "#/$defs/Inpainter", "default": "lama_large" },
        "inpainting_size": { "default": 2048, "title": "Inpainting Size", "type": "integer" },
        "inpainting_precision": { "$ref": "#/$defs/InpaintPrecision", "default": "bf16" }
      },
      "title": "InpainterConfig",
      "type": "object"
    },
    "Ocr": {
      "enum": ["32px", "48px", "48px_ctc", "mocr"],
      "title": "Ocr",
      "type": "string"
    },
    "OcrConfig": {
      "properties": {
        "use_mocr_merge": { "default": false, "title": "Use Mocr Merge", "type": "boolean" },
        "ocr": { "$ref": "#/$defs/Ocr", "default": "48px" },
        "min_text_length": { "default": 0, "title": "Min Text Length", "type": "integer" },
        "ignore_bubble": { "default": 0, "title": "Ignore Bubble",
```