
Install locations in the container: `/opt/llama.cpp` and `/opt/llama-cpp-python`.

> [!WARNING]
> Starting with version 0.1.79, the model format changed from GGML to GGUF. Existing GGML models can be converted with the `convert-llama-ggmlv3-to-gguf.py` script in llama.cpp (already-converted GGUF versions can also be found at https://huggingface.co/models?search=GGUF).

To maintain backward compatibility, the container provides two branches:
• `llama_cpp:gguf` (the default branch, which tracks upstream master)
• `llama_cpp:ggml` (still supports the GGML model format)

The legacy GGML branch has the following patches applied:
• the `__fp16` typedef (uses NVCC's `half` type)

GGUF models (from https://huggingface.co/models?search=gguf or other sources) can be run with llama.cpp's built-in `main` tool (https://github.com/ggerganov/llama.cpp/tree/master/examples/main):

```bash
./run.sh --workdir=/opt/llama.cpp/bin $(./autotag llama_cpp) /bin/bash -c \
  './main --model $(huggingface-downloader TheBloke/Llama-2-7B-GGUF/llama-2-7b.Q4_K_S.gguf) \
    --prompt "Once upon a time," \
    --n-predict 128 --ctx-size 192 --batch-size 192 \
    --n-gpu-layers 999 --threads $(nproc)'
```

The `--model` argument expects a `.gguf` filename (typically the `Q4_K_S` quantization).
To load the Llama-2-70B model, add the `--gqa 8` flag.
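
The two notes above can be captured in a small helper that assembles the argument list for `main`. This is only a sketch: the function name `build_main_args` is hypothetical, and detecting a 70B model from its file name is an assumption made for illustration. The flags themselves are the ones used in the command above.

```python
import shlex


def build_main_args(model_path, prompt, n_predict=128, ctx_size=192,
                    batch_size=192, n_gpu_layers=999, threads=1):
    """Assemble the argument list for llama.cpp's `main` example (hypothetical helper)."""
    # The notes above say --model expects a .gguf file.
    if not model_path.endswith(".gguf"):
        raise ValueError(f"--model expects a .gguf file, got: {model_path}")
    args = [
        "./main",
        "--model", model_path,
        "--prompt", prompt,
        "--n-predict", str(n_predict),
        "--ctx-size", str(ctx_size),
        "--batch-size", str(batch_size),
        "--n-gpu-layers", str(n_gpu_layers),
        "--threads", str(threads),
    ]
    # Per the notes above, Llama-2-70B needs the extra --gqa 8 flag.
    # Matching "70b" in the file name is an assumption of this sketch.
    if "70b" in model_path.lower():
        args += ["--gqa", "8"]
    return args


if __name__ == "__main__":
    # Print a shell-quoted command line for inspection.
    print(shlex.join(build_main_args("llama-2-70b.Q4_K_S.gguf",
                                     "Once upon a time,", threads=8)))
```

Returning a list rather than a string keeps the prompt intact when it is passed to something like `subprocess.run`, since no shell quoting is involved.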
The Python API and `benchmark.py` (https://github.com/dusty-nv/jetson-containers/tree/master/packages/llm/llama_cpp/benchmark.py) can be used as well:

```bash
./run.sh --workdir=/opt/llama.cpp/bin $(./autotag llama_cpp) /bin/bash -c \
  'python3 benchmark.py --model $(huggingface-downloader TheBloke/Llama-2-7B-GGUF/llama-2-7b.Q4_K_S.gguf) \
    --prompt "Once upon a time," \
    --n-predict 128 --ctx-size 192 --batch-size 192 \
    --n-gpu-layers 999 --threads $(nproc)'
```

| Model | Quantization | Memory (MB) |
|---|---|---|
| https://huggingface.co/TheBloke/Llama-2-7B-GGUF | llama-2-7b.Q4_K_S.gguf | 5,268 |
| https://huggingface.co/TheBloke/Llama-2-13B-GGUF | llama-2-13b.Q4_K_S.gguf | 8,609 |
| https://huggingface.co/TheBloke/LLaMA-30b-GGUF | llama-30b.Q4_K_S.gguf | 19,045 |
| https://huggingface.co/TheBloke/Llama-2-70B-GGUF | llama-2-70b.Q4_K_S.gguf | 37,655 |
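
As a rough illustration of how the table can be used, the sketch below picks the largest listed model whose reported memory fits a given budget. The numbers are copied from the table. The function name and the `headroom_mb` parameter are hypothetical, and real memory use will likely vary with context size and other settings, so treat the result as a starting point only.

```python
# Memory figures (MB) copied from the table above.
MODEL_MEMORY_MB = {
    "llama-2-7b.Q4_K_S.gguf": 5268,
    "llama-2-13b.Q4_K_S.gguf": 8609,
    "llama-30b.Q4_K_S.gguf": 19045,
    "llama-2-70b.Q4_K_S.gguf": 37655,
}


def largest_model_that_fits(available_mb, headroom_mb=0):
    """Return the largest listed model that fits the budget, or None.

    `headroom_mb` is a hypothetical safety margin reserved for the
    rest of the system; it is not something the table specifies.
    """
    budget = available_mb - headroom_mb
    fitting = {name: mb for name, mb in MODEL_MEMORY_MB.items() if mb <= budget}
    if not fitting:
        return None
    # The model with the highest memory figure is the largest one that fits.
    return max(fitting, key=fitting.get)


if __name__ == "__main__":
    print(largest_model_that_fits(16000, headroom_mb=2000))
```
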
| **`llama_cpp:0.2.57`** | |
|---|---|
| Aliases | `llama_cpp` |
| Requires | `L4T ['>=34.1.0']` |
| Dependencies | https://github.com/dusty-nv/jetson-containers/tree/master/packages/build/build-essential https://github.com/dusty-nv/jetson-containers/tree/master/packages/cuda/cuda https://github.com/dusty-nv/jetson-containers/tree/master/packages/cuda/cudnn https://github.com/dusty-nv/jetson-containers/tree/master/packages/build/python https://github.com/dusty-nv/jetson-containers/tree/master/packages/build/cmake/cmake_pip https://github.com/dusty-nv/jetson-containers/tree/master/packages/numpy https://github.com/dusty-nv/jetson-containers/tree/master/packages/llm/huggingface_hub |
| Dependants | https://github.com/dusty-nv/jetson-containers/tree/master/packages/rag/langchain https://github.com/dusty-nv/jetson-containers/tree/master/packages/llm/text-generation-webui |
| Dockerfile | https://github.com/dusty-nv/jetson-containers/tree/master/packages/llm/llama_cpp/Dockerfile |

| Repository/Tag | Date | Arch | Size |
|---|---|---|---|
| https://hub.docker.com/r/dustynv/llama_cpp/tags | 2023-12-05 | arm64 | 5.2GB |
| https://hub.docker.com/r/dustynv/llama_cpp/tags | 2023-12-06 | arm64 | 5.2GB |
| https://hub.docker.com/r/dustynv/llama_cpp/tags | 2023-12-19 | arm64 | 5.2GB |
| https://hub.docker.com/r/dustynv/llama_cpp/tags | 2023-12-19 | arm64 | 5.1GB |
| https://hub.docker.com/r/dustynv/llama_cpp/tags | 2023-12-15 | arm64 | 5.1GB |
| https://hub.docker.com/r/dustynv/llama_cpp/tags | 2023-12-19 | arm64 | 5.2GB |
| https://hub.docker.com/r/dustynv/llama_cpp/tags | 2023-12-15 | arm64 | 5.1GB |
| https://hub.docker.com/r/dustynv/llama_cpp/tags | 2023-12-19 | arm64 | 5.1GB |
| https://hub.docker.com/r/dustynv/llama_cpp/tags | 2023-08-29 | arm64 | 5.2GB |
| https://hub.docker.com/r/dustynv/llama_cpp/tags | 2023-08-15 | arm64 | 5.2GB |
| https://hub.docker.com/r/dustynv/llama_cpp/tags | 2023-08-13 | arm64 | 5.1GB |
| https://hub.docker.com/r/dustynv/llama_cpp/tags | 2024-02-22 | arm64 | 5.3GB |
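Container images are compatible with other minor versions of JetPack/L4T:
• L4T R32.7 containers can run on other versions of L4T R32.7 (JetPack 4.6+)
• L4T R35.x containers can run on other versions of L4T R35.x (JetPack 5.1+)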
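
The compatibility notes above can be expressed as a small check. This is a minimal sketch, not the logic `autotag` actually uses: the function names are hypothetical, and for L4T releases the notes do not mention, the sketch conservatively assumes that only an exact version match is compatible.

```python
import re


def parse_l4t(version):
    """Parse strings like 'R35.4.1', 'r36.2.0' or '32.7' into a tuple of ints."""
    match = re.fullmatch(r"[rR]?(\d+)\.(\d+)(?:\.(\d+))?", version.strip())
    if match is None:
        raise ValueError(f"unrecognized L4T version: {version!r}")
    # A missing patch component is treated as 0.
    return tuple(int(part) for part in match.groups(default="0"))


def is_compatible(container_l4t, host_l4t):
    """Apply the two rules from the notes above (hypothetical helper)."""
    container, host = parse_l4t(container_l4t), parse_l4t(host_l4t)
    if container[0] == 32:
        # R32.7 containers run on other R32.7 releases: major and minor must match.
        return container[:2] == host[:2]
    if container[0] == 35:
        # R35.x containers run on other R35.x releases: only the major must match.
        return host[0] == 35
    # Not covered by the notes: assume an exact match is required.
    return container == host


if __name__ == "__main__":
    print(is_compatible("r35.2.1", "r35.4.1"))
```
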
To start the container, use `jetson-containers run` (https://github.com/dusty-nv/jetson-containers/tree/master/docs/run.md) and `autotag` (https://github.com/dusty-nv/jetson-containers/tree/master/docs/run.md#autotag), or put together a `docker run` command manually:
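
```bash
# automatically pull or build a compatible container image
jetson-containers run $(autotag llama_cpp)

# or explicitly specify one of the container images above
jetson-containers run dustynv/llama_cpp:r36.2.0

# or use 'docker run' (specifying the image, mounts, etc.)
sudo docker run --runtime nvidia -it --rm --network=host dustynv/llama_cpp:r36.2.0
```
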
`jetson-containers run` (https://github.com/dusty-nv/jetson-containers/tree/master/docs/run.md) forwards its arguments to `docker run` and adds some defaults (such as `--runtime nvidia`, mounting a `/data` cache, and detecting devices).
`autotag` (https://github.com/dusty-nv/jetson-containers/tree/master/docs/run.md#autotag) finds a container image compatible with your version of JetPack/L4T, whether it is available locally, pulled from a registry, or built.
To mount a host directory into the container, use the `-v` or `--volume` flag:

```bash
jetson-containers run -v /path/on/host:/path/in/container $(autotag llama_cpp)
```

To start the container running a command instead of an interactive shell:

```bash
jetson-containers run $(autotag llama_cpp) my_app --abc xyz
```

Any option supported by `docker run` can be passed, and the full command is printed before it is executed.
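
For scripting, the manual `docker run` form can be assembled programmatically. The sketch below is a hypothetical helper, not part of jetson-containers. It reproduces only the flags from the manual `docker run` command in this document and does not add the `/data` cache mount or device detection that `jetson-containers run` provides.

```python
import shlex


def docker_run_command(image, volumes=None, command=None):
    """Build a `docker run` argument list for a Jetson container (hypothetical helper).

    volumes: optional mapping of host path -> container path.
    command: optional list of arguments to run instead of an interactive shell.
    """
    args = ["sudo", "docker", "run", "--runtime", "nvidia",
            "-it", "--rm", "--network=host"]
    for host_path, container_path in (volumes or {}).items():
        # Each mount becomes one --volume host:container pair.
        args += ["--volume", f"{host_path}:{container_path}"]
    args.append(image)
    # Anything after the image name is executed inside the container.
    args += list(command or [])
    return args


if __name__ == "__main__":
    cmd = docker_run_command("dustynv/llama_cpp:r36.2.0",
                             volumes={"/path/on/host": "/path/in/container"},
                             command=["my_app", "--abc", "xyz"])
    # Print the full command before running it, as jetson-containers does.
    print(shlex.join(cmd))
```
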
If you use `autotag` (https://github.com/dusty-nv/jetson-containers/tree/master/docs/run.md#autotag) as shown above, you will be prompted to build the container when needed. To build it manually, first complete the system setup (https://github.com/dusty-nv/jetson-containers/tree/master/docs/setup.md), then run:

```bash
jetson-containers build llama_cpp
```

The dependencies listed above are built into the container and tested during the build. See https://github.com/dusty-nv/jetson-containers/tree/master/jetson_containers/build.py for the build options.