
Chatbots are the most widely adopted use case for the chat and reasoning capabilities of large language models (LLMs). The retrieval-augmented generation (RAG) architecture is quickly becoming the industry standard for chatbot development. It combines a knowledge base (via a vector store) with generative models to reduce hallucinations, keep information up to date, and draw on domain-specific knowledge.
RAG bridges the knowledge gap by dynamically fetching relevant information from external sources, ensuring that the response generated remains factual and current. Vector databases are at the core of this architecture, enabling efficient retrieval of semantically relevant information. These databases store data as vectors, allowing RAG to swiftly access the most pertinent documents or data points based on semantic similarity.
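As a rough illustration of retrieval by semantic similarity, the sketch below ranks a few documents by cosine similarity to a query vector. The three-dimensional vectors, the document titles, and the `top_k` helper are made up for this example; a real deployment would obtain vectors from an embedding model and delegate the search to a vector database.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def top_k(query_vec, store, k=2):
    """Return the k document titles whose vectors are closest to the query."""
    scored = [(cosine_similarity(query_vec, vec), doc) for doc, vec in store.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:k]]


# Hypothetical toy "vector store": title -> hand-written embedding.
store = {
    "Gaudi deployment guide": [0.9, 0.1, 0.0],
    "Redis vector index notes": [0.1, 0.9, 0.2],
    "Office lunch menu": [0.0, 0.1, 0.9],
}
query = [0.8, 0.3, 0.0]  # hypothetical embedding of the user question

print(top_k(query, store, k=2))
```

A brute-force scan like this is linear in the number of documents; vector databases typically use approximate nearest-neighbor indexes to keep lookups fast at scale.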
The ChatQnA application is a customizable end-to-end workflow that combines LLMs and RAG. The ChatQnA architecture is shown below:
![ChatQnA architecture](https://github.com/opea-project/GenAIExamples/raw/main/./assets/img/chatqna_architecture.png)
This application is modular: each component is a microservice (as defined in https://github.com/opea-project/GenAIComps) that can scale independently. It comprises data preparation, embedding, retrieval, reranker (optional), and LLM microservices. The ChatQnA megaservice ties these microservices together and orchestrates the flow of data through them. The flow chart below shows the information flow between the microservices for this example.
```mermaid
---
config:
  flowchart:
    nodeSpacing: 400
    rankSpacing: 100
    curve: linear
  themeVariables:
    fontSize: 50px
---
flowchart LR
    %% Colors %%
    classDef blue fill:#ADD8E6,stroke:#ADD8E6,stroke-width:2px,fill-opacity:0.5
    classDef orange fill:#FBAA60,stroke:#ADD8E6,stroke-width:2px,fill-opacity:0.5
    classDef orchid fill:#C26DBC,stroke:#ADD8E6,stroke-width:2px,fill-opacity:0.5
    classDef invisible fill:transparent,stroke:transparent;
    style ChatQnA-MegaService stroke:#000000

    %% Subgraphs %%
    subgraph ChatQnA-MegaService["ChatQnA MegaService "]
        direction LR
        EM([Embedding MicroService]):::blue
        RET([Retrieval MicroService]):::blue
        RER([Rerank MicroService]):::blue
        LLM([LLM MicroService]):::blue
    end
    subgraph UserInterface[" User Interface "]
        direction LR
        a([User Input Query]):::orchid
        Ingest([Ingest data]):::orchid
        UI([UI server<br>]):::orchid
    end

    TEI_RER{{Reranking service<br>}}
    TEI_EM{{Embedding service <br>}}
    VDB{{Vector DB<br><br>}}
    R_RET{{Retriever service <br>}}
    DP([Data Preparation MicroService]):::blue
    LLM_gen{{LLM Service <br>}}
    GW([ChatQnA GateWay<br>]):::orange

    %% Data Preparation flow
    %% Ingest data flow
    direction LR
    Ingest[Ingest data] --> UI
    UI --> DP
    DP <-.-> TEI_EM

    %% Questions interaction
    direction LR
    a[User Input Query] --> UI
    UI --> GW
    GW <==> ChatQnA-MegaService
    EM ==> RET
    RET ==> RER
    RER ==> LLM

    %% Embedding service flow
    direction LR
    EM <-.-> TEI_EM
    RET <-.-> R_RET
    RER <-.-> TEI_RER
    LLM <-.-> LLM_gen

    direction TB
    %% Vector DB interaction
    R_RET <-.->|d|VDB
    DP <-.->|d|VDB
```
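The query path in the flow chart (embedding, then retrieval, then optional reranking, then the LLM) can be sketched as a simple chain of callables. This is only an illustration of the orchestration idea, not the GenAIComps API: the `Pipeline` class and the `fake_*` stand-ins are hypothetical names, and in the real application each stage is a separate network service.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Pipeline:
    """Hypothetical stand-in for a megaservice that chains the stages."""
    embed: Callable[[str], List[float]]
    retrieve: Callable[[List[float]], List[str]]
    generate: Callable[[str, List[str]], str]
    rerank: Optional[Callable[[str, List[str]], List[str]]] = None  # optional stage

    def run(self, query: str) -> str:
        vector = self.embed(query)           # embedding stage
        docs = self.retrieve(vector)         # retrieval stage
        if self.rerank is not None:          # reranking is skipped when not configured
            docs = self.rerank(query, docs)
        return self.generate(query, docs)    # LLM stage


# Toy stand-ins so the sketch runs without any external service.
def fake_embed(text: str) -> List[float]:
    return [float(len(text))]


def fake_retrieve(vector: List[float]) -> List[str]:
    return ["doc-b", "doc-a"]


def fake_rerank(query: str, docs: List[str]) -> List[str]:
    return sorted(docs)


def fake_generate(query: str, docs: List[str]) -> str:
    return f"{query} -> {', '.join(docs)}"


with_rerank = Pipeline(fake_embed, fake_retrieve, fake_generate, fake_rerank)
without_rerank = Pipeline(fake_embed, fake_retrieve, fake_generate)

print(with_rerank.run("What is OPEA?"))
print(without_rerank.run("What is OPEA?"))
```

Keeping each stage behind a plain callable mirrors the modular design described above: a stage can be swapped or dropped without touching the others.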
The table below lists currently available deployment options. They outline in detail the implementation of this example on selected hardware.
| Category | Deployment Option | Description |
|---|---|---|
| On-premise Deployments | Docker Compose | https://github.com/opea-project/GenAIExamples/blob/main/./docker_compose/intel/cpu/xeon/README.md |
| | | https://github.com/opea-project/GenAIExamples/blob/main/./docker_compose/intel/cpu/aipc/README.md |
| | | https://github.com/opea-project/GenAIExamples/blob/main/./docker_compose/intel/hpu/gaudi/README.md |
| | | https://github.com/opea-project/GenAIExamples/blob/main/./docker_compose/nvidia/gpu/README.md |
| | | https://github.com/opea-project/GenAIExamples/blob/main/./docker_compose/amd/cpu/epyc/README.md |
| Cloud Platforms Deployment on AWS, GCP, Azure, IBM Cloud, Oracle Cloud, Intel® Tiber™ AI Cloud | Docker Compose | https://github.com/opea-project/docs/tree/main/getting-started/README.md |
| | Kubernetes | https://github.com/opea-project/GenAIExamples/blob/main/./kubernetes/helm/README.md |
| Automated Terraform Deployment on Cloud Service Providers | AWS | https://github.com/intel/terraform-intel-aws-vm/tree/main/examples/gen-ai-xeon-opea-chatqna |
| | | https://github.com/intel/terraform-intel-aws-vm/tree/main/examples/gen-ai-xeon-opea-chatqna-falcon11B |
| | GCP | https://github.com/intel/terraform-intel-gcp-vm/tree/main/examples/gen-ai-xeon-opea-chatqna |
| | Azure | https://github.com/intel/terraform-intel-azure-linux-vm/tree/main/examples/azure-gen-ai-xeon-opea-chatqna-tdx |
| | Intel Tiber AI Cloud | Coming Soon |
| | Any Xeon based Ubuntu system | https://github.com/intel/optimized-cloud-recipes/tree/main/recipes/ai-opea-chatqna-xeon. Use this if you are not using Terraform and have provisioned your system either manually or with another tool, including directly on bare metal. |
To learn how to use OpenTelemetry tracing and metrics in OPEA, follow the guide at https://opea-project.github.io/latest/tutorial/OpenTelemetry/OpenTelemetry_OPEA_Guide.html.
For ChatQnA-specific tracing and metrics monitoring, see https://opea-project.github.io/latest/tutorial/OpenTelemetry/deploy/ChatQnA.html.
The FAQ Generation application uses large language models (LLMs) to automatically generate comprehensive, natural-sounding frequently asked questions (FAQs) from documents, legal texts, customer queries, and other sources. FaqGen has been merged into the ChatQnA example, which uses LangChain to implement FAQ generation and runs LLM inference with Text Generation Inference (TGI) on Intel Xeon and Gaudi2 processors.
| Deploy Method | LLM Engine | LLM Model | Embedding | Vector Database | Reranking | Guardrails | Hardware |
|---|---|---|---|---|---|---|---|
| Docker Compose | vLLM, TGI | meta-llama/Meta-Llama-3-8B-Instruct | TEI | Redis | w/, w/o | w/, w/o | Intel Gaudi |
| Docker Compose | vLLM, TGI | meta-llama/Meta-Llama-3-8B-Instruct | TEI | Redis, MariaDB, Milvus, Pinecone, Qdrant | w/, w/o | w/o | Intel Xeon |
| Docker Compose | Ollama | llama3.2 | TEI | Redis | w/ | w/o | Intel AIPC |
| Docker Compose | vLLM, TGI | meta-llama/Meta-Llama-3-8B-Instruct | TEI | Redis | w/ | w/o | AMD ROCm |
| Helm Charts | vLLM, TGI | meta-llama/Meta-Llama-3-8B-Instruct | TEI | Redis | w/, w/o | w/, w/o | Intel Gaudi |
| Helm Charts | vLLM, TGI | meta-llama/Meta-Llama-3-8B-Instruct | TEI | Redis, Milvus, Qdrant | w/, w/o | w/o | Intel Xeon |
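When choosing a deployment, it can help to treat the table above as data and filter it. The sketch below transcribes only the deploy method, hardware, vector database, and guardrails columns; the `CONFIGS` list and the `matching` helper are illustrative names, not part of OPEA.

```python
# Rows transcribed from the table above; "guardrails" is True where the
# table lists "w/" (guardrails available) for that row.
CONFIGS = [
    {"deploy": "Docker Compose", "hardware": "Intel Gaudi",
     "vector_dbs": {"Redis"}, "guardrails": True},
    {"deploy": "Docker Compose", "hardware": "Intel Xeon",
     "vector_dbs": {"Redis", "MariaDB", "Milvus", "Pinecone", "Qdrant"}, "guardrails": False},
    {"deploy": "Docker Compose", "hardware": "Intel AIPC",
     "vector_dbs": {"Redis"}, "guardrails": False},
    {"deploy": "Docker Compose", "hardware": "AMD ROCm",
     "vector_dbs": {"Redis"}, "guardrails": False},
    {"deploy": "Helm Charts", "hardware": "Intel Gaudi",
     "vector_dbs": {"Redis"}, "guardrails": True},
    {"deploy": "Helm Charts", "hardware": "Intel Xeon",
     "vector_dbs": {"Redis", "Milvus", "Qdrant"}, "guardrails": False},
]


def matching(vector_db, need_guardrails=False):
    """Return (deploy method, hardware) pairs that list the given vector database."""
    return [
        (c["deploy"], c["hardware"])
        for c in CONFIGS
        if vector_db in c["vector_dbs"] and (c["guardrails"] or not need_guardrails)
    ]


print(matching("Milvus"))
print(matching("Redis", need_guardrails=True))
```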