🚅 LiteLLM
LiteLLM AI Gateway
An open-source AI gateway supporting 100+ large language models (LLMs). Self-hostable and enterprise-ready. Call any LLM in the OpenAI format.
LiteLLM Proxy Server (AI Gateway) | Hosted Proxy | Enterprise Tier | Website
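LiteLLM is an open-source AI gateway that gives you a single, unified interface for calling 100+ LLM providers in the OpenAI format, including OpenAI, Anthropic, Gemini, Bedrock, Azure, and more.
You can use it as a Python SDK for direct library integration, or deploy the AI Gateway (proxy server) as a centralized service for your team or organization.
Jump to LiteLLM Proxy (LLM Gateway) Docs | Jump to Supported LLM Providers
Managing LLM calls across providers gets complicated quickly: each model has its own SDK, authentication scheme, request format, and error types. LiteLLM removes that friction.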
LLMs - Call 100+ LLMs (Python SDK + AI Gateway)
All Supported Endpoints - /chat/completions, /responses, /embeddings, /images, /audio, /batches, /rerank, /a2a, /messages, and more.
uv add litellm
from litellm import completion
import os
os.environ["OPENAI_API_KEY"] = "your-openai-key"
os.environ["ANTHROPIC_API_KEY"] = "your-anthropic-key"
# OpenAI
response = completion(model="openai/gpt-4o", messages=[{"role": "user", "content": "Hello!"}])
# Anthropic
response = completion(model="anthropic/claude-sonnet-4-20250514", messages=[{"role": "user", "content": "Hello!"}])
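The `model` values above follow a `provider/model-name` pattern. As a rough illustration of that naming convention, the prefix can be separated from the model name as sketched below. `split_model_string` is a hypothetical helper written for this example; it is not part of LiteLLM and does not claim to mirror how LiteLLM resolves providers internally.

```python
from typing import Optional, Tuple


def split_model_string(model: str) -> Tuple[Optional[str], str]:
    """Split 'provider/model-name' into (provider, model_name).

    Hypothetical helper for illustration only; not a LiteLLM API.
    Returns (None, model) when the string has no provider prefix.
    """
    # Only the first "/" separates the provider from the model name.
    provider, sep, name = model.partition("/")
    if not sep:
        return None, model
    return provider, name


print(split_model_string("openai/gpt-4o"))
print(split_model_string("anthropic/claude-sonnet-4-20250514"))
print(split_model_string("gpt-4o"))
```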
Getting Started - E2E Tutorial - Set up virtual keys and make your first request
uv tool install 'litellm[proxy]'
litellm --model gpt-4o
import openai
client = openai.OpenAI(api_key="anything", base_url="http://0.0.0.0:4000")
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)
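Because the gateway speaks the OpenAI format, the same call can in principle be made with any HTTP client. The sketch below only builds the request with the standard library, using the base URL and placeholder key from the example above, and does not send it. `build_chat_request` is a hypothetical name, and the `/chat/completions` path is the one the OpenAI client appends to `base_url`.

```python
import json
import urllib.request


def build_chat_request(base_url: str, api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST to the gateway's /chat/completions route."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_chat_request("http://0.0.0.0:4000", "anything", "gpt-4o", "Hello!")
print(req.get_method(), req.full_url)

# Sending it requires a running gateway, so it is left commented out:
#     with urllib.request.urlopen(req) as resp:
#         print(json.load(resp))
```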
Docs: LLM Providers
Agents - Invoke A2A Agents (Python SDK + AI Gateway)
Supported Providers - LangGraph, Vertex AI Agent Engine, Azure AI Foundry, Bedrock AgentCore, Pydantic AI
import asyncio
from litellm.a2a_protocol import A2AClient
from a2a.types import SendMessageRequest, MessageSendParams
from uuid import uuid4


async def main():
    # `await` is only valid inside a coroutine, so the call is wrapped in main()
    client = A2AClient(base_url="http://localhost:10001")
    request = SendMessageRequest(
        id=str(uuid4()),
        params=MessageSendParams(
            message={
                "role": "user",
                "parts": [{"kind": "text", "text": "Hello!"}],
                "messageId": uuid4().hex,
            }
        ),
    )
    response = await client.send_message(request)
    print(response)


asyncio.run(main())
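The `message` dictionary in the example above has a small, regular shape: a role, a list of parts, and a unique message ID. A minimal sketch of a builder for it follows, assuming a single text part as in the example. `build_user_message` is a hypothetical helper, not part of LiteLLM or the A2A SDK.

```python
from uuid import uuid4


def build_user_message(text: str) -> dict:
    """Return an A2A-style user message with one text part and a fresh ID."""
    return {
        "role": "user",
        "parts": [{"kind": "text", "text": text}],
        # A new hex ID per message, as in the example above.
        "messageId": uuid4().hex,
    }


msg = build_user_message("Hello!")
print(msg["role"], msg["parts"][0]["text"])
```

The returned dictionary can be passed as the `message` argument of `MessageSendParams`.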
Step 1. Add your agent to the AI Gateway
Step 2. Call the agent via the A2A SDK
import asyncio
from a2a.client import A2ACardResolver, A2AClient
from a2a.types import MessageSendParams, SendMessageRequest
from uuid import uuid4
import httpx

base_url = "http://localhost:4000/a2a/my-agent"  # LiteLLM proxy + agent name
headers = {"Authorization": "Bearer sk-1234"}  # LiteLLM virtual key


async def main():
    async with httpx.AsyncClient(headers=headers) as httpx_client:
        # Resolve the agent card through the gateway, then build a client from it
        resolver = A2ACardResolver(httpx_client=httpx_client, base_url=base_url)
        agent_card = await resolver.get_agent_card()
        client = A2AClient(httpx_client=httpx_client, agent_card=agent_card)
        request = SendMessageRequest(
            id=str(uuid4()),
            params=MessageSendParams(
                message={
                    "role": "user",
                    "parts": [{"kind": "text", "text": "Hello!"}],
                    "messageId": uuid4().hex,
                }
            ),
        )
        response = await client.send_message(request)
        print(response)


asyncio.run(main())
Docs: A2A Agent Gateway
MCP Tools - Connect MCP servers to any LLM (Python SDK + AI Gateway)
import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client
from litellm import experimental_mcp_client
import litellm

server_params = StdioServerParameters(command="python", args=["mcp_server.py"])


async def main():
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # Load MCP tools in OpenAI format
            tools = await experimental_mcp_client.load_mcp_tools(session=session, format="openai")
            # Use with any LiteLLM model
            response = await litellm.acompletion(
                model="gpt-4o",
                messages=[{"role": "user", "content": "What's 3 + 5?"}],
                tools=tools,
            )
            print(response)


asyncio.run(main())
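To make "OpenAI format" concrete: an MCP tool is described by a name, a description, and a JSON Schema for its input, while OpenAI-style tools wrap the same information in a `function` entry. The sketch below shows that mapping on a hypothetical `add` tool. It is an illustration of the two shapes, not the implementation of `load_mcp_tools`, which may differ.

```python
def mcp_tool_to_openai(tool: dict) -> dict:
    """Map an MCP tool description to an OpenAI-style function tool.

    Illustrative only; `tool` is a plain dict with MCP field names.
    """
    return {
        "type": "function",
        "function": {
            "name": tool["name"],
            "description": tool.get("description", ""),
            # MCP calls the argument schema `inputSchema`; OpenAI calls it `parameters`.
            "parameters": tool.get("inputSchema", {"type": "object", "properties": {}}),
        },
    }


# Hypothetical tool, such as an MCP server might expose for the "3 + 5" prompt above.
add_tool = {
    "name": "add",
    "description": "Add two numbers",
    "inputSchema": {
        "type": "object",
        "properties": {"a": {"type": "number"}, "b": {"type": "number"}},
        "required": ["a", "b"],
    },
}

print(mcp_tool_to_openai(add_tool))
```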
Step 1. Add your MCP server to the AI Gateway
Step 2. Call MCP tools via /chat/completions
curl -X POST 'http://0.0.0.0:4000/v1/chat/completions' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Summarize the latest open PR"}],
    "tools": [{
      "type": "mcp",
      "server_url": "litellm_proxy/mcp/github",
      "server_label": "github_mcp",
      "require_approval": "never"
    }]
  }'
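The same request body can be assembled in Python before sending it with the HTTP client of your choice. The sketch below only builds and prints the payload. `gateway_mcp_tool` is a hypothetical helper, and the `litellm_proxy/mcp/<name>` form of `server_url` is taken from the curl example above rather than from a documented rule.

```python
import json


def gateway_mcp_tool(server_name: str, label: str) -> dict:
    """Build the `tools` entry that points at an MCP server registered on the gateway."""
    return {
        "type": "mcp",
        "server_url": f"litellm_proxy/mcp/{server_name}",
        "server_label": label,
        "require_approval": "never",
    }


payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Summarize the latest open PR"}],
    "tools": [gateway_mcp_tool("github", "github_mcp")],
}

print(json.dumps(payload, indent=2))
```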
{
  "mcpServers": {
    "LiteLLM": {
      "url": "http://localhost:4000/mcp/",
      "headers": {
        "x-litellm-api-key": "Bearer sk-1234"
      }
    }
  }
}
Docs: MCP Gateway
| Provider | /chat/completions | /messages | /responses | /embeddings | /image/generations | /audio/transcriptions | /audio/speech | /moderations | /batches | /rerank |
|---|---|---|---|---|---|---|---|---|---|---|
Abliteration (abliteration) | ✅ | |||||||||
AI/ML API (aiml) | ✅ | ✅ | ✅ | ✅ | ✅ | |||||
AI21 (ai21) | ✅ | ✅ | ✅ | |||||||
AI21 Chat (ai21_chat) | ✅ | ✅ | ✅ | |||||||
| Aleph Alpha | ✅ | ✅ | ✅ | |||||||
| Amazon Nova | ✅ | ✅ | ✅ | |||||||
Anthropic (anthropic) | ✅ | ✅ | ✅ | ✅ | ||||||
Anthropic Text (anthropic_text) | ✅ | ✅ | ✅ | ✅ | ||||||
| Anyscale | ✅ | ✅ | ✅ | |||||||
AssemblyAI (assemblyai) | ✅ | ✅ | ✅ | ✅ | ||||||
Auto Router (auto_router) | ✅ | ✅ | ✅ | |||||||
AWS - Bedrock (bedrock) | ✅ | ✅ | ✅ | ✅ | ✅ | |||||
AWS - Sagemaker (sagemaker) | ✅ | ✅ | ✅ | ✅ | ||||||
Azure (azure) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | |
Azure AI (azure_ai) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | |
Azure Text (azure_text) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | |||
Baseten (baseten) | ✅ | ✅ | ✅ | |||||||
Bytez (bytez) | ✅ | ✅ | ✅ | |||||||
Cerebras (cerebras) | ✅ | ✅ | ✅ | |||||||
Clarifai (clarifai) | ✅ | ✅ | ✅ | |||||||
Cloudflare AI Workers (cloudflare) | ✅ | ✅ | ✅ | |||||||
Codestral (codestral) | ✅ | ✅ | ✅ | |||||||
Cohere (cohere) | ✅ | ✅ | ✅ | ✅ | ✅ | |||||
Cohere Chat (cohere_chat) | ✅ | ✅ | ✅ | |||||||
CometAPI (cometapi) | ✅ | ✅ | ✅ | ✅ | ||||||
CompactifAI (compactifai) | ✅ | ✅ | ✅ | |||||||
Custom (custom) | ✅ | ✅ | ✅ | |||||||
Custom OpenAI (custom_openai) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | |||
Dashscope (dashscope) | ✅ | ✅ | ✅ | ✅ | ✅ | |||||
Databricks (databricks) | ✅ | ✅ | ✅ | |||||||
DataRobot (datarobot) | ✅ | ✅ | ✅ | |||||||
Deepgram (deepgram) | ✅ | ✅ | ✅ | ✅ | ||||||
DeepInfra (deepinfra) | ✅ | ✅ | ✅ | |||||||
Deepseek (deepseek) | ✅ | ✅ | ✅ | |||||||
ElevenLabs (elevenlabs) | ✅ | ✅ | ✅ | ✅ | ✅ | |||||
Empower (empower) | ✅ | ✅ | ✅ | |||||||
Fal AI (fal_ai) | ✅ | ✅ | ✅ | ✅ | ||||||
Featherless AI (featherless_ai) | ✅ | ✅ | ✅ | |||||||
Fireworks AI (fireworks_ai) | ✅ | ✅ | ✅ | |||||||
FriendliAI (friendliai) | ✅ | ✅ | ✅ | |||||||
Galadriel (galadriel) | ✅ | ✅ | ✅ | |||||||
GitHub Copilot (github_copilot) | ✅ | ✅ | ✅ | ✅ | ||||||
GitHub Models (github) | ✅ | ✅ | ✅ | |||||||
| Google - PaLM | ✅ | ✅ | ✅ | |||||||
Google - Vertex AI (vertex_ai) | ✅ | ✅ | ✅ | ✅ | ✅ | |||||
Google AI Studio - Gemini (gemini) | ✅ | ✅ | ✅ | |||||||
GradientAI (gradient_ai) | ✅ | ✅ | ✅ | |||||||
Groq AI (groq) | ✅ | ✅ | ✅ | |||||||
Heroku (heroku) | ✅ | ✅ | ✅ | |||||||
Hosted VLLM (hosted_vllm) | ✅ | ✅ | ✅ | |||||||
Huggingface (huggingface) | ✅ | ✅ | ✅ | ✅ | ✅ | |||||
Hyperbolic (hyperbolic) | ✅ | ✅ | ✅ | |||||||
IBM - Watsonx.ai (watsonx) | ✅ | ✅ | ✅ | ✅ | ||||||
Infinity (infinity) | ✅ | |||||||||
Jina AI (jina_ai) | ✅ | |||||||||
Lambda AI (lambda_ai) | ✅ | ✅ | ✅ | |||||||
Lemonade (lemonade) | ✅ | ✅ | ✅ | |||||||
LiteLLM Proxy (litellm_proxy) | ✅ | ✅ | ✅ | ✅ | ✅ | |||||
Llamafile (llamafile) | ✅ | ✅ | ✅ | |||||||
LM Studio (lm_studio) | ✅ | ✅ | ✅ | |||||||
Maritalk (maritalk) | ✅ | ✅ | ✅ | |||||||
| Meta - Llama API | ✅ | ✅ | ✅ |
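We welcome contributions to LiteLLM! Whether you're fixing bugs, adding features, or improving documentation, we appreciate your help.
The commands below require uv to be installed.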
git clone https://github.com/BerriAI/litellm.git
cd litellm
make install-dev    # Install development dependencies
make format         # Format your code
make lint           # Run all linting checks
make test-unit      # Run unit tests
make format-check   # Check formatting only
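If you prefer to run the non-mutating checks in one go before opening a PR, the make targets above can be chained from a small script. This is a sketch under the assumption that `make` is on your PATH and that you run it from the repository root. `run_checks` is a hypothetical helper, not something the repository provides.

```python
import subprocess
from typing import Callable, List, Sequence

# Make targets taken from the commands listed above.
CHECK_TARGETS = ("format-check", "lint", "test-unit")


def _run_make(cmd: List[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(cmd).returncode


def run_checks(
    targets: Sequence[str] = CHECK_TARGETS,
    runner: Callable[[List[str]], int] = _run_make,
) -> List[str]:
    """Run each make target in order and return the names of the ones that failed."""
    failed = []
    for target in targets:
        if runner(["make", target]) != 0:
            failed.append(target)
    return failed


# Usage, from the repository root:
#     failures = run_checks()
#     print("all checks passed" if not failures else "failed: " + ", ".join(failures))
```

The `runner` parameter exists only so the function can be exercised without invoking `make`.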
For detailed contributing guidelines, see CONTRIBUTING.md.
📖 Want to contribute docs? The LiteLLM documentation has moved to a separate repository: https://github.com/BerriAI/litellm-docs. Please submit documentation PRs there. The docs are hosted at docs.litellm.ai.
LiteLLM follows the Google Python Style Guide (https://google.github.io/styleguide/pyguide.html).
All of our automated checks must pass before your PR can be merged.