
Qwen3-Coder-30B-A3B-Instruct is a state-of-the-art coding model developed by Alibaba Cloud's Qwen team. This streamlined mixture-of-experts (MoE) model delivers strong performance in agentic coding, browser automation, and foundational coding tasks while remaining computationally efficient: only 3.3B of its 30.5B total parameters are activated per token through sparse activation.
The model excels at tool calling and function execution, making it ideal for agentic workflows where generated code needs to interact with external tools and APIs. It natively supports a 256K-token context window (extendable to 1M tokens using YaRN), enabling repository-scale code understanding and generation. Qwen3-Coder is designed to work seamlessly with platforms such as CLINE and uses a specially designed function call format for agentic coding scenarios.
This non-thinking mode model generates direct code responses without intermediate reasoning blocks, making it optimized for production environments where clean, immediate code output is preferred. It supports conversational interfaces and can handle complex multi-turn coding dialogues while maintaining context across long interactions.
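The tool-calling and multi-turn flow described above can be sketched with plain data structures. This is an illustrative example only: the `read_file` tool, its parameters, and the message contents are hypothetical, but the shape follows the OpenAI-compatible `tools` / `tool_calls` schema that agentic frontends typically pass to the model.

```python
import json

def make_tool(name, description, parameters):
    """Wrap a function description in the standard 'tools' entry shape."""
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": parameters,
        },
    }

# Hypothetical tool definition for illustration.
read_file_tool = make_tool(
    "read_file",
    "Read a file from the workspace and return its contents.",
    {
        "type": "object",
        "properties": {"path": {"type": "string", "description": "File path"}},
        "required": ["path"],
    },
)

# A multi-turn exchange: the user request, the model's tool call,
# and the tool result fed back for the next generation step.
messages = [
    {"role": "user", "content": "Summarize main.py"},
    {"role": "assistant", "tool_calls": [{
        "id": "call_0",
        "type": "function",
        "function": {"name": "read_file",
                     "arguments": json.dumps({"path": "main.py"})},
    }]},
    {"role": "tool", "tool_call_id": "call_0", "content": "print('hello')"},
]
```

In practice, `read_file_tool` would be sent in the request's `tools` list, and the `messages` list grows turn by turn as the model issues calls and receives results.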
| Attribute | Value |
|---|---|
| Provider | Qwen (Alibaba Cloud) |
| Architecture | Qwen3 MoE (Mixture of Experts) |
| Languages | Multilingual |
| Input modalities | Text |
| Output modalities | Text |
| Context length | 262,144 tokens (extendable to 1M with YaRN) |
| Parameters | 30.5B total, 3.3B activated |
| Layers | 48 |
| Attention heads | 32 (Q), 4 (KV) - Grouped Query Attention |
| Experts | 128 total, 8 activated |
| License | Apache 2.0 |
```bash
docker model run qwen3-coder
```
For more information, check out the Docker Model Runner docs.
Qwen3-Coder uses a Mixture of Experts (MoE) architecture with 128 expert networks, activating only 8 experts per token. This sparse activation pattern enables the model to maintain a large total parameter count while keeping computational costs manageable through selective activation.
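The top-k routing step behind this sparse activation can be illustrated with a small toy sketch, assuming a standard softmax router: each token's router logits score all 128 experts, and only the 8 highest-scoring experts are evaluated, with their outputs weighted by renormalized softmax probabilities. This is a pure-Python illustration, not the model's actual implementation.

```python
import math
import random

NUM_EXPERTS = 128  # total experts per MoE layer (matches Qwen3-Coder)
TOP_K = 8          # experts activated per token

def route(router_logits):
    """Select the top-k experts and renormalize their weights via softmax."""
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:TOP_K]
    exps = [math.exp(router_logits[i]) for i in top]
    total = sum(exps)
    # Each entry: (expert index, gating weight); weights sum to 1.
    return [(i, w / total) for i, w in zip(top, exps)]

random.seed(0)
logits = [random.gauss(0, 1) for _ in range(NUM_EXPERTS)]
selected = route(logits)  # only these 8 experts would run for this token
```

Because only the selected experts' feed-forward blocks execute, per-token compute scales with `TOP_K`, not with `NUM_EXPERTS`, which is how the model keeps 30.5B total parameters but ~3.3B active.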
Agentic Coding: Built-in tool calling capabilities with a specialized function call format that works across multiple platforms including Qwen Code and CLINE. The model can seamlessly integrate with external tools and APIs.
Long Context Understanding: Native support for a 262K-token context window, extensible up to 1 million tokens with the YaRN technique. This enables repository-level code analysis and generation.
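Extending the context beyond the native window is typically done by adding a `rope_scaling` block to the model's `config.json`. The fragment below is an illustrative sketch only (the scaling `factor` is roughly target length divided by native length); consult the official Qwen3-Coder model card for the exact recommended values.

```json
{
  "rope_scaling": {
    "rope_type": "yarn",
    "factor": 4.0,
    "original_max_position_embeddings": 262144
  }
}
```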
Browser Automation: Strong performance on browser-use tasks, enabling automated web interaction and testing workflows.
Conversational Interface: Supports multi-turn conversations with maintained context, making it ideal for interactive coding assistants and pair programming scenarios.
Non-Thinking Mode: Generates direct code output without intermediate reasoning steps, optimized for production use cases requiring clean, immediate responses.
For optimal performance, the following is recommended:
transformers>=4.51.0 for proper model loading (earlier versions raise KeyError: 'qwen3_moe').

This model card was automatically generated using cagent-action. Want to learn more about Docker Model Runner? Check out the project repository: [***]