qwen3-30b-a3b

Model Description

Qwen3 represents the latest generation in the Qwen series of large language models, offering a comprehensive suite of both dense and mixture-of-experts (MoE) models. Leveraging extensive training, Qwen3 introduces unprecedented advancements in reasoning, instruction following, agent capabilities, and multilingual support. Its key features include:

Seamless Mode Switching: The model uniquely supports smooth transitions between “thinking” mode (for complex logical reasoning, mathematics, and coding) and “non-thinking” mode (for efficient, general-purpose dialogue), ensuring optimal performance across a variety of scenarios.
Enhanced Reasoning: Qwen3 demonstrates significantly improved reasoning abilities, outperforming previous QwQ (in thinking mode) and Qwen2.5 instruct models (in non-thinking mode) in mathematics, code generation, and commonsense logical reasoning tasks.
Human Preference Alignment: The model excels in creative writing, role-playing, multi-turn conversations, and instruction following, delivering a natural, engaging, and immersive conversational experience.
Agent Proficiency: Qwen3 offers advanced agent capabilities, enabling precise integration with external tools in both thinking and non-thinking modes, and achieves leading performance among open-source models on complex agent-based tasks.
Multilingual Support: It supports over 100 languages and dialects, showcasing strong capabilities for multilingual instruction following and translation.
Model Details

Below is an overview of the FP8 version of Qwen3-30B-A3B:

Feature Specification
Type Causal Language Models
Training Stage Pretraining & Post-training
Number of Parameters (Total) 30.5B
Number of Activated Parameters 3.3B
Number of Parameters (Non-Embedding) 29.9B
Number of Layers 48
Number of Attention Heads (GQA) 32 for Q, 4 for KV
Number of Experts 128
Number of Activated Experts 8
Context Length 32,768 tokens natively; 131,072 tokens with YaRN

🔔How to Use

graph LR A("Purchase Now") --> B["Start Chat on Homepage"] A --> D["Read API Documentation"] B --> C["Register / Login"] C --> E["Enter Key"] D --> F["Enter Endpoint & Key"] E --> G("Start Using") F --> G style A fill:#f9f9f9,stroke:#333,stroke-width:1px style B fill:#f9f9f9,stroke:#333,stroke-width:1px style C fill:#f9f9f9,stroke:#333,stroke-width:1px style D fill:#f9f9f9,stroke:#333,stroke-width:1px style E fill:#f9f9f9,stroke:#333,stroke-width:1px style F fill:#f9f9f9,stroke:#333,stroke-width:1px style G fill:#f9f9f9,stroke:#333,stroke-width:1px
Description Ends

Recommend Models

o3-2025-04-16

Our most powerful reasoning model with leading performance on coding, math, science, and vision

o4-mini

Our faster, cost-efficient reasoning model delivering strong performance on math, coding and vision

gpt-4o-image

Using reverse engineering to call the model within the official application and convert it into an API.