qwen3-30b-a3b

Model Description

Qwen3 represents the latest generation in the Qwen series of large language models, offering a comprehensive suite of both dense and mixture-of-experts (MoE) models. Leveraging extensive training, Qwen3 introduces unprecedented advancements in reasoning, instruction following, agent capabilities, and multilingual support. Its key features include:

Seamless Mode Switching: The model uniquely supports smooth transitions between “thinking” mode (for complex logical reasoning, mathematics, and coding) and “non-thinking” mode (for efficient, general-purpose dialogue), ensuring optimal performance across a variety of scenarios (see the usage sketch after this list).
Enhanced Reasoning: Qwen3 demonstrates significantly improved reasoning abilities, outperforming previous QwQ (in thinking mode) and Qwen2.5 instruct models (in non-thinking mode) in mathematics, code generation, and commonsense logical reasoning tasks.
Human Preference Alignment: The model excels in creative writing, role-playing, multi-turn conversations, and instruction following, delivering a natural, engaging, and immersive conversational experience.
Agent Proficiency: Qwen3 offers advanced agent capabilities, enabling precise integration with external tools in both thinking and non-thinking modes, and achieves leading performance among open-source models on complex agent-based tasks.
Multilingual Support: It supports over 100 languages and dialects, showcasing strong capabilities for multilingual instruction following and translation.
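The thinking/non-thinking switch is exposed at the chat-template level in the Hugging Face transformers integration, as documented for the Qwen3 series. Below is a minimal sketch, assuming the public Qwen/Qwen3-30B-A3B checkpoint name and a local deployment rather than this service's hosted API:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: the public Hugging Face checkpoint name; substitute a local path if needed.
model_name = "Qwen/Qwen3-30B-A3B"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "How many primes are there below 100?"}]

# enable_thinking=True emits a <think>...</think> reasoning block before the answer;
# set it to False for fast, general-purpose dialogue.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,
)

inputs = tokenizer([text], return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=1024)
print(tokenizer.decode(output_ids[0][len(inputs.input_ids[0]):], skip_special_tokens=True))
```

Setting enable_thinking=False skips the reasoning block entirely, which is the faster option for routine dialogue.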
Model Details

Below is an overview of the FP8 version of Qwen3-30B-A3B:

| Feature | Specification |
| --- | --- |
| Type | Causal Language Models |
| Training Stage | Pretraining & Post-training |
| Number of Parameters (Total) | 30.5B |
| Number of Activated Parameters | 3.3B |
| Number of Parameters (Non-Embedding) | 29.9B |
| Number of Layers | 48 |
| Number of Attention Heads (GQA) | 32 for Q, 4 for KV |
| Number of Experts | 128 |
| Number of Activated Experts | 8 |
| Context Length | 32,768 tokens natively; 131,072 tokens with YaRN |
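The 131,072-token figure relies on YaRN rope scaling on top of the 32,768-token native window. Here is a minimal sketch of enabling it through a transformers config override, assuming the public Qwen/Qwen3-30B-A3B checkpoint; the 4.0 factor (32,768 × 4 = 131,072) follows the pattern described in the Qwen model cards and should be adjusted to the longest context actually needed:

```python
from transformers import AutoConfig, AutoModelForCausalLM

# Assumption: public checkpoint name and a 4x YaRN factor (32,768 * 4 = 131,072 tokens).
config = AutoConfig.from_pretrained("Qwen/Qwen3-30B-A3B")
config.rope_scaling = {
    "rope_type": "yarn",
    "factor": 4.0,
    "original_max_position_embeddings": 32768,
}
config.max_position_embeddings = 131072

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-30B-A3B",
    config=config,
    torch_dtype="auto",
    device_map="auto",
)
```

Static scaling of this kind applies to all inputs, so it is generally worth enabling only when long-context processing is actually required.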

🔔 How to Use

```mermaid
graph LR
    A("Purchase Now") --> B["Start Chat on Homepage"]
    A --> D["Read API Documentation"]
    B --> C["Register / Login"]
    C --> E["Enter Key"]
    D --> F["Enter Endpoint & Key"]
    E --> G("Start Using")
    F --> G
    style A fill:#f9f9f9,stroke:#333,stroke-width:1px
    style B fill:#f9f9f9,stroke:#333,stroke-width:1px
    style C fill:#f9f9f9,stroke:#333,stroke-width:1px
    style D fill:#f9f9f9,stroke:#333,stroke-width:1px
    style E fill:#f9f9f9,stroke:#333,stroke-width:1px
    style F fill:#f9f9f9,stroke:#333,stroke-width:1px
    style G fill:#f9f9f9,stroke:#333,stroke-width:1px
```
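Once the flow above yields an endpoint and a key, the service can presumably be called like any OpenAI-compatible chat API. A minimal sketch follows; the base URL, API key, and model identifier are placeholders, so take the real values from the API documentation:

```python
from openai import OpenAI

# All three values below are placeholders; use the real ones from the API documentation.
client = OpenAI(
    base_url="https://api.example.com/v1",  # the endpoint from "Enter Endpoint & Key"
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="qwen3-30b-a3b",
    messages=[{"role": "user", "content": "Summarize the MoE architecture of Qwen3 in two sentences."}],
)
print(response.choices[0].message.content)
```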

Recommended Models

claude-opus-4-20250514-thinking

Comprehensive introduction to Anthropic's newly released Claude 4 models, Opus 4 and Sonnet 4, highlighting their features, performance benchmarks, application scenarios, pricing, and availability. This report summarizes key differences between the models and discusses their integration with major platforms such as GitHub Copilot, emphasizing their advantages in coding, advanced reasoning, and ethical AI responses.

gpt-4o-mini-rev

Calls the model inside the official application via reverse engineering and exposes it as an API.

DeepSeek-R1

Performance on par with OpenAI o1. Fully open-source model and technical report. Code and models are released under the MIT License: distill and commercialize freely.