qwen3-32b

2025-04-28
Chat, Reasoning
By Qwen

Input: ￥2.00 / M tokens Output: ￥20.00 / M tokens
Features: Reasoning, Streaming, Text Input, Text Output
Context Window: 128K
Maximum Output: 8K

Input: ￥2.00 / M tokens Output: ￥20.00 / M tokens
Features: Reasoning, Streaming, Text Input, Text Output
Context Window: 128K
Maximum Output: 8K

Model Description

Qwen3-32B represents the latest advancement in the Qwen series of large language models, offering a dense architecture expertly trained for groundbreaking performance. Qwen3 models are recognized for their seamless switching between thinking mode (complex logical reasoning, mathematics, and coding) and non-thinking mode (efficient, general-purpose dialogue) within a single framework, ensuring optimal performance across diverse scenarios.

Key highlights:

Outperforms previous QwQ and Qwen2.5 instruct models in mathematics, code generation, and commonsense logical reasoning.
Demonstrates superior alignment with human preferences, excelling at creative writing, role-play, multi-turn conversation, and instruction following for highly engaging and natural interactions.
Exhibits advanced agent capabilities, enabling accurate integration with external tools in both thinking and non-thinking modes for leading performance in complex agent-based tasks.
Provides strong multilingual support, covering over 100 languages and dialects with reliable instruction-following and translation capabilities.
Model overview:

Feature	Description
Type	Causal Language Model
Training Stage	Pretraining & Post-training
Number of Parameters	32.8B
Non-Embedding Parameters	31.2B
Layers	64
Attention Heads (GQA)	Q: 64, KV: 8
Context Length	32,768 tokens natively, up to 131,072 tokens with YaRN

Qwen3-32B sets a new benchmark for large language models in terms of reasoning, agent functionality, conversational quality, and multilingual support, making it an ideal solution for a variety of advanced AI applications.

🔔How to Use

graph LR A("Purchase Now") --> B["Start Chat on Homepage"] A --> D["Read API Documentation"] B --> C["Register / Login"] C --> E["Enter Key"] D --> F["Enter Endpoint & Key"] E --> G("Start Using") F --> G style A fill:#f9f9f9,stroke:#333,stroke-width:1px style B fill:#f9f9f9,stroke:#333,stroke-width:1px style C fill:#f9f9f9,stroke:#333,stroke-width:1px style D fill:#f9f9f9,stroke:#333,stroke-width:1px style E fill:#f9f9f9,stroke:#333,stroke-width:1px style F fill:#f9f9f9,stroke:#333,stroke-width:1px style G fill:#f9f9f9,stroke:#333,stroke-width:1px

Purchase Now

Start Chat on Homepage

Register / Login

Enter Key

Read API Documentation

Enter Endpoint & Key

Start Using

Recommend Models

claude-opus-4-20250514-thinking

Chat, Reasoning, Vision
Anthropic

Comprehensive introduction to Anthropic's newly released Claude 4 models, Opus 4 and Sonnet 4, highlighting their features, performance benchmarks, application scenarios, pricing, and availability. This report summarizes key differences between the models and discusses their integration with major platforms such as GitHub Copilot, emphasizing their advantages in coding, advanced reasoning, and ethical AI responses.

2025-05-14

o3-pro

Chat, Reasoning, Vision
OpenAI

The o-series of models are trained with reinforcement learning to think before they answer and perform complex reasoning. The o3-pro model uses more compute to think harder and provide consistently better answers. o3-pro is available in the Responses API only to enable support for multi-turn model interactions before responding to API requests, and other advanced API features in the future. Since o3-pro is designed to tackle tough problems, some requests may take several minutes to finish. To avoid timeouts, try using background mode.

2025-06-10

gpt-4o-mini-rev

Chat, Vision
Reverse Source

Using reverse engineering to call the model within the official application and convert it into an API.

2024-12-01