DeepSeek-V3.1

Model Description

Added DeepSeek’s latest open-source model, DeepSeek-V3.1, which supports both thinking and non-thinking modes, with a larger context window and output length.

Model Version:

deepseek-v3-1-nothinking

CONTEXT LENGTH: 128K

DEFAULT: 4K
MAXIMUM: 8K

Json Output

Function Calling

Chat Prefix Completion(Beta)

FIM Completion(Beta)

deepseek-v3-1-thinking

CONTEXT LENGTH: 128K

DEFAULT: 32K
MAXIMUM: 64K

Json Output

Chat Prefix Completion(Beta)

🔔How to Use

graph LR A("Purchase Now") --> B["Start Chat on Homepage"] A --> D["Read API Documentation"] B --> C["Register / Login"] C --> E["Enter Key"] D --> F["Enter Endpoint & Key"] E --> G("Start Using") F --> G style A fill:#f9f9f9,stroke:#333,stroke-width:1px style B fill:#f9f9f9,stroke:#333,stroke-width:1px style C fill:#f9f9f9,stroke:#333,stroke-width:1px style D fill:#f9f9f9,stroke:#333,stroke-width:1px style E fill:#f9f9f9,stroke:#333,stroke-width:1px style F fill:#f9f9f9,stroke:#333,stroke-width:1px style G fill:#f9f9f9,stroke:#333,stroke-width:1px

Purchase Now

Start Chat on Homepage

Register / Login

Enter Key

Read API Documentation

Enter Endpoint & Key

Start Using

Description Ends

Recommend Models

o3-pro

The o-series of models are trained with reinforcement learning to think before they answer and perform complex reasoning. The o3-pro model uses more compute to think harder and provide consistently better answers. o3-pro is available in the Responses API only to enable support for multi-turn model interactions before responding to API requests, and other advanced API features in the future. Since o3-pro is designed to tackle tough problems, some requests may take several minutes to finish. To avoid timeouts, try using background mode.

gemini-2.5-flash-lite-preview-06-17

A Gemini 2.5 Flash model optimized for cost efficiency and low latency.

claudecode/claude-sonnet-4-20250514

The Claude model series offered by the Claude Code has moderate stability and is extremely low-priced, making it more suitable for data batch processing tasks where strict stability requirements are not particularly stringent.