
We have launched the Basic Series economy models, which offer larger discounts. See the model comparison for details.

1MT = one million tokens. Listed prices assume a conversion rate of ¥2 = $1; if your purchase rate is ¥3.5 = $1, multiply each price by 1.75 (3.5 ÷ 2).
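The conversion rule above is simple proportional scaling; a minimal sketch (the function name and example prices are illustrative, not taken from the price list):

```python
def adjust_price(listed_price: float, purchase_rate: float, base_rate: float = 2.0) -> float:
    """Scale a listed per-1MT price (quoted at base_rate CNY per USD)
    to an account whose CNY/USD purchase rate differs."""
    return listed_price * (purchase_rate / base_rate)

# A model listed at ¥10 per 1MT costs ¥17.5 per 1MT at a ¥3.5 = $1 rate.
print(adjust_price(10.0, 3.5))  # 17.5
```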

grok-code-fast-1

We're thrilled to introduce grok-code-fast-1, a speedy and economical reasoning model that excels at agentic coding.

gemini-2.5-flash-image-preview(nano-banana)

Gemini 2.5 Flash Image is a state-of-the-art model for image generation and editing that offers advanced capabilities like character consistency, natural language-based transformations, multi-image fusion, and the integration of Gemini's world knowledge.

gemini-2.5-flash-image-preview-bs(nano-banana)

Gemini 2.5 Flash Image is a state-of-the-art model for image generation and editing that offers advanced capabilities like character consistency, natural language-based transformations, multi-image fusion, and the integration of Gemini's world knowledge.

DeepSeek-V3.1

Added DeepSeek's latest open-source model, V3.1, which supports both thinking and non-thinking modes and offers a larger context window and longer output length.

gpt-5

Our smartest, fastest, and most useful model yet, with thinking built in. Available to everyone.

claude-opus-4-1-20250805

Opus 4.1 advances our state-of-the-art coding performance to 74.5% on SWE-bench Verified. It also improves Claude’s in-depth research and data analysis skills, especially around detail tracking and agentic search.

gemini-embedding-001

The Gemini Embedding model achieves state-of-the-art performance across many key dimensions, including code, multilingual text, and retrieval.
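Retrieval with an embedding model typically works by comparing the returned vectors with cosine similarity; a minimal sketch, using toy 3-dimensional vectors in place of real gemini-embedding-001 output:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    # Dot product of the two vectors over the product of their magnitudes.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for real embedding output.
print(cosine_similarity([1.0, 0.0, 1.0], [1.0, 1.0, 0.0]))  # 0.5
```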

gemini-2.5-flash-lite

A Gemini 2.5 Flash model optimized for cost-efficiency and high throughput.

flux-kontext-pro

A unified model delivering local editing, generative modifications, and text-to-image generation in FLUX.1 quality. Processes text and image inputs for precise regional edits or full scene transformations at breakthrough speeds, pioneering iterative workflows that maintain character consistency across multiple editing turns.

flux-kontext-max

Our new premium model brings maximum performance across all aspects – greatly improved prompt adherence and typography generation meet premium consistency for editing without compromise on speed.

Midjourney API

Supports API access to Midjourney, including advanced operations such as text-to-image, image-to-image, image blending, upscaling, variations, and region-specific edits. Asynchronous result queries are supported.

grok-4-0709

Our latest and greatest flagship model, offering unparalleled performance in natural language, math, and reasoning, making it the perfect jack of all trades.

claudecode/claude-sonnet-4-20250514

The Claude model series offered through Claude Code has moderate stability at a very low price, making it best suited to batch data-processing tasks where stability requirements are not strict.

az/claude-sonnet-4-20250514

The Claude model series offered through the Microsoft Azure platform has moderate stability at a very low price, making it best suited to batch data-processing tasks where stability requirements are not strict.

sora_image

A reverse-engineered version of the official GPT-Image-1, featuring stable performance, high cost-effectiveness, compatibility with standard OpenAI request formats, and support for generating images directly in conversation.

gemini-2.5-pro

Gemini 2.5 Pro is Google's most advanced AI model designed for coding and complex tasks, featuring enhanced reasoning capabilities, native multimodal support, and a 1-million token context window.

gemini-2.5-flash

Gemini 2.5 Flash is Google's most efficient multimodal AI model designed for fast, cost-effective performance on everyday tasks with native audio capabilities and a 1-million token context window.

gemini-2.5-flash-lite-preview-06-17

A Gemini 2.5 Flash model optimized for cost efficiency and low latency.

o3-pro

The o-series of models are trained with reinforcement learning to think before they answer and perform complex reasoning. The o3-pro model uses more compute to think harder and provide consistently better answers. o3-pro is available in the Responses API only to enable support for multi-turn model interactions before responding to API requests, and other advanced API features in the future. Since o3-pro is designed to tackle tough problems, some requests may take several minutes to finish. To avoid timeouts, try using background mode.
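Background mode is requested per call; a minimal sketch of a Responses API request body (the field names follow OpenAI's published Responses schema, but verify them against the official reference before relying on this):

```python
import json

# Request body for an asynchronous o3-pro call. "background": True asks the
# API to process the request in the background, so the client connection is
# not held open while the model thinks for several minutes.
payload = {
    "model": "o3-pro",
    "input": "Summarize the key steps of the proof.",
    "background": True,
}
body = json.dumps(payload)
print(body)
```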

gemini-2.5-pro-preview-06-05

Google has released an upgraded preview of Gemini 2.5 Pro (06-05) that significantly improves coding performance, mathematical reasoning, and response formatting while addressing previous performance concerns.