Supported Model List:
- claude-opus-4-6
- claude-opus-4-6-thinking
- claudecode/claude-opus-4-6
- claudecode/claude-opus-4-6-thinking
- aws/claude-opus-4-6
- aws/claude-opus-4-6-thinking
Claude Opus 4.6 is a next-generation flagship model featuring adaptive thinking, a doubled output capacity of 128K tokens, and new context management tools like the Compaction API.
Claude Opus 4.6 represents the latest evolution in the Claude 4 model family, specifically optimized for complex coding tasks and autonomous agent development. The model retains the standard 200K-token context window, with a 1M-token window available in beta for extremely large inputs. A significant technical upgrade is the expansion of the maximum output limit to 128K tokens, twice the capacity of its predecessor, which enables longer internal reasoning chains and more comprehensive document generation.
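To illustrate the new limits, a request payload might be shaped like the sketch below. The `anthropic-beta` header value for the 1M-token window is an assumption; the source confirms the beta capability but not the flag name.

```python
# Sketch of a request using the doubled 128K output ceiling.
request = {
    "model": "claude-opus-4-6",
    "max_tokens": 128_000,  # new maximum output, twice the previous limit
    "messages": [{"role": "user", "content": "Draft a full design document."}],
}

# Assumed flag for opting into the 1M-token context beta; the actual
# header value is not given in these notes.
beta_headers = {"anthropic-beta": "context-1m"}
```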
The core of this release is the new “Adaptive Thinking” mode, which allows the model to dynamically determine how much internal reasoning a given task requires. Instead of manually setting fixed token budgets, developers control thinking depth with an “effort” parameter. At the highest effort level, the model applies its maximum reasoning capability to difficult problems, while lower effort levels allow more cost-effective responses to simpler queries.
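The effort-based control described above can be sketched as a small request builder. The level names (`"low"`, `"medium"`, `"high"`) and the `thinking` field shape are assumptions; the source only states that an effort parameter replaces fixed token budgets.

```python
def build_request(prompt: str, effort: str = "high") -> dict:
    """Build a message request using the assumed adaptive-thinking fields.

    The effort levels and the {"type": "adaptive", "effort": ...} shape
    are illustrative assumptions, not a documented schema.
    """
    if effort not in {"low", "medium", "high"}:
        raise ValueError(f"unknown effort level: {effort}")
    return {
        "model": "claude-opus-4-6-thinking",
        "max_tokens": 128_000,
        "thinking": {"type": "adaptive", "effort": effort},
        "messages": [{"role": "user", "content": prompt}],
    }
```

Swapping the single `effort` string for a per-request token budget is the main ergonomic change: the model, not the caller, decides how many thinking tokens a query deserves.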
To improve the efficiency of long-term interactions, Claude Opus 4.6 introduces the Compaction API in beta. This feature provides automatic, server-side context summarization, effectively enabling infinite conversations by condensing earlier parts of a dialogue as it approaches the context window limit. Additionally, the release brings fine-grained tool streaming to general availability and adds data residency controls, allowing users to specify geographic routing for model inference via the new inference_geo parameter.
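A long-running conversation request combining these features might look like the sketch below. Only `inference_geo` is named in these notes; the `context_management`/`compaction` field names, the region code, and all values are assumptions standing in for the beta Compaction API's actual schema.

```python
# Sketch of a long-conversation request with server-side compaction
# and geographic routing. Field shapes below are assumed, not documented.
request = {
    "model": "claude-opus-4-6",
    "max_tokens": 8_192,
    "inference_geo": "eu",  # assumed region code for data residency routing
    "context_management": {"compaction": {"type": "auto"}},  # assumed shape
    "messages": [
        {"role": "user", "content": "Continue our earlier discussion."},
    ],
}
```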
Developers should be aware of several architectural changes and deprecations. Claude Opus 4.6 no longer supports assistant message prefilling; requests containing prefilled messages will return an error, requiring a shift toward using structured outputs or system prompts for response guidance. Furthermore, the API has moved the structured output configuration to a new parameter location and deprecated older manual thinking configurations in favor of the new adaptive system.
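Migrating off prefilling can be as simple as dropping a trailing assistant turn and moving format guidance into the system prompt. The helper below is a minimal sketch of that pattern, not an official migration tool, and the system-prompt wording is illustrative.

```python
def strip_prefill(messages: list[dict]) -> dict:
    """Remove a trailing assistant prefill and steer the response format
    via the system prompt instead, since Claude Opus 4.6 returns an
    error for requests ending in a prefilled assistant message."""
    if messages and messages[-1]["role"] == "assistant":
        messages = messages[:-1]  # drop the prefill turn
    return {
        "model": "claude-opus-4-6",
        "system": "Respond with a JSON object only.",  # replaces the prefill
        "messages": messages,
    }

# Legacy pattern: the trailing assistant turn would now be rejected.
legacy = [
    {"role": "user", "content": "List three primes as JSON."},
    {"role": "assistant", "content": '{"primes": ['},
]
migrated = strip_prefill(legacy)
```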