llama-3.3-70b

Model Description

Meta Llama 3.3 is a state-of-the-art 70-billion-parameter multilingual large language model (LLM) for text generation. An instruction-tuned variant of the Llama architecture, it is optimized for assistant-like dialogue in English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai. The model uses an optimized transformer architecture with Grouped-Query Attention (GQA) for efficient inference and was trained on over 15 trillion tokens of publicly available data, with a knowledge cutoff of December 2023. It combines supervised fine-tuning (SFT) with reinforcement learning from human feedback (RLHF) to align responses with human preferences for helpfulness and safety. Notable features include a 128k-token context window and tool-calling support. The model is distributed under Meta's custom commercial license (the Llama 3.3 Community License), which prohibits unlawful uses and deployment in unsupported languages without appropriate safety measures, and it demonstrates strong performance on industry benchmarks.
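The tool-calling support mentioned above is typically exercised through an OpenAI-compatible chat-completions payload. The sketch below builds such a request body; the `get_weather` function and its schema are illustrative assumptions, not part of this model card.

```python
import json

# Hypothetical tool definition; the name and schema are illustrative assumptions.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

# OpenAI-style chat-completions request body targeting llama-3.3-70b.
request_body = {
    "model": "llama-3.3-70b",
    "messages": [{"role": "user", "content": "What's the weather in Lisbon?"}],
    "tools": [weather_tool],
}

print(json.dumps(request_body, indent=2))
```

If the model decides to call the tool, the response will contain a `tool_calls` entry with JSON arguments matching the declared schema, which your code executes before sending the result back in a follow-up message.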

🔔 How to Use

```mermaid
graph LR
    A("Purchase Now") --> B["Start Chat on Homepage"]
    A --> D["Read API Documentation"]
    B --> C["Register / Login"]
    C --> E["Enter Key"]
    D --> F["Enter Endpoint & Key"]
    E --> G("Start Using")
    F --> G
    style A fill:#f9f9f9,stroke:#333,stroke-width:1px
    style B fill:#f9f9f9,stroke:#333,stroke-width:1px
    style C fill:#f9f9f9,stroke:#333,stroke-width:1px
    style D fill:#f9f9f9,stroke:#333,stroke-width:1px
    style E fill:#f9f9f9,stroke:#333,stroke-width:1px
    style F fill:#f9f9f9,stroke:#333,stroke-width:1px
    style G fill:#f9f9f9,stroke:#333,stroke-width:1px
```
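Once you have an endpoint and key from the steps above, a minimal API call can be assembled as below. The base URL and key are placeholders, and an OpenAI-compatible `/chat/completions` route is assumed; substitute the real values from your account and the API documentation.

```python
import json
import urllib.request

# Placeholder values; replace with the endpoint and key from your account.
API_ENDPOINT = "https://api.example.com/v1/chat/completions"  # assumed OpenAI-compatible route
API_KEY = "YOUR_API_KEY"

body = json.dumps({
    "model": "llama-3.3-70b",
    "messages": [{"role": "user", "content": "Hello!"}],
}).encode("utf-8")

req = urllib.request.Request(
    API_ENDPOINT,
    data=body,
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {API_KEY}",
    },
    method="POST",
)

# Send once real credentials are in place:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

The request is only constructed here, not sent, so the sketch runs without valid credentials.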

Recommended Models

o3-pro

The o-series of models are trained with reinforcement learning to think before they answer and perform complex reasoning. The o3-pro model uses more compute to think harder and provide consistently better answers. o3-pro is available in the Responses API only to enable support for multi-turn model interactions before responding to API requests, and other advanced API features in the future. Since o3-pro is designed to tackle tough problems, some requests may take several minutes to finish. To avoid timeouts, try using background mode.

o3

Our most powerful reasoning model with leading performance on coding, math, science, and vision

gemini-2.5-flash-preview-04-17

Gemini-2.5-Flash-Preview-04-17 is a multimodal large language model that accepts text, image, video, and audio inputs, offering code execution capabilities and high token limits.