gemini-embedding-001

Model Description

The Gemini Embedding model achieves state-of-the-art performance across many key dimensions, including code, multilingual understanding, and retrieval.

Gemini Embedding is a state-of-the-art model that leverages the Gemini architecture to produce highly generalizable embeddings for text and code across numerous languages, designed for tasks like retrieval, classification, and clustering.

An Introduction to the Gemini Embedding Model

Gemini Embedding is a state-of-the-art embedding model designed to leverage the capabilities of Google’s Gemini large language model. It produces highly generalizable, dense vector representations for text spanning over 100 languages and various textual modalities, including code. These embeddings can be precomputed and applied to a wide range of downstream tasks such as classification, semantic similarity, clustering, ranking, and information retrieval.
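As a minimal illustration of the "precompute and reuse" workflow described above, the sketch below ranks documents against a query by cosine similarity over precomputed embedding vectors. The tiny 3-dimensional vectors and document names are invented for the example; real embeddings from the model would be much higher-dimensional.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy stand-ins for precomputed document embeddings (in practice these
# would come from the embedding model and be stored ahead of time).
doc_embeddings = {
    "doc_a": np.array([0.9, 0.1, 0.0]),
    "doc_b": np.array([0.1, 0.9, 0.0]),
}
query_embedding = np.array([0.8, 0.2, 0.0])

# Retrieval reduces to ranking documents by similarity to the query vector.
ranked = sorted(
    doc_embeddings,
    key=lambda d: cosine_similarity(query_embedding, doc_embeddings[d]),
    reverse=True,
)
```

The same similarity scores can back classification (nearest labeled example), clustering (distance metric), and ranking, which is why a single precomputed embedding serves so many downstream tasks.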

Model Architecture

The model’s architecture is designed to create holistic representations of inputs. The process begins by initializing the embedding model from a pre-existing Gemini model, which allows it to build upon the vast knowledge already contained within Gemini’s parameters.

The technical process involves three main steps:

  1. An input text sequence is processed by a transformer with bidirectional attention, which produces a sequence of token-level embeddings.
  2. A mean pooling strategy is then applied. This involves averaging the token embeddings along the sequence axis to generate a single, fixed-size embedding that represents the entire input.
  3. Finally, a randomly initialized linear projection layer scales this pooled embedding to the desired final output dimension.
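The three steps above can be sketched in NumPy. The shapes, random token embeddings, and dimensions here are illustrative assumptions (the transformer itself is replaced by a random stand-in); only the pooling and projection steps mirror the described architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy dimensions: 6 tokens, hidden size 8, output dimension 4.
seq_len, hidden_dim, out_dim = 6, 8, 4

# Step 1 (stand-in): token-level embeddings as a bidirectional
# transformer would produce them, one vector per input token.
token_embeddings = rng.standard_normal((seq_len, hidden_dim))

# Step 2: mean pooling along the sequence axis yields one fixed-size
# vector representing the whole input.
pooled = token_embeddings.mean(axis=0)          # shape: (hidden_dim,)

# Step 3: a randomly initialized linear projection maps the pooled
# vector to the desired final output dimension.
projection = rng.standard_normal((hidden_dim, out_dim))
embedding = pooled @ projection                  # shape: (out_dim,)
```

Note that the output size is fixed by the projection layer, independent of the input length, which is what makes the embeddings directly comparable across texts.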

Training

The Gemini Embedding model was refined using a training objective based on a noise-contrastive estimation (NCE) loss function with in-batch negatives.
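A common way to realize an NCE-style objective with in-batch negatives is a softmax cross-entropy over the batch similarity matrix, where each query's paired target is the positive and the other targets in the batch act as negatives. The sketch below shows that generic formulation; the temperature value and normalization details are illustrative assumptions, not the exact Gemini Embedding recipe.

```python
import numpy as np

def in_batch_nce_loss(queries: np.ndarray, targets: np.ndarray,
                      tau: float = 0.05) -> float:
    """Contrastive loss with in-batch negatives.

    queries, targets: (batch, dim) arrays where row i of `targets` is the
    positive for row i of `queries`; all other rows serve as negatives.
    """
    # L2-normalize so dot products become cosine similarities.
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    t = targets / np.linalg.norm(targets, axis=1, keepdims=True)

    # (batch, batch) similarity matrix, scaled by a temperature.
    logits = (q @ t.T) / tau

    # Softmax cross-entropy with the diagonal as the positive class.
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))
```

When query and target embeddings for the same pair point in the same direction, the loss approaches zero; mismatched pairs drive it up, pushing the model to separate unrelated texts in embedding space.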

Performance and Capabilities

When evaluated on the Massive Multilingual Text Embedding Benchmark (MMTEB), which includes over one hundred tasks across more than 250 languages, Gemini Embedding has been shown to substantially outperform previous state-of-the-art models. It established a new state-of-the-art on the public leaderboard, achieving a mean score of 68.32, a significant improvement over the next-best model.

The model demonstrates exceptional performance not only in high-resource languages like English but also in numerous low-resource languages, such as Macedonian. It has also set new records on specific benchmarks like XOR-Retrieve for cross-lingual retrieval. This unified model shows strong capabilities across a broad selection of tasks, surpassing even specialized, domain-specific models in English, multilingual, and code benchmarks.

🔔 How to Use

```mermaid
graph LR
    A("Purchase Now") --> B["Start Chat on Homepage"]
    A --> D["Read API Documentation"]
    B --> C["Register / Login"]
    C --> E["Enter Key"]
    D --> F["Enter Endpoint & Key"]
    E --> G("Start Using")
    F --> G
    style A fill:#f9f9f9,stroke:#333,stroke-width:1px
    style B fill:#f9f9f9,stroke:#333,stroke-width:1px
    style C fill:#f9f9f9,stroke:#333,stroke-width:1px
    style D fill:#f9f9f9,stroke:#333,stroke-width:1px
    style E fill:#f9f9f9,stroke:#333,stroke-width:1px
    style F fill:#f9f9f9,stroke:#333,stroke-width:1px
    style G fill:#f9f9f9,stroke:#333,stroke-width:1px
```
