Best Embedding Models 2026: Voyage vs Cohere vs OpenAI Benchmarked
Choosing an embedding model is one of the most consequential decisions in a RAG pipeline. Get it wrong and your retrieval will underperform regardless of how good your LLM is. Get it right and you get better recall, faster queries, and lower costs.
This post breaks down the top embedding models in 2026 across the dimensions that actually matter: MTEB benchmark scores, cost per million tokens, context window, multilingual support, and real-world RAG performance.
The MTEB Leaderboard in 2026
MTEB (Massive Text Embedding Benchmark) is the standard benchmark for embedding models. It covers 56 tasks across 8 task types: classification, clustering, pair classification, reranking, retrieval, semantic textual similarity (STS), summarization, and bitext mining.
Top models as of April 2026 (overall MTEB average):
| Model | MTEB Score | Retrieval Score | Dims | Max Tokens | Provider |
|---|---|---|---|---|---|
| voyage-3-large | 70.7 | 67.2 | 1024-2048 | 32K | Voyage AI |
| voyage-code-3 | 68.2 | 71.4 (code) | 1024 | 32K | Voyage AI |
| voyage-3 | 67.5 | 64.1 | 1024 | 32K | Voyage AI |
| e5-mistral-7b-instruct | 66.6 | 64.8 | 4096 | 32K | Microsoft (OSS) |
| text-embedding-3-large | 64.6 | 62.3 | 256-3072 | 8K | OpenAI |
| cohere-embed-v3-english | 64.5 | 62.1 | 1024 | 512 | Cohere |
| cohere-embed-v3-multilingual | 62.8 | 59.8 | 1024 | 512 | Cohere |
| text-embedding-3-small | 62.3 | 59.1 | 512-1536 | 8K | OpenAI |
| mistral-embed | 55.3 | 54.9 | 1024 | 8K | Mistral |
Note: MTEB scores evolve. Always check the live leaderboard at huggingface.co/spaces/mteb/leaderboard before making a final decision.
Cost Per Million Tokens (2026 Pricing)
| Model | Input Cost (per 1M tokens) | Notes |
|---|---|---|
| voyage-3-large | $0.12 | Best MTEB, premium price |
| voyage-3 | $0.06 | Great balance |
| voyage-code-3 | $0.12 | Best for code retrieval |
| text-embedding-3-large | $0.13 | Most widely used |
| text-embedding-3-small | $0.02 | Budget option, solid quality |
| cohere-embed-v3-english | $0.10 | Includes classification |
| cohere-embed-v3-multilingual | $0.10 | Best multilingual coverage |
| mistral-embed | $0.04 | Budget, lower quality |
Cost Calculation at Scale
For a typical RAG setup with 10M documents (average 500 tokens each):
Embedding cost = 10,000,000 docs * 500 tokens / 1,000,000 * price_per_1M
voyage-3-large: 5,000 * $0.12 = $600 one-time
text-embedding-3-large: 5,000 * $0.13 = $650 one-time
text-embedding-3-small: 5,000 * $0.02 = $100 one-time
cohere-embed-v3: 5,000 * $0.10 = $500 one-time
One-time embedding cost is usually not the deciding factor. Re-embedding costs matter more — if you update your corpus frequently, the delta embedding cost adds up.
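The arithmetic above is easy to wrap in a small budgeting helper. A minimal sketch (the function name is illustrative; the prices are the ones from the table above):

```python
def embedding_cost_usd(num_docs: int, avg_tokens: int, price_per_1m: float) -> float:
    """One-time cost to embed a corpus: (total tokens / 1M) * price per 1M tokens."""
    return num_docs * avg_tokens / 1_000_000 * price_per_1m

# 10M docs * 500 tokens = 5,000M tokens
print(embedding_cost_usd(10_000_000, 500, 0.12))  # voyage-3-large: 600.0
print(embedding_cost_usd(10_000_000, 500, 0.02))  # text-embedding-3-small: 100.0
```

The same helper works for estimating re-embedding deltas: plug in the number of changed documents instead of the full corpus.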
Head-to-Head: Voyage-3-Large vs Text-Embedding-3-Large vs Cohere Embed v3
Voyage-3-Large
Strengths:
- Highest MTEB overall score in 2026 for non-open-source models
- 32K token context window — handles long documents without chunking
- Flexible output dimensions (1024, 1536, or 2048) for size/quality tradeoffs
- Excellent on legal, medical, and technical document retrieval
- Instruction-following variant available for asymmetric retrieval
Weaknesses:
- Voyage AI is a smaller company — less proven at enterprise scale
- API rate limits lower than OpenAI at default tier
- No native multilingual support (voyage-multilingual-2 is separate)
Best for: RAG applications where retrieval quality is the top priority and you're willing to pay a slight premium.
```python
import voyageai

client = voyageai.Client(api_key="your-api-key")

# Embed documents
doc_embeddings = client.embed(
    texts=["document text here"],
    model="voyage-3-large",
    input_type="document",
    output_dimension=1024,  # 1024, 1536, or 2048
).embeddings

# Embed query (use input_type="query" for asymmetric retrieval)
query_embedding = client.embed(
    texts=["what is the refund policy?"],
    model="voyage-3-large",
    input_type="query",
).embeddings[0]
```
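Once documents and a query are embedded, retrieval is a nearest-neighbor search over the vectors. A minimal brute-force sketch using cosine similarity (in production you would hand this to a vector database or FAISS; `cosine` and `top_k` are illustrative helpers, not part of any SDK):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def top_k(query_emb, doc_embs, k=3):
    """Return (doc_index, score) pairs for the k most similar documents."""
    scored = [(i, cosine(query_emb, e)) for i, e in enumerate(doc_embs)]
    return sorted(scored, key=lambda t: t[1], reverse=True)[:k]
```

Brute force is fine up to a few hundred thousand vectors; beyond that, an ANN index is the usual move.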
Text-Embedding-3-Large (OpenAI)
Strengths:
- Industry-standard — the most used embedding model by deployment volume
- Flexible dimensions (256, 512, 1024, 1536, 3072) via Matryoshka Representation Learning
- Tight integration with OpenAI ecosystem (fine-tuning, batch API)
- Excellent documentation and community support
- 50% cheaper batch API option
Weaknesses:
- 8K token context window — requires chunking for long documents
- Not the best on MTEB retrieval tasks compared to Voyage
- Higher price than text-embedding-3-small for marginal quality gain
Best for: Teams already using OpenAI who want solid, reliable embeddings with minimal integration work.
```python
from openai import OpenAI

client = OpenAI()

# Flexible dimensions with Matryoshka
response = client.embeddings.create(
    input=["document text here"],
    model="text-embedding-3-large",
    dimensions=1024,  # reduce dimensions without re-training
)
embedding = response.data[0].embedding

# Batch API for a 50% discount on large datasets
# (file_id is the ID of a previously uploaded JSONL file of embedding requests)
batch_job = client.batches.create(
    input_file_id=file_id,
    endpoint="/v1/embeddings",
    completion_window="24h",
)
```
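Matryoshka embeddings can also be shortened client-side after the fact: truncate the vector to its first N dimensions and L2-renormalize so cosine similarity stays meaningful. A sketch (`shorten` is an illustrative helper, not part of the OpenAI SDK):

```python
import math

def shorten(embedding, dims):
    """Truncate a Matryoshka embedding to `dims` and L2-renormalize to unit length."""
    cut = embedding[:dims]
    norm = math.sqrt(sum(x * x for x in cut))
    return [x / norm for x in cut]
```

This is useful when you store full-size vectors but want a cheaper first-pass search over truncated copies.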
Cohere Embed v3
Strengths:
- Built-in compression types (float, int8, uint8, binary, ubinary) — binary embeddings are 32x smaller
- Top-tier multilingual support (100+ languages, single model)
- input_type parameter for query/document asymmetry
- Tight integration with the Cohere Rerank pipeline
Weaknesses:
- Short 512-token context window — requires aggressive chunking
- The English-only model is not meaningfully better than OpenAI's at a comparable price
- Less flexible dimensionality than OpenAI or Voyage
Best for: Multilingual applications, or teams already using Cohere Rerank who want a unified vendor.
```python
import cohere

co = cohere.Client(api_key="your-api-key")

# Embed with compression for storage savings
response = co.embed(
    texts=["document text here"],
    model="embed-english-v3.0",
    input_type="search_document",
    embedding_types=["float", "int8"],  # get multiple formats
)
float_embedding = response.embeddings.float[0]
int8_embedding = response.embeddings.int8[0]  # 4x smaller, <2% quality loss
```
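Binary embeddings pack one dimension per bit (a 1024-dim vector fits in 128 bytes), and candidates are compared with Hamming distance rather than cosine. A minimal sketch over packed bytes such as Cohere's ubinary format (`hamming` is an illustrative helper):

```python
def hamming(a: bytes, b: bytes) -> int:
    """Hamming distance between two packed binary embeddings: count of differing bits."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))
```

A common pattern is binary Hamming search for a fast candidate pass, then reranking the top candidates with full float vectors.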
Multilingual Comparison
| Model | Languages | MTEB Multilingual | Best For |
|---|---|---|---|
| cohere-embed-v3-multilingual | 100+ | 62.8 | Broad multilingual coverage |
| voyage-multilingual-2 | 100+ | 65.1 | Best multilingual quality |
| text-embedding-3-large | ~50 (via training) | 59.4 | English-heavy workloads |
| e5-mistral-7b-instruct | 94 | 66.8 | Open-source multilingual |
If you're building a multilingual application, Voyage Multilingual 2 leads on benchmarks but Cohere Embed v3 Multilingual has broader language support and proven enterprise scale.
Code-Specific Embedding
For code search and retrieval (code RAG, semantic code search), general-purpose embedding models underperform. Use code-specific models:
| Model | Code Retrieval Score | Cost per 1M |
|---|---|---|
| voyage-code-3 | 71.4 | $0.12 |
| text-embedding-3-large | 59.2 | $0.13 |
| cohere-embed-v3 | 57.8 | $0.10 |
| jina-embeddings-v3 (code) | 68.3 | $0.02 |
voyage-code-3 is the clear winner for code retrieval. If you're building a code assistant or code search, it's worth the premium.
Choosing the Right Dimension Count
Higher dimensions = higher quality but higher storage and compute cost.
```
# Rule of thumb: start with 1024 dims.
# Only go to 3072 if you have measured quality issues.
#
# Storage cost at 10M vectors:
# 1024 dims (float32) = 10M * 1024 * 4 bytes ≈ 41GB
# 1536 dims (float32) = 10M * 1536 * 4 bytes ≈ 61GB
# 3072 dims (float32) = 10M * 3072 * 4 bytes ≈ 123GB
#
# With int8 quantization (Cohere embedding_types=["int8"]):
# 1024 dims (int8) = 10M * 1024 * 1 byte ≈ 10GB (4x reduction)
# With binary quantization (ubinary):
# 1024 dims (1 bit each) = 10M * 128 bytes ≈ 1.3GB (32x reduction)
```
For most RAG applications, 1024 dimensions is the right balance. Only use 1536+ if you have measured recall issues that additional dimensions fix.
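The storage figures above follow from a one-line formula, which is worth keeping handy when comparing dimension counts. A sketch for budgeting raw vector storage (decimal GB; excludes index overhead such as HNSW graph links; `index_size_gb` is an illustrative helper):

```python
def index_size_gb(num_vectors: int, dims: int, bytes_per_dim: int = 4) -> float:
    """Raw vector storage in decimal GB: vectors * dims * bytes per dimension."""
    return num_vectors * dims * bytes_per_dim / 1e9

# 10M float32 vectors at 1024 dims:
print(index_size_gb(10_000_000, 1024))  # ~41 GB
# Same vectors quantized to int8 (1 byte per dim):
print(index_size_gb(10_000_000, 1024, bytes_per_dim=1))  # ~10 GB
```

Real indexes add overhead on top of this (graph links, metadata, replicas), so treat the result as a floor, not an estimate of total footprint.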
The Recommendation
For most RAG applications: Start with text-embedding-3-small ($0.02/1M) and test if recall meets your quality bar. It often does. If not, upgrade to voyage-3 ($0.06/1M) for the best quality-per-dollar on MTEB retrieval tasks.
For code: Use voyage-code-3 without question.
For multilingual: Use voyage-multilingual-2 or cohere-embed-v3-multilingual based on which vendor you prefer.
For budget-constrained: text-embedding-3-small at $0.02/1M is remarkably good for its price. Self-hosted e5-mistral-7b-instruct is free at the cost of GPU compute.
The biggest mistake teams make is choosing the "best" model by MTEB without testing on their actual data. Domain-specific data can shift rankings significantly. Always run evals on your corpus before committing.
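A corpus-specific eval can be as simple as recall@k over a held-out set of (query, relevant-document) pairs. A minimal sketch, assuming you already have ranked retrieval results per query (`recall_at_k` is an illustrative helper):

```python
def recall_at_k(retrieved: list, relevant: list, k: int = 5) -> float:
    """Fraction of queries with at least one gold document in the top-k retrieved IDs.

    retrieved: per-query ranked lists of document IDs.
    relevant:  per-query sets of gold document IDs.
    """
    hits = sum(1 for got, gold in zip(retrieved, relevant) if set(got[:k]) & gold)
    return hits / len(retrieved)
```

Run this per candidate model on a few hundred real queries from your domain; the model ranking you get frequently differs from the MTEB ordering.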
Methodology
All performance figures in this article are sourced from publicly available benchmarks (primarily the MTEB leaderboard), provider pricing pages verified on 2026-04-16, and independent tests conducted via provider APIs. Pricing is listed as input cost per million tokens (embedding APIs have no output-token pricing). Rankings reflect the date of publication and will change as models are updated.