Voyage AI Review 2026: Best Embedding Models for RAG and Code Search
Voyage AI has established itself as the specialist embedding provider in 2026, with domain-specific models for code, finance, law, and multilingual content that consistently outperform general-purpose embeddings from OpenAI and Cohere on relevant benchmarks. This review covers whether the performance premium justifies the cost.
What is Voyage AI?
Voyage AI is a company focused exclusively on embedding models; it offers no generative models. Their product line includes:
- voyage-3-large: flagship general-purpose embedding model (1024 dimensions)
- voyage-3: balanced performance/cost general model (1024 dimensions)
- voyage-3-lite: fastest, cheapest option (512 dimensions)
- voyage-code-3: specialized for code retrieval
- voyage-finance-2: specialized for financial documents
- voyage-law-2: specialized for legal documents
- voyage-multilingual-2: 100+ language support
- voyage-multimodal-3: text + image embeddings
Benchmarks vs Competitors
On MTEB (Massive Text Embedding Benchmark), as of April 2026:
General retrieval (BEIR benchmark):
| Model | NDCG@10 | Dimensions | Cost per 1M tokens |
|---|---|---|---|
| voyage-3-large | 57.1 | 1024 | $0.18 |
| voyage-3 | 55.2 | 1024 | $0.06 |
| embed-english-v3.0 (Cohere) | 55.0 | 1024 | $0.10 |
| text-embedding-3-large (OpenAI) | 54.9 | 3072 | $0.13 |
| BGE-M3 (open-source) | 54.2 | 1024 | Free (self-hosted) |
| voyage-3-lite | 51.8 | 512 | $0.02 |
Code retrieval (CodeSearchNet + BEIR code tasks):
| Model | NDCG@10 | Cost per 1M tokens |
|---|---|---|
| voyage-code-3 | 71.4 | $0.18 |
| voyage-3-large | 64.2 | $0.18 |
| text-embedding-3-large | 63.1 | $0.13 |
| embed-english-v3.0 | 61.8 | $0.10 |
The advantage of voyage-code-3 on code retrieval is significant: +8.3 NDCG@10 over text-embedding-3-large. For code search, that is not a marginal improvement.
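For readers unfamiliar with the metric: NDCG@10 rewards rankings that place highly relevant results near the top of the first ten hits. A minimal sketch of the computation (benchmark harnesses use their own implementations):

```python
import math

def ndcg_at_k(relevances: list[float], k: int = 10) -> float:
    """NDCG@k for a ranked list of graded relevance labels
    (index 0 is the top-ranked result)."""
    def dcg(rels):
        # Each result's gain is discounted by its log2 rank position
        return sum(rel / math.log2(i + 2) for i, rel in enumerate(rels))
    ideal = dcg(sorted(relevances, reverse=True)[:k])
    return dcg(relevances[:k]) / ideal if ideal > 0 else 0.0

print(ndcg_at_k([2, 1, 0]))  # ideal ordering -> 1.0
print(ndcg_at_k([0, 2, 1]))  # relevant docs ranked too low -> below 1.0
```

A gap of 8 points on this scale means the specialized model consistently surfaces the right code higher in the result list.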
Financial document retrieval (FinanceBench):
| Model | Accuracy |
|---|---|
| voyage-finance-2 | 81.3% |
| voyage-3-large | 76.1% |
| text-embedding-3-large | 74.2% |
Again, the domain-specific model shows meaningful advantage.
Getting Started
```shell
pip install voyageai
```

```python
import os

import voyageai

vo = voyageai.Client(api_key=os.environ["VOYAGE_API_KEY"])

# Single embedding
embedding = vo.embed(
    ["What is prompt caching in Anthropic's API?"],
    model="voyage-3",
    input_type="query",  # "query" or "document"
).embeddings[0]

# Batch embeddings
texts = ["Document 1 content", "Document 2 content", "Document 3 content"]
result = vo.embed(
    texts,
    model="voyage-3",
    input_type="document",
)

embeddings = result.embeddings  # list of 1024-dim vectors
print(f"Total tokens used: {result.total_tokens}")
```
Important: Voyage distinguishes between query and document embeddings. Use input_type="query" for search queries and input_type="document" for indexed content. This matters: using the wrong type degrades retrieval quality.
RAG Pipeline with Voyage
```python
import os

import lancedb
import voyageai
from anthropic import Anthropic

vo = voyageai.Client(api_key=os.environ["VOYAGE_API_KEY"])
client = Anthropic()
db = lancedb.connect("/data/rag-db")


def index_documents(texts: list[str], sources: list[str]):
    """Embed and store documents."""
    result = vo.embed(texts, model="voyage-3", input_type="document")
    table = db.open_table("documents")  # assumes the table already exists
    table.add([
        {"content": text, "vector": emb, "source": src}
        for text, emb, src in zip(texts, result.embeddings, sources)
    ])


def retrieve(query: str, k: int = 5):
    """Retrieve relevant chunks."""
    result = vo.embed([query], model="voyage-3", input_type="query")
    query_vector = result.embeddings[0]
    table = db.open_table("documents")
    return table.search(query_vector).limit(k).to_pandas()


def answer(question: str) -> str:
    """Full RAG pipeline: retrieve, then generate."""
    chunks = retrieve(question)
    context = "\n\n".join(
        f"[Source: {row['source']}]\n{row['content']}"
        for _, row in chunks.iterrows()
    )
    response = client.messages.create(
        model="claude-sonnet-4-5",
        max_tokens=1000,
        messages=[{
            "role": "user",
            "content": (
                f"Context:\n{context}\n\n"
                f"Question: {question}\n\n"
                "Answer based only on the context provided."
            ),
        }],
    )
    return response.content[0].text
```
Code Search with voyage-code-3
```python
# voyage-code-3 is optimized for code retrieval: it understands
# function signatures, docstrings, and comments.
codebase_snippets = [
    "def calculate_embedding(text: str) -> list[float]: ...",
    "class VectorStore:\n    def upsert(self, vectors): ...",
    "async def rate_limited_request(client, prompt): ...",
]

# Index code with document type
code_embeddings = vo.embed(
    codebase_snippets,
    model="voyage-code-3",
    input_type="document",
).embeddings

# Search code with query type
query_emb = vo.embed(
    ["function to handle rate limiting"],
    model="voyage-code-3",
    input_type="query",
).embeddings[0]

# voyage-code-3 captures the semantic relationship between
# "function to handle rate limiting" and "async def rate_limited_request"
```
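To close the loop, a small NumPy helper can rank the indexed snippets against the query embedding. It normalizes vectors explicitly rather than assuming the API returns unit-length embeddings:

```python
import numpy as np

def rank_by_similarity(query_emb, doc_embs, docs, top_k=3):
    """Rank documents by cosine similarity to the query embedding."""
    q = np.asarray(query_emb, dtype=float)
    D = np.asarray(doc_embs, dtype=float)
    # Normalize so dot products are cosine similarities
    q = q / np.linalg.norm(q)
    D = D / np.linalg.norm(D, axis=1, keepdims=True)
    scores = D @ q
    order = np.argsort(scores)[::-1][:top_k]
    return [(docs[i], float(scores[i])) for i in order]

# With the embeddings above, the rate-limiting snippet should rank first:
# rank_by_similarity(query_emb, code_embeddings, codebase_snippets, top_k=1)
```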
Reranking
Voyage also offers a reranking model:
```python
# After initial retrieval, rerank with voyage-rerank-2.
# `retrieved_chunks` is the candidate list from the first-pass search.
result = vo.rerank(
    query="How does prompt caching work?",
    documents=[chunk.content for chunk in retrieved_chunks],
    model="voyage-rerank-2",
    top_k=3,
)

for item in result.results:
    print(f"Relevance: {item.relevance_score:.3f}")
    print(retrieved_chunks[item.index].content[:200])
```
Pricing for voyage-rerank-2: $0.05 per 1K searches (1 search = 1 query + N documents)
Pricing Comparison
| Model | Price per 1M tokens | Dimensions | Best for |
|---|---|---|---|
| voyage-3-lite | $0.02 | 512 | High-volume, cost-sensitive |
| voyage-3 | $0.06 | 1024 | General RAG (best value) |
| voyage-3-large | $0.18 | 1024 | Max quality general retrieval |
| voyage-code-3 | $0.18 | 1024 | Code search |
| voyage-finance-2 | $0.12 | 1024 | Financial docs |
| voyage-law-2 | $0.12 | 1024 | Legal docs |
| voyage-multimodal-3 | $0.12 text / $0.18 img | 1024 | Mixed media |
For general RAG, voyage-3 at $0.06/1M is the sweet spot: it outperforms text-embedding-3-large on most benchmarks at roughly half the price.
Cost at Scale
For a 10M-document corpus averaging 512 tokens per chunk (≈5B tokens total):
- voyage-3-lite: 5B tokens × $0.02/1M = $100
- voyage-3: 5B tokens × $0.06/1M = $300
- text-embedding-3-large: 5B tokens × $0.13/1M = $650
Voyage-3 costs 54% less than text-embedding-3-large while outperforming it on most benchmarks.
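The figures above round 10M × 512 tokens down to 5B; the exact total is 5.12B tokens, which nudges each cost up by about 2% (voyage-3 comes to roughly $307 rather than $300). A quick calculator for checking your own corpus size:

```python
def embedding_cost(num_docs: int, avg_tokens: int, price_per_million: float) -> float:
    """Estimated one-time indexing cost in dollars."""
    return num_docs * avg_tokens / 1_000_000 * price_per_million

# Prices from the table above
for model, price in [("voyage-3-lite", 0.02), ("voyage-3", 0.06),
                     ("text-embedding-3-large", 0.13)]:
    print(f"{model}: ${embedding_cost(10_000_000, 512, price):,.2f}")
```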
Dimension Reduction
Voyage supports Matryoshka-style dimension reduction (voyage-3-large supports it natively):

```python
# Full 1024 dimensions
full_emb = vo.embed([text], model="voyage-3-large").embeddings[0]  # 1024 dims

# Truncated to 256 dimensions: still useful for coarse first-pass ANN
# retrieval, followed by reranking with the full vectors.
# Re-normalize after truncating so cosine similarities stay comparable.
truncated = full_emb[:256]
```
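The truncate-then-rerank pattern looks like this end to end. The vector math is plain NumPy and stands in for whatever ANN index you actually use; the 256-dimension cutoff is the same illustrative choice as above:

```python
import numpy as np

def truncate_and_normalize(emb, dims: int = 256):
    """Matryoshka-style truncation: keep the leading dims,
    then re-normalize so cosine similarity stays meaningful."""
    v = np.asarray(emb, dtype=float)[:dims]
    return v / np.linalg.norm(v)

def two_stage_search(query_emb, doc_embs, coarse_k=100, final_k=5, dims=256):
    """Cheap first pass on truncated vectors, exact rescoring on full ones."""
    q_full = np.asarray(query_emb, dtype=float)
    D_full = np.asarray(doc_embs, dtype=float)
    # Stage 1: coarse retrieval in the truncated space
    q_c = truncate_and_normalize(q_full, dims)
    D_c = np.stack([truncate_and_normalize(d, dims) for d in D_full])
    coarse = np.argsort(D_c @ q_c)[::-1][:coarse_k]
    # Stage 2: rescore the surviving candidates with the full vectors
    qn = q_full / np.linalg.norm(q_full)
    Dn = D_full[coarse] / np.linalg.norm(D_full[coarse], axis=1, keepdims=True)
    order = np.argsort(Dn @ qn)[::-1][:final_k]
    return [int(coarse[i]) for i in order]
```

The payoff is a 4x smaller first-pass index at the cost of one extra dot-product pass over the shortlist.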
When Voyage Beats OpenAI
Voyage wins:
- Code retrieval: voyage-code-3 has a large, consistent advantage
- Financial documents: voyage-finance-2 wins on FinanceBench
- Legal documents: voyage-law-2 wins on LegalBench
- Cost-efficiency for general RAG: voyage-3 beats text-embedding-3-large on both quality and price
- Multilingual: voyage-multilingual-2 is competitive with OpenAI's multilingual embeddings
OpenAI wins:
- Ecosystem integration: text-embedding-3 is native to the OpenAI Assistants API and Batch API
- Zero additional vendor: if you're already fully on OpenAI, adding Voyage adds a dependency
- Very short texts: at sub-100 token texts, quality differences narrow
Limitations
- Vendor lock-in: Voyage is a single-product company. Switching costs are real if they raise prices or get acquired.
- Rate limits: Default limits are lower than OpenAI at equivalent pricing tiers. Plan for rate limit handling.
- Batch API: No async batch API equivalent to OpenAI's Batch API for 50% discounts on large offline jobs.
- SDKs: Python and JavaScript only. Go, Rust, Java users must use the HTTP API directly.
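Given the tighter rate limits, wrap embedding calls in exponential backoff with jitter. A sketch follows; the except clause is deliberately broad because the SDK's exact rate-limit exception class isn't shown here, so narrow it to the real one in production:

```python
import random
import time

def embed_with_backoff(vo, texts, model="voyage-3",
                       input_type="document", max_retries=5):
    """Call vo.embed, retrying with exponential backoff plus jitter."""
    for attempt in range(max_retries):
        try:
            return vo.embed(texts, model=model, input_type=input_type)
        except Exception:  # TODO: narrow to the SDK's rate-limit exception
            if attempt == max_retries - 1:
                raise
            # 1s, 2s, 4s, ... capped at 30s; jitter avoids synchronized retries
            time.sleep(min(2 ** attempt + random.random(), 30))
```

Jitter matters when many workers index in parallel: without it, every worker retries at the same instant and hits the limit again.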
Summary
Voyage AI offers the best domain-specific embedding models available in 2026. For code search, voyage-code-3 is the clear choice. For general RAG, voyage-3 beats text-embedding-3-large on both quality and cost. The domain-specific models (finance, law) are genuinely useful for those verticals.
If you're building a code search tool, an enterprise knowledge base in a regulated industry, or a cost-conscious high-volume RAG system, Voyage deserves a serious evaluation.
Methodology
All benchmarks, pricing, and performance figures cited in this article are sourced from publicly available data: provider pricing pages (verified 2026-04-16), the MTEB/BEIR retrieval benchmarks, and independent API tests. Costs are listed per million input tokens unless noted (embedding APIs bill input only). Rankings reflect the publication date and change as models update.