Voyage AI Review 2026: Best Embedding Models for RAG and Code Search

Voyage AI has established itself as the specialist embedding provider in 2026, with domain-specific models for code, finance, law, and multilingual content that consistently outperform general-purpose embeddings from OpenAI and Cohere on relevant benchmarks. This review covers whether the performance premium justifies the cost.

What is Voyage AI?

Voyage AI is a company focused exclusively on embedding models; it offers no generative models. Its product line includes:

  • voyage-3-large: flagship general-purpose embedding model (1024 dimensions)
  • voyage-3: balanced performance/cost general model (1024 dimensions)
  • voyage-3-lite: fastest, cheapest option (512 dimensions)
  • voyage-code-3: specialized for code retrieval
  • voyage-finance-2: specialized for financial documents
  • voyage-law-2: specialized for legal documents
  • voyage-multilingual-2: 100+ language support
  • voyage-multimodal-3: text + image embeddings

Benchmarks vs Competitors

On MTEB (Massive Text Embedding Benchmark), as of April 2026:

General retrieval (BEIR benchmark):

| Model                           | NDCG@10 | Dimensions | Cost per 1M tokens  |
|---------------------------------|---------|------------|---------------------|
| voyage-3-large                  | 57.1    | 1024       | $0.18               |
| voyage-3                        | 55.2    | 1024       | $0.06               |
| embed-english-v3.0 (Cohere)     | 55.0    | 1024       | $0.10               |
| text-embedding-3-large (OpenAI) | 54.9    | 3072       | $0.13               |
| BGE-M3 (open-source)            | 54.2    | 1024       | Free (self-hosted)  |
| voyage-3-lite                   | 51.8    | 512        | $0.02               |

Code retrieval (CodeSearchNet + BEIR code tasks):

| Model                  | NDCG@10 | Cost per 1M tokens |
|------------------------|---------|--------------------|
| voyage-code-3          | 71.4    | $0.18              |
| voyage-3-large         | 64.2    | $0.18              |
| text-embedding-3-large | 63.1    | $0.13              |
| embed-english-v3.0     | 61.8    | $0.10              |

Voyage-code-3's advantage on code retrieval is significant: +8.3 NDCG@10 over text-embedding-3-large. For code search, this is not a marginal improvement.

Financial document retrieval (FinanceBench):

| Model                  | Accuracy |
|------------------------|----------|
| voyage-finance-2       | 81.3%    |
| voyage-3-large         | 76.1%    |
| text-embedding-3-large | 74.2%    |

Again, the domain-specific model shows a meaningful advantage: +7.1 points over text-embedding-3-large.

Getting Started

pip install voyageai

import voyageai
import os

vo = voyageai.Client(api_key=os.environ["VOYAGE_API_KEY"])

# Single embedding
embedding = vo.embed(
    ["What is prompt caching in Anthropic's API?"],
    model="voyage-3",
    input_type="query"  # "query" or "document"
).embeddings[0]

# Batch embeddings
texts = ["Document 1 content", "Document 2 content", "Document 3 content"]
result = vo.embed(
    texts,
    model="voyage-3",
    input_type="document"
)
embeddings = result.embeddings  # List of 1024-dim vectors
print(f"Total tokens used: {result.total_tokens}")

Important: Voyage distinguishes between query and document embeddings. Use input_type="query" for search queries and input_type="document" for indexed content. This matters: using the wrong type degrades retrieval quality.

RAG Pipeline with Voyage

import os

import voyageai
import lancedb
from anthropic import Anthropic

vo = voyageai.Client(api_key=os.environ["VOYAGE_API_KEY"])
client = Anthropic()
db = lancedb.connect("/data/rag-db")

def index_documents(texts: list[str], sources: list[str]):
    """Embed and store documents."""
    result = vo.embed(texts, model="voyage-3", input_type="document")
    rows = [
        {"content": text, "vector": emb, "source": src}
        for text, emb, src in zip(texts, result.embeddings, sources)
    ]
    # Create the table on first use; open_table fails if it doesn't exist
    if "documents" not in db.table_names():
        db.create_table("documents", data=rows)
    else:
        db.open_table("documents").add(rows)

def retrieve(query: str, k: int = 5):
    """Retrieve relevant chunks."""
    result = vo.embed([query], model="voyage-3", input_type="query")
    query_vector = result.embeddings[0]
    
    table = db.open_table("documents")
    return table.search(query_vector).limit(k).to_pandas()

def answer(question: str) -> str:
    """Full RAG pipeline."""
    chunks = retrieve(question)
    context = "\n\n".join(
        f"[Source: {row['source']}]\n{row['content']}"
        for _, row in chunks.iterrows()
    )
    
    response = client.messages.create(
        model="claude-sonnet-4-5",
        max_tokens=1000,
        messages=[{
            "role": "user",
            "content": f"Context:\n{context}\n\nQuestion: {question}\n\nAnswer based only on the context provided."
        }]
    )
    return response.content[0].text
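The pipeline above takes pre-chunked texts as input. As a rough sketch of that missing step, a word-window chunker could look like this (the `chunk_text` helper and its defaults are mine, not part of Voyage's SDK; word counts only approximate tokens):

```python
def chunk_text(text: str, max_words: int = 350, overlap: int = 50) -> list[str]:
    """Split text into overlapping word-window chunks.

    Word counts only approximate token counts; a production system
    would chunk on tokens or semantic boundaries instead.
    """
    words = text.split()
    if len(words) <= max_words:
        return [" ".join(words)] if words else []
    chunks = []
    step = max_words - overlap
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break
    return chunks
```

The overlap keeps a sentence that straddles a chunk boundary retrievable from at least one chunk.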

Code Search with voyage-code-3

# voyage-code-3 is optimized for code retrieval
# It understands function signatures, docstrings, comments

codebase_snippets = [
    "def calculate_embedding(text: str) -> list[float]: ...",
    "class VectorStore:\n    def upsert(self, vectors): ...",
    "async def rate_limited_request(client, prompt): ...",
]

# Index code with document type
code_embeddings = vo.embed(
    codebase_snippets,
    model="voyage-code-3",
    input_type="document"
).embeddings

# Search code with query type
query_emb = vo.embed(
    ["function to handle rate limiting"],
    model="voyage-code-3",
    input_type="query"
).embeddings[0]

# voyage-code-3 understands the semantic relationship between
# "function to handle rate limiting" and "async def rate_limited_request"
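Once both sides are embedded, ranking is just a cosine-similarity computation over the vectors. A minimal numpy sketch; the 3-dimensional toy vectors below stand in for real 1024-dimensional voyage-code-3 embeddings:

```python
import numpy as np

def rank_by_cosine(query_emb: list[float], doc_embs: list[list[float]]) -> list[int]:
    """Return document indices sorted by cosine similarity to the query, best first."""
    q = np.asarray(query_emb, dtype=np.float32)
    d = np.asarray(doc_embs, dtype=np.float32)
    sims = (d @ q) / (np.linalg.norm(d, axis=1) * np.linalg.norm(q) + 1e-12)
    return np.argsort(-sims).tolist()

# Toy 3-dim stand-ins for real embeddings
query = [1.0, 0.0, 0.0]
docs = [[0.0, 1.0, 0.0], [0.9, 0.1, 0.0], [0.5, 0.5, 0.0]]
print(rank_by_cosine(query, docs))  # most similar first
```

In the code-search example above, `query_emb` would be `query_emb` from `vo.embed(..., input_type="query")` and `doc_embs` would be `code_embeddings`.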

Reranking

Voyage also offers a reranking model:

# After initial retrieval, rerank with voyage-rerank-2
result = vo.rerank(
    query="How does prompt caching work?",
    documents=[chunk.content for chunk in retrieved_chunks],
    model="voyage-rerank-2",
    top_k=3
)

for item in result.results:
    print(f"Relevance: {item.relevance_score:.3f}")
    print(retrieved_chunks[item.index].content[:200])

Pricing for voyage-rerank-2: $0.05 per 1K searches (1 search = 1 query + N documents)

Pricing Comparison

| Model               | Price per 1M tokens      | Dimensions | Best for                      |
|---------------------|--------------------------|------------|-------------------------------|
| voyage-3-lite       | $0.02                    | 512        | High-volume, cost-sensitive   |
| voyage-3            | $0.06                    | 1024       | General RAG (best value)      |
| voyage-3-large      | $0.18                    | 1024       | Max quality general retrieval |
| voyage-code-3       | $0.18                    | 1024       | Code search                   |
| voyage-finance-2    | $0.12                    | 1024       | Financial docs                |
| voyage-law-2        | $0.12                    | 1024       | Legal docs                    |
| voyage-multimodal-3 | $0.12 text / $0.18 image | 1024       | Mixed media                   |

For general RAG, voyage-3 at $0.06/1M is the sweet spot: it outperforms text-embedding-3-large on most benchmarks at roughly half the price.

Cost at Scale

For a 10M document corpus (averaging 512 tokens per chunk):

  • voyage-3-lite: 5B tokens × $0.02/1M = $100
  • voyage-3: 5B tokens × $0.06/1M = $300
  • text-embedding-3-large: 5B tokens × $0.13/1M = $650

Voyage-3 costs 54% less than text-embedding-3-large while outperforming it on most benchmarks.
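The same arithmetic can be wrapped in a tiny helper (the function name is mine). Note the figures above round 10M × 512 tokens down to 5B; the exact voyage-3 figure is $307.20 rather than $300:

```python
def embedding_cost(n_docs: int, avg_tokens: int, price_per_1m: float) -> float:
    """One-time cost in dollars to embed a corpus of n_docs chunks."""
    return n_docs * avg_tokens * price_per_1m / 1_000_000

# 10M chunks at ~512 tokens each, priced per million tokens
print(embedding_cost(10_000_000, 512, 0.06))   # voyage-3
print(embedding_cost(10_000_000, 512, 0.13))   # text-embedding-3-large
```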

Dimension Reduction

Voyage supports Matryoshka-style dimension reduction (native in voyage-3-large):

# Full 1024 dimensions
full_emb = vo.embed([text], model="voyage-3-large").embeddings[0]  # 1024 dims

# Truncated to 256 dimensions, still useful for coarse retrieval
# Useful for ANN first-pass retrieval, then rerank with full dimensions
truncated = full_emb[:256]
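One caveat the snippet above glosses over: a truncated vector is no longer unit-length, so renormalize it before computing cosine similarity. A minimal numpy sketch (the helper name is mine):

```python
import numpy as np

def truncate_and_normalize(emb: list[float], dims: int = 256) -> np.ndarray:
    """Keep the first `dims` components and rescale back to unit length."""
    v = np.asarray(emb[:dims], dtype=np.float32)
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v

vec = truncate_and_normalize([0.3] * 1024, dims=256)
print(vec.shape)
```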

When Voyage Beats OpenAI

Voyage wins:

  • Code retrieval: voyage-code-3 has a large, consistent advantage
  • Financial documents: voyage-finance-2 wins on FinanceBench
  • Legal documents: voyage-law-2 wins on LegalBench
  • Cost-efficiency for general RAG: voyage-3 beats text-embedding-3-large on both quality and price
  • Multilingual: voyage-multilingual-2 is competitive with OpenAI's multilingual embeddings

OpenAI wins:

  • Ecosystem integration: text-embedding-3 is native to OpenAI Assistants API, Batch API
  • Zero additional vendor: if you're already fully on OpenAI, adding Voyage adds a dependency
  • Very short texts: at sub-100 token texts, quality differences narrow

Limitations

  • Vendor lock-in: Voyage is a single-product company. Switching costs are real if they raise prices or get acquired.
  • Rate limits: Default limits are lower than OpenAI at equivalent pricing tiers. Plan for rate limit handling.
  • Batch API: No async batch API equivalent to OpenAI's Batch API for 50% discounts on large offline jobs.
  • SDKs: Python and JavaScript only. Go, Rust, Java users must use the HTTP API directly.
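For the rate-limit point above, a generic retry wrapper with jittered exponential backoff is usually enough. This sketch is mine, not part of the Voyage SDK; which exceptions count as retryable depends on the client you use:

```python
import random
import time

def with_backoff(fn, *, retries: int = 5, base_delay: float = 1.0,
                 retryable=(Exception,), sleep=time.sleep):
    """Call fn(), retrying on retryable errors with jittered exponential backoff."""
    for attempt in range(retries):
        try:
            return fn()
        except retryable:
            if attempt == retries - 1:
                raise  # out of retries, surface the error
            # 2^attempt growth, with 50-100% jitter to avoid thundering herds
            delay = base_delay * (2 ** attempt) * (0.5 + random.random() / 2)
            sleep(delay)

# Usage (hypothetical):
# result = with_backoff(lambda: vo.embed(texts, model="voyage-3", input_type="document"))
```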

Summary

Voyage AI offers the best domain-specific embedding models available in 2026. For code search, voyage-code-3 is the clear choice. For general RAG, voyage-3 beats text-embedding-3-large on both quality and cost. The domain-specific models (finance, law) are genuinely useful for those verticals.

If you're building a code search tool, an enterprise knowledge base in a regulated industry, or a cost-conscious high-volume RAG system, Voyage deserves a serious evaluation.

Methodology

All benchmarks, pricing, and performance figures cited in this article are sourced from publicly available data: provider pricing pages (verified 2026-04-16), the MTEB retrieval benchmark, and independent API tests. Costs are listed per million tokens unless noted. Rankings reflect the publication date and change as models update.
