
LLM Gateway Comparison 2026: OpenRouter vs LiteLLM vs Portkey vs Vercel AI Gateway

An LLM gateway sits between your application and AI provider APIs. It provides a unified interface, handles routing, manages fallbacks, adds observability, and sometimes reduces costs. But not all gateways are built the same.

Here's the complete comparison of the four main options in 2026.

Why Use an LLM Gateway?

Without a gateway, your application calls a single provider directly:

# Direct provider call — no gateway
response = openai_client.chat.completions.create(
    model="gpt-4o",
    messages=[...]
)

Problems:

  • No fallbacks: If OpenAI is down, your app is down
  • No routing: You can't route cheap tasks to cheap models
  • No observability: You can't see how much each feature costs
  • No cost controls: No spending limits, no per-user quotas
  • Provider lock-in: Switching providers requires code changes

A gateway solves all of these.
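To see what a gateway saves you, here is a minimal sketch (hypothetical helper, not from any library) of the fallback logic you would otherwise hand-roll around every provider call:

```python
# Sketch of the hand-rolled fallback logic a gateway replaces.
# `providers` is a list of (name, callable) pairs, tried in order.
def call_with_fallbacks(providers, prompt):
    errors = []
    for name, call in providers:
        try:
            return call(prompt)
        except Exception as exc:  # real code would catch provider-specific errors
            errors.append((name, str(exc)))
    raise RuntimeError(f"All providers failed: {errors}")
```

A gateway moves this retry/fallback logic, plus logging, quotas, and routing, out of application code and into configuration.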

Feature Comparison

Feature                OpenRouter   LiteLLM   Portkey   Vercel AI Gateway
Unified API            Yes          Yes       Yes       Yes
Model routing          Yes          Yes       Yes       Yes
Automatic fallbacks    Yes          Yes       Yes       Yes
Load balancing         Yes          Yes       Yes       Limited
Request caching        Yes          Yes       Yes       Yes
Observability/tracing  Basic        Yes       Yes       Basic
Cost analytics         Yes          Yes       Yes       Limited
Self-hosting           No           Yes       Yes       No
Rate limiting          No           Yes       Yes       Yes
Evals integration      No           Limited   Yes       No
Prompt management      No           No        Yes       No
TypeScript SDK         Yes          Limited   Yes       Yes

OpenRouter

What It Is

OpenRouter is a managed gateway that provides access to 100+ models from all major providers through a single OpenAI-compatible API. You pay OpenRouter directly; they handle the provider relationships.

Pricing

  • No subscription fee
  • Small markup on top of provider prices (typically 5-15%)
  • Free tier with rate limits
  • Many popular models available at exact provider pricing

Code Example

from openai import OpenAI

client = OpenAI(
    api_key="sk-or-your-openrouter-key",
    base_url="https://openrouter.ai/api/v1"
)

# Use any model with the same API
response = client.chat.completions.create(
    model="anthropic/claude-sonnet-4-5",  # or openai/gpt-4o, google/gemini-2.5-pro, etc.
    messages=[{"role": "user", "content": "Hello"}],
    extra_headers={
        "HTTP-Referer": "https://yourapp.com",  # Optional: identifies your app on OpenRouter
        "X-Title": "Your App Name"
    }
)

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: process.env.OPENROUTER_API_KEY,
  baseURL: 'https://openrouter.ai/api/v1',
});

// Let OpenRouter choose the model
const response = await client.chat.completions.create({
  model: 'openrouter/auto',  // Auto-router picks a suitable model for the prompt
  messages: [{ role: 'user', content: 'Classify this text: ...' }],
});

Automatic Fallbacks in OpenRouter

# OpenRouter handles fallbacks automatically
# Specify fallback models in order
response = client.chat.completions.create(
    model="anthropic/claude-sonnet-4-5",
    messages=[...],
    extra_body={
        "models": [  # Fallback chain
            "anthropic/claude-sonnet-4-5",
            "openai/gpt-4o",
            "google/gemini-2.5-pro"
        ]
    }
)

Best For

  • Prototypes and side projects that want model flexibility
  • Teams that don't want to manage infrastructure
  • Applications that want to compare multiple models
  • Access to niche or new models quickly

Limitations

  • No self-hosting
  • Limited observability
  • Markup on provider prices
  • No rate limiting or user quotas

LiteLLM

What It Is

LiteLLM is an open-source proxy that you deploy yourself. It provides a single OpenAI-compatible API that forwards to any provider. Because it's self-hosted, you pay providers directly with no markup.

Pricing

  • Open-source: Free
  • LiteLLM Enterprise: ~$1,000/month for advanced features
  • You pay providers directly — no gateway markup

Setup

# litellm/config.yaml
model_list:
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY
      
  - model_name: claude-sonnet
    litellm_params:
      model: anthropic/claude-sonnet-4-5
      api_key: os.environ/ANTHROPIC_API_KEY
      
  - model_name: gemini-pro
    litellm_params:
      model: gemini/gemini-2.5-pro
      api_key: os.environ/GEMINI_API_KEY

# Router configuration
router_settings:
  routing_strategy: least-busy  # or latency-based-routing, cost-based-routing
  fallbacks:
    - {"gpt-4o": ["claude-sonnet", "gemini-pro"]}
  
litellm_settings:
  cache: true
  cache_params:
    type: redis
    host: localhost
    port: 6379

# Deploy
docker run -p 4000:4000 \
  -v $(pwd)/config.yaml:/app/config.yaml \
  -e OPENAI_API_KEY=$OPENAI_API_KEY \
  -e ANTHROPIC_API_KEY=$ANTHROPIC_API_KEY \
  ghcr.io/berriai/litellm:main-latest \
  --config /app/config.yaml

Code Example

from openai import OpenAI

# Point to your LiteLLM proxy
client = OpenAI(
    api_key="sk-1234",  # Virtual key issued by the proxy
    base_url="http://localhost:4000"
)

response = client.chat.completions.create(
    model="gpt-4o",  # Maps to your configured model
    messages=[{"role": "user", "content": "Hello"}]
)
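The virtual key above is issued by the proxy itself. Assuming you have configured a master key, one way to mint a scoped key is the proxy's /key/generate endpoint (a sketch; check your LiteLLM version for the exact fields):

```shell
# Mint a virtual key limited to one model, with a spend budget in USD
curl -X POST http://localhost:4000/key/generate \
  -H "Authorization: Bearer $LITELLM_MASTER_KEY" \
  -H "Content-Type: application/json" \
  -d '{"models": ["gpt-4o"], "max_budget": 25}'
```

This is how per-user and per-team quotas are enforced: each consumer gets its own virtual key with its own budget and model allowlist.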

Load Balancing

# Multiple deployments of the same model
model_list:
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY_1
      rpm: 500
      
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY_2  # Second account for higher limits
      rpm: 500

Best For

  • Teams with significant API volume who want to avoid gateway markups
  • Enterprise teams needing data to stay in their infrastructure
  • Complex load balancing across multiple provider accounts
  • Teams already running their own infrastructure

Limitations

  • Requires infrastructure to deploy and maintain
  • Less polished UI than commercial alternatives
  • Enterprise features cost $1,000/month

Portkey

What It Is

Portkey is a managed LLM gateway with the most complete feature set: routing, fallbacks, caching, observability, prompt management, and evals — all in one platform.

Pricing

  • Developer: Free (10,000 requests/month)
  • Growth: $49/month (unlimited requests + advanced features)
  • Enterprise: Custom

Code Example

from portkey_ai import Portkey

portkey = Portkey(
    api_key="pk-your-portkey-key",
    virtual_key="pk-openai-xxx"  # Your configured OpenAI key in Portkey
)

response = portkey.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}]
)

Advanced Routing with Portkey Configs

from portkey_ai import Portkey

# Config with retries and a fallback chain
config = {
    "retry": {"attempts": 3},
    "strategy": {"mode": "fallback"},
    "targets": [
        {"virtual_key": "pk-anthropic-xxx", "override_params": {"model": "claude-sonnet-4-5"}},
        {"virtual_key": "pk-openai-xxx", "override_params": {"model": "gpt-4o"}},
        {"virtual_key": "pk-google-xxx", "override_params": {"model": "gemini-2.5-pro"}}
    ]
}

portkey = Portkey(
    api_key="pk-your-portkey-key",
    config=config
)

response = portkey.chat.completions.create(
    messages=[{"role": "user", "content": "Hello"}]
)

TypeScript

import Portkey from 'portkey-ai';

const portkey = new Portkey({
  apiKey: process.env.PORTKEY_API_KEY,
  virtualKey: process.env.PORTKEY_VIRTUAL_KEY,
  config: {
    cache: { mode: 'semantic', maxAge: 3600 },  // Semantic caching!
    retry: { attempts: 3 },
    strategy: { mode: 'loadbalance' },
    targets: [
      { virtualKey: 'pk-openai-xxx', weight: 60 },
      { virtualKey: 'pk-anthropic-xxx', weight: 40 }
    ]
  }
});

Best For

  • Teams that want the most complete feature set without self-hosting
  • Applications needing prompt version management
  • Teams doing active LLM evaluation
  • Production apps that need semantic caching

Vercel AI Gateway

What It Is

Vercel AI Gateway is built into the Vercel platform and integrates directly with the Vercel AI SDK. If you're deploying on Vercel, it's the zero-setup option.

Setup

// In a Next.js route handler on Vercel, no extra setup needed
import { generateText } from 'ai';
import { gateway } from '@ai-sdk/gateway';

export async function POST(req: Request) {
  const { prompt } = await req.json();

  const { text } = await generateText({
    model: gateway('anthropic/claude-sonnet-4-5'),
    prompt,
    providerOptions: {
      // Preferred provider order; the gateway tries the next one if a provider is down
      gateway: { order: ['anthropic', 'openai'] },
    },
  });

  return Response.json({ text });
}

Best For

  • Next.js apps deployed on Vercel
  • Teams already using the Vercel AI SDK
  • Simple gateway needs without self-hosting complexity

Limitations

  • Tied to Vercel platform
  • Less feature-complete than LiteLLM or Portkey
  • Limited observability

The Decision Matrix

Pick OpenRouter if: You're a solo developer or small team who wants model flexibility without infrastructure.

Pick LiteLLM if: You have significant volume, need data in your own infrastructure, or want to avoid per-request markups.

Pick Portkey if: You want a managed solution with complete features — caching, routing, observability, evals — and don't want to run infrastructure.

Pick Vercel AI Gateway if: You're on Vercel and want zero-config gateway integration.

For most production teams: LiteLLM for the proxy layer + Langfuse or Braintrust for observability is the most capable combination. For smaller teams, Portkey provides both in one package.
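The LiteLLM + Langfuse pairing from the recommendation above can be wired together in the proxy config. A sketch, assuming Langfuse credentials are available in the environment:

```yaml
# litellm/config.yaml: send traces to Langfuse for every call
litellm_settings:
  success_callback: ["langfuse"]
  failure_callback: ["langfuse"]

environment_variables:
  LANGFUSE_PUBLIC_KEY: os.environ/LANGFUSE_PUBLIC_KEY
  LANGFUSE_SECRET_KEY: os.environ/LANGFUSE_SECRET_KEY
```

With this in place, every request through the proxy is logged with model, latency, and cost, without any changes to application code.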
