LLM Gateway Comparison 2026: OpenRouter vs LiteLLM vs Portkey vs Vercel AI Gateway
An LLM gateway sits between your application and AI provider APIs. It provides a unified interface, handles routing, manages fallbacks, adds observability, and sometimes reduces costs. But not all gateways are built the same.
Here's the complete comparison of the four main options in 2026.
Why Use an LLM Gateway?
Without a gateway, your application calls a single provider directly:
# Direct provider call — no gateway
response = openai_client.chat.completions.create(
model="gpt-4o",
messages=[...]
)
Problems:
- No fallbacks: If OpenAI is down, your app is down
- No routing: You can't route cheap tasks to cheap models
- No observability: You can't see how much each feature costs
- No cost controls: No spending limits, no per-user quotas
- Provider lock-in: Switching providers requires code changes
A gateway solves all of these.
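The core mechanic behind most of these benefits is simple. Here is a minimal sketch, not any particular gateway's implementation, of the fallback behavior a gateway adds: try providers in order and return the first success. The provider functions are toy stand-ins for real API clients.

```python
from typing import Callable

def call_with_fallbacks(providers: list[Callable[[str], str]], prompt: str) -> str:
    """Try each provider in order; return the first successful response."""
    errors = []
    for provider in providers:
        try:
            return provider(prompt)
        except Exception as exc:  # a real gateway would only catch retryable errors
            errors.append(exc)
    raise RuntimeError(f"All {len(providers)} providers failed: {errors}")

# Toy providers standing in for real API clients
def flaky_primary(prompt: str) -> str:
    raise TimeoutError("provider down")

def backup(prompt: str) -> str:
    return f"answer to: {prompt}"

print(call_with_fallbacks([flaky_primary, backup], "Hello"))  # falls through to backup
```

Everything else a gateway does — routing, caching, quotas — hangs off this same interposition point between your code and the providers.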
Feature Comparison
| Feature | OpenRouter | LiteLLM | Portkey | Vercel AI Gateway |
|---|---|---|---|---|
| Unified API | ✓ | ✓ | ✓ | ✓ |
| Model routing | ✓ | ✓ | ✓ | ✓ |
| Automatic fallbacks | ✓ | ✓ | ✓ | ✓ |
| Load balancing | Limited | ✓ | ✓ | ✓ |
| Request caching | ✓ | ✓ | ✓ | ✓ |
| Observability/tracing | Basic | ✓ | ✓ | Basic |
| Cost analytics | ✓ | ✓ | ✓ | Limited |
| Self-hosting | No | ✓ | ✓ | No |
| Rate limiting | No | ✓ | ✓ | ✓ |
| Evals integration | No | Limited | ✓ | No |
| Prompt management | No | No | ✓ | No |
| TypeScript SDK | ✓ | Limited | ✓ | ✓ |
OpenRouter
What It Is
OpenRouter is a managed gateway that provides access to 100+ models from all major providers through a single OpenAI-compatible API. You pay OpenRouter directly; they handle the provider relationships.
Pricing
- No subscription fee
- Small markup on top of provider prices (typically 5-15%)
- Free tier with rate limits
- Many popular models are passed through at provider list pricing with no markup
Code Example
from openai import OpenAI
client = OpenAI(
api_key="sk-or-your-openrouter-key",
base_url="https://openrouter.ai/api/v1"
)
# Use any model with the same API
response = client.chat.completions.create(
model="anthropic/claude-sonnet-4-5", # or openai/gpt-4o, google/gemini-2.5-pro, etc.
messages=[{"role": "user", "content": "Hello"}],
extra_headers={
"HTTP-Referer": "https://yourapp.com", # Required by OpenRouter
"X-Title": "Your App Name"
}
)
import OpenAI from 'openai';
const client = new OpenAI({
apiKey: process.env.OPENROUTER_API_KEY,
baseURL: 'https://openrouter.ai/api/v1',
});
// Let OpenRouter pick the model
const response = await client.chat.completions.create({
  model: 'openrouter/auto', // OpenRouter routes to a suitable model for the request
messages: [{ role: 'user', content: 'Classify this text: ...' }],
});
Automatic Fallbacks in OpenRouter
# OpenRouter handles fallbacks automatically
# Specify fallback models in order
response = client.chat.completions.create(
model="anthropic/claude-sonnet-4-5",
messages=[...],
extra_body={
"models": [ # Fallback chain
"anthropic/claude-sonnet-4-5",
"openai/gpt-4o",
"google/gemini-2.5-pro"
]
}
)
Best For
- Prototypes and side projects that want model flexibility
- Teams that don't want to manage infrastructure
- Applications that want to compare multiple models
- Access to niche or new models quickly
Limitations
- No self-hosting
- Limited observability
- Markup on provider prices
- No rate limiting or user quotas
LiteLLM
What It Is
LiteLLM is an open-source proxy that you deploy yourself. It provides a single OpenAI-compatible API that forwards to any provider. Because it's self-hosted, you pay providers directly with no markup.
Pricing
- Open-source: Free
- LiteLLM Enterprise: ~$1,000/month for advanced features
- You pay providers directly — no gateway markup
Setup
# litellm/config.yaml
model_list:
- model_name: gpt-4o
litellm_params:
model: openai/gpt-4o
api_key: os.environ/OPENAI_API_KEY
- model_name: claude-sonnet
litellm_params:
model: anthropic/claude-sonnet-4-5
api_key: os.environ/ANTHROPIC_API_KEY
- model_name: gemini-pro
litellm_params:
model: gemini/gemini-2.5-pro
api_key: os.environ/GEMINI_API_KEY
# Router configuration
router_settings:
routing_strategy: least-busy # or simple-shuffle, latency-based-routing, cost-based-routing
fallbacks:
- {"gpt-4o": ["claude-sonnet", "gemini-pro"]}
litellm_settings:
cache: true
cache_params:
type: redis
host: localhost
port: 6379
# Deploy
docker run -p 4000:4000 \
  -v $(pwd)/config.yaml:/app/config.yaml \
  -e OPENAI_API_KEY=$OPENAI_API_KEY \
  -e ANTHROPIC_API_KEY=$ANTHROPIC_API_KEY \
  ghcr.io/berriai/litellm:latest \
  --config /app/config.yaml
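The `cache: true` setting above enables exact-match request caching: identical requests are served from Redis instead of hitting the provider again. A toy sketch of the idea — hypothetical helper names, not LiteLLM's internals:

```python
import hashlib
import json

_cache: dict[str, str] = {}  # stands in for the Redis store

def cache_key(model: str, messages: list[dict]) -> str:
    # Deterministic key over the parameters that affect the completion
    payload = json.dumps({"model": model, "messages": messages}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

def cached_completion(model: str, messages: list[dict], call_provider) -> str:
    key = cache_key(model, messages)
    if key not in _cache:
        _cache[key] = call_provider(model, messages)
    return _cache[key]
```

Note that any parameter that changes the output (temperature, tools, and so on) must be part of the key, which is why exact-match caches only help for genuinely repeated requests.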
Code Example
# Point to your LiteLLM proxy
client = OpenAI(
api_key="sk-1234", # Virtual key
base_url="http://localhost:4000"
)
response = client.chat.completions.create(
model="gpt-4o", # Maps to your configured model
messages=[{"role": "user", "content": "Hello"}]
)
Load Balancing
# Multiple deployments of the same model
model_list:
- model_name: gpt-4o
litellm_params:
model: openai/gpt-4o
api_key: os.environ/OPENAI_API_KEY_1
rpm: 500
- model_name: gpt-4o
litellm_params:
model: openai/gpt-4o
api_key: os.environ/OPENAI_API_KEY_2 # Second account for higher limits
rpm: 500
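Conceptually, least-busy routing over a pool like this just skips deployments at their rate limit and picks the one with the fewest in-flight requests. A toy version, not LiteLLM's actual router:

```python
from dataclasses import dataclass

@dataclass
class Deployment:
    name: str
    rpm_limit: int
    in_flight: int = 0

def pick_least_busy(deployments: list[Deployment]) -> Deployment:
    # Skip deployments at their limit, then choose the least loaded one
    available = [d for d in deployments if d.in_flight < d.rpm_limit]
    if not available:
        raise RuntimeError("all deployments saturated")
    return min(available, key=lambda d: d.in_flight)

pool = [Deployment("key-1", rpm_limit=500, in_flight=120),
        Deployment("key-2", rpm_limit=500, in_flight=40)]
```

Here `pick_least_busy(pool)` selects `key-2`, the deployment with less load. This is how a second provider account effectively doubles your throughput ceiling.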
Best For
- Teams with significant API volume who want to avoid gateway markups
- Enterprise teams needing data to stay in their infrastructure
- Complex load balancing across multiple provider accounts
- Teams already running their own infrastructure
Limitations
- Requires infrastructure to deploy and maintain
- Less polished UI than commercial alternatives
- Enterprise features cost $1,000/month
Portkey
What It Is
Portkey is a managed LLM gateway with the most complete feature set: routing, fallbacks, caching, observability, prompt management, and evals — all in one platform.
Pricing
- Developer: Free (10,000 requests/month)
- Growth: $49/month (unlimited requests + advanced features)
- Enterprise: Custom
Code Example
from portkey_ai import Portkey
portkey = Portkey(
api_key="pk-your-portkey-key",
virtual_key="pk-openai-xxx" # Your configured OpenAI key in Portkey
)
response = portkey.chat.completions.create(
model="gpt-4o",
messages=[{"role": "user", "content": "Hello"}]
)
Advanced Routing with Portkey Configs
from portkey_ai import Portkey
# Configs are plain dicts in the Python SDK; this one defines a fallback chain
config = {
    "retry": {"attempts": 3},
    "strategy": {"mode": "fallback"},
    "targets": [
        {"virtual_key": "pk-anthropic-xxx", "override_params": {"model": "claude-sonnet-4-5"}},
        {"virtual_key": "pk-openai-xxx", "override_params": {"model": "gpt-4o"}},
        {"virtual_key": "pk-google-xxx", "override_params": {"model": "gemini-2.5-pro"}}
    ]
}
portkey = Portkey(
api_key="pk-your-portkey-key",
config=config
)
response = portkey.chat.completions.create(
messages=[{"role": "user", "content": "Hello"}]
)
TypeScript
import Portkey from 'portkey-ai';
const portkey = new Portkey({
apiKey: process.env.PORTKEY_API_KEY,
virtualKey: process.env.PORTKEY_VIRTUAL_KEY,
config: {
cache: { mode: 'semantic', maxAge: 3600 }, // Semantic cache: reuses responses for similar prompts
retry: { attempts: 3 },
strategy: { mode: 'loadbalance' },
targets: [
{ virtualKey: 'pk-openai-xxx', weight: 60 },
{ virtualKey: 'pk-anthropic-xxx', weight: 40 }
]
}
});
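The 60/40 `loadbalance` strategy above amounts to a weighted random choice over targets. Sketched in Python for illustration:

```python
import random

def pick_target(targets: list[tuple[str, int]], rng: random.Random) -> str:
    """targets: (name, weight) pairs; higher weight gets more traffic."""
    names = [name for name, _ in targets]
    weights = [weight for _, weight in targets]
    return rng.choices(names, weights=weights, k=1)[0]

targets = [("openai", 60), ("anthropic", 40)]
rng = random.Random(0)  # seeded for reproducibility
picks = [pick_target(targets, rng) for _ in range(1000)]
# Roughly 60% of picks go to "openai", 40% to "anthropic"
```

Weighted routing like this is useful for gradual migrations: shift weight toward a new provider while watching quality and latency.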
Best For
- Teams that want the most complete feature set without self-hosting
- Applications needing prompt version management
- Teams doing active LLM evaluation
- Production apps that need semantic caching
Vercel AI Gateway
What It Is
Vercel AI Gateway is built into the Vercel platform and integrates directly with the Vercel AI SDK. If you're deploying on Vercel, it's the zero-setup option.
Setup
// In a Next.js app on Vercel — no extra setup needed
import { generateText } from 'ai';
import { gateway } from '@ai-sdk/gateway';
export async function POST(req: Request) {
  const { prompt } = await req.json();
  const { text } = await generateText({
    // The gateway resolves 'creator/model' strings and can fail over
    // between providers serving the same model
    model: gateway('anthropic/claude-sonnet-4-5'),
    prompt,
  });
  return Response.json({ text });
}
Best For
- Next.js apps deployed on Vercel
- Teams already using the Vercel AI SDK
- Simple gateway needs without self-hosting complexity
Limitations
- Tied to Vercel platform
- Less feature-complete than LiteLLM or Portkey
- Limited observability
The Decision Matrix
Pick OpenRouter if: You're a solo developer or small team who wants model flexibility without infrastructure.
Pick LiteLLM if: You have significant volume, need data in your own infrastructure, or want to avoid per-request markups.
Pick Portkey if: You want a managed solution with complete features — caching, routing, observability, evals — and don't want to run infrastructure.
Pick Vercel AI Gateway if: You're on Vercel and want zero-config gateway integration.
For most production teams: LiteLLM for the proxy layer + Langfuse or Braintrust for observability is the most capable combination. For smaller teams, Portkey provides both in one package.
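One way to sanity-check the LiteLLM-versus-managed tradeoff is break-even arithmetic: a percentage markup scales with spend, while self-hosting costs are roughly fixed. The numbers below are illustrative assumptions, not quotes:

```python
def breakeven_spend(markup_rate: float, monthly_hosting_cost: float) -> float:
    """Monthly provider spend above which a % markup costs more than self-hosting."""
    return monthly_hosting_cost / markup_rate

# At a 10% markup and ~$500/month to run and maintain a proxy (illustrative),
# the markup overtakes self-hosting once spend passes $5,000/month.
print(breakeven_spend(0.10, 500.0))  # 5000.0
```

Below that spend level, a managed gateway's markup is cheaper than the engineering time a self-hosted proxy consumes; above it, the math flips.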