Pricing & Cost
Prompt Caching
Quick Answer
An API feature enabling reuse of previously processed prompt tokens at lower cost.
Prompt caching stores the processed (encoded) form of a prompt so it can be reused. When the same long context is sent repeatedly with different queries, the provider serves the shared prefix from cache instead of re-processing it. Cached tokens are typically billed at 10-25% of the standard input price, and providers generally require a minimum cacheable prefix (often around 1K tokens); shorter prefixes are not cached. Cache hits reduce both cost and latency, which makes caching especially valuable for RAG pipelines, long-document Q&A, and multi-turn conversations. Using it requires API support: some providers cache matching prefixes automatically, while others need an explicit cache marker in the request. Prompt caching is increasingly standard across major APIs.
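As an illustration of the explicit-marker style, the sketch below builds a request whose long system context is flagged as cacheable. It assumes the `cache_control` field from Anthropic's Messages API; the model name and document strings are placeholders, and other providers (e.g. OpenAI) instead cache automatically on exact prefix matches.

```python
def build_request(system_doc: str, question: str) -> dict:
    """Build a request whose long system context is marked cacheable.

    The cacheable prefix must meet the provider's minimum size
    (often ~1K tokens), or the cache marker is simply ignored.
    """
    return {
        "model": "claude-sonnet-4",  # placeholder model name
        "max_tokens": 512,
        "system": [
            {
                "type": "text",
                "text": system_doc,  # long document, identical across calls
                # Cache breakpoint: everything up to here is reusable.
                "cache_control": {"type": "ephemeral"},
            }
        ],
        "messages": [{"role": "user", "content": question}],
    }


# Two requests share the same cacheable prefix; only the question differs,
# so the second call can be served a cache hit on the system document.
doc = "...long reference document..."
req_a = build_request(doc, "Summarize section 2.")
req_b = build_request(doc, "List the key terms.")
assert req_a["system"] == req_b["system"]  # identical prefix enables reuse
```

The key design point is that only an unchanged prefix is reusable: anything that varies between calls (the user question here) should come after the cache marker, never before it.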
Last verified: 2026-04-08