Fundamentals
Top-K Sampling
Quick Answer
A sampling method that selects from the K most likely next tokens.
Top-K sampling restricts the model to the K most probable next tokens at each decoding step. With K=40, for example, the model keeps the 40 highest-probability tokens, discards the rest, renormalizes the remaining probabilities, and samples from them. This prevents the model from occasionally generating nonsensical text drawn from the low-probability tail while preserving diversity when several tokens are plausible. Top-K is simpler than nucleus (top-p) sampling but less adaptive: K stays fixed whether the distribution is sharply peaked or nearly flat, whereas top-p adjusts the number of candidate tokens to the shape of the distribution. Many practitioners use top-p exclusively, though top-K can still be useful for specific applications.
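The filtering step can be sketched in a few lines of Python. This is a minimal illustration using NumPy, not the implementation of any particular library; the function name `top_k_sample` and its parameters are hypothetical, and it assumes the model's output arrives as a 1-D array of raw logits.

```python
import numpy as np

def top_k_sample(logits, k=40, rng=None):
    """Sample one token id from the k highest-scoring entries of `logits`.

    Hypothetical helper for illustration; assumes `logits` is a 1-D
    sequence of unnormalized scores, one per vocabulary entry.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if rng is None:
        rng = np.random.default_rng()

    logits = np.asarray(logits, dtype=np.float64)
    k = min(k, logits.size)  # cannot keep more tokens than exist

    # Indices of the k largest logits (order within the top k is arbitrary).
    top_idx = np.argpartition(logits, -k)[-k:]
    top_logits = logits[top_idx]

    # Softmax over the surviving tokens only, which renormalizes
    # their probabilities so they sum to 1.
    probs = np.exp(top_logits - top_logits.max())
    probs /= probs.sum()

    # Draw one token id according to the renormalized probabilities.
    return int(rng.choice(top_idx, p=probs))
```

With `k=1` this sketch reduces to greedy decoding, since only the single most likely token survives the filter; with `k` equal to the vocabulary size it is ordinary sampling from the full distribution.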
Last verified: 2026-04-08