Architecture
Positional Encoding
Quick Answer
A technique for injecting token-position information into transformer embeddings, giving the otherwise order-agnostic attention mechanism a sense of sequence order.
Transformers process all tokens in parallel, so self-attention has no inherent awareness of order. Positional encoding injects position information into the embeddings so the model can distinguish token order. Absolute positional encoding assigns each position a fixed embedding, as in the sinusoidal scheme from the original Transformer paper (sketched below). Relative positional encoding instead models the offsets between positions. Rotary position embeddings (RoPE) rotate query and key vectors by position-dependent angles, which makes attention scores depend only on relative position and tends to generalize better to longer sequences (see the second sketch below).

The choice of positional encoding affects how well a model handles long contexts and generalizes to lengths unseen during training; recent approaches focus on improving this extrapolation.
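As a concrete illustration of the absolute case, here is a minimal NumPy sketch of the sinusoidal encoding from "Attention Is All You Need". The function name `sinusoidal_encoding` is illustrative, not from any library, and the sketch assumes an even `d_model`.

```python
import numpy as np

def sinusoidal_encoding(seq_len: int, d_model: int) -> np.ndarray:
    """Return a (seq_len, d_model) matrix of fixed position embeddings.

    Assumes d_model is even: even dims hold sines, odd dims hold cosines.
    """
    positions = np.arange(seq_len)[:, None]           # (seq_len, 1)
    dims = np.arange(0, d_model, 2)[None, :]          # (1, d_model/2)
    angles = positions / (10000 ** (dims / d_model))  # (seq_len, d_model/2)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)  # even indices: sine
    pe[:, 1::2] = np.cos(angles)  # odd indices: cosine
    return pe

# Typical usage: added to token embeddings before the first layer, e.g.
# x = token_embeddings + sinusoidal_encoding(seq_len, d_model)
```

Because each frequency is a geometric progression of the position, nearby positions get similar encodings while distant ones stay distinguishable, and no parameters need to be learned.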
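For the rotary case, here is a minimal sketch of RoPE under the interleaved-pair convention; `apply_rope` is an illustrative name, and real implementations vary in pairing and caching details.

```python
import numpy as np

def apply_rope(x: np.ndarray, positions: np.ndarray) -> np.ndarray:
    """Rotate adjacent dimension pairs of x (seq_len, d) by position-dependent angles.

    Assumes d is even. Applied to queries and keys (not to the embeddings),
    so the dot product q·k depends only on the relative position offset.
    """
    d = x.shape[-1]
    freqs = 1.0 / (10000 ** (np.arange(0, d, 2) / d))  # (d/2,) rotation frequencies
    angles = positions[:, None] * freqs[None, :]       # (seq_len, d/2)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, 0::2], x[:, 1::2]                    # split into dimension pairs
    out = np.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin                 # 2-D rotation of each pair
    out[:, 1::2] = x1 * sin + x2 * cos
    return out
```

Unlike the additive sinusoidal scheme, RoPE is applied inside every attention layer to the query and key projections, which is what lets relative position survive into the attention scores.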
Last verified: 2026-04-08