Training
Constitutional AI
Quick Answer
A training approach that uses an explicit set of principles to guide model behavior, reducing the need for extensive human feedback.
Constitutional AI (CAI) trains models to follow a set of explicit written principles (a "constitution") rather than relying solely on human preference judgments. Given principles such as "Prioritize safety" and "Be helpful", the model critiques and revises its own outputs against them, and those revisions supply much of the training signal. This reduces dependence on large-scale human feedback, and because the principles are stated in plain language, the approach is more interpretable than a black-box reward model. CAI can be combined with standard RLHF. Anthropic pioneered this approach, and Constitutional AI is one approach to scalable alignment.
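The critique-revision loop described above can be sketched as follows. This is a minimal illustration, not Anthropic's implementation: `call_model` is a hypothetical stand-in for a real language-model API, stubbed here with canned responses so the control flow runs end to end.

```python
# Sketch of Constitutional AI's supervised critique-revision loop.
# Assumption: `call_model` is a placeholder for a real LLM call; it is
# stubbed with fixed strings purely to make the control flow runnable.

PRINCIPLES = [
    "Prioritize safety",
    "Be helpful",
]

def call_model(prompt: str) -> str:
    """Stub standing in for a language-model API (hypothetical)."""
    if prompt.startswith("Critique"):
        return "The draft does not fully follow the principle."
    if prompt.startswith("Revise"):
        return "Here is a revised answer that follows the principle."
    return "Initial draft answer."

def critique_revise(question: str, rounds: int = 1) -> str:
    """Draft an answer, then critique and revise it against each principle."""
    draft = call_model(question)
    for _ in range(rounds):
        for principle in PRINCIPLES:
            # Ask the model to critique its own draft against one principle.
            critique = call_model(
                f"Critique this response against '{principle}':\n{draft}"
            )
            # Ask the model to revise the draft to address the critique.
            draft = call_model(
                f"Revise the response given this critique:\n{critique}\n{draft}"
            )
    return draft

print(critique_revise("How do I stay safe online?"))
```

In the full method, the revised outputs form a dataset for supervised fine-tuning, and a later reinforcement-learning stage replaces most human preference labels with AI-generated ones based on the same constitution.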
Last verified: 2026-04-08