Safety & Alignment
Honesty
Quick Answer
A core alignment principle: models should provide truthful information and acknowledge uncertainty.
Honesty means truthful, accurate responses. Honest models acknowledge uncertainty ('I don't know'). Honesty prevents hallucination and fabrication. Honesty is learned through training. Honesty sometimes conflicts with being helpful. Measuring honesty is challenging. Honesty is fundamental to trustworthiness. Honesty is a key alignment goal.
Last verified: 2026-04-08