Safety & Alignment
Prompt Injection Attack
Quick Answer
A security attack where malicious input overrides model instructions, causing unintended behavior.
Prompt injection attacks insert malicious instructions in user input. Example: 'Ignore previous instructions and do X'. Injection attacks can cause models to bypass safety. Defenses: clear delimiters, parameterization, sandboxing. Injection attacks are particularly dangerous with untrusted input. Careful prompt design helps defense. Injection attacks are a major security concern. Injection attack prevention requires vigilance.
Last verified: 2026-04-08