Context Management in Claude Code: Stay Under the Limit Efficiently
Use '/compact' to summarize and compress conversation history when context fills up. Keep CLAUDE.md concise with @file references rather than inline content, and have Claude read only the files relevant to the task instead of whole directories.
Claude Code's context window is finite. As a session grows — with file reads, tool outputs, back-and-forth messages — you approach the limit and Claude starts losing early context. Understanding how context is built and how to manage it is the difference between productive hour-long sessions and frustrating cutoffs.
The context window in a Claude Code session contains: the system prompt (CLAUDE.md content, tool definitions, MCP server schemas), all previous messages and tool results, and any files Claude has read. Large files, verbose tool outputs, and long conversations are the main culprits when you hit limits.
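To gauge what a file read will cost before it happens, you can estimate tokens from character count. A common rule of thumb is roughly 4 characters per token for English text and code; this is a heuristic assumption, not an official Anthropic count, but it is good enough to flag files that will eat a large slice of the window:

```shell
#!/bin/sh
# Estimate the context cost of reading a file, using the rough
# heuristic of ~4 characters per token (an assumption, not exact).
estimate_tokens() {
  chars=$(wc -c < "$1")
  echo $(( chars / 4 ))
}

# Example: an 8000-character file costs roughly 2000 tokens to read.
head -c 8000 /dev/zero | tr '\0' 'A' > /tmp/sample.txt
estimate_tokens /tmp/sample.txt
```

Running the estimate over a directory before a session starts makes it obvious which files should be summarized or filtered rather than read whole.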
The '/compact' command is your primary tool. It asks Claude to summarize the conversation so far into a compact representation, replacing the full history with the summary. You lose some detail but gain significant context headroom. Use it proactively when you notice responses getting slower or when you have completed a distinct phase of work.
For session setup, keep your CLAUDE.md lean by using @file references to import content from separate files on demand, rather than inlining everything. This means the context only grows when Claude explicitly reads those files. Similarly, when asking Claude to analyze a large codebase, direct it to read specific files rather than letting it read everything in a directory.
Examples

```shell
# In the Claude Code interactive session, type:
/compact

# Or with a custom summary instruction:
/compact Focus on the current task: implementing the auth system. Preserve all decisions made about the JWT token structure and the middleware approach.
```

A lean CLAUDE.md that imports detail on demand:

```markdown
# Project: MyApp

## Overview
Next.js 15 SaaS app. Supabase database. Vercel deployment.

## Coding standards
@.claude/standards.md

## Database schema
@.claude/schema.sql

## Current sprint tasks
@.claude/sprint.md

## Architecture decisions
@.claude/decisions.md

<!-- Total size of this file: ~200 tokens -->
<!-- Files are loaded on demand, not at session start -->
```

```shell
# When switching to a completely different task, start fresh
# rather than continuing a long session. In the interactive session:
/clear

# Or from the CLI, start a new headless run:
claude -p "New independent task here"

# There is no exact live counter, but Claude Code warns when
# context is getting full — watch for the indicator.
```

Tips
- Use /compact after completing each major phase of work — e.g., after finishing the data model before starting the UI.
- Avoid asking Claude to 'read the entire codebase' — direct it to specific files relevant to the task.
- Large JSON files and SQL dumps are context killers — pipe them through jq or grep before having Claude read them.
- Keep MCP server responses concise — an MCP tool that returns 50KB of JSON burns context fast. Add pagination or summary modes.
- Start a new session (/clear or a fresh CLI invocation) for completely unrelated tasks rather than extending a long session.
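The pre-filtering tip above can be sketched with standard tools. Here a large log is sliced down to only its error lines before Claude reads the result; the file paths are illustrative, and for JSON files jq plays the same role as grep does here:

```shell
#!/bin/sh
# Build a simulated large log (stand-in for a real app.log).
{
  i=1
  while [ $i -le 5000 ]; do
    echo "INFO request $i handled"
    i=$((i + 1))
  done
  echo "ERROR database connection refused"
  echo "ERROR retry limit exceeded"
} > /tmp/app.log

# Keep only error lines, capped at 200, so Claude reads a small,
# relevant slice instead of the whole log.
grep -i 'error' /tmp/app.log | tail -n 200 > /tmp/log-slice.txt
wc -l < /tmp/log-slice.txt   # 2 lines instead of 5002
```

Then in the session: "Read /tmp/log-slice.txt and diagnose the failures" — the context cost is two lines rather than thousands.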
FAQ
How do I know when I am approaching the context limit?
Claude Code displays a warning indicator when context usage is high. You may also notice Claude starting to forget earlier parts of the conversation or giving responses that ignore previously established context. The /compact command shows you the estimated token count before and after compaction.
Does /compact lose important information?
The summary preserves decisions, code written, and key facts but loses the verbatim conversation. For tasks where exact history matters (debugging a subtle regression), save important snippets to a file before compacting. You can tell Claude: 'Before compacting, write a summary of all decisions made to .claude/session-notes.md'.
Is there a way to see how many tokens are in the current context?
Claude Code does not expose a live token counter in the UI, but the /compact command shows usage before compaction. For programmatic monitoring, the JSON output mode from headless runs includes token usage metadata you can log.
Does a larger Claude model mean more context?
Claude Sonnet and Opus share the same 200K token context window. Model selection affects capability and cost but not the context limit. See the model-selection guide for when to use each model.