Agent Planning: Task Decomposition and Execution Strategies (2026)
Agent planning has two phases: upfront plan generation (decompose the task into steps before execution) and adaptive re-planning (update the plan based on what was discovered during execution). Upfront planning improves success on long tasks by giving the agent a roadmap; adaptive re-planning handles the inevitable surprises. The key is knowing when to replan vs. continue on the original path.
When to Use
- ✓ Tasks with 5+ steps where the agent needs a roadmap to stay on track
- ✓ Tasks where early failures should trigger replanning rather than crashing the whole workflow
- ✓ Research or analysis tasks where the required steps depend on what's discovered
- ✓ Multi-day or multi-session tasks where the plan must persist between interruptions
- ✓ Tasks requiring upfront resource estimation (API call budget, time estimate, cost limit)
How It Works
1. Plan-and-execute pattern: first prompt the agent to generate a complete plan (an ordered list of steps with expected outputs), then execute each step in sequence, checking after each step whether to continue, replan, or stop.
2. Hierarchical planning: decompose into high-level goals → subgoals → atomic steps. The agent knows its place in the hierarchy, preventing premature zoom into details before the high-level approach is validated.
3. Dynamic replanning: after each step, ask: "Does the original plan still make sense given what we've learned?" If a step revealed that the approach is wrong, generate a revised plan before continuing. Limit replanning to 2-3 rounds to prevent overthinking.
4. Resource-bounded planning: give the agent a budget (N tool calls, K tokens, T minutes) and require it to produce a plan that fits within the budget. This prevents unbounded exploration on tasks that should be quick.
5. Task state tracking: maintain explicit state (not just conversation history): current_step, completed_steps, discovered_facts, current_plan. This state persists across session breaks and enables recovery after failures.
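A minimal sketch of such a state object, assuming JSON-file persistence (the field names mirror the list above; the file path and schema are illustrative):

```python
import json
from dataclasses import asdict, dataclass, field

@dataclass
class PlanState:
    """Explicit task state persisted between steps (illustrative schema)."""
    current_step: int = 0
    completed_steps: list = field(default_factory=list)
    discovered_facts: dict = field(default_factory=dict)
    current_plan: list = field(default_factory=list)

    def save(self, path: str) -> None:
        # Serialize to disk so a crashed or paused run can resume mid-plan
        with open(path, 'w') as f:
            json.dump(asdict(self), f)

    @classmethod
    def load(cls, path: str) -> 'PlanState':
        with open(path) as f:
            return cls(**json.load(f))
```

In production the same shape would typically live in a database row keyed by task ID rather than a local file, but the principle is identical: state survives the process.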
Examples
```python
# Step 1: Generate plan
plan_response = client.messages.create(
    model='claude-3-5-sonnet-20241022',
    system='You are a planning agent. Given a task, create a numbered execution plan. Each step should have: action, expected_output, tools_needed.',
    messages=[{'role': 'user', 'content': f'Task: {task}\n\nCreate an execution plan. Be specific about which tools to use at each step.'}]
)
plan = parse_plan(plan_response.content[0].text)

# Step 2: Execute plan with replanning capability
# (a while loop, not `for step in plan.steps`, because replanning replaces the step list mid-run)
replan_count = 0
i = 0
while i < len(plan.steps):
    step = plan.steps[i]
    result = execute_step(step, available_tools)
    step.result = result
    if (result.status == 'failed' or result.requires_replan) and replan_count < 3:
        replan_count += 1  # cap replanning rounds to avoid endless revision
        plan = replan(original_plan=plan, completed=plan.steps[:i],
                      failed_step=step, failure_reason=result.error)
        continue  # retry from the revised step i
    i += 1
```

System: You are a research agent. For complex tasks:
1. First output a PLAN: numbered list of steps you will take
2. Then execute each step, prefixing each with STEP N:
3. After each step, output STATUS: [continue|replan|complete] and WHY
4. If status is 'replan', output REVISED PLAN: before continuing
5. Conclude with SUMMARY: of findings
This structure makes your work auditable and allows human review at each step.

Common Mistakes
- ✗ Over-planning before acting — generating a 20-step plan upfront for a task that might be completed in 3 steps wastes tokens and creates false confidence in a plan that will need replanning anyway. Generate the minimal plan that unblocks the first 2-3 steps.
- ✗ No plan validation — plans generated by the LLM can contain logical errors, impossible steps, or undefined dependencies. Validate plans before execution: check that each step has available tools, that dependencies are ordered correctly, and that the plan fits within resource budgets.
- ✗ Treating the plan as immutable — agents that rigidly follow the original plan even when execution reveals it's wrong will fail. Build explicit replanning checkpoints after steps that gate the continuation of the plan.
- ✗ Not persisting plan state — for long-running tasks, plan state must be serialized to a database between steps. If the agent crashes on step 7 of 15, it should resume from step 8, not restart from scratch.
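The validation checks above (tool availability, dependency ordering, budget fit) can be sketched as a pre-execution gate; the step-dict fields `id`, `tools_needed`, and `depends_on` are an assumed schema, not a fixed API:

```python
def validate_plan(plan_steps: list, available_tools: set, budget: int) -> list:
    """Return a list of validation errors; an empty list means the plan may run.

    Checks: every needed tool exists, every dependency points to an
    earlier step, and the step count fits the resource budget.
    """
    errors = []
    seen = set()
    for step in plan_steps:
        for tool in step.get('tools_needed', []):
            if tool not in available_tools:
                errors.append(f"step {step['id']}: unknown tool {tool!r}")
        for dep in step.get('depends_on', []):
            if dep not in seen:  # dependency must appear before this step
                errors.append(f"step {step['id']}: depends on later/missing step {dep}")
        seen.add(step['id'])
    if len(plan_steps) > budget:
        errors.append(f"plan has {len(plan_steps)} steps, budget is {budget}")
    return errors
```

Rejecting a bad plan costs one extra generation call; executing one costs the whole failed run.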
FAQ
When should the agent replan vs. continue with the original plan?
Replan when: (1) a step fails and the failure invalidates downstream steps, (2) new information fundamentally changes the approach (e.g., discovered the database schema is different than assumed), (3) the resource budget is 50%+ consumed with less than 50% of the plan completed. Continue when: a step fails but it's optional, or new information confirms the original plan is valid.
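As a sketch, these triggers reduce to a small decision function (the result and step fields are illustrative assumptions):

```python
def should_replan(step_result: dict, step: dict,
                  budget_used_frac: float, plan_done_frac: float) -> bool:
    """Decide between replanning and continuing, per the three triggers above."""
    if step_result['status'] == 'failed' and not step.get('optional', False):
        return True  # failure invalidates downstream steps
    if step_result.get('invalidates_assumptions', False):
        return True  # new info fundamentally changes the approach
    if budget_used_frac >= 0.5 and plan_done_frac < 0.5:
        return True  # burning budget faster than making progress
    return False
```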
Should I use a dedicated 'planner' model and a separate 'executor' model?
This pattern (planner/executor split) is common in multi-agent architectures. Use a powerful model (Claude Opus, GPT-4o) for planning — it needs to reason about the full task. Use a cheaper model (Claude Haiku, GPT-4o-mini) for individual execution steps when they're straightforward. This reduces cost by 3-5x for long plans.
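One way to wire the split, sketched under assumptions: route each step to a model tier by a coarse difficulty label (the `kind` field, the category set, and the model names are all illustrative):

```python
def pick_model(step: dict) -> str:
    """Route a step to a model tier; categories and model names are illustrative."""
    STRAIGHTFORWARD = {'fetch', 'format', 'extract'}
    if step.get('kind') in STRAIGHTFORWARD:
        return 'claude-3-5-haiku-latest'  # cheap executor for mechanical steps
    return 'claude-opus-4'                # powerful model for planning and hard steps
```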
How do I handle tasks where the plan can't be determined upfront?
Use reactive planning: don't generate a full plan upfront. Instead, after each step, generate 'next best action' given current state. This is what ReAct does by default. Reactive planning handles highly uncertain tasks better; upfront planning handles well-structured tasks better. Many production agents use a hybrid: plan the first 3-5 steps, then reactively plan the rest.
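The hybrid can be sketched as a loop that commits to the first few planned steps and then switches to one-step-at-a-time selection; `generate_plan`, `next_best_action`, and `is_done` are caller-supplied callables assumed for illustration:

```python
def hybrid_plan(task, generate_plan, next_best_action, is_done, horizon=3):
    """Execute the first `horizon` upfront-planned actions, then go reactive."""
    state = {'task': task, 'history': []}
    for action in generate_plan(task)[:horizon]:  # upfront partial plan
        state['history'].append(action)
        if is_done(state):
            return state
    while not is_done(state):  # reactive phase: choose one action at a time
        state['history'].append(next_best_action(state))
    return state
```

The horizon is the tuning knob: raise it for well-structured tasks, drop it toward 0 (pure ReAct) for highly uncertain ones.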
What's the right level of granularity for plan steps?
Steps should be atomic tool calls or small groups of related calls. 'Search for pricing data' is too vague — 'call get_pricing(model="gpt-4o")' is the right granularity. Each step should produce a verifiable output. If a step takes more than 3 tool calls to execute, break it into sub-steps.
How do I evaluate agent planning quality?
Measure: plan adherence rate (% of plan steps completed without replanning), replan frequency (how often the plan changes), task completion rate (% of tasks fully completed), and plan efficiency (actual steps / planned steps — values near 1.0 indicate accurate planning). Low plan adherence suggests the agent is being given tasks outside its planning ability.
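A sketch of computing these four metrics over a batch of run records (the record fields are an assumed logging schema):

```python
def planning_metrics(runs: list) -> dict:
    """Aggregate planning-quality metrics across runs.

    Each run record is assumed to log: steps_without_replan, planned_steps,
    replans, completed (bool), and actual_steps.
    """
    n = len(runs)
    return {
        'plan_adherence': sum(r['steps_without_replan'] / r['planned_steps'] for r in runs) / n,
        'replan_frequency': sum(r['replans'] for r in runs) / n,
        'completion_rate': sum(r['completed'] for r in runs) / n,
        # near 1.0 means the plan accurately predicted the work required
        'plan_efficiency': sum(r['actual_steps'] / r['planned_steps'] for r in runs) / n,
    }
```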