Agent Planning: Task Decomposition and Execution Strategies (2026)
Agent planning has two phases: upfront plan generation (decompose the task into steps before execution) and adaptive re-planning (update the plan based on what was discovered during execution). Upfront planning improves success on long tasks by giving the agent a roadmap; adaptive re-planning handles the inevitable surprises. The key is knowing when to replan vs. continue on the original path.
When to Use
- ✓ Tasks with 5+ steps where the agent needs a roadmap to stay on track
- ✓ Tasks where early failures should trigger replanning rather than crashing the whole workflow
- ✓ Research or analysis tasks where the required steps depend on what's discovered
- ✓ Multi-day or multi-session tasks where the plan must persist between interruptions
- ✓ Tasks requiring upfront resource estimation (API call budget, time estimate, cost limit)
How It Works
1. Plan-and-execute pattern: first prompt the agent to generate a complete plan (an ordered list of steps with expected outputs), then execute each step in sequence, checking after each step whether to continue, replan, or stop.
2. Hierarchical planning: decompose into high-level goals → subgoals → atomic steps. The agent knows its place in the hierarchy, preventing premature zoom into details before the high-level approach is validated.
3. Dynamic replanning: after each step, ask: "Does the original plan still make sense given what we've learned?" If a step revealed that the approach is wrong, generate a revised plan before continuing. Limit replanning to 2-3 rounds to prevent overthinking.
4. Resource-bounded planning: give the agent a budget (N tool calls, K tokens, T minutes) and require it to produce a plan that fits within the budget. This prevents unbounded exploration on tasks that should be quick.
5. Task state tracking: maintain explicit state (not just conversation history): current_step, completed_steps, discovered_facts, current_plan. This state persists across session breaks and enables recovery after failures.
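A minimal sketch of such a state object, assuming JSON-file persistence (the field names mirror the list above; the file path and schema are illustrative):

```python
import json
from dataclasses import asdict, dataclass, field

@dataclass
class PlanState:
    """Explicit task state persisted between steps (illustrative schema)."""
    current_step: int = 0
    completed_steps: list = field(default_factory=list)
    discovered_facts: dict = field(default_factory=dict)
    current_plan: list = field(default_factory=list)

    def save(self, path: str) -> None:
        # Serialize to disk so a crashed or paused run can resume mid-plan
        with open(path, 'w') as f:
            json.dump(asdict(self), f)

    @classmethod
    def load(cls, path: str) -> 'PlanState':
        with open(path) as f:
            return cls(**json.load(f))
```

In production the same shape would typically live in a database row keyed by task ID rather than a local file, but the principle is identical: state survives the process.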
Examples
```python
# Step 1: Generate plan
plan_response = client.messages.create(
    model='claude-3-5-sonnet-20241022',
    system='You are a planning agent. Given a task, create a numbered execution plan. Each step should have: action, expected_output, tools_needed.',
    messages=[{'role': 'user', 'content': f'Task: {task}\n\nCreate an execution plan. Be specific about which tools to use at each step.'}]
)
plan = parse_plan(plan_response.content[0].text)

# Step 2: Execute plan with replanning capability
# (a while loop, not `for step in plan.steps`, because replanning replaces the step list mid-run)
replan_count = 0
i = 0
while i < len(plan.steps):
    step = plan.steps[i]
    result = execute_step(step, available_tools)
    step.result = result
    if (result.status == 'failed' or result.requires_replan) and replan_count < 3:
        replan_count += 1  # cap replanning rounds to avoid endless revision
        plan = replan(original_plan=plan, completed=plan.steps[:i],
                      failed_step=step, failure_reason=result.error)
        continue  # retry from the revised step i
    i += 1
```

System: You are a research agent. For complex tasks:
1. First output a PLAN: numbered list of steps you will take
2. Then execute each step, prefixing each with STEP N:
3. After each step, output STATUS: [continue|replan|complete] and WHY
4. If status is 'replan', output REVISED PLAN: before continuing
5. Conclude with SUMMARY: of findings
This structure makes your work auditable and allows human review at each step.

Common Mistakes
- ✗ Over-planning before acting — generating a 20-step plan upfront for a task that might be completed in 3 steps wastes tokens and creates false confidence in a plan that will need replanning anyway. Generate the minimal plan that unblocks the first 2-3 steps.
- ✗ No plan validation — plans generated by the LLM can contain logical errors, impossible steps, or undefined dependencies. Validate plans before execution: check that each step has available tools, that dependencies are ordered correctly, and that the plan fits within resource budgets.
- ✗ Treating the plan as immutable — agents that rigidly follow the original plan even when execution reveals it's wrong will fail. Build explicit replanning checkpoints after steps that gate the continuation of the plan.
- ✗ Not persisting plan state — for long-running tasks, plan state must be serialized to a database between steps. If the agent crashes on step 7 of 15, it should resume from step 8, not restart from scratch.
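The validation checks above (tool availability, dependency ordering, budget fit) can be sketched as a pre-execution gate; the step-dict fields `id`, `tools_needed`, and `depends_on` are an assumed schema, not a fixed API:

```python
def validate_plan(plan_steps: list, available_tools: set, budget: int) -> list:
    """Return a list of validation errors; an empty list means the plan may run.

    Checks: every needed tool exists, every dependency points to an
    earlier step, and the step count fits the resource budget.
    """
    errors = []
    seen = set()
    for step in plan_steps:
        for tool in step.get('tools_needed', []):
            if tool not in available_tools:
                errors.append(f"step {step['id']}: unknown tool {tool!r}")
        for dep in step.get('depends_on', []):
            if dep not in seen:  # dependency must appear before this step
                errors.append(f"step {step['id']}: depends on later/missing step {dep}")
        seen.add(step['id'])
    if len(plan_steps) > budget:
        errors.append(f"plan has {len(plan_steps)} steps, budget is {budget}")
    return errors
```

Rejecting a bad plan costs one extra generation call; executing one costs the whole failed run.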
FAQ
When should the agent replan vs. continue with the original plan?
Replan when: (1) a step fails and the failure invalidates downstream steps, (2) new information fundamentally changes the approach (e.g., discovered the database schema is different than assumed), (3) the resource budget is 50%+ consumed with less than 50% of the plan completed. Continue when: a step fails but it's optional, or new information confirms the original plan is valid.
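As a sketch, these triggers reduce to a small decision function (the result and step fields are illustrative assumptions):

```python
def should_replan(step_result: dict, step: dict,
                  budget_used_frac: float, plan_done_frac: float) -> bool:
    """Decide between replanning and continuing, per the three triggers above."""
    if step_result['status'] == 'failed' and not step.get('optional', False):
        return True  # failure invalidates downstream steps
    if step_result.get('invalidates_assumptions', False):
        return True  # new info fundamentally changes the approach
    if budget_used_frac >= 0.5 and plan_done_frac < 0.5:
        return True  # burning budget faster than making progress
    return False
```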
Should I use a dedicated 'planner' model and a separate 'executor' model?
This pattern (planner/executor split) is common in multi-agent architectures. Use a powerful model (Claude Opus, GPT-4o) for planning — it needs to reason about the full task. Use a cheaper model (Claude Haiku, GPT-4o-mini) for individual execution steps when they're straightforward. This reduces cost by 3-5x for long plans.
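One way to wire the split, sketched under assumptions: route each step to a model tier by a coarse difficulty label (the `kind` field, the category set, and the model names are all illustrative):

```python
def pick_model(step: dict) -> str:
    """Route a step to a model tier; categories and model names are illustrative."""
    STRAIGHTFORWARD = {'fetch', 'format', 'extract'}
    if step.get('kind') in STRAIGHTFORWARD:
        return 'claude-3-5-haiku-latest'  # cheap executor for mechanical steps
    return 'claude-opus-4'                # powerful model for planning and hard steps
```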
How do I handle tasks where the plan can't be determined upfront?
Use reactive planning: don't generate a full plan upfront. Instead, after each step, generate 'next best action' given current state. This is what ReAct does by default. Reactive planning handles highly uncertain tasks better; upfront planning handles well-structured tasks better. Many production agents use a hybrid: plan the first 3-5 steps, then reactively plan the rest.
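The hybrid can be sketched as a loop that commits to the first few planned steps and then switches to one-step-at-a-time selection; `generate_plan`, `next_best_action`, and `is_done` are caller-supplied callables assumed for illustration:

```python
def hybrid_plan(task, generate_plan, next_best_action, is_done, horizon=3):
    """Execute the first `horizon` upfront-planned actions, then go reactive."""
    state = {'task': task, 'history': []}
    for action in generate_plan(task)[:horizon]:  # upfront partial plan
        state['history'].append(action)
        if is_done(state):
            return state
    while not is_done(state):  # reactive phase: choose one action at a time
        state['history'].append(next_best_action(state))
    return state
```

The horizon is the tuning knob: raise it for well-structured tasks, drop it toward 0 (pure ReAct) for highly uncertain ones.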
What's the right level of granularity for plan steps?
Steps should be atomic tool calls or small groups of related calls. 'Search for pricing data' is too vague — 'call get_pricing(model="gpt-4o")' is the right granularity. Each step should produce a verifiable output. If a step takes more than 3 tool calls to execute, break it into sub-steps.
How do I evaluate agent planning quality?
Measure: plan adherence rate (% of plan steps completed without replanning), replan frequency (how often the plan changes), task completion rate (% of tasks fully completed), and plan efficiency (actual steps / planned steps — values near 1.0 indicate accurate planning). Low plan adherence suggests the agent is being given tasks outside its planning ability.
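A sketch of computing these four metrics over a batch of run records (the record fields are an assumed logging schema):

```python
def planning_metrics(runs: list) -> dict:
    """Aggregate planning-quality metrics across runs.

    Each run record is assumed to log: steps_without_replan, planned_steps,
    replans, completed (bool), and actual_steps.
    """
    n = len(runs)
    return {
        'plan_adherence': sum(r['steps_without_replan'] / r['planned_steps'] for r in runs) / n,
        'replan_frequency': sum(r['replans'] for r in runs) / n,
        'completion_rate': sum(r['completed'] for r in runs) / n,
        # near 1.0 means the plan accurately predicted the work required
        'plan_efficiency': sum(r['actual_steps'] / r['planned_steps'] for r in runs) / n,
    }
```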