A workflow pattern that separates specification from implementation, with optional research phase and feedback loops where production experience continuously improves the process itself.
Core Insight
[2025-12-10]: Bad research compounds exponentially. One flawed architectural assumption in the Research phase leads to a misguided plan, which generates thousands of lines of incorrect code. Research quality has massive leverage on all downstream outcomes.
The core pattern has three phases (Plan-Build-Improve), with an optional Research phase for complex domains. The four-command expert pattern (Research-Plan-Build-Improve) creates a learning loop where each phase builds on validated understanding:
- Research - Understands the problem space, codebase structure, and dependencies
- Plan - Creates detailed specifications grounded in research findings
- Build - Implements according to the research-backed spec
- Improve - Analyzes what happened and updates the Research/Plan/Build expertise
The Improve phase closes the loop by extracting learnings from actual production usage and updating the expert's knowledge base across all commands, making each iteration smarter than the last.
How It Works
The Learning Loop
Research (with Expertise)
↓
Plan (with Expertise)
↓
Build (with Expertise)
↓
Production Experience
↓
Improve (updates Expertise)
↓
(feeds back to Research/Plan/Build)
Key Characteristics
- Mutable Expertise Sections: Unlike workflows, the Expertise sections in Research/Plan/Build commands are designed to evolve (see the sketch after this list)
- Experience-Driven Evolution: Learnings come from real production usage, not theory
- Self-Improving System: Each cycle makes the next iteration better
- Persistent Knowledge: Insights are captured in the prompts themselves, not lost in conversation history
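To make this concrete, here is a minimal sketch of what an Expertise section might look like after a few Improve cycles. The domain, dates, and entries are invented for illustration:
## Expertise
*[2025-11-02]*: (hypothetical payments domain) Retry logic must be idempotent; check for an idempotency key before touching payment calls.
*[2025-11-19]*: Duplicate-charge reports have traced back to the webhook handler twice; start research there.
*[2025-12-10]*: Plans that touch the ledger schema need an explicit migration step; two builds failed without one.
Because these entries live in the command file itself, they are part of the prompt every time the command runs, rather than being lost in conversation history.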
The Research Phase
[2025-12-10]: Research is the foundation that determines everything downstream. Before planning what to build, you need to understand:
- Codebase structure: Where does relevant code live? What are the existing patterns?
- Dependencies: What does this component depend on? What depends on it?
- Problem causes: What actually broke? What's the root cause vs. symptoms?
- Constraints: What can't change? What assumptions must hold?
Research as Artifact Generation
The output of research isn't just understanding; it's a set of concise summary documents that capture the findings:
# Research Summary: Feature X Implementation
## Current Architecture
- Component Y handles Z via pattern P
- Dependencies: A, B, C (see /path/to/code)
## Problem Root Cause
- Symptom: Users report timeout
- Cause: Unbounded retry loop in module M
- Evidence: /path/to/logs, line 234
## Constraints
- Cannot change API contract (external clients)
- Must maintain backward compatibility with v1.x
## Recommended Approach
Based on architecture analysis, approach 2 (async queue)
fits existing patterns better than approach 1 (sync retry).
These artifacts serve two purposes:
- Grounding: the Plan phase references concrete findings, not assumptions (see the sketch after this list)
- Validation: Team can review research conclusions before building begins
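As an illustration of grounding, a Plan command's workflow can take the research summary as its primary input. A minimal sketch, with a hypothetical artifact path:
## Workflow
1. **Load the Research Summary**
   - Read the summary produced by the Research phase (e.g., docs/research/feature-x-summary.md, a hypothetical path)
   - Treat its root-cause analysis and constraints as ground truth for this plan
2. **Draft the Specification**
   - Reference specific findings by section rather than restating them from memory
   - Flag any planned step that conflicts with a documented constraint
Passing the artifact forward this way keeps the Plan phase anchored to evidence instead of to whatever context happens to be in the conversation.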
The Exponential Compounding Problem
Why research deserves a dedicated phase:
| Research Quality | Plan Accuracy | Lines of Wrong Code | Cost to Fix |
|---|---|---|---|
| Excellent | 95% | ~50 | Hours |
| Good | 80% | ~500 | Days |
| Poor | 50% | ~5,000 | Weeks |
| Terrible | 10% | Complete rewrite | Months |
A 10-minute research mistake can create a 10-week refactoring disaster. The leverage is asymmetric—investing in research quality pays exponential dividends.
What Good Research Looks Like
Good:
- "Module X uses pattern Y (see lines 45-67 in file.py)"
- "Dependency graph shows A→B→C, change must start at C"
- "Root cause: race condition between handlers (reproduced in test)"
Bad:
- "The code probably does something with databases"
- "I think this component is important"
- "We should investigate the authentication system"
Good research is specific, sourced, and falsifiable. Bad research is vague and ungrounded.
Implementation in Claude Code
In Claude Code's slash command system, this pattern is implemented through expert command sets:
.claude/commands/experts/[domain]-expert/
[domain]_expert_research.md # Has ## Expertise section
[domain]_expert_plan.md # Has ## Expertise section
[domain]_expert_build.md # Has ## Expertise section
[domain]_expert_improve.md # Updates the Expertise sections
The Improve command (sketched below):
- Analyzes recent git changes
- Identifies successful patterns and learnings
- Updates ONLY the Expertise sections in Research/Plan/Build commands
- Keeps workflow sections stable
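A minimal sketch of an Improve command along these lines; the step wording is illustrative rather than a canonical implementation:
## Workflow
1. **Review Recent Changes**
   - Inspect git log and diffs for the work completed since the last cycle
   - Note what shipped cleanly, what was reverted, and what needed rework
2. **Extract Learnings**
   - Identify patterns that worked and mistakes that recurred
   - Phrase each learning as a specific, dated entry with evidence
3. **Update Expertise Sections**
   - Append new entries to the ## Expertise section of the Research, Plan, and Build commands
   - Leave the ## Workflow sections untouched
Restricting edits to the Expertise sections keeps the process stable while the knowledge underneath it continues to evolve.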
Research Command Structure
A typical Research expert command:
## Workflow
1. **Understand the Requirement**
- What problem are we solving?
- What's the scope?
2. **Explore the Codebase**
- Use Glob/Grep to find relevant files
- Read key components
- Map dependencies
3. **Analyze the Problem**
- Identify root causes
- Distinguish symptoms from causes
- Document constraints
4. **Generate Research Summary**
- Create concise summary document
- Include specific file paths and line numbers
- State conclusions with evidence
## Expertise
*[2025-12-10]*: (Accumulated learnings about researching this domain)
When to Use This Pattern
Good Fit
Complex, knowledge-intensive domains:
- Multi-step workflows requiring specialized understanding
- Tasks where learning from experience is valuable
- Situations where patterns emerge over time
- When you want prompts to improve with use
Indicators you need this pattern:
- Repeatedly solving similar problems in the same domain
- Accumulating domain-specific best practices
- Need for consistent approach across team/project
- High cost of mistakes (research phase prevents compounding errors)
Poor Fit
Simple or one-off tasks:
- Single-file changes with clear requirements
- Domains with fixed, unchanging requirements
- When expertise doesn't accumulate meaningfully
- Tasks simple enough that planning overhead exceeds benefit
When simpler approaches work:
- Ad-hoc changes with no pattern reuse
- Exploratory work where learning isn't transferable
- Domains where external documentation is comprehensive and stable
Connections
- To Prompts/Structuring: How mutable sections in prompts enable learning
- To Claude Code: Concrete implementation in slash command experts
- To Evaluation: How do you measure if Improve is actually improving?
- To Context Management: Research generates artifacts that become context for Plan/Build phases
- To Specs as Source Code: Plans are executable specifications, not throwaway scaffolding
Related Patterns
- ReAct Loop: Similar feedback structure but at inference time, not across sessions
- Human-in-the-Loop: Review phase could involve human validation
- Multi-Agent: Multiple experts could each use Research-Plan-Build-Improve internally