🎯 Domain 4 · Task Statement 4.6

Design Multi-Instance and Multi-Pass Review Architectures

📊 Domain Weight: 20% 🎬 Difficulty: Architect Level 🚨 Focus: Redundancy & Consensus

For mission-critical tasks (legal drafting, safety auditing), a single AI turn is a single point of failure. This task explores Parallel and Sequential ensembles. You will learn to architect systems that run multiple independent Claude instances and sequential review chains to eliminate hallucinations through consensus and criticism.

📋 Contents

  1. Real-World Analogy: The Grand Jury
  2. Parallel Multi-Instance vs. Sequential Multi-Pass
  3. Architecture A: Parallel Consensus (Voting)
  4. Advanced: The Self-Debate Mode
  5. Diagram: The Multi-Stage Verification Pipeline
  6. Architecture B: Sequential Refinement (The Judge/Critic)
  7. Anti-Patterns: Cost Overruns & Echo Chambers
  8. Exam Readiness & Key Takeaways

🏭 Real-World Analogy: The Grand Jury

🩹 Analogy — Three Doctors vs. One Opinion

Imagine you have a complex medical condition. Standard Prompting is like asking one doctor for a diagnosis. If they missed one day of class, their advice might be wrong. Single point of failure.

Multi-Instance Architecture is like asking 3 independent doctors in 3 different cities. If they all arrive at the same diagnosis (Consensus), your confidence is near 100%.

Multi-Pass Architecture is like the 1st doctor doing the surgery, and a 2nd "Senior Specialist" watching the monitor and Saying, "Watch that artery!" (Critic/Reviewer).

🏢 Parallel Multi-Instance vs. Sequential Multi-Pass

Architects must decide if they need Variety (Parallel) or Depth (Sequential).

MethodStructureBest ForGoal
ParallelClaude A, B, and C run in tandem.Fast extraction, classification.Consensus (remove outliers).
SequentialA → B → C (Chain).Summarization, coding, legal.Refinement (improve quality).

📈 Architecture A: Parallel Consensus (Voting)

Run 3 instances of Claude-3-Sonnet with temperature: 0.7 using varied role prompts (e.g. "Skeptical Auditor" vs "Efficient Developer"). Compare results in your code.

💡 Pro Tip — The N-of-3 Pattern

If Instance 1 and 2 extract the number $500, but Instance 3 extracts $550, your aggregator code should discard Instance 3 as an outlier. This is a common pattern for high-accuracy financial extraction.

🚀 Advanced: The Self-Debate Mode (Internal Multi-Pass)

If you have limited tokens but need high accuracy, use **Self-Debate**. Prompt Claude to simulate two internal personas arguing over a complex decision before it outputs the final XML.

Self-Debate Prompt Pattern
<thinking>
- Persongen A: Suggests we use the 'strict' parser for security.
- Persona B: Objects, saying it might break legacy data.
- Synthesis: Resolve to use 'strict' but with a fallback handler.
</thinking>

The "Consensus Algorithm" should escalate to a human if fewer than 2/3 of instances agree on a specific field value.

🕐 Diagram: The Multi-Stage Verification Pipeline

Combined Parallel Consensus + Sequential Review
AGGREGATOR CRITIC AGENT CERTIFIED RESULT

🔄 Architecture B: Sequential Refinement (The Judge)

Assign a 2nd turn to a "Reviewer" prompt. Give the 1st agent's output and the original source to the 2nd larger instance (e.g. Claude-3-Opus) and say: "Flag any claim NOT supported by the source XML."

Best Practice: Use the larger Claude-3-Sonnet for the 1st draft, and the fast Claude-3-Haiku for checking simple syntax errors. For logical errors, the Checker must be at least as large as the Creator.

Anti-Patterns: Cost Overruns & Echo Chambers

Identical Prompts (Echo Chamber)

Running 3 instances with the exact same system prompt. Claude will arrive at the same hallucination. Fix: Vary the roles.

Linear Scaling of Cost

Using multi-instance for simple tasks. Fix: Only trigger if the 1st agent expresses low confidence.

The "Infinite Refinement" Trap

Letting two agents argue back and forth without a termination condition. Fix: Limit to 2 passes.

Small Model Reviewing Large Model

Using Haiku to judge the reasoning of Sonnet. Haiku will miss the subtle logic.

Exam Readiness & Key Takeaways

🎓 Exam Scenario — The Legal Accuracy Mandate

Scenario: You're building an automated summarizer for legal contracts. The system MUST not miss any "Right of First Refusal" clauses. Accuracy is prioritize over cost.

Question: What is the most architecturally sound approach?

  • A) Add many few-shot examples of missed clauses to a single prompt.
  • B) Implement a Multi-Pass Architecture where Agent 1 extracts the clauses, and Agent 2 (The Critic) verifies every item against the original text.
  • C) Increase the context window size.

Correct Answer: B. A multi-pass review architecture provides a "Self-Check" system that significantly reduces omissions.

1
Aggregator Logic. Your architecture needs code to handle the math of the consensus.
2
Temperature Variance. Use T=0.0 for stability and T=0.7 for exploratory reasoning in parallel checks.
3
Model Asymmetry. Use specialized models for verification (e.g., using Opus to check Sonnet's work).