📚 Domain 1 · Task Statement 1.7

Manage Session State, Resumption & Forking

📊 Domain Weight: 27% ⭐ High Exam Priority 🔗 Scenario: Developer Productivity Tool

Long-running agentic tasks — exploring a codebase, researching a topic, building iteratively — span multiple work sessions over hours or days. The exam tests three session management decisions: when to resume a named session, when to fork for divergent exploration, and when to start fresh with an injected summary instead of resuming with stale context. Getting these choices wrong wastes computation, produces incorrect analysis, or loses valuable invested context.

📋 Contents

Analogy — The Research Notebook
The Three Session Management Strategies
Named Session Resumption with --resume
fork_session for Parallel Exploration Branches
Stale Tool Results — When to Start Fresh
Handling File Changes in Resumed Sessions
Decision Framework: Resume vs Fork vs Fresh
Anti-Patterns & Pro Tips
Summary & Exam Key Points

📓 Analogy — The Research Notebook

📓 Three Ways to Continue Research

Named resumption → open the same notebook and continue writing. You left off mid-investigation yesterday. Today you open the same notebook to page 47 and pick up exactly where you stopped — all your previous findings, sketches, and conclusions are right there. This is --resume <session-name>: the agent sees its complete prior conversation history and can continue seamlessly.

Fork session → photocopy your notebook and give one copy to each colleague. You reached a key decision point mid-investigation: should we pursue Strategy A or Strategy B? You photocopy your notebook at page 47 — both colleagues start from identical page 47 context but continue in completely different directions. Neither branch pollutes the other's notes. This is fork_session: create two independent branches from a shared baseline.

Start fresh with a summary → write a briefing document, hand it to a fresh analyst. Your notebook is months old. Half the pages reference data that has since changed — the code you analyzed has been refactored, the API you mapped has been replaced. Rather than hand the analyst a notebook full of stale facts, you write a concise briefing: "Here's what we know, here's what changed, here's where to focus." This is starting a new session with an injected structured summary.

🗺️ The Three Session Management Strategies

Strategy	Mechanism	When to Use	Risk if Wrong
Named Resumption	`--resume <session-name>`	Prior context is mostly valid; continuing a paused investigation with little or no code/data changes	None if context is current; high if tool results are stale — agent reasons on outdated facts
Fork Branching	`fork_session(session_id)`	A shared analysis baseline is complete; you need to explore 2+ divergent paths without cross-contamination	Missed opportunity — running alternatives sequentially in one session means later context pollutes earlier findings
Fresh + Summary Injection	New session + structured summary in system prompt	Prior tool results are stale (code changed, data changed); need reliable reasoning on current state	Loses invested context; must re-explore from scratch what was already known

💡 The Central Tension on the Exam

The exam creates scenarios where the "intuitive" choice (always resume — it saves context!) is wrong, and you must recognize when stale tool results make resumption actively harmful. The question is never "resume or don't resume" in isolation — it's always "is the prior context reliable enough to reason from?"

🔖 Named Session Resumption with `--resume`

Named sessions allow you to assign a memorable name to a session and resume it exactly by name across work sessions. This is more reliable than session IDs (which are opaque strings) and allows team members to resume each other's sessions by convention.

CLI — Named Session Start and Resume

# Starting a named investigation session
claude --session-name payment-refactor-investigation \
       "Explore the payment module codebase. Map all files, \
identify the legacy payment gateway integration points, \
and document the data flow from checkout to confirmation."

# Session runs, agent explores, produces findings...
# Work ends for the day.

# ── NEXT DAY ── Resume the named session exactly where it stopped
claude --resume payment-refactor-investigation \
       "Continue from where you left off. Focus next on \
the error handling paths — I'd like to understand \
how payment failures are currently managed."

# The agent resumes with full prior conversation history:
# - The original exploration findings
# - All tool calls and their results
# - The structural map it built yesterday
# It does NOT need to re-explore what it already mapped.

What the Resumed Agent Sees

When a session is resumed with --resume, the agent receives its complete prior conversation history — every message, every tool call, every tool result. This means:

It remembers which files it already explored — no re-reading required
It remembers conclusions it reached — no re-analysis required
It can reference specific prior findings ("as I noted in the payment.py analysis...")
It continues the investigation as if no break occurred
CRITICAL: Its overarching System Prompt and safety constraints (as defined in its initial specification) remain perfectly intact, preventing jailbreaks via context-window manipulation.

✅ Ideal Use Case — Continuing a Paused Investigation

You started a deep codebase exploration at 2pm. At 5pm you need to stop. The code hasn't changed. Tomorrow morning you resume with --resume codebase-exploration and continue seamlessly. The value is in not re-paying the exploration cost — all prior tool results (file reads, grep outputs, dependency maps) are already in context.

⚠️ When Resumption Goes Wrong

You paused a session last week. Since then, three files have been refactored. When you resume, the agent's tool results reference the old versions of those files. If you ask it to make decisions based on the codebase structure, it will reason from stale data — producing recommendations that are wrong for the current code. This is the stale tool results problem.

🌿 fork_session for Parallel Exploration Branches

fork_session creates an independent copy of the current session state. Both the original and the fork start from identical context — the same conversation history, same tool results, same analysis baseline — but subsequent actions in neither branch affect the other. This is the correct pattern when you need to explore two divergent approaches from a shared starting point.

Figure 1 — fork_session Creates Independent Branches from Shared Baseline

Python — fork_session: Comparing Two Testing Strategies

from anthropic_agent_sdk import AgentSession
import asyncio

async def compare_testing_strategies(session_id: str):
    """
    After completing codebase analysis in a shared session,
    fork to explore two testing strategies independently.
    Neither branch contaminates the other.
    """
    # Shared baseline is complete: codebase analysis, architecture map, etc.
    # Neither branch should be influenced by the other's findings.

    # Create two independent branches from the same baseline
    branch_a_session = await AgentSession.fork_session(
        session_id=session_id,
        branch_name="unit-test-strategy"
    )
    branch_b_session = await AgentSession.fork_session(
        session_id=session_id,
        branch_name="integration-test-strategy"
    )

    # Run both branches in parallel — they share the baseline but
    # their subsequent actions are completely independent
    results_a, results_b = await asyncio.gather(
        branch_a_session.run(
            "Explore a unit testing strategy for this codebase. "
            "What unit tests would give the best coverage? "
            "List all mocks you'd need and estimate test count."
        ),
        branch_b_session.run(
            "Explore an integration testing strategy for this codebase. "
            "What integration tests would give the best behaviour coverage? "
            "List infrastructure requirements and estimate test count."
        )
    )

    # Synthesize findings from both branches to make recommendation
    return {
        "unit_test_findings": results_a,
        "integration_test_findings": results_b,
        "recommendation": await synthesize_recommendation(results_a, results_b)
    }

⭐ Pro Tip — What fork_session Is NOT

fork_session is not for running the same task multiple times with different parameters. It's specifically for cases where you have a shared analysis baseline that was expensive to build and you want to explore divergent approaches without re-paying that baseline cost or cross-contaminating the exploration. If you don't have a shared baseline worth preserving, just run two independent sessions.

⚠️ Stale Tool Results — When to Start Fresh

This is the most nuanced — and most tested — concept in Task 1.7. The exam guide states explicitly: "Why starting a new session with a structured summary is more reliable than resuming with stale tool results."

Tool results become stale when the underlying data changes after the tool was called. A file read at 2pm is stale if the file was modified at 3pm. A database query result from last Tuesday is stale if the data model changed. The agent doesn't know results are stale — it reasons on them as if they're current.

The Stale Context Problem Illustrated

❌ Resuming with Stale Context

Last week: Agent read auth.py, found it used JWT with RS256 signing.

This week: Team refactored to OAuth 2.0. auth.py completely rewritten.

You resume: Agent reasons "based on my analysis of auth.py, the token signing key is..." — but it's quoting a file that no longer exists in that form. Its conclusions are confidently wrong.

✅ Starting Fresh with Structured Summary

New session system prompt: "Context: we previously explored the auth module (now refactored to OAuth 2.0), payment module (still JWT), and API gateway (unchanged). Key known issues: [list]. Your task: analyze the new OAuth implementation."

Agent has accurate, current framing and immediately focuses on what matters — without being confused by stale JWT analysis.

🎯 Exam Guide Exact Language — Memorise This

"Why starting a new session with a structured summary is more reliable than resuming with stale tool results."

The key word is structured summary — not just "tell the agent things changed." A structured summary includes: what was previously known (still valid), what has changed (invalidated), and what the current task focus is. It replaces stale tool results with curated, current, human-verified facts.

Python — Creating a Structured Summary for Fresh Session

def build_structured_summary(prior_findings: dict, changes_since: list) -> str:
    """
    Build the structured summary injected into a fresh session
    when prior tool results are stale.
    """
    return f"""
## Context from Prior Investigation (verified current as of today)

### Still Valid Findings
{chr(10).join(f"- {finding}" for finding in prior_findings['still_valid'])}

### INVALIDATED — Do NOT rely on these
The following were previously analyzed but have since changed:
{chr(10).join(f"- {change['file']}: {change['nature_of_change']}" 
             for change in changes_since)}

### Current Task Focus
{prior_findings['next_objective']}

### Known Architectural Facts (human-verified, current)
{chr(10).join(f"- {fact}" for fact in prior_findings['verified_facts'])}

---
Begin your analysis from this context. Use tools to read the changed files fresh.
Do NOT assume you have prior knowledge of files listed as INVALIDATED above.
"""

# Usage: inject as system_prompt in the new session
summary = build_structured_summary(
    prior_findings={
        "still_valid": [
            "Payment module is still JWT RS256, no changes",
            "API gateway configuration unchanged",
            "Database schema: orders table unchanged"
        ],
        "next_objective": "Analyze new OAuth 2.0 auth implementation",
        "verified_facts": [
            "Team uses pytest for all unit tests",
            "Production DB is PostgreSQL 15"
        ]
    },
    changes_since=[
        {"file": "auth.py", "nature_of_change": "Refactored from JWT to OAuth 2.0"},
        {"file": "middleware/session.py", "nature_of_change": "New file added for OAuth state"}
    ]
)

📝 Handling File Changes in Resumed Sessions

Between the exam's two approaches (full fresh session vs blind resumption), there is a targeted middle ground: resume the session, but explicitly inform the agent which files have changed and need re-analysis. This is the correct choice when most prior context is still valid but specific files were modified.

💡 Exam Guide Language — Targeted Re-Analysis

"Informing a resumed session about specific file changes for targeted re-analysis rather than requiring full re-exploration."

The insight here is efficiency: if 18 of 20 files are unchanged and 2 files were modified, you don't want to re-explore 18 valid files. You resume the session and tell the agent exactly which 2 files changed — it performs targeted re-analysis of only those, updating only the affected parts of its understanding.

CLI — Targeted Re-Analysis After File Changes (Resume + Inform)

# Resume the named session AND inform about what changed
# The agent will re-analyze only the changed files, not everything.
claude --resume payment-refactor-investigation \
  "Since our last session, two files have been modified:
  
  1. src/payment/processor.py — The team added retry logic for 
     network timeouts. Please re-read this file and update your 
     understanding of the payment flow accordingly.
  
  2. src/payment/config.py — New configuration keys added for 
     the retry settings. Please re-read and note the new keys.
  
  All other files you previously analyzed are unchanged.
  After re-analyzing these two files, continue your assessment 
  of the payment module's error handling."

The Three Staleness Scenarios and Their Correct Responses

Scenario	Files Changed	Correct Action
Paused session, no code changes (same day / overnight)	0 files changed	--resume <name> Context fully valid. Continue seamlessly.
Paused session, a few specific files modified	2–5 files changed	--resume + inform Resume + explicitly tell agent which files changed for targeted re-analysis.
Session from last week; significant refactoring occurred	Many files changed	Fresh + structured summary Prior tool results too stale. Start new session with curated summary of what's still valid.
Want to compare two refactoring approaches from current analysis	N/A — decision point	fork_session Shared baseline established. Fork to explore both approaches independently.

📐 Decision Framework: Resume vs Fork vs Fresh

Figure 2 — Session Management Decision Tree

⚠️ Anti-Patterns & Pro Tips

❌ Always Resuming Without Checking Staleness

Resuming last week's session without considering what changed. Agent reasons confidently from stale tool results, producing recommendations that are wrong for the current codebase state.

❌ Sequential Exploration of Divergent Approaches

Exploring Strategy A, then exploring Strategy B in the same session. Strategy B's exploration is contaminated by everything learned while exploring A. The comparison is not fair or clean.

❌ Resuming When Full Re-Exploration Is Needed

If 60%+ of prior tool results are stale, resuming saves less time than starting fresh. The agent wastes tokens trying to reconcile contradictions between stale context and current reality.

❌ Vague Change Notifications

"Some files changed since last time." The agent can't know which of its prior tool results to distrust. Every specific change must be named with the nature of the change.

✅ Adopt a Session Naming Convention

Use consistent, descriptive names: payment-refactor-v2, auth-security-audit. Good names make sessions resumable by any team member, not just the one who started them.

✅ Log What Changed Between Sessions

Before resuming, run git diff --name-only HEAD@{yesterday} to get the exact list of changed files since the session was paused. Give this list to the agent explicitly.

✅ Structured Summary Template

Create a template for fresh-start summaries: (1) still-valid findings, (2) invalidated findings, (3) verified architectural facts, (4) current task objective. Use it consistently so summaries are complete.

✅ Fork at the Right Moment

Fork after the baseline analysis is complete — not before. Forking too early means both branches repeat expensive exploration. Forking too late means branches will have already contaminated each other.

📝 Summary & Exam Key Points

🎯 Exam Scenario — Developer Productivity Tool (Scenario 4)

The primary exam scenario: "You are building developer productivity tools using the Claude Agent SDK. The agent helps engineers explore unfamiliar codebases, understand legacy systems, generate boilerplate code, and automate repetitive tasks."

Questions test: (1) choosing between resume, fork, and fresh for a given scenario description, (2) identifying that stale tool results make resumption unreliable and a structured summary is more reliable, (3) knowing that fork_session is for divergent exploration from a shared baseline — not for re-running the same task, and (4) targeted re-analysis by informing a resumed session about specific changed files.

Named session resumption (--resume <session-name>) continues a session with its full prior conversation history. The agent sees every message, tool call, and tool result from the prior session. Ideal when prior context is valid and the investigation is simply paused mid-task.

fork_session creates independent branches from a shared analysis baseline. Use it when a shared expensive baseline is complete and you need to explore divergent approaches (e.g., comparing two testing strategies or two refactoring approaches) without either branch contaminating the other.

Starting a new session with a structured summary is more reliable than resuming with stale tool results. This is exact exam guide language. When underlying code or data has changed significantly, prior tool results are wrong. A structured summary provides curated, current, human-verified context instead.

Inform a resumed session about specific file changes for targeted re-analysis. When only a few files changed, resume the session and explicitly name the changed files and nature of changes. The agent re-reads only those files rather than re-exploring the entire codebase.

The staleness decision rule: 0 files changed → resume cleanly. A few specific files changed → resume + inform. Many files changed / significant refactoring → fresh session with structured summary. Divergent exploration needed → fork_session.

A structured summary contains three parts: (1) still-valid findings from prior analysis, (2) explicitly invalidated findings and why, (3) current task objective and verified architectural facts. This replaces stale tool results with accurate, current context.

fork_session branches are truly independent — neither affects the other. This independence is the key property. If exploration in Branch A could contaminate Branch B's reasoning, the comparison between strategies becomes unreliable. Fork enforces clean separation.

Session management is about the reliability of context, not just continuity. The wrong instinct is "always resume to save context." The correct instinct is "resume only when context is reliable; start fresh when it isn't." Reliable context produces correct analysis; stale context produces confident but wrong analysis.

🎉 Domain 1 Complete — All 7 Task Statements Covered

You have now covered all 7 task statements of Domain 1: Agentic Architecture & Orchestration (27% of scored content). Review the Domain 1 Deep Dive Study Guide for integrated practice questions and the complete cheat sheet.

← Previous Task 1.6 — Task Decomposition Strategies

Next → Domain 1 Summary