Hooks are the enforcement layer of the Agent SDK. While Claude reasons about what to do, hooks control whether it is allowed to do it and what data it sees when it does. This task statement tests two distinct hook patterns: PostToolUse hooks for normalizing heterogeneous data after a tool returns, and PreToolCall hooks for enforcing compliance rules before a tool executes. The central exam principle: hooks give deterministic guarantees; prompt instructions give probabilistic compliance.
At international arrivals, the customs officer plays two distinct roles, which perfectly illustrate the AI Fluency concept of Delegation Boundaries. Role 1 (PostToolUse): When luggage arrives on the conveyor belt from different countries, it gets processed through X-ray and converted into a standardised inspection format โ regardless of which airline or origin it came from. The inspector always sees the same structured view. Role 2 (PreToolCall): Before any goods may leave the secure area, the officer checks a manifest. If the declared value exceeds the duty-free limit, the passenger is stopped and redirected to the customs desk โ they cannot just walk through.
PostToolUse hooks are the X-ray normalizer โ they ensure the agent's Description of reality is formatted correctly. PreToolCall hooks are the definitive bounds of Delegation โ they intercept outgoing tool calls and enforce absolute limits on the agent's autonomy, blocking any action that exceeds its designated authority.
The Agent SDK intercepts tool execution at two points. Understanding where each hook fires is the foundation of this entire task statement.
| Hook Type | Fires When | Primary Purpose | Can Block? |
|---|---|---|---|
| PreToolCall | BEFORE the tool executes | Enforce compliance rules; block policy-violating calls | Yes โ returns error as tool_result |
| PostToolUse | AFTER the tool returns its result | Normalize heterogeneous data formats before Claude reads them | No โ transforms but does not block |
This is the central conceptual distinction for Task 1.5. The exam will present scenarios and ask: which approach guarantees compliance? The answer is always hooks, never prompts.
| Dimension | Prompt-Based Instructions | SDK Hook Enforcement |
|---|---|---|
| Compliance rate | Probabilistic โ non-zero failure rate even with perfect prompts | Deterministic โ guaranteed 100% of the time |
| Can user bypass? | Yes โ persuasive context or jailbreak attempts may succeed | No โ hook fires in your code, outside Claude's reasoning |
| Audit trail | Weak โ depends on Claude's output text | Strong โ hook logs every check with timestamp |
| Implementation layer | System prompt / user prompt | Application code (SDK hook callbacks) |
| Best for | Soft guidelines, stylistic preferences | Financial thresholds, regulatory compliance, irreversible actions |
| Failure mode | Claude reasons around instruction given compelling context | Bug in hook logic โ false block or false pass |
When a scenario asks: "What is the most reliable way to ensure refunds above $500 always require human approval?" โ the correct answer is always a PreToolCall hook that intercepts process_refund calls and checks the amount before execution. A system prompt instruction saying "never approve refunds above $500 without human review" is explicitly wrong because it has a non-zero failure rate.
When your agent uses multiple MCP tools from different backends, those tools inevitably return data in heterogeneous formats. Dates might be Unix timestamps from one tool and ISO 8601 strings from another. Status codes might be numeric integers from a legacy system and descriptive strings from a modern API. Without normalization, Claude must reason across these inconsistencies โ increasing error risk and token usage.
The PostToolUse hook fires after every tool call, receiving the raw tool result before Claude processes it. It transforms the output into a normalized format and returns the cleaned version to the agentic loop.
Tool A returns 1711929600. Tool B returns "2024-04-01T12:00:00Z". Claude must compare them โ but they're incomparable without normalization. Hook converts all timestamps to ISO 8601.
Legacy system returns {"status": 2}. Modern system returns {"status": "shipped"}. A PostToolUse hook maps 2 โ "shipped" so Claude always sees human-readable strings.
One MCP tool returns "April 1, 2024". Another returns "01/04/2024". Another returns "2024-04-01". Hook normalizes all to ISO 8601 before Claude sees any of them.
from anthropic_agent_sdk import AgentHook, ToolResultContext import datetime # Status code mapping for legacy order management system ORDER_STATUS_MAP = { 0: "pending", 1: "confirmed", 2: "shipped", 3: "delivered", 4: "cancelled", 5: "refunded" } class DataNormalizationHook(AgentHook): """PostToolUse hook โ normalizes heterogeneous data from multiple MCP tools.""" async def post_tool_use(self, ctx: ToolResultContext): tool_name = ctx.tool_name result = ctx.result # Raw tool output (dict) # โโ Normalize lookup_order results โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ if tool_name == "lookup_order": # 1. Convert Unix timestamp โ ISO 8601 if "created_at" in result and isinstance(result["created_at"], int): result["created_at"] = datetime.datetime.utcfromtimestamp( result["created_at"] ).isoformat() + "Z" # 2. Translate numeric status code โ human-readable string if "status" in result and isinstance(result["status"], int): result["status"] = ORDER_STATUS_MAP.get( result["status"], f"unknown_status_{result['status']}" ) # โโ Normalize get_customer results โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ if tool_name == "get_customer": # Normalize "April 1, 2024" or "01/04/2024" โ "2024-04-01" if "member_since" in result: result["member_since"] = _normalize_date(result["member_since"]) # Return normalized result โ Claude will now reason on consistent data return ctx.with_result(result) def _normalize_date(raw: str) -> str: """Attempt to parse multiple date formats and return ISO 8601.""" for fmt in ("%B %d, %Y", "%d/%m/%Y", "%Y-%m-%d"): try: return datetime.datetime.strptime(raw, fmt).strftime("%Y-%m-%d") except ValueError: continue return raw # Return original if unparseable
PostToolUse hooks should normalize data silently from Claude's perspective โ the agent sees clean data and never knows transformation occurred. But your hook should log every transformation with the original and normalized values for debugging. This creates an audit trail and makes it easy to identify when a new tool returns an unexpected format that your hook doesn't yet handle.
PreToolCall hooks intercept outgoing tool calls before the tool executes. This is the correct mechanism for enforcing business rules that must never be bypassed regardless of what Claude reasons. The hook receives the tool name and input parameters, checks against your policy, and either allows the call to proceed or returns an error as a tool_result back to Claude.
The exam guide states: "Implementing tool call interception hooks that block policy-violating actions (e.g., refunds exceeding $500) and redirect to alternative workflows (e.g., human escalation)."
This is the primary test scenario: a customer requests a refund. Claude calls process_refund(amount=750). The PreToolCall hook fires, checks amount > 500, blocks the call, and returns an error that redirects to escalate_to_human.
from anthropic_agent_sdk import AgentHook, ToolCallContext # Business policy constants โ single source of truth REFUND_AUTO_APPROVAL_LIMIT = 500.00 # USD BULK_CANCEL_ITEM_LIMIT = 10 class ComplianceEnforcementHook(AgentHook): """PreToolCall hook โ enforces business rules before any tool executes.""" async def pre_tool_call(self, ctx: ToolCallContext): tool_name = ctx.tool_name tool_input = ctx.tool_input # The parameters Claude wants to pass # โโ Rule 1: Refunds above threshold require human approval โโโโโโ if tool_name == "process_refund": amount = tool_input.get("amount", 0) if amount > REFUND_AUTO_APPROVAL_LIMIT: # Block the refund and tell Claude WHY + what to do instead return ctx.block( error=f"POLICY_VIOLATION: Refund of ${amount:.2f} exceeds the " f"${REFUND_AUTO_APPROVAL_LIMIT:.2f} auto-approval limit. " f"Use escalate_to_human with reason='large_refund' and " f"include: customer_id, refund_amount={amount}, " f"root_cause, and recommended_action in the handoff." ) # โโ Rule 2: Bulk cancellations above limit require human review โ if tool_name == "cancel_orders": order_ids = tool_input.get("order_ids", []) if len(order_ids) > BULK_CANCEL_ITEM_LIMIT: return ctx.block( error=f"POLICY_VIOLATION: Bulk cancellation of {len(order_ids)} orders " f"exceeds the {BULK_CANCEL_ITEM_LIMIT} order limit for automated " f"processing. Escalate to human for review." ) # โโ Allow all other tool calls to proceed unmodified โโโโโโโโโโโโ return ctx.allow()
When a PreToolCall hook blocks a tool call, the SDK returns the error message as a tool_result in the conversation. Claude receives this, reads the error text, and decides its next action โ typically calling escalate_to_human as instructed in the error message. This is why actionable error messages matter: they tell Claude exactly what to do next, preventing hallucinated alternative actions.
A well-designed PreToolCall hook doesn't just block โ it redirects Claude to the correct alternative workflow. The error message returned as tool_result should be prescriptive, not just descriptive. Compare these two approaches:
"POLICY_VIOLATION: Refund amount exceeds limit."
Claude reads this, doesn't know what to do, and may attempt a workaround โ breaking the refund into two smaller calls, or hallucinating an explanation to the customer.
"POLICY_VIOLATION: Refund of $750 exceeds $500 auto-limit. Use escalate_to_human with reason='large_refund'. Include: customer_id, refund_amount=750, root_cause, recommended_action."
Claude now knows the exact next action and the exact handoff fields required.
The exam will present scenarios and ask you to select the correct enforcement mechanism. Use this decision framework:
| Scenario | Use Hooks? | Reasoning |
|---|---|---|
| Refunds above $500 must go to human approval | โ Yes โ PreToolCall | Financial threshold โ zero-failure-rate required. Prompt has non-zero failure rate. |
| Normalize Unix timestamps from legacy tool | โ Yes โ PostToolUse | Systematic format issue across all calls โ hook normalizes once, not per-prompt. |
| Claude should use a formal, professional tone | โ No โ Prompt | Stylistic preference, non-critical. Probabilistic compliance acceptable. |
| Cancellations of more than 10 orders require review | โ Yes โ PreToolCall | Business rule with regulatory implications. Must be deterministic. |
| Agent should summarise results concisely | โ No โ Prompt | Formatting preference โ acceptable to use prompt guidance. |
| KYC must complete before account funding | โ Yes โ PreToolCall | Regulatory compliance gate โ cannot be bypassed. Use hook + database check. |
| Tool returns ISO 8601 already โ no normalization needed | No hook needed | Homogeneous format โ PostToolUse hook adds overhead without benefit. |
If the business rule must NEVER be violated, regardless of Claude's reasoning โ Use a hook.
If the preference is directional but some deviation is acceptable โ Use a prompt instruction.
The litmus test: "Would I accept 99% compliance or must it be 100%?" 100% โ hook. <100% โ prompt.
Writing "Never process refunds above $500 without human approval" in the system prompt. A persuasive user message can override this. The exam explicitly tests this anti-pattern.
PostToolUse hooks transform data โ they don't block. If you need to block based on a tool's result, you need a different pattern (e.g., a validation step or a second hook). Confusing PostToolUse and PreToolCall is a tested anti-pattern.
"Policy violation." โ Claude reads this and has no idea what policy was violated or what to do next. This causes Claude to hallucinate an explanation or attempt workarounds.
Adding a PostToolUse hook to normalize data that is already in a consistent format adds latency and complexity. Only normalize when multiple tools actually return heterogeneous formats.
Include: what was violated, the exact threshold, the correct alternative action, and the required handoff fields. Claude uses this to take the right next action immediately.
Define thresholds (REFUND_LIMIT = 500) as named constants at the top of your hook, not as magic numbers inline. When policy changes, update in one place.
Log all block/allow decisions with tool name, input values, policy checked, and outcome. This creates an immutable compliance audit trail โ critical for financial regulations.
Test with amount=499.99 (should allow), amount=500.00 (check your > vs >= boundary), and amount=500.01 (should block). Off-by-one errors in compliance rules are production incidents.
The primary exam scenario: "You are building a customer support resolution agent with tools: get_customer, lookup_order, process_refund, escalate_to_human. The agent connects to legacy and modern MCP backends that return data in inconsistent formats. Your company policy requires human approval for refunds above $500."
Questions test: (1) which hook type normalizes tool output before Claude sees it (PostToolUse), (2) which hook type blocks policy-violating tool calls (PreToolCall), (3) why a prompt instruction alone is insufficient for the $500 rule (non-zero failure rate), and (4) what a block error message should contain to correctly redirect Claude (exact alternative action + handoff fields).
tool_result that Claude reads to determine its next action.process_refund when amount > $500 and redirecting to escalate_to_human. The block error message must specify: the reason, the threshold violated, the exact alternative tool to call, and the handoff fields required (customer_id, refund_amount, root_cause, recommended_action).