If Domain 1 is the skeleton of an agentic system, Domain 2 is its nervous system. Tools are how Claude perceives and acts upon the world. Without well-designed tool definitions and crystal-clear descriptions, even the most robust architecture will fail due to unreliable tool selection. This guide covers every dimension of Task 2.1: how Claude selects tools, the anatomy of a production-grade description, naming alignment, single responsibility, system prompt interference, and the danger of over-provisioning.
Imagine a hospital emergency department with three check-in windows: "General Admissions," "Cardiology Emergencies," and "Trauma Bay." If all three windows simply had a sign reading "Patient Check-In" with no further description, incoming patients would queue at random windows, causing dangerous delays when a heart attack patient lands at General Admissions.
A well-designed triage system gives each window a precise sign: "Cardiology Emergencies — Chest pain, shortness of breath, suspected heart attack. NOT for broken bones or lacerations — those go to Trauma Bay."
Tool descriptions serve exactly this function. Claude reads them to make routing decisions. Vague descriptions create dangerous "wrong-window" tool calls in production systems.
This analogy maps directly to the exam scenario: a Customer Support Resolution Agent with tools get_customer, lookup_order, process_refund, and escalate_to_human. If get_customer and lookup_order both say "retrieves information," Claude may call either one for any query about customers or orders, with no reliable routing guarantee.
Claude's tool selection is not based on function name alone, nor on code-level metadata. It is driven entirely by natural language understanding of the tool description field. When Claude receives a request, it performs this reasoning process:
tool_use block for the selected tool(s).Whenever an exam question asks "Why is Claude calling get_customer when a user asks about their order status?" or "Why is Claude misrouting between two similar tools?", the correct answer is almost always: ambiguous or insufficiently differentiated tool descriptions that fail to establish a clear boundary between the two tools.
The fix is always to enhance descriptions — not to add more tools, not to modify the system prompt's persona, and not to add iteration caps.
Think of the tool description as a legal contract between the tool author and the model. It must specify:
Minimal descriptions break this contract. A description of "Retrieves customer information" is unenforceable — Claude cannot reliably determine if "What is the status of my order?" falls within that scope or not.
A production-grade tool description must contain four elements. Missing any one of them degrades reliability:
| Element | Purpose | Example |
|---|---|---|
| 1. Exact Purpose | Unambiguous explanation of what the tool accomplishes — in one tight sentence. | "Retrieves a customer's account profile and contact details by their unique customer ID." |
| 2. Input Format | Precise format requirements with a concrete example identifier, not just a type declaration. | "customer_id must be in the format CUST-XXXXX (e.g., CUST-00123). Do not pass email addresses or order IDs to this tool." |
| 3. Edge Cases & What It Returns | What is returned on success and what happens on partial data — prevents misinterpretation of empty results. | "Returns full account object. If ID is not found, returns {found: false} — this is a valid empty result, not an error." |
| 4. Explicit Boundary Guidance | Negative constraint telling Claude when NOT to use this tool, and pointing to the correct tool instead. | "Do NOT use this tool to look up order status or refund details. For orders, use lookup_order. For refunds, use process_refund." |
Of the four elements, Explicit Boundary Guidance is most commonly missing and most commonly the root cause of misrouting. Claude has no automatic understanding of what a tool does not do. You must state it explicitly. Cross-tool disambiguation ("use X for Y, use Z for W") is what converts ambiguous descriptions into reliable routing.
Claude reads both the function name and the description when making tool selection decisions. The name is the first signal Claude encounters. If the name is generic (e.g., analyze_content), Claude must rely entirely on the description for disambiguation. But when the name is misleading relative to the description, it creates cognitive friction — and LLMs fail under cognitive friction.
The solution is Name-Description Alignment: ensure the tool name itself communicates the precise purpose, so Claude can often make a correct initial guess before even finishing reading the description.
| Generic Name (Avoid) | Aligned Name (Use) | Why It Helps |
|---|---|---|
| analyze_content | extract_web_results | Signals the scope (web) and the action (extract) before the description is read. |
| analyze_document | summarize_uploaded_pdf | Differentiates from web-content tools and specifies the file type and action. |
| query_database | lookup_customer_by_email | Constrains the lookup pattern to a specific entity and key type. |
| process_item | process_refund_for_order | Makes the action and target entity explicit, preventing use for non-refund payment operations. |
A reliable naming pattern is Verb + Noun + Qualifier: what action? on what entity? with what constraint? For example: lookup_order_by_id (lookup=verb, order=noun, by_id=qualifier). This pattern alone eliminates 80% of name-level ambiguity before the description is even read.
When an exam question asks "what is the BEST fix for tool misrouting between analyze_content and analyze_document?", the correct answer combines both renaming AND re-describing — renaming to extract_web_results and updating the description to explicitly state it is for web-sourced content only. Renaming alone, or re-describing alone, is the incomplete answer that appears as a distractor.
A tool that attempts to serve multiple different use cases creates decision paralysis. Consider a single analyze_document tool that: extracts structured data, summarizes content, AND verifies claims against source material. Claude must infer from user context which job to perform, and inference errors compound across multi-step workflows.
The principle of Single Responsibility from software engineering applies directly: each tool should do exactly one thing, expressed precisely in its name and description. This enables:
Replace one generic analyze_document tool with three purpose-specific tools:
extract_data_points — returns structured key-value pairssummarize_content — returns a concise prose summaryverify_claim_against_source — returns a boolean verdict with evidenceClaude can now reliably distinguish between these tools because they have non-overlapping purposes, names, and schemas.
For cases where you cannot decompose a tool (e.g., a general-purpose API you don't control), use a constrained wrapper tool. Instead of exposing fetch_url directly to the agent, expose load_internal_document that internally calls fetch_url but validates the URL against your document whitelist. The wrapper tool has a restricted, precise description. Claude sees the constrained tool, not the raw capability.
# Instead of exposing fetch_url directly: # description: "Fetches content from any URL" <-- TOO BROAD # Expose a constrained wrapper: tools = [{ "name": "load_internal_document", "description": ( "Loads the full text of an internal company document by its document URL. " "ONLY accepts URLs from the internal document store (docs.example.com). " "Returns the full document text. " "Do NOT use this for external web pages or APIs — use web_search for those. " "Document URL format: https://docs.example.com/<doc-id>" ), "input_schema": { "type": "object", "properties": { "document_url": { "type": "string", "description": "The internal document URL (must start with https://docs.example.com/)" } }, "required": ["document_url"] } }]
Claude reads the system prompt and the tool descriptions together as a unified context. This creates a subtle but dangerous risk: keywords in the system prompt can create unintended associations with specific tool names, biasing Claude toward calling certain tools even when they are not appropriate.
Consider this system prompt fragment:
"You are a customer service agent. When customers mention documents, always analyze them thoroughly before responding. Make sure you analyze every document the customer provides."
If a tool named analyze_document exists in the toolkit, the repeated word "analyze" in the system prompt artificially biases Claude toward invoking that tool — even when the user simply says "I have a document about your return policy" and just wants a text response, not a document analysis.
| Type | Example | Fix |
|---|---|---|
| Keyword Echo | System prompt says "analyze" → biases toward analyze_document |
Replace generic verbs in system prompt with specific phrases: "review the customer's uploaded file" instead of "analyze documents" |
| Entity Overlap | System prompt says "look up account details" → could match both get_customer and lookup_order |
Add negative constraints to tool descriptions that explicitly call out the overlap |
| Always-Do Instructions | "Always search before answering" → causes web_search calls even for factual memory questions | Qualify always-do instructions: "Always search when the user asks about current events or real-time data" |
After writing your system prompt, list all verbs and nouns. Check each one against your tool names. Any word in the system prompt that appears in a tool name or description creates a potential unintended association. Either rename the tool to reduce overlap, or qualify the system prompt instruction to specify exactly when the behavior applies.
A counterintuitive but exam-critical fact: providing an agent with more tools generally makes it LESS reliable. This violates the intuition that "more capability = more performance." Here is why it fails:
The exam guide explicitly states that 4-5 focused tools per agent is the target for reliable tool selection. If an agent has significantly more than 5 tools, you must reassess: Does it need all those tools for its specific role? Can the agent be split into more specialized subagents, each with its own constrained tool set?
Instead of giving all agents all tools, implement scoped tool access:
| Agent Role | Allowed Tools | Cross-Role Tools (Limited) |
|---|---|---|
| Search Agent | web_search, crawl_page, filter_results | verify_fact (shared) |
| Analysis Agent | extract_data_points, summarize_content | verify_fact (shared) |
| Synthesis Agent | merge_findings, generate_report | verify_fact (shared) |
| CS Agent | get_customer, lookup_order, process_refund, escalate_to_human | None |
The verify_fact cross-role tool is an example of a scoped cross-role tool: a shared utility needed frequently by multiple agents, exposed selectively rather than to all agents wholesale.
Below are side-by-side implementations of tool definitions — the minimal version that causes misrouting, and the production-grade version used in the customer support scenario.
# ❌ ANTI-PATTERN: Minimal descriptions with no boundaries tools_minimal = [ { "name": "get_customer", "description": "Retrieves customer information", # TOO VAGUE "input_schema": { "type": "object", "properties": { "identifier": {"type": "string"} # No format spec }, "required": ["identifier"] } }, { "name": "lookup_order", "description": "Retrieves order details", # Near-identical to get_customer "input_schema": { "type": "object", "properties": { "identifier": {"type": "string"} # Same schema name = maximum ambiguity }, "required": ["identifier"] } } ] # Result: Claude cannot reliably distinguish these tools for queries like # "What is the status of my account?" — will misroute unpredictably.
# ✅ PRODUCTION PATTERN: Complete descriptions with boundary guidance tools_production = [ { "name": "get_customer", "description": ( "Retrieves a customer's account profile from the CRM by their unique customer ID. " "Returns: name, email, phone, account status, and subscription tier. " "customer_id format: CUST-XXXXX (example: CUST-00123). " "Returns {found: false} if the customer ID does not exist — this is a VALID result, not an error. " "Use this tool ONLY for account-level queries (profile, status, contact info). " "Do NOT use for order status, refund history, or payment details — use lookup_order instead." ), "input_schema": { "type": "object", "properties": { "customer_id": { # Specific name, not generic "identifier" "type": "string", "description": "Customer ID in format CUST-XXXXX" } }, "required": ["customer_id"] } }, { "name": "lookup_order", "description": ( "Retrieves a specific order record from the order management system by its order ID. " "Returns: line items, quantities, prices, shipment status, tracking number, and expected delivery date. " "order_id format: ORD-XXXXX (example: ORD-98765). " "Returns {found: false} for non-existent orders — this is a valid empty result, not an error. " "Use this tool ONLY for order-level queries (status, tracking, items, shipping). " "Do NOT use for customer account or profile queries — use get_customer for those." ), "input_schema": { "type": "object", "properties": { "order_id": { # Distinct from customer_id — no overlap "type": "string", "description": "Order ID in format ORD-XXXXX" } }, "required": ["order_id"] } }, { "name": "process_refund", "description": ( "Initiates a monetary refund for a specific order. Requires BOTH a verified customer ID AND order ID. " "PREREQUISITE: get_customer must succeed before this tool can be used — you must have a verified CUST-XXXXX. " "refund_amount must be a positive float in USD (e.g., 29.99). Do not include currency symbols. " "Returns: refund confirmation ID and estimated processing time. " "Do NOT use if the customer account is in suspended status — escalate instead." ), "input_schema": { "type": "object", "properties": { "customer_id": {"type": "string", "description": "Verified CUST-XXXXX from get_customer"}, "order_id": {"type": "string", "description": "Order ID from lookup_order (ORD-XXXXX)"}, "refund_amount": {"type": "number", "description": "Amount in USD, e.g. 29.99"}, "reason": {"type": "string", "description": "Brief reason for refund"} }, "required": ["customer_id", "order_id", "refund_amount", "reason"] } } ]
Writing descriptions like "retrieves information" or "processes the request." These provide no basis for disambiguation and are the root cause of tool misrouting.
Describing what a tool DOES without stating what it does NOT do. Without negative constraints, Claude will use any tool that seems plausible for a given context.
Using identifier or id as parameter names across multiple tools. Use specific names like customer_id and order_id to reinforce the boundary at the schema level.
Giving a single agent access to every available tool "just in case." This degrades selection reliability dramatically. Scope each agent to 4-5 focused tools.
Tools that "analyze OR summarize OR verify" based on an internal mode parameter. Each purpose needs its own dedicated tool with its own name and schema.
System prompts that use verbs echoing tool names without qualifiers (e.g., "always analyze"). These create unintended tool bias that overrides well-written descriptions.
Every production tool description includes: exact purpose, input format with example, edge case handling, and explicit boundary guidance with pointers to alternative tools.
Tool names use Verb+Noun+Qualifier pattern: lookup_order_by_id, extract_web_results. The name communicates purpose before the description is read.
Each agent receives only the 4-5 tools relevant to its defined role. Cross-role tools are shared selectively for high-frequency needs only.
Customer Support: Agent has get_customer, lookup_order, process_refund, escalate_to_human. Questions test whether you can identify why Claude is misrouting (answer: overlapping descriptions) and prescribe the fix (answer: add boundary guidance and rename tools).
Multi-Agent Research: A synthesis agent is incorrectly calling web_search tools. Questions test whether you can identify why (answer: over-provisioning — synthesis agent has too many tools) and how to fix (answer: restrict synthesis agent to synthesis-only tools, add cross-role verify_fact as limited shared tool).
analyze_content → extract_web_results). The Verb+Noun+Qualifier naming pattern reduces ambiguity before Claude even reads the description. Renaming + re-describing is always better than either alone.analyze_document) into purpose-specific tools (extract_data_points, summarize_content, verify_claim_against_source). Use constrained wrapper tools when you can't modify underlying APIs.customer_id, order_id) rather than generic ones (identifier) across similar tools. Schema-level differentiation works in parallel with description-level differentiation.