Day 4 of 90 🌱 Spark Phase MCP Tools

Tools in Depth: Design, Errors & Patterns

Tools are the most-used MCP primitive. The model calls them, you implement them. Today you go beyond the basics: how to design tools the model actually uses well, how to handle every error scenario correctly, and the real-world patterns that separate prototype servers from production-grade ones.

Day 3 gave you the skeleton. Today you add the nervous system. A poorly designed tool is worse than no tool: the model will call it incorrectly, misinterpret results, and generate bad outputs. Tool design is product design: your user is an AI, and you're writing the UX for it.
01: Design Principles

Tool Design Principles

The model cannot read your mind. It only knows what you tell it through three channels: the tool name, the description, and the input schema. If any of these is ambiguous, incomplete, or misleading, the model will make wrong calls, and it won't know it made them. Designing tools well is therefore the highest-leverage skill in MCP development.

Think of the model as a very capable intern on their first day. They're smart, they read documentation carefully, and they follow instructions precisely. But they have no context beyond what's written. Your tool name, description, and schema are their only training material.

1 task per tool
≤7 params recommended
verb_noun naming convention
always describe every param

🎯
Single Responsibility
One tool does one thing. A tool that searches, then filters, then formats, then emails is four tools pretending to be one. The model can chain tool calls, so let it.
Bad: process_and_send_report()
Good: search_records(), format_report(), send_email()

📝
Descriptive Names
Use verb_noun snake_case. The name should be a self-contained description of the action. Avoid abbreviations, internal jargon, or generic names like do_thing or run.
Bad: run(), exec(), process()
Good: search_github_repos(), create_calendar_event()

🔍
Rich Descriptions
The description field is your primary communication channel to the model. Write 2 to 4 sentences. Explain what it does, what it returns, and, crucially, when to use it instead of similar tools.
Include: purpose, return shape, when to use, caveats

🛡️
Validate Everything
Never trust that the model will send perfectly formed inputs. Use Zod constraints (.min(), .max(), .regex()) to reject bad values early, before your handler logic runs.
Validate at the schema level, not inside the handler body

↩️
Idempotency Where Possible
The model may retry a tool if it thinks the call failed (e.g., network timeout but server succeeded). Design read tools to be fully idempotent. Flag destructive writes explicitly in their description.
GET = always safe
POST/DELETE = warn in description

📦
Informative Returns
Return enough context for the model to understand and act on the result. Don't return raw API blobs: filter, format, summarize. But don't over-truncate either; the model needs the signal.
Return: result + key metadata
Skip: internal IDs, raw HTML
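As a minimal sketch of the "Informative Returns" principle, the function below trims a raw API payload down to the fields the model needs. The RawIssue shape and the summarizeIssues name are assumptions made for illustration; they do not come from any real API.

```typescript
// Hypothetical raw record, as an upstream API might return it.
interface RawIssue {
  id: number;        // internal database ID, not useful to the model
  number: number;
  title: string;
  state: "open" | "closed";
  labels: string[];
  body_html: string; // raw HTML, noisy and token-heavy
}

// Keep the signal (number, title, state, labels); drop internal IDs and HTML.
function summarizeIssues(issues: RawIssue[], limit = 5): string {
  const shown = issues.slice(0, limit);
  const lines = shown.map(i => {
    const labels = i.labels.length > 0 ? ` [${i.labels.join(", ")}]` : "";
    return `#${i.number} ${i.title} (${i.state})${labels}`;
  });
  // Tell the model when the list was cut short so it can ask for more.
  const hidden = issues.length - shown.length;
  if (hidden > 0) lines.push(`...and ${hidden} more not shown`);
  return lines.join("\n");
}

const sample: RawIssue[] = [
  { id: 9001, number: 12, title: "Crash on login", state: "open", labels: ["bug"], body_html: "<p>...</p>" },
  { id: 9002, number: 13, title: "Add dark mode", state: "open", labels: [], body_html: "<p>...</p>" },
];
const summary = summarizeIssues(sample, 1);
```

The returned string would go into the text field of a content block. Noting how many items were omitted is one way to avoid over-truncating silently.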
⚠️
The description is read by the model, not just humans

Unlike code comments, tool descriptions are parsed and reasoned over by the LLM at inference time. Write them as if writing an API doc for an intelligent system. Be precise, avoid vague language, and call out edge cases. "Fetches user data" is inadequate. "Retrieves a user profile by user ID. Returns name, email, role, and account status. Returns a tool error if the user ID does not exist." is correct.

Here's the difference between a tool the model uses correctly vs one it struggles with:

✗ POOR TOOL DESIGN
// Name: too generic
server.tool(
  "search",
  "Search for things",
  { q: z.string() },
  async ({ q }) => {
    const result = await searchBackend(q); // hypothetical backend call
    // returns raw API response
    return {
      content: [{ type: "text" as const, text: JSON.stringify(result) }]
    };
  }
);

✓ WELL-DESIGNED TOOL
// Name: specific action + domain
server.tool(
  "search_github_issues",
  "Search open GitHub issues in a repo. Returns issue number, title, labels, and created date. Use this when the user asks about bugs or feature requests.",
  {
    repo: z.string().describe("owner/repo format"),
    query: z.string().describe("search terms"),
    limit: z.number().int().min(1).max(20).default(5).describe("max issues to return")
  },
  handler
);
02: Schema Mastery

Input Schema Mastery

Day 3 covered the Zod basics. Now we go deeper: real tools need complex schemas such as nested objects, union types, optional fields with defaults, and discriminated unions for multi-mode tools. Each pattern has a specific use case.

Nested Objects
Use when parameters have logical groupings
const schema = {
  location: z.object({
    city: z.string(),
    country: z.string().length(2).describe("ISO 3166-1 alpha-2")
  }).describe("Target location"),
  options: z.object({
    units: z.enum(["metric", "imperial"]),
    days: z.number().int().min(1).max(7)
  }).optional()
};

Union Types
When a param can be one of several shapes
const schema = {
  // Accept string OR number for flexible IDs
  id: z.union([z.string(), z.number()])
    .describe("User ID (string or int)"),
  // Date as ISO string OR timestamp
  date: z.union([
    z.string().datetime(),
    z.number().int().positive()
  ]).describe("ISO 8601 or Unix timestamp")
};

Discriminated Unions
Multi-mode tools with different param sets per mode
const schema = {
  action: z.discriminatedUnion("type", [
    z.object({
      type: z.literal("create"),
      name: z.string(),
      template: z.string().optional()
    }),
    z.object({
      type: z.literal("delete"),
      id: z.string().uuid(),
      confirm: z.literal(true) // must be true
    })
  ])
};

Transforms & Refinements
Coerce, normalize, or add custom validation
const schema = {
  // Auto-trim whitespace
  query: z.string().trim().min(2),
  // Transform: lowercase the tag
  tag: z.string().transform(s => s.toLowerCase()),
  // Custom validation with message; endsWith (not includes) so that
  // "x@company.com.evil.org" is rejected
  email: z.string().email().refine(
    v => v.endsWith("@company.com"),
    { message: "Must be company email" }
  )
};
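To see why the discriminated-union pattern pays off inside a handler, here is a dependency-free sketch using plain TypeScript types that mirror the create/delete schema above. The describeAction function is hypothetical; in a real server the Action type would typically be derived from the Zod schema rather than written by hand.

```typescript
// Plain TypeScript mirror of the create/delete schema (normally derived from Zod).
type Action =
  | { type: "create"; name: string; template?: string }
  | { type: "delete"; id: string; confirm: true };

// The `type` field is the discriminant: each case sees only its own fields.
function describeAction(action: Action): string {
  switch (action.type) {
    case "create":
      return `Create "${action.name}" from template ${action.template ?? "default"}`;
    case "delete":
      return `Delete ${action.id} (confirmed: ${action.confirm})`;
    default: {
      // Exhaustiveness check: fails to compile if a new mode is added but not handled.
      const unreachable: never = action;
      throw new Error(`Unhandled action: ${JSON.stringify(unreachable)}`);
    }
  }
}

const created = describeAction({ type: "create", name: "blog" });
const deleted = describeAction({ type: "delete", id: "abc-123", confirm: true });
```

Inside each case the compiler rejects access to fields from the other mode, so a "delete" branch cannot accidentally read name or template.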
💡
z.infer gives you the TypeScript type for free

Define your schema as a const, then derive its TypeScript type with type MyInput = z.infer<typeof MySchema>. You can use this type to annotate internal functions that process the input, keeping full type-safety throughout your handler chain without duplicating the type definition.

TypeScript: Handler receives a fully typed, validated object
const SearchSchema = z.object({
  query: z.string().trim().min(2).describe("Search query"),
  limit: z.number().int().min(1).max(50).default(10),
  filters: z.object({
    status: z.enum(["open", "closed", "all"]).default("open"),
    labels: z.array(z.string()).optional()
  }).optional()
});

// TypeScript type auto-derived, no manual annotation needed
type SearchInput = z.infer<typeof SearchSchema>;
// { query: string; limit: number; filters?: { status: "open"|"closed"|"all"; labels?: string[] } }

server.tool(
  "search_issues",
  "Search repository issues by query string with optional status filter.",
  SearchSchema.shape, // pass .shape to extract the object's field definitions
  async (args) => {
    // args is SearchInput: fully typed and already validated by Zod
    const results = await searchIssues(args.query, args.limit, args.filters);
    return { content: [{ type: "text" as const, text: formatResults(results) }] };
  }
);
03: Handler Patterns

Handler Patterns

A tool handler is an async function. Inside it, you can do anything Node.js can do: HTTP requests, file I/O, database queries, shell execution, computations. The key is structuring handlers so they're robust, readable, and the model gets actionable output regardless of what happens.

🍳

The Short-Order Cook

A tool handler is like a short-order cook: an order comes in (the tool call), you execute it as fast as possible, and you hand back a plate (the result). If you're out of an ingredient (data not found), you say so clearly. You don't disappear into the kitchen for 10 minutes; you keep the customer informed. Structure handlers the same way: fast, decisive, always returning something useful.

TypeScript: The canonical handler structure
async ({ userId, includeInactive }) => {
  // 1. Input pre-processing (if not done by Zod transforms)
  const id = userId.trim();

  // 2. Main operation: always await, always in try/catch
  let user;
  try {
    user = await db.users.findById(id);
  } catch (err: unknown) {
    // 3. Infrastructure errors → tool error (not protocol error)
    // A caught value is `unknown` in strict TypeScript, so narrow it before reading .message
    const msg = err instanceof Error ? err.message : String(err);
    return {
      content: [{ type: "text" as const, text: `Database error: ${msg}` }],
      isError: true
    };
  }

  // 4. Business logic validation
  if (!user) {
    return {
      content: [{ type: "text" as const, text: `User ${id} not found` }],
      isError: true
    };
  }

  // 5. Conditionally include data based on params
  const data = includeInactive ? user : filterActive(user);

  // 6. Format and return: structured, human-readable text
  return {
    content: [{
      type: "text" as const,
      text: [
        `User: ${data.name} (${data.email})`,
        `Role: ${data.role} | Status: ${data.status}`,
        `Created: ${data.createdAt.toISOString()}`
      ].join("\n")
    }]
  };
}
⚡
Parallelise independent async operations

If your handler needs to fetch from multiple independent sources, use Promise.all() instead of awaiting them one after another. A handler that makes 3 API calls sequentially at 200ms each takes 600ms. With Promise.all() it takes about 200ms, the duration of the slowest call. The model and user feel this latency directly.

TypeScript: Parallel fetching inside a handler
// The three snippets below are alternatives; use one of them, not all in the same scope.

// ❌ Sequential: 600ms total
const user = await fetchUser(id);
const repos = await fetchRepos(id);
const activity = await fetchActivity(id);

// ✅ Parallel: ~200ms total
const [user, repos, activity] = await Promise.all([
  fetchUser(id),
  fetchRepos(id),
  fetchActivity(id)
]);

// ✅ With individual error handling per source
const [userResult, reposResult] = await Promise.allSettled([
  fetchUser(id),
  fetchRepos(id)
]);
const user = userResult.status === "fulfilled" ? userResult.value : null;
const repos = reposResult.status === "fulfilled" ? reposResult.value : [];
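The Promise.allSettled() variant raises a follow-up question: what should the handler return when only some sources succeed? One possible answer is sketched below. The fetchUser and fetchRepos stubs, the getUserOverview name, and the choice to treat the user record as essential and the repos as optional are all assumptions for illustration.

```typescript
type ToolResult = { content: { type: "text"; text: string }[]; isError?: boolean };

// Hypothetical data sources: one succeeds, one fails.
const fetchUser = async (id: string) => ({ id, name: "Ada" });
const fetchRepos = async (_id: string): Promise<string[]> => {
  throw new Error("repos service unavailable");
};

async function getUserOverview(id: string): Promise<ToolResult> {
  // Both requests start immediately; neither failure rejects the combined promise.
  const [userResult, reposResult] = await Promise.allSettled([fetchUser(id), fetchRepos(id)]);

  // The user record is essential: without it, report a tool error.
  if (userResult.status === "rejected") {
    return {
      content: [{ type: "text", text: `Could not load user ${id}: ${String(userResult.reason)}` }],
      isError: true,
    };
  }

  // Repos are optional: degrade gracefully and tell the model what is missing.
  const reposLine =
    reposResult.status === "fulfilled"
      ? `Repos: ${reposResult.value.join(", ") || "none"}`
      : `Repos: unavailable (${reposResult.reason instanceof Error ? reposResult.reason.message : "unknown error"})`;

  return { content: [{ type: "text", text: `User: ${userResult.value.name}\n${reposLine}` }] };
}
```

Stating which part is unavailable, rather than silently returning an empty list, lets the model distinguish "this user has no repos" from "the repos lookup failed".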
04: Error Handling

Error Handling in Depth

MCP has two completely different error channels. Understanding the difference is critical: using the wrong one causes silent failures, confuses the model, or crashes the session. Most beginners mix them up.

Tool Errors
isError: true in CallToolResult
Business logic failures: user not found, API rate limit, file doesn't exist
The model sees the error message and can reason about it or retry differently
Delivered as a normal JSON-RPC result with no error field (HTTP 200 on HTTP transports), so the session stays alive
Use this for expected failure modes the model should handle
Response shape: { content: [...], isError: true }

Protocol Errors
Thrown exception from handler
Infrastructure failures: SDK converts uncaught throws to JSON-RPC error (-32603)
The client gets a JSON-RPC error response, so the model sees only "tool call failed"
Delivered as a JSON-RPC response whose envelope carries an error field instead of a result (still HTTP 200 on HTTP transports)
Use this only for truly unexpected bugs, not business logic
Session survives (SDK catches), but model gets less useful error info
🎯
The golden rule: use isError: true for anything the model should know about

"User not found", "rate limit exceeded", "invalid date range": these are all tool errors with isError: true. The model reads your error message, understands what went wrong, and can adjust. Throwing an exception gives the model a generic "internal error" with no actionable information. Reserve uncaught throws for bugs in your own code.

TypeScript: Complete error handling matrix
// ✅ Business logic error: model gets the message, can retry/adjust
if (!user) {
  return {
    content: [{ type: "text" as const, text: `No user found with ID "${id}". Try searching by email instead.` }],
    isError: true
  };
}

// ✅ External API error: surface the API's message to the model
try {
  const data = await weatherAPI.fetch(city);
  return { content: [{ type: "text" as const, text: formatWeather(data) }] };
} catch (err: unknown) {
  const msg = err instanceof Error ? err.message : "Unknown error";
  return {
    content: [{ type: "text" as const, text: `Weather API error: ${msg}. Try a different city name.` }],
    isError: true
  };
}

// ✅ Rate limiting: guide the model on what to do next
if (response.status === 429) {
  const retryAfter = response.headers.get("retry-after") ?? "60";
  return {
    content: [{ type: "text" as const, text: `Rate limit reached. Retry after ${retryAfter} seconds.` }],
    isError: true
  };
}
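The snippets above repeat the same result shape several times. One way to cut the repetition is a pair of small builders plus a wrapper that turns any thrown exception into a tool error. This is a sketch, not part of the MCP SDK: ok, fail, and withToolErrors are hypothetical names, and ToolResult is a simplified local stand-in for the SDK's result type.

```typescript
type ToolResult = { content: { type: "text"; text: string }[]; isError?: boolean };

// Small builders so every handler returns the same shapes.
const ok = (text: string): ToolResult => ({ content: [{ type: "text", text }] });
const fail = (text: string): ToolResult => ({ content: [{ type: "text", text }], isError: true });

// Wraps a handler so that any thrown exception becomes a readable tool error
// instead of surfacing as a generic protocol-level failure.
function withToolErrors<A>(
  label: string,
  handler: (args: A) => Promise<ToolResult>
): (args: A) => Promise<ToolResult> {
  return async (args: A) => {
    try {
      return await handler(args);
    } catch (err: unknown) {
      const msg = err instanceof Error ? err.message : String(err);
      return fail(`${label} failed: ${msg}`);
    }
  };
}

// Usage with a hypothetical handler that throws for unknown cities.
const getWeather = withToolErrors("get_weather", async ({ city }: { city: string }) => {
  if (city !== "London") throw new Error(`no data for "${city}"`);
  return ok("London: 22°C, sunny");
});
```

Whether to wrap every handler this way is a design choice: it guarantees the model always gets a message, at the cost of also converting genuine bugs into tool errors. Logging the caught error before returning keeps those bugs visible.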
05: Multi-content Responses

Multi-content Responses

A tool's content array can contain multiple items of different types; you're not limited to one text block. This lets you return rich, mixed responses: a summary text plus a chart image, or a description plus an embedded resource for further inspection.

📝
TextContent
Plain text or Markdown. The model reads this, reasons over it, and uses it in its response. Most common type.
type: "text"
text: string

🖼️
ImageContent
Base64-encoded image data. Useful for charts, screenshots, generated graphics. A vision-capable model can analyze it visually.
type: "image"
data: string (base64)
mimeType: "image/png"

📦
EmbeddedResource
Embeds a resource inline as a URI plus its content. The host can offer to save or open it separately from the chat.
type: "resource"
resource: { uri, mimeType, text | blob }
TypeScript: Returning multiple content blocks
// Tool that returns a text summary + a JSON data resource
server.tool(
  "analyze_sales",
  "Analyze sales data for a date range. Returns a summary and the raw data as an embedded resource.",
  { from: z.string().date(), to: z.string().date() },
  async ({ from, to }) => {
    const data = await fetchSalesData(from, to);
    return {
      content: [
        // Block 1: Human-readable summary
        {
          type: "text" as const,
          text: [
            `Sales Analysis: ${from} to ${to}`,
            `Total Revenue: $${data.revenue.toLocaleString()}`,
            `Orders: ${data.orders} | Avg Order Value: $${data.aov}`,
            `Top Product: ${data.topProduct}`
          ].join("\n")
        },
        // Block 2: Embedded JSON resource for further processing
        {
          type: "resource" as const,
          resource: {
            uri: `sales://${from}/${to}`,
            mimeType: "application/json",
            text: JSON.stringify(data.rawRecords)
          }
        }
      ]
    };
  }
);
06: Tool Annotations

Tool Annotations & Metadata

The 2025-03-26 revision of the MCP specification introduced tool annotations: optional metadata hints that describe a tool's behavior to the host and model without changing its functional contract. These are hints, not enforced constraints, but they meaningfully improve how the host application presents your tools and how the model reasons about when to call them.

Annotations are passed to server.tool() as the fourth argument: a plain annotations object placed between the input schema and the handler, which always comes last.

Annotation | Type | Default | Meaning
readOnlyHint | boolean | false | Tool only reads data, makes no changes. Hosts may show a "safe" indicator.
destructiveHint | boolean | true | Tool may delete or irreversibly modify data. Hosts may require user confirmation.
idempotentHint | boolean | false | Calling the tool multiple times with the same inputs has the same effect as calling it once. Safe to retry.
openWorldHint | boolean | true | Tool interacts with external systems (internet, APIs). False means fully local/sandboxed.

TypeScript: Annotations in practice
// Read-only search tool: safe to call any number of times
server.tool(
  "search_documents",
  "Search the document index. Read-only, never modifies data.",
  { query: z.string() },
  {
    readOnlyHint: true,   // no side effects
    idempotentHint: true, // safe to retry
    openWorldHint: false  // queries local DB, no internet
  },
  handler
);

// Destructive tool: should trigger confirmation in the host
server.tool(
  "delete_project",
  "Permanently deletes a project and all associated data. This action cannot be undone.",
  { projectId: z.string().uuid() },
  {
    readOnlyHint: false,
    destructiveHint: true, // triggers confirmation UX in host
    idempotentHint: false  // deleting twice would fail on second call
  },
  handler
);
🔒
Annotations are trust hints, not security controls

A tool marked readOnlyHint: true is not sandboxed; the annotation is a declaration of intent. Nothing in the protocol prevents a read-only-annotated tool from writing to a database. The host uses these hints to build UX (confirmation dialogs, badges), but the actual safety guarantee comes from your handler's implementation. Annotate truthfully.
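To make the "hints, not controls" point concrete, here is a sketch of how a host might read the hints, applying the defaults from the table above when a field is missing. The needsConfirmation and safeToRetry policies are hypothetical; each real host decides for itself how, or whether, to act on annotations. The local ToolAnnotations interface simply mirrors the four fields listed in the table.

```typescript
// Annotation fields and their defaults as listed in the table above.
interface ToolAnnotations {
  readOnlyHint?: boolean;    // default false
  destructiveHint?: boolean; // default true
  idempotentHint?: boolean;  // default false
  openWorldHint?: boolean;   // default true
}

// Hypothetical host policy: ask the user before any call that might destroy data.
function needsConfirmation(a: ToolAnnotations = {}): boolean {
  const readOnly = a.readOnlyHint ?? false;
  const destructive = a.destructiveHint ?? true;
  // destructiveHint only matters when the tool is not read-only.
  return !readOnly && destructive;
}

// Hypothetical host policy: only auto-retry calls that are safe to repeat.
function safeToRetry(a: ToolAnnotations = {}): boolean {
  return (a.readOnlyHint ?? false) || (a.idempotentHint ?? false);
}
```

Under these defaults an unannotated tool is treated as potentially destructive and not retry-safe, which is why truthful annotations on read-only tools can reduce friction for the user.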

07: Real-World Patterns

Real-World Tool Patterns

Production MCP servers encounter situations that simple examples never show: API responses too large to return wholesale, slow operations that need progress hints, repeated calls that should hit a cache. Here are the three patterns you'll implement in nearly every real server.

📄
Pagination Pattern
// Problem: search_issues returns 500 results, too much for the context window
// Solution: add cursor-based pagination
server.tool(
  "list_issues",
  "List repository issues with pagination. Returns up to 'limit' issues. If 'nextCursor' is present in the result, pass it as 'cursor' to get the next page.",
  {
    repo: z.string(),
    limit: z.number().int().min(1).max(25).default(10),
    cursor: z.string().optional().describe("Pagination cursor from previous call")
  },
  async ({ repo, limit, cursor }) => {
    const page = await github.issues.list({ repo, per_page: limit, cursor });
    const lines = [
      `Issues in ${repo} (page of ${page.items.length}):`,
      ...page.items.map(i => `  #${i.number} ${i.title} [${i.state}]`)
    ];
    if (page.nextCursor) lines.push(`\nMore results available. Next cursor: ${page.nextCursor}`);
    return { content: [{ type: "text" as const, text: lines.join("\n") }] };
  }
);
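The tool above relies on the upstream API to produce cursors. When your own server owns the data, you have to produce them yourself. The sketch below shows one simple scheme over an in-memory array, where the cursor is just the next start offset as a string. That encoding, and the paginate and collectAll names, are assumptions for illustration; production APIs usually hand out opaque cursors.

```typescript
interface Page<T> {
  items: T[];
  nextCursor: string | undefined;
}

// Cursor here is the next start offset encoded as a string (an assumption for this sketch).
function paginate<T>(all: T[], limit: number, cursor?: string): Page<T> {
  const start = cursor === undefined ? 0 : Number.parseInt(cursor, 10);
  if (!Number.isInteger(start) || start < 0 || start > all.length) {
    throw new Error(`Invalid cursor: ${cursor}`);
  }
  const end = Math.min(start + limit, all.length);
  return {
    items: all.slice(start, end),
    // Leave nextCursor undefined on the last page so the caller knows to stop.
    nextCursor: end < all.length ? String(end) : undefined,
  };
}

// Walk every page the way a model would: call, read the cursor, call again.
function collectAll<T>(all: T[], limit: number): T[][] {
  const pages: T[][] = [];
  let cursor: string | undefined;
  do {
    const page = paginate(all, limit, cursor);
    pages.push(page.items);
    cursor = page.nextCursor;
  } while (cursor !== undefined);
  return pages;
}

const pages = collectAll(["a", "b", "c", "d", "e"], 2);
// pages is [["a", "b"], ["c", "d"], ["e"]]
```

In a tool handler, an invalid cursor would be better reported as a tool error with isError: true than thrown, so the model can restart from the first page.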
⚡
In-Memory Caching Pattern
// Problem: model calls get_user_profile 5 times per conversation
// Solution: simple TTL cache at module level
const cache = new Map<string, { data: unknown; expires: number }>();
const TTL_MS = 5 * 60 * 1000; // 5 minutes

function getCached<T>(key: string): T | null {
  const entry = cache.get(key);
  if (!entry || Date.now() > entry.expires) {
    cache.delete(key);
    return null;
  }
  return entry.data as T;
}

server.tool(
  "get_user_profile",
  "Get a user's profile. Results are cached for 5 minutes.",
  { userId: z.string() },
  async ({ userId }) => {
    let user = getCached<User>(userId);
    if (!user) {
      user = await fetchUserFromAPI(userId);
      cache.set(userId, { data: user, expires: Date.now() + TTL_MS });
    }
    return { content: [{ type: "text" as const, text: formatUser(user) }] };
  }
);
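A module-level Map that reads Date.now() directly is hard to unit test, because expiry depends on real time passing. A possible refinement, sketched below, wraps the same idea in a small class with an injectable clock. The TtlCache class and its method names are illustrative, not taken from any library.

```typescript
// Minimal TTL cache. The `now` function is injectable so expiry can be tested
// without real waiting.
class TtlCache<T> {
  private entries = new Map<string, { value: T; expires: number }>();
  private ttlMs: number;
  private now: () => number;

  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): T | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expires) {
      this.entries.delete(key); // evict stale entries lazily on read
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: T): void {
    this.entries.set(key, { value, expires: this.now() + this.ttlMs });
  }

  // Returns the cached value, or calls `load` and caches its result.
  async getOrLoad(key: string, load: () => Promise<T>): Promise<T> {
    const hit = this.get(key);
    if (hit !== undefined) return hit;
    const value = await load();
    this.set(key, value);
    return value;
  }
}
```

A handler would call something like profiles.getOrLoad(userId, () => fetchUserFromAPI(userId)). Note that two concurrent misses for the same key still trigger two loads; deduplicating in-flight requests is left out to keep the sketch short.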
🔄
Retry with Exponential Backoff
// Problem: external API occasionally returns 503 (transient)
// Solution: retry up to 3 times with exponential backoff
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 500
): Promise<T> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt === maxAttempts) throw err;
      // With 3 attempts there are 2 waits: 500ms, then 1000ms
      const delay = baseDelayMs * 2 ** (attempt - 1);
      await new Promise(r => setTimeout(r, delay));
    }
  }
  throw new Error("Unreachable");
}

server.tool(
  "fetch_report",
  "Fetch a report from the analytics API.",
  { reportId: z.string() },
  async ({ reportId }) => {
    try {
      const report = await withRetry(() => analyticsAPI.getReport(reportId));
      return { content: [{ type: "text" as const, text: formatReport(report) }] };
    } catch (err: unknown) {
      const msg = err instanceof Error ? err.message : String(err);
      return {
        content: [{ type: "text" as const, text: `Failed after 3 attempts: ${msg}` }],
        isError: true
      };
    }
  }
);
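The withRetry helper above retries every failure, including permanent ones such as a 404, which only adds latency. A variant that retries transient errors only might look like the sketch below. The makeStatusError helper, the status property on errors, and the rule that 429 and 5xx count as transient are assumptions for illustration; adapt the predicate to whatever your API client actually throws.

```typescript
// Hypothetical error shape: a normal Error with an HTTP-style status attached.
interface StatusError extends Error {
  status: number;
}

function makeStatusError(status: number, message: string): StatusError {
  return Object.assign(new Error(message), { status });
}

// Assumption for this sketch: 429 and 5xx are transient, everything else is permanent.
function isTransient(err: unknown): boolean {
  if (!(err instanceof Error) || !("status" in err)) return false;
  const status = (err as StatusError).status;
  return status === 429 || status >= 500;
}

async function retryTransient<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 500,
  // Injectable so tests can skip real waiting.
  sleep: (ms: number) => Promise<void> = ms => new Promise<void>(r => setTimeout(r, ms))
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      // Give up immediately on permanent errors or when attempts are exhausted.
      if (!isTransient(err) || attempt >= maxAttempts) throw err;
      await sleep(baseDelayMs * 2 ** (attempt - 1)); // 500ms, then 1000ms
    }
  }
}
```

Keep retries for reads and idempotent writes only; retrying a non-idempotent write after a timeout can apply it twice.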
08: Testing Tools

Testing Your Tools

A tool handler is just an async TypeScript function. You can unit test it in complete isolation: no MCP server, no transport, no Claude required. Extract the handler logic into a named function, test it directly, and only pass it to server.tool() in your main entry point.

TypeScript: Testable handler architecture
// handlers/weather.ts: pure handler function, no SDK dependency
// (weatherAPI and formatWeather are assumed to be imported from your own modules)
export async function getWeatherHandler(
  { city, units }: { city: string; units: "celsius" | "fahrenheit" },
  deps = { weatherAPI } // injectable dependency for testing
) {
  const data = await deps.weatherAPI.fetch(city);
  if (!data) {
    return { content: [{ type: "text" as const, text: "City not found" }], isError: true };
  }
  return { content: [{ type: "text" as const, text: formatWeather(data, units) }] };
}

// handlers/weather.test.ts: no MCP infrastructure needed
import { describe, it, expect } from "vitest";
import { getWeatherHandler } from "./weather.js";

describe("getWeatherHandler", () => {
  it("returns weather for valid city", async () => {
    const mockAPI = { weatherAPI: { fetch: async () => ({ temp: 22, condition: "sunny" }) } };
    const result = await getWeatherHandler({ city: "London", units: "celsius" }, mockAPI);
    expect(result.isError).toBeUndefined();
    expect(result.content[0].text).toContain("22");
  });

  it("returns isError for unknown city", async () => {
    const mockAPI = { weatherAPI: { fetch: async () => null } };
    const result = await getWeatherHandler({ city: "Atlantis", units: "celsius" }, mockAPI);
    expect(result.isError).toBe(true);
  });
});
🧪
Use Vitest, not Jest, for ESM TypeScript projects

MCP projects use ESM ("type": "module") and TypeScript. Jest has notoriously difficult ESM/TypeScript configuration. Vitest works natively with both: install with npm install --save-dev vitest and add "test": "vitest" to your scripts. No transform config, no babel, no --experimental-vm-modules flags.

09: Knowledge Check

Test Your Understanding

Five questions covering tool design, errors, and patterns from today.

Day 4: Tools in Depth Quiz

Select one answer per question, then submit to see your score.

Q1. A tool handler calls an external API that returns a 404 (resource not found). What should the handler return?
A. Throw a JavaScript Error; the SDK will convert it to a JSON-RPC error response
B. Return a CallToolResult with isError: true and a descriptive message; the model reads it and can adjust
C. Return an empty content array; the model will understand silence as failure
D. Return a success result with the 404 status code embedded in the text
✅ B is correct. A 404 is a business logic error: the model should know about it and reason over it. Return isError: true with a clear message. Throwing an exception gives the model a generic protocol error with less useful information. Always prefer tool errors for expected failure modes.
Q2. Which naming convention is best for MCP tool names?
A. camelCase like getWeather: matches JavaScript conventions
B. PascalCase like GetWeather: matches class naming
C. snake_case verb_noun like get_weather: specific action + domain, widely used in the MCP ecosystem
D. Free-form like weather: shorter names are clearer
✅ C is correct. The MCP convention and OpenAI function-calling convention both use snake_case verb_noun. It's explicit, readable by the model, and unambiguous. Short generic names like weather give the model no information about whether it reads, writes, or formats weather data.
Q3. What does the destructiveHint: true annotation tell the host application?
A. The tool is sandboxed and cannot make real changes
B. The tool may permanently modify or delete data; the host may show a confirmation prompt before allowing the call
C. The tool will crash the server if called incorrectly
D. The tool requires administrator privileges to run
✅ B is correct. destructiveHint: true signals to the host that this tool can cause irreversible changes. The host application (like Claude Desktop) may display a confirmation dialog before the model is allowed to call it. It's a UX hint, not a security control, so your handler must still implement the actual authorization logic.
Q4. Your tool needs to fetch data from 3 independent APIs. What's the correct approach?
A. Await them sequentially to avoid race conditions
B. Use Promise.all() to fetch all three in parallel, reducing total latency to the slowest single fetch
C. Create 3 separate tools, one per API, and have the model call them in sequence
D. Fetch only the first API and return partial data rather than waiting for all three
✅ B is correct. Independent async operations should be parallelized. Promise.all([fetch1(), fetch2(), fetch3()]) starts all three at once, so total time equals the slowest call, not the sum. This directly reduces the latency the user feels. Use Promise.allSettled() if you want to handle individual failures without failing the entire call.
Q5. A tool returns 500 items from a database. What's the best practice?
A. Return all 500 items as a JSON blob and let the model filter what it needs
B. Return only the first item; simplicity is more important than completeness
C. Add cursor-based pagination: return 10 to 25 items per call with a next-page cursor in the result text
D. Raise the context window limit in Claude's config
✅ C is correct. Large result sets should be paginated. Return a manageable slice (10 to 25 items) plus a cursor token in your response text. The model reads the cursor, calls your tool again with it, and gets the next page. This keeps context window usage bounded and responses fast. A 500-item JSON blob wastes tokens and overwhelms the model's reasoning.