Tools are the most-used MCP primitive. The model calls them; you implement them. Today you go beyond the basics: how to design tools the model actually uses well, how to handle every error scenario correctly, and the real-world patterns that separate prototype servers from production-grade ones.
Day 3 gave you the skeleton. Today you add the nervous system. A poorly designed tool is worse than no tool: the model will call it incorrectly, misinterpret results, and generate bad outputs. Tool design is product design: your user is an AI, and you're writing the UX for it.
The model cannot read your mind. It only knows what you tell it through three channels: the tool name, the description, and the input schema. If any of these is ambiguous, incomplete, or misleading, the model will make wrong calls, and it won't know it made them. Designing tools well is therefore the highest-leverage skill in MCP development.
Think of the model as a very capable intern on their first day. They're smart, they read documentation carefully, and they follow instructions precisely. But they have no context beyond what's written. Your tool name, description, and schema are their only training material.
1 task per tool
≤7 params recommended
verb_noun naming convention
Always describe every param
Single Responsibility
One tool does one thing. A tool that searches, then filters, then formats, then emails is four tools pretending to be one. The model can chain tool calls, so let it.
Use verb_noun snake_case. The name should be a self-contained description of the action. Avoid abbreviations, internal jargon, or generic names like do_thing or run.
The description field is your primary communication channel to the model. Write 2-4 sentences. Explain what it does, what it returns, and, crucially, when to use it vs similar tools.
Include: purpose, return shape, when to use, caveats
Validate Everything
Never trust that the model will send perfectly formed inputs. Use Zod constraints (.min(), .max(), .regex()) to reject bad values early, before your handler logic runs.
Validate at the schema level, not inside the handler body
Idempotency Where Possible
The model may retry a tool if it thinks the call failed (e.g., network timeout but server succeeded). Design read tools to be fully idempotent. Flag destructive writes explicitly in their description.
GET = always safe; POST/DELETE = warn in description
Informative Returns
Return enough context for the model to understand and act on the result. Don't return raw API blobs: filter, format, summarize. But don't over-truncate either; the model needs the signal.
Return: result + key metadata. Skip: internal IDs, raw HTML.
The description is read by the model, not just humans
Unlike code comments, tool descriptions are parsed and reasoned over by the LLM at inference time. Write them as if writing an API doc for an intelligent system. Be precise, avoid vague language, and call out edge cases. "Fetches user data" is inadequate. "Retrieves a user profile by user ID. Returns name, email, role, and account status. Returns a tool error if the user ID does not exist." is correct.
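The quick rules above (verb_noun naming, at most 7 params, every param described, a multi-sentence description) can be checked mechanically. The sketch below is one possible lint pass, not part of the MCP SDK: ToolDefinition, lintToolDefinition, and the thresholds are hypothetical names and values chosen for illustration.

```typescript
// Hypothetical, simplified shape of a tool definition (not the SDK's type).
interface ToolDefinition {
  name: string;
  description: string;
  params: Record<string, { description?: string }>;
}

// verb_noun snake_case: at least two lowercase words joined by underscores.
const NAME_PATTERN = /^[a-z]+(_[a-z0-9]+)+$/;
const MAX_PARAMS = 7;

// Returns a list of human-readable problems; an empty list means the
// definition passes these checks.
function lintToolDefinition(tool: ToolDefinition): string[] {
  const problems: string[] = [];

  if (!NAME_PATTERN.test(tool.name)) {
    problems.push(`name "${tool.name}" is not verb_noun snake_case`);
  }

  // Rough sentence count: split on ., ! or ? followed by whitespace or end of text.
  const sentences = tool.description
    .split(/[.!?](?:\s+|$)/)
    .filter((s) => s.trim().length > 0);
  if (sentences.length < 2) {
    problems.push("description should be at least 2 sentences");
  }

  const entries = Object.entries(tool.params);
  if (entries.length > MAX_PARAMS) {
    problems.push(`too many params: ${entries.length} > ${MAX_PARAMS}`);
  }
  for (const [paramName, param] of entries) {
    if (!param.description || param.description.trim().length === 0) {
      problems.push(`param "${paramName}" has no description`);
    }
  }
  return problems;
}
```

A check like this could run in a unit test over every registered tool, so a vague name or an undescribed parameter fails the build instead of confusing the model at runtime.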
Here's the difference between a tool the model uses correctly vs one it struggles with:
❌ POOR TOOL DESIGN
// Name: too generic
server.tool(
  "search",
  "Search for things",
  { q: z.string() },
  async ({ q }) => {
    // returns raw API response
    return {
      content: [{
        type: "text" as const,
        text: JSON.stringify(result)
      }]
    };
  }
);
✅ WELL-DESIGNED TOOL
// Name: specific action + domain
server.tool(
  "search_github_issues",
  "Search open GitHub issues in a repo. Returns issue number, title, labels, and created date. Use this when the user asks about bugs or feature requests.",
  {
    repo: z.string().describe("owner/repo format"),
    query: z.string().describe("search terms"),
    // every param gets a description, including ones with defaults
    limit: z.number().int().min(1).max(20).default(5).describe("max issues to return (1-20)")
  },
  handler
);
02 - Schema Mastery
Input Schema Mastery
Day 3 covered the Zod basics. Now we go deeper. Real tools need complex schemas: nested objects, union types, optional fields with defaults, and discriminated unions for multi-mode tools. Each pattern has a specific use case.
const schema = {
  // Accept string OR number for flexible IDs
  id: z.union([z.string(), z.number()])
    .describe("User ID (string or int)"),
  // Date as ISO string OR timestamp
  date: z.union([
    z.string().datetime(),
    z.number().int().positive()
  ]).describe("ISO 8601 or Unix timestamp")
};
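A union type in the schema means the handler has to narrow the value before using it. A minimal sketch, assuming numeric dates are Unix timestamps in seconds (the schema above does not say seconds or milliseconds) and that downstream code wants string IDs; normalizeId and normalizeDate are hypothetical helpers, not SDK or Zod functions.

```typescript
// Input types matching the union schema: string OR number.
type IdInput = string | number;
type DateInput = string | number;

// IDs: convert numbers to strings so the rest of the handler deals with one type.
function normalizeId(id: IdInput): string {
  return typeof id === "number" ? String(id) : id.trim();
}

// Dates: numbers are treated as Unix timestamps in SECONDS (an assumption),
// strings are handed to the Date constructor (ISO 8601 expected).
function normalizeDate(date: DateInput): Date {
  const parsed = typeof date === "number" ? new Date(date * 1000) : new Date(date);
  if (Number.isNaN(parsed.getTime())) {
    throw new Error(`Unparseable date: ${String(date)}`);
  }
  return parsed;
}
```

Narrowing once at the top of the handler keeps the `typeof` checks out of the business logic that follows.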
Discriminated Unions
Multi-mode tools with different param sets per mode
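As a rough illustration of the idea, here is a multi-mode argument type in plain TypeScript. The mode names and the describeLookup function are made up for this sketch; with Zod, a union like this is typically declared with z.discriminatedUnion keyed on the same field.

```typescript
// Each mode carries only the parameters that make sense for it.
// The shared "mode" field is the discriminator.
type LookupArgs =
  | { mode: "by_id"; id: string }
  | { mode: "by_email"; email: string }
  | { mode: "by_name"; name: string; limit: number };

// Switching on the discriminator narrows `args` in each branch, so
// `args.email` is only accessible when mode === "by_email".
function describeLookup(args: LookupArgs): string {
  switch (args.mode) {
    case "by_id":
      return `lookup user with id ${args.id}`;
    case "by_email":
      return `lookup user with email ${args.email}`;
    case "by_name":
      return `search up to ${args.limit} users named ${args.name}`;
  }
}

// Zod counterpart (shown as a comment only, not executed here):
// z.discriminatedUnion("mode", [
//   z.object({ mode: z.literal("by_id"), id: z.string() }),
//   z.object({ mode: z.literal("by_email"), email: z.string() }),
//   z.object({ mode: z.literal("by_name"), name: z.string(), limit: z.number() }),
// ])
```

The benefit over a flat schema with many optional fields is that invalid combinations (for example, mode "by_id" with no id) cannot be expressed at all.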
const schema = {
  // Auto-trim whitespace
  query: z.string().trim().min(2),
  // Transform: lowercase the tag
  tag: z.string().transform(s => s.toLowerCase()),
  // Custom validation with message
  email: z.string().refine(
    v => v.endsWith("@company.com"), // endsWith, so "a@company.com.evil.org" is rejected
    { message: "Must be company email" }
  )
};
z.infer gives you the TypeScript type for free
Define your schema as a const, then derive its TypeScript type with type MyInput = z.infer<typeof MySchema>. You can use this type to annotate internal functions that process the input, keeping full type-safety throughout your handler chain without duplicating the type definition.
TypeScript: Handler receives a fully-typed, validated object
const SearchSchema = z.object({
  query: z.string().trim().min(2).describe("Search query"),
  limit: z.number().int().min(1).max(50).default(10),
  filters: z.object({
    status: z.enum(["open", "closed", "all"]).default("open"),
    labels: z.array(z.string()).optional()
  }).optional()
});
// TypeScript type auto-derived, no manual annotation needed
type SearchInput = z.infer<typeof SearchSchema>;
// { query: string; limit: number; filters?: { status: "open"|"closed"|"all"; labels?: string[] } }
server.tool(
  "search_issues",
  "Search repository issues by query string with optional status filter.",
  SearchSchema.shape, // pass .shape to extract the object's field definitions
  async (args) => {
    // args is SearchInput: fully typed and already validated by Zod
    const results = await searchIssues(args.query, args.limit, args.filters);
    return { content: [{ type: "text" as const, text: formatResults(results) }] };
  }
);
03 - Handler Patterns
Handler Patterns
A tool handler is an async function. Inside it, you can do anything Node.js can do: HTTP requests, file I/O, database queries, shell execution, computations. The key is structuring handlers so they're robust, readable, and the model gets actionable output regardless of what happens.
The Short-Order Cook
A tool handler is like a short-order cook: an order comes in (the tool call), you execute it as fast as possible, and you hand back a plate (the result). If you're out of an ingredient (data not found), you say so clearly. You don't disappear into the kitchen for 10 minutes; you keep the customer informed. Structure handlers the same way: fast, decisive, always returning something useful.
TypeScript: The canonical handler structure
async ({ userId, includeInactive }) => {
  // 1. Input pre-processing (if not done by Zod transforms)
  const id = userId.trim();
  // 2. Main operation: always await, always in try/catch
  let user;
  try {
    user = await db.users.findById(id);
  } catch (err: unknown) {
    // 3. Infrastructure errors -> tool error (not protocol error)
    // err is unknown in strict TypeScript, so narrow it before reading .message
    const msg = err instanceof Error ? err.message : String(err);
    return {
      content: [{ type: "text" as const, text: `Database error: ${msg}` }],
      isError: true
    };
  }
  // 4. Business logic validation
  if (!user) {
    return {
      content: [{ type: "text" as const, text: `User ${id} not found` }],
      isError: true
    };
  }
  // 5. Conditionally include data based on params
  const data = includeInactive ? user : filterActive(user);
  // 6. Format and return: structured, human-readable text
  return {
    content: [{
      type: "text" as const,
      text: [
        `User: ${data.name} (${data.email})`,
        `Role: ${data.role} | Status: ${data.status}`,
        `Created: ${data.createdAt.toISOString()}`
      ].join("\n")
    }]
  };
}
Parallelise independent async operations
If your handler needs to fetch from multiple sources, use Promise.all() instead of awaiting them sequentially. A handler that makes 3 API calls sequentially at 200ms each takes about 600ms. With Promise.all() it takes about 200ms, the time of the slowest call. The model and user feel this latency directly.
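To make the latency difference concrete, here is a small sketch that compares the two styles. fakeFetch is a stand-in for a real HTTP or database call, and the source names and 50ms delays are arbitrary values for illustration.

```typescript
// Simulated data source: resolves with `value` after `ms` milliseconds.
function fakeFetch<T>(value: T, ms: number): Promise<T> {
  return new Promise<T>((resolve) => setTimeout(() => resolve(value), ms));
}

// Sequential: each await blocks the next call, so the latencies add up.
async function loadSequential(): Promise<string[]> {
  const profile = await fakeFetch("profile", 50);
  const orders = await fakeFetch("orders", 50);
  const tickets = await fakeFetch("tickets", 50);
  return [profile, orders, tickets];
}

// Parallel: all three start immediately; total time is roughly the slowest one.
// Results come back in the same order as the input array.
async function loadParallel(): Promise<string[]> {
  return Promise.all([
    fakeFetch("profile", 50),
    fakeFetch("orders", 50),
    fakeFetch("tickets", 50),
  ]);
}

// Parallel with partial-failure tolerance: one rejected source does not
// discard the results of the others.
async function loadTolerant(sources: Promise<string>[]): Promise<string[]> {
  const settled = await Promise.allSettled(sources);
  return settled.map((r) =>
    r.status === "fulfilled" ? r.value : `unavailable (${String(r.reason)})`
  );
}
```

Promise.all rejects as soon as any input rejects, which suits handlers where every source is required. Promise.allSettled is the better fit when a partial answer is still useful to the model.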
MCP has two completely different error channels. Understanding the difference is critical: using the wrong one causes silent failures, confuses the model, or crashes the session. Most beginners mix them up.
Tool Errors
isError: true in CallToolResult
Business logic failures: user not found, API rate limit, file doesn't exist
The model sees the error message and can reason about it or retry differently
Returned as a normal JSON-RPC result (no error field), so the session stays alive
Use this for expected failure modes the model should handle
Response shape: { content: [...], isError: true }
Protocol Errors
Thrown exception from handler
Infrastructure failures: the SDK converts uncaught throws to a JSON-RPC error (-32603, Internal error); exact handling can vary by SDK and version
The client gets a JSON-RPC error response, so the model sees only "tool call failed"
Returned in the error field of the JSON-RPC envelope instead of as a result
Use this only for truly unexpected bugs, not business logic
Session survives (the SDK catches the throw), but the model gets less useful error info
The golden rule: use isError: true for anything the model should know about
"User not found", "rate limit exceeded", "invalid date range": these are all tool errors with isError: true. The model reads your error message, understands what went wrong, and can adjust. Throwing an exception gives the model a generic "internal error" with no actionable information. Reserve uncaught throws for bugs in your own code.
TypeScript: Complete error handling matrix
// ✅ Business logic error: model gets the message, can retry/adjust
if (!user) {
  return {
    content: [{ type: "text" as const, text: `No user found with ID "${id}". Try searching by email instead.` }],
    isError: true
  };
}
// ✅ External API error: surface the API's message to the model
try {
  const data = await weatherAPI.fetch(city);
  return { content: [{ type: "text" as const, text: formatWeather(data) }] };
} catch (err: unknown) {
  const msg = err instanceof Error ? err.message : "Unknown error";
  return {
    content: [{ type: "text" as const, text: `Weather API error: ${msg}. Try a different city name.` }],
    isError: true
  };
}
// ✅ Rate limiting: guide the model on what to do next
if (response.status === 429) {
  const retryAfter = response.headers.get("retry-after") ?? "60";
  return {
    content: [{ type: "text" as const, text: `Rate limit reached. Retry after ${retryAfter} seconds.` }],
    isError: true
  };
}
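One way to keep expected failures on the tool-error channel is a small wrapper around each handler. The sketch below is an assumption-laden illustration, not an SDK feature: ToolResult is a simplified local copy of the result shape used on this page, and toolError and withToolErrors are hypothetical helper names.

```typescript
// Simplified local copies of the result shape used throughout this page.
type TextContent = { type: "text"; text: string };
type ToolResult = { content: TextContent[]; isError?: boolean };

// Build a tool error the model can read and act on.
function toolError(message: string): ToolResult {
  return { content: [{ type: "text", text: message }], isError: true };
}

// Wrap a handler so any exception it throws becomes a tool error result
// instead of escaping to the protocol layer.
function withToolErrors<A>(
  handler: (args: A) => Promise<ToolResult>
): (args: A) => Promise<ToolResult> {
  return async (args: A): Promise<ToolResult> => {
    try {
      return await handler(args);
    } catch (err: unknown) {
      const msg = err instanceof Error ? err.message : String(err);
      return toolError(`Unexpected failure: ${msg}`);
    }
  };
}

// Usage: expected failures return toolError directly; anything thrown
// further down is converted by the wrapper.
const getUser = withToolErrors(async (args: { id: string }): Promise<ToolResult> => {
  if (args.id === "crash") throw new Error("connection reset");
  if (args.id !== "u1") return toolError(`No user found with ID "${args.id}".`);
  return { content: [{ type: "text", text: "User: Ada" }] };
});
```

The trade-off is that genuine bugs are also reported to the model as tool errors, so it is worth logging the original exception inside the catch block before converting it.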
05 - Multi-content Responses
Multi-content Responses
A tool's content array can contain multiple items of different types; you're not limited to one text block. This lets you return rich, mixed responses: a summary text plus a chart image, or a description plus an embedded resource for further inspection.
TextContent
Plain text or Markdown. The model reads this, reasons over it, and uses it in its response. Most common type.
type: "text", text: string
ImageContent
Base64-encoded image data. Useful for charts, screenshots, generated graphics. A vision-capable model can analyze it visually.
type: "image", data: string (base64), mimeType: string
EmbeddedResource
Embeds a resource inline. URI + content. The host can offer to save or open it separately from the chat.
type: "resource", resource: { uri, mimeType, text | blob }
TypeScript: Returning multiple content blocks
// Tool that returns a text summary + a JSON data resource
server.tool(
  "analyze_sales",
  "Analyze sales data for a date range. Returns a summary and the raw data as an embedded resource.",
  {
    from: z.string().date(),
    to: z.string().date()
  },
  async ({ from, to }) => {
    const data = await fetchSalesData(from, to);
    return {
      content: [
        // Block 1: Human-readable summary
        {
          type: "text" as const,
          text: [
            `Sales Analysis: ${from} to ${to}`,
            `Total Revenue: $${data.revenue.toLocaleString()}`,
            `Orders: ${data.orders} | Avg Order Value: $${data.aov}`,
            `Top Product: ${data.topProduct}`
          ].join("\n")
        },
        // Block 2: Embedded JSON resource for further processing
        {
          type: "resource" as const,
          resource: {
            uri: `sales://${from}/${to}`,
            mimeType: "application/json",
            text: JSON.stringify(data.rawRecords)
          }
        }
      ]
    };
  }
);
06 - Tool Annotations
Tool Annotations & Metadata
The 2025-03-26 revision of the MCP specification introduced tool annotations: optional metadata hints that describe a tool's behavior to the host and model without changing its functional contract. These are hints, not enforced constraints, but they meaningfully improve how the host application presents your tools and how the model reasons about when to call them.
Annotations are passed as a fourth argument to server.tool() in an annotations object:
| Annotation | Type | Default | Meaning |
| --- | --- | --- | --- |
| readOnlyHint | boolean | false | Tool only reads data, makes no changes. Hosts may show a "safe" indicator. |
| destructiveHint | boolean | true | Tool may delete or irreversibly modify data. Hosts may require user confirmation. |
| idempotentHint | boolean | false | Calling the tool multiple times with the same inputs has the same effect as calling it once. Safe to retry. |
| openWorldHint | boolean | true | Tool interacts with external systems (internet, APIs). False means fully local/sandboxed. |
TypeScript: Annotations in practice
// Read-only search tool: safe to call any number of times
server.tool(
  "search_documents",
  "Search the document index. Read-only, never modifies data.",
  { query: z.string() },
  {
    // The annotations object is the fourth argument; the handler callback comes last
    readOnlyHint: true,    // no side effects
    idempotentHint: true,  // safe to retry
    openWorldHint: false   // queries local DB, no internet
  },
  handler
);
// Destructive tool: should trigger confirmation in the host
server.tool(
  "delete_project",
  "Permanently deletes a project and all associated data. This action cannot be undone.",
  { projectId: z.string().uuid() },
  {
    readOnlyHint: false,
    destructiveHint: true,  // hosts may show a confirmation prompt
    idempotentHint: false   // deleting twice would fail on second call
  },
  handler
);
Annotations are trust hints, not security controls
A tool marked readOnlyHint: true is not sandboxed; the annotation is a declaration of intent. Nothing in the protocol prevents a read-only-annotated tool from writing to a database. The host uses these hints to build UX (confirmation dialogs, badges), but the actual safety guarantee comes from your handler's implementation. Annotate truthfully.
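Since the hints enforce nothing, a destructive handler has to carry its own checks. A minimal sketch of what that could look like, assuming a hypothetical in-memory project store, a caller identity passed in by your own server code, and an explicit confirm parameter (none of these come from the MCP SDK):

```typescript
type TextContent = { type: "text"; text: string };
type ToolResult = { content: TextContent[]; isError?: boolean };

function fail(message: string): ToolResult {
  return { content: [{ type: "text", text: message }], isError: true };
}

// Hypothetical in-memory store standing in for a real database.
const projects = new Map<string, { owner: string }>([["p1", { owner: "alice" }]]);

// The handler enforces its own rules regardless of what the annotations say:
// existence check, authorization check, then explicit confirmation.
async function deleteProject(
  args: { projectId: string; confirm: boolean },
  caller: string
): Promise<ToolResult> {
  const project = projects.get(args.projectId);
  if (!project) return fail(`Project ${args.projectId} not found.`);
  if (project.owner !== caller) return fail("Caller is not allowed to delete this project.");
  if (!args.confirm) return fail("Deletion not confirmed. Call again with confirm: true.");
  projects.delete(args.projectId);
  return { content: [{ type: "text", text: `Deleted project ${args.projectId}.` }] };
}
```

The annotation still matters for the host's UX, but with checks like these a host that ignores destructiveHint cannot cause an unauthorized or accidental delete.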
07 - Real-World Patterns
Real-World Tool Patterns
Production MCP servers encounter situations that simple examples never show: API responses too large to return wholesale, slow operations that need progress hints, repeated calls that should hit a cache. Here are the three patterns you'll implement in nearly every real server.
Pagination Pattern
PAGINATION
// Problem: search_issues returns 500 results, too much for the context window
// Solution: add cursor-based pagination
server.tool(
  "list_issues",
  "List repository issues with pagination. Returns up to 'limit' issues. If 'nextCursor' is present in the result, pass it as 'cursor' to get the next page.",
  {
    repo: z.string(),
    limit: z.number().int().min(1).max(25).default(10),
    cursor: z.string().optional().describe("Pagination cursor from previous call")
  },
  async ({ repo, limit, cursor }) => {
    const page = await github.issues.list({ repo, per_page: limit, cursor });
    const lines = [
      `Issues in ${repo} (page of ${page.items.length}):`,
      ...page.items.map(i => `  #${i.number} ${i.title} [${i.state}]`)
    ];
    if (page.nextCursor) lines.push(`\nMore results available. Next cursor: ${page.nextCursor}`);
    return { content: [{ type: "text" as const, text: lines.join("\n") }] };
  }
);
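The tool above relies on the upstream API to produce cursors. When the data lives in your own process, the same contract can be implemented in a few lines. This is a sketch under simple assumptions: the cursor is just the next start index as a string, and Page and paginate are hypothetical names. Real APIs often use opaque or signed tokens instead.

```typescript
interface Page<T> {
  items: T[];
  nextCursor?: string; // absent on the last page
}

// Slice `all` into pages of at most `limit` items. The cursor is the index
// of the first item of the requested page, encoded as a decimal string.
function paginate<T>(all: readonly T[], limit: number, cursor?: string): Page<T> {
  let start = 0;
  if (cursor !== undefined) {
    if (!/^\d+$/.test(cursor)) throw new Error(`Invalid cursor: ${cursor}`);
    start = Number(cursor);
  }
  if (start > all.length) throw new Error(`Cursor out of range: ${cursor}`);

  const end = Math.min(start + limit, all.length);
  const items = all.slice(start, end);
  // Only advertise a next cursor when there is something left to fetch.
  return end < all.length ? { items, nextCursor: String(end) } : { items };
}

// Usage: walk every page the way the model would, one call at a time.
function collectAll<T>(all: readonly T[], limit: number): T[][] {
  const pages: T[][] = [];
  let cursor: string | undefined;
  do {
    const page = paginate(all, limit, cursor);
    pages.push(page.items);
    cursor = page.nextCursor;
  } while (cursor !== undefined);
  return pages;
}
```

Rejecting malformed cursors with a clear message matters here, because the model copies the cursor out of your previous response text and may occasionally mangle it.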
In-Memory Caching Pattern
CACHING
// Problem: model calls get_user_profile 5 times per conversation
// Solution: simple TTL cache at module level
const cache = new Map<string, { data: unknown; expires: number }>();
const TTL_MS = 5 * 60 * 1000; // 5 minutes
function getCached<T>(key: string): T | null {
  const entry = cache.get(key);
  if (!entry || Date.now() > entry.expires) {
    cache.delete(key); // drop expired entries so the map does not grow forever
    return null;
  }
  return entry.data as T;
}
server.tool(
  "get_user_profile",
  "Get a user's profile. Results are cached for 5 minutes.",
  { userId: z.string() },
  async ({ userId }) => {
    let user = getCached<User>(userId);
    if (!user) {
      user = await fetchUserFromAPI(userId);
      cache.set(userId, { data: user, expires: Date.now() + TTL_MS });
    }
    return { content: [{ type: "text" as const, text: formatUser(user) }] };
  }
);
Retry with Exponential Backoff
RETRY
// Problem: external API occasionally returns 503 (transient)
// Solution: retry up to 3 times with exponential backoff
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 500
): Promise<T> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt === maxAttempts) throw err;
      // Delay doubles each retry: 500ms after attempt 1, 1000ms after attempt 2
      const delay = baseDelayMs * 2 ** (attempt - 1);
      await new Promise(r => setTimeout(r, delay));
    }
  }
  throw new Error("Unreachable");
}
server.tool(
  "fetch_report", "Fetch a report from the analytics API.",
  { reportId: z.string() },
  async ({ reportId }) => {
    try {
      const report = await withRetry(() => analyticsAPI.getReport(reportId));
      return { content: [{ type: "text" as const, text: formatReport(report) }] };
    } catch (err: unknown) {
      const msg = err instanceof Error ? err.message : String(err);
      return {
        content: [{ type: "text" as const, text: `Failed after 3 attempts: ${msg}` }],
        isError: true
      };
    }
  }
);
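The helper above retries on every error, including ones that will never succeed (a 404, a validation failure). A possible refinement is to retry only failures that look transient. In this sketch, HttpError, the set of transient status codes, and retryTransient are all assumptions made for illustration; adapt the predicate to whatever your API client actually throws.

```typescript
// Hypothetical error type carrying an HTTP status code.
class HttpError extends Error {
  status: number;
  constructor(status: number, message: string) {
    super(message);
    this.name = "HttpError";
    this.status = status;
  }
}

// Statuses treated as transient in this sketch (an assumption, not a standard).
const TRANSIENT_STATUSES = new Set<number>([429, 502, 503, 504]);

function isTransient(err: unknown): boolean {
  return err instanceof HttpError && TRANSIENT_STATUSES.has(err.status);
}

// Retry with exponential backoff, but give up immediately on errors that
// are not transient, so the model gets its answer without pointless waiting.
async function retryTransient<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 500
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (err: unknown) {
      if (attempt >= maxAttempts || !isTransient(err)) throw err;
      const delay = baseDelayMs * 2 ** (attempt - 1);
      await new Promise<void>((resolve) => setTimeout(resolve, delay));
    }
  }
}
```

Failing fast on permanent errors also keeps the tool's worst-case latency predictable: only transient failures pay the backoff cost.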
08 - Testing Tools
Testing Your Tools
A tool handler is just an async TypeScript function. You can unit test it in complete isolation: no MCP server, no transport, no Claude required. Extract the handler logic into a named function, test it directly, and only pass it to server.tool() in your main entry point.
MCP projects use ESM ("type": "module") and TypeScript. Jest has notoriously difficult ESM/TypeScript configuration. Vitest works natively with both: install with npm install --save-dev vitest and add "test": "vitest" to your scripts. No transform config, no babel, no --experimental-vm-modules flags.
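To illustrate the extraction step, here is a rough sketch of a handler written as a plain function with its data source injected. The names getUserHandler, FindUser, and fakeFindUser are hypothetical, and the user data is made up; the point is only that nothing here needs a server or transport to run.

```typescript
type TextContent = { type: "text"; text: string };
type ToolResult = { content: TextContent[]; isError?: boolean };

interface User {
  name: string;
  email: string;
}

// The data source is a parameter, so a test can pass a fake.
type FindUser = (id: string) => Promise<User | null>;

// Extracted handler logic: a plain async function.
async function getUserHandler(
  args: { userId: string },
  findUser: FindUser
): Promise<ToolResult> {
  const id = args.userId.trim();
  const user = await findUser(id);
  if (!user) {
    return { content: [{ type: "text", text: `User ${id} not found` }], isError: true };
  }
  return { content: [{ type: "text", text: `User: ${user.name} (${user.email})` }] };
}

// A fake data source for tests: knows exactly one user.
const fakeFindUser: FindUser = async (id) =>
  id === "u1" ? { name: "Ada", email: "ada@example.com" } : null;

// In a Vitest file, the checks would look roughly like this:
//   it("returns a tool error for unknown users", async () => {
//     const result = await getUserHandler({ userId: "nope" }, fakeFindUser);
//     expect(result.isError).toBe(true);
//   });
```

In the server entry point, the same function would be registered with the real data source bound in, for example `(args) => getUserHandler(args, db.users.findById)`, where `db` is whatever client your server already uses.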
09 - Knowledge Check
Test Your Understanding
Five questions covering tool design, errors, and patterns from today.
Day 4: Tools in Depth Quiz
Q1 A tool handler calls an external API that returns a 404 (resource not found). What should the handler return?
A Throw a JavaScript Error; the SDK will convert it to a JSON-RPC error response
B Return a CallToolResult with isError: true and a descriptive message; the model reads it and can adjust
C Return an empty content array; the model will understand silence as failure
D Return a success result with the 404 status code embedded in the text
✅ B is correct. A 404 is a business logic error: the model should know about it and reason over it. Return isError: true with a clear message. Throwing an exception gives the model a generic protocol error with less useful information. Always prefer tool errors for expected failure modes.
Q2 Which naming convention is best for MCP tool names?
A camelCase like getWeather: matches JavaScript conventions
B PascalCase like GetWeather: matches class naming
C snake_case verb_noun like get_weather: specific action + domain, widely used in MCP ecosystem
D Free-form like weather: shorter names are clearer
✅ C is correct. snake_case verb_noun names are widely used across the MCP ecosystem and in function-calling APIs generally. The pattern is explicit, readable by the model, and unambiguous. Short generic names like weather give the model no information about whether the tool reads, writes, or formats weather data.
Q3 What does the destructiveHint: true annotation tell the host application?
A The tool is sandboxed and cannot make real changes
B The tool may permanently modify or delete data; the host may show a confirmation prompt before allowing the call
C The tool will crash the server if called incorrectly
D The tool requires administrator privileges to run
✅ B is correct. destructiveHint: true signals to the host that this tool can cause irreversible changes. The host application (like Claude Desktop) may display a confirmation dialog before the model is allowed to call it. It's a UX hint, not a security control, so your handler must still implement the actual authorization logic.
Q4 Your tool needs to fetch data from 3 independent APIs. What's the correct approach?
A Await them sequentially to avoid race conditions
B Use Promise.all() to fetch all three in parallel, reducing total latency to the slowest single fetch
C Create 3 separate tools, one per API, and have the model call them in sequence
D Fetch only the first API and return partial data rather than waiting for all three
✅ B is correct. Independent async operations should always be parallelized. Promise.all([fetch1, fetch2, fetch3]) runs all three simultaneously, so total time equals the slowest, not the sum. This directly reduces the latency the user feels. Use Promise.allSettled() if you want to handle individual failures without failing the entire call.
Q5 A tool returns 500 items from a database. What's the best practice?
A Return all 500 items as a JSON blob โ let the model filter what it needs
B Return only the first item โ simplicity is more important than completeness
C Add cursor-based pagination: return 10-25 items per call with a next-page cursor in the result text
D Raise the context window limit in Claude's config
✅ C is correct. Large result sets should be paginated. Return a manageable slice (10-25 items) plus a cursor token in your response text. The model reads the cursor, calls your tool again with it, and gets the next page. This keeps context window usage bounded and responses fast. A 500-item JSON blob wastes tokens and overwhelms the model's reasoning.