Day 6 of 90 🌱 Spark Phase Prompts & Sampling

Prompts & Sampling
The Third Primitive & AI-to-AI

You've mastered Tools (model acts) and Resources (model reads). Today you complete the trilogy with Prompts — reusable conversation templates the user controls — and dive into Sampling, MCP's breakthrough capability that lets servers ask the AI to think, unlocking true agentic multi-step workflows.

Tools and Resources are about data flow. Prompts and Sampling are about intelligence flow. When your MCP server can ask Claude to reason about something and act on the result, you've crossed from "integration" into "agentic AI" territory. This is the day it clicks.
01 — Three Primitives

The Three Primitives Revisited

Over the last five days you've built up two-thirds of the MCP primitive trinity. Today we complete the picture and understand how all three work together as a coherent system — each filling a gap the others can't.

🔧
Tools
Model-controlled
The AI decides when to call a tool based on the conversation. Tools perform actions — search, create, delete, compute. The model reasons about whether, when, and how to invoke them.
📦
Resources
App-controlled
The host application decides which resources to expose in context. Resources are addressable data — files, records, config. The app attaches them; the model reads them passively.
💬
Prompts
User-controlled
Today's focus. The user selects a prompt from a menu in the host app. Prompts are reusable conversation starters that accept arguments and expand into rich, multi-message conversations pre-configured for a task.
3: MCP primitives
user: controls Prompts
sampling: server → AI calls
agentic: what this unlocks
🎯
Why "user-controlled" matters for Prompts

The control model is the key to understanding each primitive's purpose. Tools are invoked autonomously by the AI mid-reasoning — the user may not even notice. Resources are attached by the application layer, often before the conversation starts. Prompts are explicitly chosen by the human from a visible menu — they're a UX affordance, a way for server developers to package expert workflows that users can discover and trigger intentionally.

02 — What Are Prompts

What Are Prompts?

An MCP Prompt is a named, parameterized conversation template. When a user selects a prompt from Claude Desktop's slash-command menu (or any MCP-compatible host), the client calls prompts/get with any argument values the user filled in. The server returns a structured list of messages — user and assistant role turns — that seed the conversation with exactly the right context for the task.

Think of a Prompt as a macro for conversations. Instead of the user laboriously typing "Please analyze the pull request at github.com/org/repo/pull/42 and list every security concern in the changed files, grouped by severity", they select your security-review prompt, type the PR URL, and the host does the rest.

📋

Prompts Are Like Form Templates

Imagine a law firm where each case type has a standard intake form — "Personal Injury", "Contract Dispute", "IP Infringement". The receptionist doesn't type a unique description from scratch every time. They select the right form, fill in the client-specific fields (name, case number, dates), and the system pre-populates the rest. MCP Prompts work identically: the server defines the template and required fields; the user fills in the variables; the host injects the fully-expanded conversation into Claude's context.

sequenceDiagram
  participant U as 👤 User
  participant H as Host App
  participant C as MCP Client
  participant S as MCP Server
  participant M as Claude Model

  C->>S: prompts/list
  S-->>C: [{name, description, arguments}, ...]
  H->>U: Show prompt menu (slash commands)
  U->>H: Select "security-review", enter PR URL
  C->>S: prompts/get { name: "security-review", arguments: { pr_url: "..." } }
  S-->>C: { messages: [{ role: "user", content: "..." }] }
  H->>M: Inject messages into conversation context
  M->>U: Responds with security analysis
      
🔍
Prompts vs System Prompts vs Tool Descriptions

A system prompt is set by the host application and sets the AI's persona/rules. A tool description tells the model what a tool does. An MCP Prompt is user-selectable and injects a full conversation structure — potentially multiple turns of user/assistant messages — tailored to a specific workflow. They operate at different layers and serve different purposes. Don't confuse them.
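The list-then-get exchange described in this section can be sketched with plain functions. This is a minimal illustration, not the SDK: the types, the `registry` array, and the `listPrompts` / `getPrompt` helpers are hypothetical names that only mirror the shapes of `prompts/list` and `prompts/get`.

```typescript
// Local types that mirror the message shapes; nothing here is imported from an SDK.
interface PromptArgument { name: string; description?: string; required?: boolean }
interface PromptInfo { name: string; description: string; arguments: PromptArgument[] }
interface TextMessage { role: "user" | "assistant"; content: { type: "text"; text: string } }

interface PromptDefinition extends PromptInfo {
  // Expands the user's argument values into seed messages.
  build: (args: Record<string, string>) => TextMessage[];
}

// Hypothetical registry holding a single prompt.
const registry: PromptDefinition[] = [
  {
    name: "security-review",
    description: "Review a pull request for security issues",
    arguments: [{ name: "pr_url", description: "GitHub PR URL", required: true }],
    build: (args) => [
      {
        role: "user",
        content: {
          type: "text",
          text: `Review the pull request at ${args.pr_url} for security issues.`,
        },
      },
    ],
  },
];

// prompts/list: metadata only, so the host can render its menu.
function listPrompts(): PromptInfo[] {
  return registry.map((p) => ({
    name: p.name,
    description: p.description,
    arguments: p.arguments,
  }));
}

// prompts/get: look the prompt up by name and expand it with the supplied arguments.
function getPrompt(name: string, args: Record<string, string>): { messages: TextMessage[] } {
  const prompt = registry.find((p) => p.name === name);
  if (!prompt) throw new Error(`Unknown prompt: ${name}`);
  return { messages: prompt.build(args) };
}
```

The split matters: the menu only needs names, descriptions, and argument lists, while the potentially expensive expansion runs once the user has picked a prompt and filled in its fields.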

03 — Prompt Structure

Prompt Structure & Arguments

Every MCP Prompt has four components. Understanding each one's role is the foundation of building prompts that work reliably across different host applications.

💬
Prompt Object
The complete schema of a registered MCP Prompt
name
string — A unique identifier for this prompt within the server. Shown in the host's slash-command menu. Use kebab-case: security-review, write-unit-tests, explain-code. The model never sees this directly — it's purely for user discovery.
description
string — Human-readable description shown in the prompt picker. Write this for the user, not the model. "Generate a thorough security review of a GitHub pull request, checking for OWASP Top 10 vulnerabilities and secret exposure." One or two clear sentences.
arguments
PromptArgument[]: Optional array of input fields the user fills in before the prompt runs. Each argument has a name, optional description, and required: boolean. Argument values arrive as plain strings; the protocol attaches no typed schema to them, so validation is your handler's job.
messages
PromptMessage[] — Returned by your handler. An array of { role: "user" | "assistant", content: TextContent | ImageContent | EmbeddedResource } objects. This sequence seeds the conversation. You can include multiple turns — pre-computed assistant reasoning, staged user questions, context documents — anything that sets Claude up for success.
role: "user"
What the user "said" to start this conversation
  • The question or task to be solved
  • Context documents embedded as text
  • Instructions on what format to respond in
  • The filled-in argument values interpolated into prose
  • Can include images (mimeType: image/*)
role: "assistant"
Pre-seeded reasoning to guide Claude's response
  • Pre-computed context (e.g., "I've retrieved the PR diff...")
  • Chain-of-thought scaffolding
  • Format examples the model should mirror
  • Partial analyses that Claude should complete
  • Usually optional — most prompts only need user turns
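To make the two roles concrete, here is a small sketch of a handler body that seeds a three-turn conversation. The `textMessage` helper, the `buildUnitTestSeed` function, and the wording of each turn are illustrative assumptions, not part of any SDK.

```typescript
type Role = "user" | "assistant";
interface PromptMessage { role: Role; content: { type: "text"; text: string } }

// Small helper to avoid repeating the content wrapper.
function textMessage(role: Role, text: string): PromptMessage {
  return { role, content: { type: "text", text } };
}

// Hypothetical builder for a "write-unit-tests" prompt.
function buildUnitTestSeed(language: string, code: string): PromptMessage[] {
  return [
    // User turn: the task plus the interpolated argument values.
    textMessage("user", `Write unit tests for this ${language} code:\n\n${code}`),
    // Assistant turn: scaffolding that nudges the model toward a plan-first answer.
    textMessage(
      "assistant",
      "I will list the behaviours worth testing first, then write one test per behaviour."
    ),
    // Final user turn: format and quality constraints.
    textMessage("user", "Go ahead. Cover edge cases and keep each test independent."),
  ];
}
```

Most prompts can stop after the first user turn; the assistant turn is worth adding only when you want to steer the shape of the reply.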

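At the protocol level, prompt arguments are simpler than Tool input schemas: every value is a string, with no typed schema attached. The host renders them as text input fields in the prompt UI. Your handler receives a Record<string, string> and is responsible for validating values, handling missing optionals, and interpolating them into the message content. The TypeScript SDK lets you describe each argument with a Zod string schema, as the code in the next section does, but what travels over the wire is still a string.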

Required Argument
{ name: "pr_url",
description: "GitHub PR URL",
required: true }
The host should mark this field as required in the UI and block submission while it is empty. Treat that as a convenience rather than a guarantee: a cheap presence check in your handler protects you from hosts that skip it.
Optional Argument
{ name: "focus",
description: "Area to focus on (security/perf/style)",
required: false }
Host shows a non-blocking input. Your handler must handle the case where it's undefined or empty string and apply sensible defaults.
Embedded Resource
// In message content:
{ type: "resource",
resource: {
uri: "file:///app.ts",
text: "...source code..."
}}
Prompts can embed resource content directly in their messages — read the resource in your handler and inline it into the conversation context.
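The three cards above can be combined into one handler-shaped sketch: check the required argument, default the optional one, and embed pre-fetched content as a resource. Everything named here (`reviewArguments`, `resolveArguments`, `buildReviewMessages`, the default focus of "security") is a hypothetical choice for illustration.

```typescript
interface PromptArgument { name: string; description?: string; required: boolean }

type Content =
  | { type: "text"; text: string }
  | { type: "resource"; resource: { uri: string; mimeType?: string; text: string } };
interface PromptMessage { role: "user" | "assistant"; content: Content }

const reviewArguments: PromptArgument[] = [
  { name: "pr_url", description: "GitHub PR URL", required: true },
  { name: "focus", description: "Area to focus on (security/perf/style)", required: false },
];

const ALLOWED_FOCUS = ["security", "perf", "style"];

// Checks required arguments and applies a default for the optional one.
function resolveArguments(
  raw: Record<string, string | undefined>
): { prUrl: string; focus: string } {
  for (const arg of reviewArguments) {
    const value = raw[arg.name];
    if (arg.required && (value === undefined || value.trim() === "")) {
      throw new Error(`Missing required argument: ${arg.name}`);
    }
  }
  const prUrl = (raw.pr_url ?? "").trim();
  const focus = (raw.focus ?? "").trim().toLowerCase();
  // Undefined, empty, or unrecognised values all fall back to the assumed default.
  return { prUrl, focus: ALLOWED_FOCUS.indexOf(focus) >= 0 ? focus : "security" };
}

// Builds the messages; `diff` stands in for content the handler already fetched.
function buildReviewMessages(
  raw: Record<string, string | undefined>,
  diff: string
): PromptMessage[] {
  const { prUrl, focus } = resolveArguments(raw);
  return [
    { role: "user", content: { type: "text", text: `Review ${prUrl} with a focus on ${focus}.` } },
    {
      role: "user",
      content: { type: "resource", resource: { uri: prUrl, mimeType: "text/plain", text: diff } },
    },
  ];
}
```

Keeping validation in one function means the message builder can assume clean inputs, and a bad argument fails before any expensive fetch would run.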
04 — Building Prompts

Building Prompts in TypeScript

Registering a prompt uses server.prompt(). The API mirrors server.tool(): name, description, argument schema (an object whose values are Zod string schemas), and a handler that returns the message array. The handler can be sync or async.

TypeScript · Simple single-turn prompt
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { z } from "zod";

// Placeholder server instance; the name and version are illustrative
const server = new McpServer({ name: "prompt-demo", version: "1.0.0" });

// ── Simple prompt: one user message with interpolated args ─────
server.prompt(
  "explain-code",                              // prompt name
  "Explain a code snippet in plain English",   // shown in picker
  {                                            // argument schema
    language: z.string().describe("Programming language"),
    code: z.string().describe("The code to explain")
  },
  ({ language, code }) => ({                   // handler (can be sync)
    messages: [{
      role: "user",
      content: {
        type: "text" as const,
        text: `Please explain the following ${language} code in plain English.
Describe what it does step by step, highlight any non-obvious patterns,
and note any potential issues or improvements.

\`\`\`${language}
${code}
\`\`\``
      }
    }]
  })
);
TypeScript · Advanced multi-turn prompt with embedded resource
// Security review prompt: reads the PR diff and embeds it as a resource.
// `github` is your own API client (not shown here).
server.prompt(
  "security-review",
  "Perform a thorough security review of a GitHub pull request",
  {
    pr_url: z.string().url().describe("GitHub PR URL"),
    severity: z.enum(["all", "high-only"]).default("all").describe("Filter by severity")
  },
  async ({ pr_url, severity }) => {
    // Fetch the PR diff and metadata from the GitHub API
    const diff = await github.getPRDiff(pr_url);
    const pr = await github.getPRMeta(pr_url);

    return {
      messages: [
        {
          role: "user",
          content: {
            type: "text" as const,
            text: `You are performing a security code review for the following pull request:

Title: ${pr.title}
Author: ${pr.author}
URL: ${pr_url}
Files changed: ${pr.changedFiles}

Review scope: ${severity === "high-only" ? "Report HIGH and CRITICAL issues only" : "Report all issues"}

Check for: SQL injection, XSS, CSRF, hardcoded secrets, insecure dependencies,
broken authentication, path traversal, and OWASP Top 10 vulnerabilities.

Format your response as:
## Summary
## Critical Issues (if any)
## High Issues (if any)
## Medium Issues (if any, unless high-only mode)
## Recommendations`
          }
        },
        {
          role: "user",
          content: {
            // Embed the diff as a resource
            type: "resource" as const,
            resource: {
              uri: pr_url,
              mimeType: "text/plain",
              text: diff
            }
          }
        }
      ]
    };
  }
);
💡
Pre-fetch data in the prompt handler — don't make the model do it

The power of prompts is that your handler runs before Claude sees anything. Fetch the PR diff, read the database record, load the file — attach all the necessary context in the messages array. Claude receives a fully loaded conversation and can immediately deliver insight. If you only pass a URL and ask Claude to "fetch it" (without tool access), you've just written a worse system prompt.

05 — What Is Sampling

What Is Sampling?

Sampling is MCP's most powerful — and most misunderstood — feature. While every other primitive is about data flow from server to model, sampling reverses the direction: it lets a server ask the AI model to generate a response.

In practical terms: your tool handler is executing, it encounters a step that requires language model intelligence — summarizing a document, classifying text, generating code — and instead of hardcoding logic, it fires a sampling/createMessage request. The MCP client forwards this to Claude, gets the response, and hands it back to your handler. Your server now has AI reasoning as a runtime capability.

🧠

Sampling: The Server Calls the AI Back

Normal MCP flow: User → Claude → calls your tool → you return data → Claude answers user.

Sampling flow: User → Claude → calls your tool → your tool calls Claude back for a sub-task → Claude responds → you use that response in your tool's final answer → Claude answers user.

It's like a contractor (your server) calling an expert consultant (Claude) mid-job. The expert gives advice, the contractor incorporates it, then delivers the finished work back to the client. This is what makes agentic, multi-step workflows possible without building your own AI infrastructure.
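The contractor-and-consultant round trip can be exercised without a host by injecting the sampling call as a function. This is a sketch under stated assumptions: the `Sampler` type, the simplified request and result shapes, and `fakeSampler` are all hypothetical stand-ins for the real client and model.

```typescript
// Hypothetical, simplified shapes for a sampling request and its result.
interface SamplingRequest {
  messages: { role: "user" | "assistant"; content: { type: "text"; text: string } }[];
  maxTokens: number;
}
interface SamplingResult { role: "assistant"; content: { type: "text"; text: string } }
type Sampler = (request: SamplingRequest) => Promise<SamplingResult>;

// A tool handler that delegates one sub-task (summarising) to the model.
async function summarizeDocumentTool(documentText: string, sample: Sampler): Promise<string> {
  const result = await sample({
    messages: [
      {
        role: "user",
        content: { type: "text", text: `Summarize in one sentence:\n${documentText}` },
      },
    ],
    maxTokens: 100,
  });
  // The handler post-processes the model's answer before returning its own result.
  return `Summary: ${result.content.text.trim()}`;
}

// Stand-in for the client and model, so the flow can run in a test.
const fakeSampler: Sampler = async (request) => ({
  role: "assistant",
  content: {
    type: "text",
    text: ` (${request.messages.length} message received) A short summary. `,
  },
});
```

Passing the sampler in as a parameter keeps the handler testable: in production it would wrap the server's sampling call, and in tests it is a stub like `fakeSampler`.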

🔒
Sampling is designed around a human in the loop

The MCP spec puts the host application in control of sampling, not the server. When your server sends sampling/createMessage, the request goes to the client, and the spec says there should always be a human in the loop who can review the request and deny it before it reaches the model. This design prevents runaway agentic loops and is a core safety property. Hosts differ in how they surface the approval step, so never design systems that assume sampling calls are invisible or auto-approved.
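A client-side checkpoint of this kind might look like the sketch below. The `Decision` union, `AskUser`, `CallModel`, and `handleSamplingRequest` are hypothetical names; real hosts implement the approval step in their own UI.

```typescript
interface SamplingRequest {
  messages: { role: "user" | "assistant"; text: string }[];
  maxTokens: number;
}

// What the user can do with a pending request.
type Decision =
  | { action: "approve" }
  | { action: "modify"; request: SamplingRequest }
  | { action: "reject" };

type AskUser = (request: SamplingRequest) => Promise<Decision>;
type CallModel = (request: SamplingRequest) => Promise<string>;

// Client-side handling: nothing reaches the model until the user has decided.
async function handleSamplingRequest(
  request: SamplingRequest,
  askUser: AskUser,
  callModel: CallModel
): Promise<string> {
  const decision = await askUser(request);
  if (decision.action === "reject") {
    // The server's pending call fails; its handler must cope with this.
    throw new Error("Sampling request rejected by user");
  }
  // A modified request replaces the server's original one.
  const approved = decision.action === "modify" ? decision.request : request;
  return callModel(approved);
}
```

From the server's point of view, the only visible outcomes are a result or an error, which is why handlers should treat every sampling call as something that may fail.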

06 — Sampling Flow

The Sampling Flow & Parameters

sampling/createMessage — Full Request Lifecycle
1
Client declares sampling capability
During the initialize handshake, the client declares that it supports sampling. A server should send sampling requests only to clients that advertise this capability.
capabilities: { sampling: {} } (in the client's initialize request)
2
Tool handler triggers sampling
Mid-execution, your tool handler calls server.server.createMessage() with a messages array and model preferences. This sends a sampling/createMessage request to the client.
sampling/createMessage → client
3
Host presents to user for approval
The host application intercepts the sampling request and shows it to the user. The user sees what the server is asking the AI to do — they can approve, modify, or reject it. This is the human-in-the-loop checkpoint.
User approves / modifies / rejects
4
Client sends to model and returns result
If approved, the client forwards the messages to Claude (or whichever model the client selects, guided by the server's modelPreferences). The model's response comes back as a CreateMessageResult.
CreateMessageResult → server handler
5
Handler uses AI response, returns final result
Your handler incorporates the model's text response into its logic — classifies, routes, transforms — then returns the final tool result to the originating Claude conversation.
CallToolResult → original conversation
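Because step 3 can end in a rejection, a handler is more robust if it has a non-AI fallback. The sketch below assumes a hypothetical `Sampler` function and a deliberately crude keyword fallback; the department list matches the classification example used later in this lesson.

```typescript
type Department = "billing" | "technical" | "account" | "general";
type Sampler = (prompt: string) => Promise<string>;

const DEPARTMENTS: Department[] = ["billing", "technical", "account", "general"];

// Crude keyword fallback, used only when the sampling call does not succeed.
function keywordFallback(ticket: string): Department {
  const lower = ticket.toLowerCase();
  if (lower.indexOf("invoice") >= 0 || lower.indexOf("refund") >= 0) return "billing";
  if (lower.indexOf("error") >= 0 || lower.indexOf("crash") >= 0) return "technical";
  if (lower.indexOf("password") >= 0 || lower.indexOf("login") >= 0) return "account";
  return "general";
}

async function classifyTicket(
  ticket: string,
  sample: Sampler
): Promise<{ department: Department; source: "model" | "fallback" }> {
  try {
    const prompt = `Classify into one of ${DEPARTMENTS.join(", ")}: ${ticket}`;
    const reply = (await sample(prompt)).trim().toLowerCase();
    // Accept the reply only if it is exactly one of the known departments.
    const match = DEPARTMENTS.find((d) => d === reply);
    if (match) return { department: match, source: "model" };
  } catch {
    // Rejected by the user, unsupported by the client, or a transport error.
  }
  return { department: keywordFallback(ticket), source: "fallback" };
}
```

Returning the `source` alongside the result makes it easy to log how often the fallback path is taken.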

The sampling/createMessage request accepts these parameters:

Parameter (type, required or optional): description

messages (SamplingMessage[], required): The conversation to send to the model. Each message has role ("user" | "assistant") and content (text or image).
modelPreferences (ModelPreferences, optional): Hints about model selection: costPriority, speedPriority, intelligencePriority (0–1 each). The client chooses the best-matching model; you can't force a specific model.
systemPrompt (string, optional): System instructions for this specific AI call. Separate from the host's system prompt, so the sub-task gets its own framing.
includeContext ("none" | "thisServer" | "allServers", optional): Whether to include MCP context from the current session in the sampling call. "none" = clean context; "thisServer" = include this server's resources; "allServers" = everything.
temperature (number, 0–1, optional): Sampling temperature. Lower = more deterministic (classification). Higher = more creative (generation). Defaults to the model default.
maxTokens (number, required): Maximum tokens for the model's response. Always set this deliberately to prevent accidental runaway generation in sub-tasks.
stopSequences (string[], optional): Strings that stop generation early. Useful for structured outputs: ["```", "###END###"].
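One way to keep these parameters consistent across tools is a small builder that enforces the required fields and keeps the priorities in range. This is a sketch: `buildSamplingParams`, `clampPreferences`, and the choice of "none" as the default `includeContext` are assumptions for illustration, and the local types only mirror the parameters listed above.

```typescript
interface ModelPreferences {
  costPriority?: number;
  speedPriority?: number;
  intelligencePriority?: number;
}

interface SamplingParams {
  messages: { role: "user" | "assistant"; content: { type: "text"; text: string } }[];
  maxTokens: number;
  modelPreferences?: ModelPreferences;
  systemPrompt?: string;
  includeContext?: "none" | "thisServer" | "allServers";
  temperature?: number;
  stopSequences?: string[];
}

type SamplingOptions = Omit<SamplingParams, "messages">;

const clamp01 = (value: number): number => Math.min(1, Math.max(0, value));

// Keeps each supplied priority inside the 0..1 range.
function clampPreferences(prefs: ModelPreferences): ModelPreferences {
  const out: ModelPreferences = {};
  if (prefs.costPriority !== undefined) out.costPriority = clamp01(prefs.costPriority);
  if (prefs.speedPriority !== undefined) out.speedPriority = clamp01(prefs.speedPriority);
  if (prefs.intelligencePriority !== undefined) {
    out.intelligencePriority = clamp01(prefs.intelligencePriority);
  }
  return out;
}

// Wraps a single user prompt into request parameters with a checked token budget.
function buildSamplingParams(prompt: string, options: SamplingOptions): SamplingParams {
  if (!Number.isInteger(options.maxTokens) || options.maxTokens <= 0) {
    throw new Error("maxTokens must be a positive integer");
  }
  const params: SamplingParams = {
    ...options,
    messages: [{ role: "user", content: { type: "text", text: prompt } }],
    includeContext: options.includeContext ?? "none", // assumed default: clean context
  };
  if (options.modelPreferences) {
    params.modelPreferences = clampPreferences(options.modelPreferences);
  }
  return params;
}
```

A classification call might then be built as `buildSamplingParams(text, { maxTokens: 150, temperature: 0.1, modelPreferences: { speedPriority: 0.9 } })`.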
07 — Sampling in Code

Sampling in Code

Let's make sampling concrete with two real examples: a classification task (where sampling replaces brittle keyword matching), and an enrichment task (where sampling augments raw data with AI analysis).

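TypeScript · Check that the client supports sampling
// Sampling is a client capability: the client advertises `sampling: {}` in its
// initialize request. The server declares nothing extra; it checks before calling.
const server = new McpServer({ name: "smart-support-server", version: "1.0.0" });

// Call this inside a handler, after the connection has been initialized
function clientSupportsSampling(): boolean {
  return Boolean(server.server.getClientCapabilities()?.sampling);
}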
TypeScript · Example 1: AI-powered ticket classification
// Tool that classifies a support ticket using Claude as a sub-task
server.tool(
  "classify_ticket",
  "Classify a support ticket into a department and priority using AI analysis.",
  {
    ticket_id: z.string().describe("Ticket ID to classify"),
    ticket_text: z.string().describe("Full ticket content")
  },
  async ({ ticket_id, ticket_text }) => {
    // 🤖 Ask Claude to classify the ticket
    let classification;
    try {
      const result = await server.server.createMessage({
        messages: [{
          role: "user",
          content: {
            type: "text",
            text: `Classify this support ticket into one of: billing, technical, account, general.
Then assign priority: critical, high, medium, low.
Respond ONLY in JSON: {"department": "...", "priority": "...", "reason": "..."}

Ticket:
${ticket_text}`
          }
        }],
        systemPrompt: "You are a support ticket classifier. Respond only with valid JSON.",
        maxTokens: 150,
        temperature: 0.1,            // low temp = consistent classification
        modelPreferences: {
          speedPriority: 0.9,        // classification = fast is fine
          intelligencePriority: 0.4  // doesn't need the smartest model
        }
      });

      // result.content.text = the model's JSON response
      const text = result.content.type === "text" ? result.content.text : "{}";
      classification = JSON.parse(text);
    } catch (err) {
      // Sampling failed, was rejected, or returned unparseable JSON
      classification = { department: "general", priority: "medium", reason: "Fallback" };
    }

    // Apply the classification to the database
    await db.tickets.update(ticket_id, classification);

    return {
      content: [{
        type: "text" as const,
        text: `Ticket ${ticket_id} classified:
Department: ${classification.department}
Priority: ${classification.priority}
Reason: ${classification.reason}`
      }]
    };
  }
);
TypeScript · Example 2: AI data enrichment pipeline
// Tool that fetches raw data and uses sampling to generate a summary
server.tool(
  "analyze_logs",
  "Fetch recent application logs and generate an AI-powered incident summary.",
  { minutes: z.number().int().min(5).max(60).default(15) },
  async ({ minutes }) => {
    const logs = await fetchRecentLogs(minutes);

    if (logs.errors.length === 0) {
      return {
        content: [{ type: "text" as const, text: "No errors in the last " + minutes + " minutes." }]
      };
    }

    // Use sampling to generate an incident analysis
    const analysis = await server.server.createMessage({
      messages: [{
        role: "user",
        content: {
          type: "text",
          text: `Analyze these application logs from the last ${minutes} minutes.
Identify the root cause of errors, group related issues, and suggest immediate actions.

ERROR LOG:
${logs.errors.slice(0, 50).join("\n")}

ERROR COUNT: ${logs.errors.length} total in ${minutes} min`
        }
      }],
      systemPrompt: "You are a senior SRE analyzing application logs. Be concise and actionable.",
      maxTokens: 500,
      temperature: 0.3,
      modelPreferences: {
        intelligencePriority: 0.8,   // incident analysis needs smart reasoning
        speedPriority: 0.4
      }
    });

    const summary = analysis.content.type === "text"
      ? analysis.content.text
      : "Analysis unavailable";

    return {
      content: [{
        type: "text" as const,
        text: `📊 Log Analysis: Last ${minutes} Minutes\n${"─".repeat(40)}\n${summary}`
      }]
    };
  }
);
🚨
Sampling anti-patterns to avoid

1. Sampling in every tool: not every tool needs AI reasoning. Use sampling for classification, summarization, and generation; never for data retrieval or simple logic.
2. Large maxTokens in sub-tasks: sampling is billed to the user. Keep sub-task token budgets tight.
3. Infinite sampling loops: never chain sampling results into more sampling calls without a strict depth limit. Runaway loops burn tokens fast.
4. Parsing unvalidated JSON: always wrap JSON.parse(result.content.text) in try/catch. Models sometimes add prose before the JSON.
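The depth limit from anti-pattern 3 can be enforced with a counter that travels with the recursion. In this sketch the `Sampler` type, the `DONE` convention, and the limit of 3 are all assumptions chosen for illustration.

```typescript
type Sampler = (prompt: string) => Promise<string>;

const MAX_SAMPLING_DEPTH = 3; // assumed budget; tune per workflow

// Repeatedly asks the model to refine a draft, stopping at "DONE" or at the depth limit.
async function refineWithLimit(
  draft: string,
  sample: Sampler,
  depth = 0
): Promise<{ text: string; calls: number }> {
  if (depth >= MAX_SAMPLING_DEPTH) {
    // Budget exhausted: return the best draft so far instead of sampling again.
    return { text: draft, calls: depth };
  }
  const reply = await sample(
    `Improve this text, or reply DONE if it needs no changes:\n${draft}`
  );
  if (reply.trim() === "DONE") {
    return { text: draft, calls: depth + 1 };
  }
  // Each recursive step increases depth, so the chain always terminates.
  return refineWithLimit(reply, sample, depth + 1);
}
```

Even if the model never says it is finished, the function makes at most `MAX_SAMPLING_DEPTH` sampling calls, which bounds both cost and the number of approval prompts the user sees.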

08 — Real-World Patterns

Real-World Patterns

Here are four production-grade prompt and sampling patterns that show up across MCP servers in the wild. Each solves a specific, common problem.

📝
Contextual Code Review Prompt
PROMPT PATTERN
// Registers a code-review prompt that reads the actual file before Claude sees anything
import fs from "node:fs/promises";
import path from "node:path";
import { pathToFileURL } from "node:url";

server.prompt(
  "code-review",
  "Review a source file for quality and best practices",
  {
    file_path: z.string().describe("Absolute path to the file"),
    focus: z.enum(["quality", "security", "performance"]).default("quality")
  },
  async ({ file_path, focus }) => {
    const source = await fs.readFile(file_path, "utf-8");   // pre-fetch the file
    const ext = path.extname(file_path).slice(1);            // detect language from the extension
    return {
      messages: [
        // Instructions go in a text message so the resource holds only the file content
        {
          role: "user",
          content: {
            type: "text" as const,
            text: `Review this ${ext || "source"} file with a focus on ${focus}.\nFile: ${file_path}`
          }
        },
        {
          role: "user",
          content: {
            type: "resource" as const,
            resource: { uri: pathToFileURL(file_path).href, mimeType: "text/plain", text: source }
          }
        }
      ]
    };
  }
);
🔀
AI Router — Route Requests with Sampling
SAMPLING PATTERN
// Route user queries to the correct department using Claude for classification.
// DEPARTMENT_QUEUES is your own map of department name to queue (not shown here).
server.tool(
  "route_query",
  "Route a user query to the correct department.",
  { query: z.string() },
  async ({ query }) => {
    const result = await server.server.createMessage({
      messages: [{
        role: "user",
        content: {
          type: "text",
          text: `Which department handles: "${query}"?\nReply with ONE word: billing, technical, sales, general`
        }
      }],
      maxTokens: 10,
      temperature: 0,   // minimal randomness, minimal output
      modelPreferences: { speedPriority: 1, intelligencePriority: 0.2 }
    });

    const reply = result.content.type === "text"
      ? result.content.text.trim().toLowerCase()
      : "";
    // Fall back to "general" if the model replied with anything unexpected,
    // so the confirmation message always names the queue that was actually used
    const dept = Object.keys(DEPARTMENT_QUEUES).includes(reply) ? reply : "general";
    await DEPARTMENT_QUEUES[dept].push(query);

    return {
      content: [{ type: "text" as const, text: `Routed to ${dept} team queue.` }]
    };
  }
);
📊
Multi-step Report Generation Prompt
PROMPT PATTERN
// Pre-seeds Claude with data from multiple sources before the user asks a question
server.prompt(
  "monthly-report",
  "Generate a monthly business performance report",
  { month: z.string().regex(/^\d{4}-\d{2}$/).describe("Month in YYYY-MM format") },
  async ({ month }) => {
    // Fetch all data in parallel before building the prompt
    const [sales, support, infra] = await Promise.all([
      fetchSalesData(month),
      fetchSupportMetrics(month),
      fetchInfraStats(month)
    ]);

    return {
      messages: [{
        role: "user",
        content: {
          type: "text" as const,
          text: `Generate an executive monthly report for ${month}.

SALES: Revenue $${sales.revenue}, Orders ${sales.orders}, NPS ${sales.nps}
SUPPORT: ${support.tickets} tickets, avg response ${support.avgResponse}h, CSAT ${support.csat}
INFRA: Uptime ${infra.uptime}%, Incidents ${infra.incidents}, Avg latency ${infra.latency}ms

Include: Executive Summary, Key Wins, Areas of Concern, Next Month Goals.`
        }
      }]
    };
  }
);
🔁
Defensive JSON Extraction from Sampling Responses
DEFENSIVE PATTERN
// Utility: robustly parse JSON from a sampling response (handles markdown code fences)
function extractJSON(text: string): unknown {
  // Strip markdown code fences if present: ```json ... ```
  const fenced = text.match(/```(?:json)?\s*([\s\S]*?)```/);
  const raw = fenced ? fenced[1] : text;
  // Greedy match from the first { or [ to the last } or ] (not a true balance check)
  const jsonMatch = raw.match(/[\[{][\s\S]*[\]}]/);
  if (!jsonMatch) throw new Error("No JSON found in response");
  return JSON.parse(jsonMatch[0]);   // still throws on malformed JSON, so call inside try/catch
}

// Usage in a sampling call:
const result = await server.server.createMessage({ ...params });
const text = result.content.type === "text" ? result.content.text : "";
const data = extractJSON(text) as MyExpectedType;
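Extraction handles replies that contain JSON somewhere; a retry handles replies that contain none. The sketch below layers a bounded retry on top of the same extraction idea. `sampleJSON`, the `Sampler` type, the default of two attempts, and the wording of the retry hint are assumptions for illustration.

```typescript
type Sampler = (prompt: string) => Promise<string>;

// Three backticks, built at runtime so this listing contains no literal fence.
const FENCE = "`".repeat(3);
const FENCED_BLOCK = new RegExp(FENCE + "(?:json)?\\s*([\\s\\S]*?)" + FENCE);

// Pulls the first JSON-looking span out of a reply, unwrapping a markdown fence if present.
function extractJSON(text: string): unknown {
  const fenced = text.match(FENCED_BLOCK);
  const raw = fenced ? fenced[1] ?? text : text;
  const span = raw.match(/[\[{][\s\S]*[\]}]/);
  if (!span) throw new Error("No JSON found in response");
  return JSON.parse(span[0]);
}

// Samples up to maxAttempts times until a reply yields parseable JSON.
async function sampleJSON(prompt: string, sample: Sampler, maxAttempts = 2): Promise<unknown> {
  let lastError: unknown = new Error("maxAttempts must be at least 1");
  let currentPrompt = prompt;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const reply = await sample(currentPrompt);
    try {
      return extractJSON(reply);
    } catch (err) {
      lastError = err;
      // Tell the model what went wrong before asking again.
      currentPrompt = `${prompt}\n\nYour previous reply was not valid JSON. Reply with JSON only.`;
    }
  }
  throw lastError;
}
```

Keep `maxAttempts` small: each retry is another sampling call, with the cost and user approval that implies.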
09 — Knowledge Check

Test Your Understanding

Five questions covering Prompts, Sampling, and their correct use patterns.

Day 6 — Prompts & Sampling Quiz


Q1 Who controls when an MCP Prompt is invoked?
A The AI model — it selects prompts the same way it selects tools
B The host application — it auto-injects prompts based on conversation context
C The user — they explicitly select a prompt from a menu in the host UI
D The MCP server — it pushes prompts to the client at any time
C is correct. Prompts are user-controlled. This is the key difference from Tools (model-controlled) and Resources (app-controlled). The user sees a menu of available prompts — typically slash commands in the host UI — and explicitly selects one. This intentional design makes prompts a discoverable, user-facing feature rather than an autonomous AI behavior.
Q2 Your prompt handler fetches a large file and includes it in the messages. Why is this better than passing just the file path to Claude?
A It isn't — Claude should always fetch its own context to save server resources
B The handler pre-fetches the content before Claude sees anything, so Claude immediately has the data without needing a tool call — fewer round trips, faster response, and it works even without a file-reading tool
C It uses less memory because the server streams the file lazily
D Claude cannot access file paths directly, so the handler must convert them
B is correct. The power of prompt handlers is that they run before Claude sees anything. By fetching and embedding content in the message array, you give Claude a fully loaded conversation — it can immediately analyze, not fetch. This eliminates tool round-trips, works regardless of what tools are registered, and delivers a better user experience.
Q3 What is the human-in-the-loop requirement for sampling?
A The server developer must manually approve every sampling call in the source code
B Sampling calls are fully autonomous — no human approval is needed
C The host application must present sampling requests to the user for approval before forwarding them to the model — the user can approve, modify, or reject
D The model automatically filters out unsafe sampling requests using its built-in safety training
C is correct. MCP's spec says the host application should surface sampling requests to a human who can review and deny them before they reach the model. This is a core safety property: it keeps servers from autonomously chaining unlimited AI calls without user awareness. Users see what the server is asking the AI to do and maintain control throughout.
Q4 You want Claude to classify text into categories reliably. What temperature and modelPreferences settings are most appropriate?
A temperature: 1.0, intelligencePriority: 1.0 — maximum creativity and smartness
B temperature: 0–0.1, speedPriority: 0.9 — low temperature for consistent output, fast model is sufficient for classification
C temperature: 0.5, intelligencePriority: 0.5 — balanced defaults are always best
D temperature: 0.9, costPriority: 1.0 — high temperature prevents biased classifications
B is correct. Classification tasks need consistency, not creativity. Low temperature (0–0.1) makes the output far more repeatable, so the same input almost always yields the same category, though it is not a strict guarantee of determinism. Speed priority makes sense because classification is a simple reasoning task that doesn't require the most capable model, reducing latency and cost.
Q5 A sampling call returns: "Sure! Here's the JSON:\n```json\n{\"status\": \"ok\"}\n```". What should your handler do?
A Call JSON.parse() directly on the full response string — it will work
B Strip the markdown code fences, extract the JSON block, then parse — or use a utility like extractJSON() that handles both fenced and unfenced responses
C Reject the response and retry — the model should never include prose before JSON
D Return the raw string to the user — JSON parsing is the model's responsibility
B is correct. Models frequently wrap JSON responses in markdown code fences (```json ... ```) and include conversational prose before/after the JSON. Calling JSON.parse() directly on such a string throws a SyntaxError. Always extract the JSON block first using a regex or a utility function, then parse. Wrap everything in try/catch in case the model's output is malformed.