📅 Day 19 · ⏱ 55 min · 🔥 Level 3: Ascend · 📡 Transport

Streaming Responses with SSE Transport

stdio makes you wait for the whole answer. SSE streams results incrementally as they're generated, which is critical for long-running tools, real-time data feeds, and LLM-backed tools. Today covers SSE transport from internals to production deployment.

User experience is determined by perceived latency, not actual latency. A tool that streams its first result in 200ms feels faster than one that returns everything in 3 seconds — even if the total data is identical. SSE streaming in MCP is how you build that experience.
📡 Transport Choice

SSE vs stdio — When to Use Which

MCP supports multiple transport mechanisms. Choosing the right one affects your architecture, security model, and developer experience. Here's the definitive decision matrix:

| Criterion        | stdio                  | SSE (HTTP)              | Streamable HTTP         |
|------------------|------------------------|-------------------------|-------------------------|
| Use case         | Local tools, CLI       | Remote, multi-user      | REST clients            |
| Network access   | None (local only)      | HTTP/HTTPS              | HTTP/HTTPS              |
| Streaming        | Full response only     | True streaming          | True streaming          |
| Auth support     | OS-level (file perms)  | HTTP headers/OAuth      | HTTP headers/OAuth      |
| Multiple clients | One process per client | Many concurrent clients | Many concurrent clients |
| Deployment       | Same machine           | ECS/Lambda/K8s          | ECS/Lambda/K8s          |
💡 Rule of thumb

Use stdio for developer tools on a local machine (Claude Desktop). Use SSE for anything deployed to a server, shared with multiple users, or needing real-time streaming updates.
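
The rule of thumb above can be written down as a tiny helper. This is only a sketch: `choose_transport` and its parameters are hypothetical names made up for illustration, not part of MCP or FastMCP.

```python
def choose_transport(remote: bool, multi_user: bool, needs_streaming: bool) -> str:
    """Apply the rule of thumb: stdio for local single-user tools, SSE otherwise."""
    if remote or multi_user or needs_streaming:
        return "sse"
    return "stdio"


if __name__ == "__main__":
    # A local developer tool launched by a desktop client
    print(choose_transport(remote=False, multi_user=False, needs_streaming=False))
    # A shared server behind a load balancer
    print(choose_transport(remote=True, multi_user=True, needs_streaming=True))
```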

🔬 SSE Internals

SSE Protocol Internals

Server-Sent Events (SSE) is a web standard built on top of HTTP, originally published by the W3C and now maintained as part of the WHATWG HTML Living Standard. The client opens a persistent HTTP connection and the server pushes text frames down it. The stream is one-directional, text-only, and reconnects automatically. MCP uses SSE for the server-to-client stream, and a separate HTTP POST endpoint for client-to-server messages.

📻 The Radio Station Analogy

SSE is like tuning into a radio station. You connect once (the HTTP request) and then just listen — the station (server) pushes whatever it wants to broadcast (events) whenever it wants. You don't need to ask "any new songs?" every few seconds. If the signal drops, your radio automatically tries to reconnect and picks up where it left off. In MCP: the radio station is your tool running on the server, and the songs are streamed tool results.

# SSE wire format: what actually goes over the HTTP connection (illustrative payloads)
# Each event is one or more "field: value" lines, terminated by a blank line

data: {"jsonrpc":"2.0","id":1,"result":{"type":"text","text":"Processing"}}

data: {"jsonrpc":"2.0","id":1,"result":{"type":"text","text":" step 1"}}

data: {"jsonrpc":"2.0","id":1,"result":{"type":"text","text":" of 3 complete"}}

event: done
data: {"jsonrpc":"2.0","id":1,"result":{"type":"text","text":"Done!"}}

# Resumption: the server tags events with an id; on reconnect the client
# sends the last one back in the Last-Event-ID header
id: event-42
data: {"chunk": 42, ...}

# Keepalive (prevents proxy timeouts): empty comment every 15s
: keepalive
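
To make the framing rules concrete, here is a minimal sketch of a serializer for that wire format, using only the standard library. The helper names `format_sse_event` and `keepalive_comment` are assumptions for illustration; a real MCP server library does this framing for you.

```python
import json
from typing import Optional


def format_sse_event(data: dict, event: Optional[str] = None,
                     event_id: Optional[str] = None) -> str:
    """Serialize one event in SSE wire format: field lines, then a blank line."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    if event is not None:
        lines.append(f"event: {event}")
    # json.dumps escapes newlines inside strings, so one data line is enough.
    lines.append(f"data: {json.dumps(data, separators=(',', ':'))}")
    return "\n".join(lines) + "\n\n"


def keepalive_comment(text: str = "keepalive") -> str:
    """A line starting with ':' is a comment and is ignored by SSE clients."""
    return f": {text}\n\n"


if __name__ == "__main__":
    print(format_sse_event({"chunk": 42}, event_id="event-42"), end="")
    print(keepalive_comment(), end="")
```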
🏗️ Build SSE Server

Building an SSE MCP Server

FastMCP makes SSE transport trivially easy to enable. Change one line and you have a full SSE server. The hard parts are deployment (covered in Day 23) and keeping long connections alive through load balancers and proxies.

import asyncio
from typing import AsyncIterator

from fastmcp import FastMCP

mcp = FastMCP("StreamingServer")


@mcp.tool()
async def analyze_large_dataset(dataset_id: str) -> AsyncIterator[str]:
    """Long-running analysis that streams progress updates."""
    steps = [
        "Loading dataset from S3...",
        "Validating schema...",
        "Running statistical analysis...",
        "Generating visualizations...",
        "Writing report...",
    ]
    for i, step in enumerate(steps, 1):
        await asyncio.sleep(0.5)  # simulates real work
        # Yield an intermediate result after each step.
        # Generator-based streaming is the pattern this lesson assumes;
        # check that your FastMCP version supports it.
        yield f"[{i}/{len(steps)}] {step}"
    yield f"Analysis complete for {dataset_id}!"


# Start as SSE server: one flag change from stdio
if __name__ == "__main__":
    mcp.run(transport="sse", host="0.0.0.0", port=8080)
⚠️ ALB timeout setting

AWS Application Load Balancer has a 60-second idle timeout by default. For streaming tools that might pause between chunks, increase the idle timeout to 300s in your ALB settings — or send keepalive comments (: ping) every 30 seconds to keep the connection alive.
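
One way to picture the keepalive approach is a wrapper that emits a comment frame whenever the underlying stream stays quiet for too long. The sketch below uses only asyncio; `with_keepalive` and `slow_tool` are hypothetical names, and the interval is shortened so the demo finishes quickly. It uses `asyncio.wait` rather than `asyncio.wait_for` because the latter would cancel the pending read on every timeout.

```python
import asyncio
from typing import AsyncIterator, List


async def with_keepalive(source: AsyncIterator[str], interval: float) -> AsyncIterator[str]:
    """Yield frames from `source`, inserting ': ping' after `interval` quiet seconds."""
    iterator = source.__aiter__()
    pending = asyncio.ensure_future(iterator.__anext__())
    try:
        while True:
            done, _ = await asyncio.wait({pending}, timeout=interval)
            if not done:
                yield ": ping\n\n"  # nothing arrived in time: keep the connection warm
                continue
            try:
                frame = pending.result()
            except StopAsyncIteration:
                return  # source is exhausted
            yield frame
            pending = asyncio.ensure_future(iterator.__anext__())
    finally:
        pending.cancel()  # no-op if the read already finished


async def slow_tool() -> AsyncIterator[str]:
    yield "data: step 1\n\n"
    await asyncio.sleep(0.25)  # a pause longer than the keepalive interval
    yield "data: step 2\n\n"


async def collect_frames() -> List[str]:
    return [frame async for frame in with_keepalive(slow_tool(), interval=0.1)]


if __name__ == "__main__":
    for frame in asyncio.run(collect_frames()):
        print(repr(frame))
```

In production the interval would sit comfortably below the load balancer's idle timeout, in line with the 30-second suggestion above.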

⚡ Progress Streaming

Streaming Long-Running Tool Results

The most powerful use of SSE in MCP is streaming progress from genuinely long-running operations: file processing, AI model inference, database migrations, report generation. Instead of making the user wait 30 seconds with no feedback, stream progress at each major milestone.

import asyncio
from typing import AsyncIterator

import boto3
from fastmcp import FastMCP, Context

mcp = FastMCP("ProductionStreamer")
s3 = boto3.client("s3")
bedrock = boto3.client("bedrock-runtime")  # used by the real summarization step


def list_objects(bucket: str, prefix: str) -> list:
    """Blocking helper: collect every object under the prefix."""
    paginator = s3.get_paginator("list_objects_v2")
    objects = []
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        objects.extend(page.get("Contents", []))
    return objects


@mcp.tool()
async def summarize_documents(
    ctx: Context, s3_bucket: str, prefix: str
) -> AsyncIterator[str]:
    """Stream summaries of all documents in an S3 prefix."""
    # Step 1: List objects (immediate feedback)
    yield f"📂 Scanning s3://{s3_bucket}/{prefix}..."
    # boto3 is blocking, so run it in a worker thread to keep the event loop free
    objects = await asyncio.to_thread(list_objects, s3_bucket, prefix)
    yield f"✅ Found {len(objects)} documents. Starting analysis..."

    # Step 2: Process each document with a streaming update
    for i, obj in enumerate(objects, 1):
        key = obj["Key"]
        yield f"📄 [{i}/{len(objects)}] Processing {key}..."
        # Fetch and summarize (simplified: real impl calls Bedrock)
        response = await asyncio.to_thread(s3.get_object, Bucket=s3_bucket, Key=key)
        body = await asyncio.to_thread(response["Body"].read)
        yield f"  ✓ {key}: {len(body)} bytes read"

    yield f"\n🎉 Complete! Processed {len(objects)} documents."
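
Before pointing a tool like this at real S3 data, it can help to exercise the generator shape on its own. The sketch below is standard library only and makes no claim about how FastMCP delivers the values: `fake_fetch` is a hypothetical stand-in for the blocking S3 read, and `collect_messages` simply drains the generator the way a test would.

```python
import asyncio
import time
from typing import AsyncIterator, List


def fake_fetch(key: str) -> bytes:
    """Hypothetical stand-in for a blocking S3 read."""
    time.sleep(0.01)
    return key.encode("utf-8") * 2


async def summarize_keys(keys: List[str]) -> AsyncIterator[str]:
    """Yield one progress message per milestone."""
    yield f"Found {len(keys)} documents. Starting analysis..."
    for i, key in enumerate(keys, 1):
        yield f"[{i}/{len(keys)}] Processing {key}..."
        # Run the blocking call in a worker thread so the event loop stays responsive.
        body = await asyncio.to_thread(fake_fetch, key)
        yield f"  {key}: {len(body)} bytes read"
    yield f"Complete! Processed {len(keys)} documents."


async def collect_messages(keys: List[str]) -> List[str]:
    return [message async for message in summarize_keys(keys)]


if __name__ == "__main__":
    for line in asyncio.run(collect_messages(["a.txt", "b.txt"])):
        print(line)
```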
🖥️ Client Consumption

Client-Side SSE Consumption

When you're building a custom client (a web app, a monitoring dashboard, a custom Claude integration), you need to consume SSE streams directly. The browser EventSource API and Node.js eventsource package make this straightforward.

// Browser: consuming an MCP SSE stream from a web app.
// The native EventSource constructor cannot set custom headers such as
// Authorization; it only accepts { withCredentials }. Authenticate with a
// cookie as shown here, or use a fetch-based SSE client if you need a Bearer token.
const sse = new EventSource('https://api.yourdomain.com/mcp/sse', {
  withCredentials: true
});

sse.onmessage = function (event) {
  const data = JSON.parse(event.data);
  // Each message is a JSON-RPC result chunk
  if (data.result?.type === 'text') {
    document.getElementById('output').textContent += data.result.text;
  }
};

sse.onerror = function (err) {
  // EventSource auto-reconnects on error; this fires during the reconnect attempt
  console.warn('SSE reconnecting...', err);
};

sse.addEventListener('done', function () {
  sse.close(); // Clean up when tool is complete
  showCompletion(); // app-specific UI hook, defined elsewhere
});
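
For a non-browser client, the same stream can be parsed by hand. Below is a minimal sketch in Python, standard library only. `parse_sse` is a hypothetical helper name, and it simplifies the standard: for example, it does not carry the last event ID forward across events as a full client would.

```python
from typing import Dict, Iterable, Iterator, List


def parse_sse(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per event from decoded SSE lines (trailing newlines removed)."""
    event: Dict[str, str] = {}
    data_parts: List[str] = []
    for line in lines:
        if line == "":
            # Blank line: dispatch the event collected so far, if it carries data.
            if data_parts:
                event["data"] = "\n".join(data_parts)
                event.setdefault("event", "message")  # default event type
                yield event
            event, data_parts = {}, []
            continue
        if line.startswith(":"):
            continue  # comment line, e.g. ": keepalive"
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space after the colon is dropped
        if field == "data":
            data_parts.append(value)
        elif field in ("event", "id"):
            event[field] = value


if __name__ == "__main__":
    stream = [
        ": keepalive", "",
        "id: event-42", 'data: {"chunk": 42}', "",
        "event: done", 'data: {"text": "Done!"}', "",
    ]
    for parsed in parse_sse(stream):
        print(parsed)
```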
🔄 Error Handling

Error Handling & Reconnection Logic

SSE connections drop — network hiccups, server restarts, ALB timeouts. You need idempotent tool operations (safe to retry), event IDs for resuming streams, and smart backoff logic on the client.

# Server: assign event IDs for resumable streams
@mcp.tool()
async def resumable_report(job_id: str, resume_from: int = 0):
    steps = get_report_steps(job_id)  # app-specific helper, defined elsewhere
    for i, step in enumerate(steps):
        if i < resume_from:
            continue  # skip already-delivered chunks
        yield {"id": i, "data": step}

// Client: exponential backoff reconnection
class ResilientSSE {
  constructor(url, options) {
    this.url = url;
    this.options = options;
    this.retries = 0;
    this.lastEventId = 0; // count of chunks received so far
    this.connect();
  }

  connect() {
    const url = this.lastEventId
      ? `${this.url}?resume_from=${this.lastEventId}`
      : this.url;
    this.es = new EventSource(url);
    this.es.onopen = () => {
      this.retries = 0; // reset backoff once a connection succeeds
    };
    this.es.onmessage = (e) => {
      this.lastEventId++;
      this.options.onMessage(e);
    };
    this.es.onerror = () => {
      // Close to disable the built-in retry, then reconnect with backoff
      this.es.close();
      const delay = Math.min(1000 * 2 ** this.retries++, 30000);
      setTimeout(() => this.connect(), delay);
    };
  }
}
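
The two ideas in this section, capped exponential backoff and index-based resumption, can be checked in isolation. The sketch below is a standard-library simulation under stated assumptions: the function names are hypothetical, and the "connection drop" is just a `break`, not real network behavior.

```python
from typing import Iterator, List, Tuple


def backoff_delay_ms(retries: int, base_ms: int = 1000, cap_ms: int = 30000) -> int:
    """Delay before retry number `retries`: base * 2^retries, capped at cap_ms."""
    return min(base_ms * 2 ** retries, cap_ms)


def stream_steps(steps: List[str], resume_from: int = 0) -> Iterator[Tuple[int, str]]:
    """Server side: skip chunks the client already has."""
    for i, step in enumerate(steps):
        if i >= resume_from:
            yield i, step


def consume_with_drop(steps: List[str], drop_after: int) -> List[str]:
    """Client side: lose the connection once, then resume at the first missing chunk."""
    received: List[str] = []
    for i, step in stream_steps(steps):
        if i == drop_after:
            break  # simulated connection loss before chunk `drop_after` arrives
        received.append(step)
    # Reconnect and ask the server to resume where delivery stopped.
    for _, step in stream_steps(steps, resume_from=len(received)):
        received.append(step)
    return received


if __name__ == "__main__":
    print([backoff_delay_ms(n) for n in range(6)])
    print(consume_with_drop(["a", "b", "c", "d"], drop_after=2))
```

Resumption only gives correct results if replaying a chunk is harmless, which is why the tool operations themselves need to be idempotent.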
🏭 Production Pattern at Scale
A data analytics company runs 50+ MCP servers on ECS Fargate behind an ALB. Each server handles 200 concurrent SSE connections. ALB timeout is set to 300s. Each server sends : ping every 25s to prevent idle disconnection. CloudWatch monitors connection counts — auto-scaling kicks in at 150 concurrent connections per task. Result: zero dropped streams in normal operation, graceful reconnect on deploys.
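
The arithmetic behind those numbers is worth spelling out: scaling at 150 connections on a task that can hold 200 leaves a quarter of the capacity free while new tasks start. A small sketch follows; the function names and the 6,000-connection example load are hypothetical and not taken from the scenario.

```python
import math


def tasks_needed(total_connections: int, target_per_task: int = 150) -> int:
    """Tasks required to keep every task at or below the scaling target."""
    return max(1, math.ceil(total_connections / target_per_task))


def headroom(target_per_task: int = 150, capacity_per_task: int = 200) -> float:
    """Fraction of per-task capacity still free when scaling starts."""
    return 1 - target_per_task / capacity_per_task


if __name__ == "__main__":
    print(tasks_needed(6000))  # example load, chosen for illustration
    print(headroom())
```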
🧠 Knowledge Check — Day 19
4 questions on SSE transport and streaming
QUESTION 01 / 04
Which transport should you choose for a multi-user MCP server deployed on AWS ECS that needs real-time streaming?
A. stdio: it's the default and most compatible
B. SSE (HTTP): designed for remote, multi-user streaming deployments
C. WebSocket: more powerful than SSE
D. gRPC: lowest latency
✅ B. SSE transport is the right choice for remote multi-user deployments. stdio is local-only. MCP doesn't natively support WebSocket or gRPC transports in the current spec.
QUESTION 02 / 04
Why do SSE MCP servers need keepalive comments sent periodically?
A. To increase throughput
B. To prevent AWS ALB and proxy servers from closing idle connections due to timeout
C. To authenticate each event
D. To compress the stream
✅ B. AWS ALB has a 60-second idle timeout by default. If no data is sent in 60 seconds, the connection is closed. Keepalive pings (empty SSE comments) keep the connection "active" even during tool processing delays.
QUESTION 03 / 04
In FastMCP, how do you stream intermediate results from a tool?
A. Return a list of strings
B. Use yield statements inside the tool function (async generator)
C. Call ctx.send_stream()
D. Write to a shared queue
✅ B. Using yield makes the tool an async generator. FastMCP automatically handles streaming each yielded value to the client via the SSE connection. This is the cleanest, most Pythonic way to stream.
QUESTION 04 / 04
What is the purpose of event IDs in SSE streams?
A. Authentication tokens for each event
B. Sequence numbers that allow clients to resume a broken stream from the last received event
C. Database primary keys for event storage
D. Rate limiting identifiers
✅ B. When a client reconnects, it sends the Last-Event-ID header. The server can skip already-delivered events and resume from the next one, giving you resumable, reliable streaming.
Up Next — Day 20
Multi-Server Orchestration & Composition
Route requests across multiple specialized MCP servers and compose their tools into a unified interface.