MCP Middleware - FastMCP

import { VersionBadge } from "/snippets/version-badge.mdx"

MCP middleware is a powerful concept that allows you to add cross-cutting functionality to your FastMCP server. Unlike traditional web middleware, MCP middleware is designed specifically for the Model Context Protocol, providing hooks for different types of MCP operations like tool calls, resource reads, and prompt requests.

<Tip> MCP middleware is a FastMCP-specific concept and is not part of the official MCP protocol specification. This middleware system is designed to work with FastMCP servers and may not be compatible with other MCP implementations. </Tip>

<Warning> MCP middleware is a brand-new concept and may be subject to breaking changes in future versions. </Warning>

What is MCP Middleware?

MCP middleware lets you intercept and modify MCP requests and responses as they flow through your server. Think of it as a pipeline where each piece of middleware can inspect what's happening, make changes, and then pass control to the next middleware in the chain.

Common use cases for MCP middleware include:

Authentication and Authorization: Verify client permissions before executing operations
Logging and Monitoring: Track usage patterns and performance metrics
Rate Limiting: Control request frequency per client or operation type
Request/Response Transformation: Modify data before it reaches tools or after it leaves
Caching: Store frequently requested data to improve performance
Error Handling: Provide consistent error responses across your server

How Middleware Works

FastMCP middleware operates on a pipeline model. When a request comes in, it flows through your middleware in the order in which each middleware was added to the server. Each middleware can:

Inspect the incoming request and its context
Modify the request before passing it to the next middleware or handler
Execute the next middleware/handler in the chain by calling call_next()
Inspect and modify the response before returning it
Handle errors that occur during processing

The key insight is that middleware forms a chain where each piece decides whether to continue processing or stop the chain entirely.
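To make the chain concrete, here is a minimal standard-library sketch of how such a pipeline could be composed. It does not use FastMCP and is not how FastMCP is implemented internally; `TraceMiddleware`, `build_chain`, and the dict-based context are hypothetical stand-ins chosen for illustration.

```python
import asyncio
from functools import partial


class TraceMiddleware:
    """Stand-in middleware that records when it runs (not a FastMCP class)."""

    def __init__(self, name: str, trace: list):
        self.name = name
        self.trace = trace

    async def __call__(self, context: dict, call_next):
        self.trace.append(f"{self.name}: before")  # pre-processing
        result = await call_next(context)          # continue the chain
        self.trace.append(f"{self.name}: after")   # post-processing
        return result


def build_chain(middleware: list, handler):
    """Wrap the handler so the first middleware in the list runs outermost."""
    chain = handler
    for mw in reversed(middleware):
        chain = partial(mw, call_next=chain)
    return chain


trace = []


async def handler(context: dict) -> str:
    """Final handler at the end of the chain."""
    trace.append("handler")
    return f"handled {context['method']}"


chain = build_chain(
    [TraceMiddleware("first", trace), TraceMiddleware("second", trace)],
    handler,
)
result = asyncio.run(chain({"method": "tools/call"}))
```

After this runs, `trace` shows the first middleware entering first and leaving last, with the handler in the middle. A middleware that returned without awaiting `call_next` would stop the chain at that point.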

If you're familiar with ASGI middleware, the basic structure of FastMCP middleware will feel familiar. At its core, middleware is a callable class that receives a context object containing information about the current JSON-RPC message and a handler function to continue the middleware chain.

It's important to understand that MCP operates on the JSON-RPC specification. While FastMCP presents requests and responses in a familiar way, these are fundamentally JSON-RPC messages, not HTTP request/response pairs like you might be used to in web applications. FastMCP middleware works with all transport types, including local stdio transport and HTTP transports, though not all middleware implementations are compatible across all transports (e.g., middleware that inspects HTTP headers won't work with stdio transport).
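As an illustration of the JSON-RPC framing described above, the sketch below shows what a request and a notification look like as plain Python dictionaries. The tool name `add` and its arguments are hypothetical, and `message_type` is a helper written for this example rather than a FastMCP function.

```python
import json

# A request carries an "id", so the sender expects a response with the same id.
tool_call_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {"name": "add", "arguments": {"a": 1, "b": 2}},  # hypothetical tool
}

# A notification has no "id" and never receives a response.
initialized_notification = {
    "jsonrpc": "2.0",
    "method": "notifications/initialized",
}


def message_type(message: dict) -> str:
    """Classify a JSON-RPC message that has a "method" field."""
    return "request" if "id" in message else "notification"


# Both messages are ordinary JSON on the wire, whatever the transport.
wire_format = json.dumps(tool_call_request)
```

The same distinction appears in middleware as `context.type`, which is either `"request"` or `"notification"`.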

The most fundamental way to implement middleware is by overriding the __call__ method on the Middleware base class:

python

from fastmcp.server.middleware import Middleware, MiddlewareContext

class RawMiddleware(Middleware):
    async def __call__(self, context: MiddlewareContext, call_next):
        # This method receives ALL messages regardless of type
        print(f"Raw middleware processing: {context.method}")
        result = await call_next(context)
        print(f"Raw middleware completed: {context.method}")
        return result

This gives you complete control over every message that flows through your server, but requires you to handle all message types manually.

Middleware Hooks

To make it easier for users to target specific types of messages, FastMCP middleware provides a variety of specialized hooks. Instead of implementing the raw __call__ method, you can override specific hook methods that are called only for certain types of operations, allowing you to target exactly the level of specificity you need for your middleware logic.

Hook Hierarchy and Execution Order

FastMCP provides multiple hooks that are called with varying levels of specificity. Understanding this hierarchy is crucial for effective middleware design.

When a request comes in, multiple hooks may be called for the same request, going from general to specific:

on_message - Called for ALL MCP messages (both requests and notifications)
on_request or on_notification - Called based on the message type
Operation-specific hooks - Called for specific MCP operations like on_call_tool

For example, when a client calls a tool, your middleware will receive multiple hook calls:

on_message and on_request for any initial tool discovery operations (list_tools)
on_message (because it's any MCP message) for the tool call itself
on_request (because tool calls expect responses) for the tool call itself
on_call_tool (because it's specifically a tool execution) for the tool call itself

Note that the MCP SDK may perform additional operations like listing tools for caching purposes, which will trigger additional middleware calls beyond just the direct tool execution.

This hierarchy allows you to target your middleware logic with the right level of specificity. Use on_message for broad concerns like logging, on_request for authentication, and on_call_tool for tool-specific logic like performance monitoring.
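The general-to-specific ordering can be sketched with a small stand-in base class. This is a simplified model for illustration only, not FastMCP's actual dispatch code; `SketchMiddleware`, its `OPERATION_HOOKS` table, and the dict-based context are assumptions.

```python
import asyncio
from functools import partial


class SketchMiddleware:
    """Simplified stand-in for a hook-dispatching base class (not FastMCP's code)."""

    # Hypothetical mapping from MCP method names to operation-specific hooks.
    OPERATION_HOOKS = {"tools/call": "on_call_tool", "tools/list": "on_list_tools"}

    def __init__(self):
        self.calls = []

    async def __call__(self, context: dict, call_next):
        # Build the chain inside-out so the most general hook runs first.
        handler = call_next
        hook_name = self.OPERATION_HOOKS.get(context["method"])
        if hook_name is not None:
            handler = partial(getattr(self, hook_name), call_next=handler)
        if context["type"] == "request":
            handler = partial(self.on_request, call_next=handler)
        else:
            handler = partial(self.on_notification, call_next=handler)
        return await self.on_message(context, handler)

    async def _record(self, hook: str, context: dict, call_next):
        self.calls.append(hook)
        return await call_next(context)

    async def on_message(self, context, call_next):
        return await self._record("on_message", context, call_next)

    async def on_request(self, context, call_next):
        return await self._record("on_request", context, call_next)

    async def on_notification(self, context, call_next):
        return await self._record("on_notification", context, call_next)

    async def on_call_tool(self, context, call_next):
        return await self._record("on_call_tool", context, call_next)

    async def on_list_tools(self, context, call_next):
        return await self._record("on_list_tools", context, call_next)


async def final_handler(context: dict) -> str:
    return "ok"


def hooks_for(method: str, message_type: str) -> list:
    """Return the hooks that fire, in order, for one message."""
    middleware = SketchMiddleware()
    asyncio.run(middleware({"method": method, "type": message_type}, final_handler))
    return middleware.calls
```

In this model, `hooks_for("tools/call", "request")` yields `on_message`, `on_request`, then `on_call_tool`, while a notification only reaches `on_message` and `on_notification`.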

Available Hooks

on_message: Called for all MCP messages (requests and notifications)
on_request: Called specifically for MCP requests (that expect responses)
on_notification: Called specifically for MCP notifications (fire-and-forget)
on_call_tool: Called when tools are being executed
on_read_resource: Called when resources are being read
on_get_prompt: Called when prompts are being retrieved
on_list_tools: Called when listing available tools
on_list_resources: Called when listing available resources
on_list_resource_templates: Called when listing resource templates
on_list_prompts: Called when listing available prompts
<VersionBadge version="2.13.0" />
on_initialize: Called when a client connects and initializes the session (returns None)
<Note>

The on_initialize hook receives the client's initialization request but returns None rather than a result. The initialization response is handled internally by the MCP protocol and cannot be modified by middleware. This hook is useful for client detection, logging connections, or initializing session state, but not for modifying the initialization handshake itself. </Note>

Example:

python

from fastmcp.server.middleware import Middleware, MiddlewareContext
from mcp import McpError
from mcp.types import ErrorData

class InitializationMiddleware(Middleware):
    async def on_initialize(self, context: MiddlewareContext, call_next):
        # Read the client name from the initialization request
        client_info = context.message.params.get("clientInfo", {})
        client_name = client_info.get("name", "unknown")

        # Reject unsupported clients BEFORE call_next
        if client_name == "unsupported-client":
            raise McpError(ErrorData(code=-32000, message="This client is not supported"))

        # Log successful initialization
        await call_next(context)
        print(f"Client {client_name} initialized successfully")

<Warning> If you raise `McpError` in `on_initialize` **after** calling `call_next()`, the error will only be logged and will not be sent to the client. The initialization response has already been sent at that point. Always raise `McpError` **before** `call_next()` if you want to reject the initialization. </Warning>

MCP Session Availability in Middleware

The MCP session and request context are not available during certain phases like initialization. When middleware runs during these phases, context.fastmcp_context.request_context returns None rather than the full MCP request context.

This typically occurs when:

The on_request hook fires during client initialization
The MCP handshake hasn't completed yet

To handle this in middleware, check if the MCP request context is available before accessing MCP-specific attributes. Note that the MCP request context is distinct from the HTTP request - for HTTP transports, you can use HTTP helpers to access request data even when the MCP session is not available:

python

from fastmcp.server.middleware import Middleware, MiddlewareContext

class SessionAwareMiddleware(Middleware):
    async def on_request(self, context: MiddlewareContext, call_next):
        ctx = context.fastmcp_context

        if ctx.request_context:
            # MCP session available - can access session-specific attributes
            session_id = ctx.session_id
            request_id = ctx.request_id
        else:
            # MCP session not available yet - use HTTP helpers for request data (if using HTTP transport)
            from fastmcp.server.dependencies import get_http_headers
            headers = get_http_headers()
            # Access HTTP data for auth, logging, etc.

        return await call_next(context)

For HTTP request data (headers, client IP, etc.) when using HTTP transports, use get_http_request() or get_http_headers() from fastmcp.server.dependencies, which work regardless of MCP session availability. See HTTP Requests for details.

Component Access in Middleware

Understanding how to access component information (tools, resources, prompts) in middleware is crucial for building powerful middleware functionality. The access patterns differ significantly between listing operations and execution operations.

Listing Operations vs Execution Operations

FastMCP middleware handles two types of operations differently:

Listing Operations (on_list_tools, on_list_resources, on_list_prompts, etc.):

Middleware receives FastMCP component objects with full metadata
These objects include FastMCP-specific properties like tags that can be accessed directly from the component
The result contains complete component information before it's converted to MCP format
Tags are included in the component's meta field in the listing response returned to MCP clients

Execution Operations (on_call_tool, on_read_resource, on_get_prompt):

Middleware runs before the component is executed
The middleware result is either the execution result or an error if the component wasn't found
Component metadata isn't directly available in the hook parameters

Accessing Component Metadata During Execution

If you need to check component properties (like tags) during execution operations, use the FastMCP server instance available through the context:

python

from fastmcp.server.middleware import Middleware, MiddlewareContext
from fastmcp.exceptions import ToolError

class TagBasedMiddleware(Middleware):
    async def on_call_tool(self, context: MiddlewareContext, call_next):
        # Look up the tool object to check its metadata
        if context.fastmcp_context:
            try:
                tool = await context.fastmcp_context.fastmcp.get_tool(context.message.name)
            except Exception:
                # Tool not found or lookup failed - let execution continue
                # and handle the error naturally
                tool = None

            # Run the checks outside the try block so the ToolError raised
            # here is not swallowed by the except clause above
            if tool is not None:
                # Check if this tool has a "private" tag
                if "private" in tool.tags:
                    raise ToolError("Access denied: private tool")

                # Check if tool is enabled
                if not tool.enabled:
                    raise ToolError("Tool is currently disabled")

        return await call_next(context)

The same pattern works for resources and prompts:

python

from fastmcp.server.middleware import Middleware, MiddlewareContext
from fastmcp.exceptions import ResourceError, PromptError

class ComponentAccessMiddleware(Middleware):
    async def on_read_resource(self, context: MiddlewareContext, call_next):
        if context.fastmcp_context:
            try:
                resource = await context.fastmcp_context.fastmcp.get_resource(context.message.uri)
            except Exception:
                # Resource not found - let execution fail naturally
                resource = None
            # Check outside the try block so the denial is not swallowed
            if resource is not None and "restricted" in resource.tags:
                raise ResourceError("Access denied: restricted resource")
        return await call_next(context)
    
    async def on_get_prompt(self, context: MiddlewareContext, call_next):
        if context.fastmcp_context:
            try:
                prompt = await context.fastmcp_context.fastmcp.get_prompt(context.message.name)
            except Exception:
                # Prompt not found - let execution fail naturally
                prompt = None
            # Check outside the try block so the denial is not swallowed
            if prompt is not None and not prompt.enabled:
                raise PromptError("Prompt is currently disabled")
        return await call_next(context)

Working with Listing Results

For listing operations, the middleware call_next function returns a list of FastMCP components prior to being converted to MCP format. You can filter or modify this list and return it to the client. For example:

python

from fastmcp.server.middleware import Middleware, MiddlewareContext

class ListingFilterMiddleware(Middleware):
    async def on_list_tools(self, context: MiddlewareContext, call_next):
        result = await call_next(context)
        
        # Filter out tools with "private" tag
        filtered_tools = [
            tool for tool in result 
            if "private" not in tool.tags
        ]
        
        # Return modified list
        return filtered_tools

This filtering happens before the components are converted to MCP format and returned to the client. Tags are accessible during filtering and are also included in the component's meta field in the final listing response.

<Tip> When filtering components in listing operations, ensure you also prevent execution of filtered components in the corresponding execution hooks (`on_call_tool`, `on_read_resource`, `on_get_prompt`) to maintain consistency. </Tip>

Tool Call Denial

You can deny access to specific tools by raising a ToolError in your middleware. This is the correct way to block tool execution, as it integrates properly with the FastMCP error handling system.

python

from fastmcp.server.middleware import Middleware, MiddlewareContext
from fastmcp.exceptions import ToolError

class AuthMiddleware(Middleware):
    async def on_call_tool(self, context: MiddlewareContext, call_next):
        tool_name = context.message.name
        
        # Deny access to restricted tools
        if tool_name.lower() in ["delete", "admin_config"]:
            raise ToolError("Access denied: tool requires admin privileges")
        
        # Allow other tools to proceed
        return await call_next(context)

<Warning> When denying tool calls, always raise `ToolError` rather than returning `ToolResult` objects or other values. `ToolError` ensures proper error propagation through the middleware chain and converts to the correct MCP error response format. </Warning>

Tool Call Modification

For execution operations like tool calls, you can modify arguments before execution or transform results afterward:

python

from fastmcp.server.middleware import Middleware, MiddlewareContext

class ToolCallMiddleware(Middleware):
    async def on_call_tool(self, context: MiddlewareContext, call_next):
        # Modify arguments before execution
        if context.message.name == "calculate":
            # Ensure positive inputs
            if context.message.arguments.get("value", 0) < 0:
                context.message.arguments["value"] = abs(context.message.arguments["value"])
        
        result = await call_next(context)
        
        # Transform result after execution
        if context.message.name == "get_data":
            # Add metadata to result
            if result.structured_content:
                result.structured_content["processed_at"] = "2024-01-01T00:00:00Z"
        
        return result

<Tip> For more complex tool rewriting scenarios, consider using [Tool Transformation](/v2/patterns/tool-transformation) patterns which provide a more structured approach to creating modified tool variants. </Tip>

Anatomy of a Hook

Every middleware hook follows the same pattern. Let's examine the on_message hook to understand the structure:

python

async def on_message(self, context: MiddlewareContext, call_next):
    # 1. Pre-processing: Inspect and optionally modify the request
    print(f"Processing {context.method}")
    
    # 2. Chain continuation: Call the next middleware/handler
    result = await call_next(context)
    
    # 3. Post-processing: Inspect and optionally modify the response
    print(f"Completed {context.method}")
    
    # 4. Return the result (potentially modified)
    return result

Hook Parameters

Every hook receives two parameters:

context: MiddlewareContext - Contains information about the current request:
- context.method - The MCP method name (e.g., "tools/call")
- context.source - Where the request came from ("client" or "server")
- context.type - Message type ("request" or "notification")
- context.message - The MCP message data
- context.timestamp - When the request was received
- context.fastmcp_context - FastMCP Context object (if available)
call_next - A function that continues the middleware chain. You must call this to proceed, unless you want to stop processing entirely.

Control Flow

You have complete control over the request flow:

Continue processing: Call await call_next(context) to proceed
Modify the request: Change the context before calling call_next
Modify the response: Change the result after calling call_next
Stop the chain: Don't call call_next (rarely needed)
Handle errors: Wrap call_next in try/except blocks
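The options above can be sketched in a single hook body. The example below uses only the standard library: `context` is a plain dict rather than a real `MiddlewareContext`, and the method name `maintenance/blocked` and the handlers are hypothetical, so treat it as an illustration of the control flow rather than production middleware.

```python
import asyncio


async def guarded_hook(context: dict, call_next):
    """Stand-in hook body; `context` is a plain dict here, not a MiddlewareContext."""
    # Stop the chain: return a result without calling call_next.
    if context["method"] == "maintenance/blocked":  # hypothetical method name
        return {"status": "skipped"}

    try:
        # Continue processing: hand control to the next middleware or handler.
        result = await call_next(context)
    except TimeoutError as exc:
        # Handle errors: translate a failure into a fallback result.
        return {"status": "failed", "reason": str(exc)}

    # Modify the response after call_next returns.
    result["status"] = "ok"
    return result


async def healthy_handler(context: dict) -> dict:
    return {"method": context["method"]}


async def slow_handler(context: dict) -> dict:
    raise TimeoutError("backend timed out")


ok = asyncio.run(guarded_hook({"method": "tools/call"}, healthy_handler))
failed = asyncio.run(guarded_hook({"method": "tools/call"}, slow_handler))
skipped = asyncio.run(guarded_hook({"method": "maintenance/blocked"}, slow_handler))
```

In real middleware it is usually safer to log and re-raise than to replace an error with a fallback value, since swallowing exceptions can hide failures from the client.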

State Management

In addition to modifying the request and response, you can also store state data that your tools can (optionally) access later. To do so, use the FastMCP Context to either set_state or get_state as appropriate. For more information, see the Context State Management docs.

Creating Middleware

FastMCP middleware is implemented by subclassing the Middleware base class and overriding the hooks you need. You only need to implement the hooks that are relevant to your use case.

python

from fastmcp import FastMCP
from fastmcp.server.middleware import Middleware, MiddlewareContext

class LoggingMiddleware(Middleware):
    """Middleware that logs all MCP operations."""
    
    async def on_message(self, context: MiddlewareContext, call_next):
        """Called for all MCP messages."""
        print(f"Processing {context.method} from {context.source}")
        
        result = await call_next(context)
        
        print(f"Completed {context.method}")
        return result

# Add middleware to your server
mcp = FastMCP("MyServer")
mcp.add_middleware(LoggingMiddleware())

This creates a basic logging middleware that will print information about every request that flows through your server.

Adding Middleware to Your Server

Single Middleware

Adding middleware to your server is straightforward:

python

mcp = FastMCP("MyServer")
mcp.add_middleware(LoggingMiddleware())

Multiple Middleware

Middleware executes in the order it's added to the server. The first middleware added runs first on the way in, and last on the way out:

python

mcp = FastMCP("MyServer")

# AuthenticationMiddleware and PerformanceMiddleware stand in for your own middleware classes
mcp.add_middleware(AuthenticationMiddleware("secret-token"))
mcp.add_middleware(PerformanceMiddleware())
mcp.add_middleware(LoggingMiddleware())

This creates the following execution flow:

AuthenticationMiddleware (pre-processing)
PerformanceMiddleware (pre-processing)
LoggingMiddleware (pre-processing)
Actual tool/resource handler
LoggingMiddleware (post-processing)
PerformanceMiddleware (post-processing)
AuthenticationMiddleware (post-processing)

Server Composition and Middleware

When using Server Composition with mount or import_server, middleware behavior follows these rules:

Parent server middleware runs for all requests, including those routed to mounted servers
Mounted server middleware only runs for requests handled by that specific server
Middleware order is preserved within each server

This allows you to create layered middleware architectures where parent servers handle cross-cutting concerns like authentication, while child servers focus on domain-specific middleware.

python

# Parent server with middleware
parent = FastMCP("Parent")
parent.add_middleware(AuthenticationMiddleware("token"))

# Child server with its own middleware  
child = FastMCP("Child")
child.add_middleware(LoggingMiddleware())

@child.tool
def child_tool() -> str:
    return "from child"

# Mount the child server
parent.mount(child, prefix="child")

When a client calls the child's tool (exposed through the parent under the "child" prefix), the request flows through the parent's authentication middleware first, then routes to the child server, where it passes through the child's logging middleware.

Built-in Middleware Examples

FastMCP includes several middleware implementations that demonstrate best practices and provide immediately useful functionality. Let's explore how each type works by building simplified versions, then see how to use the full implementations.

Timing Middleware

Performance monitoring is essential for understanding your server's behavior and identifying bottlenecks. FastMCP includes timing middleware at fastmcp.server.middleware.timing.

Here's an example of how it works:

python

import time
from fastmcp.server.middleware import Middleware, MiddlewareContext

class SimpleTimingMiddleware(Middleware):
    async def on_request(self, context: MiddlewareContext, call_next):
        start_time = time.perf_counter()
        
        try:
            result = await call_next(context)
            duration_ms = (time.perf_counter() - start_time) * 1000
            print(f"Request {context.method} completed in {duration_ms:.2f}ms")
            return result
        except Exception as e:
            duration_ms = (time.perf_counter() - start_time) * 1000
            print(f"Request {context.method} failed after {duration_ms:.2f}ms: {e}")
            raise

To use the full version with proper logging and configuration:

python

from fastmcp.server.middleware.timing import (
    TimingMiddleware, 
    DetailedTimingMiddleware
)

# Basic timing for all requests
mcp.add_middleware(TimingMiddleware())

# Detailed per-operation timing (tools, resources, prompts)
mcp.add_middleware(DetailedTimingMiddleware())

The built-in versions include custom logger support and proper formatting. DetailedTimingMiddleware also provides operation-specific hooks like on_call_tool and on_read_resource for granular timing.

Tool Injection Middleware

Tool injection middleware injects additional tools into the server during the request lifecycle:

python

from fastmcp.tools import Tool
from fastmcp.server.middleware.tool_injection import ToolInjectionMiddleware

def my_tool_fn(a: int, b: int) -> int:
    return a + b

my_tool = Tool.from_function(fn=my_tool_fn, name="my_tool")

mcp.add_middleware(ToolInjectionMiddleware(tools=[my_tool]))

Prompt Tool Middleware

Prompt tool middleware is a compatibility middleware for clients that are unable to list or get prompts. It provides two tools: list_prompts and get_prompt which allow clients to list and get prompts respectively using only tool calls.

python

from fastmcp.server.middleware.tool_injection import PromptToolMiddleware

mcp.add_middleware(PromptToolMiddleware())

Resource Tool Middleware

Resource tool middleware is a compatibility middleware for clients that are unable to list or read resources. It provides two tools: list_resources and read_resource which allow clients to list and read resources respectively using only tool calls.

python

from fastmcp.server.middleware.tool_injection import ResourceToolMiddleware

mcp.add_middleware(ResourceToolMiddleware())

Caching Middleware

Caching middleware is essential for improving performance and reducing server load. FastMCP provides caching middleware at fastmcp.server.middleware.caching.

Here's how to use it:

python

from fastmcp.server.middleware.caching import ResponseCachingMiddleware

mcp.add_middleware(ResponseCachingMiddleware())

Out of the box, it caches list operations as well as tool calls, resource reads, and prompt retrievals in an in-memory cache with TTL-based expiration. Cache entries expire based on their TTL; there is no event-based cache invalidation. List calls are stored under global keys, so when sharing a storage backend across multiple servers, consider namespacing collections to prevent conflicts. See Storage Backends for advanced configuration options.
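To illustrate what TTL-based expiration means in practice, here is a minimal standard-library sketch. `TTLCache` is a hypothetical class written for this example and is not the storage layer that `ResponseCachingMiddleware` uses; the injectable `clock` exists only to make the behavior easy to demonstrate.

```python
import time


class TTLCache:
    """Tiny in-memory cache where each entry expires after a fixed TTL."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get(self, key: str):
        entry = self._entries.get(key)
        if entry is None:
            return None  # cache miss
        expires_at, value = entry
        if self.clock() >= expires_at:
            # Expired: drop the entry and report a miss.
            del self._entries[key]
            return None
        return value

    def set(self, key: str, value) -> None:
        self._entries[key] = (self.clock() + self.ttl, value)


# Use a fake clock so expiry can be shown without waiting.
now = [0.0]
cache = TTLCache(ttl_seconds=30, clock=lambda: now[0])
cache.set("tools/list", ["tool1"])
fresh = cache.get("tools/list")    # still within the TTL
now[0] = 31.0
expired = cache.get("tools/list")  # TTL has elapsed
```

Because nothing invalidates an entry before its TTL elapses, a tool added to the server within that window would not appear in a cached listing until the entry expires.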

Each method can be configured individually, for example, caching list tools for 30 seconds, limiting caching to specific tools, and disabling caching for resource reads:

python

from fastmcp.server.middleware.caching import ResponseCachingMiddleware, CallToolSettings, ListToolsSettings, ReadResourceSettings

mcp.add_middleware(ResponseCachingMiddleware(
    list_tools_settings=ListToolsSettings(
        ttl=30,
    ),
    call_tool_settings=CallToolSettings(
        included_tools=["tool1"],
    ),
    read_resource_settings=ReadResourceSettings(
        enabled=False
    )
))

Storage Backends

By default, caching uses in-memory storage, which is fast but doesn't persist across restarts. For production or persistent caching across server restarts, configure a different storage backend. See Storage Backends for complete options including disk, Redis, DynamoDB, and custom implementations.

Disk-based caching example:

python

from fastmcp.server.middleware.caching import ResponseCachingMiddleware
from key_value.aio.stores.disk import DiskStore

mcp.add_middleware(ResponseCachingMiddleware(
    cache_storage=DiskStore(directory="cache"),
))

Redis for distributed deployments:

python

from fastmcp.server.middleware.caching import ResponseCachingMiddleware
from key_value.aio.stores.redis import RedisStore

mcp.add_middleware(ResponseCachingMiddleware(
    cache_storage=RedisStore(host="redis.example.com", port=6379),
))

Cache Statistics

The caching middleware collects operation statistics (hits, misses, etc.) through the underlying storage layer. Access statistics from the middleware instance:

python

from fastmcp.server.middleware.caching import ResponseCachingMiddleware

middleware = ResponseCachingMiddleware()
mcp.add_middleware(middleware)

# Later, retrieve statistics
stats = middleware.statistics()
print(f"Cache statistics: {stats}")

Logging Middleware

Request and response logging is crucial for debugging, monitoring, and understanding usage patterns in your MCP server. FastMCP provides comprehensive logging middleware at fastmcp.server.middleware.logging.

Here's an example of how it works:

python

from fastmcp.server.middleware import Middleware, MiddlewareContext

class SimpleLoggingMiddleware(Middleware):
    async def on_message(self, context: MiddlewareContext, call_next):
        print(f"Processing {context.method} from {context.source}")
        
        try:
            result = await call_next(context)
            print(f"Completed {context.method}")
            return result
        except Exception as e:
            print(f"Failed {context.method}: {e}")
            raise

To use the full versions with advanced features:

python

from fastmcp.server.middleware.logging import (
    LoggingMiddleware, 
    StructuredLoggingMiddleware
)

# Human-readable logging with payload support
mcp.add_middleware(LoggingMiddleware(
    include_payloads=True,
    max_payload_length=1000
))

# JSON-structured logging for log aggregation tools
mcp.add_middleware(StructuredLoggingMiddleware(include_payloads=True))

The built-in versions include payload logging, structured JSON output, custom logger support, payload size limits, and operation-specific hooks for granular control.
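As a rough idea of what structured logging with a payload size limit involves, the sketch below builds one JSON log line using only the standard library. The field names (`event`, `method`, `source`, `payload`) and the `format_log_entry` helper are assumptions for illustration and may not match the output of `StructuredLoggingMiddleware`.

```python
import json


def format_log_entry(method: str, source: str, payload=None, max_payload_length: int = 1000) -> str:
    """Build one JSON log line; the field names are illustrative assumptions."""
    entry = {"event": "request_start", "method": method, "source": source}
    if payload is not None:
        serialized = json.dumps(payload, default=str)
        if len(serialized) > max_payload_length:
            # Truncate oversized payloads so a single message cannot flood the log.
            serialized = serialized[:max_payload_length] + "..."
        entry["payload"] = serialized
    return json.dumps(entry)


line = format_log_entry(
    "tools/call",
    "client",
    {"name": "add", "arguments": {"a": 1}},  # hypothetical tool call payload
    max_payload_length=20,
)
```

Emitting one JSON object per line keeps each entry easy to parse for log aggregation tools.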

Rate Limiting Middleware

Rate limiting is essential for protecting your server from abuse, ensuring fair resource usage, and maintaining performance under load. FastMCP includes sophisticated rate limiting middleware at fastmcp.server.middleware.rate_limiting.

Here's an example of how it works:

python

import time
from collections import defaultdict
from fastmcp.server.middleware import Middleware, MiddlewareContext
from mcp import McpError
from mcp.types import ErrorData

class SimpleRateLimitMiddleware(Middleware):
    def __init__(self, requests_per_minute: int = 60):
        self.requests_per_minute = requests_per_minute
        self.client_requests = defaultdict(list)
    
    async def on_request(self, context: MiddlewareContext, call_next):
        current_time = time.time()
        client_id = "default"  # In practice, extract from headers or context
        
        # Clean old requests and check limit
        cutoff_time = current_time - 60
        self.client_requests[client_id] = [
            req_time for req_time in self.client_requests[client_id]
            if req_time > cutoff_time
        ]
        
        if len(self.client_requests[client_id]) >= self.requests_per_minute:
            raise McpError(ErrorData(code=-32000, message="Rate limit exceeded"))
        
        self.client_requests[client_id].append(current_time)
        return await call_next(context)

To use the full versions with advanced algorithms:

python

from fastmcp.server.middleware.rate_limiting import (
    RateLimitingMiddleware, 
    SlidingWindowRateLimitingMiddleware
)

# Token bucket rate limiting (allows controlled bursts)
mcp.add_middleware(RateLimitingMiddleware(
    max_requests_per_second=10.0,
    burst_capacity=20
))

# Sliding window rate limiting (precise time-based control)
mcp.add_middleware(SlidingWindowRateLimitingMiddleware(
    max_requests=100,
    window_minutes=1
))

The built-in versions include token bucket algorithms, per-client identification, global rate limiting, and async-safe implementations with configurable client identification functions.
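For readers unfamiliar with the token bucket algorithm mentioned above, here is a minimal standard-library sketch. `TokenBucket` is a hypothetical class for illustration, not the implementation inside `RateLimitingMiddleware`; its `rate` and `capacity` arguments play roles similar to `max_requests_per_second` and `burst_capacity`.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: int, clock=time.monotonic):
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum burst size
        self.clock = clock
        self.tokens = float(capacity)
        self.last_refill = clock()

    def try_consume(self, tokens: float = 1.0) -> bool:
        # Refill based on elapsed time, never exceeding capacity.
        now = self.clock()
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last_refill = now

        if self.tokens >= tokens:
            self.tokens -= tokens
            return True
        return False  # caller should reject the request


# Use a fake clock so the refill behavior is deterministic.
now = [0.0]
bucket = TokenBucket(rate=10.0, capacity=20, clock=lambda: now[0])
burst = [bucket.try_consume() for _ in range(21)]       # 20 allowed, then rejected
now[0] = 0.5                                            # half a second later
after_wait = [bucket.try_consume() for _ in range(6)]   # 5 refilled tokens
```

A full bucket lets a client send a short burst, while sustained traffic is held to the refill rate.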

Error Handling Middleware

Consistent error handling and recovery is critical for robust MCP servers. FastMCP provides comprehensive error handling middleware at fastmcp.server.middleware.error_handling.

Here's an example of how it works:

python

import logging
from fastmcp.server.middleware import Middleware, MiddlewareContext

class SimpleErrorHandlingMiddleware(Middleware):
    def __init__(self):
        self.logger = logging.getLogger("errors")
        self.error_counts = {}
    
    async def on_message(self, context: MiddlewareContext, call_next):
        try:
            return await call_next(context)
        except Exception as error:
            # Log the error and track statistics
            error_key = f"{type(error).__name__}:{context.method}"
            self.error_counts[error_key] = self.error_counts.get(error_key, 0) + 1
            
            self.logger.error(f"Error in {context.method}: {type(error).__name__}: {error}")
            raise

To use the full versions with advanced features:

python

from fastmcp.server.middleware.error_handling import (
    ErrorHandlingMiddleware, 
    RetryMiddleware
)

# Comprehensive error logging and transformation
# (my_error_callback is a callable you define elsewhere)
mcp.add_middleware(ErrorHandlingMiddleware(
    include_traceback=True,
    transform_errors=True,
    error_callback=my_error_callback
))

# Automatic retry with exponential backoff
mcp.add_middleware(RetryMiddleware(
    max_retries=3,
    retry_exceptions=(ConnectionError, TimeoutError)
))

The built-in versions include error transformation, custom callbacks, configurable retry logic, and proper MCP error formatting.
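The retry-with-exponential-backoff idea can be sketched as follows using only the standard library. `retry_with_backoff` is a hypothetical helper for illustration, not the code inside `RetryMiddleware`; `base_delay` and the injectable `sleep` are assumptions made so the example runs without real waiting.

```python
import asyncio


async def retry_with_backoff(
    operation,
    max_retries: int = 3,
    base_delay: float = 1.0,
    retry_exceptions: tuple = (ConnectionError, TimeoutError),
    sleep=asyncio.sleep,
):
    """Retry an async operation, doubling the delay after each failed attempt."""
    for attempt in range(max_retries + 1):
        try:
            return await operation()
        except retry_exceptions:
            if attempt == max_retries:
                raise  # out of retries: surface the last error
            await sleep(base_delay * (2 ** attempt))


delays = []
attempts = {"count": 0}


async def fake_sleep(seconds: float) -> None:
    delays.append(seconds)  # record the delay instead of actually waiting


async def flaky_operation() -> str:
    """Fail twice with a retryable error, then succeed."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("temporary failure")
    return "success"


outcome = asyncio.run(retry_with_backoff(flaky_operation, sleep=fake_sleep))
```

Only the listed exception types are retried; any other exception propagates immediately, which keeps programming errors from being retried pointlessly.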

Combining Middleware

These middleware work together seamlessly:

python

from fastmcp import FastMCP
from fastmcp.server.middleware.timing import TimingMiddleware
from fastmcp.server.middleware.logging import LoggingMiddleware
from fastmcp.server.middleware.rate_limiting import RateLimitingMiddleware
from fastmcp.server.middleware.error_handling import ErrorHandlingMiddleware

mcp = FastMCP("Production Server")

# Add middleware in logical order
mcp.add_middleware(ErrorHandlingMiddleware())  # Handle errors first
mcp.add_middleware(RateLimitingMiddleware(max_requests_per_second=50))
mcp.add_middleware(TimingMiddleware())  # Time actual execution
mcp.add_middleware(LoggingMiddleware())  # Log everything

@mcp.tool
def my_tool(data: str) -> str:
    return f"Processed: {data}"

This configuration provides comprehensive monitoring, protection, and observability for your MCP server.

Custom Middleware Example

You can also create custom middleware by extending the base class:

python

from fastmcp.server.middleware import Middleware, MiddlewareContext

class CustomHeaderMiddleware(Middleware):
    async def on_request(self, context: MiddlewareContext, call_next):
        # Add custom logic here
        print(f"Processing {context.method}")
        
        result = await call_next(context)
        
        print(f"Completed {context.method}")
        return result

mcp.add_middleware(CustomHeaderMiddleware())