An emerging open standard establishing a universal interface between AI models, tools, and data.
The Model Context Protocol (MCP) is an open standard that provides a universal interface between LLMs, tools, and data, establishing a secure, bidirectional communication channel built on JSON-RPC 2.0.
Think of it as the "USB-C port for AI"; any compliant tool can be plugged in and immediately utilized.
Note: The ecosystem is rapidly growing but early-stage. MCP is one of the leading approaches for agent interoperability.
Historically, every third-party API has had a different structure, leading to an M × N problem (M models × N tools = massive duplication of integration effort, since every pairing needs bespoke glue code).
MCP provides a standardized guide telling the AI Agent exactly what an external system can do, what inputs it requires, and what outputs it will return.
Complexity formula: Legacy = O(M × N) → MCP = O(M + N). Adding one new model means writing one new client; adding one new tool means standing up one new server.
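The formula is just counting integrations; with illustrative sizes (the numbers below are made up for the sake of arithmetic):

```python
# Hypothetical fleet: 5 models, 8 tools. Values are illustrative only.
models, tools = 5, 8

# Legacy: one bespoke adapter per (model, tool) pair.
legacy_integrations = models * tools

# MCP: one client per model plus one server per tool.
mcp_integrations = models + tools

print(legacy_integrations)  # 40
print(mcp_integrations)     # 13
```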
LLMs (Large Language Models): The "brain" (e.g., Gemini, GPT-4, Claude). Natively, they can only generate text (or multimodal outputs). They cannot take actions.
AI Agents: The brain paired with memory, reasoning loops, and Tools. Agents interact with third-party APIs to retrieve data, make decisions, and take actions (e.g., scanning a codebase to find a bug, or hitting a booking API to reserve a flight).
Value Proposition
MCP Server: LLM-agnostic component that responds to JSON-RPC requests. Exposes Tools, Resources, and Prompts within isolated functional domains.
MCP Client: Spawned 1:1 per server. Serializes LLM intents into JSON-RPC messages, manages the transport lifecycle, and executes capability negotiation. Lives inside the Host.
MCP Host: The application running the AI model (e.g., Claude Desktop, Google ADK). Orchestrates workflow, enforces auth policies, spawns MCP Clients, and requests Human-in-the-Loop (HITL) approvals. Maintains ultimate authority over what the model sees/does.
MCP standardizes the exact sequence of communication between the Client (embedded in the Host) and the Server. All communication uses JSON-RPC 2.0 request IDs to match asynchronous responses to their original requests.
1. `initialize` request: Client sends its protocol version and capabilities (e.g., roots, sampling).
2. `initialize` response: Server responds with its own capabilities (tools, resources, prompts).
3. `notifications/initialized`: Client confirms receipt. Handshake complete; no tool calls or prompt fetches are permitted before this point.
4. Operation: Client requests tool execution, fetches resources, or uses prompts.
5. Shutdown: Termination, handling resumability, connection drops, and graceful closure.
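The handshake above can be sketched as raw JSON-RPC 2.0 messages; the protocol version string and the client/server names are illustrative:

```python
# Sketch of the MCP initialize handshake as JSON-RPC 2.0 payloads.
initialize_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "initialize",
    "params": {
        "protocolVersion": "2025-03-26",          # illustrative version string
        "capabilities": {"roots": {}, "sampling": {}},
        "clientInfo": {"name": "example-client", "version": "0.1.0"},
    },
}

initialize_response = {
    "jsonrpc": "2.0",
    "id": 1,  # must echo the request id so async responses can be matched
    "result": {
        "protocolVersion": "2025-03-26",
        "capabilities": {"tools": {}, "resources": {}, "prompts": {}},
        "serverInfo": {"name": "example-server", "version": "0.1.0"},
    },
}

# Notifications carry no id: the sender expects no response.
initialized_notification = {
    "jsonrpc": "2.0",
    "method": "notifications/initialized",
}
```

Note how the response reuses `id: 1` — this id-matching is what lets a client multiplex many in-flight requests over one connection.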
Servers expose functionality through standardized building blocks:
Resources: Static data/context the server provides to empower the LLM (similar to RAG). Application-controlled. URIs can be static (`config://app`) or templated (`settings://{type}`); examples include `postgres://prod-db/schema` and `github://repo/main/commits`. Servers signal changes via `resources/updated` notifications.
Tools: Executable functions performed on behalf of the LLM (e.g., `execute_query`), dispatched via `tools/call`. Model-controlled.
Prompts: Pre-built messages hosted on the server that automatically inject context. User-controlled.
Discovery: Runtime capability negotiation. Clients enumerate a server's offerings via `tools/list` and `resources/list`.
UI extensions: Interactive, specialized mini-apps that can run directly inside the AI client's UI to provide a richer user experience (note: currently Claude-specific, not part of the core MCP specification).
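Discovery can be sketched as a `tools/list` round trip; the `get_weather` tool and its schema are invented for illustration:

```python
# Sketch of tool discovery: the tools/list request and a typical response.
list_request = {"jsonrpc": "2.0", "id": 2, "method": "tools/list"}

list_response = {
    "jsonrpc": "2.0",
    "id": 2,
    "result": {
        "tools": [
            {
                "name": "get_weather",  # hypothetical tool
                "description": "Return current weather for a city.",
                "inputSchema": {
                    "type": "object",   # top-level tool schemas must be objects
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            }
        ]
    },
}

tool_names = [t["name"] for t in list_response["result"]["tools"]]
print(tool_names)  # ['get_weather']
```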
Stdio: The host launches the server as a local subprocess via `subprocess.Popen` or `child_process.spawn` and exchanges messages over stdin/stdout. stderr is reserved for logging; clients must never treat stderr output as a protocol failure.
Streamable HTTP: The server exposes a single `/mcp` endpoint. Responses arrive as plain JSON or as a server-sent event stream (`Accept: text/event-stream`). The server issues an `Mcp-Session-Id` at init; the client appends it to subsequent requests. The `Mcp-Session-Id` and `Last-Event-Id` headers let clients safely drop and restore connections.
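A minimal sketch of stdio framing, assuming newline-delimited JSON-RPC messages over the subprocess pipes:

```python
import json

# Stdio transport sketch: one JSON-RPC message per line over stdin/stdout.
def encode_message(msg: dict) -> bytes:
    # json.dumps never emits raw newlines, so one line == one message.
    return (json.dumps(msg) + "\n").encode("utf-8")

def decode_messages(buffer: bytes) -> list[dict]:
    return [json.loads(line) for line in buffer.splitlines() if line.strip()]

ping = {"jsonrpc": "2.0", "id": 7, "method": "ping"}
wire = encode_message(ping)
decoded = decode_messages(wire)
print(decoded[0]["method"])  # ping
```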
Beyond simple request/response, modern MCP supports complex, agentic interactions:
Sampling: The server requests an LLM completion from the client, inverting the usual flow (requires registering a raw handler such as `setRequestHandler`; not available via FastMCP decorators).
Elicitation: The server pauses mid-task to request structured input from the user (e.g., a schema like `{alternate_dates: string}`) → Client renders prompt → User responds → Flow resumes.
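An elicitation exchange might look like the following sketch on the wire; the method name and payload shape follow the published spec but should be double-checked, and the flight-booking content is invented:

```python
# Sketch: server asks the user for structured input mid-task.
elicit_request = {
    "jsonrpc": "2.0",
    "id": 9,
    "method": "elicitation/create",
    "params": {
        "message": "That flight is sold out. Any alternate dates?",
        "requestedSchema": {
            "type": "object",
            "properties": {"alternate_dates": {"type": "string"}},
        },
    },
}

# The client renders a prompt, the user answers, and the flow resumes.
user_response = {
    "jsonrpc": "2.0",
    "id": 9,  # matches the server's request id
    "result": {"action": "accept", "content": {"alternate_dates": "2025-07-14"}},
}
```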
Static schemas, system instructions, and tool descriptions can be cached at the provider level (e.g., via Anthropic's cache_control header) to reduce token costs by up to 90% and slash latency for repeated calls.
Output schemas must resolve to "type": "object". Using *args/**kwargs, legacy Pydantic v1, or non-serializable objects triggers PydanticSchemaGenerationError at initialization.
FastMCP (Python): Decorator-based (`@mcp_server.tool`); auto-generates JSON schemas from type hints and docstrings.
Agent frameworks (e.g., Google ADK): Call `list_tools` at init and auto-convert schemas to native `BaseTool` instances. Use `AsyncExitStack` for safe subprocess teardown, and `tool_filter=[]` for least-privilege.
Ephemeral servers: Use `npx` or `uvx` to spin up servers on the fly, eliminating host-machine dependencies.
Low-level SDKs: Register raw handlers via `setRequestHandler`. Required for Sampling, Elicitation, and advanced architectures.
Anti-Pattern (Legacy API Pass-Through): Wrapping REST endpoints 1:1 fails because those endpoints were designed for deterministic software, not LLMs. Rule: Tools must be contextually enriched (the Semantic Layer) with rich descriptions and self-describing payloads.
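As a toy illustration of decorator-style schema generation (not FastMCP's actual implementation, which uses Pydantic), a stdlib-only version might look like this; `execute_query` and the type map are illustrative:

```python
import inspect
import typing

# Toy sketch: derive a JSON schema from a function's type hints, the way
# decorator-based servers (e.g., FastMCP's @mcp_server.tool) do internally.
_TYPE_MAP = {str: "string", int: "integer", float: "number", bool: "boolean"}

def tool_schema(fn) -> dict:
    hints = typing.get_type_hints(fn)
    params = inspect.signature(fn).parameters
    return {
        "type": "object",  # top-level must resolve to an object
        "properties": {n: {"type": _TYPE_MAP[hints[n]]} for n in params},
        "required": [
            n for n, p in params.items() if p.default is inspect.Parameter.empty
        ],
    }

def execute_query(sql: str, limit: int = 100) -> str:
    """Run a read-only SQL query against the analytics DB."""
    ...

schema = tool_schema(execute_query)
print(schema)
```

This also shows why `*args`/`**kwargs` break schema generation: there is no named parameter to map to a property.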
Unconstrained responses cause token exhaustion (e.g., Claude Code truncates tool output at ~25,000 tokens, causing silent context failures). Constrain payloads with pagination, filtering, and summarization.
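A defensive output guard might look like the sketch below; the 4-characters-per-token heuristic is a rough approximation, not a real tokenizer:

```python
# Sketch: truncate tool output loudly instead of letting the client
# silently drop context. The 25,000-token ceiling mirrors the Claude Code
# limit mentioned above; chars-per-token is a crude estimate.
MAX_TOKENS = 25_000
CHARS_PER_TOKEN = 4

def guard_output(text: str, max_tokens: int = MAX_TOKENS) -> str:
    budget = max_tokens * CHARS_PER_TOKEN
    if len(text) <= budget:
        return text
    return text[:budget] + "\n[TRUNCATED: response exceeded token budget; refine the query]"

print(len(guard_output("x" * 200_000)))
```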
```json
// To connect an agent to an MCP server, configure a JSON file in the IDE/Client
{
  "mcpServers": {
    "my-server": {
      "command": "node",   // or "python"
      "args": ["server.js"],
      "env": {
        "API_KEY": "..."   // API keys must NOT be hardcoded! Inject via environment variables.
      }
    }
  }
}
```
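On the server side, the injected secret is read back from the environment at startup; this sketch sets a stand-in value so it runs self-contained:

```python
import os

# Stand-in for the host's "env" block in the config above.
os.environ["API_KEY"] = "test-key"

def load_api_key() -> str:
    # Fail fast if the secret is missing rather than shipping a hardcoded one.
    key = os.environ.get("API_KEY")
    if not key:
        raise RuntimeError("API_KEY not set; refusing to start")
    return key

print(load_api_key())  # test-key
```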
Scoped tokens: Request granular OAuth scopes (e.g., `User.Read`) to enable Role-Based Access Control (RBAC). JWTs should map roles to permissions (e.g., readers can call `readFile`, writers can call `writeFile`).
Auth flow: `401 Unauthorized` → Protected Resource Metadata (PRM) (RFC 9728) → discover scopes/IdP URI → token. Anti-Pattern: Blindly forwarding client tokens without validating the audience.
CSRF protection: Generate a random `state` parameter, store it server-side, and validate it strictly on the callback redirect.
Auditing: Every `tools/call` request must be logged with user session ID, prompt ID, and latency. Integrate with Prometheus/Grafana TraceQL for real-time monitoring.
Gateways: Put MCP web servers behind API Gateways to centralize edge controls such as rate limiting and authentication.
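The scope-to-permission mapping can be sketched as a lookup table; the scope and tool names here are illustrative, not from any specific IdP:

```python
# Sketch: map OAuth scopes from a validated JWT to the tools they unlock.
SCOPE_PERMISSIONS = {
    "files.read": {"readFile"},
    "files.write": {"readFile", "writeFile"},  # writers can also read
}

def authorize(token_scopes: list[str], tool_name: str) -> bool:
    allowed: set[str] = set()
    for scope in token_scopes:
        allowed |= SCOPE_PERMISSIONS.get(scope, set())
    return tool_name in allowed

print(authorize(["files.read"], "writeFile"))   # False
print(authorize(["files.write"], "writeFile"))  # True
```

Checking this on every `tools/call`, rather than once at connect time, keeps a long-lived session from outliving a revoked permission.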
Extends the MCP philosophy using specialized agents (e.g., a "Flight Agent" and a "Hotel Agent") rather than one monolithic agent.
In A2A, agents advertise capabilities via Agent Cards (a JSON resume of abilities), the A2A counterpart of MCP's tools/list discovery, and delegate work to one another as Tasks.
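An Agent Card for the "Flight Agent" above might look like this sketch; all field values are invented:

```python
# Sketch of an A2A Agent Card: the JSON "resume" an agent publishes so
# peers can discover and delegate to it. Values are illustrative.
agent_card = {
    "name": "Flight Agent",
    "description": "Searches and books commercial flights.",
    "url": "https://flights.example.com/a2a",   # hypothetical endpoint
    "capabilities": {"streaming": True},
    "skills": [
        {"id": "search_flights", "description": "Find flights by route and date."},
        {"id": "book_flight", "description": "Reserve a seat on a chosen flight."},
    ],
}

print([s["id"] for s in agent_card["skills"]])  # ['search_flights', 'book_flight']
```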
| Selection Rule | MCP | A2A (Agent-to-Agent) |
|---|---|---|
| Architectural Boundary | Vertical integration (Agent ↔ System) | Horizontal orchestration (Agent ↔ Agent) |
| Execution Flow | Synchronous request/response (waits for schema-validated result) | Asynchronous streaming (streams progress updates over long workflows) |
| Primary Use Case | Deterministic tasks (query DB, call API, read file) | Cognitive reasoning, multi-step planning |
| Production Pattern | Used by sub-agents for low-level execution | Meta-agent distributes tasks to sub-agents |
Q1. What are the three core capabilities exposed by an MCP server?
Q2. What is the difference between MCP and A2A (Agent-to-Agent)?
Q3. Why should API keys NOT be hardcoded in the MCP client configuration file?