MCP in production SaaS: where Model Context Protocol fits your stack
MCP standardizes how agents connect to tools and data — but it does not replace LLM middleware, tenant auth, or observability. A decision guide for product and platform teams.
Every engineering leader evaluating AI agents has heard the same pitch: connect everything with MCP. Model Context Protocol is supported by hosts such as Cursor and Claude Desktop, and by a growing catalog of community servers for GitHub, Postgres, Linear, and internal APIs.
The question from platform teams is sharper: does MCP belong inside our product, or is it a dev-tool convenience we should ignore?
Both answers can be true. MCP is a useful standard for transporting tools and context between an agent and the systems it touches. It is not a substitute for LLM middleware, tenant isolation, evals, or the security patterns you already need for in-app copilots. This guide is the decision layer before you wire MCP into production.
What MCP is (and is not)
MCP separates three roles:
| Role | Responsibility | Examples |
|---|---|---|
| Host | Runs the agent loop, presents UI, invokes tools | Cursor, Claude Desktop, your copilot backend |
| Client | Speaks MCP on behalf of the host | SDK client inside your API route |
| Server | Exposes tools, resources, and prompts over the protocol | GitHub server, Postgres server, wrapper around your billing API |
    User → your product UI → your API (host)
                                    │
                               MCP client
                                    │
                      ┌─────────────┼─────────────┐
                      ▼             ▼             ▼
                 MCP server    MCP server    native tool
                  (GitHub)     (Postgres)   (your store.ts)

MCP is:
- A wire format and discovery protocol for capabilities the model can use
- A way to reuse connectors instead of hand-writing every tool schema
- Increasingly relevant for internal operator tools and admin copilots
MCP is not:
- LLM middleware — no rate limits, tenant budgets, or session auth by itself
- RAG — resources can expose documents, but retrieval policy still lives in your stack
- A copilot UI pattern — see in-app copilots for that
- A security model — tool outputs are untrusted input; see prompt injection
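To make the "wire format and discovery" point concrete, here is a minimal sketch of the two message shapes involved. MCP is built on JSON-RPC 2.0, and the method names `tools/list` and `tools/call` and the descriptor fields `name`, `description`, and `inputSchema` come from the MCP specification. The `get_invoice` tool and the `buildToolCall` helper are hypothetical, and the types are simplified.

```typescript
// Simplified shapes of MCP tool discovery and invocation (JSON-RPC 2.0).
// Field names follow the MCP specification; the billing tool is hypothetical.

interface ToolDescriptor {
  name: string;
  description: string;
  inputSchema: Record<string, unknown>; // JSON Schema for the tool arguments
}

interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

// One entry a server might return from "tools/list".
const getInvoice: ToolDescriptor = {
  name: "get_invoice",
  description: "Fetch one invoice by id (read-only).",
  inputSchema: {
    type: "object",
    properties: { invoiceId: { type: "string" } },
    required: ["invoiceId"],
  },
};

// Build the "tools/call" request the client sends once the model picks a tool.
function buildToolCall(
  id: number,
  tool: ToolDescriptor,
  args: Record<string, unknown>,
): JsonRpcRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: tool.name, arguments: args },
  };
}

const request = buildToolCall(1, getInvoice, { invoiceId: "inv_123" });
```

Neither shape carries tenant or session identity in its core fields, which is one way to see why the policy layer has to sit in front of the protocol rather than inside it.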
Where MCP fits in your existing architecture
Your production path should still look like the stack in our middleware guide:
1. Client UI sends intent to your authenticated API
2. LLM middleware validates session, assembles context, enforces policy
3. Model runs with tools the middleware allows for this user and tenant
4. Results are logged, traced, and returned
MCP enters at step 3 as one way to implement tools — alongside native functions (AI SDK tool(), LangChain @tool, direct service calls).
Client UI (copilot, search, actions) → Your API (existing auth session) → LLM middleware (auth, rate limits, logging) → Model provider (OpenAI, Anthropic, etc.)
Every model call passes through your stack, not around it.
Think of the split this way:
- Middleware answers: Who is calling? What are they allowed to do? What did it cost?
- MCP answers: How does this capability expose itself to the host in a standard shape?
You still implement middleware first. MCP replaces custom glue between your agent runtime and some external systems — not the policy layer in front of them.
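The split between policy and transport can be sketched as a single dispatch function that treats native and MCP-backed tools the same way. This is an illustration under stated assumptions, not a reference implementation: `Session`, `Tool`, `dispatch`, the two roles, and both tool names are hypothetical, and the handlers are synchronous only to keep the example short.

```typescript
// Sketch of one dispatch path for native and MCP-backed tools.
// All names are hypothetical. Handlers are synchronous for brevity;
// real tool calls would normally be async.

type Role = "viewer" | "admin";
type ToolKind = "native" | "mcp";

interface Session {
  userId: string;
  tenantId: string;
  role: Role;
}

interface Tool {
  name: string;
  kind: ToolKind;
  minRole: Role; // policy is declared in middleware, not in the MCP server
  run: (args: Record<string, unknown>, tenantId: string) => string;
}

interface AuditEntry {
  tenantId: string;
  userId: string;
  tool: string;
  kind: ToolKind | "unknown";
  outcome: "ok" | "denied" | "unknown_tool";
}

const auditLog: AuditEntry[] = [];

function roleAllows(role: Role, minRole: Role): boolean {
  return minRole === "viewer" || role === "admin";
}

// Policy runs first; only then does the call reach a native function or an
// MCP client. Every path writes an audit entry.
function dispatch(
  session: Session,
  tools: ReadonlyMap<string, Tool>,
  toolName: string,
  args: Record<string, unknown>,
): string | null {
  const base = { tenantId: session.tenantId, userId: session.userId, tool: toolName };
  const tool = tools.get(toolName);
  if (tool === undefined) {
    auditLog.push({ ...base, kind: "unknown", outcome: "unknown_tool" });
    return null;
  }
  if (!roleAllows(session.role, tool.minRole)) {
    auditLog.push({ ...base, kind: tool.kind, outcome: "denied" });
    return null;
  }
  const result = tool.run(args, session.tenantId);
  auditLog.push({ ...base, kind: tool.kind, outcome: "ok" });
  return result;
}

const tools = new Map<string, Tool>([
  ["search_tickets", {
    name: "search_tickets", kind: "native", minRole: "viewer",
    run: (_args, tenantId) => `tickets for ${tenantId}`,
  }],
  ["query_replica", {
    name: "query_replica", kind: "mcp", minRole: "admin",
    // In a real system this handler would forward to an MCP client call.
    run: (_args, tenantId) => `rows for ${tenantId}`,
  }],
]);
```

The model-facing code only ever sees `dispatch`; whether a tool is native or MCP-backed is an implementation detail recorded in the audit entry.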
Three production use cases that make sense
1. Internal operator copilots
When the user is your employee — support, ops, engineering — and connectors are shared infrastructure, MCP can accelerate integration.
Example: An admin copilot that searches tickets (native tools), queries a read-only Postgres replica (MCP Postgres server), and posts a summary to Slack (MCP Slack server). Middleware enforces role checks before any MCP client is invoked; the model never holds database credentials.
This is the same shape many teams already use to demo tool-calling agents, only with less bespoke adapter code per system.
2. Reusing community or vendor MCP servers behind your boundary
MCP ecosystems ship servers for common systems. Your platform team can allowlist which servers run in staging and production, route calls through a single MCP client inside your API, and apply the same logging and budget rules as native tools.
Pattern: MCP server runs in your VPC or as a sidecar; only your middleware can reach it; end users never configure servers themselves.
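One way to express the allowlist part of this pattern is a static, per-environment table that the middleware consults before any MCP client call. The sketch below assumes hypothetical server and tool names (`postgres`, `slack`, `query_readonly`, and so on) and a hypothetical `isAllowed` helper; a real deployment might load the table from reviewed configuration instead.

```typescript
// Hypothetical per-environment allowlist: server name -> tools the middleware may call.
type Environment = "staging" | "production";

const ALLOWLIST: Record<Environment, ReadonlyMap<string, ReadonlySet<string>>> = {
  staging: new Map<string, ReadonlySet<string>>([
    ["postgres", new Set(["query_readonly", "describe_table"])],
    ["slack", new Set(["post_message"])],
  ]),
  production: new Map<string, ReadonlySet<string>>([
    ["postgres", new Set(["query_readonly"])],
  ]),
};

// Deny by default: anything not listed for this environment is rejected,
// including tools a server starts advertising after an upgrade.
function isAllowed(env: Environment, server: string, tool: string): boolean {
  return ALLOWLIST[env].get(server)?.has(tool) ?? false;
}
```

Because the check is deny-by-default, a tool that a server begins to expose in a new version stays unreachable until someone adds it to the table.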
3. Wrapping your product APIs as MCP servers (internal consumers)
Some teams expose internal MCP servers that wrap existing REST or GraphQL APIs — useful when multiple internal agents (on-call bot, support assist, CI triage) need the same typed surface.
Important: The MCP server is still backed by your APIs with normal auth. MCP does not magically add RBAC; your API layer does.
Three anti-patterns to avoid
1. Customer-facing "bring your own MCP"
Letting each tenant plug arbitrary MCP servers into your SaaS UI sounds flexible. In practice it creates unbounded data exfiltration risk, unmanageable support load, and no consistent eval story.
Use instead: You operate a fixed tool surface per plan tier, implemented natively or via MCP servers you host and audit.
2. Skipping middleware because "MCP handles integration"
MCP servers do not know your tenant model, your feature flags, or your cost budgets. Calling MCP directly from a browser or from an unauthenticated worker repeats the POC mistake our middleware article describes: keys and policy scattered across the stack.
Use instead: Every MCP invocation flows through the same middleware module as native tools — auth first, then tool dispatch.
3. Observability only in the MCP host
If traces stop at "the agent called something," you cannot answer finance or support questions. Log which server, which tool, latency, outcome, tenantId in your middleware, and forward spans to Langfuse or your OTel stack.
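A minimal sketch of that logging, assuming a hypothetical `traced` wrapper and an in-memory `spans` array standing in for whatever exporter you use. No Langfuse or OpenTelemetry API is shown here; forwarding the record is left to your stack. The wrapper is synchronous for brevity.

```typescript
// Hypothetical span record with the fields named above.
interface ToolSpan {
  tenantId: string;
  mcpServer: string | null; // null for native tools
  toolName: string;
  latencyMs: number;
  outcome: "ok" | "error";
}

type SpanMeta = Pick<ToolSpan, "tenantId" | "mcpServer" | "toolName">;

// Stand-in for an exporter; a real system would forward these records.
const spans: ToolSpan[] = [];

// Wrap any tool call so latency and outcome are recorded on both the
// success path and the error path. Errors are rethrown unchanged.
function traced<T>(meta: SpanMeta, call: () => T): T {
  const start = Date.now();
  try {
    const result = call();
    spans.push({ ...meta, latencyMs: Date.now() - start, outcome: "ok" });
    return result;
  } catch (err) {
    spans.push({ ...meta, latencyMs: Date.now() - start, outcome: "error" });
    throw err;
  }
}
```

Wrapping at the middleware boundary means native and MCP calls produce the same record shape, so a finance or support query does not need to know how a tool was implemented.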
MCP vs native tools: a decision table
| Factor | Native tools (in-process) | MCP servers |
|---|---|---|
| Latency | Lower — no IPC or HTTP hop | Higher — separate process or remote server |
| Tenancy / RBAC | Easier — direct access to your auth context | Requires explicit propagation into server |
| Ops | Deployed with your app | Extra service lifecycle, versioning, health checks |
| Reuse | Custom per integration | Community servers, shared internal wrappers |
| Best for | Core product APIs, tight SLAs | Internal systems, optional connectors, polyglot tools |
Most production SaaS products we integrate use native tools for core workflows (tickets, accounts, records the product already owns) and MCP selectively for peripheral systems where a maintained server exists and latency is acceptable.
Security: treat MCP like any other tool path
MCP does not change the prompt injection threat model. It can expand it if servers are broad or over-permissioned.
Production checklist:
- Allowlist servers and tools per environment — no ad hoc discovery in prod
- Run servers with least privilege — read-only DB roles, scoped OAuth tokens
- Never pass end-user credentials to MCP servers; middleware uses service identity and passes tenant context explicitly
- Treat tool results as untrusted input — they flow back into the model context; sanitize and bound size
- Audit log tool name, arguments (redacted), actor, tenant, outcome: the same bar as update_ticket in your product
- Human confirmation for destructive MCP tools: the same gates as native write tools
A Postgres MCP server with write access and no confirmation gate is a production incident waiting for a crafted ticket body.
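Two of the checklist items lend themselves to a short sketch: bounding tool results before they re-enter the model context, and gating destructive tools behind confirmation. The function names, the `ToolPolicy` shape, and the 4000-character limit are all assumptions chosen for illustration.

```typescript
// Hypothetical size bound; tune it to your model's context budget.
const MAX_TOOL_RESULT_CHARS = 4000;

// Strip control characters (except tab and newline) and truncate.
// This bounds what a tool can push into the context; it does not
// neutralize injected instructions in the remaining text.
function boundToolResult(raw: string, maxChars: number = MAX_TOOL_RESULT_CHARS): string {
  const cleaned = raw.replace(/[\u0000-\u0008\u000B-\u001F\u007F]/g, "");
  return cleaned.length > maxChars
    ? cleaned.slice(0, maxChars) + "\n[truncated]"
    : cleaned;
}

interface ToolPolicy {
  name: string;
  destructive: boolean; // set by your platform team, not by the MCP server
}

type GateDecision = "run" | "needs_confirmation";

// Destructive tools run only after an explicit human confirmation,
// whether they are native or MCP-backed.
function confirmationGate(tool: ToolPolicy, confirmedByUser: boolean): GateDecision {
  return tool.destructive && !confirmedByUser ? "needs_confirmation" : "run";
}
```

Sanitizing and truncating limits the damage a crafted tool result can do, but the confirmation gate and least-privilege credentials are what actually stop a write from happening.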
Observability and evals still apply
MCP does not replace structured logging, OTel metrics, or eval pipelines. Tag every MCP invocation with:
- feature, tenantId, sessionId
- mcpServer, toolName
- latencyMs, outcome, model
Golden-set evals should assert which tool was selected — MCP or native — and whether out-of-scope requests were refused. Changing MCP server versions should trigger the same CI gate as a prompt change.
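A golden-set check of this kind can be quite small. The sketch below assumes a hypothetical `selectTool` function that returns the tool the agent chose, or `null` for a refusal. The `native:` and `mcp:` prefixes, the prompts, and `stubAgent` are illustrative stand-ins for your real agent and naming scheme.

```typescript
interface GoldenCase {
  prompt: string;
  expectedTool: string | null; // null means the request should be refused
}

interface EvalReport {
  passed: number;
  failures: string[];
}

// Compare the tool the agent selected against the expected one for each case.
function runGoldenSet(
  cases: readonly GoldenCase[],
  selectTool: (prompt: string) => string | null,
): EvalReport {
  const failures: string[] = [];
  for (const c of cases) {
    const actual = selectTool(c.prompt);
    if (actual !== c.expectedTool) {
      failures.push(
        `"${c.prompt}": expected ${c.expectedTool ?? "refusal"}, got ${actual ?? "refusal"}`,
      );
    }
  }
  return { passed: cases.length - failures.length, failures };
}

const goldenSet: GoldenCase[] = [
  { prompt: "find open tickets for Acme", expectedTool: "native:search_tickets" },
  { prompt: "count signups last week", expectedTool: "mcp:postgres.query_readonly" },
  { prompt: "drop the users table", expectedTool: null },
];

// Stand-in for the real agent: a keyword router used only to exercise the harness.
function stubAgent(prompt: string): string | null {
  if (prompt.includes("drop")) return null;
  if (prompt.includes("tickets")) return "native:search_tickets";
  return "mcp:postgres.query_readonly";
}

const report = runGoldenSet(goldenSet, stubAgent);
```

Running the same set in CI against a new MCP server version would surface cases where a renamed or re-described tool changes what the model selects.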
Rollout: when to introduce MCP
| Stage | Tooling approach |
|---|---|
| First copilot / agent | Native tools in middleware — fastest path to RBAC and evals |
| Second connector is external | Evaluate maintained MCP server vs thin native wrapper |
| Multiple internal agents | Shared MCP servers wrapping internal APIs; one auth model |
| Customer-facing expansion | Still native or your hosted MCP — not tenant-supplied servers |
Use this as a gate before calling an AI feature GA — not as a post-launch backlog.
Do not introduce MCP because it is trending. Introduce it when a specific connector benefits from the standard and your middleware already enforces policy.
Common questions from platform teams
Should we use MCP instead of LangChain tools?
No — they solve different problems. LangChain (or AI SDK, or a custom loop) orchestrates the agent; MCP transports some tool calls. See Build an agent with LangChain.
Is MCP the same as OpenAPI for LLMs?
Similar intent (expose capabilities to models), different layer. You may implement MCP servers that call OpenAPI backends internally.
Does MCP replace our LLM gateway?
No. Gateways handle provider routing, keys, and streaming. MCP handles tool/resource discovery for the host.
What about Remote MCP?
Hosted MCP endpoints simplify connector deployment but raise the security bar — TLS, auth, allowlists, and data residency reviews apply the same way as any third-party integration.
The integration mindset
MCP is USB-C for agent connectors — helpful when you need interchangeable plugs, irrelevant when you only have one device and tight latency requirements.
The teams that ship reliably still follow the same order: middleware first, workflow-bound features, native tools for core domain, evals before expanding the tool surface, and MCP only where it reduces maintenance without weakening tenant boundaries.
Saying "we support MCP" is not a product strategy. Saying "our middleware dispatches permissioned tools — some native, some MCP — with full audit and observability" is.
Evaluating agents and unsure whether MCP belongs in your architecture? Describe the workflow — stack, auth model, and which systems the model must touch — and we will map the smallest pattern that fits, with an honest read on when MCP earns its ops cost.
Related resources
More on integration
- LLM middleware: what it is, why you need it, and how to implement it
A practical guide to the server-side layer between your app and the model — auth, rate limits, routing, logging, and the patterns that keep AI features production-ready.
- In-app copilots: how to embed AI in your product without a sidebar chatbot
A practical guide to embedded copilots: context from product state, server-side assembly, RBAC, and UI patterns that fit existing workflows instead of a floating chat widget.
- Build an agent with LangChain — a practical tutorial
Step-by-step guide to building a tool-calling agent with LangChain and LangGraph, from first prototype to patterns that survive production.
