475Cumulus
Guide

MCP in production SaaS: where Model Context Protocol fits your stack

MCP standardizes how agents connect to tools and data — but it does not replace LLM middleware, tenant auth, or observability. A decision guide for product and platform teams.

Every engineering leader evaluating AI agents has heard the same pitch: connect everything with MCP. Support for Model Context Protocol (MCP) ships in Cursor and Claude Desktop, and a growing catalog of community servers covers GitHub, Postgres, Linear, and internal APIs.

The question from platform teams is sharper: does MCP belong inside our product, or is it a dev-tool convenience we should ignore?

Both answers can be true. MCP is a useful standard for transporting tool calls and context between an agent and the systems it uses. It is not a substitute for LLM middleware, tenant isolation, evals, or the security patterns you already need for in-app copilots. This guide is the decision layer before you wire MCP into production.

What MCP is (and is not)

MCP separates three roles:

| Role | Responsibility | Examples |
| --- | --- | --- |
| Host | Runs the agent loop, presents UI, invokes tools | Cursor, Claude Desktop, your copilot backend |
| Client | Speaks MCP on behalf of the host | SDK client inside your API route |
| Server | Exposes tools, resources, and prompts over the protocol | GitHub server, Postgres server, wrapper around your billing API |

User  →  your product UI  →  your API (host)
                                  │
                             MCP client
                                  │
                    ┌─────────────┼─────────────┐
                    ▼             ▼             ▼
              MCP server    MCP server    native tool
              (GitHub)      (Postgres)    (your store.ts)

MCP is:

  • A wire format and discovery protocol for capabilities the model can use
  • A way to reuse connectors instead of hand-writing every tool schema
  • Increasingly relevant for internal operator tools and admin copilots

MCP is not:

  • LLM middleware — no rate limits, tenant budgets, or session auth by itself
  • RAG — resources can expose documents, but retrieval policy still lives in your stack
  • A copilot UI pattern — see in-app copilots for that
  • A security model — tool outputs are untrusted input; see prompt injection
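
To make "wire format and discovery protocol" concrete, here is a minimal sketch of the two JSON-RPC 2.0 requests a client sends for tools: `tools/list` for discovery and `tools/call` for invocation. The helper names (`listToolsRequest`, `callToolRequest`, `exposeToModel`) and the example tool name are illustrative assumptions, not part of any SDK; a real client library builds these messages for you.

```typescript
// JSON-RPC 2.0 request envelope, which MCP messages use.
interface JsonRpcRequest<P> {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: P;
}

// Shape of one entry in a tools/list result.
interface ToolDescriptor {
  name: string;
  description?: string;
  inputSchema: Record<string, unknown>; // JSON Schema describing the arguments
}

interface CallToolParams {
  name: string;
  arguments: Record<string, unknown>;
}

// Discovery: ask a server which tools it exposes.
function listToolsRequest(id: number): JsonRpcRequest<undefined> {
  return { jsonrpc: "2.0", id, method: "tools/list" };
}

// Invocation: call one discovered tool by name with JSON arguments.
function callToolRequest(
  id: number,
  name: string,
  args: Record<string, unknown>,
): JsonRpcRequest<CallToolParams> {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Hypothetical helper: the host decides which discovered tools the model may see.
function exposeToModel(discovered: ToolDescriptor[], allowed: Set<string>): ToolDescriptor[] {
  return discovered.filter((t) => allowed.has(t.name));
}
```

Note that nothing in these messages carries a tenant, a budget, or a session: that is the gap the rest of this guide assigns to middleware.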

Where MCP fits in your existing architecture

Your production path should still look like the stack in our middleware guide:

  1. Client UI sends intent to your authenticated API
  2. LLM middleware validates session, assembles context, enforces policy
  3. Model runs with tools the middleware allows for this user and tenant
  4. Results are logged, traced, and returned

MCP enters at step 3 as one way to implement tools — alongside native functions (AI SDK tool(), LangChain @tool, direct service calls).
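
One way to picture "MCP as one way to implement tools" is a registry where the agent loop sees a single `Tool` interface and does not care whether a tool is native or MCP-backed. This is a sketch under assumptions: `McpClientLike` is a hypothetical, deliberately tiny stand-in for whatever MCP client SDK you use, and `ToolRegistry` is an illustrative name.

```typescript
type ToolArgs = Record<string, unknown>;

// One interface for the agent loop, regardless of how a tool is implemented.
interface Tool {
  name: string;
  source: "native" | "mcp";
  run(args: ToolArgs): Promise<string>;
}

// Hypothetical minimal client surface; a real MCP SDK client exposes more than this.
interface McpClientLike {
  callTool(name: string, args: ToolArgs): Promise<string>;
}

function nativeTool(name: string, fn: (args: ToolArgs) => Promise<string>): Tool {
  return { name, source: "native", run: fn };
}

function mcpTool(client: McpClientLike, name: string): Tool {
  return { name, source: "mcp", run: (args) => client.callTool(name, args) };
}

class ToolRegistry {
  private readonly tools = new Map<string, Tool>();

  register(tool: Tool): void {
    if (this.tools.has(tool.name)) {
      throw new Error(`duplicate tool name: ${tool.name}`);
    }
    this.tools.set(tool.name, tool);
  }

  // The model only ever sees names; dispatch resolves the implementation.
  async dispatch(name: string, args: ToolArgs): Promise<string> {
    const tool = this.tools.get(name);
    if (tool === undefined) {
      throw new Error(`unknown tool: ${name}`);
    }
    return tool.run(args);
  }

  sourceOf(name: string): "native" | "mcp" | undefined {
    return this.tools.get(name)?.source;
  }
}
```

Keeping `source` on each tool makes it cheap to log and evaluate native and MCP calls with the same fields later.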

Request flow through LLM middleware

Client UI (copilot, search, actions)  →  Your API (existing auth session)  →  LLM middleware (auth, rate limits, logging)  →  Model provider (OpenAI, Anthropic, etc.)

At the middleware step:

  • Inject tenant-scoped context
  • Enforce tool permissions
  • Record tokens and latency

Every model call passes through your stack, not around it.

Think of the split this way:

  • Middleware answers: Who is calling? What are they allowed to do? What did it cost?
  • MCP answers: How does this capability expose itself to the host in a standard shape?

You still implement middleware first. MCP replaces custom glue between your agent runtime and some external systems — not the policy layer in front of them.

Three production use cases that make sense

1. Internal operator copilots

When the user is your employee — support, ops, engineering — and connectors are shared infrastructure, MCP can accelerate integration.

Example: An admin copilot that searches tickets (native tools), queries a read-only Postgres replica (MCP Postgres server), and posts a summary to Slack (MCP Slack server). Middleware enforces role checks before any MCP client is invoked; the model never holds database credentials.

This is the same tool-calling setup many teams already demo, but with less bespoke adapter code per system.

2. Reusing community or vendor MCP servers behind your boundary

MCP ecosystems ship servers for common systems. Your platform team can allowlist which servers run in staging and production, route calls through a single MCP client inside your API, and apply the same logging and budget rules as native tools.

Pattern: MCP server runs in your VPC or as a sidecar; only your middleware can reach it; end users never configure servers themselves.

3. Wrapping your product APIs as MCP servers (internal consumers)

Some teams expose internal MCP servers that wrap existing REST or GraphQL APIs — useful when multiple internal agents (on-call bot, support assist, CI triage) need the same typed surface.

Important: The MCP server is still backed by your APIs with normal auth. MCP does not magically add RBAC; your API layer does.
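
As a sketch of what "still backed by your APIs with normal auth" can look like, the handler inside an internal MCP server only translates a tool call into an API request and leaves authorization to the API. Everything named here is hypothetical: the `get_invoice` tool, the `billing.internal.example` host, and the `X-Tenant-Id` / `X-Actor-Id` headers are placeholders for whatever your API already expects.

```typescript
// Context the middleware passes explicitly with each call (assumed shape).
interface CallContext {
  tenantId: string;
  actorId: string;
}

interface OutboundRequest {
  method: "GET";
  url: string;
  headers: Record<string, string>;
}

// Hypothetical handler body for a "get_invoice" tool in an internal MCP server.
// It builds the API request; the billing API still authenticates the service
// token and authorizes the tenant and actor. No RBAC decision is made here.
function buildGetInvoiceRequest(
  ctx: CallContext,
  args: { invoiceId: string },
  serviceToken: string,
): OutboundRequest {
  return {
    method: "GET",
    // encodeURIComponent keeps a model-supplied id from altering the path.
    url: `https://billing.internal.example/v1/invoices/${encodeURIComponent(args.invoiceId)}`,
    headers: {
      Authorization: `Bearer ${serviceToken}`,
      "X-Tenant-Id": ctx.tenantId,
      "X-Actor-Id": ctx.actorId,
    },
  };
}
```

The tenant and actor come from the middleware's context, not from the tool arguments, so the model cannot choose which tenant it acts for.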

Three anti-patterns to avoid

1. Customer-facing "bring your own MCP"

Letting each tenant plug arbitrary MCP servers into your SaaS UI sounds flexible. In practice it creates unbounded data exfiltration risk, unmanageable support load, and no consistent eval story.

Use instead: You operate a fixed tool surface per plan tier, implemented natively or via MCP servers you host and audit.
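
A fixed tool surface per plan tier can be as simple as a static map that you own and review. The tier names and tool names below are hypothetical; the point of the sketch is that the operator, not the tenant, defines what the model can reach.

```typescript
type PlanTier = "free" | "pro" | "enterprise";

// Hypothetical tool names. Changing this map is a code change that goes
// through review, not something a tenant can configure.
const TOOL_SURFACE: Record<PlanTier, readonly string[]> = {
  free: ["search_tickets"],
  pro: ["search_tickets", "summarize_ticket"],
  enterprise: ["search_tickets", "summarize_ticket", "query_reporting_replica"],
};

function toolsForPlan(plan: PlanTier): readonly string[] {
  return TOOL_SURFACE[plan];
}

function isToolAllowed(plan: PlanTier, toolName: string): boolean {
  return TOOL_SURFACE[plan].includes(toolName);
}
```

Whether `query_reporting_replica` is implemented natively or by an MCP server you host is invisible at this level, which is the property you want.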

2. Skipping middleware because "MCP handles integration"

MCP servers do not know your tenant model, your feature flags, or your cost budgets. Calling MCP directly from a browser or from an unauthenticated worker repeats the POC mistake our middleware article describes: keys and policy scattered across the stack.

Use instead: Every MCP invocation flows through the same middleware module as native tools — auth first, then tool dispatch.
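
"Auth first, then tool dispatch" might look like the following sketch. The `Session` shape, the `requiredRole` field, and the `dispatchWithPolicy` name are assumptions for illustration; the ordering is what matters, and it is identical for native and MCP-backed tools.

```typescript
interface Session {
  userId: string;
  tenantId: string;
  roles: string[];
}

interface ToolContext {
  tenantId: string;
  userId: string;
}

interface PermissionedTool {
  name: string;
  source: "native" | "mcp";
  requiredRole: string;
  run(args: Record<string, unknown>, ctx: ToolContext): Promise<string>;
}

type DispatchResult =
  | { ok: true; output: string }
  | { ok: false; reason: "unauthenticated" | "forbidden" };

async function dispatchWithPolicy(
  session: Session | null,
  tool: PermissionedTool,
  args: Record<string, unknown>,
): Promise<DispatchResult> {
  // 1. Authentication: no session, no tool call of any kind.
  if (session === null) {
    return { ok: false, reason: "unauthenticated" };
  }
  // 2. Authorization: same check whether the tool is native or MCP-backed.
  if (!session.roles.includes(tool.requiredRole)) {
    return { ok: false, reason: "forbidden" };
  }
  // 3. Dispatch: tenant context comes from the session, never from model-supplied args.
  const output = await tool.run(args, { tenantId: session.tenantId, userId: session.userId });
  return { ok: true, output };
}
```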

3. Observability only in the MCP host

If traces stop at "the agent called something," you cannot answer finance or support questions. In your middleware, log the server, tool, latency, outcome, and tenantId for every call, and forward spans to Langfuse or your OTel stack.

MCP vs native tools: a decision table

| Factor | Native tools (in-process) | MCP servers |
| --- | --- | --- |
| Latency | Lower: no IPC or HTTP hop | Higher: separate process or remote server |
| Tenancy / RBAC | Easier: direct access to your auth context | Requires explicit propagation into server |
| Ops | Deployed with your app | Extra service lifecycle, versioning, health checks |
| Reuse | Custom per integration | Community servers, shared internal wrappers |
| Best for | Core product APIs, tight SLAs | Internal systems, optional connectors, polyglot tools |

Most production SaaS products we integrate use native tools for core workflows (tickets, accounts, records the product already owns) and MCP selectively for peripheral systems where a maintained server exists and latency is acceptable.

Security: treat MCP like any other tool path

MCP does not change the prompt injection threat model. It can expand it if servers are broad or over-permissioned.

Production checklist:

  • Allowlist servers and tools per environment — no ad hoc discovery in prod
  • Run servers with least privilege — read-only DB roles, scoped OAuth tokens
  • Never pass end-user credentials to MCP servers; middleware uses service identity and passes tenant context explicitly
  • Treat tool results as untrusted input — they flow back into the model context; sanitize and bound size
  • Audit log tool name, arguments (redacted), actor, tenant, outcome — same bar as update_ticket in your product
  • Human confirmation for destructive MCP tools — same gates as native write tools

A Postgres MCP server with write access and no confirmation gate is a production incident waiting for a crafted ticket body.
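
Three of the checklist items (per-environment allowlists, confirmation for destructive tools, and bounded tool results) can be sketched as small pure functions. The server and tool names and the `server/tool` key format are hypothetical choices for this example.

```typescript
type Env = "staging" | "production";

// Hypothetical allowlist, keyed as "server/tool". Anything absent is denied.
const ALLOWED: Record<Env, ReadonlySet<string>> = {
  staging: new Set(["postgres/query", "slack/post_message"]),
  production: new Set(["postgres/query"]),
};

// Hypothetical set of tools that write or send and therefore need a human gate.
const NEEDS_CONFIRMATION: ReadonlySet<string> = new Set(["slack/post_message"]);

type Gate = "allow" | "confirm" | "deny";

function gateToolCall(env: Env, server: string, tool: string): Gate {
  const key = `${server}/${tool}`;
  if (!ALLOWED[env].has(key)) {
    return "deny";
  }
  return NEEDS_CONFIRMATION.has(key) ? "confirm" : "allow";
}

// Strip ASCII control characters (except tab, newline, carriage return) and
// cap the size before the result re-enters the model context.
function boundToolResult(text: string, maxChars: number): string {
  const cleaned = text.replace(/[\u0000-\u0008\u000B\u000C\u000E-\u001F\u007F]/g, "");
  return cleaned.length > maxChars ? cleaned.slice(0, maxChars) + " [truncated]" : cleaned;
}
```

Bounding and stripping control characters limits blast radius but does not neutralize injected instructions in the remaining text; least-privilege servers and confirmation gates still carry most of the weight.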

Observability and evals still apply

MCP does not replace structured logging, OTel metrics, or eval pipelines. Tag every MCP invocation with:

  • feature, tenantId, sessionId
  • mcpServer, toolName
  • latencyMs, outcome, model
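
A minimal sketch of tagging every invocation with those fields is a wrapper around the tool call. `tracedToolCall` and the `sink` callback are illustrative names; in practice the sink would forward to your logger or tracing exporter. Setting `mcpServer` to `null` for native tools is an assumption of this example so that both paths share one record shape.

```typescript
interface ToolCallRecord {
  feature: string;
  tenantId: string;
  sessionId: string;
  mcpServer: string | null; // null for native tools
  toolName: string;
  model: string;
  latencyMs: number;
  outcome: "ok" | "error";
}

type ToolCallTags = Omit<ToolCallRecord, "latencyMs" | "outcome">;

// Runs the call, then emits exactly one record whether it succeeds or fails.
async function tracedToolCall<T>(
  tags: ToolCallTags,
  sink: (record: ToolCallRecord) => void,
  call: () => Promise<T>,
): Promise<T> {
  const start = Date.now();
  try {
    const result = await call();
    sink({ ...tags, latencyMs: Date.now() - start, outcome: "ok" });
    return result;
  } catch (err) {
    sink({ ...tags, latencyMs: Date.now() - start, outcome: "error" });
    throw err; // the caller still sees the original failure
  }
}
```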

Golden-set evals should assert which tool was selected — MCP or native — and whether out-of-scope requests were refused. Changing MCP server versions should trigger the same CI gate as a prompt change.
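
A golden-set check on tool selection can be sketched as follows. `selectTool` stands in for running your agent and reading which tool it chose; it, the case shape, and the `mcp:` name prefix are assumptions for this example rather than any eval framework's API.

```typescript
interface GoldenCase {
  prompt: string;
  // Expected tool name, or null when the agent should refuse or answer without tools.
  expectedTool: string | null;
}

interface EvalReport {
  passed: number;
  failures: string[];
}

// `selectTool` is a stand-in for invoking the agent and extracting its tool choice.
function runGoldenSet(
  cases: GoldenCase[],
  selectTool: (prompt: string) => string | null,
): EvalReport {
  const report: EvalReport = { passed: 0, failures: [] };
  for (const c of cases) {
    const actual = selectTool(c.prompt);
    if (actual === c.expectedTool) {
      report.passed += 1;
    } else {
      report.failures.push(`${c.prompt}: expected ${String(c.expectedTool)}, got ${String(actual)}`);
    }
  }
  return report;
}
```

A CI gate would then fail the build when `failures` is non-empty, whether the change was a prompt edit or an MCP server version bump.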

Rollout: when to introduce MCP

| Stage | Tooling approach |
| --- | --- |
| First copilot / agent | Native tools in middleware: fastest path to RBAC and evals |
| Second connector is external | Evaluate maintained MCP server vs thin native wrapper |
| Multiple internal agents | Shared MCP servers wrapping internal APIs; one auth model |
| Customer-facing expansion | Still native or your hosted MCP, not tenant-supplied servers |

Production readiness checklist

  • Server-side auth
  • Tenant-scoped context
  • Structured logging
  • Cost per action
  • Eval pipeline
  • Provider fallback
  • Feature flags
  • Audit on tool calls

Use this as a gate before calling an AI feature GA — not as a post-launch backlog.

Do not introduce MCP because it is trending. Introduce it when a specific connector benefits from the standard and your middleware already enforces policy.

Common questions from platform teams

Should we use MCP instead of LangChain tools?
No — they solve different problems. LangChain (or AI SDK, or a custom loop) orchestrates the agent; MCP transports some tool calls. See Build an agent with LangChain.

Is MCP the same as OpenAPI for LLMs?
Similar intent (expose capabilities to models), different layer. You may implement MCP servers that call OpenAPI backends internally.

Does MCP replace our LLM gateway?
No. Gateways handle provider routing, keys, and streaming. MCP handles tool/resource discovery for the host.

What about Remote MCP?
Hosted MCP endpoints simplify connector deployment but raise the security bar — TLS, auth, allowlists, and data residency reviews apply the same way as any third-party integration.

The integration mindset

MCP is USB-C for agent connectors — helpful when you need interchangeable plugs, irrelevant when you only have one device and tight latency requirements.

The teams that ship reliably still follow the same order: middleware first, workflow-bound features, native tools for core domain, evals before expanding the tool surface, and MCP only where it reduces maintenance without weakening tenant boundaries.

Saying "we support MCP" is not a product strategy. Saying "our middleware dispatches permissioned tools — some native, some MCP — with full audit and observability" is.


Evaluating agents and unsure whether MCP belongs in your architecture? Describe the workflow — stack, auth model, and which systems the model must touch — and we will map the smallest pattern that fits, with an honest read on when MCP earns its ops cost.
