MCP vs Function Calling: The Production Decision Rule

Use function calling for app-local tools. Build MCP when a capability must be shared, discovered, approved, logged, and reused across agents.

Wednesday, June 3, 2026

Omid Saffari

Use function calling when the tool belongs to one application loop. Build an MCP server when the capability must be discovered, reused, governed, approved, and logged across agents, clients, or teams.

The Short Rule: Function Calling Is Invocation, MCP Is a Capability Boundary

Function calling is the right default for a narrow tool that only your application should execute. The model emits structured arguments, your code validates them, your application runs the function, and the result goes back into the conversation. That is a clean pattern for app-local behavior: quote a price, check a feature flag, classify an inbound request, or fetch a record that only this product flow needs.

MCP is the right boundary when the tool is no longer just a helper inside one app. A custom MCP server turns an internal capability into a discoverable service that can be called by compatible clients while keeping execution, credentials, authorization, and audit logic behind one server boundary. That boundary matters more than the protocol shape.

The mistake is treating MCP as a replacement for function calling. Function calling answers, "How does the model ask my application to do something?" MCP answers, "How do we expose a governed capability to multiple AI clients without copying the integration into every app?"

Model Context Protocol specification — The current MCP specification defines hosts, clients, servers, JSON-RPC 2.0 messaging, and server features such as tools, resources, and prompts.

For a production team, the decision line is simple:

Keep direct function calls for app-local tools with one owner, one runtime, and one approval path.
Build MCP for shared capabilities that need discovery, scope-aware access, review, logging, or reuse across clients.
Use both when an agent needs local workflow logic and shared internal tools.

That hybrid architecture is usually the durable one. Direct functions stay close to the product loop. MCP servers sit at the boundary of systems that need governance.

The Comparison That Matters in Production

The useful comparison is not syntax. It is where ownership, control, and failure handling live when the tool starts touching real systems.

Production axis	Function calling	MCP server	Use function calling when	Use MCP when
Ownership	Tool schema, dispatch, credentials, and retries live in the app	Tool schema, execution, auth, and logs live behind a server boundary	One product team owns the whole path	Several teams or clients need the same capability
Discovery	The app sends callable tools in the request	Clients can discover tools with `tools/list`	The tool set is stable for that request	Tools change or vary by user, role, or environment
Schema	Function tools are JSON Schema definitions in the API request	MCP tools expose `inputSchema` and optional `outputSchema` through the server	The schema is small and app-specific	The schema should be reused and versioned once
Context	Context is passed by the app in the conversation loop	MCP adds resources and prompts in addition to tools	The app already owns all needed context	Clients need files, database schemas, docs, or prompt templates from the same boundary
Approval	Your app implements review around the function	Clients and platforms can require MCP approval before data or actions flow	Approval is a simple local branch	Sensitive data or side effects need a review trail
Authorization	The app holds service credentials	HTTP MCP can use OAuth 2.1, protected resource metadata, bearer tokens, and scopes	One backend credential is enough	Scope, tenant, user, or client identity must be enforced server-side
Observability	You log inside each application	You can centralize tool usage at the MCP server boundary	One app log is enough	You need to answer which client called which tool with which approval state
Cost	Tool definitions count against context and are billed as input tokens	OpenAI remote MCP charges tokens for imported tool definitions and calls, with no extra per-tool-call fee	The schema is small and always needed	Tool lists are large enough to require filtering with `allowed_tools`
Failure mode	App bugs, schema drift, hidden credentials	Server sprawl, auth drift, tool list bloat, network failure	Simplicity is worth tight coupling	Central control is worth another process or network hop

This table is the decision. If the tool is a private implementation detail of one app, keep it direct. If it is a production capability that other agents, IDEs, workflows, or teams will consume, put it behind MCP and operate it like infrastructure.

What Function Calling Gives You

Function calling gives you the shortest controlled loop between a model and your application code. OpenAI defines function calling, also called tool calling, as a way to connect models to data and actions provided by your application. Function tools are defined by JSON Schema, and the model can return a tool call that your code executes application-side.

OpenAI function calling guide — Function calling is an application-owned loop: define tools, receive a tool call, execute code, return output, then let the model continue.

The production value is locality. Your application already has the user session, database transaction, feature flags, rate limit state, and product-specific validation. You can gate the function next to the code that understands the risk.

For example, an incident assistant inside your own ops dashboard can keep these as direct functions:

TypeScript

const tools = [
  {
    type: "function",
    name: "create_incident_note",
    description: "Append a note to the active incident timeline.",
    parameters: {
      type: "object",
      additionalProperties: false,
      properties: {
        incident_id: { type: "string" },
        note: { type: "string" },
        visibility: { type: "string", enum: ["internal", "customer"] }
      },
      required: ["incident_id", "note", "visibility"]
    },
    strict: true
  }
];

That tool is not a platform capability. It is one product action with one local permission model. Keep it in the app, set `strict: true` on the schema, validate the user and incident state before execution, and log the result in the same event stream as every other incident change.

Function calling starts to strain when the callable set grows into an integration surface. OpenAI's docs state that callable function definitions count against the model context limit and are billed as input tokens. If you send a large list of tools on every request, the cost and latency tax becomes part of every run. If the same schemas are copied across products, each copy becomes a place for drift.

The local loop should still have production controls:

Validate before execution
Treat tool arguments as untrusted input. Schema adherence is necessary, but it is not authorization. Check tenant, actor, state, and side-effect policy before calling the function.
Record the attempted action
Log run_id, model, tool_name, arguments_hash, actor_id, tenant_id, approval_state, latency_ms, and error_type. Store redacted arguments only when the data policy allows it.
Gate side effects
Use a local approval branch for actions that send messages, change records, cancel orders, trigger deploys, or touch customer-visible data.
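
The three controls above can be sketched for the `create_incident_note` tool. This is a minimal illustration under stated assumptions, not a reference implementation: the `Actor` and `Incident` shapes, the in-memory incident map, and the `approved` flag are hypothetical stand-ins for whatever session, storage, and review state the application already has.

```typescript
// Hypothetical shapes for illustration; none of these names come from an SDK.
interface NoteArgs {
  incident_id: string;
  note: string;
  visibility: "internal" | "customer";
}
interface Actor { actor_id: string; tenant_id: string }
interface Incident { id: string; tenant_id: string; status: "open" | "closed" }

type Decision =
  | { outcome: "executed" }
  | { outcome: "needs_approval" }
  | { outcome: "rejected"; error_type: string };

// Re-check model-supplied arguments at runtime; they are untrusted input.
function parseNoteArgs(raw: unknown): NoteArgs | null {
  if (typeof raw !== "object" || raw === null) return null;
  const { incident_id, note, visibility } = raw as Record<string, unknown>;
  if (typeof incident_id !== "string" || typeof note !== "string") return null;
  if (visibility !== "internal" && visibility !== "customer") return null;
  return { incident_id, note, visibility };
}

function handleCreateIncidentNote(
  raw: unknown,
  actor: Actor,
  incidents: Map<string, Incident>,
  approved: boolean
): Decision {
  const args = parseNoteArgs(raw);
  if (!args) return { outcome: "rejected", error_type: "invalid_arguments" };

  // Authorization is separate from schema validity: check tenant and state.
  const incident = incidents.get(args.incident_id);
  if (!incident || incident.tenant_id !== actor.tenant_id) {
    return { outcome: "rejected", error_type: "not_authorized" };
  }
  if (incident.status !== "open") {
    return { outcome: "rejected", error_type: "incident_closed" };
  }

  // Customer-visible notes are treated as a side effect that needs review.
  if (args.visibility === "customer" && !approved) {
    return { outcome: "needs_approval" };
  }

  // A real handler would append to the incident timeline here.
  return { outcome: "executed" };
}
```

Every branch returns a decision that can be logged, so rejected and held attempts end up in the same event stream as executed ones.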

Function calling is not less serious than MCP. It is just a smaller boundary. The risk is pretending a local helper is still local after three other agents start calling a copied version of it.

What MCP Gives You That Function Calling Does Not

MCP gives you a reusable interface for capabilities, not just a different way to describe a function. The current MCP specification defines an open protocol using JSON-RPC 2.0 messages between hosts, clients, and servers. Servers can expose tools, resources, and prompts. That is the key distinction: MCP is not only about executable actions.

Tools are functions the model can execute. MCP clients discover them with `tools/list` and invoke them with `tools/call`. Tool definitions include a name, description, `inputSchema`, optional `outputSchema`, annotations, and execution metadata. The `inputSchema` must be a valid JSON Schema object.

Resources are contextual data identified by URI. A server can expose files, database schemas, or application-specific information with `resources/list` and `resources/read`. Prompts are reusable structured messages and workflows that clients can discover with `prompts/list` and retrieve with `prompts/get`.
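
On the wire, each of these methods is a JSON-RPC 2.0 request. The sketch below builds the request envelopes only, to show the shape a client sends; the `request` helper, the tool name, the resource URI, and the prompt name are hypothetical, and a real client would normally use an MCP SDK and a transport rather than hand-built objects.

```typescript
// Minimal JSON-RPC 2.0 request envelope; the helper is illustrative only.
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

let nextId = 1;

function request(method: string, params?: Record<string, unknown>): JsonRpcRequest {
  const base: JsonRpcRequest = { jsonrpc: "2.0", id: nextId++, method };
  return params === undefined ? base : { ...base, params };
}

// Discovery: ask the server which tools it exposes.
const listTools = request("tools/list");

// Invocation: call one tool by name with structured arguments.
const callTool = request("tools/call", {
  name: "search_accounts", // hypothetical tool name
  arguments: { query: "acme" },
});

// Context: read a resource by URI and fetch a reusable prompt by name.
const readResource = request("resources/read", { uri: "schema://accounts/current" }); // hypothetical URI
const getPrompt = request("prompts/get", { name: "account_risk_review" }); // hypothetical prompt

console.log(JSON.stringify(callTool));
```

The same envelope carries tools, resources, and prompts, which is why one server boundary can govern all three.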

That broader surface changes the architecture. A "query customer data" capability can expose:

a tool for search_accounts
a resource for the current account schema
a prompt for a standard account-risk review workflow
an authorization boundary that scopes the user to the accounts they are allowed to inspect
logs that show which client requested what and whether a human approved it

Direct function calling can implement the same backend behavior, but every app has to wire it. MCP makes the capability discoverable and reusable without embedding the integration in each agent loop.

OpenAI's remote MCP support shows the operational shape. The Responses API can use connectors and remote MCP servers through the `mcp` built-in tool type. A remote MCP server requires a `server_url` and may require OAuth authorization. The API can emit `mcp_list_tools` when it imports available tools and `mcp_call` when the model calls one. It also supports `allowed_tools`, which matters because OpenAI's docs warn that exposing many MCP tools can increase cost and latency.

OpenAI MCP and connectors guide — Remote MCP introduces discovery, filtering, approvals, and trace items such as mcp_list_tools and mcp_call.
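
As a rough sketch, a filtered remote MCP tool entry might look like the object below. Only `server_url`, `allowed_tools`, and the item types `mcp_list_tools` and `mcp_call` are taken from the text above; `server_label` and `require_approval` reflect my reading of the OpenAI guide and should be checked against the current documentation. The label, URL, and tool names are placeholders.

```typescript
// Assumed shape of a remote MCP tool entry; verify field names against current docs.
interface McpToolConfig {
  type: "mcp";
  server_label: string; // assumption: a label identifying the server in traces
  server_url: string;
  allowed_tools?: string[]; // import only the tools this agent needs
  require_approval?: "always" | "never"; // assumption: approval mode values
}

const engineeringOps: McpToolConfig = {
  type: "mcp",
  server_label: "engineering_ops",
  server_url: "https://mcp.example.internal/mcp", // placeholder URL
  allowed_tools: ["search_runbooks", "create_incident_draft"],
  require_approval: "always",
};

// Count discovery and call items in a response's output list for logging.
interface ResponseItem { type: string }

function countMcpItems(output: ResponseItem[]): { listed: number; called: number } {
  return {
    listed: output.filter((item) => item.type === "mcp_list_tools").length,
    called: output.filter((item) => item.type === "mcp_call").length,
  };
}
```

Keeping `allowed_tools` short is the client-side half of the control; the server should still enforce its own policy for every call it receives.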

The approval behavior is also a production clue. OpenAI requests approval by default before data is shared with a connector or remote MCP server, and recommends reviewing and optionally logging all data shared with a remote MCP server. That is the right posture. A remote tool boundary should make data movement visible.

MCP authorization is not magic. The spec says authorization is optional for implementations. When HTTP authorization is supported, the current spec defines OAuth 2.1 based flows, OAuth 2.0 Protected Resource Metadata, bearer token usage, and standard error behavior such as 401, 403, and 400. In practice, that means a production MCP server still needs real identity, scopes, tenant checks, token validation, audit logging, and a refusal path.
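
A minimal sketch of the refusal path, assuming the bearer token has already been cryptographically verified upstream and decoded into claims. The `TokenClaims` shape and the reason strings are hypothetical; only the split between 401 and 403 follows the status codes named above.

```typescript
// Hypothetical decoded-token shape; signature and audience checks happen before this.
interface TokenClaims {
  sub: string;
  tenant_id: string;
  scopes: string[];
  expires_at_ms: number;
}

type AuthResult =
  | { status: 200; claims: TokenClaims }
  | { status: 401 | 403; reason: string };

function authorizeCall(
  claims: TokenClaims | null,
  requiredScopes: string[],
  nowMs: number
): AuthResult {
  // 401: no usable identity at all.
  if (claims === null) return { status: 401, reason: "missing_or_invalid_token" };
  if (claims.expires_at_ms <= nowMs) return { status: 401, reason: "token_expired" };

  // 403: identity is known, but it lacks the scopes this tool requires.
  const missing = requiredScopes.filter((scope) => !claims.scopes.includes(scope));
  if (missing.length > 0) {
    return { status: 403, reason: `insufficient_scope:${missing.join(",")}` };
  }
  return { status: 200, claims };
}
```

Tenant checks and audit logging would sit next to this function; they are left out to keep the status mapping visible.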

The payoff is a cleaner operating model:

Clients discover only the tools they are allowed to see.
Tool calls carry enough identity to enforce policy at the server.
A shared capability can be updated once instead of copied into every app.
Security review can focus on one server boundary and its allowed actions.
Observability can show tool usage across agents, IDEs, workflows, and products.

Build MCP when that operating model is worth more than the extra runtime boundary.

The Architecture We Ship for Internal Tools

A custom MCP server should be treated like an internal API product for AI clients. The minimum useful design is not "wrap every endpoint and publish it." It is a scoped tool surface with explicit permissions, approvals, and logs.

For an internal engineering platform, the first server often exposes a small set of high-value capabilities:

search runbooks
read service ownership metadata
query deploy status
open an incident draft
propose a rollback plan

The sensitive action is not the read. The sensitive action is the transition from "propose" to "execute." Keep that distinction in the tool design.

JSON

{
  "server": "engineering-ops",
  "tools": [
    {
      "name": "search_runbooks",
      "risk": "read",
      "approval": "none",
      "scopes": ["runbooks:read"]
    },
    {
      "name": "create_incident_draft",
      "risk": "write",
      "approval": "review",
      "scopes": ["incidents:write"]
    },
    {
      "name": "request_rollback",
      "risk": "production_side_effect",
      "approval": "required",
      "scopes": ["deployments:rollback:request"]
    }
  ]
}

That policy object is not the MCP spec. It is the control layer your production system needs around the spec. This is where most teams need help: the protocol is the connection standard, but the production work is selecting, constraining, approving, and observing the capabilities.
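
One way to enforce that policy object at call time is a small lookup that denies by default. The sketch below is an assumption about how the fields could be interpreted, not part of MCP: it treats both `review` and `required` as "hold until a human decides", and the `evaluate` function and `Verdict` names are hypothetical.

```typescript
type Approval = "none" | "review" | "required";

interface ToolPolicy {
  name: string;
  risk: string;
  approval: Approval;
  scopes: string[];
}

// Mirrors the policy object above.
const policies: ToolPolicy[] = [
  { name: "search_runbooks", risk: "read", approval: "none", scopes: ["runbooks:read"] },
  { name: "create_incident_draft", risk: "write", approval: "review", scopes: ["incidents:write"] },
  {
    name: "request_rollback",
    risk: "production_side_effect",
    approval: "required",
    scopes: ["deployments:rollback:request"],
  },
];

type Verdict = "allow" | "hold_for_approval" | "deny";

function evaluate(tool: string, grantedScopes: string[], approvalGranted: boolean): Verdict {
  const policy = policies.find((p) => p.name === tool);
  // Tools without an explicit policy entry are never callable.
  if (!policy) return "deny";
  // Every scope listed for the tool must be present on the caller.
  if (!policy.scopes.every((scope) => grantedScopes.includes(scope))) return "deny";
  // Assumption: "review" and "required" both wait for a recorded approval.
  if (policy.approval !== "none" && !approvalGranted) return "hold_for_approval";
  return "allow";
}
```

Because unknown tools are denied, adding a tool to the server without adding a policy entry fails closed.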

The server should emit one audit event for every attempted tool call, not just successful calls:

JSON

{
  "event": "mcp_tool_call_attempted",
  "run_id": "run_...",
  "client": "incident-agent",
  "server": "engineering-ops",
  "tool": "request_rollback",
  "actor_id": "user_...",
  "tenant_id": "tenant_...",
  "scope_result": "allowed",
  "approval_state": "pending",
  "arguments_hash": "sha256:...",
  "latency_ms": null,
  "error_type": null
}
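
A sketch of how such an event could be built, assuming a Node.js runtime. The `stableStringify`, `hashArguments`, and `buildAttemptEvent` helpers are hypothetical; the field names follow the event above. Hashing a key-sorted serialization is one possible choice so that equal arguments produce equal hashes regardless of key order.

```typescript
import { createHash } from "node:crypto";

interface AuditEvent {
  event: "mcp_tool_call_attempted";
  run_id: string;
  client: string;
  server: string;
  tool: string;
  actor_id: string;
  tenant_id: string;
  scope_result: "allowed" | "denied";
  approval_state: "none" | "pending" | "approved" | "rejected";
  arguments_hash: string;
  latency_ms: number | null;
  error_type: string | null;
}

// Serialize with sorted object keys so the hash does not depend on key order.
function stableStringify(value: unknown): string {
  if (value === undefined) return "null";
  if (Array.isArray(value)) return `[${value.map((v) => stableStringify(v)).join(",")}]`;
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const body = Object.keys(obj)
      .sort()
      .map((key) => `${JSON.stringify(key)}:${stableStringify(obj[key])}`)
      .join(",");
    return `{${body}}`;
  }
  return JSON.stringify(value);
}

// Store a digest instead of raw arguments so the log stays free of payload data.
function hashArguments(args: unknown): string {
  return "sha256:" + createHash("sha256").update(stableStringify(args)).digest("hex");
}

type AttemptContext = Omit<AuditEvent, "event" | "arguments_hash" | "latency_ms" | "error_type">;

// Emitted before execution, so latency and error are not known yet.
function buildAttemptEvent(ctx: AttemptContext, args: unknown): AuditEvent {
  return {
    event: "mcp_tool_call_attempted",
    ...ctx,
    arguments_hash: hashArguments(args),
    latency_ms: null,
    error_type: null,
  };
}
```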

For side effects, store the approval state separately from the model message. OpenAI's Agents guidance frames human review as the approval path for tool calls: the run pauses, a person or policy approves or rejects the action, and execution resumes from state. Even if your agent runtime is not OpenAI's Agents SDK, the pattern holds. Approval is state, not a prompt instruction.
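
"Approval is state" can be made concrete with a small store that the server consults before executing a sensitive tool. This is an in-memory sketch with hypothetical names (`ApprovalStore`, `ApprovalRecord`); it is not the Agents SDK API, and a production version would persist records durably.

```typescript
type ApprovalState = "pending" | "approved" | "rejected";

interface ApprovalRecord {
  call_id: string;
  tool: string;
  state: ApprovalState;
  decided_by: string | null;
}

class ApprovalStore {
  private records = new Map<string, ApprovalRecord>();

  // Called when a sensitive tool call is attempted; the run pauses here.
  request(call_id: string, tool: string): ApprovalRecord {
    const record: ApprovalRecord = { call_id, tool, state: "pending", decided_by: null };
    this.records.set(call_id, record);
    return record;
  }

  // Called by a person or policy engine; a decision can only be made once.
  decide(call_id: string, approver: string, approve: boolean): ApprovalRecord {
    const record = this.records.get(call_id);
    if (!record) throw new Error(`unknown call: ${call_id}`);
    if (record.state !== "pending") throw new Error(`already decided: ${call_id}`);
    record.state = approve ? "approved" : "rejected";
    record.decided_by = approver;
    return record;
  }

  // The executor checks stored state, never the model's claim of approval.
  canExecute(call_id: string): boolean {
    return this.records.get(call_id)?.state === "approved";
  }
}
```

Nothing the model writes into the conversation can flip `canExecute` to true; only a recorded decision can.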

For private systems, do not expose an internal MCP server publicly just so an AI product can reach it. OpenAI's Secure MCP Tunnel is one pattern: it connects private MCP servers to supported OpenAI products without opening inbound firewall ports, using an outbound HTTPS path that pulls queued work, forwards requests locally, and returns responses through the same tunnel. The broader rule is vendor-independent: keep the server inside the trust boundary that already protects the underlying systems.

This is also where observability links back to the rest of the AI stack. If the MCP server becomes the capability boundary, traces and evals need to capture tool selection quality, approval outcomes, refusal rates, and error classes. For the observability tool decision, see the related comparison at /blog/langfuse-vs-langsmith-production-observability.

What Breaks First

Function calling breaks first through invisible growth. The local helper starts as a clean schema and a dispatch function. Then another agent copies it. Then a second provider needs a different wrapper. Then the schema has five subtly different versions. Then a sensitive action is guarded in one app and not another. The failure is not that function calling is weak. The failure is that the capability outgrew its boundary.

The concrete warning signs:

the same tool schema appears in multiple repos
credentials for different systems live in one agent runtime
approval checks are implemented differently per app
tool descriptions are tuned per model instead of per capability
logs cannot answer which model asked for which action
cost rises because every request carries tool definitions that are rarely used

MCP breaks first in a different way. It makes capabilities easier to expose, so teams expose too much. A server with a broad tool list becomes expensive and confusing for the model. A server without scopes becomes a soft permission bypass. A server without approval state turns "model-controlled" into "model-can-try-anything." A server without logs becomes another hidden integration layer.

The concrete warning signs:

clients import broad tool lists instead of using `allowed_tools`
tool annotations are trusted without server provenance
OAuth scopes do not map to tool risk
approval prompts exist in the UI but not in server-side enforcement
resource reads are logged less carefully than tool calls
network errors are retried without idempotency rules
server inventory is missing, so nobody knows which agents can reach which tools

MCP's own spec is clear that tools represent arbitrary code execution and should be treated with caution. It also says hosts must obtain explicit user consent before invoking any tool, and implementors should build robust consent and authorization flows. That does not happen automatically when a server starts.

The operational answer is a small control plane:

Control	Function calling implementation	MCP implementation
Tool allowlist	Select tools per request	Filter imported server tools with `allowed_tools` and server-side policy
Approval	Local workflow branch before execution	Approval object checked by server before sensitive tool execution
Identity	App session and service credentials	Actor, client, tenant, server, scopes, and token validation
Logging	App event log around function execution	Central audit event for list, call, refusal, approval, and error
Cost	Count repeated schema tokens	Count imported tool definitions, tool calls, and filtered tool lists
Reliability	Local retries and validation	Transport errors, protocol errors, tool errors, idempotency keys, and circuit breakers

Use direct calls when these controls are naturally local. Use MCP when centralizing them is the safer architecture.
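
The idempotency-key control in the reliability row can be sketched as a result cache keyed by a caller-supplied key, so a retry after a transport error does not repeat the side effect. The `IdempotentExecutor` class is hypothetical and keeps results in memory only; a real server would need a durable store with expiry and handling for concurrent attempts.

```typescript
// Replays the stored result when the same idempotency key is seen again.
class IdempotentExecutor<T> {
  private results = new Map<string, T>();

  execute(key: string, action: () => T): T {
    if (this.results.has(key)) {
      // Retry of a call that already ran: return the earlier result, skip the action.
      return this.results.get(key) as T;
    }
    const result = action();
    this.results.set(key, result);
    return result;
  }
}

// Usage: the side effect runs once even though the client retried.
let rollbackRequests = 0;
const executor = new IdempotentExecutor<string>();

const requestRollback = (): string => {
  rollbackRequests += 1;
  return `rollback-request-${rollbackRequests}`;
};

const first = executor.execute("key-123", requestRollback);
const retried = executor.execute("key-123", requestRollback);
console.log(first === retried, rollbackRequests);
```

If the action throws, nothing is cached, so a later retry with the same key runs it again.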

Migration Path: Promote One Tool, Not the Whole System

The clean migration path is to promote the most reused or highest-risk capability first. Do not convert the entire agent system to MCP because the protocol is available. Move the tool that already behaves like shared infrastructure.

Find the boundary
Choose a tool that is copied across clients, uses sensitive credentials, or needs shared audit logs. Leave app-local behavior as function calling.
Define the minimum server surface
Expose the few tools, resources, and prompts that represent the capability. Do not mirror every backend endpoint.
Add policy before rollout
Map each tool to scopes, approval requirements, idempotency behavior, and redaction rules before connecting production clients.
Instrument discovery and calls
Log tools/list, imported tools, refused tools, approvals, tool calls, errors, and latency. Tool discovery is part of the production surface.
Keep direct functions for hot paths
If a tool is latency-sensitive, app-specific, and owned by one runtime, do not promote it just for symmetry.

The decision should reduce duplicated integration code or improve control. If MCP only adds a network hop and another deployment without solving ownership, discovery, or governance, keep the function direct.

The Final Decision Rule

Use function calling for product-local behavior. Build a custom MCP server for shared internal capabilities.

That rule holds across model providers and agent frameworks because it is an ownership rule. Function calling is a good application pattern. MCP is a good platform boundary. The best production systems use both, with direct calls for local behavior and MCP servers for capabilities that need to be discovered, governed, approved, reused, and observed.

For deeper MCP implementation patterns, the /writing/mcp lane covers custom server design, security boundaries, and production rollout decisions.

Is MCP just function calling with extra steps?

No. Function calling lets a model request a function from your application. MCP standardizes how capabilities are discovered, invoked, authorized, and reused across AI clients through a server boundary.

Does MCP replace function calling?

No. Keep direct function calls for app-local behavior. Use MCP for shared or governed capabilities. A mature production agent stack usually has both.

When should we build a custom MCP server?

Build one when a capability needs multiple clients, server-side auth, scoped access, resource discovery, approval controls, or audit logs. Those are platform concerns, not prompt concerns.

Does MCP add latency?

It can. Local stdio adds a process boundary and remote HTTP adds a network boundary. Keep latency-sensitive app-local tools direct unless the governance benefit is worth the hop.

Last Updated

Jun 3, 2026

Category: MCP
