MCP Sampling vs Elicitation for Production Servers

Use MCP sampling for client-owned model calls and elicitation for user input. Set the production boundary, approval flow, and logging rules.

Sunday, June 14, 2026

Omid Saffari

Use MCP sampling when your server needs the client's model to generate or reason, and use MCP elicitation when your server needs the user to provide missing information. Treat both as client-mediated control points, not shortcuts around permissions, credentials, or approval UI.

The production rule

Sampling asks the client model to do model work; elicitation asks the user to provide missing input. That is the boundary to keep in your head when designing a production MCP server.

Sampling is not "randomly sample documents from context." In MCP, sampling lets a server send sampling/createMessage to the client so the client can route a text, audio, or image-based generation through the user's model environment. The client keeps control over model access, model selection, permissions, prompt review, and response review, with no server API keys necessary.

Elicitation is not retrieval. It lets the server send elicitation/create so the client can ask the user for additional information during a workflow. The answer might be a small structured form response, or it might be consent to open an external URL for a sensitive interaction.

Use sampling when the server has enough user intent and tool context, but needs a model step it should not run itself. Use elicitation when the server cannot safely continue without the user supplying or confirming something. Use neither as a replacement for normal MCP tools, resources, or authorization. If the question is "should this be a callable action or exposed context first," start with the resource and tool boundary in MCP Resources vs Tools, then add sampling or elicitation only where a nested model or user interaction is genuinely required.

Production comparison

The clean production design is to treat sampling and elicitation as different client-mediated request types with different approval, logging, and security rules.

| Decision axis | MCP sampling | MCP elicitation | Production rule |
| --- | --- | --- | --- |
| What the server asks for | A model generation through the client | Additional user information through the client | Sampling is model output. Elicitation is user input. |
| Protocol method | `sampling/createMessage` | `elicitation/create` | Route and log them as different control-plane events. |
| Capability check | Client declares `sampling`; tool-enabled sampling requires `sampling.tools` | Client declares `elicitation` with `form`, `url`, or form-only empty object | Never send optional requests before initialization proves support. |
| Approval UI | User can review or edit the prompt and review the generated response | User can review, modify, decline, or cancel the request | Approval is part of the feature, not a wrapper you add later. |
| Sensitive data | Both parties must handle sensitive data appropriately | Form mode must not request passwords, API keys, access tokens, or payment credentials; URL mode is required for sensitive interactions | Secrets do not pass through model context or form mode. |
| First production failure | Unbounded model/tool loop, hidden model spend, or unsupported client capability | Missing decline path, unsafe URL handling, or state tied to the wrong user | Build failure handling before exposing the server broadly. |
| What to log | Request ID, server, method, model hint, `maxTokens`, `toolChoice`, stop reason, approval result | Request ID, server, mode, schema hash, target domain for URL mode, user action, retry state | Logs should explain who approved what and why execution continued. |

The most expensive mistake is making the feature invisible. If sampling looks like a normal tool call in logs, you lose the model boundary. If elicitation looks like a normal form submission, you lose the user-consent boundary. A production MCP server should make both visible in the same control plane that already tracks tool calls, approvals, run state, and authorization checks.
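One way to keep the two features visible is to give each its own event type instead of reusing a generic tool-call record. The sketch below is a minimal illustration, not part of any MCP SDK: the class names, the field names, and the `lane` label are hypothetical and simply mirror the "What to log" row above.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class SamplingEvent:
    """Hypothetical audit record for one sampling/createMessage request."""
    request_id: str
    server: str
    model_hint: Optional[str]
    max_tokens: int
    tool_choice: Optional[str]
    stop_reason: Optional[str]
    approval_result: str
    method: str = "sampling/createMessage"
    lane: str = "model_work"


@dataclass(frozen=True)
class ElicitationEvent:
    """Hypothetical audit record for one elicitation/create request."""
    request_id: str
    server: str
    mode: str
    schema_hash: Optional[str]      # set for form mode
    target_domain: Optional[str]    # set for URL mode
    user_action: str                # accept, decline, or cancel
    retry_state: str
    method: str = "elicitation/create"
    lane: str = "user_input"


def to_log_line(event: Union[SamplingEvent, ElicitationEvent]) -> str:
    """Serialize an event as one JSON line with a stable key order."""
    return json.dumps(asdict(event), sort_keys=True)
```

Because `method` and `lane` are fixed per type, a log query can separate model work from user input without inspecting payloads.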

Use sampling when the server needs model work

Sampling belongs inside a server workflow when the server needs a model result, but the model should remain owned by the client. The server sends messages, model preferences, an optional system prompt, and maxTokens; the client decides how to satisfy the generation request.

A good production use case is a repository MCP server that can inspect a diff and then ask the client's model to produce a risk note. The server has the repo context and the tool result. The client has the model, the user's model policy, and the approval UI. That split is the point.

```json
{
  "method": "sampling/createMessage",
  "params": {
    "messages": [
      {
        "role": "user",
        "content": {
          "type": "text",
          "text": "Summarize the migration risk from this schema diff."
        }
      }
    ],
    "modelPreferences": {
      "hints": [{ "name": "reasoning-model" }],
      "intelligencePriority": 0.8,
      "speedPriority": 0.5
    },
    "systemPrompt": "Return a concise engineering risk note.",
    "maxTokens": 1000
  }
}
```

That request should not fire silently. The MCP spec says sampling applications should provide UI that lets users review sampling requests, view and edit prompts before sending, and review generated responses before delivery. In production, that means the prompt shown to the user should include the requesting server, the purpose, the requested model preference, and the data classes being sent.

Tool-enabled sampling raises the bar. A server can include a tools array and optional toolChoice, but only when the client has declared sampling.tools. The server must not send tool-enabled sampling to a client that has not declared support. The toolChoice mode matters: auto lets the model decide whether to use tools, required forces at least one tool call before completion, and none blocks tool use.
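The capability rule for tool-enabled sampling can be enforced at the point where the request parameters are assembled. This is a minimal sketch under stated assumptions: `build_sampling_params` and `UnsupportedCapability` are hypothetical names, `client_caps` is assumed to be the capabilities object the client sent during initialization, and the `{"mode": ...}` shape for `toolChoice` should be checked against the spec revision you target.

```python
from typing import Any, Dict, List, Optional

VALID_TOOL_CHOICE_MODES = {"auto", "required", "none"}


class UnsupportedCapability(Exception):
    """Raised when the client did not declare a capability the request needs."""


def build_sampling_params(
    client_caps: Dict[str, Any],
    messages: List[Dict[str, Any]],
    max_tokens: int,
    tools: Optional[List[Dict[str, Any]]] = None,
    tool_choice_mode: Optional[str] = None,
) -> Dict[str, Any]:
    """Build sampling/createMessage params, refusing unsupported features."""
    sampling = client_caps.get("sampling")
    if sampling is None:  # an empty object still counts as declared
        raise UnsupportedCapability("client did not declare sampling")

    params: Dict[str, Any] = {"messages": messages, "maxTokens": max_tokens}
    if tools:
        if "tools" not in sampling:
            raise UnsupportedCapability("client did not declare sampling.tools")
        params["tools"] = tools
        if tool_choice_mode is not None:
            if tool_choice_mode not in VALID_TOOL_CHOICE_MODES:
                raise ValueError(f"unknown toolChoice mode: {tool_choice_mode!r}")
            # Assumed wire shape for toolChoice.
            params["toolChoice"] = {"mode": tool_choice_mode}
    return params
```

Raising instead of silently dropping the tools keeps the failure visible to the caller, which can then return an unsupported-feature error.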

The production trap is an unbounded nested loop. When sampling returns stopReason: "toolUse", every assistant tool-use block must be followed by matching tool-result blocks before the conversation continues. Both parties should implement iteration limits. In practical terms, your server should store the sampling request, the model response, each tool-use ID, the matching tool-result ID, the final stop reason, and the approval result. If a tool result is missing, stop the run and surface a recoverable error instead of improvising another model call.
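A bounded loop with tool-use and tool-result matching might look like the sketch below. Everything here is illustrative: `sample` and `run_tool` are injected callables standing in for the real client request and tool execution, the limit of 5 is arbitrary, and the block shapes (`tool_use` with `id`, `tool_result` with `toolUseId`, `content` as a list) are assumptions to verify against the spec.

```python
from typing import Any, Callable, Dict, List, Optional

Block = Dict[str, Any]
MAX_ITERATIONS = 5  # arbitrary example limit


class RecoverableRunError(Exception):
    """The run was stopped in a state the caller can inspect and retry."""


def unmatched_tool_use_ids(assistant: List[Block], results: List[Block]) -> List[str]:
    """Return tool-use IDs that have no matching tool-result block."""
    wanted = [b["id"] for b in assistant if b.get("type") == "tool_use"]
    got = {b.get("toolUseId") for b in results if b.get("type") == "tool_result"}
    return [tool_id for tool_id in wanted if tool_id not in got]


def run_bounded(
    sample: Callable[[List[Dict[str, Any]]], Dict[str, Any]],
    run_tool: Callable[[Block], Optional[Block]],
    messages: List[Dict[str, Any]],
    max_iterations: int = MAX_ITERATIONS,
) -> Dict[str, Any]:
    """Call the model until it stops asking for tools or the limit is hit."""
    for _ in range(max_iterations):
        response = sample(messages)
        if response["stopReason"] != "toolUse":
            return response
        blocks = response["content"]
        produced = [run_tool(b) for b in blocks if b.get("type") == "tool_use"]
        results = [r for r in produced if r is not None]
        missing = unmatched_tool_use_ids(blocks, results)
        if missing:
            # Stop instead of improvising another model call.
            raise RecoverableRunError(f"missing tool results for {missing}")
        messages = messages + [
            {"role": "assistant", "content": blocks},
            {"role": "user", "content": results},
        ]
    raise RecoverableRunError("iteration limit reached")
```

A real server would also persist each request, response, and tool-use ID at the points where this sketch only passes them along.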

Use elicitation when the server needs user input

Elicitation belongs where the server cannot safely continue without more information from the user. The server sends an elicitation/create request with a human-readable message and a mode. Form mode collects structured data through the MCP client. URL mode sends the user to an external URL for interactions that must not pass through the MCP client.

A good production use case is a deployment MCP server that can create a release note, but cannot decide which environment to target. The server should not guess. It should ask.

```json
{
  "method": "elicitation/create",
  "params": {
    "mode": "form",
    "message": "Choose the deployment environment for this release.",
    "requestedSchema": {
      "type": "object",
      "properties": {
        "environment": {
          "type": "string",
          "title": "Environment",
          "enum": ["staging", "production"]
        }
      },
      "required": ["environment"]
    }
  }
}
```

Form mode is intentionally narrow. The requested schema is a restricted JSON Schema subset: flat objects with primitive properties only. Supported string formats include email, uri, date, and date-time. Complex nested structures and arrays of objects are intentionally not supported because the client needs to render something a user can understand and decline.
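A server can check its own schemas against that restriction before sending them. The following is a deliberately conservative sketch that encodes only what the paragraph above states (flat object, primitive properties, the four listed string formats). `schema_problems` is a hypothetical helper, and a given spec revision may permit more than this check does.

```python
from typing import Any, Dict, List

PRIMITIVE_TYPES = {"string", "number", "integer", "boolean"}
STRING_FORMATS = {"email", "uri", "date", "date-time"}


def schema_problems(schema: Dict[str, Any]) -> List[str]:
    """Return human-readable reasons a form-mode schema is too complex."""
    problems: List[str] = []
    if schema.get("type") != "object":
        problems.append("top-level type must be 'object'")

    properties = schema.get("properties", {})
    for name, spec in properties.items():
        prop_type = spec.get("type")
        if prop_type not in PRIMITIVE_TYPES:
            problems.append(f"{name}: type {prop_type!r} is not a primitive")
        fmt = spec.get("format")
        if fmt is not None and (prop_type != "string" or fmt not in STRING_FORMATS):
            problems.append(f"{name}: format {fmt!r} is not supported")

    for name in schema.get("required", []):
        if name not in properties:
            problems.append(f"{name}: required but not defined in properties")
    return problems
```

Running this against the deployment schema shown earlier returns an empty list, while a property of type `object` or `array` is reported as a problem.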

The response is not a boolean. Elicitation has three actions: accept, decline, and cancel. In form mode, accept includes submitted data matching the requested schema; decline means the user explicitly chose not to provide the information; cancel means the user dismissed the interaction without making a clear choice. Treat those as separate control states. A production server can offer an alternative after decline, but it should not continue as if the field were optional unless the workflow actually allows that.
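Keeping the three actions as separate control states can be as simple as mapping each one to its own next step. This is a sketch, assuming the result arrives as a dict with an `action` key and, on accept, a `content` object. The `Next` enum and `next_step` function are hypothetical, and what "stop" or "offer an alternative" means is up to the workflow.

```python
from enum import Enum
from typing import Any, Dict, List


class Next(Enum):
    CONTINUE = "continue"
    OFFER_ALTERNATIVE = "offer_alternative"
    STOP = "stop"


def next_step(result: Dict[str, Any], required: List[str]) -> Next:
    """Map an elicitation result to a workflow decision."""
    action = result.get("action")
    if action == "accept":
        content = result.get("content") or {}
        missing = [key for key in required if key not in content]
        # An accept without the required fields is not usable input.
        return Next.STOP if missing else Next.CONTINUE
    if action == "decline":
        # The user made an explicit choice; do not proceed as if it were optional.
        return Next.OFFER_ALTERNATIVE
    if action == "cancel":
        # No clear choice was made; end this attempt without guessing.
        return Next.STOP
    raise ValueError(f"unexpected elicitation action: {action!r}")
```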

URL mode is the security boundary for sensitive interactions. The spec says servers must not use form mode to request passwords, API keys, access tokens, or payment credentials. For those cases, the server must use URL mode. URL mode accept means the user consented to start the external interaction. It does not mean the external interaction finished. If the original request cannot proceed until that interaction completes, the server can return a URLElicitationRequiredError (code -32042) to signal that a URL mode elicitation must complete first.
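For illustration, the error response could be assembled as below. Only the code -32042 comes from the text above. The `data.elicitations` layout, the message string, and the helper name `url_elicitation_required` are assumptions to verify against the spec revision you target, and the example URL is a placeholder.

```python
import uuid
from typing import Any, Dict, Optional, Union

URL_ELICITATION_REQUIRED = -32042


def url_elicitation_required(
    request_id: Union[int, str],
    url: str,
    message: str,
    elicitation_id: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a JSON-RPC error telling the client a URL interaction must finish first."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "error": {
            "code": URL_ELICITATION_REQUIRED,
            "message": "This request requires a URL mode elicitation.",
            # Assumed payload shape: one entry per pending URL interaction.
            "data": {
                "elicitations": [
                    {
                        "mode": "url",
                        "elicitationId": elicitation_id or str(uuid.uuid4()),
                        "url": url,
                        "message": message,
                    }
                ]
            },
        },
    }
```

The elicitation ID is what lets the server later match a completed external flow back to the original request, so it should be stored before the error is returned.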

For remote servers, bind elicitation state to the individual user, not just a session ID. The spec is explicit that elicitation state must not be associated with session IDs alone, and that state storage must be protected against unauthorized access. If the flow involves third-party authorization, keep that boundary aligned with your MCP authorization design. The safe companion read is MCP Authorization for Production Servers.
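Binding state to the user can be expressed by making the verified user identity part of the storage key. The sketch below is an in-memory stand-in for protected storage. `ElicitationStore` is hypothetical, and `user_id` is assumed to come from a verified identity (for example the subject of a validated access token), never from a client-supplied session ID.

```python
from typing import Dict, Tuple


class ElicitationStore:
    """Tracks URL-mode elicitations keyed by (user_id, elicitation_id)."""

    def __init__(self) -> None:
        self._state: Dict[Tuple[str, str], str] = {}

    def start(self, user_id: str, elicitation_id: str) -> None:
        """Record that this user started the external interaction."""
        self._state[(user_id, elicitation_id)] = "pending"

    def complete(self, user_id: str, elicitation_id: str) -> None:
        """Mark completion, but only for the user who started the flow."""
        key = (user_id, elicitation_id)
        if key not in self._state:
            raise PermissionError("elicitation was not started by this user")
        self._state[key] = "completed"

    def is_complete(self, user_id: str, elicitation_id: str) -> bool:
        """True only when this same user finished this elicitation."""
        return self._state.get((user_id, elicitation_id)) == "completed"
```

With this keying, a second user who learns an elicitation ID cannot complete or read another user's flow.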

The boundary that breaks first

The first failure is usually not the JSON-RPC method. It is the missing policy around who gets to approve the nested action.

For sampling, the failure is a server that assumes the client can sample, then silently falls back to its own model key. That changes the trust boundary, billing boundary, and audit boundary. If you need a fallback, make it explicit in server configuration and logs. A client-owned sampling flow and a server-owned provider call are different products from a security review standpoint.

For elicitation, the failure is a server that asks for a secret in form mode because it is convenient. That violates the protocol boundary. Sensitive credentials should not move through the MCP client, the LLM context, or intermediate server logs. Use URL mode, show the full URL to the user, require explicit consent before opening it, and verify that the user who completes the external flow is the same user who started it.

For tool-enabled sampling, the failure is an uncontrolled loop. Sampling with tools can produce a model response with tool-use content, then require tool results, then continue with more model work. That can be the right design for a bounded analysis workflow. It is the wrong design when the server has no iteration limit, no per-run approval state, and no way to explain why the final answer used a given tool result.

For both features, capability negotiation is the first gate. MCP lifecycle initialization is where the client and server establish protocol version compatibility, exchange capabilities, and share implementation details. The client sends the initialization request, the server responds with its capabilities, and the client sends notifications/initialized after successful initialization. Optional features belong after that point.
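Reading the negotiated capabilities once and branching on the result keeps the gate in one place. This is a minimal sketch, assuming `init_params` is the `params` object of the client's initialization request. `client_features` is a hypothetical helper, and the empty-object rule for elicitation follows the comparison table above.

```python
from typing import Any, Dict


def client_features(init_params: Dict[str, Any]) -> Dict[str, Any]:
    """Summarize which optional features the client declared."""
    caps = init_params.get("capabilities", {})
    sampling = caps.get("sampling")
    elicitation = caps.get("elicitation")

    if elicitation is None:
        modes = set()
    else:
        modes = {mode for mode in ("form", "url") if mode in elicitation}
        if not modes:
            # An empty elicitation object is treated as form-only.
            modes = {"form"}

    return {
        "sampling": sampling is not None,
        "sampling_tools": sampling is not None and "tools" in sampling,
        "elicitation_modes": modes,
    }
```

A server would store this summary alongside the session after `notifications/initialized` arrives and consult it before every sampling or elicitation request.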

Gate on capability, not hope

Read the initialized client capabilities before sending either request. Sampling requires sampling; tool-enabled sampling requires sampling.tools; elicitation requires elicitation with a supported mode.

Show the approval surface

Sampling approval should show the prompt, server, model preference, and response before delivery. Elicitation approval should show the server, message, requested fields or full URL, and a clear decline path.

Persist the control state

Store the request ID, method, capability branch, user action, approval timestamp, and final result. For URL mode, store the elicitation ID and completion state so the original tool call can be retried safely.

Stop on mismatch

If the client lacks support, if a URL cannot be safely shown, if a tool-use response is missing its matching tool result, or if the user declines, stop the workflow and return a recoverable error.

Wire it like a control plane

The implementation should feel less like a helper method and more like a small control plane. Both features cross a trust boundary inside an already-running MCP workflow.

For sampling, route requests through a sampling broker in your server. That broker should enforce capability checks, normalize model preferences, omit includeContext unless the client has explicitly declared support for context inclusion, cap the tool-use loop, and emit structured events for prompt review, response review, tool-use responses, and the final stop reason.

For elicitation, route requests through an elicitation broker. That broker should reject form-mode requests for secrets, hash or label the requested schema for logs, record accept, decline, and cancel separately, and treat URL mode as an out-of-band state machine. Do not pre-authenticate URL links. Do not include sensitive end-user information in the URL. Use HTTPS URLs outside development environments.
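Some of the elicitation broker's checks are mechanical enough to sketch. The helpers below are hypothetical and heuristic: the secret-name pattern is an example list that will miss cases, embedded URL credentials are only one form of pre-authenticated link, and `development` is an assumed flag for local environments.

```python
import hashlib
import json
import re
from typing import Any, Dict, List
from urllib.parse import urlsplit

# Example pattern only; extend it for your own field naming.
SECRET_PATTERN = re.compile(
    r"pass(word|phrase)|api[_\- ]?key|token|secret|card|cvv", re.IGNORECASE
)


def secret_like_fields(schema: Dict[str, Any]) -> List[str]:
    """Return property names that look like they request a credential."""
    hits = []
    for name, spec in schema.get("properties", {}).items():
        text = " ".join([name, str(spec.get("title", "")), str(spec.get("description", ""))])
        if SECRET_PATTERN.search(text):
            hits.append(name)
    return hits


def schema_hash(schema: Dict[str, Any]) -> str:
    """Hash a canonical form of the schema so logs can reference it compactly."""
    canonical = json.dumps(schema, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def url_problems(url: str, development: bool = False) -> List[str]:
    """Return reasons a URL-mode target should not be shown to the user."""
    parts = urlsplit(url)
    problems = []
    if parts.scheme != "https" and not development:
        problems.append("non-HTTPS URL outside development")
    if not parts.hostname:
        problems.append("URL has no host")
    if parts.username or parts.password:
        problems.append("credentials embedded in URL")
    return problems
```

A broker would reject a form-mode request when `secret_like_fields` returns anything, and log `schema_hash` or the URL's host instead of the full payload.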

The useful dashboard has two lanes:

Model work lane: sampling request, prompt approval, model preference, toolChoice, model response, stop reason, response approval.
User input lane: elicitation request, mode, schema hash or target domain, user action, completion notification, retry state.

That dashboard gives engineering and security the same thing: an audit trail for why the server continued. Without it, sampling and elicitation become invisible control transfers. With it, they become strong protocol features that let custom MCP servers ask for exactly the right help at the right boundary.

What is MCP sampling?

MCP sampling lets a server request a client-mediated LLM generation by sending sampling/createMessage. Use it when the server needs model output, but the client should retain control over model access, model selection, permissions, prompt approval, and response approval.

What is MCP elicitation?

MCP elicitation lets a server request additional information from the user through the client by sending elicitation/create. Use form mode for small structured inputs and URL mode for sensitive interactions that must not pass through the MCP client.

What is the difference between MCP sampling and elicitation?

Sampling asks the model to produce output. Elicitation asks the user to provide input. In production, they should have separate capability checks, approval UI, failure handling, and audit logs.

Can MCP elicitation collect API keys or passwords?

Not through form mode. The MCP spec says servers must not use form mode to request passwords, API keys, access tokens, or payment credentials. Use URL mode for sensitive interactions and keep credentials out of the MCP client and model context.

Do all MCP clients support sampling and elicitation?

No. Clients declare optional capabilities during initialization. A production server should branch on the initialized capabilities and return a clear unsupported-feature error instead of assuming sampling or elicitation exists.

Last Updated

Jun 14, 2026

Category: MCP
