
🤖 Ghostwritten by Claude Opus 4.6 · Curated by Tom Hundley
The three enterprise MCP gateways ready for production deployment in 2026 are Kong AI MCP Proxy, Azure API Management with Entra ID, and Operant's MCP Gateway. Each solves a different slice of the enterprise AI tool integration problem: Kong excels at bridging MCP to existing HTTP APIs, Azure APIM provides native identity federation for Microsoft-heavy shops, and Operant delivers runtime security enforcement that blocks the exact exploit classes—GitHub token leaks, remote code execution—that security researchers demonstrated in March 2026.
The Model Context Protocol specification updated on March 26, 2025 introduced OAuth 2.1 authorization, Streamable HTTP transport (replacing the SSE-only model), JSON-RPC batching, and tool annotations. These changes fundamentally shifted MCP from a local-first developer tool into an enterprise-grade integration protocol. But the spec alone doesn't give you production readiness. You need a gateway layer that handles authentication, rate limiting, audit logging, and tool-level access control.
This guide walks through concrete deployment patterns for all three gateways, including configuration files, OAuth 2.1 setup, and the security controls you need to avoid becoming the next cautionary tale in an MCP vulnerability disclosure. If you've already worked with MCP server integration patterns, this article extends that foundation into enterprise-grade deployment.
TL;DR: The March 2025 MCP spec update added OAuth 2.1, Streamable HTTP transport, JSON-RPC batching, and tool annotations—transforming MCP from a dev tool into an enterprise integration protocol.
The original MCP spec had no standardized auth mechanism. The March 2025 update mandates OAuth 2.1 with PKCE (Proof Key for Code Exchange) for all remote MCP connections. This means every MCP server must act as either an OAuth resource server or delegate to an external authorization server.
The critical implementation detail: MCP OAuth 2.1 uses the mcp scope convention with tool-level granularity. Your token requests should look like this:
```http
POST /oauth/token HTTP/1.1
Content-Type: application/x-www-form-urlencoded

grant_type=authorization_code
&code=<authorization_code>
&redirect_uri=<redirect_uri>
&client_id=ai-agent-prod-001
&client_secret=<redacted>
&code_verifier=<pkce_verifier>
&scope=mcp:tools/database.query mcp:tools/files.read
```

The scope `mcp:tools/database.query` grants access to a specific tool, not the entire MCP server. This is the foundation of least-privilege MCP access. (Note that the `code_verifier` belongs to the authorization code flow; PKCE does not apply to the client credentials grant.)
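On the gateway side, enforcing these scopes per tool call is a small check once the token has been validated. A minimal sketch, assuming scopes have already been extracted from a verified access token (the function name is illustrative, not from any gateway SDK):

```typescript
// Sketch of tool-level scope enforcement. Assumes the access token has
// already been validated; we only check whether its scopes cover the tool.

/** Returns true when the token's scopes authorize the named tool. */
function isToolAuthorized(tokenScopes: string[], toolName: string): boolean {
  const required = `mcp:tools/${toolName}`;
  // An exact per-tool scope or the wildcard scope authorizes the call.
  return tokenScopes.some((s) => s === required || s === "mcp:tools/*");
}

// A token scoped to database.query cannot invoke files.write:
const scopes = ["mcp:tools/database.query", "mcp:tools/files.read"];
console.log(isToolAuthorized(scopes, "database.query")); // true
console.log(isToolAuthorized(scopes, "files.write")); // false
```

The same check is what makes the wildcard scope `mcp:tools/*` dangerous in production token policies: it collapses tool-level granularity back into server-level access.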
The previous stdio and SSE-only transports created headaches for enterprise deployments. Streamable HTTP unifies request-response and streaming into a single transport that works through corporate proxies, load balancers, and API gateways without special configuration.
```typescript
// Streamable HTTP MCP client connection
import { MCPClient } from '@modelcontextprotocol/sdk';

const client = new MCPClient({
  transport: 'streamable-http',
  endpoint: 'https://mcp-gateway.internal.corp/v1',
  auth: {
    type: 'oauth2',
    tokenEndpoint: 'https://auth.internal.corp/oauth/token',
    clientId: 'ai-agent-prod-001',
    scopes: ['mcp:tools/database.query']
  },
  // JSON-RPC batching for reduced round-trips
  batching: { enabled: true, maxBatchSize: 10, flushIntervalMs: 50 }
});
```

Tool annotations are metadata that describe a tool's behavior—whether it reads or writes data, whether it's idempotent, its expected latency class. Gateways use these annotations to enforce policies automatically:
```json
{
  "name": "database.query",
  "annotations": {
    "readOnly": true,
    "idempotent": true,
    "latencyClass": "medium",
    "dataClassification": "pii-contained"
  }
}
```

A gateway can block all non-`readOnly` tools for a particular agent without listing every write tool individually.
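That filtering step can be sketched in a few lines. The types below are illustrative; a real gateway would consume annotations from the server's `tools/list` response:

```typescript
// Sketch of annotation-aware filtering: for a read-only agent, expose only
// tools whose published annotations mark them readOnly.

interface ToolDef {
  name: string;
  annotations?: { readOnly?: boolean; idempotent?: boolean };
}

function visibleTools(tools: ToolDef[], agentReadOnly: boolean): ToolDef[] {
  if (!agentReadOnly) return tools;
  // Tools with missing annotations are treated as writable (fail closed).
  return tools.filter((t) => t.annotations?.readOnly === true);
}

const tools: ToolDef[] = [
  { name: "database.query", annotations: { readOnly: true } },
  { name: "database.update", annotations: { readOnly: false } },
  { name: "legacy.tool" }, // unannotated: excluded for read-only agents
];

console.log(visibleTools(tools, true).map((t) => t.name)); // only database.query
```

Note the fail-closed default for unannotated tools; since annotations are server-supplied metadata, a missing annotation should never widen access.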
TL;DR: Kong fits API-first organizations with existing HTTP services, Azure APIM suits Microsoft-centric enterprises needing Entra ID federation, and Operant targets security-first deployments requiring runtime exploit prevention.
| Feature | Kong AI MCP Proxy | Azure API Management MCP | Operant MCP Gateway |
|---|---|---|---|
| Transport | Streamable HTTP, SSE | Streamable HTTP | Streamable HTTP, stdio proxy |
| Auth | OAuth 2.1, API keys, mTLS | Entra ID, OAuth 2.1, managed identities | OAuth 2.1, SPIFFE/SPIRE |
| Tool-level ACL | Plugin-based | Policy expressions | Built-in, annotation-aware |
| Rate Limiting | Per-tool, per-agent | Per-subscription, per-tool | Per-tool, anomaly-based |
| Audit Logging | OpenTelemetry, custom | Azure Monitor, Log Analytics | SIEM integration, real-time |
| HTTP API Bridging | Native (Kong routes) | API transformation policies | Sidecar proxy model |
| Deployment | Self-hosted, Konnect SaaS | Azure-managed | Self-hosted, Kubernetes |
| Best For | API-first orgs | Microsoft shops | Security-critical workloads |
TL;DR: Kong's AI MCP Proxy lets you expose existing REST APIs as MCP tools without rewriting them, using declarative configuration that maps HTTP endpoints to MCP tool definitions.
Kong sits between your AI agents and your existing HTTP services. Inbound MCP tool calls arrive via Streamable HTTP, Kong authenticates via OAuth 2.1, applies rate limiting and ACLs, then transforms the MCP JSON-RPC call into a standard HTTP request to your backend.
```
[AI Agent] → Streamable HTTP → [Kong AI MCP Proxy] → HTTP/REST → [Your APIs]
                                        ↓
                              OAuth 2.1 validation
                              Tool-level ACLs
                              Rate limiting
                              Audit logging
```

Here's a `kong.yml` declarative configuration that exposes an existing inventory API as an MCP tool:
```yaml
_format_version: "3.0"

services:
  - name: inventory-mcp
    url: http://inventory-api.internal:8080
    routes:
      - name: mcp-gateway
        paths:
          - /mcp/v1
        protocols:
          - https

plugins:
  - name: ai-mcp-proxy
    service: inventory-mcp
    config:
      mcp_version: "2025-03-26"
      transport: streamable-http
      tools:
        - name: inventory.check_stock
          description: "Check current stock levels for a product SKU"
          backend_path: /api/v2/stock/{sku}
          method: GET
          annotations:
            readOnly: true
            idempotent: true
          input_schema:
            type: object
            properties:
              sku:
                type: string
                description: "Product SKU identifier"
            required: [sku]
          parameter_mapping:
            sku: path
        - name: inventory.reserve
          description: "Reserve inventory for an order"
          backend_path: /api/v2/reserve
          method: POST
          annotations:
            readOnly: false
            idempotent: false
          input_schema:
            type: object
            properties:
              sku: { type: string }
              quantity: { type: integer, minimum: 1 }
            required: [sku, quantity]
  - name: oauth2-introspection
    service: inventory-mcp
    config:
      introspection_url: https://auth.corp.com/oauth/introspect
      token_type_hint: access_token
      scope_claim: scope
      required_scopes:
        - mcp:tools/*
  - name: rate-limiting-advanced
    service: inventory-mcp
    config:
      strategy: sliding-window
      limits:
        - entity: consumer
          window_size: 60
          limit: 100  # per tool, per minute
```

The `parameter_mapping` field is the key to bridging: it tells Kong to extract the `sku` field from the MCP tool call's input and inject it as a path parameter in the HTTP request. No changes to your existing inventory API required.
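The substitution that `parameter_mapping` performs can be sketched as a small templating step. This helper is illustrative (Kong implements this internally); the template syntax mirrors the `{sku}` placeholder above:

```typescript
// Sketch of path-parameter bridging: substitute MCP tool-call inputs into a
// templated backend path, URL-encoding each value so input with special
// characters cannot alter the path structure.

function buildBackendPath(
  template: string,
  input: Record<string, string>
): string {
  return template.replace(/\{(\w+)\}/g, (_match, key) => {
    const value = input[key];
    if (value === undefined) throw new Error(`missing parameter: ${key}`);
    return encodeURIComponent(value);
  });
}

console.log(buildBackendPath("/api/v2/stock/{sku}", { sku: "WIDGET-42" }));
// → /api/v2/stock/WIDGET-42
```

The `encodeURIComponent` call matters for security: without it, an agent-supplied SKU like `../admin` could escape the intended path segment.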
Kong's plugin architecture means you can stack additional enterprise concerns—IP allowlisting, mutual TLS to backends, request/response transformation—on top of the MCP proxy layer. According to Kong's 2025 API Gateway report, organizations using gateway-mediated AI tool access reduced unauthorized API call incidents significantly compared to direct integration approaches.
TL;DR: Azure APIM's MCP support integrates natively with Entra ID (formerly Azure AD), enabling managed identity-based MCP authentication that eliminates credential management for Azure-hosted AI agents.
For organizations already running on Azure, the Entra ID integration eliminates the OAuth 2.1 token management headache entirely. AI agents running in Azure Container Apps or AKS use managed identities—no client secrets to rotate.
```bicep
// Bicep template for APIM MCP gateway with Entra ID
resource apimService 'Microsoft.ApiManagement/service@2024-06-01' = {
  name: 'mcp-gateway-prod'
  location: resourceGroup().location
  sku: {
    name: 'Premium'
    capacity: 1
  }
  identity: {
    type: 'SystemAssigned'
  }
}

resource mcpApi 'Microsoft.ApiManagement/service/apis@2024-06-01' = {
  parent: apimService
  name: 'mcp-tools'
  properties: {
    displayName: 'MCP Tool Gateway'
    path: 'mcp/v1'
    protocols: ['https']
    serviceUrl: 'http://mcp-backend.internal'
    subscriptionRequired: true
    authenticationSettings: {
      oAuth2: {
        authorizationServerId: 'entra-id-oauth'
      }
    }
  }
}
```

Azure APIM policies can inspect MCP JSON-RPC payloads and enforce tool-level access:
```xml
<policies>
  <inbound>
    <validate-jwt header-name="Authorization"
                  failed-validation-httpcode="401">
      <openid-config url="https://login.microsoftonline.com/{tenant}/.well-known/openid-configuration" />
      <required-claims>
        <claim name="roles" match="any">
          <value>MCP.Tools.Read</value>
        </claim>
      </required-claims>
    </validate-jwt>
    <!-- Extract MCP tool name from JSON-RPC body (preserveContent keeps the
         body readable for forwarding to the backend) -->
    <set-variable name="mcpTool"
                  value="@(context.Request.Body.As&lt;JObject&gt;(preserveContent: true)[&quot;method&quot;].ToString())" />
    <!-- Block write tools for read-only agents -->
    <choose>
      <when condition="@(!context.User.Roles.Contains(&quot;MCP.Tools.Write&quot;) &amp;&amp;
                         !((string)context.Variables[&quot;mcpTool&quot;]).EndsWith(&quot;query&quot;))">
        <return-response>
          <set-status code="403" reason="Tool access denied" />
        </return-response>
      </when>
    </choose>
  </inbound>
</policies>
```

This pattern lets you manage MCP tool permissions through Entra ID app roles—the same system your team already uses for API access control. For teams deploying Claude Code across enterprise environments, this means MCP tool permissions flow through existing identity governance.
TL;DR: The March 2026 security research revealed GitHub token leaks and RCE vulnerabilities in MCP integrations—prevent them with tool-level scoping, input validation at the gateway, and output sanitization before returning results to agents.
Security research published in March 2026 demonstrated concrete attacks against MCP deployments:
GitHub Token Exfiltration: An MCP tool with overly broad OAuth scopes allowed an agent to read repository secrets, then pass them to a second MCP tool that wrote to external storage. The tokens were never exposed to the user—only to the agent.
RCE via Unsanitized Tool Input: MCP tool servers that passed agent-supplied input directly to shell commands without sanitization were vulnerable to command injection.
Tool Shadowing: A malicious MCP server could register tools with names identical to trusted tools, intercepting calls intended for legitimate services.
These aren't theoretical—they were reproduced against real MCP integrations.
Here's an Operant MCP Gateway configuration that addresses all three exploit classes:
```yaml
# operant-mcp-gateway.yaml
apiVersion: operant.ai/v1
kind: MCPGatewayPolicy
metadata:
  name: production-security
spec:
  authentication:
    oauth2:
      issuer: https://auth.corp.com
      audience: mcp-gateway-prod
      scopeEnforcement: strict  # reject tokens with scopes beyond what's needed
  toolRegistry:
    # Prevent tool shadowing by pinning tool-to-server mappings
    allowedServers:
      - name: internal-tools
        url: https://mcp-tools.internal:443
        mtls:
          clientCert: /certs/gateway-client.pem
        tools:
          - inventory.check_stock
          - inventory.reserve
      - name: github-integration
        url: https://mcp-github.internal:443
        tools:
          - github.search_code
          - github.read_file
          # Explicitly NOT including github.create_secret, github.update_workflow
    rejectUnknownTools: true  # block any tool not in the registry
  inputValidation:
    # Prevent command injection
    rules:
      - toolPattern: "*"
        sanitize:
          stripShellMetachars: true
          maxInputLength: 10000
        blockPatterns:
          - "$("      # command substitution
          - "`"       # backtick execution
          - "; rm "   # common injection pattern
          - "| curl"  # data exfiltration
  outputSanitization:
    # Prevent token leakage in tool responses
    redact:
      patterns:
        - name: github-tokens
          regex: "ghp_[a-zA-Z0-9]{36}"
          replacement: "[REDACTED_GH_TOKEN]"
        - name: aws-keys
          regex: "AKIA[0-9A-Z]{16}"
          replacement: "[REDACTED_AWS_KEY]"
  rateLimiting:
    perAgent:
      requestsPerMinute: 60
      burstLimit: 10
  anomalyDetection:
    enabled: true
    # Alert if an agent suddenly calls tools it's never used before
    baselinePeriodDays: 7
    deviationThreshold: 3.0
```

For every MCP tool in your gateway, verify the same three controls: tool-level scope enforcement, input validation before the call reaches the backend, and output sanitization before results return to the agent.
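The output-sanitization pass maps directly to a few lines of code. A minimal sketch, reusing the two redaction patterns from the config above (the function itself is illustrative, not Operant's implementation):

```typescript
// Sketch of an output-redaction pass: apply each pattern to a tool's
// response text before it reaches the agent, so leaked credentials never
// enter the agent's context window.

interface RedactionRule {
  regex: RegExp;
  replacement: string;
}

const rules: RedactionRule[] = [
  { regex: /ghp_[a-zA-Z0-9]{36}/g, replacement: "[REDACTED_GH_TOKEN]" },
  { regex: /AKIA[0-9A-Z]{16}/g, replacement: "[REDACTED_AWS_KEY]" },
];

function redactOutput(text: string, redactions: RedactionRule[]): string {
  return redactions.reduce(
    (acc, r) => acc.replace(r.regex, r.replacement),
    text
  );
}

const leaked = "token: ghp_" + "a".repeat(36);
console.log(redactOutput(leaked, rules)); // token: [REDACTED_GH_TOKEN]
```

Redaction on the response path is what would have stopped the GitHub token exfiltration chain described above: the second tool never sees a usable token.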
As noted in the complete Claude Code integration guide, these security controls become especially critical when AI coding assistants have access to production infrastructure tools.
Stdio transport requires the MCP server to run as a local subprocess, which doesn't work for centralized enterprise deployment. To migrate, wrap your existing MCP server with the Streamable HTTP transport adapter from the @modelcontextprotocol/sdk package (v1.2+). This involves adding an HTTP server layer that accepts JSON-RPC over HTTP POST and returns results as either single responses or streamed chunks. The key change is moving from process-level isolation (one server per client) to network-level isolation (one server, many authenticated clients), which requires adding OAuth 2.1 authentication that stdio never needed.
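The shape of that HTTP layer can be sketched as a pure dispatch function: JSON-RPC arrives as a POST body, passes the bearer-token check that stdio never needed, then reaches the existing MCP handler. The handler and token verifier below are placeholders for whatever your stdio server already implements; a real migration would wire this into the official SDK's Streamable HTTP transport rather than hand-rolling it.

```typescript
// Sketch of the POST dispatch path for a stdio-to-HTTP migration.
// `handler` and `verifyToken` stand in for your existing MCP server logic
// and OAuth validation; both names are illustrative.

type JsonRpcHandler = (req: unknown) => Promise<unknown>;

interface HttpResult {
  status: number;
  body: string;
}

async function dispatchMcpPost(
  authHeader: string | undefined,
  rawBody: string,
  verifyToken: (h: string | undefined) => boolean,
  handler: JsonRpcHandler
): Promise<HttpResult> {
  // Network-level isolation means every caller must authenticate.
  if (!verifyToken(authHeader)) {
    return { status: 401, body: JSON.stringify({ error: "unauthorized" }) };
  }
  try {
    const result = await handler(JSON.parse(rawBody));
    return { status: 200, body: JSON.stringify(result) };
  } catch {
    return { status: 400, body: JSON.stringify({ error: "invalid request" }) };
  }
}
```

In production this function body would sit inside a `node:http` or framework request handler, with streaming responses for long-running tools.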
Kong can also bridge MCP to gRPC and GraphQL backends, but it requires Kong's request-transformer plugin to convert between protocols. For gRPC backends, configure Kong to receive the MCP JSON-RPC call, extract tool parameters, and forward them as a gRPC request using Kong's gRPC-gateway plugin. For GraphQL, use the request-transformer to construct a GraphQL query from the MCP tool input schema. The MCP tool definition's input_schema maps cleanly to GraphQL variables, making this transformation straightforward for read operations. Write mutations require more careful mapping of the tool's input schema to mutation arguments.
Expect 5-15ms of additional latency per tool call for authentication, policy evaluation, and input validation—negligible compared to the typical 200-2000ms latency of the underlying tool execution. JSON-RPC batching (added in the March 2025 spec) offsets this by reducing round-trips: batching 5 independent tool calls into a single HTTP request eliminates 4 round-trips of gateway overhead. The main performance consideration is rate limiting state storage; use Redis-backed rate limiting for multi-node gateway deployments to avoid per-node counting inconsistencies.
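The round-trip arithmetic is easy to make concrete. A toy model, using illustrative overhead numbers in the range quoted above:

```typescript
// Toy model of gateway overhead: each HTTP round-trip pays a fixed gateway
// cost; batching amortizes that cost across the calls in the batch.

function totalOverheadMs(
  calls: number,
  perRequestOverheadMs: number,
  batchSize: number
): number {
  const requests = Math.ceil(calls / batchSize);
  return requests * perRequestOverheadMs;
}

// 5 unbatched calls at 10 ms gateway overhead each, vs one batch of 5:
console.log(totalOverheadMs(5, 10, 1)); // 50
console.log(totalOverheadMs(5, 10, 5)); // 10
```

The saving only applies to independent calls; tool calls whose inputs depend on earlier outputs still serialize into separate round-trips.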
Tool annotations are metadata published by the MCP server, not assertions of trust. Your gateway should treat them as hints that inform policy, not as security boundaries. For example, a tool annotated readOnly: true might still have a bug that allows writes—so your gateway policy should combine annotation-based rules with explicit tool allowlists. The recommended pattern is: use annotations for coarse-grained default policies (e.g., "read-only agents can call any readOnly tool"), then layer explicit deny rules for sensitive tools regardless of annotations (e.g., "no agent calls database.drop_table even if someone annotates it readOnly").
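That layering can be expressed as a short policy function: deny rules are checked first, so a server-published annotation can never override them. The types and names are illustrative:

```typescript
// Sketch of layered policy: an explicit deny list wins over whatever
// annotations the (untrusted) MCP server published.

interface AnnotatedTool {
  name: string;
  annotations?: { readOnly?: boolean };
}

function isCallAllowed(
  tool: AnnotatedTool,
  agentCanWrite: boolean,
  denyList: Set<string>
): boolean {
  // Deny rules are evaluated first; annotations cannot override them.
  if (denyList.has(tool.name)) return false;
  if (agentCanWrite) return true;
  // Read-only agents fall back to the annotation as a coarse default.
  return tool.annotations?.readOnly === true;
}

const deny = new Set(["database.drop_table"]);
// Even a readOnly annotation does not rescue a denied tool:
const suspicious: AnnotatedTool = {
  name: "database.drop_table",
  annotations: { readOnly: true },
};
console.log(isCallAllowed(suspicious, true, deny)); // false
```

The ordering is the whole point: annotation checks implement the default policy, while the deny set encodes the judgments you refuse to delegate to server metadata.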
Run separate gateway instances per trust boundary, not per environment. A single gateway for all of production is fine if all agents share the same trust level. But if you have internal agents (trusted, broad access) and customer-facing agents (untrusted, restricted access), deploy separate gateway instances with distinct policies. This prevents a policy misconfiguration on the customer-facing gateway from accidentally exposing internal tools. Kubernetes namespaces with NetworkPolicies or separate VPCs provide the isolation layer between gateway instances.
Enterprise MCP gateway deployment in 2026 is a solved architecture problem with three production-ready options. The unsolved problem is organizational: mapping your existing API inventory to MCP tools with appropriate security boundaries, training your teams on MCP-specific threat models, and building the operational muscle for gateway monitoring and incident response.
The March 2026 security disclosures proved that ungoverned MCP access creates real, exploitable vulnerabilities. A gateway isn't optional—it's the control plane that makes the difference between a powerful AI tool integration and a supply chain attack waiting to happen.
If your team is planning enterprise MCP gateway deployment and needs help navigating the architecture decisions, security patterns, and implementation specifics, Elegant Software Solutions provides hands-on AI implementation engagements that include MCP infrastructure design and deployment. Schedule a technical consultation to discuss your MCP gateway strategy.