
🤖 Ghostwritten by Claude Opus 4.6 · Curated by Tom Hundley
The three enterprise MCP gateways ready for production deployment in 2026 are Kong AI MCP Proxy, Azure API Management with Entra ID, and Operant's MCP Gateway. Each solves a different slice of the enterprise AI tool integration problem: Kong excels at bridging MCP to existing HTTP APIs, Azure APIM provides native identity federation for Microsoft-heavy shops, and Operant delivers runtime security enforcement that blocks the exact exploit classes—GitHub token leaks, remote code execution—that security researchers demonstrated in March 2026.
The Model Context Protocol specification updated on March 26, 2025 introduced OAuth 2.1 authorization, Streamable HTTP transport (replacing the SSE-only model), JSON-RPC batching, and tool annotations. These changes fundamentally shifted MCP from a local-first developer tool into an enterprise-grade integration protocol. But the spec alone doesn't give you production readiness. You need a gateway layer that handles authentication, rate limiting, audit logging, and tool-level access control.
This guide walks through concrete deployment patterns for all three gateways, including configuration files, OAuth 2.1 setup, and the security controls you need to avoid becoming the next cautionary tale in an MCP vulnerability disclosure. If you've already worked with MCP server integration patterns, this article extends that foundation into enterprise-grade deployment.
TL;DR: The March 2025 MCP spec update added OAuth 2.1, Streamable HTTP transport, JSON-RPC batching, and tool annotations—transforming MCP from a dev tool into an enterprise integration protocol.
The original MCP spec had no standardized auth mechanism. The March 2025 update mandates OAuth 2.1 with PKCE (Proof Key for Code Exchange) for all remote MCP connections. This means every MCP server must act as either an OAuth resource server or delegate to an external authorization server.
The critical implementation detail: MCP OAuth 2.1 uses the mcp scope convention with tool-level granularity. Your token requests should look like this:
```http
POST /oauth/token HTTP/1.1
Content-Type: application/x-www-form-urlencoded

grant_type=authorization_code
&code=<authorization_code>
&redirect_uri=<redirect_uri>
&client_id=ai-agent-prod-001
&client_secret=<redacted>
&code_verifier=<pkce_verifier>
&scope=mcp:tools/database.query mcp:tools/files.read
```

The scope `mcp:tools/database.query` grants access to a specific tool, not the entire MCP server. This is the foundation of least-privilege MCP access. (Note that the `code_verifier` belongs to the authorization code flow; PKCE does not apply to the client credentials grant.)
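On the gateway side, enforcing these scopes per tool call is a small check once the token has been validated. A minimal sketch, assuming scopes have already been extracted from a verified access token (the function name is illustrative, not from any gateway SDK):

```typescript
// Sketch of tool-level scope enforcement. Assumes the access token has
// already been validated; we only check whether its scopes cover the tool.

/** Returns true when the token's scopes authorize the named tool. */
function isToolAuthorized(tokenScopes: string[], toolName: string): boolean {
  const required = `mcp:tools/${toolName}`;
  // An exact per-tool scope or the wildcard scope authorizes the call.
  return tokenScopes.some((s) => s === required || s === "mcp:tools/*");
}

// A token scoped to database.query cannot invoke files.write:
const scopes = ["mcp:tools/database.query", "mcp:tools/files.read"];
console.log(isToolAuthorized(scopes, "database.query")); // true
console.log(isToolAuthorized(scopes, "files.write")); // false
```

The same check is what makes the wildcard scope `mcp:tools/*` dangerous in production token policies: it collapses tool-level granularity back into server-level access.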
The previous stdio and SSE-only transports created headaches for enterprise deployments. Streamable HTTP unifies request-response and streaming into a single transport that works through corporate proxies, load balancers, and API gateways without special configuration.
```typescript
// Streamable HTTP MCP client connection
import { MCPClient } from '@modelcontextprotocol/sdk';

const client = new MCPClient({
  transport: 'streamable-http',
  endpoint: 'https://mcp-gateway.internal.corp/v1',
  auth: {
    type: 'oauth2',
    tokenEndpoint: 'https://auth.internal.corp/oauth/token',
    clientId: 'ai-agent-prod-001',
    scopes: ['mcp:tools/database.query']
  },
  // JSON-RPC batching for reduced round-trips
  batching: { enabled: true, maxBatchSize: 10, flushIntervalMs: 50 }
});
```

Tool annotations are metadata that describe a tool's behavior—whether it reads or writes data, whether it's idempotent, its expected latency class. Gateways use these annotations to enforce policies automatically:
```json
{
  "name": "database.query",
  "annotations": {
    "readOnly": true,
    "idempotent": true,
    "latencyClass": "medium",
    "dataClassification": "pii-contained"
  }
}
```

A gateway can block all non-`readOnly` tools for a particular agent without listing every write tool individually.
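That filtering step can be sketched in a few lines. The types below are illustrative; a real gateway would consume annotations from the server's `tools/list` response:

```typescript
// Sketch of annotation-aware filtering: for a read-only agent, expose only
// tools whose published annotations mark them readOnly.

interface ToolDef {
  name: string;
  annotations?: { readOnly?: boolean; idempotent?: boolean };
}

function visibleTools(tools: ToolDef[], agentReadOnly: boolean): ToolDef[] {
  if (!agentReadOnly) return tools;
  // Tools with missing annotations are treated as writable (fail closed).
  return tools.filter((t) => t.annotations?.readOnly === true);
}

const tools: ToolDef[] = [
  { name: "database.query", annotations: { readOnly: true } },
  { name: "database.update", annotations: { readOnly: false } },
  { name: "legacy.tool" }, // unannotated: excluded for read-only agents
];

console.log(visibleTools(tools, true).map((t) => t.name)); // only database.query
```

Note the fail-closed default for unannotated tools; since annotations are server-supplied metadata, a missing annotation should never widen access.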
TL;DR: Kong fits API-first organizations with existing HTTP services, Azure APIM suits Microsoft-centric enterprises needing Entra ID federation, and Operant targets security-first deployments requiring runtime exploit prevention.
| Feature | Kong AI MCP Proxy | Azure API Management MCP | Operant MCP Gateway |
|---|---|---|---|
| Transport | Streamable HTTP, SSE | Streamable HTTP | Streamable HTTP, stdio proxy |
| Auth | OAuth 2.1, API keys, mTLS | Entra ID, OAuth 2.1, managed identities | OAuth 2.1, SPIFFE/SPIRE |
| Tool-level ACL | Plugin-based | Policy expressions | Built-in, annotation-aware |
| Rate Limiting | Per-tool, per-agent | Per-subscription, per-tool | Per-tool, anomaly-based |
| Audit Logging | OpenTelemetry, custom | Azure Monitor, Log Analytics | SIEM integration, real-time |
| HTTP API Bridging | Native (Kong routes) | API transformation policies | Sidecar proxy model |
| Deployment | Self-hosted, Konnect SaaS | Azure-managed | Self-hosted, Kubernetes |
| Best For | API-first orgs | Microsoft shops | Security-critical workloads |
TL;DR: Kong's AI MCP Proxy lets you expose existing REST APIs as MCP tools without rewriting them, using declarative configuration that maps HTTP endpoints to MCP tool definitions.
Kong sits between your AI agents and your existing HTTP services. Inbound MCP tool calls arrive via Streamable HTTP, Kong authenticates via OAuth 2.1, applies rate limiting and ACLs, then transforms the MCP JSON-RPC call into a standard HTTP request to your backend.
```
[AI Agent] → Streamable HTTP → [Kong AI MCP Proxy] → HTTP/REST → [Your APIs]
                                        ↓
                              OAuth 2.1 validation
                              Tool-level ACLs
                              Rate limiting
                              Audit logging
```

Here's a `kong.yml` declarative configuration that exposes an existing inventory API as an MCP tool:
```yaml
_format_version: "3.0"

services:
  - name: inventory-mcp
    url: http://inventory-api.internal:8080
    routes:
      - name: mcp-gateway
        paths:
          - /mcp/v1
        protocols:
          - https

plugins:
  - name: ai-mcp-proxy
    service: inventory-mcp
    config:
      mcp_version: "2025-03-26"
      transport: streamable-http
      tools:
        - name: inventory.check_stock
          description: "Check current stock levels for a product SKU"
          backend_path: /api/v2/stock/{sku}
          method: GET
          annotations:
            readOnly: true
            idempotent: true
          input_schema:
            type: object
            properties:
              sku:
                type: string
                description: "Product SKU identifier"
            required: [sku]
          parameter_mapping:
            sku: path
        - name: inventory.reserve
          description: "Reserve inventory for an order"
          backend_path: /api/v2/reserve
          method: POST
          annotations:
            readOnly: false
            idempotent: false
          input_schema:
            type: object
            properties:
              sku: { type: string }
              quantity: { type: integer, minimum: 1 }
            required: [sku, quantity]
  - name: oauth2-introspection
    service: inventory-mcp
    config:
      introspection_url: https://auth.corp.com/oauth/introspect
      token_type_hint: access_token
      scope_claim: scope
      required_scopes:
        - mcp:tools/*
  - name: rate-limiting-advanced
    service: inventory-mcp
    config:
      strategy: sliding-window
      limits:
        - entity: consumer
          window_size: 60
          limit: 100  # per tool, per minute
```

The `parameter_mapping` field is the key to bridging: it tells Kong to extract the `sku` field from the MCP tool call's input and inject it as a path parameter in the HTTP request. No changes to your existing inventory API required.
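The substitution that `parameter_mapping` performs can be sketched as a small templating step. This helper is illustrative (Kong implements this internally); the template syntax mirrors the `{sku}` placeholder above:

```typescript
// Sketch of path-parameter bridging: substitute MCP tool-call inputs into a
// templated backend path, URL-encoding each value so input with special
// characters cannot alter the path structure.

function buildBackendPath(
  template: string,
  input: Record<string, string>
): string {
  return template.replace(/\{(\w+)\}/g, (_match, key) => {
    const value = input[key];
    if (value === undefined) throw new Error(`missing parameter: ${key}`);
    return encodeURIComponent(value);
  });
}

console.log(buildBackendPath("/api/v2/stock/{sku}", { sku: "WIDGET-42" }));
// → /api/v2/stock/WIDGET-42
```

The `encodeURIComponent` call matters for security: without it, an agent-supplied SKU like `../admin` could escape the intended path segment.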
Kong's plugin architecture means you can stack additional enterprise concerns—IP allowlisting, mutual TLS to backends, request/response transformation—on top of the MCP proxy layer. According to Kong's 2025 API Gateway report, organizations using gateway-mediated AI tool access reduced unauthorized API call incidents significantly compared to direct integration approaches.
TL;DR: Azure APIM's MCP support integrates natively with Entra ID (formerly Azure AD), enabling managed identity-based MCP authentication that eliminates credential management for Azure-hosted AI agents.
For organizations already running on Azure, the Entra ID integration eliminates the OAuth 2.1 token management headache entirely. AI agents running in Azure Container Apps or AKS use managed identities—no client secrets to rotate.
```bicep
// Bicep template for APIM MCP gateway with Entra ID
resource apimService 'Microsoft.ApiManagement/service@2024-06-01' = {
  name: 'mcp-gateway-prod'
  location: resourceGroup().location
  sku: {
    name: 'Premium'
    capacity: 1
  }
  identity: {
    type: 'SystemAssigned'
  }
}

resource mcpApi 'Microsoft.ApiManagement/service/apis@2024-06-01' = {
  parent: apimService
  name: 'mcp-tools'
  properties: {
    displayName: 'MCP Tool Gateway'
    path: 'mcp/v1'
    protocols: ['https']
    serviceUrl: 'http://mcp-backend.internal'
    subscriptionRequired: true
    authenticationSettings: {
      oAuth2: {
        authorizationServerId: 'entra-id-oauth'
      }
    }
  }
}
```

Azure APIM policies can inspect MCP JSON-RPC payloads and enforce tool-level access:
```xml
<policies>
  <inbound>
    <validate-jwt header-name="Authorization"
                  failed-validation-httpcode="401">
      <openid-config url="https://login.microsoftonline.com/{tenant}/.well-known/openid-configuration" />
      <required-claims>
        <claim name="roles" match="any">
          <value>MCP.Tools.Read</value>
        </claim>
      </required-claims>
    </validate-jwt>
    <!-- Extract MCP tool name from JSON-RPC body (preserveContent keeps the
         body readable for forwarding to the backend) -->
    <set-variable name="mcpTool"
                  value="@(context.Request.Body.As&lt;JObject&gt;(preserveContent: true)[&quot;method&quot;].ToString())" />
    <!-- Block write tools for read-only agents -->
    <choose>
      <when condition="@(!context.User.Roles.Contains(&quot;MCP.Tools.Write&quot;) &amp;&amp;
                         !((string)context.Variables[&quot;mcpTool&quot;]).EndsWith(&quot;query&quot;))">
        <return-response>
          <set-status code="403" reason="Tool access denied" />
        </return-response>
      </when>
    </choose>
  </inbound>
</policies>
```

This pattern lets you manage MCP tool permissions through Entra ID app roles—the same system your team already uses for API access control. For teams deploying Claude Code across enterprise environments, this means MCP tool permissions flow through existing identity governance.
TL;DR: The March 2026 security research revealed GitHub token leaks and RCE vulnerabilities in MCP integrations—prevent them with tool-level scoping, input validation at the gateway, and output sanitization before returning results to agents.
Security research published in March 2026 demonstrated concrete attacks against MCP deployments:
GitHub Token Exfiltration: An MCP tool with overly broad OAuth scopes allowed an agent to read repository secrets, then pass them to a second MCP tool that wrote to external storage. The tokens were never exposed to the user—only to the agent.
RCE via Unsanitized Tool Input: MCP tool servers that passed agent-supplied input directly to shell commands without sanitization were vulnerable to command injection.
Tool Shadowing: A malicious MCP server could register tools with names identical to trusted tools, intercepting calls intended for legitimate services.
These aren't theoretical—they were reproduced against real MCP integrations.
Here's an Operant MCP Gateway configuration that addresses all three exploit classes:
```yaml
# operant-mcp-gateway.yaml
apiVersion: operant.ai/v1
kind: MCPGatewayPolicy
metadata:
  name: production-security
spec:
  authentication:
    oauth2:
      issuer: https://auth.corp.com
      audience: mcp-gateway-prod
      scopeEnforcement: strict  # reject tokens with scopes beyond what's needed
  toolRegistry:
    # Prevent tool shadowing by pinning tool-to-server mappings
    allowedServers:
      - name: internal-tools
        url: https://mcp-tools.internal:443
        mtls:
          clientCert: /certs/gateway-client.pem
        tools:
          - inventory.check_stock
          - inventory.reserve
      - name: github-integration
        url: https://mcp-github.internal:443
        tools:
          - github.search_code
          - github.read_file
          # Explicitly NOT including github.create_secret, github.update_workflow
    rejectUnknownTools: true  # block any tool not in the registry
  inputValidation:
    # Prevent command injection
    rules:
      - toolPattern: "*"
        sanitize:
          stripShellMetachars: true
          maxInputLength: 10000
        blockPatterns:
          - "$("      # command substitution
          - "`"       # backtick execution
          - "; rm "   # common injection pattern
          - "| curl"  # data exfiltration
  outputSanitization:
    # Prevent token leakage in tool responses
    redact:
      patterns:
        - name: github-tokens
          regex: "ghp_[a-zA-Z0-9]{36}"
          replacement: "[REDACTED_GH_TOKEN]"
        - name: aws-keys
          regex: "AKIA[0-9A-Z]{16}"
          replacement: "[REDACTED_AWS_KEY]"
  rateLimiting:
    perAgent:
      requestsPerMinute: 60
      burstLimit: 10
  anomalyDetection:
    enabled: true
    # Alert if an agent suddenly calls tools it's never used before
    baselinePeriodDays: 7
    deviationThreshold: 3.0
```

For every MCP tool in your gateway, verify the same three controls: tool-level scope enforcement, input validation before the call reaches the backend, and output sanitization before results return to the agent.
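The output-sanitization pass maps directly to a few lines of code. A minimal sketch, reusing the two redaction patterns from the config above (the function itself is illustrative, not Operant's implementation):

```typescript
// Sketch of an output-redaction pass: apply each pattern to a tool's
// response text before it reaches the agent, so leaked credentials never
// enter the agent's context window.

interface RedactionRule {
  regex: RegExp;
  replacement: string;
}

const rules: RedactionRule[] = [
  { regex: /ghp_[a-zA-Z0-9]{36}/g, replacement: "[REDACTED_GH_TOKEN]" },
  { regex: /AKIA[0-9A-Z]{16}/g, replacement: "[REDACTED_AWS_KEY]" },
];

function redactOutput(text: string, redactions: RedactionRule[]): string {
  return redactions.reduce(
    (acc, r) => acc.replace(r.regex, r.replacement),
    text
  );
}

const leaked = "token: ghp_" + "a".repeat(36);
console.log(redactOutput(leaked, rules)); // token: [REDACTED_GH_TOKEN]
```

Redaction on the response path is what would have stopped the GitHub token exfiltration chain described above: the second tool never sees a usable token.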
As noted in the complete Claude Code integration guide, these security controls become especially critical when AI coding assistants have access to production infrastructure tools.
Stdio transport requires the MCP server to run as a local subprocess, which doesn't work for centralized enterprise deployment. To migrate, wrap your existing MCP server with the Streamable HTTP transport adapter from the @modelcontextprotocol/sdk package (v1.2+). This involves adding an HTTP server layer that accepts JSON-RPC over HTTP POST and returns results as either single responses or streamed chunks. The key change is moving from process-level isolation (one server per client) to network-level isolation (one server, many authenticated clients), which requires adding OAuth 2.1 authentication that stdio never needed.
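The shape of that HTTP layer can be sketched as a pure dispatch function: JSON-RPC arrives as a POST body, passes the bearer-token check that stdio never needed, then reaches the existing MCP handler. The handler and token verifier below are placeholders for whatever your stdio server already implements; a real migration would wire this into the official SDK's Streamable HTTP transport rather than hand-rolling it.

```typescript
// Sketch of the POST dispatch path for a stdio-to-HTTP migration.
// `handler` and `verifyToken` stand in for your existing MCP server logic
// and OAuth validation; both names are illustrative.

type JsonRpcHandler = (req: unknown) => Promise<unknown>;

interface HttpResult {
  status: number;
  body: string;
}

async function dispatchMcpPost(
  authHeader: string | undefined,
  rawBody: string,
  verifyToken: (h: string | undefined) => boolean,
  handler: JsonRpcHandler
): Promise<HttpResult> {
  // Network-level isolation means every caller must authenticate.
  if (!verifyToken(authHeader)) {
    return { status: 401, body: JSON.stringify({ error: "unauthorized" }) };
  }
  try {
    const result = await handler(JSON.parse(rawBody));
    return { status: 200, body: JSON.stringify(result) };
  } catch {
    return { status: 400, body: JSON.stringify({ error: "invalid request" }) };
  }
}
```

In production this function body would sit inside a `node:http` or framework request handler, with streaming responses for long-running tools.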
Kong can also bridge MCP to gRPC and GraphQL backends, but it requires Kong's request-transformer plugin to convert between protocols. For gRPC backends, configure Kong to receive the MCP JSON-RPC call, extract tool parameters, and forward them as a gRPC request using Kong's gRPC-gateway plugin. For GraphQL, use the request-transformer to construct a GraphQL query from the MCP tool input schema. The MCP tool definition's input_schema maps cleanly to GraphQL variables, making this transformation straightforward for read operations. Write mutations require more careful mapping of the tool's input schema to mutation arguments.
Expect 5-15ms of additional latency per tool call for authentication, policy evaluation, and input validation—negligible compared to the typical 200-2000ms latency of the underlying tool execution. JSON-RPC batching (added in the March 2025 spec) offsets this by reducing round-trips: batching 5 independent tool calls into a single HTTP request eliminates 4 round-trips of gateway overhead. The main performance consideration is rate limiting state storage; use Redis-backed rate limiting for multi-node gateway deployments to avoid per-node counting inconsistencies.
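The round-trip arithmetic is easy to make concrete. A toy model, using illustrative overhead numbers in the range quoted above:

```typescript
// Toy model of gateway overhead: each HTTP round-trip pays a fixed gateway
// cost; batching amortizes that cost across the calls in the batch.

function totalOverheadMs(
  calls: number,
  perRequestOverheadMs: number,
  batchSize: number
): number {
  const requests = Math.ceil(calls / batchSize);
  return requests * perRequestOverheadMs;
}

// 5 unbatched calls at 10 ms gateway overhead each, vs one batch of 5:
console.log(totalOverheadMs(5, 10, 1)); // 50
console.log(totalOverheadMs(5, 10, 5)); // 10
```

The saving only applies to independent calls; tool calls whose inputs depend on earlier outputs still serialize into separate round-trips.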
Tool annotations are metadata published by the MCP server, not assertions of trust. Your gateway should treat them as hints that inform policy, not as security boundaries. For example, a tool annotated readOnly: true might still have a bug that allows writes—so your gateway policy should combine annotation-based rules with explicit tool allowlists. The recommended pattern is: use annotations for coarse-grained default policies (e.g., "read-only agents can call any readOnly tool"), then layer explicit deny rules for sensitive tools regardless of annotations (e.g., "no agent calls database.drop_table even if someone annotates it readOnly").
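That layering can be expressed as a short policy function: deny rules are checked first, so a server-published annotation can never override them. The types and names are illustrative:

```typescript
// Sketch of layered policy: an explicit deny list wins over whatever
// annotations the (untrusted) MCP server published.

interface AnnotatedTool {
  name: string;
  annotations?: { readOnly?: boolean };
}

function isCallAllowed(
  tool: AnnotatedTool,
  agentCanWrite: boolean,
  denyList: Set<string>
): boolean {
  // Deny rules are evaluated first; annotations cannot override them.
  if (denyList.has(tool.name)) return false;
  if (agentCanWrite) return true;
  // Read-only agents fall back to the annotation as a coarse default.
  return tool.annotations?.readOnly === true;
}

const deny = new Set(["database.drop_table"]);
// Even a readOnly annotation does not rescue a denied tool:
const suspicious: AnnotatedTool = {
  name: "database.drop_table",
  annotations: { readOnly: true },
};
console.log(isCallAllowed(suspicious, true, deny)); // false
```

The ordering is the whole point: annotation checks implement the default policy, while the deny set encodes the judgments you refuse to delegate to server metadata.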
Run separate gateway instances per trust boundary, not per environment. A single gateway for all of production is fine if all agents share the same trust level. But if you have internal agents (trusted, broad access) and customer-facing agents (untrusted, restricted access), deploy separate gateway instances with distinct policies. This prevents a policy misconfiguration on the customer-facing gateway from accidentally exposing internal tools. Kubernetes namespaces with NetworkPolicies or separate VPCs provide the isolation layer between gateway instances.
Enterprise MCP gateway deployment in 2026 is a solved architecture problem with three production-ready options. The unsolved problem is organizational: mapping your existing API inventory to MCP tools with appropriate security boundaries, training your teams on MCP-specific threat models, and building the operational muscle for gateway monitoring and incident response.
The March 2026 security disclosures proved that ungoverned MCP access creates real, exploitable vulnerabilities. A gateway isn't optional—it's the control plane that makes the difference between a powerful AI tool integration and a supply chain attack waiting to happen.
If your team is planning enterprise MCP gateway deployment and needs help navigating the architecture decisions, security patterns, and implementation specifics, Elegant Software Solutions provides hands-on AI implementation engagements that include MCP infrastructure design and deployment. Schedule a technical consultation to discuss your MCP gateway strategy.