Monetize the Context Layer:
The MCP Server Blueprint
MCP runs three transports: HTTP, SSE, and stdio. Standard API gateways meter only discrete HTTP requests. Every SSE stream and every stdio session goes unseen, unmetered, and unbilled.
Aforo closes the gap with a dual interception strategy: a Kong Lua plugin for HTTP transport and a one-line SDK decorator for stdio and SSE. Every tools/call is metered — agent ID, tool name, execution duration, token metadata — then routed through Redis enforcement before the response is returned.
Your Gateway Is Watching
One of Three Transports
MCP defines three wire protocols. Enterprise gateways only intercept HTTP. The other two carry real workloads — unbilled.
HTTP Transport
Standard JSON-RPC over HTTP POST. Kong, Apigee, and AWS API Gateway can see every tools/call request. Quotas enforceable. Usage trackable.
POST /mcp ← gateway intercepts
Content-Type: application/json
{"method":"tools/call","params":{...}}
SSE Transport
Server-Sent Events over a persistent connection. The gateway sees one long-lived HTTP GET — not individual JSON-RPC frames. Tool calls are invisible.
GET /mcp/sse ← one open socket
data: {"method":"tools/call",...} ✗
data: {"method":"tools/call",...} ✗
data: {"method":"tools/call",...} ✗
stdio Transport
Process-level communication. MCP clients spawn a local server subprocess and communicate via stdin/stdout. No network hop — zero gateway coverage.
# spawned subprocess:
node server.js | claude-desktop
# gateway never sees:
tools/call → web_search
tools/call → analyze_data
tools/call → query_db
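To make the gap concrete, here is a minimal sketch of what a metering hook would have to do with the SSE and stdio payloads illustrated above: pull the JSON-RPC messages out of the raw text and keep only the tools/call requests. The names `extractPayloads` and `findToolCalls` are hypothetical and not part of any Aforo or MCP SDK. The sketch also assumes single-line `data:` fields for SSE and one JSON message per line for stdio.

```typescript
// Hypothetical helpers for illustration; not an Aforo or MCP SDK API.
type Transport = "sse" | "stdio";

interface ToolCall {
  toolName: string;
}

// Pull candidate JSON payloads out of a raw chunk of transport text.
function extractPayloads(raw: string, transport: Transport): string[] {
  const lines = raw
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
  if (transport === "sse") {
    // SSE frames carry their payload on lines that start with "data:".
    return lines
      .filter((line) => line.startsWith("data:"))
      .map((line) => line.slice("data:".length).trim());
  }
  // stdio: assumed to be one JSON-RPC message per line.
  return lines;
}

// Keep only well-formed JSON-RPC 2.0 tools/call requests.
function findToolCalls(raw: string, transport: Transport): ToolCall[] {
  const calls: ToolCall[] = [];
  for (const payload of extractPayloads(raw, transport)) {
    let msg: unknown;
    try {
      msg = JSON.parse(payload);
    } catch {
      continue; // not JSON, skip the line
    }
    if (typeof msg !== "object" || msg === null) continue;
    const m = msg as {
      jsonrpc?: unknown;
      method?: unknown;
      params?: { name?: unknown };
    };
    if (m.jsonrpc !== "2.0" || m.method !== "tools/call") continue;
    const name = m.params?.name;
    if (typeof name === "string") calls.push({ toolName: name });
  }
  return calls;
}
```

A gateway that only sees one long-lived GET, or no network traffic at all, never gets the chance to run logic like this, which is why the hook has to live closer to the tool handler.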
Two Hooks.
All Three Transports Covered.
For HTTP MCP servers, Aforo deploys a Kong Lua plugin that intercepts at the post-response log phase — zero latency added to inference. For stdio and SSE, a one-line SDK decorator wraps each tool handler.
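As a rough illustration of the decorator idea, the sketch below wraps a tool handler so that every call emits a usage event with its wall-clock duration, whether the handler succeeds or throws. It is not the @aforo/mcp-metering implementation. `meterToolHandler`, `UsageEvent`, and the injectable `now` clock are hypothetical names chosen for this example.

```typescript
// Hypothetical sketch; not the @aforo/mcp-metering API.
interface UsageEvent {
  toolName: string;
  executionDurationMs: number;
  ok: boolean;
}

type ToolHandler = (name: string, args: unknown) => Promise<unknown>;

// Wrap a handler so each invocation reports its duration to `emit`.
// `now` is injectable so the timing can be tested deterministically.
function meterToolHandler(
  handler: ToolHandler,
  emit: (event: UsageEvent) => void,
  now: () => number = Date.now,
): ToolHandler {
  return async (name, args) => {
    const startedAt = now();
    let ok = false;
    try {
      const result = await handler(name, args);
      ok = true;
      return result;
    } finally {
      // Runs on success and on failure, so failed calls are still counted.
      emit({ toolName: name, executionDurationMs: now() - startedAt, ok });
    }
  };
}
```

Because the measurement happens inside the process that runs the tool, the same wrapper applies no matter which transport delivered the request.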
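-- detect_mcp_tool_call()
-- Parsing runs in the log phase, AFTER the response is sent,
-- so it adds no latency to the inference hot path.
local cjson = require("cjson")

local function detect_mcp_tool_call(body)
  local ok, parsed = pcall(cjson.decode, body)
  if not ok or type(parsed) ~= "table" then return nil end
  if parsed.jsonrpc ~= "2.0" then return nil end
  if parsed.method ~= "tools/call" then return nil end
  if type(parsed.params) ~= "table" then return nil end

  local consumer = kong.client.get_consumer() -- nil for anonymous requests
  -- Buffer in lua_shared_dict, flush async
  -- to POST /v1/ingest/batch
  return {
    tool_name  = parsed.params.name,
    agent_id   = kong.request.get_header("X-Agent-Id"),
    tenant_id  = consumer and consumer.custom_id,
    started_at = ngx.req.start_time() * 1000, -- request start, in ms
  }
end

-- The raw body cannot be read in the log phase, so stash it during access:
function AforoMeteringHandler:access()
  kong.ctx.plugin.body = kong.request.get_raw_body()
end

-- Wired in log phase:
function AforoMeteringHandler:log()
  local event = detect_mcp_tool_call(kong.ctx.plugin.body)
  if event then
    aforo_buffer:push(event) -- non-blocking; buffer is defined elsewhere in the plugin
  end
end

import { AforoClient } from "@aforo/mcp-metering";
import { CallToolRequestSchema } from "@modelcontextprotocol/sdk/types.js";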
const billing = new AforoClient({
  tenantId: process.env.AFORO_TENANT_ID,
  productId: process.env.AFORO_PRODUCT_ID,
  apiKey: process.env.AFORO_API_KEY,
});
// One-line decorator. Works for stdio and SSE.
// Measures wall-clock duration automatically.
const wrapped = billing.wrapToolHandler(
  async (name: string, args: unknown) => {
    // Record execution duration + token metadata
    // Flush batch → POST /v1/ingest/batch
    // Heartbeat ACK carries kill signal back
    return await yourToolRouter(name, args);
  }
);
// Register as your MCP tool dispatcher:
server.setRequestHandler(
  CallToolRequestSchema,
  wrapped
);
Hard Limits on
Persistent stdio Sessions
Closing a TCP connection terminates an HTTP API call. But stdio MCP servers run as long-lived subprocesses — a single session can invoke thousands of tools before the parent process exits. Standard revocation does nothing.
Aforo uses a Heartbeat Kill Signal: the SDK sends periodic heartbeats to the ingestor, and the ingestor answers each one with an ACK. When the Redis COMPARE_AND_INCREMENT check reports that the hard limit has been breached, the next heartbeat ACK carries a kill payload. The SDK receives it and terminates the stdio transport immediately.
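The flow can be sketched in a few lines, under stated assumptions. `QuotaCounter` is an in-memory stand-in for the Redis counter, `StdioSession` stands in for the SDK side, and both names are hypothetical. The sketch also assumes the kill flag is raised once usage reaches the hard limit; the real threshold semantics, ingestor, and Redis script are not shown in this document.

```typescript
// Hypothetical sketch; the real ingestor, Redis script, and SDK are not shown here.
interface HeartbeatAck {
  kill: boolean;
  used: number;
}

// In-memory stand-in for the Redis counter. In a real deployment the compare
// and the increment would need to run atomically on the server side.
class QuotaCounter {
  private used = new Map<string, number>();
  private hardLimit: number;

  constructor(hardLimit: number) {
    this.hardLimit = hardLimit;
  }

  // Increment only while under the limit; report whether the call was allowed.
  compareAndIncrement(sessionId: string): boolean {
    const current = this.used.get(sessionId) ?? 0;
    if (current >= this.hardLimit) return false;
    this.used.set(sessionId, current + 1);
    return true;
  }

  // The ACK for a heartbeat carries the kill flag once the limit is reached.
  heartbeat(sessionId: string): HeartbeatAck {
    const used = this.used.get(sessionId) ?? 0;
    return { kill: used >= this.hardLimit, used };
  }
}

class StdioSession {
  closed = false;
  private id: string;
  private quota: QuotaCounter;

  constructor(id: string, quota: QuotaCounter) {
    this.id = id;
    this.quota = quota;
  }

  // Returns false when the call is rejected or the session is already closed.
  callTool(): boolean {
    if (this.closed) return false;
    return this.quota.compareAndIncrement(this.id);
  }

  // A real client would run this on a timer; here it is invoked by hand.
  onHeartbeat(): void {
    if (this.quota.heartbeat(this.id).kill) {
      this.closed = true; // a real SDK would close the stdio transport here
    }
  }
}
```

The point of the design is that the subprocess never needs an inbound connection: the kill decision rides back on a request the SDK was already making.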
Every Invocation Captured
in ClickHouse
Each tool call emits a structured event to the usage ingestor. ClickHouse materializes it into mcp_daily_stats_mv for real-time per-tool revenue analytics.
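To show the kind of rollup a daily stats view performs, here is a small in-memory sketch that groups events by day and tool and sums invocations, duration, and tokens. It is an illustration only, not the ClickHouse view definition. The `timestamp` field is an assumption (the sample event in this section does not show one), and `rollupDaily` is a hypothetical name; the remaining field names follow the sample event.

```typescript
// Hypothetical sketch of a per-day, per-tool rollup; not the actual view definition.
interface ToolEvent {
  toolName: string;
  timestamp: string; // assumed field: ISO 8601 in UTC, e.g. "2025-01-01T10:00:00Z"
  executionDurationMs: number;
  metadata: { inputTokens: number; outputTokens: number };
}

interface DailyToolStats {
  day: string; // YYYY-MM-DD
  toolName: string;
  invocations: number;
  totalDurationMs: number;
  inputTokens: number;
  outputTokens: number;
}

function rollupDaily(events: ToolEvent[]): DailyToolStats[] {
  const byKey = new Map<string, DailyToolStats>();
  for (const event of events) {
    const day = event.timestamp.slice(0, 10); // date part of the ISO string
    const key = `${day}|${event.toolName}`;
    let row = byKey.get(key);
    if (!row) {
      row = {
        day,
        toolName: event.toolName,
        invocations: 0,
        totalDurationMs: 0,
        inputTokens: 0,
        outputTokens: 0,
      };
      byKey.set(key, row);
    }
    row.invocations += 1;
    row.totalDurationMs += event.executionDurationMs;
    row.inputTokens += event.metadata.inputTokens;
    row.outputTokens += event.metadata.outputTokens;
  }
  // Rows come back in first-seen order, since Map preserves insertion order.
  return Array.from(byKey.values());
}
```

Turning these counts into revenue would additionally need a per-tool price table, which this sketch leaves out.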
{
  "metricName": "mcp_server.tool_invocations",
  "productType": "MCP_SERVER",
  "toolName": "web_search",
  "sessionId": "sess_abc123",
  "agentId": "agent_gpt4o_prod",
  "executionDurationMs": 542,
  "metadata": {
    "inputTokens": 1200,
    "outputTokens": 450
  }
}
mcp_daily_stats_mv and mcp_session_summary_mv make per-tool revenue, per-agent COGS, and session-level margin available in real time.
/analytics/mcp/revenue/by-tool
/analytics/mcp/by-agent
/analytics/mcp/summary
Your MCP tools are already being called.
Start getting paid for every invocation. Integrate the SDK in minutes. Set per-tool pricing. Real-time analytics. Automatic billing. No setup fees. Pay as you grow.