ConduitMcp.PromEx (ConduitMCP v0.9.0)

A PromEx plugin for monitoring ConduitMCP operations.

This plugin captures metrics from the ConduitMCP library's telemetry events:

  • Request metrics (all MCP method calls)
  • Tool execution metrics
  • Resource read metrics
  • Prompt retrieval metrics
  • Message rate limit metrics
  • Authentication metrics

Installation

Add :prom_ex alongside :conduit_mcp in your mix.exs dependencies:

def deps do
  [
    {:conduit_mcp, "~> 0.9.0"},
    {:prom_ex, "~> 1.11"}
  ]
end

Usage

Add this plugin to your PromEx module's plugins list:

defmodule MyApp.PromEx do
  use PromEx, otp_app: :my_app

  @impl true
  def plugins do
    [
      PromEx.Plugins.Application,
      PromEx.Plugins.Beam,
      {ConduitMcp.PromEx, otp_app: :my_app}
    ]
  end

  @impl true
  def dashboard_assigns do
    [
      datasource_id: "prometheus",
      default_selected_interval: "30s"
    ]
  end
end

Then add your PromEx module to your supervision tree:

def start(_type, _args) do
  children = [
    MyApp.PromEx,
    # ... other children ...
  ]

  Supervisor.start_link(children, strategy: :one_for_one)
end

Configuration Options

  • :otp_app (required) - Your application name
  • :duration_unit (optional) - Unit for duration metrics (default: :millisecond)
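
Both options are passed in the plugin tuple. A minimal sketch that spells out the documented default explicitly; other accepted values for :duration_unit are not listed above, so none are assumed here:

```elixir
defmodule MyApp.PromEx do
  use PromEx, otp_app: :my_app

  @impl true
  def plugins do
    [
      # :otp_app is required; :duration_unit is optional and is shown
      # here with its documented default.
      {ConduitMcp.PromEx, otp_app: :my_app, duration_unit: :millisecond}
    ]
  end
end
```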

Metrics Exposed

All metrics are prefixed with {otp_app}_conduit_mcp_.
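
For example, with otp_app: :my_app the {prefix} placeholder expands to my_app_conduit_mcp. The sketch below only illustrates that naming rule; MetricNameSketch and full_name/2 are hypothetical helpers and not part of ConduitMCP or PromEx:

```elixir
defmodule MetricNameSketch do
  # Builds a full metric name from the OTP app name and a metric suffix,
  # following the documented "{otp_app}_conduit_mcp_" prefix.
  def full_name(otp_app, suffix) when is_atom(otp_app) and is_binary(suffix) do
    "#{otp_app}_conduit_mcp_#{suffix}"
  end
end

IO.puts(MetricNameSketch.full_name(:my_app, "request_total"))
# prints "my_app_conduit_mcp_request_total"
```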

Request Metrics

  • {prefix}_request_total - Counter of MCP requests
    • Tags: method (e.g., "tools/list"), status (:ok | :error)

  • {prefix}_request_duration_milliseconds - Distribution of request durations
    • Tags: method, status
    • Buckets: [10, 50, 100, 250, 500, 1_000, 2_500, 5_000, 10_000]
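
Prometheus histogram buckets are cumulative upper bounds (the le label), so an observation is counted in every bucket whose bound is at least its value. As a rough illustration, the sketch below finds the smallest bound that contains a given duration, using the request buckets listed above. BucketSketch and smallest_le/2 are hypothetical names for this example only:

```elixir
defmodule BucketSketch do
  # Request duration buckets, in milliseconds, as documented above.
  @request_buckets [10, 50, 100, 250, 500, 1_000, 2_500, 5_000, 10_000]

  # Returns the smallest upper bound ("le") that contains the observation,
  # or "+Inf" when the observation exceeds every configured bound.
  def smallest_le(duration_ms, buckets \\ @request_buckets) do
    Enum.find(buckets, "+Inf", fn bound -> duration_ms <= bound end)
  end
end

IO.inspect(BucketSketch.smallest_le(120))
# prints 250
```

A 120 ms request therefore increments the 250 bucket and every larger one, which is why quantile estimates can be no finer than the bucket boundaries.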

Tool Metrics

  • {prefix}_tool_execution_total - Counter of tool executions
    • Tags: tool_name, status
  • {prefix}_tool_duration_milliseconds - Distribution of tool execution durations
    • Tags: tool_name, status
    • Buckets: [10, 50, 100, 500, 1_000, 5_000, 10_000, 30_000]

Resource Metrics

  • {prefix}_resource_read_total - Counter of resource reads
    • Tags: status
  • {prefix}_resource_read_duration_milliseconds - Distribution of resource read durations
    • Tags: status
    • Buckets: [10, 50, 100, 500, 1_000, 5_000]

Prompt Metrics

  • {prefix}_prompt_get_total - Counter of prompt retrievals
    • Tags: prompt_name, status
  • {prefix}_prompt_get_duration_milliseconds - Distribution of prompt retrieval durations
    • Tags: prompt_name, status
    • Buckets: [10, 50, 100, 500, 1_000]

Message Rate Limit Metrics

  • {prefix}_message_rate_limit_check_total - Counter of message rate limit checks
    • Tags: status (:allow | :deny), method (MCP method name)

  • {prefix}_message_rate_limit_check_duration_milliseconds - Distribution of check durations
    • Tags: status, method
    • Buckets: [1, 5, 10, 25, 50, 100, 250]

Authentication Metrics

  • {prefix}_auth_verify_total - Counter of authentication attempts
    • Tags: strategy (:bearer_token | :api_key | :function), status

  • {prefix}_auth_verify_duration_milliseconds - Distribution of auth verification durations
    • Tags: strategy, status
    • Buckets: [1, 5, 10, 25, 50, 100, 250]

PromQL Examples

Request Rate by Method

sum by (method) (rate({otp_app}_conduit_mcp_request_total[5m]))

Error Rate Percentage

100 * (
  sum(rate({otp_app}_conduit_mcp_request_total{status="error"}[5m]))
  /
  sum(rate({otp_app}_conduit_mcp_request_total[5m]))
)
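
As a worked example of the intended arithmetic: the error rate percentage is the summed rate of error series divided by the summed rate of all series, times 100. The sketch below reproduces that calculation on hand-written sample data. ErrorRateSketch and the sample rates are made up for illustration:

```elixir
defmodule ErrorRateSketch do
  # rates: list of {status, per_second_rate} pairs, one per time series.
  def error_percentage(rates) do
    total = rates |> Enum.map(fn {_status, rate} -> rate end) |> Enum.sum()
    errors = Enum.sum(for {"error", rate} <- rates, do: rate)

    # PromQL would yield NaN for 0/0; this sketch returns 0.0 instead.
    if total == 0, do: 0.0, else: 100 * errors / total
  end
end

IO.inspect(ErrorRateSketch.error_percentage([{"ok", 9.5}, {"error", 0.5}]))
# prints 5.0
```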

P95 Request Duration

histogram_quantile(0.95,
  rate({otp_app}_conduit_mcp_request_duration_milliseconds_bucket[5m])
)

Slow Tool Executions (p95 > 5s)

histogram_quantile(0.95,
  rate({otp_app}_conduit_mcp_tool_duration_milliseconds_bucket[5m])
) > 5000

Authentication Success Rate

100 * (
  sum(rate({otp_app}_conduit_mcp_auth_verify_total{status="ok"}[5m]))
  /
  sum(rate({otp_app}_conduit_mcp_auth_verify_total[5m]))
)

Alert Examples

High Error Rate

- alert: ConduitMcpHighErrorRate
  expr: |
    100 * (
      sum(rate(my_app_conduit_mcp_request_total{status="error"}[5m]))
      /
      sum(rate(my_app_conduit_mcp_request_total[5m]))
    ) > 5
  for: 5m
  annotations:
    summary: "High error rate in ConduitMCP ({{ $value }}%)"

Slow Tool Executions

- alert: ConduitMcpSlowTools
  expr: |
    histogram_quantile(0.95,
      rate(my_app_conduit_mcp_tool_duration_milliseconds_bucket[5m])
    ) > 5000
  for: 10m
  annotations:
    summary: "Tool executions are slow (p95: {{ $value }}ms)"

Authentication Failures

- alert: ConduitMcpAuthFailures
  expr: |
    rate(my_app_conduit_mcp_auth_verify_total{status="error"}[5m]) > 0.1
  for: 5m
  annotations:
    summary: "Authentication failures detected"

Cardinality Considerations

This plugin is designed to minimize metric cardinality:

  • ✅ LOW: method (limited set of MCP methods)
  • ✅ LOW: status (:ok or :error; :allow or :deny for rate limit checks)
  • ✅ LOW: strategy (only 3 auth strategies)
  • ✅ LOW: tool_name (user-defined but typically limited)
  • ✅ LOW: prompt_name (user-defined but typically limited)
  • ❌ HIGH: uri is NOT included (unbounded)
  • ❌ HIGH: server_module is NOT included (not useful)

All string values are normalized to prevent cardinality explosion.
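
The plugin's actual normalization rules are not described here. Purely as an illustration of the idea, one hypothetical approach is to coerce every tag value to a trimmed, length-capped string and map anything unexpected to a single fallback value. TagNormalizerSketch, its 64-character cap, and the "unknown" fallback are all assumptions for this sketch:

```elixir
defmodule TagNormalizerSketch do
  # Hypothetical cap on tag value length; not taken from ConduitMCP.
  @max_length 64

  # Atoms such as :ok or :bearer_token become their string form.
  def normalize(value) when is_atom(value) and not is_nil(value) do
    normalize(Atom.to_string(value))
  end

  # Strings are trimmed and truncated so long values cannot create
  # arbitrarily many distinct label values.
  def normalize(value) when is_binary(value) do
    value
    |> String.trim()
    |> String.slice(0, @max_length)
  end

  # Anything else (nil, numbers, maps, ...) collapses to one fallback value.
  def normalize(_other), do: "unknown"
end

IO.inspect(TagNormalizerSketch.normalize("  tools/list  "))
# prints "tools/list"
```

Collapsing unexpected values into one fallback keeps the number of distinct label values bounded even when callers pass arbitrary data.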