Observability Examples for MCP DevTools

This directory contains example configurations for setting up observability with MCP DevTools using OpenTelemetry.

Quick Start - Jaeger Only (Recommended for just tracing)

The simplest way to get started with distributed tracing, with no config files needed:

cd docs/observability/examples
docker compose up -d

Configure MCP DevTools:

OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318 ./bin/mcp-devtools

View traces: http://localhost:16686
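Containers started with `docker compose up -d` may take a few seconds to accept connections. The following is a minimal sketch of a readiness check, assuming `curl` is installed on the host; `wait_for_http` is a hypothetical helper name, not part of MCP DevTools.

```shell
# Poll a URL until it answers with any HTTP status, or give up.
# Usage: wait_for_http URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_http() {
  url=$1
  attempts=${2:-30}
  delay=${3:-1}
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -s: quiet, -o /dev/null: discard the body, --max-time: do not hang
    if curl -s -o /dev/null --max-time 2 "$url"; then
      echo "ready after $i attempt(s): $url"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "gave up after $attempts attempts: $url" >&2
  return 1
}

# Example (assumes the quick-start stack is running):
#   wait_for_http http://localhost:16686 && echo "Jaeger UI is up"
```

The check only confirms that something answers over HTTP on that port; it does not confirm that traces are being ingested.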

Configuration Scenarios

1. Jaeger Only (Default)

Simple distributed tracing with persistent storage. Jaeger has a built-in OTLP receiver, so no OTEL Collector is needed.

What you get:

  • Distributed tracing with Jaeger UI
  • 7 days trace retention
  • Direct OTLP ingestion from MCP DevTools

Setup:

docker compose up -d
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318 ./bin/mcp-devtools

Access:

  • Jaeger UI: http://localhost:16686

Note: This setup does NOT use the OTEL Collector - Jaeger receives traces directly.


2. Jaeger + Prometheus + Grafana (Advanced)

Full observability stack with traces and metrics.

What you get:

  • Distributed tracing via Jaeger
  • Metrics collection via Prometheus
  • Unified visualisation in Grafana
  • 30 days metric retention

Setup:

  1. Edit docker-compose.yaml:

    • Uncomment otel-collector service
    • Uncomment prometheus service
    • Uncomment grafana service
    • Uncomment volume definitions
  2. Update OTEL collector config mount:

    - ./configs/otel/jaeger-prometheus.yaml:/etc/otel-collector-config.yaml:ro
  3. Start services:

    docker compose up -d
  4. Configure MCP DevTools:

    OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318 \
    MCP_METRICS_GROUPS=tool,session,cache,security \
    ./bin/mcp-devtools

Access:

Grafana Configuration:

  • Datasources are auto-provisioned (Prometheus + Jaeger)
  • Create custom dashboards or import community dashboards
  • Explore traces via Jaeger datasource
  • Query metrics via Prometheus datasource

3. AWS X-Ray

Export traces to AWS X-Ray for cloud-native observability.

What you get:

  • Traces sent to AWS X-Ray service
  • CloudWatch integration
  • AWS service map
  • Managed trace storage

Prerequisites:

export AWS_ACCESS_KEY_ID=your_key
export AWS_SECRET_ACCESS_KEY=your_secret
export AWS_REGION=us-east-1
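Since the collector container reads these values through variable substitution in docker-compose.yaml, an unset variable can end up as an empty credential. A small preflight sketch follows; `check_required_env` is a hypothetical helper name.

```shell
# Report which of the named variables are unset or empty.
# Returns 0 when all are present, 1 otherwise.
check_required_env() {
  missing=""
  for name in "$@"; do
    # Indirect lookup: read the value of the variable whose name is in $name
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      missing="$missing $name"
    fi
  done
  if [ -n "$missing" ]; then
    echo "missing:$missing" >&2
    return 1
  fi
  echo "all required variables are set"
}

# Example: run the check before starting the stack
#   check_required_env AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_REGION \
#     && docker compose up -d
```

The helper also sees shell variables that were never exported, so keep the `export` shown above; Docker Compose only substitutes values that are in its environment.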

Setup:

  1. Edit docker-compose.yaml:

    • Comment out jaeger service
    • Uncomment otel-collector service
  2. Update OTEL collector config mount:

    - ./configs/otel/xray.yaml:/etc/otel-collector-config.yaml:ro
  3. Add AWS credentials to otel-collector service:

    environment:
      - AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID}
      - AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY}
      - AWS_REGION=${AWS_REGION}
  4. Start services:

    docker compose up -d
  5. Configure MCP DevTools:

    OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318 \
    OTEL_SERVICE_NAME=mcp-devtools \
    ./bin/mcp-devtools

Access:


4. Grafana Tempo

Lightweight tracing backend optimised for cost and simplicity.

What you get:

  • S3-compatible object storage for traces
  • Native Grafana integration
  • Lower resource usage than Jaeger
  • TraceQL query language

Setup:

  1. Edit docker-compose.yaml:

    • Comment out jaeger service
    • Uncomment tempo service
    • Uncomment grafana service
    • Uncomment volume definitions
  2. Update Grafana datasources config to use Tempo:

    • Edit configs/grafana/datasources/datasources.yaml
    • Comment out Jaeger datasource
    • Uncomment Tempo datasource
  3. Start services:

    docker compose up -d
  4. Configure MCP DevTools:

    OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318 ./bin/mcp-devtools

Access:


Configuration Files

All configuration files are organised in configs/ by service:

configs/
├── grafana/
│   ├── datasources/datasources.yaml    # Auto-provisioned datasources
│   └── dashboards/dashboards.yaml      # Dashboard provisioning
├── jaeger/
│   └── jaeger.yaml                     # Commented OTEL Collector config (reference)
├── otel/
│   ├── jaeger-prometheus.yaml          # OTEL → Jaeger + Prometheus
│   └── xray.yaml                       # OTEL → AWS X-Ray
├── prometheus/
│   └── prometheus.yml                  # Prometheus scrape config
└── tempo/
    └── tempo.yaml                      # Tempo configuration

Configuration Details

| Config | Purpose | When to Use |
|--------|---------|-------------|
| otel/jaeger-prometheus.yaml | Routes traces to Jaeger, metrics to Prometheus | Full observability stack |
| otel/xray.yaml | Exports traces to AWS X-Ray | AWS cloud deployments |
| jaeger/jaeger.yaml | Reference OTEL config (commented) | Learning/documentation |
| tempo/tempo.yaml | Direct Tempo configuration | Using Tempo instead of Jaeger |
| prometheus/prometheus.yml | Scrapes OTEL Collector metrics | Metrics collection |
| grafana/datasources/ | Auto-configures datasources | Grafana setup |

Common Operations

View Logs

# All services
docker compose logs -f

# Specific service
docker compose logs -f jaeger
docker compose logs -f otel-collector

Restart Services

# All services
docker compose restart

# Specific service
docker compose restart otel-collector

Clean Up

# Stop and remove containers
docker compose down

# Also remove volumes (deletes all data)
docker compose down -v

Update Configuration

After editing config files:

docker compose restart otel-collector

Troubleshooting

No Traces Appearing

Check OTEL endpoint:

curl -s -o /dev/null -w '%{http_code}\n' http://localhost:4318/v1/traces  # Should print 405
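The 405 is expected because the OTLP/HTTP traces path accepts POST, not GET. As a rough sketch, the probe and a reading of its result can be wrapped as below; the helper names are hypothetical, and the interpretations for statuses other than 405 are reasonable guesses rather than documented behaviour.

```shell
# Map the HTTP status of a plain GET against /v1/traces to a diagnosis.
otlp_probe_diagnosis() {
  case "$1" in
    405) echo "receiver is listening (GET is rejected, POST is expected)" ;;
    000) echo "no HTTP response: nothing is listening or the port is not published" ;;
    404) echo "something answers on this port, but not at an OTLP path" ;;
    *)   echo "unexpected status $1: check which service owns the port" ;;
  esac
}

# Probe the endpoint; curl reports 000 when it cannot connect.
otlp_probe() {
  endpoint=${1:-${OTEL_EXPORTER_OTLP_ENDPOINT:-http://localhost:4318}}
  status=$(curl -s -o /dev/null -w '%{http_code}' --max-time 3 "$endpoint/v1/traces")
  otlp_probe_diagnosis "$status"
}

# Example:
#   otlp_probe http://localhost:4318
```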

Check container logs:

docker compose logs jaeger
docker compose logs otel-collector

Verify MCP DevTools config:

echo $OTEL_EXPORTER_OTLP_ENDPOINT
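If the endpoint and the environment variable both look right, it can help to push one hand-made span through the receiver and then look for it in the UI. The sketch below builds a minimal OTLP/JSON payload and posts it with `curl`. It assumes the receiver accepts JSON over OTLP/HTTP; the service name `otlp-smoke-test`, the span name, and the helper names are placeholders chosen for this example.

```shell
# Build a minimal OTLP/JSON payload holding a single span.
otlp_test_payload() {
  trace_id=$(od -An -tx1 -N16 /dev/urandom | tr -d ' \n')   # 32 hex characters
  span_id=$(od -An -tx1 -N8 /dev/urandom | tr -d ' \n')     # 16 hex characters
  end=$(date +%s)
  start=$((end - 1))
  cat <<EOF
{"resourceSpans":[{"resource":{"attributes":[{"key":"service.name","value":{"stringValue":"otlp-smoke-test"}}]},"scopeSpans":[{"scope":{"name":"manual-check"},"spans":[{"traceId":"$trace_id","spanId":"$span_id","name":"smoke-span","kind":1,"startTimeUnixNano":"${start}000000000","endTimeUnixNano":"${end}000000000"}]}]}]}
EOF
}

# POST the payload to the OTLP/HTTP traces path and print the HTTP status.
send_test_span() {
  endpoint=${1:-${OTEL_EXPORTER_OTLP_ENDPOINT:-http://localhost:4318}}
  otlp_test_payload | curl -s -o /dev/null -w '%{http_code}\n' \
    -X POST -H 'Content-Type: application/json' \
    --data-binary @- "$endpoint/v1/traces"
}

# Example:
#   send_test_span http://localhost:4318
```

A 2xx status suggests the receiver accepted the span. If a service named `otlp-smoke-test` then shows up in the tracing UI, the ingestion path works and the problem is more likely on the MCP DevTools side.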

No Metrics in Prometheus

Check OTEL Collector is exposing metrics:

curl -s http://localhost:8889/metrics | grep mcp_

Check Prometheus targets:

Verify metrics are enabled:

echo $MCP_METRICS_GROUPS  # Should show: tool,session,cache,security
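Reading the variable by eye is error-prone when only some groups are set. A small sketch that lists which of the four groups named above are absent follows. It assumes the value is a comma-separated list without spaces, as in the examples in this document; how MCP DevTools itself parses the variable is not covered here, and the helper names are hypothetical.

```shell
# Succeeds when GROUP appears in the comma-separated MCP_METRICS_GROUPS value.
metrics_group_enabled() {
  case ",${MCP_METRICS_GROUPS:-}," in
    *",$1,"*) return 0 ;;
    *)        return 1 ;;
  esac
}

# Print every group from the list used in this document that is not enabled.
missing_metrics_groups() {
  for group in tool session cache security; do
    metrics_group_enabled "$group" || echo "$group"
  done
}

# Example:
#   MCP_METRICS_GROUPS=tool,cache missing_metrics_groups
```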

Grafana Datasources Not Working

Check datasource provisioning:

docker compose logs grafana | grep datasource

Verify network connectivity:

docker compose exec grafana wget -O- http://prometheus:9090/-/healthy
docker compose exec grafana wget -O- http://jaeger:16686

High Memory Usage

Reduce OTEL Collector memory: Edit the collector config's memory_limiter processor:

memory_limiter:
  limit_mib: 256  # Reduce from 512
  spike_limit_mib: 64  # Reduce from 128
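In the collector's memory_limiter processor, the spike allowance is subtracted from the hard limit to give a soft limit, above which the processor starts refusing data. The arithmetic for both sets of values is sketched below; `memory_soft_limit_mib` is a hypothetical helper name.

```shell
# Soft limit = hard limit minus spike allowance, in MiB.
memory_soft_limit_mib() {
  limit_mib=$1
  spike_limit_mib=$2
  if [ "$spike_limit_mib" -ge "$limit_mib" ]; then
    echo "spike_limit_mib must be smaller than limit_mib" >&2
    return 1
  fi
  echo $((limit_mib - spike_limit_mib))
}

memory_soft_limit_mib 512 128   # original values: prints 384
memory_soft_limit_mib 256 64    # reduced values: prints 192
```

With the reduced values the collector would start shedding load at roughly 192 MiB, so expect refused data under bursts if the limit is set too low for the traffic.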

Enable sampling:

OTEL_TRACES_SAMPLER=traceidratio \
OTEL_TRACES_SAMPLER_ARG=0.1 \
./bin/mcp-devtools
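With the `traceidratio` sampler, the argument is the fraction of traces kept, so 0.1 keeps roughly one trace in ten. As a back-of-the-envelope sketch (the helper name and the trace counts are illustrative only):

```shell
# Estimate how many traces survive sampling at a given ratio.
expected_sampled_traces() {
  awk -v total="$1" -v ratio="$2" 'BEGIN { printf "%d\n", total * ratio + 0.5 }'
}

expected_sampled_traces 5000 0.1    # prints 500
expected_sampled_traces 1000 0.25   # prints 250
```

The real count varies around this estimate because the decision is made per trace ID.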

Production Considerations

Security

  • Enable authentication on Grafana, Prometheus, Jaeger
  • Use TLS for OTLP endpoints (configure in OTEL Collector)
  • Restrict network access using Docker networks and firewalls
  • Rotate credentials for Grafana and external services

Persistence

All services use named volumes for data persistence:

  • jaeger-data - Trace storage (Badger)
  • prometheus-data - Metrics (TSDB)
  • grafana-data - Dashboards and config
  • tempo-data - Trace storage

Retention

Jaeger:

  • Default: 7 days (BADGER_SPAN_STORE_TTL=168h)
  • Adjust in docker-compose.yaml
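The Jaeger TTL above is written in hours (168h for 7 days). A tiny sketch for converting a retention period in days into that form follows; `retention_hours` is a hypothetical helper name.

```shell
# Convert a number of days into an hours-based duration string.
retention_hours() {
  echo "$(($1 * 24))h"
}

retention_hours 7    # prints 168h
retention_hours 14   # prints 336h
```

The printed value is what would go after `BADGER_SPAN_STORE_TTL=` in docker-compose.yaml.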

Prometheus:

  • Default: 30 days (--storage.tsdb.retention.time=30d)
  • Adjust in docker-compose.yaml

Tempo:

  • Default: No expiration (manual cleanup required)
  • Configure retention in configs/tempo/tempo.yaml

Resource Limits

Add resource limits in docker-compose.yaml:

services:
  jaeger:
    deploy:
      resources:
        limits:
          memory: 1G
          cpus: '1.0'

Example Queries

Prometheus (PromQL)

# Tool call rate (calls/second)
rate(mcp_tool_calls_total[5m])

# P95 latency by tool
histogram_quantile(0.95, rate(mcp_tool_duration_seconds_bucket[5m]))

# Error rate
sum(rate(mcp_tool_errors_total[5m])) / sum(rate(mcp_tool_calls_total[5m]))

# Cache hit ratio
sum(rate(mcp_cache_operations_total{result="hit"}[5m])) /
sum(rate(mcp_cache_operations_total{operation="get"}[5m]))
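These queries can also be run from a terminal through the Prometheus HTTP API, which is handy for scripted checks. The sketch below assumes Prometheus is published on `http://localhost:9090`; check the port mapping in docker-compose.yaml. `PROM_URL` and `prom_query` are names made up for this example.

```shell
# Run an instant PromQL query through the Prometheus HTTP API.
# PROM_URL is an assumption; adjust it to the port published in docker-compose.yaml.
prom_query() {
  # -G sends the URL-encoded data as a query string on a GET request
  curl -s -G "${PROM_URL:-http://localhost:9090}/api/v1/query" \
    --data-urlencode "query=$1"
}

# Example:
#   prom_query 'rate(mcp_tool_calls_total[5m])'
```

The response is JSON; piping it through a formatter such as `jq` (if installed) makes the result easier to read.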

Jaeger (Search)

service="mcp-devtools" AND mcp.tool.name="internet_search"
service="mcp-devtools" AND http.status_code >= 500
service="mcp-devtools" AND mcp.session.id="abc123"

Further Reading