
Grafana Integration

Overview

OpsWorker has two distinct Grafana integrations that serve different purposes and are configured independently:

| Integration | Direction | Purpose | What It Does |
| --- | --- | --- | --- |
| Grafana Alerting | Grafana → OpsWorker | Alert ingestion | Sends Grafana alerts to OpsWorker via webhook for automatic investigation |
| Grafana MCP | OpsWorker → Grafana | Observability queries | Allows OpsWorker's AI agents to query Grafana metrics, logs, dashboards, and alerts during investigations and chat |

You can use one or both. They are independent — Grafana Alerting works without MCP, and MCP works without Grafana Alerting.


Grafana Alerting (Webhook)

Receives alerts from Grafana's unified alerting system. Requires Grafana 9+.

How It Works

  1. Grafana alert rule fires
  2. Notification policy routes it to the OpsWorker contact point (webhook)
  3. OpsWorker receives the alert, normalizes it, and stores it as a signal
  4. If the signal matches an OpsWorker alert rule, an automatic investigation starts

Setup

1. Get Your Webhook URL and Credentials

  1. In the OpsWorker portal, go to Integrations → select your cluster → Grafana Alerting
  2. Copy the webhook URL and the Authorization header value

2. Create a Contact Point in Grafana

  1. In Grafana, go to Alerting → Contact Points
  2. Click New Contact Point
  3. Configure:
    • Name: OpsWorker
    • Type: Webhook
    • URL: Your OpsWorker webhook URL
    • HTTP Method: POST
    • Authorization Header: Paste the value from the portal (if provided)
  4. Click Test to send a test notification
  5. Click Save

3. Configure Notification Policies

  1. Go to Alerting → Notification Policies
  2. Either:
    • Add to the default policy — All Grafana alerts go to OpsWorker
    • Create a specific policy — Route certain alerts (by label, folder, or severity) to OpsWorker
  3. Set the contact point to OpsWorker
  4. Save

4. Verify

Trigger a Grafana alert or use the test button on the contact point. Check the OpsWorker portal under Alerts for incoming signals.
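To verify from a terminal instead, the sketch below posts a minimal test alert to the webhook with curl. The payload follows the general shape of Grafana's webhook notification format; whether OpsWorker accepts a hand-built payload like this one is an assumption, so treat the contact point's Test button as the authoritative check. `OPSWORKER_WEBHOOK_URL`, `OPSWORKER_AUTH_HEADER`, and the function names are placeholders chosen for this example, not part of OpsWorker.

```shell
# Placeholders: replace with the values copied from the OpsWorker portal.
OPSWORKER_WEBHOOK_URL="${OPSWORKER_WEBHOOK_URL:-https://example.invalid/your-webhook-path}"
OPSWORKER_AUTH_HEADER="${OPSWORKER_AUTH_HEADER:-}"

# Minimal payload modeled on Grafana's webhook notification format.
# Real notifications carry more fields; this keeps only a few common ones.
test_payload() {
  cat <<'JSON'
{
  "receiver": "OpsWorker",
  "status": "firing",
  "alerts": [
    {
      "status": "firing",
      "labels": {"alertname": "OpsWorkerWebhookTest", "severity": "info"},
      "annotations": {"summary": "Manual test alert sent with curl"},
      "startsAt": "2024-01-01T00:00:00Z",
      "endsAt": "0001-01-01T00:00:00Z"
    }
  ],
  "commonLabels": {"alertname": "OpsWorkerWebhookTest"}
}
JSON
}

# POST the payload and print only the HTTP status code.
# The Authorization header is added only when a value was provided.
send_test_alert() {
  set -- -sS -o /dev/null -w '%{http_code}\n' -X POST \
    -H 'Content-Type: application/json' --data-binary @-
  if [ -n "$OPSWORKER_AUTH_HEADER" ]; then
    set -- "$@" -H "Authorization: $OPSWORKER_AUTH_HEADER"
  fi
  test_payload | curl "$@" "$OPSWORKER_WEBHOOK_URL"
}

# Usage (after exporting the two variables):
#   send_test_alert
```

A 2xx status code suggests the request was accepted; the signal should then appear in the OpsWorker portal under Alerts.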

Grafana Cloud

The same setup works with Grafana Cloud — create a webhook contact point using your OpsWorker webhook URL.


Grafana MCP (Observability Queries)

Enables the Observability AI Agent to query your Grafana instance during investigations and chat sessions. The Grafana MCP server runs as a sidecar container alongside the OpsWorker Kubernetes Agent.

What It Enables

With Grafana MCP configured, OpsWorker's AI agents can:

| Capability | Description |
| --- | --- |
| PromQL queries | Execute Prometheus queries for CPU, memory, network, latency, histograms, percentiles |
| Loki log search | Run LogQL queries for log pattern detection and error search |
| Dashboard inspection | Search, browse, and read data from Grafana dashboards |
| Datasource queries | Query any configured datasource (Prometheus, Loki, CloudWatch, ClickHouse, Elasticsearch) |
| Alert rule inspection | View Grafana alert rules and notification policies |
| Incident browsing | View Grafana incidents and on-call schedules |
| Annotation retrieval | Read Grafana annotations for event correlation |
| Deep link generation | Generate direct links to relevant Grafana views |

How It Works

OpsWorker AI Agent → (SQS command) → K8s Agent → (MCP protocol) → Grafana MCP Sidecar → (API calls) → Your Grafana Instance

  1. The Grafana MCP server runs as a sidecar container in the OpsWorker agent pod
  2. It connects to your Grafana instance using a service account token
  3. During investigations, AI agents send queries via SQS → Agent → MCP → Grafana
  4. Results flow back to the AI for analysis and correlation with Kubernetes data

Setup

1. Create a Grafana Service Account

  1. In Grafana, go to Administration → Service Accounts
  2. Click Add service account
  3. Set the role to Viewer (read-only access is sufficient)
  4. Create a token for the service account
  5. Copy the token
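Before handing the token to OpsWorker, it can be worth confirming that Grafana accepts it. The sketch below calls Grafana's dashboard search endpoint with the token as a Bearer credential and prints the HTTP status code. `GRAFANA_URL`, `GRAFANA_TOKEN`, and the function names are placeholders chosen for this example.

```shell
# Placeholders: your Grafana base URL and the service account token from step 5.
GRAFANA_URL="${GRAFANA_URL:-https://grafana.example.com}"
GRAFANA_TOKEN="${GRAFANA_TOKEN:-YOUR_SERVICE_ACCOUNT_TOKEN}"

# Join the base URL and an API path without doubling the slash.
grafana_api_url() {
  printf '%s/%s\n' "${GRAFANA_URL%/}" "${1#/}"
}

# Query the dashboard search API with the token and print the HTTP status code.
check_grafana_token() {
  curl -sS -o /dev/null -w '%{http_code}\n' \
    -H "Authorization: Bearer $GRAFANA_TOKEN" \
    "$(grafana_api_url '/api/search?type=dash-db&limit=1')"
}

# Usage:
#   check_grafana_token
```

A 200 suggests the token is valid and can read dashboards; a 401 or 403 usually points to a mistyped token or insufficient role.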

2. Enable in OpsWorker Portal

  1. In the OpsWorker portal, go to Integrations → select your cluster
  2. Select Grafana MCP
  3. Enter your Grafana URL (e.g., https://grafana.example.com)
  4. Save

3. Install or Upgrade the Agent with Grafana MCP

The portal generates a Helm command with the Grafana MCP flags. Run it:

helm upgrade opsworker-agent opsworker/opsworker-agent \
  -n opsworker \
  --set clusterToken=YOUR_CLUSTER_TOKEN \
  --set grafana-mcp.enabled=true \
  --set grafana-mcp.grafanaUrl=https://grafana.example.com \
  --set grafana-mcp.grafanaToken=YOUR_SERVICE_ACCOUNT_TOKEN

4. Verify

After the agent restarts, the Grafana MCP sidecar starts alongside the main agent. Test by asking a question in AI Chat:

What Grafana dashboards exist for the production namespace?
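If the chat answer is empty or the agent reports no Grafana access, a command-line check can help narrow things down. The sketch below shows the values applied to the Helm release and lists the containers in each pod of the agent namespace, so you can see whether a second container is running next to the main agent. The release name and namespace are taken from the Helm command above; the sidecar's container name is not documented here, so look for any additional container rather than a specific name. The function names are made up for this example.

```shell
# Taken from the Helm command in step 3; adjust if you installed differently.
NAMESPACE="${NAMESPACE:-opsworker}"
RELEASE="${RELEASE:-opsworker-agent}"

# Show the user-supplied values for the release.
# Note: the output includes the Grafana token in plain text.
show_agent_values() {
  helm get values "$RELEASE" -n "$NAMESPACE"
}

# Print each pod name followed by the names of its containers.
list_agent_containers() {
  kubectl get pods -n "$NAMESPACE" \
    -o jsonpath='{range .items[*]}{.metadata.name}{": "}{.spec.containers[*].name}{"\n"}{end}'
}

# Usage:
#   show_agent_values
#   list_agent_containers
```

An agent pod that lists two or more containers is consistent with the sidecar having been enabled.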

Investigation Enhancement

When Grafana MCP is active, these investigation agents gain additional capabilities:

| Agent | Grafana Enhancement |
| --- | --- |
| investigate | Correlates alerts with historical metric trends via PromQL, searches logs via Loki, inspects related Grafana alerts and incidents, checks on-call context |
| analyze_logs | Queries Loki for log patterns and statistics, cross-references log error spikes with metric anomalies |
| validate_resources | Checks CPU/memory utilization metrics from Prometheus to validate resource configurations |
| check_dependencies | Queries service-level metrics and request latency to identify degrading dependencies |

Failure Isolation

Grafana MCP and the Kubernetes Agent run as independent MCP sessions. If Grafana MCP is unavailable (e.g., Grafana is down), Kubernetes investigation tools continue working normally.


Compatibility

| Grafana Version | Alerting (Webhook) | MCP (Queries) |
| --- | --- | --- |
| Grafana 9+ | Supported | Supported |
| Grafana Cloud | Supported | Supported |
| Grafana 8 and earlier | Not supported (use legacy alerting → AlertManager → OpsWorker) | Supported |
| Self-hosted Grafana | Supported | Supported (must be reachable from the cluster) |
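For self-hosted Grafana, the "reachable from the cluster" requirement can be checked with a short-lived pod. The sketch below runs curl inside the cluster against Grafana's unauthenticated `/api/health` endpoint and prints the HTTP status code. The `curlimages/curl` image, the pod name, and the function name are choices made for this example; any image that ships curl would do.

```shell
# Placeholders: the agent namespace and your Grafana base URL.
NAMESPACE="${NAMESPACE:-opsworker}"
GRAFANA_URL="${GRAFANA_URL:-https://grafana.example.com}"

# Start a temporary pod, call Grafana's health endpoint from inside the
# cluster, print the HTTP status code, then delete the pod (--rm).
check_grafana_from_cluster() {
  kubectl run grafana-reachability-check \
    -n "$NAMESPACE" --rm -i --restart=Never \
    --image=curlimages/curl --command -- \
    curl -sS -o /dev/null -w '%{http_code}\n' "${GRAFANA_URL%/}/api/health"
}

# Usage:
#   check_grafana_from_cluster
```

A 200 suggests the cluster network can reach Grafana; a timeout or DNS error points to network policy, firewall, or name resolution rather than to the token.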
