Datadog AI Agent
Overview
The Datadog AI Agent queries your Datadog account for metrics, traces, and monitor status during investigations and chat sessions. It enriches Kubernetes investigation data with application performance monitoring (APM) context.
Requirements
- Datadog integration configured for the cluster, with both an API key and an Application key (see Datadog Integration)
Capabilities
| Capability | Description |
|---|---|
| Query metrics | Retrieve time-series metric data (CPU, memory, request rate, error rate) |
| List monitors | Check which Datadog monitors are in alert, warn, or OK state |
| Search traces | Find APM traces for specific services and time ranges |
| Get dashboard data | Retrieve data from Datadog dashboards |
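To make the "Query metrics" capability more concrete, the sketch below builds the kind of request that Datadog's public v1 timeseries query endpoint accepts. It is an illustration, not a description of how the agent is implemented. The function name `build_metric_query_request`, its parameters, and the example query string are assumptions made for this example. The endpoint path and the `DD-API-KEY` / `DD-APPLICATION-KEY` headers come from Datadog's public API.

```python
import time
import urllib.parse
import urllib.request


def build_metric_query_request(query, api_key, app_key, window_seconds=3600,
                               site="api.datadoghq.com", now=None):
    """Build (but do not send) a GET request for Datadog's v1 metric query endpoint.

    Hypothetical helper for illustration; `site` varies by Datadog region.
    """
    end = int(now if now is not None else time.time())
    start = end - window_seconds
    # The endpoint takes the window as POSIX seconds plus a metric query string.
    params = urllib.parse.urlencode({"from": start, "to": end, "query": query})
    url = f"https://{site}/api/v1/query?{params}"
    return urllib.request.Request(
        url,
        headers={"DD-API-KEY": api_key, "DD-APPLICATION-KEY": app_key},
        method="GET",
    )


# Example: average user CPU over the last hour for an assumed namespace tag.
request = build_metric_query_request(
    "avg:system.cpu.user{kube_namespace:production}",
    api_key="<api-key>",
    app_key="<application-key>",
)
```

Passing the returned object to `urllib.request.urlopen` would send it. The helper stops short of that so the request can be inspected without credentials.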
Use Cases
During Investigations
The Datadog AI Agent adds metric context to Kubernetes investigations:
- Correlates application metrics with pod failures (e.g., error rate spike → pod crash)
- Checks Datadog monitors related to the alerting resource
- Identifies latency patterns that Kubernetes data alone can't show
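The monitor check in the list above can be pictured as a filter over monitor objects. The sketch below is a minimal illustration under stated assumptions: it expects monitors shaped like the entries returned by Datadog's monitor listing API (with `name`, `overall_state`, and `tags` fields), and the function name `alerting_monitors` and the sample data are made up for this example. How the agent actually matches monitors to a resource is not specified here.

```python
def alerting_monitors(monitors, resource_tags):
    """Return names of monitors in Alert or Warn that share a tag with the resource.

    Hypothetical helper: `monitors` is a list of dicts, `resource_tags` a list of
    tag strings such as "service:checkout".
    """
    wanted = set(resource_tags)
    return [
        m["name"]
        for m in monitors
        if m.get("overall_state") in {"Alert", "Warn"}
        and wanted & set(m.get("tags", []))
    ]


# Made-up sample data for illustration only.
sample = [
    {"name": "Checkout error rate", "overall_state": "Alert", "tags": ["service:checkout"]},
    {"name": "Checkout latency", "overall_state": "OK", "tags": ["service:checkout"]},
    {"name": "Gateway 5xx", "overall_state": "Warn", "tags": ["service:api-gateway"]},
]
print(alerting_monitors(sample, ["service:checkout"]))
```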
In AI Chat
Ask metric-related questions:
What's the error rate for service api-gateway in the last hour?
Are there any Datadog monitors currently alerting?
Show me the latency trend for the checkout service
What's the request rate for namespace production?
Enriched Root Cause Analysis
When both Kubernetes and Datadog data are available, OpsWorker can connect a Kubernetes event to the metric trend that led up to it. For example:
- "Pod OOMKilled at 14:23 UTC — Datadog shows memory usage ramping linearly from 12:00 UTC, correlating with a traffic spike visible in request rate metrics"
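One way to recognize a "ramping linearly" pattern like the one in the example is an ordinary least-squares fit over the memory samples before the kill. The sketch below is an assumption about how such a check could look, not OpsWorker's actual analysis. The function name `linear_ramp`, the 0.95 threshold, and the sample values are all hypothetical.

```python
def linear_ramp(samples, min_r_squared=0.95):
    """Fit a line to (timestamp_seconds, value) pairs.

    Returns (slope_per_second, r_squared, is_ramp). `is_ramp` is True when the
    series rises and a straight line explains at least `min_r_squared` of the
    variance.
    """
    n = len(samples)
    if n < 3:
        return (0.0, 0.0, False)
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    s_tt = sum((t - mean_t) ** 2 for t, _ in samples)
    s_vv = sum((v - mean_v) ** 2 for _, v in samples)
    s_tv = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    if s_tt == 0 or s_vv == 0:
        # Identical timestamps or a perfectly flat series: no ramp to report.
        return (0.0, 0.0, False)
    slope = s_tv / s_tt
    r_squared = (s_tv ** 2) / (s_tt * s_vv)
    return (slope, r_squared, slope > 0 and r_squared >= min_r_squared)


# Made-up memory samples (MiB), one per minute, growing steadily.
memory = [(i * 60, 512.0 + 4.0 * i) for i in range(30)]
slope, r_squared, is_ramp = linear_ramp(memory)
```

A steady positive slope with a high R² points toward a leak or unbounded growth. A sudden step would fit the line poorly and suggest a different cause.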
Next Steps
- Datadog Integration Setup — Configure Datadog
- Multi-Agent Workflows — How agents work together