Datadog AI Agent

Overview

The Datadog AI Agent queries your Datadog account for metrics, traces, and monitor status during investigations and chat sessions. It enriches Kubernetes investigation data with application performance monitoring (APM) context.

Requirements

A Datadog integration configured for the cluster, with both an API key and an application key
See Datadog Integration for setup steps

Capabilities

Capability	Description
Query metrics	Retrieve time-series metric data (CPU, memory, request rate, error rate)
List monitors	Check which Datadog monitors are in alert, warn, or OK state
Search traces	Find APM traces for specific services and time ranges
Get dashboard data	Retrieve data from Datadog dashboards
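
For orientation, the "Query metrics" capability corresponds roughly to a request against Datadog's public v1 timeseries query endpoint. The sketch below only builds such a request with the Python standard library and does not send it. How the agent issues the call internally is not described here, so the function name `build_metric_query`, the example metric query, and the placeholder keys are illustrative assumptions.

```python
import time
import urllib.parse
import urllib.request


def build_metric_query(query, api_key, app_key, window_seconds=3600,
                       site="datadoghq.com", now=None):
    """Build (but do not send) a Datadog v1 timeseries query request.

    Hypothetical helper for illustration; not part of any Datadog or OpsWorker SDK.
    """
    end = int(now if now is not None else time.time())
    start = end - window_seconds
    # The endpoint takes the window as epoch seconds plus a metric query string.
    params = urllib.parse.urlencode({"from": start, "to": end, "query": query})
    url = f"https://api.{site}/api/v1/query?{params}"
    return urllib.request.Request(url, headers={
        "DD-API-KEY": api_key,              # API key from the integration
        "DD-APPLICATION-KEY": app_key,      # application key from the integration
        "Accept": "application/json",
    })


# Example query string is an assumption; substitute a metric your account reports.
req = build_metric_query(
    "avg:kubernetes.cpu.usage.total{kube_namespace:production}",
    api_key="<api-key>", app_key="<application-key>", now=1_700_000_000)
print(req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it. The `site` parameter is there because Datadog accounts in other regions use a different API host.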

Use Cases

During Investigations

The Datadog AI Agent adds metric context to Kubernetes investigations:

Correlates application metrics with pod failures (e.g., error rate spike → pod crash)
Checks Datadog monitors related to the alerting resource
Identifies latency patterns that Kubernetes data alone can't show
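
The first kind of correlation above (an error rate spike shortly before a pod crash) can be sketched as a simple threshold check. This is a minimal illustration and not the agent's actual logic: the function name `spike_before`, the 15-minute lookback, the 3x baseline factor, and the sample data are all assumptions.

```python
from statistics import mean


def spike_before(points, event_ts, lookback=900, factor=3.0):
    """Return the first (ts, value) within `lookback` seconds before `event_ts`
    whose value exceeds `factor` times the mean of earlier samples, else None.

    `points` is a list of (epoch_seconds, value) pairs sorted by time.
    """
    baseline = [v for ts, v in points if ts < event_ts - lookback]
    window = [(ts, v) for ts, v in points
              if event_ts - lookback <= ts <= event_ts]
    if not baseline or not window:
        return None  # not enough data to compare against
    threshold = factor * mean(baseline)
    for ts, v in window:
        if v > threshold:
            return ts, v
    return None


# Made-up error rate (errors/s) sampled every 5 minutes; the pod crashes at t=3000.
error_rate = [(0, 1.0), (300, 1.2), (600, 0.8), (900, 1.0), (1200, 1.1),
              (1500, 0.9), (1800, 1.0), (2100, 1.0), (2400, 4.5), (2700, 9.0)]
print(spike_before(error_rate, event_ts=3000))  # (2400, 4.5)
```

A fixed multiple of the baseline is the simplest possible rule; it is used here only to make the idea concrete.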

In AI Chat

Ask metric-related questions:

What's the error rate for service api-gateway in the last hour?

Are there any Datadog monitors currently alerting?

Show me the latency trend for the checkout service

What's the request rate for namespace production?
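
As an illustration of what sits behind a question such as "Are there any Datadog monitors currently alerting?": Datadog's monitor API returns each monitor with an `overall_state` field, so the answer amounts to filtering on that field. The sketch below filters a list that has already been fetched. The helper name `alerting_monitors` and the sample monitors are made up for the example.

```python
ALERTING_STATES = {"Alert", "Warn"}


def alerting_monitors(monitors, states=ALERTING_STATES):
    """Return (name, state) pairs for monitors whose overall_state is in `states`."""
    return [(m["name"], m["overall_state"])
            for m in monitors
            if m.get("overall_state") in states]


# Hypothetical sample shaped like entries from Datadog's monitor list response.
sample = [
    {"id": 1, "name": "checkout p95 latency", "overall_state": "Alert"},
    {"id": 2, "name": "api-gateway error rate", "overall_state": "OK"},
    {"id": 3, "name": "node disk usage", "overall_state": "Warn"},
]
print(alerting_monitors(sample))
# [('checkout p95 latency', 'Alert'), ('node disk usage', 'Warn')]
```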

Enriched Root Cause Analysis

When both Kubernetes and Datadog data are available, OpsWorker can connect a Kubernetes event to the metric trends that led up to it, for example:

"Pod OOMKilled at 14:23 UTC — Datadog shows memory usage ramping linearly from 12:00 UTC, correlating with a traffic spike visible in request rate metrics"
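
One way to check a claim like "memory usage ramping linearly" is an ordinary least-squares fit over the memory series: a steady positive slope with a high R² suggests a linear ramp, and the fitted line gives a rough estimate of when a limit would be reached. The sketch below is an assumption about how such a check could look, not OpsWorker's implementation. The memory samples and the 512 MiB limit are invented.

```python
def linear_fit(points):
    """Least-squares fit y = slope * x + intercept; returns (slope, intercept, r2)."""
    n = len(points)
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two samples with distinct x values")
    sxy = sum((x - mx) * (y - my) for x, y in points)
    syy = sum((y - my) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = my - slope * mx
    # R² of a simple linear fit; a flat series is treated as a perfect fit.
    r2 = (sxy * sxy) / (sxx * syy) if syy else 1.0
    return slope, intercept, r2


# Made-up memory usage (MiB) by minutes elapsed since the start of the ramp.
memory = [(0, 200), (30, 262), (60, 318), (90, 381), (120, 440)]
slope, intercept, r2 = linear_fit(memory)

limit_mib = 512  # hypothetical container memory limit
minutes_to_limit = (limit_mib - intercept) / slope
print(f"slope={slope:.2f} MiB/min, r2={r2:.4f}, "
      f"limit reached after ~{minutes_to_limit:.0f} min")
```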

Next Steps

Datadog Integration Setup — Configure Datadog
Multi-Agent Workflows — How agents work together
