
AI Agents

OpsWorker uses specialized AI agents that each understand a specific data source. During investigations and free-form AI Chat sessions, these agents gather context and, in the case of source control, can propose changes.

Each agent is gated by one or more capability tokens: cluster (the in-cluster Kubernetes Agent), sourcecontrol (GitHub OR GitLab), and metrics (Grafana MCP). An agent that is missing a required capability is shown as unavailable with its missing capabilities listed.
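
The mapping from connected integrations to capability tokens can be sketched as follows. This is a minimal illustration, not OpsWorker's implementation: the integration names (`kubernetes_agent`, `github`, `gitlab`, `grafana_mcp`) and both function names are hypothetical. Only the three capability tokens and the "GitHub OR GitLab" rule come from the text above.

```python
# Hypothetical integration identifiers; the real ones may differ.
# A capability is granted when at least one of its providers is connected.
CAPABILITY_PROVIDERS = {
    "cluster": {"kubernetes_agent"},
    "sourcecontrol": {"github", "gitlab"},  # either provider is enough
    "metrics": {"grafana_mcp"},
}


def granted_capabilities(connected):
    """Return the capability tokens satisfied by the connected integrations."""
    connected = set(connected)
    return {
        capability
        for capability, providers in CAPABILITY_PROVIDERS.items()
        if providers & connected
    }


def missing_capabilities(required, connected):
    """Return the required capabilities that are not granted, sorted for display."""
    return sorted(set(required) - granted_capabilities(connected))
```

For example, `missing_capabilities(["cluster", "metrics"], ["kubernetes_agent"])` would report that only `metrics` is missing.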

Free-Form AI Chat Agent Registry

Free-form AI Chat exposes six agents:

| Agent | What it does | Required capability |
| --- | --- | --- |
| Investigate Issue (investigate) | Runs a focused investigation on a problem you describe | cluster |
| Validate Resources (validate_resources) | Checks configuration correctness (selectors, ports, limits) | cluster |
| Check Dependencies (check_dependencies) | Examines service-to-service dependencies and health | cluster |
| Analyze Logs (analyze_logs) | Searches and interprets logs for error patterns | cluster |
| Source Code & Repository Agent (source_code) | Correlates code changes and opens PRs/MRs with fixes | sourcecontrol (GitHub OR GitLab) |
| Resource Optimizer (resource_optimizer) | Right-sizes CPU/memory requests, memory limits, and HPA, and can open a PR/MR | cluster + metrics |
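
One way to picture the registry is as data plus an availability check. The sketch below is an assumption about shape only: the `ChatAgent` class, the `REGISTRY` list, and the `availability` function are hypothetical names, while the agent keys and required capabilities are taken from the registry above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChatAgent:
    key: str             # identifier shown in parentheses in the registry
    requires: frozenset  # capability tokens the agent needs


# Agent keys and requirements as listed in the registry; the structure is illustrative.
REGISTRY = [
    ChatAgent("investigate", frozenset({"cluster"})),
    ChatAgent("validate_resources", frozenset({"cluster"})),
    ChatAgent("check_dependencies", frozenset({"cluster"})),
    ChatAgent("analyze_logs", frozenset({"cluster"})),
    ChatAgent("source_code", frozenset({"sourcecontrol"})),
    ChatAgent("resource_optimizer", frozenset({"cluster", "metrics"})),
]


def availability(granted):
    """Map each agent key to its availability and its missing capabilities."""
    granted = set(granted)
    report = {}
    for agent in REGISTRY:
        missing = sorted(agent.requires - granted)
        report[agent.key] = {"available": not missing, "missing": missing}
    return report
```

With only `cluster` granted, the four cluster-only agents would be available, `source_code` would list `sourcecontrol` as missing, and `resource_optimizer` would list `metrics`.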

These chat agents are distinct from the investigation graph nodes (static_extraction, topology_validation, dependency_extraction, investigation, analysis). See How Investigations Work.

Resource Optimizer

The Resource Optimizer uses live utilization metrics (via the metrics capability / Grafana MCP) to recommend right-sized CPU requests, memory requests, memory limits, and HPA settings. It never recommends a CPU limit (it may recommend removing one), and it no-ops when current settings are already within roughly 15% of its recommendation. After you confirm, it can open a GitHub PR or GitLab MR with the manifest change (a draft if confidence is not high) for human review.
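
The no-op rule and the draft behavior described above can be sketched as follows. Several details are assumptions made for illustration: the field names, the use of relative difference against the recommendation as the meaning of "within roughly 15%", the per-field comparison, and the `"high"` confidence label. HPA settings are left out to keep the sketch short.

```python
NOOP_TOLERANCE = 0.15  # the "roughly 15%" from the text; exact value and formula are assumed

# Settings the optimizer may recommend. There is deliberately no CPU limit field.
TUNABLE_FIELDS = ("cpu_request_m", "memory_request_mib", "memory_limit_mib")


def within_tolerance(current, recommended, tolerance=NOOP_TOLERANCE):
    """True when current is within `tolerance` of recommended (relative difference)."""
    if recommended == 0:
        return current == 0
    return abs(current - recommended) / recommended <= tolerance


def plan_change(current, recommended, confidence):
    """Return None for a no-op, else the fields to change and whether to open a draft."""
    changes = {
        field: recommended[field]
        for field in TUNABLE_FIELDS
        if not within_tolerance(current[field], recommended[field])
    }
    if not changes:
        return None  # everything is already close enough to the recommendation
    # A draft PR/MR is used whenever confidence is anything other than high.
    return {"changes": changes, "draft": confidence != "high"}
```

Under these assumptions, a workload requesting 500m CPU with a recommendation of 450m is left alone, while a recommendation of 250m produces a change that still waits for your confirmation before any PR or MR is opened.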


☸️ Kubernetes Agent

Read-only queries for pod status, events, logs, configurations, and resource topology via the in-cluster agent. Core resource types are built in; broader types are reached through Kubernetes MCP.

Read more →

🔀 Source Code & Repository Agent

Correlates commits and PRs with incidents and can open human-reviewed pull/merge requests with fixes (the only write path in OpsWorker).

Read more →

📈 Metrics & Grafana Capability

Grafana MCP tools for PromQL metrics, LogQL log search, dashboards, and alert rules. These tools are used mainly by the Resource Optimizer and for enriching investigations.

Read more →

📐 Resource Optimizer

Right-sizes CPU/memory requests, memory limits, and HPA from live metrics, then opens a PR/MR after your confirmation. Never sets a CPU limit.

Read more →