Investigations

What is an Investigation

An investigation is the core unit of work in OpsWorker. It represents a complete AI-powered analysis of an alert — from receiving the signal to delivering a root cause analysis with remediation steps.

Each investigation contains:

  • Trigger — The alert that started the investigation
  • Topology — Discovered Kubernetes resources and their relationships
  • Collected data — Logs, events, configurations gathered from the cluster
  • Analysis — AI-generated root cause identification with confidence level
  • Recommendations — Immediate actions and preventive measures, including specific kubectl commands
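The five parts listed above can be pictured as a single record. The following is a minimal sketch only: the class names, field names, and types are assumptions made for illustration and do not reflect OpsWorker's actual data model or API. The alert payload and resource names in the usage lines are made up.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Analysis:
    """AI-generated root cause with a confidence level (hypothetical shape)."""
    root_cause: str
    confidence: str  # assumed free-form label such as "high"; the real scale is not documented here


@dataclass
class Investigation:
    """Illustrative container mirroring the five parts of an investigation."""
    trigger: Dict[str, str]                                   # the alert that started the investigation
    topology: List[str] = field(default_factory=list)         # discovered Kubernetes resources
    collected_data: Dict[str, List[str]] = field(default_factory=dict)  # logs, events, configurations
    analysis: Optional[Analysis] = None                       # filled in once analysis finishes
    recommendations: List[str] = field(default_factory=list)  # actions, including kubectl commands


# Usage with made-up values.
inv = Investigation(trigger={"alertname": "PodCrashLooping", "namespace": "payments"})
inv.topology.append("Deployment/payments-api")
inv.collected_data["events"] = ["Back-off restarting failed container"]
inv.analysis = Analysis(root_cause="Container exits on startup", confidence="high")
inv.recommendations.append("kubectl rollout undo deployment/payments-api -n payments")
```

Keeping `analysis` optional reflects that an investigation exists before its analysis is available.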

Lifecycle

Investigations move through the following states:

stateDiagram-v2
[*] --> Pending: Alert received
Pending --> InProgress: Investigation starts
InProgress --> Completed: Analysis finished
InProgress --> Failed: Error occurred
Completed --> [*]
Failed --> [*]

| State | Description |
| --- | --- |
| Pending | Alert received and investigation queued |
| In Progress | Agents actively investigating: discovering topology, collecting data, analyzing |
| Completed | Root cause identified, recommendations generated, notification sent |
| Failed | Investigation couldn't complete (e.g., cluster connectivity lost, agent unavailable) |
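
The lifecycle can also be read as a small state machine. The sketch below encodes only the transitions shown in the state diagram; the `State` enum, the `TRANSITIONS` table, and both helper functions are hypothetical names chosen for illustration, not part of any OpsWorker API.

```python
from enum import Enum


class State(Enum):
    PENDING = "Pending"
    IN_PROGRESS = "In Progress"
    COMPLETED = "Completed"
    FAILED = "Failed"


# Allowed transitions, taken from the state diagram.
TRANSITIONS = {
    State.PENDING: {State.IN_PROGRESS},
    State.IN_PROGRESS: {State.COMPLETED, State.FAILED},
    State.COMPLETED: set(),  # terminal
    State.FAILED: set(),     # terminal
}


def can_transition(current: State, target: State) -> bool:
    """Return True if the diagram allows moving from current to target."""
    return target in TRANSITIONS[current]


def is_terminal(state: State) -> bool:
    """A state is terminal when it has no outgoing transitions."""
    return not TRANSITIONS[state]
```

For example, `can_transition(State.PENDING, State.COMPLETED)` is false under this table, because an investigation must pass through In Progress before it can complete or fail.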

Most investigations complete in under 2 minutes from alert arrival to Slack notification.

Viewing Investigations

Investigations are available in two places:

  • OpsWorker Portal — Full investigation detail with topology view, collected data, AI analysis, recommendations, and conversation log
  • Slack — Summary notification with root cause, recommendations, and feedback buttons

Interacting with Investigations

After an investigation completes, you can:

  • Chat — Ask follow-up questions about the investigation ("Why did this specific pod crash?", "Show me the relevant logs")
  • Provide feedback — Rate the investigation quality (Accurate, Partially Accurate, Needs Improvement) to help improve future investigations
  • Share — Link to the investigation in the portal for team review

Investigation Types

| Type | Trigger | Description |
| --- | --- | --- |
| Automatic | Alert matches an alert rule | No manual intervention; fully automated |
| Test | "Test Integration" button | Synthetic alert to verify setup |
| Chat-initiated | AI Chat query | Started from a user question in the chat interface |
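
As a rough illustration of how the three types relate to their triggers, the sketch below maps a trigger source to an investigation type. The source labels (`alert_rule_match`, `test_integration_button`, `ai_chat_query`) and the `classify` function are invented for this example; OpsWorker's real identifiers are not documented here.

```python
from enum import Enum


class InvestigationType(Enum):
    AUTOMATIC = "Automatic"
    TEST = "Test"
    CHAT_INITIATED = "Chat-initiated"


# Hypothetical trigger-source labels, one per type described in this section.
TYPE_BY_SOURCE = {
    "alert_rule_match": InvestigationType.AUTOMATIC,
    "test_integration_button": InvestigationType.TEST,
    "ai_chat_query": InvestigationType.CHAT_INITIATED,
}


def classify(source: str) -> InvestigationType:
    """Map a trigger source label to its investigation type."""
    try:
        return TYPE_BY_SOURCE[source]
    except KeyError:
        raise ValueError(f"unknown trigger source: {source!r}") from None
```

Raising on an unknown source, rather than defaulting to Automatic, keeps a mislabeled trigger from being silently treated as a real alert.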

Next Steps