
Data Processing

Overview

All data processing happens in OpsWorker's cloud — not in your cluster. The in-cluster agent only collects raw data; analysis is performed by the investigation engine.

Processing Pipeline

1. Field Extraction

  • Regex-based (fast): Parses alert labels and annotations for namespace, pod, and severity
  • AI-based (fallback): For non-standard alert formats, an LLM extracts fields from free-text descriptions
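The fast path can be sketched roughly as follows; the label names and regex patterns are illustrative, not OpsWorker's actual implementation:

```python
import re

# Sketch of the regex fast path (patterns and field names are invented
# for illustration).
FIELD_PATTERNS = {
    "namespace": re.compile(r'namespace="([^"]+)"'),
    "pod": re.compile(r'pod="([^"]+)"'),
    "severity": re.compile(r'severity="([^"]+)"'),
}

def extract_fields(alert_text: str) -> dict:
    """Return every field the regexes can find; an empty or partial
    result is the cue to fall back to AI-based extraction."""
    fields = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(alert_text)
        if match:
            fields[name] = match.group(1)
    return fields

alert = 'ALERTS{namespace="payments", pod="api-7d9f", severity="critical"}'
print(extract_fields(alert))
```

A standard Prometheus-style alert matches the regexes in one pass; free-text alerts return a partial dictionary, which is why the AI fallback exists.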

2. Topology Construction

  • Builds a dependency graph from discovered resources
  • Breadth-first traversal: Pod → Service → Ingress, Pod → ReplicaSet → Deployment
  • Identifies which resources may contain the root cause
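A minimal sketch of the traversal, using an invented dependency graph (the resource names and edges are hypothetical):

```python
from collections import deque

# Toy dependency graph; in practice the edges come from discovered
# resources (selectors, owner references, ingress backends).
EDGES = {
    "Pod/api": ["Service/api", "ReplicaSet/api"],
    "Service/api": ["Ingress/api"],
    "ReplicaSet/api": ["Deployment/api"],
}

def related_resources(start: str) -> list:
    """Breadth-first traversal from the alerting resource; every
    reachable resource is a candidate location for the root cause."""
    seen, order, queue = {start}, [], deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbor in EDGES.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return order

print(related_resources("Pod/api"))
```

Breadth-first order means the resources closest to the alerting Pod are examined first, which tends to surface the root cause with the least work.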

3. Configuration Validation

Automated checks on resource configurations:

Check                 Description
--------------------  -------------------------------------------------
Reference integrity   Do service selectors match pod labels?
Contract matching     Do service ports align with container ports?
Fitness checks        Are resource limits reasonable for the workload?

4. Issue Classification

Categorizes the problem:

  • Configuration issue — Mismatched selectors, incorrect limits, missing env vars
  • Runtime issue — Application crash, memory leak, external dependency failure
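One way to picture the classification step is a rule over the collected signals; the signal keys below are invented for illustration and are not OpsWorker's schema:

```python
def classify(signals: dict) -> str:
    """Toy classifier: configuration issues leave fingerprints in the
    validation checks, runtime issues in crash/OOM events."""
    if signals.get("failed_config_checks"):
        return "configuration"
    if signals.get("crash_events") or signals.get("oom_killed"):
        return "runtime"
    return "unknown"

print(classify({"failed_config_checks": ["selector mismatch"]}))
print(classify({"crash_events": 3}))
```

The classification matters downstream: configuration issues get manifest-level fixes, runtime issues get operational ones.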

5. Root Cause Analysis

Multi-model AI analyzes all data:

  • Correlates signals across logs, events, and configurations
  • Identifies the underlying cause (not just the symptom)
  • Assesses confidence level based on evidence strength
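The correlation idea can be illustrated with a toy example: signals from independent sources (logs, events, configuration) are grouped by resource, and the resource corroborated by the most source types becomes the strongest candidate. All data and thresholds below are invented:

```python
from collections import defaultdict

# Invented signals: (source, resource, detail).
signals = [
    ("log", "Pod/api", "OOMKilled"),
    ("event", "Pod/api", "BackOff"),
    ("config", "Pod/api", "memory limit 64Mi"),
    ("event", "Service/api", "EndpointsMissing"),
]

# Group independent source types by resource.
by_resource = defaultdict(set)
for source, resource, _detail in signals:
    by_resource[resource].add(source)

# The resource with the most corroborating source types wins;
# confidence grows with the number of independent sources.
candidate = max(by_resource, key=lambda r: len(by_resource[r]))
confidence = {3: "high", 2: "medium"}.get(len(by_resource[candidate]), "low")
print(candidate, confidence)
```

Here Pod/api is corroborated by logs, events, and configuration, while Service/api has only a single event, so the Pod is flagged as the likely root cause with high confidence.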

6. Recommendation Generation

Produces:

  • Root cause statement with supporting evidence
  • Immediate actions with specific kubectl commands
  • Preventive measures for long-term fixes
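The output can be pictured as a structured record with those three parts; the field names and example contents below are illustrative, not OpsWorker's actual schema:

```python
from dataclasses import dataclass

# Hypothetical shape for a generated recommendation.
@dataclass
class Recommendation:
    root_cause: str            # statement with supporting evidence
    evidence: list
    immediate_actions: list    # concrete kubectl commands
    preventive_measures: list  # long-term fixes

rec = Recommendation(
    root_cause="Pod OOMKilled: memory limit is below observed usage",
    evidence=["repeated OOMKilled events", "limit 64Mi vs. 190Mi peak usage"],
    immediate_actions=[
        "kubectl -n payments set resources deploy/api --limits=memory=256Mi",
    ],
    preventive_measures=["alert when memory usage exceeds 80% of the limit"],
)
print(rec.root_cause)
```

Keeping the evidence alongside the root cause statement lets an operator verify the reasoning before running any of the suggested commands.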

Multi-Model Strategy

OpsWorker uses different AI models optimized for each stage:

Stage                       Model Type          Optimization
--------------------------  ------------------  ---------------------------------------------------------
Field extraction            Fast (Amazon Nova)  Speed — extract fields in milliseconds
Analysis & recommendations  Reasoning (Claude)  Depth — complex correlation and root cause identification
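Conceptually, this is a routing table from pipeline stage to model; the identifiers below are placeholders, not real model names or API parameters:

```python
# Hypothetical stage-to-model routing table.
MODEL_FOR_STAGE = {
    "field_extraction": "fast-model",   # latency-optimized
    "analysis": "reasoning-model",      # depth-optimized
    "recommendation": "reasoning-model",
}

def pick_model(stage: str) -> str:
    """Route each stage to its model, defaulting to the reasoning
    model for anything unrecognized."""
    return MODEL_FOR_STAGE.get(stage, "reasoning-model")

print(pick_model("field_extraction"))
```

The design trade-off: the fast model runs on every incoming alert, so it must be cheap; the reasoning model runs once per investigation, so depth matters more than latency.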

Next Steps