Data Collection
Overview
During an investigation, OpsWorker collects data from your Kubernetes cluster through the in-cluster agent. All collection is read-only and scoped to the resources relevant to the investigation.
What Is Collected
| Data Type | Source | Purpose |
|---|---|---|
| Pod status | kubectl get pod | Current state, restart counts, container statuses |
| Pod logs | kubectl logs | Application errors, stack traces, crash output |
| Kubernetes events | kubectl get events | Scheduling, state transitions, errors, warnings |
| Deployment specs | kubectl get deployment | Replica count, strategy, resource limits |
| Service specs | kubectl get service | Selectors, ports, type |
| Endpoints | kubectl get endpoints | Ready and not-ready backend addresses |
| Ingress rules | kubectl get ingress | Routing rules, TLS configuration |
| ConfigMap contents | kubectl get configmap | Application configuration |
| Secret metadata | kubectl get secret | Names, labels, annotations only (values are never read) |
| Node status | kubectl get node | Conditions, capacity, allocatable resources |
How Collection Works
1. The investigation engine determines what data is needed based on the alert and discovered topology
2. Commands are sent to the in-cluster agent via SQS
3. The agent executes the kubectl-equivalent query against the Kubernetes API
4. Results are returned via SQS to the investigation engine
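
The round trip described above can be sketched as follows. This is a minimal illustration under stated assumptions, not OpsWorker's actual implementation: two in-memory `queue.Queue` objects stand in for the SQS queues, and the names `Command`, `READ_ONLY_VERBS`, `fake_kube_api`, and `agent_step` are hypothetical.

```python
import json
import queue
from dataclasses import dataclass, asdict

# Hypothetical allowlist: the agent in this sketch accepts only read-only verbs.
READ_ONLY_VERBS = {"get", "list", "logs"}


@dataclass
class Command:
    verb: str        # e.g. "get"
    resource: str    # e.g. "pod"
    namespace: str
    name: str


def fake_kube_api(cmd: Command) -> dict:
    """Stand-in for a read-only Kubernetes API query (assumption for this sketch)."""
    return {"kind": cmd.resource, "namespace": cmd.namespace,
            "name": cmd.name, "status": "Running"}


def agent_step(commands: queue.Queue, results: queue.Queue) -> None:
    """Take one command from the inbound queue, run it, and publish the result."""
    cmd = Command(**json.loads(commands.get()))
    if cmd.verb not in READ_ONLY_VERBS:
        # Anything that could modify the cluster is rejected, never executed.
        results.put(json.dumps({"error": f"verb not allowed: {cmd.verb}"}))
        return
    results.put(json.dumps({"result": fake_kube_api(cmd)}))


# Engine side: decide what is needed, send the command, read the reply.
commands, results = queue.Queue(), queue.Queue()
commands.put(json.dumps(asdict(Command("get", "pod", "prod", "api-7d9f"))))
agent_step(commands, results)
reply = json.loads(results.get())
```

Serializing commands and results as JSON keeps the sketch close to a message-queue exchange, where both sides see only message bodies and share no state.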
What Is NOT Collected
- Secret values: Only metadata (names, labels, annotations) is accessed
- Container filesystem: No exec or file access inside containers
- Network traffic: No packet capture or network monitoring
- Metrics: Kubernetes metrics are not collected directly; use the Datadog or Grafana integration for metrics
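
As an illustration of the Secret rule above, one way to keep only metadata is to copy an allowlist of fields and drop everything else. This is a hedged sketch rather than the agent's real code: `secret_metadata_only` and `KEPT_METADATA_FIELDS` are hypothetical names, and the input is a plain dictionary shaped like a Kubernetes Secret object.

```python
from typing import Any

# Hypothetical allowlist matching the metadata named in this document.
KEPT_METADATA_FIELDS = ("name", "labels", "annotations")


def secret_metadata_only(secret: dict[str, Any]) -> dict[str, Any]:
    """Return a copy with allowlisted metadata only; `data` and `stringData` are dropped."""
    metadata = secret.get("metadata", {})
    return {
        "kind": "Secret",
        "metadata": {k: metadata[k] for k in KEPT_METADATA_FIELDS if k in metadata},
    }


# Example input with made-up values.
secret = {
    "kind": "Secret",
    "metadata": {"name": "db-credentials", "labels": {"app": "api"}},
    "data": {"password": "c2VjcmV0"},
}
safe = secret_metadata_only(secret)
```

Building the result from an allowlist, instead of deleting known sensitive keys, means that any field not explicitly named is excluded by default.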
Data Minimization
OpsWorker collects only the data relevant to the specific investigation:
- Only resources in the discovered topology are queried
- Log collection is limited to recent entries
- Collection stops once sufficient data is gathered for analysis
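
The three rules above can be sketched together in a few lines. This is an assumption-laden illustration, not OpsWorker's implementation: the function names, the `MAX_LOG_LINES` value, and the `is_sufficient` callback are all hypothetical, and this document does not specify the actual log limit or stopping criterion.

```python
from collections import deque
from typing import Callable, Iterable

# Hypothetical limit; the real value is not stated in this document.
MAX_LOG_LINES = 200


def in_topology(resources: Iterable[str], topology: set[str]) -> list[str]:
    """Keep only resources that belong to the discovered topology."""
    return [r for r in resources if r in topology]


def recent_lines(lines: Iterable[str], limit: int = MAX_LOG_LINES) -> list[str]:
    """Keep only the last `limit` log lines."""
    return list(deque(lines, maxlen=limit))


def collect(resources: Iterable[str], topology: set[str],
            fetch: Callable[[str], list[str]],
            is_sufficient: Callable[[dict], bool]) -> dict:
    """Query topology resources one at a time and stop once there is enough data."""
    collected: dict[str, list[str]] = {}
    for resource in in_topology(resources, topology):
        collected[resource] = fetch(resource)
        if is_sufficient(collected):
            break
    return collected


# Example with made-up resources: "pod/unrelated" is outside the topology.
topology = {"pod/api", "service/api", "deployment/api"}
candidates = ["pod/api", "pod/unrelated", "service/api", "deployment/api"]
data = collect(
    candidates,
    topology,
    fetch=lambda r: recent_lines((f"{r} line {i}" for i in range(1000)), limit=3),
    is_sufficient=lambda c: len(c) >= 2,
)
```

In this example, collection skips the unrelated pod, keeps only the last three log lines per resource, and stops after two resources, so `deployment/api` is never queried.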
Data in Transit
All data exchanged between the agent and the OpsWorker cloud is transmitted through Amazon SQS over TLS-encrypted connections.
Next Steps
- Data Processing: How collected data is analyzed
- Isolation & Encryption: Data protection details