
Troubleshooting Data Collection

Investigations Return Incomplete Data

Check RBAC Permissions

The most common cause of incomplete data is insufficient RBAC permissions. Verify the agent can access the affected namespace:

kubectl auth can-i get pods \
-n TARGET_NAMESPACE \
--as=system:serviceaccount:opsworker:opsworker-agent

List every permission the agent has in the namespace:

kubectl auth can-i --list \
-n TARGET_NAMESPACE \
--as=system:serviceaccount:opsworker:opsworker-agent

If access is denied, update the RBAC configuration. See RBAC.

Check Namespace Scope

If the agent is configured with namespace-scoped access (Role/RoleBinding instead of ClusterRole/ClusterRoleBinding), ensure the affected namespace has the appropriate Role and RoleBinding.
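One way to add the missing Role and RoleBinding is sketched below. The helper name print_agent_role and the Role name opsworker-agent-read are assumptions made for illustration; if the RBAC page prescribes its own manifests or names, use those instead. The function only prints YAML (client-side dry run), so nothing changes until the output is reviewed and applied.

```shell
# Hypothetical helper: print (but do not apply) a Role and RoleBinding
# that give the agent read access in a single namespace.
# Usage: print_agent_role TARGET_NAMESPACE
print_agent_role() {
  local ns="$1"
  # Assumed name; not taken from the product's RBAC documentation.
  local name="opsworker-agent-read"

  # "kubectl create role" grants every verb on every listed resource,
  # so this also grants "list" on pods/log, slightly more than required.
  kubectl create role "$name" -n "$ns" \
    --verb=get,list \
    --resource=pods,pods/log,events,services,deployments.apps,ingresses.networking.k8s.io \
    --dry-run=client -o yaml

  echo "---"

  # Bind the Role to the agent's service account (namespace:name).
  kubectl create rolebinding "$name" -n "$ns" \
    --role="$name" \
    --serviceaccount=opsworker:opsworker-agent \
    --dry-run=client -o yaml
}
```

For example, print_agent_role TARGET_NAMESPACE > agent-role.yaml writes the manifests to a file that can be reviewed and then applied with kubectl apply -f agent-role.yaml.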

Pod Logs Are Empty

Container Restarted

If a container restarted, its current logs may be empty. Check previous container logs:

kubectl logs POD_NAME -n NAMESPACE --previous
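For pods with several containers, it can help to see which container actually restarted before pulling previous logs. The following is a minimal sketch; the function name previous_logs and the 50-line tail are arbitrary choices, not part of the product.

```shell
# Print per-container restart counts, then the previous logs of every
# container that has restarted at least once.
# Usage: previous_logs POD_NAME NAMESPACE
previous_logs() {
  local pod="$1" ns="$2" name count

  # Emit one "<container> <restartCount>" line per container.
  kubectl get pod "$pod" -n "$ns" \
    -o jsonpath='{range .status.containerStatuses[*]}{.name}{" "}{.restartCount}{"\n"}{end}' |
  while read -r name count; do
    echo "container=$name restarts=$count"
    if [ "${count:-0}" -gt 0 ]; then
      # --previous reads the log of the last terminated instance.
      kubectl logs "$pod" -n "$ns" -c "$name" --previous --tail=50
    fi
  done
}
```

Containers with a restart count of 0 have no previous instance, so the sketch skips them rather than letting kubectl logs --previous fail.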

Log Retention

Kubernetes keeps logs only for a container's current instance and its most recently terminated one. When a pod is deleted, its logs are deleted with it, so a recreated pod starts with no log history.

For best results, investigate alerts promptly: log data is most complete shortly after the alert fires.

RBAC for Logs

Reading logs requires the get verb on the pods/log subresource, which is granted separately from access to pods. Verify that the agent has it:

kubectl auth can-i get pods/log \
-n TARGET_NAMESPACE \
--as=system:serviceaccount:opsworker:opsworker-agent

Events Are Missing

Event Retention

By default, the Kubernetes API server retains events for 1 hour (the kube-apiserver --event-ttl setting). If the investigation runs long after the alert fired, relevant events may already have expired.

For best results, configure auto-investigation so investigations start immediately when alerts arrive.
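To see which events are still retained for an object, they can be listed directly. This is a small sketch; the function name recent_events is hypothetical, and OBJECT_NAME stands for the pod or other resource named in the alert.

```shell
# List the events still retained for one object, oldest first.
# Usage: recent_events NAMESPACE OBJECT_NAME
recent_events() {
  local ns="$1" object="$2"
  # involvedObject.name filters events to those about the given object.
  kubectl get events -n "$ns" \
    --field-selector "involvedObject.name=$object" \
    --sort-by=.metadata.creationTimestamp
}
```

If the list is empty or starts well after the alert time, the earlier events have most likely expired and cannot be recovered from the API server.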

Check Event Access

Verify that the agent can read events in the affected namespace:

kubectl auth can-i get events \
-n TARGET_NAMESPACE \
--as=system:serviceaccount:opsworker:opsworker-agent

Specific Resource Types Missing

Missing Data | Required RBAC                                  | Fix
-------------|------------------------------------------------|---------------------------
Pod details  | get/list on pods                               | Add to Role/ClusterRole
Logs         | get on pods/log                                | Add pods/log resource
Events       | get/list on events                             | Add to Role/ClusterRole
Services     | get/list on services                           | Add to Role/ClusterRole
Ingresses    | get/list on ingresses in networking.k8s.io     | Add apiGroup and resource
Deployments  | get/list on deployments in apps                | Add apiGroup and resource
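The permissions listed in this section can be checked in one pass rather than one command at a time. The sketch below assumes the same service account used in the earlier commands; the function name check_agent_rbac is hypothetical.

```shell
# Check each permission from this section for the agent's service account
# and report the ones that are missing.
# Usage: check_agent_rbac TARGET_NAMESPACE
check_agent_rbac() {
  local ns="$1"
  local sa="system:serviceaccount:opsworker:opsworker-agent"
  local missing=0 entry verb resource answer

  # verb:resource pairs; API groups use kubectl's resource.group form.
  for entry in \
    get:pods list:pods \
    get:pods/log \
    get:events list:events \
    get:services list:services \
    get:ingresses.networking.k8s.io list:ingresses.networking.k8s.io \
    get:deployments.apps list:deployments.apps
  do
    verb="${entry%%:*}"
    resource="${entry#*:}"
    # can-i prints "yes" or "no" and exits non-zero on "no", hence "|| true".
    answer="$(kubectl auth can-i "$verb" "$resource" -n "$ns" --as="$sa" 2>/dev/null || true)"
    if [ "$answer" != "yes" ]; then
      echo "MISSING: $verb $resource"
      missing=$((missing + 1))
    fi
  done

  echo "$missing permission(s) missing in namespace $ns"
  # Succeed only when nothing is missing, so the function works in scripts.
  [ "$missing" -eq 0 ]
}
```

Running check_agent_rbac TARGET_NAMESPACE prints one MISSING line per denied permission; each line corresponds to a row in the table whose fix should be applied.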

Next Steps