Troubleshooting Data Collection
Investigations Return Incomplete Data
Check RBAC Permissions
The most common cause of incomplete data is insufficient RBAC permissions. Verify that the agent's service account can read pods in the affected namespace:
kubectl auth can-i get pods \
-n TARGET_NAMESPACE \
--as=system:serviceaccount:opsworker:opsworker-agent
List every permission the agent has in that namespace:
kubectl auth can-i --list \
-n TARGET_NAMESPACE \
--as=system:serviceaccount:opsworker:opsworker-agent
If a check returns no, or an expected resource is absent from the list, update the RBAC configuration. See RBAC.
Check Namespace Scope
If the agent is configured with namespace-scoped access (Role/RoleBinding instead of ClusterRole/ClusterRoleBinding), ensure the affected namespace has the appropriate Role and RoleBinding.
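One way to confirm that a namespace has a binding for the agent is to list its RoleBindings and filter for the service account. This is a minimal sketch: the function name agent_rolebindings is hypothetical, and it assumes the wide output prints service account subjects as namespace/name (here opsworker/opsworker-agent).

```shell
# List RoleBindings in a namespace that mention the agent's service account.
# Assumption: `-o wide` shows service account subjects as namespace/name.
agent_rolebindings() {
  ns="$1"
  sa="${2:-opsworker/opsworker-agent}"
  # --no-headers keeps only data rows, so grep's exit status reflects a real match
  kubectl get rolebindings -n "$ns" -o wide --no-headers | grep -F "$sa"
}
```

For example, agent_rolebindings TARGET_NAMESPACE || echo "no RoleBinding found for the agent" prints a hint when the namespace has no matching binding.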
Pod Logs Are Empty
Container Restarted
After a container restarts, the logs of the new instance may be empty. Check the logs of the previous instance:
kubectl logs POD_NAME -n NAMESPACE --previous
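The two lookups can be combined so that the previous instance is consulted only when the current one has produced no output. The sketch below is illustrative; logs_with_fallback is a hypothetical helper name, not part of the agent or of kubectl.

```shell
# Print current logs; if they are empty, fall back to the previous container instance.
logs_with_fallback() {
  pod="$1"
  ns="$2"
  current="$(kubectl logs "$pod" -n "$ns" 2>/dev/null)"
  if [ -n "$current" ]; then
    printf '%s\n' "$current"
  else
    echo "current logs empty, trying --previous" >&2
    kubectl logs "$pod" -n "$ns" --previous
  fi
}
```

Usage: logs_with_fallback POD_NAME NAMESPACE. If the pod has more than one container, kubectl also needs a container name, which this sketch leaves out.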
Log Retention
Kubernetes retains logs only for running and recently terminated containers. If the pod was deleted and recreated, the logs of the old pod are lost.
For best results, investigate alerts promptly: log data is most complete shortly after the alert fires.
RBAC for Logs
The agent needs the get verb on the pods/log subresource. Verify it:
kubectl auth can-i get pods/log \
-n TARGET_NAMESPACE \
--as=system:serviceaccount:opsworker:opsworker-agent
Events Are Missing
Event Retention
Kubernetes retains events for 1 hour by default (the kube-apiserver --event-ttl setting). If the investigation runs long after the alert fired, relevant events may have expired.
For best results, configure auto-investigation so investigations start immediately when alerts arrive.
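To judge whether events from the time of the alert could still exist, it can help to look at the oldest events the cluster still holds for the namespace. A minimal sketch, where oldest_events is a hypothetical helper name and the default of five rows is arbitrary:

```shell
# Show the oldest events still stored for a namespace, oldest first.
oldest_events() {
  ns="$1"
  count="${2:-5}"
  kubectl get events -n "$ns" \
    --sort-by=.metadata.creationTimestamp \
    --no-headers | head -n "$count"
}
```

If the oldest event listed by oldest_events TARGET_NAMESPACE is newer than the alert, events from the time of the alert have probably already expired.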
Check Event Access
Verify that the agent can read events in the affected namespace:
kubectl auth can-i get events \
-n TARGET_NAMESPACE \
--as=system:serviceaccount:opsworker:opsworker-agent
Specific Resource Types Missing
| Missing Data | Required RBAC | Fix |
|---|---|---|
| Pod details | get/list on pods | Add to Role/ClusterRole |
| Logs | get on pods/log | Add pods/log resource |
| Events | get/list on events | Add to Role/ClusterRole |
| Services | get/list on services | Add to Role/ClusterRole |
| Ingresses | get/list on ingresses in networking.k8s.io | Add apiGroup and resource |
| Deployments | get/list on deployments in apps | Add apiGroup and resource |
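The rows of the table can be checked in one pass by looping over each verb and resource pair. This sketch assumes the same service account as the checks above and relies on kubectl auth can-i returning a non-zero exit code when access is denied; missing_permissions is a hypothetical function name.

```shell
AGENT="system:serviceaccount:opsworker:opsworker-agent"

# Print each verb/resource pair from the table that the agent is denied.
missing_permissions() {
  ns="$1"
  for check in \
    "get pods" "list pods" \
    "get pods/log" \
    "get events" "list events" \
    "get services" "list services" \
    "get ingresses.networking.k8s.io" "list ingresses.networking.k8s.io" \
    "get deployments.apps" "list deployments.apps"
  do
    # $check is left unquoted on purpose so it splits into verb and resource
    if ! kubectl auth can-i $check -n "$ns" --as="$AGENT" --quiet; then
      echo "denied: $check"
    fi
  done
}
```

Running missing_permissions TARGET_NAMESPACE prints nothing when every check passes. Each line it does print names a permission to add to the Role or ClusterRole.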
Next Steps
- RBAC Configuration: Fix permission issues
- Connectivity Troubleshooting: Fix connection issues