Troubleshooting
Common Issues
Agent Not Connecting
Symptoms: Cluster shows Disconnected in portal.
- Check pod status:
  kubectl get pods -n opsworker
- Check logs:
  kubectl logs -n opsworker -l app=opsworker-agent
- Verify outbound connectivity to *.amazonaws.com (port 443)
- Check cluster token is correct:
  helm get values opsworker-agent -n opsworker
- Check for NetworkPolicy blocking outbound traffic
See Connectivity Troubleshooting for details.
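The in-cluster checks above can be run in one pass. The following is a minimal sketch, not an official OpsWorker tool: the function name `agent_triage` is hypothetical, and it assumes the `opsworker` namespace, the `app=opsworker-agent` label, and the `opsworker-agent` Helm release named in the steps above. It does not test outbound connectivity, because the exact endpoint is not listed here.

```shell
# Hypothetical helper: runs the in-cluster checks for a Disconnected agent.
# Assumes kubectl and helm are installed and pointed at the affected cluster.
agent_triage() {
  ns="${1:-opsworker}"   # namespace; defaults to the one used in this guide

  echo "== Pod status =="
  kubectl get pods -n "$ns"

  echo "== Recent agent logs (last 50 lines per pod) =="
  kubectl logs -n "$ns" -l app=opsworker-agent --tail=50

  echo "== Helm values (confirm the cluster token) =="
  helm get values opsworker-agent -n "$ns"

  echo "== NetworkPolicies in the agent namespace =="
  kubectl get networkpolicy -n "$ns"
}
```

Paste the function into a shell session and run `agent_triage`. The Helm values output may include the cluster token, so redact it before sharing the output.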
Investigations Not Triggering
Symptoms: Alerts fire but no investigations start.
- Check Alerts in portal — are signals arriving?
- If no signals: verify your webhook URL is correct in the monitoring system
- If signals arrive but no investigations: check Alert Rules — does a matching rule exist with auto-investigation enabled?
- Verify the alert's namespace/severity/labels match rule filters
Investigations Returning Incomplete Results
Symptoms: Investigation completes but data is missing.
- Check agent RBAC permissions for the affected namespace
- Verify with:
  kubectl auth can-i get pods -n NAMESPACE --as=system:serviceaccount:opsworker:opsworker-agent
- Check if the namespace is within the agent's scope
See Data Collection Troubleshooting.
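Missing data often comes from one resource type being denied while others are allowed, so it can help to repeat the `can-i` check across several resources. This is a sketch only: the function name `check_agent_rbac` is hypothetical, and the resource list (pods, deployments, services, events) is an illustrative assumption, not the agent's documented permission set.

```shell
# Hypothetical helper: checks the agent service account's "get" access
# to several resource types in one namespace.
check_agent_rbac() {
  ns="$1"
  sa="system:serviceaccount:opsworker:opsworker-agent"
  denied=0
  # Assumed resource list; adjust it to the data your investigations need.
  for resource in pods deployments services events; do
    # kubectl prints "yes" or "no"; compare the text rather than rely on exit codes.
    answer="$(kubectl auth can-i get "$resource" -n "$ns" --as="$sa" 2>/dev/null || true)"
    if [ "$answer" = "yes" ]; then
      echo "allowed: get $resource in $ns"
    else
      echo "denied: get $resource in $ns"
      denied=1
    fi
  done
  return "$denied"   # non-zero if any check was denied
}
```

For example, `check_agent_rbac NAMESPACE` prints one line per resource and returns a non-zero status if any of them is denied.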
Slack Notifications Not Arriving
Symptoms: Investigation completes but no Slack message.
- Check Slack integration status in Integrations
- Verify notification routing — is a channel configured?
- Check that the OpsWorker bot is in the target Slack channel
- Try disconnecting and reconnecting the Slack integration
Investigation Stuck in "In Progress"
Symptoms: Investigation doesn't complete.
- Check agent connectivity (Disconnected agent blocks data collection)
- Check agent resource limits (OOM kills interrupt investigations)
- If persistent, the investigation may have timed out — check portal for error details
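To confirm whether OOM kills are involved, you can read the last termination reason that Kubernetes records for each agent container. A minimal sketch, assuming the `opsworker` namespace and `app=opsworker-agent` label used elsewhere in this guide; the function name `agent_oom_killed` is hypothetical.

```shell
# Hypothetical helper: reports whether any agent container was last
# terminated with reason OOMKilled.
agent_oom_killed() {
  ns="${1:-opsworker}"
  # lastState.terminated.reason holds the reason for the previous container exit.
  reasons="$(kubectl get pods -n "$ns" -l app=opsworker-agent \
    -o jsonpath='{.items[*].status.containerStatuses[*].lastState.terminated.reason}' \
    2>/dev/null || true)"
  case "$reasons" in
    *OOMKilled*)
      echo "OOMKilled found: review the agent memory limit"
      return 0 ;;
    *)
      echo "no OOMKilled terminations recorded"
      return 1 ;;
  esac
}
```

A return status of 0 means at least one container was OOM-killed. A status of 1 only means no such termination is currently recorded, since the field reflects the most recent restart of each container.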
Portal Access Issues
- Clear browser cache and cookies
- Try a different browser
- Check SSO configuration if using Google or Azure AD sign-in
Getting Help
If these steps don't resolve your issue, contact OpsWorker support with:
- Cluster name and ID
- Agent logs:
  kubectl logs -n opsworker -l app=opsworker-agent
- Description of the issue and steps you've tried
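When `kubectl logs` is used with a label selector, it returns only the last 10 lines per pod by default, which is rarely enough for a support request. The sketch below saves the full logs to a file instead. The function name `collect_agent_logs` and the default output path are hypothetical.

```shell
# Hypothetical helper: writes full agent logs to a file for a support request.
collect_agent_logs() {
  outfile="${1:-opsworker-agent.log}"   # assumed default file name
  # --tail=-1 overrides the 10-line default that applies when -l is used.
  kubectl logs -n opsworker -l app=opsworker-agent --tail=-1 > "$outfile" 2>&1
  echo "Wrote agent logs to $outfile"
}
```

Run `collect_agent_logs` and attach the resulting file along with the cluster name and ID. Review the file for sensitive values before sending it.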