Custom Reasoning Model
The Custom Reasoning Model integration lets you point OpsWorker at your own LLM endpoint. Once connected, all reasoning calls — investigation analysis, chat responses, recommendation generation — go to your model instead of the default managed AWS Bedrock backend.
This is the self-service equivalent of Bring Your Own LLM. BYO LLM covers deployment-level model configuration for dedicated and private cloud installations; Custom Reasoning Model is the SaaS-tier integration that you configure from the portal.
When to Use
- You have specific compliance or data-sovereignty requirements that rule out the default backend.
- You want to evaluate a specific model's reasoning quality on your investigations.
- You already have an LLM endpoint approved by your security team and want OpsWorker to use it.
If you don't have these constraints, the default OpsWorker AI backend is the simpler choice — there's no setup and it's tuned for the investigation workload.
Compatibility
The Custom Reasoning Model integration expects an OpenAI-compatible chat completions API. Common providers that work:
- Azure OpenAI
- Self-hosted vLLM, TGI, or Ollama serving an instruction-tuned model
- Any third-party LLM with an OpenAI-compatible endpoint
The model must support function calling / tool use for investigation features to work end-to-end. Without tool-use support, the AI cannot invoke OpsWorker's integration tools (Kubernetes MCP, Grafana MCP, etc.) and investigation quality degrades to chat-only behavior.
Configuration
In the OpsWorker portal:
- Go to Integrations → choose Custom Reasoning Model
- Provide:
- API endpoint URL — the OpenAI-compatible chat completions endpoint
- API key — authentication token for the endpoint
- Model name — the specific model identifier (e.g.,
gpt-4o, your deployment ID, etc.)
- Save
OpsWorker validates the endpoint on save by issuing a small completion request. If validation fails, the integration is not activated.
Scope
Custom Reasoning Model integrations are configured at the cluster level. Different clusters can use different models if needed — useful when, for example, a regulated cluster needs a specific provider while a development cluster uses the default backend.
Considerations
- Latency affects time-to-investigation. Slow endpoints will lengthen investigations beyond the default sub-2-minute target.
- Quality varies by model. Reasoning models with strong instruction-following and tool-use produce noticeably better investigations than fast lightweight models.
- Cost is on you. You pay your LLM provider directly for the inference. OpsWorker does not mark up or bill for inference when you bring your own model.
- No fallback by default. If your endpoint is unavailable, investigations on that cluster will fail until it's restored — they do not silently fall back to the default backend.
Differences from BYO LLM
| Aspect | Custom Reasoning Model (integration) | BYO LLM (deployment config) |
|---|---|---|
| Availability | SaaS portal | Dedicated AWS, Private Cloud |
| Configuration | Per cluster from the portal | At deployment time |
| Changes | Self-service | Requires OpsWorker team |
| Use case | Per-cluster model choice | Org-wide model substitution |
Next Steps
- BYO LLM — Deployment-time LLM configuration
- AWS Bedrock — Default backend details
- Azure OpenAI — Azure-specific setup