docs/netdata-ai/investigations/custom-investigations.md
Create deeply researched, context‑aware analyses by asking Netdata open‑ended questions about your infrastructure. Custom Investigations correlate metrics, anomalies, and events to answer the questions dashboards can’t—typically in about two minutes.
Two ways to launch:
- Troubleshoot with AI (top‑right). The current view’s scope (chart, dashboard, room, service) is captured automatically; add your question and context.
- Insights → New Investigation for a blank canvas and full control.

Reports are saved in Insights and you’ll receive an email when ready.
Think of this as briefing a teammate. Include time ranges, environments, related services, symptoms, and recent changes.
Request: Why are my checkout‑service pods crashing repeatedly?
Context:
- Started after: deployment at 14:00 UTC of version 2.3.1
- Impact: Customer checkout failures, lost revenue ~$X/hour
- Recent changes: payment gateway integration update; workers 10→20
- Logs: "connection refused to payment-service:8080", "Java heap space"
- Environment: production / eks-prod-us-east-1
- Related: payment-service, inventory-service, redis-session-store
Request: Compare system metrics before and after the user‑authentication‑service deployment.
Context:
- Service: user-authentication-service v2.2.0
- Deployed: 2025‑01‑24 09:00 UTC
- Changes: JWT→Redis sessions; Argon2 hashing
- Concern: intermittent logouts; rising redis_connected_clients
- Windows: 24h before vs 24h after
Request: Identify underutilized nodes for cost optimization.
Context:
- Monthly compute: ~$12K
- Mixed workloads (prod + staging)
- Dev envs run 24/7; batch nodes idle 20h/day
- Goal: save $2–3K/month without reliability impact
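
The examples above share one layout: a single Request line followed by labeled Context bullets. If you draft briefs outside the UI, for instance to reuse them across recurring questions, a small helper can keep that layout consistent. The following is a minimal sketch only: `InvestigationBrief` and `render` are hypothetical names, not part of any Netdata API, and the output is plain text meant to be pasted into the investigation form.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class InvestigationBrief:
    """Hypothetical container for one investigation prompt (not a Netdata API)."""

    request: str
    # Labeled context items, kept in insertion order (e.g. "Environment" -> "production").
    context: Dict[str, str] = field(default_factory=dict)

    def render(self) -> str:
        """Return the brief in the Request/Context layout used in the examples."""
        lines = [f"Request: {self.request}", "Context:"]
        lines += [f"- {label}: {value}" for label, value in self.context.items()]
        return "\n".join(lines)


brief = InvestigationBrief(
    request="Identify underutilized nodes for cost optimization.",
    context={
        "Monthly compute": "~$12K",
        "Goal": "save $2–3K/month without reliability impact",
    },
)
print(brief.render())
```

Keeping the context as labeled pairs makes it easy to spot a missing time range or environment before you submit the question.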
Automate recurring investigations (weekly health, monthly optimization, SLO conformance) from the Insights tab. See Scheduled Investigations for examples and setup.
- Settings → Usage & Billing → AI Credits
- Investigations overview
- Scheduled Investigations
- Alert Troubleshooting