Incident Console

Choose a symptom. Leave with an action path.

Naxaya turns field notes into a compact response surface: evidence, diagnosis, bounded action, rollback and the exact notes to open next.

Live playbook: 12 symptoms

Cloud

Application Gateway returns 502

Separate backend health, DNS resolution, TLS settings and private network reachability before changing the application.

Evidence

Backend health state, probe result, gateway diagnostic logs, DNS answer from the gateway path and TLS/SNI settings.

First checks

Check backend health state
Resolve backend name from the gateway path
Validate TLS/SNI and probe configuration

Bounded action

Change only the failing boundary: probe, backend FQDN, certificate binding or route. Retest the same request path after each change.
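
The "change only the failing boundary" rule can be sketched as a small helper that walks the first checks in order and names the first one that failed. This is an illustrative sketch only: the boundary names and the result dictionary are assumptions, not part of any Azure API.

```python
from typing import Optional

# Boundaries in the order the first checks list them (names are hypothetical).
CHECK_ORDER = ["backend_health", "dns", "tls_sni_probe"]


def first_failing_boundary(results: dict[str, bool]) -> Optional[str]:
    """Return the first boundary whose check failed, or None if all passed."""
    for name in CHECK_ORDER:
        # A missing result counts as a failure: the check was never run.
        if not results.get(name, False):
            return name
    return None
```

Re-running the same helper after each change keeps the retest on the same ordered path instead of jumping between boundaries.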

Rollback

Restore previous probe/backend settings and keep the captured failing timestamp for comparison.

Short handoff

[Incident] Application Gateway returns 502
Context: Separate backend health, DNS resolution, TLS settings and private network reachability before changing the application.
Evidence to confirm: Backend health state, probe result, gateway diagnostic logs, DNS answer from the gateway path and TLS/SNI settings.
Immediate checks: Check backend health state | Resolve backend name from the gateway path | Validate TLS/SNI and probe configuration
Proposed action: Change only the failing boundary: probe, backend FQDN, certificate binding or route. Retest the same request path after each change.
Rollback: Restore previous probe/backend settings and keep the captured failing timestamp for comparison.

Post-incident review

Handled symptom: Application Gateway returns 502
Initial hypothesis: Separate backend health, DNS resolution, TLS settings and private network reachability before changing the application.
Evidence used: Backend health state, probe result, gateway diagnostic logs, DNS answer from the gateway path and TLS/SNI settings.
Checks performed: Check backend health state | Resolve backend name from the gateway path | Validate TLS/SNI and probe configuration
Decision / action: Change only the failing boundary: probe, backend FQDN, certificate binding or route. Retest the same request path after each change.
Rollback plan: Restore previous probe/backend settings and keep the captured failing timestamp for comparison.
To improve: detection, runbook, guardrail, ownership and communication delay.

Open next

Azure Application Gateway: diagnose 502 errors without mixing DNS, TLS and backend health
Azure Private Endpoint: build a validation matrix before production

Cloud

Internal APIM returns an error on a private API

Correlate Application Gateway/WAF and APIM logs, then separate DNS, TLS, policy, identity and private backend reachability before changing policies or opening access.

Evidence

Application Gateway/WAF logs, APIM logs, DNS answers, TLS settings, policy and identity results, and private backend reachability.

First checks

Check whether WAF blocked the request
Confirm APIM received the same path
Validate backend DNS and TLS from the APIM path
Replay with a correlation ID
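
A replay with a correlation ID might look like the sketch below, which only builds the request so the same ID can later be searched in the WAF and APIM logs. The header name `x-correlation-id` is an assumption; use whatever header your gateway and APIM policies actually log.

```python
import urllib.request
import uuid


def build_replay_request(url: str, header_name: str = "x-correlation-id"):
    """Build (but do not send) a GET request tagged with a fresh correlation ID."""
    correlation_id = str(uuid.uuid4())
    request = urllib.request.Request(url, method="GET")
    # HTTP header names are case-insensitive; urllib normalizes the capitalization.
    request.add_header(header_name, correlation_id)
    return request, correlation_id
```

Sending is left to the caller (for example `urllib.request.urlopen(request, timeout=10)`) so the replay runs from the network path under test.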

Bounded action

Run the first checks in order: Check whether WAF blocked the request | Confirm APIM received the same path | Validate backend DNS and TLS from the APIM path | Replay with a correlation ID. Open the linked notes before changing production.

Rollback

Stop the change, restore the last known safe state and keep the captured evidence for comparison.

Short handoff

[Incident] Internal APIM returns an error on a private API
Context: Correlate Application Gateway/WAF and APIM logs, then separate DNS, TLS, policy, identity and private backend reachability before changing policies or opening access.
Evidence to confirm: Application Gateway/WAF logs, APIM logs, DNS answers, TLS settings, policy and identity results, and private backend reachability.
Immediate checks: Check whether WAF blocked the request | Confirm APIM received the same path | Validate backend DNS and TLS from the APIM path | Replay with a correlation ID
Proposed action: Run the first checks in order: Check whether WAF blocked the request | Confirm APIM received the same path | Validate backend DNS and TLS from the APIM path | Replay with a correlation ID. Open the linked notes before changing production.
Rollback: Stop the change, restore the last known safe state and keep the captured evidence for comparison.

Post-incident review

Handled symptom: Internal APIM returns an error on a private API
Initial hypothesis: Correlate Application Gateway/WAF and APIM logs, then separate DNS, TLS, policy, identity and private backend reachability before changing policies or opening access.
Evidence used: Application Gateway/WAF logs, APIM logs, DNS answers, TLS settings, policy and identity results, and private backend reachability.
Checks performed: Check whether WAF blocked the request | Confirm APIM received the same path | Validate backend DNS and TLS from the APIM path | Replay with a correlation ID
Decision / action: Run the first checks in order: Check whether WAF blocked the request | Confirm APIM received the same path | Validate backend DNS and TLS from the APIM path | Replay with a correlation ID. Open the linked notes before changing production.
Rollback plan: Stop the change, restore the last known safe state and keep the captured evidence for comparison.
To improve: detection, runbook, guardrail, ownership and communication delay.

Open next

Azure internal APIM: diagnose a private API before changing policies
KQL snippet: correlate WAF and APIM on an Azure private API
Azure Application Gateway: diagnose 502 errors without mixing DNS, TLS and backend health

Networking

Private Endpoint name still resolves publicly

Confirm the CNAME chain, Private DNS Zone association and hybrid forwarding from the consuming network.

Evidence

nslookup from the workload subnet, CNAME chain, private DNS zone links, resolver forwarding path and cached answers.

First checks

Run nslookup from the workload network
Check privatelink CNAME
Verify Private DNS Zone links and forwarders
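
As a complement to nslookup, the public/private answer difference can be classified with a short sketch like this one. It is a minimal illustration using only the Python standard library; the function names are assumptions, and `is_private` also covers loopback and link-local ranges, so treat the label as a hint rather than proof.

```python
import ipaddress
import socket


def resolve(hostname: str) -> list[str]:
    """Resolve with the local resolver; run this from the workload network."""
    infos = socket.getaddrinfo(hostname, 443, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


def classify(addresses: list[str]) -> str:
    """Label a set of answers as private, public, mixed or no-answer."""
    flags = [ipaddress.ip_address(a).is_private for a in addresses]
    if not flags:
        return "no-answer"
    if all(flags):
        return "private"
    if not any(flags):
        return "public"
    return "mixed"
```

Calling `classify(resolve(name))` from the consuming network before and after the fix gives a comparable record for the handoff.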

Bounded action

Fix zone association or forwarding first, then clear caches and retest from the consuming network.

Rollback

Restore the previous link or forwarder and document the public/private answer difference.

Short handoff

[Incident] Private Endpoint name still resolves publicly
Context: Confirm the CNAME chain, Private DNS Zone association and hybrid forwarding from the consuming network.
Evidence to confirm: nslookup from the workload subnet, CNAME chain, private DNS zone links, resolver forwarding path and cached answers.
Immediate checks: Run nslookup from the workload network | Check privatelink CNAME | Verify Private DNS Zone links and forwarders
Proposed action: Fix zone association or forwarding first, then clear caches and retest from the consuming network.
Rollback: Restore the previous link or forwarder and document the public/private answer difference.

Post-incident review

Handled symptom: Private Endpoint name still resolves publicly
Initial hypothesis: Confirm the CNAME chain, Private DNS Zone association and hybrid forwarding from the consuming network.
Evidence used: nslookup from the workload subnet, CNAME chain, private DNS zone links, resolver forwarding path and cached answers.
Checks performed: Run nslookup from the workload network | Check privatelink CNAME | Verify Private DNS Zone links and forwarders
Decision / action: Fix zone association or forwarding first, then clear caches and retest from the consuming network.
Rollback plan: Restore the previous link or forwarder and document the public/private answer difference.
To improve: detection, runbook, guardrail, ownership and communication delay.

Open next

Azure snippet: check Private Endpoint DNS resolution
Azure snippet: detect Private Endpoint DNS drift
Azure hybrid DNS: when to use Private Resolver, on-premises forwarders and private zones
Azure Private Endpoint: detect Terraform, DNS, and network drift before incident

Cloud

A synthetic probe fails on an Azure private path

Separate DNS, TLS, Application Gateway health, WAF blocks and runner network before changing routing or application code.

Evidence

DNS answer from the probe network, TLS/SNI result, Application Gateway health, WAF block logs and runner network path.

First checks

Resolve the hostname from the probe network
Check TLS/SNI with the real hostname
Correlate probe run with WAF and gateway logs
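
Correlating a probe run with WAF and gateway logs usually comes down to a time-window match. The sketch below assumes log entries already exported as dictionaries with a `time` field; that shape and the 30-second default are illustrative assumptions, not a log schema.

```python
from datetime import datetime, timedelta


def logs_near_probe(probe_time: datetime, log_entries: list[dict],
                    window_seconds: int = 30) -> list[dict]:
    """Keep only log entries within +/- window_seconds of the probe run."""
    window = timedelta(seconds=window_seconds)
    return [e for e in log_entries if abs(e["time"] - probe_time) <= window]
```

An empty result for a failed run is itself evidence: the request may never have reached the gateway, which points back to DNS or the runner network.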

Bounded action

Run the first checks in order: Resolve the hostname from the probe network | Check TLS/SNI with the real hostname | Correlate probe run with WAF and gateway logs. Open the linked notes before changing production.

Rollback

Stop the change, restore the last known safe state and keep the captured evidence for comparison.

Short handoff

[Incident] A synthetic probe fails on an Azure private path
Context: Separate DNS, TLS, Application Gateway health, WAF blocks and runner network before changing routing or application code.
Evidence to confirm: DNS answer from the probe network, TLS/SNI result, Application Gateway health, WAF block logs and runner network path.
Immediate checks: Resolve the hostname from the probe network | Check TLS/SNI with the real hostname | Correlate probe run with WAF and gateway logs
Proposed action: Run the first checks in order: Resolve the hostname from the probe network | Check TLS/SNI with the real hostname | Correlate probe run with WAF and gateway logs. Open the linked notes before changing production.
Rollback: Stop the change, restore the last known safe state and keep the captured evidence for comparison.

Post-incident review

Handled symptom: A synthetic probe fails on an Azure private path
Initial hypothesis: Separate DNS, TLS, Application Gateway health, WAF blocks and runner network before changing routing or application code.
Evidence used: DNS answer from the probe network, TLS/SNI result, Application Gateway health, WAF block logs and runner network path.
Checks performed: Resolve the hostname from the probe network | Check TLS/SNI with the real hostname | Correlate probe run with WAF and gateway logs
Decision / action: Run the first checks in order: Resolve the hostname from the probe network | Check TLS/SNI with the real hostname | Correlate probe run with WAF and gateway logs. Open the linked notes before changing production.
Rollback plan: Stop the change, restore the last known safe state and keep the captured evidence for comparison.
To improve: detection, runbook, guardrail, ownership and communication delay.

Open next

Azure: make private paths verifiable with synthetic probes
KQL snippet: track synthetic probes for an Azure private path
Azure Application Gateway: diagnose 502 errors without mixing DNS, TLS and backend health
Azure Private Endpoint: detect Terraform, DNS, and network drift before incident

Cloud

Azure Container Apps private ingress fails or reaches the wrong revision

Separate private DNS, Application Gateway handoff, Container Apps ingress mode, revision traffic and console logs before rolling back or changing traffic weights.

Evidence

Private DNS answer, Application Gateway handoff, Container Apps ingress mode, revision traffic split, and system and console logs.

First checks

Resolve the hostname from the caller network
Check ingress target port and active revisions
Correlate system and console logs
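
The "wrong revision" half of this symptom can often be confirmed from the traffic split alone. The following sketch assumes the split has already been exported as a list of dictionaries with hypothetical `revision` and `weight` keys; it does not call any Azure API.

```python
def traffic_report(traffic: list[dict], expected_revision: str) -> dict:
    """Summarize which revisions receive traffic and what the expected one gets."""
    receiving = {t["revision"]: t["weight"] for t in traffic if t["weight"] > 0}
    return {
        "receiving": receiving,
        "expected_weight": receiving.get(expected_revision, 0),
        "total_weight": sum(receiving.values()),
    }
```

An `expected_weight` of 0 means callers cannot reach the intended revision through ingress, whatever DNS and the gateway are doing.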

Bounded action

Run the first checks in order: Resolve the hostname from the caller network | Check ingress target port and active revisions | Correlate system and console logs. Open the linked notes before changing production.

Rollback

Stop the change, restore the last known safe state and keep the captured evidence for comparison.

Short handoff

[Incident] Azure Container Apps private ingress fails or reaches the wrong revision
Context: Separate private DNS, Application Gateway handoff, Container Apps ingress mode, revision traffic and console logs before rolling back or changing traffic weights.
Evidence to confirm: Private DNS answer, Application Gateway handoff, Container Apps ingress mode, revision traffic split, and system and console logs.
Immediate checks: Resolve the hostname from the caller network | Check ingress target port and active revisions | Correlate system and console logs
Proposed action: Run the first checks in order: Resolve the hostname from the caller network | Check ingress target port and active revisions | Correlate system and console logs. Open the linked notes before changing production.
Rollback: Stop the change, restore the last known safe state and keep the captured evidence for comparison.

Post-incident review

Handled symptom: Azure Container Apps private ingress fails or reaches the wrong revision
Initial hypothesis: Separate private DNS, Application Gateway handoff, Container Apps ingress mode, revision traffic and console logs before rolling back or changing traffic weights.
Evidence used: Private DNS answer, Application Gateway handoff, Container Apps ingress mode, revision traffic split, and system and console logs.
Checks performed: Resolve the hostname from the caller network | Check ingress target port and active revisions | Correlate system and console logs
Decision / action: Run the first checks in order: Resolve the hostname from the caller network | Check ingress target port and active revisions | Correlate system and console logs. Open the linked notes before changing production.
Rollback plan: Stop the change, restore the last known safe state and keep the captured evidence for comparison.
To improve: detection, runbook, guardrail, ownership and communication delay.

Open next

Azure Container Apps: diagnose private ingress before changing revisions
KQL snippet: diagnose Container Apps private ingress and revisions
Azure: make private paths verifiable with synthetic probes

Cloud

AKS private ingress returns 502 or reaches no service endpoints

Separate private DNS, Application Gateway health, ingress controller routing, Kubernetes service selectors, endpoint slices and pod readiness before rolling back a deployment.

Evidence

Private DNS answer, Application Gateway backend health, ingress controller routing, Kubernetes service selectors, endpoint slices and pod readiness.

First checks

Resolve the hostname from the caller network
Check Application Gateway backend health and host header
Verify ingress, service and endpoint slices
Correlate controller and application logs
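
"Reaches no service endpoints" is frequently a selector that matches no pod labels. The sketch below reproduces equality-based selector matching on plain dictionaries; it is an offline illustration with hypothetical pod names, not a Kubernetes client call.

```python
def selector_matches(selector: dict[str, str], pod_labels: dict[str, str]) -> bool:
    """True when every selector key/value pair is present in the pod labels."""
    if not selector:
        # A Service without a selector gets no endpoints created automatically.
        return False
    return all(pod_labels.get(key) == value for key, value in selector.items())


def matching_pods(selector: dict[str, str], pods: dict[str, dict[str, str]]) -> list[str]:
    """Return the names of pods whose labels satisfy the selector."""
    return [name for name, labels in pods.items() if selector_matches(selector, labels)]
```

If this returns an empty list for the live selector and labels, the empty endpoint slices are explained without touching the deployment.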

Bounded action

Run the first checks in order: Resolve the hostname from the caller network | Check Application Gateway backend health and host header | Verify ingress, service and endpoint slices | Correlate controller and application logs. Open the linked notes before changing production.

Rollback

Stop the change, restore the last known safe state and keep the captured evidence for comparison.

Short handoff

[Incident] AKS private ingress returns 502 or reaches no service endpoints
Context: Separate private DNS, Application Gateway health, ingress controller routing, Kubernetes service selectors, endpoint slices and pod readiness before rolling back a deployment.
Evidence to confirm: Private DNS answer, Application Gateway backend health, ingress controller routing, Kubernetes service selectors, endpoint slices and pod readiness.
Immediate checks: Resolve the hostname from the caller network | Check Application Gateway backend health and host header | Verify ingress, service and endpoint slices | Correlate controller and application logs
Proposed action: Run the first checks in order: Resolve the hostname from the caller network | Check Application Gateway backend health and host header | Verify ingress, service and endpoint slices | Correlate controller and application logs. Open the linked notes before changing production.
Rollback: Stop the change, restore the last known safe state and keep the captured evidence for comparison.

Post-incident review

Handled symptom: AKS private ingress returns 502 or reaches no service endpoints
Initial hypothesis: Separate private DNS, Application Gateway health, ingress controller routing, Kubernetes service selectors, endpoint slices and pod readiness before rolling back a deployment.
Evidence used: Private DNS answer, Application Gateway backend health, ingress controller routing, Kubernetes service selectors, endpoint slices and pod readiness.
Checks performed: Resolve the hostname from the caller network | Check Application Gateway backend health and host header | Verify ingress, service and endpoint slices | Correlate controller and application logs
Decision / action: Run the first checks in order: Resolve the hostname from the caller network | Check Application Gateway backend health and host header | Verify ingress, service and endpoint slices | Correlate controller and application logs. Open the linked notes before changing production.
Rollback plan: Stop the change, restore the last known safe state and keep the captured evidence for comparison.
To improve: detection, runbook, guardrail, ownership and communication delay.

Open next

Azure AKS: diagnose private ingress before changing deployments
KQL snippet: correlate AKS private ingress and application logs
Azure Application Gateway: diagnose 502 errors without mixing DNS, TLS and backend health

Cloud

Azure Functions private HTTP endpoint returns 403, 503 or no request logs

Separate private DNS, Private Endpoint reachability, access restrictions, Functions runtime state, private storage and Application Insights evidence before redeploying code or opening public access.

Evidence

Private DNS answer, Private Endpoint status, access restrictions, Functions runtime state, private storage reachability and Application Insights requests, traces and exceptions.

First checks

Resolve the hostname from the caller network
Replay with a correlation ID
Check Function App access restrictions and Private Endpoint status
Correlate requests, traces and exceptions
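
The "no request logs" case can be separated from the 403/503 cases by checking which telemetry tables contain the replayed ID. This sketch works on exported records; the `table` and `operation_id` keys are assumptions about the export shape, not the Application Insights schema.

```python
from collections import defaultdict


def tables_for_operation(records: list[dict], operation_id: str) -> dict[str, int]:
    """Count records per telemetry table for one correlation/operation ID."""
    counts: dict[str, int] = defaultdict(int)
    for record in records:
        if record.get("operation_id") == operation_id:
            counts[record["table"]] += 1
    return dict(counts)


def request_reached_runtime(records: list[dict], operation_id: str) -> bool:
    """True when at least one 'requests' record carries the ID."""
    return tables_for_operation(records, operation_id).get("requests", 0) > 0
```

When the ID never appears under requests, the failure most likely sits before the runtime (DNS, Private Endpoint or access restrictions), so redeploying code would not help.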

Bounded action

Run the first checks in order: Resolve the hostname from the caller network | Replay with a correlation ID | Check Function App access restrictions and Private Endpoint status | Correlate requests, traces and exceptions. Open the linked notes before changing production.

Rollback

Stop the change, restore the last known safe state and keep the captured evidence for comparison.

Short handoff

[Incident] Azure Functions private HTTP endpoint returns 403, 503 or no request logs
Context: Separate private DNS, Private Endpoint reachability, access restrictions, Functions runtime state, private storage and Application Insights evidence before redeploying code or opening public access.
Evidence to confirm: Private DNS answer, Private Endpoint status, access restrictions, Functions runtime state, private storage reachability and Application Insights requests, traces and exceptions.
Immediate checks: Resolve the hostname from the caller network | Replay with a correlation ID | Check Function App access restrictions and Private Endpoint status | Correlate requests, traces and exceptions
Proposed action: Run the first checks in order: Resolve the hostname from the caller network | Replay with a correlation ID | Check Function App access restrictions and Private Endpoint status | Correlate requests, traces and exceptions. Open the linked notes before changing production.
Rollback: Stop the change, restore the last known safe state and keep the captured evidence for comparison.

Post-incident review

Handled symptom: Azure Functions private HTTP endpoint returns 403, 503 or no request logs
Initial hypothesis: Separate private DNS, Private Endpoint reachability, access restrictions, Functions runtime state, private storage and Application Insights evidence before redeploying code or opening public access.
Evidence used: Private DNS answer, Private Endpoint status, access restrictions, Functions runtime state, private storage reachability and Application Insights requests, traces and exceptions.
Checks performed: Resolve the hostname from the caller network | Replay with a correlation ID | Check Function App access restrictions and Private Endpoint status | Correlate requests, traces and exceptions
Decision / action: Run the first checks in order: Resolve the hostname from the caller network | Replay with a correlation ID | Check Function App access restrictions and Private Endpoint status | Correlate requests, traces and exceptions. Open the linked notes before changing production.
Rollback plan: Stop the change, restore the last known safe state and keep the captured evidence for comparison.
To improve: detection, runbook, guardrail, ownership and communication delay.

Open next

Azure Functions: diagnose a private HTTP endpoint before changing code
KQL snippet: correlate an Azure Functions private HTTP endpoint
Azure: make private paths verifiable with synthetic probes

Cloud

Azure WAF blocks a legitimate request

Start from blocked requests, rule ID and URI before deciding between exclusion, custom rule or application fix.

Evidence

Blocked URI, ruleId, match variable, client IP, hostname, request ID and exact time window.

First checks

List blocked URIs in KQL
Identify ruleId and match field
Validate false-positive scope
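
Validating false-positive scope means knowing how many URIs a rule actually blocks. As an offline complement to the KQL query, the sketch below counts blocked records per rule and URI; the `action` and `requestUri` field names are assumptions about the exported log shape.

```python
from collections import Counter


def blocked_scope(records: list[dict]) -> list[tuple[tuple[str, str], int]]:
    """Count blocked requests per (ruleId, URI), most frequent first."""
    counts = Counter(
        (r["ruleId"], r["requestUri"])
        for r in records
        if r.get("action") == "Blocked"
    )
    return counts.most_common()
```

A rule that blocks a single URI is a candidate for a narrow exclusion; one that blocks many unrelated URIs deserves a closer look before anything is excluded.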

Bounded action

Create the smallest exclusion or custom rule that covers the false positive without disabling the rule globally.

Rollback

Remove the exclusion/custom rule and verify expected blocking returns for the same rule family.

Short handoff

[Incident] Azure WAF blocks a legitimate request
Context: Start from blocked requests, rule ID and URI before deciding between exclusion, custom rule or application fix.
Evidence to confirm: Blocked URI, ruleId, match variable, client IP, hostname, request ID and exact time window.
Immediate checks: List blocked URIs in KQL | Identify ruleId and match field | Validate false-positive scope
Proposed action: Create the smallest exclusion or custom rule that covers the false positive without disabling the rule globally.
Rollback: Remove the exclusion/custom rule and verify expected blocking returns for the same rule family.

Post-incident review

Handled symptom: Azure WAF blocks a legitimate request
Initial hypothesis: Start from blocked requests, rule ID and URI before deciding between exclusion, custom rule or application fix.
Evidence used: Blocked URI, ruleId, match variable, client IP, hostname, request ID and exact time window.
Checks performed: List blocked URIs in KQL | Identify ruleId and match field | Validate false-positive scope
Decision / action: Create the smallest exclusion or custom rule that covers the false positive without disabling the rule globally.
Rollback plan: Remove the exclusion/custom rule and verify expected blocking returns for the same rule family.
To improve: detection, runbook, guardrail, ownership and communication delay.

Open next

KQL snippet: list Azure WAF blocked URIs quickly
WAF and KQL: identify a false positive before creating an exclusion
Azure WAF: add an OWASP/CRS exclusion without weakening all protection
Azure WAF: frame an emergency custom rule without losing evidence
Azure snippet: audit WAF custom rule priorities

Automation

Terraform state lock is stuck

Prove that no apply is still running before using force-unlock, then restart with a clean plan.

Evidence

Lock ID, lock owner, CI run, backend target, pending plan and whether an apply is still active.

First checks

Identify lock owner
Check CI job status
Run plan after unlock

Bounded action

Unlock only after proving no apply is running, then start with a fresh plan before any apply.
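
The "prove no apply is running" gate can be made explicit in tooling. This sketch only builds the `terraform force-unlock` argument list once the owning CI job is in a finished state; the status names are hypothetical and depend on your CI system.

```python
from typing import Optional

# CI states treated as "no apply can still be running" (hypothetical names).
FINISHED_STATES = {"succeeded", "failed", "cancelled"}


def force_unlock_command(lock_id: str, ci_job_status: str) -> Optional[list[str]]:
    """Return the unlock command only when the owning CI job has finished."""
    if ci_job_status not in FINISHED_STATES:
        # Job may still hold the lock legitimately: refuse to build the command.
        return None
    return ["terraform", "force-unlock", lock_id]
```

Whatever the result, the next command after an unlock should be a fresh `terraform plan`, never a direct apply.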

Rollback

Return to previous commit or restore the last validated state version if drift was introduced.

Short handoff

[Incident] Terraform state lock is stuck
Context: Prove that no apply is still running before using force-unlock, then restart with a clean plan.
Evidence to confirm: Lock ID, lock owner, CI run, backend target, pending plan and whether an apply is still active.
Immediate checks: Identify lock owner | Check CI job status | Run plan after unlock
Proposed action: Unlock only after proving no apply is running, then start with a fresh plan before any apply.
Rollback: Return to previous commit or restore the last validated state version if drift was introduced.

Post-incident review

Handled symptom: Terraform state lock is stuck
Initial hypothesis: Prove that no apply is still running before using force-unlock, then restart with a clean plan.
Evidence used: Lock ID, lock owner, CI run, backend target, pending plan and whether an apply is still active.
Checks performed: Identify lock owner | Check CI job status | Run plan after unlock
Decision / action: Unlock only after proving no apply is running, then start with a fresh plan before any apply.
Rollback plan: Return to previous commit or restore the last validated state version if drift was introduced.
To improve: detection, runbook, guardrail, ownership and communication delay.

Open next

Terraform snippet: diagnose a stuck state lock before force-unlock
Terraform Azure: secure a private state backend without breaking CI

Infrastructure

A secret rotation or managed identity change breaks an application or pipeline consumer

Separate preparation, cutover, revocation and managed identity diagnostics; validate the real execution identity, private path and authentication errors before deleting the old value or broadening access.

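Evidence

Real consumer list, runtime execution identity, vault read access, private DNS answer, source network and 401/403/500 or Key Vault denial logs.

First checks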

List real consumers
Verify the runtime identity and vault read access
Check private DNS and source network
Watch 401/403/500 or Key Vault denials
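
Watching 401/403/500 is more useful when the counts are split around the cutover time. The sketch below assumes events exported as dictionaries with `time` and `status` keys; both names are illustrative assumptions.

```python
from collections import Counter
from datetime import datetime

WATCHED_STATUSES = {401, 403, 500}


def errors_by_phase(events: list[dict], cutover: datetime) -> dict[str, Counter]:
    """Count watched status codes before and after the cutover time."""
    counts = {"before": Counter(), "after": Counter()}
    for event in events:
        if event["status"] in WATCHED_STATUSES:
            phase = "after" if event["time"] >= cutover else "before"
            counts[phase][event["status"]] += 1
    return counts
```

A jump in the "after" bucket for one consumer suggests it still uses the old value, which argues against revoking that value yet.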

Bounded action

Run the first checks in order: List real consumers | Verify the runtime identity and vault read access | Check private DNS and source network | Watch 401/403/500 or Key Vault denials. Open the linked notes before changing production.

Rollback

Stop the change, restore the last known safe state and keep the captured evidence for comparison.

Short handoff

[Incident] A secret rotation or managed identity change breaks an application or pipeline consumer
Context: Separate preparation, cutover, revocation and managed identity diagnostics; validate the real execution identity, private path and authentication errors before deleting the old value or broadening access.
Evidence to confirm: Real consumer list, runtime execution identity, vault read access, private DNS answer, source network and 401/403/500 or Key Vault denial logs.
Immediate checks: List real consumers | Verify the runtime identity and vault read access | Check private DNS and source network | Watch 401/403/500 or Key Vault denials
Proposed action: Run the first checks in order: List real consumers | Verify the runtime identity and vault read access | Check private DNS and source network | Watch 401/403/500 or Key Vault denials. Open the linked notes before changing production.
Rollback: Stop the change, restore the last known safe state and keep the captured evidence for comparison.

Post-incident review

Handled symptom: A secret rotation or managed identity change breaks an application or pipeline consumer
Initial hypothesis: Separate preparation, cutover, revocation and managed identity diagnostics; validate the real execution identity, private path and authentication errors before deleting the old value or broadening access.
Evidence used: Real consumer list, runtime execution identity, vault read access, private DNS answer, source network and 401/403/500 or Key Vault denial logs.
Checks performed: List real consumers | Verify the runtime identity and vault read access | Check private DNS and source network | Watch 401/403/500 or Key Vault denials
Decision / action: Run the first checks in order: List real consumers | Verify the runtime identity and vault read access | Check private DNS and source network | Watch 401/403/500 or Key Vault denials. Open the linked notes before changing production.
Rollback plan: Stop the change, restore the last known safe state and keep the captured evidence for comparison.
To improve: detection, runbook, guardrail, ownership and communication delay.

Open next

Service identity and secret rotation: a production runbook, not an isolated task
KQL snippet: detect authentication errors after secret rotation
Azure managed identity: diagnose private access before changing permissions
KQL snippet: diagnose Key Vault denial with managed identity

Automation

An automation entry point behaves like a remote console

Bound inputs, templates and repository structure before exposing operations to more users.

Evidence

Inputs accepted by the template, permissions, inventory scope, repository branch and audit trail.

First checks

List accepted inputs
Remove arbitrary command fields
Review job template permissions

Bounded action

Replace arbitrary inputs with bounded choices and isolate job templates by operational intent.
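
Replacing an arbitrary input with bounded choices can be as simple as parsing it into an enumeration. This is a minimal sketch; `ServiceAction` and its two members are hypothetical examples of an operational intent, not an AWX feature.

```python
from enum import Enum


class ServiceAction(Enum):
    """The only operations this hypothetical template is allowed to run."""
    STATUS = "status"
    RESTART = "restart"


def parse_action(raw: str) -> ServiceAction:
    """Accept only known actions; reject anything else instead of executing it."""
    try:
        return ServiceAction(raw)
    except ValueError:
        raise ValueError(f"unsupported action: {raw!r}") from None
```

Because the value is matched against a closed set, a free-text command can never be passed through to the target hosts.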

Rollback

Disable the exposed template or revert to the previous approved template version.

Short handoff

[Incident] An automation entry point behaves like a remote console
Context: Bound inputs, templates and repository structure before exposing operations to more users.
Evidence to confirm: Inputs accepted by the template, permissions, inventory scope, repository branch and audit trail.
Immediate checks: List accepted inputs | Remove arbitrary command fields | Review job template permissions
Proposed action: Replace arbitrary inputs with bounded choices and isolate job templates by operational intent.
Rollback: Disable the exposed template or revert to the previous approved template version.

Post-incident review

Handled symptom: An automation entry point behaves like a remote console
Initial hypothesis: Bound inputs, templates and repository structure before exposing operations to more users.
Evidence used: Inputs accepted by the template, permissions, inventory scope, repository branch and audit trail.
Checks performed: List accepted inputs | Remove arbitrary command fields | Review job template permissions
Decision / action: Replace arbitrary inputs with bounded choices and isolate job templates by operational intent.
Rollback plan: Disable the exposed template or revert to the previous approved template version.
To improve: detection, runbook, guardrail, ownership and communication delay.

Open next

AWX: design job templates that do not become a dangerous remote console
Ansible in production: structure an operations repository before exposing it in AWX

A private AI agent can act but nobody can explain the action

Tie sources, identities, tool calls, logs and human validation before increasing autonomy.

Evidence

Source documents, identity, tool calls, prompt context, logs and human approval point.

First checks

List approved sources
Trace tool calls
Define human approval points

Bounded action

Reduce tool scope, require approval on sensitive actions and trace every tool call to a source.
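
One way to picture this bounded action is a gate in front of every tool call that records the source and requires a named approver for sensitive tools. The sketch is illustrative: the tool names, the `SENSITIVE_TOOLS` set and the audit record shape are all assumptions.

```python
from typing import Optional

# Tools that must not run without a human approver (hypothetical names).
SENSITIVE_TOOLS = {"delete_resource", "rotate_secret"}


def authorize_tool_call(tool: str, source: str, approved_by: Optional[str],
                        audit_log: list[dict]) -> bool:
    """Allow a call only if it cites a source and, when sensitive, has an approver."""
    needs_approval = tool in SENSITIVE_TOOLS
    allowed = bool(source) and (not needs_approval or approved_by is not None)
    # Every decision is recorded, including refusals, so the path stays explainable.
    audit_log.append({
        "tool": tool,
        "source": source,
        "approved_by": approved_by,
        "allowed": allowed,
    })
    return allowed
```

Logging refusals as well as approvals is the design choice that matters here: the audit trail then explains both what the agent did and what it was stopped from doing.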

Rollback

Disable the tool integration or force human validation until the action path is explainable.

Short handoff

[Incident] A private AI agent can act but nobody can explain the action
Context: Tie sources, identities, tool calls, logs and human validation before increasing autonomy.
Evidence to confirm: Source documents, identity, tool calls, prompt context, logs and human approval point.
Immediate checks: List approved sources | Trace tool calls | Define human approval points
Proposed action: Reduce tool scope, require approval on sensitive actions and trace every tool call to a source.
Rollback: Disable the tool integration or force human validation until the action path is explainable.

Post-incident review

Handled symptom: A private AI agent can act but nobody can explain the action
Initial hypothesis: Tie sources, identities, tool calls, logs and human validation before increasing autonomy.
Evidence used: Source documents, identity, tool calls, prompt context, logs and human approval point.
Checks performed: List approved sources | Trace tool calls | Define human approval points
Decision / action: Reduce tool scope, require approval on sensitive actions and trace every tool call to a source.
Rollback plan: Disable the tool integration or force human validation until the action path is explainable.
To improve: detection, runbook, guardrail, ownership and communication delay.

Open next

Private-network AI agent: which controls to keep around data, actions and logs