Separate backend health, DNS resolution, TLS settings and private network reachability before changing the application.
01 Evidence
Backend health state, probe result, gateway diagnostic logs, DNS answer from the gateway path and TLS/SNI settings.
02 First checks
- Check backend health state
- Resolve backend name from the gateway path
- Validate TLS/SNI and probe configuration
03 Bounded action
Change only the failing boundary: probe, backend FQDN, certificate binding or route. Retest the same request path after each change.
04 Rollback
Restore previous probe/backend settings and keep the captured failing timestamp for comparison.
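The TLS/SNI check above can be scripted from a host on the gateway path. The following is a minimal sketch, not the gateway's own probe logic: `fetch_cert` and `san_covers` are hypothetical helper names, and the wildcard handling is a simplified one-label match.

```python
import socket
import ssl


def fetch_cert(address: str, sni: str, port: int = 443, timeout: float = 5.0) -> dict:
    """Connect to a backend address, send `sni` as the TLS server name, return its certificate."""
    ctx = ssl.create_default_context()
    # Chain validation stays on; the name comparison is done by san_covers below.
    ctx.check_hostname = False
    with socket.create_connection((address, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=sni) as tls:
            return tls.getpeercert()


def san_covers(cert: dict, hostname: str) -> bool:
    """Return True if a DNS subjectAltName entry in `cert` covers `hostname`."""
    host = hostname.lower().rstrip(".")
    for kind, pattern in cert.get("subjectAltName", ()):
        if kind != "DNS":
            continue
        pattern = pattern.lower().rstrip(".")
        if pattern == host:
            return True
        # Simplified rule: a leading "*." covers exactly one left-most label.
        if pattern.startswith("*."):
            _, _, parent = host.partition(".")
            if parent and parent == pattern[2:]:
                return True
    return False
```

If `fetch_cert` raises a certificate verification error, the chain itself is not trusted from that host, which is a separate finding from a name mismatch.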
Cloud
Internal APIM returns an error on a private API
Correlate Application Gateway/WAF and APIM logs, then separate DNS, TLS, policy, identity and private backend reachability before changing policies or opening access.
01 Evidence
Application Gateway/WAF logs and APIM logs for the same request, plus separate results for DNS, TLS, policy, identity and private backend reachability.
02 First checks
- Check whether WAF blocked the request
- Confirm APIM received the same path
- Validate backend DNS and TLS from the APIM path
- Replay with a correlation ID
03 Bounded action
Run the first checks in order: (1) check whether WAF blocked the request; (2) confirm APIM received the same path; (3) validate backend DNS and TLS from the APIM path; (4) replay with a correlation ID. Open the linked notes before changing production.
04 Rollback
Stop the change, restore the last known safe state and keep the captured evidence for comparison.
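Once WAF and APIM log rows are exported, the correlation step for this case could look like the sketch below. The field names (`correlation_id`, `action`, `path`, `status`) are assumptions for illustration; real log schemas differ and need mapping first.

```python
def correlate(waf_rows, apim_rows, key="correlation_id"):
    """Pair each WAF row with its APIM row and say where the request stopped."""
    apim_by_id = {row[key]: row for row in apim_rows}
    findings = []
    for waf in waf_rows:
        cid = waf[key]
        apim = apim_by_id.get(cid)
        if waf.get("action") == "Blocked":
            verdict = "blocked at WAF"
        elif apim is None:
            verdict = "passed WAF, no matching APIM record"
        elif apim.get("path") != waf.get("path"):
            verdict = "path differs between gateway and APIM"
        else:
            verdict = f"reached APIM with status {apim.get('status')}"
        findings.append((cid, verdict))
    return findings
```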
Networking
Private Endpoint name still resolves publicly
Confirm the CNAME chain, Private DNS Zone association and hybrid forwarding from the consuming network.
01 Evidence
nslookup from the workload subnet, CNAME chain, private DNS zone links, resolver forwarding path and cached answers.
02 First checks
- Run nslookup from the workload network
- Check privatelink CNAME
- Verify Private DNS Zone links and forwarders
03 Bounded action
Fix zone association or forwarding first, then clear caches and retest from the consuming network.
04 Rollback
Restore the previous link or forwarder and document the public/private answer difference.
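The public/private answer difference can be recorded in a repeatable way. This is a minimal sketch that takes the CNAME chain and addresses copied from nslookup output; `diagnose` is a hypothetical helper, and "private" here means whatever Python's `ipaddress` module classifies as private, which is broader than RFC 1918.

```python
import ipaddress


def diagnose(cname_chain, addresses):
    """Classify one DNS answer seen from the consuming network."""
    if not addresses:
        return "no answer"
    public = [a for a in addresses if not ipaddress.ip_address(a).is_private]
    has_privatelink = any(".privatelink." in name.lower() for name in cname_chain)
    if not public:
        return "private answer"
    if has_privatelink:
        # The privatelink name exists but was resolved by public DNS.
        return "public answer behind privatelink CNAME: check zone link or forwarder"
    return "public answer, no privatelink CNAME"
```

Running it before and after the zone or forwarder change, from the same workload subnet, gives two comparable lines for the incident record.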
Cloud
A synthetic probe fails on an Azure private path
Separate DNS, TLS, Application Gateway health, WAF blocks and runner network before changing routing or application code.
01 Evidence
DNS answer from the probe runner network, TLS/SNI result with the real hostname, Application Gateway backend health, and WAF and gateway log entries matching the probe run.
02 First checks
- Resolve the hostname from the probe network
- Check TLS/SNI with the real hostname
- Correlate probe run with WAF and gateway logs
03 Bounded action
Run the first checks in order: (1) resolve the hostname from the probe network; (2) check TLS/SNI with the real hostname; (3) correlate the probe run with WAF and gateway logs. Open the linked notes before changing production.
04 Rollback
Stop the change, restore the last known safe state and keep the captured evidence for comparison.
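Correlating a probe run with WAF and gateway logs mostly means narrowing to a time window around the failing run. A minimal sketch, assuming exported rows carry an ISO 8601 timestamp with an explicit offset in a `time` field (a hypothetical name):

```python
from datetime import datetime, timedelta


def rows_near(probe_time, rows, window_seconds=30):
    """Return the log rows whose timestamp is within the window around the probe run."""
    window = timedelta(seconds=window_seconds)
    return [
        row for row in rows
        if abs(datetime.fromisoformat(row["time"]) - probe_time) <= window
    ]
```

An empty result for a failed run is itself evidence: the request may never have reached the gateway, which points back at DNS or the runner network.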
Cloud
Azure Container Apps private ingress fails or reaches the wrong revision
Separate private DNS, Application Gateway handoff, Container Apps ingress mode, revision traffic and console logs before rolling back or changing traffic weights.
01 Evidence
Private DNS answer from the caller network, Application Gateway handoff, Container Apps ingress mode and target port, revision traffic weights, and system and console logs.
02 First checks
- Resolve the hostname from the caller network
- Check ingress target port and active revisions
- Correlate system and console logs
03 Bounded action
Run the first checks in order: (1) resolve the hostname from the caller network; (2) check ingress target port and active revisions; (3) correlate system and console logs. Open the linked notes before changing production.
04 Rollback
Stop the change, restore the last known safe state and keep the captured evidence for comparison.
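For the "wrong revision" symptom, it helps to state what the traffic split actually says before touching weights. The sketch below assumes the weights have already been read into a plain mapping of revision name to percentage; `traffic_problems` and the revision names are hypothetical.

```python
def traffic_problems(weights, expected_revision):
    """List mismatches between the traffic split and the revision callers should reach."""
    problems = []
    total = sum(weights.values())
    if total != 100:
        problems.append(f"weights sum to {total}, not 100")
    receiving = {name for name, weight in weights.items() if weight > 0}
    if expected_revision not in receiving:
        problems.append(f"expected revision {expected_revision} receives no traffic")
    others = sorted(receiving - {expected_revision})
    if others:
        problems.append("traffic also goes to: " + ", ".join(others))
    return problems
```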
Cloud
AKS private ingress returns 502 or reaches no service endpoints
Separate private DNS, Application Gateway health, ingress controller routing, Kubernetes service selectors, endpoint slices and pod readiness before rolling back a deployment.
01 Evidence
Private DNS answer from the caller network, Application Gateway backend health and host header, ingress controller routing, Kubernetes service selectors, endpoint slices, pod readiness, and controller and application logs.
02 First checks
- Resolve the hostname from the caller network
- Check Application Gateway backend health and host header
- Verify ingress, service and endpoint slices
- Correlate controller and application logs
03 Bounded action
Run the first checks in order: (1) resolve the hostname from the caller network; (2) check Application Gateway backend health and host header; (3) verify ingress, service and endpoint slices; (4) correlate controller and application logs. Open the linked notes before changing production.
04 Rollback
Stop the change, restore the last known safe state and keep the captured evidence for comparison.
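A service with no endpoints usually comes down to a selector that matches no ready pod. The sketch below reproduces that equality-based matching on data already pulled from the cluster; the dictionary shapes and the `ready_backends` name are assumptions for illustration.

```python
def selector_matches(selector, labels):
    """True if every key/value pair in the service selector is present on the pod."""
    # A service without a selector gets no automatically managed endpoints.
    return bool(selector) and all(labels.get(k) == v for k, v in selector.items())


def ready_backends(selector, pods):
    """Names of pods that both match the selector and report ready."""
    return [
        pod["name"] for pod in pods
        if selector_matches(selector, pod["labels"]) and pod["ready"]
    ]
```

An empty list with matching but unready pods points at readiness probes; an empty list with no matching pods points at labels or the selector.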
Cloud
Azure Functions private HTTP endpoint returns 403, 503 or no request logs
Separate private DNS, Private Endpoint reachability, access restrictions, Functions runtime state, private storage and Application Insights evidence before redeploying code or opening public access.
01 Evidence
Private DNS answer from the caller network, Private Endpoint status, access restrictions, Functions runtime state, private storage reachability, and Application Insights requests, traces and exceptions.
02 First checks
- Resolve the hostname from the caller network
- Replay with a correlation ID
- Check Function App access restrictions and Private Endpoint status
- Correlate requests, traces and exceptions
03 Bounded action
Run the first checks in order: (1) resolve the hostname from the caller network; (2) replay with a correlation ID; (3) check Function App access restrictions and Private Endpoint status; (4) correlate requests, traces and exceptions. Open the linked notes before changing production.
04 Rollback
Stop the change, restore the last known safe state and keep the captured evidence for comparison.
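Replaying with a correlation ID only needs a unique value that can be searched for afterwards. A minimal sketch using the standard library; the `x-correlation-id` header name is an assumption, and it is only useful if the gateway or the function actually logs that header.

```python
import urllib.request
import uuid


def build_replay(url, correlation_id=None):
    """Prepare a GET request tagged with a unique, searchable correlation ID."""
    cid = correlation_id or str(uuid.uuid4())
    request = urllib.request.Request(url, method="GET", headers={"x-correlation-id": cid})
    return cid, request
```

Send the prepared request from the caller network with `urllib.request.urlopen(request, timeout=10)`, note the time, then search requests, traces and exceptions for the returned ID. No match at all suggests the request stopped before the Function App.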
Cloud
Azure WAF blocks a legitimate request
Start from blocked requests, rule ID and URI before deciding between exclusion, custom rule or application fix.
01 Evidence
Blocked URI, ruleId, match variable, client IP, hostname, request ID and exact time window.
02 First checks
- List blocked URIs in KQL
- Identify ruleId and match field
- Validate false-positive scope
03 Bounded action
Create the smallest exclusion or custom rule that covers the false positive without disabling the rule globally.
04 Rollback
Remove the exclusion/custom rule and verify expected blocking returns for the same rule family.
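Before writing an exclusion, it is worth counting which rule and URI pairs are actually blocking. The sketch below works on rows exported from the log query; the column names `action_s`, `ruleId_s` and `requestUri_s` are assumptions modeled on a diagnostics-style export and may differ in your workspace.

```python
from collections import Counter


def summarize_blocks(rows):
    """Count blocked requests per (rule ID, URI), most frequent first."""
    counts = Counter(
        (row["ruleId_s"], row["requestUri_s"])
        for row in rows
        if row.get("action_s") == "Blocked"
    )
    return counts.most_common()
```

A single rule ID concentrated on one URI supports a narrow exclusion; the same rule firing across many URIs suggests the application payload needs a closer look first.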
Automation
Terraform state lock is stuck
Prove that no apply is still running before using force-unlock, then restart with a clean plan.
01 Evidence
Lock ID, lock owner, CI run, backend target, pending plan and whether an apply is still active.
02 First checks
- Identify lock owner
- Check CI job status
- Run plan after unlock
03 Bounded action
Unlock only after proving no apply is running, then start with a fresh plan before any apply.
04 Rollback
Return to previous commit or restore the last validated state version if drift was introduced.
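The "prove first, unlock second" rule can be enforced in a small wrapper. This sketch only builds the commands and refuses when any CI run looks active; the run dictionary shape and the status values are hypothetical and depend on your CI system.

```python
ACTIVE_STATUSES = {"queued", "running", "in_progress"}  # assumed CI status names


def unlock_steps(lock_id, ci_runs):
    """Return the commands to run, or raise if an apply may still be active."""
    active = [run["id"] for run in ci_runs if run["status"] in ACTIVE_STATUSES]
    if active:
        raise RuntimeError(f"refusing to unlock, active runs: {active}")
    return [
        ["terraform", "force-unlock", lock_id],
        ["terraform", "plan"],  # fresh plan before any apply
    ]
```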
Infrastructure
A secret rotation or managed identity change breaks an application or pipeline consumer
Separate preparation, cutover, revocation and managed identity diagnostics; validate the real execution identity, private path and authentication errors before deleting the old value or broadening access.
01 Evidence
List of real consumers, the runtime execution identity and its vault read access, private DNS answer and source network, and authentication errors (401/403/500 or Key Vault denials) for each phase: preparation, cutover and revocation.
02 First checks
- List real consumers
- Verify the runtime identity and vault read access
- Check private DNS and source network
- Watch for 401/403/500 responses or Key Vault denials
03 Bounded action
Run the first checks in order: (1) list real consumers; (2) verify the runtime identity and vault read access; (3) check private DNS and source network; (4) watch for 401/403/500 responses or Key Vault denials. Open the linked notes before changing production.
04 Rollback
Stop the change, restore the last known safe state and keep the captured evidence for comparison.
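Revocation is the step that cannot be undone quietly, so it helps to gate it on evidence from every consumer. A minimal sketch, assuming each consumer record carries the secret version it currently reads and a count of recent authentication errors (both hypothetical fields):

```python
def revocation_blockers(consumers, new_version):
    """Names of consumers that would break if the old value were revoked now."""
    return [
        consumer["name"] for consumer in consumers
        if consumer["secret_version"] != new_version or consumer["auth_errors"] > 0
    ]
```

An empty list means cutover is complete for the consumers you listed; it says nothing about consumers missing from the inventory, which is why listing real consumers comes first.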
Automation
An automation entry point behaves like a remote console
Bound inputs, templates and repository structure before exposing operations to more users.
01 Evidence
Inputs accepted by the template, permissions, inventory scope, repository branch and audit trail.
02 First checks
- List accepted inputs
- Remove arbitrary command fields
- Review job template permissions
03 Bounded action
Replace arbitrary inputs with bounded choices and isolate job templates by operational intent.
04 Rollback
Disable the exposed template or revert to the previous approved template version.
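Replacing a free-text field with bounded choices can be as small as an enumeration plus an allow-list. The sketch below is illustrative only: the action names and targets are hypothetical, and a real job template would enforce the same idea in its own survey or input schema.

```python
from enum import Enum


class Action(Enum):
    RESTART_SERVICE = "restart_service"
    ROTATE_LOGS = "rotate_logs"


ALLOWED_TARGETS = {"web", "worker"}  # hypothetical inventory scope


def parse_request(action, target):
    """Accept only approved action/target pairs; reject anything free-form."""
    try:
        chosen = Action(action)
    except ValueError:
        raise ValueError(f"action {action!r} is not an approved choice") from None
    if target not in ALLOWED_TARGETS:
        raise ValueError(f"target {target!r} is outside the approved scope")
    return chosen, target
```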
AI
A private AI agent can act but nobody can explain the action
Tie sources, identities, tool calls, logs and human validation before increasing autonomy.
01 Evidence
Source documents, identity, tool calls, prompt context, logs and human approval point.
02 First checks
- List approved sources
- Trace tool calls
- Define human approval points
03 Bounded action
Reduce tool scope, require approval on sensitive actions and trace every tool call to a source.
04 Rollback
Disable the tool integration or force human validation until the action path is explainable.
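The bounded action above amounts to a gate in front of every tool call. The following is a minimal sketch under stated assumptions: the `ToolCall` shape, the tool names and the `SENSITIVE_TOOLS` set are hypothetical and not tied to any agent framework.

```python
from dataclasses import dataclass
from typing import List, Optional

SENSITIVE_TOOLS = {"delete_resource", "send_email"}  # hypothetical tool names


@dataclass
class ToolCall:
    tool: str
    identity: str
    source_ids: List[str]
    approved_by: Optional[str] = None


def authorize(call, approved_sources):
    """Allow a tool call only if it is traceable and, when sensitive, human-approved."""
    if not call.source_ids:
        return False, "no source cited"
    unknown = [s for s in call.source_ids if s not in approved_sources]
    if unknown:
        return False, f"unapproved sources: {unknown}"
    if call.tool in SENSITIVE_TOOLS and not call.approved_by:
        return False, "human approval required"
    return True, "allowed"
```

Logging the returned reason next to the identity and tool name gives the audit trail the evidence list asks for.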