
Azure Container Apps: diagnose private ingress before changing revisions

Build an operational runbook for Azure Container Apps private ingress failures by separating DNS, ingress mode, revision routing, application logs and rollback evidence.

09 Jun 2026 | azure, container-apps, private-endpoint, dns, kql, logs, monitoring, runbook, rollback, automation

Azure Container Apps is often used as a convenient landing zone for internal APIs, lightweight backends or event-driven services. The operational difficulty starts when a private endpoint, an internal environment, Application Gateway, DNS forwarding and revision traffic all sit on the same request path. A 404, 502, empty response or intermittent failure can look like an application bug while the real issue is a hostname, ingress setting, revision split or private DNS record.

The use case is a private API hosted on Azure Container Apps. Internal clients reach a friendly hostname, traffic may enter through Application Gateway, and the container app exposes internal ingress inside a Container Apps environment. A new revision has just been deployed, or a private endpoint/DNS change has been applied. Before rolling back blindly or changing traffic weights, the runbook must prove which layer is failing.

Write the private ingress path as an operational object

A Container Apps private path has more moving parts than the application code suggests. The environment can be internal. The app can expose ingress as internal or external. A private endpoint can be used for the environment. A custom domain can point to Application Gateway or to the Container Apps environment FQDN. Inside the environment, traffic is routed through the platform proxy and then to one or more revisions.

text container-apps-private-path.txt

Internal client
  Resolves api.internal.example.com
  Calls the expected private hostname

Optional Application Gateway
  Terminates TLS or forwards with the configured host header
  Runs WAF and health probes
  Sends traffic to the Container Apps environment path

Container Apps environment
  Exposes private endpoint or internal load balancer path
  Resolves the environment domain through private DNS
  Routes traffic through the platform ingress proxy

Container app
  Accepts internal ingress on the configured target port
  Splits traffic between active revisions
  Emits system and console logs

Backend dependencies
  Require private DNS, identity, secrets or managed service access

This map prevents the first common mistake: treating every failed request as a broken revision. If the hostname still resolves publicly, if Application Gateway health probes fail, or if traffic never reaches the Container Apps environment, changing a revision will only add noise.
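One way to make the map actionable is to record a result per layer, in request-path order, and stop at the first layer that is not proven healthy. The following is a minimal sketch under assumptions: the layer names and the `first_failing_layer` helper are hypothetical, and the statuses are expected to come from checks the operator has already run.

```shell
#!/bin/sh
# Hypothetical helper: takes "layer=status" pairs in request-path order
# and prints the first layer whose status is anything other than "ok".
# A status such as "unknown" is treated as not proven, so it also stops the walk.
first_failing_layer() {
  for pair in "$@"; do
    layer=${pair%%=*}
    status=${pair#*=}
    if [ "$status" != "ok" ]; then
      echo "$layer"
      return 0
    fi
  done
  echo "none"
}

# Example: DNS is proven, the gateway probe fails, later layers are not yet relevant.
first_failing_layer dns=ok gateway=fail environment=unknown revision=unknown
```

With this ordering, a failing gateway is reported before anyone looks at revisions, which is the behaviour the map is meant to enforce.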

Separate DNS, ingress and revision symptoms

The runbook should classify the symptom before changing anything. A missing DNS record does not need a rollback. A revision crash does. A traffic split pointing at an unhealthy revision needs a controlled shift, not a WAF exception.

text symptom-reading.txt

Symptom (indented line: what to check)
Hostname resolves to public address
  Check private DNS zone, VNet link, forwarding rule and custom domain target

Application Gateway returns 502
  Check backend health, host header, TLS/SNI and probe path toward Container Apps

Request reaches Container Apps but returns 404
  Check ingress type, target port, custom domain binding, path routing and revision label

Request is intermittent
  Check active revisions, traffic weights, replica restarts and scale events

Request reaches app but dependency fails
  Check managed identity, private DNS, Key Vault or downstream service firewall

The useful question is not only "is the app up?" but "does this exact hostname reach the expected environment, the expected ingress configuration and the expected revision?"

Prove private DNS before debugging the container

Start from the same network that real clients use. If a diagnostic runner sits outside the private path, it can produce a clean result while production clients fail, or the opposite. The first check should capture the hostname, CNAME chain and final address.

bash 01-container-apps-private-dns-check.sh

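# Run this from the same network segment as the real clients.
# TARGET_HOST is used instead of HOSTNAME, which bash already sets to the local machine name.
TARGET_HOST=api.internal.example.com

# Capture the full answer, including any CNAME chain.
nslookup "$TARGET_HOST"
dig +short "$TARGET_HOST"

# The last line of the short answer is normally the final A record.
a=$(dig +short "$TARGET_HOST" | tail -n 1)
case "$a" in
# RFC 1918 ranges: 10.0.0.0/8, 172.16.0.0/12 (172.16 to 172.31), 192.168.0.0/16
10.*|172.1[6-9].*|172.2[0-9].*|172.3[01].*|192.168.*)
  echo "private_resolution_ok=$a"
  ;;
*)
  echo "unexpected_public_or_empty_resolution=$a"
  exit 2
  ;;
esac

# Show which certificate is served for this hostname over SNI.
openssl s_client -connect "$TARGET_HOST:443" -servername "$TARGET_HOST" </dev/null 2>/dev/null | openssl x509 -noout -subject -issuer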

If this fails, keep the fix at the DNS or private endpoint layer: private DNS zone records, VNet links, resolver forwarding or the custom domain target. A new container image cannot repair a resolver path.
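Glob patterns for the 172.16.0.0/12 range are easy to get wrong: a broad pattern such as `172.2*` would also match public addresses like 172.200.x.x. A numeric comparison on the second octet can be more robust. The sketch below assumes IPv4 only and the RFC 1918 ranges; `is_rfc1918` is a hypothetical helper name, and the sample addresses are illustrative.

```shell
#!/bin/sh
# Hypothetical helper: exits 0 when the argument is an RFC 1918 IPv4 address.
is_rfc1918() {
  # Reject empty input and anything that is not digits and dots.
  case "$1" in
    ''|*[!0-9.]*) return 1 ;;
  esac
  # Split the dotted quad into positional parameters.
  old_ifs=$IFS
  IFS=.
  set -- $1
  IFS=$old_ifs
  [ "$#" -eq 4 ] || return 1
  for octet in "$@"; do
    [ -n "$octet" ] && [ "$octet" -le 255 ] || return 1
  done
  case "$1" in
    10)  return 0 ;;
    172) [ "$2" -ge 16 ] && [ "$2" -le 31 ] ;;
    192) [ "$2" -eq 168 ] ;;
    *)   return 1 ;;
  esac
}

# Illustrative addresses only.
for addr in 10.20.0.4 172.25.1.9 172.200.1.9 20.50.1.2; do
  if is_rfc1918 "$addr"; then
    echo "$addr private"
  else
    echo "$addr not-private"
  fi
done
```

An environment whose VNet uses address space outside RFC 1918 would need its own ranges added, so the result should be read as "matches the expected private ranges", not as proof that the path is private.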

Read Container Apps system and console logs together

Container Apps writes to two useful Log Analytics tables: ContainerAppSystemLogs_CL for platform events and ContainerAppConsoleLogs_CL for application output. During an ingress incident, looking at only stdout/stderr is too narrow. The platform may already be telling you that a revision is provisioning, deactivating, failing probes or receiving no traffic.

kusto 02-container-apps-private-ingress-triage.kql

let Window = 2h;
let App = "orders-api";
let Env = "aca-prod-weu";
let System =
ContainerAppSystemLogs_CL
| where TimeGenerated > ago(Window)
| where ContainerAppName_s == App or EnvironmentName_s == Env
| project TimeGenerated,
        Source="system",
        Environment=EnvironmentName_s,
        App=ContainerAppName_s,
        Revision=RevisionName_s,
        Replica="",
        Message=Log_s;
let Console =
ContainerAppConsoleLogs_CL
| where TimeGenerated > ago(Window)
| where ContainerAppName_s == App
| project TimeGenerated,
        Source="console",
        Environment=EnvironmentName_s,
        App=ContainerAppName_s,
        Revision=RevisionName_s,
        Replica=tostring(ContainerGroupName_g),
        Message=Log_s;
System
| union Console
| where Message has_any ("ingress", "probe", "revision", "error", "failed", "timeout", "502", "404")
| order by TimeGenerated desc

This query gives the incident lead a compact picture: platform events, revision names, replicas and application messages in one timeline. It is intentionally broad at the start. Once the failing revision or event family is identified, narrow the query.
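Narrowing can be scripted so the follow-up query is built the same way every time. The sketch below only assembles query text for one app and one revision; it reuses the table and column names from the triage query, while the `build_revision_query` helper and its default 30-minute window are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical helper: prints a narrowed KQL query for one app and one revision.
# Arguments: app name, revision name, optional time window (defaults to 30m).
build_revision_query() {
  app=$1
  revision=$2
  window=${3:-30m}
  cat <<EOF
ContainerAppConsoleLogs_CL
| where TimeGenerated > ago($window)
| where ContainerAppName_s == "$app" and RevisionName_s == "$revision"
| project TimeGenerated, Revision=RevisionName_s, Message=Log_s
| order by TimeGenerated desc
EOF
}

# Print the narrowed query for the suspect revision.
build_revision_query orders-api orders-api--000019 30m
```

The printed text can be pasted into the Log Analytics query editor. If the team runs queries from the CLI, it could also be passed to `az monitor log-analytics query` through `--analytics-query`, assuming that command is available in the installed CLI and the workspace ID is known.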

Compare configured traffic with observed logs

Container Apps revisions make rollback attractive, but traffic weights need evidence. If multiple revisions are active, a small percentage can still create a visible intermittent incident. If labels are used, a caller may target a labeled revision even when default traffic looks healthy.

bash 03-container-apps-revision-check.sh

APP=orders-api
RG=rg-prod-apps

# JSON output keeps the nested traffic array, which table output does not render.
az containerapp ingress show --name "$APP" --resource-group "$RG" --query '{external:external,targetPort:targetPort,transport:transport,traffic:traffic}' --output json

az containerapp revision list --name "$APP" --resource-group "$RG" --query '[].{name:name,active:properties.active,traffic:properties.trafficWeight,created:properties.createdTime}' --output table

The comparison is simple: if logs show failures only on one revision and traffic points to it, shift traffic or roll back that revision. If no request reaches any revision, stay at ingress, DNS or gateway. If every revision logs the same downstream 403, the dependency path is the better suspect.
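The comparison can be written down as a small decision helper so the on-call engineer applies the same rule each time. This is a sketch under assumptions: the `next_action` name, its four inputs and the output labels are hypothetical, and the inputs are counts the operator reads from the logs and the revision list, not values the script collects itself.

```shell
#!/bin/sh
# Hypothetical decision helper. Arguments:
#   $1 requests_seen  - "yes" if any revision logged the test request, else "no"
#   $2 failing_revs   - number of revisions whose logs show the failure
#   $3 total_revs     - number of revisions currently receiving traffic
#   $4 dependency_err - "yes" if the failures are downstream errors (for example a 403), else "no"
next_action() {
  if [ "$1" != "yes" ]; then
    # Nothing reached a revision: the problem is earlier in the path.
    echo "stay-at-dns-gateway-ingress"
  elif [ "$2" -eq 0 ]; then
    # Assumption: no failure in revision logs means more evidence is needed.
    echo "no-failing-revision-in-logs"
  elif [ "$2" -eq "$3" ] && [ "$4" = "yes" ]; then
    # Every revision reports the same downstream error.
    echo "check-dependency-path"
  else
    echo "shift-or-roll-back-failing-revision"
  fi
}

# Example: requests arrive, one of two revisions fails, no dependency error.
next_action yes 1 2 no
```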

Make rollback small and observable

A rollback should not become a silent workaround. Before shifting traffic, capture the failing revision, current weights, last deployment reference and the diagnostic evidence that made rollback reasonable. After the rollback, keep the same hostname and correlation header for validation.

bash 04-bounded-rollback.sh

APP=orders-api
RG=rg-prod-apps
GOOD_REV=orders-api--000018
BAD_REV=orders-api--000019

az containerapp ingress traffic set --name "$APP" --resource-group "$RG" --revision-weight "$GOOD_REV=100" "$BAD_REV=0"

CORRELATION_ID="ops-$(date +%Y%m%d%H%M%S)"
curl -vk "https://api.internal.example.com/health" -H "x-correlation-id: $CORRELATION_ID"

echo "validate_correlation_id=$CORRELATION_ID"

If the rollback fixes the request but DNS, gateway and ingress checks were never captured, the team still has a fragile result. The next deployment may reintroduce the failure because the true condition was not written down.
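A simple guard can make that evidence a precondition for the traffic shift. The sketch below is one possible shape, not a required format: the `rollback_evidence_ok` helper, the variable names and the sample values are all hypothetical and should be adapted to the team's incident template.

```shell
#!/bin/sh
# Hypothetical guard: succeeds only when every evidence field is non-empty.
rollback_evidence_ok() {
  missing=""
  for field in FAILING_REV CURRENT_WEIGHTS LAST_DEPLOY_REF DNS_CHECK GATEWAY_CHECK INGRESS_CHECK; do
    # The field names are fixed constants, so eval only reads known variables.
    eval "value=\${$field:-}"
    [ -n "$value" ] || missing="$missing $field"
  done
  if [ -n "$missing" ]; then
    echo "missing_evidence:$missing"
    return 1
  fi
  echo "evidence_complete"
}

# Illustrative values only; INGRESS_CHECK has deliberately not been captured yet.
FAILING_REV="orders-api--000019"
CURRENT_WEIGHTS="orders-api--000018=50 orders-api--000019=50"
LAST_DEPLOY_REF="deployment-reference-placeholder"
DNS_CHECK="private_resolution_ok"
GATEWAY_CHECK="backend_healthy"
INGRESS_CHECK=""

rollback_evidence_ok || echo "capture the missing checks before shifting traffic"
```

Calling the guard immediately before the traffic shift keeps the rollback small and leaves a record of why it was judged reasonable.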

Conclusion

Azure Container Apps private ingress is operationally friendly only when the path is observable. DNS, private endpoint, ingress mode, platform proxy, revision routing and application logs all need to be read before a fix is chosen.

The practical rule is direct: do not change revision weights until requests are shown to reach Container Apps, do not change DNS until the resolver path has been checked from the client network, and do not add gateway exceptions without a visible gateway symptom. With that discipline, a private Container Apps incident becomes a bounded diagnosis instead of a blind rollback.