Azure Application Gateway: diagnose 502 errors without mixing DNS, TLS and backend health
A diagnostic method for Azure Application Gateway 502 errors that separates DNS resolution, probes, backend settings, TLS, hostnames, certificates and real application behavior.
A 502 error on Azure Application Gateway rarely reveals its full cause. It says only that the gateway did not receive a usable response from the backend, and several layers can produce that result: DNS resolution, routing, NSG rules, the probe, the backend certificate, the hostname, a timeout, a stopped backend, WAF behavior or simply a poor health endpoint.
The scenario here is a business application published behind Application Gateway with an HTTPS listener, a private backend and dedicated probes. The goal is to diagnose without mixing every layer. A 502 should not immediately trigger a WAF change, certificate replacement or application restart. First, identify which layer is failing.
Read Application Gateway as a chain
Application Gateway is easier to troubleshoot when its objects are followed in order. The listener receives a request for a name. The rule sends it to a backend pool with a backend setting. The backend setting defines protocol, port, hostname and associated probe. The probe determines whether the backend is considered healthy.
HTTPS client
-> listener app.example.com
-> rule app-prd
-> backend pool bp-app-prd
-> backend setting bhs-app-prd
-> probe probe-app-prd
-> HTTPS backend app.example.com:443
If the probe uses a different hostname from the backend setting, the result can be misleading. If the backend pool contains a private IP but the backend certificate expects a DNS name, TLS can fail even when the server is reachable. If the probe tests / while the application requires authentication, the backend may be marked unhealthy although the application is available.
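These three mismatches can be checked mechanically once the backend setting and the probe have been exported as JSON. The following is a minimal Python sketch, not a definitive tool: the function name find_chain_mismatches and the sample values are hypothetical, and the field names (hostName, host, path, pickHostNameFromBackendHttpSettings) are assumed to match what the CLI returns with -o json.

```python
import ipaddress


def is_ip(value: str) -> bool:
    """Return True if value parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


def find_chain_mismatches(setting: dict, probe: dict, pool_addresses: list) -> list:
    """Return readable warnings for the hostname inconsistencies described above."""
    warnings = []
    setting_host = setting.get("hostName")
    probe_host = probe.get("host")
    https = str(setting.get("protocol", "")).lower() == "https"

    # The probe and the backend setting should normally present the same name.
    if not probe.get("pickHostNameFromBackendHttpSettings"):
        if probe_host and setting_host and probe_host.lower() != setting_host.lower():
            warnings.append(
                f"probe host {probe_host!r} differs from backend setting host {setting_host!r}"
            )

    # An IP-only pool gives no DNS name to compare with the backend certificate.
    if https and not setting_host and any(is_ip(a) for a in pool_addresses):
        warnings.append("HTTPS backend setting has no hostName but the pool contains IP addresses")

    # Probing "/" may land on a page that requires authentication.
    if probe.get("path") == "/":
        warnings.append("probe path is '/'; confirm it does not require authentication")
    return warnings


# Hypothetical exports of bhs-app-prd and probe-app-prd.
setting = {"protocol": "Https", "port": 443, "hostName": "app.example.com"}
probe = {"protocol": "Https", "host": "app-internal.example.com", "path": "/"}
for warning in find_chain_mismatches(setting, probe, ["10.10.54.8"]):
    print(warning)
```

An empty list does not prove the chain is correct; it only rules out these specific inconsistencies before moving on to backend health.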
Start with backend health
Backend health gives the first useful clue. It shows whether Application Gateway considers the backend healthy and often includes the associated message: probe failed, certificate mismatch, timeout, forbidden, connection refused or DNS issue. That message is not final proof, but it points to the layer to verify.
az network application-gateway show-backend-health -g rg-network-hub-prod -n agw-internet-prod-001 -o table
If the backend is unhealthy, handle the probe path before testing the full Internet flow. If the backend is healthy but the client receives 502, look at listeners, rules, timeouts, access logs and real application behavior.
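When the gateway has many pools, it can help to reduce the backend health output to the unhealthy servers and the layer each message points to. The sketch below assumes the command was run with -o json and that the result has the shape backendAddressPools[].backendHttpSettingsCollection[].servers[] with address, health and healthProbeLog fields. The sample document, the message text and the keyword table are invented for illustration and are not an official message catalogue.

```python
import json

# Illustrative keyword hints: which layer to verify first for a given probe message.
LAYER_HINTS = [
    ("certificate", "TLS / certificate"),
    ("common name", "TLS / hostname"),
    ("resolve", "DNS"),
    ("timed out", "network path or slow backend"),
    ("refused", "port / listening service"),
    ("status code", "probe path or application response"),
]


def suggest_layer(probe_log: str) -> str:
    """Map a probe log message to the layer worth checking first."""
    text = probe_log.lower()
    for keyword, layer in LAYER_HINTS:
        if keyword in text:
            return layer
    return "unknown; read the full message"


def unhealthy_servers(health: dict) -> list:
    """Flatten the backend health document into one row per non-healthy server."""
    rows = []
    for pool in health.get("backendAddressPools", []):
        for settings in pool.get("backendHttpSettingsCollection", []):
            for server in settings.get("servers", []):
                if server.get("health") != "Healthy":
                    log = server.get("healthProbeLog") or ""
                    rows.append({
                        "address": server.get("address"),
                        "health": server.get("health"),
                        "log": log,
                        "check_first": suggest_layer(log),
                    })
    return rows


# Hypothetical output; real messages and addresses will differ.
SAMPLE = """
{"backendAddressPools": [{"backendHttpSettingsCollection": [{"servers": [
  {"address": "10.10.54.8", "health": "Unhealthy",
   "healthProbeLog": "Connection to the backend timed out."},
  {"address": "10.10.54.9", "health": "Healthy", "healthProbeLog": "Success."}
]}]}]}
"""

for row in unhealthy_servers(json.loads(SAMPLE)):
    print(row["address"], row["health"], "->", row["check_first"])
```

The hint is a starting point for the next check, consistent with treating the message as a pointer rather than proof.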
Verify the backend hostname
A frequent issue when publishing over HTTPS is the hostname used toward the backend. When the backend pool contains a private IP, Application Gateway cannot infer the name the server must present in its certificate. The backend setting must define the expected hostname explicitly.
az network application-gateway http-settings show -g rg-network-hub-prod --gateway-name agw-internet-prod-001 -n bhs-app-prd --query '{protocol:protocol, port:port, hostName:hostName, pickHostNameFromBackendAddress:pickHostNameFromBackendAddress, probe:probe.id}' -o json
The expected result should show HTTPS, the right port and an explicit hostName such as app.example.com. Calling the backend by IP may work over HTTP, but it becomes fragile over HTTPS because certificate validation relies on a name.
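To see why an IP address is fragile here, it helps to look at how a name is compared with the names in a certificate. The sketch below is a deliberately simplified matcher (exact match, or one wildcard in the left-most label), not the full validation logic of any TLS library or of Application Gateway. The certificate names are hypothetical.

```python
def name_matches(hostname: str, cert_name: str) -> bool:
    """Simplified check: exact match, or a single left-most wildcard label."""
    hostname = hostname.lower().rstrip(".")
    cert_name = cert_name.lower().rstrip(".")
    if cert_name.startswith("*."):
        suffix = cert_name[1:]  # "*.example.com" -> ".example.com"
        head, dot, tail = hostname.partition(".")
        # The wildcard covers exactly one label, so "a.b.example.com" is rejected.
        return bool(head) and dot == "." and "." + tail == suffix
    return hostname == cert_name


def certificate_covers(name: str, cert_names: list) -> bool:
    """True if any DNS name in the certificate matches the requested name."""
    return any(name_matches(name, cert_name) for cert_name in cert_names)


# Hypothetical DNS names carried by the backend certificate.
cert_names = ["app.example.com", "*.internal.example.com"]

print(certificate_covers("app.example.com", cert_names))  # True
print(certificate_covers("10.10.54.8", cert_names))       # False
```

The private IP reaches the same server, but nothing in the certificate matches it, which is why the backend setting needs an explicit hostName.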
Test from a point close to the gateway
A test from an administrator workstation does not prove that the gateway reaches the backend. Test from a source that shares the network path with Application Gateway, such as a diagnostic VM in the relevant hub or spoke. Use the backend hostname, not just the IP address.
nslookup app.example.com
curl -vk https://app.example.com/health
curl -vk --resolve app.example.com:443:10.10.54.8 https://app.example.com/health
The last command forces the IP address while keeping the TLS hostname. If it fails, the issue is often on the backend side: certificate, reverse proxy, application state, closed port or incorrect health path.
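The same forced-IP test can be scripted so that the failing stage is reported explicitly. The following is a sketch under stated assumptions: check_backend_tls is a hypothetical helper name, and unlike curl -k it validates the certificate, which is closer to what a client enforcing certificate validation would see. For a backend certificate issued by a private CA, the optional cafile argument is assumed to point at a PEM file containing that CA.

```python
import socket
import ssl
from typing import Optional


def check_backend_tls(ip: str, hostname: str, port: int = 443,
                      timeout: float = 5.0, cafile: Optional[str] = None) -> dict:
    """Connect to a fixed IP while presenting `hostname` for SNI and validation."""
    context = ssl.create_default_context(cafile=cafile)

    # Stage 1: plain TCP. A failure here points to routing, NSG or a closed port.
    try:
        sock = socket.create_connection((ip, port), timeout=timeout)
    except OSError as exc:
        return {"ok": False, "stage": "tcp", "detail": str(exc)}

    # Stage 2: TLS handshake using the DNS name, not the IP address.
    try:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            cert = tls.getpeercert()
    except ssl.SSLCertVerificationError as exc:
        # Untrusted issuer, expired certificate or name mismatch.
        return {"ok": False, "stage": "certificate", "detail": exc.verify_message}
    except OSError as exc:
        # Covers other ssl.SSLError cases and handshake timeouts.
        return {"ok": False, "stage": "tls", "detail": str(exc)}
    finally:
        sock.close()

    names = [value for kind, value in cert.get("subjectAltName", ()) if kind == "DNS"]
    return {"ok": True, "stage": "done", "detail": names}
```

Run from the diagnostic VM, a call such as check_backend_tls("10.10.54.8", "app.example.com") returns the stage that failed. A "tcp" result points to the network path, while "certificate" points to the backend certificate or to the hostname in the backend setting.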
Choose a stable probe
A probe should test technical availability, not a full business journey. It should be fast, stable, cheap and independent from unnecessary third-party services. A protected page is not always a good probe. A /health endpoint is useful if it represents the minimum state expected by the gateway.
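As an illustration of such an endpoint, here is a minimal sketch using only the Python standard library. The check names, the port and the handler are hypothetical; a real application would expose the equivalent route in its own framework.

```python
import json
from http.server import BaseHTTPRequestHandler


def health_status(checks: dict) -> tuple:
    """Run each local check and return (HTTP status code, JSON body)."""
    results = {name: bool(check()) for name, check in checks.items()}
    status = 200 if all(results.values()) else 503
    return status, json.dumps(results)


# Hypothetical checks: local, fast and free of third-party calls.
CHECKS = {
    "process_started": lambda: True,
}


class HealthHandler(BaseHTTPRequestHandler):
    """Answers GET /health without authentication or business logic."""

    def do_GET(self):
        if self.path != "/health":
            self.send_error(404)
            return
        status, body = health_status(CHECKS)
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)
```

It can be served for a quick test with http.server.HTTPServer(("0.0.0.0", 8080), HealthHandler).serve_forever(). The design choice is that each check reflects the minimum state the gateway needs, so a slow dependency does not take the backend out of rotation.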
az network application-gateway probe show -g rg-network-hub-prod --gateway-name agw-internet-prod-001 -n probe-app-prd --query '{protocol:protocol, host:host, path:path, interval:interval, timeout:timeout, match:match}' -o json
A probe that is too strict creates artificial outages. A probe that is too permissive lets a partially broken backend stay in rotation. The compromise must be documented.
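One way to document that compromise is to derive a few facts from the probe definition itself. The sketch below is an assumption-heavy illustration: it reads an unhealthyThreshold field that the query above does not select, falls back to defaults of 30 seconds, 30 seconds, 3 failures and 200-399 that should be verified against the actual gateway, and uses arbitrary rules to label a probe strict or permissive.

```python
def parse_ranges(status_codes: list) -> list:
    """Turn ["200-399", "401"] into [(200, 399), (401, 401)]."""
    ranges = []
    for item in status_codes:
        low, _, high = str(item).partition("-")
        ranges.append((int(low), int(high or low)))
    return ranges


def review_probe(probe: dict) -> dict:
    """Summarize how strict or permissive a probe definition looks."""
    # Fallback values are assumptions; confirm them for the gateway in use.
    interval = probe.get("interval") or 30
    timeout = probe.get("timeout") or 30
    threshold = probe.get("unhealthyThreshold") or 3
    match = probe.get("match") or {}
    ranges = parse_ranges(match.get("statusCodes") or ["200-399"])

    notes = []
    if threshold <= 1:
        notes.append("strict: a single failed probe marks the backend unhealthy")
    if any(high >= 500 for _, high in ranges):
        notes.append("permissive: 5xx responses are accepted as healthy")
    if any(low <= 403 and high >= 401 for low, high in ranges):
        notes.append("permissive: 401/403 accepted, which shows reachability only")
    if probe.get("path") == "/":
        notes.append("path '/' may hit a protected or heavy page")

    # Rough estimate of how long a dead backend stays in rotation.
    return {"detection_seconds": interval * threshold, "timeout": timeout, "notes": notes}


# Hypothetical export of probe-app-prd.
probe = {"path": "/health", "interval": 30, "timeout": 30,
         "unhealthyThreshold": 1, "match": {"statusCodes": ["200-499"]}}
print(review_probe(probe))
```

The output is meant to be pasted into the runbook next to the probe definition, so the accepted trade-off is visible to whoever handles the next incident.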
Separate WAF and backend issues
The WAF can block a request before it reaches the application, but it should not become the default suspect for every 502. A WAF block should appear in WAF logs and behave differently from an unhealthy backend. Check logs for the same period and, when possible, the same correlation context.
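Correlating the two log types can be sketched as follows. The example assumes access log and firewall log entries have been exported as JSON objects with a properties block containing httpStatus, transactionId, action and ruleId. Those field names follow the diagnostic log format as understood here and should be checked against the actual export; all sample values are invented.

```python
def waf_blocks_for_502s(access_entries: list, firewall_entries: list) -> dict:
    """Map each 502 transaction to the WAF rule IDs that blocked it (often none)."""
    blocks = {}
    for entry in firewall_entries:
        props = entry.get("properties", {})
        if props.get("action") == "Blocked":
            blocks.setdefault(props.get("transactionId"), []).append(props.get("ruleId"))

    report = {}
    for entry in access_entries:
        props = entry.get("properties", {})
        if str(props.get("httpStatus")) == "502":
            transaction = props.get("transactionId")
            report[transaction] = blocks.get(transaction, [])
    return report


# Hypothetical entries for the same time window.
access_log = [
    {"properties": {"httpStatus": 502, "transactionId": "tx-001", "requestUri": "/orders"}},
    {"properties": {"httpStatus": 403, "transactionId": "tx-002", "requestUri": "/search"}},
]
firewall_log = [
    {"properties": {"action": "Blocked", "ruleId": "949110", "transactionId": "tx-002"}},
]

print(waf_blocks_for_502s(access_log, firewall_log))  # {'tx-001': []}
```

An empty list for a 502 transaction means no explicit WAF block was found for it in the exported window, which is a reason to return to backend health and the backend setting rather than to WAF exclusions.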
If backend is unhealthy
  Read backend health
  Verify probe, hostname, TLS, port and routing
If backend is healthy but client receives an error
  Verify listener, rule, access logs, timeout and application
If WAF is suspected
  Find an explicit WAF block in logs
  Do not create an exception before identifying the rule
Conclusion
Diagnosing an Application Gateway 502 means following the chain in order: listener, rule, backend setting, pool, probe, network, TLS and application. Errors become easier to read when backend hostnames are explicit, probes test stable endpoints and logs are reviewed by layer.
A good runbook does not only list commands. It states which hypothesis each command validates. That discipline avoids random fixes: changing WAF for a TLS issue, modifying DNS for a poor probe, or restarting an application while the backend setting calls the wrong name.
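One way to enforce that discipline is to store the runbook as data, with a hypothesis attached to every step. The sketch below reuses the resource names from this article. The Step structure, the situation labels and the hypothesis wording are illustrative assumptions rather than a prescribed format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    situation: str   # "unhealthy", "healthy_but_error" or "waf_suspected"
    action: str      # command to run or log to read
    hypothesis: str  # what a result from this step confirms or rules out


RUNBOOK = [
    Step("unhealthy",
         "az network application-gateway show-backend-health "
         "-g rg-network-hub-prod -n agw-internet-prod-001 -o table",
         "the probe message names the failing layer"),
    Step("unhealthy",
         "az network application-gateway http-settings show -g rg-network-hub-prod "
         "--gateway-name agw-internet-prod-001 -n bhs-app-prd",
         "the backend setting calls the backend with the expected hostname"),
    Step("unhealthy",
         "curl -vk --resolve app.example.com:443:10.10.54.8 https://app.example.com/health",
         "the backend answers on that IP when addressed by name"),
    Step("healthy_but_error",
         "read access logs for the failing time window",
         "the error comes from a listener, rule, timeout or the application"),
    Step("waf_suspected",
         "search WAF logs for an explicit block",
         "a specific rule blocked the request"),
]


def steps_for(situation: str) -> list:
    """Return the runbook steps that apply to one situation, in order."""
    return [step for step in RUNBOOK if step.situation == situation]


for step in steps_for("unhealthy"):
    print(f"{step.action}\n  validates: {step.hypothesis}")
```

Keeping the hypothesis next to the command makes it harder to run a step whose result would not change the next decision.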