Azure Functions: diagnose a private HTTP endpoint before changing code
Build an operational runbook for private Azure Functions failures by separating DNS, Private Endpoint, access restrictions, private storage, Application Insights logs and rollback evidence.
A private Azure Function can fail before the code is involved. The hostname may still resolve publicly, the Private Endpoint may be approved but unreachable from the caller network, access restrictions may block the request, the runtime may fail to start because private storage is unreachable, or the app may return 403, 502, 503 or timeouts before the function executes.
The use case is an internal API served by an Azure Functions HTTP trigger. Consumers call it through a private network, sometimes behind Application Gateway or internal APIM. A change has just touched DNS, networking, storage, managed identity, application settings or the deployment package. Before redeploying code or opening public access, the runbook must prove where the request stops.
Read the private path as one chain
The diagnosis is more reliable when the team draws the full path before changing settings. A private Function App is not only a URL. It combines DNS, Private Link, access rules, the Functions runtime, platform storage, identity and logs.
Internal client or probe
  - Resolves api.internal.example.com
  - Calls the expected hostname with the correct host header
Private entry point
  - Application Gateway, internal APIM or direct client
  - Preserves TLS/SNI and correlation ID
Function App
  - Private Endpoint on the HTTP site
  - Access restrictions aligned with the caller network
  - Functions runtime started
Storage and dependencies
  - AzureWebJobsStorage reachable through the expected private path
  - Key Vault, files, queues or downstream APIs resolved privately when required
Observability
  - Application Insights or Log Analytics receives traces, requests and exceptions
  - Rollback targets the layer that actually changed
This map prevents broad fixes. A 403 caused by an access rule is not fixed by redeployment. A runtime that cannot start because private storage is blocked is not fixed by changing Application Gateway.
Classify the symptom before acting
The first question is not “does the function work?” but “does the private hostname reach the Functions site, is the runtime started, and did the function execute the request?”.
Symptom | First check
DNS returns a public address or no answer | Check privatelink.azurewebsites.net zone, VNet links and hybrid DNS forwarding
curl returns 403 immediately | Check access restrictions, Private Endpoint, source route and optional APIM/Application Gateway
curl returns 502 or 503 | Check runtime state, worker process, Functions settings and storage availability
No request appears in Application Insights | Go back to DNS, gateway, APIM, access restrictions or Private Endpoint
The request appears but fails with an exception | Read traces, exceptions, managed identity, Key Vault and downstream dependencies
Runtime does not start after a network change | Validate AzureWebJobsStorage, private storage DNS and storage account network rules
This classification gives a simple rule: do not touch code until function execution is visible in the logs.
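The curl-visible rows of this classification can be sketched as a small shell helper. This is only an illustration under stated assumptions: classify_probe is a hypothetical name, the layer labels are invented for this example, and mapping curl exit codes 6, 7, 28, 35 and 60 to layers is an interpretation of the symptom list rather than something Azure defines.

```shell
# Hypothetical helper, not part of any Azure tooling: maps one probe result
# to the first layer to inspect, following the symptom classification.
#   $1 = curl exit code
#   $2 = HTTP status (curl -w '%{http_code}' prints 000 when no response arrived)
classify_probe() {
  case "$1" in
    6)     echo "dns"; return 0 ;;              # curl could not resolve the host
    7|28)  echo "network-path"; return 0 ;;     # connection failed or timed out
    35|60) echo "tls-entry-point"; return 0 ;;  # TLS handshake or certificate problem
  esac
  case "$2" in
    403)     echo "access-restrictions" ;;
    502|503) echo "runtime-or-storage" ;;
    *)       echo "check-application-insights" ;;  # request got through: read the logs
  esac
}

# Example use from the consumer network (assumes curl is installed):
#   status=$(curl -sk -o /dev/null -w '%{http_code}' --max-time 10 https://api.internal.example.com/api/health)
#   classify_probe "$?" "$status"
```

The helper only suggests where to look first. A 403 can still come from an upstream gateway, so the label is a starting point, not a verdict.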
Prove DNS, TLS and access from the consumer network
The test must start from the same network as the real consumer: application subnet, probe runner, diagnostic VM or internal APIM environment. A test from a public workstation can give the wrong answer because it uses a different DNS path and a different source address.
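    # Avoid the names PATH and HOSTNAME: overwriting PATH stops the shell from finding nslookup, dig, openssl and curl.
    TARGET_HOST=api.internal.example.com
    URL_PATH=/api/health
    CORRELATION_ID="ops-$(date +%Y%m%d%H%M%S)"
    # DNS: the answer should be the Private Endpoint address.
    nslookup "$TARGET_HOST"
    dig +short "$TARGET_HOST"
    # TLS: show the certificate presented for this SNI name.
    openssl s_client -connect "$TARGET_HOST:443" -servername "$TARGET_HOST" </dev/null 2>/dev/null | openssl x509 -noout -subject -issuer
    # HTTP: -k skips certificate verification, so rely on the openssl step for certificate checks.
    curl -vk "https://$TARGET_HOST$URL_PATH" -H "x-correlation-id: $CORRELATION_ID" -H "x-naxaya-check: private-functions"
    echo "correlation_id=$CORRELATION_ID"
If resolution is not private, fix DNS before anything else. If TLS or the host header fails, inspect the entry point. If curl returns 403 and no Functions log shows the call, the problem is probably network or access restrictions.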
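Whether "resolution is private" can be checked mechanically. The sketch below labels each line of a DNS answer as private, public or skipped. It assumes the Private Endpoint sits in an RFC 1918 range (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16), which is common but not guaranteed for every VNet. The function names is_private_ipv4 and classify_answers are hypothetical.

```shell
# Hypothetical helpers. Assumption: the Private Endpoint address lies in an
# RFC 1918 range; adjust the checks if your VNet uses other address space.

# Returns 0 for an RFC 1918 address, 1 for any other valid IPv4, 2 for non-IPv4 input.
is_private_ipv4() {
  local a b c d rest octet
  IFS=. read -r a b c d rest <<EOF
$1
EOF
  [ -z "$rest" ] || return 2
  for octet in "$a" "$b" "$c" "$d"; do
    case "$octet" in
      ''|*[!0-9]*|????*) return 2 ;;   # empty, non-numeric or too long
    esac
    [ "$octet" -le 255 ] || return 2
  done
  [ "$a" -eq 10 ] && return 0
  [ "$a" -eq 172 ] && [ "$b" -ge 16 ] && [ "$b" -le 31 ] && return 0
  [ "$a" -eq 192 ] && [ "$b" -eq 168 ] && return 0
  return 1
}

# Reads DNS answers on stdin (one per line) and labels each one.
# CNAME lines from "dig +short" are not addresses and are reported as "skip".
classify_answers() {
  local line rc
  while IFS= read -r line; do
    is_private_ipv4 "$line"; rc=$?
    case "$rc" in
      0) echo "private $line" ;;
      1) echo "public $line" ;;
      *) echo "skip $line" ;;
    esac
  done
}

# Example use from the consumer network:
#   dig +short api.internal.example.com | classify_answers
```

Any line labelled public is a reason to stop and fix DNS before looking at the Function App.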
Check the Function App and private storage
A Function App can accept a private route but remain unusable if the runtime cannot reach its platform storage. This often appears after network hardening: the site is private, but AzureWebJobsStorage or its related file share no longer resolves or is not allowed from the right path.
    APP_RG=rg-prod-app
    APP_NAME=func-prod-orders
    # Site state and basic properties.
    az functionapp show -g "$APP_RG" -n "$APP_NAME" --query "{state:state, httpsOnly:httpsOnly, defaultHostName:defaultHostName, outboundIpAddresses:outboundIpAddresses}" -o table
    # Access restriction rules as currently applied.
    az functionapp config access-restriction show -g "$APP_RG" -n "$APP_NAME" -o table
    # Setting names only: values such as AzureWebJobsStorage can hold connection strings.
    az functionapp config appsettings list -g "$APP_RG" -n "$APP_NAME" --query "[?starts_with(name, 'AzureWebJobsStorage') || contains(name, 'WEBSITE_') || contains(name, 'FUNCTIONS_')].name" -o tsv
    # Private Endpoint connections and approval state (if the status column is empty, inspect the raw output with -o json).
    az network private-endpoint-connection list --id "$(az functionapp show -g "$APP_RG" -n "$APP_NAME" --query id -o tsv)" --query "[].{name:name,status:privateLinkServiceConnectionState.status}" -o table
The goal is not to paste secrets into a ticket. It is to validate each dimension: runtime active, expected restrictions, network settings present, Private Endpoint approved and storage reachable through private DNS.
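To test storage reachability through private DNS, the probe needs the storage hostnames, not the key. The sketch below derives them from a connection string and prints only hostnames. It assumes the usual AccountName=...;EndpointSuffix=... format and the blob, queue, table and file services. storage_hosts is a hypothetical helper, and an app using identity-based storage settings would take the account name from its own settings instead.

```shell
# Hypothetical helper: prints the storage service hostnames for a connection
# string without echoing the account key. Assumes the standard
# "Key=Value;Key=Value" connection string layout.
storage_hosts() {
  local conn="$1" account suffix svc
  account=$(printf '%s' "$conn" | tr ';' '\n' | sed -n 's/^AccountName=//p')
  suffix=$(printf '%s' "$conn" | tr ';' '\n' | sed -n 's/^EndpointSuffix=//p')
  [ -n "$account" ] || return 1          # no account name found: nothing to probe
  for svc in blob queue table file; do
    echo "$account.$svc.${suffix:-core.windows.net}"
  done
}

# Example use from a host in the Function App's network path
# (CONN is a placeholder for the connection string, read without logging it):
#   storage_hosts "$CONN" | while read -r host; do nslookup "$host"; done
```

Each hostname that resolves to a public address, or does not resolve at all, is a candidate explanation for a runtime that fails to start after network hardening.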
Correlate requests, traces and exceptions in KQL
When the request reaches the Function App, Application Insights should show at least one request, trace or exception in the incident window. Correlating by hostname, URL and correlation ID prevents mixing the private failure with unrelated application noise.
    let Window = 2h;
    let Host = "api.internal.example.com";
    let CorrelationId = "ops-20260611080000";
    // Assumes the app or gateway records the x-correlation-id header in customDimensions.
    let Req =
        requests
        | where timestamp > ago(Window)
        | where url has Host or tostring(customDimensions["x-correlation-id"]) == CorrelationId
        | project timestamp, Source="request", name, resultCode, success, operation_Id, url, cloud_RoleName;
    let Tr =
        traces
        | where timestamp > ago(Window)
        | where message has_any (Host, CorrelationId, "Host lock", "storage", "listener", "Starting", "Stopping", "Function")
        | project timestamp, Source="trace", message, severityLevel, operation_Id, cloud_RoleName;
    let Ex =
        exceptions
        | where timestamp > ago(Window)
        | project timestamp, Source="exception", message=outerMessage, severityLevel, operation_Id, cloud_RoleName;
    Req
    | union Tr, Ex
    | order by timestamp desc
Quick read: no request after a correlated curl points to DNS, private edge or access restrictions; a request logged with 403 or 503 points to platform and configuration; a request with an exception points to code, identity or a dependency.
Keep rollback bounded
Rollback should restore the layer that changed, not the entire environment. If a DNS change broke private resolution, restore the zone or forwarder. If an access restriction blocked APIM, restore the rule. If the last package throws an exception after entering the function, roll back the deployment.
Recent change | Rollback | Evidence
Private DNS | Restore the previous record, VNet link or forwarder | Private nslookup + curl with correlation ID
Access restriction | Restore the previous source rule | Request visible in Application Insights
Private storage | Restore previous storage network permission or DNS path | Runtime started + no Host lock/storage errors
Application package | Return to the validated package or slot | Same request succeeds with operation_Id kept
Conclusion
A private Azure Function must be diagnosed like a production path, not like a small serverless handler. DNS, Private Endpoint, restrictions, runtime, storage and logs must be separated before code changes.
The operating rule is intentionally strict: until a correlated request appears in logs, stay on network and platform; once it appears, move to runtime, identity, dependencies or code. That separation reduces temporary public openings, unnecessary redeployments and overly broad rollbacks.