AWS API Gateway Latency Metrics Madness


Recently I observed an interesting phenomenon: two metrics used to measure the latency of requests going through API Gateway ended up inverted in a way you wouldn't think possible.

For context, Latency is defined as:

The time between when API Gateway receives a request from a client and when it returns a response to the client. The latency includes the integration latency and other API Gateway overhead.

The Integration Latency is defined as:

The time between when API Gateway relays a request to the backend and when it receives a response from the backend.

Since API Gateway is essentially a big proxy, you would expect Latency to always be higher than Integration Latency, because by definition it includes the Integration Latency. That holds for every request that is actually proxied to the backend.

What I realized is that Latency can be very low for requests that are NOT proxied. Why wouldn't a request be proxied? We were blocking tens of thousands of requests with AWS WAF (Web Application Firewall). Those requests had no recorded Integration Latency, but they still counted toward the overall Latency. Because AWS WAF is so fast and efficient at what it does, those requests were serviced in less than 10 milliseconds, and because so many malicious requests were being blocked, the average Latency dropped significantly, ending up below the average Integration Latency.
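To make the arithmetic concrete, here is a minimal sketch of how the two averages can invert. Every number in it (request counts, the 200 ms backend time, the 20 ms gateway overhead, the 5 ms blocked-request time) is a made-up assumption for illustration, not a measurement from our account. It also assumes, as described above, that blocked requests contribute a Latency sample but no Integration Latency sample.

```python
from statistics import mean

# Hypothetical numbers, chosen only to illustrate the effect.
PROXIED_COUNT = 1_000      # requests that reach the backend
BLOCKED_COUNT = 20_000     # requests rejected by WAF before the backend
INTEGRATION_MS = 200.0     # assumed backend time per proxied request
OVERHEAD_MS = 20.0         # assumed API Gateway overhead per proxied request
BLOCKED_MS = 5.0           # assumed time to reject a blocked request


def simulate_averages():
    """Return (average Latency, average Integration Latency) in milliseconds."""
    latency_samples = []
    integration_samples = []

    # Proxied requests are recorded in both metrics.
    for _ in range(PROXIED_COUNT):
        integration_samples.append(INTEGRATION_MS)
        latency_samples.append(INTEGRATION_MS + OVERHEAD_MS)

    # Blocked requests are recorded in Latency only.
    for _ in range(BLOCKED_COUNT):
        latency_samples.append(BLOCKED_MS)

    return mean(latency_samples), mean(integration_samples)


avg_latency, avg_integration = simulate_averages()
print(f"Average Latency:             {avg_latency:.2f} ms")
print(f"Average Integration Latency: {avg_integration:.2f} ms")
```

With these assumed inputs, every individual proxied request still has Latency greater than its Integration Latency (220 ms versus 200 ms). Only the averages invert, because they are computed over different sets of requests.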

Think of it like grading on a curve back in college: the WAF handled so many requests so quickly that it threw off the curve (Latency), while the Integration Latency, which never saw those requests, was not thrown off. Let me know if you have observed this phenomenon before.