Querying AWS ALB Logs using Athena
Recently, I had a requirement of querying AWS Application Load Balancer Logs to get some data around request/ sec and p95 latencies.
The Application load balancer logs are stored in AWS S3 by default and follows a consistent format which is documented here
AWS Athena is the best tool to query such logs.
Best practices using AWS Athena
-
Make sure you specify the time period when querying Athena, else the data scanned will be very huge and you will end up paying lot more.
-
To find out the relevant time period to query, have a look at the AWS Cloudwatch metrics and find intreseting patterns such as spikes in request count, response time etc
-
If your ALB has comples routing logic, make sure to specify the Target group in the query
Find url and times it was called within the specified time period
SELECT request_url, count(*) as count FROM "alb_logs"."<alb_name>" where year='2021' and month='10' and day='24' and
request_creation_time > '2021-10-24T13:37:00.000000Z' and request_creation_time < '2021-10-24T13:38:00.000000Z' group by request_url order by count desc limit 50
Find p95 Latency by url
SELECT request_url, approx_percentile(target_processing_time, 0.95) as p95 FROM "alb_logs"."<alb_name>" where year='2021' and month='10' and day='24' and request_creation_time > '2021-10-24T13:43:00.000000Z' and request_creation_time < '2021-10-24T13:44:00.000000Z' group by request_url order by p95 desc limit 50