Querying AWS ALB Logs using Athena


Recently, I had a requirement of querying AWS Application Load Balancer Logs to get some data around request/ sec and p95 latencies.

The Application load balancer logs are stored in AWS S3 by default and follows a consistent format which is documented here

AWS Athena is the best tool to query such logs.

Best practices using AWS Athena

  • Make sure you specify the time period when querying Athena, else the data scanned will be very huge and you will end up paying lot more.

  • To find out the relevant time period to query, have a look at the AWS Cloudwatch metrics and find intreseting patterns such as spikes in request count, response time etc

  • If your ALB has comples routing logic, make sure to specify the Target group in the query

Find url and times it was called within the specified time period

SELECT request_url, count(*) as count FROM "alb_logs"."<alb_name>" where year='2021' and month='10' and day='24' and 
request_creation_time > '2021-10-24T13:37:00.000000Z' and request_creation_time < '2021-10-24T13:38:00.000000Z' group by request_url order by count desc limit 50 

Find p95 Latency by url

SELECT request_url, approx_percentile(target_processing_time, 0.95)  as p95  FROM "alb_logs"."<alb_name>" where year='2021' and month='10' and day='24' and request_creation_time > '2021-10-24T13:43:00.000000Z' and request_creation_time < '2021-10-24T13:44:00.000000Z'  group by request_url order by p95 desc limit 50