Measuring HTTP Response Time of an ECS Docker Container with CloudWatch Logs Insights

Around two months ago, I was given the task of improving the speed and performance of Backstreet Academy’s website. This kind of task is not new to me, since I did performance-related work in my previous job. For me, though, optimizing something starts with looking at the current state of the thing we want to optimize. If we don’t know that state, how can we be sure that we have really done the job, and not made things worse? For example, if we want to improve the page load time of the website, we first need to know its average page load time before taking any optimization action. Also remember: “Premature Optimization Is the Root of All Evil”.

So, I looked through our existing infrastructure for anything I could use as a metric. Google Analytics? Nope, there was no data on our page speed. AWS Elastic Container Service logs? Nope, nothing related to response time. Anything else? Nope, we didn’t have anything that could serve as a metric for our optimization. Well, this is bad. How are we going to optimize something if we have no idea where we currently stand? So I decided to set up some analytics tooling that could help us determine the current response time of the website. There are paid services that provide this, which were obviously not my first option, so I looked at our ECS logs in the CloudWatch Logs service on AWS. There I found an interesting feature in the left-side navigation panel called “Insights”.

From the AWS documentation:

CloudWatch Logs Insights enables you to interactively search and analyze your log data in Amazon CloudWatch Logs. You can perform queries to help you quickly and effectively respond to operational issues. If an issue occurs, you can use CloudWatch Logs Insights to identify potential causes and validate deployed fixes.

Well, this is interesting. An idea came to mind: what if I put the response time in the ECS logs and used this feature to analyze it? This could be a good way to track our response time. So I started exploring the idea.

The first thing I needed to do was put the response time in the application’s HTTP log. Since we use morgan as the logger middleware for HTTP requests to our website, we can just modify the log format to show the response time.

From this:

To this:
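A minimal sketch of that change, assuming the app started from morgan's predefined combined format. The constant names are illustrative rather than the site's actual code, and the position of the new token is inferred from the sample log line shown later in this post:

```javascript
// morgan's predefined "combined" format, written out as a format string.
const COMBINED =
  ':remote-addr - :remote-user [:date[clf]] ":method :url HTTP/:http-version" ' +
  ':status :res[content-length] ":referrer" ":user-agent"';

// The same format with :response-time (milliseconds) added right after
// the response size. The placement is an assumption, not a morgan default.
const COMBINED_WITH_RESPONSE_TIME = COMBINED.replace(
  ':res[content-length]',
  ':res[content-length] :response-time'
);

console.log(COMBINED_WITH_RESPONSE_TIME);

// In an Express app (assumes `express` and `morgan` are installed):
//   const morgan = require('morgan');
//   app.use(morgan(COMBINED_WITH_RESPONSE_TIME));
```

Passing a custom format string to morgan replaces the predefined format name, so the only change in the app is the argument given to the middleware.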

See any difference? Yep, there is a new :response-time token that allows us to track the response time of the website in milliseconds. With this change, we are basically done with the first step. But wait, I know some of you may still be using Apache logs in your ECS container. If so, you can simply use the log format below in your Apache log configuration.

LogFormat "%h %l %u %t \"%r\" %>s %O %{ms}T \"%{Referer}i\" \"%{User-agent}i\"" combined

If you look closely, that is essentially the default combined Apache log format with an additional %{ms}T, which records the response time in milliseconds. If you use something else to log HTTP requests in your application, I believe there is information out there on how to get the response time value. Google it!

Once the website starts producing the new log with the response time in it, you will see something like this:

xxx.xx.xx.xx - - [01/Jul/2019:05:39:02 +0000] "GET / HTTP/1.1" 200 81427 40.332 "" "Mozilla/5.0 (iPhone; CPU iPhone OS 12_3_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/12.1.1 Mobile/15E148 Safari/604.1"

See the number 40.332 in it? That’s the response time, which represents ~40 milliseconds. Once you have that, we can continue to the next step, which is setting up CloudWatch Logs Insights to parse this data. Go to the Insights section, and you will see this:

Oops, redacted. 😏

Then, select the log group of your ECS service at the top of the panel, and enter a query to parse the log data. The query that I used to parse the log is:

This is how the query above works:

  1. In the first line, we use parse to split each log message into structured fields.
  2. Then in the second line, we use filter to show only the logs of successful page renders, represented by HTTP status code 200.
  3. In the last line, we use stats to aggregate the response time we parsed previously and show the average response time for every 2 hours.
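To make the three steps concrete, here is a rough local simulation of the same parse, filter, and stats pipeline in plain JavaScript. It is only a sketch: the sample lines, field names, and helper functions are made up for illustration, and the Logs Insights query in the comments is an untested reconstruction of what such a query might look like, not the original one.

```javascript
// A Logs Insights query with the same shape might look like this
// (assumed and untested; the field names are hypothetical):
//   parse @message '* - * [*] "* * *" * * * "*" "*"'
//     as ip, user, time, method, path, protocol, status, bytes, responseTime, referrer, userAgent
//   | filter status = 200
//   | stats avg(responseTime) by bin(2h)

// Made-up sample lines in the log format shown earlier in the post.
const lines = [
  '10.0.0.1 - - [01/Jul/2019:05:39:02 +0000] "GET / HTTP/1.1" 200 81427 40.500 "" "Mozilla/5.0"',
  '10.0.0.2 - - [01/Jul/2019:05:50:10 +0000] "GET /about HTTP/1.1" 200 1200 59.500 "" "Mozilla/5.0"',
  '10.0.0.3 - - [01/Jul/2019:06:10:00 +0000] "GET /missing HTTP/1.1" 404 150 5.000 "" "Mozilla/5.0"',
  '10.0.0.4 - - [01/Jul/2019:07:15:00 +0000] "GET / HTTP/1.1" 200 81427 30.000 "" "Mozilla/5.0"',
];

const LINE =
  /^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+) (\S+)" (\d{3}) (\S+) (\S+) "([^"]*)" "([^"]*)"$/;
const CLF_DATE = /^(\d{2})\/(\w{3})\/(\d{4}):(\d{2}):(\d{2}):(\d{2}) \+0000$/;
const MONTHS = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
                'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'];
const BIN_MS = 2 * 60 * 60 * 1000; // 2 hours

// Step 1 (parse): turn one raw log line into structured fields.
function parseLine(line) {
  const m = LINE.exec(line);
  if (!m) return null;
  const [, ip, clfDate, method, path, , status, , responseTime] = m;
  const d = CLF_DATE.exec(clfDate); // only handles UTC (+0000) timestamps
  if (!d) return null;
  const month = MONTHS.indexOf(d[2]);
  if (month < 0) return null;
  const time = Date.UTC(+d[3], month, +d[1], +d[4], +d[5], +d[6]);
  return { ip, method, path, status: Number(status),
           responseTime: Number(responseTime), time };
}

function averageByBin(rawLines) {
  const bins = new Map();
  for (const line of rawLines) {
    const entry = parseLine(line);
    // Step 2 (filter): keep only successful responses.
    if (!entry || entry.status !== 200) continue;
    // Step 3 (stats): group into 2-hour bins and accumulate for the average.
    // Logs Insights bins on the event timestamp; the timestamp inside the
    // line is used here as a stand-in.
    const binStart = Math.floor(entry.time / BIN_MS) * BIN_MS;
    const bin = bins.get(binStart) || { sum: 0, count: 0 };
    bin.sum += entry.responseTime;
    bin.count += 1;
    bins.set(binStart, bin);
  }
  return [...bins.entries()]
    .sort((a, b) => a[0] - b[0])
    .map(([start, { sum, count }]) => ({
      bin: new Date(start).toISOString(),
      avgResponseTime: sum / count,
    }));
}

const result = averageByBin(lines);
console.log(result);
```

With these sample lines, the two requests before 06:00 UTC land in the 04:00 bin with an average of 50 ms, the 404 is filtered out, and the last request lands in the 06:00 bin with an average of 30 ms.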

This is a simple query, which you can extend depending on your needs. When you run the query, the data shows up in the Logs section, along with a visualization chart in the next tab.

Logs section
Visualization section

If you are using a CloudWatch dashboard, you can simply add this data to your dashboard, which is what we do. However, when you put this data on the dashboard, it still shows up like the Logs section, that is, as a table. I wonder whether it is possible to show it as a chart; if you know a way, please let me know. 😄

That’s it. I hope this article helps you monitor HTTP response time in an ECS Docker container. See you!

Lead Software Engineer at Mekari — Empower businesses and professionals to progress effortlessly (