Load Testing Metrics Explained – LoadStorm
The load testing metrics described here are key performance indicators for your web application or web site. Response metrics show the performance measurement from a user perspective while volume metrics show the traffic generated by the load testing tool against the target web application.
Response Metrics
- Average Response Time
- Peak Response Time
- Error Rate
Volume Measurements
- Concurrent Users
- Requests per Second
- Throughput
In LoadStorm, the load testing metrics are plotted in one-minute intervals. All of the load generating servers feed data back to the LoadStorm reporting engine. Calculations are applied to the raw data from every request and response, producing objective metrics that show how effectively your target web application handles the load.
Average Response Time
When you measure the start of every request and the end of every response to those requests, you have data for the full round trip: what is sent from a browser and how long the target web application takes to deliver what was needed. For example, one request will be for a web page, let’s say the home page of the web site. The load testing system will simulate the user’s browser in sending a request for the “home.html” resource. On the target’s side, the web server receives the request and makes further requests of the application to dynamically build the page. When the full HTML document is compiled, the web server returns that document along with a response header.
The Average Response Time takes into consideration every round trip request/response cycle up to that point in the load test and calculates the mathematical mean of all response times for that interval. The resulting metric reflects the speed of the web application being tested, and it is the BEST indicator of how the target site is performing from the users’ perspective. The Average Response Time includes the delivery of HTML, images, CSS, XML, JavaScript files, and any other resource being used. Thus, the average will be significantly affected by any slow components. Geographic location can also have a small impact on response times if the end user is thousands of miles away from the target web server.
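The calculation can be sketched in a few lines of Python. This is only an illustration, not LoadStorm's implementation: the function name, the `(start, end)` sample format, and the choice to assign each sample to the minute in which its response completed are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean


def average_response_time_by_minute(samples):
    """Group round-trip times into one-minute buckets and average each bucket.

    samples: iterable of (start_seconds, end_seconds), measured from test start.
    Assumption: a sample belongs to the minute in which its response completed.
    """
    buckets = defaultdict(list)
    for start, end in samples:
        minute = int(end // 60)
        buckets[minute].append(end - start)  # round-trip time for this request
    return {minute: mean(times) for minute, times in sorted(buckets.items())}


# Hypothetical data: two fast responses in minute 0, one slow and one fast in minute 1.
samples = [(0.0, 0.4), (10.0, 10.6), (59.0, 61.0), (70.0, 70.5)]
averages = average_response_time_by_minute(samples)
for minute, avg in averages.items():
    print(f"minute {minute}: {avg:.2f} s")
```

In the sample data, the single 2-second response pulls the second minute's average well above the first minute's, which is the effect of a slow component described above.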
Response times can be measured as either:
- Time to First Byte
- Time to Last Byte
Some people like to know when the first byte of the response is received by the load generator (simulated browser). This shows how long the request took to get there and how long the server took to start replying. However, that is only part of the picture. It is more valuable to know the entire response cycle, including the time taken to download the resource. What matters most is what the user experiences, and that includes delivery of the full payload from the server. A user wants to see the HTML page, which requires receipt of the full document. So Time to Last Byte is preferred as a Key Performance Indicator (KPI) over Time to First Byte.
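To make the difference concrete, here is a rough sketch of how a single request could be timed both ways with Python's standard `http.client` module. The function name `time_request` is hypothetical. The first measurement is only an approximation of Time to First Byte, because `getresponse()` returns once the status line and headers have been read rather than at the very first byte.

```python
import http.client
import time


def time_request(host, port, path="/", timeout=10.0):
    """Return (time_to_first_byte, time_to_last_byte) in seconds for one GET.

    The clock starts before the connection is opened, so both values
    include the time needed to connect to the server.
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        start = time.perf_counter()
        conn.request("GET", path)
        response = conn.getresponse()  # status line and headers have arrived
        first_byte = time.perf_counter() - start
        response.read()                # download the complete body
        last_byte = time.perf_counter() - start
        return first_byte, last_byte
    finally:
        conn.close()


# Example call against a hypothetical local server:
#   ttfb, ttlb = time_request("127.0.0.1", 8000, "/home.html")
```

For a small resource the two values will be close together. For a large one, Time to Last Byte grows with the download while Time to First Byte does not, which is why the last-byte figure is closer to what a user experiences.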
Peak Response Time
Similar to the previous metric, Peak Response Time measures the round trip of a request/response cycle. However, the peak tells us what the LONGEST cycle was in a given one-minute interval of the test. For example, if the graph shows a Peak Response Time of 12 seconds at 5 minutes into the load test, then we know that one of the requests in that interval took that long. The average may still be less than one second because many of the other resources had speedy response times.
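A small worked example shows how one slow request can dominate the peak while barely moving the average. The numbers below are invented for illustration, and `summarize_interval` is a hypothetical helper, not part of any load testing tool.

```python
from statistics import mean


def summarize_interval(response_times):
    """Return the average and peak response time (seconds) for one interval."""
    return {"average": mean(response_times), "peak": max(response_times)}


# Hypothetical interval: 99 requests at 0.5 s each and one request at 12 s.
times = [0.5] * 99 + [12.0]
summary = summarize_interval(times)
print(f"average: {summary['average']:.3f} s, peak: {summary['peak']:.1f} s")
```

Here the peak is 12 seconds while the average stays well under one second, so looking at the average alone would hide the slow request.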
The Peak Response Time shows us that at least one of our resources is potentially problematic. It can reflect an anomaly in the web application where a specific request was mishandled by the target system. For example, an “expensive” database query involved in fulfilling a certain request, such as a search results page, could make that request take much longer. This metric is great for exposing those issues.
Typically images and stylesheets are not the slowest (although they can be when a mistake is made like using a BMP file). In a web application, the process of dynamically building the HTML document from application logic and database queries is usually the most time intensive part of the system. It is less common, yet occurs more often with open source apps, to have very slow JavaScript files because of their enormous size. Large files can produce slow responses that will show up in Peak Response Time, so be careful when using big images or calling big JS libraries. Many times, you really only need less than 20% of the JavaScript inside those libraries. Lazy coders won’t take the trouble to clean out the other 80%, and that will hurt their system performance.
Error Rate
It is to be expected that some errors may occur when processing requests, especially under load. Most of the time you will see errors begin to be reported when the load has reached a point that exceeds the web application’s ability to deliver what is necessary.
The Error Rate is the mathematical calculation that produces a percentage of problem requests compared to all requests. The percentage reflects how many responses are HTTP status codes indicating an error on the server, as well as any request that times out before receiving or completing its response.
The web server will return an HTTP Status Code in the response header. Normal codes are usually 200 (OK) or something in the 3xx range indicating a redirect on the server. A common error code is 500, which means the web server knows it has a problem with fulfilling that request. That of course doesn’t tell you what caused the problem, but at least you know that the server is aware that there is a technical issue in the system somewhere.
It is much trickier to measure something you never receive, so an error code can be reported by the load testing tool for a condition not indicated by the server. Specifically, the tool must wait for some period of time before it quits “listening” for a response. The tool must determine when it will “give up” on a request and declare a timeout condition. Response timeouts will usually not receive a status code from a web server, so the load testing tool must assign a custom error message “Request Read Timeout” to indicate the timeout.
Other errors can be hard to describe because they do not occur at the HTTP level. A good example is when the web server refuses a connection at the TCP level. There is no HTTP Status Code to receive in this case, so the load testing tool must assign an error message “Request Connection Timeout” to report this condition back to you in the load testing results.
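Putting the pieces together, an Error Rate calculation might look like the following sketch. The `Result` record and the function names are hypothetical. Counting every status of 400 or above as an error is also an assumption for the example, since a given tool may draw that line differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Result:
    status: Optional[int] = None  # HTTP status code, or None if no response arrived
    error: Optional[str] = None   # tool-assigned message, e.g. "Request Read Timeout"


def is_error(result):
    """A request failed if the tool flagged it or the status code signals an error."""
    if result.error is not None:
        return True
    # Assumption: 4xx and 5xx both count; 2xx and 3xx are treated as normal.
    return result.status is None or result.status >= 400


def error_rate(results):
    """Percentage of problem requests compared to all requests."""
    if not results:
        return 0.0
    failed = sum(1 for r in results if is_error(r))
    return 100.0 * failed / len(results)


# Hypothetical interval: 95 OK, 2 server errors, 2 read timeouts, 1 refused connection.
results = (
    [Result(status=200)] * 95
    + [Result(status=500)] * 2
    + [Result(error="Request Read Timeout")] * 2
    + [Result(error="Request Connection Timeout")]
)
print(f"Error Rate: {error_rate(results):.1f}%")
```

Keeping the tool-assigned message separate from the HTTP status makes it possible to report timeouts and refused connections even though the server never returned a status code for them.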
Error Rate is a significant metric because it measures “performance failure” in the application. It tells you how many failed requests are occurring at a particular point in time of your load test. The value of this metric is most evident when you can easily see the percentage of problems increase significantly as the higher load produces more errors. In many load tests, this climb in Error Rate will be drastic. This rapid rise in errors tells you where the target system is stressed beyond its ability to deliver adequate performance.
No one can define the tolerance for Error Rate in your web application. Some testers consider less than 1% Error Rate successful if the test is delivering greater than 95% of the maximum expected traffic. However, other testers consider any errors to be a big problem and work to eliminate them. It is not uncommon to have a few errors in web applications – especially when you are dealing with thousands of concurrent users.
Concurrent Users
Concurrent users is the most common way to express the load being applied during a test. This metric measures how many virtual users (VUsers) are active at any particular point in time. It does not equate to requests per second (RPS) because one user can generate a high number of requests, and each VUser will not constantly be generating requests.
A virtual user does what a “real” user does as specified by the script that you have created in the load testing tool. If there are 1,000 VUsers, then there are 1,000 scripts executing at that particular time. Many of those 1,000 VUsers are making requests at the same time, but there are many VUsers that are not because of “think time”. Simply put, think time is the pause after each page that simulates what happens with a real user as he or she reads the page received before clicking again.
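The effect of think time on the generated load can be estimated with some simple arithmetic. The sketch below is a steady-state approximation under stated assumptions, not a formula from LoadStorm: every VUser repeats the same cycle of loading a page and then pausing, and all the parameter names and values are made up for illustration.

```python
def estimate_load(vusers, think_time_s, page_time_s, requests_per_page):
    """Rough steady-state estimate of the load produced by looping VUsers.

    Assumes each VUser spends page_time_s loading a page, then pauses for
    think_time_s, and that every page triggers requests_per_page requests.
    """
    cycle = think_time_s + page_time_s           # seconds per page view per VUser
    pages_per_second = vusers / cycle
    return {
        # VUsers that are loading a page (not thinking) at a typical moment
        "busy_vusers": vusers * page_time_s / cycle,
        "pages_per_second": pages_per_second,
        "requests_per_second": pages_per_second * requests_per_page,
    }


# Hypothetical test: 1,000 VUsers, 10 s think time, 2 s per page, 20 requests per page.
estimate = estimate_load(1000, 10.0, 2.0, 20)
for name, value in estimate.items():
    print(f"{name}: {value:.1f}")
```

With these numbers only about one in six VUsers is making requests at any given moment, which illustrates why 1,000 concurrent users does not mean 1,000 simultaneous requests.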
Requests per Second
RPS is the measurement of how many requests are being sent to the target server. It includes requests for HTML pages, CSS stylesheets, XML documents, JavaScript libraries, images, Flash/multimedia files, and any other requested resource.
RPS will be affected by how many resources are called from the site’s pages. Some sites can have between 50 and 100 images per page, and as long as these images are small in size (e.g.