
Load Testing Blogs
Load Impact 2.3 released!
We're happy to introduce Load Impact 2.3!
Load Impact 2.3 contains a new and improved proxy recorder that automatically detects pages and creates page load time result metrics for each of your web pages. The recorder also allows you to insert code comments in the generated user scenario, which can be useful in order to find where in your user scenario code a certain page is being loaded.
Behind the scenes, Load Impact 2.3 also includes a lot of optimizations that result in a much faster reporting interface. Especially for large tests that generate a lot of results data, these optimizations will make a huge difference to how snappy the "view test" page feels. For live tests, the reporting page will also be a lot smoother. In fact, Load Impact 2.3 is a major rewrite of the underlying storage subsystem and of how data is accessed by the user interface code. More things are now loaded on demand (i.e. as and when needed), which results in a page that is much lighter on the client computer. You should now be able to view even the largest tests on the flimsiest of laptops.
Other improvements you will find in 2.3 include:
- Graphical editor support for data stores, custom metrics and other new API functionality
- Several API updates - http.page API functions, named parameters, etc.
- You can now plot graphs of load generator CPU and memory usage during the test!
- The URL list on the report page now displays bytes received and compression ratio
- Content type classification now uses the Content-Type header
- Click the pie charts to highlight different objects in the URL list on the test report page
- Many bug fixes...
New home for my blog
- Scott Barber
- President & Chief Technologist, PerfTestPlus, Inc.
- Author, Web Load Testing for Dummies
- Co-Author, Performance Testing Guidance for Web Applications
- Contributing Author, Beautiful Testing, & How To Reduce the Cost of Testing
Handling consumable parameter data in LoadRunner
Performance Book with new chapter on Performance Engineering Practices
Best Practices on APM in Windows Azure and Silverlight
dynaTrace AJAX Edition 3.6 now supports Firefox 11
Specification Proposal for JavaScript Timing in Browsers
The Super Bowl Effect on Website Performance
Cloud Connect Santa Clara 2012
dynaTrace AJAX Edition 3.5 with Beta support for Internet Explorer 10 and Firefox 10
Do you need to monitor your Mobile App?
About the Performance of Map Reduce Jobs
Internet Explorer 9 and Firefox 8/9 support with dynaTrace Ajax Edition 3.4
Ideas for VuGen Add-ins
HP Discover 2012 (Las Vegas)
SST 0.2.1 Release Announcement (selenium-simple-test)
SST version 0.2.1 has been released.
SST (selenium-simple-test) is a web test framework that uses Python to generate functional browser-based tests.
SST version 0.2.1 is on PyPI: http://pypi.python.org/pypi/sst
Install or upgrade with:
pip install -U sst
Changelog: http://testutils.org/sst/changelog.html
SST Docs: http://testutils.org/sst
SST on Launchpad: https://launchpad.net/selenium-simple-test
SST downloads | (Jan 1 2012 - April 23 2012)
1600+ downloads from PyPI since initial release.
Measuring web application performance
Going by the many posts in various LinkedIn groups and blogs, there seems to be some confusion about how to measure and analyze a web application’s performance. This article tries to clarify the different aspects of web performance and how to go about measuring it, explaining key terms and concepts along the way.
Web Application Architecture
The diagram below shows a high-level view of typical architectures of web applications.
The simplest applications have the web and app tiers combined while more complex ones may have multiple application tiers (called “middleware”) as well as multiple datastores.
The front end refers to the web tier that generates the HTML response for the browser.
The back end refers to the server components that are responsible for the business logic.
Note that in architectures where a single web/app server tier is responsible for both the front and back ends, it is still useful to think of them as logically separate for the purposes of performance analysis.
Front End Performance
When measuring front end performance, we are primarily concerned with understanding the response time that the user (sitting in front of a browser) experiences. This is typically measured as the time taken to load a web page. Performance of the front end depends on the following:
- Time taken to generate the base page
- Browser parse time
- Time to download all of the components on the page (CSS, JS, images, etc.)
- Browser render time of the page
For most applications, the response time is dominated by the third bullet above, i.e. the time spent by the browser in retrieving all of the components on a page. As pages have become increasingly complex, their sizes have mushroomed as well – it is not uncommon to see pages of 0.5 MB or more. Depending on where the user is located, it can take a significant amount of time for the browser to fetch components across the internet.
Front end Performance Tools
Front-end performance is typically viewed as waterfall charts produced by tools such as the Firebug Net Panel. During development, Firebug is an invaluable tool for understanding and fixing client-side issues. However, to get a true measure of end user experience on production systems, performance needs to be measured from the points on the internet where your customers typically are. Many tools are available to do this, and they vary in price and functionality. Do your research to find a tool that fits your needs.
Back End Performance
The primary goal of measuring back end performance is to understand the maximum throughput that it can sustain. Traditionally, enterprises perform “load testing” of their applications to ensure they can scale. I prefer to call this “scalability testing”. Test clients drive load via bare-bones HTTP clients and measure the throughput of the application, i.e. the number of requests per second it can handle. To increase the throughput, the number of client drivers is increased until the point where throughput stops increasing or, worse, starts to drop off.
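As a rough illustration of that last step, the sketch below scans a series of throughput measurements and reports the client count after which adding more drivers no longer helps. The function name, the 5% gain threshold and the sample numbers are all assumptions made up for this example; they do not come from any particular load testing tool.

```python
def find_saturation_point(samples, min_gain=0.05):
    """Return the (clients, requests_per_second) step after which throughput
    stops growing, or None if it was still climbing at the last step.

    samples: list of (clients, requests_per_second) tuples, sorted by clients.
    min_gain: smallest relative gain still counted as an increase (assumed 5%).
    """
    for (clients, rps), (_, next_rps) in zip(samples, samples[1:]):
        # The next step either plateaued or dropped off.
        if next_rps < rps * (1 + min_gain):
            return clients, rps
    return None


# Hypothetical results from a test that doubles the client drivers each run.
measurements = [(10, 480.0), (20, 950.0), (40, 1800.0), (80, 1850.0), (160, 1500.0)]
saturation = find_saturation_point(measurements)
print(saturation)
```

With these made-up numbers, throughput roughly doubles up to 40 clients, barely moves at 80 and falls at 160, so the function reports the 40-client step as the point of saturation.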
For complex multi-tier architectures, it is beneficial to break up the back end analysis by testing the scalability of individual tiers. For example, database scalability can be measured by running a workload just on the database. This can greatly help identify problems and also provides developers and QA engineers with tests they can repeat during subsequent product releases.
Many applications are thrown into production before any scalability testing is done. Things may seem fine until the day the application gets hit with increased traffic (good for business!). If the application crashes and burns because it cannot handle the load, you may not get a second chance.
Back End Performance Tools
Numerous load testing tools exist with varying functionality and price. There are also a number of open source tools available. Depending on the resources you have and your budget, you can also outsource your entire scalability testing.
Summary
Front end performance is primarily concerned with measuring end user response times while back end performance is concerned with measuring throughput and scalability.
Response time metric for SLA
Service Level Agreements (SLAs) usually specify a response time criterion that must be met. Although SLAs can have a wide range of metrics like throughput, up time, availability etc., we will focus on response times in this article.
We often hear phrases like the following:
- “The response time was 5 seconds”
- “This product’s performance is much worse than slowpoke’s. It takes longer to respond.”
- “Our whizbang product can perform 100 transactions/sec with a response time of 10 seconds or less”
Do you see anything wrong in these statements? Although they sound fine for general conversation, anyone interested in performance should really be asking what exactly they mean.
Let’s take the first statement above and make the assumption that it refers to a particular page in a web application. When someone says that the response time is 5 seconds, does it mean that when this user typed in the URL of this page, the browser took 5 seconds to respond? Or does it mean that in an automated test repeatedly accessing this page, the average response time was 5 seconds? Or perhaps, the median response time was 5 seconds?
You get the idea. For some reason, people tend to talk loosely about response times. Without going into details of how to measure the response time (that’s a separate topic), this article will focus on what is a meaningful response time metric.
For purposes of this discussion, let us assume we are measuring the response time of a transaction (which can be anything – web, database, cache etc.). What is the most meaningful measure for the response time of a transaction?
Mean Response Time
This is the most common measure of response time but, alas, it is usually the most flawed as well. The mean or average response time simply adds up all the individual response times taken from multiple measurements and divides the sum by the number of samples. This may be fine if the measurements are fairly evenly distributed over a narrow range as in Figure 1.
- Figure 1: Steady Response Times
- Figure 2: Varying Response Times
But if the measurements vary quite a bit over a large range like in Figure 2, the average response time is not meaningful. Both figures have the same scale and show response times on the y axis for samples taken over a period of time (x axis).
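To make the point concrete, here is a small sketch with two invented data sets (they are not the data behind Figures 1 and 2). Both have the same mean, but one is steady and the other swings widely, which the standard deviation shows and the mean hides.

```python
from statistics import mean, pstdev

# Made-up response times in seconds; both lists add up to 77.0.
steady = [12.0, 13.0, 12.5, 13.5, 13.0, 13.0]
varying = [4.0, 25.0, 6.0, 22.0, 3.0, 17.0]

for name, samples in (("steady", steady), ("varying", varying)):
    # Same average, very different spread.
    print(f"{name}: mean={mean(samples):.2f}s stdev={pstdev(samples):.2f}s "
          f"min={min(samples)}s max={max(samples)}s")
```

Reporting only the mean would make these two systems look identical, even though users of the second one sometimes wait six to eight times longer than at its best.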
Median Response Time
If the average is not a good representation of a distribution, perhaps the median is? After all, the median marks the 50th percentile of a distribution. The median is useful when the response times do have a normal distribution but have a few outliers. In this case, the median helps to weed out the outliers. The key here is few outliers. It is important to realize that if 50% of the transactions are within the specified time, the remaining 50% have a higher response time. Surely, a response time specification that leaves out half the population cannot be a good measure.
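Both sides of that argument can be seen in a short sketch. The numbers are invented for illustration: the first list has a single outlier, which the median ignores, while in the second list almost half of the transactions are slow and the median says nothing about them.

```python
from statistics import mean, median

# Made-up response times in seconds.
few_outliers = [2.0, 2.1, 1.9, 2.0, 2.2, 2.0, 1.8, 2.0, 60.0]
slow_half = [2.0, 2.0, 2.0, 2.0, 2.0, 9.0, 10.0, 11.0, 12.0]

# One outlier drags the mean up, but the median stays at the typical value.
print(f"few outliers: mean={mean(few_outliers):.2f}s median={median(few_outliers)}s")

# Here the median is the same, yet four of nine transactions are far slower.
slower = sum(1 for t in slow_half if t > median(slow_half))
print(f"slow half: median={median(slow_half)}s, {slower} of {len(slow_half)} slower")
```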
90th or 95th Percentile Response Time
In standard benchmarks, it is common to see 90th percentile response times used. The benchmark may specify that the 90th percentile response time of a transaction should be within x seconds. This means that only 10% of the transactions have a response time higher than x seconds, and it can therefore be a meaningful measure. For web applications, the requirements are usually even higher – after all, if 10% of your users are dissatisfied with the site performance, that could be a significant number of users. Therefore, it is common to see the 95th percentile used for SLAs in web applications.
A word of caution – web page response times can vary dramatically if measured at the last mile (i.e. real users' computers that are connected via cable or DSL to the internet). Figure 3 shows the distribution of response times for such a measurement.
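A percentile check of this kind can be sketched in a few lines. The example below assumes the nearest-rank definition of a percentile (other definitions interpolate between samples and give slightly different values); the function names, the 2 second limit and the sample data are made up for illustration.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def meets_sla(samples, pct, limit_secs):
    """True if the pct-th percentile response time is within limit_secs."""
    return percentile(samples, pct) <= limit_secs


# Made-up response times in seconds for 20 transactions.
response_times = [1.2, 0.9, 1.1, 1.0, 1.3, 0.8, 1.4, 1.0, 4.5, 1.1,
                  1.2, 0.9, 1.0, 1.1, 1.3, 1.2, 1.0, 0.9, 6.0, 1.1]

for pct in (90, 95):
    print(f"p{pct}={percentile(response_times, pct)}s "
          f"within 2.0s: {meets_sla(response_times, pct, 2.0)}")
```

In this data set two slow transactions out of twenty are enough to fail a 95th percentile requirement of 2 seconds, while the same limit at the 90th percentile still passes, which is why the choice of percentile matters as much as the limit itself.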
- Figure 3: Response Time Histogram
It uses the same data as in Figure 2. The mean response time for this data set is 12.9 secs and the median is even lower at 12.3 secs. Clearly neither of these measures covers any significant range of the actual response times. The 90th percentile is 17.3 secs and the 95th is 18.6 secs. These are much better measures for the response time of this distribution and will work better as the SLA.
To summarize, it is important to look at the distribution of response times before attempting to define an SLA. As with many other metrics, a one-size-fits-all approach does not work. Response time measurements on the server side tend to vary a lot less than on the client. A 90th or 95th percentile response time requirement is a good choice to ensure that the vast majority of clients are covered.

