Yesterday, we examined how the new availability of elastic cloud computing has changed load and stress testing for the better. It lowers the cost, increases scalability, and facilitates running tests more efficiently. An article on load-testing.org about load and stress testing got me thinking about how our paradigms in this industry have changed and continue to evolve.
Today, we look at the reasons why we would want to run tests more frequently and earlier in the software product lifecycle. To borrow a Chicago axiom about voting: test early and test often. It’s the new paradigm.
I read a blog post about load and stress testing that raised some solid points, and I agree with them so strongly that this post will offer my own view of the subject.
The load-testing.org article states that old thinking dictates buying your own dedicated servers for generating load and licensing legacy software to run test scripts. That approach is expensive, slow, cumbersome, inefficient, and wasteful. Not only are tools like LoadRunner expensive, they also require staff with considerable training and experience.
Another key point in the article is about infrastructure costs:
“The advent of Cloud Computing offers a new model – one that eliminates the burdens of capital investment and load testing configuration. Cloud Computing uses the federated power of a large number of servers to provide on-demand processing power to online applications, which rent only as much processing power from the cloud as they need.”
This post is part 1 and covers my thoughts on the way cloud computing has changed load testing forever. Tomorrow, part 2 will focus on the impact of Agile on stress and performance testing.
Old School Paradigms of Load Testing
My experience in programming over the past 30 years (I’m not really that old) has shown that just about everything related to application performance has changed significantly. Even the terminology. A new paradigm has arrived, and I’m embracing it.
I was reading a post about how the performance of Apache wasn’t quite as good as Nginx, but I got sidetracked by a link in the article that led me to an interesting broader study. An informative performance benchmark comparison was published on http://www.trustleap.com/ where the author concludes that G-WAN running on Linux using a C language-based application has unbelievably better performance than any of the other tested combinations.
First, let me say that I love seeing the metrics. Second, I would love to run a test like this with LoadStorm. Third, my inner geek cynic kicked in as soon as I realized that TrustLeap is the producer of G-WAN. Maybe these numbers are suspect because of their source; however, I still want to share them with you.
Comparing Apache/Linux, GlassFish/Linux, IIS/Windows, and G-WAN/Windows & Linux
Basilio Briceno is the Senior Developer at Naranya – one of the leading new media companies in Latin America, with a special focus on the mobile entertainment and mobile marketing world. He is also a Community Member of the Mozilla Foundation and Project Lead of the Tlalokes PHP framework.
Basilio is or has been a college professor, a public speaker, and an independent consultant with these specialties: PHP, UNIX, GNU/Linux, FreeBSD, Solaris, Apache, IIS, Bind, Bash, Photoshop, Gimp, (X)HTML, DB/2, Websphere, JSP, JSF, Javascript, MySQL, Oracle, Perl, PostgreSQL, Postfix, and XML. Check out his personal blog site when you get a chance.
Let’s start the interview.
How much involvement do you have with load and performance testing?
That’s what I do every day, and that’s why companies hire me. My principal task is to be the most worried person in the company about performance testing. That’s why I try to be involved in every aspect, from UI testing to load testing to OS tuning.
What would you say is the difference between load testing and performance testing?
To put it into a boxing metaphor, performance testing allows us to know the precision and speed of the fighter’s arms and fists, the power of his punches, and the resistance, velocity, and movement of his legs. Load testing allows you to know the number of rounds the fighter is capable of going.
What do you think is the most important aspect of load testing?
Emulating the real conditions that the application is going to be exposed to and exceeding the expectations. If the Load Testing results are superior to expectations, it is less likely that uncomfortable surprises will appear in production.
Joe Emison is the VP of Technology at BuildFax. He is also the Chief Systems Architect at BUILDERadius. A world-class multi-tasker, Joe lives in Asheville, North Carolina.
Joe has experience with load and performance testing, and we are grateful he invested his time to share some good thoughts with us about a topic we are excited about.
On an interesting note, Joe is also a JD graduate from Yale Law School. How many alpha geeks do you know that went to law school? It’s a rare breed that only the folks at LoadStorm can hunt down. 😉
Let’s start the interview.
Please share one tip or best practice that is important to you regarding performance (or load) testing or engineering?
Before you start, you need to understand how everything works. Down to this level of detail:
- How many queries does your application make to the database for different activities?
- How many connections should your application make to the database server?
- How much memory should your application be using?
Otherwise, you won’t know how to use the results of load testing.
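As a minimal sketch of Joe’s first question (my own illustration, not his tooling), here is how you might count the queries a single activity issues before you ever run a load test. Python with sqlite3 stands in for your real stack, and the `view_product` activity is hypothetical:

```python
import sqlite3

class CountingConnection:
    """Wraps a DB-API connection and counts every query executed through it."""

    def __init__(self, conn):
        self._conn = conn
        self.query_count = 0

    def execute(self, sql, params=()):
        self.query_count += 1
        return self._conn.execute(sql, params)

def view_product(db, product_id):
    # Hypothetical "activity": one page view that issues two queries.
    db.execute("SELECT name, price FROM products WHERE id = ?", (product_id,))
    db.execute("SELECT review FROM reviews WHERE product_id = ?", (product_id,))

if __name__ == "__main__":
    raw = sqlite3.connect(":memory:")
    raw.execute("CREATE TABLE products (id INTEGER, name TEXT, price REAL)")
    raw.execute("CREATE TABLE reviews (product_id INTEGER, review TEXT)")

    db = CountingConnection(raw)
    view_product(db, product_id=1)
    print(f"Queries per 'view product' activity: {db.query_count}")  # prints 2
```

Knowing that baseline per activity makes it much easier to interpret what the database is doing when hundreds of virtual users run the same activity at once.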
Will Wolff-Myren is a software professional who loves building web applications. He has a blog with technical articles that are usually centered on developing on a Microsoft platform.
Will has been a Software Engineer at Learning.com for about the last 5 years. He is a fan of Firebug and Charles. He also shares some good links and useful books with us. We appreciate Will taking the time to share some information with us.
What is your technical background?
I’ve had a variety of experience in web design/development and software engineering, starting with my first experiences in Java programming (all the way back in Java 1.2), and leading me to my current position as an ASP.NET software engineer.
Do you consider yourself more of a software developer or QA professional?
I consider myself to be much more of a software developer, though, of course, I try to do as much QA on my own code as I can before delivering it to our QA team for further review.
A successful website can fail quickly: success brings high volumes of visitors, performance fails under the load, and lots of marketing money and energy is wasted.
Web Performance (fast response) is important – 46% of people abandon sites because of slow load times.
If load testing and performance engineering was easy, my boss wouldn’t have hired a geek like me. A swimsuit model would have gotten my job.
Google found that an extra 500ms in latency cost them 20% of their search traffic.
If you give someone a web application, you will frustrate them for a day; if you teach them how to code a web application, you will frustrate them for a lifetime.
A load tester learns more from one HTTP status code of 500 than from a billion status codes of 200.
Today I saw a tweet that led me to download a document published by WebPagetest.org’s development team containing proposed changes to the way web performance testing is conducted. The following is a summary of the document with a little commentary. The focus of the document is NOT on load testing; rather, it primarily deals with individual web page analysis. Thus, the definition of performance testing used herein means taking a browser, hitting a page, and analyzing the response metrics for that single page.
Load times of each resource such as images, CSS, HTML, Flash, XML, Javascript files, etc. are a key measurement. The speed of DNS lookups, initial connection, content download, start render, and document complete are other important measurements in the type of performance testing involved in this proposal. Patrick Meenan, Sadeesh Kumar Duraisamy, Qi Zhao, and Ryan Hickman are the authors of this piece, and they refer to the scope of their proposal as “ad-hoc performance testing”. They submit four main points:
- Current state of web performance testing
- Proposed changes
- Use cases
- Making it happen
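Before walking through those points, here is a rough illustration of the kind of per-page measurements mentioned earlier (DNS lookup, initial connection, content download). This is my own minimal sketch, not part of the WebPagetest proposal; start render and document complete require a real browser, so they are omitted, and the hostname is only a placeholder:

```python
import http.client
import socket
import time

def time_page(host, path="/"):
    t0 = time.perf_counter()
    ip = socket.gethostbyname(host)               # DNS lookup
    t1 = time.perf_counter()

    conn = http.client.HTTPConnection(ip, 80, timeout=10)
    conn.connect()                                # initial connection
    t2 = time.perf_counter()

    conn.request("GET", path, headers={"Host": host})
    body = conn.getresponse().read()              # content download
    t3 = time.perf_counter()
    conn.close()

    return {
        "dns_ms": round((t1 - t0) * 1000, 1),
        "connect_ms": round((t2 - t1) * 1000, 1),
        "download_ms": round((t3 - t2) * 1000, 1),
        "bytes": len(body),
    }

if __name__ == "__main__":
    print(time_page("www.example.com"))  # placeholder host
```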
Current Web Performance Testing
Their document has bullet points without much explanation, so I must read between the lines and offer my thoughts. The first bullet is “Monolithic Solutions”. Yep, I think I understand that one. Most of the performance testing solutions in the market are well-known by developers because those tools have been around for a long time. Until recently, there have only been a few players such as Mercury, Rational, and Borland. Consolidation in the past 5 years caused the names to change to HP, IBM, and Micro Focus, but the software tools are the same monolithic solutions created in the 1990s.
The vast majority of market share has been controlled by the large corporations with deep pockets for advertising and PR. Their sales relationships with the IT Managers and CTOs of Fortune 1000 companies have assured them of guaranteed deals through being part of a suite or through the FUD factor. Big companies must buy software from big companies. Otherwise, the CTO would be exposing himself or herself to ridicule and contempt by the vendors. This contempt would be delivered to other executives like CFOs that don’t know software, but they do understand risk mitigation. Along this line of thinking pitched by the entrenched big vendors is the conclusion that monolithic solutions are safer. Why? Mainly because they are too big to fail. Oh my goodness! How could a rational (pun intended) IT executive swallow that crap? There is lots of IBM software that is dead and gone – which cost many companies millions of dollars to replace/convert to the new software products. Reminder: GM was too large to fail as well…see the fallacy?
On February 4, 2004, an enviable fellow geek shared a strange new website with the rest of the world that would literally impact everyone’s class reunions forever. Today is the 7th birthday of Facebook.
Sites Can Grow Exponentially
The four founders were Harvard students and started the site from their dorm room. The idea was only for college students. Its immediate popularity drove them to expand to Columbia, Yale, and Stanford – within 1 month!
It only took Facebook 10 months to reach 1 million active users!!
It is not hyperbole to say that this single web application changed not only the lives of college students, but eventually the world as a whole. What began as a way for young men to creep on hot girls and find other students for sharing class notes or previous tests has now evolved into a way for young men to creep on hot girls and pretend they didn’t see their grandmother’s posting of their baby pictures.
This little college website now is a huge marketing phenomenon with about one-half a BILLION people signed up and 50% of those are on the site daily. It has guaranteed that anyone who was living the single college carefree lifestyle in the past 5 years will always have a haunting fear that an incriminating picture will expose their sins later in life. I guess the future Bill Clintons won’t be running for president after all. Facebook has changed our lives in ways we can’t imagine. I wonder how many marriages will end from the distrust created by the stomach-turning photo of a spouse (from earlier days) found through a tag notice.
This may not work for non-English speakers, but Cambridge University presents the following paragraph:
Olny srmat poelpe can raed this.
I cdnuolt blveiee that I cluod aulaclty uesdnatnrd what I was rdanieg. The phaonmneal pweor of the hmuan mnid, aoccdrnig to a rscheearch at Cmabrigde Uinervtisy, it deosn’t mttaer in what oredr the ltteers in a word are, the olny iprmoatnt tihng is that the first and last ltteer be in the rghit pclae. The rset can be a taotl mses and you can still raed it wouthit a porbelm. This is bcuseae the huamn mnid deos not raed ervey lteter by istlef, but the word as a wlohe. Amzanig huh? Yaeh and I awlyas tghuhot slpeling was ipmorantt! If you can raed this psas it on !!
Could you read it? Impressive! It would at first appear to be unintelligible, but it surprisingly makes sense. While this is an interesting mind trick, what does it teach us about performance testing? Well, it has been my observation that most owners of websites can understand performance metrics intuitively. Their brains can overcome the potential obstacles from their not being experts in our industry. They can figure out the scrambled world of performance engineering without knowing the code (pun intended).
I’ll probably take some heat for this post. Most professional testers get deep into the details of arcane aspects of the science of load testing because that is their job. It’s important to them, and they have studied it for years, and they are immersed in it. I understand. It helps to differentiate them from other testers that are not as knowledgeable; thus, it is a competitive advantage to incorporate as many functions of performance as possible. Consultants certainly need to show off their advanced skills gained from decades of load testing so that the customer can be assured of getting good value. Again, I understand. I respect these highly trained testers and engineers.
That said, I’m of the opinion that many times the professionals spend 80% of their time building load test scenarios that only have a 20% impact on the performance metrics. Implementing these nuances into scripts and testing plans makes the project more thorough, but from a business ROI perspective, getting that extra 20% of accuracy is NOT worth the 80% of effort. The following paragraphs discuss some examples of considerations that I recommend you ignore when building your test scenarios.
Response times and other performance measurements will be greatly affected by what pages are requested, what forms are submitted, and what buttons are clicked during a load test. Thus, a key aspect of being a good load tester is the ability to create test scenarios well.
How should you develop test cases? Hopefully this post will give you some useful suggestions you can put into practice.
The primary purpose of load testing is usually to find bottlenecks that decrease performance, and then mitigate or eliminate those bottlenecks. CEOs, CTOs, Vice Presidents of Marketing, and Product Managers want to make sure customers are not impacted negatively when the site is very successful. As developers, we need to make the app fast and efficient. We need to make the customer happy which makes our boss happy. If we run load tests that produce results that have no correlation to what happens in production, then our tests are failures. The test metrics are useless. No value is gained by the process. We have wasted our time.
It is important to simulate realistic behavior by virtual users and in the appropriate proportions because the performance of your system will be affected by not only the increase in load, but also in the types of processing needed to deliver the responses. Each page can have significant differences in resources needed to satisfy the user’s request. So if you run a load test that hits your home page 100,000 times per hour, you probably won’t answer many performance questions about your e-commerce application. The home page might have some images and Flash videos, but I suspect it won’t make any complex queries against your database. Realistic load testing needs to trigger the interaction between your various layers of architecture in order to find the bottlenecks that will decrease performance.
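To illustrate the point about proportions, here is a minimal sketch of weighting virtual user actions so the database-heavy pages are exercised in roughly the ratio they see in production. This is my own illustration, not a LoadStorm script, and the page names and weights are hypothetical:

```python
import random

# Hypothetical traffic mix: most visits hit the home page, fewer search,
# and only a small fraction reach the database-heavy checkout.
SCENARIOS = {
    "home_page": 50,       # mostly static content
    "product_search": 30,  # moderate database queries
    "add_to_cart": 15,     # session writes
    "checkout": 5,         # complex queries and payment calls
}

def pick_next_action():
    """Choose a virtual user's next action according to the traffic mix."""
    pages = list(SCENARIOS)
    weights = list(SCENARIOS.values())
    return random.choices(pages, weights=weights, k=1)[0]

if __name__ == "__main__":
    # Simulate 1,000 requests and show the resulting mix.
    counts = {page: 0 for page in SCENARIOS}
    for _ in range(1000):
        counts[pick_next_action()] += 1
    print(counts)
```

A mix like this triggers the interaction between the web, application, and database layers, which is where the interesting bottlenecks usually hide.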
Google found that an extra 500 ms in latency cost them 20% of their search traffic – 5 YEARS AGO!!
At the Web 2.0 conference in 2006, Google VP Marissa Mayer made a presentation where she shared the above statistic. The negative effect of incrementally degrading response time was uncovered through an experiment on the number of search results people wanted. A segmented group of searchers was given 30 results instead of the normal 10. The page with 10 results took 400 ms to generate, while the page with 30 results took 900 ms. This experimental, scientifically segmented group had 20% less traffic; thus, the revenue to Google from that group also dropped by 20%.
I don’t know how we perceive a half-second delay, but it obviously makes a difference.
As we begin a new year, hoping it will usher in a global economic recovery, I would like to share what I can find about the collective forecasting in our industry. In my search, I found several predictions about stress testing the financial systems. Hmmm…interesting, but not helpful.
One really cool article on Business Insider showed predictions made in 1931 by visionaries about what the world would look like 80 years in the future. William Ogburn had the best view of our high-tech world:
Technological progress, with its exponential law of increase, holds the key to the future. Labor displacement will proceed even to automatic factories. The magic of remote control will be commonplace. Humanity’s most versatile servant will be the electron tube.
Fascinating! Bravo Mr. Ogburn. Although it wasn’t the electron tube, you have the right idea. Wonder what will replace silicon chips in the next 80 years?
Mobile Performance Testing Predictions for 2011
Joshua Bixby, the President of web performance company Strangeloop, puts forth these prognostications in his article entitled 2011 Web performance predictions for the mobile industry:
1. Companies will generate at least 15% of Web sales via their social presence and mobile applications.
2. Android will become the No. 1 mobile platform, surpassing the iPhone in terms of units and usage.
3. Retailers will realize that mobile shoppers have a goal-driven “hunter” mentality.
4. As a result of No. 3, mobile Web performance will become as important as desktop Web performance.
We have about 3 feet of new snow here from the storm earlier in the week. It makes for a picturesque Christmas holiday for my family.
Today I’m just going to share a summary of 2 performance testing articles that I read this morning, then I’m taking off early to enjoy time with my wife, mom, and daughters.
Performance Testing in an Agile Framework
According to the Mozilla Blog of Metrics, they can drive an additional 60,000,000 Firefox downloads per year by making a few minor tweaks to their top landing pages.
WEB PERFORMANCE to the rescue! Higher performance translates to higher conversions and revenue.
Yesterday a customer called me to ask about how his current traffic levels translate to the load he should use for performance testing. We discussed it in several ways, and he said I had been helpful. This morning he sent me an email with a fantastic suggestion to write a blog post about our discussion.
The focus is to determine how many virtual users someone should use in their load tests. Sounded good to me, and here is how he framed the question in the email:
For example, while looking at Google Analytics for a given average day, during a peak hour we had:
- 2000 visitors in 60 minutes
- 10,000 page views
- avg page views 5
- avg time on site 7 minutes
So I wanted to figure out how many users should we feed the LoadStorm system to simulate this traffic as a base line. Does this math look correct for this case?
2000 users in 1 hour (60 minutes), 7 min time on site
60 minutes / 7 min = 8.5
2000 / 8.5 = 235 Users
Establishing an Algorithm
Well, his calculations seem logical, and the math is accurate. I would first say that a test of 235 concurrent users seems like a good baseline based on the Analytics numbers. Each test scenario should have an average duration of 7 minutes to reflect the “avg time on site” metric. That would lead to the conclusion that the population of users turned over about 8.5 times during the hour. Put another way, the 2,000 total visitors do not translate to 2,000 concurrent users because each user only stays a few minutes. If they each stayed for an hour, then we would need to test with 2,000 concurrent users. They don’t stay long; therefore, dividing the 2,000 visitors by the turnover rate of 8.5 (60 minutes divided by the 7-minute average visit) tells you that approximately 235 users were using the site concurrently.
This approach is going to tell us how many users we have on average. Let’s put this into a mathematical formula:
U = V / (60/D)
Where:
- U is the number of load test virtual users (that’s what we are trying to figure out)
- V is the average number of visitors per hour
- D is the average duration of a visit in minutes
- 60 is the number of minutes in an hour
Let’s state this formula again in English like a math word problem:
Load Test Virtual Users is equal to the Average Visitors per Hour divided by the User Turnover Rate per Hour
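Stated as code, here is the same arithmetic (a minimal sketch of my own, not a LoadStorm feature). Note that it returns roughly 233 rather than 235 because the email rounded 60 ÷ 7 to 8.5:

```python
def concurrent_virtual_users(visitors_per_hour, avg_visit_minutes):
    """U = V / (60 / D): average concurrent users, given hourly visitors
    and the average visit duration in minutes."""
    turnover_rate = 60 / avg_visit_minutes  # how many times the user population turns over per hour
    return visitors_per_hour / turnover_rate

if __name__ == "__main__":
    # 2,000 visitors in the peak hour, 7 minutes average time on site
    print(round(concurrent_virtual_users(2000, 7)))  # about 233 concurrent virtual users
```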
Today is the birthday of one of the creepiest prognosticators ever. Nostradamus was born in 1503 in France and became a physician and astrologer who dabbled in prophecy. He wrote vague and often cataclysmic predictions that brought him quite a bit of notoriety in his day, and he still spawns History Channel specials about how “amazing” he was. Sorry, I just think that he wrote some brilliant predictions that could apply to anything.
If you think that he successfully predicted the World Trade Center attack in September of 2001, then please get out of the load and performance testing industry. Do some research and you will find that many of the “predictions” attributed to Nostradamus were not even written by him at all.
Web performance was NOT predicted by Nostradamus, but here are some of my predictions about how web performance will affect the world as we know it:
Amazon states that for every 100ms of latency, they lose 1% of their sales. That’s an enormous amount of money for what most of us would call a tiny amount of performance. Guess it goes to show that web performance is more important than even us load testing geeks can imagine. So, don’t get too upset if your CEO doesn’t fully comprehend the fact that the company would be in a world of hurt without you.
Performance issues on websites during the process of completing a transaction will cause thirty-seven to forty-nine percent (37-49%) of users to abandon the site or switch to a competitor. The lost revenue from this experience is bad, but the worst part may be that 77% will make sure other people know about the problems. You will never know how much money is lost and how much damage was done to your brand reputation.