I read a good article this morning that presents a case study of scaling a web site. 6 Ways to Kill Your Servers – Learning How to Scale the Hard Way by Steffen Konerow makes some excellent points about how to avoid system crashes in your web application. Surprisingly, there is no direct mention of load testing the site re-launch before going live. Load testing is implied throughout, yet never specifically addressed.
While not everyone will have these same problems, and each web app is distinct in its architecture, the tuning process Steffen describes applies to almost any web performance tuning effort. The conclusion is classic and can never be repeated too frequently:
“Scaling a website is a never ending process.”
I have learned the hard way that assuming your site performance is acceptable now and will remain acceptable is a bad idea. You should continually be testing and tuning. How many times have you heard me say that? Load test, tune, load test, tune, rinse and repeat.
Most of the components of your web application will remain fairly stable. However, over the weeks and months of production deployment there will invariably be many small changes. A few examples include:
- New releases of your application code
- Periodic upgrades to the web server software
- Configuration modifications in the network
- Database query additions from new customer reports
- Bug fixes and system patches in the load balancing software
- Environmental changes in the data center
Web performance tuning is an ongoing process because web applications have thousands of moving parts. Nothing should be taken for granted.
Web Performance Tuning Lessons from Steffen
I love how he puts the lessons in terms of, “Do this and your servers will DIE!” He does a good job explaining how each architecture decision resulted in a performance problem. I would probably like a little more detail, but I’m not complaining. The article is well-written and a good length for a blog post. Here are his lessons in order.
#1 – “Put Smarty compile and template caches on an active-active DRBD cluster with high load and your servers will DIE!”
The first lesson is about how mirroring and high availability cluster software can corrupt your file system. This is one more reason to thoroughly load and performance test your new system before putting it into production. There is a high probability this clustering side effect would have been identified during a high-volume test.
#2 – “Put an out-of-the-box webserver configuration on your machines, do not optimize it at all and your servers will DIE!”
This lesson applies to IIS, Apache, or any other web server software you use. When a piece of software has dozens of configuration settings, you will almost never get high performance without tuning those settings. I have rarely encountered a company that didn’t feel its situation “is unique across the industry”. So why would you think that your system won’t require a unique setup? It will.
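As one illustration of why out-of-the-box settings rarely fit, consider the classic sizing exercise for a prefork-style web server: how many worker processes fit in RAM before the box starts swapping? This is a rough sketch with illustrative numbers I’ve chosen, not figures from Steffen’s article — measure your own per-process memory and OS overhead before tuning a real server.

```python
# Rough sizing of web server worker count from available memory.
# All numbers below are illustrative assumptions -- measure your own
# per-process RSS and reserved memory before tuning a real server.

def max_workers(total_ram_mb, reserved_mb, per_process_mb):
    """Estimate how many worker processes fit in RAM without swapping."""
    usable = total_ram_mb - reserved_mb
    return max(usable // per_process_mb, 1)

# Example: 4 GB box, ~1 GB reserved for OS and other daemons,
# ~25 MB resident per web server child process.
print(max_workers(4096, 1024, 25))  # → 122
```

A default configuration that allows far more workers than this will swap under load, which is exactly the “servers will DIE” scenario; one that allows far fewer wastes the hardware you paid for.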
#3 – “Even a powerful database server has its limits and when you reach them – your servers will DIE!”
Several years ago, as the manager of a commercially successful web application (a SaaS-model learning system), I was faced with a decision about scale. Our customers were overwhelming our servers. My naive answer was to buy bigger machines. Bigger hardware wasn’t really the problem; it was simply the easy path. Bad management decision and bad technical decision. Our database was the bottleneck. Or more accurately, our expensive SQL queries were killing our performance. Examining slow query logs and re-writing poor SQL statements is a better answer than just upgrading hardware.
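To make the slow-query-log idea concrete, here is a minimal sketch that aggregates a MySQL-style slow log and ranks query patterns by total time. The log format and sample are simplified assumptions on my part; real slow logs carry more header lines, and purpose-built tools (mysqldumpslow, pt-query-digest) do this job properly.

```python
# Sketch: aggregate a MySQL-style slow query log to find the worst offenders.
# The format and sample below are simplified assumptions; real slow logs
# have additional header lines per entry.
import re
from collections import defaultdict

SAMPLE_LOG = """\
# Query_time: 4.20
SELECT * FROM orders WHERE customer_id = 42;
# Query_time: 0.03
SELECT id FROM users WHERE email = 'a@b.com';
# Query_time: 6.10
SELECT * FROM orders WHERE customer_id = 99;
"""

def worst_queries(log_text, threshold=1.0):
    """Return (normalized_query, total_time, count) sorted by total time."""
    totals = defaultdict(lambda: [0.0, 0])
    lines = log_text.splitlines()
    for i, line in enumerate(lines):
        m = re.match(r"# Query_time: ([\d.]+)", line)
        if m and i + 1 < len(lines):
            t = float(m.group(1))
            if t < threshold:
                continue  # ignore queries faster than the threshold
            # Normalize literals so similar queries group together.
            q = re.sub(r"\d+", "N", lines[i + 1])
            totals[q][0] += t
            totals[q][1] += 1
    return sorted(((q, t, c) for q, (t, c) in totals.items()),
                  key=lambda x: -x[1])

for q, t, c in worst_queries(SAMPLE_LOG):
    print(f"{t:6.2f}s x{c}  {q}")
```

The two `orders` queries collapse into one pattern totaling 10.30 seconds across two executions — the query you rewrite first, instead of buying a bigger machine.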
#4 – “Stop planning in advance and your servers are likely to DIE.”
Be proactive. Web performance is a moving target. Your success should be driving your metrics ever higher. More customers, more content, more creativity on the part of marketing, etc. will make performance tuning an iterative necessity. Planning ahead should involve setting clear performance criteria and load testing against those performance goals. Not just before re-launching your site, but load tests should be run against every release of your code. I’m not kidding – it is that important. You may not believe me now, however you will after that seemingly simple programming mod craters your performance and 25% of a normal day’s load grinds your system to a halt! Been there, done that, and you should learn from my mistakes.
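Setting clear performance criteria means the load test has an explicit pass/fail gate, not just a pile of numbers. The sketch below shows only that gate logic, fed with hand-picked timings; in a real pipeline the latencies would come from a load-testing tool (JMeter, Gatling, and the like), and the 200 ms goal is an assumption of mine, not a universal target.

```python
# Sketch: check a release's latencies against an explicit performance goal.
# The timings and the 200 ms p95 goal are illustrative assumptions; real
# numbers would come from your load-testing tool.

def meets_goal(latencies_ms, p95_goal_ms):
    """Return True if the 95th-percentile latency is within the goal."""
    ordered = sorted(latencies_ms)
    # Index of the 95th percentile (nearest-rank method).
    idx = max(int(round(0.95 * len(ordered))) - 1, 0)
    return ordered[idx] <= p95_goal_ms

timings = [120, 135, 128, 140, 133, 900, 125, 131, 127, 138]
print(meets_goal(timings, p95_goal_ms=200))  # → False
```

With only ten samples the nearest-rank p95 is the worst observation, so the single 900 ms outlier — the “seemingly simple programming mod” scenario — fails the gate and blocks the release before customers ever see it.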
#5 – “Forget about caching and you will either waste a lot of money on hardware or your servers will die!”
Caching is your best ally in the world of web performance tuning. You probably don’t overlook it, and you probably turn it on in the web server. My recommendation is to look for as many ways to utilize caching as possible. RAM trumps physical drive access every time. Good caching will reduce your hardware, software, and hosting footprint, often by more than 60%. I’ve seen it make an immediate 500% improvement in throughput. Steffen used Memcached, and it is very effective. Other caching tools can provide enormous scalability gains, such as aiCache, our performance improvement partner. When in doubt, cache it.
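The pattern Steffen used with Memcached is cache-aside: check RAM first, fall back to the expensive render on a miss, store the result with a time-to-live. Here is that pattern as a tiny in-process sketch so it stays self-contained; the key name and 60-second TTL are illustrative, and a real deployment would use Memcached (or similar) so every web server shares one cache.

```python
# Sketch of the cache-aside pattern, in-process for self-containment.
# A real deployment would back this with Memcached or similar.
import time

class TTLCache:
    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.store = {}  # key -> (value, expiry_timestamp)

    def get(self, key):
        entry = self.store.get(key)
        if entry is None:
            return None
        value, expires = entry
        if time.time() >= expires:
            del self.store[key]  # lazily evict stale entries
            return None
        return value

    def set(self, key, value):
        self.store[key] = (value, time.time() + self.ttl)

def get_homepage(cache, render):
    """Cache-aside: serve from RAM, fall back to the expensive render."""
    html = cache.get("homepage")
    if html is None:
        html = render()  # e.g. templates + database queries
        cache.set("homepage", html)
    return html

cache = TTLCache(ttl_seconds=60)
calls = []
render = lambda: calls.append(1) or "<html>home</html>"
get_homepage(cache, render)
get_homepage(cache, render)
print(len(calls))  # render ran only once; the second hit came from RAM → 1
```

Every request served from the cache is a template render and a set of database queries that never happen — which is where those throughput gains come from.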
#6 – “Put a few hundred thousand small files in one folder, run out of Inodes and your server will die!”
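A common way out of this trap is to shard files into nested subdirectories keyed on a hash of the filename, so no single folder ever holds more than a few dozen entries. This is a generic sketch of that layout — the two-level scheme is a conventional choice (nginx’s proxy cache uses a similar one), not necessarily the exact fix Steffen applied.

```python
# Sketch: shard files into hashed subdirectories so no single folder
# holds hundreds of thousands of entries. The two-level layout is a
# conventional choice, not necessarily Steffen's scheme.
import hashlib
import os

def sharded_path(root, filename, levels=2, width=2):
    """Map a filename to root/ab/cd/filename using its MD5 prefix."""
    digest = hashlib.md5(filename.encode("utf-8")).hexdigest()
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return os.path.join(root, *parts, filename)

print(sharded_path("/var/cache/thumbs", "avatar_42.png"))
```

Two levels of two hex characters give 65,536 buckets, so even a million small files average about fifteen per directory — comfortably inside any filesystem’s limits.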
I agree completely with Steffen’s statements:
Never ever start thinking “that’s it, we’re done” and lean back. It will kill your servers and perhaps even your business. It’s a constant process of planning and learning.
Web performance tuning never ends. I strongly advocate you adopt a process of “load test, tune, load test, tune, rinse & repeat” because you want to sleep well at night. You don’t want to be called into the boss’ office to explain why the site crashed. Bad performance doesn’t have to be a surprise if you stay proactive about it.
The most compelling reason I can share for why you should be continually testing and tuning is because web performance tuning translates to revenue and profit. It’s proven empirically (just follow that link to see the data).