This article explains the risks associated with using a scaled (aka downsized) environment for Performance Testing. I've been a little off topic lately, so I thought I would jump back into the realms of Performance Testing and attempt to answer one of the most complicated questions I'm faced with when Load Testing: "If we halve the size of the performance/load testing environment, can't we just multiply the figures up?" It's a straightforward question and the answer is simple – 'No'. But justifying that answer and explaining it in simple terms is more difficult, particularly to PMs and people not directly attached to the technology. So I'm going to attempt to explain, in simple terms, why scaled load testing environments tend not to work, and highlight the risks to be considered when using them. Point people at this article if you struggle to answer this question – and let me know what they think.

First let's take an object: a square. If we halve each side of the square, do we get half the square? Well, yes and no – each side is half the length, but the area (its capacity) is a quarter of the original square (½ × ½ = ¼).

This is a very simplistic view, but it illustrates that if the environment is 'halved', the capacity will not be. I'm setting the scene, so please bear with me…

Now IT projects are complicated engineering projects – each piece of the system is built separately and then put together before delivery. A little like a car, which I'll use as an example. Let's say the top speed of the car (aka production) is 100mph. If we halve the environment (e.g. buy a PC that has half the processing power), we are effectively halving the size of the engine, which – just like the square – gives the car a quarter of the capacity and performance. So now the maximum speed the resized car can reach is 25mph. But the size of the car is the same – we are driving the same-sized car with an engine that is a quarter of the size.

Now let's imagine the other parts of the car are software subsystems – they bolt indirectly onto the engine: wheels, nuts, bolts, steering and the axle. The smaller engine drives all of these but can never load them past 25mph. So everything looks OK in the scaled environment when we are going full speed, i.e. 25mph. Now let's take the axle – an essential part of the car that is attached to the engine. Let's say it has been unwittingly updated and a fault introduced that means it will break at 30mph. This performance fault will never be seen in the scaled environment; it will only be seen in production. So the key lesson here is: by using a scaled environment you are unlikely to find performance issues past your environment's capacity. However, if the axle breaks at 24mph you will find it – so it does reduce risk.
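To put some rough numbers behind the analogy – the figures below are purely hypothetical, not from any real test – here is a minimal sketch of what 'just multiplying the figures up' hides. A component with a hidden breaking point just above the scaled environment's ceiling passes every test you can actually run, so the extrapolated result looks perfectly healthy:

```python
# Hypothetical numbers only - the "axle" stands in for any component with a hidden limit.
SCALED_MAX_LOAD = 25      # the most load the scaled environment can generate (mph)
PRODUCTION_LOAD = 100     # the load production will actually see (mph)
AXLE_BREAKING_POINT = 30  # a fault introduced by the redesign (mph)

def axle_holds(load_mph: float) -> bool:
    """The component works fine until its (unknown to us) breaking point."""
    return load_mph < AXLE_BREAKING_POINT

# Every load level we can reach in the scaled environment passes...
scaled_results = {load: axle_holds(load) for load in range(5, SCALED_MAX_LOAD + 1, 5)}
print(scaled_results)               # {5: True, 10: True, 15: True, 20: True, 25: True}

# ...so "multiplying the figures up" predicts production is fine,
# while the real system fails well before production load.
print(axle_holds(PRODUCTION_LOAD))  # False - but we never saw it in test
```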

Now let's say the wheel nuts have been redesigned too. These were originally designed with plenty of tolerance – built to break at 200mph, way above the 100mph limit of the car. Let's say a redesign fault means the tolerance has actually been lowered to 102mph. What does this mean? It means the system won't break in test or in live, but the capacity of the associated components has been drastically reduced without visibility. So the key lesson here is: by using a scaled environment it becomes more difficult to see if the overall capacity and tolerance of the system has been reduced – making the chances of a live performance failure more likely.
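The same idea in a few lines (again, made-up numbers): both the original and the redesigned wheel nuts pass at the 25mph ceiling of the scaled environment, so the drop in headroom from 2x to barely 1.02x is completely invisible until production load arrives:

```python
# Hypothetical figures - headroom = breaking point / expected production load.
PRODUCTION_LOAD = 100       # mph
SCALED_MAX_LOAD = 25        # mph - the most we can drive in the scaled environment

original_nut_limit = 200    # designed to break at 200mph
redesigned_nut_limit = 102  # redesign fault quietly lowers the tolerance

for name, limit in [("original", original_nut_limit), ("redesigned", redesigned_nut_limit)]:
    passes_scaled_test = SCALED_MAX_LOAD < limit
    headroom = limit / PRODUCTION_LOAD
    print(f"{name}: passes scaled test={passes_scaled_test}, real headroom={headroom:.2f}x")

# original: passes scaled test=True, real headroom=2.00x
# redesigned: passes scaled test=True, real headroom=1.02x  <- invisible loss of tolerance
```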

So what can Performance Testers do?

Now in reality, the engine is made of many components that are essential to its speed (e.g. spark plugs, pistons, the camshaft). When we scale a hardware environment down, we do not reduce these components in proportion. Our essential IT components are memory, L1 cache, CPU speed, network, disk I/O and the DB – these interact in complex ways, and resizing one will affect the overall performance. If you have to prioritize any of these, attempt to keep memory the same. So the key takeaway here is: if you have to work in a scaled environment, do not scale everything down. Identify, prioritize and attempt to keep as many key attributes as close to live as possible.
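As a rough illustration – the attribute names and figures below are invented, not a recommendation for any particular stack – it helps to lay the scaled environment out against live and look at the ratio per attribute. The attributes that have drifted furthest from production are the ones to fight for, or at least to flag as limitations:

```python
# Hypothetical spec comparison - plug in your own environments' numbers.
production = {"cpu_cores": 32, "memory_gb": 256, "disk_iops": 20000, "network_gbps": 10, "db_connections": 500}
scaled     = {"cpu_cores": 8,  "memory_gb": 256, "disk_iops": 5000,  "network_gbps": 1,  "db_connections": 100}

print(f"{'attribute':<16}{'live':>8}{'scaled':>8}{'ratio':>8}")
for attribute, live_value in production.items():
    ratio = scaled[attribute] / live_value
    print(f"{attribute:<16}{live_value:>8}{scaled[attribute]:>8}{ratio:>8.2f}")

# Attributes with the smallest ratio are the furthest from live - those are the
# limitations to communicate, or the ones to argue to keep at production size.
```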

So that's a simple analogy. In reality you also have to consider the risks of deadlocking (less likely to show up in a scaled environment), configuration differences and the actual scalability capabilities of the application. I consulted at a company that had 200+ identical instances of a middle-tier server – replicating this in a load testing environment wasn't practical or cost effective. So I studied the architecture, then identified and communicated the limitations of the performance test environment. This also enabled me to identify some follies before I had even begun performance testing. The software developers had taken a common component out of the instances and centralized it – without considering that they had introduced a single point of failure which was going to be subjected to 200 times more traffic. So another takeaway is: study the environment and architecture – attempt to scale the performance testing environment sensibly.
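A back-of-envelope calculation along these lines (the traffic figures are hypothetical) shows why that centralization was a worry: the component that used to sit inside each instance now takes the combined traffic of all of them, and a scaled test with a handful of instances only ever exercises a small fraction of that load:

```python
# Hypothetical figures - per-instance traffic to the component that was centralized.
requests_per_instance = 50   # req/s each middle-tier instance sends to the shared component
production_instances = 200
scaled_test_instances = 4    # all a scaled environment could realistically afford

production_load_on_spof = production_instances * requests_per_instance
scaled_test_load_on_spof = scaled_test_instances * requests_per_instance

print(f"Load on centralized component in production: {production_load_on_spof} req/s")   # 10000 req/s
print(f"Load on centralized component in scaled test: {scaled_test_load_on_spof} req/s")  # 200 req/s
# The scaled test exercises 2% of the real load on the new single point of failure.
```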

Of course, every architecture is different – people rarely program in languages such as CUDA, so it's rare to come across truly scalable software architectures that will downsize proportionally.

So here are the key takeaways:

  • By using a scaled environment you are unlikely to find performance issues past your scaled environment's capacity.
  • By using a scaled environment you can reduce the capacity and tolerance of the overall system without visibility – this increases the risk of live performance issues.
  • If you have to work in a scaled environment – do not scale everything down. Identify, prioritize and attempt to keep as many key attributes as close to live as possible.
  • Identify and communicate the limitations of the scaled performance test environment to management.
  • Study the environment and architecture – attempt to scale the performance testing environment sensibly.
  • If you use a scaled environment for Performance Testing then make sure adequate (and fast) rollback procedures are in place in the live environment.

I hope you find this useful in explaining the risks associated with scaled performance environments.

Just a quick note – it's worth mentioning here that I have never ever seen Capacity Planning tools work. They are expensive and ineffective. There are much better ways of forecasting capacity or increasing the performance of the existing application.
