We are planning a load test of a basic LAMP Drupal target system using LoadStorm. This article describes the environment required on the system, walks through the testing steps, and gives the expected results of the load test. LoadStorm does not require the user to set up a large number of servers for an isolated test.
The Test Environment
The test environment is a LAMP Drupal installation on a small Amazon EC2 instance. The baseline benchmark will have no performance modifications to the installation. We use LoadStorm rather than JMeter because it removes the need to set up your own servers for an isolated test.
The goal of the load test is to replicate a typical Drupal site with a certain proportion of anonymous and authenticated (logged-in) users. For this to happen, we need two things. First, a body of content including pages, stories, and comments must exist or be generated. The amount of content depends heavily on the number of users and the age of the site. Second, there must be at least as many registered Drupal users as there are LoadStorm virtual users. The content must be easy to generate, and the user settings must be reasonably representative of an actual Drupal setup. The Devel module can add a number of random users by default. However, there is no way to determine their usernames and transfer that information to LoadStorm.
LoadStorm already has a number of virtual user identities. The virtual users can use these identities to log into sites and behave as logged-in users. The identities and their passwords will eventually be available for download from the LoadStorm resource site. Using them on the target system depends on setting up the following three resource files.
Preparation Shell Script
- Creates a backup of the current state of the Drupal SQL database, then adds the LoadStorm virtual user identities to the Drupal site.
Finish Shell Script
- Reverts Drupal to the state it was in before the preparation shell script was run. This removes the LoadStorm users from the target system. (Reference: an article on backup/restore of a MySQL database.)
LoadStorm Users File
- Contains the usernames and passwords of the LoadStorm virtual users. These are loaded into the SQL database during the preparation phase and removed again afterwards.
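The preparation and finish steps could be sketched roughly as follows. This is a minimal sketch, not the actual LoadStorm scripts: the function names, the database name and user, and the `username,password` layout of the users file are all assumptions made for illustration. The step that inserts the users into Drupal is left out on purpose, because the layout of the users table depends on the Drupal version.

```python
import csv
import io
import subprocess


def backup_command(db_name, db_user, dump_path):
    """Build the mysqldump call that snapshots the Drupal database."""
    # --password with no value makes mysqldump prompt for it.
    return ["mysqldump", "--user=" + db_user, "--password",
            "--result-file=" + dump_path, db_name]


def restore_command(db_name, db_user):
    """Build the mysql call that replays a dump (the dump is fed on stdin)."""
    return ["mysql", "--user=" + db_user, "--password", db_name]


def parse_users(text):
    """Parse 'username,password' lines (an assumed file layout) into tuples."""
    reader = csv.reader(io.StringIO(text))
    return [(row[0].strip(), row[1].strip()) for row in reader if len(row) >= 2]


def run_backup(db_name, db_user, dump_path):
    """Preparation step: save the current database state."""
    subprocess.run(backup_command(db_name, db_user, dump_path), check=True)


def run_restore(db_name, db_user, dump_path):
    """Finish step: load the saved state back into the database."""
    with open(dump_path, "rb") as dump:
        subprocess.run(restore_command(db_name, db_user), stdin=dump, check=True)
```

For example, `run_backup("drupal", "root", "/tmp/drupal_before_test.sql")` would be called before the test and `run_restore` with the same arguments after it. All three values are placeholders.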
Once the preparation script has been run, the administrator may begin adding auto-generated content to the Drupal installation through the Devel module. The Devel module (http://drupal.org/project/devel) must be placed in Drupal’s module installation directory. Navigate to http://www.mysite.com/?q=admin/generate and begin the setup process. The generated content gives the virtual users more locations to traverse. It takes about five and a half minutes to create 10,000 pages on the small EC2 instance. This step will most likely not be necessary if a mature Drupal site already exists. On closer inspection, the user names come from a word-generating algorithm built into the Devel code, and the generated content is drawn from a hard-coded dictionary.
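The timing above works out to roughly 30 pages per second. As a rough planning aid, the sketch below extrapolates that figure to other page counts. It assumes generation time scales linearly with the number of pages, which has not been measured here, and the function name is made up for this example.

```python
# Measured on the small EC2 instance: 10,000 pages in about 5.5 minutes.
MEASURED_PAGES = 10000
MEASURED_SECONDS = 5.5 * 60


def estimated_generation_minutes(pages):
    """Estimate Devel generation time, assuming it scales linearly with page count."""
    pages_per_second = MEASURED_PAGES / MEASURED_SECONDS
    return pages / pages_per_second / 60


if __name__ == "__main__":
    for pages in (1000, 10000, 50000):
        print(pages, "pages: about", round(estimated_generation_minutes(pages), 1), "minutes")
```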
It is important that the only modules that are running during the test are the ones that will be active in a production system. Devel may remain activated for monitoring purposes but it may skew the results.
The Load Test
There are three basic types of users in Drupal: admin, logged-in, and anonymous. Each type generates a distinct kind of load on the target system. There is seldom a significant number of administrators on a single system at one time, so this type of user can be ignored for this test. Note that this does not account for administrators performing maintenance tasks that cause significant system slowdown. Ideally, the user mixture should reflect the production ratio if it is known. For this basic test, we will use a 50-50 mixture of logged-in and anonymous users.
These two blog entries describe a similar load test of Drupal using JMeter. JMeter is a load testing tool that can ramp up the number of threads (virtual users). The threads loop through their designated scenario for the specified time period. When distributed, JMeter can only mirror its threads onto another server, which doubles the exact same workload. LoadStorm differs in several respects. First, LoadStorm assigns only one scenario to each virtual user, and the virtual user terminates once that scenario finishes. Second, LoadStorm can navigate randomly through a domain. As described above, LoadStorm virtual users can run in either an anonymous or a logged-in state. This produces more realistic user scenarios.
http://www.johnandcailin.com/blog/john/load-test-your-drupal-application-scalability-apache-jmeter
The load test should be performed in a ramping style. This allows the tester to determine the maximum number of users that can traverse the site while performance remains within acceptable parameters.
Acceptable Parameters:
- A typical user will not want to wait more than 5 seconds for a page to load. A virtual user waiting longer than 5 seconds should be considered a failure.
- It is never acceptable for a user to receive a 404 ‘page not found’ or other error when one is not appropriate. These should be considered errors on their first encounter.
- A nonresponsive server should be considered a major failure. This will end the virtual user's script. The virtual user will only try once to connect to the server.
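The three rules above can be expressed as a small classifier over individual request results. This is only an illustrative sketch: the names are hypothetical and not part of LoadStorm, and treating every HTTP status of 400 or above as an error is a simplifying assumption, since some such responses may be appropriate on a real site.

```python
from enum import Enum


class Outcome(Enum):
    OK = "ok"
    SLOW = "failure: response took longer than 5 seconds"
    ERROR = "error: unexpected HTTP error status"
    UNRESPONSIVE = "major failure: server did not respond"


MAX_RESPONSE_SECONDS = 5.0


def classify(status_code, elapsed_seconds):
    """Classify one request; status_code is None when the server never answered."""
    if status_code is None:
        return Outcome.UNRESPONSIVE
    if status_code >= 400:  # assumption: every 4xx/5xx counts as inappropriate
        return Outcome.ERROR
    if elapsed_seconds > MAX_RESPONSE_SECONDS:
        return Outcome.SLOW
    return Outcome.OK


def should_stop_virtual_user(outcome):
    """A nonresponsive server ends the virtual user's script after one attempt."""
    return outcome is Outcome.UNRESPONSIVE
```

Checking the error status before the response time means a slow 404 is reported as an error rather than as a slow page. That ordering is a design choice in this sketch.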
The test should begin with a single user. If there is a problem with the test setup, this user will hopefully reveal it at the start of the run. The test should then slowly ramp up to 50 users over a period of 30 minutes.
The type of user determines the types of scenarios that the user can perform. Scenarios can be selected from a list of preconstructed behaviors, or the tester can create their own to suit their needs. The distribution of users across the available scenarios is decided through weighting or proportioning. Ideally, the distribution will reflect actual user patterns on the production system. For this test, the proportion of each scenario is given below.
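One way to picture the ramp is as a function from elapsed time to the number of active virtual users. The sketch below assumes a linear ramp from 1 to 50 users. The text only asks for a slow ramp, so the linear shape and the function name are assumptions.

```python
import math

START_USERS = 1
PEAK_USERS = 50
RAMP_MINUTES = 30


def users_at(minute):
    """Active virtual users at a given minute, assuming a linear ramp."""
    if minute <= 0:
        return START_USERS
    if minute >= RAMP_MINUTES:
        return PEAK_USERS
    progress = minute / RAMP_MINUTES
    return math.floor(START_USERS + (PEAK_USERS - START_USERS) * progress)


if __name__ == "__main__":
    for minute in range(0, RAMP_MINUTES + 1, 5):
        print("minute", minute, "->", users_at(minute), "users")
```

With these numbers, the ramp adds a little under two users per minute.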
Proportions of Scenarios
- 10% of logged-in users will create new content
- 30% of logged-in users will navigate randomly until commenting on a story
- 60% of logged-in users will navigate randomly
- 100% of anonymous users will navigate randomly
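Combined with the 50-50 split between logged-in and anonymous users, these proportions translate into overall shares of 5%, 15%, 30%, and 50%. The sketch below turns those shares into whole numbers of virtual users with a largest-remainder split, so the counts always add up to the total. The scenario keys and the function name are made up for this example.

```python
# Overall percentages: each logged-in share is multiplied by the 50% logged-in mix.
SCENARIO_MIX = {
    "logged_in_create_content": 5,   # 10% of the 50% logged-in users
    "logged_in_comment": 15,         # 30% of the 50% logged-in users
    "logged_in_random": 30,          # 60% of the 50% logged-in users
    "anonymous_random": 50,          # 100% of the 50% anonymous users
}


def allocate(total_users, percent_by_scenario):
    """Split total_users by integer percentages using the largest-remainder method."""
    if sum(percent_by_scenario.values()) != 100:
        raise ValueError("percentages must sum to 100")
    names = list(percent_by_scenario)
    scaled = [total_users * percent_by_scenario[name] for name in names]
    counts = [value // 100 for value in scaled]
    leftover = total_users - sum(counts)
    # Hand leftover users to the scenarios with the largest fractional part;
    # ties go to the scenario listed first because sorted() is stable.
    by_remainder = sorted(range(len(names)), key=lambda i: -(scaled[i] % 100))
    for i in by_remainder[:leftover]:
        counts[i] += 1
    return dict(zip(names, counts))


if __name__ == "__main__":
    print(allocate(50, SCENARIO_MIX))
```

At the 50-user peak, two of the shares are fractional (2.5 and 7.5 users), which is why some rounding rule is needed at all.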
Performance tweaks and changes to how system resources are distributed can improve the performance of the system. Once a baseline test has been run, the tester can repeat this exact test after modifying the system. Modifications may include placing resource files (images, video, and static pages) on a separate server or upgrading to a larger EC2 instance. Repeating the test measures the effect of each modification.
The Expected Results
Bandwidth is unlikely to be the bottleneck for the Amazon EC2 instance. I have seen Amazon’s architecture support 5-7 MBPS for a single EC2 instance. This is especially likely if the load generators and the target system are within the same zone. Instead, the CPU and memory (DRAM) are likely to be the resources that cannot keep up with the load of the dynamic web content. This means that modifications which reduce the load on these two resources will have the greatest effect on your system’s performance. I would assume from the examples given below that throughput and response time will scale perfectly while the load is below 10-20 users. Beyond that point, I am uncertain how much load the system can take before it is overloaded or excessively violates the acceptable parameters.
There have been previous tests of this type of setup using JMeter. They include a walk-through of the test as well as some interesting graphs of how performance changes as the instance type is upgraded on Amazon EC2.
This blog entry covers some other types of LAMP upgrades and their effects on the scalability of the target system. That test uses ApacheBench (ab).
http://buytaert.net/drupal-webserver-configurations-compared
If you need a system that supports very large scale, you may wish to move to a distributed architecture. As you scale up, you will want to load test at each step to determine the appropriate capacity.