Feed aggregator
Regular Expressions and Pattern Matching with BrowserMob and Selenium
Hello readers, welcome back!
This article explains a few ways of using regular expressions with BrowserMob and Selenium. As we all know, regular expressions are extremely helpful when scripting dynamic web sites, especially in situations such as picking the first product from a dynamic list of products or clicking the last link of a dynamic drop-down.
To begin with, let's look at a couple of examples of using regular expressions with BrowserMob's VU (Virtual User) scripts. Not sure what VU is? Check it out here or contact BrowserMob Support at any time.
Example 1:
Single-line Regex – In this example we are trying to find a match against a piece of content contained within a single line, which is made easy by the BrowserMob method findRegexMatches.
In the snippet below, we are trying to parse the first item under the 'News' section on yahoo.com using a single-line regex.
browserMob.beginStep('Yahoo Home');
var response = c.get('http://www.yahoo.com/', 200);

// single-line regex
// getBody() returns the body of the HTTP response
var matches = browserMob.findRegexMatches(response.getBody(), "a class=\"small\" href=\"(.*?)\"");

// logging for troubleshooting purposes
browserMob.log(matches[0]);
var item = matches[0];
browserMob.endStep();

browserMob.beginStep('Follow the First News Item');
// go to the first news item parsed in the previous step
var response = c.get(item, 200);
browserMob.endStep();

Example 2:
Multi-line Regex – In this example we are trying to find a match against a piece of content spanning multiple lines. The JavaScript 'RegExp' object comes in handy in this instance.
In the snippet below, we are trying to parse the first hyperlink from HTML that spans multiple lines, as shown below. As you can see, there are a few newlines separating the 'ul' and the 'li'.
<ul class="menu"> <li><a href="/website-load-testing"> browserMob.beginStep('BM Home'); var response = c.get('http://browsermob.com/performance-testing',200); // multi-line regexp // The regular expression uses '\s' which is any whitespace, including newline, OR \S which is anything NOT a white space var re = new RegExp(/<ul class="menu">[\s|\S]*?<li><a href=\"(.*?)\"/i); var content = response.getBody(); myArray = re.exec(content); var item = myArray[1]; browserMob.endStep(); browserMob.beginStep('Load Testing Home'); // 'item' is the url parsed from the previous step var response = c.get('http://browsermob.com/'+item,200); browserMob.endStep();That was easy !! Now let’s look at a few ways of pattern matching with Selenium for BrowserMob’s RBU scripts.
Selenium supports a few methods that help match text patterns. Note, however, that Selenium locators themselves don't accept regular expressions; only text patterns and values do.
Globbing:
selenium.click("link=glob:*Gifts"); // Clicks on any link with text suffixed with 'Gifts' selenium.verifyTextPresent("glob:*Gifts*");Regular Expressions:[regexp, regexpi]
selenium.click("link=regexpi:^Over \\$[0-9]+$"); //matches links such as 'Over $75', 'Over $85' etcContains:
selenium.highlight("//div[contains(@class,'cnn_sectbin')]"); //highlights the first div with class attribute that contains 'cnn_sectbin' selenium.highlight("css=div#cat_description:contains(\"to last\")"); //locating a div containing the text 'to last' using css selectorStarts-with:
selenium.click("//img[starts-with(@id,'cat_prod_image')]"); //clicks on the first image that has an id attribute that starts with 'cat_prod_image' selenium.click("//div[starts-with(@id,'tab_dropdown')]/a[last()]"); //clicks on the last link within the div that has a class attribute starting with 'tab_dropdown' selenium.click("//div[starts-with(@id,'tab_dropdown')]/a[position()=2]"); //clicks on the second link within the div that has a class attribute starting with 'tab_dropdown' selenium.highlight("css=div[class^='samples']"); //highlights div with class that starts with 'samples'Ends-with:
selenium.highlight("css=div[class$='fabrics']"); //highlights div with class that ends with 'fabrics' selenium.click("//img[ends-with(@id,'cat_prod_image')]"); //clicks on the first image that has an id attribute that ends with 'cat_prod_image'[Note: ends-with is supported only by Xpath 2.0. FF 3 might throw an error for this.]
Happy Testing!
52 weeks of Application Performance – The dynaTrace Almanac
dynaTrace Firefox Closed Beta Program started
How to explain growing Worker Threads in JBoss
Is Your Business Ready to Scale? How to Combine the Power of Load Testing & Performance Optimization
How Database Queries Slow Down Confluence User Search
Making your Web Sites faster – Tutorial at QCon London
5 Steps to setup ShowSlow as Web Performance Repository for dynaTrace Data
Understanding Twitter’s Javascript in Multiple Browsers: How to Profile, Debug and Trace across Firefox and IE 6,7,8
Sneak Peek on Firefox support with dynaTrace
Proactively Avoid Site Abandonment by Identifying Thread Contention Issues
Taking on the Fail Whale and Tumblbeast/Tumbeast
Animal tussle going on, please hold
As most of you know, Twitter displays its "Fail Whale" when a 503 page is served, which happens when the server is overloaded or down for maintenance. A few companies have decided to modify their 503 pages to make them more interesting.
Matthew Inman (creator of http://theoatmeal.com/) decided to create a similar page for Tumblr. It's called Tumblbeast, and it looks like this:
Update: Tumblr has renamed it Tumbeast and officially adopted it as a 503 page.
We thought we should make one for our customers...
Just for fun. =)
You can link to this image at:
<img src="http://s3.amazonaws.com/loadimpact_us/images/animaltussle2.jpg" border="0"
alt="Animal Tussle" />
By the way, for those who are interested in making a 503 Service Unavailable page, here's a decent guide: http://webhostinghelpguy.inmotionhosting.com/web-hosting/how-to-make-a-503-service-unavailable-page/
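For a concrete starting point, here is a minimal sketch of one way to serve such a page, written as a tiny Node.js server. The markup and image URL are just the ones from this post, used for illustration; this is not the approach from the guide above.

var http = require('http');

http.createServer(function (req, res) {
  // return the 503 status so clients and crawlers know the outage is temporary
  res.writeHead(503, {
    'Content-Type': 'text/html',
    'Retry-After': '300' // a hint, in seconds, for when to try again
  });
  res.end(
    '<html><body>' +
    '<h1>Animal tussle going on, please hold</h1>' +
    '<img src="http://s3.amazonaws.com/loadimpact_us/images/animaltussle2.jpg" alt="Animal Tussle" />' +
    '</body></html>'
  );
}).listen(8080);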
Update: Oatmeal created this image:
Like us, Matthew is taking a jab at the Fail Whale. You can read more about it here:
http://theoatmeal.com/blog/fail_whale
New Hands On Demo Video of dynaTrace AJAX Edition 2.0 available
A lesson in validation
Those who have worked with me know how much I stress the importance of validation: validate your tools, workloads and measurements. Recently, an incident brought home yet again the importance of this tenet and I hope my narrating this story will prove useful to you as well.
For the purposes of keeping things confidential, I will use fictional names. We were tasked with running some performance measurements on a new version of a product called X. The product team had made considerable investment into speeding things up and now wanted to see what the fruits of their labor had produced. The initial measurements seemed good but since performance is always relative, they wanted to see comparisons against another version Y. So similar tests were spun up on Y and sure enough X was faster than Y. The matter would have rested there, if it weren’t for the fact that the news quickly spread and we were soon asked for more details.
Firebug to the rescue

At this point, we took a pause and wondered: are we absolutely sure X is faster than Y? I decided to do some manual validation. That night, connecting via my local ISP from home, I used Firefox to perform the same operations that were being run in the automated performance tests. I launched Firebug and started looking at waterfalls in the Net panel.
As you can probably guess, what I saw was surprising. The first request returned a page that caused a ton of object retrievals. The onload time reported by Firebug was only a few seconds, yet there was no page complete time!

The page seemed complete and I could interact with it, but the fact that Firebug could not determine Page Complete was a little disconcerting. I repeated the exercise using HttpWatch just to be certain, and it reported exactly the same thing.

Time to dig deeper. Looking at the individual object requests, one in particular was using the Comet model and never completed. On waiting a little longer, I saw other requests being sent by the browser periodically. Neither of these request types, however, had any visual impact on the page. Since requests continued to be made, Firebug naturally concluded that the page was not complete.
Page Complete or Not?

This begged the question: how did the automated tests run, and how did they determine when the page was done? There was a timeout set for each request, but if requests were terminating because of the timeout, we surely would have noticed, since the response times reported would have equaled the timeout value. In fact, the response time being reported was less than half the timeout value.
So we started digging into the waterfalls of some of the automated test results. Lo and behold – a significant component of the response time came from the HTTP Push (also known as HTTP Streaming) request. Several of the sporadic requests were also being made well after the page was complete. This resulted in arbitrary response times being reported for Y.
It turned out that the automated tool was actually quite sophisticated. It doesn't just use a standard timeout for the entire request; instead, it monitors the network, and if no activity is detected for 3 seconds, it considers the request complete. So it captured some of the streaming and other post-Page-Complete requests and returned when the pause between them exceeded 3 seconds. That is why we saw "valid" response times that looked reasonable and had us fooled!
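The tool's internals aren't public, but the idle-detection heuristic itself is easy to sketch. A minimal illustration in JavaScript, where recordActivity is a hypothetical hook invoked on every network event (request start, data received, request end), not any real tool's API:

var IDLE_MS = 3000; // the 3-second quiet period described above
var idleTimer = null;

// call this on every network event after navigation starts
function recordActivity(onPageComplete) {
  if (idleTimer) clearTimeout(idleTimer);
  // if no further activity arrives within 3 seconds, declare the page complete --
  // note that an open-but-silent Comet connection will not stop the timer from firing
  idleTimer = setTimeout(onPageComplete, IDLE_MS);
}

// usage:
recordActivity(function () { console.log('page considered complete'); });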
Of course, this leads to the bigger discussion of when exactly we should consider an HTTP request complete. I don't want to get into that now, as my primary purpose in this article is to point out the importance of validation in performance testing. If we had taken the time to validate the results of the initial test runs, this problem would have been detected much earlier and many cycles could have been saved (not to mention the embarrassment of admitting to others that the initial results reported were wrong!).
Finding out common user behaviour
Learn how to use Google Analytics to derive common user behaviour
Introduction
In addition to finding out how many simultaneous users you need for a load test (for more information, see http://loadimpact.com/blog/search?criteria=analytics), you might also find it challenging to choose visitor behaviors that are representative of all your site visitors. There are many possible permutations of behaviors, and it is not practical to simulate every one of them. We can, however, use analytics software to find common user behavior. This guide explains how Google Analytics helps you do the job.
Extracting the data
On the right menu of Google Analytics, select Content -> Top Landing Pages.
Choose an appropriate date range. In this example, we have chosen two months' worth of data. Wider date ranges produce more accurate user behaviors, but will not reflect the latest common behavior. To avoid ending up with irrelevant user behaviors, be careful not to choose a start date that precedes any major revision of your website.
You should now be able to see this on the right hand side of the page. Click the second option to have the data displayed in pie chart form.
Let's focus on the two most popular landing pages, "/" and "/index.php". Visitor paths will be represented by nodes. We will dive into the concept in more detail in a later analysis, but let's complete the whole diagram first.
Next, under "Overview", click "Entrance Paths".
font-family:"Calibri","sans-serif";mso-ascii-theme-font:minor-latin;mso-fareast-font-family:
Calibri;mso-fareast-theme-font:minor-latin;mso-hansi-theme-font:minor-latin;
mso-bidi-font-family:"Times New Roman";mso-bidi-theme-font:minor-bidi;
mso-ansi-language:EN-US;mso-fareast-language:EN-US;mso-bidi-language:AR-SA">
Here you will see the various paths of different content. "Then viewed these pages" shows what was viewed and the corresponding percentages after your visitors went to the selected content. "And ended up here" shows what content was viewed after that.
Let's take a look at the paths taken by visitors after arriving at the http://loadimpact.com/ page. We will ignore the first two results, as there is an auto-loading script on the index page (this can be accounted for by increasing sleep time later on). The other probable content our visitors go to is "/products.php" and "/products.php?basic pages". We will add this information to the diagram:
We click "/products.php?light=" and look under "And ended up here:". Again, we find the two most visited pages and plot them in the diagram together with corresponding precentages of users that came from the previous page.
For the branch starting with "/index.php", we get the following:
The above diagram shows the paths commonly taken by visitors starting from loadimpact.com/index.php.
We could go on for a few more iterations, but for this example we will stop here. Next, go to "Top Exit Pages" and verify that the top two exit contents appeared in your last iteration.
In this case they did appear and we can safely say that three iterations are sufficient to give an accurate picture of visitor behaviors.
Analysis
In every branch, we multiply the percentages together to obtain a "comparison number". For example, starting from /index.php to /pageanalyzer.php to / yields 0.064*0.0403*0.0221 = 0.000057.
We do that for every branch and then rank the sequence of content from highest to lowest "comparison number".
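Since this is just per-branch multiplication followed by sorting, it is easy to script. A minimal sketch in JavaScript – only the first branch uses the numbers from the example above; the second is a placeholder, not real data:

// each branch is a page sequence plus the fraction of visitors taking each step
var branches = [
  { path: ['/index.php', '/pageanalyzer.php', '/'], fractions: [0.064, 0.0403, 0.0221] },
  { path: ['/', '/products.php'], fractions: [0.5, 0.1] } // hypothetical placeholder
];

branches.forEach(function (branch) {
  // the "comparison number" is the product of the step fractions
  branch.score = branch.fractions.reduce(function (a, b) { return a * b; }, 1);
});

// rank branches from most to least common
branches.sort(function (a, b) { return b.score - a.score; });

branches.forEach(function (branch) {
  console.log(branch.path.join(' -> ') + ' : ' + branch.score.toFixed(6));
});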
Hence, the most common user behavior would be to go to our index page, and from there click "Sign up now" under "Load Test Light", and then return to the index page before leaving the website. This is illustrated in the top diagram below.
The other common behavior would be to go to our index page (note that http://loadimpact.com/ and http://loadimpact.com/index.php point to the same page with the same code), and then click "Sign up now" followed by "Proceed to registration" on the products page.
Once common user behaviors are determined, we can transform them into a load script via the session recorder (available with any Load Impact Premium account).
It should be noted that this method is only valid for a general load test on your website. You might also want to test specific functionality, in which case this method is not very useful.
Web Performance Optimization Use Cases: Part 4 Load Time Optimization
Antivirus Add-On for IE to cause 5 times slower page load times
Going to PyCon 2011!
This year I am attending my first PyCon (the annual Python community conference).
I will be in Atlanta: March 10-13.
If anyone is interested in meeting up or collaborating while I'm there, get in touch:
- Twitter: @cgoldberg
- Homepage/Info: goldb.org
Tools for Web Performance Analysis
At Yahoo!, I’m currently focused on analysis of end user performance. This is a little different than what I’ve worked on in the past, which was mainly server-side performance and scalability. The new focus requires a new list of tools so I thought I’d use this post to share the tools I’ve been learning and using in the past couple of months.
HttpWatch

This made the top of my list and I use it almost every day. Although its features are very similar to Firebug's, it has two I find especially useful: the ability to save waterfall data directly to a CSV file, and a stand-alone HttpWatch Studio tool that easily loads previously saved data and reconstructs the waterfall (I know you can export Net data from Firebug, but only in HAR format). Best of all, HttpWatch works with both IE and Firefox. The downside is that it works only on Windows and is not free.

Firebug

This is everyone's favorite tool and I love it too. Great for debugging as well as performance analysis. It is a little buggy, though – I get frustrated when, after starting Firebug, I go to a URL expecting it to capture my requests, only to find that it has disappeared on me. I end up keeping Firebug on all the time, and this can get annoying.

HttpAnalyzer

This is also a Windows-only commercial tool, similar to Wireshark. Its primary focus is HTTP, however, so it is easier to use than Wireshark. Since it sits at the OS level, it captures all traffic, irrespective of which browser or application is making the HTTP request. As such, it's a great tool for analyzing non-browser-based HTTP client applications.

Gomez

Yet another commercial tool, but considering our global presence and the dozens of websites that need to be tested from different locations, we need a robust commercial tool that can meet our synthetic testing and monitoring requirements. Gomez has pretty good coverage across the world.

I have a love-hate relationship with Gomez. I love that I can test at both backbone and last mile, but I hate its user interface and limited data visualization. We have to resort to extracting the data via web services and doing the analysis and visualization ourselves. Still, I really can't complain too much – I didn't have to build those tools myself!

Command-line tools

Last but not least, I rely heavily on standard Unix command-line tools like nslookup, dig, curl, ifconfig, netstat, etc. And my favorite text-processing tools remain sed and awk. Every time I say that, people shake their heads or roll their eyes, but I think we can agree to disagree without getting into language wars.