As we’ve discussed previously, Web site optimization directly affects a company’s bottom line. A sudden traffic spike that swamps a website’s capacity can cost a company thousands or even tens of thousands of dollars per hour. Web servers and Web applications should be built and deployed from day one with performance at the forefront of everyone’s mind.

Web site administrators and Web application developers have a host of tricks and techniques they can employ to deliver Web pages more quickly. In my experience, simple Web application optimization techniques can yield a 1,000% performance improvement. Caching is the #1 tuning trick in the Web developer’s kit. Customers ask me weekly what I recommend for speeding up their app. I always start with “caching, caching, and more caching.” It’s like magic for a site.

A ViSolve white paper shows a return on investment of $61,000 for a $20,000 total cost of ownership of only two caching servers!

In this article we’ll look at how two different but related forms of caching are used to conserve server resources and reduce network latency, greatly improving your customers’ experience with your site.

What is Caching?

Caching refers to any mechanism that stores previously retrieved content for future use. As I learned it in college back in the VAX/VMS operating systems class, it means temporarily keeping something in memory that you will use again, in order to avoid hitting the hard drive. Computer scientists today are less concerned about saving every byte than we were back then. Still, Web applications are constantly reusing data and files, so why in the world would we want to make an expensive hit to the database each time? Hard drives can be 10,000 times slower than memory because they are mechanical: the read head must move to the correct position and the platter must spin to the exact spot where the data lives. Memory moves at the speed of electricity.

The goal of caching is to increase the speed of content delivery by reducing the amount of redundant work a server needs to perform. Keeping a file in memory for reuse can save millions of drive accesses, so content reaches the user’s browser orders of magnitude faster. Caching and performance go hand-in-hand. It’s a no-brainer.
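To make the idea concrete, here is a minimal sketch of an in-memory cache in Python. The class name MemoryCache, the max_age_seconds parameter, and the load callback are all hypothetical names chosen for illustration; this is not how any particular Web server implements caching.

```python
import time


class MemoryCache:
    """Keep loaded values in memory and reuse them until they are too old."""

    def __init__(self, max_age_seconds=60.0, clock=time.monotonic):
        self.max_age_seconds = max_age_seconds
        self._clock = clock      # injectable so the cache can be tested
        self._entries = {}       # key -> (stored_at, value)
        self.hits = 0
        self.misses = 0

    def get(self, key, load):
        """Return the cached value for key, calling load(key) only on a miss."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self.max_age_seconds:
            self.hits += 1
            return entry[1]
        # Miss or expired entry: do the expensive work once and remember it.
        self.misses += 1
        value = load(key)
        self._entries[key] = (now, value)
        return value
```

A caller would pass the expensive operation (a disk read or a database query) as the load function. Every request that arrives while the entry is still young is answered from memory, without touching the slower layer.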

Caching can operate at three different levels of a Web application’s architecture:

  1. The Web Server Layer
  2. The Application Layer
  3. The Database Layer

In this article, we’ll examine how a Web server can optimize content delivery both by controlling the Web browser cache and by caching content in memory.

Web Server Cache Control

The best-known cache is the Web browser cache stored on a user’s machine. When a Web browser such as IE, Firefox or Chrome retrieves a resource, it doesn’t display it once and throw it away: it stores the content on the user’s local hard drive.

A Web server can use a number of techniques (discussed below) to instruct the Web browser when it is permissible to use this local copy of the content in lieu of downloading the content again from the Web server.

A Web server delivers different types of data, including (to name just a few) HTML code, images, stylesheets, and JavaScript libraries. Some of this information, such as the output of a PHP or other CGI script, is highly dynamic and volatile, and can return different results each time it is accessed. Other files, such as images and client-side scripts, may change far less frequently. These files are prime candidates for caching.

Web server cache control uses HTTP headers to regulate which resources are pulled from the cache, and how often. Setting effective cache control headers on content is critical to the performance of a Web application because it directly affects how browsers request files from your web server.

The Expires header tells a Web browser that a specific piece of content expires at a specific date and time. The Cache-Control header uses a combination of caching directives and age modifiers to instruct a Web client on whether a specific piece of content can or cannot be cached, and for how long. The Cache-Control header is documented in section 14.9 of RFC 2616 (since superseded by RFC 9111).
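As a rough illustration of what these two headers look like on the wire, the following Python sketch builds both for a given lifetime. The function name cache_headers and its parameters are assumptions made for this example. A real Web server would normally set these headers through its own configuration, not through application code like this.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def cache_headers(max_age_seconds, now=None, public=True):
    """Build Cache-Control and Expires headers for a cacheable response."""
    now = now or datetime.now(timezone.utc)
    expires_at = now + timedelta(seconds=max_age_seconds)
    scope = "public" if public else "private"
    return {
        # Relative lifetime: "keep this for N seconds after you receive it".
        "Cache-Control": f"{scope}, max-age={max_age_seconds}",
        # Absolute lifetime, formatted as an HTTP date in GMT.
        "Expires": format_datetime(expires_at, usegmt=True),
    }
```

When both headers are present, clients that understand Cache-Control give max-age precedence over Expires. Sending both mainly serves as a fallback for very old clients.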

Web servers can also use the ETag header, which assigns a unique ID to each version of a file, image, or other component. When the Web server delivers content, it stamps it with an ETag value. Later, if the client believes the content might have expired, it makes an HTTP request including an If-None-Match header, with the value of that header set to the last ETag value for that component. If the content has changed since that ETag value was issued, the server will respond with the new content. Otherwise, if the content has not changed, the server will respond with a 304 HTTP status and an empty HTTP response body. ETags prevent a Web server from re-delivering content that is still fresh, thereby conserving server resources, saving bandwidth, and reducing page load time.
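The ETag exchange described above can be sketched in a few lines of Python. This is a simplified model, not a full HTTP implementation. The function names make_etag and respond are hypothetical, and deriving the ETag from a hash of the body is only one possible choice. The sketch also assumes the client sends a single ETag, whereas the real If-None-Match header may carry a list of values or a wildcard.

```python
import hashlib


def make_etag(body):
    """Derive a quoted ETag from the content, so it changes when the body does."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body, if_none_match=None):
    """Return (status, headers, payload) for a possibly conditional GET."""
    etag = make_etag(body)
    if if_none_match is not None and if_none_match.strip() == etag:
        # The client's copy is still current: send no body at all.
        return 304, {"ETag": etag}, b""
    # First request, or the content has changed: send the full body.
    return 200, {"ETag": etag}, body
```

The saving comes from the 304 branch: the server sends back only the status line and headers, so an unchanged image or script costs a few hundred bytes instead of a full download.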

Web Server Caching and Cache Proxies

Web server caching is implemented in one of two ways. A Web server can cache relatively static content in memory, greatly reducing the amount of time spent retrieving content from storage. Some Web servers, such as G-WAN, are optimized for static content, and automatically cache data in memory. Other Web servers allow memory caching through configuration files. Apache, for instance, supports the mod_cache module, which provides both a memory cache for static content (mod_mem_cache), and a disk cache for caching content assembled from dynamic sources or retrieved from an external provider (mod_disk_cache).
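To show what caching static content in memory involves, here is a small Python sketch that keeps file contents in memory and rereads a file only when it changes on disk. The StaticFileCache class is hypothetical and is not how G-WAN or Apache is implemented. It assumes that a change in modification time or file size is a good enough signal that the file was edited.

```python
import os


class StaticFileCache:
    """Serve file contents from memory, rereading only when the file changes."""

    def __init__(self):
        self._entries = {}   # path -> (mtime_ns, size, data)
        self.disk_reads = 0

    def read(self, path):
        stat = os.stat(path)  # cheap metadata lookup, no file contents read
        entry = self._entries.get(path)
        if entry and entry[0] == stat.st_mtime_ns and entry[1] == stat.st_size:
            return entry[2]
        # Not cached yet, or the file changed on disk: read it and remember it.
        with open(path, "rb") as f:
            data = f.read()
        self.disk_reads += 1
        self._entries[path] = (stat.st_mtime_ns, stat.st_size, data)
        return data
```

This sketch still makes one metadata call per request. A production cache would typically also cap its total memory use and evict rarely requested files.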

A web server cache can also take the form of a special server called a web cache proxy, which intercepts client requests and serves content previously cached from the actual host server where the application resides. These proxy caches are usually operated in various locations worldwide, and requests are routed dynamically to the appropriate proxy cache server based on the user’s location. This user proximity speeds content delivery to the user by reducing network latency. It also reduces the number of requests made against the target server.

Cisco describes a proxy cache this way:
When a browser wishes to retrieve a URL, it takes the host name component and translates that name to an IP address. A HTTP session is opened against that address, and the client requests the URL from the server.

When using a proxy cache, not much is altered in the transaction. The client opens a HTTP session with the proxy cache, and directs the URL request to the proxy cache instead.

Cache proxies work the same way that a Web browser’s cache works: they use the information included in HTTP headers such as Expires and Cache-Control to determine if a given component is fresh or stale. Setting accurate cache control headers on content is critical for the success of a Web cache proxy.
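The fresh-or-stale decision can be sketched as follows in Python. This is a simplified model under stated assumptions. The function names freshness_lifetime and is_fresh are made up for the example, only the max-age, no-cache, and no-store directives are handled, and details that real proxies such as Squid take into account (s-maxage, the Age header, heuristic freshness) are left out.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def freshness_lifetime(headers, response_time):
    """Seconds a stored response may be reused, or 0 if it must not be."""
    cache_control = headers.get("Cache-Control", "")
    directives = [d.strip().lower() for d in cache_control.split(",") if d.strip()]
    if "no-store" in directives or "no-cache" in directives:
        return 0
    for directive in directives:
        if directive.startswith("max-age="):
            try:
                return max(0, int(directive.split("=", 1)[1]))
            except ValueError:
                return 0  # malformed max-age: treat as not cacheable
    expires = headers.get("Expires")
    if expires:
        try:
            expires_at = parsedate_to_datetime(expires)
        except (TypeError, ValueError):
            return 0  # unparseable date: treat as already expired
        if expires_at.tzinfo is None:
            expires_at = expires_at.replace(tzinfo=timezone.utc)
        return max(0, int((expires_at - response_time).total_seconds()))
    return 0


def is_fresh(headers, response_time: datetime, now: datetime) -> bool:
    """True while the stored copy is younger than its freshness lifetime."""
    age = (now - response_time).total_seconds()
    return age < freshness_lifetime(headers, response_time)
```

Both datetime arguments are assumed to be timezone-aware UTC values. The sketch checks max-age before Expires, reflecting the precedence Cache-Control has when both headers are present.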

Most high-volume Web projects use some form of Web cache proxy. For example, Wikimedia deploys 50 instances of the Squid web cache proxy in three locations worldwide to speed delivery of content to users.

Web Server Caching Helps – More Types of Caching to Come in Next Post

The conclusion is obvious: cache early and cache often. Caching is good – caching is your friend. If you just built a web application, then my recommendation to you is to turn on caching everywhere you can in the architecture.

While Web server caching and cache control are extremely powerful techniques for reducing load on a Web server, they’re not the only types of caching available. In our next article, we’ll examine how Web application developers can leverage Application Caching and Database Caching for further performance gains.
