Presentations for the upcoming CMG conference are available on

I am presenting two topics at the Computer Measurement Group (CMG) annual conference this week, and I have just posted the slide decks I will be using on

The latest slide deck for Monitoring Web Application Response Times in Windows is available here.

The slide deck for HTTP/2: Recent protocol changes and their impact on web application performance is available here.

If you are a regular reader of this blog, most of this material will look familiar. Enjoy!


HTTP/2: a change is gonna come, Part 1.

The Internet Engineering Task Force (IETF), the standards body responsible for Internet technology, recently accepted the HTTP/2 draft specification, the final official hurdle to be cleared prior to widespread industry adoption. HTTP is the protocol used between web clients and web servers to exchange messages, and HTTP/2 is the first major revision of the HTTP protocol adopted since 1999, when HTTP/1.1 was finalized. (The principal HTTP/1.1 protocol changes provided for web page composition based on connection-oriented, dynamic HTML, which was still evolving at the time.)

The changes to the protocol for HTTP/2 are also directed squarely at the web page composition process. They are designed to speed up page load times, mainly through the use of (1) multiplexing where the web client can make multiple requests in parallel over a single TCP connection, and (2) server push where the web server can send content to the web client that it expects the web client will need in the near future, based on the current GET Request. This is a major change in the web’s application processing model, requiring adjustments at both the web server and the web client to support multiplexing and server push. In addition, many web sites, currently built to take advantage of the capabilities in HTTP/1.x, may require re-architecting to take better advantage of HTTP/2. Performance tools for web application developers will also need to play catch up to provide visibility into how multiplexing and server push are operating in order to assist with these re-architecture projects.
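To make the multiplexing idea concrete, here is a toy Python sketch, not a real HTTP/2 implementation: the `frames` and `multiplex` functions, the stream IDs, and the frame size are all invented for illustration. It interleaves chunks of several responses onto a single simulated connection, the way HTTP/2 interleaves DATA frames from concurrent streams rather than dedicating one TCP connection per response.

```python
# Toy illustration of HTTP/2-style multiplexing: several responses are
# broken into frames and interleaved on one connection, instead of each
# response monopolizing its own TCP connection (HTTP/1.x behavior).

def frames(stream_id, payload, frame_size):
    """Split one response payload into (stream_id, chunk) frames."""
    return [(stream_id, payload[i:i + frame_size])
            for i in range(0, len(payload), frame_size)]

def multiplex(responses, frame_size=4):
    """Round-robin interleave the frames of all responses onto one 'wire'."""
    queues = [frames(sid, body, frame_size) for sid, body in responses]
    wire = []
    while any(queues):
        for q in queues:
            if q:
                wire.append(q.pop(0))
    return wire

# Three hypothetical responses (HTML, a style sheet, an image) on
# streams 1, 3, and 5, sharing one connection.
responses = [(1, "<html>...</html>"), (3, "body{}"), (5, "GIF89a..")]
wire = multiplex(responses)
print(wire[:3])   # [(1, '<htm'), (3, 'body'), (5, 'GIF8')]
```

The receiving side reassembles each stream by concatenating the chunks that carry its stream ID, so no response has to wait for another to finish before its first bytes arrive.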

In a blog post explaining what web developers can expect from HTTP/2, Mark Nottingham, chairperson of the IETF HTTP Working Group, cautions, “HTTP/2 isn’t magic Web performance pixie dust; you can’t drop it in and expect your page load times to decrease by 50%.” Nottingham goes on to say, “It’s more accurate to view the new protocol as removing some key impediments to performance; once browsers and servers learn how and when to take advantage of that, performance should start incrementally improving.” Even though HTTP/2 is brand new, its multiplexing capabilities are largely based on a grand scale Google experiment known as SPDY that pioneered that feature. In this article, I will try to describe what the HTTP/2 changes will and won’t accomplish, based on what we know today about SPDY performance. In addition, I will make some specific recommendations to help you get ready and take advantage of the new capabilities in HTTP/2.

While HTTP/2 shapes up to be an important change to the technology that powers the Internet, the protocol revision does not address other serious performance concerns. For instance, it is not clear how other networking applications that rely on web services – think of all the apps that run on your phone that are network enabled – will be able to benefit from either multiplexing or server push. For browser-based web apps, HTTP/2 does not change the requirement for the browser to serialize the loading and execution of JavaScript files. Finally, while the HTTP/2 changes recognize that network latency is the fundamental source of most web performance problems, there is very little that can be done at the application protocol layer to overcome the physical reality of how fast electrical signals can be propagated through time, space, and wires.

One worrisome aspect of the HTTP/2 changes is how uncomfortably they fit atop the congestion control mechanisms that are implemented in TCP, the Internet’s Host-to-Host transport layer. These congestion control mechanisms were added to TCP about twenty years ago, in the early days of the Internet, to deal with severe network congestion that made the Internet virtually unusable under load. Current TCP congestion policies like slow start, additive increase/multiplicative decrease, and the initial size of cwnd, the congestion window, cause problems for HTTP/2-oriented web applications that want to open a fat pipe[1] to the web client and push as much data through it as quickly as possible. TCP congestion control mechanisms are an important aspect of the transport-level protocol that manages the flow of messages through the underlying networking hardware, which is shared among consumers. For best results, web sites designed for HTTP/2 may find that some of these congestion control policies need adjusting to take better advantage of new features in the web application protocol.
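A rough sketch of why these policies matter: the Python snippet below models, in grossly simplified form, how the congestion window grows from a small initial value, doubling every round trip during slow start and then increasing by one segment per round trip once it crosses the slow-start threshold. The initial window and threshold values here are illustrative defaults, not taken from any particular TCP stack.

```python
# Grossly simplified model of TCP congestion window (cwnd) growth with
# no packet loss: exponential growth during slow start up to ssthresh,
# then additive increase (one segment per round trip) afterwards.

def cwnd_by_rtt(initial_cwnd=4, ssthresh=32, rtts=10):
    """Return cwnd (in segments) at the start of each round trip."""
    window, history = initial_cwnd, []
    for _ in range(rtts):
        history.append(window)
        if window < ssthresh:
            window = min(window * 2, ssthresh)   # slow start
        else:
            window += 1                          # additive increase
    return history

growth = cwnd_by_rtt()
print(growth)   # [4, 8, 16, 32, 33, 34, 35, 36, 37, 38]
```

The point for HTTP/2 is that a server eager to push a large response immediately is still throttled by this ramp: the first few round trips of a brand-new connection can deliver only a handful of segments, no matter how fat the pipe is.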

Despite its widespread use today for video streaming and other bulk file copy operations, TCP was simply never designed to be optimal for throughput-oriented networked applications that are connected over long distances. Instead, TCP, by requiring a positive Acknowledgement from the Receiver for every packet of data sent, is able to provide a reliable message delivery service atop the IP protocol, which deliberately does not guarantee delivery of the messages it is handed. The designers behind the HTTP/2 changes are hardly the first set of people to struggle with this. In fact, the current TCP congestion control policies are designed to prevent certain types of data-hungry web applications from dominating and potentially overloading the networking infrastructure that all networking applications share. Call me an elite snob if you want, but I don’t relish a set of Internet architecture changes, of which HTTP/2 may only be the first wave, that optimizes the web for playback of high-definition cat videos on mobile phones at the expense of other networking applications. Yet that does appear to be the trend today in network communications technology.

Another interesting aspect of the HTTP/2 changes is the extent to which Real User Measurements (RUM) of web Page Load Time were used to validate and justify the design decisions that were made, another example of just how influential and resilient the YSlow scalability model has proved. This is in spite of the many limitations of the RUM measurements, which raise serious questions about how applicable they are to web applications that make extensive use of JavaScript manipulation of the DOM and add interactive capabilities using AJAX techniques to call web services asynchronously. In both sets of circumstances, this DOM manipulation is performed in JavaScript code that executes after the page's Load event fires, which is when page load time is measured. RUM measurements that are gathered in the page's Load event handler frequently do not capture this processing time. How valid the RUM measurements are in those environments is an open question among web performance experts.

Characterizing web application workloads

Much of the discussion that takes place in public in presentations, blogs, and books on the subject of web application performance proceeds in blithe ignorance of the core measurement and modeling concepts used in the discipline of software performance engineering. One aspect that is strikingly absent from this discourse is a thorough consideration of the key characteristics of web application workloads that impact performance. Workload characterization is essential in any systematic approach to web application performance.

The performance characteristics of web applications span an enormous spectrum based on the size and complexity of the web pages that are generated. Some of those performance characteristics will make a big difference in whether or not the HTTP/2 protocol will help or hinder their performance. In particular, there appear to be three characteristics of web applications that will have the greatest impact on performance under HTTP/2:

  • the number of separate domains that GET Requests are directed to in order to construct the page,
  • the number of HTTP objects (files, essentially) that need to be fetched from each domain,
  • and the distribution of the sizes of those objects.

With regard to the number of domains that are accessed, looking at the top 500 sites, web applications range from pulling content from one or two domains to having to pull together content from more than fifty. This behavior spans a range from monolithic web publishing, where the content is consolidated on a small number of web servers, to a more federated model, where many more distinct web sites need to be accessed. Web sites that are consolidated in one or two domains perform better under HTTP/2 than those that rely on the federated model and were architected that way to perform well under HTTP/1.x. In addition, within each domain, the number of HTTP objects requested and their size is also pertinent to performance under HTTP/2.

The number of HTTP objects that need to be fetched and their size are, of course, the two key components of the scalability model used in performance tools like YSlow that offer recommendations for building web pages that load faster. However, YSlow and similar tools currently ignore the sizable impact that multiple domains can have on web page load time. Overall, the HTTP/2 changes highlight the need to extend the deliberately simple model of web page load time that YSlow and its progeny have propagated.


After extensive testing at Google and elsewhere, some clarity around SPDY performance has begun to emerge; we are starting to understand the characteristics of web applications that work well under SPDY and those that SPDY has little or no positive impact on. At a Tech Talk at Google back in 2011, the developers reported that implementing SPDY on Google’s web servers resulted in a 15% improvement in page load times across all of the company’s web properties. The SPDY developers did acknowledge that the experimental protocol did not help much to speed up Google Search, which was already highly optimized. On the other hand, SPDY did improve performance significantly at YouTube, a notoriously bandwidth-thirsty web application. Overall, Google’s testing showed SPDY required fewer TCP connections, fewer bytes transferred on uploads, and reduced the overall number of packets that needed to be transmitted by about 20%.

Google initially rolled out SPDY to great fanfare, publicizing the technology at its own events and at industry conferences like Velocity. At these events and on its web site, Google touted page load time improvements on the order of 50% or more in some cases, but did not fully explain what kinds of web site configuration changes were necessary to achieve those impressive results. Since then, there have also been several contrary reports, most notably from Guy Podjarny, a CTO at Akamai, who blogged back in 2012 that the touted improvements were “not as SPDY as you thought.” Podjarny reported, “SPDY, on average, is only about 4.5% faster than plain HTTPS, and is in fact about 3.4% slower than unencrypted HTTP” for a large number of real-world sites that he tested. After extensive testing with SPDY, Podjarny observed that SPDY did improve page load times for web pages with either of the following two characteristics:

  • monolithic sites that consolidated content on a small number of domains
  • pages that did not block significantly during resolution of JavaScript files and .css style sheets

On a positive note, Podjarny’s testing did confirm that multiplexing the processing of Responses to GET Requests at the web server can boost performance when a complex web page is composed from many Requests that are mostly directed to the same domain, allowing HTTP/2 to reuse a single TCP connection for transmitting all the Requests and their associated Response messages.

As I will try to explain in further detail below, the HTTP/2 changes reflect the general trend toward building ever larger and more complex web pages and benefit the largest web properties where clustering huge numbers of similarly-configured web servers provides the ability to process a high volume of HTTP Requests in parallel. As for web pages growing more complex, the HTTP Archive, for example, shows the average web page increased in size from 700 KB in 2011 to 2 MB in 2015, with the average page currently composed of almost 100 HTTP objects. Internet access over broadband connections is fueling this trend, even with network latency acting as the principal constraint on web page load time.

A large web property (see Alexa for a list of top sites) maintains an enormous infrastructure for processing huge volumes of web traffic, capable of processing millions of HTTP GET Requests per second. The web site infrastructure may consist of tens of thousands (or more) individual web servers, augmented with many additional web servers distributed around the globe in either proprietary Edge networks or comparable facilities leased from Content Delivery Network (CDN) vendors such as Akamai. The ability to harness this enormous amount of parallel processing capability to respond to web Requests faster, however, remains limited by the latency of the network, which is physically constrained by signal propagation delays. A front-end resource of these infrastructures that is also constrained is the availability of TCP connections, which is limited by the 16-bit width of the TCP Port number. That limitation in TCP cannot be readily changed, but the HTTP/2 modifications do address this constraint.
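To put the 16-bit Port limitation in perspective, here is a back-of-the-envelope Python calculation. The connection rate and TIME_WAIT duration below are invented example figures, not measurements from any real site; the point is only the arithmetic of the port space.

```python
# A TCP connection is identified by the 4-tuple (src addr, src port,
# dst addr, dst port). With both addresses and the server port fixed,
# only the 16-bit source port varies, capping the number of distinct
# connections between one client address and one server address.
PORT_BITS = 16
max_ports = 2 ** PORT_BITS          # 65536 possible port numbers

# Hypothetical example: a front end opening 2000 new connections/sec
# to one backend address, each port lingering ~120 s in TIME_WAIT
# after the connection closes before it can be reused.
new_conns_per_sec = 2000
time_wait_secs = 120
ports_tied_up = new_conns_per_sec * time_wait_secs

print(max_ports)      # 65536
print(ports_tied_up)  # 240000, several times the entire port space
```

Under assumptions like these, the port space is exhausted long before the hardware runs out of capacity, which is one reason reusing a single multiplexed HTTP/2 connection for many Requests is attractive at this scale.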

SPDY also included server push and prioritization, but far less is known about the impact of those specific new features today. The final draft of the HTTP/2 protocol specification is available at

In the next post, I will drill deeper into the major features in the HTTP/2 revision.


Processing boomerang beacons in ASP.NET

A prerequisite to enabling boomerang web beacons for your web pages, as discussed in the previous post, is providing a web server component that expects incoming GET Requests for the boomerang.gif and understands how to respond to those Requests. Let’s see how to handle the boomerang.gif GET Requests in IIS and ASP.NET.

An excellent way to add support for boomerang web beacons is to provide a class that supports the IHttpModule interface, which works so long as you are running IIS 7.0 or later in integrated mode. In integrated mode, which is also known as integrated pipeline mode, IIS raises a pre-defined sequence of events for each and every web request. The IIS feature was called the integrated pipeline mode because it extended the original ASP.NET event pipeline, introduced in ASP.NET version 1, to apply to the processing of all Http Requests, beginning with IIS version 7. The IIS integrated pipeline provides “a single place to implement, configure, monitor and support server features such as single module and handler mapping configuration, single custom errors configuration, single url authorization configuration,” in the words of Mike Volodarsky, who was a Program Manager for the Microsoft Web development team at the time. (In addition, Mike published an excellent technical article on the subject in MSDN Magazine subsequent to the product’s release.)

So, to process boomerang beacons, you could implement a BoomerangBeacon class that supports the IHttpModule interface and adds an event handler for the IIS pipeline BeginRequest event:

public class BoomerangBeacon : IHttpModule
{
    public BoomerangBeacon()
    {
    }

    public void Dispose()
    {
    }

    public void Init(HttpApplication application)
    {
        application.BeginRequest += new EventHandler(this.Application_BeginRequest);
    }
}

IIS raises the BeginRequest event as soon as a GET Request is received and queued for processing in the pipeline. Next, provide an Application_BeginRequest method to intercept the boomerang beacon GET Request as soon as the event fires, something along the lines of the following:

private void Application_BeginRequest(Object source, EventArgs e)
{
    // Create HttpApplication and HttpContext objects to access
    // request and response properties.
    HttpApplication application = (HttpApplication)source;
    HttpContext context = application.Context;

    string beaconUrl = GetBeaconUrl();  // Helper routine to set beacon url
    string filePath = context.Request.FilePath;
    string fileExtension = VirtualPathUtility.GetExtension(filePath);

    if (fileExtension.Equals(".gif"))
    {
        if (filePath.Equals("/" + beaconUrl) || filePath.Contains(beaconUrl))
        {
            ...   // Process the beacon parms
        }
    }
}


This example calls a Helper routine, GetBeaconUrl(), to retrieve the name of the beacon Url from a web.config application setting. After processing the parms attached to the boomerang beacon GET Request, the BeginRequest event handler calls the End() method on the ASP.NET Response object, which flushes the web beacon GET Request from the pipeline and terminates any further processing of the boomerang beacon. Calling Response.End() returns an empty HTTP Response message with an HTTP status code of 200, the “OK” return code, to the web browser that issued the Request.

The key, of course, is processing the web beacon Query string parms that contain the measurement data from the web client. ASP.NET automatically generates a QueryString property that parses the GET Request query string and returns a NameValueCollection:

NameValueCollection parms = context.Request.QueryString;

making it a relatively simple matter to loop through the parms collection and pull out the measurements, exactly the way Yahoo’s boomerang howto.js script I referenced earlier does. Your ASP.NET code that processes the web beacon can also pull the HttpMethod, Browser.Type and Browser.Platform directly from the Request object, as illustrated here:

HttpRequest request = context.Request;
string Browser = request.Browser.Type;
string Platform = request.Browser.Platform;
string HttpMethod = request.HttpMethod;

In addition, your IHttpModule class can also process the IIS ServerVariables collection to pull the identifying web client IP address and TCP Port assignment. These IIS web server variables are returned as strings, but it is not too much trouble to transform them into binary fields, although you do need to be able to handle both 4-byte IPv4 and 16-byte IPv6 addresses.
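The string-to-binary conversion just described can be sketched in a few lines. My module is written in C#, but here is the same logic as a minimal Python illustration; the `ip_to_binary` helper is my own invented name, and it simply tries an IPv4 parse first and falls back to IPv6:

```python
import socket

def ip_to_binary(addr):
    """Convert a dotted-quad IPv4 or colon-hex IPv6 address string
    into its binary form: 4 bytes for IPv4, 16 bytes for IPv6."""
    try:
        return socket.inet_pton(socket.AF_INET, addr)    # IPv4: 4 bytes
    except OSError:
        return socket.inet_pton(socket.AF_INET6, addr)   # IPv6: 16 bytes

print(len(ip_to_binary("192.168.1.10")))   # 4
print(len(ip_to_binary("2001:db8::1")))    # 16
```

Storing the address in this fixed-width binary form, rather than as the raw string IIS hands you, makes it much easier to index and compare client addresses in whatever repository holds the beacon measurements.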

In my version of the IHttpModule that processes the boomerang beacons, I decided to generate an ETW event from the NavTiming API measurements with an event payload that contains the web client Page Load time measurements. This approach allows me to integrate RUM measurements from the web beacons with other web server-side measurements that can be calculated from other ETW events. In the next section, I will discuss what web server-side measurements can be gathered using ETW, and how I integrated boomerang beacon RUM measurements into IIS web server performance management reporting using those trace events.