HTTP/2: a change is gonna come, Part 1.

The Internet Engineering Task Force (IETF), the standards body responsible for Internet technology, recently accepted the HTTP/2 draft specification, clearing the final official hurdle prior to widespread industry adoption. HTTP is the protocol that web clients and web servers use to exchange messages, and HTTP/2 is the first major revision of the protocol since HTTP/1.1 was finalized in 1999. (The principal HTTP/1.1 changes provided for web page composition based on connection-oriented, dynamic HTML, which was still evolving at the time.)

The changes to the protocol for HTTP/2 are also directed squarely at the web page composition process. They are designed to speed up page load times, mainly through the use of (1) multiplexing, where the web client can make multiple requests in parallel over a single TCP connection, and (2) server push, where the web server can send content that it expects the web client will need in the near future, based on the current GET Request. This is a major change in the web’s application processing model, requiring adjustments at both the web server and the web client to support multiplexing and server push. In addition, many web sites currently built to take advantage of the capabilities in HTTP/1.x may require re-architecting to take better advantage of HTTP/2. Performance tools for web application developers will also need to play catch-up, providing visibility into how multiplexing and server push are operating in order to assist with these re-architecture projects.
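
To make the mechanics concrete, here is a minimal sketch of server push written in TypeScript against Node.js’s http2 module. The TLS key and certificate files, the pushed style sheet, and the page markup are illustrative assumptions on my part, not anything mandated by the HTTP/2 specification.

    // Minimal HTTP/2 server push sketch (Node.js http2 module, TypeScript).
    // key.pem / cert.pem are assumed to exist -- browsers only speak HTTP/2 over TLS.
    import { createSecureServer } from 'node:http2';
    import { readFileSync } from 'node:fs';

    const server = createSecureServer({
      key: readFileSync('key.pem'),
      cert: readFileSync('cert.pem'),
    });

    server.on('stream', (stream, headers) => {
      if (headers[':path'] === '/') {
        // Server push: proactively send a resource the client has not yet asked for.
        stream.pushStream({ ':path': '/app.css' }, (err, pushStream) => {
          if (err) return; // the client is allowed to refuse or disable push
          pushStream.respond({ ':status': 200, 'content-type': 'text/css' });
          pushStream.end('body { margin: 0; }');
        });
        // The page itself goes back on the original stream; both responses are
        // multiplexed over the same TCP connection.
        stream.respond({ ':status': 200, 'content-type': 'text/html' });
        stream.end('<html><head><link rel="stylesheet" href="/app.css"></head><body>Hello</body></html>');
      } else {
        stream.respond({ ':status': 404 });
        stream.end();
      }
    });

    server.listen(8443);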

In a blog post explaining what web developers can expect from HTTP/2, Mark Nottingham, chairperson of the IETF HTTP Working Group, cautions, “HTTP/2 isn’t magic Web performance pixie dust; you can’t drop it in and expect your page load times to decrease by 50%.” Nottingham goes on to say, “It’s more accurate to view the new protocol as removing some key impediments to performance; once browsers and servers learn how and when to take advantage of that, performance should start incrementally improving.” Even though HTTP/2 is brand new, its multiplexing capabilities are largely based on a grand-scale Google experiment known as SPDY, which pioneered that feature. In this article, I will try to describe what the HTTP/2 changes will and won’t accomplish, based on what we know today about SPDY performance. In addition, I will make some specific recommendations to help you get ready to take advantage of the new capabilities in HTTP/2.

While HTTP/2 is shaping up to be an important change to the technology that powers the Internet, the protocol revision does not address other serious performance concerns. For instance, it is not clear how other networking applications that rely on web services – think of all the network-enabled apps that run on your phone – will be able to benefit from either multiplexing or server push. For browser-based web apps, HTTP/2 does not change the requirement that the browser serialize the loading and execution of JavaScript files. Finally, while the HTTP/2 changes recognize that network latency is the fundamental source of most web performance problems, there is very little that can be done at the application protocol layer to overcome the physical reality of how fast signals can propagate across space and through wires.

One worrisome aspect of the HTTP/2 changes is how uncomfortably they fit atop the congestion control mechanisms implemented in TCP, the Internet’s Host-to-Host transport layer. These congestion control mechanisms were added to TCP more than twenty-five years ago, in the early days of the Internet, to deal with severe network congestion that made the Internet virtually unusable under load. Current TCP congestion policies like slow start, additive increase/multiplicative decrease, and the initial size of cwnd, the congestion window, cause problems for HTTP/2-oriented web applications that want to open a fat pipe[1] to the web client and push as much data through it as quickly as possible. TCP congestion control mechanisms are an important aspect of the transport-level protocol that manages the flow of messages through the underlying networking hardware, which is shared among consumers. For best results, web sites designed for HTTP/2 may find that some of these congestion control policies need adjusting to take better advantage of new features in the web application protocol.
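
To put some rough numbers on that interaction, the sketch below (TypeScript) is a back-of-the-envelope simulation of classic slow start: the congestion window starts small and doubles each round trip, so a page of a given size needs a certain number of round trips no matter how much bandwidth is available. The segment size, initial window values, and the 100 ms round-trip time are assumptions chosen for illustration; real TCP stacks are considerably more subtle.

    // Rough slow-start arithmetic: how many round trips does it take to deliver
    // `pageBytes` if the congestion window starts at `initialCwnd` segments and
    // doubles every round trip (no losses, no receive-window or pacing effects)?
    const MSS = 1460; // typical TCP segment payload in bytes (assumption)

    function roundTripsToDeliver(pageBytes: number, initialCwnd: number): number {
      let cwnd = initialCwnd; // congestion window, in segments
      let delivered = 0;
      let rtts = 0;
      while (delivered < pageBytes) {
        delivered += cwnd * MSS; // one window's worth of data per round trip
        cwnd *= 2;               // exponential growth during slow start
        rtts += 1;
      }
      return rtts;
    }

    // A 2 MB page (roughly the average page weight cited later in this post)
    // on a path with a 100 ms round-trip time:
    const pageBytes = 2 * 1024 * 1024;
    console.log(roundTripsToDeliver(pageBytes, 4));  // 9 round trips, ~0.9 s of latency alone
    console.log(roundTripsToDeliver(pageBytes, 10)); // 8 round trips with a larger initial cwnd

This is exactly the tension described above: an HTTP/2 server that wants to push a whole page down one fat pipe still has to ramp up through slow start, so the initial congestion window and the round-trip time dominate the early life of the connection.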

Despite its widespread use today for video streaming and other bulk file copy operations, TCP was simply never designed to be optimal for throughput-oriented networked applications connected over long distances. Instead, TCP, by requiring a positive Acknowledgement from the Receiver for every packet of data sent, is able to provide a reliable message delivery service atop the IP protocol, which deliberately does not guarantee delivery of the messages it is handed. The designers behind the HTTP/2 changes are hardly the first to struggle with this. In fact, the current TCP congestion control policies are designed to prevent certain types of data-hungry web applications from dominating, and potentially overloading, the networking infrastructure that all networking applications share. Call me an elite snob if you want, but I don’t relish a set of Internet architecture changes, of which HTTP/2 may only be the first wave, that optimizes the web for playback of high-definition cat videos on mobile phones at the expense of other networking applications. Yet that does appear to be the trend today in network communications technology.

Another interesting aspect of the HTTP/2 changes is the extent to which Real User Measurements (RUM) of web Page Load Time were used to validate and justify the design decisions that were made, another example of just how influential and resilient the YSlow scalability model has proved to be. This is in spite of the many limitations of RUM, which raise serious questions about how applicable the measurements are to web applications that make extensive use of JavaScript to manipulate the DOM and that add interactive capabilities using AJAX techniques to call web services asynchronously. In both sets of circumstances, the DOM manipulation is performed in JavaScript code that executes after the page’s Load event fires, which is when page load time is measured. RUM measurements that are gathered in the page’s Load event handler frequently do not capture this processing time. How valid the RUM measurements are in those environments is an open question among web performance experts.
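
To illustrate what a Load-event-based RUM beacon does and does not see, here is a browser-side sketch in TypeScript using the Navigation Timing API. The beacon URL, the post-load web service call, and the element id are hypothetical names invented for the example.

    // Classic RUM instrumentation: record "page load time" at the Load event.
    window.addEventListener('load', () => {
      // loadEventEnd is only filled in after the load handlers finish, so defer one tick.
      setTimeout(() => {
        const t = performance.timing;
        const pageLoadTime = t.loadEventEnd - t.navigationStart; // what RUM typically reports
        new Image().src = '/rum?plt=' + pageLoadTime;            // hypothetical beacon endpoint
      }, 0);

      // Anything the application does from here on -- calling web services over AJAX,
      // rebuilding the DOM, wiring up interactivity -- happens after the Load event
      // and is invisible to the pageLoadTime number beaconed above.
      fetch('/api/recommendations')                // hypothetical asynchronous service call
        .then((resp) => resp.json())
        .then((items) => {
          // 'recs' is a placeholder element id used only for this sketch.
          document.getElementById('recs')!.textContent = `${items.length} items loaded`;
        });
    });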

Characterizing web application workloads

Much of the discussion that takes place in public in presentations, blogs, and books on the subject of web application performance proceeds in blithe ignorance of the core measurement and modeling concepts used in the discipline of software performance engineering. One aspect that is strikingly absent from this discourse is a thorough consideration of the key characteristics of web application workloads that impact performance. Workload characterization is essential in any systematic approach to web application performance.

The performance characteristics of web applications span an enormous spectrum based on the size and complexity of the web pages that are generated. Some of those characteristics will make a big difference in whether HTTP/2 helps or hinders performance. In particular, three characteristics of web applications appear to have the greatest impact on performance under HTTP/2 (a quick way to measure all three for your own pages is sketched after the list below):

  • the number of separate domains that GET Requests are directed to in order to construct the page,
  • the number of HTTP objects (files, essentially) that need to be fetched from each domain, and
  • the distribution of the sizes of those objects.
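
If you want to see where your own pages fall on these three dimensions, the browser’s Resource Timing API exposes enough information for a rough tally. The TypeScript sketch below groups the current page’s resources by domain and sums transfer sizes; note that transferSize requires Resource Timing Level 2 support and, for cross-origin resources, a Timing-Allow-Origin response header, so byte counts for third-party domains are often reported as zero.

    // Rough workload characterization for the current page: objects and bytes per domain.
    interface DomainStats { requests: number; bytes: number; }

    const byDomain = new Map<string, DomainStats>();
    const resources = performance.getEntriesByType('resource') as PerformanceResourceTiming[];

    for (const entry of resources) {
      const domain = new URL(entry.name).hostname;
      const stats = byDomain.get(domain) ?? { requests: 0, bytes: 0 };
      stats.requests += 1;
      stats.bytes += entry.transferSize; // reported as 0 for opaque cross-origin resources
      byDomain.set(domain, stats);
    }

    for (const [domain, stats] of byDomain) {
      console.log(`${domain}: ${stats.requests} objects, ${stats.bytes} bytes transferred`);
    }

A page that reports two or three domains and a few dozen modestly sized objects is a very different HTTP/2 candidate than one that reports fifty domains and hundreds of objects.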

With regard to the number of domains accessed, the top 500 sites range from web applications that pull content from just one or two domains to those that pull together content from more than fifty. This behavior spans a range from monolithic web publishing, where the content is consolidated on a small number of web servers, to a more federated model, where many more distinct web sites need to be accessed. Web sites that are consolidated in one or two domains perform better under HTTP/2 than those that rely on the federated model and were architected that way to perform well under HTTP/1.x. In addition, within each domain, the number of HTTP objects requested and their size are also pertinent to performance under HTTP/2.

The number of HTTP objects that need to be fetched and their size are, of course, the two key components of the scalability model used in performance tools like YSlow that offer recommendations for building web pages that load faster. However, YSlow and similar tools currently ignore the sizable impact that multiple domains can have on web page load time. Overall, the HTTP/2 changes highlight the need to extend the deliberately simple model of web page load time that YSlow and its progeny have propagated.

SPDY

After extensive testing at Google and elsewhere, some clarity around SPDY performance has begun to emerge; we are starting to understand the characteristics of web applications that work well under SPDY and those on which SPDY has little or no positive impact. At a Tech Talk at Google back in 2011, the developers reported that implementing SPDY on Google’s web servers resulted in a 15% improvement in page load times across all of the company’s web properties. The SPDY developers did acknowledge that the experimental protocol did not help much to speed up Google Search, which was already highly optimized. On the other hand, SPDY did improve performance significantly at YouTube, a notoriously bandwidth-thirsty web application. Overall, Google’s testing showed that SPDY used fewer TCP connections, transferred fewer bytes on uploads, and reduced the overall number of packets that needed to be transmitted by about 20%.

Google initially rolled out SPDY to great fanfare, publicizing the technology at its own events and at industry conferences like Velocity. At these events and on its web site, Google touted page load time improvements on the order of 50% or more in some cases, but did not fully explain what kinds of web site configuration changes were necessary to achieve those impressive results. Since then, there have also been several contrary reports, most notably from Guy Podjarny, a CTO at Akamai, who blogged back in 2012 that the touted improvements were “not as SPDY as you thought.” Podjarny reported, “SPDY, on average, is only about 4.5% faster than plain HTTPS, and is in fact about 3.4% slower than unencrypted HTTP” for a large number of real world sites that he tested. After extensive testing with SPDY, Podjarny observed that SPDY did improve page load times for web pages with either of the following two characteristics:

  • monolithic sites that consolidated content on a small number of domains
  • pages that did not block significantly while retrieving and processing JavaScript files and .css style sheets

On a positive note, Podjarney’s testing did confirm that multiplexing the processing of Responses to GET Requests at the web server can boost performance when a complex web page is composed from many Requests that are mostly directed to the same domain, allowing HTTP/2 to reuse a single TCP connection for transmitting all the Requests and their associated Response messages.
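
As a companion to the server-side sketch earlier in this post, this is roughly what that connection reuse looks like from the client side using Node.js’s http2 client API; the origin and resource paths are placeholders.

    // All of these requests travel as separate, concurrent streams over ONE TCP connection.
    import { connect } from 'node:http2';

    const session = connect('https://www.example.com');                 // placeholder origin
    const paths = ['/styles/site.css', '/js/app.js', '/img/logo.png'];  // placeholder resources

    let outstanding = paths.length;
    for (const path of paths) {
      const req = session.request({ ':path': path });
      let bytes = 0;
      req.on('data', (chunk: Buffer) => { bytes += chunk.length; });
      req.on('end', () => {
        console.log(`${path}: ${bytes} bytes received`);
        if (--outstanding === 0) session.close(); // tear down the shared connection
      });
      req.end();
    }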

As I will try to explain in further detail below, the HTTP/2 changes reflect the general trend toward building ever larger and more complex web pages, and they benefit the largest web properties, where clustering huge numbers of similarly-configured web servers provides the ability to process a high volume of HTTP Requests in parallel. As for web pages growing more complex, the HTTP Archive shows that the average web page increased in size from 700 KB in 2011 to 2 MB in 2015, with the average page currently composed of almost 100 HTTP objects. Internet access over broadband connections is fueling this trend, even with network latency acting as the principal constraint on web page load time.

A large web property (see Alexa for a list of top sites) maintains an enormous infrastructure for processing huge volumes of web traffic, literally capable of processing millions of HTTP GET Requests per second. The web site infrastructure may consist of tens of thousands (or more) of individual web servers, augmented with many additional web servers distributed around the globe in either proprietary Edge networks or comparable facilities leased from Content Delivery Network (CDN) vendors such as Akamai. The ability to harness this enormous amount of parallel processing capability to respond to web Requests faster, however, remains limited by the latency of the network, which is physically constrained by signal propagation delays. Another front-end resource of these infrastructures that is constrained is the supply of TCP connections: because the TCP Port number is only 16 bits wide, the number of concurrent connections between any pair of IP addresses is capped at roughly 65,000. That limitation in TCP cannot readily be changed, but the HTTP/2 modifications, by multiplexing many Requests over a single connection, do ease this constraint.

SPDY also included server push and prioritization, but far less is known about the impact of those specific new features today. The final draft of the HTTP/2 protocol specification is available at http://http2.github.io/http2-spec/.

In the next post, I will drill deeper into the major features in the HTTP/2 revision.
