Page Load Time and the YSlow scalability model of web application performance

This is the first of a new series of blog posts where I intend to drill into an example of a scalability model that has been particularly influential. (I discussed the role of scalability models in performance engineering in an earlier blog post.) This is the model of web application performance encapsulated in the YSlow tool, originally developed at Yahoo by Steve Souders. The YSlow approach focuses on the time it takes for a web page to load and become ready to respond to user input, a measurement known as Page Load Time. Based on measurement data, the YSlow program makes specific recommendations to help minimize Page Load Time for a given HTML web page.

The conceptual model of web application performance underpinning the YSlow tool has influenced the way developers think about web application performance. YSlow and the similar tools it directly inspired, all of which focus on Page Load Time, are among the ones used most frequently in web application performance tuning. Souders is also the author of an influential book on the subject called “High Performance Web Sites,” published by O’Reilly in 2007. Souders’ book is frequently the first place web application developers turn for guidance when they face a web application that is not performing well.

Page Load Time

Conceptually, Page Load Time is the delay between the time that an HTTP GET Request for a new web page is issued and the time that the browser is able to complete the task of rendering the page requested. It includes the network delay involved in sending the request, the processing time at the web server to generate an appropriate Response message, and the network transmission delays associated with sending the Response message. On top of that, the web browser client requires some amount of additional processing time to compose the document requested in the GET Request from the Response message, ultimately rendering it correctly in a browser window, as depicted in Figure 1.

Figure 1. Page Load Time is the delay between the time that an HTTP GET Request for a new web page is issued and the time that the browser is able to complete the task of rendering the page requested from the Response message received from the designated web server.

 

Note that, as used above, document is intended as a technical term that refers to the Document Object Model, or DOM: the structured representation of the page that the web browser builds from the HTML in the Response message and uses to render the requested page correctly. Collectively, IP, TCP and HTTP are the primary Internet networking protocols. Initially, the Internet protocols were designed to handle simple GET Requests for static documents. The HTML standard defined a page composition process where the web browser assembles all the elements needed to build the document for display on the screen, often requiring additional GET Requests to retrieve elements referenced in the original Response message. For instance, the initial Response message often contains references to additional resources (image files and style sheets, for example) that the web client must then request. As these Response messages are received, the browser integrates the added elements into the document being composed for display. Some of these additional Response messages may themselves contain references to further resources. The composition process proceeds recursively until all the elements referenced are assembled.
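The composition process described above can be sketched as a simple traversal. The following is a minimal illustration, not a model of how any particular browser schedules its requests: the resource names and the REFERENCES table are hypothetical, and real browsers fetch in parallel, consult caches, and discover references incrementally while parsing.

```python
from collections import deque

# Hypothetical page: each resource maps to the resources that its
# Response message references. None of these names come from a real site.
REFERENCES = {
    "/index.html": ["/site.css", "/app.js", "/logo.png"],
    "/site.css": ["/background.png", "/logo.png"],
    "/app.js": [],
    "/logo.png": [],
    "/background.png": [],
}


def compose_page(root, references):
    """Return the resources to GET, in the order they are first discovered."""
    requested = [root]
    seen = {root}
    pending = deque([root])
    while pending:
        resource = pending.popleft()
        # Each Response may reference further resources that need their own GET.
        for ref in references.get(resource, []):
            if ref not in seen:  # request each resource only once
                seen.add(ref)
                requested.append(ref)
                pending.append(ref)
    return requested


get_requests = compose_page("/index.html", REFERENCES)
print(len(get_requests), "GET Requests:", get_requests)
```

In this made-up example the style sheet references an image that the original Response never mentions, so the page cannot be fully composed until a second level of requests completes.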

Page Load Time measures the time from the original GET Request through all the subsequent GET Requests required to compose the full page. It also encompasses the client-side processing of both the mark-up language and the style sheets by the browser’s layout engine to format the display in its final form, as illustrated in Figure 2.

Figure 2. Multiple GET Requests are frequently required to assemble all the elements of a page that are referenced in Response messages.

This conceptual model of web application response time, as depicted in Figure 2, suggests the need to minimize both the number and duration of web server requests, each of which requires messages to traverse the network between the web client and the web server. Minimizing this back and forth across the network will minimize Page Load Time:

Page Load Time ≈ RoundTrips * Round Trip Time

The original YSlow tool does not attempt to measure Page Load Time directly. Instead, it attempts to assess the network impact of constructing a given web page by examining each element of the DOM after the page has been assembled and fully constructed. The YSlow tuning recommendations are based on a static analysis of the number and sizes of the objects that it finds in the fully rendered page, since the number of network round trips required can be estimated reasonably well from the sizes of the objects that are transmitted. For each Response object, then:

RoundTrips = (httpObjectSize / packet size) + 1

which is then summed over all the elements of the DOM that required GET Requests (including the original Response message).
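As a rough illustration of the two formulas above, here is a minimal sketch. It assumes integer division in the RoundTrips formula and a packet payload of 1460 bytes (a common TCP segment size on Ethernet); the object sizes and the 50 ms round trip time are invented for the example and are not YSlow output.

```python
# Assumed payload bytes per packet; not a value taken from YSlow itself.
PACKET_SIZE = 1460


def round_trips(http_object_size, packet_size=PACKET_SIZE):
    """RoundTrips = (httpObjectSize / packet size) + 1, with integer division."""
    return http_object_size // packet_size + 1


def estimate_page_load_time_ms(object_sizes, round_trip_time_ms,
                               packet_size=PACKET_SIZE):
    """Page Load Time ~ RoundTrips * Round Trip Time, summed over all objects."""
    total = sum(round_trips(size, packet_size) for size in object_sizes)
    return total * round_trip_time_ms


# Hypothetical page: the original Response plus two referenced objects (bytes).
page_objects = {"index.html": 14_600, "site.css": 2_920, "logo.png": 1_000}

total_round_trips = sum(round_trips(size) for size in page_objects.values())
estimated_ms = estimate_page_load_time_ms(page_objects.values(),
                                          round_trip_time_ms=50)
print(total_round_trips, "round trips, roughly", estimated_ms, "ms")
```

The estimate scales linearly with both factors, which is why the model points toward fewer and smaller objects as well as a shorter round trip time.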

Later, after Souders left Yahoo for Google, he was responsible for building a tool similar to YSlow that was incorporated into the Chrome developer tools that ship with Google’s web browser. The Google tool that corresponds to YSlow is called PageSpeed. (Meanwhile, the original YSlow tool continues to be available as a Chrome or IE plug-in.)

In the terminology made famous by Thomas S. Kuhn in his book “The Structure of Scientific Revolutions,” the YSlow model of web application performance generated a paradigm shift in the emerging field of web application performance, a paradigm that continues to hold sway today. The suggestion that developers focus on page load time gave prominence to a measurement of service time as perceived by the customer. Not only is page load time a measure of service time, there is ample evidence that it is highly correlated with customer satisfaction. (See, for example, Joshua Bixby’s 2012 slide presentation.)

While the YSlow paradigm is genuinely useful, it can also be abused. For instance, the YSlow rules and recommendations for improving the page load time of your application often need to be tempered by experience and judgment. (This is hardly unusual in rule-based approaches to computer performance, something I discussed in an earlier series of blog posts.) Suggestions to minify all your Javascript files or consolidate multiple scripts into one script file are often contraindicated by other important software engineering considerations. For example, it may be important to factor Javascript code into separate files when the scripts originate in different development teams and have very different revision and release cycles. Moreover, remember that YSlow does not measure page load time directly. Instead, it reasons about page load time based on the elements of the page that it discovers in the DOM and its knowledge of how the DOM is assembled.

Subsequently, other tools, including free tools like VRTA and Fiddler and commercial application performance monitoring tools like Opnet and DynaTrace, tried a more direct approach to measuring Page Load Time. Many of these tools analyze the network traffic generated by HTTP requests. These network capture tools attempt to estimate Page Load Time as the interval between the first network packet the client transmits for the initial HTTP GET Request and the last packet the web server sends in the last Response message associated with that initial GET. Network-oriented tools like Fiddler are easy for web developers to use and Fiddler, in particular, has many additional facilities, including ones that help in debugging web applications.
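The network-capture estimate can be sketched as follows. The Packet record and the sample capture are hypothetical and do not correspond to the export format of Fiddler or any other tool; the sketch only shows the first-request-packet to last-response-packet calculation.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    timestamp_ms: int  # capture time in milliseconds
    sender: str        # "client" or "server"


def page_load_time_from_capture(packets):
    """Interval from the first client packet to the last server packet."""
    first_request = min(p.timestamp_ms for p in packets if p.sender == "client")
    last_response = max(p.timestamp_ms for p in packets if p.sender == "server")
    return last_response - first_request


# Hypothetical capture for one page: an initial GET, its Response,
# a follow-up GET for a referenced object, and the remaining Response packets.
capture = [
    Packet(1000, "client"),
    Packet(1080, "server"),
    Packet(1090, "client"),
    Packet(1150, "server"),
    Packet(1320, "server"),
    Packet(1325, "client"),  # trailing acknowledgement, ignored by the estimate
]

estimate_ms = page_load_time_from_capture(capture)
print("Estimated Page Load Time:", estimate_ms, "ms")
```

Because the calculation sees only packets on the wire, any time the browser spends after the last Response packet arrives is invisible to it.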

Over time, the Internet protocols began developing the capabilities associated with serving up content generated on the fly by applications. This entailed supporting the generation of dynamic HTML, where Response messages are constructed on demand by web applications, customized based on factors such as the identity of the person who issued the Request, where in the world the requestor was located when the Request was initiated, etc. With dynamic HTML requests, the still relatively simple process illustrated in Figure 2 potentially can grow considerably more complex. One of these developments was the use of Javascript code running inside the web browser to manipulate the DOM directly on the client, without the need to ever contact the web server, once the script itself was downloaded. Note that web application performance tools that rely solely on the analysis of the network traffic associated with HTTP requests cannot measure the amount of time spent inside the web browser executing Javascript code that constructs a web page dynamically.

However, developer-oriented timeline tools running inside the web browser client can gain access to the complete sequence of events associated with composing and rendering the DOM, including Javascript execution time. The developer-oriented performance tools in Chrome, which include the PageSpeed tool, then influenced the design and development of similar developer-oriented performance tools that Microsoft started to build into the Internet Explorer web browser. Running inside the web browser, the Chrome and IE performance tools have direct access to all the timing data associated with Page Load Time, including Javascript execution time.

A recent and very promising development in this area is a W3C spec that standardizes the web client timeline events associated with page load time and also specifies an interface for accessing the performance timeline data from Javascript. Chrome, Internet Explorer, and WebKit have all adopted this standard, paving the way for tool developers for the first time to gain access to complete and accurate page load time measurements in a consistent fashion across multiple web clients.
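To give a sense of what such standardized timeline data makes possible, here is a small sketch that derives intervals from a set of timestamps. It assumes the attribute names defined in the W3C Navigation Timing specification (navigationStart, responseStart, responseEnd, domContentLoadedEventEnd, loadEventEnd), which in a browser would be read from Javascript. The spec is not named in the text above, so treat that mapping as my assumption; the sample values are invented.

```python
def summarize_navigation_timing(timing):
    """Derive intervals (ms) relative to navigationStart from raw timestamps."""
    start = timing["navigationStart"]
    return {
        "time_to_first_byte_ms": timing["responseStart"] - start,
        "response_received_ms": timing["responseEnd"] - start,
        "dom_content_loaded_ms": timing["domContentLoadedEventEnd"] - start,
        # Includes client-side work such as script execution and layout.
        "page_load_time_ms": timing["loadEventEnd"] - start,
    }


# Hypothetical timestamps (milliseconds since the epoch), as they might be
# exported from a browser and sent to a collection service.
sample = {
    "navigationStart": 1_400_000_000_000,
    "responseStart": 1_400_000_000_120,
    "responseEnd": 1_400_000_000_180,
    "domContentLoadedEventEnd": 1_400_000_000_650,
    "loadEventEnd": 1_400_000_000_900,
}

summary = summarize_navigation_timing(sample)
print(summary)
```

In this invented sample the Response is fully received after 180 ms but the page does not finish loading until 900 ms, the kind of client-side gap that network-only tools cannot observe.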

I will continue this discussion of YSlow and its progeny in a future blog post.