Complications facing the YSlow scalability model. (Why is this web app running slowly? Part 5)

This is a continuation of a series of blog entries on this topic. The series starts here.

With the emergence of the Web 2.0 standards that enabled building dynamic HTML pages, web applications grew considerably more complicated than the static view of the page composition process that the YSlow scalability model embodies. Let’s review some of the more important factors that complicate the (deliberately) simple YSlow model of Page Load Time:

  • YSlow does not attempt to assess JavaScript execution time, which is becoming more important as more and more complex JavaScript is developed and deployed. JavaScript execution time can vary based on which path through the code is actually executed for a given request, so the delay depends on the scenario requested. The processing capacity of the hardware on which the web client executes the script is also a factor. This is especially important when the web browser is running on a cell phone, since apps on cell phones are often subject to significant capacity constraints, including battery life, RAM capacity (which impacts cache effectiveness), and the lack of external disk storage.

In addition to the variability in script execution time that is due to differences in the underlying hardware platform, the execution time of your web page’s JavaScript code can vary based on the specific path a scenario takes through the code. Both of these factors suggest adapting conventional performance profiling tools for use with JavaScript executing inside the web browser, as sketched below. Using the tools at webpagetest.org, it is also possible to measure when the web page is “visually complete,” which is computed by taking successive video captures until the video image stabilizes.
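
Browsers expose a built-in way to take this kind of measurement without a full profiler. The sketch below uses the standard User Timing API (performance.mark and performance.measure) to time one block of page script; the function name and the mark labels are hypothetical placeholders.

```javascript
// A minimal sketch of timing a block of page JavaScript with the
// User Timing API. buildProductGrid and the mark labels are
// hypothetical placeholders for whatever code path you suspect is slow.
performance.mark('grid-start');

buildProductGrid();   // hypothetical page-building routine under test

performance.mark('grid-end');
performance.measure('grid-build', 'grid-start', 'grid-end');

// Report the elapsed time; real-user monitoring could beacon the same
// measurement back to a collection server for analysis across devices.
var measure = performance.getEntriesByName('grid-build')[0];
console.log('buildProductGrid took ' + measure.duration.toFixed(1) + ' ms');
```

Because the same code path can take very different amounts of time on a desktop PC and on a phone, measurements like this are most useful when collected from a range of real devices.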

  • Web browsers provide multiple sessions so that static content, where possible, can be downloaded in parallel. Current web browsers can create 4-8 TCP sessions per host for parallel processing, depending on the browser. Souders’ blog entry here compares the number of parallel sessions that are created in each of the major web browsers, but, unfortunately, that information is liable to be dated. These sessions persist while the browser is in communication with the web server, so there is also a savings in TCP connection processing whenever connections are re-used for multiple requests. (Establishing a TCP connection requires a designated sequence of SYN, SYN-ACK and ACK packets to be exchanged by the Sender and Receiver machines prior to any application-oriented HTTP requests being transmitted over the link.) Clearly, parallel TCP sessions play havoc with the simple serial model for page load time expressed in Equations 3 & 4.

Parallel browser sessions can improve download performance so much that web performance experts like Souders have reassessed the importance of the YSlow rule recommending fewer static HTTP objects to load. Another browser-dependent complication is that some browsers block parallel downloads while they fetch JavaScript files. Since script code, when it executes, is likely to modify the DOM in a variety of ways, for the sake of integrity it is better to perform this operation serially. This behavior of the browser underlies the recommendation to reference external JavaScript files towards the end of the page’s HTML markup, when that is feasible, which maximizes the benefit of parallel downloads for most pages.

The fact that the amount of JavaScript code embedded in web pages is on the rise, plus the potential for JavaScript downloads to block parallel downloading of other types of HTTP objects, has Souders and others recommending asynchronous loading of JavaScript modules. For example, Souders writes, “Increasing parallel downloads makes pages load faster, which is why [serial] downloading external scripts (.js files) is so painful.”[1] As a result, Souders recommends using asynchronous techniques for downloading external JavaScript files, including using the ControlJS library functions to control how your scripts are loaded by the browser. (See also http://friendlybit.com/js/lazy-loading-asyncronous-javascript/ for native JavaScript code examples.)

The basic technique defers loading external JavaScript until after the DOM’s window.onload event has fired by adding some code to the event handler to perform the script file downloads at that point. The firing of the window.onload event also marks the end of Page Load Time measurements. Using this or similar techniques, measurements of Page Load Time can improve without actually improving the user experience, especially when the web page is not fully functional until after these script files are both downloaded and executed. It is not uncommon for one JavaScript code snippet to conditionally request the load of another script, something that also causes the browser to proceed serially. Moreover, JavaScript execution time can result in considerable delay. Suffice it to say that optimizing the performance of the JavaScript associated with the page is a topic for an entire book.
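
As a concrete illustration of the deferral technique just described, the sketch below injects an external script only after window.onload has fired, using plain JavaScript rather than a helper library like ControlJS; the file name is a hypothetical placeholder.

```javascript
// A minimal sketch of deferred script loading: wait for window.onload
// (the event that ends the Page Load Time measurement), then inject the
// external script so its download cannot block initial page composition.
// The file name below is a hypothetical placeholder.
window.addEventListener('load', function () {
  var script = document.createElement('script');
  script.src = '/js/deferred-widgets.js';
  script.async = true;   // execution order relative to other scripts does not matter here
  script.onload = function () {
    console.log('deferred script downloaded and executed');
  };
  document.body.appendChild(script);
});
```

Note that the page load timer stops before this download even begins, which is exactly why the measurement can look better than the experience the user actually gets.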

  • Equation #3 of the YSlow scalability model provides a single term for the round trip time (RTT) of HTTP requests, when round trip time is more accurately represented as an average RTT over an underlying distribution that is often non-uniform. Some of the factors that cause variability in round trip time include:
  • content is often fetched from multiple web servers residing at different physical locations. Locating each web server initially requires a separate DNS look-up to acquire the IP address of that server, followed by establishing a TCP connection with it,
  • content may be cached at the local machine via the browser cache or may be resident at a node physically closer to the web client if the web site uses a caching engine or Content Delivery Network (CDN) like Akamai.

In general, caching leads to a highly non-uniform distribution of round trip times. Those HTTP objects that can be cached effectively exhibit one set of round trip times, based on where the cache hit is resolved (local disk or CDN), while objects that require network access yield a different distribution. A frequently encountered pattern is an HTML reference to a third-party ad server, which is often the last and slowest reference to be resolved. The ad servers from Google, Amazon and others not only know who you are and where you are (in the case of a phone that is equipped with GPS), they also have access to your recent web browser activity, so they are likely to have some knowledge of what advertising content you might be interested in. Figuring out just what is the best ad to serve up to you at any point in time may necessitate a substantial amount of processing back at that 3rd party ad server site, all of which delays the generation of the Response message the web page is waiting on.

  • Many web sites rely on asynchronous operations, including AJAX techniques, to download content without blocking the UI. AJAX is an acronym that stands for Asynchronous JavaScript and XML, and it is a collection of techniques for making asynchronous calls to web services, instead of synchronous HTTP GET and POST Requests. Since YSlow does not attempt to execute any of the web page’s JavaScript, any use of AJAX techniques on the page is opaque to the tool.

AJAX refers to the capability of JavaScript code executing in the browser to issue asynchronous XMLHttpRequest method calls to web services, acquire content, and update the DOM, all while the web page remains in its loaded state. AJAX techniques are designed to try to hide network latency and are sometimes used to give the web page interactive features that mimic desktop applications. A popular example of AJAX techniques in action is the textbox with an auto-completion capability, familiar from Search engines and elsewhere. As you start to type in the textbox, the web application prompts you to complete the phrase you are typing automatically, which saves you some keystrokes.

A typical autocompletion textbox works as follows. As the customer starts to type into a textbox, a snippet of JavaScript code grabs the first few keyboard input characters and passes that partial string of data to a web service. The web service processes the string, performing a table or database look-up to find the best matches against the entry data, which it returns to the web client. In a JavaScript callback routine that is executed when the web service sends the reply, another snippet of script code executes to modify the DOM and display the results returned by the web service. These AJAX callback routines typically do update the DOM, but the asynchronous aspect of the XMLHttpRequest means the state of the page remains loaded and all the input controls on the page remain in the ready state. To be effective, both the client script and the web service processing the asynchronous call must handle the work very quickly to preserve the illusion of rapid interactive response. To ensure that the asynchronous process executes smoothly, both the XMLHttpRequest and the web service response messages should be small enough to fit into a single network packet each, for example.
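
The sketch below shows one minimal way to implement that pattern with a raw XMLHttpRequest; the /suggest endpoint, the element ids, and the JSON response format are hypothetical placeholders, and a production page would more likely rely on a library.

```javascript
// A minimal sketch of the autocompletion pattern: pass the partial input
// to a web service asynchronously and update the DOM in the callback.
// The /suggest endpoint, element ids, and response format are hypothetical.
var box = document.getElementById('search-box');
var list = document.getElementById('suggestions');

box.addEventListener('input', function () {
  var prefix = box.value;
  if (prefix.length < 2) { return; }   // wait for a few keystrokes

  var xhr = new XMLHttpRequest();
  // The third argument makes the request asynchronous: the page stays
  // loaded and every input control remains responsive while it is in flight.
  xhr.open('GET', '/suggest?q=' + encodeURIComponent(prefix), true);
  xhr.onload = function () {
    if (xhr.status === 200) {
      // Callback: modify the DOM to display the matches the service returned.
      var matches = JSON.parse(xhr.responseText);   // assumed: an array of strings
      list.innerHTML = matches
        .map(function (m) { return '<li>' + m + '</li>'; })
        .join('');
    }
  };
  xhr.send();
});
```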

Nothing on this list of its deficiencies diminishes the benefits of using the YSlow tool to learn about why your web page might be taking too long to load. These limitations of the YSlow approach to improving web page responsiveness do, however, provide motivation for tools that augment YSlow by actually measuring and reporting page load time. I will take a look at those kinds of web application performance tools next.


Why is this web app running slowly? — Optimization strategies (Part 4)

This is a continuation of a series of blog entries on this topic. The series starts here.

The YSlow model of web application performance, depicted back in Equations 3 & 4 in the previous post, leads directly to an optimization strategy to minimize the number of round trips, decrease round trip time, or both. Several of the YSlow performance rules reflect tactics for minimizing the number of round trips to the web server and back that are required to render the page. These include

  • designing the Page so there are fewer objects to Request,
  • using compression to make objects smaller so they require fewer packets to be transmitted, and
  • techniques for packing multiple objects into a single request.

For example, the recommendation to make fewer HTTP requests is in the category of minimizing round trips. YSlow rules regarding image compression or the use of “minified” external files containing JavaScript and CSS are designed to reduce the size of Response messages, which, in turn, reduces the number of round trips that are required to fetch these files. The web page where you can download the free Minify utility that Google provides to web developers, at https://code.google.com/p/minify/, contains the following description of the program:

Minify is a PHP5 app that helps you follow several of Yahoo!’s Rules for High Performance Web Sites. It combines multiple CSS or Javascript files, removes unnecessary whitespace and comments, and serves them with gzip encoding and optimal client-side cache headers.

All the text-based files that are used in composing the page – .htm, .css, and .js – tend to compress very well, and the HTTP protocol supports automatic unpacking of gzip-encoded files by the browser. There is not a great benefit from compressing files that are already smaller than the Ethernet MTU, so YSlow recommends packing smaller files into larger ones so that text compression is more effective.

Meanwhile, the performance rules associated with cache effectiveness are designed to minimize RTT, the round trip time. If current copies of the HTTP objects requested from the web server can be retrieved from sources physically located considerably closer to the requestor, the average network round trip time for those Requests can be improved.

With its focus on the number and size of the files necessary for the web browser to assemble in order to construct the page’s document object from these component parts, YSlow uses an approach to optimization known in the field of Operations Research (OR) as decomposition. The classic example of decomposition in OR is the time and motion study where a complex task is broken into a set of activities that are performed in sequence to complete a task. The one practical obstacle to using decomposition, however, is that YSlow understands the components that are used to compose the web page, but it lacks measurements of how long the task and its component parts take.

As discussed in the previous section, these measurements would be problematic from the standpoint of a tool like YSlow, which analyzes the DOM once it has been completely assembled. YSlow does not attempt to measure the time it took to perform that assembly. Moreover, the way the tool works, YSlow deals with only a single instance of the rendered page. If it did attempt to measure network latency or cache effectiveness or client-side processing power, it would be capable of gathering only a single instance of those measurements. There is no guarantee that that single observation would be representative of the range and variation in behavior a public-facing web application would expect to encounter in reality. As we consider the many and varied ways caching technology, for example, is used to speed up page load times, you will start to see just how problematic it can be to use a single observation of page load time to represent the range and variation in actual web page load times.

Caching.

Several of the YSlow performance rules reflect the effective use of the caching services that are available for web content. These services include the portion of the local file system that is used for the web client’s cache, a Content Delivery Network (CDN), which is a set of caches geographically distributed around the globe, and various server-side caching mechanisms. Effective use of caching improves the round trip time for any static content that can readily be cached. Since network transmission time is roughly a function of distance, the cache that is physically closest to the web client is naturally the most effective at reducing RTT. Of the caches that are available, the cache maintained by the web browser on the client machine’s file system is physically the closest, and, thus, is usually the best place for caching to occur. The web browser automatically stores a copy of any HTTP objects it has requested that are eligible for caching in a particular folder within the file system. The web browser cache corresponds to the Temporary Internet Files folder in Internet Explorer, for example.

If a file referenced in a GET Request is already resident in the web browser cache – the disk folder where recently accessed cacheable HTTP objects are stored – the browser can add that file to the DOM without having to make a network request. Web servers add an Expires header to Response messages to indicate to the web browser that the content is eligible for caching. As the name indicates, the Expires header specifies how long the existing copy of that content remains current. Fetching content from the browser cache requires a disk operation, which is normally significantly faster than a network request. If a valid copy of the content requested is already resident in the browser cache, the round trip time normally improves by an order of magnitude, since a block can be fetched from disk in 5-10 milliseconds on average. Note that reading a cached file from disk isn’t always faster than accessing the network to get the same data. Like any other factor, it is important to measure to see which design alternative performs better. In the case of an intranet web application where web browser requests can be fielded very quickly, network access, often involving less than 1 ms of latency, might actually be preferred because it could be much faster to get the HTTP object requested directly from the IIS kernel-mode cache than for the web client to have to access its local disk folder where Temporary Internet Files are stored.
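
If you want to see how much of your own page the browser cache is actually absorbing, the Resource Timing API offers a rough check: in supporting browsers a transferSize of zero generally means the object came from the cache, although cross-origin objects served without a Timing-Allow-Origin header also report zero. A minimal sketch, run from the page or the browser console:

```javascript
// A minimal sketch: list which HTTP objects this page load appears to have
// satisfied from the browser cache, using the Resource Timing API.
// transferSize === 0 is only an approximation of a cache hit; cross-origin
// resources without a Timing-Allow-Origin header also report zero.
performance.getEntriesByType('resource').forEach(function (entry) {
  var fromCache = entry.transferSize === 0;
  console.log(
    (fromCache ? '[cache]   ' : '[network] ') +
    entry.name + ' - ' + entry.duration.toFixed(1) + ' ms'
  );
});
```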

Note also that while caching does not help the first time a customer accesses a new web page, it has a substantial impact on subsequent accesses to the page. Web traffic analysis programs will report the number of unique visitors to a web site – each of these is subject to a browser cache that is empty of any of the content that is requested. This is referred to as a cache cold start. Only the repeat visitors benefit from caching, subject to the repeat visit to the web site occurring prior to the content expiration date and time. In Souders’ book, he reports an encouragingly high number of repeat visits to the Yahoo site as evidence for the YSlow recommendation. When network latency for an external web site is at least 100-200 ms, accessing the local disk-resident browser cache is an order of magnitude faster.

When the web browser is hosted on a mobile phone, which is often configured without a secondary storage device, the capability to cache content is consequently very limited. When Chrome detects it is running on an Android phone, for example, it configures a memory resident cache that will only hold up to 32 files at any one time. If you access any reasonably complex web site landing page with, say, more than 20-30 href= external file references, the effect is to flush the contents of the Chrome mobile phone cache.

Any CSS and JavaScript files that are relatively stable can potentially also be cached, but this entails implementing a versioning policy that your web developers adhere to. The snippet of HTML pulled from an Amazon product landing page that I discussed earlier illustrates the sort of versioning policy your web developers need to implement to reap the benefits of caching, while still enabling program bug fixes, updates, and other maintenance to ship promptly.

Another caching consideration is that when popular JavaScript libraries like jquery.js or angular.js, or any of their add-on libraries, are incorporated into your web applications, you will often find that current copies of these files already exist in the browser’s cache and do not require network requests to retrieve them. Taking a moment to check the contents of my Internet Explorer disk cache, I can see several different versions of jquery.js are currently resident in the IE cache. Another example is the Google Analytics script, ga.js, which so many web sites utilize for tracking web page usage; it is frequently already resident in the browser cache. (I will be discussing some interesting aspects of the Google Analytics program in an upcoming section.)

Content that is generated dynamically is more problematic to cache. Web 2.0 pages that are custom built for a specific customer probably contain some elements that are unique to that user ID, while other web page parts are apt to be shared among many customers. Typically, the web server programs that build dynamic HTML Response messages will simply flag them to expire immediately so that they are ineligible for caching by the web browser. Caching content that is generated dynamically is challenging. Nevertheless, it is appropriate whenever common portions of the pages are reused, especially when it is resource-intensive to re-generate that content on demand. We will discuss strategies and facilities for caching at least some portion of the dynamic content web sites generate in a future post.

Beyond caching at the local machine, YSlow also recommends the use of a Content Delivery Network (CDN) similar to the Akamai commercial caching engine to reduce the RTT for relatively static Response messages. CDNs replicate your web site content across a set of geographically distributed web servers, which allows the CDN web server physically closest to the requestor to serve up the requested content. The net result is a reduction in the networking round trip time simply because the CDN server is physically closer to the end user than your corporate site. Note that the benefits of a CDN even extend to first-time visitors to your site because the CDN servers hold up-to-date copies of the most recent static content from your primary web site host. For Microsoft IIS web servers and ASP.NET applications, there are additional server-side caching options for both static and dynamic content that I will explore much later in this discussion.

Extensive use of caching techniques in web technologies to improve page load time is one of the reasons why a performance tool like YSlow does not actually attempt to measure Page Load Time. When YSlow re-loads the page to inventory all the file-based HTTP objects that are assembled to construct the DOM, the web browser is likely to discover many of these objects in its local disk cache, drastically reducing the time it takes to compose and render the web page. Were YSlow to measure the response time, the impact of the local disk cache would bias the results. A tool like the WebPageTest.org site tries to deal with this measurement quandary by accessing your web site a second time and comparing the results to a first-time user access that involves a browser cache cold start.

Having read and absorbed the expert advice encapsulated in the YSlow performance rules and beginning to contemplate modifying your web application based on that advice, you start to feel the lack of actual page load time measurements keenly. It is good to know that using a minify utility and making effective use of the cache control headers should speed up page load time. But without the actual page load time measurements you cannot know how much adopting these best practices will help your specific application. It also means you do not know how to weigh the value of improvements from tactics like Expires headers for CSS and JavaScript files to boost cache effectiveness against the burden of augmenting your software development practices with an appropriate approach to versioning those files, for example. Fortunately, there are popular tools to measure web browser Page Load Time directly, and we will look at them in a moment.

Next: Complications that the simple YSlow model does not fully take into account.

Why is this web app running slowly? — Part 3.

This is a continuation of a series of blog entries on this topic. The series starts here.

In this post, I will begin to probe a little deeper into the model of web page response time that is implicit in the YSlow approach to web application performance.

The YSlow performance rules, along with Souders’ authoritative book that describes the YSlow corpus of recommendations and remedies in more detail, have proved to be remarkably influential among the current generation of web application developers and performance engineers. The scalability model that is implicit in the YSlow performance rules is essentially a prediction about the time it takes to compose a page, given the underlying mechanics of web page composition, which entail gathering and assembling all of the files associated with that page. One benefit of the model is that it yields an optimization strategy for constructing web pages that will load more efficiently. The YSlow performance rules embody the tactics recommended for achieving that goal, encapsulating knowledge of both the

  • HTTP protocol that allows a page to be assembled from multiple parts, and
  • the underlying TCP/IP networking protocol that governs the mechanism by which file transfers are broken into individual packets for actual transmission across the network.

In this post, I will explore the model, the optimization strategy it yields, and some of the more important tactical recommendations the tool makes in order to design web pages that load fast. Any analytic method that relies on an underlying predictive model runs the risk of generating inaccurate predictions when the model fails to correspond closely enough to the real behavior it attempts to encapsulate. With that in mind, I will also describe some of the deficiencies in the YSlow model.

One notable concern that arises whenever you do attempt to apply the YSlow optimization rules to your web app is that the YSlow tool does not attempt to measure the actual time it takes to load the page being analyzed. The effect of the missing measurements is that very specific YSlow recommendations for improving page load time exist in a vague context, lacking precisely the measurements that allow you to assess how much of an improvement can be expected from following that recommendation. This lack of measurement data is a serious, but not necessarily always fatal, shortcoming of the YSlow approach.

Today, a very popular alternative to YSlow is available at http://www.WebPageTest.org which does provide page load time measurements. Since WebPageTest adopts a very similar scalability model, I will begin with a discussion of the original YSlow tool.

As noted above, YSlow does not attempt to measure the amount of time it takes to transmit the HTTP data files requested over the network. Such measurements are problematic because network latency is mainly a function of the distance between the web client and server host machines. Network delay generally varies based on the geographical location of the client relative to the web server, although the specific route IP packets take to arrive at the server endpoint from the client is also significant.

Note: in the IP protocol, a route between a source IP address and destination IP address is generated dynamically for each packet. The IP packet forwarding mechanism uses routers that forward each packet one step closer to the desired destination until it ultimately reaches it. Each intermediate endpoint represents one hop. The physical distance between two IP endpoints in a hop is the primary determinant of the latency. (An IP command line utility called tracert, which is included with Windows, attempts to discover and document the path that a packet will travel from your computer in order to arrive at a specific destination IP address. The output from the tracert command shows each network hop and the latency associated with that hop. There are also visually oriented versions of the tracert tool (see, for example, http://www.visualiptrace.com/), some of which even attempt to locate each network hop on a map to indicate visually the physical distance between each hop.)

Without going into too many of the details regarding IP packet routing and TCP session management here, it is generally safe to assume, as YSlow does, that network latency has a significant impact on overall page load times, which is true for any public-facing web site. In the case of running a tool like YSlow, where you are only able to gather one instance of network latency, that singleton measurement can hardly be representative of what is in reality a very variegated landscape of network connectivity across this interconnected world of ours. Naturally, measurement tools have evolved to fill this important gap, too. For example, the WebPageTest program is hosted at a dozen or so locations around the world, so it is possible to at least compare the performance of your web application across several different geographically distributed locations.

Even though YSlow’s prediction about page load time is rendered qualitatively – in the form of a letter grade, since YSlow does not attempt to measure actual page load times – it will help to express the model of web application performance underlying the YSlow recommendation engine in quantitative terms.

In formal terms, in YSlow,

PageLoadTime ≈ Web Page Composition and Render time

[equation 1]

In effect, the YSlow model assumes that the network latency to move request and response messages back and forth across the public Internet in order to compose web pages dominates the cost of computation on the local device where that page composition is performed inside the web client. In reality, the assembly of all the files the page references and render time are distinct phases,

PageLoadTime ≈ Web Page Composition + Render time

[equation 2]

but page render time is even more problematic to measure, given the widespread use of JavaScript code to respond to customer input gestures, etc. YSlow assumes that the web client processing time in assembling the different components of a page is minimal compared to the cost of gathering those elements of the page across the network from a web server located on some distant host machine. It is often quite reasonable to ignore the amount of time spent inside the web client to perform the actual rendering, as YSlow assumes. Each HTTP GET Request needed to compose the page requires a round trip to the web server and back. Since computers are quite fast, while data transmission over long distances is relatively slow, the cost of computation inside the web browser can often be ignored, except in extreme cases.

To be sure, this simplifying assumption is not always the safest assumption. For example, as powerful as they are becoming, the microprocessors inside mobile devices such as phones and tablets are still significantly slower than desktop PCs and laptop computers. When any significant amount of compute-bound logic is off-loaded from the web server to JavaScript code executing inside the web client, the amount of computing capacity available to the web client can easily become a critical factor in page load time.

Nevertheless, by ignoring the processing time inside the web client, YSlow can assume that page composition and render time is largely a function of the network delays associated with fetching the necessary HTTP objects:

PageLoadTime ≈ RoundTrips * RTT

[equation 3]

namely, the number of round trips to the web server and back multiplied by the Round Trip Time (RTT) for each individual network packet transmission associated with those HTTP objects.

A simple way to measure the RTT for a web server connection is to use the ping command to send a message to the IP address registered to the domain name and receive a reply. (You can use the Whois service to find the IP address if you only know the web site domain name.) Using ping or a network packet sniffer like WireShark, it is easy to determine that the network transmission round trip time from your client PC to some commercial web site is 100-200 ms or more. On a typical desktop or portable PC, which contains an extremely powerful graphics co-processor, the amount of CPU time needed to render a moderately complex web page is likely to be far less than 200 ms.

We have seen that large HTTP Request and Response messages require multiple packets to be transmitted. Therefore, the total number of round trips to the web server that are required can be calculated using the following formula:

RoundTrips = SUM (HttpObjectSize / PacketSize)

[equation 4]

since each HTTP object requested requires one or more packets to be transmitted, depending on the size of the object. Ignoring for a moment some web page composition wrinkles that will complicate this simple picture considerably, the model does yield a reasonable, first-order approximation of page load time for many web applications. In emphasizing the assembly process, which involves retrieving all the HTTP objects that are referenced in the DOM over a TCP/IP network in order to compose a web page, the YSlow model of performance certainly clarifies several of the most important factors that determine the performance of web applications. There are excellent reasons to turn to a tool like YSlow for expert advice about why your web application is running slowly.
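
To make the arithmetic in Equations 3 and 4 concrete, here is a back-of-the-envelope calculation with purely hypothetical numbers: a handful of object sizes, a 1460-byte packet payload (roughly the Ethernet MTU less TCP/IP headers), and a 100 ms round trip time.

```javascript
// A worked example of Equations 3 & 4 with hypothetical, illustrative numbers.
var objectSizes = [45000, 12000, 8000, 3000, 1500, 900]; // bytes per HTTP object
var packetSize = 1460;   // payload bytes per packet (~Ethernet MTU minus headers)
var rtt = 100;           // round trip time in milliseconds (e.g., measured with ping)

// Equation 4: each object needs at least one packet, larger objects need more.
var roundTrips = objectSizes.reduce(function (sum, size) {
  return sum + Math.ceil(size / packetSize);
}, 0);

// Equation 3: page load time approximated as round trips times RTT.
var pageLoadTime = roundTrips * rtt;

console.log(roundTrips + ' round trips, roughly ' + pageLoadTime + ' ms');
// With these numbers: 31 + 9 + 6 + 3 + 2 + 1 = 52 round trips, about 5200 ms.
// The serial model is deliberately pessimistic; parallel TCP sessions and
// caching, discussed elsewhere in this series, pull the real number down.
```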

If your web application is actually performing badly in either a stress test or in an actual production environment, there are also a number of good reasons to be wary of some of the YSlow recommendations. For one, in the interest of simplification, the YSlow performance rules ignore a number of other potential complicating factors that, depending on the particular web application, could be having a major impact on web page responsiveness.

When it comes to making changes to your web application based on its YSlow grade, the other important issue is that the YSlow recommendations do not reflect actual Page Load Time measurements for your web page. Using the YSlow tool alone, you are not able to quantify how much improvement in page load time to expect from implementing any of the recommended changes. Since not every recommended change is an easy change to implement, YSlow cannot help you understand the benefit of a change so that you can weigh it against the cost of implementing it.

For instance, sure, you could combine all your scripts into one minified script, but the current way that individual blocks of script code are packaged may provide benefits in terms of application maintainability, stability and flexibility. Your web page may use scripts from multiple sources: perhaps different teams within your own development organization contribute some, while others are pulled from external sources, including third parties that build and maintain popular, open source JavaScript libraries that complement jQuery or AngularJS. From a software engineering quality perspective, packing all these scripts from different sources into one may not be the right Build option to use.

Knowing how much of an improvement can be gained from implementing the YSlow rule about packing some or all of your script files helps you understand whether this is a packaging trade-off that is worthwhile, given that the change will complicate software maintenance and possibly compromise software quality. For the record, a web page that is flat out broken is always categorically worse than one that is merely slow, so there is a negative impact from any engineering change that reduces the application’s quality or makes it harder to maintain. Because of the need to understand trade-offs like this, actual page load time measurements are very important to consider in weighing the benefits of some performance optimization. This has sparked development of performance tools to complement YSlow, which we will take a closer look at in a moment.

Something else to consider when you are thinking of making changes to your web application to conform to the YSlow performance recommendations is why so many apparently successful web applications appear to ignore these recommendations almost completely. The Google search engine landing page at www.google.com, for example, has historically drunk the YSlow Kool-Aid by providing an extremely simple web page that requires only 2 or 3 round trips to load. At the other end of the spectrum in terms of complexity, you will find landing pages for Amazon.com, the largest and most successful e-commerce web site in the world. The last time I ran YSlow against a www.amazon.com landing page, I found it required retrieval of over 300 different HTTP objects to compose the Amazon web page DOM.

Despite violating the basic YSlow performance rules, it is difficult to argue against Amazon’s approach, given Amazon’s experience and acumen in the world of e-commerce. In fact, you will find that many, many commercial web sites that participate in e-commerce or serve as portals serve up very complex web pages, garnering YSlow grades more like Amazon’s than the Google Search page’s. Companies with web-based retailing operations like Amazon use an experimental approach to web page design where design alternatives are evaluated using live customers to see which approach yields the best results. If Amazon and other e-commerce retailers wind up serving complex web pages that flout the YSlow performance rules, it is safe to conclude that Amazon customers not only tolerate page load times that routinely exceed 5-10 seconds, but in some sense prefer these complex pages, perhaps because the page conveniently and neatly encapsulates all the relevant information they require prior to making a purchasing decision.

On the other hand, in a future post we will look at some of the evidence that improving web page response times is correlated with increased customer satisfaction (which is admittedly difficult to measure) and improved fulfillment rates (which is often much easier to measure).

The YSlow model of web application performance, depicted in Equations 3 & 4, leads directly to an optimization strategy: minimize the number of round trips, decrease round trip time, or both. More on optimization next time.
