Understanding Guest Machine Performance under Hyper-V

In this post I begin to consider guest machine performance under Hyper-V. This is the seventh post in a series on Hyper-V performance. The series began here.

All virtualization technology requires executing additional layers of systems software that adds overheads in many functional areas of Windows, including

  • processor scheduling,
  • intercepting and emulating certain guest machine instructions that would violate the integrity of the virtualization scheme,
  • machine memory management,
  • initiating and completing IO operations, and
  • synthetic device interrupt handling.

The effect of virtualization in each one of these areas of execution is to impart a performance penalty, and this applies equally to VMware, Xen, and the other flavors of virtualization that are available. Windows guest machine enlightenments under Hyper-V serve to reduce some of the performance penalties associated with virtualization, but they cannot eliminate them entirely. Your application suffers some performance penalty when it executes on a virtual machine. The question is how big that penalty is.

Executing these additional layers of software under virtualization always impacts the performance of a Windows application negatively, particularly its responsiveness. Individually, executing these extra layers of software adds a very small amount of overhead every time one of these functional areas is exercised. Added together, however, these additional overhead factors are significant enough to take notice of. But the real question is whether they are substantial enough to actively discourage data centers from adopting virtualization technology, given its benefits in many operational areas. Earlier in this series, I suggested a preliminary answer, which is “No, in many cases the operational benefits of virtualization substantially outweigh the performance risks.” Still, there are many machines that remain better off being configured to run on native hardware. Whenever maximum responsiveness and/or throughput is required, native Windows machines reliably outperform Windows guest machines executing the same workload.

Where Hyper-V virtualization technology excels is in partitioning and distributing hardware resources across virtual machines that require far less capacity than is available on powerful server machines. Furthermore, by exploiting the ability to clone new guest machines rapidly, virtualization technology is often used to enhance the scalability and performance of an application that runs across a cluster of Windows machines. Virtualization can make scaling up and scaling out such an application operationally easier. However, you should be aware that there are other ways to cluster machines that achieve the same scaling up and scaling out improvements without incurring the overhead of virtualization.

Performance risks

The configuration flexibility that virtualization provides is accompanied by a set of risk factors that expose virtual machines to potential performance problems that are much more serious than the additional overhead considerations discussed immediately above. These performance risks need to be understood by IT professionals charged with managing the data center infrastructure. The most serious risk you will encounter is the ever-present danger of overloading the Hyper-V Host machine, which leads to far greater performance degradation than any of the virtualization “overheads” enumerated above. Shared processors, shared memory and shared devices introduce opportunities for contention for those physical resources among guest machines that would not otherwise be sharing those components if allowed to run on native hardware. The added complexity of administering the virtualization infrastructure, with its more ubiquitous level of resource sharing, is a related risk factor.

When a Hyper-V Host machine is overloaded, or over-committed, all its resident guest machines are apt to suffer, but isolating them so they share fewer resources, particularly disk drives and network adaptors, certainly helps. However, shared CPUs and shared memory are inherent in virtualization, so achieving the same degree of isolation with regard to those resources is more difficult, to say the least. This aspect of resource sharing is the reason Hyper-V has virtual processor scheduling and dynamic memory management priority settings, and we will need to understand when to use these settings and how effective they are. In general, priority schemes are only useful when a resource is over-committed, essentially an out-of-capacity situation. This creates a backlog of work – a work queue – that is not getting done. Priority sorts the work queue, allowing more of the higher priority work to get done, at the expense of lower priority workloads. Like any other out-of-capacity situation, the ultimate remedy is not priority, but finding a way to relieve the capacity constraint. With a properly provisioned virtualization infrastructure, there should be a way to move guest machines from an over-committed VM Host to one that has spare capacity.

Somewhere between over-provisioned and under-provisioned is the range where the Hyper-V Host is efficiently provisioned to support the guest machine workloads it is configured to run. Finding that balance can be difficult, given constant change in the requirements of the various guest machines.

Finally, there are also performance risks associated with guest machine under-provisioning, where the VM Host machine has ample capacity, but one or more child partitions is constrained by its virtual machine settings from accessing enough of the Hyper-V Host machine’s processor and memory resources to meet its requirements.

Table 2 summarizes the four kinds of Hyper-V configurations that need to be understood from a cost/performance perspective, focusing on the major performance penalties that can occur.

Table 2. Performance consequences of over or under-provisioning the VM Host and its resident guest machines.

Condition                           Who suffers a performance penalty
Over-committed VM Host              All resident guest machines suffer
Efficiently provisioned VM Host     No resident guest machines suffer
Over-provisioned VM Host            No guest machines suffer, but hardware cost is higher than necessary
Under-provisioned Guest             Guest machine suffers

In the next blog entry, I will make an effort to characterize the performance profile of each configuration condition, beginning with the case that generates the least damaging performance penalty, namely that of the over-provisioned VM Host. Characterizing application performance when the Hyper-V Host machine is over-provisioned will provide insight into the minimum performance penalties that you can expect to accrue under virtualization.

Understanding Hyper-V Dynamic Memory management

In this post I will examine what happens when Hyper-V machine memory is overcommitted and how the Hyper-V Dynamic Memory Balancer adjusts guest machine memory allocations in response to that condition. This is the sixth post in a series on Hyper-V performance. The series began here.

The best way to understand how the Hyper-V Dynamic Memory Balancer works is through a simple example where machine memory is overcommitted. The scenario begins on a Hyper-V Host machine with 12 GB of RAM that is running five Windows guest machines with the following dynamic memory settings:

Virtual Machine     Minimum     Maximum
WIN81TEST1          2 GB        6 GB
WIN81TEST2          2 GB        6 GB
WIN81TEST3          512 MB      6 GB
WIN81TEST4          512 MB      6 GB
WIN81TEST5          512 MB      8 GB

Only guest machine 5 is active, as illustrated in the screen shot of the Hyper-V Manager console, shown in Figure 17. The Hyper-V Host used in the test contains 12 GB of RAM. The remaining four guest machines are idle initially. As the benchmark workload running on guest machine 5 consumes more physical memory, the Hyper-V Dynamic Memory Balancer kicks in, stealing memory from the idle machines. (Note that the Hyper-V root partition is also a virtual machine that consumes physical memory.) When machine memory gets over-committed, Hyper-V will reduce the amount of machine memory allocated to guests 3-4 to near their minimum values, using the ballooning technique discussed in the previous post. The dynamic memory minimum parameters for machines 1-2 ensure that Hyper-V cannot reduce their physical memory allotments below their assigned 2 GB minimum.
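Summing the settings in the table above shows how easily this configuration can over-commit machine memory; the snippet below is nothing more than that arithmetic.

```python
# Quick arithmetic on the dynamic memory settings in the table above (sizes in GB).
minimums = {"WIN81TEST1": 2, "WIN81TEST2": 2, "WIN81TEST3": 0.5,
            "WIN81TEST4": 0.5, "WIN81TEST5": 0.5}
maximums = {"WIN81TEST1": 6, "WIN81TEST2": 6, "WIN81TEST3": 6,
            "WIN81TEST4": 6, "WIN81TEST5": 8}
host_ram_gb = 12

print(sum(minimums.values()))   # 5.5 GB of minimums fits comfortably in 12 GB
print(sum(maximums.values()))   # 32 GB of maximums far exceeds what the host can back,
                                # so machine memory can easily become over-committed
# (and the root partition's own memory consumption is not even counted here)
```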

Figure 17. The Hyper-V Manager console showing the five guest machines at the beginning of the Dynamic Memory case study.

The workload executed on guest machine 5 is a memory soaker program that is severely constrained in an 8 GB virtual machine, so its machine memory allotment quickly increases. Notice, however, that guest machines 1-2 retain all 2 GB of their machine memory allotments, which reflects their Dynamic Memory Minimum memory settings. The Physical Memory allocations for a two-hour window at the beginning of the test, shown in Figure 18, indicate that the Assigned Memory values shown in the console window are representative ones.

Figure 18. Physical Memory allocations over a two-hour window at the beginning of the test scenario. Test machines 1 & 2 are running at their minimum dynamic memory settings, which was 2 GB.

The Memory Pressure indicators for this initial configuration are shown in Figure 19. The Memory Pressure measurements for guest machines 1-2 are much lower than those of the remaining guest machines, which are subject to memory adjustments by the Dynamic Memory Balancer. There is a sustained period beginning around 1:20 pm where the Memory Pressure measurements for machines 3-5, which are subject to dynamic memory adjustments, exceed the Memory Pressure threshold value of 100. Because of the minimum memory settings in effect for guest machines 1 & 2, their Memory Pressure readings are less than 40.

Figure 19. Memory Pressure measurements for machines 3-5 exceed the threshold value of 100. Because of the minimum memory settings in effect for guest machines 1 & 2, their Memory Pressure readings are less than 40.

At this point, you can work backwards from the Memory Pressure measurements and the amount of physical memory visible to the partition and calculate the number of committed bytes reported by the guest machines. Alternatively, you can gather performance measurements directly from the guest Windows machines.
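As a rough sketch of that back-calculation (the figures below are illustrative rather than read off the charts, and the pressure-to-committed-bytes relationship is the simple ratio described later in the Discussion section):

```python
GB = 1024 ** 3

# Working backwards from the hypervisor-side counters: Average Memory Pressure
# is (approximately) Committed Bytes as a percentage of the machine memory
# currently assigned to the guest, so committed bytes can be estimated from
# the other two values.
average_pressure = 110                    # hypervisor-reported pressure for the guest
visible_physical_memory = 5 * GB          # machine memory currently assigned to the guest

estimated_committed = average_pressure / 100 * visible_physical_memory
print(estimated_committed / GB)           # ~5.5 GB committed inside the guest
```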

Figure 20 shows a view of Committed Bytes and the guest machine’s Commit Limit taken from inside guest machine 5 during the same measurement interval as Figure 19. Prior to 1:20, the guest machine Commit Limit is about 4.5 GB. Sometime around 1:20 pm, Hyper-V added machine memory to the guest machine, which boosted the Commit Limit to about 5.5 GB. At that point, the guest machine started paging excessively. The VM Host was so bottlenecked due to this excessive disk paging beginning around 1:20 pm that there are gaps in the available guest machine performance counter measurements, indicating the guest machine was dispatched erratically.

Figure 20. A view from inside guest machine 5 of Committed Bytes and the guest machine’s Commit Limit during a period where the guest faced a severe physical memory constraint. Note several gaps in the measurement data. These gaps reflect intervals in which performance data collection was delayed or deferred because of virtual processor dispatching delays and deferred timer interrupts due to contention for the VM Host machine’s physical processors.

To alleviate a machine memory shortage, remove one or more of the executing guest machines or migrate them to another Hyper-V host to free up some machine memory. Around 4:30 pm, I manually shut down guest machines 1 & 2, which quickly freed up the 4 GB of machine memory they were holding onto. As this machine memory became available, Hyper-V began to increase the memory allotment for guest machine 5, which remained under severe memory pressure. As shown in Figure 21, the guest machine 5 Commit Limit quickly increased to 8 GB, which was sustained for about one hour as the memory soaker program continued to execute, while the number of committed bytes began to approach the partition’s 8-GB allocation limit. After a sustained period when Committed Bytes reached 80% of the Commit Limit, the Windows guest extended the size of its paging file to increase the Commit Limit to about 12 GB around 5:40 pm.
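The Commit Limit behavior reported here follows the usual Windows relationship between visible physical memory and the paging file. A minimal sketch, using round numbers that are merely consistent with the charts rather than taken from them:

```python
GB = 1024 ** 3

def commit_limit(visible_physical_memory, total_pagefile_size):
    """The Windows Commit Limit is (approximately) visible RAM plus the total
    size of the paging file(s). Dynamic Memory changes the RAM term on the fly,
    and extending the paging file raises the limit without any new RAM."""
    return visible_physical_memory + total_pagefile_size

# Once guest 5 is assigned its full 8 GB and later grows its paging file to
# roughly 4 GB, the Commit Limit lands at about 12 GB.
print(commit_limit(8 * GB, 4 * GB) / GB)   # 12.0
```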

Note that there are several gaps in the guest machine performance data shown in Figure 21. These gaps reflect intervals in which performance data collection was unavailable due to delays in virtual processor scheduling and synthetic timer interrupts that were deferred by the hypervisor. Excessive contention for the VM Host machine’s physical processors, which, as we saw, were all quite busy, causes these delays at the level of the guest machine. The gaps themselves are fairly strong evidence that the physical processors on the VM Host machine are over-committed.

Figure 21. A later view from inside guest machine 5 of Committed Bytes and the guest machine’s Commit Limit after 4 GB of physical memory were freed up. Committed Bytes increases to almost 8 GB, the maximum dynamic memory setting. The Commit limit is boosted to 8 GB and then to 12 GB.

Following this adjustment, the configuration reaches a steady state with Guest Machine 5 running with its maximum dynamic memory allotment of 8 GB. The Commit Limit remains at 12 GB. While Committed Bytes fluctuates due to periodic .NET Framework garbage collection, you can see in Figure 22 that it averages close to the 8 GB physical memory allotment, but with peaks as high as 10 GB.

Figure 22. A final view from inside guest machine #5, looking at Committed Bytes and the Commit Limit during a period of memory stability. Committed Bytes hovers near the 8 GB mark, while the Commit Limit remains 12 GB.

Figure 23 returns to the Hyper-V view of machine memory management, reporting the physical memory allocated to the three active guest machines. Guest machine 5 is allocated close to 8 GB, while between them guest machines 3 and 4 have acquired an additional 1.5 GB of physical memory.

Figure 23. Physical memory allocated to guest machine #5 approaches the 8 GB Maximum Memory setting. The Hyper-V Dynamic Memory Balancer makes minor memory adjustments continuously, even when the workloads are relatively stable.

Figure 24 revisits the hypervisor view of Memory Pressure for the three guest machines that remained running, each of which is subject to Dynamic Memory management.

Figure 24. Memory Pressure readings for the three remaining active guest machines remain high, which triggers minor Memory Add and Memory Remove operations every measurement interval.

During this measurement interval, guest machine 5 is running with a physical memory allocation at or near its dynamic memory maximum setting. Hyper-V continues to make minor dynamic memory adjustments and the Memory Pressure for all three machines remains high and continues to fluctuate.

Discussion

Hyper-V adjusts machine memory allocation by attempting to balance a measurement called Memory Pressure across all guest machines running at the same memory priority, adding machine memory to a partition running at a higher Memory Pressure value and removing machine memory, using a ballooning technique, from a partition running at a lower Memory Pressure value. Memory Pressure is a memory contention index calculated from the ratio of the guest machine’s Committed Bytes to its current machine memory allocation. When Memory Pressure increases beyond 100, a guest machine is likely to experience an increased rate of paging to disk, so adding machine memory to the guest machine to prevent that from happening is often an appropriate action for Hyper-V to take.
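The exact arithmetic Hyper-V uses internally is not spelled out here, but a minimal sketch of the index as just described, with illustrative numbers:

```python
def memory_pressure(committed_bytes, allocated_bytes):
    """Memory contention index as described above: the guest's Committed Bytes
    expressed as a percentage of its current machine memory allocation."""
    return 100.0 * committed_bytes / allocated_bytes

GB = 1024 ** 3
# A guest with 4 GB of machine memory assigned and 4.5 GB committed:
pressure = memory_pressure(4.5 * GB, 4 * GB)
print(round(pressure))   # ~112: above 100, so the guest is probably paging and is
                         # a candidate to receive machine memory from the balancer
```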

Memory Pressure is less viable as a memory contention index when applications such as Microsoft SQL Server are allowed to allocate virtual memory up to the physical memory limits of the guest machine. Applications like SQL Server also listen for low memory notifications from the Windows OS when the supply of physical memory is depleted due to ballooning. Upon receiving a low memory notification from Windows, SQL Server will release some of the virtual memory allocated in its process address space by releasing some of its buffers. Similarly, the .NET Framework runtime will trigger a garbage collection to release unused space in any of the process address space managed heaps whenever a low memory notification from Windows is received. The combination of Hyper-V dynamic memory adjustments, ballooning inside the guest OS to drive its page replacement algorithm, and dynamic memory management at the process address space level makes machine memory management in Hyper-V very, very dynamic!
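For the curious, the low memory notification these applications respond to is an ordinary Win32 facility. Below is a minimal ctypes sketch of polling it from Python; a real memory-aware service would wait on the handle and release caches when it is signaled.

```python
import ctypes
from ctypes import wintypes

kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
kernel32.CreateMemoryResourceNotification.restype = wintypes.HANDLE
kernel32.QueryMemoryResourceNotification.argtypes = [wintypes.HANDLE,
                                                     ctypes.POINTER(wintypes.BOOL)]

LOW_MEMORY_RESOURCE_NOTIFICATION = 0   # MEMORY_RESOURCE_NOTIFICATION_TYPE

# Create a notification object that is signaled when available physical memory
# inside the (guest) machine runs low.
handle = kernel32.CreateMemoryResourceNotification(LOW_MEMORY_RESOURCE_NOTIFICATION)

state = wintypes.BOOL()
if kernel32.QueryMemoryResourceNotification(handle, ctypes.byref(state)) and state.value:
    # This is the point where SQL Server trims its buffer pool and the .NET
    # runtime triggers a garbage collection; a custom cache would do likewise.
    print("Low memory condition signaled - release discardable memory")
else:
    print("No low memory condition at the moment")
```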

With dynamic memory, it becomes possible to stuff more virtual machines into the machine memory footprint of the Hyper-V Host machine, making Hyper-V more competitive with VMware ESX in that crucial area. The minimum physical memory setting of 32 MB grants Hyper-V considerable latitude in reducing the physical memory footprint of an inactive guest machine to a base minimum amount, less than the amount of physical memory a guest machine would need to get revved up again after a period of inactivity.

It is also easy to stuff more virtual machines into machine memory than can safely coexist, with the potential to create performance problems of a serious magnitude. Any guest machine that is attempting to run when the machine memory of the Hyper-V Host is over-committed is apt to face serious consequences. In an earlier discussion of Windows virtual memory management, we have seen that determining the memory requirements of the guest machine workload is a difficult problem, complicated by the fact that the memory requirements may themselves vary based on the time of day, a particular mix of requests, or additional factors that can influence the execution of the workload.

While acknowledging that memory capacity planning is consistently challenging, I would like to suggest that the Hyper-V dynamic memory capability does open up unique opportunities to deal with the difficulties. With virtualization, you gain the flexibility to size a guest machine to execute in a Guest Physical memory footprint that is difficult to configure on native hardware. When you are able to determine the memory requirements of the workload, with Hyper-V you can configure a maximum dynamic memory footprint for a guest machine that might not exist in your inventory of actual hardware. In the example shown in Figure 4, the dynamic memory maximum is set to 6 GB. If the physical machines available are all configured with 8 GB or more of machine memory, then you are already seeing a practical advantage from running that workload as a Hyper-V guest.

One recommended approach to memory capacity planning is systematically varying the memory footprint of the machine and observing the paging rates, for example, in a load test. This is an iterative, trial-and-error method that is much easier to use with virtualization.
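A minimal sketch of that trial-and-error loop follows. The helper functions are hypothetical placeholders, not real APIs; in practice they would wrap your provisioning and measurement tooling (Hyper-V settings changes, a load driver, counter logs), and the thresholds are illustrative.

```python
# Placeholder helpers standing in for real provisioning and measurement tooling.
def set_guest_max_memory(vm, size_gb): pass
def run_load_test(): pass
def average_pages_per_sec(vm): return 0.0

candidate_sizes_gb = [2, 3, 4, 6, 8]
paging_by_size = {}

for size in candidate_sizes_gb:
    set_guest_max_memory("WIN81TEST5", size)   # vary the guest's memory footprint
    run_load_test()                            # drive a representative workload
    paging_by_size[size] = average_pages_per_sec("WIN81TEST5")

# Pick the smallest footprint whose paging rate stays acceptable.
acceptable = [s for s, paging in paging_by_size.items() if paging < 100]
print(min(acceptable) if acceptable else "the workload needs more than 8 GB")
```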

As we have seen, Dynamic Memory provides the ability to specify a flexible range of machine memory for a guest machine to operate within and then gives the hypervisor the freedom to experiment with memory settings within that range. The Dynamic Memory adjustment mechanism that makes decisions in real-time based on the Memory Pressure exerted by the guest machine is an excellent way to approach sizing physical memory. What’s more, since memory requirements can be expected to vary over time, the Hyper-V dynamic memory capability can also provide the flexibility to deal with this variability effectively.

To be clear, the Dynamic Memory Balancer does not attempt to settle on a physical memory configuration that is optimal for the guest machine to run in. Optimization based on determining what the physical memory requirements of a workload are remains something the performance analyst still must do. What Hyper-V does instead is balance the amount of physical memory allocated to each partition across all guest machines running at the same memory priority, relative to their physical memory usage – it attempts to equalize the Memory Pressure readings for all the guest machines running in a memory priority band. If the Hyper-V Host machine does not contain enough RAM to service each of the guest machines adequately, the Dynamic Memory Balancer will distribute physical memory in a manner that may cause every guest machine to suffer from a physical memory shortage. Moreover, a guest machine that has the freedom to extend the amount of physical memory it is using to its maximum setting can create a physical memory shortage on the Host machine that will impact other resident guest machines.
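Hyper-V’s actual balancing algorithm is not documented in this post, but a toy model of “equalize Memory Pressure across guests of the same priority, subject to each guest’s minimum and maximum settings” looks roughly like the sketch below.

```python
def rebalance(guests, total_machine_memory):
    """Toy model of pressure equalization (not the real algorithm). Each guest
    dict carries its committed bytes and its configured minimum/maximum dynamic
    memory. Allocating memory proportionally to committed bytes makes the
    pressure (committed / allocated) equal across guests; the min/max clamps
    mean the real balancer has to iterate and respect host reserves, which this
    sketch ignores."""
    total_committed = sum(g["committed"] for g in guests)
    for g in guests:
        proportional_share = total_machine_memory * g["committed"] / total_committed
        g["allocated"] = min(max(proportional_share, g["minimum"]), g["maximum"])
    return guests

GB = 1024 ** 3
guests = [
    {"name": "WIN81TEST3", "committed": 1 * GB, "minimum": 0.5 * GB, "maximum": 6 * GB},
    {"name": "WIN81TEST4", "committed": 1 * GB, "minimum": 0.5 * GB, "maximum": 6 * GB},
    {"name": "WIN81TEST5", "committed": 8 * GB, "minimum": 0.5 * GB, "maximum": 8 * GB},
]
for g in rebalance(guests, 8 * GB):
    print(g["name"], round(g["allocated"] / GB, 2), "GB")
```

The point of the sketch is simply that when total committed bytes exceed the machine memory available, equalizing the pressure spreads the shortage across every guest, which is exactly the failure mode described above.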

Performance risks aside, data centers derive significant benefits from virtualization, including those that arise in activities closely allied to performance, namely, provisioning, scalability and capacity planning. Virtualization brings additional flexibility to both provisioning and capacity planning. In planning for CPU capacity, for example, it is not possible to purchase a machine with three CPUs, but if that is the capacity a workload requires, it certainly is possible to configure a guest partition that way. Dynamic Memory provides the flexibility to specify a range of machine memory for a guest machine to operate within and then gives the hypervisor the freedom to “experiment” with different memory settings within that range. Being able to specify a start-up memory value and a much higher maximum value for a guest machine provides configuration flexibility that is very desirable in the absence of good intelligence about how much memory the guest machine really needs.

As noted above, by default, Hyper-V seeks to balance the physical memory allocations across a set of guest machines that are configured to run at the same memory priority. It wisely bases memory adjustments on measurements sent from the guest Windows machine reflecting how much physical memory is currently committed. But the Hyper-V approach to balancing memory allocations across guest machines lacks a goal-seeking mechanism that attempts to find an optimal memory footprint for the workloads.

Hyper-V architecture: Memory Ballooning

This is the fifth post in a series on Hyper-V performance. The series began here.

Ballooning

Removing memory from a guest machine while it is running is a bit more complicated than adding memory to it, which makes use of a hardware interface that the Windows OS supports. One factor that makes removing memory from a guest machine difficult is that the Hyper-V hypervisor does not gather the kind of memory usage data that would enable it to select guest machine pages that are good candidates for removal. The hypervisor’s virtual memory capabilities are limited to maintaining the second level page tables needed to translate Guest Virtual addresses to valid machine memory addresses. Because the hypervisor does not maintain any memory usage information that could be used, for example, to identify which of a guest machine’s physical memory pages have been accessed recently, it uses ballooning when Guest Physical memory needs to be removed from a partition. Ballooning transfers the decision about which pages to remove from memory to the guest machine OS, which can execute its normal page replacement policy.

Ballooning was pioneered in VMware ESX, first discussed publicly in a paper by Carl Waldspurger entitled “Memory Resource Management in VMware ESX Server,” published in Dec. 2002. See Proc. Fifth Symposium on Operating Systems Design and Implementation (OSDI ’02). The Hyper-V implementation is similar, but with some key differences. One key difference is that the Hyper-V hypervisor has no ability to remove guest physical memory arbitrarily and swap it to a disk file, as VMware ESX does when it faces an acute shortage of machine memory. VMware ESX swapping selects pages at random for removal, and without any knowledge of how guest machine pages are used, the hypervisor can easily choose badly. The Microsoft Hyper-V developers chose not to implement any form of hypervisor swapping of machine memory to disk. For page replacement, Hyper-V relies solely on the virtual memory management capabilities of the guest OS, which is usually Windows, when there is a shortage of machine memory. Frankly, performance suffers under either approach when there is an extreme machine memory shortage – overloading machine memory is something to be avoided on both virtualization platforms. Hyper-V does have the virtue that machine memory management is simpler, relying on a single mechanism to relieve a machine memory shortage.

In both virtualization approaches, it is important to be able to understand the signs that the VM Host machine’s memory is over-committed. In Hyper-V, these include the following (a simple screening check over these counters is sketched just after the list):

  • a shortage of Hyper-V Dynamic Memory\Available Memory
  • sustained periods where the Hyper-V Dynamic Memory\Average Memory Pressure measurements for one or more guest machines hovers near 100
  • internal guest machine measurements show high paging rates to disk (Memory\Pages/sec, Memory\Pages Input/sec)
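A minimal screening function over those indicators might look like the sketch below. The counter values are assumed to have been collected already (for example, from Performance Monitor logs), and the thresholds are illustrative rules of thumb, not official limits.

```python
def host_memory_overcommitted(available_mb, avg_pressure_by_guest, pages_per_sec_by_guest):
    """Rough screening check mirroring the three indicators listed above."""
    low_available = available_mb < 512                         # assumed low-water mark
    sustained_pressure = any(p >= 100 for p in avg_pressure_by_guest.values())
    heavy_guest_paging = any(rate > 500 for rate in pages_per_sec_by_guest.values())
    return low_available and (sustained_pressure or heavy_guest_paging)

# Example readings, loosely patterned after the case study above:
print(host_memory_overcommitted(
    available_mb=200,
    avg_pressure_by_guest={"WIN81TEST3": 105, "WIN81TEST4": 102, "WIN81TEST5": 118},
    pages_per_sec_by_guest={"WIN81TEST5": 2500},
))   # True
```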

Because ballooning transfers the decision about which pages to remove from guest physical memory to the guest OS, we need to revisit virtual memory concepts briefly in this new context. One goal of virtual memory management is to utilize physical memory efficiently, essentially filling up physical memory completely, aside from a small buffer of unallocated physical pages that are kept in reserve. Memory over-commitment works because processes frequently allocate more virtual memory than they need at any one moment in time. Consequently, it is usually not necessary to back every allocated virtual memory page with guest physical memory. Consider a guest machine that reports a Memory Pressure reading of 100 – in other words, its Committed Bytes = Visible Physical Memory. Typically, 10-20% of the machine’s committed pages are likely to be relatively inactive, which would allow the OS to remove them from physical memory without much performance impact.

Since virtual memory management by design tends to fill up physical memory, it is not uncommon for the OS to need to displace a currently allocated virtual page from physical memory from time to time to make room for a new or non-resident page that the process has just referenced. Windows implements an LRU page replacement policy, trimming older pages from process working sets when physical memory is in short supply. Windows and Linux guest machines manage virtual memory dynamically, keeping track of which of an application’s virtual pages are currently being accessed. Furthermore, the OS’s page replacement policy ages allocated virtual memory pages that have not been referenced in the current interval. The pages of a process that have not been referenced recently are usually better candidates for removal in favor of current pages.

The ballooning technique used in Hyper-V – and in VMware ESX, as well – pushes the decision about which specific pages to remove down to the guest machine, which is in a far better position to select candidate pages for removal because the guest OS does maintain memory usage data. The term “ballooning” refers to a management thread running inside the guest machine that acquires empty physical memory buffers when the hypervisor signals that it wants to remove physical memory from the partition. This action can be thought of as the memory balloon inflating. Once the balloon has inflated, when Hyper-V decides to add memory back to the child partition, it deflates the balloon, freeing up the balloon memory that was previously acquired.

In Hyper-V, ballooning is initiated by the Dynamic Memory Balancer, a task hosted inside the Root partition’s Virtual Machine Management Server (VMMS) component. Whenever the Dynamic Memory Balancer decides to adjust the amount of guest physical memory allotted to a guest machine, it communicates with the specific VM worker process running in the Root partition that maintains the state of the guest machine. If the decision is to remove memory, the VM worker process issues a message to request page removal that is communicated to the child partition across the VMBus.

The memory ballooning process used to reduce the size of guest physical memory is depicted in Figure 15. Inside the child partition, the Dynamic Memory VSC – also responsible for implementing the guest OS enlightenment that reports the number of guest OS committed bytes – responds to the remove memory request by making a call to the MmAllocatePagesForMdlEx API, which acquires memory from the non-paged pool. This pool of allocated physical memory, normally used by drivers for DMA devices that need access to physical addresses, is the “balloon” that inflates when Hyper-V determines it is appropriate to remove guest physical memory from the guest machine. The Dynamic Memory VSC then returns to the Root partition – via another VMBus message – a list of the Guest Physical addresses of the balloon pages that it has just acquired. The Root partition then signals the hypervisor that these pages are available to be added to a different partition.

Figure 15. The balloon driver is a Dynamic Memory VSC that responds to a VMBus request to remove memory by acquiring memory from the non-paged pool. The balloon driver then returns a list of physical memory pages that the hypervisor can immediately grant to a different virtual machine.

Since the balloon driver in the guest machine will pin the memory balloon pages in nonpaged physical memory until further notice, the physical memory pages in the guest machine balloon prove the exception to the rule that memory locations can only be occupied by one guest machine at a time. The pages in the balloon are set aside, remaining accessible from inside the guest machine; however, the balloon driver ensures that they are not actually accessed. This allows Hyper-V to grant the machine memory these balloon pages occupy to another guest machine to use.
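Since the balloon driver is kernel-mode code, there is nothing here that would run outside a guest, but the bookkeeping can be sketched abstractly. The model below is purely conceptual; GuestMemory and its allocate_nonpaged_pages/free_nonpaged_pages methods are hypothetical stand-ins for what the Dynamic Memory VSC does with MmAllocatePagesForMdlEx and its release path.

```python
class GuestMemory:
    """Stub standing in for the guest OS's nonpaged pool (hypothetical helper)."""
    def __init__(self, total_pages):
        self.free_pages = list(range(total_pages))
    def allocate_nonpaged_pages(self, n):            # cf. MmAllocatePagesForMdlEx
        taken, self.free_pages = self.free_pages[:n], self.free_pages[n:]
        return taken
    def free_nonpaged_pages(self, pages):
        self.free_pages.extend(pages)

class BalloonModel:
    """Conceptual model of the balloon: inflating pins guest pages so the
    hypervisor can reassign the backing machine memory to another partition;
    deflating hands the pages back to the guest."""
    def __init__(self, guest):
        self.guest = guest
        self.balloon_pages = []
    def inflate(self, n_pages):
        pages = self.guest.allocate_nonpaged_pages(n_pages)
        self.balloon_pages.extend(pages)
        return pages   # page list reported to the Root partition over VMBus
    def deflate(self, n_pages):
        released, self.balloon_pages = self.balloon_pages[:n_pages], self.balloon_pages[n_pages:]
        self.guest.free_nonpaged_pages(released)
        return released

guest = GuestMemory(total_pages=1024)
balloon = BalloonModel(guest)
balloon.inflate(256)        # hypervisor wants 256 pages back from this guest
balloon.deflate(128)        # later, half of the ballooned memory is returned
print(len(balloon.balloon_pages), len(guest.free_pages))   # 128 pinned, 896 free
```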

From inside the guest Windows machine, the balloon inflating increases the amount of nonpaged pool memory that is allocated, as illustrated in Figure 16, which reports the size of the nonpaged pool in a Windows guest during a period when the balloon inflates (shortly after 5 pm) and then deflates about an hour later.

Figure 16. Inside the guest Windows machine, the balloon inflating corresponds to an increase in the amount of nonpaged pool memory that is allocated. In this example, the balloon deflates about 1 hour later.

As in VMware, ballooning itself has no guaranteed immediate impact on physical memory contention inside the Windows guest machine. So long as the guest machine has a sufficient supply of available pages, the impact remains minimal. Over time, however, ballooning can pin enough guest OS pages in physical memory to force the guest machine to execute its page replacement policy. In the case of Windows, this means that the OS will also issue a LowMemoryResourceNotification event, which triggers garbage collection in a .NET Framework application and a similar buffer manager trimming operation in SQL Server. On the other hand, if ballooning does not cause the guest OS machine to experience memory contention, i.e., if the balloon request can be satisfied without triggering the guest machine’s page replacement policy, there will be no visible impact inside the guest machine other than an increase in the size of the nonpaged Pool.