Memory Resource Management in VMware ESX Server
Carl Waldspurger. Memory Resource Management in VMware ESX Server in Proceedings of the 5th Symposium on Operating Systems Design and Implementation, 2002.
Reviews due Thursday, 2/22.
Comments
SUMMARY
In "Memory Resource Management in VMware ESX Server" Carl Waldspurger describes memory management mechanisms and policies used in VMware ESX server. The following are addressed: page reclamation/eviction, efficient memory utilization while providing performance isolation guarantees, content-based page sharing and transparent page remapping.
PROBLEM
Operating systems are designed to have full control of hardware resources. This assumption is broken when an OS is running in a virtual machine created by a virtual machine monitor (VMM). In order to maximize compatibility, as well as be able to run unmodified OSes, the VMM may not be able to coordinate its actions with the running OS. This results in hardware having two managers that may have trouble communicating. This, in turn, results in performance anomalies.
CONTRIBUTIONS
Idea that VMM should coax guest OS to do its work since the guest OS is likely to have more information to make a good decision.
Efficient memory virtualization: shadow page tables let the machine TLB translate virtual addresses in the guest OS directly to machine memory addresses.
Identification of the "double paging" problem, where a guest OS may want to swap out memory that was already swapped out by the VMM.
Ballooning technique: "inflating" a memory balloon causes the guest OS to swap "least necessary" pages. The balloon module communicates which pages it has to the VMM. Those (physical) pages may be reused by VMM. Thus the decision of what to swap is performed in the guest OS.
Content-based page sharing: VMM identifies pages with identical content in different guests and uses only one machine physical page to store it.
QoS guarantees based on memory shares: resource shares are traded between guest OSes, with the VMM determining the cost of a share. This results in a transfer of resources from those who want them least (they sell cheapest) to those who need them most (they are willing to pay the most).
Page remapping: because I/O involving high memory can be more expensive than I/O involving low memory, the VMM keeps track of high-memory pages frequently used for I/O and transparently remaps them to pages in low machine memory.
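To make the ballooning idea concrete, here is a minimal toy sketch in Python. It is not ESX Server code: the GuestOS class, its LRU eviction policy, and every name in it are assumptions made for illustration. The point it shows is that the VMM only sets a balloon size, while the guest's own replacement policy decides which pages get swapped.

```python
from collections import OrderedDict


class GuestOS:
    """Toy guest with a fixed number of 'physical' pages and LRU eviction (assumed policy)."""

    def __init__(self, num_pages):
        self.num_pages = num_pages
        self.resident = OrderedDict()  # resident pages, least recently used first
        self.swapped = set()           # pages the guest itself chose to swap out
        self.balloon = 0               # pages currently pinned by the balloon driver

    def _evict_to_fit(self):
        # The guest's own policy picks the victims, not the VMM.
        while len(self.resident) + self.balloon > self.num_pages:
            victim, _ = self.resident.popitem(last=False)
            self.swapped.add(victim)

    def touch(self, page):
        """Access a page, swapping out LRU pages if memory is full."""
        self.swapped.discard(page)
        self.resident[page] = None
        self.resident.move_to_end(page)
        self._evict_to_fit()

    def set_balloon(self, target):
        """Inflate or deflate the balloon; return the pages the VMM may reuse."""
        self.balloon = min(target, self.num_pages)
        self._evict_to_fit()
        return self.balloon


guest = GuestOS(num_pages=4)
for page in ["a", "b", "c", "d"]:
    guest.touch(page)
guest.touch("a")                  # "a" becomes the most recently used page
reclaimed = guest.set_balloon(2)  # the VMM asks for two pages back
print(reclaimed, sorted(guest.swapped), list(guest.resident))
```

Inflating the balloon by two pages makes this toy guest swap out "b" and "c", its two least recently used pages, which is exactly the decision the VMM could not have made well on its own.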
FLAWS
VMware is perfect. Seriously. One thing I wished for is a bit more information about how ESX Server actually works, to give more context. I am a VMware Workstation user, but after reading the introduction I did not feel that I had a sufficient understanding of how ESX Server works.
RELEVANCE
Very relevant. Virtual machines are the future.
Posted by: Vladimir Brik | February 22, 2007 10:28 AM
Summary
This paper discusses the use of ballooning and memory sharing between guest operating systems in the VMware ESX server.
Problem
The problem that the authors are trying to solve is that of running unmodified guest operating systems inside a virtual machine, and the challenges this presents for memory management. In particular, the virtual machine monitor needs to know which of the pages used by the guest OS are best to reclaim.
Contributions
The primary contributions of this paper are the discussion of memory virtualization that allows overcommitment of resources, the ballooning technique for making guest operating systems decide which pages to reclaim, and the technique for sharing identical memory between different virtual machines.
The ballooning technique inserts a small "balloon" module into the guest operating system as a device driver or kernel service. The purpose of this module is to put pressure on the memory management system of the guest OS: it pins pages in the guest OS when pressure is needed and deallocates them when less pressure is needed. The module then uses a private channel to tell the virtual machine monitor which physical pages it has claimed, so that the monitor can reclaim them.
The technique for sharing memory between virtual machines relies on checking the content of pages on multiple virtual machines and allowing sharing of pages with identical contents. In order to achieve this, hashing of page contents is used to identify possible identical pages which are then subject to a full comparison.
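The hash-then-compare idea can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the SharingVMM class and its fields are hypothetical, SHA-1 stands in for whatever hash function the real system uses, and the sketch shares a page when it is mapped rather than finding candidates by scanning.

```python
import hashlib


class SharingVMM:
    """Toy content-based page sharing with copy-on-write (hypothetical structure)."""

    def __init__(self):
        self.machine = {}   # machine page number -> page contents (bytes)
        self.refs = {}      # machine page number -> number of guest pages mapped to it
        self.by_hash = {}   # content hash -> machine page number (sharing candidates)
        self.pmap = {}      # (vm, guest physical page) -> machine page number
        self.next_mpn = 0

    def _alloc(self, content):
        mpn = self.next_mpn
        self.next_mpn += 1
        self.machine[mpn] = content
        self.refs[mpn] = 1
        return mpn

    def map_page(self, vm, ppn, content):
        key = hashlib.sha1(content).digest()
        candidate = self.by_hash.get(key)
        # A hash match is only a hint; the full comparison decides.
        if candidate is not None and self.machine[candidate] == content:
            self.refs[candidate] += 1
            self.pmap[(vm, ppn)] = candidate
        else:
            mpn = self._alloc(content)
            self.by_hash.setdefault(key, mpn)
            self.pmap[(vm, ppn)] = mpn

    def write_page(self, vm, ppn, content):
        mpn = self.pmap[(vm, ppn)]
        if self.refs[mpn] > 1:
            # Shared page: give the writer a private copy (copy-on-write).
            self.refs[mpn] -= 1
            self.pmap[(vm, ppn)] = self._alloc(content)
        else:
            # Private page: drop the stale hash entry, then update in place.
            old_key = hashlib.sha1(self.machine[mpn]).digest()
            if self.by_hash.get(old_key) == mpn:
                del self.by_hash[old_key]
            self.machine[mpn] = content


vmm = SharingVMM()
zero_page = bytes(4096)
vmm.map_page("vm1", 0, zero_page)
vmm.map_page("vm2", 7, zero_page)
shared = vmm.pmap[("vm1", 0)] == vmm.pmap[("vm2", 7)]
vmm.write_page("vm2", 7, b"x" * 4096)
print(shared, len(vmm.machine))
```

In the usage above, two zero-filled pages from different VMs end up backed by one machine page, and the later write by vm2 gets a private copy without disturbing vm1.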
Flaws
In my opinion, this paper is very solid. Perhaps I feel this way because I use VMware and find it to be an excellent product.
Relevance
This paper is extremely relevant because virtual machines are increasing in popularity as a way both to run multiple OSes and to provide increased security.
Posted by: Aaron Bryden | February 22, 2007 10:06 AM
Summary:
The paper talks about different memory management and sharing techniques used in the popular VMware ESX Server virtual machine monitor.
Problem:
With memory and computing resources becoming cheaper and better, the idea of running multiple virtual machines on a host machine became attractive, because it allows scalability and fault containment for commodity OSes running on VMs. Proliferation of this idea also created a need for effective use and sharing of the available resources among all virtual machines on the host. This paper focuses on techniques for memory management and sharing that allow overcommitment of available resources.
Contribution:
The paper introduces several new ideas.
- ballooning is as simple and effective as it could get, to reclaim pages without knowing the OS policies
- A content-based page sharing (combined with transparent copy-on-write) to identify shareable pages and share them.
- QoS guarantees using shares, and reclamation of unused memory using an idle memory tax (i.e. using usage statistics of allocated memory)
- Dynamic reallocation of memory using different strategies at different memory levels
- An I/O remapping technique to low memory to avoid unnecessary copying to high-mem during I/O transfers
- Lots of data to prove that the claims are right
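As a worked example of the idle memory tax, the sketch below uses the adjusted shares-per-page ratio as I remember it from the paper: rho = S / (P * (f + k * (1 - f))), with idle page cost k = 1 / (1 - tax rate), where S is the number of shares, P the allocated pages, and f the active fraction. The function names, the two example VMs, and the 75% tax rate used as a default here are my own choices for illustration, so treat this as a sketch rather than the actual implementation.

```python
def adjusted_shares_per_page(shares, pages, active_fraction, tax_rate=0.75):
    """Shares-per-page ratio in which idle pages are charged more than active ones."""
    if not 0.0 <= tax_rate < 1.0:
        raise ValueError("tax_rate must be in [0, 1)")
    if pages <= 0:
        raise ValueError("pages must be positive")
    idle_cost = 1.0 / (1.0 - tax_rate)  # k: cost of an idle page relative to an active one
    weighted = active_fraction + idle_cost * (1.0 - active_fraction)
    return shares / (pages * weighted)


def pick_victim(vms, tax_rate=0.75):
    """Reclaim from the VM with the lowest adjusted ratio (it 'pays' least per page)."""
    return min(
        vms,
        key=lambda name: adjusted_shares_per_page(*vms[name], tax_rate=tax_rate),
    )


# name -> (shares, allocated pages, fraction of pages that are active); made-up numbers
vms = {
    "busy": (1000, 256, 1.0),
    "idle": (1000, 256, 0.2),
}
print(pick_victim(vms))
```

Both VMs hold the same shares and the same number of pages, but the mostly idle one ends up with the lower ratio and therefore loses memory first. With a tax rate of 0 the two ratios are identical, which is plain share-based allocation.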
Flaws:
The paper was very well written. Two minor flaws I could find were the following.
- The paper describes the probability of a collision in the hash function used to identify shareable pages as 'incredibly small'. Wouldn't it be fatal if a collision still happened?
- There are no experiments/data to substantiate the performance improvement from I/O remapping, though the idea is straightforward.
Relevance:
With computing resources becoming cheaper and more powerful these days, virtualization has a significant role to play in OS development and production use. These techniques have proved very successful in modern VMs.
Posted by: Base Paul | February 22, 2007 07:00 AM
Summary:
Virtual machines are great for increasing the utilization of a machine. This paper describes how VMware, Inc. has implemented memory management. VMware's policy of not modifying the guest OS has forced its engineers to come up with several interesting ideas for managing memory between competing VMs.
Problem:
The paper states that one reason for the recent popularity of virtual machines is the underutilization of resources. VMware's memory management implementation addresses underutilization by providing each virtual machine with a memory limit. The virtual machines' combined memory limits may exceed the actual machine memory (aka "overcommitment"). The problem this paper addresses is how to do memory management in the overcommitted context.
Contributions:
The VMware balloon device driver is a great idea! By loading the device driver into a guest OS, VMware can inflate or deflate the "balloon", which forces the guest OS to comply with VMware's memory management decision using the guest operating system's own management mechanisms.
Content-based page sharing to take advantage of redundant ops/data. Pages are identified by content, and a hash table is queried to see if there are any candidates for identical contents. If so, a more exhaustive comparison is made. If the contents are the same, use copy-on-write sharing to save on memory usage.
Proportional allocation allows control over memory usage. To prevent any potential starvation issues, VMware introduces its "idle memory tax." Idle pages "cost" more than active pages, and so are reclaimed.
I/O Page Remapping allows for the VMware server to exploit a processor feature that allows 64 GB of memory to be addressed with 36-bit addresses. VMware gets tricky and can remap "hot" pages in high memory to pages in low memory.
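Here is a small sketch of how the "hot page" remapping could work. It is a guess at the shape of the mechanism, not VMware's code: the IoRemapper class, the threshold value, and the assumption that "low" means below the 4 GB boundary with 4 KB pages are all mine.

```python
PAGE_SIZE = 4096
LOW_MEMORY_PAGES = (4 * 2**30) // PAGE_SIZE  # assumed boundary: machine pages below 4 GB


class IoRemapper:
    """Counts I/O copies per guest page and remaps hot high-memory pages to low memory."""

    def __init__(self, free_low_pages, threshold=3):
        self.free_low = list(free_low_pages)  # free machine pages in low memory
        self.threshold = threshold            # copies tolerated before remapping (assumed)
        self.copy_counts = {}                 # guest physical page -> copies so far

    def do_io(self, pmap, ppn):
        """Perform I/O on guest page ppn; return 'direct', 'copied', or 'remapped'."""
        if pmap[ppn] < LOW_MEMORY_PAGES:
            return "direct"  # the device can reach the page, no copy needed
        count = self.copy_counts.get(ppn, 0) + 1
        self.copy_counts[ppn] = count
        if count > self.threshold and self.free_low:
            # Hot page: change the physical-to-machine mapping, invisibly to the guest.
            pmap[ppn] = self.free_low.pop()
            del self.copy_counts[ppn]
            return "remapped"
        return "copied"  # data goes through a temporary buffer in low memory


pmap = {0: LOW_MEMORY_PAGES + 10}  # guest page 0 currently lives in high memory
remapper = IoRemapper(free_low_pages=[5], threshold=2)
results = [remapper.do_io(pmap, 0) for _ in range(4)]
print(results)
```

After the threshold is crossed, the guest page is backed by a low machine page and later I/O no longer needs the extra copy. The guest never sees the change because only the VMM's physical-to-machine mapping is touched.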
Flaws:
As everyone is saying, aside from a lack of justification for a few decisions, this paper doesn't seem to have any glaring flaws. I guess if I was picking nits, I'd say that some parts of it feel like advertisements.
Relevance:
Virtualization is increasingly popular, trendy, and useful. The implementation details provided in this paper are surely of interest to anyone remotely interested in virtualization.
Posted by: Jon Beavers | February 22, 2007 05:47 AM
Summary:
This paper describes the techniques that are used in VMware's high-end product, ESX Server. It covers a few nifty tricks they use, and provides some performance data.
Problem:
On Tuesday, we talked in class about how doing resource scheduling in microkernel/exokernel-type architectures is difficult, but this is exactly the problem that the VMware people had to address. (Only in this case it's arguably worse because the guest OSes they discuss running are all built with the expectation that they will be running on the bare metal.)
Contributions:
* A number of neat ideas, such as ballooning as an indirect way of reclaiming memory, random sampling of memory to determine how much memory was being used (so they could tax it), and random memory scans to find identical pages. The first technique is probably pretty much only applicable to virtual machine monitors, but the other two could be done in commodity OSes too, especially the page sharing. (Though in that case it's much less clear that you'd get much of a benefit since the big win for VMware is shared code between VMs.)
* The concept behind page sharing isn't new. Pages are implicitly shared when a process forks in Unix for example, and one of the primary benefits of shared libraries is that the text segments can be shared between processes using them. However, the idea of scanning memory looking for these opportunities without being informed is to my knowledge new. For instance, IBM's z/VM supports sharing code segments, but as far as I know you have to explicitly instruct the hypervisor that you want that sharing to occur.
* A few evaluations that show the effectiveness of the techniques they use. They demonstrate how taxing unused pages reduces the allocation granted to VMs, how much sharing pages with the same content reduces memory load, etc.
Flaws:
The paper wasn't scratch-n-sniff. I found this very disappointing. Actually, seriously, I think this was a very well-written paper. The one thing that I would like to see is a comparison of their indirect methods of doing content-based sharing and page reclamation with methods to achieve the same effect in a paravirtualized environment. I know they are aiming at being able to run unmodified operating systems (read: Windows), but as virtualization becomes more important I think paravirtualization will become more important too.
Relevance:
Virtualization is becoming very big. The Longhorn server will apparently include virtualization technology in some fashion, new x86 processors are equipped with VT/Pacifica technology to allow hardware virtualization, and IBM has experienced a resurgence of interest in their VM product in concert with Linux over the last few years. Thus techniques that are designed to work in these environments will become increasingly common.
Posted by: Evan Driscoll | February 22, 2007 12:04 AM
Summary:
This paper describes the key mechanisms and policies of VMware ESX Server, which makes it possible to run existing commodity OSes such as Windows and Linux safely and efficiently as guest OSes, without modification.
Problem:
Many server machines in industry are currently underutilized. Virtualizing machine resources and running virtual machines has become a popular way to make better use of those servers. However, most VM systems are tied to a particular hardware architecture or require modifications to the guest OSes that run on the virtual machine. To improve performance and convenience, what is needed is a VM server that runs directly on the machine, provides virtualized resources that look the same as real devices, and enables fair sharing.
Contributions:
No Modification on guest OS:
For the system to become popular in industry, it is important to reduce the cost of starting to use it. Getting rid of guest OS modification is therefore really important, and also challenging, since it requires a complete, clear, and safe virtualization technique.
Running the VM server directly on the machine, as an OS, is another important aspect: it provides high performance and real isolation of each virtual machine.
Efficient and Fair Resource Allocation:
For high performance and efficient resource allocation, ESX Server introduces several key mechanisms. Ballooning makes it possible to allocate memory dynamically according to the needs of each virtual machine, using polling to track the current state. Content-based page sharing makes it possible to share memory that has the same content and is used by several virtual machines, without any guest OS noticing. Idle memory measurement improves memory utilization by finding pages that a guest OS holds without actively using them, so that they can be evicted from memory.
Flaws:
I felt this paper was well organized. Since this is a real industrial product, I felt the paper was quite complete and all features were well balanced. It might be better if there were a few more real-world experiments, such as a deployment in a small company and one in a large company. A discussion of other devices, such as networking, might also be interesting. I guess they already have a function like this, but it would also be interesting to have some kind of mechanism to configure the weight of each virtual machine, so that performance can be customized depending on the role or importance of each virtual machine.
Relevance:
ESX Server is currently a VMware product and seems to be becoming widely used in companies. It also seems that it can use several machines as a pool of resources and dynamically allocate them to each virtual machine. Since resources are getting cheaper, workloads have not grown as fast, and VMs make it possible to share a machine safely, this technique is very important.
Posted by: Hidetoshi Tokuda | February 21, 2007 11:34 PM
Summary
This paper presents VMware ESX server mechanisms and policies for efficiently managing memory in a virtual machine environment without any modification to the guest operating systems. In order to efficiently manage memory the authors propose techniques like ballooning to reclaim pages considered least valuable by the operating system running in a virtual machine, idle memory tax to achieve efficient memory utilization and content based page sharing to avoid redundancy.
Problem Description
In a virtual machine environment, multiple operating systems might be running multiple workloads at the same time. As a result a lot of pressure is put on the memory subsystem. If the policies to manage the memory subsystem are not efficient, the operating systems and workloads running in different virtual machines can suffer significantly. In this paper, the authors present several policies to efficiently manage the memory subsystem while running several virtual machines.
Contributions
Some of the contributions of this paper are as follows.
1. In the case of a virtual machine, when an application generates a virtual address, it is first converted to a physical address and then to a machine address, which refers to actual hardware memory. As a result, an extra level of translation gets involved. The authors propose the idea of using shadow page tables, which contain direct mappings from virtual to machine addresses and are kept consistent with the physical-to-machine mappings. As a result the extra level of translation is avoided. This approach can result in significant performance gains.
2. In order to efficiently reclaim memory from the guest operating systems, the authors propose a technique called ballooning. Balloon drivers are installed in the guest operating systems and they poll the server once per second to obtain a target balloon size. Based on the size of the balloon, the guest operating system either frees memory or claims it. This approach is quite advantageous. Normally, the total physical memory allocated to an operating system can only be changed at boot time. Using ballooning, the amount of physical memory allocated to an operating system can be changed dynamically, because inflating the balloon puts pressure on the guest operating system to invoke its own native memory management algorithms.
3. The authors also propose content-based page sharing. Guest operating systems running on different virtual machines can access the same physical page. Instead of keeping multiple copies of the same page in memory, the authors propose the idea of having content-based page sharing. As a result common pages are shared among different virtual machines. This can save a lot of space in the physical memory. A naïve solution to implement content based sharing can result in a lot of page comparisons. To avoid this overhead, the authors propose the idea of assigning a hash value to all the pages. Only after the hash is matched, the contents of the two pages are compared.
4. To reclaim idle memory, the authors introduce an idle memory tax. Based on the tax rate, idle pages are reclaimed from a guest operating system and are assigned to systems that are in need of memory. This can significantly improve the overall performance of the system.
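The shadow page table idea from item 1 can be sketched as the composition of two mappings. This is a simplified illustration, not the actual data structure: the ShadowPageTable class, the dictionary-based tables, and the page numbers are all hypothetical.

```python
class ShadowPageTable:
    """Caches direct virtual-to-machine mappings composed from two lookups."""

    def __init__(self, guest_page_table, pmap):
        self.guest_page_table = guest_page_table  # virtual page -> guest "physical" page
        self.pmap = pmap                          # guest "physical" page -> machine page
        self.shadow = {}                          # virtual page -> machine page

    def translate(self, vpn):
        """Return the machine page for a virtual page, filling the shadow entry on a miss."""
        if vpn not in self.shadow:
            ppn = self.guest_page_table[vpn]
            self.shadow[vpn] = self.pmap[ppn]
        return self.shadow[vpn]

    def remap_physical(self, ppn, new_mpn):
        """Change a physical-to-machine mapping and drop shadow entries that depend on it."""
        self.pmap[ppn] = new_mpn
        stale = [vpn for vpn, p in self.guest_page_table.items() if p == ppn]
        for vpn in stale:
            self.shadow.pop(vpn, None)


spt = ShadowPageTable(guest_page_table={0x10: 3, 0x11: 4}, pmap={3: 700, 4: 701})
first = spt.translate(0x10)   # composed from both tables, then cached
spt.remap_physical(3, 900)    # e.g. the VMM moved or reclaimed the backing page
second = spt.translate(0x10)  # the stale shadow entry was invalidated
print(first, second)
```

After the first lookup, a translation costs a single table access. The invalidation step is what keeps the shadow entries consistent with the physical-to-machine mappings when the VMM moves a page.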
Flaws
ESX server uses randomized page replacement policy. The authors do not justify the use of a randomized policy. Page replacement policies can significantly affect the performance of an operating system and as discussed in the Working Set paper, using randomized page replacement policy is not a good idea.
Relevance
Virtual machines are gaining a lot of popularity these days. The ideas presented in this paper are quite relevant to modern virtual machine systems. Memory management is still a very important factor that affects the performance of a virtual machine system. The memory management policies proposed in this paper can significantly improve the performance of these systems.
Posted by: Atif Hashmi | February 21, 2007 10:07 PM
Summary
This paper presents the VMware ESX Server, specifically its memory management organization. The main issues related to memory management and virtual machines are discussed, and solutions are presented that allow proprietary guest operating systems to run without any modifications.
Problems Addressed
A primary goal of this work was to build a virtual machine that other operating systems could run within, without requiring any modifications to the guest system. The virtual machine must create the illusion for the guest operating system that it has the whole machine to itself. To accomplish this, issues related to memory virtualization, including page replacement, memory allocation, and memory sharing, are all addressed and novel solutions are presented. Memory allocation and reclamation can be tricky because they are handled by the guest operating system, which the virtual machine cannot directly control.
Contributions
Running many virtual machines on a single system, where each machine requires a significant amount of memory to run, can rapidly exhaust the physical RAM in the system. Thus most of the contributions in this paper deal with reducing the amount of memory needed to run multiple virtual machines, and with ways of reclaiming memory from the virtual machines.
In order to control the memory management of the guest operating system, a novel technique called "ballooning" is presented, in which a service is installed in the guest operating system that has a direct connection to the VM server. Through this connection the service can be told to use more or less memory, which affects the way the guest operating system manages its available memory. To help reduce the amount of memory required for multiple virtual machines running similar software, a mechanism for sharing is presented that seems to be very effective while not requiring a lot of overhead to maintain. Finally, to help maximize the use of RAM, a tax on memory pages is enforced for memory regions that are inactive. This allows memory to be freed for use in places where it may be needed more.
Flaw
Random page selection for demand paging seems like it may not be the optimal strategy, since it may cause unnecessary page faults. Another question that arose during the reading is how accurate the estimate of idle memory pages is. Since only a small sample of the memory is used to determine the fraction of idle memory in a virtual machine, it seems some machines could have their memory moved to other machines unjustifiably, and vice versa.
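A rough way to put a number on that worry is the standard error of a sampled proportion, sqrt(f * (1 - f) / n). The sketch below assumes simple random sampling of n = 100 pages per measurement (the sample size I recall from the paper) and ignores any smoothing across periods. The 10,000-page guest with 30% active pages is made up for illustration.

```python
import math
import random


def standard_error(true_fraction, sample_size):
    """Standard error of a proportion estimated from a simple random sample."""
    return math.sqrt(true_fraction * (1.0 - true_fraction) / sample_size)


def estimate_active_fraction(page_is_active, sample_size, rng):
    """Estimate the active fraction by checking a random sample of pages."""
    sample = rng.sample(range(len(page_is_active)), sample_size)
    return sum(page_is_active[i] for i in sample) / sample_size


rng = random.Random(0)
pages = [i < 3000 for i in range(10000)]  # made-up guest: 30% of pages are active
estimates = [estimate_active_fraction(pages, 100, rng) for _ in range(200)]
mean_estimate = sum(estimates) / len(estimates)

print("worst-case standard error:", standard_error(0.5, 100))
print("standard error at f = 0.3:", round(standard_error(0.3, 100), 4))
print("spread of estimates:", min(estimates), "to", max(estimates))
```

With 100 samples the standard error is at most about 0.05 (at f = 0.5), so a single measurement can easily be off by five to ten percentage points. That is noisy, but probably tolerable if the estimates are averaged over several periods before memory is moved.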
Relevance
As processors become faster and more importantly memory becomes more plentiful and cheaper the popularity of virtual machines will continue to grow. The ideas presented in this paper seem like they could have a significant impact on the design and implementation of the VM server in the years to come.
Posted by: Nuri Eady | February 21, 2007 10:02 PM
Summary
Ballooning and page sharing are used to allow for efficient virtual memory management without any modification of the operating systems being run.
Problem
Current virtualization techniques either had to modify the operating systems being run or had to under provision the virtual machines to ensure enough memory during peak usage.
Contributions
Ballooning is the process of running a pseudo device driver inside of the virtual OS. This facilitates reclaiming memory from the OS. With the ability to reclaim memory the VMM can overprovision VMs and better take advantage of the benefits of statistical multiplexing.
The paper described the problem of double paging (the guest OS swapping out a page shortly after the VMM has swapped out that same page) to motivate the idea of page sharing. Using hashes, the VMM compares pages based on their content. Duplicate copies of pages are removed in order to increase memory utilization. Copy-on-write is used if the pages diverge at a later time.
A tax system is used to allocate shared memory to VMs. The less a shared page is used the more likely it is to be reclaimed. This allows the system to continually reclaim pages, while still allowing enough free pages to fill spikes in demand. A number of techniques were presented to measure memory activity.
Flaws
Possibly I do not understand the corporate environment well enough, but I felt the case for running 10 copies of an OS on the same machine was not well justified. With the price of memory and machines declining, I’m not sure there is such a need for this kind of multiplexing. Why not just add another blade? Not to mention the difficulty of finding 10 VMs whose peak demands are uncorrelated. Other than the somewhat extreme analysis, I felt the paper was well reasoned and well written.
Relevance
With the increased popularity of virtualization these techniques will help VMs’ performance stay on par with non-virtualized machines’ performance. Reducing server costs has the potential to have a large influence.
Posted by: Kevin Springborn | February 21, 2007 08:57 PM
Paper Review: Memory Resource Management in VMware ESX Server [Waldspurger]
Summary:
This paper presents the techniques used by a special operating system,
VMware ESX Server, designed to be the "host" operating system for
virtual machines that run common, unmodified "guest" operating systems.
The effectiveness of these techniques is shown by the results of various
measurement studies.
Problem:
The problem addressed by this work is to effectively manage an
overcommitted memory resource when the "client" guest operating systems
are all designed to manage machine memory themselves, and therefore
behave somewhat more aggressively than just application processes under
a common operating system might.
Contributions:
A number of novel techniques are offered:
* A trick called ballooning, used to cause a guest operating
system to invoke its own memory management algorithms to essentially
free-up memory (from the host OS perspective) so that it can be given
to other virtual machines.
* A technique to identify pages that can be shared by
comparing their content, by random selection over time.
* A share-holding and taxation system, described in economic terms,
to deal with memory allocation decisions amongst competing virtual
machines running guest operating systems.
* A technique to identify idle pages by statistical sampling so
that they can be reclaimed for use elsewhere. (Guest OSes sometimes
seem to overallocate because they were designed to run directly on
the hardware.)
Flaws:
I don't find any flaws to this work. The author could have provided
some more validation about the statistical sampling technique to
identify idle pages. The empirical tests clearly show that the
techniques produce desirable results, but there are no comparisons to
the performance if the guest operating systems were run directly on the
hardware.
Relevance:
This work is relevant to systems that emulate hardware to host guest
operating systems that would otherwise run directly on the hardware.
The success achieved without modification of the guest OS is impressive,
and is beneficial in business environments that wish to maintain servers
virtually to reduce space, power, and other costs.
Posted by: Dave Plonka | February 21, 2007 08:05 PM