A Fast Rejuvenation Technique for Server Consolidation with Virtual Machines
Kenichi Kourai, Shigeru Chiba
Tokyo Institute of Technology
Server consolidation with VMs
Server consolidation is widely carried out
Multiple server machines are integrated on one physical machine
Recently, using virtual machines (VMs)
• Multiplexing resources
[Figure: multiple VMs on a VMM on top of the hardware]
VMs are run on a virtual machine monitor (VMM)
Software aging of VMMs
Software aging of a VMM is critical
Software aging is...
• The phenomenon that software state degrades with time
• E.g. exhaustion of system resources
Software aging of a VMM affects all VMs on it
• E.g. performance degradation
Software rejuvenation of VMMs
Preventive maintenance
• Performed before software aging of a VMM affects its VMs
• Occasionally stops a VMM, cleans its internal state, and restarts it
Typical example: rebooting a VMM
• Cleans the internal state automatically and completely
• The easiest way
Drawbacks (1/2): Increasing service downtime
The VMM reboot needs:
Rebooting all OSes running on the VMs
• The time tends to be long
• Larger number of VMs
• Longer startup time of services
A hardware reset
• The BIOS power-on self test is time-consuming
[Figure: reboot timeline: OS shutdown, VMM shutdown, hardware reset, VMM boot, OS boot]
Drawbacks (2/2): Performance degradation
The file cache is lost by the OS reboot
OSes cannot restore performance until the file cache is re-filled
• They strongly rely on the file cache to speed up file accesses
The time tends to be long
• The file cache size is increasing
• Large amount of memory for a VM
• Free memory as the file cache
[Figure: a process accesses the disk through the OS file cache]
Warm-VM reboot
Fast rejuvenation technique
Efficiently reboots only a VMM
• The VMM reboot causes no OS reboot
Basic idea
• Suspend all VMs before the VMM reboot
• Resume them after the reboot
Challenge
• How does a VMM efficiently deal with the large memory images of VMs?
On-memory suspend of VMs
Freezes the memory images of VMs on the main memory
That memory area is just reserved
• The time does not depend on the memory size
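Saving them into a slow disk is inefficient
• Traditional suspend corresponds to the ACPI S4 state (suspend to disk)
• On-memory suspend corresponds to the ACPI S3 state (Suspend To RAM) for VMs
[Figure: a VM's memory image is frozen in main memory instead of being saved to disk]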
On-memory resume of VMs
Unfreezes the memory images preserved on the main memory
They are reused directly as the memory of VMs
• No need to read them from a slow disk
The file cache of OSes is also restored
• No performance degradation
[Figure: a VM's memory image is unfrozen in main memory instead of being read from disk]
Quick reload of VMMs
Directly boots a new VMM without a hardware reset
The memory images of VMs are preserved through the VMM reboot
• Software can keep track of them
• A hardware reset does not guarantee this
A VMM is rebooted quickly
• No overhead due to a hardware reset
[Figure: the new VMM is preloaded into main memory next to the old VMM and the VM memory images]
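The three mechanisms above (on-memory suspend, quick reload, on-memory resume) can be illustrated with a toy model. This is only a sketch: the `Host` and `VM` classes, the frame numbers, and the `reserved` table are hypothetical names chosen for illustration and are not taken from RootHammer or Xen. The point it shows is that suspend records which frames a VM owns without copying them, and that the contents (including the file cache) are still there after the VMM state is thrown away.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class VM:
    name: str
    frames: List[int]          # physical frames backing this VM's memory
    running: bool = True


@dataclass
class Host:
    # frame number -> contents; stands in for main memory
    memory: Dict[int, bytes] = field(default_factory=dict)
    # VM name -> frames; assumed to survive the VMM reload
    reserved: Dict[str, List[int]] = field(default_factory=dict)
    # VMM-internal state; discarded by the reload
    vms: Dict[str, VM] = field(default_factory=dict)

    def on_memory_suspend(self, name: str) -> None:
        # Freeze in place: only record the frames, copy nothing.
        vm = self.vms[name]
        vm.running = False
        self.reserved[name] = vm.frames

    def quick_reload(self) -> None:
        # New VMM instance without a hardware reset: VMM state is
        # dropped, main memory and the reservation table are untouched.
        self.vms = {}

    def on_memory_resume(self, name: str) -> VM:
        # Unfreeze: hand the preserved frames to a fresh VM object.
        vm = VM(name, self.reserved.pop(name))
        self.vms[name] = vm
        return vm


def warm_vm_reboot(host: Host) -> None:
    names = list(host.vms)
    for name in names:
        host.on_memory_suspend(name)
    host.quick_reload()
    for name in names:
        host.on_memory_resume(name)
```

In this model the cost of suspend and resume is one table update per VM, regardless of how many frames the VM owns.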
Comparison with other methods
Cold-VM reboot
• Needs the OS reboot
Saved-VM reboot
• A naive implementation of the warm-VM reboot
• VMs are saved into a disk
[Table: cold-VM, saved-VM, and warm-VM reboot compared by dependence on # of VMs, dependence on memory size of VMs, dependence on services, and performance degradation]
Model for availability
Must consider the software rejuvenation of both a VMM and OSes
Warm-VM reboot
• The OS rejuvenation is independent
Cold-VM reboot
• The OS rejuvenation is affected by the VMM rejuvenation
• The number of OS rejuvenations increases
[Figure: timelines of OS rejuvenation and VMM rejuvenation for the warm-VM and cold-VM reboots]
RootHammer
We have implemented the warm-VM reboot into Xen 3.0.0
On-memory suspend/resume
• Based on Xen's suspend/resume
• Manages the mapping from the VM memory to the physical memory
Quick reload
• Based on the kexec mechanism in Linux
• Kexec for a VMM is included in the latest Xen
• It is not for reusing the memory images
Experiments
Examine whether the warm-VM reboot reduces downtime and performance degradation
Comparison
• Cold-VM reboot with the OS reboot
• Saved-VM reboot using Xen's suspend/resume
[Figure: experimental setup with a server (Linux VMs on the VMM) and a Linux client; 2 dual-core Opterons, 12 GB SDRAM, 15,000 rpm SCSI disk, gigabit Ethernet]
Performance of on-memory suspend/resume
Suspend/resume of one VM with 11 GB of memory
Ours: 1 sec
Xen's: 280 sec
• Depends on the memory size
Suspend/resume of 11 VMs
Ours: 4 sec
OS reboot: 58 sec
• Depends on # of VMs
Effect of quick reload
The time of rebooting a VMM with no VMs
[Figure: time in seconds (0 to 70) of the warm-VM and cold-VM reboots, broken down into VMM shutdown, hardware reset or quick reload, and VMM boot]
Warm-VM reboot
• 11 sec
• The time of quick reload is negligible
Cold-VM reboot
• 59 sec
• The time due to a hardware reset is 48 sec
Downtime of services
Warm-VM reboot
Always the same
• 42 sec
Saved-VM reboot
Depends on # of VMs
• 429 sec (11 VMs)
Cold-VM reboot
Affected by the service type
• 157 sec (sshd)
• 241 sec (JBoss)
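As a quick check on these numbers, the relative downtime reduction of the warm-VM reboot over the cold-VM reboot can be computed per service. The helper name `reduction_percent` is only illustrative; the downtimes are the ones listed above. Taking the cold-VM reboot as the baseline is a choice made here; with it, the JBoss case rounds to the maximum reduction quoted in the conclusion.

```python
WARM_VM_DOWNTIME = 42                             # sec, same for all services
COLD_VM_DOWNTIME = {"sshd": 157, "JBoss": 241}    # sec, per service


def reduction_percent(baseline, improved):
    """Relative reduction of `improved` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - improved) / baseline


reductions = {
    service: reduction_percent(downtime, WARM_VM_DOWNTIME)
    for service, downtime in COLD_VM_DOWNTIME.items()
}

for service, value in reductions.items():
    print(f"{service}: {value:.1f}% less downtime than the cold-VM reboot")
```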
Availability of JBoss
The warm-VM reboot achieves four 9s
Assumptions
• OS rejuvenation every week (34 sec)
• VMM rejuvenation every 4 weeks, 0.5 week after the last OS rejuvenation
Availability
• Warm-VM reboot: 99.993%
• Cold-VM reboot: 99.985%
• Saved-VM reboot: 99.977%
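The availability figures can be reproduced with a small calculation. The sketch below is one reading of the assumptions, not necessarily the authors' exact model: downtime per 4-week cycle is one VMM rejuvenation plus the OS rejuvenations in that cycle. For the cold-VM reboot it is assumed that the VMM reboot also reboots the OSes and restarts the weekly OS timer, which leaves 3.5 standalone OS rejuvenations per cycle (4.5 OS reboots in total). The VMM downtimes are taken from the downtime measurements (42, 241, and 429 sec).

```python
WEEK = 7 * 24 * 3600      # seconds in a week

T_OS = 1 * WEEK           # OS rejuvenation interval
T_VMM = 4 * WEEK          # VMM rejuvenation interval
D_OS = 34                 # downtime of one OS rejuvenation (sec)
ALPHA = 0.5               # VMM rejuvenation comes 0.5 week after an OS one

# Downtime of one VMM rejuvenation (sec): JBoss, 11 VMs.
D_VMM = {"warm": 42, "cold": 241, "saved": 429}


def availability(method):
    if method == "cold":
        # Assumption: the cold-VM reboot resets the OS rejuvenation timer,
        # so the half week before it is not followed by an extra OS reboot.
        n_os = (T_VMM - ALPHA * T_OS) / T_OS      # 3.5 standalone OS reboots
    else:
        n_os = T_VMM / T_OS                       # 4 standalone OS reboots
    downtime = D_VMM[method] + n_os * D_OS
    return 1.0 - downtime / T_VMM


for method in ("warm", "cold", "saved"):
    print(f"{method}: {100 * availability(method):.3f}%")
# warm: 99.993%
# cold: 99.985%
# saved: 99.977%
```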
Performance degradation
The throughput of the Apache web server before and after the VMM reboot
Warm-VM reboot
• No degradation
Cold-VM reboot
• Degraded by 69%
Software rejuvenation in a cluster environment
Clustering achieves zero downtime
Multiple hosts can provide the same service
Let us consider the total throughput of all hosts in a cluster
• m: # of hosts
• p: throughput of one host
Warm-VM reboot
• (m-1)p for 42 sec
Cold-VM reboot
• (m-1)p for 241 sec
• (m-0.69)p for a while after the reboot
[Figure: total throughput over time t, dropping from mp to (m-1)p during the reboot]
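The throughput model on this slide can be written as a step function of the time since one host started its VMM reboot. This is a minimal sketch: the function name and the `warmup` parameter are hypothetical, and `warmup` in particular stands in for the "a while" during which the file cache is being re-filled, whose length the slide does not give.

```python
WARM_DOWNTIME = 42.0      # sec
COLD_DOWNTIME = 241.0     # sec (JBoss)
COLD_DEGRADATION = 0.69   # throughput loss of the rebooted host after a cold-VM reboot


def total_throughput(method, m, p, t, warmup=60.0):
    """Total cluster throughput t seconds after one host starts its VMM reboot.

    m: number of hosts, p: throughput of one host.
    warmup: assumed duration (sec) of the degraded phase after a cold-VM reboot.
    """
    if t < 0:
        return m * p                          # before the reboot
    if method == "warm":
        return (m - 1) * p if t < WARM_DOWNTIME else m * p
    if method == "cold":
        if t < COLD_DOWNTIME:
            return (m - 1) * p                # host is down
        if t < COLD_DOWNTIME + warmup:
            return (m - COLD_DEGRADATION) * p # host is up, file cache is cold
        return m * p
    raise ValueError(f"unknown method: {method}")
```

For example, with m = 4 and p = 100 requests/sec, the cluster serves 300 requests/sec while the host is down under either method, but only the cold-VM reboot goes through the intermediate (m-0.69)p phase afterwards.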
Comparison with VM migration in a cluster environment
VM migration achieves nearly zero downtime
VMs are moved to another host
• Xen's live migration, VMware's VMotion
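Total throughput
Normal run
• (m-1)p
• One host is reserved for migration
Live migration
• (m-1.12)p
[Figure: total throughput over time t; the warm-VM reboot drops from mp to (m-1)p for 42 sec, live migration drops from (m-1)p to (m-1.12)p for 17 min]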
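To compare the two approaches over a longer interval, the work done by the cluster (throughput integrated over time) can be estimated for one rejuvenation of one host. This is a rough sketch under the slide's model; the function names and the `period` argument are assumptions for illustration, and `period` is assumed to be at least as long as the migration.

```python
WARM_DOWNTIME = 42.0          # sec at (m-1)p instead of mp
MIGRATION_TIME = 17 * 60.0    # sec at (m-1.12)p instead of (m-1)p
MIGRATION_LOSS = 0.12         # extra throughput loss during live migration


def work_with_warm_reboot(m, p, period):
    """Work done in `period` seconds when one host does one warm-VM reboot."""
    return m * p * period - p * WARM_DOWNTIME


def work_with_migration(m, p, period):
    """Same interval, but one of the m hosts is kept idle as the migration target."""
    return (m - 1) * p * period - MIGRATION_LOSS * p * MIGRATION_TIME


# Example: 4 hosts, unit throughput, one hour.
warm = work_with_warm_reboot(4, 1.0, 3600.0)
migr = work_with_migration(4, 1.0, 3600.0)
print(f"warm-VM reboot: {warm:.1f}, live migration: {migr:.1f}")
```

Under these assumptions the reserved host dominates the difference: the migration setup gives up p for the whole interval, while the warm-VM reboot gives it up for 42 sec.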
Related work
Microreboot [Candea et al.'04]
Reboots only a part of subcomponents
• The warm-VM reboot enables rebooting only a parent component (VMM for VMs)
Checkpointing/restart [Randell '75]
Saves/restores OS processes
• Similar to suspend/resume of VMs
Optimizations of suspend/resume
Incremental suspend, compression of memory images
Conclusion
We proposed the warm-VM reboot
On-memory suspend/resume
• Freezes/unfreezes the memory images of VMs
Quick reload
• Preserves the memory images through the VMM reboot
It achieved fast rejuvenation
• Downtime reduced by 83% at maximum
• No performance degradation