Experimental Study of Virtual Machine Migration in Support of Reservation of Cluster Resources


                   Ming Zhao, Renato J. Figueiredo

      Advanced Computing and Information Systems (ACIS)
             Electrical and Computer Engineering
                      University of Florida
                 {ming, renato}@acis.ufl.edu


    Motivating Application
         Dynamic Data-driven Brain-machine Interface
                Resource-demanding parallel application
                Time-critical during online experiments
                       Must finish each closed loop within 100ms
                No stringent time requirement for offline study

         [Figure: Dynamic Data-driven Brain-machine Interfaces (http://www.acis.ufl.edu/dddasbmi). Online experimenting and offline study; in vivo data acquisition at the Brain Institute; BMI models 1 to K run as parallel computing on virtualized resources at ACIS, with data storage; closed-loop control.]
Ming Zhao, VTDC’07                                            2
    Motivating Application
          Motivation for VM-based resource reservation
                Online experiments
                         Dedicate cluster resources for VMs running BMI models
                         Without reservation, deadline is often violated
                Offline study
                         Share cluster resources with VMs running other workloads
                Cluster reservation
                         Migrating existing VMs to other resources
                [Figure: latencies that make up the 100ms closed loop: data polling, data transfer, computing, robot control]
         [Figure: Dynamic Data-driven Brain-machine Interfaces, as on the previous slide (http://www.acis.ufl.edu/dddasbmi)]
    Overview
         Goal:
              Efficient VM-based cluster resource reservation


         Challenge:
              Vacate resources in time (to meet schedule) but not too early (to
              fully utilize resources)


         Solution:
              Build a migration overhead model through experimental analysis




    Outline
         Architecture

         Experimental analysis
              Methodology
              Migrating a single VM
              Migrating a sequence of VMs
              Migrating VMs in parallel


         Summary




    VM-based Resource Reservation
         Virtual resource manager
              Global management of virtualized resources
              Presents an abstracted interface for resource requests
              Hides the complexity of using virtualized resources
              Coordinates with VM schedulers to reserve, allocate resources with VMs
         Virtual machine scheduler
              Per-host management of VMs
              Presents a unified interface for managing VMs
              Hides the complexity of using different VM software
         [Figure: a job manager, the virtual resource manager, and the VM scheduler on host P1 (running VM1, VM2, VM3), annotated with the following steps]
              1) Resource request: Need 1GHz CPU, 1GB RAM at 10am
              2) Admission control: OK
              3) Reservation: Create a VM with ... by 10am
              4) VM instantiation
              6) Resource handlers: Machine IP: 10.5.156.30
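The request, admission control, and reservation steps in the figure can be sketched roughly as follows. This is only an illustration, not the authors' implementation: the class names, the host capacity, and the rule that reservations are compared by start hour alone are all assumptions made for the example.

```python
from dataclasses import dataclass, field

@dataclass
class Request:
    cpu_ghz: float
    ram_gb: float
    start_hour: int  # hour of day at which the resources are needed

@dataclass
class Host:
    name: str
    ip: str
    cpu_ghz: float   # hypothetical total capacity of the host
    ram_gb: float
    reservations: list = field(default_factory=list)

    def free_at(self, hour):
        """Capacity left at `hour` after subtracting reservations made for that hour."""
        used = [r for r in self.reservations if r.start_hour == hour]
        return (self.cpu_ghz - sum(r.cpu_ghz for r in used),
                self.ram_gb - sum(r.ram_gb for r in used))

class VirtualResourceManager:
    """Hypothetical stand-in for the virtual resource manager on the slide."""

    def __init__(self, hosts):
        self.hosts = hosts

    def request(self, req):
        """Admit the request on the first host with enough capacity, else return None."""
        for host in self.hosts:
            cpu, ram = host.free_at(req.start_hour)
            if cpu >= req.cpu_ghz and ram >= req.ram_gb:
                host.reservations.append(req)  # reservation recorded on that host
                return host.ip                 # handle returned to the job manager
        return None                            # admission control rejects the request

manager = VirtualResourceManager([Host("P1", "10.5.156.30", cpu_ghz=2.0, ram_gb=2.0)])
print(manager.request(Request(cpu_ghz=1.0, ram_gb=1.0, start_hour=10)))
```

With the assumed 2GHz, 2GB host, two such requests for 10am are admitted and a third is rejected.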

    VM Migration based Cluster Vacating
         Example
              (I) P1 is shared by VMs running various workloads
                     P1: VM1 to VM9
                     P2, P3: other VMs
              (II) P1 is reserved by migrating the existing VMs to P2 and P3
                     P1: empty
                     P2: VM1 to VM5, alongside its other VMs
                     P3: VM6 to VM9, alongside its other VMs
              (III) P1 is dedicated to the VMs created for a parallel application
                     P1: the newly created VMs
                     P2, P3: unchanged from (II)
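A vacating plan like the one in the example can be sketched with a simple first-fit assignment. The function name and the per-host slot counts are hypothetical; the slides do not say how destinations are chosen.

```python
def plan_vacating(vms_to_move, free_slots):
    """Assign each VM to the first destination host that still has a free slot.

    vms_to_move: names of the VMs on the host being vacated, in migration order.
    free_slots:  dict mapping destination host name to how many more VMs it can take
                 (an assumed capacity measure, not one defined in the slides).
    Returns a dict mapping VM name to destination host.
    """
    remaining = dict(free_slots)
    plan = {}
    for vm in vms_to_move:
        dest = next((host for host, slots in remaining.items() if slots > 0), None)
        if dest is None:
            raise RuntimeError("not enough capacity to vacate the host")
        plan[vm] = dest
        remaining[dest] -= 1
    return plan

vms_on_p1 = [f"VM{i}" for i in range(1, 10)]
plan = plan_vacating(vms_on_p1, {"P2": 5, "P3": 4})
print(plan)
```

With the assumed five free slots on P2 and four on P3, VM1 to VM5 go to P2 and VM6 to VM9 go to P3, the same layout as state (II) of the example.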
    Experimental Setup
         Physical cluster
              Each node has four CPUs (2.33GHz), 4GB memory
              Gigabit Ethernet
         Virtual machines
              VMware Server based
              VM disk
                     Shared read-only image on NFS (.vmdk)
                     Independent changes (redo logs) on local disks (.REDO)
              VM memory
                     Mapped to file (.vmem)
                     Stored on local disks
         [Figure: Node 1 (running VM1 and VM2) and Node 2, each with local storage; per-VM vmx, REDO, and vmem files on local storage; shared vmdk image on NFS]


    Experimental Setup
         Migration strategy
              Suspend
                     Synchronizes memory state to file
              Copy
                     Uses FTP to transfer memory state file, disk redo files
                     Maximum throughput is about 100MB/s
              Resume
                     Restores memory state from file in foreground
              Not based on VMware solutions (VMotion)
         [Figure: Node 1 (running VM1 and VM2) and Node 2, each with local storage; per-VM vmx, REDO, and vmem files on local storage; shared vmdk image on NFS]
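The suspend, copy, and resume phases are measured separately in the experiments that follow. A minimal timing harness for such a three-phase migration might look like the sketch below. The three callables are placeholders supplied by the caller; the actual suspend, FTP transfer, and resume commands used in the study are not given in the slides and are not reproduced here.

```python
import time

def timed_migration(suspend, copy, resume):
    """Run the three phases in order and return the seconds spent in each.

    suspend, copy, resume: zero-argument callables provided by the caller
    (hypothetical hooks; for example, copy could wrap an FTP transfer).
    """
    timings = {}
    for name, phase in (("suspend", suspend), ("copy", copy), ("resume", resume)):
        start = time.perf_counter()
        phase()
        timings[name] = time.perf_counter() - start
    timings["total"] = timings["suspend"] + timings["copy"] + timings["resume"]
    return timings

# Dummy phases that only sleep, to show the shape of the result.
result = timed_migration(lambda: time.sleep(0.01),
                         lambda: time.sleep(0.02),
                         lambda: time.sleep(0.01))
print(result)
```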



    Migration Overhead of a Single VM
         Different memory sizes
              Suspend and resume phases are fast
              Copy time grows as memory size increases
         Different methods to store the migrated states
              RAMFS removes the impact of disk I/O on the migration process
         [Figure: suspend, copy, and resume time (s) for VMs with 128MB to 1024MB of memory in 128MB steps. Left panel: Using disks on the destination host to store migrated VM state files. Right panel: Using RAMFS on the destination host to store migrated VM state files.]
    Modeling Copy Phase
         Using regression methods
              Second-order polynomial fit when using disks, linear fit when using RAMFS

         RAMFS setup is used for all the following experiments
         [Figure: Using regression to model the copy phase; time (s) versus memory size (MB), 0 to 1200MB]
              Using disks: y = 1E-05 x^2 + 0.0043 x + 1.2704, R^2 = 0.9967
              Using RAMFS: y = 0.0089 x + 0.7502, R^2 = 1
              (y: copy time in seconds, x: memory size in MB)
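The two fitted curves can be evaluated directly to predict copy time for a given memory size. The sketch below simply encodes the coefficients shown in the figure; the function names are made up for this example, and the fits apply only to the experimental setup described earlier.

```python
def copy_time_disk(memory_mb):
    """Predicted copy time (s) when state files are stored on the destination's disks."""
    return 1e-5 * memory_mb ** 2 + 0.0043 * memory_mb + 1.2704

def copy_time_ramfs(memory_mb):
    """Predicted copy time (s) when state files are stored in RAMFS on the destination."""
    return 0.0089 * memory_mb + 0.7502

for size in (256, 512, 1024):
    print(f"{size}MB  disk {copy_time_disk(size):.2f}s  ramfs {copy_time_ramfs(size):.2f}s")
```

For a 1024MB VM the fits give about 16.2 s with disks and about 9.9 s with RAMFS, in line with the upper end of the plotted range.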
    Modeling Suspend and Resume Phases
         Resume
              Typically short: memory state is buffered after the copy phase
         Suspend
              Typically short: memory state is frequently synchronized to file
         [Figure: suspend and resume time (s) for the baseline and for 256MB, 512MB, 768MB, and 1024MB VMs]


    Modeling the Suspend and Resume Phases
         Running memory-intensive workload in the migrated VM
              A “rogue” program that continuously modifies memory
              Suspend time increases as more synchronization is needed

         [Figure: suspend and resume time (s) for the baseline and for 256MB, 512MB, 768MB, and 1024MB VMs, with the memory-intensive workload running in the migrated VM]


    Migrating a Sequence of VMs
         Per-VM migration time is consistent regardless of the number of sequentially migrated VMs
         [Figure: per-VM suspend, copy, resume, and total time (s). Left panel: Per-VM migration time when migrating a sequence of 256MB-RAM VMs (single VM, two VMs, four VMs, eight VMs). Right panel: Per-VM migration time when migrating a sequence of 512MB-RAM VMs (single VM, two VMs, four VMs).]
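Because per-VM time stays consistent in a sequence, the time to vacate a host sequentially can be estimated as the sum of per-VM times, which in turn gives the latest moment at which migration can start and still meet a reservation. The sketch below uses the RAMFS copy fit (y = 0.0089x + 0.7502); the suspend and resume durations and the safety margin are caller-supplied assumptions, not measured values from the slides.

```python
from datetime import datetime, timedelta

def per_vm_time(memory_mb, suspend_s, resume_s):
    """Suspend + copy + resume for one VM; copy uses the RAMFS linear fit."""
    return suspend_s + (0.0089 * memory_mb + 0.7502) + resume_s

def latest_start(deadline, vm_memory_mb, suspend_s, resume_s, margin_s=0.0):
    """Latest time to begin migrating the VMs one after another and finish by `deadline`."""
    total = sum(per_vm_time(m, suspend_s, resume_s) for m in vm_memory_mb)
    return deadline - timedelta(seconds=total + margin_s)

# Four 512MB VMs, with placeholder suspend and resume times of 1 s each.
deadline = datetime(2007, 1, 1, 10, 0, 0)
print(latest_start(deadline, [512] * 4, suspend_s=1.0, resume_s=1.0))
```

Starting later than this risks missing the reservation, while starting much earlier leaves the vacated host idle, which is the trade-off named in the overview.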
    Migrating VMs with CPU-intensive Workload
         CPU-intensive workload
              Runs iteratively, consumes 100% of CPU
         Performance degradation
              = Impacted iteration time - Regular iteration time
         Applications in the VMs take longer than the migration itself to recover to full performance
         [Figure: Migrating four VMs in sequence; each has 512MB-RAM and runs a CPU-intensive benchmark. Left panel: performance degradation and VM migration time (migration delay, s) for VM1 to VM4. Right panel: iteration time (s) over time (s) for VM1 to VM4.]

    Migrating VMs with Memory-intensive Workload
         Memory-intensive workload
              Runs iteratively, touches 100% of memory
         Performance degradation
              = Time of 2 impacted iterations - 2 * Regular iteration time
         Migration is a more memory-intensive process
         [Figure: Migrating four VMs in sequence; each has 512MB-RAM and runs a memory-intensive benchmark. Left panel: performance degradation and VM migration time (migration delay, s) for VM1 to VM4. Right panel: iteration time (s) over time (s) for VM1 to VM4.]
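The two degradation definitions, one impacted iteration for the CPU-intensive benchmark and two for the memory-intensive one, are the same computation with a different number of impacted iterations. A small sketch, with a hypothetical function name and made-up iteration times used only to show the arithmetic:

```python
def performance_degradation(impacted_iteration_times, regular_iteration_time):
    """Extra seconds spent in the impacted iterations compared with running them undisturbed."""
    n = len(impacted_iteration_times)
    return sum(impacted_iteration_times) - n * regular_iteration_time

# Illustrative numbers only; these are not measurements from the study.
print(performance_degradation([9.0], 4.0))        # one impacted iteration  -> 5.0
print(performance_degradation([12.0, 7.0], 4.0))  # two impacted iterations -> 11.0
```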

    Migrating VMs with Web Workloads
         Apache Web server
              Serves constant-rate requests from outside clients (100 requests/s each)
              Throughput takes longer than the migration itself to recover
         [Figure: Migrating four VMs in sequence; each has 512MB-RAM and runs a Web server. Panels: reply rate over time (s), 0 to 70 s, for VM1, VM2, VM3, VM4, and the aggregate.]

    Migrating VMs in Parallel
         Parallel migration has shorter total migration time
              Faster vacating of cluster resources
         Sequential migration has shorter per-VM migration time
              Less impact on the applications in the VMs
         [Figure: migration time (s) versus number of VMs (1, 2, 4, 8) for the series 256M-P, 256M-S, 512M-P, and 512M-S (P: parallel, S: sequential). Left panel: Total migration time. Right panel: Per-VM migration time.]


    Related Work
         VM-based resource management
              In-VIGO
              Virtual workspaces
              Virtuoso
              Shirako


         VM migration based resource reallocation
              VIOLIN
              Sandpiper

         VM migration optimizations
              VMware VMotion
              Xen live migration


    Summary
         Conclusions:
              It is feasible to predict the migration time for different VM
              configurations and different numbers of VMs
              A VM’s application takes longer to recover than the VM
              migration itself takes
              Sequential migration has less impact on VMs’ applications
              Parallel migration is faster for migrating multiple VMs


         Future work:
              Generalize and validate the model across different physical
              platforms and different VM technologies
              Consider modeling of live migrations



    Acknowledgement
         DDDASBMI project team
              http://www.acis.ufl.edu/dddasbmi


         Sponsorship from NSF

         Publication approval from VMware




         Questions?
              ming@acis.ufl.edu
              http://www.acis.ufl.edu/~ming
