

Disaster Readiness using NIM and GLVM

Maryland AIXPower Users Group
       Wendy McConnell
        June 14, 2007
Why Disaster Readiness?
    Traditional Backups for DR
 mksysb to tape

 mkcd or mkdvd
  Drawbacks of Traditional Methods
 Off-site transportation risks
 Failed Media
 Cost of media
NIM for Backups
       Benefits of NIM for DR

 Booting from the NIM Master instead of
  tape/DVD is more reliable
 Once NIM Master is restored, multiple
  images can be pushed at once
 Loading over the network is faster than
  loading from media
Drawbacks of NIM Server for DR

 Need to recover NIM servers at hot-site
 Need to recover images / NIM resources
  from TSM
 Cannot begin image loads until NIM is fully
  ready (approx. 5 hours)
       Remote NIM Master
 No need to recover NIM Master image at
 All LPP_SOURCES needed for recovery of
  contracted servers
 Latest SPOT for each AIX level required
 NIM Master ready to use
 Still requires TSM restores of mksysb
  images
         Solution – GLVM and
         Remote NIM Server
 Global Logical Volume Manager is now
  included with AIX (previously part of
 Allows mirroring to remote DASD via TCP/IP
 Eliminates need to recover NIM images from
  the TSM server
 Both server images and TSM database
  backups remotely mirrored
                GLVM & 2 NIM Servers
 [Diagram] /dr_images and /tsmbkups: 3 copies,
  one at data center and 2 at hot site
 NIM Server in Data Center: GLVM Client, hdisk,
  1 set of LUNs, SAN at Data Center
 NIM Server at Hot Site: GLVM Server, rpvserver0,
  2 sets of LUNs, SAN at Hot Site
 GLVM uses TCP/IP
 Must create LVs with full strictness
 Each mirror copy must reside on a different
  set of “hdisks”
 While active, disks on remote server should
  not be accessed via conventional methods –
  they are unavailable to AIX commands.
 lsdev on remote server will show “rpvserver”
  devices – one for each disk under the
  control of GLVM
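
The rule that each mirror copy must sit on its own set of hdisks can be checked before the LVs are built. What follows is a minimal sketch, not the presenter's tooling: the copy labels and hdisk names are hypothetical, and the layout is typed in by hand rather than read from AIX.

```python
from itertools import combinations


def overlapping_copies(copy_disks):
    """Return (copy_a, copy_b, shared_disks) for every pair of mirror
    copies that share at least one disk. An empty list means each copy
    resides on a different set of hdisks."""
    clashes = []
    for a, b in combinations(sorted(copy_disks), 2):
        shared = set(copy_disks[a]) & set(copy_disks[b])
        if shared:
            clashes.append((a, b, sorted(shared)))
    return clashes


# Hypothetical layout: one copy at the data center, two at the hot site.
layout = {
    "copy1_local": {"hdisk2", "hdisk3"},
    "copy2_remote": {"hdisk10", "hdisk11"},
    "copy3_remote": {"hdisk12", "hdisk13"},
}

for a, b, shared in overlapping_copies(layout):
    print(f"{a} and {b} share {', '.join(shared)}")
```

With the layout shown, nothing is printed because no two copies share a disk.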
                    What DR Looks Like
 [Diagram] /dr_images and /tsmbkups: 2 copies
  at hot site
 NIM Server in Data Center: GLVM Client, hdisk,
  1 set of LUNs, SAN at Data Center
 NIM Server at Hot Site: GLVM Server, rpvserver0
  disabled, DASD accessed as hdisks, 2 sets of
  LUNs, SAN at Hot Site
 If disaster, TCP/IP link is gone
 If exercise, TCP/IP link remains, and one set of
  rpvservers continues mirroring data while the
  other is removed from GLVM usage at “disaster
   What Happens at DR Time?
 If an exercise, politely remove one set of LV
  mirrors from the dr_vg volume group
 Change the rpvserver devices on the remote
  NIM server to a defined state.
 Import the dr_vg volume group on the
  remote NIM server using any of the
  hdisks that were once an rpvserver
 Vary on the volume group and mount
  filesystems – images now available
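
The hot-site steps above can be sketched as a dry run that only prints the commands for review. This is an illustration under assumptions: the device name rpvserver0 and the mount points come from the slides, hdisk4 is a hypothetical disk name, and quorum or forced vary-on considerations are not covered. The mirror removal for an exercise is left out because the slides do not show how it was done.

```python
def hot_site_commands(rpvservers, import_disk, vg="dr_vg",
                      mounts=("/dr_images", "/tsmbkups")):
    """Build (but do not run) the command sequence for the remote NIM server."""
    # rmdev -l without -d leaves the device in the Defined state.
    cmds = [f"rmdev -l {dev}" for dev in rpvservers]
    # Import the volume group from a disk that backed an rpvserver.
    cmds.append(f"importvg -y {vg} {import_disk}")
    cmds.append(f"varyonvg {vg}")
    cmds.extend(f"mount {fs}" for fs in mounts)
    return cmds


# hdisk4 is a hypothetical name; use whichever hdisk backed the rpvserver.
for cmd in hot_site_commands(["rpvserver0"], "hdisk4"):
    print(cmd)
```

Printing rather than executing keeps the sequence reviewable before anything touches the volume group.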
What Happens at DR Time? - contd

 Clients to be recovered are added as NIM
  machine resources
 Images in newly mounted filesystem are
  made NIM mksysb resources
 Clients are set to perform a NIM bos_inst
  from mksysb
 Clients are booted from NIM server over
  network and images begin to load
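
As a rough sketch of the NIM side, the snippet below prints the commands that register an image as a mksysb resource and set a client up for a mksysb install. The client name, image path, and SPOT name are hypothetical, and the machine definition step is omitted because its attributes depend on the site's network setup.

```python
def nim_restore_commands(client, image_path, spot):
    """Build (but do not run) the NIM commands to restore one client."""
    mksysb_res = f"{client}_mksysb"
    return [
        # Register the image in the mounted filesystem as a mksysb resource.
        f"nim -o define -t mksysb -a server=master "
        f"-a location={image_path} {mksysb_res}",
        # Set the client to install from that mksysb on its next network boot.
        f"nim -o bos_inst -a source=mksysb -a mksysb={mksysb_res} "
        f"-a spot={spot} -a accept_licenses=yes {client}",
    ]


# All three arguments are made-up examples.
for cmd in nim_restore_commands("client1", "/dr_images/client1.mksysb", "spot_53"):
    print(cmd)
```

Looping this over the list of contracted servers gives one reviewable script for the whole exercise.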
 Have been able to shave about 5 – 7 hours
  from recovery time – depending on number
  of servers included in exercise
 Tapes are no longer an issue
 TSM backups of images are still available in
  the event an image is damaged
 All TSM database backups available via disk
  instead of tape – additional time savings
       So, What’s Required?
 DASD – enough on the local SAN to hold
  the mksysb images for the servers to be
  recovered at the hot site as well as any
  “backup software” database backups
 DASD – enough on the remote SAN for 2
  copies of the mksysb images and “backup
  software” database backups kept on the
  local SAN
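
The two DASD requirements reduce to simple arithmetic: the local SAN holds one copy and the remote SAN holds two. A small sketch with made-up sizes (none of these figures come from the presentation):

```python
def dasd_required_gb(image_sizes_gb, db_backup_gb, remote_copies=2):
    """Return the space needed on the local and remote SANs, in GB."""
    local = sum(image_sizes_gb) + db_backup_gb
    return {"local_gb": local, "remote_gb": local * remote_copies}


# Hypothetical sizes: three mksysb images plus the database backups.
print(dasd_required_gb([12, 8, 20], db_backup_gb=40))
```

For these inputs the local SAN needs 80 GB and the remote SAN 160 GB, before any allowance for growth.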
       So, What’s Required?
 Network Bandwidth – GLVM syncs just like
  LVM – so the pipe between the local and
  remote sites must be big enough to keep the
  mirrors in synch
 Proper tuning of time-out values for the
  GLVM clients depending on network flow
 Monitoring – whatever monitoring system is in
  use should watch for stale partitions
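
Two of these requirements lend themselves to a quick sketch: estimating the link rate needed to keep the mirrors current, and flagging stale partitions. The figures and the sample text are illustrative only. The parser assumes the monitoring system can capture `lsvg dr_vg` output containing a "STALE PPs:" field.

```python
import re


def required_mbps(changed_gb, window_hours):
    """Average link rate (Mbit/s) needed to mirror changed_gb within window_hours."""
    megabits = changed_gb * 8 * 1000  # decimal units: 1 GB = 8000 Mbit
    return megabits / (window_hours * 3600)


def stale_pps(lsvg_output):
    """Extract the STALE PPs count from lsvg-style text, or None if absent."""
    match = re.search(r"STALE PPs:\s+(\d+)", lsvg_output)
    return int(match.group(1)) if match else None


# Illustrative figures: 80 GB of new images and backups, 8-hour window.
print(f"{required_mbps(80, 8):.1f} Mbit/s")

# Illustrative excerpt, not captured from a real system.
sample = "STALE PVs:          1                        STALE PPs:      3"
if stale_pps(sample):
    print("ALERT: dr_vg has stale partitions")
```

The rate is an average; bursts while new images are written need headroom above it.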
        So, What’s Required?
 Servers – remote server needs to be large
  enough to push the number of images
 Control of mksysb image sizes – make use
  of the /etc/exclude.rootvg file to exclude
  anything that would not be useful in a DR
  situation (i.e. we exclude accounting
