Rollout/Rollback Plan by Rsici7

Rollout/Rollback Plan
Purpose of change:
        Force 100Mb full-duplex communication on all network interfaces. Modify Alteon port
        definitions as recommended by manufacturer.

Background:
        Auto-negotiation of transfer rates sometimes fails, leaving the switch and the Sun
        NIC communicating at different rates. This can cause serious communication problems
        between the two (usually seen as high collision rates and dropped packets under
        high traffic).

        These changes are preventative measures. By forcing all NICs and switches to 100Mb
        full-duplex mode, we will eliminate the possibility of communicating at different rates.
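
The symptom described above can be spot-checked before and after the change. The sketch below is not part of the original plan: collision_rate is a hypothetical helper that reads netstat -i output on stdin and prints collisions as a percentage of output packets for each interface. It assumes the usual Solaris column order (Name, Mtu, Net/Dest, Address, Ipkts, Ierrs, Opkts, Oerrs, Collis, Queue); adjust the field numbers if the output on your hosts differs.

```shell
# Hypothetical helper: per-interface collision rate from `netstat -i` output.
# Assumed columns: $1=Name, $7=Opkts, $9=Collis.
collision_rate() {
    # Only rows whose Opkts field is a positive number are reported;
    # this also skips header lines.
    awk '$7 ~ /^[0-9]+$/ && $7 > 0 { printf "%s %.2f\n", $1, ($9 / $7) * 100 }'
}

# Typical use on a host (commented out so the function can be sourced safely):
# netstat -i | collision_rate
```

A rate that stays well above zero on a switched, full-duplex link is a hint that the two ends disagree about duplex.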

Estimated rollout time:
Estimated testing time:                   5 minutes
Estimated rollback time:

Rollout user impact:
        Users will be logged out at one point during the rollout.
Rollback user impact:
        Users will be logged out at one point during the rollback.

Rollout timeline

Time    Action                                             Who        User impact
00:00   Add to /etc/system on all production               Lori       None – this can be done
        hosts:                                                        ahead of time
                * Force_100_fdx_BEGIN
                set hme:hme_adv_autoneg_cap=0
                set hme:hme_adv_100T4_cap=0
                set hme:hme_adv_100fdx_cap=1
                set hme:hme_adv_100hdx_cap=0
                set hme:hme_adv_10fdx_cap=0
                set hme:hme_adv_10hdx_cap=0
                * Force_100_fdx_END
00:00   Add to /etc/system on db01-prod and                Lori       None – this can be done
        db02-prod (in addition to the above                           ahead of time
        entries):
                * Force_100_fdx_BEGIN
                set qfe:qfe_adv_autoneg_cap=0
                set qfe:qfe_adv_100T4_cap=0
                set qfe:qfe_adv_100fdx_cap=1
                set qfe:qfe_adv_100hdx_cap=0
                set qfe:qfe_adv_10fdx_cap=0
                set qfe:qfe_adv_10hdx_cap=0
                * Force_100_fdx_END
00:01   Reboot web02-prod, app02-prod,               Lori   None – complete failover
        db02-prod                                           should result in no user
                                                            impact
00:01   Simultaneously, Rob will configure these     Rob    None
        servers’ ports on the Cisco switches to
        force 100Mb full-duplex.
            - set port speed port_num 100
            - set port duplex port_num full
00:15   After all servers come back up:              Lori   None
            - verify services running/site           and
                functional                           Rob
            - verify network connectivity (ping)
            - verify 100Mb full-duplex
00:20   Fail db server over to db02-prod. This       Lori   Users will have to log in
        requires restarting Dynamo                          again.
00:25   Reboot web01-prod, app01-prod,               Lori   None
        db01-prod
00:25   Simultaneously, Rob will configure these     Rob    None
        servers’ ports on the Cisco switches to
        force 100Mb full-duplex.
            - set port speed port_num 100
            - set port duplex port_num full
00:40   After all servers come back up:              Lori   None
            - verify services running/site           and
                functional                           Rob
            - verify network connectivity (ping)
            - verify 100Mb full-duplex
00:45   Possibly fail db server back to db01-prod    Lori   Users will have to log in
?       (discuss w/ Terry to see what he prefers).          again.
        Requires restarting Dynamo
00:55   Config Alteon ports                          Rob    Users will have to log in
            - /cfg/port port-num fast mode                  again. (Can be done
                full                                        simultaneously with db
            - /cfg/slb/port port-num /client                fail back.)
                enable-disable
            - /cfg/slb/port port-num /server
                enable-disable
            - apply
            - save
            - ptcfg

        Config corresponding Cisco ports
           - set port speed port_num 100
           - set port duplex port_num full
01:05   Verify site login.                           both   None
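
The /etc/system edits in the first two rows can be made repeatable. The sketch below is one possible approach, not the author's procedure: add_force_block is a hypothetical helper that appends the Force_100_fdx block for one driver (hme or qfe) and does nothing if that driver's settings are already present. The file path is a parameter so the function can be tried on a scratch copy first.

```shell
# Hypothetical helper: append the Force_100_fdx block for one driver.
#   $1 = file to edit (normally /etc/system; use a scratch copy when testing)
#   $2 = driver name (hme or qfe)
add_force_block() {
    file=$1
    drv=$2

    # Skip the edit if this driver's autoneg setting is already in the file.
    if grep "^set ${drv}:${drv}_adv_autoneg_cap=0" "$file" >/dev/null 2>&1; then
        return 0
    fi

    # Same lines and comment markers as listed in the rollout timeline.
    cat >> "$file" <<EOF
* Force_100_fdx_BEGIN
set ${drv}:${drv}_adv_autoneg_cap=0
set ${drv}:${drv}_adv_100T4_cap=0
set ${drv}:${drv}_adv_100fdx_cap=1
set ${drv}:${drv}_adv_100hdx_cap=0
set ${drv}:${drv}_adv_10fdx_cap=0
set ${drv}:${drv}_adv_10hdx_cap=0
* Force_100_fdx_END
EOF
}
```

On every production host this would be called as add_force_block /etc/system hme, with an additional add_force_block /etc/system qfe on db01-prod and db02-prod. Keeping a copy of the unmodified /etc/system beforehand makes the rollback easier to verify.
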
If the rollout is determined to be a failure, follow the rollback procedure below.

Rollback Timeline

Time    Action                                         Who      User impact
00:00   Remove lines from /etc/system on all           Lori     None
        production hosts:
               * Force_100_fdx_BEGIN
               set hme:hme_adv_autoneg_cap=0
               set hme:hme_adv_100T4_cap=0
               set hme:hme_adv_100fdx_cap=1
               set hme:hme_adv_100hdx_cap=0
               set hme:hme_adv_10fdx_cap=0
               set hme:hme_adv_10hdx_cap=0
               * Force_100_fdx_END
00:00   Remove lines from /etc/system on               Lori     None
        db01-prod and db02-prod:
               * Force_100_fdx_BEGIN
               set qfe:qfe_adv_autoneg_cap=0
               set qfe:qfe_adv_100T4_cap=0
               set qfe:qfe_adv_100fdx_cap=1
               set qfe:qfe_adv_100hdx_cap=0
               set qfe:qfe_adv_10fdx_cap=0
               set qfe:qfe_adv_10hdx_cap=0
               * Force_100_fdx_END
00:05   Reboot web02-prod, app02-prod,                 Lori     None – complete failover
        db02-prod                                               should result in no user
                                                                impact
00:05   Simultaneously, Rob will configure these       Rob      None
        servers’ ports on the Cisco switches to
        allow auto-negotiate.
            - set port speed port_num auto
00:20   After all servers come back up:                Lori     None
            - verify services running/site             and
                functional                             Rob
            - verify network connectivity (ping)
00:25   Fail db server over to db02-prod. This         Lori     Users will have to log in
        requires restarting Dynamo                              again.
00:30   Reboot web01-prod, app01-prod,                 Lori     None
        db01-prod
00:30   Simultaneously, Rob will configure these       Rob      None
        servers’ ports on the Cisco switches to
        allow auto-negotiate.
            - set port speed port_num auto
00:45   After all servers come back up:                Lori     None
            - verify services running/site             and
                functional                             Rob
            - verify network connectivity (ping)
            - verify 100Mb full-duplex
00:50   Possibly fail db server back to db01-prod      Lori     Users will have to log in
?       (discuss w/ Terry to see what he prefers).              again.
        Requires restarting Dynamo
01:00   Config Alteon ports                            Rob      Users will have to log in
            - /cfg/port port-num auto                           again. (Can be done
            - /cfg/slb/port port-num /client                    simultaneously with db
                enable-disable                                  fail back.)
            - /cfg/slb/port port-num /server
                enable-disable
            - apply
            - save

01:10   Verify site login.                             both     None
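
Because both the hme and qfe entries sit between the same comment markers, the removal in the first two rollback rows can be done in one pass. The following is a sketch under that assumption, not the author's procedure: remove_force_blocks is a hypothetical helper that deletes every marked block from the given file and leaves all other lines untouched.

```shell
# Hypothetical helper: delete every Force_100_fdx block from a file.
#   $1 = file to edit (normally /etc/system; use a scratch copy when testing)
remove_force_blocks() {
    file=$1
    tmp="${file}.tmp.$$"

    # Drop everything from a BEGIN marker through the matching END marker.
    if sed '/^\* Force_100_fdx_BEGIN/,/^\* Force_100_fdx_END/d' "$file" > "$tmp"; then
        # Write back in place so the file keeps its owner and permissions.
        cat "$tmp" > "$file"
        status=$?
    else
        status=1
    fi

    rm -f "$tmp"
    return $status
}
```

On a production host this would be called as remove_force_blocks /etc/system before the reboot steps.
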

Appendix A – Checking Status of Ethernet Interfaces on Sun
To check qfe interfaces, substitute qfe for hme.
To check other instances of an hme or qfe interface, substitute the appropriate instance number.

Check the present link information:

ndd -set /dev/hme instance 0
This selects the hme device that the following ndd commands act on:
instance 0 = hme0
instance 1 = hme1

ndd -get /dev/hme transceiver_inuse
0=internal rj45 100baseTx connector
1=external mii transceiver

ndd -get /dev/hme link_status
0=down
1=up

ndd -get /dev/hme link_speed
0=10Mb
1=100Mb

ndd -get /dev/hme link_mode
0=half duplex
1=full duplex
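
The three values above can be combined into a single pass/fail check for the "verify 100Mb full-duplex" steps in the timelines. This is a minimal sketch: describe_link is a hypothetical helper that takes the link_status, link_speed and link_mode values as arguments, prints a readable summary, and returns success only for an up link at 100Mb full duplex.

```shell
# Hypothetical helper: interpret ndd link values.
#   $1 = link_status (0=down, 1=up)
#   $2 = link_speed  (0=10Mb, 1=100Mb)
#   $3 = link_mode   (0=half duplex, 1=full duplex)
describe_link() {
    status=$1
    speed=$2
    mode=$3

    if [ "$status" != "1" ]; then
        echo "link down"
        return 1
    fi

    case $speed in
        1) s="100Mb" ;;
        0) s="10Mb" ;;
        *) s="unknown-speed" ;;
    esac
    case $mode in
        1) m="full-duplex" ;;
        0) m="half-duplex" ;;
        *) m="unknown-duplex" ;;
    esac

    echo "$s $m"
    # Succeed only when the link is running at 100Mb full duplex.
    [ "$speed" = "1" ] && [ "$mode" = "1" ]
}

# Typical use on a host, after selecting the instance (commented out here):
# ndd -set /dev/hme instance 0
# describe_link "$(ndd -get /dev/hme link_status)" \
#               "$(ndd -get /dev/hme link_speed)" \
#               "$(ndd -get /dev/hme link_mode)"
```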

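Check the advertised capabilities (0=not advertised, 1=advertised):

ndd -get /dev/hme adv_autoneg_cap
ndd -get /dev/hme adv_100fdx_cap
ndd -get /dev/hme adv_100hdx_cap
ndd -get /dev/hme adv_100T4_cap
ndd -get /dev/hme adv_10fdx_cap
ndd -get /dev/hme adv_10hdx_cap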

To check the link partner (switch or MII transceiver) capabilities:

0=link partner does not advertise this capability
1=link partner has this capability

ndd -get /dev/hme lp_autoneg_cap
ndd -get /dev/hme lp_100fdx_cap
ndd -get /dev/hme lp_100hdx_cap
ndd -get /dev/hme lp_100T4_cap
ndd -get /dev/hme lp_10fdx_cap
ndd -get /dev/hme lp_10hdx_cap
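
The capability queries above follow one naming pattern, so they can be looped rather than typed one by one. The sketch below is illustrative only: cap_params and show_caps are hypothetical helper names, and the second argument to show_caps exists so the loop can be exercised with a stand-in command on a machine without ndd.

```shell
# Hypothetical helper: print the six capability parameter names
# for a prefix (adv for local settings, lp for the link partner).
cap_params() {
    for cap in autoneg 100fdx 100hdx 100T4 10fdx 10hdx; do
        echo "${1}_${cap}_cap"
    done
}

# Hypothetical helper: print "name = value" for each capability.
#   $1 = prefix (adv or lp)
#   $2 = command used to read one parameter (defaults to ndd on /dev/hme)
show_caps() {
    prefix=$1
    getter=${2:-"ndd -get /dev/hme"}
    for p in $(cap_params "$prefix"); do
        # $getter is left unquoted on purpose so the default splits into words.
        echo "$p = $($getter "$p")"
    done
}

# Typical use on a host, after selecting the instance (commented out here):
# ndd -set /dev/hme instance 0
# show_caps adv
# show_caps lp
```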

								