Business Continuity

					          Business Continuity and
        Disaster Recovery Strategies


                    October 2, 2002



Paul DiGiacomo,
 Product Management Director
AT&T Business - Managed Services
AT&T Ultravailable® And Recovery Solutions
Pdigiacomo@att.com, 973-644-6340
Topics

 Key Drivers

 AT&T BC/DR Best Practices Governance and Execution Process

 AT&T BC/DR Best Practices Architecture

 AT&T BC/DR Internal and Client Experiences



 Summary



CxO Drivers Today

      - eBusiness
      - 24x7, Always On
      - Globalization

      - Volatile Markets
      - Officer Liability
      - HIPAA
      - OHS / SEC / Comptroller

Financial Trends
      - Cost Reduction
      - TCO / ROI Focus
      - CapEx Reduction
      - Managed Services

Risk Trends
      - Broader Threats
      - Greater Risk
      - Greater Exposure

      - Productivity Focus
      - Scarce Qualified Resources
      - Internet Time
      - Consolidation

      - Improved Network Intelligence
      - Exponential Data Growth
      - Emerging Protocols
      - Application Management

Industry Trends Shaping Customer
Needs

Business and technology leaders today are facing highly demanding
performance requirements.

Trends:
      - Customer / Market Trends
      - Technology Trends
      - Structural / Organizational Trends
      - Regulatory / Shareholder Trends
      - Risk Trends

Requirements:
      - Continuity
      - Recoverability
      - Reliability
      - Security
      - Scalability
      - Accessibility
      - Performance
      - Quality
      - Availability
      - Integrity
      - Predictability
      - Accuracy

 AT&T BC/DR Best Practices

Governance:
      - Acknowledge Importance of BC/DR
      - Create a BC/DR Governance Structure
      - Develop & Deploy Standards & Policies
      - Monitor Progress of BC/DR Program

Execution:
      - Assess & Prioritize Key Areas of the Business
      - Assess Threats, Vulnerabilities, Risks & Exposures
      - Decision (?): Transfer Risk, Accept Risk, or Mitigate Risk Based on Business Case
      - Develop & Deploy BC/DR Plans & Assets
      - Simulate / Test / Exercise
      - Monitor & Manage Events
      - Respond, Repair, & Recover

BC/DR: Key Drivers, Triggers, Enablers

   Culture and Heritage of Quality and Reliability Planning
   Audit or Risk Committee Findings / Concerns
   Reliability Differentiator in the Marketplace
   Legal and Regulatory Requirements
   Due Diligence / Insurability
   Competitive Benchmarks
   Recent Outage(s) or Disaster(s)
   Stakeholder Expectations
   Data Protection Requirements
   Process Availability Requirements
   Data Center Migration / Consolidation

An Exemplary Approach to Governance

      - Chairman / CEO
      - Board of Directors
      - Audit Committee
      - Corp. Officers, one of whom is the BC Champion

BC/DR Council:
      - Senior Officers
      - BC/DR Officer

BC/DR Steering Committee:
      - IT Operations
      - Application Development
      - Network Operations
      - Business Units

Site Incident Mgt Teams:
      - Site 1, Site 2, … Site n

Corporate Support Team:
      - Finance
      - Security
      - HR
      - Public Relations
      - Legal
      - Real Estate
      - IT Operations
      - Network Operations
      - Business Units

   BC/DR Standards and Policies
            Certification and Assurance Standards

         Incident Management Process Standards

      Response Planning Standards

    Risk Management Standards
 Disaster Recovery Planning Standards

Integrated Planning Process Standard           Plan Distribution Standard
Classifications Standard                       Exceptions Standard
Funding Standard                               Work Center DR Plan Standard
Component DR Plan Content Standard             Application DR Plan Standard
Data Backup / Offsite Storage Standard         Data Retrieval DR Plan Standard
Disaster Recovery Planning Tool Standard       Network DR Plan Standard
Security Standard                              AT&T Core Network DR Plan Standard
Levels of Service Standard                     AT&T Internal Data Network DR Plan Standard
Plan Exercise (Test) Standard                  Platform DR Plan Standard
Approval Standard                              Recovery Management DR Plan Standard
Plan Maintenance and Change Control Standard   DR Integrated Planning Process Flow
Training and Awareness Standard                DR Teams Roles and Responsibilities
Risk Acceptance for Non-Compliance Standard    Component DR Plan Content


Potential Key Processes / Functions

      - Leadership / Strategy
      - Research & New Product Development
      - Marketing & Sales
      - Supplier Management
      - Production / Operations
      - Customer Care / Support
      - Billing / Collections
      - Communications & IT

  Support Functions: HR / Legal / Finance / IR / PR

  Example Impact Assessment Questions
Process Description       Primary functions, responsibilities, and accountabilities
Regulatory Reporting      Types of reports and frequency
Operational Impacts       Impact (Service Level Agreements) and relative importance
Financial Impacts         Lost revenue and other financial impacts
Technology Resources      Communications and applications
Work Inflows / Outflows   Internal and external process inputs / outputs
Outage Tolerance          How long could your Process be completely idled?
Impact Profiles by Time   Impact based on: monthly, weekly, daily and hourly
Work Backlogs             Backlog, normal and seasonal
Special Requirements      Any one-of-a-kind items required to conduct business
Backups                   Frequency of and access to backups
Work Around Procedures    Are there workaround procedures? How good are they?
Workload Shifting         What percentage of workload can be shifted to vendors for how long?
Disruption Experience     History and type of process disruptions
Process Vulnerability     Vulnerability of your Process to a prolonged disruption or outage
Restoration Complexity    How difficult to recover to an acceptable level after a disruption?
Recovery Time Objective   What is the optimal Recovery Time Objective (RTO) for your Process?
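
One way to put the answers to these questions to work is to record them per process and rank the processes by outage tolerance. The sketch below is only an illustration: the `ProcessAssessment` class, its field names, the ranking rule, and every sample value are hypothetical and are not part of the questionnaire above.

```python
from dataclasses import dataclass


@dataclass
class ProcessAssessment:
    """Hypothetical record holding a few impact-assessment answers for one process."""
    name: str
    rto_hours: float            # Recovery Time Objective stated by the process owner
    hourly_revenue_loss: float  # financial impact per hour of outage (sample figure)
    has_workaround: bool        # whether workaround procedures exist


def prioritize(assessments):
    """Order processes so the shortest RTO comes first; ties go to the larger hourly loss."""
    return sorted(assessments, key=lambda a: (a.rto_hours, -a.hourly_revenue_loss))


# Sample values are invented for illustration.
sample = [
    ProcessAssessment("Billing / Collections", rto_hours=24,
                      hourly_revenue_loss=5_000, has_workaround=True),
    ProcessAssessment("Customer Care / Support", rto_hours=4,
                      hourly_revenue_loss=20_000, has_workaround=False),
    ProcessAssessment("Marketing & Sales", rto_hours=24,
                      hourly_revenue_loss=8_000, has_workaround=True),
]

ranked = [a.name for a in prioritize(sample)]
print(ranked)
```

Sorting on RTO first is one possible choice; an organization could just as reasonably weight regulatory reporting or backlog tolerance more heavily.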

 AT&T BC/DR Best Practices

   Mitigate Risk Based on Business Case:
   Reduce the Threat, Vulnerability, Risk or Exposure

   Best Practices Risk Assessment Approach

   Columns for each area: Current Risk | Mitigation Investments (Plan A, Plan B)

 Information Technology: Server and client platforms, LAN,
  MAN, WAN, data security, data management, applications,
  voice, storage, Disaster Recovery plans…
 Facility Security: Perimeter security, entrances / exits,
  loading dock, security cameras, alarm methods, remote
  monitoring, guard staffing, interior security systems…
 Information Security: Servers, routers, firewalls, applications,
  intrusion detection systems, security policies and standards,
  organization structure, training and policy deployment…
 Infrastructure / Environmentals: Physical building structure,
  location, HVAC, water / plumbing, environment inspection…
 Facilities Safety: Fire exits, emergency procedures, fire
  suppression equipment (extinguishers, Halon, FM-200, dry &
  charged sprinklers), emergency lighting, test schedules…
 Power: Grounding, distribution, switching, dual grids, UPS
  installation, maintenance, capacity, load testing, generator
  installation, maintenance, DC battery plant…
 Other: Organizational structure, training & education,
  customer / supplier contracts…
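
Comparing Current Risk against Mitigation Investments under Plan A and Plan B amounts to a small business-case calculation. The sketch below shows one possible way to frame it, assuming risk is expressed as an expected annual loss in dollars. The function names, the decision rule, and all figures are hypothetical and do not come from the AT&T methodology.

```python
def net_benefit(current_annual_loss, residual_annual_loss, annual_plan_cost):
    """Loss avoided per year minus what the plan costs per year."""
    return (current_annual_loss - residual_annual_loss) - annual_plan_cost


def choose_plan(current_annual_loss, plans):
    """Return the plan name with the highest positive net benefit, or None.

    `plans` maps a plan name to (residual_annual_loss, annual_plan_cost).
    None means no plan pays for itself, so the risk would be accepted or transferred.
    """
    best_name, best_value = None, 0.0
    for name, (residual, cost) in plans.items():
        value = net_benefit(current_annual_loss, residual, cost)
        if value > best_value:
            best_name, best_value = name, value
    return best_name


# Invented figures for one assessment area (for example, Power).
current_risk = 1_000_000           # expected annual loss today
plans = {
    "Plan A": (400_000, 250_000),  # (residual expected loss, annual cost)
    "Plan B": (100_000, 700_000),
}
choice = choose_plan(current_risk, plans)
print(choice)
```

With these sample numbers Plan B removes more risk, but Plan A leaves the larger net benefit, which is the kind of trade-off a business case is meant to expose.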
    Typical Client Scenario




   Centralized Data Center / Work Center / Call Center
   Single Location for Mission Critical Data
   Single Location for Mission Critical Computing
   Single Location for Mission Critical Applications
   Single Points of Failure for Network Access
   Unacceptable Concentration of Risk
   The Enterprise MAY NOT Survive a Disaster



    Costs of Outages & Disasters

    Quantitative Direct
    - Revenue Impact
    - Opportunity Cost
    - Penalty Clauses
    - Fines
    - ...

  + Quantitative Indirect
    - Market Share Loss
    - Customer Share Loss
    - Litigation
    - ...

  + Strategic
    - Potential for Total Business Failure
    - Brand Equity Loss
    - Market Cap Loss
    - ...

  + Physical Loss
    - IT/Network Assets
    - Lost Data
    - Buildings, Vehicles, Furniture
    - ...

  + Recovery & Restoration
    - Relief & Recovery Operations
    - Interim Operations
    - Replacement
    - ...

  = Total Business Impact

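The total business impact on this slide is a plain sum of five cost categories. A minimal sketch of that arithmetic follows; the dictionary keys are hypothetical names for the slide's categories, and the dollar figures are invented purely to show the calculation.

```python
# The five cost categories named on the slide (key names are illustrative).
CATEGORIES = (
    "quantitative_direct",
    "quantitative_indirect",
    "strategic",
    "physical_loss",
    "recovery_and_restoration",
)


def total_business_impact(costs):
    """Sum the five categories; a missing category raises KeyError rather than counting as zero."""
    return sum(costs[c] for c in CATEGORIES)


# Invented dollar figures for illustration only.
example_costs = {
    "quantitative_direct": 2_000_000,
    "quantitative_indirect": 1_500_000,
    "strategic": 3_000_000,
    "physical_loss": 750_000,
    "recovery_and_restoration": 500_000,
}
total = total_business_impact(example_costs)
print(total)
```

Raising on a missing category is a deliberate choice here, so that an unassessed category is not silently treated as costing nothing.
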
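Costs of Outages & Disasters are
Time and Severity Dependent

[Chart: Dollar Cost of Outage (vertical axis, $0 to $1B, with $1K and $1M marked) against outage duration (horizontal axis: seconds, minutes, hours, days, weeks, months, years). Time Dependent Costs range from Shorter to Longer outages; Severity Dependent Costs range from Smaller to Larger events.]
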
Disaster Recovery Approach

  Typical Approach Elements:
      – Off-site data vaulting
      – Shared IT Resources
      – Permanent Primary Site, Shared Subscription to Temporary Recovery
        Site
    Some Down Time
    Some Data Loss
    Lower Cost
    Best Where Investment in Duplication Would Exceed
     Importance of Process / Service / Asset
  Not Network-Centric



System Disaster Recovery:
Deferrable Workload Strategy

[Diagram, three columns:
   Production Sites: Production Site 1, Production Site 2; Critical Business Apps; Vaulted Data
   Primary Recovery Site(s): Production Site 3; Test / Dev / Deferrable Apps; Data
   Secondary Recovery Site (Deferrable Workload): two Vendor Locations]

      - AT&T Production Site 3 is Primary Recovery Site
      - Test / Development / Deferrable Workload Moved to Vendor Site
      - Guaranteed Recovery to Second Vendor Location
      - Full IT Recovery Environment with Extended Stay Potential
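Disaster Recovery Timeline

Before the disaster:
      - Data Backup
      - Data in Transit
      - Data Vaulted

Disaster

After the disaster (Incident Management):
      - Notification, Damage Assessment & Declaration
      - Response
      - Relief
      - Recovery
      - Restoration

Recovery Point Objective (RPO): shown on the timeline before the disaster
Recovery Time Objective (RTO): shown on the timeline after the disaster
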
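The two objectives on the timeline can be checked with simple date arithmetic. The sketch below assumes the usual definitions: RPO bounds how much recent data may be lost, measured back from the disaster to the last vaulted backup, and RTO bounds how long recovery may take. The function names, timestamps, and 24-hour targets are all hypothetical.

```python
from datetime import datetime, timedelta


def data_loss_window(last_vaulted, disaster):
    """Time between the last vaulted backup and the disaster (data at risk)."""
    return disaster - last_vaulted


def recovery_duration(disaster, recovered):
    """Time from the disaster until service is recovered."""
    return recovered - disaster


def meets_objectives(last_vaulted, disaster, recovered, rpo, rto):
    """True when both the data-loss window and the recovery time are within target."""
    return (data_loss_window(last_vaulted, disaster) <= rpo
            and recovery_duration(disaster, recovered) <= rto)


# Invented timestamps for illustration.
last_vaulted = datetime(2002, 10, 1, 23, 0)
disaster = datetime(2002, 10, 2, 9, 30)
recovered = datetime(2002, 10, 3, 15, 30)

loss = data_loss_window(last_vaulted, disaster)
downtime = recovery_duration(disaster, recovered)
ok = meets_objectives(last_vaulted, disaster, recovered,
                      rpo=timedelta(hours=24), rto=timedelta(hours=24))
print(loss, downtime, ok)
```

In this sample the data-loss window of 10.5 hours is inside the RPO, but the 30-hour recovery misses the RTO, so the check fails.
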
Fundamental Continuity Strategy:
Network-Based Geographic Dispersion



           The needs of today’s customers
               for Business Continuity
              mandate a network-centric
        geographically-dispersed infrastructure



                 Distant Enough for Safety


        Close Enough for Cost-Effective Performance


Disaster Recovery vs.
Business Continuity
          Disaster Recovery                          Business Continuity

 Typical Approach Elements:                 Typical Approach Elements:
     – Off-site data vaulting                     – Data Mirroring
     – Shared IT Resources                        – Computing Fail-over
     – Permanent Primary Site, Shared             – Multiple Permanent Sites
       Subscription to Temporary Recovery
       Site

   Some Down Time                             No Down Time
   Loses Data                                 No Data Loss
   Lower Cost                                 Higher Investment
   Best Where Investment in                   Best for Mission Critical
    Duplication Would Exceed                    Processes / Services / Assets
    Importance of Process /
    Service / Asset
 Not Network-Centric                        Highly Network-Centric

Four Major Availability Levels / Strategies

            Standard             Disaster                High                     Ultravailable
            Availability         Recovery                Availability             (99.999%)

Computing   Single Server        Server w/ Hot-Site      Local Cluster            Dispersed Cluster
                                 Subscription                                     w/ Failover

Data        Single Storage       Storage Device w/       Local RAID Striping /    Synchronous Remote
            Device               Off-Site Vaulting       Mirroring                Mirroring

Network     Legacy LAN/MAN/WAN   Trailerized NDR         Unprotected DWDM         Protected Metro Ring
            Connectivity         Resources               Services                 Services

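An availability percentage such as the 99.999% quoted for the Ultravailable level translates directly into a yearly downtime budget. The short sketch below does that conversion. Only the 99.999% figure comes from the slide; the other percentages are included for comparison and are not attributed to the other three levels.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600, ignoring leap years


def allowed_downtime_minutes(availability_percent):
    """Maximum downtime per year, in minutes, for a given availability percentage."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


for pct in (99.0, 99.9, 99.99, 99.999):
    print(f"{pct}% -> {allowed_downtime_minutes(pct):.2f} minutes/year")
```

Five nines works out to roughly 5.26 minutes of downtime per year.
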
AT&T Ultravailable® Network Services

      - Diverse Routing with Automatic Protection Switching /
        Optical Path Failover for 99.999% Availability
      - Secure, Conditioned Network Nodes
      - Dual Laterals, Dual Risers To Eliminate SPOFs
      - 24x7 Centralized Monitoring & Management for Service Assurance
      - Multiprotocol for Flexibility and Future-Proofing:
        Gigabit Ethernet, Fibre Channel, FICON, ESCON, OCx, D1 Digital Video
      - Non-Switched All-Optical for Low Latency
      - Client- or AT&T-Facility Based
      - 64 Unprotected or 32 Protected Wavelengths per Fiber for High Bandwidth,
        Rapid Provisioning, & Low Cost
      - Data Rates of 2.5Gb/s -> 10Gb/s Evolving to 40 Gb/s and up

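The wavelength counts and data rates quoted above imply a raw capacity per fiber. The sketch below multiplies them out, assuming every wavelength runs at the same rate and ignoring protocol overhead. The function name is hypothetical, and the results are raw arithmetic rather than published service figures.

```python
def aggregate_capacity_gbps(wavelengths, rate_gbps):
    """Raw per-fiber capacity: wavelength count times per-wavelength data rate."""
    return wavelengths * rate_gbps


for label, count in (("unprotected", 64), ("protected", 32)):
    for rate in (2.5, 10, 40):
        print(f"{count} {label} wavelengths at {rate} Gb/s: "
              f"{aggregate_capacity_gbps(count, rate):g} Gb/s")
```
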
AT&T Ultravailable® Suite

      - 24x7 Centralized Monitoring & Management for Service Assurance
      - MAN-Area Server Clustering / Fail-Over
      - Synchronous Mode Data Mirroring based on
        Leading Vendor Disk Arrays for Zero Data Loss
      - Remote Server-Based or Serverless Backup / Restore
        to Automated Tape Libraries for Data Protection
      - Network Agnostic:
          - Fibre Channel / FICON / ESCON over Ultravailable
            Dedicated or Wavelength DWDM
          - IP / GigE over Metro Ethernet Services
          - Channel Extension over T1/T3/OCx
      - 99.999% Data Availability

Storage shown: SANs, Primary Storage, Private or Hosted Storage

   AT&T Continuity, Recovery, Hosting, &
   Security Services

[Diagram: a Client Location and an AT&T Hosting Location, linked through AT&T COs by Ultravailable Network and Wavelength Services.]

Services and elements shown:
      - Managed Token Authentication
      - Managed VPN
      - Managed Internet Services
      - Managed Hosting
      - Managed Services Client Portal
      - Managed Intrusion Detection & Scanning Services
      - Managed Firewall Services
      - Managed Primary Storage
      - Ultravailable Computing
      - Ultravailable Managed Hosted Data
      - Ultravailable Data
      - Ultravailable Tape
      - SANs
      - NAS
      - NDR Trailers
      - GCSC
      - GNOC

Example Applications

   Interlocation Trunking
   Remote Disk Mirroring
   Remote Back-up
   SAN Extension
   Database Replication
   LAN Bridging
   Content Distribution
   Server Failover / Remote Clustering
   Multimedia Conferencing

   The slide illustrates these with a small EMP / ORG / SAL table
   (rows: Doe 37 C5, Ng 27 C5, rd 88 F9) shown identically at each site.
                                                                                                    30
AT&T BC/DR Best Practices

   Acknowledge Importance of BC/DR
   Create a BC/DR Governance Structure
   Develop & Deploy Standards & Policies
   Monitor Progress Of BC/DR Program

   Assess & Prioritize Key Areas of the Business
   Assess Threats, Vulnerabilities, Risks & Exposures
   Decision point (?):
      – Transfer Risk
      – Accept Risk
      – Mitigate Risk Based on Business Case
   Develop & Deploy BC/DR Plans & Assets
   Simulate / Test / Exercise
   Monitor & Manage Events
   Respond, Repair, & Recover
                                                                               31
 Example DR Plan Timeline

ID   Task Name
 1   Deploy the SDR Process
 2   Run DR Merge
 3   Run Audits & Correct Errors
 4   Emergency ODA Retrofit
 5   Forward Prioritization List
 6   Apply Joker Equipment Options
 7   Swing / Build A Links
 8   Relocate ASTN NI Backup
 9   Build ASTN NI F Links
10   Assign ISDN D Link Nodes
11   Re-Engineer Switched T1s
12   Deploy the NDR Trailers
13   Prep the NDR Trailers for Recovery
14   Connect Fiber and T3 Facilities
15   Test and Turn Up Technologies
16   Trunk Recent Changes
17   Trunk Status Evaluation
18   Provide Status Reports
19   Overall Service Evaluation




                                          32
Process Architecture Performance:
Four Parallel Example Activities

   Bring up OS: 10 Hours
   Swing WAN: 10 Hours
   Retrieve Vaulted Tapes: 10 Hours
   Test LAN: 10 Hours

   How long will this take?!?
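
One way to explore the question is a small simulation. The sketch below is only an illustration under stated assumptions, not AT&T's method: it assumes the four activities run fully in parallel, so recovery finishes when the slowest activity finishes, and it borrows the normal distribution with a standard deviation of 5 that this deck uses for its serial example. The function names, trial count, and seed are hypothetical.

```python
import random
import statistics

# The four example activities and their nominal durations in hours (from the slide).
ACTIVITIES = {
    "Bring up OS": 10.0,
    "Swing WAN": 10.0,
    "Retrieve Vaulted Tapes": 10.0,
    "Test LAN": 10.0,
}


def parallel_duration(durations):
    """Parallel activities finish when the slowest one finishes."""
    return max(durations)


def simulate_parallel(trials=1000, sd=5.0, seed=1):
    """Draw each activity from an assumed normal distribution and
    return the recovery time of every trial (hypothetical model)."""
    rng = random.Random(seed)
    results = []
    for _ in range(trials):
        draws = [rng.gauss(mean, sd) for mean in ACTIVITIES.values()]
        results.append(parallel_duration(draws))
    return results


if __name__ == "__main__":
    print("Nominal plan:", parallel_duration(ACTIVITIES.values()), "hours")
    print("Simulated mean:", round(statistics.mean(simulate_parallel()), 1), "hours")
```

With no variability the plan takes 10 hours. Once each activity varies, the slowest of the four tends to run past 10 hours, so the nominal figure is optimistic under this model.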

                                                   33
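Importance of Exercising Process

   Chart: Likelihood of Meeting or Beating Recovery Time (0% to 100%, with 50%
   and 90% marked) plotted against elapsed time (0 to 72 hours).
   Points marked along the time axis, in order: Minimum Recovery Time,
   Recovery Time Objective, Mode, 90% Confidence Recovery Time.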
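
The two quantities on this chart can be estimated from recorded exercise results. The following is a minimal sketch, assuming a list of observed recovery times is available: it computes the fraction of exercises that met a target time and the time by which a chosen share of exercises had finished. The function names and the sample figures are hypothetical and are not data from the presentation.

```python
import math


def likelihood_of_meeting(recovery_times, target_hours):
    """Fraction of observed recoveries that finished within the target."""
    met = sum(1 for t in recovery_times if t <= target_hours)
    return met / len(recovery_times)


def confidence_recovery_time(recovery_times, confidence=0.90):
    """Smallest observed time by which at least `confidence` of the
    recoveries had finished (nearest-rank percentile)."""
    ordered = sorted(recovery_times)
    rank = max(1, math.ceil(confidence * len(ordered)))
    return ordered[rank - 1]


if __name__ == "__main__":
    # Hypothetical exercise results in hours; not data from the presentation.
    exercise_hours = [18, 22, 24, 25, 27, 30, 31, 36, 44, 60]
    rto_hours = 24  # hypothetical Recovery Time Objective
    print("Likelihood of meeting RTO:", likelihood_of_meeting(exercise_hours, rto_hours))
    print("90% confidence recovery time:", confidence_recovery_time(exercise_hours), "hours")
```

A large gap between the Recovery Time Objective and the 90% confidence time suggests that the objective is met only on a good day, which is what repeated exercising is meant to reveal.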
                                                                                     34
Process Architecture Performance:
Four Serial Normally Distributed Activities

   Chart: Sorted Results of 100 Trials

   Each normally distributed activity has a mean of 10 and a standard
   deviation of 5.

   Results:
      Minimum: 23.9027
      Maximum: 67.7011
      Mean: 41.0530
      Median: 40.1506
      95% Confidence: 57.8925
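
A trial run of this kind can be reproduced with a short Monte Carlo sketch. The code below assumes the four activities are independent and run one after another, so each trial's total is the sum of four draws, and it reads "95% Confidence" as the 95th percentile of the sorted totals; both are assumptions, as are the function names and the seed. Because the draws are random, the figures will not match the slide's exactly. For independent activities the total has a theoretical mean of 40 and a standard deviation of 10, which is consistent with the results shown.

```python
import math
import random
import statistics


def simulate_serial(trials=100, activities=4, mean=10.0, sd=5.0, seed=1):
    """Total time of serial activities: the sum of one draw per activity.

    Draws are not clipped at zero; the source does not say how negative
    durations were handled, so this is left as an assumption.
    """
    rng = random.Random(seed)
    return sorted(
        sum(rng.gauss(mean, sd) for _ in range(activities))
        for _ in range(trials)
    )


def percentile(sorted_values, fraction):
    """Nearest-rank percentile of an already sorted list."""
    rank = max(1, math.ceil(fraction * len(sorted_values)))
    return sorted_values[rank - 1]


def summarize(sorted_values):
    """Collect the same statistics the slide reports."""
    return {
        "minimum": sorted_values[0],
        "maximum": sorted_values[-1],
        "mean": statistics.mean(sorted_values),
        "median": statistics.median(sorted_values),
        "p95": percentile(sorted_values, 0.95),
    }


if __name__ == "__main__":
    for name, value in summarize(simulate_serial()).items():
        print(f"{name}: {value:.4f}")
```

Planning to the mean of about 40 would miss the target in roughly half the trials; the 95th percentile is the more defensible commitment.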


                                                                            35
Certification and Assurance Metrics

   Assurance Level F (5-8%)
      No Plan

   Assurance Level E (9-34%)
      Plan Documented:
      - Local Format
      - Data Identified and Backed up
      - Plan Updated within the Last 12 months

   Assurance Level D (35-50%)
      Exercise Ready:
      - Required Content
      - LDRPS
      - Plan Maintenance Validated
      - Paper Walk Thru Completed

   Assurance Level C (51-65%)
      Simulation Exercise Conducted:
      - Component Simulation Exercise Conducted
      - Copy of Critical Data should be Off-site
      - Critical MRs / deficiencies occurred in Exercise

   Assurance Level B (66-89%)
      Simulation Exercise Conducted - All Critical Deficiencies / Critical MRs Corrected:
      - Component simulation exercise completed
      - Copy of Critical data must be off-site
      - Critical MRs / deficiencies corrected and closed by Post Review.
      - The exercise was completed in no more than double the time specified by the RTO.

   Assurance Level A (90-95%) - Certification Achieved
      Certification:
      - Process Owner requirements for RTO/RPO ** and service level met/ensured
      - No critical deficiencies/MRs.

   Assessment types shown on the chart: Unit Self Assessments;
   Unit and C&A Joint Assurance Assessments

   % Reflects Estimated Likelihood of Recovery.
   ** RTO - Recovery Time Objective
      RPO - Recovery Point Objective
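
The assurance levels can be held in a simple lookup. The sketch below is an illustration only: the pairing of each percentage range with a level is read from the chart's staircase layout and should be treated as an assumption, and the function names are hypothetical. It also encodes the Level B timing criterion, that the exercise finished in no more than double the time specified by the RTO.

```python
# Estimated likelihood of recovery per assurance level, in percent.
# Pairing of ranges to levels is inferred from the chart layout (assumption).
LIKELIHOOD_RANGE = {
    "F": (5, 8),
    "E": (9, 34),
    "D": (35, 50),
    "C": (51, 65),
    "B": (66, 89),
    "A": (90, 95),
}


def level_for_likelihood(percent):
    """Return the assurance level whose range contains `percent`, else None."""
    for level, (low, high) in LIKELIHOOD_RANGE.items():
        if low <= percent <= high:
            return level
    return None


def meets_level_b_timing(exercise_hours, rto_hours):
    """Level B timing criterion: exercise took no more than double the RTO."""
    return exercise_hours <= 2 * rto_hours


if __name__ == "__main__":
    print(level_for_likelihood(70))
    # Hypothetical figures: a 40-hour exercise against a 24-hour RTO.
    print(meets_level_b_timing(40, 24))
```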
AT&T Experience




                  37
Network Disaster Recovery

   Diagram labels: Joker 4ESS; 4ESS switches; AT&T Switched Network;
   DR Intertoll Trailers; DR Access Trailer; 4ESS at the Disaster Site;
   End Offices. The trailers carry "AT&T Disaster Recovery" markings.
                                                                          38
Network Disaster Recovery




                            39
    NDR Mobile Recovery Assets

   Access Trailers
   Digital Access and Cross-Connect Systems Trailers
   DTMS/FASTAR® Trailers
   Lightwave Trailers
   5ESS Switch Recovery Platform Trailers
   DMS500 Switch Recovery Platform Trailers
   Lightguide Regeneration Trailers
   Digital Radio Trailers
   Power Generation
   Digital Radio Recovery Trailers
   Portable Radio Towers
   Emergency Communications Vehicles

                                                        40
NDR Exercises
  Training and Field Exercises Conducted Quarterly
  Test, Exercise, and Develop Capabilities
     – Declaration / Deployment / Transportation / Set-Up
     – Technology
     – Teams
     – Processes
  Sample of Exercises Conducted Since 1997


     – ’97 Oakbrook, IL
     – ’98 Salt Lake City, UT
     – ’98 Kansas City, MO
     – ’98 Arlington, VA
     – ’99 Lodi, CA
     – ’99 Atlanta, GA
     – ’99 San Antonio, TX
     – ’00 White Plains, NY
     – ’00 Phoenix, AZ
     – ’00 St. Louis, MO
     – ’01 Denver, CO
     – ’01 Tampa, FL


                                                                                     41
 Recent NDR Deployments
 Date         Situation           Location                 Assets            Purpose
9/2001   WTC Disaster         NYC, NY &         Technology Trailers,   Communications
                              Northern NJ       Satellite Units        NYPD Support
                                                                       Humanitarian Relief
6/2001   Flooding (Tropical   Houston, TX       Satellite Unit         Communications
         Storm Allison)                         Technology Trailers    Humanitarian Relief
2/2000   Tornado              Camilla, GA       Satellite Unit         Humanitarian Relief
9/1999   Flooding             Tarboro, NC       Satellite Unit         Humanitarian Relief
         (Hurricane Floyd)
9/1999   Flooding             Rochelle Park,    Satellite Unit         Communications
         (Hurricane Floyd)    NJ                Technology Trailers    Humanitarian Relief
5/1999   Tornadoes            Oklahoma City,    Satellite Unit         Humanitarian Relief
                              OK
7/1998   Forest Fires         Brevard Co., FL   Satellite Unit         Humanitarian Relief
2/1998   Tornado              Lake Mary &       Satellite Unit         Humanitarian Relief
                              Kissimmee, FL
9/1997   Gas Line Break       Scranton, PA      Satellite Unit         Humanitarian Relief
9/1997   Train Derailment     Dunkirk, NY       Regenerator Trailer    Communications
4/1997   Flood                Grand Forks, ND   Satellite Unit         Humanitarian Relief
3/1997   Floods               Ohio, Kentucky,   Satellite Units        Communications
                              West Virginia     Lightguide Trailer     Humanitarian Relief
2/1997   Flood                Lodi, CA          Regenerator Trailer    Communications
                                                                                         42
NDR Recovery Site Work




                         43
WTC: AT&T Perspective

           9/11 8:48AM AA 11 Hits WTC North Tower
           9/11 8:53AM Stories Begin to Air on GNOC-
                         Monitored Broadcast Networks
           9/11 8:53AM Network Duty Officer, GNOC Aware
           9/11 8:53AM Automatic Network Controls, RTNR
                        Automatically Reroutes Traffic
           9/11 8:55AM Targeted GNOC monitoring of NYC
           9/11 8:58AM GNOC Detects Unusual Call Volume
           9/11 8:58AM Manual Network Controls Instituted –
                         Limits on NYC-Inbound Calls

                       9/11 9:00AM NDR NE and SE Region
                                    Pre-Activation
                       9/11 9:21AM Management Control
                                    Bridge Activated



                                                               45
AT&T Ultravailable® Network Services
    9/11 9:05AM     UA 175 Hits South Tower
    9/11 9:59AM     South Tower Collapses
    9/11 9:59:35AM South Tower Transport Node Crushed
    9/11 9:59:35AM Ultravailable Client Traffic Fails Over Successfully


     Diagram labels: GCSC; WTC South Tower; CO; two AT&T COs;
     Ultravailable Network and Wavelength Services
                                                                           47
Network Disaster Recovery


               9/11 10:20AM All NYC AT&T Offices
                             Ordered to Evacuate All
                             Non-Essential, Non-
                             Network Personnel
               9/11 10:45AM MCB Orders Satellite
                             Phones Readied Nationally
               9/11 11:00AM Mid-Atlantic Emergency
                             Operations Center
                             Activated
               9/11 11:30AM Southeast Emergency
                             Operations Center
                             Activated
               9/11 11:50AM NDR Equipment
                             Deployment Initiated


                                                         48
AT&T 9/11 Network Disaster Recovery

               9/12 4:00AM   NDR Team Assembles at Staging
                              Area
               9/12 12:00PM Emergency Communications
                              Vehicle To One Police Plaza
               9/12 7:00PM Recovery Location Selected
               9/12 10:00PM Location Secured, Trailers Depart
                              Staging Area
               9/12 10:30PM Positioning and Leveling Begins
               9/13 2:50AM Fiber Spliced to Recovery
                              Location
               9/13 8:45AM Grounding Complete
               9/13 12:00PM Power Cabling Complete - 500KW
               9/13 1:55PM Transport, Digital Cross Connect
                              Up
               9/21          ECV Moved From NYPD To
                              Support Relief Workers        49
 Questions to Consider
 • Can you identify who is in charge of Business Continuity? What are the
   strategic continuity objectives, and what organizational structure will
   guarantee their achievement?

 • Do you know which processes, services, and/or assets are most critical?

 • Do you know the threats, vulnerabilities, risks and financial impact to
   those processes, services, and assets?

 • Have you developed a sound business and financial analysis of
   alternatives?

 • How confident are you that your plan will meet objectives?

 • Have you balanced classic strategies such as DR against state-of-the-art
   high-availability architectures? What SLAs are required?


                                                                              51
Summary

 • Business Continuity is a critical imperative in today’s world

 • A successful corporate Business Continuity program needs:
   – A comprehensive, closed-loop governance, planning and execution process –
     across multiple lines of business and functional areas
   – A geographically-dispersed, hardened infrastructure, integrated and
     synchronized by the network, for physical threat protection
   – A robust information security strategy and architecture for logical threat
     protection

 • The Network is central to physical and logical protection
   – Network access and transport
   – Network security
   – Hardened facilities

                                                                                  52
        The needs of
  today’s customers for
   Business Continuity
require a network-centric
 approach to protecting
   critical infrastructure
        components.




                        53

				