Case Study How to Write a

Shared by: rossmanjerry
posted:
6/12/2009
Case Study: How to Write a Telecom Disaster Prevention and Recovery Plan
Linda Henning, Telecommunication Manager
GBMC - 2007

GBMC – Greater Baltimore Medical Center

GBMC includes:

Greater Baltimore Medical Center
The 292-bed Medical Center located on a beautiful 106 acre suburban campus, is Central Maryland’s leading community hospital. Employs 3000 people. It serves nearly 22,000 inpatients annually, and handles approximately 60,000 emergency room visits. Babies are us! We have delivered 22,200 babies in the last 5 years. GBMC performs 40,146 inpatient and outpatient surgeries per year.

Hospice of Baltimore
Provides comfort and care to patients with life-limiting illnesses. Hospice workers care for an average of 376 patients a day.

GBMC Foundation
Supports the GBMC mission by managing fundraising efforts.

GBMC Telecom Staff (with 140 years combined telecom experience!)
Consists of:
• Linda Henning – Manager, former ROLM trainer, 1986. Reports to VP-CIO of MIS.
• Bertha James – Telephone Operator Supervisor of 10 hospital operators covering 24/7 by 365.
• Sandy V. – Telecommunication Specialist, former ROLM trainer, 1984.
• Don Walker – PBX Engineer, former ROLM ATAC, 1982.
• Milt Webb – PBX Engineer, former ROLM engineer, 1980
• Mark Brenner – PBX Engineer, former ROLM system designer, 1983
• Kathy P. – 20+ years Bell System/Verizon billing expertise.

GBMC - Telecom at-a-glance

MAIN CAMPUS
• Siemens Model 80: System 1, 4976 Ports; System 2, 2899 Ports
• Cornet to the 14 year old, 100% reliable, wonderful Siemens Model 70 CBX: 4720 Ports

REMOTE SITES
• Model 10 CBX: 200 Ports
• Model 10 CBX: 150 Ports
• Model 30 HiCom: 200 Ports
• Model 30 HiCom: 100 Ports
• 1 HiCom 150
• 12 Key Systems

VOICE MAIL AND APPLICATIONS
• 6 Nodes of Phonemail – 50+ Channels
• 3 Xpressions Servers
• 950 Voice users converted from Phonemail
• 200 Unified Messaging
• 200 Call Processing menus still in Phonemail
• 2 Agile Servers
• SDC – Intellidesk, Intellispeech, Webservices
• Spectralink – 400 phones and counting

Why subject yourself to the pain of writing a Disaster Prevention & Recovery Plan?

The Auditors are Coming! The Auditors are Coming!
WHY?
• Since September 11th, auditors no longer accept our word that everything will be okay.
• Legislators have instituted the Sarbanes-Oxley Act as well as HIPAA.

Top 3 Things on the Auditors’ “Hit” List
1. Disaster Recovery Plan
2. Data Center security
3. Documented change control

This GBMC disaster plan was written as a result of NOT finding any comprehensive plan for Telecom on the internet, in a textbook or in “for sale” templates. This is a condensed version of a 79 page plan.

SO: Start writing one now or a consultant will be writing one for you!

Now to the content portion of this presentation. There are 3 parts to this plan:
I. Prevention Plan
II. Recovery Plan
III. Business Resumption

Part I
Prevention Plan

Write a Mission Statement

GBMC’s Mission Statement
It is the mission of the Telecommunications Department to provide quality telecommunication services to support the Company's business goals. This Plan has been developed under the direction of Linda Henning, Telecommunications Manager. With this Plan, the Telecommunications Department will:
1. Ensure that critical telecommunications systems and facilities are sufficiently backed up and protected so that critical Company telecommunications equipment will be recovered within 4 hours of an outage occurring, depending upon the severity level of the outage. During this four-hour window, telecommunication disaster recovery equipment may be deployed.
2. Provide telecommunications recovery in the most economical way possible, covering essential applications and operations based on their relationship to the business.
3. Restore normal telecommunications operations as soon as possible after a disaster.
4. Protect employees, equipment, facilities, and data involved.
5. Coordinate telecommunications recovery activities with applicable Company recovery plans and local, state and government disaster recovery plans.

What are the plan objectives?
The objectives of the Telecommunications Department disaster recovery plan are as follows:
1. To protect human life;
2. To minimize risk to the hospital;
3. To prepare to recover critical operations;
4. To safeguard the hospital against lawsuits;
5. To protect the hospital’s competitive position;
6. To preserve patient confidence and goodwill;
7. To define what is at stake;
8. To make a preliminary business impact analysis;
9. To form a synopsis of recovery strategy.

What form of disasters could occur in your area?
What is a disaster?
Natural Causes: Fire, Flood, Lightning, Earthquake, Hurricane, Tornado, Temperature
Human Error and Intentional Causes: Programming Errors, Sabotage, Improper Maintenance, Terrorism, Unauthorized Personnel, Vandalism, Lack of Training, Computer Viruses, Carelessness, Disgruntled Employees, Cable Cuts, Theft, Union Activities

What are your Critical Telecom assets?
Device/Asset | Vendor | Customer Number
Telecommunication Staff | |
Model 70 CBX | Siemens |
HiCom Model 80 – System 1 & System 2 | Siemens |
Phonemail – 3 systems | Siemens |
Spectralink | Siemens |
Intellidesk – Intellispeech – Webservices, 2 Servers in Data Center | SDC |
Xpressions – 3 Servers in Data Center | Siemens |
Agile – 2 Servers in Data Center | Siemens |
Zetron | Comm-Tronics |

What are the basic levels of a disaster?

Four Basic Severity Levels of a Disaster

Minor - A minor interruption to telecommunications operations, e.g., hardware, software, facilities or personnel, which has a negligible effect on the Company. For example, unplugged telephones, which produce switch errors.

Intermediate - An interruption that causes the telecommunications operations center to activate alternative communications strategies and closely monitor the situation. For example, Spectralink, Phonemail alarm, Intellidesk problems.

Major - An interruption to operations which will cause an extended (but recognized, usually through previous experience) delay in user services. For example, a motherboard in the switch, PRI card, Phonemail, Xpressions 1 or 2, Intellidesk.

Critical - An interruption which forces the communications center to shut down; people, hardware, software and facilities are impacted. For example, a power outage has occurred; a PBX is lost; however, backups can restore the system database. The impact depends on the cause and severity of the problem, such as loss of power, flood, or cable cut.

Decision Criteria
Specific decision criteria which management will use to decide on the disaster status of an individual event include the following:
• Determine if loss of human life or severe injury is possible.
• Determine what impact the loss of communications will represent to the affected department(s), division(s), business unit(s), etc.
• Determine to what extent backup systems and/or facilities are readily available to be used in the outage.
• Determine if spare components, backup software, etc. are readily available to facilitate system recovery in advance of vendor/carrier support.
• Obtain input from police, fire, building maintenance staff, and other knowledgeable sources as to the potential damage.

Review what you have in place now. Do you have?
• By-pass Phones
• Switch redundancy
• Halon or environmentally friendly equivalent
• No sprinklers in Switch room - so no flood
• Cell Phones and PDA Type Phones with charged batteries
• UPS back-up batteries
• DC power plant and gel batteries
• Access controlled Switchroom – no vandalism
• Back-up Generators
• 2 Way Radios
• DISA (Direct Inward System Access) turned off
• Custom Redirect from Verizon or other carrier to reroute calls
• 24/7 In-House Help Desk
You now have the foundation of your very own plan!

Part II
Recovery Plan

Scope of the Disaster Recovery Plan
GBMC’s Telecommunication Business Resumption Plan is designed to respond to a disaster at the main campus or other facilities under the oversight of the Telecommunications Manager. A disaster is defined as an incident that damages the facility, equipment or any mission-critical functions, such as the implementation of emergency codes. This plan provides recovery task checklists, forms and procedures required to effect a timely recovery.

Low Probability, High Impact Types of Disasters
In the event of a Telecom disaster, the following specific recovery events need to be initiated, depending on the type and severity of the disaster. There are several types of outages that are beyond GBMC and Telecom’s prevention planning and preparedness. These are:
• A major cut or damaged fiber optic or copper cable on GBMC’s campus or in the Towson area.
• Verizon Central Office failure.
• Water damage to the PBX Switchroom, resulting in a total room outage of the telephone switches.
• In the event of a major cable cut, GBMC will still have internal phone services.

Disaster Recovery Site
• Is there another location that can handle the calls? If so, arrange for design of Custom/Switch Redirect from your dial tone vendor.
• This would be your hot site. MIS usually has a location like SunGard in Philadelphia, PA.
• For Telecom it is not so easy, unless you have a fully functional duplicate PBX at the hot site.

GBMC’s Disaster Recovery site
GBMC has a live back-up Disaster Recovery site. This location is our Patient Accounting Office in Timonium, MD. This location was chosen for the following reasons:
• Similar telephone systems, which telecommunications staff have the technical training to modify in software to accommodate GBMC’s main campus telephone calls.
• Service is provided out of a different Verizon Central Office than the hospital.
• Short driving distance but off campus.
• Ease of deploying telecom staff to help answer phone calls.

Summary of Activities – The 4 R’s
React, Recover, Repair and Resume
GBMC’s Telecommunications Department can institute the following disaster recovery activities based on the severity of the disaster, in accordance with the GBMC Incident Command Protocol.

REACT
• Deploy full-scale disaster program
• Notify backup communications center, hot/cold site; Patient Accounting
• Alert vendors, carriers, suppliers; Verizon, Paetec, Siemens
• Establish communications command center
• Re-deploy telecommunications staff, e.g., attendants; to Disaster Recovery Site - Patient Accounting Timonium MD
• Activate alternate facilities, systems; Custom Redirect
• Distribute Cell Phones
• Determine how long company can operate in recovery mode

RECOVER
• Begin recovery to backup center, if needed
• Re-establish local dial tone
• Re-establish 800 service, other switched services
• Reroute critical analog/digital circuits
• Recover commercial power, backup power sources
• Recover PBX/Key/Voicemail/ACD systems
• Recover communication system software, databases, etc.

REPAIR AND TEST
• Replace damaged outside facilities
• Replace damaged communications systems, software
• Re-cable main distribution frame, intermediate frames, if required
• Test recovered systems for proper operation
• Test recovered network assets for proper operation

RESUME
• Re-establish and verify network integrity
• Re-establish and verify security
• Employee, operational logistics maintained
• Begin cleanup
• Continue to inform employees, management, customers, and media of recovery status
• Conduct recovery review; document analysis of recovery.

Order of Service Restoration
This took a very long time to prioritize.
• Communications/MIS Department
• Emergency Department
• Lab
• Pharmacy
• Respiratory
• Nursing Units
• Unit 59 - SICU
• Unit 25, 26, 27, L&D
• NICU & Newborn Nursery
• Unit 34, 35, 36, 37, 38
• Unit 43, 45, 46, 48
• Unit 54, 57, 58
• Plant Ops
• Security

Recovery Sequence
MAIN CAMPUS RECOVERY SEQUENCE
Hardware/Applications | Expected Recovery Window* | Level of Disaster | Individual(s) Responsible for Validation
1. HiCom 80 | 12-24 | Critical | Telecom PBX Engineer
2. CBX Model 70 | 6-12 | Critical | Telecom PBX Engineer
3. Zetron | 1-6 | Critical | Database Administrator
4. Intellidesk – SQL Server | 1-6 | Major | Database Administrator
5. Spectralink | 24+ | Major | Telecom PBX Engineer
6. Phonemail | 12-24 | Major | Telecom PBX Engineer
7. Xpressions Server 1 | 12-24 | Major | Xpressions Database Administrator
8. Xpressions Server 2 | 12-24 | Major | Xpressions Database Administrator
9. Agile server 1 | 24+ | Intermediate | Telecom PBX Engineer
10. Agile server 2 | 24+ | Intermediate | Telecom PBX Engineer
11. Intellispeech | 1-6 | Minor | Database Administrator
12. IntegraTRAK | 24+ | Minor | Call Accounting Database Admin
13. Web Services | 1-6 | Minor | Database Administrator
14. Xpressions Test App 2 | 24+ | Intermediate | Xpressions Database Administrator

Recovery & Restoration Time Frames
You may be held accountable for these, so be careful what you put in writing! Be very careful with what you can deliver because you will have to test the plan to prove it works. GBMC tested our plan on November 29, 2006.

The action(s) taken immediately following a disaster of any proportions fall into four timed phases. Listed below are the primary events that should occur in each phase. Many of the events in one phase will occur concurrently as a result of the efforts of various members of telecommunications disaster recovery teams.

1-6 Hours After Being Notified
1. Protect human lives;
2. Assess damages;
3. Notify vendors, carriers, users; Siemens, SDC, Verizon, Comm-Tronics, Paetec
4. Establish command center, if needed;
5. Notify senior management: Administrator-on-Call
6. Notify Help Desk

6-12 Hours After Being Notified
1. Notify users so they can assist in the recovery, if necessary; telecommunication operator needed at Patient Accounting to answer phones.
2. Establish hardware, software, and facility requirements;
3. Order necessary equipment and supplies;
4. Move off-site tapes, documentation to backup site;
5. Move emergency components to backup site;

12-24 Hours After Being Notified
1. Transportation system fully operational;
2. Establish operations at backup site
3. Activate and test operating system software; i.e. test call processing, incoming calls
4. Activate and test system databases;
5. Test/verify all new/replacement equipment;
6. Test/verify all transmission facilities; Verizon Custom Redirect
7. Restore disk files using backup tapes;
8. Modify call routing

Recovery Procedures
What would you do if you had a:
• Major Loss of External Connectivity
• Major Loss of Internodal or Interswitch Connectivity
• Evacuation of Switch Room
• Extended Loss of Electrical Power
• Physical Damage to Switch Room
• Physical Damage to PBX Hardware
• Major Loss of Campus Telephony
• Loss of Cable Infrastructure

This is what GBMC would do. In each procedure below, the responsible party is given in parentheses after the step.

Major Loss of External Connectivity
Response: Immediately upon detection
1. Determine what sites are impacted. (Telecom Manager)
2. Isolate what circuits are down. (PBX Engineer)
3. Reroute outgoing calls to back-up circuits & alternate pathways. (PBX Engineer)
4. Verify status of switch hardware – if source of failure, replace from stock or place service order. (PBX Engineer)
5. Verify integrity of cable infrastructure – if source of failure, initiate repair procedures & determine source liability. (PBX Engineer)
6. Notify Telco vendor of circuit failure and diagnostic findings. Open trouble ticket. (Telecom Staff)
7. Interact with vendor until circuit restored. (Telecom Staff)
8. Verify circuit functionality, after service restored. (Telecom Staff)

Major Loss of Internodal or Interswitch Connectivity
Response: Immediately upon detection
1. Determine which equipment is impacted. (PBX Engineer Staff)
2. If collocated nodes, isolate problem and replace failing component. (PBX Engineer Staff)
3. If distributed node, if source of failure, initiate repair procedures & determine source liability. (PBX Engineer Staff)
4. If HiCom link, verify integrity of cable infrastructure - if source of failure, initiate repair procedures. (PBX Engineer Staff)
5. Verify Cornet circuit and hardware – if source of failure, replace from stock or place service order. (PBX Engineer Staff)
6. Verify connectivity hardware – if source of failure, place service order & expedite. (PBX Engineer Staff)
7. Interact with vendor until circuit restored. (PBX Engineer Staff)
8. Verify functional connectivity. (PBX Engineer Staff)

Major Loss of Campus Telephony
Response: Immediately upon detection
1. Determine extent of service loss. (Telecom Manager & PBX Engineer)
2. Isolate node(s), or site as source failure – if physical damage call Verizon. (Telecom Staff)
3. If power or cooling, go to Plant Ops. (Telecom Staff)
4. If Common Control hardware, replace from stock or place service order. (PBX Engineer Staff)
5. If service order placed, interact with vendor until hardware restoration. If HiCom, expedite. (PBX Engineer Staff)
6. Verify system functionality. (PBX Engineer Staff)

Loss of Cable Infrastructure
Response: Immediately upon detection
1. Determine location of cable disruption. (Telecom Manager & PBX Engineer)
2. Issue stop work orders to perpetrator of damage, if on campus, secure area, maintain safety standards. (PBX Engineer)
3. Ascertain extent of damage, initiate repair procedures & determine source of liability. (Contact Data Center Tech)
4. If extended repair, reroute essential functions at impacted area to temporarily restore service. (PBX Engineer Staff)
5. Interact with vendor until service is restored. (PBX Engineer)
6. Verify service restoration. (PBX Engineer)

Evacuation of Switch Room
Response: Immediately upon notification or upon detection
1. Maintain safety of staff & comply with all safety officer instructions. (Telecom Manager)
2. Secure switch room door on exiting.
3. Establish remote login of PBX.
4. Maintain close contact with emergency responders to ascertain ongoing conditions & ability to reoccupy space.
5. Upon access, ascertain any damage to system and/or infrastructure & initiate proper response.
6. If extended repair, reroute essential functions at impacted area to temporarily restore service.
7. Interact with vendor until service is restored.
8. Verify service restoration.

Extended Loss of Electrical Power
Response: Immediately upon detection
1. Verify functioning of DC power plant back-up generator and UPS. (Telecom Manager & PBX Engineer)
2. Call GBMC Plant Ops to ascertain status of repair and restoration efforts. (PBX Engineer)
3. Monitor Switch Room functions and backup power/cooling supply. (PBX Engineer Staff)
4. Monitor transition back to normal power supply and verify normal functionality. (PBX Engineer Staff)
5. Verify service restoration. (PBX Engineer)

Physical Damage to Switch Room
Response: Immediately upon detection
1. Ascertain any damage to equipment or support infrastructure. (Telecom Manager & PBX Engineer)
2. Determine impact of damage on immediate operations. (PBX Engineer)
3. Coordinate repair efforts with vendor to minimize operational disruptions. (PBX Engineer and Plant Ops)

Physical Damage to PBX Hardware
Response: Immediately upon detection or access
1. Ascertain full extent of damage, loss of service and salvage potential. (Telecom Manager & PBX Engineer)
2. Restore critical and essential services on existing capacity; initiate emergency by-pass service if necessary. (PBX Engineer)
3. Relocate hardware and expedite delivery of needed replacement components to restore basic DID and essential outbound services. (Telecom Staff)
4. If loss is significant & extended, arrange for Custom Redirect. (Telecom Staff)
5. Arrange for DID intercept message with details and bypass numbers from Verizon. (Telecom Manager)
6. Initiate recovery plans with vendor; coordinate vendor implementation team; establish milestone and timeline. (Telecom Staff)
7. Interface with vendor during recovery activity. (Telecom Staff)
8. Integrate restored services into operation during off times to minimize disruptions. (Telecom Staff)

Part III
Business Resumption Plan

BUSINESS RESUMPTION
Restore Critical Business Functions
• Coordinate and restore the original site.
• Restore hardware systems.
• Restore software systems.
• Restore power/UPS.
• Replace fire detection and suppression systems.
• Address additional security concerns.
• Rewire the facility.
• Restore original LAN configuration.
• Restore original wide-area network configuration.
• Test new hardware and software.
• Train operations personnel on new equipment.
• Train employees on new equipment.
• Schedule migration back to original site.
• Coordinate return to original site.

Wrap up Activities
1. Review critical events log.
2. Evaluate vendor performance.
3. Recognize extraordinary achievements.
4. Prepare final review and activity report.
5. Aid in liability assessments.
6. Schedule compensatory time off.
7. Schedule the party!

Not a good note taker? This may just make your day!
The Data Disk includes:
• This PowerPoint presentation
• A copy of GBMC’s Disaster Prevention Recovery & Business Resumption Plan

Questions?

Resources
• The Definitive Guide to Business Resumption Planning - Leo A Wrobel
• Telecommunications Disaster Recovery Plan Template – Paul F Kirvan
• Twenty Years of My Life in Telecom – Linda Henning

One More Thing: Please fill out your session evaluation! Have a safe trip home!!
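
For readers who keep their plan in machine-readable form, the Main Campus Recovery Sequence table from Part II could be sketched as data along the following lines. This is only an illustration, not part of GBMC's plan: the class, field and function names are hypothetical, and the recovery windows are assumed to be hours, matching the "Hours After Being Notified" time frames.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    """The plan's four severity levels, ordered least to most severe."""
    MINOR = 1
    INTERMEDIATE = 2
    MAJOR = 3
    CRITICAL = 4


@dataclass(frozen=True)
class RecoveryItem:
    order: int        # position in the main campus recovery sequence
    system: str       # hardware or application name from the slide
    window: str       # expected recovery window as printed (assumed hours)
    level: Severity   # level of disaster listed for this system
    validator: str    # role responsible for validation


CRIT, MAJ = Severity.CRITICAL, Severity.MAJOR
INTER, MINOR = Severity.INTERMEDIATE, Severity.MINOR
PBX = "Telecom PBX Engineer"
DBA = "Database Administrator"
XDBA = "Xpressions Database Administrator"

# Rows copied from the Recovery Sequence slide.
RECOVERY_SEQUENCE = [
    RecoveryItem(1, "HiCom 80", "12-24", CRIT, PBX),
    RecoveryItem(2, "CBX Model 70", "6-12", CRIT, PBX),
    RecoveryItem(3, "Zetron", "1-6", CRIT, DBA),
    RecoveryItem(4, "Intellidesk - SQL Server", "1-6", MAJ, DBA),
    RecoveryItem(5, "Spectralink", "24+", MAJ, PBX),
    RecoveryItem(6, "Phonemail", "12-24", MAJ, PBX),
    RecoveryItem(7, "Xpressions Server 1", "12-24", MAJ, XDBA),
    RecoveryItem(8, "Xpressions Server 2", "12-24", MAJ, XDBA),
    RecoveryItem(9, "Agile server 1", "24+", INTER, PBX),
    RecoveryItem(10, "Agile server 2", "24+", INTER, PBX),
    RecoveryItem(11, "Intellispeech", "1-6", MINOR, DBA),
    RecoveryItem(12, "IntegraTRAK", "24+", MINOR, "Call Accounting Database Admin"),
    RecoveryItem(13, "Web Services", "1-6", MINOR, DBA),
    RecoveryItem(14, "Xpressions Test App 2", "24+", INTER, XDBA),
]


def systems_at_or_above(level, sequence=RECOVERY_SEQUENCE):
    """Return system names whose level is at least `level`, in recovery order."""
    ordered = sorted(sequence, key=lambda item: item.order)
    return [item.system for item in ordered if item.level >= level]


def checklist_for(validator, sequence=RECOVERY_SEQUENCE):
    """Return (order, system, window) tuples that one role must validate."""
    return [(i.order, i.system, i.window) for i in sequence if i.validator == validator]
```

Calling `systems_at_or_above(Severity.CRITICAL)` lists the three systems the slide marks Critical, and `checklist_for("Telecom PBX Engineer")` gives that role's validation list. Keeping the table as data makes it easier to print per-role checklists when the plan is tested.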
