Network Department of Systems and Computer Engineering

Henry Starzynski
Network Operations Support
Global Network Mgmt Centre
Bell Canada
January 2011
Henry Starzynski – Manager, Global Network Management Centre
• Graduated from the University of Waterloo in 1982 with Bachelor of Mathematics
  (Computer Science)
• Post graduation, worked for a computer time sharing company called Datacrown,
 which became CSG, then SHL-Systemhouse
• I’ve been with Bell almost 27 years!
• Started out working on network design tools for Datapac and Megastream services
• Moved to our network management centre, taking care of Datapac and managing the 7/24 console,
  then Frame Relay (Hyperstream) support
• Today, I continue with legacy network support, bring in new business for our centre, and support our computers
• I have a life outside of Bell too! I’m involved in the local community with Scouts Canada and Parent
  Council at my son’s high school – so, when you are free of University life, don’t forget to
  be involved in your community as well! You have lots of energy and knowledge that can
  help make local communities, wherever you end up, much better!
• Don’t forget, when you leave Carleton, learning never ever stops! Keep your brains active; technology
  is continually changing
        Bell Canada’s GNMC
•   GNMC = Global Network Management Centre
•   One of the world’s first Data Network Management Centres
•   Operating locally in Ottawa, serving Bell Canada customers
    globally
                         Bell Canada GNMC
                                   A bit about who we are …
•   Involved in managing data networks in Canada since 1974, globally since 1992
•   Originally the National Data Network Control (NDNC) for domestic (Canada only)
    core data networks: Dataroute, Datapac (packet switching), Megastream (Pt-Pt T1),
    Hyperstream (frame relay), Canadian ATM Gateway networks
•   Expanded to include private networks (Lotto Quebec) and VPN clouds
•   Started internationally with the Financial Networks Associates (FNA – consortium of 8
    countries) network in 1991 (Alcatel-based network)
•   Evolved into Global Network Management (GNMC) at the individual customer circuit
    level
•   Today, we serve as International Help Desk/SPOC (single point of contact) for
    international data circuit troubles
     –   Provide proactive fault management, provisioning, change and performance management
                   Bell Canada GNMC
Main Focus Areas:
 • Core Network Management (WAN) of legacy data networks (Datapac = packet switching,
   Frame Relay, Mega services = T1 point-to-point services)
 • Single Point of Contact (SPOC) for international customer data circuits
 • VPN Managed Services (MPLS) and support of private or virtual private network clouds
   and routers (LAN)
 • Technical Support and Maintenance Engineering on existing legacy networks

GNMC is involved in all 4 major processes of Network Management:
          Fault Management
          Configuration Management (Provisioning)
          Performance Management
          Change Management
                          Network Management
• Like any industry, we toss around lots of BUZZ WORDS

• What do all those terms mean??
         • WANs
         • Clouds
         • OSSs
         • Network Management
         • SPOC
• Why do we do network management & customer management?

• Why is it important?

• What the heck is a network – anyway?

                                         WELL
                         let’s start … WHAT IS A NETWORK?
                     What is a Network ??
A Network means something different to everyone

                    For example, a ‘network’ can be ..

• Local Area Network - those within a building, office, floor, etc.
• Point to Point network - connecting two sites regardless of distance
• The ‘CLOUD’ - the service provider’s network – the infrastructure, sometimes
  termed the Public Network
• The ’Net - the ubiquitous network
• The PSTN – Public Switched Telephone Network
• Wireless network
• Home Network
• A VPN – a Virtual Private Network
• A ‘social’ network!

 A NETWORK MEANS DIFFERENT THINGS TO DIFFERENT PEOPLE
        BUT whatever your definition, all networks do the same thing!
                       What is a Network ?
• A standard definition of a ‘network’ we will use is the following:
• A set of elements linked together to provide paths to transmit
          information, (data, voice, video) from one location to another.
• A critical tool which allows businesses to operate and people to communicate

• When it is all boiled down, all information is ‘data’, and it travels over a network.
• Successful networks are managed
                    Examples of Data Networks
• Transport Networks (Sonet, DS3, DS1, Fibre, MPLS core) – the BIG
infrastructures
• Circuit Switched (Public Switched Telephone Network)
• Dedicated (Point to point)
• Packet/Frame/Cell (legacy services)
• IP (Internet/ Intranet)
• Local Area Networks, in the home, office, or around the campus.
• Private (TV, Radio, Financial, Lottery) or Virtual Private Networks
(VPNs)
• Wireless
                    Network Characteristics
• Common characteristic of all networks is
    • the transmission of DATA (information, etc.)

• Some type of information (i.e. - data) is being transmitted from one
person/computer/location to another, for business, pleasure, research,
etc.

• In today’s world, we take data communications over networks for
granted - it is there, reliable, fault tolerant, and it NEVER fails.

• We use it every day, it is part of our daily routines, part of our ‘life’!

                             We expect connectivity!
       What then - is Network Management
                      and why is it important ?
• All types of networks transmit data in some form

• Network management has 5 main processes:
                Fault Management
                Configuration Management (Provisioning)
                Accounting Management
                Performance Management (including Change Management)
                Security Management


Bruce Deachman The Ottawa Citizen
Sunday, March 20, 2005

In 1994, Nicholas Negroponte, founder of MIT's Media Lab, predicted one billion people would be using the
Internet by the year 2000. What he failed to point out, was that most of them would be trying to get U2 tickets.
At least that's how it must have felt for countless fans who were unable to snag tickets to the Bono-led,
 Irish rock band's Nov. 25 Corel Centre show yesterday morning, as technology failed to keep pace with
 overwhelming demand, leaving old-fashioned overnight campers the happiest of all
Question!

What is the latest estimate of the number of internet users in the world?
Anyone remember this??

ROOT CAUSES OF BLACKOUTS AND THEIR REMEDY
The electric power transmission system of the United States is seriously deficient.
Experts generally agree that fixing this system to an adequate level would take
many years and cost tens of billions of dollars. But the root causes of the recent
“Blackout of 2003” can be solved in a relatively short time and at a much more
reasonable cost.

The root causes of the present problems are:
     • A totally outdated reliability philosophy; and
     • Inadequate real time monitoring of the transmission grid.

Isn’t the power grid a network too? Of course! Electricity is just a form of ‘data’!
                   Why ‘Network Management’?

  From a network provider’s viewpoint …
• Manage network resources equitably to ensure users can establish communications
quickly & reliably

• Ensure information is transferred securely, with its original quality and integrity

• Operate a high performance, reliable, cost effective network that meets customer/
  business/organizational needs and requirements

• Plan and implement measures to prevent or mitigate interruptions or
  degradation of service

• Make $$$$$ for the network provider and its shareholders

• Gain market share for the network provider

• At Bell Canada, networks are the building blocks of our own business – they are why we
exist!
                   Why ‘Network Management’?

  From the customer’s viewpoint …
• Ensure information is transferred securely, with its original quality and integrity

• Obtain service at best cost/service/value combination

• Ensure the customer’s business operates with minimum downtime, in order to meet
 the requirements of its customers

• Meet regulatory, legal, safety requirements

• For a customer, networks are critical
    • For businesses, for their operations.
    • For the general public, so we can communicate, get money, etc
Network Management Poses Endless Challenges
by Willie Schatz
If network managers are in accord about anything, it’s that they have a lot
more tasks to do than resources to handle them.
The fundamental roles of a network administrator are to provide network
connections for computer equipment and to ensure availability and
performance of network communications.
But that’s only the beginning. The administrator must set up and manage
hardware and software solutions, enabling servers, clients, printers and other
peripherals to communicate. He or she also is responsible for providing users
the highest quality server functionality, which means uninterrupted, optimum
network availability and performance.
This same individual also must plan so any changes required in the network
conform with changes in the larger enterprise system.
“People really think network management is easier than it really is”.
    Network Management Processes
There are five processes involved in network management

Configuration Management == Provisioning
• Programming network elements to communicate with each other and with user equipment
• Entering user datafill to make their service functional
• Copying critical (non-default) network provisioning parameters to storage in
  offline databases
• Ensuring billable parameters/features are updated in related billing systems
• Providing ‘dumps’, downloads, or application program interfaces (APIs) to other
  downstream systems
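
The idea of copying only the non-default provisioning parameters to offline storage can be sketched as follows. This is a minimal illustration, assuming a network element's provisioning data can be read as a simple key/value mapping; the parameter names and default values are invented for the example and do not come from any real switch.

```python
import json

# Hypothetical factory defaults for one port on a network element.
DEFAULTS = {
    "speed_kbps": 64,
    "window_size": 2,
    "packet_size": 128,
    "reverse_charge": False,
}


def non_default_parameters(provisioned, defaults=DEFAULTS):
    """Return only the parameters whose values differ from the defaults."""
    return {
        name: value
        for name, value in provisioned.items()
        if name not in defaults or defaults[name] != value
    }


# Hypothetical customer port: two parameters changed from their defaults.
port = {
    "speed_kbps": 256,
    "window_size": 2,
    "packet_size": 128,
    "reverse_charge": True,
}

backup_record = non_default_parameters(port)
# Serialise the record so it could be written to an offline database or file.
backup_json = json.dumps(backup_record, sort_keys=True)
print(backup_json)
```

Storing only the differences keeps the backup small and makes it easy to see what was customized for a given customer.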

Why is Configuration/Provisioning management important?
• Users want their service when it is ordered (on due date)
• Users want to get the options they pay for
• The network provider needs to ensure their service is billed
         Network Management Processes
Fault Management==Service Assurance
• Surveillance - proactive - alarms/traps from the network that indicate major problems
• Isolating problems - reactive - when users have troubles
• Having clearly defined escalation procedures - how to prioritize troubles
• Providing customers with timely and honest status on problems - when will it be fixed?
• Performing analysis on failures for trends, root cause
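
One way to picture "how to prioritize troubles" is a priority queue of open trouble tickets. The sketch below is illustrative only: the severity levels, ticket fields, and ticket descriptions are assumptions, not the GNMC's actual escalation procedure.

```python
import heapq
from dataclasses import dataclass, field

# Hypothetical severity levels: a lower number is handled first.
SEVERITY = {"network_down": 0, "customer_down": 1, "degraded": 2, "inquiry": 3}


@dataclass(order=True)
class Ticket:
    priority: int
    opened_at: int                          # minutes since start of shift
    description: str = field(compare=False)


class TroubleQueue:
    """Serve the most severe ticket first; oldest first within a severity."""

    def __init__(self):
        self._heap = []

    def open_ticket(self, kind, opened_at, description):
        heapq.heappush(self._heap, Ticket(SEVERITY[kind], opened_at, description))

    def next_ticket(self):
        return heapq.heappop(self._heap)


queue = TroubleQueue()
queue.open_ticket("degraded", 5, "Slow response on circuit A")
queue.open_ticket("network_down", 20, "Backbone trunk alarm")
queue.open_ticket("customer_down", 10, "Customer site B unreachable")
queue.open_ticket("customer_down", 8, "Customer site C unreachable")

order = [queue.next_ticket().description for _ in range(4)]
print(order)
```

The backbone alarm is served first even though it was opened last, and the two customer-down tickets are served in the order they were opened.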


Service Assurance is REAL-TIME surveillance, control, and analysis of a
network, with the objective of ensuring maximum use of network resources, particularly
when it is under stress due to traffic overload or failure conditions.
      Network Management Processes

Performance Management
• Performance measures can be internal (for the provider), regulated (CRTC), or
 to assist the customer (how is my network performing)
• Network performance measures (mean time to repair, network availability) are standard
 metrics used in the industry, and are often the basis for ‘service level agreements’
• Customers may require information on their traffic patterns - are they
 paying for bandwidth they don’t require, or is their network overloaded?
• Many customers want guarantees of performance – a Service Level Agreement (SLA)
 in order to ensure they are getting the performance they pay for.
• An SLA may include the following
      • Network Availability
      • Frame/Cell/Packet delivery
      • Mean time to Repair
      • Penalty clauses for non-performance
      • Delay metrics
       Network Management Processes

Change Management
• Scheduling downtime / maintenance activities (new software, network upgrades)
 with users (notification, release or emergency)
• Ensuring software levels are compatible with all network components
• Keeping the customer informed of planned service interruptions is critical

Networks are in need of periodic maintenance for software or hardware upgrades,
 etc. In a 7x24 world, unscheduled downtime can mean
• loss of revenue
• legal liability
• threats to public safety.
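
Scheduling maintenance with users usually implies two checks: was enough notice given, and do planned windows collide? A minimal sketch, assuming a 5-day notice rule and made-up maintenance windows (both are illustrative assumptions, not Bell Canada policy):

```python
from datetime import datetime, timedelta, timezone

MIN_NOTICE = timedelta(days=5)  # assumed notification lead time


def utc(year, month, day, hour=0):
    """Build a timezone-aware UTC timestamp."""
    return datetime(year, month, day, hour, tzinfo=timezone.utc)


def has_enough_notice(notified_at, start):
    """True if the customer was told at least MIN_NOTICE before the start."""
    return start - notified_at >= MIN_NOTICE


def windows_overlap(start_a, end_a, start_b, end_b):
    """True if two maintenance windows share any time."""
    return start_a < end_b and start_b < end_a


# Hypothetical software upgrade and hardware swap on the same network.
upgrade = (utc(2011, 1, 15, 2), utc(2011, 1, 15, 6))
hw_swap = (utc(2011, 1, 15, 5), utc(2011, 1, 15, 8))

notice_ok = has_enough_notice(utc(2011, 1, 7), upgrade[0])
conflict = windows_overlap(*upgrade, *hw_swap)
print(notice_ok, conflict)
```

Here the notice period is satisfied, but the two activities overlap between 05:00 and 06:00 UTC, so one of them would need to be rescheduled.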
FROM: CHANGE MANAGEMENT PLANNED OUTAGE
            Foreign-Tel COMMUNICATIONS Dept.: GNMC
   Phone: 1-555-868-7883
   Fax: 1-555-868-7822
   Please respond to the following Email: tcsccip@foreigntelcommunications.com


ForeignTel Communications would like to inform you that the Change Management activity will be
performed as indicated below:

_____________________________________________________________________
Outage #:        POM041793     / POT356369

Your ref. #:

Description:      DISREGARD OUTAGE NOTICE//THIS IS NOT SERVICE
               AFFECTING//WE ARE ADDING BACKBONE CAPACITY:
               PORTLAND-SANTA CLARA DURING THIS PERIOD,
               NETWORK WILL BE IN HAZARDOUS CONDITION. WALL
               NOC WILL CLOSELY MONITOR THE NETWORK AND ANY
               ALARMS ON IT

Scheduled Planned Start Date (UTC): February 16, 2009 15:00:00
Scheduled Planned End Date (UTC): February 24, 2009 03:00:00
       Related Network Management Activities
• Co-ordination with other Carriers and Agencies.
  No one carrier can route traffic everywhere on the planet. Strategic alliances and
co-operation amongst carriers are essential.

• Dynamic Controls.
  Can traffic be rerouted around failures or congestion? Is this automatic or manual?

• Disaster recovery planning.
Could it happen to you? What would you do in the event of a ‘disaster’?

• Security
 Who has access to the network infrastructure? Can it be ‘hacked’? Ensuring one
customer’s data does not go to another customer.
Security Management
• The goal of security management is to control access to
  network resources according to local guidelines so that the
  network cannot be sabotaged (intentionally or
  unintentionally) and sensitive information cannot be
  accessed by those without appropriate authorization.
• Security management subsystems work by partitioning
  network resources into authorized and unauthorized areas.
   – They identify sensitive network resources (including systems, files,
     and other entities) and determine mappings between sensitive
     network resources and user sets.
   – They also monitor access points to sensitive network resources and
     log inappropriate access to sensitive network resources.
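
The mapping between sensitive resources and user sets, together with logging of inappropriate access, can be sketched as a small access table. The resource names, user names, and log format below are hypothetical.

```python
# Hypothetical mapping: sensitive resource -> set of authorized users.
AUTHORIZED = {
    "core-router-config": {"alice", "bob"},
    "billing-database": {"carol"},
}

access_log = []  # records of denied attempts


def request_access(user, resource):
    """Allow access only to authorized users; log every denied attempt."""
    allowed = user in AUTHORIZED.get(resource, set())
    if not allowed:
        access_log.append(f"DENIED user={user} resource={resource}")
    return allowed


first = request_access("alice", "core-router-config")
second = request_access("alice", "billing-database")
print(first, second)
print(access_log)
```

A resource that is missing from the table is denied by default, which is the safer failure mode for this kind of check.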
AT&T Customer Info Hacked

By TSC Staff
8/29/2006 9:05 PM EDT

AT&T late Tuesday said that hackers broke into a computer system and accessed
personal data, including credit card information, from thousands of customers who
had purchased DSL equipment from the company's Web store.


Kaspersky says Web hack 'should not have happened'
02/09/2009
It's the worst thing that can happen to a computer security vendor: This weekend,
Moscow's Kaspersky Lab was hacked.
A hacker, who identified himself only as Unu, said that he was able to break into a section
of the company's brand-new U.S. support Web site by taking advantage of a flaw in the
 site's programming.
               Network Management Centre
                       Functions
• 7 x 24 operation - it’s more than a buzzword.
• Operations Support Systems for provisioning, change management, surveillance, trouble
  tracking, customer records
• Subject experts/access to engineering support personnel or labs
• Multiple & diverse communications channels
• Situation (War) room
• Secure and Independent Power Supply
• Access to Information Databases
• Contact information for support resources (level 1, 2, 3 support, vendor support)
• Secure location
• Fully redundant backup location
                      When Disaster strikes!
• If something can go wrong, it will ...
      • Ice Storm of 1998/Hurricane Katrina & other natural disasters
      • Toronto Simcoe Central Office fire July 1999
      • Power plant failures
      • Hackers and viruses (SQL Worm)
      • September 11/terrorist attacks


• All of these test the plans of a network provider.
      • Are contingency plans in place? Have they been tested, or have they gathered dust for 5 years?
      • Is there an escalation chain of command?
      • Are there agreements with other suppliers/vendors/competitors?
      • What contingencies are in place to get critical services restored as quickly as possible?

• When service is lost, the prime objective, after immediate human safety, is the
  restoration of service
From July 1999 …
TORONTO - Phones stopped ringing in several major cities in Canada on Friday
after an explosion caused a major system failure at a Bell Canada building in Toronto.
 The failure knocked out phone lines, most cell phones, internet services and bank machines
in downtown Toronto. Cantel and digital cell phones appear to be working. Police
report 911 emergency systems are working, but the police are urging people to use these
systems only for real emergencies. The failure was caused by an explosion on the fourth
floor at the downtown Bell centre at around 8:00 am. One person was reportedly
injured. Immediately after the explosion, battery powered backup systems kicked in.
But they ran out of power a few hours later. The Toronto Stock Exchange is back up and
running after it suspended trading briefly but brokerages are having trouble
communicating. Phone systems in Ottawa and Montreal and as far away as Halifax and
Vancouver have also been affected as calls that normally routed through Toronto are
rerouted through other cities. Bell Canada says it hopes to have services restored
by midafternoon.


ATLANTA (CNN) -- A series of cyber-attacks Tuesday left some of the Web's most
high-profile sites staggering under the weight of tens of thousands of bogus
messages.

The targets included retail giant Amazon.com, electronic auction house eBay, discount retailer
Buy.com and CNN Interactive.

DISASTERS CAN HAPPEN! How will your network provider handle the trouble?
• Another aspect of Network Management is Planning
• A carrier will have a plan for a disaster situation,
  as well as anticipating potential issues
• Examples of planning for potential issues include
   • Y2K
   • more recently, the change in dates for Daylight
     Saving Time
   • Other various clock rollover issues
• A carrier may also do periodic disaster simulations
  to test the response of various groups as well
  as procedures
                     SPOC Function
What is a SPOC?


• In Bell Canada, the GNMC is the Single Point of Contact (SPOC) for all
  Fault Management and Change Management between Canadian Help
  Desks and Test Centres and all the global carriers that Bell uses to
  provide international reach for our customer circuits
• SPOC for all other carriers to get their issues fixed within Canada
• One door for all trouble management into or out of Canada
• Avoids having many different groups learn the processes for dealing
  with each of the carriers, or the carriers having to learn about all the
  various ops centres within Canada
• Provides flexibility to move quickly and customize for customer
  reasons, with centralized expertise
• As a SPOC, we get to compare service levels provided by different
  global carriers and use this info to get better performance
         Operational Support Systems
• Successful network management uses standardized protocols or vendor-specific
   mechanisms to transmit alarms and commands
  (e.g. Simple Network Management Protocol)

• Operational control data can be transmitted over conventional data networks,
  over the same network (inband), or over another network (out of band).

• The systems which receive alarms and allow for network configuration, troubleshooting,
  and control are commonly called Operational Support Systems (OSSs).

• OSSs may cost more than 10 times as much as the network infrastructure!

• OSSs may consist of Workstations, Databases, network elements, scripts, provisioning
  systems, security systems, offline databases and billing systems.

• Without a good OSS structure, a great network infrastructure will fail. The network
  objectives cannot be met without this.
         Operational Support Systems
• No one OSS does it all - in fact, many OSSs are required, and these must interact
 with each other. This is typically via Application Program Interfaces (APIs) or
 some standard format for information exchange.

• The interaction can be simple - or complex. Often, simple format changes in one
  OSS will impact many other ‘downstream’ OSSs.

• Remember where the money is spent - Not on the network infrastructure, but on
  the systems that make the network run.

• The following diagram shows a SAMPLE interaction between various systems.
                    Sample Operational Support Systems
[Diagram: sample interaction between operational support systems.
 Systems shown: order system; order entry/assignment system; network provisioning
 system (customer gets service); network elements; fault management OSS; surveillance
 centres (alert display); trouble ticket system; customer stats data collection system;
 call detail/usage OSS; billing OSS; billing system (customer receives bill for
 service/usage); telco local assignment system; change management; Test Centres, NDNC
 and fault management/troubleshooting OSSs.
 Labelled flows: customer service orders; order info; customer and assignment dumps
 (feed other OSSs); provisioning records; SNMP traps; alerts; billing records; billing files.]
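
To see why a simple format change in one OSS can impact many downstream OSSs, the systems can be treated as a directed graph of data feeds. The feed list below is a hypothetical simplification, loosely based on the systems named in the sample diagram; the exact feeds and the short system names are assumptions, and a real carrier's map would be far larger.

```python
from collections import deque

# Hypothetical feeds: each OSS -> the OSSs that consume its output.
FEEDS = {
    "order_entry": ["provisioning", "billing", "trouble_ticket"],
    "provisioning": ["fault_mgmt", "stats_collection"],
    "fault_mgmt": ["surveillance"],
    "billing": [],
    "trouble_ticket": [],
    "stats_collection": [],
    "surveillance": [],
}


def downstream_of(changed, feeds=FEEDS):
    """Return every OSS that directly or indirectly consumes data from `changed`."""
    impacted = set()
    pending = deque(feeds.get(changed, []))
    while pending:
        system = pending.popleft()
        if system not in impacted:
            impacted.add(system)
            pending.extend(feeds.get(system, []))
    return impacted


impacted_by_provisioning = sorted(downstream_of("provisioning"))
print(impacted_by_provisioning)
```

A change to the order entry format in this toy model reaches all six other systems, which is the kind of ripple effect that makes OSS changes expensive to plan and test.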
                                   Metrics
• Each network needs some means of measuring its success, and to see where
  improvement can be made. Public networks may be regulated. Metrics may be stipulated
  in Service level agreements (SLAs) between provider and customer

• To the end user/customer, the most critical metrics are the following:
      • Mean time to repair (MTTR)
      • Network Availability = (total available time - total downtime) / (total available time)
      • Quality of Service (QOS)
      • round trip delay
      • Network congestion/blocking
      • frame/packet/cell loss
      • repeat failures
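
As a worked illustration of the first two metrics, the sketch below computes availability with the formula above and MTTR as the average outage duration. The 30-day period and the outage durations are made-up numbers.

```python
def availability(total_minutes, downtime_minutes):
    """(Total available time - total downtime) / total available time."""
    return (total_minutes - downtime_minutes) / total_minutes


def mean_time_to_repair(outage_minutes):
    """Average time to restore service across all outages in the period."""
    return sum(outage_minutes) / len(outage_minutes)


# Hypothetical 30-day month with three outages (durations in minutes).
period = 30 * 24 * 60          # 43,200 minutes
outages = [12, 45, 30]

avail = availability(period, sum(outages))
mttr = mean_time_to_repair(outages)

print(f"Availability: {avail:.4%}")
print(f"MTTR: {mttr:.0f} minutes")
```

With 87 minutes of downtime in the month, availability comes out just under 99.8% and MTTR is 29 minutes; an SLA would compare figures like these against agreed targets.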

• To the network provider, the following are important metrics:
      • Network Availability
      • EBITDA (Earnings Before Interest Taxes Depreciation & Amortization)
      • Cost / Revenue (return on investment)
      • Market Share
      • Network capacity
                                  Metrics
• To the shareholder, the following are important:
     • Dividend
     • share price
     • Return on Investment
                                  Summary
• Networks can be simple, or extremely complex and mission critical

• Network quality, reliability, diversity, and low cost are essential

• The operation of a high-quality, reliable, cost-effective network requires
  effective Network Management Centre(s), along with skilled people and good support
  tools (operational support systems)

• As networks continue to evolve, customers will manage more and more of their own
  networks.

• Challenges for the future include global coverage, scaling for growth,
  new technologies, telco mergers, acquisitions, failures - an industry always in flux.

				