

Design and Implementation of TWAREN Hybrid Network Management System
National Center for High-Performance Computing
Speakers: Ming-Chang Liang & Li-Chi Ku




             Outline
 Introduction
 Motivation
 Issues
 Design
 Implementation
 Future work
INTRODUCTION


             About TWAREN
 Construction of TWAREN (TaiWan Advanced Research &
  Education Network) was completed at the end of 2003, and
  the network entered service at the beginning of 2004.
 In its initial phase, IP routing was the main
  service provided.
 The network management programs were those that came
  with the purchased network equipment, including CIC,
  Webtop, CW2K, HP OpenView, HP NNM and other solutions.


            Initial phase of TWAREN
[Figure: topology of the initial phase of TWAREN. Labeled core locations: Taipei, Hsinchu, Taichung and Tainan, each with a GSR. Labeled sites: MOECC, NTU, NCCU, ASCC, NDHU, NCU, CCU, NHLTC, NCTU, NTHU, NCHU, NTTU and NYSU, attached through C6509 or C7609 devices. Also labeled: EBT10GE. Link legend: 10GE, STM-64/OC-192, STM-16/OC-48, GE.]
        Initial phase of NMS
[Figure: initial phase of the NMS. Labeled components: WebTop, Remedy Help Desk, Gateway, Notification, API, CLI, ISM (SMTP, HTTP, DNS, FTP), Cisco Info Center, Probe, CW2K (DFM), NNM, CTM and NAM. Labeled flows: Trap, PING, Polling. Managed devices: 12416, 7609, 3750, 2522, 2600, 15454, 15600.]
              Phase 2 of TWAREN
   At the end of 2006, TWAREN was upgraded with more protection
    methods and better availability; the result is called TWAREN
    phase 2.
   Tens of optical switches and hundreds of lightpaths then
    served as the foundation of the layer 2 VLAN
    services and the layer 3 IP routing services.
   In 2008, tens of VPLS switches were further incorporated
    to provide an additional multipoint VPLS VPN service.
   Layer 1 lightpaths can be protected by SNCP, layer 2
    VLANs by spanning tree recalculation, and layer 2 VPLS
    by fast reroute technology.
   All these improvements transformed TWAREN phase 2 into
    a true hybrid network capable of providing multiple
    layers of services and high availability.

             Architecture of TWAREN phase 2
[Figure: architecture of TWAREN phase 2. Labeled core locations: Taipei, Hsinchu, Taichung and Tainan. Labeled sites: NTU, ASCC, NCCU, NIU, NDHU, NCU, MOEcc, NCNU, NHLTC, NCTU, NCHC, NCHU, NTTU, NTHU, NSYSU, NCKU and CCU. Labeled devices: 6509, 7609, 7609C, 3750, 12816, 15454 and 15600. Link legend: STM64, STM16, 10GE, GE.]
MOTIVATION


      Why do we need a new NMS?
 The architecture of TWAREN phase 2
  became more and more complicated.
 Because TWAREN phase 2 has more protection
  methods, a single point of hardware or circuit
  failure will not interrupt the service
  provided to the end users.
 The initial-phase NMS was no longer
  adequate for the hybrid network
  because it is hard to determine and predict
  the correlation between failures and affected
  services.

    Requirements for the new NMS
 Automatically determine the correlation
  between failures, affected services, affected
  customers and severity level on this highly
  protected network.
 Provide a single integrated visual user interface.
 Use integrated databases, logs, message flows
  and exchange protocols.
 After several surveys, we decided to develop
  a new NMS suitable for monitoring
  all services provided by TWAREN phase 2.


ISSUES


    Uncertainty of SNMP implementation
 SNMP TRAP/MIB implementations differ
  even among equipment of the same
  brand.
 The SNMP OIDs or the return values may
  change after an OS upgrade on the same
  equipment, and such changes are usually hard
  to discover beforehand.
 Therefore, the system must be designed so
  that these changes can be
  accommodated with minimal
  modifications.
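
One way to keep such changes cheap is to hold the OIDs in data rather than in code, so that an OS upgrade only adds a table row. The sketch below is a minimal illustration under that assumption; `OID_PROFILES`, `resolve_oid`, the metric name and the OID strings are all hypothetical placeholders, not values taken from TWAREN equipment.

```python
from typing import Optional

# Hypothetical table: (device model, OS version prefix) -> metric name -> OID.
# The OID strings are placeholders for illustration only.
OID_PROFILES = {
    ("7609", ""): {"cpu_load": "1.3.6.1.4.1.0.1.1"},      # default for the model
    ("7609", "15."): {"cpu_load": "1.3.6.1.4.1.0.1.2"},   # changed after an upgrade
}


def resolve_oid(model: str, os_version: str, metric: str) -> Optional[str]:
    """Return the OID for a metric, preferring the longest matching OS prefix."""
    best_oid = None
    best_len = -1
    for (profile_model, prefix), metrics in OID_PROFILES.items():
        if profile_model != model or metric not in metrics:
            continue
        if os_version.startswith(prefix) and len(prefix) > best_len:
            best_oid = metrics[metric]
            best_len = len(prefix)
    return best_oid
```

With this layout, a collector asks for `resolve_oid(model, os_version, "cpu_load")` and never hard-codes an OID; unknown models return `None` so the caller can flag them for review.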
  The lack of skilled programmers
 Our programmers are the same people as
  the members of the operations team.
 We are not professional programmers and
  do not share a common programming language.
 The system must be partially available and
  operational during the early phase of its
  development so that it can evolve along
  with the real needs.
 So a unified communication standard
  between different modules is necessary.
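
A unified communication standard can be as small as one agreed message envelope that every module reads and writes, whatever language it is written in. The sketch below assumes JSON lines as that envelope; the slides do not name the actual format, and the field names, `make_message` and `parse_message` are hypothetical.

```python
import json
import time
from typing import Optional

# Hypothetical envelope fields that every module is assumed to agree on.
REQUIRED_FIELDS = ("source", "kind", "timestamp", "payload")


def make_message(source: str, kind: str, payload: dict,
                 timestamp: Optional[float] = None) -> str:
    """Serialize one inter-module message as a single JSON line."""
    message = {
        "source": source,
        "kind": kind,
        "timestamp": time.time() if timestamp is None else timestamp,
        "payload": payload,
    }
    return json.dumps(message, sort_keys=True)


def parse_message(line: str) -> dict:
    """Parse a JSON line and reject it if an envelope field is missing."""
    message = json.loads(line)
    missing = [name for name in REQUIRED_FIELDS if name not in message]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return message
```

Because the envelope is plain text, a module written in another language only needs a JSON library to take part.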

 Large historical data sets and computation

 To minimize the false positive and
  false negative rates, baseline thresholds
  are of much better quality when
  they are dynamically generated from
  historical data.
 Therefore, we need to store
  sufficiently large historical data sets
  and to retrieve the data very
  efficiently when
  calculating those thresholds.
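
As an illustration of a dynamically generated baseline, the sketch below derives an upper threshold from past samples as the mean plus k sample standard deviations. This rule is an assumption chosen for the example; the slides do not state which formula the Threshold Analyzer uses, and `baseline_threshold`, `is_anomalous` and the default `k` are hypothetical.

```python
from statistics import mean, stdev


def baseline_threshold(history, k=3.0):
    """Upper threshold: mean of past samples plus k sample standard deviations."""
    if len(history) < 2:
        raise ValueError("need at least two historical samples")
    return mean(history) + k * stdev(history)


def is_anomalous(value, history, k=3.0):
    """True when the new value exceeds the threshold derived from history."""
    return value > baseline_threshold(history, k)
```

For example, with room temperatures `[20.0, 21.0, 19.0, 20.0, 20.0]` a reading of 25.0 is flagged while 21.0 is not. The more history is available, the more stable the threshold becomes, which is why fast retrieval of large data sets matters.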
    Automatically determine affected
            services and customers
 TWAREN phase 2 inherently has the
  ability to guard against a single point
  of hardware or circuit failure, so a
  failure is less likely to affect the actual
  service provisioning.
 An intelligent management system
  that can determine which services
  a failure affects will reduce
  the management cost.
DESIGN


             1st Stage System Architecture
[Figure: 1st stage system architecture. Inputs labeled under Monitor Objs: Traps, MIBs, Syslogs, Net flows, Telnet/SSH, TL1 and Mirror (legend: Interactive, Passive). Labeled blocks: Data Collectors, Control API, GUI & Ticket System, Fault Detection, Fault Location, Auto Action, Threshold Analyzer and Report System. Labeled databases: Current Status DB, Threshold DB, Long Term DB and Case/Action DB.]
Relationship of Data Tables
  Basic Data Tables: Component, People, Location, Unit, Vendor, etc.
  Relationship Tables: Circuit, VLAN Services, VPLS Services, ONS Light Path, ONS Cross Connection, etc.



             Basic Data Tables
Component Data Table
Component_ID   Parent_C_ID   Name
1              0             TN7609P
12             1             Slot_1
2              0             TP15454
16             2             Slot_3
135            12            Port_9

Vendor Data Table
ID   Name
1    CHT
2    APBT
3    RingLine

People Data Table
ID   Name   Phone        Address   Service_Time   Service_WeekDay
1    John   0939123123   xxxxxxx   8-17           1,3,5
2    Mary   0958123123   xxxxxxx   ALL            ALL

Location Data Table
ID   Name    Address
1    MOEcc   xxxxx
2    NTU     xxxxx

Unit Data Table
ID   Name
1    NCKU
18   THU
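
The Component Data Table encodes a hierarchy through Parent_C_ID, so a port can be traced back to its slot and chassis. The sketch below walks that chain using the sample rows; it assumes that Parent_C_ID 0 marks a top-level device, which the slide suggests but does not state, and `component_path` is a hypothetical helper.

```python
# Sample rows from the Component Data Table: (Component_ID, Parent_C_ID, Name).
COMPONENTS = [
    (1, 0, "TN7609P"),
    (12, 1, "Slot_1"),
    (2, 0, "TP15454"),
    (16, 2, "Slot_3"),
    (135, 12, "Port_9"),
]


def component_path(component_id, rows=COMPONENTS):
    """Return the chassis/slot/port path for a component, top level first."""
    by_id = {cid: (parent, name) for cid, parent, name in rows}
    names = []
    seen = set()
    current = component_id
    while current != 0:  # assumption: parent 0 means "no parent"
        if current in seen:
            raise ValueError("cycle in component hierarchy")
        seen.add(current)
        parent, name = by_id[current]
        names.append(name)
        current = parent
    return "/".join(reversed(names))
```

Here `component_path(135)` yields `TN7609P/Slot_1/Port_9`, which is the kind of location string a fault report could show to an operator.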



                       Relationship Data Tables
Circuit Data Table
ID   Name                  Vendor   Identify   From_CID   To_CID   Bandwidth
1    Taipei_Tainan_STM64   1        8D543267   13         35       STM64
2    NCHU_NCNU_10GE        2        ST16987    23         67       10GE

ONS Topology Link Table
NodeA   NodeB   PortA   PortB
12      45      1467    2346
16      32      2312    3421

ONS Light Path Table
LP   PortFrom   PortTo   SNCP_LP   CRS_Trace         Size
2    2312       2345     0         359,556,522,475   4
98   3434       4455     99        482,541,335       16
99   3434       4455     98        482,469,541,335   16

ONS Cross Connection Table
CRS   PortA   PortB   SNCP_CRS   ChannelA   ChannelB   Size
482   1744    1756    0          5          13         4
21    3321    3343    24         17         33         16
24    3546    4534    21         1          17         16
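
These relationship tables are what make it possible to decide whether a failure actually affects a service. The sketch below uses the sample ONS Light Path rows under two assumptions that the slides do not spell out: SNCP_LP is the ID of the protecting partner light path (0 meaning none), and CRS_Trace lists the cross connections a light path traverses. `classify_light_paths` and its status labels are hypothetical.

```python
# Sample rows from the ONS Light Path Table, keyed by LP.
LIGHT_PATHS = {
    2: {"sncp_lp": 0, "crs_trace": [359, 556, 522, 475]},
    98: {"sncp_lp": 99, "crs_trace": [482, 541, 335]},
    99: {"sncp_lp": 98, "crs_trace": [482, 469, 541, 335]},
}


def classify_light_paths(failed_crs, light_paths=LIGHT_PATHS):
    """Label each light path 'ok', 'protected' or 'down' for a set of failed CRS IDs."""
    failed = set(failed_crs)
    # A light path is hit when any cross connection on its trace has failed.
    hit = {lp for lp, row in light_paths.items() if failed & set(row["crs_trace"])}
    status = {}
    for lp, row in light_paths.items():
        partner = row["sncp_lp"]
        if lp not in hit:
            status[lp] = "ok"
        elif partner in light_paths and partner not in hit:
            status[lp] = "protected"  # traffic can use the SNCP partner
        else:
            status[lp] = "down"       # no partner, or the partner is hit too
    return status
```

With the sample data, a failure of cross connection 469 only marks light path 99 as protected, while a failure of 482 takes down both 98 and 99 because they share it. Only the second case would need to be reported as service affecting.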

IMPLEMENTATION


          Current monitor objects
   Trap monitor
      Used interfaces, BGP, etc.
   Environment of equipment room
      Temperature (auto threshold), Voltage
   Status of equipment
      Temperature, CPU, RAM, fans, power supply
   BGP peering with other networks
      Status, number of exchanged routes (auto threshold), utilization analysis
   Performance monitor
      End-to-end RTT (auto threshold), end-to-end packet loss rate (auto
        threshold), end-to-end availability
   Throughput
      Backbone (auto threshold), designated interfaces
   Top N
      Bytes, flows, packets
   Routes monitor
      Customer routes (exact comparison)
   VPLS VPN
      Throughput of CE side, MACs of VPN
   Optical Network
      Current topology of lightpaths
   VLAN
      Current topology of VLAN
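
The exact comparison used by the routes monitor can be pictured as a set difference between the routes a customer is expected to announce and the routes currently observed. The sketch below is only an illustration of that idea; `compare_routes` is a hypothetical helper and the prefixes are documentation addresses, not TWAREN routes.

```python
def compare_routes(expected, observed):
    """Return the prefixes that are missing and the prefixes that are unexpected."""
    expected_set = set(expected)
    observed_set = set(observed)
    return {
        "missing": sorted(expected_set - observed_set),
        "unexpected": sorted(observed_set - expected_set),
    }


# Example with documentation prefixes (hypothetical data).
result = compare_routes(
    expected=["192.0.2.0/24", "198.51.100.0/24"],
    observed=["192.0.2.0/24", "203.0.113.0/24"],
)
```

Both lists being empty means the observed routes match the expected ones exactly; anything else can raise an alarm.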


            Future work
 Combine all developed monitor objects
  into a single integrated visual user
  interface.
 Enhance the monitoring of optical,
  VPLS and VLAN networks.
 Automatically determine the fault
  location, root cause and affected scope.
 Minimize the false positive and false
  negative rate.


								