Fault Management


             IACT 918 July 2004
                     Gene Awyzio
  SITACS University of Wollongong
    Overview
    • Fault Management is the process of
      locating and correcting network
      problems or faults
    • Comprehensive fault management is
      probably the most important task in
      Network Management




    Benefits of Fault Management
    Process
    • Increased network reliability
      – Provides tools that allow an engineer to quickly
         • Detect problems
         • Initiate recovery procedures
    • Need to maintain the illusion of complete
      and continuous connectivity
    • Also provides tools to extract information
      about the network's current state

    Accomplishing Fault
    Management
    • Can be considered as a three (3) step
      process
      – Identify the fault
      – Isolate the cause of the fault
      – Correct the fault if possible




    Identifying the fault
    • Gathering Information to identify a
      problem
      – To learn that a problem exists we need to
        gather data about the current state of the
        network
    • Two approaches
      – Log critical network events
      – Poll network devices

    Identifying the fault
    • Critical network events
      – Examples
         • Failure of a link
         • Lack of response from host
      – Transmitted by network device when fault
        conditions occur
      – Reactive method
      – If device fails it cannot send an event
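A failed device cannot announce its own failure, so a manager that relies on events also has to notice silence. The sketch below is one way to do that, assuming the manager records the time it last heard from each device. The function name, device names and timeout are hypothetical and chosen only for illustration.

```python
def silent_devices(last_seen, now, timeout_seconds):
    """Return the devices not heard from within timeout_seconds.

    last_seen maps a device name to the time (in seconds) of its last
    event or poll response. All names here are illustrative assumptions.
    """
    return sorted(
        device
        for device, seen_at in last_seen.items()
        if now - seen_at > timeout_seconds
    )


# Hypothetical state: router-a was heard 10 s ago, switch-b 70 s ago.
last_seen = {"router-a": 100.0, "switch-b": 40.0}
print(silent_devices(last_seen, now=110.0, timeout_seconds=60))
# prints ['switch-b']
```

A device flagged this way is only a suspect. It still needs to be polled directly to confirm the fault.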


    Identifying the fault
    • Occasional Polling
       – Can help find faults in a timely manner
       – Tradeoff
           • Degree of timeliness vs bandwidth consumption
       – Other factors
           • Number of devices to poll
           • Bandwidth of links
       – Example
           • Assume each query and response is 100 bytes long (including
             data and header information)
           • For a network of 30 devices
               – (100 + 100) * 30 = 6,000 bytes/polling interval = 48,000 bits/polling
                 interval
           • Polling every minute
               – 48,000 bits / 60 secs = 800 bits/second
               – 48,000 bits/polling interval * 60 polls/hour = 2,880,000 bits = 2.88
                 Megabits/hour
           • Polling every 10 minutes
               – 48,000 bits/polling interval * 6 polls/hour = 288,000 bits = 0.288
                 Megabits/hour
               – May not know about event for 10 minutes
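The polling arithmetic can be checked with a few lines of code. This is a minimal sketch that assumes every device is polled once per interval and that each query and response has a fixed size. The function and parameter names are illustrative only.

```python
def polling_load(devices, query_bytes, response_bytes, interval_seconds):
    """Return (bits per polling cycle, average bits/second, bits/hour)."""
    bits_per_cycle = (query_bytes + response_bytes) * devices * 8
    bits_per_second = bits_per_cycle / interval_seconds
    polls_per_hour = 3600 / interval_seconds
    bits_per_hour = bits_per_cycle * polls_per_hour
    return bits_per_cycle, bits_per_second, bits_per_hour


# 30 devices, 100-byte query and 100-byte response
print(polling_load(30, 100, 100, 60))   # prints (48000, 800.0, 2880000.0)
print(polling_load(30, 100, 100, 600))  # prints (48000, 80.0, 288000.0)
```

Shortening the interval raises the load in direct proportion, which is the timeliness versus bandwidth tradeoff in numbers.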


    Deciding Which Faults to
    Manage
    • Need to decide which faults to manage
      – Need to prioritise faults
      – If the number of fault reports is high, the network may not
        handle the volume
      – Limiting event traffic can reduce redundant
        transmissions and storage
    • Factors to consider
      – Scope of control over network
      – Size of network
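Prioritising faults and limiting redundant reports could look like the sketch below. It assumes each report is a (device, fault, severity) tuple and that severities follow a simple three-level ranking. Both the ranking and the example reports are hypothetical.

```python
# Hypothetical severity ranking: a lower number is handled first.
SEVERITY = {"critical": 0, "major": 1, "minor": 2}


def prioritise(reports):
    """Drop repeated (device, fault) reports, then order by severity.

    sorted() is stable, so reports of equal severity keep arrival order.
    """
    seen = set()
    unique = []
    for device, fault, severity in reports:
        if (device, fault) in seen:
            continue  # redundant report: do not transmit or store it again
        seen.add((device, fault))
        unique.append((device, fault, severity))
    ranked = sorted(unique, key=lambda report: SEVERITY[report[2]])
    return [(device, fault) for device, fault, _ in ranked]


reports = [
    ("sw1", "link down", "major"),
    ("r1", "no response", "critical"),
    ("sw1", "link down", "major"),   # duplicate of the first report
    ("h3", "high latency", "minor"),
]
print(prioritise(reports))
# prints [('r1', 'no response'), ('sw1', 'link down'), ('h3', 'high latency')]
```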


    Fault Management of a Network
    Management System
    • Simplest system
      – Reports existence of fault but NOT location
    • More complex tool
      – Uses capability of hosts and network devices to
         • Send critical network events
         • Facilitate isolation of fault cause
    • Advanced tool
      – Correction of fault



     Impact of a Fault on the
     Network
     • A fault management tool MUST be capable of
       analysing how a fault can affect other areas of
       the network
     • Need to know
       – What services the fault
          • STOPS
          • IMPACTS
       – Not only that a fault has occurred but also how that
         fault affects other network communication
     • Data can come from performance
       management tools
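One way to separate services a fault STOPS from services it only IMPACTS is to record which devices each service can be reached through. The sketch below assumes a service is stopped when every device it can use has failed, and impacted when only some have failed. This interpretation, the function name and the example topology are all assumptions made for illustration.

```python
def classify_impact(service_paths, failed):
    """Return (stopped, impacted) service names for a set of failed devices.

    service_paths maps a service to the alternative devices it can use.
    """
    stopped, impacted = [], []
    for service, devices in service_paths.items():
        down = [device for device in devices if device in failed]
        if down and len(down) == len(devices):
            stopped.append(service)    # no working device left
        elif down:
            impacted.append(service)   # redundancy lost, still reachable
    return sorted(stopped), sorted(impacted)


# Hypothetical topology
service_paths = {
    "email": ["router-a"],
    "web": ["router-a", "router-b"],
    "print": ["switch-c"],
}
print(classify_impact(service_paths, failed={"router-a"}))
# prints (['email'], ['web'])
```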


     Form of Reporting Faults
     • Common forms of fault reporting
        – Text
        – Graphical
        – Auditory signals
     • Text
        – Will work on any type of terminal
     • Graphical
        – Considered to be very effective
        – Can use flashing images to gain attention
        – Colour can be used to indicate device status
     • Auditory signals
        – Will quickly call attention to the occurrence of a fault



