Performance Management
(Best Practices)

Ref.: www.cisco.com
Document ID 15115
Introduction

• Performance Management involves
 optimization of network response time and
 management of consistency and quality of
 individual and overall network services
  – Need to measure the user/application
    response time
Performance management issues

•   User performance
•   Application performance
•   Capacity planning
•   Proactive fault management

• With newer applications such as voice and
  video, performance management is the key
  to success
Critical success factors (1/2)

• Gather a baseline for both network and
  application data
• Perform a what-if analysis on network and
  application
• Perform exception reporting for capacity
  issues
• Determine the network management
  overhead for all proposed or potential
  network management services
Critical success factors (2/2)

• Analyze the capacity information
• Periodically review capacity information for
  both network and applications, as well as
  baseline and exception reports
• Have upgrade or tuning procedures set up
 to handle capacity issues on both a
 reactive and long-term basis
Indicators for performance
management (1/3)
• Performance indicators provide a mechanism by
  which an organization can measure critical
  success factors.
• The indicators are the following:
• Document the network management business
  objectives

• Create detailed and measurable service level
  objectives
Indicators for performance
management (2/3)
• Document the service level agreements (SLAs)
  with charts or graphs that show how well
  these agreements are met over time

• Collect a list of the variables for the baseline,
  such as:
   – polling interval
   – network management overhead incurred
   – possible trigger thresholds
   – whether the variable is used as a trigger for a trap
   – trending analysis used against each variable
Indicators for performance
management (3/3)
• Have a periodic meeting that reviews the analysis
  of the baseline and trends.

• Have a what-if analysis methodology documented.
  – This should include modeling and verification where
    applicable

• Document the methodology used to increase
  network resources when thresholds are
  exceeded.
  – One item to document is the timeline required to put in
    additional WAN bandwidth, along with a cost table
Performance management process
flow
• 1. Develop a network management concept of operation
• 2. Measure performance
• 3. Perform a proactive fault analysis
Performance management process
flow (1/3)
• 1. Develop a network management concept
  of operation
  – Define the required features: services,
    scalability, and availability objectives
  – Define availability and network management
    objectives
  – Define performance SLAs and metrics
  – Define SLAs
Performance management process
flow (2/3)
• 2. Measure performance
  – Gather network baseline data
  – Measure availability
  – Measure response time
  – Measure accuracy
  – Measure utilization
  – Capacity planning
Performance management process
flow (3/3)
• 3. Perform a proactive fault analysis
  – Use thresholds for proactive fault management
  – Network management implementation
  – Network operation metrics
Develop a network management
concept of operation (1/3)
• The purpose is to describe the overall
  desired system characteristics from an
  operational standpoint
• This document is used to coordinate
  the overall business goals of network
  operations, engineering, design, other
  business units, and the end users.
Define the required features: Services,
Scalability objectives (1/2)

• Define services: understand the applications,
  basic traffic flows, user and site counts, and
  required network services (create a model of
  your network)
• Create solution scalability objectives: help
  network engineers design networks that
  meet future growth requirements without
  running into resource constraints
  – For example: media capacity, number of routes, etc.
Define the required features: Services,
Scalability objectives (2/2)

• These are the standard performance
 goals:
  – Response time
  – Utilization
  – Throughput
  – Capacity (maximum throughput rate)
Define availability and network
management objectives (1/2)
• Availability objectives define the level of
  services (service level requirements)
  – Define different classes of service for a particular
    organization
  – Higher availability objectives might necessitate
    increased redundancy and support procedures
Define availability and network
management objectives (2/2)
• Define manageability objectives to ensure
 that overall network management does
 not lack management functionality
  – Must understand the organization's processes
    and tools
  – Uncover all important MIB or network tool
    information
Define performance SLAs and
Metrics
• The performance SLAs should include the
 average expected volume of traffic, peak
 volume of traffic, average response time
 and maximum response time allowed
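The four SLA values listed above can be held in a small record and compared against measurements. This is only a sketch: the class name, field names, and units are hypothetical, and treating each value as an upper bound is an assumption, not something the source states.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class PerformanceSla:
    # Hypothetical field names and units for the four values a performance SLA should include.
    avg_traffic_mbps: float
    peak_traffic_mbps: float
    avg_response_ms: float
    max_response_ms: float


def sla_violations(sla, traffic_mbps, response_ms):
    """List every SLA value that the measured samples exceed (assumed to be upper bounds)."""
    violations = []
    if mean(traffic_mbps) > sla.avg_traffic_mbps:
        violations.append("average traffic volume")
    if max(traffic_mbps) > sla.peak_traffic_mbps:
        violations.append("peak traffic volume")
    if mean(response_ms) > sla.avg_response_ms:
        violations.append("average response time")
    if max(response_ms) > sla.max_response_ms:
        violations.append("maximum response time")
    return violations


# Example with made-up numbers: traffic is within bounds, response time is not.
example_sla = PerformanceSla(40, 80, 100, 250)
print(sla_violations(example_sla, [20, 30, 40], [80, 90, 300]))
```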
Define SLAs

• SLA (Service Level Agreement): enterprise
• SLM (Service Level Management): service provider
• SLM includes definitions for problem types and
  severity, and help desk responsibilities
  – Escalation path and time before escalation at each
    tier of support
  – Time to start work on the problem
  – Target time to close, based on priority
  – Services to provide in areas such as capacity planning
    and hardware replacement
Measure Performance

• Gather Network Baseline data
  – Perform a baseline of the network before and
    after a new solution deployment
  – A typical router/switch baseline report
    includes capacity issues related to CPU,
    memory, buffer, link/media utilization,
    throughput
  – Application baseline: bandwidth used by app
    per time period
Measure availability

• Availability is the measure of time for
  which a network system or application is
  available to a user
  – Coordinate the help desk phone calls with the
    statistics collected from managed devices
  – Check scheduled outages
  – Etc.
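One way to turn this definition into a number is to express availability as a percentage of the time the service was supposed to be up. The sketch below assumes that scheduled outages are excluded from the measurement window; that convention and the function name are assumptions, and an organization's SLA may define it differently.

```python
def availability_percent(total_minutes, unscheduled_outage_minutes, scheduled_outage_minutes=0):
    """Availability (%) over a period, excluding scheduled outages from the window (assumed convention)."""
    service_minutes = total_minutes - scheduled_outage_minutes
    if service_minutes <= 0:
        raise ValueError("no service time left in the measurement window")
    return 100 * (service_minutes - unscheduled_outage_minutes) / service_minutes


# Example with made-up numbers: a 30-day month, 2 hours of planned maintenance,
# and 45 minutes of unplanned downtime.
print(availability_percent(30 * 24 * 60, 45, scheduled_outage_minutes=120))
```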
Measure Response Time

• Network response time is the time required for
  traffic to travel between two points
• Simple level: pings from the network management
  station to key points in the network (not very accurate)
• Server-centric polling: SAA (Service Assurance
  Agent) on a Cisco router measures response time
  to a destination device
• Generate traffic that resembles the particular
  application or technology of interest
Measure accuracy

• Accuracy is the measure of interface traffic
  that does not result in errors and can be
  expressed as a percentage
• Accuracy = 100 – error rate
• Error rate = ifInErrors * 100 /
  (ifInUcastPkts + ifInNUcastPkts)
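The accuracy formula above can be sketched in Python. This is a minimal illustration that assumes the three inputs are the changes in each counter between two polls of the same interface; the function name is hypothetical.

```python
def accuracy_percent(if_in_errors, if_in_ucast_pkts, if_in_nucast_pkts):
    """Accuracy (%) = 100 - error rate, computed from counter deltas for one polling interval."""
    total_packets = if_in_ucast_pkts + if_in_nucast_pkts
    if total_packets == 0:
        # No packets were received, so an error rate cannot be computed.
        return None
    error_rate = if_in_errors * 100 / total_packets
    return 100 - error_rate


# Example with made-up deltas: 5 errors against 9,000 unicast and 1,000 non-unicast packets.
print(accuracy_percent(5, 9000, 1000))
```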
Measure Utilization (1)

• Utilization measures the use of a particular
  resource over time
• It is the percentage in which the usage of a
  resource is compared with its maximum
  operational capacity
• High utilization is not necessarily bad
• A sudden jump in utilization can indicate
  an abnormal condition
Measure Utilization (2)


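• Input utilization =
  (ifInOctets * 8 * 100) / ((time in seconds) * ifSpeed)
• Output utilization =
  (ifOutOctets * 8 * 100) / ((time in seconds) * ifSpeed)
  – ifInOctets and ifOutOctets here are the change in each
    counter over the measurement interval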
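The utilization formulas can be sketched as follows. The helper names are hypothetical, and the wrap handling for 32-bit counters is an addition that the slides do not discuss; it assumes the counter wrapped at most once between polls.

```python
COUNTER32_MODULUS = 2 ** 32  # ifInOctets and ifOutOctets are 32-bit counters in the interfaces MIB


def counter_delta(previous, current, modulus=COUNTER32_MODULUS):
    """Difference between two counter samples, allowing for a single wrap."""
    return (current - previous) % modulus


def utilization_percent(octets_delta, interval_seconds, if_speed_bps):
    """Utilization (%) = (delta octets * 8 * 100) / (seconds * ifSpeed)."""
    if interval_seconds <= 0 or if_speed_bps <= 0:
        raise ValueError("interval and ifSpeed must be positive")
    return octets_delta * 8 * 100 / (interval_seconds * if_speed_bps)


# Example with made-up samples: a 10 Mbit/s link polled 300 seconds apart.
delta = counter_delta(previous=1_000_000, current=38_500_000)
print(utilization_percent(delta, 300, 10_000_000))
```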
Capacity planning

• The following are potential areas for
 concern:
  – CPU
  – Backplane or I/O
  – Memory
  – Interface and pipe sizes
  – Queuing, latency and jitter
  – Speed and distance
  – Application characteristics
Perform a Proactive fault analysis

• One method to perform fault management
  is through the use of the RMON alarm and
  event groups
• Another is a distributed management system
  that enables polling at a local level, with
  data aggregated at a manager of
  managers
Use threshold for proactive fault
management (1/2)
• A threshold is a point of interest in a specific
  data stream; an event is generated when the
  threshold is triggered
• 2 classes of threshold for numeric data
  – Continuous thresholds apply to continuous or
    time-series data, such as data stored in SNMP
    counters or gauges
  – Discrete thresholds apply to enumerated objects
    or discrete numeric data, such as Boolean
    objects
Use threshold for proactive fault
management (2/2)
• 2 forms of continuous threshold
  – Absolute: used with gauges
  – Relative (delta): used with counters
• Steps to determine thresholds
  – 1. Select the objects
  – 2. Select the devices and interfaces
  – 3. Determine the threshold values for each
    object or interface
  – 4. Determine the severity for the event
    generated by each threshold
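The two forms of continuous threshold, and the four selection steps, can be sketched as a small data structure plus a check. Everything here is illustrative: the class, field names, and event shape are hypothetical, and this is not how RMON itself is configured.

```python
from dataclasses import dataclass


@dataclass
class Threshold:
    object_name: str  # step 1: the monitored object
    target: str       # step 2: the device or interface it applies to
    value: float      # step 3: the threshold value
    severity: str     # step 4: severity of the generated event
    relative: bool    # True = relative (delta), for counters; False = absolute, for gauges


def check(threshold, previous_sample, current_sample):
    """Return an event dict if the threshold is reached, otherwise None."""
    if threshold.relative:
        observed = current_sample - previous_sample  # change since the last poll
    else:
        observed = current_sample                    # the gauge value itself
    if observed >= threshold.value:
        return {
            "object": threshold.object_name,
            "target": threshold.target,
            "observed": observed,
            "severity": threshold.severity,
        }
    return None


# Example with made-up values: a delta threshold on an error counter.
errors = Threshold("ifInErrors", "router-1:Serial0", 100, "minor", relative=True)
print(check(errors, previous_sample=1000, current_sample=1150))
```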
Network management
implementation
• The organization should have an
  implemented network management
  system.
• SNMP/RMON or other network
  management system tools
Network operation metrics (1/2)

• Number of problems that occur, by call
  priority
• Minimum, maximum and average time to
  close in each priority
• Breakdown of problems by problem type
  (hardware, software crash, configuration,
  power, user error)
Network operation metrics (2/2)

• Breakdown of time to close for each
 problem type
• Availability by availability group or SLA
• How often you met or missed SLA
  requirements
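Several of these metrics can be computed directly from help desk records. The sketch below assumes each ticket is a (priority, hours to close) pair; that record shape and the function name are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def time_to_close_by_priority(tickets):
    """Count tickets and report min, max, and average time to close for each priority."""
    grouped = defaultdict(list)
    for priority, hours_to_close in tickets:
        grouped[priority].append(hours_to_close)
    return {
        priority: {
            "count": len(hours),
            "min": min(hours),
            "max": max(hours),
            "avg": mean(hours),
        }
        for priority, hours in grouped.items()
    }


# Example with made-up tickets.
print(time_to_close_by_priority([(1, 2.0), (1, 4.0), (2, 10.0)]))
```

The same grouping works for problem type instead of priority by changing the first element of each record.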
Performance Management Indicators
Document the network
management business objectives
(1/3)
• This document is the organization's network
  management strategy and should
  coordinate the overall business goals of
  network operations, engineering, design,
  other business units and the end users.
• It enables the organization to form its
  long-range planning activities for network
  management and operation.
Document the network
management business objectives
(2/3)

• Identify a comprehensive plan with
  achievable goals
• Identify each business service/application
  that requires network support
• Identify the performance-based metrics
  needed to measure service
Document the network
management business objectives
(3/3)

• Plan the collection and distribution of the
  performance metrics
• Identify the support needed for network
  evaluation and user feedback
• Have documented, detailed and
  measurable SLA objectives
Document the Service Level
Agreements
• Before documenting the SLAs, you must define
  the service level objective metrics
• This document should be available to users
  so that they can evaluate it and provide
  feedback on the variables needed to maintain
  the agreed service level
• SLAs are living agreements
  – What works today might become obsolete
    tomorrow
Create a list of variables for the
baseline
• This list includes items such as
  – polling interval
  – Network management overhead incurred
  – Possible trigger thresholds
  – Trending analysis used against each variable
  – Router health
  – Switch health
  – Routing information
  – Utilization
  – delay
Review the baseline and trends

• Network management personnel should
  hold meetings periodically (operational
  and planning)
• These meetings should also review the SLAs
Document a what-if analysis
methodology
• A what-if analysis involves modeling and
  verification of solutions.
• It includes the major questions, the
  methodology, data sets and configuration
  files
• The main point is that the what-if analysis is
  an experiment that someone else should be
  able to recreate with the information
  provided in the document
Document the methodology used
to increase network performance
• This document covers adding WAN bandwidth
  and includes a cost table for increasing
  the bandwidth of a particular type of link
• It helps the organization realize how much time
  and money it costs to increase the bandwidth
• Review this document periodically to ensure that it
  remains up to date
Configuration Management
(Best Practices)

Ref.: www.cisco.com
Document ID 15111
High-level process flow for
Configuration Management
• Start
• Create Standards
• Implement Standards
• Maintain Documentation
• Validate and Audit Standards
• Improve? (decision: NO / YES)
• Review Standards (on YES)
Create Standards (1)

• Creating standards helps reduce network
  complexity, the amount of unplanned
  downtime, and exposure to network-impacting
  events
Create Standards (2)

• The following standards are recommended for
  optimal network consistency
  – Software version control and management
  – IP addressing standards and management
  – Naming conventions and Domain Name System (DNS)/DHCP
    assignment
  – Standard configuration and descriptors
  – Configuration upgrade procedures
  – Solution templates
Software Version Control and
Management (1)
• Software version control is the practice of
  deploying consistent software versions on
  similar network devices
• It limits the number of software defects and
  interoperability issues
• It reduces the risk of unexpected behavior
  – With user interfaces
  – Feature behavior / upgrade behavior
Software Version Control and
Management (2)
• Steps for software version control
  – Determine device classifications based on chassis,
    stability and new feature requirements
  – Target individual software versions for each
    similar-device classification
  – Test, validate and pilot the chosen software versions
  – Document successful versions as the standard for each
    similar-device classification
  – Consistently deploy or upgrade all similar devices to
    the standard software version
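Once a standard version is documented per device classification, checking an inventory against it is a simple comparison. This sketch assumes the inventory is a list of (device name, classification, running version) tuples; the classification names and version strings are placeholders, not real releases.

```python
def non_compliant_devices(inventory, standard_versions):
    """Return (device, reason) pairs for devices that do not run their classification's standard version."""
    findings = []
    for name, classification, running in inventory:
        standard = standard_versions.get(classification)
        if standard is None:
            findings.append((name, "no standard defined for classification"))
        elif running != standard:
            findings.append((name, f"running {running}, standard is {standard}"))
    return findings


# Hypothetical standards and inventory, for illustration only.
standards = {"access-switch": "version-A", "wan-router": "version-B"}
devices = [
    ("sw-c21-1", "access-switch", "version-A"),
    ("sw-c21-2", "access-switch", "version-X"),
    ("rtr-branch-7", "wan-router", "version-B"),
]
print(non_compliant_devices(devices, standards))
```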
IP Address Standards and
Management (1)
• IP address management is the process of
  allocating, recycling and documenting IP
  addresses and subnets in a network
• It reduces the opportunity for overlapping
  or duplicate subnets, wasted IP address
  space, and unnecessary complexity
IP Address Standards and
Management (2)
• Standardize subnet sizes for standard applications
    – Subnet size of a building
    – Subnet size of a WAN link
    – Subnet size of a branch site
    – Subnet size of a loopback
• Subnet blocks should promote IP summarization
  (contiguous IP ranges)
• Create standards for IP assignment within each subnet
    – Routers should get the first available addresses
    – Switches may get the next available addresses
    – Other fixed addresses come next, followed by dynamic (DHCP) addresses
• Finally, document the standards you developed and the IP allocations
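Python's standard `ipaddress` module can illustrate why contiguous subnet blocks matter: contiguous subnets collapse into one summary route, while scattered ones do not. The addresses and helper names below are made up for the example.

```python
import ipaddress


def summarize(subnets):
    """Collapse a list of subnets into the smallest set of summary networks."""
    networks = [ipaddress.ip_network(subnet) for subnet in subnets]
    return [str(network) for network in ipaddress.collapse_addresses(networks)]


def first_available_address(subnet):
    """First usable host address in a subnet, the slot suggested for the router."""
    return str(next(iter(ipaddress.ip_network(subnet).hosts())))


# Four contiguous /24 blocks summarize into a single /22.
print(summarize(["10.1.0.0/24", "10.1.1.0/24", "10.1.2.0/24", "10.1.3.0/24"]))
# Non-contiguous blocks cannot be summarized.
print(summarize(["10.1.0.0/24", "10.1.2.0/24"]))
print(first_available_address("10.1.0.0/24"))
```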
Naming Convention and DNS/DHCP
Assignment (1)
• Consistent, structured use of naming conventions
  and DNS for devices helps to
  – Create a consistent access point to routers for all network
    management information related to a device
  – Reduce the opportunity for duplicate IP addresses
  – Create simple identification of a device, showing
    location, device type and purpose
  – Improve inventory management by providing a
    simpler method to identify network devices
Naming Convention and DNS/DHCP
Assignment (2)
• On routers, it is strongly recommended to
  use the loopback interface as the primary
  management interface
• The loopback interface can be used for traps,
  SNMP and syslog
• Individual interfaces can follow a naming
  convention that identifies the device,
  location, purpose and interface
Naming Convention and DNS/DHCP
Assignment (3)
• It is also recommended to identify DHCP
  ranges and add them to DNS,
  including the location of the users
• Example: “dhcp-bldg-c21-10” to
  “dhcp-bldg-c21-253”, which identifies IP addresses
  in building C, second floor, wiring closet 1
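The naming pattern in this example can be generated programmatically before loading the names into DNS. The function below is a hypothetical sketch that follows the slide's pattern (building letter, floor number, wiring closet number, then the last octet of the address).

```python
def dhcp_names(building, floor, closet, first_host, last_host):
    """Generate DNS names for a DHCP range using the dhcp-bldg-<building><floor><closet>-<host> pattern."""
    prefix = f"dhcp-bldg-{building}{floor}{closet}"
    return [f"{prefix}-{host}" for host in range(first_host, last_host + 1)]


# Building C, second floor, wiring closet 1, host addresses 10 through 253.
names = dhcp_names("c", 2, 1, 10, 253)
print(names[0], names[-1], len(names))
```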
Standard Configuration and
Descriptors (1)
• Standard configuration applies to protocol and
  media configuration as well as global
  configuration commands
• Descriptors are interface commands used to
  describe an interface
• It is recommended to create standard
  configurations for each device classification
    – Router, LAN switch, WAN switch, ATM switch
Standard Configuration and
Descriptors (2)
• Each standard configuration contains the global, media,
  and protocol configuration commands
• Global configuration
   – Password, vty, banners
   – SNMP configuration, Network Time Protocol (NTP)
• Media configuration
   – ATM, Frame Relay, Fast Ethernet configuration
• Protocol Configuration
   – Routing protocol
   – Access control list
   – QoS configuration
Standard Configuration and
Descriptors (3)
• Descriptors are developed by creating a
  standard format that applies to each
  interface
• The descriptor includes
  – the purpose and location of the interface
  – Other devices and locations connected to the
    interface
  – Circuit identifier
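A standard descriptor format is easiest to enforce when the string is built by one helper rather than typed by hand. The field order, separators, and names below are hypothetical; the source only says which pieces of information the descriptor should carry.

```python
def interface_description(purpose, location, remote_device, remote_location, circuit_id=None):
    """Build an interface description string in one fixed (hypothetical) format."""
    parts = [
        f"purpose={purpose}",
        f"loc={location}",
        f"to={remote_device}@{remote_location}",
    ]
    if circuit_id:
        parts.append(f"circuit={circuit_id}")
    return " | ".join(parts)


# Example with made-up device names and circuit identifier.
print(interface_description("WAN uplink", "bldg-c-2-1", "core-rtr-1", "hq", "CKT-0001"))
```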
Standard Configuration and
Descriptors (4)
• It is recommended
    – to keep standard configuration parameters in a
      standard configuration file
    – to download the file to each new device prior to
      protocol and interface configuration
• Document the standard configuration
  file, including an explanation of each global
  configuration parameter and why it is important
• Cisco Resource Manager Essentials (RME) can be used
  to manage standard configuration files
Configuration Upgrade Procedure

• Upgrade procedures ensure that software and
  hardware upgrades occur smoothly with minimal
  downtime
• Upgrade procedures include
    – Vendor verification
    – Vendor installation references, such as release notes
    – Upgrade methodologies or steps
    – Configuration guidelines
    – Testing requirements
Solution Templates (1)

• Solution templates are used to define modular
  network solutions
• A network module may be a wiring closet, a
  WAN field office or an access concentrator
• Templates ensure that similar deployments are
  carried out in exactly the same way
    – This can reduce the risk level to the organization
Solution Templates (2)

• Specific details of the solution template
   – Hardware and hardware modules, including memory,
     flash, power and card layouts
   – Logical topology, including port assignments
   – Software versions, including firmware versions
   – All non-standard, non-device-specific configuration,
     including VLAN configuration, access lists, switching
     paths, spanning tree parameters, etc.
   – Out-of-band management requirements
   – Cable requirements
   – Installation requirements, including environmental, power
     and rack locations
Maintain Documentation

• It is recommended to use the following
  critical success factors for network
  documentation
  – Current device, link and end-user inventory
  – Configuration version control system
  – TACACS (Terminal Access Controller Access-Control
    System) configuration log
  – Network topology documentation
Validate and Audit Standards

• We can use configuration management
  performance indicators to measure
  configuration management success
• Configuration management performance
  indicators
  – Configuration integrity checks
  – Devices, protocol and media audits
  – Standards and documentation review
Configuration integrity checks

• Configuration integrity checks should evaluate
  the overall configuration of the network, its
  complexity and consistency, and potential issues

• For Cisco networks, the Netsys configuration
  validation tool is recommended.
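As a much simpler illustration of a consistency check (this is not how the Netsys tool works), a script can verify that every line of the standard global configuration appears in a device's configuration. The function name and the sample standard lines are assumptions made for the example.

```python
def missing_standard_lines(device_config, standard_lines):
    """Return the standard configuration lines that do not appear in the device configuration."""
    present = {line.strip() for line in device_config.splitlines()}
    return [line for line in standard_lines if line.strip() not in present]


# Hypothetical standard and device configuration, for illustration only.
standard = ["service password-encryption", "ntp server 192.0.2.1"]
config = """
hostname rtr-branch-7
service password-encryption
"""
print(missing_standard_lines(config, standard))
```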
Device, Protocol and Media Audits

• These audits check consistency in software
  versions, hardware devices and modules,
  protocols and media, and naming
  conventions
• CiscoWorks RME is a configuration tool that
  can audit and report on hardware versions,
  modules and software versions
Standards and Documentation
review
• This review ensures that the information is accurate and up
  to date
• The audit should include reviewing current documentation,
  recommending changes or additions, and approving new
  standards
• The following documents should be reviewed on a quarterly
  basis
   – Standard configuration definitions
   – Solution templates, including recommended hardware configurations
   – Current standard software versions
   – Upgrade procedures for all devices and software versions
   – Topology documentation
   – Current templates
   – IP address management

				