    Switching IT: Troubleshooting 201

Ten years ago, the network was relatively simple. There were hubs, bridges, and routers. Each was a discrete box, readily identifiable from the others. Troubleshooting was also simple. If you were attached to a hub, then the rules for troubleshooting a collision domain applied.

[Figure: Hub, Bridge or Switch, Router]

At the point where the collision domain attached to a bridge, all errors stopped. Troubleshooting using a protocol analyzer was the best available option, and was very effective once the user knew the basics of the network and the protocols in use.

Then switches appeared on the scene. A switch may act as an Open System Interconnection (OSI) Layer 2 bridge, or it may operate in one of several other modes that involve low-latency forwarding techniques. When low-latency forwarding techniques are in use, the scope of the network that may be involved in an error is now equal to the broadcast domain. Regardless of forwarding technique, traffic is only forwarded to the “correct” port. Thus, using a protocol analyzer to monitor and troubleshoot becomes instantly ineffective in a switched environment. The protocol analyzer typically only sees broadcasts and traffic to unknown destinations when attached to any unused switch port.

A new generation of network analyzers can provide substantially faster solutions to networking problems. Active discovery tools using Simple Network Management Protocol (SNMP) for analysis, Remote Monitoring (RMON) traffic analyzers, and a hybrid of network and cable analyzers can be combined with traditional protocol analyzers to deliver unprecedented views into the network. Regardless of your tool of choice, there are 5 basic approaches that are generally applied to troubleshooting a switched environment:

Access the switch console via remote login with an application such as Telnet or directly to the serial port. This is usually the same method used to configure the switch.

Use SNMP to query the switch, usually from a network management platform.

Install a Tap or Splitter, usually on an uplink. Then monitor the output from the Tap or Splitter with a protocol analyzer or other diagnostic tool.

Configure a Mirror or Span port to get a copy of the traffic for analysis. The mirror can be configured to monitor the activity on one or more ports at the same time.

Install a shared media hub between the problem device and the switch. Then monitor the collision domain with a protocol analyzer or other diagnostic tool.

Each of these approaches has positive and negative aspects, and none of them are perfect. You will probably have to employ several of them depending on the troubleshooting situation.

Neal Allen, a technology specialist for Fluke Networks, plays a critical role in beta testing, strategic partnerships, and event logistics. He has been involved in the design, installation, and troubleshooting of networks for more than 15 years.

SNMP
If security is enabled at the switch or anywhere along the way, it may not be possible to talk to the switch to obtain statistics.

Most useful information may be obtained from standard Management Information Bases (MIBs) instead of private MIBs, though not all switches support standard MIBs. Also, not all standard MIBs are well implemented by all switches. Private MIBs provide a view into new features and functionality that the standard MIBs don’t know about, such as rate limiting.

The primary job of the switch is to forward traffic, not answer SNMP queries. Under high traffic loads and/or high traffic bursts, many switches will temporarily stop updating SNMP statistics, and may even stop recording them briefly.

SNMP queries add traffic to a network that you are already troubleshooting for some sort of performance problem. This is an especially important consideration if a Wide Area Network (WAN) link is involved. The queries also require some Central Processing Unit (CPU) capacity from the switch, which reduces the switch’s ability to forward traffic.

Tap or Splitter
Taps provide a view into both half and full duplex links, but monitoring a full duplex link requires an expensive 2-port analyzer unless you are willing to see only one side of a conversation at a time. Not all traffic on a switch passes through the uplink, which is often where the tap is installed.

Installation of a Tap requires that the link be disconnected for a short period of time, which further impacts performance on the network.

Shared Media Hub
As with the Tap, installing a hub only gives you a view of the traffic passing through a single port, not the whole switch.

Shared media means half duplex. Placing a hub on a full duplex link is likely to result in worse performance than the problem you were already troubleshooting.

Not all hubs are OSI Layer 1 repeaters. You may not see what you are expecting, especially if the “hub” is really a small and inexpensive switch itself.

Installing the hub means adding 2 additional points of failure: the hub, and another cable.

Once a shared media hub is installed, almost any monitoring tool may be used to troubleshoot the problem, including protocol analyzers.

Mirror or Span
Only traffic from the ports being mirrored will appear on the configured mirror port. You still need to have a pretty good idea where the problem is. Also, many switches do not permit traffic to be transmitted into the output mirror port, resulting in a listen-only situation.

Operating a mirror will usually reduce the performance of the switch by some amount.

Even if you guessed correctly on which port(s) to mirror, the forwarding technique employed by the switch may prevent you from seeing the error(s) that are causing the problem you are troubleshooting. Errors are not generally forwarded by switches, though some techniques permit certain errors to be forwarded.

If the combined activity on the mirrored port(s) exceeds the total output capacity of the mirror port, then traffic will be discarded without notification. You will unknowingly miss potentially critical traffic while you are troubleshooting.

Despite the various positive and negative aspects of the common switch troubleshooting techniques, you will have to use them. There simply are not a lot of alternatives. The good thing is that all of the issues that cause problems for troubleshooting also help to keep the rest of the network running even when one or more
users are experiencing problems. Beware: there are some recent developments in switching technology that should be considered during the troubleshooting process.

Rate Limiting
Some switches permit configuration of limits on how much bandwidth a particular user, protocol, or address is permitted. Other users, protocols, or applications may consume the entire capacity of the connection. Thus, web access between two adjacent ports of the same switch may crawl along at 5 Mbps despite being connected at 100 Mbps, while a File Transfer Protocol (FTP) download proceeds at near wire-speed across the same connection. The anticipated reaction is to assume that the web server is experiencing problems, when in fact it is the intentional configuration of the switch. Short of logging into the switch configuration management screens, this type of configuration is nearly impossible to detect.

Load Balancing
Some switches are designed to perform load balancing. The troubleshooting impact of this is that you know where the traffic entered the switch, but you may have difficulty predicting where it should come out. Unless you can check the switch configuration, this can be difficult to troubleshoot.

OSI Layer 3, 4, 5-7 Forwarding Functionality
Switches have become much faster and smarter. The front-end silicon is now able to perform many routing functions without passing the traffic up to software for routing decisions. The software is usually more multi-functioned, but the hardware is much faster. Troubleshooting in this environment is not the same as troubleshooting a collision domain or broadcast domain problem. Unless you can check the switch configuration, this can be difficult to troubleshoot.

VLANs
A simple Virtual Local Area Network (VLAN) configuration assigns a set of ports to be a broadcast domain. To pass traffic between 2 VLANs on the same switch usually requires a trip to a router, possibly located on another blade in the switch or an entirely separate device.

More complex configurations can assign the VLAN dynamically based on port, address, or other criteria. This may mean that the troubleshooting tool must assume the identity of the problem station in order to effectively troubleshoot the problem.

Virtual High-Speed Ports
Many switches permit combining several lower speed ports to form a single logical higher speed port. This is sometimes known as EtherChannel, though it is not limited to a single vendor, or a single Ethernet speed. Taps are available to monitor multiple physical links joined into a logical port, but doing so requires special software and hardware.

Redundancy
Switches inherently offer Spanning Tree as an OSI Layer 2 means of maintaining and managing parallel network paths. If anything goes wrong with the way parallel paths are handled, a broadcast storm often results and brings the broadcast domain to a halt. A variety of troubleshooting issues surround Spanning Tree problems, but the solution is simply to disconnect one of the parallel connections. Finding the problem parallel connection is the challenge. Furthermore, since switches also double as routers depending on the options loaded with the switch operating system or the installed hardware, the issue of standby ports becomes something to worry about. Various acronyms found include Virtual Router Redundancy Protocol (VRRP), Hot Standby Router Protocol (HSRP) and Extreme Standby Router Protocol (ESRP). All of them describe how two or more switches may be used in parallel, blocking traffic until such time as the active switch fails.

In some instances the parallel path is constantly active, but unused. Other configurations permit the unused path to be tested and then held “down” until needed, or both paths to be load balanced and used continuously. On occasion all paths are kept up and load balanced until a failure closes a path. To troubleshoot in this environment requires some knowledge of the configuration, or at least knowledge that the parallel path exists. Independent actions by the dynamic Layer 2 protocols and the dynamic Layer 3 protocols can sometimes prevent the “active” port at the other layer from receiving traffic.

Asymmetrical Routed Paths
Similar to redundancy issues, parallel routed paths may create problems. Depending on the network configuration, it is possible to have asymmetrical paths in operation. Traffic can leave on one path and return on another. This situation may cause connection oriented protocols like Transmission Control Protocol (TCP) to receive packets out of order, which may result in retransmissions (apparent as a slow response to the user).

Unusual Frame Types
Many switches now offer vendor proprietary links to “improve” performance. These usages include all sorts of innovative solutions to distance, speed, or performance challenges.

Two examples are Jumbo Frames and Long Reach Ethernet (LRE); neither “standard” is supported by the 802.3 Ethernet standard at this time. Many diagnostic tools cannot be directly connected to a port using a non-802.3 technology.

Shortening the Troubleshooting Cycle
There are a considerable number of challenges that network support staff must overcome when troubleshooting switched networks. Awareness of the potential problems is paramount to a successful troubleshooting episode, and the list of potential troubleshooting challenges continues to grow.

Each additional feature that is introduced into switching technology creates a new troubleshooting challenge.

None of the challenges are insurmountable, but continued education is critical. Once aware of the challenge, it becomes a relatively simple matter of selecting the right tool or tools for the problem at hand. Gone are the days of using a single tool to solve most problems; however, the suite of features available from the range of available tools is growing right alongside the increasing catalog of switch features.

An Acronym Guide
CPU: Central Processing Unit
ESRP: Extreme Standby Router Protocol
FTP: File Transfer Protocol
HSRP: Hot Standby Router Protocol
LRE: Long Reach Ethernet
MIB: Management Information Base
OSI: Open System Interconnection
RMON: Remote Monitoring
SNMP: Simple Network Management Protocol
TCP: Transmission Control Protocol
VLAN: Virtual Local Area Network
VRRP: Virtual Router Redundancy Protocol
WAN: Wide Area Network

  32 IT Horizons                                                       November/December 2003
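
As a rough illustration of the mirror-port capacity limit described under Mirror or Span, the following Python sketch adds up the traffic offered by the mirrored ports and compares it with what the mirror port can carry. The function name and the traffic figures are hypothetical, chosen only for illustration. Real switches may buffer short bursts, so treat the result as an estimate of sustained overload, not an exact count of discarded frames.

```python
def mirror_oversubscription(mirrored_port_mbps, mirror_port_capacity_mbps):
    """Estimate how much mirrored traffic cannot fit on the mirror port.

    mirrored_port_mbps: average load, in Mbps, of each mirrored source
    mirror_port_capacity_mbps: output capacity of the mirror port in Mbps
    Returns (total offered Mbps, fraction of offered traffic that cannot fit).
    """
    offered = sum(mirrored_port_mbps)
    # Anything above the mirror port's capacity has nowhere to go.
    excess = max(0.0, offered - mirror_port_capacity_mbps)
    dropped_fraction = excess / offered if offered else 0.0
    return offered, dropped_fraction


# Hypothetical case: three ports averaging 40, 50 and 30 Mbps,
# all mirrored to a single 100 Mbps mirror port.
offered, dropped = mirror_oversubscription([40, 50, 30], 100)
print(f"Offered {offered} Mbps, about {dropped:.0%} cannot fit on the mirror port")
```

Keep in mind that a full duplex port carries traffic in both directions, so mirroring both directions of a single 100 Mbps full duplex port can by itself offer up to 200 Mbps to a 100 Mbps mirror port.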

