Switching IT: Troubleshooting 201
BY NEAL ALLEN
Ten years ago, the network was relatively simple. There were hubs, bridges, and routers. Each was a discrete box, readily identifiable from the others. Troubleshooting was also simple. If you were attached to a hub, then the rules for troubleshooting a collision domain applied. At the point where the collision domain attached to a bridge, all errors stopped. Troubleshooting using a protocol analyzer was the best available option, and was very effective once the user knew the basics of the network and the protocols in use.

[Figure: a hub, a bridge or switch, and a router]

Then switches appeared on the scene. A switch may act as an Open Systems Interconnection (OSI) Layer 2 bridge, or it may operate in one of several other modes that involve low-latency forwarding techniques. When a switch uses low-latency forwarding techniques, the scope of the network that may be involved in an error grows to equal the broadcast domain. Regardless of forwarding technique, traffic is only forwarded to the “correct” port. Thus, using a protocol analyzer to monitor and troubleshoot becomes instantly ineffective in a switched environment. When attached to any unused switch port, the protocol analyzer typically sees only broadcasts and traffic to unknown destinations.

A new generation of network analyzers can provide substantially faster solutions to networking problems. Active discovery tools using Simple Network Management Protocol (SNMP) for analysis, Remote Monitoring (RMON), traffic analyzers, and a hybrid of network and cable analyzers can be combined with traditional protocol analyzers to deliver unprecedented views into the network. Regardless of your tool of choice, there are 5 basic approaches that are generally applied to troubleshooting a switched environment:

1. Access the switch console via remote login with an application such as Telnet, or directly through the serial port. This is usually the same method used to configure the switch.
2. Use SNMP to query the switch, usually from a network management platform.
3. Install a Tap or Splitter, usually on an uplink. Then monitor the output from the Tap or Splitter with a protocol analyzer or other diagnostic tool.
4. Configure a Mirror or Span port to get a copy of the traffic for analysis. The mirror can be configured to monitor the activity on one or more ports at the same time.
5. Install a shared media hub between the problem device and the switch. Then monitor the collision domain with a protocol analyzer or other diagnostic tool.
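A toy model of Layer 2 forwarding can illustrate why a protocol analyzer on a spare port sees so little, and why the approaches above are needed. The sketch below is a simplification under stated assumptions, not any vendor's implementation: the class name `LearningSwitch`, the port numbers, and the MAC addresses are hypothetical, and it ignores table aging, VLANs, and low-latency forwarding modes.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


class LearningSwitch:
    """Toy Layer 2 switch: learns source addresses, forwards by destination."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.mac_table = {}  # MAC address -> port it was last seen on

    def forward(self, in_port, src, dst):
        """Return the list of ports a frame is sent out of."""
        self.mac_table[src] = in_port  # learn where the sender is attached
        if dst != BROADCAST and dst in self.mac_table:
            out = self.mac_table[dst]
            # Known unicast: only the "correct" port gets the frame
            return [] if out == in_port else [out]
        # Broadcast or unknown destination: flood to every other port
        return [p for p in self.ports if p != in_port]


# Hypothetical setup: stations on ports 1-3, analyzer on unused port 4
switch = LearningSwitch(ports=[1, 2, 3, 4])
frames = [
    (1, "aa:aa:aa:aa:aa:01", BROADCAST),            # broadcast
    (2, "aa:aa:aa:aa:aa:02", "aa:aa:aa:aa:aa:01"),  # reply to a learned station
    (1, "aa:aa:aa:aa:aa:01", "aa:aa:aa:aa:aa:02"),  # known unicast
    (1, "aa:aa:aa:aa:aa:01", "aa:aa:aa:aa:aa:03"),  # destination not yet learned
]
seen_by_analyzer = [4 in switch.forward(*frame) for frame in frames]
print(seen_by_analyzer)  # [True, False, False, True]
```

Only the broadcast and the frame to an unlearned destination reach port 4. The two known unicast frames, which are usually the traffic of interest, never appear there.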
Each of these approaches has positive and negative aspects, and none of them is perfect. You will probably have to employ several of them depending on the troubleshooting situation.

Neal Allen, a technology specialist for Fluke Networks, plays a critical role in beta testing, strategic partnerships, and event logistics. He has been involved in the design, installation, and troubleshooting of networks for more than 15 years.
November/December 2003 www.aami.org IT Horizons 29
SNMP
If security is enabled at the switch or anywhere along the way, it may not be possible to talk to the switch to obtain statistics.

Most useful information may be obtained from standard Management Information Bases (MIBs) instead of private MIBs, though not all switches support standard MIBs. Also, not all standard MIBs are well implemented by all switches. Private MIBs provide a view into new features and functionality that the standard MIBs don’t know about, such as rate limiting.

The primary job of the switch is to forward traffic, not answer SNMP queries. Under high traffic loads or bursts, many switches will temporarily stop updating SNMP statistics, and may even stop recording them briefly.

SNMP queries add traffic to a network that you are already troubleshooting for some sort of performance problem. This is an especially important consideration if a Wide Area Network (WAN) link is involved. The queries also require some Central Processing Unit (CPU) time from the switch, which affects the switch's ability to forward traffic.

Tap or Splitter
Taps provide a view into both half and full duplex links, but monitoring a full duplex link requires an expensive 2-port analyzer unless you are willing to see only one side of a conversation at a time. Not all traffic on a switch passes through the uplink, which is often where the tap is installed.

Installation of a Tap requires that the link be disconnected for a short period of time, which further impacts performance on the network.

Shared Media Hub
As with the Tap, installing a hub only gives you a view of the traffic passing through a single port, not the whole switch.

Shared media means half duplex. Placing a hub on a full duplex link is likely to result in worse performance than the problem you were already troubleshooting.

Not all hubs are OSI Layer 1 repeaters. You may not see what you are expecting, especially if the “hub” is really a small and inexpensive switch itself.

Installing the hub means adding 2 additional points of failure: the hub and another cable.

Once a shared media hub is installed, almost any monitoring tool may be used to troubleshoot the problem, including protocol analyzers.

Mirror or Span
Only traffic from the ports being mirrored will appear on the configured mirror port. You still need to have a pretty good idea where the problem is. Also, many switches do not permit traffic to be transmitted into the output mirror port, resulting in a listen-only situation.

Operating a mirror will usually reduce the performance of the switch by some amount.

Even if you guessed correctly on which port(s) to mirror, the forwarding technique employed by the switch may prevent you from seeing the error(s) that are causing the problem you are troubleshooting. Errors are not generally forwarded by switches, though some techniques permit certain errors to be forwarded.

If the combined activity on the mirrored port(s) exceeds the total output capacity of the mirror port, then traffic will be discarded without notification. You will unknowingly miss potentially critical traffic while you are troubleshooting.

Despite the various positive and negative aspects of the common switch troubleshooting techniques, you will have to use them. There simply are not a lot of alternatives. The good thing is that all of the issues that cause problems for troubleshooting also help to keep the rest of the network running even when one or more
users are experiencing problems. Beware: there are some recent developments in switching technology that should be considered during the troubleshooting process.

Rate Limiting
Some switches permit configuration of limits on how much bandwidth a particular user, protocol, or address is permitted. Other users, protocols, or applications may consume the entire capacity of the connection. Thus, web access between two adjacent ports of the same switch may crawl along at 5 Mbps despite being connected at 100 Mbps, while a File Transfer Protocol (FTP) download proceeds at near wire speed across the same connection. The anticipated reaction is to assume that the web server is experiencing problems, when in fact it is the intentional configuration of the switch. Short of logging into the switch configuration management screens, this type of configuration is nearly impossible to detect.

Load Balancing
Some switches are designed to perform load balancing. The troubleshooting impact of this is that you know where the traffic entered the switch, but you may have difficulty predicting where it should come out. Unless you can check the switch configuration, this can be difficult to troubleshoot.

OSI Layer 3, 4, 5-7 Forwarding Functionality
Switches have become much faster and smarter. The front-end silicon is now able to perform many routing functions without passing the traffic up to software for routing decisions. The software is usually more multifunctional, but the hardware is much faster. Troubleshooting in this environment is not the same as troubleshooting a collision domain or broadcast domain problem. Unless you can check the switch configuration, this can be difficult to troubleshoot.

VLANs
A simple Virtual Local Area Network (VLAN) configuration assigns a set of ports to be a broadcast domain. Passing traffic between 2 VLANs on the same switch usually requires a trip to a router, possibly located on another blade in the switch or on an entirely separate device.

More complex configurations can assign the VLAN dynamically based on port, address, or other criteria. This may mean that the troubleshooting tool must assume the identity of the problem station in order to effectively troubleshoot the problem.

Virtual High-Speed Ports
Many switches permit combining several lower speed ports to form a single logical higher speed port. This is sometimes known as EtherChannel, though it is not limited to a single vendor or a single Ethernet speed. Taps are available to monitor multiple physical links joined into a logical port, but this requires special software and hardware.

Redundancy
Switches inherently offer Spanning Tree as an OSI Layer 2 means of maintaining and managing parallel network paths. If anything goes wrong with the way parallel paths are handled, a broadcast storm often results and brings the broadcast domain to a halt. A variety of troubleshooting issues surround Spanning Tree problems, but the solution is simply to disconnect one of the parallel connections. Finding the problem parallel connection is the challenge. Furthermore, since switches also double as routers depending on the options loaded with the switch operating system or the installed hardware, the issue of standby ports becomes something to worry about. Various acronyms found include Virtual Router Redundancy Protocol (VRRP), Hot Standby Router Protocol (HSRP), and Extreme Standby Router Protocol (ESRP). All of them describe how two or more switches may be used in parallel, with the standby blocking traffic until the active switch fails.

In some instances the parallel path is constantly active, but unused. Other configurations permit the unused path to be tested and then held “down” until needed, or both paths to be load balanced and used continuously. On occasion all paths are kept up and load balanced until a failure closes a path. Troubleshooting in this environment requires some knowledge of the configuration, or at least of the presence of the parallel path. Independent actions by the dynamic Layer 2 protocols and the dynamic Layer 3 protocols can sometimes prevent the “active” port at the other layer from receiving traffic.

Asymmetrical Routed Paths
Similar to redundancy issues, parallel routed paths may create problems. Depending on the network configuration, it is possible to have asymmetrical paths in operation. Traffic can leave on one path and return on another. This situation may cause connection-oriented protocols like Transmission Control Protocol (TCP) to receive packets out of order, which may result in retransmissions (apparent as a slow response to the user).

Unusual Frame Types
Many switches now offer vendor-proprietary links to “improve” performance. These usages include all sorts of innovative solutions to distance, speed, or performance. Two examples are Jumbo Frames and Long Reach Ethernet (LRE); neither “standard” is supported by the 802.3 Ethernet standard at this time. Many diagnostic tools cannot be directly connected to a port using a non-802.3 technology.

Shortening the Troubleshooting Cycle
There are a considerable number of challenges that network support staff must overcome when troubleshooting switched networks. Awareness of the potential problems is paramount to a successful troubleshooting episode, and the list of potential troubleshooting challenges continues to grow. Each additional feature that is introduced into switching technology creates a new troubleshooting challenge.

None of the challenges are insurmountable, but continued education is critical. Once you are aware of the challenge, it becomes a relatively simple matter of selecting the right tool or tools for the problem at hand. Gone are the days of using a single tool to solve most problems; however, the suite of features available from the range of available tools is growing right alongside the increasing catalog of switch features.

An Acronym Guide
CPU: Central Processing Unit
ESRP: Extreme Standby Router Protocol
FTP: File Transfer Protocol
HSRP: Hot Standby Router Protocol
LRE: Long Reach Ethernet
MIB: Management Information Base
OSI: Open Systems Interconnection
RMON: Remote Monitoring
SNMP: Simple Network Management Protocol
TCP: Transmission Control Protocol
VLAN: Virtual Local Area Network
VRRP: Virtual Router Redundancy Protocol
WAN: Wide Area Network