Network Troubleshooting Tools

                Kent Reuber
             ITS Networking
                April 6, 2007
What problems do you need to solve?
Tool descriptions
Q&A time

Tool descriptions are in the
 “Software” section of the LNA Guide.
What are the problems?
 Are hosts online? (ping)
 How do you get to hosts? (traceroute)
 What are hosts running? (nmap)
 Where/when have hosts been seen? (IPM)
 “The network is slow” (Netspeed, iperf)
 DHCP and DNS (SUNet reports)
 Wireless problems (various)
 Packet sniffing (Wireshark)
 Batch NetDB changes (NetDB CLI)
Ping and traceroute
Ping: Are you there?
 Ping sends ICMP echo requests to a host and asks
  for a reply. Reply time is also returned.
 Some hosts may choose not to reply as a matter of
  security policy. A missing reply may not mean that
  they’re down.
 Stanford de-prioritizes pings at some of our
  borders, so long ping times or dropped pings
  do not necessarily indicate a poor connection.
 Stanford maintains a special host:
    “”
    Exempt from ping filter.
    Have outside users ping “ping-me” if they claim that
     connections to Stanford are unavailable or slow.
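A minimal sketch of what a ping packet looks like on the wire, assuming a standard ICMP echo request (type 8, code 0) with the usual Internet checksum. The function names are illustrative only; actually sending the packet needs a raw socket and administrator rights, so that part is left out.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """One's-complement sum of 16-bit words, as used by ICMP."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length data with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    # Fold any carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(identifier: int, sequence: int, payload: bytes = b"") -> bytes:
    """Build an ICMP echo request: type 8, code 0, checksum, id, sequence."""
    header = struct.pack("!BBHHH", 8, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    header = struct.pack("!BBHHH", 8, 0, checksum, identifier, sequence)
    return header + payload
```

A receiver validates the packet by running the same checksum over the whole message; a correct packet sums to zero.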
Ping for Advanced Users
Can increase packet size to see
 duplex errors. (Unix: ping -s)
  Default small (<60 byte) ping packets
   don’t generate enough traffic to show
   duplex problems.
  Try using pings of 1000+ bytes.
Use nmap or similar utility for “ping
 sweeps” of entire networks:
  “nmap -sP <network range>” (Ex.
   “nmap -sP”)
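To make the idea of a network range concrete, here is a small sketch that lists the addresses a ping sweep would probe. The network used in the test is a documentation range chosen for illustration, not a Stanford network, and the function name is hypothetical.

```python
import ipaddress
from typing import List


def sweep_targets(network: str) -> List[str]:
    """Return the usable host addresses in a network such as 192.0.2.0/29."""
    net = ipaddress.ip_network(network, strict=False)
    # hosts() skips the network and broadcast addresses for ordinary subnets.
    return [str(host) for host in net.hosts()]
```

Each returned address would then be pinged in turn, which is the work nmap does for you in a sweep.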
Traceroute: How do I get there?
 How traceroute works:
    Source sends a series of packets with increasing time-to-
     lives. (TTL is the allowed number of router hops.)
     Unix/Mac: UDP, Windows “tracert”: ICMP.
    Routers decrement the TTL and respond with an ICMP
     “time exceeded” message when it reaches 0.
    Like ping, a round-trip time is reported for each reply.
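The TTL mechanism can be sketched as a small simulation. This is not a real traceroute (no packets are sent); the path is a made-up list of router names ending in the destination, and each router that drops a probe is assumed to answer with ICMP “time exceeded”.

```python
def probe(path, ttl):
    """Return (responder, message) for one probe sent with the given TTL."""
    if not path:
        raise ValueError("path must contain at least the destination")
    for index, node in enumerate(path):
        if index == len(path) - 1:
            return node, "destination reply"
        ttl -= 1  # each router decrements the TTL before forwarding
        if ttl == 0:
            return node, "time exceeded"


def simulate_traceroute(path, max_ttl=30):
    """Send probes with TTL 1, 2, 3... until the destination answers."""
    hops = []
    for ttl in range(1, max_ttl + 1):
        responder, message = probe(path, ttl)
        hops.append((ttl, responder, message))
        if message == "destination reply":
            break
    return hops
```

With a path of two routers and a destination, the probe with TTL 1 is answered by the first router, TTL 2 by the second, and TTL 3 reaches the destination.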

Traceroute notes
 Routers need not reply to traceroutes.
  Lack of a reply does not mean that the
  router is down.
 Return traffic doesn’t necessarily use the
  same path.
    This can cause problems with firewalls and
     packet shapers that assume they see the
     whole conversation.
    When troubleshooting connection problems,
     you may want to have the destination send
     traceroutes to you as well.
Nmap: Scanning nets
In addition to ping scans, you can
 scan for open ports on hosts.
This can be useful for seeing who is
 running a service (intentionally or
 unintentionally).
My recipe for scanning for open
 TCP ports:
  ”nmap -P0 -sT net -p ports -oG - | grep
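For readers curious what a TCP connect scan does underneath, here is a rough sketch using plain sockets. It is not a substitute for nmap, and the function names are illustrative; only scan hosts you are responsible for.

```python
import socket


def tcp_port_open(host, port, timeout=1.0):
    """Return True if a full TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


def scan_ports(host, ports, timeout=1.0):
    """Return the subset of ports that accepted a connection."""
    return [port for port in ports if tcp_port_open(host, port, timeout)]
```

A connect scan completes the full handshake, so it shows up in the target's logs like any other connection.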
Getting nmap
Download from
Unix and MacOS X usually
 require compiling from source.
Windows binary available.
IPM: IP <-> MAC addresses
Stanford-specific utility
How it works:
  Devices broadcast ARP packets when
   they need to communicate locally.
  Routers see these ARP packets and cache them.
  Information is periodically harvested
   and kept in a database.
  Using IPM, you can track when an
   IP/MAC was first and last seen and
IPM: What’s it good for?
You can find MAC addresses
 which aren’t in Netdb.
Find out where a particular
 device has been seen.
See if multiple devices are
 using a single IP address.
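The bookkeeping behind these uses can be sketched as follows. This is not how IPM itself is implemented; it is a hypothetical in-memory log that records first and last sightings of each IP/MAC pair and reports IP addresses seen with more than one MAC.

```python
class SightingLog:
    """Track first/last sighting per (IP, MAC) pair harvested from ARP data."""

    def __init__(self):
        self._seen = {}  # (ip, mac) -> (first_seen, last_seen)

    def record(self, ip, mac, when):
        key = (ip, mac.lower())
        first, last = self._seen.get(key, (when, when))
        self._seen[key] = (min(first, when), max(last, when))

    def first_last(self, ip, mac):
        """Return (first_seen, last_seen), or None if the pair was never seen."""
        return self._seen.get((ip, mac.lower()))

    def shared_ips(self):
        """Return {ip: [macs]} for IPs that were seen with several MACs."""
        macs_by_ip = {}
        for ip, mac in self._seen:
            macs_by_ip.setdefault(ip, []).append(mac)
        return {ip: sorted(macs) for ip, macs in macs_by_ip.items() if len(macs) > 1}
```

An IP that appears in shared_ips() is a candidate for an address conflict or a replaced network card.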
More on IPM
Where is it:
  AFS: /usr/pubsw/sbin/ipm
  Note: this directory is not in your
   default PATH.
Using IPM:
  Wildcards: “_” (single
   character), “%” (multiple
   characters).
  Run “ipm -h” to see list of options.
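The two wildcards behave like SQL LIKE patterns. A sketch of the matching rule, assuming “%” matches zero or more characters and that matching is case-insensitive (both are assumptions, not documented IPM behavior):

```python
import re


def wildcard_to_regex(pattern):
    """Translate '_' and '%' wildcards into a compiled regular expression."""
    parts = []
    for ch in pattern:
        if ch == "_":
            parts.append(".")      # exactly one character
        elif ch == "%":
            parts.append(".*")     # any run of characters (assumed to allow none)
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE)


def matches(pattern, value):
    """Return True if the whole value matches the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(value) is not None
```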
MAC vendor codes
 MAC addresses are 48-bit (6 bytes)
  xx:xx:xx:xx:xx:xx, where each “x” is a
  hexadecimal number 0-9,a-f.
 First 3 bytes are the Organizationally
  Unique Identifier (OUI), which tells you who
  made the network card.
 Can look this up. My favorite site:
 Can tell you when NetDB records are
  outdated. For example, a NetDB record
  for a Macintosh with a MAC address beginning
  00:0b:db (Dell) is clearly wrong.
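Pulling the OUI out of a MAC address is simple enough to sketch. The lookup table below holds only the one prefix mentioned above; a real table would come from the published vendor list. Function names are illustrative.

```python
import re

# Only the prefix named in the text; extend from a real OUI list as needed.
KNOWN_OUIS = {"00:0b:db": "Dell"}


def normalize_mac(mac):
    """Return a MAC as lowercase xx:xx:xx:xx:xx:xx, accepting : - . separators."""
    digits = re.sub(r"[:.\-\s]", "", mac).lower()
    if not re.fullmatch(r"[0-9a-f]{12}", digits):
        raise ValueError("not a 48-bit MAC address: %r" % mac)
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))


def oui(mac):
    """Return the first 3 bytes (the vendor prefix) of a MAC address."""
    return normalize_mac(mac)[:8]


def vendor(mac):
    """Look up the vendor for a MAC, or 'unknown' if the prefix is not listed."""
    return KNOWN_OUIS.get(oui(mac), "unknown")
```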
Netspeed and Iperf
Netspeed & Iperf: Speed
Often hear “the network is slow”.
  Is it the client, the network or a server?
  Where’s the bottleneck?
Useful tools:
  Netspeed (Web-based speed test to
   campus backbone).
  Iperf (command line tool for point-to-point testing).
Web based speed testing to
 Stanford backbone:
Useful for finding duplex errors
 (misconfigured hubs or
 switches) in the path.
Command line testing tool.
  Can also run speed tests against and
  Can be run in server mode for
   testing speed between arbitrary
   points (e.g., within your network)
How fast can you go?
DSL: 1 Mbps (asymmetric)
802.11b wireless: 1-5 Mbps
802.11g wireless: 1-12 Mbps
Fast Ethernet: 80+ Mbps
Gigabit: ??
Note: consider these tests as upper
 bounds. For gigabit especially, you
 may not be able to transfer real
 data this fast.
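The arithmetic behind these figures, as a small sketch. It assumes 1 Mbps means 1,000,000 bits per second and ignores protocol overhead, so real transfers will take longer.

```python
def throughput_mbps(bytes_transferred, seconds):
    """Convert a measured transfer into megabits per second."""
    return bytes_transferred * 8 / seconds / 1_000_000


def transfer_seconds(size_bytes, rate_mbps):
    """Best-case time to move size_bytes at the given rate."""
    return size_bytes * 8 / (rate_mbps * 1_000_000)
```

For example, a 100 MB file (100,000,000 bytes) at 80 Mbps needs at least 10 seconds.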
Troubleshooting DHCP
Many things can go wrong.
 Problems are rarely caused by
 DHCP server unavailability.
Things to check:
  What IP is the host getting?
  Netdb record for the host.
  DHCP server logs, roaming pool
   utilization reports.
Understanding DHCP
 Stanford has two DHCP servers: dusk and dawn.
 Info from Netdb is uploaded approximately every 15
  minutes. Give Netdb the time to upload data.
 At Stanford, MAC address information is required for
  successful DHCP.
 Initial DHCP is a four step process using broadcasts;
  renews are different.

DHCP addresses are valid for a
 limited period (wired and wireless).
  Normal DHCP: 2 days
  Roaming DHCP: 42 minutes
Hosts will re-confirm their leases
 halfway through the lease period.
  Clients use unicast directly to the
   DHCP server (clients have an address
   and they know who their server is).
  Renew message type is used.
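The renewal timing is easy to work out. A sketch using the two lease lengths above; the function names are illustrative.

```python
from datetime import datetime, timedelta

NORMAL_LEASE = timedelta(days=2)
ROAMING_LEASE = timedelta(minutes=42)


def renewal_time(lease_start, lease_length):
    """Clients try to renew halfway through the lease."""
    return lease_start + lease_length / 2


def expiry_time(lease_start, lease_length):
    """The address is no longer valid after the full lease period."""
    return lease_start + lease_length
```

A roaming lease granted at 9:00 is renewed at 9:21 and expires at 9:42 if the renewal fails.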
DHCP roaming
 If the Netdb record has a “home” IP address
  appropriate for the network where the device is
  located, DHCP servers will send it.
    Can have “home” IP addresses and still be able to
     roam to other networks.
    Can have multiple “home” addresses bound to
     each MAC address.
 If no appropriate address is entered, DHCP will
  look for available roaming addresses on the local
  network.
    Number of roaming addresses is specified by the LNA.
     Defined in the Netdb network record.
    Usually there are only a handful of roaming
     addresses. Can easily run out of them.
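The selection order described above can be sketched like this. It is a simplified model, not the DHCP servers' actual logic; the networks in the test are documentation ranges and the function name is hypothetical.

```python
import ipaddress


def choose_address(home_addresses, network, free_roaming):
    """Pick a home address on this network, else a free roaming address."""
    net = ipaddress.ip_network(network)
    for addr in home_addresses:
        if ipaddress.ip_address(addr) in net:
            return addr, "home"
    if free_roaming:
        return free_roaming[0], "roaming"
    # Pool exhausted: the client ends up without a lease.
    return None, "no free leases"
```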
What address did you get?
 The address received may tell you what
  the problem is.
 Self assigned (169.254.*.*):
    NetDB record not set up properly.
    No roaming address available.
    Routing or DHCP server problem (less likely).
 10.x.x.x:
    Used by Network self-registration system.
    Could also be used by a rogue.
 192.168.*.*:
    Probably a rogue DHCP server.
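The same triage can be written as a quick lookup. The hints simply restate the list above; the function name is illustrative.

```python
import ipaddress

# (network, hint) pairs taken from the address ranges listed above.
ADDRESS_HINTS = [
    (ipaddress.ip_network("169.254.0.0/16"),
     "self-assigned: check NetDB record and roaming pool"),
    (ipaddress.ip_network("10.0.0.0/8"),
     "self-registration system, or possibly a rogue"),
    (ipaddress.ip_network("192.168.0.0/16"),
     "probably a rogue DHCP server"),
]


def diagnose_address(addr):
    """Return a troubleshooting hint for the address a host received."""
    ip = ipaddress.ip_address(addr)
    for network, hint in ADDRESS_HINTS:
        if ip in network:
            return hint
    return "no obvious problem from the address alone"
```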
Finding rogues
Try pinging the gateway that’s
 being distributed.
Use “arp” command to get the
 MAC address of the gateway. Or
 use a sniffer if you have one.
Look at switch MAC tables and find
 the offending hosts. Shut off the
 port or go have a “chat”.
New Net-to-Switch configs block
 rogue DHCP servers!
Available DHCP reports
 DHCP logs for a given host.
    Type in MAC address and see the conversation.
    Takes practice to read.
 Roaming address utilization
    How many roaming addresses were used in a day.
 DHCP reports from dusk and dawn
    Hourly logs show number of DHCP messages for
    “No free leases” may indicate that you’re out of
     roaming addresses.
 All reports are linked from LNA Guide software
  section: http://lnaguide/software.html
DNS at Stanford
Host information is entered in NetDB
  Uploads to DHCP servers about every
   15 minutes.
  Uploads to DNS servers about every hour.
     Starts at 5 minutes after the hour.
     Takes about 20 minutes. Should be done
      by 30 minutes past the hour.
     Specific info on timing is kept in the NetDB
      help files.
DNS inspection tools
 Standard: “host”, “nslookup”, “dig”.
 Stanford whois can show you most NetDB
   “whois -h <query>”
   Use “%” and “_” as wildcards as per ipm.
   Great for people who need “read-only”
    access, since you don’t need a NetDB
   For host names, you need to end query in a “.”
    or specify “” so that whois knows
    you want information on a host.
Wireless problems
Wireless is slow or unavailable.
Reports can be vague. “Wireless is
 slow on the 2nd floor.”
Isolating the problem can speed
  Exactly where is the problem?
  What access point is the user
   connecting to?
  Do others in the area have the problem?
Wireless tools
 Access point association:
    Mac: Internet Connect utility
    PC: ??
 Access point discovery for seeing
  available AP’s and channels:
  NetStumbler, iStumbler
 Iperf and Netspeed are useful for
  checking speed problems.
 Often, an AP reboot will solve the problem.
    AP jack (tso) information is in Netdb.
    Can unplug and replug if necessary.
Packet sniffer
EtherPeek and Wireshark
Stanford has a site license for
 EtherPeek, but it’s still expensive.
Wireshark (formerly Ethereal) is free.
 (Motto: “Sniff free or die!”)
  X windows application for Unix/Mac.
  Binary for Windows.
  Some books are available!
Advice on Sniffing
Need for a sniffer is rare, but
 invaluable when you need it.
  Learn to use it before you need it!
You will need to set up special
 “span” ports on your switches to
 see all traffic.
  No need if you’re interested in
   broadcasts and multicasts.
  Most useful for seeing traffic entering
   and leaving your net.
NetDB Command Line
NetDB CLI overview
Designed for power users.
Provides a subset of NetDB
 functionality (mostly nodes) for
 batch changes. New features
 are periodically added.
Use with caution. Try one or
 two hosts before doing big
How to run NetDB CLI
Located in AFS space:
  /usr/pubsw/sbin/netdb (note: this
   directory is probably not in your PATH)
  Use -h option to get command syntax
Stuff you can do (to a single
 machine or list of machines):
  Change administrators, locations.
  Change IP addresses.
  Delete nodes.

