Diploma Thesis Qucosa

					                  School of Computer Science
   Professorship of Computer Networks and Distributed Systems




                      Diploma Thesis



Analysis, Implementation and Enhancement of Vendor dependent and
        independent Layer-2 Network Topology Discovery.




                                by
                       Alexander Barthel


                               Tutor
                            Ralf König




                  Chemnitz, Germany, April 15 2005
Table of Contents
List of Abbreviations.......................................................................................................iii
List of Figures..................................................................................................................v
List of Tables.................................................................................................................viii
1. Introduction to network discovery................................................................................1
    1.1. Different views of one network.............................................................................1
    1.2. Aspects of network discovery...............................................................................2
    1.3. Related work.........................................................................................................3
        1.3.1. Standards dealing with network discovery....................................................3
        1.3.2. Drafts about network discovery.....................................................................4
        1.3.3. Existing software implementing network discovery......................................7
            1.3.3.1. Examples for Open Source software......................................................7
            1.3.3.2. Examples for Closed Source software....................................................8
    1.4. Typical problems...................................................................................................9
    1.5. Goals of this work.................................................................................................9
2. A common approach to network discovery.................................................................10
    2.1. Gathering data from devices...............................................................................10
        2.1.1. Command Line Interface.............................................................................11
        2.1.2. Simple Network Management Protocol.......................................................11
        2.1.3. Comparison of device access methods........................................................13
    2.2. Finding all devices..............................................................................................16
        2.2.1. Using proprietary protocols for finding devices..........................................16
        2.2.2. Finding devices using Link Layer Discovery Protocol (LLDP).....................19
        2.2.3. Scanning subnets for devices......................................................................19
        2.2.4. Find devices by MAC-addresses..................................................................20
        2.2.5. Device list....................................................................................................22
        2.2.6. Comparison of ways to find all devices........................................................22
    2.3. Generate topology from data..............................................................................22
        2.3.1. Common data model....................................................................................24
        2.3.2. Topology by Cisco Discovery Protocol (CDP) data........................25
        2.3.3. Topology by Spanning Tree Protocol (STP) data.........................................27
        2.3.4. Topology by Filtering Database information................................................29
            2.3.4.1. Complete knowledge of Filtering Databases........................................30
            2.3.4.2. Minimum knowledge of Filtering Databases........................................32
        2.3.5. Comparison and conclusion of topology discovery methods........................34
3. Design and implementation of the sample network discovery application - Netdisco.37
    3.1. Analysis of Netdisco's architecture.....................................................................37
        3.1.1. Discovery-script...........................................................................................38
        3.1.2. Helper functions..........................................................................................39


        3.1.3. Database back-end......................................................................................40
        3.1.4. Web front-end..............................................................................................44
    3.2. Analysis of Netdisco's mode of operation............................................................46
        3.2.1. Discovery scenarios.....................................................................................46
            3.2.1.1. Discovery of a single device...............................................................46
            3.2.1.2. Discovery of all CDP devices................................................................56
            3.2.1.3. Discovery of non-CDP devices..............................................................59
            3.2.1.4. Refresh devices....................................................................................59
            3.2.1.5. Host discovery.....................................................................................59
        3.2.2. Topology generation....................................................................................68
            3.2.2.1. CDP topology.......................................................................................68
            3.2.2.2. Manual topology.................................................................................71
        3.2.3. Detection of changes in networks................................................................72
        3.2.4. Problems during operation..........................................................................73
4. Improvements on Netdisco.........................................................................................74
    4.1. Vendor independent topology determination......................................................74
    4.2. Parallel device operations...................................................................................78
        4.2.1. Preconditions for concurrent database operations......................................80
        4.2.2. Preconditions for concurrent device operations..........................................81
        4.2.3. Implementation of parallel network discovery.............................................81
        4.2.4. Implementation of parallel device refresh...................................................85
        4.2.5. Implementation of parallel host discovery...................................................87
    4.3. Network detail maps and visualization of layer-1 and layer-2 properties............88
    4.4. Experiments and results of implementations......................................................90
        4.4.1. Results of using vendor independent topology information.........................90
        4.4.2. Results of concurrent operations.................................................................90
        4.4.3. Resource requirements for concurrent operations......................................96
        4.4.4. Problems during implementation...............................................................109
    4.5. Verification of results........................................................................................109
5. Integration into an existing device inventory...........................................................110
    5.1. Mapping between Netdisco and TUCOMA identifiers......................................110
    5.2. Adoption from Netdisco to TUCOMA data-model.............................................111
6. Conclusion................................................................................................................114
    6.1. Main results......................................................................................................114
    6.2. Remaining problems.........................................................................................114
    6.3. Future outlook..................................................................................................115
Appendix A - User documentation of the new Netdisco features..................................116
Appendix B – Content of the annexed CD.....................................................................118
Bibliography.................................................................................................................119



List of Abbreviations
ARP
     Address Resolution Protocol
BFS
     Breadth First Search
BID
     Bridge Identifier
BPDU
     Bridge Protocol Data Unit
CDP
     Cisco Discovery Protocol
CGI
     Common Gateway Interface
CLI
     Command Line Interface
CPAN
     Comprehensive Perl Archive Network
CSV
     Comma Separated Values
DBMS
     Database Management System
DNS
     Domain Name System
ERD
     Entity Relationship Diagram
FDB
     Filtering Database
FIFO
     First In First Out
HTML
     Hypertext Markup Language
ICMP
     Internet Control Message Protocol
IEEE
     Institute of Electrical and Electronics Engineers
IP
     Internet Protocol
ISO
     International Organization for Standardization


LAN
  Local Area Network
LLDP
  Link Layer Discovery Protocol
MAC
  Media Access Control
MAN
  Metropolitan Area Network
MIB
  Management Information Base
OID
  Object Identifier
OUI
  Organizationally Unique Identifier
RFC
  Request For Comments
SMI
  Structure of Management Information
SNMP
  Simple Network Management Protocol
SSH
  Secure Shell
STP
  Spanning Tree Protocol
TCP
  Transmission Control Protocol
TUCOMA
  TU-Chemnitz Communication Manager
UDP
  User Datagram Protocol
VLAN
  Virtual LAN
VTP
  VLAN Trunk Protocol
WAN
  Wide Area Network
WLAN
  Wireless LAN




List of Figures
Figure 2.1 Network management architecture, simplified from [CI02]..........................10
Figure 2.2 Example of a network comprised of CDP-capable devices............................17
Figure 2.3 Example of a network comprised of CDP devices connected to a single
CDP-unaware device..............................................................................................................18
Figure 2.4 Example of a network comprised of CDP devices connected to a single
CDP-aware device..................................................................................................................19
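Figure 2.5 Entity Relationship Diagram of a common layer-1 and layer-2 topology
model.............................................................................................................................24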
Figure 2.6 Differences in Layer-1 and Layer-2 Topologies.............................................25
Figure 2.7 Example of a bridged LAN, showing Spanning Tree port states and roles,...27
Figure 2.8 Relaying of MAC-frames, [IEEE802.1D] figure 7-4.......................................29
Figure 2.9 Concept of discovery methods based on forwarding tables...........................31
Figure 2.10 Example of Through sets, adapted from [LOWE2002]................................33
Figure 2.11 Illustration of the Minimum Knowledge Requirement, adapted from
[LOWE2002]...................................................................................................................34
Figure 3.1 Netdisco Architecture...................................................................................38
Figure 3.2 Entity Relationship Diagram of Netdisco's database tables..........................40
Figure 3.3 Discovery of a single device..........................................................................46
Figure 3.4 Discovery of a single device, Netdisco sub-routine discover.........................47
Figure 3.5 Netdisco sub-routine get_device...................................................................48
Figure 3.6 Netdisco sub-routine create_device..............................................................49
Figure 3.7 Netdisco sub-routine store_device................................................................51
Figure 3.8 Netdisco sub-routine store_interfaces...........................................................52
Figure 3.9 Netdisco sub-routine find_neighbors............................................................54
Figure 3.10 Netdisco sub-routine topo_add_link............................................................55
Figure 3.11 Discovery of all CDP devices.......................................................................56
Figure 3.12 Discovery of all CDP devices of example network in figure 3.11.................56
Figure 3.13 Netdisco sub-routine run, used to discover all CDP-devices.......................58
Figure 3.14 Netdisco sub-routine refresh, used to refresh all known devices................59
Figure 3.15 Host discovery............................................................................................60
Figure 3.16 Scheme of host discovery............................................................................60
Figure 3.17 Netdisco sub-routine macwalk....................................................................61
Figure 3.18 Netdisco sub-routine macsuck, part 1.........................................................62
Figure 3.19 Netdisco sub-routine macsuck, part 2.........................................................63
Figure 3.20 Netdisco sub-routine mac_savecache..........................................................64
Figure 3.21 Netdisco sub-routine walk_fwtable.............................................................65
Figure 3.22 Netdisco sub-routine arpwalk.....................................................................66
Figure 3.23 Netdisco sub-routine arpnip.......................................................................67


Figure 3.24 Netdisco sub-routine graph.........................................................................68
Figure 3.25 Netdisco module function netdisco::make_graph........................................69
Figure 3.26 Netdisco sub-routine graph_each................................................................70
Figure 3.27 Netdisco sub-routine graph_addnode.........................................................71
Figure 3.28 Example for manual topology......................................................................72
Figure 4.1 Comparison of CDP and STP topology information, with an outer non-CDP
device.............................................................................................................................75
Figure 4.2 Comparison of CDP and STP topology information, with an inner non-CDP
device.............................................................................................................................76
Figure 4.3 Add supplementary seen Spanning Tree links to the known CDP Topology,
topo_add_stp..................................................................................................................77
Figure 4.4 Parallel discovery implemented as Master-Slave design pattern...................79
Figure 4.5 Master-Slave Pattern as implemented in Netdisco........................................80
Figure 4.6 Netdisco sub-routine run with parallelization...............................................83
Figure 4.7 Parallel CDP discovery of example network in figure 3.11 on page 56.........84
Figure 4.8 Devices arranged in a line.............................................................................85
Figure 4.9 Implementation of Netdisco's refresh function in parallel mode...................86
Figure 4.10 Functions arpwalk and macwalk parallelized..............................................88
Figure 4.11 Netdisco's new sidebar for additional network maps..................................89
Figure 4.12 Comparison of different operations with different number of processes.....91
Figure 4.13 Average per device time of operations, in dependency of the number of
processes........................................................................................................................95
Figure 4.14 Average per device time of operations, in dependency of the number of
processes........................................................................................................................96
Figure 4.15 CPU and memory load during discovery with a limit of 1 process............101
Figure 4.16 Network traffic during discovery with a limit of 1 process.......................101
Figure 4.17 CPU and memory load during discovery with a limit of 100 processes.....102
Figure 4.18 Network traffic during discovery with a limit of 100 processes................102
Figure 4.19 CPU and memory load during discovery with a limit of 100 processes and
bulkwalk disabled.........................................................................................................103
Figure 4.20 Network traffic during discovery with a limit of 100 processes and bulkwalk
disabled........................................................................................................................103
Figure 4.21 CPU and memory load during discovery with a limit of 200 processes.....104
Figure 4.22 Network traffic during discovery with a limit of 200 processes................104
Figure 4.23 CPU and memory load during refresh with a limit of 100 processes.........105
Figure 4.24 Network traffic during refresh with a limit of 100 processes....................105
Figure 4.25 CPU and memory load during refresh with a limit of 200 processes.........106
Figure 4.26 Network traffic during refresh with a limit of 200 processes....................106
Figure 4.27 CPU and memory load during arpwalk with a limit of 25 processes..........107
Figure 4.28 Network traffic during arpwalk with a limit of 25 processes.....................107


Figure 4.29 CPU and memory load during macwalk with a limit of 25 processes........108
Figure 4.30 Network traffic during macwalk with a limit of 25 processes....................108
Figure 5.1 TUCOMA-designator to DNS-name mapping..............................................110
Figure 5.2 DNS-name mapping to TUCOMA-designator..............................................111
Figure 5.3 Simplified Entity Relationship Diagram of the TUCOMA data-model..........112
Figure 5.4 Example of a connection between active components................................112




List of Tables
Table 2.1 Comparison of Command Line Interface and Simple Network Management
Protocol..........................................................................................................................13
Table 2.2 Comparison of Telnet, GET-NEXT and GET-BULK.........................................16
Table 2.3 Example for different IDs................................................................................23
Table 2.4 Values necessary to determine layer-1 and layer-2 connections.....................25
Table 2.5 MIB variables necessary to determine a CDP connection...............................26
Table 2.6 MIB variables necessary to determine an STP connection..............................28
Table 2.7 Comparison of discovery methods..................................................................35
Table 3.1 Overview of Netdisco's database tables..........................................................41
Table 3.2 Netdisco Database Table device.....................................................................42
Table 3.3 Netdisco Database Table device_ip.................................................................42
Table 3.4 Netdisco Database Table device_port.............................................................43
Table 3.5 Netdisco Database Table node........................................................................43
Table 3.6 Netdisco Database Table node_ip...................................................................44
Table 3.7 Overview of Netdisco's HTML-files used by the web front-end.......................44
Table 3.8 Common MIB variables store_device retrieves...............................................51
Table 3.9 Steps while discovering the sample network from figure 3.11.......................57
Table 4.1 Shared memory variables used by sub-routine run in concurrency mode.......82
Table 4.2 Steps while discovering the sample network from figure 3.11 in
parallel-mode.................................................................................................................85
Table 4.3 Shared memory variables used by arpwalk and macwalk...............................87
Table 4.4 Comparison of different operations with a different number of processes......92
Table 4.5 Equations for the time operations ideally take in parallel mode, if the process
limit is greater or equal bn,max......................................................................................94
Table 4.6 Characteristics of the machine used for parallelization tests..........................97
Table 4.7 Resource usage during selected operations....................................................99




                                                   Diploma Thesis – Network Discovery – A. Barthel



1. Introduction to network discovery
   Network management becomes more and more important as computer networks grow
steadily. A fundamental part of network management is knowledge of the underlying
topology of a network. This knowledge can be gained by different methods, and it is
needed from different points of view. These individual views arise from the distinct needs
of the people involved in network management. The following use cases represent
selected groups of managers or engineers. They are not exhaustive, but should convey
the complexity of network discovery. For end users the network appears to be
transparent; they do not see the topology behind the network port they are connected to.
Therefore the end users' point of view is not considered.
   The term network topology is used in this document to mean the entirety of nodes and
the links between them. Nodes can be network devices, hosts or any other active
component in a network; links are the connections between those nodes.
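   The definition above can be illustrated with a minimal sketch. It assumes that nodes
are identified by plain names and that links are undirected. The class name Topology and
its methods are hypothetical and are not taken from any discovery tool.

```python
class Topology:
    """Hypothetical minimal model: a set of nodes plus a set of undirected links."""

    def __init__(self):
        self.nodes = set()
        # Each link is a frozenset of two node names, so (a, b) equals (b, a).
        self.links = set()

    def add_link(self, a, b):
        """Register both end points as nodes and store the link between them."""
        self.nodes.update((a, b))
        self.links.add(frozenset((a, b)))

    def neighbors(self, node):
        """Return all nodes that share a link with the given node."""
        result = set()
        for link in self.links:
            if node in link:
                result.update(link - {node})
        return result


topo = Topology()
topo.add_link("switch-a", "switch-b")
topo.add_link("switch-b", "switch-a")  # same link seen from the other end
topo.add_link("switch-b", "host-1")
```

   Storing a link as an unordered pair means that a connection reported by both of its
end points is counted only once.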



1.1. Different views of one network

View of a business manager
   Business managers are mostly interested in which devices are present in a network.
They do not need to know how devices are connected. They have to know which devices
are operating in a network and which will be needed in the future, so that they can
budget for buying new devices or replacing obsolete ones. This point of view is based on
a device inventory or stock.
   Another possible demand of a business manager could be to know where a customer's
web site can be seen, where it is currently being viewed or how often it is viewed. This is
used for predicting and calculating web-site statistics. In these cases it is also
necessary to know the network topology; otherwise wrong decisions are likely to be made
based on incomplete or wrong knowledge. A less detailed topology view is sufficient in
this case.

View of a software engineer
   For a software engineer of distributed applications it is essential to have an idea
of the network topology he designs his application for. Parameters that play a role are,
for example, throughput, latency and reachability. With this knowledge, possible
bottlenecks or faulty operations can be avoided. In summary, this point of view relies on
path information and connections throughout a network. It is not important to know
exactly which class of devices lies on the path through the network; what matters are the
constraints of this path.





View of a network manager
   The people most interested in network topologies are network managers. It is their
job to know the topology. They must step in whenever a malfunction caused by user,
device or management errors occurs. In everyday situations they have to plan networks to
operate at the best possible efficiency. Reliability and fault tolerance are other
important topics. Because network requirements change constantly, and in order to be
prepared for exceptional situations, it is highly desirable to know the current state of
the topology. From this point of view, both link parameters and connected devices are of
interest.



1.2. Aspects of network discovery
   As shown above, individual needs, points of view and external influences lead to
different aspects of network discovery. These aspects can be grouped into a few common
classes.

Aspect of granularity
   Granularity is the level of detail at which a network discovery is performed and
represented. Levels can be differentiated by ISO (International Organization for
Standardization) network layers, for example layer-1, layer-2 or layer-3 views. A further
classification can be made by region: Local Area Networks (LAN), Metropolitan Area
Networks (MAN) and Wide Area Networks (WAN) are examples of that arrangement.
Dividing a topology by device-based granularity is also possible. This is a view of
devices that have certain properties or are of special interest, Wireless LAN (WLAN)
devices for instance.
   Regardless of its kind, a topology must have the granularity or level of detail that
is appropriate for a certain point of view. It has to be complete, but should not contain
more details than are necessary to understand it as a whole.
   A higher granularity than needed may lead to an unclear collection of nodes and
links. It is possible to maintain different levels of detail for one view: for example,
one view has a high level of abstraction and gives an overview, while one or more other
views are more detailed and make it possible to see every piece of information of
interest.

Aspect of scale
   The dimensions of a network topology must also be defined. The considerations that
are important for granularity also apply to the scale of a topology. Its extent must be
defined so that all necessary information is contained within the representation of a
network, and placing outside information within the topology data should be avoided. The
representation of a topology must be chosen appropriately, so that it can be handled well.





Changes in the network topology
   A challenging aspect is change in the network topology. Networks are not static; they
vary over time. These changes may appear in all kinds of data. For instance, devices
(that is, nodes in a topology) are added or removed. The same may happen to links.
Representations and data-collecting processes must be designed to cope with this
constraint.
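   One way to make such changes visible is to compare two snapshots of the same
network. The following is a minimal sketch under the assumption that a snapshot is a
dictionary holding a set of node names and a set of undirected links. The function name
diff_topology and the snapshot layout are hypothetical.

```python
def diff_topology(old, new):
    """Compare two snapshots and report added and removed nodes and links.

    Each snapshot is assumed to be a dict with the keys "nodes" (set of
    names) and "links" (set of frozensets holding two node names each).
    """
    return {
        "nodes_added": new["nodes"] - old["nodes"],
        "nodes_removed": old["nodes"] - new["nodes"],
        "links_added": new["links"] - old["links"],
        "links_removed": old["links"] - new["links"],
    }


monday = {
    "nodes": {"sw1", "sw2", "host-1"},
    "links": {frozenset(("sw1", "sw2")), frozenset(("sw2", "host-1"))},
}
tuesday = {
    "nodes": {"sw1", "sw2", "sw3"},
    "links": {frozenset(("sw1", "sw2")), frozenset(("sw2", "sw3"))},
}
changes = diff_topology(monday, tuesday)
```

   A plain set difference only detects additions and removals. Changed attributes of a
node or link that still exists would need a separate comparison.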

Aspect of vendor dependence
   Vendor dependence is a double-edged sword. In a complete product suite of software
and hardware from a single vendor, the components are matched to each other's needs. On
the other hand, this makes the customers dependent on the vendor. Furthermore, the
customer has to trust in the ability and willingness of the vendor to implement standards
as they are supposed to be implemented.
   Therefore it must be kept in mind that in heterogeneous networks with devices from
different vendors, cooperation in management features is not guaranteed. Experience
shows that even in vendor-homogeneous networks a smooth collaboration can fail.
   Even a very good topology model is worth only half as much if the data it needs
cannot be obtained from the devices, because the manufacturer of the available network
equipment did not implement a standard well or omitted useful features. A desirable goal
is therefore to implement vendor-independent network discovery applications and models.
   In summary, there are distinct aspects that emerge from different needs. It has to be
clear what implicit and explicit constraints a certain model or environment can have.
The model created for the representation has to fit the characteristics of the
underlying “real world” network environment.



1.3. Related work


1.3.1. Standards dealing with network discovery
   Currently there is no common standard for network discovery. IEEE802.1AB is intended
to become such a standard, but it had not been published at the time this work was
written.
   There has been work on defining a common Simple Network Management Protocol
Management Information Base (SNMP-MIB) which holds network-topology data. This MIB is
proposed in Request For Comments (RFC) 2922 - “Physical Topology MIB” [RFC2922]. It
defines how information about physical network connections is kept within this MIB and
how this information relates to other MIBs. But it does not specify a protocol to
distribute topology data across neighboring devices. Without such a protocol it cannot be
determined whether devices cooperate or whether manufacturers will implement this MIB,
because it would make no sense to provide data which cannot be distributed to other
devices and which is already collected in other MIBs, too.
   Standard SNMP-MIBs can be used to determine network data and device configurations.
With the help of this information it would be possible to derive the network topology in
a standardized way, without having to know device-specific methods of gathering data.
Notable MIBs are: Bridge-MIB [RFC1493], Interface-MIB [RFC2863] and Q-Bridge-MIB
[RFC2674]. But as vendors do not implement these MIBs completely or develop their own
MIBs, it is hard to design a common way for all types of devices and different
manufacturers. Even in a vendor-homogeneous environment there are products that support
a certain MIB while others do not, or supply it incorrectly or incompletely.
   The three MIBs mentioned above define certain device and interface properties. The
Interface-MIB specifies ISO/OSI layer-1 attributes of device interfaces. The Bridge-MIB
covers layer-2 device properties that are used for managing MAC (Media Access Control)
bridges based on IEEE (Institute of Electrical and Electronics Engineers) 802.1D
[IEEE802.1D]. It also holds operational data that is collected during normal activity of
such a bridge, for instance forwarding table entries and Spanning Tree Protocol (STP)
information. The Q-Bridge-MIB is specified for extensions to 802.1D and for managing
IEEE 802.1Q Virtual LAN (VLAN) [IEEE802.1Q] features.
   As manufacturers do not stand still and keep placing new products on the market, they
developed their own protocols for network discovery. For example, Cisco Systems Inc.
designed the Cisco Discovery Protocol (CDP), a technically mature protocol that
distributes several hardware configuration parameters and network topology information.
Every manageable Cisco device “speaks” this protocol, and so do some products of other
vendors such as Hewlett Packard. CDP is primarily used to obtain protocol addresses and
the platform of neighboring devices. It can also be used to show information about a
neighbor's interface properties and device features. CDP operates at layer 2, which makes
it robust against layer-3 misconfiguration. The SNMP MIB for the Cisco Discovery Protocol
is defined in the CISCO-CDP-MIB [CDPMIB].



1.3.2. Drafts about network discovery
   Layer-2 bridges are also known as transparent bridges, which implies that their
operation is not visible to other devices. This property makes it impossible to figure
out the layer-2, and likewise the layer-1, network topology ad hoc. It can only be
determined either by using special features implemented to support network discovery or
by exploiting and evaluating information needed for the operation of bridges. The data
suited for this purpose are STP objects and forwarding databases.


                                                  Diploma Thesis – Network Discovery – A. Barthel


   Notable papers that apply these methods are presented in this section. All of them
aim at a vendor-independent way of discovering a network that is based on public
standards.
   The publications “Topology Discovery for Large Ethernet Network” [LOWE2002] and
“Topology Discovery in Heterogeneous IP Networks” [BREI2000] both rely on the MAC
address learning of layer-2 bridges. They collect all entries from all known devices and
apply certain rules to the resulting sets of forwarding databases. Both methods yield
layer-2 network topologies that consist of all devices having a distinct MAC address,
including end stations. The latter technique introduces the “Completeness Requirement”:
every address forwarding table in each device must be complete, that is, it has to
contain the full set of MAC addresses that are potentially reachable from any device
interface within a single subnet. As the standard aging time of address forwarding
entries kept by layer-2 bridges is 300 s, this constraint is hard to satisfy. Two
proposals are made to achieve completeness.
   First, constant network traffic is generated, which prevents forwarding entries from
being aged out. It is proposed to do this by constantly sending Internet Control Message
Protocol (ICMP) echo requests throughout the entire network and expecting the devices to
return ICMP echo replies. Gaining permanent access to every machine in a large network
and having them run the ICMP traffic generation all the time is a challenging task in
itself. The second proposal softens the completeness requirement by reducing the
required forwarding set of a bridge port to a user-defined, reasonably large fraction.
   The second method of network topology discovery based on address forwarding tables
uses an opposite constraint. It defines the “Simple Connection Theorem”, which implies a
minimum-knowledge constraint. For a pair of bridges, only three forwarding entries have
to be shared, and only one host has to be accessed, namely the one from which discovery
queries are sent and on which the replies are received. All this reduces the additional
effort for discovering the network topology.
   An issue that neither paper addresses is that bridge ports do not necessarily have
unique layer-2 addresses, although [IEEE802.1D], page 49, section 7.12.2, states: “The
individual MAC Entity associated with each Bridge Port shall have a separate individual
MAC Address”. For instance, all Alteon switches tested (two ACE184) have one single
MAC address for all ports on each device. Under these circumstances it is not possible to
determine exactly which ports of neighboring devices are connected to each other.
   A third method for discovery based on the functioning of bridges is “Layer-2 Path
Discovery Using Spanning Tree MIBs” [STOT2002]. As the name suggests, this discovery
uses the STP data stored in each layer-2 capable device to ascertain the network
topology. By default each bridge transmits Bridge Protocol Data Units (BPDUs), which
contain Spanning Tree Protocol information from the sending bridge, including among
other values the Bridge Identifier (BID) and the port identifier of the sending bridge.
The bridge-ID consists of eight octets. Two octets define the priority, followed by six
octets that are recommended to be equal to the lowest-numbered bridge port MAC address,
i.e. the address of port 1. The transmission of STP configuration frames is repeated
periodically, and after a certain time the spanning tree converges to a stable state.
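   The layout of the bridge-ID described above can be illustrated with a short sketch.
The following Python fragment is not taken from any of the cited papers or tools; the
function name split_bridge_id and the sample value are assumptions chosen purely for
illustration. It splits an eight-octet bridge-ID into the two-octet priority and the
six-octet MAC address.

```python
def split_bridge_id(bridge_id):
    """Split an 8-octet bridge-ID into (priority, MAC address string)."""
    if len(bridge_id) != 8:
        raise ValueError("a bridge-ID consists of exactly eight octets")
    # The first two octets carry the priority (big-endian).
    priority = int.from_bytes(bridge_id[:2], "big")
    # The remaining six octets carry the bridge MAC address.
    mac = ":".join(f"{octet:02x}" for octet in bridge_id[2:])
    return priority, mac


# Hypothetical bridge-ID: priority 0x8000 followed by a made-up MAC address.
sample = bytes.fromhex("800000a0244e23cb")
priority, mac = split_bridge_id(sample)
print(priority, mac)
```

   Dropping the priority in this way yields the address that is referred to later as
“BID with cut priority”.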
   Each bridge stores the STP data in its SNMP Bridge-MIB. By querying all bridges it
is possible to obtain the layer-2 network topology. An advantage of this method is that
even ports in STP blocking state can be detected, as the information about the port is
also advertised. A problem occurred during testing: Cisco Catalyst 5000 switches did not
store the port-ID value, so the STP discovery method cannot be applied to this class of
devices.
   All three vendor-independent methods mentioned assume that all necessary data
contained in the devices is read out before the network topology can be recreated. This
arises from the fact that, for topology determination, ports and devices are identified
by their layer-2 address or their bridge-ID, respectively. Network management tools like
SNMP or telnet use Internet Protocol (IP) addresses or Domain Name System (DNS) names to
address devices, and they communicate using higher-layer (3, 4 and 7) protocols. Besides,
human beings are used to identifying a device by its layer-3 address or name. Although it
is possible to resolve a device's MAC address (the BID without the priority) to an IP
address with the help of Address Resolution Protocol (ARP) caches of layer-3 capable
devices, an IP to MAC address mapping does not have to exist for all device ports. So,
lacking the information which device a certain port-ID belongs to, a neighboring device
must be handled in advance; otherwise a device seen on a port cannot be discovered by
means of SNMP, for instance.
   A method that can circumvent this constraint is described in chapter 2.2.4 “Find
devices by MAC-addresses”. It is based on the idea that port MAC addresses are
consecutive. Using this assumption it is possible to identify, with a certain
probability, the device a port belongs to.
   A further issue arises from the use of VLANs. As the name implies, VLANs are virtual
LANs, which means that each VLAN can have its own set of forwarding entries and Spanning
Tree data. The 2003 edition of [IEEE802.1Q] introduces multiple Spanning Tree instances,
whereas the 1998 edition of IEEE 802.1Q, on page 16, section 6.7, defines a single
spanning-tree instance for the VLAN environment. There is currently no standard SNMP MIB
that reflects the sets of per-VLAN Spanning Tree or Filter Databases. An example of a
vendor-specific way of providing per-VLAN data is given by Cisco devices: appending
“@VLAN-ID” to the SNMP community makes it possible to get the desired information for a
certain VLAN.
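   The community suffix mentioned above can be sketched in a few lines. The helper name
vlan_community and the community string “read-community” are placeholders assumed for
illustration; only the “community@VLAN-ID” form itself is taken from the text.

```python
def vlan_community(community, vlan_id):
    """Return the per-VLAN community string in the form 'community@VLAN-ID'."""
    # VLAN-IDs 0 and 4095 are reserved, so only 1..4094 are accepted here.
    if not 1 <= vlan_id <= 4094:
        raise ValueError("VLAN-ID must be in the range 1..4094")
    return f"{community}@{vlan_id}"


# "read-community" is a placeholder, not a real community string.
for vlan in (1, 10, 20):
    print(vlan_community("read-community", vlan))
```

   A discovery application would have to repeat its forwarding table and Spanning Tree
queries once per VLAN with the community string built this way.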
   Besides all the method-specific limitations, it must be clear that it is only possible
to discover an “active” topology. That means only links that are enabled and are
transporting Ethernet frames can be found by any topology detection method. A port in
state “administratively down” is comparable to a physically unconnected port, so it
cannot transmit a single bit of information. As a consequence, a pair of ports that are
connected to each other, one of which is disabled, will not be discovered, although the
link is part of the layer-1 topology.



1.3.3. Existing software implementing network discovery
   This section mentions some products that belong to the area of network topology
discovery. They are divided into open source and closed source products.



1.3.3.1. Examples for Open Source software
   Both NeDi [NEDI] (version 0.87) and Netdisco [NETDISCO] (version 0.94) generally
use CDP for network discovery. NeDi is a small and handy tool that enables the user to
manage and discover Cisco devices. It is implemented in Perl and uses plain Comma
Separated Value (CSV) files as its database. The web front-end is clear and neatly
designed. The network map layout can be device or location oriented. For reading device
configuration and topology data, SNMP or the Command Line Interface (CLI) via telnet
can be used. The CLI access is also used for sending management commands to one or many
devices. Further features of NeDi are a device stock, a configuration viewer, device
status and report functions and a node viewer (a node being any other device that has a
MAC address). All listing and report functions offer a good range of predefined search
patterns and the ability to enter custom filter expressions. User management is left to
the access control of the web server used.
   NeDi is implemented as one single Perl script that handles discovery and several
Common Gateway Interface (CGI) Perl scripts that are used for the web front-end. The
discovery script performs all information retrieval and database access. It is
controlled by command-line parameters. Unfortunately the NeDi code has shortcomings with
regard to software engineering. The separation of concerns is not very distinct. All
SNMP Object Identifiers (OIDs) and management commands are “hard-coded” in the
respective subroutines, which decreases scalability and robustness. Beyond that, each
subroutine has only a short comment that roughly documents what it does. User
documentation, however, is available at the project's homepage. At the time this work was
written, no developer's guide, system design or analysis existed. These deficiencies make
it hard to enhance NeDi.
   The second example of open source software is Netdisco. It is also based on CDP
discovery, and its implementation language is Perl, too. It is divided into a database
back-end (PostgreSQL [POSTGRES]), a web front-end (built with the help of Mason
[MASON]), a discovery Perl script and a Perl module with helper functions. In contrast to
NeDi, Netdisco accesses devices exclusively by SNMP. Device queries are executed by
SNMP::Info [SNMPINFO], an object-oriented collection of device-class specific
properties and methods. SNMP::Info makes SNMP queries transparent to the user and
removes the “burden” of knowing OIDs or MIB variables. For the device classes known to
it, SNMP::Info is aware of certain issues and works around them. In the background it
uses the Net-SNMP [NETSNMP] Perl modules, which NeDi uses directly. Netdisco separates
database access, SNMP access, discovery and other functions more cleanly. Database
access is made transparent by the netdisco Perl module, which contains subroutines that
take care of all database accesses. Although Netdisco provides somewhat more in-line
documentation, it has no “developer's guide” either. The improvement of Netdisco is part
of this work; therefore a documentation of Netdisco's functioning is presented in chapter
3 “Design and implementation of the sample network discovery application - Netdisco”. In
contrast to NeDi, Netdisco uses an external program (Graphviz [GRAPHVIZ]) for graph
generation. Internal graph handling is done with the Comprehensive Perl Archive Network
(CPAN) module Graph::Undirected [GRAPH], which provides several well-known methods for
handling undirected graphs. This brings the benefit of flexible graph control before any
visualization is done. Netdisco implements its own user management, which is accessible
via the web front-end. Further features are device and node search, device and node
inventory as well as a network map. Device management functions are limited to port
control (enable/disable) and a port report, which shows the reasons why ports were
disabled.
   Both NeDi and Netdisco discover devices one by one. In a large network this may take
very long, up to several hours. This can seriously affect accuracy, because values
stored in the devices age out in the meantime.



1.3.3.2. Examples for Closed Source software
   Examples of closed source software are mentioned in this section. The products
chosen have a high market presence: Cisco CiscoWorks Campus Manager [CISCOWOR] and
Hewlett-Packard OpenView Network Node Manager [OPENVIEW]. Both offer a vast number of
management functions besides network discovery and can be extended with additional
features. A detailed description would be far beyond the scope of this work. Both
products can probably handle all the necessary situations, but in a university
environment there are highly qualified employees who are able to maintain and handle
self-developed solutions, and students who enhance existing products. Universities
therefore have different financial interests, constraints and resources than businesses,
and often there is no money left for commercial products.





1.4. Typical problems
   Typical problems in network discovery result from different types of needs and
efforts. Hardware manufacturers implement proprietary management solutions for their
products. Network managers implement a solution that fits their own network. In both
cases the solutions are not general and cannot be adapted to all combinations of network
devices. A general implementation that is totally vendor-independent is a desirable
goal. The dilemma is that vendors must implement public standards in their devices
before such a common solution can be designed. IEEE 802.1AB is expected to be published
soon. But the question is how much time will pass until manufacturers adopt it, and when
every device in a network will “speak” it. Until then, different solutions must be used.
The improvement of existing software is made harder by sparse documentation and weak
software engineering discipline.



1.5. Goals of this work
   As mentioned above, Netdisco will be improved in this work. The particular points
are the parallelization of device processing and the question whether vendor-independent
network discovery is possible and how it can be realized.







2. A common approach to network discovery
   Up to this point the term “network discovery” has been used in different ways and
with different meanings. It can be used for (network) device discovery, which means that
one or more devices are known and all other devices in the network are detected
starting from a start or seed device. A second meaning is (network) topology discovery.
This is actually not device discovery, but the determination of the devices' topology, a
kind of map showing which devices are connected. An implication of this use is that all
devices forming the network have to be known. So a more common interpretation of
“network discovery” is a combination of both meanings. It defines certain steps that are
necessary to perform network discovery and to create the network topology. These steps
are discussed in the following chapters, including possible methods for achieving the
purpose of each step.
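   The combination of device discovery and topology discovery can be sketched as a
breadth-first traversal that starts at a seed device. The sketch below is only an
illustration under simplifying assumptions: the device names and the NEIGHBORS table are
invented, and the function get_neighbors stands in for whatever method (for example CDP
or forwarding table analysis) actually delivers the neighbors of a device.

```python
from collections import deque

# Hypothetical neighbor tables: device name -> names of directly connected devices.
NEIGHBORS = {
    "seed-switch": ["switch-a", "switch-b"],
    "switch-a": ["seed-switch", "switch-c"],
    "switch-b": ["seed-switch"],
    "switch-c": ["switch-a"],
}


def discover(seed, get_neighbors):
    """Return all devices reachable from the seed and the links found on the way."""
    devices = {seed}
    links = set()
    queue = deque([seed])
    while queue:
        device = queue.popleft()
        for neighbor in get_neighbors(device):
            # A link is undirected, so both directions map to the same entry.
            links.add(frozenset((device, neighbor)))
            if neighbor not in devices:
                devices.add(neighbor)
                queue.append(neighbor)
    return devices, links


devices, links = discover("seed-switch", lambda name: NEIGHBORS.get(name, []))
print(sorted(devices))
print(len(links))
```

   The set of devices is the result of device discovery, the set of links is the raw
material for the topology.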



2.1. Gathering data from devices




                 Figure 2.1 Network management architecture, simplified from [CI02]


   First, the means by which management or configuration data can be transmitted to and
from devices are presented.






2.1.1. Command Line Interface
   The Command Line Interface (CLI) is a shell-like basic command line interpreter
implemented in a device's operating system. It can be accessed through a console port
attached to the device or through a remote session. Console ports are typically serial
RS-232 ports, which can be used with a console terminal or with a terminal emulation. A
terminal that is directly connected to the device has the disadvantage of requiring one
cable per device, which is limited to a certain length and therefore to a certain place.
Access to many devices is very inefficient with this method. But the great advantage over
any other access technique is exactly this “one cable per device” attribute: it makes
access to the CLI possible even if any other network communication is infeasible, for
example because of a configuration or network error.
   Another way to use the CLI is a remote session, which is established over a network
communication port. For that purpose it is necessary to assign an IP address in the
device's configuration, which is used to address the device in an IP network. To
communicate with a certain device, application-layer protocols like Telnet [RFC854] or
Secure Shell (SSH) are used, which transmit data over the layer-4 Transmission Control
Protocol (TCP) [RFC793].
   Access control to devices is enforced through password authentication. In the case of
Cisco devices there are distinct command modes, which represent different privilege
levels. Connection speed is limited by the physical medium used.
   CLI commands depend on the device type and are implemented in a vendor-specific way.
There is no common standard for which set of commands a device supports or which
parameters are accepted. They can even vary between operating system versions on the
same class of device. CLI commands are intended for a human being to set device
properties and configuration, so any output is optimized to be human-readable. Further
processing of the displayed data by a (computer) application is possible, but the output
has to be pre-processed before it can be evaluated by an application. Data types are
implicitly defined by their usage within a command or output.



2.1.2. Simple Network Management Protocol
   The Simple Network Management Protocol (SNMP) [RFC1157] is a public standard
developed for network management. The SNMP architecture consists of network management
stations and network elements. Network management stations are applications that control
and monitor network elements, which are devices that run a management agent. These
agents collect specified data from the devices, which can be retrieved and evaluated by
the management stations. SNMP governs how information is exchanged between the elements
and the stations. Furthermore, the Structure of Management Information (SMI) is used to
define the structure of management objects, their behavior and data types. A Management
Information Base contains the description of all management objects and their data
types.
   SNMP is an application-layer protocol. By default the User Datagram Protocol (UDP)
[RFC768] is used for transmission; alternatively TCP is supported.
   Authentication distinguishes three types of access: read, read-write and trap. The
passwords for these types are called “communities”. Management objects are arranged in a
tree-like hierarchy. Each object within a subtree can be identified by its own number,
the Object Identifier (OID).
   In SNMP version 1, four fundamental operations are specified: GET, GET-NEXT, SET
and TRAP. GET retrieves one piece of management information, whereas GET-NEXT is used
to retrieve sequences of such information. With SET it is possible to pass change
requests to an SNMP agent. A TRAP is a piece of information that an agent sends
unsolicited when a predefined state is reached or a change in conditions has occurred.
The management station that should receive all traps can be specified in the device's
configuration. SNMPv2 defines further types of operations, a notable one being GET-BULK.
This operation is roughly equivalent to a series of GET-NEXT operations. The essential
difference is that GET-BULK tries to send as much data in a single response as feasible,
which implies that incomplete responses are possible. A GET-BULK can save a lot of
protocol overhead compared to a “normal” GET-NEXT. The SNMP header information is sent
only once per response, which in itself is no different from GET-NEXT; but where
possible, more than one value is sent in a single response. Through this accumulation of
values GET-BULK can produce unintended conditions, which the management station must be
aware of.
   For instance, a GET-NEXT requests an OID whose corresponding data is organized as a
table. That means this OID carries information in columns whose entries have an
identifier and a value. A single row consists of the base OID + identifier and the value
stored for it. The first request is sent with the base OID; the SNMP agent then returns
the first value below it, including the OID + identifier. Next, the management station
sends a request with the OID + identifier just received, and the agent answers with the
value of the successor. This exchange of requests and responses continues until the
agent's response contains a value whose OID is outside the range of the requested base
OID. That is the indicator to stop requesting further rows of information. This
operation is called “snmpwalk” in Net-SNMP terminology. A GET-BULK request follows almost
the same procedure, with the difference that the SNMP agent returns as much data as it
can put into one response packet.
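   The walk procedure described above can be made concrete with a small simulation. The
sketch below does not talk to a real device and does not use any SNMP library; the class
FakeAgent, the helper names and the status value in the neighboring column are
assumptions made for illustration. The three port entries are taken from example 2.2.

```python
import bisect


def parse_oid(text):
    """Turn a dotted OID string into a tuple of integers."""
    return tuple(int(part) for part in text.strip(".").split("."))


class FakeAgent:
    """Stand-in for an SNMP agent: a lexicographically sorted table of OID -> value."""

    def __init__(self, objects):
        self._objects = objects
        self._oids = sorted(objects)
        self.requests = 0

    def get_next(self, oid):
        """Return the (OID, value) pair that follows `oid`, or None at the end."""
        self.requests += 1
        index = bisect.bisect_right(self._oids, oid)
        if index == len(self._oids):
            return None
        successor = self._oids[index]
        return successor, self._objects[successor]


def walk(agent, base):
    """Repeat GET-NEXT until the returned OID leaves the sub-tree below `base`."""
    rows = []
    current = base
    while True:
        response = agent.get_next(current)
        if response is None or response[0][:len(base)] != base:
            break
        rows.append(response)
        current = response[0]
    return rows


agent = FakeAgent({
    parse_oid("1.3.6.1.2.1.17.4.3.1.2.0.0.12.20.124.58"): 38,
    parse_oid("1.3.6.1.2.1.17.4.3.1.2.0.80.15.179.43.27"): 27,
    parse_oid("1.3.6.1.2.1.17.4.3.1.2.0.160.36.78.35.203"): 13,
    # Assumed entry in the neighboring column; it terminates the walk.
    parse_oid("1.3.6.1.2.1.17.4.3.1.3.0.0.12.20.124.58"): 3,
})
rows = walk(agent, parse_oid("1.3.6.1.2.1.17.4.3.1.2"))
print(len(rows), "rows,", agent.requests, "requests")
```

   In this simulation a column with n rows costs n + 1 requests, because the last
response is the one that lies outside the requested sub-tree.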
   Connection speed is limited by the bandwidth of the transmitting network (port) and
by the speed at which the SNMP agent is able to generate responses.






2.1.3. Comparison of device access methods
   For effective and centralized management, access from many points (machines) in the
network is desirable. Security reasons suggest having a self-contained management
subnet. A serial CLI connection may be the last possibility to regain access to a
cut-off device, but it is not a flexible solution for retrieving information from many
devices. SNMP traps are asynchronous events, so they can be used to accumulate
information and to monitor certain values and conditions. For gathering data in a
flexible and deterministic way, however, other methods are preferred.
   Telnet CLI has a user-friendly operation paradigm. This makes it easy for human
beings to navigate through device data, but at the same time it makes it more complex to
use the data within an application, because the data has to be processed and arranged
into a “computer-friendly” format. Command sets and structure vary from vendor to vendor
and from device class to device class. Table 2.1 gives a comparison between CLI and
SNMP.

     access method              CLI                                    SNMP
     communication protocol     TCP*/UDP                               UDP*/TCP
     application protocol       telnet/ssh                             SNMP
     authentication             password                               SNMP-community
     command specification      per device-class                       SNMP
     data types                 implicit                               SNMP-MIBs
     purpose                    human interaction and configuration    management and configuration
     speed limitation           port-speed                             port-speed

     * default


        Table 2.1 Comparison of Command Line Interface and Simple Network Management Protocol


   SNMP presents a well-defined way to communicate with network devices. Data types,
navigation and command sets are the same across all devices that implement an SNMP
agent. This universality comes at the cost of an increased amount of protocol overhead.
   As an example, the MAC address tables of a Cisco Catalyst 3524 and a Cisco Catalyst
1912 switch are read out, first by means of a telnet CLI session:


   telnet hostname, password, show mac-address-table


   See example 2.1 for the output from the C3524 and example 2.3 for the output from the
C1912. Second, with an SNMP query:






   snmpwalk -c read-community hostname -On 1.3.6.1.2.1.17.4.3.1.1
   snmpwalk -c read-community hostname -On 1.3.6.1.2.1.17.4.3.1.2


   See example 2.2 for the output from the C3524 and example 2.4 for the output from the
C1912. The MAC addresses and the device ports on which they were seen are the
information of interest.
   Output from the CLI is easier to read. But for processing by an application it would
be necessary to extract the address in example 2.1 from column one and the port from
column four, whereas in example 2.3 the corresponding port information is listed in
columns two and three. In contrast to that, in both examples 2.2 and 2.4 a MAC address
is listed as the value of an instance of OID “.1.3.6.1.2.1.17.4.3.1.1”, and the
corresponding port is the value for the same instance identifier in OID
“.1.3.6.1.2.1.17.4.3.1.2”. For example, MAC address “00 A0 24 4E 23 CB” can be found
with the instance identifier “.0.160.36.78.35.203”, and the port it has been seen on can
be found in OID “.1.3.6.1.2.1.17.4.3.1.2” with the same instance identifier
“.0.160.36.78.35.203” appended; it is “13”. This value can be translated with the help
of the IF-MIB (1.3.6.1.2.1.2) [RFC2863] objects ifIndex (.2.1.1) and ifDescr (.2.1.2)
to the layer-1 name of the device port: “FastEthernet0/1”. This seems to be more
complex, but it is the same technique for all devices, whereas output from the CLI has
different structures for different devices. For instance, the command to view forwarding
entries on a Cisco C3524 is:
   show mac-address-table

whereas the same information is obtained from an Alteon ACE180 switch by issuing:
   /info/fdb/dump
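   The relation between a MAC address and its instance identifier, as described above,
can be sketched as follows. The function names are assumptions made for illustration,
and the two input lines are copied from example 2.2; no SNMP library is involved, the
sketch only processes the textual output of snmpwalk.

```python
FDB_PORT = "1.3.6.1.2.1.17.4.3.1.2"  # column holding the port of a forwarding entry


def mac_to_instance(mac):
    """'00 A0 24 4E 23 CB' -> '.0.160.36.78.35.203'"""
    return "".join(f".{octet}" for octet in bytes.fromhex(mac))


def instance_to_mac(instance):
    """'.0.160.36.78.35.203' -> '00 A0 24 4E 23 CB'"""
    return " ".join(f"{int(part):02X}" for part in instance.strip(".").split("."))


def ports_by_mac(walk_output):
    """Map each MAC address to the port number found in the snmpwalk output."""
    prefix = "." + FDB_PORT
    result = {}
    for line in walk_output.splitlines():
        oid, _, value = line.strip().partition(" = Integer: ")
        if oid.startswith(prefix + "."):
            result[instance_to_mac(oid[len(prefix):])] = int(value)
    return result


PORT_LINES = """\
.1.3.6.1.2.1.17.4.3.1.2.0.0.12.20.124.58 = Integer: 38
.1.3.6.1.2.1.17.4.3.1.2.0.160.36.78.35.203 = Integer: 13
"""

print(mac_to_instance("00 A0 24 4E 23 CB"))
print(ports_by_mac(PORT_LINES))
```

   The same three functions work unchanged for the output of the C1912 in example 2.4,
which is the point of the comparison made above.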








  0000.0c14.7c3a         Dynamic             1       FastEthernet0/24
  0050.0fb3.2b1b         Dynamic             1       FastEthernet0/14
  0090.a69a.5fff         Dynamic             1       FastEthernet0/14
  0090.f228.0a1b         Dynamic             1       FastEthernet0/22
  00a0.244e.23cb         Dynamic             1       FastEthernet0/1


                             Example 2.1 Output from C3524 CLI




  .1.3.6.1.2.1.17.4.3.1.1.0.0.12.20.124.58 = Hex-String: 00 00 0C 14 7C 3A
  .1.3.6.1.2.1.17.4.3.1.1.0.80.15.179.43.27 = Hex-String: 00 50 0F B3 2B 1B
  .1.3.6.1.2.1.17.4.3.1.1.0.144.166.154.95.255 = Hex-String: 00 90 A6 9A 5F FF
  .1.3.6.1.2.1.17.4.3.1.1.0.144.242.40.10.27 = Hex-String: 00 90 F2 28 0A 1B
  .1.3.6.1.2.1.17.4.3.1.1.0.160.36.78.35.203 = Hex-String: 00 A0 24 4E 23 CB

  .1.3.6.1.2.1.17.4.3.1.2.0.0.12.20.124.58 = Integer: 38
  .1.3.6.1.2.1.17.4.3.1.2.0.80.15.179.43.27 = Integer: 27
  .1.3.6.1.2.1.17.4.3.1.2.0.144.166.154.95.255 = Integer: 27
  .1.3.6.1.2.1.17.4.3.1.2.0.144.242.40.10.27 = Integer: 36
  .1.3.6.1.2.1.17.4.3.1.2.0.160.36.78.35.203 = Integer: 13


                            Example 2.2 Output from C3524 SNMP




  0003.6BBF.7E56      FastEthernet 0/27 Dynamic               All
  00A0.244E.23CB      FastEthernet 0/27 Dynamic               All


                             Example 2.3 Output from C1912 CLI




  .1.3.6.1.2.1.17.4.3.1.1.0.3.107.191.126.86 = Hex-String: 00 03 6B BF 7E 56
  .1.3.6.1.2.1.17.4.3.1.1.0.160.36.78.35.203 = Hex-String: 00 A0 24 4E 23 CB

  .1.3.6.1.2.1.17.4.3.1.2.0.3.107.191.126.86 = Integer: 27
  .1.3.6.1.2.1.17.4.3.1.2.0.160.36.78.35.203 = Integer: 27


                            Example 2.4 Output from C1912 SNMP
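   The effort caused by the differing CLI formats of examples 2.1 and 2.3 can be
illustrated with a sketch in which each device class needs its own parser. The function
names and the PARSERS table are assumptions made for illustration; the two input lines
are taken from the examples above.

```python
def normalize_mac(text):
    """'00a0.244e.23cb' -> '00 A0 24 4E 23 CB'"""
    digits = text.replace(".", "").upper()
    return " ".join(digits[i:i + 2] for i in range(0, 12, 2))


def parse_c3524(line):
    """Example 2.1 layout: address in column one, port in column four."""
    fields = line.split()
    return normalize_mac(fields[0]), fields[3]


def parse_c1912(line):
    """Example 2.3 layout: address in column one, port in columns two and three."""
    fields = line.split()
    return normalize_mac(fields[0]), fields[1] + fields[2]


# One parser per device class; every further class would need another entry.
PARSERS = {"C3524": parse_c3524, "C1912": parse_c1912}

SAMPLES = {
    "C3524": "  00a0.244e.23cb         Dynamic             1       FastEthernet0/1",
    "C1912": "  00A0.244E.23CB      FastEthernet 0/27 Dynamic               All",
}

for device_class, line in SAMPLES.items():
    print(device_class, PARSERS[device_class](line))
```

   With SNMP, by contrast, one routine is sufficient for both devices, because the
structure of the answer is defined by the MIB and not by the device class.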




   Table 2.2 shows how many packets and bytes are sent during a Telnet, an SNMP
GET-NEXT and an SNMP GET-BULK request for the MAC address table and the corresponding
ports. Because a different number of entries is returned from a CLI session and from an
SNMP request, the bytes-per-entry and packets-per-entry ratios are listed explicitly.
The GET-NEXT request is performed by snmpwalk from the Net-SNMP package and the GET-BULK
request accordingly by snmpbulkwalk. To get the corresponding port of a forwarding entry,
two OIDs have to be requested; see the examples above.






   method         total number   bytes sent   bytes received   bytes   packets sent
                   of entries    to device     from device     total    to device
   telnet CLI          12           4268           4045         8313        77
   snmpwalk            64           6330           6568        12898        66
   snmpbulkwalk        64            760           2711         3471         8


   method         packets received   packets   bytes per   packets     bytes
                    from device       total      entry     per entry   per packet
   telnet CLI            51            128       692.75      10.67       64.95
   snmpwalk              66            132       201.53       2.06       97.71
   snmpbulkwalk           8             16        54          0.25      216.94


                      Table 2.2 Comparison of Telnet, GET-NEXT and GET-BULK




   It can be seen that a GET-NEXT walk issues one packet per returned entry, plus one
packet per requested OID for the request whose answer lies beyond the sub-tree and
therefore marks the end of the walk. 64 entries are returned while two OIDs were
requested. This results in 66 packets per direction and a total of 132 packets. GET-BULK
issues 8 packets to the device: each requested OID causes one packet for the starting
request and three for the following bulks of data. Telnet is TCP based, so the transport
protocol causes a higher overhead. Furthermore, non-printing control characters are
transferred to arrange and enhance the display of values. These characters increase the
total number of bytes sent, but do not carry any information needed for analysis.
   Bytes per entry and packets per entry are proportional to each other, whereas bytes
per packet is inversely proportional to them. This shows that a GET-BULK request is the
most efficient method, because it sends fewer bytes per entry and fewer packets overall.
A logical conclusion is that the fewer packets and bytes are sent, the less time is
needed to transmit the requested information.
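
   The derived columns of Table 2.2 can be recomputed from the measured totals. The
following sketch uses the byte, packet and entry counts from the table; the function and
variable names are chosen for illustration only.

```python
# (bytes total, packets total, number of entries) as measured in Table 2.2
MEASUREMENTS = {
    "telnet CLI": (8313, 128, 12),
    "snmpwalk": (12898, 132, 64),
    "snmpbulkwalk": (3471, 16, 64),
}


def ratios(bytes_total, packets_total, entries):
    """Return (bytes per entry, packets per entry, bytes per packet)."""
    return (
        bytes_total / entries,
        packets_total / entries,
        bytes_total / packets_total,
    )


if __name__ == "__main__":
    print(f"{'method':13s} {'B/entry':>8s} {'P/entry':>8s} {'B/packet':>9s}")
    for method, values in MEASUREMENTS.items():
        per_entry, packets_per_entry, per_packet = ratios(*values)
        print(f"{method:13s} {per_entry:8.2f} {packets_per_entry:8.2f} {per_packet:9.2f}")
```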



2.2. Finding all devices
   One part of topology discovery is to know all devices that comprise a network. This is
an absolutely necessary precondition. Without this knowledge, topology information is
incomplete and the data that is used to determine the network layout cannot be read out
from the devices. Several methods to acquire this knowledge are presented in the
following sections.



2.2.1. Using proprietary protocols for finding devices
   The Cisco Discovery Protocol (CDP) is primarily used to obtain protocol addresses of
neighboring devices and to discover certain properties of those devices. CDP runs over



layer-2 (data link layer). Therefore systems supporting different layer-3 (network layer)
protocols can learn about each other, but CDP does not actively discover a network itself.
   Each device configured for CDP advertises its CDP data by sending periodic messages
to a Cisco-specific multicast address (01:00:0C:CC:CC:CC). This data contains
information about the device's platform, (layer-3) address, capabilities and the name of
the interface the CDP advertisement was sent from. A CDP-capable device therefore knows
exactly which of its interfaces is connected to which other devices and interfaces. With
this knowledge it is possible to start discovery from one device by reading out its CDP
neighbor data to get on to the next device. By repeating this for all devices found in
any device's CDP data, a whole network of connected devices can be found. This
determines the network topology. In networks composed of CDP-capable devices only, this
works without any problems. See figure 2.2 for an example of such a network entirely
comprised of CDP-capable devices A to F.


                Figure 2.2 Example of a network comprised of CDP-capable devices

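
   The repeated reading of CDP neighbor data amounts to a breadth-first walk over the
network. The following Python sketch illustrates this under simplifying assumptions: the
neighbor tables are a hypothetical in-memory dictionary, and read_cdp_neighbors stands in
for the SNMP read of a device's CDP cache. It is not the implementation used in this
work.

```python
from collections import deque

# Hypothetical CDP neighbor tables: device -> [(local port, neighbor, neighbor port)].
CDP_NEIGHBORS = {
    "A": [("Fa0/1", "B", "Fa0/1"), ("Fa0/2", "C", "Fa0/1")],
    "B": [("Fa0/1", "A", "Fa0/1")],
    "C": [("Fa0/1", "A", "Fa0/2"), ("Fa0/2", "D", "Fa0/1")],
    "D": [("Fa0/1", "C", "Fa0/2")],
}


def read_cdp_neighbors(device):
    """Stand-in (assumption) for reading a device's CDP cache via SNMP."""
    return CDP_NEIGHBORS.get(device, [])


def crawl(seed):
    """Breadth-first walk over CDP neighbor data, starting at one seed device."""
    seen, queue, links = {seed}, deque([seed]), set()
    while queue:
        device = queue.popleft()
        for local_port, neighbor, neighbor_port in read_cdp_neighbors(device):
            # Store each link once, independent of the side it was read from.
            links.add(frozenset([(device, local_port), (neighbor, neighbor_port)]))
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen, links


if __name__ == "__main__":
    devices, links = crawl("A")
    print(sorted(devices))
    for link in links:
        print(sorted(link))
```

   Storing a link as an unordered pair of (device, port) tuples avoids counting a
connection twice when it is reported by both of its end points.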
   But there are devices that are not CDP-capable. Those devices can be divided into
two groups: CDP-aware and CDP-unaware devices. CDP-capable devices send, receive and
store CDP data. CDP-aware devices “know” of the existence of CDP through entries in the
bridge's Filtering Database and therefore do not forward CDP frames. CDP-unaware devices
forward CDP frames like any other “normal” multicast frames. For CDP-unaware devices,
static filtering rules can be set in the device's Filtering Database to apply the
knowledge of CDP manually and thereby turn them into CDP-aware devices.
   The problem is that CDP-unaware devices will forward CDP frames to other devices.
If one of these devices is a CDP-capable device, it will receive and store the CDP-data,




as if it came from a directly neighboring device. This results in a wrong topology.
Figure 2.3 shows such a situation, where device C is CDP-unaware. It is not clear which
device of the upper group, A or B, will be seen as a neighbor of the lower group. This
depends on which of the CDP advertisements sent by switch A or B is received last.




  Figure 2.3 Example of a network comprised of CDP devices connected to a single CDP-unaware device




   CDP-aware devices discard CDP frames, which is the correct behavior. But such a
device forms a border of the CDP network topology, because other devices that may be
behind that CDP-aware device are not visible to CDP-capable devices. An example can be
seen in figure 2.4, with device C being CDP-aware.








   Figure 2.4 Example of a network comprised of CDP devices connected to a single CDP-aware device




   CDP information is provided by the CISCO-CDP-MIB. The variables needed to identify a
neighbor device are: cdpInterfacePort, the local device's port index, which is appended
to other MIB variables to indicate the port a piece of information was seen on;
cdpCacheAddress, which contains the first layer-3 address of the neighboring device; and
cdpCacheDevicePort, the name of the neighbor's interface that is connected to the local
port. [CDPMIB]



2.2.2. Finding devices using Link Layer Discovery Protocol (LLDP)
   The Link Layer Discovery Protocol (LLDP), defined in IEEE802.1ab, provides a
vendor-independent way of finding all devices in a network. It works in the same way as
CDP does, sending LLDP data units (LLDPDUs) to neighbor devices. Those data units contain
information about IEEE802.1 and IEEE802.3 device-specific properties. Each vendor can
extend the specific information provided in the LLDP-MIB. The exact behavior and mode of
operation of LLDP can only be determined from drafts, because this standard is not
available so far (March 2005).



2.2.3. Scanning subnets for devices
   Without knowing a neighbor device's layer-3 address it is not possible to communicate
with it by means of SNMP. But as explained above, it is necessary to know all devices in
a network. One way to find out which devices in a network respond to SNMP is to scan the
network for them. The main principle of this method is trying to establish an SNMP
“connection” to a certain IP-address. If an answer is received, the tested IP belongs to
a device and can be marked as available. Otherwise no answer is received from that
IP-address and the request times out. The address is then marked as not available.
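
   The scanning principle can be sketched as follows. The probe function is passed in as
a parameter; the stand-in fake_probe and the address set KNOWN_DEVICES are assumptions
made so that the sketch runs without a network. A real probe would send an SNMP GET and
report failure once the request has timed out.

```python
import ipaddress


def scan_subnet(cidr, probe):
    """Probe every host address of `cidr`; `probe(ip)` returns True on an answer."""
    available, unavailable = [], []
    for ip in ipaddress.ip_network(cidr).hosts():
        address = str(ip)
        if probe(address):
            available.append(address)
        else:
            unavailable.append(address)
    return available, unavailable


# Hypothetical set of addresses that answer SNMP requests (illustration only).
KNOWN_DEVICES = {"10.0.1.1", "10.0.1.5"}


def fake_probe(ip):
    """Stand-in (assumption) for an SNMP request with a timeout."""
    return ip in KNOWN_DEVICES


if __name__ == "__main__":
    up, down = scan_subnet("10.0.1.0/29", fake_probe)
    print("available:    ", up)
    print("not available:", down)
```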
   This method requires that the IP range (IP subnet) the devices are assigned to is
known. If device addresses are set up in a distinct subnet, for instance, a simple
address range can be chosen to try. Or, if the IP-address of a router is known, its ARP
and routing tables can be used to find the devices' IP-addresses. A drawback of this
method arises if device addresses are randomly mixed with host addresses. This causes a
potential security risk, because it cannot be determined whether an IP-address belongs to
a host or to a device. When an SNMP request is sent to an IP-address, it contains the
SNMP read-community in plain text (note: SNMP version 3 supports encryption). If a host
receives that SNMP message, it can easily capture the read-community. As the
read-community can only be used for reading values, it presents only a low security
risk, assuming that the read-write-community is sufficiently different.
   Timeouts caused by SNMP attempts on IP-addresses that are not assigned to devices are
another issue with this method. If the ratio of assigned to unassigned IP-addresses is
very low, a lot of time is wasted waiting for timeouts. The total time spent waiting for
timeouts depends on the ratio R between the average time T_average a device needs to
respond to a request and the value T_timeout set for the time to wait until a request
expires.


                              $$ R = \frac{T_{average}}{T_{timeout}} $$

   The lower R is, the more time is spent waiting for timeouts compared to the time
spent waiting for a device's response to a request. A value of R = 1 means that it takes
as long to retrieve information from a device as it takes to wait for a timeout. R can be
raised by reducing the timeout and/or the number of retries until an IP-address is
marked as not available. It has to be taken into consideration, however, that retries
and timeouts serve a purpose: a device may be too busy to answer for a moment, or a
request may be lost on its way to the device for some other reason. A short time later
the device may be able to respond again, but if no retry is made, the IP-address is
marked as unassigned although it is available.
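
   The influence of timeout and retries on the scan duration can be estimated with a
small calculation. The cost model below is an assumption made for illustration, not
taken from measurements: addresses are probed one after another, every answering device
costs the average response time, and every unanswered address costs one timeout per
attempt.

```python
def ratio_r(t_average, t_timeout):
    """R as defined above: average response time divided by the timeout."""
    return t_average / t_timeout


def waiting_time(n_assigned, n_unassigned, t_average, t_timeout, retries=0):
    """Return (time spent on answers, time spent on timeouts) in seconds.

    Assumed model: sequential probing, (1 + retries) timeouts per silent address.
    """
    answered = n_assigned * t_average
    timed_out = n_unassigned * t_timeout * (1 + retries)
    return answered, timed_out


if __name__ == "__main__":
    # Hypothetical /24 subnet: 10 devices, 244 unassigned addresses.
    answered, timed_out = waiting_time(10, 244, t_average=0.5, t_timeout=2.0, retries=1)
    print("R =", ratio_r(0.5, 2.0))
    print("seconds spent on answers: ", answered)
    print("seconds spent on timeouts:", timed_out)
```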



2.2.4. Find devices by MAC-addresses
   A variation of the scanning approach is a method that tries to identify a device by
its MAC-addresses. The idea behind this is that the MAC-addresses assigned to device
ports are consecutive. As mentioned above, it is not necessarily true that all ports have
a distinct MAC-address, if they have one at all. For this method it is assumed that every
port has its own distinct MAC-address and that those addresses are assigned consecutively by




vendors. This makes it possible to find the device a particular MAC-address belongs to
with a certain probability.
   Yet another condition must be satisfied. To be able to resolve a MAC-address to an
IP-address, there must be an ARP-table in which those mappings can be looked up. One way
to get such an ARP-table is to read a layer-3 device's ARP-cache. But if no layer-3
device's IP-address is known, it is not possible to get this information. Another way is
to read out the local ARP-cache of the management host. Now the problem is to keep the
local ARP-cache up to date or to fill it with all necessary addresses. For this purpose
an ICMP echo request can be used. A logical consequence is that the management host must
be able to reach any device in the network by ICMP and that the devices respond to ICMP
messages. To speed up the exchange of ICMP echo requests and replies, an IP broadcast
address can be used to determine which devices are alive and respond to the ICMP echo
request.
   Each IP-address of a device that seems to be alive is then sent another ICMP echo
request. This second request triggers an ARP request, which a broadcast ping does not.
This fills the local ARP-cache with the address information of the evaluated device.
   A further possibility to get a non-device ARP-cache is to run a reverse-ARP daemon.
This requires address mappings to be maintained manually and is therefore not as
flexible as automatic detection.
   To determine the devices the known port MAC-addresses belong to, it is necessary to
know the IP-address of a seed device and the information it supplies through the
BRIDGE-MIB. This MIB contains the MAC-addresses seen on that device. Such a retrieved
address is taken and compared to all known port MAC-addresses. If no port matches, the
assumption of consecutive port MAC-addresses is applied.
   This is achieved by subtracting the current MAC-address from each MAC-address in the
local ARP-cache. The IP-address that corresponds to the cached MAC-address with the
smallest absolute difference is considered to be the device the address belongs to. From
this device all port MAC-addresses are read and compared to the MAC-address under
investigation. If it is in the set of port addresses of that device, the desired device
is found; otherwise the device with the second smallest absolute difference is taken
next. This process is continued until the desired device is found. All port MACs are
held in a list. Once all devices have been read, all available addresses of the network
are known and the mapping between a MAC-address and a port is always successful.
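
   The search by smallest address difference can be sketched as follows. The ARP-cache
and the port address sets are hypothetical sample data, and read_port_macs stands in for
reading a device's port MAC-addresses via SNMP. MAC-addresses are assumed to be written
in lower-case colon notation.

```python
def mac_to_int(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' into an integer for subtraction."""
    return int(mac.replace(":", ""), 16)


def find_owner(target_mac, arp_cache, read_port_macs):
    """Return the IP-address of the device that owns target_mac, or None.

    arp_cache: {mac: ip}; read_port_macs(ip) returns the set of port MACs of that device.
    Candidates are tried in order of increasing absolute address difference.
    """
    target = mac_to_int(target_mac)
    candidates = sorted(arp_cache, key=lambda mac: abs(mac_to_int(mac) - target))
    for mac in candidates:
        ip = arp_cache[mac]
        if target_mac in read_port_macs(ip):
            return ip
    return None


# Hypothetical sample data (illustration only).
ARP_CACHE = {
    "00:08:e3:94:01:80": "10.0.1.111",
    "00:30:85:03:71:40": "10.0.1.120",
}
PORT_MACS = {
    "10.0.1.111": {"00:08:e3:94:01:81", "00:08:e3:94:01:82"},
    "10.0.1.120": {"00:30:85:03:71:41"},
}


def read_port_macs(ip):
    """Stand-in (assumption) for an SNMP read of all port MAC-addresses."""
    return PORT_MACS.get(ip, set())


if __name__ == "__main__":
    print(find_owner("00:08:e3:94:01:82", ARP_CACHE, read_port_macs))
```

   In this sketch every candidate device is queried again for each lookup; keeping the
port MAC-addresses that were already read in a list, as described above, avoids these
repeated reads.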
   This method can save time by avoiding unnecessary timeouts. Furthermore, the risk of
accidentally sending the read-community to an untrusted host is reduced if the ICMP echo
requests are only sent to devices belonging to a management subnet.
   It cannot be determined whether all devices are found with this method. To ensure that the




device list is complete, ARP-caches of layer-3 devices have to be evaluated, too.



2.2.5. Device list
   Maintaining a list of all devices in a network is another simple method to know all
devices. Like every manually maintained catalog, it must be updated accurately and
regularly. If this policy is not followed consistently, errors will occur. Where a
device stock or inventory is maintained, such a list can be generated from these
databases.



2.2.6. Comparison of ways to find all devices
   The best solution, in terms of least additional effort, is CDP. All information
necessary to know its neighbors is contained within a device. With that data all
“CDP-speaking” devices can be reached without any further operations to resolve a
neighbor's IP-address or DNS-name.
   Scanning an address range is the simplest algorithm. If timeouts and sending the SNMP
read-community to any address are acceptable, this method is easy to implement and works
in any environment.
   Guessing devices by their MAC-addresses reduces the flaws of address scanning, but
they are still present.
   A manually maintained list is also an easy way to keep track of devices, but it may be
out of date.



2.3. Generate topology from data
   Now that all devices of a network are known, it must be clarified what data is needed
to create a network's topology and how this data must be processed to form it.
   To determine a topology, the addresses of a device and its ports must be known.
Different topologies use different addresses for one device or port. The connections
between devices are determined by a pair of ports, one at each device.
   Human-readable forms of a device or port address or name can be used for
representation, but the mappings between the addresses of the different topologies must
be kept.








             Value                          Device                              Port
          layer-1 name                     C3524XL                       FastEthernet 0/1
               serial                    FAA0446F20F
            IP-address                    10.0.1.111                         10.0.1.112
           DNS-name                       c3524-1                        c3524-1-mngt-port
          MAC-address                 00:08:e3:94:01:81                  00:08:e3:94:01:82
            STP-ID                    00:30:85:03:71:40                         41


                               Table 2.3 Example for different IDs


   In a layer-2 topology, for instance, MAC-addresses can be used to identify a device
port or a device itself. Spanning Tree Protocol identifiers for devices and ports can
also be used. In a visualization of such a topology it is more common to use layer-3
addresses or DNS-names for devices and layer-1 names for ports. See table 2.3 for
examples of different types of IDs.
   Hosts extending a network topology are identified in the same manner as devices: by
their IP-address or DNS-name.
   Regardless of which type of address is used, it must identify a device or port
unambiguously in a certain topology. Devices can have multiple DNS-names, IP- and
MAC-addresses. Layer-1 names and serial numbers are not necessarily accessible by
querying a device. This makes it hard to find a unique identifier for an automated
network discovery. By making assumptions and constraints that can be applied to a given
network, an IP-address or DNS-name is nevertheless a proper identifier. For instance, in
a network where devices have only one DNS-name and multiple IP-addresses, the device's
DNS-name can be used as unique ID. If such a network furthermore has a management
subnet which contains all devices, every device has one of its IP-addresses within that
subnet. With that constraint the management IP of a device can be used as unique device
ID.
   Further data that does not determine the network topology but offers information on
the properties of a certain link, device or port is to be retrieved as well. For a
physical connection in a layer-1 topology, for example, its speed, duplex information
and port state are of interest. Spanning Tree operational data, VLAN-IDs and forwarding
tables extend the layer-2 topology information, which represents logical connections.
   All this data is collected within the devices' SNMP agents, which supply it via
SNMP MIBs. Which MIBs these are exactly and how their values are formed into network
topologies is described in the following sections.






2.3.1. Common data model




        Figure 2.5 Entity Relationship Diagram of a common layer-1 and layer-2 topology model


   An Entity Relationship Diagram (ERD) presenting the common data model of a layer-
1 and layer-2 network topology is shown in figure 2.5. Because layer-1 represents the
physical network and layer-2 the logical network topology, differences between both
may appear. Figure 2.6 shows what deviations are possible.








                     Figure 2.6 Differences in Layer-1 and Layer-2 Topologies


   [IEEE802.3] link aggregation defines how multiple physical ports can be used as a
single logical MAC-bridging port. Such a port therefore uses a single STP instance.
VLANs split a single bridged LAN into several ones, and the Spanning Tree Protocol
“cuts” possible connections that form a loop in the network. Combinations of Spanning
Tree, VLAN and aggregation topologies also occur.
   Table 2.4 lists the values that are necessary to determine a connection in a layer-1
and a layer-2 topology.

             Layer   Values necessary to determine a connection
               1     Device-ID, Port-ID
               2     Device-ID, Port-ID; Port: VLAN-information, STP-state,
                     link-aggregation data


              Table 2.4 Values necessary to determine layer-1 and layer-2 connections




2.3.2. Topology by Cisco Discovery Protocol (CDP) data
   As mentioned in section 2.2.1 “Using proprietary protocols for finding devices“ CDP




collects all layer-1 data necessary for building a layer-1 CDP-network topology. This
data is stored explicitly in each device, without having to query any other device.



          MIB                   OID-variable name              Purpose
   CDP-MIB               cdpCache                     CDP-neighbor data
   CDP-MIB               cdpCacheAddress              the neighbor's layer-3 address
   CDP-MIB               cdpCacheDevicePort           the neighbor's port name
   IF-MIB                ifIndex                      mappings of different port indexes
   IF-MIB                ifDescr                      IF-MIB port name
   VTP-MIB               vlanTrunkPortNativeVlan      native VLAN of VLAN-trunk ports
   VTP-MIB               vlanTrunkPortVlansEnabled    VLANs enabled on VLAN-trunks
   VlanMembership-MIB    vmVlan                       native VLAN of a non-VLAN-trunk port


                  Table 2.5 MIB variables necessary to determine CDP connections


   The MIB object-table cdpCache (OID: 1.3.6.1.4.1.9.9.23.1.2) from [CDPMIB] contains
this information. cdpCacheAddress (.1.1.4) holds the layer-3 address of a neighboring
device and cdpCacheDevicePort (.1.1.7) the layer-1 address of the corresponding port on
it. Both objects have the same instance identifier, which can be mapped via IF-MIB
(1.3.6.1.2.1.2) [RFC2863] ifIndex (.2.1.1) and ifDescr (.2.1.2) to the layer-1 name of
the device's local port.
   Layer-2 data is partly contained in VtpMIB (1.3.6.1.4.1.9.9.46) and
VlanMembershipMIB (1.3.6.1.4.1.9.9.68). VtpMIB contains trunk-port-specific information,
such as the native VLAN in vlanTrunkPortNativeVlan (.1.6.1.1.5) and the VLANs a trunk
port is able to handle frames for in vlanTrunkPortVlansEnabled (.1.6.1.1.4).
VlanMembershipMIB vmVlan (.1.2.2.1.2) holds the VLAN-ID a non-trunk port is assigned to.
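
   The mapping from a cdpCache instance identifier to the local port name can be
sketched as follows. The sketch assumes that the tables have already been read into
dictionaries, that the instance identifier has the form "ifIndex.deviceIndex", and that
cdpCacheAddress holds an IPv4 address as four raw bytes. All sample values and the
function name cdp_links are hypothetical.

```python
def cdp_links(cache_address, cache_device_port, if_descr):
    """Return [(local port name, neighbor IP, neighbor port name)].

    cache_address:     {instance: bytes}  from cdpCacheAddress
    cache_device_port: {instance: str}    from cdpCacheDevicePort
    if_descr:          {ifIndex: str}     from IF-MIB ifDescr
    """
    links = []
    for instance, neighbor_port in cache_device_port.items():
        # The first sub-identifier of the instance is the local ifIndex.
        if_index = int(instance.split(".")[0])
        neighbor_ip = ".".join(str(octet) for octet in cache_address[instance])
        links.append((if_descr[if_index], neighbor_ip, neighbor_port))
    return links


if __name__ == "__main__":
    # Hypothetical values as they might be read from one device.
    address = {"2.1": bytes([10, 0, 1, 112])}
    device_port = {"2.1": "FastEthernet0/1"}
    descr = {1: "FastEthernet0/1", 2: "FastEthernet0/2"}
    for link in cdp_links(address, device_port, descr):
        print(link)
```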
   CDP information alone does not provide data on hosts, unless they have implemented a
CDP daemon. Any additional data must be obtained from other standard MIBs.
   CDP and the VLAN Trunk Protocol (VTP) both transmit configuration frames via layer-2.
Therefore only active ports, meaning ports that are not administratively disabled, can
provide topology data.
   If CDP is enabled, each device port holds information about its neighboring device and
port. To determine the network topology, each connection has to be found, which requires
collecting data from all device ports. The amount of data to retrieve and the number of
operations to build the network topology are therefore directly proportional to the
number of ports P in a network. Thus the complexity of both the data amount and the
number of operations can be estimated as O(P).



2.3.3. Topology by Spanning Tree Protocol (STP) data
   Spanning Tree information is kept in each bridge, both for the bridge itself and for
each of its ports. The 1998 edition of [IEEE802.1D] defines how the Spanning Tree of a
LAN is determined. The 2004 edition of [IEEE802.1D] defines the Rapid STP, but the
methods in this work relate to the “simple” STP.




         Figure 2.7 Example of a bridged LAN, showing Spanning Tree port states and roles,
                        simplified from [IEEE802.1D], page 141 figure 17-4


   Each device taking part in the Spanning Tree Protocol receives and sends Bridge
Protocol Data Units (BPDUs), which a bridge uses to calculate values for the current
network environment as seen by itself. Configuration values are transmitted using the
multicast MAC-address 01:80:C2:00:00:00. BPDUs are not forwarded by any bridge port.
[IEEE802.1D]
   Figure 2.7 shows an example of a bridged LAN with the roles and states assigned to the
bridge ports. Bridges 111 through 444 form a kind of backbone of this network. All ports
are connected to each other, which introduces loops in the network. But the Spanning
Tree Protocol determines an acyclic topology by blocking links within the loop. It can be
seen that bridge 111 is the Root Bridge; all of its ports are Designated Ports. All other
bridges, in contrast, have one Root Port connected to a Designated Port and one port in
Blocking (or Discarding) State. Ports store the Bridge and Port ID of



the next Designated Bridge towards the path to the Root Bridge, regardless if a port is
in Forwarding or Blocking state.




       MIB            OID-variable name                 Purpose
   Bridge-MIB   dot1dBaseBridgeAddress         the bridge's base address
   Bridge-MIB   dot1dStpPortDesignatedBridge   base bridge address of the designated bridge
   Bridge-MIB   dot1dStpPortDesignatedPort     the port ID of the designated bridge
   IF-MIB       ifIndex                        mapping of different port indexes
   IF-MIB       ifDescr                        IF-MIB port name


                   Table 2.6 MIB variables necessary to determine STP connections




   Spanning Tree data is available through the Bridge-MIB object-table dot1dStp
(1.3.6.1.2.1.17.2). For each device port the MIB object dot1dStpPortDesignatedBridge
(.15.1.8) contains the bridge-ID of the Designated Bridge. If a port lies on the path
away from the Root Bridge, it contains the device's own bridge-ID. Furthermore,
dot1dStpPortDesignatedPort (.15.1.9) contains the port-ID of the Designated Port of the
Designated Bridge. [RFC1493]
   Reading the Spanning Tree information from each bridge and building a map out of the
collected data determines the layer-2 network topology. It includes all ports that are
taking part in the Spanning Tree Protocol. So all ports except disabled and unconnected
ones are detected by this topology discovery.
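
   Building the map from the collected data can be sketched as follows. The sketch
assumes that the designated bridge and designated port of every port have already been
read from each bridge. Bridge and port identifiers are simplified to short strings and
integers here, and the sample values are hypothetical; real identifiers are the octet
strings supplied by the Bridge-MIB.

```python
def stp_links(bridges):
    """Derive links from per-port designated bridge and designated port values.

    bridges: {bridge_id: {port_id: (designated_bridge, designated_port)}}
    A port whose designated bridge is another bridge is connected to the
    designated port of that bridge.
    """
    links = set()
    for bridge_id, ports in bridges.items():
        for port_id, (designated_bridge, designated_port) in ports.items():
            if designated_bridge != bridge_id:
                links.add(((bridge_id, port_id), (designated_bridge, designated_port)))
    return links


# Hypothetical sample data with simplified identifiers (illustration only).
BRIDGES = {
    "111": {1: ("111", 1), 2: ("111", 2)},  # Root Bridge: own ID on all ports
    "222": {1: ("111", 1), 2: ("222", 2)},  # port 1 faces the Root Bridge
    "333": {1: ("111", 2), 2: ("222", 2)},  # port 2 faces bridge 222
}

if __name__ == "__main__":
    for local, remote in sorted(stp_links(BRIDGES)):
        print(local, "<->", remote)
```

   Ports that hold the device's own bridge-ID are skipped, because the corresponding
link is reported by the bridge on the other end of the connection.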
   The requirements for this method are that the Spanning Tree Protocol is running on
the bridges and that the Spanning Tree has converged, which means that the MIB tables are
up to date and no changes occur during topology discovery.
   If VLANs are used, this method can be applied to each individual VLAN, provided this
information is available. Furthermore, it is important to know which “flavor” of STP is
implemented in the network devices and how STP management data is provided. In a Cisco
environment, for example, Bridge-MIB tables are kept for each VLAN and can be requested
by appending “@VLAN-ID” to the SNMP read community string.
   Spanning Tree data itself does not provide topology information that can be used to
discover hosts. This has to be done by other means. [STOT2002] describes a method of
“endpoint analysis” with the help of STP and forwarding entries.
   The complexity estimate is the same as for CDP, because data is kept for each port
and each port has to be evaluated only once to determine a connection. Therefore the
complexity of both the data amount and the operations is O(P).
   To turn device and port names into more human-readable ones, it is possible to
translate the Bridge- and Port-IDs into the schemes mentioned in section 2.3. Again, this
can be achieved by stripping the instance identifier from an entry and using this
identifier with the IF-MIB table entries to determine the desired notation.



2.3.4. Topology by Filtering Database information




                    Figure 2.8 Relaying of MAC-frames, [IEEE802.1D] figure 7-4


   The following two methods rely on the core function of layer-2 bridges, the relaying
of Media Access Control frames. Each user data frame passing through a bridge port
contains a source and a destination layer-2 address. The source address of frames
received by a port is stored in the bridge's Filtering Database (FDB); see [IEEE802.1D]
section 7.8 for detailed information and figure 2.8 for an illustration.
   The default aging time of such entries is 300 seconds, as stated in [IEEE802.1D] table
7-5. Assuming each distinct device port and each end station's network adapter possesses
its individual MAC address, they advertise their position and address whenever they
transmit a frame. Strictly speaking, an FDB entry stores the direction, relative to the
traversed bridge, in which a frame destined for a certain MAC address has to be sent.
This implies that each device not providing FDB tables will be treated like an end
station.





   Sections 2.3.4.1 and 2.3.4.2 describe how set operations are applied to FDB tables to
determine the position of a MAC address in the whole active network topology. Active
network means that only ports in learning or forwarding state collect forwarding
entries. [IEEE802.1D] section 7.8
   As with the other methods, no changes are expected during discovery. This is hard to
achieve, because retrieving all FDB data from all devices in a large network may take
longer than 300 seconds. Within this period table entries age out, so the tables change
or provide insufficient topology information.
   FDB data is available through Bridge-MIB dot1dTpFdbTable (1.3.6.1.2.1.17.4.3). If
VLANs are used, FDB methods can be applied to each individual VLAN. [IEEE802.1Q] section
7.4 describes how forwarding information is stored in 802.1Q bridges. It depends on the
vendor whether FDB tables are provided on a per-VLAN basis.



2.3.4.1. Complete knowledge of Filtering Databases
   [BREI2000] expects the following conditions for this method. The evaluated network is
a “single subnet switched domain”, that is, each switched domain consists of only one
subnet. There are no VLANs. Address forwarding tables are complete, which means that the
devices hold in their FDBs the full set of MAC addresses reachable from each element's
interfaces in the processed network.
   Two discovery stages are defined: first, discover all nodes in the network; second,
discover all edges between these nodes.
   Discovering all network nodes starts at a router whose IP-address has to be known.
Then all neighboring routers are located by reading the router's ipRouteTable
(1.3.6.1.2.1.4.21) from MIB-II [RFC1213]. This table contains the next-hop routers for
particular destinations. Processing these tables of all routers found results in the full
set of routers in this subnet. To find all switches within the network, each router
interface that is able to perform direct delivery to a subnet is evaluated. The
IP-address of such an interface and the subnet mask of the underlying subnet are taken to
determine the set of IP-addresses in this particular subnet. Each IP-address from this
set is then probed to decide whether it belongs to a router or a switch, by determining
if Bridge-MIB is present and testing the value of ipForwarding (1.3.6.1.2.1.4.1). If it
reports not-forwarding (value 2 in [RFC1213]) and Bridge-MIB is present, the device is a
switch; otherwise it is a router.
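
   The two steps of this node discovery, deriving the candidate addresses of a subnet
and classifying a probed device, can be sketched as follows. The function names are
chosen for illustration only. The classification takes the results of the two SNMP
checks as boolean parameters, so the sketch does not depend on how ipForwarding is
encoded.

```python
import ipaddress


def candidate_addresses(interface_ip, subnet_mask):
    """Return all host addresses of the interface's subnet except the interface itself."""
    network = ipaddress.ip_interface(f"{interface_ip}/{subnet_mask}").network
    return [str(ip) for ip in network.hosts() if str(ip) != interface_ip]


def classify(bridge_mib_present, forwards_ip):
    """Switch if Bridge-MIB is present and IP forwarding is off, otherwise router."""
    if bridge_mib_present and not forwards_ip:
        return "switch"
    return "router"


if __name__ == "__main__":
    # Hypothetical router interface 10.0.1.1 in a subnet with mask 255.255.255.248.
    for address in candidate_addresses("10.0.1.1", "255.255.255.248"):
        print("probe", address)
    print(classify(bridge_mib_present=True, forwards_ip=False))
```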
   The next step is to identify the edges of the network graph. All connections between
device ports are edges. To obtain the edges, the location of the device ports, or rather
of their MAC-addresses, in the network must be determined. This is done by applying
Lemma III.1 from [BREI2000]: “Interfaces S_ij and S_kl are connected to each other if and
only if




                                             - 30 -
                                                         Diploma Thesis – Network Discovery – A. Barthel



                         Aij ∪ Akl =U and Aij ∩ Akl =∅ .”


   Aij is defined as forwarding table of port j on switch i.
   Sij is defined as interface j on switch i.
   U is the set of MAC-addresses corresponding to switches and routers of a subnet s.




                 Figure 2.9 Concept of discovery methods based on forwarding tables


   This Lemma is also called the “Direct Connection Theorem” in [LOWE2002]. The idea
behind it is that the forwarding tables of a pair of interfaces are disjoint sets and that
the union of these sets is complete. If this requirement is met, both interfaces must be
connected to each other. Because those two interfaces split the network into two parts, it
can be said exactly which MAC-addresses are contained in the first part and which are
contained in the second part. The reason for this is the bridges' address learning. If a
network is loop-free, any learned address can only be located in one direction of a switch
port. Figure 2.9 illustrates this concept.
   Each unicast frame crossing from network A to network B will leave its source address
in the forwarding table of port 2 on switch X (X2) and of port 1 on switch Z (Z1). No
MAC-address from network B will be stored on these ports. The same situation occurs when
frames from network B are transmitted to network A. The source addresses of those frames
will be seen on port Z2 as well as on port X1. No frame's source address from network A
appears on port Z2 or on port X1. If the Direct Connection Theorem can be applied to the
forwarding tables of ports X1 and Z1, those ports are connected. Determining all pairs of
ports satisfying the Theorem's requirements results in the network topology of all
switches.
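
   A minimal sketch of this test is given below. It is an illustration under
assumptions, not the algorithm of [BREI2000]: the data layout (a dictionary mapping
(switch, port) to a set of MAC-addresses) and all names are hypothetical, and the
forwarding tables are assumed to be complete and restricted to the set U.

```python
from itertools import combinations


def direct_connections(fdb, universe):
    """Return all port pairs whose forwarding tables are disjoint and cover U.

    fdb:      {(switch, port): set of MAC-addresses learned on that port}
    universe: the set U of all switch and router MAC-addresses in the subnet
    """
    links = []
    for (port_a, table_a), (port_b, table_b) in combinations(sorted(fdb.items()), 2):
        if port_a[0] == port_b[0]:
            continue  # two ports of the same switch cannot form an edge
        if table_a | table_b == universe and not (table_a & table_b):
            links.append((port_a, port_b))
    return links


# Hypothetical data: a1 sits behind X2, b1 behind Z2, mx and mz are the switch addresses.
example_fdb = {
    ("X", 1): {"b1", "mz"},
    ("X", 2): {"a1"},
    ("Z", 1): {"a1", "mx"},
    ("Z", 2): {"b1"},
}
example_links = direct_connections(example_fdb, {"a1", "b1", "mx", "mz"})
```

   In this example only the pair X1 and Z1 satisfies both conditions. The pair X2 and Z2
is disjoint as well, but its union lacks the switch addresses and is therefore not equal
to U.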


   To discover where routers or end stations are attached to the network, the position
of their interfaces' MAC-addresses must be found. They can only be related to a switch
interface which is not connected to another switch. Such an interface is called a leaf
interface. Leaf interfaces do not meet the conditions of Lemma III.1. The leaf interface
whose forwarding table contains the address of a desired router or end station is the
port to which this device is connected.
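   Locating a station on a leaf interface can be sketched like this. The function name
and the data layout are hypothetical, and the set of switch-to-switch ports is assumed to
be known already.

```python
def locate_station(mac, fdb, trunk_ports):
    """Return the leaf ports whose forwarding table contains the given MAC-address.

    fdb:         {(switch, port): set of MAC-addresses learned on that port}
    trunk_ports: set of (switch, port) pairs known to connect two switches
    """
    return [
        port
        for port, table in sorted(fdb.items())
        if port not in trunk_ports and mac in table
    ]
```

   In a network without shared segments the result is expected to contain exactly one
port; an empty result means the address has not been learned on any leaf interface.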
   A mathematical proof of the Direct Connection Theorem is provided in [BREI2000]. It is
also discussed there how the completeness requirement can be met by means of ICMP, or
relaxed by reducing the full set of addresses to a predefined set. Furthermore,
constraints are stated under which the Direct Connection Theorem cannot determine an
exact topology.
   The amount of data to be retrieved depends on the number of ports P within the
evaluated network and the number of entries F in the forwarding table of a single switch
port. This results in a complexity of O(P*F).
   Each pair of forwarding tables and each pair of entries from those tables must be
compared, so the number of comparisons depends on the number of ports and the number of
entries. The complexity of the compare-operations is therefore O(P²*F²).
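   Both bounds can be made explicit under the simplifying assumptions that each of the P
ports holds at most F entries and that two tables are compared entry by entry:

```latex
\[
  \text{data} \le P \cdot F \in O(P \cdot F), \qquad
  \text{comparisons} \le \binom{P}{2} \cdot F^2 = \frac{P(P-1)}{2}\, F^2 \in O(P^2 \cdot F^2)
\]
```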



2.3.4.2. Minimum knowledge of Filtering Databases
   This method basically follows the same approach as the one described before, namely
that a certain MAC-address can be found in only one direction of a port. But this method
reverses the technique for determining a pair of connected ports: it searches for a
contradiction which shows that a pair of ports cannot be “simply” connected.
   Two bridges A and B are simply connected by ports XA and YB if they can send frames
to each other through these ports; other bridges may be traversed, too. A contradiction
to a simple connection is found if two forwarding entries for the same address point into
different directions.
   To ease the notation of the algorithm the term “through set” is introduced. $T_{ix}$
is the through set of switch i on port x, which is the union of the forwarding tables of
all ports of switch i except the table of port x.

      $T_{ix} = (F_{i1} \cup F_{i2} \cup \dots \cup F_{in}) \cap \overline{F_{ix}}$

   In other words, if frames are received on a bridge port x, the through set of this
port contains the addresses those frames will be sent to through this bridge. Figures 2.10
and 2.11 show examples of through sets. A and B are bridges; 1, 2 and 3 are devices or
end stations with distinct MAC-addresses. F stands for FDB and T for through set.
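
   The definition of the through set translates directly into a few lines of code. The
sketch below uses a hypothetical data layout (a dictionary mapping (switch, port) to a
set of MAC-addresses) and is not taken from [LOWE2002].

```python
def through_set(fdb, switch, port):
    """Union of the forwarding tables of all ports of `switch` except `port`."""
    result = set()
    for (sw, p), table in fdb.items():
        if sw == switch and p != port:
            result |= table
    return result


# Hypothetical bridge A: station m1 is seen on port 1, stations m2 and m3 on port 2.
example_fdb = {("A", 1): {"m1"}, ("A", 2): {"m2", "m3"}}
t_a1 = through_set(example_fdb, "A", 1)
t_a2 = through_set(example_fdb, "A", 2)
```

   Here the through set of port 1 contains m2 and m3, the addresses reachable through
the bridge for frames entering on port 1, and the through set of port 2 contains m1.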




                  Figure 2.10 Example of Through sets, adapted from [LOWE2002]




   The Simple Connection Theorem is defined by Theorem 5.1: Let a, b be bridges and let
there exist exactly one pair of ports $a_x$ and $b_y$ such that
$T_{ax} \cap T_{by} = \emptyset$. Then $a_x$ and $b_y$ are (simply) connected.
Furthermore, if $a_x$ and $b_y$ are (simply) connected, then
$T_{ax} \cap T_{by} = \emptyset$. [LOWE2002]
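   The theorem can be illustrated by enumerating all port pairs of two bridges whose
through sets are disjoint. The following is a sketch under assumptions: the names, the
data layout and both example networks are hypothetical and only modelled loosely on the
situations discussed for figures 2.10 and 2.11.

```python
from itertools import product


def through_set(fdb, switch, port):
    """Union of the forwarding tables of all ports of `switch` except `port`."""
    result = set()
    for (sw, p), table in fdb.items():
        if sw == switch and p != port:
            result |= table
    return result


def candidate_links(fdb, a, b):
    """Return all port pairs (x, y) of bridges a and b with disjoint through sets."""
    ports_a = sorted(p for (sw, p) in fdb if sw == a)
    ports_b = sorted(p for (sw, p) in fdb if sw == b)
    return [
        (x, y)
        for x, y in product(ports_a, ports_b)
        if not (through_set(fdb, a, x) & through_set(fdb, b, y))
    ]


# Symmetric case: two stations, two ports per bridge.
ambiguous = {("A", 1): {"m1"}, ("A", 2): {"m2"}, ("B", 1): {"m1"}, ("B", 2): {"m2"}}
# A third station m3 on a third port of bridge A breaks the symmetry.
unique = {
    ("A", 1): {"m1"}, ("A", 2): {"m2"}, ("A", 3): {"m3"},
    ("B", 1): {"m1", "m3"}, ("B", 2): {"m2"},
}
```

   For the symmetric data two port pairs remain without contradiction, so the connection
cannot be decided; for the second data set exactly one pair remains and the theorem
applies.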
   With insufficient information available, more than one pair of ports may satisfy the
condition of the Simple Connection Theorem. Therefore the Minimum Knowledge Re-
quirement is defined by Lemma 5.2:
   The ports x and y that connect bridges a and b are uniquely determined if and only if
at least one of these conditions is met:
  1. Each bridge has an entry for the other's address in its FDB, not including out-of-
     band ports; or
  2. Bridge a has an entry for b in $F_{ax}$ and
     $\exists k \neq x: F_{by} \cap F_{ak} \neq \emptyset$; or
  3. Forwarding entries for three nodes are shared between a and b, divided among at
     least two ports on one of a or b and three ports on the other bridge. x and y must
     be included in those ports. Formally,
     $\exists i, j, i \neq j: F_{ax} \cap F_{bi} \neq \emptyset \wedge F_{by} \cap F_{bj} \neq \emptyset$,
     and $\exists k \neq x: F_{by} \cap F_{ak} \neq \emptyset$.
     [LOWE2002]
   The first condition is trivial: the FDB entries determine exactly which ports are
connected to each other. Condition three is a fully general form of the second one.
Figure 2.11 illustrates it. In contrast to figure 2.10, which provides insufficient
information because bridges A and B are symmetric and could be swapped without having to
change data, figure 2.11 provides the minimum knowledge needed to determine one exact
topology. It can be seen that the additional address information of station 3 is shared
in both bridges' FDBs. With that data it is not possible to exchange bridges A and B,
because port 02 on bridge A is the only FDB with two entries and can therefore only be
connected to bridge B.


       Figure 2.11 Illustration of the Minimum Knowledge Requirement, adapted from [LOWE2002]

     [LOWE2002] section 5.1 proves the Simple Connection Theorem and the Minimum
Knowledge Requirement; the proofs are beyond the scope of this work.
     Finding each pair of ports satisfying the conditions above will result in the active
network topology.
     The amount of data to be read is the same as in method 2.3.4.1 “Complete knowledge
of Filtering Databases”: O(P*F). Although the number of operations depends on how quickly
a contradiction in the through sets, and thus in the forwarding tables, can be found, the
complexity of the number of operations is O(P²*F²), too. The complexity of the through
set entries equals the complexity of the forwarding table entries. For ports that are not
simply connected, the best case is that a contradiction is found in the first comparison
of through set entries. But for ports that are simply connected, more compare operations
have to be done. To check whether the Minimum Knowledge Requirement is met, further
comparisons are necessary: for all ports, each pair of FDBs has to be compared against
each other. Therefore the complexity is O(P²*F²) regardless of best, average or worst
case.



2.3.5. Comparison and conclusion of topology discovery methods
     The choice of a method to discover a network topology depends on several aspects. It
even starts before the network to be discovered exists, namely when the decisions for the
supplying vendor and the network equipment are made. In a vendor-homogeneous network which
provides a proprietary protocol for collecting network topology information, all necessary
data is collected explicitly in every device. There is no need to know any device in
advance, except one seed device, and no further, more or less complex, operations have to
be done to determine the network topology. In the case of CDP, manufacturers besides Cisco
support it as well. The protocol is designed to ease network management and topology
discovery, therefore it is robust against non-standard network environments and special
cases that could otherwise cause it to fail.
   But such a “clean” homogeneous network is easily “polluted” by a single device that
does not support CDP. Vendor independent, general methods must be applied in those
situations. Possible methods are based upon the Spanning Tree Protocol or on Filtering
Database tables. Both are implemented in the basic functionality of bridges. STP data
explicitly carries topology information for active and non-active connections. If the
bridge-ID to IP mappings of all devices are known, a discovery like in the case of CDP
would be possible, but not when only a single seed device is known. FDB methods have the
advantage that even end station and router locations can be determined. But these methods
have the highest costs in operation and data collection.
   A problem common to both vendor independent methods is that some devices do not
provide sufficient information. For example, all Cisco Catalyst 5000 switches tested did
not store the Designated Bridge port data needed by the STP topology proposal. Besides
that, all ACE switches tested had only one distinct MAC-address for all of their device
ports. This makes it impossible to determine the whole network topology by applying only
one of these methods.
   A general solution to network discovery would be to combine all methods and to handle
known troublesome devices as special cases. The order in which the methods are used is
1st CDP, 2nd STP and 3rd FDB. Each method adds those connections to the topology which
have not been detected by the previous methods. Because data retrieval is more expensive
than processing data, and because FDB techniques need much more data for finding
connections, the STP approach is taken before the FDB methods. Furthermore, with CDP and
STP information it is possible to mark ports as switch-trunks. These ports do not have to
be taken into consideration when applying FDB methods for locating end stations or
non-CDP routers. Those trunk-ports already connect a pair of ports, where no other device
can be attached, provided that no shared segment exists.
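
   The combination of the three methods can be sketched as a simple merge in priority
order. The function and the data layout are hypothetical, and the sketch assumes that no
shared segments exist, so that every port takes part in at most one connection.

```python
def merge_topology(*method_results):
    """Merge link lists given in priority order, for example CDP, STP, FDB.

    Each argument is a list of links ((device, port), (device, port)). A link is
    only added if neither of its ports is already part of an earlier link.
    """
    topology = []
    used_ports = set()
    for links in method_results:
        for end_a, end_b in links:
            if end_a in used_ports or end_b in used_ports:
                continue  # already covered by a method with higher priority
            topology.append((end_a, end_b))
            used_ports.update((end_a, end_b))
    return topology
```

   Called as merge_topology(cdp_links, stp_links, fdb_links), a connection reported by
STP is kept only where CDP found nothing, and FDB results only fill the remaining gaps,
such as end stations.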

                      Method                   CDP              STP            FDB
                  Active topology              Yes               Yes            No
                 Not active topology           Yes               Yes            No
                  Router discovery             Yes               No            Yes
                   Host discovery               No               No            Yes
             Detect shared segments             No               n/a           Yes
                  Amount of data               O(P)             O(P)          O(P*F)
                     Operations                O(P)             O(P)         O(P²*F²)
            Life cycle of topology data     “eternal”         “eternal”        300s


                           Table 2.7 Comparison of discovery methods






    All methods have in common that they can only detect connections that are enabled.
Ports that are administratively disabled appear like ones that are not connected. A
solution to this is to supply the missing information manually in some MIB-tables like
IF-MIB's ifAlias (1.3.6.1.2.1.31.1.1.1.18), which assumes that this table is not used for
any other purpose. In addition, manual input contradicts the intention to gather the
network topology fully automatically. CDP and STP discovery are also able to detect
non-active connections, which are in blocking or learning state. Another disadvantage of
FDB methods is that the topology information provided by forwarding entries ages out after
a defined time, because the entries are refreshed “every now and then” only. This can
leave insufficient data to determine the network topology. CDP and STP data, in contrast,
is sent repeatedly until it is explicitly stopped, so it provides topology information
constantly.
    Shared segments connected to hubs, dumb switches without management functions and
devices that do not answer management requests are also problematic for every discovery
method. Hubs add multiple connections where actually only one can exist. That is not a
problem at the boundaries of a network, where for instance end stations are connected,
but in the inner connections¹ between switches it can cause failures in discovery. CDP
will fail when hubs are used, because only the last CDP-advertisement received will be
kept as a neighbor. All other devices potentially connected through a hub will be
“forgotten”. The method based upon STP data will likely work with devices connected to a
hub, due to the Spanning Tree Protocol blocking redundant connections. But it is not
clear whether the network topology is determined correctly. FDB methods, in contrast, are
able to detect shared segments, as they are based on the bridges' functionality. Dumb
switches, or switches not providing sufficient information, constitute “holes” in the
network, where the topology seems to have no connections although some exist, dividing
one network into many. This can be covered by starting discovery in every separate
network.
    Another requirement is that network changes do not occur during discovery. This
cannot be ruled out, but changes of the switched network do not take place often, so
further discovery runs can circumvent this problem. Depending on the method used,
topology data is transient. End stations can only be detected by means of FDB, so they
appear and disappear frequently; therefore this information must be kept over a certain
period of time and across distinct discovery runs.




1   also known as a network's backbone




3. Design and implementation of the sample network
discovery application - Netdisco


3.1. Analysis of Netdisco's architecture
   Netdisco is a sample discovery application based on CDP information. It is entirely
implemented in Perl. Discovery and management functions are controlled by a Perl script.
It is also possible to operate it through a web front-end. All collected topology data is
stored in the PostgreSQL database back-end. This database is accessed through the Netdisco
Perl module netdisco.pm. SNMP data is retrieved with the help of SNMP::Info. Topology
visualization is done with GraphViz. The network image and several management and
information Hypertext Markup Language (HTML) pages are generated by web front-end scripts
that use Mason for dynamically added data. These pages are served by the Apache web
server. Figure 3.1 gives an overview of Netdisco's architecture.








                                    Figure 3.1 Netdisco Architecture



3.1.1. Discovery-script
   The Netdisco Perl script netdisco is the central component of Netdisco. It contains
all functions necessary to operate. Complete interaction with Netdisco is provided by
command line operation. Parameters are arranged into options, network commands, device
commands and administration. Starting Netdisco without any parameter prints a complete
list of possible commands and options.
   The main actions are distinct discovery operations, like discovering a single device
or a whole network or refreshing all known devices; collecting ARP information, called
ARPnip; retrieving forwarding tables, called MACsuck; graph-drawing; enabling debugging;
user control; expiring devices or nodes; and starting and stopping the administration
daemon, which is used by the web front-end. Netdisco itself does not access the back-end
database directly, nor does it request any MIB object or OID from devices. It uses
transparent functions provided by SNMP::Info and the Netdisco module. SNMP::Info
encapsulates SNMP device access and provides a common interface to the user, in this case
Netdisco. SNMP::Info knows several predefined device-classes in which properties and
information are handled class specifically. This enables Netdisco to use uniform access
and retrieval operations for all devices. SNMP::Info furthermore uses the Net-SNMP
package and its Perl modules to perform SNMP operations. Visualization is done with
GraphViz. Netdisco determines the graph to draw and passes it to GraphViz, which
generates the image supplied to the web front-end. All graph properties are set within
the Netdisco script. Certain aspects of the depiction can be modified via Netdisco's
configuration file.



3.1.2. Helper functions
   Helper functions are provided by the Netdisco Perl module netdisco.pm. These
functions give Netdisco access to the back-end PostgreSQL database and serve all database
requests sent by the main script. For this purpose several database operations are
implemented that create database handles, offer predefined DB operations for setting or
deleting certain values and build manual database requests from parameters passed to the
DB. All database access is done with the help of this module, which makes it possible to
switch the back-end Database Management System without having to change functions in
Netdisco itself.
   Netdisco's configuration file netdisco.conf is parsed by the Netdisco module. It
processes all values and saves them in variables, making them globally available in
Netdisco.
   The module furthermore builds the network graph data structure out of all neighbor
information contained in the database. The graph is returned to the Netdisco script,
which visualizes it with GraphViz.
   Some smaller functions are available to determine logical values of certain properties
or to process values and extract information from them. The file README-API-SHARED
provided in the Netdisco documentation gives a compact description of the purpose of the
available functions.
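
   How a graph can be derived from stored neighbor information is sketched below. This
is not Netdisco's Perl code: the function names are hypothetical and the sketch is written
in Python for brevity. It assumes rows of the form (ip, port, remote_ip, remote_port) and
emits a description in GraphViz's DOT language.

```python
def build_graph(port_rows):
    """Collect undirected device-to-device edges from neighbor information."""
    edges = set()
    for ip, _port, remote_ip, _remote_port in port_rows:
        if remote_ip is None:
            continue  # port without a known neighbor
        edges.add(tuple(sorted((ip, remote_ip))))
    return edges


def to_dot(edges):
    """Render the edge set as an undirected graph in the DOT language."""
    lines = ["graph network {"]
    for a, b in sorted(edges):
        lines.append(f'    "{a}" -- "{b}";')
    lines.append("}")
    return "\n".join(lines)
```

   Because each connection is usually reported by both neighbors, the edges are stored as
sorted pairs in a set, so that a link appears only once in the drawing.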






3.1.3. Database back-end




                  Figure 3.2 Entity Relationship Diagram of Netdisco's database tables


   The Netdisco back-end is built upon the object-relational Database Management System
(DBMS) PostgreSQL. Figure 3.2 shows in an Entity-Relationship-Diagram how the database
tables and data structures are related to each other. Attributes marked with “meta” are
additional properties of an entity, which are stored in the corresponding tables. These
are not necessary for determining topology data.








             Table              Information held
             device             device specifics, SNMP parameters used to connect to the device successfully
             device_port        data about device ports, including layer-1/-2 properties and neighbor devices
             device_ip          all devices' IP-aliases
             node               device and port nodes are connected to
             node_ip            IP-addresses of nodes, ARP-data
             users              Netdisco users
             log                logging information
             device_port_log    logging data about Netdisco's port-control feature
             admin              administration daemon requests
             sessions           web front-end session information
             oui                Organizationally Unique Identifiers, map MAC-addresses to vendors


                           Table 3.1 Overview of Netdisco's database tables


   The Netdisco database contains several tables; see table 3.1 for an overview. Table
device, see table 3.2, holds device specific information valuable for management and the
SNMP data with which the device was accessed successfully. (Note: Bold values determine
the primary key of a table, italic values are added to the standard Netdisco tables for
enhancements.)
   Devices are identified by one IP-address. If multiple IPs are assigned to a device,
the IP that is returned first by a DNS resolution is used. Note that this is not always
the same address. All other alias IP-addresses assigned to a device are stored in table
device_ip, table 3.3.






                       Name                     Data type
                       ip                       inet PRIMARY KEY
                       creation                 TIMESTAMP DEFAULT now()
                       dns                      text
                       description              text
                       uptime                   bigint
                       contact                  text
                       name                     text
                       location                 text
                       layers                   varchar(8)
                       ports                    integer
                       mac                      macaddr
                       serial                   text
                       model                    text
                       ps1_type                 text
                       ps2_type                 text
                       ps1_status               text
                       ps2_status               text
                       fan                      text
                       slots                    integer
                       vendor                   text
                       os                       text
                       os_ver                   text
                       log                      text
                       snmp_ver                 integer
                       snmp_comm                text
                       vtp_domain               text
                       last_discover            TIMESTAMP
                       last_macsuck             TIMESTAMP
                       last_arpnip              TIMESTAMP
                       base_addr                macaddr


                                 Table 3.2 Netdisco Database Table device


                       Name                     Data type
                       ip                       inet
                       alias                    inet
                       port                     text
                       dns                      text
                       creation                 TIMESTAMP DEFAULT now()


                                Table 3.3 Netdisco Database Table device_ip


   Each device's port information is stored in table device_port, table 3.4. The primary
key for ports is composed of the IP of the device the port belongs to and its physical
port name (layer-1 address). It must be ensured that port names are unique. For devices
not using distinct port names, this problem can be solved by implementing SNMP::Info
classes. All information necessary to determine the network topology is aggregated in
this device-port table. Furthermore it keeps layer-1 and layer-2 data for each port.
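
   The effect of the composite primary key can be demonstrated with a reduced version of
the table. The sketch uses SQLite through Python's sqlite3 module instead of PostgreSQL,
replaces the inet type by TEXT and keeps only four columns; the helper store_port is
hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE device_port (
        ip          TEXT NOT NULL,   -- inet in the original schema
        port        TEXT NOT NULL,
        remote_ip   TEXT,
        remote_port TEXT,
        PRIMARY KEY (ip, port)
    )
    """
)


def store_port(conn, ip, port, remote_ip=None, remote_port=None):
    """Insert one port row; return False if (ip, port) already exists."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO device_port VALUES (?, ?, ?, ?)",
                (ip, port, remote_ip, remote_port),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

   A second row with the same device IP and port name is rejected, which is why devices
reporting non-unique port names need special treatment before their data is stored.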


                    Name                    Data type
                    ip                      inet
                    port                    text
                    creation                TIMESTAMP DEFAULT now()
                    descr                   text
                    up                      text
                    up_admin                text
                    type                    text
                    duplex                  text
                    duplex_admin            text
                    speed                   text
                    name                    text
                    mac                     macaddr
                    mtu                     integer
                    stp                     text
                    remote_ip               inet
                    remote_port             text
                    remote_type             text
                    remote_id               text
                    vlan                    text
                    lastchange              bigint
                    stp_p_port              text
                    stp_p_bridge            text
                    stp_p_id                text


                           Table 3.4 Netdisco Database Table device_port




   The knowledge of which port a node is connected to is collected in table node, see
table 3.5. It holds the MAC-address of the node as well as the switch and the port it is
attached to.

                    Name                    Data type
                    mac                     macaddr
                    switch                  inet
                    port                    text
                    active                  boolean
                    oui                     varchar(8)
                    time_first              timestamp default now()
                    time_last               timestamp default now()


                               Table 3.5 Netdisco Database Table node


   IP-addresses belonging to the nodes' MAC-addresses are kept in table node_ip, see
table 3.6.


                       Name                     Data type
                       mac                      macaddr
                       ip                       inet
                       active                   boolean
                       time_first               timestamp default now()
                       time_last                timestamp default now()


                                 Table 3.6 Netdisco Database Table node_ip
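
   Together, the tables node and node_ip allow the switch port of a given IP-address to
be looked up. The following sketch shows such a lookup on reduced versions of both
tables. It uses SQLite through Python's sqlite3 module with TEXT and INTEGER in place of
macaddr, inet and boolean; the function names are hypothetical and the query is not
taken from Netdisco.

```python
import sqlite3


def create_node_tables(conn):
    """Create reduced versions of tables 3.5 and 3.6."""
    conn.execute("CREATE TABLE node (mac TEXT, switch TEXT, port TEXT, active INTEGER)")
    conn.execute("CREATE TABLE node_ip (mac TEXT, ip TEXT, active INTEGER)")


def find_port_for_ip(conn, ip):
    """Return (switch, port, mac) for every active node currently using `ip`."""
    query = """
        SELECT n.switch, n.port, n.mac
        FROM node_ip AS i
        JOIN node AS n ON n.mac = i.mac
        WHERE i.ip = ? AND i.active = 1 AND n.active = 1
    """
    return conn.execute(query, (ip,)).fetchall()
```

   The MAC-address acts as the join key: node_ip maps the IP-address to a MAC-address,
and node maps that MAC-address to the switch and port where it was last seen.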


   Further tables exist for Netdisco user management (users), logging (log,
device_port_log), administrative request queuing (admin), session management (sessions)
and the Organizationally Unique Identifiers (OUI) that map MAC-addresses to vendors.



3.1.4. Web front-end

                 HTML-file              Purpose/Content
                 admin.html             administration panel, link to device and user control, maintenance, admin log
                 admin_dev.html         device and discovery operations
                 admin_user.html        Netdisco user management
                 portcontrol.html       linked in each device view, disable/enable ports
                 device.html            shows device and its port properties
                 netmap.html            network map
                 device_search.html     search for devices
                 node.html              search for the port a node is connected to
                 device_inv.html        inventory of devices
                 ip_search.html         node inventory search
                 traceroute.html        layer-2 traceroute
                 duplex.html            duplex mismatch finder
                 port_report.html       report of disabled ports and the reason for it
                 log.html               backend log


                 Table 3.7 Overview of Netdisco's HTML-files used by the web front-end


   To provide a convenient user interface, Netdisco integrates a web front-end. It
consists of dynamic HTML pages, one for each purpose. The information presented in these
pages is generated with the help of Mason, which offers the possibility to write Perl
instructions into HTML pages. The Apache web server delivers the HTML content to the
user's browser. Access to the front-end is restricted by user-name and password. Each
user can have additional permissions necessary for controlling port-state or for full
administrative access. User-management is controlled by Netdisco itself, not by the
Apache web server. Because pages are generated on demand, session management is required
to let users see only the information specific to their current session. Mason supplies
this feature.
   Pages can be classified into administrative and informative ones. Administrative
pages offer device and (Netdisco) user control. Additionally, every operation of the
Netdisco script can be performed. These administrative pages are available from the
“Administration Panel” page (admin.html). Another one is the “device port control” page
(portcontrol.html). It is accessible from the device page (device.html), which can be
reached either from the “Network Map” (netmap.html) or from the “Device Search”
(device_search.html) page.
   Informative pages retrieve values from the database and present the network topology
map. Search functionality for devices and nodes (node.html) is present as well as device
(device_inv.html) and node inventories (ip_search.html). Further network operational data
can be examined through “Layer 2 Traceroute” (traceroute.html) and “Duplex Mismatch
Finder” (duplex.html). The “Port Report” (port_report.html), which shows why and by whom
a port was disabled, the “Backend Log” (log.html) and access to documentation complete
the options of the web front-end.
   With the full set of Netdisco's operations and the Netdisco user-management available,
there is no need to have access to the Netdisco machine once the web front-end is
established. Because user-management is database driven and transparent to Netdisco
itself, an integration into an existing user administration is possible.







3.2. Analysis of Netdisco's mode of operation
   The following section describes how Netdisco discovers a network within distinct
scenarios and how it identifies the network topology from the retrieved data.



3.2.1. Discovery scenarios


3.2.1.1. Discovery of a single device




                                  Figure 3.3 Discovery of a single device


   Discovering a single device is the elementary function of Netdisco's discovery methods.
It retrieves all information belonging specifically to one device, no matter whether it
supports CDP or not. The sub-routine discover acts as a kind of master in the discovery
process of a single device. It distributes work to the distinct functions necessary to
handle a device appropriately.
   To start discovery, Netdisco is called with netdisco -d device, where device can be
either a hostname or an IP-address. Figure 3.4 shows a flow-chart of how discovery of the
desired device is done.
   First it is tested whether the IP's subnet is excluded from discovery, as defined in
netdisco.conf. If so, Netdisco will finish.








               Figure 3.4 Discovery of a single device, Netdisco sub-routine discover



   Otherwise discovery continues by calling get_device, presented in figure 3.5,
with the determined IP as parameter. get_device tries to ascertain whether the device IP is
an alias IP of the device and whether the device is already stored in the database. For a
known device the SNMP version and read-community are taken from the DB, because those were






                              Figure 3.5 Netdisco sub-routine get_device


successfully used to retrieve information from that device before, which avoids possible
timeouts caused by wrong communication settings. With those values create_device is called.
If this call fails or if the device is not known yet, every SNMP read-community in
netdisco.conf is tried until one succeeds. Within these attempts the SNMP version is left to be




figured out by create_device. If all attempts fail, an undefined value is returned
to the sub-routine discover, which then ends discovery altogether, and the device
IP is marked as undiscovered.
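
   The credential fallback of get_device described above can be sketched as follows. This
is a minimal Python illustration, not Netdisco's Perl code: the dictionary known stands in
for the database lookup, communities for the list read from netdisco.conf, and the
create_device argument is a hypothetical callable that returns a session object or None.

```python
def get_device(ip, known, communities, create_device):
    """Return a session for `ip`, or None if no credential works.

    known:         {ip: {"community": str, "version": int}} (stand-in for the DB)
    communities:   read-communities to try (stand-in for netdisco.conf)
    create_device: hypothetical callable(ip, community, version) -> session or None
    """
    stored = known.get(ip)
    if stored is not None:
        # Known device: reuse the settings that worked last time.
        session = create_device(ip, stored["community"], stored["version"])
        if session is not None:
            return session
    # Unknown device, or the stored settings failed: try every configured
    # read-community and leave the SNMP version to create_device (None).
    for community in communities:
        session = create_device(ip, community, None)
        if session is not None:
            return session
    return None  # the caller marks the IP as undiscovered
```

   Trying the stored settings first means that a known device normally costs a single SNMP
attempt, which is where the timeouts mentioned above are saved.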




                        Figure 3.6 Netdisco sub-routine create_device



  The mode of operation of create_device can be seen in figure 3.6. A device's hostname,
which is actually its IP-address if called by get_device, and the SNMP read-community
are the parameters passed to create_device. Optionally the SNMP version and the
SNMP::Info subclass of a device can be specified. With those values and some further
default parameters from Netdisco's configuration file, an SNMP::Info object is created.
To test whether communication with that object is successful, the MIB variable sysServices





(1.3.6.1.2.1.1.7) from MIB-II is requested. If that fails, an SNMP::Info object with SNMP
version 1 is tried next. Should the result be negative again, or if the SNMP version was
already 1, then no SNMP “connection” is possible and create_device returns undef.
For a successfully created SNMP::Info object, which is able to communicate with its
corresponding device, the hostname is resolved to an IP via DNS, and the first DNS entry of
this IP is obtained by a reverse lookup. This is only done to associate these values
with the created device object, which is returned to get_device and finally to discover.
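
   The version fallback inside create_device can be illustrated with a short sketch. It is
written in Python under stated assumptions: snmp_get is a hypothetical callable standing in
for an SNMP GET request, and default_version stands for the default taken from the
configuration file; neither name comes from Netdisco or SNMP::Info.

```python
SYS_SERVICES_OID = "1.3.6.1.2.1.1.7"  # sysServices from MIB-II, used as a probe

def create_device(host, community, version, snmp_get, default_version=2):
    """Return (host, community, version) for the first SNMP version that answers.

    snmp_get: hypothetical callable(host, community, version, oid) -> value or None
    Returns None if neither the requested version nor SNMP v1 gets a reply.
    """
    first = version if version is not None else default_version
    # Try the requested (or default) version first, then fall back to v1.
    # If the first attempt already used v1 there is nothing left to try.
    candidates = [first] if first == 1 else [first, 1]
    for candidate in candidates:
        if snmp_get(host, community, candidate, SYS_SERVICES_OID) is not None:
            return (host, community, candidate)
    return None
```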
   Now the “real” device information retrieval starts by passing the created
SNMP::Info object to store_device, followed by store_interfaces and find_neighbors.
   store_device, figure 3.7, gets the root-IP and first DNS entry for the device IP if
it is an alias IP and already found in the Netdisco database. Then all IP-addresses
and interfaces of the device are retrieved by requesting ipAdEntIfIndex (1.3.6.1.2.1.4.20.1.2)
and ifIndex (1.3.6.1.2.1.2.2.1.1) from MIB-II. After that, all aliases in device_ip for
the current device IP are deleted. All alias addresses of the processed IP and their
corresponding ports are saved to table device_ip instead. Finally, further device specific
properties are read from the device.








                       Figure 3.7 Netdisco sub-routine store_device


  Table 3.8 shows the MIB variables that are requested from every device. Some more
are retrieved, but these are stored in a device and vendor specific way. SNMP::Info knows
of these deviations and handles them per device class, thus different MIB variables are read.


           MIB                     Variable name                              OID
   IF-MIB               ifDescr                                  1.3.6.1.2.1.2.2.1.2
   IF-MIB               ifNumber                                 1.3.6.1.2.1.2.1
   IF-MIB               ifIndex                                  1.3.6.1.2.1.2.2.1.1
   RFC1213-MIB          system                                   1.3.6.1.2.1.1
   ENTITY-MIB           entPhysicalDescr                         1.3.6.1.2.1.47.1.1.1.1.2
   ENTITY-MIB           entPhysicalSerialNum                     1.3.6.1.2.1.47.1.1.1.1.11


                    Table 3.8 Common MIB variables store_device retrieves





These carry information about power supplies, the number of fans and slots, the device's
operating system, its vendor and the VTP domain if present. After all available data has been
retrieved from a device, this information is saved to table device in the database and
store_device finishes.




                           Figure 3.8 Netdisco sub-routine store_interfaces






   The next sub-routine called is store_interfaces, shown in figure 3.8. It is passed the
created SNMP::Info object as parameter. First of all, store_interfaces purges all
interface information in table device_port for the current device. This prevents mixing
possibly aged-out data with the new data, which is read directly afterwards. Of interest
are the interfaces' layer-1 properties, like type, speed, state, admin-state and duplex
information, but also layer-2 characteristics: the interfaces' native VLAN and STP state
are requested. Again this knowledge depends on device classes and is handled appropriately
by SNMP::Info. Because SNMP::Info provides all interface information as one data
structure, and because some administration interfaces are to be ignored, a loop
resolves the data per interface. This loop also inserts the port properties into the
database table device_port.
   The device and interface attributes retrieved so far describe only a single device;
they contain no topology information.
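
   The purge-then-insert pattern of store_interfaces can be sketched as below. This is an
illustration only: it uses Python's built-in sqlite3 module in place of Netdisco's
database, the table is reduced to four columns whose names are assumptions, and the shape
of the interfaces dictionary is hypothetical.

```python
import sqlite3

def store_interfaces(db, device_ip, interfaces, ignored=()):
    """Replace the stored ports of one device with freshly read ones.

    interfaces: {port_name: {"speed": ..., "duplex": ...}} (hypothetical shape)
    ignored:    port names to skip, e.g. administrative interfaces
    """
    with db:  # purge and re-insert in a single transaction
        db.execute("DELETE FROM device_port WHERE ip = ?", (device_ip,))
        for port, props in interfaces.items():
            if port in ignored:
                continue
            db.execute(
                "INSERT INTO device_port (ip, port, speed, duplex) VALUES (?, ?, ?, ?)",
                (device_ip, port, props.get("speed"), props.get("duplex")),
            )

# In-memory stand-in for the Netdisco database with one stale row.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE device_port (ip TEXT, port TEXT, speed TEXT, duplex TEXT)")
db.execute("INSERT INTO device_port VALUES ('10.0.0.1', 'Fa0/9', '10 Mbps', 'half')")
store_interfaces(
    db,
    "10.0.0.1",
    {"Fa0/1": {"speed": "100 Mbps", "duplex": "full"}, "Null0": {}},
    ignored={"Null0"},
)
```

   Running both statements in one transaction is a design choice of this sketch; it avoids a
window in which the device has no ports recorded at all.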
   As Netdisco is CDP-based, all CDP neighbor data seen on a device is processed next
in find_neighbors. Figure 3.9 gives an overview of its function. Once again the device's
SNMP::Info object is passed as parameter. The IP-addresses of the device's CDP neighbors
are read first. To know whether a device supports CDP or not, the MIB variable cdpRun
is tested. If cdpRun is set to true or some neighbor IPs are found, the device is considered
to be CDP enabled. Because CDP neighbor IPs have to be gathered anyway, this
test is actually done twice. If no IPs are known, no further neighbor information is of
interest or even available.
   Next, data about the neighbors known to the currently processed device is requested. In
particular these are the neighbors' ports connected to the local ports, the CDP-ID and the
platform. Netdisco tries to associate each neighbor with the port it is seen on and stores
it in table device_port. If the neighbor device is not marked as discovered, it is put into
the global array %Discover_Queue, used during discovery of multiple devices.








                            Figure 3.9 Netdisco sub-routine find_neighbors






   After the CDP information is processed, manual topology data from file
netdisco-topology.txt is evaluated. The content of the manual topology is loaded even before
discover is run, but the manual links are processed after SNMP device discovery. See
figure 3.10 for the operation of topo_add_link.




                        Figure 3.10 Netdisco sub-routine topo_add_link


   This is the last step in the discovery of a single device, and therefore the device is
marked in the global hash variable %Discovered. Because a device can be seen with another
of its IP-addresses as a neighbor somewhere else in the topology, all IP-aliases of the
discovered device are marked as discovered in the global hash %Discovered_Alias.






3.2.1.2. Discovery of all CDP devices




                                Figure 3.11 Discovery of all CDP devices


   Devices supplying CDP information can be used to discover all CDP neighbors, as
described in chapter 2.2.1 Using proprietary protocols for finding devices.
   The seed-device is the device from which discovery starts. Basically a single
device is discovered and all CDP neighbors of this device are added to the discovery
queue. The sub-routine run of Netdisco evaluates all devices in this queue in the same
way. While one of the queued devices is discovered, its CDP neighbors are in turn added
to the queue. This process runs until all CDP devices reachable from the seed-device
are discovered.




               Figure 3.12 Discovery of all CDP devices of example network in figure 3.11






   Figure 3.12 and table 3.9 illustrate how the devices in the sample network from
figure 3.11 are evaluated and how neighbor devices are added to and removed from the
discovery queue.
   The exact order in which devices of the same discovery level are processed depends on
the order in which they are returned in the SNMP response.

     Discovery Step      Device              CDP-Neighbors        Discovery Queue
           1             192.168.100.12      192.168.100.254      192.168.100.254
                                             192.168.100.28       192.168.100.28

           2             192.168.100.254     192.168.100.4        192.168.100.28
                                             192.168.100.1        192.168.100.4
                                                                  192.168.100.1

           3             192.168.100.28      192.168.100.8        192.168.100.4
                                                                  192.168.100.1
                                                                  192.168.100.8

           4             192.168.100.4                            192.168.100.1
                                                                  192.168.100.8

           5             192.168.100.1                            192.168.100.8

           6             192.168.100.8


                   Table 3.9 Steps while discovering the sample network from figure 3.11


   The discovery queue is handled First In First Out (FIFO). This causes the discovery
process to step through the network from one level of distance to the next in a Breadth
First Search (BFS). First all devices of one level are discovered, then discovery proceeds
to the next level.
   Figure 3.13 shows how Netdisco implements this. To start discovery of the
whole CDP network, Netdisco is called with netdisco -r seed-device.
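
   The FIFO handling of the discovery queue can be reproduced with a few lines of Python.
The sketch below is not the sub-routine run itself: the adjacency table is read off
table 3.9 with the back-links added as an assumption, and a single set takes the place of
Netdisco's %Discovered and %Discovered_Alias bookkeeping.

```python
from collections import deque

# CDP adjacency of the sample network; a stand-in for live SNMP data.
CDP_NEIGHBORS = {
    "192.168.100.12": ["192.168.100.254", "192.168.100.28"],
    "192.168.100.254": ["192.168.100.12", "192.168.100.4", "192.168.100.1"],
    "192.168.100.28": ["192.168.100.12", "192.168.100.8"],
    "192.168.100.4": ["192.168.100.254"],
    "192.168.100.1": ["192.168.100.254"],
    "192.168.100.8": ["192.168.100.28"],
}

def discover_all(seed, neighbors_of):
    """Return the devices in the order they are discovered (FIFO queue, i.e. BFS)."""
    queue = deque([seed])
    queued = {seed}  # devices that are already queued or discovered
    order = []
    while queue:
        device = queue.popleft()  # First In First Out
        order.append(device)      # "discover" the device
        for neighbor in neighbors_of(device):
            if neighbor not in queued:
                queued.add(neighbor)
                queue.append(neighbor)
    return order

order = discover_all("192.168.100.12", lambda d: CDP_NEIGHBORS.get(d, []))
```

   With 192.168.100.12 as seed-device the resulting order matches the Device column of
table 3.9. Replacing popleft() by pop() would turn the queue into a stack and the
traversal into a depth first search.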








                 Figure 3.13 Netdisco sub-routine run, used to discover all CDP-devices






3.2.1.3. Discovery of non-CDP devices
   Netdisco offers no way to discover non-CDP devices automatically. They must be
discovered individually by invoking the single device discovery.



3.2.1.4. Refresh devices
   If no new devices have been added to the network, the information contributed
by all known devices can be renewed in Netdisco via the refresh function: netdisco -R.
Figure 3.14 shows a flowchart of it.




             Figure 3.14 Netdisco sub-routine refresh, used to refresh all known devices


   This method is a little quicker than a discovery of the network, because timeouts
caused by a wrong SNMP version or wrong communities do not occur, as these settings are
taken from the database. It is assumed that none of those parameters has changed since the
first discovery of a device.
   Furthermore, refreshing devices can be used as a simple form of network error detection.
Devices that are known in the Netdisco database but do not respond during the refresh
remain in the database, but are noted as not responding.



3.2.1.5. Host discovery
   Host discovery is another distinct feature of Netdisco. It provides information about
which device port an end station is connected to. To be exact, it is determined on which
device port a certain MAC-address is seen.




                                      Figure 3.15 Host discovery




                                 Figure 3.16 Scheme of host discovery


   Figure 3.16 shows a simplified scheme of how Netdisco does host discovery. End stations
or hosts are called “nodes” in Netdisco. It is assumed that a node can only be connected
to ports that are not linked to another device's port. This is true in most cases, for
instance where work-group switches or hubs are only connected to “user” ports and not
within the inner backbone of the layer-2 network. Figure 3.2 on page 40 reflects this 1:n
relation of device port and nodes.
   With these constraints defined, host discovery is done as follows: get all forwarding
entries from layer-2 capable devices; detect uplink-ports, to which nodes cannot be
attached; and finally assign the nodes, or rather their MAC-addresses, to the ports
they were seen on. Because these ports are non-uplink-ports, a MAC-address will, in a
loop-free network, appear on only one of them and can therefore be associated with a single
device port. Uplink-ports are detected by testing for neighbor devices.
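
   The three steps above can be condensed into a small sketch. It is a Python illustration
with hypothetical data structures (fdb for the collected forwarding entries and
neighbor_ports for the ports that have a neighbor device), not an extract of Netdisco.

```python
def assign_nodes(fdb, neighbor_ports):
    """Map each node MAC-address to the non-uplink port it was seen on.

    fdb:            {(device, port): [mac, ...]} forwarding entries per device port
    neighbor_ports: set of (device, port) with a neighbor device (uplink-ports)
    """
    location = {}
    for (device, port), macs in fdb.items():
        if (device, port) in neighbor_ports:
            continue  # uplink-port: nodes cannot be attached here
        for mac in macs:
            location[mac] = (device, port)
    return location

# Two switches linked via sw1/24 and sw2/1, one node on each switch.
fdb = {
    ("sw1", "1"): ["aa"],
    ("sw1", "24"): ["bb"],  # node bb is seen through the uplink
    ("sw2", "1"): ["aa"],   # node aa is seen through the uplink
    ("sw2", "2"): ["bb"],
}
nodes = assign_nodes(fdb, {("sw1", "24"), ("sw2", "1")})
```

   Each MAC-address appears in the forwarding tables of both switches, but only once on a
non-uplink-port, which is the entry that survives.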




                           Figure 3.17 Netdisco sub-routine macwalk


   To get all forwarding entries from all layer-2 devices, Netdisco implements a method
named macwalk, figure 3.17.
   To start a macwalk, Netdisco is called with netdisco -m. It loads for all known devices
the IP-address, the aliases, the layers a device supplies and all port MAC-addresses. This
is done to know which devices forwarding tables have to be requested from and to avoid
mixing port addresses with node addresses. If a device provides layer-2 information, and
the Bridge-MIB is therefore accessible, sub-routine macsuck, figures 3.18 and 3.19, is
called for this device.




                            Figure 3.18 Netdisco sub-routine macsuck, part 1








Figure 3.19 Netdisco sub-routine macsuck, part 2






   macsuck itself only initiates the reading of forwarding tables from devices.
It prepares the SNMP communication and, for devices capable of using the Cisco
community-indexing, it retrieves forwarding tables per VLAN. The real collection of
forwarding entries is done in sub-routine walk_fwtable, figure 3.21. It also skips
uplink-ports, since it is assumed that nodes cannot be attached to them. Uplinks are
detected either by looking for neighboring devices in database table device_port or by
testing for forwarding entries that belong to other devices. Such forwarding entries can
only be seen on ports in the backbone of a network, where only devices are linked to
each other and therefore no node can be connected.
   Besides other parameters, walk_fwtable is passed a reference to fw_cache. In this
variable, forwarding entries are cached while devices are read. The information on
which device port a node is connected to is saved via mac_savecache, figure 3.20, into
database table node.
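
   The two ways of detecting uplinks named above can be combined as in the following
sketch. It is a Python illustration with hypothetical inputs: neighbor_ports stands for the
ports that have a neighbor in table device_port and device_macs for the MAC-addresses
belonging to ports of known devices.

```python
def detect_uplinks(fdb, neighbor_ports, device_macs):
    """Return the set of (device, port) keys considered uplink-ports.

    fdb:            {(device, port): [mac, ...]} forwarding entries per port
    neighbor_ports: ports with a neighbor recorded in device_port
    device_macs:    MAC-addresses that belong to ports of known devices
    """
    uplinks = set(neighbor_ports)
    for key, macs in fdb.items():
        # A forwarding entry for another device's MAC-address means that a
        # device, not only nodes, is reachable through this port.
        if any(mac in device_macs for mac in macs):
            uplinks.add(key)
    return uplinks
```

   The second test also catches links to devices that never announced themselves as
neighbors, provided those devices are known to the database.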




                            Figure 3.20 Netdisco sub-routine mac_savecache








Figure 3.21 Netdisco sub-routine walk_fwtable






   So far, end stations are denoted by their MAC-addresses. It is more common, however, to
use IP-addresses or hostnames instead, as these are easier for humans to read. Mappings
from MAC-addresses to IP-addresses are stored in the ARP-tables of layer-3 devices. To get
this information, Netdisco is called with netdisco -a, which starts an arpwalk, figure 3.22.
First this function loads all devices and the MAC-addresses belonging to device ports from
the Netdisco database.




                               Figure 3.22 Netdisco sub-routine arpwalk



    Then it tests whether a device is able to supply layer-3 data. If so, it calls the
arpnip sub-routine, figure 3.23. arpnip does the real fetching of the ARP-cache from devices
and saves the MAC <-> IP mapping into database table node_ip. With this operation all
information necessary for host discovery is collected in the database.
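
   How the results of macwalk and arpwalk fit together can be shown with a short join. The
sketch is in Python and uses plain dictionaries as hypothetical stand-ins for the database
tables node and node_ip.

```python
def locate_ips(node_ports, arp_entries, device_macs):
    """Combine port locations and ARP data into {ip: (mac, device, port)}.

    node_ports:  {mac: (device, port)} as collected by a macwalk
    arp_entries: iterable of (mac, ip) pairs as collected by an arpwalk
    device_macs: MAC-addresses of device ports, which are not nodes
    """
    located = {}
    for mac, ip in arp_entries:
        if mac in device_macs:
            continue  # address of a device port, not of an end station
        if mac in node_ports:
            located[ip] = (mac,) + node_ports[mac]
    return located
```

   An IP-address whose MAC-address was never seen on a non-uplink-port stays out of the
result, since its attachment point is unknown.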








Figure 3.23 Netdisco sub-routine arpnip




3.2.2. Topology generation
   Now that all necessary information has been retrieved from the network, the topology
can be generated. For detecting only the device topology, macwalk and arpwalk do
not need to be run, as they only provide host (node) specific topology data.



3.2.2.1. CDP topology
   Because Netdisco uses CDP data to discover devices one by one, topology information
is implicitly obtained and used during discovery. To generate the network topology
from this information, Netdisco simply reads it from database table device_port,
where ports having a neighbor device are listed. Graph creation is started by calling
netdisco -g. This invokes sub-routine graph, see figure 3.24.




                                Figure 3.24 Netdisco sub-routine graph






  First of all graph calls make_graph, figure 3.25, from the Netdisco module. This does




                     Figure 3.25 Netdisco module function netdisco::make_graph


the real neighbor-data retrieval from the database. With this data an undirected graph




object is formed and returned to graph, which goes on to determine the biggest sub-graph.
All smaller sub-graphs are stripped off and graph_each, figure 3.26, is called for
the biggest sub-graph.
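
   Determining the biggest sub-graph of an undirected graph amounts to finding its largest
connected component. The following Python sketch shows one way to do this with a plain
traversal; it is an illustration and does not reproduce the graph library Netdisco uses.

```python
def biggest_subgraph(edges):
    """Return the node set of the largest connected component.

    edges: iterable of (a, b) pairs of an undirected graph
    """
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    best, seen = set(), set()
    for start in adjacency:
        if start in seen:
            continue
        # Collect everything reachable from `start`.
        component, stack = {start}, [start]
        while stack:
            for nxt in adjacency[stack.pop()]:
                if nxt not in component:
                    component.add(nxt)
                    stack.append(nxt)
        seen |= component
        if len(component) > len(best):
            best = component
    return best
```

   Devices that were discovered individually but have no recorded link to the main network
end up in smaller components and are therefore missing from the drawn map.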




                              Figure 3.26 Netdisco sub-routine graph_each




   graph_each uses the image, graph, node and edge properties set in netdisco.conf
to create a GraphViz object. Then all nodes of the biggest sub-graph passed to
graph_each are evaluated by graph_addnode, figure 3.27.








                        Figure 3.27 Netdisco sub-routine graph_addnode




   This function sets the GraphViz node label and the HTML image-map location for later
visualization. The modified GraphViz graph object is returned to graph_each and all
edges between the graph nodes are added to the GraphViz object. Finally the GraphViz
output methods for the defined image types are called to generate the visualization.



3.2.2.2. Manual topology
   Topology information that Netdisco is not able to detect can be added in file
netdisco-topology.txt. Its complete syntax is described in file README. To define a manually
set link, the hostname or IP-address of the source device must be given on one line. The
next line starts with “link:” followed by the name of the local port on the source device,
the hostname or IP of the destination device and the destination port the source device is
connected to. These entries must be separated by commas.








                               Figure 3.28 Example for manual topology


   Figure 3.28 shows the example from figure 3.11 on page 56, in which all CDP devices are
discovered automatically. Only device 192.168.100.5 is left, because it does not take part
in CDP neighbor advertisements and is therefore invisible to the CDP topology. To add it to
the manual topology information, it first has to be discovered as a single device by
calling netdisco -d 192.168.100.5. This is necessary to make it known to Netdisco. To add
the connection between device 192.168.100.5 and device 192.168.100.28, the lines in
netdisco-topology.txt are:
   192.168.100.5
   link:Fa0/13,192.168.100.28,Port1
   Port names must be given exactly as recorded in Netdisco's database. The link can also be
specified in the opposite direction, from device 192.168.100.28 to device 192.168.100.5,
because Netdisco uses an undirected graph as its model and treats both variants
equally.
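
   Reading the two-line format shown above can be sketched as follows. The parser is a
Python illustration that covers only the device line and the “link:” line described in
this section; the complete syntax from file README is not reproduced, and the function
name is hypothetical.

```python
def parse_topology(text):
    """Parse manual topology text into (src_device, src_port, dst_device, dst_port)."""
    links, current = [], None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("link:"):
            if current is None:
                raise ValueError("link line before any device line")
            fields = [field.strip() for field in line[len("link:"):].split(",")]
            if len(fields) != 3:
                raise ValueError(f"expected 3 comma-separated entries: {line!r}")
            src_port, dst_device, dst_port = fields
            links.append((current, src_port, dst_device, dst_port))
        else:
            current = line  # hostname or IP-address of the source device
    return links

links = parse_topology("192.168.100.5\nlink:Fa0/13,192.168.100.28,Port1\n")
```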
   To explicitly import manual topology information, Netdisco is called with netdisco -T.
Otherwise this is done automatically while discovering a single device or the whole
network, or during a refresh. Netdisco does not make a distinction between CDP and manually
added topology information, as both are stored in database table device_port.



3.2.3. Detection of changes in networks
   Netdisco is able to detect whether devices have been added to the network. This is
possible because information about all devices is loaded at the beginning of a network
discovery. To read out data from new devices only, the 'new-only' switch -N can be appended
when calling netdisco -r. Furthermore, old, new and missed devices are counted during
network discovery and device refresh, so a comparison of these values after different
discovery runs tells whether changes or errors occurred. A “real” automatic change detection
is not provided. Netdisco can only be used as a tool to detect modifications manually.
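
   The comparison that has to be done manually can be expressed with set operations. The
sketch below is in Python; the reading of “old” as known and responding, “new” as
responding but not yet known, and “missed” as known but not responding is an assumption
made for this illustration.

```python
def compare_runs(known, responding):
    """Classify devices of one discovery run against the devices known before.

    known:      devices stored in the database before the run
    responding: devices that answered during the run
    """
    known, responding = set(known), set(responding)
    return {
        "old": sorted(known & responding),     # known before and still answering
        "new": sorted(responding - known),     # seen for the first time
        "missed": sorted(known - responding),  # known, but no answer this time
    }

report = compare_runs(
    known=["192.168.100.1", "192.168.100.4", "192.168.100.8"],
    responding=["192.168.100.1", "192.168.100.4", "192.168.100.5"],
)
```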



3.2.4. Problems during operation
   The first problem that appeared was that a Cisco WS-C6500 did not show any data about
CDP neighbors, which was not correct. Tests revealed that this device has problems
when information is gathered through GET-BULK operations. Those problems are limited
to cdpCache MIB variables. WS-C4000 devices also have problems with GET-BULK:
they do not deliver certain Bridge-MIB variables, which results in Netdisco rejecting
forwarding entries. Both device classes show the symptom that they return undefined
values where valid ones are actually available. These failures do not correlate
with certain device properties or device OS versions. Tests showed that devices running
exactly the same software with the same hardware configuration behaved
differently. But it seems that distinct device classes are more vulnerable to this
“bulkwalk-bug”. To deal with it, SNMP::Info version 1.0 supplies the feature
to disable bulkwalk entirely for certain device classes. When using lower versions of
SNMP::Info, a workaround is: first expire the problematic device from the Netdisco
database; set the SNMP version to 1 in netdisco.conf, which prevents Netdisco from
using bulkwalk, because SNMP-v1 does not support it; then discover the device again, which
stores the SNMP connection data in the database; then set the SNMP version back to 2.
Now, every time the device is accessed, SNMP-v1 will be used for it and SNMP-v2 for all
other devices. This procedure must be performed for every device with bulkwalk problems.
   Another problem arose from the fact that one device can have multiple IP-addresses
and DNS names. Netdisco tries to counter this by defining a “root-ip”,
which is used as the primary key for a device, and by resolving the DNS name for the
“root-ip” of the device. But it cannot be guaranteed that a device will always get the same
“root-ip” when it is discovered for the first time, nor that the DNS name resolved for it
stays the same. The proposal for the “root-ip” case is to use an “include-subnet”
within which devices are discovered. This subnet may even be the only one where devices
answer SNMP requests, as it represents the management interface of the devices, whereas all
other ports are “normal” communication interfaces.







4. Improvements on Netdisco
   The analysis of Netdisco's implementation showed that improvements are possible. The
following chapters present the enhancements developed in this work. These are: adding
functions to determine network topologies independently of the vendor, speeding up data
retrieval by querying devices concurrently, and showing layer-2 properties in a detailed
graph.



4.1. Vendor independent topology determination
   Chapter 2.3.3 “Topology by Spanning Tree Protocol (STP) data” on page 27 described
how STP data can be used to obtain topology information. Netdisco discovers
all connections seen by CDP and builds the topology out of them. But non-CDP devices are
missing. If all of those devices are added by single device discovery, Spanning Tree data
provides the connections between non-CDP devices. So far, Netdisco did not retrieve the
necessary STP data, Designated Bridge and Designated Port. Therefore their collection has
to be implemented in Netdisco. This is achieved by adding information retrieval methods
to class SNMP::Info::Bridge. Netdisco's database table device_port also has to be
extended with additional values. What exactly is stored can be seen in table 3.4
Netdisco Database Table device_port on page 43; values printed in italics are added to
the table.
   When expanding the CDP network topology, two cases can occur. First, a device
is added at the border of the topology, and second, a device is inserted in between
two devices.
   All links seen by CDP are also seen by STP, along with additional ones from non-CDP
devices.








        Figure 4.1 Comparison of CDP and STP topology information, with an outer non-CDP device


   Figure 4.1 illustrates an example where a device is additionally seen as an outer
device.
   Because the topology determined by Spanning Tree information depends on which
device is the root of the tree, all three possible cases are presented, whereas the CDP
data remains the same. CDP determines a connection between port 2 of device A and port 1 of
device B. Because both devices send CDP advertisements, the link information is seen on both
ends of the link. STP data is seen on only one end of the connection. If a device has stored
itself as designated bridge, the port of the other device on the connection must contribute
the necessary link information.
   Consider case two, for instance, where the STP root is device B. The connection between
device A and B is only seen at port 2 of A, because B is the Spanning Tree root and
therefore its own designated bridge. Port A2 has device B noted as designated bridge with
designated port 1. This defines the connection between ports A2 and B1 sufficiently and
confirms the CDP link information. Furthermore, port C1 also sees device B as designated
bridge. This determines another link, namely port C1 to port B2, which adds device C to the
topology. C is not visible in the CDP topology, because it does not issue CDP advertisements.
   Figure 4.2 shows the same topology constellation, but instead of device C, device B
is the non-CDP and CDP-unaware device. It can be seen that B is invisible in the CDP
topology, which determines a connection between A2 and C1. This is not correct in the real
topology, but it is from the point of view of CDP. With the help of STP data the connections
between A2 and B1






         Figure 4.2 Comparison of CDP and STP topology information, with an inner non-CDP device


and between B2 and C1 are detected, which inserts device B between A and C.
   If device B were CDP-aware and therefore did not forward CDP advertisements, it would
cut the CDP topology at this point. The first case presented in figure 4.1 then arises
for both devices A and C, with B added by STP data. This also restores the missing link.
   The algorithm that uses this information to add or insert a device into the topology
is implemented in sub-routine topo_add_stp and is depicted in figure 4.3.
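   The combination of both data sources described above can be sketched as follows. This
is a minimal illustration in Python, not the Perl code of topo_add_stp. The function
names, the tuple layout of the STP records (with bridge and port identifiers already
resolved to names) and the rule that an STP link replaces a contradicting CDP link on the
same port are assumptions made for this example. B is assumed to be the STP root.

```python
def stp_links(stp_ports):
    """Derive links from STP port data.

    stp_ports: iterable of (device, port, designated_bridge, designated_port),
    with bridge and port identifiers already resolved to names (an assumption).
    """
    links = set()
    for device, port, des_bridge, des_port in stp_ports:
        if des_bridge == device:
            # The device is its own designated bridge on this port, so the
            # port at the other end of the connection has to report the link.
            continue
        links.add(frozenset([(device, port), (des_bridge, des_port)]))
    return links


def add_stp_to_topology(cdp_links, stp):
    """Add STP links to the CDP links. A CDP link whose port also ends an STP
    link to a different neighbor is dropped, which inserts the hidden device."""
    stp_ends = {end for link in stp for end in link}
    kept = {link for link in cdp_links if link in stp or not (link & stp_ends)}
    return kept | stp


# Situation of figure 4.2: B is invisible to CDP, which reports A2 - C1.
cdp = {frozenset([("A", 2), ("C", 1)])}
stp = stp_links([
    ("A", 2, "B", 1),   # port A2 sees B as designated bridge, port 1
    ("B", 1, "B", 1),   # B is its own designated bridge
    ("B", 2, "B", 2),
    ("C", 1, "B", 2),   # port C1 sees B as designated bridge, port 2
])
topology = add_stp_to_topology(cdp, stp)
```

   In this example the CDP link A2 - C1 is replaced by the two STP links A2 - B1 and
C1 - B2, so device B appears between A and C.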




                                                      Diploma Thesis – Network Discovery – A. Barthel




Figure 4.3 Add supplementary seen Spanning Tree links to the known CDP Topology, topo_add_stp






   By calling topo_add_link at the end of topo_add_stp, the topology information is
saved into the Netdisco database. This has the advantage that no other functions had to
be changed.
   The network determined by CDP and STP data consists of all CDP-capable devices and
all non-CDP but layer-2 capable devices. It would be possible to use additional methods
based on forwarding tables to detect further devices. Under a few assumptions about the
network, however, FDB methods provide only redundant information. Supposing that no
workgroup switch or hub is connected within the links between devices, hosts or routers
cannot be attached to such uplink ports. It follows that forwarding information seen on
non-uplink ports determines the location of end stations, because those ports are the
only non-uplink ports that carry forwarding entries for the end stations connected to
them. This information is collected by Netdisco's macwalk.
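   The reasoning about non-uplink ports can be illustrated with a short sketch. It is
written in Python for illustration only and does not reproduce Netdisco's macwalk. The
function name, the record layout and the MAC addresses are hypothetical, and the sketch
relies on the assumption stated above that no hub or workgroup switch sits on an uplink.

```python
def locate_end_stations(fdb_entries, uplink_ports):
    """Map each MAC address to the non-uplink port on which it was learned.

    fdb_entries: iterable of (device, port, mac) forwarding table entries.
    uplink_ports: set of (device, port) pairs that connect two discovered devices.
    """
    location = {}
    for device, port, mac in fdb_entries:
        if (device, port) in uplink_ports:
            continue  # entry was learned through another switch, not locally
        location[mac] = (device, port)
    return location


uplinks = {("A", 2), ("B", 1)}          # link A2 - B1 from the topology
fdb = [
    ("A", 5, "00:00:00:00:00:01"),      # host attached to A5
    ("A", 2, "00:00:00:00:00:02"),      # seen by A only via its uplink
    ("B", 1, "00:00:00:00:00:01"),      # seen by B only via its uplink
    ("B", 3, "00:00:00:00:00:02"),      # host attached to B3
]
hosts = locate_end_stations(fdb, uplinks)
```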
   To include routers in the visualization as well, their port MAC addresses have to be
noted. When host discovery is engaged, these router ports are added to the topology
information. This visualization feature is not implemented yet.



4.2. Parallel device operations
   The time discovery takes is important for the accuracy of the collected data: the
faster the information is gathered, the fewer network changes can occur during the
collection phase and the more current the data is.
   All devices are processed in the same way, and while one is being read out, all
others wait in the discovery queue. These device operations can be performed in
parallel. The Master-Slave design pattern can be used to implement this [BUSC96]. This
pattern defines a master which controls and distributes work to several identical slave
processes. The discovery of a single device is assigned to a slave task. These tasks are
all identical and do not influence each other, so they can be executed independently and
simultaneously. During discovery each slave process adds its neighbor devices to the
discovery queue. The master controls the queue and starts a new slave whenever a device
is added to it. Figure 4.4 illustrates this concept.






             Figure 4.4 Parallel discovery implemented as Master-Slave design pattern


   The Master-Slave pattern specifies that the master combines the results delivered by
the slaves. Netdisco uses global variables to keep track of already discovered devices,
the discovery queue and other state. Furthermore, the device discovery functions store
information directly into the database. Both tasks should be done by the master. To
realize this concept, it would be necessary to redesign Netdisco: functions must not use
global variables, but instead return the acquired information to the calling function.
This might result in massive memory transfer operations if a lot of data is processed.
Because of the limited time available for this work, this redesign is not implemented
yet. Figure 4.5 shows the approach used instead.








                    Figure 4.5 Master-Slave Pattern as implemented in Netdisco


   The master invokes the slaves as specified by the model, but the discovery functions
are not changed. This entails two constraints: first, the database must be accessible
concurrently; second, global variables must be available to the master and all slave
processes.
   The following sections show how these constraints are met.



4.2.1. Preconditions for concurrent database operations
   The possibility of accessing the database concurrently is the first condition to be
satisfied, because simultaneous discovery processes must be able to save the retrieved
values into the database without influencing each other. Otherwise the database can
become inconsistent.
   Netdisco uses the PostgreSQL DBMS, which features several locking methods, accessible
either directly or through transactions. By default PostgreSQL uses the Read Committed
transaction isolation level, and it executes every operation as a transaction (see
[POSTGRES], documentation chapter 12). These facts make an explicit use of transactions
redundant, as they are applied implicitly. For large database operations, however, it is
better to begin and commit transactions explicitly, because this minimizes the
communication overhead produced by database accesses in quick succession.
   This is implemented by two sub-routines in netdisco.pm. sql_begin() gets a database
handle and sets AutoCommit to 0 to start a transaction; sql_commit() issues an explicit
commit() and then sets AutoCommit back to 1.
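   The effect of an explicit transaction around many consecutive writes can be sketched
as follows. To keep the example self-contained it uses Python and the sqlite3 module
instead of Perl DBI and PostgreSQL, so it only mirrors the idea behind sql_begin() and
sql_commit(). The function name, the column names and the rollback on error are
assumptions of this sketch and are not taken from netdisco.pm.

```python
import sqlite3


def store_rows(conn, rows):
    """Write all rows of one device within a single explicit transaction."""
    conn.execute("BEGIN")                      # corresponds to AutoCommit = 0
    try:
        conn.executemany(
            "INSERT INTO device_port (ip, port) VALUES (?, ?)", rows)
        conn.execute("COMMIT")                 # explicit commit
    except sqlite3.Error:
        conn.execute("ROLLBACK")               # leave no partial device data
        raise


# isolation_level=None puts the sqlite3 module into autocommit mode, so every
# statement outside BEGIN ... COMMIT is committed on its own.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE device_port (ip TEXT, port TEXT, PRIMARY KEY (ip, port))")
store_rows(conn, [("192.168.100.12", "Fa0/1"), ("192.168.100.12", "Fa0/2")])
```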



   Sub-routines arpnip, find_neighbors, store_interfaces and walk_fwtable use this
feature. [WE2004]
   Deadlocks cannot occur, because each device is handled by only one process and only
once during discovery.
   Netdisco is designed to run as a single instance. Therefore certain problems occur
when it runs in parallel mode. One of them is that Netdisco never issues a database
disconnect, because an implicit disconnect is done when the database handle is flushed
after Netdisco exits. In parallel mode, however, database handles expire before Netdisco
finally finishes while they are still held open, which results in a mix-up of handles.
This problem is solved by adding an explicit disconnect to the netdisco.pm sub-routines
sql_commit(), root_device() and sql_do() before they return.
   Depending on the default settings of PostgreSQL, it is necessary to increase the
maximum number of connections. This is achieved by setting max_connections in the file
/var/lib/pgsql/data/postgresql.conf. Details about the correct file location and values
are available in the PostgreSQL documentation [POSTGRES].



4.2.2. Preconditions for concurrent device operations
   The master-slave principle is used to control concurrent operations. All Netdisco
functions that repeatedly call the same function can be adapted to the parallel mode.
These functions are: discovery of the whole network, refreshing all devices, ARPwalk and
MACwalk. They have in common that they call another elementary function which processes
a single device. This is the basic condition for applying the Divide and Conquer
mechanism of the Master-Slave design pattern.
   Parallelization needs a method that runs identical functions concurrently. This is
achieved by the fork mechanism. The CPAN module Parallel::ForkManager [FORK02] offers a
convenient way to make fork operations transparent to the developer and implicitly
controls how many children are forked. This feature makes it possible to limit the pool
size, which is necessary because concurrent operations require an increasing amount of
resources.
   Furthermore, variables that are used by the master to control concurrent operations
and that are also accessed by slave instances must be available globally. To let only
one process change such a value at a time, some means of locking must be applied. Both
constraints are satisfied by the CPAN module IPC::Shareable [IPC01], which provides
methods to tie variables to shared memory.



4.2.3. Implementation of parallel network discovery
   The sub-routine run is taken as the basis for implementing the master process for
concurrent network discovery. The global variables used by run and its slave processes
have to be locked during access; they are shown in table 4.1.

     Variable Name               Purpose
     @Discover_Queue             discovery queue
     %Discovered                 already discovered devices
     %Discovered_Alias           aliases of already discovered devices
     %UnDiscovered               already processed devices which did not respond
     %NoCDP                      devices not carrying CDP information
     $Aliases                    IP-address aliases


           Table 4.1 Shared memory variables used by sub-routine run in concurrency mode


   Shared variables cannot deadlock, because no other variable is locked while the lock
on one variable is held. Therefore locks can be serialized. From the point where the
master-process loop is entered, every access to a shared variable follows the same
pattern: the variable is locked, processed and unlocked afterwards.
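   The locking discipline described above can be sketched as follows. The example is
written in Python and uses threads with one lock per variable as a stand-in for forked
processes and IPC::Shareable, so it illustrates only the rule that a lock is never
requested while another one is held. The class SharedVar, the function enqueue_if_new and
the variable names are hypothetical.

```python
import threading


class SharedVar:
    """A value with its own lock, standing in for a variable tied to shared memory."""

    def __init__(self, value):
        self.value = value
        self._lock = threading.Lock()

    def update(self, func):
        # Lock, process, unlock. func must not access another SharedVar,
        # so a worker never holds two locks at the same time.
        with self._lock:
            self.value, result = func(self.value)
            return result


seen = SharedVar(frozenset())
discover_queue = SharedVar(())


def enqueue_if_new(ip):
    """Queue a device unless it is already known; the two locks are taken in turn."""
    is_new = seen.update(lambda known: (known | {ip}, ip not in known))
    if is_new:
        discover_queue.update(lambda queue: (queue + (ip,), None))
    return is_new


def worker():
    for ip in ("192.168.100.254", "192.168.100.28", "192.168.100.4"):
        enqueue_if_new(ip)


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

   Although eight workers report the same three neighbors, each address ends up in the
queue exactly once, because the membership test and the insertion happen under one lock.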








Figure 4.6 Netdisco sub-routine run with parallelization




   Figure 4.6 shows how run is modified to implement parallelization. The initialization
process, drawn in gray, is unchanged from sequential execution. After that, a loop is
started which forms the master process. It runs as long as devices are in the discovery
queue. Within this loop the running process is forked for every device currently in the
discovery queue. To limit the total number of concurrently running processes, the
variable max_procs is introduced in the Netdisco configuration file. max_procs is passed
to the Parallel::ForkManager object when it is instantiated. The created object
automatically keeps track of the number of running processes and forks only as many as
the limit allows.
   Within the child processes discover is called. This function and the sub-routines it
calls are almost unchanged, except that global variables are locked whenever they are
accessed. Furthermore, sequential discover marks a device as discovered when all of its
sub-routines have finished. In parallel mode a device may still be in the process of
discovery while it is simultaneously seen as a neighbor of another device currently being
discovered. At this moment it would be added to the discovery queue again, because it is
not marked as discovered yet. To avoid this problem, a device is tagged as discovered
immediately at the beginning of discover, before any of its sub-routines is called.
   The method wait_all_children provided by Parallel::ForkManager blocks the parent
process until all forked children have exited. When this has happened, either the
discovery queue is empty, because no device is left to discover in the whole (CDP)
network, or the master loop starts again by issuing slave processes for each device in
the discovery queue.
   This method discovers all devices at one level of distance from the seed device, then
proceeds to the next level and so on.
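   The level-by-level behaviour of the master loop can be sketched as follows. This is
an illustration in Python, not the modified sub-routine run: a thread pool with a size
limit stands in for Parallel::ForkManager and max_procs, the neighbor table is made up,
and the master builds the next queue from the returned neighbor lists, whereas in
Netdisco the slave processes add to the shared queue themselves.

```python
from concurrent.futures import ThreadPoolExecutor
import threading

# Hypothetical neighbor table, standing in for CDP data read from the devices.
NEIGHBORS = {
    "seed": ["a", "b"],
    "a": ["seed", "c", "d"],
    "b": ["seed", "d", "e"],
    "c": ["a"],
    "d": ["a", "b"],
    "e": ["b"],
}


def discover_network(seed, max_procs):
    """Return the devices processed in each pass of the master loop."""
    discovered = set()
    lock = threading.Lock()

    def discover(device):
        with lock:
            discovered.add(device)      # tag at once, before any data is read
        return NEIGHBORS[device]        # stands in for querying the device

    levels = []
    queue = [seed]
    while queue:                        # master loop
        levels.append(queue)
        # At most max_procs workers run at a time. Leaving the with-block
        # waits for all of them, comparable to wait_all_children.
        with ThreadPoolExecutor(max_workers=max_procs) as pool:
            found = list(pool.map(discover, queue))
        # Queue every neighbor that is not tagged as discovered, once.
        queue = sorted({n for neighbors in found for n in neighbors
                        if n not in discovered})
    return levels


levels = discover_network("seed", max_procs=2)
```

   For this neighbor table the loop runs three times and processes one, two and three
devices respectively, independent of the process limit.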




            Figure 4.7 Parallel CDP discovery of example network in figure 3.11 on page 56






   Figure 4.7 and table 4.2 show how the example network in figure 3.11 on page 56
would be processed.

      Discovery Step              Device                  CDP-Neighbors        Discovery Queue
             1                192.168.100.12          192.168.100.254           192.168.100.254
                                                       192.168.100.28            192.168.100.28

             2                192.168.100.254             192.168.100.4          192.168.100.4
                               192.168.100.28             192.168.100.1          192.168.100.1
                                                          192.168.100.8          192.168.100.8

             3                 192.168.100.4
                               192.168.100.1
                               192.168.100.8


       Table 4.2 Steps while discovering the sample network from figure 3.11 in parallel mode


   It can be seen that the devices of each level are discovered simultaneously.
   How much this concurrent operation accelerates network discovery depends on the
structure of the underlying topology and on the device selected as starting device.
Devices can only be processed simultaneously if the degree of branching seen from the
seed device is higher than one. That means that the seed device, or a device reached from
it, must have more than one neighbor that is not yet discovered. Otherwise the devices
would be aligned in a chain and no parallel processing would be possible, as only one
device is in the discovery queue at a time.

                                Figure 4.8 Devices arranged in a line


   An example of such a situation is presented in figure 4.8. Device A is the chosen
seed device and is therefore discovered first. Its neighbor B is the only device seen
from A. During discovery of A, B is put into the discovery queue. Then B is discovered
and so on. Devices are discovered one by one, because the predecessor of each device is
already marked as discovered and therefore only one new neighbor is found. If a device
other than A or E is selected as seed device, at least one step with simultaneously
discovered devices will occur, because devices B, C and D have two neighbors. It is
therefore recommended to start discovery from a central device.
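   The influence of the seed device on the number of discovery steps can be checked with
a small sketch for the chain of figure 4.8. The Python function below is hypothetical and
simply counts the distance levels seen from a seed device. It assumes that the process
limit is high enough to handle each level at once.

```python
from collections import deque


def discovery_steps(neighbors, seed):
    """Number of distance levels (discovery steps) seen from the seed device."""
    depth = {seed: 1}
    todo = deque([seed])
    while todo:
        device = todo.popleft()
        for n in neighbors[device]:
            if n not in depth:
                depth[n] = depth[device] + 1
                todo.append(n)
    return max(depth.values())


# Devices A to E arranged in a line, as in figure 4.8.
chain = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C", "E"], "E": ["D"]}
steps = {seed: discovery_steps(chain, seed) for seed in chain}
# steps == {"A": 5, "B": 4, "C": 3, "D": 4, "E": 5}
```

   Starting at A or E takes five steps, whereas starting at the central device C takes
only three.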



4.2.4. Implementation of parallel device refresh
   In contrast to a real network discovery, the devices are known in advance at device
refresh. This makes it simple to apply the master-slave design to the refresh operation
of Netdisco. For each known device a process is forked which calls discover for it
(figure 4.9). Again, Parallel::ForkManager implicitly takes care of the total number of
running children, which is limited by the configuration variable max_procs.
   The order in which devices are refreshed depends on the order in which they are
returned from the hash variable %OldDevices, into which they are loaded. This order is
determined by Perl and can be regarded as “random” for this purpose.
   In comparison to function run, no further variables have to be tied to shared memory,
because sub-routine discover is the same as in run and all devices that have to be
processed are known in advance.




              Figure 4.9 Implementation of Netdisco's refresh function in parallel mode






4.2.5. Implementation of parallel host discovery
   For discovering and naming hosts, ARPwalk and MACwalk have to be run. Both need some
more variables to be globally accessible during parallel device operations. Table 4.3
lists them. They are used for statistical breakdowns only.

     Variable Name             Purpose
     $ArpTotal                 total number of ARP entries
     %MacSeen                  number of host MAC-addresses
     %MacTotal                 total number of MAC-addresses


                 Table 4.3 Shared memory variables used by arpwalk and macwalk


   All other variables do not have to be available to other slave processes, because
they are processed per device and are completely independent of each other.
   Parallelization of both “walk” functions is implemented in the same way, except that
the slave processes are called according to the layer on which they collect data.
Therefore both methods are presented together in figure 4.10.








                       Figure 4.10 Functions arpwalk and macwalk parallelized



4.3. Network detail maps and visualization of layer-1 and layer-2 properties
   By means of automatic network discovery it is only possible to detect active links of
the topology. Netdisco collects both layer-1 and layer-2 data of the devices comprising
this topology and saves them into its database. Figure 4.11 shows how the navigation
sidebar of Netdisco's web front-end has been changed to provide access to the new
network maps.




   Visualization of the network is provided by a map which illustrates only the devices
and the connections between them, including the corresponding link speed. This map is
accessible through the HTML link [Network Map] or [CDP]. Additionally, a map can be
generated that looks like the one originally used by Netdisco but is assembled entirely
from Spanning Tree topology information; it is marked as link 1 [STP].




                   Figure 4.11 Netdisco's new sidebar for additional network maps


   To give a more detailed view, further maps that depict certain layer-1 and layer-2
properties have been implemented. Two maps show by which ports devices are connected,
2 [CDP] and 3 [STP]. The first (2) is generated from all layer-1 topology information
found, the second (3) from data supplied by STP. Two further maps corresponding to the
first two are available, 4 [CDP] and 5 [STP], which show additional layer-2 properties
of ports. They make it possible to mark a port if it is set to a chosen native VLAN or
if it is a VLAN trunk port without a native VLAN. Furthermore, the Spanning Tree
Protocol port status is indicated if the port is in blocking state. It has to be noted
that this is the state provided by the Bridge-MIB Spanning Tree information. It can vary
from VLAN to VLAN if devices support multiple Spanning Tree instances corresponding to
VLANs.
   For generation of the STP device map, Netdisco's function make_graph is reused with
an optional parameter that controls whether STP data is to be used.
   To build the detail maps, make_graph is extended by the possibility to set layer-2
properties besides the link speed of a connection. The extended function is called
make_layer_graph. Its main concept equals that of the former one, but links are extended
by the ports through which devices are connected to each other. This is done by
appending '#port-name' to the device name. Furthermore, the specific layer-2 properties
of the currently evaluated port are assigned to it. With this procedure it is possible
to reuse the functions creating the GraphViz object for visualization, with only small
changes to reflect the assigned values in the network images.
    Link aggregation is shown implicitly in the layer-2 graphs 1, 3, 4 and 5. Aggregated
ports are depicted as a single port, because they share a single STP instance.
    Generation of the detailed network maps takes about 1 to 2 seconds for about 650
graph nodes.



4.4. Experiments and results of implementations


4.4.1. Results of using vendor independent topology information
    Tests showed that the implemented methods work as expected. With the help of STP it
is possible to find connections which cannot be determined from CDP data alone. Links
that CDP and STP have in common are detected identically by both.
    On the other hand, the network topology generated from Spanning Tree data misses
some devices. CDP finds 198 connected devices in the tested network, whereas STP detects
only 165 devices. This is because:
•   CDP advertisements traverse devices that forward these frames without taking part in
    CDP themselves, whereas BPDU frames are not forwarded;
•   STP is only used by layer-2 capable devices, whereas CDP is used by other devices
    too;
•   some Cisco Catalyst 1900 switches do not provide a Bridge Base Address, which is
    needed by the STP methods to determine links;
•   STP is administratively disabled on some devices.
    Devices supplying insufficient information cannot be detected, but the combination
of methods compensates for some of these flaws.
    Alteon switches could not be placed in the topology, because layer-2 switches of
this vendor do not use unique port names, which are needed in combination with the
device's IP address as primary database key in table device_port. This problem can be
solved by implementing an SNMP::Info class which is aware of this problem and changes
port names into unique ones. A similar class already exists for Alteon layer-3 devices.



4.4.2. Results of concurrent operations
    Several test runs have been performed to find out whether and how much parallel
device operations accelerate network discovery in a real environment. For this purpose
the functions discover, refresh, ARPwalk and MACwalk have been executed with different
limits on the number of simultaneously running processes. The tests have been made in a
network consisting of about 200 Cisco devices and 3 Alteon switches. In CDP network
discovery the Alteon devices do not appear, because they are CDP-unaware and therefore
transparent to the network. This does not change the test results significantly, as
they account for only about 1.5 % of all devices.
   Figure 4.12 illustrates how much time the different operations need and how much they
are accelerated by parallelization. Table 4.4 presents the times of the operation tests
as a function of the process limit.




          Figure 4.12 Comparison of different operations with different number of processes







      operation    process limit    time [minutes]    devices/entries    average time
      discover                 1             27.37                193           0.142
      discover                 2             12.82                193           0.133
      discover                 5              5.93                193           0.154
      discover                10              4.15                193           0.215
      discover                25              3.42                193           0.443
      discover                50              3.78                193           0.979
      discover               100              3.29                193           1.705
      discover*              100              9.17                193           4.751
      discover               200              3.32                193           3.440

      refresh                  2             11.62                193           0.120
      refresh                  5              5.38                193           0.139
      refresh                 10              3.13                193           0.162
      refresh                 25              3.63                193           0.470
      refresh                 50              2.55                193           0.661
      refresh                100              2.6                 193           1.347
      refresh*               100              5.77                193           2.990
      refresh                200              3.8                 193           3.938

      macwalk                  1             22.78              44712           0.001
      macwalk                 25              7.05              48506           0.004
      macwalk                 50              5.72              50961           0.006
      macwalk                100              3.86              38645           0.010

      arpwalk                 25              1.88               4330           0.011
      arpwalk                 50              3.39               8751           0.019
      arpwalk                100              2.93               8252           0.036


   * operation without bulkwalk


           Table 4.4 Comparison of different operations with a different number of processes


   In addition, the table lists the number of devices for the operations discover and
refresh, the number of table entries read for the operations macwalk and arpwalk, and
the average time it takes to process one device or entry. All tests have been made in the
same network environment on three different days and at different times of those days.
Most values are averages of up to 16 distinct runs. Some of the values nevertheless
deviate. This is due to the effects of normal network and device operation while the
measurements were taken. Furthermore, the operations depend on the amount of data that
is retrieved. Especially for macwalk and arpwalk the number of entries supplied by the
devices fluctuates strongly, because these values are operational data of the devices,
which is learned dynamically and aged out again. In contrast, the device-specific data
read during discover or refresh changes only on purpose and therefore less frequently.
   All methods have in common that the speed-up increases very quickly with a small
number of concurrent processes, in the range of 2 to 25. Above 25 parallel requests the
acceleration increases only slightly, up to about 100 processes. With more than 100
processes a slowdown can be noticed. This is caused by memory usage, which exceeded the
physical memory of the test machine, so that swapping began.
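   The speed-up can be computed directly from the discover rows of table 4.4, as the
following Python sketch shows. The function names are hypothetical. The relation used
for the "average time" column (total time multiplied by the process limit and divided by
the number of devices) is inferred from the numbers in the table and is not stated
explicitly in the text.

```python
# (process limit, total time in minutes) for operation discover, from table 4.4
DISCOVER = [(1, 27.37), (2, 12.82), (5, 5.93), (10, 4.15), (25, 3.42),
            (50, 3.78), (100, 3.29), (200, 3.32)]
DEVICES = 193


def speed_up(rows):
    """Speed-up of each run relative to the run with a single process."""
    base = dict(rows)[1]
    return {limit: base / minutes for limit, minutes in rows}


def average_time(rows, devices):
    """Average time per device, assumed to be time * process limit / devices."""
    return {limit: minutes * limit / devices for limit, minutes in rows}


for limit, factor in speed_up(DISCOVER).items():
    print(f"{limit:>3} processes: speed-up {factor:.1f}")
```

   With these values the speed-up grows to about 8 at 25 processes and stays in that
region for higher limits, which matches the saturation described above.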
   A notable fact is that the speed-up of discovery is almost as high as that of
refresh. This indicates that the chosen seed device is a very good starting point for
discovery in the “center” of the network and that the network is probably designed in a
star-like pattern.
   Section 4.2.3 Implementation of parallel network discovery on page 81 states that
discovery is done distance level by distance level. Under these circumstances the
branching factor b_n defines how many concurrent processes are necessary to discover a
network level at once: if the number of simultaneously running processes is equal to or
greater than b_n, all devices of that level can be handled at once. The maximum
branching factor b_{n,max} is the greatest branching factor of any level in the whole
network. All other levels have a lower b_n. If the process limit is set to b_{n,max},
all devices of each distance level can be discovered at once. Therefore the time to
discover the whole network is:
                                   \[ t_n = d_n \cdot t_d \]
   with:
   d_n – depth of the network, seen from the seed device;
   t_d – average time to retrieve data from a device; strictly, the time of the most
       slowly responding device in each level must be chosen, but the average value is
       an appropriate approximation;
   t_n – time to discover the whole network.
   An algorithm that discovers devices immediately after they are added to the
discovery queue would not be quicker, because the slowest discovery path through the
network determines the total time to discover. This is exactly the same time as derived
for the algorithm above.
   In theory, a complete refresh can be done simultaneously on all devices. This would
take as long as the device with the longest response time needs. MACwalk and ARPwalk
behave similarly, but these two operations vary more in the amount of data delivered by
the devices than discover and refresh do. The more logically central a device is placed
in the network according to its function, the more information it keeps in its ARP or
forwarding tables. The acceleration therefore depends on the number of devices that
actually supply information and on how much data is collected per device. Forwarding
information is kept in almost all devices, because a bridged LAN typically consists of a
high percentage of layer-2 switches. Therefore the speed-up increases accordingly with
more processes running in parallel. ARP tables, in contrast, are provided only by
layer-3 devices and are thus less widespread throughout the network. Only a few devices
can be read out, so the acceleration is not as good as for the other operations. Table
4.5 shows the time the operations need if the process limit is greater than or equal to
the maximum branching factor b_{n,max}.

     operation    time needed          dependencies
     discover     t_d = d_n * t_a      d_n – depth of network seen from seed-device
                                       t_a – average time to retrieve data from a device

     refresh      t_r = t_a            t_a – average time to retrieve data from a device

     arpwalk      t_{aw} = t_a         t_a – average time to retrieve data from a device

     macwalk      t_m = t_a            t_a – average time to retrieve data from a device


     Table 4.5 Equations for the time operations ideally take in parallel mode, if the process limit is
                                 greater than or equal to b_{n,max}


   The assumptions made above do not take into consideration the effort of managing
concurrent processes. How much the time needed to manage parallelization influences the
overall time depends on the resources of the machine Netdisco runs on and on the way the
operating system and Perl handle the increased number of simultaneously running
processes. The average time needed to evaluate one device or entry, given in table 4.4
on page 92, increases with the number of processes. Figures 4.13 and 4.14 illustrate
that the additional time used to manage processes increases approximately linearly with
the number of processes. Furthermore, it can be seen that the time to handle one device
or entry depends on the amount of data to process. This explains the differences in
average times between ARPwalk and MACwalk and the similarity of discover and refresh.




                                                         Diploma Thesis – Network Discovery – A. Barthel




     Figure 4.13 Average per device time of operations, in dependency of the number of processes


   With these results, the equations in table 4.5 must be extended by a term for the
additional time needed to handle processes. This time tp can be calculated from the
number of processes p and the average time to handle a process, tp,average.








      Figure 4.14 Average per device time of operations, in dependency of the number of processes




                                           $t_p = p \cdot t_{p,\mathrm{average}}$

   The value tp must be added to the equations in table 4.5. Knowing tp, it can be
predicted whether an increase in concurrent processes will accelerate a Netdisco
operation or slow it down because the effort to manage processes exceeds that of the
actual operation itself.
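
   The extended model can be illustrated numerically. The following Python sketch is an
illustration only: the function names and all sample values for dn, ta, p and tp,average
are assumptions and not measurements from this work. It adds tp = p * tp,average to the
ideal times of table 4.5, which assume a process limit of at least bn,max.

```python
def process_overhead(p, tp_average):
    """Additional time to manage p processes: tp = p * tp_average."""
    return p * tp_average


def discover_time(dn, ta, p, tp_average):
    """Ideal parallel discover time (dn * ta) plus process overhead tp."""
    return dn * ta + process_overhead(p, tp_average)


def walk_time(ta, p, tp_average):
    """Ideal parallel refresh/arpwalk/macwalk time (ta) plus process overhead tp."""
    return ta + process_overhead(p, tp_average)


if __name__ == "__main__":
    # Assumed sample values: depth 3, 10 s per device, 0.05 s overhead per process.
    for limit in (25, 100, 200):
        print(limit, discover_time(3, 10.0, limit, 0.05))
```

   With such a model, a process limit is only worthwhile as long as the overhead term
stays small compared to the time saved by parallel retrieval.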



4.4.3. Resource requirements for concurrent operations
   The properties of the network structure cannot be influenced by the topology
discovery. The operations necessary to obtain data from devices, however, affect the
network itself, the devices queried and the machines Netdisco and the DBMS are running
on. It therefore also has to be considered how much and how often information is read
from devices and how this affects the entire environment.
   The conditions under which the network topology is determined have to be balanced
between the demands on it and its positive as well as negative effects.
   For instance, a network discovery from a starting point without any known information
will be done very rarely, under normal circumstances only once. In contrast, a network
discovery to learn of the latest changes in the topology, or a refresh of device
information, is done regularly, maybe once a day. Host information, on the other hand,
fluctuates more and has to be retrieved more often. Furthermore, it must be clarified
which data is actually needed.
   To give an impression of the resources required by the different information
retrieval processes, test results and estimations are presented in this chapter.


                            attribute                 version/value
                            CPU                       1.2 GHz AMD Athlon
                            physical memory           512MB
                            network adapter           100MBit/s 3com 3c905B
                            Operating System          Fedora Core Linux 1
                            Kernel                    2.4.22-1.2115.nptl
                            Perl                      5.8.3-16
                            Netdisco                  0.94
                            SNMP::Info                0.9
                            Net-SNMP                  5.2
                            PostgreSQL                7.3.4-11
                            dstat                     0.5.7


                Table 4.6 Characteristics of the machine used for parallelization tests


   The characteristics of the test machine are shown in table 4.6. All values have been
measured with dstat [DSTAT04], invoked by the command:
  dstat -tcdnmlpy -N eth0 --csv .
   Diagrams of the CPU and memory load as well as the network traffic during the
different operations are shown in figures 4.15 to 4.30 on pages 101 to 108.






   What can clearly be seen is that the more processes run simultaneously, the more CPU
and memory load they produce. Network traffic depends on the kind of operation and
therefore on the amount of data transferred. In parallelized discovery operations at
least 3 different distance levels, at which most of the devices are located relative to
the seed-device, can be recognized. Figure 4.20, page 103, shows that SNMP operations
without GET-BULK send and receive almost the same amount of data, whereas all other
operations, which issue GET-BULK, receive more data than they send. Refresh of devices
is done in random order. Therefore the refresh operations in figures 4.23 to 4.26 show a
different pattern compared to discovery operations. With a limit of 200 simultaneously
running processes, the refresh operation starts processing all devices at once. Figure
4.25 shows that memory usage is above 90% almost all the time during this operation.
Because the OS tries to keep a small amount of physical memory free, it starts to use
swap memory. This is very slow, and therefore certain device operations time out, which
is recognizable by the reduced network traffic in figure 4.26 and the lower CPU load in
figure 4.25.
   Table 4.7 gives an overview of the network traffic produced by distinct operations,
including SNMP traffic and transferred data, as well as the CPU and memory load. These
values are not averages; they are taken from individual runs that are relatively
representative.




 operation   process   network traffic   network traffic   network traffic   CPU load      memory load   time     devices/
             limit     received [kbyte]  sent [kbyte]      total [kbyte]     average [%]   average [%]   [min]    entries
 discover          1          7769.69           2393.26          10162.95          6.2         40.61      26.33       193
 discover        100          7555.71           2349.25           9904.96         47.95        34.71       3.38       193
 discover*       100         16144.6           15045.02          31189.62         27.36        35.57       9.17       193
 discover        200          7551.64           2347.58           9899.22         49.91        22.77       3.32       193
 refresh         100          6262.43           1987.37           8249.8          38.85        66.57       3.27       192
 refresh         200          1764.22            522.23           2286.45         11.24        69.87       4.05       191
 arpwalk          25          3020.27            548.23           3568.5          54.79        18.04       1.88      4330


 operation   process   data received      data sent          data total         bandwidth usage   bandwidth usage   bandwidth usage
             limit     per device/entry   per device/entry   per device/entry   received          sent              total
                       [kbyte]            [kbyte]            [kbyte]            [kbyte/s]         [kbyte/s]         [kbyte/s]
 discover          1            40.26              12.4               52.66              4.92              1.51              6.43
 discover        100            39.15              12.17              51.32             37.26             11.58             48.84
 discover*       100            83.65              77.95             161.6              29.34             27.34             56.69
 discover        200            39.13              12.16              51.29             37.91             11.79             49.69
 refresh         100            32.62              10.35              42.97             31.92             10.13             42.05
 refresh         200             9.24               2.73              11.97              7.26              2.15              9.41
 arpwalk          25             0.7                0.13               0.82             26.78              4.86             31.64
 macwalk          25             0.18               0.1                0.28             19.19             10.12             29.31


 * operation without bulkwalk



                                                             Table 4.7 Resource usage during selected operations
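
   The per-device and bandwidth columns of table 4.7 follow from the measured totals. The
Python sketch below is an illustration of that arithmetic; the function and key names are
assumptions. The sample call uses the row for discover with a process limit of 1.

```python
def derived_metrics(received_kbyte, sent_kbyte, minutes, entries):
    """Derive per-entry data volume and bandwidth usage from measured totals."""
    seconds = minutes * 60.0
    total_kbyte = received_kbyte + sent_kbyte
    return {
        "received_per_entry": received_kbyte / entries,  # kbyte per device/entry
        "sent_per_entry": sent_kbyte / entries,          # kbyte per device/entry
        "total_per_entry": total_kbyte / entries,        # kbyte per device/entry
        "bandwidth_received": received_kbyte / seconds,  # kbyte/s
        "bandwidth_sent": sent_kbyte / seconds,          # kbyte/s
        "bandwidth_total": total_kbyte / seconds,        # kbyte/s
    }


if __name__ == "__main__":
    # Row "discover, process limit 1" of table 4.7.
    for name, value in derived_metrics(7769.69, 2393.26, 26.33, 193).items():
        print(f"{name}: {value:.2f}")
```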


   The conclusion from table 4.7 and the following diagrams is that network bandwidth
usage is very low. Even a 10MBit connection supplies enough bandwidth to operate more
than 100 simultaneously running processes. The limiting factors are the CPU and memory
load. The tests presented have the flaw that Netdisco and PostgreSQL were running on the
same machine. Therefore it cannot be determined exactly which of the two consumes more
resources. Installing the DBMS on a different machine can lower the requirements for the
Netdisco machine itself, but network traffic will rise instead. As the network traffic
produced by discovery operations is relatively low, this is a desirable improvement.
   The impact on network devices is very high during the operations. The reason for this
is the high CPU usage while generating SNMP data. How much this affects normal device
functions depends on the vendor's implementation of SNMP agents. But it is very likely
that devices are prepared for such SNMP data retrieval. In the case of Cisco devices,
SNMP processes on devices run with a low priority and will back off if resources are
needed by the device's “real” duties. While the tests were performed, no network failure
or interference with normal device operations occurred, which supports these
assumptions.








Figure 4.15 CPU and memory load during discovery with a limit of 1 process




   Figure 4.16 Network traffic during discovery with a limit of 1 process







            Figure 4.17 CPU and memory load during discovery with a limit of 100 processes




               Figure 4.18 Network traffic during discovery with a limit of 100 processes







Figure 4.19 CPU and memory load during discovery with a limit of 100 processes and bulkwalk disabled




   Figure 4.20 Network traffic during discovery with a limit of 100 processes and bulkwalk disabled







            Figure 4.21 CPU and memory load during discovery with a limit of 200 processes




               Figure 4.22 Network traffic during discovery with a limit of 200 processes







Figure 4.23 CPU and memory load during refresh with a limit of 100 processes




   Figure 4.24 Network traffic during refresh with a limit of 100 processes







             Figure 4.25 CPU and memory load during refresh with a limit of 200 processes




                Figure 4.26 Network traffic during refresh with a limit of 200 processes







Figure 4.27 CPU and memory load during arpwalk with a limit of 25 processes




   Figure 4.28 Network traffic during arpwalk with a limit of 25 processes







            Figure 4.29 CPU and memory load during macwalk with a limit of 25 processes




                Figure 4.30 Network traffic during macwalk with a limit of 25 processes





4.4.4. Problems during implementation
   Several problems occurred during implementation. One of them is that Cisco WS-C5000
devices do not supply STP Designated Port information. But as these devices deliver
CDP-neighbor information, which is preferred over STP-topology information, the lack of
data can be circumvented.
   Although Netdisco is forked for parallelization, which includes the device
SNMP-operations, SNMP::Info MIB-initialization is done sequentially. This presumably
depends on the way Perl handles forked processes and the way SNMP::Info performs this
initialization. Because this operation takes some seconds and is performed for each
device, a dummy SNMP::Info object is instantiated before Netdisco is forked. The
instance is thereby inherited by all children, which do not have to do the
MIB-initialization again. This is implemented as sub-routine parallel_init() in the
Netdisco discovery script.
   During simultaneous device operations the Netdisco machine can be under very high
system load. Furthermore, operations may also have to wait for access to locked
variables. As a result, timeouts must be increased to meet the changed requirements of
parallel operation.
   During concurrent operations, database problems occurred occasionally. The exact
cause could not be determined, because the problems arose randomly in about 1% of all
tests. But it is very likely that Netdisco can be ruled out as the cause; otherwise the
problems would appear more frequently and could be predicted.



4.5. Verification of results
   A question that has not been answered so far is: How can discovery results be
verified?
   This can be done by comparing different types of discovery. But as developed in this
work, different methods do not necessarily produce the same results. So the only way to
verify results is to maintain the topology manually, which is exactly what was intended
to be avoided and which is also not error-free.
   Automatic topology determination is more reliable than manual maintenance. If errors
occur during automatic topology determination, they are systematic and possibly
avoidable. Errors in a manually maintained topology are caused by the human factor, which
cannot be avoided. So automatic methods are preferred, but manual information can be
compared with the determined network topology to find possible systematic errors.







5. Integration into an existing device inventory
   Netdisco so far is a pure network discovery tool with simple inventory functions. If
another device inventory exists in the environment Netdisco is used in, an integration or
cooperation of both systems is desirable.
   In this case such an inventory exists in the form of the TU-Chemnitz Communication
Manager (TUCOMA) [BA01]. Each layer-1 component and all connections between them are
registered in this system. All entries within that database are entered manually. With
an integration of Netdisco into TUCOMA it would be possible to reduce the effort of
maintaining the parts that Netdisco can supply automatically. On the other hand, the
TUCOMA database holds all network devices, which makes it easy to know all devices. This
might be helpful if non-CDP devices cannot be discovered by Netdisco and have to be
processed manually.
   To be able to identify a device uniquely in both systems, a mapping between the
different device identifiers has to be found. The second constraint to be satisfied is
the adaptation of the Netdisco data-model to the TUCOMA data-model.



5.1. Mapping between Netdisco and TUCOMA identifiers
   Devices are uniquely identified in both systems, but they use different IDs for it.
Netdisco uses the device's IP-address and TUCOMA its own designator. Both use device
serial numbers, but these are not identical, because serials are stored or noted in
different locations in and on devices. TUCOMA furthermore uses a sequential inventory
number for unique identification, but this is used internally only. For better human
readability a generic designator composed of device and room specifics is used for the
identification of devices; see [BA01], page 52, for details.
   For every device designator a DNS-alias exists that maps the designator to the
device's IP-address. This makes it easy to find a device with the TUCOMA identifier in
Netdisco, as Netdisco uses the IP-address for device identification.




                          Figure 5.1 TUCOMA-designator to DNS-name mapping


   The opposite direction, from the Netdisco ID, i.e. the IP-address, to the TUCOMA
designator, is more difficult, because a resolution of the device's IP-address returns
its DNS-name, without the possibility to get the DNS-alias that reflects the TUCOMA
designator. This makes it necessary to find the TUCOMA designator for the device in
another way.
   One way to do this is to determine the designator from the device's DNS-name, by
extracting the room, building and sequential number and the device type. Figure 5.2
shows an example.




                      Figure 5.2 DNS-name mapping to TUCOMA-designator


   The device class is set to “br” for bridges. The component-index is not known, but
with the combination of all other parameters a device can be found uniquely. For devices
to which this scheme is not applied for any reason, the TUCOMA-designator is registered
in the RFC1213-MIB variable sysName.
   This name-mapping scheme is applicable in most cases, but DNS-names for Alteon Network
switches do not follow this naming convention. They are named “ace” followed by the
vendor's device type name, for instance “184”. This results in the DNS-name “ace184”,
which corresponds to the “bridge type” registered in the TUCOMA database.
   The easiest way to obtain a complete list of DNS-name to designator mappings is to
use the above-mentioned “TUCOMA-designator to IP-address mapping” and save the reversed
entries to a database or list.
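
   Saving the reversed entries can be sketched as follows. The Python code is an
illustration only; the function name, the designators and the IP-addresses (taken from
the documentation address range) are assumptions and do not reflect real TUCOMA entries.

```python
def build_reverse_mapping(designator_to_ip):
    """Reverse a designator -> IP-address mapping into IP-address -> designator."""
    ip_to_designator = {}
    for designator, ip in designator_to_ip.items():
        if ip in ip_to_designator:
            # Two designators resolving to one address cannot be reversed uniquely.
            raise ValueError(f"{ip} is mapped by more than one designator")
        ip_to_designator[ip] = designator
    return ip_to_designator


if __name__ == "__main__":
    # Hypothetical result of resolving the DNS-aliases of two designators.
    forward = {"designator-a": "192.0.2.10", "designator-b": "192.0.2.11"}
    print(build_reverse_mapping(forward))
```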



5.2. Adaptation of the Netdisco to the TUCOMA data-model
   The TUCOMA data-model is very generic compared to Netdisco's. Its central element is
a component. A component can contain other components and can itself be part of another
one. Furthermore, components contain connection points to which other connection points
are attached. Figure 5.3 reflects this concept in a simplified ERD.
   Netdisco's data-model is shown in figure 3.2 on page 40. For devices and device ports
it can be adapted to the TUCOMA model by defining these entities as TUCOMA components.
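
   The component concept can be sketched in a few lines of Python. The class and
attribute names are assumptions made for illustration and are not the TUCOMA schema;
the device and port names in the example are hypothetical as well.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class ConnectionPoint:
    name: str
    attached_to: Optional["ConnectionPoint"] = field(default=None, repr=False)

    def attach(self, other):
        """Attach two connection points to each other."""
        self.attached_to = other
        other.attached_to = self


@dataclass(eq=False)
class Component:
    name: str
    parent: Optional["Component"] = field(default=None, repr=False)
    children: List["Component"] = field(default_factory=list)
    connection_points: List[ConnectionPoint] = field(default_factory=list)

    def add(self, child):
        """Place a component inside this one and return it."""
        child.parent = self
        self.children.append(child)
        return child


if __name__ == "__main__":
    # A Netdisco device and one of its ports, both modelled as components.
    device = Component("switch-1")
    port = device.add(Component("port-1"))
    port.connection_points.append(ConnectionPoint("socket"))
    print(device)
```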



   With the help of the DNS-name to designator mapping, it is possible to define the
device components for TUCOMA. Device ports are identified in Netdisco by their layer-1
name; TUCOMA uses sequential port numbers. These can be derived from the IF-MIB entries
by counting all ports, provided they are listed in the same order.
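
   Counting the ports can be sketched as follows. The Python function is an
illustration; its name and the sample port names are assumptions, and the result is only
valid under the stated condition that the IF-MIB order matches the TUCOMA numbering.

```python
def sequential_port_numbers(port_names_in_ifmib_order):
    """Map layer-1 port names to sequential numbers, starting at 1."""
    return {
        name: number
        for number, name in enumerate(port_names_in_ifmib_order, start=1)
    }


if __name__ == "__main__":
    # Hypothetical layer-1 port names in the order the IF-MIB lists them.
    print(sequential_port_numbers(["FastEthernet0/1", "FastEthernet0/2"]))
```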




              Figure 5.3 Simplified Entity Relationship Diagram of the TUCOMA data-model


   Another problem is that modular devices are listed as one device in Netdisco, whereas
TUCOMA registers the modules as separate components contained in the superior component.
Port numbers are assigned independently in each sub-component, which makes it impossible
to map layer-1 port names to sequential device port numbers. To solve this problem other
ways of identifying a port must be developed, for example adding layer-1 names to TUCOMA
port-designators.
   TUCOMA is a real layer-1 database, in which all components and the connections
between them are held. It does not differentiate whether a component is an active one,
like a device, or a passive one, like a cable, patch-panel or data-port.
   As developed before, only active components can be detected by means of network
discovery. Therefore the connections between passive/undetectable components and
active/detectable ones must be handled like transparent connections, as illustrated in
figure 5.4.




                      Figure 5.4 Example of a connection between active components


   Because all passive components are contained within the TUCOMA database, they can be
assigned to the connected active component, i.e. the device.
   What has to be clarified is whether it makes sense to import connections determined
by Netdisco into TUCOMA, because Netdisco, like any other method based on data retrieval
from devices, can only detect active connections. The TUCOMA database, however, contains
all connections, regardless of whether they are active or passive, i.e. disabled. If,
for instance, a change in topology is detected and automatically written to the layer-1
database, all passive connections are lost and must be entered manually again. This
requires semi-automatic processing, in which automatically detectable parts are added
first and non-detectable information is inserted in a second, manual phase.
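
   One possible shape of such semi-automatic processing is sketched below in Python. It
is an assumption for illustration, not an existing TUCOMA or Netdisco function:
detected connections are added automatically, while inventory connections that were not
detected are kept and only flagged for the manual phase instead of being deleted.

```python
def merge_connections(inventory, detected):
    """Merge detected connections into the inventory without losing passive ones.

    Connections are given as pairs of endpoint names; direction is ignored.
    Returns (merged, to_add, needs_review).
    """
    inventory_set = {frozenset(pair) for pair in inventory}
    detected_set = {frozenset(pair) for pair in detected}
    to_add = detected_set - inventory_set        # phase 1: added automatically
    needs_review = inventory_set - detected_set  # phase 2: passive or outdated?
    merged = inventory_set | detected_set
    return merged, to_add, needs_review


if __name__ == "__main__":
    # Hypothetical endpoints A to D.
    merged, to_add, review = merge_connections(
        [("A", "B"), ("B", "C")], [("B", "A"), ("A", "D")]
    )
    print(sorted(sorted(c) for c in review))
```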
      A further desirable goal is to merge both databases into a single one. Devices can
be identified as described in this and the preceding chapter. But without a way to manage
all connections uniformly, the Netdisco and the TUCOMA connections must be held
separately in it.







6. Conclusion


6.1. Main results
   This work has shown how network topologies can be found by the use of
vendor-implemented protocols, such as the Cisco Discovery Protocol. Besides that, the
question whether a vendor-independent network discovery is possible has been answered by
presenting discovery methods based on Spanning Tree data and Filtering Databases.
   The analysis of the Open Source project Netdisco is a fundamental contribution that
serves as a development guide for improvements on it.
   An enhancement built upon this analysis is the parallelization of device operations.
It reduces the duration of network discovery operations to a fraction of that of
sequential processing, which is a very valuable benefit for information retrieval in
large networks. Additionally, estimations of the time consumption and resource
requirements of distinct discovery methods have been developed. This enables Netdisco
users to estimate the system load and the period of time discovery takes in a certain
network environment.
   Furthermore, it has been shown how information from the Cisco Discovery Protocol, the
Spanning Tree Protocol and Filtering Databases can be combined to form a comprehensive
network map of complementary topology information.
   The differences between layer-1 and layer-2 network topologies have been presented.
These different topology views are visualized by several network maps, which show
distinct layer-1 and layer-2 device properties.
   Proposals of how to adapt the Netdisco database to an existing device inventory have
also been made.



6.2. Remaining problems
   A problem that has not been solved so far is finding the port-membership of link
aggregations. Although aggregated links can be detected, it is not determined which
ports form such an aggregated link.
   The retrieval of per-VLAN layer-2 information is another remaining problem. Although
the Q-Bridge-MIB defines a public standard for data collection on a per-VLAN basis, some
vendors do not implement it. Cisco provides a different way with its per-VLAN community
strings, but Alteon switches do not provide any VLAN-specific information that can be
retrieved by means of SNMP.
   Besides this, it is necessary to implement some more device- and vendor-specific
SNMP::Info classes to support a greater variety of devices.
   In general it would be easier to discover networks if devices supplied sufficient
information. Standards for this exist, but it is the duty of vendors to implement them
appropriately in devices.



6.3. Future outlook
    Knowing the network topology is essential for many management processes. This
chapter makes proposals for developing further applications and uses based on the
concepts presented throughout this work.
•   The knowledge collected by network discovery could be the basis of a
    root-cause-analysis system, which is able to determine or estimate the reason for
    network errors.
•   The next step in development would be to use the information about the determined
    cause and try to solve the problem automatically.
•   A history of changes in the network topology would also be helpful to analyze the
    occurrence of failures or unintended behavior.
•   Netdisco is a modular application; it could be integrated either into a collection
    of network management tools or into a Network Management System.
•   It is also possible to extend Netdisco with modules that turn it into a
    full-featured NMS itself.
•   If Netdisco is used within a pool of network management tools, it would be desirable
    to integrate it into a central user management and a single DBMS.
•   Development of gathering per-VLAN Spanning Tree information has begun in the
    sub-routines stp_suck and stp_walk, provided within the Netdisco script. But the
    integration of this information into the database and furthermore into the
    graph-generation process is not finished yet.
•   Furthermore, Netdisco's modularity allows different visualization methods to be
    adopted. A great feature would be the possibility to draw every network device and
    end station, with the level of information detail adjustable to the users' needs.
•   A search function that finds the position of a certain node within the network graph
    is also a desirable feature. Furthermore, a dynamically created graph would allow
    navigating through the graph.


    To implement the mentioned visualization features, it is suggested to use
object-oriented design patterns, which assign the desired information to retrieval
methods on the one side and to visualization methods on the other. This would enhance
the separation of concerns already used in Netdisco.



Appendix A - User documentation of the new Netdisco features
    The installation process of Netdisco has not changed from the original version.
Installation instructions can be found at the Netdisco project-site². But additional
CPAN-modules are necessary in order to use the newly implemented features. Those modules
are: IPC::Shareable and Parallel::ForkManager.
    Note that syntax changes occurred in version 0.58 of the module Graph, which are
incompatible with Netdisco 0.94. A patch to solve this problem has been announced on the
Netdisco mailing-list and is also included on the provided CD.


    Using new features via command line parameters:


    Parallelization:
•   enable concurrency mode: set max_procs in netdisco.conf
•   disable concurrency mode: ./netdisco -r | -R | -m | -a -c
    Concurrent device operations are invoked automatically if the maximum number of
processes running in parallel, set in the configuration file parameter max_procs, is
greater than 1. Concurrency can be explicitly disabled with the command line option -c.
Operations that can be parallelized are -r --discoverall, -R --refresh, -m --macwalk and
-a --arpwalk.


    Using Spanning Tree topology data:
•   ./netdisco -s
•   set stp_topo in netdisco.conf
    The integration of Spanning Tree topology data can be enabled in netdisco.conf
with the boolean parameter stp_topo. If it is set to 1, yes or true, Spanning Tree
information is used to find connections after a network discovery and after a refresh of
all devices. Furthermore it can be invoked from the command line with the parameter -f
-fillstp.


    Creating network maps:
•   ./netdisco -l 1 [-s]
•   ./netdisco -l 2 [-s] -[vlan vlan[range]]
    Several network maps have been added. A device topology map determined by Spanning
Tree data only can be generated by invoking Netdisco with the parameter -s --stp-graph.
Network detail maps showing port information are created via the parameter -l --layer
with the argument 1 or 2. With 1, the map additionally shows the ports through which
devices are connected to each other. With 2, the same detail map is drawn, but ports that
are in Spanning Tree blocking state are marked as well. Furthermore, this gives the
possibility to append another parameter -vlan with the desired VLAN or a range of VLANs.
This marks the ports assigned to the selected native VLAN, the ports assigned to another
native VLAN and the ports assigned to no native VLAN. Colors for the illustration of the
port and link properties are controlled via parameters in netdisco.conf. Detail maps can
also be created for STP data only, by appending the parameter -s.

2   http://netdisco.org


    Include Subnet:
•   set discover in netdisco.conf
    During device discovery a subnet can be defined to which devices have to belong;
otherwise they are not discovered. This subnet is specified in netdisco.conf through the
parameter discover. Like the parameter discover_no, it can contain a list of IPs and
subnets.
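
    The effect of such an include list can be sketched with Python's ipaddress module.
The sketch is an illustration and not Netdisco's Perl implementation; the function name,
the sample addresses and the behaviour for an empty list (no restriction) are
assumptions.

```python
import ipaddress


def is_allowed(device_ip, discover):
    """Return True if device_ip matches an IP or subnet of the discover list."""
    if not discover:
        return True  # assumption: an empty list means no restriction
    address = ipaddress.ip_address(device_ip)
    # A single IP is parsed as a network containing exactly one address.
    networks = [ipaddress.ip_network(entry, strict=False) for entry in discover]
    return any(address in network for network in networks)


if __name__ == "__main__":
    # Hypothetical discover list with one subnet and one single IP.
    discover = ["192.0.2.0/24", "198.51.100.7"]
    for ip in ("192.0.2.10", "198.51.100.7", "198.51.100.8"):
        print(ip, is_allowed(ip, discover))
```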


    Further explanations and all other command line parameters as well as configuration
file settings are provided in file README.




Appendix B – Content of the annexed CD


•   netdisco-0.94_with_mibs.tar.gz : original Netdisco 0.94 package including
    MIBs
•   netdisco-0.94i_with_mibs.tar.gz : Netdisco 0.94i, including improvements
    developed within this work
•   netdisco-0.94i_patches.tar.gz : patches which can be applied to the original
    Netdisco 0.94
•   SNMP-Info-0.9.tar.gz : original SNMP::Info 0.9 package
•   SNMP-Info-0.9i.tar.gz : SNMP::Info 0.9 including methods to retrieve values used
    by the improved version of Netdisco
•   SNMP-Info-0.9i.tar.gz : patches which can be applied to the original SNMP::Info
•   netdisco-0.94_with_Graph-0.58.patch : patch to solve incompatibility problems
    with Graph 0.58


    All other CPAN modules and necessary packages are either integrated into
distribution packages or can be downloaded from CPAN or the corresponding project
homepages.




   Declaration of Authorship:


   I hereby declare that this thesis was produced using only the listed aids and
sources.




   Alexander Barthel                                  Chemnitz, April 15, 2005



