Computer networks
The Internet and computer networks
 Defining the Internet
 Nuts and bolts description:
 Services description:
 Protocols
 Network edge:
 Network core:
 Internet service providers (ISPs):
 Delays in packet switching networks
 Protocol layering

Application layer
 Non-persistent and persistent connections
 The web: HTTP
 Cookies
 Web caching
 Conditional GET
 File transfer: FTP
 E-mail: SMTP
 DNS (Domain Name System)
 The life of a DNS query:
 DNS caching
 Peer-to-peer

Transport layer
 UDP
 Reliable data transfer
 ARQ (Automatic Repeat Request) protocols
 Sliding window protocol
 Selective repeat protocol
 TCP (Transmission Control Protocol)
 Estimating Round Trip Time (RTT)
 Timeout interval
 Duplicate ACK
 TCP flow control
 TCP handshake
 Congestion control

Network layer
 Network service models
 Virtual circuit networks (not the Internet)
 Datagram networks
 What’s inside a router?
 The network layer on the Internet
 IPv4
 IPv6
 Routing algorithms
 Routing algorithm classification:
 Link-state algorithm:
 Bellman-Ford equation
 Distance-vector (DV) algorithm
 Hierarchical routing
 RIP (Routing Information Protocol)
 OSPF (Open Shortest Path First)
 Border Gateway Protocol (BGP4) and Inter-AS routing

Link layer
 Link layer service models
 Error-detection and correction (EDC)
 Multiple access protocols

The Internet and computer networks
This section deals with a broad overview of computer networking:
• The basic software and hardware components.
• The network edge: End systems and applications.
• Network core: Links and switches, access networks and physical media.
• Delay, loss and throughput.
• Protocols and service models.

Defining the Internet

Nuts and bolts description:
A computer network of millions of end systems, connected to each other by a network of
communication links and packet switches. Different links have different transmission rates.
The packet switches are routers and link-layer switches. End systems and packet switches
run protocols (notably TCP and IP).

Services description:
An infrastructure that provides services to applications (distributed apps). The Internet
applications run on the end systems. Application Programming Interfaces (APIs) specify
how programs on the end systems ask the infrastructure to deliver data.

Protocols
A protocol defines the format and the order of messages exchanged between two or more
communicating entities, as well as actions taken on the transmission and/or receipt of a
message or other events. Computer networking is the what, why and how of protocols.

Network edge:
End systems = hosts (clients and servers)
Clients tend to be desktop PCs, laptops, phones etc. Servers are powerful, dedicated
machines. Client software and server software are distinct. This software treats the Internet
as a "black box".
Access to the network is provided through many different schemes: Dial-up, DSL, Cable,
Fiber, Ethernet, WLAN, 3G, etc. The physical medium is either guided (twisted-pair copper
wire, coaxial cable, fiber optics) or unguided (radio channels).

Network core:
Circuit switching: Resources for creating a path are reserved for the duration of the
transmission. The sender can transfer data at a guaranteed constant rate. Multiplexing a
circuit (share the channel between N users) is done either through FDM (frequency-
division multiplexing) or TDM (time-division multiplexing).
Packet switching: Predominant on the Internet. Resources are not reserved and messages
may have to queue (or be lost). Links use store-and-forward transmission. Statistical
multiplexing is used in packet switching networks. In the usual textbook example, 35 users
share a link that circuit switching could divide among only 10. If each user is active 10%
of the time, the probability that more than 10 are active at once is about 0.04%. Thus,
about 99.96% of the time packet switching serves all 35 users without queueing, so the
link is better utilized than with circuit switching.
The Internet has special routing protocols that build the forwarding tables routers use to
forward packets to the next link.
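
The statistical multiplexing figure can be checked numerically. The sketch below assumes
the common textbook setting of 35 users, each active 10% of the time, on a link that could
carry 10 simultaneous users. Both numbers and the function name are assumptions made for
illustration.

```python
from math import comb

def prob_more_than(k, n, p):
    """P(X > k) for X ~ Binomial(n, p): more than k of n users are active."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1, n + 1))

# Assumed scenario: 35 users, each independently active 10% of the time.
p_overload = prob_more_than(10, 35, 0.10)
print(f"P(more than 10 of 35 active) = {p_overload:.4%}")
```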

Internet service providers (ISPs):
Tier 1: Backbone networks, international in coverage, connected to every other tier 1
provider and to some tier 2 providers.
Tier 2: National networks, connected to some tier 1 provider.
Tier 3: Local providers, connected to some tier 2 or some tier 1.

Delays in packet switching networks
Nodal processing delay: Time required to examine the packet header and determine where to
direct the packet (d_proc < 10^-6 s).
Queueing delay: The packet waits to be transmitted onto the next link (d_queue < 10^-3 s).
Transmission delay: Determined by the size of the packet L [bits] and the transmission rate
of the link R [bits/s] (d_trans = L/R < 10^-3 s).
Propagation delay: Determined by the physical medium and the length (distance to travel)
(d_prop < 10^-3 s).
End-to-end delay: Assuming N - 1 routers on the path (that is, N links), the same delays at
every hop etc., we get:
d_end-to-end = N (d_proc + d_queue + d_trans + d_prop)
Queueing is a field of its own. Let a denote the average packet arrival rate [packets/s]
and assume that all packets are L bits long. Then La is the average bit arrival rate. R is
the transmission rate of the outgoing link. The ratio La/R is called the traffic intensity;
if La/R > 1, the queue grows without bound.
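
As a rough illustration of the delay formulas, the sketch below computes the end-to-end
delay and the traffic intensity. The function names and the example figures (a 12000-bit
packet on 10 Mbit/s links) are assumptions chosen for the example.

```python
def end_to_end_delay(n_links, d_proc, d_queue, packet_bits, rate_bps, d_prop):
    """N * (d_proc + d_queue + d_trans + d_prop), with d_trans = L / R."""
    d_trans = packet_bits / rate_bps
    return n_links * (d_proc + d_queue + d_trans + d_prop)

def traffic_intensity(arrival_rate_pps, packet_bits, rate_bps):
    """La / R: values above 1 mean the queue grows without bound."""
    return packet_bits * arrival_rate_pps / rate_bps

# Assumed example: 3 links, 12000-bit packets, 10 Mbit/s, 1 ms propagation per link.
delay = end_to_end_delay(3, 1e-6, 0.0, 12_000, 10e6, 1e-3)
load = traffic_intensity(1000, 12_000, 10e6)
print(f"delay = {delay * 1e3:.3f} ms, traffic intensity = {load:.2f}")
```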

Protocol layering
The Internet protocol stack:
                                     Application layer

                                      Transport layer

                                      Network layer

                                        Link layer

                                      Physical layer
Application layer: Network applications and their protocols (HTTP, SMTP, FTP, ...). An
application layer packet is called a message.
Transport layer: Transports messages. Two transport layer protocols: TCP and UDP. A
transport layer packet is called a segment.
Network layer: The IP protocol and routing protocols. Guides segments from one host to
another. A network layer packet is called a datagram.
Link layer: Delivers the packet to the next link. Examples of protocols: Ethernet, WLAN,
PPP, etc. A link layer packet is called a frame.
Physical layer: Responsible for moving individual bits from one node to another.
Application layer
Network services, clients and servers, processes and sockets and transport-layer interfaces.

Example applications: The web, e-mail, DNS, P2P file sharing and P2P telephony.

Processes exchange messages. A process sends messages into, and receives from, a socket.
The sending process must know the address and the port number of the receiver.

Application layer protocols define how processes on different end systems communicate
with each other:
• The types of messages exchanged, such as requests and responses.
• The syntax of various message types, fields and how they’re organized.
• The semantics of the fields (the meaning of the information in the field).
• Rules for determining when and how to send messages.

Non-persistent and persistent connections
If each request-response pair takes place on a separate TCP connection, the connection is
non-persistent (HTTP/1.0 operates this way). If many requests can be made on the same TCP
connection, it’s a persistent connection (the default in HTTP/1.1, and also how an FTP
control connection works). A server that does not keep information about the clients is
running a stateless protocol.

The web: HTTP

Cookies
Cookies are used to identify users over an HTTP connection. There are four components:
• A cookie header in the HTTP response message.
• A cookie header in the HTTP request message.
• A cookie file on the client end system.
• A back-end database on the server end system.
The client sends a usual HTTP request to the server. The server creates the identity x for
the user and responds with an HTTP response with set-cookie: x. The next time the client
requests an object, it includes the cookie ID x. The server now knows who the user is and
can take some user-specific action.

Web caching
A web cache satisfies HTTP requests on behalf of an origin web server. The web cache has
storage and keeps copies of recently requested objects there. HTTP requests are often
directed to a cache first:
1. The browser (client) establishes a TCP connection to the web cache and sends an HTTP
   request message.
2. The web cache checks to see if it has a copy of the object locally. If it does, it returns
   the object within an HTTP message.
3. Otherwise, the web cache opens a TCP connection to the web site (origin server) and
   requests the object, then
4. Relays the object back to the client (and saves it in the cache).
A web cache can substantially reduce the response time for a client request if there’s a
bottleneck between the cache and the origin server.

Conditional GET
HTTP has a mechanism that lets a cache verify that its copies are up to date, called a
conditional GET. It uses the GET request message with the "If-Modified-Since" header line.
The server then responds either with a 304 Not Modified message (no object included) or
with the modified object.
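
The interplay between a web cache and the conditional GET can be sketched with two toy
classes. This is a simplified model, not real HTTP: the class names, the integer
timestamps and the choice to revalidate on every request are all assumptions.

```python
class OriginServer:
    """Toy origin server: stores objects with a last-modified timestamp."""
    def __init__(self):
        self.objects = {}  # url -> (last_modified, body)

    def put(self, url, body, now):
        self.objects[url] = (now, body)

    def get(self, url, if_modified_since=None):
        last_modified, body = self.objects[url]
        if if_modified_since is not None and last_modified <= if_modified_since:
            return 304, last_modified, None   # Not Modified: no body sent
        return 200, last_modified, body

class WebCache:
    """Toy cache that revalidates every request with a conditional GET."""
    def __init__(self, origin):
        self.origin = origin
        self.store = {}  # url -> (last_modified, body)

    def get(self, url):
        cached = self.store.get(url)
        since = cached[0] if cached else None
        status, last_modified, body = self.origin.get(url, since)
        if status == 304:
            return cached[1]                  # serve the local copy
        self.store[url] = (last_modified, body)
        return body

origin = OriginServer()
origin.put("/index.html", "v1", now=1)
cache = WebCache(origin)
first = cache.get("/index.html")    # fetched from the origin, then stored
second = cache.get("/index.html")   # origin answers 304, cache serves its copy
origin.put("/index.html", "v2", now=2)
third = cache.get("/index.html")    # object changed, origin sends it again
```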

File transfer: FTP
FTP is a stateful protocol. It runs two parallel connections over TCP: one control
connection on port 21 and one data connection on port 20. Because control information
travels on its own connection, FTP is said to send it out-of-band rather than in-band.
Popular FTP commands include USER, PASS, LIST, RETR and STOR.

E-mail: SMTP
The Simple Mail Transfer Protocol is the primary protocol for Internet e-mail. It uses
TCP. There are client and server sides of the protocol. It restricts the contents of the
e-mail to 7-bit ASCII. Connections are established directly between the sending and the
receiving mail server: SMTP is a push protocol. POP3 and IMAP are separate protocols, used
to pull mail from a server.

DNS (Domain Name System)
Maps host names to IP addresses. It also provides:
* Host aliasing between alias hostnames and canonical hostnames (such that the alias
  www.rofl.com translates to the canonical relay.tr-lol.rofl.com).
* Mail server aliasing (for e-mail addresses).
* Load distribution: When clients request a name mapped to a set of addresses, the
  server responds with all the addresses (but rotates their order in a round-robin
  fashion).

A centralized DNS system would not work, thus DNS is decentralized. DNS servers are
organized into a hierarchy:
* Root DNS servers (13 of them), each of which is not a single server but a cluster.
* Top Level Domain (TLD) servers responsible for different top level domains
  (.com, .se, ...)
* Authoritative DNS servers. An organization’s DNS server that maps the names of that
  organization’s hosts (and possibly those of its customers) to IP addresses.
* Local DNS server, located at the ISP. It is not strictly part of the hierarchy but
  sends queries into it on behalf of the hosts.

The life of a DNS query:
1. The client wants to resolve www.rofl.com and sends a query to the local DNS server.
2. The local DNS server sends a query to a root DNS server.
3. The root DNS server replies with the IP address of a TLD server.
4. The local DNS server sends a query to the TLD server.
5. The TLD server replies with the IP address of an authoritative server.
6. The local DNS server sends a query to the authoritative server.
7. The authoritative server replies with the IP address of www.rofl.com.
8. The local DNS server sends that IP address back to the client.
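
The query sequence can be mimicked with a few dictionaries standing in for the root, TLD
and authoritative servers. This is only a sketch: the server names and the address
203.0.113.7 (from a range reserved for documentation) are made up, and real DNS messages
are not modelled. The local server also caches the answer, so a repeated lookup sends no
queries.

```python
# Hypothetical server contents, used only for this illustration.
ROOT = {"com": "tld-com"}                                   # TLD -> TLD server
TLD = {"tld-com": {"rofl.com": "ns.rofl.com"}}              # domain -> authoritative server
AUTH = {"ns.rofl.com": {"www.rofl.com": "203.0.113.7"}}     # host -> IP address

def resolve(hostname, cache):
    """Act as the local DNS server; return (address, queries sent upstream)."""
    if hostname in cache:
        return cache[hostname], 0
    labels = hostname.split(".")
    tld_server = ROOT[labels[-1]]                           # query 1: root server
    auth_server = TLD[tld_server][".".join(labels[-2:])]    # query 2: TLD server
    address = AUTH[auth_server][hostname]                   # query 3: authoritative server
    cache[hostname] = address
    return address, 3

local_cache = {}
first_lookup = resolve("www.rofl.com", local_cache)
second_lookup = resolve("www.rofl.com", local_cache)
```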

DNS caching
Whenever a DNS server answers a query, it saves the information for a short while (no
longer than 2 days). This reduces the number of queries.

Peer-to-peer
Contrasts with the client-server paradigm. Pairs of intermittently connected hosts (peers)
communicate directly. In file distribution, each peer can send a portion of the file to
another peer. P2P applications are thus inherently scalable, since each peer adds upload
capacity as well as download demand. Other P2P applications include Internet telephony.


Transport layer
A TCP/IP network provides two distinct transport layer protocols. TCP and UDP provide
the transport layer multiplexing and demultiplexing, as well as integrity check of the data
(using checksums).

Demultiplexing is the job of delivering the data in a transport layer segment to the correct
socket. Multiplexing is the job of gathering data chunks at the source host from different
sockets, creating segments and passing them on to the network layer. Each segment has a
source and a destination port number.

UDP
UDP does as little as possible: only multiplexing/demultiplexing and some light error
checking. Many applications are better suited for UDP:
* Applications have finer control over what is sent and when.
* No congestion control and no connection state (buffers, sequence numbers) to maintain.
* No connection establishment, it just blasts away packets.
Protocols running over UDP include DNS, RIP (the routing protocol), SNMP (network
management), internet telephony and streaming multimedia.

The UDP segment has the following structure:
[ Source port nr ] [ Destination port nr ] [ Length ] [ Checksum ] [ Application message ]

Checksum: The sender adds all 16-bit words in the segment (wrapping any carry around) and
stores the 1s complement of the sum in the checksum field. The receiver adds all words
including the checksum. If the result is all ones, no error was detected. UDP must
provide error checking because there is no guarantee that the lower level protocols do
that. This is an example of the end-to-end principle: "Functions placed at the lower levels
may be redundant... when compared to the cost of providing them at a higher level."
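
A minimal sketch of the checksum calculation on a list of 16-bit words. The example words
are arbitrary, and a real UDP checksum also covers a pseudo-header, which is left out
here.

```python
def ones_complement_sum(words):
    """Add 16-bit words, wrapping any carry back into the low 16 bits."""
    total = 0
    for word in words:
        total += word
        total = (total & 0xFFFF) + (total >> 16)
    return total

def checksum(words):
    """Sender side: 1s complement of the 1s complement sum."""
    return ~ones_complement_sum(words) & 0xFFFF

def looks_intact(words, received_checksum):
    """Receiver side: the sum including the checksum must be all ones."""
    return ones_complement_sum(words + [received_checksum]) == 0xFFFF

data = [0x4500, 0x0073, 0xB1E6]        # arbitrary example words
c = checksum(data)
corrupted = [data[0] ^ 0x0001] + data[1:]   # flip one bit in transit
```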
Reliable data transfer
TCP implements reliable data transfer which means that it will guarantee that the
message (eventually) will arrive (or at least confess that it didn’t). The protocol
implements this even though the underlying layers may be unreliable. Reliable protocols
require bidirectional data transfer – how else would we acknowledge a packet?

ARQ (Automatic Repeat Request) protocols
Basic actions: Send and receive (duh)
Error detection: Checksum (or other mechanism) to detect corrupted segments.
Receiver feedback: Positive (ACK) or negative (NAK) acknowledgements.
Retransmission when ....
... received a NAK.
... received a garbled ACK or NAK.
... received nothing (time out).
Necessitates the use of packet sequence numbers (or the receiver could not tell a
retransmission from new data, and data might be received in the wrong order!).

Pipelined reliable data transfer protocols
Rather than operating in a stop-and-wait manner, the sender is allowed to send multiple
packets. Then:
* The sequence number range must be much larger.
* The sender and receiver may need to buffer several messages.

Two basic approaches to pipelined reliable data transfer exist: the Go-Back-N (sliding
window) protocol and the selective repeat protocol.

Sliding window protocol

                                           A   =   ACK:ed packets
AAAAAAAAAsssssssuuuuuu___________          s   =   sent packets
         \___________/                     u   =   usable, not yet sent
         Window size N                     _   =   not usable

Let base be the sequence number of the oldest unacknowledged packet. Packets 0 to base - 1
have been sent and acknowledged. Packets base to base + N - 1 lie inside the window and can
be sent. nextseqnum is the sequence number of the next packet to be sent.

Also note that:
* An acknowledgement for a packet with sequence number n is treated as a cumulative
  acknowledgement of all packets with sequence numbers s <= n.
* If a time-out occurs, the sender will resend all packets that have not been acknowledged.
* Out-of-order packets are discarded for simplicity – then the receiver will not have to
  worry about buffering packets received out-of-order.
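
The three rules above can be seen in a small round-based simulation. It is a deliberate
simplification: there are no real timers, the sender transmits its whole window each round
and then processes the cumulative acknowledgement, and each packet named in lost_once is
dropped on its first transmission only. The function and parameter names are made up for
the example.

```python
def go_back_n(num_packets, window, lost_once):
    """Return the sequence numbers in the order the sender transmits them."""
    base = 0                      # oldest unacknowledged packet
    expected = 0                  # next in-order packet the receiver wants
    pending_loss = set(lost_once)
    log = []
    while base < num_packets:
        for seq in range(base, min(base + window, num_packets)):
            log.append(seq)
            if seq in pending_loss:
                pending_loss.discard(seq)   # dropped by the network
                continue
            if seq == expected:
                expected += 1               # delivered in order
            # anything else is out of order and discarded by the receiver
        base = expected                     # cumulative ACK slides the window
    return log

transmissions = go_back_n(num_packets=5, window=3, lost_once={1})
print(transmissions)
```

Packet 2 is transmitted twice although it was never lost, because the receiver discarded
it while waiting for packet 1.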
Selective repeat protocol

                                         A = ACK:ed packets
AAAAAAAAAsAsAAsAAAsuuu______         s = sent packets
         \___________/                    u = usable, not yet sent
         Window size N                    _ = not usable

With Go-Back-N, if the window size and the bandwidth-delay product are both large, many
packets are resent because of a single packet error. Selective repeat (SR) protocols
instead require each packet to be individually acknowledged, so that only the packets
that were lost are resent.

Also note:
* Out-of-order packets are buffered until they can be delivered in order.
* The receiver reacknowledges already received packets.
* The lack of synchronization between sender and receiver means that the window size
  must be no more than half of the sequence number space.

TCP (Transmission Control Protocol)
TCP is a connection-oriented protocol. It is not a virtual circuit, though, as the
intermediate network components do not maintain any TCP state. TCP provides a full-duplex
service (data flows in both directions). TCP is point-to-point (only between exactly
two hosts). During the three-way handshake a send buffer is set aside. TCP sends the data
from this buffer. The maximum amount of data that can be sent per segment is determined by
the maximum segment size (MSS). The MSS is in turn determined by the link-layer equivalent
(maximum transmission unit, MTU).

The TCP segment is structured as follows:
* Source port number (16 bits)
* Destination port number (16 bits)
* Sequence number (32 bits)
* Acknowledgement number (32 bits)
* Receive window (16 bits) used for flow control – the number of bytes the receiver is
  willing to accept.
* Header length field (4 bits) specifies the length of the TCP header in 32-bit words.
* Checksum field (16 bits)
* Options field, used for example when negotiating the MSS.
* Flags used for connection setup and teardown (SYN, FIN, RST) and for ACK messages.

Sequence numbers: TCP numbers each byte, not each segment! Cumulative
acknowledgements are used and sent in the opposite data stream. Often, out-of-order
packets are buffered.

Estimating Round Trip Time (RTT)
The estimate is updated when a new SampleRTT is obtained:
EstRTT = EstRTT * 7/8 + SampleRTT * 1/8
(weighted average or exponential weighted moving average).

The RTT variation is defined as:
DevRTT = (1 - B) * DevRTT + B * |SampleRTT - EstRTT|
with B = 0.25 by default.

Timeout interval
TCP determines the retransmission timeout interval t as:
t = EstRTT + 4 DevRTT
* The interval should be larger than EstRTT but not much larger.
* The margin should be large when the variance is large.
* The margin should be small when the variance is small.
(the margin = t - EstRTT)
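
The estimator and the timeout formula fit in a few lines. One assumption in this sketch:
DevRTT is updated with the already updated EstRTT, which the formulas above leave open.
The starting values are arbitrary.

```python
def update_rtt(est_rtt, dev_rtt, sample_rtt, alpha=0.125, beta=0.25):
    """Return (EstRTT, DevRTT, timeout) after one new SampleRTT."""
    est_rtt = (1 - alpha) * est_rtt + alpha * sample_rtt
    # Assumption: the deviation is measured against the new estimate.
    dev_rtt = (1 - beta) * dev_rtt + beta * abs(sample_rtt - est_rtt)
    timeout = est_rtt + 4 * dev_rtt
    return est_rtt, dev_rtt, timeout

# Arbitrary example in milliseconds: estimate 100, deviation 10, new sample 120.
est, dev, timeout = update_rtt(100.0, 10.0, 120.0)
print(est, dev, timeout)
```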

TCP uses a single retransmission timer even though it’s pipelined. In the event of a
timeout:
* TCP retransmits the not yet acknowledged segment with the lowest sequence number, and
* doubles the timeout interval, rather than deriving it from EstRTT and DevRTT.
* The interval thus grows exponentially with each consecutive timeout.
This gives a limited form of congestion control.

Duplicate ACK
An ACK that reacknowledges a segment is called a duplicate ACK. When the receiver detects
a gap in the data stream, it reacknowledges the last byte received in order. This helps
the sender to detect the packet loss. Since the packets are in the pipeline, many duplicate
ACKs will be sent whenever a packet is lost.
When three duplicate ACKs are received, TCP performs a fast retransmit: it resends the
lost segment even though its timer has not expired yet.

One difference between the sliding window protocol and TCP is that TCP will not resend
subsequent packets in the case of a packet loss.

TCP flow control
Match the rate of sending to the rate of receiving, to prevent buffer overflows and dropped
packets. Let

RcvBuffer = size of the buffer at the receiver
rwnd = spare room in the buffer, such that
       rwnd = RcvBuffer - (LastByteRcvd - LastByteRead)
The receiver sends the value of rwnd to the sender in every segment it sends in the other
direction. The sender then makes sure that
LastByteSent - LastByteAcked <= rwnd
If rwnd = 0, the sender keeps sending segments with one data byte. These are acknowledged
with the current rwnd, so the sender learns when space frees up and no deadlock occurs.
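
A small sketch of the flow control bookkeeping, using byte counters as plain integers.
The function names and the buffer sizes are assumptions for the example.

```python
def receive_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room in the receiver's buffer (rwnd)."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)

def may_send(last_byte_sent, last_byte_acked, rwnd, nbytes):
    """True if sending nbytes more keeps the unacknowledged data within rwnd."""
    return (last_byte_sent + nbytes) - last_byte_acked <= rwnd

# Assumed example: 4096-byte buffer, 3000 bytes received, 1000 of them read.
rwnd = receive_window(4096, 3000, 1000)
print(rwnd, may_send(5000, 4000, rwnd, 1000), may_send(5000, 4000, rwnd, 1200))
```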
TCP handshake
Used when creating a TCP connection:
1. The client sends a special TCP segment with the SYN bit set to 1. The client also
   provides a random initial sequence number (client_isn).
2. When the TCP segment arrives at the server host, it allocates buffers and variables to
   the connection. A SYNACK segment is sent back with an acknowledgement field value
   of client_isn + 1. The server also chooses a random sequence number (server_isn).
3. Upon receiving the SYNACK, the client allocates buffers for the connection and replies
   with a standard ACK (with acknowledgement number server_isn + 1).

To close a TCP connection, one side sends a segment with the FIN bit set to 1, which the
other side acknowledges with an ACK. The other side then sends its own FIN, which is
acknowledged in turn.
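
The sequence and acknowledgement numbers of the three handshake segments can be written
out as follows. Segments are modelled as plain dictionaries, which is an assumption of
this sketch, and the initial sequence numbers are arbitrary.

```python
SEQ_SPACE = 2**32   # sequence numbers are 32 bits and wrap around

def three_way_handshake(client_isn, server_isn):
    """Return the SYN, SYNACK and ACK segments as dictionaries."""
    syn = {"SYN": 1, "seq": client_isn}
    synack = {"SYN": 1, "ACK": 1, "seq": server_isn,
              "ack": (syn["seq"] + 1) % SEQ_SPACE}
    ack = {"ACK": 1, "seq": (client_isn + 1) % SEQ_SPACE,
           "ack": (synack["seq"] + 1) % SEQ_SPACE}
    return [syn, synack, ack]

segments = three_way_handshake(client_isn=100, server_isn=300)
for segment in segments:
    print(segment)
```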

Congestion control
One cost of a congested network is that large queueing delays are expected as the packet
arrival rate nears the link capacity. Another is that the sender must perform
retransmissions of packets lost due to the full buffers. Retransmissions of queued, not lost,
packets will only make congestion worse.

There are two types of congestion control:
* End-to-end: No support or guarantees from the lower layers. The transport layer must
  ensure congestion control on its own. (TCP uses this model.)
* Network-assisted: Routers provide explicit feedback regarding congestion to the sender.
  This may be as simple as a bit indicating congestion on a link.

TCP forces each sender to limit the rate at which it sends traffic. The congestion window
variable cwnd imposes a constraint on the rate:

LastByteSent - LastByteAcked <= min(cwnd, rwnd)

...because the sender’s rate ≈ cwnd / RTT [bytes/s].

TCP perceives congestion in the network by receiving three duplicate ACKs or by time-out
events. If all is well, cwnd is increased whenever an ACK arrives. Because the arriving
ACKs trigger (clock) the increases, TCP is said to be self-clocking.
* A lost segment indicates congestion, and hence, the sender’s rate should decrease.
* An acknowledgement segment indicates that the network is delivering the packets, and
  the sender’s rate can be increased. This is called bandwidth probing.

Slow start: When a TCP connection begins, cwnd is initialized to a small value, typically
1 MSS (maximum segment size). cwnd is then incremented by 1 MSS every time an ACK is
received. Since two segments are sent the second time, it will then be incremented twice.
This continues; the send rate starts slow but grows exponentially.
In the case of a time-out, the sender sets a variable called ssthresh (slow start
threshold) to cwnd/2, then sets cwnd to 1 MSS and starts over. When cwnd reaches
ssthresh, TCP goes into congestion avoidance mode.

Congestion avoidance mode: Rather than doubling the value of cwnd every RTT, it is
instead increased by 1 MSS per RTT. For example, if ten segments were sent, cwnd is
incremented by 0.1 MSS for every ACK received. When a time-out occurs, ssthresh is again
set to cwnd/2 and then cwnd is set to 1 MSS.

TCP enters fast recovery mode only if the sender receives three duplicate ACKs. It then
sets ssthresh to cwnd/2 and cwnd to ssthresh + 3 MSS, and increments cwnd by 1 MSS for each
further duplicate ACK until the lost segment is ACKed. TCP then sets cwnd to ssthresh and
continues in congestion avoidance mode.
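
How cwnd evolves under slow start and congestion avoidance can be traced with a coarse
simulation. The sketch counts cwnd in whole MSS, advances one RTT per step and models
time-outs only; fast recovery is left out. The function name, the rounding of ssthresh and
the example parameters are assumptions.

```python
def simulate_cwnd(rounds, ssthresh, timeout_rounds=()):
    """Return cwnd (in MSS) at the start of each RTT round."""
    cwnd = 1
    history = []
    for rtt in range(rounds):
        history.append(cwnd)
        if rtt in timeout_rounds:
            ssthresh = max(cwnd // 2, 1)     # remember half the window
            cwnd = 1                         # and start over
        elif cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)   # slow start: double per RTT
        else:
            cwnd += 1                        # congestion avoidance: +1 MSS per RTT
    return history

trace = simulate_cwnd(rounds=8, ssthresh=8, timeout_rounds={5})
print(trace)
```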

Macroscopic throughput: Let W be the window size (cwnd) when a loss event occurs. Then
the throughput will range from 0.5 W/RTT to W/RTT; or on average:

Throughput = 0.75 W / RTT


Network layer
Two important network layer functions are:
* Forwarding. When a packet arrives at a router, the router must move the packet to the
  appropriate output link.
* Routing. The network layer must determine the route or path taken by packets as they
  flow from sender to receiver. Algorithms that calculate these routes are called routing
  algorithms.

Every router has a forwarding table. A value in a header field of the arriving packet is
used to look up the matching entry in this table, which gives the output link.

Network service models
Specifications that could be provided by the network layer include:
* Guaranteed delivery of packets
* Guaranteed bounded delay before packet is delivered.
* In-order packet delivery (the same order in which they were sent).
* Guaranteed minimal bandwidth.
* Guaranteed maximum jitter (that is, similar delays between in-order packets).
* Security and encryption of the contents in the packets.
etc.

The IP (Internet Protocol) offers none of these: it is a best-effort service (no promises).
Virtual circuit networks (not the Internet)
A VC network is one that provides a connection-oriented service at the network layer. This
is implemented in the network components (routers), unlike the end-to-end model in the
transport layer.

A VC consists of:
(a) a path (a series of links and routers) between the hosts
(b) VC numbers, one for each link along the path
(c) entries in the forwarding table of each router along the path

In a VC network, routers must maintain the connection state information for the ongoing
connections. Tables are changed when a connection is set up or torn down. This is done
with signaling messages specified by signaling protocols.

Datagram networks
In a datagram network, when a packet arrives at a router, the router uses the packet’s
destination address to look up the appropriate output link interface (and then forwards the
packet to that link). The router matches the prefix of an address to a link:

Prefix match                                    Link
----------------------------------------------------
11001000 00010111 00010                        0
11001000 00010111 00011000                     1
11001000 00010111 00011                        2
otherwise                                      3

When an address matches several entries, the router uses the longest prefix matching rule.
The longest prefix always wins.
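
Longest prefix matching can be sketched with bit strings. The three-entry table below is
an illustrative example, and a linear scan is used for clarity; real routers use
specialised data structures or hardware for this lookup.

```python
def bits(text):
    """Remove the spaces used to group bits for readability."""
    return text.replace(" ", "")

# Illustrative table: (prefix, output link). Anything unmatched goes to link 3.
TABLE = [
    (bits("11001000 00010111 00010"), 0),
    (bits("11001000 00010111 00011000"), 1),
    (bits("11001000 00010111 00011"), 2),
]

def lookup(address_bits, table=TABLE, default_link=3):
    """Return the link of the longest prefix that matches the address."""
    best_prefix, best_link = "", default_link
    for prefix, link in table:
        if address_bits.startswith(prefix) and len(prefix) > len(best_prefix):
            best_prefix, best_link = prefix, link
    return best_link

link_a = lookup(bits("11001000 00010111 00010110 10100001"))  # one match
link_b = lookup(bits("11001000 00010111 00011000 10101010"))  # two matches
link_c = lookup(bits("11001000 00010111 00011100 00000001"))  # one match
link_d = lookup(bits("10000000 00000000 00000000 00000001"))  # no match
```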

These forwarding tables are modified by the routing protocols. They can be modified at
any time, causing packets in the same series to take different routes.

What’s inside a router?
The input ports perform the following functions:
* Terminate a physical layer link to the router.
* Perform the link-layer functions needed to interoperate with the link-layer functions at
  the remote side of the incoming link.
* Look up the output port in the forwarding table.
The switching fabric connects the input and output ports.
The output ports store the packets and then send them to the next router.
The routing processor executes the routing protocols, maintains routing tables and
performs network management functions.

The network layer on the Internet
Three major things make up the Internet’s network layer:
1. Internet Protocol (IPv4, IPv6)
2. Routing protocols (RIP, OSPF)
3. Error- and information reporting (ICMP)

IPv4
The key fields in the IPv4 datagram are:
* Version number. These 4 bits specify the IP version.
* Header length. 4 bits to specify where the data begins.
* Type of service. Different data types can be distinguished and treated differently.
* Datagram length. Total length of the IP datagram. 16 bits puts an upper limit at 65535
  bytes (though datagrams are rarely larger than 1500 bytes in practice, the Ethernet MTU).
* Identifier, flags, fragmentation offset
* Time to live: A counter decremented at each hop so that the datagram does not
  circulate forever. The datagram is dropped when the counter reaches 0.
* Protocol: Used when the packet arrives at its destination, determines which transport
  protocol receives the data. For example, a value of 6 means TCP.
* Header checksum: Covers only the IP header and is recomputed at every router, since
  the TTL changes. It is there in case the transport layer has no integrity check!
* Source and destination IP addresses (32 bits each)
* Options (rarely used)
* Data: The actual payload; usually a TCP or UDP segment.

Datagram fragmentation:
Not all link layer protocols can carry large packets. Ethernet is limited to 1500 bytes and
some wide-area links can only carry 576 bytes (MTU). The solution is to fragment the
datagram. The fragments are reassembled before the segment can be delivered to the
transport layer (this is done at the end system – the receiver!). This is why the header has
identification, flag and fragmentation offset fields. If flag = 1, then there’s more to
follow. The offset is given in units of 8 bytes (so that 185 gives an offset of 1480 bytes).
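
The fragment sizes and offsets can be computed as below. The sketch assumes a 20-byte IP
header without options and uses a 4000-byte datagram on a 1500-byte MTU link as its
example. The function name and the dictionary layout are made up.

```python
def fragment(total_length, mtu, header_len=20):
    """Split a datagram into fragments: length, offset (8-byte units), flag."""
    payload = total_length - header_len
    max_data = (mtu - header_len) // 8 * 8   # data per fragment, multiple of 8
    fragments = []
    sent = 0
    while sent < payload:
        size = min(max_data, payload - sent)
        more_follow = 1 if sent + size < payload else 0
        fragments.append({"length": size + header_len,
                          "offset": sent // 8,
                          "flag": more_follow})
        sent += size
    return fragments

pieces = fragment(total_length=4000, mtu=1500)
for piece in pieces:
    print(piece)
```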

IPv4 addressing:
Each IP address is 32 bits long so there are 2^32 possible addresses (≈ 4 billion). These
are written in dotted decimal notation, in which each byte of the address is written in
decimal form. Thus: 193.32.216.9 = 11000001 00100000 11011000 00001001. Each interface on
the Internet must have a unique IP address (hosts behind a NAT router share the router’s).
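
The conversion between dotted decimal and binary notation is mechanical, as this short
sketch shows (the function name is made up):

```python
def to_binary(dotted):
    """Write each byte of a dotted decimal IPv4 address as 8 bits."""
    return " ".join(f"{int(byte):08b}" for byte in dotted.split("."))

binary = to_binary("193.32.216.9")
print(binary)
```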

Subnets:
   [ ]-----.                                 .-----[ ]
223.1.1.1  |                                 | 223.1.2.1
           |                                 |
   [ ]-----x--------------( X )--------------x
223.1.1.2  | 223.1.1.4 -^       ^- 223.1.2.3 |
           |                                 |
   [ ]-----´                                 `-----[ ]
223.1.1.3                                       223.1.2.2
Here the router has two IP addresses (one for each subnet). The left network is the
223.1.1.0/24 network (because the first 24 bits are the same in all the IP addresses
thereof). The /24 is the subnet mask; some common masks are:

Subnet mask      Decimal notation       Binary notation
/24              255.255.255.0          11111111 11111111 11111111 00000000
/25              255.255.255.128        11111111 11111111 11111111 10000000
/26              255.255.255.192        11111111 11111111 11111111 11000000
...              ...                    ...

To determine the subnets, detach each interface from its host or router to create isolated
networks. These are the subnets.
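
Python's standard ipaddress module can reproduce the mask table and test whether an
address belongs to a subnet. The helper name describe is made up, and the addresses are
the ones from the figure.

```python
import ipaddress

def describe(cidr):
    """Return (dotted mask, binary mask, number of addresses) for a network."""
    net = ipaddress.ip_network(cidr)
    mask_bits = " ".join(f"{byte:08b}" for byte in net.netmask.packed)
    return str(net.netmask), mask_bits, net.num_addresses

for cidr in ("223.1.1.0/24", "223.1.1.0/25", "223.1.1.0/26"):
    print(cidr, *describe(cidr))

left = ipaddress.ip_network("223.1.1.0/24")
router_left_in = ipaddress.ip_address("223.1.1.4") in left    # router's left interface
right_host_in = ipaddress.ip_address("223.1.2.1") in left     # host on the right subnet
```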

CIDR
The Internet’s address assignment strategy is known as Classless Interdomain Routing
(CIDR). Each IP address is divided into two parts and again has the notation a.b.c.d/x
where x indicates the number of bits in the first part (32-x bits are then in the second
part). The first part is called the prefix.

Routers outside the organization consider only the prefix when forwarding, and only the
second part is considered inside the organization. Often devices within an organization
will share the same prefix. Also note that the remaining bits can be used to subnet within
an organization.

An ISP can divide its allocated block of IP addresses among its customers:

ISP block        200.23.16.0/20    11001000   00010111   00010000   00000000
Customer #1      200.23.16.0/23    11001000   00010111   00010000   00000000
Customer #2      200.23.18.0/23    11001000   00010111   00010010   00000000
Customer #3      200.23.20.0/23    11001000   00010111   00010100   00000000
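
The same division can be generated with Python's standard ipaddress module. A /20 block
holds eight /23 blocks, and the first three are the ones listed in the table above.

```python
import ipaddress

isp_block = ipaddress.ip_network("200.23.16.0/20")
customer_blocks = list(isp_block.subnets(new_prefix=23))

for number, block in enumerate(customer_blocks, start=1):
    print(f"Customer #{number}: {block}")
```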

ISPs obtain blocks of IP addresses from ICANN, a non-profit organization that manages them.

DHCP
Host addresses are configured through DHCP, the Dynamic Host Configuration Protocol.
It gives a host:
* An IP address on its local network.
* The subnet mask on that network.
* The first-hop router (gateway) address.
* The address of the local DNS server.

DHCP is a client-server protocol. The router in each subnet typically acts as DHCP server.
The life of a DHCP request:

1. DHCP server discovery. A DHCP discover message is sent over UDP to port 67. The
   IP datagram containing this is sent to 255.255.255.255 (broadcast address).
2. The DHCP server responds with an offer message to 255.255.255.255, containing a
   proposed IP address, the subnet mask, and a lease time.
3. The client requests the proposed IP address (via broadcast).
4. The server responds with an ACK (via broadcast).

Note that a TCP connection cannot be maintained as the client moves to a new subnet,
because the client obtains a new IP address there.

Internet Control Message Protocol (ICMP)
ICMP is used to communicate errors and other network-layer information. ICMP messages are
carried inside IP datagrams, so architecturally ICMP sits just above IP. Each message has
a type and a code field (the receiver address is put into the IP header!), and error
messages carry the header and first 8 bytes of the IP datagram that caused the error as
payload. ICMP is used for messages such as Destination network unreachable and for ping
(echo request and reply). Traceroute, for instance, uses the ICMP time-to-live-expired
warning messages to trace the path that packets take to a destination!

IPv6
The most important changes in IPv6 are:
* Increased address space from 32 bits to 128 bits, so that the planet will not run
  out of IP addresses.
* Streamlined 40-byte header of fixed size. It is longer than the minimal 20-byte IPv4
  header but has fewer fields and no options, so routers can process it faster.
* ”Flow” label and priority (traffic class), to differentiate between kinds of traffic.

The header also contains:
* Version number (0110), but putting 0100 here will not create a valid IPv4 header!
* Payload length, in number of bytes
* Next header: Identifies the protocol to which the datagram is to be delivered.
* Hop limit = Time to live
* Source and destination addresses
* Payload.
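
To make the fixed 40-byte layout concrete, here is a sketch that packs the fields listed above with Python's struct module. The function name and the example addresses (taken from the 2001:db8::/32 documentation range) are our own choices.

```python
import ipaddress
import struct

def build_ipv6_header(src, dst, payload_length, next_header,
                      hop_limit=64, traffic_class=0, flow_label=0):
    """Pack the fixed 40-byte IPv6 header in network byte order."""
    # First 32 bits: version (4 bits) | traffic class (8 bits) | flow label (20 bits).
    first_word = (6 << 28) | (traffic_class << 20) | (flow_label & 0xFFFFF)
    return struct.pack(
        "!IHBB16s16s",
        first_word,
        payload_length,                      # number of bytes following this header
        next_header,                         # e.g. 6 = TCP, 17 = UDP
        hop_limit,
        ipaddress.IPv6Address(src).packed,   # 128-bit source address
        ipaddress.IPv6Address(dst).packed,   # 128-bit destination address
    )

header = build_ipv6_header("2001:db8::1", "2001:db8::2",
                           payload_length=20, next_header=6)
```

The two addresses alone take 32 of the 40 bytes, which is why the header is longer than IPv4's despite having fewer fields.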

Removed from IPv6 (compared to IPv4):
* Fragmentation fields. Routers do not fragment: if a packet is too big for the outgoing
  link, the router drops it completely and sends an ICMP (actually, ICMPv6) Packet Too
  Big message back to the sender.
* Header checksum (this is left to TCP and UDP to handle).
* Options.

Transitioning from IPv4 to IPv6
How can the entire Internet be made IPv6-compatible? A ”flag day” is not possible with
billions of hosts involved!

Dual stack approach: IPv6 nodes are also IPv4-compatible. They must be able to determine
whether another node is IPv6-compatible (this can be solved through DNS queries, which
return either IPv6 addresses or IPv4 addresses). However, the IPv6-specific information is
lost when the datagram is converted to IPv4 to cross an IPv4 link.

Tunneling approach: The IPv6 node on the sending side puts the entire IPv6 datagram as
payload in an IPv4 datagram when the link only supports IPv4.

It is extremely difficult to change network-layer protocols since the entire Internet is built
on them. On the other hand, new application-layer protocols have seen rapid deployment.

Routing algorithms
An algorithm that finds the least-cost path in a graph:

   2      (v) --- (w) 5
      /    |   1   | \
(u)        | 2     | 1 (z)            c(x,y) = 1 = cost of link (x,y)
      \    |       | /
   1      (x) --- (y)   2              cost of path (x1,x2,x3,...xp) =
                 1                      c(x1,x2)+...+c(xp-1,xp)

Routing algorithm classification:
Global: Every router has complete knowledge of the topology and all link costs (link-state
algorithms).
Decentralized: Routers initially know only the costs to their direct neighbors. These
algorithms are iterative: the routers repeatedly exchange information with their neighbors
(distance-vector algorithms).
Static: Routes change slowly over time (often by administrator intervention).
Dynamic: Routes change quickly and in response to link cost changes.

Link-state algorithm:
Global algorithm where the network state is made known to all nodes through broadcast. Each
node computes the least-cost paths from itself to all other nodes using Dijkstra’s algorithm,
which gives the forwarding table for that node.

Let c(x,y) be the cost of link (x,y), D(v) the currently known least cost from the source
to node v, and p(v) the predecessor node on that path to v (for back-tracking!). Also let
N´ be the set of nodes whose least-cost path is definitively known.

The algorithm starts with an initialization step followed by |N|-1 loop iterations, where
N is the set of all nodes:

//Initialization:
N´ = {u}
for (all nodes v)
 if (v adjacent to u) then D(v) = c(u,v), p(v) = u
 else D(v) = Inf
end for
//Loop
loop until all nodes in N´
 find w not in N´ such that D(w) is a minimum
 add w to N´
 update D(v) for all v adjacent to w and not in N´:
  D(v) = min(D(v), D(w)+c(w,v))
  if D(v) changed then p(v) = w
end loop
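
A runnable sketch of the same idea in Python, using a priority queue to find the minimum D(w). The link costs below are assumed for illustration and are not necessarily those of the figure; the names dijkstra and path_to are our own.

```python
import heapq

def dijkstra(graph, source):
    """Return (D, p): least costs from source and the predecessor of each node."""
    D = {node: float("inf") for node in graph}
    p = {}
    D[source] = 0
    done = set()                     # the set N' of finished nodes
    queue = [(0, source)]
    while queue:
        cost, w = heapq.heappop(queue)
        if w in done:
            continue                 # stale queue entry, w was already finished
        done.add(w)
        for v, link_cost in graph[w].items():
            if v not in done and cost + link_cost < D[v]:
                D[v] = cost + link_cost
                p[v] = w
                heapq.heappush(queue, (D[v], v))
    return D, p

def path_to(p, source, target):
    """Back-track through the predecessors to list the nodes on the path."""
    path = [target]
    while path[-1] != source:
        path.append(p[path[-1]])
    return path[::-1]

# Assumed link costs, for illustration only.
edges = [("u", "v", 2), ("u", "x", 1), ("u", "w", 5), ("v", "x", 2), ("v", "w", 3),
         ("x", "w", 3), ("x", "y", 1), ("w", "y", 1), ("w", "z", 5), ("y", "z", 2)]
graph = {}
for a, b, cost in edges:
    graph.setdefault(a, {})[b] = cost
    graph.setdefault(b, {})[a] = cost

D, p = dijkstra(graph, "u")
```

The first hop of each path returned by path_to is what would go into u's forwarding table.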

Bellman-Ford equation
Let d_x(y) be the cost of the least-cost path from x to y. Then the costs are related by:

d_x(y) = min_v { c(x,v) + d_v(y) }

where the minimum is taken over all neighbors v of x. This can be easily verified by
drawing a simple graph of nodes and link costs. The neighbor that achieves the minimum is
the next hop toward y, so the solution to this equation yields an entry in the forwarding
table.

Distance-vector (DV) algorithm
Distributed, iterative algorithm (no node knows the global state). It is also asynchronous:
the nodes do not need to operate in lockstep with each other.

Each node x maintains:
* The cost c(x,v) to each directly attached neighbor v.
* Node x’s distance vector D_x, containing its estimates D_x(y) of the cost to every
  destination y in the set of all nodes N.
* The most recent distance vector D_v received from each neighbor v.

Each node sends its distance vector to its neighbors from time to time (such as when a
cost change is detected). When a message is received, the distance vector is updated with
the Bellman-Ford equation:

D_x(y) = min_v { c(x,v) + D_v(y) } for each y in N

If the vector changed, the node sends the new vector to its neighbors. The cost estimates
eventually converge so that D_x(y) -> d_x(y), and the algorithm stops by itself when this
happens, because no more updates are sent!

Updating the forwarding tables: Node x needs to know which neighbor v* is the next hop on
the least-cost path to y. Whichever neighbor ”wins” the minimum in the BF equation is that
node!
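
The update rule can be tried out with a small simulation. For simplicity this sketch runs in synchronous rounds (the real algorithm is asynchronous) and assumes link costs never change; the three-node network and all names are hypothetical.

```python
INF = float("inf")

def distance_vector(costs):
    """Run synchronous DV rounds until no vector changes.

    costs[x][v] is the link cost c(x,v) for each neighbor v of x.
    Returns (D, next_hop), where D[x][y] is x's estimate of the cost to y.
    """
    nodes = list(costs)
    # Each node starts out knowing only itself and its direct links.
    D = {x: {y: (0 if x == y else costs[x].get(y, INF)) for y in nodes} for x in nodes}
    next_hop = {x: {y: y for y in costs[x]} for x in nodes}
    changed = True
    while changed:
        changed = False
        snapshot = {x: dict(D[x]) for x in nodes}   # the vectors "sent" this round
        for x in nodes:
            for y in nodes:
                if x == y:
                    continue
                # Bellman-Ford update over the neighbors v of x.
                best_v = min(costs[x], key=lambda v: costs[x][v] + snapshot[v][y])
                best = costs[x][best_v] + snapshot[best_v][y]
                if best < D[x][y]:
                    D[x][y] = best
                    next_hop[x][y] = best_v
                    changed = True
    return D, next_hop

# Hypothetical three-node network.
costs = {"x": {"y": 2, "z": 7}, "y": {"x": 2, "z": 1}, "z": {"x": 7, "y": 1}}
D, next_hop = distance_vector(costs)
```

Here x learns that going through y (cost 2 + 1) beats its direct link to z (cost 7), so y becomes x's next hop for z.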

Hierarchical routing
This scheme addresses some fundamental issues in the routing algorithms already described:
* Routers are not identical! Each organization wants to administer its own network and
  choose its own routing algorithm.
* We can’t store billions of destinations in vectors! Even if we had the memory in each
  router, the updates would require all the bandwidth on the Internet!

Solution:
* Aggregate/organize routers into autonomous systems (AS:es) or administrative regions.
* The routing protocol inside an AS is called an intra-autonomous system (intra-AS) routing
  protocol.
* A router responsible for forwarding packets outside an AS is called a gateway router.
* Routing information between gateways is handled by the inter-AS protocol (on the
  Internet, that’s BGP4).

Therefore, we have two groups of routing protocols on the Internet:
1. Inter-AS (between AS:es), with BGP4.
2. Intra-AS (inside AS:es), with RIP and OSPF.

RIP (Routing Information Protocol)
RIP is a distance-vector protocol (as described above) that was made popular through its
inclusion in the original BSD-UNIX distribution in 1982. The distance metric used is the
number of hops (that is, each link has a cost of one). The protocol limits a path to at
most 15 hops (maximum network radius).

Routing tables are maintained with 3 fields: destination subnet, next router, and number of
hops to the destination. Neighbors exchange messages approximately every 30 seconds. If a
router doesn’t hear from a neighbor for 180 seconds, that neighbor is considered dead or
unreachable. RIP then modifies the local routing table and propagates this change to the
neighboring routers in a message.

Routers can request information from each other through RIP request/response messages.
These are carried in UDP segments – RIP is indeed implemented in the application layer!

OSPF (Open Shortest Path First)
OSPF is a link-state protocol that broadcasts link-state information and uses Dijkstra’s
least-cost algorithm. It offers more features than RIP: each router builds a complete
topological map of the autonomous system, and link costs can be set manually by an
administrator.

A router broadcasts whenever there is a link change and at least once every 30 minutes.
OSPF messages are carried directly by IP, so OSPF implements reliable transport on its own.
It also has optional security features: authentication with preshared keys and MD5 hashes,
as well as message sequence numbers to prevent replay attacks.

OSPF can also be configured into a (secondary) two-level hierarchy within the AS itself.
The AS is divided into areas, one of which is the ”backbone” area, responsible for
routing traffic between the other areas. Link-state advertisements are then broadcast only
within each area. Each router stores the detailed topology of its own area and knows only
the shortest route toward the other areas.

Border Gateway Protocol (BGP4) and Inter-AS routing
BGP4 provides each AS the means to:
* Obtain subnet reachability information from neighboring AS:es.
* Propagate reachability information to all AS-internal routers.
* Determine routes to subnets based on reachability information and policy.

BGP peers communicate via TCP connections called BGP sessions. In BGP jargon,
destinations are not hosts but subnets (prefixes). The peers exchange lists of which
prefixes are reachable through their AS. BGP gateway routers distribute this information
to all other routers in the AS (each router runs BGP!). There is thus a mesh of BGP
sessions inside an AS.

A router that knows several routes to a subnet must choose one. It applies the following
elimination rules in order:
1. Policy. A smaller network connected to two larger ones may not want to forward traffic
   between the two larger ones. Thus, as a policy, it can abstain from advertising that
   route, letting the bigger networks find their own way to route their traffic.
2. Shortest AS-path (the fewest AS:es traversed, not the fewest router hops).
3. Closest NEXT-HOP or hot potato routing. Send the packet toward the exit that is
   least expensive for the router in question.
4. Other criteria, whatever they may be.
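
The elimination can be sketched as a sort key applied in rule order. The Route record is hypothetical, and policy is modeled here as a numeric local preference (higher wins), which is an assumption rather than something stated in these notes; the prefix and AS numbers are made up.

```python
from dataclasses import dataclass

@dataclass
class Route:
    prefix: str
    as_path: list          # AS numbers the advertisement has passed through
    local_pref: int        # stand-in for policy; higher is preferred (assumption)
    next_hop_cost: int     # intra-AS cost to reach the NEXT-HOP router

def select_route(candidates):
    """Compare by policy first, then AS-path length, then NEXT-HOP cost."""
    return min(
        candidates,
        key=lambda r: (-r.local_pref, len(r.as_path), r.next_hop_cost),
    )

routes = [
    Route("138.16.64.0/22", as_path=[3, 131], local_pref=100, next_hop_cost=5),
    Route("138.16.64.0/22", as_path=[2, 17, 131], local_pref=100, next_hop_cost=1),
    Route("138.16.64.0/22", as_path=[4, 131], local_pref=100, next_hop_cost=2),
]
best = select_route(routes)
```

The second route has the cheapest NEXT-HOP but loses on AS-path length, which shows why the order of the rules matters.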


Link layer
Some jargon: In the link layer, we don’t care whether we’re dealing with hosts or routers;
we deal only with nodes. Communication channels between adjacent nodes are called links.

Link-layer protocols define how frames are exchanged between individual nodes (link-layer
packets are called frames).

Link layer service models
Services that could be offered by a link layer protocol include:
* Framing, that is encapsulating network-layer datagrams in link-layer frames.
* Link access – a medium access control (MAC) protocol specifies how frames can be
  transmitted onto a link which may be shared by many users.
* Reliable delivery (again)
* Flow control
* Error detection and correction
* Half- or full duplex links

The link layer is implemented in network adapters or network interface cards (NIC:s).
Here, in the link layer, software meets hardware!

Error-detection and correction (EDC)
Data is augmented with error-detection and correction (EDC) bits. This allows the receiver
to detect bit errors sometimes (not always). More sophisticated schemes detect more errors
but require more EDC bits and more computational overhead.

Parity checks: A single parity bit is used (even or odd parity). It detects any odd number
of flipped bits, but an even number of flipped bits goes unnoticed. When errors come in
bursts, up to about half of all corrupted frames can slip through.

 10101110 1 (Even parity bit)
 10101111 0 (Even parity bit)
 10101110 0 (Odd parity bit)
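
The three examples can be checked with a few lines of Python. The function names are our own.

```python
def parity_bit(bits, even=True):
    """Return the parity bit for a string of '0'/'1' characters."""
    bit = bits.count("1") % 2        # this bit makes the total number of ones even
    return bit if even else 1 - bit

def passes_check(bits_with_parity, even=True):
    """True if the received word (data plus parity bit) looks error-free."""
    ones = bits_with_parity.count("1")
    return ones % 2 == (0 if even else 1)
```

For instance, passes_check("001011101") is False (one flipped bit is caught), while passes_check("011011101") is True even though two bits were flipped.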

Two-dimensional parity checks can detect any two bit errors and correct a single bit error.
The ability of the receiver to correct errors by itself is called Forward Error Correction
(FEC):

 11010101010   |   0
 10100101101   |   1
 11011011010   |   1    Assuming even parity check, can you find the bit error?
 11010111010   |   1
 11010111010   |   1
 -----------
 01100011101
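
How a single error is located and corrected can be sketched as follows. The data block in this sketch is a small example of our own, not the block shown above, and the function names are hypothetical.

```python
def add_parity(rows):
    """Append an even parity bit to each row, then add a parity row."""
    with_row_parity = [row + [sum(row) % 2] for row in rows]
    parity_row = [sum(col) % 2 for col in zip(*with_row_parity)]
    return with_row_parity + [parity_row]

def locate_single_error(block):
    """Return (row, column) of a single flipped bit, or None if all checks pass."""
    bad_rows = [i for i, row in enumerate(block) if sum(row) % 2]
    bad_cols = [j for j, col in enumerate(zip(*block)) if sum(col) % 2]
    if not bad_rows and not bad_cols:
        return None
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        return bad_rows[0], bad_cols[0]
    raise ValueError("more than one bit error: detected but not correctable")

data = [[1, 0, 1, 0, 1],
        [1, 1, 1, 1, 0],
        [0, 1, 1, 1, 0]]
block = add_parity(data)
block[1][2] ^= 1                 # simulate one bit error in transit
row, col = locate_single_error(block)
block[row][col] ^= 1             # correct it by flipping the bit back
```

The failing row and the failing column intersect at exactly one position, which is what makes correction possible.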

Checksums: The d bits of data are regarded as a sequence of k-bit integers and summed into
a checksum. An example (again) is the Internet checksum, which sums the data as 16-bit
integers and takes the ones’ complement of the sum. The receiver sums the data together
with the checksum and should get only ones as a result. This is cheap and often used in
software rather than hardware.
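
A sketch of the Internet checksum in Python. The function names and the sample message are our own; odd-length input is padded with a zero byte.

```python
def ones_complement_sum(data):
    """Sum the data as 16-bit big-endian words with end-around carry."""
    if len(data) % 2:
        data += b"\x00"                            # pad odd-length input
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)   # fold the carry back in
    return total

def internet_checksum(data):
    """Checksum = ones' complement of the ones' complement sum."""
    return ~ones_complement_sum(data) & 0xFFFF

message = b"network layer!"
checksum = internet_checksum(message)
# Receiver side: summing the data together with the checksum gives all ones.
received = message + checksum.to_bytes(2, "big")
```

The receiver accepts the data if ones_complement_sum(received) equals 0xFFFF.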

Cyclic redundancy checks (CRC:s): Given d bits of data D, we want to compute the r-bit
EDC bit pattern R such that

D * 2^r XOR R = nG

where n is an integer and G is an (r+1)-bit pattern called the generator, known to both
sender and receiver. All arithmetic is modulo 2, without carries or borrows. This gives:

R = remainder( D * 2^r / G )
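
The modulo-2 long division can be sketched with integers and XOR. The function names and the example bit patterns (D = 101110, G = 1001) are our own; the generator is assumed to start with a 1 bit.

```python
def mod2_remainder(value, generator, r):
    """Remainder of value divided by the (r+1)-bit generator, modulo 2."""
    # XOR the generator under the leading 1 bit until fewer than r+1 bits remain.
    while value.bit_length() > r:
        value ^= generator << (value.bit_length() - (r + 1))
    return value

def crc_remainder(data_bits, generator_bits):
    """Return R as a bit string: the remainder of D * 2^r divided by G."""
    r = len(generator_bits) - 1
    shifted = int(data_bits, 2) << r          # D * 2^r: append r zero bits
    return format(mod2_remainder(shifted, int(generator_bits, 2), r), f"0{r}b")

def crc_ok(frame_bits, generator_bits):
    """Receiver check: the frame D followed by R must divide evenly by G."""
    r = len(generator_bits) - 1
    return mod2_remainder(int(frame_bits, 2), int(generator_bits, 2), r) == 0

R = crc_remainder("101110", "1001")
```

The sender transmits D followed by R, and the receiver only has to check that the remainder is zero.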

Multiple access protocols
On a broadcast link, multiple sending and receiving nodes share the same channel. If two
or more nodes send simultaneously, the frames will collide.

Multiple access protocols fall into three categories:
* Channel partitioning schemes: time-division (TDM) and frequency-division (FDM)
  schemes.
* Random-access protocols: ALOHA and CSMA (which is used in Ethernet).
* Taking turns protocols: Polling and token-passing.

The ideal multiple access protocol:
* If only one node has data to send, then that node should utilize the entire link capacity
  R.
* When N nodes are sending, then each node should utilize R/N.
* Decentralized and simple.

Random access protocols

Slotted ALOHA:
* All frames are of equal size L.
* Time is divided into slots of L/R seconds (R is the link capacity in bits/second).
* Nodes start to transmit at the beginning of time slots.
* The nodes are synchronized so that each node knows when the next slot begins.
* When frames collide, it’s detected by the sending nodes before the slot ends. Each node
  then retransmits in each subsequent slot with probability p until the frame gets through.

The efficiency is defined as the fraction of successfully used slots when there are many
nodes with much data to send. For slotted ALOHA with N nodes:
Let the probability that a given node sends in a slot be p.
Then the probability that the remaining N-1 nodes do not send is (1-p)^(N-1).
So the probability that exactly one of the N nodes has success is Np(1-p)^(N-1).
With the optimal value of p, and in the limit of large N, the maximum efficiency is
1/e ≈ 37 %.

Pure ALOHA:
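Like slotted ALOHA, but without the actual slots. Nodes start to send at any time. A frame
succeeds only if no other node sends during the frame time before it or during its own
frame time, so the probability that any node has success is Np(1-p)^(2(N-1)). With the
optimal value of p, and in the limit of large N, the maximum efficiency is then
1/(2e) ≈ 18 %, or half that of slotted ALOHA!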
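
The two limits can be checked numerically. Setting the derivative with respect to p to zero gives the optimum p = 1/N for slotted ALOHA and p = 1/(2N-1) for pure ALOHA; the function names in this sketch are our own.

```python
import math

def slotted_aloha_efficiency(n, p):
    """Probability that exactly one of n nodes transmits in a slot."""
    return n * p * (1 - p) ** (n - 1)

def pure_aloha_efficiency(n, p):
    """The other n-1 nodes must stay silent during two frame times."""
    return n * p * (1 - p) ** (2 * (n - 1))

for n in (10, 100, 1000):
    slotted = slotted_aloha_efficiency(n, 1 / n)          # optimum p = 1/N
    pure = pure_aloha_efficiency(n, 1 / (2 * n - 1))      # optimum p = 1/(2N-1)
    print(f"N={n:5d}  slotted={slotted:.4f}  pure={pure:.4f}")

print(f"limits: 1/e={1 / math.e:.4f}  1/(2e)={1 / (2 * math.e):.4f}")
```

Both columns approach their limits from above as N grows.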

Carrier Sense Multiple Access (CSMA):
Carrier sensing is simple: Listen on the channel before you start sending! Don’t interrupt
others! If the channel is busy, try again later. With collision detection (CSMA/CD), a node
that detects a collision stops sending and tries again later.

The channel’s end-to-end propagation delay causes collisions even though the nodes take
care not to start sending if someone else already is: a node cannot hear a transmission
that has not reached it yet. The channel efficiency approaches one as the propagation delay
tends to zero.

Taking turns protocols
There are two types of taking turns protocols:

Simple polling protocol: A master node polls each of the other nodes in round-robin order,
telling it that it’s okay to send up to a maximum number of frames on the link. The
drawbacks are that we introduce polling delays and that the master node is a single point
of failure: if the master node goes down, the entire link can be killed.

Token passing: No explicit master node; instead the right to send passes from node to node
with a token. If the node holding the token goes down, the other nodes can invoke some
recovery scheme (say, after a timeout period, the previous token holder takes over).

Token passing protocols are highly efficient and decentralized. However, they are not so
simple (or cheap) to implement.

Link-layer addressing (Media Access Control)
Every network adapter has a physical address (a MAC address). This address is 6 bytes long
and expressed in hexadecimal form: 1A-23-F9-CD-06-9B.

The IEEE sells address space to network adapter manufacturers (the first three bytes
identify the manufacturer). The broadcast address is FF-FF-FF-FF-FF-FF.

Address resolution protocol (ARP)
The job of ARP is to translate IP addresses into MAC addresses. Each node has an ARP table
with the fields IP address, MAC address, and time to live. An entry in the table typically
expires after 20 minutes.

ARP queries are sent in ARP packets, which include the IP address being asked about and
the sender’s IP and MAC addresses. These are sent to the broadcast address. Then the one
node (if any) with the matching IP address replies with an ARP packet containing its MAC
address.
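
An ARP table with expiring entries can be sketched as a dictionary with timestamps. The class, the resolve helper, and the example node are hypothetical, and the broadcast query is replaced by a simple dictionary lookup.

```python
ARP_TTL_SECONDS = 20 * 60      # entry lifetime quoted in the notes

class ArpTable:
    """Maps IP addresses to MAC addresses; entries expire after a fixed TTL."""

    def __init__(self):
        self.entries = {}      # ip -> (mac, expiry time)

    def learn(self, ip, mac, now):
        self.entries[ip] = (mac, now + ARP_TTL_SECONDS)

    def lookup(self, ip, now):
        """Return the cached MAC, or None if missing or expired (query needed)."""
        entry = self.entries.get(ip)
        if entry is None or now >= entry[1]:
            self.entries.pop(ip, None)
            return None
        return entry[0]

def resolve(table, ip, now, nodes_on_subnet):
    """Use the cache, else 'broadcast' a query that only the owner of ip answers."""
    mac = table.lookup(ip, now)
    if mac is None:
        mac = nodes_on_subnet.get(ip)      # stands in for the broadcast and the reply
        if mac is not None:
            table.learn(ip, mac, now)
    return mac

subnet = {"222.222.222.220": "1A-23-F9-CD-06-9B"}   # hypothetical node
table = ArpTable()
first = resolve(table, "222.222.222.220", now=0, nodes_on_subnet=subnet)
```

Passing the current time in as `now` keeps the sketch deterministic; a real implementation would read a clock instead.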

ARP is plug-and-play: the table is built automatically and does not require configuration
by the system’s administrator.

Sending a packet off a subnet:
1. The datagram is first sent to the gateway router, in a frame addressed to the MAC
   address of the router’s interface on the local subnet (found through ARP from the
   router’s local IP address).
2. When the datagram reaches IP in the router, the forwarding table tells the router which
   interface it should be sent to.
3. The router then encapsulates the datagram in a new frame and sends it onto the
   other subnet. (It learns the MAC address of the destination through an ARP query on
   that subnet!)

				