Network Layer: IP

COMS W6998

Spring 2010



Erich Nahum

Outline

 IP Layer Architecture

 Netfilter

 Receive Path

 Send Path

 Forwarding (Routing) Path

Recall what IP Does

 Encapsulate/decapsulate transport-layer messages into IP datagrams
 Routes datagrams to destination
 Handle static and/or dynamic routing updates
 Fragment/reassemble datagrams
 Unreliably

IP-packet format (bit positions 0, 3, 7, 15, 31 mark the field boundaries):
   Version | IHL | Codepoint | Total length
   Fragment-ID | DF | MF | Fragment-Offset
   Time to Live | Protocol | Checksum
   Source address
   Destination address
   Options and payload
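To make the header layout concrete, here is a minimal Python sketch that unpacks the fixed 20-byte part of an IPv4 header into the fields named on the slide. The names `IPv4Header` and `parse_ipv4_header` are hypothetical and chosen for illustration; this is not kernel code.

```python
import struct
from typing import NamedTuple


class IPv4Header(NamedTuple):
    version: int
    ihl: int              # header length in 32-bit words
    codepoint: int        # the TOS / DS byte
    total_length: int     # header plus payload, in bytes
    fragment_id: int
    dont_fragment: bool   # DF flag
    more_fragments: bool  # MF flag
    fragment_offset: int  # in units of 8 bytes
    ttl: int
    protocol: int
    checksum: int
    source: str
    destination: str


def parse_ipv4_header(data: bytes) -> IPv4Header:
    """Unpack the fixed part of an IPv4 header (options are ignored)."""
    if len(data) < 20:
        raise ValueError("an IPv4 header is at least 20 bytes long")
    (ver_ihl, tos, total_len, frag_id, flags_off,
     ttl, proto, csum, src, dst) = struct.unpack("!BBHHHBBH4s4s", data[:20])
    return IPv4Header(
        version=ver_ihl >> 4,
        ihl=ver_ihl & 0x0F,
        codepoint=tos,
        total_length=total_len,
        fragment_id=frag_id,
        dont_fragment=bool(flags_off & 0x4000),
        more_fragments=bool(flags_off & 0x2000),
        fragment_offset=flags_off & 0x1FFF,
        ttl=ttl,
        protocol=proto,
        checksum=csum,
        source=".".join(str(b) for b in src),
        destination=".".join(str(b) for b in dst),
    )
```

Network byte order (`!` in the format string) matters here: all multi-byte header fields are big-endian on the wire.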

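IP Implementation Architecture

[Figure: call graph of the IP implementation between the device layer (dev.c) and the higher layers.]

 Receive path (ip_input.c): netif_receive_skb (dev.c) -> ip_rcv -> NF_INET_PRE_ROUTING -> ip_rcv_finish -> ip_route_input -> ip_local_deliver -> NF_INET_LOCAL_IN -> ip_local_deliver_finish -> higher layers
 Multicast: ip_rcv_finish -> ip_mr_input
 Forwarding path (ip_forward.c): ip_rcv_finish -> ip_forward -> NF_INET_FORWARD -> ip_forward_finish -> ip_output
 Send path (ip_output.c): higher layers -> ip_queue_xmit -> ip_route_output_flow -> ip_local_out -> NF_INET_LOCAL_OUT -> ip_output -> NF_INET_POST_ROUTING -> ip_finish_output -> ip_finish_output2 -> neigh_resolve_output (ARP) -> dev_queue_xmit (dev.c)
 Routing: ip_route_input and ip_route_output_flow consult the Forwarding Information Base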

Sources of IP Packets

1. Packets arrive on an interface and are passed to the ip_rcv() function.
2. TCP/UDP packets are packed into an IP packet and passed down to IP via ip_queue_xmit().
3. The IP layer generates IP packets itself:
   1. Multicast packets
   2. Fragmentation of a large packet
   3. ICMP/IGMP packets.


What is Netfilter?

 A framework for packet "mangling"
 A protocol defines "hooks", which are well-defined points in a packet's traversal of that protocol stack.
    IPv4 defines 5 hooks
    Other protocols include IPv6, ARP, Bridging, DECnet
 At each of these points, the protocol will call the netfilter framework with the packet and the hook number.
 Parts of the kernel can register to listen to the different hooks for each protocol.
 When a packet is passed to the netfilter framework, it will call all registered callbacks for that hook and protocol.

Netfilter IPv4 Hooks

 NF_INET_PRE_ROUTING
    Incoming packets pass this hook in ip_rcv() before routing
 NF_INET_LOCAL_IN
    All incoming packets addressed to the local host pass this hook in ip_local_deliver()
 NF_INET_FORWARD
    All incoming packets not addressed to the local host pass this hook in ip_forward()
 NF_INET_LOCAL_OUT
    All outgoing packets created by this local computer pass this hook in ip_local_out()
 NF_INET_POST_ROUTING
    All outgoing packets (forwarded or locally created) will pass this hook in ip_finish_output()
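The hook descriptions above imply a fixed sequence of hooks for each kind of packet. The following Python sketch spells that out; the `Path` enum and the function name `hooks_traversed` are hypothetical, and the loopback case (a locally created packet that re-enters the receive path) is deliberately left out.

```python
from enum import Enum
from typing import List


class Hook(Enum):
    PRE_ROUTING = "NF_INET_PRE_ROUTING"
    LOCAL_IN = "NF_INET_LOCAL_IN"
    FORWARD = "NF_INET_FORWARD"
    LOCAL_OUT = "NF_INET_LOCAL_OUT"
    POST_ROUTING = "NF_INET_POST_ROUTING"


class Path(Enum):
    DELIVERED_LOCALLY = 1  # arrived on an interface, addressed to this host
    FORWARDED = 2          # arrived on an interface, addressed elsewhere
    LOCALLY_CREATED = 3    # generated by this host


def hooks_traversed(path: Path) -> List[Hook]:
    """Return the IPv4 hooks a packet passes, in order, for a given path."""
    if path is Path.DELIVERED_LOCALLY:
        return [Hook.PRE_ROUTING, Hook.LOCAL_IN]
    if path is Path.FORWARDED:
        return [Hook.PRE_ROUTING, Hook.FORWARD, Hook.POST_ROUTING]
    return [Hook.LOCAL_OUT, Hook.POST_ROUTING]
```

Note that a forwarded packet never sees the LOCAL_IN or LOCAL_OUT hooks, which is why filtering rules for routed traffic belong on FORWARD.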

Netfilter Callbacks

 Kernel code can register a callback function to be called when a packet arrives at each hook. Callbacks are free to manipulate the packet.
 The callback can then tell netfilter to do one of five things:
    NF_DROP: drop the packet; don't continue traversal.
    NF_ACCEPT: continue traversal as normal.
    NF_STOLEN: I've taken over the packet; stop traversal.
    NF_QUEUE: queue the packet (usually for userspace handling).
    NF_REPEAT: call this hook again.
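As a rough illustration of how the five verdicts steer traversal, here is a small Python model of running the callbacks registered on one hook. It is a sketch under assumptions: `run_hook` and the `max_repeats` guard are hypothetical and exist only to keep the example from looping forever.

```python
from enum import Enum
from typing import Callable, Dict, List


class Verdict(Enum):
    NF_DROP = "drop"
    NF_ACCEPT = "accept"
    NF_STOLEN = "stolen"
    NF_QUEUE = "queue"
    NF_REPEAT = "repeat"


Callback = Callable[[Dict], Verdict]


def run_hook(callbacks: List[Callback], packet: Dict,
             max_repeats: int = 4) -> Verdict:
    """Call each registered callback in order until one ends the traversal."""
    for callback in callbacks:
        verdict = callback(packet)
        repeats = 0
        # NF_REPEAT: invoke the same callback again (guard is hypothetical).
        while verdict is Verdict.NF_REPEAT:
            repeats += 1
            if repeats > max_repeats:
                raise RuntimeError("callback keeps returning NF_REPEAT")
            verdict = callback(packet)
        # Anything other than NF_ACCEPT stops traversal at this callback.
        if verdict is not Verdict.NF_ACCEPT:
            return verdict
    return Verdict.NF_ACCEPT
```

A packet continues along its normal path only if every callback on the hook returns NF_ACCEPT.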

IPTables

 A packet selection system called IP Tables has been built over the netfilter framework.
 It is a direct descendant of ipchains (that came from ipfwadm, that came from BSD's ipfw), with extensibility.
 Kernel modules can register a new table, and ask for a packet to traverse a given table.
 This packet selection method is used for:
    Packet filtering (the 'filter' table),
    Network Address Translation (the 'nat' table) and
    General pre-route packet mangling (the 'mangle' table).


Naming Conventions

 Methods are frequently broken into two stages, where the second has the same name with a suffix of finish or slow. This is typical for networking kernel code.
    E.g., ip_rcv, ip_rcv_finish
 In many cases the second method has a "slow" suffix instead of "finish"; this usually happens when the first method looks in some cache and the second method performs a lookup in a more complex data structure, which is slower.

Receive Path: ip_rcv

 Packets that are not addressed to the host (packets received in promiscuous mode) are dropped.
 Does some sanity checking
    Does the packet have at least the size of an IP header?
    Is this IP Version 4?
    Is the checksum correct?
    Does the packet have a wrong length?
 If skb->len is larger than the total length in the IP header (e.g., because of link-layer padding), then invoke skb_trim(skb, iph->tot_len)
 Invokes netfilter hook NF_INET_PRE_ROUTING
 ip_rcv_finish() is called
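The sanity checks listed above can be sketched in a few lines of Python operating on raw bytes. This is an illustrative model, not the kernel routine: `ip_rcv_checks` and `internet_checksum` are hypothetical names, and returning `None` stands in for dropping the packet.

```python
import struct
from typing import Optional


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum over data (padded to even length)."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def ip_rcv_checks(frame: bytes) -> Optional[bytes]:
    """Return the packet trimmed to its IP total length, or None to drop it."""
    if len(frame) < 20:                        # at least an IP header?
        return None
    version, ihl = frame[0] >> 4, frame[0] & 0x0F
    if version != 4 or ihl < 5:                # IP version 4, sane IHL?
        return None
    header_len = ihl * 4
    if len(frame) < header_len:
        return None
    # A valid header, summed including its checksum field, yields 0.
    if internet_checksum(frame[:header_len]) != 0:
        return None
    total_len = struct.unpack("!H", frame[2:4])[0]
    if total_len < header_len or len(frame) < total_len:  # wrong length?
        return None
    return frame[:total_len]                   # trim trailing padding
```

The final slice plays the role of the trim step: bytes beyond the total length announced in the header are discarded.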

Receive Path: ip_rcv_finish

 If skb->dst is NULL, ip_route_input() is called to find the route of the packet.
    Someone else could have filled it in
 skb->dst is set to an entry in the routing cache which stores both the destination IP and the pointer to an entry in the hard header cache (cache for the layer 2 frame packet header)
 If the IP header includes options, an ip_options structure is created.
 skb->dst->input() now points to the function that should be used to handle the packet (delivered locally or forwarded further):
    ip_local_deliver()
    ip_forward()
    ip_mr_input()
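The idea of a route entry carrying the function that handles the packet can be modelled as follows. This Python sketch is a simplification under stated assumptions: the classes `Skb` and `Dst`, the `local_addresses` set, and the rule that every multicast destination goes to the multicast handler are all hypothetical stand-ins for the real routing lookup.

```python
import ipaddress
from dataclasses import dataclass
from typing import Callable, Optional, Set


@dataclass
class Dst:
    input: Callable[["Skb"], str]  # handler chosen by the routing lookup


@dataclass
class Skb:
    daddr: str
    dst: Optional[Dst] = None      # may already be filled in by someone else


def ip_local_deliver(skb: Skb) -> str:
    return "delivered locally"


def ip_forward(skb: Skb) -> str:
    return "forwarded"


def ip_mr_input(skb: Skb) -> str:
    return "multicast routed"


def route_input(daddr: str, local_addresses: Set[str]) -> Dst:
    """Toy routing decision that picks the input handler for a destination."""
    if ipaddress.IPv4Address(daddr).is_multicast:
        return Dst(input=ip_mr_input)
    if daddr in local_addresses:
        return Dst(input=ip_local_deliver)
    return Dst(input=ip_forward)


def ip_rcv_finish(skb: Skb, local_addresses: Set[str]) -> str:
    if skb.dst is None:            # only look up a route if none is attached
        skb.dst = route_input(skb.daddr, local_addresses)
    return skb.dst.input(skb)
```

Because the handler is stored in the route entry, the receive path itself never needs to know whether a packet is local, forwarded, or multicast.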

Receive Path: ip_local_deliver

 The only task of ip_local_deliver(skb) is to re-assemble fragmented packets by invoking ip_defrag().
 The netfilter hook NF_INET_LOCAL_IN is invoked.
 This in turn calls ip_local_deliver_finish
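To give a feel for what reassembly involves, here is a deliberately small Python sketch. It is not how ip_defrag() is implemented: fragments are modelled as hypothetical `(offset, more_fragments, payload)` tuples, overlaps and timeouts are ignored, and `None` means the datagram is still incomplete.

```python
from typing import List, Optional, Tuple

# (fragment offset in 8-byte units, more-fragments flag, payload bytes)
Fragment = Tuple[int, bool, bytes]


def reassemble(fragments: List[Fragment]) -> Optional[bytes]:
    """Return the full payload once all fragments are present, else None."""
    ordered = sorted(fragments, key=lambda f: f[0])
    # The fragment with the highest offset must be the one with MF cleared.
    if not ordered or ordered[-1][1]:
        return None
    payload = b""
    for offset_units, _more, data in ordered:
        if offset_units * 8 != len(payload):   # a hole (or an overlap)
            return None
        payload += data
    return payload
```

Fragments may arrive in any order, so the sketch sorts by offset before checking that the pieces line up without gaps.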

Recv: ip_local_deliver_finish

 Remove the IP header from skb by __skb_pull(skb, ip_hdrlen(skb));
 The protocol ID of the IP header is used to calculate the hash value in the inet_protos hash table.
 Packet is passed to a raw socket if one exists (which copies skb)
 If transport protocol is found, then the handler is invoked:
    tcp_v4_rcv(): TCP
    udp_rcv(): UDP
    icmp_rcv(): ICMP
    igmp_rcv(): IGMP
 Otherwise dropped with an ICMP Destination Unreachable message returned.

Hash Table inet_protos

[Figure: the array inet_protos[MAX_INET_PROTOS]; each slot points to a net_protocol structure with the fields handler, err_handler, gso_send_check, gso_segment, gro_receive, and gro_complete. Two example entries are shown: one with handler udp_rcv() and err_handler udp_err(), and one with handler igmp_rcv() and err_handler NULL.]
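The protocol dispatch described for ip_local_deliver_finish and the inet_protos table can be sketched as a fixed-size array indexed by the protocol number. This Python model makes assumptions that are not stated on the slides: a table size of 256, a hash that simply masks the protocol number, and handlers that return a string instead of processing a packet.

```python
from typing import Callable, List, NamedTuple, Optional

MAX_INET_PROTOS = 256  # assumed table size


class NetProtocol(NamedTuple):
    handler: Callable[[bytes], str]
    err_handler: Optional[Callable[[bytes], None]] = None


inet_protos: List[Optional[NetProtocol]] = [None] * MAX_INET_PROTOS


def inet_add_protocol(prot: NetProtocol, protocol: int) -> None:
    """Register a transport protocol under its IP protocol number."""
    inet_protos[protocol & (MAX_INET_PROTOS - 1)] = prot


def ip_local_deliver_finish(protocol: int, payload: bytes) -> str:
    """Hand the payload to the registered handler, if there is one."""
    entry = inet_protos[protocol & (MAX_INET_PROTOS - 1)]
    if entry is None:
        return "ICMP Destination Unreachable"
    return entry.handler(payload)


# Example registrations; 17 and 2 are the IP protocol numbers of UDP and IGMP.
inet_add_protocol(NetProtocol(handler=lambda p: "udp_rcv",
                              err_handler=lambda p: None), 17)
inet_add_protocol(NetProtocol(handler=lambda p: "igmp_rcv"), 2)
```

With this layout, adding a new transport protocol means registering one more entry; the IP layer's delivery code does not change.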


Send Path: ip_queue_xmit (1)

 skb->dst is checked to see if it contains a pointer to an entry in the routing cache.
    Many packets are routed through the same path, so storing a pointer to a routing entry in skb->dst saves expensive routing table lookup.
 If route is not present (e.g., the first packet of a socket), then ip_route_output_flow() is invoked to determine a route.
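The benefit of reusing a route can be shown with a toy model. In this Python sketch the route is remembered on a hypothetical `Socket` object and copied into the packet; `RoutingTable`, its `lookups` counter, and the dictionary used as a packet are all illustrative assumptions rather than kernel structures.

```python
import ipaddress
from typing import Dict, List, Optional, Tuple


class RoutingTable:
    """Toy longest-prefix-match table that counts how often it is searched."""

    def __init__(self, routes: List[Tuple[str, str]]) -> None:
        self.routes = [(ipaddress.ip_network(p), dev) for p, dev in routes]
        self.lookups = 0

    def lookup(self, daddr: str) -> Optional[str]:
        self.lookups += 1
        addr = ipaddress.ip_address(daddr)
        matches = [(net.prefixlen, dev) for net, dev in self.routes
                   if addr in net]
        return max(matches)[1] if matches else None


class Socket:
    def __init__(self, daddr: str) -> None:
        self.daddr = daddr
        self.cached_route: Optional[str] = None


def ip_queue_xmit(sk: Socket, skb: Dict, table: RoutingTable) -> Optional[str]:
    """Attach a route to the packet, searching the table only when needed."""
    if skb.get("dst") is None:
        if sk.cached_route is None:          # e.g. first packet of a socket
            sk.cached_route = table.lookup(sk.daddr)
        skb["dst"] = sk.cached_route
    return skb["dst"]
```

Sending many packets on the same socket triggers a single table search; every later packet reuses the stored route.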

Send Path: ip_queue_xmit (2)

 Header is pushed onto packet
    skb_push(skb, sizeof(struct iphdr) + (opt ? opt->optlen : 0));
 The fields of the IP header are filled in (version, header length, TOS, TTL, addresses and protocol).
 If IP options exist, ip_options_build() is called.
 ip_local_out() is invoked.

Send Path: ip_local_out

 The checksum is computed
    ip_send_check(iph)
 Netfilter is invoked with NF_INET_LOCAL_OUT, continuing via dst_output(), which calls skb->dst->output()
    This is ip_output()
 If the packet is for the local machine:
    dst->output = ip_output
    dst->input = ip_local_deliver
    ip_output() will send the packet on the loopback device
    Then we will go into ip_rcv() and ip_rcv_finish(), but this time dst is NOT null; so we will end in ip_local_deliver().

Send Path: ip_output

 ip_output() does very little, essentially an entry into the output path from the forwarding layer.
 Updates some stats.
 Invokes Netfilter with NF_INET_POST_ROUTING and ip_finish_output()

Send Path: ip_finish_output

 Checks message length against the destination MTU
 Calls either
    ip_fragment()
    ip_finish_output2()
 Latter is actually a very long inline, not a function
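The MTU decision above can be illustrated with a short Python sketch that either passes a payload through unchanged or splits it. The function names mirror the slide, but the bodies are hypothetical: fragments are returned as `(offset in 8-byte units, more_fragments, payload)` tuples and a 20-byte header without options is assumed.

```python
from typing import List, Tuple

# (fragment offset in 8-byte units, more-fragments flag, payload bytes)
Fragment = Tuple[int, bool, bytes]


def ip_fragment(payload: bytes, mtu: int, header_len: int = 20) -> List[Fragment]:
    """Split a payload so each fragment plus header fits within the MTU."""
    # Every fragment except the last must carry a multiple of 8 bytes.
    max_data = (mtu - header_len) // 8 * 8
    if max_data <= 0:
        raise ValueError("MTU too small to carry any payload")
    fragments = []
    for start in range(0, len(payload), max_data):
        chunk = payload[start:start + max_data]
        more = start + max_data < len(payload)
        fragments.append((start // 8, more, chunk))
    return fragments


def ip_finish_output(payload: bytes, mtu: int, header_len: int = 20) -> List[Fragment]:
    """Fragment only when header plus payload exceeds the MTU."""
    if header_len + len(payload) > mtu:
        return ip_fragment(payload, mtu, header_len)
    return [(0, False, payload)]
```

For example, a 3000-byte payload on a 1500-byte MTU link becomes three fragments carrying 1480, 1480, and 40 bytes.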

Send Path: ip_finish_output2

 Checks skb for room for MAC header. If not, call skb_realloc_headroom().
 Send the packet to a neighbor by:
    dst->neighbour->output(skb)
    arp_bind_neighbour() sees to it that the L2 address (a.k.a. the MAC address) of the next hop will be known.
 These eventually end up in dev_queue_xmit(), which passes the packet down to the device.


Forwarding: ip_forward (1)

 Does some validation and checking, e.g.:
    If skb->pkt_type != PACKET_HOST, drop
    If TTL <= 1, the packet is discarded and an ICMP Time Exceeded message is sent back.
    If the packet is larger than the MTU of the outgoing route (skb->len > mtu) and no fragmentation is allowed (Don't Fragment bit is set in the IP header), the packet is discarded and the ICMP message with ICMP_FRAG_NEEDED is sent back.
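The forwarding checks can be summarised as a small decision function. This Python sketch is a model under assumptions: the `Action` enum and the function name are hypothetical, and the rule that a TTL of 1 or less produces an ICMP Time Exceeded reply is standard router behaviour included here to complete the picture.

```python
from enum import Enum


class Action(Enum):
    FORWARD = "forward"
    DROP = "drop silently"
    ICMP_TIME_EXCEEDED = "drop, send ICMP Time Exceeded"
    ICMP_FRAG_NEEDED = "drop, send ICMP_FRAG_NEEDED"


def ip_forward_checks(pkt_type: str, ttl: int, length: int,
                      mtu: int, dont_fragment: bool) -> Action:
    """Decide what to do with a packet that is a candidate for forwarding."""
    if pkt_type != "PACKET_HOST":      # not addressed to us at layer 2
        return Action.DROP
    if ttl <= 1:                       # would expire at this hop
        return Action.ICMP_TIME_EXCEEDED
    if length > mtu and dont_fragment: # too big and may not be fragmented
        return Action.ICMP_FRAG_NEEDED
    return Action.FORWARD
```

The ICMP_FRAG_NEEDED reply is what path MTU discovery relies on: the sender learns the smaller MTU and retries with smaller packets.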

Forwarding: ip_forward (2)

 skb_cow(skb, headroom) is called to check whether there is still sufficient space for the MAC header of the output device. If not, skb_cow() calls pskb_expand_head() to create sufficient space.
 The TTL field of the IP packet is decremented by 1.
    ip_decrease_ttl() also incrementally modifies the header checksum.
 The netfilter hook NF_INET_FORWARD is invoked.
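Updating the checksum incrementally works because decrementing the TTL lowers one 16-bit header word by 0x0100, so the stored one's-complement checksum must rise by 0x0100 with end-around carry. The Python sketch below follows that idea on a raw header; it borrows the function name from the slide, but the body is an illustrative model and assumes the caller has already verified that the TTL is greater than 1.

```python
import struct


def ip_decrease_ttl(header: bytearray) -> int:
    """Decrement the TTL in place and patch the checksum without recomputing it."""
    check = struct.unpack_from("!H", header, 10)[0]  # checksum at bytes 10..11
    check += 0x0100                                  # TTL is the high byte of its word
    check = (check + (check >= 0xFFFF)) & 0xFFFF     # fold the end-around carry
    struct.pack_into("!H", header, 10, check)
    header[8] -= 1                                   # TTL lives at byte 8
    return header[8]
```

Patching two bytes is much cheaper than summing the whole header again, which matters on a path that every forwarded packet takes.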

Forwarding: ip_forward_finish

 Increments some stats.
 Handles any IP options if they exist.
 Calls the destination output function via skb->dst->output(skb), which is ip_output()

IP Backup

Recall the IP Header

IP-packet format (bit positions 0, 3, 7, 15, 31 mark the field boundaries):
   Version | IHL | Codepoint | Total length
   Fragment-ID | DF | MF | Fragment-Offset
   Time to Live | Protocol | Checksum
   Source address
   Destination address
   Options and payload

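Recall the sk_buff structure

[Figure: layout of struct sk_buff, from linux-2.6.31/include/linux/skbuff.h.]

 next and prev link sk_buffs into a list headed by an sk_buff_head.
 sk points to the owning struct sock; dev points to the net_device; tstamp holds the timestamp.
 ...lots of stuff...
 transport_header, network_header, and mac_header locate the UDP header, IP header, and MAC header within the packet data.
 head, data, tail, and end delimit the buffer. The packet data area consists of "headroom", MAC-Header, IP-Header, UDP-Header, UDP-Data, and "tailroom".
 truesize and users are further fields of the sk_buff.
 The skb_shared_info structure follows the packet data and holds dataref (1 in the figure), nr_frags, ..., and destructor_arg.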

Recall pkt_type in sk_buff

 pkt_type: specifies the type of a packet
    PACKET_HOST: a packet sent to the local host
    PACKET_BROADCAST: a broadcast packet
    PACKET_MULTICAST: a multicast packet
    PACKET_OTHERHOST: a packet not destined for the local host, but received in promiscuous mode.
    PACKET_OUTGOING: a packet leaving the host
    PACKET_LOOPBACK: a packet sent by the local host to itself.


