socket programming

					Socket Programming
 What is a socket?
 Using sockets
   Types (Protocols)
   Associated functions
   Styles

   We will look at using sockets in C
   For Java, see Chapter 2.6-2.8 (optional)
      • Note: Java sockets are conceptually quite similar

What is a socket?
 An interface between application and network
    The application creates a socket
    The socket type dictates the style of communication
       • reliable vs. best effort
       • connection-oriented vs. connectionless
 Once configured the application can
    pass data to the socket for network transmission
    receive data from the socket (transmitted
     through the network by some other host)
Two essential types of sockets
 SOCK_STREAM                          SOCK_DGRAM
       a.k.a. TCP                         a.k.a. UDP
       reliable delivery                  unreliable delivery
       in-order guaranteed                no order guarantees
       connection-oriented                no notion of “connection” –
       bidirectional                       app indicates dest. for each packet
                                          can send or receive

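[Figure: an app's SOCK_STREAM socket sending numbered packets to a single destination (Dest.), and an app's SOCK_DGRAM socket sending numbered packets to separate destinations (D1, D2).]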

           Q: why have type SOCK_DGRAM?
Socket Creation in C: socket
 int s = socket(domain, type, protocol);
    s: socket descriptor, an integer (like a file-handle)
    domain: integer, communication domain
        • e.g., PF_INET (IPv4 protocol) – typically used
      type: communication type
        • SOCK_STREAM: reliable, 2-way, connection-based
        • SOCK_DGRAM: unreliable, connectionless
        • other values: need root permission, rarely used, or obsolete
    protocol: specifies protocol (see file /etc/protocols
     for a list of options) - usually set to 0
 NOTE: socket call does not specify where data will be
  coming from, nor where it will be going to – it just
  creates the interface!
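A minimal sketch of the socket call described above. The helper name open_ipv4_socket is hypothetical (it is not part of the slides or of any library); it only wraps socket() with the PF_INET domain and protocol 0, and assumes a POSIX system.

```c
#include <stdio.h>
#include <sys/socket.h>

/* Hypothetical helper (not from the slides): create an IPv4 socket of the
 * given type, SOCK_STREAM or SOCK_DGRAM.
 * Returns the socket descriptor, or -1 if socket() failed. */
int open_ipv4_socket(int type)
{
    /* protocol 0 lets the OS pick the default protocol for this type */
    int s = socket(PF_INET, type, 0);
    if (s == -1)
        perror("socket");
    return s;
}
```

For example, open_ipv4_socket(SOCK_STREAM) would return a descriptor for a TCP-style socket that is not yet associated with any address or port.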
A Socket-eye view of the Internet



 Each host machine has an IP address
 When a packet arrives at a host

 Each host has 65,536 ports (Port 0, Port 1, ..., Port 65535)
 Some ports are reserved for specific apps
    20,21: FTP
    23: Telnet
    80: HTTP
    see RFC 1700 (about 2000 ports are reserved)
 A socket provides an interface to send data to/from the
  network through a port
Addresses, Ports and Sockets
 Like apartments and mailboxes
    You are the application
    Your apartment building address is the address
    Your mailbox is the port
    The post-office is the network
    The socket is the key that gives you access to the right
     mailbox (one difference: assume outgoing mail is placed
     by you in your mailbox)

 Q: How do you choose which port a socket
  connects to?

The bind function
 associates and (can exclusively) reserves a port
  for use by the socket
 int status = bind(sockid, &addrport, size);
      status: error status, = -1 if bind failed
      sockid: integer, socket descriptor
      addrport: struct sockaddr, the (IP) address and port of the
       machine (address usually set to INADDR_ANY – chooses a
       local address)
      size: the size (in bytes) of the addrport structure
 bind can be skipped for both types of sockets.
  When and why?
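A minimal sketch of the bind call, assuming IPv4 and a POSIX system. The helper name bind_any is hypothetical, and the getsockname call (not covered in these slides) is used only to read back which port the OS actually assigned when port 0 is requested.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: bind sock to the given port on any local address.
 * Passing port 0 asks the OS to pick a free port.
 * Returns the port actually bound (host byte order), or -1 on error. */
int bind_any(int sock, unsigned short port)
{
    struct sockaddr_in addrport;
    socklen_t len = sizeof(addrport);

    memset(&addrport, 0, sizeof(addrport));
    addrport.sin_family = AF_INET;
    addrport.sin_addr.s_addr = htonl(INADDR_ANY);  /* any local address */
    addrport.sin_port = htons(port);

    if (bind(sock, (struct sockaddr *)&addrport, sizeof(addrport)) == -1)
        return -1;
    /* read back the address so the caller learns the assigned port */
    if (getsockname(sock, (struct sockaddr *)&addrport, &len) == -1)
        return -1;
    return ntohs(addrport.sin_port);
}
```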

Skipping the bind
 SOCK_DGRAM:
    if only sending, no need to bind. The OS picks a
     port when the socket first sends a pkt
    if receiving, need to bind
 SOCK_STREAM:
    destination determined during conn. setup
    don’t need to know port sending from (during
     connection setup, receiving end is informed of port)

Connection Setup                     (SOCK_STREAM)

 Recall: no connection setup for SOCK_DGRAM
 A connection occurs between two kinds of participants
      passive: waits for an active participant to request
       connection
      active: initiates connection request to passive side
 Once connection is established, passive and active
  participants are “similar”
      both can send & receive data
      either can terminate the connection

Connection setup cont’d
 Passive participant
    step 1: listen (for incoming requests)
    step 3: accept (a request)
    step 4: data transfer
 Active participant
    step 2: request & establish connection
    step 4: data transfer
 The accepted connection is on a new socket
 The old socket continues to listen for other active
  participants
 Why?
[Figure: the passive participant holds a listening socket (l-sock) plus one accepted socket per connection (a-sock-1, a-sock-2), each connected to the socket of an active participant (Active 1, Active 2).]
Connection setup: listen & accept
 Called by passive participant
 int status = listen(sock, queuelen);
    status: 0 if listening, -1 if error
    sock: integer, socket descriptor
    queuelen: integer, # of active participants that can
      “wait” for a connection
    listen is non-blocking: returns immediately

 int s = accept(sock, &name, &namelen);
    s: integer, the new socket (used for data-transfer)
    sock: integer, the orig. socket (being listened on)
    name: struct sockaddr, address of the active participant
    namelen: sizeof(name): value/result parameter
        • must be set appropriately before call
        • adjusted by OS upon return
      accept is blocking: waits for connection before returning
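The passive side described above can be sketched roughly as follows, assuming IPv4 and a POSIX system. The helper names make_listener and accept_one are hypothetical; error handling is kept to the minimum needed to show the order of calls.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create a SOCK_STREAM socket, bind it to the given
 * port on any local address, and start listening.
 * Returns the listening descriptor, or -1 on error. */
int make_listener(unsigned short port, int queuelen)
{
    struct sockaddr_in addr;
    int sock = socket(PF_INET, SOCK_STREAM, 0);
    if (sock == -1)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if (bind(sock, (struct sockaddr *)&addr, sizeof(addr)) == -1 ||
        listen(sock, queuelen) == -1) {
        close(sock);
        return -1;
    }
    return sock;
}

/* Hypothetical helper: block until one active participant connects.
 * Returns the new data-transfer socket; sock itself keeps listening. */
int accept_one(int sock, struct sockaddr_in *name)
{
    socklen_t namelen = sizeof(*name);  /* value/result: set before call */
    return accept(sock, (struct sockaddr *)name, &namelen);
}
```

A server would typically call make_listener once and then call accept_one in a loop, using each returned descriptor for data transfer with one active participant.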
connect call
 int status = connect(sock, &name, namelen);
    status: 0 if successful connect, -1 otherwise
    sock: integer, socket to be used in connection
    name: struct sockaddr, address of passive participant
    namelen: integer, sizeof(name)

 connect is blocking
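A minimal sketch of the active side, assuming IPv4 and a POSIX system. The helper name connect_to is hypothetical; the address is given as a dotted-decimal string and converted with inet_addr.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: connect to the passive participant at the given
 * dotted-decimal IPv4 address and port (port in host byte order).
 * Returns the connected descriptor, or -1 on error. */
int connect_to(const char *ip, unsigned short port)
{
    struct sockaddr_in name;
    int sock = socket(PF_INET, SOCK_STREAM, 0);
    if (sock == -1)
        return -1;

    memset(&name, 0, sizeof(name));
    name.sin_family = AF_INET;
    name.sin_port = htons(port);
    name.sin_addr.s_addr = inet_addr(ip);  /* result is in network order */

    /* inet_addr signals a malformed string with INADDR_NONE */
    if (name.sin_addr.s_addr == INADDR_NONE ||
        connect(sock, (struct sockaddr *)&name, sizeof(name)) == -1) {
        close(sock);
        return -1;
    }
    return sock;
}
```

Note that no bind call is needed here: the OS chooses the local port during connection setup.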

Sending / Receiving Data
 With a connection (SOCK_STREAM):
   int count = send(sock, &buf, len, flags);
        •   count: # bytes transmitted (-1 if error)
        •   buf: char[], buffer to be transmitted
        •   len: integer, length of buffer (in bytes) to transmit
        •   flags: integer, special options, usually just 0
    int    count = recv(sock, &buf, len, flags);
        •   count: # bytes received (-1 if error)
        •   buf: void[], stores received bytes
        •   len: integer, max # bytes to receive (size of buf)
        •   flags: integer, special options, usually just 0
      Calls are blocking [returns only after data is sent
       (to socket buf) / received]
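Because count may be smaller than len on a stream socket, a common pattern is to call send and recv in a loop until the whole buffer has been handled. The sketch below illustrates that pattern; the helper names send_all and recv_all are hypothetical and not part of the sockets API.

```c
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: keep calling send until all len bytes have been
 * handed to the socket. Returns 0 on success, -1 on error. */
int send_all(int sock, const char *buf, size_t len)
{
    size_t sent = 0;
    while (sent < len) {
        ssize_t count = send(sock, buf + sent, len - sent, 0);
        if (count == -1)
            return -1;
        sent += (size_t)count;
    }
    return 0;
}

/* Hypothetical helper: keep calling recv until exactly len bytes arrive.
 * Returns 0 on success, -1 on error or if the peer closed early. */
int recv_all(int sock, char *buf, size_t len)
{
    size_t got = 0;
    while (got < len) {
        ssize_t count = recv(sock, buf + got, len - got, 0);
        if (count <= 0)            /* -1: error, 0: connection closed */
            return -1;
        got += (size_t)count;
    }
    return 0;
}
```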
Sending / Receiving Data (cont’d)
 Without a connection (SOCK_DGRAM):
      int count = sendto(sock, &buf, len, flags, &addr, addrlen);
        • count, sock, buf, len, flags: same as send
        • addr: struct sockaddr, address of the destination
        • addrlen: sizeof(addr)
      int count = recvfrom(sock, &buf, len, flags, &name, &namelen);
        • count, sock, buf, len, flags: same as recv
        • name: struct sockaddr, address of the source
        • namelen: sizeof(name): value/result parameter
 Calls are blocking [returns only after data is sent (to
  socket buf) / received]
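A rough sketch of a connectionless exchange, assuming IPv4 and a POSIX system. The function name udp_loopback is hypothetical; it sends one datagram between two sockets on the loopback address so that both sendto and recvfrom appear in one place. The getsockname call is used only to learn which port the OS assigned to the receiver.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo: send msg from an unbound SOCK_DGRAM socket to a
 * bound one on the loopback address, and copy what arrives into out.
 * Returns the number of bytes received, or -1 on error. */
int udp_loopback(const char *msg, char *out, size_t outlen)
{
    struct sockaddr_in addr, name;
    socklen_t addrlen = sizeof(addr), namelen = sizeof(name);
    ssize_t count = -1;
    int rx = socket(PF_INET, SOCK_DGRAM, 0);   /* receiver: must bind */
    int tx = socket(PF_INET, SOCK_DGRAM, 0);   /* sender: bind skipped */

    if (rx == -1 || tx == -1)
        goto done;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);                  /* let the OS pick a port */
    if (bind(rx, (struct sockaddr *)&addr, sizeof(addr)) == -1 ||
        getsockname(rx, (struct sockaddr *)&addr, &addrlen) == -1)
        goto done;

    /* the destination is named on every sendto call */
    if (sendto(tx, msg, strlen(msg), 0,
               (struct sockaddr *)&addr, sizeof(addr)) == -1)
        goto done;

    /* name and namelen report who sent the datagram */
    count = recvfrom(rx, out, outlen, 0, (struct sockaddr *)&name, &namelen);

done:
    if (rx != -1) close(rx);
    if (tx != -1) close(tx);
    return (int)count;
}
```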

close
 When finished using a socket, the socket
  should be closed:
 status = close(s);
   status: 0 if successful, -1 if error
   s: the file descriptor (socket being closed)

 Closing a socket
   closes a connection (for SOCK_STREAM)
   frees up the port used by the socket

The struct sockaddr
 The generic:
    struct sockaddr {
        u_short sa_family;
        char sa_data[14];
    };
       sa_family
        • specifies which address family is being used
        • determines how the remaining 14 bytes are used
 The Internet-specific:
    struct sockaddr_in {
        short sin_family;
        u_short sin_port;
        struct in_addr sin_addr;
        char sin_zero[8];
    };
       sin_family = AF_INET
       sin_port: port # (0-65535)
       sin_addr: IP-address
       sin_zero: unused
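A minimal sketch of filling in the Internet-specific structure, assuming IPv4 and a POSIX system. The helper name make_addr is hypothetical. Socket calls take the generic struct sockaddr, so the filled structure is normally passed with a cast such as (struct sockaddr *)&addr.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: build an Internet address from a dotted-decimal
 * string and a port given in host byte order. */
struct sockaddr_in make_addr(const char *ip, unsigned short port)
{
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));        /* also clears sin_zero */
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);           /* stored in network order */
    addr.sin_addr.s_addr = inet_addr(ip);  /* stored in network order */
    return addr;
}
```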

Address and port byte-ordering
 Address and port are stored as integers
       u_short sin_port; (16 bit)
       in_addr sin_addr; (32 bit)
         struct in_addr {
             u_long s_addr;
         };
 Problem:
    different machines / OS’s use different word orderings
         • little-endian: lower bytes first
         • big-endian: higher bytes first
    these machines may communicate with one another over the
     network
[Figure: the address bytes 128 119 40 12 as stored on a big-endian machine and as received by a little-endian machine.]
Solution: Network Byte-Ordering
 Defs:
    Host Byte-Ordering: the byte ordering used by
     a host (big or little)
    Network Byte-Ordering: the byte ordering used
     by the network – always big-endian
 Any words sent through the network should be
  converted to Network Byte-Order prior to
  transmission (and back to Host Byte-Order once
  received)
 Q: should the socket perform the conversion
  automatically?
 Q: Given big-endian machines don’t need
  conversion routines and little-endian machines do,
  how do we avoid writing two versions of code?
 UNIX’s byte-ordering funcs
 u_long htonl(u_long x);         u_long ntohl(u_long x);
 u_short htons(u_short x);       u_short ntohs(u_short x);

  On big-endian machines, these routines do nothing
  On little-endian machines, they reverse the byte
    ordering
[Figure: the address 128 119 40 12 is converted with htonl before sending and with ntohl after receiving, so a big-endian and a little-endian machine both end up with 128 119 40 12.]
  Same code would have worked regardless of endianness
    of the two machines
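The effect of htonl can be checked by looking at the bytes in memory after conversion. The sketch below does that for the address 128.119.40.12 used in the figures; the helper name network_bytes is hypothetical.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy the four bytes of x, in the order they sit
 * in memory after conversion to network byte order, into out[0..3]. */
void network_bytes(uint32_t x, unsigned char out[4])
{
    uint32_t n = htonl(x);        /* does nothing on big-endian hosts */
    memcpy(out, &n, sizeof(n));   /* out[0] is the byte sent first */
}
```

Calling network_bytes(0x8077280C, out) leaves 128, 119, 40, 12 in out on either kind of machine, because network byte order is always big-endian.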
Dealing with blocking calls
 Many of the functions we saw block until a certain event
      accept: until a connection comes in
      connect: until the connection is established
      recv, recvfrom: until a packet (of data) is received
      send, sendto: until data is pushed into socket’s buffer
        • Q: why not until received?
 For simple programs, blocking is convenient
 What about more complex programs?
   multiple connections
   simultaneous sends and receives
   simultaneously doing non-networking processing

Dealing w/ blocking (cont’d)
 Options:
   create multi-process or multi-threaded code
   turn off the blocking feature (e.g., using the fcntl file-
    descriptor control function)
   use the select function call.

 What does select do?
   can be permanent blocking, time-limited blocking or non-blocking
   input: a set of file-descriptors
   output: info on the file-descriptors’ status
   i.e., can identify sockets that are “ready for use”: calls
    involving that socket will return immediately
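The second option listed above, turning off blocking with fcntl, might look like the sketch below on a POSIX system. The helper names set_nonblocking and try_recv are hypothetical; the O_NONBLOCK flag and the EAGAIN / EWOULDBLOCK error codes are the standard POSIX ones.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: turn off blocking on descriptor fd.
 * Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);     /* read the current flags */
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Hypothetical helper: one receive attempt on a non-blocking socket.
 * Returns 0 if data was read (*count bytes), 1 if the call would have
 * blocked because no data is available yet, -1 on any other outcome. */
int try_recv(int sock, char *buf, size_t len, ssize_t *count)
{
    *count = recv(sock, buf, len, 0);
    if (*count > 0)
        return 0;
    if (*count == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return 1;
    return -1;
}
```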

select function call
 int status = select(nfds, &readfds, &writefds,
  &exceptfds, &timeout);
    status: # of ready objects, -1 if error
    nfds: 1 + largest file descriptor to check
    readfds: list of descriptors to check if read-ready
    writefds: list of descriptors to check if write-ready
    exceptfds: list of descriptors to check if an
     exception is registered
      timeout: time after which select returns, even if
       nothing ready - can be 0 or ∞
       (point timeout parameter to NULL for ∞)
To be used with select:
 Recall select uses a structure, struct fd_set
    it is just a bit-vector
    if bit i is set in [readfds, writefds, exceptfds],
     select will check if file descriptor (i.e. socket) i
     is ready for [reading, writing, exception]
 Before calling select:
    FD_ZERO(&fdvar): clears the structure
    FD_SET(i, &fdvar): to check file desc. i

 After calling select:
   int FD_ISSET(i, &fdvar): boolean returns TRUE
    iff i is “ready”
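Putting FD_ZERO, FD_SET, select and FD_ISSET together, a time-limited wait on a single socket could be sketched as follows. The helper name wait_readable is hypothetical, and only the read set is used here.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>

/* Hypothetical helper: wait up to `seconds` for sock to become
 * read-ready. Returns 1 if ready, 0 on timeout, -1 on error. */
int wait_readable(int sock, int seconds)
{
    fd_set readfds;
    struct timeval timeout;
    int status;

    FD_ZERO(&readfds);             /* clear the bit-vector */
    FD_SET(sock, &readfds);        /* ask select to check sock */
    timeout.tv_sec = seconds;      /* 0 means "poll and return" */
    timeout.tv_usec = 0;

    /* nfds is 1 + the largest descriptor being checked */
    status = select(sock + 1, &readfds, NULL, NULL, &timeout);
    if (status == -1)
        return -1;
    return FD_ISSET(sock, &readfds) ? 1 : 0;
}
```

When wait_readable returns 1, a following recv (or accept, for a listening socket) on that descriptor should return without blocking.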

Other useful functions
 bzero(char* c, int n): 0’s n bytes starting at c
 gethostname(char *name, int len): gets the name of
  the current host
 gethostbyaddr(char *addr, int len, int type): looks up
  the host entry (struct hostent, including the hostname)
  for a binary IP address
 inet_addr(const char *cp): converts dotted-decimal
  char-string to long integer
 inet_ntoa(const struct in_addr in): converts long to
  dotted-decimal notation

 Warning: check function assumptions about byte-ordering
  (host or network). Often, they assume parameters / return
  results in network byte-order
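As a small illustration of the warning above, the sketch below converts a dotted-decimal string with inet_addr and back with inet_ntoa; both work with the address in network byte order. The helper name roundtrip_address is hypothetical.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>

/* Hypothetical helper: parse a dotted-decimal string and format it back.
 * Writes the result into out; returns 0 on success, -1 on a bad address.
 * Caveat: inet_addr also returns INADDR_NONE for "255.255.255.255". */
int roundtrip_address(const char *dotted, char *out, size_t outlen)
{
    struct in_addr in;
    in.s_addr = inet_addr(dotted);          /* network byte order */
    if (in.s_addr == INADDR_NONE)
        return -1;
    /* inet_ntoa returns a static buffer, so copy it right away */
    strncpy(out, inet_ntoa(in), outlen - 1);
    out[outlen - 1] = '\0';
    return 0;
}
```

To use the parsed value as an ordinary integer on the host, it would first need to go through ntohl.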
Release of ports
 Sometimes, a “rough” exit from a program (e.g.,
  ctrl-c) does not properly free up a port
 Eventually (after a few minutes), the port will be freed
 To reduce the likelihood of this problem, include
  the following code:
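       #include <signal.h>
       #include <stdlib.h>   /* for exit() */
       /* a signal handler receives the signal number as its argument */
       void cleanExit(int sig){exit(0);}
      in socket code:
       signal(SIGTERM, cleanExit);
       signal(SIGINT, cleanExit);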

Final Thoughts
 Make sure to #include the header files that
  define used functions
 Check man-pages and course web-site for
  additional info

