The Web: the HTTP Protocol

The Web: Terminology

• Web page:
    - consists of “objects”
    - addressed by a URL
• Most Web pages consist of:
    - base HTML page, and
    - several referenced objects
• URL has three components: host name, port number and path name:
      http://www.eng.tau.ac.il:80/index.html
• Browser: client, user agent, e.g.
    - Internet Explorer (MS)
    - Firefox (Mozilla)
    - Chrome (Google)
    - Safari (Apple)
• Web server: the server side of the Web, e.g.
    - Apache
    - MS Internet Information Server

The Web: the HTTP Protocol

HTTP: hypertext transfer protocol
• Web’s application layer protocol
• HTTP uses TCP as transport service
• client/server model
    - client: browser that requests, receives, “displays” Web objects
    - server: Web server sends objects in response to requests
• http1.0: RFC 1945
• http1.1: RFC 2068 (later superseded by RFC 2616)

[Figure: a PC running Explorer and a Linux machine running Firefox each exchange http request / http response messages with a server running the Apache Web server.]
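
The three URL components listed above (host name, port number, path name) can be pulled apart programmatically. The following is a minimal sketch using Python's standard `urllib.parse.urlsplit`; the helper name `url_components` and the choice to fall back to port 80 when the URL omits a port are assumptions made for illustration.

```python
from urllib.parse import urlsplit

DEFAULT_HTTP_PORT = 80  # default port for an http server


def url_components(url: str):
    """Return (host name, port number, path name) for an http URL."""
    parts = urlsplit(url)
    # parts.port is None when the URL does not spell out a port.
    port = parts.port if parts.port is not None else DEFAULT_HTTP_PORT
    # An empty path means the root document.
    path = parts.path or "/"
    return parts.hostname, port, path


host, port, path = url_components("http://www.eng.tau.ac.il:80/index.html")
print(host, port, path)
```

Calling `url_components` on the same URL without `:80` yields the same triple, because the port then falls back to the default.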




HTTP 1.0 Message Flow

• Server waits for requests from clients
• Client initiates TCP connection (creates socket) to server, port 80
• Client sends request for a document
• Web server sends back the document
• TCP connection closed
• Client parses the document to find embedded objects (images)
    - repeat above for each image

HTTP 1.0 Message Flow (more detail)

Suppose user clicks www.eng.tau.ac.il/index.html. Steps in time order:

0.  http server at host www.eng.tau.ac.il waiting for TCP connection at port 80 (bind(), listen()).
1a. http client initiates TCP connection (connect()) to http server at www.eng.tau.ac.il. Port 80 is default for http server.
1b. http server accepts connection (accept()).
2.  http client sends http request message (containing URL) into TCP connection socket (write()).
3.  http server receives request message (read()), forms response message containing requested object (index.html), sends message (write()).

HTTP 1.0 Message Flow (cont.)

4.  http server closes TCP connection.
5.  http client receives response message containing html file, parses html file, finds embedded image.
6.  Steps 1-5 repeated for each of the embedded images.

HTTP Request Message: General Format

• ASCII (human-readable format)
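
The HTTP 1.0 message flow in steps 0 to 5 above can be sketched with Python's standard `socket` module. This is only an illustration under stated assumptions: the server runs on the loopback address with an OS-chosen port instead of port 80, it answers exactly one request with a fixed page, and the names `serve_once`, `fetch`, `demo` and `BODY` are hypothetical.

```python
import socket
import threading

BODY = b"<html><body>hello</body></html>"  # hypothetical index.html


def serve_once(listener: socket.socket) -> None:
    """Server side of steps 1b, 3 and 4 for a single client."""
    conn, _addr = listener.accept()              # 1b. accept()
    with conn:
        request = b""
        while b"\r\n\r\n" not in request:        # 3. read() until the blank line
            chunk = conn.recv(1024)
            if not chunk:
                break
            request += chunk
        header = (
            "HTTP/1.0 200 OK\r\n"
            "Content-Type: text/html\r\n"
            f"Content-Length: {len(BODY)}\r\n"
            "\r\n"
        )
        conn.sendall(header.encode("ascii") + BODY)  # 3. write()
    # Leaving the with-block closes the TCP connection (step 4).


def fetch(host: str, port: int, path: str) -> bytes:
    """Client side of steps 1a, 2 and 5."""
    with socket.create_connection((host, port)) as sock:   # 1a. connect()
        request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n"
        sock.sendall(request.encode("ascii"))              # 2. write()
        chunks = []
        while True:                                        # 5. read until the server closes
            chunk = sock.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks)


def demo() -> bytes:
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))              # 0. bind(); port 0 = any free port
    listener.listen(1)                           # 0. listen()
    port = listener.getsockname()[1]
    worker = threading.Thread(target=serve_once, args=(listener,))
    worker.start()
    try:
        return fetch("127.0.0.1", port, "/index.html")
    finally:
        worker.join()
        listener.close()


if __name__ == "__main__":
    print(demo().decode("ascii"))
```

Step 6 would correspond to calling `fetch` again, on a fresh connection, for every embedded image found in the returned HTML.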




HTTP Request Message Example: GET

    GET /somedir/page.html HTTP/1.0              <- request line (GET, POST, HEAD commands)
    Host: www.nytimes.com                        <- header lines
    Connection: close
    User-agent: Mozilla/4.0
    Accept: text/html, image/gif, image/jpeg
    Accept-language: en
    (extra carriage return, line feed)           <- carriage return, line feed indicates end of message

HTTP Response Message

    HTTP/1.0 200 OK                              <- status line (protocol, status code, status phrase)
    Date: Wed, 23 Jan 2008 12:00:15 GMT          <- header lines
    Server: Apache/1.3.0 (Unix)
    Last-Modified: Mon, 22 Jun 1998 …...
    Content-Length: 6821
    Content-Type: text/html

    data data data data data ...                 <- data, e.g., requested html file

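
The request and response layouts above can be built and taken apart with a few lines of code. The sketch below is illustrative only: `build_request` and `parse_response` are hypothetical helpers, header names are lower-cased for lookup as a convenience, and the sample response is shortened (its body and Content-Length are made up, not taken from the slide).

```python
CRLF = "\r\n"


def build_request(method: str, path: str, headers: dict) -> bytes:
    """Request line, header lines, then an extra CRLF to end the message."""
    lines = [f"{method} {path} HTTP/1.0"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    return (CRLF.join(lines) + CRLF + CRLF).encode("ascii")


def parse_response(raw: bytes):
    """Split a response into status line fields, header lines and data."""
    head, _sep, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("ascii").split(CRLF)
    # Status line: protocol, status code, status phrase.
    protocol, code, phrase = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _colon, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return protocol, int(code), phrase, headers, body


request = build_request(
    "GET", "/somedir/page.html",
    {"Host": "www.nytimes.com", "Connection": "close"},
)

sample = (
    b"HTTP/1.0 200 OK\r\n"
    b"Date: Wed, 23 Jan 2008 12:00:15 GMT\r\n"
    b"Content-Length: 5\r\n"
    b"Content-Type: text/html\r\n"
    b"\r\n"
    b"hello"
)
protocol, code, phrase, headers, body = parse_response(sample)
print(protocol, code, phrase, headers["content-type"], body)
```

Note that the Date value itself contains colons, which is why each header line is split only at its first colon.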
HTTP Response Status Codes

In the first line of the server->client response message. A few sample codes:

200 OK
    - request succeeded, requested object later in this message
301 Moved Permanently
    - requested object moved, new location specified later in this message (Location:)
400 Bad Request
    - request message not understood by server
404 Not Found
    - requested document not found on this server
505 HTTP Version Not Supported

Trying out HTTP (client side) for yourself

1. Telnet to your favorite Web server:
       telnet www.tau.ac.il 80
   Opens TCP connection to port 80 (default http server port) at www.tau.ac.il. Anything typed in is sent to port 80 at www.tau.ac.il.
2. Type in a GET http request:
       GET /index.html HTTP/1.0
   By typing this in (hit carriage return twice), you send this minimal (but complete) GET request to the http server.
3. Look at the response message sent by the http server.
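
The sample status codes above are also available in Python's standard library, which can be handy when checking a response by hand. The sketch below uses `http.HTTPStatus`; the helpers `describe` and `status_class` are hypothetical, and the grouping of codes by their first digit is an addition not covered on the slide.

```python
from http import HTTPStatus

# Grouping by first digit of the status code (not on the slide).
CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe(code: int) -> str:
    """Return the code followed by its standard status phrase."""
    status = HTTPStatus(code)
    return f"{status.value} {status.phrase}"


def status_class(code: int) -> str:
    """Classify a status code by its first digit."""
    return CLASSES[code // 100]


for code in (200, 301, 400, 404, 505):
    print(describe(code), "-", status_class(code))
```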




HTTP/1.0 Delay

• For each object:
    - TCP handshake: 1 RTT
    - client request and server response: at least 1 RTT (if object can be contained in one packet)
• Can we reduce the delay?

HTTP Message Flow: Persistent HTTP

• Default for HTTP/1.1
• On same TCP connection: server parses request, responds, parses new request, …
• Client sends requests for all referenced objects as soon as it receives base HTML
• Fewer RTTs

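
A rough count makes the "fewer RTTs" claim concrete. The sketch below is a back-of-the-envelope model, not a measurement: it assumes every object fits in one packet, ignores transmission and server processing time, assumes HTTP/1.0 fetches objects one connection at a time, and assumes the persistent client sends all its requests back to back once the base HTML arrives. The function names are hypothetical.

```python
def http10_rtts(referenced_objects: int) -> int:
    """HTTP/1.0: every object costs a handshake RTT plus a request/response RTT."""
    objects = 1 + referenced_objects          # base HTML page + referenced objects
    return 2 * objects


def persistent_rtts(referenced_objects: int) -> int:
    """Persistent HTTP with all requests sent as soon as the base HTML arrives."""
    handshake = 1                             # one TCP connection for everything
    base_page = 1                             # request/response for the base HTML
    rest = 1 if referenced_objects > 0 else 0 # all referenced objects in one round
    return handshake + base_page + rest


for n in (0, 1, 10):
    print(n, http10_rtts(n), persistent_rtts(n))
```

Under these assumptions a page with 10 referenced images costs 22 RTTs with HTTP/1.0 and 3 RTTs with persistent HTTP.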
Browser Cache and Conditional GET

• Goal: don’t send object if client has up-to-date stored (cached) version
• client: specify date of cached copy in http request
      If-modified-since: <date>
• server: response contains no object if cached copy is up-to-date:
      HTTP/1.0 304 Not Modified

Message exchange, object not modified:
    client -> server: http request msg with If-modified-since: <date>
    server -> client: http response HTTP/1.0 304 Not Modified

Message exchange, object modified:
    client -> server: http request msg with If-modified-since: <date>
    server -> client: http response HTTP/1.1 200 OK … <data>

HTTP Message Extension: Form

• if an HTML page contains forms, the submitted form data is encoded in the message body

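
The server's side of the conditional GET exchange above boils down to one date comparison. The following is a minimal sketch, assuming dates arrive in the usual HTTP date format; `conditional_get` is a hypothetical helper, and the modification dates in the example are invented for illustration.

```python
from email.utils import parsedate_to_datetime
from typing import Optional, Tuple


def conditional_get(last_modified: str,
                    if_modified_since: Optional[str],
                    obj: bytes) -> Tuple[str, bytes]:
    """Return (status line, body) for a GET that may carry If-modified-since."""
    if if_modified_since is not None:
        cached_at = parsedate_to_datetime(if_modified_since)
        changed_at = parsedate_to_datetime(last_modified)
        if changed_at <= cached_at:
            # Cached copy is up to date: no object in the response.
            return "HTTP/1.0 304 Not Modified", b""
    # No cached copy, or the object changed after the cached date.
    return "HTTP/1.0 200 OK", obj


page = b"<html>...</html>"
cached = "Wed, 23 Jan 2008 12:00:15 GMT"

# Object last changed before the client cached it.
print(conditional_get("Tue, 01 Jan 2008 00:00:00 GMT", cached, page))
# Object changed after the client cached it.
print(conditional_get("Fri, 01 Feb 2008 00:00:00 GMT", cached, page))
```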




HTTP Message Flow Extensions: Keeping State

• Why do we need to keep state?
• In FTP, the server keeps the connection open with each client, and thus the state (e.g., current dir/password). Why doesn’t HTTP use this approach?

User-server Interaction: Cookies

Goal: no explicit application level session
• Server sends “cookie” to client in response msg
      Set-cookie: 1678453
• Client presents cookie in later requests
      Cookie: 1678453
• Server matches presented cookie with server-stored info
    - authentication
    - remembering user preferences, previous choices

Message exchange:
    client -> server: usual http request msg
    server -> client: usual http response + Set-cookie: #
    client -> server: usual http request msg + Cookie: #
    server: cookie-specific action
    server -> client: usual http response msg
    client -> server: usual http request msg + Cookie: #
    server: cookie-specific action
    server -> client: usual http response msg

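
The cookie exchange above can be modelled without any networking: the server hands out an identifier once and later uses it as a key into its own stored info. The sketch below is a toy model under stated assumptions; the class `CookieServer`, the per-cookie visit counter standing in for the "cookie-specific action", and the use of plain dicts for header lines are all hypothetical.

```python
import itertools


class CookieServer:
    """Toy server that keeps per-client state keyed by a cookie value."""

    def __init__(self, first_id: int = 1678453):
        self._ids = itertools.count(first_id)
        self._stored = {}                       # cookie value -> server-stored info

    def handle(self, request_headers: dict) -> dict:
        """Return the extra header lines to add to the usual http response."""
        cookie = request_headers.get("Cookie")
        if cookie in self._stored:
            # Cookie-specific action: here, just count the visit.
            self._stored[cookie]["visits"] += 1
            return {}
        # No (known) cookie presented: issue a new one.
        new_cookie = str(next(self._ids))
        self._stored[new_cookie] = {"visits": 1}
        return {"Set-cookie": new_cookie}

    def visits(self, cookie: str) -> int:
        return self._stored[cookie]["visits"]


server = CookieServer()
first = server.handle({})                           # usual http request msg
cookie = first["Set-cookie"]                        # client stores the cookie
second = server.handle({"Cookie": cookie})          # later request presents it
print(cookie, second, server.visits(cookie))
```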
User-Server Interaction: Authentication

Authentication goal: control access to server documents
• stateless: client must present authorization in each request
• authorization: typically name, password
    - Authorization: header line in request
    - if no authorization presented, server refuses access, sends WWW-authenticate: header line in response

Message exchange (in time order):
    client -> server: usual http request msg
    server -> client: 401: authorization req. + WWW-authenticate:
    client -> server: usual http request msg + Authorization: line
    server -> client: usual http response msg
    client -> server: usual http request msg + Authorization: line
    server -> client: usual http response msg

Browser caches name & password so that the user does not have to enter them repeatedly.

Summary: HTTP

• HTTP message format
    - ASCII (human-readable format): request and response lines, header lines, entity body
• HTTP message flow
    - stateless server
        • each request is self-contained; thus cookie and authentication are needed in each message
    - reducing latency
        • persistent HTTP
            – the problem is introduced by layering!
        • conditional GET reduces server/network workload and latency
        • cache and proxy reduce traffic and latency
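
The authentication exchange described above (a 401 with WWW-authenticate:, then requests that carry Authorization:) can be sketched as follows. The slides only say the authorization is "typically name, password"; this example assumes the HTTP Basic scheme, in which the client sends the base64 encoding of `name:password`. The user table `USERS`, the realm string and both function names are hypothetical.

```python
import base64

USERS = {"alice": "secret"}   # hypothetical name -> password table


def authorization_header(name: str, password: str) -> str:
    """Value of the Authorization: header line under the Basic scheme."""
    token = base64.b64encode(f"{name}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def handle(request_headers: dict):
    """Return (status code, extra response header lines)."""
    scheme, _space, token = request_headers.get("Authorization", "").partition(" ")
    if scheme == "Basic":
        try:
            decoded = base64.b64decode(token).decode("utf-8")
        except ValueError:        # covers bad base64 and bad UTF-8
            decoded = ""
        name, _colon, password = decoded.partition(":")
        if name in USERS and USERS[name] == password:
            return 200, {}
    # No or wrong authorization: refuse access and ask for credentials.
    return 401, {"WWW-authenticate": 'Basic realm="documents"'}


print(handle({}))
print(handle({"Authorization": authorization_header("alice", "secret")}))
```

Because the server is stateless, the client (in practice the browser's cached name and password) has to attach the same Authorization: line to every later request. Note that base64 is an encoding, not encryption.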