CSE 291

System Services for the World Wide Web

Winter 2000

Geoffrey M. Voelker
Class Goal
     Provide background for doing experimental systems
      research in wide-area systems such as the Web
           Architectures for wide-area distributed systems
               » What are good models for structuring systems to support apps?
           Understanding system behavior
               » What are the workloads?
           Enhancing performance
               » Caching, prefetching
           New opportunities
               » Multimedia, XML
     Read, evaluate, present a variety of research papers
     Do projects on interesting problems in area

 January 11, 2000                       CSE 291 Intro                            2
     Course structure
           Presentations
           Evaluations
           Projects
     Course contents
           Web intro
           Topics and papers

Presentations
     We will present papers in class for discussion
           Roughly 2 papers per 1.5 hour class (sometimes 3)

     Everyone presents (including me)
           How many depends on # students registered
           I will present Thursday’s papers and do my share during the
            rest of the quarter

     Look over schedule between now and Thursday
     We’ll allocate papers on Thursday

Evaluations
     You must submit evaluations of papers
           Email them to me by noon of day of class
           No evals if you have to present
     Brief (½ page)
           Summary of paper (research problem, conclusions)
           What you learned
           Any ideas that occurred to you
           Your frank opinion of topic and/or work
     If this gets to be too burdensome, might notch it down
      to one evaluation per day

Class Participation
 The presentations are for fostering discussion
 …so I expect you to participate in discussions

     Presenters
           Come prepared with discussion questions
     Rest of us
           Use your evaluations as a basis for discussion

Class Project
     For those signed up for four units
     Work in pairs
     Schedule in handout, on the Web
           Roughly 5 weeks setup, 5 weeks working
     Start thinking about what you might want to work on,
      who you might want to work with
     I’ll have a list of topics, but I also encourage you to use
      your own
     Your final will be a class presentation on your project

Course Contents
     Topics
           Wide-area system architectures
           Naming
           Scalable servers
           Workload characterizations
           Caching, prefetching
           Protocols
           Security
           Emerging applications

     Overview of Web to help put things into context

How does the Web work?
     The canonical example in your Web browser

        Click here

     “here” is a Uniform Resource Locator (URL)

     It names the location of an object on a server

In Action…


          [Diagram: Client and Server]

          Client resolves the name of the server
          Establishes a connection with the server
          Sends the server the name of the object
          Server returns the object
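
The four steps above can be sketched in Python. This is only an illustration under stated assumptions: it uses a bare HTTP/1.0 exchange over a TCP socket, the helper names (split_url, build_request, fetch) are hypothetical, and error handling, redirects, and response parsing are left out.

```python
import socket
from urllib.parse import urlsplit


def split_url(url):
    """Split a URL into (server name, port, object name)."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return parts.hostname, parts.port or 80, path


def build_request(host, path):
    """Build a minimal HTTP/1.0 GET request that names the object."""
    return f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n".encode("ascii")


def fetch(url):
    """Fetch one object; returns the raw response (headers and body)."""
    host, port, path = split_url(url)
    # Step 1: resolve the name of the server (a DNS lookup).
    address = socket.gethostbyname(host)
    # Step 2: establish a connection with the server.
    with socket.create_connection((address, port), timeout=10) as conn:
        # Step 3: send the server the name of the object.
        conn.sendall(build_request(host, path))
        # Step 4: the server returns the object, then closes the connection.
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)
```

Calling fetch on a plain http:// URL returns the raw bytes the server sent back. Only the two helper functions can be exercised without network access.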

Naming
     How should objects be named?
           URLs name locations…if an object moves, the URL breaks
     Location-independent names seem like the obvious
      way to go
           Why don’t we use them (e.g., URIs)?
           How do we make them work, esp. in the face of mobility?
     How it works now, how it might work in the future
           DNS [Mockapetris88]
           DNS for URIs [Daniel96]
           Names as programs [Vahdat99]
           Finding replicas [Guyton95], [vanSteen98]
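
To make the contrast between location names and location-independent names concrete, here is a minimal sketch of one level of indirection. It is not the mechanism of any of the cited papers: the NameResolver class, its methods, and the example names are all hypothetical.

```python
class NameResolver:
    """Maps a location-independent name to the URLs that currently hold it."""

    def __init__(self):
        self._locations = {}

    def register(self, name, url):
        # Record one more location (replica) for this name.
        self._locations.setdefault(name, []).append(url)

    def move(self, name, old_url, new_url):
        # The object moved: update the mapping, the name itself is unchanged.
        urls = self._locations[name]
        urls[urls.index(old_url)] = new_url

    def resolve(self, name):
        # Return a copy so callers cannot modify the resolver's state.
        return list(self._locations.get(name, []))


resolver = NameResolver()
resolver.register("urn:example:syllabus", "http://host-a.example/syllabus.html")
resolver.move("urn:example:syllabus",
              "http://host-a.example/syllabus.html",
              "http://host-b.example/syllabus.html")
```

A client that holds only the name still finds the object after the move, whereas a client that saved the old URL would not. The cost is an extra resolution step and a resolver that has to be kept up to date.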



          [Diagram: Client and Server]

          Communication between the client and server is done via
           HTTP over TCP/IP

Protocols
     What kind of transport protocol should the Web use?
     HTTP 1.0
           One TCP connection/object
           Complaints: inefficient, slow, burdensome…
     HTTP 1.1
           One TCP connection/many objects (persistent connections)
           Solves all problems, right? It also adds a huge amount of complexity
               » Clients, proxies, servers
     How do they compare?
           Protocol differences [Krishnamurthy99], performance
            comparison [Nielsen97], effects on servers [Manley97],
            overhead of TCP connections [Caceres98]
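
A back-of-the-envelope model shows why one connection per object draws complaints. The sketch below is a deliberately crude assumption, not a result from the cited papers: it charges one round trip per TCP connection setup and one per request, and ignores slow start, parallel connections, and pipelining. The function name round_trips is hypothetical.

```python
def round_trips(num_objects, persistent):
    """Estimate round trips needed to fetch num_objects from one server.

    persistent=False models HTTP 1.0 (one TCP connection per object);
    persistent=True models HTTP 1.1 (one connection reused for all objects).
    """
    if num_objects <= 0:
        return 0
    connections = 1 if persistent else num_objects
    # One round trip per connection setup plus one per request/response.
    return connections + num_objects


http10 = round_trips(10, persistent=False)  # 10 setups + 10 requests
http11 = round_trips(10, persistent=True)   # 1 setup + 10 requests
```

Under these assumptions a page with 10 objects costs 20 round trips with HTTP 1.0 and 11 with HTTP 1.1. The papers listed above measure what actually happens once real TCP behavior and server load are included.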

Scalable Servers


          Of course, you are not the only person accessing the server…

Scalable Servers
     How do you build servers to handle millions of hits
      a day?
           Web servers: Flash [Pai99], scheduling [Crovella99]
           Mail servers: EarthLink [Christenson97, Saito99]
           Principles: Transcend, HotBot [Fox97]
           Techniques: Load balancing [Pai98]

Web Caching

           [Diagram: Clients, Proxy Cache, Servers]

          Gee, is there some way to offload those busy servers?
          Use caches to exploit reference locality among clients
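
As a rough illustration of a proxy cache exploiting reference locality, the sketch below keeps recently requested objects and only contacts the server on a miss. The LRU replacement policy is an assumption made for the example (the slides do not name one), the ProxyCache class and fetch_from_server callback are hypothetical, and consistency and expiration are ignored.

```python
from collections import OrderedDict


class ProxyCache:
    """A tiny shared cache with least-recently-used (LRU) replacement."""

    def __init__(self, capacity, fetch_from_server):
        self.capacity = capacity
        self._fetch = fetch_from_server  # called only on a miss
        self._objects = OrderedDict()    # url -> object, oldest first
        self.hits = 0
        self.misses = 0

    def get(self, url):
        if url in self._objects:
            # Hit: serve locally and mark the object as recently used.
            self._objects.move_to_end(url)
            self.hits += 1
            return self._objects[url]
        # Miss: go to the origin server, then remember the object.
        self.misses += 1
        obj = self._fetch(url)
        self._objects[url] = obj
        if len(self._objects) > self.capacity:
            # Evict the least recently used object.
            self._objects.popitem(last=False)
        return obj
```

Every hit is a request the busy server never sees. The hit rate depends on how much the clients behind the proxy overlap in what they request.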

Web Caching
     How should we build caching systems for the Web?
           Seminal paper [Chankhunthod96]
           Proxy caches [Duska97]
           Akamai hack [Karger99]
           Cooperative caching [Tewari99, Fan98, Wolman99]
           Popularity distributions [Breslau99]

Prefetching
     The fastest way to download a page is to fetch it
      before it is accessed
     How do you know what will be accessed?
     How much bandwidth can you afford for mistakes?
           Performance bounds [Kroeger97]
           Survey paper w/ practical approach [Duchamp99]
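
The two questions above (what will be accessed, and what do mistakes cost) can be illustrated with a toy predictor. This is an assumed first-order scheme chosen for brevity, not the method of either cited paper: it guesses that the next page is whichever page most often followed the current one in past history. The function names train, predict, and evaluate are hypothetical.

```python
from collections import Counter, defaultdict


def train(history):
    """Count, for each page, which pages followed it in the history."""
    successors = defaultdict(Counter)
    for current, following in zip(history, history[1:]):
        successors[current][following] += 1
    return successors


def predict(successors, current):
    """Return the most frequent successor of current, or None if unseen."""
    counts = successors.get(current)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


def evaluate(successors, trace):
    """Return (useful, wasted) prefetch counts over a trace of accesses."""
    useful = wasted = 0
    for current, following in zip(trace, trace[1:]):
        guess = predict(successors, current)
        if guess is None:
            continue  # nothing prefetched, so nothing gained or lost
        if guess == following:
            useful += 1
        else:
            wasted += 1  # bandwidth spent on an object nobody asked for
    return useful, wasted


model = train(["a", "b", "a", "b", "a", "c"])
useful, wasted = evaluate(model, ["a", "b", "a", "c"])
```

On this small trace two prefetches are useful and one is wasted. The wasted count is the bandwidth paid for mistakes, which is the quantity a prefetching policy has to budget for.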

Security
     We can’t just assume we’re in a back room anymore
     How do we secure access to resources?
           Infrastructure: SDSI [Rivest96]
           Wide-area service: CRISIS [Belani98]
           Computational grids: Globus [Foster98]
           E-commerce: SSL [Wagner96]
           Downloaded code: Java [Wallach97]

Workload Characterizations
     How can you fix it if you don’t look inside?
           Is the Web slow because of the network, the server, or CPU-hogging browsers?
     What is the behavior of clients, proxies, and servers?
           Golden fleece: Invariants across populations and time
           E.g., Zipf-like distribution of object popularities
     How can we use workloads to shape the systems we build?
           Characterization survey [Pitkow98]
           Rate of change [Douglis97]
               » Key to data dissemination (caching, prefetching, etc.)
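
The Zipf-like popularity invariant mentioned above can be made concrete with a small calculation. The sketch assumes the usual form in which the object of rank i is requested with probability proportional to 1/i^alpha. The parameter alpha and the function names are assumptions for illustration, not values taken from the cited papers.

```python
def zipf_popularities(num_objects, alpha=1.0):
    """Request probability of each object, most popular first."""
    weights = [1.0 / rank ** alpha for rank in range(1, num_objects + 1)]
    total = sum(weights)
    return [weight / total for weight in weights]


def top_share(popularities, k):
    """Fraction of all requests that go to the k most popular objects."""
    return sum(popularities[:k])


popularities = zipf_popularities(1000, alpha=1.0)
share_of_top_10_percent = top_share(popularities, 100)
```

With a skew like this, a small set of popular objects accounts for a disproportionate share of requests, which is what makes caching them worthwhile. How large that share is depends on alpha and on the number of distinct objects.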

Emerging Applications
     HTML, gif, jpeg, etc. are all old news
     What’s the new, cool stuff?
     Multimedia
           Streaming multimedia is the next big thing (20% of bandwidth in UW traces)
           Workloads [Mena00], tools [Caceres99], delivery [Eager00]
     XML
           I don’t know what it is, so I’d like to learn
           Papers TBD

Wide-Area System Architectures
     The Web is basically a simple read-only data access
           Click, fetch, click, fetch, click, fetch…
     Why not fully generalize it into a universal wide-area
      distributed system?
           The Web as an operating system: WebOS [Vahdat98]
           Wide-area computational grids: Legion [Lewis96,
            Grimshaw98], Globe [vanSteen97], Globus [Foster97,



        [Diagram: Clients, Proxy Cache, Servers]

