webcaching_CDN by wuxiangyu


The Web: some jargon

   Web page:
      consists of “objects”
      addressed by a URL
   Most Web pages consist of:
      base HTML page, and
      several referenced objects
   URL has two components: host name and path name

   User agent for Web is called a browser:
      MS Internet Explorer
      Netscape Communicator
   Server for Web is called Web server:
      Apache (public domain)
      MS Internet Information Server
The Web: the http protocol

   http: hypertext transfer protocol
      Web’s application layer protocol
   client/server model
      client: browser that requests, receives, “displays” Web objects
      server: Web server sends objects in response to requests
   http/1.0: RFC 1945
   http/1.1: RFC 2068

   (figure: PC running Explorer and Mac running Navigator as clients,
    server running NCSA Web server)
The http protocol: more

   http: TCP transport service:
      client initiates TCP connection (creates socket) to server, port 80
      server accepts TCP connection from client
      http messages (application-layer protocol messages) exchanged
       between browser (http client) and Web server (http server)
      TCP connection closed

   http is “stateless”
      server maintains no information about past client requests

   Aside: protocols that maintain “state” are complex!
      past history (state) must be maintained
      if server/client crashes, their views of “state” may be
       inconsistent and must be reconciled

http example

Suppose user enters URL www.someSchool.edu/someDepartment/home.index
(page contains text and references to 10 jpeg images)

   1a. http client initiates TCP connection to http server (process)
       at www.someSchool.edu. Port 80 is default for http server.
   1b. http server at host www.someSchool.edu, waiting for TCP
       connection at port 80, “accepts” connection, notifying client.
   2.  http client sends http request message (containing URL) into
       TCP connection socket.
   3.  http server receives request message, forms response message
       containing requested object, sends message into socket.
   4.  http server closes TCP connection.
   5.  http client receives response message containing html file,
       displays html. Parsing html file, it finds the 10 referenced
       jpeg objects.
   6.  Steps 1-5 repeated for each of the 10 jpeg objects.

Non-persistent and persistent connections

Non-persistent
   HTTP/1.0
   server parses request, responds, and closes TCP connection
   2 RTTs to fetch each object
   each object transfer suffers from slow start
   But most 1.0 browsers use parallel TCP connections.

Persistent
   default for HTTP/1.1
   on same TCP connection: server parses request, responds, parses
    new request, ...
   client sends requests for all referenced objects as soon as it
    receives base HTML
   fewer RTTs and less slow start
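The persistent case can be sketched in code: several request/response exchanges share one TCP connection. The sketch below runs against a tiny local stand-in server (an illustrative addition, not part of the slides), which keeps the connection open by speaking HTTP/1.1 and announcing Content-Length.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"   # persistent connections by default
    def do_GET(self):
        body = f"object {self.path}".encode()
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))  # lets client find message end
        self.end_headers()
        self.wfile.write(body)
    def log_message(self, *args):   # silence per-request logging
        pass

server = ThreadingHTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

def read_response(f):
    """Read one response: status line, headers, then Content-Length body bytes."""
    f.readline()                    # status line, e.g. b"HTTP/1.1 200 OK\r\n"
    length = 0
    while True:
        line = f.readline()
        if line in (b"\r\n", b"\n", b""):
            break                   # blank line ends the headers
        if line.lower().startswith(b"content-length:"):
            length = int(line.split(b":")[1])
    return f.read(length)

# One TCP connection, three GETs on it: base page, then referenced objects.
sock = socket.create_connection(("127.0.0.1", port))
f = sock.makefile("rb")
bodies = []
for path in ["/base.html", "/img1.jpg", "/img2.jpg"]:
    sock.sendall(f"GET {path} HTTP/1.1\r\nHost: example\r\n\r\n".encode())
    bodies.append(read_response(f))
sock.close()
server.shutdown()
print(bodies)
```

A non-persistent (HTTP/1.0) client would pay the connection-setup RTT and TCP slow start once per object; here they are paid once for all three.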

http message format: request

   two types of http messages: request, response
   http request message:
      ASCII (human-readable format)

   request line (GET, POST, HEAD commands), then header lines, then an
   extra carriage return + line feed indicating end of message:

      GET /somedir/page.html HTTP/1.0
      User-agent: Mozilla/4.0
      Accept: text/html, image/gif, image/jpeg
      Accept-language: fr
      (extra carriage return, line feed)
http request message: general format

http message format: response

   status line (status code, status phrase), then header lines, then
   data (e.g., requested html file):

      HTTP/1.0 200 OK
      Date: Thu, 06 Aug 1998 12:00:15 GMT
      Server: Apache/1.3.0 (Unix)
      Last-Modified: Mon, 22 Jun 1998 …...
      Content-Length: 6821
      Content-Type: text/html

      data data data data data ...

http response status codes
In first line in server->client response message.
A few sample codes:
200 OK
      request succeeded, requested object later in this message
301 Moved Permanently
      requested object moved, new location specified later in
       this message (Location:)
400 Bad Request
      request message not understood by server
404 Not Found
      requested document not found on this server
505 HTTP Version Not Supported
Trying out http (client side) for yourself

1. Telnet to your favorite Web server:

      telnet www.eurecom.fr 80

   Opens TCP connection to port 80 (default http server port) at
   www.eurecom.fr. Anything typed in is sent to port 80 at
   www.eurecom.fr.

2. Type in a GET http request:

      GET /~ross/index.html HTTP/1.0

   By typing this in (hit carriage return twice), you send this
   minimal (but complete) GET request to the http server.

3. Look at response message sent by http server!
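The same exercise can be scripted with a raw TCP socket. To keep the sketch self-contained (no dependence on www.eurecom.fr being reachable), it talks to a minimal local stand-in server; point `create_connection` at a real host and port 80 to reproduce the telnet session.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):   # minimal local stand-in for a Web server
    def do_GET(self):
        body = b"<html>hello</html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
    def log_message(self, *args):
        pass

server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

# The "telnet" part: open a TCP connection and type a minimal GET
# request, ended by the extra carriage return + line feed.
sock = socket.create_connection(("127.0.0.1", port))
sock.sendall(b"GET /index.html HTTP/1.0\r\n\r\n")
reply = b""
while True:                       # HTTP/1.0 server closes when done
    chunk = sock.recv(4096)
    if not chunk:
        break
    reply += chunk
sock.close()
server.shutdown()
print(reply.split(b"\r\n", 1)[0])   # status line: HTTP/1.0 200 OK
```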

User-server interaction: authentication

   Authentication goal: control access to server documents
      stateless: client must present authorization in each request
      authorization: typically name, password
         authorization: header line in request
         if no authorization presented, server refuses access, sends
          WWW authenticate: header line in response

   Message exchange (client <-> server):
      usual http request msg
      <- 401: authorization req., WWW authenticate:
      usual http request msg + Authorization: line
      <- usual http response msg
      usual http request msg + Authorization: line
      <- usual http response msg

   Browser caches name & password so that user does not have to
   repeatedly enter it.
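The Authorization: line for the common name/password scheme (HTTP Basic authentication) is just the base64 encoding of name:password; a minimal sketch (user and password values are made up):

```python
import base64

def basic_auth_header(user, password):
    # After a 401 response with "WWW-Authenticate: Basic", the client
    # resends the request carrying this header; since http is stateless,
    # it must attach the header to every subsequent request too.
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return f"Authorization: Basic {token}"

print(basic_auth_header("alice", "secret"))
# Authorization: Basic YWxpY2U6c2VjcmV0
```

Note that base64 is an encoding, not encryption: the password is exposed to anyone who can observe the request.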
User-server interaction: cookies

   server sends “cookie” to client in response msg
      Set-cookie: 1678453
   client presents cookie in later requests
      cookie: 1678453
   server matches presented cookie with server-stored info
      authentication
      remembering user preferences, previous choices

   Message exchange (client <-> server):
      usual http request msg
      <- usual http response msg + Set-cookie: #
      usual http request msg + cookie: #
      <- usual http response msg   (cookie-specific action)
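The exchange can be sketched as follows (class and method names are illustrative; the id value is the one from the slide): the server hands out an opaque id in Set-cookie, keeps state under it, and recognizes the id when the client echoes it back.

```python
import itertools

class Server:
    def __init__(self):
        self._next_id = itertools.count(1678453)   # id value from the slide
        self._state = {}                            # server-stored info, keyed by cookie

    def request_without_cookie(self):
        cookie = str(next(self._next_id))
        self._state[cookie] = {"visits": 1}
        return {"Set-cookie": cookie}               # usual http response + Set-cookie

    def request_with_cookie(self, headers):
        cookie = headers["cookie"]                  # client presents cookie
        self._state[cookie]["visits"] += 1          # cookie-specific action
        return self._state[cookie]

server = Server()
cookie = server.request_without_cookie()["Set-cookie"]
state = server.request_with_cookie({"cookie": cookie})
print(cookie, state)   # 1678453 {'visits': 2}
```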

User-server interaction: conditional GET

   Goal: don’t send object if client has up-to-date stored (cached)
    version
   client: specify date of cached copy in http request
      If-modified-since: <date>
   server: response contains no object if cached copy is up-to-date:
      HTTP/1.0 304 Not Modified

   Message exchange (client <-> server):
      object not modified:
         http request msg + If-modified-since: <date>
         <- http response: HTTP/1.0 304 Not Modified
      object modified:
         http request msg + If-modified-since: <date>
         <- http response: HTTP/1.1 200 OK (with object)
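The server-side decision is a single comparison; here is a toy version with dates simplified to integers (object name and date values are made up):

```python
# Toy origin server state: current object body and last-modified "date".
OBJECT = {"body": b"home.html v1", "last_modified": 100}

def serve(if_modified_since=None):
    # Cached copy still current -> 304, response carries no object.
    if if_modified_since is not None and if_modified_since >= OBJECT["last_modified"]:
        return "HTTP/1.0 304 Not Modified", None
    # Otherwise send the (possibly updated) object.
    return "HTTP/1.1 200 OK", OBJECT["body"]

print(serve(if_modified_since=100))   # ('HTTP/1.0 304 Not Modified', None)
OBJECT["body"], OBJECT["last_modified"] = b"home.html v2", 200   # object changes
print(serve(if_modified_since=100))   # ('HTTP/1.1 200 OK', b'home.html v2')
```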
Web Caches (proxy server)

Goal: satisfy client request without involving origin server

   user sets browser: Web accesses via web cache
   client sends all http requests to web cache
      if object at web cache, web cache immediately returns object in
       http response
      else requests object from origin server, then returns http
       response to client
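The cache's per-request decision above can be sketched as a few lines (the origin fetch is stubbed with a lambda; names are illustrative):

```python
class WebCache:
    def __init__(self, origin_fetch):
        self.store = {}                    # cached objects, keyed by URL
        self.origin_fetch = origin_fetch   # stand-in for contacting the origin server
        self.hits = self.misses = 0

    def get(self, url):
        if url in self.store:              # object at web cache:
            self.hits += 1
            return self.store[url]         # immediately return it
        self.misses += 1                   # else: fetch from origin,
        self.store[url] = self.origin_fetch(url)
        return self.store[url]             # store, and return to client

cache = WebCache(origin_fetch=lambda url: f"<body of {url}>")
for url in ["http://a/x", "http://a/x", "http://a/y"]:
    cache.get(url)
print(cache.hits, cache.misses)   # 1 2
```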

Why Web Caching?

Assume: cache is “close” to client (e.g., in same network)

   smaller response time: cache “closer” to client
   decrease traffic to distant servers
      link out of institutional/local ISP network often bottleneck

   (figure: institutional network with a 10 Mbps LAN and an
    institutional cache, reaching the public Internet over a 1.5 Mbps
    access link)
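A back-of-envelope calculation for the slide's topology makes the bottleneck concrete. The link speeds are from the slide; the request rate, object size, and hit rate below are assumed purely for illustration.

```python
access_link_bps = 1.5e6   # 1.5 Mbps access link (from the slide)
lan_bps = 10e6            # 10 Mbps LAN (from the slide)
req_per_sec = 15          # assumed average request rate
object_bits = 100e3       # assumed average object size: 100 kbit

demand_bps = req_per_sec * object_bits
print(demand_bps / access_link_bps)   # 1.0  -> access link saturated
print(demand_bps / lan_bps)           # 0.15 -> LAN utilization is fine

# With a cache hit rate of 0.4, only misses cross the access link:
hit_rate = 0.4
print((1 - hit_rate) * demand_bps / access_link_bps)   # 0.6
```

The access link, not the LAN, is the constraint; a cache cuts its utilization without any link upgrade.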

Reference 1: web caching

Large-Scale Web Caching and Content Delivery
           Jeff Chase
Caching for a Better Web

   Performance is a major concern in the Web
   Proxy caching is the most widely used method to improve Web
    performance
      Duplicate requests to the same document served from cache
      Hits reduce latency, network utilization, server load
      Misses increase latency (extra hops)

   (figure: Clients -> Proxy Cache -> Servers; hits are answered at
    the proxy, misses are forwarded to the servers)

                                         [Source: Geoff Voelker]
Cache Effectiveness

   Previous work has shown that hit rate increases with population
    size [Duska et al. 97, Breslau et al. 98]
   However, single proxy caches have practical limits
      Load, network topology, organizational
   One technique to scale the client population is to have proxy
    caches cooperate

                                   [Source: Geoff Voelker]
Cooperative Web Proxy Caching

   Sharing and/or coordination of cache state among multiple Web
    proxy cache nodes
   Effectiveness of proxy cooperation depends on:
      Inter-proxy communication distance
      Proxy utilization and load balance
      Size of client population served

   (figure: clients attached to cooperating proxies)

                                   [Source: Geoff Voelker]
Hierarchical Caches

Idea: place caches at exchange or switching points in the network,
and cache at each level of the hierarchy. Resolve misses through the
parent.

   (figure: clients at the leaves, caches at each upstream level,
    origin Web site (e.g., U.S. Congress) at the root)
Content-Sharing Among Peers

Idea: Since siblings are “close” in the network, allow them to share
their cache contents directly.

   (figure: clients attached to sibling caches that exchange content)
Harvest-Style ICP Hierarchies

Idea: multicast probes within each “family”: pick first hit response
or wait for all miss responses.

Examples:
   Harvest [Schwartz96]
   Squid (NLANR)
   NetApp NetCache

   (figure: client sends object request; the cache multicasts a query
    (probe) to its siblings, collects query responses, and returns
    the object response)
Issues for Cache Hierarchies

   With ICP: query traffic within “families” (size n)
     • Inter-sibling ICP traffic (and aggregate overhead) is
       quadratic with n.
     • Query-handling overhead grows linearly with n.
   miss latency
     • Object passes through every cache from origin to client:
       deeper hierarchies scale better, but impose higher miss
       latency.
   storage
     • A recently-fetched object is replicated at every level of the
       tree.
   effectiveness
     • Interior cache benefits are limited by capacity if objects are
       not likely to live there long (e.g., LRU).

Hashing: Cache Array Routing Protocol (CARP)

Example: Microsoft Proxy Server

A hash function partitions the URL space across the cache array
(e.g., a-f, g-p, q-u, v-z), so “GET www.hotsite.com” resolves to
exactly one array member.

   1. single-hop request resolution
   2. no redundant caching of objects
   3. allows client-side implementation
   4. no new cache-cache protocols
   5. reconfigurable
Issues for CARP
 no way to exploit network locality at each level
        • e.g., relies on local browser caches to absorb repeats
 load balancing
      hash can be balanced and/or weighted with a load factor
       reflecting the capacity/power of each server
      must rebalance on server failures
        • Reassigns (1/n)th of cached URLs for array size n.
        • URLs from failed server are evenly distributed among the
          remaining n-1 servers.
 miss penalty and cost to compute the hash
        • In CARP, hash cost is linear in n: hash with each node and
          pick the “winner”.
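The "hash with each node and pick the winner" scheme can be sketched with rendezvous (highest-random-weight) hashing, one common way to realize CARP-style routing (cache names and URLs below are made up). The sketch shows both the linear hash cost and the property that a failure reassigns only the failed server's (1/n)th of the URLs.

```python
import hashlib

def score(url, cache):
    # Deterministic pseudo-random weight for this (cache, URL) pair.
    digest = hashlib.md5(f"{cache}|{url}".encode()).digest()
    return int.from_bytes(digest[:8], "big")

def route(url, caches):
    # Hash with each node and pick the "winner": cost is linear in
    # len(caches), and every client resolves a URL to the same member.
    return max(caches, key=lambda c: score(url, c))

caches = ["cache-a", "cache-b", "cache-c", "cache-d"]
print(route("http://www.hotsite.com/", caches))

# If cache-b fails, only URLs whose winner was cache-b move; every
# other URL keeps its owner, so ~1/n of cached URLs are reassigned,
# spread evenly over the n-1 survivors.
survivors = [c for c in caches if c != "cache-b"]
stable = all(route(f"url-{i}", survivors) == route(f"url-{i}", caches)
             for i in range(1000) if route(f"url-{i}", caches) != "cache-b")
print(stable)   # True
```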

Directory-based: Summary Cache for ICP

   Idea: each caching server replicates the cache directory
    (“summary”) of each of its peers
        • [Cao et. al. Sigcomm98]
      Query a peer only if its local summary indicates a hit.
      To reduce storage overhead for summaries, implement the
       summaries compactly using Bloom Filters.
           – May yield false hits (e.g., 1%), but not false misses.
           – Each summary is three orders of magnitude smaller than
             the cache itself, and can be updated by multicasting
             just the flipped bits.
A Summary-ICP Hierarchy

e.g., Squid configured to use cache digests

Summary caches at each level of the hierarchy reduce inter-sibling
miss queries by 95+%.

   (figure: on a miss, the client’s cache checks peer summaries
    before sending any query; object request/response and query
    response flows are as in ICP)

Issues for Directory-Based Caches

   Servers update their summaries lazily.
     • Update when “new” entries exceed some threshold
     • Update delays may yield false hits and/or false misses
   Other ways to reduce directory size?
     • Vicinity cache [Gadde/Chase/Rabinovich98]
     • Subsetting by popularity [Gadde/Chase/Rabinovich97]
   What are the limits to scalability?
     • If we grow the number of peers?
     • If we grow the cache sizes?

On the Scale and Performance of Cooperative Web Proxy Caching

   [Wolman/Voelker/.../Levy99] is a key paper in this area over the
    last few years.
      first negative result in SOSP (?)
      illustrates tools for evaluating wide-area systems
        • simulation and analytical modeling
      illustrates fundamental limits of caching
        • benefits dictated by reference patterns and object rate of
          change
        • forget about capacity, and assume ideal cooperation
      ties together previous work in the field
        • wide-area cooperative caching strategies
        • analytical models for Web workloads
      best traces
UW Trace Characteristics

      Trace               UW
      Duration            7 days
      HTTP objects        18.4 million
      HTTP requests       82.8 million
      Avg. requests/sec   137
      Total Bytes         677 GB
      Servers             244,211
      Clients             22,984

                            [Source: Geoff Voelker]
A Multi-Organization Trace

   University of Washington (UW) is a large and diverse client
    population
        • Approximately 50K people
   UW client population contains 200 independent campus organizations
        • Museums of Art and Natural History
        • Schools of Medicine, Dentistry, Nursing
        • Departments of Computer Science, History, and Music
   A trace of UW is effectively a simultaneous trace of 200 diverse
    client organizations
      Key: Tagged clients according to their organization in trace

                                   [Source: Geoff Voelker]
Cooperation Across Organizations

   Treat each UW organization as an independent “company”
   Evaluate cooperative caching among these “companies”

   How much Web document reuse is there among these organizations?
      Place a proxy cache in front of each organization
      What is the benefit of cooperative caching among these 200
       proxies?

                                   [Source: Geoff Voelker]
Ideal Hit Rates for UW proxies

   Ideal hit rate - infinite storage, ignore cacheability, expirations

   Average ideal local hit rate: 43%

   Explore benefits of perfect cooperation rather than a particular
    algorithm

   Average ideal hit rate increases from 43% to … with (perfect)
    cooperative caching

                                   [Source: Geoff Voelker]
Sharing Due to Affiliation

   UW organizational sharing vs. random organizations
   Difference in weighted averages across all orgs is ~5%

                                   [Source: Geoff Voelker]
Cacheable Hit Rates for UW proxies

   Cacheable hit rate - same as ideal, but doesn’t ignore
    cacheability

   Cacheable hit rates are much lower than ideal (average is 20%)

   Average cacheable hit rate increases from 20% to 41% with
    (perfect) cooperative caching

                                   [Source: Geoff Voelker]
Scaling Cooperative Caching
 Organizations of this size can benefit significantly
  from cooperative caching
 But…we don’t need cooperative caching to handle
  the entire UW population size
      A single proxy (or small cluster) can handle this
       entire population!
      No technical reason to use cooperative caching for
       this environment
      In the real world, decisions of proxy placement are
       often political or geographical

 How effective is cooperative caching at scales
  where a single cache cannot be used?
                                 [Source: Geoff Voelker]
Hit Rate vs. Client Population

   Curves similar to other studies [e.g., Duska97, Breslau98]
   Small organizations
      Significant increase in hit rate as client population increases
      The reason why cooperative caching is effective for UW
   Large organizations
      Marginal increase in hit rate as client population increases

                                   [Source: Geoff Voelker]
In the Paper...

   1. Do we believe this? What are some possible sources of error in
      this tracing/simulation study?
        • What impact might they have?
   2. Why are “ideal” hit rates so much higher for the MS trace, but
      the cacheable hit rates are the same?
        • What is the correlation between sharing and cacheability?
   3. Why report byte hit rates as well as object hit rates?
        • Is the difference significant? What does this tell us about
          reference patterns?
   4. How can it be that byte hit rate increases with population,
      while bandwidth consumed is linear?

Trace-Driven Simulation: Sources of Error

   1. End effects: is the trace interval long enough?
        • Need adequate time for steady-state behavior to emerge.
   2. Sample size: is the population large enough?
        • Is it representative?
   3. Completeness: does the sample accurately capture the client
      reference streams?
        • What about browser caches and lower-level proxies? How
          would they affect the results?
   4. Client subsets: how to select clients to represent a
      subpopulation?
   5. Is the simulation accurate/realistic?
        • cacheability, capacity/replacement, expiration, latency

What about Latency?

   From the client’s perspective, latency matters far more than hit
    rate
   How does latency change with population?

   Median latencies improve only a few 100 ms with ideal caching
    compared to no caching.

                                   [Source: Geoff Voelker]
 1. How did they obtain these reported latencies?
 2. Why report median latency instead of mean?
      • Is the difference significant? What does this tell us? Is it
        consistent with the reported byte hit ratios?
 3. Why does the magnitude of the possible error
  decrease with population?
 4. What about the future?
      • What changes in Web behavior might lead to different
        conclusions in the future?
      • Will latency be as important? Bandwidth?

Large Organization Cooperation
 What is the benefit of cooperative caching
  among large organizations?

 Explore three ways
    Linear extrapolation of UW trace
    Simultaneous trace of two large organizations
     (UW and MS)
    Analytic model for populations beyond trace

                                 [Source: Geoff Voelker]
Extrapolation to Larger Client Populations

   Use least squares fit to create a linear extrapolation of hit rate
   Hit rate increases logarithmically with client population, e.g.,
    to increase hit rate by 10%:
      Need 8 UWs (ideal)
      Need 11 UWs (cacheable)

   “Low ceiling”, though:
      61% at 2.1M clients (UW cacheable)

   A city-wide cooperative cache would get all the benefit

                                   [Source: Geoff Voelker]
UW & Microsoft Cooperation
 Use traces of two large organizations to
  evaluate caching systems at medium-scale
  client populations
 We collected a Microsoft proxy trace
  during same time period as the UW trace
    Combined population is ~80K clients
    Increases the UW population by a factor of 3.6
    Increases the MS population by a factor of 1.4

 Cooperation among UW & MS proxies…
    Gives marginal benefit: 2-4%
    Benefit matches “hit rate vs. population” curve
                                  [Source: Geoff Voelker]
UW & Microsoft Traces
    Trace               UW             MS
    Duration            7 days         6.25 days
    HTTP objects        18.4 million   15.3 million
    HTTP requests       82.8 million   107.7 million
    Avg. requests/sec   137            199
    Total Bytes         677 GB         N/A
    Servers             244,211        360,586
    Clients             22,984         60,233
    Population          ~50,000        ~40,000

                                   [Source: Geoff Voelker]
UW & MS Cooperative Caching

   Is this worth it?

                         [Source: Geoff Voelker]
Analytic Model
 Use an analytic model to evaluate caching systems
  at very large client populations
      Parameterize with trace data, extrapolate beyond trace
 Steady-state model
    Assumes caches are in steady state, do not start cold
    Accounts for document rate of change
    Explore growth of Web, variation in document popularity,
     rate of change
 Results agree with trace extrapolations
    95% of maximum benefit achieved at the scale of a
     medium-large city (500,000)

                                        [Source: Geoff Voelker]
Inside the Model

   [Wolman/Voelker/Levy et. al., SOSP 1999]
      refines [Breslau/Cao et. al., 1999], and others

   Approximates asymptotic cache behavior assuming Zipf-like object
    popularity
      caches have sufficient capacity
   Parameters:
      λ = per-client request rate
      μ = rate of object change
      pc = percentage of objects that are cacheable
      α = Zipf parameter (object popularity)

   [Breslau/Cao99] and others observed that Web accesses can be
    modeled using Zipf-like probability distributions
      Rank objects by popularity: lower rank i ==> more popular.
      The probability that any given reference is to the ith most
       popular object is pi
        • Not to be confused with pc, the percentage of cacheable
          objects.
   Zipf says: “pi is proportional to 1/i^α, for some α with
    0 < α < 1”.
      Higher α gives more skew: popular objects are way popular.
      Lower α gives a more heavy-tailed distribution.
      In the Web, α ranges from 0.6 to 0.8 [Breslau/Cao99].
      With α=0.8, 0.3% of the objects get 40% of requests.

Cacheable Hit Ratio: the Formula

   CN is the hit ratio for cacheable objects achievable by a
   population of size N with a universe of n objects:

      C_N = \int_1^n \frac{1}{C x^{\alpha}}
                     \cdot \frac{1}{1 + \mu C x^{\alpha} / (\lambda N)} \, dx
Inside the Hit Ratio Formula

   Approximates a sum over a universe of n objects...
      ...of the probability of access to each object x...
         ...times the probability x was accessed since its last
            change:

      C_N = \int_1^n \underbrace{\frac{1}{C x^{\alpha}}}_{p_x}
                     \cdot \underbrace{\frac{1}{1 + \mu C x^{\alpha} / (\lambda N)}}_{\text{accessed since last change}} \, dx

   C is just a normalizing constant for the Zipf-like popularity
   distribution, which must sum to 1. C is not to be confused with
   CN:

      C = \int_1^n \frac{dx}{x^{\alpha}}      (0 < \alpha < 1, as in [Breslau/Cao 99])

Inside the Hit Ratio Formula, Part 2

   What is the probability that object i was accessed since its last
   invalidate (change)?

      = (rate of accesses to i) / (rate of accesses or changes to i)
      = \lambda N p_i / (\lambda N p_i + \mu)

   Divide through by \lambda N p_i, and note that by Zipf
   p_i = 1/(C i^{\alpha}), so 1/(N p_i) = C i^{\alpha} / N; this
   gives the second factor in the C_N integral:

      \frac{1}{1 + \mu C i^{\alpha} / (\lambda N)}
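The integral can be evaluated numerically; the sketch below does a midpoint-rule integration in log space. The parameter values (λ, μ, n, α) are illustrative assumptions, not the paper's fitted values.

```python
import math

def hit_ratio(N, n=1_000_000, alpha=0.8, lam=1.0, mu=1e-4, steps=10_000):
    """C_N: cacheable hit ratio for a population of N clients."""
    du = math.log(n) / steps

    def integrate(f):
        # Midpoint rule with substitution x = e^u, so dx = x du.
        return sum(f(x) * x * du
                   for x in (math.exp((i + 0.5) * du) for i in range(steps)))

    C = integrate(lambda x: x ** -alpha)   # normalizer of the Zipf-like p_x
    return integrate(lambda x: (1 / (C * x ** alpha)) *
                               (1 / (1 + mu * C * x ** alpha / (lam * N))))

# Hit ratio grows with population but flattens out: diminishing returns.
for N in (1_000, 10_000, 100_000, 1_000_000):
    print(N, round(hit_ratio(N), 3))
```

Whatever parameters are chosen, the second factor (freshness) rises with N for every object, so C_N is increasing in N but bounded above by 1: the knee-shaped curves on the following slides.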

Hit Rates From Model

   Cacheable hit rate
      Focus on cacheable objects
   Four curves correspond to different rates of change
      Believe even Slow and Mid-Slow are …
   Knee at 500K - 1M clients

                                   [Source: Geoff Voelker]
Extrapolating UW & MS Hit Rates

   These are from the simulation results, ignoring rate of change
   (compare to graphs from analytic model).

   What is the significance of the slope?

                                   [Graph from Geoff Voelker]
  Latency From Model

 Straightforward
  calculation from the
  hit rate results

                          [Source: Geoff Voelker]
Rate of Change
 What is more important, the rate of
  change of popular objects or the rate of
  change of unpopular objects?

 Separate popular from unpopular objects
 Look at sensitivity of hit rate to variations
  in rate of change

                              [Source: Geoff Voelker]
Rate of Change Sensitivity

   Popular docs sensitivity
      Top curve
      Unpopular low R-of-C
      Issue is minutes to hours
   Unpopular docs sensitivity
      Bottom curve
      Popular low R-of-C
      Days to weeks to months
   Unpopular more sensitive than popular!
      Compare differences in hit rates between A,C and B,C

                                   [Source: Geoff Voelker]
Reference 2: Content Distribution Networks and Quality of Service

What is a CDN?
 A system (often overlay network to
  Internet) for high-performance delivery of
  multimedia content.
 A CDN maintains multiple locations with
  mirrors of the same content (known as
  surrogates) and redirects users to the
  most appropriate content location.
 This distributes the load and also moves
  the content closer to the user, avoiding
  potential congestion and reducing response
  times (latency).
Need for CDNs
 Multimedia such as videoconferences and
 streaming broadcasts.
   Sensitive to response-time delays
   Require large amounts of bandwidth

 CDNs address these requirements by
 minimizing the number of backbone routers
 that content must traverse and
 distributing the bandwidth load

Victoria’s Secret

   As an example of a CDN’s scalability:
   Once a year the Victoria’s Secret lingerie company broadcasts its
    Fashion Parade.
      1,000,000+ viewers watching live @ 25 Kbps
   The first year they tried it, the enormous load crashed their
    centralized servers and many missed the show.
   Since then they have used Yahoo and Akamai for their CDN.
      As many as 2 million watched the show in 2001 without any
       hiccups.
CDNs and Cache
 Caches are used in the Internet to move
  content closer to the user.
    Reduces load on origin servers
    Eliminates redundant data traversal
    Reduces latency

 CDNs make heavy use of cache
    Origin servers are fully or partially cached at
     surrogate servers close to the users

  Request-routing
     Initiates communication between client and a surrogate server.
  Content Distribution
     Mechanisms that move content from origin servers to surrogates.
  Content Delivery
     Consists of surrogate servers that deliver copies of content to
      users.

How CDN Routing Works

  1. Client requests content from a Site.
  2. Site uses a CDN as their provider. Client gets redirected to the
     CDN.
  3. Client gets redirected to the most appropriate cache.
  4. If the CDN has a cache at the Client’s ISP, the Client gets
     redirected to that cache.
  5. The CDN cache serves the content to the client.
  6. If content is served from the ISP’s cache, performance improves
     due to close proximity to the client.

Request Routing

  Direct a client’s request for objects served by a CDN to the most
   appropriate server.
  Two commonly used methods:
     1. DNS Redirection
     2. URL Rewriting

DNS Redirection

  Authoritative DNS server redirects a client request by resolving
   the CDN server name to the IP address of one content server.
  A number of factors determine which content server is used in the
   final resolution.
     Availability of resources, network conditions
  Load balancing can be implemented by specifying a low TTL field in
   a DNS reply.

DNS Redirection…
   Two types of CDNs using redirection:
     1. Full site content delivery
     2. Partial site content delivery

Full Site Content Delivery
 All requests for origin-server are
  redirected by DNS, to a CDN server.
 Companies include:
   Adero
   NetCaching
   UniTech’s Networks IntelliDNS

Partial Site Content Delivery
 Origin site alters an object’s URL so that
  it’s resolved by the CDN’s DNS server.
  eg: www.foo.com/bar.gif     becomes
 Companies include:
   Akamai
   Digital Island
   MirrorImage
   Speedera

URL Rewriting
 Origin server dynamically generates pages
  to redirect clients to different content
 Page is dynamically rewritten with the IP
  address of a mirror server.
 Clearway CDN is one such company.
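A minimal sketch of the URL-rewriting idea follows. The mirror hostnames, the hash-based mirror choice, and the plain string replacement are all illustrative assumptions, not how any listed company actually does it.

```python
# Illustrative URL rewriting: the origin server rewrites object URLs in
# a page so they point at a mirror server chosen per request. The base
# page still comes from the origin; only embedded objects are redirected.

MIRRORS = ["mirror1.example-cdn.net", "mirror2.example-cdn.net"]

def choose_mirror(client_ip):
    # Trivial stand-in for a real selection policy (load, proximity, ...).
    return MIRRORS[hash(client_ip) % len(MIRRORS)]

def rewrite_page(html, client_ip, origin="www.foo.com"):
    """Dynamically rewrite the page with the chosen mirror's hostname."""
    mirror = choose_mirror(client_ip)
    return html.replace(f"http://{origin}/", f"http://{mirror}/")

page = '<img src="http://www.foo.com/bar.gif">'
rewritten = rewrite_page(page, "198.51.100.7")
# rewritten now references one of the mirror hosts instead of www.foo.com
```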

Content Distribution
 Mechanisms that move content from
  origin servers to surrogates.
 Two different methods to get content to
  surrogates:
     1. Pre-caching
     2. Just-in-time

Pre-caching
 Content is delivered to cache before
  requests are generated.
 Used for highly distributed usage.
 Caches can be updated during off-hours to
  reduce network load.

Just-in-time
 Content is pulled from the origin server to
  the cache when a request is received from
  a client.
 The object is delivered to the client and
  simultaneously stored on the cache for
  later requests.
 Can implement multicasting for efficient
  content transfer between caches.
 Leased lines may be used between servers
  to ensure QoS.
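The just-in-time (pull-through) behaviour described above can be sketched as a small class. The origin lookup is a stub and the in-memory dict stands in for real cache storage; both are assumptions for the example.

```python
# Minimal sketch of a just-in-time (pull-through) surrogate cache:
# on a miss the object is pulled from the origin, served to the
# client, and simultaneously stored for later requests.

class PullThroughCache:
    def __init__(self, fetch_from_origin):
        self.store = {}                       # stand-in for cache storage
        self.fetch = fetch_from_origin
        self.hits = self.misses = 0

    def get(self, url):
        if url in self.store:                 # hit: serve the local copy
            self.hits += 1
            return self.store[url]
        self.misses += 1                      # miss: pull from the origin...
        obj = self.fetch(url)
        self.store[url] = obj                 # ...and keep it for next time
        return obj

origin = {"/bar.gif": b"GIF89a..."}
cache = PullThroughCache(lambda url: origin[url])
cache.get("/bar.gif")   # first request: miss, fetched from origin
cache.get("/bar.gif")   # second request: hit, served from the surrogate
# cache.hits == 1, cache.misses == 1
```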
Content Delivery
 Consists of servers placed around the Internet with each
    server caching the central content.
   Transparent to end user – looks like it’s coming from the central
    server.
   Distributed structure means less load on all servers.
   Can support QoS for customers with differing needs.
      Eg: Gold class, Silver class, Bronze class scheme
   Costs are lower
      Buying additional servers is cheaper than
       trying to obtain higher output from just one server.

CDN Usage in the Real World

 Allow organisations to put on large multimedia events
    Internal: Company announcements, video
     conferencing meetings, instructor-led staff
     training
    External: Event hosting such as concerts, fashion
     shows
 Allow organisations to improve internal data flows
    Decentralised intranet system to reduce WAN
     traffic

CDN Options

 Companies can choose to outsource or build
  their own network
   Outsource: setup and maintenance costs much
    lower, no need for inhouse experts, providers may
    also have additional services such as live event
    management and extensive usage statistics
   Own network: greater control, privacy

CDN Providers

 Some of the largest companies include
   Akamai
   Digital Island
   Yahoo (mainly streaming media)

 Extensive networks covering large areas
    Akamai has over 13,000 servers in more than 60
     countries
Tested Performance

 How good are these networks?
   Largest companies tested for streaming live
    broadcast capabilities as well as on-demand
    delivery
   Each provider was sent a 1-hour MPEG-2 stream via
    satellite and needed to encode it at 100kbps in
    real time before transmission
   Yahoo achieved average packet loss rate of 0.006%
   Another study found internet packet loss of > 9%
    for similar bandwidth and distance
   However, this is the upper end of the results

Tested Performance (cont)

 After September 11 many web sites flooded
  with traffic
 Typical sites that experienced massive
  increases in traffic were airline and news sites
 Akamai used to serve 80% of MSNBC.com’s
  traffic, including around 12.5 million streams
  of video
 Akamai also used by MSNBC.com for Winter
  Olympics coverage
Performance Issues

 Many design aspects can affect performance
   Capabilities of infrastructure
   Location of equipment
   DNS Redirection

 DNS Redirection is crucial in obtaining
  optimal performance, but is also one of the
  hardest areas to perfect

DNS Redirection Issues

 Study found that neither Akamai nor Digital
  Island could consistently redirect the client
  to the optimal server in their content
  distribution networks
 In a small fraction of cases performance was
  far from optimal
 Due to difficulty in determining user’s exact
  location and the best server at the time

Summary
 Using a number of mechanisms including load
  balancing and caching servers, content
  delivery networks aim to distribute internet
  content towards the network edge
 Avoids bottlenecks involved in centralized
  architecture, and reduces latency between
  end user and content
 Common uses for these networks include
  supporting a large number of users accessing
  popular web sites, and serving as a delivery
  means for streaming multimedia


Hierarchical Caches and CDNS
 What are the implications of this study for
  hierarchical caches and Content Delivery
  Networks (e.g., Akamai)?
      Demand-side proxy caches are widely deployed and are
       likely to become ubiquitous.
      What is the marginal benefit from a supply-side CDN
       cache given ubiquitous demand-side proxy caching?
      What effect would we expect to see in a trace gathered
       at an interior cache?
 CDN interior caches can be modeled as upstream
  caches in a hierarchy, given some simplifying
  assumptions.
An Idealized Hierarchy

 [Figure: a symmetric two-level cache tree – one Level 1 (root) cache
  above several Level 2 caches, each Level 2 cache covering N2 clients;
  the root covers the full population N1.]

Assume the trees are symmetric to simplify the math.
 Ignore individual caches and solve for each level.
Hit Ratio at Interior Level i
 C_N gives us the hit ratio for a complete
  subtree covering population N
 The hit ratio predicted at level i or at any
  cache in level i over R requests is given by:
    hits at level i / requests to level i
      =  h_i / r_i  =  R·p_c·(C_Ni − C_Ni+1) / (r_i+1 − h_i+1)

  “the hits for Ni (at level i) minus the hits captured by level
           i+1, over the miss stream from level i+1”

Root Hit Ratio

 Predicted hit ratio for cacheable objects,
  observed at the root of a two-level cache
  hierarchy (i.e. where r_2 = R·p_c):

      h_1 / r_1  =  (C_N1 − C_N2) / (1 − C_N2)
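Plugging numbers into the root hit ratio h1/r1 = (C_N1 − C_N2) / (1 − C_N2) makes the formula concrete. The C_N values below are invented for illustration; they are not taken from the study.

```python
# Hedged numeric check of the two-level root hit ratio formula:
# the root sees only the leaf level's miss stream (1 - C_N2 of
# cacheable requests), and captures the extra hits that a larger
# covered population N1 provides over N2.

def root_hit_ratio(c_n1, c_n2):
    """Hit ratio observed at the root, over the leaf-level miss stream."""
    return (c_n1 - c_n2) / (1 - c_n2)

c_n2 = 0.40   # hypothetical cacheable hit ratio for one leaf population N2
c_n1 = 0.55   # hypothetical hit ratio for the full population N1 (N1 > N2)

print(round(root_hit_ratio(c_n1, c_n2), 3))  # 0.25
```

So in this made-up case the root converts a quarter of the leaf misses into hits, even though it only adds 15 percentage points of coverage, because it never sees the requests the leaves already absorbed.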

 Generalizing to CDNs

 [Figure: a request-routing function ƒ(leaf, object, state) directs each
  request to an interior cache (supply-side “reverse proxy”, covering NI
  clients); below sit the leaf caches (demand side), each covering NL
  clients, out of a total population of N clients.]

Symmetry assumption: ƒ is stable and “balanced”.

Hit ratio in CDN caches
  Given the symmetry and balance
   assumptions, the cacheable hit ratio
   at the interior (CDN) nodes is:

      (C_NI − C_NL) / (1 − C_NL)

NI is the covered population at each CDN cache.
NL is the population at each leaf cache.

Cacheable interior hit ratio

 [Plot: cacheable interior hit ratio vs. increasing NI and NL, at fixed
  fanout NI/NL. Interior hit rates improve as leaf populations increase....]

Interior hit ratio
as percentage of all cacheable

 [Plot: the interior cache’s share of traffic vs. increasing NI and NL.
  ....but, the interior cache sees a declining marginal share of traffic.]

