					High Performance Web Server



            NTUIM
            R89725018 Chen Pei-wen
            R89725013 Cheng Pei-chun
               Outline
 Introduction
 Load balancing

 Content-based Switching

 Implementation Architecture

 Conclusion

 Reference
               Introduction
 Performance and high availability are
  critical for web sites that receive a large
  number of requests.
 The QoS a web server provides to end
  users depends on
    – Network-transfer speed
    – Server-response time
          Introduction(con’d)
 Network-transfer speed is mainly a
  matter of Internet-link bandwidth.
 Server-response time depends upon
  available resources:
    – Single server
    – Multiple servers
          Introduction(con’d)
   Single server
    – Web-server software
       • Use a well-designed server process
       • Tune the web-server software
    – Operating system
       • Choose a suitable operating system
       • Tune operating-system parameters
    – Hardware
       • Install more RAM
       • Replace the CPU with a faster one
       • Use faster SCSI controllers and disks
          Introduction(con’d)
   Multiple Servers
    – Improve performance by increasing the
      number of Web servers.
    – This involves an attempt to distribute the
      traffic onto a cluster of back-end Web
      servers.
    – Load balancing is needed.
             Load Balancing
   Goal
    – To balance the traffic across the available servers
    – The technical details of the distribution are totally
      transparent to the end user
       Load Balancing(con’d)
   Benefits
    – Improve reliability (fault tolerance)
       • If you are using a single server and it fails, the site goes
         down with it.
       • This is especially bad for e-commerce and financial sites
         which lose money if they are out of service.
       • With a load balanced group of servers, loss of a single
         server will only slightly affect overall site performance
         and the site will not go down.
   Load Balancing(con’d)

– Improve performance
  • Load balancing allows multiple servers to be available to
    handle a larger number of incoming client requests.
– Lower cost
  • With load balancing providing fault tolerance to the entire
    site, the reliability of each individual server is less critical.
  • We can use lower-cost servers without compromising
    overall reliability.
   Load Balancing(con’d)

– Improve scalability and flexibility
   • With one server, all you can do if your traffic increases is
     upgrade that server or buy a bigger one.
   • With load-balanced groups of servers, you can simply
     add more servers gracefully to the server farm.
– Improve maintainability
   • The flexibility of a load balanced group of servers allows
     you to remove individual servers from service for repair
     or upgrade without affecting the overall availability of the
     site.
           Load Balancing(con’d)
    Load balancing algorithms (RFC 2391)
    1. Round-robin
       •    This is the simplest scheme: a host is selected
            in round-robin order, without regard to the load
            on the host.
    2. Least load first (session count)
       •    The host with the fewest sessions bound to it is
            selected to service a new session.
       •    Each session is assumed to consume as many resources
            as any other session.
       Load Balancing(con’d)
3. Least traffic first (byte or packet count)
   •    Measure system load by tracking the packet count or byte
        count directed from or to each of the member hosts
        over a period of time.
4. Least weighted load
   •    Assign weights to sessions, based on estimates of the likely
        resource consumption of each session type.
   •    Assign weights to hosts, based on resource availability.
5. Fastest response
   •    Periodically ping member hosts and measure the
        response time to determine how busy the hosts really
        are.
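The five policies above can be sketched as small selection functions. This is a minimal illustration, not code from RFC 2391: the `Host` fields, the host names, and the sample numbers are all assumptions, and the weighted policy only applies host weights (per-session weights are left out).

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass
class Host:
    name: str
    sessions: int = 0        # sessions currently bound to the host
    bytes_recent: int = 0    # bytes seen during the last measurement window
    weight: float = 1.0      # relative capacity (hypothetical unit)
    rtt_ms: float = 0.0      # last measured response time


def round_robin(hosts):
    """Return an iterator that yields hosts in turn, ignoring load."""
    return cycle(hosts)


def least_load(hosts):
    """Pick the host with the fewest bound sessions."""
    return min(hosts, key=lambda h: h.sessions)


def least_traffic(hosts):
    """Pick the host that carried the least traffic in the last window."""
    return min(hosts, key=lambda h: h.bytes_recent)


def least_weighted_load(hosts):
    """Pick the host with the lowest session count relative to its capacity."""
    return min(hosts, key=lambda h: h.sessions / h.weight)


def fastest_response(hosts):
    """Pick the host that answered the last probe most quickly."""
    return min(hosts, key=lambda h: h.rtt_ms)


hosts = [
    Host("www1", sessions=12, bytes_recent=9_000, weight=1.0, rtt_ms=4.0),
    Host("www2", sessions=20, bytes_recent=3_000, weight=4.0, rtt_ms=9.0),
    Host("www3", sessions=15, bytes_recent=7_000, weight=1.0, rtt_ms=2.5),
]
rr = round_robin(hosts)
picks = [next(rr).name for _ in range(4)]
```

With these sample figures the policies disagree: the session count favours www1, while the weighted policy favours the larger www2, which is the reason a switch lets the operator choose the algorithm.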
    Load Balancing - RR DNS
   Round-Robin DNS Approach
    – Allows a single domain name to be associated
      with several IP addresses
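       • Example: using CNAME (canonical name) resource records, as
         accepted by older BIND versions (the DNS standards allow only one
         CNAME per name, so multiple A records are the usual choice today)
         www.foo.dom. IN CNAME www1.foo.dom.
                      IN CNAME www2.foo.dom.
                      IN CNAME www3.foo.dom.
                      IN CNAME www4.foo.dom.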
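The rotation behind such a record set can be sketched as follows. This is a toy model under stated assumptions: the class name `RoundRobinZone` is hypothetical, and it hands out a single answer per query, whereas a real name server usually returns the whole record set in rotated order.

```python
from collections import deque


class RoundRobinZone:
    """Toy zone that hands out the targets of a name in rotating order."""

    def __init__(self, records):
        # One rotating queue of targets per domain name.
        self._records = {name: deque(targets) for name, targets in records.items()}

    def resolve(self, name):
        targets = self._records[name]
        answer = targets[0]
        targets.rotate(-1)  # move the answer just given to the back of the queue
        return answer


zone = RoundRobinZone({
    "www.foo.dom.": ["www1.foo.dom.", "www2.foo.dom.",
                     "www3.foo.dom.", "www4.foo.dom."],
})
answers = [zone.resolve("www.foo.dom.") for _ in range(5)]
```

The fifth query wraps around to the first target again, so each server receives every fourth resolution as long as nothing caches the answers.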
     Load Balancing - RR DNS
   Operation
    – The first step a browser has to take to retrieve a URL is to
      resolve the corresponding IP address

    – The browser uses a name resolver that calls a nearby DNS server,
      which then iterates over the distributed DNS server hierarchy on
      the Internet until it reaches the Round-Robin DNS server,
      which finally returns the IP address

    – The browser takes the IP address and creates a connection with
      the assigned server
     Load Balancing - RR DNS
   Attractiveness of Round-Robin DNS
    – The concept is simple
    – It requires no additional hardware
   Drawbacks of Round-Robin DNS
    – DNS is unaware of the status of web servers
    – All servers are assumed to have equal capability
      to offer all services
Load Balancing - RR DNS
– The caching of DNS data can cause load
  imbalances
   • In practice, DNS servers cache the resolved data at any point
     in the DNS hierarchy both to decrease the resolver traffic and
     to speed up resolving.
                                      Database          Database
                                         |                 |
          query               query             query
User     ------->  Name      ------->  Name    ------->  RR DNS
process  <-------  resolver  <-------  server  <-------  server
         response            response          response
                      |                  |                 |
                    cache              cache             cache
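The imbalance caused by caching can be illustrated with a small simulation. It is a sketch under simplifying assumptions: each client site sits behind one caching name server, a cached answer expires after a fixed number of lookups (`ttl_lookups`, a stand-in for a time-based TTL), and the site sizes are invented for the example.

```python
from collections import Counter
from itertools import cycle


def simulate(requests_per_site, ttl_lookups, servers):
    """Count connections per server when intermediate name servers cache answers.

    requests_per_site: client requests arriving behind each caching name server
    ttl_lookups: number of client lookups one cached answer serves before expiring
    """
    rr = cycle(servers)                      # the RR DNS server's rotation
    load = Counter({s: 0 for s in servers})
    for n in requests_per_site:
        cached, remaining = None, 0          # each site starts with an empty cache
        for _ in range(n):
            if remaining == 0:               # cache miss: ask the RR DNS server
                cached, remaining = next(rr), ttl_lookups
            load[cached] += 1
            remaining -= 1
    return load


servers = ["www1", "www2", "www3", "www4"]
sites = [1000, 10, 10, 10]                   # one large site, three small ones
no_cache = simulate(sites, ttl_lookups=1, servers=servers)
with_cache = simulate(sites, ttl_lookups=1000, servers=servers)
```

Without caching the 1030 requests spread almost evenly. With a long-lived cache every request from the large site follows the single answer its name server cached, so one web server receives 1000 connections and the others 10 each.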
    Load Balancing - L4 Switch
   Layer-4 Switch Approach
    – These switches sit between the connection to the
      Internet and the server farm
    Load Balancing - L4 Switch
   Operation
    – The switch recognizes when a client is requesting a new
      session by identifying the TCP SYN packet

    – The request is forwarded to the best available server based
      on the configured load balancing algorithm

    – The switch maintains a session-server binding table that
      associates each active session with the real server to which
      it is assigned
Load Balancing - L4 Switch
– It performs address substitution so that the real server will
  transparently receive packets for that session

– Likewise, the switch intercepts packets traveling from the
  real server to the client and performs the reverse address
  substitution

– The switch recognizes when the session is terminated by
  identifying the TCP FIN packet

– Then it removes the session-server binding from its
  binding table
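The operation described above can be sketched as a binding table keyed by the client's address and port. This is a simplified model, not a vendor implementation: the class name, the least-sessions policy, and the example addresses are assumptions, packets are reduced to a flag set and a payload, and the binding is dropped on the first FIN rather than after the full TCP close handshake.

```python
from dataclasses import dataclass, field


@dataclass
class Layer4Switch:
    virtual_ip: str
    servers: list
    bindings: dict = field(default_factory=dict)  # (client_ip, client_port) -> real server

    def _pick_server(self):
        # Least-sessions policy; any configured algorithm could be used here.
        counts = {s: 0 for s in self.servers}
        for s in self.bindings.values():
            counts[s] += 1
        return min(self.servers, key=counts.__getitem__)

    def from_client(self, client, flags, payload=b""):
        """Forward a client packet sent to the virtual IP; return (real_server, payload)."""
        if "SYN" in flags and client not in self.bindings:
            self.bindings[client] = self._pick_server()   # new session
        server = self.bindings.get(client)
        if server is None:
            return None                                   # no session: drop the packet
        if "FIN" in flags:
            del self.bindings[client]                     # session terminated
        return server, payload

    def from_server(self, client, payload=b""):
        """Reverse substitution: the client sees the virtual IP as the source."""
        return self.virtual_ip, client, payload


switch = Layer4Switch("192.0.2.10", ["10.0.0.1", "10.0.0.2"])
a, b = ("198.51.100.7", 40000), ("203.0.113.9", 40001)
first = switch.from_client(a, {"SYN"})
second = switch.from_client(b, {"SYN"})
data = switch.from_client(a, {"ACK"}, b"GET / HTTP/1.0\r\n\r\n")
closing = switch.from_client(a, {"FIN", "ACK"})
```

Note that the server is chosen when the SYN arrives, before any HTTP data has been seen, which is why a Layer-4 switch cannot take the requested content into account.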
    Load Balancing - L4 Switch
   Attractiveness of Layer-4 Switch
    –   Good load balancing can be achieved
    –   No DNS caching problem
    –   Sophisticated algorithms can be used
    –   The switch is aware of web-server failures
   Limitation of Layer-4 Switch
    – It has no concept of what content is being
      requested
Load Balancing - L4 Switch
– All content must be replicated on every server
– The servers' cache hit rate may be low
             Load Balancing
   Benefits of Content Awareness
     Content-based Switching
   Content-based Switching
    – Intelligently load balances traffic across delivery
      nodes, dynamically directing specific content
      requests to the best site and server at that
      moment.
    – Based on content availability, application
      availability and server load.
    – Adds protection against flash crowds and ensures
      transaction continuity for e-commerce applications.
    – Enables advanced personalization and
      prioritization for important content and customers.
Content-based Switching(con’d)
    Benefits
     – Increased performance due to improved hit rates
       in the back-ends’ main memory caches.
     – Increased secondary storage scalability due to the
       ability to partition the server’s database over the
       different back-end nodes.
     – The ability to employ back-end nodes that are specialized
       for certain types of requests.
Content-based Switching(con’d)
    Products
     – ArrowPoint's
       Content Smart™ Web Switches
       WebNS™
     – Foundry Networks’
       ServerIron™ Traffic Management system
       Internet IronWare™
     – Nortel Networks’
       Accelar Load Balancing Server Switch
Content-based Switching(con’d)
   Switch Architecture
Content-based Switching(con’d)
   Idea




    – The user makes a content request by typing a URL into a
      browser.
    – The Web switch with the virtual IP of the requested URL
      intercepts the request.
    – The Web switch spoofs the TCP connection back to the client,
      examines the HTTP header and URL, and compares them to the current
      content rules to select the best server or cache to satisfy the
      request.
    – A flow is created between the switch and the optimal server
      and "snaps" together with the flow from the client to the switch.
    – Simultaneously, a virtual control block is created in the switch
      port ASIC, and all subsequent packets are forwarded without
      intervention by the switch controllers.
Content-based Switching(con’d)
   Variant Switching Scheme
    – URL Switching
      Directs HTTP requests to a group of servers using
      information contained in URL string.
       • Gives greater control over website deployment by placing
         different web content on different servers.
       • Eliminates unnecessary duplication of all content across
         all load-balanced servers.
       • Ex: Different file types
             Different requests
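URL switching by file type or request type can be sketched as a rule lookup on the request path. The rule tables, group names, and default below are hypothetical; a real switch would take them from its configuration.

```python
import posixpath
from urllib.parse import urlsplit

# Hypothetical rules: path prefixes are checked first, then file extensions.
PREFIX_GROUPS = [("/cgi-bin/", "dynamic"), ("/video/", "streaming")]
EXTENSION_GROUPS = {
    ".gif": "static", ".jpg": "static", ".html": "static",
    ".cgi": "dynamic", ".asp": "dynamic",
}
DEFAULT_GROUP = "static"


def select_group(url):
    """Return the name of the server group that should handle this URL."""
    path = urlsplit(url).path                 # ignore scheme, host and query string
    for prefix, group in PREFIX_GROUPS:
        if path.startswith(prefix):
            return group
    extension = posixpath.splitext(path)[1].lower()
    return EXTENSION_GROUPS.get(extension, DEFAULT_GROUP)
```

For example, `select_group("/cgi-bin/order?id=7")` returns `"dynamic"`, so only the servers in that group need to host the CGI programs.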
Content-based Switching(con’d)
   Variant Switching Scheme
    – Cookie Switching
      Directs HTTP requests to a server group based on
      information embedded in a cookie in the HTTP
      header.
       • The cookie specifies which server group should handle the
         request.
       • Ensures that a particular server group always handles
         requests from a particular client, even across sessions.
       • Guarantees a persistent end-user experience.
       • Ex: Personalized web page
             Prioritized service
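Cookie switching can be sketched with the standard-library cookie parser. The cookie name `server_group` and the group names are assumptions made for the example; the slides do not say which cookie a given product inspects.

```python
from http.cookies import SimpleCookie

COOKIE_NAME = "server_group"          # hypothetical cookie set by the site
KNOWN_GROUPS = {"gold", "standard"}   # hypothetical server groups
DEFAULT_GROUP = "standard"


def group_from_headers(headers):
    """Choose a server group from the Cookie header of an HTTP request."""
    cookie = SimpleCookie()
    cookie.load(headers.get("Cookie", ""))
    morsel = cookie.get(COOKIE_NAME)
    if morsel is not None and morsel.value in KNOWN_GROUPS:
        return morsel.value
    return DEFAULT_GROUP              # missing or unknown cookie
```

A request carrying `Cookie: lang=en; server_group=gold` would go to the "gold" group on every visit, which is how prioritized service for selected customers could be arranged.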
Content-based Switching(con’d)
   Variant Switching Scheme
    – SSL Session ID Switching
      All the SSL connections between a client and a
      server must reach the same host.
       • Ensures that all the traffic for an SSL transaction with a
         given SSL session ID always goes to the same server.
       • Key feature for commerce, financial web sites
       • Ex: Prevent shopping cart loss
               Access control
               Prevent source address overload
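SSL session ID switching can be sketched as a table from session ID to server. The model is simplified and its names are hypothetical: the switch is assumed to read the session ID a client offers when it resumes a session, and to learn new IDs from the server's handshake reply; new sessions are balanced round-robin.

```python
class SslSessionTable:
    """Keep every connection of one SSL session on the same real server."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._by_session = {}   # session ID -> real server
        self._next = 0

    def _balance(self):
        # Round-robin for connections that do not resume a known session.
        server = self._servers[self._next % len(self._servers)]
        self._next += 1
        return server

    def route(self, session_id):
        """session_id: the ID offered by the client, b"" for a new session."""
        if session_id in self._by_session:
            return self._by_session[session_id]
        return self._balance()

    def learn(self, session_id, server):
        """Record the session ID that `server` issued during the handshake."""
        self._by_session[session_id] = server


table = SslSessionTable(["s1", "s2", "s3"])
first = table.route(b"")              # new session
table.learn(b"\x01\x02", first)       # ID issued by the chosen server
other = table.route(b"")              # another new session
resumed = table.route(b"\x01\x02")    # resumed session
```

Because the resumed connection returns to the server that holds the session state, a shopping cart kept on that server is not lost when the client opens a new connection.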
Content-based Switching(con’d)
   Load balancer evaluation criteria
    –   Plans for Layer 3 or Layer 4 switching
    –   Number of servers and planned growth
    –   Type of content to be balanced
    –   Number of server sites to be balanced
    –   Sophistication of balancing algorithms
    –   Degree of fault tolerance required
    –   Interfaces and port density
    –   Support requirements
Implementation Architecture

 Design, Implementation and Performance of a
            Content-Based Switch

                    Infocom 2000

   {George Apostolopoulos, David Aubespin, Vinod Peris,
             Prashant Pradhan, Debanjan Saha}
Implementation Architecture(con’d)
    Switch Architecture

  – The Layer 5 system consists
    of a switch core to which a
    number of custom-built
    intelligent port controllers are
    attached.
  – Layer 5 functions, such as
    the parsing of HTTP
    protocol messages and
    URL-based routing, are
    performed by the processor.
Implementation Architecture(con’d)
    Switch Architecture

     – The port controllers identify
       the packets that need to
       be handled by the
       processor and forward
       them to the processor.
     – The goal is to
       achieve very high
       speed while delivering
       sophisticated Layer 5
       functionality.
Implementation Architecture(con’d)
     Operation Blueprint
     – Phase 1: the switch intercepts the TCP connection setup request
       from the client and responds by establishing a connection
       to the client.
     – Phase 2: after the routing decision is made, it sets up a second
       connection to the appropriate server node.
     – Phase 3: it splices the two TCP connections.
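One common way to splice two TCP connections is to rewrite sequence and acknowledgement numbers by a fixed offset. The sketch below shows that general technique and is not taken from the Infocom 2000 paper: it assumes the switch reused the client's initial sequence number when it opened the server-side connection, so only the server's sequence space needs translating, and the class and field names are hypothetical.

```python
from dataclasses import dataclass

MOD = 2 ** 32  # TCP sequence numbers wrap at 32 bits


@dataclass(frozen=True)
class Splice:
    switch_isn: int  # initial sequence number the switch chose towards the client
    server_isn: int  # initial sequence number the server chose towards the switch

    def to_client(self, seq, ack):
        """Rewrite a server-to-client segment into the sequence space the client knows."""
        return (seq - self.server_isn + self.switch_isn) % MOD, ack

    def to_server(self, seq, ack):
        """Rewrite a client-to-server segment so its ACK matches the server's numbering."""
        return seq, (ack - self.switch_isn + self.server_isn) % MOD


splice = Splice(switch_isn=1000, server_isn=4_294_967_000)
first_byte = splice.to_client(4_294_967_001, 77)   # first data byte from the server
wrapped = splice.to_client(204, 77)                # server sequence number after wrap-around
ack_back = splice.to_server(77, 1001)              # client acknowledging the switch's numbering
```

Once the two offsets are known, the rewrite is a fixed per-packet operation, which is the kind of work that can be handed to the port controllers after the processor has made the routing decision.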
Implementation Architecture(con’d)
    Processing at Port Controllers
Implementation Architecture(con’d)
    Processing at CPU
     – The CPU acts as the end-point for the TCP
       connections to the client and the server until
       they are spliced.
     – It splices the connections by sending the
       appropriate control messages to the port
       controllers.
     – Handling of TCP options deserves special
       attention.
        • Reject all TCP options
        • Enumerate the minimum set of options supported by
          all nodes
Implementation Architecture(con’d)
     URL look up

     To be able to dispatch
      HTTP requests based on
      URLs, the L5 system has
      to know the mapping from each
      URL to the web server on
      which the page resides.
     The lookup uses a hash function, with
      the default size of all hash
      buckets set to 256.
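A URL table of this kind can be sketched as a chained hash table. Several details are assumptions: the slide's "256" is read here as a limit on the entries per bucket, the number of buckets (1024) and the CRC-32 hash are arbitrary choices for the example, and the class name is hypothetical.

```python
import zlib

NUM_BUCKETS = 1024      # assumption: the slide does not give the table size
BUCKET_CAPACITY = 256   # one reading of the slide's "256"


class UrlTable:
    """Map URLs to the web server on which the page resides."""

    def __init__(self):
        self._buckets = [[] for _ in range(NUM_BUCKETS)]

    def _bucket(self, url):
        # CRC-32 is used only as a convenient, deterministic hash function.
        return self._buckets[zlib.crc32(url.encode("utf-8")) % NUM_BUCKETS]

    def add(self, url, server):
        bucket = self._bucket(url)
        for i, (known_url, _) in enumerate(bucket):
            if known_url == url:
                bucket[i] = (url, server)   # update an existing mapping
                return
        if len(bucket) >= BUCKET_CAPACITY:
            raise OverflowError("hash bucket full")
        bucket.append((url, server))

    def lookup(self, url, default=None):
        for known_url, server in self._bucket(url):
            if known_url == url:
                return server
        return default


table = UrlTable()
table.add("/index.html", "www1")
table.add("/cgi-bin/order", "www3")
table.add("/index.html", "www2")            # the page moved to another server
```

After these calls `table.lookup("/index.html")` returns `"www2"`, and a URL that was never added falls back to the default passed to `lookup`.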
                 Conclusion
   The concept of content-based switching is
    easy to understand, but effort is needed to
    implement it well.
   Content-based switching can be
    used to provide service differentiation based
    on user profiles.
   Not only load balancing but also persistence
    pays off in e-commerce.
                       Reference
   RFC 2391, Load Sharing using IP Network Address Translation (LSNAT)
   Webtechniques, Load balancing your web sites
    http://www.webtechniques.com/archives/1998/05/engelschall/
   HydraWEB, Load Balancing
    http://www.hydraweb.com/load_balancing/index.asp
   Techniques for Designing High-Performance Web Sites
    http://www.research.ibm.com/people/i/iyengar/ieeeic/ieeeic.html
   Locality-Aware Request Distribution in Cluster-based Network
    Servers, in Architectural Support for Programming Languages
    and Operating Systems, 1998
   Uyless Black, TCP/IP & Related Protocols, 2nd edition,
    Chapter 4: The Domain Name System
               Reference (con’d)
   Foundry Products Application Notes
    http://www.foundrynet.com/appnotes.html
   Alteon WebSystems Web Switching White Paper
    http://www.alteonwebsystems.com/products/whitepapers/index.asp
   Design, Implementation and Performance of a Content-Based
    Switch, George Apostolopoulos, David Aubespin, Vinod Peris,
    Prashant Pradhan, Debanjan Saha, Infocom 2000
   Cisco CDNs
    http://www.cisco.com/warp/public/779/largeent/learn/technologies/content_networking/

				