  Building Scalable Web Services Using Apache JServ

            Sunny Gleason
             COM S 717
      Tuesday, December 4, 2001
             In This Lecture
•   What is JServ?
•   The Alternatives
•   Java Servlet API
•   Apache JServ / Tomcat
    – Scalability
    – Load Balancing
    – Fault-Tolerance
• JServ Security
• Running a web service has changed a
  lot since the early 1990s
• Originally static HTML, text, and images
• Still a great deal of HTML content
• Shift from static pages to dynamically
  generated content
• Database-driven content, WAP, XML, ...
            What is JServ?
• JServ Server is a Java Servlet Engine
  (compliant with the Java Servlet API v2.0)
• Free software produced by the Apache
  Software Foundation
• Mod_jserv is a module for connecting JServ
  to Apache HTTP Server
• JServ engine has been replaced by Tomcat
• Mod_jserv has been replaced by mod_jk
              HTTP Basics
• HyperText Transfer Protocol
• Built on top of TCP
• 2 Well-Known Methods:
  – GET
  – POST
• Other Methods
  – HEAD, PUT, DELETE, ...
• Stateless
                  HTTP GET
• Format:
   – GET url HTTP/1.1 crlf headers crlf crlf
• The url string contains the resource identifier,
  e.g. “/top.htm”
• The headers contain optional information
  provided by the client to the server
• Query Data may follow a question mark in
  the URL
   – i.e. “/”
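The request format above can be sketched in a few lines of Python. This is a minimal illustration, not part of JServ: the host `www.example.com`, the path `/search`, and the parameter names are hypothetical, and only the standard library is used.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

CRLF = "\r\n"

def build_get_request(host, path, params=None):
    """Return the raw text of an HTTP/1.1 GET request."""
    # Query data, if any, follows a question mark in the URL.
    url = path + ("?" + urlencode(params) if params else "")
    request_line = f"GET {url} HTTP/1.1"
    headers = [f"Host: {host}", "Connection: close"]
    # Request line and headers each end with CRLF; a blank line ends the request.
    return CRLF.join([request_line, *headers]) + CRLF + CRLF

# Hypothetical host, path, and parameters, for illustration only.
request = build_get_request("www.example.com", "/search", {"q": "jserv", "page": "2"})

# The server side recovers the query data from the part after the "?".
query = parse_qs(urlsplit("/search?q=jserv&page=2").query)
```

Here `request` begins with the line `GET /search?q=jserv&page=2 HTTP/1.1`, and `query` maps each parameter name to a list of its values.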
                HTTP POST
• Format:
  – POST url HTTP/1.1 crlf headers crlf crlf body
• Form data passed in the request body, not through the URL
• Allows submission of data values which
  are larger than maximum URL length
  – [URL limit ~ 2k on MS IE4.0 and above]
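Form submissions of this kind are normally sent with POST, where the data travels in the request body after the blank line instead of in the URL. The sketch below assumes a hypothetical `/submit` path and a single form field named `comment`; it shows a 5000-character value that would not fit in a 2 KB URL.

```python
from urllib.parse import urlencode

CRLF = "\r\n"

def build_post_request(host, path, form):
    """Return the raw text of an HTTP/1.1 POST request carrying form data."""
    # The form fields are URL-encoded, but placed in the body, not the URL.
    body = urlencode(form)
    headers = [
        f"Host: {host}",
        "Content-Type: application/x-www-form-urlencoded",
        # Content-Length tells the server how many body bytes follow.
        f"Content-Length: {len(body.encode('ascii'))}",
    ]
    head = CRLF.join([f"POST {path} HTTP/1.1", *headers])
    return head + CRLF + CRLF + body

# Hypothetical host, path, and field name, for illustration only.
post = build_post_request("www.example.com", "/submit", {"comment": "x" * 5000})
```

The request line stays short (`POST /submit HTTP/1.1`) no matter how large the submitted value is.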
     HTTP Server Response
• HTTP/1.1 200 OK crlf headers crlf crlf body
• Headers include MIME-type, content
  length, content encoding
• Other responses: 301 Redirect, 401
  Authorization Required, 403 Access
  Forbidden, 500 Internal Server Error
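To make the response layout concrete, here is a small parsing sketch. The sample response text is invented for illustration; a real client would read bytes from a socket and handle more cases (repeated headers, chunked bodies) than this does.

```python
def parse_response(raw):
    """Split a raw HTTP response into status code, reason, headers, and body."""
    # A blank line (CRLF CRLF) separates the head from the body.
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    # Status line: protocol version, numeric code, reason phrase.
    _version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(code), reason, headers, body

# Invented sample response, for illustration only.
raw = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "Content-Length: 13\r\n"
    "\r\n"
    "<p>Hello</p>\n"
)
code, reason, headers, body = parse_response(raw)
```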
               HTTP Cookies
• Persistent Client-Side Information
• <Server, Key, Value, Expiration Date>
• Server sets cookie using Set-Cookie
• All future requests to server (before
  expiration date) accompanied by cookie
  in header
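The cookie exchange can be sketched with Python's standard `http.cookies` module. The cookie name `session`, its value, and the one-hour lifetime are assumptions made for the example.

```python
from http.cookies import SimpleCookie

# Server side: build the Set-Cookie response header.
jar = SimpleCookie()
jar["session"] = "abc123"          # key and value (hypothetical)
jar["session"]["path"] = "/"
jar["session"]["max-age"] = 3600   # expires one hour after it is set
set_cookie_header = jar.output()   # a "Set-Cookie: ..." header line

# Client side: until expiry, each request to that server carries the pair back.
cookie_header = "Cookie: " + "; ".join(
    f"{name}={morsel.value}" for name, morsel in jar.items()
)

# Server side, on a later request: parse the Cookie header value.
received = SimpleCookie()
received.load(cookie_header[len("Cookie: "):])
```

Because HTTP itself is stateless, this round trip is what lets a server recognize a returning client.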
   Serving Dynamic Content
• We discuss 3 early models for dynamic content:
  – CGI
  – Mod_perl
  – Mod_php
       The Alternatives: CGI
• Common Gateway Interface
• Advantages
  – Flexibility - run any program
     • bash, perl, python, php
  – Low process overhead when idle
• Disadvantages
  – Reload interpreter upon every request
  – Re-establish (costly) database connections
  – Security concerns - passing parameters
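A minimal CGI-style handler, as a sketch: the web server starts a fresh process per request, passes the query string in the `QUERY_STRING` environment variable, and reads the response from standard output. The parameter name `name` is an assumption for the example; escaping it before output illustrates the parameter-passing concern noted above.

```python
import os
import sys
from html import escape
from urllib.parse import parse_qs

def handle_request(environ):
    """Build a CGI response (headers, blank line, body) from the environment."""
    # CGI hands request parameters to the program via environment variables.
    params = parse_qs(environ.get("QUERY_STRING", ""))
    # Never echo client-supplied parameters without escaping them.
    name = escape(params.get("name", ["world"])[0])
    body = f"<p>Hello, {name}</p>\n"
    return "Content-Type: text/html\r\n\r\n" + body

if __name__ == "__main__":
    # When run by the web server, the response goes to standard output.
    sys.stdout.write(handle_request(os.environ))
```

Every request pays for interpreter start-up and any database connection set-up again, which is the overhead listed under the disadvantages.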
    The Alternatives: Mod_Perl
•   Apache module for Perl
•   Memory-resident interpreter
•   Precompiled scripts / Script cache
•   Speed / Memory Tradeoff
    – HTTP Processes maintain individual perl interpreters
    – Allows persistent database connections, other
      persistent server state
    – Consistency between HTTP processes was not
      always assured
   The Alternatives: Mod_Php
• Apache module for PHP
  (PHP: Hypertext Preprocessor)
• Template-based language
  – Code tags are “embedded” within HTML template
  – Similar to MS ASP
• Suitable where HTML to script code ratio is high
• Huge library of add-on modules
• Similar tradeoffs to mod_perl
  The Alternatives: Summary
• Should application logic be running on
  the web server?
  – scalability
  – fault-tolerance
  – security
• Clearly, need something better for
  enterprise-scale applications
               Apache JServ
• Separate Application Server from Web Server
   – Clean up the architecture
   – Improve Scalability
   – Provide fault-tolerance
• Embrace Java Philosophy
   – “Write once, run anywhere”
• Provide additional Servlet functionality
   – Like user sessions
          JServ: Openness
• JServ is 100% Java Code
  – Platform-Independent
  – Runs on any compliant JVM (IBM, Sun, ...)
• JServ is built on top of TCP
• Part of the Apache Software Foundation
  – Integrates nicely with Apache HTTP Server
  – Ports available for Windows, BSD, Linux ...
              JServ: Security
• JServ/Apache can run on different hosts
  (also: different users)
• JServ itself is composed of many “Zones”
    – A zone is a JVM which executes some number of
      Java Servlets
•   JServ may be placed behind a firewall
•   JServ offers ACL security by IP address
•   Optional shared-key authentication
•   Apache HTTP Server may integrate SSL for
    secure HTTP client-server interaction
      JServ: Load Balancing
• Level 0: 1 - 1 Apache/JServ
  – No load balancing, no redundancy
• Level 1: 1 - n Apache/JServ
  – Each JServ hosts different zones
    (load partitioning)
• Level 2: 1 - m*n Apache/JServ
  – Each zone may be balanced among several JServs
• Level 3: p - m*n Apache/JServ
  – Multiple Apache Servers, multiple JServs
         JServ: Levels 0-1
• Level 0: allows smaller hosts to run
  entire application on a single machine
• Level 1: allows different hosts to serve
  different applications
• Typically difficult to plan/partition
  applications in this manner
            JServ: Level 2
• 1 - m*n Apache/JServ
  – Allows Apache to balance requests among
    several JServ servers hosting the same zone
  – Apache configuration file specifies ratio of
    hits for each JServ
  – Each HTTP process chooses server for
    each JServ zone, sends new requests to
    this target
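One way to picture the "ratio of hits" is as a weighted random choice. The sketch below is not mod_jserv's actual algorithm or configuration syntax; the target names `JS1`..`JS3` and the 4:2:1 weights are hypothetical.

```python
import random

# Hypothetical targets and hit ratios, standing in for the Apache configuration.
WEIGHTS = {"JS1": 4, "JS2": 2, "JS3": 1}

def choose_target(weights, rng=random):
    """Pick a JServ target with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]

# Simulate many new requests to see the ratio emerge.
rng = random.Random(42)
counts = {name: 0 for name in WEIGHTS}
for _ in range(7000):
    counts[choose_target(WEIGHTS, rng)] += 1
```

Over many requests, the counts approach the configured 4:2:1 ratio.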
            JServ: Level 3
• p - m*n Apache/JServ
  – Allows HTTP traffic to be load-balanced
    among several Apache servers
  – Allows Servlet workload to be distributed
    among several JServ servers
  – In order for the system to work, each
    Apache HTTP server must have identical
    JServ configuration
    • (To preserve sessions, as we’ll see later)
     JServ: Session Handling
• Once established, a session is bound to a
  particular JServ
• But, HTTP client accesses might be “sprayed”
  among many HTTP servers
  – Allows HTTP Server fault-tolerance
• Identical mod_jserv configuration allows
  different Apache servers to “route” requests
  to the right JServ
• Mechanism requires client to maintain a
  cookie which contains JServ server ID
     JServ: Session Handling
• How does it work?
  – Every time a request arrives for a balanced
    ServletMountPoint, mod_jserv chooses a JServ to
    handle the request
  – mod_jserv adds a cookie trailer to the
    environment variables of the JServ request
    (e.g. JS3)
  – JServ appends the cookie trailer to the end of the
    session cookie
  – Upon subsequent requests, Apache examines
    cookie, and sends the request to the correct JServ
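The routing steps above can be sketched as follows. This is an illustration of the idea, not mod_jserv's code: the `ROUTES` table, its addresses, and the use of "-" as the separator (taken from the “xxxx-JS1” form used in these slides) are assumptions.

```python
import uuid

# Hypothetical routing table; every Apache must hold an identical copy.
ROUTES = {
    "JS1": ("10.0.0.1", 8007),
    "JS2": ("10.0.0.2", 8007),
    "JS3": ("10.0.0.3", 8007),
}

def new_session_cookie(route_id):
    """JServ side: append the cookie trailer to a fresh session ID."""
    return f"{uuid.uuid4().hex}-{route_id}"

def route_for(cookie_value, routes):
    """Apache side: read the trailer and return the matching route ID, if any."""
    _, sep, trailer = cookie_value.rpartition("-")
    if sep and trailer in routes:
        return trailer
    return None  # no usable trailer: treat as a new request and pick a target

cookie = new_session_cookie("JS3")
target = route_for(cookie, ROUTES)
```

Since the routing decision depends only on the cookie and the shared table, any Apache server can forward the request to the JServ that owns the session.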
      JServ: Fault-Tolerance
• (Assume Level 3)
• No Single Point of Failure
  – Apache can become overloaded and fail, but JServ
    servers continue to provide services (although SSL
    sessions lost)
  – JServ redundancy allows applications to continue
    running even if multiple hosts fail (although
    application sessions will be lost)
  – Since any Apache can route to any JServ, as long
    as one of each stays up, the system can work
      JServ: Fault-Tolerance
• How is JServ fault tolerance implemented?
  – Each Apache contains a memory-mapped file
    where it keeps JServ information
  – Each Apache process has access to the file
  – If a process does not receive a response from a
    JServ process, it marks it as DOWN in the file
     • (Load is re-distributed [fairly] among the survivor JServs)
  – A “watchdog” process pings the JServs
    intermittently, updates the JServ status in memory
    if the server is back online
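The mechanism above can be modelled in a few lines. This sketch uses an in-process dictionary as a stand-in for the memory-mapped file, and the class and method names are invented for the example; it is not how mod_jserv is implemented.

```python
import random

class JServTable:
    """In-process stand-in for the shared JServ status file."""

    def __init__(self, names):
        self.status = {name: "UP" for name in names}

    def live(self):
        return [name for name, state in self.status.items() if state == "UP"]

    def send(self, preferred, is_reachable, rng=random):
        """Try the preferred JServ; on failure mark it DOWN and fail over."""
        target = preferred
        while True:
            if target is None or self.status.get(target) != "UP":
                candidates = self.live()
                if not candidates:
                    raise RuntimeError("no live JServ available")
                # A random pick spreads the failed server's load over survivors.
                target = rng.choice(candidates)
            if is_reachable(target):
                return target
            self.status[target] = "DOWN"   # no response: record it for everyone
            target = None

    def watchdog_pass(self, ping):
        """Watchdog: ping DOWN servers and restore those that answer."""
        for name, state in self.status.items():
            if state == "DOWN" and ping(name):
                self.status[name] = "UP"
```

A caller supplies `is_reachable` and `ping` callbacks (hypothetical here) that attempt the real network contact.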
   JServ: Fault-Tolerance
– Apache Fault-Tolerance: Step 1
  • 1. Web server requests
  • 2. HTTP Load-balancing system routes request
  • 3. Apache server chooses a random JServ
    machine, say
  • 4. JServ machine responds to request with
    content of page, along with cookie with name
    “JServSessionID” and value “xxxx-JS1”
      JServ: Fault-Tolerance
• Apache Fault-Tolerance: Step 2
  – 1. Client requests another page from
  – 2. HTTP Load-balancing system routes request to
  – 3. Apache server recognizes session cookie, finds
    “JS1” at end of the cookie
  – 4. Apache looks up “JS1” in JServ configuration,
    routes request to
      JServ: Load-Balancing
• Step 1: JServ Load-Balancing
  – 1. Client A requests a servlet (A1)
  – 2. HTTP chooses target JServ (A’1)
  – 3. Client A cookie is set for JS1
  – 4. Client B requests a servlet (B2)
  – 5. HTTP chooses target JServ (B’2)
  – 6. Client B cookie is set for JS2
      JServ: Load-balancing
• Step 2: Session Handling
  – 1. Client B requests a servlet (sends
    previously-set cookie)
  – 2. HTTP server recognizes cookie
  – 3. Request is routed to JServ2 (B’2)
      JServ: Fault-Tolerance
• [assume JServ1 goes down]
• Step 3: JServ Fault-Tolerance
  – 1. Client A requests a servlet
  – 2. HTTP Server recognizes the JS1 cookie
  – 3. Request is passed to JServ1, resulting in
  – 4. HTTP marks JServ1 “dead” in shared memory
  – 5. HTTP looks up another server for the servlet
    mount point, sends request to JServ3
  – 6. If a new session is needed, a new one is
    created and the new cookie is set to “JS3” (JS1
      JServ: Fault-Tolerance
• Implementation Issues
  – Denial of Service
     • Failed requests must be re-distributed evenly!
     • Otherwise, a single server will bear the brunt of the load,
       and probably crash
  – Network Partitioning and Application-level Data
    Synchronization Issues
     • Must still be anticipated by the app. designer
  – Watchdog process
     • For single-threaded watchdog, if timeout is t, time
       between crash and restoration could be f*t, where f is
       the number of failed processes
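The f*t bound for a single-threaded watchdog can be checked with a short calculation. The function name, the 10-second timeout, and the server counts are assumptions for the example.

```python
def watchdog_pass_duration(statuses, timeout, rtt=0.0):
    """Time for one sequential pass: each DOWN server costs a full timeout."""
    return sum(timeout if state == "DOWN" else rtt for state in statuses)

# Hypothetical cluster: 5 failed servers, 3 healthy, 10-second ping timeout.
statuses = ["DOWN"] * 5 + ["UP"] * 3
delay = watchdog_pass_duration(statuses, timeout=10)
```

With f = 5 and t = 10 seconds, a server that comes back just after being pinged may wait about 50 seconds for the next check; pinging in parallel would reduce this to roughly t.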
       JServ: Manageability
• Shared JServ State allows HTTP process
• Admins can mark JServs as “shutdown” in
  shared memory
• JServ processes can be brought down for maintenance
• Apache HTTP processes redirect requests
  among “live” servers
• Detailed availability information can be
  produced by logging contents of shared
  memory file
     Tomcat: New Features
• Enhanced security model
• Property files which specify access
  rights (open socket, write file, etc.)
• Allows different protection levels within
  the same JVM (i.e. Java 2 protection domains)
• JServ provides:
  – Limited support for
     • Load balancing
     • Fault-tolerance
     • External Security
  – Good support for
     • Internal Security
• N-tier application abstraction provides
  flexibility when needed, “loopback” option
                The End
• Any questions/comments?
  – Apache Web Server
  – JServ / Tomcat Servlet Containers
  – Scalability / Load-balancing
  – Fault-tolerance
  – Security
         For Further Info
• Apache Jakarta Project
