Lesson 2 – Web Architecture and Infrastructure

   School of Continuing and Professional
               Studies – NY
       Spring 2010 – Sig Handelman
             February 25, 2009
                 Outline
• Lesson 2. XML and Web Servers
• XML, Web Servers, Web Services and SOAP
• READINGS: Chapters 5 & 6 - Web Application
  Architecture
                    URLs
• http://www.ics.uci.edu/~rohit/IEEE-L7-namespaces.html
                   HTML
• P. 64 Figure 4.1 SGML, XML and their
  applications
• 4.1 HTML came out of SGML, which came out of
  ANSI in 1980 and was adopted by the IRS and the
  DOD
• Four parts – char set, DTD, specification
  (semantics) and document instances
        Document Type Definition
•   DTD
•   Element Definitions
•   Figures 4.5 and 4.6
•   See this in HTML valid tables
•   Attribute definitions – values for information
    processing
           SGML Declaration
• Document character set – ASCII, EBCDIC or
  other
• Concrete Syntax – fixed delimiters
• Feature Usage
           Evolution of HTML
• Current revision 4.01 is from 1999
• Afterwards came XHTML and HTML 5
• HTML 4 introduced internationalization, so
  foreign languages can be adapted easily
• Figure 4.10 a document
• Figure 4.11 Meta Elements
• Figure 4.12 Style
   HTML and Scripting languages
• Figure 4.13 Script document written in
  JavaScript
               HTML Rendering
•   Browser
•   Style Sheets (CSS)
•   Linked
•   Embedded
•   Inline
                       XML
•   Also derived from SGML
•   Core XML
•   XML has DTDs and it also has schemas
•   Namespaces to have several sets of tags in the
    same document
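
A minimal sketch of what namespaces buy you, using Python's standard xml.etree.ElementTree. The document and both namespace URIs below are made up for illustration. Two different title tags live in one document and stay distinguishable because each belongs to its own namespace.

```python
import xml.etree.ElementTree as ET

# Hypothetical document mixing two tag sets; both namespace URIs are invented.
DOC = """\
<order xmlns="http://example.com/ns/order"
       xmlns:book="http://example.com/ns/book">
  <title>Spring purchase</title>
  <book:title>Web Application Architecture</book:title>
</order>
"""

# Prefixes used only inside this script to refer to the two namespaces.
NS = {"o": "http://example.com/ns/order", "b": "http://example.com/ns/book"}


def titles(xml_text):
    """Return (order title, book title): same local name, different namespaces."""
    root = ET.fromstring(xml_text)
    return root.find("o:title", NS).text, root.find("b:title", NS).text


if __name__ == "__main__":
    print(titles(DOC))
```

The prefixes in the script do not have to match the prefixes in the document; only the namespace URIs are compared.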
                   XML
• http://www.w3.org/standards/xml/
             Web Services
• http://www.w3.org/standards/webofservices/
               Web Servers
• First topic Web Servers
     IIS, Apache, and Sun IPlanet
• IIS is Microsoft’s Internet Information Services, installed
  on server-class machines and on XP Professional, Vista
  Business, Server 2000, 2003 and 2008
• Apache is the HTTPD provided by the Apache
  Software Foundation.
• IPlanet is the old name of the Sun Server. It is
  a reference platform for Java
Web Servers in the World
Microsoft IIS
        Apache in their own words
• In February of 1995, the most popular server software on the Web
  was the public domain HTTP daemon developed by Rob McCool at
  the National Center for Supercomputing Applications, University of
  Illinois, Urbana-Champaign. However, development of that httpd
  had stalled after Rob left NCSA in mid-1994, and many webmasters
  had developed their own extensions and bug fixes that were in
  need of a common distribution. A small group of these webmasters,
  contacted via private e-mail, gathered together for the purpose of
  coordinating their changes (in the form of "patches"). Brian
  Behlendorf and Cliff Skolnick put together a mailing list, shared
  information space, and logins for the core developers on a machine
  in the California Bay Area, with bandwidth donated by HotWired.
  By the end of February, eight core contributors formed the
  foundation of the original Apache Group:
• Brian Behlendorf, Roy T. Fielding, Rob Hartill, David Robinson, Cliff
  Skolnick, Randy Terbush, Robert S. Thau, Andrew Wilson
            Google’s Web Servers
• Amenable to extensive parallelization, Google’s web search
  application lets different queries run on different processors and,
  by partitioning the overall index, also lets a single query use
  multiple processors. To handle this workload, Google’s
  architecture features clusters of more than 15,000 commodity-class
  PCs with fault-tolerant software. This architecture achieves
  superior performance at a fraction of the cost of a system built
  from fewer, but more expensive, high-end servers.
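
The idea of partitioning an index so that one query can use several processors can be sketched in a few lines. This is a toy illustration only, not Google's implementation: the corpus, the three partitions, and the function names are all hypothetical, and threads stand in for separate machines.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy corpus, split into three index partitions ("shards").
SHARDS = [
    {1: "apache web server", 2: "xml namespaces"},
    {3: "web services and soap", 4: "ssl certificates"},
    {5: "common gateway interface", 6: "web server log files"},
]


def search_shard(shard, term):
    """Scan one partition and return the ids of documents containing the term."""
    return [doc_id for doc_id, text in shard.items() if term in text.split()]


def search(term, shards=SHARDS):
    """Fan one query out to every partition in parallel, then merge the hits."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partial = list(pool.map(search_shard, shards, [term] * len(shards)))
    return sorted(doc_id for hits in partial for doc_id in hits)


if __name__ == "__main__":
    print(search("web"))
```

Because each partition is searched independently, a failed or slow partition can be retried on another machine without redoing the rest of the query.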
           Google’s Services

• http://googlesystem.blogspot.com/2007/09/googles-server-names.html
         SUN IPLANET and More
• Sun IPlanet was the reference platform for Java programs
  in the late 1990s and early 2000s.
• In its time this was the leading outlet for the
  production of Java platform architecture
• I remember in 2002, I started an SOA – Service Oriented
  Architecture – project with one of the automotive
  companies while I was still with IBM
• Immediately they said, get an IPlanet system installed
• Sun has combined this server into the Sun Enterprise
  system.
   What do Real HTTP Servers Do?
• P. 113 in the HTTP book
1) Setup a client TCP session or close if the client is not
   wanted
2) Receive HTTP Request from the network
3) Process the request and take action
4) Access Resource – HTML file or other object on the
   server
5) Construct Response – HTTP object with correct
   headers
6) Send Response
7) Log this activity in a log file
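
The seven steps above can be sketched with Python's standard socket module. This is a teaching sketch under stated assumptions, not how IIS or Apache are built: the in-memory PAGES table, the list used as a log, and the function names are hypothetical, and only GET is handled.

```python
import socket

# Hypothetical in-memory resources standing in for files on the server.
PAGES = {"/index.html": b"<html><body>Hello</body></html>"}


def handle_connection(conn, log):
    """Serve one request on a connection the caller has already accepted."""
    raw = b""
    while b"\r\n\r\n" not in raw:               # 2) receive the HTTP request
        chunk = conn.recv(4096)
        if not chunk:
            break
        raw += chunk
    request_line = raw.split(b"\r\n", 1)[0].decode("latin-1")
    parts = request_line.split()                # 3) process the request
    method, path = (parts + ["", ""])[:2]
    if method != "GET":
        status, body = "501 Not Implemented", b"unsupported method"
    elif path in PAGES:                         # 4) access the resource
        status, body = "200 OK", PAGES[path]
    else:
        status, body = "404 Not Found", b"not found"
    head = (f"HTTP/1.0 {status}\r\n"            # 5) construct the response
            f"Content-Type: text/html\r\n"
            f"Content-Length: {len(body)}\r\n\r\n")
    conn.sendall(head.encode("latin-1") + body)  # 6) send the response
    log.append(f'"{request_line}" {status.split()[0]} {len(body)}')  # 7) log
    conn.close()


def serve(host="127.0.0.1", port=8080):
    """Accept clients one at a time; host and port are arbitrary choices."""
    log = []
    with socket.create_server((host, port)) as srv:  # 1) set up TCP sessions
        while True:
            conn, _addr = srv.accept()  # an unwanted client could be closed here
            handle_connection(conn, log)
```

Calling serve() blocks and answers one client at a time, which is exactly the limitation that the later steps address with threads and processes.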
                    Step 2
• Receive Input Messages
• This step leads to multiple levels of software
  complexity – in multithreading and
  multiprocessing – to achieve very high rates of
  parsing
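
As a rough illustration of this step, the sketch below parses raw request messages and hands a batch of them to a small pool of worker threads. The function names and the pool size are assumptions. In CPython, threads help most while workers wait on the network, so real servers typically combine threads with multiple processes or event loops.

```python
from concurrent.futures import ThreadPoolExecutor


def parse_request(raw):
    """Split raw request bytes into (method, target, version, headers)."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("latin-1")
    lines = head.split("\r\n")
    method, target, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, target, version, headers


def parse_all(raw_requests, workers=4):
    """Parse a batch of requests on a worker pool, preserving input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(parse_request, raw_requests))
```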
                   Step 4
• Mapping and accessing resources
• Starting from the server’s DocRoot, users can
  explore the content of the DocRoot, within
  the security and privacy framework
• No backing “Up” and out of the DocRoot into
  the server’s files
• Resources may also include other resources –
  this is the beginning point for Server Side
  Includes - SSI
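
One way to enforce the rule against backing up out of the DocRoot is to resolve the requested path and check that it still lies under the root. This is a minimal sketch; the function name resolve is hypothetical, and a production server would also apply its access-control rules.

```python
import os


def resolve(doc_root, url_path):
    """Map a URL path to a file under doc_root, or return None if it escapes."""
    root = os.path.realpath(doc_root)
    candidate = os.path.realpath(os.path.join(root, url_path.lstrip("/")))
    if os.path.commonpath([root, candidate]) != root:
        return None  # e.g. "/../etc/passwd" backs up out of the DocRoot
    return candidate
```

Resolving with realpath before comparing also catches symbolic links that point outside the DocRoot.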
          Server Side Includes
• As it parses a page, the server recognizes commands
  that include other files in the output
• This merges well with the HTML frame object,
  which can include another page on top of the
  current page
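
A sketch of how include processing might work, assuming the Apache-style directive syntax <!--#include file="..." --> (the slides do not name a syntax). The expand_includes function and the read_file callback are hypothetical names, and the depth limit is an arbitrary guard against files that include each other.

```python
import re

# Apache-style include directive; assumed here, not taken from the slides.
INCLUDE = re.compile(r'<!--#include\s+file="([^"]+)"\s*-->')


def expand_includes(page, read_file, depth=0, max_depth=5):
    """Replace each include directive with the (recursively expanded) file text."""
    if depth > max_depth:
        raise RecursionError("include nesting too deep")

    def substitute(match):
        included = read_file(match.group(1))
        return expand_includes(included, read_file, depth + 1, max_depth)

    return INCLUDE.sub(substitute, page)
```

Passing read_file as a parameter keeps the sketch independent of the file system; a real server would read the named file from under its DocRoot.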
      Static & Dynamic Content
• Static Pages remain relatively constant as time
  progresses
• Dynamic pages are generated or modified at request
  time, so parts can be added and deleted
                          CGI -1993
• In 1993, the World Wide Web (WWW) was small but booming.
  WWW software developers and web site developers kept in touch
  on the www-talk mailing list, so it was there that a standard for
  calling command line executables was agreed upon. Specifically
  mentioned in the CGI spec are the following contributors:
• Rob McCool (author of the NCSA HTTPd web server)
• John Franks (author of the GN web server)
• Ari Luotonen (the developer of the CERN httpd web server)
• Tony Sanders (author of the Plexus web server)
• George Phillips (web server maintainer at the University of British
  Columbia)
• Rob McCool drafted the initial specification, and NCSA still hosts it.
  It was swiftly implemented in many servers.
                                   CGI
• The Common Gateway Interface (CGI) is a standard for interfacing external
  applications with information servers, such as HTTP or Web servers. A
  plain HTML document that the Web daemon retrieves is static, which
  means it exists in a constant state: a text file that doesn't change. A CGI
  program, on the other hand, is executed in real-time, so that it can
  output dynamic information. For example, let's say that you wanted to
  "hook up" your Unix database to the World Wide Web, to allow people
  from all over the world to query it. Basically, you need to create a CGI
  program that the Web daemon will execute to transmit information to the
  database engine, and receive the results back again and display them to
  the client. This is an example of a gateway, and this is where CGI, currently
  version 1.1, got its origins.
• The database example is a simple idea, but most of the time rather
  difficult to implement. There really is no limit as to what you can hook up
  to the Web. The only thing you need to remember is that whatever your
  CGI program does, it should not take too long to process. Otherwise, the
  user will just be staring at their browser waiting for something to happen.
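
A minimal sketch of a CGI program in Python. Under CGI the server passes the query string in the QUERY_STRING environment variable and reads the response (headers, a blank line, then the body) from the program's standard output. The name parameter and the greeting are made up for illustration; a real gateway would query the database at the point where the body is built.

```python
import html
import os
from urllib.parse import parse_qs


def cgi_response(environ):
    """Build the full CGI output (headers, blank line, body) for one request."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    name = query.get("name", ["world"])[0]   # hypothetical request parameter
    body = f"<html><body><p>Hello, {html.escape(name)}!</p></body></html>"
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    # The web server sets the environment and captures what is printed here.
    print(cgi_response(os.environ), end="")
```

Escaping the user-supplied value before echoing it keeps the generated page from carrying injected markup.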
                           SSL
• The number of SSL certificates found by Netcraft's SSL
  Survey that are within their validity period, have a
  common name that matches the hostname and are
  issued by a widely trusted third party, has now
  exceeded one million.
• Netcraft's first SSL Survey in November 1996 found a
  total of only 3,283 certificates around the globe. It took
  nearly a year for this total to grow to ten thousand, and
  it wasn't until August 2000 that the total exceeded a
  hundred thousand. In comparison, the past year has
  seen an average growth of more than 18,000
  certificates per month.
                SSL in Detail
• SSL was an invention of Netscape, which was
  later acquired by AOL
• Very important rules for SSL
1. Server Authentication – not phony
2. Client Authentication
3. Integrity – data is safe
4. Encryption – data is encrypted
5. Efficiency
6. Ubiquity
                SSL in Detail
7. Administrative Scalability
8. Adaptability (supports the best security
   needs of the day)
9. Social Viability – meets the cultural and
   political needs of the society
    The art and science of secret coding
•   Ciphers
•   Public Key
•   Signatures
•   All wrapped up in a Certificate
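
A short sketch of how a client leans on certificates for server authentication, using Python's standard ssl module. Current libraries negotiate TLS, the successor to SSL. The function names are hypothetical, and connection_info needs network access, so it is shown only as a usage pattern.

```python
import socket
import ssl


def make_client_context():
    """Return a client context with the library's default checks enabled."""
    # Loads the system's trusted CA certificates; by default the server's
    # certificate chain must verify and its name must match the hostname.
    return ssl.create_default_context()


def connection_info(host, port=443):
    """Connect to host and return (protocol version, cipher name)."""
    ctx = make_client_context()
    with socket.create_connection((host, port)) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.version(), tls.cipher()[0]
```

If the certificate is expired, untrusted, or issued for a different hostname, wrap_socket raises an error instead of returning a connection.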
                     JavaScript
• JavaScript was originally developed by Brendan Eich of
  Netscape under the name Mocha, which was later
  renamed to LiveScript, and finally to JavaScript.[5] The
  change of name from LiveScript to JavaScript roughly
  coincided with Netscape adding support for Java
  technology in its Netscape Navigator web browser.
  JavaScript was first introduced and deployed in the
  Netscape browser version 2.0B3 in December 1995.
  The naming has caused confusion, giving the
  impression that the language is a spin-off of Java, and it
  has been characterized by many as a marketing ploy by
  Netscape to give JavaScript the cachet of what was
  then the hot new web-programming language.[6][7]
Web Services
                      Log Files
• Most Web servers offer the option to store logfiles in
  either the common log format or a proprietary format.
  The common log file format is supported by the
  majority of analysis tools but the information about
  each server transaction is fixed. In many cases it is
  desirable to record more information. Sites sensitive to
  personal data issues may wish to omit the recording of
  certain data. In addition ambiguities arise in analyzing
  the common log file format since field separator
  characters may in some cases occur within fields. The
  extended log file format is designed to meet the
  following needs:
                  Log Files
• Permit control over the data recorded.
• Support needs of proxies, clients and servers
  in a common format
• Provide robust handling of character escaping
  issues
• Allow exchange of demographic data.
• Allow summary data to be expressed.
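
For comparison, a sketch of parsing one line in the common log format with a regular expression. The sample line in the usage note is an assumption in the style of typical server documentation. The request field is matched greedily because, as noted above, separator characters such as quotes can occur inside a field.

```python
import re

# host ident user [time] "request" status size
CLF = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>.*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)$'
)


def parse_clf(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = CLF.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    return record
```

For example, parse_clf('127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326') yields a record whose status is 200 and whose size is 2326.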

				