Dynamic-Content Web Caching using Cooperative Proxy Scheme by sofiaie


									Dynamic-Content Web Caching
  Cooperative Proxy Scheme

          Βελισκάκης Μανώλης
          National Technical University of Athens
          Dept. of Electrical & Computer Engineering
          Knowledge and Database Systems Laboratory

          DBLAB Meeting
          Tuesday, 20 January 2004

 Problem Definition
 Dynamic-Data Web Caching vs
  Cooperative Schemes
 Proposed Web Caching Algorithm
 Current and Future Work
 Discussion
   Problem Definition – What?

Query Results

Dynamic Data for personalization purposes
    Problem Definition – Where?
 Client
 Proxy
 Edge-of-net
 Internet Service Provider
 Edge-of-Enterprise
 Application Server
 Web Server
        Problem Definition – How?
Current Approaches

   Exact matching query
   Materialized Views
   DB Characteristics to Proxies
    Problem Definition – Topology

 Broadcast queries
 Hierarchical Caching
 URL Hashing
 Directory based Cooperation
    Problem Definition - Issues

 Replacement Policy
 Cache Consistency
 Proxy Communication
 Web objects placement
              Dynamic-Data Web Caching
               vs Cooperative Schemes

   Dynamic-Data Web Caching                 Cooperative Schemes
    Exact matching query                     Broadcast queries
    Materialized Views                       Hierarchical Caching
    DB Characteristics to Proxies            URL Hashing
                                             Directory based

                       Replacement Policy
                       Proxy Communication
                       Web objects placement
                 Dynamic-Data Web Caching
                  vs Cooperative Schemes

Conclusions (?)

   Exact Matching Query
     – Common Web Caching Issues
     – Not interesting
   DB Characteristics to Proxies
     – Common DB Replication Issues
     – Interesting Issue: Create Cache Tables knowing that there is a
       cooperative proxy Scheme
               Dynamic-Data Web Caching
                vs Cooperative Schemes

Conclusions (?)
   Materialized Views
    – Many interesting issues
         Query rewriting
         Replacement Algorithm
         Appropriate Cooperative Scheme
         Web Objects exchange between Proxies
         Consideration of DBMS structure
         Dynamic or a priori definition of Materialized Views
         Giving DB capabilities to Proxies (queries on Materialized Views)
         Communication between Proxies
         Proposed Web Caching Algorithm –
    Hybrid Topology (Hierarchical-Directory Based)
    [Figure: hybrid topology. CLIENTS connect to PROXY 1a and PROXY 2a.
     PROXY 1b and PROXY 1c sit below PROXY 1a; PROXY 2b and PROXY 2c sit
     below PROXY 2a. Every proxy contains a Q.M, a CACHE and a DIRECTORY.
     Each group of proxies stands in front of a WEB SERVER backed by a
     DATABASE SERVER.]
          Proposed Web Caching Algorithm –
               Web Objects description

 There are 3 different ways to refer to a
 Web Object
  – URL
  – QTag
  – QTag+Query Result (Whole Web Object)
                 Proposed Web Caching Algorithm –
                      Web Objects description


  ID:Number,                       //Unique identifier for every QTag
  Query:String,                    //Contains the query that was sent to the Back-End
  LocationOfWebServer:URL,         //Contains the URL of the Web Server that stands
                                   in front of the Database
  DatabaseID:Number,               //Contains the ID of the Database where the query was
  TimeToLive:Number (sec),         //Determines the period during which the query is valid and can
                                   satisfy Requests
  Weight:Number,                   //Determines the significance (Weight) of the query
  Relationships:List of QTag.ID    //Determines a list of Web Objects that are
                                   frequently used with the current Web Object in
                                   order to satisfy query requests
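
The QTag fields listed above could be held in a small record type. The following is only a sketch in Python: the field names follow the slide, while `created_at` and `is_valid` are assumptions added here so that TimeToLive can be checked; they do not appear in the original.

```python
from dataclasses import dataclass, field
from typing import List, Optional
import time


@dataclass
class QTag:
    """Descriptor of a cached query result, mirroring the slide's field list."""
    id: int                      # unique identifier of the QTag
    query: str                   # query that was sent to the back end
    location_of_web_server: str  # URL of the web server in front of the database
    database_id: int             # ID of the database
    time_to_live: float          # validity period, in seconds
    weight: float = 0.0          # significance of the query
    relationships: List[int] = field(default_factory=list)  # IDs of related QTags
    # Assumption (not in the slides): creation timestamp used to test validity.
    created_at: float = field(default_factory=time.time)

    def is_valid(self, now: Optional[float] = None) -> bool:
        """True while the QTag is still within its TimeToLive window."""
        current = time.time() if now is None else now
        return current - self.created_at <= self.time_to_live
```

A proxy would consult `is_valid()` before answering a request from the cached result; how expiry is actually tracked in the proposed scheme is not stated in the slides.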
              Proposed Web Caching Algorithm –
                   Web Objects description
                         QTAG + Query Results

        Query=”Select name, surname, age from Customers where
        LocationOfWebServer =”http://www.dblab.ece.ntua.gr/siteNo1”,
        DatabaseID =1,
        Relationships=”15433456, 15433766, 15682456, 15432456
John, Manolopoulos, 28
John, Nikolaidis, 35
                              Query Result
                 Proposed Web Caching Algorithm –
                          Proxy Structure

    [Figure: PROXY STRUCTURE. Components: URL/QTag TRANSFORMER, QUERY
     REWRITER, CALCULATOR, CACHE DIRECTORY and MAIN CACHE. The proxy is
     linked to the REST OF COOPERATIVE SCHEME.]
             Proposed Web Caching Algorithm –
                     Proxy Structure –
                  URL/QTag Transformer

   Proxies manipulate Web Objects (Query Results) through
    their <QTags>
   Extracts from a Web Object’s URL the
     – Query (knowing the CGI that produces the Query)
     – LocationOfWebServer
     – DatabaseID
   1-1 correspondence between URLs and QTags
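
The extraction step above might look like the sketch below. It assumes a hypothetical URL layout in which the CGI script receives the query in a parameter named `q` and the database ID in a parameter named `db`; neither the parameter names nor the function name come from the slides.

```python
from urllib.parse import urlsplit, parse_qs


def url_to_qtag_fields(url):
    """Derive the QTag fields that can be read directly from a request URL.

    Assumes (hypothetically) URLs of the form
    http://host/path/script.cgi?db=<id>&q=<query>.
    """
    parts = urlsplit(url)
    params = parse_qs(parts.query)  # decodes percent-escapes and '+'
    return {
        "Query": params["q"][0],
        "LocationOfWebServer": f"{parts.scheme}://{parts.netloc}{parts.path}",
        "DatabaseID": int(params["db"][0]),
    }
```

Because the mapping is a pure function of the URL, the same URL always yields the same fields, which is one simple way to keep the stated 1-1 correspondence between URLs and QTags.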
               Proposed Web Caching Algorithm –
                       Proxy Structure –
                        Query Rewriter

   Rewrites the requested Web Objects (Queries) when no exact match
    of the requested query is cached but the request can be
    satisfied from other already cached Web Objects (queries).

   The Query Rewriter will follow standard query-rewriting methods and
    techniques that are already used in database systems and
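
As a toy illustration of answering a request from a cached, broader result, the sketch below handles only one kind of predicate, a lower bound on `age`. The cache layout, the function name and the `age >= 20` cached query are assumptions made for this example; the rewriting techniques the slides refer to are far more general.

```python
def answer_from_cache(cache, table, min_age):
    """Answer 'rows of table with age >= min_age' from a broader cached result.

    cache maps (table, cached_min_age) -> list of row dicts. A cached entry
    can serve the request when its bound is no stricter than the requested one.
    Returns None when no cached entry contains the requested rows.
    """
    for (cached_table, cached_min), rows in cache.items():
        if cached_table == table and cached_min <= min_age:
            # The cached result is a superset: filter it down locally.
            return [row for row in rows if row["age"] >= min_age]
    return None


# Cached result of a hypothetical query "... from Customers where age >= 20".
cache = {
    ("Customers", 20): [
        {"name": "John", "surname": "Manolopoulos", "age": 28},
        {"name": "John", "surname": "Nikolaidis", "age": 35},
    ]
}
older = answer_from_cache(cache, "Customers", 30)    # served from the cache
younger = answer_from_cache(cache, "Customers", 10)  # not contained: None
```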
                 Proposed Web Caching Algorithm –
                         Proxy Structure –
                         Weight Calculator
Every web object is characterized by a Weight W, which is determined by the
following factors:

S               (the web object’s size)

Πs              (the percentage influence of factor S on the Weight)

CS              (the web object’s retrieval cost)

Πcs             (the percentage influence of factor CS on the Weight)
P               (the web object’s popularity)

Πp              (the percentage influence of factor P on the Weight)
R               (the web object’s significance with respect to its relationships)

Πr              (the percentage influence of factor R on the Weight)
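
The slides name the factors and their influence percentages but do not give the formula that combines them. One plausible reading, shown here purely as an assumption, is a weighted sum W = Πs·S + Πcs·CS + Πp·P + Πr·R with each factor normalized to [0, 1] and the percentages summing to 1.

```python
def weight(s, cs, p, r, pi_s, pi_cs, pi_p, pi_r):
    """Combine the four factors into a single Weight W.

    Assumption (not from the slides): W is a weighted sum, the factors
    s, cs, p, r are already normalized to [0, 1], and the influence
    percentages are expressed as fractions that sum to 1.
    """
    total = pi_s + pi_cs + pi_p + pi_r
    if abs(total - 1.0) > 1e-9:
        raise ValueError("influence percentages must sum to 1")
    return pi_s * s + pi_cs * cs + pi_p * p + pi_r * r


# Example: retrieval cost and popularity dominate the weight.
w = weight(s=0.2, cs=0.8, p=0.5, r=0.1,
           pi_s=0.1, pi_cs=0.4, pi_p=0.4, pi_r=0.1)
```

Whether a large size should raise or lower the weight depends on how S is normalized, which the slides leave open.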
                 Proposed Web Caching Algorithm

 The Request arrives at a Proxy
 The URL/QTag Transformer finds the QTag that best describes the incoming URL
 The QTag is sent to the Query Rewriter
 The Query Rewriter rewrites the Query and produces Sub-QTags
 The Query Rewriter asks the Cache Directory if any of these Sub-QTags is
  already cached in the Main Cache
   – None of the Sub-QTags are cached: send the request to the Web Server
     and cache the response
   – All the Sub-QTags are cached
   – Some of the Sub-QTags are cached: the Query Rewriter asks the
     Cooperative-scheme Directory if the rest of the Sub-QTags are cached
     in other Proxies
        ALL of the rest of the Sub-QTags are cached in other Proxies: the
        Proxy retrieves the cached Web Objects from the other Proxies and
        sends them to the Query Rewriter; the Proxy caches the retrieved
        Web Objects locally
        Not all of the rest of the Sub-QTags are cached in other Proxies
 The Query Rewriter retrieves the locally cached Web Objects
 The Query Rewriter combines the Sub-QTags and the Proxy sends the response
 The Weight Calculator refreshes the Weight value and parameters of the
  Sub-QTags
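
The decision flow of the flowchart can be sketched as one function. Everything here is illustrative: caches are plain dictionaries keyed by Sub-QTag, `fetch_from_server` is a hypothetical callback standing in for the Web Server, and the fallback to the server when other proxies hold only part of the missing Sub-QTags is an assumption, since the slides do not show that branch's outcome. The weight refresh step is left out.

```python
def handle_request(sub_qtags, local_cache, peer_caches, fetch_from_server):
    """Resolve a list of Sub-QTag keys; return (results, source).

    source is "local", "peers" or "server", naming where the missing
    Sub-QTags (if any) were obtained.
    """
    missing = [q for q in sub_qtags if q not in local_cache]

    if not missing:
        source = "local"                       # all Sub-QTags cached locally
    elif len(missing) == len(sub_qtags):
        for q in missing:                      # none cached: go to the server
            local_cache[q] = fetch_from_server(q)
        source = "server"
    else:
        # Some cached locally: look for the rest in the cooperating proxies.
        found = {}
        for q in missing:
            for peer in peer_caches:
                if q in peer:
                    found[q] = peer[q]
                    break
        if len(found) == len(missing):
            local_cache.update(found)          # cache retrieved objects locally
            source = "peers"
        else:
            # Assumption: fetch every missing Sub-QTag from the server.
            for q in missing:
                local_cache[q] = fetch_from_server(q)
            source = "server"

    # Combining the Sub-QTag results is reduced to returning them in order.
    return [local_cache[q] for q in sub_qtags], source
```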
     Current and Future Work

 Study and Testing the proposed new
 Definition of Workload
 Better Definition and Testing of the
  proposed Algorithm

 Efficiency of Testing Tools (Simulator)
 Ideas for efficient Web Caching for
 Comments
Thank You
