					Memento: Wayback Prototype



                             The Memento Team

                           Herbert Van de Sompel
                             Michael L. Nelson
                             Robert Sanderson
                            Lyudmila Balakireva
                              Scott Ainsworth
                              Harihar Shankar


                        Memento is partially funded by the
                             Library of Congress

http://www.mementoweb.org
     Memento: Wayback Prototype
        Lyudmila Balakireva
                          Prototype Overview



•  Wayback processes each request in three internal steps:

    •  Request Parser
    •  Query or Results Capture
    •  Replay

•  The Memento prototype implements each of these steps.

•  Baseline Code:
     •  Open Source version from SourceForge:
        wayback-1.5.0-SNAPSHOT from October 2009




                                                

                   Handling Memento: Access Point

Configuration:

A new access point had to be configured in the wayback.xml
configuration file:

     80:memento
       For TimeGate and Memento support




<bean name="80:memento" class="org.archive.wayback.webapp.AccessPoint">
  <property name="configs">
    <props>
      <prop key="aggregationPrefix">http://mementoarchive.lanl.gov/store/wayback-1.5.0-SNAPSHOT/ore/</prop>
    </props>
  </property>
  <property name="collection" ref="localbdbcollection" />
  <property name="replay" ref="mementourlreplay" />
  <property name="query">
    <bean class="org.archive.wayback.query.Renderer">
      <property name="captureJsp" value="/WEB-INF/query/TimeGate.jsp" />
    </bean>
  </property>
  <property name="uriConverter">
    <bean class="org.archive.wayback.archivalurl.ArchivalUrlResultURIConverter">
      <property name="replayURIPrefix"
                value="http://mementoarchive.lanl.gov/store/wayback-1.5.0-SNAPSHOT/memento/" />
    </bean>
  </property>
  <property name="parser">
    <bean class="org.archive.wayback.archivalurl.ArchivalUrlRequestParser">
      <property name="maxRecords" value="1000" />
      <property name="earliestTimestamp" value="2000" />
    </bean>
  </property>
</bean>
                                                

                 Handling Memento: Request Parser

Parser Implementation: TimeGateParser.java

Process Steps:

   •  Examines the X-Accept-Datetime header for a datetime

   •  Checks that the datetime format is valid

   •  Checks whether the request URI is a TimeGate

   •  Sets the AnchorDate of the WaybackRequest object to the datetime
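
The first three steps above can be sketched as follows. This is a minimal illustration, not the prototype's TimeGateParser.java: the class name TimeGateRequestSketch, its method names, and the "/timegate/" path marker (taken from the example URI later in these slides) are assumptions, and it uses java.time rather than whatever date handling the 2009 code base relied on.

```java
import java.time.Instant;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

/** Hypothetical helper illustrating the parser steps; not the prototype's code. */
final class TimeGateRequestSketch {
    // Assumed marker, based on URIs of the form <baseurl>/memento/timegate/<original-uri>.
    static final String TIMEGATE_MARKER = "/timegate/";

    /** Returns the parsed datetime, or null if the header is missing or malformed. */
    static Instant parseAcceptDatetime(String headerValue) {
        if (headerValue == null) {
            return null;
        }
        try {
            // Accepts RFC 1123 dates such as "Mon, 02 Nov 2009 13:22:59 GMT".
            return ZonedDateTime
                    .parse(headerValue.trim(), DateTimeFormatter.RFC_1123_DATE_TIME)
                    .toInstant();
        } catch (DateTimeParseException e) {
            return null;
        }
    }

    /** True if the request path addresses the TimeGate rather than a Memento. */
    static boolean isTimeGate(String requestUri) {
        return requestUri != null && requestUri.contains(TIMEGATE_MARKER);
    }

    /** Extracts the original URI that follows the TimeGate marker, or null. */
    static String originalUri(String requestUri) {
        if (!isTimeGate(requestUri)) {
            return null;
        }
        int start = requestUri.indexOf(TIMEGATE_MARKER) + TIMEGATE_MARKER.length();
        return requestUri.substring(start);
    }
}
```

A parser built along these lines would then hand the resulting datetime to the request object as its anchor date; that last step depends on the Wayback API and is left out here.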




                 Handling Memento: Capture Module


Capturing Module Implementation: TimeGate.jsp

Process Steps:

   •  If datetime in archiving range:
             •  Generate 302 response
   •  If datetime not in archiving range:
             •  Generate 406 response

   •  Set headers:
        •  Vary:       negotiate, X-Accept-Datetime
        •  TCN:        choice or list
        •  Alternates: original, first, last, next, prev
        •  Location:   nearest Memento
        •  Link:       to TimeBundle aggregation
                       (passed as param from config)
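
The decision logic above might look roughly like the sketch below. It is not TimeGate.jsp: the class and method names are hypothetical, "archiving range" is assumed to mean the interval between the first and last capture, and pairing "TCN: list" with the 406 response is an assumption based on how transparent content negotiation normally labels list responses.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch of the TimeGate decision; not the prototype's JSP. */
final class TimeGateDecisionSketch {

    /** 302 if the requested datetime lies between the first and last capture, else 406. */
    static int statusFor(Instant requested, List<Instant> captures) {
        if (captures.isEmpty()) {
            return 406;
        }
        Instant first = Collections.min(captures);
        Instant last = Collections.max(captures);
        boolean inRange = !requested.isBefore(first) && !requested.isAfter(last);
        return inRange ? 302 : 406;
    }

    /** Assumed mapping: a redirect is a "choice" response, a 406 is a "list" response. */
    static String tcnFor(int status) {
        return status == 302 ? "choice" : "list";
    }

    /** Capture closest in time to the requested datetime; earlier capture wins a tie. */
    static Instant nearest(Instant requested, List<Instant> captures) {
        if (captures.isEmpty()) {
            throw new IllegalArgumentException("no captures");
        }
        Instant best = null;
        Duration bestDistance = null;
        for (Instant capture : captures) {
            Duration distance = Duration.between(requested, capture).abs();
            if (bestDistance == null || distance.compareTo(bestDistance) < 0) {
                best = capture;
                bestDistance = distance;
            }
        }
        return best;
    }
}
```

The Location header would then point at the Memento URI for the capture returned by nearest.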
                    Handling Memento: Capture Module


Example, with (PATH) = mementoarchive.lanl.gov/store/wayback-1.5.0-SNAPSHOT:

$ curl -i --header X-Accept-Datetime:'Mon, 02 Nov 2009 13:22:59 GMT' \
       http://(PATH)/memento/timegate/http://news.bbc.co.uk/

HTTP/1.1 302 Moved Temporarily
Date: Mon, 25 Jan 2010 20:28:38 GMT
Set-Cookie: JSESSIONID=F4B8806F9045FE4A6232D4CA67CBCDC4;
     Path=/wayback-1.5.0-SNAPSHOT
Vary: negotiate,X-Accept-Datetime
Link: <http://(PATH)/ore/timebundle/http://news.bbc.co.uk/>;rel="aggregation"
Alternates: {"http://news.bbc.co.uk/" 1.0 {dt original}},
     {"http://(PATH)/memento/20091007230049/http://news.bbc.co.uk/" 1.0
      {dt "Wed, 07 Oct 2009 23:00:49 GMT" first}},
     {"http://(PATH)/memento/20091102172725/http://news.bbc.co.uk/" 1.0
      {dt "Mon, 02 Nov 2009 17:27:25 GMT" last}},…
TCN: choice
Location: http://(PATH)/memento/20091102172725/http://news.bbc.co.uk/
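
The Alternates entries in the response above follow a regular shape, which the sketch below reproduces. The class and method names are hypothetical, the entry layout is inferred only from this example output, and the 14-digit Wayback timestamps are assumed to be in UTC.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

/** Hypothetical builder for Alternates entries shaped like the example response. */
final class AlternatesSketch {
    // 14-digit Wayback timestamp, e.g. 20091007230049 (assumed to be UTC).
    private static final DateTimeFormatter WAYBACK_TS =
            DateTimeFormatter.ofPattern("yyyyMMddHHmmss");
    // HTTP-style date with a zero-padded day, as printed in the example.
    private static final DateTimeFormatter HTTP_DATE =
            DateTimeFormatter.ofPattern("EEE, dd MMM yyyy HH:mm:ss 'GMT'", Locale.US);

    /** Converts a 14-digit timestamp to e.g. "Wed, 07 Oct 2009 23:00:49 GMT". */
    static String httpDate(String timestamp14) {
        return LocalDateTime.parse(timestamp14, WAYBACK_TS).format(HTTP_DATE);
    }

    /** Entry for the original resource, which carries no datetime. */
    static String originalEntry(String originalUri) {
        return "{\"" + originalUri + "\" 1.0 {dt original}}";
    }

    /** Entry for one Memento; rel is e.g. first, last, next or prev. */
    static String mementoEntry(String replayPrefix, String timestamp14,
                               String originalUri, String rel) {
        return "{\"" + replayPrefix + timestamp14 + "/" + originalUri + "\" 1.0 {dt \""
                + httpDate(timestamp14) + "\" " + rel + "}}";
    }
}
```

Joining originalEntry and the mementoEntry results with commas gives a header value of the form shown in the example; replayPrefix corresponds to the replayURIPrefix configured for the access point.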



                      Handling Memento: Replay 


Served at <baseurl>/memento

•  Process Steps (100% Memento approach; alternatives are possible):

   •  Do not rewrite URLs to point at the Wayback base

   •  Add <base href="original">
           so that relative references on the page (e.g. images) resolve
           against the original URI

   •  Include MementoHeaders.jsp, which appends the Memento headers:

       •  X-Content-Datetime: Datetime of Memento
               Note: Identical to X-Archive-Orig-Date?

       •  Alternates: URIs for original, first, last, next and previous
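
As an illustration of the <base> step, the sketch below inserts the element right after the opening <head> tag of an archived page. It is an assumption about how such an insertion could be done, not the prototype's replay code; the class name is hypothetical, and prepending the tag when no <head> is found is an arbitrary fallback.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch of adding <base href> during replay; not the prototype's code. */
final class BaseHrefSketch {
    // Matches <head> or <head ...attributes>, but not <header>.
    private static final Pattern HEAD_OPEN =
            Pattern.compile("<head(\\s[^>]*)?>", Pattern.CASE_INSENSITIVE);

    /** Returns the HTML with a <base> element pointing at the original URI. */
    static String addBase(String html, String originalUri) {
        // Escape the characters that would break out of the attribute value.
        String href = originalUri.replace("&", "&amp;").replace("\"", "&quot;");
        String tag = "<base href=\"" + href + "\">";
        Matcher m = HEAD_OPEN.matcher(html);
        if (m.find()) {
            return html.substring(0, m.end()) + tag + html.substring(m.end());
        }
        // Fallback for pages without a <head>: put the tag at the very start.
        return tag + html;
    }
}
```

With the base in place, a relative reference such as an image path is resolved by the browser against the original site rather than against the archive.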

                         Handling Memento: Replay 


Example, with (PATH) = mementoarchive.lanl.gov/store/wayback-1.5.0-SNAPSHOT:

GET http://(PATH)/memento/20091102172725/http://news.bbc.co.uk/

HTTP/1.1 200 OK
Date: Mon, 25 Jan 2010 20:34:07 GMT
Set-Cookie: JSESSIONID=B35887B23D099DE0246A2490DC32083A;
      Path=/wayback-1.5.0-SNAPSHOT
X-Content-Datetime: Mon, 02 Nov 2009 17:27:25 GMT
Alternates: {"http://news.bbc.co.uk/" 1.0 {dt original}},
      {"http://(PATH)/memento/20091007230049/http://news.bbc.co.uk/" 1.0
       {dt "Wed, 07 Oct 2009 23:00:49 GMT" first}},
      {"http://(PATH)/memento/20091102172725/http://news.bbc.co.uk/" 1.0
       {dt "Mon, 02 Nov 2009 17:27:25 GMT" last}}, …
Link: <http://(PATH)/ore/timebundle/http://news.bbc.co.uk/>;rel="aggregation"
X-Archive-Orig-Expires: Mon, 02 Nov 2009 17:27:25 GMT
X-Wayback-Guessed-Charset: iso-8859-1
X-Archive-Orig-Date: Mon, 02 Nov 2009 17:27:25 GMT
X-Archive-Orig-Content-Type: text/html
Content-Type: text/html;charset=iso-8859-1
Content-Length: 107727
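
In this example the X-Content-Datetime value matches the 14-digit timestamp embedded in the Memento URI. The sketch below shows that relationship by splitting a Memento URI and deriving the datetime from it. The class name, the regular expression, and the assumption that the timestamp is UTC are illustrative only; the prototype may obtain the datetime from the capture record instead.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch relating a Memento URI to its X-Content-Datetime value. */
final class MementoUriSketch {
    // <anything>/memento/<14-digit timestamp>/<original URI>
    private static final Pattern MEMENTO_URI =
            Pattern.compile("/memento/(\\d{14})/(.+)$");
    private static final DateTimeFormatter WAYBACK_TS =
            DateTimeFormatter.ofPattern("yyyyMMddHHmmss");
    private static final DateTimeFormatter HTTP_DATE =
            DateTimeFormatter.ofPattern("EEE, dd MMM yyyy HH:mm:ss 'GMT'", Locale.US);

    /** Returns {timestamp, originalUri}, or null if the URI is not a Memento URI. */
    static String[] split(String mementoUri) {
        Matcher m = MEMENTO_URI.matcher(mementoUri);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    /** Datetime header value implied by the URI timestamp (assumed UTC), or null. */
    static String contentDatetime(String mementoUri) {
        String[] parts = split(mementoUri);
        if (parts == null) {
            return null;
        }
        return LocalDateTime.parse(parts[0], WAYBACK_TS).format(HTTP_DATE);
    }
}
```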

                           Conclusions


•  The Memento framework can be incorporated into the existing Wayback
workflow.

•  Various Replay approaches are possible.
     •  We implemented the 100% Memento approach (no URL rewriting)
           •  Requires:
                •  Original servers to implement Memento redirects;
                •  And intelligent Memento support by clients.
     •  Alternative approaches are possible and need to be discussed.


                            Thank You


