					                                    Project Number: 032518




                        QVIZ
         Query and context based visualization of time-spatial cultural dynamics
                            Specific Targeted Research Project
                             Information Society Technologies




  Toolkit architecture report version 2
                 D 5.2
                            Due date of deliverable: 29/02/2008
                            Actual submission date: 28/04/2008




Start date of project: 01/05/2006                                    Duration: 24 months
Umeå University                                                                    Final
                                          QVIZ Toolkit architecture report 2008-04-28




Abstract
Number and name     Project Number: 032518
                    QVIZ, Query and context based visualization of time-spatial
                    cultural dynamics

Work Package        WP 5 & WP 6

Task                D 5.2, D6.1, D6.2

Date of delivery    Contractual: 29/02/2008          Actual: 28/04/2008

Code name           032518                           Version     final

Nature              Report

Distribution Type   Public

Authors (Partner)   Laura Albornos, lam268@tid.es
                    Mikael Berglund, mikael.berglund@ladok.umu.se
                    Kalev Koppel, kalev.koppel@regio.ee
                    Vojtech Kupca, vojtech.kupca@humlab.umu.se
                    Johan Lindskog, johan.lindskog@humlab.umu.se
                    Bob Mulrenin, bob.mulrenin@salzburgresearch.at
                    Fredrik Palm, fredrik.palm@humlab.umu.se
                    Erik Uus, erik.uus@ra.ee
                    Alexander von Lünen, alex.von-luenen@port.ac.uk

Contact Person      Mikael Berglund, mikael.berglund@ladok.umu.se

Abstract            Toolkit architecture report version 2 for the QVIZ project,
                    co-funded by the ICT Research Framework Programme of the
                    European Union. The report covers the component structure
                    and the general architecture.

Keywords List       Architecture, Components, Functionalities, Interface, Protocols,
                    Interactions, Technologies, Tools, Framework, Platforms, User
                    Requirement, User scenario








Table of contents
1. EXECUTIVE SUMMARY
2. INTRODUCTION
    2.1 GENERAL ARCHITECTURE
3. QUERY VISUALIZATION ENVIRONMENT
    3.1 DESCRIPTION OF COMPONENTS
    3.2 TIME SPATIAL CLIENT
        3.2.1 Map component architecture
            3.2.1.1 Map client and vector server communication in the time-spatial interface
        3.2.2 Map Component Features
        3.2.3 Time-bar Features
        3.2.4 Map legend
            3.2.4.1 Technical description of map legend tool
        3.2.5 History of the point tool
            3.2.5.1 Technical description of HoP tool
    3.3 TIME SPATIAL MIDDLEWARE
        3.3.1 Vector data generalization
        3.3.2 Optimization of vector loading by panning the map or changing the map time
            3.3.2.1 Time request optimization
        3.3.3 Calculation of thematic map classes
        3.3.4 System requirements of the time-spatial middleware
    3.4 FACETED QUERY COMPONENT
        3.4.1 FQC architecture
        3.4.2 Features
4. COLLABORATIVE ENVIRONMENT/KNOWLEDGE BUILDING TOOLS (CET)
    4.1 DESCRIPTION OF COMPONENTS AND TOOLS
        4.1.1 Collaborative Environment Activities, User Interfaces and detailed descriptions
        4.1.2 Collaborative Environment Portal
        4.1.3 Plug-ins
        4.1.4 Social Knowledge Content
        4.1.5 Knowledge Content Query Manager
        4.1.6 Collection Manager
        4.1.7 Communities and member management
        4.1.8 Publication Manager
        4.1.9 RDF based Repositories
            4.1.9.1 Overview RDF based Repositories
            4.1.9.2 Domain Ontology
        4.1.10 non-RDF based storage
        4.1.11 Single Sign On and User Management
    4.2 KNOWLEDGE CONTENT INTEGRATION
        4.2.1 Visualization / Presentation using SPARQL based Templates
        4.2.2 Archival Social bookmarking: User context, Archive and Community knowledge
        4.2.3 Component Integration via Archival Resources
        4.2.4 Simple Knowledge Organisation (SKOS)
            4.2.4.1 SKOS based Tagging Support in the Collaborative Environment
            4.2.4.2 SKOS Vocabulary Builder and Search Application
        4.2.5 Knowledge Content Export to Knowledge Content Object (KCO) aware Environments
            4.2.5.1 Knowledge Content Carrier Architecture








5. MIDDLEWARE
    5.1 CORE PORTAL COMPONENTS QVIZ USER MANAGEMENT
        5.1.1 User Management
    5.2 ARCHIVE RESOURCE HUB
        5.2.1 ARH Modules
        5.2.2 Bookmark module
        5.2.3 CET module
        5.2.4 CoP module
        5.2.5 Resource module
        5.2.6 User module
        5.2.7 Archive Abstraction Layer module
        5.2.8 SAS Module
            5.2.8.1 SAS Architecture
            5.2.8.2 Installation procedure for the SAS Module
6. BOOKMARKING CLIENT
    6.1 USER INTERFACE
    6.2 ARCHITECTURE AND DATA MODEL
    6.3 INSTALLATION
7. DATA IMPORT AND DATABASE CREATION
    7.1 TECHNICAL REQUIREMENTS FOR CONTENT PROVIDERS
    7.2 DATA ORIGINS
    7.3 DATA IMPORT
    7.4 TOOLS FOR INDEXING RESOURCES
        7.4.1 Context
        7.4.2 Tool support
        7.4.3 Data model
        7.4.4 SQL Queries
        7.4.5 User interface
        7.4.6 Summary
    7.5 QVIZ USER INTERFACE STORAGE (QUIS)
        7.5.1 QUIS Data Structure
        7.5.2 Examples of querying in the QUIS database
        7.5.3 Querying over quis_relation
    7.6 TOOLS FOR EDITING THE ADMINISTRATIVE ONTOLOGY
APPENDIX
    A. FACETED QUERY COMPONENT TECHNICAL DOCUMENTATION
    B. ARCHIVE ABSTRACTION LAYER API
        hook_whois
        hook_fetch
        hook_parse_id
        hook_get_arh_id
        hook_get_archive_ids
        hook_resource_save
        hook_resource_lookup_url
        hook_resource_object_url
        hook_resource_description_url
        hook_get_qviz_resource_level
        hook_install








1. Executive Summary
This report describes the architecture and technical implementation of the
components used in the second prototype implementation of the QVIZ platform,
called P2b_dev. Specifically, the Query Visualization Environment, Collaborative
Environment and the middleware components will be discussed. Furthermore, the
requirements and the technical solutions for data management and integrating
archives in the QVIZ project are specified.
The Query Visualization Environment helps the user search and retrieve large
amounts of information. The information consists of an administrative unit
ontology and its associated archival resources. The Faceted Query Component
displays facets, i.e. lists of categories, which can be ordered arbitrarily.
Selecting different facets and categories filters the search results, in this
case archival resources. By searching in the QVIZ system the user gets a single
access point to two national archives. In the archival portals users can create
social bookmarks of archival resources. These bookmarks can be used in the
Collaborative Environment, where users have access to functions that implement
collaborative knowledge building.
The prototype can be accessed through this webpage:
http://qviz.eu/TryQVIZ.php








2. Introduction
The Use Cases developed in WP2 gave the development team an approximate
picture of how the main QVIZ components should interact with one another and
with the users. These Use Cases, presented in D4.1.3 System Specification and
Requirement Report, laid the foundation for the understanding of the
implementation part of the project. Further refinement and prioritization of
the use cases was carried out on the consortium wiki, in Skype discussions and
at physical meetings hosted by the partners.
This report also includes the deliverables D 6.1 Integration tools social
software and knowledge content model and D 6.2 Tools for dynamic and
query-based visualization related to concept in time and space and
CH-resources. These deliverables are merged into the Toolkit architecture
report in order to provide a richer and more complete picture of the tools used
in the QVIZ system, as previously communicated and agreed with the Commission.

2.1 General architecture
As stated above, the QVIZ portal is composed of two integrated parts that are
visible to the users: the Query Visualization Environment and the Collaborative
Environment. These exist in conjunction with the accompanying middleware and
back-end services.

[Figure 1. General architecture. Labels in the diagram: Query Visualization
(Faceted Browser, Map Component, Time-Bar); Map Server; QUIS Database; Archive
Resource Hub; Archive Portal; Raster Maps; Ontology & Resources; Services;
Bookmarking Client; User Management; Users; Collaborative Environment; RDF
Store.]








3. Query Visualization Environment
Visualizing information is a central task in this project. With a faceted query
system the user can narrow down search results by selecting different facets.
The result of the search is visualized on a map and in a result list. If the
user selects facets describing different levels of administrative units, the
map can center on each selected unit, giving the user a broader picture of the
result. If the user has selected a facet as well as a time period, the time-bar
shows the changes in administrative units over that time period. The interface
consists of a faceted browser, a map and a time-bar, and is used for searching
and visualizing information.
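
The narrowing of search results by facets can be illustrated with a minimal
sketch. It is not the QVIZ implementation: the function name filterByFacets,
the resource fields and the facet names are assumptions made for illustration
only. A resource is kept when, for every facet with selected categories, its
value is one of those categories.

```javascript
// Hypothetical sketch of faceted filtering; the resource fields and facet
// names below are invented for illustration.
function filterByFacets(resources, selection) {
  return resources.filter((resource) =>
    Object.entries(selection).every(
      ([facet, categories]) =>
        // A facet without selected categories does not restrict the result
        categories.length === 0 || categories.includes(resource[facet])
    )
  );
}

const resources = [
  { id: 1, unitLevel: "parish", archive: "A" },
  { id: 2, unitLevel: "county", archive: "A" },
  { id: 3, unitLevel: "parish", archive: "B" },
];

// Selecting a category in one more facet narrows the result further
const narrowed = filterByFacets(resources, {
  unitLevel: ["parish"],
  archive: ["A"],
});
console.log(narrowed.map((r) => r.id));
```

Here only the resource with id 1 matches both selections; dropping the archive
selection would also return the resource with id 3.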

3.1 Description of components
The architecture, data structures and interface communication specifications of
the different components, as well as their interrelations, are described below.




             Figure 2. Layout of the Query Visualization Environment.


The interface is divided into two logical parts: the Time Spatial Client and the
Faceted Query Browser. The Time Spatial Client consists of the map and the time-
bar. The Faceted Query Browser consists of the Faceted Query Component, a
Language Selector, the Result List, and the Contextual Area. These are integrated
through a JavaScript controller. See Figure 3 for a high-level overview of the
architecture.








                    Figure 3. Architecture of the QVIZ User Interface.

3.2 Time Spatial Client
The Time Spatial Client (TSC) consists of two core components: a map and a
time-bar. The TSC visualizes and filters the results of queries made in the
system and provides other Internet GIS capabilities. The map component and the
time-bar are linked with the other system components through the JavaScript
controller, following the Model-View-Controller pattern (see footnote 1).

3.2.1 Map component architecture
The Map component is based on Macromedia Flash 7.0, a widespread platform for
creating rich web applications. The Map component can handle both raster and
vector graphics. Raster images in the background are normally used as a base
map, while the vector graphics in the foreground present thematic information
such as points of interest. The Flash map client loads the raster images from
the map server and the vector data in XML format from a database server. The
vector and raster data are then combined in the end user’s browser.




1
    http://en.wikipedia.org/wiki/Model-view-controller








Raster data handling is based on raster-tiling technology. Tiling consists of
cutting a large raster file into many rectangles, which are re-assembled on
demand. Because tiles are loaded asynchronously while the user navigates the
map, the application is fast and its usability improves.
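
As an illustration of the tiling idea, the following sketch lists the tiles
needed to cover a given viewport. It assumes square tiles of a fixed pixel
size, addressed by column and row from the top-left corner of the raster; the
function name tilesForViewport, the tile size and the addressing scheme are
assumptions and are not taken from the QVIZ map client.

```javascript
// Hypothetical helper: list the tiles needed to cover a viewport.
// Assumes the raster is cut into square tiles of `tileSize` pixels,
// addressed by (col, row) from the top-left corner of the image.
function tilesForViewport(viewport, tileSize) {
  const firstCol = Math.floor(viewport.x / tileSize);
  const firstRow = Math.floor(viewport.y / tileSize);
  // The last visible pixel is at x + width - 1 (and y + height - 1)
  const lastCol = Math.floor((viewport.x + viewport.width - 1) / tileSize);
  const lastRow = Math.floor((viewport.y + viewport.height - 1) / tileSize);

  const tiles = [];
  for (let row = firstRow; row <= lastRow; row++) {
    for (let col = firstCol; col <= lastCol; col++) {
      tiles.push({ col, row });
    }
  }
  return tiles;
}

// A 500x300 pixel viewport at offset (200, 100) with 256-pixel tiles
// touches columns 0..2 and rows 0..1, i.e. six tiles.
const needed = tilesForViewport({ x: 200, y: 100, width: 500, height: 300 }, 256);
console.log(needed.length);
```

Only the tiles returned for the current viewport need to be requested, which
is what keeps panning responsive on large raster maps.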
The following figure gives an overview of the communication between the
components within the map component architecture.




                        Figure 4. Communication Diagram


The map component is integrated into an HTML page. JavaScript is used to load
the Flash component, to update its state and to handle its events. The Flash
component gets the vector data (XML) from a data server and loads the map
images from a map server.
The following system requirements must be fulfilled to use the map component:
Web server with     Apache or IIS web server with PHP (version 4 or higher)
PHP support

XML data source     Any database with the ability to convert database data into
for layers and      XML files according to the required schema (PostgreSQL,
objects             Oracle, DB2, MySQL, MSSQL, etc.)

Tiling server       A file server containing the pre-generated tiles (i.e. map
                    images to be shown in the map component)

The end user, i.e. the visitor of the web page, must have Adobe (formerly
Macromedia) Flash installed and JavaScript enabled in the web browser.
The map client is highly configurable through a set of configuration settings,
and its functionality is controlled through a JavaScript API.

3.2.1.1 Map client and vector server communication in the time-spatial interface
The following diagram shows, as an example, all the communication between the
map client component and the map vector server.








Figure 5. Sequence diagram illustrating the selection of an administrative unit
                       in the Faceted Browser component

  1. The map component can center on the selected unit by requesting its
     geometry from the map vector server. The request arguments are:
          a. id of the selected administrative unit.
          b. time parameters – current time, time period start and end years.
     The map vector server returns:
          a. unit geometry.
          b. administrative unit level the unit belongs to.
          c. time period during which the unit had the borders described by
             the geometry.
  2. After the map client has received the selected unit’s geometry, the
     component re-centers the map and zooms in on the unit.
  3. Then the map client requests from the map vector server the ids of units
     that should be shown on the currently visible map at the currently selected
     time (both for the thematic and parent unit layers). The request arguments
     are:
          a. current time.
          b. current category.
          c. bounding box of current map viewport.
     Map vector server response contains:
          a. ids of units.
          b. existence period for every id.
  4. When the ids response for the parent unit layer has been received, the map
     client requests the geometries of the parent units from the map vector
     server. The ids and periods are used to optimize the request, because
     objects that are already on the map should not be requested from the
     server again. The request arguments are:
          a. current time.
          b. current category.
          c. exact area the requested polygon geometries should intersect with.
          d. ids of units to be ignored by the server, i.e. not to be sent in
             the response because they are already on the map.
     When the response has been received, the map component draws the
     geometries on the parent unit layer.
  5. When the ids response for the thematic layer has been received, the map
     client asks the Faceted Browser component for the number of resources of
     each unit to be shown on the map. This information is used to color the
     geometries on the thematic layer of the map.
  6. When the information about the number of resources for the units is
     available, the Faceted Browser tells the map client to update the thematic
     layer.
  7. The map client requests the geometries for the thematic layer from the map
     vector server. The request arguments are:
          a. current time.
          b. current category.
          c. exact area the requested polygon geometries should intersect with.
          d. ids of units to be ignored by the server.
          e. number of resources for each unit to be shown on the map.
     The map vector server response contains:
          a. geometries of units with style applied (colored according to
             the respective thematic class).
          b. legend data – thematic classes (based on the number of resources)
             and the number of units in each class.
          c. style information – the thematic class of each unit shown on the
             map (this information is used to update the unit color even if the
             unit’s geometry is not sent in the response).
     When the response has been received, the map component draws the
     colored geometries of the units on the thematic layer and updates the
     color of those units whose geometries have not changed.
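
The optimization described for the geometry requests (the ids and existence
periods returned first are used to skip units that are already drawn) can be
sketched as follows. This is an illustration under stated assumptions, not the
actual client code: the function name buildGeometryRequest, all field names and
the rule "same id and same existence period means already on the map" are
hypothetical.

```javascript
// Hypothetical sketch: build a geometry request that skips units whose
// geometry is already drawn on the map. All field names are assumptions.
function buildGeometryRequest(state, idsResponse) {
  // A unit is treated as "already on the map" when a geometry with the same
  // id and the same existence period has been drawn before.
  const ignoreIds = idsResponse
    .filter((unit) => {
      const drawn = state.drawnUnits.get(unit.id);
      return (
        drawn !== undefined &&
        drawn.start === unit.start &&
        drawn.end === unit.end
      );
    })
    .map((unit) => unit.id);

  return {
    time: state.currentTime,
    category: state.currentCategory,
    area: state.viewportBBox,
    ignoreIds,
  };
}

const state = {
  currentTime: 1850,
  currentCategory: "parish",
  viewportBBox: [21.0, 57.5, 28.5, 59.8],
  drawnUnits: new Map([[101, { start: 1800, end: 1900 }]]),
};

const request = buildGeometryRequest(state, [
  { id: 101, start: 1800, end: 1900 }, // already drawn with the same period
  { id: 102, start: 1840, end: 1870 }, // not drawn yet, must be fetched
]);
console.log(request.ignoreIds);
```

In this example unit 101 ends up in the ignore list, so the server would only
need to send the geometry of unit 102.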

3.2.2 Map Component Features
    1. To zoom in or out, do one of the following:
       •   Click on the zoom-bar steps or drag its slider.
       •   Press the “+” or “PgUp” key, or the up and down arrows on the
           keyboard simultaneously, to zoom in. Press the “-” or “PgDn” key,
           or the left and right arrows on the keyboard simultaneously, to
           zoom out.
       •   Scroll the mouse wheel forwards to zoom in and backwards to zoom
           out.
       •   Double-click on the map to zoom in one step.
   Different administrative unit types are displayed at different zoom ranges.
   If you zoom in, the lower-level units appear while the higher-level units
   disappear, and vice versa. Two levels are always presented on the map: the
   thematic layer units (colored areas) and their parent units (thicker
   borders). For instance, the map displays manors and counties, or counties
   and states, at the same time.
    2. To pan (move the map), do one of the following:
       •   Click and drag the map.
       •   Drag and drop the current map box or location marker on the
           overview map.
       •   Press the up arrow on the keyboard to move north.
       •   Press the down arrow on the keyboard to move south.
       •   Press the right arrow on the keyboard to move east.
       •   Press the left arrow on the keyboard to move west.
       •   Press the up and the right arrow simultaneously to move northeast,
           etc.
    3. Send the extent of the map window to the JavaScript client. The Map
       client sends the coordinates of the bounding box, the coordinates of the
       centre point of the map window and the zoom factor to the client.
    4. Thematic mapping. The Map client can display a colored choropleth map
       based on frequency counts describing the spatial distribution of
       resources or community activities.
    5. Zoom to object. The map zooms to the selected object, identified by its
       administrative unit id code, and highlights its borders.
    6. To select an administrative unit from the map, click on a colored area
       in the map window. The result list and the context area are updated with
       relevant information about the selected unit, the map centers on the
       selected unit and highlights it with red borders, and the period of
       existence of the selected unit is reflected in the time-bar. When the
       user clicks on the map, the map client sends the administrative unit id
       code to the JavaScript client.
    7. Background map layer. Displays the background map image using raster
       streaming technology.
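
The extent information mentioned in feature 3 (bounding box, centre point and
zoom factor) could be represented as in the sketch below. The object layout,
the field names and the function name describeExtent are assumptions for
illustration; the actual message format of the map client is not specified
here.

```javascript
// Hypothetical shape of the extent message; field names are assumptions.
function describeExtent(bbox, zoom) {
  return {
    bbox: { ...bbox },
    // The centre point is the midpoint of the bounding box
    center: {
      x: (bbox.minX + bbox.maxX) / 2,
      y: (bbox.minY + bbox.maxY) / 2,
    },
    zoom,
  };
}

const extent = describeExtent({ minX: 20, minY: 56, maxX: 28, maxY: 60 }, 3);
console.log(extent.center);
```

For the bounding box above the centre point is (24, 58).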

3.2.3 Time-bar Features
The Time-bar lets the user select the period of interest (time window) and change
the time of the map. The period of interest helps the user to filter the values in the
faceted browser and in the different result lists.
    •   To define the period of interest, drag the time window sliders to the start
        and the end year of the period.
    •   To define the current map time, drag the map time slider to the desired
        year. The borders and the thematic coloring change accordingly.
The time-bar gives the user additional information about the selected
administrative unit (Fig. 6):
    •   The Amplitude chart shows the temporal distribution of the archival
        resources related to that unit.
    •   The Border change bar gives the user an overview of the border changes
        for the specific unit: the period of existence of the selected border
        is highlighted and the remaining border changes are shown in
        alternating dark and light colors.
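
The two time selections of the time-bar can be illustrated with a small
sketch: the period of interest keeps every item whose existence period overlaps
the time window, while the map time keeps only the items that exist in the
selected year. The function names and the start/end fields are assumptions made
for this example.

```javascript
// Hypothetical sketch of the two time filters; field names are assumptions.

// True if the item's existence period overlaps the period of interest
function overlapsPeriod(item, windowStart, windowEnd) {
  return item.start <= windowEnd && item.end >= windowStart;
}

// True if the item exists in the year selected as the current map time
function existsAt(item, year) {
  return item.start <= year && year <= item.end;
}

const units = [
  { id: "u1", start: 1700, end: 1795 },
  { id: "u2", start: 1796, end: 1917 },
  { id: "u3", start: 1918, end: 1940 },
];

// Period of interest 1780-1850 keeps u1 and u2
const inWindow = units.filter((u) => overlapsPeriod(u, 1780, 1850));
// Map time 1800 keeps only u2
const onMap = inWindow.filter((u) => existsAt(u, 1800));
console.log(inWindow.map((u) => u.id), onMap.map((u) => u.id));
```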




                  Figure 6. Time-bar chart and border change bar








3.2.4 Map legend
The number of archival documents related to the administrative units is
expressed by thematic coloring on the map. In the map legend window each color
corresponds to a certain range of resources (Fig. 7). The number in brackets
indicates the count of administrative units on the map that belong to that
range of resources.
    •     To view the explanation of the colors, click on the map legend tool
          icon in the map toolbar.




                                 Figure 7. Map legend

3.2.4.1 Technical description of map legend tool
The Map legend component is a MapCat subcomponent, which can be controlled
through the MapCat API.
The component is loaded through JavaScript in index.php:
MapCatAPI.broadcast("mapcat.addComponent", "mapLegend",
"map/components/mapLegend/mapLegend.swf");

MAP_CONTROLLER.onMapLegendLoaded is registered as a listener to mapLegend.loaded
events.
MapCatAPI.addCallback('mapLegend.hidden', MAP_CONTROLLER.onMapLegendHidden);

onMapLegendLoaded positions the legend component at application start (the
legend component stays invisible).
FlashApi.onJsDataReceived is called every time the THEMATIC layer is updated. This
function updates the legend according to the data received from the server by
calling the MapCat API function MapCatAPI.broadcast('mapLegend.setItems',
legend), where legend is an Array of objects with the following properties:
    •   styleId: one of the styles defined in map/styles_thematic_qviz.xml
        (optional; if omitted, the item has no colored box in front of its text);
    •   description: the text of the legend item.
The map legend header is taken from f_text["t_map_legend_resource"][f_lang].
If f_text["t_map_legend_resource"][f_lang] is not defined, the default header is
used: "Number of resources in". The current year is appended to the header.
If there are no objects on the THEMATIC layer, an appropriate message is shown in
the legend instead of legend items. The message is taken from
f_text["t_map_legend_no_info"][f_lang]. If
f_text["t_map_legend_no_info"][f_lang] is not defined, the default message is
used: "There are no polygons available for thematic information for the
chosen area and time."
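
The legend handling described above can be sketched as follows. This is only an illustration under stated assumptions: the shape of the class objects (styleId, from, to, unitCount), the helper names, and the rule that the header falls back when its own text key is missing are hypothetical; only MapCatAPI.broadcast('mapLegend.setItems', legend), the two f_text keys and the default texts are taken from the description.

```javascript
// Default texts quoted in the report.
var DEFAULT_HEADER = "Number of resources in";
var DEFAULT_NO_INFO = "There are no polygons available for thematic " +
    "information for the chosen area and time.";

// Look up a localized text, falling back to a default when it is missing.
function localizedText(f_text, key, f_lang, fallback) {
  var entry = f_text[key];
  return (entry && entry[f_lang]) || fallback;
}

// Header text plus the current map year.
function legendHeader(f_text, f_lang, year) {
  return localizedText(f_text, "t_map_legend_resource", f_lang, DEFAULT_HEADER) +
      " " + year;
}

// Build the Array passed to 'mapLegend.setItems'.
// classes: [{styleId, from, to, unitCount}] is an assumed input shape.
function legendItems(classes, f_text, f_lang) {
  if (classes.length === 0) {
    // No objects on the THEMATIC layer: one item without a colored box.
    return [{ description: localizedText(f_text, "t_map_legend_no_info",
                                         f_lang, DEFAULT_NO_INFO) }];
  }
  return classes.map(function (c) {
    return {
      styleId: c.styleId,
      description: c.from + " - " + c.to + " (" + c.unitCount + ")"
    };
  });
}

// Only broadcast when the MapCat API is actually present on the page.
function updateLegend(classes, f_text, f_lang) {
  var items = legendItems(classes, f_text, f_lang);
  if (typeof MapCatAPI !== "undefined") {
    MapCatAPI.broadcast("mapLegend.setItems", items);
  }
  return items;
}
```

Keeping the item construction separate from the broadcast call makes the fallback rules easy to check without a running map component.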


3.2.5 History of Point tool
To view the administrative history of a location, choose the History of Point
button in the map toolbar and click on the map. A red marker is drawn on the
map and the list of administrative units is displayed in the right section of the
page. The History of Point tool lists (Fig. 8) the administrative units and their
respective borders that intersect the selected location and fit the selected
time window.




                            Figure 8. History of Point list

Description of user & application actions:

    •   User action: The user activates the History of Point tool by clicking on
        its icon in the map toolbar. Then the user clicks on the map with the left
        mouse button while directing the cursor to the point of interest.
        Application action: A listing of the intersecting AUs is displayed in
        the result list. For each AU, the name is displayed together with all
        polygon time intervals of the AU within the time window.
    •   User action A (optional): The user clicks on an AU polygon time interval
        in the result list.
        Application action: The map centers on the selected polygon, and the
        time-bar current-time cursor is moved to fit into the time interval of the
        selected polygon.
    •   User action B (optional): The user clicks on an AU name.
        Application action: The map is centered on the selected AU. At the same
        time the AU description is loaded into the results container by the faceted
        browser along with contextual information.
    •   User action C (optional): The user clicks on the map.
        Application action: The AU is highlighted on the map and the time-bar is
        updated. At the same time the AU description is loaded into the results
        container by the faceted browser along with contextual information.




3.2.5.1 Technical description of HoP tool
The HoP tool is controlled by code in the file js/HISTORY_OF_POINT.js. The layout
of the HoP tool is described by Gfx/maptools.css, gfx/maptools_ie.css, and
history.png.

Here is a brief description of the HoP tool's JavaScript functions:
When the user clicks on the HoP tool icon, the function toggleHistoryPoint() is
called. This activates the “edit” mode in the map component. When the user then
clicks on the map with the left mouse button, a red dot is displayed, which
disappears after a predefined time. At the same time a HoP query is sent
to the QVIZ AU database by the function historyPointQuery(). The result of the
query is returned in JSON format to the function presentationManager(), which
populates the result and context lists with the help of helper functions.
When the user clicks on an AU name, the function centerToUnitHistory() is executed
and the functions centerToSelectedUnit() and unitClicked() are called.
When the user clicks on a polygon, the functions centerToPolygonHistory() and
centerToSelectedUnit() are executed.


Here is an example of a HoP query to the database by the JavaScript function
historyPointQuery:


HTTP POST:
http://polaris.regio.ee/dev_test/history_point.php?request=history_point&crd=4320565.40820312,3498677.07617187&timeinterval=1600,2008
The response is in JSON format, i.e. an array of objects like:
{"id":"10704730.000000000000000","start_year":"2007","end_year":"5000","name":"SWEDEN","type":"STATE","type_level":"4"}


SQL of HoP query:
SELECT
         f.id,
         (duration).g_start_date.g_year AS start_year,
         (duration).g_end_date.g_year AS end_year,
         name,
         type,
         g.g_type_level AS type_level
FROM
         demo_lambert_europe f,
         hgis.g_unit_type g
WHERE
         -- bounding-box pre-filter around the clicked point (SRID 3034)
         (f.geom && expand(setsrid(makepoint($coords[0],$coords[1]),3034),$radius))
         -- exact distance test against the unit geometry
         AND distance(setsrid(makepoint($coords[0],$coords[1]),3034),f.geom) < $radius
         -- the existence period of the unit must overlap the time window
         AND util.get_start_year(duration) < $dates[1]
         AND util.get_end_year(duration) >= $dates[0]
         AND f.type = g.g_unit_type
ORDER BY
         type_level ASC, type ASC, name ASC, start_year DESC, end_year ASC;
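
The HoP request and response shown above can be illustrated with a short client-side sketch. The function names buildHistoryPointUrl and parseHistoryPointResponse and the camel-case field names are hypothetical; only the query parameters (request, crd, timeinterval) and the JSON property names are taken from the example above. How historyPointQuery itself is implemented is not described in this report.

```javascript
// Assemble the query string in the same form as the example request.
function buildHistoryPointUrl(baseUrl, x, y, startYear, endYear) {
  var query = [
    "request=history_point",
    "crd=" + x + "," + y,                          // clicked map coordinates
    "timeinterval=" + startYear + "," + endYear    // selected time window
  ].join("&");
  return baseUrl + "?" + query;
}

// Turn the JSON array into objects with numeric years and type level.
function parseHistoryPointResponse(jsonText) {
  return JSON.parse(jsonText).map(function (unit) {
    return {
      id: unit.id,
      name: unit.name,
      type: unit.type,
      typeLevel: Number(unit.type_level),
      startYear: Number(unit.start_year),
      endYear: Number(unit.end_year)
    };
  });
}
```

Converting the string values to numbers once, at parsing time, keeps later comparisons against the time window free of string/number mix-ups.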




3.3 Time Spatial Middleware
The Time Spatial Middleware component provides the interconnection with the
spatial databases and is responsible for the calculation of thematic map classes,
vector data generalization, and the optimization of vector data loading when
panning the map or changing the map time.

3.3.1 Vector data generalization
To reduce the size of the data requested by the map component from the map vector
server, and to speed up the drawing of geometries on the map, the geometries are
generalized by the map vector server before they are sent to the client side.
In the current version of the map vector server the PostGIS simplify function is
used to generalize the requested geometries. This function is an implementation of
the Douglas-Peucker algorithm. The generalization parameters are configurable and
are specified for each map zoom level (map scale) according to the map resolution
at that level, i.e. meters per pixel.
The Time Spatial Middleware does not render any map images; they are requested
from the separate map server by the map client.
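
To illustrate the kind of generalization involved, here is a minimal client-side sketch of the Douglas-Peucker algorithm. It is not the PostGIS implementation used by the map vector server, and the helper toleranceForResolution with its pixel threshold is a hypothetical example of how a tolerance could be derived from the map resolution.

```javascript
// Distance from point p to the line through a and b (points are [x, y]).
function perpendicularDistance(p, a, b) {
  var dx = b[0] - a[0];
  var dy = b[1] - a[1];
  var len = Math.sqrt(dx * dx + dy * dy);
  if (len === 0) {
    return Math.sqrt(Math.pow(p[0] - a[0], 2) + Math.pow(p[1] - a[1], 2));
  }
  return Math.abs(dy * (p[0] - a[0]) - dx * (p[1] - a[1])) / len;
}

// Keep the end points; keep an inner point only if it deviates from the
// chord by more than the tolerance, then recurse on both halves.
function simplifyLine(points, tolerance) {
  if (points.length < 3) {
    return points.slice();
  }
  var first = points[0];
  var last = points[points.length - 1];
  var maxDist = 0;
  var index = 0;
  for (var i = 1; i < points.length - 1; i++) {
    var d = perpendicularDistance(points[i], first, last);
    if (d > maxDist) {
      maxDist = d;
      index = i;
    }
  }
  if (maxDist <= tolerance) {
    return [first, last];
  }
  var left = simplifyLine(points.slice(0, index + 1), tolerance);
  var right = simplifyLine(points.slice(index), tolerance);
  return left.slice(0, -1).concat(right);   // drop the duplicated split point
}

// Hypothetical rule: ignore details smaller than a given number of pixels.
function toleranceForResolution(metersPerPixel, pixels) {
  return metersPerPixel * pixels;
}
```

With a resolution of 100 meters per pixel and a threshold of 2 pixels, for example, this rule would give a tolerance of 200 meters, so coarser zoom levels produce lighter geometries.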

3.3.2 Optimization of vector loading when panning the map or changing the map time
Every time the map is panned or zoomed, the vector data layers must be updated.
To speed up the update of the vector data layers while panning the map, the
following features were implemented:
The map component requests only the geometries that intersect the area not
covered by the map before panning (Fig. 9). The large rectangle in the lower left
corner shows the area covered by the map before map panning. The large
rectangle in the upper right corner depicts the area covered by the map after
panning, while the smaller rectangle in the middle depicts the area that does not
need to be requested in the optimized vector data request.




                                Figure 9. Map panning
Geometries that have already been loaded and are shown on the map may
intersect the area of the request. That is why the request also contains the ids of
all the geometries currently shown on the map. The map vector server uses these
ids to exclude those geometries from the response.
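
The panning optimization described above can be sketched as a rectangle difference. The box format {minX, minY, maxX, maxY}, the function names and the shape of the request object are assumptions made for this illustration; the actual request format of the map vector server is not specified here.

```javascript
// Return the parts of newBox that are not covered by oldBox,
// as a list of non-overlapping rectangles.
function uncoveredArea(oldBox, newBox) {
  var ix1 = Math.max(oldBox.minX, newBox.minX);
  var iy1 = Math.max(oldBox.minY, newBox.minY);
  var ix2 = Math.min(oldBox.maxX, newBox.maxX);
  var iy2 = Math.min(oldBox.maxY, newBox.maxY);
  if (ix1 >= ix2 || iy1 >= iy2) {
    return [newBox];                     // no overlap: request everything
  }
  var parts = [];
  if (newBox.minX < ix1) {               // strip left of the overlap
    parts.push({ minX: newBox.minX, minY: newBox.minY, maxX: ix1, maxY: newBox.maxY });
  }
  if (ix2 < newBox.maxX) {               // strip right of the overlap
    parts.push({ minX: ix2, minY: newBox.minY, maxX: newBox.maxX, maxY: newBox.maxY });
  }
  if (newBox.minY < iy1) {               // strip below the overlap
    parts.push({ minX: ix1, minY: newBox.minY, maxX: ix2, maxY: iy1 });
  }
  if (iy2 < newBox.maxY) {               // strip above the overlap
    parts.push({ minX: ix1, minY: iy2, maxX: ix2, maxY: newBox.maxY });
  }
  return parts;
}

// Hypothetical request: the new areas plus the ids already shown on the map,
// so that the server can leave those geometries out of the response.
function buildPanRequest(oldBox, newBox, loadedIds) {
  return { areas: uncoveredArea(oldBox, newBox), excludeIds: loadedIds.slice() };
}
```

For a pan towards the upper right, as in Fig. 9, the result consists of one strip on the right and one strip above the overlapping area.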
The color of an administrative unit is chosen according to its number of resources,
but it also depends on the number of resources of the other units displayed on the
map. The color of an administrative unit may therefore change every time the map
is panned. This recoloring is performed without reloading the geometry.
When zooming in or out on the map, the system still needs to request the entire
data set from the server, because the generalization level depends on the zoom
level. In this case the described optimization cannot be used.

3.3.2.1 Time request optimization
When the current time of the time spatial client is changed, the vector data layers
will be updated. In this case the area covered by the map does not change, but
some of the administrative unit geometries need to be updated because:
    1. The administrative unit borders may have changed over time. In that case
         the geometry should be reloaded.
    2. An administrative unit may not exist at the new time. In that case it should
         be removed from the map.
    3. An administrative unit may not have existed at the previous time. In that
         case it should be loaded from the server.
    4. The number of resources of an administrative unit may have changed. In
         that case its color on the map should be changed without reloading its
         geometry.

To optimize the update of the vector data layers the following steps are taken:
    1. Before requesting the geometries of administrative units, the map
       component requests the ids of the administrative units to be shown on a
       given area at given time. The response contains the administrative unit
       existence period for each id.
    2. The new response is compared to the ids and existence periods of the units
       shown on the map before the change of selected time:
           a. if a unit existence period has changed then the unit geometry is
                reloaded, otherwise it is skipped by the map vector server in the
                update response;
           b. if a unit id from the old response does not exist in the new one,
                the geometry is removed from the map;
           c. if a unit id exists in the new response but not in the old one then
                its geometry is loaded from the server during the update response;
    3. The vector data update request contains the style information (used to
       color geometries on map) for each unit id received in the response at step
       1. The new styles are compared to the old ones for the same id and if the
       style has changed the geometry color is changed.
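
The comparison in steps 2 and 3 can be sketched as a diff between two id-keyed collections. The data layout (a Map from unit id to {start, end, styleId}) and the function name planTimeUpdate are assumptions made for this example, and the sketch treats a reloaded geometry as being restyled as part of the reload.

```javascript
// oldUnits / newUnits: Map of unit id -> { start, end, styleId } (assumed shape).
function planTimeUpdate(oldUnits, newUnits) {
  var plan = { reload: [], remove: [], load: [], recolor: [] };
  oldUnits.forEach(function (oldUnit, id) {
    var newUnit = newUnits.get(id);
    if (newUnit === undefined) {
      plan.remove.push(id);              // unit does not exist at the new time
    } else if (newUnit.start !== oldUnit.start || newUnit.end !== oldUnit.end) {
      plan.reload.push(id);              // existence period changed: new borders
    } else if (newUnit.styleId !== oldUnit.styleId) {
      plan.recolor.push(id);             // same geometry, different class color
    }
  });
  newUnits.forEach(function (newUnit, id) {
    if (!oldUnits.has(id)) {
      plan.load.push(id);                // unit appears only at the new time
    }
  });
  return plan;
}
```

Only the ids in reload and load require geometry to be transferred; remove and recolor are handled entirely on the client side.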


3.3.3 Calculation of thematic map classes
The coloring is calculated in the MapCat server, where the observation values are
the numbers of archival resources and the class intervals are ranges of archival
resource numbers. The algorithm used is called equal count.




The algorithm partitions a given array of observation values into a given number
of class intervals so that the first class interval is reserved for non-positive values
and the numbers of observations in all other class intervals are as equal as
possible. More precisely, it minimizes the variance of the observation counts in
the positive class intervals (i.e. the mean of (X-EX)^2 where X is the observation
count in a given positive interval and EX is the mean observation count over all
positive intervals). The algorithm is exact, i.e. the solution is one of the optimal
ones.
The algorithm takes O(m*log(m)+k*(n+p^2)) time where k is the number of
intervals wanted, n is the total number of observations, m is the number of
observations with positive values and p is the number of different positive values.
So the time complexity is not more than O(k*n^2). The constructed class intervals
will be numbered 0..k-1 where k >= 2 is the number of classes (taken from the
length of the array classUpperLimits). Class interval 0 will contain all non-positive
elements, class 1 will be the least positive interval and so on; the class intervals
will cover all the given values.
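
One way to obtain an exact solution with a k*p^2 term is dynamic programming over the distinct positive values. The sketch below is an assumption about how such an algorithm could look, not the MapCat server code; it uses the fact that, for a fixed number of classes and observations, minimizing the variance of the class counts is the same as minimizing the sum of their squares. The function name and the convention that the upper limit of class 0 is 0 are hypothetical.

```javascript
// Returns classUpperLimits[0..k-1]; class 0 holds the non-positive values.
function equalCountClasses(values, k) {
  var positives = values.filter(function (v) { return v > 0; })
                        .sort(function (a, b) { return a - b; });
  var distinct = [];
  var counts = [];
  positives.forEach(function (v) {
    if (distinct.length > 0 && distinct[distinct.length - 1] === v) {
      counts[counts.length - 1] += 1;
    } else {
      distinct.push(v);
      counts.push(1);
    }
  });
  var p = distinct.length;
  var classes = k - 1;                   // number of positive classes
  if (p < classes) {
    throw new Error("fewer distinct positive values than positive classes");
  }
  var prefix = [0];                      // prefix[i]: observations in first i values
  counts.forEach(function (c) { prefix.push(prefix[prefix.length - 1] + c); });
  // cost[j][i]: minimal sum of squared class sizes when the first i distinct
  // values are split into j non-empty classes; cut[j][i]: where class j starts.
  var cost = [];
  var cut = [];
  for (var j = 0; j <= classes; j++) {
    cost.push(new Array(p + 1).fill(Infinity));
    cut.push(new Array(p + 1).fill(-1));
  }
  cost[0][0] = 0;
  for (j = 1; j <= classes; j++) {
    for (var i = j; i <= p; i++) {
      for (var s = j - 1; s < i; s++) {
        var size = prefix[i] - prefix[s];
        var c = cost[j - 1][s] + size * size;
        if (c < cost[j][i]) {
          cost[j][i] = c;
          cut[j][i] = s;
        }
      }
    }
  }
  var classUpperLimits = new Array(k);
  classUpperLimits[0] = 0;
  var end = p;
  for (j = classes; j >= 1; j--) {       // walk the cuts back from the last class
    classUpperLimits[j] = distinct[end - 1];
    end = cut[j][end];
  }
  return classUpperLimits;
}
```

For the values 0, 1, 1, 2, 3, 3, 4 and k = 3, the two positive classes each receive three observations (values up to 2 and values up to 4).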
If there are fewer unique positive values than k-1, then some class intervals will
be empty and the indices of the non-empty class intervals will be distributed evenly,
i.e. with the same spacing between each non-empty positive class interval and its
preceding non-empty or non-positive class interval, such that the spacing is
maximal. For example, if k=8 and there are 2 distinct positive values, then the
spacing is 3, the positive non-empty class intervals will have indices 3 and 6, and
class intervals 1, 2, 4, 5 and 7 will be empty (if there are also some non-positive
values, then the indices of the non-empty class intervals will be 0, 3 and 6).
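
The spacing rule can be written down directly. The following sketch assumes that "maximal spacing" means the largest integer s for which the last index d*s still fits below k, i.e. s = floor((k-1)/d) for d distinct positive values; the function name is hypothetical.

```javascript
// Indices of the non-empty positive class intervals when there are fewer
// distinct positive values (d) than positive classes (k - 1).
function nonEmptyClassIndices(k, d) {
  if (d < 1 || d > k - 1) {
    throw new Error("expected 1 <= d <= k - 1");
  }
  var spacing = Math.floor((k - 1) / d);
  var indices = [];
  for (var i = 1; i <= d; i++) {
    indices.push(i * spacing);
  }
  return indices;
}
```

For k = 8 and d = 2 this gives a spacing of 3 and the indices 3 and 6, in agreement with the example in the text.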

3.3.4 System requirements of the time-spatial middleware
These are the system requirements for the time spatial middleware:
    • Apache Tomcat - Servlet engine used for running Java servlets
        http://apache.datanet.ee/jakarta/tomcat-5/v5.0.28/
    • Log4j - library used for logging
        http://apache.datanet.ee/jakarta/log4j/binaries/jakarta-log4j-1.2.8.zip
    • Apache DB - database pooling http://db.apache.org/




3.4 Faceted Query Component
This section describes the Faceted Query Component (FQC) and its main
features from both the user and the technical point of view.
The FQC is an integral part of the QVIZ environment. It is meant to be a convenient
starting point for users searching for archive resources and/or their volumes. The
FQC consolidates different types of data associated with archive resources into
logical groups or topics. It comprises a set of so-called facets, each containing a
list of terms related to a particular topic. For example, a facet called Countries
would have individual countries listed as its facet terms, because it is natural to
search for archive resources by country of origin. Identifying common
characteristics of the data objects of interest (in our case references to archive
volumes) is called faceted classification. These characteristics, grouped into
mutually exclusive sets, correspond to different views (or facets) of the data
objects at hand. Browsing a faceted classification system using facets gives the
users a tool to choose the order in which the archive data is explored and presented.

3.4.1 FQC architecture
From the user point of view, the FQC is a part of the front-end user interface and
is closely linked with the Time-Spatial Component (TSC) (i.e. a Map with a time
bar). Internally the FQC has two main parts:
    •   The client-side, which is essentially a dynamic web user interface, whose
        purpose is to display facets and their content, register user events, fetch
        data from the backend server and call TSC functions if necessary;
    •   The server-side, which acts as a backend to the client side. Its purpose is
        to provide the client with data optimized for the needs of the user
        interface. This component is also called the Archive Resource Hub.
The FQC is implemented using dynamic web development techniques broadly
called AJAX. These techniques include HTML, CSS and JavaScript on the client-
side, as well as PHP programming on the server-side.
Facets can be implemented using either an HTML select element or an HTML
table element. QVIZ has chosen the table since it provides more control over the
look-and-feel and content than a typical select menu, whose implementation varies
slightly for different browsers.
Each facet is therefore implemented as a separate table attached to a higher-level
table using row and cell elements. The positioning in a row is done automatically
and allows easy adding, removing and swapping of tables or facets.
This implementation of a faceted browser can easily be used in other projects. The
implementation is described in greater detail in Appendix A.

3.4.2 Features
This subsection describes features and functionalities of the facets, the Result List,
and all available actions that a user can perform. (See Figure 10 below for an
overview of the facet features).




                              Figure 10. Facet

•   Add an arbitrary facet from the list of Inactive Facets by clicking on the
    facet name button. The selected facet is moved to the first free position in
    the list of Active Facets. If the facet is the only one in the list of Active
    Facets, it is automatically populated with all the available data related to
    that facet. The facet terms are sorted in English alphabetical order.
•   Remove an arbitrary facet from the list of Active Facets by clicking on the
    facet name button. The selected facet is moved back to the list of Inactive
    Facets. If the removed facet contained a selection, the content of all facets
    to the right of the removed one is erased. If the removed facet is the first
    one, the list of facets shifts to the left and the new first facet is
    automatically populated with data.
•   Swap two adjacent facets by clicking on left or right swap buttons (« and
    ») in the upper corners of active facets. This operation allows users to alter
    the order in which facets are displayed. The content of all facets to the
    right of the swapped ones is erased. If the swap includes the first facet, the
    data is automatically loaded into the new first facet.
•   Page through the facet content by clicking on the previous page or next
    page buttons (← and →) in the lower corners of active facets. This
    operation loads the previous or the next page of a facet. If the previous
    page button is used on the first page, the current page wraps around to the
    last page. If the next page button is used on the last page, the current page
    wraps around to the first page. Any previous selections and content of all
    facets to the right of the one where the paging occurs are erased. For
    consistency reasons, the Result List and Contextual Area are removed.
    The current page number and the last page number are displayed in
    the middle of the page bar.
•   Browse information in the facets by clicking on individual facet terms. A
    click in the current facet initiates a series of actions:
        •   New content of all facets to the right of the current facet is loaded.
            If the current facet is an AUO facet, the newly loaded facet
            terms are filtered based on this selection and the resource counts
            do not change. If the current facet is a non-AUO facet,
            both the facet term and the resource count filters are applied.
            After a selection in a non-AUO facet, all facets to the right
            of this non-AUO facet display a) both direct and indirect AUs
            of bookmarks related to the selected term, and b) only one mixed
            resource count for each term. This count specifies the number of
            result list resources with at least one bookmarked archive resource
            both directly and indirectly connected to the current AUO level
            (for AUO facets) or the current non-AUO facet term (for non-AUO
            facets). The facet terms in all facets are sorted in English
            alphabetical order.
        •   AU Information and AU History data related to the selected facet
            term are displayed in the Contextual Area.
        •   A list of resources related to the currently selected facet term is
            displayed in the Result List Area. Since resources can either be
            directly or indirectly related to the selected AU, the result list is
            displayed with the directly and indirectly related resources in
            separate result list tabs (see "Direct Resources" and "Indirect
            Resources" tabbing buttons). Both types are sorted in English
            alphabetical order. For mixed resource counts the "Indirect
            Resources" tabbing button is disabled (see Figure 11 below).
        •   If the current facet was an AUO facet, the map zooms in or out
            onto the selected AU and highlights its border using a red color.
•   Search AU names using the Search AU facet. The Search AU facet can be
    browsed using paging or searched for a name using the text input field next
    to the facet's heading. Pressing the "Enter" key or clicking "Go!" initiates
    the search. Other functionalities are the same as for the AUO facets.
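
Two of the facet operations listed above, paging with wrap-around and swapping adjacent facets, can be sketched with a small data model. The model (an array of facet objects with name, terms and selected fields), the 1-based page numbering and the function names are assumptions for illustration only; the real implementation works on HTML tables, as described in Appendix A.

```javascript
// Paging wraps around at both ends (pages are assumed to be numbered 1..last).
function nextPage(current, last) {
  return current >= last ? 1 : current + 1;
}

function previousPage(current, last) {
  return current <= 1 ? last : current - 1;
}

// Swap the facet at position index with its right neighbour and erase the
// content of all facets to the right of the swapped pair.
function swapWithRightNeighbour(activeFacets, index) {
  if (index < 0 || index + 1 >= activeFacets.length) {
    return activeFacets.slice();         // nothing to swap with
  }
  var result = activeFacets.map(function (facet) {
    return { name: facet.name, terms: facet.terms.slice(), selected: facet.selected };
  });
  var moved = result[index];
  result[index] = result[index + 1];
  result[index + 1] = moved;
  for (var i = index + 2; i < result.length; i++) {
    result[i].terms = [];
    result[i].selected = null;
  }
  return result;
}
```

Returning a new array instead of modifying the input keeps the previous state available, which can be convenient when the user interface has to be redrawn from scratch.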




Figure 11. Result list.




4. Collaborative Environment/Knowledge Building Tools (CET)
Archival material descriptions from archives, archival bookmarks, and
automatically generated objects seed the collaborative environment with both user
and archival contexts. To enhance access to archival resources, communities
within the collaborative environment then continue to extend this semantic network
and create new contexts for these archival resources.
The goal is for users to create new knowledge and content in communities, and to
associate knowledge from Communities of Practice (CoP) with archival resources
referenced in archive portals, as well as with new content in the collaborative
environment. Semantic annotations, metadata descriptions, and new content
enhance the comprehension of and access to both new content and archival
content.
Overall, the focus of the collaborative environment and knowledge building tools
is to enhance access to the archival resources using community based annotation
of resources, content building, and making relationships to other resources. The
users must be kept motivated to add value to the resources; therefore, content
creation and relevant domain ontologies are considered, especially to support
research needs, together with social software features, resource
organisation/categorisation by users and communities, tagging based on the Simple
Knowledge Organisation System (SKOS), and visualization. The growth of
knowledge and content provides a means for users and communities to visualize or
discover archive materials in new contexts.

4.1 Description of Components and Tools
4.1.1 Collaborative Environment Activities, User Interfaces and detailed descriptions
Additional screenshots and details are provided in the D6.4.2b User Manual.
Please see also the following resources associated with this deliverable.
    •   P2b User Manual –section: Collaborative Environment for Social
        Knowledge Building (CET)
    •   SKOS Vocabulary Builder and Search Application (Appendix)
Additional deliverables that support the software:
    •   D3.3 Domain Ontology
    •   D3.4 Knowledge Content Model




4.1.2 Collaborative Environment Portal
This portal provides the collaborative environment for knowledge
building. It is deployed as a Java web application; however, it relies heavily
on AJAX technologies based on the Dojo toolkit [2] and JSON [3].
The portal is accessible at http://qviz.eu/TryQVIZ.php
Users can log in, perform basic text searches and advanced ontology-supported
searches, create typed content, manage semantic annotations and metadata,
and visualize relationships between resources based on ontology properties via
automated SPARQL [4] queries triggered by direct and inherited resource types.
The domain ontology is described in D3.3 Domain Ontology [5].
The content management and ontology or knowledge base provide a means to
associate ontology instances to content 1:1. Each instance in the ontology has a
corresponding content page; likewise, any ontology class or property also has a
content page. Describing the ontology elements might help users understand the
nature of resource types, their inherited resource types, and ontology relationships
between these types.
When creating content, each content page can be typed by one or more ontology
classes and therefore inherits the class properties for annotating the content page.
Users might later browse by resource type, or navigate based on the resource type
within filter/browse or visualization features. The goal is to build connections
between resources using named relations from the ontology, which can then be
visualized and browsed using the Relationships Overview portlet [6], a
knowledge-content visualization generated by SPARQL content templates triggered
by the resource’s direct and inferred ontology classes and relationships.
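
The notion of direct and inferred ontology classes can be illustrated with a small sketch. The subclass table below is hypothetical: the report mentions qvizcet:CollaborativeDocument and its subtypes Article and Tutorial, but the prefixed names of the subtypes and the table layout are assumptions, and the real portal derives this information from the ontology rather than from a hard-coded object.

```javascript
// Hypothetical subclass table: class -> direct superclass.
var SUBCLASS_OF = {
  "qvizcet:Article": "qvizcet:CollaborativeDocument",
  "qvizcet:Tutorial": "qvizcet:CollaborativeDocument"
};

// Collect the direct types of a content page plus all types reachable
// by following the subclass relation upwards.
function directAndInferredTypes(directTypes, subclassOf) {
  var seen = [];
  directTypes.forEach(function (type) {
    var current = type;
    while (current !== undefined && seen.indexOf(current) === -1) {
      seen.push(current);
      current = subclassOf[current];
    }
  });
  return seen;
}
```

A content page typed as an Article would then also be treated as a CollaborativeDocument, so templates registered for either class could be triggered for it.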




                                   Figure 12. Basic Layout.




[2] http://dojotoolkit.org/
[3] http://www.json.org/
[4] SPARQL http://www.w3.org/TR/rdf-sparql-query/
[5] QVIZ Deliverable D3.3 Domain Ontology
[6] http://en.wikipedia.org/wiki/Portlet




                 Figure 13. Basic Layout with Workspaces example.

In Figure 12 and Figure 13, the basic layout of the portal is shown. It is composed
of portlets and some portlets can have additional divisions such as a tabbed layout.
The D6.4.2b User manual provides greater detail and examples.
In the main centre area, a variety of portlets can be stacked. Currently the most
important portlets are the User workspace and Communities portlet, the
visualization portlet of relationships overviews, a multi-featured search & browse
portlet, and a content portlet, where the user can perform various activities to
create content and to link and provide semantic relationships between the resources.
The workspace portlet provides the user with features to manage references to
resources either in their own workspace or in Community workspaces. The
workspaces help to organize resources, to facilitate further sharing, to further
categorize resources, or even to search. The workspace is also part of the
ontology, as it enhances search and visualization processes.
The relationships overview portlet provides the important feature of visualizing
knowledge-content. Additional details will be provided in the knowledge content
section. Relationships can be assigned between resources using the Relate feature
or by inserting the reference directly in the content. Authors can also share their
collaborative document resources with one or more communities, giving
members basic view or edit privileges for the textual content. To enhance
knowledge and archival resource access, the semantic descriptions and
visualization are always visible to all QVIZ users. Many of the resources created
are sharable by all users; these types of resources facilitate knowledge building
and will be discussed in subsequent sections.
Basic user management and access control support the integrated QVIZ platform,
which is accessed through the central authentication component that provides
single sign-on, or more appropriately enterprise reduced sign-on [7]. The QVIZ
application requires interaction with the collaborative environment, and fully
anonymous access is currently not supported, except for the Query
Visualization Environment.
It is important to distinguish normal Wiki access rights from those of the CET:
researchers might wish to conceal their full content before sharing; however, the
needs of the many require that we share as much knowledge as possible to enhance
access to the archival materials. A goal is to enhance all communities with
knowledge created in the environment, although users are given the means to share
full works with communities.

4.1.3 Plug-ins
The plug-ins are:
       •    Cross-domain Interaction JavaScript/AJAX library. This QVIZ tool
            provides the means for two applications in different domains to interact.
            It can also be used to ensure that only one instance of an AJAX based
            application is opened in the user’s browser. The faceted browser, for
            example, can call methods for the CET to perform and transfer the user to
            an existing window or open a new instance of the CET.
       •    3rd party AJAX libraries - Client browser environment communication to
            the server environment(s) is achieved using the Dojo toolkit, version 1+
            (January snapshot). Stability was difficult to achieve with both the
            early versions and the current version.

4.1.4 Social Knowledge Content
General features list
       •    Content Management
                •    Create typed resources based on the ontology. A resource type
                     might dictate the possible relationships it can have with other
                     resource types (ontology classes). The archival resource
                     description is an example of a metadata-rich resource type.
                     Ontology classes differ in their metadata properties and in
                     the named property relationships between classes. For
                     Collaborative document subtypes, however, we use a core set
                     of textual Dublin Core metadata properties (prefix dc:) to
                     provide basic descriptions and relationships among
                     collaborative documents, because it is best to use
                     properties that are common or can be mapped to other metadata
                     schemas. Potentially, many resources could be described using
                     one or more metadata schemas, such as a bibliographic scheme
                     (BibTeX). In the future we could consider using a
                     general bookmark to describe the referenced resources in
                     different metadata contexts.
                •    Content management
                         •    Edit content using a basic WYSIWYG editor (Tiny
                              MCE) and wiki-style open linking.


[7] Single sign-on, Enterprise reduced sign-on http://en.wikipedia.org/wiki/Single_sign-on




                •    Wiki Link building supports linking to a copied reference
                     or an item selected from a quick search.
•   Manage and organize collections in the user or community folders.
•   Copy/Paste references: A clipboard feature that copy/pastes references to
    workspace folders or into content.
•   Manage certain semantic annotations and metadata of typed resources
    based on the domain ontologies.
•   Perform semantic annotations indirectly using features such as by
    managing workspaces or creating resources.
•   Interface with an archival social bookmarking application to store the
    archival resource description, to make automatic semantic relations to
    users, communities, and the user created bookmark tag objects and
    automated tag extraction from the archival resource description.
•   Discuss feature at community or resource level.
•   Perform visualization of related resources.
•   Perform a simple search.
•   Perform filter search by resource type, such as Archival Social
    Bookmarks, Archival Resource Descriptions, Tags, Events, Collaborative
    Document subtypes such as Article or Tutorial.
•   Perform community navigation and organization of resources in
    collections.
•   Perform community management for restricted and unrestricted
    Communities by moderators. Manage membership applications and
    provide information about communities and their members.
•   Restrict or allow access to resources created in the site. Access to
    resources depends on several factors:
        •   Who authored the resource and their role
        •   The resource type. Many resource types are accessible by all
            users, except for most resources of type:
            qvizcet:CollaborativeDocument. (Posts and Questions are
            subtypes that are accessible by everyone).
        •   If the resource is a CollaborativeDocument resource, did the
            author share it with one or more communities (granting members
            either view rights or edit rights)? Is the user who requests
            access a member of one of those communities?
        •   Do community members have read or write rights to the
            resource? Is the resource still shared?
•   Share a CollaborativeDocument with one or more Communities.
•   Automated sharing of knowledge among all users.
        •   Resources that are not of type CollaborativeDocument are
            generally viewable and sharable by all users. For example:
            named entities and all social bookmarks are shared with all users.








             •   Archival Resource Descriptions are not modifiable by anyone
                 except the Administrator.
             •   Relationship Overviews and metadata on all resources are
                 viewable, including CollaborativeDocument.
    •    Grant members of a community either viewing or editing rights.
         Members with editing rights can edit content and change selected
         metadata and semantic annotations.
    •    View resources from the faceted browser.
    •    Search over archive resource description metadata.
             •   Search within the CET.
             •   Lookup archive resource images at the archive portal.
             •   Visualize an administrative unit on the map in the query
                 environment.
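
The access factors listed above (author, resource type, sharing, and community rights) could be combined roughly as follows. This is a minimal sketch, not the CET implementation: the `User` and `Resource` records, the `can_access` function, and the rule that authors and administrators always keep access are illustrative assumptions; only the type names come from the text.

```python
from dataclasses import dataclass, field

# Hypothetical, simplified records; the CET keeps this information in its ontology.
@dataclass
class User:
    name: str
    communities: set = field(default_factory=set)
    is_admin: bool = False

@dataclass
class Resource:
    types: set    # direct and inherited types, e.g. {"Article", "CollaborativeDocument"}
    author: str
    shared_with: dict = field(default_factory=dict)  # community -> "view" or "edit"

# Subtypes of CollaborativeDocument that the text describes as open to everyone.
OPEN_SUBTYPES = {"Post", "Question"}

def can_access(user, resource, wanted="view"):
    """Return True if `user` may view (or edit) `resource`."""
    # Assumption: authors and administrators always keep access.
    if user.is_admin or user.name == resource.author:
        return True
    restricted = ("CollaborativeDocument" in resource.types
                  and not resource.types & OPEN_SUBTYPES)
    if not restricted:
        # Tags, bookmarks, Posts, Questions etc. are viewable by all users.
        # Assumption: this sketch grants no edit rights on them to non-authors.
        return wanted == "view"
    # Restricted document: access only through a community it is shared with.
    for community, rights in resource.shared_with.items():
        if community in user.communities and (wanted == "view" or rights == "edit"):
            return True
    return False
```

A document shared with view rights is then readable, but not editable, by members of that community, and invisible to everyone else.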

4.1.5 Knowledge Content Query Manager
Services available for external components are executable via an AJAX library.
These actions provide basic integration features for searching and browsing the
resources stored in the CET. The faceted browser, for example, is able to invoke
actions to be performed within the CET.

4.1.6 Collection Manager
Collections are a means of organizing resources created or found in the User
workspace and Community. A user workspace is a subclass of a
RestrictedCommunityOfPractice and all Communities have a workspace of
collections represented as trees in the User Interface workspace portlet. These
collection objects can also have their own content description, metadata and
relationships to other resources assigned by users directly or automatically via user
interactions such as the copy/paste action, whereby a resource is associated with a
particular collection. Additionally, collections are referenced in the relationships
overview portlet.
Collections can also be referenced in the content and are a means of aggregating
resources during collaboration. In this sense, they can function as aggregated
social bookmarks.
User and Community workspaces are currently based on the ontology because of
the advantages of using SPARQL queries. The user workspace is a subclass of the
restricted Community of Practice, and the folder types are subclasses of
sioc:Container: qvizcet:userFolder and qvizcet:copFolder.
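
One advantage of keeping workspaces in the ontology is that the content of a folder can be retrieved with a single graph pattern. The sketch below illustrates the idea with plain Python tuples instead of a real SPARQL engine; the roughly equivalent SPARQL is given in a comment. The `ex:` instance names and the helper functions are made up for illustration, and the folder membership relation is written here as qvizcet:references.

```python
# Triples as (subject, predicate, object) tuples using prefixed names.
# The ex: instances are hypothetical sample data.
TRIPLES = {
    ("ex:annaSources", "rdf:type", "qvizcet:userFolder"),
    ("ex:parishMaps", "rdf:type", "qvizcet:copFolder"),
    ("ex:annaSources", "qvizcet:references", "ex:bookmark1"),
    ("ex:annaSources", "qvizcet:references", "ex:article7"),
    ("ex:parishMaps", "qvizcet:references", "ex:bookmark1"),
}

def match(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

def folder_contents(triples, folder_type):
    """Map each folder of the given type to the sorted resources it references.

    Roughly equivalent SPARQL (prefix declarations omitted):
      SELECT ?folder ?res
      WHERE { ?folder rdf:type qvizcet:userFolder ;
                      qvizcet:references ?res }
    """
    result = {}
    for folder, _, _ in match(triples, p="rdf:type", o=folder_type):
        referenced = match(triples, s=folder, p="qvizcet:references")
        result[folder] = sorted(obj for _, _, obj in referenced)
    return result
```

Because a resource may be referenced from several folders, the same bookmark can appear in a user folder and in a community folder at once.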

4.1.7 Communities and member management
Users can create new communities. All users belong to a default community, with
which many resources are automatically associated if they are relevant to the site
knowledge, such as tags and named entities. Users join new communities by applying
to restricted or unrestricted communities. Once they are members, they gain access
to the community-level discussion feature.
    •   Create restricted and unrestricted Communities based on the domain
        ontology.
    •   Manage members and membership, especially for restricted communities
        where moderators need to accept new applications for membership.







       •   Manage community collections.
As discussed earlier, a user workspace is a community of one, defined as a
restricted community. Users could potentially create further communities that
function as personal workspaces. The main intention of the user workspace is
to provide a staging area for sharing resources with other communities. A new
resource must always belong to one community, which by default is the user
workspace; from there, users may remove certain typed resources or share them
with one or more communities.

4.1.8 Publication Manager
This involves the exporting of content and semantic descriptions for external
systems or usage. This is part of the Knowledge content integration related tasks.
Resources can be exported in RDF/XML; however, the export for the Knowledge
Content model considers a broader range of associations and can include
additional metadata of associated objects and possibly their content. The content
of associated objects such as tags and events may be exported, but not if the
associated resource is of the type qvizcet:CollaborativeDocument, such as another
associated Article.
The target resource for export and its associated resources are described in a new
RDF graph model based on the domain ontology. The semantic description
includes both data type properties and object properties (URIs) for each resource;
however, not all content is exported automatically, because this depends on the
type of the particular resource. In general, the system does not export the content
of associated objects with the direct or inherited type CollaborativeDocument,
CommunityUserProfile or User. The export of content for all other associated
resources (CommunityOfPractice, Tag, Event, Organization,
ArchiveResourceDescription, ArchiveSocialBookmark, etc.) helps to describe the
target resource more completely in new environments and can also be used
to supplement the description of the Knowledge Content Object facets 8.
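
The export rule described above can be sketched as a simple filter. This is an illustration under stated assumptions, not the Publication Manager's code: the dictionary shape of a resource and the function names `export_record` and `export_graph` are hypothetical, and real exports are RDF graphs rather than Python dictionaries.

```python
# Types whose content is withheld when they appear as *associated* resources.
NO_CONTENT_TYPES = {"CollaborativeDocument", "CommunityUserProfile", "User"}

def export_record(resource, is_target):
    """Build the exported description of one resource.

    `resource` is a dict with the keys "uri", "types" (direct and inherited
    types), "metadata" and "content"; this shape is an assumption.
    """
    record = {
        "uri": resource["uri"],
        "types": sorted(resource["types"]),
        "metadata": dict(resource["metadata"]),  # semantic description is always exported
    }
    withheld = not is_target and bool(set(resource["types"]) & NO_CONTENT_TYPES)
    if not withheld:
        record["content"] = resource["content"]
    return record

def export_graph(target, associated):
    """Export the target with its content, and associated resources per the rule."""
    return [export_record(target, True)] + [export_record(r, False) for r in associated]
```

With this filter, an associated Tag or Event is exported with its content, while an associated Article contributes only its metadata and types.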

4.1.9 RDF based Repositories

4.1.9.1 Overview RDF based Repositories
The primary focus was to create RDF based repositories that could be queried
using the SPARQL9 query language. The frameworks, Jena10 and Sesame12,
support the storage, inference, and querying of RDF data. The collaborative
environment uses the SPARQL query engine ARQ11, a subcomponent of Jena.
Jena is configured with an in-memory OWL model that uses RDFS-level inference,
OWL_MEM_RDFS_INF, to support the application API and SPARQL queries.
The SKOS vocabulary application is based on Sesame12 rather than Jena because
of the interesting developments in this particular framework. Sesame supports
context or named graphs without the need for reification to express a context for a


8
    QVIZ D3.4 Knowledge Content Object (KCO) Report
9
    SPARQL http://www.w3.org/TR/rdf-sparql-query/
10
     Jena http://jena.hpl.hp.com/wiki/Main_Page
11
     ARQ http://jena.hpl.hp.com/wiki/Main_Page
12
     Sesame http://www.openrdf.org/








resource. In the Collaborative Environment, social bookmarks can be used as a
means of expressing context about referenced resources.
Currently, the SKOS vocabulary builder uses Sesame version 2+, although it
could be configured for Jena. To conserve resources and improve performance,
a custom reasoner supports the application instead of an OWL
reasoner13.

4.1.9.2 Domain Ontology
The Dolce14 based domain ontology and integrated ontologies are described in the
deliverable D3.3 Domain Ontology. The following table provides an overview of
some of the external ontologies used and briefly describes their roles.
Ontology Name                      Notes
(Namespace prefix)
Dolce14 (dul)                      Foundational ontology
Creative Commons15 (cc)            Provides basic rights descriptions for resources,
                                   such as License, Requirement, Restriction, and
                                   Prohibition.
Dublin Core16 (dc),                Provides basic textual metadata for resources,
Dublin Core Terms (dct),           such as dc:title, dc:creator, dc:language,
Dublin Core DCMI Type              dct:created, dct:temporal, dct:spatial, and
                                   dc:source (URL of a generic system social
                                   bookmark or of any resource’s source);
                                   dcmitype/Image supports the SKOS symbol.
                                   The domain ontology includes properties that
                                   could be mapped to Dublin Core, such as
                                   dul:hasPart or qvizcet:references; however, the
                                   domain properties are usually OWL object
                                   properties that explicitly define their inverse
                                   properties, which are not supported in the
                                   Dublin Core ontologies.




13
  Scaling Jena in a commercial environment,
http://jena.hpl.hp.com/juc2006/proceedings/portwin/paper.pdf
14
     Dolce http://www.loa-cnr.it/DOLCE.html
15
     http://creativecommons.org/
16
     http://dublincore.org/








SKOS17 (skos)                       Simple Knowledge Organisation System
                                    In the domain ontology, skos:Concept is the
                                    superclass of qvizcet:Tag and a subclass of
                                    (Dolce) dul:AbstractQuality. The user is able
                                    to tag resources using the relations
                                    skos:subject or skos:primarySubject. To
                                    associate symbols, the user might either use
                                    the dc:source metafield or associate a
                                    dcmes:Image via the skos:prefSymbol relation.
                                    Optionally, users might associate a Tag with a
                                    qvizcet:Vocabulary, a type of SKOS collection.
                                    In the SKOS vocabulary builder, skos:Concept
                                    is associated with schemes.
SIOC18 (sioc)                       Semantically-Interlinked Online Communities
                                    This ontology supports the discussion and
                                    workspace collection (folder) features. The
                                    user and Community workspaces use specific
                                    folder types (qvizcet:userFolder or
                                    qvizcet:copFolder) to support organization and
                                    queries; both are domain ontology subclasses
                                    of sioc:Container. When adding resources to a
                                    folder, for example, associations are made via
                                    the qviz:references relation.
FOAF19 (foaf)                       Friend of a Friend
                                    Required by SIOC
Trust20 (trust)                     Trust Ontology
                                    Of future interest to support the ranking or
                                    rating of resources, including persons, for a
                                    particular subject. It expresses trust in a
                                    person or agent with respect to a particular
                                    subject or resource, which is also a means of
                                    ranking.
DILIGENT21                          DILIGENT V.3
Argumentation Ontology              In this prototype, the project discontinued the
                                    use of the argumentation ontology because
                                    users can just as easily employ tags to
                                    communicate these concepts.


17
     http://www.w3.org/2004/02/skos/
18
     http://sioc-project.org/
19
     http://www.foaf-project.org/
20
     http://trust.mindswap.org/
21
     http://www.aifb.uni-karlsruhe.de/WBS/dvr/publications/stica2006engler.pdf








4.1.10 non-RDF based storage
A relational model supports user and content management. A separate
specialized repository, containing extracts from the whole collaborative
environment, can be made available to external applications. The specialized
Content Repository was initially also used to support collection management for
the user and community workspaces; in the proof-of-concept, however, the project
decided to focus on RDF based representations to integrate better with the SPARQL
query feature. In the future, tools could provide workspace management using
relational models that either synchronize with the ontology or are supported by a
SPARQL query engine extension. QVIZ currently supports workspace folder
collections using the ontology, through the sioc:Container subclasses
qvizcet:UserFolder and qvizcet:CopFolder.

4.1.11 Single Sign On and User Management
Single sign-on is described further under the Middleware Components section.
The collaborative environment includes a client interacting with the Central
Authentication Service (CAS)22 and provides the user database for CAS. This
supports other QVIZ components, especially the Social Bookmarking client,
which stores objects in the collaborative environment for the authenticated user.

4.2 Knowledge Content Integration
4.2.1 Visualization / Presentation using SPARQL based Templates
A flexible and intuitive solution for visualizing knowledge content is displayed in
the Relationships Overview portlet. A specialized SPARQL template transformer
renders the sets of SPARQL query/template pages associated with the types of
each resource displayed. Types include the resource’s direct and inherited
ontology classes. For example, all resources share a common type, the superclass
dul:Entity, which triggers one SPARQL template (a set of queries and template
content) in the Relationships Overview portlet; when a resource is associated with
additional ontology classes, additional SPARQL query templates are triggered for
that resource. The SPARQL query templates are part of the common content and can
be updated on request by the site Administrator. The current sets of templates are
stored as content and associated with a specific namespace identifiable by the
visualization software.
Each SPARQL content template is stored as content within the CET under the
sparqlclass namespace – this makes it convenient to respond to user
customizations in a particular installation. The following is a sample list of classes
supported by SPARQL content templates. Because of class inheritance, one or
more templates are triggered for all resources.
           dul:Entity
           qvizcet:SocialResource
           qvizcet:Tag
           qvizcet:Vocabulary
           qvizcet:CommunityOfPractice
           qvizcet:ArchivalResourceDescription

22
     Central Authentication Service http://www.ja-sig.org/products/cas/








        qvizcet:ArchivalSocialBookmark
        qvizcet:CollaborativeDocument
        sioc:Post
        sioc:Forum
        User
Potentially, the relationship overview could also be generated and merged with
exported content objects. The administrators only need to make modifications to
the XSL style sheet for customizations.
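
The way class inheritance triggers templates can be illustrated with a small sketch. It assumes a hand-written excerpt of the class hierarchy and made-up template page names under the sparqlclass namespace; the actual transformer, the full hierarchy and the template naming scheme are not given in this report.

```python
# Direct superclass links: a small, partly assumed excerpt of the hierarchy.
SUPERCLASSES = {
    "qvizcet:Tag": ["skos:Concept"],
    "skos:Concept": ["dul:AbstractQuality"],
    "dul:AbstractQuality": ["dul:Entity"],
}

# Classes that have a SPARQL content template (page names are hypothetical).
TEMPLATES = {
    "dul:Entity": "sparqlclass/dul:Entity",
    "qvizcet:Tag": "sparqlclass/qvizcet:Tag",
}

def all_types(direct_type):
    """Return the direct type followed by every inherited type, breadth-first."""
    seen, queue = [], [direct_type]
    while queue:
        cls = queue.pop(0)
        if cls not in seen:
            seen.append(cls)
            queue.extend(SUPERCLASSES.get(cls, []))
    return seen

def templates_for(direct_type):
    """List the templates triggered for a resource of the given direct type."""
    return [TEMPLATES[cls] for cls in all_types(direct_type) if cls in TEMPLATES]
```

Every resource reaches dul:Entity through its superclasses, so at least the generic template is rendered; more specific classes add their own templates ahead of it.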

4.2.2. Archival Social bookmarking: User context, Archive and Community
knowledge
The integration of archival social bookmarking objects is the cornerstone of
knowledge content within QVIZ – social bookmarking contributes resources from
two perspectives: 1) the user context, as described by the ArchivalSocialBookmark
object, and 2) the archive context, described by the ArchivalResourceDescription
objects, which can be referenced by other ArchivalSocialBookmark objects and
resources within the CET.
Based on the ontology, an Archival Social bookmark is a subclass of a social
bookmark that additionally requires a resource description of type
qvizcet:ArchivalResourceDescription. Simple Social bookmarks can be created in
the CET to provide a means for users to create a context – a semantic description -
about a resource. However, Archival Social bookmarks must be created by the
social bookmarking tool, which ensures that the user has actually examined the
digital archival resource and its description at the archive portal. From any
archival social bookmark or archival resource description, the user can view the
relationships to other resources (relationship overview), metadata descriptions for
archival resources and the associated archival social bookmarks. There are a
variety of ways to associate the archival social bookmark or archival resource
description with other collaborative content: by assigning folder collections, by
including the reference within the content via the Wiki Link builder feature, or by
applying named relations to the target resource using the relate functionalities.
This is explained and described further in the D6.4.2b User Manual.
In Figure 14, the archival resource description
(qvizcet:ArchivalResourceDescription) provides the search enabled metadata
properties for other associated resources, the relationship overview portlet, access
to the archive resource on the archival portal, and descriptions of bookmarks
made by users in communities.








                  Figure 14. Archival Resource Description Object.

4.2.3. Component Integration via Archival Resources
Other QVIZ components, especially the Faceted Browser, can access knowledge and
content through a cross-domain solution developed in QVIZ. This addresses the
cross-domain security issues of sending information to other browser windows,
which means that QVIZ components can be set up on different servers without the
use of proxy servers or maintaining one common server. The collaborative
environment is maintained in one window, which receives requests for either a
CET resource or, indirectly, the CET’s archive resource context. The latter uses
the archive access URL archiveResourceRedirectURL, which is also found in a
result list of the faceted browser or the social bookmarking application. If
additional application components are added in the future to form a suite of
archival tools, this solution makes it possible to include them. The following
figure provides demonstrations and sample coding for using the cross-domain
integration features by other QVIZ components.




           Figure 15. Cross domain Integration within the Web Browser.

From the Faceted browser result list in the Query Visualization environment, a
user can invoke the CET to open and list the results for resources associated with a
particular query. In Figure 16, a facetted browser query into the CET is displayed
where the user can browse resources and the semantic net of resources.







     Figure 16. Display archival resources for a collection selected from the Facetted
                                   Browser component.

4.2.4. Simple Knowledge Organisation (SKOS)
SKOS was used as the model for QVIZ tags within the collaborative environment, as
described in the domain ontology. However, within the environment, QVIZ does
not include tools for creating more structured vocabularies that further exploit the
SKOS model. Instead, a separate application prototype was developed that might
be used to supplement vocabulary building for cross-linking to SKOS based tags
in the collaborative environment.
The skosification of the environment goes well beyond most tagging solutions,
and coupled with creating community vocabularies or creating tags in
communities, users are able to create folksonomies.

4.2.4.1. SKOS based Tagging Support in the Collaborative Environment
Tagging in the collaborative environment creates SKOS based concepts, which
can be optionally referenced within simple vocabularies created by community
members. These vocabularies are SKOS collections rather than SKOS concept
schemes, because these tags are closer to Folksonomies23, which the archive
experts requested. However, these tags are available to the site for association with
simple community vocabularies. A more formal tool for creating more structured
SKOS vocabularies was kept separate from the main collaborative environment.
Tag objects are either manually created by users within the Collaborative
environment or automatically generated during the creation of particular objects,
such as objects of type qvizcet:ArchivalSocialBookmark or
qvizcet:ArchivalResourceDescription. The automated tagging extracts properties
such as any keyword property or other designated properties of interest.
Folksonomies for communities (Commsonomies24) are achieved from different
perspectives: a user can store the initial Tag object under their default workspace
and share it with one or more users. All Tag objects can be reused by all
communities, and communities can give greater emphasis to Tags by collecting
them in vocabularies directly related to the Community. In this manner, there are
ways to relate tags in Commsonomies and visualize them in the Relationships
Overview portlet.
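
The automated tagging step described in this section could look roughly like the sketch below. It is an assumption-laden illustration: the function name `extract_tags`, the dictionary form of a resource description, the lower-casing of labels and the shape of a tag record are all hypothetical; only the idea of extracting keyword (or other designated) properties into reusable Tag objects comes from the text.

```python
def extract_tags(description, tag_properties=("keyword",), existing=None):
    """Return the tag labels found in the designated properties of a description.

    New labels are added to `existing` (a dict label -> tag record) so that
    Tag objects are reused rather than duplicated across resources.
    """
    tags = {} if existing is None else existing
    used = []
    for prop in tag_properties:
        values = description.get(prop, [])
        if isinstance(values, str):
            values = [values]              # allow a single value or a list
        for value in values:
            label = value.strip().lower()  # assumption: labels are normalised
            if not label:
                continue
            tags.setdefault(label, {"type": "qvizcet:Tag", "prefLabel": label})
            if label not in used:
                used.append(label)
    return used
```

Passing the same `existing` store for every new bookmark or resource description keeps one Tag object per label, which matches the statement that all Tag objects can be reused by all communities.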

4.2.4.2. SKOS Vocabulary Builder and Search Application
Early in the project, one goal was to provide users with a web-based vocabulary
building and search tool based on SKOS, the Simple Knowledge
Organization System. During the project, the archive experts expressed more
interest in yet simpler tagging solutions; therefore, the project uses SKOS based
tagging in the Collaborative environment without a tight coupling to the SKOS


23
     Folksonomy http://en.wikipedia.org/wiki/Folksonomy
24
  http://www.slideshare.net/klamma/communityaware-semantic-multimedia-tagging-
from-folksonomies-to-commsonomies/








vocabulary builder tool. Users can then create simple skos:Collection objects to
associate their SKOS based tags.
The SKOS demo application is integrated with QVIZ at the level of user
management and single sign-on. In addition, its JavaScript/AJAX library can be
used by other applications (mash-ups), although the more central issue is how users
wish to search or browse resources based on the tool.
Currently, within the collaborative environment, users can cross-reference a
qvizcet:Tag (subclass of skos:Concept and dul:AbstractQuality) to a concept in the
vocabulary builder tool by referring to the SKOS concept in the qvizcet:Tag’s
dc:source property. This provides a possible integration point for basing future
Use Cases connecting more formal SKOS vocabularies with the more
“Folksonomy” oriented qvizcet:Tag.
A more detailed description of this tool is provided in the Appendix section.

4.2.5. Knowledge Content Export to Knowledge Content Object (KCO) aware
Environments
The Knowledge Content Object (KCO) is used to support semantic web services,
services that are provided according to a plan described within the KCO itself. The
GRISINO 25 project will use knowledge and content exported from QVIZ and
transform the exported serialization into a KCO conforming to their environment.
The transformation is facilitated by the QVIZ KCO and Dolce based domain
ontology. A service will be provided by GRISINO to accept a generated RDF
based graph serialization exported from the collaborative environment.
Currently, the Publication Manager provides functionalities to export a serialized
OWL RDF graph based on the domain ontology. The QVIZ KCO schema includes
the domain ontology and mappings. From this stage, the serialized graph can be
further processed by services of a target KCO environment. The serialized graph
includes content and semantic descriptions of many objects associated to the
particular qvizcet:CollaborativeDocument that we wish to export. These objects
can include the descriptions of named entities, SKOS tags, involved communities,
directly associated archival social bookmarks and archive resource descriptions,
and related qvizcet:CollaborativeDocument resources. Normally, the textual content
of a resource is exported, except for the text content of other CollaborativeDocuments.
KCOs are also described as a “unit of value which can be manipulated by
Semantic Web Services (SWS)”26 – they support the content and knowledge
description. Service providers of KCO aware environments, such as GRISINO,
might offer a series of semantic web services. Each KCO might include a plan that
describes which services should be performed. By some agreement with
publishers, a plan might be automatically attached when a new KCO is
published in the environment. For the QVIZ KCO, the GRISINO project will offer
the environment for handling KCOs and provide added-value services, such as
automated information extraction and inclusion in an alternative search and
publication environment supported by GRID infrastructure and semantic web
services. They will receive and transform the incoming KCO using specific
QVIZ transformers and include a plan that provides a default set of services for

25
     GRISINO http://www.grisino.at/
26
     Semantics conference 2006 http://www.semantics2006.net/








QVIZ KCOs. The transformed KCOs from QVIZ are supported by the
middleware described as KCCA or Knowledge Content Carrier Architecture.

4.2.5.1. Knowledge Content Carrier Architecture




                    Figure 17. Knowledge Content Carrier Architecture (KCCA) 27

The following KCCA discussion27 is extracted from resources in projects
supporting and defining the KCCA. The goal is to provide an overview of where
the QVIZ resources can be exported into new knowledge aware environments.
One important reason for exporting materials is to gain the added value
of semantic web services offered in other environments. GRISINO provides
semantic web services and GRID services that use the KCO as the knowledge
content unit.
The Knowledge Content Carrier Architecture (KCCA) is the semantic middleware
providing support for the semantic definition of tasks and specific services.
The KCCA Middleware is composed of the following primary components:
       •   KCCA Repository: The KCCA Repository, which is shown at the bottom
           of Figure 17, provides interfaces with databases for storage of content,

27
     Metokis http://metokis.salzburgresearch.at/
Wolfgang Maass, Sunil Goyal, Wernher Behrendt (2004): Knowledge Content Objects and
a Knowledge Content Carrier Infrastructure for ambient knowledge and media aware
content systems.
http://metokis.salzburgresearch.at/files/papers/maass_et_al_2004_knowledge_content_carrier.pdf
Grisino D1.1 State of the Art in Semantic Web Services, Grid and Intelligent Content
Objects - Can they meet?








    metadata, ontologies and KCOs (Knowledge Content Objects). The
    metadata within the KCCA middleware is stored in a RDF database.
•   KCCA Middleware Components: The KCCA Middleware Components
    provide specific components and modules that enable the building of the
    actual middleware e.g. authentication, workflow engine, session
    management, inference engine, rule layer, and system registry.
•   KCCA Services Container (Request Broker): The KCCA Request
    Broker provides support to plug in middleware components and also
    provides support for system and domain level services. The domain level
    services include services for all three application domains, services related
    to multimedia systems (digital rights management etc), registry services
    etc. The system level services include services for accessing KCCA
    Repositories, KCCA Middleware components etc. It also includes KCO
    services, which provide access, query and manipulation of KCOs.
•   KCTP: KCTP (Knowledge Content Transfer Protocol) is a lightweight
    request/response protocol implemented by the KCCA Middleware that
    allows applications to perform operations on KCOs.








5. Middleware
Middleware is the software that sits “in the middle” between applications (e.g. a
word processing program) working on different operating systems (Unix,
Windows, z/OS, etc.). It is similar to the middle layer of a three-tier single system
architecture, except it is stretched across multiple systems or applications.
Examples include database systems, telecommunications software, transaction
monitors and messaging-and-queuing software28.
The QVIZ project features several middleware components in order to create an
integrated system.

5.1 Core Portal Components QVIZ User Management
A decision was made to create a common user management structure in QVIZ.
This presents the user with a seamless experience within the system. A central
component is used as the identity provider for the QVIZ system. This component
is deployed in the CET.
It provides the means to identify users to all components that require
authentication, but also provides user identifiers for data input/output requirements
within QVIZ components.
The technical solution for the user management is the Central Authentication
Service, CAS29 for short. It is used to share the user identities between the
different modules of the QVIZ platform.
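
For readers unfamiliar with CAS, the sketch below shows how a component acting as a CAS client typically builds the two URLs involved in a login: the redirect to the CAS login page and the subsequent ticket validation request. It assumes the standard CAS protocol endpoints `/login` and `/serviceValidate`; the server address and the service URL are hypothetical, and this report does not state which CAS protocol version or client library QVIZ used.

```python
from urllib.parse import urlencode

CAS_BASE = "https://cas.example.org/cas"  # hypothetical CAS server address

def login_url(service_url):
    """URL the browser is redirected to when the user is not yet signed in."""
    return f"{CAS_BASE}/login?{urlencode({'service': service_url})}"

def validate_url(service_url, ticket):
    """URL the component calls server-side to validate the returned ticket."""
    query = urlencode({"service": service_url, "ticket": ticket})
    return f"{CAS_BASE}/serviceValidate?{query}"
```

After a successful login, CAS redirects the browser back to the service URL with a `ticket` parameter; the component validates that ticket once and thereby learns the shared user identity.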

5.1.1 User Management
The user can apply for an account at the QVIZ website. The following form is
filled out by applying users:




                                Figure 18. User sign-up form.

The user management is handled as shown in the picture below.




28
     http://en.wikipedia.org/wiki/Middleware
29
     http://www.ja-sig.org/products/cas/








[Figure: flow chart. User -> Applies -> Decision. Rejected -> Contact user.
Accepted -> Add user to system -> Contact user.]



                     Figure 19. User management process flow.

To clarify:
    1. The user wants to try QVIZ.
    2. The user applies using the web form shown above.
    3. An e-mail is sent to the QVIZ administrators.
    4. If the user is accepted, the username and password are sent to the user
       by e-mail.
    5. If the user is rejected, the user is notified by e-mail.

5.2 Archive Resource Hub
The Archive Resource Hub (ARH) serves as the integration broker for all components of
the QVIZ system.
The ARH is split into several modules and an
Archive Abstraction Layer (AAL), which in turn has its own modules. Each
module represents either an entity within the system, such as a bookmark, or an
interface to another system. The same is true of the AAL modules, since they act as
interfaces to the archives. The AAL modules give the QVIZ system a common API
through which new archives can be introduced. This API is specified in Appendix B.

5.2.1 ARH Modules
The modules utilize a hook function system. Whenever a significant action takes
place in the ARH, such as rendering a form or saving a bookmark, an appropriate
hook for that action is fired, which lets all modules tap into that functionality and
in turn act on the event. Because of this, the ARH’s own flow of code is mostly
unaffected by which modules are plugged in and what they do.
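
The hook mechanism described above can be illustrated with a minimal sketch. It is written in Python for brevity and is not the ARH code itself; the names register_hook, fire_hook and the hook name "bookmark_save" are assumptions made for this example.

```python
from collections import defaultdict
from typing import Any, Callable

# Registry mapping a hook name to the callbacks that modules have registered.
_hooks: dict[str, list[Callable[..., Any]]] = defaultdict(list)


def register_hook(name: str, callback: Callable[..., Any]) -> None:
    """Let a module subscribe to a named hook."""
    _hooks[name].append(callback)


def fire_hook(name: str, *args: Any) -> list[Any]:
    """Call every callback registered for the hook and collect the results."""
    return [callback(*args) for callback in _hooks[name]]


# Two hypothetical modules tapping into the same event.
def bookmark_on_save(bookmark: dict) -> str:
    return f"bookmark module stored '{bookmark['title']}'"


def cet_on_save(bookmark: dict) -> str:
    return f"CET module converted '{bookmark['title']}'"


register_hook("bookmark_save", bookmark_on_save)
register_hook("bookmark_save", cet_on_save)

# The core only fires the hook; it does not know which modules are listening.
results = fire_hook("bookmark_save", {"title": "Parish register"})
```

The core code path stays the same whether zero, one or many modules have registered for a hook, which is the property the section describes.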








5.2.2 Bookmark module
The bookmark module represents the bookmark entity. It is responsible for
loading, storing and saving information associated with the bookmark, as well as
providing graphics and validation of the bookmarking part of the bookmarking
form. It also contains a set of utility functions related to the bookmarking
functionality.

5.2.3 CET module
The CET module acts as the interface to the Collaborative Environment (CET). It
is responsible for authenticating users to the CET’s CAS as well as converting
bookmarks to the CET format and storing them in the CET. It also provides the
low-level interface for fetching CoPs (see below).

5.2.4 CoP module
The CoP module is responsible for dealing with Communities of Practice (CoP). It
loads and saves CoPs (although the actual low-level loading is done through the
CET module). It also provides the rendering of available CoPs in the bookmarking
form as well as validation of that element of the form.

5.2.5 Resource module
The Resource module represents the resource entity. It has the capability of
loading and saving resources recursively through their hierarchies (though the
actual low-level loading is done in the AAL modules, more about that later). It
also provides rendering of the resource information part of the bookmarking form
as well as some resource related utility functions.

5.2.6 User module
The user module is responsible for loading, saving and authenticating users. This
includes authenticating the users with the CAS server as well as keeping track of
sessions and rendering the necessary form elements.

5.2.7 Archive Abstraction Layer module
The AAL modules handle all archive-specific issues so that the rest of the
system does not have to. Their job is to provide a consistent interface to an archive
regardless of how that archive is implemented. They adapt to the archive-specific
ways of retrieving data and compensate for any quirks the archive might
have.
A specially crafted AAL module is needed for each archive that is to
be supported by the QVIZ system. This is, however, the only adaptation that
should be necessary: the rest of the system will work as expected as long as the
AAL modules follow the specified protocol.
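
The idea of one common interface with one implementation per archive can be sketched as follows. This is an illustration in Python only: the actual API is the one specified in Appendix B, and the class names, method names and the "PREFIX:id" identifier scheme used here are hypothetical.

```python
from abc import ABC, abstractmethod


class ArchiveModule(ABC):
    """Hypothetical common interface that every archive module implements."""

    @abstractmethod
    def fetch_resource(self, archive_id: str) -> dict:
        """Return the resource data for an archive-specific identifier."""

    @abstractmethod
    def to_qviz_id(self, archive_id: str) -> str:
        """Translate an archive-specific identifier into a QVIZ identifier."""

    @abstractmethod
    def to_archive_id(self, qviz_id: str) -> str:
        """Translate a QVIZ identifier back into the archive's identifier."""

    @abstractmethod
    def archive_info(self) -> dict:
        """Return information about the archive this module supports."""


class InMemoryArchive(ArchiveModule):
    """Toy implementation backed by a dictionary instead of a web service."""

    def __init__(self, prefix: str, records: dict):
        self._prefix = prefix
        self._records = records

    def fetch_resource(self, archive_id: str) -> dict:
        return self._records[archive_id]

    def to_qviz_id(self, archive_id: str) -> str:
        # Illustrative scheme only: prefix the archive id with the archive code.
        return f"{self._prefix}:{archive_id}"

    def to_archive_id(self, qviz_id: str) -> str:
        prefix, _, archive_id = qviz_id.partition(":")
        if prefix != self._prefix:
            raise ValueError(f"{qviz_id!r} does not belong to {self._prefix}")
        return archive_id

    def archive_info(self) -> dict:
        return {"code": self._prefix}


archive = InMemoryArchive("EAA", {"200250813866": {"reference": "EAA.3147.1.205"}})
qviz_id = archive.to_qviz_id("200250813866")
```

Code written against ArchiveModule does not need to change when a further archive is added; only a new subclass is required.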
The primary functions of these modules are fetching resource data from the
archives and translating between archive-specific resource identifiers and
QVIZ resource identifiers. They also provide other functions to the
rest of the system, such as supplying information about the archive they support.
A simplified example of the XML output exchanged, with the
associated parameters, is shown below. Note that the XML structure is the same
for both archive modules:







<description id="200250813866" parent_id="200250076845">
    <title>Personaalraamat XXIX (Vana-Koiola, Partsi, Viira, Uibujärve, Saarjärve, Põlva pastoraat, Vastse-Koiola)</title>
    <date>1881-1891</date>
    <reference>EAA.3147.1.205</reference>
    <level>AHV</level>
    <language>ET</language>
</description>
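
A consumer of this XML could read it as sketched below. The sketch uses Python's standard xml.etree.ElementTree module; the function name parse_description and the flat dictionary it returns are assumptions for illustration, not part of the QVIZ API.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<description id="200250813866" parent_id="200250076845">
    <title>Personaalraamat XXIX (Vana-Koiola, Partsi, Viira, Uibujärve,
Saarjärve,   Põlva pastoraat, Vastse-Koiola)
    </title>
    <date>1881-1891</date>
    <reference>EAA.3147.1.205</reference>
    <level>AHV</level>
    <language>ET</language>
</description>"""


def parse_description(xml_text: str) -> dict:
    """Flatten one <description> element into a dictionary."""
    root = ET.fromstring(xml_text)
    record = {"id": root.get("id"), "parent_id": root.get("parent_id")}
    for child in root:
        # Collapse line breaks and repeated spaces inside element text.
        record[child.tag] = " ".join((child.text or "").split())
    return record


record = parse_description(SAMPLE)
```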


5.2.8 SAS Module
The module for the Swedish Archives (SAS) is the intermediate communication module
between one of the core parts of the ARH (the Archive Abstraction Layer) and the
Swedish archival component. The SAS module is implemented in PHP
and communicates with the archival web services via a SOAP
interface.
The SAS module fetches information from the archive, provides the database
handling for storing resource data in the QVIZ database, parses
the parameters required by the QVIZ system and translates identifiers
from the archival format into the ARH format. The fetching functionality is
required for displaying the information to the user when bookmarking an archive
resource.
The identifier of the archival description unit is passed from the ARH to the SAS
module in order to request the relevant resource information. The parameters
retrieved by the web service are the following:
    •   Identifier: an alphanumeric string. It may contain
        none, one or several parent ids.
        This is an example of a SAS identifier:
        {45E42C3C-0FF7-4915-AFCF-492ABCD3B872}_A0025818_00008
        It comprises several parts: the value in braces is the pure id, the next
        item, separated by “_”, is the batch (“A0025818”) and the final number is
        the identifier of the specific image (“00008”).
    •   Title: a string that represents the title of the resource.
    •   Date: an array in which a period of time can be stored.
    •   Reference: the reference code for the Archives’ resources.
    •   Level: a string for the Swedish resources. The possible
        levels are “Arkiv”, “Serie” and “Volym”.
        In the structure of the Swedish Archives, Arkiv is above Serie, which in
        turn is above Volym.
    •   Language: a string representing the language of the resource,
        based on the ISO 639-1 standard (two-letter language codes).
If the parent AU information is required, the SAS module goes
through the data recursively to populate the whole XML structure.
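
The identifier layout described in the list above can be split into its parts as sketched here. The sketch covers only the shape of the single example given (a braced id, optionally followed by a batch and an image number); how parent ids are encoded is not shown in this report, so that case is left out. The names SasIdentifier and parse_sas_identifier are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class SasIdentifier(NamedTuple):
    pure_id: str
    batch: Optional[str]
    image: Optional[str]


# {pure id} followed by up to two "_"-separated parts (batch, image number).
_PATTERN = re.compile(
    r"^\{(?P<pure_id>[0-9A-Fa-f-]+)\}"
    r"(?:_(?P<batch>[^_]+))?"
    r"(?:_(?P<image>[^_]+))?$"
)


def parse_sas_identifier(value: str) -> SasIdentifier:
    """Split a SAS identifier into pure id, batch and image number."""
    match = _PATTERN.match(value)
    if match is None:
        raise ValueError(f"not a recognised SAS identifier: {value!r}")
    return SasIdentifier(match["pure_id"], match["batch"], match["image"])


EXAMPLE = "{45E42C3C-0FF7-4915-AFCF-492ABCD3B872}_A0025818_00008"
parsed = parse_sas_identifier(EXAMPLE)
```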
The SAS module acts as a mediator between the information that is stored in the
archive and the QVIZ database. The connection to the PostgreSQL database
is based on PDO (PHP Data Objects), which provides a data-access abstraction
layer for running queries and retrieving data. The ARH accesses the tables of
internal resource identifiers in order to translate between them. Both are
stored in the QVIZ database server.

5.2.8.1 SAS Architecture
The SAS module consists of a PHP class that includes a SOAP client interface
for communicating with the archival web services. The module is configured through
an external configuration file.




[Figure: block diagram. The ARH calls the module, which runs under Apache on
PHP 5 (SOAP library) with PostgreSQL. Module functions: parsing archive IDs;
translating archive IDs into ARH IDs; translating ARH IDs into archive IDs;
parsing specific QVIZ levels; saving resources; processing SOAP requests /
fetching information associated with a resource from the archive. The module
communicates with the archival web services.]


                          Figure 20. Internal architecture of SNA module.

5.2.8.2 Installation procedure for the SAS Module
These steps are necessary to install the module in the ARH.
    •     The module requires PHP 5 for SOAP support. This is enabled
          by default.
    •     The PostgreSQL PDO driver is enabled by editing php.ini. PDO and
          all the major drivers ship with PHP as shared extensions and simply
          need to be activated by editing the php.ini file:
                       - extension=php_pdo.dll
          The database-specific DLL files can then either be loaded at runtime
          with dl() or be enabled in php.ini below the php_pdo.dll line. For
          PostgreSQL this corresponds to:
                       - extension=php_pdo_pgsql.dll








6. Bookmarking Client
To interact with the QVIZ system through the archive portals, a bookmarking
client is available. This client interacts with the QVIZ system by storing
bookmarks. These bookmarks are dispatched through the Archive Resource Hub
and stored in the QVIZ system.

6.1 User interface
The user interface when the bookmark icon is pressed is shown below:




         Figure 21. Bookmarking dialogue window in the Estonian archive.

6.2 Architecture and data model
The following steps are taken when the user decides to bookmark a resource:
   1. If the user is not logged in, he or she is presented with the login dialogue.
   2. The pop-up window is shown in the archive portal.
   3. The resource metadata is fetched from the archive through the Archive
      Resource Hub.
   4. The user’s Communities of Practice are fetched from the Collaborative
      Environment through the Archive Resource Hub.
   5. The user can enter a title, description and keywords and select a CoP.


The information sent to the QVIZ system is:







    •   Bookmark title
    •   Bookmark id
    •   Bookmark description
    •   Bookmark community of practice
    •   Bookmark user owner
    •   Bookmark keywords
    •   Archive resource id
    •   Resource time span
    •   Resource reference
    •   Resource archive level
    •   Resource administrative unit ids
    •   Resource institute
    •   Resource description lookup service URL
    •   Resource redirect service URL
    •   Resource portal
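
The fields listed above could be carried in a single record, as sketched here in Python. The attribute names and the sample values (including the example.org URLs and the user and CoP names) are placeholders chosen for this illustration; the actual wire format is not specified in this section.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class BookmarkSubmission:
    """One record per bookmark; attribute names are illustrative only."""
    bookmark_title: str
    bookmark_id: str
    bookmark_description: str
    bookmark_cop: str
    bookmark_owner: str
    bookmark_keywords: list = field(default_factory=list)
    archive_resource_id: str = ""
    resource_time_span: str = ""
    resource_reference: str = ""
    resource_archive_level: str = ""
    resource_admin_unit_ids: list = field(default_factory=list)
    resource_institute: str = ""
    resource_lookup_service_url: str = ""
    resource_redirect_service_url: str = ""
    resource_portal: str = ""


submission = BookmarkSubmission(
    bookmark_title="Parish register",
    bookmark_id="bm-1",                      # placeholder id
    bookmark_description="Births and deaths",
    bookmark_cop="example-cop",              # placeholder CoP name
    bookmark_owner="example-user",           # placeholder user name
    bookmark_keywords=["parish", "register"],
    archive_resource_id="200250813866",
    resource_time_span="1881-1891",
    resource_reference="EAA.3147.1.205",
    resource_archive_level="AHV",
    resource_lookup_service_url="http://example.org/lookup",      # placeholder
    resource_redirect_service_url="http://example.org/redirect",  # placeholder
)
payload = asdict(submission)
```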

6.3 Installation
The installation of the QVIZ system in an archive consists of two steps.
The first step is the bookmarking client, which is installed by including a JavaScript
file in the header of each document that displays a bookmarkable resource. A
bookmark markup tag is then written in the document where the bookmarking link
should appear.
The second step is the resource lookup service. Its implementation will
differ between archives, but it must deliver a certain dataset on
request, based on resource identifiers. Current implementations make use of XML
and SOAP.








7. Data Import and Database Creation
The QVIZ Administrative Ontology combines two very different approaches to
organizing information about administrative units: An archival tradition which
treats the administrative units primarily as corporate bodies, and a Geographical
Information Systems approach, which treats them primarily as geometric shapes,
more specifically as polygons. The data model is based on treating the
administrative units primarily as entities in a hierarchy, but the usability of the
system is very restricted unless there is also geographical information about each
unit.
The QVIZ administrative ontology models administrative units
across the globe. It has been designed to handle the dynamics of administrative
history. Administrative units are closely related to the governance of people.
Examples are church or taxation districts. A church district could have
the function of keeping track of births, deaths, migration, holy communions, etc. A
taxation district could handle records of how much tax the inhabitants of an
area need to pay. However, these administrative units change over time. They
can change their names, territories and relations to other units. These dynamics
reflect changes in the structure of governance.
The use of the administrative unit ontology in QVIZ is closely related to the
principle by which archival resources are logically stored. QVIZ therefore uses the
administrative ontology to explore and find archival resources. The ontology is
combined with the time-spatial index of archival resources.
These resources are closely related to the administrative ontology and to its
relations with the creators of the archival documents. QVIZ has been designed
to focus on digital archival documents. The indexes are built and maintained by
the archival institutions and exported to the QVIZ system.

7.1 Technical Requirements for Content Providers
QVIZ can be used by content providers that digitize archival documents and
provide internet portals for accessing the digital documents.
In order to make use of the QVIZ framework, content providers need to support the
following content and software services:
    • Provide an index of archival references connected to administrative units
        (time and space).
    • Provide initial knowledge of relevant administrative ontology content
        (when new areas are being added).
    • Provide services for social bookmarking of archival resources (such as
        resource information services and redirect services).
    • Integrate the QVIZ social bookmarking toolkit into their web portal.
    • Synchronize the provided content with the QVIZ system.

This chapter describes how the last bullet point is achieved.

7.2 Data Origins
The QVIZ project uses many different sources of data. These include:
    •   The national archives, which provide the archive resources transferred in
        files.
    •   The University of Portsmouth, which supplies the administrative ontology and
        connected GIS information.
    •   Regio, which provides the map software and connects the GIS information with
        the Digital Chart of the World30.

7.3 Data Import
The picture below shows the steps needed to populate the QVIZ system.




[Figure: flow chart. Elements: Archive, Resources, Manual Correction, Data
Import Module, QUIS Database, Admin Ontology and GIS Information (University
of Portsmouth), Map Database.]




                                   Figure 22. Data import process flow.

The following steps are needed to populate the QVIZ system:
      1. The national archives extract the archive resources to a machine-readable
         format.
                   a. These files are manually corrected in custom software and
                      Microsoft Access. This is necessary to ensure data integrity.
      2. The Administrative Ontology is compiled into a database.
      3. The GIS information about Europe and the Administrative Units is
         compiled into the same database.
      4. This database is combined with the archive resources to create an
         application-specific database called QUIS (QVIZ User Interface Storage).




30 http://www.maproom.psu.edu/dcw/








7.4 Tools for indexing resources
7.4.1 Context
To join the QVIZ system, an archive must provide the following content:
1) Descriptions of archival units (fonds, series, volumes, etc.).
2) Digital content of archival material.
3) An administrative ontology.
4) An index that connects descriptive units at volume level to administrative units.
Providing the required content is the major challenge for archives that
want to join QVIZ.
The National Archives of Estonia, for example, could at the time of joining the
project only provide descriptions of archival units and digital content. It took a
great deal of time and help from QVIZ partners to build the Estonian
administrative ontology and create the index that links descriptive units to
administrative units.

7.4.2 Tool support
To facilitate the development of the Estonian index of resources, a dedicated tool
was built. This tool is called SAM, Semi-automatic Matching of
Descriptive Units to Administrative Units. By populating the tables in the data
model specified below, an archive can ease the implementation of a
QVIZ-compliant data structure.
The goal of SAM is first to link automatically (by running SQL queries) as many
volume-level descriptive units as possible to their corresponding administrative
units. A user interface for manually correcting the automatically created
links is also provided.








7.4.3 Data model




                            Figure 23. SAM Data model.

Table “dsc_unit” stores the hierarchical data of the descriptive units.
Columns “id” and “parent_id” support adjacency list model of descriptive units.
Columns “left_val” and “right_val” support the modified preorder tree traversal
algorithm.
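
To show how the two representations relate, the following Python sketch derives “left_val” and “right_val” from (id, parent_id) pairs using modified preorder tree traversal and then reads ancestors and descendants from the numbers alone. The sample rows and function names are made up for this illustration and are not SAM code.

```python
def assign_left_right(rows):
    """rows: list of (id, parent_id). Returns {id: (left_val, right_val)}."""
    children = {}
    for unit_id, parent_id in rows:
        children.setdefault(parent_id, []).append(unit_id)

    bounds = {}
    counter = 0

    def visit(unit_id):
        nonlocal counter
        counter += 1
        left = counter            # number assigned on the way down
        for child in children.get(unit_id, []):
            visit(child)
        counter += 1              # number assigned on the way back up
        bounds[unit_id] = (left, counter)

    for root in children.get(None, []):
        visit(root)
    return bounds


def descendants(bounds, unit_id):
    left, right = bounds[unit_id]
    return sorted(u for u, (l, r) in bounds.items() if l > left and r < right)


def ancestors(bounds, unit_id):
    left, right = bounds[unit_id]
    return sorted(u for u, (l, r) in bounds.items() if l < left and r > right)


# Sample hierarchy: fond 1 > series 2 > volumes 3 and 5; subvolume 4 under 3.
rows = [(1, None), (2, 1), (3, 2), (4, 3), (5, 2)]
bounds = assign_left_right(rows)
```

With the adjacency list alone, finding all ancestors needs one lookup per level; with the left and right values it becomes a single range comparison, which is what the parent queries in section 7.4.4 rely on.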
The “type” of a descriptive unit can be fond, subfond, series, subseries, volume or
subvolume.
Although in the NAE system any descriptive unit can be defined by multiple time
periods, “start_year” and “end_year” values are also stored in the “dsc_unit” table to
facilitate matching descriptive units against administrative units.








The following statements describe the hierarchy of descriptive units:
    •   A fond can have any number of subfonds and/or series as child nodes.
    •   A subfond can have any number of series as child nodes.
    •   A series can have any number of subseries and/or volumes as child nodes.
    •   A subseries can have any number of subseries and/or volumes as child
        nodes.
    •   A volume can have any number of subvolumes as child nodes.
    •   A subvolume has no child nodes.
    •   If a volume has no start and end year specified, at least one of its parents
        or children must have start and end year specified.
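
The parent/child rules above can be written down as a small lookup table, as in this Python sketch. The names ALLOWED_CHILDREN and is_valid_child are illustrative; the rule about start and end years is not covered here.

```python
# Allowed child types for each type of descriptive unit, taken from the rules above.
ALLOWED_CHILDREN = {
    "fond": {"subfond", "series"},
    "subfond": {"series"},
    "series": {"subseries", "volume"},
    "subseries": {"subseries", "volume"},
    "volume": {"subvolume"},
    "subvolume": set(),
}


def is_valid_child(parent_type: str, child_type: str) -> bool:
    """Return True if a unit of child_type may sit directly under parent_type."""
    return child_type in ALLOWED_CHILDREN[parent_type]
```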

The following diagram illustrates the hierarchy of the descriptive units:




                     Figure 24. Hierarchy of descriptive units.

Tables “lookup” and “dictionary” constitute a search index for the titles of
descriptive units, which facilitates matching them against the names of administrative
units. The titles of descriptive units are normalized: unwanted characters are stripped,
punctuation marks are changed to spaces and the titles are split into words. Unique
words are inserted into the “dictionary” table and linked to descriptive units through
the “lookup” table.
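
A minimal Python sketch of this indexing step follows. Which characters SAM treats as unwanted is not stated, so the sketch assumes that everything other than letters, digits and whitespace becomes a space and that words are lower-cased; the function names and the in-memory structures standing in for the two tables are also assumptions.

```python
import re


def normalize_title(title: str) -> list:
    """Lower-case a title, turn punctuation into spaces and split into words."""
    spaced = re.sub(r"[^\w\s]", " ", title.lower())
    return spaced.split()


def build_index(titles: dict):
    """titles: {unit_id: title}. Returns (dictionary, lookup).

    dictionary maps each unique word to a word id; lookup is a set of
    (word_id, unit_id) pairs linking words to descriptive units.
    """
    dictionary = {}
    lookup = set()
    for unit_id, title in titles.items():
        for word in normalize_title(title):
            word_id = dictionary.setdefault(word, len(dictionary) + 1)
            lookup.add((word_id, unit_id))
    return dictionary, lookup


words = normalize_title("Personaalraamat XXIX (Vana-Koiola, Partsi)")
dictionary, lookup = build_index({3: "Partsi, Viira", 5: "Viira"})
```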
Table “adm_unit” stores names, types, start and end years of administrative
units.
The goal of semi-automatic matching is to populate table “link” so that as many
descriptive units as possible on volume level are linked to corresponding
administrative units.








Column “adm_unit” stores a foreign key that points to the primary key of
“adm_unit” table.
Column “dsc_unit” stores a foreign key that points to the primary key of
“dsc_unit” table.
Volumes are linked to administrative units either directly or indirectly.
A volume is linked directly if its own title contains a word that matches the
name of an administrative unit and if the time span between the start and end year of
the volume overlaps with the time span between the start and end year of that
administrative unit.
A volume is linked indirectly if the title of one of its parent or child units contains
a word that matches the name of an administrative unit and if the time span between
the start and end year of the volume overlaps with the time span between the start
and end year of that administrative unit.
If a volume is linked indirectly, “via_dsc_unit” points to the child or
parent unit of the volume through which the link was created.

7.4.4 SQL Queries
For both direct and indirect links, the system checks whether the time span between the
start and end year of a volume overlaps with the time span between the start and end
year of an administrative unit. However, not all descriptive units at volume level have
a start year and end year defined. Hence, to get more matches, the system first needs
to query and update the year data for these volumes.
First, it queries the children of each volume to compute the start year and end year
for volumes that have none.
The following query is run:
SELECT     p.id, min(c.start_year) as start_year, max(c.end_year) as end_year
FROM       dsc_unit p, dsc_unit c
WHERE      p.id = c.parent_id
AND        p.type = 'volume'
AND        p.start_year IS NULL
AND        c.start_year IS NOT NULL
GROUP BY   p.id


The system then iterates over the resulting rows to update the start year
and end year of each volume:

UPDATE dsc_unit
SET    start_year=$row[start_year], end_year=$row[end_year]
WHERE id=$row[id]


If there are still some volumes without start year and end year, a query over the
parents of these volumes will be run to compute the start year and end year for
these volumes.

The following query is run:

SELECT     c.id, max(p.start_year) as start_year, min(p.end_year) as end_year
FROM       dsc_unit p, dsc_unit c
WHERE      p.left_val < c.left_val
AND        p.right_val > c.right_val
AND        c.type = 'volume'
AND        c.start_year IS NULL
AND        p.start_year IS NOT NULL
GROUP BY   c.id








The system then iterates over the resulting rows to update the start year
and end year of each volume:
UPDATE dsc_unit
SET    start_year=$row[start_year], end_year=$row[end_year]
WHERE id=$row[id]
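
The two fill-in steps above (first from children, then from parents) can be tried out with the following sketch, which runs the same SELECT statements against an in-memory SQLite database from Python. The sample rows, the helper name fill_years and the use of SQLite are assumptions made for illustration; the UPDATE uses bound parameters instead of string interpolation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE dsc_unit (
           id INTEGER PRIMARY KEY, parent_id INTEGER,
           left_val INTEGER, right_val INTEGER,
           type TEXT, start_year INTEGER, end_year INTEGER)"""
)
conn.executemany(
    "INSERT INTO dsc_unit VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, None, 1, 10, "fond", 1800, 1900),
        (2, 1, 2, 9, "series", 1850, 1899),
        (3, 2, 3, 6, "volume", None, None),   # years come from subvolume 4
        (4, 3, 4, 5, "subvolume", 1881, 1891),
        (5, 2, 7, 8, "volume", None, None),   # years come from its ancestors
    ],
)

FROM_CHILDREN = """
SELECT p.id, min(c.start_year), max(c.end_year)
FROM   dsc_unit p, dsc_unit c
WHERE  p.id = c.parent_id AND p.type = 'volume'
AND    p.start_year IS NULL AND c.start_year IS NOT NULL
GROUP BY p.id"""

FROM_PARENTS = """
SELECT c.id, max(p.start_year), min(p.end_year)
FROM   dsc_unit p, dsc_unit c
WHERE  p.left_val < c.left_val AND p.right_val > c.right_val
AND    c.type = 'volume'
AND    c.start_year IS NULL AND p.start_year IS NOT NULL
GROUP BY c.id"""


def fill_years(connection, select_sql):
    """Run one SELECT and write the computed years back to dsc_unit."""
    rows = connection.execute(select_sql).fetchall()
    connection.executemany(
        "UPDATE dsc_unit SET start_year = ?, end_year = ? WHERE id = ?",
        [(start, end, unit_id) for unit_id, start, end in rows],
    )


fill_years(conn, FROM_CHILDREN)
fill_years(conn, FROM_PARENTS)
years = {
    row[0]: (row[1], row[2])
    for row in conn.execute(
        "SELECT id, start_year, end_year FROM dsc_unit WHERE type = 'volume'"
    )
}
```

In this sample, volume 3 takes the span of its subvolume, while volume 5 takes the narrowest span shared by all its ancestors.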


The system is now ready to match volume titles against names of administrative
units on the condition that the time span between the start and end year of volume
overlaps with the time span between the start and end year of an administrative
unit.

INSERT INTO link (dsc_unit, adm_unit)
SELECT      du.id, au.id
FROM        dictionary di
KEY JOIN    lookup lu
INNER JOIN dsc_unit du ON du.id = lu.unit_id
INNER JOIN adm_unit au ON au.name = di.word
WHERE       du.type = 'volume'
AND        (au.start_year BETWEEN du.start_year AND du.end_year OR
            au.end_year   BETWEEN du.start_year AND du.end_year)
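
The direct-matching step can be explored with the sketch below, which uses Python and an in-memory SQLite database. Several things here are assumptions rather than SAM code: KEY JOIN is replaced by an explicit join on a hypothetical lookup.word_id column, the sample rows are made up, and a second, symmetric form of the overlap test is shown next to the BETWEEN form. The BETWEEN form matches when the start or end year of the administrative unit falls inside the span of the volume; the symmetric form additionally matches when the span of the administrative unit fully contains that of the volume.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
CREATE TABLE dictionary (id INTEGER PRIMARY KEY, word TEXT);
CREATE TABLE lookup (word_id INTEGER, unit_id INTEGER);
CREATE TABLE dsc_unit (id INTEGER PRIMARY KEY, type TEXT,
                       start_year INTEGER, end_year INTEGER);
CREATE TABLE adm_unit (id INTEGER PRIMARY KEY, name TEXT,
                       start_year INTEGER, end_year INTEGER);
CREATE TABLE link (id INTEGER PRIMARY KEY AUTOINCREMENT,
                   adm_unit INTEGER, dsc_unit INTEGER, via_dsc_unit INTEGER);
INSERT INTO dictionary VALUES (1, 'partsi'), (2, 'viira');
INSERT INTO lookup VALUES (1, 3), (2, 3);
INSERT INTO dsc_unit VALUES (3, 'volume', 1881, 1891);
INSERT INTO adm_unit VALUES (1, 'partsi', 1700, 1950), (2, 'viira', 1900, 1950);
"""
)

JOINS = """
FROM dictionary di
INNER JOIN lookup lu   ON lu.word_id = di.id
INNER JOIN dsc_unit du ON du.id = lu.unit_id
INNER JOIN adm_unit au ON au.name = di.word
WHERE du.type = 'volume'
"""
# Overlap test as written in the query above.
BETWEEN_TEST = """AND (au.start_year BETWEEN du.start_year AND du.end_year OR
     au.end_year   BETWEEN du.start_year AND du.end_year)"""
# Symmetric interval-overlap test (a substitution made for this sketch).
SYMMETRIC_TEST = "AND au.start_year <= du.end_year AND au.end_year >= du.start_year"

between_matches = conn.execute(
    "SELECT du.id, au.id " + JOINS + BETWEEN_TEST
).fetchall()

conn.execute(
    "INSERT INTO link (dsc_unit, adm_unit) SELECT du.id, au.id "
    + JOINS + SYMMETRIC_TEST
)
links = conn.execute("SELECT dsc_unit, adm_unit FROM link").fetchall()
```

In the sample data, the unit named 'partsi' exists from 1700 to 1950 and so spans the whole volume (1881-1891); only the symmetric test links it. Which behaviour is wanted is a design decision for the archive building the index.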


The system needs to create indirect links as well. It therefore first matches the titles
of descriptive units of other types against the names of administrative units. This time
the start and end years are not taken into account: these matches serve only
as helpers for building indirect links later.

INSERT INTO   link (adm_unit, dsc_unit)
SELECT        au.id, du.id
FROM          dictionary di
KEY JOIN      lookup lu
INNER JOIN    dsc_unit du ON du.id = lu.unit_id
INNER JOIN    adm_unit au ON au.name = di.word
WHERE         du.type != 'volume'

Now the volumes can be linked to administrative units via their children:

INSERT INTO link (adm_unit, via_dsc_unit, dsc_unit)
SELECT      li.adm_unit, li.dsc_unit, pu.id
FROM        dsc_unit pu, dsc_unit cu
INNER JOIN link li ON li.dsc_unit = cu.id
INNER JOIN adm_unit au ON li.adm_unit = au.id
WHERE       pu.id = cu.parent_id
AND         cu.type = 'subvolume'
AND        (au.start_year BETWEEN pu.start_year AND pu.end_year OR
            au.end_year   BETWEEN pu.start_year AND pu.end_year)

Linking volumes via children to administrative units can produce duplicates. To
remove them, run a query that deletes duplicate links:
DELETE FROM link WHERE id NOT IN
       (SELECT min(id) FROM link GROUP BY adm_unit, dsc_unit)
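
The effect of this clean-up query can be checked with a short sketch using Python and an in-memory SQLite database. The sample rows are made up; the DELETE statement is the one shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE link (
           id INTEGER PRIMARY KEY AUTOINCREMENT,
           adm_unit INTEGER, dsc_unit INTEGER, via_dsc_unit INTEGER)"""
)
conn.executemany(
    "INSERT INTO link (adm_unit, dsc_unit, via_dsc_unit) VALUES (?, ?, ?)",
    [
        (1, 3, None),  # direct link
        (1, 3, 4),     # same pair again, found via subvolume 4
        (2, 3, None),  # different administrative unit, not a duplicate
    ],
)

# Keep only the row with the lowest id for each (adm_unit, dsc_unit) pair.
conn.execute(
    """DELETE FROM link WHERE id NOT IN
           (SELECT min(id) FROM link GROUP BY adm_unit, dsc_unit)"""
)
remaining = conn.execute(
    "SELECT id, adm_unit, dsc_unit, via_dsc_unit FROM link ORDER BY id"
).fetchall()
```

Because the grouping ignores “via_dsc_unit”, a direct link created earlier is kept in preference to a later indirect link for the same pair.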


Finally, volumes can be linked to administrative units via their
parents.

First, select all units that are not of volume or subvolume type and that are already
linked to administrative units, and then select all volumes that lie below these units
in the hierarchy:
SELECT pu.id as via_dsc_unit, cu.id as dsc_unit, cu.start_year, cu.end_year
FROM   dsc_unit pu, dsc_unit cu
WHERE  cu.left_val BETWEEN pu.left_val AND pu.right_val
AND    pu.type!= 'volume'
AND    pu.type!= 'subvolume'
AND    cu.type = 'volume'
AND    pu.id IN (SELECT DISTINCT(dsc_unit) FROM link)

Then iterate over the resulting rows and, for each row, select from the linked
administrative units those whose start or end year falls within the time span of the volume:
SELECT li.adm_unit as adm_unit
FROM   link li INNER JOIN adm_unit au ON li.adm_unit = au.id
WHERE li.dsc_unit = $row[via_dsc_unit]
AND   (au.start_year BETWEEN $row[start_year] AND $row[end_year] OR
       au.end_year   BETWEEN $row[start_year] AND $row[end_year])

Then link volumes to these admin units indirectly via the parent unit:
INSERT INTO link (adm_unit, dsc_unit, via_dsc_unit)
VALUES     ($row2[adm_unit], $row[dsc_unit], $row[via_dsc_unit])
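
The three steps for linking via parents can be followed end to end in the sketch below, which uses Python and an in-memory SQLite database. The sample rows are made up, bound parameters replace the $row[...] placeholders, and the year test is the one from the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
CREATE TABLE dsc_unit (id INTEGER PRIMARY KEY, type TEXT,
                       left_val INTEGER, right_val INTEGER,
                       start_year INTEGER, end_year INTEGER);
CREATE TABLE adm_unit (id INTEGER PRIMARY KEY, name TEXT,
                       start_year INTEGER, end_year INTEGER);
CREATE TABLE link (id INTEGER PRIMARY KEY AUTOINCREMENT,
                   adm_unit INTEGER, dsc_unit INTEGER, via_dsc_unit INTEGER);
INSERT INTO dsc_unit VALUES (2, 'series', 2, 9, 1850, 1899),
                            (3, 'volume', 3, 6, 1881, 1891),
                            (5, 'volume', 7, 8, 1860, 1870);
INSERT INTO adm_unit VALUES (1, 'partsi', 1885, 1950);
INSERT INTO link (adm_unit, dsc_unit) VALUES (1, 2);  -- helper link on the series
"""
)

SELECT_CANDIDATES = """
SELECT pu.id, cu.id, cu.start_year, cu.end_year
FROM   dsc_unit pu, dsc_unit cu
WHERE  cu.left_val BETWEEN pu.left_val AND pu.right_val
AND    pu.type != 'volume' AND pu.type != 'subvolume'
AND    cu.type = 'volume'
AND    pu.id IN (SELECT DISTINCT dsc_unit FROM link)"""

SELECT_MATCHES = """
SELECT li.adm_unit
FROM   link li INNER JOIN adm_unit au ON li.adm_unit = au.id
WHERE  li.dsc_unit = ?
AND   (au.start_year BETWEEN ? AND ? OR au.end_year BETWEEN ? AND ?)"""

# Step 1: volumes below already linked, non-volume units.
candidates = conn.execute(SELECT_CANDIDATES).fetchall()
for via_dsc_unit, dsc_unit, start_year, end_year in candidates:
    # Step 2: administrative units of the parent that pass the year test.
    matches = conn.execute(
        SELECT_MATCHES,
        (via_dsc_unit, start_year, end_year, start_year, end_year),
    ).fetchall()
    # Step 3: create the indirect links.
    for (adm_unit,) in matches:
        conn.execute(
            "INSERT INTO link (adm_unit, dsc_unit, via_dsc_unit) VALUES (?, ?, ?)",
            (adm_unit, dsc_unit, via_dsc_unit),
        )

indirect = conn.execute(
    "SELECT adm_unit, dsc_unit, via_dsc_unit FROM link "
    "WHERE via_dsc_unit IS NOT NULL"
).fetchall()
```

In the sample, volume 3 (1881-1891) is linked via series 2 because the administrative unit starts in 1885, while volume 5 (1860-1870) is not linked.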

Linking volumes via parents to administrative units can produce duplicates. To
remove them, run a query that deletes duplicate links:
DELETE FROM link WHERE id NOT IN
       (SELECT min(id) FROM link GROUP BY adm_unit, dsc_unit)


7.4.5 User interface
SAM provides a user interface for manually checking and correcting the automatically
created links.
The following three screenshots illustrate the main views of that user interface.




                           Figure 25. SAM user interface.







 Figure 26. SAM user interface.




Figure 27. SAM editing interface.








7.4.6 Summary
The SAM software is useful for rapidly building sets of test data for QVIZ. When
archives want to build systematic, high-quality indexes that link descriptive units to
administrative units, SAM cannot replace the expert archivist, but it can help the
expert in creating the indexes.








7.5 QVIZ User Interface Storage (QUIS)
By nature, the faceted browser is a dynamic component, which is supposed to
react promptly to any user action. Therefore there is a need for an efficient (not
necessarily normalized) source of data. Since the end user interface displays data
from different QVIZ environments the QUIS database needs to combine certain
parts of these environments into a common consistent structure. The database
consists of the following seven tables (tables are populated in numerical order 1 to
7; relations between tables 1 -- 2, 1 -- 4, 3 -- 4 and 3 -- 5 are 1-to-N, table 6 and 7
are standalone).
Table 1, quis_au, stores basic information about each administrative unit, such
as its years of existence and a name to display (although units can have multiple names).
Table 2, quis_relation, is a core table for the faceted browser. It has been designed
to explore the complex relations between administrative units over time. It can
record poly-hierarchical relations. By using this table, information about units
either higher or lower in the hierarchy can be displayed. Since the faceted browser
allows arbitrary ordering of the facets, this table can be queried in this fashion. The
table records the path in the hierarchy, where time is a node of such a path.
Table 3, quis_resource, holds one or more resources for each AU.
Table 4, quis_au_resource, is a join table for n-to-m relations between quis_au
and quis_resource.
Table 5, quis_bookmark, holds one or more bookmarks for each resource.
Table 6, quis_years, is an auxiliary table used with a resource histogram in the
time bar.
Table 7, quis_names, holds names of AUs in different languages.
This structure is described in further detail below in the section QUIS Data
Structure.








7.5.1 QUIS Data Structure




                        Figure 28. QUIS data structure.

Legend:
  * is a constituent of the primary key in a given table
  + is a constituent of the foreign key in a given table

1) Table quis_au holds the AU hierarchy information

*au_id                // global AU id (g_unit)
 au_code1             // institutional AU code 1 (e.g. Sweden:
                         29B2F543-F234-4BBD-A1E9-6D7605AB1060)
 au_code2             // institutional AU code 2 (e.g. Sweden:
                         SE/158404000)
 au_year_start        // the start year of AU's existence
 au_year_end          // the end year of AU's existence
 au_name              // name of the AU
 au_direct_count      // number of resources directly connected
                         to a given AU (this value is now dynamic)
 au_indirect_count    // number of resources indirectly
                         connected to a given AU
 au_type              // type of an AU (e.g. EST_KUB)
 au_type_level        // level of an AU (e.g. 6)
 au_type_n_label      // name of the AU type in the national
                         language
 au_type_g_label      // name of the AU type in English

2) Table quis_relation holds the AU relation information

+au_id                // foreign key reference to quis_au(au_id)
 rel_year_start       // the start year of AU's relation
 rel_year_end         // the end year of AU's relation
 rel_au_name          // name of an AU, same as quis_au(au_id)
 rel_au_id_l01        // AU id (for levels 1 to 13) constituting
                         a distinct path in the AU hierarchy
 rel_au_name_l01      // name of an AU
 ...
 rel_au_id_l13
 rel_au_name_l13

3) Table quis_resource holds one or more resources for each AU

*res_institute        // resource institute code (e.g. 'SNA')
*res_id1              // resource id for portal 1 (e.g. Sweden:
                         http://www.nad.ra.se)
 res_id2              // resource id for portal 2 (e.g. Sweden:
                         http://www.svar.ra.se)
 res_id3              // resource id for portal 3
 res_code             // resource code (e.g. Sweden:
                         SE/VALA/00001/D II/1)
 res_year_start       // interpreted start year (e.g. 1800)
 res_year_start_orig  // original start year (e.g. '1800:1')
 res_year_end         // interpreted end year (e.g. 9999)
 res_year_end_orig    // original end year (e.g. 'NULL')
 res_title            // resource title
 res_level            // resource level (e.g. 'L_VOL')
 res_language         // resource language (e.g. 'SWE')
 res_archive          // resource archive (e.g. 'Bro kyrkoarkiv')
 res_bookmarked       // 'Y' / 'N' flag indicating if the resource
                         was bookmarked

4) Table quis_au_resource is a junction table for n-to-m relation between
quis_au and quis_resource

*+au_id                 // foreign key reference to quis_au(au_id)
*+res_institute         // foreign key reference to
                           quis_resource(res_institute)
*+res_id1               // foreign key reference to
                           quis_resource(res_id1)

5) Table quis_bookmark holds one or more bookmarks for each resource

*bm_id                  // generated bookmark id
 bm_timestamp_created   // timestamp of bookmark creation
 bm_timestamp_edited    // timestamp of the last bookmark editing
 bm_cop_member          // CoP member registered in QVIZ
 bm_cop_group           // CoP group registered in QVIZ
+res_institute          // foreign key reference to
                           quis_resource(res_institute)
+res_id1                // foreign key reference to
                           quis_resource(res_id1)
 bm_timestamp           // timestamp of bookmark creation
 description            // user added bookmark description
 title                  // user added bookmark title

6) Table quis_years is an auxiliary table used with the resource histogram in
the time bar

*yr_years               // year value (0 to current year)

7) Table quis_names holds the names of AUs in different languages

*name                   // AU name (capitalized)
*global_id              // global AU id
*name_language          // language of the AU name
 name_status            // 'P' for preferred language
 pretty_name            // pretty AU name
 unit_type              // type of an AU (e.g. EST_KUB)
 proper_name            // proper AU name








7.5.2 Examples of querying in the QUIS database
This section shows how the QUIS database works by presenting common ways
of searching for information in the database.




                           Figure 29. Example path 1.

This example shows a path illustrating that a particular parish belongs to
many different hierarchies over time. For example, between 1971 and 1973 Sävar
was simultaneously part of Sävar Kommun, Hovrätten för Övre Norrland, and
Sweden.








7.5.3 Querying over quis_relation
The quis_relation table is designed around the 13 levels of the administrative
ontology. Each facet displays the administrative units for its level. In this
example Nation is level 5 and State is level 4.
Example 1: Show the states that Eesti (Estonia as a nation) was part of
between 1887 and 1924 (as indicated on the time-bar).




                        Figure 30. Query over quis_relation.
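
The selection behind Example 1 can be sketched in JavaScript, the client-side language of the Faceted Query Component. This is only an illustration under stated assumptions: the columns follow the rel_au_name_lNN naming of quis_relation, State is level 4 and Nation is level 5 as stated above, a relation counts as matching when its year range overlaps the selected period, and the sample rows (names, ids, years) are invented and not taken from the QUIS database.

```javascript
// Column name for a given level, e.g. levelColumn('name', 4) gives 'rel_au_name_l04'.
function levelColumn(kind, level) {
  return 'rel_au_' + kind + '_l' + String(level).padStart(2, '0');
}

// Distinct AU names at `wantedLevel` among the paths that contain `auName`
// at `fixedLevel` and whose validity overlaps [yearFrom, yearTo].
function unitsRelatedTo(rows, auName, fixedLevel, wantedLevel, yearFrom, yearTo) {
  const fixedCol = levelColumn('name', fixedLevel);
  const wantedCol = levelColumn('name', wantedLevel);
  const found = new Set();
  for (const row of rows) {
    const overlaps = row.rel_year_start <= yearTo && row.rel_year_end >= yearFrom;
    if (overlaps && row[fixedCol] === auName && row[wantedCol]) {
      found.add(row[wantedCol]);
    }
  }
  return Array.from(found).sort();
}

// Invented sample rows (hypothetical values, for illustration only).
const sampleRelations = [
  { au_id: 1, rel_year_start: 1721, rel_year_end: 1917,
    rel_au_name_l04: 'State A', rel_au_name_l05: 'Eesti' },
  { au_id: 1, rel_year_start: 1918, rel_year_end: 9999,
    rel_au_name_l04: 'State B', rel_au_name_l05: 'Eesti' },
  { au_id: 2, rel_year_start: 1600, rel_year_end: 1700,
    rel_au_name_l04: 'State C', rel_au_name_l05: 'Eesti' },
];

// States that the nation 'Eesti' was part of between 1887 and 1924.
const states = unitsRelatedTo(sampleRelations, 'Eesti', 5, 4, 1887, 1924);
console.log(states);
```

Swapping the two level arguments gives the reverse direction (units at a lower level that were part of a given higher-level unit), which is the shape of the part query in Example 2.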








Example 2: Show the counties that were part of Eesti between 1887 and 1913.




                              Figure 31. Part query.

Example 3: Complex query, combining facets of different types in different order.
This query shows:
    •   The sub-parishes that are part of Harjumaa County.
    •   The nation Eesti, which Harjumaa is a part of.
    •   The states Denmark, Polish-Lithuanian Commonwealth, Russia and
        Sverige, which Eesti and Harjumaa were part of in different time periods.




                            Figure 32. Complex query.








Example 4: Combining quis_relation with quis_au_resource and quis_resource.
For each administrative unit in a facet, an archival resource counter is shown in
brackets. For actions performed in either the time-bar or among the facet
elements, several queries are made to:
    1. Update the units in the different facets according to the time filter and
       the selection made.
    2. Update the frequency count of archival resources.
    3. Update the list of archival resources and the AU-context information
       when a unit is selected.
    4. Show the unit on the map if there is geographical information for that unit.
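
The resource counter of step 2 can be illustrated with a small JavaScript sketch that joins quis_au_resource with quis_resource in memory and applies the time filter. The function name, the overlap rule used for the time filter and the sample rows are assumptions made for this illustration; the production system presumably performs the equivalent join in the database.

```javascript
// Number of resources directly connected to each AU, restricted to resources
// whose interpreted year range overlaps the time-bar selection.
function countDirectResources(auResourceRows, resourceRows, yearFrom, yearTo) {
  // Index resources by their composite key (res_institute, res_id1).
  const byKey = new Map();
  for (const res of resourceRows) {
    byKey.set(res.res_institute + '|' + res.res_id1, res);
  }
  const counts = {};
  for (const link of auResourceRows) {
    const res = byKey.get(link.res_institute + '|' + link.res_id1);
    if (!res) continue; // junction row without a matching resource
    if (res.res_year_start <= yearTo && res.res_year_end >= yearFrom) {
      counts[link.au_id] = (counts[link.au_id] || 0) + 1;
    }
  }
  return counts;
}

// Invented sample data (hypothetical ids and years).
const sampleResources = [
  { res_institute: 'SNA', res_id1: 'r1', res_year_start: 1800, res_year_end: 1850 },
  { res_institute: 'SNA', res_id1: 'r2', res_year_start: 1900, res_year_end: 9999 },
  { res_institute: 'ENA', res_id1: 'r1', res_year_start: 1890, res_year_end: 1910 },
];
const sampleLinks = [
  { au_id: 10, res_institute: 'SNA', res_id1: 'r1' },
  { au_id: 10, res_institute: 'SNA', res_id1: 'r2' },
  { au_id: 20, res_institute: 'ENA', res_id1: 'r1' },
];

const directCounts = countDirectResources(sampleLinks, sampleResources, 1887, 1924);
console.log(directCounts);
```

The composite key mirrors the primary key of quis_resource (res_institute, res_id1), which is also what the junction table references.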




                            Figure 33. Combined query.








Example 5: Querying using the tables quis_bookmark, quis_au_resource,
quis_relation and quis_names.

This query setup requires dynamic construction of the query.
    1. The first selection, frepalm, populated the state facet and the search
       facet. It also listed three archival resources.
    2. The second selection, Sverige, populated the search facets and the
       result list of archival resources.
    3. In the search AU-facet a text string was added to filter the units visible in
       this facet. The AU search uses the quis_names table, which can store
       multiple names for each unit.
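
The name filter of step 3 can be sketched as follows. The sketch assumes that "capitalized" in the quis_names description means the name column is stored in upper case, that a substring match is used, and that the optional language argument restricts the search to one name_language; the function name and the sample rows (including the language codes) are hypothetical.

```javascript
// Return the distinct global ids of AUs whose stored name contains the text.
// The search text is upper-cased because `name` is assumed to be upper case.
function searchAuNames(nameRows, text, language) {
  const needle = String(text).trim().toUpperCase();
  if (needle === '') return [];
  const ids = new Set();
  for (const row of nameRows) {
    const languageOk = !language || row.name_language === language;
    if (languageOk && row.name.includes(needle)) {
      ids.add(row.global_id);
    }
  }
  return Array.from(ids);
}

// Invented sample rows: one unit with two names, and a second unit.
const sampleNames = [
  { name: 'HARJUMAA', global_id: 100, name_language: 'EST', name_status: 'P' },
  { name: 'HARRIEN', global_id: 100, name_language: 'GER', name_status: '' },
  { name: 'TARTUMAA', global_id: 200, name_language: 'EST', name_status: 'P' },
];

console.log(searchAuNames(sampleNames, 'harj'));
console.log(searchAuNames(sampleNames, 'maa', 'EST'));
```

Because several names can map to the same global_id, the result is collected in a set so that each unit appears only once in the facet.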




                          Figure 34. Collaborative query.

As shown, the listed archival resources come from Estonia. Because Estonia was
part of Sweden from 1583 to 1719, they are also connected to Sweden
(Sverige).








7.6 Tools for editing the Administrative Ontology
The QVIZ Editing System (ES) is designed to edit the various levels of data in the
QVIZ database according to the data structure of the Administrative Unit
Ontology (AUO). The ES supports editing of metadata and data tables, splitting
and merging of units, and creation of new units. It also implements a user access
control system that restricts editing according to the role assigned to each user.
For example, some users may be allowed to edit existing data but not to delete
existing rows or add new ones. See Figure 35 for a screenshot of the login
screen.




                    Figure 35: Login screen to the Editing System
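
The role-based restriction described above can be pictured with a minimal sketch. It is written in JavaScript purely for illustration (the ES itself is a JSP application), and the role names, operation names and the permission table are hypothetical; the report does not list the actual roles.

```javascript
// Hypothetical roles and the editing operations each one may perform.
const ROLE_PERMISSIONS = {
  viewer: [],
  editor: ['edit'],
  administrator: ['edit', 'add', 'delete', 'split', 'merge'],
};

// True only if the role is known and explicitly lists the operation.
function isAllowed(role, operation) {
  const allowed = ROLE_PERMISSIONS[role];
  return Array.isArray(allowed) && allowed.includes(operation);
}

console.log(isAllowed('editor', 'edit'));   // may change existing data
console.log(isAllowed('editor', 'delete')); // may not remove rows
```

Denying by default (unknown roles and unlisted operations are rejected) matches the intent that users can only do what their role has been granted.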

The ES is implemented as a web application on an Apache Tomcat server, using
JSP in concert with various frameworks, such as Struts, iBatis and Apache
Commons. Platform independence is one of the major goals of the ES: either
PostgreSQL or Oracle can be used as the database back end. JSP was chosen
over PHP, as the former allows for better maintainability of the software and a
cleaner overall design.
One of the main benefits of the ES is its inherent enforcement of the AUO rules.
It is next to impossible to ensure data integrity when data is edited directly at
the SQL level, whereas the ES front end provides a quality control mechanism
with a good level of granularity through the user access control described above.
Moreover, editing data at the SQL level is not an option in a production
environment, as it would require extensive training of the data entry personnel.
The ES instead provides a convenient abstraction of the AUO: users do not need
to know the intricacies of the ontology in order to edit data. A high level of data
integrity is achieved by checking the data before it is inserted into the
database.
The Editing System is needed for the sustainability of the QVIZ project, since a
user-friendly application is required to popularize the use of the QVIZ system.
The prospective end users, chiefly archivists and librarians, who are expected to
maintain the database in the use-case scenarios, cannot be expected to do so from
an SQL shell. User-friendly and quality-assuring ES software is therefore
crucial.







Figure 36: Editing screen for one entry in a unit's status information








Appendix
A. Faceted Query Component Technical Documentation
This part covers the technical details of the implementation of the Faceted
Query Component (FQC). The FQC is programmed using dynamic HTML techniques
broadly called AJAX. The client side is programmed in JavaScript, the server side
in PHP.
The facets can be implemented using either an HTML select element or an HTML
table element. QVIZ uses the table, since it provides more control over
look-and-feel and content than a typical select menu, whose implementation
varies slightly between browsers.
Each facet is therefore implemented as a separate table attached to a higher-level
table using row and cell elements. Positioning within a row is automatic, which
makes it easy to add, remove and swap tables (or facets) using the following
JavaScript constructs (names in angle brackets are placeholders for object
element references):
    •   adding a facet:

        f_tr.appendChild(<facet>_td);

    •   removing a facet:

        element.parentNode.removeChild(<facet>_td);


    •   swapping a facet:

        <facet1>_td.parentNode.insertBefore(<facet2>_td,<facet1>_td)
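
The three constructs above can be wrapped in small helper functions, sketched below. The function and parameter names are hypothetical. Note that the one-line swap construct moves the second cell in front of the first, which amounts to a swap only when the second cell immediately follows the first; the swapFacets helper here handles cells at arbitrary positions within the same row, using only the standard DOM members parentNode, nextSibling, appendChild, removeChild and insertBefore.

```javascript
// Append a facet cell to the end of the facet row (f_tr in the text).
function addFacet(facetRow, facetCell) {
  facetRow.appendChild(facetCell);
}

// Detach a facet cell from whatever row currently holds it.
function removeFacet(facetCell) {
  if (facetCell.parentNode) {
    facetCell.parentNode.removeChild(facetCell);
  }
}

// Exchange the positions of two facet cells that share the same row.
function swapFacets(cellA, cellB) {
  const row = cellA.parentNode;
  if (cellA === cellB || !row || row !== cellB.parentNode) return;
  const afterA = cellA.nextSibling; // remember where cellA used to be
  if (afterA === cellB) {
    // Adjacent cells: moving cellB in front of cellA is enough.
    row.insertBefore(cellB, cellA);
    return;
  }
  row.insertBefore(cellA, cellB);  // cellA takes cellB's place
  row.insertBefore(cellB, afterA); // cellB goes to cellA's old place
}
```

In a browser these would be called with elements obtained through document.getElementById, for example the row with id f_tr and the cell elements of two facets.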

Each element of the facet has its own id, so that the element can be referenced
throughout the whole code. The following figure gives an overview of the
element ids used. The strings shown in brackets are variable: <facet> is a facet
code (f_code) as described in the file fb_def.js; <tr> is a facet row number (>= 1).




                          Figure 37. Facet HTML structure.








Each row of a facet (see "tr elements" in the figure above) is attached through
<tbody>, <table> and <td> elements to the main table as follows (closing tags are
not shown).


               <TABLE>                          f_tab
                 <TR>                           f_tr
           +-      <TD>                         <facet>_td
           |         <TABLE>                    <facet>_table
           |           <TBODY>                  <facet>_tbody
           |             <TR>                   <facet>_tb
           |               <TD>                 <facet>_tb_1
           |               <TD>                 <facet>_tb_2
           |               <TD>                 <facet>_tb_3
           |             <TR>                   <facet>_<tr>
           |               <TD>                 <facet>_<tr>_<td>
 facet_1   |             ...
           |             <TR>                   <facet>_pb
           |               <TD>                 <facet>_pb_1
           |                 <A HREF>           <facet>_pb_1_a
           |               <TD>                 <facet>_pb_2
           |                 <SPAN>             <facet>_pb_current
           |                 <SPAN>             <facet>_pb_max
           |               <TD>                 <facet>_pb_3
           +-                <A HREF>           <facet>_pb_3_a


           +-
 facet_2   |             ...
           +-
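
The id scheme of the facet structure lends itself to small helper functions, sketched below. The function names and the facet code 'state' are hypothetical (the real facet codes are defined in fb_def.js); only the id patterns <facet>_<part> and <facet>_<tr>_<td> and the rule that row numbers start at 1 are taken from the structure above.

```javascript
// Id of a fixed part of a facet, e.g. 'td', 'table', 'tbody', 'tb', 'pb_current'.
function facetPartId(facet, part) {
  return facet + '_' + part;
}

// Id of a content row of a facet; row numbers start at 1.
function facetRowId(facet, rowNumber) {
  if (!Number.isInteger(rowNumber) || rowNumber < 1) {
    throw new RangeError('facet row numbers start at 1');
  }
  return facet + '_' + rowNumber;
}

// Id of a cell within a content row of a facet.
function facetCellId(facet, rowNumber, cellNumber) {
  return facetRowId(facet, rowNumber) + '_' + cellNumber;
}

console.log(facetPartId('state', 'tbody'));
console.log(facetCellId('state', 3, 2));
```

Building ids in one place keeps the strings passed to document.getElementById consistent with the ids assigned when the facet table is created.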

Similarly, the result list has element ids as shown in the following
figure; <tr> is a result list row number (>= 1).








                           Figure 38. Result list HTML structure.

The corresponding element structure for the result list is as follows:

                 <TABLE>                  rl_table
                   <TBODY>                rl_tbody
            +-
            |       <TR>                 rl_tb
            |         <TD>               rl_tb_1
title bar   |              <SPAN>        rl_tb_span_1
            |              <SPAN>        rl_tb_span_2
            |              <SPAN>        rl_tb_span_hint
            +-







           +-
           |    <TR>                   rl_ft
           |      <TD>                 rl_ft_1
filter bar |        <FORM>             rl_ft_form
           |           <INPUT>         rl_ft_input
           |           <SPAN>          rl_ft_span
           +-


           +-
           |    <TR>                   rl_sb
switch bar |      <TD>                 rl_sb_1
           |      <TD>                 rl_sb_2
           +-


           +-
           |    <TR>                   rl_<tr>
           |      <TD>                 rl_<tr>_1
           |      <TD>                 rl_<tr>_2
           |           <TABLE>         rl_<tr>_2_table
           |             <TBODY>       rl_<tr>_2_tbody
           |               <TR>        rl_<tr>_2_1
rl content |                    <TD>   rl_<tr>_2_rr
           |               <TR>        rl_<tr>_2_2
           |                    <TD>   rl_<tr>_2_ra
           |               <TR>        rl_<tr>_2_3
           |                    <TD>   rl_<tr>_2_rt
           |                    <TD>   rl_<tr>_2_rl
           |      <TD>                 rl_<tr>_3
           |       ...
           +-


           +-
           |    <TR>                   rl_pb
           |      <TD>                 rl_pb_1
           |           <A HREF>        rl_pb_1_a
  page bar |      <TD>                 rl_pb_current
           |      <TD>                 rl_pb_max
           |      <TD>                 rl_pb_2
           |           <A HREF>        rl_pb_2_a
           +-








In addition to the facets and the result list stored in the structures shown above,
it is necessary to keep track of all content information related to the facets, the
result list and the contextual area. This information is dynamic in nature and
requires the following global declarations (note: the term "associative array" is
used for a variable of type Object):
    •   f_list
        An indexed array containing a list of active facets (facet codes) in the
        order in which they currently appear on the webpage.

    •   f_paging
        A mixed associative-indexed two-dimensional array containing paging
        information for each facet (1st index: facet code, 2nd index: 0 for current
        page, 1 for maximum page).

    •   rl_paging
        A mixed associative-indexed two-dimensional array containing paging
        information for both direct and indirect resources in the result list (1st
        index: 'direct' or 'indirect', 2nd index: 0 for current page, 1 for maximum
        page).

    •   f_selected
        An associative array containing selected row ids (keys are facet codes).

    •   f_content
        A mixed associative-indexed three-dimensional array holding the content
        of all facets (1st index: facet code, 2nd index: row number, 3rd index: cell
        index (1-4) or 0 for cell name).

    •   f_reslist
        A mixed indexed-associative three-dimensional array holding all returned
        results (1st index: result number starting at 1, 2nd index: result fields
        res_institute, res_id1, res_id2, res_id3, res_code, res_year_start,
        res_year_end, res_title, res_archive, res_bookmarked).


    •   f_context
        A mixed indexed-associative two-dimensional array containing AU
        existence and AU History information (1st index: 0 for AU existence
        information, >1 for AU History relations; 2nd index: AU existence or AU
        History relation fields: a) au_id, au_name, au_year_start, au_year_end,
        au_type_n_label; b) rel_year_start, rel_year_end, rel_au_id_l01,
        rel_au_name_l01, ..., rel_au_id_l13, rel_au_name_l13).


    •   m_response
        A mixed associative-indexed multi-dimensional array containing the
        response of the last executed Map request; element 'm_action' always
        exists specifying the current Map request action.
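
As an illustration, the declarations above could be initialized as in the following sketch. All values are invented: the facet codes 'state' and 'nation', the row id, the cell contents and the Map action name are hypothetical, and only the index layout follows the descriptions in the list. The helper hasNextPage is likewise not part of the documented code.

```javascript
// Active facets in display order (hypothetical facet codes).
var f_list = ['state', 'nation'];

// f_paging[facet] = [current page, maximum page]
var f_paging = { state: [1, 3], nation: [1, 1] };

// rl_paging['direct' | 'indirect'] = [current page, maximum page]
var rl_paging = { direct: [1, 5], indirect: [1, 2] };

// f_selected[facet] = id of the selected row
var f_selected = { state: 'state_2' };

// f_content[facet][row][cell]: cell 0 is the cell name, cells 1-4 the content
var f_content = { state: [], nation: [] };
f_content.state[1] = ['name', 'c1', 'c2', 'c3', 'c4'];

// f_reslist[n] = one result keyed by resource field names (n starts at 1)
var f_reslist = [];
f_reslist[1] = { res_institute: 'SNA', res_id1: 'r1', res_title: 'Example title' };

// f_context[0] = AU existence information
var f_context = [];
f_context[0] = { au_id: 10, au_name: 'Example AU', au_year_start: 1800, au_year_end: 9999 };

// m_response always carries the action of the last Map request
var m_response = { m_action: 'example_action' };

// Example accessor: is there a further page to show for a facet?
function hasNextPage(facet) {
  var paging = f_paging[facet];
  return Boolean(paging) && paging[0] < paging[1];
}

console.log(hasNextPage('state'), hasNextPage('nation'));
```

Plain objects serve as the associative arrays and ordinary arrays as the indexed ones, which is why the text speaks of mixed associative-indexed structures.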








B. Archive Abstraction Layer API
The Archive Abstraction Layer (AAL) modules are required to implement a set of
functions, which are specified below. In each function name, the “hook”
keyword is to be replaced with the short/machine-name of the archive. The
short/machine-name can be virtually anything; for example, it is “SNA” for the
Swedish National Archive and “ENA” for the Estonian National Archive. The
implementation of the hook_fetch function in the ENA module would therefore
be called ena_fetch.
When implementing a new AAL module, the two existing modules, SNA and
ENA, are a useful reference.
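
The naming convention can be expressed in a few lines. The sketch below is written in JavaScript, the language of the client-side constructs in Appendix A, and is not part of the AAL itself; the helper names are hypothetical, and the lower-casing of the short name is inferred from the ena_fetch example. The hook list is the set of functions specified in this appendix.

```javascript
// Hooks specified in this appendix, without the "hook_" prefix.
const AAL_HOOKS = [
  'whois', 'fetch', 'parse_id', 'get_arh_id', 'get_archive_ids',
  'resource_save', 'resource_lookup_url', 'resource_object_url',
  'resource_description_url', 'get_qviz_resource_level', 'install',
];

// Function name a module is expected to provide for a hook,
// e.g. ('ENA', 'fetch') gives 'ena_fetch'.
function hookFunctionName(archiveShortName, hook) {
  return archiveShortName.toLowerCase() + '_' + hook;
}

// Names of required functions that a module does not (yet) provide.
function missingHooks(archiveShortName, providedNames) {
  const provided = new Set(providedNames);
  return AAL_HOOKS
    .map((hook) => hookFunctionName(archiveShortName, hook))
    .filter((name) => !provided.has(name));
}

console.log(hookFunctionName('ENA', 'fetch'));
console.log(missingHooks('ENA', ['ena_fetch', 'ena_whois']));
```

A check like missingHooks could serve as a quick completeness test while a new module is being written.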

hook_whois
Description
The purpose of this simple function is to take a domain name and report
whether that domain name belongs to the archive of this module. For
example, the Estonian National Archive would respond to the domain names
“www.eha.ee” and “ais.ra.ee”, since both belong to this archive and host
websites with bookmarkable archival resources.
Arguments
String $domain; A fully qualified domain name.
Return values
String|false; The short/machine-name of the archive if the passed $domain belongs
to this archive, false if not.
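
The contract of hook_whois can be illustrated with the two domain names mentioned above. The sketch is in JavaScript for consistency with the client-side constructs in Appendix A and is not the actual ENA module; the constant name and the case-insensitive, whitespace-tolerant comparison are assumptions.

```javascript
// Domains named in the text as belonging to the Estonian National Archive.
const ENA_DOMAINS = ['www.eha.ee', 'ais.ra.ee'];

// Returns the short/machine-name of the archive if the domain belongs to it,
// false otherwise (mirroring the String|false return value).
function ena_whois(domain) {
  const normalized = String(domain).trim().toLowerCase();
  return ENA_DOMAINS.includes(normalized) ? 'ENA' : false;
}

console.log(ena_whois('ais.ra.ee'));
console.log(ena_whois('www.example.org'));
```

A caller would typically ask each installed module in turn and use the first non-false answer to decide which archive a bookmarked page belongs to; that dispatch logic is not described in this appendix and is only a plausible usage.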

hook_fetch
Description
Fetches all information associated with a resource from the archive.
Arguments
Array $resource_archive_ids; The unique archive-specific identifiers for the
resource to fetch.
Return values
Array|false; An array structured as below or false on failure.
The structure of the array to return:
array(
    'institution'   => <archive short/machine name>,
    'id'            => <the archive-specific unique identifier for this resource>,
    'parent_id'     => <the archive-specific unique identifier for the parent
                        resource of this resource>,
    'title'         => <the title of this resource, in no specific language>,
    'timespan'      => array(
        'from' => <the start year of this resource, 4 digits>,
        'to'   => <the end year of this resource, 4 digits>
    ),
    'language'      => <the language of this resource, as an ISO 639-1 code>,
    'reference'     => <a string containing the information a person would need
                        to locate this resource physically in the archive>,
    'archive_level' => <the archive-specific name of the level this resource
                        occupies in the resource hierarchy>
)

hook_parse_id
Description
This function attempts to parse whatever is given into an array of
archive-specific identifiers. Possible arguments include an array of archive ids,
a tokenized string of ids, or a numeric value.
Arguments
$id; Can be of any type and contain anything.
Return values
Array|false; An array containing the archive identifiers on success, false on failure.

hook_get_arh_id
Description
Translates a set of archive identifiers into its corresponding internal ARH
identifier.
Arguments
Array $archive_ids; All the archive identifiers of a specific resource.
Return values
Int; The ARH identifier of the resource.

hook_get_archive_ids
Description
Translates an ARH resource identifier to the corresponding archive identifiers.
Arguments
Int $arh_id; The ARH identifier to translate.
Return values
Array; The archive identifiers.

hook_resource_save
Description
Called whenever a resource is saved in the system. Should be used to save the
mapping between the ARH identifier and the archive identifiers, and to do
whatever else is needed for this archive when a resource is saved.
Arguments
Object $resource; A fully populated resource object.







Return values
Bool; True on success, false on failure.

hook_resource_lookup_url
Description
Gets the URL for fetching a certain resource through the archive's back-end
lookup service (normally the same service used for hook_fetch).
Arguments
Array $resource_ids; The identifiers of the resource to construct the URL for.
Return values
String; The URL that will fetch information about this resource from the lookup
service.

hook_resource_object_url
Description
Gets the URL for viewing the actual resource itself, for example an image of a
scanned document, rather than a page containing a description of the resource.
Arguments
String $resource_ids; The identifiers of the resource to construct the URL for.
Return values
String; The URL that will present the resource.

hook_resource_description_url
Description
Gets the URL for viewing the page describing a certain resource, not the resource
itself.
Arguments
String $resource_ids; The identifiers of the resource to construct the URL for.
Return values
String; The URL that will present the description of the resource.

hook_get_qviz_resource_level
Description
Used for translating between QVIZ and archive-specific levels of resources. The
internal QVIZ level system can be thought of as a weight system. The volume
level is level 0. Any levels above the volume level are 'lighter' than the volume
(smaller than zero) and any levels below the volume level are 'heavier' (larger than
zero).
Arguments
String $archive_resource_level; The archive-specific level identifier to translate
into a QVIZ level identifier.
Return values








Int; The QVIZ resource level.
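
The weight system can be pictured with a small lookup table. The sketch is in JavaScript for consistency with the client-side constructs in Appendix A and is not an actual AAL module. Only the rule "volume is 0, levels above are negative, levels below are positive" comes from the description; the level names, the concrete weights, the function name and the null result for unknown levels are hypothetical.

```javascript
// Hypothetical archive-specific level names mapped to QVIZ weights.
const EXAMPLE_LEVEL_WEIGHTS = {
  archive: -2, // above the volume level: 'lighter'
  series: -1,
  volume: 0,   // the volume level is always 0
  page: 1,     // below the volume level: 'heavier'
};

// Translate an archive-specific level name into a QVIZ level.
function example_get_qviz_resource_level(archiveResourceLevel) {
  const key = String(archiveResourceLevel).trim().toLowerCase();
  return Object.prototype.hasOwnProperty.call(EXAMPLE_LEVEL_WEIGHTS, key)
    ? EXAMPLE_LEVEL_WEIGHTS[key]
    : null; // unknown level; the report does not say how this is signalled
}

console.log(example_get_qviz_resource_level('Series'));
console.log(example_get_qviz_resource_level('volume'));
```

Because the weights are plain integers, resources from different archives can be compared or sorted by level without knowing each archive's own level vocabulary.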

hook_install
Description
Called when this module is installed in the QVIZ system. Used for doing initial
setup for this module, such as creating new database tables.
Arguments
None.
Return values
String; A message describing the status of the install. For example “Successfully
installed <archive name> module.” on success.



