What is the Semantic Web

Semantic Annotation for
Web Content Adaptation

Unit 14 of Spinning the Semantic Web




          Introduction

• Web content needs to be adapted for transparent access from a
  variety of client agents (cellular phones, PDAs)
   – A large, full-color image may be reduced in size and color
     depth, and unimportant portions of the content removed, when
     accessed by certain devices
• Better presentation and faster delivery to client devices
• Transcoding: transformation of information from one form to
  another
   – Web content transcoding
   – Crucial for universal Web access under varying conditions that may
     depend on client capabilities, network connectivity, or user
     preferences

Composite Capability/Preference Profiles
                 (CC/PP)




          Introduction

• CC/PP stands for Composite Capability/Preference
  Profiles, and is a system for expressing device capabilities
  and user preferences.
• The goal of the CC/PP framework is to specify how client
  devices express their capabilities and preferences (the user
  agent profile) to the server that originates content (the origin
  server). The origin server uses the "user agent profile" to
  produce and deliver content appropriate to the client device.
  In addition to computer-based client devices, particular
  attention is being paid to other kinds of devices such as
  mobile phones.

          Devices

•   The Web is accessed by various devices:
    – PC, notebook, PDA, mobile phone…
•   Each has different capabilities
    – Hardware: screen size/color, audio, bandwidth…
    – Software: MPEG, MP3, 3GPP, AMR…




           CC/PP & RDF
• The CC/PP framework starts
  with RDF and then overlays a
  CC/PP-defined set of semantics
  that describe profiles.
• A CC/PP profile, based on RDF, is a
  collection of information about the
  capabilities of the hardware platform
  and system software, and the
  preferences of the user.
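
To make the idea of an RDF-based profile concrete, here is a minimal sketch that reads a small CC/PP-style profile with Python's standard library. The ccpp namespace URI is assumed to be the one used by the CC/PP 1.0 Recommendation; the ex: attribute names, the component names, and the profile URIs are illustrative assumptions, not part of the original slides.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
CCPP = "http://www.w3.org/2002/11/08-ccpp-schema#"  # assumed CC/PP 1.0 namespace
EX = "http://www.example.com/schema#"               # hypothetical attribute vocabulary

# A hypothetical profile with a hardware component and a browser component.
PROFILE = f"""
<rdf:RDF xmlns:rdf="{RDF}" xmlns:ccpp="{CCPP}" xmlns:ex="{EX}">
  <rdf:Description rdf:about="http://www.example.com/profile#MyProfile">
    <ccpp:component>
      <rdf:Description rdf:about="http://www.example.com/profile#TerminalHardware">
        <ex:displayWidth>320</ex:displayWidth>
        <ex:displayHeight>200</ex:displayHeight>
      </rdf:Description>
    </ccpp:component>
    <ccpp:component>
      <rdf:Description rdf:about="http://www.example.com/profile#TerminalBrowser">
        <ex:htmlVersionsSupported>4.0</ex:htmlVersionsSupported>
      </rdf:Description>
    </ccpp:component>
  </rdf:Description>
</rdf:RDF>
""".strip()


def read_profile(rdf_xml):
    """Return {component name: {attribute name: value}} for a profile."""
    root = ET.fromstring(rdf_xml)
    profile = {}
    for component in root.iter(f"{{{CCPP}}}component"):
        desc = component.find(f"{{{RDF}}}Description")
        name = desc.get(f"{{{RDF}}}about").split("#")[-1]
        # Strip the namespace part of each attribute's tag.
        profile[name] = {child.tag.split("}")[-1]: child.text for child in desc}
    return profile


profile = read_profile(PROFILE)
print(profile["TerminalHardware"]["displayWidth"])
```

A full implementation would use an RDF parser and resolve default values; this sketch only walks the XML serialization.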




         Advantages of CC/PP

• Sending only the required content wastes no time or
  bandwidth on unwanted content. This can also lead to
  faster page loading times.
• A server can provide information to a more diverse range of
  browsers. This can be beneficial not only in economic terms,
  but also in terms of site accessibility.
• You give the users what they want, not what you think they
  want.
• And many more…



Deployment (Client & Server Proxies)




Deployment (Server Proxy only)




Deployment (Client Proxy only)




Deployment (Ideal Approach)




CC/PP Query




Content adaptation




          Two ways to use CC/PP profiles

•   Selection
    If the web server has a set of pre-written web pages,
    suitable for a number of different devices, then the profile
    can be used to decide which of these pre-written pages is
    most suitable for the web browser.
•   Transformation
    Web page content can be kept in a neutral format (e.g.
    XML). This can then be transformed into an appropriate
    format, using the profile to decide what that format is.
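
The two approaches above can be sketched as follows. This is only an illustration under stated assumptions: the file names, the width thresholds, the neutral XML layout, and the function names are all hypothetical, and a real deployment would more likely use XSLT for the transformation step.

```python
from xml.etree import ElementTree as ET

# Hypothetical pre-written variants, keyed by the minimum display width they need.
PREWRITTEN = {0: "news-wap.html", 240: "news-pda.html", 800: "news-desktop.html"}


def select_page(display_width):
    """Selection: pick the richest pre-written page that still fits the display."""
    fitting = [width for width in PREWRITTEN if width <= display_width]
    return PREWRITTEN[max(fitting)]


# Hypothetical content kept in a neutral XML format.
NEUTRAL = "<news><title>Quarterly results</title><body>Revenue rose.</body></news>"


def transform_page(neutral_xml, markup):
    """Transformation: render neutral XML into the markup the profile asks for."""
    doc = ET.fromstring(neutral_xml)
    title, body = doc.findtext("title"), doc.findtext("body")
    if markup == "wml":
        return f'<wml><card title="{title}"><p>{body}</p></card></wml>'
    return f"<html><head><title>{title}</title></head><body><p>{body}</p></body></html>"


print(select_page(320))
print(transform_page(NEUTRAL, "wml"))
```

Selection trades storage and authoring effort for simplicity; transformation keeps a single source but needs one rendering rule per target format.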



           CC/PP Implementations

•   DICE
•   Hewlett Packard
•   DELI
•   Intel
•   Inria
•   Keio University - Portal
•   UMBC
•   JIGSAW
•   X-Smiles Browser
•   And many more…
         Demonstrations

• An example of RDF file and graph

• A Demo Page presenting the functionality of the CC/PP
  protocol




            Reference

•   http://www.w3c.org
•   http://www.w3.org
•   http://www.webstandards.org
•   http://www.ccpp.org/
•   http://dice.ccpp.info/
•   http://www.tml.hut.fi/Opinnot/Tik-111.590/2000/Papers/Rdf.html
•   http://castrato.ics.forth.gr/qh/
•   http://www.csse.monash.edu.au/projects/MobileComponents/projects/pda_doc_layout/seminar-html/


External Annotation Framework




          Annotation Schemes

• Inline annotation: embeds annotations in a Web document
   – Created as extra attributes of document elements
      • HTML browsers ignore unknown attributes in an HTML document
   – Eases annotation maintenance by eliminating the bookkeeping task
     of associating annotations with their target documents
   – Requires annotators to have document ownership
• External annotation: separates annotations from the original
  document
   – Raises no issues related to document ownership
   – Facilitates the sharing and reuse of annotations across documents
   – Avoids the mixing of contents and metadata


          Applications of Web Content
          Annotation
• Discovery
   – Accurate searches of Web resources
• Qualification
   – Descriptions of users’ preferences regarding privacy
• Adaptation – the focus of this unit




Overview of Annotation-Based Transcoding




          External Annotation Files

• Contain metadata that addresses the part of a document to be
  annotated
   – XPath and XPointer are used to associate annotated portions of a
     document with annotating descriptions
      • A reference may point to a single element or a range of elements
      • If a target element has an ID attribute, the attribute can be used
        for direct addressing without the need for a long path expression
• Use RDF as the fundamental syntax of annotation files
   – User preferences and device capability: Composite
     Capability/Preference Profiles (CC/PP)
   – Document profiles (http://www.w3.org/TR/xhtml-prof-req/)
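
The addressing idea above can be sketched with Python's standard library. This is a minimal illustration only: ElementTree supports just a subset of XPath and no XPointer, the sample page must be well-formed XML, and the annotation records and role names are hypothetical stand-ins for entries in an RDF annotation file.

```python
import xml.etree.ElementTree as ET

# A well-formed sample page (real-life HTML would first need tidying).
PAGE = """<html><body>
<table><tr><td>header menu</td></tr></table>
<table><tr><td>search form</td></tr></table>
<table id="main"><tr><td>news</td></tr></table>
</body></html>"""

# Hypothetical annotations: each pairs a target expression with a description.
ANNOTATIONS = [
    {"target": "./body/table[1]", "role": "header"},           # positional path
    {"target": ".//*[@id='main']", "role": "proper content"},  # direct ID addressing
]


def resolve(page_xml, annotations):
    """Return (role, text) pairs for every annotation whose target exists."""
    root = ET.fromstring(page_xml)
    resolved = []
    for note in annotations:
        element = root.find(note["target"])
        if element is not None:
            text = "".join(element.itertext()).strip()
            resolved.append((note["role"], text))
    return resolved


print(resolve(PAGE, ANNOTATIONS))
```

The ID-based expression keeps working if tables are added or reordered, whereas the positional path depends on the document structure staying unchanged.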


Framework of External
Annotation




          Association

• How to select an annotation file for a Web document
   – Implicitly: by means of a structural analysis of the subject
     document
   – Explicitly: by means of a <link> tag
• An annotation file can be associated with a single document
  file, but the relation is not limited to one-to-one
   – Many annotation files for one Web document
   – One annotation file for multiple Web documents
      • Useful when it is necessary to annotate common parts of Web
        documents, such as page headers, company logo images, and
        sidebar menus


Annotation-Based Transcoding
           System




          Overview

• Content can be adapted on a content server, a proxy, or a
  client terminal
   – An adaptation engine should not be forced to reside in any particular
     location
• Use a proxy-based approach for content adaptation




          Transcoding Architecture

• Intermediary
   – Computational entities that reside along the Web transaction path
   – Facilitate an approach to making ordinary information streams into
     smart streams that enhance the quality of communication
• An intermediary processor or a transcoding proxy can
  operate on a document to be delivered and transform the
  contents with reference to associated annotation files




          Authoring-Time Transcoding

• Requirements for authoring-time transcoding
   – WYSIWYG editor
   – Lets the annotator navigate from an existing annotation to the portion
     of an annotated document designated by XPath/XPointer
   – Verifies the results of content adaptation through a previewer
• Authoring-time transcoding is crucial when annotations are
  employed for content adaptation, rather than discovery or
  qualification of contents
   – Content adaptation often changes the structure of original
     documents as a result of transcoding



Authoring-Time Transcoding




WYSIWYG Annotation Tool




HTML Page Splitting for Small-Screen Devices




          Annotation Vocabulary
• An annotation vocabulary for HTML page splitting needs to
  be specified to constrain the possibilities for decomposition,
  combination, and partial replacement of contents
• Annotation of Web Content for Transcoding
• Alternatives
   – Provide alternative representations of a document or any set of its
     elements
   – Color image → grayscale image
   – A transcoding proxy selects the alternative that best suits the
     capabilities of the requested client device
       • Elements in the annotated document can then be altered either
         by replacement or by on-demand conversion


          Annotation Vocabulary (Cont.)

• Splitting Hints
   – An HTML file that can be shown as a single page on a normal
     desktop PC may be divided into multiple pages on clients with
     smaller display screens
   – pcd:Group: specifies a set of elements to be considered as a logical
     unit and provides hints for determining appropriate page break points
• Selection Criteria
   – Help a transcoding module select, from alternatives, the one that
     best suits the client device
   – pcd:role → value attribute (proper content, side menu, decoration…)
   – pcd:importance → priority (low-importance content may be ignored or
     displayed in a smaller font)
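
A rough sketch of how splitting hints and selection criteria might drive page splitting is shown below. Everything here is an assumption made for illustration: the Group record, the numeric importance scale, the size measure, and the greedy strategy are not taken from the annotation vocabulary itself.

```python
from dataclasses import dataclass


@dataclass
class Group:
    name: str
    role: str          # e.g. "proper content", "side menu", "decoration"
    importance: float  # higher means more important (scale assumed here)
    size: int          # rough display cost, e.g. number of text lines


def split_page(groups, capacity, min_importance=0.0):
    """Drop low-importance groups, put important ones first, then paginate.

    A group is treated as a logical unit and is never broken across pages.
    """
    kept = [g for g in groups if g.importance >= min_importance]
    kept.sort(key=lambda g: g.importance, reverse=True)  # stable sort
    pages, current, used = [], [], 0
    for group in kept:
        if current and used + group.size > capacity:
            pages.append(current)  # page break before a group that does not fit
            current, used = [], 0
        current.append(group.name)
        used += group.size
    if current:
        pages.append(current)
    return pages


groups = [
    Group("header menu", "side menu", 0.0, 5),
    Group("search form", "side menu", -0.5, 3),
    Group("banner", "decoration", -1.0, 4),
    Group("news", "proper content", 1.0, 12),
]
print(split_page(groups, capacity=15, min_importance=-0.5))
```

With these sample values the decorative banner is dropped, the news content lands on the first page, and the menu and search form share the second.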

Annotation Descriptions




          Adaptation Engine
• Runs on an intermediary server called WBI
• Flow chart
   – Upon receipt of the request from a client browser, an original page is
     retrieved for the first time from a content server.
   – The editor component of the plugin tries to find the locations of
     annotation files:
       • If it is specified in a link element in an HTML header section,
         retrieve the designated annotation file.
       • Otherwise, look up the URL of the original page in a table that
         maps page URLs to annotation URLs. If it is found, retrieve the
         designated annotation file.
       • Otherwise, the original page is returned as it is and the session is
         terminated.


          Adaptation Engine (Cont.)

• Flow Chart (Cont.)
   – The generator component of the plugin generates a current page to
     be returned.
      • Taking account of client capabilities included in an HTTP request
         header, the generator extracts a portion of a document object
         tree and returns a sub-tree to the client
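
The annotation lookup step of the flow chart can be sketched as follows. This is not the WBI plugin's actual code: the rel value "annotation", the mapping table contents, and the function and class names are hypothetical, chosen only to show the order of the checks.

```python
from html.parser import HTMLParser

ANNOTATION_REL = "annotation"  # hypothetical rel value marking an annotation link

# Hypothetical table mapping page URLs to annotation file URLs.
ANNOTATION_TABLE = {
    "http://example.com/news.html": "http://example.com/annot/news.rdf",
}


class LinkFinder(HTMLParser):
    """Remembers the href of the first <link> whose rel marks an annotation."""

    def __init__(self):
        super().__init__()
        self.href = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "link" and attributes.get("rel") == ANNOTATION_REL and self.href is None:
            self.href = attributes.get("href")


def find_annotation(page_url, page_html):
    """Return the annotation URL, or None if the page should pass through as is."""
    finder = LinkFinder()
    finder.feed(page_html)
    if finder.href:                        # 1. explicit link element
        return finder.href
    return ANNOTATION_TABLE.get(page_url)  # 2. mapping table, else None


linked = '<html><head><link rel="annotation" href="x.rdf"></head><body></body></html>'
print(find_annotation("http://example.com/x.html", linked))
print(find_annotation("http://example.com/news.html", "<html></html>"))
print(find_annotation("http://example.com/other.html", "<html></html>"))
```

When the function returns None, the original page would be returned unchanged, matching the last branch of the flow chart.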




Adaptation Engine – System
Flow




          Application to Real-Life HTML
          Pages
• The Web page used as an example is a news page from a
  corporate Web site
• The news page consists of three tables stacked from top to
  bottom.
   – The top and middle tables correspond respectively to a header menu
     and a search form.
   – The bottom table is used for layout.




Layout of A Real-Life News
Page




Annotations for Splitting the
News Page




Annotation for fragmentation
of an actual news page




Screen copy of a small display
preview on an authoring tool




Comparison of display contents
on a small-screen device




            Splitting Result
• Page splitting not only reduces the content to be delivered, but also
  places the primary content near the top of the fragmented page, which is
  provided with navigational features
    – Placing navigational features (menu bars etc.) near the top of pages
    – Placing key information at the top of pages
    – Reducing the amount of information on the page
• Page fragmentation based on semantic annotation is more
  appropriate than page transformation based solely on syntactic
  information (removing white space, shrinking or removing images…)
    – The inability to rearrange content semantically is one of the critical
      limitations of the syntactic transformation approach.
    – The navigational features achieved by this semantic annotation are
      noteworthy from the perspective of Web content accessibility.


          Issues

• Consistency between an Original Document and Its
  Annotation
   – Necessary to provide a way of keeping them synchronized
• Extensibility
   – A custom-tailored transcoding module runs without any external
     meta-information.
   – A general-purpose transformation engine, such as an XSLT processor,
     employs externally provided transformation rules
   – Task-specific semantics
      • Roles such as header, auxiliary, and layouter supplement
        semantics that cannot be fully prescribed in the definitions of
        Web documents

Comparison of transcoding
approaches in terms of extensibility



