Getty Metafinder Prototype

Posted: 10/1/2012
Helping people find content …
preparing content to be found

Enabling the Semantic Web

         Joseph Busch
Outline

 Why Semantics Matter
 What is the Semantic Web
 Semantic Content Management
Why Semantics Matter
When you own a Rembrandt you can spell his name any way you want.
But when you want to find a Rembrandt … you better spell his name correctly.
Vocabulary resources can help find the right
artist even if their name is typed incorrectly.
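A minimal sketch of how a vocabulary resource might recover from a misspelled name. The variant table and the `find_artist` helper are hypothetical and for illustration only; the fuzzy matching uses Python's standard `difflib`, which is one possible technique and not necessarily what any particular product uses.

```python
import difflib

# Hypothetical vocabulary: lower-cased variant spellings -> preferred name.
NAME_VARIANTS = {
    "rembrandt": "Rembrandt van Rijn",
    "rembrandt van rijn": "Rembrandt van Rijn",
    "rembrandt harmensz. van rijn": "Rembrandt van Rijn",
    "rembrant": "Rembrandt van Rijn",
    "vermeer": "Johannes Vermeer",
    "jan vermeer": "Johannes Vermeer",
}


def find_artist(typed_name, cutoff=0.8):
    """Return the preferred name for a possibly misspelled name, or None."""
    key = typed_name.strip().lower()
    # Known variants (including recorded misspellings) match directly.
    if key in NAME_VARIANTS:
        return NAME_VARIANTS[key]
    # Otherwise fall back to the closest known variant above the cutoff.
    close = difflib.get_close_matches(key, list(NAME_VARIANTS), n=1, cutoff=cutoff)
    return NAME_VARIANTS[close[0]] if close else None


if __name__ == "__main__":
    for typed in ("Rembrant", "Rembrandtt", "Vermer", "Picasso"):
        print(typed, "->", find_artist(typed))
```

The cutoff trades recall for precision: a lower value forgives more typing errors but risks matching the wrong artist.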
Users cannot type in the complex queries needed to find all the relevant items ...
But this can be done automatically.
Complex queries are even more important when you search the entire web.
So you find Rembrandt the Dutch guy ...
… And not Rembrandt the toothpaste.
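The automatic query building described above can be sketched as follows. This assumes a generic boolean syntax with quoted phrases, `OR`, and `AND`; the `build_query` helper, the variant table, and the context terms are all hypothetical.

```python
# Hypothetical variant table: a name the user types -> other forms to search.
VARIANTS = {
    "Rembrandt": ["Rembrandt van Rijn", "Rembrandt Harmensz. van Rijn"],
}


def build_query(term, variants, context=()):
    """Expand a term into an OR of its variants, optionally narrowed by context."""
    names = [term, *variants.get(term, [])]
    query = "(" + " OR ".join(f'"{name}"' for name in names) + ")"
    if context:
        # Context terms steer the search toward the intended sense,
        # e.g. the painter and not the toothpaste.
        query += " AND (" + " OR ".join(context) + ")"
    return query


if __name__ == "__main__":
    print(build_query("Rembrandt", VARIANTS, context=("painter", "Dutch")))
```

The user types one word; the vocabulary supplies the variants and the disambiguating context.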
Search Failure



 19% Character errors (Young, et al).
 40% Vocabulary errors (Seaman).
 20% Index confusion.
 21% Successful (Nielsen).
Search Solution

 Generate more consistent content to search on.
 Correct user errors.
 Map the language of users to the language of the target
  content.
Search Alternatives
Personalization   Content needs to be tagged with attributes that map to user categories.
Analytics         Users don’t follow predictable & consistent pathways.
Taxonomies        Automatically generated taxonomies reflect ambiguities of natural language.
Syndication       Requires subscriber profiles, well-categorized content, & managed rules.
Solution for Search Alternatives

 Predictable standardized structures, and
 Consistent semantics to work on

                          … so machines can understand it.
What is the Semantic Web
Berners-Lee’s Semantic Web

 Formatting content so that machines can understand it.
 Use XML/RDF:
    Infinitely flexible markup language.
    Process content in many more ways than simply for viewing
    it.
 Problem: Mostly syntax … not semantics (in the human
  sense of meaning, i.e., language)
XML is a Grail-like Object

 XML is just a means for encoding information—an
  envelope standard. The real value is still in the information
  that you put in the envelope.
 Filling XML placeholders such as <meta>, <subject>, and
  <maker> requires semantic information management.
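A minimal sketch of the envelope idea, using Python's standard `xml.etree.ElementTree`. The `<subject>` and `<maker>` elements come from the slide; the `<record>` and `<body>` names and the `build_record` helper are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def build_record(subject, maker, body_text):
    """Wrap content and its descriptive values in a small XML envelope."""
    record = ET.Element("record")  # hypothetical wrapper element
    ET.SubElement(record, "subject").text = subject
    ET.SubElement(record, "maker").text = maker
    ET.SubElement(record, "body").text = body_text  # hypothetical content element
    return ET.tostring(record, encoding="unicode")


if __name__ == "__main__":
    print(build_record("Portraits", "Rembrandt van Rijn", "Self-portrait, oil on canvas."))
```

Producing the envelope is the easy part. Choosing consistent values for `subject` and `maker` is the semantic information management the slide refers to.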
Soergel’s SemWeb Proposal

 System of integrated access to data on concepts and
  terminology.
 Bring together variety of sources that exist largely in
  separate worlds, including dictionaries, thesauri,
  classification schemes, etc.
 Federated system with multiple collaborators.
 Common interface to all concept & terminology knowledge
  bases on the Internet.
The Real Semantic Web

 Namespace for uniquely identifying a semantic scheme &
  each concept within each scheme.
 Broad template or conceptual schema for holding all types
  of semantic information & specifying relationships among
  them.
 Definitions of services for interacting with the System.
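One way to picture a namespace for schemes and concepts is an identifier with two parts. The sketch below assumes the `scheme::id` form that appears in the VocML sample in this deck (for example `DFSIC-1998::0139`); the `ConceptId` class itself is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConceptId:
    """A concept identified by its semantic scheme plus an id within that scheme."""

    scheme: str
    local_id: str

    @classmethod
    def parse(cls, uid):
        # Split on the first "::" only; both parts must be non-empty.
        scheme, sep, local_id = uid.partition("::")
        if not sep or not scheme or not local_id:
            raise ValueError(f"not a scheme::id identifier: {uid!r}")
        return cls(scheme, local_id)

    def __str__(self):
        return f"{self.scheme}::{self.local_id}"


if __name__ == "__main__":
    cid = ConceptId.parse("DFSIC-1998::0139")
    print(cid.scheme, cid.local_id, str(cid))
```

Because the scheme is part of the identifier, the same local id can exist in two schemes without collision.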
Vocabulary Markup Language (VocML)

 XML schema for the Semantic Web.
 Broad template for structured representation of semantic
  schemes.
    Dublin Core metadata.
    Tags and syntax for uniquely identifying each concept.
    Typed relationships (hierarchical, associative, etc.)
    Typed notes.

     Networked Knowledge Organization Systems
                nkos.slis.kent.edu
<?xml version="1.0"?>
<!DOCTYPE VocML SYSTEM "VocML.dtd">
<VocML version="1.1">
<SrcVocab>
<SVHeader>
    <!-- Dublin Core -->
    <dc:Title>DFSIC-1998</dc:Title>
    <dc:Source>Standard Industrial Classification (1987)</dc:Source>
    <dc:Creator>Interwoven</dc:Creator>
    <dc:Contributor>U.S. Department of Commerce</dc:Contributor>
    <!-- … -->
    <workNum UIDprefix="DFSIC-1998" DisplayTitle="Standard Industrial Classification" BriefDisplay="SIC"/>
</SVHeader>
<!-- Unique ID -->
<SVTerm UID="DFSIC-1998::0139" CCID="104:43">
    <label>Field Crops, except Cash Grains, not elsewhere classified</label>
    <definition>Establishments primarily engaged in the production of field crops, except cash grains, not elsewhere
    classified. This industry also includes establishments deriving 50 percent or more of their total value of sales of
    agricultural products from field crops, except cash grains (Industry Group 013), but less than 50 percent from
    products of any single industry.</definition>
    <cla>0139</cla>
    <!-- Typed Relationships -->
    <typedRelation UREF="DFSIC-1998::013" UTYPE="Z39.19-1980::2" Name="BT"/>
    <typedRelation UREF="DFSIC-1998::013900" UTYPE="Z39.19-1980::3" Name="NT"/>
    <!-- … -->
</SVTerm>
</SrcVocab>
</VocML>
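A sketch of how an application might read one term from a record like the sample above. It assumes a well-formed fragment modeled on the sample (straight quotes, self-closing `typedRelation` elements) and uses Python's standard `xml.etree.ElementTree`; the `read_term` helper is hypothetical and not part of VocML.

```python
import xml.etree.ElementTree as ET

# Well-formed fragment modeled on the sample; the exact syntax is an assumption.
SAMPLE = """\
<SVTerm UID="DFSIC-1998::0139">
  <label>Field Crops, except Cash Grains, not elsewhere classified</label>
  <cla>0139</cla>
  <typedRelation UREF="DFSIC-1998::013" UTYPE="Z39.19-1980::2" Name="BT"/>
  <typedRelation UREF="DFSIC-1998::013900" UTYPE="Z39.19-1980::3" Name="NT"/>
</SVTerm>
"""


def read_term(xml_text):
    """Extract the unique id, label, and typed relationships of one term."""
    term = ET.fromstring(xml_text)
    relations = {}
    for rel in term.findall("typedRelation"):
        # Group target ids by relationship name (BT = broader, NT = narrower).
        relations.setdefault(rel.get("Name"), []).append(rel.get("UREF"))
    return {
        "uid": term.get("UID"),
        "label": term.findtext("label"),
        "relations": relations,
    }


if __name__ == "__main__":
    print(read_term(SAMPLE))
```

With the relationships in a dictionary, a search application can walk up to the broader term or down to narrower ones when expanding a query.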
Implementing the Semantic Web
The Holy Grail is ...

 Accurate information automatically processed so that it
  can easily be found and used for applications.
 A rich web of linked information, with markup allowing
  machines to route relevant information to the audiences
  that value it most.
Metatagging

 The hard work is mining content to extract key information:
    Recognize the mentions of people, organizations, places,
     and things.
    Infer the subject matter.
 And putting it into formats with standard labels for effective
  exploitation.
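The two steps above, recognizing mentions and recording them under standard labels, can be sketched with a simple name list. Real metatagging systems are far more sophisticated; the gazetteer entries and the `tag_mentions` helper are hypothetical and only illustrate the shape of the output.

```python
import re

# Hypothetical gazetteer: standard label -> names to look for in the text.
GAZETTEER = {
    "person": ["Rembrandt van Rijn", "Tim Berners-Lee"],
    "organization": ["Interwoven", "U.S. Department of Commerce"],
    "place": ["Amsterdam", "San Francisco"],
}


def tag_mentions(text):
    """Return the gazetteer names found in the text, grouped by standard label."""
    tags = {}
    for label, names in GAZETTEER.items():
        for name in names:
            # Match the name only when it is not embedded in a longer word.
            pattern = r"(?<!\w)" + re.escape(name) + r"(?!\w)"
            if re.search(pattern, text):
                tags.setdefault(label, []).append(name)
    return tags


if __name__ == "__main__":
    print(tag_mentions("Rembrandt van Rijn worked in Amsterdam."))
```

The returned dictionary is the kind of structured metadata that can then be written into an XML envelope.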
Semantic Content Management

[Diagram: semantic content management workflow]
 Raw Content (unstructured text, untagged data)
 Tag It, using Vocabularies
 Structured Content (metadata, XML/RDF)
 Exploit It, through User Queries (database search, text search)
 Relevant Information (found items, granular text)
Exploiting the Semantic Web

 Route content to audience segments that value it most.
 Link mentions of people, organizations, places, and things
  to other information related to those entities.
 Populate portal directories.
 Precisely search heterogeneous content items.
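Once content carries tags, routing it to audiences can be as simple as comparing sets. A minimal sketch, assuming each item has a set of tags and each audience profile is a set of interests; the `route` helper and the sample data are hypothetical.

```python
def route(items, profiles):
    """Map each audience to the ids of items that share at least one tag with it."""
    routed = {audience: [] for audience in profiles}
    for item in items:
        for audience, interests in profiles.items():
            # A non-empty intersection means the item is relevant to this audience.
            if interests & item["tags"]:
                routed[audience].append(item["id"])
    return routed


if __name__ == "__main__":
    items = [
        {"id": "doc1", "tags": {"agriculture", "field crops"}},
        {"id": "doc2", "tags": {"art", "painting"}},
    ]
    profiles = {
        "farm analysts": {"agriculture"},
        "curators": {"art", "sculpture"},
    }
    print(route(items, profiles))
```

The matching is only as good as the tags: items and profiles must draw on the same vocabulary for the intersection to mean anything.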
Predictions

 Vocabulary Markup Language (VocML).
    Semantic standard for unique identifiers (a namespace) for
    people, organizations, places, and things and the
    relationships among them.
    See: nkos.slis.kent.edu
 Technologies that enable the persistent naming of the
  information inside XML envelopes.
 Generation of enormous value through interoperability
  among web applications.
        Joseph A. Busch
Content Intelligence Evangelist
         ASIST President, 2001


                  415-778-3129
              fax 415-778-3131
        jbusch@interwoven.com

    Moving business to the Web
          www.interwoven.com

						