									LSIDs and RDF
   Kevin Richards
    TDWG 2006
             Introduction
• Kevin Richards (Landcare Research NZ)
  – Landcare Informatics group
  – GUID Subgroup
  – LSID .NET code port
• Disclaimer / intended audience
• Overview of tutorial
             Requirements
• Files on network
  – ftp://cissus.mobot.org/incoming/tdwg/Tutorial/
  – Username: garfile Password: garden2003
• Internet connection
• Text editor (or xml editor)
                     RDF
• What is RDF
  – Resource Description Framework
  – W3C
  – Describes resources on the web
  – Putting data on the web
  – Intended for machine processing
  – Has a defined XML syntax
  – Semantic relationships of data objects
  – Aimed at distributed data with varying types
                       RDF Triples
• English language equivalent –
  “http://www.example.org/index.html has a creator whose value is
  John Smith”
• Uses GUIDs (Web based GUIDs – URIs, etc)
   – Eg GUID1 has creator GUID2
• Subject : Predicate : Object format
   – Subject (the object being described)
       • GUID or blank (infinite set)
   – Predicate (the relationship/property type)
       • GUID to a property type, eg the ID of the property “creator”
   – Object (the value assigned to the subject object)
       • GUID of another object, blank, or literal
• Build up map/graph of object relationships
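A minimal Python sketch of the subject : predicate : object idea, assuming URIs and literals are held as plain strings. The helper name build_graph is hypothetical; the URIs are the example.org and Dublin Core ones used in these slides.

```python
from collections import defaultdict

# Each triple is (subject, predicate, object); URIs are plain strings here.
PAGE = "http://www.example.org/index.html"
DC_CREATOR = "http://purl.org/dc/elements/1.1/creator"
DC_LANGUAGE = "http://purl.org/dc/elements/1.1/language"

triples = [
    (PAGE, DC_CREATOR, "http://www.example.org/staffid/85740"),
    (PAGE, DC_LANGUAGE, "en"),
]


def build_graph(triples):
    """Group triples by subject: subject -> list of (predicate, object)."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return dict(graph)


graph = build_graph(triples)
for predicate, obj in graph[PAGE]:
    print(predicate, "->", obj)
```

Grouping by subject is only one possible view of the graph; a real RDF library would also distinguish URIs, blank nodes and literals.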
                              RDF Graphs
•     The basis of RDF
•     Useful analysis and equivalence calculation tool
•     Eg
       – example.org/index.html page created by staffid 85740
       – example.org/index.html page language is english (en)




    • Equivalent Triples
       –   <http://www.example.org/index.html> <http://purl.org/dc/elements/1.1/creator>
           <http://www.example.org/staffid/85740> .
       –   <http://www.example.org/index.html> <http://www.example.org/terms/creation-date>
           "August 16, 1999" .
       –   <http://www.example.org/index.html> <http://purl.org/dc/elements/1.1/language>
           "en" .
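The triple notation above (one statement per line, URIs in angle brackets, literals in quotes, a terminating full stop) can be produced with a few lines of Python. This is a sketch only: the function names term and to_ntriples are hypothetical, and it handles plain literals without language tags or datatypes.

```python
def term(value, is_uri):
    """Format one term: URIs in angle brackets, literals in double quotes."""
    if is_uri:
        return "<" + value + ">"
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def to_ntriples(triples):
    """Each triple is (subject_uri, predicate_uri, object, object_is_uri)."""
    lines = []
    for s, p, o, o_is_uri in triples:
        lines.append(f"{term(s, True)} {term(p, True)} {term(o, o_is_uri)} .")
    return "\n".join(lines)


PAGE = "http://www.example.org/index.html"
triples = [
    (PAGE, "http://purl.org/dc/elements/1.1/creator",
     "http://www.example.org/staffid/85740", True),
    (PAGE, "http://www.example.org/terms/creation-date", "August 16, 1999", False),
    (PAGE, "http://purl.org/dc/elements/1.1/language", "en", False),
]
output = to_ntriples(triples)
print(output)
```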
                       RDF formats
• Statement -
  “http://www.example.org/index.html has a
  creator whose value is John Smith”
• N3 notation –
  <http://www.example.org/index.html> dc:creator "John Smith" .
• Xml –
 <rdf:Description rdf:about="http://www.example.org/index.html">
       <dc:creator>John Smith</dc:creator>

 </rdf:Description>
                         Basic RDF
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
      xmlns:dc="http://purl.org/dc/elements/1.1/">

   <rdf:Description rdf:about="http://www.example.org/index.html">
         <dc:language>en</dc:language>
   </rdf:Description>

</rdf:RDF>


• rdf:Description – the basic XML element for describing a resource
• Every XML element must be in a namespace; the namespace URI identifies
  the vocabulary and is ideally, though not necessarily, resolvable
• Dublin Core – handy set of basic descriptive elements, such as title,
  creator, language, etc
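To see how the basic RDF/XML above maps back to triples, here is a rough Python sketch using only the standard library. It assumes the simple shape shown on this slide (rdf:Description elements with rdf:about and flat property children); the function name simple_triples is hypothetical, and this is not a full RDF/XML parser.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

RDF_XML = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
      xmlns:dc="http://purl.org/dc/elements/1.1/">
   <rdf:Description rdf:about="http://www.example.org/index.html">
         <dc:language>en</dc:language>
   </rdf:Description>
</rdf:RDF>"""


def simple_triples(rdf_xml):
    """Extract (subject, predicate, object) from flat rdf:Description nodes."""
    root = ET.fromstring(rdf_xml)
    triples = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # ElementTree tags look like "{namespace}local"; the predicate URI
            # is the namespace and the local name joined together.
            namespace, local = prop.tag[1:].split("}")
            resource = prop.get(f"{{{RDF_NS}}}resource")
            obj = resource if resource is not None else (prop.text or "")
            triples.append((subject, namespace + local, obj))
    return triples


for triple in simple_triples(RDF_XML):
    print(triple)
```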
                        rdf:resource
Eg

<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
      xmlns:dc="http://purl.org/dc/elements/1.1/">

     <rdf:Description rdf:about="http://www.example.org/index.html">
           <dc:language>en</dc:language>
           <dc:creator rdf:resource="http://www.ldodds.com/foaf/foaf-a-matic/JohnSmith" />
     </rdf:Description>

</rdf:RDF>
                         RDF Types
• Define data types
• Improves control of input data ranges
• Eg birthDate of a person

   <person:birthDate rdf:datatype=
     "http://www.w3.org/2001/XMLSchema#date">1940-06-19</person:birthDate>


• Eg example.com employee Jane Smith is of type Person
   (n3 notation)

   _:jane exterms:mailbox <mailto:jane@example.org> .
   _:jane rdf:type exterms:Person .
   _:jane exterms:name "Jane Smith" .
   _:jane exterms:empID "23748" .
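Data types let a consumer reject badly formed values. The sketch below checks the lexical form of an xsd:date literal in Python. It is deliberately simplified: it accepts only the plain YYYY-MM-DD form with no time zone suffix, and the function name parse_xsd_date is hypothetical.

```python
import re
from datetime import date

XSD_DATE = "http://www.w3.org/2001/XMLSchema#date"


def parse_xsd_date(lexical):
    """Parse a plain xsd:date (YYYY-MM-DD); raise ValueError otherwise."""
    match = re.fullmatch(r"(\d{4})-(\d{2})-(\d{2})", lexical)
    if match is None:
        raise ValueError(f"not a valid xsd:date lexical form: {lexical!r}")
    year, month, day = (int(part) for part in match.groups())
    # date() itself rejects impossible values such as month 13 or day 31 in June.
    return date(year, month, day)


print(parse_xsd_date("1940-06-19"))
```

An unpadded value such as 1940-6-19 is rejected by this check, because xsd:date requires two-digit months and days.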
      RDF Example FOAF

• Go to FOAF-a-matic web site
  – http://www.ldodds.com/foaf/foaf-a-matic
• Create Profile, Generate RDF
• Copy RDF
• Go to W3C RDF Validator
  – http://www.w3.org/RDF/Validator/
  – Validate RDF – displays triples
                 Inference
• FOAF Person A, FOAF Person B, FOAF Person C
• RDF triple – A knows B
• RDF triple – B knows C
• Inferred triple – A is one degree separated from C
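The inference on this slide can be sketched as a simple rule over foaf:knows triples. The predicate URI used for the inferred relationship is made up for illustration (it is not a FOAF term), as are the person URIs and the function name infer_one_degree.

```python
KNOWS = "http://xmlns.com/foaf/0.1/knows"
# Hypothetical predicate for the inferred relationship, not part of FOAF.
SEPARATED = "http://example.org/terms/oneDegreeSeparatedFrom"


def infer_one_degree(triples):
    """If A knows B and B knows C (and A does not know C), infer A-C."""
    knows = {(s, o) for s, p, o in triples if p == KNOWS}
    inferred = set()
    for a, b in knows:
        for b2, c in knows:
            if b == b2 and a != c and (a, c) not in knows:
                inferred.add((a, SEPARATED, c))
    return inferred


A = "http://example.org/people/A"
B = "http://example.org/people/B"
C = "http://example.org/people/C"
triples = [(A, KNOWS, B), (B, KNOWS, C)]
inferred = infer_one_degree(triples)
print(inferred)
```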
                      RDF Exercise 1
•   Create some RDF instance xml for a specimen
     – Make up a namespace for the elements
     – Make up elements for:
        • Specimen Catalog Number (eg SP1)
        • Specimen Collected By
        • Specimen Locality
     – Validate using RDF validator on the web

•   Eg RDF
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
      xmlns:dc="http://purl.org/dc/elements/1.1/">

    <rdf:Description rdf:about="http://www.example.org/index.html">
           <dc:language>en</dc:language>
    </rdf:Description>

</rdf:RDF>
               Example Solution
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
      xmlns:spec="http://specimen.org/specimens/">

  <rdf:Description rdf:about="SP1">
       <spec:catalogNumber>SP1</spec:catalogNumber>
       <spec:collectedBy>Fred Smith</spec:collectedBy>
       <spec:locality>George St</spec:locality>

       <spec:collectedDate
  rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2006-10-18</spec:collectedDate>

  </rdf:Description>

</rdf:RDF>
            RDF Schema
• Similar to xml schema
• Define RDF classes, types and properties
• Namespace -
  http://www.w3.org/2000/01/rdf-schema#
                          RDF Schema
• Example Schema – TCS-RDF
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:tn="http://tdwg.org/2006/03/12/TaxonNames/">

   <rdfs:Class rdf:about="http://tdwg.org/2006/03/12/TaxonNames/TaxonName">
         <rdfs:label xml:lang="en">Taxon Name</rdfs:label>
         <rdfs:subClassOf rdf:resource="http://www.w3.org/2000/01/rdf-schema#Resource"/>
   </rdfs:Class>

   <rdf:Property rdf:about="http://tdwg.org/2006/03/12/TaxonNames/nameComplete">
         <rdfs:label xml:lang="en">Name Complete</rdfs:label>
         <rdfs:domain rdf:resource="http://tdwg.org/2006/03/12/TaxonNames/TaxonName"/>
         <rdfs:range rdf:resource="http://www.w3.org/2000/01/rdf-schema#Literal"/>
   </rdf:Property>
</rdf:RDF>
           RDF Exercise 2
• Collection RDF Schema
  – Write an RDF schema to describe our
    collection.org example LSIDs
  – Write an instance of this schema using
    rdf:Description tag or typed tag
  – Validate using RDF validator on the web
                        RDF
• RDF Triple Stores
  – What are they
  – Problems
    • size
    • speed of lookup
    • distribution/duplication
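To make the lookup-speed problem concrete, here is a toy in-memory triple store in Python that keeps one index per triple position, so a pattern query does not have to scan every triple. The class and method names are hypothetical, and real triple stores use far more elaborate storage and indexing.

```python
from collections import defaultdict


class TripleStore:
    """A tiny in-memory triple store with one index per triple position."""

    def __init__(self):
        self._triples = set()
        # _index[position][term] -> set of triples with that term at position
        self._index = [defaultdict(set), defaultdict(set), defaultdict(set)]

    def add(self, subject, predicate, obj):
        """Store a triple once; return False if it was already present."""
        triple = (subject, predicate, obj)
        if triple in self._triples:
            return False
        self._triples.add(triple)
        for position, term in enumerate(triple):
            self._index[position][term].add(triple)
        return True

    def match(self, subject=None, predicate=None, obj=None):
        """Return sorted triples matching the pattern; None is a wildcard."""
        candidates = None
        for position, term in enumerate((subject, predicate, obj)):
            if term is None:
                continue
            found = self._index[position].get(term, set())
            candidates = found if candidates is None else candidates & found
        if candidates is None:
            candidates = self._triples
        return sorted(candidates)


store = TripleStore()
PAGE = "http://www.example.org/index.html"
store.add(PAGE, "http://purl.org/dc/elements/1.1/language", "en")
store.add(PAGE, "http://purl.org/dc/elements/1.1/creator",
          "http://www.example.org/staffid/85740")
print(store.match(predicate="http://purl.org/dc/elements/1.1/language"))
```

The indexes trade memory for speed, which is one reason size and lookup speed pull against each other in larger stores.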
                    Ontologies
• Defined using RDFS or OWL (Web
  Ontology Language) & others
• Dublin Core
  – eg http://purl.org/dc/elements/1.1/creator
  – Eg http://www.example.org/index.html has a creator whose
    value is John Smith =
     • Subject = http://www.example.org/index.html
     • Predicate = http://purl.org/dc/elements/1.1/creator
     • Object = http://www.example.org/staffid/85740 (John’s Id)

• FOAF (Friend Of A Friend)
• TDWG ontologies?
              RDF Tools
• RDF Tools/Software
  – Protégé
  – Altova SemanticWorks
  – Oracle triple stores
  – W3C Validator
           RDF References
• RDF Primer - http://www.w3.org/TR/rdf-primer/
• RDF Syntax - http://www.w3.org/TR/rdf-syntax-grammar/
• RDF Schema - http://www.w3.org/TR/rdf-schema/
• Another Primer - http://notabug.com/2002/rdfprimer/
• W3Schools RDF Tutorial - http://www.w3schools.com/rdf/default.asp
• Dublin Core RDF - http://dublincore.org/
PART 2 - LSIDs
                     LSIDs
• Overview
• Requirements:
  – Files on network
    • ftp://cissus.mobot.org/incoming/tdwg/Tutorial/
    • Username: garfile Password: garden2003
  – Web server (IIS)
  – Text editor
                     LSIDs
• Background on GUIDs
  – What is a GUID?
    •   Globally Unique IDentifier
    •   Persistent
    •   Opaque & transparent
    •   Resolvable?
    •   Examples: UUID, DOI, Handle, LSID, PURL
                    LSIDs
– What is an LSID?
  •   Life Science IDentifier
  •   Developed by OMG & I3C
  •   Implemented by the team at IBM
  •   Structure –
      urn:lsid:authority:namespace:object:revision
       Eg urn:lsid:indexfungorum.org:names:213649
  • Used for – data objects, databases, images,
    files etc?
  • Versioning
  • Who's using them
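The LSID structure shown above can be picked apart with a short Python function. This is a sketch assuming the colon-separated form on this slide, with the revision part treated as optional; the names LSID and parse_lsid are hypothetical and unrelated to the IBM code stacks.

```python
from typing import NamedTuple, Optional


class LSID(NamedTuple):
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str] = None


def parse_lsid(text):
    """Split urn:lsid:authority:namespace:object[:revision] into its parts."""
    parts = text.split(":")
    if len(parts) not in (5, 6):
        raise ValueError(f"expected 5 or 6 colon-separated parts: {text!r}")
    if parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID URN: {text!r}")
    if any(part == "" for part in parts[2:5]):
        raise ValueError(f"empty authority, namespace or object: {text!r}")
    revision = parts[5] if len(parts) == 6 and parts[5] else None
    return LSID(parts[2], parts[3], parts[4], revision)


print(parse_lsid("urn:lsid:indexfungorum.org:names:213649"))
print(parse_lsid("urn:lsid:ipni.org:names:30000959-2:1.1.2.1"))
```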
                     GUID subgroup
• Subgroup decisions:
  – To use LSIDs for identifying biodiversity data, but not
    exclude use of other GUIDs
  – Reuse GUIDs where they already exist, eg some
    DOIs for literature (assuming no commercial restrictions)
  – Use RDF for metadata of objects identified by LSIDs
  – Implement as a minimum the HTTP GET metadata
    service for each LSID server/resolver
  – See GUID Report -
    http://wiki.gbif.org/guidwiki/wikka.php?wakka=GUID2Report&show_comments=1
Pros and Cons of LSIDs
URL     Tied to physical addresses
        Inspection required to determine
         identical contents
        Brittle (broken links)

LSID    Same Id = same content
        Location independent
        Enables transparent caching
        Formalized, rich metadata

Cons    Requires specialised software to resolve
         an LSID (not built in to most software)
        LSID Tools and Services
•   IBM LSID Launchpad
•   Firefox LSID Browser
•   LSID Tester (Rod Page)
•   Web based resolver – http://lsid.biopathways.org/resolver/

• Example LSID servers:
    – Bio Pathways –
      http://lsid.biopathways.org/resolver/urn:lsid:gene.ucl.ac.uk.lsid.biopathways.org:hugo:MVP
      (doesn’t work with Launchpad)
    – Index Fungorum - urn:lsid:indexfungorum.org:names:213649
    – IPNI – urn:lsid:ipni.org:names:30000959-2:1.1.2.1
    – uBio - urn:lsid:ubio.org:namebank:11815
               LSID Code
• Current Code Stacks
  – Open Source (sourceforge.net)
  – Java, C++, Perl (IBM)
  – Microsoft .NET (Myself)
     An LSID is resolved using a three-part resolution
• 1 – Resolve LSID Authority
  – DNS + HTTP
• 2 – Get Available Services
  – Client queries the authority, which returns WSDL
• 3 – Retrieve Data
  – From the metadata store and data store
  – Data via SOAP, HTTP, FTP, NFS, AFS, DFS
                   DNS Lookup
•   Example
•   Go to command prompt
•   > nslookup
•   > set type=srv
•   > _lsid._tcp.indexfungorum.org
•   Results in :
    lsid.indexfungorum.org (194.203.77.68)
    -> used as authority location for resolving
       indexfungorum.org LSIDs
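The name typed into nslookup above follows a fixed pattern: _lsid._tcp. followed by the authority part of the LSID. A small Python sketch of that step (the function name srv_query_name is hypothetical; the actual DNS query is left to nslookup or a DNS library):

```python
def srv_query_name(lsid):
    """Build the DNS SRV name used to locate the authority for an LSID."""
    parts = lsid.split(":")
    if len(parts) < 5 or parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID: {lsid!r}")
    authority = parts[2]
    return "_lsid._tcp." + authority


print(srv_query_name("urn:lsid:indexfungorum.org:names:213649"))
```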
Questions before hands on
        tutorial?
          Really Simple LSID
• A basic LSID setup
  – LSID server with only one service for returning
    metadata for LSIDs using HTTP GET
  – IIS + PHP


• Specimen collection example
  – urn:lsid:collection.org:specimens:[id]
  – Specimen text files to load metadata from
  – No data
         LSID Authority Setup
• LSID HTTP Request
  – Default is to return WSDL of the PHP authority location
• LSID Metadata Request
  – IIS passes the request to the PHP scripts
  – PHP scripts load metadata from file
  – Return metadata
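The request flow above can be sketched in Python for readers who want to see the logic outside PHP. This is not the tutorial's PHP code: the function name handle_request, the placeholder WSDL text and the one-file-per-specimen naming ([id].txt) are all assumptions made for illustration.

```python
from pathlib import Path
from urllib.parse import parse_qs

# Stand-in for the real WSDL document the authority would return.
WSDL_PLACEHOLDER = "(WSDL describing the authority's services)"


def handle_request(query_string, metadata_dir):
    """Return (status, body): WSDL by default, metadata when ?lsid= is given."""
    params = parse_qs(query_string)
    if "lsid" not in params:
        return 200, WSDL_PLACEHOLDER
    parts = params["lsid"][0].split(":")
    # Only accept simple alphanumeric object ids, since the id becomes a file name.
    if len(parts) < 5 or not parts[4].isalnum():
        return 400, "Malformed LSID"
    path = Path(metadata_dir) / (parts[4] + ".txt")  # assumed file layout
    if not path.is_file():
        return 404, "No metadata for this LSID"
    return 200, path.read_text(encoding="utf-8")
```

Calling handle_request("lsid=urn:lsid:collection.org:specimens:1", some_directory) would return the contents of 1.txt in that directory, if the file exists.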
         Step 1 IIS authority
• Add ‘authority’ folder to
  c:\inetpub\wwwroot
• Copy authority files to authority directory
• Start IIS:
  – Configure authority as a web application
• Explain files
              Step 2 PHP
• Copy PHP files to c:\php
• Configure IIS to run PHP files (add
  php5isapi.dll to the mappings for authority)
• Add index.php to default docs
• Set security permissions on php files/dir?
Step 3 Configure LSID Launchpad

• Add local host authority for collection.org
• Add application for text editor
• Test, eg
  lsidres:urn:lsid:indexfungorum.org:names:213649
         Step 4 Test authority
• Test authority
  – Browse to http://localhost/authority/
• Test metadata
  – Browse to http://localhost/authority/metadata.php
  – Browse to
    http://localhost/authority/metadata.php?lsid=urn:lsid:collection.org:specimens:1
• Test using LSID Launchpad
  – lsidres:urn:lsid:collection.org:specimens:1
  – Can't read/display it as it is text, not RDF
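The metadata test above is a plain HTTP GET, so it can also be scripted. A minimal Python sketch, assuming the local authority from this tutorial is running at http://localhost/authority/; the function names are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

DEFAULT_BASE = "http://localhost/authority/metadata.php"


def metadata_url(lsid, base=DEFAULT_BASE):
    """Build the HTTP GET metadata URL for an LSID (colons left unescaped)."""
    return base + "?" + urlencode({"lsid": lsid}, safe=":")


def fetch_metadata(lsid, base=DEFAULT_BASE):
    """Fetch the metadata text; needs the authority web server to be running."""
    with urlopen(metadata_url(lsid, base)) as response:
        return response.read().decode("utf-8")


print(metadata_url("urn:lsid:collection.org:specimens:1"))
```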
       Use RDF for metadata
• Use the RDF collection schema (created
  in the RDF section) in our LSID authority
• Build an RDF instance for the schema
• Test the LSID metadata using LSID
  Launchpad
  – Eg lsidres:urn:lsid:collection.org:specimens:4
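Building the RDF instance for a specimen can be automated. The sketch below generates RDF/XML for one LSID in Python using the standard library. The namespace URI, the coll prefix, the property names and the sample values are assumptions for illustration and should be replaced by whatever your collection schema defines.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
COLL_NS = "http://collection.org/specimens/"  # assumed schema namespace

ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("coll", COLL_NS)


def specimen_rdf(lsid, record):
    """Return RDF/XML describing one specimen; record maps property -> value."""
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    desc = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": lsid}
    )
    for name, value in record.items():
        ET.SubElement(desc, f"{{{COLL_NS}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")


xml_text = specimen_rdf(
    "urn:lsid:collection.org:specimens:4",
    {"catalogNumber": "SP4", "collectedBy": "Fred Smith", "locality": "George St"},
)
print(xml_text)
```

The output can be pasted into the W3C RDF Validator or returned by the metadata service in place of the plain text.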
            Linking to other data
• Add link for the person
• Eg
   – Change collectedBy property
        <rdf:Property rdf:about="&coll;collectedBy">
                <rdfs:label xml:lang="en">Collected By</rdfs:label>
                <rdfs:isDefinedBy rdf:resource="&coll;"/>
                <rdfs:domain rdf:resource="&coll;Specimen"/>
                <rdfs:range rdf:resource="http://collection.org/specimens/Person"/>
        </rdf:Property>
      to the rdf schema
   – Add
      <coll:collectedBy rdf:resource="urn:lsid:collection.org:people:1"/>
      To the instance
         Linking to other data
• See lsidres:urn:lsid:collection.org:specimens:6
            Java LSID Code
• Highlights of the LSID Java Stack
  – Simple APIs, synchronous and asynchronous
  – Support for HTTP, SOAP and FTP
  – WSDL, Data and Metadata Cache
  – Highly configurable
     • Cache location/policy, host-mappings, metadata
       handling
  – Leverages many Java technologies
     • Xerces, Xalan, Axis, wsdl4J, and Castor
              Java LSID Web Application
• Startup
  – Load services from the Services Configuration File
  – Create registry (Service Registry)
• Request
  – HTTP Req arrives at the LSID Server from the Internet
  – Lookup Service for LSID authority in the Service Registry
  – Call appropriate authority class function (LSID Authority Classes)
• Response
  – HTTP Resp returned to the client
   LSID Resolution code (Client)
• Given an LSID in String format, we can easily open an InputStream
  to the data or metadata. (listing1)

LSID lsid = new LSID("urn:lsid…");
LSIDResolver resolver = new LSIDResolver(lsid);
InputStream data = resolver.getData();

• If we care about which protocol or location to use, we can choose
  from those available in the WSDL returned from
  getAvailableServices.

LSIDWSDLWrapper wsdl = resolver.getWSDLWrapper();
LSIDDataPort port = wsdl.getPortForProtocol(WSDLConstants.SOAP);
InputStream in = resolver.getData(port);
                   LSID References
• LSID Source Forge - http://lsid.sourceforge.net/
• LSID .NET Source Forge - http://sourceforge.net/projects/lsid-dotnet
• LSID Tutorial - http://www-128.ibm.com/developerworks/opensource/library/os-lsid/
• LSID Specification - http://www.omg.org/cgi-bin/doc?dtc/04-05-01
• LSID Tester - http://linnaeus.zoology.gla.ac.uk/~rpage/lsid/tester/
• LSID Launchpad - http://www-124.ibm.com/developerworks/downloads/detail.php?group_id=124&what=rele&id=553
• GUID Subgroup - http://www.tdwg.org/TDWG_GUID.htm
• GUID Subgroup Reports
            – http://wiki.gbif.org/guidwiki/wikka.php?wakka=GUID2Report&show_comments=1
            – http://wiki.tdwg.org/twiki/pub/TIP/TipDocuments/GUID1Report.pdf
• Firefox LSID developer site - http://lsid.mozdev.org/
Thanks to
• Ricardo Pereira
• Roger Hyam
• Rod Page
• Lee Belbin
• The team at Landcare Research:
     • Jerry Cooper
    • Nick Spencer
    • Aaron Wilton
    • Michael Cochrane

								