					       Identifying the Identifiers


                   Douglas Campbell
National Library of New Zealand Te Puna Mātauranga o Aotearoa

                 DC-2007, 29th August 2007



                                                                #255685
Designing Identifiers


• Identifier Theory
• Identifier Qualities
• Identifier Design Checklist
                 Identifiers 101
Identify this…
                       Chair    椅
                       Seat
                       Plastic and metal thing
                       1 metre tall with four legs
                       The one on the left
                       The blue one
                       The stacking chair
                       Vitra Tom Vac Chair
                       Asset no. #1123-33
   Certificate
       Identifiers 101
For spectacular achievement
in assigning chair identifiers
    Awarded to: You .
                     Identifiers 999
  Identify these…
    • Application Profile (namespace)
    • Metadata schema (namespace)
    • Age (data field)
    • Economics (social networking tag)
    • Donor (relationship)
    • Economics (subject authority)
    • Boy with dog (photograph in collection)
    • Romeo and Juliet (FRBR work)
    • Boy with dog (digitised photo)
    • Craft Society (harvested, archived website)
    • Picasso, Pablo (1881-1973) (name authority)
    • Film director (agent role)
       Identifiers in Communication

Identifiers are part of communicating – to refer to a thing




The goal is “sameness”
  Latin: “idem” (the same) + “facere” (to make)
        Prerequisites for identifying

To Identify        … we need to Differentiate
To Differentiate   … we need to Compare sameness
To Compare         … we need to Define / Describe
To Describe          … we need to:
  - Observe characteristics      eg. size, location
  - Interpret characteristics    eg. smell, topic
  - Assign new characteristics eg. name, logo, id string
   Identify by comparing characteristics

We identify by comparing the sameness, or not, of characteristics
“Bring a seat in…”
   Requested:    • Can sit on
   Candidate 1:  • Can sit on   • Antique   • Wooden
   Candidate 2:  • Can sit on   • Modern    • Stackable
   Candidate 3:  • Can sit on   • Animal    • Black
   Identify by comparing characteristics

We identify by comparing the sameness, or not, of characteristics
“Bring an office seat in…”
   Requested:    • Can sit on   • Office
   Candidate 1:  • Can sit on   • Antique   • Expensive
   Candidate 2:  • Can sit on   • Office    • Stackable
   Candidate 3:  • Can sit on   • Animal    • Expensive
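
The “office seat” comparison can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: the names `candidate_1` to `candidate_3` are hypothetical labels for the three pictured things, and each characteristic set is copied from the slide. A thing matches a request when it has every requested characteristic.

```python
# Characteristics of each candidate thing (the candidate names are hypothetical).
candidates = {
    "candidate_1": {"can sit on", "antique", "expensive"},
    "candidate_2": {"can sit on", "office", "stackable"},
    "candidate_3": {"can sit on", "animal", "expensive"},
}


def matching(requested, things):
    """Return the names of the things that have every requested characteristic."""
    return sorted(name for name, traits in things.items() if requested <= traits)


# "Bring a seat in": one characteristic, so nothing is differentiated.
seat = matching({"can sit on"}, candidates)

# "Bring an office seat in": two characteristics narrow the match.
office_seat = matching({"can sit on", "office"}, candidates)

print(seat)
print(office_seat)
```

With a single requested characteristic all three candidates match; adding “office” leaves only one.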
  Promoting characteristics to identify

Use description characteristic(s) as surrogate (substitute)
• A symbol to represent the thing
• Convenience
  – compare single identifiers instead of lots of characteristics
• Promotion changes the characteristic’s role
  – purpose is differentiation, not description
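
A rough sketch of promotion, under stated assumptions: the `Chair` class and its field names are hypothetical, and the second asset number is invented for the example (only #1123-33 appears in the chair example earlier). Two chairs can share every observed characteristic, so the assigned asset number is the characteristic that gets promoted to do the differentiating.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chair:
    asset_no: str    # assigned characteristic, promoted to act as the identifier
    colour: str      # observed characteristics follow
    stackable: bool
    height_m: float


def same_by_description(a, b):
    """Compare lots of characteristics; look-alikes cannot be told apart."""
    return (a.colour, a.stackable, a.height_m) == (b.colour, b.stackable, b.height_m)


def same_by_identifier(a, b):
    """Compare the single promoted characteristic only."""
    return a.asset_no == b.asset_no


left = Chair("1123-33", "blue", True, 1.0)
right = Chair("1123-34", "blue", True, 1.0)  # hypothetical second chair

print(same_by_description(left, right))  # the descriptions coincide
print(same_by_identifier(left, right))   # the identifiers do not
```

The asset number says nothing about what the chair is like; its job here is purely to differentiate.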
               Context

[Figure: one house referred to by different identifiers in different contexts]
   No. 4                   (on an invitation: “Street Party at No. 4”)
   4 Same St
   4 Same St, Same City
   House #1221-3
            Describing and Identifying


[Diagram: a Thing is described by a Description made up of characteristics; selected characteristics are then associated with the Thing as Identifiers, one or more per context.]
                  Identifier Definitions

Identifier:
   “A stated association between a symbol and a thing;
   that the symbol may be used to unambiguously refer
   to the thing within a given context.”
        Thing:  Any entity, idea, action, resource, object, etc.
        Symbol: Any mark, token, sensory stimulus, character string, etc.

Identifier system:
   “Policies, processes, and/or mechanisms for
   assigning, managing, and using identifiers.”
                    Semiotics

• Study of how we communicate using signs and symbols
• We use symbols with no intrinsic meaning,
  we add meaning around these symbols
                Semiotic Triangle

• “Nothing is a sign unless it is interpreted as a sign”
    - Charles Peirce

[Diagram: a triangle with Concept at the top, Object (the Thing) at bottom left, and Symbol at bottom right. The Concept is held by an agent, as a remembrance of association and context. The link between Object and Symbol is only an implied relationship.]

• An identifier is a thought
• Identifiers are the manifestation of the act of identifying
        The Deconstructed Identifier

The deconstructed identifier has six aspects:
• A Thing
• A Symbol            built from characteristics in a description
• An Association      between the symbol and the thing
• A Context
• An Agent            that states the association and context
• A Remembrance       of the association and context
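
One way to picture the six aspects is as a small data model. This is a sketch under assumptions rather than anything the slides prescribe: the class names `Identifier` and `Register`, the context string "asset register", and the agent "asset manager" are all hypothetical. The record itself plays the part of the association, and the register plays the part of the remembrance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Identifier:
    """A stated association between a symbol and a thing within a context."""
    thing: str     # what is being identified
    symbol: str    # the mark used to refer to it
    context: str   # where the association holds
    agent: str     # who states the association and context


class Register:
    """The remembrance: keeps stated associations so they can be used later."""

    def __init__(self):
        self._records = {}

    def state(self, identifier):
        key = (identifier.context, identifier.symbol)
        existing = self._records.get(key)
        if existing is not None and existing.thing != identifier.thing:
            raise ValueError(f"{key} already refers to {existing.thing!r}")
        self._records[key] = identifier

    def refer(self, context, symbol):
        """Follow a symbol back to its thing; the symbol alone is not enough."""
        return self._records[(context, symbol)].thing


register = Register()
register.state(Identifier("Vitra Tom Vac Chair", "1123-33",
                          "asset register", "asset manager"))
print(register.refer("asset register", "1123-33"))
```

Looking the symbol up under a different context raises `KeyError`, which mirrors the point that an identifier only works within its stated context.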
                  Identifier Qualities

• Scope
• Uniqueness
• Granularity
• Intelligence
• Actionability
• Persistence
• Extensibility
• Context
                             Scope

Draw identifier from description, but be clear what is being described




[Figure: a newspaper article and a database (DB) record, with a question mark over which one is being described]
      Multiple scopes in one record

A MARC record contains multiple scopes
  (that’s not wrong, just be aware of it)

            100 Creator
            245 Title                   Vocabulary term
            650 Subject
            856 URL

Dublin Core’s “one-to-one rule” is a useful alternative
                          Uniqueness

 Often want to refer to a thing unambiguously



[Diagram: identifiers and the things they refer to]
              The unique Johns

Group                         Can “John” identify one person?
John   John    John   John    No: cannot uniquely identify John
Jane   Mike    John           Yes: John is a set of 1, but unique by coincidence
Jane   Mike    John   John    No: cannot uniquely identify John (group of people again)
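
The three groups can be checked mechanically. A minimal sketch (the function name `uniquely_identifies` is made up for this example): a name identifies uniquely only when exactly one member of the group carries it, so the answer depends on the group and not on the name.

```python
from collections import Counter


def uniquely_identifies(name, group):
    """True when exactly one member of the group carries this name."""
    return Counter(group)[name] == 1


groups = [
    ["John", "John", "John", "John"],
    ["Jane", "Mike", "John"],
    ["Jane", "Mike", "John", "John"],
]

results = [uniquely_identifies("John", group) for group in groups]
print(results)
```

The middle group passes only until a second John joins it, which is what “unique by coincidence” warns about.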
                      Uniqueness

• A thing has only one identifier
• An identifier only relates to one thing

[Diagram: four identifiers and four things, paired one-to-one]
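
Both uniqueness rules can be tested together over a list of (identifier, thing) pairs. This is an illustrative sketch; the function name `is_one_to_one` and the sample values are assumptions made for the example.

```python
def is_one_to_one(pairs):
    """Check that no thing has two identifiers and no identifier has two things."""
    thing_for = {}       # identifier -> thing
    identifier_for = {}  # thing -> identifier
    for identifier, thing in pairs:
        # setdefault stores the first pairing seen and returns it on later visits.
        if thing_for.setdefault(identifier, thing) != thing:
            return False  # this identifier already relates to another thing
        if identifier_for.setdefault(thing, identifier) != identifier:
            return False  # this thing already has another identifier
    return True


clean = [("id-1", "thing A"), ("id-2", "thing B")]
shared_identifier = [("id-1", "thing A"), ("id-1", "thing B")]
two_identifiers = [("id-1", "thing A"), ("id-2", "thing A")]

print(is_one_to_one(clean))
print(is_one_to_one(shared_identifier))
print(is_one_to_one(two_identifiers))
```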
                Global uniqueness

• Wrap naming authority identifier around local identifiers

        NA1                  NA2               NA3
      Org 1               Org 2             Org 3
      1                   1                 1
      2                   2                 2
      3                   3                 3



      NA1:1               NA2:1             NA3:1
      NA1:2               NA2:2             NA3:2
      NA1:3               NA2:3             NA3:3
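
A minimal sketch of the wrapping step, assuming a colon separator as in the slide. The helper names `make_global` and `split_global` are hypothetical, and the sketch also assumes a naming authority identifier never contains a colon.

```python
def make_global(naming_authority, local_id):
    """Wrap a local identifier with the naming authority's identifier."""
    return f"{naming_authority}:{local_id}"


def split_global(global_id):
    """Recover (naming authority, local identifier); splits at the first colon."""
    naming_authority, _, local_id = global_id.partition(":")
    return naming_authority, local_id


# Three organisations each use the same local identifiers 1, 2, 3.
local_ids = {"NA1": [1, 2, 3], "NA2": [1, 2, 3], "NA3": [1, 2, 3]}

global_ids = [
    make_global(authority, local_id)
    for authority, ids in local_ids.items()
    for local_id in ids
]

print(global_ids)
print(len(global_ids) == len(set(global_ids)))  # no clashes once wrapped
```

Global uniqueness then depends on the naming authority identifiers being unique among themselves, which needs its own identifier system one level up.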
                          Granularity

Question: How deeply should we break groups into
  separately identified things?
             • Journal
                  • Newspaper
                         • Page
                             • Article
                                   • Photo

Answer: If you have a need to identify it, then identify it!
• Methodology examples: FRBR, <indecs>
• In practice, the identifier system may dictate constraints
                      Intelligence

Adding roles of “description” and “remembrance” to identifiers
       Intelligent (semantic, transparent)        Dumb (opaque)
       Identifier:  nytimes_22may2004             Identifier:  1134
       Remembrance: inside the identifier         Remembrance: held separately (22 May 2004)



Intelligent identifiers are time-dependent – based on your world
   view at the time, eg. country names, “gay”, email address
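
The contrast can be sketched as follows. This is one possible arrangement, not a recommendation from the slides: the `OpaqueMinter` class is hypothetical, the starting number 1134 is borrowed from the example above, and the description fields are invented. The description lives beside the identifier, so it can be corrected when the world view changes without reissuing the identifier.

```python
import itertools


class OpaqueMinter:
    """Mints meaningless ("dumb") identifiers and remembers what they refer to."""

    def __init__(self, start=1134):
        self._numbers = itertools.count(start)
        self.remembrance = {}  # identifier -> description, kept outside the identifier

    def mint(self, description):
        identifier = str(next(self._numbers))
        self.remembrance[identifier] = dict(description)
        return identifier


minter = OpaqueMinter()
article_id = minter.mint({"publisher": "nytimes", "date": "2004-05-22"})

# The description is corrected later; the identifier does not change.
minter.remembrance[article_id]["publisher"] = "The New York Times"

print(article_id)
print(minter.remembrance[article_id])
```

An intelligent identifier such as nytimes_22may2004 would have needed reissuing to reflect the same correction.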
               Location as Identifier

• Lazy identifiers, e.g.:
   – System file path to HTML file = Web URL
   – Location on shelf = identifier

• Location identifiers are needed to access the thing,
  but may not be the best identifier to publicise
• Dilution – different identifiers for copies of the same thing,
  e.g. a.com/x.html, b.org/y.html, c.com/z.jsp
                Sidebar: http URIs

• URI = Uniform Resource Identifier
• URL = Uniform Resource Locator (a kind of URI)


• Many original URLs were lazy identifiers (using file location)
• Many are now more considered identifiers
  – they just happen to start with “http://” – called “http URIs”
• Often are intelligent identifiers, so beware (as discussed)
                   Actionability

[Diagram: identifiers linked to things through a remembrance. Labels: Live; Dead; Actionable (also Resolvable, De-referenceable); Context.]
                        Persistence

• How long does an identifier need to live?
• How do we keep it alive that long?
• Not a technology issue; it is a commitment issue
• Need policies for handling changes in environment
   – When an identifier is retired
   – When the thing itself changes
   – When the identifier system becomes obsolete
   – When the custodian of the identifier changes
   – Degree of mutability (allow identifiers to be re-associated?)
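
Two of these policies, retirement and mutability, can be made concrete in a short sketch. Everything here is an assumption for illustration: the `PersistentRegistry` class, the `allow_reassociation` switch, and the choice never to reuse a retired identifier are one possible policy, not one the slides prescribe.

```python
class PersistentRegistry:
    """Keeps identifier-to-thing associations under an explicit change policy."""

    def __init__(self, allow_reassociation=False):
        self.allow_reassociation = allow_reassociation
        self._things = {}
        self._retired = set()

    def assign(self, identifier, thing):
        if identifier in self._retired:
            raise ValueError(f"{identifier} is retired and is never reused")
        if identifier in self._things and not self.allow_reassociation:
            raise ValueError(f"{identifier} is already associated with a thing")
        self._things[identifier] = thing

    def retire(self, identifier):
        """Drop the association but remember that the identifier existed."""
        del self._things[identifier]  # KeyError if the identifier was never active
        self._retired.add(identifier)

    def lookup(self, identifier):
        """Return (status, thing), where status is active, retired, or unknown."""
        if identifier in self._things:
            return "active", self._things[identifier]
        if identifier in self._retired:
            return "retired", None
        return "unknown", None


registry = PersistentRegistry()
registry.assign("1134", "newspaper article")
before = registry.lookup("1134")
registry.retire("1134")
after = registry.lookup("1134")
print(before, after)
```

Answering “retired” rather than “unknown” is itself a commitment: someone has to keep the record of retired identifiers for as long as people may still hold them.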
                       Extensibility

• Persistence of identifier systems
   – Risk from unanticipated demand
   – Risk from re-use in unanticipated situations
   – Risk from changes to the environment

• Future-proof by including capacity to be adapted
   – As generic form as possible
   – Hooks for community-defined extensions
   – Consider scalability
   – Follow international standards
   – Keep application independent
                          Context

• Remembrance of association and its context
• Remembrance of context can sit alongside the identifier or be combined into it
   – Alongside: This journal has ISSN 1234-5678
   – Combined:  urn:issn:1234-5678      [except need to know what a URN is!]

• Often context is missing, assuming the reader can infer it!
• Multiple identifiers in multiple contexts is not undesirable
   – Sameness is different for different communities
   – Though helpful for similar contexts to be combined
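
The ISSN example can be sketched as a pair of helpers that combine a context with a bare identifier and split them apart again. The function names `with_context` and `parse_urn` are hypothetical, and the sketch handles only the simple `urn:<namespace>:<value>` shape shown above; it does not validate the ISSN itself.

```python
def with_context(namespace, value):
    """Combine the context (a URN namespace) with the bare identifier."""
    return f"urn:{namespace}:{value}"


def parse_urn(urn):
    """Split a simple URN into (namespace, value); raise ValueError otherwise."""
    parts = urn.split(":", 2)
    if len(parts) != 3 or parts[0].lower() != "urn":
        raise ValueError(f"not a URN: {urn!r}")
    return parts[1].lower(), parts[2]


combined = with_context("issn", "1234-5678")
print(combined)
print(parse_urn(combined))

# The bare string carries no context; the reader would have to infer it.
try:
    parse_urn("1234-5678")
except ValueError as error:
    print(error)
```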
   Checklist for designing identifiers

1. Audience
   • Consider how the identifiers are intended to be used and
     potential downstream uses
2. Scope
   • Determine the thing(s) being identified/described (scope,
     granularity)
3. Context
   • Determine the context(s) things are being identified within
     (granularity).
     For example, is it a concept/item/component/instance/etc.,
     or what communities will it serve?
4. Overlap
   •   Consider the relationship of the identifiers to other similar
       identifiers and/or contexts, consider merging
5. Persistence
   •   Determine the expected identifier lifespan and strategies to
       preserve the relationship to the associated thing for that long
       (e.g. commitment level, resourcing, and policies)
6. Design the identifier system:
   •   Identifier structure design – uniqueness, intelligence, actionability,
       persistence, extensibility, and communication of context
   •   Addressability - combine identifiers or standalone identifiers?
   •   Support - policies, processes, and mechanisms
7. Assign locally
   •    implementation (within your scope of control)

8. Global uniqueness
   •    Wrap local identifiers with global authority identifiers for wider
        use

9. Use them
   •    i.e. avoid using equivalent identifiers that may cause
        duplication or confusion!
Conclusion
        The Deconstructed Identifier

The deconstructed identifier has six aspects:
• A Thing
• A Symbol
• An Association
• A Context
• An Agent
• A Remembrance

[Diagram: the semiotic triangle (Concept, Object/Thing, Symbol), annotated with the agent and its remembrance of association and context; the link between Object and Symbol is an implied relationship]
                       Action List

1. Look backwards
  – Identifier audit

2. Look forwards
  – Identifier goals & requirements

3. Take action!
  – Make identifiers unique internally
        DCMI Identifiers Community

• DCMI is not the right place to do identifiers work,
  but it is a good place to disseminate it
• Key issues raised in DC Conference Special Session
   – Identification vs. resolution
   – Importance of being able to resolve

• New DCMI Identifiers Community – October 2007
• http://dublincore.org/groups/identifiers/
• http://dublincore.org/identifierswiki/
           Identifiers Special Session

Thursday 30th August 2007, 11.30am – 1pm, Karimata Room
• Identifier work:
   –   NISO identifiers roundtable – John Kunze
   –   DOIs & URNs – Juha Hakala
   –   Identifier principles – Stu Weibel
   –   Identifiers at National Library of New Zealand – Douglas
       Campbell
• Discussion on identifier issues:
   – What are the issues or areas of confusion?
   – What are the areas we DO understand?
   – What are some best practices?
                  Questions?

http://www.dcmipubs.org/ojs/index.php/pubs/article/view/34


            douglas.campbell@natlib.govt.nz

				