					                                      OCLC Online Computer Library Center




Pattern Recognition for Technical Services:
Interpreting the OCLC Environmental Scan


Eric Childress
Consulting Project Manager
OCLC Research



2006 Ohio Library Council Technical
Services Retreat
Mohican Resort & Conference Center




“Future Shock”

A personal perception of "too much
  change in too short a period of time."
  - Alvin Toffler, Future Shock (1970)




    Source: http://en.wikipedia.org/wiki/Future_Shock




This presentation

   The big picture
   The infoscape
     Content & Publishing
     Copyright & Licensing
     Collections & Acquisitions
     Metadata & Vocabularies
     Library Systems




The Big Picture




Big patterns

     Production anywhere, Global distribution

     Digital content & Portable devices
          iTunes & iPods

     Self-service users
          OffWeb: ATMs, self-check-out

          OnWeb: Webstores, eGov, eBanking, etc.

     Microcontent/Disaggregation/“My”aggregation
          Ringtones, e-News, RSS readers, My Yahoo/MSN/etc…

     Open Source & Open Content




Voices carry
    Individual-driven content rising:
         Simple, easy, free/affordable ways to share yourself:
              Personal web pages (Tripod, many others)
              Digital images (e.g., Flickr, others) – including from camera phones
              Blogs (Bloglines, others)

    Information is social & peer-to-peer (“Participative Net”)
         Open models (Wikipedia)
         Current generation shares content instinctively
    Brand & voice through new channels
         Blogging by top execs & by staff
         Personal branding -- Webcred is key to one’s fortunes
         “Brand inside should equal Brand outside” (Tom Peters)




Data rules
    Deep indexing:
         Google, Yahoo, etc. library digitization initiatives
         Amazon’s “Search inside”
         On demand: Google Alerts, MSN RSS, etc.
         Library space: netLibrary, Alexander Street, many others
    Instant verification:
         RSS, blogs, search engines, online news, opinion sites, fact checking
          sites, etc. combine to produce fast news, and tend to rapidly expose
          big lies & big spin
    Recommendation systems:
         Amazon, Apple iTunes, other retailers – “people like you chose…”
         Novel concepts: Pandora – suggests music based on intrinsic patterns
          of music you like (the “music genome”)
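The “people like you chose…” idea can be sketched as simple co-purchase counting. This is a minimal illustration only, not how Amazon or iTunes actually compute recommendations; the purchase data, user names, and the `recommend` function are all hypothetical.

```python
from collections import Counter

# Hypothetical purchase histories, invented for illustration.
purchases = {
    "ann": {"Book A", "Book B", "Book C"},
    "bob": {"Book A", "Book B"},
    "cat": {"Book A", "Book D"},
    "dan": {"Book B", "Book C"},
}

def recommend(user, purchases, top_n=3):
    """Suggest titles chosen by users whose purchases overlap with this user's."""
    mine = purchases[user]
    scores = Counter()
    for other, theirs in purchases.items():
        if other == user:
            continue
        overlap = len(mine & theirs)  # shared titles act as a crude similarity score
        if overlap == 0:
            continue
        for title in theirs - mine:   # only suggest titles the user lacks
            scores[title] += overlap
    return [title for title, _ in scores.most_common(top_n)]

print(recommend("bob", purchases))
```

Pandora's approach differs in kind: it matches on properties of the music itself rather than on what other listeners chose, so it would need item attributes instead of purchase histories.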




The Web reborn

  Web 1.0
     A collection of websites on a shared network
  Web 2.0 (in progress)
     The network as platform
        Spans all connected devices
        Delivers software as a continually-updated service

     Supports an architecture of participation
        Consuming and remixing data from multiple sources
        Providing your data for consuming/remixing by others




Infoscape



Content & Publishing




It’s all digital (or will be)

     Content is now born digital
          Editorial & publishing workflows are on computing platforms

          Systems can output various formats including electronic & physical

     Strong interest in digitizing older material
          Google Print Library project / Yahoo & Open Content Alliance / Million
           Books project, Project Gutenberg, others…

          Many, many digital library projects

          Other sources – Archives, museums, government agencies, NGO &
           university press publication backfiles, more…




Book trade
    A complex space gets more complex
    Mergers & failures have created:
        Megapublishers/Media Giants
        Megaretailers (e.g., Wal-Mart)
    Web has had an impact on publishing & retail:
        “Give e to sell p” model: free electronic access drives print sales (e.g., National Academies Press)
        New players: Amazon, isbn.nu, many others taking retail market share
        Bricks and mortar stores building web presence
    E-books & e-audiobooks
        Developing momentum (esp. STM e-books) and acceptance
        Novel approaches being tried, such as turning e-text into factual databases
        Pricing models & copyright/DRM still pose barriers




Serial/media publishing
     Publisher print-to-online transition accelerating
     Self-aggregation
          Article, news item, headline replacing journal, newspaper, magazine as
           unit of consumption
     Newspapers, magazines, radio, TV:
          More players – more TV channels, satellite radio, Internet radio, Web
           news sources, Google News, etc.
          Audience shifting to online or alternative sources (e.g., news blogs,
           alternative news outlets)
          Offline ad revenue not moving online as fast as readers are; audience
           & revenue being lost to Craigslist and other sales/classified ad channels




The long-tail




 Article (Wired, 2004) by Chris Anderson
 Using sales data from Amazon and others, it builds a case that in the Web
 age, niche & end-of-sales-cycle titles [i.e., backlist] (in yellow, the
 long tail) in digital format should be regarded as profitable frontlist:
      Burden (storage) and sales (e-version or POD) costs are minimal
      Small-volume sales over a large list = significant revenue
 N.B. BISG estimates the 2004 used book market at $2.2 billion (111
 million books; 8.4% of total consumer spending on books).
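The “small volume sales over a large list” arithmetic can be made concrete with a toy model. The sketch below assumes unit sales fall off as 1/rank (a Zipf-like curve); the catalog size, head size, and sales figures are hypothetical and are not taken from Anderson's article or from Amazon data.

```python
# Hypothetical catalog in which unit sales fall off as 1/rank.
CATALOG_SIZE = 100_000     # assumed number of titles available as e-versions or POD
HEAD_SIZE = 1_000          # assumed number of titles a physical store might stock
TOP_TITLE_SALES = 10_000   # assumed yearly unit sales of the best-selling title

def unit_sales(rank):
    """Yearly unit sales for the title at a given sales rank (1 = best seller)."""
    return TOP_TITLE_SALES / rank

head_units = sum(unit_sales(r) for r in range(1, HEAD_SIZE + 1))
tail_units = sum(unit_sales(r) for r in range(HEAD_SIZE + 1, CATALOG_SIZE + 1))
tail_share = tail_units / (head_units + tail_units)

print(f"Head ({HEAD_SIZE:,} titles): {head_units:,.0f} units")
print(f"Tail ({CATALOG_SIZE - HEAD_SIZE:,} titles): {tail_units:,.0f} units")
print(f"Tail share of all units: {tail_share:.0%}")
```

Under these assumptions no single tail title sells more than a handful of copies, yet the tail as a whole accounts for well over a third of all units sold. The result depends heavily on the assumed curve, so it illustrates the argument rather than supporting it.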




Infoscape



Copyright & Licensing




Copyright
     World copyright regime growing more uniform and less public
      domain/fair use friendly:
         WIPO (World Intellectual Property Organization)

         More signatories to Berne Convention

         Well-funded, powerful publishing & media interests are pressing for
          strong protection (e.g., DRM (digital rights management))

     Public domain and kindred space getting better organized:
         Open Content/Access movement gaining momentum
              Creative Commons & similar content licensing efforts

              Government funding bodies pressing for open access for funded research

         For software, various open licensing regimes, Open Software
          movement




Copyright & Licensing
   Libraries are players, but not entirely agreed on best solution –
    various voices advocate:
        Digital First Sale provision similar to physical First Sale scheme
        Fair use exceptions for libraries (including unlocking privileges for
         locked digital content)
        Major overhaul of copyright law, the Digital Millennium Copyright Act, etc. to
         restore the Founders’ idea of default public domain save a brief period of
         protection early in the life of the intellectual property
   Terms and conditions vary across owned and leased content in
    library collections:
        Owned content:
             Chiefly physical materials – Terms & Conditions usually known (First Sale doctrine)

        Leased content:
             Acquired through bundles, consortial deals, etc.
             Chiefly digital, often not stored on library-controlled hardware or storage media
             Varied terms, subscription schemes, provisions for long-term access




Infoscape



Collection Management
& Acquisitions




[Figure not reproduced] Source: OCLC




Published content space…
   Libraries originally established to collect and manage scarce
    content in physical containers
        Now in a period of content abundance (the Web)
        Libraries still prone to overlaying a physical-collection perspective on e-resources

   Physical materials supply chain ever more automated
        Ordering, processing/cataloging, ready to shelve…
   Digital content continues to make inroads into libraries
    (spending up; users want it)
        E-books finally gaining some traction
        E-audiobooks getting attention and interest from users
        Strong trend to access published digital content remotely rather than load it locally

   Collection/selection process trends
        New and improved selection tools from ILS vendors, jobbers
        Cooperative collection arrangements, cooperative remote storage




[Figure not reproduced] Source: ARL



Other parts of collections grid…
    Special collections:
         Often unique to a single library -- typically high interest in digitizing, but
          not necessarily the bandwidth/funding to do it
         ARL’s “hidden collections” work (addressing cataloging backlog)
    Education/research products:
         Opportunity for libraries to help scholarship & teaching, but not simple
          or inexpensive task
         Interfaces between the systems, processes & practices of Course Mgt.
          Systems (CMS) & those of library services are mostly poorly developed.
          Overlap with e-reserves? Library often invisible in CMS
    Open web:
         Varied content (akin to Grey literature) & unclear what role(s) libraries
          should/can play vs. search engines, Internet Archive, etc.
         Various slices-of-web projects:
              Some libraries harvest all or some content from their country’s domain
              Topical/period projects such as the Library of Congress’s Election 2002 Web Archive




Infoscape



Metadata &
Vocabularies




Metadata
    Libraries have long tradition of quality:
         Interoperability across communities of practice
         Rich, authoritative descriptions (now sought by search engines, others)
    But things are changing in library cataloging …
         Severe cost consciousness & ROI (return on investment) review
              Cataloging is expensive, and old assumptions are being revisited – we don’t create
               card catalogs much anymore, but our metadata is still card catalog-oriented
              Are we missing opportunities?
                     AACR2 & MARC mix content & presentation – difficult to fully leverage value
                     Non-library staff willing to build metadata (esp. for right-side-of-grid items) –
                      can libraries apply expertise to influence & leverage other record sources?

         FRBR (Functional Requirements for Bibliographic Records)
              Very powerful rethink of relationships in content & metadata
              Initial work in building better OPAC displays (OCLC, RLG, VTLS, LC, others…)

         RDA (Resource Description & Access) [formerly “AACR3”]
              Addresses separating metadata content from metadata display (ISBD)
              Will significantly change prevailing cataloging practice




Vocabularies
     Controlled vocabularies are “In”
         Corporate sector investing & leveraging for managing internal content,
          driving sales in webstores

         Libraries, museums, archives continue to invest in formal vocabularies
          to facilitate search & retrieval

     Controlled vocabularies are “Out”
         Clay Shirky & other digerati have declared them passé
              Too big, too complicated, too old school, too slow to adopt new concepts, too
               closed

         Advocate:
              Tagging: Adding keyword access points to images, music files, etc.

              Folksonomies: collaborative categorization using freely chosen keywords

     Various agencies experimenting with exposing vocabularies in
      new, machine-readable ways (e.g., OCLC’s terminology
      services project)
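The difference between tagging and a controlled vocabulary shows up even in a tiny example. The sketch below builds a folksonomy by counting freely chosen keywords per item; the tag assignments and the `build_folksonomy` helper are hypothetical, and the only normalization assumed is trimming whitespace and lowercasing.

```python
from collections import Counter, defaultdict

# Hypothetical (user, item, tag) assignments, invented for illustration.
taggings = [
    ("u1", "photo-17", "Ohio"),
    ("u2", "photo-17", "ohio"),
    ("u2", "photo-17", "covered bridge"),
    ("u3", "photo-17", "bridge"),
    ("u3", "photo-17", "ohio "),
    ("u1", "photo-42", "library"),
]

def build_folksonomy(taggings):
    """Count how often each (lightly normalized) tag was applied to each item."""
    by_item = defaultdict(Counter)
    for _user, item, tag in taggings:
        by_item[item][tag.strip().lower()] += 1
    return by_item

folksonomy = build_folksonomy(taggings)
for tag, count in folksonomy["photo-17"].most_common():
    print(f"{tag}: {count}")
```

Note that “bridge” and “covered bridge” stay separate access points here: nothing relates broader and narrower terms or merges synonyms, which is the work a formal vocabulary does.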




Infoscape



Library Systems




New demands…

    Support empowered consumption
      Open Source/Content IP being leveraged (e.g., Apache,
        Personalization
    Surface libraries seamlessly
      Point-of-need delivery (e.g., library content in non-library
       apps)
    Open standards, easy integration
      Many trading partners, changing often
      Mash-ups deliver remixed functions & data from multiple
       providers in a seamless, integrated experience (using tools
       like Greasemonkey, etc.)




Convergence

  Bookseller, publisher, library catalogs showing
   feature convergence
     Various efforts underway to get publisher/jobber
      data earlier in bibstream
  Big central files for searching, not federated small
   silos
     Search engines & Big bib files (Open WorldCat,
      RedLightGreen)
     But…silos make reasonable harvesting targets (OAI)
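Harvesting a silo over OAI (OAI-PMH) amounts to repeated HTTP requests that follow a resumption token from page to page. The sketch below builds a `ListIdentifiers` request and parses a response using only the Python standard library. The repository URL and the sample response are hypothetical, and the response is trimmed (a real one also carries `responseDate` and `request` elements); no network call is made.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}
BASE_URL = "https://repository.example.org/oai"  # hypothetical repository

def list_identifiers_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListIdentifiers URL; a resumption token replaces other arguments."""
    if resumption_token:
        params = {"verb": "ListIdentifiers", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListIdentifiers", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"

def parse_identifiers(xml_text):
    """Return (record identifiers, next resumption token or None)."""
    root = ET.fromstring(xml_text)
    ids = [el.text for el in
           root.findall("oai:ListIdentifiers/oai:header/oai:identifier", OAI_NS)]
    token_el = root.find("oai:ListIdentifiers/oai:resumptionToken", OAI_NS)
    token = token_el.text if token_el is not None and token_el.text else None
    return ids, token

# Trimmed, made-up response for illustration.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListIdentifiers>
    <header><identifier>oai:example.org:1</identifier></header>
    <header><identifier>oai:example.org:2</identifier></header>
    <resumptionToken>page-2</resumptionToken>
  </ListIdentifiers>
</OAI-PMH>"""

identifiers, next_token = parse_identifiers(SAMPLE_RESPONSE)
next_url = list_identifiers_url(BASE_URL, resumption_token=next_token)
print(identifiers)
print(next_url)
```

A harvester would loop, fetching `next_url` until the response has no token, which is what lets a central index absorb many small silos.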




Future systems

    System refactoring
        Modularity (micro-services, remixing, multiple sources)

        Layering (loosely-coupled systems)

        Interoperability (low-friction, high reuse)
             Lightweight protocols gaining favor (e.g., SRW/SRU, microformats)

        Machine-oriented services (web services)

    User-centered design
        User-tasks-oriented designs (e.g., NCSU catalog)

        User-customized views/a la carte (e.g., “my” university portals)

        User-contributed content (tagging, etc.)
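Part of what makes SRU “lightweight” is that a search is just a URL. The sketch below assembles an SRU 1.1 `searchRetrieve` request with a CQL query; the endpoint URL and the `sru_search_url` helper are hypothetical, and no request is actually sent.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

SRU_BASE_URL = "https://catalog.example.org/sru"  # hypothetical endpoint

def sru_search_url(base_url, cql_query, maximum_records=10):
    """Build an SRU searchRetrieve URL for the given CQL query."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql_query,               # CQL, e.g. an index, a relation, a term
        "maximumRecords": maximum_records,
    }
    return f"{base_url}?{urlencode(params)}"

url = sru_search_url(SRU_BASE_URL, 'dc.title = "future shock"')
print(url)

# Round-trip check: decoding the URL recovers the original CQL query.
decoded = parse_qs(urlsplit(url).query)
print(decoded["query"][0])
```

Because the whole request fits in a link, any application that can fetch a URL and read XML can act as a catalog client, which is the low-friction, high-reuse property the slide points to.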




End

 eric_childress@oclc.org




Further reading…
    Rethinking how we provide bibliographic services for the
     University of California
         http://libraries.universityofcalifornia.edu/sopag/BSTF/Final.pdf
    “Making data work - Web 2.0 and catalogs” / Lorcan Dempsey
         http://orweblog.oclc.org/archives/000815.html
    “Thinking about the catalog” / Lorcan Dempsey
         http://orweblog.oclc.org/archives/000919.html
    OCLC Scan & other reports
         http://www.oclc.org/reports
    It’s All Good
         http://scanblog.blogspot.com

				