Taxonomy Development Workshop

        Facets and Faceted Navigation
        Development

                   Tom Reamy
           Chief Knowledge Architect
                   KAPS Group
   Knowledge Architecture Professional Services
           http://www.kapsgroup.com
Agenda
 Two Case Studies
   –   Good and Bad
 Development Process
   –   Research Foundation
   –   Facet Design: Sources
   –   Integrated Solution
       • Metadata Strategy – Technology and People
   – Develop, Test, Monitor, Refine Application
 Conclusions



Enterprise Environment – Case Studies
 A Tale of Two Taxonomies
  –   It was the best of times, it was the worst of times
 Basic Approach
  –   Initial meetings – project planning
  –   High level K map – content, people, technology
  –   Contextual and Information Interviews
  –   Content Analysis
  –   Draft Taxonomy – validation interviews, refine
  –   Integration and Governance Plans

Enterprise Environment – Case One – Taxonomy, 7 facets

 Taxonomy of Subjects / Disciplines:
   –   Science > Marine Science > Marine microbiology > Marine toxins
 Facets:
   –   Organization > Division > Group
   –   Clients > Federal > EPA
   –   Instruments > Environmental Testing > Ocean Analysis > Vehicle
   –   Facilities > Division > Location > Building X
   –   Methods > Social > Population Study
   –   Materials > Compounds > Chemicals
   –   Content Type – Knowledge Asset > Proposals
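
A minimal sketch of how content tagged with the subject taxonomy and facets above might be stored and filtered. The document records, field names, and the `matches` and `filter_docs` helpers are hypothetical illustrations, not part of the case study. Facet values are stored as paths so that a selection at any level also matches everything beneath it.

```python
# Hypothetical documents; facet paths reuse the examples from the slide.
DOCS = [
    {
        "title": "Marine toxin sampling proposal",
        "subject": ("Science", "Marine Science", "Marine microbiology", "Marine toxins"),
        "facets": {
            "Clients": ("Federal", "EPA"),
            "Content Type": ("Knowledge Asset", "Proposals"),
        },
    },
    {
        "title": "Population study methods note",
        "subject": ("Science", "Marine Science"),
        "facets": {
            "Methods": ("Social", "Population Study"),
            "Clients": ("Federal", "EPA"),
        },
    },
]


def matches(doc, selections):
    """True if every selected facet path is a prefix of the document's path."""
    for facet, wanted in selections.items():
        actual = doc["facets"].get(facet, ())
        if actual[: len(wanted)] != tuple(wanted):
            return False
    return True


def filter_docs(docs, selections):
    """Titles of documents that satisfy all facet selections (facets are ANDed)."""
    return [d["title"] for d in docs if matches(d, selections)]


federal = filter_docs(DOCS, {"Clients": ("Federal",)})
proposals = filter_docs(
    DOCS, {"Clients": ("Federal",), "Content Type": ("Knowledge Asset",)}
)
```

Each facet is an independent dimension here, so selections combine freely: choosing Clients > Federal keeps both documents, and adding Content Type > Knowledge Asset narrows the set to the proposal.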



Enterprise Environment – Case One – Taxonomy, 7 facets

 Project Owner – KM department – included RM, business
  process
 Involvement of library - critical
 Realistic budget, flexible project plan
 Successful interviews – build on context
   –   Overall information strategy – where taxonomy fits
 Good Draft taxonomy and extended refinement
   –   Software, process, team – train library staff
   –   Good selection and number of facets
 Final plans and hand off to client


Enterprise Environment – Case Two – Taxonomy, 4 facets

 Taxonomy of Subjects / Disciplines:
   –   Geology > Petrology
 Facets:
   – Organization > Division > Group
   – Process > Drill a Well > File Test Plan
   – Assets > Platforms > Platform A
   – Content Type > Communication > Presentations




Enterprise Environment – Case Two – Taxonomy, 4 facets

 Environment Issues
   – Value of taxonomy understood, but not the complexity
     and scope
   – Under-budgeted, understaffed
   – Location – not KM – tied to RM and software
       • Solution looking for the right problem
   – Importance of an internal library staff
   – Difficulty of merging internal expertise and taxonomy




Enterprise Environment – Case Two – Taxonomy, 4 facets

 Project Issues
   –   Project mind set – not infrastructure
   –   Wrong kind of project management
        • Special needs of a taxonomy project
        • Importance of integration – with team, company
   –   Project plan more important than results
        • Rushing to meet deadlines works even less well with semantics
          than with software




Enterprise Environment – Case Two – Taxonomy, 4 facets

 Research Issues
   –   Not enough research – and wrong people
   –   Interference of non-taxonomy – communication
   –   Misunderstanding of research – wanted tinker toy connections
        • Interview 1 implies conclusion A
 Design Issues
   –   Not enough facets
   –   Wrong set of facets – business not information
   –   Ill-defined facets – too complex internal structure



Taxonomy Development
Conclusion: Risk Factors
 Political-Cultural-Semantic Environment
   –   Not simple resistance - more subtle
        • Re-interpretation of specific conclusions, of the sequence of
          conclusions, and of the relative importance of specific recommendations
 Understanding project scope
 Access to content and people
   –   Enthusiastic access
 Importance of a unified project team
   –   Working communication as well as weekly meetings



Faceted Navigation: Development process
Overview
 Research Foundation – KA Audit
   – Environment – Technology and People
   – Users, Content, Information Behaviors and Needs

 Facet Design - Sources
   –   Selection of Facets and Facet Structure
 Integrated solution
   –   Metadata Strategy – Technology and People
 Application
   –   Design, Develop, Test, Refine
   –   Monitor and Refine

Faceted Navigation: Development process
Information / Knowledge Environment
 Strategic Foundation
   – Info Problems – what, how severe
   – Political environment – support, special interests
 Strategic Questions – why, what value from the taxonomy and
  facet classification, how are you going to use it
 Technology Environment – ECM, Enterprise Search
 High Level Content Map / Content Structures
 High Level Community Map – formal and informal




Faceted Navigation: Development process
Facet Design - Sources
 Facet Theory and Practice
    –   Broaden your perspective
 Domain Collection - metadata
    – Database or Catalog
    – Unstructured content – Much more difficult
 Content Structure – vocabularies, glossaries, etc.
 Building Facets – facetize the taxonomy
    –   Pull out facets:
         • Chemistry – Agents/Compounds, Instruments
         • Chemistry and Health – Methods
 Current or projected metadata as source
    –   Content Types – presentations, well reports, policy


Faceted Navigation: Development process
Research Foundation
  Users – formal and informal communities
    –   How do users think, categorize
  Information behaviors and needs
    –   Natural Level categories
 What labels do they use?
    –   Assets vs. Facilities and instruments / Processes vs Activities
    –   Issue – labels that people use to describe their business vs. labels
        that they use to find information
 Suitability of Facets and Facet Labels
    –   Support for user tasks
 Interviews, surveys, search log analysis, folksonomies


Faceted Navigation: Development process
An Integrated Approach: Elements
 Multiple Knowledge Structures
   –   Facet – orthogonal dimension of metadata
   –   Taxonomy - Subject matter / aboutness
 Technology – Search, Content Management
 Text analytics
   –   Entity extraction – feeds facets, signatures, ontologies
   –   Taxonomy & Auto-categorization – aboutness, subject
 People – tagging, evaluating tags, fine tune rules and
  taxonomy
 People – Users, social tagging, suggestions


Faceted Navigation: Development process
Integrated Solutions: Technology
 Search – Integrated features, facets and clusters and tag
  clouds and feedback
 Enterprise Content Management – tagging and Policy
   –   Place to add metadata, supported by policy
   –   Gather input from authors, tag clouds plus
 Text Analytics – Taxonomy management, entity extraction,
  categorization, sentiment
   –   Auto-populate variety of metadata – author, title, date, etc.
   –   Relevance – best bets to weights and classes of documents



Faceted Navigation: Development process
Software Tools – Auto-categorization
 Auto-categorization
   –   Training sets – Bayesian, Support Vector Machine
   –   Terms – literal strings, stemming, dictionary of related terms
   –   Rules – simple – position in text (Title, body, url)
   –   Advanced – saved search queries (full search syntax)
   –   NEAR, SENTENCE, PARAGRAPH
   –   Boolean – X NEAR Y and Not-Z
 Advanced Features
   – Facts / ontologies /Semantic Web – RDF +
   – Sentiment Analysis – positive, negative, neutral
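
A minimal sketch of a Boolean proximity rule of the form "X NEAR Y and Not-Z", as listed above. The tokenizer, the default window of 5 tokens, and the function names are assumptions for illustration; commercial categorization tools define NEAR, SENTENCE, and PARAGRAPH operators with their own syntax and semantics.

```python
import re


def tokens(text):
    """Lowercase word tokens; a stand-in for a real tokenizer or stemmer."""
    return re.findall(r"[a-z0-9]+", text.lower())


def near(words, x, y, window=5):
    """True if x and y occur within `window` tokens of each other."""
    xs = [i for i, w in enumerate(words) if w == x]
    ys = [i for i, w in enumerate(words) if w == y]
    return any(abs(i - j) <= window for i in xs for j in ys)


def rule_matches(text, x, y, z, window=5):
    """Boolean rule: (x NEAR y) AND NOT z."""
    words = tokens(text)
    return near(words, x, y, window) and z not in words


# Hypothetical rule for a "test plan" category: drilling NEAR plan, and not budget.
hit = rule_matches("Drilling crew filed the test plan.", "drilling", "plan", "budget")
miss = rule_matches("Drilling budget and test plan attached.", "drilling", "plan", "budget")
```

The first sentence satisfies the rule. The second contains both terms close together but is excluded by the Not-Z clause.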




Faceted Navigation: Development process
Software Tools – Entity Extraction
 Dictionaries – variety of entities, coverage, specialty
    – Cost of update – service or in-house
    – Inxight – 50+ predefined entity types
    – Nstein – 800,000 people, 700,000 locations, 400,000 organizations
 Rules
    – Capitalization, text – Mr., Inc.
    – Advanced – proximity and frequency of actions, associations
    – Need people to continually refine the rules
 Entities and Categorization
    –   Total number and pattern of entities = a type of aboutness of
        the document – Bar Code, Fingerprint
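
A minimal sketch of the simple rule type above: capitalization combined with text cues such as "Mr." and "Inc.". The patterns, the cue lists, and the sample sentence are assumptions for illustration. They miss many real cases, which is one reason the slide stresses dictionaries and continual refinement by people.

```python
import re

# One or more capitalized words, used as the name part of each pattern.
NAME = r"[A-Z][a-z]+(?:\s[A-Z][a-z]+)*"

# A title before a capitalized name suggests a person.
PERSON = re.compile(r"\b(?:Mr|Ms|Mrs|Dr)\.\s(" + NAME + r")")
# A capitalized name before a company suffix suggests an organization.
ORG = re.compile(r"\b(" + NAME + r")\s(?:Inc|Corp|Ltd)\.")


def extract_entities(text):
    """Return candidate people and organizations found by the two rules."""
    return {
        "people": PERSON.findall(text),
        "organizations": ORG.findall(text),
    }


sample = "Mr. John Smith met analysts from Acme Data Inc. on Tuesday."
entities = extract_entities(sample)
```

The extracted values can then populate People and Organization facets, and the overall pattern of entities in a document can serve as a rough signal of what it is about.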


Faceted Navigation: Development process
Integrated Solution: People
  Programmers, Librarians, Taxonomists, Metadata specialists
   –   Integrate, design, develop rules, monitor activity & quality
 Authors, Subject Matter Experts
   –   Input into design (important facets), rules, activity meaning
 Users – Web 2.0
   –   Feedback – quality and usability
   –   Suggestions – missing terms, bad categorization & entity
   –   Tag clouds & folksonomy – for social networking features,
       not for information retrieval



Faceted Navigation: Development process
Faceted Navigation Application
 Usability Studies
   –   Integration with browse/search - Findability
   –   Equal ranked facets or primary-secondary facets
   –   Granularity of Facets
   –   Ordering of the facets
   –   Sorting within facets
 Monitor usage and refine.
   – Unused facets / Preferred facets / facet combinations
   – Map to user communities / information behaviors
 Refine auto-categorization and entity values
   –   Disambiguation
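
A minimal sketch of the monitoring step above: counting which facets are used, which are never used, and which are used together. The session log format and the `facet_usage` helper are hypothetical; a real deployment would read these events from search or application logs.

```python
from collections import Counter
from itertools import combinations

# Facets offered in the interface (taken from the Case Two slide).
OFFERED = ["Organization", "Process", "Assets", "Content Type"]

# Hypothetical log: each entry is the set of facets used in one search session.
SESSIONS = [
    {"Process", "Content Type"},
    {"Process"},
    {"Process", "Content Type"},
    {"Organization", "Process"},
]


def facet_usage(sessions, offered):
    """Count facet use, co-occurring facet pairs, and offered facets never used."""
    used = Counter(f for s in sessions for f in s)
    pairs = Counter(
        pair for s in sessions for pair in combinations(sorted(s), 2)
    )
    unused = [f for f in offered if used[f] == 0]
    return used, pairs, unused


used, pairs, unused = facet_usage(SESSIONS, OFFERED)
top_facet = used.most_common(1)[0]
top_pair = pairs.most_common(1)[0]
```

Splitting the same counts by user community would give the mapping to communities and information behaviors mentioned above.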


Conclusion - Development
 Design starts with self-knowledge – users, content, activities
 Integrated Solution is needed
    – Multiple Knowledge structures, technology, people
    – Search, Content management, text analytics
 Faceted navigation requires a lot of Metadata
  Text Analytics (Entity extraction and auto-categorization) is
  essential
 Monitor and Refine never ends – dedicated resources
 Semantic Projects are different
    –   Project management, software evaluation



Conclusions – Faceted Navigation
 The future is the combination of simple facets (name catalogs of
  entities) with rich taxonomies with complex semantics / ontologies
    –   Ontologies = Relationships of two facets
  Facets call for a new type of taxonomy
    –   Faceted taxonomies and/or simple taxonomies
 Future – new kinds of applications:
    –   Text Mining, research tools, sentiment
 Future of Search – smart ways to refine results, not better
  relevance
    – Real problem with 10 million hits – no way to get to target
    – Include facets, taxonomies, semantics, & lots of metadata
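
A minimal sketch of refining a result set with facets rather than relying on relevance ranking alone. The result records, the values "Reports" and "Platform B", and the helper names are hypothetical. The idea shown is that each facet value is displayed with a count, and selecting a value narrows the set and recomputes the counts.

```python
from collections import Counter, defaultdict

# Hypothetical result set; each hit carries simple facet values.
RESULTS = [
    {"title": "Well test plan", "Content Type": "Presentations", "Assets": "Platform A"},
    {"title": "Platform inspection", "Content Type": "Reports", "Assets": "Platform A"},
    {"title": "Petrology overview", "Content Type": "Presentations", "Assets": "Platform B"},
]


def facet_counts(results, facets):
    """Count how many hits carry each value of each facet."""
    counts = defaultdict(Counter)
    for hit in results:
        for facet in facets:
            if facet in hit:
                counts[facet][hit[facet]] += 1
    return counts


def refine(results, facet, value):
    """Keep only the hits tagged with the selected facet value."""
    return [hit for hit in results if hit.get(facet) == value]


counts = facet_counts(RESULTS, ["Content Type", "Assets"])
narrowed = refine(RESULTS, "Assets", "Platform A")
narrowed_counts = facet_counts(narrowed, ["Content Type"])
```

This only works when every item carries the metadata, which is why the deck ties faceted navigation to a metadata strategy and text analytics.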




             Questions?
                Tom Reamy
          tomr@kapsgroup.com
                KAPS Group
Knowledge Architecture Professional Services
        http://www.kapsgroup.com
Faceted Navigation Resources
 Articles
   –   Faceted Classification Resource Collection
        • http://deyalexander.com/resources/faceted-classification.html
   –   A Simplified Model for Facet Analysis
        • http://iainstitute.org/pg/a_simplified_model_for_facet_analysis.php
   –   Mailing List for Faceted Classification
        • http://www.poorbuthappy.com/fcd/
   –   Study – Facets on the Web (75 ecommerce sites)
        • http://mypage.iu.edu/%7Eklabarre/facetstudy.html


Faceted Navigation Resources
 Example Implementations
  –   Berkeley SIMS – Flamenco
      http://bailando.sims.berkeley.edu/flamenco.html
  –   Facetmap – demos – www.facetmap.com
 Tools
  –   Business Objects / Inxight – entity and fact extraction –
      www.inxight.com
  –   Teragram – www.teragram.com
  –   Lexalytics – www.lexalytics.com
  –   Data Harmony – www.dataharmony.com
  –   Smart Logic – www.smartlogic.com

Faceted Navigation Resources

 Vendors
  –   Most Search vendors now offer faceted navigation
  –   FAST, Autonomy, etc.
      • Beware of parametric search sold as facets
  –   Most focused on facets – application and metrics:
      • Endeca – http://www.endeca.com




Faceted Navigation Resources
 Articles
   –   How to Make a Faceted Classification and Put It On the Web
        • http://www.miskatonic.org/library/facet-web-howto.html
   –   Putting Facets on the Web: An Annotated Bibliography
        • http://www.miskatonic.org/library/facet-biblio.html
   –   Ecommerce – cooking and kitchen – Faceted Navigation
       http://www.
   –   Extended Faceted Taxonomies for Web Catalogs
        • http://www.ercim.org/publication/Ercim_News/enw51/tzitzikas.html
   –   Webdesignpractices – study of ecommerce use of faceted
       navigation – Use of Faceted Classification
        • http://www.webdesignpractices.com/navigation/facets.html

