Subject Indexing





    ARD Prasad
ard@drtc.isibang.ac.in
        Assumption


• Indexing = Classification
   Indexing Problem

• How to kill my neighbor?
Importance of Context


• How much Context?
• How to represent Context?
           Luhn’s Contribution
•   Assumption: Title provides the context
•   Permute the title
•   KeyWord In Context (KWIC)
•   Variants
    – KWOC
    – KWAC
    – Etc.
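
A minimal Python sketch of the KWIC idea, assuming a small hypothetical stop-word list and a plain rotation of the title around each keyword. The function name `kwic` and the `/` end-of-title marker are illustrative choices, not Luhn's original layout.

```python
# Hypothetical stop-word list; a real index would use a much longer one.
STOP_WORDS = {"a", "an", "and", "in", "of", "on", "the", "to"}

def kwic(title):
    """Return one rotated entry per keyword, sorted by keyword."""
    words = title.split()
    entries = []
    for i, word in enumerate(words):
        if word.lower() in STOP_WORDS:
            continue  # stop words never lead an entry
        # Rotate the title so the keyword comes first; "/" marks the
        # original end of the title.
        rotated = words[i:] + ["/"] + words[:i]
        entries.append(" ".join(rotated))
    return sorted(entries, key=str.lower)

for entry in kwic("Dry salt curing of pig skin in Thailand"):
    print(entry)
```

Each keyword of the title becomes an access point, and the rest of the title travels with it as context.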
    Significance: Permutation
                 Indexing

• Context-Free Indexing
  – Uniterm Indexes
• Context-Sensitive Indexing
  – Pre-Coordinate Indexing (Mostly manual)
  – Post-Coordinate Indexing (Using Phrases,
    Boolean etc.)
  Context-Sensitive Indexing of
    Ranganathan's School

• Context
  – POPSI (Postulate based Permuted Subject
    Indexing)
• Controlled Vocabulary
  – Classaurus – Classified Thesaurus based on
    Analytico-Synthetic approach
    Adopting Analytico-Synthetic
             Approach


• Idea Plane
• Verbal Plane
• Notational Plane (Shelf arrangement)
         Co-extensiveness
• Extension (broad)
• Intension (narrow)
• Representation should be co-extensive across all
  the planes of work
• Keyword should be co-extensive to the
  idea it represents
               Classification
• Organizing Classification
  – Brings out hierarchical structures
• Associative Classification
  – Brings out associations (relations)
  – Ex:
     • Child – Psychology
     • Child – Medicine
     • Child – Education
    Associative Classification
• Normally derived from Organizing
  Classification for better results
• Non-hierarchical
• Breaks the rigid mono-hierarchical nature
  of Organizing Classification
• Can result in multiple (poly-hierarchical)
  relations
• Which means, it brings out many
  contexts for a Keyword
           Division

• Genus/species
• Whole/part
  – constituents
    • Protein, Glucose (Composition)
  – Step in a process
             Facets

•   D – Discipline (Subject)
•   E – Entity (Personality)
•   P – Property (Matter/property)
•   A – Action (Energy)

• DEPA
          Modifiers (Speciators)
• Generally produce genus/species relation
• Each facet can have modifiers
• Modifiers
  – Time modifier
  – Space modifier
  – Form modifier
  – Discipline based modifier
  – Entity based modifiers
  – Property based modifiers
  – Action based modifiers
• Independent Modifier
• Dependent Modifiers
• Common Modifiers
  – Time
  – Space
  – Form
  – Environment
• Special Modifiers
  – D, E, P, A modifiers
                 Modifiers

• Independent (can modify anything)
  – Ex: High temperature
• Dependent (can only modify modifiers)
  – Ex: Very high temperature
        Common Modifiers



• Space, Time, Environment and Form
             Space Modifiers

•   World            • Roman Empire
•   Continents       • Mediterranean countries
•   Countries        • Equatorial zone
•   Provinces        • Islamic countries
•   States
•   Districts
            Time Modifiers


•   Millennium     •   Night
•   Century        •   Season
•   Year           •   Dry period
•   Month          •   Stormy period
•   Day
     Environmental Modifiers

•   Land           •   Marine
•   Subterranean   •   Industrial
•   Desert         •   Fresh water
•   Forest         •   Lake
•   Mountains      •   Tropical
•   Coastal land
           Form Modifiers


•   Bibliography   •   Case study
•   Encyclopedia   •   Mail
•   Periodical     •   Parody
•   History        •   Picture
•   Biography
             Special Modifiers

• Modify one and only one elementary
  category
• They can be D,E,P,A based
  – Modifiers of disease
    •   Infectious
    •   Viral
    •   Bacterial
    •   Fungal
           Kinds of Subjects
• Simple Subject
  – Any of DEPA
• Compound Subject
  – A combination of simple subjects
    • Ex: agriculture of Rice
• Complex subject
  – Phase Relations
               Classaurus
• A Vocabulary Control Device
• A thesaurus arranged in facets (a
  Classified Thesaurus or Faceted
  Thesaurus)
• Each Key term may have
  – Elementary Category / Modifier
  – Use
  – Used For (UFs)
  – Broader Term (BTs)
  – Narrower Term (NTs)
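
A minimal Python sketch of how one Classaurus record might be held in memory, assuming the fields listed above map onto plain attributes. The class name `ClassaurusEntry`, the category codes, and the synonym "Hog skin" are hypothetical; part and type links are both treated as broader/narrower here for brevity.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class ClassaurusEntry:
    term: str
    category: str                      # "D", "E", "P", "A", or "M" for a modifier
    use: Optional[str] = None          # preferred term when this one is non-preferred
    used_for: List[str] = field(default_factory=list)   # UFs
    broader: List[str] = field(default_factory=list)    # BTs
    narrower: List[str] = field(default_factory=list)   # NTs

vocab = {
    e.term: e
    for e in [
        ClassaurusEntry("Hide and skin", "E", narrower=["Skin"]),
        ClassaurusEntry("Skin", "E", broader=["Hide and skin"], narrower=["Pig skin"]),
        ClassaurusEntry("Pig skin", "E", broader=["Skin"], used_for=["Hog skin"]),
        ClassaurusEntry("Hog skin", "E", use="Pig skin"),   # invented synonym
    ]
}

def missing_reciprocals(vocab):
    """List (term, broader) pairs whose BT does not list the term as an NT."""
    return [
        (e.term, bt)
        for e in vocab.values()
        for bt in e.broader
        if bt not in vocab or e.term not in vocab[bt].narrower
    ]

print(missing_reciprocals(vocab))
```

A check like `missing_reciprocals` is one way to keep BT and NT links consistent when the vocabulary is maintained by hand.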
           Representation
• POPSI syntactic rules can use Rule-based
  knowledge Representation
• Classaurus can be presented as a book for
  manual indexing
• Machine Environment
  – Frame based Knowledge Representation
    (Expert Systems)
  – Embedded as features in NLP
  – SKOS, OWL in Semantic Web
                 POPSI
• Analysis of the content of a document
• Normalizes terminology using Classaurus
• Synthesis of the concepts representing a
  document
• Generates context based subject Indexing
  strings
                   Steps in POPSI
1. Raw title
2. Expressive titles (Normally from Abstract)
3. Analysis
4. Formalization (Identifying Categories using Classaurus &
   preparing the raw subject Index entry following the Syntax
   prescribed by Postulational approach - POPSI)
5. Standardization (choosing Standard terms using Classaurus)
6. Modulation (Choosing Broader Terms using Classaurus)
7. Preparation of the Index entry for Organizing Classification
   (Adding numeric indicators)
8. Permutation (bringing out Associative Classification)
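
Steps 5 and 6 can be sketched in Python, assuming the Classaurus is reduced to two hypothetical lookup tables: `USE` for non-preferred terms and `BROADER` for a single broader term per entry. The synonym "Hog skin" is invented for illustration, and the tables are assumed to contain no cycles.

```python
# Hypothetical vocabulary fragments; a real Classaurus would supply these.
USE = {"Hog skin": "Pig skin"}          # non-preferred term -> standard term
BROADER = {
    "Pig skin": "Skin",
    "Skin": "Hide and skin",
    "Dry salt curing": "Salt curing",
    "Salt curing": "Curing",
    "Curing": "Beam-house operation",
}

def standardize(term):
    """Step 5: replace a term by its standard form, if one is recorded."""
    return USE.get(term, term)

def modulate(term):
    """Step 6: return the chain of broader terms, broadest first."""
    chain = [term]
    while chain[-1] in BROADER:
        chain.append(BROADER[chain[-1]])
    return list(reversed(chain))

print(modulate(standardize("Hog skin")))
print(modulate("Dry salt curing"))
```

The chains come out broadest first, which is the order in which the terms appear in the modulated subject string.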
                     Syntax

• DISCIPLINE (BASE) is followed by
• ENTITY (CORE OBJECT) which is followed by
• PROPERTY and/or ACTION.
• PROPERTY and/or ACTION may be followed by
  COMMON MODIFIERS (CM).
• MODIFIERS for each of the Elementary Categories
  follow immediately after each category

• D[m],E[m],P[m],A[m], CM
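
The ordering rule above can be sketched in Python, assuming each analysed concept arrives as a (category, term, modifiers) tuple. The bracket rendering of modifiers mirrors the `D[m]` notation and is an illustrative choice, as are the names `ORDER` and `arrange`.

```python
ORDER = ["D", "E", "P", "A", "CM"]            # sequence prescribed by the syntax
RANK = {category: i for i, category in enumerate(ORDER)}

def arrange(components):
    """Order (category, term, modifiers) tuples as D[m], E[m], P[m], A[m], CM."""
    ordered = sorted(components, key=lambda c: RANK[c[0]])
    parts = []
    for _category, term, modifiers in ordered:
        # Modifiers stay attached to the category they modify.
        suffix = "[" + "; ".join(modifiers) + "]" if modifiers else ""
        parts.append(term + suffix)
    return ", ".join(parts)

# Components listed in arbitrary order, as analysis might produce them.
print(arrange([
    ("A", "Curing", ["Dry salt"]),
    ("CM", "Thailand", []),
    ("D", "Leather technology", []),
    ("E", "Skin", ["Pig"]),
]))
```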
                      Indicators
• To bring out the Associative Relations while
  sorting
  – 0 Form Modifier
  – 2 Time Modifier
  – 3 Environment Modifier
  – 4 Place Modifier
  – 8 Entity/Core Object
  – 9 Discipline/Base
          Indicators (continued)
• .1 Action
• .2 Property
• .3 Constituent
• .4 Part
• .5 Modifier of Kind 1 including Phase
  Relation Modifier
• .6 Species/Type, including those created
  by Modifiers of Kind 2
 Title: Dry salt curing of pig skin in Thailand

• (Discipline/Base) Leather Technology
  (Entity/Core) Pig skin (Action) Dry salt curing
  (Place Modifier) Thailand.

• (D/B) Leather technology (E/C) Hide and skin
  (Part of E) Skin (Type of E) Pig skin (A) Beam-
  house operation (Sub-action) Curing (Type of A)
  Salt curing (Type of A) Dry salt curing (Common
  modifier) Thailand.
• Leather technology 8 Hide and skin 8.4
  Skin 8.6 Pig skin 8.1 Beam-house
  operation 8.1.4 Curing 8.1.6 Salt curing
  8.1.6 Dry salt curing 4 Thailand
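
A minimal Python sketch of how the numbered string above could be assembled, assuming each indicator is an anchor ("8" for the entity, "8.1" for its action, "4" for place) plus a relation suffix (".4" part, ".6" type). The tables `ANCHOR` and `RELATION` are read off this one worked example and are not a complete statement of the POPSI rules.

```python
ANCHOR = {"E": "8", "A": "8.1", "place": "4"}     # read off the worked example
RELATION = {"self": "", "part": ".4", "type": ".6"}

def indicator(anchor, relation="self"):
    """Compose an indicator from an anchor and a relation suffix."""
    return ANCHOR[anchor] + RELATION[relation]

def render(base, components):
    """Join the base term and (anchor, relation, term) triples into one string."""
    parts = [base]
    for anchor, relation, term in components:
        parts.append(f"{indicator(anchor, relation)} {term}")
    return " ".join(parts)

print(render("Leather technology", [
    ("E", "self", "Hide and skin"),
    ("E", "part", "Skin"),
    ("E", "type", "Pig skin"),
    ("A", "self", "Beam-house operation"),
    ("A", "part", "Curing"),
    ("A", "type", "Salt curing"),
    ("A", "type", "Dry salt curing"),
    ("place", "self", "Thailand"),
]))
```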
• Medicine 0 Bibliography 2 Nineteen fifties
• Medicine 0 Dictionary
• Medicine 0 History 2 Nineteenth century
• Medicine 0 History 4 India 2 Nineteenth century
• Medicine 1
• Medicine 8.1 Physiology
• Medicine 8.2 Anatomy
• Medicine 8.2 Disease
• Medicine 8.2 Disease 8.2.1 Diagnosis
• Medicine 8.2 Disease 8.2.1 Treatment
• Medicine 8 Child
• Medicine 8 Child 8.2 Disease
• Medicine 8 Child 8.2 Disease 8.2.6 Infectious disease
• Medicine 8 Child 8.2 Disease 8.2.6 Infectious disease 8.2.1 Treatment
• Medicine 8 Child 8.4 Respiratory system 8.2 Disease 8.2.6 Infectious
  disease 8.2.1 Treatment
• Medicine 8 Child 8.4 Respiratory system 8.4 Lung 8.2 Disease 8.2.6
  Infectious disease 8.2.1 Treatment
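
Strings like those above can be taken apart again by machine. The Python sketch below assumes that any whitespace-separated token made only of digits and dots is an indicator, so a term that itself contains such a token would be split incorrectly. Filing the whole string under every term is a simplified stand-in for the permutation step, whose exact rules are not given here.

```python
import re
from collections import defaultdict

INDICATOR = re.compile(r"\d+(\.\d+)*")

def parse(entry):
    """Split an index string into (indicator, term) pairs; the base term gets ''."""
    pairs = [["", []]]
    for token in entry.split():
        if INDICATOR.fullmatch(token):
            pairs.append([token, []])      # a new component starts here
        else:
            pairs[-1][1].append(token)     # still inside the current term
    return [(ind, " ".join(words)) for ind, words in pairs]

def lead_terms(entry):
    """Terms under which the entry could be filed."""
    return [term for _, term in parse(entry) if term]

entries = [
    "Medicine 8 Child",
    "Medicine 8 Child 8.2 Disease",
    "Medicine 8.2 Disease 8.2.1 Treatment",
]

index = defaultdict(list)
for entry in entries:
    for term in lead_terms(entry):
        index[term].append(entry)          # the full string is kept as context

for term in sorted(index):
    print(term, "->", index[term])
```

Looking up "Disease" then returns both the general entry and the one about children, each with its full context string.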
      Thank You



• Questions?

				