Making provision for Multilingual Information Retrieval



                                    Nobuko Miyairi
           LIS670 Information Storage and Retrieval
                                       Spring 2001
   Background
    TUFS (Tokyo University of Foreign Studies)
    – 26 languages
    – students and faculty from other countries

    TUFS Library
    – 500,000 books + 1,000 periodicals
    – over 120 languages
    – 9 librarians



4/30/01                      2
   Challenges
    You don’t know the language
    – “How can I locate web sites about Sufism written in Urdu?”

    You don’t know the technology
    – “How can I display Urdu on my computer?”

    You can’t evaluate what you have retrieved
    – “the current trends in academic libraries in Thailand”

   Who needs it?
    Information Seeker
    – students, scholars, researchers
    – general audience

    Surrogate Searcher
    – librarians

    Information Provider
    – webmasters, content creators


   Information on WWW
    Growing Scale of WWW
    – more than 1 billion pages (as of 2001)
    – 68% of pages are in English


    How about translation?
    – can’t cover the volume of content
    – can’t keep up with the pace of growth
    – can’t satisfy diverse information needs


   Solution 1
    Localization
    – OS level
    – mostly bilingual, not multilingual
    Software
    – editor, word processor
    – browser
    – IME (input method editor)

    Gateway Service


   Solution 2
    Tools for Surrogate Searcher
    – cross-lingual search engine


    – automatic translation service
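
    To illustrate the idea behind a cross-lingual search engine, here is a minimal sketch of dictionary-based query translation: an English query is mapped to target-language terms, which are then matched against an index. The dictionary entries, document IDs, sample texts, and function names are all hypothetical and chosen only for illustration. Real systems also need proper word segmentation, handling of ambiguous translations, and ranking.

```python
from collections import defaultdict

# Hypothetical toy bilingual dictionary (English -> Japanese).
BILINGUAL_DICT = {
    "university": ["大学"],
    "library": ["図書館"],
}

# Hypothetical documents, pre-segmented with spaces for simplicity.
DOCS = {
    "d1": "大学 図書館 案内",
    "d2": "大学 入試",
    "d3": "library guide",
}


def translate_query(query, dictionary):
    """Replace each query word with its dictionary translations.

    Words missing from the dictionary are kept unchanged.
    """
    terms = []
    for word in query.lower().split():
        terms.extend(dictionary.get(word, [word]))
    return terms


def build_index(docs):
    """Build an inverted index: token -> set of document IDs."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


def search(query, dictionary, index):
    """Rank documents by how many translated query terms they contain."""
    scores = defaultdict(int)
    for term in translate_query(query, dictionary):
        for doc_id in index.get(term, ()):
            scores[doc_id] += 1
    # Highest score first; ties broken by document ID.
    return sorted(scores, key=lambda d: (-scores[d], d))


INDEX = build_index(DOCS)
print(search("university library", BILINGUAL_DICT, INDEX))
```

    Note that the English-only document is not returned, because the query terms were replaced by their translations. A fuller design might search with both the original and the translated terms.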



    Standards
    – universal character set (e.g., Unicode, ISO/IEC 10646)
    – document standard (e.g., XML, MARC21)
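
    As a small illustration of what a universal character set provides, the sketch below uses Python's standard `unicodedata` module to inspect a short Urdu string: every character has one code point and one standard name, regardless of platform. The sample word and the helper names `describe` and `script_of` are assumptions made for this example. Taking the first word of a character name as its script is only a rough heuristic, not an official script property.

```python
import unicodedata
from collections import Counter

# The word "Urdu" written in Urdu (Arabic script), given as escapes
# so the example does not depend on the editor's encoding.
URDU = "\u0627\u0631\u062f\u0648"


def describe(text):
    """Return (code point, Unicode name) pairs for each character."""
    return [(f"U+{ord(c):04X}", unicodedata.name(c, "UNKNOWN")) for c in text]


def script_of(text):
    """Guess the dominant script from Unicode character names.

    Heuristic: the first word of a letter's name (e.g. ARABIC, LATIN).
    Returns None if the text contains no alphabetic characters.
    """
    counts = Counter(
        unicodedata.name(c, "UNKNOWN").split()[0]
        for c in text
        if c.isalpha()
    )
    return counts.most_common(1)[0][0] if counts else None


for code_point, name in describe(URDU):
    print(code_point, name)

# UTF-8 is one encoding of these code points; each Arabic letter
# here takes two bytes.
print(len(URDU.encode("utf-8")), "bytes in UTF-8")
print(script_of(URDU))
```

    Storing text this way is only half of the problem raised earlier: displaying it still requires a font and rendering support for the script on the user's computer.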

   Summary
    Multilingual Computing Technology
    CLIR: Cross-lingual Information Retrieval
    Services on the Web
    Standards

    None of them is a cure-all
    – interdisciplinary cooperation
    – and your creativity!

