Leveraging Semantic Technologies for Enterprise Search

Gianluca Demartini
demartini@L3S.de
L3S Research Center
Leibniz Universität Hannover
      – M.Sc. in Udine, Italy (Dec 05)
      – Ph.D. Student in Hannover, Germany (Mar 06)
      – Research Interests:
            • IR evaluation
            • Enterprise Search
            • Integration of SW and IR
      – My Goal: get a Ph.D. (before end 2009)
– No previous work
  • see the 58 references in the paper
– 1 slide per Research Question

– Thoughts on my Big Picture


         How can we query for different item types together: integrating document and people search

        – Extend the Vector Space Model to cover both Docs and People
        – Place People in the same Space, using several kinds of expertise evidence
        – Query the Space so that both Docs and People are retrieved
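
A minimal sketch of the idea on this slide, under stated assumptions: documents are plain term-frequency vectors, and a person's vector is simply the sum of the vectors of the documents they authored (authorship being only one possible kind of expertise evidence). The toy collection `DOCS` and the function names are hypothetical and are not taken from the talk.

```python
import math
from collections import Counter

def tf_vector(text):
    """Bag-of-words term-frequency vector for a text."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical toy collection: doc id -> (text, authors)
DOCS = {
    "d1": ("semantic web ontology metadata", ["alice"]),
    "d2": ("enterprise search ranking evaluation", ["bob"]),
    "d3": ("ontology based expert search", ["alice", "bob"]),
}

def build_space(docs):
    """Place documents and people in one shared vector space."""
    space = {}
    for doc_id, (text, authors) in docs.items():
        vec = tf_vector(text)
        space[("doc", doc_id)] = vec
        # Assumption: authorship is the only expertise evidence used here.
        for person in authors:
            space.setdefault(("person", person), Counter()).update(vec)
    return space

def search_docs_and_people(query, space, k=5):
    """Rank documents and people together by cosine similarity to the query."""
    q = tf_vector(query)
    scored = [(cosine(q, vec), kind, name) for (kind, name), vec in space.items()]
    scored = [entry for entry in scored if entry[0] > 0]
    return sorted(scored, reverse=True)[:k]
```

Calling `search_docs_and_people("ontology", build_space(DOCS))` returns a single ranked list that mixes `("doc", ...)` and `("person", ...)` entries. Other evidence, such as e-mail threads or project membership, could be added as further weighted terms in each person's vector.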
                 How can we query structured and unstructured data together?


                 – DB search
                     • Keyword search in DB
                 – IR search
                     • Structured search (author:john)
                 – Goal: (un-)structured search on
                   (un-)structured data
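
One way to picture the goal above is a single query string that mixes field constraints such as `author:john` with free keywords. The sketch below is only an illustration under assumptions: the `field:value` syntax is generalised from the slide's one example, and the records, field names, and function names are hypothetical.

```python
def parse_query(query):
    """Split a query into structured filters (field:value) and free keywords."""
    filters, keywords = {}, []
    for token in query.split():
        field, sep, value = token.partition(":")
        if sep and field and value:
            filters[field.lower()] = value.lower()
        else:
            keywords.append(token.lower())
    return filters, keywords

# Hypothetical records: structured fields plus an unstructured text field
RECORDS = [
    {"author": "john", "year": "2008", "text": "Semantic search in the enterprise"},
    {"author": "mary", "year": "2008", "text": "Keyword search in databases"},
    {"author": "john", "year": "2007", "text": "Desktop metadata annotation"},
]

def matches(record, filters, keywords):
    """True if the record satisfies every field filter and contains every keyword."""
    for field, value in filters.items():
        if str(record.get(field, "")).lower() != value:
            return False
    words = set(record.get("text", "").lower().split())
    return all(keyword in words for keyword in keywords)

def mixed_search(query, records):
    """Apply structured and unstructured constraints in one pass."""
    filters, keywords = parse_query(query)
    return [record for record in records if matches(record, filters, keywords)]
```

Here `mixed_search("author:john search", RECORDS)` keeps only records whose `author` field is `john` and whose text contains the word `search`. A real system would rank results rather than filter them with a Boolean test.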
     How can we benefit from both Semantic Web and Information Retrieval techniques in enterprise search?


     – Semantic Search
           • Use metadata to improve content-based search
     – IR indexing
           • Use taxonomies instead of flat term-based indexing
     – Expert Search
           • Use ontologies as expertise taxonomies
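
The contrast between flat term-based indexing and taxonomy-based indexing can be sketched as follows. This is a toy illustration, not the approach from the paper: the taxonomy `PARENT`, the documents, and the assumption that concepts have already been extracted from each document are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy taxonomy: child concept -> parent concept
PARENT = {
    "ontology": "semantic web",
    "rdf": "semantic web",
    "semantic web": "computer science",
    "ranking": "information retrieval",
    "information retrieval": "computer science",
}

def ancestors(concept):
    """Return all broader concepts of a concept, nearest first."""
    chain = []
    while concept in PARENT:
        concept = PARENT[concept]
        chain.append(concept)
    return chain

def build_index(doc_concepts):
    """Index each document under its concepts and all of their ancestors."""
    index = defaultdict(set)
    for doc_id, concepts in doc_concepts.items():
        for concept in concepts:
            for node in [concept, *ancestors(concept)]:
                index[node].add(doc_id)
    return index

# Assumption: concepts were already extracted from each document.
DOC_CONCEPTS = {"d1": ["rdf"], "d2": ["ranking"], "d3": ["ontology", "ranking"]}
```

With this index, a lookup for the broader concept `"semantic web"` also finds documents that mention only `"rdf"` or `"ontology"`, which a flat term index would miss. The same structure could hold people instead of documents if the taxonomy is read as an expertise taxonomy.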

      How can we automatically enrich metadata annotations in a social infrastructure?



      – Scenario: desktops with metadata annotations
      – Search your community for new metadata annotations
      – Ask similar peers how they annotate similar resources
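
A rough sketch of the scenario above, with simplifying assumptions stated up front: peers count as similar when they annotate many of the same resources (Jaccard overlap), and "similar resources" is reduced to "the same resource name". The peer data, the threshold, and the function names are hypothetical.

```python
def jaccard(a, b):
    """Jaccard overlap between two collections, treated as sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

# Hypothetical peers: peer -> {resource: set of tags}
PEERS = {
    "peer_a": {"paper.pdf": {"semantic-web", "search"}, "notes.txt": {"todo"}},
    "peer_b": {"paper.pdf": {"ir", "search"}, "slides.ppt": {"talk"}},
    "peer_c": {"photo.jpg": {"holiday"}},
}

def suggest_tags(my_annotations, resource, peers, min_similarity=0.1):
    """Suggest tags that similar peers use for a resource and I do not use yet."""
    scores = {}
    mine = my_annotations.get(resource, set())
    for annotations in peers.values():
        # Assumption: peer similarity is the overlap of annotated resources.
        similarity = jaccard(my_annotations.keys(), annotations.keys())
        if similarity < min_similarity:
            continue
        for tag in annotations.get(resource, set()) - mine:
            scores[tag] = scores.get(tag, 0.0) + similarity
    # Highest accumulated similarity first, ties broken alphabetically.
    return sorted(scores, key=lambda tag: (-scores[tag], tag))
```

For a desktop holding `{"paper.pdf": {"search"}, "notes.txt": {"todo"}}`, the call `suggest_tags(mine, "paper.pdf", PEERS)` proposes tags from the peers whose annotated resources overlap most with the local ones.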
          Web, enterprise and desktop: how do they differ?

      – Link structure
      – Spam
      – Privacy
            • Sharing data
            • Activity logging



    How can we systematically evaluate enterprise search?


      – Relevance Definition
      – Metrics
      – Standard evaluation approach: testbed
      – Privacy issues for building public collections



                     How can we personalize the enterprise search user experience?


– User Observation
      • Activity logging
      • Context detection
– Tasks
– User Role

                     Keywords: Privacy, Information Retrieval, Tags, Personalization, Web Search, Evaluation, Algorithms, User Modelling, Semantic Web, Desktop Search, SOA, Metadata, Recommendation, Context, Social Networks, Expert Search



              – Integrate techniques from different fields
              – Innovate where the improvement is (economically) assessable
                     Thanks


				