					                       METASEARCH
National Institute of Science & Technology




                                               META SEARCH

                                                    Presented By
                                                  Chilika Pujari


                                                 Under the guidance of
                                            Mr. Indraneel Mukhopadhyay


                                            What is it?

• A metasearch engine is a system that supports unified access to multiple local search engines.
• It does not maintain its own index of web pages.
• A metasearch engine is an effective tool for quickly reaching a large portion of the deep Web.





                                             When to use?

• Metasearch engines need to be used cautiously.
• Good for simple searches, particularly if the search terms are distinctive or unique.
• Good for testing a few keywords to find out which individual search engine returns good results.
• Good for "quick and dirty" searching, when you are in a hurry and want to find a few relevant sites quickly.
• For complex searches involving many search terms, Boolean logic, etc., it is better to use individual search engines.


METASEARCH ENGINE TECHNOLOGY
It includes:

• Techniques that identify which search engines are likely to contain useful results for a given query.
• Methods that determine which pages should be retrieved from the selected search engines and how the results from different search engines should be merged.

                                            Metasearch Software Component Architecture

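[Figure: Metasearch software component architecture. The User interacts with the User Interface. The other components are the Database Selector, Document Selector, Query Dispatcher, Result Extractors, and Result Merger, which sit between the User Interface and the individual Search Engines. The Document Selector and Result Merger together are labeled Collection Fusion. Arrows numbered 1 to 8 connect the components.]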




Metasearch Software Components

Database selector
It is responsible for sending each user query only to potentially useful search engines. Sending a query to engines that are unlikely to help causes wasteful network traffic.
Database selection approaches can be classified into three categories:
• Rough representative approaches
• Statistical representative approaches
• Learning-based approaches


• Rough representative approaches: The representative of a database contains only a few selected keywords or paragraphs, so it can provide only a very general description of the database's contents.
• Statistical representative approaches: Database representatives hold detailed statistical information about the document databases. These statistics allow a more accurate estimate of a database's usefulness for a given user query.
• Learning-based approaches: Knowledge about which databases are likely to return useful pages for which types of queries is learned from past retrieval experience.
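
As a rough illustration of the statistical representative idea, the sketch below keeps a per-engine document count and per-term document frequencies, then estimates usefulness as the summed fraction of documents containing each query term. The `Representative` layout, the scoring formula, and the engine names are all assumptions made for illustration; the slides do not specify a particular formula.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Representative:
    """Statistical summary of one engine's document database (hypothetical layout)."""
    name: str
    num_docs: int
    doc_freq: Dict[str, int]  # term -> number of documents containing the term


def usefulness(rep: Representative, query: str) -> float:
    """Assumed score: sum over query terms of the fraction of documents containing the term."""
    if rep.num_docs == 0:
        return 0.0
    terms = query.lower().split()
    return sum(rep.doc_freq.get(term, 0) / rep.num_docs for term in terms)


def select_databases(reps: List[Representative], query: str, k: int = 2) -> List[str]:
    """Return the names of the k best-scoring engines, dropping engines that score zero."""
    scored = [(usefulness(rep, query), rep.name) for rep in reps]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # best score first, ties by name
    return [name for _, name in scored[:k]]


# Made-up representatives for three hypothetical engines.
reps = [
    Representative("engine_a", 1000, {"python": 200, "search": 50}),
    Representative("engine_b", 500, {"cooking": 300}),
    Representative("engine_c", 2000, {"python": 100, "search": 900}),
]
selected = select_databases(reps, "python search")
```

Here `engine_b` is never contacted for the query "python search" because its representative contains neither term, which is the saving in network traffic that database selection aims for.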


Collection fusion
This step determines which Web pages should be retrieved from each selected search engine and how the Web pages retrieved from multiple search engines should be merged into a single result list.
It includes two modules:
• Document selection module (document selector)
• Result merge module (result merger)



Document selector: It determines which pages to retrieve from the document database of each search engine. The goal is to retrieve as many potentially useful pages, and as few useless pages, as possible.
Result merger: It combines the results into a single ranked list, ordering all returned pages by descending desirability.
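
One possible way to merge results is sketched below. It assumes each engine returns `(url, local_score)` pairs, rescales the local scores of each engine to the range 0 to 1 so they are comparable, optionally weights them by a per-engine weight, keeps the best score for duplicate URLs, and sorts in descending order. The normalization scheme, the weights, and the function name `merge_results` are illustrative assumptions, not a method stated in the slides.

```python
from typing import Dict, List, Optional, Tuple


def merge_results(
    results_by_engine: Dict[str, List[Tuple[str, float]]],
    engine_weight: Optional[Dict[str, float]] = None,
) -> List[Tuple[str, float]]:
    """Merge per-engine (url, local_score) lists into one list ranked by descending score."""
    engine_weight = engine_weight or {}
    merged: Dict[str, float] = {}
    for engine, results in results_by_engine.items():
        if not results:
            continue
        scores = [score for _, score in results]
        lowest, highest = min(scores), max(scores)
        span = highest - lowest
        weight = engine_weight.get(engine, 1.0)  # engines without a weight count fully
        for url, score in results:
            # Min-max normalization; if all scores are equal, treat them as 1.0.
            normalized = (score - lowest) / span if span else 1.0
            # A page returned by several engines keeps its best weighted score.
            merged[url] = max(merged.get(url, 0.0), weight * normalized)
    return sorted(merged.items(), key=lambda item: (-item[1], item[0]))


# Made-up results from two hypothetical engines with different score scales.
results = {
    "engine_a": [
        ("http://example.com/1", 10.0),
        ("http://example.com/2", 5.0),
        ("http://example.com/3", 0.0),
    ],
    "engine_b": [
        ("http://example.com/2", 0.9),
        ("http://example.com/4", 0.3),
    ],
}
ranked = merge_results(results, engine_weight={"engine_b": 0.5})
ranked_urls = [url for url, _ in ranked]
```

Normalizing before merging matters because local scores from different engines are usually not on the same scale, so comparing them directly would favor whichever engine happens to report larger numbers.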



Result extractor: It extracts the URLs of the retrieved pages from the HTML of each result page. Because different search engines organize their results differently, a separate result extractor must be created for each local search engine.
Query dispatcher: It establishes a connection with the server of each search engine and passes the query to it. HTTP is used for the connection and data transfer.
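
The two components above can be sketched with the Python standard library. The dispatcher part only builds the HTTP GET URL for a query (no request is sent here), and the extractor pulls result URLs out of a result page. The base URL, the `q` parameter name, and the convention that result links carry `class="result"` are hypothetical; a real engine would need its own values, which is why each engine needs its own extractor.

```python
from html.parser import HTMLParser
from typing import List
from urllib.parse import urlencode


def build_query_url(base_url: str, param: str, query: str) -> str:
    """Build the GET URL a dispatcher would request for one engine."""
    return f"{base_url}?{urlencode({param: query})}"


class LinkExtractor(HTMLParser):
    """Collect href values of <a> tags that carry a given CSS class (assumed page layout)."""

    def __init__(self, result_class: str) -> None:
        super().__init__()
        self.result_class = result_class
        self.urls: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        classes = (attributes.get("class") or "").split()
        if self.result_class in classes and attributes.get("href"):
            self.urls.append(attributes["href"])


def extract_urls(html: str, result_class: str) -> List[str]:
    """Return result URLs in page order for one engine's result page."""
    parser = LinkExtractor(result_class)
    parser.feed(html)
    return parser.urls


# Hypothetical engine endpoint and a made-up result page.
url = build_query_url("http://search.example.com/search", "q", "meta search")
page = (
    '<ul>'
    '<li><a class="result" href="http://example.com/a">A</a></li>'
    '<li><a class="ad" href="http://example.com/ad">Ad</a></li>'
    '<li><a class="result" href="http://example.com/b">B</a></li>'
    '</ul>'
)
urls = extract_urls(page, "result")
```

In a full system, one `result_class` value (or a completely different parsing rule) would be configured per search engine, and the dispatcher would fetch `url` over HTTP before handing the response body to the extractor.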

                                            Advantages & Disadvantages

Advantages
• A query can be run across multiple search engines.
• The user needs to learn only the search interface of the metasearch tool.
• Better results: it retrieves the top-ranking pages from individual search engines.
Disadvantages
• Unique features of individual search engines are lost.
• Not exhaustive: it uses only the top results returned by the search engines.

EXAMPLES OF META SEARCH ENGINES
• All-in-One Search Page (http://www.allonesearch.com/)
• Dogpile (http://www.dogpile.com/)
• Find.com (http://www.find.com/)
• MetaCrawler (http://www.metacrawler.com/)
• SavvySearch (http://www.savvysearch.com/)
• WebRing (http://www.webring.org/)





                                                            CONCLUSION

• Metasearch engines provide a common interface, conduct searches in many search engines simultaneously, and return the results in a uniform format.
• They do not gather web pages, build indexes, accept URL additions, or classify or review web sites.




                                      THANK YOU !!


