


                            How Search Engines Work
                            White Paper
Tel +44 (0) 208 1234 164
 Tel +66 (0) 898 740 040

                            The Basics
                            The objective of any search engine is to provide the user with the most
                            appropriate list of web sites based on the words a user types into the
                            search engine search box – in other words the most relevant answers.

                            Description                      Search Engine Actions

                            Search engine crawls the Web     Search engines run automated programs, called
                            (Finds the page)                 "bots" or "spiders", that use the hyperlink
                                                             structure of the web to "crawl" the pages and
                                                             documents that make up the World Wide Web.

                            Webpage is indexed               Once a page has been crawled, its contents are
                            (Stored in database)             "indexed": the data is stored in a giant
                                                             database (the search engine "index"). The index
                                                             is the basis for the list of websites produced
                                                             when a user searches online.

                            User enters search query         The search engine refers to its index. This
                            (Asks for info)                  index enables the searching and sorting of
                                                             billions of documents to deliver the most
                                                             relevant sites very quickly.

                            Search engine processes query    When a request for information comes into the
                            (Extracts pages from database)   search engine, the engine retrieves from its
                                                             index all the pages that match the query.

                            Site ranking                     Once the search engine has determined which
                            (Orders the results)             results are a match, the engine's algorithm (a
                                                             mathematical sorting equation) runs calculations
                                                             on each of the results to identify which is most
                                                             relevant to the given query. It then orders
                                                             these on the results pages from most to least
                                                             relevant.

                            This all sounds simple, but the volume of users and the amount of
                            information to index mean the search engines need to process millions of
                            calculations every second.
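
                            The index and query steps in the table above can be sketched as a small
                            inverted index. This is only an illustration under simple assumptions
                            (whitespace tokenisation, exact word matches); the page URLs and the
                            function names build_index and search are invented for the example and
                            do not describe any particular search engine.

```python
from collections import defaultdict


def build_index(pages):
    """Map each lowercase word to the set of page URLs that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, query):
    """Return the pages that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical pages, invented for illustration only.
pages = {
    "example.com/tea": "green tea brewing guide",
    "example.com/coffee": "coffee brewing guide",
    "example.com/cake": "chocolate cake recipe",
}
index = build_index(pages)
print(sorted(search(index, "brewing guide")))
# prints ['example.com/coffee', 'example.com/tea']
```

                            Because the index is built once and only looked up at query time, the
                            expensive work (reading every page) happens before the user searches.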


                            Information Retrieval
                            Modern search engines rely on the scientific process of information retrieval.

                            There are two critical elements that power this process:

                            Relevance (Content)
                            - the degree to which the content of the webpage matches the user's query.

                            Search engines look at whether the search terms are found in important
                            areas of the webpage, namely:

                            - Content
                            - Keywords
                            - Metatags
                            - Links
                            - Consistency in all the above
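
                            As a rough illustration of checking several areas of a page, the sketch
                            below adds up a weight for each area in which a query term appears. The
                            weights, the field names and the function relevance_score are
                            assumptions made for this example; the actual weighting used by search
                            engines is not stated here.

```python
# Hypothetical weights: how much each page area counts is an assumption.
FIELD_WEIGHTS = {
    "content": 1.0,
    "keywords": 2.0,
    "metatags": 2.0,
    "links": 1.5,
}


def relevance_score(page, query):
    """Sum the weight of every page area that contains each query term."""
    score = 0.0
    for term in query.lower().split():
        for field, weight in FIELD_WEIGHTS.items():
            if term in page.get(field, "").lower().split():
                score += weight
    return score


# Hypothetical page, invented for illustration only.
page = {
    "content": "our guide to brewing green tea",
    "keywords": "tea brewing",
    "metatags": "green tea guide",
    "links": "tea shop",
}
print(relevance_score(page, "green tea"))
```

                            A term that appears consistently across all areas scores higher than one
                            that appears in the body text alone.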

                            The engines collect data based on the frequency of use of terms and the
                            co-occurrence of words and phrases throughout the web. If certain terms or
                            phrases are often found together on pages or sites, search engines can
                            construct intelligent theories about their relationships.
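
                            Counting co-occurrence can be sketched in a few lines. The example
                            below counts how many documents each pair of words shares; the sample
                            documents and the function name cooccurrence_counts are invented for
                            illustration, and real systems work at a far larger scale with more
                            careful text processing.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents):
    """Count the documents in which each unordered pair of words appears together."""
    counts = Counter()
    for text in documents:
        # Sort the unique words so each pair is always stored in the same order.
        words = sorted(set(text.lower().split()))
        counts.update(combinations(words, 2))
    return counts


# Hypothetical documents, invented for illustration only.
documents = [
    "cheap flights to bangkok",
    "bangkok hotels and flights",
    "cheap hotels in london",
]
counts = cooccurrence_counts(documents)
print(counts[("bangkok", "flights")])  # prints 2
print(counts[("bangkok", "london")])   # prints 0
```

                            Pairs with high counts, such as "bangkok" and "flights" here, are the
                            kind of evidence from which a relationship between terms can be
                            inferred.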

                            Mining semantic data (the science of language) has given search engines
                            some of the most accurate data about word combinations ever assembled
                            artificially. This immense knowledge of language and its usage gives them
                            the ability to determine which pages in a site are topically related, what the
                            topic of a page or site is, how the link structure of the web divides into
                            topical communities and much, much more.

                            Popularity (Links)
                            - the relative importance of a given webpage that matches the user's query,
                            measured via the number and quality of the recommendations (links) it receives.

                            The popularity of a webpage increases with every document that references it.

                            - Who is linking
                            - Is the link relevant
                            - Do the sites relate and does their content relate
                            - How important are the sites that are linking
                            - Trust value of each site
                            - Historical link data
                            - Site registration (Date of registration)

                            Link metrics are in place so that search engines can find information to trust.
                            In the academic world, greater citation meant greater importance, but in a
                            commercial environment, manipulation and conflicting interests interfere with
                            the purity of citation-based measurements, making validation of the source and
                            the context vital for ensuring quality results.
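
                            One well-known way to turn "number and quality of recommendations" into
                            a score is a PageRank-style calculation, in which a link from an
                            important page counts for more than a link from an obscure one. The
                            sketch below is a simplified version of that idea, not a description of
                            any engine's actual method; the damping value, the link graph and the
                            function name link_popularity are assumptions for illustration, and the
                            trust and history factors listed above are not modelled.

```python
def link_popularity(links, damping=0.85, iterations=50):
    """Simplified PageRank-style scores for a dict of page -> outbound links."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    scores = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_scores = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its score on, split evenly across its links.
                share = damping * scores[page] / len(targets)
                for target in targets:
                    new_scores[target] += share
            else:
                # A page with no outbound links spreads its score evenly.
                for other in pages:
                    new_scores[other] += damping * scores[page] / n
        scores = new_scores
    return scores


# Hypothetical link graph: "c" is referenced by three other pages.
links = {
    "a": ["c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}
scores = link_popularity(links)
for page in sorted(scores, key=scores.get, reverse=True):
    print(page, round(scores[page], 3))
```

                            In this graph "c" scores highest because three pages reference it, and
                            "a" comes second because its single link comes from the most popular
                            page.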


                             The algorithm then determines scoring for the documents and ranks the
                             results in decreasing order by using the relevance and popularity of the
                             site combined with hundreds of other factors that can be individually
                             measured and filtered through the search engine algorithms.
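
                             The ordering step can be pictured as sorting by a combined score. The
                             sketch below uses only two factors and a weighted sum; the weights, the
                             sample scores and the function name rank_results are assumptions for
                             illustration, whereas a real algorithm combines hundreds of factors in
                             ways that are not described here.

```python
def rank_results(candidates, relevance_weight=0.7, popularity_weight=0.3):
    """Sort (url, relevance, popularity) tuples by a weighted sum, best first."""
    def combined_score(item):
        _, relevance, popularity = item
        return relevance_weight * relevance + popularity_weight * popularity

    ordered = sorted(candidates, key=combined_score, reverse=True)
    return [url for url, _, _ in ordered]


# Hypothetical scores on a 0-to-1 scale, invented for illustration only.
candidates = [
    ("example.com/a", 0.9, 0.2),
    ("example.com/b", 0.7, 0.9),
    ("example.com/c", 0.3, 0.4),
]
print(rank_results(candidates))
# prints ['example.com/b', 'example.com/a', 'example.com/c']
```

                             Page "a" is the closest match on content alone, but "b" ranks first
                             once its much stronger popularity is taken into account.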

                             What Hinders a Search Engine?
                             Some navigation slows down or can even completely stop search engines
                             from reaching a website's content.

                             As search engine spiders crawl the web, they rely on the architecture of
                             hyperlinks to find new documents and revisit those that may have changed.
                             If a website has a complicated structure with little unique content and
                             complex links, a search engine will be slowed down. If there are no
                             links, a search engine simply stops and moves to a different website.

                             The key to ensuring that a site's content can be crawled is to provide direct
                             HTML links to each page. If a page cannot be reached through links from the
                             home page, it is unlikely to be indexed by the search engines.
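
                             Why an unlinked page goes unnoticed can be shown with a toy crawler
                             that only follows links. The sketch below walks an in-memory link map
                             rather than fetching real pages; the site structure and the function
                             name crawl are invented for illustration.

```python
from collections import deque


def crawl(site_links, start):
    """Breadth-first walk over links; returns pages in the order discovered."""
    discovered = [start]
    visited = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in site_links.get(page, []):
            if target not in visited:
                visited.add(target)
                discovered.append(target)
                queue.append(target)
    return discovered


# Hypothetical site: no page links to "/orphan".
site_links = {
    "/": ["/products", "/about"],
    "/products": ["/products/widget"],
    "/about": ["/"],
    "/orphan": ["/"],
}
found = crawl(site_links, "/")
print(found)
# prints ['/', '/products', '/about', '/products/widget']
print(set(site_links) - set(found))
# prints {'/orphan'}
```

                             The "/orphan" page exists on the site, but because nothing links to it
                             the crawler never finds it.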


                              About i-Marketing
                              i-Marketing is a marketing agency assisting companies in developing online business. Based in Asia, we bring
                              together professional, qualified European marketers and strong technical skills which when linked to the low Asian
                              cost base provide exceptional value. Using technology as an enabler and creatives designed to appeal to your target
                              audience we optimize websites and develop marketing campaigns to make your products and services more visible.
                              Our core markets are Europe and Asia Pacific. Our customers benefit from our internet marketing skills, value for
                              money from access to the low cost base of Asia and our knowledge of successful marketing practices across Europe
                              and Asia.

                              The content provided in this article is not warranted or guaranteed by i-Marketing Co. Ltd. The content provided is
                              intended for educational purposes in order to introduce to the reader key ideas, concepts, and/or product reviews. As
                              such it is incumbent upon the reader to employ real-world tactics for security and implementation of best practices. We
                              are not liable for any negative consequences that may result from implementing any information covered in our

