									National Conference on Role of Cloud Computing Environment in Green Communication 2012                       641




                  EFFICIENT SEARCH ENGINE BASED ON
                          IMAGE AND VIDEO
                                           Vijayan.A#1 and Tomar.D.C#2
                                 Department of Computer Science And Engineering,
                                  MNM Jain Engineering College, Chennai, India

                                  #1 vijayan.selva@gmail.com, #2 dctomar@gmail.com

             Abstract – A variety of emerging online data delivery applications challenge existing
      techniques for data delivery to human users, applications, or middleware that access data
      from multiple autonomous servers. A framework is presented for formalizing and comparing
      pull-based solutions, together with dual optimization approaches. The first approach, the most
      commonly used today, maximizes user utility under the strict setting of meeting a priori
      constraints on the usage of system resources. An alternative and more flexible approach is
      also presented, which maximizes user utility by satisfying all users while minimizing the
      usage of system resources, so that user satisfaction in data delivery remains high. Hence, an
      efficient mining technique ensures a quick response for users by providing targeted data
      based on user profiles. In a typical search engine, information is displayed based on the
      keyword that is typed in; existing search engines produce results based only on text input.
      The proposed system provides an efficient search engine that searches for the requested
      content based on an image or video supplied to the search engine, rather than on text that is
      typed in.
      1 INTRODUCTION
             The ways people search for information online are constantly evolving. One of the most
      interesting (comparatively) new methods currently being developed by companies around the
      web revolves around the idea of using images as the basis for search queries. Of course, there
      is already a plethora of colour-based image search engines and other fun search tools, but
      most of them have one thing in common: the keyword comes first, and words are used to search.
      By using images as the basis for queries, the new advanced image search tools coming out of
      development are able to provide a totally different type of experience.

            As the technology is still in its infancy, various developers are working on specific
      kinds of practical applications for this type of search. Search technology has developed
      rapidly, and the main search engines are already able to meet users' basic search needs.
      However, current search algorithms and methodologies mostly depend on a keyword-matching
      process, which is effective for text search but not efficient for scenarios that lack
      keywords or involve non-text content.

            This paper summarizes the solutions adopted by current search engine vendors and
      introduces a new approach that takes an image or video as search input and displays the
      relevant output. It explores several directions of enhancement that integrate the visual
      information with other information related to the images in the analysis and query processes.
      The demonstration of these methods shows improved image search functionality over
      non-integrated content-based methods. Image database indexing is used to improve the
      efficiency of image retrieval in response to a query expressed as an example image. The query
      image is processed to extract information that is matched against similar information from
      images stored in a database to retrieve a set of matched images.

 Department of CSE, Sun College of Engineering and Technology

             The matching process is carried out by our search engine. Modern image search engines
      such as Google, Yahoo!, and Microsoft Live image search are all based on text meta-words. To
      search for images, users type in a text query, and the search engines rank the result images
      almost solely on the text meta-words. The abundant visual information in the images
      themselves is largely neglected. The aim of this work is a search engine that takes an image
      as input and produces results from it.

      2 PROBLEM DEFINITIONS
              In general, standard Web search engines are far from ideal. Although search technology
      has developed rapidly and the main search engines are already able to meet users' basic
      search needs, current search algorithms and methodologies mostly depend on a keyword-matching
      process, which is effective for text search but not efficient for scenarios that lack
      keywords or involve non-text content. This paper summarizes the solutions adopted by current
      search engine vendors and introduces a new approach that takes an image or video as search
      input and displays the relevant output.

      3 EXISTING SYSTEM
            In any search engine, information is displayed based on the keyword that is typed in.
      In existing search, the results are produced based only on the text input. The algorithms
      used in the existing system are Satisfy User Profiles (SUP) and the delimiter approach. SUP
      is an adaptive monitoring solution. Through formal analysis, sufficient optimality conditions
      for SUP are identified. The main intuition behind the SUP algorithm is to provide a higher
      level of user satisfaction, in terms of functionality and continuous operation.

            In order to achieve high user satisfaction, the user's needs should be identified. Our
      experiments show that SUP can achieve a high degree of satisfaction of user utility when its
      estimations closely match the real event stream, and that it has the potential to save a
      significant amount of system resources. The user provides the input query string; the system
      converts it into individual words separated by a delimiter and searches our repository in the
      following ways:
         » A query string word is contained in the Meta content of the page.
         » A query string word is contained in the source code of the page.

      3.1 Demerits

            The exact search term for the data being sought is often not known or not available,
      and the existing system offers no alternative way to search for that data or content.

      4 PROPOSED SYSTEM
           The proposed system provides an efficient search engine that searches for the requested
      content based on an image or video rather than on text typed into the search engine. To
      achieve this, binary search and a content-based algorithm are used. A binary search, or
      half-interval search, algorithm finds the position of a specified value (the input "key")
      within a sorted array. At each stage, the algorithm compares the input key value with the key
      value of the middle element of the array. If the keys match, then a matching element has been
      found, so its index, or position, is returned.

            Otherwise, if the sought key is less than the middle element's key, then the algorithm
      repeats its action on the sub-array to the left of the middle element or, if the input key is greater,
      on the sub-array to the right. If the remaining array to be searched is reduced to zero, then the
      key cannot be found in the array and a special "Not found" indication is returned.

            "Content-based" means that the search will analyse the actual contents of the image rather
      than the metadata such as keywords, tags, and/or descriptions associated with the image. The
      term 'content' in this context might refer to colours, shapes, textures, or any other information
      that can be derived from the image itself.

      4.1 Merits
            » User satisfaction.
            » Quick access to the intended data.

      5 PROJECT GOALS
             The main goal of the project is to search for content using a video or image supplied
      by the user. The system is designed for high scalability. It must be efficient in both space
      and time, and constant factors are very important when dealing with the entire Web. The
      primary goal is to provide high-quality search results over a rapidly growing World Wide Web.
      This innovative idea extends user satisfaction by bringing the intended content to the first
      page of search results or giving it a high page ranking.

      6 SYSTEM ARCHITECTURE
            The process begins with repository creation, that is, uploading the HTML pages. At the
      time of upload, the user provides the Meta content of the file; the system captures it and
      stores it in the database. The body content of the HTML page is also stored in the table.
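
      The repository creation step can be sketched as follows. This is a minimal illustration
      using Python's built-in sqlite3 module; the paper does not name a database, and the table
      name `pages`, its columns, and the helper functions are assumptions made for the example.

```python
import sqlite3


def create_repository(conn):
    """Create the (hypothetical) table that holds uploaded HTML pages."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pages ("
        " id INTEGER PRIMARY KEY,"
        " filename TEXT NOT NULL,"
        " meta_content TEXT NOT NULL,"
        " body_content TEXT NOT NULL)"
    )


def upload_page(conn, filename, meta_content, html):
    """Store the user-supplied Meta content and the page body; return the row id."""
    cursor = conn.execute(
        "INSERT INTO pages (filename, meta_content, body_content) VALUES (?, ?, ?)",
        (filename, meta_content, html),
    )
    conn.commit()
    return cursor.lastrowid


# An in-memory database keeps the sketch self-contained.
conn = sqlite3.connect(":memory:")
create_repository(conn)
page_id = upload_page(
    conn, "tiger.html", "tiger, wildlife", "<html><body>Bengal tiger</body></html>"
)
stored = conn.execute(
    "SELECT filename, meta_content FROM pages WHERE id = ?", (page_id,)
).fetchone()
```

      Keeping the Meta content and the body in separate columns mirrors the two search paths
      described later: one over Meta content and one over the page source.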








                                        Fig 6.1 System Architecture
              The above diagram shows that the user enters a keyword or uploads an image or video
      in the search engine. After the search button is clicked, the search engine identifies the
      type of input. If the input is text, it searches the Meta content or the document in the
      repository using the delimiter approach and produces the relevant results for display.

            If the input is an image, the engine converts the image to a byte array, searches the
      repository, and displays the results that match the user's request. If the input is a video,
      it converts a video frame image to a byte array, searches the repository, and displays the
      results in the same way. If the user input does not match anything in the repository, it
      displays “Data not available”. Based on the SUP algorithm and the delimiter approach, the
      results are intended to be accurate for the content the user requested.
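
      The input-type dispatch described above might look like the sketch below. It is only an
      illustration under stated assumptions: text queries are Python strings, an image is a bytes
      object, a video is a list of frame byte strings, and the two dictionaries standing in for
      the repository (`text_index`, `image_index`) are hypothetical.

```python
NOT_FOUND = "Data not available"


def search(query, text_index, image_index):
    """Route a query to the text, image, or video path based on its type."""
    if isinstance(query, str):
        # Text: look up each word of the query.
        words = query.lower().split()
        hits = [page for word in words for page in text_index.get(word, [])]
    elif isinstance(query, (bytes, bytearray)):
        # Image: the byte array itself is the lookup key.
        hits = list(image_index.get(bytes(query), []))
    elif isinstance(query, list):
        # Video: every frame is treated as an image.
        hits = [page for frame in query for page in image_index.get(bytes(frame), [])]
    else:
        raise TypeError("unsupported query type")
    unique = list(dict.fromkeys(hits))  # drop duplicates, keep first-seen order
    return unique if unique else NOT_FOUND


text_index = {"tiger": ["tiger.html"]}
image_index = {b"\x01\x02": ["tiger.html"]}

text_result = search("Tiger", text_index, image_index)
image_result = search(b"\x01\x02", text_index, image_index)
video_result = search([b"\x00", b"\x01\x02"], text_index, image_index)
missing_result = search("lion", text_index, image_index)
```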

      7 SUP ALGORITHM
            The main intuition behind the SUP algorithm is to provide a higher level of user
      satisfaction, in terms of functionality and continuous operation. In order to achieve high
      user satisfaction, the user's needs should be identified.
      8 DELIMITER APPROACH
            A delimiter is a sequence of one or more characters used to specify the boundary between
      separate, independent regions in plain text or other data streams. An example of a delimiter is
      the comma character, which acts as a field delimiter in a sequence of comma-separated values.
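
      As a small illustration of a field delimiter, the sketch below splits comma-separated values
      in Python. The sample strings are invented for the example; the csv module is used for the
      second case because a plain split would break a quoted field that itself contains a comma.

```python
import csv
import io

# Plain split: every comma is treated as a field boundary.
record = "tiger,wildlife,big cat"
fields = record.split(",")

# Quoted field containing a comma: the csv reader keeps it in one piece.
quoted_record = 'tiger,"Bengal, India",wildlife'
parsed = next(csv.reader(io.StringIO(quoted_record)))
```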

            The user provides the input query string; the system converts it into individual words
      separated by a delimiter and searches our repository in the following ways:
          1. A query string word is contained in the Meta content of the page.
          2. A query string word is contained in the source code of the page.

      Algorithm








               The delimiter approach describes the process of searching the content in the
      database using the entered text. First, the input is read and treated as the query string,
      and the user input is converted into a delimited string with separating commas. The files
      corresponding to the input string are then looked up in the repository. To find the length
      of the string, the iteration starts from 0, and each word of the input is stored in an array.
      The repository is searched using binary search in the Meta content, and the results are
      obtained. The repository is then searched using binary search in the documents, and those
      results are obtained. Finally, duplicate files are eliminated and the result is returned to
      the client with page ranking.
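
      The steps above can be sketched in Python as follows. This is an illustration rather than
      the authors' implementation: the sorted (word, filename) index, the helper names, and the
      ranking rule (number of word hits per file) are assumptions, and the standard bisect module
      supplies the binary search.

```python
from bisect import bisect_left
from collections import Counter


def build_index(pages):
    """Turn {filename: text} into a sorted list of (word, filename) pairs."""
    return sorted(
        {(word, name) for name, text in pages.items() for word in text.lower().split()}
    )


def lookup(index, word):
    """Binary-search the sorted index for every file that contains `word`."""
    i = bisect_left(index, (word, ""))  # first entry whose word is >= `word`
    files = []
    while i < len(index) and index[i][0] == word:
        files.append(index[i][1])
        i += 1
    return files


def delimiter_search(query, meta_index, doc_index):
    """Split the query on commas, search both indexes, dedupe, and rank."""
    delimited = ",".join(query.lower().split())  # "bengal tiger" -> "bengal,tiger"
    words = delimited.split(",")
    hits = Counter()
    for word in words:
        for name in lookup(meta_index, word) + lookup(doc_index, word):
            hits[name] += 1
    # Each file appears once as a Counter key, so duplicates are eliminated.
    return [name for name, _ in sorted(hits.items(), key=lambda kv: (-kv[1], kv[0]))]


meta_index = build_index({"tiger.html": "tiger wildlife", "zoo.html": "zoo animals"})
doc_index = build_index({"tiger.html": "the bengal tiger", "zoo.html": "a tiger at the zoo"})
ranked = delimiter_search("Bengal tiger", meta_index, doc_index)
```

      Here `tiger.html` is ranked first because it matches in both the Meta content and the
      document text, while `zoo.html` matches only once.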

      9 BINARY SEARCH ALGORITHM
            A binary search or half-interval search algorithm finds the position of a specified value (the
      input "key") within a sorted array. At each stage, the algorithm compares the input key value
      with the key value of the middle element of the array. If the keys match, then a matching element
      has been found so its index, or position, is returned. Otherwise, if the sought key is less than the
      middle element's key, then the algorithm repeats its action on the sub-array to the left of the
      middle element or, if the input key is greater, on the sub-array to the right. If the remaining array
      to be searched is reduced to zero, then the key cannot be found in the array and a special "Not
      found" indication is returned.

            A binary search halves the number of items to check with each iteration, so locating an
      item (or determining its absence) takes logarithmic time. A binary search is a dichotomic divide
      and conquer search algorithm. Searching a sorted collection is a common task. A dictionary is a
      sorted list of word definitions. Given a word, one can find its definition. A telephone book is a
      sorted list of people's names, addresses, and telephone numbers. Knowing someone's name
      allows one to quickly find their telephone number and address.

            If the list to be searched contains more than a few items (a dozen, say), a binary
      search will require far fewer comparisons than a linear search, but it imposes the
      requirement that the list be sorted. Similarly, a hash search can be faster than a binary
      search but imposes still greater requirements. If the contents of the array are modified
      between searches, maintaining these requirements may take more time than the searches
      themselves. And if it is known that some items will be searched for much more often than
      others, and it can be arranged that these items are at the start of the list, then a linear
      search may be the best choice.
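
      The difference in comparison counts can be checked with a short experiment. The sketch
      below is illustrative only: the `CountingKey` wrapper and the list size are assumptions, and
      the standard bisect module stands in for the binary search. For a sorted list of n items,
      the number of comparisons made by the binary search stays within floor(log2 n) + 1.

```python
import math
from bisect import bisect_left


class CountingKey:
    """Wraps a value and counts every `<` comparison made on it."""

    comparisons = 0

    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        CountingKey.comparisons += 1
        return self.value < other.value


def linear_comparisons(items, key):
    """Scan left to right and return how many equality tests were needed."""
    for count, item in enumerate(items, start=1):
        if item == key:
            return count
    return len(items)


n = 100_000
values = list(range(n))
wrapped = [CountingKey(v) for v in values]

# Search for the last element, the worst case for the linear scan.
CountingKey.comparisons = 0
position = bisect_left(wrapped, CountingKey(n - 1))
binary_count = CountingKey.comparisons
linear_count = linear_comparisons(values, n - 1)
bound = math.floor(math.log2(n)) + 1
```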






      Implementations
      Iterative
            The following incorrect algorithm is slightly modified (to avoid overflow) from Niklaus
      Wirth's in standard Pascal:




            This code uses inclusive bounds and a three-way test (for early loop termination in case of
      equality), but with two separate comparisons per iteration. It is not the most efficient solution.
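
      The original listing is not reproduced in this copy. As a stand-in, the following Python
      sketch shows an iterative binary search with the properties described: inclusive bounds, a
      three-way test with two comparisons per iteration, and a midpoint computed as
      low + (high - low) // 2. It is not Wirth's Pascal code; the function name and the -1
      "not found" value are assumptions. Python integers do not overflow, so the midpoint form is
      kept only to mirror the description.

```python
def binary_search_iterative(items, key):
    """Return the index of `key` in the sorted list `items`, or -1 if absent."""
    low, high = 0, len(items) - 1  # inclusive bounds
    while low <= high:
        mid = low + (high - low) // 2  # overflow-safe form of (low + high) // 2
        if items[mid] < key:
            low = mid + 1  # key can only be in the right sub-array
        elif items[mid] > key:
            high = mid - 1  # key can only be in the left sub-array
        else:
            return mid  # early exit on equality
    return -1  # the remaining range is empty: not found


found_at = binary_search_iterative([1, 3, 5, 7, 9], 7)
missing = binary_search_iterative([1, 3, 5, 7, 9], 4)
```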

      Recursive
            A simple, straightforward implementation is tail recursive; it recursively searches
      the subrange dictated by the comparison. It is invoked with initial low and high values of 0
      and N-1.
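
      The original listing is not reproduced in this copy. A minimal Python sketch of the
      recursive form is given below; the function name and the -1 "not found" value are
      assumptions. Python does not eliminate tail calls, but the recursion depth grows only
      logarithmically with the array size.

```python
def binary_search_recursive(items, key, low, high):
    """Search items[low..high] (inclusive) for `key`; return its index or -1."""
    if low > high:
        return -1  # empty range: not found
    mid = low + (high - low) // 2
    if items[mid] > key:
        return binary_search_recursive(items, key, low, mid - 1)  # left half
    if items[mid] < key:
        return binary_search_recursive(items, key, mid + 1, high)  # right half
    return mid


data = [2, 4, 6, 8, 10, 12]
index_of_ten = binary_search_recursive(data, 10, 0, len(data) - 1)
index_of_five = binary_search_recursive(data, 5, 0, len(data) - 1)
```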




      10 CONTENT BASED ALGORITHM
            Content-based image retrieval (CBIR), also known as query by image content (QBIC)
      and content-based visual information retrieval (CBVIR), is the application of computer
      vision techniques to the image retrieval problem, that is, the problem of searching for
      digital images in large databases. Content-based image retrieval stands in contrast to
      concept-based approaches.





            "Content-based" means that the search will analyse the actual contents of the image rather
      than the metadata such as keywords, tags, and/or descriptions associated with the image. The
      term 'content' in this context might refer to colors, shapes, textures, or any other information that
      can be derived from the image itself. CBIR is desirable because most web-based image search
      engines rely purely on metadata, and this produces a lot of garbage in the results. Also,
      having humans manually enter keywords for images in a large database can be inefficient and
      expensive, and may not capture every keyword that describes the image. Thus, a system that
      can filter images based on their content would provide better indexing and return more
      accurate results.

      Content Comparison using Image Distance Measures

            The most common method for comparing two images in content based image retrieval
      (typically an example image and an image from the database) is using an image distance
      measure. An image distance measure compares the similarity of two images in various
      dimensions such as color, texture, shape, and others.

             A distance of 0 signifies an exact match with the query with respect to the dimensions
      that were considered. As one may intuitively gather, a value greater than 0 indicates various
      degrees of similarity between the images. Search results can then be sorted based on their
      distance to the query image. Many distance measures have been proposed for this purpose. The
      user's image is converted into a byte array, and the repository is searched for byte array
      content that matches the input.
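
      One possible image distance measure is sketched below. The paper does not specify which
      measure it uses, so the grey-level histogram, the L1 distance, the bin count, and the sample
      pixel data are all assumptions chosen for illustration. Images are given as non-empty
      sequences of intensities in the range 0 to 255.

```python
def gray_histogram(pixels, bins=4):
    """Return the normalized histogram of 0..255 intensities."""
    counts = [0] * bins
    total = 0
    for p in pixels:
        counts[p * bins // 256] += 1
        total += 1
    return [c / total for c in counts]


def histogram_distance(h1, h2):
    """L1 distance between two normalized histograms (0 means identical)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def rank_by_distance(query_pixels, database):
    """Return (distance, name) pairs sorted from most to least similar."""
    query_hist = gray_histogram(query_pixels)
    scored = [
        (histogram_distance(query_hist, gray_histogram(pixels)), name)
        for name, pixels in database.items()
    ]
    return sorted(scored)


query = [0, 0, 255, 255]
database = {
    "same": [255, 0, 255, 0],   # same intensities in a different order
    "mixed": [0, 0, 0, 255],
    "dark": [0, 0, 0, 0],
}
ranking = rank_by_distance(query, database)
```

      The image named `same` gets distance 0 even though its pixels are ordered differently,
      which shows that an exact match holds only with respect to the dimension being compared
      (here, the intensity distribution).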




              For image search, content-based search is used to find content related to the user's
      query. The algorithm identifies the image height and width and stores them in two variables,
      m and n, respectively, and then reads the alpha, beta, gamma, and intensity parameters as
      bytes. It then stores these byte values in an array. This array is associated with the
      image, and the related content is displayed.

      Content Based Image Byte Conversion
         1) The image is converted into a byte array.
         2) The images that exactly match the input image, and the images that are relevant to it,
            are fetched.
         3) The text information related to these images is displayed as well.
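
      The three steps above can be sketched as follows. This is an illustration under stated
      assumptions, not the authors' code: an image is a list of m rows of n pixels, each pixel is
      a tuple of four byte-sized parameters, relevance is measured as the fraction of identical
      bytes, and the 0.75 threshold is hypothetical.

```python
def image_to_byte_array(rows):
    """Flatten an m x n image of 4-parameter pixels into (m, n, bytes)."""
    m = len(rows)                    # image height
    n = len(rows[0]) if rows else 0  # image width
    data = bytearray()
    for row in rows:
        if len(row) != n:
            raise ValueError("all rows must have the same width")
        for pixel in row:
            data.extend(pixel)       # four values in 0..255 per pixel
    return m, n, bytes(data)


def byte_similarity(a, b):
    """Fraction of positions holding the same byte; 0.0 if the sizes differ."""
    if len(a) != len(b) or not a:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def find_matches(query_rows, repository, threshold=0.75):
    """Return (exact, relevant) lists of (name, text) for the query image."""
    _, _, query_bytes = image_to_byte_array(query_rows)
    exact, relevant = [], []
    for name, (stored_bytes, text) in repository.items():
        score = byte_similarity(query_bytes, stored_bytes)
        if score == 1.0:
            exact.append((name, text))
        elif score >= threshold:
            relevant.append((name, text))
    return exact, relevant


query = [[(255, 10, 20, 30), (255, 40, 50, 60)]]
near = [[(255, 10, 20, 30), (255, 40, 50, 61)]]
far = [[(0, 0, 0, 0), (0, 0, 0, 0)]]
repository = {
    "a.png": (image_to_byte_array(query)[2], "tiger"),
    "b.png": (image_to_byte_array(near)[2], "tiger cub"),
    "c.png": (image_to_byte_array(far)[2], "sky"),
}
exact, relevant = find_matches(query, repository)
```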





      11 CONCLUSION
             The primary goal is to provide high-quality search results over a rapidly growing
      World Wide Web. The construction of a search engine based on text content typed into the
      search engine has been successfully completed in this phase. The proposed system provides an
      efficient search engine that searches for the requested content based on the image or video
      that is entered in the search engine, rather than on text that is typed in. This innovative
      idea extends user satisfaction by bringing the intended content to the first page of search
      results or giving it a high page ranking.



