An Integrated Framework for Content Based Image Retrieval

					                                                              (IJCSIS) International Journal of Computer Science and Information Security,
                                                                                                                        Vol. 9, No. 6, 2011

   An Integrated Framework for Content Based Image Retrieval

                     Ritika Hirwane                                                           Prof. Nishchol Mishra
                    SOIT, RGPV, Bhopal                                                         SOIT, RGPV, Bhopal

Abstract—Content-based image retrieval (CBIR) is an important research area for handling large numbers of images in databases. Extraction of invariant features is the basis of CBIR. Color, texture, shape and spatial information have been important image descriptors in content-based image retrieval systems. This paper presents a framework that combines three features, i.e. color, texture and shape, and achieves higher retrieval efficiency by using the Dempster-Shafer theory of evidence (DST). The main aim of evidence theory is to represent and handle uncertain information. An important property of this theory is its ability to merge different data sources in order to improve the quality of information retrieval. When color, shape and texture analysis are integrated by Dempster-Shafer evidence theory, the retrieval accuracy is much higher than when the techniques are used separately.

    Keywords-Belief function, Dempster-Shafer theory, Evidence theory, Feature extraction, Probabilities

                       I.    INTRODUCTION
    The increasing number of digitally produced images requires new methods for accessing data. Content-based image retrieval is a technique that uses visual contents, called features of images, to search large-scale image databases according to users' requests in the form of a query image [2]. Basically, CBIR includes the following four parts in system comprehension: data collection, database feature extraction, searching the database and arranging the results in sorted order, and lastly dealing with the results of the retrieval.

    Most early image retrieval systems were based on text descriptions or on visual features of images such as color, texture and shape as indices for retrieval [4]. Basically, text-based and content-based retrieval are the two techniques for search and retrieval from an image database. Multi-feature approaches have improved retrieval efficiency, but the results are still not fully acceptable because different features tend to have different degrees of significance for different types of queries. To overcome this problem, the use of domain knowledge has been considered to guide the combination of suitable features, whereby a knowledge-based query is specified by the user.

    In this paper, image retrieval methods based on color, shape and texture analysis are investigated. We have designed a prototype model for feature extraction. Section II shows the method for color feature extraction by color histogram, and Section III contains the method for texture feature extraction by the energy level algorithm. Section IV describes the method for shape feature extraction by edge detection, followed in Section V by the integration method, Dempster-Shafer evidence theory, for combining the three features of an image [3]. In Section VI we calculate the similarity value between two images, evaluate the performance and compare the characteristics of each image retrieval approach. Finally, conclusions are given in Section VII. The main problem associated with content-based image retrieval is how to correctly extract the visual properties and match them against the database. The solution proposed here is to extract primitive features of a query image and compare them with those of the database images, as shown in Figure 1.

                       II.     COLOR FEATURE EXTRACTION
    One of the most important features of an image that makes possible the recognition of images by humans is color. Color is a property that depends on the reflection of light to the eye and the processing of that information in the brain. One of the most popular methods for color feature extraction is the color histogram [11]. The histogram of an image is a graphical representation, similar in structure to a bar chart, that organizes a group of data points into user-specified ranges. The histogram condenses a data series into an easily interpreted visual by taking many data points and grouping them into logical ranges or bins. The method used in this paper for extracting the color features of images is the global color histogram [10].
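The global color histogram of Section II can be illustrated with a minimal sketch. It assumes an image is given as a plain list of (r, g, b) tuples with channel values in 0..255. The function names, the choice of 4 bins per channel and the toy images are hypothetical and are not taken from the paper.

```python
import math


def color_histogram(pixels, bins_per_channel=4):
    """Normalized global RGB histogram with bins_per_channel**3 bins.

    pixels: list of (r, g, b) tuples, each channel in 0..255 (assumed format).
    """
    hist = [0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        # Quantize each channel into bins_per_channel equal ranges.
        rq = r * bins_per_channel // 256
        gq = g * bins_per_channel // 256
        bq = b * bins_per_channel // 256
        hist[(rq * bins_per_channel + gq) * bins_per_channel + bq] += 1
    total = len(pixels)
    return [count / total for count in hist] if total else hist


def euclidean_distance(h1, h2):
    """Euclidean distance between two histograms of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(h1, h2)))


def rank_by_histogram(query_pixels, database, bins_per_channel=4):
    """Return database image names sorted from most to least similar."""
    q = color_histogram(query_pixels, bins_per_channel)
    scored = [
        (euclidean_distance(q, color_histogram(p, bins_per_channel)), name)
        for name, p in database.items()
    ]
    return [name for _, name in sorted(scored)]


# Toy usage with three tiny single-color "images".
red = [(255, 0, 0)] * 4
dark_red = [(200, 10, 10)] * 4
blue = [(0, 0, 255)] * 4
ranking = rank_by_histogram(red, {"blue": blue, "dark_red": dark_red})
```

A global histogram ignores where colors occur in the image, which is why the framework combines it with texture and shape evidence.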

                                                                                                      ISSN 1947-5500
[Figure 1. Prototype Model For CBIR. The query image and the database images each yield color, texture and shape features; the color, texture and shape distance measures are passed to DST, followed by image display.]

                   III.        TEXTURE FEATURE EXTRACTION
    In the field of computer science and digital image processing there is no clear definition of texture, because it depends on the texture analysis methods and the features extracted from the image [12]. Yet texture can be considered as repeated patterns of pixels over a spatial domain, where the addition of noise to the patterns and their repetition frequencies can result in textures that appear random and unstructured.

    Many methods have been proposed for texture feature extraction. In this paper one of the popular methods, the energy spectrum method, is used for texture extraction. For texture classification the wavelet transform is used, through which the image is decomposed into four sub-images, called the high-high, high-low, low-high and low-low sub-bands [5]. Once the energy of each band has been calculated, a similarity distance matrix is computed by the Euclidean method, through which the images most similar to the query are found and arranged in sorted order.

                     IV.         SHAPE FEATURE EXTRACTION
    The shape feature of an image may be defined as the characteristic surface configuration of an object, such as its outline or contour. It permits an object to be distinguished from its surroundings by its outline [13]. Shape representations can generally be divided into two categories, boundary-based and region-based. A variety of methods have been proposed for boundary-based and region-based shape representation [8].

    The method used here for shape feature extraction is Canny edge detection [7], which first smooths the image and then finds the image gradient to highlight regions with high spatial derivatives. The gradient array is then further reduced by hysteresis, which is used to track along the remaining pixels that have not been suppressed. After shape extraction, the distance matrix is calculated.

                     V.    DEMPSTER SHAFER EVIDENCE THEORY
    Dempster-Shafer Theory is a mathematical theory of evidence [14]. For a discrete space of classes, DST can be defined in terms of a probability theory in which probabilities are assigned to sets. Traditionally, in probability theory, evidence is related to only one possible event, while in Dempster-Shafer Theory evidence can be related to multiple possible events or sets of events. Where the evidence is sufficient to assign probabilities to single events, DST reduces to the traditional probabilistic formulation. DST is based on three major functions: the Basic Belief Assignment function (BBA or m(.)), the Belief function (Bel), and the Plausibility function (Pl) [9]. The BBA behaves like a probability but defines a mapping of the power set to the interval between 0 and 1, where the BBA of the empty set is 0 and the sum of the BBAs of all the subsets in the power set is 1. To express the degree of confidence in each proposition A of 2^Θ, an elementary mass function m(A) is associated with it, which indicates all the confidence that one can have in this proposition. The quantity m(A) is interpreted as the belief strictly placed on A. This quantity differs from a probability in that the totality of the belief is distributed not only over the simple classes but also over the composed classes. The BBA is given by equation (1).

    $$ m : 2^{\Theta} \rightarrow [0, 1], \qquad m(\emptyset) = 0, \qquad \sum_{A \subseteq \Theta} m(A) = 1 \qquad (1) $$

    For any classification problem, a number of classes can be defined. Suppose a set Θ = {c1, c2, c3, …}, called the frame of discernment. The power set of Θ contains all possible subsets of Θ: P(Θ) = {{c1}, {c2}, {c3}, …, {c1, c2}, {c1, c3}, …, {c2, c3}, …, {c1, c2, c3}, …, {c1, c2, c3, …}}. We can now compute the total belief provided by the body of evidence for a proposition. The belief function, Bel(.), associated with the BBA m(.) is a function that assigns a value in [0, 1] to every nonempty subset A of Θ. It is called the "degree of belief in A" and is defined by equation (2).

    $$ \mathrm{Bel}(A) = \sum_{B \subseteq A} m(B) \qquad (2) $$

    Bel(A) is the total belief committed to A, that is, the mass of A itself plus the mass attached to all subsets of A. Bel(A) is the total positive effect the body of evidence has on the value of Θ being in A. The plausibility function Pl(Hn) quantifies the maximal degree of belief in the hypothesis Hn.

    Dempster-Shafer theory has an operation, called Dempster's rule of combination, for pooling evidence from a variety of sources [1]. This rule is used to aggregate two independent bodies of evidence defined within the same frame of discernment into one body of evidence. Let m1 and m2 be two basic belief assignments on a frame of discernment Θ. The new body of evidence m(A) on the same frame Θ is defined by equation (3).

    $$ m(A) = \frac{\sum_{B \cap C = A} m_1(B)\, m_2(C)}{1 - \sum_{B \cap C = \emptyset} m_1(B)\, m_2(C)}, \qquad A \neq \emptyset \qquad (3) $$
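The belief function and Dempster's rule of combination from Section V can be sketched as follows. This is a minimal illustration: a basic belief assignment is assumed to be a dict mapping frozenset focal elements to masses, the plausibility is computed with the standard definition (the sum of masses of all focal elements that intersect A), and the two-hypothesis frame and its mass values are made up for the example.

```python
from itertools import product


def belief(m, a):
    """Bel(A): total mass of the focal elements B that are subsets of A."""
    return sum(mass for b, mass in m.items() if b <= a)


def plausibility(m, a):
    """Pl(A): total mass of the focal elements B that intersect A."""
    return sum(mass for b, mass in m.items() if b & a)


def combine(m1, m2):
    """Dempster's rule of combination for two BBAs on the same frame."""
    combined = {}
    conflict = 0.0
    for (b, mass_b), (c, mass_c) in product(m1.items(), m2.items()):
        intersection = b & c
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_b * mass_c
        else:
            # Mass that would fall on the empty set is treated as conflict.
            conflict += mass_b * mass_c
    if 1.0 - conflict < 1e-12:
        raise ValueError("sources are in total conflict; the rule is undefined")
    # Normalize so that the combined masses sum to 1.
    return {a: mass / (1.0 - conflict) for a, mass in combined.items()}


# Hypothetical frame with two hypotheses and two bodies of evidence.
H1, H2 = frozenset({"H1"}), frozenset({"H2"})
FRAME = H1 | H2
m1 = {H1: 0.6, FRAME: 0.4}
m2 = {H1: 0.5, H2: 0.3, FRAME: 0.2}
m12 = combine(m1, m2)
```

In this toy case the conflicting mass is 0.6 × 0.3 = 0.18, so the combined mass on H1 is 0.62 / 0.82, and Bel(H1) ≤ Pl(H1) as expected.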

    VI.   PROPOSED ALGORITHM AND EXPERIMENT RESULT
    In this paper image retrieval is done by dividing the retrieval process into four parts, namely color feature, texture feature, shape feature and combining factor. We assume that all the features are extracted from the images in the image datasets. All these images were randomly selected from the WWW and have the same resolution. Suppose the image selected from the database as the query image is 011.bmp, shown in Figure 2.

                      Figure 2. Query Image

A. Color Feature Extraction & Similarity Measure
    By using the histogram method [6], the color features of the query image and of the images in the database are extracted. After applying the algorithm, the Euclidean distance metric is used as the similarity measure between the query image and the generated results. Finally we obtained the following top 10 results. A snapshot of the generated results is shown in Figure 3. The steps for finding the color histogram are:
Step 1: Select the query image from the dataset.
Step 2: Choose one color space from the RGB and HSV color spaces.
Step 3: Determine the quantization of the color space.
Step 4: Compute the histograms of the query image and the database images.
Step 5: Apply the histogram distance function and find the images in the database that are most similar to the query image.
Step 6: Arrange all the generated results in sorted order and store them in class 1, called H1.

                 Figure 3. Color Feature Extraction

B. Texture Feature Extraction & Similarity Measure
    By using the energy spectrum method, the texture features of the query image and of the images in the database are extracted and the similarity matrix is computed by the Euclidean distance method. Finally we obtained the following top 10 results. A snapshot of the generated results is shown in Figure 4. The steps for computing the energy value of images are:
Step 7: Apply the same query for texture feature extraction.
Step 8: Apply the discrete wavelet transform to the query image as well as to each image in the database, decomposing each image into four sub-images: the low-low, low-high, high-low and high-high sub-bands.
Step 9: After obtaining the four sub-images, find the energy spectrum value of each one by using equation (4).

    $$ E_v = \sum_{i=1}^{M} \sum_{j=1}^{N} X(i, j), \qquad E = \frac{E_v}{M N} \qquad (4) $$

X(i, j) is the intensity of the pixel at position (i, j), and M and N are the dimensions of the image.
Step 10: After getting the energy of each sub-band, repeat Step 8.
Step 11: Calculate the similarity matrix and find the images similar to the query image by the function shown in equation (5), the Euclidean distance between the feature vector x of the query image and the feature vector y_i of the i-th database image, each with n components.

    $$ D_i = \sqrt{\sum_{k=1}^{n} \left( x_k - y_{i,k} \right)^2} \qquad (5) $$

Step 12: Arrange all the generated results in sorted order and store them in class 2, called H2.

                 Figure 4. Texture Feature Extraction

C. Shape Feature Extraction & Similarity Measure
    By using the Canny edge detection method, the shape features of the query image and of the images in the database are extracted and the similarity metrics are calculated. Finally we obtained the following top 7 results. A snapshot of the generated results is shown in Figure 5. The steps for edge detection are:
Step 13: Apply the same query for shape feature extraction.
Step 14: Apply Gaussian filters to each image in the database and to the query image to smooth them.
Step 15: Compute the gradient magnitude using approximations of the partial derivatives.
Step 16: Detect thin edges by applying non-maxima suppression to the gradient magnitude.
Step 17: Detect edges by double thresholding.
Step 18: Apply the similarity measure function and find the images in the database that are most similar to the query image.
Step 19: Arrange all the generated results in sorted order and store them in class 3, called H3.
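Two of the edge detection steps, the gradient magnitude (Step 15) and the double threshold (Step 17), can be sketched in plain Python. This is a simplified illustration and not a full Canny detector: Gaussian smoothing (Step 14) and non-maxima suppression (Step 16) are omitted, the Sobel operator is an assumed choice for the derivative approximation, and the function names and threshold values are hypothetical.

```python
from collections import deque


def gradient_magnitude(img):
    """Sobel gradient magnitude of a grayscale image (list of rows).

    Border pixels are left at 0 because the 3x3 window does not fit there.
    """
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
            mag[y][x] = (gx * gx + gy * gy) ** 0.5
    return mag


def double_threshold(mag, low, high):
    """Keep strong pixels (>= high) and weak pixels (>= low) that are
    8-connected, directly or through other weak pixels, to a strong one."""
    h, w = len(mag), len(mag[0])
    edges = [[False] * w for _ in range(h)]
    queue = deque()
    for y in range(h):
        for x in range(w):
            if mag[y][x] >= high:
                edges[y][x] = True
                queue.append((y, x))
    # Hysteresis: grow edges from strong pixels into neighbouring weak pixels.
    while queue:
        y, x = queue.popleft()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if (0 <= ny < h and 0 <= nx < w
                        and not edges[ny][nx] and mag[ny][nx] >= low):
                    edges[ny][nx] = True
                    queue.append((ny, nx))
    return edges


# Toy image: dark left part, bright right part, giving one vertical edge.
step_image = [[0, 0, 100, 100, 100] for _ in range(5)]
edge_map = double_threshold(gradient_magnitude(step_image), low=100, high=300)
```

The resulting boolean edge map can be flattened into a feature vector and compared with the same Euclidean distance used for the other features.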

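Steps 8 to 11 of the texture procedure in subsection B can be illustrated with a short sketch. Several choices here are assumptions rather than the paper's implementation: a one-level Haar wavelet stands in for the unspecified discrete wavelet transform, the energy is taken as the mean absolute coefficient (the printed energy formula sums X(i, j) directly, which lets positive and negative detail coefficients cancel), and the sub-band naming and function names are hypothetical.

```python
import math


def haar_dwt2(img):
    """One-level 2D Haar transform of an image with even height and width.

    Returns four half-size sub-bands in the order
    (low-low, high-pass along x, high-pass along y, high-high).
    """
    bands = ([], [], [], [])
    for y in range(0, len(img), 2):
        rows = ([], [], [], [])
        for x in range(0, len(img[0]), 2):
            a, b = img[y][x], img[y][x + 1]
            c, d = img[y + 1][x], img[y + 1][x + 1]
            rows[0].append((a + b + c + d) / 2)  # average in both directions
            rows[1].append((a - b + c - d) / 2)  # difference along x
            rows[2].append((a + b - c - d) / 2)  # difference along y
            rows[3].append((a - b - c + d) / 2)  # difference in both directions
        for band, row in zip(bands, rows):
            band.append(row)
    return bands


def band_energy(band):
    """Mean absolute coefficient of one sub-band (assumed energy variant)."""
    m, n = len(band), len(band[0])
    return sum(abs(v) for row in band for v in row) / (m * n)


def texture_features(img):
    """Four-component feature vector: one energy value per sub-band."""
    return [band_energy(band) for band in haar_dwt2(img)]


def euclidean_distance(x, y):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


# Toy images: a flat patch and a patch with vertical stripes.
flat = [[10, 10, 10, 10] for _ in range(4)]
stripes = [[10, 0, 10, 0] for _ in range(4)]
distance = euclidean_distance(texture_features(flat), texture_features(stripes))
```

The flat patch puts all of its energy in the low-low band, while the striped patch also has energy in the band that captures changes along x, so the two feature vectors are clearly separated.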
                 Figure 5. Shape Feature Extraction

D. Final Result By DST and Similarity Measure
    Suppose c1 and c2 denote the results generated after color feature extraction. Similarly, t1, t2 and s1, s2 are the results of texture and shape feature extraction for the query and database images. The similarity metrics d(c1,c2), d(t1,t2) and d(s1,s2), shown in Figure 6, are then computed by the Euclidean distance measure. Then the intersection set and the favorable evidence on the matching event over the pairs of sets are found by applying the Dempster-Shafer theory of evidence.

                    Figure 6. Feature Extraction

    Figure 7 shows the search content of the query image processed by Dempster-Shafer evidence theory. In that case the outputs of the color, shape and texture extraction are processed in the intersection of the sets, and favorable evidence on the matching event is then found over the pairs of sets. The steps of the proposed DST method are:
Step 20: Apply the Dempster-Shafer algorithm to each mutually exclusive set generated by each feature extraction method.
Step 21: Let the hypothesis set Ω be composed of n single mutually exclusive subsets H, defined as Ω = {H1, H2, …, Hn}, called the frame of discernment, with its power set denoted by 2^Ω.
Step 22: Calculate the BBA m, a function that assigns a value in [0, 1] to every subset A of Ω and satisfies equation (6).

    $$ m(\emptyset) = 0, \qquad \sum_{A \subseteq \Omega} m(A) = 1 \qquad (6) $$

m(∅) could be positive when normalized combination rule are …
Step 23: Next, calculate the belief function Bel(.) associated with the BBA m(.), a function that assigns a value in [0, 1] to every nonempty subset A of Ω, called the degree of belief in A, defined by equation (7).

    $$ \mathrm{Bel}(A) = \sum_{B \subseteq A} m(B) \qquad (7) $$

    We can consider a basic belief assignment as a generalization of a probability density function, whereas a belief function is a generalization of a probability function.
Step 24: Consider two BBAs m1(.) and m2(.) for belief functions Bel1(.) and Bel2(.) respectively. Let B and C be focal elements of Bel1 and Bel2 respectively. Then m1(.) and m2(.) can be combined to obtain the belief mass committed to A ⊂ Ω according to the following combination or orthogonal sum formula m(A), defined in equation (8).

    $$ m(A) = \frac{\sum_{B \cap C = A} m_1(B)\, m_2(C)}{1 - \sum_{B \cap C = \emptyset} m_1(B)\, m_2(C)}, \qquad A \neq \emptyset \qquad (8) $$

                  Figure 7. Final Result By DST

                      VII. CONCLUSION
    An image is rich in information, and a single color, texture or shape feature cannot describe an image well, so retrieval results based on a single feature are usually not satisfactory. To address this problem, extraction methods for color, texture and shape features are proposed in this paper. An integrated approach combining color, shape and texture analysis for image retrieval has been designed and implemented, in which the outputs of all the extraction methods are processed in the intersection of the sets, and favorable evidence on the matching event is then found over the pairs of sets by applying the proposed Dempster-Shafer evidence theory. We conclude that by integrating color, shape and texture analysis in image retrieval through Dempster-Shafer evidence theory, the accuracy is much higher than when the techniques are used separately.

                      REFERENCES
[1]  Mohammad Shahab, Dr. Mohammad Deriche, "Fault Diagnosis: a Dempster-Shafer Theory Approach", 11 February 2009.
[2]  Ritendra Datta, Dhiraj Joshi, Jia Li, and James Z. Wang, "Image Retrieval: Ideas, Influences, and Trends of the New Age", ACM Computing Surveys, Vol. 40, No. 2, Article 5, April 2008.

[3]  Valery A. Petrushin, Latifur Khan, "Multimedia Data Mining and Knowledge Discovery", ISBN-10: 1-84628-436-8, Springer-Verlag London Limited, 2007.
[4]  Ying Liu, Dengsheng Zhang, Guojun Lu, Wei-Ying Ma, "A survey of content-based image retrieval with high-level semantics", Pattern Recognition Society, published by Elsevier Ltd, 2006.
[5]  Sharmin Siddique, "A Wavelet Based Technique for Analysis and Classification of Texture Images", Carleton University, Ottawa, Canada, Proj. Rep. 70.593, April 2002.
[6]  Sangoh Jeong, "Histogram-Based Color Image Retrieval", Psych221/EE362 Project Report, Mar. 15, 2001.
[7]  Marinette Bouet, Ali Khenchaf, and Henri Briand, "Shape Representation for Image Retrieval", 1999.
[8]  S. Abbasi, "Curvature scale space in shape similarity retrieval", Ph.D. thesis, Centre for Vision, Speech and Signal Processing, University of Surrey, Guildford, GU2 5XH, England, 1998.
[9]  M. Lalmas, "Dempster-Shafer's theory of evidence applied to structured documents: capturing uncertainty", Proceedings of ACM SIGIR Conference on Research and Development in Information Retrieval, Philadelphia, PA, USA, July 1997.
[10] A. K. Jain and A. Vailaya, "Image Retrieval Using Color and Shape", Pattern Recognition, Vol. 29, No. 8, pp. 1233-1244, 1996.
[11] W. Hsu, T. S. Chua and H. K. Pung, "An Integrated Color-Spatial Approach to Content-based Image Retrieval", pp. 305-313, ACM Multimedia '95, San Francisco, Nov. 1995.
[12] J. R. Smith and S.-F. Chang, "Automated image retrieval using color and texture", Technical Report CU/CTR 408-95-14, Columbia University, July 1995.
[13] John Canny, "A computational approach to edge detection", IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-8(6):679-698, Nov. 1986.
[14] A. P. Dempster, "A generalisation of Bayesian inference", Journal of the Royal Statistical Society, pp. 205-247, 1968.
