					                                                                (IJCSIS) International Journal of Computer Science and Information Security,
                                                                                                                Vol. 9, No. 2, February 2011


Content Based Image Retrieval using Dominant Color
               and Texture features

          M.Babu Rao                                  Dr.B.Prabhakara Rao                                       Dr.A.Govardhan
Associate professor, CSE department               Professor&Director of Evaluation                            Professor&Principal
Gudlavalleru Engineering College                           JNTUK                                          JNTUH college of Engineering
Gudlavalleru, Krishna (Dist.), A.P, India            Kakinada, A.P, India                                    Jagtial, A.P, India
                                                   baburaompd@yahoo.co.in

Abstract- Digital images are now in widespread use, so image databases are growing enormously, and much attention is being paid to finding images in such databases. There is a great need for an efficient technique for finding images. To be found, an image has to be represented by certain features. Color and texture are two important visual features of an image. In this paper we propose an efficient image retrieval technique that uses the dominant color and texture features of an image. As a first step, an image is uniformly divided into 8 coarse partitions. After this coarse partition, the centroid of each partition ("color bin" in MPEG-7) is selected as its dominant color. The texture of an image is obtained using the Gray Level Co-occurrence Matrix (GLCM). Color and texture features are normalized. A weighted Euclidean distance over the color and texture features is used to retrieve similar images. The efficiency of the method is demonstrated with the results.

Keywords- Image retrieval, dominant color, gray level co-occurrence matrix.

                      I.        INTRODUCTION

    Content-based image retrieval (CBIR) [1] has become a prominent research topic because of the proliferation of video and image data in digital form. Increased bandwidth for internet access in the near future will allow users to search for and browse through video and image databases located at remote sites. Fast retrieval of images from large databases is therefore an important problem that needs to be addressed.
    Image retrieval systems attempt to search through a database to find images that are perceptually similar to a query image. CBIR is an important alternative and complement to traditional text-based image searching and can greatly enhance the accuracy of the information being returned. It aims to develop an efficient visual-content-based technique to search, browse and retrieve relevant images from large-scale digital image collections. Most proposed CBIR techniques [2,3,4] automatically extract low-level features (e.g. color, texture, shapes and layout of objects) and measure the similarity among images by comparing the feature differences.
    Color is one of the most widely used low-level visual features and is invariant to image size and orientation [1]. The conventional color features used in CBIR are the color histogram, color correlogram, and dominant color descriptor (DCD).
    The color histogram is the most commonly used color representation, but it does not include any spatial information. The color correlogram describes the probability of finding color pairs at a fixed pixel distance and provides spatial information. The color correlogram therefore yields better retrieval accuracy than the color histogram. The color autocorrelogram is a subset of the color correlogram that captures the spatial correlation between identical colors only. Since it provides significant computational benefits over the color correlogram, it is more suitable for image retrieval. The DCD is one of the MPEG-7 color descriptors [4]. It describes the salient color distributions in an image or a region of interest, and provides an effective, compact, and intuitive representation of the colors present in an image. However, DCD similarity matching does not fit human perception very well, and it produces incorrect ranks for images with similar color distributions [5, 6]. In [7], Yang et al. presented a color quantization method for dominant color extraction, called the linear block algorithm (LBA), and showed that LBA is efficient in color quantization and computation. To retrieve more similar images from digital image databases (DBs) effectively, Lu et al. [8] use the color distributions, the mean value and the standard deviation, to represent the global characteristics of the image, and use the image bitmap to represent the local characteristics of the image, increasing the accuracy of the retrieval system.
    In [3,12], HSV color and GLCM texture are used as feature descriptors of an image. There, the HSV color space is quantized with non-equal intervals: H is quantized into 8 bins, S into 3 bins and V into 3 bins, so color is represented by a one-dimensional vector of size 72 (8 x 3 x 3). Instead of using 72 color feature values to represent the color of an image, it is better to use a compact representation of the feature vector. For simplicity and without loss of generality, the RGB color space is used in this paper.
    Texture is also an important visual feature that refers to innate surface properties of an object and their relationship to the surrounding environment. Many objects in an image can be distinguished solely by their textures without any other information. There is no universal definition of texture. Texture




                                                                       118                             http://sites.google.com/site/ijcsis/
                                                                                                       ISSN 1947-5500

may consist of some basic primitives, and may also describe the structural arrangement of a region and its relationship to the surrounding regions [5]. In our approach we use texture features derived from the gray-level co-occurrence matrix (GLCM).

     Our proposed CBIR system is based on dominant color [21] and GLCM [17] texture. The focus is on global features, because low-level visual features of images such as color and texture are especially useful for representing and comparing images automatically. For the concrete color and texture descriptions, we use dominant colors and the gray-level co-occurrence matrix. The rest of the paper is organized as follows. Section II outlines the proposed method in terms of an algorithm. Section III deals with the experimental setup. Section IV presents results. Section V presents conclusions.

                  II.      PROPOSED METHOD
    Simple features alone cannot give a comprehensive description of image content. Combining color and texture features not only expresses more image information but also describes the image from different aspects, giving more detailed information and better search results. The proposed method is based on the dominant color and texture features of an image. The retrieval algorithm is as follows:
Step 1: Uniformly divide each image in the database and the target image into 8 coarse partitions, as shown in Fig. 1.
Step 2: For each partition, select the centroid of the partition as its dominant color.
Step 3: Obtain the texture features (energy, contrast, entropy and inverse difference) from the GLCM.
Step 4: Construct a combined feature vector for color and texture.
Step 5: Find the distances between the feature vector of the query image and the feature vectors of the target images using the weighted and normalized Euclidean distance.
Step 6: Sort the Euclidean distances.
Step 7: Retrieve the first 20 most similar images, i.e. those with minimum distance.

 A. Color feature representation
     In general, color is one of the most dominant and distinguishable low-level visual features for describing an image. Many CBIR systems, such as QBIC and VisualSEEk, employ color to retrieve images. In theory, extracting color features directly from the true-color image leads to minimum error, but the computation cost and storage required expand rapidly, which makes this impractical. In fact, for a given color image, the colors actually present occupy only a small proportion of the whole color space, and further observation shows that a few dominant colors cover a majority of the pixels. Consequently, representing the image by these dominant colors reduces image quality but does not affect the understanding of the image content.
     In the MPEG-7 Final Committee Draft, several color descriptors have been approved, including a number of histogram descriptors and a dominant color descriptor (DCD) [4, 6]. The DCD contains two main components: the representative colors and the percentage of each color. It provides an effective, compact, and intuitive salient color representation, and describes the color distribution in an image or a region of interest. For the DCD in MPEG-7, however, the representative colors depend on the color distribution, and the greater part of the representative colors will be located in the higher color distribution range with smaller color distance. This may not be consistent with human perception, because human eyes cannot exactly distinguish colors that are close together. Moreover, DCD similarity matching does not fit human perception very well, and it produces incorrect ranks for images with similar color distributions. We adopt a new and efficient dominant color extraction scheme to address the above problems [7,8].

                                    Fig. 1 The coarse division of RGB color space.

 B. Extraction of dominant color of an image
     The procedure to extract the dominant color of an image is as follows:
     According to numerous experiments, the selection of color space is not a critical issue for DCD extraction. Therefore, for simplicity and without loss of generality, the RGB color space is used. Firstly, the RGB color space is uniformly divided into 8 coarse partitions, as shown in Fig. 1. If there are several





colors located on the same partitioned block, they are assumed to be similar. After the above coarse partition, the centroid of each partition ("color bin" in MPEG-7) is selected as its quantized color.

 Let X = (X_R, X_G, X_B) represent the color components of a pixel (red, green, and blue), and let C_i be the quantized color for partition i. The average value of the color distribution for each partition center can be calculated by

After the average values are obtained, each quantized color can be determined by using

In this way, the dominant colors of an image will be obtained.

 C. Extraction of texture of an image

    Most natural surfaces exhibit texture, which is an important low-level visual feature. Texture recognition will therefore be a natural part of many computer vision systems. In this paper, we propose a texture representation for image retrieval based on the GLCM.
    The GLCM [11, 13] is created in four directions with a distance of one between pixels. Texture features are extracted from the statistics of this matrix. The four commonly used GLCM texture features are given below.
    The GLCM is composed of probability values. It is defined by P(i, j | d, θ), which expresses the probability of a pair of pixels with gray levels i and j occurring at distance d in direction θ. When θ and d are fixed, P(i, j | d, θ) is written P(i, j). The GLCM is a symmetric matrix whose order is determined by the number of image gray levels. Elements in the matrix are computed by the equation shown below:

$$ P(i, j \mid d, \theta) = \frac{P(i, j \mid d, \theta)}{\sum_{i} \sum_{j} P(i, j \mid d, \theta)} \tag{4} $$

    The GLCM expresses the texture feature through the correlation of the gray-level values of pixel pairs at different positions. It describes the texture feature quantitatively. In this paper, four texture features are considered: energy, contrast, entropy, and inverse difference.

$$ \text{Energy: } E = \sum_{x} \sum_{y} P(x, y)^{2} \tag{5} $$

Energy measures the homogeneity of a gray-scale image. It reflects how uniform the gray-level distribution is and how coarse the texture is.

$$ \text{Contrast: } I = \sum_{x} \sum_{y} (x - y)^{2} \, P(x, y) \tag{6} $$

Contrast is the moment of inertia about the main diagonal of the matrix. It measures how the matrix values are distributed and the amount of local variation in the image, reflecting image clarity and the depth of the texture grooves. A large contrast indicates deeper texture.

$$ \text{Entropy: } S = -\sum_{x} \sum_{y} P(x, y) \, \log P(x, y) \tag{7} $$

Entropy measures randomness in the image texture. Entropy is maximum when all entries of the co-occurrence matrix are equal, and it is smaller when the entries are very uneven. Maximum entropy therefore indicates that the gray-level distribution of the image is random.

$$ \text{Inverse difference: } H = \sum_{x} \sum_{y} \frac{1}{1 + (x - y)^{2}} \, P(x, y) \tag{8} $$

Inverse difference measures the amount of local change in the image texture. A large value indicates that the texture varies little between regions and is locally very uniform. Here P(x, y) is the entry of the normalized co-occurrence matrix for the gray-level pair (x, y).

The texture features are computed for an image with d = 1 and θ = 0°, 45°, 90°, 135°. In each direction, the four texture features are calculated and used as the texture feature descriptor. A combined feature vector of color and texture is then formed.

                        III. EXPERIMENTAL SETUP

 A. Data set
     Wang's [15] dataset comprises 1000 Corel images with ground truth. The image set contains 100 images in each of 10 categories. The images are of size 256 x 384 or 384 x 256; the 384 x 256 images are resized to 256 x 384.

 B. Feature set
     The feature set comprises the color and texture descriptors computed for an image as discussed in Section II.

 C. Computation of similarity
     The similarity between the query image and a target image is measured from two types of features: dominant color and texture. These two types of features represent different aspects of the image, so appropriate weights are needed to combine them in the Euclidean similarity measure. We construct the Euclidean calculation model as follows:

$$ D(A, B) = \omega_1 D(F_{CA}, F_{CB}) + \omega_2 D(F_{TA}, F_{TB}) \tag{13} $$





Here ω1 is the weight of the color features and ω2 is the weight of the texture features. FCA and FCB represent the normalized 72-dimensional color features of images A and B. For the GLCM-based method, FTA and FTB represent the 4-dimensional normalized texture features of images A and B. Here, we combine color features and texture features. Experiments show that ω1 = ω2 = 0.5 gives better retrieval performance.
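
The two feature types of Section II and their combination by Eq. (13) can be sketched end to end as follows. This is a minimal sketch under stated assumptions, not the authors' MATLAB implementation: the 8 coarse partitions are taken to be the octants of the RGB cube (each channel split at 128), the dominant color of a partition is the mean of the pixels falling in it, and the inputs to the distance are assumed to be already normalized. All function names are hypothetical.

```python
import math


def dominant_colors(pixels):
    """Mean RGB color (centroid) of each of the 8 octants of the RGB cube.

    Assumption: the 8 coarse partitions are the octants obtained by
    splitting each channel at 128. Empty octants yield (0.0, 0.0, 0.0).
    """
    sums = [[0, 0, 0] for _ in range(8)]
    counts = [0] * 8
    for r, g, b in pixels:
        idx = (r >= 128) * 4 + (g >= 128) * 2 + (b >= 128)
        counts[idx] += 1
        for c, v in enumerate((r, g, b)):
            sums[idx][c] += v
    result = []
    for i in range(8):
        n = counts[i]
        result.append(tuple(s / n for s in sums[i]) if n else (0.0, 0.0, 0.0))
    return result


def glcm_features(gray, levels, dx, dy):
    """Energy, contrast, entropy and inverse difference (Eqs. 5-8).

    `gray` is a 2-D list of integers in [0, levels); (dx, dy) is the pixel
    offset, e.g. (1, 0) for d = 1 and a direction of 0 degrees.
    """
    h, w = len(gray), len(gray[0])
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    for y in range(h):
        for x in range(w):
            x2, y2 = x + dx, y + dy
            if 0 <= x2 < w and 0 <= y2 < h:
                a, b = gray[y][x], gray[y2][x2]
                counts[a][b] += 1  # count the pair in both orders so
                counts[b][a] += 1  # that the matrix is symmetric
                total += 2
    if total == 0:
        return 0.0, 0.0, 0.0, 0.0
    energy = contrast = entropy = inv_diff = 0.0
    for i in range(levels):
        for j in range(levels):
            p = counts[i][j] / total  # normalization of Eq. (4)
            if p > 0:
                energy += p * p
                contrast += (i - j) ** 2 * p
                entropy -= p * math.log(p)
                inv_diff += p / (1 + (i - j) ** 2)
    return energy, contrast, entropy, inv_diff


def euclidean(u, v):
    """Plain Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def combined_distance(color_a, texture_a, color_b, texture_b, w1=0.5, w2=0.5):
    """Eq. (13): weighted sum of the color and texture distances."""
    return w1 * euclidean(color_a, color_b) + w2 * euclidean(texture_a, texture_b)
```

For d = 1, the four directions 0°, 45°, 90° and 135° correspond to the offsets (1, 0), (1, -1), (0, -1) and (-1, -1) in this sketch, with y growing downwards.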
                IV.        EXPERIMENTAL RESULTS

The experiments were carried out as explained in Sections II and III. The results are benchmarked against some existing systems using the same database [15]. The quantitative measure is given below:

$$ p(i) = \frac{1}{100} \sum_{1 \le j \le 1000,\; r(i, j) \le 100,\; ID(j) = ID(i)} 1 $$

      where p(i) is the precision of query image i, ID(i) and ID(j) are the category IDs of images i and j respectively, which are in the range 1 to 10, and r(i, j) is the rank of image j. This value is the percentage of images belonging to the category of image i among the first 100 retrieved images.
  The average precision p_t for category t (1 ≤ t ≤ 10) is given by

$$ p_t = \frac{1}{100} \sum_{1 \le i \le 1000,\; ID(i) = t} p(i) $$
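
The two measures above can be sketched as follows. This is an illustrative sketch, not the authors' evaluation code: the function names and the dictionary-based inputs are assumptions, and the category mean divides by the number of query images in the category, which equals 100 for the dataset used here.

```python
def query_precision(ranked_ids, query_category, category_of, cutoff=100):
    """p(i): share of the first `cutoff` retrieved images that belong to
    the same category as the query image."""
    top = ranked_ids[:cutoff]
    hits = sum(1 for j in top if category_of[j] == query_category)
    return hits / cutoff


def category_precision(precisions, category_of, t):
    """p_t: mean of p(i) over all query images i whose category is t."""
    members = [i for i in precisions if category_of[i] == t]
    return sum(precisions[i] for i in members) / len(members)
```

Here `ranked_ids` is the list of image identifiers sorted by increasing distance to the query, `category_of` maps an image identifier to its category ID, and `precisions` maps each query image to its p(i).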

    The comparison of the proposed method with other retrieval systems is presented in Table 1. These retrieval systems are based on HSV color, GLCM texture, and combined HSV color and GLCM texture. The proposed retrieval system achieves the highest average precision in every category of the database except Building (Table 1).
    The experiments were carried out on a Core i3, 2.4 GHz processor with 4 GB RAM using MATLAB. Fig. 3 shows the image retrieval results using HSV color, GLCM texture, HSV color and GLCM texture, and the proposed method. The image at the top left-hand corner is the query image and the other 19 images are the retrieval results.
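
The ranking that produces such a 20-image result page (Steps 5 to 7 of Section II) can be sketched as below. This is a hedged illustration: `retrieve` and the dictionary layout are hypothetical, and a plain Euclidean distance over one combined feature vector stands in for the weighted distance of Eq. (13).

```python
import math


def retrieve(query_vec, database, k=20):
    """Return the ids of the k database images closest to the query.

    `database` maps an image id to its feature vector.
    """
    def dist(u, v):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

    # Sort image ids by increasing distance to the query (Step 6),
    # then keep the k nearest ones (Step 7).
    ranked = sorted(database, key=lambda image_id: dist(query_vec, database[image_id]))
    return ranked[:k]
```

If the query image is itself part of the database, it comes back first with distance zero, which matches a result page whose top-left image is the query.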
Fig. 3 The image retrieval results (dinosaurs) using different techniques: (a) retrieval based on HSV color, (b) retrieval based on GLCM texture, (c) retrieval based on HSV color and GLCM texture, (d) retrieval based on the proposed method.

The performance of a retrieval system can be measured in terms of its recall (or sensitivity) and precision. Recall measures the ability of the system to retrieve all images that are relevant, while precision measures the ability of the system to retrieve only images that are relevant. They are defined as

$$ \text{Recall} = \frac{\text{Number of relevant images retrieved}}{\text{Total number of relevant images}} $$

$$ \text{Precision} = \frac{\text{Number of relevant images retrieved}}{\text{Total number of images retrieved}} $$
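
As a small worked illustration of the recall and precision definitions, the sketch below computes both for one query. The function name and the list-based inputs are assumptions made for the example.

```python
def recall_precision(retrieved, relevant):
    """Return (recall, precision) for one query.

    `retrieved` is the list of returned image ids; `relevant` is the
    collection of all image ids that are relevant to the query.
    """
    relevant = set(relevant)
    hits = sum(1 for image_id in retrieved if image_id in relevant)
    recall = hits / len(relevant) if relevant else 0.0
    precision = hits / len(retrieved) if retrieved else 0.0
    return recall, precision
```

For example, if 4 images are returned, 2 of them are relevant, and 5 relevant images exist in total, recall is 2/5 and precision is 2/4.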





Table 1. Comparison of the average precision obtained by the proposed method with other retrieval techniques.

                                  Average Precision
   Class        HSV color    GLCM texture    HSV color +      Dominant color +
                                             GLCM texture     GLCM texture
                                                              (proposed method)
   Africa         0.26          0.21            0.25              0.27
   Beaches        0.27          0.35            0.21              0.36
   Building       0.38          0.5             0.24              0.25
   Bus            0.45          0.22            0.51              0.52
   Dinosaur       0.26          0.29            0.6               0.91
   Elephant       0.3           0.24            0.26              0.38
   Flower         0.65          0.73            0.81              0.89
   Horses         0.19          0.25            0.28              0.47
   Mountain       0.15          0.18            0.2               0.3
   Food           0.24          0.29            0.25              0.32
   Average        0.315         0.326           0.361             0.467

    The following graph shows the comparison of the average precision obtained by the proposed method with other retrieval systems.

[Graph: average precision by class number (1 to 10) for Dominant color+GLCM texture, HSV color+GLCM texture, GLCM texture, and HSV color.]

Fig. 4 Average precision of various image retrieval methods (x-axis: number of returned images; series: Dominant color+GLCM texture, HSV color+GLCM texture, GLCM texture, HSV color).

Fig. 5 Average recall of various image retrieval methods (x-axis: number of returned images; same series as Fig. 4).

                                V. CONCLUSION

    CBIR is an active research topic in image processing, pattern recognition, and computer vision. In this paper, a CBIR method has been proposed which uses the combination of dynamic dominant color and GLCM texture descriptors. Experimental results showed that the proposed method yielded higher average precision and average recall with a reduced feature vector dimension. In addition, the proposed method almost always showed a gain in average retrieval time over the other methods. In further studies, the proposed retrieval method is to be evaluated on a wider variety of databases.

                                                                                                                                        REFERENCES
                                                                                                  [1]    Ritendra Datta, Dhiraj Joshi, Jia Li, James Z. Wang, Image retrieval:
Fig. 3 Average precision of various image retrieval methods for 10 classes of                           ideas, influences, and trends of the new age, ACM Computing Surveys
                               Corel database.                                                          40 (2) (2008) 1–60.
                                                                                                  [2]    W. Niblack et al., “The QBIC Project: Querying Images by Content
The graph in Fig.4 showing the Comparison of average                                                    Using Color, Texture, and Shape,” in Proc. SPIE, vol. 1908, San Jose,
                                                                                                        CA, pp. 173–187, Feb. 1993.
precision obtained by proposed method with other retrieval
                                                                                                  [3]   A. Pentland, R. Picard, and S. Sclaroff, “Photobook: Content-based
systems. And the graph in Fig.5 showing the Comparison of                                               Manipulation of Image Databases,” in Proc. SPIE Storage and
average recall obtained by proposed method with other                                                   Retrieval for Image and Video Databases II, San Jose, CA, pp. 34–
retrieval systems.                                                                                      47, Feb. 1994.





[4]    M. Stricker and M. Orengo, “Similarity of Color Images,” in Proc. SPIE
       Storage and Retrieval for Image and Video Databases, pp. 381-392, Feb.
       1995.
[5]    Chia-Hung Wei, Yue Li, Wing-Yin Chau, Chang-Tsun Li, Trademark
       image retrieval using synthetic features for describing global shape and
       interior structure, Pattern Recognition 42 (3) (2009) 386–394.
[6]    ISO/IEC 15938-3/FDIS Information Technology - Multimedia Content
       Description Interface - Part 3: Visual, Jul. 2001,
       ISO/IEC/JTC1/SC29/WG11 Doc. N4358.
[7]    Nai-Chung Yang, Wei-Han Chang, Chung-Ming Kuo, Tsia-Hsing Li, A
       fast MPEG-7 dominant color extraction with new similarity measure for
       image retrieval, Journal of Visual Communication and Image
       Representation 19 (2) (2008) 92–105.
[8]    P. Howarth and S. Ruger, “Robust texture features for still-image
       retrieval,” IEE Proceedings - Vision, Image and Signal Processing,
       Vol. 152, No. 6, December 2005.
[9]    Young Deok Chun, Nam Chul Kim, Ick Hoon Jang, Content-based
       image retrieval using multiresolution color and texture features, IEEE
       Transactions on Multimedia 10 (6) (2008) 1073–1084.
[10]   Y.D. Chun, S.Y. Seo, N.C. Kim, Image retrieval using BDIP and BVLC
       moments, IEEE Transactions on Circuits and Systems for Video
       Technology 13 (9) (2003) 951–957.
[11]   H. T. Shen, B. C. Ooi, K. L. Tan, “Giving meanings to WWW images,”
       Proceedings of ACM Multimedia, 2000, pp. 39–48.
[12]   Fan-Hui Kong, “Image Retrieval using both color and texture
       features,” Proceedings of the 8th International Conference on Machine
       Learning and Cybernetics, Baoding, 12-15 July 2009.
[13]   Ji-Quan Ma, “Content-Based Image Retrieval with HSV Color Space
       and Texture Features,” Proceedings of the 2009 International Conference
       on Web Information Systems and Mining.
[14]   P. S. Hiremath, Jagadeesh Pujari, “Content based image retrieval using
       Color, Texture and Shape features,” Proceedings of the 15th International
       Conference on Advanced Computing and Communications.
[15]   http://wang.ist.psu.edu/
[16]   J. R. Smith, S. F. Chang, Tools and techniques for color image retrieval,
       in: IS&T/SPIE Storage and Retrieval for Image and Video Databases IV,
       San Jose, CA, vol. 2670, 1996, pp. 426–437.
[17]   Chia-Hung Wei, Yue Li, Wing-Yin Chau, Chang-Tsun Li, Trademark
       image retrieval using synthetic features for describing global shape and
       interior structure, Pattern Recognition 42 (3) (2009) 386–394.
[18]   S. Liapis, G. Tziritas, Color and texture image retrieval using
       chromaticity histograms and wavelet frames, IEEE Transactions on
       Multimedia 6 (5) (2004) 676–686.
[19]   Song Mailing, Li Huan, “An Image Retrieval Technology Based on
       HSV Color Space”, Computer Knowledge and Technology, No. 3,
       pp.200-201, 2007.
[20]   B. S. Manjunath, W. Y. Ma, “Texture features for browsing and retrieval
       of image data,” IEEE Transactions on PAMI, Vol. 18, No. 8, pp. 837-842.
[21]   X.-Y. Wang et al., “An effective image retrieval scheme using color,
       texture and shape features,” Comput. Stand. Interfaces (2010),
       doi:10.1016/j.csi.2010.03.004



