(IJCSIS) International Journal of Computer Science and Information Security,
Vol. 9, No. 11, November 2011
A Hybrid Approach for DICOM Image Feature
Extraction, Feature Selection Using Fuzzy Rough
set and Genetic Algorithm
J. Umamaheswari
Research Scholar, Department of Computer Science
Dr. G.R.D College of Science, Coimbatore, Tamilnadu, India
Umamugesh@yahoo.com
Dr. G. Radhamani
Director, Department of Computer Science
Dr. G.R.D College of Science, Coimbatore, Tamilnadu, India
radhamanig@gmail.com
Abstract- This paper proposes a hybrid approach for feature extraction, feature reduction and feature selection of medical images based on Fuzzy Rough set and Genetic Algorithm (GA). A Gray Level Co-occurrence Matrix (GLCM) and histogram based texture feature set is derived. The texture features are extracted from normal and infected Digital Imaging and Communications in Medicine (DICOM) images using GLCM and histogram based features. These features are the input to the feature selection process, which is carried out using Fuzzy Rough set and GA. The selected optimal features are used to classify the DICOM images into normal and infected. The performance of the algorithm is evaluated on a series of DICOM datasets collected from medical laboratories.
Keywords- Fuzzy rough set; GLCM; Texture features; Histogram features; Region features.
I. INTRODUCTION
DICOM image analysis is becoming increasingly important in the diagnosis process. Optimal identification and early detection of disease, which improve the survival rate, are not easy to achieve. DICOM imaging is generally regarded as a valuable and reliable method for early detection.
Different methods have been used for DICOM image feature reduction, including statistical methods, texture based methods and features extracted using image processing techniques [3]. Other methods are based on fuzzy theory [1] and neural networks [2].
The lack of systematic research on the extracted features and their role in the classification results forces researchers to select features arbitrarily as input to their systems. Genetic algorithms have been successful in discovering an optimal or near-optimal solution among a huge number of possible solutions (Goldberg 1989). Moreover, a combination of genetic algorithms and fuzzy methods can prove very powerful in classification problems. Previously, genetic algorithms have been used either to evolve neural network topology (Stathakis and Kanellopoulos 2006) [4] or to select features (Kavzoglu and Mather 2002) [5], but not both at the same time.
GLCM, histogram, level set, Gabor filters and wavelet transform [6, 7, 8, 9] are approaches to the texture classification problem. Gabor filters perform poorly because their lack of orthogonality results in redundant features, while the wavelet transform can represent textures at the most suitable scale by varying the spatial resolution, and it offers a wide range of choices for the wavelet function.
In medical image analysis, texture is used to distinguish normal from infected brain images. DICOM and CT image texture has proved useful for characterizing the normal brain [10] and for detecting the diseased part of the brain [11].
Selecting the optimal features is a major problem in medical imaging. Evaluating the possible feature subsets is usually a laborious task that requires a large computational effort. Fuzzy rough set and Genetic Algorithm (GA) appear to be an effective approach for choosing the best feature subset while maintaining acceptable selection quality. Siedlecki and Sklansky [12] compared the GA with classical algorithms and proposed the GA for feature selection. Fuzzy rough set has proved to be the best selection method for optimal classification.
A new method for extracting features from DICOM images with lower computational requirements is proposed, and the selection percentage is analyzed. The tables provide the user with all relevant information for efficient decision making. Thus a synergy of genetic algorithms and fuzzy rough sets is used for feature selection in the proposed method.
The remainder of the paper is organized as follows. Section 2 describes the feature extraction process. The feature selection problem is discussed in Section 3, while Section 4 contains the experimental results. Finally, Section 5 presents the conclusion, followed by the references.
85 http://sites.google.com/site/ijcsis/
ISSN 1947-5500
II. FEATURE EXTRACTION
Feature extraction methodologies analyze objects and images to extract the features that are representative of the various classes of objects. In this work, intensity histogram features and Gray Level Co-occurrence Matrix (GLCM) features are extracted [12].
TABLE 3 GLCM FEATURES AND VALUES EXTRACTED FROM NORMAL & INFECTED MEDICAL IMAGES
2.1 Intensity Histogram Features
Intensity histogram analysis has been used extensively. The intensity histogram features are mean, variance, skewness, kurtosis, entropy and energy. These are shown in Table 1.
TABLE 1 FEATURES OF INTENSITY HISTOGRAM
The average values of the intensity histogram features obtained for different types of medical image are given in Table 2.
TABLE 2 INTENSITY HISTOGRAM FEATURES FOR MEDICAL IMAGES
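As a minimal sketch of how the six intensity histogram features named in Section 2.1 might be computed, the following Python function derives them from the normalized gray-level histogram. The function name, the integer gray-level range and the exact formulas (non-excess kurtosis, base-2 entropy) are illustrative assumptions; the definitions used in Table 1 may differ.

```python
import numpy as np

def histogram_features(image, levels=256):
    """First-order texture features from the normalized intensity histogram.

    Assumes integer gray levels in [0, levels); a DICOM image with a wider
    range would need rescaling first (an assumption, not from the paper).
    """
    pixels = np.asarray(image, dtype=np.int64).ravel()
    if pixels.size == 0:
        raise ValueError("image is empty")
    if pixels.min() < 0 or pixels.max() >= levels:
        raise ValueError("gray levels must lie in [0, levels)")

    counts = np.bincount(pixels, minlength=levels).astype(float)
    p = counts / counts.sum()             # probability of each gray level
    g = np.arange(levels, dtype=float)    # gray-level values

    mean = float(np.sum(g * p))
    variance = float(np.sum((g - mean) ** 2 * p))
    std = np.sqrt(variance)
    # Skewness and kurtosis are undefined for a constant image; report 0.
    skewness = float(np.sum((g - mean) ** 3 * p) / std ** 3) if std > 0 else 0.0
    kurtosis = float(np.sum((g - mean) ** 4 * p) / std ** 4) if std > 0 else 0.0
    energy = float(np.sum(p ** 2))
    nonzero = p[p > 0]                    # skip empty bins to avoid log(0)
    entropy = float(-np.sum(nonzero * np.log2(nonzero)))

    return {
        "mean": mean,
        "variance": variance,
        "skewness": skewness,
        "kurtosis": kurtosis,
        "entropy": entropy,
        "energy": energy,
    }

# Hypothetical 4-level toy image, used only to exercise the function.
toy = np.array([[0, 0, 1, 1], [2, 2, 3, 3]])
toy_features = histogram_features(toy, levels=4)
```

Each returned value is one candidate column of the feature vector that the selection stage later reduces.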
2.2 GLCM Features
The Gray-Level Co-occurrence Matrix (GLCM), also known as the gray-level spatial dependence matrix, is a statistical method that considers the spatial relationship of pixels. The spatial relationship is usually defined between a pixel and its adjacent pixel, although other spatial relationships between the two pixels can be specified.
The following GLCM features were extracted in this paper: Autocorrelation, Contrast, Correlation, Cluster Prominence, Cluster Shade, Dissimilarity, Energy, Entropy, Homogeneity, Maximum probability, Sum of squares, Sum average, Sum variance, Sum entropy, Difference variance, Difference entropy, Information measure of correlation 1, Information measure of correlation 2, Inverse difference normalized.
The values obtained for the above features for a typical normal and a typical infected DICOM image are given in Table 3.
III. FEATURE SELECTION
Feature selection is used to improve the prediction accuracy and minimize the computation time. It works by reducing the feature space: irrelevant, redundant and noisy features are removed, which reduces the dimensionality. Popular feature selection algorithms are Sequential Forward Selection, Sequential Backward Selection, Genetic Algorithm and Particle Swarm Optimization. In this paper a combined approach of the fuzzy rough set method with the Genetic Algorithm is proposed to select the optimal features. The selected optimal features are then used for classification.
3.1 Genetic Algorithm (GA) based Feature selection:
During classification, the number of features can be large, and many of them may be irrelevant or redundant, so the optimal solution is not reached. To solve this problem, feature reduction is
introduced to improve the process by searching for the best feature subset among the original features.
GA is an adaptive method of global-optimization searching that simulates the behavior of the evolution process in nature. It is based on Darwin’s principle of survival of the fittest, which states that an initial population of individuals evolves through natural selection in such a way that the fittest individuals have a higher chance of survival.
The GA maintains a cluster of competing feature matrices. To evaluate each matrix in this cluster, the inputs are multiplied by the matrix, producing a set of outputs which are then sent to a classifier. The classifier typically divides the features into a training set and a testing set to evaluate classification accuracy. Generally, each feature is encoded into a vector called a chromosome.
fitness = W_A ∙ Accuracy + W_nb / N    (1)
where W_A is the weight of accuracy and W_nb is the weight of the N features that participate in classification, with N ≠ 0.
A fitness value is used to measure the fitness of a chromosome and decides whether a chromosome is good or not in a given cluster. The initial population in the genetic process is created randomly. GA uses three operators to produce the next generation from the current generation: reproduction, crossover and mutation. GA eliminates the chromosomes of low fitness and keeps those of high fitness. Thus more chromosomes of high fitness move to the next generation. This process is repeated until a good chromosome (individual) is found. Figure 1 illustrates feature selection using the genetic algorithm.
FIGURE 1 FEATURE SELECTION USING GA
The total number of features extracted is 40. The features selected using the GA method are tabulated as follows:
TABLE 4 FEATURE SELECTED BY GENETIC ALGORITHM METHOD
Table 4 above shows the features selected by the GA method.
3.2 Feature selection by Rough Set
Fuzzy set theory involves more advanced mathematical concepts, such as real numbers and functions, whereas in classical set theory the notion of a set is a fundamental notion of the whole of mathematics and is used to derive other mathematical concepts, e.g., numbers and functions [13,14]. Rough set theory can be viewed as a specific implementation of Frege’s idea of vagueness, i.e., imprecision in this approach is expressed by a boundary region of a set, and not by a partial membership as in fuzzy set theory. The rough set concept can be defined quite generally by means of the topological operations interior and closure, called approximations. The concept of rough set theory is based on the following:
3.2.1 Decision Tables
A decision table consists of two different attribute sets. One attribute set is designated to represent Conditions (C) and the other to represent Decisions (D). Each row of a decision table therefore describes a decision rule, which indicates a particular decision to be taken if its corresponding condition is satisfied. If a set of decision rules has a common condition but different decisions, then all the decision rules belonging to this set are inconsistent; otherwise, they are consistent.
3.2.2 Dependency of Attributes
As in relational databases, dependencies between attributes may be discovered. If all the values of attributes from D are uniquely determined by values of attributes from C, then D depends totally on C, or C functionally determines D, which is denoted by C ⇒ D. If D depends on only some of the attributes of C (i.e. not on all), then it is a partial dependency C ⇒k D, and a degree of dependency k (0 ≤ k ≤ 1) can be computed as k = γ(C, D), where γ(C, D) is the consistency factor of the decision table. γ(C, D) is defined as the ratio of the number of consistent decision rules to the total number of decision rules in the decision table.
3.2.3 Reduction of Attributes
Decision tables in which feature vectors are the conditions (C) and the desired values for the corresponding classes are the decisions (D) can also represent the classification of feature vectors. The dimensionality reduction can now simply be
considered as the removal of some attributes from the decision table (in effect, some features from the feature vector) while preserving its basic classification capability. If a decision table contains redundant or superfluous data, those data are collected and removed.
The features selected using the Rough set method are tabulated as follows:
TABLE 5 FEATURE SELECTED BY ROUGH SET METHOD
1 Kurtosis
2 Std
3 Sum Average
4 Sum Variance
TABLE 6 FEATURE SELECTED BY PROPOSED APPROACH
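The degree of dependency γ(C, D) defined in Section 3.2.2 can be illustrated with a short Python sketch. It implements the crisp consistency ratio as stated there (consistent decision rules divided by all decision rules), not a fuzzy-rough measure. The function name, the toy decision table and the assumption that feature values have already been discretized are hypothetical and serve only as illustration.

```python
from collections import defaultdict

def dependency_degree(rows, condition_idx, decision_idx):
    """gamma(C, D): fraction of rows whose condition values imply one decision."""
    if not rows:
        raise ValueError("decision table is empty")

    # Collect every decision observed for each combination of condition values.
    decisions_by_condition = defaultdict(set)
    for row in rows:
        key = tuple(row[i] for i in condition_idx)
        decisions_by_condition[key].add(row[decision_idx])

    # A rule is consistent when its condition values map to a single decision.
    consistent = sum(
        1
        for row in rows
        if len(decisions_by_condition[tuple(row[i] for i in condition_idx)]) == 1
    )
    return consistent / len(rows)

# Hypothetical toy table: two discretized features and a class label.
table = [
    ("low", "high", "normal"),
    ("low", "high", "infected"),   # same conditions, different decision
    ("high", "low", "infected"),
    ("high", "high", "normal"),
]
gamma_both = dependency_degree(table, [0, 1], 2)   # both features as C
gamma_first = dependency_degree(table, [0], 2)     # first feature only
```

Under this definition, an attribute is a candidate for removal when dropping it from C leaves γ(C, D) unchanged, which is one common way to read the reduction step of Section 3.2.3.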
3.3 Proposed Hybrid Approach Algorithm:
1. N features are extracted from the preprocessed image using GLCM and histogram texture features.
2. Apply the rough set algorithm to select the optimal set containing n1 features, where n1 < N.
3. Apply the genetic algorithm to select the best subset containing n2 features, where n2 < N.
4. Find the union of the n1 features and the n2 features to form the final n features.
5. Use the n features, where n < N, for classification.
FIGURE 2 PROPOSED APPROACH FOR FEATURE SELECTION
Figure 2 above shows feature selection by the proposed approach. Table 6 gives the features selected by the proposed approach.
IV. EXPERIMENTAL RESULTS
The results of different feature reduction methods, namely rough set, GA and the proposed method, are compared. The feature space is formed from the DICOM images; in total forty features are extracted. Using GA the feature space is reduced to eight features, and by the rough set method it is reduced to four features. The proposed method selects only twelve features. These features improve the class prediction.
The percentage of reduction by the GA method is 80%, and 75% reduction is achieved by the rough set method. The selected features are used for classification, which reduces the classification time and improves the prediction accuracy. The proposed approach selects a feature space of DICOM images which is reduced by 95%. Table 7 gives the results of the proposed method.
TABLE 7 RESULTS OBTAINED BY PROPOSED METHOD
GA method 80%
Rough set method 75%
Proposed method 95%
This shows that the proposed approach is efficient for image analysis. It is a useful tool for doctors or radiologists to classify normal brain images and infected brain images.
V. CONCLUSION
This paper developed a hybrid technique for normal and infected DICOM images. The proposed approach gives results in extraction and selection for classifying the images that help the physician to make a final decision. The approach performs feature extraction, feature reduction and feature selection of images based on Rough set and Genetic Algorithm (GA). A Gray Level Co-occurrence Matrix (GLCM) and histogram based texture feature set is derived. The feature selection is done by Fuzzy Rough set and GA. These optimal features are used to classify the DICOM images into normal and infected.
The performance of the algorithm is evaluated on a series of DICOM datasets collected from medical laboratories. The method has proved to be simpler and gives desirable results for future processing.
REFERENCES
[1] D. Brazokovic and M. Nescovic, “Mammogram screening using multiresolution based image segmentation”, International Journal of Pattern Recognition and Artificial Intelligence, Vol. 7, No. 6, P. 1437-1460, 1993.
[2] I. Christiyanni et al., “Fast detection of masses in computer aided mammography”, IEEE Signal Processing Magazine, P. 54-64, 2000.
[3] S. Lai, X. Li and W. Bischof, “On techniques for detecting circumscribed masses in mammograms”, IEEE Trans on Medical Imaging, Vol. 8, No. 4, P. 377-386, 1989.
[4] K. Topouzelis, D. Stathakis and V. Karathanassi, “Investigation of genetic algorithms contribution to feature selection for oil spill detection”, Vol. 30, No. 3, P. 611-625, 2009.
[5] Kavzoglu T. and Mather P.M., “The role of feature selection in artificial neural network applications”, International Journal of Remote Sensing, Vol. 23, No. 15, P. 2919-2937, 2002.
[6] Dunn C., Higgins W.E., “Optimal Gabor filters for texture segmentation”, IEEE Transactions on Image Processing, Vol. 4, No. 7, P. 947-964, 1995.
[7] Chang T., Kuo C., “Texture Analysis and classification with tree structured wavelet transform”, IEEE Transactions on Image Processing, Vol. 2, No. 4, P. 429-441, 1993.
[8] Dr. H.B. Kekre, Sudeep D. Thepade, Tanuja K. Sarode and Vashali Suryawanshi, “Image Retrieval using Texture Features extracted from GLCM, LBG and KPE”, Vol. 2, No. 5, P. 1793-8201, 2010.
[9] M.M. Trivedi, R.M. Haralick, R.W. Conners, and S. Goh, “Object Detection based on Gray Level Cooccurrence”, Computer Vision, Graphics, and Image Processing, Vol. 28, P. 199-219, 1984.
[10] Schad L.R., Bluml S., Zuna I., “MR tissue characterization of intracranial tumors by means of texture analysis”, Magnetic Resonance Imaging, Vol. 11, No. 6, P. 889-896, 1993.
[11] Freeborough P.A., Fox N.C., “MR image texture analysis applied to the diagnosis and tracking of Alzheimer’s disease”, IEEE Transactions on Medical Imaging, Vol. 17, No. 3, P. 475-479, 1998.
[12] Serkawt Khola, “Feature Weighting and Selection A Novel Genetic Evolutionary Approach”, World Academy of Science, Engineering and Technology 73, P. 1007-1012, 2011.
[13] Ping Yao, “Fuzzy Rough Set and Information Entropy Based Feature Selection for Credit Scoring”, IEEE, P. 247-251, 2009.
[14] Pradipta Maji and Sankar K. Pal, “Fuzzy–Rough Sets for Information Measures and Selection of Relevant Genes From Microarray Data”, IEEE, Vol. 40, No. 3, P. 741-752, 2010.
AUTHORS PROFILE
Ms. J. Umamaheswari, Research Scholar in Computer Science, Dr. GRD College, Coimbatore. She has 5 years of teaching experience and two years in research. Her areas of interest include image processing, multimedia and communication. She has more than 3 publications at international level. She is a life member of the professional organization IAENG.
Dr. G. Radhamani, Director in Computer Science, Dr. GRD College, Coimbatore. She has more than 5 years of teaching and research experience. She has a volume of publications at international level. Her areas of interest include mobile computing, e-internet and communication. She is a member of IEEE.