					                                                      (IJCSIS) International Journal of Computer Science and Information Security,
                                                      Vol. 10, No. 9, September 2012

                       A Fast Accurate Network Intrusion Detection System

          Ahmed A. Elngar                     Dowlat A. El A. Mohamed                   Fayed F. M. Ghaleb

    Computer Science Department         Math & Computer Science Department      Math & Computer Science Department
Information & Computer Science Faculty           Science Faculty                         Science Faculty
          Sinai University                     Ain-Shams University                    Ain-Shams University
          El-Arish, Egypt                         Cairo, Egypt                            Cairo, Egypt
        elngar 7@yahoo.co.uk                 dr dowlatkma@yahoo.com                   fmghaleb@yahoo.com

   Abstract: An Intrusion Detection System (IDS) is a valuable tool for the defense-in-depth
of computer networks. However, intrusion detection systems face a number of challenges. One
important challenge is that the input data to be classified lies in a high-dimensional
feature space. In this paper, we propose the PSO-DT intrusion detection system, in which
Particle Swarm Optimization (PSO) is used as a feature selection algorithm to maximize the
detection accuracy of the C4.5 Decision Tree classifier and minimize its model building time.
To evaluate the performance of the proposed PSO-DT IDS, several experiments are conducted on
the NSL-KDD benchmark network intrusion detection dataset. The results show that reducing the
number of features from 41 to 11 increases the detection accuracy to 99.17% and reduces the
model building time to 11.65 sec.

   Keywords: Network Security; Intrusion Detection System; Feature Selection; Particle Swarm
Optimization; Genetic Algorithm; Decision Tree.

                                      I. INTRODUCTION

   Reliance on the Internet and online procedures has increased the potential for attacks
launched over the Internet. Therefore, network security must be addressed to provide secure
information channels. The concept of Intrusion Detection (ID) was proposed by Anderson in
1980 [1]. ID is based on the assumption that the behavior of intruders differs from that of
a legitimate user [2]. The Intrusion Detection System (IDS) has become an essential component
of computer network security. An IDS aims to identify unusual access or attacks in order to
secure internal networks [3]: it looks for potentially malicious activities in network
traffic and raises an alarm whenever a suspicious activity is detected.
   IDS techniques fall into two categories: misuse detection and anomaly detection [4].
Misuse detection uses well-defined patterns of attacks (attack signatures) to identify known
intrusions [5], while anomaly detection creates a normal behavior profile and identifies
intrusions based on significant deviations from this profile [6]. Anomaly detection
techniques have the advantage of identifying unknown attacks [7].
   Several pattern classification techniques have been proposed in the literature for the
development of IDS, including Fuzzy Logic (FL) [7], Neural Networks (NN) [8], Support Vector
Machines (SVM) [5], [7] and Decision Tree (DT) [9]. One of the important problems for IDS is
dealing with data containing a large number of features. High-dimensional data may decrease
the predictive accuracy of the IDS. Therefore, feature selection can serve as a
pre-processing tool for high-dimensional data before solving the classification problem. The
purpose of feature selection is to reduce the number of irrelevant and redundant features.
Different feature selection methods have been proposed to increase the performance of IDS
[10], including Genetic Algorithm (GA) [11], Principal Component Analysis (PCA) [12] and
Information Gain (IG) [13].
   In this paper, we propose an anomaly intrusion detection system that uses Particle Swarm
Optimization (PSO) for feature selection, followed by a C4.5 decision tree classifier. The
effectiveness of the proposed PSO-DT IDS is evaluated by conducting several experiments on
the NSL-KDD network intrusion dataset. The results reveal that the proposed PSO feature
selection based IDS increases the accuracy and reduces the detection time compared with the
other well-known feature selection methods considered. The rest of this paper is organized
as follows: Section II presents an overview of the methods used, including the genetic
algorithm, particle swarm optimization and the decision tree. Section III describes the
NSL-KDD network intrusion dataset. Section IV introduces the proposed PSO-DT IDS. Section V
gives the implementation results and analysis. Finally, Section VI contains the concluding
remarks.

                                      II. AN OVERVIEW

   This section gives an overview of Feature Selection, Genetic Algorithm (GA), the PSO
algorithm and Decision Tree (DT).

A. Feature Selection
   Feature selection is one of the important techniques used for data preprocessing in IDS
[14]. It aims to improve the detection performance through the removal of irrelevant, noisy
and redundant features. Feature selection can be

                                                                 29                              http://sites.google.com/site/ijcsis/
                                                                                                 ISSN 1947-5500

achieved by two different methods: filter methods [15] and wrapper methods [16]. Filter
methods rely on the general characteristics of the data to evaluate the relevance of the
features, without depending on any machine learning algorithm to select the new set of
features [17].
   Wrapper methods, in contrast, exploit a machine learning algorithm and use the
classification performance to evaluate the goodness of features [18]. The genetic algorithm
[19] and the PSO algorithm [20] are chosen for this study, with the aim of employing wrapper
feature selection. A brief description of both algorithms is given below.

B. Genetic Algorithm (GA)
   Holland [21] introduced the Genetic Algorithm (GA), which has been successfully applied
to search and optimization problems. GA is a computational model that simulates the
evolutionary processes found in nature [22].
   The basic idea of a GA is to search a hypothesis space of individuals to find the best
individuals. Each individual is called a chromosome and is composed of a number of genes.
The GA procedure starts by generating an initial population of random chromosomes. The
population is then evolved for a number of generations, during which the goodness of the
chromosomes gradually improves, as measured by the increasing value of the fitness function.
Each generation of GA includes three fundamental operators: selection, crossover and
mutation.
  1) Selection operation: A population is created with a group of random chromosomes. Based
     on a fitness function, the chromosomes in the population are evaluated and selected for
     the next generation.
  2) Crossover operation: Crossover randomly chooses a point in pairs of the selected
     chromosomes and exchanges the remaining segments between them to create new
     chromosomes.
  3) Mutation operation: Mutation randomly changes one or more components of a selected
     chromosome [22].
These three operations continue until a suitable solution has been found or a certain number
of generations have passed. Because GA searches for a global optimum solution, it is well
suited to feature selection problems. Algorithm 1 shows the structure of a simple Genetic
Algorithm (GA).

Algorithm 1 GA algorithm
1: Initialize a population of random individuals.
2: Evaluate the population members based on the fitness function.
3: while the termination condition is not met and the maximum number of generations is not
   reached do
4:    Select parents from the current population.
5:    Apply crossover and mutation to the selected parents.
6:    Evaluate the offspring.
7:    Replace the current population with the offspring.
8: end while
9: Return the best individuals.

C. Particle Swarm Optimization (PSO)
   Particle Swarm Optimization (PSO) is an evolutionary computation technique developed by
Kennedy and Eberhart in 1995 [23]. PSO simulates the social behavior of organisms, such as
bird flocking. PSO is initialized with a random population (swarm) of individuals
(particles), where each particle of the swarm represents a candidate solution in the
d-dimensional search space. To discover the best solution, each particle changes its
searching direction according to two values: the best previous position (the position with
the best fitness value) in its individual memory (pbest), represented by
$P_i = (p_{i1}, p_{i2}, \ldots, p_{id})$; and the global best position gained by the swarm
(gbest), $G_i = (g_{i1}, g_{i2}, \ldots, g_{id})$ [24].
   The d-dimensional position of particle i at iteration t can be represented as:

$$x_i^t = (x_{i1}^t, x_{i2}^t, \ldots, x_{id}^t) \tag{1}$$

   The velocity (the rate of position change) of particle i at iteration t is given by

$$v_i^t = (v_{i1}^t, v_{i2}^t, \ldots, v_{id}^t) \tag{2}$$

   All of the particles have fitness values, which are evaluated by a fitness function:

$$Fitness = \alpha \cdot \gamma_R(D) + \beta \, \frac{|C| - |R|}{|C|} \tag{3}$$

   where $\gamma_R(D)$ is the classification quality of the condition attribute set R
relative to the decision D, $|R|$ is the length of the selected feature subset, and $|C|$ is
the total number of features, so that the second term rewards shorter subsets. The
parameters $\alpha$ and $\beta$ correspond to the importance of classification quality and
subset length, with $\alpha \in [0, 1]$ and $\beta = 1 - \alpha$.
   The particle updates its velocity according to:

$$v_{id}^{t+1} = w \times v_{id}^t + c_1 \times r_1 (p_{id}^t - x_{id}^t) + c_2 \times r_2 (g_{id}^t - x_{id}^t), \quad d = 1, 2, \ldots, D \tag{4}$$
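As a minimal sketch of the update in Eq. (4), the Python function below computes the new velocity of one particle and then moves the particle by that velocity. The function name `update_particle`, the default values w = 0.7 and c1 = c2 = 2.0, and the choice to draw fresh r1 and r2 for every dimension are illustrative assumptions; they are not settings reported in this paper.

```python
import random


def update_particle(x, v, pbest, gbest, w=0.7, c1=2.0, c2=2.0, rng=random):
    """Apply one velocity update (Eq. 4) to a particle, then move it.

    x, v, pbest and gbest are equal-length lists of floats.
    The defaults for w, c1 and c2 are assumed example values.
    """
    new_x, new_v = [], []
    for d in range(len(x)):
        # r1 and r2 are drawn per dimension here (an assumption).
        r1, r2 = rng.random(), rng.random()
        vd = (w * v[d]
              + c1 * r1 * (pbest[d] - x[d])   # pull toward the particle's own best
              + c2 * r2 * (gbest[d] - x[d]))  # pull toward the swarm's best
        new_v.append(vd)
        new_x.append(x[d] + vd)               # move by the updated velocity
    return new_x, new_v
```

When a particle already sits at both its personal best and the global best, the two attraction terms vanish and only the inertia term w × v remains.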


   Here, w is the inertia weight, and $r_1$ and $r_2$ are random numbers distributed in the
range [0, 1]. The positive constants $c_1$ and $c_2$ denote the cognition learning factor
(the private thinking of the particle itself) and the social learning factor (the
collaboration among the particles). $p_{id}^t$ denotes the best previous position found so
far for the ith particle, and $g_{id}^t$ denotes the global best position thus far [25].
   Each particle then moves to a new potential position based on the following equation:

$$x_{id}^{t+1} = x_{id}^t + v_{id}^{t+1}, \quad d = 1, 2, \ldots, D \tag{5}$$

D. Decision Tree (DT)
   The decision tree (DT) introduced by Quinlan [26] is a powerful data mining algorithm for
decision-making and classification problems. DT classifiers can be built from large volumes
of data with many attributes, because the tree size is independent of the dataset size.
   A DT consists of three main components: nodes, leaves, and edges. Each node specifies a
feature in the dataset by which the data is to be partitioned. Each node has a number of
edges, which are labeled according to the possible values of the feature in the parent node.
An edge connects either two nodes or a node and a leaf [27]. The process of constructing a
decision tree is basically a divide-and-conquer process [26]. Construction starts at the
root node, dividing the dataset into subsets, and follows the edges down until a leaf node
representing a class is reached. This process terminates when all the data in the current
subset belong to the same class. The C4.5 algorithm [26] uses the Gain Ratio measure to
choose the best attribute for each decision node while building the decision tree. At each
dividing step, C4.5 chooses the attribute that provides the maximum information gain while
reducing, by normalization, the bias in favor of tests with many outcomes.
   Given probabilities $p_1, p_2, \ldots, p_s$ for the different classes in a dataset, the
entropy is calculated by:

$$H(p_1, p_2, \ldots, p_s) = \sum_{i=1}^{s} p_i \log(1/p_i) \tag{6}$$

   $H(D)$ gives the amount of entropy in the class-based subsets of the dataset. The dataset
is split into s new subsets $S = \{D_1, D_2, \ldots, D_s\}$ using some attribute, where a
subset does not need any further split if all examples in it belong to the same class. The
ID3 algorithm calculates the information gain of a split by Eq. (7) and chooses the split
that provides the maximum information gain.

$$Gain(D, S) = H(D) - \sum_{i=1}^{s} p(D_i) H(D_i) \tag{7}$$

   The C4.5 algorithm improves on the ID3 algorithm by choosing the highest Gain Ratio, which
ensures a larger than average information gain for the splitting purpose [28].

$$GainRatio(D, S) = \frac{Gain(D, S)}{H\left(\frac{|D_1|}{|D|}, \ldots, \frac{|D_s|}{|D|}\right)} \tag{8}$$

                             III. NETWORK INTRUSION DATASET

   For the evaluation of research in network intrusion detection systems, MIT Lincoln
Laboratory collected and distributed the DARPA benchmark datasets [29]. The KDD’99 dataset
is a subset of the DARPA benchmark dataset prepared by Sal Stolfo and Wenke Lee [30]. The
KDD’99 training dataset consists of about five million connection records derived from
compressed binary TCP dump data covering seven weeks of network traffic. Each KDD’99
training record contains 41 features (e.g., protocol type, service, and flag) and is labeled
as either normal or a specific attack type. The training set contains a total of 22 attack
types, with an additional 17 attack types appearing in the testing set only. The attacks
belong to four categories:
   1) DoS (Denial of Service), e.g. Neptune, Smurf, Pod and Teardrop.
   2) U2R (user-to-root: unauthorized access to root privileges), e.g. Buffer-overflow,
      Load-module, Perl and Spy.
   3) R2L (remote-to-local: unauthorized access to a local machine from a remote machine),
      e.g. Guess-password, Ftp-write, Imap and Phf.
   4) Probe (probing: information gathering attacks), e.g. Port-sweep, IP-sweep, Nmap and
      Satan.
   Leung and Leckie [31] reported two problems in the KDD’99 dataset that affect the
evaluation of intrusion detection systems:
   1) The 10% portion of the full KDD’99 dataset contains only two types of DoS attacks
      (Smurf and Neptune). These two types constitute over 71% of the testing dataset, which
      strongly skews the evaluation.
   2) Since these attacks consume large volumes of traffic, they are easily detectable by
      other means, and there is no need to use anomaly detection systems to find them.
   To solve these problems, a new dataset, NSL-KDD [32], has been suggested. The NSL-KDD
dataset consists of selected records of the complete KDD’99 dataset.

            IV. PROPOSED ANOMALY NETWORK INTRUSION DETECTION SYSTEM: PSO-DT IDS

   The proposed hybrid anomaly intrusion detection system combines the advantages of PSO
feature selection with the C4.5 DT classifier to detect and classify network intrusions into
five outcomes: normal and four categories of intrusions. It consists of the following three
fundamental phases: (1) preprocessing, (2) PSO-based feature selection and (3)
classification using C4.5 DT. Figure 1 shows the overall architecture of the proposed PSO-DT
intrusion detection system.
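As a worked illustration of the entropy, information gain and gain ratio of Eqs. (6)-(8) in Section II-D, the sketch below evaluates a candidate split from class labels alone. The function names are hypothetical, the base-2 logarithm is an assumption (the equations do not fix the base), and the code is not taken from the C4.5 implementation used in this paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Eq. (6): sum of p_i * log(1 / p_i) over the class frequencies."""
    n = len(labels)
    return sum((c / n) * math.log2(n / c) for c in Counter(labels).values())


def gain(labels, subsets):
    """Eq. (7): entropy of D minus the size-weighted entropy of each D_i."""
    n = len(labels)
    return entropy(labels) - sum(len(s) / n * entropy(s) for s in subsets if s)


def gain_ratio(labels, subsets):
    """Eq. (8): gain divided by the entropy of the subset-size proportions."""
    n = len(labels)
    split_info = sum((len(s) / n) * math.log2(n / len(s)) for s in subsets if s)
    return gain(labels, subsets) / split_info if split_info > 0 else 0.0


# Toy data: 4 normal and 4 attack records (made-up labels).
labels = ["normal"] * 4 + ["attack"] * 4
perfect_split = [["normal"] * 4, ["attack"] * 4]
useless_split = [["normal", "normal", "attack", "attack"]] * 2
```

On this toy data a split that separates the two classes perfectly has a gain ratio of 1, while a split that leaves the class mix unchanged has a gain ratio of 0, so the first attribute would be chosen.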


[Figure 1: flow diagram. NSL-KDD Dataset (41 features) -> Preprocessing (convert symbolic
features to numeric values; convert attack names) -> Preprocessed Dataset (41 features) ->
Feature Selection -> Reduced Dataset (11 features) -> Intrusion Detection (C4.5 classifier)
-> Classified Dataset]

Figure 1. The overall architecture of the proposed PSO-DT intrusion detection system

   Preprocessing phase: The following three preprocessing stages are applied to the NSL-KDD
dataset:
   1) Symbolic features are converted to numeric values.
   2) Each attack name is converted to its category: 0 for Normal, 1 for DoS (Denial of
      Service), 2 for U2R (user-to-root), 3 for R2L (remote-to-local), and 4 for Probe.
   3) Normalization is applied, since the data have significantly varying resolutions and
      ranges. The feature values are scaled to the range [0, 1] using the following
      equation:

$$X_n = \frac{X - X_{min}}{X_{max} - X_{min}} \tag{9}$$

      where $X_{min}$ and $X_{max}$ are the minimum and maximum values of a specific feature
      and $X_n$ is the normalized value.
   PSO Feature Selection Phase: In this paper, the PSO algorithm [23] is used as a feature
selection method to reduce the dimensionality of the NSL-KDD dataset. PSO reduces the
NSL-KDD dataset from 41 features to 11 features, a reduction of about 73.2% of the feature
dimension space.
   At every iteration of the PSO algorithm, each particle $X_i$ is updated using the two
best values pbest and gbest, where pbest denotes the best solution the particle $X_i$ has
achieved so far and gbest denotes the global best position so far. Algorithm 2 shows the
main steps of PSO-based feature selection.

Algorithm 2 PSO algorithm-based feature selection
Input:
   m: the swarm size.
   c1, c2: positive acceleration constants.
   w: inertia weight.
   MaxGen: maximum generation.
   MaxFit: fitness threshold.
Output:
   Global best position (best features of the NSL-KDD dataset)
 1: Initialize a population of particles with random positions and velocities on the
    d = 1, ..., 41 NSL-KDD feature dimensions; set fitness(pbest_i) = 0, Gbest = 0, Iter = 0.
 2: while Iter < MaxGen and Gbest < MaxFit do
 3:    for i = 1 to number of particles m do
 4:       fitness(i) = Evaluate(i)
 5:       if fitness(i) > fitness(pbest_i) then
 6:          fitness(pbest_i) = fitness(i)
 7:          Update p_id = x_id
 8:       end if
 9:       if fitness(i) > Gbest then
10:          Gbest = fitness(i)
11:          Update gbest = x_i
12:       end if
13:       for each dimension d do
14:          Update the velocity vector (Eq. 4).
15:          Update the particle position (Eq. 5).
16:       end for
17:    end for
18:    Iter = Iter + 1
19: end while
20: Return the global best position.

   C4.5 DT Classification Phase: A decision tree classifier is built using the C4.5
algorithm [26]. The reduced 11 features output by PSO are then passed to the C4.5 decision
tree classifier, which assigns each record to one of the five categories: Normal, DoS, U2R,
R2L and Probe.

                        V. IMPLEMENTATION RESULTS AND ANALYSIS

   The proposed PSO-DT intrusion detection system is evaluated using the NSL-KDD dataset,
from which 59586 records are randomly taken. All experiments were performed on an Intel Core
i3 2.13 GHz processor with 2 GB of RAM. The experiments were implemented in a Java language
environment with ten-fold cross-validation.
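To make the ten-fold cross-validation protocol concrete, the following sketch splits record indices into ten folds and yields one train/test pair per fold. It is only an illustration in Python rather than the Java environment used for the experiments; the function name `k_fold_indices`, the fixed seed, and the use of a plain (non-stratified) shuffle are assumptions, since the exact splitting procedure is not described here.

```python
import random


def k_fold_indices(n_records, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation.

    Each record index appears in exactly one test fold. The shuffle is
    not stratified by class; that is an assumption of this sketch.
    """
    idx = list(range(n_records))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i in range(k):
        test = folds[i]
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, test


# 59586 is the number of NSL-KDD records reported above.
splits = list(k_fold_indices(59586, k=10))
```

Each of the ten models is trained on nine folds and tested on the remaining one, and the reported measurements are then averaged over the ten test folds.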


A. Performance evaluation

   The detection effectiveness of the proposed PSO-DT IDS is measured in terms of TP Rate,
FP Rate and F-measure, which are calculated from the confusion matrix. The confusion matrix
is a square matrix whose columns correspond to the predicted classes and whose rows
correspond to the actual classes. Table I gives the confusion matrix, which shows the four
possible prediction outcomes [33].

                             Table I
                        CONFUSION MATRIX

                             Predicted Class
             Actual Class    Normal    Attack
             Normal          TN        FP
             Attack          FN        TP

  where,
  True negatives (TN): the number of normal events correctly labeled as normal.

  False positives (FP): the number of normal events incorrectly predicted as attacks.

  False negatives (FN): the number of attack events incorrectly predicted as normal.

  True positives (TP): the number of attack events correctly predicted as attacks.

$$TP\ Rate = \frac{TP}{TP + FN} \tag{10}$$

$$FP\ Rate = \frac{FP}{FP + TN} \tag{11}$$

$$F\text{-}measure = \frac{2 \cdot TP}{(2 \cdot TP) + FP + FN} \tag{12}$$

B. Experiments and analysis

   The classification performance measurements are shown in Tables II and III. Table II
shows the accuracy measurements achieved for the C4.5 classifier using the full dimension

                                  Table III
             PSO-DT DETECTION MEASUREMENTS (11-DIMENSION FEATURE)

             Class name    TP Rate    FP Rate    F-Measure
             Normal        0.989      0.006      0.991
             DoS           0.999      0.002      0.998
             U2R           0.99       0          0.992
             R2L           0.963      0.003      0.954
             Probe         0.993      0.001      0.994

   From Tables II and III, it is clear that the classification accuracy achieved using PSO
as a feature selection method with the C4.5 classifier is better than that of C4.5 as a
standalone classifier.
   We compared the PSO feature selection method with a well-known feature selection method,
the genetic algorithm (GA). Table IV shows the classification accuracy of applying the GA
feature selection algorithm with the C4.5 classifier.

                                  Table IV
             GA-DT DETECTION MEASUREMENTS (12-DIMENSION FEATURE)

             Class name    TP Rate    FP Rate    F-Measure
             Normal        0.99       0.01       0.988
             DoS           0.999      0.003      0.997
             U2R           0.985      0.001      0.985
             R2L           0.917      0.001      0.943
             Probe         0.991      0.001      0.992

   Table V compares the detection accuracy, number of features and model building time of
the C4.5, GA-DT and proposed PSO-DT intrusion detection systems. Table V illustrates that
the proposed PSO-DT IDS gives better detection performance (99.17%) than the C4.5 and GA-DT
IDS. The proposed PSO-DT IDS also reduces the feature space from 41 to 11 features and
reduces the model building time to 11.65 sec, which is important for real-time network
applications.

                                   Table V
          TESTING ACCURACY, FEATURES NUMBER AND TIMING COMPARISON

     System             Test accuracy    Features number    Model building time
     C4.5 DT            98.45%           41                 64.71 sec.
     GA-DT              98.92%           12                 12.26 sec.
     Proposed PSO-DT    99.17%           11                 11.65 sec.
data (41 features). While, Table III gives the accuracy                                           VI. C ONCLUSIONS
measurements for the proposed anomaly PSO-DT network                         In this paper we proposed a Fast accurate anomaly net-
intrusion detection system with reduced dimension feature                 work intrusion detection system (PSO-DT). Where, PSO
(11 features).                                                            algorithm is used as a feature selection method and then
                                                                          classify the reduced data by C4.5 decision tree classifier.
                         Table II
                                                                          The NSL-KDD network intrusion benchmark was used for
                                                                          conducting several experiments for testing the effectiveness
          Class name    TP Rate   FP Rate     F-Measure                   of the proposed PSO-DT network intrusion detection system.
            Normal       0.982     0.012        0.983                     Also, a comparative study with applying GA feature selec-
             DoS         0.998     0.002        0.997
             U2R         0.967     0.003        0.958                     tion with C4.5 decision tree classifier was accomplished.
             R2L         0.932     0.003        0.935                     The results obtained showed the adequacy of the proposed
             Probe       0.983     0.002        0.985                     PSO-DT IDS of reducing the number of features from 41

                                                                     33                                  http://sites.google.com/site/ijcsis/
                                                                                                         ISSN 1947-5500
                                                         (IJCSIS) International Journal of Computer Science and Information Security,
                                                         Vol. 10, No. 9, September 2012

to 11, which raises the detection accuracy to 99.17% and reduces the model
building time to 11.65 sec.
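As an illustration of the detection measures defined in Eqs. (10)-(12), the following Python sketch computes the TP rate, FP rate and F-measure from the four confusion matrix entries. It is a minimal sketch, not the implementation used in the experiments: the class name ConfusionMatrix, the helper confusion_from_labels, the label strings "normal" and "attack", and the example counts are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


@dataclass(frozen=True)
class ConfusionMatrix:
    """Binary confusion matrix with 'attack' as the positive class (hypothetical helper)."""
    tn: int  # normal events labeled as normal
    fp: int  # normal events predicted as attacks
    fn: int  # attack events predicted as normal
    tp: int  # attack events predicted as attacks

    def tp_rate(self) -> float:
        # Eq. (10): TP / (TP + FN)
        return _ratio(self.tp, self.tp + self.fn)

    def fp_rate(self) -> float:
        # Eq. (11): FP / (FP + TN)
        return _ratio(self.fp, self.fp + self.tn)

    def f_measure(self) -> float:
        # Eq. (12): 2*TP / (2*TP + FP + FN)
        return _ratio(2 * self.tp, 2 * self.tp + self.fp + self.fn)


def confusion_from_labels(actual, predicted, positive="attack"):
    """Count TN, FP, FN and TP from two equal-length label sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    tn = fp = fn = tp = 0
    for a, p in zip(actual, predicted):
        if a == positive:
            if p == positive:
                tp += 1
            else:
                fn += 1
        else:
            if p == positive:
                fp += 1
            else:
                tn += 1
    return ConfusionMatrix(tn=tn, fp=fp, fn=fn, tp=tp)


if __name__ == "__main__":
    # Made-up counts, chosen only to exercise the formulas.
    cm = ConfusionMatrix(tn=90, fp=10, fn=5, tp=95)
    print(f"TP rate   = {cm.tp_rate():.3f}")
    print(f"FP rate   = {cm.fp_rate():.3f}")
    print(f"F-measure = {cm.f_measure():.3f}")
```

For the multi-class results reported in Tables II and IV, the same measures would presumably be computed per class in a one-versus-rest fashion, for example by passing positive="DoS" to confusion_from_labels; this is an assumption about the evaluation procedure, which the paper does not spell out.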