					                                                           (IJCSIS) International Journal of Computer Science and Information Security,
                                                           Vol. 10, No. 4, April 2012




                       Machine Learning Techniques for
                         Intrusion Detection System

Shaik Akbar
Research Scholar, Associate Professor,
SVIET, Nadamuru.
akbarphd2008@gmail.com

Dr. J.A. Chandulal
Professor,
GITAM University, Visakhapatnam.
chandulal@gitam.edu

Dr. K. Nageswara Rao
Professor & H.O.D,
P.V.P.S.I.T, Vijayawada.
hodcse@pvpsiddhartha.ac.in



Abstract: With the fast expansion of computer networks, the number of threats has grown extensively. An Intrusion Detection System (IDS) recognizes attacks and protects the system. This paper presents a Genetic Algorithm and the C4.5 algorithm for recognizing attack-type connections. The two algorithms consider different features, such as duration, protocol type and hot, in creating a rule set. Both algorithms are trained on the KDDCup99 Data Set in order to create a set of rules which, applied in an Intrusion Detection System, classify different kinds of attacks. Our experimental results show a high detection rate and a low false alarm rate for Denial of Service (DoS), Remote to Local (R2L), User to Root (U2R) and Probe attacks. These experimental results are compared with G.A based IDS and C4.5 based IDS.

Keywords: IDS, KDDCup99 Data Set, Genetic Algorithm, DoS, R2L, U2R, Probe.

I. INTRODUCTION

As computer technology develops and computer crime keeps increasing, the apprehension of such violations proves more and more difficult and demanding. To a great extent, security mechanisms are designed to prevent unauthorized access to system resources and data. To date, however, absolute prevention of security breaches seems unrealistic. We must therefore make an effort to detect intrusions as and when they happen, so that action can be initiated to repair the damage and prevent further harm. Over the years, intrusion detection has become a major area of research in computer science, and many innovative techniques have been put to use in these systems.

The last ten years have witnessed the growth of the information revolution. The internet has changed our lives more than ever before. It offers countless possibilities and opportunities; nevertheless, it also brings the risk of harmful intrusions. Outsiders and insiders are the two categories of intruders. Outside intruders come from outside your network and are likely to attack your external presence. They are likely to go around the firewall and attack machines on the internal network. Insiders, in comparison, are legitimate users of your internal network who misuse their privileges, impersonate higher-privileged users, or use proprietary information to gain access from external sources.

Intrusion detection systems are designed to monitor network traffic and to determine whether an intrusion has occurred. Signature based and anomaly based detection are the two primary methods. The signature based method, also known as misuse detection, checks whether a specific signature that signals an intrusion is matched. Network traffic is scanned for specific signatures as it passes by, which makes these systems similar to virus detection systems: they can detect many or all known patterns of attack, but they are of little use against attack methods that are not yet known. Most popular intrusion detection systems fall into this category. An IDS for misuse detection uses a database of traffic or activity patterns related to known attacks to identify and categorize harmful activity on the network. Anomaly based systems try to map events to the point where they "learn" what is normal and can then detect an anomaly which may signal an intrusion. Anomaly detection techniques assume that all intrusive activities are necessarily anomalous. This holds provided that a profile of normal activity can be established for the system.

The KDDCup99 Data Set is used for intrusion detection, and the resulting model is checked on this data set. Artificial Intelligence techniques are a way to construct an accurate IDS. In this paper, misuse, anomalies and key patterns are identified using rule based, Genetic Algorithm and C4.5 algorithm techniques.




                                                                          85                                http://sites.google.com/site/ijcsis/
                                                                                                            ISSN 1947-5500


II. RELATED WORK

Selvakani [1]: This technique detects attacks using a rule set built with the help of a Genetic Algorithm. It develops rules for R2L, U2R, Probe and DoS attacks. On average, the method achieves a low detection rate.

Bridges [2]: This technique combines fuzzy data mining procedures and a Genetic Algorithm to identify network anomalies and misuse. Most existing Genetic Algorithm based IDSs do not recognize the attributes of the network audit data accurately, although the features play a main role in intrusion detection. The author proposed introducing fuzzy numerical functions. The technique uses a Genetic Algorithm to find the best parameters of the fuzzy functions and to choose the relevant network features.

Crosbie [3]: Network anomalies can be identified by applying multiple agent techniques and Genetic Programming. Genetic Programming is used to find the set of agents that monitor network activity, where each agent examines one parameter of the network audit data. The use of several small independent agents is an advantage of this technique, while the communication between the agents is a problem.

Chittur [4]: Proposed a Genetic Algorithm for anomaly detection. Random numbers were generated using the Genetic Algorithm. A threshold value was established, and any certainty value above this threshold was classified as a malicious attack. The experimental results verified that the GA effectively produced an accurate empirical behavior model from training data. The main drawbacks of this approach are that the threshold value is difficult to establish and that it leads to a high false alarm rate when used to detect unknown or new attacks.

Xiang et al. [5] state that intrusion detection is the process of monitoring the events happening in a computer system or network and evaluating them for signs of intrusions. For correct intrusion detection, we must have reliable and complete data about the activities of the target system. Similarly, routers and firewalls provide event logs for network activity. These logs might hold simple information, such as network connection openings and closings, or a complete record of each packet that appeared on the wire.

III. ENHANCED GENETIC ALGORITHM APPROACH TO IDS

Figure 1: Proposed Genetic Algorithm Intrusion Detection System

A. Learning and Detection Phase: A new generation is calculated by applying the genetic operators to the current generation until the most appropriate individual is reached. The most suitable individual is then used in the learning and testing phases.
   Learning Phase: The GA based IDS is trained on the learning data set.
   Detection Phase: The performance is calculated with the testing data set.

B. Feature Extraction and Pre-processing Phase: The symbolic features are translated into numerical ones, the data set is normalized, and the most appropriate features are selected. Two separate learning and testing data sets are selected from KDDCUP99.

1) Training and Testing Phase using GA

The proposed GA based intrusion detection method has two parts: a learning phase and a detection phase. In the learning phase, a set of classification rules is generated from network audit data using the GA. In the detection phase, the generated rules are used to classify incoming network connections in a real-time environment. Once the rules are formed, intrusion detection is simple and efficient.

The fitness function used to determine the fitness value of an individual rule is

Step 1) Let $x_i$ be the binary string value of the i-th string

Step 2) Let $f(x_i) = x_i^2$

Step 3) Compute $\sum_{i=1}^{n} f(x_i)$






where n is the number of strings, $f(x_i)$ is the fitness of the i-th string, and i is the index of the string.

Step 4) Evaluate Fitness $= f(x_i) \times 100 \,/\, \sum_{i=1}^{n} f(x_i)$
        where $f(x_i)$ is the fitness of an individual string and $\sum_{i=1}^{n} f(x_i)$ is the sum of the fitness of all individuals in the population.

Finally, it can be written as

                 Fitness = f(x) / f(sum)                (1)

where f(x) is the fitness of entity x and f(sum) is the total fitness of all entities.

Rank Selection is similar to proportional selection. The individuals of the population are sorted and ranked based on their fitness value.

                 Ps(i) = r(i) / rsum                    (2)

where Ps(i) is the probability of selecting individual i,
      r(i) is the rank of individual i,
      rsum is the sum of the ranks of all individuals.

The classified data set is collected from the Genetic Algorithm, and the rules are applied to detect errors.

2) Rule set generation

Genetic Algorithms can produce simple rules for network traffic that differentiate normal network connections from anomalous connections. Anomalous connections indicate possible intrusions. The rules stored in the rule base are typically of the following form:

                 if {condition} then {action}

IV. PROPOSED DETECTION ALGORITHM OVERVIEW

The listing in this section shows the main steps of the detection algorithm as well as the training process. It first generates the initial population and loads the network audit data. The initial population is then evolved for a number of generations. In every generation, the quality of each rule is first calculated, and then a number of best-fit rules are selected. The training procedure starts by randomly generating an initial population of rules (Step 1). Step 2 counts the total number of records in the audit data. Step 3 computes the fitness of each rule and selects the best-fit rules into the new population. Step 4 performs rank selection of the individuals. Steps 5-7 apply the crossover and mutation operators to every rule in the new population. Step 8 places the top best chromosomes into the new population. Finally, Step 9 checks whether to stop the training process or to go on to the next generation.

Key Steps of the Detection Algorithm

Algorithm: Rule set formation with Genetic Algorithm

Input: Number of generations, set of binary strings, population size, crossover probability, mutation probability.

Output: A set of selected features.

Step 1) Initialize the population randomly
Step 2) Count the number of records in the training set
Step 3) Estimate Fitness = f(x) / f(sum)
        where f(x) is the fitness of individual x and f(sum) is the total fitness of all individuals
Step 4) Rank Selection Ps(i) = r(i) / rsum
        where Ps(i) is the probability of selecting individual i,
              r(i) is the rank of individual i,
              rsum is the sum of the ranks of all individuals
Step 5) For each chromosome in the new population
Step 6) Apply the regular crossover operator to the chromosome
Step 7) Apply the mutation operator to the chromosome
Step 8) Choose the top best 60% of chromosomes into the new population
Step 9) If the number of generations is not reached, go to Step 3.

V. EXPERIMENTAL RESULTS

With the above implementation we successfully generated rules that classify the stated attack connections, applying the Genetic Algorithm to the selected feature set and finding the fitness value for each generation.

This section reports the detection rate and the false positive rate for four different attack categories. The first experiment used 10 out of 41 features, the second experiment used 7 out of 41 features, the third experiment used 9 out of 41 features and the fourth experiment used 11 out of 41 features.
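As an illustration of Steps 1 to 9 of the rule set formation algorithm, together with equations (1) and (2), the following is a minimal Python sketch and not the authors' implementation. It uses the toy fitness f(x) = x^2 on binary strings from the fitness steps rather than a fitness computed on KDDCup99 records. The function names, the default parameter values, the single-point crossover, and the reading of Step 8 as "keep the best 60% of the current population and fill the rest with offspring" are all assumptions. It also assumes that rsum is the sum of ranks, so that the selection probabilities add up to one.

```python
import random


def fitness_share(values):
    """Eq. (1): share of each individual in the total fitness, with f(x) = x**2."""
    raw = [x * x for x in values]
    total = sum(raw)
    if total == 0:  # all strings decode to 0: fall back to equal shares
        return [1.0 / len(raw)] * len(raw)
    return [r / total for r in raw]


def rank_selection_probs(fitnesses):
    """Eq. (2): Ps(i) = r(i) / rsum, where rank 1 is the least fit individual."""
    order = sorted(range(len(fitnesses)), key=lambda i: fitnesses[i])
    ranks = [0] * len(fitnesses)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    rsum = sum(ranks)  # assumption: rsum is the sum of all ranks
    return [r / rsum for r in ranks]


def crossover(a, b, rng):
    """Single-point crossover of two equal-length bit strings (an assumed operator)."""
    point = rng.randint(1, len(a) - 1)
    return a[:point] + b[point:]


def mutate(bits, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return "".join(
        ("1" if b == "0" else "0") if rng.random() < rate else b for b in bits
    )


def evolve(pop_size=20, n_bits=8, generations=30,
           crossover_prob=0.8, mutation_prob=0.01, keep=0.6, seed=0):
    rng = random.Random(seed)
    # Step 1: random initial population of binary strings
    population = ["".join(rng.choice("01") for _ in range(n_bits))
                  for _ in range(pop_size)]
    for _ in range(generations):                       # Step 9: generation loop
        values = [int(c, 2) for c in population]
        fitness = fitness_share(values)                # Step 3
        probs = rank_selection_probs(fitness)          # Step 4
        children = []
        for _ in range(pop_size):                      # Steps 5-7
            a, b = rng.choices(population, weights=probs, k=2)
            child = crossover(a, b, rng) if rng.random() < crossover_prob else a
            children.append(mutate(child, mutation_prob, rng))
        # Step 8 (assumed reading): keep the best 60%, fill up with offspring
        n_keep = int(keep * pop_size)
        ranked = sorted(population, key=lambda c: int(c, 2) ** 2, reverse=True)
        population = ranked[:n_keep] + children[:pop_size - n_keep]
    return max(population, key=lambda c: int(c, 2))
```

Calling `evolve()` returns the fittest bit string of the final population. In an actual IDS the fitness would instead score how well a candidate rule classifies the training connections.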

Table 1: Enhanced Rule based GA - Detection Rate for DoS, R2L, U2R, Probe attacks

Sl. No   Attack Category        Detection Rate (%)   False Positive (%)
1        DoS                    93.70                0.063
2        R2L                    88.85                0.112
3        U2R                    92.50                0.075
4        Probe                  95.33                0.055
         Average Success Rate   92.595               0.076

Table 2: Overall Performance Comparisons of G.A VS Enhanced G.A

Sl. No   Attack Category        Detection Rate (%)   Detection Rate (%)   Detection Rate (%)   False Positive (%)
                                (Hoffman)            (Selvakani)          (Enhanced G.A)       (Enhanced G.A)
1        DoS                    82.9                 86.7                 93.70                0.063
2        Probe                  75.3                 79.1                 95.33                0.112
3        U2R                    73.1                 71.2                 92.50                0.075
4        R2L                    85.3                 83.3                 88.85                0.055
         Average Success Rate   79.15                80.075               92.595               0.076

The graph in Figure 2 shows the performance of G.A and Enhanced G.A in terms of detection rate for the DoS, R2L, U2R and Probe attacks.

Figure 2: Performance of G.A and Enhanced G.A (detection rate in % for the attack categories DoS, Probe, U2R and R2L; series: Hoffman, Selvakani, Enhanced G.A)

VI. DECISION TREE

A decision tree model consists of a set of rules for dividing a large heterogeneous population into smaller, more homogeneous groups with respect to a particular target variable. A decision tree may be painstakingly constructed by hand, in the manner of Linnaeus and the generations of taxonomists that followed him, or it may be grown automatically by applying any one of several decision tree algorithms to a model set comprised of pre-classified data.

The C4.5 algorithm is Quinlan's extension of his own ID3 algorithm for creating decision trees. Just as with CART, the C4.5 algorithm recursively visits each decision node, selecting the best split, until no further splits are possible. However, there are interesting differences between CART and C4.5:

- Unlike CART, the C4.5 algorithm is not limited to binary splits. Whereas CART always produces a binary tree, C4.5 creates a tree of more variable shape.

- For categorical features, C4.5 by default creates a separate branch for each value of the categorical attribute. This may result in more "bushiness" than preferred, since some values may have low frequency or may logically be connected with other values.

- The C4.5 technique for estimating node homogeneity is quite different from the CART method and is examined in detail below.
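The difference between a C4.5-style multiway split and a CART-style binary split on a categorical feature can be sketched as follows. This is only an illustration of the first two points in the list: the helper names and the four toy connection records are made up, and `protocol_type` is used because it is one of the features named in the abstract.

```python
from collections import defaultdict


def multiway_split(records, attribute):
    """C4.5-style split: one branch per distinct value of a categorical attribute."""
    branches = defaultdict(list)
    for record in records:
        branches[record[attribute]].append(record)
    return dict(branches)


def binary_split(records, attribute, value):
    """CART-style split: records with the given value versus all other records."""
    left = [r for r in records if r[attribute] == value]
    right = [r for r in records if r[attribute] != value]
    return left, right


# Toy records, invented for illustration only (not taken from KDDCup99).
connections = [
    {"protocol_type": "tcp", "label": "normal"},
    {"protocol_type": "udp", "label": "normal"},
    {"protocol_type": "icmp", "label": "dos"},
    {"protocol_type": "tcp", "label": "dos"},
]

branches = multiway_split(connections, "protocol_type")    # three branches
tcp, others = binary_split(connections, "protocol_type", "tcp")  # two branches
```

With many rare attribute values the multiway split produces many small branches, which is the "bushiness" mentioned in the list.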






                      VII.      C4.5 ALGORITHM                                      IX.        EXISTING ALGORITHM: INFORMATION GAIN

                                                                              Let S be a set of training set samples with their matching labels.
Algorithm: Produce a decision tree from the given training data.
                                                                              Assume there are m classes and the training set contains Si samples
Input: Training samples, represented by distinct/ continuous                  of class „I„ and „s‟ is the total number of samples in the training set.
attributes; the set of applicant attributes, attribute-list.
                                                                              Estimated information necessary to classify a given sample is
Output: A decision tree                                                       calculated by:
                                                                                                   i=1
Method:
                                                                                 I(S1,S2,………Sm) = - ∑ Si / S log2Si                                (1)
                                                                                                    m
1) Generate a node N
                                                                              A feature F with values {f1,f2, ………fv} can divide the training set
2) If samples are all of the same class, C, then
                                                                              into v subsets
3) Return N as a leaf node labeled with the class C
                                                                              Furthermore let Sj contain Sij samples of class i. Entropy of the
                                                                              feature F is
4) If attribute-list is empty then
                                                                                      V
5) Return N as a leaf node labeled with the most common class in
                                                                                E(F)= ∑ S1j + …….+Smj / S * I(S1j,S2j,…..Smj)                      (2)
samples; (majority voting)
                                                                                     j=1
6) Choose test-attribute, the attribute among attribute-list with the
highest information gain ratio;
                                                                              Information gain for F can be calculated as:
7) Label node N with test-attribute;
                                                                                Gain(F) = I( S1,S2, …… ,Sm) - E(F)                                  (3)
8) For every identified value ai of test-attribute

9) Produce a branch from node N for the condition test-attribute = ai;
                                                                              In this study, information gain is considered for class labels by using
10) Let si be the set of samples in samples for which test-attribute =        a binary discrimination for each class. That is, for every class, a
ai;
                                                                              dataset example is considered in-class, if it has the equal label; out-
11) If si is empty then                                                       class, if it has a different label. Accordingly as opposed to calculating
                                                                              one information gain as a general assess on the importance of the
12) Attach a leaf labeled with the most common class in samples;
                                                                              feature for all classes, so calculate an information gain for each class.
13) Else attach the node returned by Generate_decision_tree (si,
                                                                              Thus, this signifies how well the feature can classify the given class
attribute-list).
                                                                              (i.e. normal or an attack type) from other classes.


                   VIII.     ATTRIBUTE SELECTION

The information gain measure used in step (6) of the Enhanced C4.5
algorithm above selects the test feature at each node in the tree.
Such a measure is referred to as an attribute selection measure or a
measure of the goodness of split. The attribute with the maximum
information gain (or greatest entropy reduction) is selected as the
test feature for the current node. This feature minimizes the
information required to classify the samples in the resulting
partitions. Such an information-theoretic approach minimizes the
expected number of tests needed to classify an object and guarantees
that a simple tree is created.

         X.     PROPOSED ENHANCEMENT: GAIN RATIO CRITERION

The notion of information gain introduced earlier tends to favor
attributes that have a large number of values. For example, if we have
an attribute D that has a distinct value for each record, then
Info(D,T) is 0, and thus Gain(D,T) is maximal. To compensate for this,
it was suggested in [6] to use the following ratio in place of gain.
Split info is the information due to the split of T, on the basis of
the value of the categorical attribute D, into subsets T_1, ..., T_n,
and is defined by

$$\mathrm{SplitInfo}(D,T) = -\sum_{i=1}^{n} \frac{|T_i|}{|T|} \log_2 \frac{|T_i|}{|T|} \qquad (4)$$
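As a rough illustration of the quantities defined so far, the sketch below computes the information gain of Section VIII and the split information of equation (4) on a small toy data set. The records, the attribute names `record_id` and `protocol`, and the helper functions are assumptions made for this example only; they are not taken from the paper's implementation.

```python
from collections import Counter
from math import log2


def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(items)
    return -sum((n / total) * log2(n / total) for n in Counter(items).values())


def partition(values, labels):
    """Group the class labels by the attribute value of each record."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    return list(groups.values())


def info_gain(values, labels):
    """Gain(D,T) = Info(T) - Info(D,T) for one categorical attribute D."""
    total = len(labels)
    info_d = sum(len(g) / total * entropy(g) for g in partition(values, labels))
    return entropy(labels) - info_d


def split_info(values):
    """SplitInfo(D,T) of equation (4): entropy of the partition sizes."""
    return entropy(values)


# Hypothetical toy data: 4 normal and 4 attack records.
labels = ["normal"] * 4 + ["attack"] * 4
record_id = list(range(8))  # a distinct value for every record
protocol = ["tcp", "tcp", "tcp", "udp", "icmp", "icmp", "icmp", "udp"]

gain_id = info_gain(record_id, labels)       # maximal: every subset is pure
gain_protocol = info_gain(protocol, labels)  # smaller, but a more useful test
print(gain_id, gain_protocol)
print(split_info(record_id), split_info(protocol))
```

On this toy data the attribute with one value per record obtains the maximal gain, while its split information is also the largest of the two attributes. This is the bias that the ratio introduced next is meant to correct.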






And the gain ratio is then calculated by

$$\mathrm{GainRatio}(D,T) = \frac{\mathrm{Gain}(D,T)}{\mathrm{SplitInfo}(D,T)} \qquad (5)$$

The gain ratio states the proportion of useful information created by
the split, i.e., the part that is helpful for classification. If the
split is near-trivial, split information will be small and this ratio
will be unstable. To avoid this, the gain ratio criterion selects a
test to maximize the ratio above, subject to the constraint that the
information gain must be large, at least as large as the average gain
over all tests examined.

       XI.       CLASSIFYING AND DETECTING ANOMALIES

Misuse detection is done by applying rules to the test data. The test
data is collected from the KDDCUP data set and stored in a database.
The rules are applied as SQL queries to the database. This classifies
the data under the following attack categories:

1) DoS (Denial of Service)

2) Probe

3) U2R (User to Root)

4) R2L (Remote to Local)

The C4.5 algorithm creates a decision tree, starting from the root
node, by selecting the remaining feature with the highest information
gain as the test for the current node. In this work, Enhanced C4.5,
considered a later version of the C4.5 algorithm, selects the remaining
attribute with the highest information gain ratio as the test for the
current node and is used to build the decision trees for
classification. From Table 3 it is clear that Enhanced C4.5
outperforms the classical C4.5 algorithm. The split information and
the gain ratio are computed as in equations (4) and (5). As stated
above, Enhanced C4.5 maximizes this ratio subject to the constraint
that the information gain is at least as great as the average gain
over all tests examined, so that near-trivial splits with small split
information are not selected.

       XII.      OVERALL PERFORMANCE FOR C4.5 ALGORITHM VS
                      ENHANCED C4.5 ALGORITHM

Table 3 shows the overall detection rate and false positive rate for
the C4.5 and Enhanced C4.5 algorithms. Enhanced C4.5 gives a higher
detection rate than C4.5 for the DoS, Probe, R2L and U2R categories.

   Table 3: Overall detection rate and false positive rate for C4.5
                    and Enhanced C4.5 algorithm

Sl.  Attack                Detection Rate (%)  Detection Rate (%)  False Positive (%)
No   Category              (C4.5)              (Enhanced C4.5)     (Enhanced C4.5)

1    DoS                   90.6                92.92               0.085
2    Probe                 84.0                88.29               0.152
3    U2R                   83.6                84.00               0.220
4    R2L                   53.7                66.91               0.398
     Average Success Rate  77.975              83.03               0.213

The graph in Figure 3 shows the performance of the C4.5 and Enhanced
C4.5 algorithms in terms of detection rate for the DoS, Probe, U2R and
R2L categories.

   Figure 3: Performance of C4.5 and Enhanced C4.5 (bar chart of
   detection rate (%) against attack category: DoS, Probe, U2R, R2L)
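The two classifiers compared in Table 3 differ in how the test at each node is chosen: C4.5 as described here takes the highest information gain, while Enhanced C4.5 takes the highest gain ratio among tests whose gain is at least the average gain. A minimal sketch of that selection rule is given below, assuming categorical attributes held in plain Python lists. The toy records and the names `record_id`, `protocol` and `rare_flag` are hypothetical and are not the paper's data or code.

```python
from collections import Counter
from math import log2


def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(items)
    return -sum((n / total) * log2(n / total) for n in Counter(items).values())


def gain_and_ratio(values, labels):
    """Return (Gain(D,T), GainRatio(D,T)) for one categorical attribute D."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    info_d = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(labels) - info_d
    split = entropy(values)  # SplitInfo(D,T), equation (4)
    # Guard (an assumption of this sketch): a single-valued attribute has
    # zero split information, so its ratio is set to 0 instead of dividing.
    ratio = gain / split if split > 0 else 0.0
    return gain, ratio


def select_test(attributes, labels):
    """Highest gain ratio among attributes with at least average gain."""
    scores = {name: gain_and_ratio(vals, labels) for name, vals in attributes.items()}
    avg_gain = sum(gain for gain, _ in scores.values()) / len(scores)
    candidates = {name: ratio for name, (gain, ratio) in scores.items()
                  if gain >= avg_gain}
    return max(candidates, key=candidates.get)


# Hypothetical toy data: 4 normal and 4 attack records.
labels = ["normal"] * 4 + ["attack"] * 4
attributes = {
    "record_id": list(range(8)),  # one value per record
    "protocol": ["tcp", "tcp", "tcp", "udp", "icmp", "icmp", "icmp", "udp"],
    "rare_flag": ["SF"] * 7 + ["REJ"],  # near-trivial split
}

by_gain = max(attributes, key=lambda n: gain_and_ratio(attributes[n], labels)[0])
by_ratio = select_test(attributes, labels)
print(by_gain, by_ratio)
```

In this example plain information gain prefers the attribute with a distinct value per record, whereas the gain ratio rule prefers `protocol`. The average-gain filter removes the near-trivial `rare_flag` split from consideration before the ratios are compared.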




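The rule application step of Section XI, where test records are stored in a database and rules are run as SQL queries, might look like the following sketch. It uses an in-memory SQLite database; the column names follow KDD Cup 99 feature names, but the table name `test_data`, the sample records, the two rules and their thresholds, and the default label "normal" are all assumptions made for illustration and are not the rules derived in the paper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test_data ("
    "id INTEGER PRIMARY KEY, protocol_type TEXT, service TEXT, "
    "src_bytes INTEGER, num_failed_logins INTEGER)"
)
conn.executemany(
    "INSERT INTO test_data (protocol_type, service, src_bytes, num_failed_logins) "
    "VALUES (?, ?, ?, ?)",
    [
        ("icmp", "ecr_i", 1032, 0),
        ("tcp", "http", 215, 0),
        ("tcp", "telnet", 120, 4),
    ],
)

# Hypothetical rules: attack category -> SQL WHERE clause.
RULES = {
    "DoS": "protocol_type = 'icmp' AND service = 'ecr_i' AND src_bytes > 1000",
    "R2L": "service = 'telnet' AND num_failed_logins >= 3",
}


def classify(connection, rules):
    """Label each record with the category of a matching rule, else 'normal'."""
    result = {row[0]: "normal" for row in connection.execute("SELECT id FROM test_data")}
    for category, where in rules.items():
        query = f"SELECT id FROM test_data WHERE {where}"
        for (record_id,) in connection.execute(query):
            result[record_id] = category
    return result


categories = classify(conn, RULES)
print(categories)
```

Rules are interpolated into the query text here only because they come from a trusted, fixed rule set; record values are passed as bound parameters.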


      Table 4: Performance Comparison of Enhanced G.A Vs
                         Enhanced C4.5

Sl.  Attack                Detection Rate (%)  False Positive (%)  Detection Rate (%)  False Positive (%)
No   Category              (Enhanced G.A)      (Enhanced G.A)      (Enhanced C4.5)     (Enhanced C4.5)

1    DoS                   93.70               0.063               92.92               0.085
2    Probe                 95.33               0.112               88.29               0.152
3    U2R                   92.50               0.075               84.00               0.220
4    R2L                   88.85               0.055               66.91               0.398
     Average Success Rate  92.595              0.076               83.03               0.213

The graph in Figure 4 shows the performance of Enhanced G.A and
Enhanced C4.5 in terms of detection rate for the DoS, Probe, U2R and
R2L categories.

   Figure 4: Performance of Enhanced G.A and Enhanced C4.5 algorithm
   (bar chart of detection rate (%) against attack category: DoS,
   Probe, U2R, R2L)

                 XIII.     CONCLUSION AND FUTURE WORK

The Enhanced Genetic Algorithm is a more suitable mechanism for
Intrusion Detection than the Enhanced C4.5 algorithm. Different
classification rules for Intrusion Detection are obtained through the
Genetic Algorithm. The proposed Genetic Algorithm based Intrusion
Detection System detects DoS, R2L, U2R and Probe attacks in the
KDDCUP99 dataset. A selected set of features is used: 10 out of 41 for
the DoS category, 7 out of 41 for the R2L category, 9 out of 41 for the
U2R category and 11 out of 41 for the Probe category, which give high
detection rates and a low false alarm rate. The results of the
experiments are satisfactory, with an average success rate of 92.595%
and an average false positive rate of 0.076% (Table 4), and the overall
results of the technique implemented are good. In future, we plan to
extend the implementation with more features and different
classification methods.

References:

[1] S. Selvakani K and Rengan S. Rajesh, “Integrated Intrusion
Detection System Using Soft Computing”, IJNS, Vol. 10, No. 2,
pp. 87-92, March 2010.

[2] S.M. Bridges and R.B. Vaughn, “Fuzzy Data Mining and Genetic
Algorithms Applied to Intrusion Detection”, Proceedings of the 12th
Annual Canadian Information Technology Security Symposium,
pp. 109-122, 2000.

[3] M. Crosbie and G. Spafford, “Applying Genetic Programming to
Intrusion Detection”, Proceedings of the 1995 AAAI Fall Symposium on
Genetic Programming, pp. 1-8, Cambridge, Massachusetts, 1995.

[4] A. Chittur, “Model Generation for an Intrusion Detection System
using Genetic Algorithms”, High School Honors Thesis,
http://www.cs.columbia.edu/ids/publications/gaidsthesis01.pdf,
accessed in 2006.

[5] C. Xiang and S.M. Lim, “Design of multiple-level hybrid classifier
for intrusion detection system”, IEEE Transaction on System, Man,
Cybernetics, Part A, Cybernetics, Vol. 2, No. 28, Mystic, CT,
pp. 117-122, May 2005.

[6] J. Shavlik and M. Shavlik, “Selection, combination, and evaluation
of effective software sensors for detecting abnormal computer usage”,
Proceedings of the First International Conference on Network Security,
Seattle, Washington, USA, pp. 56-67, May 2003.


Shaik Akbar received M.Sc (Computers) from Acharya Nagarjuna
University and M.Tech (CS&T) from Andhra University. He is pursuing a
Ph.D from GITAM University and is presently working as Associate
Professor in Sri Vasavi Institute of Engineering and Technology,
Nandamuru, Pedana Mandal, affiliated to J.N.T.U, Kakinada. His areas of
interest are Intrusion Detection, Network Security and Algorithms.








Dr. J.A. Chandulal, Ph.D., is a Professor in the Dept. of Computer
Science and Engineering, GITAM University, with over 30 years of
teaching experience. He has published 20 papers in various national and
international conferences and journals. His areas of interest are Soft
Computing, Algorithms and Advanced Databases.



Dr. K. Nageswara Rao received B.Tech (Electronics) from Karnataka
University, and M.Tech (Computers) and Ph.D from Andhra University. He
is presently working as Professor & H.O.D in P.V.P.S.I.T, Vijayawada,
affiliated to J.N.T.U, Kakinada. His areas of interest are Robotics,
Software Engineering, Algorithms and Software Reliability.



