									                                                             (IJCSIS) International Journal of Computer Science and Information Security,
                                                             Vol. 8, No. 7, October 2010

 False Positive Reduction using IDS Alert Correlation
        Method based on the Apriori Algorithm
                                          Homam El-Taj, Omar Abouabdalla, Ahmed Manasrah,
                                                Mohammed Anbar, Ahmed Al-Madi

                                           National Advanced IPv6 Center of Excellence (NAv6)
                                                        Universiti Sains Malaysia

                                                           Penang, Malaysia
                                             {homam, omar, ahmad, anbar, almadi}@nav6.org

Abstract: Correlating Intrusion Detection System (IDS) alerts is a challenging topic in the field of network security. Correlating IDS alerts has several benefits: it reduces the huge number of alerts that an IDS triggers, it reduces the false positive ratio, and it reveals the relations between the alerts, which gives a better understanding of the attacks. One family of correlation techniques is based on data mining. In this paper we develop a new IDS alert Group Correlation Method (GCM) that works on the alerts aggregated by the Threshold Aggregation Framework (TAF). We create our correlation method by adapting the Apriori algorithm for large data. The method is used to reduce the number of aggregated alerts and to reduce the ratio of false positive alerts.

Keywords: Intrusion Detection System; False Positive Alerts; Alert Correlation; Data Mining.

Because of the essential and extensive use of the Internet and its applications, threats and intrusions have become wider and smarter. And because an IDS triggers a huge number of alerts, studying these alerts has become essential too. The study of IDS alerts has brought to light several IDS issues that should be studied: how to group the alerts, how to define the relations between the alerts, and how to reduce the false alerts.

An IDS monitors the activities of the protected network and analyzes them, triggering alerts if any malicious activity occurs. An IDS can detect these activities using anomaly detection methods [1], misuse detection methods [2], or a combination of both. Anomaly methods detect malicious traffic by determining the abnormality between the suspicious activity flow and the normal flow based on a chosen threshold, whereas misuse methods detect malicious activities based on their signatures. The main differences between these methods lie in the detection of novel attacks and in the false positive ratio: misuse methods have the minimum amount of false positives, while anomaly methods can detect novel attacks.

This research was sponsored by the National Advanced IPv6 Center of Excellence (NAv6) Fellowship in Universiti Sains Malaysia (USM).

III. IDS ALERTS' CORRELATION STUDIES

Correlation is the part of intrusion detection studies that smooths the analysis of intrusion alerts based on the similarity between alert attributes. This can be represented by the mathematical expression below:

$$\mathit{Corr\_Alert} = \{\mathit{Alert}_1, \mathit{Alert}_2, \ldots, \mathit{Alert}_n\}$$

where Corr_Alert represents the group of alerts {Alert1, Alert2, … , Alertn} that have the same features and are related to each other.
However, most of the correlation methods focus on IDS alerts by examining other intrusion evidence provided by system monitoring tools or scanning tools. The aim of correlation analysis is to detect relationships among alerts so that it becomes easy to build attack scenarios.

A. Classification of Alert Correlation Techniques

IDS alert correlation studies have approached this issue from many angles, using methods and techniques that can be categorized as: similarity-based, pre-defined attack scenarios, pre-requisites and consequences, and statistical causal analysis.

a) Similarity-Based

This technique compares alert features to see whether they are similar. The correlation is mainly based on these features: source IPs, destination IPs, source ports and destination ports.
Valdes and Skinner [3] correlated IDS alerts in three phases. The first phase uses the minimum similarity, which is based on the similarity of source and destination IPs. In the second phase the similarity is based on attack class and attack name plus source and destination IPs; this phase ensures that the same alert from different sensors is correlated. In the last phase a threshold value is applied to correlate two alerts based on the similarity of the attack class, with no consideration of other features.
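As an illustration of similarity-based correlation, the following minimal Python sketch compares two alerts on the four features named above. The `Alert` class, the field names, the equal weighting of features and the 0.5 threshold are all assumptions made for this example; they are not taken from Valdes and Skinner [3] or from any IDS product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    # Hypothetical alert record holding only the four compared features.
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int


FEATURES = ("src_ip", "dst_ip", "src_port", "dst_port")


def similarity(a: Alert, b: Alert) -> float:
    """Fraction of the four features on which the two alerts agree."""
    matches = sum(getattr(a, f) == getattr(b, f) for f in FEATURES)
    return matches / len(FEATURES)


def are_correlated(a: Alert, b: Alert, threshold: float = 0.5) -> bool:
    """Treat two alerts as correlated when their similarity reaches the
    (assumed) threshold."""
    return similarity(a, b) >= threshold


a1 = Alert("10.0.0.5", "192.168.1.10", 40000, 80)
a2 = Alert("10.0.0.5", "192.168.1.10", 40001, 80)   # differs only in source port
a3 = Alert("172.16.0.9", "192.168.1.20", 51000, 22)  # shares no feature with a1
```

Under these assumptions `a1` and `a2` agree on three of the four features and would be grouped together, while `a1` and `a3` would not.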
                                                                      151                                                    http://sites.google.com/site/ijcsis/
                                                                                                                             ISSN 1947-5500

b) Pre-Defined Attack Scenarios

The idea of studying attack scenarios comes from the fact that an intrusion usually takes several actions to achieve a successful attack.
Debar and Wespi [4] proposed a system to correlate and aggregate IDS alerts triggered by different sensors. Their system has two steps. It starts by removing redundant alerts that come from different sensors. Correlation is then achieved by applying consequence rules, which specify that an alert should be followed by another type of alert; the alerts are correlated according to these rules. The aggregation phase then checks whether there is any similarity between the source and destination IPs and the attack class.

c) Pre-Requisites and Consequences

This technique sits in the middle between feature-similarity correlation and scenario-based correlation. Pre-requisites are the essential conditions that must exist for an attack to succeed, and consequences are the conditions that might exist after a specific attack has occurred.
Cuppens and Miege [5] proposed a cooperation module for IDS alerts with five main functions: an alert base management function that normalizes the alerts; alert clustering and alert merging functions that detect similarity so that alerts can be clustered and merged with each other; an alert correlation function that uses explicit correlation rules with pre-defined and consequence statements to perform the correlation; an intention recognition function that extrapolates the intruder's actions and provides a global diagnosis of the past, present and future of those actions; and a reaction function that helps system administrators choose the best measure to prevent the intruder's malicious actions.

d) Statistical Causal Analysis

This technique ranks IDS alerts using a statistical model in order to correlate them.
Kumar et al. [6] implemented anomaly detection using the Granger Causality Test (a time series analysis method) to correlate alerts in attack scenario analysis. The technique aims to reduce the number of raw alerts by merging alerts based on their features; statistical causal analysis uses a clustering technique to rank the alerts based on the relations between attacks. It is a pure statistical causality analysis with no need for pre-defined knowledge of attack scenarios.

IV. PROPOSED ALERT CORRELATION METHOD USING THE APRIORI ALGORITHM

Our correlation method works on the IDS alerts aggregated by the Threshold Aggregation Framework (TAF). The TAF output is a set of accurate aggregated alerts with no redundant or incomplete alerts. To aggregate two or more alerts in TAF, a threshold value is applied to give more accurate combination results [7].
Figure 4.1 shows the TAF flowchart. TAF has two types of input: the IDS alerts and the user aggregation options. The aggregation is performed according to these two inputs; the user chooses which type of aggregation method is used to aggregate the IDS alerts.
We propose the Group Correlation Method (GCM), which uses the output of TAF to correlate the alerts with the Apriori algorithm.
The GCM flowchart in Figure 4.2 shows an alert counter checker: if the number of alerts in the file is less than or equal to 2, the alerts are dropped, since there is no need to correlate them.

[Flowchart blocks: User Selection (with threshold, Thr = tr, or without threshold), Query Generator, Database Container, Alert Checker (Missing Features: Drop Alert), Check Parsing (Bad Parsing: Drop Alert), Data Parser, Data Manipulator, Data Analyzer, IDS Alerts, New Alerts, Aggregation Data, Generating Results, Save, Show Results to User]
Figure 4.1 TAF flowchart [7]
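The alert counter check that GCM applies before correlation (files with 2 or fewer alerts are dropped) can be sketched in a few lines of Python. The function name, the dictionary layout and the sample file names are hypothetical; the paper does not specify how aggregated alert files are stored.

```python
def split_by_alert_count(files, max_dropped_size=2):
    """Separate aggregated-alert files into those worth correlating and
    those dropped because they hold max_dropped_size alerts or fewer.

    `files` maps a (hypothetical) file name to its list of alerts.
    """
    to_correlate, dropped = {}, {}
    for name, alerts in files.items():
        if len(alerts) <= max_dropped_size:
            dropped[name] = alerts        # too few alerts to correlate
        else:
            to_correlate[name] = alerts   # passed on to the Apriori step
    return to_correlate, dropped


# Illustrative input only; the alert values are placeholders.
files = {
    "file_a": ["alert1", "alert2"],
    "file_b": ["alert1", "alert2", "alert3", "alert4"],
}
kept, dropped = split_by_alert_count(files)
```

Here `file_a` holds exactly two alerts and is dropped, while `file_b` goes on to correlation.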


A. Apriori Algorithm

The Apriori algorithm was chosen because it is one of the fastest data mining algorithms for finding all frequent itemsets in a large database [8]. Apriori depends on two predefined threshold values (Support and Confidence) to decide whether the items in an itemset (a group of alerts) are related to each other. The Support value equals the frequency of the items in the itemset, while the Confidence value is calculated by the following equation:

$$\mathit{Confidence} = (\mathit{LHD} + \mathit{RHD}) \times 100\% \qquad (1)$$

where LHD is the support of the left side and RHD is the support of the right side.

[Flowchart blocks: Files of Aggregated Alerts, Alert Amount (Amount ≤ 2: Drop Alert), Generate Itemset Ia, Calculate Support and Confidence for each ia, If ia Support < MinSupp, If ia Confidence < MinCon, Save, Show Results to User]
Figure 4.2 GCM flowchart

The Support value is calculated first for each itemset in the current iteration, and only the itemsets whose Support is not less than the threshold value minSupp are kept. The second step is to calculate the Confidence using Equation 1. This step is done for each itemset in the current iteration, and the Confidence value is compared with the second threshold value, minCon, to determine whether the current itemset will be used in the next iteration. The main idea of Apriori is to determine whether there is a relationship between the alerts, which is distinguished by the Confidence value.
Apriori works as illustrated in Figure 4.3:

       (1)   Read the aggregated alert
       (2)   Get two items as a set: the first item and the value of the redundancy of that item in the second item group, as one set of S{ i1, i2, ….., in }
       (3)   Set minSupp & Set minCon
       (4)   Calculate the Support value for each in in S
       (5)   Iteration I = n-1
       (6)   While I ≥ 1
       (7)   Do $\bigcap^{n} i_{a_r}$
       (8)   Calculate Support and Confidence for in in D{ j1, j2, ….., jm } where D ∈ S
       (9)   For each jm in D, if Support < minSupp OR Confidence < minCon, drop the item
       (10)  I = I-1
                         Figure 4.3 Apriori Algorithm

B. Mathematical Representation of the Apriori Algorithm

For a better understanding of the Apriori algorithm, we represent it mathematically as follows:

Let itemset S = i1, i2, ….., in, R = 1, 2, 3, …, g, and let I be the iteration number.

The Initial Step:

Iteration I = 0:

$$S = (i_1, i_2, \ldots, i_n), \quad i = (j_1, j_2, \ldots, j_m) \ \text{such that}\ j_k \in \{1, 2, 3, \ldots, g\},\ k = (1, 2, \ldots, m)$$

$$\mathit{Support} = |i| = m$$

Iteration I = 1:

We make the intersection between ie and id, where e ≠ d, such that

$$i_e \cap i_d = (j_1, j_2, \ldots, j_{m_e})_e \cap (j_1, j_2, \ldots, j_{m_d})_d = (j_1, j_2, \ldots, j_z)$$

where $j_1, j_2, \ldots, j_z \in \{1, 2, 3, \ldots, g\}$ and $z \le m_e$, $z \le m_d$, with $m_e$ and $m_d$ the sizes of ie and id.

$$i_{ed} = i_e \cap i_d$$

$$T = i_{ed}$$

where $e = 1, \ldots, n$ and $d = 1, \ldots, n$, with

$$e \ne d$$

$$T = i_e \cap i_d$$

$$\mathit{Support} = |T| = z$$

If $z < \mathit{minSupp}$ then eliminate ied.
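Iteration 1 can be illustrated with a short Python sketch: every pair of itemsets is intersected, the size of the intersection is taken as its Support, and pairs below minSupp are eliminated. The confidence function averages the ratio of the pair's Support to each component's Support, which is how the paper's worked example computes it; treating that as the intended reading of Equation 1 is an assumption. The data, labels and function names are hypothetical.

```python
from itertools import combinations


def pairwise_itemsets(second_items, min_supp):
    """Iteration 1: intersect every pair i_e, i_d (e != d) and keep the
    intersection T only when Support = |T| is at least min_supp."""
    kept = {}
    for e, d in combinations(sorted(second_items), 2):
        t = second_items[e] & second_items[d]
        if len(t) >= min_supp:          # eliminate i_ed when z < minSupp
            kept[(e, d)] = t
    return kept


def confidence(itemset, t, second_items):
    """Assumed reading of the confidence: the average, in percent, of
    |T| / |i_r| over every component i_r of the itemset."""
    ratios = [len(t) / len(second_items[r]) for r in itemset]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical data: each first item maps to the set of second items
# it was seen with (not the values of the paper's example table).
second_items = {"a": {1, 2}, "b": {1, 2, 3}, "c": {3}}

pairs = pairwise_itemsets(second_items, min_supp=1)
conf_ab = confidence(("a", "b"), pairs[("a", "b")], second_items)
```

With this data the pair ("a", "c") has an empty intersection and is eliminated, while ("a", "b") survives with Support 2 and a confidence of roughly 83%. A pair whose confidence falls below minCon would be dropped in the same way before the next iteration.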

Iteration I = 2:

We make the intersection between three itemsets of S: ie, id and ih.

$$i_e \cap i_d \cap i_h = (i_e \cap i_d) \cap i_h = (j_1, j_2, \ldots, j_{m_{ed}})_{ed} \cap (j_1, j_2, \ldots, j_{m_h})_h = (j_1, j_2, \ldots, j_z)$$

where $j_1, j_2, \ldots, j_z \in \{1, 2, 3, \ldots, g\}$ and $z \le m_{ed}$, $z \le m_h$, with $m_{ed}$ and $m_h$ the sizes of ied and ih.

$$T = i_{edh}$$

where $e = 1, \ldots, n$ and $d = 1, \ldots, n$ and $h = 1, \ldots, n$, with

$$e \ne d \ne h$$

$$T = i_e \cap i_d \cap i_h$$

$$\mathit{Support} = |T| = z$$

If $z < \mathit{minSupp}$ then eliminate iedh.

Iteration I = c: (General Form)

We make the intersection between each group of c itemsets in S: ia1, ia2, …, iac.

$$i_{a_1} \cap i_{a_2} \cap \ldots \cap i_{a_c} = \bigcap_{r=1}^{c} i_{a_r} = (j_1, j_2, \ldots, j_z)$$

where $j_1, j_2, \ldots, j_z \in \{1, 2, 3, \ldots, g\}$ and z is less than or equal to the size of every itemset taken from S.

$$S = \bigcap_{r=1}^{c} i_{a_r}$$

$$\mathit{Support} = |S| = z$$

If $z < \mathit{minSupp}$ then eliminate $\bigcap_{r=1}^{c} i_{a_r}$.

$$\text{Confidence of } S = \frac{z}{\mathit{Support}\ \bigcap_{r=1}^{c} i_{a_r}}$$

The denominator ($\mathit{Support} \bigcap i_{a_r}$) represents the Support of all components in $\bigcap i_{a_r}$.

The confidence should be calculated for each component itemset, and the average of all these confidences is then the confidence of the itemset.

To understand the mathematical representation, consider the following example. Let the sample of the first item and the second item be taken from Table 4.2, with minSupp = 2 and minCon = 80%.

TABLE 4.2 EXAMPLE SET

    First Item    Second Item
        1             1
        2             1
        5             2
        2             3
        3             1
        4             2
        1             2
        2             3
        3             2
        5             2

So the first item F = {1, 2, 5, 2, 3, 4, 1, 2, 3, 5}, and the second item S = {1, 2, 3}.

F0 = {1, 2, 3, 4, 5} and S0 = {{1, 2}, {1, 2, 3}, {1, 2}, {2}, {2}} (no redundancy in the second item)

Support0 = {2, 3, 2, 1, 1} (items 4 and 5 will be eliminated, since their Support < minSupp)

F1 = {(1, 2), (1, 3), (2, 3)} and S1 = {{1, 2}, {1, 2}, {2}}

Support1 = {2, 2, 1} (item (2, 3) will be eliminated, since its Support < minSupp)

$$\mathit{Confidence}(1,2) = \mathit{Average}\left(\frac{\mathit{Support}(1,2)}{\mathit{Support}_1} \times 100\% + \frac{\mathit{Support}(1,2)}{\mathit{Support}_2} \times 100\%\right) = \frac{100 + 67}{2} = 83\%$$

$$\mathit{Confidence}(1,3) = \mathit{Average}\left(\frac{2}{2} \times 100\% + \frac{2}{2} \times 100\%\right) = 100\%$$

$$\mathit{Confidence}(2,3) = \mathit{Average}\left(\frac{1}{3} \times 100\% + \frac{1}{2} \times 100\%\right) = 42\% \quad \text{(item will be eliminated)}$$

F2 = {(1, 2, 3)} and S2 = {{1, 2}}

Support2 = {3}

$$\mathit{Confidence}(1,2,3) = \mathit{Average}\left(\frac{2}{2} \times 100\% + \frac{2}{2} \times 100\% + 0 \times 100\%\right) = 100\%$$

From the above example it is obvious that: first, the stopping rule of the iterations is reached when there are no items left to compare with; second, the itemsets (1,2), (1,3), (1, 2, 3)


have relationships with confidence percentages of {83%, 100%, 100%}, respectively. Third, the items (4) and (5) are out of range.

               V.    IMPLEMENTATION ISSUES

Group Correlation Method (GCM) can be used as a standalone system that reads the aggregated IDS alerts. Moreover, GCM works only with complete, non-redundant alerts, which eases the analyst's job when correlating them. GCM has two main inputs: the threshold values chosen by the user (minSupp and minCon) and the aggregated IDS alerts to be correlated. Figure 4.2 shows the GCM flowchart. GCM starts processing the IDS alerts without needing to filter them or remove redundancy. Finally, the result is shown and, at the user's request, saved in the database. Alerts dropped as insufficient are those that have no relationships with other alerts.

                      VI.    DISCUSSION

This paper presented the GCM method for correlating the aggregated alerts from TAF. The advantages of the proposed method are an improved alert correlation process, especially because it works only on accurate, non-redundant alerts, and a reduction in the time needed to correlate the alerts. The main objective is to minimize the number of alerts by investigating the relationships between the alerts and the alerts' features, which leads to minimizing the false positives from the IDS alerts.

This method is intended to become a general guide that can be implemented and extended into a full forensic investigation system. Other benefits of the proposed method are: firstly, discovering the attacks' behaviors; secondly, finding novel attacks; thirdly, saving the time spent analyzing the alerts; and finally, producing relationally accurate alerts with no false alerts. Modifying the values of the two thresholds controls the number of correlated alerts.

                      ACKNOWLEDGMENT

This research was supported by the National Advanced IPv6 Center of Excellence (NAv6) in Universiti Sains Malaysia (USM).

                      REFERENCES

[1]      W. Fan, M. Miller, S. Stolfo, W. Lee, and P. Chan, "Using artificial anomalies to detect unknown and known network intrusions," Knowledge and Information Systems, vol. 6, pp. 507-527, 2004.
[2]      M. Sheikhan and Z. Jadidi, "Misuse Detection Using Hybrid of Association Rule Mining and Connectionist Modeling," World Applied Sciences Journal, vol. 7, pp. 31-37, 2009.
[3]      A. Valdes and K. Skinner, "Probabilistic alert correlation," in the Fourth International Symposium on Recent Advances in Intrusion Detection, 2001, pp. 54-68.
[4]      H. Debar and A. Wespi, "Aggregation and correlation of intrusion-detection alerts," in 4th International Symposium on Recent Advance in Intrusion Detection (RAID) 2001, 2001, pp.
[5]      F. Cuppens and A. Miege, "Alert correlation in a cooperative intrusion detection framework," in IEEE Symposium on Security and Privacy, Berkeley, California, USA, 2002, pp. 202-215.
[6]      V. Kumar, J. Srivastava, A. Lazarevic, W. Lee, and X. Qin, "Statistical Causality Analysis of Infosec Alert Data," in Managing Cyber Threats. vol. 5: Springer US, 2005, pp. 101-127.
[7]      Homam El-Taj, Omar Abouabdalla, Ahmed Manasrah, Ahmed Al-Madi, Muhammad Imran Sarwar, and S. Ramadass, "Forthcoming Aggregating Intrusion Detection System Alerts Framework," in The Fourth International Conference on Emerging Security Information, Systems and Technologies (SECURWARE 2010), Venice/Mestre, Italy, 2010.
[8]      W. Kosters and W. Pijls, "Apriori, a depth first implementation," in Frequent Itemset Mining Implementations Repository (FIMI03), 2003.

                      AUTHORS PROFILE

Homam El-Taj is a research officer and fellowship holder in the National Advanced IPv6 Centre of Excellence (NAv6) at Universiti Sains Malaysia (USM). He holds a Bachelor's degree in Computer Science from Philadelphia University, Amman, Jordan (2003), and a Master's degree in Computer Science from USM in the area of Distributed Computing (2006). His master's research was on Message Digest Based on Elliptic Curve Concept (MDEC). Currently he is a PhD candidate in NAv6 at USM, and his PhD research is in the field of Network Security. He has published several research articles in journals and proceedings.

Dr. Omar Amer Abouabdalla obtained his PhD degree in Computer Sciences from University Science Malaysia (USM) in the year 2004. Presently he is working as a senior lecturer and domain head in the National Advanced IPv6 Centre - USM. He has published more than 50 research articles in journals and proceedings (international and national). His current areas of research interest include Multimedia Network, Internet Protocol version 6 (IPv6), and Network Security.

Dr. Ahmed M. Manasrah is a senior lecturer and the Head of the iNetmon project as well as the research and innovation of the National Advanced IPv6 Centre of Excellence (NAv6) in Universiti Sains Malaysia. He is also the IMPACT Research Domain Head for Botnet and threat assessment research. Dr. Ahmed obtained his Bachelor of Computer Science from Mutah University, al Karak, Jordan in 2002. He obtained his Master of Computer Science and doctorate from Universiti Sains Malaysia in 2005 and 2009 respectively. Dr. Ahmed is heavily involved in research carried out by the NAv6 centre, such as network monitoring and network security monitoring, and has filed 3 patents in Malaysia.

Mohammed Anbar is a research officer in the National Advanced IPv6 Centre of Excellence (NAv6) at Universiti Sains Malaysia. His main research area is Network Security and Malware Protection. Anbar obtained his Master's in Information Technology from Universiti Utara Malaysia (UUM) in 2009. Currently, he is a PhD candidate in NAv6.

Ahmed Azmi Almadi is a research officer in the National Advanced IPv6 Centre of Excellence (NAv6) at Universiti Sains Malaysia. His main research area is Network Security and Malware Protection. Almadi obtained his Master's in Computer Science from USM in 2007. Currently, he is a PhD candidate and fellowship holder in NAv6. His PhD research focuses on Botnet Detection.
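
The confidence calculation in the worked example and the two GCM thresholds can be illustrated with a minimal Python sketch. It rests on assumptions that are not stated in the paper: the confidence of an itemset is taken to be the average of support(itemset) / support(item) over its items, which is what the Confidence S(2,3) calculation appears to do, and the function names (`support`, `confidence`, `correlate`) and the toy `alerts` data are hypothetical. It is not the authors' implementation.

```python
from itertools import combinations


def support(itemset, transactions):
    """Count the transactions that contain every item of `itemset`."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= set(t))


def confidence(itemset, transactions):
    """Assumed formula: average of support(itemset) / support(item), in percent."""
    joint = support(itemset, transactions)
    ratios = []
    for item in itemset:
        single = support((item,), transactions)
        ratios.append(joint / single if single else 0.0)
    return 100.0 * sum(ratios) / len(ratios)


def correlate(transactions, min_supp, min_con, size=2):
    """Keep itemsets of `size` items that pass both thresholds (minSupp, minCon)."""
    items = sorted({i for t in transactions for i in t})
    kept = {}
    for candidate in combinations(items, size):
        if support(candidate, transactions) < min_supp:
            continue  # not enough co-occurrences
        conf = confidence(candidate, transactions)
        if conf >= min_con:
            kept[candidate] = conf
    return kept


# Hypothetical toy data: each tuple is a set of alert features seen together.
# Here support(2) = 3, support(3) = 2 and support((2, 3)) = 1.
alerts = [(1, 2), (2, 3), (2,), (3,)]

print(round(confidence((2, 3), alerts)))        # average of 1/3 and 1/2, in percent
print(correlate(alerts, min_supp=1, min_con=50))
```

With these toy alerts the itemset (2, 3) gets a confidence of about 42% and is dropped when minCon is 50, matching the elimination step in the example. Raising or lowering `min_supp` and `min_con` changes how many itemsets survive, which is the control over the number of correlated alerts described in the discussion.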
