(IJCSIS) International Journal of Computer Science and Information Security, Vol. 10, No. 3, March 2012
http://sites.google.com/site/ijcsis/ ISSN 1947-5500

A Survey on Building Intrusion Detection System Using Data Mining Framework

V. Jaiganesh, Assistant Professor, Department of Computer Science, Dr. N.G.P. Arts and Science College, Coimbatore. e-mail: firstname.lastname@example.org
M. Thenmozhi, Assistant Professor, Department of Information Technology, Avinashilingam University for Women, Coimbatore. e-mail: firstname.lastname@example.org
Dr. P. Sumathi, Assistant Professor, Department of Computer Science, Chikkanna Government Arts College, Tirupur. e-mail: email@example.com

Abstract- Network attacks have recently increased to a great extent. Hackers and intruders can make many successful attempts to crash networks and web services through illegal intrusion. New threats, and the solutions that counter them, emerge together as secured systems evolve. As a result, the Intrusion Detection System (IDS) has become an active area of research in the field of network security. The optimization of IDS is an attractive domain because of the security audit data involved and the complex, dynamic properties of intrusion behaviors. The main purpose of an IDS is to protect resources from threats. An IDS examines and evaluates user behavior and then classifies each behavior as either an attack or normal. Intrusion detection systems have been integrated with data mining approaches to identify intrusions. Various data mining approaches, such as classification trees and Support Vector Machines, are used for intrusion detection. This paper thoroughly investigates the existing data mining approaches for detecting intrusions.

Keywords- Intrusion Detection System (IDS), intruders, Machine Learning techniques, Data mining

I. INTRODUCTION

Computer networks and their related applications have become an attractive resource in the era of the information society [1]. In recent years, the potential threat to the global information infrastructure has also increased greatly. To guard against cyber attacks and computer viruses, numerous computer security approaches have been extensively researched. The major security techniques proposed include cryptography, firewalls, and anomaly and intrusion detection. Among these, intrusion detection is considered one of the most significant and competent techniques for protecting against complex and dynamic intrusion attacks.

Network intrusion and information safety issues are largely a consequence of extensive internet usage. For example, on February 7th, 2000, the first large-volume Denial of Service (DoS) attacks were launched against the computer systems of large corporations such as Yahoo!, eBay, Amazon, CNN, ZDnet and Dadet [2]. Network intrusion is also regarded as a new weapon of world war. Thus, detecting and preventing intrusions efficiently has become a major concern of the computing community.

An intrusion is a violation of the security policy of a system, so intrusion detection mainly refers to the methods that detect violations of system security policy. Since the severity of network attacks has increased radically, the intrusion detection system has become an essential part of the security infrastructure of many companies. Intrusion detection helps companies defend their systems against the various attacks that come with rising network connectivity and dependence on information systems [3].

Recently, intrusion detection techniques based on data mining have attracted many researchers. As an essential application area of data mining, intrusion detection aims to lessen the burden of examining vast volumes of audit data and to optimize the performance of detection rules. Researchers have suggested numerous techniques of various kinds, from Bayesian techniques [4] to decision trees [5, 6], and from rule based models [7] to function learning [8]. These techniques have improved detection efficiency to a certain extent.

The existing techniques show that most researchers used a single algorithm to detect multiple attack classes, with poor performance in certain scenarios. Detection performance can be greatly improved through more sophisticated techniques.

Data mining approaches have taken valuable steps towards solving several intrusion detection issues. There are various benefits in using data mining approaches to solve the problem of network intrusion [9]. Some of the benefits are listed below:
• It can process huge amounts of data.
• The user's subjective evaluation is not needed, and it is better suited to detecting unobserved and hidden information.

Moreover, data mining systems readily perform data summarization and visualization, which facilitate security analysis in various research areas [10].

This paper thoroughly investigates the existing data mining approaches that help in preventing intrusion attacks. The characteristic features of the intrusion detection techniques are presented in this paper, which would facilitate further research in the field of network security.

II. LITERATURE SURVEY

The idea of an intrusion detection system was proposed by Anderson in 1980 [11]. Anderson employed statistical techniques to examine user behavior and to detect attackers who access the system in an unauthorized way. Denning [12] presented a prototype of IDES (Intrusion Detection Expert System) in 1987. The concept of an intrusion detection system then became progressively better known, and Denning's approach is considered a significant landmark in the area of intrusion detection.

Zenghui and Yingxu [13] proposed a data mining framework for generating intrusion detection models. The main goal is to employ data mining techniques, namely classification, meta-learning, association rules and frequent episodes, to review data for computing misuse and anomaly detection models that correctly capture the actual behavior (i.e., patterns) of intrusions and normal activities. Even though this detection model can detect a considerable percentage of old and new PROBING and U2R attacks, it missed a vast number of new DOS and R2L attacks.

Theodoros Lappas and Konstantinos Pelechrinis [14] concentrated mostly on data mining approaches used for dealing with DOS and R2L attacks, and then proposed a new idea on how data mining can help IDSs by utilizing biclustering as a tool to analyze network traffic and improve IDSs.

Sun and Wang [15] presented a new weighted support vector clustering algorithm and applied it to the problem of anomaly detection. Experimental results show that this method obtains a high detection rate with a low false alarm rate. Su-Yun Wu and Ester Yen [16] compared the performance of machine learning techniques such as classification trees and support vector machines in intrusion detection systems. The results show that the C4.5 classification tree algorithm and SVM achieve similar accuracy for R2L attacks, but the accuracy of C4.5 is higher than that of SVM for the other types of attack.

Intruders are one of the most common threats to security. At present, intrusion detection has emerged as a significant practice for providing network security, and data mining approaches have recently been exploited for this purpose. The effectiveness of the feature selection technique is one of the fundamental parameters affecting the success of an Intrusion Detection System (IDS). Amudha and Abdul Rauf [17] evaluated the performance of the data mining classification approaches J48, Naive Bayes, NBTree and Random Forest on the KDD CUP'99 dataset, concentrating mainly on the Correlation Feature Selection (CFS) measure. The results of this evaluation revealed that NBTree and Random Forest perform better than the other two approaches in terms of predictive accuracy and detection rate.

Data mining approaches have achieved considerable importance in presenting helpful information and can thereby assist in improving decisions on recognizing intrusions (attacks). Panda and Patra [18] evaluated the performance of several rule based classifiers, for instance JRip, RIDOR, NNge and decision table, using an ensemble approach with the intention of constructing an efficient network intrusion detection system. The authors used the KDDCup'99 intrusion detection benchmark dataset (which is a fraction of the DARPA evaluation program) for this experimentation. The outcome indicates that this scheme is highly effective in identifying network intrusions, provides a low false positive rate, and is uncomplicated, consistent and faster in constructing an efficient network intrusion detection system.

Due to the increase in the number of computer networks, ensuring the security of a network against various attacks is essential. An intrusion detection system is one of the popular tools for providing security against intruders in a network. Data mining approaches have improved the quality of intrusion detection, whether as anomaly detection or misuse detection, in large scale network traffic. Association rule mining is a popular method for constructing quality misuse detection. Its limitation is that it often produces thousands of rules, which diminishes the performance of the IDS. Namik and Othman [19] concentrated on applying post-mining to decrease the number of rules while retaining the highest-quality rules for generating quality signatures. Each partition is mined using the Apriori algorithm, followed by post-mining using a Chi-Squared (χ²) computation. The quality of each rule is measured by its Chi-Square value, which is computed from the support, confidence and lift of the association rule.

Emerging technologies have transformed the characteristics of surveillance and monitoring applications; however, the sensory data obtained from different devices still remain unreliable and inadequately synchronized. State transition analysis is becoming a significant component in recognizing intrusions. Ganesh et al. [20] developed a semantic based intrusion detection system in which state transition analysis, pattern matching and data mining techniques are combined to enhance intrusion detection accuracy. Patterns and rules are generated from the events identified by the wireless sensor network (WSN). The sink obtains information about the numerous actions taking place in the coverage area and correlates the streaming data in the spatial and time domains. The semantic rules are generated using the ANTLR tool.

Networks are safeguarded using firewalls and encryption software. However, most of these available methods are not adequate and efficient. The majority of current intrusion detection systems for mobile ad-hoc networks concentrate either on routing protocols or only on their effectiveness, and fail to address the security related issues. Some of the nodes that take part in the communication may be selfish; for instance, certain nodes may not forward packets to the target in order to reduce their battery power consumption. In other cases, certain nodes may act maliciously by initiating security attacks such as Denial-of-Service or by hacking information. The vital objective of security solutions for wireless networks is to offer security services such as authentication, confidentiality, integrity, anonymity and availability to mobile users. Esfandi [21] integrates agents and data mining approaches to prevent anomaly intrusion in mobile ad-hoc networks. Home agents present in each system gather data from their own system and use data mining approaches to observe local anomalies. Mobile agents observe the neighboring nodes and gather information from adjacent home agents to determine the correlation among the observed anomalous patterns before the data is sent. This scheme was reported to be capable of preventing all the security attacks in an ad-hoc network and of reducing false alarms.

Te-Shun Chou and Tsung-Nan Chou [22] proposed a hybrid design for intrusion detection that integrates anomaly detection with misuse detection. The design includes an ensemble feature selecting classifier and a data mining classifier. The former consists of four classifiers that use different sets of features, each employing a machine learning algorithm called the fuzzy belief k-NN classification algorithm. The latter exploits data mining approaches to automatically extract computer users' normal behavior from training network traffic data. The outputs of the ensemble feature selecting classifier and the data mining classifier are then combined to obtain the final decision.

Several techniques have been developed for intrusion detection using data mining approaches, but it has been unclear from the beginning which data mining approach is most efficient. Zhenwei Yu and Tsai [23] developed a Multi-Class SLIPPER (MC-SLIPPER) scheme for intrusion detection to discover whether there is any significant advantage from a boosting based learning approach. The fundamental idea is to employ the available binary SLIPPER, a rule learner based on confidence-rated boosting, as a central module. Several arbitration strategies based on prediction confidence are developed to judge the results from all binary SLIPPER modules.

The security of computers and the networks that connect them is becoming increasingly essential. At the same time, constructing effective intrusion detection techniques with better accuracy and real-time implementation is indispensable. Muntean et al. [24] developed a novel data mining based method for intrusion detection that uses cost-sensitive classification together with Support Vector Machines. The authors introduced an algorithm that enhances classification for Support Vector Machines by multiplying the instances of the underrepresented classes in the training phase. The study showed that oversampling the instances of the anomaly class helps the Support Vector Machine algorithm to overcome the soft margin. Consequently, it classifies future instances of this class of interest better.

Heterogeneous security equipment, for instance firewalls, intrusion detection systems and anti-virus gateways, can generate a considerable number of security events that are complicated to manage effectively. To address this, Lv Guangjuan et al. [25] developed a log-based mining, distributed and multi-protocol supported framework for a security monitoring system and described the structural design of the information security monitoring system. The main focus is on the correlation analysis engine, which illustrates how the detection model is constructed using data mining approaches. Security event correlation based on data mining analysis can automatically obtain association rules, investigate alarms and find new intrusion models, and hence it is an extremely intelligent technique.

Xin Xu et al. [26] proposed a framework for adaptive intrusion detection with the help of machine learning approaches. Multi-class Support Vector Machines (SVMs) are employed for classifier construction in IDSs, and the performance of the SVMs is assessed on the KDD99 dataset. Significant results were obtained in the experimental evaluation. For instance, detection rates of 76.7%, 81.2%, 21.4% and 11.2% were obtained for DoS, Probe, U2R and R2L attacks respectively, while the false positive rate was maintained at the fairly low average level of 0.6% for the four groups. However, this approach was applied only to a very small set of data (10,000 randomly sampled records) compared to the huge original dataset (5 million audit records). So, this method is not suitable for all circumstances and is not regarded as one of the best approaches.

Yang Li and Li Guo [27] recognized the insufficiency of the KDD dataset. They nevertheless proposed a supervised network intrusion detection technique based on the Transductive Confidence Machines for K-Nearest Neighbors (TCM-KNN) machine learning algorithm and an active learning based training data selection method. This new approach was evaluated on a subset of the KDD dataset obtained by randomly sampling 49,402 audit records for the training phase and 12,350 records for the testing phase. An average TP of 99.6% and FP of 0.1% were reported, but the authors presented no additional information about the exact detection rate for each attack category.

III. PROBLEMS AND DIRECTIONS

This section analyzes the problems and issues present in the existing intrusion detection techniques. It also provides certain possible solutions to these problems.

The majority of the intrusion detection techniques available in the literature employ a single algorithm to detect multiple attack categories, with poor performance in most scenarios.

Existing intrusion detection systems are highly dependent on human analysts to distinguish intrusive from non-intrusive network traffic.

Moreover, existing IDSs are developed to detect only particular known service level network attacks. Many attempts have been made to deal with this problem, but they resulted in an unacceptable level of false positives. At the same time, adequate data exist or could be collected to help network administrators discover these policy violations. But the data are so vast that the analysis takes a very long time, and the administrators do not have the resources to go through it all and extract the relevant knowledge. Thus, network administrators lack the resources to proactively investigate the data for policy violations, particularly when a high number of false positives causes them to waste their already inadequate resources.

Thus, the most important problem with the existing IDS approaches is that they do not provide significant results for all types of attacks.

There is considerable variation from one attack category to another, and thus identifying attack category specific algorithms offers a promising research direction for improving intrusion detection performance.

To handle the above mentioned problems, effective and novel research in the areas of data mining and intrusion detection has to be carried out. Efficient machine learning techniques can be used that provide decision aids for the analysts and automatically generate rules to be used for computer network intrusion detection. Moreover, Neuro-fuzzy techniques can be utilized with better learning techniques to provide precise results in IDS.

IV. CONCLUSION

Intrusion Detection Systems provide the fundamental detection techniques to secure the systems present in networks that are directly or indirectly connected to the Internet. This paper provides a thorough investigation of the existing intrusion detection techniques based on data mining approaches. It analyzes the problems present in the existing intrusion detection techniques and suggests certain solutions to them. This paper would be a suitable platform for novel research in the field of network security.

REFERENCES

[1] Huy Nguyen and Deokjai Choi, "Application of Data Mining to Network Intrusion Detection: Classifier Selection Model", Proceedings of the 11th Asia-Pacific Symposium on Network Operations and Management (APNOMS), Challenges for Next Generation Network Operations and Service Management, Pp. 399-408, 2008.
[2] Brian Krebs, "A Short History of Computer Viruses and Attacks", http://www.securityfocus.com/news/2445.
[3] L. Vokorokos, A. Kleinova and O. Latka, "Network Security on the Intrusion Detection System Level", Proceedings of International Conference on Intelligent Engineering Systems (INES), Pp. 270-275, 2006.
[4] G.H. John and P. Langley, "Estimating Continuous Distributions in Bayesian Classifiers", Proceedings of the 11th Conference on Uncertainty in Artificial Intelligence, Pp. 338-345, 1995.
[5] J. Ross Quinlan, "C4.5: Programs for Machine Learning", Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 1993.
[6] Ron Kohavi, "Scaling up the accuracy of Naïve-Bayes classifier: A decision-tree hybrid", Proceedings of the 2nd International Conference on Knowledge Discovery and Data Mining, Pp. 202-207, 1996.
[7] Ian H. Witten, Eibe Frank and Mark A. Hall, "Data Mining: Practical Machine Learning Tools and Techniques", 2nd Edition, Morgan Kaufmann, San Francisco, 2005.
[8] P. Werbos, "Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences", PhD Thesis, Harvard University, 1974.
[9] Ming Xue and Changjun Zhu, "Applied Research on Data Mining Algorithm in Network Intrusion Detection", International Joint Conference on Artificial Intelligence (JCAI), Pp. 275-277, 2009.
[10] Eric Bloedorn, Alan D. Christiansen, William Hill, Clement Skorupka, Lisa M. Talbot, Jonathan Tivel, "Data Mining for Network Intrusion Detection: How to Get Started", Technical Paper, 2001.
[11] J.P. Anderson, "Computer security threat monitoring and surveillance", Technical Report, James P. Anderson Co., Fort Washington, Pennsylvania, 1980.
[12] D.E. Denning, "An intrusion detection model", IEEE Transaction on Software Engineering, Pp. 222-232, 1987.
[13] Zenghui Liu and Yingxu Lai, "A Data Mining Framework for Building Intrusion Detection Models Based on IPv6", Proceedings of the 3rd International Conference and Workshops on Advances in Information Security and Assurance, Seoul, Korea, Springer-Verlag, Volume 5576, Pp. 608-618, 2009.
[14] Theodoros Lappas and Konstantinos Pelechrinis, "Data Mining Techniques for (Network) Intrusion Detection System", 2007.
[15] Sheng Sun and YuanZhen Wang, "A Weighted Support Vector Clustering Algorithm and its Application in Network Intrusion Detection", First International Workshop on Education Technology and Computer Science (ETCS), Vol. 1, Pp. 352-355, 2009.
[16] Su-Yun Wu and Ester Yen, "Data mining-based intrusion detectors", Expert Systems with Applications, Vol. 36, No. 3, Pp. 5605-5612, 2009.
[17] P. Amudha and H. Abdul Rauf, "Performance Analysis of Data Mining Approaches in Intrusion Detection", International Conference on Process Automation, Control and Computing (PACC), Pp. 1-6, 2011.
[18] M. Panda and M.R. Patra, "Ensembling Rule Based Classifiers for Detecting Network Intrusions", International Conference on Advances in Recent Technologies in Communication and Computing (ARTCom), Pp. 19-22, 2009.
[19] A.F. Namik and Z.A. Othman, "Reducing network intrusion detection association rules using Chi-Squared pruning technique", 3rd Conference on Data Mining and Optimization (DMO), Pp. 122-127, 2011.
[20] K.S. Ganesh, M.R. Sekar and V. Vaidehi, "Semantic Intrusion Detection System using pattern matching and state transition analysis", International Conference on Recent Trends in Information Technology (ICRTIT), Pp. 607-612, 2011.
[21] A. Esfandi, "Efficient anomaly intrusion detection system in adhoc networks by mobile agents", 3rd IEEE International Conference on Computer Science and Information Technology (ICCSIT), Vol. 7, Pp. 73-77, 2010.
[22] Te-Shun Chou and Tsung-Nan Chou, "Hybrid Classifier Systems for Intrusion Detection", Seventh Annual Communication Networks and Services Research Conference (CNSR), Pp. 286-291, 2009.
[23] Zhenwei Yu and J.J.P. Tsai, "A multi-class SLIPPER system for intrusion detection", Proceedings of the 28th Annual International Computer Software and Applications Conference (COMPSAC), Vol. 1, Pp. 212-217, 2004.
[24] M. Muntean, H. Valean, L. Miclea and A. Incze, "A novel intrusion detection method based on support vector machines", 11th International Symposium on Computational Intelligence and Informatics (CINTI), Pp. 47-52, 2010.
[25] Lv Guangjuan, Xu Ruzhi, Zu Xiangrong and Deng Liwu, "Information Security Monitoring System Based on Data Mining", Fifth International Conference on Information Assurance and Security, Pp. 472-475, 2009.
[26] Xin Xu, "Adaptive Intrusion Detection Based on Machine Learning: Feature Extraction, Classifier Construction and Sequential Pattern Prediction", International Journal of Web Services Practices, Vol. 2, No. 1-2, Pp. 49-58, 2006.
[27] Yang Li and Li Guo, "An Active Learning Based TCM-KNN Algorithm for Supervised Network Intrusion Detection", 26th Computers & Security, Pp. 459-467, 2007.

AUTHORS PROFILE

V. Jaiganesh is working as an Assistant Professor in the Department of Computer Science, Dr. N.G.P. Arts and Science College, Coimbatore, and is pursuing a Ph.D. at Manonmaniam Sundaranar University, Thirunelveli. He completed his M.Phil. in the area of Data Mining at Periyar University and his postgraduate degrees, MCA and MBA, at Periyar University, Salem. He has presented and published a number of papers in reputed conferences and journals. He has one decade of teaching and research experience, and his research interests include Data Mining and Networking.

M. Thenmozhi is working as an Assistant Professor in the Department of Information Technology, Faculty of Engineering, Avinashilingam University for Women, Coimbatore, and is pursuing an M.E. in Network Engineering at Anna University of Technology, Coimbatore. She received her B.E. at Avinashilingam University for Women, Coimbatore. She has attended various seminars and conferences. She has six years of teaching experience, and her interests include Data Mining and Networking.

Dr. P. Sumathi is working as an Assistant Professor in the Department of Computer Science, Chikkanna Government Arts College, Tirupur. She received her Ph.D. in the area of Grid Computing from Bharathiar University. She completed her M.Phil. in the area of Software Engineering at Mother Teresa Women's University and received her MCA degree at Kongu Engineering College, Perundurai. She has published a number of papers in reputed journals and conferences. She has about fifteen years of teaching and research experience. Her research interests include Data Mining, Grid Computing and Software Engineering.
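The literature survey notes that Namik and Othman rank association rules by a Chi-Square value computed from the support, confidence and lift of each rule. The survey does not give the formula, so the following is only a minimal sketch of one standard way such a value can be obtained, assuming a rule A -> B observed over n transactions. The function names and the example counts are hypothetical and are not taken from the surveyed paper. The sketch computes the Pearson statistic directly from the 2x2 contingency table and again from the three rule measures, so the two can be compared.

```python
def rule_measures(n, n_a, n_b, n_ab):
    """Support, confidence and lift of a rule A -> B from raw counts.

    n    : total number of transactions
    n_a  : transactions containing A
    n_b  : transactions containing B
    n_ab : transactions containing both A and B
    """
    support = n_ab / n
    confidence = n_ab / n_a
    lift = confidence / (n_b / n)
    return support, confidence, lift


def chi_squared_from_counts(n, n_a, n_b, n_ab):
    """Pearson chi-squared statistic of the 2x2 table for A and B."""
    observed = [
        n_ab,                   # A and B
        n_a - n_ab,             # A without B
        n_b - n_ab,             # B without A
        n - n_a - n_b + n_ab,   # neither A nor B
    ]
    # Expected counts if A and B were independent.
    expected = [
        n_a * n_b / n,
        n_a * (n - n_b) / n,
        (n - n_a) * n_b / n,
        (n - n_a) * (n - n_b) / n,
    ]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi_squared_from_measures(n, support, confidence, lift):
    """The same statistic written in terms of support, confidence and lift.

    Undefined (division by zero) when A or B occurs in every transaction.
    """
    return (n * (lift - 1) ** 2 * support * confidence
            / ((confidence - support) * (lift - confidence)))


# Hypothetical counts: 100 records, A in 40, B in 50, both in 30.
support, confidence, lift = rule_measures(100, 40, 50, 30)
direct = chi_squared_from_counts(100, 40, 50, 30)
via_measures = chi_squared_from_measures(100, support, confidence, lift)
print(round(direct, 4), round(via_measures, 4))
```

A post-mining step of the kind described could then keep only the rules whose value exceeds a chosen cut-off. The cut-off itself is a design choice that the survey does not specify.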
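Section III argues that performance should be examined per attack category, and the survey quotes separate detection rates for DoS, Probe, U2R and R2L alongside a false positive rate. As an illustration of how such figures can be tabulated, the sketch below computes them from paired true and predicted labels. It assumes that an attack record counts as detected whenever it is flagged as any attack class, and that the false positive rate is the share of normal records flagged as attacks. The surveyed papers may define these quantities differently, and the function name and sample labels are hypothetical.

```python
from collections import Counter


def per_category_rates(true_labels, predicted_labels, normal="normal"):
    """Return ({attack category: detection rate}, false positive rate)."""
    attack_totals = Counter()
    attack_detected = Counter()
    normal_total = 0
    false_alarms = 0

    for truth, predicted in zip(true_labels, predicted_labels):
        if truth == normal:
            normal_total += 1
            if predicted != normal:
                false_alarms += 1          # normal traffic flagged as attack
        else:
            attack_totals[truth] += 1
            if predicted != normal:
                attack_detected[truth] += 1  # attack flagged as some attack

    detection_rate = {
        category: attack_detected[category] / total
        for category, total in attack_totals.items()
    }
    false_positive_rate = false_alarms / normal_total if normal_total else 0.0
    return detection_rate, false_positive_rate


# Hypothetical labels for ten records.
truth = ["normal", "normal", "normal", "normal",
         "dos", "dos", "probe", "u2r", "u2r", "r2l"]
predicted = ["normal", "normal", "dos", "normal",
             "dos", "dos", "normal", "u2r", "normal", "normal"]

rates, fpr = per_category_rates(truth, predicted)
for category in sorted(rates):
    print(category, rates[category])
print("false positive rate", fpr)
```

Reporting the rates per category in this way makes visible the pattern the survey highlights: a classifier can score well on frequent categories while missing most instances of rare ones.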