A Survey of Botnet Detection

LI Heng-Feng
Master of Engineering in Distributed Computing
The University of Melbourne, Melbourne, Australia
hengfeng12345@gmail.com

HOU Ru-Xin
Master of Engineering in Distributed Computing
The University of Melbourne, Melbourne, Australia
houruxin@hotmail.com


Abstract: Although botnets have existed for many years, they have only recently attracted public
concern. Botnets pose a serious threat to personal privacy and wealth on the Internet, because
they have become a platform for network crimes such as DDoS attacks, extortion, spamming,
spreading new malware, traffic sniffing and click fraud. This paper describes the architecture
and life-cycle of botnets and classifies botnet detection techniques into three types: honeynet,
signature-based and network traffic-based detection. It then surveys current detection
approaches of each type and briefly compares them. Finally, it argues that a hybrid method
combining different techniques is a good way to detect botnets.


Keywords: Botnet; Botnet Detection; Bot; Security;


                                       I.      INTRODUCTION
       According to [1], a “botnet” is a network of a large number of infected end-hosts, called
“bots”, that is controlled by a remote human operator, called the “botmaster”. A small,
hard-to-detect malicious program sits on each compromised computer and performs nefarious
tasks [2]. Botmasters use command and control (C&C) channels to disseminate their commands to
the bots; these channels are built on a variety of protocols, such as Internet Relay Chat (IRC)
and Peer-to-Peer (P2P) [1]. In other words, the botmaster has a powerful bot army that performs
tasks on command.
       Compared with other malware (malicious software), such as worms and viruses, the key
difference is the existence of C&C channels, through which bots receive commands and perform
malicious actions under the control of the botmaster. The first generation of botnets used a
centralized C&C architecture, which made them vulnerable to detection and shutdown [3].
P2P-based botnets then emerged; because of their distributed architecture, they do not suffer
from a single point of failure [3].
       Botnets are most commonly used for criminal gain (i.e. monetary) or for destructive
purposes [4]. They provide a platform for network crimes such as Distributed Denial-of-Service
(DDoS) attacks, extortion, spamming, spreading new malware, traffic sniffing and click fraud
[4, 5]. Botnets therefore pose a severe threat to cyber security. To address this threat,
researchers have proposed several approaches to botnet detection.
       This survey presents a taxonomy of botnets and of botnet detection approaches. The
benefits and drawbacks of those approaches are discussed, followed by a brief evaluation and
comparison. The rest of the paper is organized as follows. Section II describes the development
of botnets and their characteristics, and takes a closer look at botnet technology. Section III
presents the different classes of botnet detection approaches: setting up a honeynet,
signature-based detection and network traffic-based detection. Section IV evaluates and compares
those approaches. Finally, Section V presents the conclusions.


                                    II.     BACKGROUND OF BOTNET
       Historically, bots can trace their roots to the Eggdrop bot, created by Jeff Fisher in 1993
to assist in IRC channel management [6]. Today, the botnet has developed into a malicious
application that poses a serious threat to cyber security. Botnets have become a valuable asset
for their owners, the botmasters, who make money by hiring them out to other cyber criminals as
a route to market for cybercrime such as phishing, spam, identity theft, click fraud and the
distribution of scam emails [7]. The following subsections present the architecture, life-cycle
and characteristics of botnets.


A. Botnet Architecture
       Like most malware, a botnet is a self-propagating application that can easily infect
vulnerable hosts. Botnets use command and control (C&C) channels for communication, through
which the botmaster can update the botnet and issue commands. Because the protocols that carry
C&C traffic are in widespread use and the C&C architecture gives the botmaster anonymity, it is
very difficult to detect a botnet by listening for abnormal commands. In the academic
literature, botnets are classified by their command and control architecture. The usual classes
are Internet Relay Chat based (IRC-based), HTTP-based and peer-to-peer (P2P) based botnets.
       The IRC botnet is the most popular type of botnet. IRC (Internet Relay Chat) is a chat
system that provides one-to-one and one-to-many instant messaging over the Internet [6]. Users
can create, select and join the channels that interest them, and can send a message to everyone
in a channel or to a single person [8]. In addition, the channel manager can set a channel
password to hide the channel [8]. The IRC architecture is shown in Figure 1.




                                                                                                           	
  
                              Figure 1. The Typical Structure of IRC botnets [8]
       In recent years, other forms of botnet have sprung up on the Internet, such as HTTP botnets
and P2P botnets. In an HTTP botnet, the botmaster simply places the command in a file on a C&C
server [9]. The bots periodically connect back to read the command file [9]. This form of botnet
is relatively loose, because control of the bots is not real-time: there is a delay between the
botmaster issuing a command and the bots receiving it [9]. The architecture of an HTTP botnet is
shown in Figure 2.




                                                                                                                       	
  
                              Figure 2. The HTTP botnet [9]
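       The delay described for HTTP botnets follows directly from polling: a command placed on
the server is only seen at the next poll. The following minimal Python sketch computes that
waiting time. It assumes a fixed polling interval, and the helper name delivery_delay is
hypothetical; neither is specified in [9].

```python
import math


def delivery_delay(issue_time, poll_interval, first_poll=0.0):
    """Seconds between a command being posted and the next poll that reads it.

    Assumes polls happen at first_poll, first_poll + poll_interval, ... and
    that issue_time is not earlier than first_poll.
    """
    polls_before = math.ceil((issue_time - first_poll) / poll_interval)
    next_poll = first_poll + polls_before * poll_interval
    return next_poll - issue_time


# With a 60-second interval, a command posted at t = 130 s is read at t = 180 s.
delay = delivery_delay(issue_time=130, poll_interval=60)
```

       Under these assumptions the delay is always shorter than one polling interval, which is
the sense in which control is "not real-time".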
       The P2P botnet is a newer trend in botnet development. Because the architecture of an IRC
botnet is centralized, the whole botnet stops working if the C&C server is shut down. Botmasters
therefore looked for a more robust way to build a botnet, which is the P2P botnet. P2P botnets do
not suffer from a single point of failure, because they have no centralized C&C server [10]. A
P2P botnet is shown in Figure 3.




                                                                                                                	
  
                              Figure 3. Example of Peer-to-peer Botnet Architecture [10]
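       The single-point-of-failure argument can be illustrated with a small graph experiment. The
Python sketch below is only a toy model: the star and ring layouts, the node names and the helper
functions are assumptions made for illustration, not topologies taken from [10].

```python
from collections import deque


def is_connected(adjacency, removed=frozenset()):
    """Return True if all nodes not in `removed` can still reach each other."""
    alive = [node for node in adjacency if node not in removed]
    if not alive:
        return True
    seen = {alive[0]}
    queue = deque([alive[0]])
    while queue:
        node = queue.popleft()
        for peer in adjacency[node]:
            if peer not in removed and peer not in seen:
                seen.add(peer)
                queue.append(peer)
    return len(seen) == len(alive)


def star(bots, server="c2"):
    """Centralized layout: every bot talks only to the C&C server."""
    graph = {server: set(bots)}
    for bot in bots:
        graph[bot] = {server}
    return graph


def ring_mesh(bots):
    """Toy P2P layout: each bot links to the two nearest peers on each side of a ring."""
    graph = {bot: set() for bot in bots}
    for i, bot in enumerate(bots):
        for step in (1, 2):
            peer = bots[(i + step) % len(bots)]
            graph[bot].add(peer)
            graph[peer].add(bot)
    return graph


bots = [f"bot{i}" for i in range(6)]
star_survives = is_connected(star(bots), removed={"c2"})      # server taken down
p2p_survives = is_connected(ring_mesh(bots), removed={"bot0"})  # one peer taken down
```

       In this model, removing the central server leaves the bots isolated from one another,
while removing any single peer from the mesh leaves the remaining peers connected.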


B. Botnet Life-Cycle
       Generally, a typical botnet exhibits the following life-cycle phases: initial infection and
injection, connection, malicious command and control (C&C), and finally update and maintenance,
as described by Feily et al. [11].
       In the initial infection phase, the bot-herder (attacker) who wants to attack others with a
botnet must first infect an initial set of vulnerable hosts. The infected hosts execute the bot
program; once the program runs on a host, that computer becomes a “zombie”. The botmaster can
then run code on the zombies.
       When it connects to the botnet, a zombie receives commands via the command and control
(C&C) channel established by the bot program. After executing the code from the botmaster, the
zombie becomes a member of the botnet. From then on, the botmaster can control the zombie
remotely.
       The last phase is update and maintenance. Botnet operators update the botnet for various
reasons. For instance, the botmaster may want to change the target from BBC to Google, and so
submits new commands to the zombies. The botmaster may also need to evade detection, for example
by changing the location of the servers.


                                            III. BOTNET DETECTION
       As botnets have become dramatically more prevalent, many researchers have begun to propose
solutions for detecting them. Botnet detection techniques follow three main approaches: setting
up a honeypot, signature-based detection and network traffic-based detection.


A. Honeypot
       Among the various botnet detection approaches, the honeypot is a well-known technique for
discovering the tools, tactics and motives of attackers. A honeypot is a trap set to detect,
deflect, or in some manner counteract attempts at unauthorized use of information systems [12].
Generally, it consists of a computer, data, or a network site that appears to be part of a
network but is actually isolated, protected and monitored, and that seems to contain information
or a resource of value to attackers [12]. Two or more honeypots on a network form a honeynet,
which is used to monitor a larger or more diverse network in which one honeypot may not be
sufficient [12].
       A typical honeypot deployment is that of the Honeynet Project [13]. It uses an unpatched
version of Windows 2000 or Windows XP as a honeypot, which is very vulnerable to attack. After
the honeypot has been successfully exploited by malware, the operators can collect all the
necessary information, and the honeypot can catch further malware. This information includes:
       DNS/IP-address of IRC server and port number
       (optional) password to connect to IRC-server
       Nickname of bot and structure
       Channel to join and (optional) channel-password
       That information is very helpful for tracing the botmaster and shutting down the whole
botnet.
       Although the honeypot is a good approach for detecting unknown botnets, it does carry risks
to a network and must be handled with care [12]. If honeypots are not properly walled off, an
attacker can use them to break into a system [12].
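       The items listed above map onto the first commands an IRC client sends after connecting
(PASS, NICK and JOIN). As a minimal sketch, the following Python code extracts them from captured
client lines. The class name, function name and sample capture are hypothetical, and the parser
is simplified: it does not handle comma-separated channel lists or message prefixes.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class C2Observation:
    """C&C details that a compromised honeypot can reveal."""
    server: str
    port: int
    server_password: Optional[str] = None
    nickname: Optional[str] = None
    # Each entry is (channel name, channel key or None).
    channels: List[Tuple[str, Optional[str]]] = field(default_factory=list)


def parse_irc_login(server, port, client_lines):
    """Collect password, nickname and channels from the bot's outgoing IRC lines."""
    obs = C2Observation(server=server, port=port)
    for line in client_lines:
        parts = line.strip().split()
        if not parts:
            continue
        command, args = parts[0].upper(), parts[1:]
        if command == "PASS" and args:
            obs.server_password = args[0]
        elif command == "NICK" and args:
            obs.nickname = args[0]
        elif command == "JOIN" and args:
            key = args[1] if len(args) > 1 else None
            obs.channels.append((args[0], key))
    return obs


# Hypothetical capture; 192.0.2.10 is a documentation-only address.
captured = [
    "PASS s3cret",
    "NICK DEU|XP|48213",
    "USER x x x :x",
    "JOIN #cmd chanpass",
]
observation = parse_irc_login("192.0.2.10", 6667, captured)
```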


B. Signature-based Detection
       Signature-based detection focuses on the features of bot behavior that differ from normal
activity. Because bots communicate with the C&C server through commands, a large amount of
information about bot behavior can be collected. To understand the mechanism of the botnet, the
detector tracks the behavior of bots on the host and derives a model of the bot behavior
signature. With this model, anomaly detection can be applied on the host to detect the botnet.
       Early research in this area paid most attention to host behavior, such as the invocation of
system calls and the related log files. In 2006, Al-Hammadi and Aickelin [14] proposed a
detection method based on monitoring the log files produced by applications, which record system
call invocations. A related idea, focused on the similarity of system call invocations, appeared
in 2007: through simulations with traces of worms and non-worms, Malan [15] identified signatures
that distinguish worms from non-worms, namely the similarity over time of their system call
invocations. These methods are rapid and reliable, but they depend on knowledge of existing
botnets or on tracing bots to identify signatures, and the similarity is sometimes very difficult
to find.
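       One simple way to make "similarity of system call invocations" concrete is to compare the
sets of short call sequences seen in two traces. The Python sketch below uses Jaccard similarity
over n-grams; this is a stand-in chosen for illustration and is not claimed to be the measure
used in [14] or [15]. The call names in the example are hypothetical.

```python
def ngrams(calls, n=3):
    """Return the set of length-n windows in a sequence of system call names."""
    return {tuple(calls[i:i + n]) for i in range(len(calls) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def trace_similarity(trace_a, trace_b, n=3):
    """Similarity in [0, 1] between two system call traces."""
    return jaccard(ngrams(trace_a, n), ngrams(trace_b, n))


# Two hosts running the same code produce near-identical call windows.
host_a = ["socket", "connect", "send", "recv"]
host_b = ["socket", "connect", "send", "close"]
host_c = ["open", "read", "close", "open"]

similar = trace_similarity(host_a, host_b)    # shares 1 of 3 distinct windows
unrelated = trace_similarity(host_a, host_c)  # shares none
```

       Hosts whose traces stay highly similar over successive time windows would then be
candidates for closer inspection.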
       Because IRC botnets are so prevalent, some approaches are based specifically on the
properties of IRC botnets. In 2006, Binkley and Singh [16] proposed an anomaly-based algorithm
for detecting IRC-based botnets, which combines IRC tokenization and IRC message statistics with
TCP-based anomaly detection. In their approach, the IRC component creates two tuples: one for
determining the IRC network based on IP channel names, and a sub-tuple that collects statistics
on individual IRC hosts in channels. The channels are then sorted by the number of scanners they
contain, producing a sorted list of potential botnets. With these gross statistical measures, the
system can spot botnet clients and reveal bot servers [16]. Because the method is built on the
properties of known botnets and one specific protocol, it is hard for it to identify unknown
botnets or botnets based on other protocols, such as HTTP or P2P.
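       The channel-sorting step can be sketched as follows. This Python example assumes that the
set of scanning hosts has already been produced by a separate TCP anomaly detector; the function
name, input format and addresses are hypothetical and do not reproduce the tuples defined in
[16].

```python
from collections import defaultdict


def rank_channels(memberships, scanners):
    """Sort IRC channels by how many member hosts are flagged as scanners.

    memberships: iterable of (channel, host_ip) pairs seen in IRC traffic.
    scanners: set of host IPs flagged by a separate TCP anomaly detector.
    Returns a list of (channel, scanner_count, host_count).
    """
    hosts_by_channel = defaultdict(set)
    for channel, host in memberships:
        hosts_by_channel[channel].add(host)
    report = [
        (channel, len(hosts & scanners), len(hosts))
        for channel, hosts in hosts_by_channel.items()
    ]
    # Most scanners first; ties broken by channel name for a stable order.
    report.sort(key=lambda row: (-row[1], row[0]))
    return report


memberships = [
    ("#x", "10.0.0.1"), ("#x", "10.0.0.2"), ("#x", "10.0.0.3"),
    ("#chat", "10.0.0.4"), ("#chat", "10.0.0.1"),
]
scanners = {"10.0.0.1", "10.0.0.2", "10.0.0.3"}
ranking = rank_channels(memberships, scanners)
```

       A channel in which every member is also a scanner rises to the top of the list, which is
the kind of gross statistical signal the text describes.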
       Another detection tool, Rishi, was developed by Goebel and Holz [17] in 2007. It is mainly
based on passive monitoring for unusual or suspicious IRC nicknames, IRC servers and uncommon
server ports. It observes protocol messages and combines their analysis with a scoring function
and black-/whitelists to detect IRC characteristics [17].
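       A scoring function of this general kind can be sketched in a few lines of Python. The
patterns, weights and port list below are invented for illustration and are not the rules used
by Rishi [17].

```python
import re

# Hypothetical patterns and weights, chosen only for illustration.
SUSPICIOUS_PATTERNS = [
    (re.compile(r"\d{4,}"), 2),                           # long run of digits
    (re.compile(r"[|\[\]{}_-]"), 1),                      # separator characters
    (re.compile(r"^[a-z]{2,3}[|_-]", re.IGNORECASE), 1),  # short prefix + separator
]
STANDARD_PORTS = {6667}  # assumption: anything else counts as uncommon


def score_connection(nickname, server, port, whitelist=(), blacklist=()):
    """Return a suspicion score for one observed IRC connection."""
    if server in whitelist:
        return 0
    score = 10 if server in blacklist else 0
    for pattern, weight in SUSPICIOUS_PATTERNS:
        if pattern.search(nickname):
            score += weight
    if port not in STANDARD_PORTS:
        score += 1
    return score


bot_like = score_connection("DEU|XP|48213", "192.0.2.10", 6667)
human_like = score_connection("alice", "192.0.2.20", 6667)
```

       A deployment would compare the score with a threshold tuned on local traffic; the
whitelist short-circuits scoring for servers known to be benign.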
       As understanding of botnets deepened, other approaches emerged that focus on the signatures
of one specific kind of botnet, such as the spamming botnet. In 2008, Xie et al. [18] developed a
framework called AutoRE, which generates spam signatures. Using these signatures, AutoRE is able
to detect botnet-based spam emails and botnet membership [18]. In 2009, Sroufe et al. [19]
presented a structure analysis method for spam email, which characterizes an email by mimicking
human visual inspection and generates a botnet signature from the result.
       Although signature-based detection has high accuracy and a low false positive rate, its
drawback is that it has difficulty identifying unknown botnets. Furthermore, each method targets
only a particular botnet or protocol, which limits its scalability.


C. Network Traffic-based Detection
       Network traffic-based detection pays more attention to the differences in network flows
between normal hosts and bots. It analyzes network flows and detects the C&C channel by using
the flow characteristics of bots.
       Three typical network traffic-based systems were developed by Guofei Gu's research group.
In 2007, they proposed a new detection strategy that focuses on detecting the infection and
coordination communication that occurs during the infection process of malware [20]. They
developed an application, called BotHunter, which builds an evidence trail of data exchanges that
match a state-based infection sequence model by tracking the two-way dialog flows of the
communication [20]. In 2008, Gu et al. [9] proposed BotSniffer, a prototype system based on the
correlation and similarity among bots in the same botnet: for example, such bots engage in
coordinated communication, propagation, and attack and fraudulent activities. BotSniffer's
monitor engine examines network traffic, generates connection records for suspicious C&C
protocols, and detects activity response behavior and message response behavior in the monitored
network [9]. In the same year, they developed another botnet detection system, named BotMiner,
which is independent of the protocol and structure used by botnets [21]. Although these
approaches can be independent of the protocol and botnet structure, it is hard to identify the
similarity among bots, which may lead to a high false positive rate.
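       The idea of correlated group behavior can be illustrated with a much simplified check:
do most clients of one server show suspicious activity within the same time window? The Python
sketch below is an assumption-laden toy with hypothetical names, thresholds and data; the actual
systems in [9, 21] use more elaborate statistical tests and clustering.

```python
from collections import defaultdict


def response_crowds(clients_by_server, events, window=60,
                    min_fraction=0.5, min_clients=3):
    """Find time windows in which most clients of one server act together.

    clients_by_server: dict mapping a server to the set of clients seen talking to it.
    events: iterable of (timestamp_seconds, client) for suspicious activity
            such as scanning.
    Returns a list of (server, window_start_seconds, fraction_active).
    """
    active = defaultdict(set)  # window index -> clients active in that window
    for timestamp, client in events:
        active[int(timestamp // window)].add(client)

    findings = []
    for server, clients in clients_by_server.items():
        if len(clients) < min_clients:
            continue
        for index in sorted(active):
            fraction = len(active[index] & clients) / len(clients)
            if fraction >= min_fraction:
                findings.append((server, index * window, fraction))
    return findings


clients_by_server = {
    "192.0.2.10": {"a", "b", "c", "d"},
    "192.0.2.20": {"e", "f", "g"},
}
events = [(5, "a"), (12, "b"), (40, "c"), (300, "e"), (900, "f")]
crowds = response_crowds(clients_by_server, events)
```

       Legitimate clients of a popular server can also act at similar times, which is one way
such group-based checks produce false positives.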
       Strayer et al. [22] presented a botnet detection approach in 2008 in which flow
characteristics such as bandwidth, packet timing and burst duration are used as evidence of
botnet command and control activity. Their system first applies filters to reduce the amount of
data and then uses machine learning techniques to classify the flows. After that, flows that
share similar timing and packet size characteristics are clustered, and those clusters are
analyzed to try to identify the botmaster host [22].
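       The flow characteristics named above can be computed from packet timestamps and sizes. The
Python sketch below covers only feature extraction and a crude filter stage; the burst
definition, the bandwidth threshold and the function names are assumptions, and the
classification and clustering stages of [22] are not reproduced. Each flow is assumed to be a
non-empty, time-ordered list of packets.

```python
def flow_features(packets, burst_gap=1.0):
    """Summarise one flow given as a time-ordered list of (timestamp_s, size_bytes)."""
    times = [t for t, _ in packets]
    total_bytes = sum(size for _, size in packets)
    duration = times[-1] - times[0]
    gaps = [b - a for a, b in zip(times, times[1:])]

    # A burst is a run of packets whose gaps never exceed burst_gap seconds.
    longest_burst = 0.0
    burst_start = times[0]
    for prev, cur in zip(times, times[1:]):
        if cur - prev > burst_gap:
            longest_burst = max(longest_burst, prev - burst_start)
            burst_start = cur
    longest_burst = max(longest_burst, times[-1] - burst_start)

    return {
        "bandwidth_bps": 8 * total_bytes / duration if duration > 0 else 0.0,
        "mean_gap_s": sum(gaps) / len(gaps) if gaps else 0.0,
        "longest_burst_s": longest_burst,
    }


def keep_chat_like(flows, max_bandwidth_bps=10_000):
    """Filter stage: discard high-bandwidth bulk transfers before classification."""
    kept = {}
    for name, packets in flows.items():
        features = flow_features(packets)
        if 0 < features["bandwidth_bps"] <= max_bandwidth_bps:
            kept[name] = features
    return kept


flows = {
    "irc_like": [(0.0, 80), (0.5, 60), (30.0, 90), (30.4, 70)],
    "download": [(0.0, 1500), (0.01, 1500), (0.02, 1500), (0.03, 1500)],
}
kept = keep_chat_like(flows)
```

       The surviving feature vectors would then be passed to whatever classifier or clustering
method is chosen.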
       In 2010, two new network traffic-based detection approaches appeared. The first, presented
by Nogueira et al. [23], is based on an Artificial Neural Network that identifies licit and
illicit traffic patterns. A detection framework, the Botnet Security System (BoNeSSy), was
developed from this approach; it distinguishes different applications or groups of applications
by the characteristic traffic pattern unique to each application [23]. The second is an
anomaly-based botnet detection scheme for the IRC protocol, proposed by Zilong et al. [8], which
uses NetFlow data as its raw input. The advantage of this approach is that it can discover
botnets in encrypted traffic and detect the botmaster. However, it cannot detect IRC botnets that
use non-standard ports, and it cannot inspect flows in real time [8]. The three approaches above
[8, 22, 23] all work by grouping or analyzing network flows according to the unique communication
mechanism of applications.


                                                      IV. COMPARISON
       This section gives a brief comparison of the approaches above. The comparison is based on
the following key factors: detection of unknown botnets, protocol and structure independence,
false positive rate, cost, and risk. The results are shown in Table 1.
                      Table 1. COMPARISON OF BOTNET DETECTIONS
                      (O: criterion met; X: criterion not met)

  Different               Ref.   Unknown     Protocol &    Low False   Low    Low
  Approaches                     Botnet      Structure     Positive    Cost   Risk
                                 Detection   Independent

  Honeypot                [13]   O           X             O           X      X

  Signature-based         [14]   X           X             O           O      O
                          [15]   X           X             O           O      O
                          [16]   X           X             O           O      O
                          [17]   X           X             O           O      O
                          [18]   X           X             O           O      O
                          [19]   X           X             O           O      O

  Network Traffic-based   [20]   O           O             X           O      O
                          [9]    O           O             X           O      O
                          [21]   O           O             X           O      O
                          [22]   O           X             X           O      O
                          [23]   O           O             X           O      O
                          [8]    O           X             X           O      O


       First, the honeynet [13] is a useful approach for collecting statistics on unknown botnets,
so that botnet technology and characteristics can be identified. It is also accurate, with a low
false positive rate. However, without careful deployment it may introduce risk to a network. In
addition, honeypots are deployed by companies, research organizations or educational institutions
that are devoted to analyzing botnets. Second, although signature-based detection has a low false
positive rate, it depends on knowledge of existing botnets or protocols. The approaches in
[14, 15] are rapid and reliable, but they are tied to a specific platform, because log files and
system calls differ between platforms. The approaches in [16, 17] are built on the properties of
known botnets and one specific protocol, so it is hard for them to identify unknown botnets or
botnets based on other protocols, such as HTTP or P2P. The approaches in [18, 19] target only one
particular kind of botnet, which limits their scalability. Finally, network traffic-based
detection performs well in unknown botnet detection and in cost, but it has a high false positive
rate due to misclassification of legitimate traffic flows. Although the approaches in [9, 20, 21]
can be independent of the protocol and botnet structure, it is hard to identify the similarity of
network flows, which may lead to a high false positive rate. The approaches in [8, 22, 23] have
the advantage of detecting unknown botnets by grouping or analyzing network flows, but
misclassification of normal flows contributes to a high false positive rate.
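       The marks in Table 1 can also be treated as data. The Python sketch below encodes each row
and checks which criteria a combination of approaches covers between them. Taking the union of
marks is a deliberately crude assumption: a real hybrid system does not automatically inherit
every strength of its parts. The criterion identifiers are hypothetical names for the table
columns.

```python
CRITERIA = (
    "unknown_botnet", "protocol_independent",
    "low_false_positive", "low_cost", "low_risk",
)

# Reference number -> (approach type, O/X marks from Table 1 in CRITERIA order).
TABLE_1 = {
    13: ("honeypot", "OXOXX"),
    14: ("signature", "XXOOO"),
    15: ("signature", "XXOOO"),
    16: ("signature", "XXOOO"),
    17: ("signature", "XXOOO"),
    18: ("signature", "XXOOO"),
    19: ("signature", "XXOOO"),
    20: ("traffic", "OOXOO"),
    9: ("traffic", "OOXOO"),
    21: ("traffic", "OOXOO"),
    22: ("traffic", "OXXOO"),
    23: ("traffic", "OOXOO"),
    8: ("traffic", "OXXOO"),
}


def satisfied(ref):
    """Criteria marked O for one surveyed approach."""
    _, marks = TABLE_1[ref]
    return {name for name, mark in zip(CRITERIA, marks) if mark == "O"}


def covered_by(refs):
    """Criteria marked O for at least one approach in the combination."""
    covered = set()
    for ref in refs:
        covered |= satisfied(ref)
    return covered


missing_alone = {ref: set(CRITERIA) - satisfied(ref) for ref in TABLE_1}
hybrid = covered_by([14, 20])  # one signature-based plus one traffic-based approach
```

       No single row of the table meets all five criteria, whereas the marks of a signature-based
and a traffic-based approach together cover every column.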


                                         V.   CONCLUSIONS
       Botnets pose a serious threat to personal privacy and wealth on the Internet. Although
botnets have existed for many years, they have only recently attracted public concern, and
detection techniques are still immature. This paper surveys the background of botnets and botnet
detection.
       In this paper, botnet detection approaches are divided into three types: setting up a
honeypot, signature-based detection and network traffic-based detection. Each type of detection
technique has its benefits and drawbacks.
       Different approaches focus on different specific fields. Hybrid botnet detection is a good
way to take advantage of all of them. Honeypots should be deployed by research organizations or
large companies to collect information on unknown botnets and understand their techniques. That
information can then be used to develop signature-based techniques, which detect abnormal
behavior on end hosts. In addition, it is useful to deploy network traffic-based techniques in
the routers or firewalls of intranets and Ethernet networks.


                                            REFERENCES

[1] M. Rajab, J. Zarfoss, F. Monrose, & A. Terzis 2006, ‘A Multifaceted Approach to
Understanding the Botnet Phenomenon’, Proc. 6th ACM SIGCOMM Conference on Internet
Measurement (IMC’06), pp. 41-52.
[2] Jim Louderback 2007, ‘Beware of Botnets’, PC MAGAZINE, pp. 13.
[3] Zeidanloo, H.R., Bt Manaf, A., Vahadani, P., Tabatabaei, F., & Zamani, M. 2010, ‘Botnet
Detection Based on Traffic Monitoring’, 2010 International Conference on Networking and
Information Technology (ICNIT), pp. 97-101.
[4] Bacher, P, Holz, T, Kotter, M & Wicherski, G 2008, Know your enemy: Tracking botnets, The
Honeynet Project, Naperville, viewed 4 October 2010, <http://www.honeynet.org/papers/bots>.
[5] Ianelli, N & Hackworth, A 2005, ‘Botnets as a Vehicle for Online Crime’, CERT Request for
Comments (RFC) 1700.
[6] Chao, L, Wei, J & Xin, Z 2009, ‘Botnet: Survey and Case Study’, 2009 Fourth International
Conference on Innovative Computing, Information and Control, pp. 1184-1187.
[7] Tony, B 2010, Microsoft Exposes Scope of Botnet Threat, PCWorld, viewed 15 October 2010,
<http://www.pcworld.com/businesscenter/article/207961/microsoft_exposes_scope_of_botnet_threat.html>.
[8] Zilong, W, Jinsong, W, Wenyi, H & Chengyi, X 2010, ‘The Detection of IRC Botnet Based on
Abnormal Behavior’, 2010 Second International Conference on MultiMedia and Information
Technology, pp. 146-149.
[9] Gu, G, Zhang, J & Lee, W 2008, ‘BotSniffer: Detecting botnet command and control
channels in network traffic’, Proc. 15th Annual Network and Distributed System Security
Symposium (NDSS’08), pp. 269-286.
[10] Zeidanloo, H.R., Shooshtari, M.J.Z., Amoli, P.V., Safari, M. & Zamani, M. 2010, ‘A
Taxonomy of Botnet Detection Techniques’, 2010 3rd IEEE International Conference on
Computer Science and Information Technology (ICCSIT), pp. 158-162.
[11] Feily, M., Shahrestani, A. & Ramadass, S. 2009, ‘A Survey of Botnet and Botnet Detection’,
2009 Third International Conference on Emerging Security Information, Systems and
Technologies, pp. 268-273.
[12] Honeypot (computing), (4 August 2003), viewed 4 October 2010,
<http://en.wikipedia.org/wiki/Honeypot_(computing)>.
[13] Getting information with the help of honeynets, (8 October 2008), viewed 7 October 2010,
<http://www.honeynet.org/node/59>.
[14] Al-Hammadi, Y & Aickelin, U 2006, ‘Detecting Botnets through Log Correlation’, Proc.
IEEE/IST Workshop on Monitoring, Attack Detection and Mitigation, pp. 97-100.
[15] Malan, D.J. 2007, ‘Rapid Detection of Botnets through Collaborative Networks of Peers’
[Ph.D. dissertation], Harvard University, Cambridge, Massachusetts.
[16] Binkley, J.R. & Singh, S. 2006, ‘An Algorithm for Anomaly-based Botnet Detection’, Proc.
2nd Workshop on Steps to Reducing Unwanted Traffic on the Internet, pp. 43-48.
[17] Goebel, J & Holz, T 2007, ‘Rishi: Identify Bot Contaminated Hosts by IRC Nickname
Evaluation’, Proc. HotBots’07, First Workshop on Hot Topics in Understanding Botnets.
[18] Xie, Y, Yu, F, Achan, K, Panigrahy, R & Hulten, G 2008, ‘Spamming Botnets: Signatures
and Characteristics’, ACM SIGCOMM’08.
[19] Sroufe, P., Phithakkitnukoon, S., Dantu, R. & Cangussu, J. 2009, ‘Email Shape Analysis for
Spam Botnet Detection’, 2009 6th IEEE Consumer Communications and Networking
Conference(CCNC), pp. 1-2.
[20] Gu, G, Porras, P, Yegneswaran, V, Fong, M & Lee, W 2007, ‘BotHunter: Detecting Malware
Infection through IDS-Driven Dialog Correlation’, Proc. 16th USENIX Security Symposium
(Security’07), pp. 167-182.
[21] Gu, G, Perdisci, R, Zhang, J & Lee, W 2008, ‘BotMiner: Clustering Analysis of
Network Traffic for Protocol- and Structure-Independent Botnet Detection’, Proc. 17th USENIX
Security Symposium (Security’08), pp. 139-154.
[22] Strayer, W, Lapsley, D, Walsh, R & Livadas, C 2008, ‘Botnet Detection Based on Network
Behavior’, Botnet Detection: Countering the Largest Security Threat, pp. 1-24.
[23] Nogueira, A, Salvador, P & Blessa, F 2010, ‘A Botnet Detection System based on Neural
Networks’, 2010 Fifth International Conference on Digital Telecommunications, pp. 57-62.



