Adaptive Data Visualization

Packet Information Collection and Transformation for Network Intrusion Detection and Prevention




Richard A. Aló, Ali Berrached, Mohsen Beheshti,
    Ping Chen, Jack Han, Francois Modave
Center for Computational Sciences and Advanced
             Distributed Simulation
        University of Houston-Downtown
      Problem: Adaptive Data Visualization


Visualization: the graphical presentation of a data set,
  with the goal of giving the viewer a
  qualitative understanding of its information
  content in a natural and direct way.
What a visualization system should do: convert forward

(rendered image) = f(213, 108, 30, 1704, 17, 2, 44, 140, 477, -108, 0.0)

What a viewer should be allowed to do: convert backward

What a viewer should REALLY be allowed to do: find knowledge
   Graphical elements
• point
• line
• polyline
• glyph
• 2-D or 3-D surface
• 3-D solid
• image
• text
    Element properties
color/intensity
location
style/texture/shade/light
size (no perspective view)
angle
relative position/motion
       What exactly is the problem?
• Basic requirement: find an f(…) satisfying:
   – Different data values should be represented differently in the
       display; the more the values differ, the more their displays should differ
• Computation constraints:
   – Performance: a line is better than a curve
   – Memory usage
• Data constraints:
   – The field is in its infancy, domain knowledge is needed, and a universal theory is unlikely
   – High-dimensional data must be displayed in a 3-D world or on a 2-D screen
• Human constraints:
   –   Inefficient, slow processing
   –   Ambiguous
   –   User-dependent, area-dependent
   –   Eye limits
• Non-uniform data distribution

Need to cluster the data set first
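The slides do not say which clustering method is used. As a minimal sketch only, the following assumes a simple gap-based heuristic for one dimension: sort the values and start a new cluster wherever the gap to the previous value is much larger than the typical gap. The function name `cluster_1d` and the `gap_factor` parameter are hypothetical.

```python
def cluster_1d(values, gap_factor=3.0):
    """Split 1-D values into clusters at unusually large gaps.

    Assumed heuristic (not from the slides): a gap larger than
    gap_factor times the median gap starts a new cluster.
    """
    ordered = sorted(values)
    if len(ordered) < 2:
        return [ordered] if ordered else []

    # Gaps between consecutive sorted values.
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    median_gap = sorted(gaps)[len(gaps) // 2]
    threshold = gap_factor * median_gap

    clusters = [[ordered[0]]]
    for value, gap in zip(ordered[1:], gaps):
        if gap > threshold:
            clusters.append([value])    # large gap: open a new cluster
        else:
            clusters[-1].append(value)  # small gap: extend the current cluster
    return clusters


groups = cluster_1d([102, 1, 100, 2, 101, 3])
print(groups)
```

Once the clusters are known, each one can be given its own share of the display range, so that a dense group of values is not squeezed into a few indistinguishable colors or sizes.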
         Non-uniform
knowledge/information distribution
  – Water temperature: a change from 40 °C to 41 °C
    and a change from 99 °C to 100 °C are different
  – A change of water temperature from 40 °C to 41 °C
    and a change of a patient's body temperature from
    40 °C to 41 °C are different
Need to integrate domain knowledge through
 interaction with users
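One way to let a user inject this kind of domain knowledge is a piecewise-linear transformation defined by user-chosen control points. The sketch below is an illustration under that assumption, not the authors' method; `piecewise_transform` and all control-point values are hypothetical.

```python
def piecewise_transform(value, control_points):
    """Map a data value to a display level in [0, 1].

    control_points is a list of (data_value, display_level) pairs with
    strictly increasing data values, chosen by the user (assumption).
    Values between two points are linearly interpolated; values outside
    the covered range are clamped to the nearest end point.
    """
    points = sorted(control_points)
    if value <= points[0][0]:
        return points[0][1]
    if value >= points[-1][0]:
        return points[-1][1]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= value <= x1:
            return y0 + (y1 - y0) * (value - x0) / (x1 - x0)
    raise ValueError("control points must have strictly increasing data values")


# Hypothetical control points: for water, half of the display range is
# reserved for the last degree before boiling.
water_points = [(0.0, 0.0), (99.0, 0.5), (100.0, 1.0)]
# Hypothetical control points: for body temperature, 40 to 41 degrees
# already takes half of the display range.
body_points = [(35.0, 0.0), (40.0, 0.5), (41.0, 1.0)]

water_low = piecewise_transform(41.0, water_points) - piecewise_transform(40.0, water_points)
water_high = piecewise_transform(100.0, water_points) - piecewise_transform(99.0, water_points)
body_step = piecewise_transform(41.0, body_points) - piecewise_transform(40.0, body_points)
print(water_low, water_high, body_step)
```

With these control points the same one-degree step produces a small visual change for water at 40 °C, a large one at 99 °C, and a large one for body temperature at 40 °C, which is the behaviour the two examples above call for.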
   Adaptive Data Visualization
       System Properties
• Interactive and adaptive
• Correctness
• Maximizing
     Interactive and Adaptive Visualization System

Domain knowledge integration is achieved by
 choosing proper association and
 transformation functions during the
  visualization process.
Interactive: provide a mechanism for viewers to
  adjust or change transformation functions
  during the visualization process.
Interaction allows the user to guide the visualization
  system step by step to display and clarify what
  is of interest.
                Correctness
• If possible, the visualization system should
  show different dimensions of a data set
  differently, through different visual objects
  or through different visual properties (visual elements) of the
  same visual object.
• The more different the values are, the more
  differently they should be rendered.
• The more different the information
  represented by the data values is, the more
  differently they should be rendered.
            Maximizing
To optimize the rendering quality, the
 maximal range of visual objects/elements
 should be used.
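Read literally, the Maximizing rule suggests stretching the observed data range over the full range of a visual property. A minimal sketch, assuming a plain linear stretch; the function name `maximize_range` and the 0 to 255 intensity range are illustrative choices, not taken from the slides.

```python
def maximize_range(values, out_min=0.0, out_max=1.0):
    """Linearly stretch values so they span [out_min, out_max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are equal: nothing to stretch, use the middle of the range.
        middle = (out_min + out_max) / 2
        return [middle for _ in values]
    scale = (out_max - out_min) / (hi - lo)
    return [out_min + (v - lo) * scale for v in values]


# Map three data values onto a hypothetical 0..255 intensity scale.
intensities = maximize_range([10, 15, 20], 0, 255)
print(intensities)
```

A linear stretch also satisfies the Correctness rule in its simplest form: larger differences between values become proportionally larger differences on screen.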
      Adaptive Data Visualization
              Algorithm
Load the dataset
Find clusters for each individual dimension
Perform association and transformation according to the “Maximizing” rule
Render data
Viewer wants to change the association step?      Yes: viewer changes association.   No: continue
Viewer wants to change the transformation step?   Yes: viewer changes transformation
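The flow above can be sketched as a small driver loop. Everything here is an assumption made for illustration: the function and parameter names are hypothetical, viewer interaction is modelled as a list of scripted requests, and the sketch assumes the data is rendered again after each change, which the flowchart text does not state.

```python
def adaptive_visualization(columns, find_clusters, association, transforms,
                           render, viewer_requests):
    """Run the load / cluster / associate / transform / render cycle.

    columns:         dict mapping dimension name to a list of values
    find_clusters:   function(values) returning clusters for one dimension
    association:     dict mapping dimension name to a visual property name
    transforms:      dict mapping dimension name to function(values, clusters)
    render:          function(frame) that displays {visual property: levels}
    viewer_requests: iterable of (kind, dimension, replacement) tuples, where
                     kind is "association" or "transformation"
    """
    association = dict(association)   # work on copies, leave caller's dicts intact
    transforms = dict(transforms)

    # Find clusters for each individual dimension.
    clusters = {dim: find_clusters(vals) for dim, vals in columns.items()}

    def draw():
        frame = {}
        for dim, vals in columns.items():
            frame[association[dim]] = transforms[dim](vals, clusters[dim])
        render(frame)

    draw()
    for kind, dim, replacement in viewer_requests:
        if kind == "association":
            association[dim] = replacement      # viewer changes association
        elif kind == "transformation":
            transforms[dim] = replacement       # viewer changes transformation
        else:
            raise ValueError(f"unknown request kind: {kind}")
        draw()                                  # assumed: re-render after each change


frames = []
adaptive_visualization(
    columns={"temp": [1, 2, 3]},
    find_clusters=lambda vals: [vals],
    association={"temp": "color"},
    transforms={"temp": lambda vals, clusters: list(vals)},
    render=frames.append,
    viewer_requests=[
        ("association", "temp", "size"),
        ("transformation", "temp", lambda vals, clusters: [v * 2 for v in vals]),
    ],
)
print(frames)
```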
              Future Work
• More applications
    Packet Information Collection and Transformation for
        Network Intrusion Detection and Prevention

•   Introduction
•   The SNORT System
•   The SNORT Setup
•   The See5 System
•   Data Transformation
•   Information Fusion Framework for Intrusion Detection
•   Conclusion and Future Work
                Introduction


• Network Intrusion Detection System (IDS)
• Network Intrusion Prevention System
  (IPS)
• Suspicious network activities
  – misuse
  – anomaly
         Intrusion Detection Process


             CSRL Fusion System


• Data Collection: capture packet data from
  network traffic using the tool SNORT
• Data Preprocessing: transform the data into the
  input format required by
  See5
• Pattern Detection: apply See5 to induce
  intrusion detection rules, a set of alert
  rules for recognizing malicious activities
• Response: integrate the detection rules
  into a firewall to prevent potential attacks
                       SNORT System

• Network sniffer developed by Martin Roesch in 1998
• Logs packets in a database
• SNORT database
   – four tables record information about network packets
     for the following protocols: tcp, udp, icmp, and ip
   – two other tables
      • acid_event consolidates all the alert logs
      • opt holds the optional data that can be part of the
        TCP/IP protocol.
                   SNORT Setup


• Database: MySQL
• Two system setups
  – Working system
     • Two servers for cross platform and data fusion
        – Linux server
        – Windows server
     • WAN
  – Testing system
     • Testing SNORT rules and transforming data
     • LAN
                    SNORT Rule Type


ruletype nonalert
   {
      type alert
      output database: log, mysql, user=snort password=password dbname=snortTest host=localhost
   }
                  SEE5 System


• A machine learning and data mining
  system for Windows, evolved from C4.5
• Generates a decision tree
• Two input files
  – .names: attributes and their characteristics, such as
    data type, range, etc.
  – .data: the raw data set
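To make the two input files concrete, the sketch below builds the text of a `.names` file and a `.data` file from a few packet records. The helper `build_see5_files`, the attribute names, and the class labels are hypothetical and are not the attributes used in the project. The layout follows the usual See5 conventions: the first `.names` entry names the target attribute, each attribute is declared as `continuous` or as a list of discrete values, each `.data` line holds one comma-separated case, and `?` marks an unknown value.

```python
def build_see5_files(records, attributes, target):
    """Return (names_text, data_text) for See5.

    records:    list of dicts, one per case
    attributes: list of (name, spec) pairs in file order, where spec is
                "continuous" or a list of allowed discrete values
    target:     name of the attribute See5 should predict
    """
    names_lines = [f"{target}.", ""]
    for name, spec in attributes:
        if spec == "continuous":
            names_lines.append(f"{name}: continuous.")
        else:
            names_lines.append(f"{name}: {', '.join(spec)}.")

    data_lines = []
    for record in records:
        fields = []
        for name, _ in attributes:
            value = record.get(name)
            fields.append("?" if value is None else str(value))
        data_lines.append(",".join(fields))

    return "\n".join(names_lines) + "\n", "\n".join(data_lines) + "\n"


# Hypothetical attributes and labels, for illustration only.
attributes = [
    ("protocol", ["tcp", "udp", "icmp"]),
    ("dst_port", "continuous"),
    ("length", "continuous"),
    ("label", ["normal", "attack"]),
]
records = [
    {"protocol": "tcp", "dst_port": 80, "length": 60, "label": "normal"},
    {"protocol": "udp", "dst_port": None, "length": 512, "label": "attack"},
]
names_text, data_text = build_see5_files(records, attributes, target="label")
print(names_text)
print(data_text)
```

Writing the two strings to files that share a stem, for example `packets.names` and `packets.data`, gives See5 a matching pair of inputs.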
System Framework
Essential Attacks Collected from Two Sensors per Day
Essential Attacks Collected from Two Sensors per Hour
                      Conclusion


• CSRL project in progress
• Four components of IDS and IPS
  – Data Collection    -- finished
  – Data Preprocessing -- finished
  – Pattern Detection  -- ongoing
  – Response           -- future work

				