Sensitivity of PCA for Traffic Anomaly Detection

  Evaluating the robustness of current best practices

      Haakon Ringberg (1), Augustin Soule (2),
      Jennifer Rexford (1), Christophe Diot (2)
      (1) Princeton University, (2) Thomson Research

Outline
       Background and motivation
        Traffic anomaly detection
        PCA and subspace approach
       Problems with methodology
       Conclusion & future directions




A network in the Internet

   [Figure: a network connected to four neighboring autonomous systems (AS)]

Network anomalies

   [Figure: a network and sources of anomalous traffic, labeled BOTNET (many computers), March Madness, and VIAGRA]

   We want to be able to detect these anomalies!

Network anomaly detectors

   [Figure: a network and an anomaly detector reporting "We’re good!"]

   Monitor health of network
   Real-time reporting of anomalies

Principal Components Analysis (PCA) Benefits
   Finds correlations across multiple links
       Network-wide analysis
       [Lakhina SIGCOMM’04]
   Demonstrated ability to detect wide variety of anomalies
       [Lakhina IMC’04]
   Subspace methodology
       We use same software

   [Figure: anomaly detector = PCA; a network connected to several ASes, one of them marked Victim]

Principal Components Analysis (PCA)
   PCA transforms data into a new coordinate system
   Principal components (new bases) ordered by captured variance
   The first k tend to capture periodic trends
       normal subspace
       vs. anomalous subspace
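The split described above can be sketched in a few lines of NumPy. This is a minimal illustration on synthetic data, not the software used in the study; the function name `split_subspaces`, the traffic matrix `X` (time bins by links), and the link loadings are assumptions made for the example.

```python
import numpy as np

def split_subspaces(X, topk):
    """Split the principal axes of X into a normal and an anomalous subspace.

    X    : (time bins x links) traffic matrix
    topk : number of leading principal components treated as "normal"
    """
    Xc = X - X.mean(axis=0)                      # center each link's time series
    # Rows of Vt are the principal axes, ordered by captured variance.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    variance = s**2 / (X.shape[0] - 1)           # variance captured by each axis
    P_normal = Vt[:topk].T                       # basis of the normal subspace
    P_anomalous = Vt[topk:].T                    # basis of the anomalous subspace
    return Xc, variance, P_normal, P_anomalous

# Synthetic example: 4 links sharing one daily cycle, plus small noise.
rng = np.random.default_rng(0)
t = np.arange(288 * 7)                           # one week of 5-minute bins
cycle = np.sin(2 * np.pi * t / 288)
X = np.outer(cycle, [10.0, 8.0, 6.0, 4.0]) + rng.normal(0, 0.5, (t.size, 4))

Xc, variance, P_n, P_a = split_subspaces(X, topk=1)
print("fraction of variance per PC:", np.round(variance / variance.sum(), 3))
```

In this toy case the shared daily cycle dominates the variance, so the first principal component alone forms the normal subspace and the remaining three axes form the anomalous subspace.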

Pictorial overview of subspace methodology
1.   Training: separate normal & anomalous traffic patterns
2.   Detection: find spikes
3.   Identification: find original spatial location that caused spike (e.g. router, flow)

   [Figure: PCA splits the signal into normal and anomalous parts; a network with endpoints A and B]

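The three steps can be sketched end to end as follows. This is a simplified illustration under stated assumptions, not the evaluated software: the `mean + 3 * std` threshold on the squared residual norm and the "largest residual entry" identification rule are placeholders chosen for brevity, and all function names and the synthetic data are hypothetical.

```python
import numpy as np

def train(X, topk):
    """Training: the topk leading principal axes span the normal subspace."""
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:topk].T

def residual(X, mean, P_normal):
    """Part of each state vector that lies in the anomalous subspace."""
    Xc = X - mean
    return Xc - Xc @ P_normal @ P_normal.T

def detect(R, n_sigma=3.0):
    """Detection: flag time bins whose squared residual norm spikes.

    The mean + n_sigma * std threshold is a placeholder assumption.
    """
    spe = (R**2).sum(axis=1)
    return np.flatnonzero(spe > spe.mean() + n_sigma * spe.std()), spe

def identify(R, bins):
    """Identification: for each flagged bin, the link with the largest residual."""
    return {int(b): int(np.argmax(np.abs(R[b]))) for b in bins}

# Synthetic week of traffic on 4 links with one injected spike.
rng = np.random.default_rng(1)
t = np.arange(288 * 7)
cycle = np.sin(2 * np.pi * t / 288)
X = np.outer(cycle, [10.0, 8.0, 6.0, 4.0]) + rng.normal(0, 0.5, (t.size, 4))
X[1000, 2] += 8.0          # inject a spike on link 2 at time bin 1000

mean, P_normal = train(X, topk=1)
R = residual(X, mean, P_normal)
bins, spe = detect(R)
print(identify(R, bins))   # maps each flagged bin to a link index
```

The injected spike at bin 1000 is flagged and attributed to link 2; a few ordinary noise bins may also cross such a simple threshold.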
Pictorial overview of problems with subspace methodology
   Defining normalcy can be challenging
       Tunable knobs
       Contamination
   PCA’s coordinate remapping makes it difficult to identify the original location of an anomaly

   [Figure: PCA, parameterized by topk, splits the signal into normal and anomalous parts; a network with endpoints A and B]

Data used




               Géant and Abilene networks
               IP flow traces
               November 21 through 28, 2005
               Anomalies were manually verified

Outline
       Background and motivation
       Problems with approach
        Sensitivity to its parameters
        Contamination of normalcy
        Identifying the location of detected anomalies
       Conclusion & future directions



Sensitivity to topk
   PCA separates normal from anomalous traffic patterns
   Works because top PCs tend to capture periodic trends and a large fraction of variance

   [Figure: PCA splits the signal into normal and anomalous parts]

Sensitivity to topk
   Where is the line drawn between normal and anomalous?
   What is too anomalous?

   [Figure: PCA, parameterized by topk, splits the signal into normal and anomalous parts]

Sensitivity to topk

   Very sensitive to number of principal components included!

Sensitivity to topk
   Sensitivity wouldn’t be an issue if we could tune the topk parameter
   We’ve tried many different methods
       3σ deviation heuristic
       Cattell’s Scree Test
       Humphrey-Ilgen
       Kaiser’s Criterion
   None are reliable
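
Two of the listed rules can be sketched as follows. These are simplified readings of Kaiser’s criterion and the 3σ deviation heuristic, written for illustration only; the function names and the synthetic data are assumptions, and the exact variants evaluated in the study may differ.

```python
import numpy as np

def kaiser_topk(X):
    """Kaiser's criterion: keep PCs whose correlation-matrix eigenvalue exceeds 1."""
    eigvals = np.linalg.eigvalsh(np.corrcoef(X, rowvar=False))
    return int((eigvals > 1.0).sum())

def three_sigma_topk(X, n_sigma=3.0):
    """3-sigma heuristic: walk the PCs in order and stop at the first one
    whose projection deviates more than n_sigma std from its mean."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    for k, axis in enumerate(Vt):
        proj = Xc @ axis
        if np.any(np.abs(proj - proj.mean()) > n_sigma * proj.std()):
            return k          # PCs 0..k-1 are kept as normal
    return len(Vt)

# Synthetic week of traffic: one shared daily cycle, noise, one large spike.
rng = np.random.default_rng(3)
t = np.arange(288 * 7)
cycle = np.sin(2 * np.pi * t / 288)
X = np.outer(cycle, [10.0, 8.0, 6.0, 4.0]) + rng.normal(0, 0.5, (t.size, 4))
X[1008, 3] += 50.0            # hypothetical injected spike on link 3

k_kaiser = kaiser_topk(X)
k_sigma = three_sigma_topk(X)
print("Kaiser:", k_kaiser, "3-sigma:", k_sigma)
```

On this easy synthetic case both rules select a single normal component. The sketch only shows how such rules operate; it does not reproduce the unreliability observed on the real traces.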
Contamination of normalcy
   What happens to large anomalies?
       They capture a large fraction of variance
       Therefore they are included among top PCs
   Invalidates assumption that top PCs are periodic
   Pollutes definition of normal
   In our study, the outage shown in the figure affected 75 of 77 links
       Only detected on a handful!

   [Figure: the outage discussed above; PCA splits the signal into normal and anomalous parts]
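
The contamination effect can be reproduced on synthetic data. The sketch below is a toy construction, not the outage from the study: the link loadings, the outage size and duration, and the helper names are all hypothetical. It compares how much of the outage pattern is left in the anomalous subspace when the normal subspace is trained on clean data versus on data that contains the outage.

```python
import numpy as np

def normal_basis(X, topk):
    """Top-k principal axes of X, used as the normal subspace."""
    _, _, Vt = np.linalg.svd(X - X.mean(axis=0), full_matrices=False)
    return Vt[:topk].T

def residual_energy(x, P):
    """Squared norm of the part of x outside the subspace spanned by P."""
    r = x - P @ (P.T @ x)
    return float(r @ r)

rng = np.random.default_rng(2)
t = np.arange(288 * 7)
n_links = 10
loads = np.linspace(4.0, 10.0, n_links)
clean = np.outer(np.sin(2 * np.pi * t / 288), loads)
clean += rng.normal(0, 0.5, (t.size, n_links))

outage = np.zeros(n_links)
outage[:9] = -40.0                  # hypothetical outage hitting 9 of 10 links
dirty = clean.copy()
dirty[1000:1060] += outage          # the outage lasts 60 time bins

P_clean = normal_basis(clean, topk=2)
P_dirty = normal_basis(dirty, topk=2)
e_clean = residual_energy(outage, P_clean)
e_dirty = residual_energy(outage, P_dirty)
print(f"residual energy of outage: clean training={e_clean:.1f}, "
      f"contaminated training={e_dirty:.3f}")
```

When the outage is part of the training data it captures enough variance to enter the top PCs, so almost none of it remains in the residual that detection relies on.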

Identifying anomaly locations
   Spikes appear when the state vector is projected onto the anomaly subspace
       But network operators don’t care about this
       They want to know where it happened!
   How do we find the original location of the anomaly?

   [Figure: the state vector and the anomaly subspace]

Identifying anomaly locations
   Previous work used a simple heuristic
       Associate detected spike with the k flows with the largest contribution to the state vector v
   No clear a priori reason for this association

   [Figure: the state vector and the anomaly subspace; a network with endpoints A and B]

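One reading of this heuristic is to rank the entries of v by magnitude and keep the k largest. The sketch below assumes exactly that; the function name `top_k_contributors` and the example vector are hypothetical, and whether "contribution" is measured on v itself or on its projection is not specified here.

```python
import numpy as np

def top_k_contributors(v, k):
    """Indices of the k entries of v with the largest magnitude, largest first."""
    order = np.argsort(np.abs(v))[::-1]
    return order[:k].tolist()

v = np.array([0.2, -7.5, 1.1, 4.0, -0.3])   # hypothetical state vector, one entry per flow
print(top_k_contributors(v, k=2))            # -> [1, 3]
```

Nothing in this ranking uses the structure of the anomaly subspace, which is one way to see why the association lacks a clear a priori justification.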
Outline
       Background and motivation
       Problems with approach
       Conclusion & future directions
        Defining normalcy
        Identifying the location of an anomaly




Defining normalcy

   Large anomalies can cause a spike in first few PCs
       Diminishes effectiveness
       But we can presumably smooth these out (WMA)
   But first PCs aren’t always periodic
       whichk instead of topk?
       Initial results suggest this might be challenging also
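
Assuming WMA stands for weighted moving average, the smoothing idea can be sketched as below. The linearly increasing weights and the window length are illustrative choices, not values taken from the study.

```python
import numpy as np

def weighted_moving_average(x, window):
    """Trailing WMA with linearly increasing weights (most recent sample weighs most)."""
    weights = np.arange(1, window + 1, dtype=float)
    weights /= weights.sum()
    # np.convolve flips its second argument, so reverse the weights to keep
    # the largest weight on the most recent sample of each window.
    return np.convolve(x, weights[::-1], mode="valid")

x = np.array([0.0, 0.0, 0.0, 6.0, 0.0, 0.0])   # a single spike of height 6
smoothed = weighted_moving_average(x, window=3)
print(smoothed)                                 # -> [0. 3. 2. 1.]
```

The spike of height 6 is spread over three bins with a peak of 3, which is the kind of damping that would reduce its influence on the first few PCs.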

Fundamental disconnect between objective functions
   PCA is optimal at finding orthogonal vectors ordered by captured variance
   But variance need not correspond to normalcy (i.e. periodicity)
   When do they coincide?
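
A small numerical example makes the gap between variance and periodicity concrete. The `periodicity_score` below (share of non-DC spectral power in the single strongest frequency) is a hypothetical measure introduced only for this illustration, and both signals are synthetic.

```python
import numpy as np

def periodicity_score(x):
    """Fraction of non-DC spectral power in the single strongest frequency."""
    power = np.abs(np.fft.rfft(x - x.mean())) ** 2
    power = power[1:]                      # drop the DC bin
    return float(power.max() / power.sum())

t = np.arange(288 * 7)                     # one week of 5-minute bins
daily = np.sin(2 * np.pi * t / 288)        # low variance, perfectly periodic
outage = np.zeros(t.size)
outage[1000:1060] = -20.0                  # high variance, not periodic

for name, x in [("daily cycle", daily), ("outage", outage)]:
    print(f"{name}: variance={x.var():.2f}, "
          f"periodicity={periodicity_score(x):.2f}")
```

Here the outage has far more variance than the daily cycle yet scores much lower on periodicity, so an ordering by variance would rank it first.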


Identifying anomaly locations
   PCA is very effective at finding correlations
   But this is accomplished by remapping all data to a new coordinate system
   Strength in detection becomes weakness in identification
   Inherent limitation

   [Figure: a network connected to several ASes, one of them marked Victim]

Conclusion
   PCA is sensitive to its parameters
   More robust methodology required
       Training: defining normalcy (topk, whichk)
       Detection: tuning threshold
       Identification: better heuristic
   Disconnect between objective functions
       PCA finds variance
       We seek periodicity
   PCA’s strengths can be weaknesses
       Transformation good at detecting correlations
       Causes difficulty in identifying anomaly location
                     Thanks!
                  Questions?


Haakon Ringberg
Princeton University Computer Science
http://www.cs.princeton.edu/~hlarsen/
Outline
       Background and motivation
       Problems with approach
       Future directions
       Conclusion
        Addressable problems, versus
        Fundamental problems




Conclusion: addressable
   PCA is sensitive to its parameters
   More robust methodology required
       Training: defining normalcy (topk, whichk)
       Detection: tuning threshold
       Identification: better heuristic
   Previous work used same data and optimized
    parameter settings as Lakhina et al.
   But these concerns might be addressable
Conclusion: fundamental
   We don’t know what “normal” is
   Disconnect between objective functions
       PCA finds variance
       We seek periodicity
   PCA’s strengths can be weaknesses
       Transformation good at detecting correlations
       Causes difficulty in identifying anomaly location
   Are other methods more appropriate?
       We require a standardized evaluation framework