Visualizing attacks on honeypots


                                           Project Proposal

                         Jop van der Lelie -
                              Rory Breuk -
                                              June 7, 2012

1     Introduction
The National Cyber Security Centre (NCSC) constantly monitors the internet for threats. To monitor
and follow trends in malware infections, it has deployed various honeypots in unused IP space. To
gather all this data in a central database, it uses SURFcert IDS, an open-source Distributed
Intrusion Detection System based on passive sensors. A sensor is placed in each network that needs
to be monitored and sends all data back to the logging server. Each sensor runs honeypot software,
such as Nepenthes, which can simulate multiple known Windows vulnerabilities. Whenever an attacker
triggers the honeypot, it logs all details of the attack and the attacker.
All the data is stored in a database which is accessible through a simple web interface available on the
logging server. Although the server offers some reporting functionality, it is not possible to easily
browse the data and analyse it. Especially when a large number of sensors is used as is the case with

2     Research Question
An interactive visualization can be used to identify patterns and trends in the alerts generated by
honeypots. Each alert is a possible attack on the honeypot. Such a visualization can give network
security analysts more insight into the data and enable them to visually analyse the attacks. In order to
define the requirements for such a visualization, we formulated the following research question:

      Which visualizations can be used to give more insight into attacks, using the honeypot data
      gathered with the SURFcert IDS?

To answer this research question we defined a list of subquestions:

    • What insight can help the analysis of the attacks?
    • What information from the honeypot logs can provide this insight?
    • Which visualizations can be used to show this information?
    • How can these visualizations be combined into one interactive dashboard?
    • How can the information retrieval from the honeypot logs be optimized?

By answering these subquestions we will be able to create an interactive visualization that gives
more insight into the data gathered by the honeypots of the SURFcert Intrusion Detection System.

3      Scope
In order to analyse the attacks on the honeypots, we have to visualize the data from the alerts generated
by SURFcert IDS. These alerts contain the source and destination (IP address and port) as well as attack
details such as the attempted exploit or downloaded binaries.
Most related research focuses on NetFlow information, which is not available in our case: apart
from the source and destination of an attack, we have no flow data at hand. Our visualization
will show the relations between multiple attacks to identify groups of attackers, such as botnets. Because of
the nature of a honeypot, we can provide details-on-demand for each specific attack, such as the commands
executed by an attacker.
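
As a rough illustration of how relations between attacks might be derived from such alerts, the
sketch below groups source addresses that delivered the same binary. The record layout and all
field names (src_ip, binary_hash, and so on) are assumptions made for this example and are not
taken from the SURFcert IDS schema; the addresses come from the reserved documentation ranges.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Alert:
    """One honeypot alert. All field names are hypothetical."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    exploit: str
    binary_hash: Optional[str] = None  # hash of a downloaded binary, if any


def group_sources_by_binary(alerts):
    """Map each downloaded binary to the set of source IPs that delivered it."""
    groups = defaultdict(set)
    for alert in alerts:
        if alert.binary_hash is not None:
            groups[alert.binary_hash].add(alert.src_ip)
    # Only binaries seen from more than one source hint at a group of attackers.
    return {h: ips for h, ips in groups.items() if len(ips) > 1}


# Made-up sample data for illustration only.
SAMPLE_ALERTS = [
    Alert("198.51.100.7", 40312, "192.0.2.10", 445, "exploit-A", "abc"),
    Alert("203.0.113.9", 51877, "192.0.2.11", 445, "exploit-A", "abc"),
    Alert("203.0.113.50", 33001, "192.0.2.10", 135, "exploit-B", "def"),
    Alert("198.51.100.7", 40990, "192.0.2.12", 445, "exploit-A"),
]

if __name__ == "__main__":
    for binary, sources in group_sources_by_binary(SAMPLE_ALERTS).items():
        print(binary, sorted(sources))
```

Sharing a binary is only one possible relation; the same grouping could be keyed on the attempted
exploit or the targeted port, depending on which insight turns out to be most useful.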

4      Related Research
In 2008, Fischer et al. [1] created a tool called NFlowVis, which can be used to visually analyse attacks in
large-scale networks. This tool combines NetFlow information gathered from NfSen [2] with logs from an
IDS, but focuses only on the NetFlows when visualizing the data.
Jaime Blasco [3] created a visualization of IDS logs from Nepenthes honeypots. He used graphs to
connect countries to IP addresses and to connect these IP addresses to the types of attacks coming from them.
Although the visualization shows some interesting trends, it is difficult to scale to a large dataset such
as ours.
In 2011, Visoottiviseth et al. [4] worked on methods for managing the logs of multiple honeypots. They
created a number of simple visualizations, including a geovisualization depicting the distribution of
attacking source IP addresses, and a tool for querying log data.
Dawkins et al. [5] describe a framework for managing data from multiple IDSs. Their methods are more
comprehensive than those described by Visoottiviseth et al. [4]. Although we will use SURFcert IDS for
this task, the paper helps in understanding it better.
Another tool, VIAssist, was developed in 2009 by Goodall and Sowul [6] for visual analytics in cyber
defence. It visualizes network flow data and lets the expert easily browse the data using multiple
views. It can provide details-on-demand when a specific flow needs to be investigated further. The tool
integrates with the SiLK network flow analysis tools to support multiple analysis tools as input.

5      Approach
Creating a successful visual analytics solution involves multiple steps:

    1. Information gathering
    2. Structuring the information
    3. Gather and structure evidence
    4. Develop the case
    5. Report and evaluate

We will focus on the last three steps, since the data mining and information structuring are already
done in the database used by SURFcert IDS. The data is mined by the sensors and structured by the
logging server in a PostgreSQL database.
We want to extend the current reporting and analysis functionality of the web interface offered by the
logging server. For this, we will use the Data-Driven Documents (D3) JavaScript library [7]. We will first
determine which information about anomalies in these IDS logs is interesting to show and can give
more insight than current IDS reporting methods allow. Out of these possibilities we will choose one to
use for our visualization.
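
To make the link between the database and a D3 view more concrete, the following is a minimal
sketch of a query that aggregates alerts per day and per exploit and serializes the result as JSON.
It uses an in-memory SQLite database as a stand-in for the PostgreSQL database of the logging
server, and the alerts table with its columns is a hypothetical schema invented for this example.

```python
import json
import sqlite3


def build_demo_db():
    """Create an in-memory stand-in database with a hypothetical alerts table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE alerts (ts TEXT, src_ip TEXT, dst_port INTEGER, exploit TEXT)"
    )
    # Made-up rows; the addresses come from the reserved documentation ranges.
    conn.executemany(
        "INSERT INTO alerts VALUES (?, ?, ?, ?)",
        [
            ("2012-06-01T10:00:00", "192.0.2.10", 445, "exploit-A"),
            ("2012-06-01T11:30:00", "198.51.100.7", 445, "exploit-A"),
            ("2012-06-01T12:00:00", "198.51.100.7", 135, "exploit-B"),
            ("2012-06-02T09:15:00", "203.0.113.9", 445, "exploit-A"),
        ],
    )
    return conn


def attacks_per_day(conn):
    """Count alerts per day and exploit, so the browser receives few rows."""
    rows = conn.execute(
        "SELECT substr(ts, 1, 10), exploit, COUNT(*) "
        "FROM alerts "
        "GROUP BY substr(ts, 1, 10), exploit "
        "ORDER BY 1, 2"
    ).fetchall()
    return [{"day": day, "exploit": exploit, "count": n} for day, exploit, n in rows]


if __name__ == "__main__":
    # A list of flat JSON objects is a shape that D3 can bind to directly.
    print(json.dumps(attacks_per_day(build_demo_db()), indent=2))
```

Aggregating in the database rather than in the browser keeps the amount of transferred data small,
which matters when many sensors report to one logging server.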
We will look at different visualization techniques and determine which can be used to show this information.
Afterwards we will combine these visualization techniques into one interactive visualization.
Our visualization will be presented as a proof of concept, in which we will show that visualization is a
powerful tool for giving a user more insight into information.

6     Planning
The time frame for this project is four weeks, giving us a total of 20 days to work on the project. We plan
to divide the work as shown below.

 Week       Task
 Week 1     Finalize proposal
            Literature study
            Structuring the data
            Understanding the data
            Defining which information is interesting to show
 Week 2     Create draft report
            Defining which visualizations to use
            Implementing the visualizations
            Implementing queries for the visualizations
 Week 3     Implementing the visualizations
            Implementing queries for the visualizations
            Create a PoC dashboard using visualizations
            Work on report
 Week 4     Fine-tuning visualizations
            Finalize report
            Prepare presentation

[1]   Fabian Fischer et al. “Large-scale Network Monitoring for Visual Analysis of Attacks”. In: (2008).
[2] NfSen, Netflow Sensor. url:
[3]   J. Blasco. “An approach to malware collection log visualization”. In: Journal (2005).
[4]   V. Visoottiviseth et al. “Distributed honeypot log management and visualization of attacker ge-
      ographical distribution”. In: Computer Science and Software Engineering (JCSSE), 2011 Eighth
      International Joint Conference on. IEEE. 2011, pp. 23–28.
[5]   J. Dawkins et al. “A framework for unified network security management: identifying and tracking
      security threats on converged networks”. In: Journal of Network and Systems Management 13.3
      (2005), pp. 253–267.
[6]   J.R. Goodall and M. Sowul. “VIAssist: Visual analytics for cyber defense”. In: Technologies for
      Homeland Security, 2009. HST’09. IEEE Conference on. IEEE. 2009, pp. 143–150.
[7] D3.js JavaScript library. url:

