"Event Correlation in Security"
Anton Chuvakin, Ph.D., GCIA, GCIH
WRITTEN: 2002

In the deep and somewhat muddy sea of security marketing terms, "correlation" appears to be the current pack leader, closely chased by "intrusion prevention". In this paper we will try to take an objective look at what security correlation really is and how it helps to improve the organization's security posture.

Introduction

The recent security spending survey by http://www.infosecuritymag.com/2003/may/coverstory.pdf indicates that deployment rates of many security technologies will soar in the next three years. All of these devices, whether aimed at prevention or detection, generate huge volumes of audit data. Firewalls and other devices logging network connection information are especially guilty of producing vast oceans of data. Many diverse data formats and representations are used for those log files and audit trails. Also, a percentage of events generated by network IDS and IPS are false alarms and do not map to real threats. To further confuse the issue, different devices might report on the same things happening on the network, but in different ways, with no apparent way of figuring out how their reports relate to each other.

There is a definite need for a consistent analysis framework to identify various threats, prioritize them and learn their impact on the target organization. For attack identification, this must be done as fast as possible (preferably in real time). It is also important that the analysis be performed over the long term for threat trending and risk analysis. For a detailed summary of security data analysis challenges see http://www.tisc2002.com/newsletters/418.html

SIM (Security Information Management) is the "discipline" that will solve the security event data correlation challenge. Correlation is defined as "establishing or finding relationships between entities". However, a good security-specific definition is lacking.
In security, “event correlation” may be defined as improving the threat identification and assessment process by looking not only at individual events, but also at sets of events bound by some common parameter (“related” events).

Types of correlation

Security-specific correlation can be loosely categorized as rule-based or statistical (algorithmic). A rule-based correlation engine has some pre-existing knowledge of the attack (the rule) and from this is able to define what has actually been detected in precise terms. Such attack knowledge is used to relate events and analyze them together in a common context. Statistical correlation does not employ any pre-existing knowledge of the malicious activity, but instead relies upon the knowledge (and recognition) of normal activities, which has been accumulated over time. Ongoing events are then rated by a built-in algorithm and may also be compared to the accumulated activity patterns, to distinguish normal from abnormal (suspicious).

This distinction between correlation types is somewhat similar to signature vs anomaly IDS, and makes a SIM function as a kind of meta-IDS, operating on higher-level data: log records as opposed to packets (or packet streams). Combined, both correlation methods can help to sift through the large volume of diverse data and identify high severity threats.

Rule-based Correlation

A rule-based correlation engine applies scenarios that a known attack must follow in order to detect exactly that attack. Such scenarios might be encoded in the form of a sequence of events (first this, then that...), therefore some action(s) must trigger the detection process. Rule-based correlation deals with states, conditions, timeouts and actions. Let us define those important terms.

A state is a well-defined logical or operational mode that the correlation rule might be in.
A state may contain various conditions, such as matching incoming events by the source IP address, protocol, port, event type, producing security device type, username and other data components of the event. Although data components vary from device to device, a SIM solution typically normalizes many data component formats using a generic or uniform event schema. This schema must be rich enough to normalize data without incurring any information loss.

A timeout defines how long the correlation engine will be in a certain state, for a given rule. If the correlation engine must keep many rules in a waiting state, memory might be exhausted. Thus, rule timeouts play an important role in correlation performance.

A transition is an event when one rule state is switched to another one. For a complicated rule, many transitions are possible.

An action is a response event, initiated when all the rule conditions are met. Various actions may result from rules, such as user notification, alarm escalation, configuration changes or automatic incident case investigation.

The correlation engine is able to track various states and switch from state to state, depending on conditions and incoming events. The correlation engine receives events in real time from the alarm-generating security devices it monitors, and applies the relevant correlation rules to the event flow, taking into account any data filtering (e.g., of log records that are not considered relevant).

Correlation rules may be applied to incoming events in real time (as they arrive) or to the historical events stored in the database. In the latter case, the rules are used as a form of data mining or analytics, which helps uncover concealed activity, e.g., threats such as slow port scans or low-level activity from Trojan or exploitation software operating on a compromised host.
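
The terms defined above (state, condition, timeout, transition, action) can be made concrete with a small sketch. Everything in it is an illustrative assumption rather than any product's rule language: the `Event` fields stand in for a normalized event schema, and the rule itself (several failed logins followed by a successful login from the same source IP within a time window) is a hypothetical example scenario.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """A normalized event; the field names are illustrative assumptions."""
    timestamp: float   # seconds since some reference point
    source_ip: str
    event_type: str    # e.g. "login_failure" or "login_success"


@dataclass
class RuleInstance:
    """Tracks one partially matched rule for one source IP."""
    started_at: float
    state: str = "counting_failures"
    failures: int = 0


class FailedLoginsThenSuccessRule:
    """Hypothetical rule: `threshold` failed logins, then a success,
    all from the same source IP within `timeout` seconds."""

    def __init__(self, threshold=3, timeout=60.0):
        self.threshold = threshold
        self.timeout = timeout
        self.instances = {}   # source_ip -> RuleInstance held in memory
        self.alerts = []      # (source_ip, timestamp) pairs raised so far

    def process(self, event):
        # Timeout: discard instances that waited too long, freeing memory.
        expired = [ip for ip, inst in self.instances.items()
                   if event.timestamp - inst.started_at > self.timeout]
        for ip in expired:
            del self.instances[ip]

        inst = self.instances.get(event.source_ip)
        if event.event_type == "login_failure":
            # Condition: a failure event matched on source IP.
            if inst is None:
                inst = RuleInstance(started_at=event.timestamp)
                self.instances[event.source_ip] = inst
            inst.failures += 1
            if inst.failures >= self.threshold:
                # Transition: enough failures seen, now wait for a success.
                inst.state = "awaiting_success"
        elif event.event_type == "login_success" and inst is not None:
            if inst.state == "awaiting_success":
                # Action: here just a recorded alert.
                self.alerts.append((event.source_ip, event.timestamp))
            del self.instances[event.source_ip]


def run_rule(events, threshold=3, timeout=60.0):
    """Feed events to the rule in timestamp order and return the alerts."""
    rule = FailedLoginsThenSuccessRule(threshold, timeout)
    for event in sorted(events, key=lambda e: e.timestamp):
        rule.process(event)
    return rule.alerts
```

In this sketch timeouts are checked only when a new event arrives, which keeps the code short; a real engine would more likely expire waiting rules on a timer. The same `run_rule` helper can be fed either a live event stream or events read back from storage, which mirrors the real-time versus historical use of rules.
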
Such database rules may be run periodically, for incident identification, or in the course of the investigation of suspicious activity, for seeking out the prior occurrences of similar (and thus possibly related) activity. Unlike real-time rules, which become useless if the incidence of false alarms is too high (as is the case for poorly designed and administered signature-based IDSs), database rules can tolerate a certain level of false alarms for the purpose of drastically reducing false negatives. This is due to the fact that real-time rules usually feed the alarm notification system, while database rule-based correlation will be launched by the analyst during the security incident investigation. As long as the rule-based analytics uncover a hidden threat that is impossible to discover otherwise, an analyst might be able to tolerate a certain level of false alarms.

Statistical correlation

Statistical correlation uses special numeric algorithms to calculate threat levels incurred by the security-relevant events on various IT assets. Such correlation looks for deviations from normal event levels and other routine activities. Risk levels may be computed from the incoming events and subsequently tracked in real time or historically, so that deviations become apparent. Algorithmic correlation may leverage event categorization to compute the threat levels specific to various attack types, such as the threat of a denial of service or worm/virus attack, and track them over time.

Detecting threats using statistical correlation does not require any pre-existing knowledge of the attack to be detected. Statistical methods may, however, be used to detect threats based on pre-defined activity thresholds. Such thresholds may be configured based on experience gained from monitoring the environment. For example, if the normal level of a specific reconnaissance activity (e.g., port scans) is exceeded for a prolonged period of time, an alarm might be generated by the system.
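
The port scan example can be sketched as follows. This is a minimal illustration under stated assumptions, not a description of any particular SIM algorithm: the "normal level" is taken to be the mean of a baseline sample plus `k` standard deviations, activity is counted per fixed interval, and "prolonged" means `min_intervals` consecutive intervals over the threshold. The function and parameter names are hypothetical.

```python
from statistics import mean, stdev


def sustained_anomalies(counts, baseline, k=3.0, min_intervals=3):
    """Return the indices of intervals at which an alarm would be raised.

    counts        -- observed events per interval (e.g. port scans per hour)
    baseline      -- per-interval counts recorded during normal operation
    k             -- how many standard deviations above the mean is "abnormal"
    min_intervals -- consecutive abnormal intervals required before alarming
    """
    # Assumed definition of the threshold; it could equally be set by hand.
    threshold = mean(baseline) + k * stdev(baseline)

    alarms = []
    run = 0  # length of the current streak of abnormal intervals
    for index, count in enumerate(counts):
        run = run + 1 if count > threshold else 0
        if run >= min_intervals:
            alarms.append(index)
    return alarms
```

With such a scheme a single spike does not raise an alarm, while a level that stays elevated for `min_intervals` intervals does. Choosing `k` and `min_intervals` is the tuning that comes from experience with the monitored environment.
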
Correlation may also use various parameters for enterprise assets to skew the statistical algorithm for higher accuracy detection. Some of them are defined by system users (such as the value of the affected asset to the organization), while others are automatically computed from other available event context data (such as vulnerability scanning results or a measure of normal user activity on the asset). This makes it possible to define a broader context for security events as they occur and thus helps in understanding how they contribute to the organization's risk posture.

Challenges with correlation

Both types of correlation have inherent challenges, which can fortunately be mitigated by combining both methods to create coherent correlation coverage, leading to quality threat identification and ranking.

First, can we assume that the attacker will follow a scenario which can be caught by the rule-based correlation system? Unlike a network IDS, which needs a specific signature with detailed knowledge of the attack, a correlation system rule may cover a broad range of malicious activities, especially if intelligent security event categorization is utilized. This may be done without going into the specifics of particular IDS signatures. For example, rules may be written to look for certain activities that usually accompany a system compromise, such as backdoor communication or the installation of hacker tools on a compromised system. An attacker is hard-pressed to avoid installing rootkits, RATs and other code if he intends to use the compromised machine for his purposes. Extensive research using deception networks or honeynets allows us to learn more and more of the attackers' patterns of behavior and to encode them as correlation rules, available out of the box.

Second, can multiple rules cause the number of false positives to actually increase instead of decrease? Indeed, deploying many rules without any regard to the environment might generate false alarms.
However, it is much easier to understand and tune the SIM correlation rules than intricate binary matching patterns. The latter requires in-depth understanding of the attack network packets, memory corruption issues and specifics of the exploitation techniques. On the other hand, tuning a correlation rule involves changing the timeouts and adding or removing conditions. Overall, in the case of correlation rules, one may also define response actions with higher confidence, since one can bind the rules to a specific asset or group of assets.

Third, rule-based correlation is relatively intensive computationally. However, using highly optimized correlation engines and intelligently applying filters to limit the flow of events makes it possible to get the most out of rule-based correlation. Additionally, many rules can be combined together so that the correlation engine does not have to keep many similar events in memory. It also makes sense to apply more specific correlation rules to a large number of assets, where a flood of false positives might endanger security, and to apply wider and more generic rules to critical assets, where an occasional false alarm is better than missing a single important alert. This way all the suspicious activities directed against a small group of critical assets will be detected.

Fourth, statistical correlation may not pick up anomalous activity if it is performed at low enough levels, essentially merging with the normal. Hiding attack patterns under volumes and volumes of similar normal activity might deceive the statistical correlation system. Similarly, a single occurrence of an attack might not impact the statistical profile enough to be noticed. However, careful baselining of the environment and then using statistical methods to track the deviations from such a baseline might allow detecting some of the attacks that are "stealthy".
Also, rule-based correlation efficiency compensates for those rare events and enables their detection, even if algorithmic correlation misses them.

Conclusion

SIM products leveraging advanced correlation techniques and intelligent alert categorization may become indispensable. It certainly seems that, as enterprises deploy more and more security point solutions, appliances and devices, human correlation alone will become increasingly difficult and, inevitably, impractical. The situation we find ourselves in today is one where "best of breed" is a practical necessity, but this typically means that we have many security devices, and each one only addresses certain aspects of the overall security services. Thus, we need to integrate the interpretation of security events under some common umbrella. Security Information Management solutions are promising alternatives to the human correlation we rely on so heavily today.

ABOUT THE AUTHOR:

This is an updated author bio, added to the paper at the time of reposting in 2011.

Dr. Anton Chuvakin (www.chuvakin.org) is a recognized security expert in the field of log management and PCI DSS compliance. Anton leads his security consulting practice www.securitywarriorconsulting.com, focusing on logging, SIEM, security strategy and compliance for security vendors and Fortune 500 organizations. He is the author of the books "Security Warrior" and "PCI Compliance" (www.pcicompliancebook.info), a contributor to "Know Your Enemy II" and "Information Security Management Handbook", and is now working on a book about system logs. Anton has published dozens of papers on log management, correlation, data analysis, PCI DSS and security management (see list at www.info-secure.org). His blog www.securitywarrior.org is one of the most popular in the industry.
In addition, Anton teaches classes (including his own SANS class on log management) and presents at many security conferences across the world; he recently addressed audiences in the United States, UK, Singapore, Spain, Russia and other countries. He works on emerging security standards and serves on advisory boards of several security start-ups.

Dr. Anton Chuvakin was formerly a Director of PCI Compliance Solutions at Qualys. Previously, Anton worked at LogLogic as a Chief Logging Evangelist, tasked with educating the world about the importance of logging for security, compliance and operations. Before LogLogic, Anton was employed by a security vendor in a strategic product management role. Anton earned his Ph.D. degree from Stony Brook University.