A major challenge facing the computer and communications security community is determining algorithms for detecting security events using only network data. A security event is defined here not as an attempted attack, but as a successful attack (resulting in a compromised host, for example). One of the requirements for detecting security events using network data is the ability to properly evaluate and compare similar detectors. For example, current practice does not include stating the conditions under which one particular detector will perform better than another. As a result, the best approaches to solving particular network security problems are not known. My research therefore focuses on the development and evaluation of network security approaches.
My general interest in network security is in determining what can be detected given primarily connection information. I work with limited data, such as flow data, because I am particularly interested in very large networks (e.g., international or government organizations), which often cannot collect individual packet header information due to the volume of traffic passing through them. I am working on two projects in this area: scan detection and the identification of compromised hosts.
Scans have been identified as preceding approximately 50\% of attacks, and have also been used for detecting particular types of worms. Current approaches to scan detection require packet-level information, bidirectional data, or knowledge of the network. One of the better detection methods is the Threshold Random Walk (TRW), which requires either the ability to see replies to connection requests or knowledge of which hosts are on a network. In contrast, we have designed a system based solely on unidirectional flow information, which is currently in operational use at a large organization. One of the challenges this approach still faces is detecting stealthy scans, such as scans that are very slow or that are targeted only against existing hosts. I am interested in detecting such scans as an extension of my PhD dissertation, which focused on detecting distributed scans.
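To make the contrast concrete, TRW's core is a sequential hypothesis test: each connection attempt from a remote host nudges a likelihood ratio toward "scanner" (failed attempts) or "benign" (successful attempts), and a decision is made when the ratio crosses a threshold. The sketch below is a minimal illustration of that test; the success probabilities and error-rate targets are illustrative assumptions, not values from the text, and observing success/failure is exactly the bidirectional visibility that unidirectional flow data lacks.

```python
# Minimal sketch of the Threshold Random Walk (TRW) sequential test.
# THETA0/THETA1 and the error-rate targets are assumed, illustrative values.

THETA0 = 0.8          # assumed P(connection succeeds | benign remote host)
THETA1 = 0.2          # assumed P(connection succeeds | scanner)
ALPHA, BETA = 0.01, 0.01  # target false positive / false negative rates

ETA1 = (1 - BETA) / ALPHA  # upper threshold: declare "scanner"
ETA0 = BETA / (1 - ALPHA)  # lower threshold: declare "benign"

def classify(outcomes):
    """Walk over connection outcomes (True = connection succeeded)
    and return a verdict once a threshold is crossed."""
    ratio = 1.0
    for succeeded in outcomes:
        if succeeded:
            ratio *= THETA1 / THETA0          # success is evidence of benign
        else:
            ratio *= (1 - THETA1) / (1 - THETA0)  # failure is evidence of scanning
        if ratio >= ETA1:
            return "scanner"
        if ratio <= ETA0:
            return "benign"
    return "pending"  # not enough evidence yet
```

With these assumed parameters, a handful of consecutive failed connections is enough to flag a scanner, which is what makes the test fast in practice but also why it needs to observe replies.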
In addition to detecting scans, which can be used to determine potential adversary targets, it is also important to identify compromised hosts. I have developed an initial prototype of a system that detects anomalous network activity on a per host basis by comparing current activity to a baseline. This approach shows promise, but requires further investigation. Open issues include determining the best methods for baselining and the ideal time scale, how the baseline could or should be updated over time, and how far from the baseline is considered an acceptable deviation. In addition, studies will need to be performed to determine the actual true and false positive rates achieved by this approach, along with user studies to determine if this rate is acceptable in practice.
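The per-host baselining idea can be sketched simply. The summary statistic (mean and standard deviation of hourly flow counts) and the deviation parameter below are placeholder assumptions; as noted above, the best baselining method, time scale, and acceptable deviation are all open questions.

```python
import statistics

def build_baseline(history):
    """Summarize a host's past activity (e.g., hourly flow counts) as a
    mean and standard deviation. This summary is an illustrative
    placeholder; the best baselining method is an open question."""
    return statistics.mean(history), statistics.stdev(history)

def is_anomalous(baseline, current, k=3.0):
    """Flag current activity more than k standard deviations from the
    baseline. The acceptable deviation k is one of the open parameters."""
    mean, std = baseline
    return abs(current - mean) > k * std
```

For example, a host that normally produces around 100 flows per hour would be flagged if it suddenly produced 200, but not for a small fluctuation to 105. How the baseline should be updated over time (e.g., a sliding window versus exponential decay) is left open here, matching the open issues above.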
Two approaches are generally used to evaluate new network security techniques. The first uses a data set developed specifically for testing and comparing anomaly-based intrusion detection systems; however, this data set suffers from several flaws and is now over five years old. The second uses network traces gathered from live networks to test detectors; however, this approach suffers from a lack of ground truth --- we do not know where all of the attacks are --- and results obtained in one network environment do not necessarily reflect the results that would be obtained in a different environment.
For my dissertation I developed a new evaluation methodology based on logistic regression modeling of the true positive rate of a detector. Ground truth was obtained by performing the attacks on the DETER network (http://www.isi.edu/deter); the resulting attack traffic was then combined with traffic gathered from live networks. Key characteristics suspected to contribute to the true and false positive rates were identified and used to design a representative data set for training a logistic regression model of the detector's true positive rate. The model was tested using a second data set constructed through a similar process. Additionally, a third data set was constructed in which the background noise was gathered from networks that had not been used in the training or testing process. This evaluation methodology was also applied to a second detector, demonstrating how the logistic regression equations can be used to compare two detectors in terms of the conditions under which each performs best.
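The modeling step above amounts to fitting P(detect) as a logistic function of the identified characteristics. The sketch below fits such a model by gradient ascent on the log-likelihood; the single feature and the training data are hypothetical stand-ins for the traffic and attack characteristics the methodology identifies, not the actual dissertation data.

```python
import math

def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit P(detect) = sigmoid(w0 + w . x) by gradient ascent on the
    log-likelihood. xs is a list of feature vectors (hypothetical
    characteristics such as scan rate); ys is 1 if detected, else 0."""
    w = [0.0] * (len(xs[0]) + 1)  # w[0] is the intercept
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            z = w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))
            p = 1.0 / (1.0 + math.exp(-z))
            err = y - p
            w[0] += lr * err
            for i, xi in enumerate(x):
                w[i + 1] += lr * err * xi
    return w

def predict(w, x):
    """Predicted true positive rate for characteristics x."""
    z = w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))
    return 1.0 / (1.0 + math.exp(-z))
```

Comparing two detectors then reduces to comparing their fitted equations: for any operating conditions x, the detector with the higher predicted detection probability is expected to perform better under those conditions.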
Future research on this methodology centers on determining whether it can be applied more generally. It needs to be tested against multiple detectors of different types and complexities to determine whether it generalizes. A more generic approach or set of guidelines will follow from these tests, so that others can apply the methodology to their own detectors. The end goal is a methodology that, when applied to a single detector, yields a model indicating which variables are important to the detector's performance, along with the true and false positive rates to expect when the detector is deployed in new environments.