Control Behavior Models

[PDF]
Intrusion detection~\cite{} involves detecting and preventing malicious actions that can potentially (a) compromise data integrity and/or confidentiality and (b) availability of resources. At its core, intrusion detection requires a methodology that can effectively distinguish malicious behavior from correct behavior of systems, thereby increasing the capability of preventing intrusive behavior on one hand and decreasing the number of false alarms on the other. Anomaly detection, a typical methodology applied in intrusion detection systems (IDS), is a two-phase process. In the first phase, a model is generated which captures the intended (or correct) behavior of the system~\cite{FHRS90,Lunt92,ALJTV95} while the second phase compares the generated model against observed behavior of the system. Deviations of the observed behavior from the model are flagged as potential attacks to the system. The approach has the capability of detecting both known/unknown attack sequences. However, it can potentially suffer from the disadvantage of unduly large number of false alarms in the event that the generated model does not replicate all possible correct behavior of the system. In this context, static-analysis based techniques are preferred to the ones based on runtime-analysis, due to the formers capability to capture the complete control behavior of the system in terms of the system calls invoked by it. \typeout{system call: why? some stuff here} Models generated by static-analysis are typically control-flow graphs (CFGs) where a state is represented using the program counter value and the inter-state transitions are labeled by the system calls that are being invoked at the source state. Monitoring the observed behavior using such a model involves matching observable system call sequences with the sequence of transitions in the control flow graph. Two important requirements of this technique, as eluded in \cite{GJM02}, are \begin{enumerate} \item Precision: the model must capture all possible sequence of actions that are deemed as correct system behavior and nothing else. This stems from the underlying requirements that correct behavior must not be classified as erroneous and anomalous behavior must not go un-noticed. \item Efficiency: monitoring observable behavior must incur minimal overhead in time and space usage. \end{enumerate} Unfortunately, enhancing both precision and efficiency is a difficult, if not impossible, task owing to the fact the precision requires incorporating minute details of the system in the model which in turns slows down the monitoring phase thereby reducing efficiency. We present, in this paper, a novel approach which enhances precision without incurring loss of efficiency. \paragraph{The driving problem.} Models based on CFGs capture the branching (if-then-else, while-do) behavior of a procedure. In other words, CFGs can be represented in terms of regular languages. However, such regular behavior is \emph{insufficient} to model programs, with (recursive) procedures, as calls to and returns from a procedure can only be represented in terms of non-regular patterns, i.e., context-free languages. CFG models, therefore, are imprecise. To counter the inherent imprecision and generate models that are closer to the programs, push-down systems (PDSs) are employed. Push-down system is expressive enough to capture the context-free patterns and in turn can truly replicate the exact call-return sequences present in a program. Expressiveness of a PDS comes from the fact that, unlike CFG, it explicitly keeps track of the changes in the execution stack of the program as it moves from one state to another. At a high level, PDS maintains a stack of return locations, the location at the top-of-the stack being the state to which the program goes to once the current procedure exits. Though more precise than CFG, PDS suffers from the disadvantage of being space inefficient (the push-down stack is as large as the execution stack of the program being monitored). In short, precision comes at the cost of efficiency.~\cite{GJM02} proposed an optimization technique where requirement is precision is compromised to gain higher efficiency. The technique is based on generation of what is referred to as \emph{hybrid model} (see Section~\ref{sec:prelim}). A hybrid model can be ranked in between a CFG model and PDS model both in terms of efficiency and precision.

Return to homepage of Prem Uppuluri
Last Updated: May 20, 2004