Control Behavior Models
[PDF]
Intrusion detection~\cite{} involves detecting and preventing
malicious actions that can potentially (a) compromise data integrity
and/or confidentiality and (b) availability of resources. At its core,
intrusion detection requires a methodology that can effectively
distinguish malicious behavior from correct behavior of systems,
thereby increasing the capability of preventing intrusive behavior on
one hand and decreasing the number of false alarms on the other.
Anomaly detection, a typical methodology applied in intrusion
detection systems (IDS), is a two-phase process. In the first phase, a
model is generated which captures the intended (or correct) behavior
of the system~\cite{FHRS90,Lunt92,ALJTV95} while the second phase
compares the generated model against observed behavior of the
system. Deviations of the observed behavior from the model are flagged
as potential attacks to the system. The approach has the capability of
detecting both known/unknown attack sequences. However, it can
potentially suffer from the disadvantage of unduly large number of
false alarms in the event that the generated model does not replicate
all possible correct behavior of the system.
In this context, static-analysis based techniques are preferred to the
ones based on runtime-analysis, due to the formers capability to
capture the complete control behavior of the system in terms of the
system calls invoked by it. \typeout{system call: why? some stuff
here} Models generated by static-analysis are typically control-flow
graphs (CFGs) where a state is represented using the program counter
value and the inter-state transitions are labeled by the system calls
that are being invoked at the source state. Monitoring the observed
behavior using such a model involves matching observable system call
sequences with the sequence of transitions in the control flow
graph. Two important requirements of this technique, as eluded in
\cite{GJM02}, are
\begin{enumerate}
\item Precision: the model must capture all possible sequence of
actions that are deemed as correct system behavior and nothing else.
This stems from the underlying requirements that correct behavior must not
be classified as erroneous and anomalous behavior must not go un-noticed.
\item Efficiency: monitoring observable behavior must incur minimal
overhead in time and space usage.
\end{enumerate}
Unfortunately, enhancing both precision and efficiency is a difficult,
if not impossible, task owing to the fact the precision requires
incorporating minute details of the system in the model which in turns
slows down the monitoring phase thereby reducing efficiency. We
present, in this paper, a novel approach which enhances precision
without incurring loss of efficiency.
\paragraph{The driving problem.}
Models based on CFGs capture the branching (if-then-else, while-do)
behavior of a procedure. In other words, CFGs can be
represented in terms of regular languages. However, such regular
behavior is \emph{insufficient} to model programs, with (recursive)
procedures, as calls to and returns from a procedure can only be
represented in terms of non-regular patterns, i.e., context-free
languages. CFG models, therefore, are imprecise.
To counter the inherent imprecision and generate models that are
closer to the programs, push-down systems (PDSs) are
employed. Push-down system is expressive enough to capture the
context-free patterns and in turn can truly replicate the exact
call-return sequences present in a program. Expressiveness of a PDS
comes from the fact that, unlike CFG, it explicitly keeps track of the
changes in the execution stack of the program as it moves from one
state to another. At a high level, PDS maintains a stack of return
locations, the location at the top-of-the stack being the state to
which the program goes to once the current procedure exits. Though
more precise than CFG, PDS suffers from the disadvantage of being
space inefficient (the push-down stack is as large as the execution
stack of the program being monitored). In short, precision comes at
the cost of efficiency.~\cite{GJM02} proposed an optimization
technique where requirement is precision is compromised to gain higher
efficiency. The technique is based on generation of what is referred
to as \emph{hybrid model} (see Section~\ref{sec:prelim}). A hybrid
model can be ranked in between a CFG model and PDS model both in terms
of efficiency and precision.
Return to homepage of Prem Uppuluri
Last Updated: May 20, 2004