					               13th ICCRTS: C2 for Complex Endeavors


                  “Mutual Information and the Analysis of Deception”

                          Topic 4: Cognitive and Social Issues

                        E. John Custy (contact) and Neil C. Rowe



               SPAWAR Systems Center and Naval Postgraduate School

                           Code 24527, San Diego, CA 92152

            and Code CS/RP, 1411 Cunningham Road, Monterey CA 93943

                             (619) 553-6167, (831) 656-2462

                       john.custy@navy.mil, and ncrowe@nps.edu

Abstract

This paper describes a general analysis technique for deception. We show that deception
can be placed within the same modeling framework commonly associated with
communication, and that elementary concepts from information theory can then be
applied. In particular, the average effectiveness of a deception can be measured in terms
of the mutual information between two random variables: the random variable
representing the true state of nature, and the one representing a target's perception of the
state of nature. Our analysis technique provides (i) a general yet simple framework for
understanding deception, and (ii) a practical method for measuring the effectiveness of
specific deception scenarios.



Keywords:      deception, information theory, communication, spoofing channels


I.     Introduction

One of the most powerful ways of countering an adversary is through deception. Intuition
suggests that deception and communication are somehow related, but the precise nature
of any relationship that might exist has not yet been established. This is unfortunate
because the past several decades have seen sophisticated and useful mathematical tools
developed to characterize communication systems, yet few, if any, analysis techniques
currently exist for deception [17].



In this paper we describe how the average effectiveness of a deception can be quantified
with the same conceptual tools used to characterize conventional communication
systems. Our analysis technique is based on the average mutual information between two
specific random variables that arise in the course of any deception. These two random
variables are (i) the actual state of nature, and (ii) the state of nature perceived by the
deception target. Mutual information is a measure of how much information the value of
one random variable provides, on average, about the value of another. In our model, the
deception target attempts to “communicate” with reality, and specifically tries to
determine the value taken by some specific state variable. The deceiver plays a role
analogous to noise in a communication system. Justification for this model comes from
the concept of deception being the imposition of a specific false version of reality onto a
target.



It is important to note that our analysis technique gives information only about the
average effectiveness of a deception, not about the success or failure of any particular
deception event. This information about average effectiveness nonetheless provides
important insight into, for example, how deceptions can be expected to “evolve” as
participants become aware of the tactics being used by others.



This paper is structured as follows. The following section introduces some basic
terminology and concepts about communication systems to support later discussions.
Section III, which contains the core ideas of this paper, describes how deception can be
treated as communication. These general concepts are illustrated by some examples in
Section IV. Section V presents some material that we feel is interesting and relevant, but
which must still be considered “work in progress.” Finally, Section VI concludes with a
high-level summary of major points.


II.    A Sketch of Communication Terminology and Concepts

In simplest terms, a conventional communication system consists of an information
source (or “transmitter”) and an information receiver (“destination” or “sink”), the two
being linked through a channel. The source picks at random a message from a large
number of possible messages and impresses it onto the channel. The channel delivers to
the receiver what can be interpreted as “evidence” about the message chosen by the
source, and the receiver uses this evidence to infer the sent message.



The most useful way to characterize both the source and the channel is in terms of their
statistical behavior, because anything deterministic about their behavior is not of much
conceptual interest [13]. Stated in different terms, the source is characterized statistically
because the specific information to be communicated is unknown when the system is
designed, and likewise the deterministic behavior of the channel can be accommodated
before the system is used. To simplify terminology, the phrase random variable will be
used to designate any random outcome, regardless of whether numbers have been
assigned to those outcomes.



Our discussion of deception will rely heavily on Figure 1, which shows an abstract
representation of a discrete binary communication system. When interpreted in
conventional communication terms, the information source generates symbols denoted A
and B, with a-priori probabilities pA and pB=1−pA. That is, the fraction of A’s in a long
sequence generated by this source will be about pA, and the fraction of B’s will be about
pB=1−pA.



The transition probabilities pAB and pBA indicate the rates at which each symbol type is
delivered to the destination either correctly or incorrectly. That is, non-zero transition
probabilities will cause a symbol of one type to be delivered as the other type. For
example, pAB is a probability that represents the rate at which an input of symbol A is
delivered to the destination as symbol B. Transitions of this sort in communication
systems are typically caused by noise.



Significantly, transition probabilities close to one do not necessarily imply poor
communication performance. For example, the case pAB=pBA=1 is equivalent to the case
pAB=pBA=0, as far as communication effectiveness is concerned. Both of these cases
provide noiseless communication, with the situation pAB=pBA=1 requiring only a little
post-processing at the destination, namely the conversion of all A’s to B’s and all B’s to
A’s.




   Figure 1. An abstract representation of a discrete binary communication system.



A particular type of statistical average useful for describing communication systems is the
entropy of a random variable [Cover and Thomas, 2006]. The entropy of a random
variable X is denoted H(X), and represents (in informal terms) the average number of bits
needed to describe the values taken on by that random variable. For example, if a discrete
random variable X has an entropy of 2 bits/value, then we can say that, on the average,
we need 2n bits to represent n particular values of this random variable.
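
To make this concrete, the following minimal Python sketch computes the entropy of a discrete distribution in bits; the uniform four-valued distribution is our own illustrative choice, not an example taken from the paper.

```python
import math

def entropy(probs):
    """H(X) = -sum p*log2(p), in bits per value."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# A uniform four-valued random variable needs 2 bits per value on average.
print(entropy([0.25, 0.25, 0.25, 0.25]))  # 2.0
```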



A natural generalization of entropy is the mutual information I(X;Y) between two
random variables X and Y. Mutual information, which takes the form

                       I(X;Y) = H(X) − H(X|Y),

can be interpreted as the average number of bits that one random variable provides about
the other. It can be shown that

                       I(X;Y) ≥ 0;

                       I(X;Y) = I(Y;X);

                       I(X;X) = H(X).
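
As a concrete check on these definitions, the short sketch below computes I(X;Y) directly from a joint distribution; the two-by-two joint table is a hypothetical example of ours, not one drawn from the paper.

```python
import math

def mutual_information(joint):
    """I(X;Y) in bits, from a table joint[x][y] = P(X=x, Y=y)."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    total = 0.0
    for x, row in enumerate(joint):
        for y, pxy in enumerate(row):
            if pxy > 0:
                total += pxy * math.log2(pxy / (px[x] * py[y]))
    return total

# I(X;X) = H(X): a perfectly correlated binary pair recovers the full bit.
print(mutual_information([[0.5, 0.0], [0.0, 0.5]]))  # 1.0
```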

The entropy concept is useful because it conveniently encapsulates the (weak) law of
large numbers, which allows the use of simple “counting” arguments for evaluating
communication systems.



Mutual information finds use, for example, when a channel capable of conveying a signal
of some sort (optical, radio frequency, acoustic, etc.) becomes part of a communication
system. This is accomplished by introducing interfaces between the source and channel,
and between the channel and destination. A typical goal is to optimize performance
subject to cost and other constraints, with performance evaluated by comparing the actual
value of I(X;Y) against the maximum possible value of I(X;Y) for the channel.
Statistical descriptions of channel characteristics allow computation of the maximum
possible value of I(X;Y).



Though these ideas find their most direct application in the development of
communication systems, they are very general and are routinely applied under a variety
of circumstances. Perhaps the most important point for our purposes is that the
mathematical tools used to analyze communication systems can just as easily be used to
analyze sensor systems, because these terms refer fundamentally to the same thing.
Informally, a communication system can be viewed as simply a specially configured
sensor system, and a sensor system, on the other hand, can be viewed as a special type of
communication system. Stated differently, the random variable X in Figure 1 may be
generated by, or located alongside, an agent who has an interest in transferring
information to the destination, or, alternately, X can be associated with a process that is
completely indifferent to that transfer process. The mutual information between the
source and destination does not depend on the interests of the participants.


III.   A Link Between Communication & Deception

Our analysis technique for deception is based fundamentally on the following definition.
We use the term deception target or just target to refer to anyone who has a deception
applied against them, whether or not they “fall” for it. This definition is based on
[Godson and Wirtz, 2002, p. 6].





    Deception is the presentation of a specific false version of reality by a deceiver
     to a target for the purpose of changing the target's actions in a specific way that
    benefits the deceiver.

Deception is the imposition of a specific false version of reality onto an adversary; that is,
a deceiver does not surround the correct version of reality with an obscuring fog, but
rather replaces it with a specific and carefully created false version of reality. Deception
is thus quite distinct from the denial of information to an adversary, and it is quite distinct
from efforts that direct an adversary in a random, haphazard direction. As stated
eloquently in [Whaley, 2007], a successful deception will make an adversary “... quite
certain, very decisive, and wrong” [emphasis in the original].

These ideas can be made precise. People in general, and deception targets in particular,
are constantly observing their environment in an ongoing effort to make correct
inferences about that environment. A deceiver manipulates the environment of a target
so that observations will suggest some specific incorrect version of reality. Figure 1
depicts this idea with the environment, or “state of nature,” represented as a source of
information, and the deception target represented as an information destination. A full
representation of nature in terms of state variables would be hopelessly complicated, but
for our deception model we need only be concerned with a single state variable that can
take either of two states, which we will call A and B. State A is essentially an arbitrary
state of nature, subject only to the requirement that it not be the “false” state of nature
mentioned in Definition 1, with other requirements perhaps imposed by the specific
deception. These mild requirements make the deception possible, so we refer to state A
as the precipitating state for the deception (though it can also be described as the actual
state). State B is the false version of reality referred to in Definition 1. We refer to state
B as the “false” or bogus state of nature that the deceiver uses as a mask, or disguise, for
state A.

Our model requires that either state A or state B hold, but that both not hold
simultaneously. Of particular interest is that Definition 1 implies that there are
necessarily circumstances under which a given deception cannot be carried out.
Specifically, any given deception has associated with it a “false,” or bogus, version of
reality, and it must be possible for this bogus version of reality to actually occur. When
the bogus version of reality is actually in effect, the given deception cannot be carried
out. For similar reasons, when state A is in effect, state B cannot be. Thus states A and
B are mutually exclusive, and though they do not necessarily exhaust all the possible
values that a state variable can take, any other values are of no interest for the deception,
and we can, for convenience, condition all probabilities on the event that either A or B holds. Because the
precipitating and false versions of reality are mutually exclusive, and because we are
interested only in cases where one or the other holds, the mathematical tools developed for
binary communication systems can be applied to deception scenarios.

As an aside we note that if the bogus version of reality were impossible, it would be
pointless to try to convince a target that it was actually in effect. Miracle weight-loss
pills and other outlandish products are not counter-evidence to this statement; it is only
necessary that the target consider the false state of nature to be plausible. It is interesting
to note that these phenomena can be modeled as sensors operating with unrealistic, or
“skewed,” a-priori probabilities.

The role of the deceiver in this model is analogous to that of channel noise in a
communication system. When nature is in a state appropriate for a specific deception,
and when a deceiver decides to exploit those circumstances, one or more deception
targets will have evidence of a specific false version of reality placed into their
environment. The imposition of a false version of reality is represented in Figure 1 by a
non-zero value assigned to the transition probability pAB. That is, pAB represents the rate
at which the precipitating state A appears as the bogus state of nature B in the eyes of the
target.

In broad terms, our model of deception has the deception target attempting to infer the
correct state of nature while a deceiver attempts to impose a specific false state of nature.
In essence, the deception target is separated from reality by a communication, or sensor,
channel, and successes by a deceiver act as channel noise, causing errors in the
target's inferences about nature.

A.     One-Sided Deceptions

The effectiveness of a deception is reflected in the mutual information between two
random variables which, as described above, are present in any deception. These are (i)
the random variable representing an information source or state of nature, denoted X in
Figure 1; and (ii) the random variable representing a target's decision about the source,
denoted Y. The mutual information I(X;Y) between these two random variables
represents the number of bits of valid information that a deception target obtains, per
observation, about the value of X. An “observation” is the total information gathered by
a target before acting, and the state the target believes to be correct is revealed by
their actions.








     Figure 2.      Mutual information between random variables at the source and
      destination of a noiseless channel, Z-channel, and binary symmetric channel.

The most straightforward interpretation of Definition 1 in terms of Figure 1 occurs with
0<pAB≤1 and pBA=0. The resulting mutual information I(X;Y) as a function of pAB is
shown by the dashed curve in Figure 2. This most elementary type of deception, which is
analogous to a Z-channel in communication systems, will be referred to hereafter as a
one-sided deception.

An important characteristic of a one-sided deception is that a deception target can make
only one type of error. Specifically, a deception target is only capable of misinterpreting
a state variable as being in state B when it is actually in state A.

A one-sided deception is most effective when pAB=1. When this occurs, repeated
deployment of the deception results in the target (or target community) perceiving only
state B, regardless of whether state A or B is in effect. Figure 2 shows that the mutual
information is sensitive to the value of pAB, and in fact the derivative of I(X;Y) at pAB=1
is -1/2. As a consequence, when pAB≈1, a small decrease of ∆ in pAB provides an increase
of about ∆/2 bits to the deception target. This sensitive behavior comes about because the
target is only capable of misinterpreting A as B, so any appearance of A is valid evidence.
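
The sketch below evaluates I(X;Y) for this one-sided (Z-channel) case. The equal priors pA = pB = 1/2, the helper names, and the probe values are our own illustrative assumptions, chosen only to match the dashed curve described for Figure 2.

```python
import math

def hb(p):
    """Binary entropy in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def one_sided_mi(p_ab, p_a=0.5):
    """I(X;Y) for a one-sided deception: A shown as B at rate p_ab, B never disguised."""
    p_perceive_b = p_a * p_ab + (1 - p_a)        # probability the target perceives B
    return hb(p_perceive_b) - p_a * hb(p_ab)     # H(Y) - H(Y|X)

for p_ab in (0.0, 0.5, 0.9, 0.99, 1.0):
    print(p_ab, round(one_sided_mi(p_ab), 4))
# I(X;Y) falls from 1 bit at pAB = 0 to 0 at pAB = 1; near pAB = 1 a shortfall
# of delta still leaves roughly delta/2 bits per observation to the target.
```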

In summary, a one-sided deception, the most elementary type of deception that can be
represented in our model, corresponds to 0<pAB≤1 and pBA=0 in Figure 1. The mutual
information I(X;Y) is given by the dashed curve in Figure 2 and, as with any deception,
tells us the average number of bits of information the deception target gains per
observation. Under a one-sided deception, the deception target can make only one type
of error, because only one value of a specific random variable is being disguised.

B.     Symmetric Complements and Symmetric Deceptions

Following the terminology introduced above, a one-sided deception results, when
successful, in state B replacing, or masking, state A when state A occurs. Consider the
transition probability pBA, which is zero for a given one-sided deception. This transition
probability can be formally interpreted as the rate at which the state A is made to stand in
place of state B. An error of this type is related to, but quite distinct from, the given one-
sided deception on which it is based. A one-sided deception with pAB>0 and pBA=0 has
associated with it a deception characterized by pBA>0 and pAB=0, which we will call the
symmetric complement of the original. A deception and its symmetric complement used
together will be referred to as a symmetric deception. That is, a symmetric deception
disguises state A as state B when A occurs, and disguises state B as state A when B
occurs.

The idea of the symmetric complement of a deception can be considered from another
perspective. Note that Figure 1 can be used to represent two distinct one-sided
deceptions: one with pAB>0 and pBA=0, and one with pBA>0, pAB=0. Each of these one-
sided deceptions is the symmetric complement of the other. When used together they
form a symmetric deception.

The solid “U-shaped” curve in Figure 2 shows the mutual information associated with a
symmetric deception with pAB=pBA. This curve shows that a symmetric deception can
cause a target's observations to become useless (i.e., I(X;Y)≈0) without transition
probabilities taking the extreme values of zero or one. Additionally, at I(X;Y)=0 the
derivative of I(X;Y) with respect to pAB=pBA is zero.

There is no reason to expect that the transition probability of a deception will equal the
transition probability of its symmetric complement, and in fact these parameters are
completely independent of each other. However, it can be shown that symmetric
deceptions in general achieve I(X;Y)=0 for values of pAB and pBA away from zero and
one, and that the derivative of I(X;Y) with respect to pAB and pBA at I(X;Y)=0 is zero.
These two properties suggest that the requirements for an effective symmetric deception
are much less stringent than those associated with a one-sided deception. As an aside it
should be noted that these two properties are not implied by the fact that I(X;Y) is a
convex function of the transition probabilities.
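
A corresponding sketch for symmetric deceptions, again assuming equal priors (our own choice, as are the probe points), shows both the I(X;Y)=0 locus away from the extremes and the information leaked by a deception that overshoots.

```python
import math

def hb(p):
    return 0.0 if p in (0.0, 1.0) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def symmetric_mi(p_ab, p_ba, p_a=0.5):
    """I(X;Y) when A is shown as B at rate p_ab and B is shown as A at rate p_ba."""
    p_perceive_b = p_a * p_ab + (1 - p_a) * (1 - p_ba)
    return hb(p_perceive_b) - p_a * hb(p_ab) - (1 - p_a) * hb(p_ba)

print(round(symmetric_mi(0.5, 0.5), 4))  # 0.0: observations useless at pAB = pBA = 1/2
print(round(symmetric_mi(0.3, 0.7), 4))  # ~0.0: I(X;Y) = 0 also holds away from the extremes
print(round(symmetric_mi(0.9, 0.9), 4))  # ~0.53: an overshooting deception leaks information
```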

A further important characteristic of symmetric deceptions is that, as shown in Figure 2,
the value of I(X;Y) increases as transition probabilities increase beyond 1/2. This
situation is analogous to a binary communication channel that “flips” most of the bits
sent through it. When this occurs in a communication system, a little post-processing at
the receiver, namely inverting all received symbols, can easily remedy the situation.

In the case of a symmetric deception, however, the increase in I(X;Y) as transition
probabilities increase beyond 1/2 implies that information about the actual state values is
available to the deception target. This information can be extracted by the target in the
following way: the target simply decides which state is most strongly implied by the
available evidence, and then behaves as if the opposite state were in effect. Stated in
different terms, if the target community knows that a symmetric deception is being
applied against them, and this community knows that their performance is worse than
would occur if they completely neglected any relevant observations, then this
community can exploit the deception to their advantage by behaving according to the
state opposite to that implied by observations. The interesting conclusion is that a
deception target can always exploit a deception that is too strong.
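
A small numeric illustration of this exploit, with pAB = pBA = 0.9 and equal priors (all numbers chosen by us purely for illustration):

```python
p_ab = p_ba = 0.9
p_a = 0.5

# Believe the evidence as presented: wrong whenever a state has been flipped.
error_naive = p_a * p_ab + (1 - p_a) * p_ba
# Counter-deception: act on the opposite of whatever the evidence suggests.
error_inverted = p_a * (1 - p_ab) + (1 - p_a) * (1 - p_ba)

print(round(error_naive, 2))     # 0.9: worse than ignoring observations altogether
print(round(error_inverted, 2))  # 0.1: the overshooting deception exploited by the target
```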

Thus, unless the mutual information between these two random variables is low,
deceivers leave themselves vulnerable to counter-deception techniques. This suggests
that after a long enough exchange of measures and counter-measures, a deception will
degenerate into what is essentially a case of denial.

C.     Further Significance of Mutual Information

Interpreting deception in terms of mutual information leads to the concepts of one-sided
and symmetric deceptions, which constitute unique perspectives into deception provided
by information theory. We note here for completeness that our model also allows mutual
information to serve the same purposes for deception that it does for communication.

For example, mutual information allows apparently disparate deceptions to be compared.
That is, because the effectiveness of a target in perceiving the correct state of nature is
measured in the common units of bits for all deceptions, it becomes possible to compare
deceptions of very different types.

The use of mutual information also allows computations to be carried out for deceptions.
If two communication channels provide mutual information values of I1(X;Y) and I2(X;Y),
then the two channels operating independently of each other in parallel provide a mutual
information of I1(X;Y)+I2(X;Y). Likewise, if a target is able to gather I1(X;Y) bits of
information about the state of nature in the face of some particular deception, and another
target in another particular deception scenario (that is, operating independently of the
first) obtains I2(X;Y) bits of information about the state of nature in the face of that
deception, then it can be said that the community of targets gathers I1(X;Y)+I2(X;Y) bits
of information about the state of nature through the two deceptions.
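
For instance (with hypothetical values of ours), the additivity is a one-line computation:

```python
# Independent channels, or independently deceived targets, add in bits.
i1, i2 = 0.24, 0.10   # hypothetical per-observation information values
print(i1 + i2)        # 0.34 bits gathered by the target community
```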

In addition, we can use mutual information to evaluate the cost effectiveness to a
deceiver of changing the transition probabilities associated with a deception. Suppose
that a one-sided deception is operating with a transition probability p1. A change ∆p1,
which will presumably change the effort and costs incurred by the deceiver, will result in
a change of ∆I(X;Y). This change is a non-linear function of p1, and a change that may be
cost effective at p1 may not be cost effective at some other operating point p2.
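
As a numeric illustration (the increment, operating points, and equal priors are our own assumptions, using the same one-sided formula as the earlier sketch), the same ∆p buys quite different amounts of ∆I(X;Y) at different operating points:

```python
import math

def hb(p):
    return 0.0 if p in (0.0, 1.0) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def one_sided_mi(p_ab, p_a=0.5):
    return hb(p_a * p_ab + (1 - p_a)) - p_a * hb(p_ab)

dp = 0.05
for p1 in (0.20, 0.60, 0.95):
    denied = one_sided_mi(p1) - one_sided_mi(p1 + dp)
    print(p1, round(denied, 4))   # bits per observation denied to the target by the same dp
```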






IV.    Examples

So far our presentation has been as general as possible. The following examples,
summarized in Table 1, provide specifics that may be helpful for understanding the
material introduced above. The first two examples are prototypical deceptions, at least in
the popular imagination: the sale of used cars, and claims made on income tax forms.
The next two examples, camouflage and identity theft, are like the previous two in that
they are one-sided deceptions that do not have useful symmetric complements. The last
two examples, one involving runways and the other involving honeypots, are interesting
because they have natural symmetric complements.




A.     Sale of Shoddy Item

Participants exchanging goods, services, and/or money must share a common interest,
because otherwise the exchange would not take place. However, there is also conflict:
each party wants to “get” as much as possible and “give” as little as possible. Potential state
variables include the maximum amount of money that the purchaser is willing to spend,
the minimum amount of money that the seller is willing to accept, and the quality of any
products being exchanged.

This example will consider the sale of a used car which, for simplicity, is assumed to
have a fixed price, but which may be a “high quality” or a “shoddy” item. A potential
buyer has available some class of used cars from which to make a choice, say, all of the
cars on a particular lot, or all of the cars of a particular model and year. Before making
any observations, our hypothetical purchaser knows only the a-priori probabilities
associated with the available collection: that is, they know that x% of the members of a
particular class are of “high quality” and that (1−x)% are of “low quality.”

As far as a potential purchaser is concerned, the quality of a used car in this collection is
random: the quality of any candidate for purchase may be high, or it may be low. Before
making any observations, the best choice that the purchaser can make is based on the a-
priori probabilities associated with the class in question. For example, if 90% of the cars
in the collection are of low quality, then the best choice to assign to an arbitrary member
chosen at random from the class is “low quality.” If half of the cars in the collection are
high quality and half are low quality, then on the average neither choice will be better
than the other for an arbitrary member of the class chosen at random.

Before making a decision, however, a potential purchaser will usually supplement the a-
priori probabilities with some observations. That is, our purchaser will look at several
specimens from the collection, noting mileage, cleanliness, service records, and so forth.
These observations provide no guarantees about the state of the automobile, but they do
provide evidence that can be used to form an estimate of the automobile's state.

We will assume that salespersons have no incentive to disguise the actual state of nature
to a customer examining a high quality car. That is, we will assume that causing a high
quality car to appear of low quality will only hurt the salesperson. On the other hand,
on those occasions when a potential purchaser is making observations of a low
quality car, the salesperson may attempt to counter these observations in some systematic
way, say by “rolling back” the odometer. However, the particular techniques used by the
salesperson to create this false version of reality are irrelevant to our analysis. The only
significant point is that a potential purchaser observing a low-quality car will have a false
version of reality presented to them by the salesperson. The deceptive salesperson will
presumably be successful some fraction of the time.

This is a one-sided deception because the salesperson is only attempting to make a low
quality item look as if it were high quality; there is no attempt to make high quality items
appear as low quality. Assume for simplicity that half the cars on a lot are of high
quality, and the other half are of low quality. Suppose further that a particular salesman,
when presenting a low quality car, is successful x% of the time in making the car
appear as high quality. The dashed curve in Figure 2 will provide a value of mutual
information, say y bits, associated with the one-sided deception with this transition
probability. Then the community of deception targets (i.e., used car purchasers) can view
that particular salesperson as a channel which provides each customer with y bits of
information about the quality of a candidate car. Each salesperson provides an
information gathering ability that conforms to the dashed curve in Figure 2.
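
To make this concrete, suppose (hypothetically) that the salesperson succeeds 60% of the time on low quality cars and that the lot is split 50/50; the sketch below, which reuses the one-sided formula from Section III under those assumed numbers, gives the corresponding y:

```python
import math

def hb(p):
    return 0.0 if p in (0.0, 1.0) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)

x = 0.60                             # hypothetical success rate on low quality cars
y = hb(0.5 * x + 0.5) - 0.5 * hb(x)  # I(X;Y) for this one-sided deception, equal priors
print(round(y, 3))                   # about 0.236 bits of valid information per observation
```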

Consider the extreme case in which the salesperson is totally ineffective in their
deception efforts. Then a customer can make noiseless observations about the quality of
a car through this salesperson, resulting in one bit per observation. Through this
salesperson, a purchaser can perfectly determine the quality of a used car.

Imagine, on the other hand, a salesperson who is a perfectly effective deceiver. This
salesperson always makes every low quality automobile appear as a high quality car.
Then all observations made through this salesperson will be “high quality.” This
salesperson is useless to the customer, because he/she conveys no information; in
essence, the noise associated with this channel is so bad that zero bits of information can
be conveyed through it. Note that in order to achieve this extreme case, the deceiver
must be completely effective; a small decrease in this salesperson's effectiveness means
that a non-zero number of bits will be transferred on the average for each observation
made by the average target. That is, the target can be sure that any car that appears to be
of low quality actually is of low quality, due to this deception being one-sided.



Under the one-sided deception outlined above, the best the dealer can do is make all cars
appear to be of high quality. However, suppose the salesperson presents high quality cars
as low quality. This situation is not entirely implausible: it could arise if the dealer wants
to “hold” the high quality cars for selected customers or for their own use; to circumvent
poorly conceived tax laws; or it could result from a dealership that is a “front” for some
illegal activity. Under these circumstances, a potential customer is subject to two distinct
types of errors: the customer may mistake a low quality car as one of high quality, and
the customer may mistake a high quality car as being of low quality. If the probability of
each of these types of errors is equal, then the solid curve in Figure 2 gives the average
number of bits of information gained by a purchaser per observation.

Under these circumstances, a salesperson only has to be 50% successful at each type
of deception in order to make I(X;Y)=0. When I(X;Y)=0, a target's observations are
useless, so a salesperson who is successful about half the time at each type of deception
provides the same amount of information to a customer about the quality of a car as does
the flipping of a fair coin. This situation is more robust for the salesperson than the
corresponding one-sided deception: even if the salesperson's success rate varies slightly
from 50%, the amount of information gathered by the target will still remain about
zero.

The salesperson may desire to cause an error rate of greater than 50% on the part of the
targets. However, if the salesperson is successful in causing more than 50% errors on the
part of automobile purchasers, then there is sure to be some systematic characteristic of
the deception technique(s) that could be exploited by the target community. That is, if
targets systematically believe that more than half of the high quality cars are low quality,
and more than half the low quality cars are believed to be high quality, then, as intuition
would suggest, there is a non-zero amount of mutual information that the target
community can take advantage of. However, it may be non-trivial for the target
community to discern and exploit this mutual information.

B.     Reporting Income for Tax Purposes

Another prototypical deception is the reporting of income for tax purposes. The state
variable being manipulated is the income of a taxpayer, and for simplicity we will assume
that income can take on one of only two values: A denotes the value “high income,” and
B denotes the value “low income.” The taxpayer is the deceiver in this scenario, and the
deception consists of reporting that income is low when in fact it is high. Only a person
with high income can carry out this deception; a person with low income is powerless to
attempt this particular deception.

The tax examiner is the deception target, and this person uses material such as (but not
necessarily limited to) the form submitted by the taxpayer to discern the correct value of
the state variable. The transition probability pAB is the rate at which the tax examiner
accepts state B when state A is in effect; that is, pAB is the rate at which the examiner
accepts as true a high income taxpayer's statement that their income is low.





The transition probability pBA, on the other hand, represents the rate at which someone
who actually has low earnings is believed to have high earnings. If the tax agent makes
no “honest” mistakes, pBA will be zero. However, if for some reason low income
taxpayers provided evidence that they were high income, the tax examiner would be
dealing with a symmetric deception. If enough high income persons gave convincing
evidence that they were low income, and enough low income persons gave convincing
evidence that they were high income, the tax examiner's observations would become
worthless for determining income.

C.      Camouflage

A soldier or a hunter who dons special clothing to blend in with, say, a forest background
is manipulating a state variable related to location. The state variable can take on values
“someone is located in this forest,” which we will denote A, and “no one is located in this
forest,” denoted B. The special clothing and slow, quiet movements of the deceiver
provide evidence to observers that B is in effect when A actually is. Someone who is not
in the forest cannot carry out this deception.

If we imagine that the deception target is a sentry who is surveilling the forest, and that
the camouflage is perfectly effective, then the sentry will believe that state B holds. The
symmetric complement of camouflage is a deception in which a region is made to appear
populated when in fact it is empty. Conceivably, such a deception could be carried out
with noise makers and/or mechanical devices for creating motion. Appropriate use of
this symmetric complement can achieve the same ends as perfect camouflage, namely,
making observations by the sentry useless.

D.      Identity of an Individual

Many cases of social engineering [Mitnick, 2002] involve a deceiver who assumes the
identity of an “insider.” The state variable of interest here can take on either the value
“this person has authority” which we will denote B, or alternately “this person is not who
they claim to be,” denoted A.

The transition probability pAB then represents the rate at which an “outsider” is successful
at being accepted as an “insider.” The symmetric complement is a deception in which a
person with authority provides evidence that they have no authority. As in all the cases
above, this symmetric complement is not very natural or practical in any real sense.

E.      Runway Strafing

This example illustrates that the symmetric complement of a deception can be a very
intuitive concept. Consider the following passage [Whaley, 2002], which involves the
use of “dummy” aircraft to divert attacks away from real aircraft.



     Sometime around mid-1942, Major Oliver Thynne was a novice planner with Colonel
     Dudley Clarke's “A” Force, the Cairo-based British deception team. From intelligence,
     Thynne had just discovered that the Germans had learned to distinguish the dummy
     British aircraft from the real ones because the flimsy dummies were supported by struts
     under their wings. When Major Thynne reported this to his boss, Brigadier Clarke, the
     “master of deception,” fired back

     “Well, what have you done about it?”

     “Done about it, Dudley? What could I do about it?”

     “Tell them to put struts under the wings of all the real ones, of course!”

Here the original one-sided deception is perpetrated by the British defenders, who use
dummy aircraft to deceive the enemy attackers. The state variable being manipulated by
the deceivers is the identity of an item sitting on a runway. Each item on the runway has
a state variable associated with it, with possible values “real aircraft” and “dummy
aircraft.” The only mistake that the deception targets can make is to misbelieve that a
dummy aircraft is a real aircraft; the deception is thus one-sided.

However, the one-sided deception turns out to be imperfect, and because of its simplicity,
the symmetric complement is deployed. If the one-sided deception were perfect, the
attackers would not be able to distinguish real aircraft from the dummies. This
story illustrates that an imperfect one-sided deception can be “salvaged” by proper
deployment of the symmetric complement.

F.       Honeypots and False Honeypots

One of the most effective ways of gathering information about the techniques used by
computer intruders is through the use of a honeypot [The Honeynet Project, 2004]. A
honeypot is a computer that is placed on a network for the purpose of being broken into
by computer intruders. Honeypots do not contain any information of value, and are
usually highly instrumented so that the maximum amount of information about intruders
can be gathered. Most computer intruders try to avoid honeypots so that their intrusion
techniques, which may have required significant effort to develop, will not be revealed.
A honeypot may thus be “tricked out” to appear as a non-honeypot computer containing
valuable information.

A honeypot thus represents a deception involving a state variable that can take the states
“this computer is ordinary,” denoted B, and “this computer is a honeypot,” denoted A. Of
particular interest is the symmetric complement of this deception, which is a deception in
which an ordinary computer is made to appear as a honeypot. Deployment of this
symmetric complement could potentially lead to a situation where a computer intruder
cannot determine whether a computer that they have broken into is a honeypot or an
ordinary computer. That is, the intruder's observations about the machine they have
broken into, gathered by examining all different aspects of the machine, would be useless
for determining whether the machine is a honeypot or an ordinary computer.

Because most computers on large networks like the Internet are ordinary, a computer
intruder is safe in assuming that an arbitrary computer chosen for intrusion is ordinary.
We thus suspect that the main benefits of this symmetric deception would come to those
deploying honeypots, because honeypots make up only a small part of the computer
population, and little or no evidence would be available to an intruder to foil the
deception. However, computer intrusion in general may become less appealing when
intruders have to work “blind” to honeypots.


V.     Miscellaneous Notes & Ongoing Work

This section contains material that is relevant and interesting, but does not seem to fit
naturally anywhere else in this paper.

A.     Deception Inputs and Outputs

Our model treats “reality” as non-deterministic. All that anyone, including a potential
deception target, can do is make observations and then act on inferences based on those
observations. Observations can only provide evidence that a particular state of nature
holds, and though this evidence can be extremely strong, no guarantees come along with
an observation. Deception is possible only because of the non-deterministic nature of
reality, and only because it is possible for deceivers to “feed” false observations to a
target. No one can be deceived about something that is deterministic.

The success or failure of a deception is measured by a target's actions, which in turn
reflect the perception of reality by that target. That is, the target either behaves according
to the specific false version of reality advocated by the deceiver (that is, the deception
succeeds), or the target behaves according to the actual, correct version of reality (the
deception fails). If the target behaves in a way that is independent of the particular
state of nature in question, then that target should be considered irrelevant to evaluating
the success of the deception. For example, if a bogus money-making opportunity is
thoroughly presented to a potential target, but the target suddenly drops dead before they
have an opportunity to accept or decline, then we argue that this specific deployment
cannot be counted as a success or a failure.

B.     Targets That Act as Ill-Designed Sensors

We have seen that the average success of a deception can be analyzed as if it were a form
of noise in a sensor channel between a deception target and reality, and that this model
provides interesting insights into deception and counter-deception. It turns out that
certain odd behavior on the part of deception targets can also be modeled in terms of
communication concepts. In particular, a deception target that acts in accord with a very
unlikely or impossible state of nature (miracle weight-loss pills, etc.) is similar to a
communication receiver that is operating according to a-priori probabilities that are
incorrect.

To see this, imagine a communication receiver designed to receive binary symbols that
occur with equal frequencies; that is, if the symbols are denoted 0 and 1, then any long
sequence generated by the source will have equal numbers of 0's and 1's. Suppose
now that 90% of the symbols generated by the source are 0's and 10% are 1's. In this
case, the threshold for deciding between 0's and 1's will be significantly different than in
the previous case; in essence, the receiver requires much more evidence to decide that a 1
was sent. On the other hand, a receiver designed to operate with a 90/10% mixture of 0's
and 1's will, if provided with a 50/50% input, err by reporting too many 0's.
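
The sketch below makes the threshold shift explicit for a maximum a-posteriori receiver. The Gaussian observation model and the specific means and variance are assumptions of ours, chosen only to illustrate how skewed priors demand stronger evidence for the rarer symbol.

```python
import math

def map_threshold(p0, mean0=0.0, mean1=1.0, sigma=0.5):
    """Decide "1" when the observation exceeds the returned threshold.

    Observations are modeled as x ~ N(mean_s, sigma^2) given symbol s; the
    threshold solves p0 * N(x; mean0) = p1 * N(x; mean1).
    """
    p1 = 1.0 - p0
    return (mean0 + mean1) / 2 + sigma ** 2 * math.log(p0 / p1) / (mean1 - mean0)

print(map_threshold(0.5))   # 0.5  : the midpoint, for equal priors
print(map_threshold(0.9))   # ~1.05: a 90/10 mix of 0's demands stronger evidence for "1"
```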

These ideas are illustrated by the following story, which can be interpreted as that of a
sensor operating with incorrect a-priori probabilities [cartalk, 2007].

     I worked my way through college as a Volvo mechanic, 1969-71. During those years, the
     extremely dependable but dated Volvo 120 series was being replaced by the extremely
     trendy but unreliable 140 series.
     Our shop foreman decided to buy a small Fiat, about 1500cc, saying that he could no
     longer trust the Volvo, and furthermore, he REALLY loved the TREMENDOUS gas
     mileage of the Fiat. The first week he had the Fiat, he did nothing but rave about the gas
     mileage, so we decided to help him. Every day we would add, at first a pint, then more
     and more gas to his tank when he wasn't looking. He went crazy.
     Our skeptical-looking (we were all in on it) crew would be regaled by his tales of getting,
     well, first it was 34, then 50, then 63 miles per gallon. He would snarl condescendingly at
     our gas guzzling Volvos, then reflect on the brilliance of Italian engineering. The Fiat
     dealership, of course, had several explanations. Tight engine. American gas. Driving
     habits. Then we gradually began to reduce the amount we added, until it was zero, and
     then of course we siphoned increasing amounts from the Fiat's tank.
     At first, the bragging slowed to a stop. He became surly. How was the Fiat? Wouldn't
     answer. Then of course he kept taking it back to the Fiat dealership, which, of course, had
     several explanations. Tight engine. American gas. Driving habits. In the end, he found us
     out, and our schedules were screwed for months.

The behavior of this target is analogous to a communication receiver with a detection
threshold set inappropriately low for certain types of symbols. This can result in (i)
accepting as valid states of nature that are extremely unlikely (as in this story); or (ii)
accepting unremarkable states of nature based on small amounts of evidence.

C.       Spoofing Channels

The analysis technique presented in this paper provides no real guidance for the
development of specific deception techniques. However, we are working on a specific
deception technique that is worth mention here because of the interesting relationship this
deception holds to its symmetric complement.

Our deception technique developed from thoughts on how best to respond to a computer
intrusion. A number of computer Intrusion Detection Systems (IDS's) are available for
detecting and alerting system administrators when a computer intrusion occurs; a well
known example of an IDS is SNORT [SNORT, 2007]. However, it is not clear what a
system administrator should do when an intrusion has been detected. One option is to
immediately drop the connection to the intruder, which ensures that sensitive information
is protected to the maximum extent possible. This option, though, has the significant
disadvantage that the intruder is unequivocally notified that they have been detected. It
would be much better if the connection could be maintained and the intruder's activities
on the target machine observed. This would aid forensic work and would provide
guidance for strengthening defenses against further intrusions. However, it is critically
important that sensitive information be protected.

We are developing a computer Intrusion Response System (IRS) based on the idea of a
spoofing channel. A spoofing channel is like a communication channel, but with a
slightly less stringent performance requirement. A conventional communication channel
is obligated to provide as output the same string of symbols provided as input, but in
contrast, the output of a spoofing channel is required only to have the same statistical
structure as the input, with no stronger relationship promised. When used in an IRS, a
spoofing channel delivers to an intruder not the original document residing on the target
computer, but rather a spoof of that document, with the spoof having the same statistical
structure as the original, but no stronger relationship.

The most significant characteristic of the spoofing channel is that it exploits for deception
the uncertainty that conventional communication resolves. That is, communication is
carried out because the information at the source is not known at the destination. If the
information inside a target computer were known to the intruder, there would be no point
in committing the intrusion.

Our work so far has focused on spoofing channels for natural language text. In this
special case, when we say that the output of a spoofing channel has the same statistical
structure as the input, we mean that individual words appear with the same frequency in
the input and output, word pairs appear with the same frequency, word triples, and so on.
Thus when we say that the output of a spoofing channel has the same statistical structure
as the input, we mean that all n-tuples of words appear with the same frequency.
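
As a rough illustration of the idea, and not one of the two techniques we propose below, the sketch regenerates text by walking a word-bigram table; in long outputs this approximately preserves single-word and word-pair frequencies, though nothing stronger. The function names are our own.

```python
import random
from collections import defaultdict

def build_bigrams(text):
    """Map each word to the list of words that follow it in the input text."""
    words = text.split()
    table = defaultdict(list)
    for w, nxt in zip(words, words[1:]):
        table[w].append(nxt)
    return words, table

def spoof(text, length=50, seed=None):
    """Emit `length` words by walking the bigram table of `text` at random."""
    rng = random.Random(seed)
    words, table = build_bigrams(text)
    current = rng.choice(words)
    output = [current]
    for _ in range(length - 1):
        current = rng.choice(table[current]) if table[current] else rng.choice(words)
        output.append(current)
    return " ".join(output)
```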

Two fundamental techniques have suggested themselves for automatic generation of
spoofs of natural language text documents.

        1.     One technique for modifying a document's meaning while maintaining its
“style” structure is through manipulation of the target document's semantic structure.
This is intuitively the most straightforward approach to changing the meaning of a
document while maintaining the same “style.” An example of this sort of technique
would consist of negating and un-negating some particular subset of assertions in the
subject document.

        2.      Another technique for automatically changing a document's meaning is
through manipulations based on syntactic structure. A technique of this sort might
consist of simply swapping two successive noun phrases (which may appear in the same,
or in different, sentences). This technique depends heavily on pareidolia, which is the
psychological phenomenon of finding meaning in random and presumably ambiguous
patterns [pareidolia entry on Wikipedia, 2007].

On occasion, spoofing channels have been observed “in the wild.” Examples include the
classic spoof created by hand by Alan Sokal [Sokal], and SCIgen, which automatically
generates random computer science research papers [SCIgen].




Interestingly, the spoofing channel deception is identical to its symmetric complement.
The fundamental reason for this is that the deception target is inherently unable to
distinguish between valid information and a properly constructed spoof: the point of
using a communication channel is to identify, or distinguish, valid information from
among all the possibilities. Stated in different terms, a properly constructed spoof makes
the observations of a deception target inherently useless.

At this point, the spoofing channel appears to require handling as a “special case” for
analysis. Analysis of the spoofing channel is a part of our ongoing research.


Conclusion

The material in this paper does not constitute a mathematical theory of deception, but it is
just about as good: we have shown that the existing theory of communication can be
used, almost “as is,” to describe deception. Based on a simple and natural definition of
deception, our modeling technique allows the effectiveness of a deception to be evaluated
in bits, and thus allows insights and computations that would otherwise not be possible.
Because information theory has traditionally described the transfer of valid pictures of
reality from one location to another, it is reasonable that the transfer of invalid pictures of
reality can be described by the same means.

Also, we have illustrated a form of duality between deception and communication. In a
conventional communication system, the mutual information between the random
variable at the information source and the information destination is of great interest: the
goal of a communication system designer is to make the mutual information between
these two random variables as large as possible. A successful deception, on the other
hand, reduces the mutual information between reality and a target's perception of reality
to the lowest value possible.

This material is exciting not because of the specifics presented here, but rather because of
the many open questions that remain. A few outstanding topics include the relationship
of rate distortion theory [Cover and Thomas, 2006] to deception; analysis of deception as
a game [Garg and Grosu, 2007] with payoffs quantified by mutual information; models of
deception using continuous state variables; and the influence of deception on the stability
of signaling systems [Searcy and Nowicki, 2005].


References

Car Talk. http://www.cartalk.com/content/features/hell/01.05.html, retrieved 14 August 2007.

Cover, Thomas M., and Joy A. Thomas. 2006. Elements of Information Theory, 2nd Ed.
       Hoboken, NJ: John Wiley and Sons.

Godson, Roy, and James J. Wirtz, Eds. 2002. Strategic Denial and Deception: The
      Twenty-First Century Challenge. New Brunswick, NJ: Transaction Publishers.




The Honeynet Project. 2004. Know Your Enemy: Learning About Security Threats, 2nd
      Ed. New York: Addison-Wesley.

Garg, Nandan, and Daniel Grosu. 2007. Deception in Honeynets: A Game-Theoretic
       Analysis. Proceedings of the 2007 IEEE Workshop on Information Assurance,
       United States Military Academy, West Point, NY.

Mitnick, Kevin D. and William L. Simon. 2002. The Art of Deception: Controlling the
       Human Element of Security. Indianapolis, Indiana: Wiley Publishing.

Pareidolia. Wikipedia. http://en.wikipedia.org/wiki/Pareidolia

Rowe, Neil C., Binh T. Duong, and E. John Custy. 2006. “Fake Honeypots: A Defensive
      Tactic for Cyberspace.” Proceedings of the 7th IEEE Workshop on Information
      Assurance, U.S. Military Academy, West Point, New York.

Rowe, Neil C., Han C. Goh, Sze L. Lim, and Binh T. Duong. “Experiments with a
       Testbed for Automated Defensive Deception Planning for Cyber-Attacks.”

SCIgen. http://pdos.csail.mit.edu/scigen/

Searcy, William A., and Stephen Nowicki. 2005. The Evolution of Animal
       Communication: Reliability and Deception in Signaling Systems. Princeton, NJ:
       Princeton University Press.

Shannon, Claude E., and Warren Weaver. 1949. The Mathematical Theory of
       Communication. Urbana, IL: University of Illinois Press.

SNORT. http://www.snort.org

Sokal, Alan. http://www.physics.nyu.edu/faculty/sokal/

Whaley, Barton. 2002. Conditions Making for Success and Failure of Denial and Deception:
      Authoritarian and Transitional Regimes. Printed as Chapter 3 of [Godson and
      Wirtz 2002].

Whaley, Barton. 2007. Stratagem: Deception and Surprise in War. Boston: Artech
      House.