BA 386T STATISTICS COURSE NOTES
Part 8
Decision Analysis
XI. STATISTICAL DECISION THEORY (DISCRETE CASE)
Thinking about decision-making with risk analysis tools can sharpen decision-making
skills, even if the ultimate decision incorporates other, nonquantitative criteria.
A. Four Elements: [Ref: Albright 7.2] 1
Actions – the choices available to the manager
States of nature – the uncertain conditions outside the manager’s control that, together with
the manager’s action, determine the payoff
Payoffs – the numerical rewards resulting from the combination of the manager’s choice and
the state of nature
Probabilities – the likelihoods of each state of nature
The four elements can be presented equivalently in the form of a decision table or decision tree
[Ref: Albright 7.2.3].
Examples:
1. Insurance
A 23-year-old male is considering purchase of a $100,000 term life insurance policy for a
premium of $200 per year. Actions: buy, or not buy. States of nature: Live, or die.
Probabilities: Live (0.99842), die (.00158). Payoffs are shown in two forms below:
Decision Table
                 States of Nature
Actions   Live              Die
          (Prob=0.99842)    (Prob=0.00158)
Buy       -$200             $99,800
No buy    0                 0
Decision Tree
Live 99.842% 0%
$0 -$200
Yes FALSE Live or die?
-$200 -$42
Die 0.158% 0%
$100,000 $99,800
Insurance decision Buy?
$0
No TRUE 100%
$0 $0
(The decision tree was produced by the Precision Tree software tool in the Palisade Decision
Tools package supplied with the Albright text CD-ROM. See further notes on using and
interpreting this tool below and the extensive notes in Albright, Chapter 7, beginning in
Section 7.3.)
1
Color coding is used with few exceptions as follows: red for important terms, blue for important formulas, magenta
for important facts presented without proof, bright green for references to the text.
2. Banking
A customer has applied for a $10,000 loan from a Mexican bank. The default rate for members
of the general public is 5%. The loan interest rate would be 10%. The bank can earn a risk-free
rate of 4% in government securities. The bank manager must decide what to do about the loan
application. Actions: Make loan, or do not make loan. States of nature: Default, or no default.
Probabilities: Default (5%), no default (95%). Payoffs shown in decision table and tree below:
States of Nature
Actions Default (Prob=0.05) No default (Prob=0.95)
Make loan -$10,000 $1,000
No loan $400 $400
Default 5% 5%
-$10,000 -$10,000
Lend TRUE Default?
$0 $450
No default 95% 95%
$1,000 $1,000
Loan decision Lend?
$450
Do not lend FALSE 0%
$400 $400
Constructing Decision Trees [Ref: Albright 7.3]
You can construct decision trees by hand, or you can do it in Excel with PrecisionTree.
(PrecisionTree is one of the tools in the Palisade Decision Tools software package that came on
the CD-ROM with the Albright text.) The next paragraph discusses the making of decision trees
in general.
A decision tree is constructed and read from left to right. Think of the events represented
in a decision tree as happening in sequence from left to right. A managerial decision point is
represented by a green square. There, the manager chooses an action. A ruddy circle represents
a probabilistic branch point where nature chooses a state. Each branch point, whether for nature
or the manager, is called a node. Costs (negative numbers) and gains (positive numbers) are
written on the branches at the appropriate nodes. The probabilities of states of nature are
recorded above nature’s branches. At the extreme right are the terminal nodes, called leaves,
where the payoffs for each path are shown along with the probability of ending up there if the
optimal path is followed. The expected gain or loss from each possible managerial choice is
evaluated and the best of the choices is labeled TRUE. To find the optimal path (according to
the EMV principle – see next section), follow the TRUE choices from left to right.
Ex: In the loan example, the first important event is the banker’s decision whether to make the
loan, represented by the green square. If he does not lend, then there are no further events, and
the bank makes 4% x $10,000 = $400 by putting the principal in a government security. The
$400 gain is written below the “Do not lend” branch. If the banker does lend, then the next event
is whether the applicant will default, represented by the ruddy circle. The probability of default
is 0.05, and its cost is the loss of the principal (-$10,000). The probability and the cost are
written on the “Default” branch. The probability of no default is 0.95, and its gain is the interest
(10% x $10,000 = $1,000). The probability and the gain are written on the “No default” branch.
If you use PrecisionTree, these are the only figures that are required to be written on the tree;
PrecisionTree will calculate the other figures: all of the leaf values, the TRUE and FALSE
evaluations, and the expected gain or loss from each managerial choice. The expected gain for
making the loan is $450; for not making the loan the expected gain is $400. Thus, the optimal
action according to the EMV principle (see next section) is to make the loan, as indicated by the
TRUE marker on the “Lend” branch.
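The figures that the tree software fills in can be reproduced by "rolling back" the tree from the leaves toward the root. The following is a minimal sketch of that idea for the loan tree, not PrecisionTree's own method or file format; the dictionary layout and the function name rollback are assumptions made only for illustration.

```python
# Minimal rollback of the loan decision tree.
# The node layout below is a hypothetical illustration, not PrecisionTree's format.

def rollback(node):
    """Return (expected value, label of the chosen branch or None) for a node."""
    kind = node["kind"]
    if kind == "leaf":
        return node["payoff"], None
    if kind == "chance":
        # Chance node: probability-weighted average of the branch values.
        value = sum(p * rollback(child)[0] for _, p, child in node["branches"])
        return value, None
    if kind == "decision":
        # Decision node: keep the branch with the largest expected value (the TRUE branch).
        best_label, best_value = None, float("-inf")
        for label, child in node["branches"]:
            value = rollback(child)[0]
            if value > best_value:
                best_label, best_value = label, value
        return best_value, best_label
    raise ValueError(f"unknown node kind: {kind}")

loan_tree = {
    "kind": "decision",
    "branches": [
        ("Lend", {
            "kind": "chance",
            "branches": [
                ("Default", 0.05, {"kind": "leaf", "payoff": -10_000}),
                ("No default", 0.95, {"kind": "leaf", "payoff": 1_000}),
            ],
        }),
        ("Do not lend", {"kind": "leaf", "payoff": 400}),
    ],
}

value, choice = rollback(loan_tree)
print(choice, round(value, 2))
```

With the loan figures from the example, the sketch selects the "Lend" branch with an expected gain of about $450, which agrees with the values shown on the tree.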
Making decision trees with PrecisionTree. You can construct decision trees in Excel by
using the Precision Tree tool in the Palisade Decision Tools software package that came installed
on your laptop. This section of the Course Notes will provide only a few remarks about using
this tool. The best way to learn Precision Tree is to make a few decision trees with it. Beginning
in section 7.3, your Albright text has extensive, step-by-step examples showing how to make
trees. If you want to use PrecisionTree, I recommend that you try those examples, following the
instructions in the text, and comparing your results to the Excel output shown there.
Precision Tree is an Excel add-in, but not in quite the same way that StatPro is. Precision
Tree is a separate program that works together with Excel. To run Precision Tree, click your
Start button > Programs > Palisade Decision Tools > Precision Tree 1.0 for Excel. This will
launch Precision Tree. Precision Tree will add a couple of colorful menu bars to the bars already
in Excel. If Excel is not already running when you launch Precision Tree, Precision Tree will
launch Excel. Precision Tree adds to Excel’s capabilities and takes nothing away. You will still
have StatPro and all Excel functionality available while Precision Tree is running. Precision
Tree does not permanently attach itself to Excel: When you exit Excel, the attachment is broken,
and the next time you launch Excel, the Precision Tree menu bars will not be there – but you can
re-establish them by launching Precision Tree from the Start button, as before.2
B. EMV Principle [Ref: Albright 7.2.2]
The EMV principle says the manager should choose the action with the greatest Expected
Monetary Value. Since return is identified with the expected value, EMV says to choose the
action with the greatest (expected) return. Risk is identified with the standard deviation, which
does not enter the EMV calculation, so EMV ignores the role of risk in managerial
decision-making. Such a principle makes sense for decisions that can be repeated a large number
of times. But in ignoring risk, EMV may not apply to one-of-a-kind decisions.3
Calculating EMV
If you have constructed a Payoff Table or Decision Tree for your decision problem, it is
easy to calculate EMV for each action. Payoff is a random variable with outcomes that depend
on the states of nature. The EMV for a given action is the expected value of the Payoff random
variable when the action is chosen.
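As a rough sketch of this calculation, the EMV of each action is the sum of payoff times probability across the states of nature. The function names emv, emv_table and best_action are hypothetical helpers; the payoffs and probabilities are those of the insurance and loan examples.

```python
def emv(payoffs, probs):
    """Expected monetary value: sum of payoff * probability over the states of nature."""
    return sum(x * p for x, p in zip(payoffs, probs))

def emv_table(table, probs):
    """EMV of every action in a payoff table {action: [payoff per state]}."""
    return {action: emv(payoffs, probs) for action, payoffs in table.items()}

def best_action(table, probs):
    """Action with the greatest EMV."""
    values = emv_table(table, probs)
    return max(values, key=values.get)

# Insurance example (buyer's perspective); states are (live, die).
insurance = {"Buy": [-200, 99_800], "No buy": [0, 0]}
insurance_probs = [0.99842, 0.00158]

# Loan example; states are (default, no default).
loan = {"Lend": [-10_000, 1_000], "Do not lend": [400, 400]}
loan_probs = [0.05, 0.95]

for name, table, probs in [("Insurance", insurance, insurance_probs),
                           ("Loan", loan, loan_probs)]:
    rounded = {a: round(v, 2) for a, v in emv_table(table, probs).items()}
    print(name, rounded, "->", best_action(table, probs))
```

The same helper works for any payoff table with a finite list of states, provided the probabilities are listed in the same order as the payoffs.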
Examples:
2
If you ever have major problems with PrecisionTree or any of the other Decision Tools suite of software, you can
re-install the Suite by following the directions at www.bus.utexas.edu/cbacc/coe/dtools.htm , or see the SWAT team.
I suggest checking with the SWAT team before attempting such a major software solution.
3
EMV also applies to companies that confront a large number of one-of-a-kind decisions, provided each decision
puts a relatively small part of the company’s fortunes at risk.
1. Insurance (continuation)
                 States of Nature
Actions   Live              Die               EMV
          (Prob=0.99842)    (Prob=0.00158)
Buy       -$200             $99,800           -200*.99842 + 99800*.00158 = -42
No buy    0                 0                 0*.99842 + 0*.00158 = 0
The EMV of the “Buy” action is -$42; the EMV of the “No buy” action is $0. Since EMV(No
buy) > EMV(Buy), then “No buy” would be the EMV decision. Under EMV, no one would buy
insurance as long as insurance companies insist on making a profit. Since buying insurance is
rational, this example suggests that in ignoring risk, EMV may not always be a reliable principle
for decision-making. But these comments are from the buyer’s perspective. For the insurer, the
table and tree are the same, but the signs of the payoffs are reversed: positive becomes negative,
and negative becomes positive. The insurer’s EMV is +$42, and the optimal action is to sell the
policy. From the insurer’s perspective, EMV is a reliable guide to action in this case, as long as
there are sufficiently many policyholders to spread the risk.
The Decision Tree shows that the “Yes” branch is inferior for the buyer. So the “Yes” branch
would be “pruned”.
Live 99.842% 0%
$0 -$200
Yes FALSE Live or die?
-$200 -$42
Die 0.158% 0%
$100,000 $99,800
Insurance decision Buy?
$0
No TRUE 100%
$0 $0
2. Mexican Bank loan decision (continuation)
States of Nature
Actions Default (Prob=0.05) No default (Prob=0.95) EMV
Lend -$10,000 $1,000 -10000*.05+.95*1000=450
Do not lend $400 $400 400*.05+400*.95=400
The EMV of the “Lend” action is $450; the EMV of the “Do not lend” action is $400. Since
EMV(Lend) > EMV(Do not lend), then “Lend” would be the EMV decision.
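Since risk is identified with the standard deviation, it can be instructive to compute it alongside the EMV for each action in the loan example. The sketch below is only an illustration of what the EMV comparison leaves out; the helper name mean_and_sd is an assumption.

```python
import math

def mean_and_sd(payoffs, probs):
    """Expected value and standard deviation of a discrete payoff distribution."""
    mean = sum(x * p for x, p in zip(payoffs, probs))
    variance = sum(p * (x - mean) ** 2 for x, p in zip(payoffs, probs))
    return mean, math.sqrt(variance)

probs = [0.05, 0.95]  # (default, no default)
lend_mean, lend_sd = mean_and_sd([-10_000, 1_000], probs)
hold_mean, hold_sd = mean_and_sd([400, 400], probs)

print(f"Lend:        EMV = {lend_mean:.2f}, SD = {lend_sd:.2f}")
print(f"Do not lend: EMV = {hold_mean:.2f}, SD = {hold_sd:.2f}")
```

"Lend" has the higher EMV but a standard deviation in the thousands of dollars, while "Do not lend" has essentially no spread. EMV ranks the two actions on the first number only.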
The Decision Tree shows that the “Do not lend” branch would be pruned.
Default 5% 5%
-$10,000 -$10,000
Lend TRUE Default?
$0 $450
No default 95% 95%
$1,000 $1,000
Loan decision Lend?
$450
Do not lend FALSE 0%
$400 $400
C. EVPI [Ref: Albright 7.5.2]
EVPI means the Expected Value of Perfect Information. EVPI is intended to represent
the value added to decision-making by knowing in advance what the outcome will be. Since no
advice could be better than knowing the result in advance, EVPI therefore provides an upper
bound for the value of expert opinion. You should never pay more for advice4 than the amount by
which knowing the outcome in advance would improve your expected profit.
EVPI can be calculated as the difference between two EMVs:
EVPI = EMV (best decision with perfect information) – regular EMV
EVPI can also be obtained by reconstructing the decision tree, so as to interchange the
locations of the action nodes (green squares) with their corresponding chance nodes (ruddy
circles) – moving the chance nodes left and the action nodes right! Ordinarily, the manager first
makes a decision and then sees nature’s response (ex: banker decides to lend or not, and then the
applicant defaults or not). By reversing this sequence, the banker gets to see whether the
applicant would default or not before deciding whether to lend or not. Thus, in reverse sequence,
the banker gets to make the best possible decision, acting under perfect information.
One application of EVPI would be to set a limit on what to pay a consultant. You should
not pay a consultant more than the value to you of his advice. EVPI sets an upper limit on the
value of a consultant’s advice. Note that EVPI is not what you should pay a consultant, but a
limit on what to pay. In reality, no consultant has perfect information, so you should pay less
than EVPI.
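The difference-of-EMVs formula can be sketched in a few lines: with perfect information the manager picks the best payoff in each state, and the result is weighted by the state probabilities. The function name evpi is a hypothetical helper; the payoff tables are those of the loan example and of the insurance example from the insurer's perspective.

```python
def evpi(table, probs):
    """EVPI = EMV(best decision with perfect information) - regular EMV.

    table maps each action to its list of payoffs, one per state of nature.
    """
    rows = list(table.values())
    # Regular EMV: choose one action before the state is known.
    regular = max(sum(x * p for x, p in zip(row, probs)) for row in rows)
    # Perfect information: choose the best action separately in each state.
    perfect = sum(p * max(row[s] for row in rows) for s, p in enumerate(probs))
    return perfect - regular

loan = {"Lend": [-10_000, 1_000], "Do not lend": [400, 400]}        # (default, no default)
insurer = {"Insure": [200, -99_800], "Do not insure": [0, 0]}       # (live, die)

print(round(evpi(loan, [0.05, 0.95]), 2))
print(round(evpi(insurer, [0.99842, 0.00158]), 3))
```

Reversing the order of the chance and action nodes in the tree performs the same calculation: the inner max is the decision made after the state is revealed.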
Examples:
1. Insurance (continuation)
(I will now switch perspective in the insurance example from the insurance buyer to the
perspective of the company, since the company makes repeated insure/no insure decisions with
potential customers, whereas the decision is singular for each customer.) If the company knew in
advance that a customer would die, the optimal action would be to deny coverage. If the
company knew that the customer would live, the optimal action would be to provide coverage.
Consider 23-year old male customers. 99.842% of them will live, and 0.158% will die. Suppose
the company knew for sure what the fate of each would be. Then 99.842% of the time the
insurer would provide coverage and make $200 in premiums per man. For the 0.158% of 23-year
old male customers who will die, the insurer will deny coverage and make $0 (but avoid
$100,000 loss) per man. The insurer’s EMV (for best decision with perfect information) =
200*0.99842 + 0*0.00158 = $199.684. The improvement in EMV that results from perfect
knowledge is 199.684 – 42 = $157.684. For coverage decisions made repeatedly, it would not be
rational to pay more than $157.684 per customer for even perfect knowledge of whether the
customer will die. This includes expert medical opinion and medical tests, etc. Since most
opinion and tests are far from perfect, the practical limit on the value of expert opinion and
diagnostic tests would be far less than the $157.684 figure.
4
“Advice” here means not only individual or group opinion, but also information that you might get from sample
data. EVPI establishes a limit to the value of any kind of assistance you might receive in making a decision. Too
often, managers spend more on advice than the advice adds value to the decision.
The decision tree below shows how you can calculate this figure by reversing the
locations of action and chance nodes. The chance nodes are now first. The insurer gets to make
the decision after seeing whether the customer will live or die. Follow the “TRUE” markers to
see that the optimal action is to sell the policy if the customer lives, and not to sell the policy if
the customer dies. The EMV for this tree overall is $199.68. So the increase in EMV from
being allowed to choose an action after nature chooses a state is $199.68 - $42 = $157.68, which is
the EVPI.
Insure TRUE 99.842%
$200 $200
Live 99.842% Insure?
$0 $200
Do not insure FALSE 0.000%
$0 $0
Insurance decision (EMV for EVPI) Live or die?
$199.68
Insure FALSE 0.000%
-$100,000 -$100,000
Die 0.158% Insure?
$0 $0
Do not insure TRUE 0.158%
$0 $0
2. Mexican Bank loan decision (continuation)
If you knew in advance that the applicant would default, your optimal action would be to
deny the loan. If you knew in advance that the applicant would not default, your optimal action
would be to grant the loan. How much can you expect to make if you always knew in advance
whether the applicant would default? 5% of the applicants will default, so 5% of the time you
will deny the loan and take the $400 interest from government securities. 95% of the applicants
will not default, so 95% of the time you will grant the loan and make $1000 in loan interest.
Your EMV(best decision with perfect information) = 400*.05 + 1000*.95 = $970. The
improvement in EMV that results from perfect knowledge is 970 – 450 = $520. For loan
decisions made repeatedly, it would not be rational to pay more than $520 per applicant for even
perfect knowledge of whether the applicant will default. Since most advice – even statistical
data! – is far from perfect, $520 is the absolute maximum that should be paid for help in making
this decision.
The decision tree below shows how you can calculate this figure by reversing the
locations of action and chance nodes. The chance nodes are now first. The banker gets to make
the decision after seeing whether the applicant will default or not. Follow the “TRUE” markers
to see that the optimal action is to make the loan if the applicant does not default, and not to
make the loan if the applicant defaults. The EMV for this tree overall is $970. So the increase in
EMV from being allowed to choose an action after nature chooses a state is $970 - $450 = $520,
which is the EVPI.
Lend FALSE 0%
-$10,000 -$10,000
Default 5% Lend?
$0 $400
Do not lend TRUE 5%
$400 $400
Loan decision (EMV for EVPI) Default?
$970
Lend TRUE 95%
$1,000 $1,000
No default 95% Lend?
$0 $1,000
Do not lend FALSE 0%
$400 $400
D. EVSI [Ref: Albright 7.5.1]
EVSI means the Expected Value of Sample Information. The idea of EVSI is similar to
the idea of EVPI. But whereas EVPI measures the gain in expected profit resulting from
additional, perfect knowledge, EVSI measures the gain in expected profit resulting from
additional, imperfect knowledge – knowledge resulting from sampling. As with EVPI, EVSI
establishes an upper limit on the value of imperfect sample data. It is not rational to pay more
than EVSI to acquire the sample information (for repeated decisions).
Examples:
1. Insurance (continuation)
The insurance company may require a health history and/or a medical examination before
extending coverage. The history/exam provides sample data that improves the insurer’s estimate
of whether the customer will die during the coverage period. The insurer should not pay more
than EVSI to examine the medical history, or perform the medical tests of the customer’s health. 5
2. Mexican Bank loan decision (continuation)
The bank collects data from a loan application and may verify employment, salary, and other
indicia of ability and willingness to repay the loan. The sample data from the applicant improves
the bank’s estimate of whether the applicant will meet his loan obligations. The bank should not
pay more than EVSI to process and evaluate the applicant’s loan application data.
Calculating EVSI
EVSI can be calculated as the difference between two EMVs:
EVSI = EMV(best decision with free sample data) – regular EMV
But this definition does not reveal how to evaluate EMV (best decision with free sample data).
The key to understanding how to do this is to realize that the sample data will change the
probabilities of the states of nature.
Examples:
1. Insurance (continuation)
A medical finding that a customer has a heart problem increases the chance of death. This
reduces the expected payoff from insuring such a customer.
2. Mexican Bank loan decision (continuation)
The loan application reveals that the applicant is over 30 years of age. This reduces the
probability of default, which increases the expected payoff from lending to such an applicant.
5
Often, the insurer requires the customer to pay for a medical exam. However, this reduces the amount of premium
that the insurer could otherwise charge, because a rational customer looks to the total cost of coverage in choosing an
insurer and in deciding whether to obtain coverage at all.
The resulting decision table and decision tree depend on the outcome of the sample data. I will
illustrate how to incorporate sample data into the decision process in the two examples:
Examples:
1. Insurance (continuation)
Suppose that a $20 medical test (paid for by the insurance company) can detect whether or not
the customer has a heart problem. The issue for the insurer is whether to require this test of
customers applying for insurance. Suppose that it is known that the incidence of heart problems
among 23-year old men who live is 0.02 [i.e., P(heart problem | live) = .02] – this would be
determined by consulting published medical studies of living men. Suppose also that the
incidence of heart problems among 23-year old men who die is 0.15 [i.e., P(heart problem | die) =
.15] – this would be determined by consulting published medical studies or autopsies of deceased
men. From these facts and the already known probability of death for 23-year old men [P(die) =
0.00158], we can adjust our estimates of the probability of death for the two possible outcomes
of the test: Either the customer has a heart problem, or he does not. We calculate P(die | heart
problem) = 0.01173, and P(die | no heart problem) = 0.001371. These probabilities can be
calculated from Bayes Theorem [Ref: Albright 7.6], or from elementary reasoning (cf.
Mexican bank analysis), as in the following table. In this table, we suppose 10,000,000
hypothetical 23-year old men in order to guarantee the computations will yield all whole
numbers, and fill in the cells using the given information above. For example,
0.00158*10,000,000 = 15,800 men die. Of these, 15% had heart problems: 0.15*15,800 = 2,370.
Etc.
                              STATES OF NATURE
TEST RESULTS        Live         Die       TOTAL         P(die)
Heart problem       199,684      2,370     202,054       0.01173
No heart problem    9,784,516    13,430    9,797,946     0.00137
TOTAL               9,984,200    15,800    10,000,000
From the table, you can calculate P(die | heart problem) = 2,370 / 202,054 = 0.01173, and P(die |
no heart problem) = 0.001371.
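The table of hypothetical men can be rebuilt step by step, which may help in checking the cell counts. This is a sketch of the elementary-reasoning route only; the variable names are assumptions, and the inputs are the three probabilities given in the text.

```python
# Rebuild the 10,000,000-man table from the three given probabilities.
population = 10_000_000
p_die = 0.00158
p_problem_given_live = 0.02
p_problem_given_die = 0.15

# Column totals (round() only removes floating-point noise; the counts are whole numbers).
die = round(population * p_die)
live = population - die

# Split each column by test result.
problem_die = round(die * p_problem_given_die)
problem_live = round(live * p_problem_given_live)
no_problem_die = die - problem_die
no_problem_live = live - problem_live

# Read the revised death probabilities across each row.
p_die_given_problem = problem_die / (problem_die + problem_live)
p_die_given_no_problem = no_problem_die / (no_problem_die + no_problem_live)

print(problem_live, problem_die, problem_live + problem_die)
print(no_problem_live, no_problem_die, no_problem_live + no_problem_die)
print(round(p_die_given_problem, 5), round(p_die_given_no_problem, 6))
```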
Thus, if the medical test signals a heart problem, the chance of death is increased from
.00158 to .01173; and if the test signals no heart problem, the chance of death is decreased from
.00158 to .001371. Now, the insurance company in effect faces two decision tables – one if the
test signals a problem, and another if the test signals no problem. The payoff tables below
exclude the cost of the test, since EVSI is concerned with the value of the sample data before the
cost is taken into account:
Decision table if test signals heart problem:
                         STATES OF NATURE
ACTIONS          Live        Die            EMV
PROBABILITY=     0.98827     0.01173
Insure           $200.00     -$99,800.00    -$972.95
No insure        $0.00       $0.00          $0.00
Decision table if test signals no heart problem:
                         STATES OF NATURE
ACTIONS          Live        Die            EMV
PROBABILITY=     0.998629    0.00137
Insure           $200.00     -$99,800.00    $62.93
No insure        $0.00       $0.00          $0.00
If the medical test signals a heart problem, then the appropriate action is “Do not insure”,
and the company loses no money. If the medical test signals no heart problem, the appropriate
action is “Insure”, and the company can expect to make $62.93 before payment of the $20 cost
for the test. Now, 2.02054% of 23-year old men have a heart problem (see above table: 202,054 /
10,000,000). So the company will gain $0 on 2.02054% of its customers and gain $62.93 on
97.97946% of its customers. The final EMV for free sample data is therefore 0*0.0202054 +
62.93*.9797946 = $61.66. The EMV without sample data was $42. So
EVSI = EMV(best decision with free sample data) – regular EMV =
$61.66 - $42 = $19.66
But this is less than the cost of the $20 medical test! By testing, the company would be worse off
by $0.34 per customer, on average, than by not testing. Therefore, it is not to the advantage of
the insurance company to pay for the medical test. The sample data add less value than their
cost.
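The whole chain of reasoning for the medical test can be sketched in one short script: revise the death probability for each test result, take the better action in each case, weight by how often each result occurs, and compare with the EMV without the test. Variable names are assumptions, and unrounded probabilities are carried through, so the last digits may differ slightly from hand calculations with rounded figures.

```python
# Inputs from the insurance example (insurer's perspective).
p_die = 0.00158
p_problem_given_live = 0.02
p_problem_given_die = 0.15
payoff_live, payoff_die = 200, -99_800   # payoffs if the insurer sells the policy
test_cost = 20

# Probability of each test result, and revised death probabilities (Bayes).
p_live = 1 - p_die
p_problem = p_problem_given_live * p_live + p_problem_given_die * p_die
p_die_given_problem = p_problem_given_die * p_die / p_problem
p_die_given_ok = (1 - p_problem_given_die) * p_die / (1 - p_problem)

def emv_insure(p_death):
    """EMV of insuring when the death probability is p_death."""
    return payoff_live * (1 - p_death) + payoff_die * p_death

# "Do not insure" always pays 0, so the best action has EMV max(emv_insure, 0).
emv_no_test = max(emv_insure(p_die), 0.0)
emv_with_test = (p_problem * max(emv_insure(p_die_given_problem), 0.0)
                 + (1 - p_problem) * max(emv_insure(p_die_given_ok), 0.0))

evsi = emv_with_test - emv_no_test
net_gain = evsi - test_cost

print(f"EMV without test: {emv_no_test:.2f}")
print(f"EMV with free test: {emv_with_test:.2f}")
print(f"EVSI: {evsi:.2f}, net of the test cost: {net_gain:.2f}")
```

A negative net figure reproduces the conclusion above: the test is worth slightly less than it costs.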
Precision Tree to the rescue:
Precision Tree can construct a decision tree that incorporates sample information.
Additional nodes are inserted into the tree in order to allow for different sample outcomes. You
need to have the (conditional) probabilities that we calculated above for the possible sample
outcomes. Here is the decision tree for the insurance decision that includes the possibility of
testing for a heart condition and the possible outcomes from that test. The tree below excludes
the $20 cost of the test to make it consistent with the above calculations for EVSI. By following
the tree from left to right along the paths with nodes labeled “TRUE”, you can discover the
optimal managerial choices. The first managerial decision is whether or not to test. That choice
is resolved in favor of testing (assuming zero cost of sample data) because the testing path has
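higher EMV ($61.66 for testing versus $42 for not testing). The second and final managerial
decision is whether or not to insure. The optimal choice depends on the test result. If the test
signals no heart problem, the optimal choice is to insure, because insuring has EMV of $62.93 vs
EMV of $0 for not insuring. If the test signals a heart problem, the optimal choice is not to
insure, because insuring has EMV of -$972.95 vs EMV of $0 for not insuring. The path that
emerges from an affirmative decision to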
test has additional nodes compared with the path that emerges from a negative decision to test.
That is, the medical test has two sample outcomes: there either is or is not a heart problem,
whereas there are no sample outcomes if the decision is not to test because there is then no
medical test. To see the effect of including the cost of the test, go to the Excel spreadsheet with
PrecisionTree active and type –20 into the cell that now has a 0 in the “Test” branch. Then the
optimal decision switches to “Do not Test.”
Live 98.8270% 0.0000%
$0.00 $200.00
Insure FALSE Live?
$200.00 -$972.95
Die 1.1730% 0.0000%
-$100,000.00 -$99,800.00
Problem 2.0205% Insure?
$0.00
Do not insure TRUE 2.0205%
$0.00 $0.00
Test TRUE Heart problem?
$0.00 $61.66
Live 99.8629% 97.8452%
$0.00 $200.00
Insure TRUE Live?
$200.00 $62.93
Die 0.1371% 0.1343%
-$100,000.00 -$99,800.00
No problem 97.9795% Insure?
$62.93
Do not insure FALSE 0.0000%
$0.00 $0.00
Medical test tree Give test?
$61.66
Live 99.8420% 0.0000%
$0.00 $200.00
Insure TRUE Live?
$200.00 $42.00
Die 0.1580% 0.0000%
-$100,000.00 -$99,800.00
Do not test FALSE Insure?
0 $42.00
Do not insure FALSE 0.0000%
$0.00 $0.00
2. Mexican Bank loan decision (continuation)
Suppose that it costs $2 to process a customer’s loan application, from which the bank
learns whether or not the applicant is over 30 years of age, among other things.6 The issue for
the bank is whether to pay the $2 charge to process the loan application. Just how valuable is the
information that the bank collects from the application? How much value does it add to the loan
decision? To answer this, we need to adjust the probability of default for whether or not the
applicant is over 30. In the Mexican bank data,7 we learn that 8 of 50 defaulters are over 30 [i.e.,
P(over 30 | default) = 0.16]; and 45 of 50 non-defaulters are over 30 [i.e., P(over 30 | non-default)
= 0.90]. These are valid estimates of the proportion over-30 in the defaulting and non-defaulting
populations because the 50 sample defaulters were randomly selected from the population of
defaulters, and the 50 sample non-defaulters were randomly selected from the population of non-
defaulters. But the combined sample of 100 borrowers is not a random sample from the entire
population of borrowers – the whole sample has proportionately far too many defaulters: no bank
could survive if 50% of its customers defaulted. So the combined sample of 100 is not
representative of the population of borrowers. This reasoning also applies to the sample of those
over 30 years of age and the sample of those under 30. Those samples are also not representative
of the over-30 and under-30 populations, because each sample has proportionately too many
defaulters. The weights applied to the defaulters and non-defaulters must be changed from 50-50
to their true proportions before we can properly infer from these data. We don’t know exactly
what the true proportions are, but we will suppose for the sake of illustration that the overall
incidence of default among potential customers is 0.05. This assumption will allow us to weight
the default and non-default samples appropriately (5% and 95%, instead of 50% and 50%).
From these facts and assumptions, we can adjust the probability of default for the applicant’s
age: P(default | over 30) = 0.00927, and P(default | 30 or under) = 0.306569. These probabilities
can be calculated from Bayes Theorem [Ref: Albright 7.6], or from elementary reasoning (cf.
Mexican bank analysis), as in the following table. In this table, we suppose 10,000 hypothetical
loan applicants in order to guarantee that the calculations produce all whole numbers, and fill in
the cells using the given information above. For example, 0.05*10,000 = 500 applicants would
default. Of these, 16% are over 30: 0.16*500 = 80. Etc.
                      STATE OF NATURE
AGE            Default    No default    TOTAL     P(default)
Over 30        80         8,550         8,630     0.00927
30 or under    420        950           1,370     0.30657
TOTAL          500        9,500         10,000
From the table, you can calculate P(default | over 30) = 80 / 8630 = 0.00927, and P(default | 30 or
under) = 420 / 1370 = 0.306569.
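The same two probabilities follow directly from Bayes' Theorem, without building the table of hypothetical applicants. The sketch below assumes a two-state problem; the function name posterior and its parameter names are hypothetical.

```python
def posterior(prior, likelihood_if_true, likelihood_if_false):
    """P(state | evidence) for a two-state problem, by Bayes' Theorem."""
    numerator = likelihood_if_true * prior
    evidence = numerator + likelihood_if_false * (1 - prior)
    return numerator / evidence

p_default = 0.05                      # assumed overall incidence of default
p_over30_given_default = 8 / 50       # 0.16, from the sample of defaulters
p_over30_given_no_default = 45 / 50   # 0.90, from the sample of non-defaulters

p_default_given_over30 = posterior(
    p_default, p_over30_given_default, p_over30_given_no_default)
p_default_given_30_or_under = posterior(
    p_default, 1 - p_over30_given_default, 1 - p_over30_given_no_default)

print(round(p_default_given_over30, 5))
print(round(p_default_given_30_or_under, 6))
```

The numerator and the evidence term correspond to the "Default" cell and the row total of the table, each divided by 10,000.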
Thus, if the applicant is over 30, the chance of default is decreased, compared with the
overall default probability of 0.05. And if the applicant is 30 or under, the chance of default is
increased. Now, the bank in effect faces two decision tables – one if the customer is over 30, and
another if the customer is 30 or under. The payoff tables below exclude the $2 cost of processing
the loan application, since EVSI is concerned with the value of the sample data before the cost is
taken into account:
Decision table if application says over 30
                         STATES OF NATURE
ACTIONS          Default         No default    EMV
Probability =    0.00927         0.99073
Loan             -$10,000.00     $1,000.00     $898.03
No loan          $400.00         $400.00       $400.00
Decision table if application says 30 or under
                         STATES OF NATURE
ACTIONS          Default         No default    EMV
Probability =    0.30657         0.69343
Loan             -$10,000.00     $1,000.00     -$2,372.26
No loan          $400.00         $400.00       $400.00
6
A gentle reminder: It is illegal in the United States to use age as a factor in determining whether or not to make a
loan or extend credit.
7
See MexicanBank.xls
If the applicant is over 30, then the indicated action is “Loan”, and the bank increases its
EMV from $450 when not using a loan application to $898.03 before the $2 cost of the loan
application. If the customer is 30 or under, the indicated action is “No loan”, and the bank’s
EMV remains at $400 before the $2 cost. Now, 86.3% of the bank’s potential customers are over
30 (see above table: 8,630 / 10,000). So the bank will gain $898.03 on 86.3% of its customers
and gain $400 on 13.7% of its customers.8 The final EMV for free sample data is therefore
898.03*0.863 +400*0.137 = $829.80. The EMV without sample data was $450. So
EVSI = EMV(best decision with free sample data) – regular EMV =
$829.80 - $450 = $379.80
Since this considerably exceeds the bank’s cost for processing the loan application, it is
much in the bank’s interest to collect the information and pay $2 for the loan application
processing. The value of the sample data far exceeds its cost.
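For problems with more actions, states, or sample outcomes, the calculation can be wrapped in a general function. The following is a sketch under stated assumptions, not the method used by any particular software: the names emv and evsi are hypothetical, and likelihood[signal][i] is taken to mean P(signal | state i).

```python
def emv(payoffs, probs):
    """Expected monetary value of one action."""
    return sum(x * p for x, p in zip(payoffs, probs))

def evsi(payoff_table, prior, likelihood):
    """EVSI = EMV(best decision with free sample data) - regular EMV.

    payoff_table: {action: [payoff per state]}
    prior:        [P(state)]
    likelihood:   {signal: [P(signal | state)]}
    """
    rows = list(payoff_table.values())
    regular = max(emv(row, prior) for row in rows)
    with_sample = 0.0
    for signal_probs in likelihood.values():
        joint = [l * p for l, p in zip(signal_probs, prior)]
        p_signal = sum(joint)
        if p_signal == 0:
            continue  # this sample outcome never occurs
        revised = [j / p_signal for j in joint]   # Bayes-revised state probabilities
        with_sample += p_signal * max(emv(row, revised) for row in rows)
    return with_sample - regular

loan = {"Lend": [-10_000, 1_000], "Do not lend": [400, 400]}   # (default, no default)
prior = [0.05, 0.95]
likelihood = {"Over 30": [0.16, 0.90], "30 or under": [0.84, 0.10]}

bank_evsi = evsi(loan, prior, likelihood)
net_value = bank_evsi - 2   # subtract the $2 processing cost
print(round(bank_evsi, 2), round(net_value, 2))
```

With the loan figures, the function returns an EVSI of about $379.80, so the information remains worth collecting after the $2 cost is subtracted.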
Precision Tree to the rescue:
Precision Tree can construct a decision tree that incorporates sample information and
find the EVSI. Additional nodes are inserted into the tree in order to allow for different sample
outcomes. You need to have the (conditional) probabilities that we calculated above for the
possible sample outcomes. Here is the decision tree for the lending decision that includes the
possibility of processing a loan application and the possible outcomes from that application. By
following the tree from left to right along the paths with nodes labeled “TRUE”, you can
discover the optimal managerial choices. The first managerial decision is whether or not to take
a loan application. That choice is resolved in favor of taking the application because the
application path has higher EMV ($829.80 for the application vs $450 for not taking the
application – before the cost of the data). The second and final managerial decision is whether or
not to lend. The optimal choice here depends on the result of the loan application. If the
application shows the customer is over 30, the optimal decision is to lend because lending has
EMV of $898.03 vs EMV of $400 for not lending. If the customer is 30 or under, the optimal
decision is not to lend because not lending has EMV of $400 vs EMV of -$2,372.26 for lending.
Note that for any node where the decision is “Do not lend”, there is no need to add a branch to
deal with the default/no default outcomes because no loan is made. To see the effect of
including the cost of the application, go to the Excel spreadsheet with PrecisionTree active and
type –2 into the cell that now has a 0 in the “Application” branch. Then the optimal decision
remains “Application.”
8
This simplified discussion ignores some complicating dynamics. In the real world, if the bank decides to deny a
loan, the bank probably would not immediately put the funds into government securities, but would continue to
process loan applications until it found a more credit-worthy applicant – if there were additional customers in
waiting. The simplifying assumption here is that the bank has funds available for all potential customers, so the
funds go either to a borrower or to securities.
Default 0.9270% 0.80%
-$10,000.00 -$10,000.00
Lend TRUE Default?
$0.00 $898.03
No default 99.0730% 85.50%
$1,000.00 $1,000.00
Over 30 86.3000% Lend?
$0.00 $898.03
Do not lend FALSE 0.00%
$400.00 $400.00
Application TRUE Over 30?
$0.00 $829.80
Default 30.6569% 0.00%
-$10,000.00 -$10,000.00
Lend FALSE Default?
$0.00 -$2,372.26
No default 69.3431% 0.00%
$1,000.00 $1,000.00
30 or under 13.7000% Lend?
$0.00 $400.00
Do not lend TRUE 13.70%
$400.00 $400.00
Loan decision Loan application?
$829.80
Default 5.0000% 0.00%
-$10,000.00 -$10,000.00
Lend TRUE Default?
$0.00 $450.00
No default 95.0000% 0.00%
$1,000.00 $1,000.00
No application FALSE Lend?
0 $450.00
Do not lend FALSE 0.00%
$400.00 $400.00