

									Reconsidering the Experimental Evidence for
                    Quasi-Hyperbolic Discounting

                             Gregory Besharov & Bentley Coffey∗
                                   Department of Economics
                                          Duke University†

                                         February 10, 2003


             Experiments on intertemporal choice have found “preference reversals” and related
         anomalies. These robust findings have been considered a major source of support for the
         quasi-hyperbolic discounting model of consumption preference. Our analysis clarifies
         the relationship between the experimental results and the model. We prove that, like
         exponential discounters, quasi-hyperbolic discounters are best off when they maximize
         wealth by choosing the greater financial reward in experiments. When experimental
         rewards are not financial, the choices of even exponential agents cannot be theoretically
         restricted because of complementarities, across goods and time, and because of learning
         that occurs between decisions. Since generalizing preferences from exponential to quasi-
         hyperbolic is neither necessary nor sufficient to generate the experimental results, there
         is a fundamental identification problem.

       Key words: hyperbolic discounting, quasi-hyperbolic preferences
       JEL code: D91

       Thanks to B. Douglas Bernheim for encouraging pursuit of this idea and to Daniel Graham, Kerry
Smith, Curtis Taylor, the North Carolina Cognition Group, and especially Robert Hahn and Scott Hemphill
for helpful discussions. Comments may be sent to
      Mailing address: 305 Social Sciences Building; Durham, North Carolina 27708

1     Introduction
Question 1: Would you rather receive $50 today or $100 in 6 months?
Question 2: Would you rather receive $50 in one year or $100 in 1 year plus 6 months?

    When asked these two questions by Ainslie and Haendel (1983), experimental subjects
answered them differently. Many chose the smaller-sooner reward in the first question and
the larger-later reward in the second. Since the relative present value of the two should
not systematically be affected by the delay, the phenomenon has been termed a “preference
reversal.” Many experiments and experimenters have overwhelmingly verified the result that
subjects do not simply choose the option with the higher present value. The choices imply
that discount rates for rewards over longer time horizons are lower than those over shorter
time horizons (e.g., Thaler (1981)). Since the discount rates found experimentally take the
general shape of hyperbolas, the phenomenon has been referred to as hyperbolic discounting.
In their review, Frederick, Loewenstein and O’Donoghue state that hyperbolic discounting
is the “best-documented anomaly” (2002: 360) of intertemporal choice. Subjects simply do
not choose the largest rewards. Our analysis suggests, however, that the experiments do not
identify the theory of quasi-hyperbolic consumption preferences.
    The experimental results have been thought to bear on whether people have quasi-
hyperbolic discounting preferences. Such preferences for an agent at time t are represented
by the quasi-geometric function $U^t = u(c_t) + \beta \sum_{\tau = t+1} \delta^{\tau - t} u(c_\tau)$, where $c_\tau$ is the consumption
in time τ. They generalize standard preferences by the inclusion of the term β, between zero
and one, that is implicitly equal to one when discounting is exponential. When β ≠ 1, the
marginal rate of substitution between consumption at different dates is not constant. As a
result an agent is, to use O’Donoghue and Rabin’s (1999) term, “present-biased” relative
to exponentially-discounting agents. The experimental results are taken to be evidence of
present bias, and thus against exponential discounting. The problem is that, when rewards
are financial, quasi-hyperbolic agents should maximize wealth just as exponential agents do,
and, when rewards are not financial, the behavior of exponential agents is consistent with
the reported experimental results.
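The conventional reading of Questions 1 and 2 can be sketched numerically. The sketch below is only an illustration under loudly stated assumptions: each reward is consumed at the moment it is received (the very assumption questioned in this paper for financial rewards), period utility is linear, one period is six months, and the values of β and δ are hypothetical. Under those assumptions an agent with β < 1 reverses between the two questions while an exponential agent does not.

```python
def discounted_utility(amount, delay, beta, delta):
    """Value at the decision date of consuming `amount` after `delay` periods.

    Assumption: linear period utility and consumption at the moment of receipt.
    """
    if delay == 0:
        return float(amount)
    return beta * delta ** delay * amount


def picks_larger_later(sooner, later, beta, delta):
    """True when the larger-later (amount, delay) pair has the higher value."""
    return discounted_utility(*later, beta, delta) > discounted_utility(*sooner, beta, delta)


# Delays are counted in six-month periods; beta and delta are illustrative only.
for label, beta in [("exponential", 1.0), ("quasi-hyperbolic", 0.45)]:
    q1 = picks_larger_later((50, 0), (100, 1), beta, delta=0.95)
    q2 = picks_larger_later((50, 2), (100, 3), beta, delta=0.95)
    print(label, "Q1 larger-later:", q1, "| Q2 larger-later:", q2)
```

The reversal in this sketch comes entirely from equating receipt with consumption; it says nothing about what an agent who can borrow and lend would choose.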
   The finding of declining discount rates for financial rewards is neither necessary nor
sufficient to establish that experimental subjects discount consumption quasi-hyperbolically.
Contrary to statements from the literature, the impatience of a quasi-hyperbolic agent for
consumption does not imply a similar impatience for the receipt of financial rewards. Since a
quasi-hyperbolic agent’s preferences over consumption profiles change over time, the behavior
of such agents has been studied theoretically as a game played by “selves” at different
times. The quasi-hyperbolic agent’s consumption profile is an equilibrium outcome of the
strategic interaction of the selves. We prove that for any choice between financial rewards,
equilibria exist in which quasi-hyperbolic agents choose to maximize their wealth. The
intuition is that any consumption profile supported by a given level of wealth is dominated
by a profile supported by any higher level of wealth. As has been pointed out previously, it
becomes easier to constrain the present-biased agents as wealth increases (Bernheim et al.
1999). In equilibria selected by rules respecting the Pareto criterion, the larger reward is
always chosen. In the presence of liquidity constraints or uncertainty, quasi-hyperbolic agents
need not choose the reward with higher present value, but the same is true for exponential agents.
   If quasi-hyperbolic preferences are to be identified using reward-based experiments, then
it must be through the use of non-financial rewards, but experiments using non-financial
rewards also fail to provide definitive support. One confound is that non-financial rewards
must be specific goods and services, not an aggregated consumption good, and specific
consumption goods have complementarities with aspects of the environment that change
over time. Choices may differ across time not because preferences are quasi-hyperbolic but
because the prices and availability of complements and substitutes vary. Another confound
is that some rewards have an option value that makes them more valuable the further they
occur in the future. The increase in value may lead to the appearance of declining discount
rates even when consumption is discounted exponentially. Experiments that seek to evade
these problems by offering an individual exactly the same choices at two different points of
time have a different confound: learning may take place in the interim. Surprisingly,
the studies do not consider the possibility that a decision might be optimal ex ante but not
usually ex post.
       One way to understand the effect of these confounds is to consider the minimum
willingness-to-accept (WTA) of an exponentially-discounting agent for non-financial rewards.
The argument of the experimental literature is essentially that the finding of declining
implied discount rates for rewards suggests that agents discount quasi-hyperbolically rather
than exponentially. The experimental studies do not consider complementarity, learning,
option value, or other confounds that can affect the WTA, causing it not to decline
exponentially (or at all). Likewise ignored is that quasi-hyperbolic preferences do not explain
the common finding of non-positive discount rates.1 Moreover, experiments that ask directly
for choices over non-financial reward profiles over time find that people prefer sequences of
rewards that improve over time to those that become worse (Frederick et al. (2002)). The
results are in some sense opposite to the present-bias of hyperbolic consumers, and Loewenstein
(1987) uses the term “reverse time preference” to describe the phenomenon. These
problems (the confounds affecting WTA, the finding of non-positive discount rates, and
reverse time preference) both individually and jointly call into question the conclusion that
existing experiments identify the form of discounting.
       Past criticisms of experimental evidence for quasi-hyperbolic discounting have not treated
the fundamental identification problem we raise. The criticisms most related to ours have
stressed issues of arbitrage and wealth-maximization. Pender (1996) writes, “[i]f agents are
optimizing an intertemporal utility function, their opportunities for intertemporal arbitrage
are also important in determining how they respond to such experiments.” Intertemporal
arbitrage enters our discussion of reward choice because the ability to shift a reward across
time will affect the WTA, but does not speak to the issue of whether existing experiments
identify quasi-hyperbolic preferences. In an analysis more closely related to ours, Mulligan
(1996) poses the question of whether hyperbolic discounting can be detected by offering
fungible rewards. He notes that if hyperbolic agents were not wealth-maximizing, then
exponential discounters operating a “Dutch book” or money pump would bankrupt them.
He does not analyze, however, the choices of quasi-hyperbolic agents for rewards as we have,
nor does he discuss identification issues.

   1 Our review yielded 28 articles purporting to find experimental evidence of quasi-hyperbolic discounting.
Approximately one-third of the experiments in the articles reported subjects with non-positive discount
rates. Since all but one of the other articles did not report on the issue, this should be taken as a lower
bound of the frequency of an experiment finding such subjects. Furthermore, the design of some experiments
mechanically ensured that subjects would have positive discount rates.

   The previous failure to address identification issues appears to be the result of interpreting
choices for rewards as equivalent to choices for consumption. As Laibson (2002) has
noted, the assumption is often made that experimental rewards are consumed when they are
given. For financial rewards and others that can be shifted intertemporally, consumption
immediately after receipt is not an implication of quasi-hyperbolic consumption preferences
but is rather an assumption directly on behavior. Making the assumption can only mislead
when testing the model.
   Two recent papers present alternative explanations of the experimental data: Read’s
(2001) model of subadditivity positing that discounting is greater when subjects divide time
into more periods and Rubinstein’s (2003) model of “similarity relations.” These papers are
orthogonal to ours in that they do not discuss the identification problem at the heart of our
paper nor do we offer an alternative theoretical model. Because they address the relation of
their theoretical models to experimental results, we do not.
   Section 2 addresses experiments with financial rewards and develops theoretical results
regarding the wealth-maximizing behavior of quasi-hyperbolic agents. Section 3 reviews the
experiments with non-financial rewards and shows that they are consistent with exponential
discounting. Conclusions follow.

2     Choices for Financial Rewards
Do agents with hyperbolic consumption preferences choose rewards on the basis of present
financial value? The experimental results on preference reversals and declining discount
rates have been taken as evidence that they do not. We address the question theoretically
by allowing an agent the choice between two rewards. We find, contrary to discussions
in the literature, that quasi-hyperbolic agents can always become better off with a greater
level of wealth.
    As discussed in the introduction, the quasi-hyperbolic model of preferences considers
an agent as a series of selves at each time t extending to a time T that may be either
finite or infinite. The preferences of self t are $U^t = u(c_t) + \beta \sum_{\tau = t+1}^{T} \delta^{\tau - t} u(c_\tau)$, where $c_\tau$ is the
consumption at time τ, β ∈ (0, 1] and δ ∈ (0, 1). The function u(·) is strictly increasing,
strictly concave, and u(0) = 0. The choice of the initial self is between high wealth $k$ and low
wealth $k^o$. Since we assume that an agent can borrow or lend income at the same interest
rate $r_t$ that can vary across time, we can equivalently consider the agent as receiving an
exogenous series of monetary rewards.
    The quasi-hyperbolic agent’s usual consumption game follows the choice of reward. When
self t receives some level of capital $k_t$ and the next self t + 1 receives $k_{t+1}$, the consumption
of self t is implicitly $c_t = k_t - (1 + r_t)^{-1} k_{t+1}$. A history of the consumption subgame is
$h = (k_1, k_2, \ldots)$, a history at time t is $h_t = (k_1, k_2, \ldots, k_t)$, and a history from time τ to time t
is $h_{\tau,t} = (k_\tau, k_{\tau+1}, \ldots, k_{t-1}, k_t)$. We describe the difference between the value of two rewards
$k$ and $k^o$ as the “bonus” $b = k - k^o$.
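The accounting identity between a capital history and the consumption it implies can be checked with a short sketch. The function names, the finite horizon, the convention that the last self consumes all remaining capital, and the numbers are assumptions made for illustration only.

```python
def implied_consumption(capital, rates):
    """Consumption c_t = k_t - k_{t+1} / (1 + r_t) implied by a capital history.

    Assumption: the horizon is finite and the last self consumes all remaining capital.
    """
    path = [k - k_next / (1.0 + r) for k, k_next, r in zip(capital, capital[1:], rates)]
    path.append(capital[-1])
    return path


def present_value(consumption, rates):
    """Value of a consumption path discounted back to the first period."""
    total, factor = 0.0, 1.0
    for t, c in enumerate(consumption):
        total += c * factor
        if t < len(rates):
            factor /= 1.0 + rates[t]
    return total


capital = [100.0, 84.0, 44.1]   # hypothetical history (k_1, k_2, k_3)
rates = [0.05, 0.05]            # hypothetical interest rates r_1, r_2
consumption = implied_consumption(capital, rates)
print(consumption, present_value(consumption, rates))
```

The present value of the implied consumption path equals the initial capital, which is why a choice between financial rewards can be treated as a choice between wealth levels.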
    The question we seek to answer is whether quasi-hyperbolic agents select the reward that
maximizes wealth or whether they may instead choose rewards that are relatively
smaller and sooner. Our answer to the question begins with the result that an equilibrium
always exists in which the quasi-hyperbolic agent maximizes wealth. With a finite horizon,
the equilibrium is unique: the quasi-hyperbolic agent always chooses to maximize wealth.
(We relegate the proof for the finite horizon case to an appendix because it follows directly
from existing results.) Though there may be multiple equilibria in the infinite horizon,
selection criteria satisfying the Pareto criterion eliminate equilibria in which the lower reward
is chosen. Thus, we expect agents with quasi-hyperbolic consumption preferences to choose
to maximize wealth.
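The finite-horizon claim can be illustrated, though not proved, by solving a small example by backward induction. The sketch below is not the appendix argument. It assumes a sophisticated agent, a constant gross interest rate R, and the utility function u(c) = c^(1-σ)/(1-σ) with 0 < σ < 1 (so that u(0) = 0); all names and parameter values are hypothetical. Under these assumptions each self consumes a fixed share of capital, and the initial self's equilibrium utility rises with wealth.

```python
def equilibrium_shares(T, beta, delta, R, sigma):
    """Consumption shares of capital for selves 0..T-1, found by backward induction.

    Assumes u(c) = c**(1 - sigma) / (1 - sigma) with 0 < sigma < 1.
    Returns (shares, a), where a[t] * u(k) is the exponentially discounted value
    of the equilibrium consumption stream from self t onward given capital k.
    """
    shares, a = [0.0] * T, [0.0] * T
    shares[T - 1], a[T - 1] = 1.0, 1.0  # the last self consumes everything
    for t in range(T - 2, -1, -1):
        # First-order condition of self t: u'(c) = beta * delta * R * W'(R * (k - c))
        m = (beta * delta * a[t + 1] * R ** (1 - sigma)) ** (1 / sigma)
        lam = 1.0 / (1.0 + m)
        shares[t] = lam
        a[t] = lam ** (1 - sigma) + delta * a[t + 1] * (R * (1 - lam)) ** (1 - sigma)
    return shares, a


def initial_self_utility(k, T, beta, delta, R, sigma):
    """Equilibrium utility of the first self when starting with capital k."""
    shares, a = equilibrium_shares(T, beta, delta, R, sigma)
    u = lambda c: c ** (1 - sigma) / (1 - sigma)
    if T == 1:
        return u(k)
    lam = shares[0]
    return u(lam * k) + beta * delta * a[1] * u(R * (1 - lam) * k)


params = dict(T=5, beta=0.7, delta=0.95, R=1.05, sigma=0.5)  # illustrative values
low = initial_self_utility(100.0, **params)
high = initial_self_utility(150.0, **params)
print(low, high)
```

The closed-form shares depend on the assumed utility function; with other functional forms the backward induction would have to be carried out numerically.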
   The result that equilibria exist in which the higher level of wealth is chosen follows
directly from the fact that for any consumption profile with a level of wealth $k^o$, there is a
dominating consumption profile at any higher level of wealth. We prove the result by using
the original consumption equilibrium to construct a new consumption equilibrium. In the
constructed equilibrium, the selves must leave at least as much capital as in the original
equilibrium, but can otherwise consume what they wish. If any self does not, then he is
punished just as he would be punished for having overconsumed by the same amount in the
original consumption equilibrium.

THEOREM 1 For any choice of financial rewards, an equilibrium exists in which the quasi-
hyperbolic agent maximizes wealth.

Proof. After the choice of a reward with net present value $k^o$, there is a non-empty set of
consumption equilibria $S(k^o)$ as shown by Krusell and Smith (2002). Take any $s^o \in S(k^o)$.
We show that for any $k > k^o$ there is an equilibrium $s^*$ that dominates $s^o$.
   In the constructed equilibrium $s^*$, the amount of the bonus reaching self t is $b_t = k_t^* - k_t^o$.
Self t consumes some amount that results in leaving the next self (t + 1) at least as
much capital as in the original equilibrium; i.e., $k_{t+1}^* \ge k_{t+1}^o$. In other words, the self may
consume any remaining bonus $b_t = k_t^* - k_t^o$. If the self D is the first to leave too little,
then the successive selves follow the behavior specified in the original equilibrium $s^o$ when D
overconsumes by the same amount. It will be useful to refer to the future utility received by
self t for any choice he makes following a particular history when the successive selves have
strategies $\{s_\tau\}_{t+1}^{\infty}$. Define
\[ V_t(h_t, c_t, \{s_\tau\}_{t+1}^{\infty}) = \beta \sum_{\tau = t+1}^{\infty} \delta^{\tau - t} u(s_\tau(h_\tau)). \]

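   The equilibrium strategy of self t is as follows.
\[
s_t^*(h_t) =
\begin{cases}
c_t^* \in \arg\max_{c_t \in \left[0,\; k_t - \frac{k_{t+1}^o}{1 + r_t}\right]} \; u(c_t) + V_t(h_t, c_t, \{s_\tau^*\}_{t+1}^{\infty}) & \text{if } k_\tau \ge k_\tau^o \text{ for all } \tau < t, \\[6pt]
s_t^o(h_D^o, h_{D+1,t}) & \text{otherwise, where } D \text{ is the first self with } k_{D+1} < k_{D+1}^o.
\end{cases}
\]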

   Three steps complete the proof. We demonstrate first that a well-defined argmax $c_t^*$ exists
in the specified range, second, that no self finds it optimal to consume outside the range,
and third, that the constructed equilibrium is Pareto improving.
   First, an optimum exists for self t’s problem for any strategy combination of the succeeding
players. That an optimum exists follows from the compactness of the choice set and the
boundedness of the program. It is bounded below by zero and above by the value of the
program when β = 1; i.e., the value of the exponential agent’s program with the same level
of wealth. Since an optimum exists for any strategy combination of the succeeding players,
it also exists for their equilibrium strategy combination.
   Second, self t does not wish to over-consume. Any benefit to the self from consuming
an additional amount on the equilibrium path is less than what the same self would have
gotten from consuming that additional amount in the original equilibrium. Over-consuming
was not optimal for the self in the original equilibrium, so it cannot be in the constructed
equilibrium either. Consider self t’s consumption of an additional amount x over the allowed
$c_t^o + b_t$ for total consumption $c_t = c_t^o + b_t + x \le k_t$. Because the original equilibrium $s^o$ is an
equilibrium, we know that
\[ \int_{c_t^o}^{c_t^o + x} u'(s)\,ds \le V_t(h_t^o, c_t^o, \{s_\tau^o\}_{t+1}^{\infty}) - V_t(h_t^o, c_t^o + x, \{s_\tau^o\}_{t+1}^{\infty}). \tag{1} \]

The gain in the constructed equilibrium from consuming the additional amount x when
previous selves have played their equilibrium strategies is less than the gain in the original
equilibrium. The gain from consuming the additional amount x for self t on the equilibrium
path is
\[ \int_{c_t^o + b_t}^{c_t^o + b_t + x} u'(s)\,ds < \int_{c_t^o}^{c_t^o + x} u'(s)\,ds \tag{2} \]

with strict inequality holding from the strict concavity of the period utility function.
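A quick numerical check of inequality (2) may help. The square-root utility function and the values of the original consumption, the bonus, and the excess amount x are assumptions chosen only so that the arithmetic is easy to follow.

```python
import math


def utility_gain(u, base, extra):
    """u(base + extra) - u(base), i.e. the integral of u' from base to base + extra."""
    return u(base + extra) - u(base)


u = math.sqrt                    # assumed utility: increasing, strictly concave, u(0) = 0
c_o, b_t, x = 4.0, 5.0, 7.0      # hypothetical original consumption, bonus, and excess

gain_original = utility_gain(u, c_o, x)           # right-hand side of (2)
gain_constructed = utility_gain(u, c_o + b_t, x)  # left-hand side of (2)
print(gain_constructed, "<", gain_original)
```

Because the constructed equilibrium starts the self at a higher consumption level, the same extra amount x buys less additional utility, which is all that step two of the proof requires.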
   Because succeeding selves choose the consumption levels from the original equilibrium in
response to the defection, the payoff from future periods is the same:
\[ V_t(h_t^*, c_t^o + b_t, \{s_\tau^*\}_{t+1}^{\infty}) - V_t(h_t^*, c_t^o + b_t + x, \{s_\tau^*\}_{t+1}^{\infty}) = V_t(h_t^o, c_t^o, \{s_\tau^o\}_{t+1}^{\infty}) - V_t(h_t^o, c_t^o + x, \{s_\tau^o\}_{t+1}^{\infty}). \]
It follows that
\[ \int_{c_t^o + b_t}^{c_t^o + b_t + x} u'(s)\,ds < V_t(h_t^*, c_t^o + b_t, \{s_\tau^*\}_{t+1}^{\infty}) - V_t(h_t^*, c_t^o + b_t + x, \{s_\tau^*\}_{t+1}^{\infty}). \tag{3} \]
Thus, every self receives at least $k_t^o$ and no self consumes more than $c_t^o + b_t$.

   Finally, each self is at least as well off. If the self chooses $c_t = c_t^o + b_t \ge c_t^o$, leaving capital
$k_{t+1}^* = k_{t+1}^o$, then the period’s utility will be at least as great as in the original equilibrium
and the continuation payoff will be the same. Since this is an option for each self along the
equilibrium path, all selves must be at least as well off as in the original equilibrium. Since
the total level of resources is higher, at least one agent must become better off.
   Now consider the initial self’s choice of reward. Suppose that the continuation equilibrium
following the choice of $k$ is $s^*$ and the one following the choice of $k^o$ is $s^o$. Then the initial
self will choose to maximize wealth.
   We have shown that in the infinite horizon, there are equilibria in which it is optimal for
the agent to maximize wealth. Any refinement that respects Pareto dominance would select
an equilibrium in which the higher reward is chosen. We are not aware of an appropriate
refinement which eliminates the choice of the higher reward.
   Our results imply that experiments offering financial rewards do not identify quasi-
hyperbolic preferences; the hypothesis that β = 1 cannot be tested. The natural question
raised by our finding is why subjects are not observed to maximize wealth. We think that,
particularly because subjects are often students, liquidity constraints often bind. Sufficiently
strict liquidity constraints that relax over time could generate the appearance of declining
discount rates. An alternative explanation is that the declining implied discount rates are
the result of uncertainty in future interest rates as discussed by Weitzman (1998). There
are also the previously mentioned theories of Read (2001) and Rubinstein (2003). It seems
likely that one of these explanations, or a combination of them, is responsible for the observed
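The interest-rate uncertainty explanation mentioned above can be illustrated with a minimal sketch. It assumes continuous compounding and a hypothetical two-point distribution over a constant future interest rate; the function name and numbers are illustrative and are not taken from Weitzman (1998). Averaging discount factors rather than rates yields an implied rate that falls with the horizon, even though no agent in the sketch is present-biased.

```python
import math


def certainty_equivalent_rate(rates, probs, horizon):
    """Constant rate whose discount factor equals the expected factor E[exp(-r * horizon)]."""
    expected_factor = sum(p * math.exp(-r * horizon) for r, p in zip(rates, probs))
    return -math.log(expected_factor) / horizon


rates, probs = [0.01, 0.07], [0.5, 0.5]   # hypothetical two-point distribution
for horizon in (1, 10, 50, 200):
    print(horizon, round(certainty_equivalent_rate(rates, probs, horizon), 4))
```

At long horizons the low-rate scenario dominates the expected discount factor, so the implied rate drifts toward the lowest possible rate.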

3         Choices for Non-Financial Rewards
A relatively small minority of experiments on intertemporal choice involve non-financial
rewards. For agents who discount exponentially, the value of a reward is the minimum
willingness-to-accept (WTA), that is, the amount of money paid immediately that makes
the agent just as well-off as actually receiving the non-financial reward. If the WTA for
one reward is larger than the WTA for another reward, then an agent’s maximized utility
with the first reward is larger than with the second reward, and the first should be chosen.
Thus, for exponential discounters the WTA framework generalizes the treatment of financial
rewards of the previous section.2
        There are special cases in which agents should treat non-financial rewards in the same
way that they treat financial rewards of the same market value. One is when there are
no transaction costs of selling them. Another is when the reward is inframarginal in the
sense that a consumer who had received the WTA for a reward would have been indifferent
to purchasing the reward on the market in the period that it is offered in the experiment.
The best examples in this category are Pender’s (1996) experiments offering rice to peasants
in rural India.3 Though these special cases are not our major concern for any particular
experiment, they can confound interpretation of what has been done. Generally, however,
the value of a reward depends on liquidity constraints, market prices, complementarities,
and a host of other omitted variables. In this section we demonstrate that the behavior
of subjects in experiments with non-financial rewards is consistent with WTA-maximizing
behavior of exponential agents in the two types of experiments that have been conducted.

   2 The structure of nearly all experiments implies that the WTA rather than the maximum willingness-
to-pay is the appropriate monetization of rewards. When experimenters essentially endow subjects with
their chosen reward, the appropriate welfare comparison is of maximized utility when the agents have the
reward. Though it is theoretically possible for WTA to be unbounded (Hanemann (1991)), such is not a
concern with the existing body of experiments. That the WTA for a reward may not be uniquely defined
for quasi-hyperbolic agents is immaterial in our treatment of exponential agents.
   3 The few examples of “preference reversal” disappear after initial trials, and Pender ascribes them to
unfamiliarity with the experimental setting.

   The first type of experiment asks subjects to give monetary values for a reward or to
choose between a pair of rewards separated by a fixed period of time. Subjects are said to
discount hyperbolically if their implied discount rates decline over time or if the preferred
choice reverses with an equal increase in the delay to both choices. Though not stressed in
the literature, the implied discount rate of many subjects is negative or zero. For example,
Redelmeier and Heller (1993) found that 62.1% of their subjects had discount rates of zero
and 10% had negative discount rates. The same lack of inclination to consume immediately
was found by Loewenstein’s (1987) experiments asking subjects their maximum willingness-
to-pay (WTP) to avoid receiving a non-lethal 110 volt shock and also to obtain a kiss
from the movie star of their choice. Contrary to the predictions of models with positive
discounting, the value of the kiss peaked at 3 days and the WTP to avoid the shock climbed
as it became more distant. Clearly, the findings challenge the position that quasi-hyperbolic
discounting fully explains experimental results: they demonstrate that omitted variables
must affect choice, and open the possibility that those omitted variables generate the behavior
taken to be evidence of time inconsistency of any sort. The first subsection discusses how
complementarities in consumption can generate the appearance of time inconsistency. The
second subsection considers option value aspects of rewards that raise the WTA over time
and generate the appearance of declining discount rates.
   The second type of experiment addresses the time consistency of a consumption profile
by asking subjects to make a decision for the same future date at two points in time, usually
substantially in advance and immediately before the choice takes effect. The change in
decision is taken to be evidence of hyperbolic discounting. The third subsection discusses
the confound for such experiments that, first, the subjects obtain information in the interim
that may affect their choice, and second, their initial response need not be genuine when
they have a chance to change it later.

3.1    Complementarities

The prices of some goods drop predictably over time, and changes in the public provision
of goods can be known in advance. The utility of consuming rewards, and by extension the
WTA for them, may be drastically altered by the timing and quantity of their complements.
The ability of complementarity to generate behavior that can be attributed to hyperbolic
discounting has been found before, notably in Becker and Murphy’s (1988) explanation of
addictive behaviors with intertemporal complementarity.
   The second experiment in Loewenstein (1987) is no more supportive of hyperbolic discounting
than the first. Subjects had the choice of having dinner at a fancy French restaurant
either immediately or after a week’s delay. When told they would be eating at home in the
third week, 84% of the subjects chose the delayed dinner. When told that they would have a
fancy lobster dinner in the third week, choices tended to switch to having the French dinner
immediately. The results strongly suggest a complementarity in fancy meals.
   Another case where intertemporal additivity apparently does not hold is that of aversive
stimuli. Navarick (1982) offered subjects either 40 seconds of quiet followed by 0 seconds of
noise (40 quiet, 0 noise) or (25 quiet, 15 noise). In a confirmation of positive discounting
and the aversiveness of the stimulus, the subjects chose the former. When the same choices
were preceded by 20 seconds of additional quiet to make the choices (60 quiet, 0 noise) or
(45 quiet, 15 noise), nearly half chose the latter. Evidently, the additional quiet was more
aversive than the “aversive” noise because of complementarity with other quiet. The result
should not be surprising because of the aversiveness of sensory deprivation referred to
sometimes as “white noise torture.” (The experiment was intended to simplify a previous
experiment in Solnick et al. (1980) which had reported results that were ambiguous because
of technical issues of experimental design.) A deeper issue for experiments with aversive
stimuli is the willingness of subjects to participate. When subjects are offered the chance to
leave with no effect on their compensation, as in Navarick (1982), almost none do.

   In a related paper with a strongly contrasting result, Millar and Navarick (1984) asked
subjects to choose between the following two options regarding video games: wait for 30
seconds, play 20 seconds and wait 40 seconds (wait 30, play 20, wait 40) or (wait 70, play
20). Far more people chose the latter than the former, in contrast to what would occur if
subjects did not like to wait while doing nothing. The results suggest that the preferences
for rewards with waiting times are highly dependent on experimental design.
   Also addressing the optimal time for a rest period, one experiment in Raineri and Rachlin
(1993) offered subjects the choice of when to have a vacation. Possible complementarities
include previously scheduled activities; it may simply be inconvenient for subjects to take a
vacation at a particular point in time. Though the implied discount rates declined over time,
the authors referred to the possible disruptions from the larger vacations: “a 1- or 10-year
vacation would disrupt these student’s lives and leave them with no additional assets at the
end.” The WTA for the vacations should be affected by these factors. In another experiment
Raineri and Rachlin offered subjects the use of an economy car. Since they did not consider
either the effects of technological change in the quality of cars or changes in the consumption
of subjects over time, their results do not reflect only discounting but the other factors as well.
   That the “value” of a reward need not be constant across time is also an issue for Kirby
and Herrnstein (1995). They offered popular semi-durable goods (e.g., walkman, watch,
Harvard sweatshirt, etc.), each selected to have a similar approximate market price of $30, to
Harvard undergraduates. With the relative size of rewards determined by subjects’ ratings,
the experiment employed the standard procedure of increasing equally the delay to two
rewards until choice shifted to the larger, later prize. With rewards selected to be equal in
market value, the change in preferred reward may simply be the result of complementarities.
In support of their importance, the authors report that subjects took into account such
changes, calculating receipt dates “in relation to school holidays, impending vacations, birth
dates of friends or relatives, graduation dates, even a weekend date, and so on.” It is far
from clear that “preference reversals” are the result of quasi-hyperbolic discounting.

3.2        Uncertainty, Insurance, and Option Value

Some rewards increase in value over time, particularly rewards that have an insurance component.
Consider a fixed medical insurance plan. As an individual advances through adulthood,
a greater probability of serious illness makes health insurance more valuable. The rewards in
several experiments offer subjects a return to full health that would, for insurance reasons,
be expected to increase in real value over time. Thus it is not clear whether subjects have a
constant discount rate applied to a rising reward or have a declining discount rate applied
to a constant reward.
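The observational ambiguity described above can be sketched numerically. The sketch assumes an exponential discounter with a constant rate ρ and a reward whose undiscounted value grows at an accelerating pace; the functional form exp(accel · delay²) and every parameter are hypothetical, chosen only to illustrate the point. An observer who treats the reward as constant infers a discount rate that declines with the delay.

```python
import math


def true_present_value(v0, rho, accel, delay):
    """Discounted value when the undiscounted value grows as v0 * exp(accel * delay**2)."""
    return v0 * math.exp(accel * delay ** 2) * math.exp(-rho * delay)


def inferred_rate(present_value, v0, delay):
    """Average discount rate inferred by an observer who treats the reward as constant at v0."""
    return -math.log(present_value / v0) / delay


rho, accel, v0 = 0.10, 0.002, 100.0   # hypothetical parameters
for delay in (1, 5, 10, 20):
    pv = true_present_value(v0, rho, accel, delay)
    print(delay, round(inferred_rate(pv, v0, delay), 3))
```

Under the assumed form the inferred rate equals ρ minus accel times the delay. If the reward's value instead grew at a constant proportional rate, the inferred rate in this sketch would be lower than ρ but flat, so the shape of the growth path matters for the appearance of declining rates.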
       There are few experiments in which these concerns are paramount. A typical one is Chapman
and Elstein’s (1995) comparison of the discounting of monetary payments, vacations,
and a temporary return to full health from a particular poor health state.4 Subjects were
told to imagine that they had been in a state of poor health for the past two years and that
the state would persist until their death. They were then asked to specify the duration of
a period of full health at a given delay that they considered equivalent to a given period of
full health beginning immediately. The experimenters concluded that the implied discount
rates declined over time.
       The experimental design assumes that discounting is the sole factor affecting the WTA
for a return to full health. The possibility that uncertainty increases the
reward’s (undiscounted) value is ignored entirely. The study could have attempted to
relate determinants of uncertain future health to the relative discounting across rewards,
though other complementarities would likely still have posed problems.

       4 We focus on the treatment of health because the vacation issues are quite similar to those in Raineri
and Rachlin (1993). Chapman (1996) examined the same monetary and health rewards as Chapman and
Elstein (1995), with similar results.

3.3        Learning

One way to establish time inconsistency is to compare a planned decision at two different
dates. The main problem with such a protocol is that individuals’ uncertainty may be
resolved between the dates in a way that affects their optimal choices. A secondary problem is
that subjects may be providing answers they think their questioners wish to hear, particularly
when the choices are not binding in the initial query.5 Three experiments offer choices in
this fashion.
      Illustrating both issues is Christensen-Szalanski’s (1984) study of pregnant women’s
choices of whether to use anesthesia during labor. The expectant mothers were asked well
in advance of labor and also at the start of labor whether they wished to use anesthesia.
The result was that women who had never had children before shifted to choosing to use
anesthesia as labor approached. The paper interprets the results as evidence of impatience:
during labor, the short term costs of not using anesthesia loom large relative to the benefits
of “natural” childbirth. Since the experienced mothers did not express the same prefer-
ence shift, the informational explanation seems likely: new mothers learned about the pain
of childbirth between their choices. The quasi-hyperbolic preference interpretation cannot
explain the difference in behavior between new and experienced mothers.
      The other problem with the experiment is that the subjects presumably knew that their
first choice was not binding–they would not be denied anesthesia if they wanted it later.
As a result, their first choice was without consequence. They could, for example, make the
statement they thought was expected of them or what they thought the interviewer wanted
to hear. The inability to commit to subjects’ initial decisions will generally compromise
studies of real-world situations with serious implications.
      5 The problems associated with non-binding stated preference techniques such as contingent valuation are
well-known (e.g., Diamond and Hausman (1994)).

      Opportunities for learning and the inability of experimenters to commit also call into
question the implications of Read and van Leeuwen’s (1998) experiment evaluating choices
between unhealthy and healthy snacks. Subjects were more likely to make the unhealthy
choice when asked immediately before the snacks were to be given than when asked a week
in advance. Such results are not clear evidence of time inconsistency: if subjects learn about
the nature of their hunger between their initial declaration and the moment they actually
consume the snack, healthy snacks can be optimal ex ante and still have a low probability of
being optimal ex post. (Under conditions weaker than expected utility maximization, agents
trade off the probabilities against the magnitudes of state-contingent payoffs when making
advance decisions.) The second issue, that choices may not bind and that subjects may wish
to sound virtuous, also applies. Subjects may have supposed that their advance choice did
not matter, a supposition that would have turned out to be true.6
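A small numerical sketch may make the ex ante versus ex post distinction concrete. The states, probabilities, and payoffs below are hypothetical and are not drawn from the experiment; the agent is a plain expected utility maximizer and no discounting of any kind is involved.

```python
# Hypothetical hunger states and their probabilities (illustrative assumptions).
states = {"hungry": 0.7, "not_hungry": 0.3}

# Hypothetical state-contingent payoffs from each snack (illustrative assumptions).
payoff = {
    "healthy": {"hungry": 4.0, "not_hungry": 10.0},
    "unhealthy": {"hungry": 5.0, "not_hungry": 2.0},
}


def expected_utility(snack):
    """Expected payoff of committing to a snack before the state is known."""
    return sum(p * payoff[snack][s] for s, p in states.items())


def best_in_state(state):
    """Snack that is optimal once the hunger state has been learned."""
    return max(payoff, key=lambda snack: payoff[snack][state])


# Advance choice: maximize expected utility over the unknown state.
ex_ante_choice = max(payoff, key=expected_utility)

# Probability that the advance choice is still the best snack after learning.
prob_ex_post_optimal = sum(
    p for s, p in states.items() if best_in_state(s) == ex_ante_choice
)

print(ex_ante_choice, prob_ex_post_optimal)
```

With these numbers the healthy snack is the better advance choice because its payoff is large in the less likely state, yet in most realized states the agent would prefer the unhealthy snack. An observed shift toward the unhealthy snack is therefore consistent with a time-consistent agent who learns.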
        A third experiment in which learning likely affects the results is Read, Loewenstein, and
Kalyanaraman’s (1999) study of the choice of movies to rent. They define the smaller-sooner
reward as a light comedy or action flick and the larger-later reward as a subtitled or depressing
film. They found that subjects were more likely to choose high-brow movies when asked in
advance than when asked immediately before they were going to watch. Again, the
issue is that a decision can be ex ante optimal yet ex post optimal with only a low probability. When
asked to choose a movie in advance, the subjects must do so not knowing which they will
be in the mood for. Even exponential discounters may have moods in which they would like
to watch fluff, whether because of other decisions or because of exogenous factors. Without
addressing these factors, the experiment provides no way to know whether the choice is the
result of discounting.

        6 The experiment’s additional findings on the determinants of choice are very interesting but do not bear
on the issue of identifying quasi-hyperbolic preferences.

4         Conclusion
Though experiments with financial rewards commonly find that some subjects have non-
positive implied discount rates, our review confirms the general result that subjects have
declining implied discount rates. (In fact, we abandoned a planned statistical meta-analysis
because of a lack of variation in the outcomes across experiments.) Since our theoretical
analysis suggests that the generalization of consumption preferences from exponential to
quasi-hyperbolic should not affect choices for financial rewards, there must be omitted
variables such as liquidity constraints that affect choice. Without knowing what those variables
are and being able to control for them, experiments with financial rewards do not identify
quasi-hyperbolic preferences. By the same logic, the common finding of nonpositive discount
rates is not evidence against quasi-hyperbolic preferences because they could be generated
by the omitted variables.
   Though the generalization to quasi-hyperbolic preferences may affect choices for non-
financial rewards, there are variables that affect choice that have been omitted from the
analyses. As we discussed, theory provides no restriction for the relative WTA of non-
financial rewards because of complementarities and uncertainty. The limited set of experi-
ments with non-financial rewards have not dealt with these factors. Thus, their findings of
declining implied discount rates could be generated by exponentially discounting agents.
   We hope that our emphasis on the effects of “omitted variables” is not taken to be
an unfairly facile attempt to dismiss experimental findings. In each of the categories of
experiments involving non-financial rewards, there are systematic effects that would lead
exponential discounters to appear otherwise. The presence of these systematic effects implies
that evidence taken as contrary to quasi-hyperbolic preferences is not. We have discussed
findings of non-positive discount rates and of choices for increasing consumption profiles.
When the omitted variables that we stress are not taken into account, then neither of these
findings is evidence against quasi-hyperbolic discounting.
   In light of the issues we have raised, a natural question to ask is whether there are any
reward-based experiments that can identify quasi-hyperbolic preferences. Our formal results
suggest that experiments with financial rewards cannot, so any hope of doing so relies on non-
financial rewards. Establishing that discounting is not exponential requires either that the
undiscounted WTA for a reward is unaffected by other variables or that the variables
that affect it, and the way they do, are known. In an ideal experiment, the rewards would
be non-market, non-storable, and strongly separable in utility from other determinants of
well-being. Since the value of any given reward can vary immensely as a function of variables
that are not observable to analysts, the problems cannot easily be addressed. Perhaps the
best way of illuminating time inconsistency is with the study of commitment devices such as
Christmas clubs, forced diet programs, and the like. Though commitment devices may not
be useful in directly estimating preference parameters, they are likely to be the strongest
form of evidence that agents do not simply discount the future exponentially.
   Our conclusion that the existing experiments do not identify quasi-hyperbolic preferences
should not be taken to imply that we think the model is not useful. Applications have cast
light on the effects of financial innovations on savings rates (Laibson (1997)), on incentives
for procrastinators (O’Donoghue and Rabin (1999)), and, generally, on the implications of
naivete and sophistication. Even if it were the case that a particular class of experiments
were to reject quasi-hyperbolic consumption preferences–rather than fail to identify them,
as we have found–the model could still be useful for understanding other aspects of behav-
ior. As Rabin (1998) discusses, the results emerging from the literature on psychology and
economics suggest that decision-making depends on context in a way that does not lead to
general models. In conjunction with our results, the statement suggests caution in draw-
ing conclusions about optimal intertemporal resource allocation from the quasi-hyperbolic
discounting model, particularly when those conclusions are to be used to form policy.

References

 [1] Ainslie, George and Varda Haendel. 1983. “The Motives of the Will.” In E. Gottheil
    et al. (eds.), Etiologic Aspects of Alcohol and Drug Abuse. Springfield, IL: Thomas.

 [2] Becker, Gary S. and Kevin M. Murphy. 1988. “A Theory of Rational Addiction.” Journal
    of Political Economy 96(4), 675-700.

 [3] Bernheim, B. Douglas, Debraj Ray, and Sevin Yeltekin. 1999. “Self-Control, Saving, and
    the Low Asset Trap.” Mimeo online at:∼bernheim/selfcontrol.pdf

 [4] Chapman, Gretchen. 1996. “Temporal Discounting and Utility for Health and Money.”
    Journal of Experimental Psychology: Learning, Memory, and Cognition 22(3), 771-791.

 [5] Chapman, Gretchen and Arthur Elstein. 1995. “Valuing the Future: Temporal Dis-
    counting of Health and Money.” Medical Decision Making 15(4), 373-386.

 [6] Christensen-Szalanski, Jay. 1984. “Discount Functions and the Measurement of Patients’
    Values: Women’s Decisions During Childbirth.” Medical Decision Making 4(1), 47-58.

 [7] Diamond, Peter A. and Jerry A. Hausman. 1994. “Contingent Valuation: Is Some Num-
    ber Better than No Number?” The Journal of Economic Perspectives 8(4), 45-64.

 [8] Frederick, Shane, George Loewenstein, and Ted O’Donoghue. 2002. “Time Discounting
    and Time Preference: A Critical Review.” Journal of Economic Literature 40(2), 351-

 [9] Hanemann, W. Michael. 1991. “Willingness to Pay and Willingness to Accept: How
    Much Can They Differ?” The American Economic Review 81(3), 635-647.

[10] Kirby, Kris and R.J. Herrnstein. 1995. “Preference Reversals Due to Myopic Discounting
    of Delayed Reward.” Psychological Science 6(2), 83-89.

[11] Krusell, Per and Anthony Smith, Jr. 2003. “Consumption-Savings Decisions with Quasi-
    Geometric Discounting.” Econometrica 71, 365-375.

[12] Laibson, David. 1997. “Golden Eggs and Hyperbolic Discounting.” Quarterly Journal of
    Economics 112(2), 443-477.

[13] Laibson, David. 2002. “Intertemporal Decision Making.” In Encyclopedia of Cognitive
    Science. Mimeo online at:

[14] Loewenstein, George. 1987. “Anticipation and the Valuation of Delayed Consumption.”
    The Economic Journal 97, 666-684.

[15] Millar, Andrew and Douglas Navarick. 1984. “Self-Control and Choice in Humans: Ef-
    fects of Video Game Playing as a Positive Reinforcer.” Learning and Motivation 15,

[16] Mulligan, Casey. 1996. “A Logical Economist’s Argument Against Hyperbolic Discount-
    ing.” Available from author.

[17] Navarick, Douglas. 1982. “Negative Reinforcement and Choice in Humans.” Learning
    and Motivation 13, 361-377.

[18] O’Donoghue, Ted and Matthew Rabin. 1999. “Doing it Now or Later.” American Eco-
    nomic Review 89(1), 103-124.

[19] Pender, John. 1996. “Discount Rates and Credit Markets: Theory and Evidence From
    Rural India.” Journal of Development Economics 50, 257-296.

[20] Rabin, Matthew. 1998. “Psychology and Economics.” Journal of Economic Literature
    36(1), 11-46.

[21] Raineri, Andres and Howard Rachlin. 1993. “The Effect of Temporal Constraints on
    the Value of Money and Other Commodities.” Journal of Behavioral Decision Making
    6, 77-94.

[22] Read, Daniel. 2001. “Is Time-Discounting Hyperbolic or Subadditive?” Journal of Risk
    and Uncertainty 23(1), 5-32.

[23] Read, Daniel and Barbara van Leeuwen. 1998. “Predicting Hunger: The Effects of
    Appetite and Delay on Choice.” Organizational Behavior and Human Decision Processes
    76(2), 189-205.

[24] Read, Daniel, George Loewenstein, and Shobana Kalyanaraman. 1999. “Mixing Virtue
    and Vice: Combining the Immediacy Effect and the Diversification Heuristic.” Journal
    of Behavioral Decision Making 12, 257-273.

[25] Redelmeier, Donald and Daniel Heller. 1993. “Time Preference in Medical Decision Mak-
    ing and Cost-Effectiveness Analysis.” Medical Decision Making 13, 212-217.

[26] Rubinstein, Ariel. 2003. “Is it ‘Economics and Psychology’ ?: The Case of Hyperbolic
    Discounting.” International Economic Review. Volume and page information not yet
    available.

[27] Solnick, Jay, Catherine Kannenberg, David Eckerman, and Marcus Waller. 1980. “An
    Experimental Analysis of Impulsivity and Impulse Control in Humans.” Learning and
    Motivation 11, 61-77.

[28] Thaler, Richard. 1981. “Some Empirical Evidence on Dynamic Inconsistency.” Economics
    Letters 8, 201-207.

[29] Weitzman, Martin L. 1998. “Why the Far-Distant Future Should Be Discounted at Its
    Lowest Possible Rate.” Journal of Environmental Economics and Management 36(3),

5       Appendix
Proof of Theorem 1 for Finite Horizon Case
    When the horizon is finite, there is a unique outcome following the initial self’s choice
of either $k^o$ or $k$. The characterization of consumption follows well-known arguments. As
the outcome to a game of complete and perfect information, the consumption plan can
be solved with backwards induction. The final self $T$ will fully consume any remaining
capital $k_T$. Self $T-1$ faces the following problem:

$$\max_{c_{T-1}} \{ u(c_{T-1}) + \beta\delta u(k_T) \}, \quad \text{where } k_T = (1 + r_{T-1})(k_{T-1} - c_{T-1}).$$

The assumptions imply that the first-order condition defines the
unique solution $c_{T-1}(k_{T-1})$. The same is true for $c_{T-2}(k_{T-2})$, $c_{T-3}(k_{T-3})$, ..., $c_0(k_0)$. The
curvature of $u$ implies that $c_t$ and the optimal $k_{t+1}$ are increasing in $k_t$ for all $t$, including
time 0. It follows that the initial self has higher utility with the payment stream that has
higher present value. Thus, the initial self chooses to maximize wealth in the finite horizon
case.
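The backward induction argument can be sketched numerically under additional assumptions that are ours and not the paper's: log utility, a constant interest rate r, and free borrowing and lending so that a payment stream simply adds its present value to initial capital. The function names, parameter values, and payment streams are hypothetical. With log utility the first-order condition for self t has the closed form c_t = k_t / (1 + beta * delta * A), where A is the weight on ln(k_{t+1}) in the continuation value.

```python
import math


def consumption_shares(T, beta, delta):
    """Share of remaining capital consumed by each self t = 0..T (log utility)."""
    shares = [0.0] * (T + 1)
    shares[T] = 1.0  # the final self consumes all remaining capital
    A = 1.0          # weight on ln(k_T) in the continuation value from T onward
    for t in range(T - 1, -1, -1):
        shares[t] = 1.0 / (1.0 + beta * delta * A)  # first-order condition of self t
        A = 1.0 + delta * A                          # extend continuation weight by one period
    return shares


def initial_self_utility(k0, T, beta, delta, r):
    """Utility of self 0 from the equilibrium consumption path starting at k0."""
    k, utility = k0, 0.0
    for t, share in enumerate(consumption_shares(T, beta, delta)):
        c = share * k
        weight = 1.0 if t == 0 else beta * delta ** t  # quasi-hyperbolic weights
        utility += weight * math.log(c)
        k = (1.0 + r) * (k - c)                        # capital carried forward
    return utility


def present_value(payments, r):
    """Present value at rate r of payments received in periods 0, 1, 2, ..."""
    return sum(x / (1.0 + r) ** t for t, x in enumerate(payments))


# Illustrative parameters and payment streams (assumptions, not from the paper).
T, beta, delta, r = 5, 0.7, 0.95, 0.05
endowment = 1000.0
sooner = [50.0, 0.0, 0.0]   # smaller payment now
later = [0.0, 0.0, 100.0]   # larger payment two periods from now

u_sooner = initial_self_utility(endowment + present_value(sooner, r), T, beta, delta, r)
u_later = initial_self_utility(endowment + present_value(later, r), T, beta, delta, r)
print(u_later > u_sooner)
```

In this sketch the consumption shares do not depend on capital, so every self's consumption, and hence the initial self's utility, rises with initial wealth. The stream with the higher present value is preferred even though beta is well below one.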

