IRES Working Paper Series, IRES2010-010

Model Stability and the Subprime Mortgage Crisis

Xudong An†, Yongheng Deng§, Eric Rosenblatt‡, Vincent W. Yao‡

September 12, 2010



                                               Abstract

We study the potential model instability problem with respect to mortgage default risk and
examine to what extent it helps explain the default shock during the recent crisis. We find that
econometric default risk models based on historical data can be unstable over time. Due to
temporal shifts in the parameters, default prediction of the 2006 vintage subprime loans based on
hazard and Logit models estimated with 2003 vintage loan data can generate over 40 percent
fewer defaults than the actual number, assuming perfect forecast of house price change. We also
find that the combined impact of parameter instability and bad forecast of HPI growth enlarges
the under-prediction of default rate but the marginal impact of parameter instability is larger than
that of bad HPI forecast. Our findings have important implications regarding model limitations
and risk, model improvements, economic capital, and regulatory reform.

Keywords: subprime mortgage, default risk, model stability, hazard model, Logit model




                                                            

   The authors are grateful to John Clapp, David Geltner, Richard Green, Michael Lea, David Ling, Tony
Sanders, and Brent Smith for helpful discussions and suggestions. We also thank participants in the
Maastricht-MIT-NUS 2009 Real Estate Finance and Investment Symposium, the 2010 Weimer School of
Advanced Studies in Real Estate and Land Economics, and the Finance Seminar at San Diego State University for
helpful comments.
† Department of Finance, College of Business Administration, San Diego State University, 5500 Campanile
Dr., San Diego, CA 92182-8236; xan@mail.sdsu.edu, (619) 594-3027, (619) 594-3272 (fax).
§ Institute of Real Estate Studies, National University of Singapore; 21 Heng Mui Keng Terrace, #04-02,
Singapore, 119613; ydeng@nus.edu.sg, (65) 6516-8291, (65) 6774-1003 (fax).
‡ Fannie Mae, 3900 Wisconsin Avenue, Washington, DC 20016. E-mails: eric_rosenblatt@fanniemae.com and
vincent_w_yao@fanniemae.com.



1. Introduction


The recent financial crisis was originally triggered by the large scale of unexpected losses on

mortgages and mortgage-related securities.1 The default shock has prompted an investigation

into what went wrong with the credit risk models. Some see the problem as coming from the data

input. For example, Satyajit Das, a former Citigroup banker, told Bloomberg reporters that:


          “The models are fine. But they have an input problem. It becomes a number we pluck out

          of the air. They could be wrong, and the ratings could be misleading.”2


Others, however, blame model instability. For example, Alan Greenspan (2008) suggested:


          “The whole intellectual edifice, however, collapsed in the summer of last year because

          the data inputted into the risk management models generally covered only the past two

          decades — a period of euphoria.”3


In this paper, we investigate the potential model instability problem with respect to mortgage

default risk and examine to what extent the model instability explains the default shock in the

crisis.


Our study concerns two econometric models that are well established in the academic

literature and widely adopted by the mortgage industry: the Logit model and the

Cox proportional hazard model. Although mortgage lenders and investors usually keep their
                                                            
1 Major mortgage investors had to substantially write down their mortgage assets and rating agencies had to
adjust their ratings to reflect revised expectations of default losses during the crisis. Many mortgage lenders
went into bankruptcy due to unexpected losses.
2 “CDO Boom Masks Subprime Losses, Abetted by S&P, Moody's, Fitch,” Bloomberg News, May 31, 2007.
3 “The Financial Crisis and the Role of Federal Regulators,” the House Committee on Oversight and
Government Reform hearing on October 23, 2008.

specifications of those models proprietary and thus we cannot evaluate models of a particular

lender or investor, we hope to form insights about the general features of those econometric

models through this study.


We find that both the conventional Logit model and the hazard model with reasonable

specifications show inter-temporal instability. For example, based on Wald tests, subprime

mortgage loans originated in 2006 have significantly different default sensitivity to house price

appreciation (depreciation) than those originated in 2003. To assess how the model instability

explains the default shock during the subprime mortgage crisis, we use parameters estimated

with the 2003 vintage data to forecast default probabilities of the 2006 vintage loans and study

the aggregate prediction accuracy. Using the actual realization of the default risk factors such as

house price appreciation (HPI growth) and leverage of the 2006 vintage, we find that the hazard

model estimated with the 2003 vintage data predicts about 40% fewer defaults than the actual

results of the 2006 vintage while the Logit model predicts about 41% fewer defaults.

Alternatively, we take imperfect HPI growth forecasts into consideration. If one assumes the

same HPI growth during 2006-2009 as during 2003-2006, the under-prediction becomes

more severe: the predicted default rate is less than 50 percent of the actual rate. Stated

differently, the actual default rate of the 2006 vintage loans during the 2006-2009 period is more

than twice as high as that predicted by the 2003 vintage models. Presumably, once poor

forecasts of other variables such as interest rates and unemployment are taken into account, the ex post

default rate could be several times higher than predicted. This finding is consistent with the

relation between expected and actual losses of the 2006 vintage loans: the 2006 vintage loans were

originated with spreads similar to those of the 2003 vintage, suggesting that mortgage lenders

expected similar losses from the two vintages; however, the ex post default rate of the 2006


vintage is over 3 times higher than that of the 2003 vintage. Meanwhile, comparing the impacts

of parameter instability and bad forecast of HPI growth, we find that the marginal impact of bad

HPI forecast is smaller than that of a bad model.


The model instability problem we examine in this paper is similar to the so-called “Lucas

critique” regarding econometric policy evaluations (Lucas, 1976). The unprecedented crisis in

the subprime mortgage market and the widely believed structural break in the mortgage and

financial markets provide us with a unique opportunity to study this issue in the field of risk

management. Our findings in this paper have a number of implications regarding model

limitations and risk, model improvements, economic capital and regulatory reform.


The rest of the paper proceeds as follows: in the next section, we review the state-of-the-art

econometric models of mortgage default risk and discuss the specifications of the hazard model

and Logit model that we are focusing on in this paper; we report our data and explain our

sampling technique in section 3; in section 4, we discuss estimates of the two econometric

models and parameter stability tests; we explain our prediction experiments and assess how

model instability explains the default shock in section 5; and we provide concluding remarks in a

final section.


2. Mortgage Default Risk Models


2.1. Econometric models of mortgage default risk


The past forty years have seen a growing literature on mortgage default risk based on ex post

loan performance4, which helps lenders and investors to understand determinants of mortgage

                                                            
4 There is also a literature that tries to understand the implied (ex ante) default risk through mortgage prices
(see Kau, Keenan and Kim 1994, Capozza, Kazarian and Thomson 1998 and many others).

default risk and lays out the foundation for default risk prediction, pricing and management.

Econometric models for mortgage default risk have evolved from simple linear regressions to the

more sophisticated Logit and hazard models.


Linear regression. Von Furstenberg (1969, 1970a, 1970b) develops the first academic default risk

model, a linear regression based on aggregate data. The author regresses the aggregate default rate

of FHA/VA loans on loan characteristics such as loan-to-value (LTV) ratio and age of the loan,

and finds that home equity at loan origination is the most important predictor of default.

Subsequently, a number of studies apply that technique to different loan samples (e.g. loans

originated by S&Ls) and alternative mortgage instruments (ARMs and GPMs).5 Later studies

also experiment with more explanatory variables such as borrower income, payment-to-income

ratio, metro unemployment rate, and mark-to-market LTV. Those studies provide important

guidance for lenders’ underwriting practice and lead to major revisions in underwriting criteria.6

Linear regressions with aggregate data are still used in recent studies of subprime mortgage

default (see, e.g. Mian and Sufi 2009).


Probit and Logit models. The literature on mortgage default proliferates as disaggregate loan

level data become available. While some early studies such as Herzog and Earley (1970) still

apply linear regressions to disaggregate loan level data (with 0/1 dependent variable)7, many

more studies apply probit and Logit models based on economic theory of discrete choice.

Jackson and Kasserman (1980) are the first to use a probit model based on individual FHA loans.

Campbell and Dietrich (1983) and many others use Logit models. The use of probit/Logit models

                                                            
5 See, for example, von Furstenberg and Green (1974), Follain and Struyk (1977), Vandell (1978), Jackson and
Kasserman (1980), Foster and Van Order (1984, 1985), Clauretie (1987), and Quigley and Van Order (1991).
6 For example, lenders made important changes in response to pressure on revealed redlining practice and
Fannie Mae revised standards on ARMs based on academic studies (Vandell, 1993).
7 Other examples include Williams, Beranek and Kenkel (1974) and Webb (1982).

allows researchers to explore more risk factors at the loan level such as loan purpose, loan term,

etc. Meanwhile, contemporaneous equity position replaces original equity position in newer

studies (e.g. Zorn and Lea 1989, Cunningham and Capone 1990). Transaction costs of default,

trigger events and related borrower characteristics appear more frequently in the models (see, e.g.

Vandell and Thibodeau 1985, Hendershott and Schultz 1993, Archer, Ling and McGill 1996,

1997, Capozza, Kazarian and Thomson 1997). As option pricing theory is applied to

mortgage valuation, more and more studies test whether default is significantly

related to the put option being “in the money” (see, e.g. Quigley and Van Order 1995, Archer, Ling and

McGill 1996, Philips, Rosenblatt and Vanderhoff 1996). Recent applications of Logit models

create event-history for each loan and thus are better suited to study the impacts of time varying

variables (see, e.g. Clapp et al 2001, Ambrose and Sanders 2003, Clapp, Deng and An 2006, An,

Clapp and Deng 2010). For subprime mortgage default, Rajan, Seru and Vig (2010) and Keys,

Mukherjee, Seru and Vig (2010) have applied Logit models in their studies.


Hazard model. Developed in the biostatistics literature and first used for mortgage prepayment

risk studies (e.g. Green and Shoven 1986, and Quigley 1987), the proportional hazard model has

prevailed for mortgage default risk studies in the past two decades (see Van Order 1990, Quigley

and Van Order 1991, Schwartz and Torous 1993, Deng, Quigley and Van Order 1996 and many

others). In contrast with the Logit model with event-history data, which assumes the borrower’s

choices in each month are i.i.d. events, the hazard model is based on conditional default probability

and implicitly handles path-dependency. It is thus more appealing for modeling borrower

behavior, which is usually path-dependent. Moreover, the Logit model is restricted by the assumption

of no correlation among competing risks via unobservable variables. By contrast, the hazard model

has the flexibility to allow correlated competing risks (Clapp, Deng and An 2006). Recently, a


number of papers have applied the Cox proportional hazard model to study subprime mortgage

default (e.g. Demyanyk and Hemert 2008, Gerardi, Shapiro and Willen 2008, Elul 2009,

Haughwout, Okah and Tracy 2009, Green, Rosenblatt and Yao 2010, An, Yao and Rosenblatt

2010)8. However, the proportional hazard model is not as convenient as the multinomial Logit model

in dealing with the competing risks of mortgage default and prepayment. To address that issue,

Deng, Quigley and Van Order (2000) apply the competing risks hazard model to mortgage

prepayment and default, and a number of subsequent studies have adopted that methodology.

Deng, Quigley and Van Order (2000) and Deng and Quigley (2002) also model the unobserved

heterogeneity with a mass-point mixed competing risks hazard model. Alexander, Grimshaw,

McQueen and Slade (2002) and Pennington-Cross (2003) apply that technique to subprime

mortgage default. Clapp, Deng and An (2006) extend the unobserved heterogeneity concept to

the multinomial Logit model. Covariates included in a hazard model are usually similar to those in a

Logit model.


Although other models such as discriminant analysis, neural network analysis, and classification

tree analysis appear in the literature9, the Logit model and the hazard model have been the dominant

econometric models used in the academic literature for mortgage default risk. They have also

become the standard tools of mortgage default risk analysis in the mortgage industry.


A Cox proportional hazard model assumes that the hazard rate of default of a mortgage loan at

period T since its origination takes the following form:


        h_i(T; X_i(t)) = h_0(T) \exp\left( X_i(t)' \beta \right), \quad i = 1, \ldots, n.        (1)


                                                            
8 Ciochetti et al (2003), Chen and Deng (2003), and An, Deng and Sanders (2009) apply the model to CMBS
loan default.
9 See Morton (1975), Episcopos, Pericli and Hu (1998), and Feldman and Gross (2005), respectively.

Here h_0(T) is the baseline hazard function, which depends only on the age (duration) of the loan,

T (see footnote 10); X_i(t) is a vector of proportional covariates for individual loan i that are time-varying or

time-invariant risk factors.
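
As a minimal numerical sketch of equation (1), the snippet below evaluates the hazard rate for a single loan at a single point in time. The function name `hazard_rate`, the baseline value, the covariates and the coefficients are all hypothetical and chosen for illustration; none of them are estimates from this paper.

```python
import math


def hazard_rate(baseline_hazard, covariates, beta):
    """Return h_0(T) * exp(X' beta) for one loan at one point in time."""
    linear_index = sum(x * b for x, b in zip(covariates, beta))
    return baseline_hazard * math.exp(linear_index)


# Illustrative inputs only; these are not estimates from the paper.
baseline = 0.002            # hypothetical baseline hazard at loan age T
x_loan_a = [0.10, 1.0]      # e.g. a negative equity measure and a low-doc flag
x_loan_b = [0.00, 1.0]      # same loan with zero negative equity
beta = [2.0, 0.3]           # hypothetical coefficients

h_a = hazard_rate(baseline, x_loan_a, beta)
h_b = hazard_rate(baseline, x_loan_b, beta)

# Proportionality: the ratio of two hazards does not depend on h_0(T).
hazard_ratio = h_a / h_b
print(round(hazard_ratio, 4))
```

The ratio of the two hazards depends only on the difference in covariates, which is the proportionality property that gives the model its name.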


In a Logit model, the default probability of a loan at age T is:


           \Pr_i(T; Z_i(t)) = \frac{\exp\left( Z_i(t, T)' \beta \right)}{1 + \exp\left( Z_i(t, T)' \beta \right)}.        (2)


Here the dependence of default probability on loan age (duration) is modeled by including loan

duration dummy variables in the covariate set Z_i(t, T).
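
The following sketch illustrates equation (2) under stated assumptions: loan age enters through one dummy per year of loan age (a hypothetical bucketing; the paper does not state the granularity here), and the coefficient values are invented for illustration rather than taken from our estimates.

```python
import math


def duration_dummies(loan_age_months, n_years=3):
    """One indicator per year of loan age (a hypothetical bucketing)."""
    year_index = min(max(loan_age_months - 1, 0) // 12, n_years - 1)
    return [1.0 if k == year_index else 0.0 for k in range(n_years)]


def logit_default_prob(covariates, beta):
    """Return exp(Z' beta) / (1 + exp(Z' beta))."""
    index = sum(z * b for z, b in zip(covariates, beta))
    return 1.0 / (1.0 + math.exp(-index))


# Z_i(t, T): duration dummies followed by one time-varying risk factor.
z = duration_dummies(loan_age_months=14) + [0.10]
beta = [-5.0, -4.5, -4.8, 2.0]   # hypothetical coefficients, not estimates
p_default = logit_default_prob(z, beta)
print(round(p_default, 4))
```

Because the duration dummies are part of the covariate set, the Logit model needs no separate baseline function; the age profile of default is carried by the dummy coefficients.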


2.2. Model specification


Our model specification generally follows the existing literature. We include the following

covariates in our models:


Negative equity. Von Furstenberg (1970a, 1970b), Williams, Beranek and Kenkel (1974) and

many others find that home equity at loan origination is the most important predictor of mortgage

default. Later studies use contemporaneous LTV that takes house price change and loan

amortization into consideration, and find it to be a significant risk factor (see, e.g. Foster and Van

Order 1984, Vandell and Thibodeau 1985, Deng 1997). Recent studies on subprime default risk

also find this variable to be a significant risk factor (e.g. Alexander, Grimshaw, McQueen and Slade

2002)11. In this paper, we calculate the borrower’s negative equity as the difference between

contemporaneous house value and market value of the loan. Home price index (HPI) and loan
                                                            
10 Note that the loan duration time T is different from the natural time t, which allows identification of the
model.
11 Alternatively, Demyanyk and Hemert (2008) and Elul (2009) use house price appreciation.

amortization are incorporated in our calculation. Additionally, we account for multiple loans

(liens) on some properties and thus use the combined loan amount in the negative equity calculation.
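
The calculation described above can be sketched as follows. This is a simplified illustration, not our exact procedure: it approximates the market value of the loan by the scheduled remaining balance of a level-payment loan, holds the junior lien balance flat, and adopts the sign convention that a positive number means the borrower is underwater. All function names and input values are hypothetical.

```python
def remaining_balance(principal, annual_rate, term_months, age_months):
    """Scheduled balance of a level-payment fixed-rate loan after age_months."""
    r = annual_rate / 12.0
    if r == 0.0:
        return principal * (1.0 - age_months / term_months)
    full = (1.0 + r) ** term_months
    paid = (1.0 + r) ** age_months
    return principal * (full - paid) / (full - 1.0)


def negative_equity(orig_house_value, hpi_at_orig, hpi_now, loan_balances):
    """Combined loan balance minus HPI-updated house value (positive = underwater)."""
    house_value = orig_house_value * hpi_now / hpi_at_orig
    return sum(loan_balances) - house_value


first_lien = remaining_balance(160_000, 0.08, 360, 24)
second_lien = 30_000.0      # hypothetical junior lien, held flat for simplicity
neg_eq = negative_equity(200_000, hpi_at_orig=100.0, hpi_now=90.0,
                         loan_balances=[first_lien, second_lien])
print(round(neg_eq, 2))
```

In this made-up example a 10 percent decline in the local HPI is enough to push a loan with an 80 percent first-lien LTV and a junior lien into negative equity after two years of amortization.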


FICO score. The FICO score is a numerical summary of a borrower’s history of debt repayment.

Although prime mortgage lenders screen out low credit score borrowers, some researchers

still find that the level of the credit score matters to default risk among the loans originated (e.g. Clapp

et al 2001). For subprime mortgage loans, many recent studies have found it to be a significant

risk factor (see, e.g. Pennington-Cross 2003, Demyanyk and Hemert 2008, Elul 2009, Keys,

Mukherjee, Seru and Vig 2010). Since many mortgage loans have cosigners (husbands or

wives), we use the minimum of the two FICO scores in this study.


Backend ratio. The literature has long included payment-to-income ratio as a default risk factor

(e.g. Herzog and Earley 1970, Archer, Ling and McGill 1996, Deng and Gabriel 2006). The

payment-to-income ratio (frontend ratio) and debt-to-income ratio (backend ratio) are indeed two

important mortgage underwriting variables besides FICO score. For many subprime

mortgage loans, however, those two variables far exceed the traditional underwriting thresholds, and recent

studies have found debt-to-income ratio to be a significant default risk determinant (Demyanyk

and Hemert 2008, Green, Rosenblatt and Yao 2010).


Loan type. Fixed-rate mortgages (FRMs) behave very differently from adjustable-rate mortgages

(ARMs) (see, e.g. Cunningham and Capone 1990, Philips, Rosenblatt and Vanderhoff 1995,

Calhoun and Deng 2002). Some research has also found that 15-year FRMs are less risky than

30-year FRMs (e.g. Alexander, Grimshaw, McQueen and Slade 2002, Deng and Gabriel 2006).

In this study, we focus on FRMs and include an indicator for 15-year FRMs as an explanatory variable.




Property type. Recent research has found that condominium loans are less likely to default,

everything else equal (Agarwal, Ambrose and Sanders 2009). Therefore, we test whether

different property types, i.e. single unit, two-to-four unit and condominium, have different

default risk.


Loan purpose. Different loan purposes indicate that borrowers are at different stages of their

housing tenure as well as in different financial situations. While Clapp et al (2001) find refinance

loans are more likely to default among prime mortgage loans, recent research such as Elul (2009)

finds that subprime refinance loans are less likely to default, everything else equal. A related

variable considered in the literature is whether the property is an existing/new unit (see, e.g. von

Furstenberg and Green 1974, Campbell and Dietrich 1983, Deng and Gabriel 2006). We consider

three different loan purposes in this study: home purchase, rate/term refinance, and cash out

refinance.


Documentation type. An important feature of subprime mortgage loans is that many loans do not

have full documentation of income, assets or employment. The low documentation may be

caused by borrowers’ difficulty in verifying their income, assets or employment, or in some

extreme situations borrowers simply state income they do not have (stated income loans).

Recent studies have found that low doc loans have higher default risk (e.g. Demyanyk and

Hemert 2008, Rajan, Seru and Vig 2009).


Occupancy type. Demyanyk and Hemert (2008) and Agarwal, Ambrose and Sanders (2009) find

that investor properties are more likely to default. In this paper, we consider the following three

occupancy types: owner-occupied, second/vacation home, and investor property.




Mortgage brokerage type. Many popular media outlets have ascribed the subprime crisis to mortgage

brokers. Green, Rosenblatt and Yao (2010) have found that broker and correspondent loans have

higher default risk than retail loans. We therefore include brokerage type in our models.


Origination loan balance. The size of the loan is thought to be related to the transaction cost of

default (e.g. Clapp et al 2001, Deng and Gabriel 2006) and recent studies of subprime mortgage

default risk have found it to be significantly related to default (see, e.g. Demyanyk and Hemert

2008, Elul 2009).


Origination LTV. Some researchers believe that LTV at origination (or down payment) not

only affects the equity position of the borrower throughout the life of the loan, but also reveals

the borrower’s default propensity, indicates the borrower’s ability to save, or affects the borrower’s

default decision as a sunk cost (see Yezer, Phillips and Trost 1994, Deng, Quigley and Van

Order 1996, 2000, Kelly 2009, Green, Rosenblatt and Yao 2010). Additionally, lenders may exercise

different levels of due diligence on high LTV and low LTV loans. Therefore, in addition to

considering combined LTV in negative equity calculation, we include origination LTV.


Prepayment penalty. Most prime residential mortgage loans can be prepaid without penalty. By contrast,

many subprime loans have a prepayment penalty clause in the mortgage contract. Researchers

believe prepayment penalties limit the subprime borrower’s ability to refinance into more

affordable loans and thus increase the chance of default (e.g. Demyanyk and Hemert 2008, Elul

2009, Agarwal, Ambrose and Sanders 2009).


Unemployment rate. One possible cause of mortgage default is the borrower’s loss of a job and the resulting

inability to make the mortgage payment. Therefore, the mortgage default risk literature has

long included local area unemployment as a risk factor (see, e.g. Williams, Beranek and Kenkel


1974, Campbell and Dietrich 1983, Deng, Quigley and Van Order 2000). Recent research has

also found it to be a significant risk factor for subprime loans (e.g. Demyanyk and Hemert

2008, Elul 2009).


Excess premium. There is increasing evidence that mortgage lenders can possess private

information about borrower/loan quality that is not reflected in underwriting documents (see, e.g.

Elul 2009, Rajan, Seru and Vig 2009, An, Deng and Gabriel 2010, Keys, Mukherjee, Seru and

Vig 2010). Therefore, we include excess premium as a proxy for the lender’s private information

about loan quality. The variable is constructed as the residual of a mortgage spread regression

that includes all observable default risk factors on the right hand side.12
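
To illustrate how a residual-based variable of this kind is constructed, the sketch below regresses a mortgage spread on a single observable risk factor and keeps the residuals. It is deliberately simplified: our regression includes all observable default risk factors, whereas this example uses one regressor (LTV) and invented data. The function name `ols_residuals` is hypothetical.

```python
def ols_residuals(y, x):
    """Residuals from a one-regressor OLS fit of y on x with an intercept."""
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Hypothetical loans: spread in bps and a single observable risk factor (LTV).
spread_bps = [300.0, 340.0, 390.0, 450.0]
ltv = [70.0, 80.0, 90.0, 100.0]
excess_premium = ols_residuals(spread_bps, ltv)
print([round(e, 2) for e in excess_premium])
```

A positive residual means the loan was priced above what its observable risk factors would imply, which is the sense in which the variable may proxy for information the lender holds but the data do not record.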


Other variables. We also consider some other variables such as jumbo loan status, growth of per

capita disposable income and growth of population in the metro area, corporate credit spread,

and HPI volatility. They are not included in the final model because of multicollinearity problems.

Ideally, we would include borrower characteristics such as age, gender, ethnicity and profession,

number of dependents, and neighborhood variables such as whether the property is in central city,

neighborhood homeownership rate, poverty level, crime rate, percent of homes foreclosed, etc.

However, we do not have data on those variables.


3. Data


3.1 Data sources


Our data are mainly from First American CoreLogic LoanPerformance (hereafter LP). The LP

database contains loan-level data on over 80 percent of all securitized subprime mortgages,

which is also over half of all subprime mortgage loans originated in the US.
                                                            
12 The regression results are available upon request.

LP provides detailed information on each subprime mortgage loan, including note rate, original

loan balance, LTV, loan term (30 year, 15 year, etc.), loan type (fixed-rate, 5-1 ARM, etc.), loan

purpose (home purchase, rate/term refinance, cash out refinance), borrower credit score,

occupancy status, number of units, originator type (broker, retail lender, etc.), and prepayment

penalty type. LP also tracks the performance (default, prepayment, maturity, or current) of each

loan in every month. Therefore, we construct the event-history of each loan, from its

origination to default, prepayment, maturity, or our data collection point, whichever is earliest.
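
The event-history construction can be sketched as below, assuming months are represented as integer indices and that each loan has at most one terminal event. The record layout, the function name `build_event_history` and the status labels are hypothetical choices for illustration, not the LP data format.

```python
def build_event_history(orig_month, cutoff_month, terminal_month=None,
                        terminal_event=None):
    """Expand one loan into monthly records up to its first terminal event.

    Months are integer indices (e.g. months since some base date).
    """
    observed_exit = terminal_month is not None and terminal_month <= cutoff_month
    last_month = terminal_month if observed_exit else cutoff_month
    records = []
    for month in range(orig_month + 1, last_month + 1):
        is_exit = observed_exit and month == last_month
        records.append({
            "loan_age": month - orig_month,
            "calendar_month": month,
            "status": terminal_event if is_exit else "current",
        })
    return records


# A loan that defaults in month 5, and one still alive at the cutoff.
history = build_event_history(orig_month=0, cutoff_month=36,
                              terminal_month=5, terminal_event="default")
censored = build_event_history(orig_month=0, cutoff_month=36)
print(len(history), history[-1]["status"], len(censored))
```

Each record carries both loan age and calendar month, so time-varying covariates such as HPI growth or unemployment can later be merged in by calendar month.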


We also merge other information such as HPI growth, interest rate, MSA-level unemployment

rate and income growth into our loan level data. Treasury rates and interest rate swap rates are

matched into the data to calculate the mortgage spread. HPI is from Fannie Mae and is at the

zip code level. Treasury interest rates and corporate bond yields are from the Federal Reserve, and

MSA-level income growth and unemployment rate are from Moody’s Economy.com.


3.2 Sampling


The LP database contains about 14 million subprime mortgage loans. For our study purposes, we

focus on first-lien, fixed-rate mortgage loans, which are about 19 percent of all loans.13 We

further apply a number of filters: we first exclude loans originated before 1995 since LP has

relatively less accurate information about those loans; seasoned loans are excluded since

information such as loan balance and LTV of those loans is not measured at loan origination; we also

exclude those loans with interest only periods or those not in metropolitan areas (MSAs); loans

with missing or wrong information on property type, refinance indicator, occupancy status,

backend ratio, FICO score, documentation level or mortgage note rate are excluded.

                                                            
13 A large fraction of the subprime mortgage loans are ARMs, e.g. about 38 percent of the LP sample are 2/28
ARMs.

We further adopt a sampling technique for the purposes of our study: we select a 10% random

sample of loans from three vintages, namely those originated in 2000, 2003 and 2006. The numbers of

subprime mortgage loans in those three vintages are 8,533, 31,836 and 26,876, respectively.

Then for each vintage, we look at a three-year window of loan performance after loan origination.

For example, for loans originated in 2000, we focus on their performance in 2000, 2001 and 2002.

In so doing, we have three non-overlapping samples.
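
A minimal sketch of this sampling scheme follows. It assumes simple random sampling without replacement within a vintage and a performance window defined in calendar years, following the 2000 example above; the function names, the seed and the loan identifiers are hypothetical.

```python
import random


def sample_vintage(loan_ids, fraction=0.10, seed=0):
    """Draw a simple random sample (without replacement) of one vintage."""
    rng = random.Random(seed)          # fixed seed so the sketch is repeatable
    loan_ids = list(loan_ids)
    k = round(len(loan_ids) * fraction)
    return rng.sample(loan_ids, k)


def in_performance_window(orig_year, obs_year, window_years=3):
    """True if obs_year falls in the window starting at the origination year."""
    return orig_year <= obs_year < orig_year + window_years


sampled = sample_vintage(range(1000))
windows = {v: [y for y in range(2000, 2010) if in_performance_window(v, y)]
           for v in (2000, 2003, 2006)}
print(len(sampled), windows)
```

With three-year windows starting in 2000, 2003 and 2006, no calendar year is shared between vintages, which is what makes the three samples non-overlapping.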


3.3 Descriptive statistics


In Table 1, we report the performance of the three vintages of subprime mortgage loans. Default is

defined as delinquency of more than 90 days, and censored means that the loan is still alive at the end of the

three-year window. The default rate varies across the three vintages: the 2003 vintage has a

cumulative default rate of about 7 percent over the three-year window, in contrast to the 16

percent of the 2000 vintage and the 22 percent of the 2006 vintage. Apparently the strong house

price appreciation the 2003 vintage experienced during 2003-2005 helped most of the 2003

vintage loans stay current, while the 2001-2002 economic downturn and the sharp house price

decline starting from 2006 contributed to the high default rates of the 2000 and 2006 vintages.

Overall, default rates of all the three vintages of subprime mortgage loans are much higher than

that of prime mortgage loans as reported in previous studies (see, e.g., Philips, Rosenblatt and

VanderHoff 1995, Deng, Quigley and Van Order 2000, Clapp, Deng and An 2006).


Figure 1 compares the cumulative default rates of the three vintages over the life of the loan. In

every quarter after loan origination in the three-year window, the 2003 vintage has lower default

rate than the 2000 and 2006 vintages. The default rate of the 2006 vintage starts lower than that of the




2000 vintage but surpasses it one year after loan origination. Over 20

percent of loans originated in 2006 default within two years of origination.
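
Cumulative default rates of the kind plotted in Figure 1 can be computed as in the sketch below, which assumes the loan age at default (in months) is known for every defaulted loan in a vintage. The function name and the example inputs are hypothetical and are not our data.

```python
def cumulative_default_rates(default_ages_months, n_loans, n_quarters=12):
    """Share of the vintage that has defaulted by the end of each quarter."""
    rates = []
    for quarter in range(1, n_quarters + 1):
        defaulted = sum(1 for age in default_ages_months if age <= 3 * quarter)
        rates.append(defaulted / n_loans)
    return rates


# Ten hypothetical loans, three of which default at ages 2, 5 and 40 months.
rates = cumulative_default_rates([2, 5, 40], n_loans=10, n_quarters=4)
print(rates)
```

The denominator is the number of loans originated, so the curve is non-decreasing by construction and a default outside the window (age 40 here) never enters it.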


Table 2 compares the loan characteristics of the three vintages. The 2000 vintage has a much lower

average loan amount but a much higher average coupon rate. Mortgage spread is defined as the

difference between the mortgage coupon rate and the comparable-maturity Treasury rate14. The 2000

vintage has an average mortgage spread of 505 bps, while the 2003 vintage and the 2006 vintage

have average mortgage spreads of 339 bps and 343 bps, respectively. The relative

magnitude of the mortgage spreads of the 2000 and 2003 vintages is in line with the

aforementioned default rate differences between these two vintages. However, this risk-return

relationship does not hold when we compare the 2006 vintage with the 2003 vintage: while they

have similar average mortgage spreads, the 2006 vintage has a cumulative default rate over 3 times

that of the 2003 vintage over a three-year window. This finding is consistent with the so-called

“default shock”: lenders and investors found default rates several times higher than expected

during the housing and subprime mortgage crisis.
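
The spread definition used in Table 2 can be written out as a short sketch, using the benchmark mapping given in footnote 14 (10-year Treasury for 30-year FRMs, 7-year Treasury for 15-year FRMs). The function name and the rate values are hypothetical and are not taken from our data.

```python
def mortgage_spread_bps(coupon_pct, term_years, treasury_10y_pct, treasury_7y_pct):
    """Coupon rate minus the comparable-maturity Treasury rate, in basis points."""
    if term_years == 30:
        benchmark = treasury_10y_pct
    elif term_years == 15:
        benchmark = treasury_7y_pct
    else:
        raise ValueError("only 30-year and 15-year FRMs are handled in this sketch")
    return (coupon_pct - benchmark) * 100.0


# Hypothetical rates in percent.
spread_30 = mortgage_spread_bps(7.5, 30, treasury_10y_pct=4.0, treasury_7y_pct=3.5)
spread_15 = mortgage_spread_bps(7.0, 15, treasury_10y_pct=4.0, treasury_7y_pct=3.5)
print(spread_30, spread_15)
```

Matching each loan to a benchmark of comparable maturity keeps term-structure effects out of the spread, so differences across vintages are more readily interpreted as differences in priced risk.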


The average FICO score improves over time. In fact, the average FICO scores of the 2003 and 2006

vintages both exceed 620, the traditional FICO score cutoff for prime mortgage loans. This

pattern is consistent with anecdotal evidence that subprime lending was increasingly driven by

non-credit reasons as the market evolved. This observation is also supported by the increase in

borrowers with low/no documentation. In 2000, only 19 percent of subprime loans had

low/no doc, while the percentage of low/no doc loans increased to 32 and 28 percent, respectively, in

2003 and 2006.



                                                            
14 10-year Treasury rate for FRM 30 and 7-year Treasury rate for FRM 15.

Combined LTV also increases monotonically from 2000 to 2006. In fact, table 3 shows that the

2006 vintage has a substantially higher proportion of high LTV loans. Nearly 18 percent of loans

originated in 2006 have LTV higher than 97 percent, while that number is less than 3 percent for

the 2000 vintage. The proportion of less risky 15-year loans decreases over time. In 2000, 19 percent

of subprime FRMs were 15-year loans, while in 2006 this share was only 5 percent. Loan

purpose composition also varies over time. The share of rate/term refinance loans was 10

percent in 2000. It increased to 18 percent in 2003 and then fell back to 10 percent in 2006. We

also notice that a large proportion of the 2003 vintage were originated by mortgage brokers or

correspondent lenders. Prepayment penalties are prevalent in all three vintages.


Table 4 further presents a comparison of the time-varying covariates of the three vintage loans.

The most significant difference comes from HPI growth. The 2000 and 2003 vintages

experienced an average HPI growth of 7 percent and 14 percent, respectively. In contrast, the

2006 vintage had an average HPI decline of 4 percent. Correspondingly, the average negative

equity of the 2006 vintage is much higher than that of the 2000 and 2003 vintages. Both the

2000 and 2006 vintage loans experienced an average 1 percentage point increase in the

unemployment rate, while the 2003 vintage had a decline in the unemployment rate (improvement in

employment).


4. Model Estimation and Tests of Model Stability


4.1. Model estimation


Both the hazard model and the Logit model are estimated by maximum likelihood,

as discussed in Clapp, Deng and An (2006).




Table 5 reports our hazard model estimates on the three separate samples, which are constructed

based on the event-history data of the three vintage loans. The first column of the coefficients is

for the 2000 vintage sample. Most of the estimates conform to our expectations. For

example, default probability decreases with FICO score. The higher the original loan balance, the

lower the likelihood that the loan will default post-origination, everything else equal. Low/no

doc loans, investment property loans, and loans with prepayment penalty all have higher default

risk than their reference groups, respectively. 15-year FRM and condo loans have lower default

risk. Backend ratio is marginally significant with the expected sign of coefficient. Those loans

with original LTV higher than 80 percent do not show a significantly different default risk from

those with LTV lower than or equal to 80 percent. We do see a significant positive relationship

between negative equity and default probability – the larger the negative equity, the more likely

the loan will default. Interestingly, Excess premium is significantly related to default probability,

which supports the notion that loan originators do possess valuable private information regarding

loan default risk and they incorporate that information in loan pricing.


The 2003 vintage estimates show more significant default risk factors. For example, backend

ratio is now highly significant with the expected sign of coefficient. Loans with higher than 80

percent original LTV are riskier than those with original LTV less than or equal to 80 percent,

everything else equal. 2- to 4-unit property loans have higher risk than 1-unit loans. Both

rate/term refinance and cash out refinance loans are shown to be less risky than home purchase

loans. In addition, broker/correspondent loans tend to be riskier. Change in unemployment rate

becomes a significant risk factor with the expected impact. FICO score, log loan balance, low/no

doc, 15-year FRM, condo loan, investment property, Excess premium and negative equity




continue to be significant and have the same coefficient signs as those of the 2000 vintage

estimates.


The significant default risk factors of the 2006 vintage are similar to those of the 2003 vintage

except that LTV greater than 80 percent, condo loan, and broker/correspondent loan become

marginally significant. However, we notice that the magnitudes of many coefficients are very

different from those of the 2003 vintage estimates.


In table 6, we present our estimates of the Logit model. Here we are concentrating on default

probability and thus prepayment and censored observations are counted as non-default and a binary

Logit model is estimated. First, we notice that the estimates of all the three vintage models are

similar to those of the hazard model estimates reported in table 5. Second, comparing the

estimates across the three vintage samples, the patterns are also similar to those discussed

above.
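
As a rough illustration of the binary Logit setup described above, the sketch below codes default as 1 and both prepayment and censoring as 0, then fits the coefficients by Newton-Raphson maximum likelihood. The data are synthetic, and the covariate names (`fico_z`, `neg_equity`) and the helper `fit_logit` are hypothetical; this is not the authors' estimation code or model specification.

```python
import numpy as np

def fit_logit(X, y, n_iter=25):
    """Binary logit by Newton-Raphson maximum likelihood.

    Returns the coefficient vector and an estimated covariance matrix
    (inverse of the information matrix at the final iterate).
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        score = X.T @ (y - p)                        # gradient of the log-likelihood
        info = X.T @ (X * (p * (1.0 - p))[:, None])  # information matrix
        beta = beta + np.linalg.solve(info, score)
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    info = X.T @ (X * (p * (1.0 - p))[:, None])
    return beta, np.linalg.inv(info)

# Synthetic loan sample (illustrative only, not the paper's data).
rng = np.random.default_rng(0)
n = 5000
fico_z = rng.normal(size=n)       # standardized FICO score (assumed covariate)
neg_equity = rng.normal(size=n)   # standardized negative equity (assumed covariate)
X = np.column_stack([np.ones(n), fico_z, neg_equity])
true_beta = np.array([-2.0, -0.5, 0.8])
p_default = 1.0 / (1.0 + np.exp(-X @ true_beta))

# Each loan ends in default, prepayment, or censoring.
is_default = rng.random(n) < p_default
other = rng.choice(["prepay", "censor"], size=n)
status = np.where(is_default, "default", other)

# Prepayment and censored observations are both counted as non-default.
y = (status == "default").astype(float)

beta_hat, cov_hat = fit_logit(X, y)
print("coefficients:", beta_hat)
print("standard errors:", np.sqrt(np.diag(cov_hat)))
```

The covariance matrix returned here is the kind of input a cross-vintage parameter comparison would need alongside the coefficient estimates.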


4.2. Tests of parameter stability


To formally assess whether parameters estimated with the three separate samples are statistically

different, we conduct Wald tests as discussed in Andrews and Fair (1988). Basically, denote $\beta$

and $\beta^*$ as true parameters of any two models (based on two different vintage loans), and $\hat{\beta}$ and

$\hat{\beta}^*$ as their estimates. We test the following hypothesis:


$$H_0: \beta = \beta^* \tag{3}$$


The Wald statistic is:


$$W = (\hat{\beta} - \hat{\beta}^*)' \left[ \mathrm{var}(\hat{\beta}) + \mathrm{var}(\hat{\beta}^*) \right]^{-1} (\hat{\beta} - \hat{\beta}^*) \tag{4}$$


Under the null hypothesis, the Wald test statistic should be $\chi^2$ distributed with degrees of

freedom equal to the number of parameters in the model (number of rows in the first or third

matrix in equation (4)).
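
A minimal numerical sketch of the statistic in equation (4) follows. It reads the statistic as the quadratic form of the coefficient difference in the inverse of the summed covariance matrices, which presumes the two vintage samples are independent. The coefficient vectors and covariance matrices are made-up numbers, the helper name `wald_statistic` is hypothetical, and the joint two-parameter test shown is only an illustration of the mechanics.

```python
import numpy as np

def wald_statistic(beta_a, var_a, beta_b, var_b):
    """Wald statistic for H0: beta_a == beta_b, assuming independent samples."""
    diff = np.asarray(beta_a, dtype=float) - np.asarray(beta_b, dtype=float)
    summed = np.asarray(var_a, dtype=float) + np.asarray(var_b, dtype=float)
    # diff' (var_a + var_b)^{-1} diff, using solve() instead of an explicit inverse
    return float(diff @ np.linalg.solve(summed, diff))

# Hypothetical estimates for two vintages (two parameters each).
beta_2003 = [-0.5, 1.0]
beta_2006 = [-0.2, 3.0]
var_2003 = [[0.02, 0.0], [0.0, 0.25]]
var_2006 = [[0.02, 0.0], [0.0, 0.25]]

W = wald_statistic(beta_2003, var_2003, beta_2006, var_2006)

# 5 percent critical value of the chi-square distribution with 2 degrees of freedom.
CHI2_CRIT_5PCT_DF2 = 5.991
reject_equality = W > CHI2_CRIT_5PCT_DF2
print(f"W = {W:.2f}, reject H0 at 5%: {reject_equality}")
```

With diagonal covariance matrices, as in this toy input, the statistic reduces to the sum of squared coefficient differences, each divided by the sum of the two variances.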


Wald test results of the hazard model are reported in table 5 to the side of the estimates. Moving

from the 2000 vintage model to the 2003 vintage model, a number of parameters are statistically

different: default probability of the 2003 vintage is more sensitive than that of the 2000 vintage

to FICO score and log loan balance, as the magnitudes of those two coefficients are significantly

larger in the 2003 vintage model than in the 2000 vintage model; interestingly, higher than 80

percent LTV loans have significantly higher default risk in the 2003 vintage model but not in the

2000 vintage model. This may be due to the fact that when more subprime loans are available,

higher risk borrowers self-select into high LTV loans. Rate/term refinance and cash out refinance

also become significant in the 2003 vintage model, which could be due to relatively worse

performance of the home purchase loans in the 2003 vintage; however, three of the prepayment

penalty dummy variables become insignificant; finally, the sensitivity of default probability to

Excess premium declines significantly, which is consistent with findings in An, Yao and

Rosenblatt (2010) that Excess premium becomes less predictive of default possibly due to

subprime lenders’ decreasing effort to collect soft information when originate-to-distribute

becomes easier.


Comparing the parameters of the 2003 and 2006 models, again we see significant parameter

instability. The sensitivity of default probability to FICO score and change in unemployment rate

becomes smaller in the 2006 vintage model, while LTV greater than 80 percent, condo loan,
and broker/correspondent loan become insignificant in the 2006 vintage model. The most remarkable

changes come from log loan balance and negative equity. Log loan balance is negatively

associated with default probability in the 2003 vintage model but it becomes positively related to

default probability in the 2006 vintage model. The negative equity coefficient does not change sign

but the magnitude in the 2006 vintage model is more than three times higher than that in the

2003 vintage model. In other words, the 2006 vintage subprime borrowers are much more

sensitive to negative equity in their default decisions. This is in fact quite intuitive: some

borrowers might not choose to default when house prices are rising even if they have some

negative equity in their houses; but many borrowers might choose to default when house

prices are falling even if they have only small negative equity in their houses.


Wald test results on the Logit models are reported in table 6. They are very similar to the

aforementioned results on the hazard models. A number of parameters are unstable over time but

the greatest instability comes from the negative equity variable. Borrowers became much more

sensitive to negative equity (decline in house prices) in their default decisions during the crisis. Apparently,

when house prices dropped dramatically during the crisis, this increased sensitivity made things

worse, as it compounded the increase in negative equity to cause more defaults.


5. Default Shock and the Subprime Mortgage Crisis


Econometric default risk models rely heavily on historical data. Mortgage lenders and investors

typically use mortgage loan performance observed in previous periods to estimate how certain

default risk factors such as house price appreciation affect mortgage default probability and loss

severity. Such models are then used to predict future default losses under simulated paths of




house price appreciation.15 One can imagine that if models are unstable over time, even with the

most accurate predictions of the risk factor dynamics, default probability (loss) predictions will

be significantly off the target.


The subprime mortgage crisis is characterized by an unusually large fraction of subprime

mortgage loans originated during 2005-2007 turning into default during 2007-2009. This high

wave of default came as a shock to many lenders, investors and rating agencies. As shown in

the previous analysis, the 2006 vintage subprime mortgage loans were originated with mortgage

spreads very similar to those of the 2003 vintage; however the ex post default rate of the

2006 vintage is over three times higher than that of the 2003 vintage.


In this section, we conduct a simple econometric experiment to decompose this “default shock”,

which is to see how much of the default rate “surprise” is due to the unprecedented house price

drop (HPI input error) and how much of the surprise is due to the changing sensitivity of the

parameters (parameter instability).


Notice that the subprime mortgage market started to explode in 2003 while default rates of

subprime loans took off in 2006. Therefore, using the 2003 vintage model to predict defaults in the

2006 vintage data is an informative experiment regarding model instability. We obtain the

parameter estimates from the 2003 vintage sample and use them as default risk factor loadings16.

In the first experiment, we use the actual subsequent values of the default risk factors for the

2006 vintage loans, together with parameters estimated based on the 2003 vintage sample to

predict the default rate of the 2006 vintage. This experiment tells us how accurately we can predict

default probability if we have a perfect prediction of the default risk factors. Notice that
                                                            
15
    Those predictions together with scenario analysis and sensitivity analysis are then used to assist mortgage
underwriting, pricing and risk management.
16
    We set the insignificant parameters to zero because they are not statistically different from zero.

this is not a completely feasible forward-looking prediction but it separates the parameter

instability problem from the model input error problem.
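
The mechanics of this first experiment can be sketched as follows, under strong simplifying assumptions. The paper's hazard model is replaced here by a discrete-time logit hazard with no competing prepayment risk, insignificance is judged by a two-sided 5 percent z-test, and every number and name (`zero_insignificant`, `cumulative_default`, the coefficient values, the covariate path) is hypothetical. The sketch only shows how earlier-vintage loadings are applied to realized later-vintage covariates.

```python
import numpy as np

def zero_insignificant(beta, std_err, z_crit=1.96):
    """Set coefficients with |z| below z_crit to zero (cf. footnote 16)."""
    beta = np.asarray(beta, dtype=float)
    z = np.abs(beta / np.asarray(std_err, dtype=float))
    return np.where(z >= z_crit, beta, 0.0)

def cumulative_default(X_quarters, beta):
    """Cumulative default probability by quarter from per-quarter hazards."""
    hazard = 1.0 / (1.0 + np.exp(-(X_quarters @ beta)))
    survival = np.cumprod(1.0 - hazard)   # probability of no default so far
    return 1.0 - survival

# Hypothetical "2003 vintage" estimates: intercept, negative equity,
# change in unemployment rate.
beta_2003 = np.array([-4.0, 1.2, 0.05])
se_2003 = np.array([0.3, 0.2, 0.10])
loadings = zero_insignificant(beta_2003, se_2003)

# Realized covariate path of one hypothetical 2006 loan over four quarters.
X_2006 = np.array([
    [1.0, 0.00, 0.1],
    [1.0, 0.05, 0.3],
    [1.0, 0.10, 0.5],
    [1.0, 0.15, 0.8],
])

path = cumulative_default(X_2006, loadings)
print("loadings used:", loadings)
print("cumulative default probability by quarter:", np.round(path, 4))
```

Averaging such loan-level paths over a vintage would give a predicted cumulative default curve that can be set against the realized one.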


With both the hazard model and the Logit model, we make quarter-by-quarter predictions.

Figure 2 plots the predicted cumulative default rates by the two models in contrast with the

actual cumulative default rate. We see that while the two models have very similar predictions,

both substantially under-predict defaults. Table 7 presents the aggregate results. Again,

both the hazard model and the Logit model under-predict the default probability of the 2006

vintage loans. While the actual cumulative default rate of the 2006 vintage loans is 22.2 percent

in a three-year window, our hazard model prediction is only 13.3 percent and our Logit model

prediction is only 13.0 percent. Normalized by the actual default rate, we can see from figure 3

that the hazard model predicts about 40% fewer defaults than the actual results while the Logit

model predicts about 41% fewer defaults.
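
The normalization in the preceding paragraph is simple arithmetic, reproduced below with the three-year cumulative default rates quoted in the text. The function name `shortfall` is ours, introduced only for illustration.

```python
def shortfall(predicted_rate, actual_rate):
    """Fraction of actual defaults that the model fails to predict."""
    return 1.0 - predicted_rate / actual_rate

ACTUAL = 22.2                                  # actual cumulative default rate, percent
PREDICTED = {"hazard": 13.3, "logit": 13.0}    # model predictions, percent

for model, rate in PREDICTED.items():
    print(f"{model}: predicted/actual = {rate / ACTUAL:.1%}, "
          f"shortfall = {shortfall(rate, ACTUAL):.1%}")
# hazard: predicted/actual = 59.9%, shortfall = 40.1%
# logit: predicted/actual = 58.6%, shortfall = 41.4%
```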


Prior to the crisis, the predicted future house price path was probably much higher than the actual

subsequent price path. Therefore in our second experiment, in addition to using the 2003 vintage

model estimates to predict default of the 2006 vintage, we assume a naïve house price model –

one that predicts that the HPI growth rate in each zip code during 2006-2008 remains the same

as that of 2003-2005. In so doing, we are able to see the combined impact of parameter

instability and bad HPI forecast.


Again, table 8 presents the cumulative predicted defaults in a three-year window and compares

them to the actual figures. Both models predict a cumulative default rate of less than 11 percent while the actual

default rate is about 22 percent. Therefore, the combined impact of parameter instability and bad

HPI forecast is larger than the sole impact of parameter instability: the models predict over 50 percent


fewer defaults than the actual results. Stated differently, the actual default rate is more than twice

the predicted default rate. However, an interesting observation from a comparison of table

7 and table 8 is that the marginal impact of the bad HPI forecast is much smaller than that of the

parameter instability (prediction accuracy falls from 60 percent to 48 percent, compared with

the fall from 100 percent to 60 percent). This may help explain why the default wave came as a

“surprise” even though many lenders and investors conducted scenario analysis and some of

them might have already predicted much lower HPI growth for the 2006-2008 period – using a

wrong model is more detrimental than applying an unrealistic HPI growth forecast.
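
The comparison of marginal impacts can be laid out explicitly using the percentages shown in figure 3. The sequential split below (parameter instability first, then the HPI forecast error) follows the order of the two experiments; a different ordering could attribute the shortfall differently. The names `accuracy` and `decompose` are ours.

```python
# Predicted defaults as a percentage of actual defaults (figure 3).
accuracy = {
    "hazard": {"actual": 100, "actual_hpi": 60, "naive_hpi": 48},
    "logit":  {"actual": 100, "actual_hpi": 59, "naive_hpi": 49},
}

def decompose(acc):
    """Split the prediction shortfall into two sequential pieces (percentage points)."""
    parameter_instability = acc["actual"] - acc["actual_hpi"]
    hpi_forecast_error = acc["actual_hpi"] - acc["naive_hpi"]
    return parameter_instability, hpi_forecast_error

for model, acc in accuracy.items():
    instability, hpi_error = decompose(acc)
    print(f"{model}: parameter instability = {instability} points, "
          f"naive HPI forecast = {hpi_error} points")
# hazard: parameter instability = 40 points, naive HPI forecast = 12 points
# logit: parameter instability = 41 points, naive HPI forecast = 10 points
```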


6. Conclusions and Discussions


The subprime mortgage market experienced explosive growth in the early and

mid-2000s and then collapsed in 2007. During the past three years, massive defaults of subprime

mortgage loans have caused catastrophic losses in the financial markets. Much of the default loss

came as a shock to the investment community, as evidenced either by the non-proportionate

mortgage spreads charged by lenders at loan origination or by the large write-downs that

mortgage lenders and investors took on their mortgage assets during the crisis. This has

spurred retrospection on what went wrong with the risk management models. Following this

spirit, we investigate the stability of econometric default risk models and conduct econometric

experiments to examine to what extent the model instability explains the default shock.


Estimating separate hazard and Logit models for three vintage loans, all with a three-year

observation window, we find that the prevailing econometric mortgage default probability

models can be highly unstable over time. We find not only that default risk factors such as

HPI growth are significantly different across the three vintages, but also that the coefficients of a number of


variables, especially that of the negative equity variable, are significantly different in those three

vintage models. Comparing the 2003 vintage loans with the 2006 vintage loans, the 2003 vintage

experienced the largest house price run-up in history within three years of their

origination, while those loans originated in 2006 were exposed to an unprecedented house price

decline during 2006-2008. Meanwhile, the default probability of the 2006 vintage loans is over

three times more sensitive than that of the 2003 vintage to house price change.


Our simulation suggests that both the hazard model and the Logit model estimated with 2003

vintage data under-predict the default probability of the 2006 vintage loans. Assuming a perfect

forecast of HPI and other default risk factors, the hazard model predicts about 40 percent fewer

defaults than the actual results while the Logit model predicts about 41 percent fewer defaults.

When house price forecasting is not accurate, we see a more severe under-prediction. Assuming

a naïve house price prediction, the two econometric models under-predict the

default rate by over 50 percent.


The findings in this paper have a number of implications. First, we have to exercise extra caution

in interpreting and applying empirical results based on historical data, especially data that are

non-representative. The house price run-up during 2003-2006 was atypical. If we were just to

use data from the atypical period in default risk forecasting, we would obtain exceptional results,

as we show in this paper. It is definitely not an easy task to identify the non-representative data

ex ante. Remedies to that problem include using larger samples and longer histories, and

scrutinizing all the data we analyze. Second, judged by the aggregate post-sample prediction

accuracy, we need improvements in default risk models as well as house price forecasting

models. Certainly, the current paper does not explore the optimal specification within the current

hazard or Logit model framework and we do believe improvements can be made in that regard.

However, models with more “structural framework” may be more promising. For example, as

many people believe that we have had regime shifts in the mortgage and housing market, models

that can capture those regime shifts may help improve our ability to forecast mortgage default

risk. Third, default risk models can be misleading if used inappropriately and model risk has to

be understood in risk management operations. Model limitations may be masked by other factors

during normal times, but when there is structural change that leads to a different data generating

mechanism, model risk can become most significant and costly. Fourth, from a regulation

perspective, the Basel II regulation framework should be reformed to address the credit cycles

and avoid the pro-cyclicality of usual risk assessment models. Finally, economic capital is

important to mortgage bankers and to the investment community. In that regard, again a

technical problem will be how to get around the pro-cyclicality of usual risk management models.




References

Agarwal, Sumit, Brent W. Ambrose, Souphala Chomsisengphet and Anthony B. Sanders. 2009. The
 Neighbor’s Mortgage: Does Living in a Subprime Neighborhood Impact your Probability of Default?
 SSRN working paper.

Alexander, William P., Scott D. Grimshaw, Grant R. McQueen and Barrett A. Slade. 2002. Some Loans Are
 More Equal than Others: Third-Party Originations and Defaults in the Subprime Mortgage Industry. Real
 Estate Economics 30(4), 667-697.

Ambrose, B. W. and A. B. Sanders. 2003. Commercial Mortgage Backed Securities: Prepayment and Default.
 Journal of Real Estate Finance and Economics, 26(2/3): 179-196.

An, Xudong, John C. Clapp and Yongheng Deng. 2010. Omitted Mobility Characteristics and Property
 Market Dynamics: Application to Mortgage Termination. Journal of Real Estate Finance and Economics
 41(3).

An, Xudong, Yongheng Deng and Stuart A. Gabriel. 2010. Asymmetric Information, Adverse Selection, and
 the Pricing of CMBS. Journal of Financial Economics, forthcoming.

An, Xudong, Yongheng Deng and Anthony B. Sanders. 2009. Default Risk of CMBS Loans: What Explains
 the Regional Variations? National University of Singapore, IRES Working Paper 2009-009.

Andrews, Donald W. and Ray C. Fair. 1988. Inference in Nonlinear Econometric Models with Structural
 Change. Review of Economic Studies 55: 615-640.

Archer, W. R., D. C. Ling and G. A. McGill. 1996. The Effect of Income and Collateral Constraints on
 Residential Mortgage Terminations. Regional Science and Urban Economics, 26: 235-261.

Archer, W. R., D. C. Ling and G. A. McGill. 1997. Demographic Versus Option-Driven Mortgage
 Terminations. Journal of Housing Economics, 6(2): 137-163.

Calhoun, Charles, and Yongheng Deng. 2002. A Dynamic Analysis of Fixed- and Adjustable-Rate Mortgage
 Terminations. Journal of Real Estate Finance and Economics, 24: 9-33.

Campbell, T. and J. K. Dietrich. 1983. The Determinants of Default on Conventional Residential Mortgages.
 Journal of Finance, 38(5): 1569-1581.

Capozza, D. R., D. Kazarian, and T. A. Thomson. 1997. Mortgage Default in Local Markets. Real Estate
 Economics, 25(4): 631-655.

Capozza, D. R., D. Kazarian, and T. A. Thomson. 1998. The Conditional Probability of Mortgage Default.
 Real Estate Economics. 26(3): 359-390.

Chen, Jun and Yongheng Deng. 2003. Commercial Mortgage Workout Strategy and Conditional Default
 Probability: Evidence from Special Serviced CMBS Loans. USC Lusk Center for Real Estate Working
 Paper, 2003-1008.



Ciochetti, Brian A., Yongheng Deng, Gale Lee, James Shilling and Rui Yao. 2003. A Proportional
  Hazards Model of Commercial Mortgage Default with Originator Bias. Journal of Real Estate Finance
  and Economics 27(1), 5-23.

Clapp, John M., Yongheng Deng and Xudong An, 2006, Unobserved Heterogeneity in Models of Competing
  Mortgage Termination Risks, Real Estate Economics 34(2), 243-273.

Clapp, J. C., G. M. Goldberg, J. P. Harding and M. LaCour-Little. 2001. Movers and Shuckers:
  Interdependent Prepayment Decisions. Real Estate Economics, 29(3): 411-450.

Clauretie, T. M. 1987. The Impact of Interstate Foreclosure Cost Differences and the Value of Mortgages on
  Default Rates. Journal of the American Real Estate and Urban Economics Association, 15(3): 152-67.

Cunningham, D. F. and C. A. Capone, Jr. 1990. The Relative Termination Experience of Adjustable to Fixed-
 Rate Mortgages. Journal of Finance, 45(5): 1687-1703.

Deng, Yongheng, 1997. Mortgage Termination: An Empirical Hazard Model with Stochastic Term
 Structure. Journal of Real Estate Finance and Economics, 14 (3), 309-331.

Deng, Yongheng, John M. Quigley and Robert Van Order. 1996. Mortgage Default and Low Down-payment
 Loans: The Cost of Public Subsidy. Regional Science and Urban Economics, 26: 263-285.

Deng, Yongheng, John M. Quigley and Robert Van Order. 2000. Mortgage Terminations, Heterogeneity and
 the Exercise of Mortgage Options. Econometrica, 68(2): 275-307.

Deng, Yongheng and John M. Quigley. 2002. Woodhead Behavior and the Pricing of Residential Mortgages.
 Lusk Center for Real Estate Working Paper, No. 2001-1005.

Deng, Yongheng, and Stuart A. Gabriel. 2006. Risk-Based Pricing and the Enhancement of Mortgage Credit
 Availability among Underserved and Higher Credit-Risk Populations. Journal of Money, Credit and
 Banking, 38 (6), 1431-1460.

Demyanyk, Yuliya S. and Van Hemert, Otto. 2009. Understanding the Subprime Mortgage Crisis. Review of
 Financial Studies, forthcoming.

Elul, Ronel. 2009. Securitization and Mortgage Default: Reputation vs. Adverse Selection. SSRN working
  paper.

Episcopos, A., A. Pericli, and J. Hu. 1998. Commercial Mortgage Default: A Comparison of Logit with
 Radial Basis Function Networks. Journal of Real Estate Finance and Economics, 17(2):163-178.

Feldman, David and Shulamith Gross. 2005. Mortgage Default: Classification Trees Analysis. Journal of
  Real Estate Finance and Economics 30(4), 369-396.

Follain, J. and R. Struyk. 1977. Homeownership Effects of alternative Mortgage Instruments. Journal of the
  American Real Estate and Urban Economics Association, 5(1): 1-43.

Foster, C. and R. Van Order. 1984. An Option-Based Model of Mortgage Default. Housing Finance Review,
  3(4): 351-372.

Foster, C., and R. Van Order. 1985. FHA Terminations: A Prelude to Rational Mortgage Pricing. Journal of
  the American Real Estate and Urban Economics Association, 13:292-316.

Gerardi, Kristopher, Adam Hale Shapiro and Paul S. Willen. 2008. Subprime Outcomes: Risky Mortgages,
 Homeownership Experiences, and Foreclosures. Federal Reserve Bank of Boston working paper.

Green, J. and J. B. Shoven. 1986. The Effect of Interest Rates on Mortgage Prepayment. Journal of Money,
 Credit and Banking 18, 41-50.

Green, Richard K., Eric Rosenblatt and Vincent Yao. 2010. Sunk Costs and Mortgage Default. SSRN
 working paper.

Haughwout, Andrew, Ebiere Okah and Joseph Tracy. 2009. Second Chances: Subprime Mortgage
 Modification and Re-Default. Federal Reserve Bank of New York Staff Report.

Hendershott, P. H., and W. R. Schultz. 1993. Equity and Nonequity Determinants of FHA Single-Family
 Mortgage Foreclosures in 1980s. Journal of American Real Estate and Urban Economics Association.
 21(4): 405-430.

Herzog, J. and J. Earley. 1970. Home Mortgage Delinquency and Foreclosure. New York: National Bureau
 of Economic Research.

Jackson, J. and D. Kaserman. 1980. Default Risk on Home Mortgage Loans: A Test of Competing
  Hypotheses. Journal of Risk and Insurance, 4: 678-690.

Kau, J. B., D. C. Keenan and T. Kim. 1994. Default Probabilities for Mortgages. Journal of Urban
 Economics, 35: 278-296.

Kelly, Austin. 2009. Skin in the Game: Zero Down Payment Mortgage Default. Journal of Housing Research,
 17 (2), 75-99.

Keys, Benjamin, Tanmoy Mukherjee, Amit Seru and Vikrant Vig. 2010. Did Securitization Lead to Lax
 Screening? Evidence from Subprime Loans. Quarterly Journal of Economics 125 (1), 307-362.

Lucas, Robert. 1976. Econometric Policy Evaluation: A Critique. In Brunner, K. and A. Meltzer, The Phillips
 Curve and Labor Markets, Carnegie-Rochester Conference Series on Public Policy 1: 19-46. New York:
 Elsevier.

Mian, Atif and Amir Sufi. 2009. The Consequences of Mortgage Credit Expansion: Evidence from the U.S.
 Mortgage Default Crisis. Quarterly Journal of Economics, 124 (4), 1449-1496.

Morton, T. G. 1975. A Discriminant Function Analysis of Residential Mortgage Delinquency and
 Foreclosure. Journal of the American Real Estate and Urban Economics Association, 3(1): 73-90.

Pennington-Cross, Anthony. 2003. Credit History and the Performance of Prime and Nonprime Mortgages.
  Journal of Real Estate Finance and Economics 27(3), 279-301.

Philips, R.A., E. Rosenblatt and J.H. VanderHoff. 1996. The Probability of Fixed and Adjustable Rate
  Mortgage Termination. Journal of Real Estate Finance and Economics 13(2): 95–104.

Philips, R. A. and J. H. VanderHoff. 2004. The Conditional Probability of Foreclosure: An Empirical
  Analysis of Conventional Mortgage Loan Defaults. Real Estate Economics, 32(4): 571-587.

Quigley, John M. 1987. Interest Rate Variations, Mortgage Prepayments and Household Mobility. Review of
 Economics and Statistics 69(4), 636-643.

Quigley, John M. and Robert Van Order. 1991. Defaults on Mortgage Obligations and Capital Requirements
 for U.S. Savings Institutions: A Policy Perspective. Journal of Public Economics 44(3): 353-370.

Quigley, John M. and Robert Van Order. 1995. Explicit Tests of Contingent Claims Models of Mortgage
 Default. Journal of Real Estate Finance and Economics, 1(2): 99–117.

Rajan, Uday, Amit Seru and Vikrant Vig. 2010. Statistical Default Models and Incentives. American
 Economic Association Papers and Proceedings, 100 (2), 1-5.

Schwartz, Eduardo S. and Walter N. Torous. 1993. Mortgage Prepayment and Default Decisions: A Poisson
  Regression Approach. Journal of the American Real Estate and Urban Economics Association, 21(4): 431-
  449.

Vandell, K. D. 1978. Default Risk under Alternative Mortgage Instruments. Journal of Finance, 33(5): 1279–
 98.

Vandell, Kerry D. and T. Thibodeau. 1985. Estimation of Mortgage Defaults Using Disaggregate Loan History
 Data. Journal of the American Real Estate and Urban Economics Association. 13(3): 292-316.

Vandell, Kerry D. 1993. Handing Over the Keys: A Perspective on Mortgage Default Research. Journal of
 the American Real Estate and Urban Economics Association. 21, 211-246.

Van Order, Robert. 1990. The Hazards of Default. Secondary Mortgage Markets. 1990 (fall): 29-31.

von Furstenberg, G. 1969. Default Risk on FHA-Insured Home Mortgage as a Function of the Term of
  Financing: A Quantitative Analysis. Journal of Finance, 24(2): 459-77.

von Furstenberg, G. 1970a. Interstate Differences in Mortgage Renting Risks: An Analysis of Causes.
  Journal of Financial and Quantitative Analysis, 5: 229-42.

von Furstenberg, G. 1970b. The Investment Quality of Home Mortgages. Journal of Risk and Insurance, 37
  (3): 437-45.

von Furstenberg, G. and R.J. Green. 1974. Home Mortgages Delinquency: A Cohort Analysis. Journal of
  Finance, 29(4): 1545-48.

Webb, B.G. 1982. Borrower Risk under Alternative Mortgage Instruments. Journal of Finance, 37 (1): 169-
 83.

Williams, A. O., W. Beranek and J. Kenkel. 1974. Default Risk in Urban Mortgages: A Pittsburgh Prototype
 Analysis. Journal of the American Real Estate and Urban Economics Association, 2(2): 101-2.




Yezer, Anthony M. J., Robert F. Phillips and Robert P. Trost. 1994. Bias in Estimates of Discrimination and
 Default in Mortgage Lending: The Effects of Simultaneity and Self-Selection. Journal of Real Estate
 Finance and Economics 9, 197-215.

Zorn, Peter and Michael Lea. 1989. Mortgage Borrower Repayment Behavior: A Microeconomic Analysis
 with Canadian Adjustable Rate Mortgage Data. Journal of the American Real Estate and Urban Economics
 Association, 17(1): 118-136.




[Figure 1 plot omitted. Vertical axis: percentage (0 to 25); horizontal axis: loan age in quarters (1 to 12); series: 2000 vintage, 2003 vintage, 2006 vintage.]

Figure 1: Cumulative default rates of the three vintage loans




[Figure 2 plot omitted. Vertical axis: percentage default (0 to 25); horizontal axis: loan age in quarters (1 to 11); series: actual cumulative default rate, hazard model prediction, Logit model prediction.]

Figure 2: Predicted cumulative default rates of the 2006 vintage loans




[Bar chart. Bar heights are model predicted defaults as a percentage of actual defaults.]

                                     Hazard model    Logit model
    Actual default                        100            100
    Prediction with actual HPI             60             59
    Prediction with naïve HPI              48             49

Figure 3: Model predicted defaults as a percentage of actual defaults, 2006 vintage loans




Table 1: Performances of the three vintage loans
                           2000                          2003                           2006
                  Number      Percentage          Number    Percentage           Number    Percentage
    Default        1,396        16.36              2,255       7.08               5,969       22.21
    Prepayment     2,571        30.13             14,440      45.36               4,868       18.11
    Censor         4,566        53.51             15,141      47.56              16,039       59.68
    Total          8,533      100.00              31,836     100.00              26,876      100.00
Note: Default is defined as delinquency of more than 90 days. Censor means that the loan is still alive at
the data cutoff point, which is 2002Q4, 2005Q4 and 2008Q4 for the three vintages, respectively.




Table 2: Comparison of the loan characteristics of the three vintages

                                               Mean                               Standard Deviation
                                  2000      2003         2006             2000        2003              2006
    Original loan amount ($)    90,568    161,410      170,762          74,401      105,792         119,415
    Coupon rate                    0.11      0.07         0.08             0.02        0.01           0.01
    Mortgage spread (%)            5.05      3.39         3.43             1.58        1.21           1.29
    FICO score                     602        638          626           63.88       64.45           61.62
    Backend ratio                  0.38      0.38         0.39             0.11        0.10           0.10
    Combined LTV                 76.28     78.93        79.38            15.55       15.24           17.42
    LTV>80%                        0.37      0.42         0.36             0.48        0.49           0.48
    Low/No doc                     0.19      0.32         0.28             0.39        0.47           0.45
    Jumbo size loan                0.03      0.07         0.04             0.17        0.26           0.20
    30-year FRM                    0.81      0.89         0.95             0.39        0.31           0.23
    15-year FRM                    0.19      0.11         0.05             0.39        0.31           0.23
    1-unit property                0.90      0.89         0.92             0.30        0.31           0.27
    2- to 4-unit property          0.07      0.06         0.04             0.25        0.24           0.19
    Condo                          0.03      0.05         0.04             0.18        0.21           0.20
    Rate/term refinance            0.10      0.15         0.10             0.30        0.36           0.30
    Cash out refinance             0.65      0.65         0.67             0.48        0.48           0.47
    Home purchase                  0.25      0.20         0.22             0.44        0.40           0.42
    Owner-occupied home            0.88      0.89         0.92             0.33        0.31           0.27
    Second/vacation home           0.01      0.01         0.01             0.10        0.09           0.10
    Investment property            0.11      0.10         0.07             0.32        0.30           0.26
    Broker/correspondent loan      0.04      0.19         0.06             0.20        0.39           0.23
    Retail loan                    0.03      0.10         0.02             0.17        0.30           0.13
    Prep penalty 1-year            0.05      0.08         0.05             0.21        0.26           0.23
    Prep penalty 2-year            0.02      0.04         0.03             0.12        0.19           0.18
    Prep penalty 3-year            0.21      0.50         0.54             0.41        0.50           0.50
    Prep penalty over 3-year       0.26      0.08         0.10             0.44        0.28           0.30
    Number of loans              8,533     31,836       26,876

Table 3: Combined LTV distributions of the three vintages
                                                        Combined LTV (%)
                 (0,60)   [60,70)   [70,75)   [75,80)   [80,85)   [85,90)   [90,95)   [95,97)   [97,100)   [100,∞)     Total
     2000        16.45    13.08     11.16     23.39     13.08     14.33      5.06      1.00      1.89       0.56     100.00
     2003        12.30    12.85      8.90     21.46     10.74     16.92      8.61      0.34      7.65       0.23     100.00
     2006        12.44    11.10      7.64     18.11     10.09     14.62      8.06      0.31     17.51       0.12     100.00




Table 4: Comparison of the time-varying covariates of the three vintages

                                              Mean                  Std Dev                 Minimum                 Maximum
                                      2000    2003    2006    2000   2003   2006    2000    2003    2006    2000   2003   2006
HPI growth since origination          0.07    0.14   -0.04    0.07   0.14   0.13   -0.15   -0.16   -1.07    0.66   0.79   0.37
Contemporaneous negative equity      -0.48   -0.62   -0.32    1.16   0.73   0.74  -58.91  -27.16  -56.15    0.21   0.13   0.65
MSA-level income growth               0.01    0.02    0.01    0.03   0.04   0.02   -0.16   -1.52   -0.10    0.21   0.40   0.16
Change in MSA-level unemployment
  rate (percentage point)             0.01   -0.01    0.01    0.01   0.01   0.01   -0.12   -0.11   -0.10    0.08   0.15   0.14
Observations (loan-quarters)        62,025 213,701 192,940
Note: Negative equity is calculated with the contemporaneous house value (based on zip code level HPI) and the market value of the mortgage
loan outstanding. See Clapp, Deng and An (2006) for more details.




Table 5: Hazard model parameter estimates and Wald test results of the three vintage loans
                                             Coefficient (S.E.)                   Wald Statistics
                                     2000          2003             2006      2000-2003    2003-2006
    FICO score                     -0.548***     -0.719***        -0.627***   17.49***       9.37**
                                   (0.032)       (0.026)          (0.016)
    Backend ratio                   0.052         0.079***         0.068***    0.62           0.21
                                   (0.027)       (0.022)          (0.014)
    Log of original loan balance   -0.074*       -0.213***         0.031*     12.13***       72.19***
                                   (0.031)       (0.025)          (0.015)
    LTV>80%                        -0.096         0.321***         0.042      24.96***       23.03***
                                   (0.066)       (0.05)           (0.029)
    Low/No doc                      0.324***      0.451***         0.448***    2.22           0.00
                                   (0.071)       (0.048)          (0.029)
    15-year FRM                    -0.532***     -0.511***        -0.327***    0.03           2.38
                                   (0.095)       (0.083)          (0.086)
    2- to 4-unit property           0.196         0.203*           0.138*      0.00           0.32
                                   (0.106)       (0.091)          (0.069)
    Condo                          -0.399*       -0.579***        -0.098       0.55           9.12**
                                   (0.197)       (0.145)          (0.066)
    Rate/term refinance             0.009        -0.362***        -0.485***    9.14**         1.97
                                   (0.1)         (0.071)          (0.05)
    Cash out refinance             -0.113        -0.37***         -0.427***    8.95**         0.83
                                   (0.067)       (0.054)          (0.032)
    Second/vacation home           -0.178        -0.281            0.185       0.06           2.17
                                   (0.306)       (0.29)           (0.124)
    Investment property             0.446***      0.277***         0.413***    2.26           2.24
                                   (0.085)       (0.074)          (0.052)
    Broker/correspondent loan       0.118         0.115*          -0.097       0.00           7.39**
                                   (0.126)       (0.055)          (0.056)
    Prep penalty 1-year             0.207         0.192*           0.275***    0.01           0.55
                                   (0.128)       (0.094)          (0.06)
    Prep penalty 2-year             0.447*       -0.098            0.079       5.75*          1.7
                                   (0.195)       (0.117)          (0.069)
    Prep penalty 3-year             0.177*       -0.049           -0.007       7.01**         0.53
                                   (0.071)       (0.049)          (0.033)
    Prep penalty over 3-year        0.154*       -0.082            0.076       4.89*          2.71
                                   (0.067)       (0.083)          (0.048)
    Excess premium                  0.423***      0.3***           0.233***   14.32***        9.65**
                                   (0.027)       (0.018)          (0.013)
    Negative equity                 0.281**       0.247***         0.78***     0.13          75.12***
                                   (0.087)       (0.042)          (0.045)
    Change in unemployment rate     0.041         0.194***         0.138***   17.97***       10.4**
                                   (0.034)       (0.011)          (0.014)
    N                               62,025        213,701         192,940
    -2LogL                          23,535         42,941         113,793
Note: *, ** and *** indicate significant at 0.05, 0.01 and 0.001 level, respectively. The baseline estimates
are not shown in this table.
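
The Wald statistics in Table 5 can be approximated from the reported coefficients and standard errors. The sketch below is an illustration rather than the authors' procedure: it assumes the two vintage samples are independent, so the test of equal coefficients reduces to the squared difference divided by the sum of the squared standard errors, referred to a chi-square distribution with one degree of freedom. The function names are hypothetical. Because the inputs are the rounded values printed in the table, the results differ slightly from the published statistics (for example, roughly 9.1 against the reported 9.37 for FICO score, 2003-2006).

```python
import math


def wald_equal_coef(b1, se1, b2, se2):
    """Wald statistic for H0: b1 == b2, assuming the two estimates
    come from independent samples (an assumption of this sketch)."""
    return (b1 - b2) ** 2 / (se1 ** 2 + se2 ** 2)


def chi2_1df_pvalue(w):
    """Upper-tail p-value of a chi-square(1) variate.
    For Z ~ N(0, 1), P(Z^2 > w) = erfc(sqrt(w / 2))."""
    return math.erfc(math.sqrt(w / 2.0))


# Rounded hazard model estimates from Table 5, 2003 vs 2006 vintage.
w_fico = wald_equal_coef(-0.719, 0.026, -0.627, 0.016)
w_neg_eq = wald_equal_coef(0.247, 0.042, 0.780, 0.045)

for name, w in [("FICO score", w_fico), ("Negative equity", w_neg_eq)]:
    print(f"{name}: W = {w:.2f}, p = {chi2_1df_pvalue(w):.4g}")
```

Under this reading, the significance stars follow the chi-square(1) critical values of about 3.84, 6.63 and 10.83 for the 0.05, 0.01 and 0.001 levels.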




Table 6: Logit model parameter estimates and Wald test results of the three vintage loans

                                             Coefficient (S.E.s)                   Wald Statistics
                                     2000           2003             2006      2000-2003    2003-2006
    FICO score                     -0.561***     -0.733***         -0.649***   16.91***       7.44**
                                   (0.033)       (0.026)           (0.017)
    Backend ratio                   0.053         0.082***          0.07***     0.65         0.2
                                   (0.028)       (0.022)           (0.014)
    Log of original loan balance   -0.077*       -0.218***          0.032*     12.06***     72.95***
                                   (0.032)       (0.025)           (0.015)
    LTV>80%                        -0.096         0.332***          0.043      25.53***     23.8***
                                   (0.068)       (0.051)           (0.03)
    Low/No doc                      0.333***      0.46***           0.462***    2.14         0.00
                                   (0.072)       (0.048)           (0.03)
    15-year FRM                    -0.541***     -0.528***         -0.332***    0.01         2.65
                                   (0.097)       (0.084)           (0.087)
    2- to 4-unit property           0.201         0.2*              0.142*      0.00         0.25
                                   (0.109)       (0.092)           (0.072)
    Condo                          -0.405*       -0.582***         -0.098       0.51         8.99**
                                   (0.199)       (0.146)           (0.068)
    Rate/term refinance             0.01         -0.361***         -0.503***    8.78**       2.55
                                   (0.102)       (0.073)           (0.052)
    Cash out refinance             -0.115        -0.375***         -0.445***    8.87**       1.17
                                   (0.068)       (0.055)           (0.033)
    Second/vacation home           -0.183        -0.283             0.201       0.05         2.29
                                   (0.31)        (0.292)           (0.129)
    Investment property             0.46***       0.278***          0.429***    2.51         2.64
                                   (0.087)       (0.076)           (0.053)
    Broker/correspondent loan       0.119         0.117*           -0.1         0.00         7.37**
                                   (0.129)       (0.056)           (0.058)
    Prep penalty 1-year             0.207         0.202*            0.285***    0.00         0.53
                                   (0.13)        (0.096)           (0.063)
    Prep penalty 2-year             0.453*       -0.103             0.093       5.72*        2.01
                                   (0.2)         (0.119)           (0.072)
    Prep penalty 3-year             0.177*       -0.055            -0.007       7.05**       0.65
                                   (0.072)       (0.049)           (0.034)
    Prep penalty over 3-year        0.154*       -0.081             0.079       4.68*        2.68
                                   (0.069)       (0.084)           (0.05)
    Excess premium                  0.433***      0.31***           0.242***   13.93***      9.18**
                                   (0.028)       (0.018)           (0.013)
    Negative equity                 0.285**       0.24***           0.792***    0.21        78.43***
                                   (0.088)       (0.042)           (0.046)
    Change in unemployment rate     0.041         0.221***          0.147***   22.8***      13.93***
                                   (0.035)       (0.014)           (0.014)
    N                               62,025        213,701          192,940
    -2LogL                          12,470         22,609           48,759
Note: *, ** and *** indicate significant at 0.05, 0.01 and 0.001 level, respectively. The baseline estimates
are not shown in this table.




Table 7: Impact of parameter instability on default prediction

                               Hazard model prediction                 Logit Model prediction
                               Number         Percentage           Number          Percentage
    Predicted default           3,579              13.32            3,507              13.05
    Actual default              5,969              22.21            5,969              22.21
    Sample size                26,876            100.00            26,876             100.00
Note: These are cumulative (predicted and actual) defaults of the 2006 vintage loans. The prediction is
based on the model estimated with 2003 vintage data and the actual realization of the 2006 vintage
covariates.
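
To make the out-of-sample exercise in the note above concrete, the following sketch shows one way coefficients from an earlier vintage could be applied to a later vintage's loan-quarter covariates and aggregated into a predicted number of defaults. It is a simplified illustration under stated assumptions, not the authors' implementation: it uses a discrete-time logit hazard, ignores prepayment as a competing risk, and all names (`predict_defaults`, `beta`, `baseline`) are hypothetical.

```python
import math


def logistic(z):
    """Logit link: maps a linear index to a per-quarter default probability."""
    return 1.0 / (1.0 + math.exp(-z))


def cumulative_default_prob(hazards):
    """Probability of defaulting in any quarter, given per-quarter hazards.
    Simplification: prepayment is not modelled as a competing risk."""
    survival = 1.0
    for h in hazards:
        survival *= 1.0 - h
    return 1.0 - survival


def predict_defaults(loans, beta, baseline):
    """Expected number of defaults in a pool.

    loans    : list of loans, each a list of covariate vectors (one per quarter)
    beta     : coefficients estimated on the earlier vintage (hypothetical here)
    baseline : baseline index for each loan age (hypothetical here)
    """
    total = 0.0
    for quarters in loans:
        hazards = [
            logistic(baseline[age] + sum(b * x for b, x in zip(beta, xs)))
            for age, xs in enumerate(quarters)
        ]
        total += cumulative_default_prob(hazards)
    return total


# Toy pool: two loans observed for two quarters, one covariate each.
toy_loans = [[[0.2], [0.5]], [[-0.1], [0.0]]]
expected = predict_defaults(toy_loans, beta=[0.8], baseline=[-3.0, -2.5])
print(f"Expected defaults in toy pool: {expected:.3f}")
```

Comparing such an expected count with the realized count, as Table 7 does, isolates the effect of the coefficients because the covariates fed to the model are the actual 2006 vintage realizations.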




Table 8: Combined impact of parameter instability and HPI input error on default
prediction

                             Hazard model prediction               Logit Model prediction
                              Number        Percentage           Number          Percentage
    Predicted default          2,852           10.61              2,931             10.91
    Actual default             5,969           22.21              5,969             22.21
    Sample size               26,876          100.00             26,876           100.00
Note: These are cumulative (predicted and actual) defaults of the 2006 vintage loans. The prediction is
based on the model estimated with 2003 vintage data and assumes that the 2006 vintage loans experience the
same zip code-level HPI growth during 2006-2008 as the 2003 vintage loans did during 2003-2005.
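
The predicted and actual counts in Tables 7 and 8 can be combined into a simple sequential decomposition of the shortfall. The sketch below is illustrative only: it attributes the gap between actual defaults and the prediction with actual HPI to parameter instability, and the further gap when naïve HPI is used to the HPI forecast error. The function name and the ordering of the decomposition are assumptions of this sketch; a different ordering could give different shares.

```python
ACTUAL_DEFAULTS = 5969

# Predicted defaults of the 2006 vintage from the 2003-vintage models.
PRED_ACTUAL_HPI = {"hazard": 3579, "logit": 3507}  # Table 7
PRED_NAIVE_HPI = {"hazard": 2852, "logit": 2931}   # Table 8


def decompose_shortfall(actual, with_actual_hpi, with_naive_hpi):
    """Split the under-prediction (in percent of actual defaults) into the
    part that remains with perfect HPI foresight and the extra part that
    appears when the naive HPI path is used instead."""
    share_actual_hpi = 100.0 * with_actual_hpi / actual
    share_naive_hpi = 100.0 * with_naive_hpi / actual
    return {
        "parameter_instability": 100.0 - share_actual_hpi,
        "hpi_forecast_error": share_actual_hpi - share_naive_hpi,
        "total_shortfall": 100.0 - share_naive_hpi,
    }


results = {
    model: decompose_shortfall(
        ACTUAL_DEFAULTS, PRED_ACTUAL_HPI[model], PRED_NAIVE_HPI[model]
    )
    for model in PRED_ACTUAL_HPI
}

for model, parts in results.items():
    print(model, {k: round(v, 1) for k, v in parts.items()})
```

For both models the first component is the larger one (about 40 percentage points against roughly 10 to 12), which matches the percentages plotted in Figure 3.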



