     Estimating the Predictive Distribution for Loss Reserve Models


             Glenn Meyers
     Casualty Loss Reserve Seminar
         September 12, 2006
           Objectives of Paper
• Develop a methodology for predicting the
  distribution of outcomes for a loss reserve model.
• The methodology will draw on the combined
  experience of other “similar” insurers.
  – Use Bayes’ Theorem to identify “similar” insurers.
• Illustrate the methodology on Schedule P data for
  several insurers.
• Test the predictions of the methodology with data
  from later Schedule P reports.
• Compare results with reported reserves.
A Quick Description of the Methodology

• Expected loss is predicted by chain
  ladder/Cape Cod type formula
• The distribution of the actual loss around
  the expected loss is given by a collective
  risk (i.e. frequency/severity) model.
 A Quick Description of the Methodology

• The first step in the methodology is to get the
  maximum likelihood estimates of the model
  parameters for several large insurers.
• For an insurer’s data
  – Find the likelihood (probability of the data) given
    the parameters of each model in the first step.
  – Use Bayes’ Theorem to find the posterior
    probability of each model in the first step given
    the insurer’s data.
A Quick Description of the Methodology

• The predictive loss model is a mixture of
  each of the models from the first step,
  weighted by its posterior probability.
• From the predictive loss model, one can
  calculate ranges or statistics of interest
  such as the standard deviation or various
  percentiles of the predicted outcomes.
                  The Data
• Commercial Auto Paid Losses from 1995
  Schedule P (from AM Best)
  – Long enough tail to be interesting, yet we
    expect minimal development after 10 years.
• Selected 250 Insurance Groups
  – Exposure in all 10 years
  – Believable payment patterns
  – Set negative incremental losses equal to zero.
• 16 insurer groups account for one half of the premium volume.
 Look at Incremental Development Factors

• Accident year 1986
• Proportion of loss paid in the “Lag”
  development year
• Divided the 250 Insurers into four industry
  segments, each accounting for about 1/4
  of the total premium.
• Plot the payment paths
Incremental Development Factors - 1986


• Incremental development factors appear to be relatively stable for the
  40 insurers that represent about 3/4 of the premium.
• They are highly unstable for the 210 insurers that represent about 1/4
  of the premium.
• The variability appears to increase as size decreases.
           Expected Loss Model

   E[\text{Paid Loss}_{AY,Lag}] = \text{Premium}_{AY} \times \text{ELR} \times \text{Dev}_{Lag}

• Paid Loss is the incremental paid loss in the AY and Lag
• ELR is the Expected Loss Ratio
• ELR and Dev_{Lag} are unknown parameters
   – Can be estimated by maximum likelihood
   – Can be assigned posterior probabilities for Bayesian analysis
• Similar to “Cape Cod” method in that the expected loss
  ratio is estimated rather than determined externally.
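
The expected loss formula above can be sketched in a few lines. This is a minimal illustration only: the function name, the dictionary layout, and the premium, ELR, and development values are hypothetical and are not taken from Schedule P.

```python
def expected_paid_loss(premium_by_ay, elr, dev_by_lag):
    """Return {(ay, lag): Premium_AY * ELR * Dev_Lag} for every cell."""
    return {
        (ay, lag): premium * elr * dev
        for ay, premium in premium_by_ay.items()
        for lag, dev in dev_by_lag.items()
    }

# Hypothetical inputs for illustration only.
premium = {1: 1000.0, 2: 1100.0}
dev = {1: 0.25, 2: 0.30, 3: 0.45}   # incremental factors, summing to 1
expected = expected_paid_loss(premium, elr=0.70, dev_by_lag=dev)
```

Because the incremental factors sum to 1 in this example, the expected losses for an accident year add up to Premium × ELR.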
      Distribution of Actual Loss
      around the Expected Loss
• Compound Negative Binomial Distribution (CNB)
  – Conditional on Expected Loss – CNB(x | E[Paid Loss])
  – Claim count is negative binomial
  – Claim severity distribution determined externally
• The claim severity distributions were derived from
  data reported to ISO. Policy Limit = $1,000,000
  – Vary by settlement lag. Later lags are more severe.
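
A compound negative binomial loss for one cell can be simulated as a rough check on intuition. The sketch below makes several assumptions that are not in the slides: the claim count has mean λ = expected loss / mean severity and variance λ + cλ² for a hypothetical "contagion" parameter c, and the severity distribution is a small set of equally likely claim sizes standing in for the ISO-based distributions (any policy limit would already be reflected in those sizes).

```python
import numpy as np

def simulate_cnb(expected_loss, severities, contagion, n_sims, rng):
    """Simulate compound negative binomial losses for one (AY, Lag) cell.

    severities: 1-D array of equally likely claim sizes (a stand-in for
    the externally determined severity distribution).
    contagion: hypothetical parameter c with Var[N] = mean + c * mean**2.
    """
    mean_count = expected_loss / severities.mean()
    n = 1.0 / contagion                          # NumPy's (n, p) parameters
    p = 1.0 / (1.0 + contagion * mean_count)
    counts = rng.negative_binomial(n, p, size=n_sims)
    totals = np.zeros(n_sims)
    for i, k in enumerate(counts):
        # Sum k independent claim sizes; an empty draw sums to zero.
        totals[i] = rng.choice(severities, size=k).sum()
    return totals
```

The simulated mean should be close to the expected loss passed in, since E[N] × E[X] equals the expected loss by construction.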
Claim Severity Distributions
[Figure: claim severity distributions by settlement lag, with curves labeled Lag 1, Lag 2, Lag 3, Lag 4, and Lags 5-10]
 Likelihood Function for a Given
    Insurer’s Losses – \{x_{AY,Lag}\}

   \text{Likelihood}(\{x_{AY,Lag}\}) = \prod_{AY=1}^{10} \prod_{Lag=1}^{11-AY} \text{CNB}(x_{AY,Lag} \mid E[\text{Paid Loss}_{AY,Lag}])

                           where

   E[\text{Paid Loss}_{AY,Lag}] = \text{Premium}_{AY} \times \text{ELR} \times \text{Dev}_{Lag}
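
The double product over the observed triangle can be sketched as a sum of log-likelihood terms. The CNB density has no simple closed form, so this sketch takes a caller-supplied `cell_logpdf(x, expected)` function; that function, like the other names here, is a hypothetical stand-in and not part of the original presentation.

```python
def triangle_loglik(paid, premium, elr, dev, cell_logpdf):
    """Sum log-likelihood contributions over the observed 10-year triangle.

    paid[(ay, lag)] holds incremental paid losses; cell_logpdf(x, expected)
    is a caller-supplied log-density (hypothetical stand-in for the CNB).
    """
    total = 0.0
    for ay in range(1, 11):
        for lag in range(1, 12 - ay):            # lag runs from 1 to 11 - AY
            expected = premium[ay] * elr * dev[lag]
            total += cell_logpdf(paid[(ay, lag)], expected)
    return total
```

Working with logs avoids numerical underflow when 55 small densities are multiplied together.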
                    
Maximum Likelihood Estimates of
Incremental Development Factors

• Loss development factors reflect the constraints on the MLEs described
  in the prior slide.
• Contrast this with the observed 1986 loss development factors on the
  next slide.
Incremental Development Factors - 1986
          (Repeat of Earlier Slide)


• Loss payment factors appear to be relatively stable for the 40 insurers
  that represent about 3/4 of the premium.
• They are highly unstable for the 210 insurers that represent about 1/4
  of the premium.
• The variability appears to increase as size decreases.
Maximum Likelihood Estimates of
     Expected Loss Ratios




• Estimates of the ELRs are more volatile for the smaller insurers.
        Using Bayes’ Theorem

• Let W = {ELR, Dev_{Lag}, Lag = 1,2,…,10} be
  a set of models for the data.
  – A model may consist of different “models” or
    of different parameters for the same “model.”
• For each model in W, calculate the
  likelihood of the data being analyzed.

              \Pr\{\text{data} \mid \text{model}\}
        Using Bayes’ Theorem
• Then using Bayes’ Theorem, calculate the
  posterior probability of each parameter set
  given the data.

   \text{Posterior}\{\text{model} \mid \text{data}\} \propto \Pr\{\text{data} \mid \text{model}\} \times \text{Prior}\{\text{model}\}
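
The proportionality above becomes a probability once the products are normalized to sum to 1. A minimal sketch, assuming the likelihoods are available as logs (the function name and inputs are illustrative); subtracting the largest log weight before exponentiating guards against underflow:

```python
import math

def posterior_probs(log_likelihoods, priors):
    """Posterior{model_i | data} from log Pr{data | model_i} and priors."""
    log_weights = [ll + math.log(p) for ll, p in zip(log_likelihoods, priors)]
    largest = max(log_weights)
    weights = [math.exp(lw - largest) for lw in log_weights]
    total = sum(weights)
    return [w / total for w in weights]
```

For example, with equal priors, a model whose likelihood is three times that of another receives three times its posterior probability.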
 Prior Distribution of
Loss Payment Paths

• Prior loss payment paths come from the loss development paths of the
  insurers ranked 1-40, with equal probability.
• Posterior loss payment path is a mixture of prior loss development
  paths.
 Prior Distribution of
Expected Loss Ratios

• The prior distribution of expected loss ratios was chosen by visual
  inspection.
 Predicting Future Loss Payments
      Using Bayes’ Theorem
• For each model, estimate the statistic of choice,
  S, for future loss payments.
• Examples of S
  – Moments of future loss payments.
  – The probability density of a future loss payment of x.
  – The cumulative probability, or percentile, of a future
    loss payment of x.
• These examples can apply to single (AY,Lag)
  cells, or any combination of cells such as a given
  Lag or accident year.
  – Use FFTs to calculate distribution of sum of cells
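
The FFT step for a sum of cells can be sketched as a convolution of two discretized loss distributions. This sketch assumes the two cells are independent given the model and that both distributions sit on the same equally spaced loss grid; neither assumption is stated in the slides, and the function name is illustrative.

```python
import numpy as np

def convolve_fft(pmf_a, pmf_b):
    """Distribution of the sum of two independent discretized losses."""
    n = len(pmf_a) + len(pmf_b) - 1              # pad to avoid wrap-around
    fa = np.fft.rfft(pmf_a, n)
    fb = np.fft.rfft(pmf_b, n)
    out = np.fft.irfft(fa * fb, n)
    return np.clip(out, 0.0, None)               # remove tiny negative round-off

total_pmf = convolve_fft([0.5, 0.5], [0.5, 0.5])
```

Repeating the convolution cell by cell gives the distribution for a whole lag, accident year, or the total reserve.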
 Predicting Future Loss Payments
      Using Bayes’ Theorem
• Calculate the Statistic S for each model.
• Then the posterior estimate of S is the
  model estimate of S weighted by the
  posterior probability of each model

   \text{Posterior Estimate of } S = \sum_{i=1}^{n} \{S \mid \text{model}_i\} \times \text{Posterior}\{\text{model}_i \mid \text{data}\}
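
The weighted sum applies directly to moments, densities, and cumulative probabilities. A standard deviation or CV has to be derived from the weighted first and second moments instead of being averaged itself. A minimal sketch with hypothetical per-model means and standard deviations:

```python
import math

def mixture_mean_sd(means, sds, posterior):
    """Mean and standard deviation of a posterior-weighted mixture."""
    mean = sum(w * m for w, m in zip(posterior, means))
    # Second moment of each model is variance plus squared mean.
    second = sum(w * (s * s + m * m) for w, m, s in zip(posterior, means, sds))
    return mean, math.sqrt(second - mean * mean)

mean, sd = mixture_mean_sd([100.0, 200.0], [10.0, 10.0], [0.5, 0.5])
cv = sd / mean
```

In this example the mixture standard deviation is much larger than either model's own value of 10, because the models disagree about the mean.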
           Sample Calculations
           for Selected Insurers
• Coefficient of Variation of predictive
  distribution of unpaid losses.
• Plot the probability density of the predictive
  distribution of unpaid losses.
Predictive Distribution - Insurer Rank 7
[Figure: probability density of the predictive distribution of unpaid losses]
• Predictive Mean = $401,951 K
• CV of Total Reserve = 6.9%
Predictive Distribution - Insurer Rank 97
[Figure: probability density of the predictive distribution of unpaid losses]
• Predictive Mean = $40,277 K
• CV of Total Reserve = 12.6%
CV of Unpaid Losses
Validating the Model on Fresh Data
• Examined data from 2001 Annual Statements
  – Both 1995 and 2001 statements contained losses
    paid for accident years 1992-1995.
  – Often statements did not agree in overlapping years
    because of changes in corporate structure. We got
    agreement in earned premium for 109 of the 250
    insurers.
• Calculated the predicted percentiles for the
  amount paid from 1996 to 2001
• If the model works, the predicted percentiles should
  be uniformly distributed
PP Plots on Validation Data

• Plot sorted predicted percentiles against uniform distribution.
• Significant differences given by Kolmogorov-Smirnov test.
• Critical values @ 95% = ±13.03%
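
The test behind the PP plot can be sketched as follows. The large-sample approximation 1.36/√n for the 5% two-sided critical value is an assumption here, not something the slide states, but with the 109 insurers in the validation sample it gives about 13.03%, consistent with the figure quoted above. Function names are illustrative.

```python
import math

def ks_max_deviation(percentiles):
    """Largest gap between the empirical CDF of the predicted
    percentiles (on a 0-1 scale) and the Uniform(0, 1) CDF."""
    p = sorted(percentiles)
    n = len(p)
    return max(max((i + 1) / n - v, v - i / n) for i, v in enumerate(p))

def ks_critical_value_95(n):
    """Large-sample approximation to the 5% two-sided critical value."""
    return 1.36 / math.sqrt(n)

critical = ks_critical_value_95(109)
```

The model's percentiles are rejected as non-uniform when `ks_max_deviation` exceeds the critical value.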
                 Feedback
• If you have paid data, you must also have the
  posted reserves. How do your predictions
  match up with reported reserves?
• Your results are conditional on the data
  reported in Schedule P. Shouldn’t an actuary
  with access to detailed company data (e.g.
  case reserves) be able to get more accurate
  estimates?
 Predictive and Reported Reserves




• For the validation sample, the predictive mean (in
  aggregate) is closer to the 2001 retrospective reserve.
• Possible conservatism in reserves. OK?
• “%” means % reported over the predictive mean.
• Retrospective = reported less paid prior to end of 1995.
    Reported Reserves More Accurate?
•    Divide the validation sample into two groups and
     look at subsequent development.
         1. Reported Reserve < Predictive Mean
         2. Reported Reserve > Predictive Mean
•    Expected result if Reported Reserve is accurate.
     –   Reported Reserve = Retrospective Reserve for each
         group
•    Expected result if Predictive Mean is accurate?
     –   Predictive Mean ≈ Retrospective Reserve for each
         group
     –   There are still some outstanding losses in the
         retrospective reserve.
Subsequent Reserve Changes
[Figure: subsequent reserve changes, with panels for Group 1 and Group 2]
• Group 1
  – 50-50 up/down
  – Ups are bigger
• Group 2
  – More downs than ups
• Results are independent of insurer size
        Subsequent Reserve Changes (amounts in 000s)

                                 Reported Reserve @ 1995
                          < Predictive Mean    > Predictive Mean
Number of Insurers                      66                   43
Total Predictive Mean              926,134              872,660
1995 Reserve @ 1995                803,175            1,173,124
1995 Reserve @ 2001                856,393              985,711


    •     The CNB formula identified two groups where:
        –    Group 1 tends to under-reserve
        –    Group 2 tends to over-reserve
    •     Incomplete agreement at Group level
        –    Some in each group get it right
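
The group-level comparison can be reproduced from the table figures. The sketch below uses only the totals shown in the table; the dictionary keys and function name are illustrative, and reading "1995 Reserve @ 2001" as the retrospective reserve is an interpretation of the slide.

```python
# Totals from the table, in thousands.
groups = {
    "reported < predictive": {
        "predictive": 926_134, "at_1995": 803_175, "at_2001": 856_393},
    "reported > predictive": {
        "predictive": 872_660, "at_1995": 1_173_124, "at_2001": 985_711},
}

def reserve_summary(g):
    """Subsequent reserve change and ratios to the predictive mean."""
    change = g["at_2001"] - g["at_1995"]
    return {
        "change": change,
        "change_pct": 100.0 * change / g["at_1995"],
        "reported_to_predictive_pct": 100.0 * g["at_1995"] / g["predictive"],
        "retrospective_to_predictive_pct": 100.0 * g["at_2001"] / g["predictive"],
    }

for name, g in groups.items():
    print(name, reserve_summary(g))
```

The first group's reserve rises between the two evaluations and the second group's falls, which is the under-reserve and over-reserve pattern described above.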
           Main Points of Paper
• How do we evaluate stochastic loss reserve
  formulas?
  – Test predictions of future loss payments
  – Test on several insurers
  – Main focus is the testing
• Are there any formulas that can pass these tests?
  – Bayesian CNB does pretty well on Commercial Auto Schedule P data.
  – Uses information from many insurers
  – Are there other formulas? This paper sets a bar for
    additional research to raise.

								