Risk Analysis Modelling

Shared by: Jason Batman
Posted: 1/23/2009
Risk Analysis & Modelling
Lecture 10: Aggregate Loss Distributions
www.angelfire.com/linux/riskanalysis
RiskCourseHQ@hotmail.com

What we will look at in this lecture

In this final lecture we will look at aggregate claims distributions.
In Lecture 4 we looked at a claim frequency/severity model.
The purpose of modelling the frequency and severity was to simulate the aggregate level of claims experienced.
This aggregate level of claims is a random variable and ultimately determines the profitability of the underwriting portfolio.
The distribution that describes the random behaviour of the total level of claims is known as the aggregate claims (or loss) distribution.
In this lecture we will look at some techniques for fitting a distribution to this aggregate level. Along the way we will learn about a simple alternative to the Monte Carlo method and how to assess the "fit" of a distribution…

Claim Frequency/Severity to Aggregate Loss

[Figure: the number of claims is simulated from the claim frequency distribution (x-axis: Number of Claims Per Week, y-axis: Probability); the size of each claim is simulated from the claim severity distribution (x-axis: Claim Size, y-axis: Probability Density); the simulated claim sizes are summed to give the aggregate claim, which follows the Distribution of Aggregate Claims.]

Estimating the Aggregate Claims Distribution

There are a number of techniques we can use to derive the aggregate claims distribution. Some of these are based on advanced numerical techniques such as the Fast Fourier Transform (FFT).
In this lecture we will look at the most flexible and simple of
these, which is based on Monte Carlo simulation.
Once we have the aggregate claims distribution we can place a statistical upper boundary on the total claims experienced by the insurer.

Bootstrapping

Bootstrapping is a simple alternative to Monte Carlo simulation.
Monte Carlo simulation is based on randomly sampling possible outcomes from a statistical distribution.
Bootstrapping is based on randomly sampling (with replacement) from a pool of outcomes.
This pool of outcomes can consist of historical observations or simulated values.

Bootstrapping Vs Monte Carlo

In a Monte Carlo simulation values are randomly sampled from a distribution:
Distribution -> Inverse Transform -> Random Sample
In a Bootstrap simulation values are randomly sampled from a pool of values:
Sample -> Random Selection with Replacement -> Random Sample

Bootstrapping Claim Severity

In Lecture 4 we simulated the claim severity by randomly sampling from an Exponential Distribution.
In this lecture we will simulate claim severity via a Bootstrap simulation, sampling from a pool of 250 historic claims.
The advantage of this approach is that we do not need to make any strict assumptions about the behaviour of the severity of claims (for example, that it follows a specific distribution).
The disadvantage is that future claims might not be limited in size to those in our historic sample.

Claim Frequency and Bootstrapping Severity to Aggregate Loss

[Figure: the number of claims is simulated from the claim frequency distribution (x-axis: Number of Claims Per Week, y-axis: Probability); the severity of each claim is randomly sampled from the pool of values (bootstrap); the sampled severities are summed to give the aggregate claim.]

Bootstrapping In Excel

Excel does not come with a built-in Bootstrap function; however, using VBA we can easily extend the functionality of Excel.
This week's workbook comes with a couple of built-in functions: BootStrp and BootStrapSum.
BootStrp randomly samples a value from a range, and BootStrapSum samples a number of values and adds them
together.
It is possible to perform bootstrapping in Excel without the help of VBA by using the INDEX function, as noted in the appendix (although this is fairly messy compared to the VBA solution!).

Simulating Monthly Aggregate Claims

In this lecture we will try to fit a distribution to the aggregate level of claims for the insurance company from Lecture 4.
We will simulate the claim frequency over the month by assuming that the waiting time between claims is exponentially distributed with an average of 0.54 days.
We will simulate the claim severity by bootstrap sampling from the historic pool of 250 observations.

Empirical CDF

So far we have viewed the Cumulative Distribution Function (CDF) as an equation that calculates the probability of a random variable being less than some level.
We can also estimate the CDF from a sorted sample of observations of a random variable.
We have already seen this technique when we estimated the 1% VaR or Quantile by taking the 100th value in a sorted set of 10000 values.
If we were to use this to estimate the location of all Quantiles (1%, 2%, 3%, ..., 99%, 100%) and graph the result we would obtain the Empirical CDF.

Empirical Aggregate Claims CDF

[Figure: Empirical Aggregate Claims CDF (x-axis: Aggregate Claim Level (C), 0 to 250000; y-axis: Probability Aggregate Claim < C, 0 to 1). An annotation marks the empirical estimate of the probability that the aggregate claim will be less than 150000.]

Pareto Distribution

One distribution used in Actuarial Science to describe aggregate claims is the Pareto Distribution.
The formula for the CDF of this distribution is given by:

$$P = 1 - \left(\frac{x_{\min}}{x}\right)^{k}$$

Where $x_{\min}$ is the minimum value the random variable can take, $k$ is a shape parameter and $P$ is the probability the random variable will be less than $x$.
The inverse CDF is given by:

$$x = \frac{x_{\min}}{(1 - P)^{1/k}}$$

The PDF is simply given by the derivative of the CDF:

$$\mathrm{pdf}(x) = \frac{\partial\, \mathrm{cdf}(x)}{\partial x} = \frac{k \, x_{\min}^{k}}{x^{k+1}}$$

Fitting the Pareto Distribution

Fitting the Pareto Distribution once we have determined
the minimum level ($x_{\min}$) is a simple matter of selecting the shape parameter $k$.
One way we can do this is to use the maximum likelihood method: try a range of values for $k$ and select the value which provides the best fit…

Fitting the Shape Parameter By Formula

Sometimes the brute force maximum likelihood approach is the only way to fit the distribution to the data.
In the case of the Pareto distribution we can also find the shape parameter using a simple calculation (derived using calculus, see appendix):

$$k = \frac{N}{\sum_{i=1}^{N} \left( \ln(x_i) - \ln(x_{\min}) \right)}$$

Where $k$ is the maximum likelihood shape parameter, $N$ is the number of observations, $x_i$ is the ith observation and $x_{\min}$ is the minimum value for $x$.

Assessing Degree of Fit

An important question to answer is: how well does our Pareto Distribution CDF (see below) describe the random behaviour of aggregate claims?

$$\mathrm{cdf}(x) = 1 - \left(\frac{65000}{x}\right)^{1.238}$$

One way to assess how well it fits the data is to compare the CDF for aggregate claims described by this function to the empirical CDF derived from our simulated outcomes.

Empirical Vs Pareto CDF

[Figure: Empirical Vs Pareto CDF (x-axis: aggregate claim level, 0 to 250000; y-axis: probability, 0 to 1; series: Empirical Quantile, Pareto CDF Quantile).]

QQ Plot

The QQ (Quantile-Quantile) plot is simply another graphical way of comparing the Empirical vs Theoretical CDF.
It consists of an XY scatter graph depicting the location of the various quantiles.
On the X axis we have the location of the quantile from the empirical CDF, while on the Y axis we have the location of the quantile as predicted by the "distribution".
Note that when the quantile predicted by the distribution equals the empirical quantile the points lie along the y=x line (45 degree line).

QQ Plot Pareto Fit

[Figure: QQ Plot Pareto Fit (x-axis: Empirical Quantile, 0 to 1; y-axis: Pareto Quantile, 0 to 1; series: Actual Fit Line, Perfect Fit Line).]

Inspecting the Fit

It is obvious from the QQ plot and the graph depicting the Empirical Vs Pareto CDF that the Pareto
distribution does not offer an accurate fit for the Aggregate Claims.
We can conclude with a high degree of certainty that the behaviour of the Aggregate Claims for this insurance company does not follow the Pareto distribution.
We will now try to fit another distribution to the level of aggregate claims, the Shifted Log-Normal distribution.

Log-Normal Distribution

We have already come across the Log-Normal distribution in the lecture on VaR.
The log-normal is also one of the primary distributions used in actuarial science to describe aggregate claims.
The log-normal distribution has a very simple relationship with the normal distribution.
If we take a normally distributed random variable, known as the normal counterpart ($\tilde{x}$), and take its exponential we obtain a log-normally distributed random variable ($\tilde{y}$):

$$\tilde{y} = e^{\tilde{x}}$$

[Figure: Log-Normal Distribution. Two probability density curves, the Normal Counterpart ($\tilde{x}$) and the Lognormal ($\tilde{y}$), linked by $\tilde{y} = e^{\tilde{x}}$.]

Shifted Log-Normal

The exact form of the Log-Normal used in Actuarial Science is known as the Shifted Log-Normal.
The relationship between a Normally distributed random variable and the Shifted Log-Normal is simply:

$$\tilde{y} = e^{\tilde{x}} + A$$

The important feature of the shifted log-normal is that $\tilde{y}$ cannot fall below the shift parameter $A$.
Because claims data typically have a minimum level above zero, the shifted log-normal can potentially provide a better fit to claims data.

[Figure: Shifted Log-Normal. Two probability density curves, the normal counterpart ($\tilde{x}$) and the shifted log-normal ($\tilde{y}$), linked by $\tilde{y} = e^{\tilde{x}} + A$, with the shift parameter A marked on the axis.]

Fitting the Log-Normal

Fitting the log-normal is simple once you select the minimum value $A$.
We observe that:

$$\tilde{y} = e^{\tilde{x}} + A \quad \Rightarrow \quad \tilde{x} = \ln(\tilde{y} - A)$$

So we can transform a shifted log-normally distributed random variable into its normally distributed counterpart by subtracting the shift parameter and taking the natural logarithm.
We can then fit the normally distributed counterpart $\tilde{x}$ to the dataset by taking its mean and
standard deviation.

Assessing Degree of Fit

The best fit log-normal has the following parameters:

$$A = 65000, \quad \mu_x = 11.26, \quad \sigma_x = 0.39$$

We will use a function coded in VBA called SLGNORMDIST to estimate the location of the Quantiles for the Shifted Log-Normal.

Empirical Vs Log-Normal CDF

[Figure: Empirical Vs Log-Normal CDF (x-axis: aggregate claim level, 0 to 250000; y-axis: probability, 0 to 1; series: Empirical CDF, Log Normal CDF).]

QQ Plot Log-Normal Fit

[Figure: QQ Plot Log-Normal Fit (x-axis: Empirical Quantile, 0 to 1; y-axis: Log Normal Quantile, 0 to 1; series: Log Normal Fit, Perfect Fit).]

Inspecting the Fit

The Shifted Log-Normal provides a very good fit for the level of aggregate claims.
Whether this fit is accurate enough depends on the use it will be put to.
We could go on to examine other distributions and see if they provide a better fit…
But for this lecture we will say this distribution is acceptable and go on to graph the Exceedance Probability Curve for aggregate claims.

CDF to EP Curve

The CDF for Aggregate Claims gives the probability that the total level of claims will be less than or equal to some amount X.
It is often more interesting to know the probability that claims will exceed some level, or the Probability of Exceedance.
This is simply one minus the probability of the random variable being less than or equal to X:

$$P(\tilde{C} > X) = 1 - P(\tilde{C} \le X) = 1 - \mathrm{cdf}(X)$$

EP Curve

[Figure: EP Curve (x-axis: Aggregate Claim Level, 90000 to 270000; y-axis: Probability of Exceedance, 0 to 1).]

Appendix: INDEX Bootstrapping

The INDEX function in Excel returns the Nth row in a range:

    =INDEX(RANGE, ROW_NUMBER)

In a Bootstrap simulation we wish to return a random row.
To generate a random integer between 1 and N we can use the following formula:

    =INT(RAND() * N) + 1

Where INT rounds a decimal number down to the nearest integer.
Say we want to select a random value in the range A1:A10:

    =INDEX(A1:A10, INT(RAND() * ROWS(A1:A10)) + 1)

Where ROWS will return the number of
rows in the range (in this case 10).

Appendix: ML Estimate for Pareto

By definition the likelihood of a sample $x_1, x_2, \ldots, x_N$ being taken from a Pareto distribution is (writing $x_m$ for the minimum value $x_{\min}$):

$$L = \prod_{i=1}^{N} \mathrm{pdf}(x_i) = \prod_{i=1}^{N} \frac{k \, x_m^{k}}{x_i^{k+1}}$$

The log likelihood function is therefore:

$$\ln L = \sum_{i=1}^{N} \ln\left( \frac{k \, x_m^{k}}{x_i^{k+1}} \right) = N \ln(k) + N k \ln(x_m) - (k+1) \sum_{i=1}^{N} \ln(x_i)$$

Find the turning point of this function by taking the first order derivative with respect to $k$ and setting it to zero:

$$\frac{\partial \ln L}{\partial k} = \frac{N}{k} + N \ln(x_m) - \sum_{i=1}^{N} \ln(x_i) = \frac{N}{k} - \sum_{i=1}^{N} \left( \ln(x_i) - \ln(x_m) \right) = 0$$

Re-arranging we derive the value for $k$ at the turning point:

$$k = \frac{N}{\sum_{i=1}^{N} \left( \ln(x_i) - \ln(x_m) \right)}$$

The second order derivative is $-N/k^{2}$, which is negative when N > 0 and k > 0, so this turning point is a maximum.

Coursework Part A

The purpose of the coursework is to give you the opportunity to clarify your understanding of the topics covered in the course.
It is sufficient to simply cover the material covered in the class.
Additional marks will be given to groups who do additional research and express their own ideas or opinions.

Question 1: Discuss the Classical Mean Variance Framework

The relevant lectures are 2, 3 and 7.
Introduce the concept of risk and return as measured by the expected or average outcome and the variance or spread of outcomes.
The concept of diversification, and how a portfolio's mean and variance can be derived from the statistical properties of the assets it contains.
The covariance matrix as a practical means of calculating the variance.
The concept and calculation of efficient portfolios and the minimisation of risk for a given target return.
The definition of VaR as a statistical lower boundary on loss.
How VaR can be calculated from the mean and variance under the assumption that returns on the portfolio are normally distributed.
How Monte Carlo simulation can also be used to calculate VaR.
Additional research for this question could come from any number of areas:
Extensions to the basic VaR calculations
Criticisms of VaR
Alternatives to VaR (such as ETL)
Extensions to the classical mean-variance portfolio framework
Your opinion on the model and techniques

Question 2: Mean-Variance Modelling of the insurance company's asset liability portfolio

The relevant lecture is 5.
Introduce the insurance company as a financial entity with a portfolio of assets and liabilities.
Discuss how the insurance company is fundamentally different from an investor in that it has the ability to lever its portfolio with the sale of insurance policies.
Show how a model describing the mean and variance of the insurance company's solvency capital can be derived from the portfolio of assets and underwriting liabilities it holds.
Discuss how this simple model captures diversification across lines of business and asset classes.
How this model could be used to place a lower boundary on the value of solvency capital over a time period.
How this framework can be used to optimise the asset-liability portfolio and how the optimal portfolio depends on an optimal combination of assets and liabilities AND an optimal level of leverage.
Additional research could include:
Practical applications of this model
Criticisms of the model: it is a simple quantitative framework with a lot of assumptions and omissions
Alternatives such as DFA (Dynamic Financial Analysis) and Monte Carlo simulation
Your opinions on the model and techniques

Question 3a: How Monte Carlo simulations can be used to calculate the risk of insolvency for an insurance company

The relevant lectures are 3 and 4.
Introduce the concept of Monte Carlo simulation or stochastic simulation as a means by which you randomly sample from a statistical distribution.
How various statistics such as quantiles and averages can be estimated from this sample.
How the estimates from the sample are random because the sample itself is random; however, this randomness decreases with the sample size.
The concept of insolvency, required solvency margins and how the behaviour of solvency
capital can be modelled by stochastically generating sequences of cash flows (income and payments).
Discuss the concept of claim frequency and claim severity and how they can be modelled statistically.
The generation of realisations of solvency capital paths and how these can be used to measure the probability of insolvency.
Additional research could include:
Extensions to the Monte Carlo technique such as quasi-random sequences (Halton or Sobol) or variance reduction
Applications of this model in the real world
Different actuarial distributions for loss and frequency
Modelling of aggregate loss distributions
Your opinion on the model and techniques

Question 3b: Monte Carlo Simulations can be used to calculate portfolio level credit risk

The relevant lectures are 3 and 8.
Introduce the concept of Monte Carlo simulation or stochastic simulation as a means by which you randomly sample from a statistical distribution.
How various statistics such as quantiles and averages can be estimated from this sample.
How the estimates from the sample are random because the sample itself is random; however, this randomness decreases with the sample size.
The concept of credit risk, introducing the concepts of credit ratings, transition matrices and credit spreads.
The Curse of Dimensionality and how this can be solved using a Monte Carlo Simulation.
How correlated credit transitions can be modelled.
Additional research could include:
Extensions to the Monte Carlo technique such as quasi-random sequences (Halton or Sobol) or variance reduction
Comparisons between CreditMetrics and other credit risk models (CreditRisk+ and CreditPortfolioView)
Analytical approaches to measuring credit risk
Your opinions on the model and techniques

Question 4: Discuss the general principles and motivations behind CAT modelling, and EVT

The relevant lecture is 9.
Discuss why it is important for insurance companies to assess the impact of catastrophes on the underwriting portfolios.
The statistical measure of
catastrophes (mean return period etc) and the measure of their impact on the underwriting portfolio (probable maximum loss etc).
The construction of CAT models in terms of their components (hazard, vulnerability, exposure and loss).
The use of Monte Carlo style techniques to assess the probability distribution of potential losses.
The accuracy of CAT models.
Extreme Value Theory as a statistical alternative to modelling extreme losses or claims.
The concept of tail fitting and the Peaks Over Threshold (POT) approach to fitting the EVT curve.
The concept and calculation of the maximum likelihood technique.
PML using EVT and random number generation.
Additional research could include:
A more detailed look at specific CAT models such as Eqecat or RMS
Case studies in which CAT models have accurately or inaccurately predicted losses
A more detailed look at hazard models or vulnerability measurements
The mathematics behind EVT and its applications
Multivariate EVT
Your opinion on the model and techniques

Coursework B: Excel Section

You should hand in a disk with your spreadsheet solutions along with a presentation.
This write-up should contain any relevant graphs and results and also include the techniques you used to complete the coursework.
The write-up should also attempt to explain the results you obtain.
Email me with any questions; asking for help will NOT affect your mark!
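
Supplementary sketch: the lecture's workflow outside Excel

For readers who want to check their spreadsheet results another way, the main steps of this lecture (exponential waiting times between claims, bootstrapped severity, the empirical CDF, the Pareto shape formula, the shifted log-normal fit and the exceedance probability) can be sketched in Python. This is a minimal sketch under stated assumptions, not the course workbook: the pool of 250 historic claims is replaced by made-up values, a month is assumed to be 30 days, the minimum level is simply set just below the smallest simulated outcome, and every function name is hypothetical.

```python
import bisect
import math
import random
import statistics


def simulate_month(pool, rng, mean_wait_days=0.54, days=30.0):
    """Aggregate claims for one month (the 30-day month is an assumption)."""
    total = 0.0
    t = rng.expovariate(1.0 / mean_wait_days)  # arrival time of the first claim
    while t <= days:
        total += rng.choice(pool)  # bootstrap: sample a severity with replacement
        t += rng.expovariate(1.0 / mean_wait_days)
    return total


def empirical_cdf(sorted_sample, x):
    """Fraction of the sorted sample that is less than or equal to x."""
    return bisect.bisect_right(sorted_sample, x) / len(sorted_sample)


def pareto_shape_mle(sample, x_min):
    """Closed-form ML estimate: k = N / sum(ln(x_i) - ln(x_min))."""
    return len(sample) / sum(math.log(x) - math.log(x_min) for x in sample)


def pareto_cdf(x, x_min, k):
    """Pareto CDF: 1 - (x_min / x)^k above x_min, zero otherwise."""
    return 1.0 - (x_min / x) ** k if x > x_min else 0.0


def fit_shifted_lognormal(sample, shift):
    """Mean and standard deviation of the normal counterpart ln(y - A)."""
    logs = [math.log(y - shift) for y in sample]
    return statistics.mean(logs), statistics.stdev(logs)


def shifted_lognormal_cdf(y, shift, mu, sigma):
    """Normal CDF evaluated at ln(y - A), zero at or below the shift A."""
    if y <= shift:
        return 0.0
    z = (math.log(y - shift) - mu) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical stand-in for the 250 historic claims (not the workbook data).
    pool = [100.0 + rng.expovariate(1.0 / 2000.0) for _ in range(250)]
    sample = sorted(simulate_month(pool, rng) for _ in range(5000))
    shift = 0.95 * sample[0]  # assumed minimum level, just below the smallest outcome
    k = pareto_shape_mle(sample, shift)
    mu, sigma = fit_shifted_lognormal(sample, shift)
    level = sample[len(sample) // 2]  # compare the fits near the empirical median
    print("empirical CDF:", empirical_cdf(sample, level))
    print("Pareto CDF:", pareto_cdf(level, shift, k))
    print("shifted log-normal CDF:", shifted_lognormal_cdf(level, shift, mu, sigma))
    print("exceedance probability:", 1.0 - empirical_cdf(sample, level))
```

Comparing the three CDF values at several claim levels is a rough numerical counterpart to the QQ plots: the closer a fitted CDF tracks the empirical one, the better the fit. In the lecture the minimum level is chosen by inspection (65000), so the automatic choice above is only a convenience.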
