Credit Scoring by alicejenny


									                       Overview of Credit Scoring Systems for Banks
INTRODUCTION

1. SCORING AS A TOOL FOR CREDIT RISK ASSESSMENT
    1.1 CREDIT SCORING: BUSINESS TASKS
2. CLASSIFICATION OF SCORING MODELS
    2.1 OVERVIEW OF EXISTING MATHEMATICAL MODELS
3. SCORING: GENERAL FUNCTIONAL REQUIREMENTS

CREDITWORTHINESS

5. SPECIFICS OF BUILDING SCORING MODELS
    5.1 GENERAL PREREQUISITES
    5.2 DEVELOPING A SCORING CARD
    5.3 DEVELOPING AND TESTING A SCORING MODEL
    5.4 PRESENTING THE MODEL AND TRAINING PERSONNEL IN MODEL OPERATION
    5.5 MONITORING, MAINTAINING, AND RESETTING THE MODEL
    5.6 SELECTION OF A CREDIT SCORING ALGORITHM
        5.6.1 Comparative Analysis of Scoring Algorithms
        5.6.2 Assessment of Scoring Algorithms
        5.6.3 Finding the Optimal Cutoff Score
    6.1 FORMULATING THE TASK
    6.2 DEVELOPING REQUIREMENTS
    6.3 SELECTING THE OPTIMAL SOLUTION
    6.4 TESTING AND IMPLEMENTATION
    6.5 FURTHER DEVELOPMENT OF THE SOLUTION

CONCLUSION


This Manual presents a general methodology for implementing scoring models (including scoring
cards) in banks.
The use of scoring systems is driven by growth in the mortgage and retail lending businesses,
which aim to meet growing demand for retail loans. Scoring systems help lenders make their
products competitive. Scoring can improve the efficiency and effectiveness of large-scale
retail lending programs: it expands coverage, reduces the cost and time of processing loan
applications and, most importantly, reduces credit risk. Implementing a scoring system in
the process of assessing borrowers’ creditworthiness enables a lender:
         • To operate large-scale retail lending programs;
         • To monitor the level and quality of household demand for lending services efficiently;
         • To manage retail lending risks adequately and in a timely manner;
         • To reduce operating costs related to assessment of potential borrowers’ creditworthiness;
         • To expand the range of loan products offered to customers;
         • To streamline the loan origination process.

Scoring is a mathematical or statistical model that allows a bank to estimate the probability of timely
repayment of a loan by a potential borrower, based on historical data on the bank’s other customers.
In its simplest form, a scoring model is a weighted sum of a number of characteristics.
This sum is an aggregate indicator (score). The higher the score, the more creditworthy the
customer. Scoring enables a lender to rank its customers by creditworthiness. The aggregate
indicator (score) of each customer is compared with a threshold, or cutoff score, which
reflects the lender’s breakeven point and is set based on the number of compliant customers needed to
offset the losses from one defaulted customer. Customers whose score is above the cutoff score
qualify for loans, whereas applications of the other customers are rejected. The major difficulty is to
determine which customer characteristics should be included in the model and what weights
should be assigned to these characteristics.
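The weighted-sum form and the cutoff rule described above can be sketched as follows. This is a minimal illustration only: the characteristic names, the weights, the cutoff, and the figure of nine compliant customers per defaulted customer are hypothetical assumptions, not values from this Manual.

```python
# Hypothetical weights for illustration only; a real model estimates them from data.
WEIGHTS = {"age_over_30": 20.0, "owns_home": 35.0, "years_at_job": 5.0}


def score(characteristics):
    """Aggregate score: a weighted sum of the customer's characteristics."""
    return sum(WEIGHTS[name] * value for name, value in characteristics.items())


def breakeven_good_probability(goods_per_bad):
    """Share of good customers at which profits offset losses.

    If the profit from `goods_per_bad` compliant customers offsets the loss
    from one defaulted customer, the lender breaks even when the good:bad
    odds equal goods_per_bad:1.
    """
    return goods_per_bad / (goods_per_bad + 1)


def decide(characteristics, cutoff):
    """Approve the application only if the score is above the cutoff score."""
    return "approve" if score(characteristics) > cutoff else "reject"


applicant = {"age_over_30": 1, "owns_home": 0, "years_at_job": 4}
print(score(applicant))                 # 20*1 + 35*0 + 5*4 = 40.0
print(decide(applicant, cutoff=50.0))   # reject
print(breakeven_good_probability(9))    # 0.9
```

How the breakeven share of good customers is translated into a cutoff on the score scale depends on how the score is calibrated, which this sketch does not cover.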

Credit scoring includes decision-making modules and the principal methods that support the process of
making a consumer loan decision. These methods help lenders decide who will be granted a loan and for
what amount, identify strategies to increase return, and assess risks.
Since credit scoring relies on actual data, it can be regarded as a fairly reliable tool for assessing
the creditworthiness of private individuals.
Typically, there are two types of decisions:
1)        Granting a loan or rejecting a loan application;
2)        Identifying a proper strategy for working with existing borrowers (possible increase of
          borrower’s credit limit).
Regardless of methods used, the decisive factor in both cases is availability of large samples of
customer data: data from loan applications, behavior patterns, and subsequent credit history. Most

<Credit4fefaffb-8437-4127-ba59-3a1df444cc34.doc> printed on October 21, 2012

methods employ these samples to identify specifics of customers’ credit histories and relationships
among customer characteristics (yearly income, age, length of current employment, etc.).
Scoring model applications may be used to accomplish a large number of tasks. By means of
scoring models, the basic concept of bankruptcy risk has been extended to other aspects of risk
management, such as:
         • Identifying potential borrowers (pre-qualification stage);
         • Qualifying borrowers;
         • Projecting behavior patterns of current borrowers (loan servicing stage).
Tasks that are accomplished with scoring models can be divided into four main groups:
1.        Marketing research:
         • Assessing creditworthiness of potential borrowers. This allows lenders to reduce the cost of
           attracting new borrowers and to better satisfy the needs of current borrowers;
         • Assessing probability of losing borrowers and developing effective strategies for retaining
Response scoring: scoring models which assess customers’ most likely reaction to mailed
advertising materials on a new project.
Retention/attrition scoring: scoring models which predict the likely behavior pattern of a customer,
i.e. whether he/she buys the product or switches to another lender after becoming familiar with the product.
2.        Tasks arising at the loan application stage:
         • Making a decision on granting or extending a loan;
         • Projecting future behavior patterns of a potential borrower based on the assumptions of
           borrower’s non-performance or poor performance.
These tasks are accomplished in the process of applicant scoring when the probability of default is

3.        Tasks arising at the loan servicing stage:
         • Predicting future payment behavior of current borrowers. These estimates allow lenders to
           identify potentially troublesome borrowers, thus reducing the default probability.
Behavioral scoring is intended to assess risks associated with existing borrowers.
4.        Problem loan management:
Problem loan management aims at selecting optimal methods of debt recovery in order to
minimize the number of borrowers behind their repayment schedule or maximize the number of
repaid loans.
Scoring models for collection decisions: these are scoring models which help lenders decide
when action should be taken with regard to non-payers and choose the alternative methods
which are most appropriate in a particular situation.

                            2. CLASSIFICATION OF SCORING MODELS
All credit scoring methods can be divided into two major classes:
         • Deductive (expert) methods; and
         • Empirical (statistical) methods.

This classification is based on whether accumulated statistics on borrowers are used for scoring. If
statistics are not used, a scoring system is classed as deductive, and assessment of
creditworthiness relies on expert experience.
If available statistics are used, the system is deemed empirical, and the borrower’s
creditworthiness is assessed by a formula reflecting the statistical patterns and regularities
which the available historical data can reveal.
Depending on statistics and way in which this statistics is used variety of scoring models are
In the scoring models presented below, factors determining borrower-related risks are weighted
based on either statistics or theoretical description (knowledge).

There is a variety of classification methods:
         • Statistical methods which are based on the discriminant analysis (linear regression, logistic
         • Variants of linear programming;
         • Decision trees;
         • Neural nets;
         • Genetic algorithms;
         • Nearest neighbors method.

Regression methods are the traditional and most common methods. First of all, this is linear
multifactor regression:

                                        p = w_0 + w_1 x_1 + w_2 x_2 + … + w_n x_n   (1)

where p is the default probability, w_i are weight factors, and x_i are the borrower’s characteristics.
A shortcoming of this model is that the left-hand side of the equation is a probability, which can
only take values from 0 to 1, whereas the right-hand side of the equation may take any
value from -∞ to +∞.

Logistic regression makes it possible to overcome this deficiency:

                              log(p/(1-p)) = w_0 + w_1 x_1 + w_2 x_2 + … + w_n x_n      (2)

Logistic regression involves more complex calculations to obtain the weight factors and thus
requires more powerful computers and more advanced software. However, this is not a problem at the
current state of IT development. Today, logistic regression is the leading method in scoring systems.
Another advantage of logistic regression is that it is capable of dividing customers not only
into two groups (“0” - bad, “1” - good) but also into several groups by risk magnitude.
All regression methods are sensitive to correlation between characteristics, which is why regression
models should not use highly correlated variables.
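To make the contrast between equations (1) and (2) concrete, the sketch below evaluates both forms for the same weights. The weights and characteristic values are hypothetical; the point is only that the linear form can leave the interval from 0 to 1, while the logistic form cannot.

```python
import math


def linear_score(weights, x):
    """Right-hand side of equations (1) and (2): w0 + w1*x1 + ... + wn*xn."""
    w0, rest = weights[0], weights[1:]
    return w0 + sum(w * xi for w, xi in zip(rest, x))


def logistic_probability(weights, x):
    """Solve equation (2) for p: p = 1 / (1 + exp(-z)), where z is the linear score."""
    z = linear_score(weights, x)
    return 1.0 / (1.0 + math.exp(-z))


weights = [-1.0, 0.8, 0.5]   # hypothetical w0, w1, w2
x = [3.0, 2.0]               # hypothetical characteristics x1, x2

z = linear_score(weights, x)           # about 2.4, which is not a valid probability
p = logistic_probability(weights, x)   # always strictly between 0 and 1
print(z, p)

# Consistency check with equation (2): the log-odds of p recover z.
print(math.isclose(math.log(p / (1 - p)), z))
```

Estimating the weights themselves (for example by maximum likelihood) is the computationally heavier step mentioned above and is not shown here.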

Linear programming also leads to linear scoring models. It is hardly possible to separate good
customers from bad customers precisely; instead, the classification error should be minimized. The task can be
formulated as “find the weight factors for which the error is minimal”.
Decision trees and neural nets are systems that divide customers into groups such that the risk magnitude
is the same within a group and differs from that of other groups to the maximum extent
possible. Neural nets are used primarily to assess creditworthiness of legal entities. Assessment of
legal entities involves smaller samples as compared to consumer lending.
The genetic algorithm is based on an analogy with the biological process of natural selection. In
lending, it looks like this: there is a set of classification models which are subject to mutation,
crossing, etc., resulting in survival of “the fittest”, i.e. the model giving the most precise
Under the nearest neighbors method, one selects a metric to measure the distance between
customers. All customers in a sample get a certain spatial position. Each new customer is
classified based on his/her neighborhood, i.e. according to which customers, good or bad,
prevail among the nearest neighbors.
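A minimal sketch of the nearest neighbors idea is shown below. The Euclidean distance, the choice of k = 3, and the sample data are assumptions made for illustration; the text does not prescribe a particular distance measure or neighborhood size.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Distance between two customers described by numeric characteristics."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def classify(new_customer, sample, k=3):
    """Label a new customer by the majority label among the k nearest neighbors.

    `sample` is a list of (characteristics, label) pairs, label "good" or "bad".
    """
    nearest = sorted(sample, key=lambda item: euclidean(new_customer, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical, already-normalized characteristics (e.g. income, length of employment).
sample = [
    ((0.9, 0.8), "good"),
    ((0.8, 0.9), "good"),
    ((0.7, 0.6), "good"),
    ((0.2, 0.1), "bad"),
    ((0.1, 0.3), "bad"),
]

print(classify((0.85, 0.75), sample))  # good
print(classify((0.15, 0.2), sample))   # bad
```

In practice the characteristics need to be put on comparable scales first, since otherwise the characteristic with the largest numeric range dominates the distance.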
Normally, lenders use combinations of several methods. Each method has its pros and cons. In
addition, the choice of a method depends on the lender’s strategy and priorities in
the development of scoring models.
Since regression methods show the significance of each characteristic for assessing risk magnitude,
they are particularly important for developing the questionnaire form to be completed by
potential borrowers. Linear programming is capable of employing a large number of variables and
modeling specific conditions.


To build a scoring system, the following types of data may be needed:
     • Macroeconomic data: socio-economic development statistics for those regions in which a
       lender has its branches/offices or in which the lender is planning to operate.
     • Statistics on wage levels across industries: this data is needed in order to include
       information on the borrower’s occupation in the scoring model, so that assessment of his/her
       creditworthiness is more precise.
     • Personal and performance data on all existing borrowers of the lender: this information
       includes statistics on repaid and unpaid debts and data on late payments of loan principal and
       interest. The exact composition of personal and performance data needed to run the model will
       be determined on completion of the preliminary analysis.
     • Bank management’s expert knowledge of each credit product offered by the lender.

A scoring system for private individuals should meet the following requirements which include
three groups of limitations.
First, the scoring model should allow for changing socio-economic conditions.
Second, it should be adaptable to a user.
Third, it should be capable of describing various types of loans under a single approach: consumer
loans, car loans, mortgage loans, and credit limits on credit cards (overdrafts).

Listed below are requirements for a scoring model which relate to changing socio-economic
     • The scoring model should be operable when there are no borrowers’ credit histories;
     • The scoring model should allow for regional specifics;
     • The scoring model should allow for social stratification of the Ukrainian population
       (distribution of individuals by income);
     • The scoring model should be capable of determining borrowers’ actual income based on
       indirect indications.
Therefore, the space of indicators determining a borrower’s creditworthiness is multidimensional.
The number of dimensions depends on the specific tasks of the scoring model. (Under the unified
approach described in this paper, the dimension is variable.)
The next list shows more specific functional requirements for the scoring system:
         • Compatibility with the lender’s information system;
         • Ability to develop several scoring cards;
         • A scoring card should be designed at the product level;
         • The scoring card should be adjusted to the customer’s questionnaire form automatically (i.e.
           new fields should be added to the scoring card as new fields are added to the questionnaire
         • The customer’s score should be recalculated automatically each time the customer’s
           questionnaire form is updated;
         • Calculation of the borrower’s score;
         • Calculation of the co-borrower’s score (if deemed necessary);
         • Calculation of the guarantor’s score (if deemed necessary);
         • Stop factors should be determined at the product level;
         • Keeping track of stop factors by borrower;
         • Keeping track of stop factors by guarantor;
         • Saving the borrower’s credit history.

In order to assess the creditworthiness of potential borrowers, scoring systems may be used to build a
“credit portrait” (or “credit profile”) for each borrower.
It is recommended to use a scoring system capable of assessing the credit risk
of both an individual borrower and the entire loan portfolio with a single model which is adaptable to data. A scoring
model for private individuals may be based on the borrower’s individual data in the questionnaire form,
bank management’s expert knowledge, numeric estimates derived from statistics on “bad” and
“good” loans, and numeric estimates derived from objective regional and industry information.
Assessing a particular borrower with the scoring model yields the borrower’s credit profile, which enables the
lender:
         • To divide potential borrowers into “good” and “bad” borrowers to whom loans can and
           cannot be granted, respectively;
         • To calculate parameters of a loan transaction for each borrower (the maximum loan amount,
           interest rate, maturity, loan repayment schedule);
         • To determine risk magnitude for all retail loans and specifics of loan portfolio management.
A potential borrower’s credit profile is a curve on a plane where the anticipated time of loan
repayment (t) is plotted on the abscissa and the borrower’s total obligation (loan principal plus interest, S) is
plotted on the ordinate.
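                                      Chart 1: Borrowers’ Credit Profile Curve
[Chart not reproduced. The area above the curve is labeled “Loans for which a borrower does not qualify”; the area under the curve is labeled “Loans for which a borrower qualifies”.]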

The curve characterizing the borrower’s credit profile divides the plane of the chart into two areas: the
area under the curve and the area above the curve, which correspond to the loans the borrower
qualifies for and does not qualify for, respectively.
For example, the borrower may receive a loan with an amount not exceeding S0 (points B and C,
but not point A). The maximum loan amount which may be granted to the borrower is Smax
(point D); this amount may be granted for period t*.
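The qualification rule implied by the credit profile curve can be sketched as a simple check of whether a loan, given as a point (t, S), lies on or under the curve. The curve below is a hypothetical piecewise-linear profile invented for illustration; the actual shape of the curve and the coordinates of points A to D are not given in the text.

```python
from bisect import bisect_right

# Hypothetical credit profile: (loan term t in months, maximum total obligation S).
# The curve shape is an assumption for illustration; a real profile comes from the model.
PROFILE = [(0, 0.0), (12, 6000.0), (36, 15000.0), (60, 12000.0)]


def max_obligation(t):
    """Linearly interpolate the profile curve at term t; zero outside its range."""
    terms = [point[0] for point in PROFILE]
    if t < terms[0] or t > terms[-1]:
        return 0.0
    i = bisect_right(terms, t) - 1
    if i == len(PROFILE) - 1:
        return PROFILE[-1][1]
    (t0, s0), (t1, s1) = PROFILE[i], PROFILE[i + 1]
    return s0 + (s1 - s0) * (t - t0) / (t1 - t0)


def qualifies(t, s):
    """A loan (t, S) qualifies if the point lies on or under the profile curve."""
    return s <= max_obligation(t)


# Smax and the term t* at which it is reached (the role played by point D).
t_star, s_max = max(PROFILE, key=lambda point: point[1])
print(qualifies(24, 9000.0))   # True
print(qualifies(6, 9000.0))    # False
print(t_star, s_max)           # 36 15000.0
```

Because the sketch uses a piecewise-linear curve, its maximum is always reached at one of the listed points, which is why Smax can be read directly from the table.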
Multiple credit profiles can be built for each borrower depending on loan parameters, such as
interest rate, repayment scheme, etc. In this way, an appropriate standard or customized loan
product can be selected for a customer with certain credit profile(s). As a result, the lender is able
to identify a larger number of creditworthy borrowers and increase its assets.
A scoring system based on the analysis of the borrower’s credit profile
enables the lender to take loan duration into account more fully and to assess the
borrower’s creditworthiness more precisely in mid- and long-term lending.
This results in better assessment of creditworthiness and better monitoring of creditworthiness patterns.
The scoring model, which relies on the borrower’s credit profile, calls for input of the expert knowledge of
bank management in the form of limits on certain data categories in the questionnaire form. For
example, potential borrowers with monthly income below a certain level or age outside a certain
range do not qualify for a loan.
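Expert limits of this kind can be sketched as a simple pre-check applied before any scoring. The limit values and field names below are hypothetical, and treating such limits as per-product settings is an assumption based on the functional requirement that stop factors be determined at the product level.

```python
# Hypothetical limits set by bank management for one product; values are illustrative.
STOP_FACTORS = {
    "min_monthly_income": 3000.0,
    "min_age": 21,
    "max_age": 65,
}


def triggered_stop_factors(applicant, limits=STOP_FACTORS):
    """Return the list of limits an applicant violates (empty if none)."""
    triggered = []
    if applicant["monthly_income"] < limits["min_monthly_income"]:
        triggered.append("income below minimum")
    if not limits["min_age"] <= applicant["age"] <= limits["max_age"]:
        triggered.append("age outside allowed range")
    return triggered


print(triggered_stop_factors({"monthly_income": 2500.0, "age": 70}))
# ['income below minimum', 'age outside allowed range']
print(triggered_stop_factors({"monthly_income": 4000.0, "age": 35}))
# []
```

Returning the full list of violated limits, rather than a single yes/no answer, makes it possible to record the reasons for rejection alongside the application.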
Input data used to build the scoring model based on borrowers’ credit profiles include:
         • Expert knowledge of the bank’s managerial staff on current borrowers;
         • Personal data on a potential borrower;
         • Information on the bank’s credit products.

Using the borrower’s credit profile to build a scoring system for assessing the creditworthiness of private
individuals allows a bank to administer various lending programs (products) under a single

                         5. SPECIFICS OF BUILDING SCORING MODELS


Developing Scoring Algorithms: Practical Aspects
If there is a sufficient database of originated loans, a lender may begin implementing credit
scoring information technology.
Key components of this technology are:
         • Procedures for developing scoring cards and other credit scoring algorithms;
         • Procedures for checking applicability of these algorithms under new conditions;
         • Applications for front offices which ensure real time support of the loan decision making
Statistical methods of analysis used in the credit scoring system rely on probability
models of possible outcomes of a loan transaction.
Listed below are the major conditions under which probabilistic modeling is appropriate:
         • Probabilistic nature of the outcome of a lending transaction
Under this assumption, the outcome of a lending transaction is a random event which occurs with
a certain probability.
         • Dependence of the probability of the lending transaction outcome on certain factors
The probability of the lending transaction outcome depends on a number of factors: borrower’s income,
social status, credit history, etc. Other material factors may be macroeconomic indicators for the
validity period of the loan contract (such as the foreign currency exchange rate, inflation rate, etc.).
         • Invariable impact of significant factors
It is assumed that the impact of each material factor on the probability of lending transaction outcomes
is invariable over a certain time interval that covers part of the past and part of the future. This is an
important condition as it enables the lender to assess the creditworthiness of new applicants based on
loan contracts concluded with other private individuals in the past.
         • Independence of outcomes
Outcomes of lending transactions are assumed to be independent of each other.
         • Precision of scoring calculations
As a rule, the composition of material factors tends to change over time, as does the nature of their
impact. The duration of the period during which scoring algorithms remain appropriate depends on the nature
and magnitude of changes in the economy. In practice, it ranges from a few months to a few years.
To ensure functionality of a credit scoring system, probabilistic models of lending transaction
outcomes need to be adjusted from time to time.
In order to describe a new trend with statistical methods, one should have a sufficient
sample of data showing this trend. If a trend’s lifetime is comparable with the time needed to
accumulate sufficient data for a statistical analysis, then the scoring card may become obsolete by
the time it has been developed.

Preparing Initial Data for Calculations
In order to develop credit scoring algorithms, one needs a sample of historical data, the so-called
teaching sample. The quality of this sample (in statistical terms, its representativeness) affects the
precision of estimates of scoring model parameters and, consequently, the effectiveness (predictive
capacity) of the scoring algorithm.
Sample effectiveness is determined by the extent to which positive and negative precedents are
represented in the sample. The same sample element may be considered positive or negative for
different tasks, or it may be entirely inappropriate for inclusion in the teaching sample.
For example, for the purpose of assessing a potential borrower’s creditworthiness (application
scoring), all instances of timely loan repayment may be treated as positive precedents, whereas
all other instances will be deemed negative precedents.
But if we are going to assess whether at least a certain portion of a delinquent loan will be repaid (collection
scoring option), then positive precedents will comprise all instances in which a delinquent loan was repaid in
an amount not less than this portion, while all other delinquent loans will be treated as negative
precedents. Timely repaid loans should be excluded from the sample as they have nothing to do
with collection scoring.
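The two labeling rules above can be sketched as follows. The record fields (`repaid_on_time`, `recovered_share`) and the 50% recovery threshold are hypothetical names and values chosen for this illustration.

```python
def application_label(loan):
    """Application scoring: timely repayment is positive, everything else negative."""
    return "positive" if loan["repaid_on_time"] else "negative"


def collection_label(loan, min_recovered_share):
    """Collection scoring: only delinquent loans belong in the sample.

    Returns None for timely repaid loans, which are excluded from the sample.
    """
    if loan["repaid_on_time"]:
        return None
    recovered = loan["recovered_share"] >= min_recovered_share
    return "positive" if recovered else "negative"


# Hypothetical loan records; field names are assumptions for this sketch.
loans = [
    {"id": 1, "repaid_on_time": True, "recovered_share": 1.0},
    {"id": 2, "repaid_on_time": False, "recovered_share": 0.6},
    {"id": 3, "repaid_on_time": False, "recovered_share": 0.1},
]

print([application_label(loan) for loan in loans])
# ['positive', 'negative', 'negative']
print([collection_label(loan, min_recovered_share=0.5) for loan in loans])
# [None, 'positive', 'negative']
```

The same three records thus produce two different teaching samples, one per scoring task.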
The content of the scoring task affects not only the division of the teaching sample into positive and
negative precedents but also the set of material (significant) factors. Indeed, as soon as a loan has
been granted, the applicant becomes a borrower and the lender obtains further information, for example,
information on the borrower’s compliance with the repayment schedule. Besides that, some important
characteristics may change over the loan period (for instance, the borrower’s income or marital status).

Information on Rejected Loan Applications
Information on rejected loan applications (rejected applicants) cannot be used in the teaching
sample since it lacks essential data. This creates a certain methodological problem.
Suppose that very strict criteria were applied when making loan decisions. This means that some rejected
loan applications would have been added to the subsample of positive precedents in the teaching sample
had these loans been granted, and the scoring calculations would have been different. However, even
if all rejected applications had been added to the subsample of negative precedents, the scoring
calculations would still have differed from calculations with actual data.
Therefore, if scoring calculations rely solely on actual data on originated loans (i.e. data on actual
borrowers), then predictive estimates of potential borrowers’ creditworthiness will be subject to a
certain systematic error.
Scoring results are biased because a potential borrower is not yet an actual borrower. Including
only actual borrowers in the teaching sample censors (i.e. distorts) the sample. From the statistical
perspective, this means that new potential borrowers belong to a population that differs from
the population from which the teaching sample was drawn.
The magnitude of this error can be estimated and reduced by applying scoring to the data on
rejected loan applications and running the scoring model with these data once again. In so doing,
the data on rejected loan applications should be divided into positive and negative precedents as if
these outcomes had actually been observed.
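One simple way to read the procedure above is sketched below: the current model scores each rejected application, the application receives an assumed label depending on whether its score is above the cutoff, and the extended sample is then used to re-estimate the model. The scoring function, the cutoff, and the data are hypothetical, and the re-estimation step itself is omitted.

```python
def infer_reject_labels(rejected, score_fn, cutoff):
    """Assign assumed labels to rejected applications using the current model."""
    return [
        (x, "positive" if score_fn(x) > cutoff else "negative")
        for x in rejected
    ]


def augmented_sample(actual, rejected, score_fn, cutoff):
    """Teaching sample extended with rejected applications and their assumed labels."""
    return list(actual) + infer_reject_labels(rejected, score_fn, cutoff)


# Hypothetical one-characteristic model: the score is simply monthly income / 100.
def score_fn(x):
    return x["income"] / 100.0


actual = [({"income": 5000}, "positive"), ({"income": 3500}, "negative")]
rejected = [{"income": 4500}, {"income": 1500}]

sample = augmented_sample(actual, rejected, score_fn, cutoff=40.0)
print(len(sample))                      # 4
print([label for _, label in sample])   # ['positive', 'negative', 'positive', 'negative']
```

Comparing the model fitted on `actual` alone with the model fitted on `sample` gives a rough indication of how sensitive the results are to the excluded applications.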

A bank wishing to assess the creditworthiness of individual borrowers with a scoring
model needs to decide whether to develop the scoring model in house or purchase it from
professional providers of scoring cards.
We recommend developing a scoring card in house because in-house development enables a
lender to take better account of its own lending practices and other specifics.
Leading scoring solutions are capable of generating specialized scoring profiles for various credit
products (mortgage loans, car loans, consumer loans, credit cards, etc.).
Development of a scoring card consists of four stages.
At the first stage, the developer identifies the dependent variables (i.e. variables showing
whether a given borrower is “good” or “bad”).
At the second stage, a sample is formed which will be used for testing the scoring model.
At the third stage, the developer identifies independent variables. The major source of information
for building the scoring model is the borrower’s questionnaire form. Both “raw” data (gender, age, etc.)
and derived (processed) data may be used.
At the fourth stage of building the scoring card, a scoring model is selected.
Binary choice models (logit and probit models), decision trees, and neural nets are the principal
models used to build scoring cards.
The advantage of logistic regression is that the model is easy to interpret. In addition, variables
enter this model additively, which makes it possible not only to compare borrowers within
one characteristic (for example, to compare male and female borrowers) but also to identify the most
significant criterion (i.e. to compare across borrowers’ characteristics).
Logistic regression is less sensitive to sample size than the decision tree. Approximately 200
defaults are needed for a logistic regression model to be stable.

Discretization of Continuous Variables
First of all, in order to make a scoring card easy to interpret, continuous variables (for example, the
borrower’s age) should be broken down into brackets (intervals). Most statistical applications
provide this function. Brackets can be set based purely on statistical principles or on practical
considerations (for example, age brackets can use such conventional “milestones” as graduation
from university, retirement, etc.).
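Breaking a continuous variable into brackets can be sketched as below. The bracket boundaries and the points attached to each bracket are hypothetical; in a real scoring card they would come from statistical analysis or from practical considerations such as those mentioned above.

```python
from bisect import bisect_right

# Hypothetical age brackets and scoring card points; real values come from the fitted model.
AGE_EDGES = [25, 35, 50, 60]   # bracket boundaries in years
AGE_LABELS = ["<25", "25-34", "35-49", "50-59", "60+"]
AGE_POINTS = [10, 25, 40, 35, 15]


def age_bracket(age):
    """Index of the bracket that contains `age` (left edge inclusive)."""
    return bisect_right(AGE_EDGES, age)


for age in (22, 35, 61):
    i = age_bracket(age)
    print(age, AGE_LABELS[i], AGE_POINTS[i])
# 22 <25 10
# 35 35-49 40
# 61 60+ 15
```

Once every continuous variable is bracketed in this way, each bracket can carry its own number of points, which is what makes the scoring card readable as a table.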

Optimization Methods
Statistically significant variables can be selected by trial and error or by optimization methods,
such as the “Forward method” and the “Backward method”. Under the Forward method, statistically
significant variables are added automatically at each step, whereas under the Backward method
statistically insignificant variables are discarded at each step, which facilitates selection of the
optimal model. The cross variables method makes it possible to allow for non-linear effects in
logistic regression.
From now on, this model will be considered the basic model for decision making.

Limitations on Scoring Application
There are two specifics of scoring which are particularly important.
The first is that a lender classifies the sample only with regard to actual borrowers. We will
never learn how rejected applicants would have behaved: it is likely that some of them would have been good
borrowers. However, loan applications are usually rejected for serious reasons. Banks record the
reasons for rejection and save information on rejected applicants so that they are able to
retrieve information on any applicant at a later time.
The second is that people’s characteristics tend to change over time; so do the socio-economic
conditions influencing people’s behavior. That is why scoring models should be developed based
on a sample of the most recent customers. Operation of the scoring system should be checked
periodically. As soon as the quality of scoring results deteriorates, a new model needs to be
developed. In the West, lenders develop new scoring models once every 18 months on average. The
frequency of replacement depends on the country’s economic stability. Under Ukrainian conditions,
scoring models should be revised twice a year or even more often.

Once a bank has decided to implement a scoring system, it is time to begin designing and testing a
scoring model.
Technically, development and testing of either a statistical or an expert model constitutes the most
important step. Although most model development work consists of solving quantitative problems
and analytical work, there needs to be permanent two-way contact with Bank management to
coordinate all efforts throughout the entire project.
How the scoring model will be designed depends on whether the model is expert or statistical and
also on the bank’s lending policy.
If available historical data on originated loans is limited, it is possible to use an expert scoring
model, which currently exists in a certain form in the Bank.
Statistical models rely solely on historical data in establishing the relationship between information on
a potential borrower and the likelihood of loan repayment.
If the Bank has sufficient historical data on originated loans to small businesses, then it is feasible to
consider development of a statistical scoring model. With logistic regression, one can build a
scoring model and test it on historical data. Statistical models are the most powerful scoring models.
However, development of statistical models requires a sufficiently large database
which includes at least 1,000 defaults.
When historical data is insufficient or unavailable, it is still possible to use statistical models which
are built on depersonalized historical databases of borrowers whose profiles are similar to those of
the bank’s existing and potential borrowers and which include a sufficiently large number of defaults.
A scoring card developed with such databases can be used as a “zero” scoring
card which will be adjusted later to the bank’s own future statistics on defaults.
Historical data can be divided into three non-overlapping subsets: teaching, testing, and
validating subsets.
The teaching subset is intended for building the scoring model and tuning its parameters.
The testing subset is intended for testing and assessing forecasting capability on data
that were not included in the teaching subset.
The validating subset is intended for checking the sustainability of model operation.
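The division into three subsets can be illustrated with a minimal sketch. The 60/20/20 proportions, the function name, and the stand-in loan records below are assumptions made for illustration only; the text does not prescribe any particular shares.

```python
import random

def split_three_ways(records, train_share=0.6, test_share=0.2, seed=42):
    """Shuffle records and cut them into three non-overlapping subsets.

    The shares are hypothetical; whatever remains after the teaching and
    testing subsets goes to the validating subset.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # reproducible shuffle
    n_train = int(len(shuffled) * train_share)
    n_test = int(len(shuffled) * test_share)
    teaching = shuffled[:n_train]
    testing = shuffled[n_train:n_train + n_test]
    validating = shuffled[n_train + n_test:]
    return teaching, testing, validating

loans = list(range(1000))  # stand-in for historical loan records
teaching, testing, validating = split_three_ways(loans)
```

Because each record lands in exactly one subset, the model is never assessed on data it was built on.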
Expert models are usually used when available historical data is insufficient to build a statistical
model or when new factors are being incorporated in the model which so far have not been taken
into account in loan decision making. Selection of an expert model can be preconditioned by a
combination of these two factors. Unlike the statistical model, expert model testing outcomes and
expert model’s capability of projecting a quantitative measure of risk may not be justified purely
academically because of the very nature of expert models.
In both cases historical data are used as criteria for identifying adequate ranges of model
parameters, which allow the developer at the testing stage to assess the model’s capability of ranking
borrowers by risk magnitude.
Once the model has been developed and approved by Bank management, it should be pilot tested.
Needless to say, goals of the pilot program vary depending on the model type.
There are two options for using the scoring model in a pilot project:
     1. In parallel with existing loan origination procedures. Each applicant is assessed with both
        the existing procedure and the scoring model; however, the loan decision is made under the existing
        procedure. At the completion of the pilot project the lender compares outcomes of the
        scoring model and existing procedures. This method is the most conservative.
     2. The scoring system is used as a standalone and independent procedure. Loan decisions are
        made based on scoring outcomes. This method is more suitable for expert models in banks
        with well-established lending services.
Whatever option is selected, the pilot project should be monitored thoroughly in order to reveal
possible problems and adjust/refine the model accordingly prior to implementation. Finally, the
pilot projects should be preceded by training personnel of branches and offices where the model
will be operated on the pilot basis.

As the pilot project proceeds, collected data needs to be analyzed to find out whether the model is
successful and prepare recommendations for the Bank.
New procedures should be designed and approved by Bank management.

The final step consists in monitoring the loan portfolio status and model operation. There might be
a need to adjust model parameters or, in the case of a statistical model, to re-teach the model. In
addition to standard loan portfolio status reports, new reporting forms will be introduced to assess
the quality of model operation from the risk assessment perspective.
All readjustments of statistical models are done periodically. Model parameters are readjusted in
order to improve the model’s projecting capabilities, since the initial data used for building the
model might no longer be relevant for projecting purposes once external conditions have changed.


5.6.1 Comparative Analysis of Scoring Algorithms
In our comparative analysis of scoring algorithms we will consider those models which are widely
used in banking practice. Out of the many existing mathematical models of scoring, three basic
algorithms are the most popular today. These are algorithms based on:
     •   Logistic regression;
     •   Decision tree; and
     •   Neural net.

The major difference among these three methods lies in approaches to segmentation of precedents
in the teaching sample.
The segmentation aims at identification of material factors affecting the probability of possible
outcomes of lending transactions. This is possible if a statistically significant difference between the
ratios of positive and negative precedents can be revealed between segments.
In the logistic regression method, precedents are segmented by dividing the space
of factors with an n-dimensional net, where n is the number of material factors (Chart 1).
           Chart 1: Segmentation of Precedents under the Logistic Regression Method
           (axes: Factor 1 (age) and Factor n; markers: positive and negative precedents)
Suppose each cell of the net (n-dimensional rectangle) combines precedents from the teaching
sample which are characterized by the same probability of the outcome.
The coordinates of net nodes are calculated based on statistical criteria by the principle of maximum
difference between the probabilities of lending transaction outcomes for adjacent precedent segments.
The ratio of positive to negative precedents in each segment is used to calculate points in the
scoring card. The coordinates of net nodes in the factor space set the intervals of characteristics’ values
on the scoring card.
Therefore, logistic regression is an adequate mathematical tool to calculate values on scoring cards.
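As a rough illustration of how the ratio of positive to negative precedents in an interval could be turned into scoring card points, consider the sketch below. The logarithmic scaling (20 points for each doubling of the ratio), the function name, and the precedent counts are assumptions for illustration; the text does not specify the exact formula.

```python
import math

def interval_points(positive, negative, points_to_double_odds=20):
    """Convert an interval's positive/negative counts into scoring points.

    Assumes negative > 0. The logarithmic scaling is a common convention,
    not something prescribed by the text.
    """
    odds = positive / negative
    return round(points_to_double_odds * math.log(odds) / math.log(2))

# Hypothetical counts of (positive, negative) precedents per age interval.
age_intervals = {"18-25": (300, 100), "26-40": (800, 100), "41+": (1600, 100)}
age_points = {
    label: interval_points(pos, neg) for label, (pos, neg) in age_intervals.items()
}
```

Intervals with a higher ratio of positive to negative precedents receive more points, which is the behavior the scoring card needs.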
The decision tree is a more general algorithm for segmenting the teaching sample of precedents
than logistic regression.
Unlike the logistic regression method, under the decision tree method precedents are segmented
by consecutive division of the factor space into nested rectangular areas rather than by an
n-dimensional net (see Chart 2).
                    Chart 2: Segmentation of Precedents under Decision Tree Method
                    (axes: Factor 1 (age) and Factor n; markers: positive and negative precedents)

Chart 3 shows the sequence of steps under the decision tree method.

                                      Chart 3: Decision Tree: Sequence of Steps
                                      Entire sample
                                          Segment 1
                                              Segment 1.1
                                              Segment 1.2
                                          Segment 2
                                          Segment k
                                              Segment k.1
                                              Segment k.2

At the first step, the sample is divided into segments by the most significant factor. At the second
and subsequent steps, the division applies to each segment repeatedly until not a single subsequent
fragmentation option leads to any material difference between the ratios of positive to negative
precedents in new segments.
The number of branches (segments) at each step of building the decision tree is identified
The neural network makes it possible to process precedents in the teaching sample with more
complex shapes (other than rectangular) – see Chart 4. The geometric shape of segments depends
significantly on the internal structure of the neural net, which can be tuned to the nature of
relationship between material factors.
                      Chart 4: Segmentation of Precedents under Neural Net Method
                      (axis shown: Factor n; markers: positive and negative precedents)

Although neither the decision tree nor the neural net leads to a scoring card in the classic tabular
form, one can easily obtain an analog of scoring points under these methods.
For example, an empirically calculated percentage of positive precedents in a segment may serve as a
scoring point. Calculating the score of a potential borrower then amounts to assigning him/her to
one of the segments built. For this purpose, scoring algorithms are applied to the potential borrower.
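The idea of using the percentage of positive precedents in a segment as a score can be sketched as follows. The two-level segmentation rule, the thresholds, and the sample data are hypothetical and serve only to show the mechanics.

```python
def segment_scores(precedents, assign_segment):
    """Return {segment: percentage of positive precedents} for a teaching sample."""
    totals, positives = {}, {}
    for features, is_positive in precedents:
        seg = assign_segment(features)
        totals[seg] = totals.get(seg, 0) + 1
        positives[seg] = positives.get(seg, 0) + int(is_positive)
    return {seg: 100.0 * positives[seg] / totals[seg] for seg in totals}

def assign_segment(features):
    # Hypothetical two-level tree: split on age first, then on income.
    if features["age"] < 30:
        return "young"
    if features["income"] >= 1000:
        return "older, high income"
    return "older, low income"

# Hypothetical teaching sample: (borrower features, positive outcome?).
teaching_sample = [
    ({"age": 25, "income": 500}, False),
    ({"age": 27, "income": 1500}, True),
    ({"age": 45, "income": 2000}, True),
    ({"age": 50, "income": 1200}, True),
    ({"age": 40, "income": 600}, True),
    ({"age": 55, "income": 700}, False),
]
scores = segment_scores(teaching_sample, assign_segment)

# Scoring a potential borrower means finding his/her segment.
applicant = {"age": 35, "income": 1800}
applicant_score = scores[assign_segment(applicant)]
```

The same lookup works for any segmentation method, as long as it can assign a new applicant to one of the segments built on the teaching sample.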

5.6.2 Assessment of Scoring Algorithms
Existing practices of applying these algorithms show that none of the above methods is best in all
cases. Effectiveness of scoring models can be assessed only through practical comparison of
projections and actual outcomes of lending transactions.
The entire sample of available empirical data or a part of this sample can be taken for the
comparison purpose.
Scoring algorithms can be compared by various criteria. One of them is described below.
First we sort the precedent sample in ascending order by the probability of a positive outcome,
as produced by the scoring algorithm. We then plot a graph with percentages of the sorted sample
(from left to right by ascending probability of the positive outcome) as the abscissa
and percentages of actual negative precedents in the subsample (corresponding to the X coordinate) as
the ordinate.
This chart shows the Y percentage of actual negative precedents in the first X percent of the sorted
sample. This means that the higher the curve lies above the abscissa axis, the more precise the
scoring algorithm is.

In order to compare scoring algorithms by predictive power, it is sufficient to compare their
graphs. The most informative algorithm is the one whose curve rises above all the others.
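The curve described above can be computed with a few lines of code. This is a minimal sketch that assumes Y is measured as the share of all actual negative precedents found in the first X percent of the sorted sample; the function name and the sample data are hypothetical.

```python
def negative_capture_curve(prob_positive, is_negative, steps=10):
    """Return points (X, Y): Y% of all actual negatives lie in the first X%
    of the sample sorted by ascending probability of a positive outcome."""
    order = sorted(range(len(prob_positive)), key=lambda i: prob_positive[i])
    total_negatives = sum(is_negative)
    curve = []
    for step in range(1, steps + 1):
        cutoff = len(order) * step // steps  # size of the first X% subsample
        captured = sum(is_negative[i] for i in order[:cutoff])
        curve.append((100 * step / steps, 100 * captured / total_negatives))
    return curve

# Hypothetical model output and actual outcomes (1 = negative precedent).
probs = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95]
negatives = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
curve = negative_capture_curve(probs, negatives, steps=5)
```

Computing this curve for each candidate algorithm on the same sample and plotting the results on one graph gives the comparison described in the text.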
                                     Chart 5: Comparison of Three Algorithms
                                     (curves for Algorithm # 1, Algorithm # 2, and Algorithm # 3)

It may turn out that none of the algorithms prevails. The example on Chart 5 illustrates such a case.
Algorithm # 1 surpasses Algorithms # 2, 3 in predictive power for small values of X but
is inferior to them at large values.
Algorithms # 2, 3 are more effective for mid-range and large values of X, respectively.
This might mean that Algorithm # 3 is appropriate for conservative lending policies, whereas
Algorithm # 1 is better for more aggressive lending policies.

5.6.3 Finding the Optimal Cutoff Score
Identification of an optimal policy for a bank requires a further economic analysis. Scenario
calculations will help in doing such analysis.
As an example, let’s consider a single loan product (a loan with the same parameters for all
borrowers). Qualifying applicants will be those whose score is not lower than a certain threshold
score (the so-called cutoff score).
The composition of the bank’s loan portfolio will vary depending on the cutoff score value. The higher
the cutoff score, the smaller the number of loans granted and the more likely the positive outcome of
each lending transaction. This means that a higher cutoff score corresponds to a more conservative
lending policy and vice versa. Conversely, the lower the cutoff score, the more loans with
high default probability in the portfolio.
Here we introduce the concept of average return on the loan portfolio, which equals the difference
between the expected yield on the portfolio and expected portfolio costs, divided by the number of loans in
the portfolio.
Portfolio costs consist of (a) unrepaid loans, i.e. direct losses caused by defaults on loans, and
(b) fixed costs of servicing the loan portfolio (salaries of the bank’s personnel, office rent,
overhead costs, etc.).
Expected yield and costs are directly linked to the number of originated loans, probabilities of
positive and negative outcomes of lending transactions, which, in turn, depend on borrowers’
creditworthiness and, accordingly, borrowers’ scores.
                                   Chart 6: Loan Portfolio Average Yield Curve
                                   (vertical axis: average return on portfolio; horizontal axis: cutoff score;
                                   the zero return line is marked)
The curve illustrating dependency of the average yield of loan portfolio on the cutoff score has the
maximum at a certain point (see Chart 6). This point reflects the optimal cutoff score a bank may
use to pursue its lending strategy.
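A scenario calculation of this kind can be sketched in a few lines. All figures below (scores, default probabilities, interest income per repaid loan, loss per defaulted loan, fixed costs) are hypothetical and chosen only to show how the average return changes with the cutoff score.

```python
def average_return(loans, cutoff, interest_income, loss_on_default, fixed_costs):
    """Average return per loan for applicants whose score is at least `cutoff`.

    loans: list of (score, probability_of_default) pairs.
    Return = (expected yield - expected losses - fixed costs) / number of loans.
    """
    accepted = [p_default for score, p_default in loans if score >= cutoff]
    if not accepted:
        return 0.0  # no loans granted, nothing earned (simplifying assumption)
    expected_yield = sum((1 - p) * interest_income for p in accepted)
    expected_losses = sum(p * loss_on_default for p in accepted)
    return (expected_yield - expected_losses - fixed_costs) / len(accepted)

def optimal_cutoff(loans, candidate_cutoffs, **economics):
    """Pick the candidate cutoff with the highest average return."""
    return max(candidate_cutoffs, key=lambda c: average_return(loans, c, **economics))

# Hypothetical applicants: (score, probability of default).
applicants = [(300, 0.40), (400, 0.25), (500, 0.10), (600, 0.05), (700, 0.02)]
economics = dict(interest_income=200.0, loss_on_default=1000.0, fixed_costs=100.0)
best = optimal_cutoff(applicants, [300, 400, 500, 600, 700], **economics)
```

In this made-up scenario the average return is negative at the lowest cutoff, peaks at an intermediate cutoff, and falls again at the highest one because fixed costs are spread over too few loans.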

A credit scoring system is a component of a bank’s underwriting system rather than a standalone
function intended to assess a borrower’s creditworthiness.
The credit scoring system is built in five stages listed below:
1.    Formulating the task and identifying anticipated results;
2.    Developing (technical, organizational, economic, etc.) requirements to the solution and
      identifying, based on these requirements, criteria to assess the quality of the solution;
3.    Selecting the optimal solution; checking the solution by the above criteria;
4.    Testing and implementing the solution; adjusting business processes to new technologies;
5.    Further development of the solution.

Proper formulation of a task predetermines the final result. Provided below is an example of
formulation of a task for the purpose of credit scoring:
Task: To create a mechanism for assessing borrowers’ creditworthiness: (a) when they apply for a
loan; (b) in the loan decision making process; and (c) when assessing the probability of loan
prepayment and collection of debts throughout the entire loan term. In so doing, the process of
assessing borrowers’ creditworthiness (underwriting) should be fast (should take at most one or two
days) and involve a minimal number of bank’s personnel. This process should not complicate
substantially the loan application procedure for customers.

Requirements for the new solution should allow for existing infrastructure and systems. Of course,
purchase of new equipment and software always necessitates certain adjustments of business
processes. However, relying on existing resources and procedures allows lenders to lower the costs of
introducing and maintaining new solutions. Typically, three groups of specialists are involved
in the process of making loan decisions:
     •   Credit experts-analysts who are responsible for setting lending criteria;
     •   Credit officers and customer service personnel working at front offices;
     •   IT specialists who ensure that information on customers is entered and saved and who coordinate
         interaction of analysts and operators.
Each of these groups sets forth requirements based on its own responsibilities and plays its own role
in designing functional requirements to the scoring system.
Analysts are responsible for identification of factors which increase and, conversely, reduce the risk
of borrowers’ default on loans. In so doing, the number of factors to be included in the model
should be minimal to avoid overburdening borrower’s questionnaire form but at the same time quite
sufficient to assess borrower’s creditworthiness.
Other requirements are time limits on modeling and flexibility of modeling tools. Building of a
model should not be time consuming so that the scoring model could be rebuilt easily in case of any
external or internal changes.
Major requirements of credit experts-analysts are as follows:
     •   Precision and reliability of models;
     •   Possibility to monitor the quality of the model easily without the need for additional
         labor-consuming analysis;
     •   Minimal time and labor consumption for preparing data for the analysis;
     •   Convenient interpretation of results; visualization of results;
     •   Possibility to update the model easily and keep track of changes in the model;
     •   Possibility to adjust the model and create new supplementary models.

Credit Officers and Customer Service Personnel from Front Offices
The major task of credit officers and customer service personnel is to describe the loan execution
procedure to a potential borrower, collect all necessary information on the customer, obtain loan
approval, advise the customer of the loan decision, and discuss details as needed. In some
banks it is a credit officer who makes the final loan decision with due regard to the applicant’s score.
Involvement of a credit officer in the loan decision making process is also feasible when dealing
with atypical borrowers.
High efficiency of the proposed solution implies carrying out all the above actions in a fast manner
and minimizing the probability of operator’s error. To this end, the scoring system should automate
work of the operational staff to the maximum extent possible.
A scoring system should meet the following requirements:
     •   Fast and convenient entry of data on a borrower in the database;
     •   Capability of calculating the borrower’s credit score within a few minutes;
     •   Limited supplementary qualification requirements for an operator (small training costs and easy
         replacement of the operator if needed);
     •   Zero probability of operator’s error.

IT Specialists
Since IT specialists ensure sustainable operation of the entire information system of the bank, their
key requirement for the scoring system is easy integration of the selected analytical tool into the existing
software. The fewer adjustments and refinements at the implementation stage, the higher the chance of
sustainable operation of the entire system. In addition, a really good technical solution should not
require huge maintenance efforts. A good quality scoring system should be built in such a way that
90% to 95% of end users (analysts and operators) can work on their own and seek assistance
from IT specialists only in case of emergency.
IT requirements to the scoring system:
     •   Capability of integration with other applications;
     •   Easy access to initial data (there should be no need to export/import data);
     •   Possibility to use existing models which are integrated in automated systems and databases
         (exporting models in the form of executable code).
Besides the above requirements, there are also general requirements for the scoring system as a tool to
be used by bank officers:
     •   Flexibility: possibility to use the scoring model to accomplish other banking tasks;
     •   Minimal costs and time for implementation;
     •   Guarantee of sustainable and effective operation;
     •   Short investment payback period.
This list of requirements is general and may be supplemented depending on bank’s operation
technologies and market conditions. All requirements are finalized as project works proceed.

Technologically, a credit scoring system consists of three modules:
     •   Scoring model proper;
     •   Integration in the underwriting procedure (entering data from the questionnaire form and
         calculating the credit score);
     •   Entering of new data in the bank’s automated information system; analysts’ access to this data for
         the purpose of further adjustment of the model.
The fewer software products need to be linked in this technological chain, the more reliable the solution
will be. A perfect solution incorporates all three components in a single software product.
However, in assessing a credit scoring solution, developers often focus on how the scoring model
will be built and overlook integration and further development of the model. For example,
solutions which are based on purchase of an available model, or purchase of a set of statistical and
mathematical tools needed to build a model in-house, have certain advantages; however,
they lack flexibility and have limited capability of further development. Being unable to update and
adjust the model quickly, the bank is not capable of reacting to changes in the external or internal
environment in a timely manner.

That is why a proposed solution should be verified against the requirements set forth at the second stage.
In addition to guarantees of sustainable operation, compliance with these requirements gives a
number of other benefits to the bank, namely:
     •   The implementation period gets shorter; consequently, overall costs go down;
     •   Models can be developed not only for lending purposes, but also for other analytical tasks;
     •   There is no need to rebuild the existing information system of the bank or purchase any
         supplementary applications;
     •   There are opportunities for further development of the solution, for example, integration with
         the bank’s CRM system or other applications.
Once all requirements have been met, the model can be tested and implemented.

At the testing stage, it is important to check the functions which are described below.

Preparing Data for Analysis
Since preparation of data is the most time- and labor-consuming stage of building the scoring model,
automation of this process will save analysts’ time and reduce project implementation costs.

Building the Scoring Model
In the first place, developers should ensure a balance between reliability and precision of the
model. In other words, the model should not only describe available data properly, but also be
capable of generating reliable projections based on new data. In addition, modern tools of analysis
do not require analysts to put forward hypotheses in advance. Instead, they select the best
model out of all models built.

Applying Models Built
Typically, to apply a scoring model that has been built, executable code is generated (for example, in C, Visual
Basic, Java, SQL, HTML or another language) and exported into the bank’s automated
system. Therefore, a bank officer uses a finished model rather than a modeling tool. For an operator, the
process of assessing an applicant’s creditworthiness will look like this:

         The operator enters the customer’s data in the system, and the system generates the customer’s score
          and displays it immediately.
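What an exported model might look like from the operator’s side can be shown with a minimal sketch. The characteristics, intervals, points, and cutoff score below are hypothetical; an actual exported model would contain the values produced by the bank’s own modeling tool.

```python
# Hypothetical scoring card: characteristic -> list of (low, high, points).
SCORECARD = {
    "age": [(18, 25, 10), (26, 40, 25), (41, 120, 35)],
    "years_at_job": [(0, 1, 5), (2, 5, 15), (6, 60, 30)],
}
CUTOFF_SCORE = 45  # hypothetical threshold

def score_applicant(applicant):
    """Sum the points of the interval each characteristic falls into."""
    total = 0
    for characteristic, intervals in SCORECARD.items():
        value = applicant[characteristic]
        for low, high, points in intervals:
            if low <= value <= high:
                total += points
                break
        else:
            # Reject values the card does not cover instead of guessing.
            raise ValueError(f"{characteristic}={value} is outside the scoring card")
    return total

# The operator enters the data; the system returns the score immediately.
applicant = {"age": 30, "years_at_job": 7}
score = score_applicant(applicant)
decision = "approve" if score >= CUTOFF_SCORE else "refer to credit officer"
```

Keeping the card as plain data separate from the scoring function means analysts can readjust points or intervals without touching the code the operator runs.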

Integrating the Models in Existing Applications
Lenders should opt for solutions which can be easily integrated in existing systems through an
open-ended interface rather than standalone applications.
It is sensible to outsource selection of the optimal software solution, installation and setting of
workstations, technical maintenance, and training of lender’s personnel to a specialized company
with sufficient experience of practical implementation.

The solution may evolve into, for example, multiple models for various borrower groups. This approach
may be appropriate if the bank implements lending programs in several regions.
Use of specialized software will enable the lender to build unique models which allow for regional specifics.
In addition, an advanced scoring option can be realized, namely, segmenting customers by risk
groups and developing group-specific lending criteria.
This information can serve as a basis for future marketing and advertising campaigns and
development of new bank products.
The credit scoring system should not only comply with the bank’s business strategy and technological
plan, but also be integrated with the bank’s internal rules and procedures.
Since the scoring system should, and probably must, lead to revision of the bank’s rules and procedures,
these changes must improve and optimize the overall lending policy of the bank rather than revise
and reassess it.

     •   Reducing time for processing loan applications and making loan decisions; increasing
         application processing capacity by minimizing document flow in the loan origination
         process as a precondition for increasing profitability of the retail lending business;
     •   Efficient assessment and continuous monitoring of risks associated with a concrete borrower;
     •   Reducing the impact of judgmental factors on loan decisions; ensuring objectivity of
         assessment of creditworthiness by credit officers across all of the bank’s branches and offices;
     •   Modeling dependency of portfolio risk assessment criteria on scoring card parameters (plotting
         the dependency curve);
     •   Implementing a unified approach to assessment of borrowers’ creditworthiness across various
         credit products (bridge loans, credit cards, consumer loans, car loans, mortgage loans);
     •   Applying the scoring mechanism to all retail credit products offered by the bank;
     •   Using and tuning scoring cards for each product;
     •   Setting and optimizing parameters of each credit product based on capacities of a concrete
         borrower;
     •   Expanding coverage of potential borrowers;
     •   Reducing personnel and costs of retail lending transactions;
     •   Monitoring all steps of loan application processing;
     •   Adjusting and readjusting scoring cards easily.

Scoring systems quantify the magnitude of credit risk and help lenders find out how loan payments
are linked to borrower’s characteristics. All in all, scoring systems enable lenders to make loan
decisions based on the quantitative risk analysis and transparent compromises.
