Boire Filler Group Credentials

Data Mining: The Ongoing Journey

Jan. 20, 2009
The Basic Data Mining Project Approach
• Approach to managing projects from briefing to completion from an account
  servicing perspective:
   – We utilize the following four-step process to manage projects:
      1. Problem Identification
      2. Creation of the Analytical Data Environment
      3. Application of the Data Mining Tools
      4. Implementation and Tracking


                                     Strictly Confidential                    2
Case Study Background – Insurance Company
• This case, along with the other case, looks at the data mining process as a
  continuous process.
Insurance Company
• Developed an initial response model.
• Looked at opportunities of targeting not only responders but profitable
  responders.
• Insurance company has a variety of products and different product vendors.
  Can we organize the data to better manage our marketing efforts? The solution
  was to create a contact/campaign management system.
• Development of specific tools to optimize customer profitability within
  campaign:
   – Product Models
   – Contact Models
   – Retention Models
• Let’s look at the initial response model.
Schematic Approach Of Modelling Objectives

Schematic of how the model can be leveraged based on exercise objectives.

Development of Predictive Model

• Targeting Objective
   – Apply model to customer base
   – Rank customers based on response
   – Create segments for future campaign
• Communications Objective
   – Determine triggers for response
   – Develop creative/product strategies
   – Test offers/messages in future campaign

Increase Overall Response
Methodology \ Approach

• Source Data
• Data Audit
• Identify who responded vs. who did not respond
• Frequency distribution to determine relevance of variables
• Creation of analytical file into both development sample and validation sample
• Conduct statistical analysis to build model:
   – CHAID analysis
   – Factor analysis
   – Correlation analysis
   – Regression analysis
• Validate models and determine $ benefits
• Next steps \ recommendations
Methodology – Correlation Analysis
• Once the analytical file was created, correlation analysis was conducted on the
  relationship between each of the 220 variables and response (independently).

             Independent Variables         Dependent Variable
             (Predictor Variables)         (Objective Function)

             • 220 variables               • Response
                                           • Non-Response

• All variables that were positively or negatively correlated with response and were
  statistically significant (95% or greater level of confidence) were included in the
  analysis.
• In total, 120 variables were used for the next part of the analysis.
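
The screening step above can be sketched roughly as follows. This is a minimal illustration, not the routine used in the project: the function names (pearson_r, is_significant, screen_variables) are hypothetical, the input is assumed to be numeric columns with a 0/1 response flag, and the 95% test uses the large-sample cutoff |t| > 1.96 as an approximation.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column cannot be correlated with anything
    return cov / math.sqrt(var_x * var_y)

def is_significant(r, n, critical=1.96):
    """Large-sample approximation of a two-sided test at 95% confidence."""
    if abs(r) >= 1.0:
        return True
    t = r * math.sqrt((n - 2) / (1 - r * r))
    return abs(t) > critical

def screen_variables(columns, response):
    """Keep variables whose correlation with response is significant.

    columns: dict mapping variable name -> list of numeric values
    response: list of 0/1 flags (1 = responded), same length as each column
    """
    kept = {}
    for name, values in columns.items():
        r = pearson_r(values, response)
        if is_significant(r, len(values)):
            kept[name] = r  # the sign shows a positive or negative relationship
    return kept
```

Each variable is tested on its own against response, which matches the "independently" wording above; relationships among the predictors themselves are left to the later regression step.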




Methodology – EDA Reports
• For the 120 variables that remain, exploratory data analysis (EDA) reports were
  produced.
• EDA reports graphically depict the relationship between the variable and
  response and help us understand, from a business standpoint, whether or
  not the relationship makes sense.
• The EDA reports are also used to validate variables selected later in the
  modeling process.




Sample EDA

                        Age             Response Rate         Number of          % of File
                                                              Customers

                       Average              2.50%                 98511           100%
                       20 to 30             1.50%                 19670           20%
                       30 to 40             2.00%                 19706           20%
                       40 to 50             2.50%                 19730           20%
                       50 to 60             3.00%                 19697           20%
                        60 +                3.50%                 19708           20%
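
A report like the one above could be produced along these lines. This is only a sketch under assumptions: the function name eda_report is hypothetical, the variable is split into equal-sized groups after sorting (the slides do not say how the ranges were chosen), and the file is assumed to hold at least as many records as groups.

```python
def eda_report(values, responses, n_bins=5):
    """Split records into n_bins equal-sized groups by one variable and
    report the range, count, share of file, and response rate per group.

    values: numeric values of the variable (for example, age)
    responses: 0/1 flags (1 = responded), same length as values
    """
    pairs = sorted(zip(values, responses))
    n = len(pairs)
    rows = []
    for b in range(n_bins):
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        responders = sum(flag for _, flag in chunk)
        rows.append({
            "low": chunk[0][0],            # smallest value in the group
            "high": chunk[-1][0],          # largest value in the group
            "count": len(chunk),
            "pct_of_file": len(chunk) / n,
            "response_rate": responders / len(chunk),
        })
    return rows
```

Reading the response_rate column from the first group to the last shows whether the relationship rises, falls, or is flat, which is the business sense check the EDA reports are meant to support.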


[Chart: Response Rate by AGE for the groups Average, 20 to 30, 30 to 40, 40 to 50, 50 to 60, and 60 +; vertical axis from 0.00% to 4.00%.]
Methodology – Model Development
• Regression Analysis was then used to develop the model
• Based on the variables selected, the regression routine determines the
  most significant variables in terms of predicting response and eliminates
  variables based on multi-collinearity (correlation among independent
  variables)
• The regression routine then produces an equation that allows you to rank
  individuals or groups on the database based on performance.
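
As a rough illustration of how such an equation can be used to rank a file, the sketch below scores each record and assigns it to one of 20 half deciles. The function names, the intercept, and the coefficient values are hypothetical; only the variable names crscore_num and crdlim_n and their negative signs come from the regression summary in this deck.

```python
def score(record, intercept, coefficients):
    """Apply a linear equation: intercept + sum of coefficient * value."""
    return intercept + sum(coefficients[name] * record[name] for name in coefficients)

def assign_half_deciles(scores, n_groups=20):
    """Return the rank group for each score, where 1 holds the highest scores."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    groups = [0] * len(scores)
    for position, index in enumerate(order):
        groups[index] = position * n_groups // len(scores) + 1
    return groups

# Hypothetical coefficients, for illustration only.
INTERCEPT = 0.1
COEFFICIENTS = {"crscore_num": -0.0001, "crdlim_n": -0.00001}

customers = [
    {"crscore_num": 700, "crdlim_n": 1000},
    {"crscore_num": 850, "crdlim_n": 5000},
]
scores = [score(c, INTERCEPT, COEFFICIENTS) for c in customers]
ranks = assign_half_deciles(scores, n_groups=2)  # two groups for this tiny example
```

With negative coefficients, the customer with the lower risk score and lower credit limit receives the higher model score and lands in group 1.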




Case Study Methodology – Model Development

   Development vs. Validation Sample
   • The model was developed based on 50% of the complete file (development
     sample).
   • Once the model is developed, we then apply the model to the remaining 50% of
     the file (validation sample) to validate the model's results.
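
A 50/50 split of this kind can be sketched as below. The function name and the fixed seed are assumptions made for illustration; the slides do not say how records were assigned to each half, so a simple random shuffle is used here.

```python
import random

def split_development_validation(records, seed=0):
    """Randomly split records into two halves: (development, validation)."""
    rng = random.Random(seed)   # fixed seed so the split can be reproduced
    shuffled = list(records)    # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]
```

The model is fitted on the first half only, and the second half is scored with the finished equation so that the reported results are not inflated by records the model has already seen.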



   [Diagram: the Analytical File is split into a Development Sample and a Validation Sample.]
EDA – Credit Bureau Risk Score

Variable: crscore_num
Description: Credit Bureau Risk Score

    Range of Variable   # of Customers   % of File   Response Rate
    Average             10,752           100.00%     2.43%
    0 to 721            2,160            20.09%      4.03%
    722 to 764          2,101            19.54%      3.24%
    765 to 793          2,219            20.64%      2.25%
    794 to 823          2,111            19.63%      1.56%
    824 to 892          2,161            20.10%      1.06%

• Customers with a lower credit bureau risk score are more likely to respond.
EDA – Credit Limit

Variable: crdlim_n
Description: Credit Limit

    Range of Variable   # of Customers   % of File   Response Rate
    Average             10,752           100.00%     2.43%
    1 to 1400           1,901            17.68%      5.47%
    1500 to 2100        2,407            22.39%      2.66%
    2200 to 3000        2,137            19.88%      1.64%
    3100 to 4700        2,184            20.31%      1.60%
    4800 to 17000       2,123            19.75%      1.08%

• Customers with a lower credit limit are more likely to respond.
Regression Results: Variables Summary
Variable      Description                                               Impact   Strength
crscore_num   #308 Credit Bureau Risk Score                             -        39.20%
crdlim_n      Credit Limit                                              -        22.11%
c01timd1      Postal Code Median Income                                 -        12.56%
Venpr_AG_HI   Vendor Product = HL, HH, PI, HM, NA, JA, 80, LB, AA       +        10.55%
BHiBF         New customer has first ever balance in first 12 months    +        7.04%
b_group1      Group 1 Customer                                          -        6.03%
b_sal4        Ms / Mad Salutation                                       +        2.51%

• Summary of the variables in the response equation, their sign, and the
  percent they contribute to the model.
Validation Lorenz Curve


[Chart: Lorenz Curve. Interval Response by Half Decile (1 to 20), with a linear trend line, Linear (Interval Response); vertical axis: Response Rate; average marked at 3.4%.]

• When applying the equation to the validation sample, the following results are
  obtained. The top 10% of the file based on model score has an observed response
  rate of 10.25%, versus an average response of 3.4% for the complete file and
  versus 0.8% for the bottom 10% of the file.
Cumulative Percent of All Responders

• On the validation file, 80% of all responders are being captured in the top 50% of
  the list.

[Chart: Cum. % of Resp. by Half Decile, from 0-5% through 95-100%. Data labels in order: 16%, 30%, 40%, 48%, 55%, 62%, 67%, 72%, 76%, 80%, 83%, 86%, 89%, 91%, 93%, 95%, 97%, 98%, 99%, 100%.]
Roll Out Gains Chart - Regression
Half    % of       Number of Cumulative  Minimum    Avg. Resp. Number of   Cum.     Cum.    % of      Cum. %   Cum. Lift  $ Benefit of
Decile  Prospects  Names     Names       Score in   Rate in    Responders  Number   Resp.   Resp. in  of Resp. in Resp.   Modelling
        Phoned     Phoned    Phoned      Range      Interval               of Resp. Rate    Interval           Rate
1       0-5%       50,000    50,000      0.093933   11.00%     5500        5500     11.00%  16%       16%      327%       $397,712
2       5%-10%     50,000    100,000     0.07493    9.50%      4750        10250    10.25%  14%       30%      305%       $717,327
3       10%-15%    50,000    150,000     0.056221   6.39%      3195        13445    8.96%   10%       40%      267%       $874,969
4       15%-20%    50,000    200,000     0.049155   5.50%      2750        16195    8.10%   8%        48%      241%       $986,325
5       20%-25%    50,000    250,000     0.042839   4.90%      2450        18645    7.46%   7%        55%      222%       $1,066,442
6       25%-30%    50,000    300,000     0.03557    4.70%      2350        20995    7.00%   7%        62%      208%       $1,136,146
7       30%-35%    50,000    350,000     0.031183   3.50%      1750        22745    6.50%   5%        68%      193%       $1,143,373
8       35%-40%    50,000    400,000     0.027741   3.10%      1550        24295    6.07%   5%        72%      181%       $1,129,774
9       40%-45%    50,000    450,000     0.025476   3.00%      1500        25795    5.73%   4%        77%      171%       $1,110,968
10      45%-50%    50,000    500,000     0.02382    2.64%      1318        27112    5.42%   4%        81%      161%       $1,073,158
11      50%-55%    50,000    550,000     0.02193    1.94%      972         28084    5.11%   3%        84%      152%       $999,372
12      55%-60%    50,000    600,000     0.019729   2.10%      1050        29134    4.86%   3%        87%      144%       $933,708
13      60%-65%    50,000    650,000     0.017801   1.39%      695         29829    4.59%   2%        89%      137%       $831,026
14      65%-70%    50,000    700,000     0.016067   1.53%      763         30592    4.37%   2%        91%      130%       $735,477
15      70%-75%    50,000    750,000     0.013579   1.60%      800         31392    4.19%   2%        93%      125%       $643,780
16      75%-80%    50,000    800,000     0.011813   1.11%      556         31947    3.99%   2%        95%      119%       $526,624
17      80%-85%    50,000    850,000     0.008673   0.97%      486         32433    3.82%   1%        96%      114%       $402,179
18      85%-90%    50,000    900,000     0.005169   0.83%      417         32849    3.65%   1%        98%      109%       $270,549
19      90%-95%    50,000    950,000     0.000594   0.97%      486         33335    3.51%   1%        99%      104%       $146,104
20      95%-100%   50,000    1,000,000   -0.024246  0.56%      278         33612    3.36%   1%        100%     100%       ($0)
Total                                                          33612                        100.00%


  The $ benefit of modeling illustrates the incremental marketing dollars that would have
  to be spent in order to acquire the same number of responders had a model not been
  used.
                                        *Based on only 1,000,000 customers, average cost per call $3.50
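
The cumulative columns of a gains chart can be derived from the interval responder counts alone. The sketch below is one way to do that; the function name gains_chart and the dictionary keys are hypothetical. The counts are the rounded "Number of Responders" values shown on the slide, so cumulative totals can differ from the slide by a few responders.

```python
def gains_chart(interval_responders, names_per_interval):
    """Build cumulative gains-chart columns from per-interval responder counts.

    interval_responders: responders per interval, best-scoring interval first
    names_per_interval: number of names phoned in each interval
    """
    total_responders = sum(interval_responders)
    total_names = names_per_interval * len(interval_responders)
    baseline_rate = total_responders / total_names  # response rate with no model
    rows = []
    cum_responders = 0
    for interval, responders in enumerate(interval_responders, start=1):
        cum_responders += responders
        cum_names = interval * names_per_interval
        cum_rate = cum_responders / cum_names
        rows.append({
            "half_decile": interval,
            "cum_names": cum_names,
            "cum_responders": cum_responders,
            "cum_rate": cum_rate,
            "pct_of_responders": responders / total_responders,
            "cum_pct_of_responders": cum_responders / total_responders,
            "cum_lift": cum_rate / baseline_rate,
        })
    return rows

# Rounded interval responder counts as printed on the slide.
RESPONDERS = [5500, 4750, 3195, 2750, 2450, 2350, 1750, 1550, 1500, 1318,
              972, 1050, 695, 763, 800, 556, 486, 417, 486, 278]

chart = gains_chart(RESPONDERS, 50_000)
```

Cumulative lift is the cumulative response rate divided by the overall rate for the whole file, so it falls toward 100% as more of the list is phoned.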
Dollar Benefits of Modeling
Half    % of       Number of Cumulative  Avg. Resp. Number of   Cum.     Cum.    % of      Cum. %   Cum. Lift  $ Benefit of
Decile  Prospects  Names     Names       Rate in    Responders  Number   Resp.   Resp. in  of Resp. in Resp.   Modelling
        Phoned     Phoned    Phoned      Interval               of Resp. Rate    Interval           Rate
1       0-5%       50,000    50,000      11.00%     5500        5500     11.00%  16%       16%      327%       $397,712
2       5%-10%     50,000    100,000     9.50%      4750        10250    10.25%  14%       30%      305%       $717,327
3       10%-15%    50,000    150,000     6.39%      3195        13445    8.96%   10%       40%      267%       $874,969
4       15%-20%    50,000    200,000     5.50%      2750        16195    8.10%   8%        48%      241%       $986,325
5       20%-25%    50,000    250,000     4.90%      2450        18645    7.46%   7%        55%      222%       $1,066,442
6       25%-30%    50,000    300,000     4.70%      2350        20995    7.00%   7%        62%      208%       $1,136,146

      The dollar benefit of modeling is determined by calculating the cost of the additional
      number of calls/contacts that would need to be made in order to acquire the same
      number of responders if the model is not used.
                                    # Called Resp Rate New Policies Cost/Call Total Cost Cost/Policy
                     Model          150,000    8.96%      13,445      $3.50     $525,000   $39.05
                     No Model       400,149    3.36%      13,445      $3.50    $1,400,521 $104.17
                     Difference      250,149                                     $875,521     $65.12


                 To acquire 13,445 policies with a random list an additional 250,149 calls
                 would have to be made at a cost of $875,521. Based on this scenario the
                 model saves you $65 for every policy acquired.
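
The arithmetic behind this example can be written out as a short sketch. The function names are hypothetical; the inputs (150,000 names called, 13,445 policies, a 3.36% response rate without the model, and $3.50 per call) are the figures from the table above.

```python
def dollar_benefit(names_called, responders, baseline_rate, cost_per_call):
    """Cost of the extra calls a random list would need for the same responders."""
    calls_without_model = responders / baseline_rate
    additional_calls = calls_without_model - names_called
    return additional_calls * cost_per_call

def cost_per_policy(names_called, responders, cost_per_call):
    """Total calling cost divided by the number of policies acquired."""
    return names_called * cost_per_call / responders

benefit = dollar_benefit(150_000, 13_445, 0.0336, 3.50)   # about $875,521
with_model = cost_per_policy(150_000, 13_445, 3.50)       # about $39.05
```

Dividing the benefit by the 13,445 policies gives the saving of roughly $65 per policy quoted above.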


Leveraging the Response Model for Profitability
• With a response model built, the next question was:
  – “What about profitability?”

• Company wanted to maximize response and premium
   – Our current tool just optimizes response

• Before building new tools, though, we wanted to see if existing tools could
  be used to target prospects
   – Can response models be used to target prospects based on premium?

• Listed on the next slide is a test matrix that was used to examine the above
  hypothesis.




Test Matrix
                                              Current Balance
 Group                   $0 or Missing  $1-$49   $50-$99   $100-$149   $150+    Grand Total
 1 (R1-6, B150+)                                                       14,628   14,628
 2 (R7-10, B150+)                                                      6,107    6,107
 3 (R1-6, B100-149)                                        2,892                2,892
 4 (R1-3, B50-99)                                2,382                          2,382
 5 (R1-3, B1-49)                        3,991                                   3,991
 6 (R1-6, B0)            3,000                                                  3,000
 7 (Remainder-Random)    4,360          769      519       308         1,044    7,000
 Grand Total             7,360          4,760    2,901     3,200       21,779   40,000

• 40,000 names were selected for the November/December calling period.
• R = Rank (1-20), B = Balance; therefore (R1-6, B150+) represents those from ranks 1-6 with
  a balance of $150 or more.
• The test matrix was designed to acquire learning about the relationship between current
  balance and model rank.
   – This is important learning: higher balances mean higher premium, but lower
      balances with strong response rates might still make economic sense if individuals
      from these groups eventually grow their balances and the original cost per acquisition
      is low.
• The “Remainder Random” cell represents 7,000 random names selected after cells 1-6
  were selected. So the 1,044 names selected with a current balance of $150+ come from
  ranks 11-20 of the model.


Model Validation – All Names (Net)
Net Response Rate by Model Rank (Grouped)

    Model Rank (Half-Decile)   Net Response Rate
    1 to 5                     6.70%
    6 to 10                    2.91%
    11 to 20                   1.25%



• The above chart illustrates the model's strong performance:
   – Ranks 1-5 outperform ranks 6-10 by a margin greater than 2 to 1 (6.70% vs. 2.91%)
   – Ranks 1-5 outperform ranks 11-20 by a margin greater than 5 to 1 (6.70% vs. 1.25%)



Model Results by Current Balance Range
[Chart: Net Response Rate by Model Rank (Current Balance). Horizontal axis: Model Rank (Half-Decile), grouped 1 to 5, 6 to 10, and 11 to 20; vertical axis: Response Rate (%), 0.00% to 9.00%; one series per balance range: $0 or Missing, $1-$49, $50-$99, $100-$149, $150+.]


When reviewing the model results by current balance range, we observe:
• The model ranks well for all balance ranges
• Model performance is best for those with a balance greater than $100, as illustrated by the
  steeper red and purple curves.


Results by Current Balance – Rank 1-3 Only
[Chart: Net Response Rate by Balance (Rank 1-3 Only). Response rate (%), 0% to 10%, by balance group: $0 or Missing, $1-$49, $50-$99, $100-$149, $150+.]



• When reviewing results from ranks 1-3 only, we see that response improves gradually as an
  individual's balance goes up.
• But all customers from this model segment with a balance of $1 or more perform quite well.



Profitability by Cell

Group                            # of Targets  Cost/Lead  Total Costs  Total Balance  Est. Prem. @ 1.29%  % of Costs  Avg./Month
1 (Rank 1-6, Balance > $150)           14,628      $1.60   $23,404.80    $711,967.11           $9,184.38         39%   $3,061.46
2 (Rank 7-10, Balance > $150)           6,107      $1.60    $9,771.20    $147,205.57           $1,898.95         19%     $632.98
3 (Rank 1-6, Balance $100-$149)         2,892      $1.60    $4,627.20     $31,252.80             $403.16          9%     $134.39
4 (Rank 1-3, Balance $50-$99)           2,382      $1.60    $3,811.20     $30,129.16             $388.67         10%     $129.56
5 (Rank 1-3, Balance $1-$49)            3,991      $1.60    $6,385.60     $27,099.14             $349.58          5%     $116.53
6 (Rank 1-6, Balance = $0)              3,000      $1.60    $4,800.00      $4,693.24              $60.54          1%      $20.18
7 (Rank 1-20, Balance = Random)         7,000      $1.60   $11,200.00     $49,399.98             $637.26          6%     $212.42
Total                                  40,000      $1.60   $64,000.00  $1,001,747.00          $12,922.54         20%   $4,307.51



• Results are based on 3 months of earning potential:
   – Against total costs of $64,000, $12,922 in premium has been generated, covering 20% of costs
   – By cell, cell 1 (Rank 1-6, Balance > $150) has generated premium that has
     covered almost 40% of its costs
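
The derived columns of the profitability table can be reproduced from the number of targets and the total balance per cell. This is a minimal sketch, assuming (as the column headers suggest) that estimated premium is 1.29% of total balance, that cost is $1.60 per lead, and that the monthly average divides the premium over the 3 months of earning potential. The names used are illustrative.

```python
PREMIUM_RATE = 0.0129   # "Est. Prem. @ 1.29%" column header
COST_PER_LEAD = 1.60    # "Cost/Lead" column
MONTHS = 3              # results are based on 3 months of earning potential

# Cell label -> (number of targets, total balance in dollars), from the table.
cells = {
    "1 (Rank 1-6, Balance > $150)": (14_628, 711_967.11),
    "2 (Rank 7-10, Balance > $150)": (6_107, 147_205.57),
    "3 (Rank 1-6, Balance $100-$149)": (2_892, 31_252.80),
    "4 (Rank 1-3, Balance $50-$99)": (2_382, 30_129.16),
    "5 (Rank 1-3, Balance $1-$49)": (3_991, 27_099.14),
    "6 (Rank 1-6, Balance = $0)": (3_000, 4_693.24),
    "7 (Rank 1-20, Balance = Random)": (7_000, 49_399.98),
}


def cell_profitability(targets, balance):
    """Derive cost, estimated premium, cost coverage and monthly premium."""
    cost = targets * COST_PER_LEAD
    premium = balance * PREMIUM_RATE
    return {
        "cost": cost,
        "premium": premium,
        "pct_of_cost": premium / cost,
        "avg_per_month": premium / MONTHS,
    }


report = {name: cell_profitability(*values) for name, values in cells.items()}
total_cost = sum(row["cost"] for row in report.values())
total_premium = sum(row["premium"] for row in report.values())

for name, row in report.items():
    print(f"{name:<34} {row['premium']:>10,.2f} {row['pct_of_cost']:>6.0%}")
print(f"Total premium {total_premium:,.2f} on costs of {total_cost:,.2f}")
```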




What is next for this insurance company?
• Tools have now been built to optimize response and still function well when
  trying to optimize profitability

• Challenges
   – Variety of products and different vendors
   – How do we begin to leverage campaign history?

• Need to develop a system to organize this information such that it can be
  used to better target customers

• Historically, names were selected based on their billing cycle




Background
• An initial analysis was conducted at BFG based on four products currently
  marketed to the Insurance Company’s base.
• This initial analysis was to determine:
   – How many potential contacts each customer might receive based on
     model score and product (or marketing) restriction (i.e. age, province,
     etc.)
   – Potential benefit of moving from a structured system like the one being
     used today to one based on marketing optimization techniques or a
     campaign management system.




Key Learning and Benefits
   • Once ranked, frequency analysis was conducted to determine the number of times a
     customer fell into the top 25% (generally the target audience for many of this
     company’s Partners) of an individual model score (0-4 times).

                    # of Times
                    Present in       # Of
                     Top 25%     Cardholders       %           Cum. %
                         4          20,737         1%            1%
                         3         233,813         7%            8%
                         2         591,702        18%           26%
                         1        1,033,733       32%           58%
                         0        1,353,553       42%          100%
                       Total      3,233,538       100%
   • 42% of customers (1,353,553) were not in the top 25% for any of the products, yet
     were being assigned to a Partner when their time slot came up on the calendar. These
     names could be used by other Partners not in the analysis.
   • 32% were in the top 25% for only 1 product
   • 18% for 2 products
   • 7% for 3 products
   • Less than 1% for all 4 products
   • There is significant opportunity to redistribute leads throughout the course of a year
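
The frequency analysis described on this slide can be sketched as follows. The scores below are invented for illustration, the function name is hypothetical, and the simple cut-off ignores tied scores, which a production ranking would need to handle.

```python
from collections import Counter


def top_quartile_counts(scores_by_product):
    """Count, per customer, the products where they rank in the top 25%.

    scores_by_product maps product -> {customer_id: model score},
    where a higher score means a more likely responder.
    """
    hits = Counter()
    customers = set()
    for scores in scores_by_product.values():
        customers.update(scores)
        ranked = sorted(scores, key=scores.get, reverse=True)
        cutoff = max(1, len(ranked) // 4)  # size of the top 25%
        hits.update(ranked[:cutoff])
    return {customer: hits[customer] for customer in customers}


# Hypothetical scores for four customers across four product models.
scores = {
    "product_1": {"a": 0.9, "b": 0.4, "c": 0.3, "d": 0.1},
    "product_2": {"a": 0.8, "b": 0.2, "c": 0.5, "d": 0.1},
    "product_3": {"a": 0.1, "b": 0.7, "c": 0.3, "d": 0.2},
    "product_4": {"a": 0.6, "b": 0.5, "c": 0.4, "d": 0.3},
}

counts = top_quartile_counts(scores)
frequency = Counter(counts.values())  # times in top 25% -> number of customers
for times in range(4, -1, -1):
    print(times, frequency[times])
```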



Key Learning and Benefits
   • The table below illustrates the additional number of contacts (efforts)
     that could be made if we were able to redistribute leads based on
     the assumptions below.
   • An incremental 2,450,950 contacts could be made based on these
     assumptions.
   • This number will most certainly decrease when all products and
     services are incorporated, but the number will still be significant.
              # of Times                     Opportunity
              Present in       # of        for Additional    # of Additional
               Top 25%      Cardholders        Offers            Efforts
                  4             20,737            0                      0
                  3            233,813            1                233,813
                  2            591,702            2              1,183,404
                  1          1,033,733            1              1,033,733
                  0          1,353,553            0                      0
                Total        3,233,538                           2,450,950

   • Assumptions:
      – Customers would not be given more than 4 product offers/year (since
        only 4 products are in the analysis)
      – Customers would not be offered the same product more than
        twice/year
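
One reading of these assumptions that reproduces the table is sketched below: each customer currently receives one offer per product for which they are in the top 25%, and can receive extra offers until either limit is reached. This interpretation is an assumption rather than a documented rule, and the names are illustrative.

```python
MAX_OFFERS_PER_YEAR = 4         # assumption 1 on the slide
MAX_SAME_PRODUCT_PER_YEAR = 2   # assumption 2 on the slide

# Number of top-25% products -> number of cardholders, from the table.
cardholders_by_top25_count = {
    4: 20_737,
    3: 233_813,
    2: 591_702,
    1: 1_033_733,
    0: 1_353_553,
}


def additional_offers(n_top_products):
    """Extra offers possible beyond one offer per top-25% product."""
    capacity = min(MAX_OFFERS_PER_YEAR,
                   n_top_products * MAX_SAME_PRODUCT_PER_YEAR)
    return capacity - n_top_products


additional_efforts = {
    n: additional_offers(n) * cardholders
    for n, cardholders in cardholders_by_top25_count.items()
}
total_additional = sum(additional_efforts.values())
print(f"{total_additional:,} additional efforts")
```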

Key Learning and Benefits

   • Additional opportunities exist to improve results if previous disposition
     data could also be included in the lead selection process.
      – For example, non-contacted names in a given period could be reassigned
        back to the Partner in a future period (up to 40% of a file may not be
        contacted in a given period).
      – Wrong telephone numbers or other negative response disposition
        data could also be eliminated from future lead selections (until new
        number verified or problem resolved).
   • Collection of historical disposition data and marketing history will offer
     improved intelligence and targeting opportunities in the future.
      – Previous disposition codes, recency and frequency of contacts over a
        period of time, and response history will make excellent variables for
        modelling or other types of analysis in the future.




Key Learning and Benefits

   • An improved contact management system could have significant benefits to this
     insurance company & their partners:
       – Allows the company & Partners to communicate more effectively with their
         customers
       – Increases revenues through a process that allows HBC to allocate marketing
         leads based on set business objectives

   • The improved system would include the following:
      – A contact management database that would keep, at an individual level, a
        customer's contact history
          • When they were contacted
          • What they were offered
          • What the outcome was
      – An optimization module that would use a combination of business rules to
        prioritize how leads get distributed in a given marketing window (1 month, 2
        months, or 6 weeks).
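
A contact history of this kind, and one business rule applied to it, could look roughly like the sketch below. The record fields mirror the three items listed above. The outcome codes, the function name and the specific limits (borrowed from the earlier assumptions of 4 offers and 2 same-product offers per year) are hypothetical and not the client's actual rules.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Contact:
    customer_id: str
    contacted_on: date   # when they were contacted
    product: str         # what they were offered
    outcome: str         # what the outcome was, e.g. "sale", "refusal"


def is_eligible(history, customer_id, product, window_start,
                max_offers=4, max_same_product=2):
    """Apply illustrative business rules to a customer's recent contacts."""
    recent = [c for c in history
              if c.customer_id == customer_id and c.contacted_on >= window_start]
    # Hold back names with an unresolved wrong telephone number.
    if any(c.outcome == "wrong_number" for c in recent):
        return False
    # Cap the total number of offers in the window.
    if len(recent) >= max_offers:
        return False
    # Cap repeat offers of the same product in the window.
    return sum(c.product == product for c in recent) < max_same_product


history = [
    Contact("c1", date(2004, 5, 3), "product_1", "refusal"),
    Contact("c1", date(2004, 6, 7), "product_1", "no_contact"),
    Contact("c2", date(2004, 5, 10), "product_2", "wrong_number"),
]
year_start = date(2004, 1, 1)
print(is_eligible(history, "c1", "product_1", year_start))
print(is_eligible(history, "c1", "product_2", year_start))
```

Whether a non-contacted attempt should count as an offer is a design choice; here every recorded attempt counts.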




Key Learning and Benefits: Product 1 vs. Product 2

  Counts (rows: Product 2 model rank; columns: Product 1 model rank; 1 = top 25%)

  Product 2         1           2           3           4       Grand Total
      1          327,429     101,402     110,209     121,837       660,877
      2          241,516     139,579     140,476     139,306       660,877
      3          214,773     154,862     150,480     140,762       660,877
      4          370,547     297,248     291,926     291,186     1,250,907
  Grand Total  1,154,265     693,091     693,091     693,091     3,233,538

  Share of all cardholders (same layout)

  Product 2         1           2           3           4       Grand Total
      1            10%          3%          3%          4%           20%
      2             7%          4%          4%          4%           20%
      3             7%          5%          5%          4%           20%
      4            11%          9%          9%          9%           39%
  Grand Total      36%         21%         21%         21%          100%



    • If we were to look at only Product 1 and Product 2, there are 327,429 (or 10%)
    cardholders in the top 25% of each product’s model.
    • There are about 827,000 cardholders who are in the top 25% rank for Product 1 but
    in the bottom 75% rank for Product 2.
    • In any given two-month cycle, these names which are most likely not to be used by
    Product 1 could be assigned to Product 2.
    • Conversely, there are over 330,000 cardholders who are in the top 25% rank for
    Product 2 but in the bottom 75% rank for Product 1.
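
The counts quoted in these bullets follow from the cross-tabulation. The sketch below re-derives them from the table's cells; the variable names are illustrative.

```python
# crosstab[product_2_rank][product_1_rank] = cardholders (rank 1 = top 25%).
crosstab = {
    1: {1: 327_429, 2: 101_402, 3: 110_209, 4: 121_837},
    2: {1: 241_516, 2: 139_579, 3: 140_476, 4: 139_306},
    3: {1: 214_773, 2: 154_862, 3: 150_480, 4: 140_762},
    4: {1: 370_547, 2: 297_248, 3: 291_926, 4: 291_186},
}

total = sum(sum(row.values()) for row in crosstab.values())

# Top 25% for both products.
both_top = crosstab[1][1]
# Top 25% for Product 1, bottom 75% for Product 2.
top_p1_only = sum(crosstab[p2_rank][1] for p2_rank in (2, 3, 4))
# Top 25% for Product 2, bottom 75% for Product 1.
top_p2_only = sum(crosstab[1][p1_rank] for p1_rank in (2, 3, 4))

print(f"Both top 25%:       {both_top:,} ({both_top / total:.0%})")
print(f"Product 1 top only: {top_p1_only:,}")
print(f"Product 2 top only: {top_p2_only:,}")
```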

Key Learning and Benefits: Matched Matrix (PARTNER A vs. Other Partners)

                                                                       Grand Total
  Campaigns                  (blank)   Partner A Mar   Partner A Apr     (Shared)    Grand Total
  PARTNER A (Non Match)                       29,720          77,144      106,864        106,864
  PARTNER A ONLY                             196,540         138,813      335,353        335,353
  PARTNER B Feb               17,490           1,469           1,041        2,510         20,000
  PARTNER C Feb               57,724           6,624           4,806       11,430         69,154
  PARTNER D Feb               54,088          10,944           6,663       17,607         71,695
  PARTNER E Feb               85,692          20,747          11,303       32,050        117,742
  PARTNER F Feb              115,380          22,432          13,737       36,169        151,549
  PARTNER G Feb               50,614          11,527           6,652       18,179         68,793
  Grand Total                380,988         300,003         260,159      560,162        941,150




  • 6,624 cardholders from Partner C's February campaign were also contacted by Partner A in
    March, and 4,806 were also contacted by Partner A in April.
  • Together, about 16.5% of Partner C's February cardholders (11,430 of 69,154) were also
    contacted by Partner A in March or April.
  • In the future, further analyses will be required to understand the impact of overlap among all
    underwriters.




Creating a Business Process for Campaign Management

• Once learning is achieved, a process needs to be developed that optimizes
  how names will be managed for each campaign:
   – Selection of names
   – Recording of key campaign information
      • Which products are being promoted
      • Date of campaign
      • Channel being used
      • How names were targeted
   – Feedback mechanisms to loop back performance of selected names
     after campaign launch
   – Reports that measure and track customer performance after campaign
     launch
   – Integration of new learning into the business process


Ongoing Improvements: What Kind of Results Did We Obtain?

                                           Total Cost               Avg. Prem./   # of Months
  Campaign       # of Leads   # of Sales    @ $36/hr    Cost/Sale   Cust./Month     to B/E      Phase
  Jul/Aug '03      20,000         285       $32,000       $112         $2.10          53        Pre Modelling
  Sept/Oct '03     20,000         303       $32,000       $106         $2.34          45        Pre Modelling
  Nov/Dec '03      40,000       1,134       $64,000        $56         $4.17          14        Modelling Only
  Jan/Feb '04      30,000       1,029       $50,000        $49         $4.44          11        Modelling Only
  Mar/Apr '04      30,000       1,084       $54,750        $51         $4.06          12        Modelling Only
  May04            15,000         806       $30,446        $38         $3.89          10        Modelling + Contact Management
  Jun04            15,000         757       $28,442        $38         $4.79           8        Modelling + Contact Management
  Jul04            15,000         727       $26,678        $37         $4.72           8        Modelling + Contact Management
  Aug04            15,000         690       $28,064        $41         $4.10          10        Modelling + Contact Management
  Sep04            15,000         725       $27,225        $38         $5.07           7        Modelling + Contact Management




• The slide above illustrates the improvement in the client's results over time:
   – July through October ’03: no modelling was used
   – November ’03 through April ’04: the model was applied for list selection
   – Since May ’04: the model has been applied, plus business rules from a
     contact management database that we are housing for the client
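
The cost-per-sale and break-even columns can be re-derived from the other columns. The sketch below assumes that months to break even is cost per sale divided by average premium per customer per month. That formula is inferred because it reproduces the table's values; it is not stated in the source.

```python
# (campaign, number of sales, total cost $, avg. premium per customer per month $)
campaigns = [
    ("Jul/Aug '03", 285, 32_000, 2.10),     # pre modelling
    ("Nov/Dec '03", 1_134, 64_000, 4.17),   # modelling only
    ("Sep04", 725, 27_225, 5.07),           # modelling + contact management
]


def cost_per_sale(sales, total_cost):
    """Total campaign cost spread over the sales it produced."""
    return total_cost / sales


def months_to_break_even(sales, total_cost, premium_per_month):
    """Months of premium needed to recover the cost of one sale (inferred)."""
    return cost_per_sale(sales, total_cost) / premium_per_month


break_even = {
    name: months_to_break_even(sales, cost, premium)
    for name, sales, cost, premium in campaigns
}
for name, months in break_even.items():
    print(f"{name:<12} {months:5.1f} months to break even")
```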

What did we need to do next?
• At this point we have some models and a contact management system.

• The opportunity is now to more fully exploit the database environment by
  developing tools that optimize ROI.

• The objective is to create a scoring system that will allow the company to
  prioritize customers for one or all of their products based on Potential Value
  (PV Score)




Approach Overview
• This score will take into account the following factors:
   – Customer’s potential response to a specific product offer
      • Specific response models for each product (if necessary)
   – Customer’s potential to cancel product in the first three months of
     service
   – Customer’s potential to be contacted by telemarketing (TM)
   – Specific product profit margins (value to Company)
• This approach will provide each customer with unique model scores
   – One for each product (assuming they are eligible and do not already
     have the product)
   – Company will have the ability to determine:
      • Which customers meet the desired return requirement for each product
      • How to prioritize product offers to customers based on potential value
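
The source does not give the PV Score formula. As an illustration only, the sketch below assumes the four factors combine multiplicatively into an expected value per lead: probability of contact, times probability of response, times probability of not cancelling, times product margin. All names, probabilities and margins are hypothetical.

```python
def potential_value(p_contact, p_response, p_cancel, margin):
    """Expected margin per lead under an assumed multiplicative model."""
    return p_contact * p_response * (1.0 - p_cancel) * margin


def prioritize(offers, min_value=0.0):
    """Rank eligible product offers for one customer by potential value.

    offers is a list of dicts with keys: product, p_contact, p_response,
    p_cancel and margin. Offers below min_value are dropped.
    """
    scored = [
        (potential_value(o["p_contact"], o["p_response"],
                         o["p_cancel"], o["margin"]), o["product"])
        for o in offers
    ]
    return sorted((s for s in scored if s[0] >= min_value), reverse=True)


# Hypothetical model scores for one customer and two products.
offers = [
    {"product": "product_1", "p_contact": 0.70, "p_response": 0.09,
     "p_cancel": 0.50, "margin": 180.0},
    {"product": "product_2", "p_contact": 0.70, "p_response": 0.04,
     "p_cancel": 0.35, "margin": 240.0},
]
ranking = prioritize(offers, min_value=1.0)
for value, product in ranking:
    print(f"{product}: {value:.2f}")
```

A minimum value threshold plays the role of the "desired return requirement" mentioned above; its level would be a business decision.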



Other Activities
• Standardized Reporting
   – There would appear to be a need for a monthly set of standard reports
     that will allow the company to analyze results at the detailed model/rank
     level
   – A set of key metrics would be agreed upon, and reports could be
     produced once completed disposition files have been received from the
     TM agency
   – These reports could be delivered electronically and/or in hard copy
     format.

• In developing these new kinds of tools, we need a way to identify the
  benefits which accrue from this approach.
   – Sensitivity reports can be developed that demonstrate the $ benefits of
     improving certain metrics such as cancel rate, contact rate or response
     rate

Objective in building our tools
Response Models
• Produce a list of variables (model equation) that will help predict the
  likelihood of a customer responding to a future campaign

Cancellation Model
• Produce a list of variables that will predict the likelihood of a customer
  cancelling within the first 3 months

Contact Model
• Produce a list of variables that will predict the likelihood of a customer being
  contacted
   – Contact model results will be discussed at a later date when model
     development is complete

• Integrate the above models along with product profit margins to rank and
  select customers for future campaigns

Building the Response Models
• In many cases, we found that each model required an upfront segmentation
  approach:
   – Have at least one existing product (1+ vendor product)
   – Do not have an existing product (0 vendor product)

• How might we have arrived at this learning?


• Listed on the next page are some high level results in building these tools
   – Looked at both tools and how they would perform
   – Compared both new tools to existing tools




Model Comparison
0 VP vs. 1+ VP

[Chart: 0 V.P. vs. 1+ V.P. Response rate (%), 0% to 14%, by model rank (decile 1 to 10), with one series each for the 0 V.P. and 1+ V.P. groups. Chart annotation: Top 40% of 0 V.P. group outperforms bottom 70% of 1+ V.P. group.]

Building the Cancellation Model and our Initial Findings

• Customers more likely to cancel tend to fit the following description:
   – More likely to have lower credit optimization
      • Current Balance / Credit Limit
   – More likely to have a higher credit limit
   – More likely to live in postal areas with a lower % of Aboriginals as a % of
     the total population
   – Reside outside of Alberta (Postal Area “T”)
   – Have a higher ratio of refusals per total campaigns in the last 12 months
   – Have a lower lifetime to date balance



• A number of these variables behave in the opposite direction in the 1+ VP response model



Cancellation Model
Validation Lorenz Curve
[Chart: Cancellation Curve. Interval cancellation rate (%), 0% to 70%, by rank (half-decile 1 to 20).]



• When applying the equation to the validation sample, the following results are
  obtained
   – Cancellation rate for top 10% of the file = 63.59%
   – Avg. cancellation rate = 49.39%
   – Cancellation rate for bottom 10% of file = 35.97%

The Contact Model
• The final model to be included in this integrated solution is the contact
  model which would complete the required modelling tools that are
  necessary for selecting names based on ROI.

Contact Model Objective
• Produce a list of variables that will predict the likelihood of a customer being
  contacted




Contact Model
Gains Chart - Validation
                                        Interval                          Cum.         % of                 Interval          Cum.
            % of       # of  Cum. # of   Contact      # of  Cum. # of  Contact  Contacts in  Cum. % of       Lift in       Lift in
  Rank Prospects  Prospects  Prospects      Rate  Contacts   Contacts     Rate     Interval   Contacts  Contact Rate  Contact Rate
     1     0-10%     24,333     24,333    74.96%    18,240     18,240   74.96%       13.91%     13.91%       139.11%       139.11%
     2   10%-20%     24,333     48,665    67.21%    16,353     34,593   71.08%       12.47%     26.38%       124.72%       131.92%
     3   20%-30%     24,333     72,998    63.52%    15,455     50,048   68.56%       11.79%     38.17%       117.87%       127.24%
     4   30%-40%     24,333     97,331    59.56%    14,493     64,541   66.31%       11.05%     49.22%       110.53%       123.06%
     5   40%-50%     24,333    121,664    54.98%    13,378     77,919   64.04%       10.20%     59.43%       102.04%       118.86%
     6   50%-60%     24,333    145,996    51.38%    12,501     90,420   61.93%        9.53%     68.96%        95.35%       114.94%
     7   60%-70%     24,333    170,329    48.17%    11,720    102,141   59.97%        8.94%     77.90%        89.39%       111.29%
     8   70%-80%     24,333    194,662    45.52%    11,076    113,216   58.16%        8.45%     86.35%        84.47%       107.94%
     9   80%-90%     24,333    218,994    40.26%     9,796    123,012   56.17%        7.47%     93.82%        74.71%       104.24%
    10  90%-100%     24,333    243,327    33.30%     8,104    131,116   53.88%        6.18%    100.00%        61.81%       100.00%
 Total              243,327               53.88%   131,116

• Based on a validation sample of 243,327 prospects
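
The gains chart columns can be rebuilt from just the number of prospects and contacts per decile. The sketch below uses the rounded counts shown in the table, so totals can differ from the printed ones by a few units; the function name and dictionary keys are illustrative.

```python
def gains_chart(intervals):
    """Build gains-chart rows from (prospects, contacts) per rank, best first."""
    total_prospects = sum(p for p, _ in intervals)
    total_contacts = sum(c for _, c in intervals)
    overall_rate = total_contacts / total_prospects
    rows, cum_p, cum_c = [], 0, 0
    for rank, (prospects, contacts) in enumerate(intervals, start=1):
        cum_p += prospects
        cum_c += contacts
        rows.append({
            "rank": rank,
            "interval_rate": contacts / prospects,
            "cum_rate": cum_c / cum_p,
            "pct_of_contacts": contacts / total_contacts,
            "cum_pct_of_contacts": cum_c / total_contacts,
            "interval_lift": (contacts / prospects) / overall_rate,
            "cum_lift": (cum_c / cum_p) / overall_rate,
        })
    return rows


# Contacts per decile from the validation table; 24,333 prospects per decile.
contacts_by_decile = [18_240, 16_353, 15_455, 14_493, 13_378,
                      12_501, 11_720, 11_076, 9_796, 8_104]
rows = gains_chart([(24_333, c) for c in contacts_by_decile])

for row in rows:
    print(f"{row['rank']:>2} {row['interval_rate']:.2%} "
          f"{row['cum_rate']:.2%} {row['cum_lift']:.2%}")
```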
                                                    $ Benefit of Modelling

                               # of     Contact        #        Response                Sale
                               Leads     Rate      Contacted      Rate     # Sales     Price      $ Sales
         Model (top 20%)      100,000   71.08%       71,080       5.00%     3,554       $180     $639,720
         No Model (random)    100,000   53.88%       53,880       5.00%     2,694       $180     $484,920
         Difference                                  17,200                   860                $154,800

         Resp. Rate          Estimated at 5% for contacted names
         Sales Price         Estimated at $180
         Number of Leads     100,000 leads is just to illustrate value. Company has approximately 3.5 million
                             cardholders, so the benefit would actually be larger.


 Taking 100,000 leads from the top 20% of the model in comparison to taking a
 random sample results in 860 more sales and a $154,800 increase in revenue.
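
The benefit calculation on this slide chains four numbers: leads, contact rate, response rate and sale price. A minimal sketch using the slide's own estimates follows; the function and variable names are illustrative.

```python
def campaign_outcome(leads, contact_rate, response_rate, sale_price):
    """Return (contacted, sales, revenue) for a lead selection."""
    contacted = leads * contact_rate
    sales = contacted * response_rate
    return contacted, sales, sales * sale_price


LEADS = 100_000
RESPONSE_RATE = 0.05   # estimated, for contacted names
SALE_PRICE = 180       # estimated, in dollars

# Cumulative contact rate of the top 20% of the model vs. the file average.
with_model = campaign_outcome(LEADS, 0.7108, RESPONSE_RATE, SALE_PRICE)
no_model = campaign_outcome(LEADS, 0.5388, RESPONSE_RATE, SALE_PRICE)

extra_contacts = with_model[0] - no_model[0]
extra_sales = with_model[1] - no_model[1]
extra_revenue = with_model[2] - no_model[2]
print(f"{extra_contacts:,.0f} more contacts, {extra_sales:,.0f} more sales, "
      f"${extra_revenue:,.0f} more revenue")
```

Changing one input at a time (for example the response rate) gives a simple version of the sensitivity reports mentioned earlier.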
Contact Model
Validation Lorenz Curve

[Chart: Contact Curve. Interval contact rate (%), 0% to 80%, by rank (decile 1 to 10).]



• When applying the equation to the validation sample, the following results are
  obtained
   – Contact rate for top 10% of the file = 74.96%
   – Avg. contact rate = 53.88%
   – Contact rate for bottom 10% of file = 33.30%

Contact Model
Validation vs. May 2007 leads – All Products

                                        Interval                          Cum.         % of                 Interval          Cum.
            % of       # of  Cum. # of   Contact      # of  Cum. # of  Contact  Contacts in  Cum. % of       Lift in       Lift in
  Rank Prospects  Prospects  Prospects      Rate  Contacts   Contacts     Rate     Interval   Contacts  Contact Rate  Contact Rate
     1     0-10%     40,848     40,848    71.69%    29,282     29,282   71.69%       23.18%     23.18%       133.39%       133.39%
     2   10%-20%     22,867     63,715    62.69%    14,336     43,618   68.46%       11.35%     34.53%       116.65%       127.38%
     3   20%-30%     26,457     90,172    58.26%    15,413     59,031   65.46%       12.20%     46.73%       108.40%       121.81%
     4   30%-40%     23,531    113,703    53.73%    12,643     71,674   63.04%       10.01%     56.74%        99.98%       117.29%
     5   40%-50%     14,832    128,535    50.68%     7,517     79,191   61.61%        5.95%     62.69%        94.30%       114.64%
     6   50%-60%     17,920    146,455    49.88%     8,939     88,130   60.18%        7.08%     69.76%        92.82%       111.97%
     7   60%-70%     21,330    167,785    47.87%    10,210     98,340   58.61%        8.08%     77.85%        89.07%       109.06%
     8   70%-80%     22,861    190,646    45.42%    10,384    108,724   57.03%        8.22%     86.07%        84.52%       106.12%
     9   80%-90%     20,871    211,517    42.17%     8,802    117,526   55.56%        6.97%     93.03%        78.47%       103.39%
    10  90%-100%     23,542    235,059    37.38%     8,800    126,326   53.74%        6.97%    100.00%        69.55%       100.00%
 Total              235,059               53.74%   126,326




• As an additional check, we validated the model against actual campaign results
from May ’07.
• When validating the model against the company’s May leads, we can see that the
model ranks well, with a 71.69% interval contact rate in the top decile.
• Please note: the leads are not evenly distributed because the scoring and ranking of
the file took place prior to lead selection (on the complete file).


Contact Model
Validation Lorenz Curve- May 2007 leads – All Products

[Chart: Response Curve. Interval response rate (%), 0% to 80%, by rank (decile 1 to 10).]




• The curve and results are almost identical to development results.



Model Integration
 • We have attempted to illustrate how the integrated modeling solution will work
 and its benefits.
             Leads                                          Contact Model Rank
               Net Response    Net Resp. Rate
               Model Rank      (April 2007)         1-5       6-10      11-15      16-20      Total
               Contact Rate (Validation)          69.83%     58.26%     49.10%     38.35%
                  1-5              9.00%          15,000     15,000     15,000     15,000     60,000
                  6-10             4.26%          15,000     15,000     15,000     15,000     60,000
                  11-15            3.08%          15,000     15,000     15,000     15,000     60,000
                  16-20            2.23%          15,000     15,000     15,000     15,000     60,000
                  Total                           60,000     60,000     60,000     60,000    240,000

             Sales                                          Contact Model Rank
               Net Response    Net Resp. Rate
               Model Rank      (April 2007)         1-5       6-10      11-15      16-20      Total
               Contact Rate (Validation)          69.83%     58.26%     49.10%     38.35%
                  1-5              9.00%             942        786        663        517      2,909
                  6-10             4.26%             446        372        314        245      1,377
                  11-15            3.08%             323        269        227        177        996
                  16-20            2.23%             234        195        165        128        722
                  Total                            1,945      1,623      1,368      1,068      6,003




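• When the response & cancellation models are integrated with the contact model,
  some cells with a lower net response rank generate more sales than cells with a
  higher net response rank but a lower contact rate (e.g. 323 sales for net response
  rank 11-15 at contact rank 1-5, versus 245 for net response rank 6-10 at contact
  rank 16-20).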
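
The Sales grid appears to follow from the Leads grid by multiplying each cell by its contact rate and net response rate. The sketch below reproduces that arithmetic under that assumption; the function and variable names are illustrative, and because the slide's rates are rounded, the computed values match the slide only to within about one sale.

```python
# Rates are copied from the slide. The cell formula
# (leads x contact rate x net response rate) is an assumed reconstruction.
LEADS_PER_CELL = 15_000
contact_rate = {"1-5": 0.6983, "6-10": 0.5826, "11-15": 0.4910, "16-20": 0.3835}
net_resp_rate = {"1-5": 0.0900, "6-10": 0.0426, "11-15": 0.0308, "16-20": 0.0223}


def expected_sales(resp_rank, contact_rank, leads=LEADS_PER_CELL):
    """Expected sales for one (net response rank, contact rank) cell."""
    return leads * contact_rate[contact_rank] * net_resp_rate[resp_rank]


# Full grid keyed by (net response rank, contact rank).
grid = {(r, c): expected_sales(r, c) for r in net_resp_rate for c in contact_rate}

# A lower-ranked response cell can beat a higher-ranked one with poor contact.
low_rank_good_contact = grid[("11-15", "1-5")]
high_rank_poor_contact = grid[("6-10", "16-20")]
```

Sorting `grid` by value rather than by response rank alone gives the calling priority implied by the integrated solution.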
       Not For Profit
       Lottery Case

Multi-Channel/Multi-Tool Approach
             Case 3
Background
• The charity lottery category is an excellent fundraising vehicle
• Lotteries help raise “new monies” without impinging on existing fundraising
  programs
• Since inception, this lottery program has generated over $100 million (net)
• This has enabled the organization to support unique research initiatives and
  health education programs

• Why look at this case
  – Another good example of the evolutionary nature of data mining.




Background
• The first lottery was conducted in the Winter of 1997
   – 185,000 tickets were purchased
   – Since that time…….




Lottery – Then and Now


                               Then                     Now
   Tickets Per Year           185,000                  800,000+

   Marketing Cost/Year              X                   2.7X

   Cost Per Ticket                  X                   1.3X

   Profit                           X                    3X



   • The Lottery has sustained tremendous sales growth while keeping
     its costs in check, thereby improving its contributions to the
     foundation
Lottery Marketing Matrix - Then



   [Diagram] Channels: Direct Mail, Awareness TV, Newspaper, Radio, Distribution
   Under Direct Mail: Total Addressed Mail, Unaddressed Mail




Lottery Marketing Matrix - Now

   [Diagram] Top row: Direct Mail, Awareness TV, Newspaper/Inserts, Radio, Miscellaneous
   Boxes below: Addressed Mail, Unaddressed Mail, Online, Email, Web Site, Banner Advertising,
   Newspaper, FSI - Brochure, FSI - Proxy, Winners Announcements (Newspaper/Email),
   Dist. Partners (Scotia Bank), Dist. Partners (PHSN), Dist. Partners (Local Offices),
   Telemarketing


 Using Data Mining and Models
• To maximize contributions to the mission, the Lottery team recognized that
  it needed to:
    – take advantage of marketing intelligence techniques
      • Modeling and other targeting tools
      • Ongoing measurement and testing
      • Consumer Research


   – maintain and benefit from a past supporters database




Using Data Mining and Models

  • As the Lottery marketplace grew and became more complex,
    many questions needed to be answered to ensure net
    revenue growth for the mission
  • Which past supporters are most likely to repeat?
     – Which supporters will buy early in the campaign?
     – What areas are best to target for acquisition?
     – What other Lottery supporters are most likely to buy a Lottery ticket?
     – Which past supporters are most likely to respond to Telemarketing?
        • Response?
        • Contact/Dial?




Data Mining and Models –
Answering the Questions

 • Since November 1999 the Lottery has been using data mining and
   models to improve its contributions to heart and stroke research and
   health promotion programs
 • The first predictive model was developed to identify past supporters
   with the highest propensity to repurchase




Data Mining and Models –
Answering the Questions

• Since that time a number of other tools have been developed, including:
   – Addressed Mail Predictive Model to identify which segments will be most responsive and
     early responders

   – Unaddressed Acquisition Model to identify the best postal walks to target from an acquisition
     standpoint

   – In-House Predictive Acquisition Model to identify potential acquisition from the in-house donor
     database

   – Telemarketing Predictive Response Model to identify past supporters most likely to buy
     through telemarketing

   – Telemarketing Optimization Model (Dial Model) to identify best past supporters who require
     the least amount of effort (for cost efficiencies) to contact




Why Data Mining and Models
•   Data Mining is about identifying opportunities to improve business results
•   This may be achieved by identifying segments of customers that outperform others based on certain
    business objectives (an objective function)
•   For example, the results from the predictive model below identify customers more or less likely to
    respond to a particular Direct Mail offer


[Figure: response rate (axis 0.0% to 5.0%) by model decile, Decile 1 to Decile 10.]




Objective of Model – Addressed Mail Model

   • Produce a list of variables (model equation) that will predict previous
     supporters' likelihood to respond to a future campaign
   • Apply the model to past supporters on the database and produce a
     ranking of individuals' likelihood to respond to a future campaign
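
The ranking step can be pictured as sorting supporters by model score and cutting the sorted file into ten equal groups. The following is a minimal sketch under that assumption; `assign_deciles` and the example scores are hypothetical, not the scoring code used for the Lottery.

```python
def assign_deciles(scores):
    """Return a decile for each score, in input order (1 = highest scores)."""
    n = len(scores)
    # Indices of the records ordered from highest to lowest score.
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    deciles = [0] * n
    for position, idx in enumerate(order):
        # Integer arithmetic splits the sorted file into ten near-equal bands.
        deciles[idx] = position * 10 // n + 1
    return deciles


# Hypothetical model scores for 20 past supporters.
example_scores = [0.05 * k for k in range(20)]
example_deciles = assign_deciles(example_scores)
```

Decile 1 then holds the supporters the model considers most likely to respond, which is the group selected first for addressed mailings.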




Addressed Mail Predictive Model

  Model Objective:
  • To predict the likelihood of recommitment from previous supporters
  • To select the top customers to target for addressed mailings

  Key predictors:


                        Variable                          Impact
 Number of Previous Tickets Purchased                       +
 Recency of Last Ticket Purchase                            -
 Past Winner                                                +
 Movers                                                     -
 Payment Method is Cash                                     -
 Contacted Code Equal to Television                         -
 Lives in Urban Area (vs. Rural)                            +
 Lives in Toronto                                           +
 Male                                                       +

How Do The Tools Perform?


 *Results presented have been indexed; however the
model ranks, relative response rates, relative costs and
            historical trends are accurate.
Validation - Addressed Mail Model
All Repeat Ticket Sales By Decile – Response Curve

[Figure: Response Curve - Validation. Response Rate (%), axis 0% to 60%, plotted by Decile 1 to 10.]




       • The curve shows how well the Addressed Mail
         Predictive Model predicts which past supporters
         will buy again
Validation - Addressed Mail Model
All Repeat Ticket Sales By Decile – Gains Chart
  Model         Avg. Resp. Rate   Cumulative   Cumulative % of   Cumulative Lift
  Rank/Decile   in Interval       Resp. Rate   Responders        in Resp. Rate
      1             50%              50%            31%               310%
      2             34%              42%            52%               261%
      3             20%              35%            64%               214%
      4             17%              30%            74%               186%
      5             15%              27%            84%               168%
      6             13%              25%            92%               153%
      7              5%              22%            95%               135%
      8              3%              20%            97%               121%
      9              3%              18%            98%               109%
     10              3%              16%           100%               100%
    Total           16%
• Validation results against a recent campaign for all past Lottery supporters
   – In total 16% repurchased a ticket
       • 50% of those in the top model rank (top 10%)
       • 3% of those in the bottom model rank (bottom 10%)
•   84% of all repurchasers come from the top 5 deciles of past supporters
     – That means 84% of the sales can be achieved at 50% of the cost, maximizing
       contributions to the mission
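
The cumulative columns of a gains chart can be derived from the per-decile response rates alone when deciles are equal in size. The sketch below assumes equal-sized deciles and uses the rounded interval rates from the slide, so a few outputs differ from the slide by a point (for example about 83% rather than 84% at decile 5). The function name and dictionary keys are illustrative.

```python
# Interval response rates by decile, as rounded on the slide.
interval_rates = [0.50, 0.34, 0.20, 0.17, 0.15, 0.13, 0.05, 0.03, 0.03, 0.03]


def gains_chart(rates):
    """Build gains-chart rows from per-decile rates (equal-sized deciles assumed)."""
    total = sum(rates)
    overall = total / len(rates)  # overall response rate
    rows, running = [], 0.0
    for decile, rate in enumerate(rates, start=1):
        running += rate
        cum_rate = running / decile
        rows.append({
            "decile": decile,
            "cum_rate": cum_rate,                   # cumulative response rate
            "cum_pct_responders": running / total,  # share of all responders so far
            "cum_lift": cum_rate / overall,         # cumulative lift vs. overall rate
        })
    return rows


rows = gains_chart(interval_rates)
```

Reading `rows[4]` gives the "top 5 deciles" figures quoted in the bullets above.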

Dollar Benefits of Modeling - Example

  • Another way to look at the benefit of modeling is to determine the
    additional costs associated with achieving the same number of sales if a
    model were not used
              $ Benefit of Modelling (based on a database of 100,000)
                    # Mailed   Resp. Rate    Sales    Cost/Comm.   Total Cost   Cost/Sale
 Model (Top 50%)      50,000      27%       13,575        $2        $100,000      $7.37
 No Model             83,796      16%       13,575        $2        $167,593     $12.35
 Difference           33,796                                         $67,593      $4.98

  • In order to acquire 13,575 sales with a random (non-modeled) list, an
    additional 33,796 pieces of mail would be required at an additional cost of
    $67,593 (or $4.98/ticket)

  • Based on this scenario an additional $67,593 can be put toward vital
    research or other initiatives that support the mission
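
The arithmetic behind this example can be sketched as follows. The function name is hypothetical, and the 16.2% baseline rate is an assumption back-calculated from 13,575 sales on 83,796 pieces; the slide shows it rounded to 16%.

```python
def benefit_of_modelling(sales, modelled_pieces, baseline_rate, cost_per_piece):
    """Extra pieces, extra cost, and cost saved per sale if no model were used."""
    random_pieces = sales / baseline_rate        # pieces needed without a model
    extra_pieces = random_pieces - modelled_pieces
    extra_cost = extra_pieces * cost_per_piece
    return extra_pieces, extra_cost, extra_cost / sales


extra_pieces, extra_cost, saving_per_sale = benefit_of_modelling(
    sales=13_575,
    modelled_pieces=50_000,
    baseline_rate=0.162,   # assumed unrounded value behind the slide's 16%
    cost_per_piece=2.00,
)
```

With these inputs the three results round to 33,796 pieces, $67,593 and $4.98 per ticket, in line with the table.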



Historical Results – Past Supporters

[Figure: Campaign Results. Response rate (axis 0% to 60%) by Decile 1 to 10, three series: Lottery A, Lottery B, Lottery C.]


  • Results have remained very consistent over many campaigns


Addressed Mail Model Results
Sales by Source – Repeat Supporters
   Model Rank/   Total        DM           TM           Unaddressed   Public
   Decile        Resp. Rate   Resp. Rate   Resp. Rate   Resp. Rate    Resp. Rate
     1             50%          23%          9%           5.7%          12%
     2             34%          13%          8%           3.9%           9%
     3             20%           7%          5%           2.1%           6%
     4             17%           6%          3%           1.9%           5%
     5             15%           6%          2%           1.6%           5%
     6             13%           6%          1%           1.4%           3%
     7              5%           2%          0%           0.50%          1.4%
     8              3%           1%          0%           0.47%          1.3%
     9              3%           1%          0%           0.42%          1.0%
    10              3%           1%          0%           0.37%          0.7%
    Total          16%           7%          3%           2%             4%
         Not targeted

 • This model benefits the Lottery across all sales channels




Timing of Purchase-Looking at Velocity

      Model Rank   Period 1 Period 2 Period 3
          1-3       45%      40%      15%
          4-7       35%      33%      32%
          8-10      26%      32%      42%
         Total      36%      34%      30%
     New Purchasers 12%      36%      52%
 • Those with higher model ranks are more likely to purchase earlier
   in the campaign

 • Benefits:
    – Early sales allow marketing cuts during the public campaign,
      leaving more funds for mission spending


What about unaddressed type tools-
Unaddressed Mail Acquisition Model

   • Postal walks are selected using a combined ranking index based on:
      – a walk’s past purchase history (or penetration rate)
      – a walk’s previous unaddressed campaign response rate

   • Combined Index = (Postal walk penetration index x 75%) + (Postal walk
     response index x 25%)

   • The Benefit:
      – A tool that estimates the number of pieces to be mailed based
        on required sales from channel
      – More funds directed to mission
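
The combined index above is a 75/25 weighted sum of two indices. Here is a minimal sketch of how walks could be scored and ordered with it; the walk identifiers and index values are hypothetical, and only the weights come from the slide.

```python
PENETRATION_WEIGHT = 0.75  # weight on the postal walk penetration index
RESPONSE_WEIGHT = 0.25     # weight on the postal walk response index


def combined_index(penetration_index, response_index):
    """Combined ranking index for one postal walk."""
    return PENETRATION_WEIGHT * penetration_index + RESPONSE_WEIGHT * response_index


# Hypothetical walks: (penetration index, response index), with 100 = average.
walks = {"WALK_A": (140, 90), "WALK_B": (100, 160), "WALK_C": (80, 70)}

# Best walks first.
ranked = sorted(walks, key=lambda w: combined_index(*walks[w]), reverse=True)
```

Because penetration carries three times the weight of response, WALK_A outranks WALK_B here despite its weaker campaign response index.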




Unaddressed Mail Model Results

[Figure: response rate (axis 0.00% to 1.40%) by Decile 1 to 10, two series: Unaddressed and New.]
                                              Decile




 • Unaddressed Response Benefits:
    – Top 10% of walks is 18x better than the bottom 10% (1.12% vs. 0.06%)
 • New Response Benefits:
    – Top 10% of walks is 3.5x better than the bottom 10% (1.29% vs. 0.37%)



Building a Specific tool for Telemarketing for optimizing
contact rate


    • Designed tool to help optimize calling efforts
       – Reduce cost
    • Key variables
       – Age
       – Loyalty
       – Type of Dwelling
       – Gender




Telemarketing Optimization Model–
Performance by Quartile
 Indexed Results
                   Gross Resp.   Contact   Net Resp.   Cost/
   Rank Quartile   Rate          Rate      Rate        Responder
        1          149%          123%      124%         57%
        2          112%          101%      114%         76%
        3           84%           88%       97%         93%
        4           55%           87%       65%        174%
   Grand Total     100%          100%      100%        100%

 • Results comparison by quartile (top quartile vs. bottom quartile):

     • Gross Response Rate – 2.7 times better
     • Contact Rate – 1.4 times better
     • Net Response Rate – 1.9 times better
     • Cost/Responder – 3.1 times better
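
The "times better" figures follow from dividing the top-quartile index by the bottom-quartile index, with the ratio inverted for cost because lower is better. A short sketch using the indexed values from the table (the names are illustrative):

```python
# Indexed results for the top (rank 1) and bottom (rank 4) quartiles.
indexed_results = {
    "gross_resp_rate": {"top": 149, "bottom": 55},
    "contact_rate": {"top": 123, "bottom": 87},
    "net_resp_rate": {"top": 124, "bottom": 65},
    "cost_per_responder": {"top": 57, "bottom": 174},
}
LOWER_IS_BETTER = {"cost_per_responder"}


def times_better(metric):
    """How many times better the top quartile is than the bottom quartile."""
    values = indexed_results[metric]
    if metric in LOWER_IS_BETTER:
        return values["bottom"] / values["top"]
    return values["top"] / values["bottom"]


improvement = {m: round(times_better(m), 1) for m in indexed_results}
```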


Combining the Tools

   • While these tools all work well individually for their designed
     purpose, there are tremendous benefits in combining them

   • To illustrate, we have combined the following two tools:
      – the Addressed Mail Predictive Model, and
      – the Telemarketing Optimization Model




Combining the Addressed Model
with the TM Optimization Model
   Costs/Resp Index
                      Telemarketing Optimization Model - Quartile
    Addressed Mail
     Rank - Decile      4          3          2          1        Total
         1 to 3       $2.44      $1.52      $1.01      $0.62      $0.70
         4 to 7       $2.23      $1.54      $1.38      $0.96      $1.20
        8 to 10       $2.87      $1.55      $1.47      $1.31      $1.46
         Total        $2.49      $1.54      $1.29      $0.76      $1.00

  • The table shows the “indexed” cost per response for each segment when the two
    tools are combined (results indexed to an average cost of $1 per response)
     – TMO Rank 1/AM Rank 1-3 produces the lowest cost per ticket at $0.62
     – TMO Rank 4/AM Rank 8-10 produces the highest cost per ticket at $2.87
  • Benefits:
     – A tool that allows them to optimize channel spending, thus ensuring
        maximum net return for HSFO




Combining the Addressed Mail Model
with the TM Dial Model
Costs/Resp Index    Indexed Avg. Cost = $1
                   Telemarketing Optimization Model - Quartile
Addressed Mail
 Rank - Decile       4          3          2          1        Total
    1 to 3         $2.44      $1.52      $1.01      $0.62      $0.70
    4 to 7         $2.23      $1.54      $1.38      $0.96      $1.20
    8 to 10        $2.87      $1.55      $1.47      $1.31      $1.46
     Total         $2.49      $1.54      $1.29      $0.76      $1.00
• Key benefit of combined tool is prioritizing channel
   – TMO Rank 1, AM Rank 4-7 and 8-10 names are better targets for TM
     than
   – TMO Rank 2-4, AM Rank 1-3 names (which are better for DM)
• As the Lottery’s tools have evolved, knowledge on how to raise more
  funds for the mission has improved
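
One way to read the combined matrix is to sort its cells by indexed cost per response and work down the list when allocating telemarketing effort. The sketch below does that with the cell values from the table; the data structure and the below-average cut-off at $1.00 are illustrative assumptions, not the Lottery's actual selection rule.

```python
# Indexed cost per response, keyed by (addressed mail deciles, TM optimization rank).
cost_index = {
    ("1-3", 1): 0.62, ("1-3", 2): 1.01, ("1-3", 3): 1.52, ("1-3", 4): 2.44,
    ("4-7", 1): 0.96, ("4-7", 2): 1.38, ("4-7", 3): 1.54, ("4-7", 4): 2.23,
    ("8-10", 1): 1.31, ("8-10", 2): 1.47, ("8-10", 3): 1.55, ("8-10", 4): 2.87,
}

# Cheapest cells first: a candidate calling order for the TM channel.
cells_by_cost = sorted(cost_index, key=cost_index.get)

# Cells that cost less than the indexed average of $1.00 per response.
below_average = [cell for cell in cells_by_cost if cost_index[cell] < 1.00]
```

In this ordering the TMO Rank 1 cell for AM deciles 4-7 ($0.96) comes ahead of the TMO Rank 2 cell for AM deciles 1-3 ($1.01), which is the channel-prioritization point made above.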




Erosion of Contact Model
• The challenge in using the contact model over time was that the
  organization wanted not only to optimize contact but to optimize it on the
  first attempt.

• Listed on the next slide are some results that demonstrated the need for a
  new contact type tool.




Background (cont’d)

•   TM Results
    – The table below shows TM results by integrating the
      Repurchaser Model and Contact Model (First Dial Effort)
    – While the integrated model is ranking well based on response
      rate, it doesn’t have any differentiation amongst the contact
      model ranks (i.e. we do not see any lift)

                        Response Rate vs. Called
       Repurchaser           Contact Model Rank
       Model Rank      1-5     6-10     11-15    16-20     Total
           1-5         7.88%    8.42%    8.35%    8.52%     8.22%
          6-10         2.93%    3.02%    3.45%    3.30%     3.12%
          11-15        2.66%    3.03%    3.14%    3.15%     3.02%
          16-20                                   0.00%     0.00%
               Total   6.05%    6.23%    6.43%    6.19%     6.20%

       Down the rows (Repurchaser Model Rank): ranking nicely
       Across the columns (Contact Model Rank): response rate is relatively flat across all ranks




Objective of Modeling Exercise

• To produce a new tool that optimizes both response rate and contact rate on
  the first effort

• Much exploration was done to determine the best targeting approach.
   – This involved looking at a number of different options in defining the
     objective function

• In order to produce an effective tool, we needed to modify our objective
  function
   – Create an ordinal function with three outcomes rather than a binary one:
      • 0: for no response and no contact
      • 1: Contact and no response
      • 2: Contact and Response
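
The three-level target can be expressed as a small encoding function. This is a sketch of the outcome definition only, not of the model fitted to it; the function name and argument names are hypothetical, and treating a response without first-dial contact as 0 is an assumption, since the slide does not list that case.

```python
def ordinal_outcome(contacted_first_dial, responded):
    """Encode a telemarketing record as the ordinal target 0, 1 or 2.

    0: no contact (and therefore no response)
    1: contact and no response
    2: contact and response
    """
    if not contacted_first_dial:
        # Assumption: a response cannot be credited without first-dial contact.
        return 0
    return 2 if responded else 1


# Hypothetical call records: (contacted on first dial, responded).
records = [(False, False), (True, False), (True, True)]
targets = [ordinal_outcome(c, r) for c, r in records]
```

Ordering the outcomes this way lets a single score rank prospects by both contactability and response, which is what the binary response target could not do.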



Dial Model Results
Gains Chart – Validation (vs. Spring 2006 Validation
Sample-Backtesting)

 R.R. = response rate. Interval R.R. = contacted and responded on 1st dial.

         % of           # of       Cum. # of                         # of         Cum. # of              % of Resp.   Cum. % Resp. Interval Lift   Cum. Lift  $ Benefit of
 Rank  Prospects      Prospects    Prospects        Interval R.R.  Responses      Responses   Cum. R.R.  in Interval  in Interval    in R.R.        in R.R.     Modelling
  1      0-10%          15,444       15,444             7.85%        1,212          1,212       7.85%       36.19%       36.19%      361.92%        361.92%     $32,360.69
  2     10%-20%         15,444       30,888             4.27%          659          1,872       6.06%       19.69%       55.88%      196.86%        279.39%     $44,328.63
  3     20%-30%         15,444       46,332             2.33%          360          2,232       4.82%       10.74%       66.62%      107.42%        222.07%     $45,245.73
  4     30%-40%         15,444       61,776             2.29%          354          2,585       4.19%       10.56%       77.18%      105.58%        192.95%     $45,934.98
  5     40%-50%         15,444       77,221             1.55%          239          2,825       3.66%        7.15%       84.32%        71.46%       168.65%     $42,408.97
  6     50%-60%         15,444       92,665             1.03%          159          2,984       3.22%        4.75%       89.07%        47.49%       148.46%     $35,920.88
  7     60%-70%         15,444      108,109             0.66%          102          3,086       2.85%        3.04%       92.12%        30.43%       131.59%     $27,325.16
  8     70%-80%         15,444      123,553             0.63%           97          3,183       2.58%        2.90%       95.02%        29.05%       118.78%     $18,558.55
  9     80%-90%         15,444      138,997             0.50%           77          3,260       2.35%        2.31%       97.33%        23.05%       108.14%       $9,051.42
  10    90%-100%        15,444      154,441             0.58%           90          3,350       2.17%        2.67%      100.00%        26.74%       100.00%           $0.00
                       154,441                          2.17%        3,350


• The gains chart above is validated on the Spring 2006 validation sample
• The validation covers the overall solution (i.e. the first rule,
  TM_LAST3 = 1, plus the final model applied to the balance)
• Based on 154,441 prospects and an average cost of $0.80 per call, we
  observe excellent rank ordering, with a lift of more than 10 to 1
  between the top decile and the bottom decile (7.85% vs. 0.58%).



First Dial Model Results Validation Lorenz
Curve
[Figure: Response Curve. Interval Response Rate (%), axis 0% to 9%, plotted by Rank (Decile) 1 to 10.]



• When applying the equation to the validation sample the following results
  are obtained. The top 10% of the file based on model score has an
  observed response rate of 7.85% versus an average response rate of
  2.17% for the complete file and versus 0.58% for the bottom 10% of the file.

1st Dial Model Results
Gains Chart – Rollout (TM Channel DB = 300,000)
   R.R. = response rate. Interval R.R. = contacted and responded on 1st dial.

           % of           # of       Cum. # of                         # of        Cum. # of              % of Resp.   Cum. % Resp. Interval Lift   Cum. Lift  $ Benefit of
   Rank  Prospects      Prospects    Prospects        Interval R.R.  Responses     Responses   Cum. R.R.  in Interval  in Interval    in R.R.        in R.R.     Modelling
    1      0-10%          30,000       30,000             7.85%        2,355         2,355        7.85%      36.19%       36.19%       361.92%       361.92%     $62,860.30
    2     10%-20%         30,000       60,000             4.27%        1,281         3,636        6.06%      19.69%       55.88%       196.86%       279.39%     $86,107.88
    3     20%-30%         30,000       90,000             2.33%          699         4,335        4.82%      10.74%       66.62%       107.42%       222.07%     $87,889.35
    4     30%-40%         30,000      120,000             2.29%          687         5,022        4.19%      10.56%       77.18%       105.58%       192.95%     $89,228.22
    5     40%-50%         30,000      150,000             1.55%          465         5,487        3.66%       7.15%       84.32%        71.46%       168.65%     $82,378.98
    6     50%-60%         30,000      180,000             1.03%          309         5,796        3.22%       4.75%       89.07%        47.49%       148.46%     $69,775.93
    7     60%-70%         30,000      210,000             0.66%          198         5,994        2.85%       3.04%       92.12%        30.43%       131.59%     $53,078.84
    8     70%-80%         30,000      240,000             0.63%          189         6,183        2.58%       2.90%       95.02%        29.05%       118.78%     $36,049.79
    9     80%-90%         30,000      270,000             0.50%          150         6,333        2.35%       2.31%       97.33%        23.05%       108.14%     $17,582.30
    10    90%-100%        30,000      300,000             0.58%          174         6,507        2.17%       2.67%      100.00%        26.74%       100.00%           $0.00
                         300,000                          2.17%        6,507

• The dollar benefit of modelling is determined by calculating the cost of the additional number of
  customers that would need to be targeted in order to acquire the same number of sales if the
  model were not used.

                          $ Benefit of Modelling
                               Cum. Response
               # Contacted         Rate        # Purchasers     Total Cost    Cost/Sale
  Model            120,000         4.19%           5,022        $96,000.00      $19.12
  No Model         231,535         2.17%           5,022       $185,228.22      $36.88
  Difference       111,535                                      $89,228.22      $17.77
  *Assuming $0.80 per call



• In order to acquire 5,022 purchasers with a random list, an additional 111,535 calls
  would have to be made at a cost of $89,228.22. Based on this scenario, the model
  saves you $17.77 for every sale.
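
The arithmetic behind the table can be sketched as follows. This is a minimal illustration, not the presenters' own tooling: the function name and argument names are assumptions, while the inputs (120,000 modelled contacts, 5,022 purchasers, a 6,507 / 300,000 base response rate, and $0.80 per call) come from the slides.

```python
def modelling_benefit(model_contacts, purchasers, base_contacts,
                      base_purchasers, cost_per_call=0.80):
    """Compare the cost of reaching `purchasers` sales with and without a model.

    The no-model scenario assumes a random list that responds at the
    overall base rate (base_purchasers / base_contacts).
    """
    base_rate = base_purchasers / base_contacts
    # Calls a random list would need to produce the same number of sales.
    no_model_contacts = purchasers / base_rate
    model_cost = model_contacts * cost_per_call
    no_model_cost = no_model_contacts * cost_per_call
    return {
        "extra_calls": no_model_contacts - model_contacts,
        "savings": no_model_cost - model_cost,
        "model_cost_per_sale": model_cost / purchasers,
        "no_model_cost_per_sale": no_model_cost / purchasers,
        "saving_per_sale": (no_model_cost - model_cost) / purchasers,
    }


result = modelling_benefit(120_000, 5_022, 300_000, 6_507)
print(f"Extra calls without model: {result['extra_calls']:,.0f}")
print(f"Total savings: ${result['savings']:,.2f}")
print(f"Saving per sale: ${result['saving_per_sale']:.2f}")
```

Using the unrounded base rate (6,507 / 300,000) rather than the displayed 2.17% is what reproduces the 231,535 figure in the "No Model" row.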
First Dial Model Results (Validation on Contact Rate)

• The table below shows contact rate results when we integrate the First
  Dial Model with the Repurchaser Model.
• The integrated model shows some lift across the 1st Dial Model ranks,
  but only in broad terms.
• The key validation metric is how well the integrated model optimizes overall
  response rate (shown on the next slide).


                                            Contact Rate
                  Repurchaser              1st Dial Model Rank
                  Model Rank       1-5        6-10     11-15        16-20    Total
                      1-5         25.77%     24.88%     21.36%      22.27%   25.23%
                     6-10         24.28%     26.20%     23.33%      21.78%   24.25%
                     11-15        23.89%     27.51%     27.21%      18.52%   25.05%
                     16-20                              33.33%               33.33%
                          Total   25.39%     25.70%     23.51%      21.89%   24.88%
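
A two-model grid like the one above can be built by tallying outcomes by both rank bands at once. The sketch below is illustrative only: the function name, the record layout, and the toy records are assumptions, not data from the campaign.

```python
from collections import defaultdict


def rate_grid(records):
    """Build a rate table from (row_band, col_band, outcome) records.

    `outcome` is 1 if the name was contacted (or responded), else 0.
    Returns {(row_band, col_band): rate}, including "Total" margins.
    """
    hits = defaultdict(int)
    counts = defaultdict(int)
    for row, col, outcome in records:
        # Each record feeds its own cell, both margins, and the grand total.
        for key in ((row, col), (row, "Total"), ("Total", col), ("Total", "Total")):
            counts[key] += 1
            hits[key] += outcome
    return {key: hits[key] / counts[key] for key in counts}


# Hypothetical records: (Repurchaser rank band, 1st Dial rank band, contacted?)
toy_records = [
    ("1-5", "1-5", 1),
    ("1-5", "1-5", 0),
    ("1-5", "6-10", 0),
    ("6-10", "1-5", 1),
]
grid = rate_grid(toy_records)
print(f"{grid[('1-5', '1-5')]:.2%}")
```

Empty cells in the slide's table correspond to band combinations with no names, which simply never appear as keys in the returned dictionary.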




First Dial Model Results (Validation on Response Rate)

• The tables below show telemarketing (TM) results when we integrate the First Dial
  Model with the Repurchaser Model.
   – The integrated model shows lift in both directions (vertical and
     horizontal); it is performing well at predicting overall response
     rate.
                 Repurchaser R.R. vs. 1st Dial R.R.
 Repurchaser            1st Dial Model Rank
 Model Rank       1-5      6-10      11-15       16-20      Total
     1-5         10.24%     3.10%      4.11%        2.52%    8.22%
    6-10          5.86%     1.78%      1.74%        1.55%    3.12%
    11-15         4.28%     2.62%      1.47%        0.42%    3.02%
    16-20                              0.00%                 0.00%
         Total    9.12%     2.41%      2.12%        1.91%    6.20%

                                 ROI
 Repurchaser             1st Dial Model Rank
 Model Rank       1-5        6-10     11-15     16-20    Total
     1-5          1280%        387%      514%     315%    2496%
    6-10            733%       223%      218%     193%    1367%
    11-15           535%       328%      184%      52%    1100%
    16-20                                                     0%
         Total    2548%      938%      916%      561%     4963%




            But what about response on the 1st dial?


Spring 2008 Results
Integrating Repurchaser & Contact Models
• A new Contact Model was developed in January 2008 with the objective of
  predicting contact and response on the first dial.
                         Response Rate
                                       Contact Model Rate
Repurchaser Model Rank        1-5        6-10    11-15    16-20       Total
           1-5                   3.40%   1.00%    0.90%    1.00%       3.00%
          6-10                   2.30%   1.00%    0.80%    0.90%       1.90%
         11-15                   1.60%   0.90%                         1.40%
         16-20
          Total                 3.30%    1.00%    0.90%     1.00%      2.70%




         • This segment outperforms names that rank better on the
           response model because these names are easier to
           contact.

         • But how do these results hold up in the field?



Live Campaign Results-Fall 2008
                              Response on 1st dial Model
  Predictive Response Model          1          2         3             4          5      6     Grand Total
                1                 11.8%       2.9%      1.6%            **         **     **       7.4%
                2                  9.7%       2.8%      1.4%          1.5%         **     **       6.0%
                3                 8.5%        2.6%       1.4%         0.9%          **     **      4.9%       1-6 pred. Resp. Model
                4                 8.1%        3.1%       2.1%         1.6%          **     **      4.5%
                5                 6.5%        2.6%       2.0%         1.5%        2.0%   1.3%      3.5%
                6                 5.8%        2.1%       1.9%         1.4%        0.6%   0.6%      2.6%
                7                 5.2%        1.9%       1.4%         1.3%        1.1%   1.0%      2.9%
                8                 3.4%        2.3%       1.4%         1.3%        1.0%   0.4%      2.1%
                9                 3.0%        1.9%       1.2%         1.6%        0.6%   0.8%      1.7%
               10                 3.7%          **       2.2%         1.3%        0.8%   1.0%      2.3%
               11                 2.5%          **       1.6%         0.9%        0.9%   1.1%      1.3%
               12                 1.0%          **       1.1%         1.5%        1.4%   0.2%      1.2%
             TOTAL                7.6%        2.7%       1.7%         1.4%        1.0%   0.7%      3.9%

                                                     1-6 dial model




             The dial model adds lift over and above the
             predictive response model.
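
One way to read this from the table is to index each cell against its row's grand total. The sketch below is an illustration under that assumption; the function name is hypothetical, and the rates are taken from the dial-model rank 1 column and the Grand Total column above.

```python
def lift_index(cell_rate, baseline_rate):
    """Return the cell's response rate as an index where 100 equals the baseline."""
    return round(100 * cell_rate / baseline_rate)


# (dial rank 1 rate %, row grand total %) for selected response-model ranks
rows = {1: (11.8, 7.4), 6: (5.8, 2.6), 12: (1.0, 1.2)}
indices = {rank: lift_index(cell, total) for rank, (cell, total) in rows.items()}
for rank, index in indices.items():
    print(f"Response model rank {rank}: dial rank 1 index = {index}")
```

An index above 100 means the best dial-model names respond better than the average name in the same response-model rank; the bottom row shows that this does not hold everywhere in the grid.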




Where Is the Real Challenge?
• For organizations conducting over 50 campaigns a year, the need for
  more marketing science becomes paramount.

• Optimization alone is not good enough: the most profitable products will always
  claim the available customers, leaving few for other products and services.
   – How do organizations grow their other products and services?
   – The key is still to optimize profit, but within a suboptimal environment,
     that is, under constraints.
      • Example: Company A wants to grow its credit card base, but the current
                 situation would select very few customers for this program
                 (< 50M customers). The credit card team wants to select a
                 minimum of 200,000 names for this program.

      • What to do?


Where Is the Real Challenge?
• Use Operations Research techniques to help optimize a given
  scenario under certain constraints.

• Keys to building this type of solution:
   – We can build optimization routines that maximize a given function at the
     customer level.
      • Example: we observed this when building solutions to optimize ROI for the
        insurance company.

   – Additional tools such as SAS/OR are required when we introduce
     constraints like the Visa constraint on the previous slide.
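
To make the idea of a minimum-volume constraint concrete, here is a small pure-Python sketch. It is not SAS/OR and not the presenters' method: the function name, the product names, and the profit figures are all hypothetical, and it only handles the simple case of one product with a minimum quota, where each customer receives exactly one offer.

```python
def assign_with_minimum(profits, quota_product, minimum):
    """Assign one product per customer, maximizing total expected profit
    while placing at least `minimum` customers on `quota_product`.

    profits: {customer: {product: expected_profit}}; every customer is
    assumed to have an entry for `quota_product`.
    """
    assignment = {}
    switch_costs = []  # (profit given up by moving to the quota product, customer)
    for customer, by_product in profits.items():
        best = max(by_product, key=by_product.get)
        assignment[customer] = best  # unconstrained optimum
        if best != quota_product:
            loss = by_product[best] - by_product[quota_product]
            switch_costs.append((loss, customer))

    on_quota = sum(1 for product in assignment.values() if product == quota_product)
    shortfall = max(minimum - on_quota, 0)
    # Fill the quota with the customers who are cheapest to move.
    for _, customer in sorted(switch_costs)[:shortfall]:
        assignment[customer] = quota_product
    return assignment


# Hypothetical expected profit per customer and product
toy_profits = {
    "a": {"insurance": 10, "card": 2},
    "b": {"insurance": 8, "card": 7},
    "c": {"insurance": 3, "card": 5},
    "d": {"insurance": 9, "card": 4},
}
plan = assign_with_minimum(toy_profits, quota_product="card", minimum=2)
total_profit = sum(toy_profits[c][p] for c, p in plan.items())
print(plan, total_profit)
```

With a single quota, moving the lowest-loss customers is enough. Once several products carry minimums, contact limits, or budgets at the same time, the problem becomes a general constrained optimization, which is where a dedicated solver earns its place.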




The Future
• As the marketing science world evolves, we will see the use of more
  advanced mathematical techniques.
• But their adoption will be tempered by the practical benefits they deliver.

• Operations Research techniques in the business world were traditionally
  confined to inventory control.
   – Businesses would try to optimize profit under constraints
     related to inventory volumes.

• In today’s marketing world, campaign frequency will continue to increase
  within organizations.
   – Marketers will need to recognize that customers must be shared across
      campaigns, though not necessarily in an optimal manner.
   – These suboptimal situations, or constraints, will need to become
      part of the overall solution.