# Regression Basics


```
Regression

Marketing Analytics   Rajkumar Venkatesan
Conservatism in Major League Baseball

• Batting Average = Hits/(Opportunities – Walks)
• On-Base % = (Hits + Walks)/Opportunities
• OVERUSED: “small ball”
  – Sacrifice bunt
    • Give up an out to advance the runner
  – Stealing bases
    • Risk an out to advance the runner
• UNDERUSED
  – Don’t risk making outs, and runs will take care of themselves

Diagnosing Market Response:
Regression Analysis

[Figure: $ spent by a customer plotted against number of promotions; image not reproduced]

Example: Shopper Card Program
Units purchased = a + b1*price paid + b2*feature ad + b3*display

Data

Example: Regression Output From Excel

Price Elasticity

Price elasticity of demand (PED) is the ratio of the
percentage change in quantity demanded (%∆Q) to the
percentage change in price (%∆P).

PED = [Change in Sales/Change in Price] × [Price/Sales] = (∆Q/∆P) × (P/Q)

Belvedere Vodka

Year   Sales (units)   Ln(Sales)   Price (dollars)   Ln(Price)   Advertising (dollars)   Ln(Advertising)
2007             410       6.016            215.44       5.373                 20486.1              9.93
2006             381       5.943            211.45       5.354                  2923.5              7.98
2005             365       5.900            207.45       5.335                  4826.3              8.48
2004             369       5.911            240.87       5.484                 13726.6              9.53
2003             339       5.826            241.33       5.486                 10330.2              9.24
2002             306       5.724            247.55       5.512                 13473.6              9.51
2001             273       5.609            240.48       5.483                  9264.6              9.13

Belvedere Price Elasticity

Regression Statistics
Multiple R           0.67536
R Square             0.45611
Adjusted R Square    0.34733
Observations               7

              Coefficients   Standard Error   t Stat   P-value
Intercept           12.686            3.340    3.798     0.013
Ln (Price)          −1.259            0.615   −2.048     0.096

Regression Statistics
Multiple R           0.06102
R Square             0.00372
Adjusted R Square   −0.19553
Standard Error       0.15252
Observations               7

                    Coefficients   Standard Error   t Stat   P-value
Intercept                  5.963            0.850    7.018     0.001
Ln (Advertising)          −0.013            0.093   −0.137     0.897

Customer Retention: Logistic Regression

• Linear regression assumes the dependent variable (DV) to be
continuous (and normally distributed)
  – e.g., profits, which can range from negative (−) through 0 to positive (+)

• Often we have variables that take only 2 different values
  – Retain (1) vs. lose (0) a customer

Customer Retention: Logistic Regression

• With categorical (1/0) dependent variables, linear regression
can result in nonsensical estimated probabilities (e.g.,
probability of retention > 100%)

• A model that avoids this problem is the so-called “logistic
regression”

  – Predicted probabilities are bounded between 0 and 1

Logistic Regression:
The connection to Bookies

The ratio of the chance of retention to the chance of churn
is called the “odds”.

Super Bowl 2012 Odds
Green Bay Packers                   3.45 to 1
New England Patriots                4.4 to 1
New Orleans Saints                  8.5 to 1
Baltimore Ravens                    9.5 to 1
San Diego Chargers                  10.5 to 1
Detroit Lions                       13 to 1
Houston Texans                      17.5 to 1
Pittsburgh Steelers                 20 to 1

What Are Odds?
• If you chose a random day of the week (7 days), then the odds that you would
  choose a Sunday would be:
  – (1/7)/[1 − (1/7)] = 1/6, not 1/7.

• The odds against you choosing Sunday are 6/1 = 6, meaning that it is 6 times more
  likely that you don't choose Sunday.

• Generally, 'odds' are not quoted to the general public in this format because of the
  natural confusion with the chance of an event occurring being expressed
  fractionally as a probability.

• Although a bookmaker may (for his own purposes) use 'odds' of 'one-sixth', the
  overwhelming everyday use by most people is odds of the form 6 to 1, 6-1, or 6/1
  (all read as 'six-to-one'), where the first figure represents the number of ways of
  failing to achieve the outcome and the second figure is the number of ways of
  achieving a favorable outcome: thus these are "odds against".

• An event with m to n "odds against" would have probability n/(m + n), while an
  event with m to n "odds on" would have probability m/(m + n).
Source: http://en.wikipedia.org/wiki/Odds

Example: Will a Physician Prescribe a Drug?
Data
Model

Example: XLStat Output

Logistic Regression: Coefficients

• Key difference: coefficients are not interpreted directly, as they are in
  linear regression

• Need to calculate the “odds ratio”
  – For example, if the logit regression coefficient b = 2.303,
    then the odds ratio is: e^b = e^2.303 ≈ 10

  – → When the IV increases by one unit, the odds that the DV = 1
    increase by a factor of 10, holding the other variables constant.

Example: XLStat Output

What is the Odds Ratio for Sales Calls?

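– Caution: an odds ratio close to one does NOT by itself mean that the
  coefficient is statistically insignificant; it means that a one-unit change in
  the IV leaves the odds nearly unchanged. (Odds equal to one, as opposed to
  an odds ratio of one, correspond to a 50/50 chance of the outcome.)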

Example: Physicians Prescriptions
For each additional sales call, the odds of a physician prescribing a drug
increase by 43% (exp(0.361) ≈ 1.43), holding everything else constant.

Prob (prescription) when sales calls is zero
= exp(-0.575)/[1+exp(-0.575)] ≈ 0.36
(odds = 0.36/(1-0.36) ≈ 0.56)

Prob (prescription) when sales calls is one
= exp(-0.575+0.361)/[1+exp(-0.575+0.361)] ≈ 0.45

Reaction to econometric analysis?

Combined Effect of Age and Online

[Figure: average profit by age and online status; image not reproduced]

Diagnosing Customer Profits and Retention: Common Drivers

Goal: To identify key lever(s) that “drive” customer value

Behavioral characteristics
• Purchase volume/quantity
• Length of relationship
• Number of product categories purchased
• Selling costs
• Customer satisfaction

Demographic/firmographic characteristics
• Age, income, gender
• Loyalty program membership
• Firm size

Psychographic characteristics
• Attitudes, values
• Interests
• Activities

Model Building
• Determine properties of dependent variable
– Linear, positive values, dummy variable

• Select model that reflects dependent
variable properties
– Logistic regression for dummy variables

Model Building
• Include the decision variable of interest
among the independent variable set

• Include common control variables
– Quality, Distribution, Demographics, Tenure,
Competition etc.

Model Building
• Does including lagged dependent variable

• If there is a unit root, use the difference as the
dependent variable

• Are some independent variables correlated at
more than 0.8? If so, can we eliminate one
of the correlated variables or combine them?

Model Building
• Are some variables Missing at Random
(MAR) or are they missing systematically?

• If variables are missing systematically, are
there proxies that can replace the missing
variables?

Model Building
• Does the model hint at causality, or is it a
correlational model?
– Are dependent and independent variables
measured at the same time?
– Are sufficient controls for confounding
variables included?
– Could reverse causation reasonably exist?
– Do we need to recommend an experiment?


```
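
As a rough check on the Belvedere price-elasticity slide and the physician-prescription slide, the Python sketch below refits the log-log regression from the sales and price columns transcribed in the deck, then evaluates the logistic formulas. The function names (`fit_simple_ols`, `logistic_probability`, `odds`) are illustrative and do not come from the deck, which used Excel and XLStat. The logistic intercept (-0.575) and sales-call coefficient (0.361) are read off the slide, not re-estimated.

```python
import math

# Belvedere data as transcribed from the slide (2007 back to 2001).
sales_units = [410, 381, 365, 369, 339, 306, 273]
price_dollars = [215.44, 211.45, 207.45, 240.87, 241.33, 247.55, 240.48]


def fit_simple_ols(x, y):
    """Ordinary least squares for y = a + b*x with one predictor.

    Returns (intercept, slope, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return intercept, slope, r_squared


def odds(p):
    """Convert a probability into odds in favor: p / (1 - p)."""
    return p / (1.0 - p)


def logistic_probability(intercept, coefficient, x):
    """Probability implied by a one-variable logit model at value x."""
    z = intercept + coefficient * x
    return math.exp(z) / (1.0 + math.exp(z))


# Log-log model: ln(Sales) = a + b * ln(Price); b is the price elasticity.
ln_price = [math.log(p) for p in price_dollars]
ln_sales = [math.log(s) for s in sales_units]
intercept, elasticity, r_squared = fit_simple_ols(ln_price, ln_sales)
# The Excel output on the slide reports an intercept of 12.686,
# a slope of -1.259, and an R Square of 0.45611 for this model.
print(f"intercept={intercept:.3f} elasticity={elasticity:.3f} R^2={r_squared:.3f}")

# Logit coefficients read off the physician-prescription slide (assumed inputs).
LOGIT_INTERCEPT = -0.575
LOGIT_SALES_CALLS = 0.361

odds_ratio = math.exp(LOGIT_SALES_CALLS)  # multiplicative change in odds per call
p_zero = logistic_probability(LOGIT_INTERCEPT, LOGIT_SALES_CALLS, 0)
p_one = logistic_probability(LOGIT_INTERCEPT, LOGIT_SALES_CALLS, 1)
print(f"odds ratio={odds_ratio:.2f} P(0 calls)={p_zero:.2f} P(1 call)={p_one:.2f}")

# Day-of-week example: odds of picking Sunday at random.
print(f"odds of Sunday={odds(1 / 7):.4f}")
```

A one-predictor least-squares formula is used here so that the sketch needs only the standard library. With more predictors, as in the shopper card example, a regression package would be the more natural choice.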