SKEMA
MASTER BUSINESS & ECONOMICS
Introduction to Quantitative
Methods & Econometrics
Lionel Nesta
Observatoire Français des Conjonctures Economiques
Lionel.nesta@ofce.sciences-po.fr
Objective of the Course
The objective of the class is to provide students with a set
of techniques to analyze quantitative data. It covers the
application of quantitative and statistical approaches
developed in the social sciences, for future decision
makers, policy makers, stakeholders, managers, etc.
All classes are computer-based and use the SPSS
statistical package. The aim is to reach a level of
competence that enables students both to
read and understand the work of others and to carry out
their own research.
Class Password: stmarec123
Examples
Rise in biotechnology
Should the EU fund fundamental research in biotechnology?
Has biotechnology increased the productivity of firm-level R&D?
Did it increase the speed of discovery in pharmaceutical R&D?
Increasing university-industry collaborations
Does it facilitate innovation by firms?
Does it increase the production of new knowledge by academics?
Does it modify the fundamental/applied nature of research?
Examples
Economic (productivity) Growth
Does it come mainly from new firms or from improvements in existing firms?
Is market selection operating correctly?
Why do good firms exit the market?
How does the organisation of knowledge impact on performance?
How do knowledge stock and specialisation impact on productivity?
How do firms enter into new technological fields?
Do firms diversify in new technologies/businesses purposively?
Structure of the Class
Class 1 : Descriptive Statistics
Mean, variance, standard deviation
Data management
Class 2 : Statistical Inference
Distributions
Comparison of means
Class 3 : Relationship Between Variables
ANOVA, Chi-Square
Correlation
Class 4 : Ordinary Least Squares (OLS)
Correlation coefficient, simple regression
Multiple regression
Class 5 : Extension to OLS
Regression diagnostics
Qualitative explanatory variables
Class 6 : Qualitative Dependent Variables
Linear probability model
Maximum likelihood (logit, probit)
Class 1
Descriptive Statistics
Types of Data
Descriptive statistics is the branch of statistics which gathers all
techniques used to describe and summarize quantitative and
qualitative data.
Quantitative data
Continuous
Measured on a scale (values lie within a range)
The size of the number reflects the amount of the variable
Age; wage; sales; height; weight; GDP
Qualitative data
Discrete, categorical
The number reflects the category of the variable
Type of work; gender; nationality
Descriptive Statistics
Any means of summarizing data concisely is useful: graphs,
charts, tables.
Quantitative data
Graphs: scatter plots; line plots; histograms
Central tendency
Dispersion
Qualitative data
Graphs: pie charts; bar charts
Tables, frequency, percentage, cumulative percentage
Cross tables
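As a rough illustration of the frequency, percentage and cumulative percentage columns mentioned for qualitative data, here is a minimal Python sketch. The list of firm types is made up for the example; it is not the course dataset.

```python
from collections import Counter

# Hypothetical sample of a qualitative variable (firm type), illustrative only.
types = ["LDF", "DBF", "LDF", "LDF", "DBF", "LDF", "LDF", "DBF", "LDF", "LDF"]

counts = Counter(types)  # frequency of each category
n = len(types)

rows = []
cumulative = 0.0
for category in sorted(counts):
    pct = 100 * counts[category] / n  # percentage of all observations
    cumulative += pct                 # running (cumulative) percentage
    rows.append((category, counts[category], pct, cumulative))

for category, freq, pct, cum in rows:
    print(f"{category}  frequency={freq}  percent={pct:.1f}  cumulative={cum:.1f}")
```

The cumulative percentage of the last category should always reach 100, which is a quick check that no observation was dropped.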
Central Tendency and Dispersion
A distribution is an ordered set of values showing how many
times each occurred, from the lowest to the highest value or the
reverse.
Central tendency: measures of the typical value around which the
scores of a distribution cluster (mode, median, mean).
Dispersion: measures of how much the scores fluctuate around that
central value.
In other words, measures of central tendency produce
stylized facts, whereas measures of dispersion indicate how
representative a given stylized fact is.
Central Tendency
The mode
The most frequent score in a distribution is
called the mode.
The median
The middle value of all observed values:
50% of observed values are higher and 50% of
observed values are lower than the median
The mean
The sum of all of the values divided by the
number of values
$$\bar{X} = \frac{1}{N} \sum_{i=1}^{N} x_i$$
The mode, the mean and the median are equal if the distribution is symmetrical and unimodal.
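The three measures of central tendency can be sketched in a few lines of Python using the standard `statistics` module. The patent counts below are invented for illustration and are not taken from the course data.

```python
import statistics

# Hypothetical patent counts for nine firms (illustrative values only).
patents = [0, 2, 2, 3, 5, 8, 2, 40, 1]

mode = statistics.mode(patents)      # most frequent value
median = statistics.median(patents)  # middle value of the sorted data
mean = sum(patents) / len(patents)   # sum of the values divided by N

print(mode, median, mean)
```

In this made-up sample the single large value (40) pulls the mean well above the median, which is typical of a right-skewed distribution.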
Dispersion
The range
Difference between the maximum and minimum values
$$R = x_{\max} - x_{\min}$$
The variance
Average of the squared differences between data points and the mean (the average quadratic deviation)
$$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \bar{X})^2}{N}$$
The standard deviation
Square root of the variance; it therefore measures the spread of data about the mean, in the same units as the data
$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \bar{X})^2}{N}}$$
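A minimal Python sketch of the range, variance and standard deviation, dividing by N as in the definitions on this slide. The data values are made up. The sample variance (dividing by N - 1) is computed alongside as an assumption about what statistical packages commonly report; it is not part of the slide's definition.

```python
import math

# Illustrative data, not from the course dataset.
x = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(x)
mean = sum(x) / n

value_range = max(x) - min(x)  # R = x_max - x_min

# Sum of squared deviations from the mean.
squared_deviations = sum((xi - mean) ** 2 for xi in x)

variance = squared_deviations / n               # divides by N, as on the slide
std_dev = math.sqrt(variance)                   # same units as the data
sample_variance = squared_deviations / (n - 1)  # N - 1 version, for comparison

print(value_range, variance, std_dev, sample_variance)
```

With few observations the two variance versions differ noticeably; with several hundred firms the difference becomes negligible.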
Research Productivity in the
Bio-pharmaceutical Industry
EU Framework Programme 7
Stylised Facts about Modern Biotech
1. Innovations emerge from uncertain, complex processes
involving knowledge and markets: role of networks.
2. Economic value is created in many ways, both globally and
in geographical agglomerations.
3. Various linkages exist among diverse actors (large diversified
firms (LDFs), dedicated biotechnology firms (DBFs), universities,
venture capital) in innovation processes,
but the firm plays a particularly important role.
4. Regulations, social structures and institutions affect ongoing
innovation processes as well as their impacts on
society: importance of intellectual property rights (IPR).
SPSS
Statistical Package for the Social Sciences
The SPSS software
Statistical Package for the Social Sciences (first released in 1968)
Among the most widely used programs for statistical analysis
in the social sciences.
Used by market researchers, health researchers, survey companies,
government, education researchers, and others.
Data management (case selection, file reshaping, creating
derived data)
Features of SPSS are accessible via pull-down menus.
The pull-down menu interface generates command syntax.
SPSS : Opening SPSS
SPSS : Importing data
Settings in the “import text” dialogue box (numbers refer to the dialogue steps)
No predefined format (1)
Delimited (2)
First line contains the variable names (2)
One observation per line; import all observations (3)
Tab delimited only (4)
Finish (6)
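For readers who want to see what these import settings amount to outside SPSS, here is a small Python sketch that reads tab-delimited text whose first line holds the variable names, one observation per line. The file content and the column name `firm` are hypothetical; `type` and `patent` mirror variable names used later in the slides.

```python
import csv
import io

# Hypothetical tab-delimited file content: first line = variable names,
# then one observation (firm) per line.
raw = "firm\ttype\tpatent\nA\tbiotechnologie\t12\nB\tpharmaceutique\t40\n"

# delimiter="\t" corresponds to the "tab delimited only" setting.
reader = csv.DictReader(io.StringIO(raw), delimiter="\t")
records = [dict(row) for row in reader]

# Everything is read as text; numeric variables must be converted explicitly.
for record in records:
    record["patent"] = int(record["patent"])

print(records)
```

With a real file, `io.StringIO(raw)` would be replaced by an opened file object.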
SPSS windows
SPSS opens its main windows automatically
The datasheet window
Observe, manage, modify and create data
The results window
Everything you do will be stored there
The syntax window can be opened
SPSS : Data sheet (1)
SPSS : Data sheet (2)
SPSS : Result / Journal
SPSS : Saving data
SPSS : working, at last!
Recoding Variables
Changing existing values to new values (biotechnologie → DBF,
pharmaceutique → LDF)
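The recoding step can be sketched in Python as a simple lookup from old values to new values. This is only an illustration of the idea, not the SPSS procedure itself, and the list of sector values is made up.

```python
# Mapping from existing values to new values, as on the slide.
recode_map = {"biotechnologie": "DBF", "pharmaceutique": "LDF"}

# Hypothetical values of the original variable for four firms.
sectors = ["biotechnologie", "pharmaceutique", "pharmaceutique", "biotechnologie"]

# Recode into a new variable and leave the original untouched.
firm_type = [recode_map[s] for s in sectors]

print(firm_type)
```

Writing the result to a new variable rather than overwriting the old one keeps the original values available for checking.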
Computing New Variables
Taking the logarithm (normalization of continuous variables)
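A minimal Python sketch of computing a new variable as the natural logarithm of an existing one. The values are illustrative. Since the logarithm of zero is undefined, zeros are set to missing here; that treatment is an assumption made for the example, not something the slides prescribe.

```python
import math

# Illustrative R&D expenditures (strictly positive).
rd = [858.5, 6256.2, 58116.6, 486940.2]
lnrd = [math.log(v) for v in rd]  # natural logarithm

# A count variable may contain zeros, for which the log is undefined.
# Assumption for this sketch: treat those cases as missing (None).
patent = [0, 3, 12]
lnpatent = [math.log(v) if v > 0 else None for v in patent]

print(lnrd)
print(lnpatent)
```

The log compresses large values much more than small ones, which is why it tends to reduce right skew.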
Creating Dummy Variables
Coding a category as a 0/1 indicator variable (e.g. pharma, biotech)
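Creating dummy variables can be sketched in Python as follows: each dummy equals 1 when the firm belongs to the category and 0 otherwise. The firm types listed are made up; the names `biotech` and `pharma` follow the variable names shown in the descriptive statistics output.

```python
# Hypothetical firm types for five firms.
firm_type = ["DBF", "LDF", "LDF", "DBF", "LDF"]

# One 0/1 indicator per category.
biotech = [1 if t == "DBF" else 0 for t in firm_type]
pharma = [1 if t == "LDF" else 0 for t in firm_type]

# The mean of a dummy variable is the share of observations in the category.
share_pharma = sum(pharma) / len(pharma)

print(biotech, pharma, share_pharma)
```

This is why the mean of a dummy in a descriptive statistics table can be read directly as a proportion.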
Computation of Descriptive Statistics
Descriptive Statistics
Descriptive statistics (SPSS output)
Variable             N     Range         Minimum      Maximum        Mean          Std. Dev.     Variance
patent               457   286           0            286            11.92         22.901        524.470
assets               457   35788473.97   4422.18      35792896.15    4358371.54    6086530.85    3.705E+013
rd                   457   1917997.980   858.53204    1918856.512    330236.630    405160.516    164155043889
spe                  457   2.0235309     -1.1298400   .8936909       -.056808610   .3374751802   .114
pharma               457   1             0            1              .63           .482          .232
biotech              457   1             0            1              .37           .482          .232
Valid N (listwise)   457
Splitting Database
Descriptive Statistics (by type)
Descriptive statistics by firm type (SPSS output)
Type  Variable             N     Range       Minimum      Maximum     Mean         Std. Dev.    Variance
DBF   patent               167   202         0            202         12.11        21.066       443.764
DBF   assets               167   2442619     4422.18      2447041     342934.49    478511.938   2E+011
DBF   rd                   167   495443.5    858.53204    496302.1    58116.590    88638.5347   8E+009
DBF   spe                  167   1.7544527   -1.12984     .6246127    -.10630582   .343286812   .118
DBF   pharma               167   0           0            0           .00          .000         .000
DBF   biotech              167   0           1            1           1.00         .000         .000
DBF   Valid N (listwise)   167
LDF   patent               290   286         0            286         11.81        23.929       572.609
LDF   assets               290   4E+007      218006.47    4E+007      6670709.4    6605972.68   4E+013
LDF   rd                   290   1912600     6256.248     1918857     486940.24    432514.940   2E+011
LDF   spe                  290   1.6904465   -.7967556    .8936909    -.02830504   .331330781   .110
LDF   pharma               290   0           1            1           1.00         .000         .000
LDF   biotech              290   0           0            0           .00          .000         .000
LDF   Valid N (listwise)   290
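Splitting the data by firm type and computing descriptive statistics per group can be sketched in Python as below. The six observations are invented for the example and do not reproduce the figures in the output above. Note that `statistics.stdev` uses the N - 1 (sample) formula.

```python
import statistics
from collections import defaultdict

# Hypothetical (type, patent count) pairs, illustrative only.
firms = [("DBF", 3), ("DBF", 9), ("DBF", 12), ("LDF", 1), ("LDF", 20), ("LDF", 6)]

# Split the observations by firm type.
by_type = defaultdict(list)
for firm_type, patent in firms:
    by_type[firm_type].append(patent)

# Compute the same statistics separately for each group.
summary = {}
for firm_type, values in by_type.items():
    summary[firm_type] = {
        "N": len(values),
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "std": statistics.stdev(values),  # sample standard deviation (N - 1)
    }

for firm_type, stats in summary.items():
    print(firm_type, stats)
```

Comparing the group means alone can be misleading when the standard deviations are large relative to the difference between the means.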
Assignments
Compute the logarithm of the quantitative variables patent, assets
and rd, and name the results lnpatent, lnassets and lnrd, respectively.
Compute descriptive statistics for both LDFs and DBFs.
Draw conclusions by comparing means.
Logarithm
Normalization
Taking the logarithm is a transformation that usually brings a skewed
distribution closer to a normal one.
Elasticities http://en.wikipedia.org/wiki/Elasticity_(economics)
A change in the log of x is approximately a relative change in x itself.
Example: the Cobb-Douglas production function, whose exponents are elasticities.
$$\frac{\partial \log x}{\partial x} = \frac{1}{x} \quad \Leftrightarrow \quad \partial \log x = \frac{\partial x}{x}$$
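The link between log changes and relative changes can be checked numerically. The sketch below uses made-up values and compares the exact relative change with the difference in natural logs, for a small and a large change.

```python
import math

def relative_change(x0, x1):
    """Exact relative change from x0 to x1."""
    return (x1 - x0) / x0

def log_change(x0, x1):
    """Difference in natural logs, an approximation of the relative change."""
    return math.log(x1) - math.log(x0)

# Small change: the two measures are close.
small_rel = relative_change(100.0, 105.0)
small_log = log_change(100.0, 105.0)

# Large change: the approximation deteriorates.
large_rel = relative_change(100.0, 200.0)
large_log = log_change(100.0, 200.0)

print(small_rel, small_log)
print(large_rel, large_log)
```

The approximation is good for changes of a few percent and becomes loose for large changes, so log differences should be read as percentage changes only when the changes are small.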