# BRM


```					                        LOGO

Application of SPSS
Part 1: Descriptive Statistics

Irony of statistics
• Two statisticians were travelling in an airplane when
the pilot announced that they had lost an engine, but
not to worry, there were three left. However, instead of 2
hours it would take 4 hours to get to Islamabad. A little
later, he announced that a second engine had failed, and they
still had two left, but it would take 5 hours to get to
Islamabad. Somewhat later, the pilot again came on the
intercom and announced that a third engine had died.
• Never fear, he announced, because the plane could fly on
a single engine. However, it would now take 8 hours to get
to Islamabad. At this point, one statistician turned to the
other and said, "Gee, I hope we don't lose that last engine,
or we'll be up here forever!"

Contents

1   What is descriptive statistics?

2   Types of descriptive measures

What is descriptive statistics?

Descriptive statistics aim at describing a
situation by summarizing information in a
way that highlights the important numerical
features of the data.
A good summary captures the most essential and
relevant aspects of the data. It does so with
numbers, usually organized into tables, and with
charts and graphs that give a visual
representation of the distributions.

Types of univariate descriptive measures

There are three important types of
univariate descriptive measures:
• measures of central tendency,
• measures of dispersion, and
• measures of position.

Measures of central tendency
Sometimes called measures of the center. They
answer the question:
• What are the categories or numerical values that
best represent the bulk of the data?
Such measures are useful for comparing
various groups within a population, or for seeing
whether a variable has changed over time.
Measures of central tendency include:
• the mean (the technical term for the
average),
• the median, and
• the mode.
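
A small worked illustration (the five values below are
made up for this sketch, not taken from the slides):
Values: 2, 3, 3, 5, 12
Mean   = (2 + 3 + 3 + 5 + 12) / 5 = 25 / 5 = 5
Median = middle value of the sorted list = 3
Mode   = most frequent value = 3
The single large value (12) pulls the mean above the
median, while the median and mode are unaffected by it.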

Measures of dispersion
Measures of dispersion answer the question:
How spread out is the data?
Is it mostly concentrated around the center, or
spread out over a large range of values?
Measures of dispersion include:
• the standard deviation,
• the variance, and
• the coefficient of variation.

Measures of position
Measures of position answer the question:
How is one individual entry positioned with
respect to all the others?
Or how does one individual score on a variable
in comparison with the others? If you want to
know whether you are part of the top 5% of a
math class, you must use a measure of position.
Measures of position include:
• percentiles,
• deciles, and
• quartiles.

Should NRO be allowed?

Standard Deviation

Standard Deviation and Probability
In general, people use the +/- 2 SD criterion for
the limits of the acceptable range for a test.
When a measurement falls within that range, it is
consistent with chance variation: about 95.5% of
values from a normal distribution lie within 2 SD
of the mean.
Only about 4.5% of the time will a value fall outside
that range due to chance; more likely it will be
due to error.

Example

Consider the following three datasets:
(1) 5, 25, 25, 25, 25, 25, 45
(2) 5, 15, 20, 25, 30, 35, 45
(3) 5, 5, 5, 25, 45, 45, 45

Solution

Case                  Standard Deviation
1                     11.55
2                     13.23
3                     20.00

All three datasets have the same mean (25), but
their standard deviations are 11.55, 13.23, and
20.00. A larger standard deviation indicates
greater variability in the data; a smaller one
indicates less.
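
As a sketch of where these figures come from, assuming
the table uses the sample standard deviation (divisor
n - 1, which is what reproduces 11.55), dataset (1)
works out as:
Mean = (5 + 25 + 25 + 25 + 25 + 25 + 45) / 7 = 175 / 7 = 25
Deviations from the mean: -20, 0, 0, 0, 0, 0, 20
Sum of squared deviations = 400 + 400 = 800
Variance = 800 / (7 - 1) = 133.33
Standard deviation = sqrt(133.33) = 11.55
The same steps give 1050 / 6 = 175 and sqrt(175) = 13.23
for dataset (2), and 2400 / 6 = 400 and sqrt(400) = 20.00
for dataset (3).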

Example 4

• Each of the three populations {0, 0, 14, 14},
{0, 6, 8, 14} and {6, 6, 8, 8} has a mean of 7.

Solution

• Their (population) standard deviations are 7, 5,
and 1, respectively.
• The third population has a much smaller standard
deviation than the other two because its values are
all close to 7.
• In a loose sense, the standard deviation tells us
how far from the mean the data points tend to be.
• It has the same units as the data points
themselves. If, for instance, the data set {0, 6, 8, 14}
represents the ages of a population of 4 cows, the
standard deviation is 5 years.
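
These values can be checked by hand, assuming the
population formula (divisor N rather than N - 1), which
is what yields the whole numbers above. For {0, 6, 8, 14}:
Deviations from the mean of 7: -7, -1, 1, 7
Sum of squared deviations = 49 + 1 + 1 + 49 = 100
Variance = 100 / 4 = 25
Standard deviation = sqrt(25) = 5
With the sample formula (divisor N - 1 = 3) the result
would instead be sqrt(100 / 3) = 5.77, so it matters
which version a software package reports.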

Example 5
• Consider average temperatures for cities. Two cities
may each have an average temperature of 15 °C, yet
the temperature range for cities near the coast is
smaller than for cities inland. So while the averages
are similar, the variation is greater inland than near
the coast.
• An average of 15 °C can occur for one city with highs
of 25 °C and lows of 5 °C, and also for another city
with highs of 18 °C and lows of 12 °C. The standard
deviation lets us recognize that the average for the city
with the wider variation, and thus the higher standard
deviation, is a less reliable predictor of temperature
than the average for the city with the smaller variation
and lower standard deviation.

Standard Deviation

For example, the average height for adult men
in Pakistan is about 70 inches, with a standard
deviation of around 3 in.
How would we interpret it?

Interpretation
• This means that most men (about 68 percent,
assuming a normal distribution) have a height
within 3 in of the mean (67–73 in), that is, within
one standard deviation,
• whereas almost all men (about 95%) have a
height within 6 in (15 cm) of the mean (64–76 in),
that is, within 2 standard deviations.

If the standard deviation were
zero, then …?

… then all men would be exactly
70 in tall.

If the standard deviation were
20 in, then …?

… men would have much more
variable heights, with a typical
range of about 50 to 90 in.

Sigma

zσ        percentage within    percentage outside   ratio outside
1σ        68.2689492%          31.7310508%          1 / 3.1514871
1.645σ    90%                  10%                  1 / 10
1.960σ    95%                  5%                   1 / 20
2σ        95.4499736%          4.5500264%           1 / 21.977894
2.576σ    99%                  1%                   1 / 100
3σ        99.7300204%          0.2699796%           1 / 370.398
3.2906σ   99.9%                0.1%                 1 / 1000
4σ        99.993666%           0.006334%            1 / 15,788
5σ        99.9999426697%       0.0000573303%        1 / 1,744,278
6σ        99.9999998027%       0.0000001973%        1 / 506,800,000
7σ        99.9999999997440%    0.0000000002560%     1 / 390,600,000,000
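
For reference, the columns are linked as follows, assuming
a normal distribution (erf is the standard error function;
the formula is a standard identity, not stated on the slide):
percentage within zσ = erf(z / sqrt(2)) × 100
percentage outside   = 100 - percentage within
ratio outside        = 1 / (fraction outside)
Check for 2σ: fraction outside = 1 - 0.954499736 = 0.045500264,
and 1 / 0.045500264 ≈ 21.98, matching the table row.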

Standard Error of the Mean

A measure of how much the value of the mean
may vary from sample to sample taken from the
same distribution.
It can be used to roughly compare the observed
mean to a hypothesized value (that is, you can
conclude the two values are different if the ratio
of the difference to the standard error is less
than -2 or greater than +2).
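
A minimal worked sketch, assuming the usual estimate
SE = s / sqrt(n), where s is the sample standard deviation
and n the sample size. The data are dataset (1) from the
standard deviation example (mean 25, s = 11.55, n = 7);
the hypothesized value of 10 is made up for illustration:
SE    = 11.55 / sqrt(7) ≈ 4.4
Ratio = (25 - 10) / 4.4 ≈ 3.4
Since 3.4 is greater than +2, the rough rule above would
treat the observed mean as different from 10.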


Coefficient of Variation

The coefficient of variation (CV) is the standard
deviation (SD) expressed as a percentage of
the mean.
It is also known as the relative standard deviation
(RSD).
CV % = (SD ÷ mean) × 100
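
Two quick illustrations using figures that appear
earlier in these slides:
Men's heights (mean 70 in, SD 3 in):
CV % = (3 ÷ 70) × 100 ≈ 4.3%
Population {0, 6, 8, 14} (mean 7, SD 5):
CV % = (5 ÷ 7) × 100 ≈ 71.4%
The second set is far more variable relative to its mean,
even though its SD is only slightly larger in absolute terms.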

Month-wise average temperature (°C)
Month        Karachi   Peshawar
January      30        -1
February     31        4
March        32        25
April        33        35
May          34        40
June         35        48
July         35        50
August       34        45
September    33        38
October      32        35
November     31        25
December     30        4

Calculate the CV for each city and see whether a meaningful conclusion can be drawn.
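
One possible working, assuming the sample standard
deviation (divisor n - 1 = 11) is used:
Karachi:  mean = 390 / 12 = 32.5
          sum of squared deviations = 35
          SD = sqrt(35 / 11) ≈ 1.78
          CV ≈ (1.78 ÷ 32.5) × 100 ≈ 5.5%
Peshawar: mean = 348 / 12 = 29
          sum of squared deviations = 3514
          SD = sqrt(3514 / 11) ≈ 17.87
          CV ≈ (17.87 ÷ 29) × 100 ≈ 61.6%
The Peshawar figures vary far more across the year. A word
of caution on interpretation: the CV presumes a scale with a
true zero. Temperature scales such as Celsius or Fahrenheit
have an arbitrary zero (and the Peshawar column includes a
negative value), so the percentages depend on the unit chosen.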

How to diagnose issues
related to the normal
distribution

Kurtosis

The kurtosis value tells whether a distribution is
peaked, flat, or normal.
If the kurtosis value is zero, the distribution is
normal; if it is positive, the distribution is more
peaked than normal; and if it is negative, the
distribution is flatter than normal.
Kurtosis values ranging from -1 to +1 are
considered excellent (George & Mallery, 2006,
p. 98).

For a normal distribution, the value of the
kurtosis statistic is zero.
Bell-shaped curves can be described in terms of
their kurtosis (curvature):

1. Leptokurtic = thin distribution
(concentrated at midpoint) (+)
2. Mesokurtic = normal distribution (0)
3. Platykurtic = flat distribution (-)

A large positive kurtosis tells you that the
distribution of the data is more peaked and has
heavier tails than the normal distribution.

Skewness

The skewness value tells whether a distribution is
symmetrical or asymmetrical.
If the skewness value is zero, the distribution is
symmetrical; if it is positive, smaller values
predominate and the long tail is to the right; if it is
negative, larger values predominate and the long
tail is to the left.
Skewness values ranging from -2 to +2 are
acceptable.
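
A rough hand calculation, assuming the simple moment-based
coefficient g1 = m3 / m2^(3/2), where m2 and m3 are the
average squared and cubed deviations from the mean. SPSS
applies a small-sample adjustment, so its output will differ
somewhat. The four values are made up for this sketch:
Values: 0, 0, 0, 4        Mean = 4 / 4 = 1
Deviations: -1, -1, -1, 3
m2 = (1 + 1 + 1 + 9) / 4 = 3
m3 = (-1 - 1 - 1 + 27) / 4 = 6
g1 = 6 / 3^(3/2) = 6 / 5.196 ≈ 1.15
The positive result reflects many small values and one
distant large value, which gives a long right tail.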

Non-symmetrical

1. Positive Skew = high number of
low scores

2. Negative Skew = high number of
high scores

Skewness value = 0


Large positive skewness shows that the variable sale
has a long right tail.
That is, the distribution is asymmetric, with some
distant values in a positive direction from the
center of the distribution.