BRM





Application of SPSS
 Part 1: Descriptive Statistics
                               Irony of statistics
  Two statisticians were travelling in an airplane from
   Karachi to Islamabad. About half an hour into the flight,
   the pilot announced that they had lost an engine, but
   don't worry, there are three left. However, instead of 2
   hours it would take 4 hours to get to Islamabad. A little
   later, he announced that a second engine failed, and they
   still had two left, but it would take 5 hours to get to
   Islamabad. Somewhat later, the pilot again came on the
   intercom and announced that a third engine had died...
  Never fear, he announced, because the plane could fly on
   a single engine. However, it would now take 8 hours to get
   to Islamabad. At this point, one statistician turned to the
   other and said, "Gee, I hope we don't lose that last engine,
   or we'll be up here forever!"
                              Contents


       1   What is descriptive statistics?


       2   Types of descriptive measures
          What is descriptive statistics?

 Descriptive statistics aim at describing a
  situation by summarizing information in a
  way that highlights the important numerical
  features of the data.
 A good summary captures the essential and most
  relevant aspects of the data. It does so with
  numbers, usually organized into tables, and with
  charts and graphs that give a visual
  representation of the distributions.
          Types of univariate descriptive measures

 There are three important types of
  univariate descriptive measures:
    measures of central tendency,
    measures of dispersion, and
    measures of position
          Measures of central tendency
 Sometimes called measures of the center. They
  answer the question:
    What are the categories or numerical values that
     represent the bulk of the data in the best way?
 Such measures will be useful for comparing
  various groups within a population, or seeing
  whether a variable has changed over time.
  Measures of central tendency include:
    the mean (which is the technical term for
     average),
    the median, and
    the mode.
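
A minimal sketch of the three measures of central tendency, using Python's standard `statistics` module rather than SPSS. The marks below are hypothetical values invented for illustration; they do not come from the slides.

```python
import statistics

# Hypothetical exam marks, invented purely for illustration.
marks = [55, 60, 60, 65, 70, 75, 98]

mean_mark = statistics.mean(marks)      # arithmetic average
median_mark = statistics.median(marks)  # middle value of the sorted data
mode_mark = statistics.mode(marks)      # most frequent value

print("mean:", mean_mark, "median:", median_mark, "mode:", mode_mark)
```

The single high mark pulls the mean above the median, which is one reason to look at more than one measure of the center.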
               Measures of dispersion
 Measures of dispersion answer the question:
  How spread out is the data?
 Is it mostly concentrated around the center, or
  spread out over a large range of values?
 Measures of dispersion include:
    the standard deviation,
    the variance, and
    the coefficient of variation.
                    Measures of position
 Measures of position answer the question:
  How is one individual entry positioned with
  respect to all the others?
 Or how does one individual score on a variable
  in comparison with the others? If you want to
  know whether you are part of the top 5% of a
  math class, you must use a measure of position.
  Measures of position include:
    percentiles,
    deciles, and
    quartiles.
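
The "top 5% of a math class" question can be sketched with `statistics.quantiles` from Python's standard library. The scores are hypothetical, and the choice of `method="inclusive"` assumes the list is the whole class rather than a sample from it.

```python
import statistics

# Hypothetical scores for a class of 15 students (illustration only).
scores = [42, 48, 51, 55, 58, 60, 63, 67, 70, 74, 78, 81, 85, 90, 96]

# n=4 returns the three quartile cut points Q1, Q2 (the median) and Q3.
q1, q2, q3 = statistics.quantiles(scores, n=4, method="inclusive")

# n=10 returns nine decile cut points; n=100 returns 99 percentile cut points.
deciles = statistics.quantiles(scores, n=10, method="inclusive")
percentiles = statistics.quantiles(scores, n=100, method="inclusive")
p95 = percentiles[94]  # the 95th percentile

my_score = 96
in_top_5_percent = my_score > p95
print("Q1, Q2, Q3:", q1, q2, q3)
print("95th percentile:", p95, "in top 5%:", in_top_5_percent)
```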
       Should NRO be allowed?




Standard Deviation
            Standard Deviation and Probability
 In general, people use the ±2 SD criterion for
  the limits of the acceptable range for a test.
 For a normally distributed measurement, about
  95.5% of values fall within that range, so a
  result inside it is treated as acceptable.
 Only about 4.5% of the time will a value fall
  outside that range by chance alone; a value
  outside it is more likely due to error.
                                   Example

 Consider the following three datasets:
  (1) 5, 25, 25, 25, 25, 25, 45
  (2) 5, 15, 20, 25, 30, 35, 45
  (3) 5, 5, 5, 25, 45, 45, 45
                                             Solution

       Case                  Standard Deviation
       1                     11.55
       2                     13.23
       3                     20.00



   The standard deviations for the datasets are
   11.55, 13.23, and 20.00, although all three
   datasets have the same mean of 25. A larger
   standard deviation indicates greater variability
   in the data, and a smaller one indicates less.
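
The values in the table can be checked with a short sketch. It assumes the sample standard deviation (dividing by n - 1), which is what `statistics.stdev` computes and which reproduces the figures in the table.

```python
import statistics

# The three datasets from the example.
datasets = {
    1: [5, 25, 25, 25, 25, 25, 45],
    2: [5, 15, 20, 25, 30, 35, 45],
    3: [5, 5, 5, 25, 45, 45, 45],
}

results = {}
for case, values in datasets.items():
    # stdev() is the sample standard deviation (divides by n - 1).
    results[case] = round(statistics.stdev(values), 2)
    print(case, "mean:", statistics.mean(values), "SD:", results[case])
```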
                                        Example 4

  For example, each of the three populations {0, 0, 14,
   14}, {0, 6, 8, 14} and {6, 6, 8, 8} has a mean of 7.
                                          Solution

  Their standard deviations are 7, 5, and 1,
   respectively.
  The third population has a much smaller standard
   deviation than the other two because its values are
   all close to 7.
  In a loose sense, the standard deviation tells us
   how far from the mean the data points tend to be.
  It will have the same units as the data points
   themselves. If, for instance, the data set {0, 6, 8, 14}
   represents the ages of a population of 4 cows, the
   standard deviation is 5 years.
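
The figures 7, 5, and 1 follow from the population standard deviation (dividing by n), since each set is treated as a complete population. A minimal sketch that also prints the sample version for contrast:

```python
import statistics

# The three populations from the example; each has a mean of 7.
populations = [[0, 0, 14, 14], [0, 6, 8, 14], [6, 6, 8, 8]]

population_sds = []
for values in populations:
    pop_sd = statistics.pstdev(values)    # divides by n (whole population)
    sample_sd = statistics.stdev(values)  # divides by n - 1 (sample)
    population_sds.append(pop_sd)
    print(values, "population SD:", pop_sd, "sample SD:", round(sample_sd, 2))
```

The sample version is always slightly larger, so it matters which one a package reports.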
                                         Example 5
  Consider average temperatures for cities. While two
   cities may each have an average temperature of 15 °C,
   it's helpful to understand that the range for cities near
   the coast is smaller than for cities inland, which clarifies
   that, while the average is similar, the chance for
   variation is greater inland than near the coast.
  So, an average of 15 occurs for one city with highs of 25
   °C and lows of 5 °C, and also occurs for another city
   with highs of 18 and lows of 12. The standard deviation
   allows us to recognize that the average for the city with
   the wider variation, and thus a higher standard deviation,
   will not offer as reliable a prediction of temperature as
   the city with the smaller variation and lower standard
   deviation.
                       Standard Deviation

 For example, the average height for adult men
  in Pakistan is about 70 inches, with a standard
  deviation of around 3 in.
 How would we interpret it?



 Interpretation
    This means that most men (about 68 percent,
     assuming a normal distribution) have a height
     within 3 in of the mean (67–73 in) – one standard
     deviation,
    whereas almost all men (about 95%) have a
     height within 6 in (15 cm) of the mean (64–76 in)
     – 2 standard deviations.
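
The 68% and 95% figures can be reproduced with `statistics.NormalDist`, assuming, as the slide does, that heights are normally distributed with a mean of 70 in and a standard deviation of 3 in. The helper name `share_within` is made up for this sketch.

```python
from statistics import NormalDist

# Assumed model from the example: mean 70 in, standard deviation 3 in.
heights = NormalDist(mu=70, sigma=3)

def share_within(k):
    """Share of the distribution within k standard deviations of the mean."""
    low = heights.mean - k * heights.stdev
    high = heights.mean + k * heights.stdev
    return heights.cdf(high) - heights.cdf(low)

for k in (1, 2):
    low, high = 70 - 3 * k, 70 + 3 * k
    print(f"within {k} SD ({low} to {high} in): {share_within(k):.1%}")
```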




   If the standard deviation were
     zero, then …?




 …then all men would be exactly
  70 in tall.




 If the standard deviation were
  20 in, then …???



 …men would have much more
  variable heights, with a typical
  range of about 50 to 90 in
                                Sigma

  zσ        percentage within     percentage outside    ratio outside
  1σ        68.2689492%           31.7310508%           1 / 3.1514871
  1.645σ    90%                   10%                   1 / 10
  1.960σ    95%                   5%                    1 / 20
  2σ        95.4499736%           4.5500264%            1 / 21.977894
  2.576σ    99%                   1%                    1 / 100
  3σ        99.7300204%           0.2699796%            1 / 370.398
  3.2906σ   99.9%                 0.1%                  1 / 1000
  4σ        99.993666%            0.006334%             1 / 15,788
  5σ        99.9999426697%        0.0000573303%         1 / 1,744,278
  6σ        99.9999998027%        0.0000001973%         1 / 506,800,000
  7σ        99.9999999997440%     0.0000000002560%      1 / 390,600,000,000
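
The whole-number rows of the table can be regenerated from the error function, assuming a normal distribution. This is a sketch using only Python's `math` module; the function names are made up for illustration, and the last digits may differ slightly from the rounded values shown above.

```python
import math

def fraction_within(z):
    """Share of a normal distribution within z standard deviations of the mean."""
    return math.erf(z / math.sqrt(2))

def fraction_outside(z):
    """Complementary share; erfc keeps precision when the tails are tiny."""
    return math.erfc(z / math.sqrt(2))

for z in (1, 2, 3, 4, 5, 6, 7):
    outside = fraction_outside(z)
    print(f"{z}σ  within {fraction_within(z):.10%}  "
          f"outside {outside:.10%}  1 / {1 / outside:,.3f}")
```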
                      Standard Error of the Mean

     A measure of how much the value of the mean
      may vary from sample to sample taken from the
      same distribution.
     It can be used to roughly compare the observed
      mean to a hypothesized value (that is, you can
      conclude the two values are different if the ratio
      of the difference to the standard error is less
      than -2 or greater than +2).
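
The rough ±2 rule can be sketched as follows. The slide does not give a formula; this assumes the usual definition, SE = SD ÷ √n, with the sample standard deviation. The sample is dataset (2) from the standard deviation example, and the hypothesized value of 20 is invented for illustration.

```python
import math
import statistics

sample = [5, 15, 20, 25, 30, 35, 45]  # dataset (2) from the earlier example
hypothesized_mean = 20                # hypothetical value, for illustration only

# Standard error of the mean: sample SD divided by the square root of n.
sem = statistics.stdev(sample) / math.sqrt(len(sample))

# Ratio of the difference to the standard error.
ratio = (statistics.mean(sample) - hypothesized_mean) / sem
differs = abs(ratio) > 2

print("SEM:", round(sem, 2), "ratio:", round(ratio, 2), "different:", differs)
```

Here the ratio stays between -2 and +2, so by this rough rule the observed mean would not be judged different from the hypothesized value.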








Coefficient of Variation
                     Coefficient of Variation

 The Coefficient of Variation (CV) is the standard
  deviation (SD) expressed as a percentage of
  the mean.
 Also known as the relative standard deviation
  (RSD).
 CV % = (SD ÷ mean) × 100
                Month-wise average temperature (°C)
               Month                  Karachi         Peshawar
     January                            30            -1
     February                           31            4
     March                              32            25
     April                              33            35
     May                                34            40
     June                               35            48
     July                               35            50
     August                             34            45
     September                          33            38
     October                            32            35
     November                           31            25
     December                           30            4

     Calculate the CV and see whether a meaningful conclusion can be drawn.
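
One way to set up the calculation, as a sketch: it applies CV % = (SD ÷ mean) × 100 to the monthly figures in the table and assumes the sample standard deviation. The function name `cv_percent` is made up for illustration.

```python
import statistics

# Monthly figures from the table, January to December.
karachi = [30, 31, 32, 33, 34, 35, 35, 34, 33, 32, 31, 30]
peshawar = [-1, 4, 25, 35, 40, 48, 50, 45, 38, 35, 25, 4]

def cv_percent(values):
    """Coefficient of variation: sample SD as a percentage of the mean."""
    return statistics.stdev(values) / statistics.mean(values) * 100

for city, temps in (("Karachi", karachi), ("Peshawar", peshawar)):
    print(city, "mean:", statistics.mean(temps),
          "CV %:", round(cv_percent(temps), 1))
```

When interpreting the result, keep in mind that the CV divides by the mean, so it depends on where the zero of the measurement scale lies.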




How to diagnose issues
   related to the normal
           distribution
                                           Kurtosis

     The kurtosis value tells whether a distribution is
      peaked, flat, or normal.
     If the kurtosis value is zero, the distribution is as
      peaked as a normal one; if it is positive, the
      distribution is more peaked than normal; and if it
      is negative, the distribution is flatter than normal.
     Kurtosis values ranging from -1 to +1 are
      considered excellent (George & Mallery, 2006,
      p. 98).




     For a normal distribution, the value of the
      kurtosis statistic is zero.
     Bell-shaped curves are described in terms of their
      kurtosis (curvature):


          1. Leptokurtic = thin distribution
           (concentrated at midpoint) (+)
          2. Mesokurtic = normal distribution (0)

          3. Platykurtic = flat distribution (−)



     The large positive kurtosis tells you that the
      distribution of data is more peaked and has
      heavier tails than the normal distribution.
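
The sign convention can be illustrated with a simple moment-based excess kurtosis, m4 ÷ m2² − 3. This is a sketch only: SPSS applies a small-sample correction, so its reported values will differ somewhat from these. The two lists are datasets (1) and (3) from the standard deviation example.

```python
import statistics

def excess_kurtosis(values):
    """Moment-based excess kurtosis: zero for a normal distribution."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m4 = sum((x - mean) ** 4 for x in values) / n  # fourth central moment
    return m4 / m2 ** 2 - 3

peaked = [5, 25, 25, 25, 25, 25, 45]  # concentrated at the midpoint
flat = [5, 5, 5, 25, 45, 45, 45]      # spread toward the extremes

print("peaked:", round(excess_kurtosis(peaked), 2))  # positive (leptokurtic)
print("flat:", round(excess_kurtosis(flat), 2))      # negative (platykurtic)
```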




                                      Skewness

     The skewness value tells whether a distribution is
      symmetrical or asymmetrical.
     If the skewness value is zero, the distribution is
      symmetrical; if it is positive, smaller values
      are more numerous (long right tail); and if it is
      negative, larger values are more numerous (long
      left tail).
     Skewness values ranging from -2 to +2 are
      acceptable.




                    Non-symmetrical

     1. Positive Skew = high number of
       low scores



     2. Negative Skew = high number of
       high scores



       Skewness value = 0







     Large positive skewness shows that sale has a
      long right tail.
     That is, the distribution is asymmetric, with some
      distant values in a positive direction from the
      center of the distribution.
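
The direction of skew can be checked with a moment-based skewness, m3 ÷ m2^1.5. As with kurtosis, this is a sketch: SPSS uses a bias-corrected formula, so its values will not match exactly. The first two lists are hypothetical; the third is dataset (2) from the standard deviation example.

```python
import statistics

def skewness(values):
    """Moment-based skewness: zero for a symmetrical distribution."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5

right_tailed = [1, 1, 2, 2, 3, 10]         # hypothetical: many low scores
left_tailed = [1, 8, 9, 9, 10, 10]         # hypothetical: many high scores
symmetric = [5, 15, 20, 25, 30, 35, 45]    # dataset (2), mirrored around 25

for name, data in (("right-tailed", right_tailed),
                   ("left-tailed", left_tailed),
                   ("symmetric", symmetric)):
    print(name, round(skewness(data), 2))
```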



