Chapter Three
Measures of Central Tendency and Dispersion

Outline
• A. Statistics and Parameters
• B. Mean
• C. Median
• D. Mode
• E. Skew & Scaling
• F. Weighted Mean
• G. Range
• H. Population Variance & Standard Deviation
• I. Sample Variance and Standard Deviation
• J. Coefficient of Variation
• K. Standard Scores

A. Characteristics of Frequency Distribution
• Central tendency: the average
• Dispersion: how spread out things are
• Skewness: skewed vs. symmetrical
• Kurtosis: peaked or flat

Statistics & Parameters
• Sample statistics vs. population parameters
• Roman letters vs. Greek letters
• Symbol
• Formula: the way you compute it
• Value: the numerical result
• Descriptive vs. inferential
• Describe populations
• Infer from samples to populations
• Describe means you have all of the data
• Infer means you have only part of the data
• Describe: you use Greek letters
• Infer: you use Roman letters

B. Measures of Central Tendency
• Arithmetic Mean
• Median
• Mode

The Arithmetic Mean
• Clearly understood
• Every data set has one and only one mean
• Stable from sample to sample when used inferentially
• Used in more complex statistics
• Deceptive in skewed distributions
• Balance point
• Uses the number, order and value of scores
• Definition: the point in a distribution from which the sum of the deviations is zero
• Symbol: $\bar{x}$ (sample), $\mu$ (population)
• Formula: $\bar{x} = \frac{\sum x}{n}$, $\mu = \frac{\sum x}{N}$
• Value: [Figure: number line of scores from 27 to 36 with deviations from the mean marked from -3 to +6; the zero deviation, the balance point, falls at 30]

C. The Median
• The point below and above which fall 50% of the scores
• Not as widely understood
• Less stable than the mean
• Ordinal scaling
• Unaffected by skew
• Uses only the number and order of scores
• Symbol: MD
• Formula: the middle score when the scores are arranged in order
• Value: for the ordered scores 31, 32, 33, 36, 37, MD = 33

D. Mode
• Most frequent value
• Least stable measure of central tendency
• Easy to calculate
• Bimodal and multimodal possibilities
• Best for J- or U-shaped distributions
• Some distributions have no mode
• Uses only the number of scores
• Symbol: Mode
• Formula: the score or the category with the highest frequency

E. Central Tendency & Skew
• Symmetrical: mean = median = mode, mean - median = 0
• Positive skew: mean > median, mean - median > 0
• Negative skew: mean < median, mean - median < 0
[Figure: Positively Skewed Distribution, with the mode, median and mean marked]

Measures of Central Tendency and Scaling
• Nominal: must use mode
• Ordinal: median or mode
• Cardinal: mean, median or mode

F. Weighted Mean
• Used when means are based on different numbers of observations or weights
• w = weight, k = total number of groups
• Formula: $\bar{X}_w = \frac{w_1 \bar{X}_1 + w_2 \bar{X}_2 + \dots + w_k \bar{X}_k}{w_1 + w_2 + \dots + w_k}$

G. Measures of Dispersion
• Range
• Population
  – variance
  – standard deviation
• Sample
  – variance
  – standard deviation

Dispersion
• Also called variability or spread
• Helps us to judge the reliability of our measure of central tendency
• Excessive variability can cause problems
• Needed to compare different distributions
• Quality of a process is usually judged by the variability, not the mean, of the output

Range
• The difference between the highest and lowest observed values
• Symbol: Range
• Formula: Range = value of the greatest observation - value of the least observation
• Value:
• Easy to compute
• Easy to understand
• Considers only two values
• Based on the least reliable values in the data set
• Completely determined by extreme values

H. Population Variance
• Mean of the squared deviations from the mean of each score in the distribution
• Gives a measure of the squared distance of any observation from the mean of the distribution
• The concept of squared units is not clear
• Needed for further calculation later
• The square root of the variance is the standard deviation
• Symbol: $\sigma^2$
• Formula: $\sigma^2 = \frac{\sum (x - \mu)^2}{N}$
• Value: $\sigma^2 = \frac{28}{6} = 4.67$

Population Standard Deviation
• The square root of the mean of the squared deviations of each score in the distribution from the mean (root mean square)
• Symbol: $\sigma$
• Formula: $\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$
• Value: with the variance above, $\sigma = \sqrt{4.67} \approx 2.16$

I. Sample Variance (Unbiased Estimator)
• Symbol: $s^2$
• Formula: $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$
• Value:

Sample Standard Deviation
• The sample value which best estimates the population standard deviation
• Symbol: $s$
• Formula: $s = \sqrt{s^2} = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$
• The -1 in the denominator corrects for the bias of the sample caused by underestimating the extreme scores
• Value:

Standard Deviation
• Most often used measure of dispersion
• Uses information from all values in the data set
• Can be misleading in skewed distributions

J. Coefficient of Variation
• The statistic that results from dividing the standard deviation by the mean and multiplying by 100
• Allows for the comparison of two different standard deviations when the units are different
• Formula: $CVar = \frac{s}{\bar{X}} \times 100\%$

K. Standard Scores
• Used to compare scores from distributions with different means and variances
• Used to estimate the number of scores within certain intervals around the mean of a distribution
• Scores whose scaling has been changed from the original units (e.g., dollars) to standard deviation units
• Z scores

Standard Score Problems
Quantity            Symbol   Relation
Raw score           X
Standard score      Z
Area                p        p = #/N, % = p × 100
Number of scores    #        # = pN

Chebyshev's Theorem
• Regardless of the shape of the distribution, at least:
  – 75% of the values will fall within ±2 standard deviations from the mean
  – 89% of the values will fall within ±3 standard deviations from the mean
• Compare with the Gaussian distribution

Gaussian Distribution
• If the shape of the distribution is Gaussian, exactly (within rounding):
  – 68% of the values will fall within ±1 standard deviation from the mean
  – 95.5% of the values will fall within ±2 standard deviations from the mean
  – 99.7% of the values will fall within ±3 standard deviations from the mean

Outlier
• A score whose value is extremely high or low relative to the distribution of which it is a part
• Observational error / recording error
• Not from the population
• Rare phenomenon
• Beyond ±3 standard deviations

Terms to Know
• Bimodal distribution
• Chebyshev's Theorem
• Coefficient of variation
• Dispersion
• Kurtosis
• Mean
• Measure of central tendency
• Measure of dispersion
• Median
• Mode
• Outlier
• Parameters
• Range
• Skewness
• Standard deviation
• Standard score
• Statistics
• Symmetrical
• Variance
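
Worked example (a sketch, not part of the original slides)

The measures in this chapter can be computed in a few lines of Python. The sketch below uses only the standard library. The scores are hypothetical, chosen so that the sum of squared deviations is 28 with N = 6, which reproduces the population variance of 4.67 quoted in section H. The function names (describe, weighted_mean, z_scores, chebyshev_bound) are illustrative and do not come from the slides. The z-score line assumes the usual definition, the deviation from the mean divided by the population standard deviation.

```python
import math
import statistics


def describe(scores):
    """Return the chapter's summary measures for a list of numeric scores."""
    n = len(scores)
    mean = sum(scores) / n
    # Sum of squared deviations from the mean.
    ss = sum((x - mean) ** 2 for x in scores)
    pop_var = ss / n           # scores treated as the whole population
    samp_var = ss / (n - 1)    # scores treated as a sample (n - 1 divisor)
    samp_sd = math.sqrt(samp_var)
    return {
        "mean": mean,
        "median": statistics.median(scores),
        # multimode lists every value tied for the highest frequency,
        # so it returns all values when no score repeats.
        "modes": statistics.multimode(scores),
        "range": max(scores) - min(scores),
        "pop_var": pop_var,
        "pop_sd": math.sqrt(pop_var),
        "samp_var": samp_var,
        "samp_sd": samp_sd,
        "cv_percent": samp_sd / mean * 100,
    }


def weighted_mean(means, weights):
    """Weighted mean of k group means with weights w_1 ... w_k."""
    return sum(w * m for w, m in zip(weights, means)) / sum(weights)


def z_scores(scores):
    """Rescale scores to standard deviation units (population sigma assumed)."""
    mu = sum(scores) / len(scores)
    sigma = statistics.pstdev(scores)
    return [(x - mu) / sigma for x in scores]


def chebyshev_bound(k):
    """Minimum proportion of values within +/- k standard deviations (k > 1)."""
    return 1 - 1 / k ** 2


# Hypothetical scores: mean 30, squared deviations sum to 28.
scores = [27, 28, 29, 31, 32, 33]
summary = describe(scores)

for name, value in summary.items():
    print(name, value)

# Two hypothetical group means, weighted by their group sizes.
print("weighted mean:", weighted_mean([80, 90], [10, 30]))
print("z scores:", [round(z, 2) for z in z_scores(scores)])
print("Chebyshev, k = 2:", chebyshev_bound(2))
print("Chebyshev, k = 3:", round(chebyshev_bound(3), 2))
```

For these scores the population variance is 28/6, about 4.67, while the sample variance is 28/5 = 5.6; the gap shows the effect of the n - 1 divisor. The Chebyshev bounds for k = 2 and k = 3 come out to 0.75 and about 0.89, matching the 75% and 89% figures above.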