Statistics Cheat Sheet
Mr. Roth, Mar 2004

1. Fundamentals
a. Population – everybody to be analysed
   Parameter – # summarizing the population
b. Sample – subset of the population we collect data on
   Statistic – # summarizing the sample
c. Quantitative variables – a number
   Discrete – countable (# cars in family)
   Continuous – measurements; there is always another # between any two values
d. Qualitative variables
   Nominal – just a name
   Ordinal – order matters (low, mid, high)
Choosing a Sample
   Sample frame – list of the population we choose the sample from
   Biased – sample differs from the population's characteristics
   Volunteer sample – any of the three types below may end up as a volunteer sample if people choose whether to respond
Sample Designs
e. Judgement sample: choose what we think is representative
   Convenience sample – easily accessed people
f. Probability sample: elements selected by probability
   Simple random sample – every element has an equal chance
   Systematic sample – almost random, but we choose by a method
g. Census – data on everyone/everything in the population
Stratified Sampling
   Divide the population into subpopulations based upon characteristics
h. Proportional: in proportion to the total population
i. Stratified random: select at random within substrata
j. Cluster: selection within representative clusters
Collect the Data
k. Experiment: control the environment
l. Observation:

2. Single Variable Data - Distributions
m. Graphing categorical data: pie & bar charts
n. Histogram (classes, count within each class)
o. – shape, center, spread. Symmetric, skewed right, skewed left
p. Stemplots (regular and split stems)
   0 | 11222        0 | 112233
   1 | 011333       0 | 56677
   2 | etc          1 |
q. Mean: $\bar{x} = \sum x_i / n$
r. Median M: if n is odd – the center value; if even – the mean of the 2 center values
s. Boxplot: Min, Q1, M, Q3, Max
t. Variance: $s^2 = \sum (x - \bar{x})^2 / (n - 1) = SS_x / (n - 1)$
u. p78: Standard deviation: $s = \sqrt{s^2}$
v. $SS_x = \sum (x - \bar{x})^2 = \sum x^2 - (\sum x)^2 / n$
w. Density curve – relative proportion within classes – area under curve = 1
x. Normal distribution: 68, 95, 99.7 % of values lie within 1, 2, 3 standard deviations of the mean
y. p98: z-score: $z = (x - \bar{x}) / s$ or $z = (x - \mu) / \sigma$
z. Standard Normal: N(0,1), obtained by standardizing N(μ,σ)

3. Bivariate - Scatterplots & Correlation
a. Explanatory – independent variable
b. Response – dependent variable
c. Scatterplot: form, direction, strength, outliers
d. – form is linear negative, …
e. – to add a categorical variable, use a different color/symbol
f. p147: Linear correlation – direction & strength of the linear relationship
g. Pearson's coefficient: -1 ≤ r ≤ 1; 1 is perfectly linear with + slope, -1 is perfectly linear with – slope
h. $r = \frac{1}{n - 1} \sum \left( \frac{x - \bar{x}}{s_x} \right) \left( \frac{y - \bar{y}}{s_y} \right) = \frac{SS_{xy}}{\sqrt{SS_x \, SS_y}}$
i. $r = \sum z_x z_y / (n - 1)$
j. $SS_{xy} = \sum xy - (\sum x)(\sum y) / n$

4. Regression
k. Least squares – sum of squares of the vertical errors is minimized
l. p154: $\hat{y} = b_0 + b_1 x$, or $\hat{y} = a + bx$
m. (same as y = mx + b)
n. $b_1 = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2} = \frac{SS_{xy}}{SS_x} = r (s_y / s_x)$
o. Then solve for the intercept, knowing the line passes through the centroid $(\bar{x}, \bar{y})$: $a = \bar{y} - b\bar{x}$
p. $b_0 = (\sum y - b_1 \sum x) / n$
q. r^2 is the proportion of variation described by the linear relationship
r. residual = $y - \hat{y}$ = observed – predicted
s. Outliers: in y direction -> large residuals; in x direction -> often influential to the least squares line
t. Extrapolation – predicting beyond the domain studied
u. Lurking variable
v. Association doesn't imply causation

5. Data – Sampling
a. Population: entire group
b. Sample: part of the population we examine
c. Observation: measures but does not influence the response
d. Experiment: treatments controlled & responses observed
e. Confounded variables (explanatory or lurking): their effects on the response variable cannot be distinguished
f. Sampling types: Voluntary response – biased toward the opinionated; Convenience – easiest
g. Bias: systematically favors certain outcomes
h. Simple Random Sample (SRS): every set of n individuals has an equal chance of being chosen
i. Probability sample: chosen by known probability
j. Stratified random: SRS within strata divisions
k. Response bias – lying/behavioral influence

6. Experiments
a. Subjects: individuals in the experiment
b. Factors: explanatory variables in the experiment
c. Treatment: combination of specific values for each factor
d. Placebo: treatment to nullify confounding factors
e. Double-blind: treatments unknown to subjects & individual investigators
f. Control group: controls the effects of lurking variables
g. Completely randomized design: subjects allocated randomly among treatments
h. Randomized comparative experiments: similar groups – nontreatment influences operate equally
i. Experimental design: control effects of lurking variables, randomize assignments, use enough subjects to reduce chance variation
j. Statistical significance: observations that would rarely occur by chance
k. Block design: randomization within a block of individuals with similarity (men vs women)

7. Probability & odds
a. 2 definitions:
b. 1) Experimental: observed likelihood of a given outcome within an experiment
c. 2) Theoretical: relative frequency/proportion of a given event given all possible outcomes (sample space)
d. Event: outcome of a random phenomenon
e. n(S) – number of points in the sample space
f. n(A) – number of points that belong to A
g. p 183: Empirical: P'(A) = n(A)/n = #observed/#attempted
h. p 185: Law of large numbers – Experimental -> Theoretical
i. p. 194: Theoretical: P(A) = n(A)/n(S), favorable/possible
j. 0 ≤ P(A) ≤ 1, ∑ (all outcomes) P(A) = 1
k. p. 189: S = sample space, n(S) – # sample points. Represented as a listing {(, ), …}, tree diagram, or grid
l. p. 197: Complementary events: $P(A) + P(\bar{A}) = 1$
m. p200: Mutually exclusive events: both can't happen at the same time
n. p203: Addition Rule: P(A or B) = P(A) + P(B) – P(A and B) [P(A and B) = 0 if exclusive]
o. p207: Independent events: occurrence (or not) of A does not impact P(B) & vice versa
p. Conditional probability: P(A|B) – probability of A given that B has occurred. P(B|A) – probability of B given that A has occurred
q. Independent events iff P(A|B) = P(A) and P(B|A) = P(B)
r. Special multiplication rule (independent events): P(A and B) = P(A)*P(B)
s. General multiplication rule: P(A and B) = P(A)*P(B|A) = P(B)*P(A|B)
t. Odds / Permutations
u. Order important vs not (probability of picking four numbers)
v. Permutations: nPr = n!/(n – r)!, number of ways to pick r item(s) from n items if order is important. Note: arrangements of n items with repetitions, p alike and q alike = n!/(p!q!)
w. Combinations: nCr = n!/((n – r)!r!), number of ways to pick r item(s) from n items if order is NOT important
x. Replacement vs not (AAKKKQQJJJJ10): (a) Pick an A, replace, then pick a K. (b) Pick a K, keep it, pick another
y. Fair odds – if odds are 1/1000 and payout is 1000. May take 3000 plays to win, may win after 200

8. Probability Distribution
a. Refresh on number of heads from tossing 3 coins. Do grid {HHH, …, TTT}, then #Heads vs frequency chart {(0,1), (1,3), (2,3), (3,1)} – note Pascal's triangle
b. Random variable – circle #Heads on the graph above. "Assumes unique numerical value for each outcome in sample space of probability experiment"
c. Discrete – countable number
d. Continuous – infinite possible values
e. Probability distribution: next to the coins frequency chart, add a P(x) column with values 1/8, 3/8, 3/8, 1/8
f. Probability function: obeys the two properties of probability (0 ≤ P(A) ≤ 1, ∑ (all outcomes) P(A) = 1)
g. Parameter: unknown # describing the population
h. Statistic: # computed from sample data

                        Sample        Population
   Mean                 $\bar{x}$     μ (mu)
   Variance             $s^2$         $\sigma^2$
   Standard deviation   s             σ (sigma)

i. Base: $\bar{x} = \sum x / n$, $s^2 = \sum (x - \bar{x})^2 / (n - 1)$

             Frequency distribution                            Probability distribution
   Mean      $\bar{x} = \sum xf / \sum f$                      $\mu = \sum [x P(x)]$
   Var       $s^2 = \sum (x - \bar{x})^2 f / (\sum f - 1)$     $\sigma^2 = \sum [(x - \mu)^2 P(x)]$
   Std Dev   $s = \sqrt{s^2}$                                  $\sigma = \sqrt{\sigma^2}$

j. The probability P(x) acts as $f / \sum f$. Lose the -1

9. Sampling Distribution
a. By the law of large #'s, as n -> population size, $\bar{x} \to \mu$
b. Given $\bar{x}$ as the mean of an SRS of size n, from a population with μ and σ: the mean of the sampling distribution of $\bar{x}$ is μ and its standard deviation is $\sigma / \sqrt{n}$
c. If individual observations have normal distribution N(μ,σ), then $\bar{x}$ of n observations has $N(\mu, \sigma / \sqrt{n})$
d. Central Limit Theorem: given an SRS of size n from a population with μ and σ, when n is large the sample mean $\bar{x}$ is approximately normal

10. Binomial Distribution
a. Binomial experiment. Emphasize Bi – two possible outcomes (success, failure). n repeated identical trials with complementary outcomes: P(success) + P(failure) = 1. The binomial variable x is the count of successful trials, where 0 ≤ x ≤ n
b. p: probability of success of each observation
c. Binomial coefficient: nCk = n!/((n – k)!k!)
d. Binomial probability: $P(x = k) = \binom{n}{k} p^k (1 - p)^{n - k}$
e. Binomial mean: μ = np
f. Binomial standard deviation: $\sigma = \sqrt{np(1 - p)}$

11. Confidence Intervals
a. Statistical inference: methods for inferring data about a population from a sample
b. If $\bar{x}$ is unbiased, use it to estimate μ
c. Confidence interval: estimate +/- error margin
d. Confidence level C: probability that the interval captures the true parameter value in repeated samples
e. Given an SRS of size n & a normal population, the level C confidence interval for μ is: $\bar{x} \pm z^* \sigma / \sqrt{n}$
f. Sample size for a desired margin of error – set the +/- value above equal to the margin & solve for n

12. Tests of significance
g. Assess evidence supporting a claim about the population
h. Idea – an outcome that would rarely happen if the claim were true is evidence that the claim is not true
i. H0 – null hypothesis: the test is designed to assess evidence against H0. Usually a statement of no effect
j. Ha – alternative hypothesis about the population parameter, opposing the null
k. Two sided: H0: μ = 0, Ha: μ ≠ 0
l. P-value: probability, assuming H0 is true, that the test statistic would be as or more extreme (a smaller P-value is stronger evidence against H0)
m. $z = (\bar{x} - \mu_0) / (\sigma / \sqrt{n})$
n. Significance level α: if α = .05, then a result this extreme happens no more than 5% of the time when H0 is true. "Results were significant (P < .01)"
o. A level α 2-sided test rejects H0: μ = μ0 when μ0 falls outside a level 1 – α confidence interval
a. Complicating factors: not a complete SRS from the population, multistage & many-factor designs, outliers, non-normal distribution, σ unknown
b. Undercoverage and nonresponse are often more serious than the random sampling error accounted for by the confidence interval
c. Type I error: reject H0 when it is true – α gives the probability of this error
d. Type II error: fail to reject (accept) H0 when Ha is true
e. Power is 1 – probability of a Type II error

f0acc37b-be55-453e-be4c-8c9709fb5712.doc  Printed 4/8/2009
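The sum-of-squares shortcuts in sections 2 to 4 (SS_x, SS_xy, r, b1, b0) can be checked numerically. The sketch below is only an illustration: the function names and the five data points are made up for this example and do not come from the cheat sheet.

```python
import math


def sum_of_squares(xs):
    """SS_x = sum(x^2) - (sum x)^2 / n (shortcut form)."""
    n = len(xs)
    return sum(x * x for x in xs) - sum(xs) ** 2 / n


def sum_of_cross_products(xs, ys):
    """SS_xy = sum(xy) - (sum x)(sum y) / n."""
    n = len(xs)
    return sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n


def fit_line(xs, ys):
    """Return (r, b1, b0) for the least-squares line y-hat = b0 + b1*x."""
    n = len(xs)
    ss_x = sum_of_squares(xs)
    ss_y = sum_of_squares(ys)
    ss_xy = sum_of_cross_products(xs, ys)
    r = ss_xy / math.sqrt(ss_x * ss_y)   # Pearson correlation
    b1 = ss_xy / ss_x                    # slope
    b0 = (sum(ys) - b1 * sum(xs)) / n    # intercept
    return r, b1, b0


# Hypothetical data, chosen only so the arithmetic is easy to follow.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
x_bar = sum(xs) / n
s_x = math.sqrt(sum_of_squares(xs) / (n - 1))   # sample standard deviation of x
s_y = math.sqrt(sum_of_squares(ys) / (n - 1))   # sample standard deviation of y
z_scores = [(x - x_bar) / s_x for x in xs]      # z = (x - x_bar) / s

r, b1, b0 = fit_line(xs, ys)
residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]  # observed - predicted

print(f"r = {r:.4f}, b1 = {b1:.4f}, b0 = {b0:.4f}")
```

For this data SS_x = 10, SS_y = 6 and SS_xy = 6, so the slope is 0.6 and the intercept is 2.2. The slope also equals r (s_y / s_x), and the residuals sum to zero, as expected for a least-squares line through the centroid.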

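The counting rules, the binomial formula, the z confidence interval and the z test statistic from sections 7 to 12 can be sketched with the Python standard library. This is a rough illustration under stated assumptions: σ is treated as known, the helper names are hypothetical, and the numbers (x-bar = 54, μ0 = 50, σ = 10, n = 25) are invented for the example.

```python
import math
from statistics import NormalDist


def binomial_pmf(k, n, p):
    """P(x = k) = nCk * p^k * (1 - p)^(n - k)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


def z_interval(x_bar, sigma, n, confidence=0.95):
    """Level-C interval x_bar +/- z* sigma / sqrt(n), assuming sigma is known."""
    z_star = NormalDist().inv_cdf((1 + confidence) / 2)
    margin = z_star * sigma / math.sqrt(n)
    return x_bar - margin, x_bar + margin


def z_test_two_sided(x_bar, mu_0, sigma, n):
    """Return (z, P-value) for H0: mu = mu_0 against Ha: mu != mu_0."""
    z = (x_bar - mu_0) / (sigma / math.sqrt(n))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Counting: order matters (permutations) vs. order does not (combinations).
ways_ordered = math.perm(5, 2)     # 5!/(5 - 2)!
ways_unordered = math.comb(5, 2)   # 5!/((5 - 2)! 2!)

# Number of heads in 3 fair coin tosses: P(x) for x = 0, 1, 2, 3.
coin_pmf = [binomial_pmf(k, 3, 0.5) for k in range(4)]
coin_mean = 3 * 0.5                       # mu = np
coin_sd = math.sqrt(3 * 0.5 * (1 - 0.5))  # sigma = sqrt(np(1 - p))

# Hypothetical sample: x_bar = 54 from n = 25 observations with sigma = 10.
low, high = z_interval(54, 10, 25, confidence=0.95)
z, p_value = z_test_two_sided(54, 50, 10, 25)

print(coin_pmf)
print(f"95% CI: ({low:.2f}, {high:.2f}), z = {z:.2f}, P = {p_value:.4f}")
```

In this example the 95% interval does not contain μ0 = 50 and the two-sided P-value is below .05, which matches the rule that a level α two-sided test rejects H0 exactly when μ0 falls outside the level 1 – α confidence interval.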
Posted: 4/8/2009
Language: English
Pages: 3
