Statistics



What Is Engineering?

July 16, 2007

Why Study Statistics?



Statistics: A mathematical science concerned with data collection,

presentation, analysis, and interpretation.



Statistics can tell us about…



Sports

Economy

Population

[Figure: New York City, NY. Pie chart "Employment Rates: 2006" with categories Unemployed, Employed, and Other (slice labels 4.6%, 32.3%, 63.1%), and a line chart of Estimated Population (axis from 7,900,000 to 8,250,000) by Year, 2000 to 2006.]

Why Study Statistics?

Statistical analysis is also an integral part of scientific research!

Are your experimental results believable?



Example: Breaking strength of Welds

Control:                      Response:
Velocity (10^2 ft/min)        Breaking Strength (ksi)
2.00                          89
2.5                           97, 91
2.75                          98
3.00                          100, 104, 97





The data suggest a relationship between velocity and breaking strength!

The relationship is not perfect: the measurements contain random error.



(To make a weld, the operator stops a rotating part by forcing it into a stationary

part; the resulting friction generates heat that produces a hot-pressure weld.)
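
One way to put a number on "suggests a relationship" is the sample correlation coefficient together with a least-squares slope. The sketch below is only an illustration using the seven weld measurements from the table; the function name `pearson_and_slope` is hypothetical, and the slides do not say which analysis was intended.

```python
from math import sqrt

# Weld data from the table: (velocity in the table's units, breaking strength in ksi).
welds = [(2.00, 89), (2.5, 97), (2.5, 91), (2.75, 98),
         (3.00, 100), (3.00, 104), (3.00, 97)]

def pearson_and_slope(pairs):
    """Return the sample correlation r and the least-squares slope of y on x."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    return sxy / sqrt(sxx * syy), sxy / sxx

r, slope = pearson_and_slope(welds)
print(f"correlation r = {r:.2f}, slope = {slope:.1f} ksi per unit of velocity")
```

A value of r close to 1 indicates a strong positive linear trend; the scatter that keeps r below 1 is the random error mentioned above.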

Why Study Statistics?



Responses and measurements are variable!



Due to…

Randomness (or individual differences) in sampling a population.





Inability to perform measurements in exactly the

same way every time.



The goal of statistics is to find the model that best describes a target population, using sample data.





Represent randomness using probability.

Probability



Experiment of chance: a phenomenon whose outcome is uncertain.



[Diagram: Probabilities (chances). A Probability Model consists of a Sample Space, Events, and the Probability of Events.]







Sample Space: Set of all possible outcomes



Event: A set of outcomes (a subset of the sample space). An

event E occurs if any of its outcomes occurs.



Probability: The likelihood that an event will occur, expressed as a number between 0 and 1.

Probability



Consider a deck of playing cards…









Sample Space? Set of 52 cards

Event? R: The card is red.       F: The card is a face card.
       H: The card is a heart.   3: The card is a 3.

Probability? P(R) = 26/52   P(F) = 12/52
             P(H) = 13/52   P(3) = 4/52
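
These probabilities can be checked by listing the sample space and counting the outcomes in each event. The following is a minimal sketch, assuming a standard 52-card deck with jack, queen, and king as the face cards; the helper name `prob` is hypothetical.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))   # sample space: 52 (rank, suit) outcomes

def prob(event):
    """Probability of an event = favorable outcomes / all outcomes (equally likely)."""
    favorable = [card for card in deck if event(card)]
    return Fraction(len(favorable), len(deck))

p_red = prob(lambda card: card[1] in ("hearts", "diamonds"))
p_face = prob(lambda card: card[0] in ("J", "Q", "K"))
p_heart = prob(lambda card: card[1] == "hearts")
p_three = prob(lambda card: card[0] == "3")   # one 3 in each of the four suits

print(p_red, p_face, p_heart, p_three)
```

`Fraction` reduces each result to lowest terms, so 26/52 is printed as 1/2.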

Events and variables



Can be described as random or deterministic:



The outcome of a random event cannot be predicted:





The sum of two numbers on two rolled dice.





The time of emission of the ith particle from radioactive material.







The outcome of a deterministic event can be predicted:



The measured length of a table to the nearest cm.





Motion of macroscopic objects (projectiles, planets, spacecraft) as predicted by classical mechanics.

Extent of randomness



A variable can be more random or more deterministic depending

on the degree to which you account for relevant parameters:

Mostly deterministic:

Only a small fraction of the outcome cannot be accounted for.

Length of a table:

• Temperature/humidity variation

• Measurement resolution

• Instrument/observer error

• Quantum-level intrinsic uncertainty





Mostly Random:



Most of the outcome cannot be accounted for.



• Trajectory of a given molecule in a solution

Random variables



Can be described as discrete or continuous:



• A discrete variable has a countable number of values.

Number of customers who enter a store before one

purchases a product.

• The values of a continuous variable cannot be listed:

Distance between two oxygen molecules in a room.



Consider data collected for undergraduate students:

Random Variable     Possible Values
Gender              Male, Female
Class               Fresh, Soph, Jr, Sr
Height (inches)     Number in the interval [30, 90]
College             Arts, Education, Engineering, etc.
Shoe Size           3, 3.5 … 18



Is the height a discrete or continuous variable?



How could you measure height and shoe size to make them continuous variables?

Probability Distributions



If a random event is repeated many times, it will produce a

distribution of outcomes (statistical regularity).



(Think about scores on an exam)



The distribution can be represented in two ways:



• Frequency distribution function: represents the

distribution as the number of occurrences of each

outcome

• Probability distribution function: represents the distribution as the fraction (relative frequency) of occurrences of each outcome
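
The two representations can be compared with a small simulation. This sketch repeats the "sum of two rolled dice" experiment; the number of repetitions and the fixed seed are assumptions chosen only to make the run repeatable.

```python
import random
from collections import Counter

rng = random.Random(0)   # fixed seed so the run is repeatable
n_rolls = 10_000         # assumed number of repetitions

sums = [rng.randint(1, 6) + rng.randint(1, 6) for _ in range(n_rolls)]

frequency = Counter(sums)                                    # counts per outcome
relative = {s: frequency[s] / n_rolls for s in sorted(frequency)}  # fractions per outcome

for s, share in relative.items():
    print(f"sum {s:2d}: count {frequency[s]:5d}, fraction {share:.3f}")
```

The counts add up to the number of repetitions, while the fractions add up to 1; dividing by the number of repetitions is the only difference between the two.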

Discrete Probability Distributions



Consider a discrete random variable, X:

$f(x_i)$ is the probability distribution function

What is the range of values of $f(x_i)$?

Therefore, $\Pr(X = x_i) = f(x_i)$

Discrete Probability Distributions



Properties of discrete probabilities:





$$\Pr(X = x_i) = f(x_i) \ge 0 \quad \text{for all } i$$

$$\sum_{i=1}^{k} \Pr(X = x_i) = \sum_{i=1}^{k} f(x_i) = 1 \quad \text{for } k \text{ possible discrete outcomes}$$

$$\Pr(a < X \le b) = F(b) - F(a) = \sum_{a < x_i \le b} f(x_i)$$

Where: $F(x) = \Pr(X \le x)$
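
The three properties can be verified on a concrete case. The sketch below uses the sum of two fair dice as an assumed example and exact fractions, so the checks are not affected by rounding; the interval endpoints a = 4 and b = 7 are arbitrary.

```python
from fractions import Fraction
from itertools import product

# Exact probability distribution f(x_i) for the sum of two fair dice.
pmf = {}
for first, second in product(range(1, 7), repeat=2):
    total = first + second
    pmf[total] = pmf.get(total, Fraction(0)) + Fraction(1, 36)

def cdf(x):
    """F(x) = Pr(X <= x): add up f(x_i) over all values x_i <= x."""
    return sum((p for value, p in pmf.items() if value <= x), Fraction(0))

a, b = 4, 7
direct = sum(p for value, p in pmf.items() if a < value <= b)  # sum over a < x_i <= b
via_cdf = cdf(b) - cdf(a)                                      # F(b) - F(a)

print(all(p >= 0 for p in pmf.values()), sum(pmf.values()), direct, via_cdf)
```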

Discrete Probability Distributions



Example: Waiting for a success





Consider an experiment in which we toss a coin until heads turns up.



Outcomes, w = {H, TH, TTH, TTTH, TTTTH…}

Let X(w) be the number of tails before heads turns up.

$$f(x) = \frac{1}{2^{x+1}} \quad \text{for } x = 0, 1, 2, \ldots$$

[Figure: bar chart of f(x) (axis from 0 to 0.5) versus waiting time x = 0, 1, …, 6.]
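
For a fair coin, x tails followed by one head has probability (1/2)^(x+1). A short sketch that tabulates these values and shows the running total approaching 1 (the cutoff at x = 6 is an arbitrary choice for illustration):

```python
from fractions import Fraction

def f(x):
    """Probability of exactly x tails before the first head, for a fair coin."""
    return Fraction(1, 2 ** (x + 1))

first_values = [f(x) for x in range(7)]   # x = 0, 1, ..., 6
partial_sum = sum(first_values)           # approaches 1 as more terms are added

for x, p in enumerate(first_values):
    print(f"f({x}) = {p} = {float(p):.4f}")
print("sum of the first seven terms:", partial_sum)
```

Each value is half of the one before it, and the part of the total probability not yet accounted for halves with every added term.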

Cumulative Discrete Probability Distributions



$$\Pr(X \le x') = F(x') = \sum_{i=1}^{j} f(x_i)$$

Where $x_j$ is the largest discrete value of X less than or equal to x'

$$\Rightarrow \Pr(X \le x_k) = 1$$
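
The cumulative distribution of a discrete variable is a step function: for any x', it adds up the probabilities of all values up to the largest one not exceeding x'. A minimal sketch, assuming a fair six-sided die as the example variable:

```python
from bisect import bisect_right
from fractions import Fraction
from itertools import accumulate

values = [1, 2, 3, 4, 5, 6]          # x_1 ... x_k in increasing order
pmf = [Fraction(1, 6)] * 6           # f(x_i) for a fair die (assumed example)
running = list(accumulate(pmf))      # running[j-1] = f(x_1) + ... + f(x_j)

def F(x_prime):
    """Pr(X <= x'): sum f(x_i) up to the largest x_j that is <= x'."""
    j = bisect_right(values, x_prime)   # number of values <= x'
    return running[j - 1] if j > 0 else Fraction(0)

print(F(0), F(3), F(3.7), F(6), F(100))
```

F stays flat between the discrete values (F(3) and F(3.7) agree) and equals 1 from the largest value onward.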

Cumulative Continuous Probability Distributions



For continuous variables, the events of interest are intervals rather

than isolated values.



Consider the waiting time T for a bus that is equally likely to arrive at any moment in the next ten minutes:

We are not interested in the probability that the bus will arrive in exactly 3.451233 minutes, but rather the probability that the bus will arrive in the subinterval (a,b) minutes:



$$P(a < T \le b) = F(b) - F(a) = \frac{b - a}{10}$$

[Figure: plot of F(t) versus t, reaching 1 at t = 10.]
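
The bus example can be written out directly. This is a sketch under the stated assumption that the arrival time is uniform over the next ten minutes; the function names are hypothetical.

```python
def F(t, wait=10.0):
    """Cumulative distribution of the arrival time, uniform on [0, wait] minutes."""
    if t <= 0:
        return 0.0
    if t >= wait:
        return 1.0
    return t / wait

def prob_between(a, b):
    """Probability that the bus arrives in the subinterval (a, b]."""
    return F(b) - F(a)

print(prob_between(2, 5))      # (5 - 2) / 10
print(prob_between(3.45, 3.45))  # a single instant has probability 0
```

The second call shows why isolated values are not the events of interest for a continuous variable.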

Continuous Probability Density Function



c.d.f: Gives the fraction of the total probability that lies at or to the

left of each x



p.d.f: Gives the density of concentration of probability at each

point x





In terms of the c.d.f.:



$$P(x < X \le x + \Delta x) = F(x + \Delta x) - F(x) = \Delta F(x)$$

When F(x) is differentiable at x, and Δx is small, we can approximate ΔF by the differential of F:

$$dF(x) = F'(x)\,\Delta x$$
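
The approximation of ΔF by F'(x)Δx can be checked numerically. The sketch below uses an exponential distribution with rate 1 purely as an assumed example (it does not appear in the slides); its c.d.f. is 1 − e^(−x) and the derivative of that c.d.f. is e^(−x).

```python
from math import exp

def F(x):
    """Assumed example c.d.f.: exponential distribution with rate 1."""
    return 1.0 - exp(-x) if x > 0 else 0.0

def F_prime(x):
    """Derivative of F, i.e. the probability density at x."""
    return exp(-x) if x > 0 else 0.0

x, dx = 1.0, 1e-4
delta_F = F(x + dx) - F(x)   # exact probability of the small interval (x, x + dx]
dF = F_prime(x) * dx         # differential approximation

print(delta_F, dF, abs(delta_F - dF) / dF)
```

The relative difference shrinks as Δx gets smaller, which is why the derivative of the c.d.f. is read as a density of probability per unit of x.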

Continuous Probability Distributions



Properties of the cumulative distribution function:



$$F(-\infty) = 0$$

$$0 \le F(x) \le 1$$

$$F(\infty) = 1$$



Properties of the probability density function:



$$\Pr(a < X \le b) = F(b) - F(a) = \int_a^b f(x)\,dx$$

Continuous Probability Distributions



Example: Gaussian (normal) distribution:









$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(x - \mu)^2}{2\sigma^2}\right]$$









Each member of the normal distribution family is described by the mean (μ) and variance (σ²).



Standard normal curve: μ = 0, σ = 1.
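
As a check that the area under the density between a and b equals F(b) − F(a), the sketch below integrates the standard normal density numerically and compares the result with the c.d.f. written in terms of the error function. The midpoint rule and the step count are arbitrary choices for illustration.

```python
from math import erf, exp, pi, sqrt

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Gaussian probability density with mean mu and standard deviation sigma."""
    return exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * sqrt(2 * pi))

def normal_cdf(x, mu=0.0, sigma=1.0):
    """Gaussian cumulative distribution, expressed with the error function."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

def integrate(f, a, b, steps=10_000):
    """Midpoint-rule approximation of the integral of f from a to b."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

area = integrate(normal_pdf, -1.0, 1.0)           # integral of f(x) from -1 to 1
via_cdf = normal_cdf(1.0) - normal_cdf(-1.0)      # F(1) - F(-1)

print(f"{area:.4f} {via_cdf:.4f}")
```

Both numbers give the familiar result that roughly 68% of a normal distribution lies within one standard deviation of the mean.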

Central Limit Theorem



As the sample size goes to infinity, the distribution function of the standardized sample mean converges to the standard normal distribution function!









http://www.jhu.edu/virtlab/prob-distributions/
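
The theorem can be illustrated by simulation. In this sketch the population is uniform on (0, 1), which is far from bell-shaped; the sample size, the number of trials, and the seed are all assumptions made for the illustration.

```python
import random
from math import sqrt
from statistics import mean, pstdev

rng = random.Random(42)          # fixed seed so the run is repeatable
n, trials = 30, 5000             # assumed sample size and number of samples
mu, sigma = 0.5, sqrt(1 / 12)    # mean and standard deviation of uniform(0, 1)

def standardized_mean():
    """Draw one sample of size n and standardize its mean."""
    xs = [rng.random() for _ in range(n)]
    return (sum(xs) / n - mu) / (sigma / sqrt(n))

z = [standardized_mean() for _ in range(trials)]
within_one = sum(abs(v) <= 1 for v in z) / trials

print(f"mean {mean(z):.3f}, std dev {pstdev(z):.3f}, share within ±1: {within_one:.3f}")
```

If the standardized means behave like a standard normal variable, the mean should be near 0, the standard deviation near 1, and about 68% of the values should fall within ±1.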

Moments



In physics, the moment refers to the force applied to a system at a distance

from the axis of rotation (as in a lever).





In mathematics, a moment measures how the values of a distribution are spread relative to a reference point, such as the origin or the mean.





The 1st moment about the origin: (mean)

→ Average value of x

The 2nd moment about the mean: (variance)

→ A measure of the ‘spread’ of the data

Moments



Other values in terms of the moments:



Skewness: $\mu_3 / \mu_2^{3/2}$, where $\mu_k$ is the k-th moment about the mean

‘lopsidedness’ of the distribution

→ a symmetric distribution will have a skewness = 0

→ negative skewness: the longer tail extends to the left

→ positive skewness: the longer tail extends to the right

Kurtosis: $\mu_4 / (\mu_2)^2$

→ Describes the shape of the distribution with respect to the height and width of the curve (‘peakedness’)
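
Skewness and kurtosis are ratios of moments about the mean: the third moment divided by the second moment to the power 3/2, and the fourth moment divided by the square of the second. A minimal sketch on two made-up data sets (both are assumptions for illustration, and the moments here divide by the number of data points):

```python
def central_moment(data, k):
    """k-th moment about the mean, averaging over all data points."""
    m = sum(data) / len(data)
    return sum((x - m) ** k for x in data) / len(data)

def skewness(data):
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5

def kurtosis(data):
    return central_moment(data, 4) / central_moment(data, 2) ** 2

symmetric = [1, 2, 3, 4, 5]         # evenly spread about its mean
right_tailed = [1, 1, 1, 2, 10]     # one large value stretches the right tail

print(skewness(symmetric), kurtosis(symmetric))
print(skewness(right_tailed), kurtosis(right_tailed))
```

The symmetric set has zero skewness, while the set with a single large value has positive skewness.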

Standard Error



Standard Deviation: $\sigma = \sqrt{\sigma^2}$



Variance is the average squared distance of the data from the mean.

Therefore, the standard deviation measures the spread of data about the

mean.





Standard Error: $\sigma / \sqrt{N}$, where N is the sample size



How do we reduce the size of our standard error?
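
One answer is to collect more data. Taking the standard error of the mean as σ/√N, the sketch below shows how it shrinks as N grows; the value σ = 5 and the sample sizes are assumptions for illustration.

```python
from math import sqrt

def standard_error(sigma, n):
    """Standard error of the mean for standard deviation sigma and sample size n."""
    return sigma / sqrt(n)

sigma = 5.0   # assumed standard deviation
for n in (25, 100, 400):
    print(f"N = {n:3d}: standard error = {standard_error(sigma, n):.2f}")
```

Because of the square root, quadrupling the sample size only halves the standard error.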

Independence



A measure of whether two variables are related.



Consider data collected for arrowhead breakage: random variation or correlation?

         Base   Middle   Tip   Total
Fire      21      8      18     47
Other     15     11       4     30
Total     36     19      22     77









Does the location of the fracture depend on the cause of fracture?

Or in other words, is the location of fracture independent of the

cause?
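
One common way to examine this question is Pearson's chi-square statistic, which compares the observed counts with the counts expected if cause and location were independent. The slides do not say which test was intended, so the sketch below is an assumption about the method, not the author's analysis. The closed-form p-value in the last step is valid only because this table has 2 degrees of freedom.

```python
from math import exp

# Observed counts from the arrowhead table.
observed = {
    "Fire":  {"Base": 21, "Middle": 8,  "Tip": 18},
    "Other": {"Base": 15, "Middle": 11, "Tip": 4},
}

row_totals = {cause: sum(cells.values()) for cause, cells in observed.items()}
col_totals = {}
for cells in observed.values():
    for location, count in cells.items():
        col_totals[location] = col_totals.get(location, 0) + count
total = sum(row_totals.values())

# Under independence, expected count = row total * column total / grand total.
chi2 = 0.0
for cause, cells in observed.items():
    for location, count in cells.items():
        expected = row_totals[cause] * col_totals[location] / total
        chi2 += (count - expected) ** 2 / expected

dof = (len(row_totals) - 1) * (len(col_totals) - 1)
p_value = exp(-chi2 / 2)   # chi-square tail probability; this form holds only for dof == 2

print(f"chi-square = {chi2:.2f}, degrees of freedom = {dof}, p-value = {p_value:.3f}")
```

A small p-value means that differences this large between observed and expected counts would be unusual if location were independent of cause.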

