Class width

					8/24/09

Book: Probability and Statistics for Engineering and the Sciences, 7th ed.

Descriptive Statistics

Data representation
       Chart or Graph
       Tabulation
       Numerical Values to describe the data

Frequency table for the tabulation method:

       X (variable)     Frequency      Relative Freq (freq / total)   Expected
       1                 5              5/24 = 0.21                    1/6
       2                 5              5/24 = 0.21                    1/6
       3                 4              4/24 = 0.17                    1/6
       4                 4              4/24 = 0.17                    1/6
       5                 4              4/24 = 0.17                    1/6
       6                 2              2/24 = 0.08                    1/6
       Total             24             24/24 = 1.0                    6/6 = 1


At most 2 (X <= 2) = rel freq for 1 and 2 = 10/24 = 42%

Graphing:
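Bar graph: frequency (vertical axis) vs. variable (horizontal axis); uses lines

Histogram: frequency (vertical axis) vs. variable (horizontal axis); uses rectangles. For a density histogram, the total area of the rectangles should be 1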

Discrete Variable: countable number of possible values
Continuous Variables
       Numerical values can't be listed as a countable set of outcomes (measured values)
       Group into classes of distinct units to make it countable
       Range of data: find the highest and lowest values
              Range: (98-56) = 42                      n = 24
              # of classes = sqrt(# of observations)  k = sqrt(n) = sqrt(24) = 4.9, so use 5

Class width = range / k = 42/5 = 8.4, rounded up to 9
Choose class boundaries that no observation can fall on, even if they're fractions (e.g. 54.5)
Freq Table:
Class        Frequency       Rel Freq     Density (rel freq / 9)
54.5-63.5    1               1/24         1/216
63.5-72.5    3               3/24         3/216
72.5-81.5    7               7/24         7/216
81.5-90.5    8               8/24         8/216
90.5-99.5    5               5/24         5/216
[Histogram: frequency on the vertical axis; class boundaries 54.5, 63.5, 72.5, 81.5, 90.5, 99.5 on the horizontal axis]
Density of each class (the height) = relative frequency / class width
8/26

Stem & Leaf Plot Stem = common leading digit(s), leaf = trailing digit(s)
92 82 84 74 56 77
91 78 85 90 66 75
72 73 79 89 93 69
82 87 76 88 97 98


Stem    Leaf              Freq
5       6                 1
6       69                2
7       47852396          8
8       2459278           7
9       210378            6

Split stems: L = leaves 0-4, H = leaves 5-9
Stem    Leaf              Freq
5L                        0
5H      6                 1
6L                        0
6H      69                2
7L      423               3
7H      78596             5
8L      242               3
8H      5978              4
9L      2103              4
9H      78                2
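
A stem-and-leaf display like the ones above can be rebuilt mechanically, which is a handy check on the hand counts. The sketch below is only an illustration: the function name `stem_and_leaf` and the `split` flag are made up for this example, and it assumes two-digit data so that the stem is the tens digit and the leaf is the units digit.

```python
from collections import defaultdict

# The 24 scores listed in the notes, read row by row.
scores = [92, 82, 84, 74, 56, 77,
          91, 78, 85, 90, 66, 75,
          72, 73, 79, 89, 93, 69,
          82, 87, 76, 88, 97, 98]

def stem_and_leaf(data, split=False):
    """Map each stem label to its leaves, kept in data order.

    With split=True every stem is divided into L (leaves 0-4) and H (leaves 5-9).
    """
    table = defaultdict(list)
    for value in data:
        stem, leaf = divmod(value, 10)  # tens digit, units digit
        label = str(stem)
        if split:
            label += "L" if leaf <= 4 else "H"
        table[label].append(leaf)
    return dict(table)

plain = stem_and_leaf(scores)
halves = stem_and_leaf(scores, split=True)

# Print the split display with L before H for each stem.
for label in sorted(halves, key=lambda s: (s[0], s[1] == "H")):
    leaves = "".join(str(leaf) for leaf in halves[label])
    print(f"{label:<4}{leaves:<10}{len(halves[label])}")
```

Stems with no observations (5L and 6L here) simply do not appear in the dictionary; the frequencies of the remaining rows must add up to n = 24.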


Unimodal curve: has one peak (bell curve)
Bimodal curve: has 2 peaks (not much coverage of this kind of curve in this class)

Unimodal curves
     symmetric: symmetric about the mean value (standard bell curve)
     positively skewed: peaks early and tapers off slowly; the long tail is on the right
     negatively skewed: starts slowly, peaks late, and tapers off quickly; the long tail is on the left

Measures of Location
      Mean: average (also called centroid)
      Median: middle value; half of the data lies on either side
      Mode: value that occurs the most in the data set

Data: for n observations consider these to be denoted by x1, x2, x3, … , xn
        Mean: x̄ = (x1 + x2 + x3 + … + xn) / n = (∑ xi, i from 1 to n) / n
               (better for symmetric)

       Median: sort the data in ascending order, x(1) <= x(2) <= … <= x(n); then
               x̃ = x((n+1)/2) if n is odd, (x(n/2) + x(n/2 + 1)) / 2 if n is even
             (better for skewed)
       Ex 35 p30
              n = 14
              244, 191, 160, 187, 180, 176, 174
              205, 211, 183, 211, 180, 194, 200
       x̄ = (244 + 191 + 160 + 187 + … + 200) / 14 = 2696 / 14 = 192.6  (carry 1 extra significant digit)
       Sorted: {160, 174, 176, 180, 180, 183, 187, 191, 194, 200, 205, 211, 211, 244}
       n is even, so x̃ = (x(n/2) + x(n/2 + 1)) / 2 = (x(7) + x(8)) / 2 = (187 + 191) / 2 = 189
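
As a quick check of Ex 35, the mean and the odd/even median rule can be written out directly. This is a minimal sketch; the variable names are chosen for this example, and `statistics.median` from the standard library is used only to confirm the hand-rolled rule.

```python
import statistics

x = [244, 191, 160, 187, 180, 176, 174,
     205, 211, 183, 211, 180, 194, 200]
n = len(x)

mean = sum(x) / n

# Median: sort first, then take the middle value (odd n)
# or the average of the two middle values (even n).
ordered = sorted(x)
if n % 2 == 1:
    median = ordered[(n + 1) // 2 - 1]   # x((n+1)/2), shifted to 0-based indexing
else:
    median = (ordered[n // 2 - 1] + ordered[n // 2]) / 2

print(round(mean, 1), median, statistics.median(x))
```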

Measure of Spread
      Range = largest – smallest data value = x(n) – x(1)
      Deviation from the mean: (xi – x̄) for i = 1 to n
      Sum of squared deviations: ∑ (xi – x̄)^2 for i = 1 to n
      Sample variance: s^2 = 1/(n-1) * ∑ (xi – x̄)^2 for i = 1 to n

       Shortcut form: s^2 = 1/(n-1) * [∑xi^2 – ((∑xi)^2)/n]

       Standard deviation: s = sqrt(s^2)

       Computations:
       For data set of x1, x2, … , xn obtain:
       n: number of observations
       ∑xi for i = 1 to n: sum of observations
       ∑xi^2 for i = 1 to n: sum of squared observations

       x̄ = sum of observations / n
       s^2 = [sum of squared observations – (sum of observations)^2 / n] / (n-1)
       s = sqrt(s^2)

       Ex 1.16
              n = 15
              ∑xi = 216.1
              ∑xi^2 = 3168.13
              x̄ = 216.1 / 15 = 14.4
              s^2 = [3168.13 – (216.1^2)/15] / 14 = 3.92
              s = 1.98
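
The computation recipe above needs only three summary numbers: n, the sum of the observations, and the sum of the squared observations. A small sketch of it, with the hypothetical helper name `sample_stats`, applied to the Ex 1.16 totals:

```python
import math

def sample_stats(n, sum_x, sum_x2):
    """Mean, sample variance and standard deviation from summary totals.

    Uses the shortcut s^2 = [sum(x^2) - (sum x)^2 / n] / (n - 1).
    """
    mean = sum_x / n
    s2 = (sum_x2 - sum_x ** 2 / n) / (n - 1)
    return mean, s2, math.sqrt(s2)

mean, s2, s = sample_stats(15, 216.1, 3168.13)
print(round(mean, 1), round(s2, 2), round(s, 2))
```

The shortcut subtracts two large, nearly equal numbers, so with real data in floating point it can lose precision; summing the squared deviations directly is the safer route when the raw observations are available.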

Transformed data:
       (add a constant to each observation)
       yi = xi + c, i.e. x1 + c, x2 + c, x3 + c, … , xn + c
       ȳ = x̄ + c
       sy^2 = sx^2

       (multiply each observation by a constant)
       zi = a*xi
       z̄ = a*x̄
       sz^2 = a^2 * sx^2
       standard deviation: sz = |a| * sx
8/31
Ex 47 pg 39

Data: order from smallest to largest first
       87 93 96 98 105 114 128 131 142 168
Sample mean = x̄ = sum(data)/n = (87 + … + 168) / 10 = 116.2
Standard deviation = sqrt([sum(data^2) – (sum(data))^2 / n] / (n-1)) = sqrt([140992 – 1162^2 / 10] / 9) = 25.75
Median = x̃ = n is even, so average the middle 2 = (105 + 114)/2 = 109.5

Box Plot
Requires a 5 point summary
Smallest value 0%              87
Largest value 100%             168
Median         50%             109.5
Lower fourth 25%               96
Upper fourth 75%               131

 87       96       109.5         131                   168
            _____________________
   ______|          |             |______________________
          |         |             |
85     95      105    115  125     135   145   155   165   175

Fourth spread (spread between the two fourths)
Upper fourth – lower fourth

Fs = 131 – 96 = 35
x is an outlier if x < lower fourth – 1.5 Fs
x is an outlier if x > upper fourth + 1.5 Fs

96 – (35 * 1.5) = 43.5 (lowest point is 87, so no outliers to the left)
131 + (35 * 1.5) = 183.5 (highest point is 168, so no outliers to the right)
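
The five-number summary and the outlier fences for Ex 47 can be reproduced in a few lines. This sketch takes the fourths as the medians of the lower and upper halves of the sorted data, with the overall median included in both halves when n is odd; that convention is an assumption here (other quartile definitions give slightly different values).

```python
import statistics

data = sorted([87, 93, 96, 98, 105, 114, 128, 131, 142, 168])
n = len(data)

half = (n + 1) // 2                       # size of each half (median shared if n is odd)
lower_fourth = statistics.median(data[:half])
upper_fourth = statistics.median(data[n - half:])
fs = upper_fourth - lower_fourth          # fourth spread

low_fence = lower_fourth - 1.5 * fs
high_fence = upper_fourth + 1.5 * fs
outliers = [v for v in data if v < low_fence or v > high_fence]

summary = (data[0], lower_fourth, statistics.median(data), upper_fourth, data[-1])
print(summary, fs, (low_fence, high_fence), outliers)
```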

Random Experiment – Any process of making observations or measurements in which the outcome is uncertain
(that is it can not be predicted with certainty)

Sample Space – The collection of all possible outcomes in a random experiment.
(scripted capital S)

Example: Flip a coin two times.
S = {HH, HT, TH, TT}

Example: 3 kids born to a couple
S = {BBB, BBG, BGB, BGG, GBB, GBG, GGB, GGG}

Example: Items selected until first defective item found
S = {D, ND, NND, NNND, NNNND, …}: countably infinite

Example: Weight of a newborn baby randomly selected
S = {w : a < w < b}
Discrete sample space – consists of finitely many or countably infinitely many outcomes
Continuous sample space – consists of uncountably many outcomes

Event – a subset of a sample space
Ex1) A = heads appears both times on coin toss      A = {HH}
       B = only one head obtained                   B = {HT, TH}
       C = at least one head obtained               C = {HH, HT, TH}

If an event has only 1 outcome, it is a simple event
If it has 2 or more outcomes, it is a compound event

Complementary Event
Complementary event of A (A’) – The collection of outcomes that are in S but not in event A
A’ = {HT, TH, TT}
B’ = {HH, TT}
C’ = {TT}

Null Event (Φ) – The event that contains no outcomes. Two events A and B are mutually exclusive (disjoint) if A ∩ B = Φ

Exercise 3 Pg 50

            [2]
[1] ----<         > -----
            [3]

S = {sss, ssf, sfs, sff, fss, fsf, ffs, fff}

A = Exactly 2 are functioning
A = {ssf, sfs, fss}

B = At least 2 are functioning
B = {sss, ssf, sfs, fss}

C = For system to function, 1 must function and 2 or 3 must function
C = {sss, ssf, sfs}

A U C = {sss, ssf, sfs, fss} = B
A ∩ C = {ssf, sfs}
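
Events are just subsets of S, so the relations in Exercise 3 can be checked with ordinary set operations. In this sketch each outcome is a 3-letter string in which position i is "s" if component i+1 functions and "f" if it fails; that encoding is taken from the sample space listed above.

```python
from itertools import product

# Sample space: every s/f pattern for components 1, 2, 3.
S = {"".join(t) for t in product("sf", repeat=3)}

A = {o for o in S if o.count("s") == 2}            # exactly 2 functioning
B = {o for o in S if o.count("s") >= 2}            # at least 2 functioning
C = {o for o in S if o[0] == "s" and "s" in o[1:]}  # 1 works and (2 or 3) works

print(sorted(A | C))   # union
print(sorted(A & C))   # intersection
print(sorted(S - C))   # complement of C
```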

Venn diagram
S = box
Events as circles
9/2/09

Probability:
To each event A, assign a value called the probability of A, written P(A), such that it satisfies the
following properties

1) P(A) >= 0
2) P(S) = 1
3) If A1, A2, A3, … are mutually exclusive events then P(A1 U A2 U A3 U …) = ∑ P(Ai) for i = 1 to ∞
For a finite number n of these, P(A1 U A2 U … U An) = ∑ P(Ai) for i = 1 to n

Results        P(Φ) = 0
               If A ⊂ B then P(A) <= P(B)

For any 2 events A and B P(A U B) = P(A) + P(B) – P(A∩B)
For any 3 events A,B, C P(A U B U C) = P(A) + P(B) + P(C) – P(A∩B) – P(A∩C) – P(B∩C) + P(A∩B∩C)

P58 # 15
a)
A = Will buy at most 1 electric dryer
A’ = Will buy 2 or more electric dryers
P(A) = .428
P(A’) = 1 - .428 = .572

b)
P(A) = .116
P(B) = .005
1 - .116 - .005 = .879

P58 # 17
a)
P(A) = .30
P(B) = .50
P(A) + P(B) != 1 because there are other software packages, so A U B != S

b) P(A’) = 1 - .30 = .70
c) P(AUB) = P(A) + P(B) = .80 (A and B are mutually exclusive)
d) P(A’ ∩ B’) = 1 - P(AUB) = .20


Determination of Probability of Event
Consider a sample space consisting of N equally likely simple events (outcomes) E1, E2, E3, … EN
Let A be the event that consists of n of these N outcomes; then the probability of A = n/N (the number
of outcomes in A, divided by the total number of outcomes in the sample space)

Roll a die two times (36 possibilities)
A = Roll a sum of 7, (6 possibilities)
P(A) = 6/36 = 1/6
Product Rule
If an experiment E1 has n1 possible outcomes, and an experiment E2 has n2 possible outcomes, such that any of
the n1 outcomes from E1 can be associated with any of the n2 outcomes from E2, then the total number of
paired outcomes = n1 * n2
In general, there are n1 * n2 * … * nk k-tuples resulting from performing k such experiments E1,
E2, … Ek

Definition: an ordered selection of k distinct items is called a permutation.
The number of permutations of k items taken out of n possibilities is given by Pk,n = n! / (n-k)!


5 students, select 3 (order matters)
A, B, C, D, E
5! / 2! = 5 * 4 * 3 = 60

Total! / (total – selection)!

Select 2 cards from 52, order matters
52! / (52-2)! = 52 * 51 = 2652

An unordered selection of k items is called a combination; the number of combinations of n items taken k at a time
is given by Pk,n / k!

nC0 = 1

nCk = n! / ((n-k)! k!)

Total cards: 52, choose 2, order doesn’t matter (each pair can be ordered in 2! = 2 ways)
52C2 = 52! / (50! 2!) = (52 * 51) / 2 = 1326


Probability of getting 2 spades
13c2 / 52c2 = (13 * 12) / (52 * 51) = 1/17

Probability of getting 2 cards of the same suit
4c1 * 13c2 / 52c2 = 4/17
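
The counting formulas above map directly onto `math.perm` and `math.comb` (available in Python 3.8 and later). A short sketch, using exact fractions so the card probabilities come out as 1/17 and 4/17 rather than rounded decimals:

```python
from fractions import Fraction
from math import comb, perm

ordered_students = perm(5, 3)    # 5!/(5-3)!: 3 of 5 students, order matters
ordered_cards = perm(52, 2)      # 52!/(52-2)!: 2 of 52 cards, order matters
unordered_cards = comb(52, 2)    # 52!/(50! 2!): order doesn't matter

# Each unordered pair corresponds to 2! ordered pairs.
assert ordered_cards == unordered_cards * 2

p_two_spades = Fraction(comb(13, 2), comb(52, 2))
p_same_suit = Fraction(comb(4, 1) * comb(13, 2), comb(52, 2))

print(ordered_students, ordered_cards, unordered_cards, p_two_spades, p_same_suit)
```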
Exam 1 Wed Sept 16, 2009
Ch 1 & 2
Can bring piece of paper with formulas

Sec 2.4
Diagnostic Test
               Positive        Negative       Total
Has disease    5               3              8
Doesn’t have   7               85             92
Total          12              88             100

Chance of randomly picking one with a disease = 8 / 100
Chance of randomly picking one without a disease = 92 / 100

A = test is positive
B = person has a disease
B|A = person who tested positive has the disease: B given that A has occurred

P(A) = 12 / 100       P(A`) = 88/100
P(B) = 8 / 100        P(B`) = 92 / 100
P(B|A) = 5 / 12 = (5/100) / (12/100) = P(A∩B)/P(A)           conditional probability (note that it != P(B))

P(B`|A) = 7 / 12 = (7/100) / (12/100) = P(A∩B`)/P(A)

P(B|A) & P(B`|A) are the conditional probabilities
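
The same conditional probabilities can be read off the diagnostic-test table programmatically. This is a small sketch; the dictionary layout and the key names ("disease", "healthy", "pos", "neg") are just labels chosen for this example.

```python
from fractions import Fraction

# Cell counts from the diagnostic test table: (disease status, test result).
counts = {
    ("disease", "pos"): 5, ("disease", "neg"): 3,
    ("healthy", "pos"): 7, ("healthy", "neg"): 85,
}
total = sum(counts.values())

def prob(predicate):
    """Probability of the event made of all cells that satisfy the predicate."""
    return Fraction(sum(c for cell, c in counts.items() if predicate(cell)), total)

p_A = prob(lambda cell: cell[1] == "pos")                 # test is positive
p_B = prob(lambda cell: cell[0] == "disease")             # person has the disease
p_A_and_B = prob(lambda cell: cell == ("disease", "pos"))

p_B_given_A = p_A_and_B / p_A        # P(B|A) = P(A∩B) / P(A)
p_notB_given_A = 1 - p_B_given_A

print(p_A, p_B, p_B_given_A, p_notB_given_A)
```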

Definition of the conditional probability:
The conditional probability of an event B given that event A has occurred is defined by P(B|A) = P(A∩B)/P(A),
where P(A) > 0

Conditional probability = joint probability / marginal probability
Multiplication principle
P(A∩B) = P(B|A) * P(A)
P(A∩B) = P(A|B) * P(B)

P(A|A) = 1
Addition property of union: P(AUB|C) = P(A|C) + P(B|C) – P(A∩B|C)

Example Draw 3 cards from a deck of 52
A = 1st card is king of spades
B = 2nd card is queen of diamonds
C = 3rd card is jack of clubs
P(A∩B∩C) = P(A) * P(B|A) * P(C|A∩B) = 1/52 * 1/51 * 1/50

In general: P(A1∩A2∩…∩An) = P(A1) * P(A2|A1) * P(A3|A1∩A2) * P(A4|A1∩A2∩A3) * … * P(An|A1∩A2∩…∩An-1)
Ex 51 P 75
Box 1 6 red balls, 4 green balls
Box 2 7 red balls, 3 green balls
1 ball selected at random from box 1, and put in box 2
Then 1 ball selected at random from box 2, and put in box 1
R1 = red ball from box 1
R2 = red from box 2
G1 = green ball from box 1
G2 = green ball from box 2
    a) P(R1 ∩ R2) = P(R1) * P(R2 | R1) = 6/10 * 8/11
    b) P(R1 ∩ R2) + P(G1∩ G2)
        P(R1) P(R2|R1) + P(G1) P(G2|G1)
        6/10 * 8/11 + 4/10 * 4/11 = 64/110


Ex 61 P 75

Mutually exclusive and exhaustive events together take up all possible outcomes in a given sample space

Consider events A1, A2, A3, …, An that satisfy the following
Take any 2 and intersect them; they have no outcomes in common (mutually exclusive)
(i = 1 to n) ∑P(Ai) = 1 (exhaustive)

Let B be an arbitrary event that overlaps some or several of these events

Then P(B) = (i = 1 to n) ∑P(Ai)*P(B|Ai)     (law of total probability)

Bayes’ theorem
P(Aj|B) = P(Aj)P(B|Aj) / (for i = 1 to n) ∑P(Ai)P(B|Ai)

P(Aj|B), j = 1, 2, 3, … n: posterior probabilities
P(Aj), j = 1, 2, 3, … n: prior probabilities

If P(B|A) = P(B), or equivalently P(A∩B) = P(A) * P(B), then A and B are independent. In that case:
    1) A & B’ are independent
    2) A’ & B are independent
    3) A’ & B’ are independent

Ex 46, 48, 55, 56, 60, 64, 70, 72, 78, 80, 84
9-14-09
If P(B|A) = P(B), or equivalently P(A∩B) = P(A) * P(B), then A and B are independent. In that case:
    1) A & B’ are independent
    2) A’ & B are independent
    3) A’ & B’ are independent

Result 1 follows because P(A∩B’) = P(A) * P(B’|A)
                                 = P(A)[1-P(B|A)]
                                 = P(A)[1-P(B)]
                                 = P(A)P(B’)


25 total seams, assumed independent
P(seam not working) = .2

P(exactly one seam not working)
= 25 * .2 * (.8^24) = .0236
25 possible positions for the non-working seam (probability .2), and the other 24 seams work (probability .8 each)
9-21-09

Chapter 3
Exam 2 stuff starts here
Random Variable – A rule that assigns a numerical value, one and only one value, to each outcome in the
sample space.

Discrete Random Variable – A random variable which takes on a finite number of values or a countably infinite
number of values

Continuous Random Variable – A random variable whose values fill an interval, or a union of two or more
intervals, e.g. between 0 and 1 there are infinitely many decimals

Example 1: A coin is tossed two times.
S = {HH, HT, TH, TT}
X = number of heads obtained
Then X = {2 if HH, 1 if HT, 1 if TH, 0 if TT}

X is a random variable and takes the values 0, 1, 2

Example 2: Items are selected until a defective one is found.
S = {D, ND, NND, NNND, …}
Let Y be the number of items tested. Then Y is a discrete random variable with its values as {1,2,3,…}

In example 1, we have the following probabilities associated with the values of X

x      outcomes        P[X=x] (probability distribution)
0      TT              .25
1      HT, TH          .5
2      HH              .25

Def: For a discrete random variable X, we define its probability distribution function by p(x) = P[X = x]
= P({s in S : X(s) = x}), where x is a value in R, the set of possible values of X. The function p(x), x in R, is
called the probability distribution function or the probability mass function (pmf)

Probability histogram
As in the case of relative frequency histograms, this utilizes the probability as the height of a rectangle.

[Probability histogram for Example 1: rectangles centered at x = 0, 1, 2 with heights .25, .5, .25]
Ex 11 p 98

X = # of cylinders
a) the pmf of x is
   x    4      6             8
p(x) .45       .40           .15

b)
[Line graph: vertical lines of height .45, .40, .15 at x = 4, 6, 8]
[Probability histogram: rectangles of height .45, .40, .15 centered at x = 4, 6, 8]

c) probability of at least a 6 cylinder car
    = P[X >= 6]
    = P[X=6] + P[X=8]
    = p(6) + p(8)
    = .40 + .15
    = .55

Cumulative distribution function (CDF)
For a discrete random variable X with pmf p(x), x in R
The cumulative distribution function is defined by
        F(x) = P[X <= x] = ∑ P[X = y] for y <= x
        F(x) = ∑ p(y) for y <= x, where x is any real number

For Example 1, at the values of X the cdf is F(0) = ¼, F(1) = ¾, F(2) = 1

For every real x: F(x) = { 0 if x < 0, ¼ if 0 <= x < 1, ¾ if 1 <= x < 2, 1 if x >= 2}

Graph is a step function: F(x) is 0 for x < 0, jumps to ¼ at x = 0, to ¾ at x = 1, and to 1 at x = 2, and is flat between the jumps
Find the pmf of a discrete random variable X if its cdf is F(x), x in R
For a value x of X
p(x) = P[X=x] = P[X <= x] – P[X < x]
     = F(x) – F(x-)          x- = limit from below x (x itself not included)
p(0) = ¼ - 0 = ¼
p(1) = ¾ - ¼ = ½
p(2) = 1 – ¾ = ¼
p(x) = 0 if x != 0, 1, 2, e.g. p(1/2) = F(1/2) – F(1/2-) = ¼ - ¼ = 0


problems: 2, 4, 8, 10, 12, 16, 18, 24
Test answers sort of
x avg = 1.97
s = 1.33

mean 4.8
uf 6
lf 3.9

3
B part few got
Range = 50.6 – 45.1 = 5.5
Class width = 5.5 / 5 = 1.1, too tight so make it 5.6/5 = 1.12 or 6/6 = 1

4
a) 8!
b) 2(4!) or 2^4(4!)
c) 5(4!)(4!)

5
p(a) = .7 p(b) = .4 p(aUb) = .8
a) p(a∩b) = p(a) + p(b) – p(aUb) = .7 + .4 - .8 = .3
b) p((aUb)’) = p(a’∩b’) = 1 - p(aUb) = .2

6
p(T) = .25
p(fail) = 1 - P(did not fail)
        = 1 - (.99)(.97)(.98)(.99)
        = .068
.25(.068) = .017

P(F2 U F3) = P(F2) + P(F3) – P(F2∩F3)
      = .03 + .02 – (.03)(.02) = .0494

P(T|F) = P(T∩F)/P(F)

7)     A = cancer                             P(A) = .05
       B = diagnosed as having cancer         P(B|A) = .78
                                              P(B|A’) = .06
a) P(B) = P(B∩A) + P(B∩A’)
       = P(A)P(B|A) + P(A’) P(B|A’)
       = .05*.78 + .95*.06 = .096

b) P(A|B) = P(A)P(B|A)/P(B) = .05 * .78 / .096 = .406
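
The two steps in problem 7 (total probability, then Bayes' theorem) generalize to any list of mutually exclusive, exhaustive events. A minimal sketch, where the helper names `total_probability` and `bayes` are invented for this example:

```python
def total_probability(priors, likelihoods):
    """P(B) = sum of P(Ai) * P(B|Ai) over a partition A1..An."""
    return sum(p * l for p, l in zip(priors, likelihoods))

def bayes(j, priors, likelihoods):
    """Posterior P(Aj|B) for the j-th event of the partition (0-based index)."""
    return priors[j] * likelihoods[j] / total_probability(priors, likelihoods)

# Partition: A = cancer, A' = no cancer.  B = diagnosed as having cancer.
priors = [0.05, 0.95]        # P(A), P(A')
likelihoods = [0.78, 0.06]   # P(B|A), P(B|A')

p_B = total_probability(priors, likelihoods)
p_A_given_B = bayes(0, priors, likelihoods)
print(round(p_B, 3), round(p_A_given_B, 3))
```

Even with a fairly accurate test, the posterior stays well below one half because the prior P(A) = .05 is small.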
9/23/09
Expectation
Let X be a discrete random variable with pmf p(x), x in R. The mean of the probability distribution of
X is called the expectation or expected value of X, given by E[X] = ∑ x*p(x) over all x

If h(x) is a real-valued function of X, then E[h(X)] = ∑ h(x)p(x)

Notation: μ = E[X]
       σ^2 = E[(X - μ)^2] = V(X)

Properties:
E[aX + b] = ∑ (ax+b)*p(x)
            = a ∑ x*p(x) + b ∑ p(x)
            = aE[X] + b     (since ∑ p(x) = 1)

Variance
σ^2 = E[(X-μ)^2] = ∑ (x-μ)^2 p(x) = ∑ (x^2 – 2μx + μ^2) p(x)
        = ∑ x^2 p(x) – 2μ ∑ x*p(x) + μ^2 ∑ p(x)
        = E[X^2] – 2μ * μ + μ^2
        = E[X^2] – μ^2

Standard deviation = σ = sqrt(E[X^2] – μ^2)


Ex 29 p 106
Pmf =
x     0     1          2       3      4
P(x) .08    .15        .45     .27    .05

E[X] = 0*.08 + 1*.15 + 2*.45 + 3*.27 + 4*.05 = 2.06
σx^2 = E[X^2] – (2.06)^2
= 0^2 * .08 + 1^2 * .15 + 2^2 * .45 + 3^2 * .27 + 4^2 * .05 – (2.06)^2
= 5.18 – 4.2436
= .9364
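
The expectation and variance in Ex 29 are weighted sums over the pmf, which is easy to mirror in code. A small sketch, storing the pmf as a dictionary from value to probability (a representation chosen for this example):

```python
import math

pmf = {0: 0.08, 1: 0.15, 2: 0.45, 3: 0.27, 4: 0.05}
assert abs(sum(pmf.values()) - 1) < 1e-12   # a pmf must sum to 1

mean = sum(x * p for x, p in pmf.items())        # E[X]
ex2 = sum(x ** 2 * p for x, p in pmf.items())    # E[X^2]
variance = ex2 - mean ** 2                       # shortcut: E[X^2] - (E[X])^2
sd = math.sqrt(variance)

# Same variance from the definition E[(X - mu)^2].
variance_def = sum((x - mean) ** 2 * p for x, p in pmf.items())

print(round(mean, 2), round(ex2, 2), round(variance, 4), round(sd, 3))
```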

Problems: 30, 32, 42
9/28/09
#35 p107
PMF of x
x      1       2       3       4      5       6
P(x) 1/15      2/15    3/15    4/15   3/15    2/15

Purchase Cost = $1
Selling Price = $2
x = demand that is met
h(x) = 2x is revenue

If 3 copies are ordered, then the pmf of the demand that is met is
x       1       2      3
p(x)    1/15    2/15   12/15

E[h(x)] = ∑ h(x)p(x) = 2[1(1/15) + 2(2/15) + 3(12/15)] = 82/15     (expected revenue)
Expected profit = 82/15 – 3 = 37/15

If 4 copies are ordered, then the pmf of the demand that is met is

x      1       2       3       4
p(x)   1/15    2/15    3/15    9/15

E[h(x)] = ∑ h(x)p(x) = 2[1(1/15) + 2(2/15) + 3(3/15) + 4(9/15)] = 100/15 = 20/3
Expected profit = 20/3 – 4 = 8/3 = 40/15

Ordering 4 gives a higher expected profit than ordering 3 (40/15 vs 37/15)
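
The comparison in #35 can be written as one function of the order size: the demand that is met is min(demand, copies ordered), revenue is $2 per copy sold, and cost is $1 per copy ordered. This sketch assumes, as the calculation above does, that unsold copies are worth nothing; the function name `expected_profit` is made up for the example.

```python
from fractions import Fraction

# Demand pmf from the problem: P(X = x) for x = 1..6, in fifteenths.
demand_pmf = {x: Fraction(w, 15) for x, w in zip(range(1, 7), [1, 2, 3, 4, 3, 2])}
COST, PRICE = 1, 2

def expected_profit(order):
    """Expected revenue on the demand that is met, minus the purchase cost."""
    expected_revenue = sum(PRICE * min(x, order) * p for x, p in demand_pmf.items())
    return expected_revenue - COST * order

print(expected_profit(3), expected_profit(4))
```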
If h(x) = ax + b then
E[h(X)] = aE[X] + b
μ = E[X]

Variance(h(X)) = E[(aX + b – aμ – b)^2] = a^2 σx^2
b cancels, a^2 is factored out

σ^2 = E[X^2] – μ^2 = E[X^2] – (E[X])^2

standard deviation = sqrt(a^2 σx^2) = |a|σx

Bernoulli Experiment
1) Has two mutually exclusive and exhaustive outcomes called “success” and “failure”
2) P(Success) = p and P(Failure) = 1-p

Define x = {1 if success, 0 if failure}
Then x has the Bernoulli distribution, whose pmf is p(x) = p^x * (1-p)^(1-x), x = 0, 1

Binomial Experiment
It consists of having the following
1) A Bernoulli experiment is repeatedly performed a fixed number of times (n)
2) The Bernoulli trials are independent
3) The probability of success remains constant for each trial

Define x to be the number of successes in n Bernoulli trials
Then the pmf of x is given by:
p(x) = nCx * p^x * (1-p)^(n-x), x = 0, 1, 2, …, n
A random variable X with this pmf is said to have a binomial distribution

Trial:     1       2       3       4       5       6       …       n
Outcome:   S/F     S/F     S/F     S/F     S/F     S/F             S/F

For X = x, there are successes in x trials and failures in the other n-x trials; there are nCx ways to choose which trials succeed.
P[X=x] = nCx * p^x * (1-p)^(n-x)


#49 p114
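a)
n = 6
p = .10
x = number of successes = 1
P[X = 1] = 6C1 * (.1)^1 * (1-.1)^(6-1) = 6(.1)(.59049) = .3543

b)
n = 6
p = .10
P[X >= 2] = 1 - P[X <= 1]
   = 1 - P[X=0] – P[X=1]
   = 1 - 6C0 * (.1)^0 * (1-.1)^(6-0) – (answer a) = 1 - .5314 - .3543 = .1143

c) P(at most 5 must be selected to get four that are not seconds)
 = P(first 4 are all not seconds) + P(exactly 1 second among the first 4, and the 5th is not a second)
 = (.9)^4 + 4C1 * (.1)(.9)^3 * (.9) = .6561 + .2624 = .9185
Notation in the book:
X ~ Bin(n, p)
F(x) = P[X <= x] = ∑ p(y) for y = 0 to x = ∑ (nCy) p^y (1-p)^(n-y) for y = 0 to x
Notation: pmf b(x; n, p)
          cdf B(x; n, p) = ∑ b(y; n, p) for y = 0 to x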

Appendix table A.1

#50 p114
a)
n = 25
p = .25
P[X <= 6] = B(6; 25, .25) = .561

b)
P[X = 6] = P[X <= 6] – P[X <= 5] = .561 - .378 = .183

c)
P[X >= 6] = 1 – P[X <= 5] = 1 - .378 = .622

d)
P[X > 6] = 1 – P[X <= 6] = 1 – .561 = .439
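
When the table in Appendix A.1 is not at hand, b(x; n, p) and B(x; n, p) can be computed directly from their definitions. A minimal sketch for #50 (the function names `binom_pmf` and `binom_cdf` are chosen for this example; results agree with the table to three decimals):

```python
from math import comb

def binom_pmf(x, n, p):
    """b(x; n, p) = nCx * p^x * (1-p)^(n-x)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def binom_cdf(x, n, p):
    """B(x; n, p) = sum of b(y; n, p) for y = 0..x."""
    return sum(binom_pmf(y, n, p) for y in range(x + 1))

n, p = 25, 0.25
at_most_6 = binom_cdf(6, n, p)                          # a) P[X <= 6]
exactly_6 = binom_cdf(6, n, p) - binom_cdf(5, n, p)     # b) P[X = 6]
at_least_6 = 1 - binom_cdf(5, n, p)                     # c) P[X >= 6]
more_than_6 = 1 - binom_cdf(6, n, p)                    # d) P[X > 6]

print(round(at_most_6, 3), round(exactly_6, 3),
      round(at_least_6, 3), round(more_than_6, 3))
```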

mean = μ = np
variance = σ^2 = np(1-p)


Problems 46, 48, 56, 60
9/30/09

#65 p115
n = 100
X = # who use debit card
p(B) = .2

a)
so X ~Bin(100, .2)
μ = E[X] = n* p = 100 * .2 = 20
σ2 = Var(x) = np(1-p) = 100(.2)(.8) = 16
σ = sqrt(16) = 4

Y = # who didn’t use cash
Y ~Bin(100, .7)
μ = E[Y] = n * p = 100 * .7 = 70
σ2 = Var(Y) = 100(.7)(.3) = 21
σ = sqrt(21) = 4.58




Negative binomial distribution
Consider the following experiment:
   1) Consider a sequence of independent trials
   2) Each trial has 2 mutually exclusive and exhaustive outcomes called success and failure
   3) The probability of success remains constant from one trial to another

   Let X = the number of failures until the rth success occurs
   Note that X is a random variable with possible values 0, 1, 2, 3, … (no upper limit)
   Find P[X=x]

   The rth success must occur on trial x+r, so the first x+r-1 trials contain x failures and r-1 successes

   P[X=x] = (x+r-1)C(r-1) * p^(r-1) * (1-p)^x * p

   X has a negative binomial distribution if its pmf is given by “nb” (negative binomial),
   nb(x; r, p) = (x+r-1)C(r-1) * p^r * (1-p)^x , x = 0,1,2,3,…

   Y = # of trials until the rth success
   Note that Y = x+r for x = 0,1,2,3,…
   And so x = y – r
   *** P[Y=y] = (y-1)C(r-1) * p^r * (1-p)^(y-r)   y = r, r+1, r+2, …
Example
What is the probability that the second debit card payer is the 10th customer at the gas station?
Y = # of trials until the 2nd success
y = 10, r = 2, p = .2
P[Y=y] = (y-1)C(r-1) * p^r * (1-p)^(y-r)
P[Y=10] = 9C1 * (.2)^2 * (.8)^8 = .0604
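
The two forms of the negative binomial pmf (counting failures X, or counting trials Y = X + r) describe the same event, which the sketch below checks numerically for the debit card example. The function names are chosen for this example only.

```python
from math import comb

def nb_failures_pmf(x, r, p):
    """nb(x; r, p): probability of exactly x failures before the r-th success."""
    return comb(x + r - 1, r - 1) * p ** r * (1 - p) ** x

def nb_trials_pmf(y, r, p):
    """P[Y = y]: probability that the r-th success occurs on trial y (y >= r)."""
    return comb(y - 1, r - 1) * p ** r * (1 - p) ** (y - r)

r, p = 2, 0.2
p_tenth = nb_trials_pmf(10, r, p)          # 2nd debit card payer is the 10th customer
same_event = nb_failures_pmf(10 - r, r, p)  # 8 failures before the 2nd success

print(round(p_tenth, 4))
```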


Hypergeometric distribution
Consider the following experiment:
   1) population consists of a finite number N of items
   2) a subset of n items is selected at random from N items using sampling without replacement
   3) The N items consist of m items of type I, and N-m of type II

N = total
m = type 1 in N
N-m = type 2 in N
n = number selected
x = type 1 in n
n-x = type 2 in n

Let X be the number of type I items among the n items sampled
Then P[X=x] = mCx * (N-m)C(n-x) / NCn            x = max(0, n-(N-m)), … , min(n, m)

A random variable X has a hypergeometric distribution if its pmf is given by
*** h(x; N, m, n) = mCx * (N-m)C(n-x) / NCn


#69 p 120

12 have problems: 7 have compressor problems, 5 don’t
1st try: probability 7/12 that it’s a busted compressor
2nd try: 7/11 or 6/11 (depending on the 1st result) if you don’t put the first one back into the set
2nd try: 7/12 if you do put it back into the set

N = 12
m = 7
n = 6
of the first 6 selected, 5 have compressor problems
    a) P[X = 5] = 7C5 * 5C1 / 12C6 = 105/924 = 5/44
    b) P[X <= 4] = ∑ mCx * (N-m)C(n-x) / NCn   for x = 1 to 4
                 = 1 – P[X=5] – P[X=6] = 1 – 105/924 – 7/924 = 812/924 = .879

    E[X] = n * m/N             Var(X) = ((N-n)/(N-1)) * n * (m/N) * (1 - m/N)
    c)
           a. μ = 6 * 7/12 = 3.5 and
           b. σ^2 = (6/11) * 6 * (7/12)(5/12) = .795
           c. σ = .892

    P[X – μ > 1(σ)] = P[X > 4.39] = P[X >= 5] = 1 – P[X <= 4] = 112/924 = .121
When N is large relative to n, then h(x; N, m, n) ≈ b(x; n, p) where p = m/N
Where b(x; n, p) = nCx * p^x * (1-p)^(n-x)

p = m/N = 40/400 = .1
n = 15
X is approximately Bin(15, .1)
P[X <= 5] ≈ B(5; 15, .1) = .998 from table A.1.c p 664
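
The hypergeometric calculations in #69 are exact ratios of binomial coefficients, so they can be checked with exact fractions. This is a sketch under the setup above (N = 12 units, m = 7 with compressor problems, n = 6 sampled without replacement); the function name `hypergeom_pmf` and its argument order (x, N, m, n) are chosen for this example.

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(x, N, m, n):
    """P[X = x]: x type-I items in a sample of n drawn without replacement."""
    return Fraction(comb(m, x) * comb(N - m, n - x), comb(N, n))

N, m, n = 12, 7, 6
support = range(max(0, n - (N - m)), min(n, m) + 1)   # x = 1..6 here

p_5 = hypergeom_pmf(5, N, m, n)
p_at_most_4 = sum(hypergeom_pmf(x, N, m, n) for x in support if x <= 4)

mean = sum(x * hypergeom_pmf(x, N, m, n) for x in support)
variance = sum(x ** 2 * hypergeom_pmf(x, N, m, n) for x in support) - mean ** 2

print(p_5, p_at_most_4, mean, variance, float(variance) ** 0.5)
```

Computing the mean and variance from the pmf gives the same values as the closed-form formulas n*m/N and ((N-n)/(N-1)) * n * (m/N) * (1 - m/N).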



If X is Negative Binomial NB(r, p) then
        E[X] = r(1-p)/p
        Var(X) = r(1-p)/p^2

Special case:
If r = 1, then X has a geometric distribution with pmf given by p(x) = p * (1-p)^x, x = 0, 1, 2, 3, 4, …
         E[X] = (1-p)/p
         Var(X) = (1-p)/p^2


Homework 46, 48, 56, 60, 68, 70, 72, 74, 75, 76
10/5/09

Poisson Distribution
A discrete random variable X has a Poisson probability distribution with parameter λ > 0 if its pmf is
P[X=x] = p(x; λ) = e^(-λ) * λ^x / x!,   x = 0, 1, 2, 3 …

∑ p(x; λ) for x = 0 to ∞ = e^(-λ) * e^λ = 1


If X is Bin(n, p), with n large and p small so that np = λ, then
        b(x; n, p)
        = nCx * p^x * (1-p)^(n-x)
        ≈ e^(-λ) λ^x / x!
        = p(x; λ)

Ex 3.40 p 122
Shows that the exact probabilities are close to those obtained using the Poisson approximation when n is large
and p is small

Moreover, for a Poisson X, E[X] = λ       V(X) = λ
Note that if X is Bin(n, p) then E[X] = np = λ     V(X) = np(1-p) ≈ λ when p is small

Poisson Process

Ex 79 p 125      Using Table A.2 P 667
λ=5
a) P[X <= 8] = ∑ p(x; 5) for x = 0 to 8 = F(8; 5) = .932
b) P[X = 8] = F(8; 5) – F(7; 5) = .932 - .867 = .065
c) P[X >= 9] = 1 – F(8; 5) = 1 - .932 = .068
d) P[5 <= X <= 8] = F(8; 5) – F(4; 5) = .932 - .440 = .492
e) P[5 < X < 8] = F(7; 5) – F(5; 5) = .867 - .616 = .251
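
The table lookups in Ex 79 can be reproduced from the Poisson pmf itself. A minimal sketch (function names chosen for this example); the values agree with Table A.2 to three decimals:

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """p(x; lam) = e^(-lam) * lam^x / x!"""
    return exp(-lam) * lam ** x / factorial(x)

def poisson_cdf(x, lam):
    """F(x; lam) = sum of p(y; lam) for y = 0..x."""
    return sum(poisson_pmf(y, lam) for y in range(x + 1))

lam = 5
a = poisson_cdf(8, lam)                          # P[X <= 8]
b = poisson_cdf(8, lam) - poisson_cdf(7, lam)    # P[X = 8]
c = 1 - poisson_cdf(8, lam)                      # P[X >= 9]
d = poisson_cdf(8, lam) - poisson_cdf(4, lam)    # P[5 <= X <= 8]
e = poisson_cdf(7, lam) - poisson_cdf(5, lam)    # P[5 < X < 8]

print([round(v, 3) for v in (a, b, c, d, e)])
```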


Unit time – the time when an event can occur only once
Ex 83 p 125
1/200 people have the gene, total sample = 1000 people
α = rate per unit; then αt = λ is the mean count for a period of length t
α = rate = 1/200 per person
λ = 1000(1/200) = 5
a) P[5 <= X <= 8] = F(8; 5) – F(4; 5) = .932 - .440 = 0.492
b) P[X >= 8] = 1 – F(7; 5) = 1 - .867 = 0.133

Homework 80, 82, 88
Sec 4.1 Probability Density Functions
Def: A random variable X of continuous type has a probability density function (pdf) f(x), where x
takes values in some interval a < x < b (e.g. 0 < x < 1), if
1) f(x) >= 0 for a < x < b    and f(x) = 0 otherwise
2) ∫f(x)dx = 1 from –infinity to infinity
3) if a < c < d < b             a--------c--------d---------b
        Then P[c < X < d] can be found by ∫f(x)dx from c to d

[Figure: pdf curve over (a, b); the area under the curve between c and d is P[c < X < d]]

∫f(x)dx from –inf to inf = ∫f(x)dx from a to b = 1

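The cumulative distribution function (cdf)
F(x) = P[X <= x] = ∫f(y)dy from –inf to x

Curve = f(x), Area = F(x)

[Figure: pdf curve over (a, b); the area under the curve to the left of x is F(x)]

F(x) = 0                      x < a
F(x) = ∫f(y)dy from a to x    a <= x <= b
F(x) = 1                      x > b

P[c <= X < d] = P[c < X <= d] = P[c < X < d]: for a continuous X, P[X = c] = 0, so including or excluding the endpoints has no effect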

Ex 1, P 135

X = Time the book is checked out
pdf: f(x) = x/2 , 0 <= x <= 2
            0 otherwise
then its cdf is F(x) = ∫y/2 dy from 0 to x       0 <= x <= 2
                F(x) = y^2/4 | 0 to x = x^2/4    note that F(0) = 0 and F(2) = 1
                          {0, x < 0
               F(x) =     {x^2/4, 0 <= x <= 2
                          {1, x > 2
a) P[X <= 1] = F(1) = 1^2/4 = ¼ = .25
b) P[1/2 <= X <= 3/2] = F(3/2) – F(1/2) = (3/2)^2 / 4 – (1/2)^2 / 4 = 9/16 – 1/16 = 8/16 = ½ = .5
c) P[X >= 3/2] = 1 – P[X <= 3/2] = 1 – 9/16 = 7/16
Definition: the 100p-th percentile η(p) of the distribution of a random variable X with pdf f(x) is given by:

∫f(x)dx from –inf to η(p) = p
F(η(p)) = p          (1)

For a given p where 0 <= p <= 1, solve equation (1) for η(p)
Ex1 P 135
F(x) = x^2/4, 0 <= x <= 2
F(η(p)) = η(p)^2 / 4
Setting η(p)^2 / 4 = p in equation (1), we have η(p)^2 = 4p, so η(p) = 2sqrt(p)
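
For the checkout-time example (pdf f(x) = x/2 on 0 <= x <= 2), the cdf and the percentile function are inverses of each other on that interval. The sketch below encodes both and checks the probabilities from the example; the function names `cdf` and `percentile` are chosen for this illustration.

```python
from math import sqrt

def cdf(x):
    """F(x) for the pdf f(x) = x/2 on [0, 2]."""
    if x < 0:
        return 0.0
    if x > 2:
        return 1.0
    return x ** 2 / 4

def percentile(p):
    """eta(p): solves F(eta) = p, i.e. eta = 2*sqrt(p), for 0 <= p <= 1."""
    return 2 * sqrt(p)

p_a = cdf(1)                 # P[X <= 1]
p_b = cdf(1.5) - cdf(0.5)    # P[1/2 <= X <= 3/2]
p_c = 1 - cdf(1.5)           # P[X >= 3/2]
median = percentile(0.5)     # 50th percentile

print(p_a, p_b, p_c, round(median, 3))
```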


Homework
P125 80, 82, 88
P135 2, 6, 8
P142 12, 14, 20


Expected values of X

If X is a discrete random variable with pmf p(x) then
         μ = E[X] = ∑ x*p(x) over all values x
         σ^2 = E[(X – μ)^2] = E[X^2] – (E[X])^2

If X is a continuous random variable with pdf f(x) on (a, b) then
        μ = E[X] = ∫ x*f(x) dx from a to b
        σ^2 = E[(X – μ)^2] = E[X^2] – (E[X])^2


Uniform Distribution
A random variable X has a uniform distribution over the interval (A, B) if its pdf is given by
f(x) = 1/(B-A) where A < x < B, and 0 otherwise

[Figure: pdf is a rectangle of height 1/(B-A) over the interval from A to B]

cdf
F(x) = P[X <= x]
= ∫ f(y)dy from A to x, for A <= x <= B
= ∫ 1/(B-A) dy from A to x
= y/(B-A) | A to x
= (x-A) / (B-A)
μ = E[X] = ∫x*f(x)dx from A to B = ∫x*(1/(B-A))dx
       = x^2/(2(B-A)) | A to B
       = (B^2 – A^2) / (2(B-A))
       = (B+A)(B-A)/(2(B-A))
       = (A+B)/2

σ^2    = E[X^2] – μ^2, where E[X^2] = ∫x^2 f(x) dx from A to B
       = x^3 / (3(B-A)) | A to B  –  μ^2
       = (B^3 – A^3) / (3(B-A)) – μ^2
       = (B-A)(A^2 + AB + B^2)/(3(B-A)) – μ^2
       = (1/3)(A^2 + AB + B^2) – ((A+B)/2)^2
       = (B-A)^2/12



Special Case
X has uniform distribution over the interval (0, 1).
Notation X is U(0,1) or       X~U(0,1)
Then the pdf of x is f(x) = 1 for 0 < x < 1 and 0 otherwise

The cdf is F(x) = { 0 if x <= 0, x if 0 < x < 1, and 1 if x >= 1}
For 0 < x < 1, the probability P[X <= x] is the value x itself
Practice test stuff:
1) Chapter 3 Section 3.3

      a) 1 = ∑ p(x) for x = 1 to 4 = ∑ x/c for x = 1 to 4 = 1/c + 2/c + 3/c + 4/c = 10/c, so c = 10
      so p(x) = x/10 where x = 1, 2, 3, 4

      b) F(x) = P[X <= x] = ∑ p(y) for y = 1 to x
         F(x) =     {0, x < 1
                    {.1, 1 <= x < 2
                    {.3, 2 <= x < 3
                    {.6, 3 <= x < 4
                    {1, 4 <= x

      P[2 < X < 4] = P[X = 3] = F(3) - F(2) = .6 - .3 = .3

      c) μx = ∑ x*p(x) over all x values
            = 1(.1) + 2(.2) + 3(.3) + 4(.4) = 30/10 = 3
         μy = 2 * μx + 3 = 2 * 3 + 3 = 9     (for Y = 2X + 3)

2)
      a) X has a binomial distribution with n = 15, p = .2: X ~ Bin(15, .2)
      b) E[X] = np = 15 * .2 = 3     V(X) = n*p*(1-p) = 15 * .2 * .8 = 2.4
      c) P[X >= 1] = 1 - P[X <= 0] = 1 - .035 = 0.965
      d) P[2 < X <= 8] = P[X <= 8] – P[X <= 2] = .999 - .398 = 0.601


3) P117
      a)
      X = number of defective transistors

      Population: N = 25 transistors, 5 defective and 20 non-defective
      Sample: n = 5 selected, x defective and 5-x non-defective
      h(x; sample size, successes in population, population size)
      h(x; 5, 5, 25) = 5Cx * 20C(5-x) / 25C5
        b)
               x = number of customers arriving
               x has a Poisson distribution
               with λ = 10
               P[X > 10] = 1 – P[X <= 10] = 1 - .583 = .417


4)
        a) determine P[X <= ½]
        ∫2x dx from 0 to ½
        = 2x^2/2 | 0 to ½
        = (1/2)^2 – 0
        = ¼

     c) median m: F(m) = .5
        F(m) = ∫2x dx from 0 to m
               m^2 = .5
               m = sqrt(.5) = .707


        d) μ = ∫x * 2x dx = ∫2x^2 dx = 2x^3/3 | 0 to 1 = 2/3
        σ^2 = E[X^2] – (E[X])^2 = 1/2 – 4/9 = 1/18


5)
        a)
        X has a negative binomial distribution with p = .5 and r = 3
        P[X <= 2] = ∑ (x+2)C(2) * p^r * (1-p)^x for x = 0 to 2 = .125(1 + 1.5 + 1.5) = .5

        b) P[X >= 10] = 1 – 10/30 = 2/3
        c) P[X >= 25 | X > 15] = P[X >= 25] / P[X > 15] = (5/30) / (15/30) = 1/3
          P[A|B] = P(A∩B)/P(B)
10/14/09

EXAM 3 STUFF HERE

Normal Distribution
Definition: A continuous random variable X has a normal distribution with parameters μ and σ^2, where
-∞ < μ < ∞ and σ^2 > 0,

if the pdf of X is given by
f(x; μ, σ) = 1/(σ√(2π)) * e^(-(x-μ)^2 / (2σ^2)) , -∞ < x < ∞

max value = 1/(σ√(2π)), attained at x = μ

[Figure: bell-shaped pdf curve; μ is the middle point of the curve]

E[X] = μ, which is the mean (and, by symmetry, also the median)
V[X] = σ^2 = E[(X-μ)^2], which is the variance

If the graph is tall, then σ is small
If the graph is flattened out, then σ is large

Fx(x) = P[X <= x] = ∫ f(y; μ, σ)dy from -∞ to x = ∫ 1/(σ√(2π)) * e^(-(y-μ)^2 / (2σ^2)) dy from -∞ to x

Standard normal distribution
If μ = 0 and σ = 1 then the normally distributed random variable is denoted by Z, with pdf
f(z; 0, 1) = 1/√(2π) * e^(-z^2 / 2)
Its cdf, denoted by Φ, is given by
Φ(z) = P[Z <= z] = ∫ 1/√(2π) * e^(-y^2 / 2) dy from -∞ to z
The values of Φ(z) are given in table A.3 for -3.4 < z < 3.4

Exercise: find Φ(1.25), Φ(1.96), Φ(-1.02)

Φ(1.25) = .8944
Φ(1.96) = .9750
Φ(-1.02) = .1539 = 1- Φ(1.02) = 1 - .8461 = .1539

Φ(-z) = 1-Φ(z)




               zα – the 100(1-α)th percentile of the standard normal distribution, i.e. Φ(zα) = 1-α
Ex #31 p 154
a) α=.0055
       1-α = .9945, then z.0055 = 2.54
b) α=.09
       1-α = .91, then z.09 = 1.34
c) α=.663
       1-α = .337, then z.663 = -0.42

α = .05
       1-α = .95, then z.05 = 1.645




Proposition:
If X has a normal distribution, X ~ N(μ, σ^2), then Z = (X-μ)/σ ~ N(0,1)

Note that E[Z] = (E[X]-μ)/σ = (μ-μ)/σ = 0
μz = 0
σz = (1/σ) * σ = 1

Thus,
Fx(x) = P[X <= x] = P[(X-μ)/σ <= (x-μ)/σ]
                  = P[Z <= (x-μ)/σ]
                  = Φ((x-μ)/σ)




             x -----------------------> (x-μ)/σ
the area to the left of x under the pdf f(x) is the same as the area to the left of (x-μ)/σ under
             the standard normal pdf, i.e. Φ((x-μ)/σ)

#33 pg154
X = force
μ = 15.0
σ = 1.25

a) P[X <= 18] = Φ((x-μ)/σ) = Φ((18-15)/1.25) = Φ(3/1.25) = Φ(2.4) = .9918
b) P[10 <= X <= 12] = Φ((12-15)/1.25) - Φ((10-15)/1.25) = Φ(-2.4) - Φ(-4) = .0082 - 0 = .0082
c) P[13.125 <= X <= 16.875] = Φ((16.875-15)/1.25) - Φ((13.125-15)/1.25) = Φ(1.5) - Φ(-1.5) = .9332 - .0668 = .8664

Recall that the 100(1-α)th percentile of Z is zα.
For X ~ N(μ, σ^2), the 100(1-α)th percentile is given by
xα = μ + zα σ
If α = .025, then 100(1-α) = 97.5, and the 97.5th percentile of Z is z.025 = 1.96, so
x.025 = 15 + 1.96 * 1.25 = 17.45
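
The standardization steps in #33 can be reproduced directly, since a NormalDist object with the given μ and σ does the (x-μ)/σ conversion internally. This is a sketch assuming Python 3.8+; the variable names are illustrative.

```python
from statistics import NormalDist

# X = force, with mu = 15.0 and sigma = 1.25 as in exercise #33
force = NormalDist(mu=15.0, sigma=1.25)

p_a = force.cdf(18)                             # P[X <= 18]
p_b = force.cdf(12) - force.cdf(10)             # P[10 <= X <= 12]
p_c = force.cdf(16.875) - force.cdf(13.125)     # P[13.125 <= X <= 16.875]
x_975 = force.inv_cdf(0.975)                    # 97.5th percentile, mu + z(.025) * sigma

print(round(p_a, 4), round(p_b, 4), round(p_c, 4), round(x_975, 2))
```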
Example
X = IQ score
X ~ N(100, 15^2)
P[X <= 130], using a continuity correction of .5:
P[(X-μ)/σ <= (130.5-100)/15]
= Φ((130.5-100)/15) = Φ(2.033) = .9788

Normal as an approximation of binomial
X ~ Bin(n,p)
μ = np, σ = √(np(1-p))

Then if np and n(1-p) are not small numbers,
np > 10 and n(1-p) > 10, then
P[X <= x] = B(x; n, p) ≈ Φ((x + .5 - np) / √(np(1-p)))
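
To see how close the approximation is, the exact binomial cdf can be summed and compared with the continuity-corrected normal value. This is a sketch only: the values n = 50, p = .4, x = 20 are illustrative and do not come from the notes, and the function names are assumptions.

```python
from math import comb, sqrt
from statistics import NormalDist

def binom_cdf(x, n, p):
    """Exact B(x; n, p) = P[X <= x] for X ~ Bin(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

def binom_cdf_normal(x, n, p):
    """Normal approximation with continuity correction: Φ((x + .5 - np)/√(np(1-p)))."""
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    return NormalDist().cdf((x + 0.5 - mu) / sigma)

n, p, x = 50, 0.4, 20        # illustrative: np = 20 and n(1-p) = 30 both exceed 10
exact = binom_cdf(x, n, p)
approx = binom_cdf_normal(x, n, p)
print(f"exact = {exact:.4f}, approx = {approx:.4f}")
```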

Homework:
28, 30, 34, 52, 54
10/19/09

Exponential Distribution
Definition: a continuous random variable X has an exponential distribution with parameter λ > 0 if its pdf is given by
f(x, λ) =      {
               λe^(-λx), x > 0,
               0 otherwise
               }




μ = E[X] = ∫ x * λe^(-λx) dx from 0 to ∞
μ = 1/λ

σ^2 = E[X^2] - (E[X])^2
σ^2 = 1/λ^2

The cdf of X is
F(x) = λ ∫ e^(-λy) dy from 0 to x
F(x) = 1 - e^(-λx) for x >= 0 (and F(x) = 0 for x < 0)


Memoryless Property
If X has an exponential distribution with λ, then
P[X >= t + t0 | X >= t0] = P[X >= t]

          <------ t ------->
---------|-------------------|--------------------------
        t0                 t + t0

Right-hand side: P[X >= t] = 1 - P[X <= t] = 1 - [1 - e^(-λt)] = e^(-λt)
Left-hand side: P[X >= t + t0] / P[X >= t0] = e^(-λ(t + t0)) / e^(-λt0) = e^(-λt)
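
The memoryless property can be confirmed numerically for any particular λ, t and t0. The values below are illustrative only, and survival is a hypothetical helper name for P[X >= t].

```python
from math import exp, isclose

def survival(t, lam):
    """P[X >= t] = e^(-lam * t) for an exponential X with rate lam, t >= 0."""
    return exp(-lam * t)

lam, t, t0 = 0.5, 2.0, 3.0   # illustrative values, not from the notes

# Conditional probability P[X >= t + t0 | X >= t0] via P(A ∩ B) / P(B)
lhs = survival(t + t0, lam) / survival(t0, lam)
rhs = survival(t, lam)

print(isclose(lhs, rhs))
```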


Chi Square Distribution
Notation: χ^2_ν, where ν = number of degrees of freedom
Sample data: x1, x2, …, xn
Sample average: x¯ = Σxi / n
Sample variance: s^2 = (1/(n-1)) Σ(xi - x¯)^2




 |             |x
 Areas between 0 and x are given in the table on p. 673
Homework: 59, 60, (88, 92, 97) Due 26th
The 100p-th percentile η(p) of the exponential distribution satisfies F(η(p)) = p
p = 1 - e^(-λη(p))
1 - p = e^(-λη(p))
ln(1-p) = -λη(p)
η(p) = -(1/λ) ln(1-p)
10/21/09

Probability Plot
Quantile Plot
Q-Q Plot

Sample Data
-1.91, -1.25, -0.75, -0.53, 0.20, 0.35, 0.72, 0.82, 1.40, 1.56
n = Sample Size = number of observations, then in this example n = 10
Question: Are the sample observations from a N(0,1) (standard normal) ?

The 100p-th percentile η(p) of a distribution with cdf F(x) satisfies the condition F(η(p)) = p




    <-------------|η(p)

Sample quantile or percentile
When the data are ordered, the ith smallest observation is taken as the [100(i-.5)/n]th sample percentile

       Which percentile   From data          From appendix
i      p = (i-.5)/10      Sample percentile  z percentile
1      .05                -1.91              -1.645
2      .15                -1.25              -1.037
3      .25                -0.75              -0.675
4      .35                -0.53              -0.385
5      .45                0.20               -0.126
6      .55                0.35               0.126
7      .65                0.72               0.385
8      .75                0.82               0.675
9      .85                1.40               1.037
10     .95                1.56               1.645

The graph on p172 shows that the data is likely to correspond to a standard normal
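
The pairs plotted in a normal probability plot can be generated as sketched below, assuming Python 3.8+ and statistics.NormalDist.inv_cdf in place of the appendix table. The function name qq_pairs is illustrative, and the computed z percentiles may differ from the tabled ones in the third decimal place.

```python
from statistics import NormalDist

data = [-1.91, -1.25, -0.75, -0.53, 0.20, 0.35, 0.72, 0.82, 1.40, 1.56]

def qq_pairs(sample):
    """Pair the (i - .5)/n standard normal percentile with the ith smallest observation."""
    xs = sorted(sample)
    n = len(xs)
    std = NormalDist()
    return [(std.inv_cdf((i - 0.5) / n), x) for i, x in enumerate(xs, start=1)]

pairs = qq_pairs(data)
for z, x in pairs:
    print(f"{z:7.3f}  {x:6.2f}")
```

If the points (z percentile, observation) fall close to a straight line, normality is plausible; a line close to y = x points to the standard normal specifically.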

         |
         |from data
         |
---------|-----------z values


If X is normally distributed N(μ, σ^2), then its 100p-th percentile is μ + σ * (100p-th percentile of Z),
so plotting sample percentiles against z percentiles gives a line with μ = y intercept, σ = slope
If X has an exponential distribution with parameter λ, then F(x) = 1 - e^(-λx)

Then the percentile of the distribution is obtained from  1 - e^(-λη(p)) = p
                                                       1 - p = e^(-λη(p))
                                                     ln(1-p) = -λη(p)
                                                     η(p) = -(ln(1-p))/λ
Consider λ = 1; then η(p) = -ln(1-p)
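
The percentile formula just derived is the inverse of the exponential cdf, which a short round-trip check makes concrete. The function names below are illustrative assumptions.

```python
from math import exp, log

def exp_cdf(x, lam=1.0):
    """F(x) = 1 - e^(-lam * x) for x >= 0, and 0 otherwise."""
    return 1.0 - exp(-lam * x) if x >= 0 else 0.0

def exp_percentile(p, lam=1.0):
    """eta(p) = -ln(1 - p) / lam, the 100p-th percentile (0 <= p < 1)."""
    return -log(1.0 - p) / lam

median = exp_percentile(0.5)               # with lam = 1 this is ln 2
round_trip = exp_cdf(exp_percentile(0.9, lam=2.0), lam=2.0)
print(median, round_trip)
```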

Exercise #87 p178

Homework: P178, 88, 92, 97 Due 28th

Chapter 5
Section 5.3 – Statistics and their Distributions
A random sample is a set of n independent and identically distributed random variables where the
common distribution is that of the population that is being sampled

Statistics – statistics are functions of the random sample observations

Notation:
X1 X2 X3 X4 … Xn represents a random sample
If x1, x2, x3, … xn are the observations, then these observations constitute a sample

A statistic is T = h(X1, X2, …, Xn), which is itself a random variable
For an observed sample, its computed value is t = h(x1, x2, …, xn)

Exercise #41 p212
X = # of packages shipped
pmf
x      1     2     3      4
p(x) .4      .3    .2     .1

μ = Σ x*p(x) = 2.0
σ^2 = E[X^2] - μ^2 = Σ x^2*p(x) - μ^2 = 5.0 - 4.0 = 1

The joint pmf of X1 and X2 is obtained as follows:
P[X1 = x1 and X2 = x2]
= P[X1 = x1] * P[X2 = x2]   (by independence)
                     X1
             1       2    3       4
       1     .16     .12  .08     .04
X2     2     .12     .09  .06     .03
       3     .08     .06  .04     .02
       4     .04     .03  .02     .01

       X¯ = (X1 + X2)/2; its pmf is obtained from

       (x1, x2)     x¯ = (x1+x2)/2     p(x1, x2)
       (1,1)        1.0                .16
       (1,2)        1.5                .12
       (1,3)        2.0                .08
       (1,4)        2.5                .04
       (2,1)        1.5                .12
       (2,2)        2.0                .09
       (2,3)        2.5                .06
       (2,4)        3.0                .03
       (3,1)        2.0                .08
       (3,2)        2.5                .06
       (3,3)        3.0                .04
       (3,4)        3.5                .02
       (4,1)        2.5                .04
       (4,2)        3.0                .03
       (4,3)        3.5                .02
       (4,4)        4.0                .01

      Thus, the pmf of X¯ is

      x¯      p(x¯)
      1.0     .16
      1.5     .24
      2.0     .25
      2.5     .20
      3.0     .10
      3.5     .04
      4.0     .01
      Total   1.0

      P[X¯ <= 2.5] = .85

      Homework: 37, 38, 39 Due 28th

      10/26/09

      μx¯ = Σ x¯*p(x¯) = 2.0
      σx¯^2 = Σ x¯^2*p(x¯) - (Σ x¯*p(x¯))^2 = 4.5 - 4.0 = .5

      μx¯ = μ
      σx¯2 = σ2/n
      σx¯ = σ/sqrt(n)
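
The sampling distribution of X¯ for Exercise #41 can be built by enumerating every possible sample, which also confirms μx¯ = μ and σx¯^2 = σ^2/n exactly. This is a sketch using exact fractions; the function names are illustrative.

```python
from fractions import Fraction
from itertools import product

# pmf of X = number of packages shipped (Exercise #41)
pmf = {1: Fraction(4, 10), 2: Fraction(3, 10), 3: Fraction(2, 10), 4: Fraction(1, 10)}

def pmf_of_mean(pmf, n):
    """Exact pmf of the mean of n independent draws from pmf."""
    out = {}
    for values in product(pmf, repeat=n):
        prob = Fraction(1)
        for v in values:
            prob *= pmf[v]            # independence: multiply the marginals
        xbar = Fraction(sum(values), n)
        out[xbar] = out.get(xbar, 0) + prob
    return out

def mean_var(dist):
    """Return (mean, variance) of a discrete distribution given as {value: prob}."""
    mu = sum(x * p for x, p in dist.items())
    var = sum(x * x * p for x, p in dist.items()) - mu**2
    return mu, var

xbar_pmf = pmf_of_mean(pmf, 2)
mu, var = mean_var(pmf)
mu_xbar, var_xbar = mean_var(xbar_pmf)

for x in sorted(xbar_pmf):
    print(float(x), float(xbar_pmf[x]))
print(mu, var, mu_xbar, var_xbar)
```

Changing the second argument of pmf_of_mean shows how the spread shrinks as n grows while the mean stays at μ.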

       Sampling Distribution of X¯
If X1, X2, …, Xn constitute a random sample from a population with mean μ and variance σ^2,
then the following holds true:

If X¯ = (X1 + X2 + … + Xn)/n, then

μx¯ = E[X¯] = μ
σx¯^2 = σ^2/n and σx¯ = σ/sqrt(n)

If T = nX¯ = X1 + X2 + … + Xn,
then μT = E[T] = nμ
σT^2 = nσ^2     σT = sqrt(n)σ
Example:
Consider a random sample of size 4 from a population with μ = 5, σ2 = 16, then the sampling
distribution of X¯ has mean μX¯ = μ = 5, and variance σX¯2 = σ2/n = 16/4 = 4, and σX¯ = 4/2 = 2

The sampling distribution of the mean is more compact (less spread out) than the original
distribution; moreover, it is smoother than the original distribution

Normal Distribution Theory
Let X1, X2, …, Xn be a random sample from N(μ, σ^2) (normally distributed with mean μ and variance σ^2)

Define X¯ = (1/n) Σ Xi and T = Σ Xi
Then μx¯ = μ, σx¯^2 = σ^2/n and μT = nμ, σT^2 = nσ^2

X¯ ~ N(μ, σ^2/n) and T ~ N(nμ, nσ^2)

With X¯, the larger the sample size, the smaller the variance (the graph is more peaked and compact), but in
all cases the distribution is centered at the same point, μ

With T, the larger the sample size, the more spread out the graph will be.

#53, P218
μ = 50, σ = 1.2
X = hardness of pin

X ~ N(50, (1.2)2)
n=9

X¯ ~ N(50, (1.2)2/9)
If X ~ N(μ, σ^2), then P(X <= x) = P[Z <= (x-μ)/σ] = Φ((x-μ)/σ)

a) P(X¯ >= 51) = 1 - P(X¯ <= 51) = 1 - Φ((51 - 50)/(1.2/3))
= 1 - Φ(2.5) = 1 - .9938 = .0062
P(X¯ <= 49) = Φ((49 - 50)/(1.2/3)) = Φ(-2.5) = .0062

For a single pin, P[X > 51] = 1 - Φ((51-50)/1.2) = 1 - Φ(.83) = .20

b) (population no longer assumed to be normal)
n = 40        μx¯ = 50        σx¯^2 = (1.2)^2/40

Central Limit Theorem
If n is large, then the sampling distribution of X¯ is approximately N(μ, σ^2/n)
If the population is normally distributed, then X¯ is exactly N(μ, σ^2/n) for any n
The approximation is reasonably good even for a moderate sample size unless the population
distribution is highly skewed; if it is highly skewed, n needs to be more than 30

P(X¯ > 51) ≈ 1 - Φ((51-50)/(1.2/sqrt(40))) = 1 - Φ(5.27) ≈ 1 - 1 ≈ 0
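
A small simulation gives a feel for the Central Limit Theorem. This is only a sketch: it assumes a skewed exponential population with λ = 1 (so μ = 1 and σ = 1), which is not a case worked in the notes, and the seed, sample size and number of repetitions are arbitrary choices.

```python
import random
from statistics import fmean, pstdev

random.seed(0)   # fixed seed so the sketch is repeatable

def sample_means(n, reps):
    """Draw reps samples of size n from Exponential(1) and return their means."""
    return [fmean([random.expovariate(1.0) for _ in range(n)]) for _ in range(reps)]

means = sample_means(n=40, reps=5000)
center = fmean(means)     # should be near mu = 1
spread = pstdev(means)    # should be near sigma/sqrt(n) = 1/sqrt(40), about 0.158

print(f"mean of sample means = {center:.3f}, sd of sample means = {spread:.3f}")
```

A histogram of means would look roughly bell shaped even though the population itself is strongly skewed.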
Normal Approximation of a Discrete Probability Distribution
Suppose X is an integer-valued random variable with mean μ and variance σ^2.
Then, P[X = k] = P[k - ½ <= X <= k + ½] ≈ Φ((k + ½ - μ)/σ) - Φ((k - ½ - μ)/σ)
      P[X <= k] ≈ Φ((k + ½ - μ)/σ)

#55 p 218
μ = λ = 50
σ^2 = λ = 50
a) P[35 <= X <= 70] ≈ Φ((70.5 - 50)/sqrt(50)) - Φ((34.5 - 50)/sqrt(50)) = Φ(2.90) - Φ(-2.19) = .9981 - .0143 = .9838
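
For #55 the continuity-corrected normal value can be compared with the exact Poisson sum. This is a sketch; poisson_pmf is an illustrative helper that works on the log scale to avoid overflow in 50^k and k!.

```python
from math import exp, lgamma, log, sqrt
from statistics import NormalDist

def poisson_pmf(k, lam):
    """P[X = k] for X ~ Poisson(lam), computed as exp(k ln(lam) - lam - ln(k!))."""
    return exp(k * log(lam) - lam - lgamma(k + 1))

lam = 50
exact = sum(poisson_pmf(k, lam) for k in range(35, 71))   # P[35 <= X <= 70]

std = NormalDist()
approx = std.cdf((70.5 - lam) / sqrt(lam)) - std.cdf((34.5 - lam) / sqrt(lam))

print(f"exact = {exact:.4f}, normal approximation = {approx:.4f}")
```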

Homework: 46, 47, 50, 52, 56
10/28/09

#56
n = 1000, p = .1
μ = np = 100
σ^2 = np(1-p) = 100(.9) = 90

b)
P(|X1 – X2| <= 50)
X1 is B(1000, .1)
X2 is B(1000, .1)

So if Y = X1 – X2, then
μy = 100 – 100 = 0
σy2 = 90 + 90 = 180

Since X1 and X2 are each approximately normally distributed (and independent), Y = X1 - X2 is approximately normal,
N(0, 180)
P(|X1 - X2| <= 50) = P[ |((X1 - X2) - 0)/√180| <= (50 - 0)/√180 ]
= P[ |Z| <= 3.73 ]
= Φ(3.73) - Φ(-3.73)
≈ 1 - 0
= 1




Difference or Sum of two independent random variables
Let X1 and X2 be two independent random variables with respective means and variances
μ1 = E(X1)      μ2 = E(X2)
σ1^2 = V(X1)    σ2^2 = V(X2)

Then E[X1 +/- X2] = μ1 +/- μ2
And V[X1 +/- X2] = σ1^2 + σ2^2 (always the sum)


Linear Combination of Normally Distributed Independent Random Variables
Let X1, X2, …, Xn be n independent normal random variables with respective means and
variances μi = E[Xi] and σi^2 = V[Xi]
If Y = Σ ai Xi, then
μy = Σ ai μi
σy^2 = Σ (ai^2)(σi^2)

Y is normally distributed N(μy, σy^2)
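
The rule for a linear combination translates into a few lines of code. The sketch below assumes Python 3.8+; linear_combination is an illustrative name, and the usage example reuses the difference Y = X1 - X2 from #56 b), treating each Xi as approximately N(100, 90).

```python
from math import sqrt
from statistics import NormalDist

def linear_combination(coeffs, means, variances):
    """Distribution of Y = sum(a_i * X_i) for independent normal X_i."""
    mu_y = sum(a * m for a, m in zip(coeffs, means))
    var_y = sum(a * a * v for a, v in zip(coeffs, variances))   # squared coefficients: variances always add
    return NormalDist(mu_y, sqrt(var_y))

# Y = X1 - X2 with coefficients (1, -1)
y = linear_combination([1, -1], [100, 100], [90, 90])
prob = y.cdf(50) - y.cdf(-50)    # P(|X1 - X2| <= 50)

print(y.mean, y.variance, round(prob, 4))
```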

Homework: 64, 65, 73 Due Nov 4
Chapter 6

Section 1
Suppose θ is the parameter to be estimated

If x1, x2, …, xn is a set of n sample observations, then a value computed using the observed data
provides an estimate of θ.

Let X1, X2, …, Xn be a random sample; then a statistic used to estimate θ, denoted by θ^, is an estimator of θ.

Note that θ^ is a random variable with its own sampling distribution

This is known as point estimation

P240 ex1
  a) n = 27, Σxi = 219.8. Find an estimate of the population mean μ: x¯ = 219.8/27 = 8.14 is an
      estimate of μ, using the random sample mean X¯ as a point estimator of μ
  b) Find the median: x~ = 7.7 is an estimate of the population median μ~, using the random sample median
  c) s = √(s^2), where s^2 = (1/(27-1)) [Σxi^2 - (Σxi)^2/27] = (1/26) [1860.94 - (219.8)^2/27] = 2.754, so s = 1.66
  d) let p = P[X > 10] then an estimate of p is given by p^ = (# xi > 10)/27 = 4/27
  e) An estimate of σ/μ is s/x¯ = 1.66/8.14 = .204

Homework: 2, 4 Due Nov 4
11/02/09

Estimation of θ
If θ^ is an estimator of θ then θ^ = θ + estimation error = θ + ε
MSE(θ^) = E[(θ^ - θ)^2] = mean square error (ideally kept small for all values of θ)

Estimation Criteria
    1) unbiased estimator: θ^ is an unbiased estimator of θ if E[θ^] = θ (the sampling distribution of θ^
       is centered at θ)

bias = E[θ^] - θ = E[θ^ - θ]

MSE(θ^) = E[(θ^ - θ)^2] = Var(θ^) + (bias(θ^))^2

If θ^ is unbiased then MSE(θ^) = Var(θ^)

An unbiased estimator which has the minimum variance among all unbiased estimators of θ is called the
minimum variance unbiased estimator (MVUE) of θ


Proposition: Let X1, X2, …, Xn be a random sample of size n from a population with mean μ and variance σ^2;
then X¯ = (1/n) Σ Xi is an unbiased estimator of μ, and
S^2 = (1/(n-1)) Σ (Xi - X¯)^2 is an unbiased estimator of σ^2
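
The effect of dividing by n-1 rather than n can be checked exactly on a small discrete population. The sketch below borrows the package pmf from Exercise #41 earlier in these notes (population variance 1) and enumerates all samples of size 3; the function name and the choice n = 3 are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

# Population pmf from Exercise #41: mean 2, variance 1
pmf = {1: Fraction(4, 10), 2: Fraction(3, 10), 3: Fraction(2, 10), 4: Fraction(1, 10)}

def expected_variance_estimator(pmf, n, divisor):
    """Exact E[ sum((X_i - Xbar)^2) / divisor ] over all samples of size n."""
    total = Fraction(0)
    for values in product(pmf, repeat=n):
        prob = Fraction(1)
        for v in values:
            prob *= pmf[v]
        xbar = Fraction(sum(values), n)
        ss = sum((v - xbar) ** 2 for v in values)
        total += prob * ss / divisor
    return total

n = 3
e_unbiased = expected_variance_estimator(pmf, n, n - 1)   # divisor n - 1
e_biased = expected_variance_estimator(pmf, n, n)         # divisor n

print(e_unbiased, e_biased)
```

With divisor n-1 the expectation equals σ^2 exactly, while divisor n gives (n-1)/n times σ^2, i.e. a negative bias.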


Corollary if X1 X2 … Xn is a random sample from N(μ, σ2), then X¯ is MVUE of μ

Estimation of proportion p
p^ = (1/n) Σ Xi, where Xi = 1 if success and 0 if failure, so p^ = (1/n)(# of successes); p^ is an
unbiased estimator of p.
When n is large, T = Σ Xi is approximately N(np, np(1-p)), and in this case p^ = T/n is close to being an MVUE
of p

When n is large, p^ is approximately N(p, p(1-p)/n)

Note that E[p^] = p
Var(p^) = np(1-p)/n^2 = p(1-p)/n

For an estimator θ^ of θ, its standard deviation σθ^ = √Var(θ^) is called the standard error of θ^, denoted
by σθ^ or SE(θ^)

Example:
Parameter      Point Estimator     SE                 Estimated SE
μ              X¯                  σ/√n               s/√n
p              p^                  √(p(1-p)/n)        √(p^(1-p^)/n)
σ^2            S^2                 ?                  ?

With point estimator, calculate the estimated standard error

Homework 9, 12, 13 Due Nov 11
11/04/09

#4

Var(X¯ - Y¯) = Var(X¯) + Var(Y¯) = σ1^2/n1 + σ2^2/n2
SE(X¯ - Y¯) = √(σ1^2/n1 + σ2^2/n2)
σ^ = estimated SE(X¯ - Y¯) = √(S1^2/n1 + S2^2/n2)
S^2 = (1/(n-1)) [ Σxi^2 - (Σxi)^2/n ]

θ = (μ1 - μ2) and θ^ = X¯ - Y¯




N(μ, σ2)

Estimate μ?
Point Estimator = X¯ (random sample mean)

σx¯ = SE(X¯) = √(σ^2/n) = σ/√n

X¯ has N(μ, σ^2/n)

SE represents the precision of the estimate: the smaller the SE, the more precise the estimate

P(-Zα/2 <= (X¯ - μ)/(σ/√n) <= Zα/2) = 1-α
Then
P(X¯ - Zα/2 σ/√n <= μ <= X¯ + Zα/2 σ/√n) = 1-α
The random interval (X¯ - Zα/2 σ/√n , X¯ + Zα/2 σ/√n) contains μ with probability (1-α)

Interval Estimation
Given an observed sample of size n, {x1, x2, …, xn}, from N(μ, σ^2), a 100(1-α)% confidence interval for μ is
given by the interval (x¯ - Zα/2 σ/√n, x¯ + Zα/2 σ/√n), where x¯ = (1/n) Σ xi.
Here (1-α) is called the confidence level. Once the interval is computed, μ either lies in it or does not
(probability 1 or 0, not 1-α); 1-α describes the long-run proportion of such intervals that contain μ.


Example 7.1
σ = 2.0, n = 31
x¯ = 80, 1-α = .95

A 95% CI for μ is given by 80 +/- 1.96(2.0)/√31 = 80 +/- .7 = (79.3, 80.7)

CI width = 2Zα/2 σ/√n

Find the sample size n so that the confidence interval CI is of a specified width
Solution: w = 2Zα/2 σ/√n; solving for n gives n = (2Zα/2 σ/w)^2

If in Ex 7.1 w = 2, then n = (2(1.96)(2)/2)^2 = ((1.96)(2))^2 = 15.3664, so round up to n = 16
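
Both calculations for Example 7.1 (the interval and the sample size for a target width) can be sketched as follows, assuming Python 3.8+ and a known σ. The function names and the conf argument are illustrative choices.

```python
from math import ceil, sqrt
from statistics import NormalDist

def z_interval(xbar, sigma, n, conf=0.95):
    """100*conf % CI for mu with known sigma: xbar +/- z(alpha/2) * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    half = z * sigma / sqrt(n)
    return xbar - half, xbar + half

def sample_size_for_width(sigma, width, conf=0.95):
    """Smallest n whose CI width 2 * z * sigma / sqrt(n) is at most width."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return ceil((2 * z * sigma / width) ** 2)

lo, hi = z_interval(xbar=80, sigma=2.0, n=31)
n_needed = sample_size_for_width(sigma=2.0, width=2.0)

print(round(lo, 1), round(hi, 1), n_needed)
```

The sample size is rounded up because n must be an integer and a smaller n would give a wider interval.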

If X1, X2, …, Xn constitutes a random sample from a population with mean μ and variance σ^2, then by the CLT,
X¯ is approximately normal, N(μ, σ^2/n), provided n is large, and
(X¯ - μ)/(σ/√n) is approximately N(0, 1) provided n is large

Recall that S is a biased estimator of σ, but the bias becomes negligible when the sample size is large.
When n is large, (X¯ - μ)/(S/√n) is approximately N(0, 1)
An approximate 100(1-α)% confidence interval for μ when σ is unknown is given by x¯ +/- Zα/2 s/√n


Confidence interval for proportion p
Recall p^ = # of successes / sample size = y/n,
where y = # of "successes"
σp^ = √(p(1-p)/n)

If n is large, then p^ is approximately normal, N(p, p(1-p)/n)
An approximate 100(1-α)% CI for p is p^ +/- Zα/2 √(p^(1-p^)/n)
(7.11)

Sample size determination
w = 2Zα/2 √(p^(1-p^)/n)

n = (2Zα/2 √(p^(1-p^)) / w)^2

n = 4(Zα/2)^2 p^(1-p^)/w^2

  <= (Zα/2)^2/w^2, since p(1-p) <= ¼
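
The large-sample interval for p and the worst-case sample size bound can be sketched as below, assuming Python 3.8+. The counts 62 out of 100 are illustrative only, and the function names are assumptions.

```python
from math import ceil, sqrt
from statistics import NormalDist

def proportion_interval(successes, n, conf=0.95):
    """Approximate CI for p: p_hat +/- z(alpha/2) * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    half = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half, p_hat + half

def max_sample_size(width, conf=0.95):
    """Worst-case n from the bound z^2 / w^2, which uses p(1 - p) <= 1/4."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return ceil(z * z / width**2)

lo, hi = proportion_interval(successes=62, n=100)   # illustrative counts
n_bound = max_sample_size(width=0.10)

print(round(lo, 4), round(hi, 4), n_bound)
```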

Homework Due Nov 11, Ch7 2, 4, 6, 18, 20, 24
11/09/09

TEST QUESTION

Ex 13 p268
n = 50
x¯ = 654.16
s = 164.43

Recall:
   1) A 100(1-α)% CI for μ is x¯ +/- Zα/2 σ/√n,
      provided σ is known and the population distribution is normal

   2) If n is large, then an approximate 100(1-α)% CI for μ is x¯ +/- Zα/2 s/√n,
      assuming σ is unknown

Find 95% CI for μ

a) An approximate 95% CI for μ = 654.16 +/- 1.96(164.43/(√50)) = (608.58, 699.74)

b) σ = 175, 1-α = .95 so Zα/2 = 1.96 w = 50
   n = (2Zα/2 σ/w)^2 = (2(1.96)(175)/50)^2 = 188.24, so n = 189


Ex 17 p269
n = 153, x¯ = 135.39, s = 4.59, Zα = Z.01 = 2.33
99% lower confidence bound = 135.39 - 2.33(4.59/√153) = 134.53
99% upper confidence bound = 135.39 + 2.33(4.59/√153) = 136.25

Two-sided 99% CI, Zα/2 = Z.005 = 2.575
135.39 +/- 2.575(4.59/√153) = (134.43, 136.35)



Ex 25 p269
1-α = .95
α = .05 α/2 = .025
Zα/2 = 1.96
w = .10

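a) (p0 unknown, so use the worst case p0 = ½)
n = (2Zα/2/w)^2 * p0(1-p0) = 4(Zα/2)^2 p0(1-p0)/w^2
n <= (Zα/2)^2/w^2 = (1.96)^2/(.10)^2 = 384.16, so n = 385
(The value 381 results from the more exact sample-size formula based on the score interval for p.)

b) (p0 = 2/3)
n = 4(1.96)^2 (2/3)(1/3)/(.10)^2 = 341.5, so n = 342
(The more exact formula gives 339.)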



page 148 table 4.1
Test stuff

1b) xα = μ + σ * zα,   zα = 1.645, μ = 100, σ = 10
    xα = 100 + 10*1.645 = 116.45

1c) n = 4, X¯ has N(μ, σ^2/n)
        i) So X¯ has a normal distribution with μx¯ = 100 and σx¯^2 = 100/4 = 25
        ii) use as with a normal distn as it lies within a certain range


3)
11/16/09

Small Sample Method

Consider a random sample X1, X2, …, Xn of size n from N(μ, σ^2)

Then X¯ = (1/n) Σ Xi has N(μ, σ^2/n)

Z = (X¯ - μ)/(σ/√n) has N(0, 1)

Case of σ^2 unknown

USE T WHEN σ^2 IS UNKNOWN AND THE SAMPLE SIZE IS SMALL

Define T = (X¯ - μ)/(S/√n)
Then T has "Student's t distribution". It has one parameter, called the degrees of freedom, often denoted by the
Greek letter ν (nu). Here ν = n-1
Note: S^2 = (1/(n-1)) Σ (Xi - X¯)^2

The graph of Student's t distribution looks like a normal distribution: it is symmetric, but the tails are heavier
and the peak is lower

tν,α >= Zα

The t distribution is symmetric; moreover, it approaches the standard normal as ν gets larger

#29 p276
a) central area = .95, so the right tail area = .05/2 = .025; df = 10, t = 2.228
e) α = .01, df = 25, tν,α = 2.485

It follows that P(-tν,α/2 <= (X¯ - μ)/(S/√n) <= tν,α/2) = 1-α

P(X¯ - tν,α/2 * S/√n <= μ <= X¯ + tν,α/2 * S/√n) = 1-α

Thus, the probability that the random interval (X¯ - tν,α/2 * S/√n, X¯ + tν,α/2 * S/√n) contains μ is 1-α

A 100(1-α)% CI for μ is x¯ +/- tν,α/2 * s/√n,
where x¯ is the average of the data and s^2 = (1/(n-1)) (Σxi^2 - (Σxi)^2/n)

32) n = 8, x¯ = 30.2, s = 3.1, α = .05 and ν = n-1 = 7
Thus tν,α/2 = 2.365
So a 95% CI for μ is 30.2 +/- 2.365*3.1/√8 = (27.6, 32.8)

A 95% upper confidence limit for μ is 30.2 + 1.895*3.1/√8 = 32.3


Prediction Interval
Suppose X1, X2, …, Xn constitutes a random sample from N(μ, σ^2). Let Xn+1 be the next random observation

The prediction error is (Xn+1 - X¯). Then E[Xn+1 - X¯] = E[Xn+1] - E[X¯] = μ - μ = 0
Var(Xn+1 - X¯) = Var(Xn+1) + Var(X¯) = σ^2 + σ^2/n = σ^2(1 + 1/n)
Then (Xn+1 - X¯)/(S√(1+1/n)) has a t distribution with ν = (n-1) degrees of freedom

P(-tν,α/2 <= (Xn+1 - X¯)/(S√(1+1/n)) <= tν,α/2) = 1-α

This leads to the following prediction interval for a new observation Xn+1:
x¯ +/- tν,α/2 s√(1+1/n), where ν = n-1
So the prediction interval will always be wider than the confidence interval

#32 construct a 95% prediction interval for a new observation
30.2 +/- 2.365(3.1)√(1+1/8) = 30.2 +/- 7.78 = (22.42, 37.98)
Compared to the prediction interval, the confidence interval (27.6, 32.8) is much shorter
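
The two intervals for #32 differ only in the factor multiplying s. The sketch below makes that explicit; because the Python standard library has no t distribution, the critical value 2.365 (7 degrees of freedom, upper-tail area .025) is passed in as read from a t table. The function names are illustrative.

```python
from math import sqrt

def t_confidence_interval(xbar, s, n, t_crit):
    """CI for mu: xbar +/- t_crit * s / sqrt(n)."""
    half = t_crit * s / sqrt(n)
    return xbar - half, xbar + half

def t_prediction_interval(xbar, s, n, t_crit):
    """Prediction interval for one new observation: xbar +/- t_crit * s * sqrt(1 + 1/n)."""
    half = t_crit * s * sqrt(1 + 1 / n)
    return xbar - half, xbar + half

T_CRIT = 2.365   # table value for df = 7 and upper-tail area .025 (assumed input)

ci = t_confidence_interval(30.2, 3.1, 8, T_CRIT)
pred = t_prediction_interval(30.2, 3.1, 8, T_CRIT)

print([round(v, 2) for v in ci], [round(v, 2) for v in pred])
```

Since sqrt(1 + 1/n) is always larger than 1/sqrt(n), the prediction interval is wider than the confidence interval for the same data.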


Confidence Interval for σ2

χ^2 = Σ(Xi - X¯)^2 / σ^2 = (n-1)S^2/σ^2
has a chi-squared distribution with ν = n-1 degrees of freedom




TABLE ON P673
χ^2(ν, α) = critical value with area α to its right, ν degrees of freedom

χ^2(10, .05) = 18.307
χ^2(10, .975) = 3.247

For a 100(1-α)% CI for σ^2, consider the critical values such that P[χ^2 <= χ^2(ν, 1-α/2)] = α/2 and P[χ^2 >= χ^2(ν, α/2)] = α/2

So a 100(1-α)% CI for σ^2 is ( (n-1)s^2/χ^2(ν, α/2) , (n-1)s^2/χ^2(ν, 1-α/2) )

Similarly, a 100(1-α)% CI for σ is ( √( (n-1)s^2/χ^2(ν, α/2) ) , √( (n-1)s^2/χ^2(ν, 1-α/2) ) )



Ex 37 p277   (n = 20, x¯ = .9255, s = .0809, ν = 19)
A 95% CI for μ is .9255 +/- 2.093(.0809/√20)
A 95% prediction interval is .9255 +/- 2.093(.0809)√(1+1/20)
A 95% CI for σ^2 is (19(.0809)^2/32.852, 19(.0809)^2/8.906)
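
The interval for σ^2 in Ex 37 can be evaluated as sketched below. The chi-squared critical values 32.852 and 8.906 are taken from the line above as table inputs, since the standard library has no chi-squared distribution; the function and argument names are illustrative.

```python
from math import sqrt

def variance_interval(s, n, chi2_upper, chi2_lower):
    """CI for sigma^2: ((n-1)s^2 / chi2_upper, (n-1)s^2 / chi2_lower).

    chi2_upper has area alpha/2 to its right; chi2_lower has area 1 - alpha/2 to its right.
    """
    ss = (n - 1) * s**2
    return ss / chi2_upper, ss / chi2_lower

lo, hi = variance_interval(s=0.0809, n=20, chi2_upper=32.852, chi2_lower=8.906)
sd_lo, sd_hi = sqrt(lo), sqrt(hi)    # corresponding interval for sigma

print(f"sigma^2 in ({lo:.5f}, {hi:.5f}); sigma in ({sd_lo:.4f}, {sd_hi:.4f})")
```

Note that the larger critical value produces the lower endpoint, because it appears in the denominator.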


HOMEWORK
Ch7 p276
28, 34, 44, 46
11/23/09

Final Exam Dec 9th Wednesday 1-4pm

Statistical Hypothesis Testing
A claim is made. From this, develop a null hypothesis H0 to be tested against an alternative hypothesis Ha

Specify the test procedure to decide between H0 and Ha in the sense of strong evidence against or in favor of
H0

Hypothesis is about a parameter or parameters of a distribution.

Ex: One may want to test for the proportion p or mean μ or variance σ2 of a distribution
An appropriate test statistic is chosen to carry out the test procedure

        The null hypothesis H0 always includes equality with some numerical value; it may also be stated with
         <= or >=

H0: p <= .10            Ha: p > .10
H0: p = .10             Ha: p != .10


#3 p293
H0: μ = 100 vs Ha: μ > 100
Why not use Ha: μ < 100?

The alternative hypothesis Ha: μ > 100 states that the specification is met, so rejecting H0 gives strong evidence of
conformance. The equivalent null hypothesis would be H0: μ <= 100.


Decision/Test


               Do Not Reject H0                Reject H0
H0 true        ✔                               Type 1 Error        α = P(Type 1 Error)
Ha true        Type 2 Error                    ✔                   β = P(Type 2 Error)

α is often specified. It is called the significance level

Problem: Conduct a test procedure with specified significance level such that β is minimized.

#5
H0: σ >= .05     vs       Ha: σ < .05

#13
Consider a random sample X1, X2, …, Xn from N(μ, σ^2) where σ^2 is known.
Then X¯ = (1/n) Σ Xi has N(μ, σ^2/n)

a)
H0: μ = μ0 vs Ha: μ > μ0

If H0 is true, then X¯ is N(μ0, σ^2/n), and
Z = (X¯ - μ0)/(σ/√n) has N(0, 1)

Find the rejection region for α = .025

Z.025 = 1.96 (look up 1 - .025 = .975 in the Z table)
So reject H0 if Z > 1.96
Or (X¯ - μ0)/(σ/√n) > 1.96
Or X¯ > μ0 + 1.96σ/√n

b) H0: μ = 100 vs Ha: μ > 100, σ = 5, n = 25
       Z = (X¯ - 100)/(5/5) = X¯ - 100
Reject H0 if Z > 2.33, i.e. X¯ > 102.33
If in fact μ = 99, the probability of rejecting H0 is P(X¯ > 102.33) = P(Z > (102.33 - 99)/(5/5)) = P(Z > 3.33)
= 1 - Φ(3.33) = .0004
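
The rejection cutoff for X¯ and the chance of rejecting H0 under a given true mean can be computed as sketched here, assuming Python 3.8+. The function names are illustrative, and the critical value 2.33 is the one used in part b).

```python
from math import sqrt
from statistics import NormalDist

def rejection_cutoff(mu0, sigma, n, z_crit):
    """Upper-tailed test: reject H0 when xbar > mu0 + z_crit * sigma / sqrt(n)."""
    return mu0 + z_crit * sigma / sqrt(n)

def rejection_probability(mu_true, cutoff, sigma, n):
    """P(Xbar > cutoff) when Xbar ~ N(mu_true, sigma^2 / n)."""
    return 1 - NormalDist(mu_true, sigma / sqrt(n)).cdf(cutoff)

cutoff = rejection_cutoff(mu0=100, sigma=5, n=25, z_crit=2.33)
alpha_at_100 = rejection_probability(100, cutoff, sigma=5, n=25)   # significance level
prob_at_99 = rejection_probability(99, cutoff, sigma=5, n=25)      # if the true mean is 99

print(round(cutoff, 2), round(alpha_at_100, 4), round(prob_at_99, 4))
```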


p304
#15
H0: μ = μ0   Ha: μ > μ0         Z >= 1.88             α = 1 - .9699 = .0301
             Ha: μ < μ0         Z <= -2.75            α = .003
             Ha: μ != μ0        Z >= 2.88 or <= -2.88 α = 2 * .002 = .004

Find the rejection region when α = .05
Ha: μ > μ0                   Z > Z.05 = 1.645
Ha: μ < μ0                   Z < -Z.05 = -1.645
Ha: μ != μ0                  Z < -Z.025 = -1.96 or Z > Z.025 = 1.96, i.e. |Z| > 1.96


σ                        Test Statistic
known                    Z = (X¯ - μ0)/(σ/√n)
unknown, big n           Z = (X¯ - μ0)/(s/√n)
unknown, small n         T = (X¯ - μ0)/(s/√n)
proportion, big n        Z = (p^ - p0)/√(p0(1-p0)/n)

Section 8.1
2, 6, 10, 14, 16, 18, 20, 26, 30, 36, 46, 54, 56, 58
11/30/09

Use of the following sequence of steps is recommended when testing hypotheses about a parameter.
1. Identify the parameter of interest and describe it in the context of the problem / situation.
2. Determine the null value and state the null hypothesis.
3. State the appropriate alternative hypothesis.
4. Give the formula for the computed value of the test statistic (substituting the null value and the known values of any
other parameters, but not those of any sample based quantities).
5. State the rejection region for the selected significance level α.
6. Compute any necessary sample quantities, substitute into the formula for the test statistic value, and compute that
value.
7. Decide whether H0 should be rejected and state this conclusion in the problem context.


#23, p305

      1.   parameter of interest is μ
      2.   H0: μ = 6 minutes = 360 seconds (H0 has equality)
      3.   Ha: μ > 360 seconds (Ha is what you don’t want)
      4.   Test statistic t = (x¯ - μ0)/(s/√n) with df ν = n-1 = 25
      5.   For α = .05, the rejection region is given by t > t.05,25 = 1.708
      6.   x¯ = 370.69, s = 24.36, n = 26, so t = (370.69 - 360)/(24.36/√26) = 2.24
      7.   Since t = 2.24 > 1.708, we reject H0

      Because H0 is rejected, the data contradict the prior belief.

#35, p310

      1.   parameter of interest is p (proportion)
      2.   H0: p = .70
      3.   Ha: p != .70
      4.   test statistic (assuming n is large) z = (p^ - p0)/√(p0(1-p0)/n)
      5.   for α = .05, the rejection region is z <= -Z.025 = -1.96 or z >= Z.025 = 1.96
      6.   p^ = 124/200 = .62, so z = (.62 - .70)/√(.70(.30)/200) = -2.47
      7.   Since z <= -1.96, we reject H0
           p-value = 2(1-Φ(2.47)) = 2(.0068) = .0136
           if p-value <= α, reject H0; since .0136 < .05, reject H0
           for α = .01, since .0136 > .01, do not reject H0


The p-value is the probability, computed assuming H0 is true, of obtaining a test statistic value at least as
contradictory to H0 as the calculated value; it is the smallest significance level at which H0 would be rejected.

p-value = {1 - Φ(z) for the right tail test if calculated value is z
          {Φ(z) for the left tail test if calculated value is z
          {2 * [1-Φ(|z|)] for two tailed tests

           (replace Φ with the t distribution cdf when using a t test)

If p-value <= α, reject H0
If p-value > α, keep H0
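
The three p-value cases for a z test can be collected in one helper. The sketch below assumes Python 3.8+; z_p_value and its tail argument are illustrative names, and the usage example reuses the numbers from #35 (p^ = 124/200, p0 = .70).

```python
from math import sqrt
from statistics import NormalDist

def z_p_value(z, tail):
    """p-value of a z test; tail is 'right', 'left' or 'two'."""
    phi = NormalDist().cdf
    if tail == "right":
        return 1 - phi(z)
    if tail == "left":
        return phi(z)
    if tail == "two":
        return 2 * (1 - phi(abs(z)))
    raise ValueError("tail must be 'right', 'left' or 'two'")

# Exercise #35: two-tailed test of H0: p = .70
z35 = (124 / 200 - 0.70) / sqrt(0.70 * 0.30 / 200)
p35 = z_p_value(z35, "two")

reject_at_05 = p35 <= 0.05
reject_at_01 = p35 <= 0.01
print(round(z35, 2), round(p35, 4), reject_at_05, reject_at_01)
```

The same data lead to rejection at α = .05 but not at α = .01, which is why reporting the p-value is more informative than reporting a single decision.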

#49
      a.   df = 8, t = 2.0, therefore .025 < p-value < .05
      b.   df = 11, t = -2.4, therefore .01 < p-value < .025
      c.   df = 15, t = -1.6, therefore .1 < p-value < .20
      d.   df = 19, t = -0.4, therefore

				