									         Statistics 111 - Lecture 7

                 Probability

               Normal Distribution
               and Standardization


June 5, 2008      Stat 111 - Lecture 7 - Normal   1
                           Distribution
               Administrative Notes


• Homework 2 due on Monday




                    Outline


• Law of Large Numbers
• Normal Distribution
• Standardization and Normal Table




           Data versus Random Variables

• Data variables are variables for which we
  actually observe values
    • E.g., heights of students in the Stat 111 class
    • For these data variables, we can directly calculate the statistics
      s² (sample variance) and x̄ (sample mean)

• Random variables are things that we don't
  directly observe, but we still have a probability
  distribution of all possible values
    • E.g., heights of the entire Penn student population




               Law of Large Numbers
• The rest of the course will be about using data
  statistics (x̄ and s²) to estimate parameters of
  random variables (μ and σ²)
• Law of Large Numbers: as the size of our
  data sample increases, the mean x̄ of the
  observed data variable approaches the mean μ
  of the population
• If our sample is large enough, we can be
  confident that our sample mean is a good
  estimate of the population mean!
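The Law of Large Numbers can be illustrated with a small simulation. This is only a sketch: the population values (mean 67, SD 4, loosely "heights in inches") and the helper name `sample_mean` are assumptions made up for the example, not figures from the lecture.

```python
import random
from statistics import fmean

# Hypothetical population, for illustration only: N(67, 4^2).
MU, SIGMA = 67.0, 4.0

def sample_mean(n, seed=0):
    """Draw n values from N(MU, SIGMA^2) and return their sample mean."""
    rng = random.Random(seed)
    return fmean([rng.gauss(MU, SIGMA) for _ in range(n)])

# The sample mean should drift toward MU as n grows.
for n in (10, 100, 10_000, 100_000):
    print(n, round(sample_mean(n), 3))
```

The small samples can land noticeably away from 67, while the largest sample should sit very close to it.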

               The Normal Distribution
• The Normal distribution has the shape of a “bell
  curve” with parameters μ and σ² that determine
  the center and spread:

  [Figure: bell curve centered at μ, with spread σ]

         Different Normal Distributions
• Each different value of μ and σ² gives a
  different Normal distribution, denoted N(μ, σ²)

  [Figure: density curves for N(0,1), N(2,1), N(-1,2), and N(0,2)]




• We can adjust values of μ and σ² to provide
  the best approximation to observed data
• If μ = 0 and σ² = 1, we have the Standard
  Normal distribution
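One way to see how the two parameters move the curve is to evaluate the density at its peak. The sketch below uses Python's `statistics.NormalDist`. The helper `normal` is a made-up convenience: it is needed because `NormalDist` takes the SD, while the N(μ, σ²) notation on the slide gives the variance.

```python
from math import sqrt
from statistics import NormalDist

def normal(mu, variance):
    """Build N(mu, variance); NormalDist expects the SD, so take the square root."""
    return NormalDist(mu, sqrt(variance))

standard = normal(0, 1)   # N(0,1), the Standard Normal
shifted = normal(2, 1)    # N(2,1): same spread, center moved to 2
wider = normal(0, 2)      # N(0,2): same center, larger spread

for name, dist in [("N(0,1)", standard), ("N(2,1)", shifted), ("N(0,2)", wider)]:
    print(name, "peak at", dist.mean, "height", round(dist.pdf(dist.mean), 4))
```

Changing μ slides the peak without changing its height. Increasing σ² flattens and widens the curve.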
      Property of Normal Distributions
• Normal distribution follows the 68-95-99.7 rule:
   • 68% of observations are between μ - σ and μ + σ
   • 95% of observations are between μ - 2σ and μ + 2σ
   • 99.7% of observations are between μ - 3σ and μ + 3σ

  [Figure: bell curve with the intervals μ ± σ and μ ± 2σ marked]
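The 68-95-99.7 rule can be checked numerically. This is a minimal sketch using `statistics.NormalDist`; the function name `within_k_sd` is just an illustrative choice.

```python
from statistics import NormalDist

def within_k_sd(k, mu=0.0, sigma=1.0):
    """P(mu - k*sigma < X < mu + k*sigma) for X ~ N(mu, sigma^2)."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

for k in (1, 2, 3):
    print(k, round(within_k_sd(k), 4))
```

The result does not depend on the particular μ and σ passed in, which is why the rule holds for every Normal distribution.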

                 Calculating Probabilities
• For more general probability calculations, we
  have to do integration

For the standard normal distribution, we have tables of
probabilities already made for us!

If Z follows N(0,1):

P(Z < -1.00) = 0.1587
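To show what "doing the integration" means, here is a rough sketch that approximates P(Z < -1.00) by numerically integrating the standard normal density with the trapezoid rule. The cutoff at -10 and the step count are arbitrary choices for the example, not part of the lecture.

```python
from math import exp, pi, sqrt

def standard_normal_pdf(z):
    """Density of N(0,1) at z."""
    return exp(-z * z / 2) / sqrt(2 * pi)

def prob_less_than(z, lower=-10.0, steps=100_000):
    """Approximate P(Z < z) by the trapezoid rule on [lower, z].

    The area below lower = -10 is negligible, so it is ignored.
    """
    h = (z - lower) / steps
    total = 0.5 * (standard_normal_pdf(lower) + standard_normal_pdf(z))
    for i in range(1, steps):
        total += standard_normal_pdf(lower + i * h)
    return total * h

print(round(prob_less_than(-1.00), 4))
```

The printed value should agree with the table entry to four decimal places.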
                Standard Normal Table
If Z has N(0,1):

P(Z > 1.46)
 = 1 - P(Z < 1.46)
 = 1 - 0.9279
 = 0.0721
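The same table lookup and complement rule can be sketched in code, with `statistics.NormalDist().cdf` standing in for the printed table. The helper name `prob_greater_than` is made up for the example.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, SD 1

def prob_greater_than(z):
    """P(Z > z) via the complement rule 1 - P(Z < z)."""
    return 1.0 - Z.cdf(z)

print(round(Z.cdf(1.46), 4))              # plays the role of the table entry
print(round(prob_greater_than(1.46), 4))  # upper-tail probability
```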




• What if we need to do a probability calculation for
  a non-standard Normal distribution?
                        Standardization
• If we only have a standard normal table, then we
  need to transform our non-standard normal
  distribution into a standard one
   • This process is called standardization




                                                             1




                                                         0


                Standardization Formula
• We convert a non-standard normal distribution
  into a standard normal distribution using a linear
  transformation
• If X has a N(μ, σ²) distribution, then we can
  convert to Z which follows a N(0,1) distribution

                         Z = (X - μ)/σ

• First, subtract the mean μ from X
• Then, divide by the standard deviation σ of X
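The two steps translate directly into a one-line function. In this sketch the name `standardize` and the input values (130, mean 100, SD 15) are hypothetical, chosen only to show the arithmetic.

```python
def standardize(x, mu, sigma):
    """Return z = (x - mu) / sigma for a value x from N(mu, sigma^2)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma

# Hypothetical values chosen only to illustrate the arithmetic.
print(standardize(130.0, mu=100.0, sigma=15.0))
```

A value equal to the mean standardizes to 0, and a value one SD above the mean standardizes to 1.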


   Linear Transformations of Variables
• Sometimes we need to do simple mathematical
  operations on our variables, such as adding and/or
  multiplying by constants

                     Y = a ·X + b

• Example: changing temperature scales
          Fahrenheit = (9/5)·Celsius + 32




• How are means and variances affected?
   Mean/Variances of Linear Transforms
• For transformed variable Y = a·X + b

                mean(Y) = a·mean(X) + b
                  Var(Y) = a²·Var(X)
                  SD(Y) = |a|·SD(X)

• Note that adding a constant b does not affect measures
  of spread (variance and sd)
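These rules can be verified on a small data set using the Celsius-to-Fahrenheit conversion, where a = 9/5 and b = 32. The temperature readings below are made up for illustration.

```python
from statistics import fmean, pvariance, pstdev

celsius = [-5.0, 0.0, 12.5, 20.0, 37.0]  # made-up readings for illustration
a, b = 9 / 5, 32.0
fahrenheit = [a * c + b for c in celsius]

# Compare each statistic of Y = a*X + b with the value the rules predict.
mean_ok = abs(fmean(fahrenheit) - (a * fmean(celsius) + b)) < 1e-9
var_ok = abs(pvariance(fahrenheit) - a**2 * pvariance(celsius)) < 1e-9
sd_ok = abs(pstdev(fahrenheit) - abs(a) * pstdev(celsius)) < 1e-9
print(mean_ok, var_ok, sd_ok)
```

The shift b = 32 appears only in the mean check. The variance and SD checks involve a alone.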




     More complicated linear functions
• We can also do linear transformations involving
  more than one variable:
                   Z = a·X + b·Y + c
• The mean formula is similar:
        mean(Z) = a·mean(X) + b·mean(Y) + c
• If X and Y are also independent, then
             var(Z) = a²·var(X) + b²·var(Y)
• Need more complicated variance formula (in book) if
  the variables are not independent
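A simulation gives a quick sanity check of the independent-variable formulas. Everything specific here is an assumption for the sketch: the distributions of X and Y, the constants a, b, c, the sample size, and the seed.

```python
import random
from statistics import fmean, pvariance

rng = random.Random(1)
n = 50_000
a, b, c = 2.0, -1.0, 5.0

# Hypothetical independent variables: X ~ N(0, 2^2), Y ~ N(1, 3^2).
xs = [rng.gauss(0.0, 2.0) for _ in range(n)]
ys = [rng.gauss(1.0, 3.0) for _ in range(n)]
zs = [a * x + b * y + c for x, y in zip(xs, ys)]

predicted_mean = a * 0.0 + b * 1.0 + c          # 4.0
predicted_var = a**2 * 2.0**2 + b**2 * 3.0**2   # 25.0
print(round(fmean(zs), 2), predicted_mean)
print(round(pvariance(zs), 2), predicted_var)
```

The simulated mean and variance should land close to the predicted values, though not exactly on them, because the sample is finite.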



                Standardization Example
Dear Abby,

  You wrote in your column that a woman is pregnant for
  266 days. Who said so? I carried my baby for 10
  months and 5 days. My husband is in the Navy and it
  could not have been conceived any other time because I
  only saw him once for an hour, and I didn’t see him
  again until the day after the baby was born. I don’t drink
  or run around, and there is no way the baby isn’t his, so
  please print a retraction about the 266-day carrying time
  because I am in a lot of trouble!

                                                      -San Diego Reader

               Standardization Example
• According to well-documented data, gestation
  time follows a normal distribution with mean μ
  of 266 days and SD σ of 16 days
• Let X = gestation time. What percent of
  babies have gestation time greater than 310
  days (10 months & 5 days)?
   • Need to convert X = 310 into standard Z

     Z = (X - μ)/σ = (310 - 266)/16 = 44/16 = 2.75


               Standardization Example
P(X > 310)
 = P(Z > 2.75)
 = 1 - P(Z < 2.75)
 = 1 - 0.9970
 = 0.0030

So, there is only a 0.3% chance of a pregnancy
lasting longer than 310 days!
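The gestation calculation can be reproduced without a table. The sketch below uses `statistics.NormalDist` with the mean of 266 days and SD of 16 days given in the example; the variable names are illustrative.

```python
from statistics import NormalDist

gestation = NormalDist(mu=266, sigma=16)  # mean and SD from the example

z = (310 - gestation.mean) / gestation.stdev   # standardized value of 310 days
p_longer = 1.0 - gestation.cdf(310)            # P(X > 310)
print(z, round(p_longer, 4))
```

Because `NormalDist` works with the non-standard distribution directly, the standardization step is shown only to match the hand calculation.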

               Reverse Standardization
• Sometimes, we need to convert a standard
  normal Z into a non-standard normal X
• Example: what is the length of pregnancy
  below which we have 10% of the population?
     • From table, we see P(Z <-1.28) = 0.10
• Reverse Standardization formula:

                        X = σ·Z + μ
• For Z = -1.28, we calculate
  X = 16·(-1.28) + 266 = 245.52 ≈ 246 days (about 8.2 months)
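Reverse standardization is just the standardization formula solved for X. In this sketch `reverse_standardize` is a made-up helper name, and `NormalDist().inv_cdf` stands in for reading the table backwards.

```python
from statistics import NormalDist

def reverse_standardize(z, mu, sigma):
    """Convert a standard normal value z back to the original scale."""
    return sigma * z + mu

# Using the table value z = -1.28, as in the hand calculation.
print(round(reverse_standardize(-1.28, mu=266, sigma=16), 2))

# Using an unrounded z for the 10th percentile instead of the table value.
z_10 = NormalDist().inv_cdf(0.10)
print(round(z_10, 4), round(reverse_standardize(z_10, mu=266, sigma=16), 1))
```

The two answers differ slightly because the table rounds z to two decimal places.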
               Another Example
• NCAA Division 1 SAT Requirements: athletes
  are required to score at least 820 on combined
  math and verbal SAT
• In 2000, SAT scores were normally distributed
  with mean μ of 1019 and SD σ of 209
• What percentage of students have scores
  greater than 820?

 Z = (X - μ)/σ = (820 - 1019)/209 = -199/209 ≈ -0.95


               Another Example
• P(X > 820) = P(Z > -0.95) = 1 - P(Z < -0.95)




• P(Z < -0.95) = 0.17 so P(X > 820) = 0.83
• 83% of students meet NCAA requirements
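As a check on the SAT example, the same probability can be computed with `statistics.NormalDist`, using the mean of 1019 and SD of 209 stated above. This is a sketch with illustrative variable names, and it uses the unrounded z rather than the table value.

```python
from statistics import NormalDist

sat = NormalDist(mu=1019, sigma=209)  # combined SAT scores, from the example

z = (820 - 1019) / 209          # standardized cutoff
p_meets = 1.0 - sat.cdf(820)    # P(X > 820)
print(round(z, 2), round(p_meets, 2))
```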
               SAT Verbal Scores
• Now, just look at X = Verbal SAT score, which
  is normally distributed with mean μ of 505 and
  SD σ of 110
• What Verbal SAT score will place a student in
  the top 10% of the population?




               SAT Verbal Scores
• From the table, P(Z > 1.28) = 0.10

• Need to reverse standardize to get X:

       X = σ·Z + μ = 110·1.28 + 505 = 645.8 ≈ 646

• So, a student needs a Verbal SAT score
  of 646 in order to be in the top 10% of all
  students
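The top-10% cutoff is the 90th percentile of the Verbal SAT distribution. As a sketch, `statistics.NormalDist.inv_cdf` can find it directly from the mean of 505 and SD of 110 given above, without a separate table lookup.

```python
from statistics import NormalDist

verbal = NormalDist(mu=505, sigma=110)  # Verbal SAT scores, from the example

cutoff = verbal.inv_cdf(0.90)   # score with 90% of students below it
print(round(cutoff))
```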

               Next Class - Lecture 8


• Chapter 5: Sampling Distributions



