Normal distribution
 Zhang Guozhen
School of Public Health
Xinjiang Medical University


Version 1-Q2                  XJMU EPI&MedSTAT
Contents
   Introduction to continuous probability distributions

   Normal distribution and its general features

   The standard normal distribution

   The applications of normal distributions




Section 1--Introduction to
continuous probability
distributions




For a defined population, every random variable
has an associated distribution that gives the
probability of each possible value of that variable
(if it takes a countable number of distinct values)
or of each possible set of values (if the variable
is defined on the real line).




[Figure: frequency histogram of RBC (10^12/L); horizontal axis from 4.0 to 5.8, vertical axis (frequency) from 0 to 30]

[Figure: probability distribution curve]




As the sample size increases and the class intervals
become ever narrower, the outline of the histogram
approaches a smooth, bell-shaped curve. This curve is
the normal distribution.
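The limiting behaviour described above can be sketched numerically. The block below is only an illustration under stated assumptions: it uses a standard normal variable (hypothetical parameters, not the RBC data), replaces the observed relative frequency of each class interval with its large-sample value (the probability of the interval), and uses a helper name, `worst_bar_error`, invented for this sketch.

```python
from statistics import NormalDist

def worst_bar_error(dist, width, span=3.0):
    """Largest gap between idealized histogram bars and the density curve.

    Bars cover [mean - span*sd, mean + span*sd]; each bar's height is the
    probability of its class interval divided by the interval width.
    """
    n_bars = round(2 * span * dist.stdev / width)
    start = dist.mean - span * dist.stdev
    worst = 0.0
    for i in range(n_bars):
        a = start + i * width
        b = a + width
        bar_height = (dist.cdf(b) - dist.cdf(a)) / width
        curve_height = dist.pdf((a + b) / 2.0)
        worst = max(worst, abs(bar_height - curve_height))
    return worst

# Hypothetical parameters, chosen only for illustration.
dist = NormalDist(mu=0.0, sigma=1.0)
errors = [worst_bar_error(dist, w) for w in (1.0, 0.5, 0.25, 0.125)]
print(errors)  # the gap shrinks as the class intervals get narrower
```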




    Distribution Curve

   Because the cumulative relative frequency is 1, the area
    under the normal curve is also 1.

   The characteristics of this family of curves were
    developed by Abraham de Moivre and Karl Friedrich
    Gauss. In fact, this distribution is sometimes called the
    Gaussian distribution.




  Section 2--Normal distribution
and its general features




     The Normal Distribution
   The most widely used continuous distribution.
   Many other distributions that are not themselves normal
    can be made approximately normal by transforming the
    data onto a different scale.

   Generally speaking, any random variable that can be
    expressed as a sum of many other independent random variables
    can be well approximated by a normal distribution.




Many medical measurements approximately follow a normal
  distribution:
 RBC
 WBC
 Hb
 Blood pressure
 Height
 Weight




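   The probability density function of a normal
    random variable has the form

        $f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right), \quad -\infty < x < +\infty$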
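As a rough check on the density formula, the sketch below writes the normal probability density out term by term and compares it with the standard library's `statistics.NormalDist`. The function name `normal_pdf` and the parameter values are assumptions made for illustration only.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x, written out from the formula."""
    coef = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coef * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

# Hypothetical parameters, chosen only for illustration.
mu, sigma = 50.0, 4.0
peak = normal_pdf(mu, mu, sigma)            # the curve is highest at x = mu
left = normal_pdf(mu - 5.0, mu, sigma)      # same height on either side,
right = normal_pdf(mu + 5.0, mu, sigma)     # because of symmetry about mu
library_value = NormalDist(mu, sigma).pdf(mu)  # standard-library cross-check
print(peak, left, right, library_value)
```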




   If X has a normal distribution
       with mean μ
      and variance σ²,
   then we write X ~ N(μ, σ²).




[Figure: normal density curve with the mean μ and the standard deviation σ marked]

The normal distributions are symmetric, single-peaked,
      bell-shaped density curves.



General features of the Normal Distribution

   Symmetric about x=μ
   Peak at x=μ
   Total area under the curve equals 1
   The normal distribution is completely determined by the
    parameters μ (the location parameter) and σ (the scale
    parameter, which controls how spread out the curve is)
        In other words, a different normal distribution is
       specified for each different value of μ and σ.




1   2




           15
16
   The area under the normal curve follows fixed rules.
   For example, 68% of the total area lies between μ-σ and μ+σ.

       Interval                    Area under the curve
       μ-σ     to μ+σ              68.27%
       μ-1.96σ to μ+1.96σ          95.00%
       μ-2.58σ to μ+2.58σ          99.00%
   The above results show that although any normal
    variable may take values anywhere in (-∞, +∞), the
    chance that its value falls in (μ-1.96σ, μ+1.96σ) is
    always 95% and the chance that its value falls in
    (μ-2.58σ, μ+2.58σ) is always 99%.
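These area rules can be checked with a short sketch. It assumes Python's standard-library `statistics.NormalDist`; the helper name `area_within` and the two (μ, σ) pairs are arbitrary choices for illustration.

```python
from statistics import NormalDist

def area_within(mu, sigma, k):
    """Area under N(mu, sigma^2) between mu - k*sigma and mu + k*sigma."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

# Two arbitrary (mu, sigma) pairs: the areas do not depend on them.
for mu, sigma in [(0.0, 1.0), (50.0, 4.0)]:
    for k in (1.0, 1.96, 2.58):
        print(f"mu={mu}, sigma={sigma}, k={k}: {area_within(mu, sigma, k):.4f}")
```

Whatever μ and σ are used, the three areas come out the same, which is the point of the rule.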




Section 3 --The standard
normal distribution




The Standard Normal Distribution


   Suppose X has a normal distribution with mean μ
    and standard deviation σ, denoted X ~ N(μ, σ²).

   Then a new random variable defined as Z = (X-μ)/σ
    has the standard normal distribution, denoted Z ~
    N(0,1).
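A minimal sketch of the standardization step follows. The function name `standardize` and the numbers are hypothetical; the block only illustrates that a probability computed on the original scale matches the one computed on the Z scale.

```python
from statistics import NormalDist

def standardize(x, mu, sigma):
    """Convert a value x from N(mu, sigma^2) to its z-score."""
    return (x - mu) / sigma

# Hypothetical parameters and value, chosen only for illustration.
mu, sigma = 50.0, 4.0
x = 56.0
z = standardize(x, mu, sigma)               # (56 - 50) / 4
p_original = NormalDist(mu, sigma).cdf(x)   # P(X < x) on the original scale
p_standard = NormalDist(0.0, 1.0).cdf(z)    # P(Z < z) on the standard scale
print(z, p_original, p_standard)
```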




All normal random variables can be converted to the
standard normal random variable.

Any value Xi in any normal distribution corresponds
to a value Zi in the standard normal distribution.

[Figure: normal curve with the X axis marked from μ-3σ to μ+3σ, aligned with the standard normal Z axis marked from -3 to +3]




General Features of the Standard Normal
Distribution

   The distribution is centered at 0

   The distribution is bell shaped and symmetric.
    The curve extends to infinity in both directions

   The area under the standard normal curve
    follows fixed rules



   It can be shown that about 68% of the area under the
    standard normal density lies between -1 and +1, about 95%
    of the area lies between -1.96 and +1.96, and about 99%
    lies between -2.58 and +2.58.

   These relationships can be stated more precisely as

        $P(-1 < Z < 1) \approx 0.68, \quad P(-1.96 < Z < 1.96) \approx 0.95,$
        $P(-2.58 < Z < 2.58) \approx 0.99$




  Normal distribution      Standard normal distribution    Area

  μ-σ ~ μ+σ                -1 ~ +1                         68.27%

  μ-1.96σ ~ μ+1.96σ        -1.96 ~ +1.96                   95.00%

  μ-2.58σ ~ μ+2.58σ        -2.58 ~ +2.58                   99.00%




   The area under the curve between any two points of the
    normal distribution is equal to the probability of observing
    a value between those two points.

   How do we get the probability under the standard
    normal distribution curve between any two values Z1
    and Z2?

       The probability of any event involving a normal random
        variable or a standard normal random variable can
        be computed from tables of the standard normal
        distribution.
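For readers without a printed table, a few rows can be regenerated. This is only a sketch, assuming the "area to the left of z" convention and Python's `statistics.NormalDist`; the names `phi` and `table_row` are made up for the example.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(0.0, 1.0)

def phi(z):
    """Area under the standard normal curve to the left of z."""
    return STANDARD_NORMAL.cdf(z)

def table_row(z_first_decimal):
    """One table row: Phi(z) for z = row value + 0.00, 0.01, ..., 0.09."""
    return [round(phi(z_first_decimal + hundredth / 100.0), 4)
            for hundredth in range(10)]

# Print a few rows; columns are the second decimal place of z.
print("z     " + "  ".join(f"{h / 100:.2f}  " for h in range(10)))
for row_z in (0.0, 1.0, 1.9):
    print(f"{row_z:<5} " + "  ".join(f"{p:.4f}" for p in table_row(row_z)))
```

Reading the row for 1.9 under the column for 0.06 gives the area to the left of z = 1.96.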




Normal Table




[Figure: standard normal curve, Z axis with 0.0 and 1.0 marked]

Table 1 gives areas left of z.
This table from a previous
edition gives areas right of z.



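The areas (probabilities) in the table are the
shaded area from negative infinity to z, written Φ(z).

[Figure: standard normal curve (μ = 0, σ = 1) with the area Φ(z) to the left of z shaded]

By symmetry, Φ(-z) = 1 - Φ(z).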



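$P(Z_1 < Z < Z_2) = P(-\infty < Z < Z_2) - P(-\infty < Z < Z_1) = \Phi(Z_2) - \Phi(Z_1)$

[Figure: standard normal curve (σ = 1) with the area between Z1 and Z2 shaded, where Z1 < 0 < Z2]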
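The two identities used with the table, Φ(-z) = 1 - Φ(z) and P(Z1 < Z < Z2) = Φ(Z2) - Φ(Z1), can be sketched as follows. The helper name `prob_between` and the value z = 1 are arbitrary choices for illustration.

```python
from statistics import NormalDist

phi = NormalDist(0.0, 1.0).cdf   # Phi(z): area to the left of z

def prob_between(z1, z2):
    """P(z1 < Z < z2) = Phi(z2) - Phi(z1) for a standard normal Z."""
    return phi(z2) - phi(z1)

z = 1.0                                   # arbitrary illustrative value
print(phi(-z), 1.0 - phi(z))              # the two agree, by symmetry
print(prob_between(-z, z))                # central area between -z and z
```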




Probability Problems

Using symmetry and the fact that the area under the density curve is 1:

P(Z > 1.83) = 0.0336

P(Z < 1.83) = 1 - P(Z > 1.83)
            = 1 - 0.0336 = 0.9664

P(Z < -1.83) = P(Z > 1.83)
             = 0.0336   (by symmetry)

[Figure: standard normal curve with -1.83 and 1.83 marked on the Z axis]


P(-0.6 < Z < 1.83) = P(Z < 1.83) - P(Z < -0.6)
                   = 0.9664 - 0.2743
                   = 0.6921

[Figure: standard normal curve with the area between -0.6 and 1.83 shaded]
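The table look-ups in these problems can be reproduced in a few lines. This is a sketch using the standard library in place of the printed table; the variable names are chosen only for this example.

```python
from statistics import NormalDist

Z = NormalDist(0.0, 1.0)

p_right = 1.0 - Z.cdf(1.83)             # P(Z > 1.83)
p_left = Z.cdf(1.83)                    # P(Z < 1.83)
p_left_tail = Z.cdf(-1.83)              # P(Z < -1.83), equal to p_right
p_between = Z.cdf(1.83) - Z.cdf(-0.6)   # P(-0.6 < Z < 1.83)

for label, p in [("P(Z > 1.83)", p_right),
                 ("P(Z < 1.83)", p_left),
                 ("P(Z < -1.83)", p_left_tail),
                 ("P(-0.6 < Z < 1.83)", p_between)]:
    print(f"{label} = {p:.4f}")
```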




    Calculate P(X < 1.96), P(X < 1),
    P(X < -1.96) and P(-1 < X < 1.5) if X ~ N(0,1).




To obtain the probability (or area) under the normal
 distribution curve between any two specified values
 X1 and X2 on the X axis, first convert them to Z values:

        $Z_1 = \frac{X_1 - \mu}{\sigma}, \qquad Z_2 = \frac{X_2 - \mu}{\sigma}$

 Then

        $P(X_1 < X < X_2) = P(Z_1 < Z < Z_2) = \Phi(Z_2) - \Phi(Z_1)$



Find P(2 < X < 4) when X ~ N(5, 2²), that is, μ = 5 and σ = 2.
The standardization equation for X is:
        Z = (X-μ)/σ = (X-5)/2

when X=2, Z = -3/2 = -1.5
when X=4, Z = -1/2 = -0.5

P(2 < X < 4) = P(X < 4) - P(X < 2)

P(X < 2) = P(Z < -1.5)
         = P(Z > 1.5)   (by symmetry)

P(X < 4) = P(Z < -0.5)
         = P(Z > 0.5)   (by symmetry)

P(2 < X < 4) = P(X < 4) - P(X < 2)
             = P(Z > 0.5) - P(Z > 1.5)
             = 0.3085 - 0.0668 = 0.2417
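The same calculation can be sketched in code, once through the Z values and once directly on the X scale. It assumes σ = 2 (the divisor used in the standardization equation) and Python's `statistics.NormalDist`.

```python
from statistics import NormalDist

X = NormalDist(mu=5.0, sigma=2.0)       # sigma = 2, so the variance is 4
Z = NormalDist()                        # standard normal, N(0, 1)

z_low = (2.0 - X.mean) / X.stdev        # -1.5
z_high = (4.0 - X.mean) / X.stdev       # -0.5

p_via_z = Z.cdf(z_high) - Z.cdf(z_low)  # Phi(-0.5) - Phi(-1.5)
p_direct = X.cdf(4.0) - X.cdf(2.0)      # same area on the original scale
print(round(p_via_z, 4), round(p_direct, 4))
```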




Example 1
 Suppose that the scores on an aptitude test are normally
 distributed with a mean of 100 and a standard deviation of
 10 (some of the original IQ tests were purported to have
 these parameters). What is the probability that a randomly
 selected score is below 90?




Standardizing gives Z = (90 - 100)/10 = -1. The table
entry for z = -1.00 (the intersection of row -1.0 and
column 0.00) is 0.1587, so the probability of a score
less than 90 is 15.87%.
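Where no table is at hand, Φ(z) can be computed from the error function, using the identity Φ(z) = (1 + erf(z/√2))/2. The sketch below applies it to Example 1; the helper name `phi` is an assumption of this example.

```python
import math

def phi(z):
    """Standard normal cumulative probability, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mean_score, sd_score = 100.0, 10.0
z = (90.0 - mean_score) / sd_score      # standardize the cut-off: -1.0
p_below_90 = phi(z)
print(f"z = {z}, P(score < 90) = {p_below_90:.4f}")
```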



Example 2
 What is the probability of a score between 90 and 115?
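The worked solution to Example 2 is not preserved in this text, so the following is only one possible sketch. It reuses the parameters of Example 1 (mean 100, standard deviation 10) and takes the difference of two cumulative probabilities.

```python
from statistics import NormalDist

scores = NormalDist(mu=100.0, sigma=10.0)

z_low = (90.0 - scores.mean) / scores.stdev     # -1.0
z_high = (115.0 - scores.mean) / scores.stdev   #  1.5

# P(90 < X < 115) = Phi(1.5) - Phi(-1.0)
p_between = scores.cdf(115.0) - scores.cdf(90.0)
print(f"z from {z_low} to {z_high}: P = {p_between:.4f}")
```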




Exercise 1
Suppose that diastolic blood pressure X in hypertensive
women is normally distributed with a mean of 100 mmHg
and a standard deviation of 16 mmHg. Find
P(X<90) and P(X>124).




Section 4--The Application of
Normal Distributions




 Estimate the medical reference range
 Estimate the frequency distribution

 Many statistical methods are based on the
  normal distribution, or have the normal
  distribution as their limit




Estimate the Frequency Distribution


Example 3
 Suppose that the scores on CET-4 are normally distributed with
 a mean of 70 and a standard deviation of 6. What is the
 probability that a randomly selected score is larger than 80?
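Example 3 asks for a right-tail area, which is one minus the cumulative probability. The slide's own working is not preserved here, so this is a sketch under the stated parameters (mean 70, standard deviation 6).

```python
from statistics import NormalDist

cet4 = NormalDist(mu=70.0, sigma=6.0)

z = (80.0 - cet4.mean) / cet4.stdev     # about 1.67
p_above_80 = 1.0 - cet4.cdf(80.0)       # right-tail area beyond 80
print(f"z = {z:.2f}, P(score > 80) = {p_above_80:.4f}")
```

A printed table only lists z to two decimals, so a table-based answer may differ slightly in the last digit.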




Estimate the Medical Reference Range

   In medicine, to make an index (a variable in
    statistics) useful, people often measure a large group
    of "normal" people to determine the reference range, or
    normal range, of that index.
   The reference range, also called the "normal range", is the
    range of values found in most normal individuals. It is used
    to define a normal physiological status.
   In addition, when establishing a reference range, you must
    randomly select a large enough sample of "normal" people
    from the population.



   If someone's value is outside this range, then he or
    she becomes suspect and needs closer
    attention.
        The 95% reference range of heights of seven-year-olds:
    (110 cm, 130 cm)
        The 95% reference range of hemoglobin of
    healthy female adults: (100 g/L, 140 g/L)



   How to estimate the reference range

     The normal distribution method
     The percentile method




Determination of the Reference Range

   When the variable follows a normal
    distribution, we use the normal distribution method to
    determine the reference range.
    μ and σ are usually unknown, so when the sample size is
    large enough we can use the sample mean x̄ and the sample
    standard deviation s in their place.

           The 95% reference range:   x̄ ± 1.96s
           The 99% reference range:   x̄ ± 2.58s
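A minimal sketch of the normal distribution method, starting from raw measurements, is given below. The function name `reference_range` is hypothetical, and the sample is synthetic: it is generated only so that the code can run and does not represent real data.

```python
import random
from statistics import mean, stdev

def reference_range(sample, z=1.96):
    """Two-sided range mean +/- z*s (z = 1.96 for 95%, 2.58 for 99%)."""
    x_bar = mean(sample)
    s = stdev(sample)          # sample standard deviation (n - 1 denominator)
    return x_bar - z * s, x_bar + z * s

# Synthetic measurements, generated only so the sketch can run.
rng = random.Random(0)
sample = [rng.gauss(120.0, 10.0) for _ in range(200)]

low95, high95 = reference_range(sample)
low99, high99 = reference_range(sample, z=2.58)
print(f"95%: ({low95:.1f}, {high95:.1f})  99%: ({low99:.1f}, {high99:.1f})")
```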


   When the distribution of the variable is skewed,
    we use the percentile method to determine the
    reference range.

   If only values above a certain value are abnormal, the 95%
    reference range is: < P95
          for example: hair mercury
   If only values below a certain value are abnormal, the 95%
    reference range is: > P5
          for example: respiratory capacity
   If values both below and above certain values are
    abnormal, the 95% reference range is: P2.5 ~ P97.5
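The percentile method can be sketched with the standard library's `statistics.quantiles`. Splitting the data into 40 equal-probability groups yields cut points at 2.5%, 5%, ..., 97.5%, which covers all three cases. The function name `percentile_limits` is hypothetical and the skewed sample is synthetic, generated only for illustration.

```python
import random
from statistics import quantiles

def percentile_limits(sample):
    """Return (P2.5, P5, P95, P97.5) of the sample."""
    cuts = quantiles(sample, n=40)   # 39 cut points: 2.5%, 5%, ..., 97.5%
    return cuts[0], cuts[1], cuts[37], cuts[38]

# Synthetic right-skewed data, generated only so the sketch can run.
rng = random.Random(1)
sample = [rng.lognormvariate(0.0, 0.5) for _ in range(1000)]

p2_5, p5, p95, p97_5 = percentile_limits(sample)
print(f"only high values abnormal:  < {p95:.2f}")
print(f"only low values abnormal:   > {p5:.2f}")
print(f"both tails abnormal:        {p2_5:.2f} ~ {p97_5:.2f}")
```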
Example 4
Suppose the concentration of hemoglobin in 120 healthy
 women is normally distributed with a mean of 117.4 g/L
 and a standard deviation of 10.2 g/L. What is the 95% medical
 reference range of hemoglobin?
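The slide's own solution is not preserved in this text. One way to work Example 4 is sketched below: it applies mean ± 1.96 × SD to the stated summary statistics and also shows where the multiplier 1.96 comes from, namely the 97.5th percentile of the standard normal distribution.

```python
from statistics import NormalDist

mean_hb, sd_hb = 117.4, 10.2            # g/L, as given in Example 4

# The multiplier 1.96 is (to two decimals) the 97.5th percentile of N(0, 1),
# which leaves 2.5% of the area in each tail.
z_975 = NormalDist().inv_cdf(0.975)

half_width = 1.96 * sd_hb
low, high = mean_hb - half_width, mean_hb + half_width
print(f"z(0.975) = {z_975:.4f}")
print(f"95% reference range: ({low:.1f}, {high:.1f}) g/L")
```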




Exercise 2
Suppose the RBC count in 144 healthy men is
normally distributed with a mean of 5.38 × 10^12/L and
a standard deviation of 0.44 × 10^12/L. What is the 95%
medical reference range of RBC?




Summary

   The normal distribution—the most important
    continuous distribution
   Two parameters
   Standard normal distribution
   Reference range




Review questions
   What is a standard normal distribution?
   What is the area to the left of -0.2 under a standard normal
    distribution? What symbol is used to represent this area?
   What is the area to the right of 0.3 under a standard
    normal distribution? What symbol is used to represent this
    area?
   What is Z0.30? What does it mean?
   What is Z0.75? What does it mean?




   Suppose tree diameters of a certain species of tree
    from some defined forest area are assumed to be
    normally distributed with mean 8 in. and standard
    deviation 2 in. Find the probability of a tree
    having an unusually large diameter, which is
    defined as >12 in.




  Solution
  We have X ~ N(8, 4) and require
  P(X > 12) = 1 - P(X < 12) = 1 - P(Z < (12-8)/2)
            = 1 - P(Z < 2.0) = 1 - 0.977 = 0.023
Thus 2.3% of trees from this area have an unusually large
  diameter.




Exercise
Assume the diastolic pressure of healthy high school
students follows a normal distribution with mean 9.3 kPa
and standard deviation 1.3 kPa. What percentage of the students
have diastolic pressures between 8 kPa and
10.6 kPa, higher than 12.7 kPa, and lower than 6.7 kPa,
respectively?



