# 004


```
Normal distribution
Zhang Guozhen
School of Public Health
Xinjiang Medical University

Version 1-Q2                  XJMU EPI&MedSTAT
Contents
   Introduction to continuous probability distributions

   Normal distribution and its general features

   The standard normal distribution

   The applications of normal distributions

Section 1--Introduction to continuous probability distributions

For a defined population, every random variable has an
associated distribution. If the variable takes a countable
number of distinct values, the distribution gives the
probability of each possible value. If the variable is
defined on the real line, it gives the probability of each
possible set of values (for example, an interval).

[Figure: frequency histogram of RBC count; x-axis: RBC (10^12/L), from 4.0 to 5.8; y-axis: frequency, from 0 to 30]

Probability distribution curve

As the sample size increases and the width of the class
intervals shrinks toward zero, the outline of the histogram
approaches a smooth curve. For data like these the curve is
bell-shaped: this is the normal distribution.

Distribution Curve

   Because the cumulative relative frequency is 1, the total
area under the normal curve is also 1.

   The characteristics of this family of curves were
developed by Abraham de Moivre and Carl Friedrich Gauss. For
this reason the distribution is sometimes called the
Gaussian distribution.

Section 2--Normal distribution and its general features

The Normal Distribution
   The most widely used continuous distribution.
   Many other distributions that are not themselves normal
can be made approximately normal by transforming the data
onto a different scale.

   Generally speaking, any random variable that can be
expressed as a sum of many independent random variables can
be well approximated by a normal distribution.

Many medical measurements are approximately normally
distributed:
   RBC
   WBC
   Hb
   Blood pressure
   Height
   Weight

   The probability density function of a normal random
variable has the form

   f(x) = 1/(σ√(2π)) · exp(-(x - μ)² / (2σ²)),   -∞ < x < +∞

   If X has a normal distribution with mean μ and
variance σ², we write X ~ N(μ, σ²).

[Figure: normal density curve centered at μ, with spread σ]

Normal distributions have symmetric, single-peaked,
bell-shaped density curves.

General features of the Normal Distribution

   The curve peaks at x = μ.
   The total area under the curve equals 1.
   The normal distribution is completely determined by the
parameters μ (the location parameter) and σ (the scale
parameter, which controls how spread out, and hence how
flat or peaked, the curve is).
   In other words, each different pair of values of μ and σ
specifies a different normal distribution.

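   The area under the normal curve is distributed according
to fixed rules. For example, 68% of the total area lies
between μ - σ and μ + σ.

[Figure: normal curve with the horizontal axis marked in standard units (-3 to 3) and at μ-2.58σ, μ-1.96σ, μ-σ, μ, μ+σ, μ+1.96σ, μ+2.58σ]

Interval                    Area under the curve
μ - σ to μ + σ              68.27%
μ - 1.96σ to μ + 1.96σ      95.00%
μ - 2.58σ to μ + 2.58σ      99.00%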

   The above results show that although any normal
variable may take values anywhere in (-∞, +∞), the
chance that its value falls in (μ - 1.96σ, μ + 1.96σ) is
always 95% and the chance that its value falls in
(μ - 2.58σ, μ + 2.58σ) is always 99%.

Section 3--The standard normal distribution

The Standard Normal Distribution

   Suppose X has a normal distribution with mean μ
and standard deviation σ, denoted X ~ N(μ, σ²).

   Then a new random variable defined as Z = (X - μ)/σ
has the standard normal distribution, denoted
Z ~ N(0, 1).

All normal random variables can be converted to the
standard normal random variable.

Any value Xi in any normal distribution corresponds to a
value Zi in the standard normal distribution.

[Figure: a normal curve over the X-axis aligned with the standard normal curve over the Z-axis, marked from -3 to +3]

General Features of the Standard Normal Distribution

   The distribution is centered at 0.

   The distribution is bell-shaped and symmetric.
The curve extends to infinity in both directions.

   The area under the curve is distributed according to
fixed rules.

   It can be shown that about 68% of the area under the
standard normal density lies between -1 and +1, about 95%
of the area lies between -1.96 and +1.96, and about 99%
lies between -2.58 and +2.58.

   These relationships can be stated more precisely as

P(-1 < Z < 1) ≈ 0.68
P(-1.96 < Z < 1.96) ≈ 0.95
P(-2.58 < Z < 2.58) ≈ 0.99

Normal distribution         Standard normal distribution    Area

μ - σ ~ μ + σ               -1 ~ +1                         68.27%

μ - 1.96σ ~ μ + 1.96σ       -1.96 ~ +1.96                   95.00%

μ - 2.58σ ~ μ + 2.58σ       -2.58 ~ +2.58                   99.00%

   The area under the curve between any two points of the
normal distribution is equal to the probability of observing
a value between those two points.

   How do we find the probability under the standard
normal distribution curve between any two values Z1
and Z2?

   The probability of any event involving a normal random
variable or a standard normal random variable can
be computed from tables of the standard normal
distribution.

Normal Table

[Figure: standard normal curve with z marked on the horizontal axis]

Table 1 gives areas to the left of z.
This table, from a previous edition, gives areas to the
right of z.

The areas (probabilities) in the table are the shaded area
from negative infinity to z, written Φ(z).

[Figure: standard normal curve (μ = 0, σ = 1) with the area Φ(z) to the left of z shaded]

Φ(-z) = 1 - Φ(z)

P(Z1 < Z < Z2) = P(-∞ < Z < Z2) - P(-∞ < Z < Z1)
               = Φ(z2) - Φ(z1)

[Figure: standard normal curve (σ = 1) with the area between Z1 and Z2 shaded]

Probability Problems

Use symmetry and the fact that the total area under the
density curve is 1.

P(Z > 1.83) = 0.0336
P(Z < 1.83) = 1 - P(Z > 1.83)
            = 1 - 0.0336 = 0.9664

By symmetry:
P(Z < -1.83) = P(Z > 1.83)
             = 0.0336

[Figure: standard normal curve with -1.83 and 1.83 marked]

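P(-0.6 < Z < 1.83)
= P(Z < 1.83) - P(Z < -0.6)
= 0.9664 - 0.2743
= 0.6921

Equivalently, using right-tail areas:
P(Z > -0.6) - P(Z > 1.83) = 0.7257 - 0.0336 = 0.6921

[Figure: standard normal curve with the area between -0.6 and 1.83 shaded]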

    Calculate P(x < 1.96), P(x < 1), P(x < -1.96) and
P(-1 < x < 1.5) if x ~ N(0, 1).

To obtain the probability (or area) under the normal
distribution curve between any two specified values
X1 and X2 on the X-axis, standardize both endpoints:

Z1 = (X1 - μ)/σ          Z2 = (X2 - μ)/σ

P(X1 < X < X2) = P(Z1 < Z < Z2)
               = Φ(z2) - Φ(z1)

Find P(2 < X < 4) when X ~ N(5, 2²), that is, μ = 5 and σ = 2.
The standardization equation for X is:
Z = (X - μ)/σ = (X - 5)/2

When X = 2, Z = -3/2 = -1.5
When X = 4, Z = -1/2 = -0.5

P(2 < X < 4) = P(X < 4) - P(X < 2)

P(X < 2) = P(Z < -1.5)
         = P(Z > 1.5)    (by symmetry)

P(X < 4) = P(Z < -0.5)
         = P(Z > 0.5)    (by symmetry)

P(2 < X < 4) = P(X < 4) - P(X < 2)
             = P(Z > 0.5) - P(Z > 1.5)
             = 0.3085 - 0.0668 = 0.2417

Example 1
Suppose that the scores on an aptitude test are normally
distributed with a mean of 100 and a standard deviation of
10 (some of the original IQ tests were purported to have
these parameters). What is the probability that a randomly
selected score is below 90?

Standardizing, z = (90 - 100)/10 = -1.
The table entry at the intersection of the row and column
for this z is 0.1587, so the probability of a score
less than 90 is 15.87%.

Example 2
What is the probability of a score between 90 and
115?

Exercise 1
Suppose that diastolic blood pressure X in hypertensive
women is normally distributed with a mean of 100 mmHg and
a standard deviation of 16 mmHg. Find
P(X < 90) and P(X > 124).

Section 4--The Application of Normal Distributions

   Estimate the medical reference range
   Estimate the frequency distribution

   Many statistical methods are based on the normal
distribution, or have the normal distribution as their
limiting distribution

Estimate the Frequency Distribution

Example 3
Suppose that the scores on CET-4 are normally distributed with
a mean of 70 and a standard deviation of 6. What is the
probability that a randomly selected score is larger than 80?

Estimate the Medical Reference Range

   In medicine, for a useful index (a variable, in
statistical terms), a large group of "normal" people is
often measured to determine the reference range, or normal
range, of that index.
   The reference range, also called the "normal range", is
the range of values within which most normal individuals
fall. It is used to define a normal physiological status.
   In addition, when establishing a reference range, you
must randomly select a large enough sample of "normal
people" from the population.

   If someone's value is outside this range, he or she is
considered suspect and needs closer attention. For example:
   the 95% reference range of height for seven-year-olds:
(110 cm, 130 cm)
   the 95% reference range of hemoglobin of

   How to estimate the reference range

   The normal distribution method
   The percentile method

Determination of the Reference Range

   When the variable is normally distributed, we use the
normal distribution method to determine the reference range.
   μ and σ are usually unknown, so when the sample size is
large enough we use the sample mean x̄ and the sample
standard deviation s in their place.

   The 95% reference range: x̄ ± 1.96s
   The 99% reference range: x̄ ± 2.58s

52
   When the distribution of the variable is the skew
distribution, we use the percentile method to determine the
reference range

   If the value above certain value is abnormal, the 95%
 P95
reference range:

for example: hair mercury
    If the value below certain value is abnormal, the 95%
reference range:
 P5
for example: respiratory capacity
   If the value below certain value and above certain value are
both abnormal, the 95% reference range:
P2.5 ~ P97.5
53
Example 4
Suppose the concentration of hemoglobin in 120 health
women is normally distributed with a mean of 117.4g/L
and standard deviation 10.2g/L. what is the 95% medical
reference range of hemoglobin?

Exercise 2
Suppose the RBC count in 144 healthy men is normally
distributed with a mean of 5.38×10^12/L and a
standard deviation of 0.44×10^12/L. What is the 95%
medical reference range of RBC?

Summary

   The normal distribution: the most important continuous
distribution
   Two parameters: μ and σ
   Standard normal distribution
   Reference range

Review questions
   What is a standard normal distribution?
   What is the area to the left of -0.2 under a standard normal
distribution? What symbol is used to represent this area?
   What is the area to the right of 0.3 under a standard
normal distribution? What symbol is used to represent this
area?
   What is Z0.30? What does it mean?
   What is Z0.75? What does it mean?

   Suppose the diameters of a certain species of tree
in a defined forest area are normally distributed with
mean 8 in. and standard deviation 2 in. Find the
probability that a tree has an unusually large diameter,
defined as more than 12 in.

Solution
We have x ~ N(8, 4) and require
P(x > 12) = 1 - P(x < 12) = 1 - P{z < (12 - 8)/2}
          = 1 - P(z < 2.0) = 1 - 0.977 = 0.023
Thus 2.3% of trees from this area have an unusually large
diameter.

Exercise
Assume the diastolic pressure of healthy high school
students follows a normal distribution with mean 9.3 kPa
and standard deviation 1.3 kPa. What percentage of the
students have a diastolic pressure (a) between 8 kPa and
10.6 kPa, (b) higher than 12.7 kPa, and (c) lower than
6.7 kPa?


```
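
The table look-ups in Section 3 can also be reproduced numerically. The following is a minimal sketch, assuming Python 3.8 or later and using only the standard library's `statistics.NormalDist`. The helper names `phi` and `prob_between` are illustrative and do not come from the slides. It standardizes the endpoints with Z = (X - μ)/σ and returns Φ(z2) - Φ(z1).

```python
from statistics import NormalDist

# Standard normal distribution, Z ~ N(0, 1).
Z = NormalDist(mu=0.0, sigma=1.0)


def phi(z: float) -> float:
    """Area under the standard normal curve to the left of z (hypothetical helper)."""
    return Z.cdf(z)


def prob_between(x1: float, x2: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """P(x1 < X < x2) for X ~ N(mu, sigma^2), computed by standardization."""
    z1 = (x1 - mu) / sigma  # Z1 = (X1 - mu) / sigma
    z2 = (x2 - mu) / sigma  # Z2 = (X2 - mu) / sigma
    return phi(z2) - phi(z1)


if __name__ == "__main__":
    # Area rules for the standard normal curve.
    for k in (1.0, 1.96, 2.58):
        print(f"P(-{k} < Z < {k}) = {prob_between(-k, k):.4f}")

    # Worked problems from the slides.
    print(f"P(Z > 1.83)        = {1 - phi(1.83):.4f}")
    print(f"P(-0.6 < Z < 1.83) = {prob_between(-0.6, 1.83):.4f}")
    print(f"P(2 < X < 4), mu=5, sigma=2     = {prob_between(2, 4, mu=5, sigma=2):.4f}")
    print(f"P(X < 90), mu=100, sigma=10     = {phi((90 - 100) / 10):.4f}")
```

Note that `prob_between` takes the standard deviation σ, not the variance σ², so X ~ N(8, 4) from the tree-diameter question would be passed as `mu=8, sigma=2`. Results may differ from the printed table in the last digit because the table rounds to four decimals.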
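
The two ways of estimating a reference range described in Section 4 (the normal distribution method and the percentile method) can be sketched as follows. This is an illustration under stated assumptions, not the slides' own procedure. The function names are hypothetical, the normal method uses the exact quantile (about 1.96 for 95%) from `statistics.NormalDist`, and the percentile method uses `statistics.quantiles` with inclusive interpolation, which is only one of several percentile conventions.

```python
from statistics import NormalDist, quantiles
from typing import Sequence, Tuple


def normal_reference_range(mean: float, sd: float, level: float = 0.95) -> Tuple[float, float]:
    """Two-sided reference range (mean - z*sd, mean + z*sd) for normally distributed data.

    `mean` and `sd` are the sample mean and sample standard deviation,
    used in place of the unknown mu and sigma.
    """
    z = NormalDist().inv_cdf((1 + level) / 2)  # about 1.96 for level=0.95
    return mean - z * sd, mean + z * sd


def percentile_reference_range(values: Sequence[float]) -> Tuple[float, float]:
    """Two-sided 95% reference range (P2.5, P97.5) for skewed data."""
    # n=40 gives 39 cut points spaced 2.5% apart: the first is P2.5, the last is P97.5.
    cuts = quantiles(values, n=40, method="inclusive")
    return cuts[0], cuts[-1]


if __name__ == "__main__":
    # Example 4 from the slides: hemoglobin, mean 117.4 g/L, SD 10.2 g/L.
    low, high = normal_reference_range(117.4, 10.2)
    print(f"95% reference range of hemoglobin: {low:.1f} to {high:.1f} g/L")
```

For a one-sided range (only high values abnormal, or only low values abnormal), the same `quantiles` call with `n=20` gives P5 as the first cut point and P95 as the last.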