									Probability Theory
Theory and Terminology

Addition & Multiplication Rules

The Binomial Distribution

The Normal Distribution

Testing for Normality


                                         KEY CONCEPTS
                                              *****
                                        Probability Theory


Definition of probability
Theoretical v relative frequency probability
Statistical experiment
Sample space
A complement
Probability of an event
Relative frequency
Mutually exclusive events
Conditional probability
Independent events
Addition rule of probability
Multiplication rule of probability
Testing the independence of events
Comparing observed and expected frequencies and probabilities
Definition of a probability distribution
Examples of probability distributions:
        Binomial distribution
        Normal distribution
        t distribution
        F distribution
        Chi-square distribution
        Poisson distribution
The binomial distribution and its expansion
The binomial distribution with equal and unequal probabilities
Estimating binomial probabilities
Pascal’s Triangle
Blaise Pascal (1623-1662)
Difference between the binomial and normal distribution
Characteristics of the normal distribution
Finding areas under the normal curve
Standard score (Z)
Determining if a variable is normally distributed:
        Pseudo standard deviation
        Histogram with normal curve over-lay
        Normal probability plot




       Probability Theory: Charles M. Friel Ph.D., Criminal Justice Center, Sam Houston State University


                              Lecture Outline

 What is probability & the meaning of
  probability terminology

 Addition and multiplication rules of probability

 Using probability to test the independence of
  events

 Probability distributions

 The binomial distribution

 The normal distribution

 Testing variables for normality






                      Theory of Probability
The probability of an event is expressed on a
scale from 0.0 to 1.0
 0.0 means it will never happen
 1.0 means that it is certain to happen

The probability of an event is the number of
times that a specific event occurs relative to the
sum of all possible events that can occur.

Example       The probability of rolling a 3 on a die
is 1 out of 6, or 0.1667

Example    The probability of two coins flipped
once both coming up heads is 1 out of 4, or 0.25

Theoretical probability Sometimes we know
from the theory of the matter what the probability
of an event is, e.g. rolling dice or flipping coins.
Relative frequency probability In other cases
we can only estimate the probability by
observing how frequently a particular event
occurs, e.g. jury acquittals, people quitting a job
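
The contrast between the two kinds of probability can be illustrated with a short simulation. This is a minimal sketch, not part of the original lecture: it assumes a fair six-sided die, and the function name relative_frequency, the seed 42, and the 100,000 trials are arbitrary choices made for illustration.

```python
import random


def relative_frequency(event, trials, rng):
    """Estimate P(event) as the share of simulated die rolls where event(roll) is true."""
    hits = sum(1 for _ in range(trials) if event(rng.randint(1, 6)))
    return hits / trials


# Theoretical probability: known from the structure of a fair die.
theoretical = 1 / 6

# Relative frequency probability: estimated by observing many outcomes.
rng = random.Random(42)  # fixed seed (arbitrary) so the run is repeatable
estimated = relative_frequency(lambda roll: roll == 3, 100_000, rng)

print(f"theoretical = {theoretical:.4f}")
print(f"estimated   = {estimated:.4f}")
```

With enough trials the estimate settles near the theoretical value. Observed data such as acquittal counts play the same role when no theoretical value can be derived.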





                                  Terminology

Statistical experiment An empirical record of
any phenomenon whose relative frequency of
occurrence is uncertain.

    Example     Data on sentencing outcomes:
    probability of acquittal, fine, deferred
    adjudication, probation, or prison

Sample space All possible outcomes.

    Example     Possible sentencing outcomes:
    acquittal,   fine,   deferred adjudication,
    probation, or prison

An event A specific outcome or collection of
outcomes; e.g. an acquittal

A complement All possible events other than
the one in question, e.g. if the event in question
is acquittal, then the complements are fine,
deferred adjudication, probation, and prison







Terminology (cont.)

Probability of an event The number of times
an event occurs divided by the total frequency of
all events that can occur, e.g. 246 acquittals
out of 14,573 cases acquitted, fined, deferred,
probated, or sent to prison (p = 246/14,573 =
0.0169)

Relative frequency How often an event occurs
relative to all other events that occurred in the
experiment

Mutually exclusive events Two or more events
which cannot happen together, e.g. acquitted
and sent to prison; the probability of both = 0.0

Conditional probability The probability of event
A happening, given that event B has already
occurred, e.g. probability of going to prison (A)
given that the offender was put on probation (B).


                      This is symbolized P(A B)







Terminology (cont.)


Independent events         Two events A & B are
considered independent if the conditional
probability P(A B) = P(A), e.g. probability of
acquittal (A) given that it is raining outside (B)






            Addition Rule of Probability


Q   What is the probability of A or B happening?

If the two events are not mutually exclusive …


      P(A or B) = P(A) + P(B) – P(A and B)



Example     What is the probability of drawing
either a Jack or a Heart from a deck of cards?

    P(J or H) = P(J) + P(H) – P(J and H)

    P(J or H) = 4/52 + 13/52 – 1/52

    P(J or H) = (0.0769 + 0.2500) – (0.01923)

    P(J or H) = 0.3077






Addition rule (cont.)


If the two events are mutually exclusive …


                          P(A or B) = P(A) + P(B)


Example     What is the probability of drawing a
Jack or a King?

      P(J or K) = P(J) + P(K)

      P(J or K) = 4/52 + 4/52

      P(J or K) = (0.0769 + 0.0769) = 0.1538
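
Both forms of the addition rule can be checked by listing the cards and counting them. The following is a minimal sketch, assuming a standard 52-card deck; the names deck, jacks, kings, hearts, and p are illustrative only and do not come from the lecture.

```python
from fractions import Fraction

RANKS = ["A"] + [str(n) for n in range(2, 11)] + ["J", "Q", "K"]
SUITS = ["hearts", "diamonds", "clubs", "spades"]
deck = {(rank, suit) for rank in RANKS for suit in SUITS}  # 52 cards

jacks = {card for card in deck if card[0] == "J"}
kings = {card for card in deck if card[0] == "K"}
hearts = {card for card in deck if card[1] == "hearts"}


def p(event):
    """Probability of an event, given as a set of cards."""
    return Fraction(len(event), len(deck))


# Not mutually exclusive: the Jack of Hearts belongs to both events,
# so it is subtracted once to avoid counting it twice.
jack_or_heart = p(jacks) + p(hearts) - p(jacks & hearts)
assert jack_or_heart == p(jacks | hearts)

# Mutually exclusive: no card is both a Jack and a King,
# so the overlap term is zero and drops out.
jack_or_king = p(jacks) + p(kings)
assert jack_or_king == p(jacks | kings)

print(float(jack_or_heart))  # 16 of 52 cards
print(float(jack_or_king))   # 8 of 52 cards
```

Exact fractions are used so that the rule and the direct count can be compared without rounding error.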






      Multiplication Rule of Probability


Q   What is the probability of A and B happening
together?

The general rule

                                      P(B) P(AB)

If the events are independent of each other, this
simplifies to …

                                         P(A) P(B)

Example      What is the probability of drawing a
Jack of Hearts? A card's rank does not depend on
its suit, i.e. P(J | H) = 1/13 = P(J), so J and H
are independent …

    P(J and H) = P(J) P(H) = (4/52) (13/52)
               = 0.01923

    Notice that this is the same as
    (1/52) = 0.01923
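
The general rule, its simplification for independent events, and the difference between "independent" and "mutually exclusive" can be checked by counting cards. This is a sketch under the assumption of a standard 52-card deck; the helper p and the is_jack, is_king, and is_heart tests are names introduced here for illustration.

```python
from fractions import Fraction

RANKS = ["A"] + [str(n) for n in range(2, 11)] + ["J", "Q", "K"]
SUITS = ["hearts", "diamonds", "clubs", "spades"]
deck = [(rank, suit) for rank in RANKS for suit in SUITS]


def p(event, cards):
    """Share of `cards` for which the test event(card) is true."""
    return Fraction(sum(1 for card in cards if event(card)), len(cards))


def is_jack(card):
    return card[0] == "J"


def is_king(card):
    return card[0] == "K"


def is_heart(card):
    return card[1] == "hearts"


hearts_only = [card for card in deck if is_heart(card)]
kings_only = [card for card in deck if is_king(card)]

p_j = p(is_jack, deck)                      # 4/52
p_h = p(is_heart, deck)                     # 13/52
p_j_given_h = p(is_jack, hearts_only)       # conditional: look only at hearts
p_j_and_h = p(lambda c: is_jack(c) and is_heart(c), deck)

# General multiplication rule: P(A and B) = P(B) P(A | B)
assert p_j_and_h == p_h * p_j_given_h

# Independence: knowing the suit does not change the chance of a Jack,
# so the rule simplifies to P(A) P(B).
assert p_j_given_h == p_j
assert p_j_and_h == p_j * p_h

# Contrast: Jack and King are mutually exclusive, and therefore NOT
# independent, because P(J | K) = 0 differs from P(J).
p_j_given_k = p(is_jack, kings_only)
assert p_j_given_k == 0 and p_j_given_k != p_j

print(float(p_j_and_h))
```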






   Testing the Independence of Events

These rules of probability allow us to test
whether two events are in fact independent of
each other

Example      Is success on probation (S) related
to drug addiction (A)? Put the question the other
way around …

    Is success independent of addiction?

Cross-Tabulation of Probation Outcome & Drug
              Addiction (N=643)

Outcome      Addicted      Not Addicted      Total

Success        115             277            392

Failure        194              57            251

Total          309             334            643






Testing independence (cont.)


If success is independent of addiction, that is the
variables are not related to each other…
      How many of the 643 probationers should have
      been successful in spite of their addiction?

By the multiplication rule for independent events

                                           P(S) P(A)

There are 392-successes (S) and 309-addicted
(A) out of a total of 643 probationers …
      P(S) P(A) = (392/643) (309/643) = 0.2930

Interpretation Theoretically, a proportion of
0.2930 of the probationers (188 cases) should
have been successful and addicted if these
variables are independent
      In fact, 0.1788 (115/643) were successful and
      addicted (115 cases)
      Does the difference between these two
      proportions (0.2930 & 0.1788) suggest that
      success and addiction are not independent, i.e.
      they are related to each other?






Testing independence (cont.)


Calculation of the probabilities & frequencies for
the other cells under the assumption of
independence.

      Successful / not addicted
               (392/643) (334/643) = 0.3167
               (0.3167) (643) = 204 cases

      Failure / addicted
               (251/643) (309/643) = 0.1876
               (0.1876) (643) = 121 cases

      Failure / not addicted
               (251/643) (334/643) = 0.2028
               (0.2028) (643) = 130 cases


These figures are the expected probabilities &
frequencies, assuming independence …

      What we would expect to find in a sample of
      643 probationers if success is unrelated to
      addiction (i.e. independent of it).





    Putting the Observed and Expected
           Frequencies Together

                 Observed Results

Outcome      Addicted      Not Addicted      Total

Success        115             277            392
             (0.1788)        (0.4308)       (0.6096)

Failure        194              57            251
             (0.3017)        (0.0886)       (0.3904)

Total          309             334            643
             (0.4806)        (0.5194)       (1.0000)


                 Expected Results

Outcome      Addicted      Not Addicted      Total

Success        188             204            392
             (0.2930)        (0.3167)       (0.6096)

Failure        121             130            251
             (0.1876)        (0.2028)       (0.3904)

Total          309             334            643
             (0.4806)        (0.5194)       (1.0000)


Do the differences between the observed and
expected results indicate that addiction is related
to success?




                    An Algebraic Shortcut
Recall how the expected cell frequencies were
calculated, e.g. successful / not addicted
    Successful / not addicted
             (392 / 643) (334 / 643) = 0.3167
             (0.3167) (643) = 204 cases

This is algebraically the same as
    [(392 / 643) (334 / 643)] (643) = 204 cases

By cancellation of the common term (643) in the
numerator and the denominator
    [(392) (334) (643)] / [(643) (643)] = 204 cases

The equation simplifies to
    [(392) (334 )/ 643] = 204 cases

In short, to find the expected frequency in any
cell …

    [(row total) (column total)] / (grand total)
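
The shortcut can be applied to every cell of the cross-tabulation at once. This is a minimal sketch using the observed counts from the probation example; the variable names (observed, row_totals, col_totals, expected) are chosen here for illustration.

```python
# Observed counts: rows = Success, Failure; columns = Addicted, Not addicted
observed = [
    [115, 277],
    [194, 57],
]

row_totals = [sum(row) for row in observed]            # 392, 251
col_totals = [sum(col) for col in zip(*observed)]      # 309, 334
grand_total = sum(row_totals)                          # 643

# Expected frequency under independence:
# (row total)(column total) / (grand total)
expected = [
    [row_total * col_total / grand_total for col_total in col_totals]
    for row_total in row_totals
]

for label, row in zip(["Success", "Failure"], expected):
    print(label, [round(value) for value in row])
```

Rounded to whole cases, the result should agree with the expected frequencies worked out cell by cell above.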






                 Probability Distributions

Probability distribution A theoretical model that
indicates the probability of specific events
happening for a phenomenon distributed in a
particular manner.

In statistics, numerous probability distributions
are used to describe, explain, predict, and assist
in decision making.


Examples include

    Binomial distribution

    Normal distribution

    t distribution

    F distribution

    Chi-square distribution

    Poisson distribution





                      Binomial Distribution
Consider flipping 3 coins once. What are the
possible theoretical outcomes?

    3 heads
    3 combinations of 2 heads and 1 tail
    3 combinations of 1 head and 2 tails
    3 tails

Let p = the probability of a head, q = the
probability of a tail, and n = the number of coins
flipped. For a fair coin, p = q = 0.5

This statistical experiment can be represented
by the following binomial model


                                           (p + q)^n


    (p + q)^3 = (p + q) (p + q) (p + q)

    (p + q)^3 = 1p^3 + 3p^2q + 3pq^2 + 1q^3

    (p + q)^3 = p^3 + 3p^2q + 3pq^2 + q^3





               Interpretation of the Binomial
                   Expansion (p + q)^3

                    (p + q)^3 = p^3 + 3p^2q + 3pq^2 + q^3

p^3 = 1 way to get 3 heads

3p^2q = 3 ways to get 2 heads and 1 tail

3pq^2 = 3 ways to get 1 head and 2 tails

q^3 = 1 way to get 3 tails

Combinations of outcomes = (1+3+3+1) = 8

Calculating probabilities

    The probability of 2 heads and 1 tail

             3p^2q = 3(0.5)^2(0.5) = 0.375

    This is the same as the coefficient 3 divided
    by the total number of combinations of
    outcomes; i.e. 8, (3 / 8) = 0.375





Interpretation (cont.)


Calculating probabilities

      The probability of 3 heads

                p^3 = (0.5)^3 = 0.125

      This is the same as the coefficient 1 divided
      by the total number of combinations of
      outcomes; i.e. 8, (1 / 8) = 0.125

[Figure: graph of the binomial distribution (p + q)^3. Vertical axis: probability (0.0 to 0.375). Horizontal axis: combinations of possible outcomes (p^3, 3p^2q, 3pq^2, q^3).]






   Applications of the Binomial (p + q)^3

This binomial probability distribution can be
applied to any binary phenomenon in which
there are three events (n = 3)

Examples

   Pre-trial settlements (p) v trials (q) among 3
   civil suits

   Combinations of sons (p) and daughters (q)
   in a family of three children

   Clearances (p) and failures to clear (q)
   among 3 criminal cases

   Escapes (p) and no escapes (q) among 3
   trustees

   Probations (p) and incarcerations (q) among
   3 criminal cases






     Binomial with Unequal Probabilities

In the binomial (p + q)^3, the probabilities p and q
may be equal or unequal

                                        p = q = 0.5
                                            or
                                           p ≠ q


Example with unequal probabilities …

    Settlements (p = 0.7) and trials (q = 0.3)
    among 3 civil suits

Calculation of the probabilities …

    p^3 + 3p^2q + 3pq^2 + q^3

    2 settlements & 1 trial = 3p^2q = 3(0.7)^2(0.3)

    Probability = 0.441






Unequal probabilities (cont.)


                                Summary of Probabilities

                     Combination of                                  Probability
                         Events
                     p^3                                                 0.343
                     3p^2q                                               0.441
                     3pq^2                                               0.189
                     q^3                                                 0.027
                     Total (c = 8)                                       1.000
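
The term-by-term probabilities of (p + q)^3 can be computed for both the equal and the unequal case. A minimal sketch, assuming Python 3.8 or later for math.comb; the function name binomial_terms is hypothetical and introduced only for this illustration.

```python
from math import comb


def binomial_terms(n, p):
    """Return (coefficient, probability) for each term of (p + q)^n, with q = 1 - p.

    Terms are ordered from p^n (all in the first category) to q^n (none).
    """
    q = 1 - p
    terms = []
    for k in range(n + 1):  # k is the exponent of q in this term
        coefficient = comb(n, k)
        terms.append((coefficient, coefficient * p ** (n - k) * q ** k))
    return terms


for p in (0.5, 0.7):
    probabilities = [round(prob, 3) for _, prob in binomial_terms(3, p)]
    print(f"p = {p}: {probabilities}")
```

For p = 0.7 the rounded values should reproduce the summary table, and in both cases the four probabilities sum to 1.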



[Figure: graph of the binomial (p + q)^3 with p = 0.7, q = 0.3. Vertical axis: probability (0.00 to 0.50). Horizontal axis: p^3, 3p^2q, 3pq^2, q^3.]






                Pascal’s Triangle
         Expanding the Binomial to (p + q)^n
    The binomial distribution can be expanded to
    n = 4, 5, 6, …

    When n is large, multiplying out the binomial can
    be quite tedious. A quick way to determine the
    coefficients of a binomial of n is to use Pascal’s
    Triangle (Blaise Pascal 1623-1662)


n                    Coefficients                     C

1                        1   1                        2
2                      1   2   1                      4
3                    1   3   3   1                    8
4                  1   4   6   4   1                 16
5                1   5  10  10   5   1               32
6              1   6  15  20  15   6   1             64
7            1   7  21  35  35  21   7   1          128




    Using Pascal’s Triangle: (p + q)^6

    1p^6 + 6p^5q^1 + 15p^4q^2 + 20p^3q^3 + 15p^2q^4 + 6p^1q^5 + 1q^6
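
The rows of Pascal's Triangle can be generated by adding each pair of neighbouring entries in the row above. A minimal sketch; the function name pascal_rows is an assumption made for illustration.

```python
def pascal_rows(max_n):
    """Return the rows of Pascal's Triangle for n = 1 .. max_n."""
    row = [1]  # the row for n = 0
    rows = []
    for _ in range(max_n):
        # Each interior entry is the sum of the two entries above it.
        row = [1] + [left + right for left, right in zip(row, row[1:])] + [1]
        rows.append(row)
    return rows


for n, row in enumerate(pascal_rows(7), start=1):
    # sum(row) is C, the total number of combinations (2 to the power n)
    print(n, row, sum(row))
```

The row for n = 6 supplies the coefficients of the expansion of (p + q)^6, and each row total gives the column C.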







Chinese discovery of the binomial triangle from Chu-
Shi-Chieh's Ssu Yuan Yü Chien (1303 AD)






A Shortcut for Determining Coefficients

Given the binomial (p + q)^n, the 1st term is
always 1p^n. Where do we go from here?

Example               Expansion of the binomial (p + q)^6

   1st term                    1p^6

   To get the coefficient for the second term

            Multiply the coefficient of the 1st term by
            the exponent of p and …

                      Divide the product by the position of
                      the term, which is 1, the 1st term

                      E.g.                [(1) (6)] / 1 = 6

       The exponent of p in the 2nd term is 1 less
       than it was in the previous term (i.e.
       6 - 1 = 5), and the exponent of q
       increases by 1 (i.e. from 0 to 1).

                      2nd term:                     6p^5q^1





A shortcut (cont.)


 Continue this process from the 3rd term to 7th

       3rd term                              (6)(5)/2 = 15                           15p^4q^2

       4th term                              (15)(4)/3 = 20                          20p^3q^3

       5th term                              (20)(3)/4 = 15                          15p^2q^4

       6th term                              (15)(2)/5 = 6                           6p^1q^5

       7th term                              (6)(1)/6 = 1                            1q^6
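
The term-by-term procedure can be written as a short loop. A sketch only; the function name coefficients_by_shortcut and its variable names are introduced here for illustration.

```python
def coefficients_by_shortcut(n):
    """Coefficients of (p + q)^n, built term by term.

    Next coefficient = (current coefficient x current exponent of p)
                       / (position of the current term).
    """
    coefficients = [1]      # the 1st term is always 1p^n
    exponent_of_p = n
    for position in range(1, n + 1):
        # The division is always exact, so integer division is safe here.
        coefficients.append(coefficients[-1] * exponent_of_p // position)
        exponent_of_p -= 1  # p loses one power in each successive term
    return coefficients


print(coefficients_by_shortcut(6))
```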

 The result is the same as when expanded
  using Pascal’s Triangle



 1p^6 + 6p^5q^1 + 15p^4q^2 + 20p^3q^3 + 15p^2q^4 + 6p^1q^5 + 1q^6






An Equation for Calculating Probabilities
     of Binomial Events (p=q=0.5)


        P(f) = \frac{n!}{f!\,(n - f)!} \; p^{f} (1 - p)^{n - f}


P (f)   Probability that f-number of the cases will fall in one of the categories

n       The number of events in the sample

!       Factorial, e.g. 3! = 3x2x1

p       0.5


Example      For the binomial (p + q)^6, what is the
probability of 4 cases falling in one category and
2 in the other, e.g. 4 boys among 6 children?

    P(4) = \frac{6!}{4!\,(6 - 4)!} \; (0.5)^{4} (1 - 0.5)^{6 - 4}

    P(4) = \frac{6 \times 5 \times 4 \times 3 \times 2 \times 1}{(4 \times 3 \times 2 \times 1)(2 \times 1)} \; (0.0625) (0.25)

    P(4) = 0.234






An equation (cont.)


The same result is achieved by expanding the
binomial (p + q)^6 and calculating the probability
of the term 15p^4q^2



 1p^6 + 6p^5q^1 + 15p^4q^2 + 20p^3q^3 + 15p^2q^4 + 6p^1q^5 + 1q^6



15p^4q^2 = 15 (0.5)^4 (0.5)^2 = 0.234
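
The factorial equation translates directly into code. A minimal sketch; the function name binomial_probability is hypothetical, and the default p = 0.5 follows the equal-probability case used in this example.

```python
from math import factorial


def binomial_probability(f, n, p=0.5):
    """P(f): probability that f of n cases fall in the category with probability p."""
    ways = factorial(n) // (factorial(f) * factorial(n - f))  # n! / (f! (n - f)!)
    return ways * p ** f * (1 - p) ** (n - f)


p_four_of_six = binomial_probability(4, 6)
print(round(p_four_of_six, 3))

# The probabilities for f = 0 .. 6 cover every possible outcome.
total = sum(binomial_probability(f, 6) for f in range(7))
print(total)
```

The first value corresponds to the term with coefficient 15 in the expansion of (p + q)^6.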






 The Binomial v the Normal Distribution

When p=q=0.5, the binomial distribution is a
symmetric, discrete distribution.

However, as n increases from a few events to a
very large number, the binomial distribution
approaches a normal distribution if p = q = 0.5.

Shown below are three binomial distributions
with an over-lay of a normal distribution.






Binomial & normal (cont.)






                  The Normal Distribution

Johann Karl Friedrich Gauss (1777-1855)
developed the mathematics for the normal
probability distribution.


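        Y = \frac{N}{S \sqrt{2\pi}} \; e^{-x^2 / (2 S^2)}

[Figure: the normal curve, with frequency (Y) on the vertical axis and X on the horizontal axis]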



The normal distribution is a continuous,
symmetric distribution, also called the bell curve,
the normal curve, and the Gaussian curve.



      Areas Under the Normal Curve
If the distribution of some phenomenon
approaches a normal curve, the curve can be
used to describe the probabilities associated with
the phenomenon.

Example       A sample of 142 political corruption
cases was analyzed to determine the time from
case filing to final disposition.




The cases are "near" normally distributed with a
mean of 144.2 days and a standard deviation of
S =  14 days.





        The Standard Deviation & the
               Normal Curve


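[Figure: normal curve with mean = 144.2 and S = 14. The area between the mean and 1S on each side (130.2 to 144.2, and 144.2 to 158.2) is 0.3413; the area between 130.2 and 158.2 is 0.6826 (68.26%).]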


The area between the mean and one standard
deviation (S) on either side of it is a proportion
of 0.3413 of the area under the curve, or 34.13%.

The mean plus and minus 1S is 0.6826
proportion, or 68.26% of the curve.




 Calculating Normal Curve Probabilities
If a variable approximates a normal curve, the
curve may be used to estimate the probabilities
associated with the variable. This can be done in
two ways:
     Using the equation for the normal curve
     Using a normal curve table, which is the
      more convenient way

To use a normal curve table, the event in
question must first be converted to a standard
score (Z)

Example     Consider the example of the time to
process political corruption cases (X̄ = 144.2,
S = 14, N = 142).

What is the probability that a case will take more
than 170 days to process?
    Convert 170 days to a Z score

                                  Z = (X – X̄) / S


    Z = (170 – 144.2) / 14 = +1.84




            Probability of Cases Falling
                Above Z = +1.84
A standard score is the ratio of how far a score
deviates from the mean, relative to 1 standard
deviation.
    In effect, it standardizes deviations
    from the mean
    Z = +1.84 is a point 1.84 times further from
    the mean than 1S
[Figure: normal curve of political corruption case processing time, marking the area above Z = +1.84]




The normal curve table indicates the area of the
curve above Z of 1.84 is 3.29%.

Therefore, the probability of a case taking longer
than 170 days to process is 0.0329, or 3.29%.
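
The table lookup can be reproduced with the standard normal curve in Python's statistics module. This is a sketch under the assumption that processing time is exactly normal with the mean and standard deviation given above; Z is rounded to two decimals to mirror the printed table, and the variable names are illustrative.

```python
from statistics import NormalDist

mean_days = 144.2
sd_days = 14.0
threshold = 170

# Standard score: how many standard deviations the threshold lies from the mean.
z = round((threshold - mean_days) / sd_days, 2)

standard_normal = NormalDist()              # mean 0, standard deviation 1
area_below = standard_normal.cdf(z)         # area under the curve up to Z
area_above = 1 - area_below                 # area beyond Z
area_mean_to_z = area_below - 0.5           # area between the mean and Z

print(f"Z = {z:+.2f}")
print(f"area beyond Z        = {area_above:.4f}")
print(f"area between mean, Z = {area_mean_to_z:.4f}")
```

NormalDist requires Python 3.8 or later. Without the rounding of Z the areas differ slightly in the fourth decimal place.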




               Normal Probability Table
                     *****
 Data extracted from a normal probability table.


Sometimes the entries are expressed as
proportions, sometimes as percentages.

In this case they are expressed as percentages

                                   Area between                                     Area
 Z                                  mean and Z                                    beyond Z
 .                                       .                                            .
 .                                       .                                            .
 .                                       .                                            .
 1.83                                 46.64                                         3.36
 1.84                                 46.71                                         3.29
 1.85                                 46.78                                         3.22
 .                                       .                                            .
 .                                       .                                            .
 .                                       .                                            .






           Probability of Cases Falling
                Below Z = +1.84

Given X̄ = 144.2, S = 14, N = 142, Z = +1.84

[Figure: normal curve of political corruption case processing time, marking the area below Z = +1.84]




Since the area above +1.84 is 3.29% of the area
under the curve, the area below +1.84 is

   (100% - 3.29%) = 96.71%

   The probability of a case taking less than
   170 days is therefore 96.71%




Probability of Cases Falling Between the
           Mean and Z = +1.84

Given X̄ = 144.2, S = 14, N = 142, Z = +1.84

[Figure: normal curve of political corruption case processing time, marking the area between X̄ and Z = +1.84]




As indicated in the Table, the area under the
curve between the mean and Z is 46.71%.

Therefore, the probability of a case taking
between 144 days and 170 days to process is
0.4671





  Probability of Cases Falling Between
  Z Scores on Either Side of the Mean

What is the probability of a case taking between
130 and 160 days to process, given X̄ = 144.2,
S = 14, and N = 142? (cf. attached table)

    Z1 = (130 – 144.2) / 14 = -1.01

    Z2 = (160 – 144.2) / 14 = +1.13

[Figure: Political Corruption Case Processing Time; Area 1 (Z = -1.01 to the mean) = 34.38%, Area 2 (mean to Z = +1.13) = 37.08%]


The area between the Z scores –1.01 and
+1.13 is (34.38% + 37.08%) = 71.46%. The
probability that a case will take between 130 and
160 days to process is therefore 0.7146
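
The conversion from days to Z scores and the resulting probability can be sketched in a few lines. This is an illustrative check only, assuming Python's standard-library `statistics.NormalDist` in place of the printed table; the helper name `z_score` is hypothetical, and Z is rounded to two decimals because that is how the table is indexed.

```python
from statistics import NormalDist

MEAN_DAYS = 144.2  # mean processing time from the slide
SD_DAYS = 14.0     # standard deviation from the slide


def z_score(days):
    """Convert processing time in days to a Z score, rounded as in the table."""
    return round((days - MEAN_DAYS) / SD_DAYS, 2)


z1 = z_score(130)
z2 = z_score(160)

# Area between two Z scores = cumulative area below z2 minus cumulative area below z1
standard_normal = NormalDist()
probability = standard_normal.cdf(z2) - standard_normal.cdf(z1)

print(z1, z2)
print(f"{probability:.3f}")
```

The result agrees with the table-based 0.7146 to within the rounding of the two table entries.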




             Normal Probability Table

                                 Area between                                     Area
Z                                 mean and Z                                    beyond Z
.                                      .                                            .
.                                      .                                            .
.                                      .                                            .
1.01                                34.38                                        15.62
.                                      .                                            .
.                                      .                                            .
.                                      .                                            .
1.13                                37.08                                        12.92
.                                      .                                            .






  Probability of Cases Falling Between
   Z Scores on One Side of the Mean

What is the probability of a case taking between
130 and 140 days to process, given X̄ = 144.2,
S = 14, and N = 142? (cf. attached table)
    Z1 = (130 – 144.2) / 14 = -1.01
    Z2 = (140 – 144.2) / 14 = -0.30

[Figure: Political Corruption Case Processing Time; area between Z = –1.01 and Z = –0.30 = 22.59%]


The area from the mean to –1.01 = 34.38%
The area from the mean to –0.30 = 11.79%
The area between –1.01 and –0.30 is
    (34.38% – 11.79%) = 22.59%
    The probability of a case taking between
    130 and 140 days = 0.2259
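
The two table rules used so far (add the areas when the Z scores lie on opposite sides of the mean, subtract them when they lie on the same side) can be written as one small function. This is a sketch under the assumption that `statistics.NormalDist` reproduces the table's "area between mean and Z" column; the function names are hypothetical.

```python
from statistics import NormalDist


def area_from_mean(z):
    """Area between the mean and Z (the table's middle column), as a proportion."""
    return abs(NormalDist().cdf(z) - 0.5)


def area_between(z1, z2):
    """Subtract table areas on the same side of the mean, add them on opposite sides."""
    a1, a2 = area_from_mean(z1), area_from_mean(z2)
    same_side = (z1 >= 0) == (z2 >= 0)
    return abs(a1 - a2) if same_side else a1 + a2


same_side_area = area_between(-1.01, -0.30)     # 130 to 140 days
opposite_side_area = area_between(-1.01, 1.13)  # 130 to 160 days

print(f"{same_side_area:.3f}")
print(f"{opposite_side_area:.3f}")
```

Both results match the slide values (0.2259 and 0.7146) to within the rounding of the table entries.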






             Normal Probability Table

                                 Area between                                     Area
Z                                 mean and Z                                    beyond Z
.                                      .                                            .
.                                      .                                            .
.                                      .                                            .
0.30                                11.79                                        38.21
.                                      .                                            .
.                                      .                                            .
1.01                                34.38                                        15.62
.                                      .                                            .
.                                      .                                            .






Cases falling between (cont.)




[Figure: Political Corruption Case Processing Time; area from the mean to –1.01 = 34.38%, area from the mean to –0.30 = 11.79%, area between –1.01 and –0.30 = 22.59%]






  Using the Normal Curve to Compute
           Percentile Ranks
If a variable's distribution approximates a normal
curve, the curve can be used to determine the
percentile rank of a particular value

Example:    A sergeant received a 71 on the
lieutenant's examination, in which the mean = 78
& S = 7.3. What is her percentile rank?

Convert 71 to a Z score
   Z = (71 – 78) / 7.3 = -0.96
[Figure: normal curve; area below Z = -0.96 = 16.85%]

The area below a Z score of -0.96 is 16.85%.
The sergeant scored at the 17th percentile. She
scored above 17% of those taking the test.
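
The percentile-rank calculation can be sketched as follows. It assumes, as the slide does, that the exam scores are approximately normal; `percentile_rank` is a hypothetical helper name, and `statistics.NormalDist` stands in for the printed table.

```python
from statistics import NormalDist


def percentile_rank(score, mean, sd):
    """Return (Z rounded to two decimals, percent of cases below the score)."""
    z = round((score - mean) / sd, 2)
    percent_below = NormalDist().cdf(z) * 100
    return z, percent_below


z, rank = percentile_rank(score=71, mean=78, sd=7.3)

print(z)
print(f"{rank:.2f}%")
```

Rounding the percentage to a whole number gives the 17th percentile reported above.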




How to Determine Whether a Variable is
         Normally Distributed

Three techniques

    Compare the S with the IQR, but only if
     the variable is symmetrical (X̄ ≈ Mdn). If S
     ≈ PSD, the distribution is approximately
     normal (PSD = IQR/1.35).

    Plot a histogram with a normal curve over-
     lay based upon the mean and standard
     deviation of the variable

    Graph a cumulative normal probability
     plot, called a Normal P-P Plot in SPSS


Two Examples

    Time from filing to disposition in cases of
     political corruption

    Sentences for offenders convicted of
     political corruption




        The Standard Deviation & IQR
                                                 *****
        Sentences & Case Processing Time

If a variable is normally distributed, the standard
deviation (S) will be approximately equal to the
pseudo standard deviation (PSD = IQR / 1.35)

This technique should never be used if the
variable is skewed, since the results will be very
misleading. (Assume X̄ ≈ Mdn)


                        S ≈ PSD = (IQR / 1.35)


                                                                  Difference
Variable            IQR           PSD           S                 (S - PSD)

Sentence            12.5          9.3           14.0              4.7

Process Time        6.0           4.4           5.0               0.6

It would appear that the variable "process time"
is near normal since S ≈ PSD (difference of 0.6),
but the distribution of sentences is not normal
(difference of 4.7).
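
The comparison in the table can be reproduced with a short sketch. The divisor 1.35 reflects the fact that the IQR of a normal distribution spans about 1.35 standard deviations, which the first lines verify with `statistics.NormalDist`. The names `pseudo_sd` and `summary` are hypothetical; the IQR and S values are copied from the table above.

```python
from statistics import NormalDist

# For a standard normal curve, the distance between the 25th and 75th
# percentiles is about 1.35 standard deviations.
normal_iqr = NormalDist().inv_cdf(0.75) - NormalDist().inv_cdf(0.25)


def pseudo_sd(iqr):
    """Pseudo standard deviation: IQR / 1.35."""
    return iqr / 1.35


# (IQR, S) pairs copied from the table above
summary = {"Sentence": (12.5, 14.0), "Process Time": (6.0, 5.0)}

results = {}
for name, (iqr, s) in summary.items():
    psd = pseudo_sd(iqr)
    results[name] = (round(psd, 1), round(s - psd, 1))
    print(f"{name}: PSD = {psd:.1f}, S - PSD = {s - psd:.1f}")
```

How small the difference must be to call a variable "near normal" is a judgment call; the slides do not give a cutoff, so none is coded here.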





 Testing for Normality with a Histogram

Another technique for testing the normality of a
variable involves …

    Constructing a histogram of the variable.

    Overlaying on the histogram a normal
    distribution that has the same mean and
    standard deviation as the variable.

The two determinants of a normal distribution
are its mean and standard deviation …



\[
Y = \frac{N}{S\sqrt{2\pi}} \; e^{-x^{2} / (2S^{2})}
\]

S = the standard deviation

x = a deviation score (X - X̄)
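
To show how the formula produces the overlay curve, the sketch below evaluates Y at a given raw value. It assumes N is the sample size and reuses the processing-time values from the earlier slides (mean 144.2, S = 14, N = 142); the function name `normal_curve_height` is hypothetical.

```python
import math


def normal_curve_height(value, mean, sd, n):
    """Height Y of the normal curve at a raw value X, per the formula above."""
    x = value - mean  # deviation score (X minus the mean)
    return n / (sd * math.sqrt(2 * math.pi)) * math.exp(-x**2 / (2 * sd**2))


# Processing-time summary values from the earlier slides
MEAN, SD, N = 144.2, 14.0, 142

peak = normal_curve_height(MEAN, MEAN, SD, N)             # curve is highest at the mean
one_sd_above = normal_curve_height(MEAN + SD, MEAN, SD, N)
one_sd_below = normal_curve_height(MEAN - SD, MEAN, SD, N)

print(f"{peak:.2f} {one_sd_above:.2f} {one_sd_below:.2f}")
```

When the histogram bars are wider than one unit, the heights are usually also multiplied by the bar width so the curve and bars share a scale; the slide's formula does not include that factor, so it is left out here.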






Histogram & Normal Probability Plot






                            Normal P-P Plot
A Normal P-P Plot compares the cumulative rank-ordered
values of a variable with the cumulative
expected normal values, given the sample size,
mean, and standard deviation of the variable.

 A straight line represents the normal
  distribution from the lower left corner to the
  upper right corner of the plot.

 The variable is represented by a series of
  “dots” from the lower left corner to the upper
  right corner of the plot.

 If the dots fall along the line, the
  variable is normally distributed

    If the dots “bow-out” to the right, the
    distribution is skewed right

    If the dots “bow-out” to the left, the
    distribution is skewed left
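
The coordinates of the dots can be computed by hand, as in the sketch below. Two assumptions are made that are not in the slides: the observed cumulative proportion for the value of rank r is taken as (r - 0.5) / n, which is one common convention and not necessarily the one SPSS applies, and the data values are invented purely for illustration.

```python
from statistics import NormalDist, mean, stdev


def pp_plot_points(values):
    """Return (observed, expected) cumulative proportions for a Normal P-P plot."""
    ordered = sorted(values)
    n = len(ordered)
    fitted = NormalDist(mean(ordered), stdev(ordered))  # normal curve with the sample's mean and S
    points = []
    for rank, value in enumerate(ordered, start=1):
        observed = (rank - 0.5) / n    # assumed plotting-position convention
        expected = fitted.cdf(value)   # cumulative probability under the fitted normal
        points.append((observed, expected))
    return points


# Hypothetical processing times in days, for illustration only
sample_days = [120, 131, 138, 142, 144, 147, 151, 158, 169]
points = pp_plot_points(sample_days)

for observed, expected in points:
    print(f"{observed:.3f}  {expected:.3f}")
```

If the variable is close to normal, each pair of numbers is nearly equal, which is what places the dots on the diagonal line.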






Normal P-P Plot (cont.)




Processing time is near normal, but sentences
are skewed right.




                  The Wager on the Existence of God
                      Blaise Pascal (1623-1662)

In Pascal’s book Pensées (published posthumously), a collection of meditations
on human suffering and faith in God, he presents several “wagers” on the
existence of God. The logic he used anticipated by almost 300 years the
pioneering work of mathematicians John von Neumann and Oskar Morgenstern
on the theory of games and decision making under uncertainty.

He rejected the Five Proofs (Quinque Viae) of Aquinas and Aristotle, the
ontological arguments of Anselm, and the cosmological arguments of Descartes.

One of his “wagers” concerns the opportunity costs of decision making
under uncertainty.


                                    The Decision Matrix

                                                         The Reality
     The Decision                          God Exists            God Does Not Exist

     Wager that God exists                 Gain All              Status Quo

     Wager that God does not exist         Misery                Status Quo



Pascal argued that wagering on the existence of God is preferable to wagering
against it, since the eternal payoff of being correct far outweighs
all the other alternatives.






                                Case Studies

Two case studies are associated with this
module, which provide practice using the
binomial expansion and testing whether sample
distributions are normally distributed.

These case studies, and the associated
database, can be found on the statistical
WebPage and are entitled:

  The Binomial Distribution

  The Normal Distribution



