Sampling and Confidence Intervals by KRb6iE


									Statistics and Modelling
         Course
          2011
Topic: Confidence Intervals
   Achievement Standard 90642
Calculate Confidence Intervals for
     Population Parameters
            3 Credits
       Externally Assessed

      NuLake Pages 63101
             LESSON 1 – Sampling
Handout with gaps to fill in – goes with the following slides.

STARTER: Look at the following 2 examples of bad sampling technique
  & discuss what’s wrong in each case.

1. Discuss how you’d obtain a representative sample from our school
   roll.
2. Notes on sampling and inference.
3. Population and Samples ‘Policemen’ worksheet (from Achieving in
Statistics). Complete for HW.
                       Sampling
Describe some faults with each of these sampling
   methods.

    (a) A survey on magazine readership is conducted by phoning
    households between 1 and 4pm.
   • People who aren’t at home during those times cannot be
     surveyed.
   • Some people don’t have a phone.

    (b) A talkback radio station asks listeners to phone in with a
    quick ‘yes’ or ‘no’ answer to the question “Should NZ have
    capital punishment?”
   • Only people who are listening at the time can participate.
   • Self-selected sample: only those with a strong opinion will
     ring in.
                       Sampling
You are asked the question:
“How tall are St. Thomas students?”

• You only have time to measure the heights of 35 students.

Q1: How would you choose which 35 students to measure?

Q2: Once you’ve measured your 35 students’ heights, how would
    you use this data to answer the question: “How tall are St.
    Thomas students?”
          Purpose of a Sample

SAMPLE  →  Make an inference  →  POPULATION

             Sampling terminology
POPULATION:
Target Population: All items under investigation.
    We usually just call it the “Population”.

SAMPLES:
Sample: Subset selected to REPRESENT the population.

Sampling Frame:      A list/database of items from which we select
    our sample. (Should include all items in the Target Population)

 For a sample to be Representative of a given population:
 The Sampling Frame must match the Target Population.

                      Sample statistic       Population parameter
 Number of items      n: Sample size         N: Population size
 Mean                 X̄                      μ
 Standard deviation   s                      σ
 Proportion           p                      π


A representative sample should have…
• Sample-size large enough to allow the results to be meaningful
  (rough guide: sample size of n > 30).
• No Bias – Sample selection is said to be “biased” if some items
  are more likely to be chosen than others. Every item in the
  target population should be equally-likely to be chosen. Random
  selection ensures this.
• Minimal Non-response – difficult to control this.
Example:
A home security firm is hoping to sell as many burglar alarms as
possible to householders in a certain town.
Usually each house only needs one burglar alarm.
Before the firm orders the alarms from their supplier, they wish to
have an indication of how many alarms they might sell.

1.) What is the target population?
       A. all the people who live in the town.
       B. the head of each household.
       C. the houses in the town.
       Answer: C. the houses in the town.

2.) What is the sampling frame?
      A. the electoral roll for the town.
      B. a list of all the people who live in the town.
      C. a list of all the houses in the town.
       Answer: C. a list of all the houses in the town.

3.) How would you select a representative sample of the houses in
the town? (discuss as a class)

HW: Do Population and Samples ‘Policemen’ worksheet. Finish by
Monday. Will mark as a class.
  EXTRA ON SAMPLING
TECHNIQUES IF TIME (schol
       students)

 Otherwise skip to Lesson 3:
Distribution of Sample Means 1
          Extension Lesson:
       Other sampling techniques
Good sampling techniques:
1. Simple Random Sampling
2. Systematic Sampling
3. Stratified Sampling
4. Cluster Sampling

Bad sampling techniques (biased selection):
• Convenience sampling.
• Self-selected sampling.
         Random selection
Q: What does the word “random” actually
mean?

Q: How would you select a student at
random from this school?
21.03
        Simple random sampling.

        Generate 20 different random
        numbers between 1 and 100.
        If a random number has already
        occurred, generate more as needed.
        Calculator formula
        1 + 100×RAN#
        42 67 2 12 77 49 60
        20 45 15 64 7 8 21
        15 64 58 14 29 68 26 90
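
The calculator procedure above can be sketched in Python. This is only an illustration: the function names and the seed are made up for the example, and it assumes the decimal part of 1 + N×RAN# is dropped to give a whole item number from 1 to N.

```python
import random

def calculator_style_number(population_size, rng):
    """Mimic '1 + N × RAN#' with the decimal part dropped: an integer 1..N."""
    return 1 + int(population_size * rng.random())

def simple_random_sample(population_size, sample_size, rng):
    """Keep generating item numbers until sample_size different ones are chosen."""
    if sample_size > population_size:
        raise ValueError("sample size cannot exceed population size")
    chosen = []
    while len(chosen) < sample_size:
        number = calculator_style_number(population_size, rng)
        if number not in chosen:      # discard any repeats
            chosen.append(number)
    return chosen

rng = random.Random(2011)             # arbitrary seed, only for repeatability
sample = simple_random_sample(population_size=100, sample_size=20, rng=rng)
print(sorted(sample))
```

In everyday Python, `random.sample(range(1, 101), 20)` does the same job in one call; the loop is written out here only to mirror the "generate, then discard repeats" steps.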
   1. Simple Random Sampling
1. Obtain a list of all N items in the target population,
   numbering them 1 to N (e.g. the school roll: 1-600).
2. Decide how many you will select for your sample (n).
3. Use the random number generator on your calculator
   to select numbers at random between 1 and N:
On calculator, type: 1 + Population size × RAN#
4. Keep pressing ‘equals’ until you have selected n
   different items. Discard any repeats.

Advantage of SR sampling: Ensures that every item in
  the population has an equal chance of being selected
  – so no chance of bias.

Disadvantage:
• Does not ensure that all subgroups of the population
  are represented in proportion (e.g. some racial, socio-economic
  groups could be over/under-represented).

Select a random sample of 35 students from the St.
Thomas school roll.

HW: Old Sigma Pg. 130 – Ex. 9.1 (all), then Pg. 134 –
Ex. 9.2 – just Q1.
          3 other good sampling techniques

          Systematic sampling
     1.    Obtain a list of all N items in the
           target population (numbered 1–N).
     2.    Pick a random starting point (e.g.
           item number 7).
     3.    Sample every kth item after
           that, where k = N/n, until you have
           selected n items.

          Stratified sampling
     Use when the population consists of
     categories (strata) (e.g. racial groups).
     1.    Divide sampling frame into the
           strata (categories).
     2.    Select a separate random sample
           from each stratum in proportion
           to the percentage of the population
           found in each. (Called Proportional
           Allocation.)

          Cluster sampling
     Use when the population is distributed
     into naturally-occurring groups or
     ‘clusters’ (e.g. towns and cities in NZ).
     Stage 1: Select the clusters:
           Select a representative sample of the
           clusters themselves.
     Stage 2: Select a random sample of
           items within chosen clusters. Must be
           in proportion to the percentage of the
           population found in each. (Called
           Proportional Allocation.)
[Figure: Comparison of samples. Panels: Simple random sampling, Systematic sampling, Stratified sampling, Cluster sampling]
   1. Select a sample of between 30 and 36 students
      from the school roll using each of these 3
      methods.
   2. Write down at least one advantage and at least
      one disadvantage/risk associated with each of
      these 3 techniques.

   HW: Do Old Sigma (2nd edition) p137: Ex. 9.3.
        Systematic sampling.

        To obtain a systematic sample of
        size 20 from this data:
        Choose a starting point at
        random between 1 and 100.
        Using calculator:
        1 + 100×RAN# =
        Suppose this gives 5.87352 → 5.
        So start at item number 5.

        Then choose every kth item,
        where k = N/n
                = 100/20
                = 5. So sample every 5th item.
            Systematic Sampling
1. Obtain a list of all N items in the target population.
2. Decide on your sample size, n .
3. Pick a random starting point (e.g. item number 7)

4. Sample every kth item after that, where k=N/n until
   you have selected n items.

Advantages:
• Ensures that sample is selected from throughout the
  breadth of the sampling frame.
• Convenient and fast – easier to collect info on items that
  are in a sequence (every 5th house) than from a random
  sample where they are scattered all over.

Disadvantage:
Be careful that the list itself has no systematic pattern. If
  every 2nd house on a street were sampled, all would be on
  the same side of the street!
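
A minimal sketch of the systematic procedure in Python, using the N = 100, n = 20 example. The function name is made up for illustration. One assumption is added that the slides do not state: if counting runs past item N, it wraps around to the start of the list, so that any starting point from 1 to N still gives n different items.

```python
import random

def systematic_sample(population_size, sample_size, start):
    """Return item numbers start, start + k, start + 2k, ... with k = N // n.

    Assumption (not from the slides): counting wraps around to item 1
    after item N, so a late starting point still yields n different items.
    """
    step = population_size // sample_size            # k = N/n
    return [(start - 1 + i * step) % population_size + 1
            for i in range(sample_size)]

# The worked example: start at item 5 and take every 5th item.
example = systematic_sample(population_size=100, sample_size=20, start=5)
print(example)   # 5, 10, 15, ..., 100

# A random starting point between 1 and 100 (arbitrary seed for repeatability).
rng = random.Random(7)
random_start = rng.randint(1, 100)
wrapped = systematic_sample(100, 20, random_start)
print(random_start, wrapped)
```

If the starting point is chosen between 1 and k instead, the wrap-around is never needed.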
        Stratified sampling.

        Suppose the avocados are of
        3 different varieties.
        Hass:     1–40         40%
        Fuerte:   41–70        30%
        Hopkins:  71–100       30%
        The number in each stratum of the
        sample should be proportional to
        the number in each group in the
        population.
        Hass:     40% × 20 = 8
        Fuerte:   30% × 20 = 6
        Hopkins:  30% × 20 = 6

        Thus generate random numbers as
        follows:

        Hass:   1–40    8 random nos.
        33 17 12 25 9 9 33 16 39 8
        Fuerte: 41–70   6 random nos.
        58 59 67 43 53 56
        Hopkins: 71–100 6 random nos.
        98 85 96 99 90 81
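
The avocado example can be sketched in Python as follows. The function names and the seed are illustrative only. Note that rounding each stratum's share is an assumption made here; with other percentages the rounded shares may not add up exactly to n.

```python
import random

def proportional_allocation(strata, sample_size):
    """Number to select from each stratum, in proportion to its size."""
    total = sum(len(items) for items in strata.values())
    return {name: round(sample_size * len(items) / total)
            for name, items in strata.items()}

def stratified_sample(strata, sample_size, rng):
    """Take a separate random sample (no repeats) from each stratum."""
    allocation = proportional_allocation(strata, sample_size)
    return {name: rng.sample(strata[name], allocation[name])
            for name in strata}

# Item numbers for the three avocado varieties, as on the slide.
strata = {
    "Hass": range(1, 41),
    "Fuerte": range(41, 71),
    "Hopkins": range(71, 101),
}

allocation = proportional_allocation(strata, sample_size=20)
print(allocation)   # {'Hass': 8, 'Fuerte': 6, 'Hopkins': 6}

rng = random.Random(3)   # arbitrary seed, only for repeatability
sample = stratified_sample(strata, 20, rng)
for name, items in sample.items():
    print(name, sorted(items))
```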
             Stratified sampling
Use when the population consists of categories (strata), and
  you wish to represent each ‘stratum’ proportionally (e.g.
  racial groups, one-story and multi-story homes within a
  city).

1. Obtain a list of all N items in the target population.
2. Decide on your sample size, n .
3. Divide list into the strata (categories).
4. Select a separate random sample from each stratum
   in proportion to the percentage of the population found in
   each.

Proportional Allocation: Selecting from each stratum in
  proportion to its percentage of the population.

E.g. If 12% of a city’s citizens are Pacific Islanders, then
   12% of the sample size should be selected from among
   the Pacific Island citizens.

Advantage: Guaranteed to be representative of each stratum.

Disadvantage: Time-consuming and expensive because you
  must collect information about the strata-sizes in advance.
               Cluster sampling
Use when the population is distributed into naturally-
  occurring groups or ‘clusters’ (e.g. towns and cities in a
  country).

1. Select a representative sample of the clusters
   themselves (there are usually too many clusters to sample from all of them).
2. Select a random sample of items from within each chosen
   cluster.
3. Again, use Proportional Allocation (like with stratified
   samples). Weight the number selected from each cluster
   according to the cluster size.

E.g. Selecting samples of New Zealanders by selecting a
  sample of towns/cities from throughout the country,
  then a proportional random sample from within each.
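
A rough Python sketch of the two-stage procedure above. Everything named here is hypothetical: the towns, their sizes and the function name are invented for illustration, and Stage 1 simply picks clusters at random, which is one possible way of choosing a sample of clusters.

```python
import random

def two_stage_cluster_sample(clusters, clusters_to_pick, sample_size, rng):
    """Pick some clusters, then sample within each in proportion to its size."""
    # Stage 1: choose which clusters to sample from (random choice assumed).
    chosen = rng.sample(sorted(clusters), clusters_to_pick)
    total = sum(len(clusters[name]) for name in chosen)
    sample = {}
    for name in chosen:
        # Stage 2: proportional allocation among the chosen clusters.
        # Rounding means the shares may not add up exactly to sample_size.
        share = round(sample_size * len(clusters[name]) / total)
        sample[name] = rng.sample(clusters[name], share)
    return sample

# Hypothetical towns and population sizes, invented for this example.
sizes = {"Town A": 500, "Town B": 300, "Town C": 200, "Town D": 120, "Town E": 80}
towns = {name: [f"{name} resident {i}" for i in range(1, size + 1)]
         for name, size in sizes.items()}

rng = random.Random(9)   # arbitrary seed, only for repeatability
result = two_stage_cluster_sample(towns, clusters_to_pick=3, sample_size=30, rng=rng)
for name, people in result.items():
    print(name, len(people))
```

Residents of the towns that were not chosen in Stage 1 have no chance of appearing in the sample.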

Advantage:
• Cheaper and faster when sampling from a
  geographically large area (data can be collected in groups
  within chosen clusters rather than being spread out).

Disadvantages:
• Items don’t have an equal chance of selection.
    – Small clusters are unlikely to be sampled from.
    – Items that are not in clusters are excluded altogether.
    E.g. farmers or people in small rural communities may have no chance
        of being selected.

•   Requires prior knowledge of cluster sizes.

HW: Memorise the 4 types of sampling techniques and the
    advantages & disadvantages of each.
         Cluster sampling.

        Here is one way of obtaining a
        cluster sample of size 20.
        Choose four clusters, each of 5
        avocados, by selecting four
        numbers at random from the
        data, and taking them as the
        middle item of a ‘cross’.
        If clusters overlap or run outside
        the boundaries, choose another.
        Spreadsheet formula
        99×RAN# + 1 =
        62 22 2 68 56
        Note: Depending on how a cluster is
        defined, it can exclude some items or make other items more likely
        to be chosen than under other sampling methods.
 LESSON 2 – Distribution of Sample Means

The point of today: Get confident at calculating
 probabilities involving the distribution of sample
 means.
  – Mark HW: “Achieving in Statistics”: pages 30.
  – Handout to fill in (goes with following slides)


• Then do Achieving in Statistics: pages 31 & 32.
       The Distribution of Sample Means
STARTER ACTIVITY: Each class member has 5 dice.
Toss your 5 dice and record the number facing upward for
each. Add up to get the total for your 5.
My total value from 5 tosses = _____
My mean score for each die roll = ________

Your group of 5 dice tosses represents a sample of size n=5.
Between us, as a class, we tossed 5 dice ________ times.
We got means of: ________________________________

This illustrates the fact that sample means vary.
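
The dice activity can also be simulated. The sketch below assumes a class of 25 students (a made-up number) who each toss 5 fair dice; the seed is arbitrary.

```python
import random

rng = random.Random(5)    # arbitrary seed, only for repeatability
class_size = 25           # hypothetical number of students
dice_per_student = 5      # each student's tosses form a sample of size n = 5

sample_means = []
for _ in range(class_size):
    tosses = [rng.randint(1, 6) for _ in range(dice_per_student)]
    sample_means.append(sum(tosses) / dice_per_student)

print(sample_means)
print("smallest mean:", min(sample_means), "largest mean:", max(sample_means))
```

Each student's mean is a different sample mean from the same population of die rolls.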

A random sample can be thought of as a collection of n items
(n=5 dice-tosses in the experiment we did last time), each of
which has a value that we measure (the number facing upward
when a die lands in this case).

When you select items at random from any population, the
value of each item, X, is a random variable (e.g. height, weight,
volume of drink in soft drink bottles etc.).

Select a random sample of size n from any population:

The sample mean,   $\bar{X} = \dfrac{X_1 + X_2 + X_3 + \dots + X_n}{n}$

Different samples will produce different mean values, just
like we got different mean values from tossing our dice.
       The Distribution of Sample Means
Select a random sample of size n from any population:

The sample mean,   $\bar{X} = \dfrac{X_1 + X_2 + X_3 + \dots + X_n}{n}$

Different samples will produce different mean values, just
like we got different mean values from tossing our dice.

The sample mean:
• is a random variable itself because it varies at random
  from sample to sample.

• is approximately normally distributed about the population
  mean μ, even if the population from which it is drawn is not
  normally distributed, provided the samples are large enough.
  Rule of thumb is n > 30.

In other words the sample means will ‘average out’ towards
the population mean. This result is called the
‘Central Limit Theorem’,   i.e.   $\mu_{\bar{X}} = \mu$
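
  Distribution of Sample Means, $\bar{X}$

[Figure: normal curve showing the distribution of sample means, centred at $\mu_{\bar{X}}$]

Mean of sample means:   $\mu_{\bar{X}} = \mu$

Std. deviation of distribution of sample means (standard error):   $\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}}$
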
Since sample means are normally distributed about the
population mean, we can use the properties of a normal
distribution curve to predict the percentage of samples that
will produce means within a particular distance from the
population mean.
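
The dice experiment and the two results for the distribution of sample means (mean equal to the population mean, spread equal to the standard error) can be tried out by simulation. The Python sketch below is not part of the original handout. It assumes a fair six-sided die, and the function name `simulate_sample_means`, the sample size of 36 and the 10,000 repeats are illustrative choices only.

```python
import random
import statistics


def simulate_sample_means(n, num_samples, seed=1):
    """Roll a fair die n times per sample; return the list of sample means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = []
    for _ in range(num_samples):
        rolls = [rng.randint(1, 6) for _ in range(n)]
        means.append(sum(rolls) / n)
    return means


# Population values for one roll of a fair die
POP_MEAN = 3.5
POP_SD = (35 / 12) ** 0.5

n = 36
means = simulate_sample_means(n, num_samples=10_000)

print("mean of sample means :", round(statistics.mean(means), 3))
print("population mean      :", POP_MEAN)
print("sd of sample means   :", round(statistics.pstdev(means), 3))
print("sigma / sqrt(n)      :", round(POP_SD / n ** 0.5, 3))
```

The first pair of printed values should be close to each other, and so should the second pair, even though a single die roll is not normally distributed.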

Example:
The results of a census of all 17 year-old males in NZ showed
a mean height of μ = 177cm, with σ = 9cm.
If a random sample of 36 seventeen year-old NZ males is
taken, calculate:
a) The expected value of the sample mean.

        $E(\bar{X}) = \mu_{\bar{X}}$
And, by the Central Limit Theorem,   $\mu_{\bar{X}} = \mu$,   the population mean.
        $E(\bar{X}) = 177$ cm

b) The standard deviation (standard error) of the sample
   mean.
        $\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}} = \dfrac{9}{\sqrt{36}} = 1.5$ cm

c) What percentage of such samples would have a mean
   that is:
   (i) Within 3cm of the population mean of 177cm?

$P(174 < \bar{X} < 180, \text{ if } \mu = 177) = P\left(\dfrac{174 - 177}{1.5} < z < \dfrac{180 - 177}{1.5}\right)$

                             $= P(-2 < z < 2)$

                             $= 2 \times 0.47725$

                             $= 0.9545$ (4sf)    So 95.45% of samples.

   (ii) More than 5cm away from the population mean?

$P(\bar{X} < 172 \text{ or } \bar{X} > 182) = 1 - P(172 < \bar{X} < 182)$

                        $= 1 - P\left(\dfrac{172 - 177}{1.5} < Z < \dfrac{182 - 177}{1.5}\right)$

                        $= 1 - P\left(-3\tfrac{1}{3} < Z < 3\tfrac{1}{3}\right)$

                        $= 1 - 0.99914$

                        $= 0.00086$    So only about 0.09% of samples. Very rare.

         Homework:
         Do Achieving in Statistics: pages 31 & 32.
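
Parts b) and c) of the height example can be checked on a computer instead of with tables. The Python sketch below is not part of the original handout. It assumes Python 3.8 or later, where the standard library provides `statistics.NormalDist`, and the variable names are illustrative.

```python
from statistics import NormalDist

mu = 177     # population mean height in cm (from the census)
sigma = 9    # population standard deviation in cm
n = 36       # sample size

# b) standard error of the sample mean: sigma / sqrt(n)
std_error = sigma / n ** 0.5

# Distribution of sample means: normal, centred on mu, spread = standard error
sample_means = NormalDist(mu, std_error)

# c)(i) proportion of samples whose mean is within 3 cm of mu
p_within_3 = sample_means.cdf(mu + 3) - sample_means.cdf(mu - 3)

# c)(ii) proportion of samples whose mean is more than 5 cm away from mu
p_beyond_5 = 1 - (sample_means.cdf(mu + 5) - sample_means.cdf(mu - 5))

print(f"standard error      = {std_error:.1f} cm")
print(f"within 3 cm of mean = {p_within_3:.4f}")
print(f"beyond 5 cm of mean = {p_beyond_5:.5f}")
```

Changing `n` shows how quickly the standard error shrinks as the sample size grows.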
                       Extension
The point of today: Look at when we can draw conclusions
  about the population mean based on a sample mean.

STARTER: Look at the applet that demonstrates the
  distribution of sample means:
  SIM - onlinestatbook.com.SLASH.rvls.html.

• Work through the following examples as a class (handout to
  fill in).
• Then do Sigma p184 – Ex. 11.5 (old version).
           or Sigma p66 – Ex. 3.05 (new version)
Example:
The census of all NZ seventeen year-old males from yesterday’s example
was actually conducted back in 1987. It had a mean of μ = 177cm and σ of
9cm.
A random sample of 36 seventeen year-old NZ males was selected just last
year. This sample found a mean height of 180cm.

(a) What is the probability that a random sample of 36 students selected
    from a population with μ = 177cm and σ = 9cm would give a mean height
    greater than 180cm?

$P(\bar{X} > 180, \text{ if } \mu = 177) = P\left(z > \dfrac{\bar{X} - \mu}{\sigma / \sqrt{n}}\right)$

                            $= P\left(z > \dfrac{180 - 177}{9 / \sqrt{36}}\right)$

                            $= P(z > 2)$

                            $= 0.02275$
                           = P z  2
                           = 0.02275
(b) Based on this answer, what percentage of samples would have means of
180cm or higher if the population mean was 177cm (like in 1987)?
Answer: Only 2.275%

(c) Sketch a normal distribution curve for the distribution of sample
means from a population with m = 177cm and standard deviation of s = 9cm




(d) So it is very unlikely      that a randomly selected sample
taken from a population        with mean 177cm and standard deviation of
9cm would have a mean as high as this one.
                           = P z  2
   Do Sigma:
                           = 0.02275
    Based on this edition: Pg. 184 – of samples
             nd
(b) In old (2 ) answer, what percentage Ex. 11.4. would have means of
180cm or higher if the population mean was 177cm (like in 1987)?
Answer: in NEW edition: Pg. 66 – Ex. 3.04.
    OR Only 2.275%

(c) Sketch a normal distribution curve for the distribution of sample
means from a population with m = 177cm and standard deviation of s = 9cm




(d) So it is very unlikely       that a randomly selected sample
taken from a population         with mean 177cm and standard deviation of
9cm would have a mean as high as this one. Yet it did.
(e) What is the most likely explanation?
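
Part (a) of the 1987 census example can be checked numerically. The Python sketch below is not part of the original handout. It assumes Python 3.8 or later for `statistics.NormalDist`, and the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma, n = 177, 9, 36   # 1987 census values and the sample size
observed_mean = 180         # mean height found in last year's sample

# Standardise the observed sample mean using the standard error
z = (observed_mean - mu) / (sigma / n ** 0.5)

# Upper-tail probability of the standard normal distribution
p_upper = 1 - NormalDist().cdf(z)

print(f"z = {z:.1f}")
print(f"P(sample mean > {observed_mean}) = {p_upper:.5f}")
```

A probability this small is what makes the 180cm sample mean surprising if the population mean were still 177cm.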
   LESSON 4 – C.I.s for Means 1

• Today’s theme: Solving problems involving
  Confidence Intervals for Means.



• Students do NuLake Ch 2.5 – Calculate
  Confidence Intervals of means.

http://www.youtube.com/watch?v=Ohz-PZqaMtk
Question: If the population mean height of 17 year-old NZ
    males is 177cm with σ of 9cm, within what interval would we
    expect the means of 95% of samples of size 36 to lie?

Notice that the middle 95% of the area under the normal curve
means half on each side of the mean,
i.e. 47.5% (or 0.475) on each side.

Looking up 0.475 on the tables gives z = 1.96.

[Figure: normal curve with 47.5% of the area on each side of the mean, between z = -1.96 and z = 1.96]

So when we calculate the mean from a random sample we expect
    that, 95% of the time, it will be within ±1.96 standard
    errors of the popn mean, μ.

Now, work out the lower and upper limits of the interval
within which you’d expect 95% of sample means to lie
if each sample has 36 people in it.

        $177 \pm 1.96 \times 1.5 = 177 \pm 2.94$

Conclusion: So 95% of samples of size 36 from this population
will produce means between 174.06cm and 179.94cm.
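
The limits of this interval can also be worked out in Python rather than from tables. This sketch is not part of the original handout. It assumes Python 3.8 or later for `statistics.NormalDist`, and the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma, n = 177, 9, 36
std_error = sigma / n ** 0.5   # 1.5 cm

# z-value that leaves 2.5% in each tail, i.e. 47.5% on each side of the mean
z = NormalDist().inv_cdf(0.975)

lower = mu - z * std_error
upper = mu + z * std_error

print(f"z = {z:.2f}")
print(f"95% of sample means lie between {lower:.2f} cm and {upper:.2f} cm")
```

Replacing 0.975 with 0.995 gives the z-value for the middle 99% instead.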
Problem:
In real life, we almost never know the population mean (or standard
deviation).
We only have enough resources to conduct ONE random sample and
use it to estimate (infer) the population mean.
How can our knowledge of the distribution of sample means help
us here?

Answer: We construct an interval within which we think the population
    mean lies.
        Estimate of \( \mu = \bar{X} \pm \) margin of error.
This is known as a Confidence Interval.

A 95% Confidence Interval for the population
mean is an interval that has a 95% probability
of containing the population mean.

    Diagram on board
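One way to see what the "95%" refers to is to simulate many random samples and count how often the interval built around each sample mean contains the population mean. The sketch below is only an illustration: the population values are borrowed from the height question, the number of trials and the seed are arbitrary choices, and σ is treated as known.

```python
import random
from math import sqrt

random.seed(1)                    # fixed seed so the run is repeatable

MU, SIGMA, N = 177, 9, 36         # assumed population mean, sd and sample size
Z = 1.96                          # z-value for 95% confidence
TRIALS = 5000                     # number of simulated random samples (arbitrary)

margin = Z * SIGMA / sqrt(N)      # margin of error, the same for every sample

hits = 0
for _ in range(TRIALS):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    xbar = sum(sample) / N
    # Does this sample's interval contain the population mean?
    if xbar - margin < MU < xbar + margin:
        hits += 1

coverage = hits / TRIALS
print(f"Proportion of intervals containing the population mean: {coverage:.3f}")
```

The printed proportion should come out close to 0.95, although it varies a little with the seed and the number of trials.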
A 95% Confidence Interval for the population
mean is an interval that has a 95% probability
of containing the population mean.
A 95% confidence interval for μ is \( \bar{X} \pm 1.96 \times \frac{\sigma}{\sqrt{n}} \)
A 99% confidence interval for μ is \( \bar{X} \pm 2.576 \times \frac{\sigma}{\sqrt{n}} \)
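The multipliers 1.96 and 2.576 are the z-values that enclose the central 95% and 99% of the standard normal distribution. As a small sketch, they can be recovered with `statistics.NormalDist` from the Python standard library; the helper name `z_value` is made up for this illustration.

```python
from statistics import NormalDist

def z_value(confidence):
    """z such that the central `confidence` area of N(0, 1) lies within +/- z."""
    tail = (1 - confidence) / 2           # area left in each tail
    return NormalDist().inv_cdf(1 - tail)

z95 = z_value(0.95)
z99 = z_value(0.99)
print(f"95% confidence: z = {z95:.2f}")
print(f"99% confidence: z = {z99:.3f}")
```

The same helper gives the multiplier for any other confidence level, e.g. `z_value(0.90)`.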

Example 1:
A soft drink is sold in bottles. The amount of drink in each bottle is
normally distributed with a standard deviation of 40mL. The mean volume
of drink in a random sample of 100 such bottles is 300mL. Construct a
95% confidence interval for the true mean volume of drink per bottle.

Solution:
There is a 95% probability that the interval 300mL ± 1.96 standard
  errors contains the true population mean.

A 95% C.I. for the population mean μ is:
\[ \bar{X} \pm \text{Margin of Error} \]
\[ = \bar{X} \pm z \times \text{Standard Error of the sample mean} \]
\[ = 300 \pm 1.96 \times \frac{40}{\sqrt{100}} \]
\[ = 300\text{mL} \pm 7.84\text{mL} \]

ANSWER:
The 95% CI for the population mean is:
292.2mL < μ < 307.8mL

Now calculate the 99% C.I. Will it be wider or narrower?
A 99% C.I. is
\[ 300 \pm z_{0.99/2} \times \frac{\sigma}{\sqrt{n}} \]
\[ = 300 \pm 2.576 \times \frac{40}{\sqrt{100}} \]
\[ = 300\text{mL} \pm 10.304\text{mL} \]

ANSWER:
The 99% CI for the population mean is:
289.7mL < μ < 310.3mL

Copy examples, then do NuLake Ex. 2.5: p81-84.
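The two intervals for the soft drink example can be checked with a few lines of code. This is a sketch only: the function name `mean_confidence_interval` is invented here, and it assumes the population standard deviation is known, as in the example.

```python
from math import sqrt

def mean_confidence_interval(xbar, sigma, n, z):
    """Return (lower, upper, margin) for xbar +/- z * sigma / sqrt(n)."""
    margin = z * sigma / sqrt(n)
    return xbar - margin, xbar + margin, margin

# Example 1: sample mean 300mL, sigma 40mL, n = 100
ci95 = mean_confidence_interval(300, 40, 100, 1.96)
ci99 = mean_confidence_interval(300, 40, 100, 2.576)

for label, (lower, upper, margin) in (("95%", ci95), ("99%", ci99)):
    print(f"{label} CI: {lower:.1f}mL < mu < {upper:.1f}mL "
          f"(margin of error {margin:.3f}mL)")
```

The 99% interval has the larger margin of error, so it is the wider of the two: more confidence costs precision.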
 When we aren’t told the population
      standard deviation σ.

If we aren’t given the population standard deviation
  σ, then use the sample standard deviation s
  as an estimate.

This is OK provided the sample size is large
 enough (n > 30).
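As an illustration of substituting s for σ, the sketch below builds a 95% interval from a sample alone. The sample is simulated (made-up) data generated with a fixed seed, not real measurements, and the sample size of 50 is an arbitrary choice above the n > 30 rule of thumb.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(2011)
# Simulated (made-up) sample of 50 bottle volumes in mL; not real data.
sample = [random.gauss(300, 40) for _ in range(50)]

n = len(sample)
xbar = mean(sample)
s = stdev(sample)        # sample standard deviation (divides by n - 1)

# Rule of thumb from the slide: only use s in place of sigma when n > 30
assert n > 30

margin = 1.96 * s / sqrt(n)
print(f"Sample mean {xbar:.1f}mL, sample sd {s:.1f}mL")
print(f"95% CI: {xbar - margin:.1f}mL < mu < {xbar + margin:.1f}mL")
```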
        LESSON 5 – C.I.s for Means 2
The purpose of today:
• Memorise definition of a confidence interval.
• Get confident at constructing confidence intervals for
  population means.

To do today:
1.   Watch youtube clip: http://www.youtube.com/watch?v=Ohz-PZqaMtk
2.   Interpret C.I. from yesterday’s e.g. in context.
3.   Finish NuLake 2.5.
4.   Do new Sigma p75 - Ex. 4.01: To end of Q14 compulsory.
5.   Q15-17 are extra for experts.
   2008 NCEA exam question:




Do new Sigma p75 - Ex. 4.01: To
end of Q14 compulsory.
Q1517 are extra for experts.
    LESSON 6 – SAMPLE SIZE
          (MEANS)
• Today’s theme: Calculate the required
  sample size to meet a set of specified
  conditions for a Confidence Interval for the
  population MEAN.

• Do Sigma (old): Ex. 14.2 – pg. 230.
 (New version: Ex. 4.02 – pg. 79)
 Calculating the minimum sample size - means.
The confidence interval formula for estimating the population mean,
μ, is:
\[ \bar{X} \pm z \times \frac{\sigma}{\sqrt{n}} \]
The margin of error, E, is the distance between the sample mean
and the upper and lower limits of this interval.
\[ \text{Margin of Error, } E = z \times \frac{\sigma}{\sqrt{n}} \]

For example, a confidence interval of 18cm < μ < 22cm can also be
expressed as 20cm ± 2cm. The margin of error is 2cm. Our
estimate is “accurate to within 2cm”.
Given a particular level of confidence α, we can calculate how big a
sample is necessary to estimate μ to a required accuracy or
margin of error, E.
E.g: A survey is to be conducted to determine the mean income of a
   group of workers. A pilot survey gives σ ≈ $100. How large must the
   sample be if the mean income is to be estimated to within $20 using
   a 95% confidence interval?

Solution: A confidence interval for the mean income μ is: \( \bar{X} \pm 1.96 \times \frac{\sigma}{\sqrt{n}} \)
For the income to be found to within $20, we need:

\[ 1.96 \times \frac{100}{\sqrt{n}} < 20 \]
\[ \frac{196}{\sqrt{n}} < 20 \]
\[ \frac{196^2}{n} < 20^2 \qquad \text{(squaring both sides)} \]
\[ \frac{196^2}{20^2} < n \]
\[ n > 96.04 \]

Answer:
A minimum sample size of 97 is needed.

When you’ve copied down this e.g:
  Do Sigma (new): Ex. 4.02 – pg. 79
  (or Old version: Ex. 14.2 – pg. 230).
 FINISH FOR H.W.
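The same calculation can be sketched in code: work out (zσ/E)² and round up to the next whole number, since a sample size must be an integer. The function name `minimum_sample_size` is hypothetical and chosen for this illustration.

```python
from math import ceil

def minimum_sample_size(z, sigma, margin_of_error):
    """Round (z * sigma / E)^2 up to the next whole number."""
    exact = (z * sigma / margin_of_error) ** 2
    return ceil(exact)

# Income survey: sigma about $100, estimate to within $20, 95% confidence
n_required = minimum_sample_size(1.96, 100, 20)
print(f"Minimum sample size: {n_required}")
```

Halving the margin of error to $10 roughly quadruples the required sample, because n depends on 1/E².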
   Formula for calculating minimum
             sample size.

\[ n = \left( \frac{z\sigma}{E} \right)^2 \]

Where E = Margin of Error.
i.e. half of C.I. width.
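The formula follows from rearranging the margin of error expression for n. A short sketch of the steps, using the same symbols as the slides:

```latex
\begin{align*}
E &= z \times \frac{\sigma}{\sqrt{n}} \\
\sqrt{n} &= \frac{z\sigma}{E} \\
n &= \left( \frac{z\sigma}{E} \right)^2
\end{align*}
```

With z = 1.96, σ = 100 and E = 20 this gives n = 9.8² = 96.04, which is rounded up to 97.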
   Sample-size question from 2007
       NCEA External Exam
A random sample of size n is taken from a population
  having a known standard deviation σ. A 95%
  confidence interval for the population mean is
  calculated using the sample mean.
A second random sample of size 2n is taken from the
  same population and a 95% confidence interval for
  the population mean is calculated using its sample
  mean.
How many times greater is the width of the first
  confidence interval than the width of the second
  confidence interval?
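One possible approach, sketched here as a hint rather than an official marking answer: the width of a confidence interval is twice the margin of error, so the two widths can be compared directly.

```latex
\begin{align*}
W_1 &= 2z\,\frac{\sigma}{\sqrt{n}}, \qquad
W_2 = 2z\,\frac{\sigma}{\sqrt{2n}} \\
\frac{W_1}{W_2} &= \frac{\sqrt{2n}}{\sqrt{n}} = \sqrt{2} \approx 1.41
\end{align*}
```

Under this reasoning, the first interval is about 1.41 times as wide as the second.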
   LESSON 7 – Intro to Confidence
      Intervals for Proportions
The points of today:
• Introduction to Distribution of Sample PROPORTIONS.

• Construct confidence intervals for Population Proportions.

• Notes on distribution of sample proportions (handout).
• Do handout on distribution of sample proportions
  (Achieving in Statistics page 33).
• How to construct a C.I. for a proportion.
• HW: NuLake Ex. 2.6.
   The Distribution of Sample Proportions
E.g. Political Opinion Polls - National vs Labour.

2 possible outcomes where p is the proportion of successful outcomes in n
   trials.

If a sequence of n independent trials results in x successes, then x has a
Binomial distribution.

A point estimator of the population proportion, p, is the sample proportion
\( \hat{p} = \frac{x}{n} \).

With a sufficient sample size (rule of thumb n > 30), the distribution of
sample proportions \( \hat{p} \) is approximately normal and…

\[ E(\hat{p}) = \mu_{\hat{p}} = p \qquad \sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}} \]
By the Central Limit Theorem

1. Do handout on distribution of sample proportions.
   (Will do Q1 table on board as a class)
Next slide:

The proofs of the formulae for mean
and standard deviation of the
distribution of sample proportions
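With a sufficient sample size (rule of thumb n > 30), the distribution of
sample proportions \( \hat{p} \) is approximately normal and…

\[ E(\hat{p}) = \mu_{\hat{p}} = p \qquad \sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}} \]

Proof (mean):
\[ E(\hat{p}) = E\left(\frac{X}{n}\right) = \frac{1}{n}E(X) = \frac{1}{n} \times np = p \]
Since, for the Binomial Distribution, \( \mu = np \).

Proof (standard deviation):
\[ \mathrm{Var}(\hat{p}) = \mathrm{Var}\left(\frac{X}{n}\right) = \mathrm{Var}\left(\frac{1}{n}X\right) = \left(\frac{1}{n}\right)^2 \mathrm{Var}(X) = \frac{1}{n^2} \times np(1-p) = \frac{p(1-p)}{n} \]
Since, for the Binomial Distribution, \( \sigma^2 = np(1-p) \).
\[ \sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}} \]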
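The two results can also be checked numerically. The sketch below simulates many samples, computes the proportion of successes in each, and compares the mean and standard deviation of those proportions with p and √(p(1−p)/n). The values of p, n, the number of samples and the seed are arbitrary choices for illustration.

```python
import random
from math import sqrt
from statistics import mean, pstdev

random.seed(42)

P = 0.55        # assumed population proportion (illustrative)
N = 500         # trials per sample
SAMPLES = 2000  # number of simulated samples (arbitrary)

def sample_proportion(p, n):
    """Proportion of successes in n independent trials with success probability p."""
    successes = sum(1 for _ in range(n) if random.random() < p)
    return successes / n

proportions = [sample_proportion(P, N) for _ in range(SAMPLES)]

simulated_mean = mean(proportions)
simulated_sd = pstdev(proportions)
theoretical_sd = sqrt(P * (1 - P) / N)

print(f"Mean of sample proportions: {simulated_mean:.4f} (theory: {P})")
print(f"SD of sample proportions:   {simulated_sd:.4f} (theory: {theoretical_sd:.4f})")
```

The simulated values should land close to the theoretical ones, with small differences due to random variation.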
           Confidence Intervals for Proportions
Example: Political opinion polls.
500 New Zealanders aged 18 and over were selected at random for an opinion
poll. They were asked to indicate whether Labour or National would be their
preferred political party. 275 voted for National.
Find a 95% confidence interval for the true proportion of all NZers who
favour National.

Solution: Our point estimate for p is \( \hat{p} = \frac{275}{500} = 0.55 \)
There is a 95% probability that the interval 0.55 ± 1.96 standard errors
contains the true population proportion who would prefer National.
A 95% C.I. is
\[ \hat{p} \pm z\sigma_{\hat{p}} = \hat{p} \pm z \times \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \]
\[ = 0.55 \pm 1.96 \times \sqrt{\frac{0.55 \times 0.45}{500}} \]
= 0.55 ± _____   (Margin of Error E)
Example: Political opinion polls.
500 New Zealanders aged 18 and over were selected at random for an opinion
poll. They were asked to indicate whether Labour or National would be their
preferred political party. 275 voted for National.
Find a 95% confidence interval for the true proportion of all NZers who
favoured National.
                                          275
Solution: Our point estimate for p is p =            = 0 .5 5
                                          500
There is a 95% probability that the interval 0.55 + 1.96 standard errors
contains the true population proportion who would prefer National.
                                                       p (1  p )
A 95% C.I. is       p  zs    p   =p         z
                                                           n
                                                         0 . 55  0 . 45
                                    = 0.55  1 .96           500
                                                                Margin of
                                    = 0.55    _____
                                                                Error E
Example: Political opinion polls.
500 New Zealanders aged 18 and over were selected at random for an opinion
poll. They were asked to indicate whether Labour or National would be their
preferred political party. 275 voted for National.
Find a 95% confidence interval for the true proportion of all NZers who
favoured National.
                                            275
Solution: Our point estimate for p is p =             = 0 .5 5
                                            500
There is a 95% probability that the interval 0.55 + 1.96 standard errors
contains the true population proportion who would prefer National.
                                                          p (1  p )
A 95% C.I. is       p  zs     p   =p       z
                                                              n
                                                            0 . 55  0 . 45
                                    = 0.55    1 . 96 
                                                                 500
                                                                 Margin of
                                    = 0.55  0 . 0 4 3 6 1       Error E

ANSWER: The 95% CI for the proportion in favour of National is
______ < p < _______
Example: Political opinion polls.
500 New Zealanders aged 18 and over were selected at random for an opinion
poll. They were asked to indicate whether Labour or National would be their
preferred political party. 275 voted for National.
Find a 95% confidence interval for the true proportion of all NZers who
favoured National.
                                            275
Solution: Our point estimate for p is p =             = 0 .5 5
                                            500
There is a 95% probability that the interval 0.55 + 1.96 standard errors
contains the true population proportion who would prefer National.
                                                          p (1  p )
A 95% C.I. is       p  zs     p   =p       z
                                                              n
                                                            0 . 55  0 . 45
                                    = 0.55    1 . 96 
                                                                 500
                                                                  Margin of
                                    = 0.55  0 . 0 4 3 6 1        Error E
ANSWER: The 95% CI for the proportion in favour of National is
0.5064 < p < _______
Example: Political opinion polls.
500 New Zealanders aged 18 and over were selected at random for an opinion
poll. They were asked to indicate whether Labour or National would be their
preferred political party. 275 voted for National.
Find a 95% confidence interval for the true proportion of all NZers who
favoured National.

Solution: Our point estimate for π is $p = \frac{275}{500} = 0.55$

We can be 95% confident that the interval 0.55 ± 1.96 standard errors
contains the true population proportion who would prefer National.

A 95% C.I. is   $p \pm z\,\sigma_p = p \pm z\sqrt{\frac{\pi(1-\pi)}{n}}$
                $\approx 0.55 \pm 1.96 \times \sqrt{\frac{0.55 \times 0.45}{500}}$     (using p in place of the unknown π)
                $= 0.55 \pm 0.04361$     (Margin of Error E)

ANSWER: The 95% CI for the proportion in favour of National is
0.5064 < π < 0.5936

HW: Do NuLake Ex. 2.6 – CIs for proportions.

PONDER THIS: Based on this opinion poll, does National have a
STATISTICALLY SIGNIFICANT majority?
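The opinion poll calculation can be reproduced with a few lines of Python. This is a minimal sketch of the large-sample method used in these notes; the function name `proportion_ci` is an assumption made for illustration.

```python
from math import sqrt

def proportion_ci(successes, n, z=1.96):
    """Large-sample confidence interval for a population proportion.

    Uses the sample proportion p in place of the unknown pi when
    estimating the standard error, as in the worked example.
    """
    p = successes / n
    margin = z * sqrt(p * (1 - p) / n)
    return p, margin, p - margin, p + margin

p, E, lower, upper = proportion_ci(275, 500)
print(f"p = {p:.2f}, margin of error E = {E:.5f}")  # p = 0.55, margin of error E = 0.04361
print(f"{lower:.4f} < pi < {upper:.4f}")            # 0.5064 < pi < 0.5936
```

Changing `z` gives other confidence levels (for example 2.576 for 99%).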
 LESSON 8 – Practice constructing
      C.I.s for Proportions
The point of today:
• Do lots of practice involving confidence intervals
  for Population Proportions.

Go over any homework questions – NuLake
 p87,88: Ch 2.6 – C.I.s for proportions.

Then do Sigma pg. 232 – Ex. 14.3 (old version).
or in new version: pg. 88 - Ex. 5.01. Finish for HW.
        LESSON 9 – SAMPLE SIZE
            (PROPORTIONS)
• Today’s theme: Calculate the required sample size to
  meet a set of specified conditions for a Confidence
  Interval for the population PROPORTION.
• Key point – for minimum sample size, if not told p,
  assume p=0.5 as this gives the greatest margin of error
  (prepared for the worst).

 Do Sigma: old edition – p235 – Ex. 14.4
        or new edition – p91 - Ex. 5.02
   Calculating the minimum sample size - proportions.
The confidence interval formula for estimating the population
proportion, π, is:
        $p \pm z\sqrt{\frac{\pi(1-\pi)}{n}}$

The margin of error, E, is the distance between the sample
proportion and the upper and lower limits of this interval.

        Margin of Error, $E = z\sqrt{\frac{\pi(1-\pi)}{n}}$

For example, a confidence interval of 0.37 < π < 0.43 can also be
expressed as 0.4 ± 0.03. The margin of error is 0.03. Our estimate
is “accurate to within 0.03”.

The sample size depends on three factors:
1. The level of confidence required, α.
2. The true value of π, which will often be unknown.
3. The accuracy required,
   i.e. the margin of error, E, we are willing to accept.
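The two ideas above can be sketched in code: reading the estimate and margin of error off an interval, and seeing how the margin of error responds to the sample size. The function names and the trial values of n are assumptions chosen for illustration only.

```python
from math import sqrt

def interval_to_estimate_and_margin(lower, upper):
    """Rewrite an interval lower < pi < upper as estimate ± margin."""
    centre = (lower + upper) / 2
    margin = (upper - lower) / 2
    return centre, margin

def margin_of_error(pi, n, z):
    """E = z * sqrt(pi(1 - pi) / n)."""
    return z * sqrt(pi * (1 - pi) / n)

centre, E = interval_to_estimate_and_margin(0.37, 0.43)
print(round(centre, 2), round(E, 2))  # 0.4 0.03

# Illustrative sample sizes (assumed): a larger n gives a smaller margin of error.
for n in (100, 400, 1600):
    print(n, round(margin_of_error(0.4, n, 1.96), 4))
```

Because n sits under a square root, quadrupling the sample size only halves the margin of error.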
Example
        An international airline is thinking of making smoking illegal on its
aircraft. Before making the decision it wishes to estimate the proportion
of smokers in the population of passengers on its planes by taking a random
sample. How big a sample must it take to be 95% sure that the value so
obtained does not differ from the true proportion by more than 0.05?
Solution: A 95% confidence interval for the proportion of smokers on
all planes, π, is:
        $p \pm 1.96\sqrt{\frac{\pi(1-\pi)}{n}}$

For the proportion to be found to within 0.05, we need:   Margin of Error < 0.05

        $1.96\sqrt{\frac{\pi(1-\pi)}{n}} < 0.05$

PROBLEM! We don’t have a value for π. That’s the very thing we’re
trying to estimate!! To get around this problem we have 3 options:
1. Use a value of π that has held in the past (previous samples).
2. Take a small pilot survey, and use the sample proportion p from that as
   an estimate of π.
3. Use π = 0.5. This allows for the greatest possible error because the
   maximum possible value of π(1 − π) occurs when both π and (1 − π) are = ½,
   i.e. when π(1 − π) = 0.5 × 0.5
                      = 0.25
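Option 3 rests on the claim that π(1 − π) is largest at π = 0.5. A quick way to see this is to evaluate the product over a grid of values, as in the sketch below. The grid spacing of 0.01 is an arbitrary choice for illustration.

```python
# Evaluate pi * (1 - pi) for pi = 0.00, 0.01, ..., 1.00
values = [i / 100 for i in range(101)]
products = {pi: pi * (1 - pi) for pi in values}

# Find the value of pi that gives the largest product
worst_pi = max(products, key=products.get)
print(worst_pi, products[worst_pi])  # 0.5 0.25

# A few neighbouring values for comparison
for pi in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(pi, round(products[pi], 2))
```

Any other value of π gives a smaller product, so a sample size based on π = 0.5 is large enough whatever the true proportion turns out to be.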
Back to this example:   $1.96\sqrt{\frac{\pi(1-\pi)}{n}} < 0.05$
We’re given no information on the value of π, so let π = 0.5.

        $1.96\sqrt{\frac{0.5(1-0.5)}{n}} < 0.05$

        $1.96\sqrt{\frac{0.25}{n}} < 0.05$

        $\frac{1.96^2 \times 0.25}{n} < 0.05^2$          (Squaring both sides)

        $n > \frac{1.96^2 \times 0.25}{0.05^2}$

        $n > 384.16\ldots$

Answer: A sample size of 385 passengers is needed.

   Do Sigma p235 – Ex. 14.4 (old version)
         or p91 – Ex. 5.02 (new version)
   Homework: NuLake pg. 96: Q64–77.
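The rearranged inequality can be wrapped in a small helper. This is a sketch only: the function name `min_sample_size` is an assumption, and the second call uses a made-up pilot estimate of 0.2 purely to show the effect of options 1 and 2.

```python
from math import ceil

def min_sample_size(margin, z=1.96, pi=0.5):
    """Smallest whole n with z * sqrt(pi(1 - pi) / n) within the margin.

    pi defaults to 0.5, the worst case, for when nothing is known
    about the true proportion.
    """
    return ceil(z**2 * pi * (1 - pi) / margin**2)

print(min_sample_size(0.05))          # 385 (the airline example)
print(min_sample_size(0.05, pi=0.2))  # 246 (hypothetical pilot estimate of 0.2)
```

The answer is always rounded up, since rounding 384.16 down to 384 would leave the margin of error slightly above 0.05.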
LESSON 10 – Differences between
          means 1
The point of today:
Construct confidence intervals for the
 difference between 2 population means.

• Do NuLake 2.7: pg. 89–93.
2007 NCEA exam – C.I.s
Confidence Intervals for the Difference
           Between 2 Means
Involves comparison between the means of two populations (e.g. males &
females). We select a random sample from each group and calculate the
2 means, subtracting to get the difference.
We then use this difference to estimate the difference between the
means of the 2 populations from which the samples were drawn.

The expected difference between the 2 sample means is the true
  difference between the 2 population means: (Central Limit Theorem)
        $E(\bar{X}_1 - \bar{X}_2) = \mu_1 - \mu_2$
i.e. Mean difference between sample means = difference between popn means.
  Sample Mean                Sample     Popn                Variance of
  (point estimate)           Size       Mean                Sample Means

  $\bar{X}_1$                $n_1$      $\mu_1$             $\sigma_1^2 / n_1$

  $\bar{X}_2$                $n_2$      $\mu_2$             $\sigma_2^2 / n_2$

  $\bar{X}_1 - \bar{X}_2$    -          $\mu_1 - \mu_2$     $\sigma_1^2 / n_1 + \sigma_2^2 / n_2$
So the Standard Error of the difference
 between 2 sample means is:
        $\sigma_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
NOTE:
1. The 2 samples must be INDEPENDENT of one another.

2. When finding a confidence interval for the difference
   between 2 means, we use the popn parameters $\sigma_1$ and $\sigma_2$.
   If not told these, we can use the sample SDs $s_1$ and $s_2$,
   provided the sample sizes are large enough (n > 30).

3. A 95% Confidence Interval tells us that 95% of such
   intervals will CONTAIN the difference between the
   POPULATION MEANS.
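The standard error formula for the difference between 2 independent sample means can be checked by simulation. The sketch below draws many pairs of independent samples and compares the spread of the observed differences with the formula. The population means, standard deviations, sample sizes, number of repetitions and the use of normal populations are all assumptions made for illustration.

```python
import random
from math import sqrt
from statistics import mean, pstdev

def se_difference(sd1, n1, sd2, n2):
    """Standard error of (sample mean 1 - sample mean 2), independent samples."""
    return sqrt(sd1**2 / n1 + sd2**2 / n2)

random.seed(1)               # fixed seed so the run is repeatable
mu1, sd1, n1 = 50, 10, 40    # assumed population 1 and sample size
mu2, sd2, n2 = 45, 12, 60    # assumed population 2 and sample size

diffs = []
for _ in range(2000):
    xbar1 = sum(random.gauss(mu1, sd1) for _ in range(n1)) / n1
    xbar2 = sum(random.gauss(mu2, sd2) for _ in range(n2)) / n2
    diffs.append(xbar1 - xbar2)

simulated_mean_diff = mean(diffs)
simulated_se = pstdev(diffs)

print("mean of differences:", simulated_mean_diff, " expected:", mu1 - mu2)
print("SD of differences:  ", simulated_se, " formula:", se_difference(sd1, n1, sd2, n2))
```

The simulated values will not match the formula exactly, but they should be close, and closer still with more repetitions.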
Example:
A random sample of 49 women has a mean lifetime of 76 years with a
    standard deviation of 8 years, and a random sample of 64 men has a
    mean lifetime of 72 years with a standard deviation of 9 years.
(a) Find a 95% confidence interval for the difference between the mean
    lifetimes of all women and all men.
(b) What can we conclude about the mean lifespans of all men and all
    women on the basis of this confidence interval? Justify your answer.
For the women:         n1 = 49     X 1 =76              s1 = 8
For the men:           n2 = 64     X 2 =72              s2 = 9
   X 1  X 2 = 76 – 72
                    = 4 yrs
A 95% Confidence Interval for m1-m2, the difference between the
   population mean lifetimes of women and men is:

      X   1    X2    + z ×Standard Error      of
                                              Use the sample
 = X 1  X 2                                standard deviations –
                                              OK if sample is large
                                              enough


  =             4
                                  Margin of
  =             4                 Error E
ANSWER: The 95% CI for the difference between the population mean
lifetimes of women and men is 0.857 yrs < ($\mu_1 - \mu_2$) < 7.143 yrs

(b) What can we conclude about the mean lifespans of all men and all
women on the basis of this confidence interval? Justify your answer.

ANSWER: Since the interval does not contain a difference of ZERO, there
are sufficient grounds to say that there is a difference between the
mean lifespans of the populations of all men and all women.

TRY WITH A 99% C.I.
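The interval for the difference between two means can be checked numerically. The following is a minimal Python sketch, assuming two large independent samples so that a normal (z) interval is reasonable. The function name `diff_means_ci` is hypothetical and is not part of the course material.

```python
from math import sqrt
from statistics import NormalDist


def diff_means_ci(mean1, s1, n1, mean2, s2, n2, level=0.95):
    """Large-sample z interval for mu1 - mu2 (illustrative helper only)."""
    # Two-sided critical value, about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    # Standard error of the difference, using the sample SDs.
    se = sqrt(s1**2 / n1 + s2**2 / n2)
    diff = mean1 - mean2
    margin = z * se
    return diff - margin, diff + margin


# Women: n1 = 49, mean 76, s1 = 8. Men: n2 = 64, mean 72, s2 = 9.
lower95, upper95 = diff_means_ci(76, 8, 49, 72, 9, 64, level=0.95)
print(f"95% CI: {lower95:.3f} < mu1 - mu2 < {upper95:.3f}")

# Same data at the 99% level, for the "try with a 99% C.I." prompt.
lower99, upper99 = diff_means_ci(76, 8, 49, 72, 9, 64, level=0.99)
print(f"99% CI: {lower99:.3f} < mu1 - mu2 < {upper99:.3f}")
```

Checking whether each printed interval contains zero is the same test used to justify the conclusion in part (b).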
Difference between 2 means exercises
• Do NuLake Ch 2.7: pg. 89–93
LESSON 11 – Differences between
          means 2
The point of today:
Construct confidence intervals for the
 difference between 2 population means.



• Do Sigma pg. 239 – Ex. 14.5 (old version).
        or pg. 83 – Ex. 4.03 (new version)
               STARTER:
 GO THROUGH PROBLEM FROM HW AS A CLASS.


                    LESSON 12
      The distribution of the sample Total.

The point of today:
Construct confidence intervals for the combined
 total of a sample of items.

• Example

• 2009 NCEA paper (AS90642): Q1b & c.

• Probabilities for sample totals: Ex. 3.03 (pg. 64) – complete
  for HW.
Confidence Intervals for the Sample Total, $T_n$
This is where you are asked to give a confidence interval for
the combined total of a sample of n items (or of the entire
population of N items).
(E.g. total weight of a sample of eight Y13 males). You will be
told the mean value per item and the standard deviation.
These problems come in 2 types, depending on whether you’re
given the population mean, $\mu$, or the mean from a sample, $\bar{X}$.
Type 1: Based on $\mu$, a known population mean (look at today).

Type 2: Based on $\bar{X}$, the mean from a random sample (look
        at next lesson).
Type 1: Based on $\mu$, a known population mean.
This is where you are given $\mu$, the mean value per item in the
population, and asked to construct a confidence interval for the
total value of a sample of n items.
E.g. Seventeen-year-old NZ males have a known mean weight of
80 kg, with a standard deviation of 5 kg.
Construct a 99% CI for the combined total weight of a random
sample of 8 students.
Solution: The distribution of the total weight of 8 students is
the sum of 8 independent, identically distributed random variables.

Here we know the population mean weight per seventeen-year-old
male, $\mu$, and the standard deviation, $\sigma$.
So we can simply add the means and add the variances.
Distribution of a Total of n independent items:
If $X_1, X_2, \ldots, X_n$ are n independent sample values, then the
   sample total is
$T_n = X_1 + X_2 + \cdots + X_n$

Expected value of total:
$E[T_n] = E[X_1 + X_2 + \cdots + X_n] = E[X_1] + \cdots + E[X_n] = n\mu$

Variance of estimates of the total:
$\mathrm{Var}[T_n] = \mathrm{Var}[X_1 + X_2 + \cdots + X_n] = \mathrm{Var}[X_1] + \cdots + \mathrm{Var}[X_n] = n\sigma^2$ (if all have equal SD)

So the std. deviation of estimates of the total is: $\sigma_T = \sqrt{n} \times \sigma$
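The two results for the total, $n\mu$ for the expected value and $\sqrt{n} \times \sigma$ for the standard deviation, can be checked by simulation. The following is a minimal Python sketch using the weights example (mean 80 kg, SD 5 kg, n = 8). It assumes the individual weights are normally distributed, which the notes do not state, and the seed and number of trials are arbitrary choices.

```python
import random
from statistics import mean, stdev

random.seed(1)  # fixed seed so the run is repeatable

n = 8          # items per sample
mu = 80.0      # population mean per item (kg)
sigma = 5.0    # population SD per item (kg)
trials = 20_000

# Each trial draws n independent weights and records their total T_n.
totals = [sum(random.gauss(mu, sigma) for _ in range(n)) for _ in range(trials)]

sim_mean = mean(totals)
sim_sd = stdev(totals)
print(f"simulated mean of T_n: {sim_mean:.1f}  (theory n*mu = {n * mu:.1f})")
print(f"simulated SD of T_n:   {sim_sd:.2f}  (theory sqrt(n)*sigma = {n ** 0.5 * sigma:.2f})")
```

The simulated values should sit close to 640 kg and 14.14 kg, with small differences due to random sampling.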
Back to the example: Total weight of sample of 8 males:
$E(T_8) = 8(80) = 640$ kg.
$\mathrm{Var}(T_8) = 8(5^2) = 200$.  So $\sigma_T = \sqrt{200} = 14.14213562$ kg

99% CI for T is $E(T_8) \pm z \times \sigma_T$

               $= 640 \pm 2.576 \times 14.14\ldots$

               $= 640$ kg $\pm\ 36.43$ kg   (4sf)
ANSWER: The 99% CI for $T_8$, the total weight of the sample of 8 males, is:
                603.6 kg < T < 676.4 kg (all to 4sf)

1. Do 2009 NCEA AS90642 – Q1 (b) and (c)
2. Do Sigma:
        - Old (2nd edition): pg. 183 – Ex. 11.3.
        - or New: pg. 64 – Ex. 3.03.
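The Type 1 calculation can be reproduced in a few lines. The following is a minimal Python sketch, assuming independent items with a known population mean and SD. The function name `sample_total_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def sample_total_interval(mu, sigma, n, level=0.99):
    """Interval for the total of n independent items with known mu and sigma."""
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 2.576 for 99%
    expected_total = n * mu                      # E[T_n] = n * mu
    sd_total = sqrt(n) * sigma                   # sigma_T = sqrt(n) * sigma
    margin = z * sd_total
    return expected_total - margin, expected_total + margin


# Seventeen-year-old males: mu = 80 kg, sigma = 5 kg, sample of n = 8.
lower, upper = sample_total_interval(mu=80, sigma=5, n=8, level=0.99)
print(f"{lower:.1f} kg < T < {upper:.1f} kg")
```

Changing `level` or `n` shows how the interval widens with higher confidence or a larger sample total.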
                 LESSON 13
Confidence Intervals for Population Totals

• STARTER: Revise the definition of a Confidence
  Interval.

• Notes on CI for population totals.

• Do NCEA AS90642 – 2009 paper: Q2c.
• Do NuLake p98-100 (mixed problems).
• Do NuLake practice assessment (p101).
Confidence Intervals for the Sample Total, $T_n$
This is where you are asked to give a confidence interval for
the combined total of a sample of n items (or of the entire
population of N items).
(E.g. total weight of a sample of eight Y13 males). You will be
told the mean value per item and the standard deviation.
These problems come in 2 types, depending on whether you’re
given the population mean, $\mu$, or the mean from a sample, $\bar{X}$.
Type 1: Based on $\mu$, a known population mean (looked at last
lesson).

Type 2: Based on $\bar{X}$, the mean from a random sample (look
        at today).
Type 2: Based on $\bar{X}$, the mean from a random sample:
This is where you are asked to construct a confidence interval for the
total value of N items but the population mean per item is unknown.
Instead we are told $\bar{X}$, the mean from a sample.
Then an estimate of the total value of N items is: $N\bar{X}$

To construct a CI for a total based on a sample:
1. Construct a confidence interval for the population mean per item, $\mu$.
2. Multiply the lower and upper bounds of the interval by N, the number
of items.



Example:
68 Year 13 male students are to be selected at random from
throughout NZ to win a prize of an overseas holiday after NCEA
exams. The organisers need to estimate the likely total weight of
the students, due to weight restrictions on the aircraft.
The mean and SD of the population of all Year 13 males are unknown,
so they conduct a pilot study by selecting a random sample of 30. This
sample has a mean weight of 76 kg with a standard deviation of 7 kg.
Construct a 96% CI for the expected total weight of 68
randomly selected Year 13 students.

Solution:
1. Construct a 96% confidence interval for the popn mean $\mu$:
     Interval is given by:
     $\bar{X} \pm z \frac{s}{\sqrt{n}} = 76 \pm 2.054 \times \frac{7}{\sqrt{30}} = 76 \pm 2.625$
So 96% CI for popn mean weight, $\mu$, is: 73.375 kg < $\mu$ < 78.625 kg

2. Multiply the lower and upper bounds of the interval by N, the
   number of items.
   96% CI for the expected total weight of the 68 Y13s is:
      (N × lower limit for $\mu$) < $T_N$ < (N × upper limit for $\mu$)
      = (68 × 73.375) < $T_N$ < (68 × 78.625)
      = 4990 kg < $T_{68}$ < 5347 kg         answer

C.I. for Population Totals:
1. Do 2009 NCEA paper (AS90642): Q2c.
2. Preparation for test:
   Do NuLake p98-100 (Mixed problems).
   Do NuLake practice assessment (p101)
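The two-step method for a total based on a sample mean can be sketched in code. The following is a minimal Python example, assuming the sample of 30 is large enough for a z interval. The function name `total_ci_from_sample` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def total_ci_from_sample(xbar, s, n, N, level=0.96):
    """CI for the expected total of N items, scaled up from a CI for the mean."""
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 2.054 for 96%
    margin = z * s / sqrt(n)                     # margin of error for the mean
    mean_lower, mean_upper = xbar - margin, xbar + margin
    # Step 2: multiply both bounds of the interval for the mean by N.
    return N * mean_lower, N * mean_upper


# Pilot sample: mean 76 kg, SD 7 kg, n = 30. Total wanted for N = 68 students.
total_lower, total_upper = total_ci_from_sample(xbar=76, s=7, n=30, N=68)
print(f"{total_lower:.1f} kg < T < {total_upper:.1f} kg")
```

Because the code uses an unrounded z value, its bounds can differ from the hand calculation by a fraction of a kilogram.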
   Sample-size question from 2007
       NCEA External Exam
A random sample of size n is taken from a population
  having a known standard deviation σ. A 95%
  confidence interval for the population mean is
  calculated using the sample mean.
A second random sample of size 2n is taken from the
  same population and a 95% confidence interval for
  the population mean is calculated using its sample
  mean.
How many times greater is the width of the first
  confidence interval than the width of the second
  confidence interval?
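One way to explore this question is to compare the two interval widths numerically. The following is a minimal Python sketch. The values chosen for `sigma` and `n` are hypothetical, and the function name `ci_width` is an assumption for illustration only.

```python
from math import sqrt
from statistics import NormalDist


def ci_width(sigma, n, level=0.95):
    """Full width of a z interval for the mean when sigma is known."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return 2 * z * sigma / sqrt(n)


# Hypothetical values: try others to see that the ratio does not change.
sigma, n = 10.0, 25
ratio = ci_width(sigma, n) / ci_width(sigma, 2 * n)
print(f"width of first CI / width of second CI = {ratio:.4f}")
```

Since z and σ cancel in the ratio, only the sample sizes affect the result.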
  LESSON 14 – ASSESSMENT
What to study:
• Do NuLake mixed problems (p98) – merit level qs.

• NuLake practice assessment (p101)

• More practice (Achieved & Merit):
• Do Sigma Confidence Intervals Review exercise:
   Old: p241 – Ex. 14.6
   New: p95 – Ex. 5.03

• CIs for totals (Excellence) – past papers:
• 2009 Q2c
• 2008 Q6
• 2006 Q7

								