Chapter 4
Gathering Data

Looking Back
   In Chapters 2 & 3 we learned how to describe data
both graphically and numerically.
   For these statistical analyses to be useful, we must
have good data.
   In fact, the way a study is designed (how we gather
data) can have a major impact on the results of the
study.
   The purpose of this course is for you to learn what
you can conclude about an entire population given a
sample from that population.
   If a study is poorly designed and implemented, the
results may be meaningless or misleading.
Two Scenarios
   Study 1
   A U.S. study (2000) compared 469 patients with brain
cancer to 422 patients who did not have brain cancer. The
patients’ cell phone use was measured using a
questionnaire. The two groups’ use of cell phones was
similar.
   Study 2
   An Australian study (1997) used 200 transgenic mice.
One hundred were exposed for two 30-minute periods a day
to microwaves of the same kind, and roughly the same
power, as those transmitted from a cell phone. The other
100 mice were not exposed. After 18 months, the brain
tumor rate for the exposed mice was twice as high as that
for the unexposed mice.

   Study 2
   Uses mice in the hope that the results generalize to humans

Example 4.1
   A large study of student drug use and how it
depends on drug testing enrolled 76,000 middle and
high school students. Each student in the study
filled out a questionnaire. One question asked
whether the student used drugs. The study found
that drug use was not affected by student drug
testing.

   This is an example of an observational study.

   Could there be any lurking variables?
   Frequency of drug testing, whether testing is random, etc.

Example 4.2
   A researcher buys seeds of two different varieties of
corn. He randomly selects 30 seeds of each variety
and plants them in his backyard, making sure to
label the location of each seed and its type. He then
measures how long it takes each seed to sprout. At
the end of the study he compares the average
germination time of the different varieties.

   This is an example of an experiment.

   Could there be any lurking variables?
   Soil quality, temperature

Example 4.3
   A researcher has seeds of only one variety of tomato. She has
60 nearly identical pots of soil and plants one tomato seed in
each. She randomly selects 30 pots and keeps them at 75° F.
The other 30 pots she keeps at 65° F. Aside from temperature,
she provides the same growing conditions to all pots. She then
measures how long it takes for the seeds to sprout. At the end of
the study she compares the average germination time of the
different temperature groups.

   This is an example of an experiment.

   Are there any lurking variables?
   None apparent: the pots were randomly assigned to the two
temperatures, and all other growing conditions were kept the same.
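The random assignment in Example 4.3 can be sketched in a few lines of Python. This is only an illustration: the pot numbering (1 to 60), the group labels, the function name `assign_treatments`, and the seed are assumptions made here, not details from the study.

```python
import random

def assign_treatments(units, treatments, seed=None):
    """Randomly split units into equal-sized groups, one group per treatment."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)          # random order removes any systematic pattern
    size = len(shuffled) // len(treatments)
    return {
        t: sorted(shuffled[i * size:(i + 1) * size])
        for i, t in enumerate(treatments)
    }

pots = range(1, 61)                # hypothetical pot labels 1..60
groups = assign_treatments(pots, ["75F", "65F"], seed=4)
print(len(groups["75F"]), len(groups["65F"]))  # 30 30
```

Because chance alone decides which pot gets which temperature, any leftover differences between pots should tend to balance out across the two groups.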

Types of Observational
Studies
   Retrospective
   Observational studies that look back in time
   This is sometimes done to find risk factors for certain
diseases
   Cross-Sectional
   Observational studies that take a cross section of
the population at the current time
   Prospective
   Observational studies in which subjects are
followed into the future
Sampling Designs for
Observational Studies
   Systematic Sampling
   A systematic sample selects every kth person from
the sampling frame. The researcher randomly
selects a number between 1 and k in order to
know which person to select first, then selects
every kth person after this.
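The procedure above can be sketched as follows. The sampling frame of 100 labeled people, the choice k = 10, and the function name `systematic_sample` are hypothetical and used only for illustration.

```python
import random

def systematic_sample(frame, k, seed=None):
    """Pick a random start between 1 and k, then take every kth unit."""
    rng = random.Random(seed)
    start = rng.randint(1, k)      # randint includes both endpoints
    return frame[start - 1::k]     # convert the 1-based start to a 0-based index

# Hypothetical sampling frame of 100 people
frame = [f"person_{i}" for i in range(1, 101)]
sample = systematic_sample(frame, k=10, seed=1)
print(len(sample))  # 10
```

With 100 people in the frame and k = 10, every possible starting point gives a sample of 10 people spaced 10 apart.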

Types of Bias
   Undercoverage
   Occurs when a sampling frame leaves out some groups in
the population
   Nonresponse bias
   Occurs when some sampled subjects cannot be reached,
refuse to participate or fail to answer some questions
   Response bias
   Occurs when the subject gives an incorrect response or
when the question wording or the way the interviewer asks
the questions is confusing or misleading
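A small made-up illustration of undercoverage: the population, the landline-only sampling frame, and the numbers below are invented for this sketch and do not come from any study.

```python
from statistics import mean

# Hypothetical population: (has_landline, weekly hours online)
population = [(True, 5)] * 60 + [(False, 20)] * 40

# Sampling frame built from landline listings only: it leaves out
# everyone without a landline, so that group can never be sampled.
frame = [person for person in population if person[0]]

pop_mean = mean(hours for _, hours in population)   # 11
frame_mean = mean(hours for _, hours in frame)      # 5
print(pop_mean, frame_mean)
```

Even surveying every person in this frame would give an average of 5 hours rather than the population average of 11, so a larger sample from the same frame would not remove the bias.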

Examples of Poor Samples
that Result in Bias
Example 4.7
   An experiment involving adolescent males (ages 15-19)
appeared in Science in 1995. The purpose of the
study was to determine whether there was an
association between survey techniques and the
desire to give socially acceptable answers.
   The participants were randomly assigned to one of
two different survey forms, each of which had
identical questions concerning sexual practices and
drug habits.

   The two versions of the survey were
   Paper: participants put their answers in an envelope with
an ID number on it and returned it in person
   Computer: participants listened to questions in

Example 4.10
   To investigate whether antidepressants help smokers to quit
smoking, one study used 429 men and women who were 18 or older
and had smoked 15 cigarettes or more per day in the previous year.
They were all highly motivated to quit and in good health. They
were assigned to one of two groups: one group took an
antidepressant called Zyban, while the other group did not take
anything. At the end of a year, the study observed whether each
subject had successfully abstained from smoking.

Important Points
   Observational studies
   Types
   Retrospective, Cross-Sectional, Prospective
   Sampling Designs
   Simple random sample (SRS), Stratified random sample,
Cluster sample, Systematic sample
   Bias Types
   Undercoverage, Response bias, Nonresponse bias
   Sources of Bias
   Convenience sampling, Voluntary response sampling
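The sampling designs listed above can be compared in a short sketch. The frame of 12 students in 3 classrooms, the function names, and the sample sizes are hypothetical choices made for illustration.

```python
import random

rng = random.Random(0)

# Hypothetical frame: 12 students, 4 in each of 3 classrooms
frame = [(f"s{i}", f"room{i % 3}") for i in range(12)]

def simple_random_sample(frame, n):
    """Every set of n units is equally likely to be chosen."""
    return rng.sample(frame, n)

def stratified_sample(frame, key, n_per_stratum):
    """Split the frame into strata, then take an SRS within each stratum."""
    strata = {}
    for unit in frame:
        strata.setdefault(key(unit), []).append(unit)
    return [u for members in strata.values()
            for u in rng.sample(members, n_per_stratum)]

def cluster_sample(frame, key, n_clusters):
    """Randomly choose whole clusters and keep every unit in them."""
    clusters = sorted({key(u) for u in frame})
    chosen = set(rng.sample(clusters, n_clusters))
    return [u for u in frame if key(u) in chosen]

room = lambda unit: unit[1]        # classroom serves as stratum or cluster
srs = simple_random_sample(frame, 4)
strat = stratified_sample(frame, room, n_per_stratum=2)
clus = cluster_sample(frame, room, n_clusters=1)
```

In this sketch the stratified sample always contains two students from every classroom, while the cluster sample contains all four students from one randomly chosen classroom.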

Important Points
   Experiments
   Types
   Completely randomized designs, matched pairs designs,
crossover designs, randomized block designs
   Elements of Good Experiments
   Control group, randomization, blinding and replication
   Can show causation
   Can be unethical
   Can take decades to complete
Important Points
   If a group is underrepresented in the sample, we
   We must be careful when interpreting the results of
observational studies.
   For comparison of several treatments to be valid, you
must apply all treatments to similar groups of
experimental units.
   Interesting questions are usually pretty tough to
answer. This is due in part to the fact that no single
experiment or observational study can determine
causation.

