Statistics
Statistics is the science of designing studies, gathering data, and then
classifying, summarizing, interpreting, and presenting these data to
explain and support the decisions that are reached.

Population
A population is the complete collection of measurements, objects, or
individuals under study.

Sample
A sample is a portion or subset taken from a population.

Parameter
A parameter is a number that describes a population characteristic.

Statistic
A statistic is a number that describes a sample characteristic.
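
The difference between a parameter and a statistic can be illustrated with a small sketch. The population below (1,000 simulated cattle weights) is invented purely for illustration; in practice the population mean is usually unknown, and only the sample mean is available.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: simulated weights (kg) of 1,000 head of cattle.
population = [rng.gauss(500, 40) for _ in range(1000)]

# Parameter: a number describing the whole population.
mu = statistics.mean(population)

# Statistic: a number describing a sample drawn from that population.
sample = rng.sample(population, 25)
x_bar = statistics.mean(sample)

print(f"population mean (parameter): {mu:.1f}")
print(f"sample mean (statistic):     {x_bar:.1f}")
```

The two printed values will generally be close but not equal, which is why a statistic is used to draw conclusions about a parameter rather than being treated as the parameter itself.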

Descriptive Statistics
Descriptive statistics includes the procedures for collecting,
classifying, summarizing, and presenting data.

Statistical Inference
Statistical inference is the process of arriving at a conclusion
about a population parameter (which is usually an unknown
quantity) on the basis of information obtained from a sample
statistic (a known value).
Data Collection
1. Data that is made available by others.
2. Data resulting from an experiment.
An experiment involves a sample from a population of interest.
The members of the sample are called experimental units, though
when dealing with people they are more often referred to as
subjects. The experimenter manipulates these experimental units
by subjecting them to one of several treatments. These treatments
are created by changing the values or levels of one or more
factors. The response of each experimental unit to its treatment is
observed and comparisons are made across the various treatments.

Experiment
An experiment is the process of subjecting experimental units to
treatments and observing their response.

Example (A Taste of Sampling). Russell Ford wants to compare the
weight gains (responses) of cattle (experimental units) based on the
brand of feed (factor). The four brands (treatments or levels) are (1)
Sterling Steers, (2) Hereford Heaven, (3) Guernsey Goop, and (4) Tastes
Like Chicken. He has 32 head to which he wants to give the various
feeds. How can he randomly assign the brands of feed to the cattle?
Solution. Russell could number his cattle from 1 to 32. Then, using a
table of random numbers or a computer program such as MINITAB, he
could select random numbers between 1 and 32. The first eight numbers
he selects will indicate the eight head of cattle to receive Sterling Steers,
the second eight numbers will denote the cattle to receive Hereford
Heaven, and so on. Below is the result of using MINITAB to generate
such random numbers. The first eight, ignoring repeats, are 27, 31, 6, 8,
19, 12, 26, and 21. So the cattle assigned those numbers would be fed
Sterling Steers. Cattle numbered 7, 18, 1, 13, 11, 15, 10, and 5 would
eat Hereford Heaven, while cattle numbered 3, 14, 24, 9, 28, 30, 23, and
25 would dine on Guernsey Goop, and the remaining eight head would
ingest Tastes Like Chicken.

MINITAB: Calc>Random Data>Integer
3. Data collected in an observational study.
In an observational study, a researcher collects data without
imposing any treatments on the subjects or experimental units.
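
Returning to the cattle-feed example under item 2, the random assignment could also be done in a general-purpose language instead of MINITAB. The following is a minimal sketch in Python, not the procedure used in the notes: the function name assign_treatments and the seed value are hypothetical, and it shuffles the 32 numbers rather than drawing random integers and ignoring repeats, so the groups it produces will differ from those listed in the solution.

```python
import random

def assign_treatments(n_units, treatments, seed=None):
    """Randomly split units numbered 1..n_units into equal-sized treatment groups."""
    if n_units % len(treatments) != 0:
        raise ValueError("units must divide evenly among treatments")
    rng = random.Random(seed)
    units = list(range(1, n_units + 1))
    rng.shuffle(units)  # a random ordering, so no number can repeat
    size = n_units // len(treatments)
    # The first `size` shuffled units get the first treatment, and so on.
    return {
        t: sorted(units[i * size:(i + 1) * size])
        for i, t in enumerate(treatments)
    }

brands = ["Sterling Steers", "Hereford Heaven", "Guernsey Goop", "Tastes Like Chicken"]
groups = assign_treatments(32, brands, seed=2024)
for brand, cattle in groups.items():
    print(brand, cattle)
```

Each of the 32 head appears in exactly one group of eight, which mirrors "the first eight numbers, the second eight numbers, and so on" in the solution.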
Sampling

A non-probability sample is one in which the judgment of the
experimenter, the method by which the data are collected, or other
factors could affect the results of the sample.

Judgment Samples. Any sample based on someone's expertise about the
population is known as a judgment sample.

Voluntary Samples. Voluntary samples involve open solicitation of
input and attract only those who are interested in the subject matter.

Convenience Samples. A sample chosen primarily for the ease with
which it can be taken is called a convenience sample.

A probability sample is one in which the chance of selection of each
item in the population is known before the sample is picked.

Simple Random Samples. If a probability sample is chosen in such a
way that all possible groupings of a given size have an equal
chance of being picked, and if each item in the population has an
equal chance of being selected, then the sample is called a simple
random sample.
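
As an illustration of the definition, a simple random sample can be drawn with Python's standard random.sample, which selects without replacement so that every subset of the requested size is equally likely. The population of items numbered 1 to 100 and the seed are assumptions made for this sketch.

```python
import random

rng = random.Random(7)  # fixed seed so the sketch is repeatable

# Hypothetical population: items numbered 1 through 100.
population = list(range(1, 101))

# Draw 10 items without replacement; every group of 10 is equally likely.
srs = rng.sample(population, 10)
print(sorted(srs))
```
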
Stratified Samples. If a population is divided into relatively
homogeneous groups or strata and a sample is drawn from each
group to produce an overall sample, this overall sample is known
as a stratified sample.
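
A stratified sample can be sketched by grouping the population into strata and then taking a simple random sample within each one. The helper stratified_sample, the student population, and the choice of five units per stratum are all hypothetical; the definition above does not say how many units to draw from each group.

```python
import random
from collections import defaultdict

def stratified_sample(items, stratum_of, per_stratum, seed=None):
    """Draw a simple random sample of size per_stratum from every stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for item in items:
        strata[stratum_of(item)].append(item)  # group items by stratum
    sample = []
    for name in sorted(strata):
        sample.extend(rng.sample(strata[name], per_stratum))
    return sample

# Hypothetical population: 60 students, 20 in each class year.
years = ["first", "second", "third"] * 20
students = [(i, year) for i, year in enumerate(years, start=1)]

picked = stratified_sample(students, lambda s: s[1], per_stratum=5, seed=3)
print(picked)
```

Because every stratum contributes to the overall sample, no group can be missed by chance, which is the main reason for stratifying.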

Cluster Samples. A cluster sample is one in which the individual units
to be sampled are actually groups or clusters of items.
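
In a cluster sample the random selection is applied to the clusters rather than to the individual items. The sketch below assumes a one-stage design in which every item in a chosen cluster is included; the pens of cattle and the helper cluster_sample are invented for illustration.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole clusters and keep every item in each chosen cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # sample cluster names
    return {name: clusters[name] for name in chosen}

# Hypothetical population: 6 pens of cattle with 5 head in each pen.
pens = {f"pen {p}": list(range(5 * (p - 1) + 1, 5 * p + 1)) for p in range(1, 7)}

picked = cluster_sample(pens, 2, seed=11)
for pen, cattle in picked.items():
    print(pen, cattle)
```
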

Extra problems:

1. A physical education instructor wants to compare the percentage of body fat for joggers, cyclists, and
swimmers.
a) This could be based on an observational study or on a designed experiment. When would it be
an observational study and when would it be a designed experiment?
b) What is the response?
c) What is the factor?
d) What are its levels?
e) What are the treatments?

2. As part of its sales strategy, an Internet based book vendor solicits book reviews of novels from its
customers. The novel The Perfidious Parrot by Janwillem van de Wetering had ten reviews. Nine
raved about the story, giving the book 4, 4 1/2 or 5 stars (out of 5), while the tenth review was
extremely negative, giving it 0 stars and complaining about the plot line and the characterizations.
a) What type of sample is this?
b) What would you think of a claim that ninety percent of all readers like the novel?
c) There were no reviews that thought the book was mediocre or fair. Why do you think this is?

3. The following passage is from “A two-step intervention to increase mammography among women
aged 65 and older” (American Journal of Public Health, Oct. 1997, Janz, Nancy K., Schottenfeld,
David). “Older women are less likely to obtain screening mammograms, although such screening
could reduce breast cancer mortality by at least 30%. In national surveys, the two most common
reasons offered by older women for not having a mammogram were that they did not know they
needed a mammogram and that their physician had not recommended one. . . . Four hundred and
sixty women, . . . were randomized to a control or a two-step intervention group . . . 223 in the
intervention group and 237 in the control group. The two-step intervention consisted of (1) a personal
letter from the primary care physician with a coupon incentive and (2) for women who did not respond
to the letter within 2 months, a telephone counseling session conducted by a community peer.”
a) Is this an observational study or a designed experiment? Explain.
b) What or who are the experimental units or subjects?
c) What is the factor?
d) What are its levels?
e) What are the treatments?

4. An April 7, 2000 article from the San Luis Obispo Tribune, headlined “Survey: City people are more
prone to illnesses” states that “City dwellers get sick more often than their rural counterparts. And
people who live in areas of high unemployment are more likely to feel unhealthy. These trends were
drawn from a five-year study . . . by the Centers for Disease Control and Prevention.”
a) Is this an observational study or a designed experiment? Explain.
b) What question or questions would you ask the researchers at the CDC to better understand
these results?

5. The article “Impact of zinc supplementation on morbidity from diarrhea and respiratory infections
among rural Guatemalan children” (Pediatrics, June 1997, Ruel, Marie T., Rivera, Juan A.) describes
an evaluation of a food supplement to decrease illness among children. “A community-based,
randomized, double-blind intervention trial was conducted to measure the impact of zinc
supplementation on young Guatemalan children's morbidity from diarrhea and respiratory infections. . . .
Children aged 6 to 9 months were randomly assigned to receive 4 mL of a beverage containing 10
mg of zinc (as zinc sulfate) daily (7 d/wk) for 7 months (n = 45) or a placebo (n = 44). Morbidity data
were collected daily.” Is this an observational study or a designed experiment? Explain.

6. A paragraph from Knight Ridder wire services (2000) had a headline “Age of parents influences sex of
baby.” It read “Older fathers and mothers are more likely to have girls, says a study of more than 25
years of birth reports. The influence of age was strongest among nonwhite parents, researchers from
Exxon Biomedical Services report in the March issue of the journal of Fertility and Sterility. The
scientists undertook the study as part of an ongoing examination of the declining proportion of male
births.”
a) Do you think this was originally an observational study or a designed experiment? Why?
b) It is possible that Exxon Biomedical Services used available data in creating their report. If so,
what type of information should they obtain about how the data was collected?

