Monday, January 25th
Consumers Union asked all subscribers whether
they had used alternative medical treatments. They
found that 20% of all their subscribers said “yes.” Is
this number a parameter or statistic?
A survey was conducted in a city with 500,000
residents. They contacted 100 people randomly and
found that 67% of them thought that businesses
should be required to pay for their employees’ health
insurance. Is this number a parameter or statistic?
Scales of Measurement
Quantitative or Numerical
Variables with numerical values associated with them
Qualitative or Categorical
Variables without numerical values associated with them
Gender, nationality, hair color, state of residence
Nominal variables have a scale of unordered categories
It does not make sense to say, for example, that green hair is
greater/higher/better than orange hair
Disease status, company rating, grade in STA 291
Ordinal variables have a scale of ordered categories; they are often
treated in a quantitative manner (A = 4.0, B = 3.0, etc.)
One unit can have more of a certain property than another unit
Age, income, height
Quantitative variables are measured numerically, that is, for each
subject a number is observed
The scale for quantitative variables is called the interval scale
A study about oral hygiene and periodontal
conditions among institutionalized elderly measured
Nominal (Qualitative): Requires assistance from staff?
Ordinal (Qualitative): Plaque score
No visible plaque
Small amounts of plaque
Moderate amounts of plaque
Interval (Quantitative): Number of teeth
A birth registry database collects the following information on
Birth weight: in grams
Number of prenatal visits
What are the appropriate scales? Quantitative (Interval) Qualitative
Importance of Different Types of Data
Statistical methods vary for quantitative and qualitative variables
Methods for quantitative data cannot be used to analyze qualitative data
Quantitative variables can be treated in a less quantitative manner
Height: measured in cm/in
Can be treated as qualitative
• Greater than 60in? (Yes/No)
• Between 60in-72in? (Yes/No)
A variable is discrete if it can take on a finite number of values
Grade in STA 291
Favorite MLB team
All Qualitative variables are discrete
Continuous variables can take an infinite continuum
of possible real number values
Time spent studying for STA 291 per day
Can be subdivided into more accurate values
Discrete or Continuous
Quantitative variables can be discrete or continuous
Age, income, height?
Depends on the scale
Age is potentially continuous, but usually measured in years
The following are examples of quantitative variables.
Identify them as discrete or continuous:
Number of children in a family
Distance a car travels on a tank of gas
Number of customers in a store
Weight of a textbook
Data Collection and Sampling
Methods of Collecting Data
Methods of Collecting Data I
• An observational study observes individuals and
measures variables of interest but does not attempt
to influence the responses.
• The purpose of an observational study is to describe/
compare groups or situations.
• Example: Select a sample of men and women and ask each
person whether he/she has taken aspirin regularly over the
past 2 years, and whether he/she had suffered a
heart attack over the same period.
Methods of Collecting Data II
• An experiment deliberately imposes some treatment
on individuals in order to observe their responses.
• The purpose of an experiment is to study whether the
treatment causes a change in the response.
• Example: Randomly select men and women and divide
the sample into two groups. You assign one group to
take aspirin daily and the other group a placebo.
After 2 years, determine for each group the percent
of people who had suffered a heart attack.
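The assignment step in this example could be done by chance rather than by the experimenter's judgment. The following is a minimal sketch, assuming ten subjects with made-up labels and an even split; the function name assign_groups and the seed are hypothetical choices, not part of the lecture.

```python
import random

def assign_groups(subjects, seed=None):
    """Randomly split subjects into an aspirin group and a placebo group."""
    rng = random.Random(seed)
    shuffled = list(subjects)      # copy so the caller's list is untouched
    rng.shuffle(shuffled)          # random order decides group membership
    half = len(shuffled) // 2
    return {"aspirin": shuffled[:half], "placebo": shuffled[half:]}

# Hypothetical subject labels for illustration.
subjects = [f"subject_{i}" for i in range(1, 11)]
groups = assign_groups(subjects, seed=291)

print("Aspirin:", groups["aspirin"])
print("Placebo:", groups["placebo"])
```

Because chance decides who gets aspirin, the two groups should be similar on average in everything except the treatment.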
Methods of Collecting Data III
• Observational Studies are passive data collection
• We observe, record, or measure, but don’t intervene
• Experiments are active data production
• Experiments actively intervene by imposing
some treatment in order to see what happens
• Experiments are preferable if they are possible
Simple Random Sample
• Each possible sample has the same
probability of being selected.
• The sample size is usually denoted by n.
• Population of 4 students: Adam, Bob, Christina, Dana
• Select a simple random sample (SRS) of size n=2 to
ask them about their smoking habits
• 6 possible samples of size n=2:
(1) A+B, (2) A+C, (3) A+D
(4) B+C, (5) B+D, (6) C+D
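The six samples listed above can be reproduced with a short Python sketch. It assumes the fourth student is Dana, matching the initial D in the list; under simple random sampling each of the six samples is equally likely.

```python
from itertools import combinations
from math import comb

# Population of 4 students (Dana assumed as the fourth, matching "D").
population = ["Adam", "Bob", "Christina", "Dana"]
n = 2  # sample size

# Every possible sample of size n, ignoring order.
samples = list(combinations(population, n))

for i, sample in enumerate(samples, start=1):
    print(f"({i}) {' + '.join(sample)}")

# The count agrees with the binomial coefficient "4 choose 2".
print("Number of samples:", len(samples), "=", comb(len(population), n))
print("Probability of each sample under SRS:", 1 / len(samples))
```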
How to choose a SRS?
• Old way: use a random number table.
• A little more modern: http://www.randomizer.org
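A programming language's random number generator does the same job as a table or a website. The sketch below uses Python's standard random module on the four-student population (Dana assumed as the fourth student); the seed is an arbitrary choice made only so the run is repeatable.

```python
import random
from itertools import combinations

population = ["Adam", "Bob", "Christina", "Dana"]
rng = random.Random(291)  # arbitrary seed, for repeatability only

# Option 1: enumerate all samples of size 2, then pick one at random.
all_samples = list(combinations(population, 2))
chosen_by_enumeration = rng.choice(all_samples)

# Option 2: let the library draw 2 students without replacement.
chosen_directly = rng.sample(population, 2)

print("Picked from the enumerated list:", chosen_by_enumeration)
print("Drawn directly:", chosen_directly)
```

Both options give every possible sample of size 2 the same chance of being selected; the second one scales to populations where listing every sample is impractical.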
How to Choose a Simple Random Sample (SRS)
• Each possible sample has the same probability of being selected.
• The sample size is denoted by n.
• Enumerate all possible samples, and then
randomly choose one of them
• Or, let the computer choose a random sample, for
example using this tool:
How not to choose a SRS?
• Ask Adam and Dana because they are in
your office anyway
– “convenience sample”
• Ask who wants to take part in the survey
and take the first two who volunteer
– “volunteer sampling”
Problems with Volunteer Samples
• The sample will poorly represent the population
• Misleading conclusions
• Examples: Mall interview, call-in poll,
internet poll, street corner interview
Why are call-in polls usually biased?
People are much more likely to call in if
they feel strongly about an issue:
(Israel-Palestine, Iraq, water company,
mountaintop removal, pedestrian safety,
name of the UK mascot)
The UK Mascot
• Wildcat named “Blue” is the
official UK mascot
• The name was selected in
2002 in an online poll
where multiple voting was possible
• The choices were “Champ”,
“Blue”, or “Tucky”
• Somebody felt strongly about
it and voted often
Sampling: Famous Example
• 1936 presidential election
• Alfred Landon vs. Franklin Roosevelt
• Literary Digest sent over 10 million
questionnaires in the mail to predict the election outcome
• More than 2 million questionnaires returned
• Literary Digest predicted a landslide victory
by Alfred Landon
Sampling: Famous Example (cont’d)
• George Gallup used a much smaller
random sample and predicted a clear
victory by Franklin Roosevelt
• Roosevelt won with 62% of the vote
• Why was the Literary Digest prediction so far off?
• TV, radio call-in polls
• “Should the UN headquarters continue to be located in the United States?”
• ABC poll with 186,000 callers: 67% no
• Scientific random sample with 500 respondents: 28% no
• Explain to someone who knows no statistics why the
opinions of only 500 randomly chosen respondents are
a better guide to what all Americans think than the
opinions of 186,000 callers.
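One way to see the point is a small simulation. Everything in the sketch below is an assumption made for illustration: a population of 100,000 in which 28% would answer "no", and call-in rates of 30% for people who say "no" versus 5% for everyone else. These rates are invented and are not estimates from the ABC poll.

```python
import random

def simulate_polls(pop_size=100_000, true_no=0.28, seed=1):
    """Compare a random sample of 500 with a self-selected call-in poll."""
    rng = random.Random(seed)

    # True = this person would answer "no".
    n_no = round(pop_size * true_no)
    population = [True] * n_no + [False] * (pop_size - n_no)

    # Simple random sample: everyone has the same chance of selection.
    srs = rng.sample(population, 500)
    srs_share = sum(srs) / len(srs)

    # Call-in poll: people who say "no" are assumed to feel more strongly
    # and to call in more often (hypothetical rates: 30% vs 5%).
    callers = [ans for ans in population
               if rng.random() < (0.30 if ans else 0.05)]
    callin_share = sum(callers) / len(callers)

    return srs_share, callin_share, len(callers)

srs_share, callin_share, n_callers = simulate_polls()
print(f"True share answering no:   {0.28:.2f}")
print(f"Random sample of 500:      {srs_share:.2f}")
print(f"Call-in poll ({n_callers} callers): {callin_share:.2f}")
```

Under these assumptions the small random sample lands near the true share, while the much larger call-in poll stays far from it. A larger sample does not repair a biased selection method.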
Please check your online homework.
Listen for any announcements made in class today
about the first homework assignment!!!
Attendance Survey Question 2
• On a 4”x6” index card (or little piece of paper)
– Please write down your name and
– Today’s Question (please answer with a complete sentence):
What are the 2 main ways to collect data?