Surveys and Scientific Studies to Collect Data (Chapter 2)

Chapter 2 deals with various aspects of using sampling methods to produce data. A primary goal in sampling is to produce meaningful data; hence, one should keep specific questions in mind when constructing a sampling plan.

Formal Inference: In designing a study, one should focus on the following questions:

1. What is the objective of the study? (i.e.: exactly what is it that we are trying to investigate?)

2. In what variables are we interested? (Ex: If we want to study the performance of college applicants, we might be interested in: SAT score, GPA, strength of high school, etc.)

3. What type of design is appropriate for collecting the desired data? (Ex: If we are testing the effectiveness of a new headache medicine, how do we decide who gets which medicine, how to measure effectiveness, whom to include in the experiment, etc.?)

4. How should these variables be measured? (Ex: What devices need to be used to measure the amount of fluoride in water?)

5. How many individual subjects should be measured? (Ex: To compare the height/weight guesses between males and females, how many guesses do we need?)

6. How do we choose a sample? (Ex: First n trees? Randomly? Phone book? etc.)

7. Is the sample representative of the population about which we want to make inferences?

The importance of this last question stems from the idea that we always want to apply conclusions drawn from the sample to the population.

Example: Suppose we want to study the effects of Agent Orange on troops in Vietnam. A study was done by the Centers for Disease Control (1987) which looked at dioxin concentrations (in ppt) in Vietnam vets. The researchers wanted to see if there were higher dioxin concentrations in Vietnam veterans than in other types of war veterans.

• The goal of the study is clearly stated. The researchers wanted to see whether or not the dioxin concentrations in Vietnam vets were unusually high due to Agent Orange.
• What is the variable of interest?

• What type of design should be employed to collect these data?

• How were the samples chosen and how many veterans were used?

There are three basic types of studies performed:

1. Surveys - instruments for collecting data on existing conditions or opinions. These can be done as written questionnaires or in the form of an interview. Either of the two types of studies below can be conducted in the form of a survey.

2. Scientific - gathering data while controlling for other factors (active). (Ex: We might want to compare the germination rates for a number of different types of plants, controlling for moisture levels in the soil. In other words, we would keep the moisture levels the same for all plants in the study, so that any differences in germination rates between plants are due to real differences, not to differences in soil moisture.)

3. Observational - gathering data where influencing factors cannot be controlled (passive). These are used in place of scientific studies because certain factors simply cannot be controlled. (Ex: Vietnam vets, smoking vs. lung cancer)

Surveys (2.2): Who conducts them? Why?

• Census Bureau - surveys conducted to decide on allocation of federal funds, urban planning, etc.

• Bureau of Labor Statistics - to establish the consumer price index (CPI), unemployment figures, etc.

• Opinion polls - CNN, Nielsen ratings for television, Gallup polls, etc.

• University Center (UC) - conducted a survey on attitudes toward new UC businesses.

Some Common Sampling Strategies:

1. Opportunistic Sample: An example of this might be to stand in front of the UC and sample students as they walk by on their attitudes toward new businesses. Problems?

2. Judgmental Sample: trying to subjectively select a representative sample from the population. Problems?

3. Probability Sample: samples are drawn by some probabilistic mechanism (i.e.: random sample).
Definition (4.13 - Section 4.11): A sample of n measurements selected from a population is said to be a random sample if every different sample of size n from the population has the same chance of being selected.

Types of Random Samples:

(a) Simple Random Sample (SRS): A sample of size n selected from a population in such a way that each possible sample is equally likely to be chosen.

Ex: Interested in the opinion of Missoula voters on whether or not the US should end the 1981 moratorium on off-shore oil drilling. We might obtain a voter registration list for Missoula and randomly select 100 Missoula voters to interview.

(b) Stratified Random Sample: Suppose the population can be divided into 2 or more groups (strata) based on some other variable (such as political affiliation or gender). A stratified random sample selects an SRS from each group.

Ex: We might select n1 = 50 people from the Democratic party, n2 = 50 people from the Republican party, and n3 = 10 people from the remainder of voters, with each sample selected at random.

(c) Cluster Sample: Suppose it is inconvenient to sample completely at random. We might first select clusters of subjects, and then survey every subject in these clusters. This is generally done purely for logistical reasons.

Ex: Instead of interviewing 100 randomly chosen adults all over the city, we might randomly select blocks and interview all adults on those blocks. The clusters here are the blocks.

(d) Systematic Random Sample: If a list of subjects is available, a systematic sample selects every mth subject from the list until a sample of size n is obtained. This is done for convenience and is often easier than an SRS.

Why is the word "random" used?

Now that we know some different types of random sampling, we need to address various data collection strategies.
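Before turning to data collection strategies, the four random-sampling schemes above can be sketched with Python's standard library. The population of 1000 "voter IDs", the strata sizes, and the city-block grouping below are hypothetical illustrations, not data from the handout.

```python
import random

random.seed(1)
population = list(range(1000))  # hypothetical list of 1000 voter IDs

# (a) Simple random sample: every sample of size n is equally likely.
srs = random.sample(population, 100)

# (b) Stratified random sample: an SRS drawn within each stratum.
strata = {"Dem": population[:450], "Rep": population[450:900],
          "Other": population[900:]}
stratified = {name: random.sample(group, n)
              for (name, group), n in zip(strata.items(), [50, 50, 10])}

# (c) Cluster sample: randomly choose whole blocks, keep everyone in them.
blocks = [population[i:i + 50] for i in range(0, 1000, 50)]  # 20 "city blocks"
chosen_blocks = random.sample(blocks, 2)
cluster_sample = [person for block in chosen_blocks for person in block]

# (d) Systematic sample: every mth subject from the list, random start.
m = 10
start = random.randrange(m)
systematic = population[start::m]

print(len(srs), sum(len(s) for s in stratified.values()),
      len(cluster_sample), len(systematic))  # → 100 110 100 100
```

Note that only (a) makes every sample of size n equally likely; (b)-(d) are random but restrict which samples can occur.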
Some advantages and disadvantages of four such strategies are considered in the following table:

Data Collection Strategy     Advantages     Disadvantages
1. Personal Interviews
2. Telephone Interviews
3. Questionnaires
4. Direct Observation

Scientific Studies (2.3): studies where randomization is used to control for other factors. These types of studies are covered in depth in Math 445, with a brief overview given here of some of the more important designs for these studies.

Experimental Designs: generally for comparing and detecting statistical differences in t different "treatments".

Ex: Suppose we want to compare t = 3 fertilizers (A, B, C) in terms of their effect on corn production in bushels. Consider a piece of land separated into 9 plots, as shown to the right. How?

• We want to randomly assign the treatments to these plots.

• Since randomization occurs over the complete piece, this is called a completely randomized design (CRD).

• Now, suppose a river runs at the base of the field, which might affect corn production.

• Note: More C fertilizer is near the river than A or B ⇒ wetter conditions for C.

• If we find a difference in the three fertilizers, is it real or is it due to different soil moisture levels? How do we get around this?

• Once homogeneous blocks are identified, the treatments are randomly assigned within each block. This is known as a randomized block design (RBD) or randomized complete block design (RCBD).

• These designs guarantee the same number of each treatment per block, controlling for the "moisture factor".

• Now, suppose there is a significant elevation gradient running along the river. Note that fertilizer B tends to be at higher elevations. Any differences in corn production for the three fertilizers could be due to elevation. How do we control for elevation differences?

• Note: Each treatment appears exactly once in each wetness group and in each elevation group.
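The arrangement just described, with each fertilizer appearing exactly once in each wetness group and once in each elevation group, can be sketched as a 3 x 3 square. This is a minimal illustration, assuming rows correspond to elevation groups and columns to wetness groups:

```python
import random

random.seed(6)
treatments = ["A", "B", "C"]

# Start from a cyclic 3 x 3 square: each fertilizer appears once per
# row (elevation group) and once per column (wetness group).
square = [[treatments[(r + c) % 3] for c in range(3)] for r in range(3)]

# Randomizing the rows and columns preserves that property.
random.shuffle(square)                     # permute the rows
cols = random.sample(range(3), 3)          # permute the columns
square = [[row[c] for c in cols] for row in square]

for row in square:
    print(row)
```

Any such randomized square keeps each treatment balanced across both factors at once.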
• Controlling for two factors simultaneously in this fashion is known as a Latin Square Design.

Some Other Design Types: Split Plot Design, Nested Design, Repeated Measures Design, Factorial Experiments, and many others.

Factorial Experiments: experiments for studying the relationship between 2 or more factors where every possible combination of factor levels is included.

• Very common in industrial experiments and greenhouse experiments.

Ex: Suppose a steel company is studying the effect of carbon content and tempering temperature on the strength of steel.

• 2 Factors: (i) Carbon content (High, Medium, Low) (ii) Temperature (High, Low)

=⇒ 6 Treatments: (CC-High, Temp-Low), (CC-High, Temp-High), (CC-Med, Temp-Low), (CC-Med, Temp-High), (CC-Low, Temp-Low), (CC-Low, Temp-High).

• We might have 18 production batches and randomly assign the 6 treatments, 3 batches each, to the 18 batches.

Ex: A human growth hormone is given to growth hormone deficient children. A researcher wants to study the effects of gender (M, F) and bone development (severely, moderately, mildly depressed) on the difference in growth rate with and without the hormone (cm/month).

• 2 Factors: (i) Gender (M, F) ⇒ 2 levels. (ii) Bone Development (Se, Mo, Mi) ⇒ 3 levels.

• Response Variable?

There were between 1 and 3 children used for each of the 6 factor-level combinations. The mean difference in growth rate (measured in cm/month) for each of the 6 gender-bone development combinations is given in the table below.

                         Bone Development
Gender   Severely Depressed   Moderately Depressed   Mildly Depressed
M               2.0                   1.9                  0.9
F               2.4                   2.1                  0.9

Do you notice any trends in this table? A plot representing these data is given to the right. What does this plot tell us?

• The mean GRD is larger for children with more severely depressed bone development.

• The mean GRD is slightly larger for female children.

• The difference in mean GRD for females and males is larger for severely depressed bone development children.
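The last trend above can be checked directly from the table: compute the female-minus-male gap at each bone-development level. A minimal sketch, using only the six table values:

```python
# Mean growth-rate differences (cm/month) from the table above.
means = {
    ("M", "Severe"): 2.0, ("M", "Moderate"): 1.9, ("M", "Mild"): 0.9,
    ("F", "Severe"): 2.4, ("F", "Moderate"): 2.1, ("F", "Mild"): 0.9,
}

# Female-minus-male gap at each bone-development level.
gaps = {level: round(means[("F", level)] - means[("M", level)], 2)
        for level in ["Severe", "Moderate", "Mild"]}
print(gaps)  # → {'Severe': 0.4, 'Moderate': 0.2, 'Mild': 0.0}
```

If the two factors acted independently, these three gaps would all be equal; instead the gap shrinks from 0.4 to 0.2 to 0.0.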
=⇒ Interaction between gender and bone development on the mean GRD (i.e.: the effect of one factor (gender) on the response (mean GRD) is not the same for different levels of the 2nd factor (bone development)).

• What would the picture look like if there were no interaction between the factors?

Observational Studies (2.4): studies where it is not possible to control for all other factors.

Ex: Establishing a link between smoking and lung cancer. How might you set up a scientific study to examine whether or not smoking leads to lung cancer? Other possible factors?

• The fundamental difference between an experimental (scientific) study and an observational study is in the inferences that can be drawn.

Ex: We say: "There is an association between smoking and lung cancer." We do not say: "Smoking causes lung cancer."

• So: experimental study =⇒ cause & effect relationship; observational study =⇒ associative relationship.

In all of these types of experiments or sampling regimes, there are a number of commonly used terms of which you should be aware. Some of these are given below.

Terms Used in Survey Studies:

A population is the entire group of objects or people about which information is desired.

A sample is a representative part or subset of the population on which measurements are taken to obtain information about the population as a whole. The actual subjects in the sample are known as sampling units or experimental units.

If we sample every individual in a population, this is known as a census (usually too expensive or impractical).

We discussed a number of methods for taking samples, although some of these really illustrated how NOT to take a sample. Opportunistic samples and judgmental samples are generally bad sampling methods which can introduce unforeseen bias into the study. These methods are typically used for convenience. Another poor sampling method is what is known as voluntary response.
With this method, response to the survey is voluntary, so that only those with strong opinions tend to respond. The resulting sample is then not generally representative of the population.

Example: Ann Landers asked the question: "If you had it to do all over again, would you have kids?" 70% of the respondents said "No." However, a properly conducted statistical poll found that 91% of those with children answered "Yes" to the question.

As a basic goal for any type of sampling, we want to avoid bias; i.e.: we want all possible samples of a given size in the population to have an equal chance of being chosen as our sample. This idea of allowing all possible samples the same chance of selection is known as randomization.

Earlier in this handout, we looked at four basic sampling designs which make use of randomness:

1. Simple Random Sampling
2. Stratified Random Sampling
3. Cluster Sampling
4. Systematic Sampling

The specifics of these methods were discussed earlier. One important aspect to notice about the samples resulting from each of these methods is that we would get a different sample if we repeated the survey (replication), and hence we would likely not get the same results. The fact that new results are obtained with each new sample is known as sampling variability. This concept will be revisited later in the course.

There are a number of PROBLEMS with sampling that were mentioned earlier in this handout. These problems are listed and defined below:

1. Undercoverage - We need an accurate list of the population from which to sample. Such a list is usually not available, so we may miss important segments of the population. [Ex: If our population is all people in the US, and we sample households, we are missing the homeless, prison inmates, apartment dwellers, etc.]

2. Nonresponse - This occurs when a person in the sample does not cooperate or cannot be reached.

3. Biased Responses - People will answer incorrectly or lie intentionally if there are sensitive questions related to subjects such as drug/alcohol abuse, cheating on taxes, etc.

4. Wording of Questions - Questions are often leading or worded in a confusing manner, prompting a wrong response.

Terms Used in Scientific Studies (Studied in Depth in Chapter 14):

An experiment is any study where some treatment is imposed on the experimental units (subjects) in order to observe a response.

Experimental units are the objects on which the experiment is performed. Generally, if these objects are people, they are referred to as subjects. They are also sometimes called sampling units.

A treatment is a specific experimental condition applied to units in the sample.

Typically, in addition to having a number of treatment groups where each subject in the group receives a specific treatment, there is a group of experimental units (or subjects) which does not receive any treatment. This group is known as a control group. The point of a control group is to have a basis to which the treatment groups are compared, thus avoiding what is known as the "placebo effect". In medical studies in particular, people will claim improvement from a drug whether it really helped them or not. By introducing a placebo or control into the experiment, the effect of such responses can be measured. This effect of people claiming improvement even when they have only received the placebo is known as the "placebo effect."

Once we know what type of experiment or scientific study to conduct, we must determine which experimental units should receive each treatment. The use of chance to allocate experimental units to treatments is known as randomization. If we make such assignments subjectively, we may introduce bias unknowingly into the study.

The variable of interest in a given study is known as the response variable.
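The use of chance to allocate experimental units, as described above, can be sketched in a few lines. The 20 subject IDs and the two equal-sized groups below are hypothetical illustrations:

```python
import random

random.seed(7)
subjects = list(range(1, 21))   # 20 hypothetical subject IDs

# Randomization: shuffle the subjects, then split them, so that chance
# rather than the experimenter's judgment decides group membership.
random.shuffle(subjects)
groups = {"treatment": subjects[:10], "control": subjects[10:]}

print({g: sorted(ids) for g, ids in groups.items()})
```

Because assignment is by chance, neither group is systematically "healthier" or otherwise different before the treatment is applied.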
Variables which help explain the response variable in the study are known as factors or explanatory variables. If these variables are categorical, there will be several levels for each factor (Factorial Experiments). We will usually want to look at several factors simultaneously.

Ex: Suppose a researcher wants to study the effect of arsenic, phosphorous, and the presence of mycorrhizal fungi on the growth of some plant species [Julie Knudsen, DBS, 1998 Master's Thesis]. From many seeds of this type, some number might be randomly selected for an experiment. The response variable for this problem might be plant biomass. Of the three factors mentioned (arsenic, phosphorous, mycorrhizae), three levels of arsenic (high, medium, low), three levels of phosphorous (H, M, L), and two levels of mycorrhizae (presence, absence) were considered. Recall that this is the setup of a factorial experiment where we would look at all 3 x 3 x 2 = 18 possible combinations of the factor levels. This allows us to study the interaction between the factors. As an example, we might use 36 plants, giving two plants for each of the 18 treatments.

Earlier in the handout, we discussed three basic types of experimental designs, and many others were mentioned. Some of these are:

1. Completely Randomized Design (CRD)
2. Randomized Block Design (RBD or RCBD)
3. Latin Square Design
4. Split Plot Design, Repeated Measures, Nested Design

There are a few basic principles of any experimental design:

1. Control - We want to control for the effects of other variables (or factors) on the response variable.

2. Randomization - the use of chance to allocate treatments.

3. Replication - We want to repeat the experiment on a large enough number of experimental units to see any treatment effects. Why is replication important?
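Why replication matters can be seen with a small simulation: the more experimental units per treatment, the less the observed treatment difference bounces around its true value. The true means (10 and 11) and the noise level below are hypothetical illustrations:

```python
import random
import statistics

random.seed(8)

def observed_gap(n):
    # True treatment means differ by 1.0; unit-to-unit noise has sd 3.
    a = [10.0 + random.gauss(0, 3) for _ in range(n)]
    b = [11.0 + random.gauss(0, 3) for _ in range(n)]
    return statistics.mean(b) - statistics.mean(a)

# Repeat the whole experiment 200 times at each replication level and
# record how much the observed gap varies from run to run.
spreads = {}
for n in (3, 30, 300):
    gaps = [observed_gap(n) for _ in range(200)]
    spreads[n] = round(statistics.stdev(gaps), 2)
print(spreads)
```

With only 3 units per treatment, the run-to-run spread of the observed gap is comparable to the true 1.0 difference itself; with more replication the spread shrinks and the real effect stands out.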
Sampling and Statistical Inference

A primary goal of sampling methods is statistical inference, that is, to draw conclusions about the population based on the sample. This is typically done by statistically comparing summary statistics computed from the data to population characteristics.

A parameter is a number which describes the population. The two most common numerical descriptions are the population mean and population standard deviation, denoted by µ and σ respectively. Parameters are generally assumed to be unknown.

A statistic is a number computed from the sample data. The two most common statistics computed from a given data set are the sample mean and sample standard deviation, denoted by ȳ and s respectively.

In statistical inference, we use statistics to estimate parameters. More about the sampling variability and sampling distributions associated with these statistics will be given later.

One other major goal in designing an experiment or study is the minimization of bias. Bias is roughly defined as a systematic error in the estimation process that causes the statistic being measured to regularly miss the true parameter value in the same direction. Randomization guards against some types of bias (namely selection bias), but not all. Whenever we believe we have effectively eliminated any bias from our study, the resulting statistic is said to be unbiased for the parameter it is trying to estimate.

As a final note on bias and how it pertains to the notions of accuracy and precision in an experiment, the distinction between these terms should be clearly identified. Suppose we repeat an experiment many times and compute some summary statistic for each experiment. For a statistic to be precise means that there is little sampling variability in the statistic (i.e.: each time we repeat the experiment we get nearly the same value of the statistic).
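Computing the two most common statistics is a one-liner each with Python's standard library. The six measurements below are hypothetical illustrations, not data from the handout:

```python
import statistics

y = [2.0, 2.4, 1.9, 2.1, 0.9, 0.9]   # hypothetical sample measurements

y_bar = statistics.mean(y)   # sample mean ȳ, estimates µ
s = statistics.stdev(y)      # sample standard deviation s, estimates σ

print(round(y_bar, 3), round(s, 3))  # → 1.7 0.642
```

Note that `statistics.stdev` uses the sample (n - 1) divisor, which is the version of s used for inference.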
For a statistic to be accurate means that on average, the value of the statistic is "close" to the true population parameter it is estimating. The term accuracy actually includes both sampling variability and bias: a statistic with low bias and high precision is said to be accurate.

In summary then,

1. Precision refers to the level of variability in a study.

2. Accuracy refers to the level of bias & variability in a study.

It was mentioned throughout this handout that randomization is one way to control the bias in an experiment. How can the variability be controlled?
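The bias/precision distinction can be illustrated with a small simulation: two estimators of a known population mean are each recomputed over many repeated samples. The population, sample sizes, and the artificial +5 bias below are hypothetical illustrations:

```python
import random
import statistics

random.seed(5)
mu = 50.0
population = [random.gauss(mu, 10) for _ in range(20_000)]

def repeat(estimator, reps=200, n=40):
    # Redo the "experiment" reps times, recording the statistic each time.
    return [estimator(random.sample(population, n)) for _ in range(reps)]

unbiased = repeat(statistics.mean)                   # accurate estimator
biased = repeat(lambda smp: statistics.mean(smp) + 5)  # precise but biased

for name, vals in [("unbiased", unbiased), ("biased", biased)]:
    print(name, round(statistics.mean(vals), 1), round(statistics.stdev(vals), 1))
```

Both estimators have about the same precision (run-to-run spread), but the biased one systematically misses µ in the same direction, so it is precise without being accurate.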