CSS 590 Experimental Design in Agriculture
Lab exercise – 9th week
Suggested reading:
Multiple Comparison Tests: Petersen 153-162; Kuehl 91-117
Across Site Analysis: Petersen, Chapt. 6
Part I. Mean Comparison Tests
In some studies, there is no obvious structure among the treatments, yet the
experimenter wishes to distinguish the “best” or the “worst” treatments in terms of a
common response variable. Examples would include variety trials in a breeding program,
a study of the effects of a number of toxins, or evaluation of an array of products on the
Open the data set Lab9_comparisons.xls and save it to your desktop. The data
represent yields of seven varieties evaluated in an RBD with four replications. Input the
data into SAS and conduct an analysis of variance using PROC GLM. Requests for
multiple comparison tests can be made using the MEANS statement. (Note: the lecture
material includes a spreadsheet using this data set, showing how many of these tests
PROC GLM;
TITLE 'multiple comparison tests';
CLASS rep variety;
MODEL Yield = rep variety;
MEANS variety/LSD Duncan SNK Tukey Waller Bon Scheffe;
RUN;
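The PROC GLM step above assumes the spreadsheet has already been read into a SAS data set. One way this might be done is sketched below. The file path and the data set name comparisons are hypothetical, and DBMS=XLS assumes that SAS/ACCESS Interface to PC Files is installed; the column names rep, variety, and Yield are taken from the MODEL statement above.

```sas
/* Sketch only: the path and the data set name are assumptions. */
PROC IMPORT DATAFILE="C:\Users\yourname\Desktop\Lab9_comparisons.xls"
    OUT=comparisons      /* hypothetical data set name */
    DBMS=XLS             /* assumes SAS/ACCESS Interface to PC Files */
    REPLACE;
    GETNAMES=YES;        /* read variable names from the first row */
RUN;

/* Print the data to confirm that rep, variety, and Yield were read correctly. */
PROC PRINT DATA=comparisons;
RUN;
```

Because PROC GLM uses the most recently created data set when no DATA= option is given, running the import immediately before the analysis should be sufficient.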
Compare these tests using the following criteria:
1) Which test(s) control only the comparisonwise error rate, and not the
experimentwise error rate?
2) Which use a single value to compare all means? Which use multiple values?
3) For each test, how many significant comparisons were obtained? What might
this suggest in terms of the control of experimentwise error (Type I) and
power to detect differences? (We cannot answer this question definitively
because we do not know the true magnitude of differences among the
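A rough worked calculation may help show why the distinction between comparisonwise and experimentwise error rates matters for seven varieties. The symbols \alpha_C and \alpha_E are introduced here only for illustration. The calculation assumes the pairwise comparisons are independent, which they are not (they share means and a common error term), so the result is only an approximation.

```latex
% Number of pairwise comparisons among t = 7 varieties
\[ m = \binom{7}{2} = \frac{7 \times 6}{2} = 21 \]

% Approximate experimentwise Type I error rate when each comparison
% is tested at a comparisonwise rate of alpha_C = 0.05
\[ \alpha_E \approx 1 - (1 - \alpha_C)^{m} = 1 - (0.95)^{21} \approx 0.66 \]

% Bonferroni: comparisonwise rate needed to hold alpha_E at or below 0.05
\[ \alpha_C = \frac{0.05}{21} \approx 0.0024 \]
```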
Some statisticians recommend that we report confidence intervals rather than the
results of mean comparison tests. To obtain 95% confidence intervals of means and
differences among means using the LSD:
MEANS variety/LSD CLM CLDIFF;
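For reference, the intervals requested by CLM and CLDIFF in a balanced RBD can be written out as follows. The notation (MSE for the error mean square, r for the number of replications, df_E for the error degrees of freedom) is assumed here rather than taken from the handout; r = 4 and df_E = 18 follow from the design described above.

```latex
% Error degrees of freedom for an RBD with r = 4 reps and t = 7 varieties
\[ df_E = (r - 1)(t - 1) = (4 - 1)(7 - 1) = 18 \]

% Least significant difference at alpha = 0.05
\[ \mathrm{LSD} = t_{0.025,\,18} \sqrt{\frac{2\,\mathrm{MSE}}{r}} \]

% 95% confidence interval for a variety mean (CLM)
\[ \bar{y}_i \pm t_{0.025,\,18} \sqrt{\frac{\mathrm{MSE}}{r}} \]

% 95% confidence interval for a difference between two means (CLDIFF)
\[ (\bar{y}_i - \bar{y}_j) \pm \mathrm{LSD} \]
```

A difference is declared significant by the LSD exactly when its interval excludes zero.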
If the data were unbalanced, we could use an LSMEANS statement to obtain multiple
comparison tests. The pdiff option alone will give all possible t tests among means,
taking into account the differences in standard errors due to unequal replication. Various
adjustments for multiple comparisons can also be requested, such as Tukey’s test.
LSMEANS variety/pdiff adjust=tukey;
Assume that the variety ‘Sitka’ is the control. Try using the following statements to
obtain all possible two-tailed comparisons with the control. The first statement will give
results for Dunnett’s test, which, like Tukey’s test, controls the experimentwise error
rate. The second will give unadjusted t tests (equivalent to LSD tests), and the third
will provide Bonferroni-adjusted probabilities.
MEANS variety/Dunnett ('Sitka');
LSMEANS variety/pdiff=CONTROL('Sitka') adjust=T;
LSMEANS variety/pdiff=CONTROL('Sitka') adjust=bon;
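The statements above are not complete programs; they belong inside a PROC GLM step after the MODEL statement. A minimal sketch of one possible arrangement follows. The data set name comparisons is hypothetical; the variables rep, variety, and Yield are those used in the earlier MODEL statement.

```sas
PROC GLM DATA=comparisons;   /* 'comparisons' is an assumed data set name */
    CLASS rep variety;
    MODEL Yield = rep variety;
    /* Dunnett's two-tailed test of each variety against the control */
    MEANS variety / DUNNETT('Sitka');
    /* Unadjusted t tests against the control (equivalent to LSD) */
    LSMEANS variety / PDIFF=CONTROL('Sitka') ADJUST=T;
    /* Bonferroni-adjusted probabilities for the same comparisons */
    LSMEANS variety / PDIFF=CONTROL('Sitka') ADJUST=BON;
RUN;
QUIT;
```

Keeping all three requests in one step makes it easier to compare the results side by side in the output.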
Because varieties were selected for increased yield, it would be reasonable to perform
one-sided tests to compare all treatments with a control. The Dunnettu option can be
used to obtain a one-tailed comparison of all means with the control (the last ‘u’ in
Dunnettu stands for the ‘upper’ tail). You can also use the LSMEANS statement with the
pdiff=controlu option to obtain one-tailed comparisons of all means with a control.
MEANS variety/Dunnettu ('Sitka');
LSMEANS variety/pdiff=CONTROLU('Sitka') adjust=bon;
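To calculate an LSI (least significant increase) for a one-tailed comparison to a control,
you can request an LSD test with alpha=0.1. Because the LSD is two-tailed, this places
0.05 in each tail, so the resulting value serves as a one-tailed critical difference at the
0.05 level. You should only consider comparisons with varieties that have higher
yields than Sitka.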
MEANS variety/LSD alpha=0.1;
How do the results of these tests of variety vs control compare?
Part II. Across Site Analysis
Seven rice varieties were evaluated at two locations (representing the ecologies in which
the varieties might be grown) in an RBD with three replications. Do you think that
Varieties should be a random or fixed effect? Are locations random or fixed?
Enter the data on the Lab9_sites.xls spreadsheet into SAS. First, run an analysis for each
of the locations separately:
Proc Sort data=sites; By Location; Run;
Proc GLM data=sites; By Location;
Class Rep Variety;
Title 'Analysis by Location';
Model Yield = Rep Variety;
Run;
What could you do to verify that the Error Mean Squares are homogeneous across
locations? Are the differences among varieties significant at each location?
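With only two locations, one common way to compare the error variances is the ratio of the larger to the smaller Error Mean Square. This is offered as a sketch of one possible approach, not necessarily the one intended in the lecture. The subscripts below are illustrative notation; the degrees of freedom follow from seven varieties and three replications at each location.

```latex
% Error degrees of freedom within each location (r = 3 reps, t = 7 varieties)
\[ df_E = (r - 1)(t - 1) = (3 - 1)(7 - 1) = 12 \]

% Ratio of the larger to the smaller Error Mean Square
\[ F = \frac{\mathrm{MSE}_{\text{larger}}}{\mathrm{MSE}_{\text{smaller}}} \]

% Two-tailed test at level alpha: compare F with the upper alpha/2 critical value
\[ F > F_{\alpha/2;\,12,\,12} \;\Rightarrow\; \text{the error variances appear heterogeneous} \]
```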
Run the across site analysis using PROC GLM, assuming that the varieties are a fixed
effect and the Reps and Locations are random effects (note that interactions with Reps
and Locations will also be considered to be random effects.) You do not need to specify
the appropriate error terms, because the ‘Random../Test’ statement in SAS will figure
that out for you. However, SAS uses slightly different rules for determining Expected
Mean Squares than are recommended by some statisticians – as a result the test for
Locations will be approximate rather than direct (we used a direct test in the example
given in class.)
Proc GLM data=sites;
Class Location Rep Variety;
Title 'GLM Analysis across locations';
Model Yield = Location Rep(Location) Variety Location*Variety;
Random Location Rep(Location) Location*Variety/Test;
LSmeans Location Variety Location*Variety/stderr;
Run;
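As a check on the output, the degrees of freedom for this design (l = 2 locations, r = 3 replications, v = 7 varieties) can be worked out by hand. The symbols l, r, and v are introduced here for illustration. The F ratios shown are the usual ones when Varieties are fixed and Locations are random; the test for Locations is omitted because, as noted above, it depends on which rules are used for the Expected Mean Squares.

```latex
% Degrees of freedom for each source of variation
\begin{align*}
\text{Location}                 &: \; l - 1 = 1 \\
\text{Rep(Location)}            &: \; l(r - 1) = 2 \times 2 = 4 \\
\text{Variety}                  &: \; v - 1 = 6 \\
\text{Location} \times \text{Variety} &: \; (l - 1)(v - 1) = 6 \\
\text{Error}                    &: \; l(r - 1)(v - 1) = 2 \times 2 \times 6 = 24 \\
\text{Total}                    &: \; lrv - 1 = 41
\end{align*}

% F ratios with Varieties fixed and Locations random
\[ F_{L \times V} = \frac{MS_{L \times V}}{MS_{E}} \quad (6,\ 24\ \text{df}) \]
\[ F_{V} = \frac{MS_{V}}{MS_{L \times V}} \quad (6,\ 6\ \text{df}) \]
```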
Are there significant variety x location interactions? What is your proof? Are there
differences among the means for varieties across sites?
Write a program and run the across site analysis using PROC MIXED.
Proc Mixed data=sites;
Title 'Mixed Model Analysis across locations (random)';
Class Location Rep Variety;
Model Yield = Variety;
Random Location Rep(Location) Location*Variety;
LSmeans Variety / pdiff;
Run;
How would you modify the program for the situation where both Locations and Varieties
were assumed to be fixed? Run the modified program and compare the output to your