# Topic 6: Two-way designs: Randomized Complete Block and Latin Square


```
Lecture 7
Randomized Complete Block Design (RCBD)
[ST&D sections 9.1 – 9.7 (except 9.6) and section 15.8]

The Completely Randomized Design (CRD)
1. It is assumed that all experimental units (EUs) are uniform.

2. Treatments are randomly assigned to EUs such that each treatment occurs equally
often in the experiment. (1 randomization per experiment)

3. It is advocated to include as much of the native variability of the experiment as
possible within each EU.

4. When EUs are not uniform, experimental error (MSE) increases, F (MST/MSE)
decreases, and the experiment loses sensitivity. If the experiment is replicated in a
variety of situations to increase its scope, the variability increases even further.

The Randomized Complete Block Design (RCBD)
1. The population of EUs is divided into a number of relatively homogeneous
subpopulations or blocks, and it is assumed that all EUs within a given block are
uniform.

2. Within each block, treatments are randomly assigned to EUs such that each treatment
occurs equally often (usually once) in each block. (1 randomization per block)

3. It is advocated to minimize the native variability as much as possible within blocks
and to maximize the native variability as much as possible among blocks.

4. Variation among blocks can be partitioned out of the experimental error (MSE),
thereby reducing this quantity and increasing the power of the test. Additional
variability introduced when increasing the scope of the experiment can also be
partitioned out of the MSE.

Blocks usually represent levels of naturally-occurring differences or sources
of variation that are unrelated to the treatments, and the characterization
of these differences is not of interest to the researcher.

Example: A field trial comparing three cultivars (A, B, and C) of sugar beet with four
replications.

North end of field             Hi N
1            2            3
4            5            6
7            8            9
10           11           12
South end of field             Low N

CRD: One randomization per experiment
North end of field             Hi N
B            C            B
C            A            C
A            B            A
C            B            A
South end of field             Low N

RCBD: One randomization per block

Block              North end of field              Hi N
1            1           2            3

2            1           2            3

3            1           2            3

4            1           2            3
South end of field          Low N

Block              North end of field              Hi N
1            B           A            C

2            A           B            C

3            A           C            B

4            A           C            B
South end of field          Low N
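
/* A minimal sketch of the RCBD randomization shown above (one randomization
   per block), assuming SAS/STAT Proc Plan is available. The seed value and
   the dataset name RCBDPlan are arbitrary choices made for illustration,
   and cultivars A, B, and C are coded 1, 2, and 3. */
Proc Plan Seed = 20120727;
Factors Block = 4 ordered Cultivar = 3 random;   * Random order of the 3 cultivars within each block;
Output out = RCBDPlan;
Run;
Proc Print Data = RCBDPlan;
Run;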

The linear model
The model underlying each observation in the experiment:

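$$ Y_{ij} = \mu + \tau_i + \beta_j + \varepsilon_{ij} $$

$$ Y_{ij} = \bar{Y}_{..} + (\bar{Y}_{i.} - \bar{Y}_{..}) + (\bar{Y}_{.j} - \bar{Y}_{..}) + (Y_{ij} - \bar{Y}_{i.} - \bar{Y}_{.j} + \bar{Y}_{..}) $$

And the sum of squares:

$$ \sum_{i=1}^{t} \sum_{j=1}^{r} (Y_{ij} - \bar{Y}_{..})^2 = r \sum_{i=1}^{t} (\bar{Y}_{i.} - \bar{Y}_{..})^2 + t \sum_{j=1}^{r} (\bar{Y}_{.j} - \bar{Y}_{..})^2 + \sum_{i=1}^{t} \sum_{j=1}^{r} (Y_{ij} - \bar{Y}_{i.} - \bar{Y}_{.j} + \bar{Y}_{..})^2 $$

$$ TSS = SST + SSB + SSE $$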

This partitioning of variance is possible because the sums of squares of
treatments, blocks, and error are orthogonal to one another.

This orthogonality is a direct result of the completeness of the block design.

CRD

Source          df                SS                 MS                       F
Total           rt - 1            TSS
Treatments      t - 1             SST                SST/(t - 1)              MST/MSE
Error           t(r - 1)          TSS - SST          SSE/[t(r - 1)]

RCBD (one replication per block-treatment combination)

Source          df                SS                 MS                       F
Total           rt - 1            TSS
Treatments      t - 1             SST                SST/(t - 1)              MST/MSE
Blocks          r - 1             SSB                SSB/(r - 1)
Error           (t - 1)(r - 1)    TSS - SST - SSB    SSE/[(t - 1)(r - 1)]

1. The RCBD has (r - 1) fewer error degrees of freedom ($df_e$) than the CRD.

2. If there are no differences among blocks (SSB = 0), $MSE_{CRD} < MSE_{RCBD}$.

3. If there are large enough differences among blocks (SSB >> 0), $MSE_{CRD} > MSE_{RCBD}$.

Example: An experiment was conducted to investigate the effect of estrogen on weight gain in
sheep. The treatments are combinations of sex of sheep (M, F) and level of estrogen (Est0, Est3).
The sheep are blocked by ranch, with one replication of each treatment level at each ranch.

Ranch
Trtmt          1          2           3       4

M Est0

M Est3

F Est0

F Est3

Effect of estrogen on weight gain in sheep (lbs).

                     Ranch (i.e. block)                Treatment   Treatment
Treatment        I        II       III        IV         Total        Mean
F-S0            47        52        62        51          212          53
M-S0            50        54        67        57          228          57
F-S3            57        53        69        57          236          59
M-S3            54        65        74        59          252          63
Block Total    208       224       272       224          928
Block Mean      52        56        68        56                       58

CRD ANOVA (treating blocks as reps)

Source                  df    SS          MS       F
Total                   15    854
Treatment                3    208     69.33     1.29 NS
Error                   12    646     53.83

RCBD ANOVA

Source                  df    SS          MS       F
Total                   15    854
Treatment                3    208     69.33      8.91**
Blocks                   3    576    192.00     24.69**
Error                    9     70      7.78
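
/* A sketch for reproducing the two ANOVA tables above from the sheep data.
   The dataset name Sheep and the treatment codes F0, M0, F3, and M3 are
   choices made here for illustration. */
Data Sheep;
Input Trtmt $ @@;
Do Ranch = 1 to 4;
   Input Gain @@;
   Output;
End;
Cards;
F0 47 52 62 51
M0 50 54 67 57
F3 57 53 69 57
M3 54 65 74 59
;
Proc GLM Data = Sheep;            * CRD analysis with ranches treated as replications;
Class Trtmt;
Model Gain = Trtmt;
Proc GLM Data = Sheep;            * RCBD analysis with ranch entered as the block term;
Class Ranch Trtmt;
Model Gain = Ranch Trtmt;
Run;
Quit;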

Relative efficiency: When to block?

$$ F = \frac{MST}{MSE} \qquad MSE = \frac{SSE}{df_e} \qquad F_{crit} = F_{\alpha,\, df_{trt},\, df_e} $$

Blocking reduces SSE, which reduces MSE.

Blocking reduces dfe, which increases MSE and increases Fcrit.
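
/* A quick check of the effect on Fcrit, using the degrees of freedom of the
   sheep example (3 treatment df, with 12 error df as a CRD and 9 as an
   RCBD) and an assumed alpha of 0.05. FINV returns the F quantile, so the
   values written to the log are the critical values (about 3.49 and 3.86). */
Data _null_;
Fcrit_CRD  = finv(0.95, 3, 12);
Fcrit_RCBD = finv(0.95, 3, 9);
Put Fcrit_CRD= Fcrit_RCBD=;
Run;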

The concept of relative efficiency formalizes the comparison between two
experimental methods by quantifying this balance between loss of degrees of
freedom and reduction in experimental error.

The information per replication in a given design is:

$$ I = \left( \frac{df_{MSE} + 1}{df_{MSE} + 3} \right) \frac{1}{MSE} $$

$$ RE_{1:2} = \frac{I_1}{I_2} = \frac{\left( \frac{df_{MSE1} + 1}{df_{MSE1} + 3} \right) \frac{1}{MSE_1}}{\left( \frac{df_{MSE2} + 1}{df_{MSE2} + 3} \right) \frac{1}{MSE_2}} = \frac{(df_{MSE1} + 1)(df_{MSE2} + 3)\, MSE_2}{(df_{MSE2} + 1)(df_{MSE1} + 3)\, MSE_1} $$

The main complication is how to estimate MSE for the alternative design.

If an experiment was conducted as an RCBD, $MSE_{CRD}$ can be estimated by the following
formula (ST&D 222):

$$ \widehat{MSE}_{CRD} = \frac{df_B\, MSB_{RCBD} + (df_T + df_e)\, MSE_{RCBD}}{df_B + df_T + df_e} $$

Assume TSS of the two designs is the same.
Rewrite TSS in terms of its components and simplify the expression.

For the interested: Derivation of the expected $MSE_{CRD}$

1. Set the total sums of squares of each design equal to each other and rewrite in terms of mean
squares and degrees of freedom:

$$ TSS_{RCBD} = TSS_{CRD} $$
$$ SST_{RCBD} + SSB_{RCBD} + SSE_{RCBD} = SST_{CRD} + SSE_{CRD} $$
$$ df_{T(R)}\, MST_R + df_{B(R)}\, MSB_R + df_{e(R)}\, MSE_R = df_{T(C)}\, MST_C + df_{e(C)}\, MSE_C $$
$$ (t-1)\, MST_R + (r-1)\, MSB_R + (t-1)(r-1)\, MSE_R = (t-1)\, MST_C + t(r-1)\, MSE_C $$

2. Replace each mean square with the variance components of its expected mean square:

$$ (t-1)(\sigma^2_{e(R)} + r\sigma^2_{T(R)}) + (r-1)(\sigma^2_{e(R)} + t\sigma^2_{B(R)}) + (t-1)(r-1)\sigma^2_{e(R)} = (t-1)(\sigma^2_{e(C)} + r\sigma^2_{T(C)}) + t(r-1)\sigma^2_{e(C)} $$

$$ [(t-1) + (r-1) + (t-1)(r-1)]\sigma^2_{e(R)} + t(r-1)\sigma^2_{B(R)} + r(t-1)\sigma^2_{T(R)} = [(t-1) + t(r-1)]\sigma^2_{e(C)} + r(t-1)\sigma^2_{T(C)} $$

$$ t(r-1)\sigma^2_{B(R)} + (tr-1)\sigma^2_{e(R)} = (tr-1)\sigma^2_{e(C)} $$

$$ \sigma^2_{e(C)} = \sigma^2_{e(R)} + \frac{t(r-1)}{(tr-1)}\sigma^2_{B(R)} $$

3. Finally, rewrite this expression in terms of mean squares and degrees of freedom:

$$ MSE_{CRD} = MSE_{RCBD} + \frac{t(r-1)(MSB - MSE_{RCBD})}{t(tr-1)} $$

$$ MSE_{CRD} = MSE_{RCBD} + (r-1)\frac{MSB}{tr-1} - (r-1)\frac{MSE_{RCBD}}{tr-1} $$

$$ MSE_{CRD} = [(tr-1) - (r-1)]\frac{MSE_{RCBD}}{tr-1} + (r-1)\frac{MSB}{tr-1} $$

$$ MSE_{CRD} = \frac{r(t-1)\, MSE_{RCBD} + (r-1)\, MSB}{tr-1} $$

$$ MSE_{CRD} = \frac{(df_{T(R)} + df_{e(R)})\, MSE_{RCBD} + df_B\, MSB}{df_{T(R)} + df_{B(R)} + df_{e(R)}} $$

Example: From the sheep experiment, $MSE_{RCBD}$ = 7.78 and $MSB_{RCBD}$ = 192.0. Therefore:

$$ \widehat{MSE}_{CRD} = \frac{df_B\, MSB_{RCBD} + (df_T + df_e)\, MSE_{RCBD}}{df_B + df_T + df_e} = \frac{3(192) + (3 + 9)7.78}{3 + 3 + 9} = 44.62 $$

$$ RE_{RCBD:CRD} = \frac{(df_{MSE1} + 1)(df_{MSE2} + 3)\, \widehat{MSE}_{CRD}}{(df_{MSE2} + 1)(df_{MSE1} + 3)\, MSE_{RCBD}} = \frac{(9 + 1)(12 + 3)44.62}{(12 + 1)(9 + 3)7.78} = 5.51 $$

Interpretation: It takes 5.51 replications in the CRD to produce the same amount
of information as one replication in the RCBD. Or, the RCBD is 5.51 times more
efficient than the CRD in this case.
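
/* A sketch of the calculation above as a data step. The variable names are
   chosen here for illustration; the inputs are the degrees of freedom and
   mean squares from the RCBD ANOVA of the sheep data. The values written
   to the log should agree with 44.62 and 5.51 up to rounding. */
Data _null_;
dfT = 3; dfB = 3; dfe_RCBD = 9;          * From the RCBD ANOVA table;
MSB = 192; MSE_RCBD = 7.78;
dfe_CRD = dfB + dfe_RCBD;                * Error df had the experiment been a CRD (12);
MSE_CRD = (dfB*MSB + (dfT + dfe_RCBD)*MSE_RCBD) / (dfB + dfT + dfe_RCBD);
RE = ((dfe_RCBD + 1)*(dfe_CRD + 3)*MSE_CRD) / ((dfe_CRD + 1)*(dfe_RCBD + 3)*MSE_RCBD);
Put MSE_CRD= RE=;
Run;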

1. When there are no significant differences among blocks

[Figure: distributions of the difference between two means under the CRD and the RCBD,
with the rejection regions ($\alpha/2$), $\beta$, and power ($1 - \beta$) marked.
Labels: $SSE_{RCBD} = SSE_{CRD}$; $MSE_{RCBD} > MSE_{CRD}$; Power RCBD < Power CRD.]

2. When there are significant differences among blocks

[Figure: distributions of the difference between two means under the RCBD and the CRD,
with the rejection regions ($\alpha/2$), $\beta$, and power ($1 - \beta$) marked.
Labels: $df_e$ RCBD < $df_e$ CRD; $SSE_{RCBD} < SSE_{CRD}$;
Power RCBD > Power CRD or Power RCBD < Power CRD.]

Assumptions of the model
The model for the RCBD with a single replication per block-treatment combination:

$$ Y_{ij} = \mu + \tau_i + \beta_j + \varepsilon_{ij} $$

1. The residuals ($\varepsilon_{ij}$) are independent, homogeneous, and normally distributed.

2. The variance within each treatment level is homogeneous across all treatment levels.

3. The main effects are additive.

Recall that experimental error is defined as the variation among experimental units that are
treated alike.


There is an expected value for each sheep, given by:

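$$ \text{Expected } Y_{ij} = \mu + \tau_i + \beta_j $$

$$ \text{Observed } Y_{ij} = \mu + \tau_i + \beta_j + \varepsilon_{ij} $$

With only one replication per cell, the residuals are the combined effects of experimental error
and any block*treatment interaction:

$$ \varepsilon_{ij} = \tau_i * \beta_j + error_{ij} $$

So when we use $\varepsilon_{ij}$ as estimates of the true experimental error, we are assuming that $\tau_i * \beta_j = 0$.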

This assumption of no interaction is referred to as the assumption of additivity of
the main effects. If this assumption is violated, all F-tests will be very inefficient
and possibly misleading, particularly if the interaction effect is very large.

Example: Consider the hypothetical effects of two factors (A, B) on some response variable:

Purely additive
                                 Factor A
Factor B          $\tau_1 = 1$     $\tau_2 = 2$     $\tau_3 = 3$
$\beta_1 = 1$          2                3                4
$\beta_2 = 5$          6                7                8

Purely multiplicative
                                 Factor A
Factor B          $\tau_1 = 1$     $\tau_2 = 2$     $\tau_3 = 3$
$\beta_1 = 1$          1                2                3
$\beta_2 = 5$          5               10               15

If multiplicative data of this sort are analyzed by a conventional ANOVA, the
interaction SS will be large due to the nonadditivity of the treatment effects.

In the case of multiplicative effects, there is a simple remedy. Simply transform
the variable by taking the log of each mean to restore additivity:

Log-transformed (base 10)
                                 Factor A
Factor B          $\tau_1 = 0.00$  $\tau_2 = 0.30$  $\tau_3 = 0.48$
$\beta_1 = 0.00$      0.00             0.30             0.48
$\beta_2 = 0.70$      0.70             1.00             1.18

Tukey’s 1-df test for nonadditivity                                                                    [ST&D 395]

Under our linear model, each observation is characterized as:

$$ y_{ij} = \mu + \tau_i + \beta_j + \varepsilon_{ij} $$

The predicted value of each individual is given by:

$$ pred_{ij} = \mu + \tau_i + \beta_j $$

So, if we had no error in our experiment (i.e. if $\varepsilon_{ij} = 0$), the observed data would exactly match
its predicted values and a correlation plot of the two would yield a perfect line with slope = 1:

[Figure: Observed vs. Predicted Values (RCBD, no error). Axes: Predicted (x) and Observed (y), both 10 to 20.]

Now let's introduce some error:

[Figure: Observed vs. Predicted (RCBD, with error). Axes: Predicted (x) and Observed (y), both 10 to 20.]

But what happens when you have an interaction (e.g. Block * Treatment) but lack the degrees of
freedom necessary to include it in the linear model?

$$ \varepsilon_{ij} = \varepsilon_{RANDOM_{ij}} + \text{B*T Interaction Effects} $$

[Figure: Observed vs. Predicted (RCBD, with error and B*T). Axes: Predicted (x) and Observed (y), both 10 to 20.]

So, if the observed and predicted values obey a linear relationship, then the
nonrandom interaction effects buried in the error term are sufficiently small to uphold
our assumption of additivity of main effects.

This test is easily implemented using SAS:

Data Lambs;
Input Sex_Est $ @@;                       * Treatment label, then one gain per ranch;
Do Ranch = 1 to 4;
   Input Gain @@;
   Output;
End;
Cards;
F0     47    52    62   51
M0     50    54    67   57
F3     57    53    69   57
M3     54    65    74   59
;
Proc GLM Data = Lambs;                    * RCBD ANOVA, saving predicted values and residuals;
Class Sex_Est Ranch;
Model Gain = Ranch Sex_Est;
Output out = LambsPR p = Pred r = Res;
Proc GLM Data = LambsPR;                  * This is the Tukey 1 df test;
Class Sex_Est Ranch;
Model Gain = Ranch Sex_Est Pred*Pred;
Run;
Quit;

NOTE: Pred*Pred is not in the Class statement, and it is the last term in the Model.
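
/* A sketch for viewing the observed vs. predicted relationship described
   above, assuming the LambsPR dataset created by the first Proc GLM (with
   its Pred variable) is still available in the session. */
Proc Plot Data = LambsPR;
Plot Gain*Pred;                   * Observed (vertical) vs. predicted (horizontal);
Run;
Quit;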

Output from the 2nd Proc GLM (the Tukey 1 df test):
Source                      DF    Type III SS     Mean Square    F Value     Pr > F

Ranch                        3     0.73506585      0.24502195         0.03   0.9927
Sex_Est                      3     0.29808726      0.09936242         0.01   0.9981
Pred*Pred                    1     3.41880342      3.41880342         0.41   0.5395 NS

This test is necessary ONLY when there is one observation per block-treatment combination.
If there are two or more replications per block-treatment combination,
the block*treatment interaction can be tested directly in an exploratory model.

Example: Yield of penicillin from four different protocols (A – D). Blocks are different stocks
of an important reagent. The numbers below each observation (O) are the predicted values (P =
Grand Mean + Treatment effect + Block effect) and the residuals (R).

                               Treatment                        Block     Block
Block               A          B          C          D          Mean      Effect
Stock 1          O: 89      O: 88      O: 97      O: 94          92         +6
                 P: 90      P: 91      P: 95      P: 92
                 R: -1      R: -3      R:  2      R:  2
Stock 2          O: 84      O: 77      O: 92      O: 79          83         -3
                 P: 81      P: 82      P: 86      P: 83
                 R:  3      R: -5      R:  6      R: -4
Stock 3          O: 81      O: 87      O: 87      O: 85          85         -1
                 P: 83      P: 84      P: 88      P: 85
                 R: -2      R:  3      R: -1      R:  0
Stock 4          O: 87      O: 92      O: 89      O: 84          88         +2
                 P: 86      P: 87      P: 91      P: 88
                 R:  1      R:  5      R: -2      R: -4
Stock 5          O: 79      O: 81      O: 80      O: 88          82         -4
                 P: 80      P: 81      P: 85      P: 82
                 R: -1      R:  0      R: -5      R:  6
Treatment mean      84         85         89         86          Mean = 86
Treatment effect    -2         -1         +3          0

The SAS code for such an analysis:

Data Penicillin;
Do Block = 1 to 5;                        * Five stocks (blocks), one per data line;
   Do Trtmt = 1 to 4;
      Input Yield @@;
      Output;
   End;
End;
Cards;
89     88    97    94
84     77    92    79
81     87    87    85
87     92    89    84
79     81    80    88
;
Proc GLM Data = Penicillin;
Class Block Trtmt;
Model Yield = Block Trtmt;
Output out = PenPR p = Pred r = Res;
Proc Univariate Data = PenPR normal;    * Testing for normality of residuals;
Var Res;
Proc GLM Data = Penicillin; * Testing for variance homogeneity (1 way ANOVA);
Class Trtmt;
Model Yield = Trtmt;
Means Trtmt / hovtest = Levene;
Proc Plot Data = PenPR; * Generating a plot of predicted vs. residual values;
Plot Res*Pred = Trtmt;
Proc GLM Data = PenPR;                           * Testing for nonadditivity;
Class Block Trtmt;
Model Yield = Block Trtmt Pred*Pred;
Run;
Quit;

This dataset meets all assumptions: normality, variance homogeneity, and additivity:
Tests for Normality

Test                       --Statistic---      -----p Value------

Shapiro-Wilk               W      0.967406     Pr < W      0.6994 NS

Levene's Test for Homogeneity of Yield Variance
ANOVA of Squared Deviations from Group Means

Sum of        Mean
Source         DF        Squares      Square     F Value    Pr > F

Trtmt           3          922.2       307.4         0.39   0.7620 NS
Error          16        12618.8       788.7

Tukey's 1 df test for nonadditivity

Source                          DF     Type III SS        Mean Square   F Value     Pr > F

Block                           3      28.29215818         9.43071939      0.28     0.8365
Trtmt                           3      28.26814524         9.42271508      0.28     0.8367
Pred*Pred                       1      26.46875000        26.46875000      0.79     0.3901 NS

[Figure: plot of residuals vs. predicted values (Res*Pred), by treatment.]
No particular pattern presents itself in this graph.

Nesting within an RCBD

[Layout: Ranch (1 - 4) by Trtmt (M Est0, M Est3, F Est0, F Est3); 2 measurements per cell.]

Data Lambs;
Input Sex_Est $ Ranch Animal Gain @@;
Cards;
f0 1 1 46    f0 2 1 51    f0 3 1 61    f0 4 1 50
m0 1 1 49    m0 2 1 53    m0 3 1 66    m0 4 1 56
f3 1 1 56    f3 2 1 52    f3 3 1 68    f3 4 1 56
m3 1 1 53    m3 2 1 64    m3 3 1 73    m3 4 1 58
f0 1 1 48    f0 2 1 53    f0 3 1 63    f0 4 1 52
m0 1 1 51    m0 2 1 55    m0 3 1 68    m0 4 1 58
f3 1 1 58    f3 2 1 54    f3 3 1 70    f3 4 1 58
m3 1 1 55    m3 2 1 66    m3 3 1 75    m3 4 1 60
;
Proc GLM Data = Lambs Order = Data;
Class Ranch Sex_Est Animal;
Model Gain = Ranch Sex_Est Animal(Ranch*Sex_Est);
Random Animal(Ranch*Sex_Est);
Test h = Sex_Est e = Animal(Ranch*Sex_Est);

Contrast 'sex'              Sex_Est      1 -1 1 -1 / e = Animal(Ranch*Sex_Est);
Contrast 'estrogen'         Sex_Est      1 1 -1 -1 / e = Animal(Ranch*Sex_Est);
Contrast 'interaction'      Sex_Est      1 -1 -1 1 / e = Animal(Ranch*Sex_Est);

Means Sex_Est / Tukey e = Animal(Ranch*Sex_Est);

Proc Varcomp Method = Type1;
Class Ranch Sex_Est Animal;
Model Gain = Ranch Sex_Est Animal(Ranch*Sex_Est);

Run; Quit;

Type 1 Analysis of Variance

Source                      Expected Mean Square

Ranch                       Var(Error) + 2 Var(Animal(Ranch*Sex_Es)) + 8 Var(Ranch)
Sex_Est                     Var(Error) + 2 Var(Animal(Ranch*Sex_Es)) + 8 Var(Sex_Est)
Animal(Ranch*Sex_Es)        Var(Error) + 2 Var(Animal(Ranch*Sex_Es))
Error                       Var(Error)

Type 1 Estimates

Variance Component              Estimate             %

Var(Ranch)                      46.05556         65.6
Var(Sex_Est)                    15.38889         21.9
Var(Animal(Ranch*Sex_Es))        6.77778          9.7
Var(Error)                       2.00000          2.8

The only reason to analyze this dataset as a nested RCBD is to calculate
the variance components. If you do not need the variance components,
simply average the subsamples for each experimental unit
and analyze it as a simple RCBD.
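
/* A sketch of the averaging approach, assuming the Lambs dataset defined
   above. LambMeans and MeanGain are names chosen here for illustration. */
Proc Means Data = Lambs noprint nway;
Class Ranch Sex_Est;
Var Gain;
Output out = LambMeans mean = MeanGain;   * One mean per ranch-treatment combination;
Proc GLM Data = LambMeans;                * Simple RCBD on the cell means;
Class Ranch Sex_Est;
Model MeanGain = Ranch Sex_Est;
Run;
Quit;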

RCBD with multiple replications per block-treatment combination

[Layout: Ranch (1 - 4) by Trtmt (M Est0, M Est3, F Est0, F Est3).]

Data Lambs;
Do Ranch = 1 to 4;
   Do Sex_Est = 'M0', 'M3', 'F0', 'F3';
      Input Gain @@;
      Output;
   End;
End;
Cards;
58     64    46    66
56     57    46    54
59     68    51    63
54     64    56    62

58    62    58    58
53    63    45    53
56    73    46    54
55    71    52    57
;
Proc GLM Data = Lambs Order = Data;                               * The exploratory model;
Class Ranch Sex_Est;
Model Gain = Ranch Sex_Est Ranch*Sex_Est;

Proc GLM Data = Lambs Order = Data;                                                 * The ANOVA;
Class Ranch Sex_Est;
Model Gain = Ranch Sex_Est;
Output out = LambsPR p = Pred r = Res;
* Contrast coefficients follow the data order M0 M3 F0 F3;
Contrast    'estrogen'      Sex_Est    1            -1         1      -1;
Contrast    'sex'           Sex_Est    1            1          -1     -1;
Contrast    'interaction'   Sex_Est    1            -1         -1     1;
Means Sex_Est / Tukey;

Proc Univariate Data = LambsPR normal;                                 * Testing normality;
Var Res;

Proc GLM   Data = Lambs Order = Data;                  * Levene test (1 way ANOVA);
Class   Sex_Est;
Model   Gain = Sex_Est;
Means   Sex_Est / hovtest = Levene;

Run; Quit;

Output from the exploratory model:
Sum of
Source                   DF       Squares       Mean Square      F Value          Pr > F

Model                    15   1264.875000        84.325000            5.51        0.0008
Error                    16    245.000000        15.312500
Corrected Total          31   1509.875000

Source                   DF   Type III SS       Mean Square       F Value         Pr > F

Ranch                     3   176.1250000        58.7083333           3.83        0.0304
Sex_Est                   3   951.6250000       317.2083333          20.72        <.0001
Ranch*Sex_Est             9   137.1250000        15.2361111           1.00        0.4811 NS

RCBD 1 rep/cell

Class Block Trtmt;
Model Y = Block Trtmt;

Tukey Test Required

RCBD 1 rep/cell with subsamples

Class Block Trtmt Pot;
Model Y = Block Trtmt Pot(Block*Trtmt);
Random Pot(Block*Trtmt);
Test h = Trtmt e = Pot(Block*Trtmt);

Tukey Test Required

RCBD >1 rep/cell

Exploratory model:
Class Block Trtmt;
Model Y = Block Trtmt Block*Trtmt;

Tukey Test not Required

ANOVA:
Class Block Trtmt;
Model Y = Block Trtmt;

RCBD >1 rep/cell with subsamples

Exploratory model:
Class Bl Trt Pot;
Model Y = Bl Trt Bl*Trt Pot(Bl*Trt);
Random Pot(Bl*Trt);
Test h = Bl*Trt e = Pot(Bl*Trt);

Tukey Test not Required

ANOVA:
Class Bl Trt Pot;
Model Y = Bl Trt Pot(Bl*Trt);
Random Pot(Bl*Trt);
Test h = Trt e = Pot(Bl*Trt);

```