
Journal of Statistical Planning and Inference 132 (2005) 163–182
www.elsevier.com/locate/jspi

Efficiency of split-block designs versus split-plot designs for hypothesis testing

Minghui Wang (a), Franz Hering (b,∗)
(a) ORTEC Inc., Amsterdam, The Netherlands
(b) Statistics Department, University of Dortmund, Germany

Received 6 April 2003; accepted 10 July 2003
Available online 20 August 2004

∗ Corresponding author. Kohlenbankweg 3c, D-44227 Dortmund, Germany. E-mail address: fehering@web.de (F. Hering).
0378-3758/$ - see front matter. doi:10.1016/j.jspi.2004.06.021

Abstract

In this paper, the efficiencies of split-plot design relative to split-block design for experiments of hypothesis testing problems are explored, with the definition of efficiency based on the sample size of the experiments. The conclusion is that the efficiencies of the experiments for hypothesis testing problems depend not only on the hypotheses but also on the precision of the tests. A practical example is given showing how to determine the relative efficiency of a split-block design to a split-plot design for hypothesis testing problems. © 2004 Elsevier B.V. All rights reserved.

MSC: 62K99

Keywords: Relative efficiency; Split-plot design; Split-block design; Sample size; Hypothesis testing

1. Introduction

A split-block design with fixed treatment effects can be obtained from a split-plot design by adding an additional column structure to the super-blocks. For point estimation problems, it has been shown in Hering and Wang (1998) that if the same set of experimental units is used for both the split-plot designs and the split-block designs, the addition of the column structure to the split-plot design will lead to more precise estimation of the interaction A×B, where A is the whole-plot treatment and B the split-plot treatment. As a consequence, the split-plot treatment B is estimated less precisely in the split-block design than in the split-plot design. The whole-plot treatment A is estimated with the same precision in both designs.

If the purpose of the experiment is hypothesis testing, then the conclusion drawn for the estimation problem should not be automatically taken over for hypothesis testing. In general, the efficiency of the experiments for hypothesis testing problems will depend not only on the specific hypothesis, but also on the precision requirement of the test.

Let us assume that an experiment is carried out to investigate the effects of two factors A and B, with levels $A_1, A_2, \ldots, A_a$ and $B_1, B_2, \ldots, B_b$, respectively, on the quality of a product. Let us assume further that, due to the availability of experimental units and/or the emphasis of the precision on one of the factors, the experimenter has to choose between a split-plot design and a split-block design. The purpose of the experiment is to test for effects of the factors. If a split-plot design is used, the hypotheses could be formulated as follows:

• Effects of the whole-plot treatments $A_i$, $i = 1, 2, \ldots, a$, are equal.
• Effects of the split-plot treatments $B_j$, $j = 1, 2, \ldots, b$, are equal.
• Effects of the interaction between the whole-plot and the split-plot treatments $A_iB_j$, $i = 1, 2, \ldots, a$; $j = 1, 2, \ldots, b$, are equal.

If a split-block design is used for the same experimental units, by adding a column structure to the super-blocks of the split-plot design, the whole-plot treatments and the split-plot treatments can be referred to as row treatments and column treatments, respectively. The hypotheses are accordingly:

• Effects of the row treatments $A_i$, $i = 1, 2, \ldots, a$, are equal.
• Effects of the column treatments $B_j$, $j = 1, 2, \ldots, b$, are equal.
• Effects of the interaction between the row treatment and the column treatment $A_iB_j$, $i = 1, 2, \ldots, a$; $j = 1, 2, \ldots, b$, are equal.
Before we start our investigation, the efficiency of an experiment when used to test a hypothesis must be defined. The following section defines the relative efficiency of experiments based on the sample size required by each of the designs.

2. Efficiency of the F-test in ANOVA

Definition. Let

$$y = X\theta + e_1, \quad e_1 \sim N(0, \sigma_1^2 I_{n_1}), \tag{1}$$

$$y = Z\gamma + e_2, \quad e_2 \sim N(0, \sigma_2^2 I_{n_2}) \tag{2}$$

be the linear models of designs X and Z, respectively. It is assumed that X and Z do not have full column rank. For hypothesis testing in the ANOVA, the relative efficiency of design X with respect to design Z is defined as

$$RE(X \text{ to } Z) = \frac{s_Z}{s_X}, \tag{3}$$

where $s_X$ and $s_Z$ are the sample sizes required by designs X and Z, respectively, when both designs guarantee the same precision or sensitivity of the test. Throughout this paper, we apply this definition to the F-test only.

In this definition, by same precision or sensitivity we mean that both designs are assumed to detect the same difference $d$ of the treatment effects with equal power $1 - \beta$ at the same significance level $\alpha$. Here, $\beta$ is the probability of committing an error of the second kind (i.e., failure to detect a difference at level $\alpha$).

The general hypothesis we consider is

$$H: K'\theta = m, \tag{4}$$

where $K'$ has full row rank $s$. This means that the linear functions of $\theta$ which form the hypothesis must be linearly independent. Since X does not have full column rank, $X'X$ has no inverse and the normal equations $X'X\theta^0 = X'y$ have no unique solution; they have many solutions. To get any one of them we find any generalized inverse $G$ of $X'X$ and write the corresponding solution as $\theta^0 = GX'y$. According to Searle (1971),

$$y \sim N(X\theta, \sigma_1^2 I_{n_1}), \qquad \theta^0 = GX'y \sim N(GX'X\theta,\; GX'XG'\sigma_1^2)$$

and

$$K'\theta^0 - m \sim N(K'\theta - m,\; K'GK\sigma_1^2).$$

It turns out that when hypothesis (4) is not true,

$$F(H) = \frac{Q/s}{SSE/[n_1 - r(X)]} \sim F'\!\left(s,\; n_1 - r(X),\; \frac{\lambda_X}{2}\right), \tag{5}$$

and under the null hypothesis $H: K'\theta = m$,

$$F(H) \sim F_{s,\,n_1 - r(X)}.$$
Here

• $F'$ denotes the non-central F-distribution with $s$ and $n_1 - r(X)$ degrees of freedom, and $\lambda_X$ denotes the non-centrality parameter.
• $\lambda_X = (K'\theta - m)'[K'GK]^{-1}(K'\theta - m)/\sigma_1^2$.
• $Q = (K'\theta^0 - m)'[K'GK]^{-1}(K'\theta^0 - m)$ is the quadratic form in $K'\theta^0 - m$.
• $SSE = [y - XK(K'K)^{-1}m]'[I_{n_1} - XGX'][y - XK(K'K)^{-1}m]$ is the error sum of squares.
• $r(X)$ is the rank of X.

In the definition, it is assumed that the matrix X does not have full column rank, thus a generalized inverse is used. When X has full column rank the inverse of $X'X$ exists. Accordingly, $G$ becomes $(X'X)^{-1}$. As a consequence, $\lambda_X$ is uniquely determined. For an ANOVA model whose design matrix is not of full column rank, it is possible to get a full-rank model by imposing suitable side conditions (constraints). For example, as will be seen in the paper, the assumption of full column rank is fulfilled in the ANOVA model (7) by imposing conditions (8) on the parameters.

From Wang (2002), by solving

$$F[1 - \alpha,\; s,\; n_1 - r(X)] = F[\beta,\; s,\; n_1 - r(X),\; \lambda_X/2], \tag{6}$$

where $\alpha$ is the significance level of the F-test, $1 - \beta$ is the power of the test at significance level $\alpha$, $F[1 - \alpha, s, n_1 - r(X)]$ is the $1 - \alpha$ quantile of the central F-distribution with $s$ and $n_1 - r(X)$ degrees of freedom, and $F[\beta, s, n_1 - r(X), \lambda_X/2]$ is the $\beta$ quantile of the non-central F-distribution with $s$ and $n_1 - r(X)$ degrees of freedom and non-centrality parameter $\lambda_X$, we are able to determine a sample size $n_1$, denoted by $s_X$, for design X such that the F-test rejects the null hypothesis (4) with power $1 - \beta$ at significance level $\alpha$. Analogously, for the design Z we can determine a sample size $s_Z$ such that the resulting F-test rejects the null hypothesis (4) with the same power $1 - \beta$ at the significance level $\alpha$. Therefore, the efficiency of design X relative to design Z can be determined according to (3) in the definition.
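Eq. (6) can be read as an equation in the non-centrality parameter once the degrees of freedom are fixed. The following is a minimal sketch of that step, assuming the convention in which the non-central F-distribution is a Poisson mixture with mean $\lambda/2$ (so `lam` below plays the role of $\lambda_X$). The function names (`betainc`, `critical_x`, `beta_risk`, `noncentrality`) are hypothetical helpers written for this illustration and use only the Python standard library; it is not the procedure of Wang (2002), only one way such a solution might be computed.

```python
from math import exp, lgamma, log, log1p


def _betacf(a, b, x, max_iter=1000, eps=1e-14):
    """Continued fraction for the incomplete beta function (modified Lentz method)."""
    tiny = 1e-300
    c = 1.0
    d = 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        even = m * (b - m) * x / ((a - 1.0 + 2 * m) * (a + 2 * m))
        odd = -(a + m) * (a + b + m) * x / ((a + 2 * m) * (a + 1.0 + 2 * m))
        for coef in (even, odd):
            d = 1.0 + coef * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + coef / c
            c = c if abs(c) > tiny else tiny
            h *= d * c
        if abs(d * c - 1.0) < eps:  # last correction factor is close to 1
            break
    return h


def betainc(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = exp(lgamma(a + b) - lgamma(a) - lgamma(b) + a * log(x) + b * log1p(-x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _betacf(a, b, x) / a
    return 1.0 - front * _betacf(b, a, 1.0 - x) / b


def critical_x(alpha, nu1, nu2):
    """Central F critical value at level alpha, on the scale x = nu1*f / (nu1*f + nu2)."""
    lo, hi = 0.0, 1.0
    for _ in range(60):  # bisection on the central F distribution function
        mid = 0.5 * (lo + hi)
        if betainc(nu1 / 2.0, nu2 / 2.0, mid) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def beta_risk(lam, x_crit, nu1, nu2):
    """P(non-central F <= critical value) for lam > 0: Poisson(lam/2) mixture."""
    half = lam / 2.0
    total, j = 0.0, 0
    while True:
        weight = exp(-half + j * log(half) - lgamma(j + 1.0))
        total += weight * betainc(nu1 / 2.0 + j, nu2 / 2.0, x_crit)
        if j > half and weight < 1e-16:  # remaining Poisson weights are negligible
            return total
        j += 1


def noncentrality(alpha, beta, nu1, nu2):
    """Non-centrality lam at which the level-alpha F-test has power 1 - beta."""
    x_crit = critical_x(alpha, nu1, nu2)
    lo, hi = 0.0, 1.0
    while beta_risk(hi, x_crit, nu1, nu2) > beta:  # expand until the power is reached
        hi *= 2.0
    for _ in range(60):  # bisection: beta_risk decreases as lam grows
        mid = 0.5 * (lo + hi)
        if beta_risk(mid, x_crit, nu1, nu2) > beta:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Illustrative call: alpha = 0.05, beta = 0.20, s = 5 and n1 - r(X) = 50.
print(round(noncentrality(0.05, 0.20, 5, 50), 3))
```

With such a routine in hand, the sample size is obtained by increasing $n_1$ until the non-centrality implied by the design and by the difference $d$ reaches the returned value. Where SciPy is available, its non-central F distribution would be the more natural choice than the hand-written series above.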
For convenience, from now on, it is assumed that all the designs discussed here are complete designs.

3. Efficiency of split-block designs versus split-plot designs

3.1. The statistical models

The statistical models of the split-plot design and the split-block design are given in the following subsections in order to make it easier to discuss the problem.

3.1.1. Split-plot design

In a split-plot design, the levels of factor A, denoted by $A_1, A_2, \ldots, A_a$, are randomly assigned to the whole plots. The levels of factor B, denoted by $B_1, B_2, \ldots, B_b$, are randomly assigned to the split plots within each whole plot. If the number of whole plots is equal to the number of levels of factor A, the design is said to be complete with respect to factor A. If the design is complete for both factors, it is called a complete split-plot design. The statistical model for a complete split-plot design is of the form

$$y_{ijk} = \mu + r_i + \alpha_j + e^{A}_{ij} + \beta_k + (\alpha\beta)_{jk} + e^{AB}_{ijk} \tag{7}$$

with $i = 1, 2, \ldots, n$; $j = 1, 2, \ldots, a$; $k = 1, 2, \ldots, b$. Here, $\mu$ represents the general mean, $r_i$ the super-block effects, $\alpha_j$ the whole-plot factor effects and $\beta_k$ the split-plot factor effects, whereas $(\alpha\beta)_{jk}$ are the effects of interaction between the whole-plot factors and the split-plot factors. $e^{A}_{ij}$ and $e^{AB}_{ijk}$ are independent, normally distributed random variables with mean 0 and variances $\sigma^2_{eA}$ and $\sigma^2_{eAB}$, respectively. Further model assumptions are:

$$\sum_j \alpha_j = 0, \qquad \sum_k \beta_k = 0, \qquad \sum_j (\alpha\beta)_{jk} = 0 \text{ for each } k, \qquad \sum_k (\alpha\beta)_{jk} = 0 \text{ for each } j. \tag{8}$$

Part of the ANOVA of the split-plot design (the sum of squares column is not included) is given in Table 1.

Table 1
ANOVA for the split-plot design (7)

Source        d.f.                  E(MS)
Replicates    $n-1$
A             $a-1$                 $\sigma^2_{eAB} + b\sigma^2_{eA} + nb\sum_j \alpha_j^2/(a-1)$
Error(A)      $(n-1)(a-1)$          $\sigma^2_{eAB} + b\sigma^2_{eA}$
B             $b-1$                 $\sigma^2_{eAB} + na\sum_k \beta_k^2/(b-1)$
A×B           $(a-1)(b-1)$          $\sigma^2_{eAB} + n\sum_j\sum_k (\alpha\beta)_{jk}^2/[(a-1)(b-1)]$
Error(AB)     $(n-1)a(b-1)$         $\sigma^2_{eAB}$
Total         $nab-1$

3.1.2. Split-block designs

Suppose that, based on the split-plot design given above, a further column structure is introduced within the super-block. The resulting design, known as a split-block design, represents a generalization of the split-plot design. A split-block design has a structure in which two sets of whole plots, or strips, are orthogonal. The randomization procedure consists of two steps: first, randomly allocate one of the two factors, say factor A, to the row strips, then randomly allocate the other factor, factor B, to the column strips. If the number of row strips equals the number of levels of factor A, the design is said to be complete with respect to factor A. If the design is complete for both factors, it is called a complete split-block design. The statistical model of a complete split-block design is of the form

$$y_{ijk} = \mu + r_i + \alpha_j + e^{A}_{ij} + \beta_k + e^{B}_{ik} + (\alpha\beta)_{jk} + e^{AB}_{ijk} \tag{9}$$

with $i = 1, 2, \ldots, n$; $j = 1, 2, \ldots, a$; $k = 1, 2, \ldots, b$. Here $\mu$ is the general mean, $r_i$ are the super-block effects, $\alpha_j$ are the row effects and $\beta_k$ the column effects, whereas $(\alpha\beta)_{jk}$ are the measures of interaction between rows and columns. $e^{A}_{ij}$, $e^{B}_{ik}$ and $e^{AB}_{ijk}$ are independent, normally distributed random variables with mean 0 and variances $\sigma^2_{eA}$, $\sigma^2_{eB}$ and $\sigma^2_{eAB}$, respectively. So (9) differs from (7) only by the additional error term for the column strips. The model assumptions for the parameters are the same as in (8). Part of the ANOVA for the split-block design is given in Table 2 (the sum of squares column is not included).

Table 2
ANOVA for the split-block design (9)

Source        d.f.                  E(MS)
Replicates    $n-1$
A             $a-1$                 $\sigma^2_{eAB} + b\sigma^2_{eA} + nb\sum_j \alpha_j^2/(a-1)$
Error(A)      $(n-1)(a-1)$          $\sigma^2_{eAB} + b\sigma^2_{eA}$
B             $b-1$                 $\sigma^2_{eAB} + a\sigma^2_{eB} + na\sum_k \beta_k^2/(b-1)$
Error(B)      $(n-1)(b-1)$          $\sigma^2_{eAB} + a\sigma^2_{eB}$
A×B           $(a-1)(b-1)$          $\sigma^2_{eAB} + n\sum_j\sum_k (\alpha\beta)_{jk}^2/[(a-1)(b-1)]$
Error(AB)     $(n-1)(a-1)(b-1)$     $\sigma^2_{eAB}$
Total         $nab-1$

The following symbol convention will be used throughout this paper:

• $\alpha_j$, $j = 1, 2, \ldots, a$, denote the whole-plot treatment (i.e., factor A) effects if a split-plot design is used. When a split-block design is used, they denote the row treatment (i.e., factor A) effects.
• $\beta_k$, $k = 1, 2, \ldots, b$, denote the split-plot treatment (i.e., factor B) effects if a split-plot design is used. If a split-block design is used, they denote the column treatment (i.e., factor B) effects.
• $(\alpha\beta)_{jk}$, $j = 1, 2, \ldots, a$; $k = 1, 2, \ldots, b$, denote the interaction effects between the whole-plot treatments and the split-plot treatments if a split-plot design is used. They denote the interaction effects between row treatments and column treatments if a split-block design is used.

For each of the hypothesis testing problems listed above, the degrees of freedom of the F-tests for both the split-plot design and the split-block design are summarized in Table 3, in which df1 and df2 stand for the degrees of freedom of the numerator and the denominator of the appropriate F-tests.
Table 3
Summary of df1 and df2 of the F-tests for the split-block design and the split-plot design

Design        Test for effects of factor   df1             df2                    Remark
Split-block   A                            $a-1$           $(n-1)(a-1)$
              B                            $b-1$           $(n-1)(b-1)$
              A×B                          $(a-1)(b-1)$    $(n-1)(a-1)(b-1)$
Split-plot    A                            $a-1$           $(n-1)(a-1)$
              B                            $b-1$           $(n-1)a(b-1)$          Assuming interaction A×B
              B                            $b-1$           $(na-1)(b-1)$          Assuming no interaction
              A×B                          $(a-1)(b-1)$    $(n-1)a(b-1)$

For our purpose, the efficiency of the split-block design relative to the split-plot design will be investigated with respect to the following hypotheses:

(I) $H_0: \alpha_1 = \alpha_2 = \cdots = \alpha_a = 0$.
(II) $H_0: \beta_1 = \beta_2 = \cdots = \beta_b = 0$.
(III) $H_0: (\alpha\beta)_{jk} = 0$ for all $j$ and $k$.

The following subsections are devoted to each of the hypothesis testing problems listed above.

3.2. Testing the effects of row treatments or whole-plot treatments

The hypothesis is

$$H_0: \alpha_1 = \alpha_2 = \cdots = \alpha_a = 0 \quad \text{against} \quad H_A: \alpha_j \neq 0 \text{ for at least one } j. \tag{10}$$

Assuming that a split-plot design is used, this hypothesis testing problem leads to the F-test in the ANOVA of the linear model (7) with error variance $\sigma^2_{eA}$; under $H_0$,

$$F = \frac{MS(A)}{MS(E_A)} \sim F_{a-1,\,(n-1)(a-1)}. \tag{11}$$

Suppose that we want this experiment to reveal a practically interesting difference of $d$ ($d$ can be expressed as a ratio proportional to $\sigma_{eA(SPD)}$) with power $1 - \beta$ at significance level $\alpha$. Let $\alpha_{\min}$ be the minimum and $\alpha_{\max}$ be the maximum of the set of $a$ effects $\alpha_1, \ldots, \alpha_a$. By Tang (1938), the non-centrality parameter of the F-test, i.e.,

$$\lambda_{SPD} = N_{SPD} \sum_{j=1}^{a} (\alpha_j - \bar{\alpha})^2 / \sigma^2_{eA(SPD)},$$

satisfies the following inequality:

$$\lambda_{SPD} \geq N_{SPD}(\alpha_{\max} - \alpha_{\min})^2/(4\sigma^2_{eA(SPD)}) = N_{SPD}\,d^2/(4\sigma^2_{eA(SPD)}), \tag{12}$$
where $N_{SPD} = ab\,n_{SPD}$ is the total number of experimental units in the split-plot design, $n_{SPD}$ is the number of super-blocks, $\sigma^2_{eA(SPD)}$ is the variance of $e^{A}_{ij}$ in model (7), and $d = \alpha_{\max} - \alpha_{\min}$ is the minimum difference between the largest and smallest effects of factor A to be detected.

Following Rasch and Wang (1998), one can calculate the total number of experimental units $N_{SPD}$ needed to guarantee the precision specification expressed with $\alpha$, $\beta$, $d$ and $\sigma^2_{eA(SPD)}$. The result is

$$N_{SPD} \geq 4\,\frac{\lambda_{SPD}\,\sigma^2_{eA(SPD)}}{d^2}. \tag{13}$$

On the other hand, the non-centrality parameter of the F-distribution can be expressed as an inverse function of the power function (cf. Das Gupta (1968)), i.e.,

$$\lambda_{SPD} = \lambda(\alpha, \beta, \nu_1^{SPD}, \nu_2^{SPD}), \tag{14}$$

where $\nu_{1(SPD)} = a - 1$ and $\nu_{2(SPD)} = (n_{SPD} - 1)(a - 1)$ are, respectively, the d.f. of the numerator and denominator in (11), $\alpha$ is the significance level, and $1 - \beta$ is the power of the test. Eqs. (11) and (12) suggest that

$$n_{SPD} = \frac{N_{SPD}}{ab} \geq 4\,\frac{\lambda(\alpha, \beta, a-1, (n_{SPD}-1)(a-1))\,\sigma^2_{eA(SPD)}}{ab\,d^2}. \tag{15}$$

We know that $\lambda_{SPD} = \lambda(\alpha, \beta, \nu_{1(SPD)}, \nu_{2(SPD)})$ decreases monotonically with $\nu_{2(SPD)}$. The solution (i.e., the smallest integer) for $n_{SPD}$ is found by solving

$$n^{*}_{SPD} = 4\,\frac{\lambda(\alpha, \beta, a-1, (n^{*}_{SPD}-1)(a-1))\,\sigma^2_{eA(SPD)}}{ab\,d^2}. \tag{16}$$

Define

$$n_{SPD} = \mathrm{CEIL}(n^{*}_{SPD}), \tag{17}$$

where CEIL(x) is the smallest integer that is not smaller than x. It is clear that $n_{SPD}$ satisfies (15). By (16), $n^{*}_{SPD}$ is an implicit function of $\alpha$, $\beta$, $a$, $d$ and $\sigma^2_{eA(SPD)}$. As a result, the solution $n^{*}_{SPD}$ of (16) can be found iteratively (cf. Rasch and Wang, 1998). So we finally obtain $N_{SPD}$.

Analogously, in order to get the relative efficiency, we first determine $n_{SBD}$, and then $N_{SBD}$, the total number of experimental units of the split-block design for testing hypothesis (10). Recall that the efficiency is defined as the ratio of the total numbers of experimental units required by the two designs when they have the same precision.
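The iteration behind (16) and (17) can be sketched as a bisection on $n - 4\lambda(\cdot)\sigma^2/(ab\,d^2)$, which is increasing in $n$ because $\lambda$ decreases with the denominator degrees of freedom. The sketch below is an illustration only: `solve_superblocks` is a hypothetical helper, and `toy_lam` is a made-up decreasing function standing in for the inverse power function $\lambda(\alpha, \beta, \nu_1, \nu_2)$, so the numbers it produces have no statistical meaning. It also assumes that the real solution exceeds 1.

```python
from math import ceil


def solve_superblocks(lam_of_nu2, a, b, sigma2, d, nu2_of_n):
    """Solve n = 4 * lam(nu2(n)) * sigma2 / (a * b * d**2) for real n > 1, as in (16).

    lam_of_nu2: non-centrality as a function of the denominator d.f. (assumed decreasing)
    nu2_of_n:   denominator d.f. as a function of the number of super-blocks n
    Returns the real solution and its ceiling, as in (17).
    """
    def gap(n):
        return n - 4.0 * lam_of_nu2(nu2_of_n(n)) * sigma2 / (a * b * d * d)

    lo, hi = 1.0 + 1e-9, 2.0
    while gap(hi) < 0.0:          # expand the bracket until the sign changes
        hi *= 2.0
    for _ in range(200):          # bisection on the increasing function gap(n)
        mid = 0.5 * (lo + hi)
        if gap(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    n_star = 0.5 * (lo + hi)
    return n_star, ceil(n_star)


def toy_lam(nu2):
    """Hypothetical stand-in for lambda(alpha, beta, nu1, nu2); NOT a real power inverse."""
    return 10.0 + 20.0 / nu2


a, b = 6, 4
n_star, n_spd = solve_superblocks(
    toy_lam, a, b, sigma2=1.0, d=1.0,
    nu2_of_n=lambda n: (n - 1) * (a - 1),  # d.f. of Error(A), see (11)
)
print(n_star, n_spd, a * b * n_spd)  # real solution, super-blocks, total units N
```

In practice `toy_lam` would be replaced by a routine that inverts the power function of the F-test for the chosen $\alpha$, $\beta$ and $\nu_1$; the bracket-and-bisect structure stays the same.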
Now, the same precision for testing hypothesis (10) implies that

$$\alpha_{SBD} = \alpha_{SPD}, \qquad \beta_{SBD} = \beta_{SPD}, \qquad d_{SBD} = d_{SPD}.$$

As already mentioned, a split-block design can be derived from a split-plot design by adding an additional column structure to the whole plot. This does not affect the variance of factor A. That is, $\sigma_{eA(SBD)} = \sigma_{eA(SPD)}$. Thus, we have

$$\frac{d_{SBD}}{\sigma_{eA(SBD)}} = \frac{d_{SPD}}{\sigma_{eA(SPD)}}.$$

Furthermore, by Table 3, it holds that

$$\nu_1^{SBD} = a - 1 = \nu_1^{SPD},$$

and the denominator degrees of freedom are $(n-1)(a-1)$ in both designs. Now, it can be verified that the formulas for calculating $n_{SBD}$ and $n_{SPD}$ are identical with respect to all the parameters. Therefore,

$$RE(\text{SBD to SPD}) \equiv 1. \tag{18}$$

That is, in terms of our efficiency definition, for testing the hypothesis that the effects of all the whole-plot treatments $A_i$, $i = 1, 2, \ldots, a$, are equal, split-block and split-plot designs are equally efficient.

3.3. Testing the effects of column treatment or split-plot treatment

Now, the hypothesis is

$$H_0: \beta_1 = \beta_2 = \cdots = \beta_b = 0 \quad \text{against} \quad H_A: \beta_k \neq 0 \text{ for at least one } k, \tag{19}$$

where $\beta_k$, $k = 1, 2, \ldots, b$, are the effects of factor B, the column treatments in a split-block design or the split-plot treatments in a split-plot design. Assuming that a split-plot design is used, and assuming that an interaction A×B exists, hypothesis (19) suggests an F-test in the ANOVA of the linear model (7) with error variance $\sigma^2_{eAB(SPD)}$; under $H_0$,

$$F = \frac{MS(B)}{MS(E_{AB})} \sim F_{\nu_1, \nu_2}, \tag{20}$$

where $\nu_1 = b - 1$ and $\nu_2 = (n_{SPD} - 1)a(b - 1)$.

Suppose that the experiment must be able to detect a difference $d$ ($d$ can be expressed as a ratio proportional to $\sigma_{eAB(SPD)}$) of practical interest with power $1 - \beta$ and significance level $\alpha$. Let $\beta_{\min}$ be the minimum and $\beta_{\max}$ be the maximum of the set of $b$ effects $\beta_1, \ldots, \beta_b$. By Tang (1938), the non-centrality parameter of the non-central F-distribution satisfies

$$\lambda_{SPD} = N_{SPD} \sum_{j=1}^{b} (\beta_j - \bar{\beta})^2 / \sigma^2_{eAB(SPD)} \geq N_{SPD}(\beta_{\max} - \beta_{\min})^2/(4\sigma^2_{eAB(SPD)}) = N_{SPD}\,d^2/(4\sigma^2_{eAB(SPD)}), \tag{21}$$
where $N_{SPD} = ab\,n_{SPD}$ is the total number of experimental units in the split-plot design, $n_{SPD}$ is the number of super-blocks, $\sigma^2_{eAB(SPD)}$ is the variance of $e^{AB}_{ijk}$ in model (7), and $d = \beta_{\max} - \beta_{\min}$ is the minimum difference to be detected between the largest and the smallest effects of factor B.

Analogous to the procedures in Section 3.2, we calculate the total number of experimental units needed to guarantee the same precision for both the split-block design and the split-plot design for testing hypothesis (19). For a split-plot design we obtain

$$n_{SPD} = \frac{N_{SPD}}{ab} \geq 4\,\frac{\lambda(\alpha, \beta, b-1, (n_{SPD}-1)a(b-1))\,\sigma^2_{eAB(SPD)}}{ab\,d^2}, \tag{22}$$

where $n_{SPD}$ is the number of super-blocks and $\sigma^2_{eAB(SPD)}$ is the error variance for factor B. The right-hand side of (22) is not an integer in general, so (22) serves only as an approximation to the number of super-blocks. From now on we assume equality in (22).

Similar to Section 3.2, the total number of experimental units $N_{SPD} = n_{SPD}\,ab$ for a split-plot design satisfies

$$\lambda_{SPD} = \lambda(\alpha, \beta, \nu_1^{SPD}, \nu_2^{SPD}) = \frac{1}{2}\,N_{SPD} \sum_{i=1}^{b} (E_i - \bar{E})^2 / \sigma^2_{eAB(SPD)}, \tag{23}$$

where $\nu_1^{sp} = b - 1$ and $\nu_2^{sp} = (n_{SPD} - 1)a(b - 1)$ are the degrees of freedom for factor B and error (AB), respectively, $\sigma^2_{eAB(SPD)}$ is the expected mean square of error (AB), viz., the residual variance, whereas $E_i$, $i = 1, 2, \ldots, b$, are the effects of factor B, i.e., the column treatment effects. Analogously, for the split-block design we have

$$\lambda_{SBD} = \lambda(\alpha, \beta, \nu_1^{SBD}, \nu_2^{SBD}) = \frac{1}{2}\,N_{SBD} \sum_{i=1}^{b} (E_i - \bar{E})^2 / \sigma^2_{eAB(SBD)},$$

i.e.,

$$\lambda(\alpha, \beta, b-1, (n_{SBD}-1)(b-1)) = \frac{1}{2}\,n_{SBD}\,ab \sum_{i=1}^{b} (E_i - \bar{E})^2 / \sigma^2_{eAB(SBD)}.$$

From the definition in (3) we have

$$RE(\text{SBD to SPD}) = \frac{\lambda_{sp}\,\sigma^2_{eAB(SPD)}}{\lambda_{sb}\,\sigma^2_{eAB(SBD)}} = \frac{\lambda(\alpha, \beta, b-1, (n_{SPD}-1)a(b-1))\,\sigma^2_{eAB(SPD)}}{\lambda(\alpha, \beta, b-1, (n_{SBD}-1)(b-1))\,\sigma^2_{eB(SBD)}}. \tag{24}$$

In general, $\sigma^2_{eAB(SBD)}$ and $\sigma^2_{eAB(SPD)}$ are unknown.
When a priori information is available we are able to get an estimated relative efficiency (ERE) by substituting $\sigma^2_{eAB(SBD)}$ and $\sigma^2_{eAB(SPD)}$ in (24) with their respective estimates, i.e.,

$$ERE(\text{SBD to SPD}) = \frac{\lambda_{sp}\,MS_{sp}(E_{AB})}{\lambda_{sb}\,MS_{sb}(E_B)} = \frac{\lambda(\alpha, \beta, b-1, (n_{SPD}-1)a(b-1))\,MS_{sp}(E_{AB})}{\lambda(\alpha, \beta, b-1, (n_{SBD}-1)(b-1))\,MS_{sb}(E_B)}. \tag{25}$$

No general conclusion about the relative efficiency of the split-block design with respect to the split-plot design can be drawn from (25), since mean squares are involved. Following Yates (1935) we consider a uniformity trial, i.e., a trial with dummy treatments. By Hering and Wang (1998), for a uniformity trial we have

$$\frac{MS_{sp}(E_{AB})}{MS_{sb}(E_B)} < 1.$$

Consequently, it can be shown that when $MS_{sp}(E_{AB})/MS_{sb}(E_B) < 1$,

$$ERE(\text{SBD to SPD}) < 1. \tag{26}$$

Proof. Suppose

$$\frac{MS_{sp}(E_{AB})}{MS_{sb}(E_B)} < 1 \quad \text{and} \quad ERE(\text{SBD to SPD}) \geq 1. \tag{27}$$

Then we have

$$\frac{\lambda(\alpha, \beta, b-1, (n_{SPD}-1)a(b-1))}{\lambda(\alpha, \beta, b-1, (n_{SBD}-1)(b-1))} > \frac{\lambda(\alpha, \beta, b-1, (n_{SPD}-1)a(b-1))\,MS_{sp}(E_{AB})}{\lambda(\alpha, \beta, b-1, (n_{SBD}-1)(b-1))\,MS_{sb}(E_B)} = ERE(\text{SBD to SPD}) \geq 1.$$

This is equivalent to

$$\frac{\lambda(\alpha, \beta, b-1, (n_{SPD}-1)a(b-1))}{\lambda(\alpha, \beta, b-1, (n_{SBD}-1)(b-1))} > 1. \tag{28}$$

On the other hand, by the definition of the ERE in (3),

$$n_{SPD} = n_{SBD}\,\frac{n_{SPD}\,ab}{n_{SBD}\,ab} = n_{SBD}\,\frac{N_{SPD}}{N_{SBD}} = n_{SBD}\,ERE(\text{SBD to SPD}) \geq n_{SBD}. \tag{29}$$

Thus,

$$\nu_2^{SBD} = (n_{SBD}-1)(b-1) \leq (n_{SPD}-1)(b-1) < (n_{SPD}-1)a(b-1) = \nu_2^{SPD}. \tag{30}$$

Since $\lambda(\alpha, \beta, \nu_1, \nu_2)$ decreases monotonically with increasing $\nu_2$, (30) implies

$$\frac{\lambda(\alpha, \beta, b-1, (n_{SPD}-1)a(b-1))}{\lambda(\alpha, \beta, b-1, (n_{SBD}-1)(b-1))} < 1, \tag{31}$$

and (31) contradicts (28). Thus we have proved (26).

When $MS_{sp}(E_{AB})/MS_{sb}(E_B) \geq 1$, no general conclusion for the relative efficiency can be drawn.

3.4. Testing effects of treatment interaction

The null hypothesis concerns the interaction between the whole-plot treatment and the split-plot treatment if a split-plot design is used, or equivalently, the interaction between the row treatment and the column treatment if a split-block design is used. That is,

$$H_0: (\alpha\beta)_{jk} = 0 \text{ for each } (j, k), \qquad H_A: (\alpha\beta)_{jk} \neq 0 \text{ for at least one } (j, k). \tag{32}$$

For a split-plot design, the ANOVA suggests an F-test with degrees of freedom $(a-1)(b-1)$ and $(n-1)a(b-1)$ for the numerator and denominator, respectively, whereas for a split-block design, the degrees of freedom for the numerator and denominator of the F-test are $(a-1)(b-1)$ and $(n-1)(a-1)(b-1)$, respectively. Analogous to Section 3.2, the relative efficiency of the split-block design versus the split-plot design can be expressed as

$$RE(\text{SBD to SPD}) = \frac{\lambda(\alpha, \beta, (a-1)(b-1), (n_{SPD}-1)a(b-1))\,\sigma^2_{eAB(SPD)}}{\lambda(\alpha, \beta, (a-1)(b-1), (n_{SBD}-1)(a-1)(b-1))\,\sigma^2_{eAB(SBD)}}. \tag{33}$$

So, as in Section 3.3, when a priori information is available, we can get the estimated relative efficiency by substituting the a priori information, and it can be shown that $MS_{sp}(E_{AB})/MS_{sb}(E_{AB}) < 1$ implies $ERE(\text{SBD to SPD}) < 1$.

4. A practical example

A practical example is given here to determine the efficiency of a split-block design relative to a split-plot design for hypothesis testing. The data set, given in Table 4, is cited from Gomez and Gomez (1984). The data were collected from a split-plot design conducted to investigate the effects of six levels of nitrogen (applied to the whole plots) and four varieties (on the split plots) on the grain yield. The ANOVA of the split-plot design¹ for the data set is summarized in Table 5. If this same data set had been collected from a split-block design,² we would have obtained a different ANOVA, as summarized in Table 6.
¹ The statistical models of the split-plot design and the split-block design are given in (7) and (9), respectively.
² For the experiment with a split-block design we assume that the nitrogen levels were applied to row plots and the varieties to column plots.

Table 4
The grain yield data set of four rice varieties grown with six levels of nitrogen in a split-plot design with three replications

Nitrogen   Variety     R1      R2      R3
N1         V1        4430    4478    3850
           V2        3944    5314    3660
           V3        3464    2944    3142
           V4        4126    4482    4836
N2         V1        5418    5166    6432
           V2        6502    5858    5586
           V3        4768    6004    5556
           V4        5192    4604    4652
N3         V1        6076    6420    6704
           V2        6008    6127    6642
           V3        6244    5724    6014
           V4        4546    5744    4146
N4         V1        6462    7056    6680
           V2        7139    6982    6564
           V3        5792    5880    6370
           V4        2774    5036    3638
N5         V1        7290    7848    7552
           V2        7682    6594    6576
           V3        7080    6662    6320
           V4        1414    1960    2766
N6         V1        8452    8832    8818
           V2        6228    7387    6006
           V3        5594    7122    5480
           V4        2248    1380    2014

The unit of yield is in kilograms.

Table 5
ANOVA of the data set in Table 4 for the split-plot design

Source of variation   d.f.               SS              MS       F
Replication              2      1 082 576.7       541 288.4
Nitrogen (A)             5     30 429 199.6     6 085 839.9    42.9
Error (A)               10      1 419 678.8       141 967.9
Variety (B)              3     89 888 101.2    29 962 700.4    85.7
A×B                     15     69 343 486.9     4 622 899.1    13.2
Error                   36     12 584 873.2       349 579.8
Total                   71    204 747 916.4
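As a cross-check, the sums of squares of the split-plot analysis, and of the split-block analysis of the same yields, can be recomputed from the data in Table 4. The sketch below uses the usual totals-based formulas for a balanced layout; the variable names are chosen for this illustration only, and the split-block error terms are obtained by splitting the split-plot residual into a replication × variety part, Error (B), and the remainder.

```python
# Grain yields from Table 4: YIELD[nitrogen][variety] = (R1, R2, R3).
YIELD = [
    [(4430, 4478, 3850), (3944, 5314, 3660), (3464, 2944, 3142), (4126, 4482, 4836)],
    [(5418, 5166, 6432), (6502, 5858, 5586), (4768, 6004, 5556), (5192, 4604, 4652)],
    [(6076, 6420, 6704), (6008, 6127, 6642), (6244, 5724, 6014), (4546, 5744, 4146)],
    [(6462, 7056, 6680), (7139, 6982, 6564), (5792, 5880, 6370), (2774, 5036, 3638)],
    [(7290, 7848, 7552), (7682, 6594, 6576), (7080, 6662, 6320), (1414, 1960, 2766)],
    [(8452, 8832, 8818), (6228, 7387, 6006), (5594, 7122, 5480), (2248, 1380, 2014)],
]
a, b, n = len(YIELD), len(YIELD[0]), len(YIELD[0][0])  # 6 nitrogen, 4 varieties, 3 reps
J, K, I = range(a), range(b), range(n)


def ss(totals, units_per_total, cf):
    """Sum of squares computed from marginal totals, corrected for the mean."""
    return sum(t * t for t in totals) / units_per_total - cf


grand = sum(YIELD[j][k][i] for j in J for k in K for i in I)
cf = grand ** 2 / (a * b * n)  # correction factor
ss_total = sum(YIELD[j][k][i] ** 2 for j in J for k in K for i in I) - cf
ss_rep = ss([sum(YIELD[j][k][i] for j in J for k in K) for i in I], a * b, cf)
ss_a = ss([sum(YIELD[j][k][i] for k in K for i in I) for j in J], b * n, cf)
ss_b = ss([sum(YIELD[j][k][i] for j in J for i in I) for k in K], a * n, cf)
ss_rep_a = ss([sum(YIELD[j][k][i] for k in K) for j in J for i in I], b, cf)
ss_rep_b = ss([sum(YIELD[j][k][i] for j in J) for k in K for i in I], a, cf)
ss_cells = ss([sum(YIELD[j][k]) for j in J for k in K], n, cf)

ss_err_a = ss_rep_a - ss_rep - ss_a            # Error (A), both designs
ss_ab = ss_cells - ss_a - ss_b                 # A x B interaction
ss_err_sp = ss_total - ss_rep - ss_a - ss_err_a - ss_b - ss_ab  # split-plot residual
ss_err_b = ss_rep_b - ss_rep - ss_b            # Error (B), split-block only
ss_err_sb = ss_err_sp - ss_err_b               # split-block residual

ms_a, ms_b, ms_ab = ss_a / (a - 1), ss_b / (b - 1), ss_ab / ((a - 1) * (b - 1))
ms_err_a = ss_err_a / ((n - 1) * (a - 1))
ms_err_sp = ss_err_sp / ((n - 1) * a * (b - 1))
ms_err_b = ss_err_b / ((n - 1) * (b - 1))
ms_err_sb = ss_err_sb / ((n - 1) * (a - 1) * (b - 1))

f_split_plot = {"A": ms_a / ms_err_a, "B": ms_b / ms_err_sp, "AxB": ms_ab / ms_err_sp}
f_split_block = {"A": ms_a / ms_err_a, "B": ms_b / ms_err_b, "AxB": ms_ab / ms_err_sb}
print(f_split_plot)
print(f_split_block)
```

The only difference between the two analyses is the denominator used for B and for A×B, which is exactly what drives the efficiency comparison in the remainder of this section.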
Table 6
ANOVA of the data set in Table 4 for the split-block design

Source of variation   d.f.               SS              MS       F
Replication              2      1 082 576.7       541 288.4
Nitrogen (A)             5     30 429 199.6     6 085 839.9    42.9
Error (A)               10      1 419 678.8       141 967.9
Variety (B)              3     89 888 101.2    29 962 700.4   154.1
Error (B)                6      1 166 911.0       194 485.2
A×B                     15     69 343 486.9     4 622 899.1    12.1
Error                   30     11 417 962.2       380 598.7
Total                   71    204 747 916.4

From the results of these two ANOVAs we are able to derive the sample sizes required by these two designs to guarantee a given precision for testing a specific hypothesis. Consequently, the relative efficiency of these two designs can be determined with respect to the hypothesis testing problems. Specifically, the relative efficiency of the split-block design to the split-plot design is compared for testing the following null hypotheses:

• H1. Effects of all the nitrogen levels on the grain yield are equal.
• H2. Effects of all the varieties on the grain yield are equal.
• H3. Effects of all the interactions of nitrogen and variety on the grain yield are equal.

The efficiency will be calculated for the following precision, characterized by the parameter combinations:

• $\alpha = 0.01(0.01)0.1$, denoting the significance level.
• $\beta = 0.02(0.02)0.3$, where $1 - \beta$ is the power of the test.
• $d = 0.25\min(s_{sp}, s_{sb})$, $0.5\min(s_{sp}, s_{sb})$, $\min(s_{sp}, s_{sb})$, $\max(s_{sp}, s_{sb})$ and $2\max(s_{sp}, s_{sb})$.

Here, $s^2_{sp}$ is the estimate of the expected mean square in the ANOVA of the split-plot design that is used as the denominator of the F-ratio for testing a given hypothesis. $s^2_{sb}$ is the estimate of the expected mean square in the ANOVA of the split-block design that is used as the denominator of the F-ratio for testing the same hypothesis. Thus, for testing H1, for instance, $s^2_{sp}$, the estimate of $EMS(E_A)$ in Table 1, is equal to 141 967.9, while $s^2_{sb}$, the estimate of $EMS(E_A)$ in Table 2, is equal to 141 967.9.
However, for testing H2, $s^2_{sp}$, the estimate of $EMS(E_{AB})$ in Table 1, is equal to 349 579.8 and $s^2_{sb}$, the estimate of $EMS(E_B)$ in Table 2, is equal to 194 485.2, as indicated in Tables 5 and 6. Fig. 1 visualizes the specification of $d$. Note that A may equal B, as will be seen in the case of testing H1.

Fig. 1. The specifications of $d$, marked on an axis at 0, A/4, A/2, A, B and 2B. Here, A = $\min(s_{sp}, s_{sb})$, B = $\max(s_{sp}, s_{sb})$.

4.1. Testing the equality of whole-plot treatment (row-treatment) effects

Table 7
Number of replications for the split-block design and the split-plot design when $d = \max(s_{sp}, s_{sb})$ (rows: $\alpha$; columns: $\beta$)

α \ β   0.02   0.04   0.06   0.08   0.10   0.12   0.14   0.16   0.18   0.20   0.22   0.24   0.26   0.28   0.30
0.01   16.91  15.19  14.14  13.35  12.72  12.19  11.73  11.32  10.95  10.60  10.29   9.99   9.71   9.44   9.19
0.02   15.38  13.74  12.72  11.97  11.37  10.87  10.43  10.03   9.68   9.35   9.05   8.77   8.50   8.25   8.01
0.03   14.44  12.85  11.86  11.14  10.55  10.06   9.63   9.25   8.91   8.60   8.30   8.03   7.78   7.53   7.30
0.04   13.76  12.20  11.23  10.52   9.95   9.47   9.06   8.69   8.35   8.05   7.76   7.50   7.25   7.01   6.79
0.05   13.22  11.68  10.74  10.04   9.48   9.01   8.60   8.24   7.91   7.61   7.33   7.07   6.83   6.60   6.38
0.06   12.77  11.25  10.32   9.63   9.08   8.62   8.22   7.86   7.54   7.25   6.97   6.72   6.48   6.25   6.04
0.07   12.38  10.88   9.96   9.29   8.74   8.29   7.89   7.54   7.23   6.94   6.67   6.42   6.18   5.96   5.75
0.08   12.03  10.55   9.65   8.98   8.44   7.99   7.61   7.26   6.95   6.66   6.40   6.15   5.92   5.70   5.49
0.09   11.72  10.26   9.36   8.70   8.18   7.73   7.35   7.01   6.70   6.42   6.16   5.91   5.68   5.47   5.26
0.10   11.44   9.99   9.11   8.45   7.93   7.49   7.11   6.78   6.47   6.19   5.94   5.70   5.47   5.26   5.05

For testing H1, it has been verified that the estimated relative efficiency of the split-block design to the split-plot design is constantly equal to one, regardless of the precision requirement. Thus, it can be concluded that if the purpose of the experiment is to test the null hypothesis of equal whole-plot treatment effects, the split-block design and the split-plot design are equally efficient in the sense that they need
equal numbers of experimental units for any given precision requirement. Table 7 lists the number of replications (not rounded, for higher precision) required by both the split-block design and the split-plot design in order to guarantee the precision characterized by the combination of $\alpha$, $\beta$ and $d = \max(s_{sp}, s_{sb})$.

4.2. Testing equality of the split-plot treatment (column-treatment) effects

Figs. 2–6 show the estimated relative efficiencies of the split-block design to the corresponding split-plot design for testing the hypothesis of equal split-plot (for the split-plot design) or column (for the corresponding split-block design) treatment effects. In Figs. 2–6 the values of $d$ are set to $0.25\min(s_{sp}, s_{sb})$, $0.5\min(s_{sp}, s_{sb})$, $\min(s_{sp}, s_{sb})$, $\max(s_{sp}, s_{sb})$ and $2\max(s_{sp}, s_{sb})$, respectively. On the right-hand side of each figure the relative efficiency is zoomed for better visual results. In each of the figures, the 15 curves located from the top to the bottom correspond to the relative efficiencies for $\beta$ increasing from 0.02 to 0.30 with step 0.02.

[Fig. 2: Efficiency plotted against Alpha (0.01 to 0.10); full-scale panel and zoomed panel.]
[Fig. 3: Efficiency plotted against Alpha (0.01 to 0.10); full-scale panel and zoomed panel.]

It can be summarized from Figs. 2–6 that

(1) For fixed $\alpha$ and $\beta$, the relative efficiency of the split-block design to the split-plot design decreases with the increase of $d$. It is remarkable that the relative efficiency decreases from around 1.75 to about 0.7 for different settings of $d$. When $d$ is very small, the numbers of experimental units required for both designs are very large in order to
satisfy the precision requirement. On the other hand, when $d$ is very large, the numbers of experimental units required by both designs are rather small, and so are the degrees of freedom of the denominator in the F-tests. Table 8 lists the calculated degrees of freedom for both designs.
(2) With the increase of $\beta$, the relative efficiency decreases slightly for all $d$. The differences among the efficiencies for different $\beta$ become larger for larger values than for smaller values of $d$.
(3) As $d$ increases, the difference between the efficiencies for different values of $\alpha$ becomes larger.
(4) When $d \leq 0.5\min(s_{sp}, s_{sb})$, the efficiencies for all the settings of $\beta$ increase monotonically with the increase in $\alpha$.

[Fig. 4: Efficiency plotted against Alpha (0.01 to 0.10); full-scale panel and zoomed panel.]
[Fig. 5: Efficiency plotted against Alpha (0.01 to 0.10); full-scale panel and zoomed panel.]
[Fig. 6: Efficiency plotted against Alpha (0.01 to 0.10); full-scale panel and zoomed panel.]

Table 8
Degrees of freedom of the denominator of the F-test required by the split-block design and the split-plot design when $\alpha = 0.05$, $\beta = 0.20$

d                              Efficiency   d.f. for SBD   d.f. for SPD
$0.25\min(s_{sp}, s_{sb})$     1.753475     176            1869
$0.5\min(s_{sp}, s_{sb})$      1.643532     45             458
$\min(s_{sp}, s_{sb})$         1.371532     12             104
$\max(s_{sp}, s_{sb})$         1.206837     7              51
$2\max(s_{sp}, s_{sb})$        0.823191     1              1
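The entries of Table 8 can be related to one another through Table 3 and definition (3): for testing factor B the denominator degrees of freedom are $(n-1)(b-1)$ for the split-block design and $(n-1)a(b-1)$ for the split-plot design, and the efficiency is the ratio $n_{SPD}/n_{SBD}$ of the (non-rounded) numbers of replications. The sketch below inverts the degrees of freedom for $a = 6$ and $b = 4$ and checks that each listed efficiency lies within the range implied by the listed degrees of freedom. It assumes, as a guess about how the table was prepared, that the degrees of freedom were rounded to the nearest integer; `efficiency_bounds` is a hypothetical helper.

```python
A_LEVELS, B_LEVELS = 6, 4  # nitrogen levels (a) and varieties (b) in the example

# Rows of Table 8: (setting of d, efficiency, d.f. for SBD, d.f. for SPD).
TABLE8 = [
    ("0.25 min", 1.753475, 176, 1869),
    ("0.5 min", 1.643532, 45, 458),
    ("min", 1.371532, 12, 104),
    ("max", 1.206837, 7, 51),
    ("2 max", 0.823191, 1, 1),
]


def efficiency_bounds(df_sbd, df_spd, a=A_LEVELS, b=B_LEVELS, half_width=0.5):
    """Range of n_SPD / n_SBD compatible with d.f. rounded to the nearest integer."""
    # Invert df = (n - 1)(b - 1) for the split-block design.
    n_sbd_lo = 1.0 + (df_sbd - half_width) / (b - 1)
    n_sbd_hi = 1.0 + (df_sbd + half_width) / (b - 1)
    # Invert df = (n - 1) a (b - 1) for the split-plot design.
    n_spd_lo = 1.0 + (df_spd - half_width) / (a * (b - 1))
    n_spd_hi = 1.0 + (df_spd + half_width) / (a * (b - 1))
    return n_spd_lo / n_sbd_hi, n_spd_hi / n_sbd_lo


checks = []
for label, eff, df_sbd, df_spd in TABLE8:
    lo, hi = efficiency_bounds(df_sbd, df_spd)
    checks.append(lo <= eff <= hi)
    print(f"d = {label:8s}  efficiency {eff:.6f}  implied range [{lo:.4f}, {hi:.4f}]")
```

The ranges widen as $d$ grows because a rounding error of half a degree of freedom matters more when only a few degrees of freedom are left.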
4.3. Testing the equality of interactions between the whole-plot and the split-plot treatment

Figs. 7–11 show the estimated relative efficiencies of the split-block design to the split-plot design for testing the hypothesis of equal interaction effects between the whole-plot and split-plot treatments (for the split-plot design) or between the row and column treatments (for the corresponding split-block design). In Figs. 7–11 the values of d are set to 0.25 min(s_sp, s_sb), 0.5 min(s_sp, s_sb), min(s_sp, s_sb), max(s_sp, s_sb) and 2 max(s_sp, s_sb), respectively. On the right-hand side of each figure the relative efficiency is zoomed for better visibility. In each of the figures, the 15 curves from top to bottom correspond to the relative efficiencies for β increasing from 0.02 to 0.30 in steps of 0.02. Owing to the limited precision of the calculation, the zoomed relative efficiencies for small d are distorted, as shown in Figs. 7 and 8.

[Fig. 7. Efficiency plotted against alpha for d = 0.25 min(s_sp, s_sb); right-hand panel zoomed.]
[Fig. 8. Efficiency plotted against alpha for d = 0.5 min(s_sp, s_sb); right-hand panel zoomed.]
[Fig. 9. Efficiency plotted against alpha for d = min(s_sp, s_sb); right-hand panel zoomed.]
[Fig. 10. Efficiency plotted against alpha for d = max(s_sp, s_sb); right-hand panel zoomed.]
[Fig. 11. Efficiency plotted against alpha for d = 2 max(s_sp, s_sb); right-hand panel zoomed.]

For testing the hypothesis of equal interactions, the following can be summarized:

(1) For fixed α and β, the relative efficiency of the split-block design to the split-plot design decreases with the increase in d. The relative efficiency of the split-block design to the split-plot design is always less than one, meaning that the split-plot design is favored regardless of the settings of the precision requirement. As before, a small d implies a larger number of experimental units, and vice versa.
(2) With the increase of β, the relative efficiency of the split-block design to the split-plot design decreases slightly for all d. The differences among the efficiencies for different β are greater for larger values of d than for smaller ones.
(3) When d is very small, e.g., 0.25 min(s_sp, s_sb), the relative efficiencies for different values of β are nearly the same. As d increases, the differences among the efficiencies for different values of β increase too.

References

Das Gupta, P., 1968. Tables of the non-centrality parameter of F-test as a function of power. The Indian J. Stat., Ser. B, 30, Parts 1 and 2.
Gomez, K.A., Gomez, A.A., 1984. Statistical Procedures for Agricultural Research, second ed. Wiley, New York.
Hering, D., Wang, M.H., 1998. Efficiency comparison of split-block design versus split-plot design. Biometrical Lett. 35 (1), 27–35.
Rasch, D., Wang, M.H., 1998. Determination of the size of an experiment for the analysis of variance when at least one factor is fixed. Biometrical Lett. 35 (1), 117–125.
Searle, S.R., 1971. Linear Models. Wiley, New York.
Tang, P.C., 1938. The power function of the analysis of variance tests with table and illustrations of their use. Statistical Research Memoirs 2, 126–149.
Wang, M.H., 2002. Sample size and efficiency for hypotheses testing in ANOVA model. Ph.D. Thesis, Logos Verlag, Berlin.
Yates, F., 1935. Complex experiments. J. Roy. Statist. Soc. (Suppl. 2), 181–247.
