Split-Plot Design in R

The traditional split-plot design is, from a statistical analysis standpoint, similar to the two factor repeated measures desgin from last week. The design consists of blocks (or whole plots) in which one factor (the whole plot factor) is applied to randomly. Within each whole plot/block, it is split into smaller units and the levels of second factor are applied randomly to the smaller pieces of the whole plot. The key is the experimental unit is different for each factor.

Oats Example

In this example, entire fields are planted with one of three types of oats. These are the whole plots, of which there are 18. Each whole plot is split into four split-plots, each of which was randomly assigned one of four levels of nitrogen. The yield (in bushels/acre) of oats was determined. Of interest are differences in oat variety and nitrogen levels.

sp.oats <- read.csv("oats.csv")
sp.oats <- within(sp.oats, nitroF <- factor(nitro))
head(sp.oats)
##   observation replicate variety nitro yield nitroF
## 1           1         I Victory   0.0   111      0
## 2           2         I Victory   0.2   130    0.2
## 3           3         I Victory   0.4   157    0.4
## 4           4         I Victory   0.6   174    0.6
## 5           5         I  Golden   0.0   117      0
## 6           6         I  Golden   0.2   114    0.2
library(lattice)  # Can only list one package at a time
library(car)
library(agricolae)
with(sp.oats, xyplot(yield ~ nitroF | variety))

plot of chunk unnamed-chunk-1

with(sp.oats, xyplot(yield ~ nitroF | variety, groups = replicate))

plot of chunk unnamed-chunk-1

with(sp.oats, xyplot(yield ~ nitroF | variety, groups = replicate, aspect = "xy"))

plot of chunk unnamed-chunk-1

with(sp.oats, xyplot(yield ~ nitroF | variety, groups = replicate, aspect = "xy", 
    type = "o"))

plot of chunk unnamed-chunk-1

with(sp.oats, xyplot(yield ~ nitroF | variety, groups = replicate, aspect = "xy", 
    type = "a"))

plot of chunk unnamed-chunk-1

If you ignore the experimental design, you get the following, incorrect, results. Notice, the error df is higher, making it easier to detect differences that are not really there.

res.bad <- lm(yield ~ variety * nitroF, data = sp.oats)
anova(res.bad)
## Analysis of Variance Table
## 
## Response: yield
##                Df Sum Sq Mean Sq F value  Pr(>F)    
## variety         2   1786     893    1.79    0.17    
## nitroF          3  20020    6673   13.41 8.4e-07 ***
## variety:nitroF  6    322      54    0.11    1.00    
## Residuals      60  29857     498                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The analysis in R follows the same pattern as last week. The “variety” factor only has (18-1) total df to use when testing for differences in the variety. Therefore, we need to specify the error term for variety correctly. It should be noted that the form of the “Error(A:wholeplot)” can change depending on the data is laid out. The key is to know the correct df so you know have the correct results.

res.good <- aov(yield ~ variety * nitroF + Error(replicate:variety), data = sp.oats)
summary(res.good)
## 
## Error: replicate:variety
##           Df Sum Sq Mean Sq F value Pr(>F)
## variety    2   1786     893    0.61   0.56
## Residuals 15  21889    1459               
## 
## Error: Within
##                Df Sum Sq Mean Sq F value  Pr(>F)    
## nitroF          3  20020    6673    37.7 2.5e-12 ***
## variety:nitroF  6    322      54     0.3    0.93    
## Residuals      45   7969     177                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

In order to check assumptions, you need to not use the error term. You can add the term without error, but the F tests are wrong. Assumption checking is OK, however.

res.good2 <- aov(yield ~ variety * nitroF + replicate:variety, data = sp.oats)
summary(res.good2)
##                   Df Sum Sq Mean Sq F value  Pr(>F)    
## variety            2   1786     893    5.04   0.011 *  
## nitroF             3  20020    6673   37.69 2.5e-12 ***
## variety:nitroF     6    322      54    0.30   0.932    
## variety:replicate 15  21889    1459    8.24 1.6e-08 ***
## Residuals         45   7969     177                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
plot(res.good2, 1)

plot of chunk unnamed-chunk-4

plot(res.good2, 2)

plot of chunk unnamed-chunk-4

plot(res.good2, 5)

plot of chunk unnamed-chunk-4

boxCox(res.good2)

plot of chunk unnamed-chunk-4

Since the Box-Cox plot is good, no need to make any transformations. You still need to identify which levels of nitrogen are different.

with(sp.oats, HSD.test(yield, nitroF, DFerror = 45, MSerror = 117))
## 
## Study:
## 
## HSD Test for yield 
## 
## Mean Square Error:  117 
## 
## nitroF,  means
## 
##      yield std.err  r Min. Max.
## 0    79.39   4.571 18   53  117
## 0.2  98.89   5.149 18   64  140
## 0.4 114.22   5.260 18   81  161
## 0.6 123.39   5.421 18   86  174
## 
## alpha: 0.05 ; Df Error: 45 
## Critical Value of Studentized Range: 3.773 
## 
## Honestly Significant Difference: 9.619 
## 
## Means with the same letter are not significantly different.
## 
## Groups, Treatments and means
## a     0.6     123.4 
## a     0.4     114.2 
## b     0.2     98.89 
## c     0       79.39
library(gplots)
plot.stuff <- with(sp.oats, HSD.test(yield, nitroF, DFerror = 45, MSerror = 117))
## 
## Study:
## 
## HSD Test for yield 
## 
## Mean Square Error:  117 
## 
## nitroF,  means
## 
##      yield std.err  r Min. Max.
## 0    79.39   4.571 18   53  117
## 0.2  98.89   5.149 18   64  140
## 0.4 114.22   5.260 18   81  161
## 0.6 123.39   5.421 18   86  174
## 
## alpha: 0.05 ; Df Error: 45 
## Critical Value of Studentized Range: 3.773 
## 
## Honestly Significant Difference: 9.619 
## 
## Means with the same letter are not significantly different.
## 
## Groups, Treatments and means
## a     0.6     123.4 
## a     0.4     114.2 
## b     0.2     98.89 
## c     0       79.39
names(plot.stuff)
## [1] "statistics" "parameters" "means"      "comparison" "groups"
plot.stuff$means
##      yield std.err  r Min. Max.
## 0    79.39   4.571 18   53  117
## 0.2  98.89   5.149 18   64  140
## 0.4 114.22   5.260 18   81  161
## 0.6 123.39   5.421 18   86  174
plot.stuff$groups
##   trt  means M
## 1 0.6 123.39 a
## 2 0.4 114.22 a
## 3 0.2  98.89 b
## 4 0    79.39 c
barplot2(plot.stuff$means[, 1])

plot of chunk unnamed-chunk-5

# Add some lables
barplot2(plot.stuff$means[, 1], names.arg = rownames(plot.stuff$means), xlab = "Nitrogen", 
    ylab = "Yield per Acre")

plot of chunk unnamed-chunk-5

# Add some error bars
mu.i <- plot.stuff$means[, 1]
se.i <- qt(1 - 0.05/2, 45) * plot.stuff$means[, 2]
bp <- barplot2(mu.i, names.arg = rownames(plot.stuff$means), xlab = "Nitrogen", 
    ylab = "Yield per Acre", plot.ci = TRUE, ci.l = mu.i - se.i, ci.u = mu.i + 
        se.i)
# Group them together
text(bp, 0, plot.stuff$groups[, 3], cex = 1, pos = 3)

plot of chunk unnamed-chunk-5

# Is this correct?
plot.stuff$groups
##   trt  means M
## 1 0.6 123.39 a
## 2 0.4 114.22 a
## 3 0.2  98.89 b
## 4 0    79.39 c
order(plot.stuff$groups[, 1])
## [1] 4 3 2 1
plot.stuff$groups[order(plot.stuff$groups[, 1]), 3]
## [1] c b a a
## Levels: a b c
# This is better
bp <- barplot2(mu.i, names.arg = rownames(plot.stuff$means), xlab = "Nitrogen", 
    ylab = "Yield per Acre", plot.ci = TRUE, ci.l = mu.i - se.i, ci.u = mu.i + 
        se.i)
text(bp, 0, plot.stuff$groups[order(plot.stuff$groups[, 1]), 3], cex = 1, pos = 3)

plot of chunk unnamed-chunk-5

# Sometimes it is easier to do it by hand
bp <- barplot2(mu.i, names.arg = rownames(plot.stuff$means), xlab = "Nitrogen", 
    ylab = "Yield per Acre", plot.ci = TRUE, ci.l = mu.i - se.i, ci.u = mu.i + 
        se.i)
text(bp, 0, c("A", "B", "C", "C"), cex = 1, pos = 3)

plot of chunk unnamed-chunk-5

Split-Split-Split Plot Design

In this experiment you wish to measure the effects of three factors on the amount of glycogen in the liver. In your experiment, there are 6 rats (whole plots).

To each rat, one of three food diets was randomly assigned (T1, T2, and T3).

From each rat, the liver was removed and split into four segments. Each segment was prepared using one of two different chemicals (P1 and P2).

Finally, each piece of liver had the glycogen level measured using two different analytical techniques (A and B).

The experimental unit for diet is rat. The experimental unit for liver preperation chemical is strip of liver. The experimental unit for analytical technique is piece of liver. All there are different.

spp.rat <- read.csv("split_split_rat.csv")
spp.rat
##    glycogen food       rat prep method
## 1       127   T1      Remy   P1      A
## 2       126   T1      Remy   P1      B
## 3       127   T1      Remy   P2      A
## 4       121   T1      Remy   P2      B
## 5       124   T1      Remy   P1      A
## 6       125   T1      Remy   P1      B
## 7       132   T1      Remy   P2      A
## 8       138   T1      Remy   P2      B
## 9       146   T1 Templeton   P1      A
## 10      144   T1 Templeton   P1      B
## 11      136   T1 Templeton   P2      A
## 12      139   T1 Templeton   P2      B
## 13      156   T1 Templeton   P1      A
## 14      146   T1 Templeton   P1      B
## 15      134   T1 Templeton   P2      A
## 16      133   T1 Templeton   P2      B
## 17      157   T2  Scabbers   P1      A
## 18      145   T2  Scabbers   P1      B
## 19      154   T2  Scabbers   P2      A
## 20      142   T2  Scabbers   P2      B
## 21      147   T2  Scabbers   P1      A
## 22      153   T2  Scabbers   P1      B
## 23      155   T2  Scabbers   P2      A
## 24      157   T2  Scabbers   P2      B
## 25      151   T2  Splinter   P1      A
## 26      155   T2  Splinter   P1      B
## 27      147   T2  Splinter   P2      A
## 28      147   T2  Splinter   P2      B
## 29      162   T2  Splinter   P1      A
## 30      152   T2  Splinter   P1      B
## 31      145   T2  Splinter   P2      A
## 32      144   T2  Splinter   P2      B
## 33      130   T3 Nicodemus   P1      A
## 34      121   T3 Nicodemus   P1      B
## 35      134   T3 Nicodemus   P2      A
## 36      134   T3 Nicodemus   P2      B
## 37      131   T3 Nicodemus   P1      A
## 38      132   T3 Nicodemus   P1      B
## 39      128   T3 Nicodemus   P2      A
## 40      127   T3 Nicodemus   P2      B
## 41      134   T3     Rizzo   P1      A
## 42      136   T3     Rizzo   P1      B
## 43      135   T3     Rizzo   P2      A
## 44      134   T3     Rizzo   P2      B
## 45      130   T3     Rizzo   P1      A
## 46      123   T3     Rizzo   P1      B
## 47      136   T3     Rizzo   P2      A
## 48      133   T3     Rizzo   P2      B
with(spp.rat, xyplot(glycogen ~ factor(method:prep) | food, groups = rat, aspect = "xy"))

plot of chunk unnamed-chunk-6

This is a nutty design, but it happens. Treatment, prep, and method all have different denominator MS for their f-test.

What if you did it wrong?

fact.bad <- lm(glycogen ~ food * prep * method, data = spp.rat)
anova(fact.bad)
## Analysis of Variance Table
## 
## Response: glycogen
##                  Df Sum Sq Mean Sq F value  Pr(>F)    
## food              2   3530    1765   32.26 9.4e-09 ***
## prep              1     35      35    0.64    0.43    
## method            1     54      54    0.99    0.33    
## food:prep         2    133      67    1.22    0.31    
## food:method       2      5       3    0.05    0.95    
## prep:method       1     11      11    0.20    0.66    
## food:prep:method  2      5       3    0.05    0.95    
## Residuals        36   1970      55                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

First, lets pretend that the method factor is ignored. That makes this a split-plot design.

sp.res <- aov(glycogen ~ food * prep + Error(rat/food), data = spp.rat)
## Warning: Error() model is singular
summary(sp.res)
## 
## Error: rat
##           Df Sum Sq Mean Sq F value Pr(>F)  
## food       2   3530    1765    6.22  0.086 .
## Residuals  3    851     284                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
## 
## Error: Within
##           Df Sum Sq Mean Sq F value Pr(>F)
## prep       1     35    35.0    1.14   0.29
## food:prep  2    133    66.6    2.18   0.13
## Residuals 39   1194    30.6

Now add method as a second split, so now it is a split-split-plot design.

sp.res <- aov(glycogen ~ food * prep * method + Error(rat/food:prep), data = spp.rat)
## Warning: Error() model is singular
summary(sp.res)
## 
## Error: rat
##           Df Sum Sq Mean Sq F value Pr(>F)  
## food       2   3530    1765    6.22  0.086 .
## Residuals  3    851     284                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
## 
## Error: rat:food:prep
##           Df Sum Sq Mean Sq F value Pr(>F)
## prep       1     35    35.0    0.27   0.64
## food:prep  2    133    66.6    0.51   0.64
## Residuals  3    390   130.0               
## 
## Error: Within
##                  Df Sum Sq Mean Sq F value Pr(>F)
## method            1     54    54.2    2.23   0.15
## food:method       2      5     2.7    0.11   0.90
## prep:method       1     11    11.0    0.45   0.51
## food:prep:method  2      5     2.6    0.11   0.90
## Residuals        30    728    24.3