# Appendix 4: Contrasts

Learning Objectives

To learn how to build contrasts in PERMANOVA and other R functions.

Contents:

- General Principles
- A More Detailed Grazing Example
- Pairwise Contrasts
- Different Methods Yield Different Conclusions
- Custom Contrasts
- References

In the chapter about complex models, I presented a simple way to do pairwise contrasts. Here, I want to describe some more general concepts related to contrasts. I present these in the context of PERMANOVA, but they are general concepts– they are applicable for all linear models in R, including conventional parametric tests such as `lm()`

and `aov()`

. After outlining some general principles, we will consider how to do pairwise contrasts and custom contrasts.

# General Principles

R, like all statistical packages, deals with factors via **contrasts**. Contrasts are an important follow-up to many analyses. Some – but not all – analytical techniques have pre-defined functions that can be used to conduct pair-wise comparisons. However, you can also specify these comparisons yourself for complete control over your analyses.

You may recall from parametric statistics courses that the analysis of a factor with *n* groups can be decomposed into *n*-1 contrasts. Crawley (2012) provides a very good explanation and demonstration of contrasts. He also demonstrates an approach in which the model is simplified by aggregating non-significant factor levels in a step-wise *a posteriori *procedure.

Each contrast consists of a series of coefficients with the following characteristics:

- Levels that are being combined get the same sign
- Levels that are being contrasted get opposing signs
- Levels that are being omitted get a coefficient of 0

When using contrasts, it is often helpful that they be **orthogonal** (i.e., uncorrelated with one another). Using orthogonal contrasts enables you to fully decompose the variability associated with a factor into the variability associated with each contrast. In addition to the three characteristics listed above, orthogonal contrasts also have the following characteristics:

- The coefficients in a given contrast sum to zero
- The cross-products of any two contrasts sum to zero

A key principle when conducting a contrast is to pay attention to how variance is being partitioned. An analysis of a factor with two levels obviously does not require follow-up contrasts, so our focus is on factors with > 2 levels and thus > 1 df. Our intent is to fully partition the df and variance (SS) associated with the factor into the df and SS associated with the contrast of interest and the df and SS associated with other aspects of the factor. If these are not fully partitioned, the unaccounted-for df and SS will be lumped into the residual. Depending on their magnitude, this may impact conclusions about the statistical significance of the contrast of interest.

# A More Detailed Grazing Example

The oak plant community dataset contains factors indicating whether stands were grazed in the past and/or grazed at the time the data were collected (`GrazPast`

and `GrazCurr`

, respectively). We’ll combine these factors together into a new `Grazing`

factor (capitalized):

`Grazing <- factor(with(Oak, paste(GrazPast, GrazCurr, sep = "_")))`

`summary(Grazing)`

```
No_No Yes_No Yes_Yes
24 6 17
```

Since each of the original factors is binary, there are four possible levels here. However, there are no stands that were not grazed in the past but were being grazed at the time the data were collected (i.e., the ‘No_Yes’ combination is absent).

Let’s test the overall effect of `Grazing`

on vegetation composition, as measured in `Oak1`

:

`Graz.res2 <- adonis2(Oak1 ~ Grazing, method = "bray")`

View ANOVA table (`Graz.res2`

):

```
Permutation test for adonis under reduced model
Terms added sequentially (first to last)
Permutation: free
Number of permutations: 999
adonis2(formula = Oak1 ~ Grazing, method = "bray")
Df SumOfSqs R2 F Pr(>F)
Grazing 2 0.9095 0.07844 1.8725 0.006 **
Residual 44 10.6854 0.92156
Total 46 11.5949 1.00000
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
```

Yes, there is a relationship between `Grazing`

and composition. But which levels differ from one another? To determine this, we need to use pairwise contrasts.

There are several ‘families’ of contrasts that can be applied in linear models. For our purposes here, we will focus on two:

`contr.treatment`

– sets the first level to be the baseline and contrasts each other level with the baseline. The help file notes that this contrast may not be orthogonal.`contr.sum`

– contrasts levels directly against each other.

The desired family can be specified using the `C()`

function (note: capital C!):

`C(Grazing, contr.sum)`

```
[1] Yes_Yes Yes_No Yes_Yes No_No No_No Yes_Yes Yes_Yes
[8] No_No No_No No_No Yes_Yes Yes_Yes Yes_Yes Yes_Yes
[15] Yes_No Yes_No No_No No_No No_No No_No No_No
[22] No_No No_No No_No No_No No_No No_No No_No
[29] Yes_Yes No_No No_No No_No Yes_Yes No_No Yes_Yes
[36] Yes_Yes Yes_Yes Yes_No Yes_No Yes_Yes Yes_Yes Yes_Yes
[43] Yes_Yes No_No Yes_No No_No No_No
attr(,"contrasts")
[,1] [,2]
No_No 1 0
Yes_No 0 1
Yes_Yes -1 -1
Levels: No_No Yes_No Yes_Yes
```

The contrasts that R applies in an analysis are stored in a model matrix. Note that the function to create this matrix uses a formula:

`Grazing.mm <- model.matrix( ~ C(Grazing, contr.sum))`

```
(Intercept) C(Grazing, contr.sum)1 C(Grazing, contr.sum)2
1 1 -1 -1
2 1 0 1
3 1 -1 -1
4 1 1 0
5 1 1 0
6 1 -1 -1
7 1 -1 -1
8 1 1 0
9 1 1 0
10 1 1 0
11 1 -1 -1
12 1 -1 -1
13 1 -1 -1
14 1 -1 -1
15 1 0 1
16 1 0 1
17 1 1 0
18 1 1 0
19 1 1 0
20 1 1 0
21 1 1 0
22 1 1 0
23 1 1 0
24 1 1 0
25 1 1 0
26 1 1 0
27 1 1 0
28 1 1 0
29 1 -1 -1
30 1 1 0
31 1 1 0
32 1 1 0
33 1 -1 -1
34 1 1 0
35 1 -1 -1
36 1 -1 -1
37 1 -1 -1
38 1 0 1
39 1 0 1
40 1 -1 -1
41 1 -1 -1
42 1 -1 -1
43 1 -1 -1
44 1 1 0
45 1 0 1
46 1 1 0
47 1 1 0
attr(,"assign")
[1] 0 1 1
attr(,"contrasts")
attr(,"contrasts")$`C(Grazing, contr.sum)`
[,1] [,2]
No_No 1 0
Yes_No 0 1
Yes_Yes -1 -1
```

By examining this object, we can see that:

- The first column is for the intercept
- The attributes at the bottom show that the first contrast compares ‘No_No’ (1) with ‘Yes_Yes’ (-1), ignoring ‘Yes_No’ (0)
- The second contrast compares ‘Yes_No’ (1) with ‘Yes_Yes’ (-1), ignoring ‘No_No’ (0)

*Note*: For more information about contrasts, see the R help files and Crawley (2012).

# Pairwise Contrasts

Pairwise contrasts require separate tests of all possible pairs of levels. The number of pairs of *n* levels is the same as the number of distances from *n* sample units: *n*(*n*-1)/2.

In this example, there are 3 levels so there are 3 ( 3 – 1 ) / 2 = 3 pairs. We are interested in determining, for each pair, whether there is a significant difference between them. We can organize the results in a table (not shown) or a distance matrix like this:

No_No | Yes_No | |

Yes_No | ||

Yes_Yes |

For convenience, we are just going to display the *P*-values.

In essence, what we are going to do is test each pair while accounting for the other variation that is attributable to the grazing factor. Our grouping variable had 3 levels, so there were 2 df. This means that in each test we can test no more than two pairwise contrasts. And, if we fit a single pairwise contrast we will need to make sure to also account for the other df and its associated variation.

The model matrix that we saw above includes two pairwise contrasts. If we fit the full model matrix as a single explanatory variable, we get the same result as when we used the Grazing term above:

`adonis2(Oak1 ~ Grazing.mm, method = "bray")`

```
Permutation test for adonis under reduced model
Terms added sequentially (first to last)
Permutation: free
Number of permutations: 999
adonis2(formula = Oak1 ~ Grazing.mm, method = "bray")
Df SumOfSqs R2 F Pr(>F)
Grazing.mm 2 0.9095 0.07844 1.8725 0.004 **
Residual 44 10.6854 0.92156
Total 46 11.5949 1.00000
---
Signif. codes:
0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
```

However, by treating each pairwise contrast as its own explanatory variable, we can partition the variance to these terms. We could do this either by indexing the model matrix within the `adonis2()`

function or by creating a new object for each contrast like this:

`con.NNvYY <- Grazing.mm[, 2]`

`con.YNvYY <- Grazing.mm[, 3]`

If we only use one of these contrasts, the variance is not partitioned correctly:

`adonis2(Oak1 ~ con.YNvYY, method = "bray")`

```
Permutation test for adonis under reduced model
Terms added sequentially (first to last)
Permutation: free
Number of permutations: 999
adonis2(formula = Oak1 ~ con.YNvYY, method = "bray")
Df SumOfSqs R2 F Pr(>F)
con.YNvYY 1 0.317 0.02734 1.2651 0.211
Residual 45 11.278 0.97266
Total 46 11.595 1.00000
```

Only 1 df has been used to test for the grazing effect. The other df and its associated variance have been included in the residual. (I used `con.YNvYY`

here as it illustrates this point more strongly than `con.NNvYY`

).

** Note:** this is different from the pairwise comparisons that we conducted earlier (

`pairwise.adonis2()`

); in that case we subsetted the data to a pair of levels. Stands with the other grazing treatment were excluded from the analysis, so their variance was not included in the residual.

Substitute both contrasts into the function, and verify that the df and SS for grazing have been fully partitioned between them:

`adonis2(Oak1 ~ con.NNvYY + con.YNvYY, method = "bray")`

```
Permutation test for adonis under reduced model
Terms added sequentially (first to last)
Permutation: free
Number of permutations: 999
adonis2(formula = Oak1 ~ con.NNvYY + con.YNvYY, method = "bray")
Df SumOfSqs R2 F Pr(>F)
con.NNvYY 1 0.7869 0.06787 3.2404 0.001 ***
con.YNvYY 1 0.1225 0.01057 0.5046 0.981
Residual 44 10.6854 0.92156
Total 46 11.5949 1.00000
---
Signif. codes:
0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
```

Note that the residual df and SS are identical to what they were when we used `Grazing`

or `Grazing.mm`

(each with three levels of this factor).

However, we are now dealing with a more complex model than we have seen previously: this model includes two terms, and is unbalanced. This means that the order in which terms enter the model matters – see the discussion of types of sums of squares in the ‘Complex Models’ chapter. The first contrast is accurate because it is the first term in the model, but the second one may not be.

The conclusions about a term depend on where it is in the formula, and on how other terms are dealt with. When conducting contrasts, we want each term to explain as much variation as possible. To do so, we will enter the contrast of interest as the first term in the model and the other contrast as the second term so that it accounts for as much of the remaining variation as possible. We could also specify `by = "terms"`

for completeness, though this is not strictly necessary as it is the default approach in `adonis2()`

.

Here is the model with our two contrasts reversed so that `con.YNvYY`

is in the first position:

`adonis2(Oak1 ~ con.YNvYY + con.NNvYY, method = "bray")`

```
Permutation test for adonis under reduced model
Terms added sequentially (first to last)
Permutation: free
Number of permutations: 999
adonis2(formula = Oak1 ~ con.YNvYY + con.NNvYY, method = "bray")
Df SumOfSqs R2 F Pr(>F)
con.YNvYY 1 0.3170 0.02734 1.3055 0.163
con.NNvYY 1 0.5924 0.05109 2.4394 0.002 **
Residual 44 10.6854 0.92156
Total 46 11.5949 1.00000
---
Signif. codes:
0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
```

Note that the *P*-value for `con.YNvYY`

has changed considerably. In this case, we still conclude that there is no significant difference in composition between the ‘Yes_No’ and ‘Yes_Yes’ grazing treatments. However, it is possible for one analysis to indicate significance and the other to indicate non-significance.

We can enter both of these pairwise contrasts in our matrix:

No_No | Yes_No | |

Yes_No | ||

Yes_Yes | 0.001 | 0.163 |

Now, what about the last pairwise contrast, between ‘No_No’ and ‘Yes_No’? To test this, we need to create a new contrast. We will do so by following four simple steps:

- Create a duplicate of the original factor. This is not strictly necessary, but I do this so that my original object is unchanged and I can easily refer back to it if desired.

`Grazing1 <- Grazing`

- Designate our own contrasts for this term. Contrasts are applied to the levels of a factor in the order in which they are summarized when you view them; the default is to report them alphabetically. See
`levels(Grazing1)`

to verify. Since we want to contrast ‘No_No’ with ‘Yes_No’, our coefficients will be (1, -1, 0). Two contrasts are possible among these three levels; if we do not specify the second, it will be created automatically to be orthogonal to the one we specified. We will apply this contrast to a new model matrix:

`Grazing.mm1 <- model.matrix(~ C(Grazing1, c(1, -1, 0)))`

Verify that the contrasts have been applied to the levels of the factor as intended. For example, every row that belongs to the ‘No_No’ level is assigned a ‘1’ in the first contrast. - Use indexing to create a separate object for each contrast.

`con.NNvYN <- Grazing.mm1[, 2]`

`con.NNvYN.rest <- Grazing.mm1[, 3]`

Note that the last contrast here is one that we may not be interested in but need to include for completeness. - Rerun our analysis with the new contrasts, specifying the ‘focal’ contrast as the first one in the model.

`adonis2(Oak1 ~ con.NNvYN + con.NNvYN.rest, method = "bray")`

```
Permutation test for adonis under reduced model
Terms added sequentially (first to last)
Permutation: free
Number of permutations: 999
adonis2(formula = Oak1 ~ con.NNvYN + con.NNvYN.rest, method = "bray")
Df SumOfSqs R2 F Pr(>F)
con.NNvYN 1 0.5389 0.04647 2.2189 0.002 **
con.NNvYN.rest 1 0.3706 0.03196 1.5260 0.066 .
Residual 44 10.6854 0.92156
Total 46 11.5949 1.00000
---
Signif. codes:
0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
```

Verify once again that the df and variance have been fully partitioned between the contrasts.

Finally, we update our matrix of pairwise contrasts:

No_No | Yes_No | |

Yes_No | 0.002 | |

Yes_Yes | 0.001 | 0.163 |

Now that we have completed the table, we can inspect the pairwise contrasts and summarize what they mean ecologically. Here, we see evidence that composition differs between stands that have never been grazed (‘No_No’) and those that have been grazed at any point. Stands that were grazed in the past do not differ in composition, regardless of whether they were being grazed at the time the data were collected.

# Different Methods Yield Different Conclusions

You may be interested in seeing how well the above matrix of pairwise comparisons agrees with the matrix of comparisons we created in the ‘Complex Models‘ chapter. That one is repeated here:

No_No | Yes_No | |

Yes_No | 0.359 | |

Yes_Yes | 0.001 | 0.932 |

These results obviously do not match. Why not? There are several possible reasons:

- Using a different set of permutations for each test
- Data are unbalanced
- When using custom contrasts as in this appendix, the inclusion of the other level of the grazing factor may influence the residual (variation within levels of grazing)

Some exploration suggests that the last of those reasons is the main ‘culprit’. For example, consider what happens when we index the response and explanatory objects to only include two levels of grazing and do the PERMANOVA on that reduced dataset:

`g2 <- Grazing %in% c("Yes_No", "Yes_Yes")`

set.seed(42)

`adonis2(Oak1[g2, ] ~ Grazing[g2], method = "bray")`

`# F = 0.58; P = 0.943`

`g2 <- Grazing %in% c("Yes_No", "No_No")`

`set.seed(42)`

adonis2(Oak1[g2, ] ~ Grazing[g2], method = "bray")

`# F = 1.07; P = 0.371`

`g2 <- Grazing %in% c("No_No", "Yes_Yes")`

`set.seed(42)`

adonis2(Oak1[g2, ] ~ Grazing[g2], method = "bray")

`# F = 3.18; P = 0.001`

Putting this together in a table of P-values:

No_No | Yes_No | |

Yes_No | 0.371 | |

Yes_Yes | 0.001 | 0.943 |

What we did here – subsetting the data – is equivalent to what was done in `pairwise.adonis2()`

in the ‘Complex Models‘ chapter. However, these differences do not necessarily mean that one of these approaches is better than the other – there is some debate among statisticians about which approach is preferred.

Also, the more detailed approach that I outline in this appendix has some advantages:

- It draws on all of the data and hence has a more accurate estimate of the within-group variability.
- It is necessary if you want to make other planned comparisons such as comparing two levels of a factor against another level of that factor (see ‘Custom Contrasts’ below).
- Individual contrasts can be easily included in models that also include other terms.

# Custom Contrasts

Pairwise contrasts are by no means the only contrasts that are possible. Other contrasts may be suggested by the objectives of your study. For example, you may want to evaluate whether groups of levels differ from one another.

In the context of this example, a reasonable contrast would be to compare stands that were grazed historically (‘Yes_No’ and ‘Yes_Yes’) with those that were not (‘No_No’). To do so, we follow the same procedure outlined above for designating our own pairwise contrasts.

- Create a duplicate of the original factor.

`Grazing.custom <- Grazing`

- Designate our own contrasts for this term. Here, we want to compare historically grazed and ungrazed stands, so our desired contrast is (1, -1, -1). No coefficients are 0 as we are not excluding any levels from this contrast. Apply this set of contrasts to the model matrix.

`Grazing.custom.mm <- model.matrix(~ C(Grazing.custom, c(1, -1, -1)))`

- Use indexing to create a separate object for each contrast.

`con.NNvYN_YY <- Grazing.custom.mm[, 2]`

`con.NNvYN_YY.rest <- Grazing.custom.mm[, 3]`

As before, the last contrast here is one that we may not be interested in but need to include for completeness. - Replace the grazing term in
`adonis2()`

with our new contrasts, with the ‘focal’ contrast as the first one in the model.

`set.seed(42)`

adonis2(Oak1 ~ con.NNvYN_YY + con.NNvYN_YY.rest, method = "bray")

```
Permutation test for adonis under reduced model
Terms added sequentially (first to last)
Permutation: free
Number of permutations: 999
adonis2(formula = Oak1 ~ con.NNvYN_YY + con.NNvYN_YY.rest, method = "bray")
Df SumOfSqs R2 F Pr(>F)
con.NNvYN_YY 1 0.7693 0.06635 3.1677 0.001 ***
con.NNvYN_YY.rest 1 0.1402 0.01209 0.5773 0.943
Residual 44 10.6854 0.92156
Total 46 11.5949 1.00000
---
Signif. codes:
0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
```

Again, we have fully partitioned the df and variance associated with the grazing term into that associated with our contrast of interest and that associated with other levels of the term. This contrast indicates that past grazing status has a significant effect on composition.

You may notice that the question that we asked here is identical to asking about the past grazing status as defined in the oak plant community dataset. Indeed, what we did here is identical to using those factors in the PERMANOVA:

`set.seed(42)`

`adonis2(Oak1 ~ GrazPast + GrazCurr, data = Oak, method = "bray")`

```
Permutation test for adonis under reduced model
Terms added sequentially (first to last)
Permutation: free
Number of permutations: 999
adonis2(formula = Oak1 ~ GrazPast + GrazCurr, data = Oak, method = "bray")
Df SumOfSqs R2 F Pr(>F)
GrazPast 1 0.7693 0.06635 3.1677 0.001 ***
GrazCurr 1 0.1402 0.01209 0.5773 0.943
Residual 44 10.6854 0.92156
Total 46 11.5949 1.00000
---
Signif. codes:
0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
```

Our custom model matrix produces exactly the same output as when we analyze GrazPast and GrazCurr. **This is not unique to PERMANOVA – is is part of the logic behind all linear models in R.**

Verify that, as with the other contrasts above, the order of these terms can dramatically affect the conclusions. Again, **this is not unique to PERMANOVA – it is a consequence of model structure and the fact that these data are unbalanced. The same issues apply to any linear model in R.**

# References

Crawley, M.J. 2012. *The R book*. Second edition (First edition, 2007). Wiley, West Sussex, England.