Group Comparisons

24 Complex Models

Learning Objectives

To consider ways to analyze a factor with multiple levels via pairwise contrasts.

To consider ways to simultaneously analyze multiple explanatory variables.

To understand the consequences of specifying different types of Sums of Squares.

To introduce custom functions.

Key Packages

require(vegan, RVAideMemoire)



We have focused thus far on techniques to compare a priori groups.  However, recall that ANOVA is a specific instance of a linear model in which the explanatory variable is categorical.  Linear models can be used to compare groups (categorical variables) and to regress one continuously distributed variable on another.  The same generalization applies to techniques like PERMANOVA – it can be used to analyze any linear model.  This is why PERMANOVA is sometimes referred to more generally as a ‘distance-based linear model’ (DISTLM) or as distance-based redundancy analysis (dbRDA) (Legendre & Anderson 1999; McArdle & Anderson 2001).

Most of the concepts outlined in this chapter also apply to other techniques that can handle multiple factors: the generalized ANOSIM test, Mantel test, and RRPP.

Multi-Factor Models

Since PERMANOVA partitions variation to determine how much is associated with different terms (explanatory variables), one of its key strengths is that it can be applied to complex designs such as those with multiple factors and/or covariates.

Furthermore, PERMANOVA can test whether terms have additive effects (think of the main effects in an ANOVA) or interactive effects (think of the two-way terms in an ANOVA: the effect of one factor depends on the level of another factor).  Interactive effects can occur because of differences in the size and/or the direction of the effect.

When testing complex models with interaction terms, interpretation should be done in a logical manner.  Anderson (2015) recommends:

  • Fit the full model, including main effects and interaction term(s).
  • Examine whether the interaction is significant.  If it is significant, test levels of one factor within levels of the interacting factor.
  • If the interaction is not significant, examine whether the main effects are significant.

The adonis2() function can handle both ANOVA and regression models.  Each variable is treated as categorical or continuous based on its class.  Therefore, it is helpful to always check that the degrees of freedom in a model output correspond to what you expected.  A variable that is categorical will have 1 fewer df than it has levels, whereas a variable that is continuous will only have 1 df.

Model Formulation

As we will see with contrasts below, it is important to identify the correct model formula when dealing with complex designs.  This is true not just for adonis2() but also for standard statistical functions in R (e.g., lm(), aov()).

Here are some basic rules for building formulae:

  • Formulae are arranged such that the dependent variable or matrix is on the left-hand side (LHS) and the explanatory variable(s) are on the right-hand side (RHS):
    Dependent variable or matrix ~ Explanatory variable(s)
  • The explanatory variables can be categorical (as in ANOVA) and/or continuous (as in regression).  R uses the class of the object to determine whether it is categorical or continuous.  The possibility of having both categorical and continuous explanatory variables in the same analysis is why adonis2() requires data to be in a data frame rather than a matrix.
  • Explanatory variables can be arranged together succinctly using a variety of operators.  These permit the creation of complex ANOVA tables without having to type all of the terms out.  Some examples are shown in the below table.
Example Note about operator Equivalent expanded formula
A effect of single term A
A+B effect of terms independently A + B
A:B interaction between terms A:B
A*B terms are crossed A + B + A:B
(A+B+C)^2 ^’ indicates the maximum order of the desired interactions. A + B + C + A:B + B:C + A:C
(A+B+C)^2 – A:B ’ removes the terms specified after it. A + B + C + B:C + A:C
A + B %in% A %in%’ indicates that the term on its left is nested within those on its right. A + A:B

For more information about building formulae, see ?formula.

Evaluating Complex Designs

Recall that a PERMANOVA applied to a single variable with Euclidean distances is identical to a conventional ANOVA (or linear model) of that variable.  This suggests a three-step process to verify that a design is being analyzed correctly:

  1. Analyze a univariate response conventionally
  2. Duplicate the conventional analysis using PERMANOVA with Euclidean distances
  3. Replace the univariate response with your multivariate response and desired distance measure

We will illustrate this process by repeating our test of the effect of grazing.  These principles are also illustrated in the ‘Restricting Permutations‘ chapter.

Step 1: Analyze a Univariate Response Conventionally

Begin with a standard univariate analysis.

This analysis could involve a univariate response whose characteristics you or your statistical advisor are most comfortable with.  Perhaps one that satisfies the assumption of normality?

This analysis could involve any analytical function or statistical platform.   In R, this might entail fitting a linear model via lm() or an ANOVA via aov().  However, this conventional analysis could also be conducted in a different software package. 

Conduct the analysis and ensure that the resulting ANOVA table is correct:

  • Degrees of freedom are appropriate for each term
  • All necessary terms and interactions are specified
  • F-statistic is being calculated with the correct error term.  One way to identify the correct denominator is to consider the expected mean squares (EMS) associated with each term.  Each term has an EMS that is a linear combination of one or more of the terms in the model.  The correct denominator for a term is the term whose EMS includes all linear combinations except that of the term itself.  Another way to state this is that the denominator should be the term whose EMS would equal the numerator if the null hypothesis were true (Anderson et al. 2008).


For example, imagine that we were not confident that our approach for analyzing the plant compositional response to grazing was correct.  We can start by using aov() with a univariate response such as species richness:

summary(aov(Oak$SppRich ~ grazing, data = Oak))

            Df Sum Sq Mean Sq F value  Pr(>F)   
grazing      1    704   704.0   9.964 0.00285 **
Residuals   45   3180    70.7                   
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

We confirm the details of this analysis:

  • Grazing has two levels and is represented by 1 df
  • No other terms are required
  • The F-statistic is calculated by dividing the MS for grazing by the MS for the residual (704.0 / 70.7 = 9.96).

Step 2: Analyze the Univariate Response in PERMANOVA with a Euclidean Distance Matrix

Once we are confident that the experimental design has been correctly incorporated into the conventional analysis, repeat the analysis in PERMANOVA.  Use the same univariate response variable and set of explanatory variables, and specify the Euclidean distance measure.

If the PERMANOVA approach is set up correctly, the resulting ANOVA table will be identical to that produced in step 1:

  • Same set of terms, with same df and SS for each
  • Same F-statistic for relevant terms, other than rounding differences.
  • If the PERMANOVA analysis is formulated correctly, the resulting ANOVA table will be exactly the same as that produced above, except for rounding errors.  If a different term was used as the error term for a factor, the associated pseudo-F statistic would differ from what was calculated in the conventional software.
  • P-values will be permutation-based and thus not exactly the same – though in my experience they tend not to vary that much from the parametric values.


We can analyze the effect of grazing on species richness in PERMANOVA, with Euclidean distances.

adonis2(Oak$SppRich ~ grazing, data = Oak, method = “euclidean”)

Permutation test for adonis under reduced model
Terms added sequentially (first to last)
Permutation: free
Number of permutations: 999

adonis2(formula = Oak$SppRich ~ grazing, data = Oak, method = "euclidean")
         Df SumOfSqs      R2     F Pr(>F)   
grazing   1    704.0 0.18128 9.964  0.002 **
Residual 45   3179.6 0.81872                
Total    46   3883.6 1.00000                
Signif. codes:  
0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Here, the PERMANOVA table and the earlier ANOVA table have identical values for the df, SS, and F-statistic.


Key Takeaways

If a model is formulated correctly, a PERMANOVA applied to a univariate response variable with Euclidean distances will yield exactly the same statistics as a parametric linear model.

Step 3: Substitute Desired Response and Distance Measure in PERMANOVA

Once we’re confident that the analysis is coded correctly in PERMANOVA, simply replace the univariate response variable and Euclidean distance measure with the multivariate set of responses and the distance measure appropriate for your hypothesis.  The analysis structure will thus be applied identically to the analysis of this more complex response.


In this case, we analyze the effect of grazing status on species composition as expressed by Bray-Curtis dissimilarities:

adonis2(Oak1 ~ grazing, data = Oak, method = "bray")

Permutation test for adonis under reduced model
Terms added sequentially (first to last)
Permutation: free
Number of permutations: 999

adonis2(formula = Oak1 ~ grazing, data = Oak, method = "bray")
         Df SumOfSqs      R2      F Pr(>F)   
grazing   1   0.6491 0.05598 2.6684  0.002 **
Residual 45  10.9458 0.94402                 
Total    46  11.5949 1.00000                 
Signif. codes:  
0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Here, the result is identical to what we saw in the ‘PERMANOVA’ chapter.  Community composition differs as a function of grazing status.


Testing the permutation-based analysis against a conventional analytical function provides reassurance that the former is functioning as intended.

Choice of Sums of Squares

Analysis is more involved when dealing with a complex design than with a single factor.  For complex designs, there are several ways to partition variance.  Anderson et al. (2008, p.69-73) provide a nice summary of these types, which are briefly summarized in the following table and in Figure 1.41.

Type Name Description Does Order matter? Partitioning by =
I Sequential Test terms in order listed Yes SS of terms sums to total SS terms
II Conditional Test term after accounting for all other terms except those that contain the term No SS of terms may not sum to total SS
III Partial Test term after accounting for all other terms No SS of terms may not sum to total SS margin

When your data are balanced, it matters little which type of SS you use – the results will be identical in most cases.  However, when your data are unbalanced, you have to decide which type of SS is most appropriate based on the questions being explored (Anderson et al. 2008).

Type I or Sequential SS

Type I or Sequential SS are the default in R – both for standard functions like aov() and for permutation-based techniques like adonis2().  With Sequential SS, each term is fitted after taking into account previous terms in the model.  This means that the order of terms in the formula can affect the results.  For example, Figure 1.41a shows a model in which term A is fitted before term B, while Figure 1.41b shows the opposite – term B fitted before term A.  The same amount of variation is explained, but it’s attributed differently.  The second term can only explain variation that has not been explained by the first term.

Sequential (Type I) SS may be acceptable or even desirable under some circumstances.  For example, it is common to fit covariates first in an ANCOVA, and then to test whether the other term(s) explain significant amounts of the remaining variation.  Anderson (2015) notes that the inclusion of a covariate automatically means that the SS from the different terms are not independent, even if your dataset is balanced.

Statisticians generally prefer this type of SS as it fully partitions the variation in a form that is easily understood.

Type II or Conditional SS

Type II or Conditional SS is not described here as this option is not available in adonis2().

Type III or Partial SS

Type III or Partial SS is common in some situations.  Each term is fitted after taking into account all other terms in the model, regardless of their position.  Thus, the order of the terms does not matter.  By accounting for all other terms in the model, Partial (Type III) SS is the most conservative approach – if you detect a significant effect, you would also detect it with any other type of SS, but not vice versa.

However, this approach can result in incomplete partitioning of the variance.  Consider two explanatory variables that are significantly related to a response when tested individually but that also are closely related to one another.  When tested together, it is possible for neither variable to be identified as significant because a sizable amount of the variation could be attributed to either variable and is therefore ignored in this model (this is represented by the white areas within the Venn diagram in Figure 1.41d).  Also, the variation that is ignored is also not reported – you can only detect this by checking whether the SS of the terms in the ANOVA table sum to SST.

The current version of this formulation in vegan has the surprising action that, if applied to a model with main effects and interaction terms, ignores the main effects and only reports the interaction terms!

Anderson et al. (2008) note that many ecologists and editors expect this approach to be used since it is most conservative.  However, in my experience relatively few ecologists realize that their analysis using aov(), for example, is not using this type of SS!

Key Takeaways

Deciding which type of sums of squares (SS) to use is important when:

  • Testing multiple explanatory variables
  • Data are unbalanced

Grazing Example

We have focused thus far on the effect of current grazing, but our dataset also includes information about whether plots were grazed in the past.  These two factors (GrazCurr and GrazPast) are highly unbalanced.  Test the effects of these two factors, evaluating how the order of terms and type of SS (I or III) affect the conclusions.  Compare the results by evaluating whether the SS explained by the terms sum to the total SS, and by the broad patterns in statistical significance.  How do the results compare?

Model Type of SS Results
GrazCurr + GrazPast I
GrazPast + GrazCurr I
GrazCurr + GrazPast III
GrazPast + GrazCurr III

For which questions might each model be appropriate?

Model Selection

When choosing among competing complex models, it is helpful to have clear criteria to guide decisions.  This is an area of active research, and I think it is likely to only become more prevalent as our computing capabilities and the complexity of our analyses increase (e.g., Mitchell et al. 2017).  More complex models often explain more of the variance, but may only incrementally improve model fit.

One way to compare models is to consider how parsimonious they are – whether the improved model fit is ‘worth it’ given the increased complexity.  This can be based on Akaike’s Information Criterion (AIC) and variations thereof.

Indices like AIC are not calculated automatically for the models produced from adonis2(), but can be calculated by hand.  Smaller values indicate more parsimonious models.  The actual magnitude of these indices does not matter; what is important is how they compare among different models applied to the same response data.  Models with AIC values that differ by < 2 units are generally considered to be equally supported.

Type Formula Notes
AIC [latex]N \log \frac{SSR}{N} + 2p[/latex] Overly liberal (chooses more complex models)?
AICc [latex]N \log \frac{SSR}{N} + 2p \frac{N}{N - p -1}[/latex] Corrects AIC for small sample sizes
BIC [latex]N \log \frac{SSR}{N} + \log (N) p[/latex] Overly conservative (chooses less complex models)?

Formula interpretation:

  • [latex]SSR[/latex] = Sum of squares in the residual.
  • [latex]N[/latex] = sample size.
  • [latex]p[/latex] = number of parameters in model.  Number of variables plus 1 for the intercept.

These formulae are from Anderson (2015).

Contrasts (Pairwise and otherwise)

When a factor has only two levels, comparisons among those levels are trivial.  The situation becomes more interesting when more than two levels are involved.  We will illustrate this using the past and current grazing statuses of each stand.  Also, we will focus here on pairwise contrasts; see Appendix 4 for a discussion of other types of contrasts.

A More Detailed Grazing Example

The oak plant community dataset contains factors indicating whether stands were grazed in the past and/or grazed at the time the data were collected (GrazPast and GrazCurr, respectively).  We’ll combine these factors together into a new Grazing factor (capitalized):

Grazing <- factor(with(Oak, paste(GrazPast, GrazCurr, sep = "_")))


No_No Yes_No Yes_Yes
24      6      17

Since each of the original factors is binary, there are four possible levels here.  However, there are no stands that were not grazed in the past but were being grazed at the time the data were collected (i.e., the ‘No_Yes’ combination is absent).


Let’s test the overall effect of Grazing on vegetation composition, as measured in Oak1:

Graz.res2 <- adonis2(Oak1 ~ Grazing, method = "bray")

View ANOVA table (Graz.res2):

Permutation test for adonis under reduced model
Terms added sequentially (first to last)
Permutation: free
Number of permutations: 999

adonis2(formula = Oak1 ~ Grazing, method = "bray")
         Df SumOfSqs      R2      F Pr(>F)   
Grazing   2   0.9095 0.07844 1.8725  0.005 **
Residual 44  10.6854 0.92156                 
Total    46  11.5949 1.00000                 
Signif. codes:  
0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Yes, there is a relationship between Grazing and composition.  This means that at least two levels of grazing have different compositions, but doesn’t tell us whether some or all of the levels differ from one another.  To determine this, we need to use pairwise contrasts.


Pairwise contrasts are planned comparisons between pairs of levels of a factor.  If f  is the number of levels of a factor, the number of pairs of levels is f(f-1) / 2 – the same formula that we used to determine the number of distances in a distance matrix though I’ve used different symbology here.   In this case, there are 3 x (3-1) / 2 = 3 pairs of levels:

  • No_Yes vs. Yes_No
  • No_Yes vs. Yes_Yes
  • Yes_No vs. Yes_Yes


The current version of adonis2() includes an argument to do one degree of freedom contrasts for a factor:

adonis2(Oak1 ~ Grazing, method = "bray", by = "onedf")

Permutation test for adonis under reduced model
Sequential test for contrasts
Permutation: free
Number of permutations: 999

adonis2(formula = Oak1 ~ Grazing, method = "bray", by = "onedf")
               Df SumOfSqs      R2      F Pr(>F)    
GrazingYes_No   1   0.1361 0.01174 0.5604  0.958    
GrazingYes_Yes  1   0.7734 0.06670 3.1846  0.001 ***
Residual       44  10.6854 0.92156                  
Total          46  11.5949 1.00000                  
Signif. codes:  
0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

The Grazing factor in the ANOVA table has been replaced by two rows.  Each of these rows is a contrast accounting for one df and explaining some of the SS.  Together, these rows account for the df and SS that were previously accounted for by Grazing – refer to above results to satisfy yourself that this is the case.   The first level in our factor (No_No) was set as the baseline and each of the other levels was compared against it – that’s why the names of the other levels are written after the factor name.  For example, the row labeled ‘GrazingYes_No‘ is the contrast between No_No and Yes_No, and indicates that these two levels do not differ in composition.

However, this result is not entirely satisfying for several reasons:

  • It only shows one set of contrasts – in this case, the contrast between the Yes_No and Yes_Yes treatments is not tested.
  • It shows the contrasts in one particular order.  Recall that adonis2() uses Type I or sequential sums of squares, which means that the first term explains as much of the variation as it can but the second can only explain the variation that was not explained by the first term.  In other words, switching the order of the contrasts may change their apparent importance.

We can do better if we control the contrasts ourselves.  Although we could code our own contrasts (and do so in Appendix 4), here I modify a function from the RVAideMemoire package.


The usage of this function is:

test = c("Pillai", "Wilks", "Hotelling-Lawley", "Roy", "Spherical"),
nperm = 999,
progress = TRUE,
p.method = "fdr",


This function relates a response (resp) to a grouping factor (fact).  It includes numerous options that are not relevant for our purposes – for example, it can do pairwise comparisons for parametric MANOVAs.

The actual code of a function can be seen by calling the function without the parentheses.  I display a piece of the code from pairwise.perm.manova() here, excerpting it to the sections that apply if the response (resp) is a distance matrix:

if ("dist" %in% class(resp)) {
fun.p <- function(i, j) {
fact2 <- droplevels(fact[as.numeric(fact) %in% c(i, j)])
resp2 <- as.matrix(resp)
rows <- which(fact %in% levels(fact2))
resp2 <- as.dist(resp2[rows, rows])
vegan::adonis(resp2 ~ fact2, permutations = nperm)$[1, Pr(>F)"]
## JDB - additional lines omitted
multcomp <- pairwise.table(fun.p, levels(fact), p.adjust.method = p.method)
## JDB - additional lines omitted
method <- "permutation MANOVAs on a distance matrix"

Each pair of levels needs to be compared in some way.  The bulk of the code in this excerpt is the fun.p() function, which:

  • Subsets both the response and grouping factor to a pair of levels. The levels of the grouping factor are replaced by numbers which can then be paired by indexing them as i and j (this happens internally so we do not need to define them ourselves).
  • Uses adonis() to test for differences between the remaining pair of levels
  • Returns the p-value of this test

After creating this function, the second-last line applies it within an existing function, stats::pairwise.table(), to create a table of pairwise comparisons among the levels of the grouping factor (fact).

Customizing the Code: Creating the pairwise.adonis2() function

I used the above excerpt of code as a starting point for a new function focused on our needs.

In writing this code, I assumed that:

  • resp is a distance matrix
  • unless otherwise specified, we do not want to make corrections for multiple testing. (If this is desired, the options are described at ?p.adjust.methods).

I also replaced adonis(), which is no longer supported, with adonis2().  Here’s the function:

pairwise.adonis2 <- function(resp, fact, p.method = "none", nperm = 999) {
  resp <- as.matrix(resp)
  fact <- factor(fact)
  fun.p <- function(i, j) {
    fact2 <- droplevels(fact[as.numeric(fact) %in% c(i, j)])
    index <- which(fact %in% levels(fact2))
    resp2 <- as.dist(resp[index, index])
    result <- adonis2(resp2 ~ fact2, permutations = nperm)
  multcomp <- pairwise.table(fun.p, levels(fact), p.adjust.method = p.method)
  return(list(fact = levels(fact), p.value = multcomp, p.adjust.method = p.method))


Functions have a standard form.  Some notes about this form, using the above function as an example:

  • Defined using the function() function.
  • The object name that function() is assigned to is the name of the function that you call to execute it. This function is called pairwise.adonis2().
  • The arguments within the parentheses of function() are the elements that can be adjusted when the function is executed. If a default value is desired or you wish to choose from a set of options, those can be specified here.  For example, I set the default number of permutations to nperm = 999.
  • The start and end of the function are designated by squiggly brackets ( { } ) – they may encompass many lines of code. Furthermore, these can be nested within one another, and thus it is typical to indent the code to reflect which set of brackets it is part of.  In RStudio scripts, this indenting happens automatically.
  • Arguments are referenced within the function’s code – everywhere that an argument’s name is referenced, it will be replaced with the value that is specified when running the code. In the above code, I’ve highlighted in red text the places within the function where the arguments are used.

One caveat: this function is hard-coded for a one-way ANOVA … no other explanatory variables are included in the adonis2() function.  Can you see how you would adjust it if you needed to include another explanatory variable?


This function is available in the GitHub repository as pairwise.adonis2.R.  You will need to download and save it into your ‘functions’ sub-folder, and then load it into R’s memory before you can apply it.  To apply it to our example:


pairwise.adonis2(resp = vegdist(Oak1), fact = Grazing)

The output is a list, including the p-values for each pairwise comparison:

No_No Yes_No
Yes_No 0.359
Yes_Yes 0.001 0.932

Our levels are the past and current grazing status, in that order.  So, these comparisons indicate that never grazed (No_No) stands differ from always grazed stands (Yes_Yes), but that those that are no longer grazed (Yes_No) do not differ from either of the other combinations.

Other contrasts are also possible.  For example, in the ‘Custom Contrasts’ section of Appendix 4, I demonstrate how these three levels of Grazing can be used to compare stands that were grazed historically (Yes_No and Yes_Yes) with stands that were not (No_No).


Techniques such as PERMANOVA are versatile and can simultaneously test the effects of multiple explanatory models.  They can also be extended to answer specific questions, such as contrasts between pairs of levels of a factor.

When data are unbalanced, the type of sums of squares can dramatically affect the conclusions of an analysis.  This is equally true for conventional univariate analyses and for permutation-based multivariate analyses.


Anderson, M.J. 2015. Workshop on multivariate analysis of complex experimental designs using the PERMANOVA+ add-on to PRIMER v7. PRIMER-E Ltd, Plymouth Marine Laboratory, Plymouth, UK.

Anderson, M.J., R.N. Gorley, and K.R. Clarke. 2008. PERMANOVA+ for PRIMER: guide to software and statistical methods. PRIMER-E Ltd, Plymouth Marine Laboratory, Plymouth, UK. 214 p.

Legendre, P., and M.J. Anderson. 1999. Distance-based redundancy analysis: testing multispecies responses in multifactorial ecological experiments. Ecological Monographs 69:1-24.

McArdle, B.H., and M.J. Anderson. 2001. Fitting multivariate models to community data: a comment on distance-based redundancy analysis. Ecology 82:290-297.

Media Attributions



Icon for the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License

Applied Multivariate Statistics in R Copyright © 2024 by Jonathan D. Bakker is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.

Share This Book