Group Comparisons
27 Comparison of Techniques
Learning Objectives
To consider the advantages and disadvantages of multivariate statistical techniques for comparing groups.
Readings
Walters & Coen (2006)
Key Packages
library(vegan)
library(RRPP)
Introduction
In this chapter, I want to summarize the advantages and disadvantages of the techniques that we’ve discussed in this section of the course. These opinions are admittedly biased, primarily from the perspective of community ecology. See McCune & Grace (2002), Kenkel (2006), Walters & Coen (2006), Ramette (2007), and Anderson & Walsh (2013) for more information.
Please note that there are other techniques that we have not discussed here (e.g., Wang et al. 2012).
Summary of Techniques
All of these techniques are permutation-based. This means that we do not have to assume (multivariate) normality.
The ability to assess significance is a function of sample size, though the number of permutations is not a limiting factor if you have a reasonable sample size. Knowing how to control permutations can ensure that statistical tests are repeatable, producing the same P-value each time they are run.
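As a minimal sketch of repeatability, assuming the dune data that ships with vegan (an illustrative choice, not a dataset from this chapter), setting the random seed before a permutation test should reproduce the same permutations and therefore the same P-value:

```r
library(vegan)

# Example data shipped with vegan (an assumption for illustration)
data(dune)      # 20 sites x 30 species
data(dune.env)  # matching environmental variables

# Set the seed so that the same permutations are drawn each time
set.seed(42)
fit1 <- adonis2(dune ~ Management, data = dune.env, permutations = 999)

set.seed(42)
fit2 <- adonis2(dune ~ Management, data = dune.env, permutations = 999)

# Compare the P-values from the two runs
identical(fit1$`Pr(>F)`, fit2$`Pr(>F)`)
```

Without the calls to set.seed(), the two P-values would usually differ slightly from run to run.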
Some of these techniques are only useful for simple designs, while others can handle complex designs. In the latter cases, it is often important to know how to restrict permutations to reflect your study design. Existing functions to do so assume that designs are balanced.
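Restricting permutations can be sketched with how() from the permute package, which is loaded with vegan. The blocking factor below is hypothetical (the dune data have no blocks) and is only there to show the mechanics:

```r
library(vegan)  # also attaches the permute package

data(dune)
data(dune.env)

# Hypothetical blocking factor: 4 blocks of 5 sites each
block <- gl(4, 5)

# Restrict permutations so that sites are only shuffled within blocks
ctrl <- how(nperm = 999, blocks = block)

# Draw one restricted permutation of the 20 site indices
set.seed(1)
perm <- shuffle(nrow(dune), control = ctrl)

# Check that every site stays in its original block after shuffling
all(block[perm] == block)

# The same control object can be passed to a permutation test
fit <- adonis2(dune ~ Management, data = dune.env, permutations = ctrl)
```

Other designs (e.g., plots nested within whole plots) need different arguments to how(); this sketch covers only the simplest case of a blocked design.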
ANOSIM
ANOSIM (vegan::anosim()) is a nonparametric technique based on ranks. It is conceptually similar to a nonparametric correlation test and thus is a special form of the more general Mantel test.
Somerfield et al. (2021b) argued that ANOSIM results are more robust than those of other tests: since this technique is based on rank distances, a monotonic transformation of the dissimilarity matrix will not alter the ranks of the dissimilarities themselves. For example, the conclusions would be the same if based on Euclidean distances or on squared Euclidean distances (squaring non-negative values is a monotonic transformation, so the ranks are unchanged).
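The rank invariance can be checked directly. This sketch assumes the dune data from vegan and uses squaring as the monotonic transformation. Because squaring does not change the rank order of non-negative distances, the ANOSIM statistic should be the same for both matrices:

```r
library(vegan)

data(dune)
data(dune.env)

# Euclidean distances and a monotonic transformation of them (squaring)
d  <- dist(dune)
d2 <- as.dist(as.matrix(d)^2)

set.seed(42)
a1 <- anosim(d, dune.env$Management, permutations = 999)
set.seed(42)
a2 <- anosim(d2, dune.env$Management, permutations = 999)

# Compare the ANOSIM R statistics from the two distance matrices
all.equal(a1$statistic, a2$statistic)
```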
ANOSIM is commonly used for simple designs. Recent efforts have generalized this method to more complex designs with up to three factors, though it cannot distinguish main effects and interactions among factors (Somerfield et al. 2021a, b, c). These generalizations are not available in the current formulation of this technique in R, so they would have to be manually coded by the user.
Mantel Tests
Mantel tests (vegan::mantel()) assess the linear correlation between two distance matrices derived from different variables on the same sample units. One of the distance matrices can be based on a grouping factor to compare among levels of that factor. However, the two distance matrices can also be produced from continuous variables, allowing a regression-type approach.
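Both uses can be sketched with the dune data from vegan (an illustrative assumption). The grouping factor is turned into a distance matrix coded 0 for pairs of sample units in the same group and 1 for pairs in different groups, which is one common way to do this. The continuous example uses A1, the soil horizon thickness in dune.env.

```r
library(vegan)

data(dune)
data(dune.env)

# Response distance matrix
resp.dist <- vegdist(dune, method = "bray")

# Grouping factor as a distance matrix: 0 = same group, 1 = different groups
g <- as.character(dune.env$Management)
grp.dist <- as.dist(outer(g, g, "!=") * 1)

set.seed(42)
m.grp <- mantel(resp.dist, grp.dist, method = "pearson", permutations = 999)

# Continuous explanatory variable: distances in A1 (soil horizon thickness)
a1.dist <- dist(dune.env$A1)

set.seed(42)
m.cont <- mantel(resp.dist, a1.dist, method = "pearson", permutations = 999)

# Mantel correlations for the two tests
c(group = m.grp$statistic, continuous = m.cont$statistic)
```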
Mantel tests can also be used to control for one variable or set of variables while testing for the effect of another variable or set of variables. For example, you could correlate your response distance matrix against the spatial distances among sample units, and then test whether the residuals from that analysis relate to a distance matrix calculated from an explanatory variable or set of variables.
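In vegan, one way to do this is a partial Mantel test, mantel.partial(), which calculates the correlation between two distance matrices while controlling for a third. The sketch below assumes the varespec and varechem data from vegan. These data have no spatial coordinates, so pH is used purely as a stand-in for the variable being controlled.

```r
library(vegan)

data(varespec)
data(varechem)

# Response: Bray-Curtis dissimilarities among sample units
veg.dist <- vegdist(varespec, method = "bray")

# Explanatory: Euclidean distances based on standardized N, P, and K
nut.dist <- dist(scale(varechem[, c("N", "P", "K")]))

# Variable to control for (stand-in for spatial distances)
ph.dist <- dist(varechem$pH)

set.seed(42)
pm <- mantel.partial(veg.dist, nut.dist, ph.dist,
                     method = "pearson", permutations = 999)

# Partial Mantel correlation and its permutation-based P-value
c(statistic = pm$statistic, P = pm$signif)
```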
Some authors strongly discourage the use of Mantel tests (e.g., Guillot & Rousset 2013), though solutions have been proposed recently (e.g., Lisboa et al. 2014; Crabot et al. 2019; Somers & Jackson 2022) that claim to address the issues with this type of test. The jury is still out on the utility of these solutions.
MRPP
MRPP (vegan::mrpp()) uses a simple test statistic and is only applicable for simple designs.
PERMANOVA
PERMANOVA (vegan::adonis2()) is conceptually very similar to ANOVA and linear regression. It can be applied to both metric distances (e.g., Euclidean) and semimetric dissimilarities (e.g., Bray-Curtis). It analyzes the distances themselves, regardless of the characteristics of those distances. Since it is based on the distance matrix, it can be applied identically to univariate and multivariate data.
How much variation a dataset contains is in part a function of which distance measure is used to summarize the variation among sample units. This means that PERMANOVA can lead to different conclusions depending on which distance measure is used to summarize the data matrix. However, it can quantify the importance of the main effects of factors and of interactions between factors (Somerfield et al. 2021b).
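The dependence on the distance measure can be illustrated by fitting the same model twice. This sketch assumes the dune data from vegan and compares Bray-Curtis and Euclidean distances, which are arbitrary choices for illustration:

```r
library(vegan)

data(dune)
data(dune.env)

# Same model and permutations, two different distance measures
set.seed(42)
fit.bray <- adonis2(dune ~ Management, data = dune.env,
                    method = "bray", permutations = 999)
set.seed(42)
fit.euc <- adonis2(dune ~ Management, data = dune.env,
                   method = "euclidean", permutations = 999)

# Proportion of variation explained by Management under each measure
c(bray = fit.bray$R2[1], euclidean = fit.euc$R2[1])
```

The two fits partition different total amounts of variation, so the R2 and pseudo-F values, and potentially the conclusions, need not agree.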
PERMANOVA can be applied to a wide range of complex models. In the version of PERMANOVA available in R, some aspects of analyses need to be manually coded. For example, when analyzing the whole plot aspect of a split-plot design, we demonstrated how to apply a set of permutations and to use the correct error term when calculating the pseudo-F statistic.
PERMDISP
PERMDISP (vegan::betadisper()) is different from the other techniques we’ve covered: it simply characterizes the distance between each sample unit and the centroid of the group to which it belongs. These distances can then be analyzed to help understand whether differences among groups are due to location of the centroids, dispersion around those centroids, or both. Knowing whether dispersion varies among groups can help us interpret statistical results and their ecological significance.
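A minimal sketch of this workflow, assuming the dune data from vegan and Bray-Curtis dissimilarities (both illustrative choices):

```r
library(vegan)

data(dune)
data(dune.env)

d <- vegdist(dune, method = "bray")

# Distance from each sample unit to the centroid of its group
bd <- betadisper(d, dune.env$Management)
head(bd$distances)

# Mean distance to centroid (dispersion) within each group
tapply(bd$distances, dune.env$Management, mean)

# Permutation test of whether dispersion differs among groups
set.seed(42)
permutest(bd, permutations = 999)
```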
RRPP
RRPP (RRPP::lm.rrpp()) is similar to PERMANOVA but explicitly designed for metric distances. It is computationally complex but permits use of many existing functions for downstream applications. RRPP is a linear model and can be applied to a wide range of complex models.
RRPP automatically includes an adjustment that causes semimetric distances to ‘behave’ as if they were metric; see the RRPP chapter for an example of this. It’s unclear to me how much this adjustment might alter the conclusions of an analysis. At a minimum, if this adjustment is made in the analysis, it should also be applied to the distance matrix before it is subjected to ordination, etc.
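As a rough sketch of the linear model interface, the following uses the Pupfish example data that ship with the RRPP package (an assumption for illustration; these are shape data rather than community data, so distances are Euclidean and no adjustment is involved). The model terms are arbitrary.

```r
library(RRPP)

# Example data shipped with RRPP: shape coordinates, Sex, and Pop
data(Pupfish)

# Fit a linear model with two factors and their interaction,
# assessing significance by randomizing residuals
fit <- lm.rrpp(coords ~ Sex * Pop, data = Pupfish,
               iter = 999, print.progress = FALSE)

# ANOVA table for the model terms
anova(fit)
```

The fitted object can then be passed to other functions in the package for downstream analyses.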
Comparisons of Techniques
Walters & Coen (2006) provide a nice ‘real world’ comparison of the utility of these techniques for assessing compositional differences between treatments. They compared two of the techniques we covered, ANOSIM and PERMANOVA, with a classic multivariate analysis of variance (MANOVA) and with ECOSIM, a ‘null model analysis of co-occurrence’. All techniques were used to analyze the same dataset evaluating whether composition differed between constructed and natural reefs. Overall, the authors concluded that PERMANOVA was the most sensitive: other techniques did not detect differences between constructed and natural reefs. They highlighted several strengths of PERMANOVA, including negligible assumptions, flexibility to handle complex designs, and ability to conduct statistical tests even when sample sizes are small.
Anderson & Walsh (2013) compared the sensitivity of ANOSIM, Mantel tests and PERMANOVA to heterogeneity of dispersion. ANOSIM was particularly sensitive to heterogeneity in dispersion, followed by Mantel tests. PERMANOVA was unaffected by heterogeneity in dispersion if the design was balanced but was affected by it if the design was unbalanced; in that case, the direction of the change depended on whether greater dispersion occurred in the smaller or the larger group. Overall, PERMANOVA was more powerful at detecting changes in community structure. They concluded that the combination of PERMANOVA and PERMDISP was particularly helpful for distinguishing location and dispersion effects in balanced designs, though reliability when applied to unbalanced designs can be poor.
Paliy & Shankar (2016) provide a detailed overview of a wide range of techniques from the perspective of microbial ecology. They include ANOSIM, PERMANOVA and Mantel tests, along with a suite of ordination and other techniques. However, they don’t do any simulations or provide assessments of how well the techniques perform.
Somerfield et al. (2021b) argue that ANOSIM and PERMANOVA are complementary techniques: ANOSIM can provide an overall test of a factor’s importance (robust to transformations of the distances, for example), whereas PERMANOVA can provide insight into elements such as the main effect of a factor and its interactions with other factors, but is sensitive to the details of which distance measure is used.
I haven’t seen any papers that have compared RRPP with other techniques, in part because it is so new. In an appendix to their article describing RRPP, Collyer & Adams (2018) compare it to PERMANOVA as coded in the adonis() and adonis2() functions from vegan. They conclude that there is considerable overlap between them and that they yield identical analytical results. However, there are more downstream functions available for the output of RRPP than of PERMANOVA.
Advantages and Disadvantages of Techniques
This table summarizes the advantages and disadvantages of the techniques that we’ve covered, along with the parametric ANOVA/MANOVA approach.
Method  Advantages  Disadvantages 
ANOVA / MANOVA 


Mantel tests 


ANOSIM 


MRPP 


PERMANOVA 


PERMDISP 


RRPP 


Conclusions
In these, as in all of the other multivariate techniques that we will discuss this quarter, it is very important that you use appropriate data adjustments (removal of rare species, transformations, and, especially, relativizations) and an appropriate distance measure. All of these choices need to be clearly noted when describing an analysis.
Furthermore, we often apply several techniques to the same data. For example, we might use PERMANOVA to test differences among groups, PERMDISP to test for differences in dispersion, and PCoA or another ordination technique to visualize those differences. If you are applying several techniques to the same data, it is essential that the same data adjustments and distance measure be used in all cases. Otherwise, the different techniques (e.g., analysis and visualization) are not using the same data!
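One way to guarantee this consistency is to apply the data adjustments and calculate the distance matrix once, and then pass that same object to every technique. The sketch below assumes the dune data from vegan, a relativization by site total, and Bray-Curtis dissimilarities, all of which are illustrative choices:

```r
library(vegan)

data(dune)
data(dune.env)

# Apply data adjustments once (here, relativize by site total)
dune.rel <- decostand(dune, method = "total")

# Calculate the distance matrix once
d <- vegdist(dune.rel, method = "bray")

# Use the same distance matrix for the test of location ...
set.seed(42)
perm.fit <- adonis2(d ~ Management, data = dune.env, permutations = 999)

# ... for the assessment of dispersion ...
disp.fit <- betadisper(d, dune.env$Management)

# ... and for the ordination (PCoA) used to visualize the groups
pcoa <- cmdscale(d, k = 2, eig = TRUE)
head(pcoa$points)
```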
Regardless of which technique we’ve used, a significant effect by itself does not allow us to say how the treatments differ. It does not make sense to describe one group as having ‘larger’ or ‘smaller’ values than another when they are being compared with respect to multiple responses … can you see why this is so? This is different from how univariate analyses are often interpreted.
There are several ways to understand a significant group comparison test:
 PERMDISP helps distinguish differences in location from differences in dispersion
 Techniques that visualize the data (e.g., ordinations) can give valuable insights into how the treatments differ
 Follow-up tests can explore whether individual response variables differ among the groups. For compositional data, these techniques include SIMPER, Indicator Species Analysis, and TITAN.
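Of these follow-up techniques, SIMPER is available in vegan as simper(); the others require additional packages and are not shown here. A minimal sketch, assuming the dune data from vegan:

```r
library(vegan)

data(dune)
data(dune.env)

# Contribution of each species to the dissimilarity between each pair of groups
# (permutations = 0 skips the permutation tests of individual species)
sim <- simper(dune, dune.env$Management, permutations = 0)

# One set of results per pairwise comparison of groups
names(sim)

# Species ordered by their contribution for each comparison
summary(sim)
```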
References
Anderson, M.J., and D.C.I. Walsh. 2013. PERMANOVA, ANOSIM, and the Mantel test in the face of heterogeneous dispersions: what null hypothesis are you testing? Ecological Monographs 83:557-574.
Collyer, M.L., and D.C. Adams. 2018. RRPP: an R package for fitting linear models to high-dimensional data using residual randomization. Methods in Ecology and Evolution 9:1772-1779.
Crabot, J., S. Clappe, S. Dray, and T. Datry. 2019. Testing the Mantel statistic with a spatially-constrained permutation procedure. Methods in Ecology and Evolution 10:532-540.
Gotelli, N.J., and A.M. Ellison. 2004. A primer of ecological statistics. Sinauer, Sunderland, MA.
Guillot, G., and F. Rousset. 2013. Dismantling the Mantel tests. Methods in Ecology and Evolution 4:336-344.
Lisboa, F.J.G., P.R. Peres-Neto, G.M. Chaer, E.d.C. Jesus, R.J. Mitchell, S.J. Chapman, and R.L.L. Berbara. 2014. Much beyond Mantel: bringing Procrustes Association Metric to the plant and soil ecologist’s toolbox. PLoS ONE 9(6):e101238.
Kenkel, N.C. 2006. On selecting an appropriate multivariate analysis. Canadian Journal of Plant Science 86:663-676.
McCune, B., and J.B. Grace. 2002. Analysis of ecological communities. MjM Software Design, Gleneden Beach, OR.
Paliy, O., and V. Shankar. 2016. Application of multivariate statistical techniques in microbial ecology. Molecular Ecology 25:1032-1057.
Ramette, A. 2007. Multivariate analyses in microbial ecology. FEMS Microbiology Ecology 62:142-160.
Somerfield, P.J., K.R. Clarke, and R.N. Gorley. 2021a. A generalized analysis of similarities (ANOSIM) statistic for designs with ordered factors. Austral Ecology 46:901-910.
Somerfield, P.J., K.R. Clarke, and R.N. Gorley. 2021b. Analysis of similarities (ANOSIM) for 2-way layouts using a generalized ANOSIM statistic, with comparative notes on Permutational Multivariate Analysis of Variance (PERMANOVA). Austral Ecology 46:911-926.
Somerfield, P.J., K.R. Clarke, and R.N. Gorley. 2021c. Analysis of similarities (ANOSIM) for 3-way designs. Austral Ecology 46:927-941.
Somers, K.M., and D.A. Jackson. 2022. Putting the Mantel test back together again. Ecology 103:e3780.
Walters, K., and L.D. Coen. 2006. A comparison of statistical approaches to analyzing community convergence between natural and constructed oyster reefs. Journal of Experimental Marine Biology and Ecology 330:81-95.
Wang, Y., U. Naumann, S.T. Wright, and D.I. Warton. 2012. mvabund – an R package for model-based analysis of multivariate abundance data. Methods in Ecology and Evolution 3(3):471-474. https://doi.org/10.1111/j.2041-210X.2012.00190.x