Group Comparisons

# 19 MRPP

Learning Objectives

To understand the theory behind MRPP.

To apply MRPP manually and through R.

Key Packages

`require(vegan)`

# Theory

Multi-Response Permutation Procedure (MRPP) tests whether there is a significant difference between two or more groups of sampling units.  It was first used to analyze the spatial distribution of archeological artifacts (Berry et al. 1980, 1983) and applied to ecological questions by Biondini et al. (1985 (Appendix), 1988).  More recent ecological examples include Altrichter et al. (2018), Bates & Davies (2018), and Phillips & Swanson (2018).

MRPP focuses on the distances among sample units within each group.  The test statistic (delta: $\delta$) is the mean distance among sample units within groups.  The key idea is simple: if groups differ, the mean within-group dissimilarity should be smaller than the mean within-group dissimilarity of randomly assembled groups of the same sizes.

Generally, the mean distances are weighted by sample size, so that groups that contain more information (i.e., more sample units) are given more importance in the calculation.  McCune & Grace (2002, Table 24.1) outline several other potential weighting methods.  In my experience, these methods are not commonly used.
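The sample-size weighting described above can be written compactly.  The notation here is introduced only for illustration and is an assumption, not taken from the cited sources: $g$ is the number of groups, $n_i$ the number of sample units in group $i$, $N$ the total number of sample units, and $\bar{d}_i$ the mean distance among sample units within group $i$.

$\delta = \sum_{i=1}^{g} C_i \bar{d}_i, \qquad C_i = \frac{n_i}{N}$

With a balanced design every $C_i$ equals $1/g$, so $\delta$ reduces to the simple average of the group means.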

A blocked MRPP (MRBP) is also available, which allows you to test the significance of one factor while controlling for the effect of another.  However, MRPP is a simple metric and is not appropriate for complex designs.

Key Takeaways

MRPP tests whether the mean within-group dissimilarity is smaller than expected.

The test statistic, $\delta$, is simply the mean dissimilarity between pairs of sample units from the same group.  It can be 0 or positive.  Statistical significance focuses on the left tail of the distribution obtained from permutations of the grouping factor.

An effect size, A, is also calculated; it relates the observed $\delta$ to the value expected if the grouping factor were uninformative.

MRPP cannot accommodate complex designs.

# Basic Procedure

The basic procedure for MRPP is as follows.

1. Convert the data matrix to a dissimilarity matrix.
2. Calculate the mean distance within each group.
3. Calculate the test statistic (the observed $\delta$) as the average of the mean distance within each group, weighting by sample size if necessary.
4. Assess statistical significance via a permutation test:
• Shuffle the group identities.
• Recalculate $\delta$.  Save this value.
• Re-shuffle and recalculate the specified number of times.  The permutations produce a sampling distribution of $\delta$.
5. Calculate the P-value as the proportion of permutations that yielded a $\delta$ equal to or smaller than the observed $\delta$ (do you understand why we are interested in smaller values?).

MRPP also calculates an effect size, A, which is defined as the chance-corrected within-group agreement.  A is calculated as:

$A = 1 - \frac{\delta}{m_{\delta}} = 1 - \frac{\text{observed } \delta}{\text{expected } \delta}$

where $m_{\delta}$ is the random expectation (i.e., the value of $\delta$ if plots were randomly assigned to groups).  A equals 1 when all items within each group are identical and 0 when groups contain as much heterogeneity as would be expected by chance; it is negative when groups are more heterogeneous than expected by chance.  McCune & Grace (2002, p. 191) note that small A values (< 0.1) are common in community ecology.
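The procedure and the effect size can be sketched in a few lines of base R.  This is a minimal illustration rather than the code that `mrpp()` itself runs: the helper name `mrpp_by_hand()`, the simulated `toy` data, and the seed are all hypothetical.  It assumes sample-size weighting, at least two sample units per group, and counts the observed arrangement as one of the permutations when computing the P-value.

```r
# Hypothetical helper: MRPP by hand for a 'dist' object and a grouping vector.
# Assumes every group has at least two sample units.
mrpp_by_hand <- function(d, groups, n_perm = 999) {
  d_mat  <- as.matrix(d)
  groups <- as.factor(groups)

  # Steps 2-3: mean distance within each group, then the weighted average
  delta_for <- function(g) {
    within_means <- sapply(levels(g), function(lev) {
      idx <- which(g == lev)
      sub <- d_mat[idx, idx, drop = FALSE]
      mean(sub[lower.tri(sub)])
    })
    n <- as.vector(table(g))          # same order as levels(g)
    sum(within_means * n / sum(n))    # weights n_i / N
  }

  observed <- delta_for(groups)

  # Step 4: shuffle the group identities and recalculate delta each time
  permuted <- replicate(n_perm, delta_for(sample(groups)))

  # Step 5: left-tail P-value; the observed arrangement counts as one permutation
  p_value <- (sum(permuted <= observed + 1e-12) + 1) / (n_perm + 1)

  expected <- mean(permuted)
  list(delta = observed, E.delta = expected,
       A = 1 - observed / expected, p = p_value)
}

# Simulated two-group data (assumption: purely illustrative values)
set.seed(42)
toy <- rbind(matrix(rnorm(20, mean = 0), ncol = 2),
             matrix(rnorm(20, mean = 3), ncol = 2))
toy_groups <- rep(c("A", "B"), each = 10)

# Step 1 is the call to dist(); the rest happens inside the helper
toy_res <- mrpp_by_hand(dist(toy), toy_groups, n_perm = 999)
toy_res
```

Because the permutations are random, the expected delta, A, and the P-value will vary slightly from run to run unless the seed is fixed.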

# Simple Example, Worked By Hand

Here’s the distance matrix from our simple example:
`Resp.dist <- dist(perm.eg[ , c("Resp1", "Resp2")])`

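|       | Plot1 | Plot2 | Plot3 | Plot4 | Plot5 |
|-------|-------|-------|-------|-------|-------|
| Plot2 | **2.828** | | | | |
| Plot3 | **4.123** | **2.236** | | | |
| Plot4 | 11.314 | 11.662 | 9.849 | | |
| Plot5 | 9.849 | 9.220 | 7.071 | **4.123** | |
| Plot6 | 12.207 | 12.042 | 10.000 | **2.236** | **3.162** |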

Recall that plots 1-3 are in group A and plots 4-6 are in group B.  The distances between plots in the same group are in bold.  This is the opposite of how I summarized it for other tests, but I do so here because MRPP relies only on the within-group dissimilarities for the calculation of its test statistic.

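Calculation of the mean within-group distances is simple arithmetic (the results shown were computed from unrounded distances, so the last digit can differ slightly from what the rounded values give):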

• Group A: $(2.828 + 4.123 + 2.236) / 3 = 3.063$
• Group B: $(4.123 + 2.236 + 3.162) / 3 = 3.174$

Since our dataset is balanced, both groups receive equal weight and the average within-group distance across groups is $(3.063 + 3.174) / 2 = 3.118$.  This is the observed delta.

The expected delta is approximately equal to the mean distance averaged across the entire distance matrix – in this case, 7.461 (can you calculate this?).  Therefore, the chance-corrected within-group agreement (A) is approximately:

$A = 1 - \frac{3.118}{7.461} = 0.582$

During the actual implementation of this analysis, the expected delta is calculated as the mean of the $\delta$ values calculated during the permutations.
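The hand calculation can be checked in base R.  The sketch below re-enters the distances from the matrix above rather than rebuilding them from `perm.eg`, so it works from values rounded to three decimals; the object names other than `Resp.dist` are hypothetical.

```r
# Distances copied from the matrix above (rounded to three decimals),
# entered column by column for the lower triangle.
plots <- paste0("Plot", 1:6)
d_mat <- matrix(0, nrow = 6, ncol = 6, dimnames = list(plots, plots))
d_mat[lower.tri(d_mat)] <- c(
  2.828, 4.123, 11.314, 9.849, 12.207,  # Plot1 vs Plots 2-6
  2.236, 11.662, 9.220, 12.042,         # Plot2 vs Plots 3-6
  9.849, 7.071, 10.000,                 # Plot3 vs Plots 4-6
  4.123, 2.236,                         # Plot4 vs Plots 5-6
  3.162                                 # Plot5 vs Plot6
)
Resp.dist <- as.dist(d_mat)
group <- rep(c("A", "B"), each = 3)     # plots 1-3 in A, plots 4-6 in B

# Mean distance among the plots indexed by idx
full <- as.matrix(Resp.dist)
mean_within <- function(idx) {
  sub <- full[idx, idx]
  mean(sub[lower.tri(sub)])
}

delta_A <- mean_within(which(group == "A"))
delta_B <- mean_within(which(group == "B"))

# Balanced design, so the observed delta is the simple average
observed_delta <- mean(c(delta_A, delta_B))

# Approximate expected delta: mean of all 15 pairwise distances
approx_expected <- mean(as.vector(Resp.dist))

A <- 1 - observed_delta / approx_expected
round(c(delta = observed_delta, expected = approx_expected, A = A), 3)
```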

# Implementation in R (`vegan::mrpp()`)

In R, MRPP can be conducted using the `mrpp()` function in the `vegan` package.  Its usage is:

```
mrpp(
  dat,
  grouping,
  permutations = 999,
  distance = "euclidean",
  weight.type = 1,
  strata = NULL,
  parallel = getOption("mc.cores")
)
```

The arguments are:

• `dat` – data matrix, data frame, or dissimilarity matrix
• `grouping` – factor or numeric index for grouping observations
• `permutations` – number of permutations to conduct to assess the significance of the test statistic, or a list of permutation instructions obtained using the `how()` function.  This latter option is necessary with complex designs where permutations need to be restricted – we’ll discuss this in a few days.  Default is to conduct 999 permutations without any restrictions to the permutations.
• `distance` – Default is Euclidean, though any distance measure available in `vegdist()` can be specified.  This is only used if `dat` is not a dissimilarity matrix. Note that `mrpp()` decides whether it needs to calculate the dissimilarity matrix based on the class of the object identified with the dat argument.  If the object is not of class ‘dist’, `mrpp()` calls `vegdist()` and converts the object to a dissimilarity matrix using the distance measure specified with the distance argument.
• `weight.type` – This is equivalent to the methods for weighting groups as outlined in Table 24.1 of McCune & Grace (2002).  The default is method 1, which weights each group by its sample size.
• `strata` – integer vector or factor identifying strata that permutations are to be restricted within.  Used for blocked MRPP (MRBP).  I’m not sure whether this involves median alignment within blocks as discussed by McCune & Grace (2002, p. 194).
• `parallel` – option for parallel processing

The resulting object includes several components:

• `A` – chance-corrected within-group agreement.
• `delta` – observed delta; weighted mean within-group distance from the actual data.
• `E.delta` – expected delta; mean of the deltas obtained from the permutations.
• `Pvalue` – significance of delta; proportion of permutations with a delta as small as or smaller than the observed `delta`.

## Simple Example, in R

To analyze this simple example:

```
simple.results.mrpp <- mrpp(
  dat = Resp.dist,
  grouping = perm.eg$Group,
  permutations = 999
)

simple.results.mrpp
```

```
Call:
mrpp(dat = Resp.dist, grouping = perm.eg$Group, permutations = 999)

Dissimilarity index: euclidean
Weights for groups:  n

Class means and counts:

      A     B
delta 3.063 3.174
n     3     3

Chance corrected within-group agreement A: 0.5821
Based on observed delta 3.118 and expected delta 7.461

Significance of delta: 0.1
Permutation: free
Number of permutations: 719
```
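With only six plots, the P-value can be worked out exactly by listing every way of assigning three plots to each group.  The sketch below does this in base R from the rounded distances shown earlier; the object names are hypothetical.  It suggests that 0.1 is the smallest P-value this design can produce, since only the observed split and its mirror image (the same split with the group labels swapped) give a delta this small.  The 719 permutations reported above, rather than the 999 requested, are consistent with complete enumeration of the 6! = 720 possible orderings minus the observed one, though that reading of the output is an inference rather than something stated in it.

```r
# Rounded distances from the simple example, lower triangle by column
d_mat <- matrix(0, nrow = 6, ncol = 6)
d_mat[lower.tri(d_mat)] <- c(
  2.828, 4.123, 11.314, 9.849, 12.207,
  2.236, 11.662, 9.220, 12.042,
  9.849, 7.071, 10.000,
  4.123, 2.236,
  3.162
)
d_mat <- as.matrix(as.dist(d_mat))   # fill in the upper triangle

# Mean distance among the plots indexed by idx
mean_within <- function(idx) {
  sub <- d_mat[idx, idx]
  mean(sub[lower.tri(sub)])
}

# Delta for one assignment: 'a' holds the plots in group A, the rest form B
delta_for_split <- function(a) {
  b <- setdiff(1:6, a)
  mean(c(mean_within(a), mean_within(b)))
}

splits     <- combn(6, 3)                       # all 20 possible group A's
all_deltas <- apply(splits, 2, delta_for_split)
observed   <- delta_for_split(1:3)

# Exact left-tail P-value (small tolerance guards against rounding ties)
p_exact <- mean(all_deltas <= observed + 1e-9)

# Exact expected delta over all assignments
expected_exact <- mean(all_deltas)

c(p = p_exact, expected = expected_exact)
```

For this balanced design the exact expected delta equals the mean of all 15 pairwise distances, which is why the approximation used in the hand calculation matches the value that `mrpp()` reports.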

## Grazing Example, in R

To analyze our grazing example:

```
grazing.results.mrpp <- mrpp(
  dat = Oak1.dist,
  grouping = grazing,
  permutations = 999
)

grazing.results.mrpp
```

```
Call:
mrpp(dat = Oak1.dist, grouping = grazing, permutations = 999)

Dissimilarity index: bray
Weights for groups:  n

Class means and counts:

      No     Yes
delta 0.6892 0.6877
n     30     17

Chance corrected within-group agreement A: 0.0183
Based on observed delta 0.6887 and expected delta 0.7015

Significance of delta: 0.001
Permutation: free
Number of permutations: 999
```

Although grazing is statistically significant, the chance-corrected within-group agreement (A) is very low, suggesting that this is not a strong effect.  We may want to consider, for example, whether it is biologically meaningful.
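Because this design is unbalanced (30 ungrazed plots, 17 grazed), it also shows the sample-size weighting at work.  The sketch below recovers the observed delta and A from the class means, counts, and expected delta printed in the output; it uses those rounded values, so it approximates rather than exactly repeats the internal calculation.  The object names are hypothetical.

```r
# Class means and counts copied from the printed MRPP output
class_delta <- c(No = 0.6892, Yes = 0.6877)
class_n     <- c(No = 30, Yes = 17)

# Weight each group by its share of the sample units (n_i / N)
group_weights  <- class_n / sum(class_n)
observed_delta <- sum(group_weights * class_delta)

# Expected delta as printed in the output
expected_delta <- 0.7015

# Chance-corrected within-group agreement
A <- 1 - observed_delta / expected_delta

round(c(delta = observed_delta, A = A), 4)
```

The observed delta sits closer to the mean for ungrazed plots because that group contributes nearly two-thirds of the weight.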

As we saw with ANOSIM and Mantel tests, the object containing the results of this analysis includes more information than is displayed on the screen.

`str(grazing.results.mrpp)`

For example, you can plot the distribution of deltas obtained from the permutations and see how the observed $\delta$ compares.
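A plot of this kind might look like the sketch below.  Since `Oak1.dist` and `grazing` are not defined in this chapter, it substitutes the `dune` and `dune.env` datasets that ship with `vegan`, grouped by `Management`; that substitution, the seed, and the object name `dune.results.mrpp` are assumptions made only so the example runs on its own.  The permuted deltas are stored in the `boot.deltas` component of the result.

```r
library(vegan)

# Stand-in data (assumption): vegan's built-in dune meadow dataset
data(dune)
data(dune.env)

set.seed(19)
dune.results.mrpp <- mrpp(
  dat = vegdist(dune, method = "bray"),
  grouping = dune.env$Management,
  permutations = 999
)

# Histogram of the permuted deltas, with the x-axis widened so that
# the observed delta is always visible
hist(dune.results.mrpp$boot.deltas,
     xlim = range(c(dune.results.mrpp$boot.deltas, dune.results.mrpp$delta)),
     main = "Permuted deltas (dune example)",
     xlab = "delta")

# Dashed vertical line marks the observed delta
abline(v = dune.results.mrpp$delta, lty = 2)
```

The same calls should work for the grazing analysis by replacing `dune.results.mrpp` with `grazing.results.mrpp`.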

# References

Altrichter, K.M., E.S. DeKeyser, B. Kobiela, and C.L.M. Hargiss. 2018. A comparison of five wetland communities in a North Dakota fen complex. Natural Areas Journal 38:275-286.

Bates, J.D., and K.W. Davies. 2018. Characteristics of intact Wyoming big sagebrush associations in southeastern Oregon. Rangeland Ecology and Management 72:36-46.

Berry, K.J., K.L. Kvamme, and P.W. Mielke Jr. 1980. A permutation technique for the spatial analysis of the distribution of artifacts into classes. American Antiquity 45:55-59.

Berry, K.J., K.L. Kvamme, and P.W. Mielke Jr. 1983. Improvements in the permutation test for the spatial analysis of the distribution of artifacts into classes. American Antiquity 48:547-553.

Biondini, M.E., C.D. Bonham, and E.F. Redente. 1985. Secondary successional patterns in a sagebrush (Artemisia tridentata) community as they relate to soil disturbance and soil biological activity. Vegetatio 60:25-36.

Biondini, M.E., P.W. Mielke Jr., and K.J. Berry. 1988. Data-dependent permutation techniques for the analysis of ecological data. Vegetatio 75:161-168.

McCune, B., and J.B. Grace. 2002. Analysis of ecological communities. MjM Software Design, Gleneden Beach, OR.

Phillips, P., and B.J. Swanson. 2018. A genetic analysis of dragonfly population structure. Ecology and Evolution 8:7206-7215.