Dune meadow vegetation data

This data set contains abundances of 30 plant species measured at 20 sites in a dune area in the Netherlands. Details can be found in Jongman et al. (1995). The abundance values are on a 1-9 scale. Five explanatory variables were measured at each site:

- Thickness of the A1 horizon (in mm).
- Moisture content of the soil (on a 0-4 scale).
- Quantity of manuring (on a 0-4 scale).
- Agricultural use (with the three classes haypasture, hayfield and pasture).
- Management regime (with the four classes standard farming (SF), bio-dynamical farming (BF), hobby farming (HF) and nature management (NM)).

The data are available in the file dune1.xls. Agricultural use and management regime are nominal variables and need to be converted into so-called dummy variables (see dune2.xls). The data in dune2.xls can be imported into Brodgar. To perform a CCA (or a partial CCA, RDA or partial RDA) on these data, the classes pasture and NM need to be de-selected during the data import process in Brodgar. This avoids multicollinearity: the dummy variables of a nominal variable always sum to one, so one class per variable is redundant.
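The dummy coding described above can be sketched in a few lines of Python. This is only an illustration and not how Brodgar or dune2.xls does it: the four site records are made up, and `dummy_code` is a hypothetical helper. Only the class labels and the choice of pasture and NM as the dropped classes come from the text.

```python
# Made-up site records for illustration; only the class labels come from the text.
sites = [
    {"use": "hayfield", "management": "SF"},
    {"use": "pasture", "management": "NM"},
    {"use": "haypasture", "management": "BF"},
    {"use": "hayfield", "management": "HF"},
]


def dummy_code(records, variable, classes, reference):
    """Return one 0/1 column per class, leaving out the reference class.

    Hypothetical helper: dropping the reference class removes the redundant
    column that would otherwise cause multicollinearity.
    """
    kept = [c for c in classes if c != reference]
    return {c: [1 if r[variable] == c else 0 for r in records] for c in kept}


# Drop "pasture" and "NM", as recommended in the text.
use_dummies = dummy_code(sites, "use", ["haypasture", "hayfield", "pasture"], "pasture")
mgmt_dummies = dummy_code(sites, "management", ["SF", "BF", "HF", "NM"], "NM")

print(use_dummies)
print(mgmt_dummies)
```

A site in the dropped class (here the second record) is coded as zero in every remaining column of that variable, so no information is lost.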

In this example, we show how variance partitioning can be used to identify how much of the variance in the species data is (i) purely due to management effects, (ii) purely related to the soil variables (A1 and moisture), and (iii) related to neither of these two groups.

Variance partitioning

Variance partitioning is best explained in terms of linear regression. Let y be a vector containing the response variable, and let the matrices X and W contain two groups of explanatory variables. Variance partitioning allows one to model the relationship between y and X while controlling for the effects of W. Technically, variance partitioning works as follows:

- Apply a multiple regression of y against X and W together.
- Apply a multiple regression of y against X.
- Apply a multiple regression of y against W.
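The three regressions above each yield an R2 value, and the usual bookkeeping for turning them into components can be written out as follows. The subscript notation is an assumption introduced here for illustration (it does not appear in the original text): the subscript names the explanatory variables used in the regression.

```latex
\begin{aligned}
\text{pure } X \text{ effect} &= R^2_{X+W} - R^2_{W} \\
\text{pure } W \text{ effect} &= R^2_{X+W} - R^2_{X} \\
\text{shared effect} &= R^2_{X} + R^2_{W} - R^2_{X+W} \\
\text{residual variation} &= 1 - R^2_{X+W}
\end{aligned}
```

The four components add up to 1, i.e. to the total variation in y.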

Using the R2 of each regression analysis, the pure X effect, the pure W effect, the shared effect and the amount of residual variation can be determined. See Legendre and Legendre (1998) for more details. Borcard et al. (1992) applied this to multivariate data and used CCA instead of linear regression. Their approach leads to the following sequence of steps:

  1. Apply a CCA on Y (which is now a matrix!) against X and W together.

  2. Apply a CCA on Y against X.

  3. Apply a CCA on Y against W.

  4. Apply a CCA on Y against X, using W as covariates (partial CCA).

  5. Apply a CCA on Y against W, using X as covariates (partial CCA).

Using the sum of the canonical eigenvalues of each CCA (the equivalent of R2 in regression), the pure X effect, the pure W effect, the shared information and the residual variation can all be expressed as a percentage of the total inertia (variation).
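As a minimal sketch of this bookkeeping, the function below turns the eigenvalue sums of the five analyses into components and percentages. The function name, argument names and the numbers in the usage example are all made up for illustration. The built-in check assumes that each partial analysis agrees, up to rounding, with the difference between the joint analysis and the corresponding single-group analysis.

```python
import math


def partition_inertia(total, joint, x_only, w_only, x_given_w, w_given_x):
    """Split total inertia into pure X, pure W, shared and residual parts.

    Arguments are the total inertia and the sums of canonical eigenvalues
    from steps 1 to 5 (hypothetical argument names).
    """
    # Assumed consistency check: partial results should match the differences
    # between the joint analysis and the single-group analyses.
    assert math.isclose(x_given_w, joint - w_only, abs_tol=0.01)
    assert math.isclose(w_given_x, joint - x_only, abs_tol=0.01)

    parts = {
        "pure X": x_given_w,           # step 4
        "pure W": w_given_x,           # step 5
        "shared": x_only - x_given_w,  # step 2 minus step 4
        "residual": total - joint,     # total inertia minus step 1
    }
    # Return (variance, percentage of total inertia) for each component.
    return {name: (value, 100.0 * value / total) for name, value in parts.items()}


# Made-up eigenvalue sums, for illustration only.
result = partition_inertia(
    total=3.0, joint=1.5, x_only=0.9, w_only=0.8, x_given_w=0.7, w_given_x=0.6
)
for name, (value, percent) in result.items():
    print(f"{name:9s} {value:5.2f} {percent:5.1f}%")
```

Taking the pure effects directly from the partial analyses, rather than from differences, mirrors the five-step sequence above; the assertions only guard against inconsistent input.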

Results for dune meadow data

Results of the five CCA and partial CCA analyses are given in Table 1. The explained variances of the five models can be used to decompose the total variance into a pure soil effect (A1 and moisture), a pure management effect (SF, BF, HF), a shared component and the residual information; see Table 2. The results indicate that 22% of the variation in the species data is purely due to the management variables, and 19% is purely due to the soil variables. A further 6% is shared by both groups and cannot be attributed to either one, and 53% of the variation in the plant species data is not related to either management or soil variables.

 

Table 1. Results of the CCA and partial CCA analyses for the dune meadow data. Soil variables are A1 and moisture. Management variables are the nominal variables SF, BF and HF. Total inertia is 2.16. Percentages are obtained by dividing the explained variance by the total inertia.

Step  Explanatory variables                Explained variance  %
1     Soil and management                  1.00                46
2     Soil                                 0.53                25
3     Management regime                    0.59                27
4     Soil with management as covariable   0.41                19
5     Management with soil as covariable   0.47                22

 

Table 2. Variance decomposition table showing the effects of management and soil variables for the dune meadow data.

Component  Source           Calculation  Variance  %
a          Pure management  Step 5       0.47      22
b          Pure soil        Step 4       0.41      19
c          Shared           0.59-0.47    0.12      6
d          Residual         2.16-1.00    1.16      53

Total                                    2.16      100
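As a cross-check on Table 2, the arithmetic can be redone from the rounded values in Table 1. The component letters a to d follow Table 2; nothing else is assumed.

```latex
\begin{aligned}
c &= 0.59 - 0.47 = 0.53 - 0.41 = 0.12 \\
a + b + c &= 0.47 + 0.41 + 0.12 = 1.00 \quad \text{(step 1)} \\
a + b + c + d &= 1.00 + 1.16 = 2.16 \quad \text{(total inertia)} \\
d / 2.16 &= 1.16 / 2.16 \approx 0.537
\end{aligned}
```

The shared component comes out the same whether it is computed from the management analyses (steps 3 and 5) or from the soil analyses (steps 2 and 4). With these rounded inputs the residual share is about 53.7%, which Table 2 reports as 53; the rounded percentages in the table then sum to 100.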

 

Other applications

In this example, the matrix X contained the soil variables and W the management variables. Using variance partitioning, the pure X effect, the pure W effect, the shared effect and the residual variation were determined. The same approach can be applied if sampling takes place in different time periods (W then contains nominal time factors). This allows one to analyse short multivariate time series. Other scenarios include bat abundance data collected with different nets and by different observers, and fish caught by different boats. Other common applications involve spatial variables.

 

References

Borcard, D., Legendre, P. and P. Drapeau. 1992. Partialling out the spatial component of ecological variation. Ecology 73: 1045-1055.

Jongman, R.H.G., Ter Braak, C.J.F. and O.F.R. van Tongeren. 1995. Data analysis in community and landscape ecology. Cambridge University Press, Cambridge.

Legendre, P., and L. Legendre. 1998. Numerical ecology. Second edition. Elsevier, Amsterdam, The Netherlands.

Ter Braak, C.J.F. 1994. Canonical community ordination. Part I: Basic theory and linear methods. Ecoscience 2: 127-140.

Ter Braak, C.J.F. and P.F.M. Verdonschot. 1995. Canonical correspondence analysis and related multivariate methods in aquatic ecology. Aquatic Sciences 57/3.

 
