Ecologists study complex systems, and often need to use non-standard methods of sampling and data analysis. The data might be collected over a long time scale, involve little spatial replication, or be highly aggregated in space. There have been many fruitful collaborations between ecologists and statisticians, often leading to the development of new statistical methods. In this brief overview of the subject, I will focus on three areas that have been of particular interest in the management of animal populations. I will also discuss the use of statistical methods in other areas of ecology, the aim being to highlight interesting areas of development rather than to provide a comprehensive review.
Mark-recapture methods
Mark-recapture methods are commonly used to estimate abundance and survival rates of animal populations (Lebreton et al 1992, Williams et al 2002). Typically, a number of individuals are physically captured, marked and released. The information obtained from successive capture occasions is summarised in a "capture history", which indicates whether or not an individual was captured on the different occasions. The likelihood is specified in terms of demographic parameters of interest, such as annual survival probabilities, and nuisance parameters that model the capture process. A range of goodness-of-fit diagnostics have been developed, including estimation of overdispersion (Anderson et al 1994). Overdispersion usually arises as a consequence of heterogeneity, or lack of independence, amongst individuals in the survival and/or capture probabilities; attempts have also been made to model such heterogeneity directly (Pledger et al 2003). Model selection often involves use of Akaike's information criterion (AIC), and model-averaging is also commonly used (Johnson and Omland 2004). Bayesian methods are becoming popular, particularly as a means of fitting hierarchical models (Brooks et al 2000). Recent developments include the use of genotyping of fecal, hair or skin samples to identify individuals (Lukacs and Burnham 2005, Wright et al 2009), and spatially-explicit models that allow estimation of population density (Borchers and Efford 2008). A related area of recent interest has been the estimation of the occupancy rate, i.e. the proportion of a set of geographical locations that are occupied by a species (MacKenzie et al 2006). This can be of interest in large-scale monitoring programs, for which estimation of abundance is too costly, and in understanding metapopulation dynamics. In this setting, the "individuals" are locations and the "capture history" records whether or not a species was observed at that location, on each of several occasions.
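As an illustration of how a capture history enters the likelihood, the following minimal sketch computes the probability of a history under a Cormack-Jolly-Seber-type model. It assumes a single survival probability phi and a single capture probability p, both constant over time and across individuals (so no heterogeneity), and it conditions on the first capture. The function names are hypothetical and are not taken from any particular package.

```python
from math import log


def prob_never_seen_again(last, n_occasions, phi, p):
    """Probability of not being captured after occasion `last` (0-based).

    Built backwards from the final occasion: an animal is either dead
    (1 - phi) or alive but missed (phi * (1 - p)) and then never seen again.
    """
    value = 1.0  # nothing can be observed after the final occasion
    for _ in range(n_occasions - 1 - last):
        value = (1.0 - phi) + phi * (1.0 - p) * value
    return value


def history_probability(history, phi, p):
    """Probability of a 0/1 capture history, conditional on first capture."""
    captured = [t for t, seen in enumerate(history) if seen]
    first, last = captured[0], captured[-1]
    prob = 1.0
    # Between first and last capture the animal is known to be alive.
    for t in range(first + 1, last + 1):
        prob *= phi * (p if history[t] else 1.0 - p)
    return prob * prob_never_seen_again(last, len(history), phi, p)


def log_likelihood(histories, phi, p):
    """Log-likelihood for independent individuals (assumed, see lead-in)."""
    return sum(log(history_probability(h, phi, p)) for h in histories)


# Hypothetical data: three individuals observed over three occasions.
example_histories = [(1, 0, 1), (1, 1, 0), (0, 1, 1)]
example_loglik = log_likelihood(example_histories, phi=0.8, p=0.5)
```

In practice the log-likelihood would be maximised over phi and p, and time- or covariate-dependent parameters would replace the constants assumed here.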
Distance sampling
A common alternative method for estimating population abundance or density is distance sampling. This involves recording the distance of each observed individual from a transect line or a point. The analysis then involves estimation of the probability of detection of an individual as a function of distance (Buckland et al 2004), thereby allowing estimation of the number of individuals that have not been detected. Two important assumptions in using this method are that detection is certain for an individual on the line or point and that individuals do not move during the observation process, although modifications have been suggested for situations in which these assumptions are not met (Borchers et al 1998, Buckland and Turnock 1992). Compared to the use of mark-recapture methods for estimating abundance, distance sampling typically provides savings in terms of field effort, and will usually be more appropriate when the population is widely dispersed. A useful discussion of the theory underlying use of distance sampling is given by Fewster and Buckland (2004), while Schwarz and Seber (1999) provide an extensive review of methods for estimating abundance.
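The idea of estimating density from a fitted detection function can be sketched for a line transect survey as follows. The half-normal detection function g(x) = exp(-x^2 / (2 sigma^2)) is an assumption made here for illustration, as are untruncated perpendicular distances and certain detection on the line; the function name and the data are hypothetical.

```python
from math import pi, sqrt


def halfnormal_line_transect_density(distances, total_line_length):
    """Estimate density from perpendicular distances on line transects.

    Assumes a half-normal detection function with g(0) = 1 and no
    truncation. Distances and line length must be in the same units,
    so the result is individuals per squared unit.
    """
    n = len(distances)
    # Maximum likelihood estimate of sigma for half-normal distances.
    sigma = sqrt(sum(x * x for x in distances) / n)
    # Effective strip half-width: the integral of g(x) from 0 to infinity.
    effective_half_width = sigma * sqrt(pi / 2.0)
    # Observed count divided by the effectively surveyed area.
    return n / (2.0 * total_line_length * effective_half_width)


# Hypothetical survey: distances in metres along 1000 m of transect.
example_density = halfnormal_line_transect_density(
    [2.0, 5.5, 1.0, 12.0, 7.5, 3.0], total_line_length=1000.0
)
```

Other detection functions, truncation of distant observations, and point transects would require different expressions for the effective area surveyed.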
Population modelling
Population projection models have long been used as a tool in the process of managing animal and plant populations, most often as a means of assessing the impact of management on the population growth rate or on the probability of quasi-extinction (Caswell 2001, Burgman et al 1993). A population model will typically involve one or more demographic parameters, such as annual survival probabilities and annual reproductive rates, for individuals of different ages or stages. In the past, estimation of the parameters has been performed by separately fitting statistical models to the different sets of data; recent work in this area has focussed on regarding the population model as a statistical model that can be fitted to all the available data (Buckland et al 2007). The benefit of this approach is that all the uncertainty can be allowed for, and that estimation of the parameters can be improved by including data that provide a direct indication of the population growth rate (Besbeas et al 2002). This development has the potential to allow ecologists to fit a broad range of population models to their data, including ones that allow for immigration (cf. Nichols and Hines 2002, Peery et al 2006).
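A stage-structured projection model of the kind described by Caswell (2001) can be sketched in a few lines. The two-stage structure (juveniles and adults) and the parameter values below are hypothetical, chosen only to show how the asymptotic growth rate follows from survival and reproductive rates; the growth rate is approximated by repeated projection rather than by an eigenvalue routine.

```python
def project(matrix, abundances):
    """Project stage abundances forward one time step."""
    return [sum(a * x for a, x in zip(row, abundances)) for row in matrix]


def growth_rate(matrix, n_iter=200):
    """Approximate the asymptotic growth rate and stable stage distribution.

    Uses power iteration, which assumes the projection matrix is primitive
    (true for the example below because adults can survive as adults).
    """
    stage_dist = [1.0 / len(matrix)] * len(matrix)
    lam = 1.0
    for _ in range(n_iter):
        projected = project(matrix, stage_dist)
        total = sum(projected)
        lam = total / sum(stage_dist)  # one-step growth of total abundance
        stage_dist = [x / total for x in projected]
    return lam, stage_dist


# Hypothetical parameters: adult fecundity, juvenile and adult survival.
fecundity, juvenile_survival, adult_survival = 1.5, 0.5, 0.8
projection_matrix = [
    [0.0, fecundity],
    [juvenile_survival, adult_survival],
]
lam, stable_stages = growth_rate(projection_matrix)
```

A growth rate above one indicates a growing population under these assumed rates; management scenarios can be compared by changing the matrix entries and recomputing.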
Other developments
A key aspect of studying many plant and animal populations is their aggregated spatial distribution. This distribution might be of interest in itself, or be something that needs to be allowed for in the sampling and data analysis. There is a long tradition of the analysis of spatial pattern in ecology, involving a range of statistical techniques, including distance-based methods and spatial point processes (Fortin and Dale 2005). Various statistical distributions have been suggested as a means of allowing for the fact that aggregation often leads to zero-inflated and/or positively skewed data. These include the negative binomial, lognormal and gamma distributions, plus zero-inflated versions of these (Dennis and Patil 1984, Martin et al 2005, Fletcher 2008). Likewise, methods have been developed for fitting models that incorporate spatial autocorrelation (Legendre 1993, Fortin and Dale 2005).
Adaptive sampling is a modification of classical sampling that aims to allow for spatial aggregation by adaptively increasing the sample size in those locations where the highest abundances have been found in an initial sample (Thompson and Seber 1996, Brown and Manly 1998).
Information on the number and relative abundance of individual species in one or more geographical areas has been of interest to many ecologists, leading to the use of species abundance models (Hughes 1986, Hill and Hamer 1998), estimation of species richness (Chao 2005), modelling species-area relationships (Connor and McCoy 2001), and the analysis of species co-occurrence (MacKenzie et al 2004, Navarro-Alberto and Manly 2009).
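As one concrete instance of species richness estimation, the sketch below implements the widely used Chao1 estimator, which adds to the observed number of species a correction based on the numbers of species seen exactly once and exactly twice. It is offered as an example of the general approach, not as the method of any particular study cited here; the function name and the counts are hypothetical.

```python
def chao1(abundances):
    """Chao1 lower-bound estimate of species richness from abundance counts.

    `abundances` holds the number of individuals recorded per species;
    zeros (species not observed) are ignored.
    """
    observed = sum(1 for count in abundances if count > 0)
    singletons = sum(1 for count in abundances if count == 1)
    doubletons = sum(1 for count in abundances if count == 2)
    if doubletons > 0:
        return observed + singletons ** 2 / (2.0 * doubletons)
    # Bias-corrected form, which remains defined when there are no doubletons.
    return observed + singletons * (singletons - 1) / 2.0


# Hypothetical sample: seven observed species, three singletons, two doubletons.
example_richness = chao1([10, 5, 1, 1, 1, 2, 2, 0])
```

Many rare species in the sample imply that further species are likely to have been missed, so the estimate exceeds the observed count whenever singletons are present.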
In studying ecological communities, it is often natural to consider the use of multivariate methods. There is a large literature in this area, primarily focussing on classification and ordination techniques for providing informative summaries of the data (McGarigal et al 2000). Likewise, multivariate analysis of variance has been used to assess the ecological impact of human disturbance on a range of species (Anderson and Ter Braak 2003).
In order to study processes operating at large spatial scales, it is useful to carry out studies at those scales. In doing so, there is a tension between satisfying the statistical requirements of replication and keeping the study at a scale that is large enough to provide meaningful results (Schindler 1998, Hewitt et al 2007). There has been some discussion in the ecological literature regarding appropriate statistical methods for such studies (Cottenie and De Meester 2003). One approach is to consider a single large-scale study as insufficient to provide the level of evidence that is usually required of a small-scale experiment, with the hope that information from a number of studies can eventually be combined, either informally or using meta-analysis (Gurevitch and Hedges 1999).
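The simplest formal way of combining studies is a fixed-effect meta-analysis, in which each study's effect estimate is weighted by the inverse of its sampling variance. The sketch below assumes that all studies estimate a common effect and that their variances are known; both are strong assumptions, and the function name and numbers are hypothetical.

```python
def fixed_effect_summary(effects, variances):
    """Inverse-variance weighted mean effect and its variance.

    Assumes independent studies that share one true effect.
    """
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total_weight
    return pooled, 1.0 / total_weight


# Hypothetical effect sizes and sampling variances from three studies.
pooled_effect, pooled_variance = fixed_effect_summary(
    effects=[0.2, 0.6, 0.4], variances=[0.04, 0.01, 0.02]
)
```

More precise studies dominate the pooled estimate; where effects are expected to differ among studies, a random-effects formulation would be the more natural choice.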
Future
It is clear that the increasing popularity of computationally-intensive Bayesian methods of analysis will lead to ecologists being able to fit statistical models that provide them with a better understanding of the spatial and temporal processes operating in their study populations (Clark 2007). Likewise, recently-developed techniques such as neural networks (Lek et al 1996) and boosted trees (Elith et al 2008) are likely to appear more frequently in the ecological literature. In tandem with the development of new techniques, there will always be a need to balance complexity and simplicity in the analysis of ecological data (Murtaugh 2007).
Reprinted with permission from Lovric, Miodrag (2011), International Encyclopedia of Statistical Science. Heidelberg: Springer Science+Business Media, LLC.
References
- Anderson DR, Burnham KP and White GC (1994) AIC model selection in overdispersed capture-recapture data. Ecology 75: 1780-1793.
- Anderson MJ and Ter Braak CJF (2003) Permutation tests for multi-factorial analysis of variance. Journal of Statistical Computation and Simulation 73: 85-113.
- Besbeas P, Freeman SN, Morgan BJT and Catchpole EA (2002) Integrating mark-recapture-recovery and census data to estimate animal abundance and demographic parameters. Biometrics 58: 540-547.
- Borchers DL and Efford MG (2008) Spatially explicit maximum likelihood methods for capture-recapture studies. Biometrics 64: 377-385.
- Borchers DL, Zucchini W and Fewster RM (1998) Mark-recapture models for line transect surveys. Biometrics 54: 1207-1220.
- Brooks SP, Catchpole EA and Morgan BJT (2000) Bayesian animal survival estimation. Statistical Science 15: 357-376.
- Brown JA and Manly BFJ (1998) Restricted adaptive cluster sampling. Environmental and Ecological Statistics 5: 49-63.
- Buckland ST, Anderson DR, Burnham KP, Laake JL, Borchers DL and Thomas L (Editors) (2004) Advanced distance sampling: estimating abundance of biological populations. Oxford University Press.
- Buckland ST, Newman KB, Fernández C, Thomas L and Harwood J (2007) Embedding population dynamics models in inference. Statistical Science 22: 44-58.
- Buckland ST and Turnock BJ (1992) A robust line transect method. Biometrics 48: 901-909.
- Burgman MA, Ferson S and Akcakaya HR (1993) Risk assessment in conservation biology. Chapman and Hall.
- Caswell H (2001) Matrix population models. 2nd Edition. Sinauer Associates, Massachusetts.
- Chao A (2005) Species richness estimation. In Encyclopedia of Statistical Sciences, Second Edition. Wiley.
- Clark JS (2007) Models for ecological data: an introduction. Princeton University Press.
- Connor EF and McCoy ED (2001) Species-area relationships. In Encyclopedia of Biodiversity, Volume 5, pp 397-411. Academic Press.
- Cottenie K and De Meester L (2003) Comment to Oksanen (2001): reconciling Oksanen (2001) and Hurlbert (1984). Oikos 100: 394-396.
- Dennis B and Patil GP (1984) The gamma distribution and weighted multimodal gamma distributions as models of population abundance. Mathematical Biosciences 68: 187-212.
- Elith J, Leathwick JR and Hastie T (2008) A working guide to boosted regression trees. Journal of Animal Ecology 77: 802-813.
- Fewster RM and Buckland ST (2004) Chapter 10 of Advanced distance sampling: estimating abundance of biological populations. Edited by Buckland ST, Anderson DR, Burnham KP, Laake JL, Borchers DL and Thomas L. Oxford University Press.
- Fletcher DJ (2008) Confidence intervals for the mean of the delta-lognormal distribution. Environmental and Ecological Statistics 15: 175-189.
- Fortin M-J and Dale MRT (2005) Spatial analysis: a guide for ecologists. Cambridge University Press.
- Gurevitch J and Hedges LV (1999) Statistical issues in ecological meta-analyses. Ecology 80: 1142-1149.
- Hewitt JE, Thrush SF, Dayton PK and Bonsdorff E (2007) The effect of spatial and temporal heterogeneity on the design and analysis of empirical studies of scale-dependent systems. The American Naturalist 169: 398-408.
- Hill JK and Hamer KC (1998) Using species abundance models as indicators of habitat disturbance in tropical forests. Journal of Applied Ecology 35: 458-460.
- Hughes RG (1986) Theories and models of species abundance. The American Naturalist 128: 879-899.
- Johnson JB and Omland KS (2004) Model selection in ecology and evolution. Trends in Ecology and Evolution 19: 101-108.
- Lebreton J-D, Burnham KP, Clobert J and Anderson DR (1992) Modeling survival and testing biological hypotheses using marked animals: a unified approach with case studies. Ecological Monographs 62: 67-118.
- Legendre P (1993) Spatial autocorrelation: trouble or new paradigm? Ecology 74: 1659-1673.
- Lek S, Delacoste M, Baran P, Dimopoulos I, Lauga J and Aulagnier S (1996) Application of neural networks to modelling nonlinear relationships in ecology. Ecological Modelling 90: 39-52.
- Lukacs PM and Burnham KP (2005) Review of capture-recapture methods applicable to noninvasive genetic sampling. Molecular Ecology 14: 3909-3919.
- McArdle BH (1996) Levels of evidence in studies of competition, predation, and disease. New Zealand Journal of Ecology 20: 7-15.
- MacKenzie DI, Bailey LL and Nichols JD (2004) Investigating species co-occurrence patterns when species are detected imperfectly. Journal of Animal Ecology 73: 546-555.
- MacKenzie D, Nichols J, Royle J, Pollock K, Bailey L and Hines J (2006) Occupancy estimation and modeling: inferring patterns and dynamics of species occurrence. Academic Press.
- McGarigal K, Cushman S and Stafford S (2000) Multivariate statistics for wildlife and ecology research. Springer.
- Martin TG, Wintle BA, Rhodes JR, Kuhnert PM, Field SA, Low-Choy SJ, Tyre AJ and Possingham HP (2005) Zero tolerance ecology: improving ecological inference by modelling the source of zero observations. Ecology Letters 8: 1235-1246.
- Murtaugh PA (2007) Simplicity and complexity in ecological data analysis. Ecology 88: 56-62.
- Navarro-Alberto JA and Manly BFJ (2009) Null model analyses of presence-absence matrices need a definition of independence. Population Ecology 51: 505-512.
- Nichols JD and Hines JE (2002) Approaches for the direct estimation of λ, and demographic contributions to λ, using capture-recapture data. Journal of Applied Statistics 29: 539-568.
- Peery MZ, Becker BH and Beissinger SR (2006) Combining demographic and count-based approaches to identify source-sink dynamics of a threatened seabird. Ecological Applications 16: 1516-1528.
- Pledger S, Pollock KH and Norris JL (2003) Open capture-recapture models with heterogeneity: I. Cormack-Jolly-Seber model. Biometrics 59: 786-794.
- Schindler DW (1998) Replication versus realism: the need for ecosystem-scale experiments. Ecosystems 1: 323-334.
- Schwarz CJ and Seber GAF (1999) Estimating animal abundance: review III. Statistical Science 14: 427-456.
- Thompson SK and Seber GAF (1996) Adaptive sampling. Wiley.
- Williams BK, Conroy MJ and Nichols JD (2002) Analysis and management of animal populations. Academic Press.
- Wright JA, Barker RJ, Schofield MR, Frantz AC, Byrom AE and Gleeson DM (2009) Incorporating genotype uncertainty into mark-recapture-type models for estimating abundance using DNA samples. Biometrics 65: 833-840.