The Global Garlic Mustard Field Survey (GGMFS): challenges and opportunities of a unique, large-scale collaboration for invasion biology

To understand what makes some species successful invaders, it is critical to quantify performance differences between native and introduced regions, and among populations occupying a broad range of environmental conditions within each region. However, these data are not available even for the world’s most notorious invasive species. Here we introduce the Global Garlic Mustard Field Survey, a coordinated distributed field survey to collect performance data and germplasm from a single invasive species, garlic mustard (Alliaria petiolata), across its entire distribution using minimal resources. We chose this species for its ecological impacts, prominence in ecological studies of invasion success, simple life history, and several genetic and life history attributes that make it amenable to experimental study. We developed a standardised field survey protocol to estimate population size (area) and density, age structure, plant size and [...]

Copyright Robert I. Colautti et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. NeoBiota 21: 29–47 (2014). doi: 10.3897/neobiota.21.5242


Introduction
How is it that invasive species, which are introduced to novel geographical regions where they lack an adaptive evolutionary history, are nonetheless able to establish and proliferate? Since investigation into this topic was launched over a half-century ago (Elton 1958; Baker and Stebbins 1965), research on this question has expanded rapidly, leading to a large and growing number of ecological and evolutionary hypotheses (Sakai et al. 2001; Facon et al. 2006; Catford et al. 2009; Gurevitch et al. 2011; Jeschke et al. 2012). Biological hypotheses of invasion success generally fall into one of two categories: (i) biogeographical differences between native and introduced ranges and (ii) functional traits that differ between species or higher-order phylogenetic groups (Colautti et al. 2014). Hypotheses in the first category attribute the success of invasive species to biological differences between native and introduced regions that are more favourable in the latter. These differences include escape from natural enemies (Mitchell and Power 2003; Torchin et al. 2003; Mitchell et al. 2010), novel effects of biochemical weapons (Callaway and Ridenour 2004), novel biotic interactions (Reinhart and Callaway 2006), or increased anthropogenic disturbance (Byers 2002). Hypotheses in the second category associate invasion success with functional traits that differ among species or higher-order phylogenetic groups. For example, invasive species may be the subset of species from native source pools that possess particular ecological or evolutionary characteristics that promote introduction, establishment, spread and competitive displacement of natives (Pyšek and Richardson 2007; van Kleunen et al. 2010). These hypotheses are not exhaustive or mutually exclusive and different species may be invasive for different reasons (Mack et al. 2000). A challenge for ecology is to evaluate these hypotheses for individual invasions, eliminate those unlikely to explain invasion success, quantify the relative importance of the remaining hypotheses and identify context dependencies that allow for a robust general theory of invasion success.
A key distinction between biogeographical and trait-based hypotheses is whether introduced populations have increased in performance relative to native populations, where performance may be measured as abundance, range size, demography, and individual survival, reproduction or competitive ability. Biogeographical hypotheses that seek to explain how introduced species escape the regulatory mechanisms present in their native range make an inherent assumption that introduced populations benefit from an ecological or evolutionary increase in performance relative to native populations, on average. We call this the "increased vigour assumption". In contrast, trait-based hypotheses assume that native and introduced populations perform similarly, with invasion success owing to preadaptation and environmental similarities between ranges. It is therefore possible that two very different types of invaders exist: (i) species that become abundant and widespread through niche expansion (e.g. escaping regulation or gaining access to new resources) and (ii) species that expand their range through human agency but ultimately perform the same in native and introduced regions. Moreover, species that both perform well in their native range and expand their niche in invaded regions should be the most successful invaders. However, few studies have measured performance of natural populations to directly test the assumption of increased vigour, and these have been limited in size and geographical scope (Bossdorf et al. 2005; Parker et al. 2013; Colautti et al. 2014). Indeed, even basic field performance comparisons between native and introduced populations are often not available for many of the world's most notorious invasive species, and where available, there is often little information about variation in performance among individuals or populations within each range (Parker et al. 2013). Comprehensive field data are therefore crucial for testing the assumption of increased vigour.
In addition to testing for increased vigour, direct field measurements help to assess the ecological relevance of factors proposed to explain invasion success. By "ecological relevance" we mean the extent to which factors affecting fitness components measured under experimental conditions directly translate to invasion success in natural populations and at larger biogeographical scales. Field measurements of natural populations are important because complex interactions among ecological and evolutionary processes can limit the predictive power of ecological and genetic factors identified in controlled experiments. For example, many plants and animals have lost specialist enemies following introduction to geographically distant locales (Mitchell and Power 2003; Torchin et al. 2003), but this does not often translate to increased performance relative to native competitors in natural field settings where multiple factors interact (Agrawal et al. 2005; Parker et al. 2006; Parker and Gilbert 2007). Without comprehensive field data from natural populations, it is unclear how often invasion success results from escape from natural enemies. Performance measurements of individuals in natural populations provide an important link between experimental results and ecological relevance.
Testing the assumption of increased vigour, distinguishing biogeographical and phylogenetic effects, and measuring the ecological relevance of hypotheses of invasion success are difficult tasks. Ideally, a study of increased vigour in a single species would involve (i) extensive field surveys measuring performance of native and introduced populations, (ii) large-scale field manipulations to study population dynamics at nested spatial scales, (iii) development of genetic resources, and (iv) an integrated experimental approach involving experts in a variety of areas including chemical ecology, community dynamics, population ecology, population genetics, developmental biology, and genomics. To facilitate such an approach, our goal is to build a model species for invasion biology, develop novel resources and encourage international and interdisciplinary collaboration to coordinate detailed and robust research on a single species: garlic mustard, Alliaria petiolata. Such an approach could then be applied to additional species in the future. Below we review our motivation for the project, introduce the sampling protocol, describe constraints on protocol design and implementation, summarize the extent of participation, outline our curation and quality control procedures, and note potential avenues for future research.

Rationale
The Global Garlic Mustard Field Survey (GGMFS) was conceived during a 2008 meeting of the Global Invasions Network, funded by a Research Collaboration Network grant from the U.S. National Science Foundation (see Acknowledgements). The meeting involved 35 invasion biologists representing a broad range of empirical and theoretical backgrounds in ecology, evolution and genetics. Discussions within this group identified the need for standardized field measurements of performance traits, and led to our choice of study species and experimental design.

Large-scale collaborations in ecology
A major challenge in ecology is generalizing from individual studies done under controlled conditions or in particular locations to broad-scale ecological patterns. This is particularly difficult in invasion biology, where ranges often span continents and resources are split among studies of hundreds of different invasive species. Just as large genomic databases and bioinformatics techniques are advancing scientific understanding of genetics and evolution, large-scale ecological data collection networks that focus on a common goal can provide comprehensive data to improve understanding of ecological processes and interactions (Silvertown 2009; Cadotte et al. 2010; Firn et al. 2011; Moles et al. 2011; Silvertown et al. 2011). Large networks of professional scientists are ideally suited to collecting spatially extensive ecological data in a standardized format that is also scientifically rigorous (Craine et al. 2007). Such planned, coordinated endeavours provide much more consistent and reliable data than those which could be obtained through reviews and meta-analyses of smaller studies that typically differ in methodology and sample size, and in some cases were not designed to answer the scientific questions being addressed (Moles et al. 2011). Our general approach is intermediate between two increasingly popular models of large-scale research networks: citizen science projects and "coordinated distributed experiments" or CDEs (Fraser et al. 2012). These studies are generally conducted on a global scale, but vary in the level of expertise, number of participants, sample size and data depth (Figure 2).
Pure citizen-science projects like Project Budburst (http://budburst.org) and the Evolution MegaLab (http://www.evolutionmegalab.org) (Silvertown et al. 2011) generate large amounts of data at a low cost but rely heavily on private citizens with little expertise who may have no formal education in biology, and this has led to concerns about data accuracy (Delaney et al. 2008; Fitzpatrick et al. 2009; Dickinson et al. 2010). In contrast, large-scale coordinated distributed experiments typically involve a smaller group of professional scientists but require higher levels of funding. For example, the International Tundra Experiment (ITEX) (http://www.geog.ubc.ca/itex) and the Nutrient Network (NutNet) (http://nutnet.science.oregonstate.edu) involve scientists using standardized protocols to collect ecological data at a series of sites around the world (Craine et al. 2007; Fraser et al. 2012). The GGMFS is intermediate between these two general approaches as it involves a larger number of highly-trained professional scientists, covering a larger number of sites, but with less intensity of research and low research cost per site (Figure 2).

Challenges and trade-offs
Large-scale collaborative research projects involving hundreds of scientists create a number of specific challenges, trade-offs and financial considerations. Development of a standardized protocol for large distributed studies requires decisions about the intensity of study and sophistication of participants (Figure 2). Given limited human resources and time, increased intensity of sampling at each site will trade off with the number of sample sites. Additionally, more sophisticated measurements require more expertise, reducing the number of qualified participants. In contrast to most citizen science projects, which usually include very simple measurements (e.g. presence of a species) with little or no equipment or training, we developed a protocol that requires only basic measuring supplies, but can be slightly challenging for participants with no formal science education, and can take several hours to complete in a dense population. Nonetheless, we deliberately excluded more complicated and time-intensive field manipulations or measurements such as survival rates, seed production, edaphic measurements, and community composition to encourage an increased number of participants and to keep the protocol accessible to non-scientists. As a result, GGMFS participants are mostly professionally-trained, full-time scientists at academic and governmental organizations who are aware of the need for careful and unbiased data collection. A smaller number of populations have been sampled by non-scientists, and these will be examined for potential data quality issues in future analyses.

Study system
Garlic mustard, Alliaria petiolata (M. Bieb.) Cavara & Grande, is native to most of Europe and western Asia below 68°N latitude (Cavers et al. 1979). Several factors make this species suitable as a model species for invasion biology.

A widespread and successful invader with demonstrated impacts
Herbarium records (Cavers et al. 1979) and neutral genetic markers (Durka et al. 2005) suggest introductions of A. petiolata to North America from multiple locations across its native Eurasian distribution beginning in the early 19th century, probably for food and medicinal uses. Like many invasive species, it remained inconspicuous for decades, after which it spread at an exponential rate. It is now present in at least 37 U.S. states and five Canadian provinces and has been declared a prohibited or noxious weed in eight states (USDA Plants Database). Alliaria petiolata invades nutrient-rich, semi-shaded habitats such as forest edges and moist woodlands. Dense invasive populations can reduce native plant diversity and limit recruitment of native trees (Carlson and Gorchov 2004; Stinson et al. 2007) by disrupting mycorrhizal communities that are important for economically valuable trees and native understory plants (Stinson et al. 2006; Burke 2008; Wolfe et al. 2008; Barto et al. 2011; Lankau 2011), and by altering litter decomposition as well as soil nitrogen and phosphorus cycles (Rodgers et al. 2008).

Implicated in several key hypotheses of invasion success
Invasive populations of Alliaria petiolata have been studied frequently. In North America, the species lacks its native specialist herbivores (Blossey et al. 2001), and suffers less herbivore damage (Lewis et al. 2006). Several previous studies tested whether this enemy release has led to evolution of decreased herbivore defences and increased competitive ability (EICA hypothesis). Some aspects of plant defence were consistently decreased in introduced populations (Bossdorf et al. 2004b; Hull-Sanders et al. 2007), but overall evidence was rather ambiguous and did not support the EICA hypothesis (Bossdorf et al. 2004a, b; Cipollini et al. 2005; Hull-Sanders et al. 2007; Cipollini and Lieurance 2012). Alliaria petiolata contains a broad array of secondary metabolites that putatively affect soil microbial communities (Cipollini 2002; Cipollini et al. 2005; Callaway et al. 2008; Bressan et al. 2009; Lankau 2011) and likely play a role in defence against herbivores and pathogens (Haribal and Renwick 1998, 2001; Haribal et al. 1999; Renwick et al. 2001; Kumarasamy et al. 2004; Cipollini and Gruner 2007). In connection with field surveys and further genetic studies, this body of research provides an excellent basis for characterizing individual-level variation in secondary chemicals, and potentially for linking this variation to ecosystem-level processes.

Simple lifetime fitness estimate
Alliaria petiolata is a biennial monocarpic species. Seeds germinate in spring, and plants overwinter as rosettes, flower the following spring, and produce fruits in early summer (Cavers et al. 1979). Sampling populations in the summer therefore allows for simultaneous measurements of reproductive output of individual plants as well as population demographic structure (i.e. first-year vs. second-year plants). Reproductive stems reach approximately 1 m in height and typically produce 10-25 siliques, each containing 10-20 seeds produced primarily through self-pollination. This relatively simple life cycle means that total lifetime reproductive success can be measured in the field and in greenhouse or growth chamber experiments.
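The silique and seed ranges quoted above imply a rough bound on the lifetime seed output of a typical plant. The following is a minimal sketch of that arithmetic, assuming the two ranges can simply be multiplied end to end; the function name is hypothetical and not part of the GGMFS protocol.

```python
def seed_output_range(siliques=(10, 25), seeds_per_silique=(10, 20)):
    """Return (min, max) seeds per plant implied by the two quoted ranges.

    Assumption: the lowest silique count pairs with the lowest seed count,
    and likewise for the highest, giving outer bounds rather than a mean.
    """
    low = siliques[0] * seeds_per_silique[0]
    high = siliques[1] * seeds_per_silique[1]
    return low, high


print(seed_output_range())  # (100, 500)
```

Under these assumptions a typical reproductive plant would produce on the order of 100 to 500 seeds, which is why a simple fruit count is a workable field proxy for fecundity.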

Easily identified
Alliaria petiolata is the sole member of its genus found in Europe and North America (Cavers et al. 1979). Both juvenile and adult plants are very distinct from naturally co-occurring plants. Adult plants are easily recognized by their characteristic inflorescences in late spring (Figure 1A). First-year rosettes vary in size (2-20 cm), and usually have 5-10 toothed leaves (Figure 1B) before developing inflorescences and siliques (Figure 1C). Leaves on mature plants vary from deltoid at the base to lanceolate toward the apex (Figure 1D). Flowers are 6-7 mm in diameter, each with four white petals (Figure 1D).

Genetically tractable
The Brassicaceae is a genetically well-studied family of angiosperms. It includes many economically important horticultural and crop species and the model species Arabidopsis thaliana, which diverged from a common ancestor with Alliaria petiolata only about 20 million years ago (Koch et al. 2000). As a result of recent divergence, the two species share 98.3% of 1383 base pairs in the large subunit of ribulose-1,5-bisphosphate carboxylase (rbcL) (GenBank accessions JQ933212.1 and AP000423.1). This genetic similarity is useful for annotating sequencing data and identifying candidate genes underlying phenotypic traits using common genomic tools. While metabolism and expression of defensive chemicals have not been well characterized in A. petiolata, related compounds have been extensively studied in many Brassicaceae species, and in particular the core biosynthetic pathway for glucosinolate production has been almost completely mapped in A. thaliana (Halkier and Gershenzon 2006). Identification of candidate genes and comparative genomics are quickly becoming economically feasible for genetic studies of non-model organisms as sequencing costs have declined roughly 100,000-fold over the last decade (Lander 2011). This will be of particular value for genomic studies of A. petiolata because it is an autotetraploid with a relatively large genome (C-value 1.95), about 14× that of A. thaliana (Bennett 1972).
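The sequence similarity quoted above is a simple proportion of matching sites in an alignment. As an illustration only, the sketch below computes percent identity for two pre-aligned, equal-length sequences; the function name and the toy sequences are hypothetical, gaps are not treated specially, and no real rbcL data are used.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of aligned positions at which two sequences match.

    Assumes the sequences are already aligned and of equal length.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must be non-empty")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


# Toy example (not real rbcL data): one mismatch in ten aligned sites.
toy = percent_identity("ATGTCACCAC", "ATGTCACCAA")

# Back-of-envelope for the figure in the text: 98.3% identity over 1383 sites.
approx_differences = round(1383 * (1 - 0.983))
```

Because 98.3% is itself a rounded value, the back-of-envelope figure corresponds to roughly 23 to 24 differing sites out of 1383.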

Field survey
We established the GGMFS in 2009 to measure population size (i.e. area of coverage), density and age structure (i.e. proportion of first-year rosettes vs. second-year reproductive adults), as well as size, reproduction, herbivory and fungal damage of individual plants, plus some important environmental variables such as habitat type and canopy cover. The current full protocol and data collection sheet are available at http://www.garlicmustard.org, archived on FigShare (doi: 10.6084/m9.figshare.729274), and are described briefly below.

Setup and site choice
Contributors to the GGMFS are expected to sample populations in the late spring or early summer after >95% of plants have finished flowering. The specific date varies by climate because sampling is intended to standardize phenology across populations and to collect mature seeds for future experiments. We generally encourage participants to include one site with low light (e.g. a forest interior location), and one with high light (e.g. forest edge or roadside). However, to limit selection bias, any site containing 20 or more A. petiolata plants can be included in the study.

Data collection, curation and quality control
Measurements begin by pacing out the approximate area of the population of A. petiolata (length × width). The participants then lay out a 10 m transect to measure plants from one edge of the population moving toward the centre. In each of 10 adjacent 1 × 0.5 m plots, all juvenile rosettes and adult plants are counted. The size of the nearest juvenile and adult plant, as well as the height, leaf number and fecundity (number of fruits) of the adult, are measured at five intervals, 20 cm apart, along each plot. On the adult plants, participants also count both the number of undamaged leaves and the number of leaves with >10% herbivore damage. In 2011, we added a simple measurement for pathogen damage, by noting both the total number of plants in each plot that exhibited signs of leaf pathogens as well as the number of damaged leaves. To estimate % canopy cover at each site, photos are taken of the forest canopy at three points across the sampled transect and a visual estimate of average cover is recorded. We use digital camera photos for these estimates because digital cameras are common equipment, whereas a fisheye lens is more accurate but specialized equipment that many people may not have access to. The participants also note whether there is any information on past or ongoing disturbance or control efforts applied to the population they are sampling.
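The plot counts described above translate directly into density and age-structure estimates. The sketch below is one possible way to summarize a single population, assuming ten 1 × 0.5 m plots per transect; the function name, dictionary keys and example counts are hypothetical and are not taken from the GGMFS data sheet.

```python
PLOT_AREA_M2 = 1.0 * 0.5  # each plot is 1 m x 0.5 m, as in the protocol


def summarize_transect(rosettes, adults, pop_length_m, pop_width_m):
    """Summarize one population from per-plot rosette and adult counts."""
    if len(rosettes) != len(adults):
        raise ValueError("need one rosette count and one adult count per plot")
    total_rosettes = sum(rosettes)
    total_adults = sum(adults)
    total_plants = total_rosettes + total_adults
    sampled_area_m2 = len(rosettes) * PLOT_AREA_M2
    return {
        # Population size as paced-out area (length x width).
        "area_m2": pop_length_m * pop_width_m,
        # Plants per square metre within the sampled plots.
        "density_per_m2": total_plants / sampled_area_m2,
        # Age structure: share of first-year rosettes among all plants.
        "rosette_fraction": (
            total_rosettes / total_plants if total_plants else None
        ),
    }


# Made-up counts for ten plots along one transect.
example = summarize_transect(
    rosettes=[12, 8, 5, 0, 3, 7, 9, 4, 2, 0],
    adults=[3, 2, 4, 1, 0, 2, 5, 1, 1, 1],
    pop_length_m=20,
    pop_width_m=15,
)
print(example["density_per_m2"])  # 14.0
```

Keeping area and density as separate quantities avoids extrapolating plot density to the whole population, which the transect design (edge toward centre) may not justify.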
After measurements are recorded, participants harvest inflorescences from the first 20 adult plants in each population. These collections are dried for at least two weeks and then mailed along with the original data sheets and canopy photos. Seed collections and photos of sampled populations provide confirmation that the correct species was sampled, which is of concern mainly for private citizen contributions. Upon receiving these collections we visually inspect the online and hand-written data for potential typographical errors, and we clean and store all seed collections at 5°C. We use ImageJ (Abramoff et al. 2004) to estimate canopy cover from canopy photos, and we compile approximate bioclimatic variables for each sampled location from the WorldClim database (http://www.WorldClim.org) (Hijmans et al. 2005). We use Google Earth (Version 5.0) to confirm population locations and habitat information. With precise GPS locations, Google Earth images are of high enough resolution to verify key characteristics entered as site descriptions, such as altitude and habitat type. Finally, we run a series of statistical tests for typos and outliers and organize all text into a single data file using R (R Core Team 2012).
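The text does not specify which statistical tests are used to screen for typos and outliers. As one illustration of the general idea, and not the authors' method, the sketch below flags values with a large modified z-score based on the median absolute deviation (MAD); the function name, the threshold of 3.5 and the example heights are assumptions.

```python
from statistics import median


def flag_outliers(values, threshold=3.5):
    """Return indices of values whose MAD-based modified z-score exceeds threshold.

    This is an illustrative screening rule, not the GGMFS procedure.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # No spread around the median, so the score is undefined; flag nothing.
        return []
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - med) / mad > threshold
    ]


# Made-up adult plant heights in cm; 640 could be a typo for 64.0.
heights_cm = [62, 71, 58, 66, 640, 69, 60]
print(flag_outliers(heights_cm))  # [4]
```

A median-based rule is used here because a single mistyped value can inflate the mean and standard deviation enough to hide itself; flagged entries would still need checking against the original data sheets.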

Participation and current extent of sampling
Across four field seasons (2009-2012), 164 participants in 16 countries across North America and Europe (Figure 3) collected data and sampled seeds from 383 field sites (Figure 4). These participants included many academic scientists (faculty, postdocs, and graduate students) but also weed managers, conservation groups, and citizen scientists. Sampling intensity was strongly skewed as most contributors sampled a small number of populations, but a few individuals sampled many sites (Figure 3A). Interestingly, participation at each site was generally all-or-nothing; only five entries were excluded due to incomplete data. Sampling began in 2009 but was most heavily promoted in 2010 and 2011 through direct invitation of individual scientists, as well as general announcements on listservs (e.g. ECOLOG-L, EvolDir) and citizen science channels. After the 2011 season, participants were no longer directly solicited, and participation dropped to 19 populations in 2012. The total field data contributed thus far include 383 populations concentrated in the northeast U.S. and central Europe (Figure 4). These samples represent a fair proportion of the North American and European distribution of A. petiolata (Welk et al. 2002). Over 5,000 seed families have been collected from these populations and are currently being evaluated for viability and subsequently grown for seed production. Collectively, participants counted 137,811 plants and recorded 47,514 individual measurements.

Undergraduate engagement
The GGMFS protocol has been incorporated into some undergraduate-level field courses, particularly at universities focused on undergraduate teaching and educating minorities. Faculty positions at these universities are typically associated with high teaching loads and low research budgets, which limit research output relative to faculty at major universities. At the same time, these schools tend to have smaller class sizes and more involved undergraduate students. Large collaborative networks like the GGMFS allow faculty with limited time and financial resources to collaborate within a global community of researchers and play a key role in collecting and analysing valuable scientific data (Bowne et al. 2011).
In addition to involving faculty in research, the GGMFS data and protocol introduce students to research and provide excellent opportunities for project-oriented undergraduate teaching. Active learning exercises, in which students engage directly in laboratory or field demonstrations, are more effective than class lectures and other passive modes of teaching (National Academies Press 2005; Michael 2006). We see two main avenues for teaching: (i) Students can use the formal GGMFS protocol to collect data in their area, and then analyse and discuss these data in comparison with other populations from the GGMFS. (ii) Students can incorporate climate data and satellite images of GGMFS sample sites to learn techniques and address fundamental questions in invasion biology, plant ecology and general biology. There are many more options for building elements of the GGMFS into future curricula, e.g. laboratory studies using seeds from the seed bank, which connect molecular or quantitative genetic information with the large amount of field and environmental information available for each of the seed origins.

Open science
One of the guiding principles of the GGMFS is the inherent value to society of completely open and accessible data, analyses, germplasm and any additional resources arising from this project. Creating these resources is also a crucial step for cultivating a diverse community of biologists with a variety of expertise but a common goal of understanding the biological mechanisms underlying the invasion success of an ecologically important invasive species. We plan to eventually release all data collected for the GGMFS, along with relevant analyses needed to replicate any results published in peer-reviewed journals. These will be available to the scientific community with the expectation that future analyses using the data will also be made equally open-access. The release of the field data will likely occur in a series of stages following publication, with a sequester period of 1-2 years to facilitate novel analyses among project participants before releasing data to the scientific community.

Seed propagation and archiving
Seeds collected as part of the field survey will form the basis of inbred lines for laboratory demonstrations and for future research. Seed collections are stored under cool, dry conditions in three locations: the University of Tübingen, Germany; Fordham University in New York, USA; and the University of British Columbia in Vancouver, Canada. To reduce the potential for maternal effects and to produce enough seeds for future experiments, we are currently propagating all viable seed families in an outdoor common garden in Tübingen.

Future projects
In addition to collecting valuable data and seed resources, the GGMFS has brought together a global network of scientists who possess a range of expertise and a unifying interest in understanding biological invasions. To build on the work described above, we propose to expand this network and to consider additional studies that complement the GGMFS data and address difficult but important questions about the ecology and evolution of invasive species. Anyone interested in participating should contact the lead authors of this paper or the project coordinators listed on the GGMFS website (http://www.garlicmustard.org). We have identified three projects in particular that build on the strengths of the GGMFS model:
(i) Temporal sampling: More detailed demographic measurements, and long-term sampling of the same populations across multiple years, would help improve understanding of invasion dynamics at several nested spatial scales.
(ii) Additional invasive species: Replicating the GGMFS approach with other invasive species, including those with different life-history strategies and extent of invasiveness, would allow for a more general test of the increased vigour assumption and for testing hypotheses of invasion success in other species. In addition to all of the benefits described above for A. petiolata, field surveys from multiple species will help to identify generalities in the relative importance of different ecological and evolutionary processes to invasion success.
(iii) Large-scale reciprocal transplant experiments: Reciprocal transplant experiments have a long history in plant biology but have rarely been used to study invasive species. Utilizing the GGMFS network and additional collaborators, a large transcontinental reciprocal transplant experiment across dozens of sites in North America and Europe, using a shared subset of GGMFS seed families, would be particularly useful to test for genetic differences between, and local adaptation of, native and introduced populations.

Conclusions
Large collaborations are transforming many areas of science, but ecological and evolutionary studies of invasive species have spread limited resources across a broad range of study systems and geographic locations. Focusing studies on a few model systems can help to comprehensively address the fundamental question of what determines invader success, and to evaluate different mechanisms of invasiveness. The Global Garlic Mustard Field Survey (GGMFS) is a step toward this more integrated approach to invasion biology; it provides much-needed comprehensive data on the performance of natural populations of an invasive species across its native and introduced ranges. Large field surveys can provide an important link from experimental results observed at local sites on a subset of populations to biogeographical patterns of invasion success occurring at continental and global scales.

Figure 1. Diagnostic characteristics of Alliaria petiolata. A populations B rosettes C bolting inflorescences and D individual flowers and developing siliques.

Figure 2. Schematic showing two hypothetical trade-off axes: 1. (x-axis) Sampling intensity: depth of data collection (e.g. number of measurements per site) increases towards the left while the breadth of coverage (e.g. number of sites) increases towards the right; 2. (y-axis) Participant sophistication: number of participants increases towards the top while the average level of expertise per participant increases toward the bottom. Approximate position of the Global Garlic Mustard Field Survey (GGMFS) is shown in relation to other large collaborations in ecology: Nutrient Network (NN); National Ecological Observatory Network (NEON); International Tundra Experiment (ITEX); and the Alpine Stress Gradient Project (ASG), and to citizen science projects: Christmas Bird Count (CBC) and Project Budburst (PBB).

Figure 3. Frequency histograms of A sampling effort by each participant, and B number of participants per country. Note that "participant" in this case is either an individual or a group of people sampling together.

Figure 4. Map showing 383 sample locations from 2009-2012 inclusive, representing both the native (Europe) and introduced (North America) ranges. Dots are translucent, resulting in darker areas that indicate regions with higher sampling intensities.