Taxonomic perils and pitfalls of dataset assembly in ecology: a case study of the naturalized Asteraceae in Australia

The value of plant ecological datasets with hundreds or thousands of species is principally determined by the taxonomic accuracy of their plant names. However, combining existing lists of species to assemble a harmonized dataset that is clean of taxonomic errors can be a difficult task for non-taxonomists. Here, we describe the range of taxonomic difficulties likely to be encountered during dataset assembly and present an easy-to-use taxonomic cleaning protocol aimed at assisting researchers not familiar with the finer details of taxonomic cleaning. The protocol produces a final dataset (FD) linked to a companion dataset (CD), providing clear details of the path from existing lists to the FD taken by each cleaned taxon. Taxa are checked off against ten categories in the CD that succinctly summarize all taxonomic modifications required. Two older, publicly-available lists of naturalized Asteraceae in Australia were merged into a harmonized dataset as a case study to quantify the impacts of ignoring the critical process of taxonomic cleaning in invasion ecology. Our FD of naturalized Asteraceae contained 257 species and infra-species. Without implementation of the full cleaning protocol, the dataset would have contained 328 taxa, a 28% overestimate of taxon richness by 71 taxa. Our naturalized Asteraceae CD described the exclusion of 88 names due to nomenclatural issues (e.g. synonymy), the inclusion of 26 updated currently accepted names and four taxa newly naturalized since the production of the source datasets, and the exclusion of 13 taxa that were either found not to be in Australia or were in fact doubtfully naturalized. This study also supports the notion that automated processes alone will not be enough to ensure taxonomically clean datasets, and that manual scrutiny of data is essential. In the long term, this will best be supported by increased investment in taxonomy and botany in university curricula. Copyright Brad R. Murray et al. 
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. NeoBiota 34: 1–20 (2017) doi: 10.3897/neobiota.34.11139 http://neobiota.pensoft.net RESEARCH ARTICLE Advancing research on alien species and biological invasions A peer-reviewed open-access journal


Introduction
Large datasets in plant ecology, composed of hundreds or thousands of species, are increasingly being assembled by combining existing lists of species (van Kleunen et al. 2015). The value of such datasets for addressing research questions is first and foremost determined by the taxonomic accuracy of their plant names. The task of merging multiple source datasets into one plant ecological dataset that is clean of taxonomic errors is seldom straightforward because lists that have not been actively maintained become outdated and riddled with incorrect or obsolete taxonomy (Soberón and Peterson 2004, Hulme and Weser 2011). This can lead to a lack of taxonomic congruence among existing lists and ultimately the assembly of a taxonomically unreliable dataset (Jansen and Dengler 2010). The use of unreliable datasets is of concern as they increase the risk of reaching questionable ecological conclusions and making poorly informed conservation and management decisions (Pyšek et al. 2013).
Taxonomic cleaning during the assembly of plant ecological datasets can be an especially difficult process for non-taxonomists, not only because of the inherent complexities of taxonomy and the ongoing nature of taxonomic change (Chapman 2005), but also given the recent decline of taxonomic expertise and resources (Wheeler 2014, Halme et al. 2015) that would normally be the first point of contact for taxonomic assistance (Gotelli 2004). The sorts of problems that need to be overcome during the assembly of plant ecological datasets include, among others, locating scientifically reliable source datasets, resolving issues of synonymy so that species' names are correct and currently accepted, and, where relevant, assigning the correct ecological status (e.g. rare, naturalized, invasive) to each species' name. For instance, a status of common may have switched to a status of rare by the time of dataset assembly (e.g. Murray and Hose 2005). Despite these problems, the increasing global availability of large volumes of ecological data and the growing reliance on Big Data to address the world's environmental problems mean that efforts must continue to assemble taxonomically clean and reliable datasets.
In an effort to assist ecologists not familiar with the finer details of taxonomic cleaning and who may not have previously assembled an ecological dataset, our first aim in the present study is to describe the range of taxonomic difficulties likely to be encountered when combining existing lists of plant species into a harmonized dataset. To facilitate this, we present a systematic taxonomic cleaning protocol for merging multiple source datasets into a single plant ecological dataset. The protocol draws partly on established knowledge and procedures for taxonomic cleaning (e.g. Chapman 2005, Chavan 2007, Kooyman et al. 2012, Pyšek et al. 2013, Mathew et al. 2014) and expands upon these in a systematic way to include searches for taxa new to a study region since the production of source datasets, confirmation of the occurrence of taxa in the region through manual inspection of distribution records, and verification of the ecological status of taxa. Our second aim is to present a case study that assembles a dataset of naturalized species and infra-species in the Asteraceae in Australia by merging two publicly available source datasets (Groves et al. 2003, Randall 2007). Importantly, we use this case study to quantify the impacts and to highlight the ramifications of ignoring the critical process of taxonomic cleaning.

Dataset design
Data cleaning identifies inaccurate and incomplete data and improves the quality of a dataset through correction of detected errors and omissions (Chapman 2005). We describe an eight-step protocol for taxonomic cleaning (Fig. 1) that produces a final dataset (FD) linked to a companion dataset (CD). The FD is the cleaned dataset of species and infra-species (together referred to as taxa) ready for use in ecological studies (Suppl. material 1). The CD provides clear details of the path from source dataset to the FD taken by each taxon that has required some form of cleaning (Suppl. material 2).
The first four columns in both datasets contain genus, species, infra-species marker, and infra-species names while the fifth column contains the title(s) of the source dataset(s) in which taxon names occur. Central to the construction of the CD is checking off each taxon name against one or more of 10 categories listed in the CD. Each category, which has its own column in the CD for noting whether a taxon meets the requirements of the category, is described in full detail below (examples of each category are provided in Table 1). The CD is critical as it allows future studies to trace the origin of taxa in the FD exactly in the taxonomic form that they were collected and revisit them if need be. A comment column is included in both the FD and CD to ensure clearly articulated pathways of communication about the cleaning process between the two datasets. The 10 categories and comments columns transparently summarize all taxonomic modifications and updates, additions of new taxa, and taxon exclusions.
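As an illustration only, the layout of the two datasets could be represented as follows. This is a minimal Python sketch and not the format of the supplementary files: the class, field and method names (`TaxonRecord`, `CompanionRecord`, `check_off`) are hypothetical, and only the columns and the ten category names are taken from the text.

```python
from dataclasses import dataclass, field

# The ten CD categories named in the text.
CATEGORIES = (
    "clone", "new", "synonym", "infra-species", "problem",
    "non-region", "island", "cultivated", "residence", "status",
)


@dataclass
class TaxonRecord:
    """One row of the FD: four name columns, source titles, and a comment."""
    genus: str
    species: str
    infra_marker: str = ""   # e.g. "ssp."
    infra_name: str = ""
    sources: tuple = ()      # source dataset titles, e.g. ("GD", "RD")
    comment: str = ""

    @property
    def name(self) -> str:
        parts = [self.genus, self.species, self.infra_marker, self.infra_name]
        return " ".join(p for p in parts if p)


@dataclass
class CompanionRecord(TaxonRecord):
    """One row of the CD: the same columns plus the categories checked off."""
    categories: set = field(default_factory=set)

    def check_off(self, category: str) -> None:
        if category not in CATEGORIES:
            raise ValueError(f"unknown category: {category}")
        self.categories.add(category)
```

A CD row for a synonym would then be created with `CompanionRecord("Cnicus", "benedictus", sources=("GD", "RD"))` and marked with `check_off("synonym")`, with the comment column recording its currently accepted name.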

Taxonomic cleaning protocol
The eight-step protocol presented here can be used to integrate any number of source lists, ranging from two to hundreds, into a single dataset from which taxonomic uncertainties and inaccuracies have been removed. The protocol is applicable to any taxonomic clade and can be applied in a consistent manner to the assembly of datasets that target one or more geographic regions (from local plant communities to continental or global floras). The protocol can also be used to assemble comparative datasets that require large numbers of taxa to test ecological and evolutionary hypotheses which may not necessarily be tied to a particular geographic region. Recently developed automated processes for various aspects of cleaning (e.g. Cayuela et al. 2012, Boyle et al. 2013, Pennell et al. 2016) can be implemented while following our protocol. We do not explore issues related to cleaning geographic coordinate records of taxa as these have been covered in detail elsewhere (e.g. Chapman 1998, Chapman 2005, Kooyman et al. 2012, Maldonado et al. 2015, Robertson et al. 2016). Our check for the occurrence of taxa in a region (step 6) is simple in that we are interested only in whether a taxon is in the study region or not. However, this step of the protocol does require careful scrutiny of available data such as inspection of comments on herbarium records and perhaps even new field surveys to ensure that specimens were collected within the study region.
Figure 1. A Flowchart of the eight steps in the taxonomic cleaning protocol. B Ten categories in the companion dataset that are populated with taxon names during the cleaning process, located adjacent to relevant steps in the protocol. C A walkthrough of the case study of naturalized Asteraceae in Australia, with a numerical breakdown of the taxa in the working list at each step to the production of the final and companion datasets.

Table 1. Descriptions of the 10 categories in the companion dataset with examples of naturalized Asteraceae in Australia. FD = final dataset, CD = companion dataset, GD = Groves et al. (2003), RD = Randall (2007).

Clone
A taxon with an identical entry of its name in more than one source dataset.
Facelis retusa has the same name in GD and RD. Facelis retusa is placed in the FD and in the CD checked off against the clone category.

New
A taxon found to occur either within a study region or in clades that are the focus of a study, since the time when the source datasets were originally constructed.
Bidens aurea has become naturalized in Australia since the preparation of GD and RD. Bidens aurea is placed in the FD and in the CD checked off against the new category.

Synonym
A taxon with an old, no longer accepted scientific name listed in a source dataset, and that is now recognized by a new, currently accepted scientific name.
Cnicus benedictus in GD and RD is a synonym of the currently accepted name Centaurea benedicta. Centaurea benedicta is placed in the FD and Cnicus benedictus is placed in the CD checked off against the synonym category.

Infra-species
A taxon whose [genus + species] and [genus + species + infra-species] names in source datasets are taxonomically valid.
Centaurea nigrescens ssp. nigrescens in GD and Centaurea nigrescens in RD are both valid names. We placed Centaurea nigrescens ssp. nigrescens in the FD and Centaurea nigrescens in the CD checked off against the infra-species category, as we chose to include [genus + species + infra-species] names in the FD over [genus + species] names.

Problem
A taxon in a source dataset for which there is either current uncertainty regarding the correct name that should be used or whose name cannot be officially verified.
Palafoxia rosea cannot be taxonomically verified and is excluded from the FD and placed in the CD checked off against the problem category.

Non-region
A taxon in a source dataset that is found on close inspection not to occur in the study region.
Brachylaena discolor does not occur in Australia (both known herbarium records are from overseas) and is excluded from the FD and placed in the CD checked off against the non-region category.

Island
A taxon in a source dataset that is found on a nearby island, not on the mainland study region.
Picris hieracioides is not on mainland Australia but has possibly been recorded on nearby Norfolk Island. Picris hieracioides is excluded from the FD and placed in the CD checked off against the island category.

Cultivated
A taxon in a source dataset that is found in the study region, but only in cultivated form.
There are no examples of naturalized Asteraceae in the source datasets that are only in Australia in cultivation.

Residence
A taxon in a source dataset that is native when the focus of the study is on exotic taxa, or a taxon that is exotic when the focus of the study is on native taxa.
There are no examples of naturalized Asteraceae in the source datasets excluded from the FD because they are native to Australia.

Status
A taxon whose ecological status in the source dataset does not match the required status.
Anacyclus radiatus is excluded from the FD and placed in the CD checked off against the status category because it is doubtfully naturalized in Australia.

Step 1: Locating source datasets
Protocol. Datasets can be obtained from a wide range of sources, including published floras, scientific papers, herbaria and museums. There is also an expanding availability of relevant data from sources such as the Global Biodiversity Information Facility (GBIF, www.gbif.org), the Global Invasive Species Dataset (GISD, www.issg.org) and the TRY Plant Trait Dataset (TRY, www.try-db.org). Each source dataset used during dataset assembly is given a unique title to keep track of the origin of taxon names throughout the cleaning process.
Confidence that source datasets are scientifically reliable and have been produced carefully is an essential requirement for dataset assembly. No matter how much a source dataset is cleaned, if the underlying compilation of taxa in the source dataset is questionable, then use of the dataset will subsequently lead to the assembly of an unreliable dataset. The best-case scenario is found in regions with a long history of botanical work and record-keeping. In such cases, obtaining reliable and up-to-date source datasets is straightforward. For example, the alien flora of the Czech Republic has been carefully described (Pyšek et al. 2002, Pyšek et al. 2012a), and a solid body of research which has used and refined this work provides a supportive framework for new research (e.g. Mihulka et al. 2003, Pyšek et al. 2003, Chytrý et al. 2005, Křivánek and Pyšek 2006, Chytrý et al. 2009, Phillips et al. 2010, Pyšek et al. 2012b). The strength of such source datasets is that there is usually a wealth of information about how they were built, including references, contained in peer-reviewed papers. There is an important point of distinction, in terms of confidence in a source dataset, between regions with such dataset availability and regions which have lists that are perhaps only available online and that are not attached to an institution, lacking any information about their construction or ongoing taxonomic maintenance.
Naturalized Asteraceae. Australia was permanently settled by Europeans in 1788, and even within the first 14 years of settlement, 29 exotic plant taxa that were introduced either accidentally or deliberately had started to naturalize (Groves 2002). Since then, over 2,500 plant taxa have become naturalized across the continent (Groves et al. 2005). Our case study assembled a dataset of species and infra-species in the Asteraceae that have become naturalized in the natural environment in Australia since permanent European settlement. We selected the Asteraceae for our study because a large number of taxa in the group have become naturalized in Australia and many have become invasive and problematic across the landscape (Radford and Cousens 2000, Parsons and Cuthbertson 2001, Hamilton 2005, Dodd et al. 2015). We merged the naturalized Asteraceae from two publicly available datasets of naturalized plants in Australia, Groves et al. (2003) and Randall (2007) (referred to as GD and RD respectively, here and in the FD and CD), into a single dataset. These are older sources of information, but we selected them specifically to demonstrate the problems that are to be expected and the errors that can arise when combining existing lists of plant species into a harmonized dataset. The book by Groves et al. (2003), commissioned by the Department of Environment and Heritage in 2000 and the Bureau of Rural Sciences in 2001, was compiled by 14 plant specialists from all the States and Territories of Australia with high-level expertise in taxonomy and botany. Naturalized exotic plants were defined as species or infra-species that have been introduced, become established and that reproduce naturally in the wild, without human intervention, consistent with descriptions in .
The book by Randall (2007), which not only provides a comprehensive list of all exotic plant species introduced to Australia, but also identifies those that have become naturalized somewhere in Australia, was a publication of the CRC for Australian Weed Management and represents a development of A Global Compendium of Weeds (Randall 2002), a major dataset of all weedy flora of the world. Both GD and RD source datasets represent the results of years of meticulous botanical work.

Step 2: Preparing a working list
Protocol. All taxa from the source datasets are placed in an initial working list that is a precursor to the FD. Some taxa will be present more than once in the working list under exactly the same name when source datasets are merged. These repeat entries are kept in the working list at this stage with their different source titles.
Naturalized Asteraceae. There were a total of 537 taxa of naturalized Asteraceae in Australia in the working list resulting from the merging of GD and RD.
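Step 2 amounts to pooling names without removing repeats. The following Python sketch shows one possible way to do this. The function name `build_working_list` and the three-name toy lists are illustrative assumptions, although the names used are ones that the text places in GD or RD.

```python
def build_working_list(sources):
    """Step 2: pool every name from every source, keeping repeat entries.

    `sources` maps a source title (e.g. "GD") to its list of taxon names.
    Returns a list of (taxon name, source title) pairs.
    """
    return [(name, title) for title, names in sources.items() for name in names]


# Toy input only: three names per source, not the full GD and RD lists.
toy_sources = {
    "GD": ["Facelis retusa", "Cnicus benedictus", "Xanthium italicum"],
    "RD": ["Facelis retusa", "Cnicus benedictus", "Xanthium strumarium"],
}
working_list = build_working_list(toy_sources)

# Repeat entries survive this step, each with its own source title.
for name, title in working_list:
    print(f"{name} [{title}]")
```

Keeping the source title on every entry is what later allows each name in the CD to be traced back to the list it came from.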

Step 3: Managing clones
Protocol. Clones are repeat, completely identical entries of a taxon name from more than one source dataset. Once all clones have been identified, their occurrence in the working list is reduced to a single-name entry for each cloned taxon. Each cloned taxon is placed in the CD and checked off against the clone category (Fig. 1), retaining all source titles for each taxon. This step is important for record keeping as it provides an initial evaluation of consistency among source datasets (Chapman 2005).
Naturalized Asteraceae. There were 209 clones across the 328 unique taxa derived from both source datasets. This translates to 76.6% of the 273 taxa in GD and 79.2% of the 264 taxa in RD that were initially common to both datasets, leaving 64 taxon names found only in GD and 55 taxon names found only in RD.
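The clone step and the counts reported above can be sketched and checked as follows. This is a minimal Python illustration under the assumption that a clone is an exact string match between sources. The function name `collapse_clones` and the toy list are hypothetical, while the counts 273, 264 and 209 are those given in the text.

```python
from collections import defaultdict


def collapse_clones(working_list):
    """Step 3: reduce repeat entries to one entry per name, keeping all titles.

    Returns (unique, clones): `unique` maps each name to the sorted tuple of
    source titles it came from; `clones` is the set of names found in more
    than one source.
    """
    titles_by_name = defaultdict(set)
    for name, title in working_list:
        titles_by_name[name].add(title)
    unique = {name: tuple(sorted(t)) for name, t in titles_by_name.items()}
    clones = {name for name, t in unique.items() if len(t) > 1}
    return unique, clones


toy = [("Facelis retusa", "GD"), ("Facelis retusa", "RD"),
       ("Xanthium italicum", "GD"), ("Xanthium strumarium", "RD")]
unique, clones = collapse_clones(toy)

# Arithmetic check of the counts reported for the case study.
gd, rd, n_clones = 273, 264, 209
n_unique = gd + rd - n_clones            # distinct names left after step 3
share_gd = round(100 * n_clones / gd, 1)  # percentage of GD names that are clones
share_rd = round(100 * n_clones / rd, 1)  # percentage of RD names that are clones
```

Exact matching is deliberately strict: misspellings and lexical variants are not treated as clones here, because they are only reconciled in step 5.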

Step 4: Identifying new taxa
Protocol. This step ensures that the FD contains all taxa currently known to occur either within a target region (sensu Pyšek et al. 2004) or in clades that are the focus of study. Taxa new to a study region (e.g. newly discovered natives, recent introductions of exotics or non-endemic natives) and recently described taxa, since the time when the source datasets were originally constructed, need to be identified. Each new taxon is attached to a unique source title to keep track of the origin of the taxon name and placed in the FD. The names of these new taxa and their source titles are also placed in the CD and checked off against the new category (Fig. 1).
Naturalized Asteraceae. To gather information about newly naturalized taxa in the Asteraceae in Australia since the compilation of the two source datasets, we conducted a literature search of publications from the Australian state herbaria and botanical gardens including Austrobaileya, Cunninghamia, Telopea, Muelleria, Journal of the Adelaide Botanic Gardens and Nuytsia. These journals periodically publish lists and records of plants newly recorded or identified as naturalized within Australia. We located three sources documenting new naturalizations in Australia: Hosking et al. (2007), Hosking et al. (2011) and Parsons (2012).

Step 5: Correcting taxon names
Protocol. This step requires careful scrutiny of taxon names in the working list to ensure that taxa are represented with their currently accepted and correct names. How difficult a task this is will ultimately depend on the availability of up-to-date taxonomic information via sources such as publications, online datasets and tools, detailed herbarium records, and taxonomists and their expertise. The guiding principle when updating taxa with their currently accepted names is to adopt a taxonomic system that provides an accepted, current authority in the jurisdiction of interest. Where no single authoritative source is available and competing taxonomies exist, researchers will need to make a choice and be explicitly clear about their taxonomic choices. This step in the process also corrects misspellings and lexical variants (i.e. different ways of writing the same name), and misapplications (where an incorrect name has mistakenly been given to a taxon), with any corrected taxon names checked in case they are clones of taxa already in the working list (step 3), to ensure that clones are limited to single-name entries. In some cases, it might be helpful to make use of automated recognition and correction tools for plant taxonomy, such as TaxonStand (Cayuela et al. 2012), the TNRS (Boyle et al. 2013) and taxonlookup (Pennell et al. 2016). If such tools are implemented, the version used must be carefully documented as these tools are also reliant on their underpinning sources of taxonomic information being maintained and kept up-to-date.
One of the most difficult taxonomic cleaning tasks is dealing with synonymy. In taxonomy, a synonym is an old, no longer accepted scientific name that applies to a taxon that is now recognized by a new, currently accepted scientific name. Homotypic synonyms are problematic when assembling a dataset from multiple source datasets, as the inclusion of two or more names that refer to the same taxon (i.e. two or more names given to the same type specimen) leads to pseudo-replication in the dataset and thus problems with subsequent analyses and conclusions. Heterotypic synonyms consist of different names for different type specimens, which were all at one point considered distinct taxa, but which have now been lumped into the one taxon. Heterotypic synonymy needs to be resolved not only because the single, up-to-date taxon could have a broader geographic range than its constituent synonyms (an important distinction for macroecological studies of range size variation), but also because variation in life-history and ecological traits will probably be greater for the wider ranging up-to-date taxon (an important detail for comparative studies of life-history variation). It is also important to identify and correct any homonyms in the working list, that is, names that are identical in spelling but belong to different taxa, as well as any misapplications (i.e. where a taxon has been incorrectly identified). Once all issues of synonymy have been identified, the single currently accepted name of a taxon is retained in the working list and non-current or misapplied names are excluded from the working list and placed in the CD and checked off against the synonym category (Fig. 1). Source titles are retained for each taxon with specific notes kept on the link that each synonymous taxon has to its currently accepted name in the working list, remembering that the working list becomes the FD at the end of the process.
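The bookkeeping for synonyms can be sketched as follows, using the Xanthium and Cnicus examples from the case study. This Python sketch assumes that a lookup from old name to currently accepted name has already been compiled by hand from an authoritative source; the function name `resolve_synonyms`, the dictionary layout, and the choice to leave the source titles empty for an accepted name that appears in no source list are all illustrative assumptions.

```python
def resolve_synonyms(working, accepted_name_of):
    """Replace non-current names by their accepted names.

    `working` maps each name to the set of source titles it came from.
    `accepted_name_of` maps an old name to its currently accepted name.
    Returns (final, companion): `final` maps accepted names to the titles of
    the sources that listed that accepted name itself; `companion` lists the
    excluded names, each linked to its accepted name in a comment.
    """
    final, companion = {}, []
    for name, sources in working.items():
        accepted = accepted_name_of.get(name, name)
        final.setdefault(accepted, set())
        if accepted == name:
            final[name].update(sources)
        else:
            companion.append({
                "name": name,
                "category": "synonym",
                "sources": sorted(sources),
                "comment": f"synonym of {accepted}",
            })
    return final, companion


working = {
    "Xanthium cavanillesii": {"GD"},
    "Xanthium italicum": {"GD"},
    "Xanthium occidentale": {"GD"},
    "Xanthium orientale": {"GD"},
    "Xanthium strumarium": {"RD"},
    "Cnicus benedictus": {"GD", "RD"},
}
accepted_name_of = {
    "Xanthium cavanillesii": "Xanthium strumarium",
    "Xanthium italicum": "Xanthium strumarium",
    "Xanthium occidentale": "Xanthium strumarium",
    "Xanthium orientale": "Xanthium strumarium",
    "Cnicus benedictus": "Centaurea benedicta",
}
final, companion = resolve_synonyms(working, accepted_name_of)
```

In this toy run, six names collapse to two accepted names, which illustrates how unresolved synonymy inflates taxon counts.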
It may become apparent that source datasets have chosen a different approach in relation to infra-species epithets. For example, a taxon might be represented with a [genus + species] name in one source dataset, but represented with [genus + species + infra-species] name in another (and in some cases both might be included). Sometimes, in checking the up-to-date names of such taxa, both names are considered to be current. An approach for dealing with infra-species in dataset assembly is to decide at the outset whether to include infra-species epithets across the whole working list, or if not, to pool infra-species into a [genus + species] name where appropriate. The latter approach can perhaps be used to deal with 'difficult' taxonomic groups where there are unresolved taxonomic issues. This pooling approach, however, can have disadvantages. Pooling infra-species into one larger taxon ignores potentially important differences among infra-species in their geographic distribution, life history, physiology and ecology. We suggest that where possible, infra-species are included in the working list. In such cases, the [genus + species] name that is not used is placed in the CD and checked off against the infra-species category and only the [genus + species + infra-species] name is retained in the working list with the relevant source title (Fig. 1). Where infra-species are not recognized, then [genus + species + infra-species] names are placed in the CD and checked off against the infra-species category, and the [genus + species] names are placed on the working list. The infra-species category provides the opportunity to contrast patterns emerging from the FD in analyses with and without infra-species if desired.
Some taxa may need to be removed from the working list, placed in the CD and checked off against a problem category (Fig. 1). These are either taxa for which there is current uncertainty regarding the correct name that should be used for the taxa in question or taxa whose names cannot be officially verified.
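The infra-species rule suggested above (retain [genus + species + infra-species] names and move the bare [genus + species] name to the CD) could be sketched as follows. This Python sketch assumes that all names passed in have already been confirmed as currently valid; the function name `prefer_infraspecies` and the tuple layout are hypothetical.

```python
def prefer_infraspecies(names):
    """Keep infra-species names; move the matching bare binomial to the CD.

    `names` is an iterable of (genus, species, infra_marker, infra_name)
    tuples, with empty strings where no infra-species is given.
    Returns (kept, companion), both sorted lists of tuples.
    """
    names = set(names)
    # Binomials that also occur with at least one infra-species epithet.
    with_infra = {(g, s) for g, s, _marker, infra in names if infra}
    kept, companion = [], []
    for name in sorted(names):
        genus, species, _marker, infra = name
        if not infra and (genus, species) in with_infra:
            companion.append(name)   # checked off against "infra-species"
        else:
            kept.append(name)
    return kept, companion


names = [
    ("Chrysanthemoides", "monilifera", "", ""),                # RD
    ("Chrysanthemoides", "monilifera", "ssp.", "monilifera"),  # GD
    ("Chrysanthemoides", "monilifera", "ssp.", "rotundata"),   # GD
    ("Facelis", "retusa", "", ""),
]
kept, companion = prefer_infraspecies(names)
```

A study that does not recognize infra-species would invert the rule, pooling the infra-species entries into the binomial instead.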
Naturalized Asteraceae. We used the Australian Plant Name Index (APNI, http://www.anbg.gov.au/apni/) and the Australian Plant Census (APC, http://www.chah.gov.au/apc/about-APC.html) to determine currently accepted names for all taxa in our working list. The system of nomenclature adopted for APC is endorsed by the Council of Heads of Australasian Herbaria (CHAH), while APNI is maintained by the Australian National Botanic Gardens in collaboration with the Centre for Australian National Biodiversity Research and the Australian Biological Resources Study.

Step 6: Confirming occurrence in target region
Protocol. If a research goal is to include all taxa within a specific geographic region, then taxa in the working list are verified for their occurrence within that target region. This step may also include the requirement that taxa are identified as native or exotic to the region. Official plant censuses and herbarium records curated and maintained by national herbaria or botanic gardens, among other sources of reliable information, can be inspected closely to provide such verification. Ground truthing in the field may be required if there is real uncertainty about the occurrence of taxa in the region.
Taxa are removed from the working list, placed in the CD and checked off against the non-region category if there are no verified records of them in the target region (Fig. 1). This can happen, for instance, when specimens collected well outside the region are kept in herbaria and then those records are incorrectly entered into distributional datasets for the region which are then used as source datasets in dataset assembly.
Taxa are removed from the working list and placed in the CD and checked off against the island category if they are not found in the mainland target region, but are found on nearby external islands (Fig. 1). It is desirable to keep such taxa separate from those in the non-region category, as it might be argued for some studies, for instance, that it is important to perform analyses with and without nearby island species. For example, taxa in the island category might be excluded if seeking to identify those taxa that have naturalized within a mainland study region. These taxa might be included, however, if the goal is to identify taxa that have penetrated broader national biosecurity and quarantine systems where the island is considered part of the nation.
Taxa that only occur in the target region because they have been cultivated there, and which do not occur naturally in the wild, are removed from the working list, placed in the CD and checked off against the cultivated category (Fig. 1).
If a study is focused specifically on taxa native to the region, then exotic taxa are excluded from the working list and placed in the CD and checked off against the residence category (Fig. 1). If a study is about exotic taxa, then native taxa are excluded and placed in the residence category. Alternatively, this category need not be included in the CD, but rather a separate column distinguishing native from exotic taxa can be included in the FD if comparisons between natives and exotics are desired in the study.
Naturalized Asteraceae. We used APNI and APC to determine non-region, island and cultivated taxa or native residency of taxa in Australia that would exclude them from the FD. If a name was not found in APNI, which provides a comprehensive record of every scientific plant name in taxonomic literature concerning Australia, this meant that the name had not been used in the scientific literature as referring to a taxon occurring within Australia. If a name was excluded from APC, this meant that the name was not considered by CHAH to be in Australia. We then scrutinized herbarium records in Australia's Virtual Herbarium (AVH, www.avh.chah.org.au) to seek further evidence of occurrence of species in Australia. The AVH resource is maintained by CHAH and provides online access to Commonwealth, State and Territory herbarium records. These records provide important information on the date and location of collection and on whether specimens were obtained overseas, from islands or cultivated plants, or from plants occurring in natural habitats.
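Once the provenance of each herbarium record has been judged by hand, assigning a step 6 category is mechanical. The Python sketch below shows one way this could be done. It is not the AVH data format: the record fields `where` and `cultivated`, the function name `occurrence_category`, and the order of precedence among categories are assumptions made for illustration.

```python
def occurrence_category(records):
    """Return the step 6 CD category for a taxon, or None if it stays in.

    Each record is a dict with hypothetical fields:
      "where":      "mainland", "island" or "overseas"
      "cultivated": True if the specimen came from a cultivated plant
    """
    # Any wild mainland record confirms occurrence in the target region.
    if any(r["where"] == "mainland" and not r["cultivated"] for r in records):
        return None
    # Mainland records exist, but only from cultivated plants.
    if any(r["where"] == "mainland" for r in records):
        return "cultivated"
    # Not on the mainland, but recorded on a nearby island.
    if any(r["where"] == "island" for r in records):
        return "island"
    # No verified records in the target region at all.
    return "non-region"


# Record sets mirroring the examples in Table 1.
brachylaena = [{"where": "overseas", "cultivated": False},
               {"where": "overseas", "cultivated": False}]
picris = [{"where": "island", "cultivated": False}]

print(occurrence_category(brachylaena))
print(occurrence_category(picris))
```

The judgement about where a specimen was actually collected remains manual; only the final category assignment is automated here.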

Step 7: Verifying ecological status
Protocol. Dataset assembly often requires a final clean so that only taxon names with a particular ecological status or statuses, related to their distribution and abundance within the target region, are included. These might include, for example, datasets comprising taxa classified as naturalized, invasive, declining, or threatened. We have included this step in the taxonomic cleaning process because this is a particular area where taxonomy and ecology overlap considerably and they should not be considered separately (Graham et al. 2004, Wheeler 2004, Halme et al. 2015). The definition of ecological status in the source datasets must be clear and should preferably comply for the most part with published and widely adopted descriptions. In the field of invasion ecology, for instance, there are widely adopted schemes for consistent terminology (e.g. ). Only if these definitions are similar should source datasets be put through the process simultaneously. If two or more source datasets differ substantially in their classification schemes, and these differences cannot be resolved, it is advisable to treat the datasets independently and put them through the process separately to produce two separate FDs. For example, species invasiveness might be determined as level of impact in one source and as rate of spread and geographic range size in another, and it is important that these two definitions of invasiveness are not considered the same. If a taxon name does not have the appropriate ecological status, it is excluded from the working list, placed in the CD and checked off against a status category (Fig. 1). If more than one ecological status is assessed, then separate columns are included in the CD representing each status. As an alternative, this category need not be included in the CD, but rather a separate column distinguishing the status of each taxon can be included in the FD if comparisons between or among statuses are the focus of the study (e.g. the study seeks to compare rare and common taxa in the dataset).
Naturalized Asteraceae. The naturalized status of each taxon in Australia was reviewed by carefully examining source datasets in conjunction with APC, APNI and AVH. In particular, the APC states clearly if taxa are doubtfully naturalized, and we excluded those taxa from the FD.
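The status check in step 7 can be sketched as a simple filter over the working list. The following is a minimal illustration under stated assumptions, not the implementation used in this study: the function name `verify_status`, the record keys and the status labels are hypothetical, and the three example taxa are names mentioned elsewhere in this paper, used here only to show the mechanics.

```python
def verify_status(working_list, accepted_statuses):
    """Split a working list into retained taxa and companion-dataset (CD) rows.

    working_list: list of dicts with hypothetical keys 'name' and 'status'.
    accepted_statuses: set of status labels that qualify for the final dataset.
    """
    retained, companion_rows = [], []
    for taxon in working_list:
        if taxon["status"] in accepted_statuses:
            retained.append(taxon)
        else:
            # Excluded taxa are not discarded: they move to the CD with the
            # status category checked off and a comment giving the reason.
            companion_rows.append({
                "name": taxon["name"],
                "status_category": True,
                "comment": f"excluded: status recorded as '{taxon['status']}'",
            })
    return retained, companion_rows


# Illustrative records only; the status labels are hypothetical strings.
working = [
    {"name": "Ambrosia artemisiifolia", "status": "naturalized"},
    {"name": "Pentzia globosa", "status": "naturalized"},
    {"name": "Cichorium endivia", "status": "doubtfully naturalized"},
]
kept, cd_rows = verify_status(working, {"naturalized"})
print([t["name"] for t in kept])
print([r["name"] for r in cd_rows])
```

Keeping the excluded names in a separate structure, rather than deleting them, mirrors the role of the CD: every exclusion remains traceable to a category and a comment.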

Step 8: Producing the final and companion datasets
Protocol. The working list at this stage of the process becomes the FD of taxa linked to the CD. The FD has now been cleaned and is the primary, up-to-date inventory of species that can be used with confidence and transparency in dataset studies. In both the FD and CD, it is important that the language and terminology used in the comments columns are consistent, for ease of use when cross-walking the datasets.
Naturalized Asteraceae. The FD is presented in Suppl. material 1 and the CD is presented in Suppl. material 2.

Summary patterns in the FD and CD
The FD of naturalized Asteraceae in Australia contained 257 taxa. Four of these taxa (1.6%) were new, recorded as naturalized in Australia since the publication of the source datasets. There were 278 taxa in the CD. A total of 173 taxa (67.3% of the FD) were clones across the FD and CD with the same currently accepted name in both source datasets. There were 54 taxa (21.0%) in the FD that were either found only in GD (23 taxa, 8.9%) or only in RD (31 taxa, 12.1%) under their currently accepted name. Thus, a total of 227 taxa (88.3%) in the FD were unchanged from the source datasets. A total of 26 updated names (10.1%) not found in GD or RD were included in the FD.
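The composition of the FD reported above can be checked with a few lines of arithmetic. This sketch uses only the counts stated in this section; the dictionary labels and variable names are ours and purely illustrative.

```python
FD_TOTAL = 257  # taxa in the final dataset

# Counts as reported in the text; labels are illustrative.
components = {
    "in both GD and RD (clones)": 173,
    "only in GD": 23,
    "only in RD": 31,
    "updated names not in GD or RD": 26,
    "newly naturalized taxa": 4,
}

# The five groups should partition the FD exactly.
assert sum(components.values()) == FD_TOTAL

percentages = {label: round(100 * n / FD_TOTAL, 1) for label, n in components.items()}
for label, pct in percentages.items():
    print(f"{label}: {components[label]} taxa ({pct}%)")

# Taxa carried over unchanged from the source datasets.
unchanged = (components["in both GD and RD (clones)"]
             + components["only in GD"]
             + components["only in RD"])
unchanged_pct = round(100 * unchanged / FD_TOTAL, 1)
print(f"unchanged from sources: {unchanged} taxa ({unchanged_pct}%)")
```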

A walk-through of the taxonomic cleaning process
The source datasets GD and RD were selected (step 1, Fig. 1) with the working list containing 537 taxon names after their merger (step 2, Fig. 1). Management of clones led to the removal of 209 duplicate taxon names (e.g. Ambrosia artemisiifolia) leaving 328 distinct taxon names in the working list (step 3, Fig. 1). We added 4 new taxa (e.g. Pentzia globosa) resulting in 332 taxa in the working list (step 4, Fig. 1). A total of 88 taxon names were excluded as they were either problematic (e.g. Chrysocoma coma-aurea); they were [genus + species] names that were replaced with valid [genus + species + infra-species] names (e.g. Chrysanthemoides monilifera in RD was excluded and Chrysanthemoides monilifera ssp. monilifera and Chrysanthemoides monilifera ssp. rotundata in GD were included); and/or they were old synonyms that required updating with currently accepted names (step 5, Fig. 1, e.g. four taxon names in GD, Xanthium cavanillesii, Xanthium italicum, Xanthium occidentale, Xanthium orientale were excluded and the currently accepted name Xanthium strumarium in RD was included). In some cases during this step, the currently accepted names or [genus + species + infra-species] names appeared in one or both of GD and RD. For example, Cineraria lyrata in RD was updated to Cineraria lyratiformis which appeared in both GD and RD (Suppl. material 2). In other cases, the old synonyms were replaced with a total of 26 updated names that were not in the source datasets (e.g. Oligocarpus calendulaceus was included and its synonym Osteospermum calendulaceum in GD and RD was excluded).
At the end of step 5, there were 270 taxa in the working list. Five taxa were found not to be present in Australia (e.g. Gazania serrata) and their removal left 265 taxa in the working list (step 6, Fig. 1). Eight taxa were identified as doubtfully naturalized (step 7, Fig. 1, e.g. Cichorium endivia) and their removal left 257 taxa in the working list which became the FD (step 8, Fig. 1). Among these eight taxa, Brachylaena discolor was excluded both because its two herbarium records were collected overseas and because it is considered doubtfully naturalized, while Picris hieracioides was excluded because it does not occur on mainland Australia, because its presence on an external island (Norfolk Island) is questionable due to misidentification, and because it is also considered doubtfully naturalized.
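The running size of the working list through the walk-through can be reproduced as a simple tally. The sketch below uses only the counts reported in this section. Splitting step 5 into an exclusion and an addition is our reading of the text, and the step labels are illustrative.

```python
# (label, change in number of names) for each stage of the walk-through.
steps = [
    ("step 2: merge GD and RD", +537),
    ("step 3: remove duplicate names", -209),
    ("step 4: add newly naturalized taxa", +4),
    ("step 5: exclude problematic names, superseded names and synonyms", -88),
    ("step 5: add updated accepted names absent from the sources", +26),
    ("step 6: remove taxa not present in Australia", -5),
    ("step 7: remove doubtfully naturalized taxa", -8),
]

running = 0
trace = []
for label, change in steps:
    running += change
    trace.append((label, running))
    print(f"{label}: {change:+d} -> {running}")

final_dataset = running          # size of the FD
merged_only = trace[1][1]        # size after de-duplication alone (step 3)
overestimate = merged_only - final_dataset
overestimate_pct = round(100 * overestimate / final_dataset, 1)
print(f"overestimate without full cleaning: {overestimate} taxa ({overestimate_pct}%)")
```

The last two lines correspond to the comparison between a merely de-duplicated list and the fully cleaned FD.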

Discussion
Several outcomes of our dataset assembly of naturalized Asteraceae in Australia demonstrate how critical it is to implement taxonomic cleaning. Although our study only dealt with a few hundred taxa, the outcomes of the study have direct implications for even bigger data studies involving thousands of taxa. First, the cleaned dataset contained 257 taxa. Had the cleaning protocol not been implemented, and a dataset constructed simply by merging the two source datasets (with just the straightforward removal of duplicate names), the assembled dataset would have contained 328 taxa. This equates to a considerable and unacceptable overestimate of taxon richness of naturalized Asteraceae in Australia by 71 taxa (27.6%). Such a high level of taxonomic inaccuracy is especially unsuitable for comparative plant studies that require accurate representations of phylogenetic relationships (Gotelli 2004). Second, any taxonomic cleaning process must account not just for nomenclatural issues (step 5), it must also include careful scrutiny of the occurrence (step 6) and ecological status (step 7) of each taxon. Had we not manually inspected the actual distributional records of each taxon, the assembled dataset would have contained 270 taxa, an overestimate of taxon richness by 13 taxa. Third, where there is any reasonable gap in time between dataset assembly and the construction of the source datasets, the literature must be scoured for evidence of new taxa that need to be added to the dataset (step 4). While in our case, this involved searching for and finding four recently naturalized taxa in Australia, in other cases, this might include newly described taxa within study clades. Implementation of our cleaning protocol has also demonstrated that it is unlikely that a reliance on automated processes for cleaning will be all that is required to completely clean and prepare datasets. 
Indeed, previous work has described data cleaning and taxonomic scrutiny of Big Data as 'intelligent processes' (Chavan 2007), requiring the involvement of skilled individuals with taxonomic expertise to be fully effective. Kooyman et al. (2012) pointed out that even after automated taxonomic cleaning, each taxon in an assembled dataset must be individually inspected to ensure all taxonomic inaccuracies have been dealt with. While there have been recent efforts to automate the most time-consuming process (step 5) of cleaning (Cayuela et al. 2012, Boyle et al. 2013, Pennell et al. 2016), coordination of effort across global or even more regional scales to combine automation of step 5 with steps 4 (new taxa), 6 (confirming occurrence) and 7 (verifying status) in particular will be much harder to achieve in the foreseeable future, and these steps will for some time require human vetting and expertise. This is especially so for step 6, as distributional records need to be inspected. With these records, there is often much detail and little consistency in how comments and notes are provided, making efforts to establish an automated process rather difficult. In addition, it is critical to understand that automated approaches are still reliant on the sources used for nomenclatural cleaning being regularly maintained and updated to reflect current taxonomic knowledge. Unfortunately, the approach to ensuring currently valid names are used has generally been haphazard in broader curatorial practice (Costello and Wieczorek 2013), but there is much scope for it to become more systematic as these datasets grow and receive more attention based on their value in the age of Big Data (Zermoglio et al. 2016).
The number of clones in the FD, taxa found in both GD and RD under their currently accepted names, was moderately high (67%). This is probably unsurprising given the meticulous nature with which the source datasets were constructed. Nevertheless, the differences between the two source datasets point to issues that need to be considered when merging datasets. For instance, the 21% of taxa in the FD that were either found only in GD or only in RD under their currently accepted name demonstrate that using more than one source dataset when possible is likely to lead to a higher number of relevant taxa in the FD and that disparate source datasets are likely to differ in their taxonomic content (e.g. Hulme and Weser 2011). At this stage, it is unclear why our two source datasets each contained taxa that the other did not. It is also interesting to note that in nearly ten years since the publication of the latest dataset (RD), 26 updated taxon names needed to be inserted into the FD with the removal of 88 other names for issues related to synonymy, infra-species epithets and problematic circumstances. These numbers are not insignificant and indicate that even in a short period of time, taxonomy is incredibly dynamic.
A key strength of the protocol presented in this paper is that it presents a simple step-by-step approach for taxonomic cleaning that can easily be adopted by nonspecialists who are assembling a plant ecological dataset, perhaps for the first time. In addition, it systematically coordinates steps in a way that especially targets the construction of plant ecological datasets, particularly because it includes ecological aspects (i.e. occurrence, status) and the need to search the most up-to-date sources for taxa new to study regions (if a target area approach is used) or taxonomic clades (if a broader comparative study is involved). Further detailed descriptions of taxonomic cleaning can be obtained by consulting sources such as Chapman (2005) and Mathew et al. (2014). The production of both a final dataset and a companion dataset via our protocol makes it very clear that we believe it important to be transparent about not only which taxa are included in a study, but also about those taxa not included and the reasons for their exclusion. The recent retraction of a published paper from the journal Biology Letters (Hanna and Cardillo 2014) on the grounds that the ecological dataset contained substantial errors lends weight to the argument of transparency in dataset presentation. Analysis of a revised dataset has produced considerably different outcomes compared with the original study, and will lead to a new publication in a different journal (Retraction Watch at http://retractionwatch.com/2016/12/27/errorladen-database-kills-paper-extinction-patterns/). This is the first botanical study that details the types and amounts of taxonomically related errors that arise when source datasets are merged to assemble an ecological dataset. A small number of studies, however, have begun to empirically address the issue of taxonomic reliability in the sorts of large datasets available for use in large dataset studies in animal ecology. Zermoglio et al. (2016), for example, analysed 1000 scientific names taken at random from VertNet, an aggregator of vertebrate biodiversity data from natural history collections. They found that less than 47% of names were currently valid. Our cleaning protocol removed 27% (88 out of the 328 distinct taxon names remaining after step 3) based on similar nomenclatural issues. Although this percentage is not as high as that reported in Zermoglio et al. (2016), it still represents the highest number of taxa requiring attention (excluding the removal of duplicate names). The high prevalence of synonymy is not surprising as this type of issue is the most difficult and time consuming to solve (Zermoglio et al. 2016). In this context, consultation with specialist taxonomists is highly desired (Gotelli 2004). However, such expertise has become less available in recent times (Wheeler 2014). In the long term, this problem will best be solved by increasing the taxonomic expertise of ecologists building and using datasets containing large numbers of species. Thus, our study provides further evidence to support calls for continued investment in plant systematics and the representation of taxonomy and botany in university curricula (Wheeler et al. 2012, Pyšek et al. 2013, Bebber et al. 2014, Wheeler 2014, Deng 2015).

Conclusions
Big data can be used effectively in a targeted way in ecological studies to address major scientific and societal problems (Hampton et al. 2013). However, the value of any analysis of large ecological datasets depends on the quality of the underlying taxonomic data (Valdecasas and Camacho 2003). The challenge is that taxonomic cleaning during dataset assembly is an incredibly difficult task. This difficulty, compounded by the global decline of taxonomic expertise, leads to situations where ecological datasets are often used without much attention being given to the quality of the underlying taxonomic data (Maldonado et al. 2015). This is concerning because a lack of appropriate taxonomic consideration can have serious impacts on the robustness of outcomes from large dataset studies (Jansen and Dengler 2010, Duarte et al. 2014, Zermoglio et al. 2016). The protocol we have presented here is helpful because it brings together an integrated management plan that combines usually disparate elements of dataset assembly which are not always considered together in a systematic way for plant ecological datasets. Our study has clearly shown that ignoring the critical process of taxonomic cleaning can lead to serious dataset problems that will likely lead to incorrect ecological conclusions.
draft of the manuscript. PP was supported by project no. 14-36079G Centre of Excellence PLADIAS (Czech Science Foundation), long-term research development project RVO 67985939 and Praemium Academiae award (The Czech Academy of Sciences).

Biotic constraints on the establishment and performance of native, naturalized, and invasive plants in Pacific Northwest (USA) steppe and forest

Abstract
Factors that cause differential establishment among naturalized, invasive, and native species are inadequately documented, much less often quantified among different communities. We evaluated the effects of seed addition and disturbance (i.e., understory canopy removal) on the establishment and seedling biomass among two naturalized, two invasive, and two native species (1 forb, 1 grass in each group) within steppe and low elevation forest communities in eastern Washington, USA. Establishment within each plant immigrant class was enhanced by seed addition: naturalized species showed the greatest difference in establishment between seed addition and no seed addition plots; native and invasive species establishment also increased following seed addition, but not to the same magnitude as that of naturalized species. Within seed addition plots, understory canopy disturbance resulted in significant increases in plant establishment (regardless of plant immigration class) relative to undisturbed plots, and the magnitude of this effect was comparable between steppe and adjacent forest. However, regardless of disturbance treatment, fewer invasive plants established in the forest than in the steppe, whereas native and naturalized plant establishment did not differ between the habitats. Individual biomass of naturalized species was consistently greater in disturbed (canopy removed) versus undisturbed control plots, and naturalized species were also larger in the steppe than in the forest at the time of harvest. Similar trends in plant size were observed for the native and invasive species, but the differences in biomass for these two immigration classes between disturbance treatments and between habitats were not significant. We found that strong limitation of non-native species is correlated with intact canopy cover within the forest understory, likely driven by the direct or indirect consequences of low light transmittance through the arboreal and understory canopy.
Considered collectively, our results demonstrate how seed limitation and intact plant ground cover can limit the abundance and performance of naturalized species in Pacific Northwest steppe and low elevation forest, suggesting that local disturbance in both habitats creates microsites for these species to establish and survive. Future studies evaluating interactions between multiple barriers to establishment using more representatives from each immigration class will further reveal how biotic interactions ultimately influence the demography and distribution of non-native plants within these communities.

Keywords
Disturbance, seed limitation, biotic resistance, competition, mesic steppe, coniferous forest, seedling establishment, seedling performance

Introduction
Naturalizations form the small fraction of those introduced species that have surmounted demographic and local environmental barriers to develop self-sustaining populations, but unlike invaders, naturalized species do not inevitably proliferate within the novel habitat. Limitation of naturalized species in their abundance and geographic range may result from demographic restrictions, dispersal limitations, abiotic constraints, or trophic interactions in the novel range (Davis 2009, Richardson and Pyšek 2012, Connolly et al. 2014). Furthermore, naturalized species often establish more readily and have higher fitness in disturbed habitats (MacDonald and Kotanen 2010, Maron et al. 2012, Maron et al. 2013), suggesting that competition for microsites may be a major determinant of plant naturalization (Going et al. 2009, Kempel et al. 2013). However, despite the role of naturalizations as precursors to invasions, we know surprisingly little about how demographic, physical, and biotic factors interact within a novel range to curb, delay or prevent naturalized species from becoming invasive (van Kleunen et al. 2010, Richardson and Pyšek 2012). The physical and biotic factors governing plant establishment are frequently quantified (e.g., Mack and Pyke 1984, Pyke 1986, Weiher and Keddy 1999, Myers and Harms 2009), but the effect of these factors on the fate of naturalized species compared to the fate of co-occurring invaders is unclear (van Kleunen et al. 2010). We compare and contrast here the effects of two factors, disturbance and seed limitation, on the establishment and subsequent performance of native, naturalized, and invasive species between two community types that differ radically in invasion history. The proliferation of many temperate plant species is limited by seed recruitment (Turnbull et al. 2000, Clark et al. 2007), suggesting that low abundance of naturalized species or poor dispersal ability may be related directly to low seed availability. Introduced species also likely differ in their tolerance of highly competitive environments; for example, the recruitment of naturalized species may be more strongly limited by native canopy cover than that of co-occurring invaders. Consequently, disturbance by the removal of competitors can differentially influence the establishment of introduced plants (Gross et al. 2005), but the specific response of invasive vs. naturalized species to this disturbance is unclear.
The potential for a species' immigrants to naturalize and the descendants to invade can also vary by habitat (Rejmanek et al. 2005, Richardson and. The dominance of non-native species can vary enormously among habitats in novel ranges, a relationship often described largely as a reflection of the response of introduced plants to the climate of their new habitat (Alpert et al. 2000). The availability of microsites and the severity of interspecific competition, however, will be functions of resource availability (Rejmanek et al. 2005, Chytrý et al. 2008), and species naturalized or invasive in resource-rich habitats may be rare or excluded in adjacent habitats that lack critical resources (e.g., Huenneke et al. 1990). For example, low light transmittance through the forest canopy and understory can be a major barrier prohibiting many non-native species, particularly grasses, from invading forests (Pierson and Mack 1990, Brothers and Spingarn 1992, Martin et al. 2009), but shade may not inhibit the establishment of other non-natives (e.g. Microstegium vimineum, Martin et al. 2009, Flory 2010). Recruitment of non-native species is often much greater when seed additions co-occur with disturbance of the forest understory or overstory (Pierson and Mack 1990, Dodson and Felder 2006). To date, however, no comprehensive evaluation has been assembled of the effect of understory canopy disturbance on the concurrent establishment rates and performance of naturalized versus invasive species in forests and adjacent grasslands.
Meadow steppe and adjacent coniferous forest in eastern Washington (USA) have experienced markedly different levels of plant invasion. Non-native grasses and forbs are prevalent in steppe (Daubenmire 1970, Mack 1986) but infrequent in adjacent coniferous forests (Daubenmire and Daubenmire 1968, Parks et al. 2005). When the understory is removed, seedling establishment of some non-native species is not, however, otherwise limited by differences between these communities (Connolly 2013). Additionally, preferential granivory partially explains differences in the abundance of naturalized and invasive species within the steppe (Connolly et al. 2014) but fails to account for the low abundance of non-native species in these forests. To an unquantified degree, the realized distribution of native, naturalized, and invasive species within the steppe and forest communities may be a function of seed limitation and the ability of species to persist in undisturbed habitat (Pierson and Mack 1990).
We examined the effect of seed addition and local disturbance (i.e., removal of all plant material <1.5 m above the ground) on the establishment and performance of native, naturalized, and invasive species in meadow-steppe and forest habitats in eastern Washington (USA) as part of a multi-pronged investigation of the forces that restrict/enhance naturalization (Connolly 2013, Connolly et al. 2014). Our objectives were to 1) quantify the severity of seed limitation for a set of representative native, naturalized, and invasive species, 2) evaluate how disturbance of the understory canopy cover influenced recruitment and performance of each class of immigrant, and 3) evaluate the effect of these factors within invaded steppe and uninvaded forests.

Study sites
A total of eight steppe and forest study sites were chosen that span the meadow steppe-xerophytic forest ecotone in eastern Washington (see Suppl. material 1: Table S1). The co-dominance of Symphoricarpos albus with Festuca idahoensis and Pseudoroegneria spicata characterizes the mature vegetation in the Festuca idahoensis/Symphoricarpos albus habitat type (sensu Daubenmire 1970) in the four eastern Washington meadow-steppe sites (1250 m² each). The four forest sites (1250 m² each) are dominated by Pinus ponderosa with co-dominant Symphoricarpos albus in the understory (hereafter termed the P. ponderosa/S. albus habitat type, sensu Daubenmire and Daubenmire 1968). Sites were 40.9 ± 6.1 km apart; adjacent sites were more than 0.5 km apart.

Study species
A seed mixture of three grasses and three forbs (a native, naturalized, and invasive species of each taxonomic category) was used in seed addition plots in this study. The native perennials Pseudoroegneria spicata and Geum triflorum are prevalent in meadow-steppe (Daubenmire 1970); these species are less prominent in P. ponderosa forests (Daubenmire and Daubenmire 1968). Secale cereale, a naturalized annual, is a Washington Class C noxious weed that appears as a volunteer in many cultivated crops (Gaines and Swan 1972, Washington Noxious Weed Control Board [WNWCB]: http://www.nwcb.wa.gov) and establishes, albeit rarely, in meadow steppe and P. ponderosa forest (Connolly et al. 2014, USDA PLANTS database: http://plants.usda.gov). Centaurea cyanus, a naturalized annual, is also registered on the WNWCB monitor list and is widely established at low density throughout the meadow steppe and P. ponderosa forest in eastern Washington (Roche and Talbot 1986, USDA PLANTS database: http://plants.usda.gov). The invasive annual Bromus tectorum is abundant, even dominant, in the meadow steppe (Daubenmire 1970, Mack 1981) but infrequent in P. ponderosa forest (Daubenmire and Daubenmire 1968, Pierson and Mack 1990). Cirsium arvense, an invasive perennial, commonly occurs in anthropogenically disturbed sites and is present in both habitat types (http://www.nwcb.wa.gov, Connolly 2013). Seeds of G. triflorum, Ce. cyanus, B. tectorum, and Ci. arvense were collected in bulk from our meadow steppe sites from May to September 2010 and 2011; seeds of P. spicata and S. cereale were obtained from a local vendor (Rainer Seed Company, Davenport, WA, USA) to ensure we had adequate numbers of locally produced seeds for all treatments (described below).
We substantiate the immigrant class (naturalized vs. invasive) of each non-native test species based on 1) a preliminary vegetation analysis conducted at all 8 study sites (Connolly et al. 2014) and 2) state and regional published accounts (e.g., Gaines and Swan 1972, Roche and Talbot 1986) of the relative abundance of these species. Importantly, some work has evaluated the mechanisms driving competition dynamics between these specific native perennials and introduced annuals (e.g., Madsen et al. 2012), but outcomes remain unclear and suggest that evaluating their respective establishment potential and relative performance across environmental and disturbance gradients may help identify the drivers of introduced plant colonization and persistence in natural sites.

Two-factor field exclosure experiment in steppe and forest
The effects of seed addition and disturbance were assessed in late July-early August 2011 in six experimental blocks arranged in a 2 × 3 grid at each site (25 × 50 m); blocks were 25 m apart. Each block comprised four hardware cloth exclosures (aboveground dimensions were 45 × 45 × 45 cm tall, 1 cm² openings); exclosures in each experimental block were arranged 2 m apart in a square (24 exclosures per site, 192 exclosures total across all sites). Before its installation each exclosure was sprayed with enamel paint (Krylon®) to prevent leachate from the hardware cloth affecting plant growth within the exclosure. Exclosures were embedded 15 cm deep into the mineral soil to prevent the treatments from being confounded by vertebrate seed predators.
Each block contained a complete 2 × 2 factorial cross with seed addition and disturbance as factors. To generate disturbance treatments, we removed all vegetation and litter from the soil surface and churned the top 3 cm of mineral soil without removing any soil. Disturbed soil was then leveled within each exclosure to minimize differences in soil microtopography among these exclosures (Harper 1977). We extended disturbance treatments in a 0.5-m buffer zone around each disturbance treatment exclosure to minimize shading by neighboring understory plants. Vegetation was left intact within and around undisturbed control exclosures. Exclosures were embedded carefully around each replicate assigned to the undisturbed treatment and produced no detectable changes to plant cover within or around the exclosures. Importantly, this disturbance treatment did not necessarily release experimentally sown plants from competition but rather increased the availability of some resources (e.g., light, Suppl. material 2: Figure S1) that are known to influence plant competition in understory environments.
In early August 2011, 96 exclosures amongst the sites were sown with an admixture of seeds containing the three grasses (P. spicata, S. cereale, B. tectorum) and the three forbs (G. triflorum, Ce. cyanus, Ci. arvense). Seeds were sown evenly across a 30 × 30 cm square at the center of each exclosure (0.09 m² sampling area, 50 seeds of each species, 300 seeds sown total per exclosure). Seeds were pressed firmly onto the soil surface to minimize post-dispersal seed movement. In the remaining 96 exclosures amongst the sites no seeds were added in order to measure natural recruitment of study species and evaluate the contribution of seed addition to plant establishment counts.
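The layout described above can be enumerated to confirm the exclosure and seed counts. This is a minimal sketch, assuming one exclosure per seed × disturbance combination in each block as stated in the text. Site and block numbers, labels and variable names are hypothetical, and the total seed count is derived here rather than reported in the text.

```python
from itertools import product

N_SITES = 8            # four steppe and four forest sites
BLOCKS_PER_SITE = 6
SEED = ("seed added", "no seed added")
DISTURBANCE = ("disturbed", "undisturbed")

# One exclosure per site x block x seed x disturbance combination.
exclosures = [
    {"site": site, "block": block, "seed": seed, "disturbance": dist}
    for site, block, seed, dist in product(
        range(1, N_SITES + 1), range(1, BLOCKS_PER_SITE + 1), SEED, DISTURBANCE
    )
]
sown = [e for e in exclosures if e["seed"] == "seed added"]

SPECIES = ["P. spicata", "S. cereale", "B. tectorum",
           "G. triflorum", "Ce. cyanus", "Ci. arvense"]
SEEDS_PER_SPECIES = 50

seeds_per_exclosure = SEEDS_PER_SPECIES * len(SPECIES)
sown_area_m2 = (30 * 30) / 10_000                 # 30 x 30 cm square in m^2
total_seeds_sown = seeds_per_exclosure * len(sown)  # derived, not stated in the text

print(len(exclosures), len(sown), seeds_per_exclosure, sown_area_m2, total_seeds_sown)
```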
Exclosures were monitored monthly for damage and other extraneous events; plants were counted in early July 2012 to estimate establishment. Following July counts, all above ground plant biomass was harvested within each exclosure, separated by species, dried (48 hours at 70°C) and weighed. Plant establishment was quantified early in the growing season and before the production of reproductive structures in order to minimize the possibility of introducing non-native species. Natural recruitment by species other than our six test species was rare within these exclosures; nonetheless these recruits were excluded from the analysis. Average individual seedling biomass was estimated by dividing total biomass for each species in each exclosure by the number of that species in the exclosure. Plots that received seed addition were treated with glyphosate herbicide (Roundup®, Monsanto Company) at the cessation of the study. Additionally, the immediate area in a 15-m radius surrounding each exclosure was monitored throughout 2012 and 2013 to detect and remove extraneous introductions.
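The per-individual biomass estimate described above is a simple ratio. The sketch below is illustrative only: the function name and the example values are hypothetical, and returning `None` for a species with no established plants is our assumption about how empty cells might be handled, not a procedure stated in the text.

```python
def mean_individual_biomass(total_biomass_g, n_individuals):
    """Average dry biomass per plant for one species in one exclosure.

    Returns None when no plants of the species established, because the
    ratio is undefined in that case (an assumption of this sketch).
    """
    if n_individuals == 0:
        return None
    return total_biomass_g / n_individuals


# Hypothetical values: 1.2 g of dried tissue from 4 plants of one species.
print(mean_individual_biomass(1.2, 4))
print(mean_individual_biomass(0.0, 0))
```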

Statistical analysis
We used general linear mixed models to evaluate whether seed addition, disturbance, and plant immigration class (Native vs. Naturalized vs. Invasive) influenced the number of individuals that established within each community (Steppe vs. Forest). July individual counts of each species were averaged across all blocks at a site to generate site-level averages for each treatment combination and for each species. Ten exclosures were damaged in March 2011. These units were excluded from analysis as vertebrate seed predators and grazers can strongly influence plant establishment in these habitats (Connolly et al. 2014) and may generate undetectable variation in seedling recruitment. Five of eight sites, however, had no damage to exclosures, and no site with damaged exclosures had fewer than three replicates of each treatment combination with which to generate site-level averages for each species. Site-level averages for plant counts for each species were used as model response variables and all fixed effects (habitat, seed addition, disturbance, plant immigration class) and their possible interactions were included in analysis of the response variables. Site identification and the interaction between site, seed addition, and disturbance were included in this model as random effects to account for the nested structure of the design. Average individual counts for each species were log (x+1) transformed prior to analysis. We used post hoc tests to evaluate pairwise contrasts using the Tukey-Kramer method to control for multiple comparisons (Littell et al. 2006).
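The data preparation described above (averaging counts across blocks within a site, then applying a log (x+1) transformation) can be sketched as follows. This is an illustration only and not the SAS workflow used in the study: the function name, record layout and example counts are hypothetical, and the natural logarithm is assumed because the base is not stated in the text.

```python
import math
from collections import defaultdict


def site_level_log_means(records):
    """Average counts across blocks, then apply log(x + 1).

    records: iterable of (site, treatment, species, count) tuples, one per
    undamaged exclosure (hypothetical layout). Natural log is assumed.
    """
    groups = defaultdict(list)
    for site, treatment, species, count in records:
        groups[(site, treatment, species)].append(count)
    return {
        key: math.log1p(sum(counts) / len(counts))
        for key, counts in groups.items()
    }


# Hypothetical counts for one species in one treatment at one site.
example = [
    ("site A", "seed+disturbed", "B. tectorum", 2),
    ("site A", "seed+disturbed", "B. tectorum", 4),
    ("site A", "seed+disturbed", "B. tectorum", 3),
]
result = site_level_log_means(example)
print(result)
```

Damaged exclosures are handled here simply by omitting their records, so a site-level mean is taken over however many replicates remain.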
Analysis of average individual biomass followed a similar model structure but was limited to seed addition plots to ensure the analysis was conducted between individuals with similar durations of residence time within each plot. Individual biomass estimates of each species were averaged across all blocks at a site to generate site-level averages for each treatment combination and for each species. Individual biomasses were square-root transformed before analysis. Average Ci. arvense biomass at one steppe site (Smoot Hill - Summit) was a significant outlier differing from the species' other mean values by over three standard errors and was driven by the rapid second year growth of an adult Ci. arvense already residing in the plot. Omitting this observation permits the analysis to satisfy assumptions of normality; consequently, final model analysis for average individual biomass did not include this observation. Models evaluating plant establishment and biomass employed the Kenward-Roger approximation to estimate appropriate degrees of freedom (Littell et al. 2006). All analyses were conducted in SAS (Proc GLIMMIX, SAS 9.3; Cary, North Carolina, USA).
Our experimental design incorporated the effect of plant immigration class (Native, Naturalized, or Invasive) by evaluating two representative species from each class (one grass, one forb). Although the species selected represent common or dominant plants in these forest and steppe communities (see Study species section) and site-level quality can be assessed by the relative abundance of these native and non-native species (Daubenmire 1970, Mack 1981, Pierson and Mack 1990), we were only able to accommodate two species of each plant immigration status within each plot in our experimental design. Given the limited number of species within each immigration class, conclusions drawn from the main effect of immigration class, or from interactions including immigration class, must be interpreted tentatively. To accommodate interpretation at the species level, we include supplemental results and figures that evaluate plant species as a main effect instead of plant immigration class in the same general linear mixed model framework (Suppl. material 3: Tables S2-S3, Figs S2-S3). Importantly, given the early experimental harvest date and relatively large plot size, we assume species sown in our seed mixtures demonstrated independent responses to treatments and had negligible effects on the overall emergence and growth of other species occurring in the same plot. Ancillary analyses using statistical models that help account for any lack of independence with multivariate responses (i.e., MANOVA general linear models evaluating the response of multiple species sown in the same plot) indicate similar results for main fixed effects to those derived from mixed models (Suppl. material 4: Tables S4-S7).

Results
Not surprisingly, seed addition plots had greater recruitment than plots without seed addition, but the magnitude of the positive effects of seed addition varied by habitat and disturbance treatment (Table 1, Fig. 1). The positive effect of seed addition on plant establishment was greater in disturbed plots than in undisturbed plots and greater in forest plots than in steppe plots (Fig. 1). July establishment counts for the four native and naturalized species did not differ significantly between the forest and the steppe (Table 1, Fig. 2A). Seed addition resulted in greater establishment for naturalized species relative to plots that did not receive seeds (Table 1, Fig. 2B; Naturalized spp.: t = 11.76, d.f. = 161.9, P < 0.001). Seed additions also resulted in greater establishment for native species and invasive species relative to plots that did not receive seeds (Native spp.: t = 7.06, d.f. = 161.9, P < 0.001; Invasive spp.: t = 4.69, d.f. = 161.9, P < 0.001). As of July 2012, the effect of seed addition was greatest for naturalized species, intermediate for native species, and smallest for invasive species (Table 1, Fig. 2B).
Table 1. General linear mixed model analysis describing the influence of habitat, disturbance, seed addition, plant introduction class, and all possible interactions of these fixed factors on the log-transformed counts of individuals in plots established in Pacific Northwest steppe and forest communities. Significant differences at Type I Error = 0.05 are indicated in bold; marginally significant differences at Type I Error = 0.10 are indicated in italics.
Individual plant biomass was influenced by a significant interaction between plant immigration class and habitat and a marginally significant interaction between plant immigration class and disturbance treatment (Table 2). Regardless of habitat or disturbance treatment, naturalized species were significantly larger than either the invasive or native species (Table 2, Fig. 3A, B), reflecting important differences in life history between the species in each plant immigration class. Plants were typically larger in the steppe than in the forest (Table 2), but only the two naturalized species displayed a significant difference in average individual biomass between the two habitats (Fig. 3A, Naturalized spp.: t = -4.93, d.f. = 24.9, P < 0.001). Similarly, plants were typically larger in experimentally disturbed plots than in undisturbed plots (Table 2), but only the two naturalized species demonstrated a significant difference in average individual biomass between the two disturbance treatments (Fig. 3B, Naturalized spp.: t = 4.58, d.f. = 76.5, P < 0.001).

Discussion
Our goal was to determine whether seed limitation and disturbance via canopy removal differentially influence the recruitment and performance of native, naturalized, and invasive species in communities (meadow steppe and coniferous forest) that differ radically in physiognomy. Seed limitation differed among the three classes, with naturalized species the most seed limited, native species intermediately limited, and invaders experiencing intermediate to no limitation. We found that intact plant cover restricts seedling establishment similarly across all plant immigration classes and also results in significantly lower naturalized species growth. Low abundance among naturalized species in PNW meadow steppe and low recruitment of most non-native species in the forest understory are at least partially attributable to the combined influence of seed limitation and low resource availability mitigated by understory canopy cover (e.g., light levels at the soil surface, Suppl. material 2: Fig. S1). Our results, considered simultaneously with the conclusions of other contemporary studies conducted at these same sites with the same test species (Connolly 2013, Connolly et al. 2014), suggest that biotic resistance can play a major role in determining non-native species abundance (naturalized vs. invasive) within and between these PNW plant communities.
Table 2. General linear mixed model analysis describing the influence of habitat, disturbance, plant introduction class, and the interaction of these fixed factors on the square root-transformed individual biomass of plants harvested (July 2012) from plots established in Pacific Northwest steppe and forest communities. Significant differences at Type I Error = 0.05 are indicated in bold; marginally significant differences at Type I Error = 0.10 are indicated in italics.

Seed limitation, disturbance, and naturalizations
Seed limitation influences recruitment of many native (Turnbull et al. 2000, Clark et al. 2007) and non-native species (Jongejans et al. 2007, Swope and Parker 2010, Connolly et al. 2014). Seed limitation can be the product of 1) a paucity of reproducing plants, 2) poor seed dispersal, 3) biotic agents that directly reduce seed number, 4) poor propagule viability, or 5) some combination thereof (Harper 1977, Seabloom et al. 2003, Davis 2009). Non-native plants are unlikely to be dispersal-limited between communities in our study region as the propagules of non-native species can readily traverse the PNW steppe-forest ecotone and establish (albeit rarely and for short durations) in disturbed coniferous forest sites (e.g., Pierson and Mack 1990, Dodson and Felder 2006). Unlike native and invasive species, however, adult S. cereale and C. cyanus are rare at both forest and steppe sites (Connolly et al. 2014), implicating the lack of reproducing plants, poor seed dispersal, or both as a major limiting factor for naturalized species within these communities. Moreover, preferential attack by granivores and consistent losses caused by pathogenic soil fungi in both habitats also contribute substantially to seed limitation, occasionally eliminating entire experimentally introduced populations (Connolly 2013, Connolly et al. 2014). Differences in species' biomass production between habitat types and with or without disturbance may also influence non-native propagule pressure and contribute to seed limitation for non-native species. For example, individual B. tectorum biomass correlates strongly with total seed mass produced per individual plant (adjusted R² = 0.861; P < 0.001, Almquist 2013), and our study shows average B. tectorum biomass was quantitatively greater in undisturbed steppe (43.4 ± 8.2 mg [mean ± SE]) than in undisturbed forest (10.0 ± 4.0 mg) at the time of the July 2012 harvest (Suppl. material 3: Fig. S3), suggesting that average annual seed production per B. tectorum individual is likely greater in the PNW meadow steppe than in the adjacent, undisturbed ponderosa pine understory. Plants were harvested from exclosures before the generation of reproductive tillers to eliminate unintentional plant introductions at these sites, but previous estimates of B. tectorum fitness within each of these communities corroborate this hypothesis (steppe: 16-20 seeds per adult plant, Pyke 1986; forest: 0.7-0.9 seeds per adult plant, Pierson and Mack 1990). Additionally, disturbance of plant canopy cover in our study resulted in 87.5% and 31.7% greater individual B. tectorum biomass in the forest and steppe, respectively (Suppl. material 3: Fig. S3). These disturbance-mediated effects on productivity may also increase individual seed production for non-native plants. By limiting productivity, plant cover likely limits non-native plant seed production, influences seed dispersal dynamics, lowers propagule pressure, and facilitates community resistance to the establishment of light-requiring non-native plants.
Disturbance can facilitate a species' transition from naturalization to invasion (Crooks and Soulé 1999, Groves 2006, Chakraborty and Li 2010, Richardson and Pyšek 2012) by increasing resource availability and eliminating competitors (Davis et al. 2000, Davis and Pelsor 2001, Myers and Harms 2009, Richardson and Pyšek 2012, Leffler et al. 2016). In our study, naturalized species' establishment in disturbed plots in the forest and steppe was equivalent to or exceeded the establishment of co-occurring invasive and native species in identical treatments (Suppl. material 3: Fig. S2), suggesting that resources provided by the removal of understory (< 1.5 m high) canopies (e.g., light [Fig. S1], soil nutrients, water) helped meet a major requirement for recruitment for these naturalized species. Seedlings of invasive species may have higher relative growth rates and net assimilation rates than introduced, non-invasive congeners (Grotkopp et al. 2010) and, consequently, invaders may be more robust in resource-scarce (e.g., undisturbed) sites than co-occurring naturalized species. Disturbance has a strong, positive effect on the growth of these two naturalized species, suggesting that resource limitation, and in particular light limitation, may be a consistent, effective biotic barrier against some members of this class of plant immigrants in PNW forests. Residence time, however, can also influence the potential for naturalized species to invade (Groves 2006) and, while the two naturalized species examined in this study have likely occupied PNW natural habitats for over 100 years (e.g., Sawn 1972, Roche and Talbot 1986), it is possible that sufficient time has not elapsed to permit the expansion of these species within these habitats.
Further research is needed to determine the extent to which the interaction of resource availability, disturbance regimes, and species residence time in a novel habitat affects the differential establishment of invasive and naturalized species (Grotkopp et al. 2002, Groves 2006, Moravcová et al. 2010).

Competition in PNW coniferous forests
Competition in the PNW coniferous forest understory is a strong biotic barrier to invasive species that are abundant in the adjacent steppe, particularly B. tectorum (Pierson and Mack 1990). For example, low light availability at the soil surface in the P. ponderosa forest understory may cause low non-native species recruitment and individual seedling biomass. P. ponderosa forest understory lowered light transmittance at the soil surface to 20% of ambient conditions in June 2012, whereas shading in undisturbed steppe only lowered light transmittance to 60% of ambient conditions at the soil surface (see Suppl. material 2: Methods S1 and Fig. S1). Shading may directly influence the survival of some invasive species; for example, Bakker (1960) reported large Ci. arvense seedling mortality if light intensities fall below 20% of full sunlight, a threshold similar to that measured beneath the understory at our ponderosa pine forest sites. Additionally, shade lowers the probability that non-native seeds receive essential light-related germination cues (Pons 2000, Jensen and Gutekunst 2003) and may slow non-native seedling growth rate and result in lower fecundity through modifications of seedling microclimate (e.g., low temperatures, increased snow cover, Pyke 1984, Pierson and Mack 1990).
The environmental tolerances of introduced species interact with a novel habitat to determine a species' potential for naturalization (Richardson and Pyšek 2012), and climatic mismatch between an invader and a novel habitat may preclude non-native plant establishment (Alpert et al. 2000). Consequently, pre-adaptation to forest understories will raise the likelihood that an introduced species will naturalize in the interior of these temperate North American forests. Shade-tolerant non-native perennials (e.g., Berberis thunbergii, Celastrus orbiculatus, Lonicera spp.) readily establish in eastern North American forests (Zheng et al. 2006) and could plausibly be introduced as horticultural escapes and even naturalized in western coniferous forests (Smith and Mack 2013). Some non-native grasses may also tolerate low light levels in North American forest understories and may be candidates for future naturalizations and potential invasions (e.g., Miscanthus sinensis, Horton et al. 2010; leptomorphic bamboos, Smith and Mack 2013). Understanding the interactions between the physical tolerances of introduced species and the severity of competition in novel habitats would improve predictions of non-native plant naturalization or invasion potential on a habitat-specific level (Chytrý et al. 2008, Richardson and.

Conclusions and future directions
Few studies directly evaluate the relationship between biotic resistance and the relative abundance of introduced species (van Kleunen et al. 2010, Richardson and. However, here we report the results of one part of a three-experiment series evaluating how functionally different components of biotic resistance (i.e., seed predation [Connolly et al. 2014], seed parasitism [Connolly 2013], competition [reported here]) relate to the prevalence of non-native plants between habitats differing in susceptibility to invasion. Invasive plants are conspicuous by their tolerance or avoidance, or both, of most biotic barriers in the extensively invaded PNW steppe (Mack 1986), whereas naturalized species are significantly restricted, and occasionally eliminated, by the joint action of biotic interactions in the same habitat and at the same time. In the examples investigated here, community resistance to invasions is substantial in adjacent low-elevation PNW coniferous forest. For the species we evaluated, limitations to recruitment and performance imposed by a dense canopy and seed limitation imposed by granivores and, to a lesser extent, seed pathogens ensure that undisturbed forest interiors are likely to be well defended against the encroachment of many non-native species, particularly annual grasses. Collectively, our work demonstrates that biotic resistance likely plays a role in determining both 1) the distribution of some non-native species amongst a region's communities and 2) the position of a non-native species along the introduced-naturalized-invasive species continuum in a community. Further work evaluating the potential synergistic interactions between multiple biotic barriers with a larger suite of representatives from each immigration class (e.g., Suwa and Louda 2011, Maron et al. 2013) will help elucidate how biotic interactions ultimately influence demography of non-native plants and the distribution of non-native plants within Pacific Northwest steppe and forest communities.

Introduction
Islands are particularly noteworthy for global conservation efforts because they host more than 20% of the world's terrestrial plant and vertebrate species within less than five percent of global terrestrial area (Kier et al. 2009). Reflecting this conservation significance, ten of the world's 35 biodiversity hotspots consist entirely, or largely, of islands (Zachos and Habel 2011). Island biodiversity is highly threatened, with over half of all recent documented extinctions occurring on islands (Butchart et al. 2006; Sax and Gaines 2008), including almost 1000 species of non-passerine land birds (Duncan et al. 2013). Islands currently harbour over a third of all terrestrial species facing imminent extinction (Ricketts et al. 2005), as well as 45% of all species categorised as critically endangered by the IUCN (Baillie et al. 2004).
Much of the conservation threat on islands, as well as on mainland ecosystems, arises from invasive species, which are considered to be the second largest driver of extinction globally. Among invasive taxa, ants are particularly notable for their serious environmental impacts (Holway et al. 2002; Lach and Hooper-Bui 2010), especially on islands. A prominent example is the invasion of Christmas Island by the yellow crazy ant Anoplolepis gracilipes F. Smith, 1857, which has resulted in significant environmental transformation of the rainforest ecosystem, as well as promotion of secondary invasion by other invasive species (O'Dowd et al. 2003; Green et al. 2011). The cumulative effect of these invasions has recently resulted in the first vertebrate (a bat) extinction in Australia for over 50 years (Lumsden 2009; Martin et al. 2012). Another example is the accidental introduction of about 60 ant species to the Hawaiian Islands since human colonisation (Krushelnycky et al. 2005), which has resulted in substantial negative impacts on native Hawaiian biodiversity (Reimer 1994).
Each year, more ant species are being accidentally transported by human commerce, and species already outside of their native range are further dispersing to new locations (Williams 1994; McGlynn 1999; Holway et al. 2002). Because of the significance of the social, economic and environmental effects of many ant invasions, as well as the difficulty in eradicating invasive ants after they have established (Hoffmann et al. 2011), globally, ants are increasingly becoming a target of biosecurity measures to prevent their arrival, especially on islands (HAG 2001; PIAG 2004; COA 2006). Such biosecurity measures are potentially most advanced in New Zealand, where biosecurity efforts have extended to ports-of-exit in neighbouring countries to prevent contamination of goods prior to transportation to New Zealand. This port-of-exit effort reduced ant presence in goods from 17% of containers to less than 1% (Nendick et al. 2006).
Lord Howe Island is located approximately 760 kilometres northeast of Sydney, Australia in the Pacific Ocean (S31.5545, E159.0841). The island is notably species rich, with a high level of endemism (Cassis et al. 2003), and as a result of its conservation significance, the island has World Heritage status. Exotic species are prominent on the island (Hutton et al. 2007), with rats blamed for the extinction of five bird species and two land snail species and implicated in the decline of many other species (Ponder 1991; Cassis et al. 2003; Hutton et al. 2007). Rats were also believed to have caused the extinction of the Lord Howe Island phasmid, Dryococelus australis (Montrouzier, 1855), until a small population of the phasmid was found on a nearby rodent-free islet (Priddel et al. 2003). Similarly, invasive plants are a major focus of on-ground conservation efforts, with management costing AUD$6.5 million over the past decade (Lord Howe Island Board 2016). In 2003, the invasive African big-headed ant Pheidole megacephala Fabricius, 1793 was found to be established on Lord Howe Island, and an eradication program commenced. This ant is considered to be among the worst invasive species globally (Lowe et al. 2001), partly because of its severe environmental consequences (Hoffmann et al. 1999; Wetterer 2007; Hoffmann and Parr 2008).
Following the detection of P. megacephala, ants became a target for biosecurity measures on Lord Howe Island to prevent further ant species introductions. Such measures included more thorough inspection of goods arriving on the island, prohibition on the importation of second-hand building materials, and strict protocols on the importation of plants and soil to the island. In addition, regular prophylactic treatments for ants commenced at the port, public awareness efforts on the issue of invasive ants were initiated, and ant identification training was provided to many people. Despite these efforts, over the next decade numerous ant species were collected on the island for the first time in ad hoc ecological surveys, indicating a serious biosecurity problem. Here, we investigate the chronosequence of ant introductions to Lord Howe Island to quantify the extent and nature of the island's ant biosecurity problem.

Methods
A timeline of species discovery was generated by determining the earliest collection date of all ant species found on Lord Howe Island. These dates were identified from the labels of specimens in the ant collections of the Australian Museum in Sydney, the Australian National Insect Collection in Canberra and the Tropical Ecosystems Research Centre (TERC) in Darwin. These three collections contained the most comprehensive set of ant specimens from Lord Howe Island. They included specimens arising from both formal and informal collections by many people over the past century, commencing in 1915.
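The timeline construction described above amounts to taking, for each species, the minimum collection year across all specimen labels in the pooled collections. A minimal sketch follows; the record layout, the collection abbreviations, and every record other than the first-record years reported in this paper (1966 for Rhytidoponera victoriae, 2003 for Pheidole megacephala) are hypothetical and serve only as illustration.

```python
def first_records(specimens):
    """Return {species: (earliest_year, collection)} from label records.

    `specimens` is an iterable of (species, year, collection) tuples.
    """
    earliest = {}
    for species, year, collection in specimens:
        if species not in earliest or year < earliest[species][0]:
            earliest[species] = (year, collection)
    return earliest


# Hypothetical label data pooled from three collections
# (AM, ANIC and TERC are illustrative abbreviations).
specimens = [
    ("Rhytidoponera victoriae", 1979, "ANIC"),
    ("Rhytidoponera victoriae", 1966, "AM"),
    ("Pheidole megacephala", 2005, "TERC"),
    ("Pheidole megacephala", 2003, "AM"),
]

timeline = first_records(specimens)
for species, (year, collection) in sorted(timeline.items()):
    print(f"{species}: first collected {year} ({collection})")
```

A first-collection date obtained this way is an upper bound on the arrival date, since a species can be present for years before anyone collects it.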
Species nomenclature follows Bolton (1995) and subsequent revisions (Seifert 2008; Ward et al. 2014). Species were designated as endemic, possibly native or exotic (introduced) based on current biogeographical knowledge of each species, and in a few cases by subjective opinion of the authors. Subjectivity exists for two reasons. Firstly, it is difficult to ascertain whether some Australian mainland species were present on Lord Howe Island at the time of the first ant collection in 1915 because they had previously dispersed there naturally, or because they had been accidentally introduced by early colonists. Secondly, ant taxonomy is far from complete, and multiple recent reviews have found that species considered to be widespread exotic species were actually multiple cryptic species, consisting predominantly of native species within their home ranges (Seifert 2003, 2008; Bolton 2007). The Australian ant fauna is particularly diverse with many cryptic species (Andersen et al. 2013), and we believe that two apparently widespread tramp species on the island may instead be cryptic species endemic to Lord Howe Island.

Results
Information obtained from entomological collections revealed that ant species were collected for the first time on Lord Howe Island during two concerted ant biodiversity sampling events in 1915 and 2003, and six smaller-scale samplings in 1966, 1979, 1995, 2000, 2005 and 2012 that were predominantly opportunistic hand collections. A total of 45 species have been collected (Table 1, Figure 1), and of these, 12 species are considered to be endemic, and a further seven possibly native as they may have self-dispersed to the island prior to human colonisation. All species of uncertain provenance that we consider to be possibly native were found in the first collection in 1915. The last endemic species to be found for the first time was a species of Discothyrea, found in 2000. Nineteen non-native species (42% of the total fauna) were found for the first time only since 2000. All but five of the species that are not native to Lord Howe Island are native to the Australian mainland. The five non-Australian mainland species are: Tetramorium bicarinatum, which is believed to be native to SE Asia (Wetterer 2009); Pheidole megacephala, which is native to Africa; Cardiocondyla nuda, which is native to the tropical and sub-tropical Pacific; Iridomyrmex albitarsus, which is native to Norfolk Island (Shattuck 1993); and Paraparatrechina sp. B, which is known only from the TERC collection from New Caledonia. Prior to 2003 there were no Pheidole species collected on Lord Howe Island, but since then five species have been collected, including the highly invasive P. megacephala. Rhytidoponera victoriae was first found in 1966, and Pheidole sp. group C in 2005, and these two species are now among the most commonly collected ants on the island (B Hoffmann, pers. obs.).
Figure 1. Accumulation of ant species on Lord Howe Island. Note that species considered to be native to the island are all graphed at 1915 irrespective of when they were first found.
Table 1. Species list of the ants of Lord Howe Island with date of first record and biogeographic origin. * indicates species that the authors believe may have a taxonomic issue in that these species may instead be cryptic species native to Lord Howe Island.
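The accumulation curve in Figure 1 can be sketched from first-record dates using the rule given in its caption, namely that species considered native are placed at 1915 regardless of when they were first collected. The sketch below uses only four species whose first-record years are stated in the text; it is not the full 45-species dataset, and the function name and data layout are assumptions made for illustration.

```python
from itertools import accumulate


def accumulation_curve(first_year_by_species, native, baseline_year=1915):
    """Return [(year, cumulative_species_count), ...] in year order.

    Species in `native` are plotted at `baseline_year` irrespective of
    their actual first-collection year.
    """
    per_year = {}
    for species, year in first_year_by_species.items():
        plotted = baseline_year if species in native else year
        per_year[plotted] = per_year.get(plotted, 0) + 1
    years = sorted(per_year)
    totals = accumulate(per_year[y] for y in years)
    return list(zip(years, totals))


# Subset of first-record years mentioned in the text (illustrative only)
first_year_by_species = {
    "Discothyrea sp.": 2000,           # endemic, first found in 2000
    "Rhytidoponera victoriae": 1966,
    "Pheidole megacephala": 2003,
    "Pheidole sp. group C": 2005,
}
native = {"Discothyrea sp."}

curve = accumulation_curve(first_year_by_species, native)
print(curve)
```

Placing native species at the baseline keeps late discoveries of endemic taxa from appearing as new arrivals, so that any rise in the curve after 1915 reflects non-native species only.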

Discussion
Since humans started visiting Lord Howe Island in 1778, and subsequently colonised it in 1834 (Hutton et al. 2007), the island's ant fauna has increased by almost 250%, with almost three quarters (73%) of the colonising species being collected for the first time only in the last 15 years, mostly just prior to when new biosecurity measures were implemented. Importantly, although we are unable to provide any information about sampling methods and intensity throughout the last century of ant collecting on the island, there probably has not been an increase in survey intensity in the last two decades driving the recent rise in species detections. Instead, it is more likely that the increased rate of colonisation was driven by the increase in tourism and development on the island in the mid-20th century, coupled with a time-lag between when species established and when they were first detected. The global spread of exotic species is known to be positively related to economic activity through the movement of goods (Essl et al. 2011), and species will often be present for many years before their populations reach detectable levels (Vanderwoude et al. 2003; Frieire et al. 2014; Wylie and Peters, in press), especially if people are not actively surveying for them. It is most likely that some of the species found only in the past 15 years had been present for up to a decade or more prior to being collected, long before any biosecurity measures were established.
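The proportions quoted above can be checked against the species counts reported in the Results (45 species in total, 12 endemic, seven possibly native, and 19 non-native species first found since 2000). The short calculation below is a sketch with hypothetical variable names. Treating endemic plus possibly native species as the pre-settlement baseline is an assumption, and the final ratio is shown only for comparison with the "almost 250%" figure, whose exact derivation is not stated.

```python
total_species = 45
endemic = 12
possibly_native = 7
recent_non_native = 19  # non-native species first found since 2000

# Assumed baseline: endemic plus possibly native species
native_baseline = endemic + possibly_native
colonising = total_species - native_baseline

share_of_fauna = recent_non_native / total_species
share_of_colonisers = recent_non_native / colonising
fauna_ratio = total_species / native_baseline

print(f"colonising species: {colonising}")
print(f"recent arrivals as share of total fauna: {share_of_fauna:.1%}")
print(f"recent arrivals as share of colonisers: {share_of_colonisers:.1%}")
print(f"current fauna relative to baseline: {fauna_ratio:.2f}x")
```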
Since 2003, approximately coinciding with the detection of P. megacephala on Lord Howe Island, the movement of many high-risk items such as soil, plants, machinery and building materials to the island has been highly regulated. For example, plants must be soil free (except for a potting medium) and certified to be free of pests and disease, timber must be dressed, and gravel/roadbase must be certified to be Virgin Extracted Natural Material and free of pests. Additionally, there are enhanced protocols such as the prophylactic baiting at the wharf and immediate surroundings just in case ant species arrive in goods. But are these protocols preventing new ant incursions? As a demonstration that the relatively new biosecurity protocols are working, on 23 July 2016, as a result of compulsory inspection of all high-risk goods arriving on the island, ants were found for the first time arriving in cargo (Andrew Walsh and Hank Bower personal communication). Two intact colonies of mainland Australian species, Polyrhachis femorata and a Crematogaster species in the laeviceps species group, were found within timber. The cargo was rapidly quarantined and the colonies were treated with a toxic solution. Although it was clear that the biosecurity protocols worked in this instance, such protocols are unlikely to be perfect. Indeed more recently on 15 March 2017, a resident reported ants infesting a recently delivered consignment of corrugated iron. To further reduce this risk, the island's biosecurity procedures are planned to be enhanced in the latest review of its biosecurity strategy, including compulsory inspections of all goods arriving on the island, and these inspections being conducted in a more routine way. Only with more time, potentially a decade, will it be possible to demonstrate that the biosecurity measures are preventing further ant introductions.
Even if the exact introduction date of all species found for the first time in the past two decades was known, the general pattern of recent increase in species accumulation would stand, at least up to the point of biosecurity implementation. Such an accelerating colonisation pattern of Lord Howe Island by ant species is greatly concerning. This finding raises the question: if such species accumulation has occurred recently on such a small island associated with a mainland with a big biosecurity effort, what is happening elsewhere throughout the world where biosecurity is not such a focus? Few data are available for invertebrates globally, but very recently it has been shown that the establishment of alien insect species has nearly doubled over the last few decades in Europe (Roques et al. 2016). Even in Antarctica, strong biosecurity measures have not prevented the unintentional transport of invertebrates and plant propagules to the region (Chown et al. 2012; Houghton et al. 2014). It is suggested that what has happened on Lord Howe Island is probably not an isolated phenomenon and that many ant species are currently being accidentally dispersed to, and successfully colonising, most islands globally that are habitable by ants and visited by people (Herrera et al. 2013; Moreau et al. 2014; Morrison 2014). Indeed, even in Hawaii, where biosecurity now focuses on ants and where ant species accumulation was reported to be decelerating (Krushelnycky et al. 2005), it is now believed that there are up to 64 species present (Paul Krushelnycky, personal communication), indicating a rise in species accumulation in the past few decades in accordance with increasing commerce.
We have a poor ability to manage or eradicate most exotic species after they establish anywhere, and this is particularly the case for ants. In the most recent global review of ant eradications, there have only been 106 successful eradications (excluding 38 that were nearing the end of their 2-year monitoring phase) from 316 attempts, and 77% of these successful eradications covered less than 5 ha. Clearly, preventing ant species colonising new locations is a far more effective biosecurity measure than trying to eradicate, or even manage, them after they arrive.
The potential impacts that most of these species will have on people and the ecosystems on the island are unknown. Although the detrimental, and often severe, impacts of some invasive ant species are well known (Holway et al. 2002), just as for all taxa that are being accidentally transported to novel locations, it is completely unclear what effect, if any, most species may have (Simberloff 2011). Of all the non-endemic ant species on the island, only one, P. megacephala, is known to have serious negative impacts, which is why it is currently subject to an eradication campaign, and three other common tramp species, Cardiocondyla nuda, Ochetellus glaber and Tetramorium bicarinatum, are known only as minor pests (Lester et al. 2003; Heinze et al. 2006; Wetterer 2009). For the other species that are exotic to Lord Howe Island there is little information that can be used to predict their impacts on native flora and fauna. Most of these species are uncommon and have low abundance, with the exception of two recently arrived ant species, Pheidole sp. A and Rhytidoponera victoriae, which are widely distributed in the lowland areas associated with human habitation (B Hoffmann, personal observations). Given that ants are well documented to make major contributions to many ecosystem processes (Del Toro et al. 2012), these two species are likely to be influencing ecosystem processes on the island.
Notably, of the ant species on Lord Howe Island only five are not of Australian origin, clearly demonstrating that the biosecurity risk to the island comes primarily from the transport of goods from the Australian mainland. Indeed, it is also most likely that three of the five non-Australian species, Cardiocondyla nuda, Tetramorium bicarinatum and Pheidole megacephala, were accidentally transported to Lord Howe Island from the Australian mainland. Given that we are unaware of Paraparatrechina sp. B and Iridomyrmex albitarsus being on the Australian mainland, these are possibly the only species that arrived on the island from a different source location, most likely New Caledonia and Norfolk Island respectively. Also noteworthy is the absence from Lord Howe Island of other exotic ant species that are common throughout mainland Australia, including the highly invasive Argentine ant Linepithema humile. It is unclear if this absence is due merely to a lack of dispersal opportunity or to colonisation failure. Additionally, Lord Howe Island has not been colonised by many other highly invasive ant species that occur on islands throughout the Pacific, such as multiple fire ant species, Solenopsis spp., the yellow crazy ant, Anoplolepis gracilipes, and the little fire ant, Wasmannia auropunctata. It is suggested that this outcome is not due to current biosecurity protocols, but instead reflects a lack of transport pathways to Lord Howe Island from infested locations throughout the Pacific. Essentially, Lord Howe Island has just been lucky.
In summary, since human settlement there has been a significant number of ant introductions to Lord Howe Island, and it appears that species accumulation on the island has accelerated in the last few decades. It remains to be seen whether biosecurity protocols that were first implemented on the island just over a decade ago have indeed succeeded in slowing the rate of, or even completely stopping, accidental introductions. No system is perfect; for example, even in New Zealand and on the Australian mainland, where there are stringent biosecurity protocols, incursions and establishment of many taxa are a constant occurrence. If this pattern of species accumulation on Lord Howe Island really does reflect what may be happening on islands globally, then this highlights the need for biosecurity procedures on islands to be increased, especially on islands of high conservation value. Even better would be to implement more effective biosecurity measures at ports of exit to prevent transport in the first place. Both strategies would involve greater public awareness of invasive species generally, especially ants, as well as a solid understanding of how to prevent their spread, such as by preventing the unregulated movement of soil, plant materials, machinery, construction materials and other goods, enforcement of these quarantine requirements, and high biosecurity standards at ports of exit.

Managing for biodiversity: impact and action thresholds for invasive plants in natural ecosystems

Introduction
It is well-known that invasive alien plants can significantly threaten the structure, function and productivity of natural ecosystems, and are generally associated with declines in diversity and fitness of resident biota (Ehrenfeld 2010). However, there is growing evidence that such impacts are highly variable amongst landscape contexts and are modulated by the condition of the recipient native ecosystem (e.g. French 2007, Pyšek et al. 2012). Although there is little doubt that widespread and dominant invasive plants can adversely affect natural ecosystem properties when at high abundances, evidence that an invasive plant's presence alone causes deleterious changes in the recipient ecosystem's condition is less clear (Barney et al. 2013, Hulme et al. 2013). A key issue is that invasive plant impacts are highly scale-dependent (Powell et al. 2013, Rejmánek and Stohlgren 2015). Indeed, at landscape and continental scales there tends to be a positive rather than negative association between the regional diversity of non-native and indigenous flora (Sax 2002, Maskell et al. 2006, Nobis et al. 2016), which suggests that most introduced plants enrich rather than deplete the diversity of recipient vegetation. Such positive associations at large scales may reflect coincident functional responses of alien and native plants to favourable abiotic (e.g. climate and nutrients) and biotic (e.g. herbivore pressure, pollinator activity) conditions (Sax 2002). Where smaller scales (i.e. those over which management interventions are feasible) are concerned, an important question that must be addressed when assessing invasive plants is: how abundant (in terms of biomass and spatial extent) must a species become before the recipient ecosystem begins to change in response to its invasion? In almost all instances, the rate at which natural ecosystems change (e.g. decline in number of native species) in response to invasion is not known (Barney et al. 2013).
Does an ecosystem change at all points along the invasion pathway (i.e. a linear response to invasion), or is there a certain minimum, critical "tipping point" or threshold beyond which an ecosystem changes as the invasive plant becomes dominant? Given the increasingly high cost and economic burden of controlling invasive species in agricultural and natural ecosystems (e.g. at least $13.6 billion per year in Australia; Hoffmann and Broadhurst 2016), there is a clear need to determine the spatial and temporal scale over which impacts occur, the identities of the invasive plants that drive the greatest impacts, and the ecosystems most vulnerable to change, so that the limited resources for control can be prioritised to areas most likely to be impacted. The very scarce resources available for invasive plant control in natural ecosystems mean that the likelihood of eradicating widespread and well-established invaders is vanishingly small (Panetta and James 1999). In such circumstances, priority must be given to controlling widespread alien plants in sites of high conservation priority and containing their spread elsewhere (Cousens 1987). Critical questions that still elude land managers and invasive plant ecologists include: (1) How much of an invasive plant (in terms of cover and biomass) can be retained within an ecosystem without compromising key functions and biodiversity values? and (2) When should control be implemented? The answer to the second question does not follow directly from the answer to the first, as we will show below.
For widespread and dominant invasive plants with demonstrably negative effects on native ecosystems, there is growing evidence that ecosystem responses are non-linear, such that they occur only once a particular level of invasive plant abundance has been exceeded; that is, a negative impact threshold relationship (Alvarez and Cushman 2002, Gooden et al. 2009, Thiele et al. 2010b, McAlpine et al. 2015, Fried and Panetta 2016). In this paper we describe a mechanistic framework for such relationships and explore the role of thresholds in triggering invasive plant control. Our framework explicitly considers alien plants that are widespread across their potential invasive ranges, locally abundant and capable of generating negative impacts in native plant communities.

Ecological framework for invasive species impact thresholds
The concept of invasive species impact thresholds has received attention for at least two decades (see reviews by Adair and Groves 1998, Panetta and James 1999), yet little headway has been made in testing threshold models empirically in the field, or exploring their application to decisions related to invasive plant control. Henry (1994) first suggested that, given limited management resources, invasive populations could be locally contained below an abundance threshold level to prevent the decline in native vegetation or other ecosystem properties. This model assumes that invasive plants interact only weakly with native vegetation at low abundance levels, and invasion deleteriously affects recipient ecosystem properties only once an abundance threshold is breached. Adair and Groves (1998) posited that this threshold could be used to set the maximum tolerable level of infestations and consequently a target for weed control programs.
The prevalence of impact thresholds throughout invaded ecosystems is poorly known. A recent review of biases and errors in assessments of weed impacts on natural ecosystems by Hulme et al. (2013) highlighted that our ability to determine the prevalence, scale, direction and rate of ecosystem change in response to invasive plants is hampered by the fact that the vast majority of extant impact studies do not quantify ecosystem responses along a gradient of alien plant abundance. Rather, most studies tend to compare heavily invaded sites (e.g. where the abundance of the invasive plant relative to native ones exceeds 60% or sometimes 80%, e.g. French 2008, Gooden and French 2014) to non-invaded sites (Hulme et al. 2013). It is thus possible that impact threshold relationships between invasive plant abundance and natural assets, such as native species richness, are more prevalent than is currently recognised, but our sampling efforts are currently inadequate to detect them.
For example, Gooden et al. (2009) sampled vegetation species richness, abundance and composition in wet sclerophyll forest of eastern Australia that was invaded by the thicket-forming shrub Lantana camara. Samples were taken across a gradient of L. camara cover, to ensure that community change was not biased towards the relatively small proportion of infestations in which L. camara cover exceeded 80%. Gooden et al. (2009) found that there was a striking non-linear decline in the number of native plant species with increasing L. camara cover, such that native species loss occurred only once L. camara cover exceeded a threshold zone of 60-80% within the forest. Indeed, some of the highest species richness values occurred when L. camara was present in the forest at covers between 20 and 50%, which strongly suggests that this invader is able to coexist with native species at low abundances without exerting significant negative impacts on the community. Importantly, however, threshold effects of L. camara varied substantially amongst different native plant functional groups; for example, native vine and herb species richness began to decline significantly only at 70-80% L. camara cover, whilst native fern species richness began declining at 30-40% cover. These results indicate that thresholds can vary amongst different life forms within the invaded ecosystem, and furthermore that the maximum tolerable level of an invader should be set at the threshold demonstrated by the most sensitive ecosystem component.
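One common way to locate a threshold of this kind from plots sampled along a cover gradient is a "broken-stick" (hinge) regression, in which richness is flat up to a breakpoint and declines linearly beyond it. The sketch below is illustrative only and is not the analysis used in any of the studies cited here: the function name `fit_hinge`, the candidate breakpoints and the plot values are all assumptions, and the data are synthetic and noise-free.

```python
from statistics import mean


def fit_hinge(cover, richness, candidate_thresholds):
    """Least-squares fit of richness = plateau + slope * max(0, cover - t).

    Tries every candidate threshold t and returns the best
    (threshold, plateau, slope, sse) tuple.
    """
    best = None
    for t in candidate_thresholds:
        # Hinge predictor: zero below t, rising linearly above it.
        hinge = [max(0.0, c - t) for c in cover]
        h_bar, y_bar = mean(hinge), mean(richness)
        sxx = sum((h - h_bar) ** 2 for h in hinge)
        if sxx == 0:
            continue  # no plots above t, so the slope cannot be estimated
        sxy = sum((h - h_bar) * (y - y_bar) for h, y in zip(hinge, richness))
        slope = sxy / sxx
        plateau = y_bar - slope * h_bar
        sse = sum((y - (plateau + slope * h)) ** 2 for h, y in zip(hinge, richness))
        if best is None or sse < best[3]:
            best = (t, plateau, slope, sse)
    return best


# Synthetic, noise-free plots (NOT data from any cited study): richness stays
# at 60 species up to 75% cover, then falls by 2 species per additional 1% cover.
cover = list(range(0, 101, 5))
richness = [60 - 2 * max(0, c - 75) for c in cover]

threshold, plateau, slope, sse = fit_hinge(cover, richness, range(5, 100, 5))
print(f"threshold = {threshold}% cover, plateau = {plateau:.1f}, slope = {slope:.2f}")
```

Such a fit can only recover a breakpoint if plots are spread across the whole cover gradient. Comparing only heavily invaded with non-invaded sites leaves the breakpoint unidentifiable.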
Several other studies have also provided evidence for negative impact thresholds for a variety of invasive plants, including the shrub Baccharis halimifolia in Mediterranean saltmarshes (Fried and Panetta 2016), the monocarpic perennial Heracleum mantegazzianum in northern European grasslands (Thiele et al. 2010b), the vine Delairea odorata in northern Californian coastal scrub and riparian communities (Alvarez and Cushman 2002), and the scrambling herbs Tradescantia fluminensis, Plectranthus ciliatus and Asparagus scandens in New Zealand temperate forests (McAlpine et al. 2015). Fried and Panetta (2016) found that native species' responses to invasion were complex and, in general, non-linear across a gradient of B. halimifolia cover. Species richness declined linearly with increasing B. halimifolia cover (indicating a non-threshold relationship), whilst the abundance of native perennial forbs and graminoids declined significantly only when B. halimifolia cover exceeded 80%, i.e. following a threshold relationship. McAlpine et al. (2015) reported that patterns of native plant species decline varied amongst the three weed species in the temperate forest, with only T. fluminensis and P. ciliatus exerting negative threshold effects on native species richness at approximately 50% weed volume. It is clear from these studies that rates of species decline and the position of the threshold zone vary from one invasive species to another, and may depend upon the functional identity of the native vegetation within the recipient community. For example, over a broad range of invaded habitats, Hejda (2013) found that annuals, species with taproots, juveniles of tree species and fast-spreading clonal species were impacted least by invasion. Presumably communities with a high proportion of such species would exhibit wider maintenance zones (see Box 1). Other work has shown that species of small stature were most negatively impacted in communities invaded by either Heracleum mantegazzianum, Lupinus polyphyllus or Rosa rugosa (Thiele et al. 2010a).
The ecological processes that underpin impact threshold relationships (Box 1) have as yet not been examined empirically, yet may be framed by two broad questions: (1) What maintains native vegetation diversity or ecosystem function at levels below the weed abundance threshold, and at what point (i.e. threshold) do native species begin to decline (or ecosystem processes change) with increasing weed abundance? Based on classical competition theory, species will interact weakly at low densities, especially when resources are high or if resource requirements do not overlap strongly in niche space, thereby enabling coexistence. At low densities, an invasive plant's competitive performance against well-established natives may be relatively weak if those native plants have priority access to limited resources and greater resource-use efficiency (e.g. Kardol et al. 2013, Mason et al. 2013). Ecosystems with high levels of resilience to disturbance (such as those with either persistent and dense seed banks or ones that are replenished often by immigrant propagules from adjacent patches of non-invaded vegetation) may have high impact thresholds. This is because any losses of standing native vegetation in response to invasion may be buffered by recruitment from the seed bank, thus maintaining community diversity (Gioria and Pyšek 2016). (2) What are the mechanisms underpinning the dynamic transition across the threshold zone from a rich, functional natural ecosystem to one dominated by an invasive plant with reduced natural value? The threshold zone most likely represents a rapid, dynamic shift from one state (i.e. natural ecosystem) to an alternative, degraded one (i.e. invaded ecosystem). Such a rapid (rather than a gradual and linear) shift across the threshold may be driven by high levels of disturbance and, in some cases, positive feedback mechanisms (e.g. at high densities, invasive grasses of semi-arid woodlands can boost wildfire severity and frequency, which accelerates native species loss and facilitates further invasion; Rossiter et al. 2003), or may simply represent the point at which multiple native species disappear simultaneously as invasion increases. The mechanisms by which alien species reduce diversity are likely to vary according to vegetation successional status, with competition for resources more prominent in the mid- to late-successional stages (Catford et al. 2012).

Box 1. Conceptual model for negative impact threshold relationships between invasive plant abundance and a natural asset (e.g. number of native plant species) within the recipient ecosystem.
Invasive species impact threshold relationships can be defined as non-linear declines in one or more natural ecosystem properties, such as number of native plant species, with increasing weed abundance. The model curve consists of several components:
A. Threshold relationships exist when the quality of a particular natural asset does not significantly change (either positively or negatively) at low levels of invader abundance. At point A on the non-linear curve, native plant species are able to coexist with the invasive plant. This initial "zone of maintenance" may vary in extent depending on the type of invaded community, capacity of the native species to withstand invasion and functional activity of the invasive plant. For example, as indicated by the relatively steep light-grey dotted curve at point A, invasive plants that actively engineer one or more ecosystem properties, such as nitrogen-fixing shrubs, may drive native species decline even at low levels of invasion, due to small changes that accumulate through time. In some instances multiple thresholds have been observed (see Fried and Panetta 2016) but we do not consider this phenomenon further.
B. This point lies within the threshold zone: the levels of invasion at which the natural asset in question begins to decrease as weed abundance increases. This represents a transition zone from one natural ecological state (i.e. ecosystem dominated by native species) to an alternative, degraded state (i.e. one dominated by an invasive species, with altered ecosystem properties; Downey and Richardson 2016). As yet, there has been no explicit test of how extensive threshold zones can be or the processes that underpin the transition between the alternative states on either side of the threshold zone.
C. The rate of change (represented by the negative gradient over the stretch of curve at point C) once the threshold zone has been exceeded is unknown in most cases, but can be very high. For example, Gooden et al. (2009) found that about two native plants were displaced with every percentage increase in L. camara cover above 75%, whilst no detectable change in native species richness occurred up to this cover abundance threshold.
D. The trajectory of the tail-end of a threshold relationship, where weed abundance approaches 100%, has never been examined and therefore is unclear for most invasive species. It is nonetheless an important component of the curve, because it defines the subset of ecological attributes that are tolerant to invasion at high abundances (see Hejda 2013). In some cases, such as with L. camara, the negative gradient appears to approach zero native species richness, where the invader completely replaces the native community (Gooden et al. 2009). In other cases (e.g. T. fluminensis; McAlpine et al. 2015), rates of loss of native species follow a sigmoidal relationship, whereby the decline in species richness beyond the threshold zone is initially rapid but slows with increasing weed abundance, eventually approaching an asymptote as weed abundance approaches 100%. A negative sigmoidal-threshold relationship occurs where a tolerant subset of native species is retained even at very high invader abundance.
"Proactive management" is undertaken to prevent weed abundance from reaching threshold zone levels. Delaying control until after the weed has attained high levels of abundance (i.e. "reactive management") may result in irreversible loss of particularly sensitive species (Downey and Richardson 2016).
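The sigmoidal-threshold curve described in Box 1 can be written down explicitly. The logistic form below is only one convenient parameterisation, chosen here for illustration and not taken from the studies cited; the symbols $S_0$, $S_{\min}$, $x_T$ and $k$ are assumptions introduced for this sketch.

```latex
% x     : invader abundance (e.g. % cover), 0 <= x <= 100
% S_0   : value of the natural asset in the maintenance zone (point A)
% S_min : value retained at very high invader abundance (point D)
% x_T   : midpoint of the threshold zone (point B)
% k > 0 : steepness of the transition
S(x) = S_{\min} + \frac{S_0 - S_{\min}}{1 + \exp\!\bigl[k\,(x - x_T)\bigr]}

% Steepest rate of loss, reached at x = x_T (point C):
\left.\frac{dS}{dx}\right|_{x = x_T} = -\frac{k\,(S_0 - S_{\min})}{4}
```

Under this form, $S(x) \approx S_0$ when $x$ is well below $x_T$ (provided $k\,x_T$ is large), and $S(x) \to S_{\min}$ as $x$ grows. Setting $S_{\min} = 0$ corresponds to complete replacement of the native community, whereas $S_{\min} > 0$ corresponds to a tolerant subset of native species persisting at high invader abundance.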

Thresholds and management for biodiversity values
Interest in the application of thresholds to the management of weeds in agricultural systems developed as an extension of their use in managing arthropod pests in crops (Norris 1999). Fundamental differences in the biology of pest animals and weeds, in particular the existence of seed banks in the latter, meant that major differences in population dynamics had to be taken into account if thresholds were to be at all useful. The "economic threshold level" (the point at which economic losses equal the cost of control) proved to be inadequate for crop weed management, essentially because the seeds produced by plants present at sub-threshold densities would contribute to the soil seed bank and hence to the weed burden of future crops. The "economic optimum threshold" proposed by Cousens (1987) was an attempt to include the economic impacts of multiyear population dynamics. Since ultimately a decision needs to be made on when to take action against a pest, the manager will need to determine an "action threshold" (Cousens 1987, Coble and Mortensen 1992). In practice this may combine an economic threshold, a "safety threshold" (allowing a safety margin owing to uncertainty about both economics and weed-related crop losses), and a "visual threshold" (what is visually acceptable to the land manager) (Norris 1999). Given that natural ecosystems are considerably more complex than their agricultural counterparts, it could be anticipated that the determination of action thresholds in this context would be particularly challenging. That said, where the protection of biodiversity values is the objective, the problem can be framed more simply because its economic dimension can be reduced to the cost of control, particularly where damage is not sufficient to require restoration effort.
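The single-season "economic threshold level" defined above, the density at which losses equal the cost of control, can be illustrated with a short sketch. The function name `economic_threshold`, the linear loss function and all dollar values are hypothetical and chosen only to make the arithmetic easy to follow. The sketch deliberately ignores seed-bank carry-over, which is the reason given above for the inadequacy of this threshold in weed management.

```python
def economic_threshold(loss_fn, control_cost, lo=0.0, hi=1000.0, tol=1e-6):
    """Return the weed density at which loss_fn(density) equals control_cost.

    Uses bisection and assumes loss_fn is continuous and non-decreasing
    on the interval [lo, hi].
    """
    if loss_fn(lo) > control_cost or loss_fn(hi) < control_cost:
        raise ValueError("no crossing point within the search interval")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if loss_fn(mid) < control_cost:
            lo = mid   # losses still below the cost of control
        else:
            hi = mid   # losses already match or exceed the cost of control
    return (lo + hi) / 2


# Hypothetical example: each weed per square metre costs $4 per hectare in
# lost yield, and a control operation costs $50 per hectare.
def linear_loss(density):
    return 4.0 * density


threshold_density = economic_threshold(linear_loss, control_cost=50.0)
print(f"economic threshold = {threshold_density:.2f} plants per square metre")
```

Any monotone damage function can be passed as `loss_fn`, so the same routine applies to non-linear damage relationships as long as a single crossing point exists in the search interval.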
Panetta and James (1999) concluded that, regardless of the invasion context, four aspects must be considered in relation to the use of thresholds in weed management, namely: (1) the benefits provided for the system being managed; (2) damage relationships resulting from the presence of weeds; (3) weed population dynamics; and (4) the framing of risk. In general, very little information is available upon which to base predictions of the population dynamics of weeds of natural ecosystems. Furthermore, recruitment of weeds can be highly episodic, with the attendant risk that occasional rapid increases in density could make effective control much more difficult. Hiebert (1997) argued that the urgency of weed control is an important factor in prioritising weed control efforts, with urgency defined in terms of how much of an increase in effort would be required to achieve successful control should action be delayed. We will return to this point below.

Management strategy options: eradication, extirpation and maintenance control
Eradication has been defined as the elimination of every single individual (including propagules) of a species from a defined area in which recolonisation is highly unlikely (Myers et al. 1998). Where the targeted invader is widespread, extirpation (the elimination of all individuals from an area in which the possibility of recolonisation cannot be ignored in practice; Wilson et al. 2017) would be the appropriate strategy. There may be circumstances under which extirpation is both desirable and achievable, for example when a high quality asset is isolated spatially and potential pathways of recolonisation are either inactive or can be managed effectively. In most cases, however, "maintenance management" (controlling an invader to densities at which it can be tolerated; Simberloff 2003) will be the most appropriate response. Where damage functions are nonlinear, this would involve ensuring that invader densities lie within the maintenance zone (designated by point A in Box 1), i.e. below the impact threshold zone.
The concept of maintenance control for invasive plants in natural ecosystems appears to have originated in relation to the management of aquatic weeds, specifically water hyacinth (Eichhornia crassipes) in Florida during the early 1970s. Until then, management of water hyacinth had been essentially reactive (see Box 1), whereby the weed was allowed to reach problem levels before control was implemented. Among other negative effects, this management strategy resulted in severe detrital loading from controlled plants. Joyce (1985) reported that maintaining water hyacinth below 5% cover reduced annual herbicide use by more than 250%, reduced organic sedimentation by up to 400%, and also reduced depressions in dissolved oxygen. Following widespread adoption of maintenance control, relatively little management was necessary by the mid-1980s, reducing environmental and economic impacts. Recolonisation by native plants promoted the restoration of fish and wildlife habitat in many areas (Schardt 2005). The difficulty of detecting and controlling this weed at extremely low densities, the likely presence of a persistent seed bank and potential recolonisation via vegetative propagules meant that maintenance control was the most cost-effective management option.
Maintenance control in a terrestrial context was addressed by Goodall and Naudé (1998, p. 116), who defined the maintenance control phase as one "...when priority areas require low annual or biennial commitment to prevent reinfestation (less than 5% cover), or can be maintained using management practices like fire and livestock." For a range of weed life forms and control methods they demonstrated that the cost of labour for keeping weeds to such low densities was considerably less than when control was undertaken at higher densities (Fig. 1) and was the most cost-effective management option when extirpation was not feasible. Similar cost-versus-density relationships have been reported for the control of Australian Acacia and Eucalyptus species, as well as Pinus species, in South Africa (Marais et al. 2004).
The timing of maintenance control has a significant bearing upon the retention of biodiversity values. Where a maintenance control regime is commenced following control efforts targeting an invader that has achieved a high level of cover, legacy effects (Corbin and D'Antonio 2012) may occur, depending upon the length of time that the invader has been present. If the targeted species has been dominant for a long time, there is a risk that highly sensitive native species may have become extirpated (Downey and Richardson 2016). Biodiversity values will therefore be best protected if maintenance control is proactive, controlling the invasive plant at its earliest stages of invasion, rather than reactive, i.e. implemented to protect a potentially degraded asset.
Fig. 1. [...] Goodall and Naudé (1998). Maintenance control, as defined by the authors, is that undertaken at 0-5% infestation cover.

The aim of maintenance control should be to keep the cover of the targeted species within a range below the impact threshold zone over an indefinite timeframe, without the need for its eradication across the invaded range. The upper limit of this range will be determined largely by two factors, the first being ecological and the second economic. Where the biodiversity value of the asset being managed is very high, there will be a need to protect the life forms (or species) that are most sensitive to the presence of the invader, whether the relevant damage function is linear or non-linear (Fried and Panetta 2016). The second determinant will be the cost of control, which is expected to increase with the cover of the targeted species (Fig. 1). Taking into consideration the fact that managers usually operate under strong budgetary constraints, an argument can be made for approaching the problem from a primarily economic perspective (i.e. managing for a level of invader cover that is least costly to maintain) since economic and biodiversity objectives are essentially concordant. In relation to invasive species population density and economic impact, Yokomizo et al. (2009) have argued that the optimal management effort will minimise the sum of both management and impact costs. Note that if the invader is maintained at very low densities, the specific nature of the damage function will become moot.
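The idea of choosing the management effort that minimises the sum of management and impact costs can be sketched numerically. The example below is not the model of Yokomizo et al. (2009): the exponential response of cover to effort, the threshold-type damage function and every parameter value are hypothetical and serve only to show the shape of the calculation.

```python
import math


def impact_cost(cover, threshold=60.0, cost_per_point=500.0):
    """Hypothetical threshold damage function: no impact cost below the threshold cover."""
    return cost_per_point * max(0.0, cover - threshold)


def resulting_cover(effort, uncontrolled_cover=90.0, efficacy=0.05):
    """Hypothetical response: invader cover declines exponentially with effort."""
    return uncontrolled_cover * math.exp(-efficacy * effort)


def optimal_effort(efforts, cost_per_unit_effort=200.0):
    """Return the (effort, total_cost) pair with the lowest management + impact cost."""
    totals = [
        (e, cost_per_unit_effort * e + impact_cost(resulting_cover(e)))
        for e in efforts
    ]
    return min(totals, key=lambda pair: pair[1])


# Evaluate whole units of effort (e.g. person-days) from 0 to 100.
best_effort, best_total = optimal_effort(range(0, 101))
print(f"lowest total cost {best_total:.0f} at effort {best_effort}, "
      f"cover {resulting_cover(best_effort):.1f}%")
```

With a threshold-type damage function, the cheapest solution in this toy setting sits close to the threshold cover, because further effort adds management cost without reducing impact cost. A manager protecting particularly sensitive species would set the threshold parameter at the level tolerated by the most sensitive component.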
Tactics for maintenance control differ qualitatively from those employed when extirpation is the management goal. In essence this means that the exacting standards of extirpation, in particular the control of all aboveground plants and the total elimination of seeds and other propagules, can be relaxed (Table 1). While return times (i.e. intervals between consecutive search and control operations) for extirpation must be sufficiently short to prevent reproductive escape (Panetta 2007), some level of seed production would be allowable under maintenance control. For perennial species, fecundity schedules relative to age and size are of relevance, since plants generally produce the smallest number of seeds during their first reproductive event. For example, Osunkoya et al. (2012) found that under the most favourable conditions Lantana camara seedlings (10-20 cm) could produce fruits within one growing season. However, for plants growing in the understorey of an open eucalypt forest, fruit production increased markedly with plant size. Small (61-100 cm), medium (101-160 cm) and large (>160 cm) plants produced 34.8 ± 58.1 (mean ± SEM), 569 ± 27 and 1328 ± 581 fruits respectively.
Species that reproduce vegetatively warrant special consideration, since clonal growth has been shown to influence the magnitude of the impact of non-native plants on native species richness (Vilà et al. 2015). Many invasive aquatic plants proliferate through clonal growth (Barrat-Segretain 1996), but can be readily controlled by some combination of mechanical, chemical and, in some cases, biological control. In a terrestrial context, achieving effective control of plants that exhibit clonal growth can be more challenging. Perennial plant species that spread vegetatively can be difficult to manage, especially where potentially more effective control methods (e.g. herbicide application) are not permitted (Schiffleithner and Essl 2016). Even if herbicides are an important component of control tactics, control may be inadequate owing to less than optimal effectiveness of systemic herbicides (Brown and Bettink 2010, Enloe et al. 2013). For such species, care must be taken to prevent the establishment of new plants, an exception to the general guideline that larger plants should be prioritised for control (see Box 2).

Concluding remarks
In this piece we have assumed that the biodiversity values of an asset are known and that a management strategy can be formulated on the basis of this knowledge. When considering the management of widespread serious weeds on a larger scale there is a need for an understanding of the biodiversity values of different assets, as well as the urgency of control (see Hiebert 1997) relative to the degree of threat posed to biodiversity (Downey et al. 2010). The need for further research is manifold. The prevalence of non-linear damage relationships, whether these relate to biodiversity values or ecological functions, will only become apparent by sampling over a broad range of weed abundance in impact studies. The ecological processes that underpin impact threshold relationships are largely unknown and it has yet to be determined whether maintenance control to a level below the threshold really does prevent declines in native species. Finally, there is a need to obtain estimates of maintenance control costs for a range of invasive species life forms and recipient communities so that weed management decisions may be better informed.

Table 1. Tactics for extirpation versus maintenance control. Widespread invaders are generally not good candidates for extirpation because of a continued risk of re-invasion. In the context of asset protection the intensity of control required to keep the invader at maintenance levels will be significantly less than would be the case if the management goal were extirpation.

Box 2. Guidelines for maintenance control. The standard of control here is less exacting than where extirpation is the management goal, but the underlying principles are similar.

1) Maintenance control should be undertaken only if there is a commitment to continued management of a valued asset. As compared to extirpation, where there is a defined management endpoint, maintenance control aims to keep the impact of the invader at an acceptable level. Managers must be prepared to support the latter strategy indefinitely or until an equally effective and more sustainable control method (such as biological control) becomes available.
2) Control should be implemented in such a way as to minimise the likelihood of rapid increases in weed density. Disturbance resulting from control measures should be minimised. Larger individuals of the targeted species should be prioritised for control. (See (4) below).
3) Return times should be geared to the life cycle of the targeted species, with more frequent control operations for species with short pre-reproductive periods. While the need to prevent reproductive escape is less stringent for maintenance control than for extirpation, control measures should be timed so as to reduce the level of propagule production.
4) Larger plants should be prioritised for control. Not only do larger plants contribute more to total cover and thus impact, but they are more fecund than smaller plants, a proportion of which would be pre-reproductive. However, where clonal plants reproduce sexually, care should be taken to detect and control new genets if clones are difficult to manage.
5) Where travel cost is a significant component of the total cost of management, more time should be spent on site in order to detect and control a larger number of plants. Budget constraints will make it comparatively more difficult to conduct a maintenance control regime when it is relatively expensive to travel to the asset of concern. By increasing search effort (therefore detecting and controlling smaller plants) it may be possible to achieve an acceptable management outcome with greater return times.