Consistency of impact assessment protocols for non-native species

Standardized tools are needed to identify and prioritize the most harmful non-native species (NNS). A plethora of assessment protocols have been developed to evaluate the current and potential impacts of non-native species, but consistency among them has received limited attention. To estimate the consistency across impact assessment protocols, 89 specialists in biological invasions used 11 protocols to screen 57 NNS (2614 assessments). We tested if the consistency in the impact scoring across assessors, quantified as the coefficient of variation (CV), was dependent on the characteristics of the protocol, the taxonomic group and the expertise of the assessor. Mean CV across assessors was 40%, with a maximum of 223%. CV was lower for protocols with a low number of score levels, which demanded high levels of expertise, and when the assessors had greater expertise on the assessed species. The similarity among protocols with respect to the final scores was higher when the protocols considered the same impact types. We conclude that all protocols led to considerable inconsistency among assessors. In order to improve consistency, we highlight the importance of selecting assessors with high expertise, providing clear guidelines and adequate training but also deriving final decisions collaboratively by consensus.


Introduction
Coupled with the increasing evidence of adverse impacts exerted by some non-native species (NNS) on native species and ecosystems (Katsanevakis et al. 2014, Vilà et al. 2011, Vilà and Hulme 2017, there is an increasing demand for robust and userfriendly impact assessment protocols to be used by professionals with different levels of expertise and knowledge. Such protocols are needed to predict impacts of new or likely invaders as well as to assess the actual impact of established species. Scientists, environmental managers, conservationists, and policy makers are developing and implementing approaches to prevent further NNS introductions and their subsequent establishment, spread and impact. Risk analysis associated with these four main phases of the invasion process is used to inform management decisions, such as whether to eradicate or control species that arrive despite prevention efforts (Leung et al. 2012). Assessment of the realized or potential impacts of NNS is particularly important for the prioritization of management actions (Essl et al. 2011). However, the large variety of metrics adopted to measure the impacts undermines direct comparison of impacts across species, groups of taxa, localities or regions (Vilà et al. 2010). To this end, protocols to integrate and synthesize the empirical evidence of NNS impacts are needed in order to ensure a rational use of resources (McGeoch et al. 2016), or for prioritizing species for subsequent risk assessment .
Robust NNS impact protocols should ideally result in accurate and consistent impact scores for a species even if applied by different assessors, as long as they have the adequate expertise in the assessed species and context. However, despite the importance of consistency in impact protocols, we have little understanding of the patterns in consistency of impact scores across assessors and protocols, and more importantly, which factors contribute to high levels of consistency. The level of consistency in species scores across assessors may depend on the characteristics of the protocol (e.g. taxonomic and environmental scope, impact types included), but also on the available scientific evidence of impact, and the level of expertise of assessors. For instance, we may expect high consistency (i.e. low impact score variability) across assessors for well-studied species, or when all assessors have an in-depth understanding of the species under consideration.
Several international and national organizations and research groups have developed NNS protocols (Table 1). The common aspect of most of these protocols is that they allow a ranking of NNS according to the threat they pose to the risk assessment area. These have been applied for identifying and assessing potential NNS impacts at different spatial scales, e.g. continental (Nentwig et al. 2010) or national (D'hondt et al. 2015). However, these protocols differ in several aspects. For example, they vary according to their objective, with some considering only environmental impacts whereas others are broader and include socio-economic or ecosystem services impacts (Leung et al. 2012, McGeoch et al. 2016. Some protocols were designed to be taxonomically generic (e.g. GB-NNRA), whereas others are specific for the screening of certain taxonomic groups such as fish or other aquatic organisms (e.g. FISK, MI-ISK, FI-ISK, Amph-ISK, EPPO-PRI; see Table 1), particular habitats (e.g. BINPAS), or pathways (Panov et al. 2009). Moreover, the existing protocols vary considerably in complexity, such as the number of questions, the need for peer review, the use of additional software (e.g. spreadsheet or online form), the ways of assessing uncertainty , and the scoring system used, which can be categorical, ordinal or continuous . The content and structural differences among protocols could lead to differences in the assessment results (Leung et al. 2012).
A few comparative analyses have addressed differences in the structure of impact assessment protocols (Essl et al. 2011, Heikkilä 2011, Vilà et al. 2019, and on their consistency in ranking species across regions (Matthews et al. 2017). However, studies have focused on a reduced number of protocols, and a short list of species (Křivánek andPyšek 2006, Turbé et al. 2017). An in-depth comparison across taxa and across standardized protocols is missing for Europe (Essl et al. 2011), or elsewhere (Snyder et al. 2013). Such a comparison is urgently required to respond to the European legislation on invasive NNS (Regulation EU No. 1143. The aim of the present study was to test for consistency in assessment scores across assessors through comparison of several NNS impact assessment protocols. To address this aim, 89 invasive NNS specialists used 11 protocols to assess the potential impact of 57 species not native to Europe and belonging to a very large array of taxonomic groups (plants, animals, pathogens) from terrestrial to freshwater and marine environments. The specific questions considered were: 1) How consistent are species scores across assessors? 2) To what extent does consistency depend on the protocol characteristics, i.e. impact categories considered (environmental and socio-economic), structural complexity of the protocol (number of questions and scoring system)? 3) How is consistency related to the characteristics of the NSS (taxonomic group, habitat type, and available scientific knowledge of the species); 4) What is the relation between consistency and assessor expertise? 5) Do different protocols provide similar final scores or species ranking? Based on the study results, we provide recommendations on how the robustness and applicability of protocols could be improved for assessing NNS impacts.

Selection of impact assessment protocols
Eleven commonly used scientifically based protocols developed or applied in Europe for the evaluation of NNS impacts were selected for comparison by consensus in the AlienChallenge COST Action workshop in April 2014 by 36 European experts in NNS risk assessments (Rhodes, Greece) (Table 1). We included all protocols developed and officially used at national or continent level in Europe (e.g. EPPO, Harmonia+ and GB-NNRA) and the main protocols used by European research community (e.g. GISS and FISK). Only the EFSA protocol was discarded from this selection due to the complexity of extracting and processing the data. Furthermore, during the selection we aimed to cover the major types and groups of protocols in order to guarantee Table 1. Characteristics of impact assessment protocols used in the study. Each protocol is characterized in terms of the a) taxonomic group the protocol could be used for, b) the impact categories included (environmental alone or environmental and socio-economic), c) the final scoring scale (i.e. three levels, five levels, and more than 5 levels), d) whether the final score is based on the maximum score of impacts, e) whether the protocol included questions on species spread as part of a risk assessment (yes/no), f ) the number of questions contributing to the final score, and g) the mean assessor expertise on species required to fill the questionnaire (1-5 scale based on 63 online anonymous questionnaire responses). 3.50 (Narščius et al. 2012, Olenin et al. 2007, Zaiko et al. 2011 EICAT Environmental Impact Classification for Alien Taxa All Environmental 5 yes no 9 3.37 , Hawkins et al. 2015 EPPO 4.12 (Copp 2013, Panov et al. 2009, Tricarico et al. 2010 4.34 (Gederaas et al. 2012, Sandvik et al. 2013 enough variability in their characteristics. The selection does not consider risk analysis tools or updates that have become available after 2015, such as AS-ISK (Copp et al. 2016), which replaces FISK and the other -ISK toolkits and complies with the minimum standards NNS risk analysis under Regulation (EU) No 1143/2014 . Risk assessments are usually divided into four components that consider the potential for a non-native species to enter a region, establish, spread and cause impacts. The selection included impact assessment and risk assessment protocols for which we only compared the sections dealing with spread and impact as they are largely interrelated. Each protocol considers a different method to calculate the final score per species based on the responses (i.e. aggregation method): maximum impact, accumulated impact, categorization matrix or decision trees, an independent summary question, or the combination of any of the previous methods. Owing to the number of protocols used in the present study and their complexity, no attempt was made to standardize variations in score aggregation methods but rather, where possible, to account for this variability during the data analysis as covariates. Some protocols can be applied to any taxon while others are specific to particular groups or habitats (e.g. BINPAS and FISK are used only for aquatic animals, EPPO Prioritization for plants). As such, the number of protocols assessed per species varied depending on the taxonomic group (Table 1). Although all the -ISK toolkits (FISK, FI-ISK, Amph-ISK, MFISK, MI-ISK) were used for their respective taxonomic groups, in the data analyses all the versions were listed under 'FISK' because of their high similarity. For the same reason, the EPPO-EIAs for insects/pathogens and plants were listed together. Each protocol was characterized according to several variables (Table 1): the categories of impact considered (environmental alone or environmental and socio-economic), inclusion of questions on species spread (yes/no), on scoring scale (i.e. three levels, five levels and more than five levels), whether the protocol included a maximum aggregation method (i.e. the largest value of a set of values) to calculate the final score (yes/no), the number of questions requiring input from the assessors and contributing to the final score, and the expertise on the species required to complete the protocol. The latter was based on 63 responses received from an online anonymous questionnaire distributed to all assessors, which included a question asking them to rate their agreement (from 1 = disagree to 5 = fully agree) with the statement: "This protocol requires a high level of expertise on the species". Assessors answered this question for each protocol after having completed all assessments. The response values were averaged per protocol to provide a single estimate of the level of expertise required for that NNS protocol ( Table 1).

Selection of species
A total of 57 species from different taxonomic groups not native to terrestrial, freshwater, and marine environments in Europe were selected (Suppl. material 1: Table S1). Among them, only two species are native to a part of Europe (Arion vulgaris and Dreissena polymorpha). The list of species was elicited by consensus also at the Al-ien Challenge COST Action workshop in April 2014 (Rhodes, Greece). During the workshops, the experts were grouped according to their taxonomic expertise under the coordination of a taxonomic leader, in order to select a list of species covering a wide range of European climatic regions and habitat types, biological characteristics and the degree and type of impact. While some NNS were widespread, very well studied and with known impacts, some had a localized geographical distribution (Suppl. material 1: Table S1). Each NNS was assigned to a specific taxonomic group and habitat type: terrestrial plants, freshwater plants, terrestrial vertebrates, terrestrial insects, other terrestrial invertebrates, freshwater invertebrates, freshwater fish, marine species, and pathogens. The scientific knowledge available for the NNS was quantified as the number of records in the Web of Science using the accepted scientific name as a query, and biology and ecology research area as filters (retrieved in August 2016). Additionally, the mean and coefficient of variation of the assessor expertise on each species (Suppl. material 1: Table S1) was derived through a self-valuation questionnaire on each assessed NNS using the following classification: 1 = low (the assessor has not worked with the species); 2 = medium (the assessor has not published on the species but has expertise on it through surveys or reports); and 3 = high (the assessor has published on the species).

Assessment of non-native species
There is a large variation in methods to implement the different protocols; some are available as downloadable freeware (-ISK toolkits, the 'NAPRA' version of the GB-NN-RA), as online applications (e.g. Harmonia+, BINPAS), whereas some have to be constructed following the text guidelines (e.g. GISS, EICAT), and others can be obtained as spreadsheets (e.g. GB-NNRA) or databases (e.g. NGEIAAS). To harmonize use of the protocols and facilitate data retrieval, a comprehensive Excel® spreadsheet template was developed to include all the protocols (see Suppl. material 2). The resulting spreadsheet was checked by the authors or owners of each protocol to ensure that it accurately depicted the original protocol whilst matching the common-practice methodology.
Using the protocols selected in the spreadsheet template, 89 assessors independently assessed between three to 11 species (mean = 3.9) of the taxonomic group in their area of expertise (i.e. terrestrial plants, aquatic plants, terrestrial vertebrates, terrestrial insects, other terrestrial invertebrates, freshwater invertebrates, freshwater fish, marine species and pathogens) (Suppl. material 1: Table S1). All assessors were researchers with expertise in biological invasions (PhD or PhD candidate) selected among the participants of the Alien Challenge COST Action by the coordinators of each taxonomic group. The experience of the assessors with NNS impact assessments varied. Most assessors had occasionally participated in NNS risk assessments exercises (59.3%), while 19.7% had never participated and 17.5% had often participated. All NNS were assessed by a minimum of five assessors (maximum eight) (Suppl. material 1: Table S1), yielding a total of 2614 assessments. Before conducting the assessments, the assessors were required to read the impact assessment guidelines provided per protocol and ask questions directly to the protocol developers if needed. When scoring impacts, assessors were instructed to consider Europe as the risk assessment area and the likely worst-case scenario for each NNS. Based on the precautionary principle, protocols recommend scoring the potential impact of NNS based on the available information either from studies for the area of assessment, or from areas with the same invaded habitat in a similar climate. The assessors were instructed to base their assessments on all available literature, information sources and their own expertise, indicating in the assessment the source of the information. The selection of the literature used for the assessment was left at the discretion of the assessor.
Before retrieving the data, each assessment was checked for completeness. Once all NNS assessments were completed, the final scores for each assessment were extracted. To harmonize scores across protocols, all ordinal scores (i.e. protocols with three or five levels as final scoring scale; Table 1) were transformed into numeric values, with the lowest impact as 1 and the maximum as 3 or 5, respectively. Then, all scores were standardized from 0 to 1 using the following equation (S -Smin)/(Smax -Smin), where S represent the score per NNS in each assessment, and Smax and Smin, the maximum and minimum possible scores provided by the protocol .

Consistency in non-native species scoring across assessors
For each NNS and protocol (471 combinations), the mean and the coefficient of variation (CV) of the final score were calculated. The mean was used as the overall score across experts per NNS and protocol, whereas CV was used as an estimate of the consistency of scores across experts, adjusting for the mean value. First, differences in CV among all protocols were tested using a linear mixed model with protocol name as a fixed effect and species nested within taxonomic groups as random effects (i.e. random intercept model). Second, we used multimodel inference (Burnham and Anderson 2002) of linear mixed models to analyze the relationship between the CV and species characteristics (taxonomic group and available knowledge), protocol characteristics (impact categories, spread question included, final scoring scale, whether final scoring was based on maximum score, number of questions and expertise on the species required) and assessor expertise on the species (mean and coefficient of variance). In this set of models, we used the same random effects structure as in the first model but did not include protocol name as a covariate. Model residuals were checked for normality and homoscedasticity and identified the square root as the best transformation for CV. Multi-model inference, based on the all-subsets selection of predictors, was performed using the corrected Akaike's Information Criterion (AIC c ) keeping the same random effects in all model combinations. For each combination of predictors, Akaike weights (w i ) were calculated. Considering the best models given the selected predictors (ΔAIC c < 6) (Richards 2008), the relative importance w +(j) of each predictor j was estimated as the sum of the AIC c weights across all models in which the selected predictor appeared. Predictors with higher w +(j) (i.e. closer to 1) have a higher weight of evidence to explain the response variable with the given data. Finally, the average of regression coefficients weighted by w i within the subset of best models was calculated.
Differences in the mean CV among levels for the categorical variables in the best candidate model (i.e. with the smallest AIC c ) were tested for significance using a Tukey post hoc test. Prior to modelling, continuous predictors for the models above were checked for multicollinearity using Pearson correlations. All variables were selected for further analyses considering the low correlation values found (r < 0.5; Suppl. material 1: Table S2) (Dormann et al. 2013). Continuous variables were centered (deviate from the mean) and scaled (divided by standard deviation) to facilitate interpretation of model coefficients and model convergence (Schielzeth 2010). Finally, in all models explained above we accounted for the variability in the number of assessments per NNS (5 to 8; Suppl. material 1: Table S1) (i.e. sample size effect), including the number of assessments as a covariate (i.e. fixed effect).

Differences in impact assessment scoring across protocols
Similarities in the scoring of NNS across the different protocols were compared using hierarchical cluster analyses. Cluster analyses of the mean scores per NNS and protocol (calculations described above) were performed using Spearman's correlation coefficient as a similarity measure and the complete linkage method (i.e. maximum distance between clusters). Using this method, we first carried out a cluster analysis of all NNS across the six protocols common to all taxonomic groups (i.e. GABLIS, GB-NNRA, EICAT, Harmonia+, GISS and NGEIAAS). Then, separate analyses were also performed for four subsets of NNS with common protocols: 1) aquatic and terrestrial plants, 2) aquatic animals (combining freshwater invertebrates, freshwater fish, and marine invertebrates), 3) terrestrial invertebrates (terrestrial insects and other terrestrial invertebrates), and 4) terrestrial vertebrates (Suppl. material 1: Table S1). Pathogens were not included in this analysis due to the low number (n = 3) of species tested. Prior to these analyses in order to account for the variability in the number of assessments per NNS (five to eight; Suppl. material 1: Table S1) (i.e. sample size effect), we calculated the Pearson's correlation between the mean score per NSS and protocol and the number of assessments performed for all groups of species indicated above. When the correlation was significant for a group of species (p < 0.05) we used simple linear regression models to relate the mean score with the number of assessments per species and used the model's residuals in subsequent hierarchical analyses. We followed this approach only for plants and aquatic animals based on the significant correlation found (Plants r: -0.17, p < 0.05; Aquatic animals r: -0.17, p < 0.05). Results without this correction were similar, reinforcing the robustness of the results (Suppl. material 1: Fig. S2). All statistical analyses and figures were carried out in R v3.4.1 (R Core Team 2017) using packages lme4, lsmeans, MuMIn and sjPlot to implement and plot mixed models and gplots for the correlation heat maps and dendrograms.

Consistency across assessors
The mean coefficient of variation (CV) of assessor scores per NNS and protocol was 40% (± 37% SD), with 10% (n = 470) showing complete agreement (CV = 0) among assessors but with maximum variability being 223% (four species in ISEIA: Aedes albopictus, Arion vulgaris, Australoheros facetus and Fascioloides magna; two species in EPPO EIA: Diabrotica virgifera and Tuta absoluta). CV was remarkably different among protocols (Fig. 1). ISEIA, EPPO-EIA and Harmonia+ protocols had the highest CV, whereas NGEIAAS and GABLIS protocols showed the lowest values. CV across assessors was better explained by protocol characteristics than by NNS characteristics (Table 2). Scoring scale, expertise required and the use of maximum impact score were the variables with the highest weight of evidence.
According to Tukey post hoc tests in the best candidate model, protocols using three score levels had significantly lower CV than the protocols using scales with five levels (difference = 0.25, p < 0.001) or more than five levels (difference = 0.29, p < 0.001). However, protocols with five score levels were similar to pro-Figure 1. Coefficient of variation (CV) of species scoring across assessors per impact assessment protocol based on linear mixed models controlling for taxonomic group and species as nested random effects and number of assessments per species as fixed effects. Protocols with the same letters above the graph are not significantly different (p < 0.05; Tukey test). Dots indicate the least squares means per protocol. Lines indicate the confidence interval (95%) around the means.

Figure 2.
Mean regression coefficient and confidence interval (95%) of taxonomic groups (random effects) in the best linear mixed model explaining the coefficient of variation of scores of 57 invasive non-native species for 11 different protocols including all significant species, assessor and protocol characteristics (see Table 2) . tocols with more than five levels (p = 0.27). CV across assessors was significantly lower for protocols that required higher expertise than those for which low expertise was required ( Table 2). The expertise required per protocol was highly correlated to the overall number of fields in the protocol (i.e. questions, comments, uncertainty and results; Pearson's r = 0.9) but less with the number of questions actually contributing to the final score calculation (r = 0.5; Suppl. material 1: Table S2). Protocols using the maximum impact score yielded lower CV values. In terms of protocol content, CV was higher when protocols included a NNS spread module but there was no difference depending on the impact types considered ( Table 2). The number of questions contributing to final score and impact categories considered did not show significant relations to CV (Table 2). Among NNS and assessor characteristics, only the mean of assessor expertise on each NNS showed a significant negative relationship with CV values (Table 2). Finally, there were some differences in CV among taxonomic groups (Fig. 2). Although not significant, terrestrial vertebrates, terrestrial plants, pathogens and freshwater invertebrates tended to show lower CVs whereas higher values were found for terrestrial insects, other terrestrial invertebrates and freshwater plants. Only terrestrial insects and freshwater plants showed a significantly higher CV than the average across all taxa (Fig. 2).

Consistency across protocols
The pair-wise correlations in NNS scores among the six protocols common to all taxa were highly diverse (min-max = 0.16-0.77; mean = 0.55), indicating low consistency in species scores among some protocols (Fig. 3). With respect to taxonomic groups, aquatic animals had the highest mean correlation among protocols, terrestrial invertebrates and plants showed an equally low mean correlation, and terrestrial vertebrates had the lowest correlation levels (Fig. 4). These correlations remained similar when considering only the protocols common to all three taxonomic groups (Suppl. material 1: Fig. S1) and without sample size correction (Suppl. material 1: Fig. S2). Cluster analysis identified two main groups (Fig. 3, Suppl. material 1: Fig. S1): protocols that include only environmental impacts (NGEIAAS, GABLIS, and EICAT) and protocols that include both environmental and socio-economic impacts (GB-NNRA, GISS and Harmonia+). The scorings of Harmonia+ were clearly distinct from the other protocols (indicated by lower correlation values), particularly for plants and terrestrial invertebrates (Figs 3, 4). Similarly, FISK and GABLIS showed relatively low correlation values with the other protocols for aquatic animals and terrestrial vertebrates, respectively (Fig. 4).

Discussion
The comparison of impact assessment protocols for NNS shows that scoring variability across assessors can be substantial, depending on the taxonomic group con- Table 2. Average coefficient and Akaike weights for each species, assessor and protocol variable within the best linear mixed models (AIC c < 6) explaining the coefficient of variation of the scores of 57 non-native species in 11 impact assessment protocols. Taxonomic groups and species identification were included as nested random effect. Predictors with weight closer to one have a higher relative importance to explain the response variable. Variables with weight equals zero were not included in the best subset of models to calculate average coefficients.  sidered and the scoring system. However, there is potential to reduce this variability by considering the expertise of the assessors and optimizing structural characteristics of the protocol. Furthermore, the ranking of NNS based on the protocol scoring can differ depending on the approach implemented, mainly based on the impact category type considered (i.e. whether socio-economic impacts are included). Thus, the selection of the scoring approach can have important consequences on the final ranking of NNS produced.

Consistency across assessors and across taxonomic groups
Scoring consistency across assessors and for some taxonomic groups was surprisingly low. It is not clear why these large discrepancies occurred even when the assessors were experts in invasion biology within their taxonomic domain. Many factors can influence the interpretations of context dependence found in the scientific literature, which can lead to subjective and inconsistent answers even amongst expert assessors (Gilovich et al. 2002). Heuristics and bias, including intuitive strategies to process information, can lead to variability in expert responses (O'Hagan et al. 2006). For example, experts might score the impact according to the studies with which they feel most familiar (e.g. conducted by colleagues in their region). Similarly, if there is a lack of information on the impacts for a NNS, then the judgement might be biased towards a NNS of the same taxonomic lineage. Alternatively, inconsistencies might be due to inherent uncertainty. For instance, a greater inconsistency for most groups of aquatic taxa may reflect a higher difficulty in determining impacts than for taxa in other environments (Molnar et al. 2008). Finally, these biases could be balanced by anchoring effects where most assessors might assign intermediate levels of impact when there is insufficient information to fulfil the protocol requests. Part of the variability in consistency was explained by protocol characteristics and the approaches implemented. Protocols with three score levels were more likely to show consistency among assessors than those with five or more levels. However, a three-category scoring system might not be sufficient to discriminate between NNS impacts or magnitude of impacts and rank NNS for prioritisation, because too many species will have the same score. Protocols that select the highest impact among different categories provided higher consistency. By definition, this approach will homogenise the scores towards higher values discarding inconsistencies from less important impacts in a way that results will be more conservative.
Protocols containing questions that required greater expertise on the species yielded higher scoring consistency than simpler protocols. Protocols requiring greater expertise demanded very detailed information about the species (e.g. expected population lifetime in NGEIAAS) that, when available, is very likely to be available only in few studies. Owing to the restricted number of sources of information, the variability in the final score might be low. Complex protocols might be less user-friendly and more time-consuming, but this in itself could increase focus and decrease subjectivity. Exceptions exist, e.g. the -ISK screening (Copp 2013), whereby the protocol is easy to use but the 49 questions require more time to answer than simpler tools such as ISEIA, which has only 12 questions. However, the questions from simple tools such as ISEIA focus mainly on impacts, whereas the -ISK screening tools include a much broader range of questions, such as invasion history, species traits and susceptibility to management measures. The balance between ease of use and time spent is critical as some protocols are meant to be used for the rapid screening of a NNS, whereas others provide more in-depth assessments. For example, NGEIAAS was designed for professional experts who carry out very detailed risk assessments on behalf of government authorities (Gederaas et al. 2012, Sandvik et al. 2013. This issue highlights that although we only selected impact and spread related sections, the present study compares tools intended for different phases of the risk analysis process, i.e. risk identification (e.g. ISEIA, -ISK screening tools), risk assessment (e.g. GB-NNRA, Harmonia+) and impact assessment (e.g. GISS, EICAT). Further studies could look into a detailed comparison across all phases of the risk analysis process in order to highlight those sections that might require improvement.
Regarding assessor and NNS characteristics, the only factor that significantly increased consistency among assessors was their level of expertise with the assessed species. Assessors that had previous experience with the NNS assessed may have had similar high levels of knowledge on that NNS, and this may have led to similar scores. Nevertheless, this situation is infrequent as NNS assessments are more commonly undertaken by persons familiar with the taxonomic group but not necessarily with the NNS being assessed (e.g. NNS not yet present or still rare in the study area). Unexpectedly, consistency was not related to the availability of information about the species (i.e. higher number of WoS records). The simplest explanation is that the number of studies available does not necessarily indicate more studies relevant for impact assessments as the literature on these species could be linked to other research fields in invasion biology not directly associated with their environmental or socioeconomic impacts. It is also relevant to note that different assessors might have had access to different information sources, particularly non-English literature and reports. This might have affected consistency results but we followed standard practices for NNS risk assessments. Further studies could look at these differences providing a base information for the species to be assessed.
The high inconsistency found among assessor's scores raises high concerns and suggests that assessments conducted by single assessors should be interpreted with caution (Pheloung et al. 1999, Cousens 2008. Expert working group scoring, the use of consensus techniques and reviewing processes can inform the responses of single assessors and therefore reduce uncertainty Burgman 2015, Vanderhoeven et al. 2017). For NNS lacking information or contrasting data, structured elicitation techniques, such as the Delphi approach, which is based on a feedback and revision process (Mukherjee et al. 2015), can identify and reduce potential sources of bias among experts (Morgan 2014, Sutherland andBurgman 2015). In practice, risk assessments for NNS, in particular those carried out in the plant health sector, are usually done either by groups of experts, as in EPPO pest risk assessment, or using an independent peer reviewer and an editorial-board type vetting procedure, such as in Great Britain (Baker et al. 2008, Booy et al. 2012. The consensus approach is used for plants and plant pests because those assessments are likely to be used in international trade agreements in order to demonstrate robustness (Schrader et al. 2010). However, national or regional impact risk assessments of NNS for blacklists or prioritization purposes are often based on the judgement of a few or single experts. Thus, efforts should be made to involve a panel of experts in the species or the system following elicitation techniques.

Differences across protocols
Variations among protocols in species scoring are mainly due to the inclusion, or not, of socio-economic impacts. Although socio-economic and environmental impacts are generally correlated (Kumschick et al. 2015a, Vilà et al. 2010, it is almost impossible to predict the magnitude of one impact from the other . For example, many NNS, such as agricultural pests and organisms affecting human health, exclusively cause socio-economic impacts Branco 2010, Kumschick et al. 2015b) and, thus, using protocols that include such impacts will affect the impact ranking of NNS under consideration. Furthermore, the perception of socio-economic impacts is likely to vary across stakeholders. Thus, depending on the target audience and objectives of the assessment, different protocols may be used, focusing either on environmental or socio-economic impacts or both together. The majority of the protocols exclusively considered environmental impacts, and there was greater correlation in scores among these protocols. However, the difference between scores was dependent on the taxonomic group under consideration. Ranking of species completely shifted (negative correlation of scores across protocols) when different impact categories were considered for terrestrial vertebrates and plants, but the difference was lower for aquatic animals. This pattern might be due to differences in the relevance of impacts across taxa, with terrestrial vertebrates showing highly contrasting impact types for single species (e.g. high economic impact but low environmental impact) (Vilà et al. 2010). However, differences in scores among taxonomic groups might again also simply reflect differences in the knowledge of their impacts. Impacts of terrestrial vertebrates or plants might be better known than those of aquatic organisms. Testing this hypothesis requires comparing uncertainty scores provided by experts across impact types and taxonomic groups, which could be done with the current dataset in further studies.
Among all protocols, Harmonia+, FISK and GABLIS led to very different scores in comparison to the other protocols. This difference was partly related to the different impact categories considered but also to the inclusion of questions beyond impact (e.g. management in GABLIS and FISK). Finally, the GB-NNRA protocol showed a variable relation with other protocols across taxa: low correlation with protocols only considering environmental impacts for plants and terrestrial invertebrates but high for vertebrates. The final score in the GB-NNRA was not automatically calculated as in the other protocols. Instead, assessors were asked to provide overall summary scores and confidence rankings for the NNS based on the answers provided in previous sections, which include questions that consider both environmental and socio-economic impacts (Baker et al. 2008, Mumford et al. 2010. This approach could have led to the not consistent relation between the GB-NNRA protocol and the others. However, when used as part of the GB risk analysis process (Booy et al. 2012), it aids the NNS risk analysis panel to identify inconsistencies between the assessor's individual question responses and their overall scores and confidence levels (Mumford et al. 2010).

Recommendations for NNS impact assessments
Several key factors should be taken into account when selecting or designing a NNS risk assessment protocol, such as the aim, the scope, the consistency and the accuracy of the outcomes, and the resources available to perform the assessment (e.g. time or information). As a first step, the suitability of a NNS risk assessment protocol will depend on the scope and aim of the assessment. For instance, if a NNS is already present in the region of interest, assessments on likelihood of entry and establishment are less meaningful than just the assessment of impact. Protocols with different scopes may produce different results in terms of NNS rankings (Lazzaro et al. 2016). As we have shown, even just considering different types of impacts could result in large differences in rankings. Thus, it is crucial not to mix the results from assessment methods that consider different impacts or phases of the invasion process. Furthermore, our results show that even if the focus is only on impact and spread sections, the choice of the protocol is critical because the scoring consistency will depend on the characteristics of the protocol. Three main factors were responsible for these inconsistencies, the choice of the scoring scale, how the final score is summarized and the general expertise required to use the protocol. We recommend using a 5-level scoring, maximum aggregation method and moderate expertise requirements as a good compromise to reduce inconsistency without losing discriminatory power or usability. In general, we also advise protocol developers to perform sensibility tests of consistency before final release or adoption (e.g. Pheloung et al. 1999). This is crucial because if a protocol yields inconsistent outcomes when used by different assessors, then it is likely that decisions taken based on the results could be variable and disproportionate to the actual impacts .
Part of the inconsistency might also come from the way the protocol is used in practice (e.g. standardized forms, clear guidelines, selection of assessors, individual vs. group assessments). We propose three main ways to reduce this type of inconsistency. First, irrespectively of the protocol, selecting a group of assessors with high expertise will yield more consistent results. Second, inconsistencies due to linguistic uncertainties (e.g. definitions, formulations, rating) can be reduced by improving the guidelines and with adequate training of the assessors (Vilà et al. 2019). Third, other studies have suggested using expert elicitation methods to reduce inconsistencies (Morgan 2014, Sutherland andBurgman 2015), such as consensus building (Mukherjee et al. 2015) or quality control mechanisms (e.g. peer-review panels). Elicitation methods can reveal whether differences in scoring outcomes between and within protocols reflect true differences in opinion, lack of evidence, or subjective biases due to protocol interpretation . In fact, scientific consensus and robust revisions are crucial for policy implementation ). Finally, there will always be inconsistencies due to knowledge gaps and subjectivity in the interpretation of the scientific results when there is high context dependency. This might not be a problem in providing a sound evidencebase for decisions on NNS as long as protocols are used transparently and uncertainties are explicitly dealt with through appropriate methods   Figure S1: hierarchical cluster of the species scores for the six protocols common to all taxonomic groups. Figure S2: hierarchical cluster of the species scorings for plants and aquatic animals without correcting for sample size bias. Table S1: list of non-native species. Does invasive species research use more militaristic language than other ecology and conservation biology literature?

Abstract
Invasive species research has been criticised for a reliance on hyperbolic or sensationalistic language, including the use of militaristic language that dates to the popularisation of this concept. We sought to evaluate whether the invasive species literature used more militaristic language than other literature across the fields of ecology and conservation biology, given that many research areas in these fields (e.g. competition) may routinely use militaristic language. We compared militaristic language use in journal articles on invasive species or other topics across both applied and basic science journals in the fields of ecology and conservation biology. We further restricted our study to papers where lead-authors were located at institutions in the United States, to evaluate whether militaristic language use varied over peace time and conflict periods for this country. We found no significant differences in the percentage of journal articles that used any militaristic language between either invasive species research or research on other topics, but we did find that invasive species research used a greater frequency (count) of militaristic language per article than research on other topics. We also found that basic rather than applied science journals were more likely to use militaristic language and we detected no significant effect of time period on the usage of militaristic language in the ecology and conservation biology literature. Researchers working on invasive species should continue to be conscientious about their language use on this occasionally controversial topic, particularly in basic science journals.

Keywords
Alien species, animal behaviour, competition, exotic species, introduced species, literature review, natural enemies, natural defences, non-native species

Introduction
Invasive species are generally defined as organisms that have been introduced from their native range to new regions of the world by human actions and either: 1) spread rapidly or widely where non-native (Richardson et al. 2000 or 2) result in some harm or negative effect to native species, ecosystems or human society (Parker et al. 1999, Young andLarson 2011). Invasive species have emerged as a major area of research and management interest over recent decades in fields like ecology and conservation biology, as these biological invasions have accelerated concurrent with increasing human mobility and global trade (Seebens et al. 2017). For example, invasive species have been routinely attributed as a driver of biodiversity declines and the extinction crisis (Clavero and García-Berthou 2005) and have further been identified as causing impacts to human health and economies (Pimentel et al. 2005). Yet research on, and management of, biological invasions has occasionally been controversial both within scientific fields , Simberloff 2011, as well as between scientists or managers and members of the public who may favour some non-native species identified as invasive (Loss and Marra 2018) or object to particular management strategies for invasive species (Bremner and Park 2007). One critique of invasion biology is that this research area lacks objectivity, reflected in part by language use by scientists or managers working on invasive species that may be militaristic, nativist, sensational or xenophobic (Simberloff 2003, Verbrugge et al. 2016. The identification of this research area as potentially militaristic is especially clear given the names "invasion" biology and "invasive" species, which may imply related terms like "attack" or "assault" to many readers. Further, although recognition of introduced species as a phenomenon dates back to at least the 16 th century (Chew 2011, Kowarik and, the field of invasion biology was popularised by Elton (1958), largely in response to his own work managing animals and plants impacting the British economy and food supply during World War II. Some researchers, working on such species, have favoured other terminology like "alien", "non-indigenous" or "exotic", but these words lack either the implication of negative impact, or spread, that distinguishes an invasive species from the greater proportion of non-native or introduced species that do not cause some identified harm (Jeschke 2014) or remain localised after introduction (Richardson et al. 2000. As such, "invasive species" and related terms have persisted as common or even dominant phrases for this phenomenon, while continuing to attract critiques for their militaristic implications (Larson 2005).
Beyond the use of invasive or invasion with reference to non-native species that cause harm or spread rapidly and widely, other militaristic language has been observed to appear in the literature on these organisms, ranging from descriptors of ecological effects that might include "attack" to descriptors of management interventions that might include "combat" (Larson 2005). For example, Larson (2008) reviewed 166 articles that were published between 1999 and 2003 in the journal Biological Invasions and reported 36 instances of militaristic language use within these papers. Yet many areas of general interest in ecology -such as competition, predation or successionregularly include language or metaphor use with militaristic associations, irrespective of a focus on invasive species. As examples, animal behaviour literature on competitive interactions between individuals, populations or species frequently uses terms like "fight", independent of any focus on biological invasions (e.g. Arnott and Elwood 2009), whereas plant ecology literature often refers to natural "defences" against natural "enemies" (Palo andRobbins 1991, Landis et al. 2000). If invasion biology is to be specifically criticised for its reliance on militaristic language and related consequences (see discussion), this might ideally be based on evidence for greater use of such terminology than the broader fields of ecology or conservation biology as a whole.
We expand on past studies like that of Larson (2008) by asking whether invasive species literature is more likely than research in ecology or conservation biology in general to use militaristic language (other than "invasive" or "invasion"). We documented militaristic language use in a random sample of research articles about either invasive species or other topics in both applied and basic ecology journals for two decades spanning periods of peace (1992)(1993)(1994)(1995)(1996)(1997)(1998)(1999)(2000)(2001) and conflict (2002-2011) for our focal country (the United States). We expected to find more militaristic language use in invasive species papers than for other topics in ecology or conservation biology, given this research area's historic association with wartime impacts (e.g. Elton 1958) and previous criticisms on this language use in invasive species literature (Simberloff 2003, Larson 2005. We also anticipated that applied ecology journals might be more likely to use militaristic language than basic ecology journals, given that applied management actions or conservation goals might lend themselves to such militaristic language or metaphors (Larson 2008, Verbrugge et al. 2016). Finally, we were curious to determine if the cultural context of an active and prolonged period of military conflict might contribute to an increased use of militaristic language in invasive species literature in comparison to a relatively more peaceful preceding decade. Cumulatively, our study provides empirical evidence for the frequency of militaristic language use in invasive species literature in comparison to the fields of ecology and conservation in general across a variety of journals and over a recent two-decade time period.

Methods
We compared militaristic language use between research papers on invasive species to those on other topics in ecology and conservation biology for a series of applied and basic science journals over two time periods (peace and conflict). We used American Naturalist, Ecology and Journal of Ecology as basic science journals in our study; each of these journals does occasionally publish articles on applied topics, but were anticipated to be less likely to do so than journals dedicated to applied science in these fields, particularly when published by the same scientific societies (i.e. Ecological Applications). We used Biological Conservation, Conservation Biology and Ecological Applications as applied science journals in these fields; we did not include some relevant applied science journals like Biological Invasions because they did not exist over the entire duration of our study period. We chose these six journals because they are leading journals in the fields of ecology and conservation biology that publish invasive species research and existed for the duration of our study interval.
We restricted our study to papers where the lead authors were located at institutions in the United States, in order to make a comparison of militaristic language use between a relatively peaceful time period for this country following the end of the Cold War (1992)(1993)(1994)(1995)(1996)(1997)(1998)(1999)(2000)(2001) in contrast to an era of heightened military conflict (2002)(2003)(2004)(2005)(2006)(2007)(2008)(2009)(2010)(2011), during which the United States engaged in major wars in Afghanistan and Iraq. We used the institutional address of the first author with the expectation that this was the person most responsible for writing the manuscript (Weltzin et al. 2006) and because the fields of ecology and conservation biology have had inconsistent interpretations of the meaning of corresponding or last author over time and relative to some other fields, where these designations imply seniority or primary responsibility for the work (Duffy 2017).
For each of the six journals included in our study, we sought to quantify militaristic language use in 10 invasive species articles published in the peace time period and 10 invasive species articles published during the conflict period, as well as in 10 articles on other (non-invasive species) topics published in the peace time period and 10 on other topics published during the conflict period. We selected articles for inclusion in this study using ISI Web of Science's (WoS) Core Collection, restricting searches to the relevant publication titles and time periods (above). We identified invasive species papers as those where the terms "invasive", "exotic", "alien" or "non-indigenous" appeared in a WoS topic search and identified all other papers as other topics. We confirmed these classifications during our quantification of militaristic language use in each scientific article (below). We omitted any articles where the address of the lead author was not at an institution in the United States per WoS. We further used WoS to omit any articles classified as opinion, editorials or reviews, restricting our analysis to primary research articles.
Upon compiling a list of candidate articles across our six focal journals for each time period (peace and conflict) and topic (invasive species or other), we used a random number generator to choose 10 from this list for each time, topic and journal combination, summing to 40 total articles considered for each journal. As exceptions, the Journal of Ecology published only two papers on invasive species, and the American Naturalist published only one, during the peace time period. As a consequence, our dataset consisted of 223 (rather than 240) articles analysed for frequency of use of militaristic language (Suppl. material 1). We downloaded the randomised articles (above) as PDFs and confirmed that they were either about invasive species or other topics, replacing any articles that were misclassified with the next available article on our randomised list for the appropriate time period and topic area.
We then quantified the frequency of militaristic language use in the PDF of each individual article by searching for and counting the following terms: army, attack, combat, defeat, defence, enemy, fight, war, weapon, win and victory. We also searched for variations of the preceding words by tense, parts of speech or plural forms (e.g. fight, fighting etc.). We developed our word list based on militaristic language identified in past research on this topic (e.g. Simberloff 2003, Larson 2008, Verbrugge et al. 2016) and iteratively revised the list to include other words we encountered during our data extraction. We omitted proper nouns and words in the abstract, title or reference section from our count, instead including only militaristic language that occurred in the main text of articles. We did not count or include "invasive" or "invasion" in our analysis, assuming that these terms would certainly be more common in invasive species literature than other topic areas, focusing instead on the frequency of other militaristic language use within the fields of ecology and conservation biology (Larson 2005, Larson 2008).
We first tested whether the percentage of articles using any militaristic language (≥ 1 word) was different between papers on invasive species or other topics across all six journals together (n=6) or within the applied or basic science categories (n=3) using paired t-tests in SigmaPlot 14.0. We did not make this comparison between time periods owing to low replication of invasive species articles for some journals during the peace time period. We next sought to determine if the frequency (count) of militaristic words in articles varied by research topic, journal type and time period. We analysed this zero-inflated count data using generalised linear mixed models (glmm) with the glmmTMB package in R version 3.4.2. We used mixed models in order to include topic (invasive species or other), time period (peace or conflict) and journal type (applied or basic) as fixed effects, while including journal identity (i.e. American Naturalist, Ecological Applications etc.) as a random effect to account for variation between individual journals not included by the broader journal type category (applied or basic). We considered a number of models with different error distributions (e.g. Poisson, negative binomial etc.) and compared model performance with Akaike's Information Criterion (AIC). We found that negative binomial zero-inflated models best fit this dataset and used these models for our primary analysis.

Results
We found that a minority of research articles in the fields of ecology or conservation biology used militaristic language, as 67 of 223 (30.0%) scientific papers we searched used any of the terms or their variants included in our study. Invasive species articles in basic science journals were most likely to use any militaristic language (18 of 43 articles, 41.9%), articles on other topics in applied science journals were least likely to use any militaristic language (8 of 60 articles, 13.3%) and articles on other topics in basic science journals (21 of 60, 35.0%) or articles on invasive species in applied science journals (20 of 60, 33.3%) were intermediate. Invasive species articles in the basic science journals Ecology (55.0%) and American Naturalist (45.5%) and the applied science journal Biological Conservation (50.0%) had the highest incidence of militaristic language use, whereas articles on other topics in the applied science journals Biological Conservation Figure 1. The percentage of articles published in three applied and three basic journals in the fields of ecology and conservation biology using any (≥ 1) militaristic language over our study period in papers either on invasive species or other topics. Each journal by topic (invasive or other) combination had 20 articles evaluated with the exception of invasive species articles for both American Naturalist (11) and Journal of Ecology (12), which had few eligible articles on invasive species prior to 2001 (see main text). and Ecological Applications had the lowest incidence of militaristic language use (10% each; Figure 1). However, there was no significant difference in the percentage of articles using any militaristic language by paired t-tests, with all three P-values ≥ 0.12.
When testing frequency (counts) of militaristic word use in articles, we found that scientific papers in basic science journals and on the topic of invasive species were significantly more likely to use militaristic language than those articles in applied science journals or on other topics (Table 1). However, the time period (peace or conflict) had no effect on word counts of militaristic language within articles. Similar to the percentage of articles using any militaristic language (above), counts or frequencies of militaristic language words within articles were highly skewed or zero-inflated, with many articles having no use of this language, but some articles have very frequent use (Figure 2).
The most common militaristic language across these 223 articles were variants of attack/attacking (found in 14.3% of articles), defence/defends (found in 13.5% of articles) and enemy/enemies (found in 9.9% of articles). Terms like combat/combatting (2.2%), fight/fighting/fought (1.8%) and battle/battling (0.9%) were relatively rare and some terms like victory/victorious or war were never used in the 223 articles considered.

Discussion
The study and management of invasive species have been partially linked with militaristic context and language use since their popularisation by Elton (1958), yet this relationship exposes researchers or managers concerned with invasive species to accusations of bias or subjectivity (Larson 2005, Larson 2008). Such language could even Table 1. Results of a negative binomial zero-inflated mixed model (glmmTMB package, R Version 3.4.2) on frequency (counts) of militaristic language use in scientific articles in the fields of ecology and conservation biology. These mixed models included time period (peace time or conflict), journal type (applied or basic science) and topic (invasive species or other topics) and also included journal identity (e.g. American Naturalist, Ecological Applications etc.) as a random effect.

Predictor
Estimate ( reduce support for invasive species research or management because of the "boomerang effect", in which some extreme persuasive language can have the opposite effect on its audience (Byrne and Hart 2009). Suggestions to use alternative terms for "invasive" or "invasion" have not been universally adopted, as many scientists and resource managers working on this topic prefer the implication of either impact or spread that differentiates invasive species from the majority of introduced or non-native species. However, if some use of "invasive" or "invasion" is inevitable in this literature, must other militaristic language be similarly inevitable? Larson (2005) gives vivid examples of scientists writing with militaristic metaphors about impacts of, or management responses to, invasive species and this same author later documented the frequency of such militaristic language use in the journal Biological Invasions (Larson 2008). However, because some militaristic language is widely used by certain concepts in the field of ecology (e.g. fights between individuals, natural defences, natural enemies), we sought to determine whether use of militaristic language was greater specifically in the invasive species literature relative to research on other topics in ecology and the related field of conservation biology. We found that a minority of papers in ecology or conservation biology used any militaristic language (≥ 1 word use per paper), but this use was heterogeneously distributed, with papers in some journals using this terminology frequently and other publications using this terminology rarely. However, we found no significant difference between the percentage of journal articles using any militaristic language in invasive species research relative to other topic areas. We did, however, find that the frequency (count) of militaristic language use was significantly higher in both invasive species papers and in papers published in basic, rather than applied, science journals. We also evaluated whether the time period (peace or conflict) affected militaristic language use, restricting our study to papers lead-authored at institutions in the United States and defining a recent peace time and conflict period. We found no effect of time period on frequency of militaristic language use across these papers, suggesting that researchers in ecology and conservation biology were not necessarily influenced by cultural or historical context with respect to use of militaristic language.
It is perhaps not surprising that invasive species literature uses more frequent militaristic language than other research areas in ecology or conservation biology, given past work and criticism on this topic (Simberloff 2003, Larson 2005, Larson 2008, Verbrugge et al. 2016. We do caution, however, that our contrast of invasive species literature to the broader ecology or conservation biology literature did not match journal articles by specific topic or theme. Invasive species papers may be more likely to be on topics like competition where militaristic language is common, whereas broader ecology or conservation biology literature may include many topics -like ecosystem ecology -where militaristic language might be less common. For example, our papers with the most frequent use of militaristic language were often about competitive behavioural interactions between animals (McGlynn 1999) or about plant defences to enemies as contrasted with interactions with mutualists (Swope and Parker 2010). A future study could more closely match papers on invasive species to papers on other topics in similar research areas using tools like propensity-score matching (i.e. Caliendo and Kopeining 2008) or restrict the overall analysis to narrower themes or areas in ecology (e.g. competition).
We were surprised to find militaristic language use more common in basic than applied science journals, which ran counter to our prediction that management or conservation-focused articles in applied science journals might be most likely to use militaristic language. Instead, the type of theory-driven research appearing in basic science journals in ecology (i.e. McGlynn 1999, Swope and Parker 2010) is apparently more likely to rely on militaristic language, defying our initial prediction. We propose that researchers working primarily in basic science in ecology might be less aware of controversies around language use in this field than applied researchers and, accordingly, more likely to use militaristic language. However, this hypothesis would require additional study, perhaps by conducting qualitative social science interviews with basic and applied ecologists (e.g. Selge et al. 2011).
We recognise that different methods or approaches could be applied to our research question in future work. For example, we developed a word list of militaristic language that we then manually searched for in a random sample of scientific articles. First, our word list is not necessarily exhaustive and could arguably be expanded to include some other terms, or be criticised for our interpretation of included terms as militaristic. Despite the somewhat ad hoc nature of our militaristic word list, we still found some significant differences in language between invasive species and other journal articles, as well as between applied and basic science journals, when analysing word frequency or count. Second, we restricted ourselves to manually searching a random 223 journal articles (of an intended 240), but machine-learning textual analyses could facilitate a much broader search of ecology and conservation biology literature for our question (e.g. Boiy andMoens 2009, Cheng et al. 2018). Finally, our study analysed primary research articles only, but militaristic language use by invasive species researchers or managers could be more common in opinion or editorial writing within scientific journals, or in communications to policy-makers or to the public (Verbrugge et al. 2016). We found militaristic language use to be relatively rare in invasive species research in particular and ecology or conservation biology research in general, but these scientific articles may not reflect the most common or most problematic uses of militaristic language in communicating about invasive species and other topics in ecology.

Conclusions
Whether communicating to other scientists, policy-makers or the general public, researchers working in the fields of ecology and conservation biology often seek to use metaphors and other literary devices to make their writing more accessible and engaging (e.g. Wood-Charlson et al. 2015). Yet some such language use in scientific writing can reflect biases or assumptions of the authors or cause negative reactions or criticisms from the audience (Simberloff 2003, Larson 2005, Larson 2008). For example, we find the term "invasive" useful for distinguishing between introduced species that do or do not cause harm where non-native, or spread rapidly or widely after introduction, while simultaneously finding some sensationalistic language like "war" on invasive species to be hyperbolic or distasteful. As a result, we were curious to know how dependent invasive species literature was on militaristic language relative to the broader fields of ecology and conservation biology. We found that invasive species research does use significantly more militaristic language when present in an article than other research in ecology or conservation biology and that basic science in ecology makes more frequent use of militaristic language than applied science or conservation biology. Researchers in ecology and conservation biology should be conscientious about their choice of language in reporting their research, and this recommendation may be particularly urgent for basic science on the topic of invasive species.

Efficacy of Kamona strain Deladenus siricidicola nematodes for biological control of Sirex noctilio in North America and hybridisation with invasive conspecifics Introduction
Invasive species management often employs biological control agents, such as predators, parasitoids or disease organisms, to slow invasive population growth. Many factors should be taken into consideration when evaluating biological control agents, including potential interactions with closely-related organisms which become sympatric upon introduction. Competition amongst closely-related organisms could affect the long-term success of the agent (Messing et al. 2006, Guzmán et al. 2016, Cebolla et al. 2018), but unpredictable and potentially irreversible effects may also occur through hybridisation (Szűcs et al. 2011, Havill et al. 2012. Hybridisation between introduced and native biological control organisms could have effects ranging from inhibition, to no impact, to enhancement of control, but direct links between hybridisation and control efficacy have not been demonstrated. We confirmed intraspecific hybridisation between two strains of an insect-parasitic nematode and investigated the efficacy of infection and sterilisation of the invasive target insect. The Eurasian woodwasp, Sirex noctilio and its symbiotic white rot fungus, Amylostereum areolatum, can kill pine trees (Pinus spp.). Both organisms have been introduced to North America (Hoebeke et al. 2005, Nielsen et al. 2009) and pine-growing regions in the Southern Hemisphere (reviewed in Hurley et al. 2007). The nematode Deladenus siricidicola has been used in the Southern Hemisphere as a biological control agent since the 1970s because it can kill eggs in female S. noctilio (Hurley et al. 2007). A commercialised strain of D. siricidicola, called Kamona, is used there, but a non-sterilising strain of this nematode species (hereafter called the "North American strain;" NA) was unintentionally introduced to North America, presumably along with invasive S. noctilio and is well-established (Yu et al. 2009, Kroll et al. 2013. Although not found inside S. noctilio eggs, the NA strain is transferred amongst trees along with viable eggs (Kroll et al. 2013).
Deladenus siricidicola can live for many generations as free-living mycophagous forms, feeding on A. areolatum within pines, but can be triggered to change to the infective form and enter Sirex larvae. Higher CO 2 and lower pH in close proximity to larvae are associated with conversion of D. siricidicola from mycophagous to infective forms (Bedding 2009). A strain of D. siricidicola used for biological control was originally isolated from Sirex juvencus in Sopron, Hungary, but after 15-20+ years in lab culture in the mycophagous phase, it lost the ability to switch to the infective form. Therefore, D. siricidicola was re-isolated in 1991 from the Kamona forest in Tasmania, where the Sopron strain had been released in 1970 (Hurley et al. 2007, Bedding andIede 2005) and this strain is now referred to as Kamona. However, having been grown in culture for many generations, the Kamona strain has undergone repeated bottlenecks and is not genetically diverse (Mlonyeni et al. 2011). While morphologically indistinguishable, there are genetic differences between the Kamona strain and the NA strain in nuclear and mitochondrial DNA sequences (Yu et al. 2009) and they can be distinguished with microsatellite loci (Mlonyeni et al. 2011).
Native North American species of Sirex, Deladenus and Amylostereum interact with the invasive species in pine trees. During Sirex oviposition, adult females inoculate trees with Amylostereum which subsequently surrounds larval Sirex galleries, assisting with larval nutrition (Thompson et al. 2014). Deladenus nematodes also feed on this fungus when in the mycophagous phase. Bedding and Akhurst (1978) suggest that Deladenus species are more specific to fungi than to insect hosts and feeding and reproduction of Deladenus species and strains differ based on Amylostereum species and strain (Morris et al. 2012, Caetano et al. 2016. Thus, the fungal species or strain accompanying a Sirex larva could impact whether a particular Deladenus genotype is abundant and near enough to larvae to transform to the parasitic phase. Previous studies have suggested that numerous strains of A. areolatum were introduced with S. noctilio to North America (Nielsen et al. 2009, Bergeron et al. 2011 and that there is a separate native genotype of A. areolatum (Nielsen et al. 2009, Olatinwo et al. 2013. While siricids are less tightly associated with fungal strains than previously thought (Nielsen et al. 2009;Hajek et al. 2013Hajek et al. , 2018Wooding et al. 2013), an exception appears to be S. noctilio in North America which primarily uses A. areolatum and only rarely A. chailletii (Wooding et al. 2013). Amylostereum chailletii is usually found associated with native siricid species, such as Sirex nigricornis (Bedding 1974. The purpose of this study was to further evaluate the biological control potential of Kamona in North America, where treatments in Pennsylvania and New York were not efficacious against S. noctilio in the presence of the NA strain of D. siricidicola (Williams and Hajek 2017). We reared cultures of D. siricidicola Kamona on two A. areolatum fungal strains to investigate whether preconditioning, while the nematodes are mycophagous, would help the nematodes survive, reproduce and ultimately infect S. noctilio larvae. We followed procedures similar to those used in the Southern Hemisphere against S. noctilio (e.g. Carnegie and Bashford 2012), injecting Kamona into woodwasp-infested pine logs in the local environment where D. siricidicola NA and A. areolatum were already present. We evaluated the emerging woodwasp adults for nematode parasitism and sterilisation of eggs and identified a subsample of nematodes to strain using genetic markers. This is the first study in which single Deladenus nematodes have been isolated from Sirex noctilio woodwasps and their eggs for fine-scale genetic identification, including hybridisation.

Parental generation
In spring and early summer of 2013 and 2014, trees infested with S. noctilio were collected from field sites and cut into logs about 70 cm long. We sealed both cut ends with wax to retain moisture and placed them inside cardboard barrels (91 cm high x 61 cm diam.) with covers made of window screening. Barrels were kept in an unheated barn and were checked for emergence daily from late June through September. We collected adults of S. noctilio in 29 ml clear plastic cups and stored them at 4 ± 1 °C to extend their life span.
Sirex spp. are haplodiploid, so successful mating is required to obtain female offspring to test the ability of D. siricidicola to sterilise female hosts. Ten males per fe-male were mated outdoors in cages (60 × 60 × 60 cm; BugDorm 2; Bioquip, Rancho Dominguez, CA) following methods described by Caetano and Hajek (2017). After mating, we kept females at 4 °C for at least 24 hours before allowing them to oviposit in trees prepared for nematode injection trials.

Tree preparation and oviposition
In 2013 and 2014, mature red pine trees, Pinus resinosa, were selected at Arnot Teaching and Research Forest, Cornell University (Tompkins County, New York, USA). To make these trees attractive to female woodwasps for oviposition, they were weakened by injection with the herbicide Banvel (49.4% diluted in water 1:1) in early July. We drilled holes into the trunks 50 cm above ground level and about 5 cm deep at a 45° angle, spaced 10 cm apart from each other around the circumference of the tree trunk (Zylstra et al. 2010), then injected 1 ml of the herbicide solution into each hole.
We enclosed a 1 m section of each treated tree with a cage made of window screening, with the bottom of the cage approximately 80 cm above the ground (see Figure 1). In July and August 2013, 4 weeks after the herbicide had been injected, we released S. noctilio females into the cages to oviposit. We placed two or three females at a time in each cage for three to four days and then added new females for another three to four days, for a total of 4 or 5 females per cage. Woodwasps released in the same cage at the same time had their pronota marked with different colours of paint to identify individu- als for later dissection. After exposure to woodwasps, standing experimental trees were left to overwinter in the forest and were cut in mid-April of 2014. We excised the portion of each tree that had been caged, waxed the cut ends and inoculated these bolts with nematodes by the end of April (see below). Each tree had a single cage, so the terms "tree" and "log" are interchangeable in this study. This protocol was repeated in 2014-2015.

Origin of nematode and fungal cultures
Cultures of D. siricidicola Kamona grown on A. areolatum BDF were imported in January each year from Ecogrow Environment (Westgate, NSW), the Australian commercial producer of this nematode for biological control of S. noctilio, following USDA APHIS permits. We reared colonies of this nematode strain in the Sarkaria Arthropod Research Laboratory, a quarantine facility under USDA APHIS permit at Cornell. Nematodes were grown in 100 mm diameter Petri dishes on ½ PDAh (Morris et al. 2012). Half of the imported nematodes were transferred to Petri dishes containing A. areolatum BE, a fungal strain found in North America (Nielsen et al. 2009), which was cultured from S. nigricornis collected in Warrensburg, NY (SAC132). We transferred nematodes of each strain on to new media every two weeks until we obtained at least 30 well-colonised plates for each fungal strain. During 2014-2015, only D. siricidicola Kamona on A. areolatum BE was used.

Nematode inoculation
Suspensions of nematodes were prepared by rinsing culture plates into 50 ml centrifuge tubes with autoclaved tap water. We counted all nematodes (both juvenile and adult) within five 20-µl drops under a dissecting scope at a magnification of 60× and adjusted suspension volumes to obtain 40 nematodes per drop, which resulted in an average of 2000 nematodes per ml. We added 0.5 g of polyacrylamide gel powder to each 50 ml tube of adjusted suspension and allowed the gel to hydrate (Williams and Hajek 2017).
On the same day, following methods of Bedding (2009), we used a punch hammer to punch holes (1 cm wide × 1 cm deep) into the logs that had been within oviposition cages the previous summer. We punched two rows of holes along the length of each log on opposite sides of the log; these holes were placed 20 cm apart in a row. Each of the holes was filled with 1 ml of the nematode suspension or with control gel (see Table 1 for numbers of trees/treatment in both years). After inoculation, we placed treated logs inside cardboard barrels with window screening covers; logs that were inoculated with a suspension of D. siricidicola Kamona were kept in the quarantine facility and control logs were kept in an unheated barn located about 1.6 km from the quarantine. During the following summer, we checked barrels daily to collect adult (offspring) woodwasps and these were stored at 4 °C until dissection.

Woodwasp dissections
Dissections of Sirex mothers and offspring were performed under a dissecting microscope at a magnification of 60×. For male offspring, abdomens were removed and cut open. Two drops of deionised water were added to the abdominal contents and internal organs were removed and spread apart in the dissecting dish. If present, we collected nematodes with disposable sterile pipettes. For females, abdomens were removed, cut lengthwise on both sides and dorsal sclerites were removed, exposing internal organs. Three drops of deionised water were added to the abdomen. Taking care to avoid breaking the venom gland, we spread eggs in the water in a 5.5 cm diameter glass dissecting dish. At this point, eggs were counted. We preserved eggs and nematodes (if present) in 1.5 ml centrifuge tubes containing 95% ethanol at -20 °C. For verification of sterilisation status, eggs were later spread in a 35 mm diameter gridded Petri dish and placed on an inverted compound microscope to count sterilised and unsterilized eggs at 200×. It was found that this method would detect nematodes in eggs when the number of nematodes per egg was low.
For this study, we use the term "parasitised" to indicate that nematodes are present anywhere inside the woodwasp body and "sterilised" to only indicate the presence of nematodes inside of inviable woodwasp eggs. "Partial sterilisation" refers to less than 100% of eggs being sterilised within a woodwasp.

DNA extraction, PCR and data analysis
Initially, we took samples of multiple nematodes from each woodwasp hemocoel and extracted their DNA in groups using DNeasy kits (Qiagen, Germantown, MD). The mitochondrial COI DNA barcoding locus was amplified using these DNA preparations from multiple nematodes per woodwasp using methods similar to Morris et al. (2013), sequenced at the Cornell University Institute of Biotechnology (Ithaca, NY) and compared against sequences available in GenBank to identify nematode strains.
For analysis of single nematodes, we spread an aliquot from a suspension of live or preserved nematodes on to 1.5% agar in a 5 cm diameter Petri dish. While viewing through a dissecting scope, we picked up single nematodes using a tool consisting of a minuten pin mounted on the end of a glass rod. We placed each nematode into a 10-20 µl drop of 10× PCR buffer (Qiagen) diluted to 1× in 0.025% Tween on a clear plastic dish. To avoid cross-contamination, tools were cleaned with 10% bleach then rinsed with water between picking up individual nematodes. After viewing through the microscope to confirm the presence of a single nematode per drop, each drop was transferred to a well of a PCR strip tube using a sterile pipette tip. For analysis of sterilised eggs, we selected an intact sterilised woodwasp egg, washed away any external nematodes and split the egg open on a clean dish of agar. Single nematodes were individually selected as above.
Reference samples were collected in the same way for both Kamona (selected from a pure colony sample) and NA (selected from three sites in New York and one site in Pennsylvania, USA).
Nematodes in PCR buffer were frozen at -80 °C for a minimum of 30 minutes or up to several days to begin the lysing process. Thawed nematodes were treated with 1 µl of Proteinase K (Qiagen) and lysed chemically overnight at 56 °C, followed by heat inactivation of the enzyme at 95 °C. These template DNA preparations were stored at 20 °C until use in PCR reactions.
Based upon Mlonyeni et al. (2011), we chose 8 microsatellite loci most likely to show variation in D. siricidicola NA and designed two multiplex panels of four loci each (Panel A: Ds1, Ds105, Ds323 and Ds388 and Panel B: Ds83, Ds201, Ds366 and Ds325). Forward primers were 5'-labelled with fluorescent dyes (FAM, VIC, NED or PET; Applied Biosystems, Foster City, CA, USA). Reverse primers (IDT, Coralville, Iowa) had a 5' GTTTCTT pigtail (Brownstein et al. 1996). The panels were amplified using the Type-It PCR kit (Qiagen) with the standard manufacturer-recommended reagent concentrations. For PCR, we used a temperature profile of 95 °C for 5 min followed by 32 cycles of annealing at 57 °C for 90 sec. All denaturing steps were 95 °C for 30 sec and cycle extensions were 72 °C for 30 sec, with a final extension at 60 °C for 30 min. Products were diluted with Hi-Di Formamide and GeneScan 600 LIZ Size Standard v. 2.0 (Applied Biosystems) and visualised with an ABI 3730×1 at the Cornell University Institute of Biotechnology. Allele calls were checked by two authors using both Geneious and GeneMarker software.
For microsatellite data, we selected only those nematodes with successful amplification at five or more of the 8 loci. We used Structure v. 2.3.4 (Pritchard et al. 2000) with K=2 groups using correlated allele frequency and an admixture model to distinguish the parental genotypes. NewHybrids v. 1.1 (Anderson and Thompson 2002) was used to distinguish parental and F1, F2 and backcross hybrid classes. Both Structure and NewHybrids were run with 20,000 burn-in iterations followed by 1 million sample iterations.

Woodwasp emergence and dissections
The 12 trees from 2013-2014 produced a total of 86 S. noctilio (Table 2). Of these, 39 emerged from control logs (none parasitised), 46 from logs treated with D. siricidicola Kamona grown on A. areolatum BDF (13 parasitised males, 57% and 12 parasitised females, 52%) and only one from logs treated with D. siricidicola Kamona grown in A. areolatum BE (1 parasitised female, 100%).  Table 3. Experimental offspring (woodwasps from treated trees) used in microsatellite analysis of nematodes. Nematodes from control trees were not included in the microsatellite analysis.

Nematode identification
The analysis of COI from grouped nematodes per woodwasp showed that 94% of parasitised experimental woodwasps carried the NA strain (29 out of 31 parasitised woodwasps), as well as 100% of the control woodwasps (131 parasitised woodwasps).
Microsatellite analyses were performed on a subsample of nematodes from a subsample of experimental woodwasps (not on those from control trees). It was not possible to isolate single nematodes from every woodwasp emerging from treated wood, either because preservation/extraction failed or the number of nematodes was not sufficient for analysis. A total of 418 single nematodes from 14 parasitised experimental woodwasps (8 females and 6 males) were genotyped at 5 or more loci (Table 3). Of these, 84 nematodes were isolated from inside of 8 woodwasp eggs. We also genotyped 44 pure, cultured Kamona and 155 nematodes from 25 woodwasps from 4 sites. Structure analysis revealed that the microsatellite genotypes of NA found in the region were distinctly different from the cultured Kamona and placed them in two separate genetic clusters (Figure 2). NewHybrids provided probabilities of assignment to hybrid class including F1, F2 and NA backcrosses for each individual nematode (Figure 3).
Within woodwasps, mixtures of parental strains and hybrids were found in 8 of 14 woodwasps (Figure 3). The NA strain predominated overall with the exception of one female woodwasp (S5) which had mostly Kamona. Over all 418 nematodes sampled from experimental woodwasps, 15.1% were pure Kamona, but these were concentrated in three woodwasps. The hybrid classes combined were all found at low frequencies (usually less than 10% per woodwasp and 2.1% overall). Within eggs, the composition of nematodes was generally similar to that inside the woodwasp hemocoel, although that information was not available for one of the eggs (S36) because nematodes from the hemocoel failed to amplify.

Egg sterilisation
Twelve parasitised females emerging from the Kamona/BDF treatment in 2014 had 50% of all eggs sterilised ( Table 2); six of those females had sterilised eggs with a mean of 71% and range of 25-100% sterilisation per female, the other six females had unsterilised eggs. Within the microsatellite sample group (8 females over all treatment groups), egg sterilisation ranged from 0 to 90 percent. A high proportion of nematodes found inside of eggs were the NA strain, which has not been reported previously within eggs in North America (Figure 3). There was no relationship between the combined proportion of Kamona and hybrid genotypes and the proportion of eggs sterilised within the 8 female woodwasps sampled (Table 3). In some cases, most eggs were sterilised even when Kamona-related genotypes were few or undetected.

Discussion
The objective of this study was to test a biological control agent that is already successfully controlling Sirex noctilio in Oceania, South Africa and South America against this invasive woodwasp in North America. In the 1970s, many different strains of D. siricidicola were tested against S. noctilio in Australia and New Zealand, before the These investigations revealed great variability amongst strains in parasitism and sterilisation, which seemed to depend on interactions between the woodwasp and nematode strains (Bedding 1972(Bedding , 2009). Timing of host ovarian development in relation to juvenile nematode release within the host is an important factor affecting sterilisation of woodwasp eggs (Bedding 1972). The S. noctilio found in North America are genetically different from those introduced to Oceania, for which the Kamona strain was selected (Boissin et al. 2012, Bittner et al. 2017. Unlike in the Southern Hemisphere, the use of the Kamona strain for control of S. noctilio in North America could be complicated by the presence of D. siricidicola NA or native species of Deladenus which are not known to sterilise S. noctilio. Based on the high growth rates of Kamona on Amylostereum areolatum IGS BE in Morris et al. (2014), we hypothesised that rearing Kamona on BE would improve the survival and/or reproduction of Kamona in the experimental trees, which could already contain the native BE strain of A. areolatum. Unfortunately, our conclusions about this treatment effect are limited by sample size. The BE treatments produced far fewer woodwasps overall, yet the woodwasps that did emerge had a higher parasitism rate compared to the treatment reared on A. areolatum BDF (Table 2). However, these parasitism rates also included much more parasitism by the NA strain than the Kamona strain, which grew equally well on BE in lab culture (Caetano et al. 2016). More research is needed to understand the best rearing technique for Kamona in order to optimise inoculations, were this strain to be used for S. noctilio control in North America.
A long term study similar to this one was conducted using trees that were naturally infested with S. noctilio (Williams and Hajek 2017). These trees were cut down and injected with Kamona in autumn and left on the ground until spring, when the injected portions were transferred to rearing barrels. Although the methods differed, both studies obtained low parasitism rates by Kamona.
Overall, the number of woodwasps emerging from Kamona-treated trees in this study was also low, especially in 2015. Some of the factors that may affect the success of woodwasp development and/or nematode parasitism include the use of herbicide (D.W. Williams, pers. comm.), the presence of competing blue stain fungi from bark beetles (Yousuf et al. 2014, the timing of injection and tree cutting , the strain of A. areolatum for nematode development (Bedding 2009) and the moisture content of wood (Hurley et al. 2008). Moisture content may have differed in the current experiment, because the quarantine storage conditions were warmer and drier than the unheated barn where controls were stored. However, in their injection trials, even when moisture levels were adequate, Hurley et al. (2008) also observed low rates of parasitism by Kamona which could not be fully explained.
Sequencing of COI on grouped nematodes was used to determine if any Kamona were successful at infecting North American S. noctilio and this method cannot detect hybridisation. When COI was sequenced for groups of nematodes per parasitised woodwasp, it most often indicated the NA strain, rarely the Kamona strain and sometimes showed ambiguities that suggested co-infection. Thus it provided a coarsegrained picture of the overall success of the Kamona strain and suggested that only about 6% of parasitised woodwasps in experimental trees had Kamona. Microsatellite genotyping of single nematodes from selected woodwasps, both inside and outside of eggs, provided a more detailed picture. This method provided clear evidence that both hybridisation and co-infection with different strains did occur during this study.
Near Sirex larvae, some nematodes become parasitic or "infective," but sexual reproduction amongst nematodes is only known to occur outside of the Sirex larva (both in mycophagous and infective forms, Bedding 1972), so different strains must arrive at and enter a larva independently. Multiple mated female infectives can enter each woodwasp larva (from 1 to over 100, Bedding 1972), thus co-infection of different nematode strains is not surprising. Infective females absorb nutrients directly from the host and release fertilised eggs or hatched juveniles when triggered by host pupation and, depending upon appropriate timing, the juveniles may enter host eggs (Bedding 1972). Thus, even at the egg level, co-infection of nematode strains can occur, as seen in a single egg from woodwasp S5 (Figure 3). As the cycle continues, the composition of nematodes within a female woodwasp (and her eggs) affects the next generation of woodwasps by direct transfer of nematodes during oviposition. However, even in the experimental trees where no maternal-generation woodwasps were intentionally parasitised with NA, woodwasps (S1.2014, S13, S35, S37 and S38, Figure 3) still contained predominantly NA nematodes. This shows the strong prevalence of the NA strain in areas that have already been exposed to S. noctilio, such as our study site, even as the trees appeared unaffected at the time of the study. Possibly treatment with herbicide made the trees attractive to S. noctilio in the environment or they had been used by Sirex for oviposition in a previous season; either way, nematodes could have migrated through the tree vascular tissue to woodwasp larvae in the experimental section.
During the mycophagous phase, nematodes within a tree produce many generations, "breeding wherever there is growing fungus" (Bedding 1972) and this is where the opportunity for hybridisation occurs. This is not the first demonstration of between-strain hybridisation in Deladenus. Using seven species of Deladenus, Akhurst (1975) compared inter-and intraspecific hybridisation, based on successful production of F1s and F2s of the mycophagous phase. Hybridisation between strains of the same Deladenus species was only tested using Deladenus rudyi strains from Turkey and Japan. Hatch of eggs from F1s was diminished compared with crosses within the same strain, but remained > 40%.
The possible outcomes of hybridisation range from hybrid instability or reproductive failure to hybrid vigour and selective advantage. Interspecific hybridisation of parasites can produce new combinations of genetic diversity that may result in increased fitness, infectivity, host range/diversity or geographic range (e.g. Boissier et al. 2016). Intraspecific hybridisation (amongst strains) may be likely to succeed due to fewer genetic barriers and similarity of host biology.

Conclusions
If the Kamona strain were to be used as a biological control agent in North America, its success could be limited by competition with the NA strain or may be either enhanced or reduced by hybridisation. However, the most unexpected result of this study was that, even when Kamona was very low or undetectable, the NA strain was able to enter the eggs, something that had never been recorded previously. In 6 out of 8 females, we detected less than 10% Kamona/hybrid genotypes yet two of these had high egg sterilisation levels (Table 3). To our knowledge, the NA strain has never been detected inside of S. noctilio eggs in North America prior to this study. Our data reveal that a relatively small presence of Kamona strain can result in either no sterilisation or fairly high sterilisation rates. Future work on the developmental and biochemical mechanisms of sterilisation could tell us whether there is potential for hybridisation between NA and Kamona to affect sterilisation of S. noctilio on a landscape scale.

Tolerance assessment of the aquatic invasive snail Potamopyrgus antipodarum to different post-dispersive conditions: implications for its invasive success
Alberto Romero-Blanco 1 , Álvaro Alonso 1

Introduction
Biological invasions are considered one of the main forms of global change (Sala et al. 2000). Invasive exotic species can have multiple effects such as drastic changes in the structure and functioning of ecosystems (Mack and D'Antonio 1998;Strayer 1999;Riley et al. 2008). However, there are some ecosystems, such as freshwater ecosystems, that are more susceptible to biological invasions than terrestrial ones because they have been intensively used by humans (Lodge et al. 1998;Richardson 2011) and they have fewer barriers for propagation (Lodge et al. 1998). Exotic invasive freshwater mollusks represent a threat to the functioning of the invaded ecosystems and to native species (Strayer 1999;Hall et al. 2006). For instance, they can dominate the production and the biomass in food webs and they can compete with native species affecting their distribution and abundance (Hall et al. 2006;Brenneis et al. 2011). Aquatic mollusks can be transported in ballast water (Carlton 1985;Hoy et al. 2012) and attached to fishing tools (Richards et al. 2004;Davidson et al. 2008), the body surface and the digestive tract of many species of birds and fish (Malone 1965;Haynes et al. 1985;Aarnio and Bonsdorff 1997). Moreover, the propagation between remote water bodies requires in many cases a mechanism of aerial passive translocation (Alonso and Castro-Díez 2012a). Therefore, the tolerance to air exposure is a prerequisite to the propagation of those species (Richards et al. 2004;Alonso and Castro-Díez 2012a).
The New Zealand mudsnail (NZMS), Potamopyrgus antipodarum (Gray, 1843), is a prosobranch native of New Zealand and an invasive species in many aquatic ecosystems around the world (Alonso and Castro-Díez 2012b). This snail has been found in a wide variety of aquatic ecosystems, both in its native and invaded range: from freshwater ecosystems to brackish and saltwaters, and from lotic to lentic habitats (Winterbourn 1970;Alonso and Castro-Díez 2012b;Hoy et al. 2012). Therefore, this species presents a wide ecological and phenotypic plasticity which might allow the NZMS to establish in a wide diversity of habitats (Richards et al. 2006). Populations of invaded regions are formed mostly by ovoviviparous clonic females that reproduce by apomictic parthenogenesis (Winterbourn 1970). Embryos develop into a female's brood pouch and emerge to the environment as fully functional neonates (Proctor et al. 2007). Moreover, females of invasive populations are able to produce nearly 230 neonates per year (Cheng and LeClair 2011). Therefore, a single female under optimal conditions could initiate a new population (Vazquez et al. 2016).
The impacts caused on invaded ecosystems by NZMS are mostly due to its elevated population density. For example, Hall et al. (2006) found populations formed by more than 700,000 individuals/m 2 in Polecat Creek River, Wyoming (USA). These population densities can account for the 80% of the secondary productivity, causing the consumption of 75% of the primary productivity in some invaded ecosystems (Hall et al. 2003(Hall et al. , 2006. Therefore, this snail can rule carbon and nitrogen fluxes of the aquatic ecosystems that it invades (Hall et al. 2003). By contrast, there is no consensus about the impacts of NZMS on native populations as some authors report a lack of impact (Brenneis et al. 2010;Kerans et al. 2014;Gergs and Rothhaupt 2015) while others have report negative impacts (Haynes et al. 1985;Brenneis et al. 2011;Kerans et al. 2014;Riley and Dybdahl 2015) or even positive impacts (Schreiber et al. 2002;Hellmair et al. 2011). These contrasting observations may be due to the different population densities that the NZMS can reach in colonized regions (Proctor et al. 2007).
The NZMS may tolerate a wide range of physical and chemical conditions. Some authors (Sousa et al. 2007;Lewin 2012) showed that this snail tolerates wide ranges of conductivity as it has been found in waters with conductivities from 66 µS/cm to 3240 µS/cm, or even to 7390 µS/cm (Schreiber et al. 2003). This trait is important for the survival of this species as the discharge of certain substances from anthropogenic sources, such as chloride salts, fertilizers, industrial effluents, and organic pollutants can increase the conductivity of aquatic ecosystems (Bellos and Sawidis 2005;Pal and Chakraborty 2017). Other studies have documented that the NZMS is eurythermal and, in consequence, is able to survive to a wide range of temperatures (0-29 °C) (Hylleberg and Siegismund 1987;Richards et al. 2004). Moreover, this species is able to tolerate short periods of air exposure (Haynes et al. 1985;Alonso and Castro-Díez 2008;Collas et al. 2014), which is necessary to survive an aerial translocation (i.e. attached to fish equipment, boats' hulls, etc.) (Alonso and Castro-Díez 2012a). Yet, air exposure might affect the ability of NZMS to tolerate a subsequent translocation to different water conditions. However, no information on this issue has been published. Additionally, the reproduction capacity after aerial exposure in NZMS has not been assessed, which is a key factor to the establishment of new populations.
The aim of this study was to assess the effect of a period of air desiccation (= aerial passive translocation) and two subsequent physical and chemical factors (temperature and conductivity) on the mortality, behavior and reproduction of Potamopyrgus antipodarum. We hypothesized that the period of air desiccation would decrease the ability of this snail to tolerate the subsequent abiotic conditions. Therefore, the mortality, behavior and reproduction of the NZMS would be affected negatively. This study can provide new data about what environmental conditions the NZMS needs to establish and multiply successfully in recipient ecosystems with different properties than those of the original ecosystem.

Experimental population
Animals for this experiment were collected from a laboratory population kept in two 60-l glass aquaria with control water (moderately hard USEPA: 96 mg NaHCO 3 , 60 mg CaSO 4 *2H 2 O, 4 mg KCl, 122.2 mg MgSO 4 *7H 2 O/l of deionized water), enriched with calcium carbonate (10 mg de CaCO 3 /l of deionized water) (United States Environmental Protection Agency 2002). We randomly selected 360 adults with a mean shell length (± SD) of 3.63 ± 0.24 mm. Animals were distributed randomly in five acclimatization glass vessels (1.25 l). Individuals with similar sizes were chosen to rule out an effect of this variable, since size is known to influence the desiccation tolerance of this gastropod. In fact, larger individuals tolerate desiccation better than smaller ones (Richards et al. 2004). Individuals were subjected to an acclimatization period of 48 hours in a climatic chamber at 18 °C (ANSONIC VAC0732).

Experimental design and analyzed variables
Six treatments were established ( Table 1). Three of the treatments were subjected to a desiccation process and to a subsequent rehydration. The three remaining treatments were used as controls without desiccation. Physical and chemical factors applied on each treatment (C18, D18, C10, D10, C-cond18 and D-cond18) are summarized in Table 1.
Two climatic chambers (ANSONIC®VAC0732) were used to attain the target temperatures (18 °C and 10 °C) (Table 1). These temperatures were chosen because they resembled those reached in Mediterranean aquatic habitats during spring-summer and autumn-winter respectively (Bennett et al. 2015). Normal values of conductivity (300 µS/cm; Table 1) corresponded to those of the USEPA water and high values of conductivity (3000 µS/cm; Table 1) were achieved by adding 1669.4 mg of sodium chloride NaCl (99% minimum, Lot 47H1049, Sigma, Germany) to 1.5 l of USEPA water. These physical and chemical values were chosen to simulate the values of physical and chemical conditions that occur in the aquatic environments susceptible to be invaded by the NZMS, such as endorheic basins, rivers, estuaries or inner seas (Costil et al. 2001;Dybdahl and Kane 2005;Moffitt and James 2012;Astel et al. 2016). Glass vessels (volume, 0.23 l; height, 6 cm; diameter, 8 cm; control water volume, 0.17 l) were used as experimental units. They were covered with a perforated Petri dish to reduce water evaporation. Each desiccation treatment was reproduced by slipping each snail on a filter paper until their shells lost the shine caused by water. Snails of control treatments were not slipped on a filter paper since this manipulation does not have an effect on the survival (Alonso and Castro-Díez 2012a). Then, snails were left in a climatic chamber with 18.5 ± 0.5 °C (mean ± SD; n = 37 measures) and 67.6 ± 4.5% of relative humidity (mean ± SD; n = 37 measures) during 20 hours. Each treatment was replicated seven times with eight randomly selected individuals per replicate (Fig. 1).
After the air exposure period, snails were translocated to USEPA water with the physical and chemical properties showed in Table 1. Experimental units were the same as those mentioned above. The mortality of adult snails was assessed on days 3, 4, 5 and 6 after the start of the rehydration. As few snails survived after this period (see Results), survivors to air exposure and an equal number of snails of the respective controls were individually translocated to glass vessel units (volume, 0.03 l; height, 4 cm; diameter, 4.5 cm; control water volume, 0.023 l) (Fig. 1) with the same physical and chemical conditions represented in Table 1. Glass vessels were covered with a perforated plastic lid to reduce the water evaporation.
Measures of conductivity, dissolved oxygen and temperature of water were monitored every 1-2 weeks for 50 days by randomly selecting three glass vessels of each desiccation Table 1. Physical and chemical properties of desiccation treatments and their respective controls. In the treatment code "D" means "desiccation" and "C" means "control" (not subjected to desiccation); the number indicates target water temperature and "cond" indicates high conductivity. C18: normal temperature and normal conductivity control; D18: desiccation treatment with normal temperature and normal conductivity; C10: low temperature and normal conductivity control; D10: desiccation treatment with low temperature and normal conductivity; C-cond18: high conductivity and normal temperature control; D-cond18: desiccation treatment with high conductivity and normal temperature. treatment and each control. A digital conductivimeter (TDS&EC meter, GHB) was used to measure the water temperature and conductivity and a portable oxymeter (OXI 45+, CRISON Instruments) to measure the dissolved oxygen concentration. The air temperature and the relative humidity of climatic chambers were measured every 3 hours using two climate recorders (LOG32, DOSTMANN electronic GmbH, Germany). Selected endpoints were three parameters related with animal fitness (Table 2): 1) Mortality was assessed as the cumulative percentage of dead adults at day 50 after rehydration. A snail was considered to be dead if its reaction time (see below) exceeded 150 seconds and if no reaction was observed after having touched its operculum with forceps. All dead animals were removed of the experiment after monitoring. These observations were carried out 2-3 days per week until day 50 after rehydration. 2) Behavior was assessed by two variables: the reaction time (in seconds) employed by every live adult to start the normal movement after being disturbed and the cumulative number of immobile adults at day 50 after rehydration. Normal movement was considered when the individual pulled its soft body out of the shell and started sliding using its foot. Firstly, the retraction into the shell was stimulated by picking individuals up with a forceps by the central part of their shells. Secondly, individuals were rapidly pulled out from water and deposited at the bottom of the glass vessel with the operculum facing towards the bottom. The time employed to start the normal movement was assessed with a chronometer (Onstart 100, Geonaute) (Alonso and Camargo 2009). To study the number of snails that were immobile in each treatment, the quotient between the number of times that the individual of each replicate was immobile (= cumulative number of immobile adults at day 50 after rehydration) and the number of days when activity was measured (= number of observations between 8-50 days after rehydration) was calculated. A snail was considered immobile when its reaction time exceeded 150 seconds (a value obtained from a previous pilot study showed that most of snails started moving before that time) and movement inside the shell was detected. These observations were carried out 2-3 days per week between 8-50 days after rehydration.
3) Reproduction was assessed by four variables: total number of neonates per live adult at 7 days after rehydration, cumulative number of neonates in each glass vessel, cumulative number of dead neonates (i.e. animals without movement) and cumulative number of live neonates (i.e. sliding animals). The three last variables were monitored every 1-2 weeks between 8-50 days after rehydration and before every water renewal to avoid losing neonates in the process. During the 2 hours previous to water renewal, snails of every experimental unit were fed with three or four pellets (JBL, NovoPrawn, GmbH & Co KG, Germany) provided ad libitum. After monitoring, neonates were removed from the experiment with a plastic pipette. The analysis of all these variables was carried out with a stereomicroscope fitted with a cold light source (Motic MLC-150C) with a 50% of light intensity.

Statistical analysis
For each variable based on the cumulative number, two comparisons were made: the three controls were compared with their respective desiccation treatments through three Mann-Whitney U tests and desiccation treatments were compared among them through a Kruskal Wallis test or through a one-way ANOVA. Table 2 summarized the statistical analysis made for each variable. Three kinds of comparisons were performed to study the influence of treatments (desiccation treatments and controls) on the cumulative percentage of mortality of adults (Table 2). Firstly, mortality of controls (C18, C10 and C-cond18) was compared through a non-parametric Kruskal-Wallis test. Global mortality in controls was less than 10% without significant differences among them. Thereafter, the effect of desiccation on the cumulative percentage of mortality was assessed by comparing each desiccation treatment (D18, D10 and D-cond18) with its control through a Mann-Whitney U test. Eventually, desiccation treatments were compared among themselves with a one-way ANOVA, since in this case homoscedasticity assumption was met (p = 0.14; Fligner-Killeen). Effect size was also report in the ANOVA analysis as η 2 .
A mixed ANOVA was performed to study the influence of time and treatments (desiccation treatments and controls) on the reaction time of the NZMS between 8-50 days after rehydration (Table 2). Reaction time was log-transformed to achieve normality. Effect size was also report as η 2 . Degrees of freedom were corrected through the Greenhouse-Geisser approach as the sphericity assumption was not met (Field 2005). When significant differences were obtained, a post-hoc test (Student t pairwise with Bonferroni correction) was carried out to analyze which treatments caused the differences in reaction time (Table 2). Two comparisons were performed to assess the effects of treatments on the cumulative number of immobile adults (= 150 seconds) ( Table 2): firstly, desiccation treatments were compared with their respective controls through a Mann-Whitney U test and, afterwards, desiccation treatments were compared among themselves through a Kruskal-Wallis test.
The effect of treatments (desiccation treatments and controls) on the total number of neonates at 7 days after rehydration was studied through the ratio between the total number of neonates and the number of live adults of each replicate. Differences in the total number of neonates between the desiccation treatments and their controls were studied through the Mann-Whitney U test ( Table 2). The total number of neonates per live adult at 7 days after rehydration was also compared between desiccation treatments through a Kruskal-Wallis test (Table 2). On the other hand, the influence of treatments on the number of total neonates, dead neonates and live neonates between 8-50 days after rehydration was studied by calculating the cumulative number of these variables for each replicate at day 50 after rehydration (Table 2). Differences in the number of total, dead and live neonates between controls and their desiccation treatments were analyzed through Mann-Whitney U tests. Moreover, the comparison of the effects of the desiccation treatments on total, dead and live neonates was carried out through a Kruskal-Wallis test. A descriptive analysis was made for the influence of time on the number of total, dead and live neonates.
To decrease type I errors in multiple testing, a p-value of 0.01 was chosen. Surviving individuals of the same replicates were averaged during their individualization to avoid pseudoreplication (Fig. 1). In consequence, replicates ranged from 5 to 7. All the statistical analysis were carried out with R software (R Core Team 2017).

Physical and chemical properties
Overall, values of dissolved oxygen were relatively high in all treatments (> 8.5 mg O 2 /l; n = 12 measures for each treatment), air temperature (mean ± SD) in climatic chambers was 17.6 ± 2.04 °C at normal temperature and 11 ± 0.73 °C at low temperature (n = 389 measures in each climatic chamber) and the mean water conductivity (± SD) was 2811.5 ± 177.1 µS/cm in the treatments with high conductivity and 303.4 ± 43.7 µS/cm in the treatments with normal conductivity (n = 15 measures for each treatment).

Mortality and behavior of adults
Only the 16.1% of total snails of the desiccation treatments survived after the period of exposure to air at 7 days of rehydration. Cumulative percentage of mortality at day 50 after rehydration was significantly higher in the desiccation treatments than in their respective controls (p < 0.01; Mann-Whitney U test). By contrast, differences in the cumulative percentage of mortality were non-significant between desiccation treatments (F (2,18) = 0.68, p = 0.52, η 2 = 0.07 [effect size]; ANOVA) (Fig. 2).
The influence of time and treatments on the reaction time is shown in Figure 3. Time affected significantly the reaction time in each treatment between 8-50 days after rehydration (p = 0.01, η 2 = 0.07 [effect size]; Table 3). This trend was similar for  all treatments as interaction between treatments and time was not significant (p = 0.3, η 2 = 0.14 [effect size]) (Table 3). In contrast, the low temperature (C10 and D10) and high conductivity treatments (C-cond18 and D-cond18) affected significantly the reaction time (p < 0.001, η 2 = 0.27 [effect size]) (Table 3). In fact, C18 and D18 differed significantly with the rest of treatments (p < 0.05; Student t pairwise with Bonferroni correction). However, no significant differences were found neither between the low temperature control and its desiccation treatment (p > 0.05; Student t pairwise with Bonferroni correction) nor between the high conductivity control and its desiccation treatment (p > 0.05; Student t pairwise with Bonferroni correction).
On the other hand, no significant differences were found neither in the cumulative number of immobile individuals between controls and their respective desiccation treatments (p > 0.01 in all cases; Mann-Whitney U test) nor between desiccation treatments (χ 2 = 1.24, p = 0.54; Kruskal-Wallis) (data not shown). Figure 4 shows the influence of treatments on the total number of neonates produced per live adult at 7 days after rehydration. No significant differences were found between desiccation treatments (χ 2 = 0.94, p = 0.63; Kruskal-Wallis) nor between controls and desiccation treatments (W = 42, p > 0.01 in all cases; Mann-Whitney U test). Moreover, although desiccation treatments had a lower number of adults than their respective controls, the amount of neonates per live adult was similar (Fig. 4). Figure 5 represents the effects of treatments and time over the cumulative number of total, dead and live neonates between 8-50 days after rehydration. No significant differences in those variables between controls and their respective desiccation treatments were  a Treatment (C18, D18, C10, D10, C-cond18 y D-cond18) was the intersubject factor, time (11 observations done between 8 and 50 days after rehydration) was the within-subject factor and reaction time (in seconds) was the dependent variable. b Degrees of freedom (degrees of freedom of numerator/degrees of freedom of denominator) have been corrected for sphericity using the Greenhouse-Geisser approach (Field 2005). found (p > 0.01 in all cases; Mann-Whitney U test) ( Fig. 5A-C). In addition, a similar number of total, dead and live neonates was found in all the desiccation treatments (p > 0.01 in all cases; Kruskal-Wallis) ( Fig. 5A-C). All treatments started producing neonates in the first day of observation (11 days after rehydration), except in the desiccation treatments with low temperature and high conductivity (D10 and D-cond18), which started producing neonates later (Fig. 5D). The conditions of the desiccation treatments D10 and D-cond18 were the ones that most delayed the production of live neonates (Fig. 5E, F). Moreover, it is worth noting that no live neonates were observed in the desiccation treatment with low temperature until almost the end of the observation period (Fig. 5F).

Discussion
This study confirmed that the NZMS is able to survive and reproduce under contrasting physical and chemical conditions after a short-term desiccation period. These conditions can be found in new regions where the NZMS can arrive after a brief period of exposure to air. We found a higher mortality than expected after a short air exposure period. For instance, Alonso and Castro-Díez (2012a) found a fewer percentage of dead snails at 24 hours. These differences may be because we established a higher temperature during the desiccation process, and higher temperatures are known to decrease the survival of this species (Richards et al. 2004). Anyway, exposure to air is a limiting factor for the aerial passive translocation of the NZMS. A higher number of individuals died in the desiccation treatments as compared to controls. This result confirmed our initial hypothesis and it coincides with those of other studies (Alonso and Castro-Díez 2012a). However, under natural conditions, the survival of the NZMS to air exposure can be increased if the snail is carried in a moist substrate (Alonso et al. 2016). In fact, Potamopyrgus antipodarum is known as mudsnail because during dry periods it buries itself into the sediment (Duft et al. 2003).
Treatments affected the NZMS behavior in different ways. The high conductivity increased the reaction time. The desiccation process had little influence in this regard. Thus, high conductivity could have negative effects on the acclimatization of the mudsnail to the recipient ecosystems, because this effect may make access to resources difficult and impair the escape from potential predators. Low temperatures also caused a significant increase in the reaction time. Moreover, other authors have already confirmed that low temperatures affect negatively other life history traits, such as reproduction, in this mollusk (Gust et al. 2011). In contrast, exposure to air did not have a significant effect on the reaction time of the NZMS. Therefore, a brief exposure to air period would not impede the subsequent acclimatization of the NZMS in an aquatic habitat with a low temperature or a high conductivity.
Regarding reproduction, we found that neonates tolerated the new environmental conditions (temperature and conductivity) after rehydration. Neither high conductivities nor low temperatures reduced the number of neonates, in contrast to other authors who reported that low temperatures slow down the NZMS embryonic development and production (Gust et al. 2011). Our results also suggest that survivors to a brief period of passive propagation by air might be able to produce neonates, even though they face unfavorable temperatures and conductivities. Besides, these results suggest that even a low number of surviving adults after an aerial translocation could generate a new population in the recipient ecosystem.
Neonates showed a wide tolerance to temperatures and conductivities. Embryos are protected against air exposure since they are housed in a brood pouch carried by adult females until they are fully formed and functional (Proctor et al. 2007). Although the growth of neonates was not monitored, our results show a likely potential for propagation in natural conditions. Besides, human disturbances can boost NZMS spread in aquatic ecosystems. For instance, the rise of the pollution in Europe has triggered an increase of the quantity of ions in great European rivers and, in consequence, the increase of the water conductivity in those aquatic ecosystems, which favors the dispersal and the establishment of exotic species resistant to high conductivities ), such as the NZMS (Costil et al. 2001). However, a combination of a brief period of exposure to air and a relatively high conductivity tended to delay the production of live neonates. Therefore, an increase of the conductivity in aquatic habitats due to pollution could have negative effects in this regard. On the other hand, the desiccation treatment combined with low temperature tended to slow down the production of viable neonates. This might be a problem for the naturalization of the NZMS, as their populations with low reproductive rates are more susceptible to disturbance (Proctor et al. 2007). Therefore, low temperatures combined with a brief period of air exposure could limit the propagation of the NZMS. However, this impediment could be mitigated in the future warmer conditions predicted by climate change models.
The invasive success of the NZMS resides in various functional traits shared with other invasive species, such as fast growth rate, high fecundity rate, early sexual maturity, asexual reproduction, a tolerance to wide ranges of abiotic conditions, and a high phenotypic plasticity, among others (Alonso and Castro-Díez 2008). This study suggests that a brief period of exposure to air does not reduce the ability of this snail to tolerate subsequent low temperatures and high conductivities. Moreover, survivors to a brief period of exposure to air are able to reproduce in different abiotic conditions, which indicate a potential to colonize and to establish in the recipient ecosystems. These results suggest that management plans should take into account that an aerial translocation is a highly plausible route of introduction in aquatic habitats for this exotic species.
The dark side of facilitation: native shrubs facilitate exotic annuals more strongly than native annuals

Introduction
Positive interactions among species, or facilitation, can strongly influence the organisation of plant communities (Callaway 1995;Callaway 2007;Brooker et al. 2008), particularly in unproductive environments (Bertness and Callaway 1994;Maestre et al. 2009). Facilitation occurs when a foundation species ameliorates biotic or abiotic stresses that would otherwise inhibit the abundance, richness, fitness, and/or population growth of beneficiary species (Callaway 2007). For example, foundation shrubs in deserts can provide annual species with refuge from solar radiation and/or drought, resulting in increased richness and abundance of annual species inside shrub canopies relative to outside of canopies (Filazzola and Lortie 2014). Importantly, beneficiary species can experience facilitation from foundation species and interspecific competition from other beneficiary species simultaneously, which can influence the net outcome of biotic interactions (Poulos et al. 2014;Sheley and James 2014;Wright et al. 2014). Regardless, facilitation generally enhances biodiversity (Butterfield et al. 2013;McIntire and Fajardo 2014) and ecosystem function (Michalet 2006;Callaway 2007;Michalet and Pugnaire 2016). However, facilitation can have a dark side when beneficiary species are exotic invaders. Invasive plant species pose a pervasive threat to ecosystem function worldwide (Simberloff et al. 2013), including strong effects on historic patterns of nutrient cycling (Liao et al. 2008), energy flow (Baxter et al. 2004;Pearson and Callaway 2008), and abiotic disturbance (D'Antonio and Vitousek 1992; Balch et al. 2013). These disruptions are often associated with sharp reductions in local biodiversity (Vila et al. 2011;Bellard et al. 2016). Interestingly, positive interactions have been shown to promote the success of invasive species in non-native communities (Simberloff 2006;Griffith 2010). Exotic invaders are commonly facilitated by exotic species (reviewed by Simberloff 2006), and such "invasional meltdown"  is a leading hypothesis in invasion biology (Jeschke et al. 2012). Native foundation species can also facilitate exotic invaders, especially in harsh environments (Lenz and Facelli 2003;Cavieres et al. 2008;Altieri et al. 2010;Griffith 2010;Zarnetske et al. 2013;Badano et al. 2015;Hupp et al. 2017). However, native-invader facilitation has attracted less attention than the invasional meltdown hypothesis. Furthermore, very few studies have specifically addressed whether native foundation species in drylands benefit native and exotic beneficiary species to the same extent (but see Reisner et al. 2015;Ramírez et al. 2015). The biogeographic origins (i.e., provenance) of beneficiary species is an important consideration because exotic species displace native species in drylands globally (Balch et al. 2013;Bellard et al. 2016;Vitousek et al. 2017), and facilitation by native foundation species can influence the outcome of interactions between native and exotic taxa (Reisner et al. 2015). Strong facilitation of exotic species relative to native competitors could require conservationists to shift their focus from manipulating competitive interactions to facilitative ones in order to manage biological invasions (Funk et al. 2008).
The objective of this study was to investigate the extent that native and exotic species of annual plants associate with (i.e., are facilitated by) native foundation shrubs in an arid ecosystem. This issue is timely because drylands worldwide are increasingly comprised of exotic species (Vitousek et al. 2017;Simpson and Eyler 2018), and facilitation by native foundation species has considerable potential to be used as a tool for restoring native biodiversity to drylands degraded by biological invasions and other anthropogenic disturbances (Padilla and Pugnaire 2006;Funk et al. 2008;Gomez-Aparicio 2009;Lortie et al. 2018c). Specifically, we examined the hypothesis that native and exotic annual species can associate differentially with native foundation shrubs. We tested the following predictions: (i) the net abundance of the annual plant community is greater near native foundation shrubs than away from shrubs; (ii) native and exotic annual species can both associate with native foundation shrubs to become beneficiary species; and (iii) the strength of facilitation depends upon the provenance of beneficiary species. To better understand community-level outcomes of biotic interactions, we also evaluated correlations between the abundance of native and exotic annual species near and away from native foundation shrubs.

Study site and species
We surveyed annual plant communities at Carrizo Plain National Monument in the San Joaquin Desert (Germano et al. 2011) of California (35.1N, 119.6W, elevation: 723 m) at peak flowering from March to April from 2015 to 2018 at a total of seven study sites (Suppl. material 1: Table A1). This area is characterised as an arid grassland (Buck-Diaz and Evens 2011), but the native shrubs Ephedra californica, Gutierrezia californica, and Atriplex polycarpa are also present (U.S. Department of the Interior 2011). Here, we explored the potential for E. californica, the most abundant shrub species at our sites (Buck-Diaz and Evens 2011; Noble et al. 2016), to act as a foundation species. Ephedra californica is a long-lived perennial associated with the creosote scrublands, chaparral, and arid grasslands of southwestern North America (Cutlar 1939) and can facilitate native annuals (Lortie et al. 2018a) and endangered vertebrates (Filazzola et al. 2017;Westphal et al. 2018). We sampled a total of 21 annual plant species throughout the study (Suppl. material 1: Table A2), including 17 native and 4 exotic species. Among the exotic species sampled was Bromus madritensis ssp. rubens (B. rubens hereafter), one of the most problematic exotic invaders in the region (Salo 2005). Annual precipitation, mean annual temperature, and mean winter temperature at the study sites ranged from 116.22-129.44 mm, 17.5-18.0 °C, and 10.19-12.14 °C, respectively (Suppl. material 1: Table A2).

Sampling
We sampled the abundance of the annual plant community using a paired shrub-open microsite contrast with 0.5 × 0.5 m quadrats (Pescador et al. 2014). Shrub microsites were defined as the area immediately beneath the canopy of E. californica shrubs, and open microsites were defined as interstitial spaces at least 1 m from any shrub canopy. We did not sample areas more than 5 m away from shrubs. A total of 1194 independent pairs of shrub and open microsites were sampled, and repeated-measures were avoided by randomly selecting sampling locations at each site for each year. The size of foundation plants can influence the direction and magnitude of their effects on neighbours (Tewksbury and Lloyd 2001;Miriti 2006;Brathen and Lortie 2015). To account for this, the height, width, and perpendicular width (m) of each foundation shrub were measured, and the volume for a sphere was used to summarise shrub sizes (m 3 ) as a covariate in subsequent analyses. Across all study sites, mean shrub width was 2.80 m ± 0.03 SE, mean shrub volume was 23.65 m 3 ± 0.26 SE, and mean shrub density was 44.43 shrubs/ha ± 8.98 SE. These measurements are well within ranges reported by other studies in similar systems (e.g., Lortie et al. 2018a). We recorded the total abundance of each annual species present in sampling quadrats, and the provenance of each annual species was retrieved from the CalFlora database (CalFlora 2018). Individuals were easy to distinguish because the annual species we sampled do not reproduce asexually. Data are publicly archived (Lortie et al. 2018b).

Statistical analyses
Relative interaction indices ("RIIs" hereafter; Armas et al. 2004) were used to estimate the effects of E. californica shrubs on the relative abundance of annual species. We calculated RIIs as follows: where A s is the abundance (i.e., no. individual plants) of an annual species in a shrub microsite and A c is the abundance of the same annual species in the paired open microsite. RII values range from -1 to +1. Negative RII values indicate negative (competitive) effects of shrubs on annual species, positive values indicate positive (facilitative) effects, and a value of 0 indicates a neutral effect. Annuals are only considered to be beneficiary species of foundation shrubs when RII is positive.
To evaluate whether annual species were generally more abundant near E. californica shrubs than away from shrubs, we performed independent one-sample t-tests with mean RII (pooled per species per year) as the response variable for each year of the study. We used an additional one-sample t-test to summarise the net strength of shrub associations across all study years. We evaluated the net strength of species-specific shrub associations using a linear mixed-effects model with RII (pooled per species per year) as the response variable, species identity and shrub volume as fixed factors, and study year as a random factor. Treating year as a random factor accounted for stochastic sources of inter-annual variation, such as climate (Suppl. material 1: Table A2).
To test whether native and exotic annual species associated differentially with E. californica at the provenance level, we employed independent generalised linear models for each year of the study with RII (pooled per species) as the response variable and species provenance as a fixed factor. We contrasted the net strength of native vs. exotic associations with shrubs across all study years using a linear mixed-effects model with RII (pooled per species per year) as the response variable, species provenance and shrub volume as fixed factors, and study year as a random factor.
We inferred the outcome of biotic interactions between native and exotic annuals at the provenance level using t-tests and linear models. For shrub and open microsites and for each year of the study, we contrasted the net abundance of native vs. exotic annuals using independent one-sample t-tests with abundance (i.e., not RIIs) as the response variable. We then regressed net native abundance against net exotic abundance using linear models. These regressions addressed the effects of exotic annual species on the abundance of native annual species. Negative line slopes suggested competitive effects; positive line slopes suggested facilitative effects (Pearson et al. 2016).
All analyses were performed in R, version 3.5.1 (R Development Core Team 2018). All linear mixed-effects models used the lmer function of the lmerTest package (Kuznetsova et al. 2018). T-tests, generalised linear models, and linear models used the t.test, glm, and lm functions, respectively (R Development Core Team 2018). We used the emmeans function of the emmeans package (Lenth et al. 2018) for post-hoc contrasts of factors from generalised and mixed-effects linear models. R code is publicly archived at Zenodo (Lortie 2018).

Results
Annual plant species were generally more abundant near native foundation shrubs than away from shrubs. Across all species and years, annual plants were 1.35 (± 0.68 SE) times more abundant under shrubs than in the open (df = 2381, t-value = 12.97, P < 0.01). Accordingly, net RII summarised across all species and years was greater than zero (RII = 0.22 ± 0.05 SE, df = 30.00, t -value = 4.98, P << 0.01) ( Table 1). In addition, RII values summarised across all annual species were greater than zero for each year of the study except 2017 and were never less than zero (Table 1).
In each year of the study, native and exotic annual species positively associated with E. californica to become beneficiary species (Fig. 1). Many shrub-annual plant associations were neutral, but none were negative (Fig. 1, Suppl. material 1: Table A3). Interestingly, the only annual species that formed a positive association with E. californica across all study years was the exotic invader B. rubens (RII = 0.55 ± 0.16 SE, df = 4.09, t-value = 2.72, P = 0.05) (Suppl. material 1: Table A3). Shrub size did not affect association patterns at the species level (Suppl. material 1: Table A4).
Interestingly, native and exotic annuals associated differentially with native foundation shrubs at the provenance level. At the provenance level, net RII summarised

Figure 1.
Year-by-year effects (RII ± 95% CI) of native foundation shrubs on the abundance of annual species. Net effects across all years are summarised in Suppl. material 1: Table A3.  Table A5 for complete statistics across all years was 2.75 ± 0.14 SE times greater for exotic annuals than for native annuals (df = 21.01, Z -ratio = 3.05, P < 0.01), and this general trend (i.e., greater RII values for exotic annuals than native annuals at the provenance level) was apparent in each year of the study except 2017 (Table 2). At the provenance level, RII values for native annuals were never greater than RII values for exotic annuals (Table 2). Shrub size did not affect association patterns at the provenance level (Suppl. material 1: Table A5).
Regardless of year and microsite, exotic annual species were always more abundant than native annual species. Summarised across all years, the net abundance of exotic annuals was 4.97 ± 0.78 SE and 3.05 ± 0.78 SE times greater than the net abundance of native annuals in shrub and open microsites, respectively (df ≥ 1863.40, t-value ≥ |28.89|, P < 0.01) (Suppl. material 1: Table A6). This trend (i.e., greater net abundance of exotic annuals than native annuals) was apparent in shrub and open microsites for each year of the study (Suppl. material 1: Table A6).

Discussion
Facilitation is an important process in the assembly of plant communities in drylands and other extreme environments globally (Callaway 2007), but few studies have contrasted the effects of native foundation species on native vs. exotic beneficiary species (but see Reisner et al. 2015;Ramírez et al. 2015;Hupp et al. 2017;Llambi et al. 2018). Understanding how ecological processes affect native and exotic taxa has important implications for the conservation of ecosystems affected by biological invasions (Simberloff et al. 2013, Pearson et al. 2018. In an arid grassland, we found that native and exotic annual species consistently formed positive associations with the native shrub E. californica. However, the strength of these associations depended upon the provenance of beneficiary species -at the provenance level, exotic annuals consistently associated more strongly with E. californica shrubs than native annuals, and in terms of relative abundance, exotic species always dominated annual plant communities. Thus, the force of facilitation had a dark side at Carrizo Plain. Our study coincides with a broad literature suggesting that ecological processes can have markedly different effects on native and exotic taxa in the same communities (reviewed by Levine et al. 2003;Mitchell et al. 2006;Catford et al. 2009;Pearson et al. 2018). Most studies have focused on the effects of negative interactions like competition (Seabloom et al. 2003;Vila and Weiner 2004;Callaway et al. 2011) and predation (Maron et al. 2012;Lucero 2018;Lucero and Callaway 2018), but our study is unique in contrasting the effects of positive interactions on native and exotic taxa at the provenance level. The extent that community-level processes have divergent effects on native and exotic neighbours has been hotly debated Simberloff 2011) but is an important consideration for explaining, predicting, and managing biological invasions (Pearson et al. 2018).
Our study underscores the potential for facilitation by native foundation species to exacerbate biological invasions. Native foundation species can increase the ecophysiological performance , abundance (Lenz and Facelli 2003;Reisner et al. 2015;Hupp et al. 2017), population growth (Griffith 2010), and spatial distribution (Altieri et al. 2010) of exotic invaders. In addition, facilitation by native foundation species may help explain the initial colonisation of some exotic species in non-native communities (Stohlgren et al. 2006;Fridley et al. 2007). In this context, the initial colonisation of exotic species like B. rubens, Schismus barbatus, and Erodium cicutarium at Carrizo Plain may have been facilitated by native foundation shrubs. However, this interpretation should be viewed with some caution because exotic annuals in this system can clearly colonise open microsites without the aid of shrubs (Suppl. material 1: Table A6). Because exotic annuals were relatively more abundant than native annuals in both shrub and open microsites (Suppl. material 1: Table A6), we argue that shrub facilitation probably reinforced but did not entirely drive the dominance (in terms of relative abundance) of exotic annuals in this system. Our study emphasises the importance of interpreting the effects of ecological processes within the context of net outcomes at the community level (Brooker et al. 2005;Soliveres et al. 2015).
There was considerable inter-annual variation in the effects of E. californica on annual species. The strength of positive interactions is known to increase with environmental severity (Bertness and Callaway 1994;Maestre et al. 2009;He et al. 2013;Gao et al. 2018), and environmental severity (i.e., drought and heat stress) can fluctuate widely from year to year in deserts (Venable 2007). In this context, native shrubs facilitated the abundance of the greatest number of species in 2015 (Fig. 1), the study's driest year (Suppl. material 1: Table A2). Drought intensity is predicted to increase in deserts across southwestern North America in the 21 st century (Cook et al. 2015). If so, the number of species that associate with shrubs and the strength of these associations may also increase Gao et al. 2018), along with any negative net outcomes of facilitation.
We observed no effects of shrub size on association patterns at either the species or provenance levels (Suppl. material 1: Tables A4, 5). This accords with a recent study by Lortie et al. (2018a) showing that the positive effects of E. californica on annual species are independent of canopy size. Our study extends these results by considering the provenance of annuals. In other severe environments, shrub size has strongly influenced the direction and magnitude of association patterns (Tewksbury and Lloyd 2001;Miriti 2006;Brathen and Lortie 2015) and may be an important consideration for other foundation species at Carrizo Plain.
Our study hints that facilitation by E. californica shrubs can alter the outcome of interspecific interactions among native and exotic neighbours. In 2018, the abundance of native and exotic annuals was positively related in open microsites (where E. californica was absent) but unrelated in shrub microsites (where E. californica was present). These relationships suggest facilitation between native and exotic annuals in open but not shrub microsites. Thus, it is possible that E. californica attenuated positive interactions between native and exotic annuals in that year. Facilitation of native species by exotic neighbours -including invasive species -is not necessarily unusual in deserts. For instance, in the Great Basin Desert, Lucero et al. (2015) found evidence that the native perennial grass Elymus elymoides was more abundant and produced a significantly greater seed rain in areas invaded by exotic Bromus tectorum than in adjacent noninvaded areas. Thus, invasive species do not always impose negative effects on native neighbours. Importantly, a positive relationship between native and exotic abundance does not necessarily indicate facilitation; native and exotic plants could both respond favourably to particularly good microsites.
Our study highlights the potential for beneficiary species to experience facilitation from foundation species and interspecific competition from other beneficiary species simultaneously. In 2015, competition and facilitation appeared to operate in tandem to influence biodiversity patterns under shrubs (Fig. 2, Tables 1, 2). This coincides with a number of recent studies (Maestre et al. 2004, Poulos et al. 2014Sheley and James 2014;Wright et al. 2014;Reisner et al. 2015;Llambi et al. 2018). The relative importance (sensu Brooker et al. 2005) of these biotic interactions likely depends upon abiotic conditions and the ontological development of interacting species (Callaway et al. 1996;Fagundes et al. 2018;Gao et al. 2018;Pierce et al. 2018).
We hypothesised that any competitive effects of exotic annuals on native annuals in 2015 may have been driven by B. rubens, as this exotic species was facilitated more strongly and consistently than any other (Fig. 1, Suppl. material 1: Table A3). To test this, we regressed the abundance of native annuals (all species combined) against the abundance of B. rubens for each year of the study. We found no negative relationship between the net abundance of native annual species and B. rubens in 2015 or any other year (Suppl. material 1: Fig. A1), suggesting that any competitive effects of exotics on natives were not driven by B. rubens alone. However, it is important to note that B. rubens has strongly reduced the abundance of native competitors in arid environments similar to our study system (Brooks 2000;Salo 2005). The source of such context-dependence is unclear, but the presence of competitive interactions between native and invasive species at fine spatial scales is consistent with an extensive literature (reviewed by Levine et al. 2003;Vila and Weiner 2004;Mitchell et al. 2006;Liao et al. 2008;Vila et al. 2011).
Our data do not speak to the mechanisms by which facilitation occurred. Nonmutually exclusive mechanisms of facilitation include seed trapping, amelioration of abiotic stress, modification of soil biogeochemical processes, increasing pollinator visitation, and/or providing herbivore protection (reviewed by Filazzola and Lortie 2014). Importantly, we do not know whether native and exotic species were facilitated via the same mechanisms. If native and exotic species generally capitalise on different mechanisms of facilitation, conservationists could potentially manage biological invasions by disrupting the mechanistic pathways specific to exotics.
Our findings have practical implications. Because E. californica canopies were hotspots for the abundance of native and exotic annual species, conservationists may consider targeting their efforts to control invasive species under shrub canopies. For example, herbicide applications to reduce the density of invasive species and subsequent reseeding efforts to increase the density of native species (Huddleston and Young 2005;Rowe 2010;Clements et al. 2017) might yield the greatest dividends if focused under shrub canopies. In addition, reducing the density of B. rubens and S. barbatus under shrub canopies could help decrease wildfire risk by reducing fine fuel loads (Brooks 1999;Brooks et al. 2004). Positive feedbacks between wildfire and the abundance of exotic invaders are well documented (reviewed by D'Antonio and Vitousek 1992; Brooks et al. 2004), and wildfire-invasion feedbacks may cause rapid state changes in dryland vegetation (Balch et al. 2013;Horn and St. Clair 2017).
Furthermore, our study suggests caution in using facilitation by native shrubs as a tool for restoring native biodiversity to degraded environments. Drylands in California and globally are being retired from intensive agricultural use due to drought, poor soils, and changing climate (Webb et al. 2017), presenting critical opportunities for restoring native biodiversity (Kelsey et al. 2018;Lortie et al. 2018c). In this context, facilitation by native shrubs has attracted considerable interest as a restoration tool (Padilla and Pugnaire 2006;Funk et al. 2008;Gomez-Aparicio 2009;Lortie et al. 2018c). However, strong facilitation of exotic and invasive species by E. californica could undermine restoration efforts in our study area. For instance, the San Joaquin kit fox (Vulpes macrotis ssp. mutica) is an endangered species endemic to the San Joaquin Desert (Williams et al. 1998) that has been identified as a potential target for restoration (Lortie et al. 2018c). Importantly, kit foxes avoid areas with high densities of exotic grass species like B. rubens, S. barbatus, and Hordeum murinum (Smith et al. 2005). Accordingly, facilitation of these exotic grass species by E. californica (Fig. 1) could be counterproductive to the restoration of kit foxes and many other wildlife species that avoid areas with high densities of exotic grasses (Ostoja and Schupp 2009;Freeman et al. 2014;Filazzola et al. 2017). Our study highlights the need for ecological restoration based on facilitation to be tailored to the species and environments in question (Noumi et al. 2015).

Conclusions
Our study reaffirms facilitation as an important force in the organisation of plant communities and confirms that both native and exotic beneficiary species can positively associate with native foundation shrubs. However, we found that the magnitude of facilitation depended upon the biogeographic origins of beneficiary species -at the provenance level, exotic species were facilitated in abundance much stronger than native species. Importantly and regardless of inter-annual variation in climate, the net outcome of biotic interactions that included facilitation was an annual plant community dominated (in terms of relative abundance) by exotic species. Our study stresses that the effects of ecological processes like facilitation must not be decoupled from net outcomes relevant to conservation and restoration. In systems like ours where facilitation increases the abundance of invasive species, managing positive interactions may be a useful conservation strategy.