Advancing impact assessments of non-native species: strategies for strengthening the evidence-base

The numbers and impacts of non-native species (NNS) continue to grow. Multiple ranking protocols have been developed to identify and manage the most damaging species. However, existing protocols differ considerably in the type of impact they consider, the way evidence of impacts is included and scored, and in the way the precautionary principle is applied. These differences may lead to inconsistent impact assessments. Since these protocols are considered a main policy tool to promote mitigation efforts, such inconsistencies are undesirable, as they can affect our ability to reliably identify the most damaging NNS, and can erode public support for NNS management. Here we propose a broadly applicable framework for building


Introduction
Globally, the number of introduced, non-native species (NNS) continues to increase (Seebens et al. 2017), while biological invasions are already one of the major causes of global biodiversity loss and inflict massive economic and societal costs (Bradshaw et al. 2016; Paini et al. 2016). Yet predicting the magnitude of NNS impacts remains particularly difficult (Courchamp et al. 2017; Dick et al. 2017). To identify those NNS that are most likely to cause substantial ecological and/or socio-economic damage, a wide range of prioritization tools has been proposed. These tools are generally based on previous records of impact (Kulhanek et al. 2011). Some protocols are geared towards specific types of impacts, taxa or geographical areas (e.g. Copp et al. 2009; Sandvik et al. 2013), while others aim to be more generically applicable (e.g. Blackburn et al. 2014; Nentwig et al. 2016). These impact assessments take into account existing evidence regarding NNS impacts to aid conservation managers and policy-makers in deciding how conservation resources can best be allocated.
However, impact prioritization protocols differ in how they treat available evidence on NNS impacts (e.g. relying on peer-reviewed literature only versus accepting grey literature as well), while several protocols invoke the precautionary principle (see Box 1) to guide scoring of NNS impacts. Here, we propose an integrative strategy for building and organizing the evidence base underlying NNS impact assessments. In addition, given the challenges in predicting which NNS are likely to have the most severe impacts, we support the use of the precautionary principle in the risk management stage of the risk analysis of NNS (Ahteensuu and Sandin 2012). Explicitly and publicly acknowledging the evidence included, and the choices made in NNS impact assessments, is vital for the legitimacy of any NNS management policy (Bartz and Kowarik 2019). Therefore, we suggest how one can build a comprehensive, transparent and reproducible database, and argue that applying this public database to available impact prioritization protocols will allow anyone to track and (re-)evaluate NNS rankings. In this essay, we first describe risk analysis (Fig. 1) for non-native species to provide an organized and comprehensive view of this process. We then focus on the impact assessment step, highlight some key issues and propose a novel framework that allows us to address many of those challenges simultaneously. Finally, we specifically discuss issues regarding the application of the precautionary principle in the impact assessment stage, and discuss how this important principle may be used in NNS risk analysis.

Box 1. The Precautionary Principle.
The Precautionary Principle according to the Rio Declaration: "Where there are threats of serious or irreversible damage, lack of full scientific certainty shall not be used as a reason for postponing cost-effective measures to prevent environmental degradation." (UNCED 1992, Principle 15).
The Precautionary Principle according to the European Commission: "According to the European Commission the precautionary principle may be invoked when a phenomenon, product or process may have a dangerous effect, identified by a scientific and objective evaluation, if this evaluation does not allow the risk to be determined with sufficient certainty. Recourse to the principle belongs in the general framework of risk analysis (which, besides risk evaluation, includes risk management and risk communication), and more particularly in the context of risk management which corresponds to the decision-making phase. The Commission stresses that the precautionary principle may only be invoked in the event of a potential risk and that it can never justify arbitrary decisions. The precautionary principle may only be invoked when the three preliminary conditions are met: (1) identification of potentially adverse effects, (2) evaluation of the scientific data available, and (3) the extent of scientific uncertainty." (European Commission 2000).

A primer on (NNS) risk analysis
To ensure unequivocal use of words pertaining to NNS risk assessment, this paragraph focusses on clarifying and standardizing the terminology used in this essay. In its most general form, 'risk' equals the likelihood that harm will occur multiplied by the consequences of that harm. The main standard-setting organizations (e.g. the CAC, regulating food safety; the FAO/WHO, regulating animal health; the IPPC, regulating plant health) consider a 'risk analysis' to include several discrete components, categorized as 'hazard identification', 'risk assessment', 'risk management' and 'risk communication' (Fig. 1, Fig. 2). Accordingly, the first component of a full NNS risk analysis, the 'hazard identification' step, is where it is decided for which species a risk analysis is to be conducted. This can be done proactively, for example when a horizon-scanning exercise has identified a set of potential NNS; or reactively, when an early-detection network has uncovered emerging populations of an NNS. The second component, the 'risk assessment', collates scientific evidence pertaining to the species under consideration. Risk assessments take many forms, from experimental manipulations to soliciting expert opinion (Speirs-Bridge et al. 2010; Aven 2017), but are fundamentally tools for obtaining, organizing, presenting and summarizing information for further use in management processes (Fairbrother and Bennett 1999). The risk assessment component forms the cornerstone of risk analysis, and in the context of NNS, it is typically separated into four subcomponents, corresponding to distinct stages of invasion, namely evaluation of the potential/likelihood of introduction, establishment, spread and impact (Fig. 1). Commonly used assessment protocols may focus on specific components of risk assessment (e.g. the impact component only, such as the

Figure 1. Schematic representation of a full risk analysis procedure for non-native species (modified from Maijala 2006 and EFSA 2011). Note that it is generally considered that the Precautionary Principle (PP) should be applied at the risk management stage, not at the risk assessment stage (Ahteensuu and Sandin 2012).
The third component is known as 'risk management'. Here, decision-makers consider the information and evidence collected in the preceding risk assessment, and weigh it against any other economic, political or societal factors. Risk management is thus about the selection and application of specific management policies, procedures and practices to reduce or mitigate the proliferation of damaging NNS (Abt et al. 2010; Tollington et al. 2017). Fourth and finally, 'risk communication' is closely related to the principle of transparency and to the right of societies to participate in the process of decision making. Its major function should be to ensure that all information and opinions essential for effective NNS management are exchanged among stakeholders and incorporated into the decision-making process (Goldschmidt 2017).
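The general definition of risk used in this primer (likelihood of harm multiplied by its consequences) can be illustrated with a small worked example. The sketch below is only illustrative: the species names, the probabilities and the 0 to 5 consequence scale are hypothetical values chosen for the example, not figures taken from any assessment protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hazard:
    """One hypothetical species with made-up likelihood and consequence values."""
    name: str
    likelihood: float   # assumed probability that harm occurs, between 0 and 1
    consequence: float  # assumed magnitude of harm on an arbitrary 0-5 scale

    def risk(self) -> float:
        """Risk in its most general form: likelihood multiplied by consequence."""
        if not 0.0 <= self.likelihood <= 1.0:
            raise ValueError("likelihood must lie between 0 and 1")
        return self.likelihood * self.consequence


# Hypothetical inputs: a likely but mild hazard and an unlikely but severe one.
hazards = [
    Hazard("species A", likelihood=0.9, consequence=1.0),
    Hazard("species B", likelihood=0.2, consequence=5.0),
]

# Rank from highest to lowest risk.
ranked = sorted(hazards, key=Hazard.risk, reverse=True)
for h in ranked:
    print(f"{h.name}: risk = {h.risk():.2f}")
```

With these invented numbers the unlikely but severe hazard ranks slightly above the likely but mild one, which shows why both factors need to be documented separately before they are combined.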

Current issues in NNS impact assessment
A main challenge in NNS impact assessment is the ability to evaluate, compare or even predict the magnitude of impacts attributable to a wide range of non-native taxa, often based on limited evidence (Hulme et al. 2013; Blackburn et al. 2014). Progress has been made in devising generic impact scoring protocols that are applicable across a wide range of taxa and habitats. Recent studies have further proposed methods for ensuring quality control and reducing disagreement between expert assessors (Turbé et al. 2017; Vanderhoeven et al. 2017). For example, González-Moreno et al. (2019) propose that in order to reduce inconsistencies in research findings, assessment protocols should use a five-level scoring rule, maximum aggregation of impacts and the moderation of expertise requirements. However, more fundamentally, protocols differ strongly in the kind of information they accept as an evidence base. Firstly, some protocols consider only impacts originating from the (invaded) area for which the assessment is done (e.g. EICAT), whereas others recommend incorporating impact information from other invaded ranges, or even from the native range (e.g. NNRA; Table 1). Secondly, some protocols focus on impacts reported in the peer-reviewed literature only, while others allow any kind of grey literature or expert opinion to be used (Table 2). Thirdly, protocols may not clearly discriminate between study designs, with the risk that largely anecdotal observations are treated as equally informative as experimental studies (Table 2). While protocols often include a general level of confidence for the overall (impact) assessment, this typically does not allow one to identify the type or source of uncertainty. Consequently, reasons for self-reported low levels of confidence in the output of (impact) assessments remain mostly unexplained (Vanderhoeven et al. 2017).

Figure 2. Schematic representation visualizing the main differences between current practices in NNS risk assessment and the framework for evidence mapping proposed here. Building a transparent and searchable database whereby the evidence base is grouped according to the relevance of the geographical area from where the information is reported, the type of publication, study design and impact direction facilitates evaluating the robustness of NNS impact assessments, and makes them more legitimate for policy decisions.
Finally, multiple protocols invoke the precautionary principle as a guideline for building and interpreting the impact evidence base (Table 3). Impact assessment protocols typically consist of a list of questions regarding invader impacts for distinct impact categories. Using the precautionary principle as justification, several protocols instruct expert assessors to give a single answer (i.e. a single impact score) for each category, using only the most severe impact case study encountered, thereby effectively ignoring studies showing less severe impacts. Additionally, when aggregating the answers across impact categories into an overall score, multiple protocols also refer to the precautionary principle when recommending to rank NNS based on the most severe impact scores only. This is controversial, as the consensus view is that the precautionary principle is relevant only for the 'risk management' component, and not for 'risk assessment' (Aven 2011; Ahteensuu and Sandin 2012; Fig. 1, Fig. 2, Box 1). Indeed, the precautionary principle is a normative principle that allows policy-makers to opt for certain cost-effective measures when there are threats of serious or irreversible damage, even if there is a lack of full scientific certainty (UNCED 1992; European Commission 2000; Ahteensuu and Sandin 2012; Garnett and Parsons 2017).
These issues underlying current NNS impact assessments are problematic for several reasons. First, even if conducted for the same species and the same region, differences across protocols in the evidence utilized may lead to inconsistencies in NNS rankings (Matthews et al. 2017). This hinders the acceptance and uptake of NNS biosecurity strategies by decision makers and the general public (McGeoch et al. 2012). Second, NNS impact reports typically derive from a wide range of sources. Although observational and experimental peer-reviewed studies are an important source of impact information, many impacts are only reported in the grey literature. Especially when data are scarce, all available information can be valuable, whether it comes from grey or peer-reviewed literature. This, however, makes it especially relevant to explicitly document which data are considered, because assessments that include anecdotal information inherently differ in the quality of their underlying evidence from assessments based on experimental data only. Accepting different quality levels may result in inconsistencies in assessment outcomes (White et al. 2019). A transparent and systematic classification of the evidence base, which allows going back to the sources, is crucial to avoid a 'data laundering' process (Strubbe et al. 2011), whereby stakeholders use the results of the impact assessments to draw conclusions without being aware of the potentially limited quality of the underlying evidence. Third, the rise of 'invasive species denialism' (Ricciardi and Ryan 2017; Russell and Blackburn 2017) challenges invasion biologists to better present the available evidence, because disagreements often arise when uncertainty on impacts is confounded by differences in personal values. Minimizing social conflict in NNS management will need more than evidence alone. For example, Crowley et al. (2017a) advocate for the implementation of social impact assessments ('SIA') for identifying, evaluating and addressing social costs and benefits, and to enable meaningful public participation in management planning. Hence, especially in our contemporary 'post-truth' world (Higgins 2016), compiling and presenting a transparent and publicly available evidence base for informed risk assessment, risk management, and risk communication should be a core concern for invasion ecologists.
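The effect of scoring each impact category by its most severe case study only, as discussed in this section, can be made concrete with a small sketch. The category names and scores below are hypothetical, and the 0 to 5 scale is an assumption made for the example rather than the scale of any particular protocol. The code contrasts a worst-case (maximum) summary with a summary that also reports the median and the number of studies.

```python
from statistics import median

# Hypothetical impact scores (0 = no impact ... 5 = most severe), one per study.
scores = {
    "competition": [1, 1, 2, 5],
    "predation": [0, 1],
}


def summarize(category_scores):
    """Report worst case, median and study count for every non-empty category."""
    return {
        category: {
            "max": max(values),        # the only value kept by worst-case scoring
            "median": median(values),  # central tendency across all studies
            "n": len(values),          # how many studies support the summary
        }
        for category, values in category_scores.items()
        if values
    }


summary = summarize(scores)

# Overall score when species are ranked on the most severe category only.
overall_max = max(entry["max"] for entry in summary.values())

for category, entry in summary.items():
    print(category, entry)
print("overall (worst case):", overall_max)
```

In this invented data set a single study drives the worst-case score for "competition", while the other three studies report low impacts. Reporting the spread alongside the maximum keeps that information visible to whoever makes the management decision.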

A framework to map variation in the evidence base
To address these challenges, we propose a framework by which all available information is systematically classified and catalogued, in order to create a transparent, objective and reproducible evidence base. Multiple impact assessment protocols already require expert evaluators to document the most severe impact case studies encountered (e.g. Blackburn et al. 2014; Bacher et al. 2018), but similar reporting of all literature sources assessed is typically not mandatory. Here, instead of following protocol-specific instructions regarding eligible evidence, we suggest separating out the initial construction of the evidence base of NNS impacts (Fig. 2), making this independent of the scoring protocol. This evidence base can subsequently be used for impact scoring in combination with a chosen impact assessment protocol (and provided to stakeholders, the general public, and/or reviewers). We propose that this evidence base (a) must include each and every source documenting an invader impact, not only the most severe case study, and (b) that each and every source is catalogued using at least the following four variables (see Table 4 for a summary).
A first important yet variable decision NNS assessment protocols make (see Table 1) is deciding from which 'geographical area' invader impacts are included.Invasion impacts are context-dependent, as they, among others, depend on the environment in which the impact occurs (Pyšek and Richardson 2010;Bartz and Kowarik 2019).For a flexible implementation of this decision, we propose that invader impacts are classified in the evidence base depending on whether the information comes from (a) the geographical area for which the assessment is performed, (b) from other non-native Table 1.Geographical areas considered by a set of commonly used NNS risk or impact assessment protocols.The protocols listed in the Tables below vary in their scope, from purely impact assessment (such as EICAT and GISS) to full risk analysis tools (e.g.GABLIS).We here focus on the 'impact assessment' component common to each protocol.To illustrate how available impact assessments differ in various ways, we here highlight protocols whose impact assessment module potentially meets the minimum standards set by the European Union (Roy et al. 2014).We additionally included the more recent EICAT protocol because it has been adopted by the IUCN as their tool of choice to rank invaders based on the magnitude of their impacts.

AquaNIS
AquaNIS considers two 'impact blocks'.'Regional impacts' involves data on species impacts in the invaded region under consideration.'General impacts' refers to impact data from 'any world region', whereby AquaNIS does not explicitly discriminate between impacts from the native range or impacts from other invaded areas.(Olenin et al. 2014.)EFSA EFSA instructions require consideration of impacts in other invaded regions and also allow to use impacts from species native ranges: "A review of the type and intensity of the current environmental impact in other invaded regions (outside the risk assessment area) is required.From this information, the prevalent ecological role and the ecological interactions the pest establishes (or is expected to establish) in the current area of distribution and in its different developmental stages can be defined.If the species has not invaded any other area, or if the invasion is too recent and too little is known about its ecology in the invaded areas, the ecological role of the species as a driver of ecosystem change can be evaluated in the native distribution area" (EFSA 2011) ENSARS ENSARS seems to primarily rely on information from species native ranges: "Impacts on aquaculture are determined first through questions on the impact the agent has on aquatic animal production within the existing geographic range, and whether impact is likely to be comparable in the importing country.",and "Similarly, social and environmental impacts are also assessed through comparison of the original geographic range with the RA area."(Copp et al. 
2016) EPPO EPPO primarily relies on impacts reported from the invaded area under consideration, but also allows to use evidence from elsewhere, but it is not clarified whether "elsewhere" includes both species native ranges and other invaded areas: "As far as possible, evidence should be obtained from records of invasive behavior in the area under assessment or in the EPPO region.Information on invasive behavior elsewhere may also provide guidance."(Brunel et al. 2010) FISK FISK impact questions focus on impacts on the invaded range only (e.g.'In the taxon's introduced range, are there known adverse impacts to ecosystem services?"),no specific questions or guidance regarding impacts in other invaded areas or in species' native ranges are given.(Copp et al. 2016) GABLIS GABLIS seems to allow to use impact information from any area, as it states "Data used for assessment (…) may refer either to a reference area or to climatically and ecologically similar areas."Indeed, GABLIS mentions that "(…) This "invades elsewhere" criterion is one of the most important and most appropriate to carry out predictive risk assessments (Pyšek and Richardson 2007).",but also warns that "(…) transferability must be assessed with caution and on a case-by-case basis."(Essl et al. 2011) GB NNRA GB NNRA considers both impacts in the invaded region under consideration and impacts 'within its existing geographic range', but does not explicitly discriminate between species native ranges and other invaded areas.(Baker et al. 
2008) GISS GISS allows to use information from the invaded region under consideration, other invaded areas and from the species native range; "The impact scored by the GISS should ideally be observed in the focal invaded range.However, if the species shows no impact, for example because its density is still too low or it has just started spreading, no published information can be expected.In such cases, impact reports from other invaded areas ("impact elsewhere") can be taken into consideration and in some cases, even including impacts from the native range is justified, especially for species that are vectors of parasites or are toxic or allergenic (i.e., possess features that are unlikely to change between ranges)."(Nentwig et al. 2016) Harmonia+ allows to use data from any geographical region to inform the assessment, as in its key guidelines, it is stated that: "Third, to use cases that are similar in biology or geography when direct evidence appears lacking (the higher the similarity, the better)."(D'hondt et al. 2015) EICAT For so-called 'non-global' assessments, EICAT allows to use data from any other invaded region: "Non-global assessments may be carried out, based on data from the focal region or from focal regions outside the particular country or region of interest (…)".It does, however, explicitly exclude data from the native range: "Data and observations from the native range are often important components of risk assessments, but such data should not be used in estimating Current or Maximum Recorded Impacts.The EICAT scheme is purely about impact in the alien range of a species."(Blackburn et al. 
2014) NORWAY SCHEME The Generic Ecological Impact Assessments of Alien Species in Norway allows to use data from elsewhere, if there is no information available from the (invaded) region under consideration.It is not clear whether this includes the use of data from the native range, but this seems acceptable judging from the following statement: "Where data on a given species are not available, from the country or region for which it is assessed, data should, in this order, be sought from: other regions with comparable ecoclimatic conditions, other regions with different ecoclimatic conditions, other, preferable closely related, species with comparable ecological and demographic characteristics."(Sandvik et al. 2013)  ENSARS primarily relies on peer-reviewed literature, but also allows for 'other sources of reliable information', yet it does not clarify what criteria need to be met for 'other sources' to be 'reliable': "A key feature of ENSARS is that the risk assessments are, as far as possible, informed using peer-reviewed literature or other sources of reliable information, and there is therefore a 'paper trail' that enables the justification for a decision to be reviewed and subsequently be revised, should new information become available."(Copp et al. 2016) EPPO EPPO (European and Mediterranean Plant Protection Organization) allows a wide range of data sources to be considered: "Available sources of information to run the process include: NPPO data, scientific literature, personal communications from scientists and botanists, websites and databases on invasive alien plants.Existing PRAs (Pest Risk Assessments) also need to be consulted (e.g. on the EPPO and NPPO (International Plant Protection Convention) websites)."(Brunel et al. 2010) FISK No guidance was found regarding the types of information that are acceptable for informing FISK impact questions.
A 2013 background and guidance document prepared by the 'Salmon and Freshwater Team' mentions that when answering FISK questions, the assessor should "Provide a justification for that response (i.e.bibliographic source, background information, etc.)."This seems to suggest that a wider range of sources is accepted (i.e.not only peerreviewed literature).(Salmon andFreshwater Team 2013, Copp et al. 2016 GABLIS GABLIS allows to use impact information from a wide range of sources, as it states "Data used for assessment may result from scientific reports and peer-reviewed publications as well as from expert judgement (…)." (Essl et al. 2011) GB NNRA No explicit guidance could be found on which data sources are considered acceptable for informing the GB NNRA.(Baker et al. 2008) GISS GISS seems to exclusively rely on peer-reviewed literature, as it states that "(…) the GISS relies on published evidence of the impacts caused rather than on expert knowledge (…)" and "If no publications on impact can be found, this species cannot be scored by the GISS."(Nentwig et al. 2016) Harmonia + Harmonia + does not explicitly state which documents can inform the assessment, but seems open to include a wide range of sources as it states that: "Key guidelines are, firstly, to base answers as much as possible on evidence and not on a purely hypothetical or speculative basis.

"). (D'hondt et al. 2015) EICAT
EICAT mentions that different data types can be used, classifying data as 'Observed' (e.g.empirical observation, designed observational studies) versus 'Inferred' (e.g.outcomes of mathematical models), but does not explicitly mention what data sources can be used.An IUCN EICAT evaluation excel sheet, however, mentions that: "Information on the impacts of an alien species may be taken from a range of sources including journal articles, books, scientific reports, websites, grey literature (unpublished) and personal communications."(Blackburn et al. 2014) NORWAY SCHEME The Generic Ecological Impact Assessments of Alien Species in Norway accepts a wide range of data sources: "Scientific publications, reports as well as unpublished data are accepted as documentation, as long as the latter are made available by the experts.Documentation also includes reporting the complete input values of models performed, not merely their output."(Sandvik et al. 2013) areas invaded by the species, (c) from the species' native range, or (d) from captivity or cultivation.We acknowledge that there may be border cases where it could be difficult to allocate a specific study to one category or the other.Such ambiguity could be commented upon in the database so that other assessors can investigate these cases and consider categorizing them differently.Assessors may also consider investigating how alternative allocations affect the final conclusions (i.e. a sensitivity analysis).Choosing which geographical areas are relevant can depend on the goal of the assessment.For example, the EICAT protocol (Blackburn et al. 
2014) is based only on impacts that have been observed in the invaded area under consideration.Other protocols explicitly aim at quantifying not only the actual, but also the potential impact invaders can have in the invaded area under consideration, by incorporating impacts recorded elsewhere as No references found.ENSARS only uses the wording 'precautionary approach' once, but it does not refer to the interpretation of impacts.ENSARS includes a pre-screening component which corresponds to the initial hazard identification phase of the risk analysis process.Here, 'precautionary approach' is used to justify that "(…) toolkits are based on the generally accepted premise that organisms invasive in other parts of the world have an increased chance of being invasive in new areas with similar environmental conditions", and thus seems to allow to use information from other native invaded ranges as well.(Copp et al. 2016) EPPO No references found.(Brunel et al. 2010) FISK FISK does not mention the PP explicitly, but the 2013 background and guidance document prepared by the 'Salmon and Freshwater Team' mentions that, for scoring uncertainty, "A question is counted as unanswered if any of these items is not completed -in such a case, a default (precautionary) score is given (i.e. the highest possible value)." FISK thus invokes the PP to assign the highest possible uncertainty score if an assessor cannot fully answer a given question, independent of the reason that the question cannot be answered.(Salmon andFreshwater Team 2013, Copp et al. 
2016) GABLIS GABLIS first refers to the PP in the introduction: "Management opportunities for IAS are mostly restricted to early stages of invasions …, hence the early, ideally ex ante identification of IAS is an urgent need.The priority of this precautionary principle is recognized by the Convention on Biological Diversity (CBD 1992)."GABLIS assigns IAS to a listing approach, and invokes the PP in this listing process through stating that: "The allocation to a list is based on the precautionary approach: if at least one criterion is assessed with "yes", the alien species is assigned to the Black List" Furthermore, GABLIS invokes the PP when discussing how uncertainty is treated: "Thus, any methodology for the assessment of future impacts inevitably includes a certain probability of error resulting from insufficient data or wrong data interpretation.GABLIS covers this uncertainty by placing alien species for which deleterious impacts on biodiversity are insufficiently known on the Grey List.This is also supported by the precautionary principle of the CBD (2000,2002).As Genovesi and Shine (2003) put it: "Where there is a threat of significant reduction or loss of biological diversity, lack of full scientific certainty should not be used as a reason for postponing measures to avoid or minimize such a threat".Lastly, GABLIS explicitly states that "In other words, the precautionary principle should be employed as a significant guideline for assessing the risks posed by IAS (Genovesi and Shine 2003).We have explicitly included the precautionary principle in GABLIS (…) ". (CBD 1992, CBD 2000, CBD 2002, Genovesi and Shine 2003, Essl et al. 2011) GB NNRA No references found in the Baker et al. 
(2008) publication outlining this risk assessment. The NNS website (http://www.nonnativespecies.org) does refer to the Convention on Biological Diversity (CBD), which emphasizes the need for a precautionary approach towards NNS, and mentions that the NNRA "has been developed to help facilitate such an approach in Great Britain". (Baker et al. 2008)

GISS
GISS explicitly invokes the PP once, to justify why the highest impact score should be chosen when there is conflicting evidence: "If several studies report different impact levels in the same category, the maximum is chosen as a representation of the highest potential impact a species can reach (precautionary principle)." (Nentwig et al. 2016)

Harmonia+
Harmonia+ explicitly refers to the PP when describing its key guidelines: "Second, to always employ the precautionary principle; e.g., by taking the worst-case scenario when different scenarios are possible. This is in line with a primary principle from the Convention on Biological Diversity (COP 2002)." (COP 2002; D'hondt et al. 2015)

EICAT
EICAT invokes the PP multiple times, sometimes using the wording 'precautionary approach' as a synonym. "The EICAT scheme takes a precautionary approach: when the main driver of change is unclear, it should be assumed to be the alien taxon for the purposes of the EICAT process." "We note that invasion, and by extension impact, is a characteristic of a population, rather than a species: not all populations of a given taxon necessarily become invasive. It follows that the EICAT classification of a taxon will generally reflect impact recorded from one or a small number of populations, and hence that population level impacts translate into taxon-level assessments. This reflects the precautionary principle for alien impacts, as impact caused by one population suggests the potential for other populations of the same taxon to cause similar impact elsewhere if they were transported outside of their natural boundaries." "As most taxa that are alien and have impacts somewhere have not been introduced to many of the locations where they could potentially thrive and have impacts, the vast majority of assessments will use 'focal region' data to generate a global level species assessment. Again, this reflects the precautionary principle for alien impacts, which is important as there is evidence that many alien taxa can have strong impacts in at least part of their invaded range, if distributed sufficiently widely." (Blackburn et al. 2014)

NORWAY SCHEME
The Generic Ecological Impact Assessments of Alien Species in Norway invokes and discusses the PP, mainly to justify a 'One Out, All Out' scoring: "… a species is categorized by selecting the highest risk category of which at least one criterion is met. Criteria used to assess species should not simply be summed, because this may result in an intermediate risk category for species that score extremely high on one criterion but low on others (cf. Makowski and Mittinty 2010)." (Makowski and Mittinty 2010; Sandvik et al. 2013)

Table 4. Proposed impact evidence variables and metadata recorded for each evidence entry in an impact assessment evidence base. When assignment to a single category is difficult, this can be flagged in the comments column or the entry can be given a dual coding.

Species
Scientific name of the organism under assessment. Criteria for including non-native species in the assessment.

Impact category or mechanism
Specific to the impact assessment protocol chosen.

Study design
Experimental
Qualitative/quantitative study using a qualitative/quantitative experimental manipulation of the mechanisms by which the invader is presumed to have an effect (allows inference on magnitude and causality of impact).

Non-experimental
A study that uses a qualitative/quantitative, but non-experimental, scientific sampling design (allows inference on magnitude but not causality of impact).

Anecdotal
Casual observation acquired without a sampling design (only allows inferences on presence/absence of impact, not on magnitude or causality).

Indirect report
Impact not observed by person reporting it or sources that do not report primary data (impacts cannot be verified).

Impact direction
Deleterious
Evidence entry explicitly reports deleterious impact.

Beneficial
Evidence entry explicitly reports beneficial impact.

No impact
Covers cases where no impact is explicitly reported.

Metadata
Source identifier; Evidence entry identifier (for entries coming from a source containing multiple pieces of evidence); Year in which evidence was made available; Source language; Geographical region; Country; Detailed location of reported impact (e.g. nearby city or coordinates); Full bibliographic reference of source; Expert assessor name; and a short written description of relevant evidence.
a proxy (Bomford 2008; Nentwig et al. 2016, see Table 1). Indeed, both Matthews et al. (2017) and Verbrugge et al. (2010) found that the geographical area considered was a root cause of variability among NNS impact classifications. As an example, depending on the European country where the impact information was taken from, the fish species Umbra pygmaea is classified either as a low priority 'non-invasive' introduced species or as a 'high risk' invader (Verbrugge et al. 2010). When impacts are clearly labelled based on their geographical context, assessors can transparently debate and decide which evidence is or is not incorporated into the specific assessment they are carrying out, and the consequences of using different criteria on invasive species impact rankings can be transparently assessed and discussed.
Second, evidence should be classified according to its "source type", as either (a) peer-reviewed literature, (b) non-peer-reviewed ("grey") literature or (c) unpublished data (i.e. personal communication, personal observation, unpublished data). NNS assessment protocols again differ in which source types are included (Table 2), so an evidence base that is structured accordingly allows for flexible use and evaluation of this criterion. For example, the fact that Chlamydia psittaci (the bacterium causing the zoonotic infectious disease psittacosis) is present in Belgian non-native ring-necked parakeet (Psittacula krameri) populations is only known from a single line in a grey literature report (Vangeluwe et al. 2004). Thus, the type of publication that is allowed into the evidence base, or its weight, may have a marked effect on the assessment outcome and on the identification of which kinds of impact may be most threatening.
Third, the evidence should also be explicit about the 'study design'. This is important, as even peer-reviewed studies can differ strongly in the amount and quality of the evidence they provide. Therefore, we propose to classify the study design as either (a) an experimental study, i.e. any study using a qualitative/quantitative experimental manipulation of the mechanisms by which the invader is presumed to have an effect, so causality can be inferred, (b) a non-experimental study, i.e. any study using a qualitative/quantitative scientific sampling design to quantify associations between NNS and impacts, without being able to definitively establish causality, (c) an anecdotal report, i.e. any casual observation acquired without a qualitative/quantitative scientific sampling design, so presence/absence of impact can be inferred but neither magnitude nor causality, or (d) an indirect report, i.e. data not observed by the person reporting it, or sources that do not report primary data, so impacts cannot be verified.
Fourth, the evidence should include the direction of the impacts encountered (Schlaepfer et al. 2011; Tanner et al. 2017; Dickey et al. 2018; Hagen and Kumschick 2018). Impact assessments typically only consider deleterious impacts (Baker et al. 2008), and often do not take into account evidence of no impact. Besides deleterious impacts, NNS can also have beneficial impacts, and information on such beneficial impacts may be used by policy-makers in the subsequent risk management and risk communication steps. Indeed, recent European Union legislation aimed at combatting NNS explicitly states that risk assessments should include "a description of the known uses for the species and social and economic benefits deriving from those uses" (EU Regulation 1143/2014, Art. 5, 1(g)). Along similar lines, Branquart et al. (2016) mention that when NNS impacts are offset against perceived gains, cataloguing such gains belongs within the scope of the broader risk analysis. A concrete example of such possible beneficial effects is shown by the invasion by Scotch broom (Cytisus scoparius) in New Zealand, where this plant is considered valuable by beekeepers (Jarvis et al. 2006), whereas farmers and the forestry industry consider it a pest and opt for releasing biocontrol agents. Including direction of impacts therefore will also highlight that impacts (in either direction) may not be fully objective and can be "user-dependent": some impacts may be scored differently by distinct sections of the scientific community and the general public. We therefore strongly advocate that information on absent and (apparent) beneficial invader impacts is fully included in a transparent and systematic manner in the evidence database. By making evidence of beneficial impacts part of the evidence base, policy-makers or conservation managers can rely on this information in later stages of the risk analysis process, i.e. in the risk management step (Fig. 1, Fig. 2) where any other relevant economic, political or societal factors are considered.
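The four classification variables described above (geographical area, source type, study design and impact direction) can be thought of as fields of a single evidence record. The following is a minimal sketch of such a record, not a prescribed implementation: the class and field names, the helper functions and the example entry are all hypothetical, and a real evidence base would also carry the metadata listed in Table 4.

```python
from dataclasses import dataclass
from enum import Enum


class SourceType(Enum):
    PEER_REVIEWED = "peer-reviewed literature"
    GREY = "non-peer-reviewed (grey) literature"
    UNPUBLISHED = "unpublished data"


class StudyDesign(Enum):
    EXPERIMENTAL = "experimental"
    NON_EXPERIMENTAL = "non-experimental"
    ANECDOTAL = "anecdotal"
    INDIRECT = "indirect report"


class Direction(Enum):
    DELETERIOUS = "deleterious"
    BENEFICIAL = "beneficial"
    NO_IMPACT = "no impact"


@dataclass(frozen=True)
class EvidenceEntry:
    """One piece of impact evidence (field names are illustrative)."""
    species: str
    impact_category: str      # specific to the assessment protocol chosen
    region: str               # geographical area where the impact was recorded
    source_type: SourceType
    study_design: StudyDesign
    direction: Direction
    reference: str


def supports_causality(entry: EvidenceEntry) -> bool:
    # Only experimental studies allow inference on causality.
    return entry.study_design is StudyDesign.EXPERIMENTAL


def supports_magnitude(entry: EvidenceEntry) -> bool:
    # Experimental and non-experimental studies allow inference on magnitude.
    return entry.study_design in (
        StudyDesign.EXPERIMENTAL,
        StudyDesign.NON_EXPERIMENTAL,
    )


# Hypothetical entry, invented purely for illustration.
entry = EvidenceEntry(
    species="Species X",
    impact_category="competition",
    region="invaded range",
    source_type=SourceType.GREY,
    study_design=StudyDesign.ANECDOTAL,
    direction=Direction.DELETERIOUS,
    reference="hypothetical report",
)
print(supports_causality(entry), supports_magnitude(entry))
```

Keeping each variable as an explicit, enumerated field means that the choice of which evidence is admissible can be made, and documented, after the evidence base has been compiled.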
Our framework aims at strengthening the existing standards towards transparency and reproducibility in NNS impact assessments, in line with current scientific trends. For example, Galanidi et al. (2018) recently applied the EICAT protocol for prioritizing marine invasive fishes in the Mediterranean and published the full underlying evidence base, detailing for each impact encountered (not only the worst ones) the geographical area where the impact was recorded and the study design of the manuscript reporting it. In that sense, our proposal to build and publish the evidence base responds to the needs identified by the scientific community. The criterion 'geographical area' refers to the issue of transferability of non-native species impacts, while study design and source type are both associated with the credibility and reproducibility of scientific findings. Direction of impact is included here because of its relevance for the broader risk analysis process and this information is valuable in the later stages following risk management. We aim to promote such organization and reporting of impact evidence. We stress that, here, we do not make a judgement about which evidence should or should not be considered in invasive species impact assessment, but we do call for an evidence base that includes all known information, allowing one to transparently track which decisions any study may have made. We further note that our framework would also facilitate the interchange or publication of data sets. This can prevent unnecessary replication of literature review efforts, facilitate rapid updating, enable comparison of outcomes of assessments with respect to different assessment protocols, and promote the involvement of other stakeholders. Technical barriers for sharing data have recently been lowered by the emergence of digital platforms such as, for example, Data Dryad, Figshare, Zenodo, or the GBIF Integrated Publishing Toolkit (Groom et al. 2017). Impact assessment databases can be made findable, shareable and citable using resolvable Digital Object Identifiers (DOI, Kahn and Wilensky 2006).
Figure 3. Hypothetical example of an impact assessment carried out following the evidence mapping framework presented here. The figure shows how impact evidence can differ across dimensions, and that consequently, including or excluding certain classes of evidence (and how they are scored) can strongly change impact assessment final outcomes. The size of the dots is proportional to the number of impacts reported in the literature. In this hypothetical example, including impact evidence from native-range grey literature would result in the species under consideration being assigned a far higher threat level compared to only considering peer-reviewed studies from the invaded area under consideration. Ranking NNS based on their threat level can be done either by averaging impact scores across impact categories ('mean impact scoring'), or based solely on the most severe impact recorded ('worst-case impact scoring').
Fig. 3 provides a hypothetical example illustrating that impact scores can be strongly dependent on whether certain source and evidence types, or geographical areas, are included or excluded in the final impact assessment. Classifying the evidence base according to the variables outlined above prior to the actual scoring allows one to transparently track why impact classifications may differ between scoring protocols. White et al. (2019) recently applied the evidence base scheme proposed here to first collate evidence on impacts caused by parakeets and then leveraged this information to carry out a GISS-based impact assessment for all parakeet species introduced to Europe. They found that the types of evidence included in assessments strongly influenced outcomes, whereby, for example, including evidence from the native range or anecdotal evidence resulted in a switch from minimal-moderate to moderate-major overall impact scores (Fig. 4). Such transparency is important for application of assessment outcomes by different users and supports the communicability and acceptance of assessment results (Bartz and Kowarik 2019). We should however note that uncertainty in NNS impact assessments has many sources (reviewed by McGeoch et al. 2012). Our framework may help by addressing the 'epistemic' uncertainty due to data quality. Having all information on available evidence at hand will help assessors assign impact scores and the associated uncertainty. For example, an assessor could opt for maximum scoring (and assign the invader to a 'high impact' category) but report a large degree of uncertainty (because the evidence for the more severe impacts comes from grey literature studies). That is one of the ways we envisage our evidence base will facilitate invasive species impact assessments.
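To make the dependence on admissible evidence concrete, the sketch below filters a small evidence base by geographical area and source type before applying worst-case scoring. Everything in it is an assumption made for illustration: the entries, the numeric scores and the function name are hypothetical and do not come from any of the protocols or data sets discussed here.

```python
# Hypothetical evidence entries; scores are invented for illustration only.
evidence = [
    {"region": "invaded", "source": "peer-reviewed", "score": 1},
    {"region": "invaded", "source": "peer-reviewed", "score": 2},
    {"region": "invaded", "source": "grey", "score": 3},
    {"region": "native", "source": "grey", "score": 4},
]


def worst_case_score(entries, regions, sources):
    """Highest score among entries that pass the admissibility filter."""
    admissible = [
        e["score"]
        for e in entries
        if e["region"] in regions and e["source"] in sources
    ]
    # With no admissible evidence, report 0 (no impacts known).
    return max(admissible, default=0)


# Only peer-reviewed evidence from the invaded area.
strict = worst_case_score(evidence, {"invaded"}, {"peer-reviewed"})

# All regions and all source types.
permissive = worst_case_score(
    evidence, {"invaded", "native"}, {"peer-reviewed", "grey"}
)

print(strict, permissive)  # prints "2 4"
```

Because the filter is applied to an already classified evidence base, the reason for the difference between the two outcomes can be traced directly to the admitted entries.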

A prudent use of the precautionary principle
Both according to scholars and to legal entities such as the EU, the precautionary principle is part of the risk 'management' step, allowing policy-makers to take certain decisions even if there is no full scientific certainty or agreement (European Commission 2000; Ahteensuu and Sandin 2012, Fig. 1, Box 1). Introducing the precautionary principle during the impact scoring in risk 'assessment' comes down to reframing this principle from a rule that guides how we should 'act or take decisions', to a principle about what we should 'believe' (Harris and Holm 2002). Such use of the precautionary principle may be problematic for two reasons. First, the precautionary principle likely is one of the underlying causes of systematic differences between assessment outcomes, as protocols that invoke this principle score invader impacts solely on the worst recorded impacts and consequently tend to result in higher impact scores compared to protocols that do not rely on the precautionary principle. Such disagreement between different protocols can be problematic and undermine the credibility of assessments (see e.g. Vanderhoeven et al. 2017; Turbé et al. 2017; González-Moreno et al. 2019). Second, instead of providing 'a scientific and objective evaluation' (European Commission 2000), basing impact assessment scoring on the precautionary principle 'may encourage assessors to select information that portrays alien species in the worst possible light' (Matthews et al. 2017). Dahlstrom et al. (2011) remark that variation regarding the incorporation of the precautionary principle can contribute to disparate outcomes of different risk assessment frameworks. Along similar lines, Heard et al. (2011), for example, note that it had become difficult to quantify how widespread a threat the non-native chytrid fungus Batrachochytrium dendrobatidis really is to the world's IUCN Red Listed amphibians because, motivated by the precautionary principle, assessors routinely list the disease as a contributing factor, mostly without any evidence on whether the disease has actually been detected (a more recent analysis does confirm that the chytridiomycosis panzootic is a leading cause of species extinctions at a global scale, Scheele et al. 2019).
Figure 4. Effect of allowing only impact evidence from the invaded area under assessment (orange lines) versus also including evidence from other geographical areas (brown lines) on impact assessment outcomes for ring-necked (left) and monk parakeets (right) introduced to Europe. Impacts were scored according to the GISS protocol (Nentwig et al. 2016), whereby the magnitude of impact is quantified with six levels ranging from 0 (no impacts known) to 5 (the highest possible impact, see Table 2 in Nentwig et al. 2016). Spider graphs are drawn using maximum scoring (i.e. based on the worst recorded impact for each impact mechanism) based on data from White et al. (2019).
The precautionary principle is an important and explicit part of several impact assessment schemes (Table 3). However, some protocols, such as AquaNIS, EFSA and EPPO, make no reference to the precautionary principle, while others explicitly invoke the use of this principle when assessing risks and impacts (e.g. EICAT, GISS, Harmonia+; see Table 3 for full details). When referenced in NNS impact assessment, we argue the precautionary principle is typically used (1) to limit the evidence base to only the most severe documented impacts, (2) to justify using the most severe impact recorded as sole criterion for ranking NNS, and (3) to lower the evidence bar needed to accept an impact. In contrast, we contend that the evidence base accompanying impact assessments should include all relevant studies encountered, not only the most severe ones. Next, impact assessments result in impact scores, and these scores need to be integrated to allow ranking invaders according to their overall impacts. While we reject the precautionary principle as justification for using maximum impacts as the sole acceptable criterion for categorizing invaders, maximum scoring can be useful for identifying NNS which may, depending on location or context, have a single, large impact. For example, in Europe, the ruddy duck (Oxyura jamaicensis) threatens the survival of the endangered native white-headed duck (Oxyura leucocephala) through hybridization (Munoz-Fuentes et al. 2007). This 'most severe' impact alone formed the justification for an eradication campaign targeting the species (Henderson 2009). Importantly, our evidence framework allows for easy inclusion of alternative criteria. For example, when an invader is capable of damaging its environment in multiple ways, listing all impacts and averaging them (both within impact categories, and when summarizing impact category scores into an overall impact score) allows for a more representative picture of how likely and how high impacts currently really are, even if there is no single, major impact known (D'hondt et al. 2015; Nentwig et al. 2016). Consider, as a theoretical example, an impact assessment scheme that would employ ten different impact mechanisms for which impact scores need to be derived from the literature. Imagine we have two invasive species: species A attains a 'moderate' impact score for only one category, while species B attains a 'minor' impact in four categories and a 'moderate' impact in another two. Maximum scoring will conclude species A and B pose a similar overall threat, while averaging-based scores will assign species B to a higher threat level compared to A. Having an evidence base that includes all known impact case studies additionally allows one to, for example, provide histograms of impact scores, further clarifying the distribution of NNS impact evidence and severity.
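The theoretical comparison of species A and B can be worked through numerically. The sketch below assumes a hypothetical numeric coding (no impact = 0, minor = 1, moderate = 2) and treats the mechanisms without recorded impact as scoring 0; neither assumption is taken from a specific protocol.

```python
# Assumed numeric coding of impact levels (hypothetical, for illustration).
SCORE = {"none": 0, "minor": 1, "moderate": 2}

# Ten impact mechanisms per species, as in the theoretical example.
species_a = ["moderate"] + ["none"] * 9
species_b = ["minor"] * 4 + ["moderate"] * 2 + ["none"] * 4


def max_score(levels):
    """Worst-case scoring: the most severe impact recorded."""
    return max(SCORE[level] for level in levels)


def mean_score(levels):
    """Mean scoring: average across all impact mechanisms."""
    return sum(SCORE[level] for level in levels) / len(levels)


print(max_score(species_a), max_score(species_b))    # prints "2 2"
print(mean_score(species_a), mean_score(species_b))  # prints "0.2 0.8"
```

Under these assumptions, maximum scoring cannot separate the two species, whereas the mean score of species B is four times that of species A.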
Lastly, we argue that calling upon the precautionary principle encourages impact assessors (most likely unintentionally) to give a greater weight to any evidence suggesting an impact, regardless of the origin, type and quality of underlying evidence (Harris and Holm 2002). Quantifying invader impacts is fraught with difficulties (see above, Courchamp et al. 2017), yet, under the guise of the precautionary principle, invasion biology sometimes drifts into strong inferences that species have a greater impact than is objectively justified by the evidence (see for example Strubbe et al. 2011). This may be motivated by concerns that decision-makers will not act when a NNS suspected of damaging its environment is not unequivocally designated as a high-impact invader. Yet, invoking the precautionary principle in impact assessment risks over-emphasizing likely context-dependent impacts and can mask actual differences in impact between species. This not only leads to disagreements between experts on the magnitude of NNS impacts (Crowley et al. 2017b; Davis and Chew 2017; Russell and Blackburn 2017), but may also fuel public opposition to NNS management, especially for so-called charismatic invaders such as most birds and many mammals (Dana et al. 2013; Estévez et al. 2015). In addition, using the precautionary principle in impact assessments may lead to an inflation in the number of species classified as high-impact invaders, straining and potentially misallocating the resources available for NNS management (Matthews et al. 2017). We therefore argue against the use of the precautionary principle during impact assessment.

Conclusions
We recommend that NNS impact assessments (i) focus on constructing a transparent, complete, reproducible and preferably public database that maps all evidence according to a set of main criteria, (ii) explicitly mention which (often protocol-specific) criteria have been applied to select 'admissible evidence' from the database, and (iii) do not involve the precautionary principle in their database construction or scoring (Fig. 1). This improves the scientific basis upon which informed decision-making (sensu Fairbrother and Bennett 1999) can take place. We contend that adopting such an approach will promote better and societally-supported policy and management of NNS, which is ultimately needed to reduce their ecological and socio-economic impacts.

Table 2. Source types accepted by a set of commonly used NNS risk or impact assessment protocols.

AquaNIS
No specific guidance is given, but the impact section mentions 'Evidence of environmental and socio-economic effects, documented in the peer-reviewed literature, is stored under general impacts', so AquaNIS likely only accepts impacts available through peer-reviewed literature. (Olenin et al. 2014)

EFSA
EFSA only mentions 'review of the available scientific literature and documents'. There is no explicit reference to peer-reviewed literature, and thus it is currently unclear what is considered to be 'scientific' under the EFSA framework. (EFSA 2011)

ENSARS

Table 3. Reference to the precautionary principle by a set of commonly used NNS risk or impact assessment protocols.