Citation

urn:lsid:arphahub.com:pub:8D1BC1DD-8175-5933-B147-C839B202D5BA

urn:lsid:zoobank.org:pub:C132FA7C-A22D-4474-AAAF-B7A6889D0FF9

NeoBiota

1619-0033 1314-2488

Pensoft Publishers

10.3897/neobiota.44.31650

31650

Research Article

Biological Invasions Conservation Biology Ecological risk assessment

Cenozoic

Europe

Consistency of impact assessment protocols for non-native species

Pablo González-Moreno (1); pablo.gonzalez@uco.es; https://orcid.org/0000-0001-9764-8927
Lorenzo Lazzaro (2); https://orcid.org/0000-0003-0514-0793
Montserrat Vilà (3); https://orcid.org/0000-0003-3171-8261
Cristina Preda (4, 5)
Tim Adriaens (6); https://orcid.org/0000-0001-7268-4200
Sven Bacher (4); https://orcid.org/0000-0001-5147-7165
Giuseppe Brundu (7); https://orcid.org/0000-0003-3076-4098
Gordon H. Copp (8, 9); https://orcid.org/0000-0002-4112-3440
Franz Essl (10)
Emili García-Berthou (11); https://orcid.org/0000-0001-8412-741X
Stelios Katsanevakis (12); https://orcid.org/0000-0002-5137-7540
Toril Loennechen Moen (13)
Frances E. Lucy (14)
Wolfgang Nentwig (15)
Helen E. Roy (16); https://orcid.org/0000-0001-6050-679X
Greta Srėbalienė (17)
Venche Talgø (18)
Sonia Vanderhoeven (19); https://orcid.org/0000-0002-6298-5373
Ana Andjelković (20, 21); https://orcid.org/0000-0001-6616-1710
Kęstutis Arbačiauskas (22)
Marie-Anne Auger-Rozenberg (23)
Mi-Jung Bae (11, 24)
Michel Bariche (25)
Pieter Boets (26)
Mário Boieiro (27); https://orcid.org/0000-0002-9087-091X
Paulo Alexandre Borges (27); https://orcid.org/0000-0002-8448-7623
João Canning-Clode (28, 29, 30)
Federico Cardigos (29)
Niki Chartosia (31)
Elizabeth Joanne Cottier-Cook (32); https://orcid.org/0000-0002-1466-6802
Fabio Crocetta (33)
Bram D'hondt (34)
Bruno Foggi (35); https://orcid.org/0000-0001-6451-4025
Swen Follak (36)
Belinda Gallardo (37); https://orcid.org/0000-0002-1552-8233
Øivind Gammelmo (38); https://orcid.org/0000-0002-6026-9023
Sylvaine Giakoumi (39)
Claudia Giuliani (40)
Guillaume Fried (41)
Lucija Šerić Jelaska (42)
Jonathan M. Jeschke (43, 44, 45); https://orcid.org/0000-0003-3328-4217
Miquel Jover (46)
Alejandro Juárez-Escario (47)
Stefanos Kalogirou (48); https://orcid.org/0000-0002-3064-9236
Aleksandra Kočić (49)
Eleni Kytinou (12)
Ciaran Laverty (50)
Vanessa Lozano (7)
Alberto Maceda-Veiga (3)
Elizabete Marchante (51); https://orcid.org/0000-0003-1303-7489
Hélia Marchante (51, 52)
Angeliki F. Martinou (53)
Sandro Meyer (54)
Dan Minchin (55, 56)
Ana Montero-Castaño (3)
Maria Cristina Morais (51, 57)
Carmen Morales-Rodriguez (58)
Naida Muhthassim (15)
Zoltán Á. Nagy (59)
Nikica Ogris (60)
Huseyin Onen
Jan Pergl (61)
Riikka Puntila (62)
Wolfgang Rabitsch (63)
Triya Tessa Ramburn (64)
Carla Rego (27); https://orcid.org/0000-0001-8005-4508
Fabian Reichenbach (15)
Carmen Romeralo (65, 66); https://orcid.org/0000-0002-8510-9915
Wolf-Christian Saul (43, 44, 45)
Gritta Schrader (67)
Rory Sheehan (14)
Predrag Simonović (68); https://orcid.org/0000-0002-4819-4962
Marius Skolka (5)
António Onofre Soares (69)
Leif Sundheim (70)
Ali Serhan Tarkan; https://orcid.org/0000-0001-8628-0514
Rumen Tomov (71)
Elena Tricarico (2); https://orcid.org/0000-0002-7392-0794
Konstantinos Tsiamis (72); https://orcid.org/0000-0003-1192-3516
Ahmet Uludağ
Johan van Valkenburg (73); https://orcid.org/0000-0001-7281-7819
Hugo Verreycken (74); https://orcid.org/0000-0003-2060-7005
Anna Maria Vettraino (75)
Lluís Vilar (46)
Øystein Wiig (76)
Johanna Witzell (66); https://orcid.org/0000-0003-1741-443X
Andrea Zanetta (4, 77)
Marc Kenis

CABI, Egham, UK

Department of Biology, University of Florence, Florence, Italy

Estación Biológica de Doñana (EBD-CSIC), Sevilla, Spain

University of Fribourg, Department of Biology, Fribourg, Switzerland

Ovidius University of Constanta, Department of Natural Sciences, Constanta, Romania

Research Institute for Nature and Forest (INBO), Brussels, Belgium

Department of Agriculture, University of Sassari, Sassari, Italy

Salmon & Freshwater Team, Cefas, Lowestoft, UK

Centre for Conservation Ecology and Environmental Science, Bournemouth University, Poole, UK

Division of Conservation, Vegetation and Landscape Ecology, University Vienna, Vienna, Austria

GRECO, Institute of Aquatic Ecology, University of Girona, Girona, Spain

University of the Aegean, Department of Marine Sciences, Mytilene, Greece

Norwegian Biodiversity Information Centre, Trondheim, Norway

CERIS, Institute of Technology, Sligo, Ireland

Institute of Ecology and Evolution, University of Bern, Bern, Switzerland

Centre for Ecology & Hydrology, Crowmarsh Gifford, UK

Marine Science and Technology Centre, Klaipėda University, Klaipėda, Lithuania

Norwegian Institute of Bioeconomy Research (NIBIO), Ås, Norway

Belgian Biodiversity Platform, Walloon Research Department for Nature and Agricultural Areas (DEMNA), Service Public de Wallonie, Gembloux, Belgium

Institute for Plant Protection and Environment, Belgrade, Serbia

Department of Biology and Ecology, Faculty of Sciences, University of Novi Sad, Novi Sad, Serbia

Nature Research Centre, Akademijos Street 2, LT-08412 Vilnius, Lithuania

INRA, UR633, Zoologie Forestière (URZF), Orléans, France

Freshwater Biodiversity Research Division, Nakdonggang National Institute of Biological Resources, Gyeongsangbuk-do 37242, Republic of Korea

Department of Biology, American University of Beirut, Beirut, Lebanon

Provincial Centre of Environmental Research (PCM), Ghent, Belgium

cE3c – Centre for Ecology, Evolution and Environmental Changes/Azorean Biodiversity Group and Universidade dos Açores – Depto de Ciências e Engenharia do Ambiente, Azores, Portugal

MARE – Marine and Environmental Sciences Centre, Madeira Island, Portugal

Centre of IMAR of the University of the Azores, Department of Oceanography and Fisheries, Horta, Azores, Portugal

Smithsonian Environmental Research Center, Edgewater, MD 21037, USA

OKEANOS - Research Center – Universidade dos Açores, Azores, Portugal

Department of Biological Sciences, University of Cyprus, Nicosia, Cyprus

Scottish Association for Marine Science, Scottish Marine Institute, Oban, UK

Department of Integrative Marine Ecology, Stazione Zoologica Anton Dohrn, Villa Comunale, I-80121 Napoli, Italy

Biology Department, Ghent University, Ghent, Belgium

Austrian Agency for Health and Food Safety, Institute for Sustainable Plant Production, Vienna, Austria

Applied and Restoration Ecology Group (IPE-CSIC), Zaragoza, Spain

BioFokus, Oslo, Norway

Université Côte d’Azur, CNRS, UMR 7035 ECOSEAS, Nice, France

Department of Pharmaceutical Sciences (DISFARM), University of Milan, Milan, Italy

Plant Health Laboratory, Anses, Montferrier-sur-Lez, France

Department of Biology, Faculty of Science, University of Zagreb, Zagreb, Croatia

Leibniz-Institute of Freshwater Ecology and Inland Fisheries (IGB), Berlin, Germany

Freie Universität Berlin, Department of Biology, Chemistry, Pharmacy, Institute of Biology, Berlin, Germany

Berlin-Brandenburg Institute of Advanced Biodiversity Research (BBIB), Berlin, Germany

Unitat de Botànica, Facultat de Ciències, Campus de Montilivi, University of Girona, Girona, Spain

Department of Horticulture, Fruit Growing, Botany and Gardening, Agrotecnio, ETSEA, University of Lleida, Spain

Department of Forest and Crop Science, Agrotecnio, ETSEA, University of Lleida, Spain

Hellenic Centre for Marine Research, Hydrobiological Station of Rhodes, Rhodes, Greece

Department of Biology, Josip Juraj Strossmayer University of Osijek, Osijek, Croatia

Department of Marine Sciences, University of the Aegean, Lesvos Island, Greece

School of Biological Sciences, Medical and Biological Centre, Queen’s University Belfast, UK

Centre for Functional Ecology, Department of Life Sciences, University of Coimbra, Coimbra, Portugal

Instituto Politécnico de Coimbra, Escola Superior Agrária de Coimbra, Coimbra, Portugal

Joint Services Health Unit, RAF Akrotiri, British Forces Cyprus, Cyprus

Department of Environmental Sciences, University of Basel, Basel, Switzerland

Marine Science and Technology Centre, Klaipeda University, Klaipeda, Lithuania

Marine Organism Investigations, Ballina, Killaloe, Ireland

Centre for the Research and Technology of Agro-Environmental and Biological Sciences, Department of Biology and Environment, University of Tras-os-Montes and Alto Douro, Vila Real, Portugal

Pathology of Woody Plants, Technische Universität München, TUM, Freising, Germany

Phytophthora Research Centre, Department of Forest Protection and Wildlife Management, Faculty of Forestry and Wood Technology, Mendel University in Brno, Brno, Czech Republic

Slovenian Forestry Institute, Ljubljana, Slovenia

Department of Plant Protection, Faculty of Agriculture, Gaziosmanpasa University, Tokat, Turkey

Department of Invasion Ecology, Institute of Botany, The Czech Academy of Sciences, Průhonice, Czech Republic

Marine Research Centre, Finnish Environment Institute, Helsinki, Finland

Environment Agency Austria, Vienna, Austria

Simon Fraser University, Burnaby, Canada

Sustainable Forest Management Research Institute, University of Valladolid-INIA, Palencia, Spain

Swedish University of Agricultural Sciences, Faculty of Forest Sciences, Southern Swedish Forest Research Centre, Alnarp, Sweden

Julius Kuehn Institute (JKI), Braunschweig, Germany

Faculty of Biology & Institute for Biological Research “Siniša Stanković”, University of Belgrade, Belgrade, Serbia

cE3c – Centre for Ecology, Evolution and Environmental Changes/Azorean Biodiversity Group and University of the Azores – Faculty of Sciences and Technology, Açores, Portugal

Faculty of Fisheries, Muğla Sıtkı Koçman University, Muğla, Turkey

University of Forestry, Department of Plant Protection, Sofia, Bulgaria

European Commission, Joint Research Centre (JRC), Ispra, Italy

Faculty of Agriculture, Çanakkale Onsekiz Mart University, Çanakkale, Turkey

National Plant Protection Organization (NPPO), Wageningen, The Netherlands

Research Institute For Nature and Forest (INBO), Linkebeek, Belgium

Department for Innovation in Biological, Agro-food and Forest systems, University of Tuscia, Viterbo, Italy

Natural History Museum, University of Oslo, Oslo, Norway

Swiss Federal Research Institute WSL, Biodiversity and Conservation Biology, Birmensdorf, Switzerland

CABI, Delémont, Switzerland

Corresponding author: Pablo González-Moreno (p.gonzalez-moreno@cabi.org)

Academic editor: P. Hulme

2019

01 04 2019

44 1 25 FDD337D9-19DE-5B13-91EA-6F66B30FBBA6 2633461 14 11 2018 26 02 2019

Pablo González-Moreno, Lorenzo Lazzaro, Montserrat Vilà, Cristina Preda, Tim Adriaens, Sven Bacher, Giuseppe Brundu, Gordon H. Copp, Franz Essl, Emili García-Berthou, Stelios Katsanevakis, Toril Loennechen Moen, Frances E. Lucy, Wolfgang Nentwig, Helen E. Roy, Greta Srėbalienė, Venche Talgø, Sonia Vanderhoeven, Ana Andjelković, Kęstutis Arbačiauskas, Marie-Anne Auger-Rozenberg, Mi-Jung Bae, Michel Bariche, Pieter Boets, Mário Boieiro, Paulo Alexandre Borges, João Canning-Clode, Federico Cardigos, Niki Chartosia, Elizabeth Joanne Cottier-Cook, Fabio Crocetta, Bram D'hondt, Bruno Foggi, Swen Follak, Belinda Gallardo, Øivind Gammelmo, Sylvaine Giakoumi, Claudia Giuliani, Guillaume Fried, Lucija Šerić Jelaska, Jonathan M. Jeschke, Miquel Jover, Alejandro Juárez-Escario, Stefanos Kalogirou, Aleksandra Kočić, Eleni Kytinou, Ciaran Laverty, Vanessa Lozano, Alberto Maceda-Veiga, Elizabete Marchante, Hélia Marchante, Angeliki F. Martinou, Sandro Meyer, Dan Minchin, Ana Montero-Castaño, Maria Cristina Morais, Carmen Morales-Rodriguez, Naida Muhthassim, Zoltán Á. Nagy, Nikica Ogris, Huseyin Onen, Jan Pergl, Riikka Puntila, Wolfgang Rabitsch, Triya Tessa Ramburn, Carla Rego, Fabian Reichenbach, Carmen Romeralo, Wolf-Christian Saul, Gritta Schrader, Rory Sheehan, Predrag Simonović, Marius Skolka, António Onofre Soares, Leif Sundheim, Ali Serhan Tarkan, Rumen Tomov, Elena Tricarico, Konstantinos Tsiamis, Ahmet Uludağ, Johan van Valkenburg, Hugo Verreycken, Anna Maria Vettraino, Lluís Vilar, Øystein Wiig, Johanna Witzell, Andrea Zanetta, Marc Kenis

This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Abstract

Standardized tools are needed to identify and prioritize the most harmful non-native species (NNS). A plethora of assessment protocols have been developed to evaluate the current and potential impacts of non-native species, but consistency among them has received limited attention. To estimate the consistency across impact assessment protocols, 89 specialists in biological invasions used 11 protocols to screen 57 NNS (2614 assessments). We tested if the consistency in the impact scoring across assessors, quantified as the coefficient of variation (CV), was dependent on the characteristics of the protocol, the taxonomic group and the expertise of the assessor. Mean CV across assessors was 40%, with a maximum of 223%. CV was lower for protocols with a low number of score levels, which demanded high levels of expertise, and when the assessors had greater expertise on the assessed species. The similarity among protocols with respect to the final scores was higher when the protocols considered the same impact types. We conclude that all protocols led to considerable inconsistency among assessors. In order to improve consistency, we highlight the importance of selecting assessors with high expertise, providing clear guidelines and adequate training but also deriving final decisions collaboratively by consensus.

Keywords Environmental impact expert judgement invasive alien species policy management prioritization risk assessment socio-economic impact

European Cooperation in Science and Technology

501100000921

http://doi.org/10.13039/501100000921

Citation

González-Moreno P, Lazzaro L, Vilà M, Preda C, Adriaens T, Bacher S, Brundu G, Copp GH, Essl F, García-Berthou E, Katsanevakis S, Moen TL, Lucy FE, Nentwig W, Roy HE, Srėbalienė G, Talgø V, Vanderhoeven S, Andjelković A, Arbačiauskas K, Auger-Rozenberg M-A, Bae M-J, Bariche M, Boets P, Boieiro M, Borges PA, Canning-Clode J, Cardigos F, Chartosia N, Cottier-Cook EJ, Crocetta F, D’hondt B, Foggi B, Follak S, Gallardo B, Gammelmo Ø, Giakoumi S, Giuliani C, Fried G, Jelaska LS, Jeschke JM, Jover M, Juárez-Escario A, Kalogirou S, Kočić A, Kytinou E, Laverty C, Lozano V, Maceda-Veiga A, Marchante E, Marchante H, Martinou AF, Meyer S, Minchin D, Montero-Castaño A, Morais MC, Morales-Rodriguez C, Muhthassim N, Nagy ZA, Ogris N, Onen H, Pergl J, Puntila R, Rabitsch W, Ramburn TT, Rego C, Reichenbach F, Romeralo C, Saul W-C, Schrader G, Sheehan R, Simonović P, Skolka M, Soares AO, Sundheim L, Tarkan AS, Tomov R, Tricarico E, Tsiamis K, Uludağ A, van Valkenburg J, Verreycken H, Vettraino AM, Vilar L, Wiig Ø, Witzell J, Zanetta A, Kenis M (2019) Consistency of impact assessment protocols for non-native species. NeoBiota 44: 1–25. https://doi.org/10.3897/neobiota.44.31650

Introduction

Coupled with the increasing evidence of adverse impacts exerted by some non-native species (NNS) on native species and ecosystems (Katsanevakis et al. 2014, Vilà et al. 2011, Vilà and Hulme 2017), there is an increasing demand for robust and user-friendly impact assessment protocols to be used by professionals with different levels of expertise and knowledge. Such protocols are needed to predict impacts of new or likely invaders as well as to assess the actual impact of established species. Scientists, environmental managers, conservationists, and policy makers are developing and implementing approaches to prevent further NNS introductions and their subsequent establishment, spread and impact. Risk analysis associated with these four main phases of the invasion process is used to inform management decisions, such as whether to eradicate or control species that arrive despite prevention efforts (Leung et al. 2012). Assessment of the realized or potential impacts of NNS is particularly important for the prioritization of management actions (Essl et al. 2011). However, the large variety of metrics adopted to measure the impacts undermines direct comparison of impacts across species, groups of taxa, localities or regions (Vilà et al. 2010). To this end, protocols to integrate and synthesize the empirical evidence of NNS impacts are needed in order to ensure a rational use of resources (McGeoch et al. 2016), or for prioritizing species for subsequent risk assessment (Brunel et al. 2010, Copp et al. 2009).

Robust NNS impact protocols should ideally result in accurate and consistent impact scores for a species even if applied by different assessors, as long as the assessors have adequate expertise in the assessed species and context. However, despite the importance of consistency in impact protocols, we have little understanding of the patterns in consistency of impact scores across assessors and protocols and, more importantly, of which factors contribute to high levels of consistency. The level of consistency in species scores across assessors may depend on the characteristics of the protocol (e.g. taxonomic and environmental scope, impact types included), but also on the available scientific evidence of impact, and the level of expertise of assessors. For instance, we may expect high consistency (i.e. low impact score variability) across assessors for well-studied species, or when all assessors have an in-depth understanding of the species under consideration.

Several international and national organizations and research groups have developed NNS protocols (Table 1). The common aspect of most of these protocols is that they allow a ranking of NNS according to the threat they pose to the risk assessment area. These have been applied for identifying and assessing potential NNS impacts at different spatial scales, e.g. continental (Nentwig et al. 2010) or national (D’hondt et al. 2015). However, these protocols differ in several aspects. For example, they vary according to their objective, with some considering only environmental impacts whereas others are broader and include socio-economic or ecosystem services impacts (Leung et al. 2012, McGeoch et al. 2016, Vanderhoeven et al. 2017). Some protocols were designed to be taxonomically generic (e.g. GB-NNRA), whereas others are specific for the screening of certain taxonomic groups such as fish or other aquatic organisms (e.g. FISK, MI-ISK, FI-ISK, Amph-ISK, EPPO-PRI; see Table 1), particular habitats (e.g. BINPAS), or pathways (Panov et al. 2009). Moreover, the existing protocols vary considerably in complexity, such as the number of questions, the need for peer review, the use of additional software (e.g. spreadsheet or online form), the ways of assessing uncertainty (Vanderhoeven et al. 2017), and the scoring system used, which can be categorical, ordinal or continuous (Roy et al. 2018). The content and structural differences among protocols could lead to differences in the assessment results (Leung et al. 2012).

Table 1.

Characteristics of impact assessment protocols used in the study. Each protocol is characterized in terms of a) the taxonomic group the protocol could be used for, b) the impact categories included (environmental alone or environmental and socio-economic), c) the final scoring scale (i.e. three levels, five levels, or more than five levels), d) whether the final score is based on the maximum score of impacts (yes/no), e) whether the protocol included questions on species spread as part of a risk assessment (yes/no), f) the number of questions contributing to the final score, and g) the mean assessor expertise on species required to fill in the questionnaire (1–5 scale based on 63 online anonymous questionnaire responses).

Protocol	Full name	Taxonomic groups	Impact categories	Final scoring scale	Final scoring based on maximum score	Spread questions included	Number of questions	Expertise on species required	Reference
BINPAS	Biological Invasion Impact/Biopollution Assessment	Aquatic animals	Environmental	5	yes	yes	5	3.50	(Narščius et al. 2012, Olenin et al. 2007, Zaiko et al. 2011)
EICAT	Environmental Impact Classification for Alien Taxa	All	Environmental	5	yes	no	9	3.37	(Blackburn et al. 2011, Hawkins et al. 2015)
EPPO-EIA	European Plant Protection Organisation-Environmental Impact Assessment for plants (EPPO-EIA-PL) and terrestrial invertebrates (EPPO-EIA-IN)	Terrestrial plants and invertebrates	Environmental	5	yes	no	8 (Plants); 9 (invert.)	3.16	(Kenis et al. 2012)
EPPO-PRI	EPPO-Prioritization scheme	Plants	Environmental and socio-economic	3	yes	yes	11	3.00	(Brunel et al. 2010)
FISK (and related)	Fish Invasiveness Screening Kit (FISK); Freshwater Invertebrate Invasiveness Screening Kit (FI-ISK); Marine Fish Invasiveness Screening Kit (MFISK); Marine Invertebrate Invasiveness Screening Kit (MI-ISK)	Aquatic animals	Environmental and socio-economic	3	no	yes	49	4.12	(Copp 2013, Copp et al. 2009, Panov et al. 2009, Tricarico et al. 2010)
GABLIS	German-Austrian Black List Information System	All	Environmental	3	yes	yes	12	3.22	(Essl et al. 2011)
GB-NNRA	Great Britain Non-native Species Risk Assessment	All	Environmental and socio-economic	5	no	yes	33	3.90	(Baker et al. 2008, Mumford et al. 2010)
GISS	Generic Impact Scoring System	All	Environmental and socio-economic	>5 (discrete with max 60)	no	no	12	3.46	(Nentwig et al. 2010, 2016)
Harmonia+	Belgian risk screening tools for potentially invasive plants and animals	All	Environmental and socio-economic	>5 (continuous)	yes	yes	20	3.46	(D’hondt et al. 2015)
ISEIA	Belgian Invasive Species Environmental Impact Assessment	All (not marine for this study)	Environmental	3	no	yes	4	2.81	(Branquart 2009)
NGEIAAS	Norway Generic Ecological Impact Assessment of Alien Species	All	Environmental	5	yes	yes	11	4.34	(Gederaas et al. 2012, Sandvik et al. 2013)

A few comparative analyses have addressed differences in the structure of impact assessment protocols (Essl et al. 2011, Heikkilä 2011, Vilà et al. 2019) and their consistency in ranking species across regions (Matthews et al. 2017). However, these studies have focused on a small number of protocols and a short list of species (Křivánek and Pyšek 2006, Turbé et al. 2017). An in-depth comparison across taxa and across standardized protocols is missing for Europe (Essl et al. 2011) and elsewhere (Snyder et al. 2013). Such a comparison is urgently required to respond to the European legislation on invasive NNS (Regulation (EU) No 1143/2014). The aim of the present study was to test for consistency in assessment scores across assessors through comparison of several NNS impact assessment protocols. To address this aim, 89 invasive NNS specialists used 11 protocols to assess the potential impact of 57 species not native to Europe and belonging to a wide array of taxonomic groups (plants, animals, pathogens) from terrestrial, freshwater and marine environments. The specific questions considered were: 1) How consistent are species scores across assessors? 2) To what extent does consistency depend on the protocol characteristics, i.e. the impact categories considered (environmental and socio-economic) and the structural complexity of the protocol (number of questions and scoring system)? 3) How is consistency related to the characteristics of the NNS (taxonomic group, habitat type, and available scientific knowledge of the species)? 4) What is the relation between consistency and assessor expertise? 5) Do different protocols provide similar final scores or species rankings? Based on the study results, we provide recommendations on how the robustness and applicability of protocols could be improved for assessing NNS impacts.

Material and methods

Selection of impact assessment protocols

Eleven commonly used, scientifically based protocols developed or applied in Europe for the evaluation of NNS impacts were selected for comparison (Table 1). The selection was made by consensus among 36 European experts in NNS risk assessment at the AlienChallenge COST Action workshop in April 2014 (Rhodes, Greece). We included all protocols developed and officially used at the national or continental level in Europe (e.g. EPPO, Harmonia+ and GB-NNRA) and the main protocols used by the European research community (e.g. GISS and FISK). Only the EFSA protocol was discarded from this selection due to the complexity of extracting and processing the data. Furthermore, during the selection we aimed to cover the major types and groups of protocols in order to guarantee enough variability in their characteristics. The selection does not consider risk analysis tools or updates that have become available after 2015, such as AS-ISK (Copp et al. 2016), which replaces FISK and the other -ISK toolkits and complies with the minimum standards for NNS risk analysis under Regulation (EU) No 1143/2014 (Roy et al. 2018). Risk assessments are usually divided into four components that consider the potential for a non-native species to enter a region, establish, spread and cause impacts. The selection included impact assessment and risk assessment protocols, for which we compared only the sections dealing with spread and impact, as these are largely interrelated. Each protocol uses its own method to calculate the final score per species from the responses (i.e. aggregation method): maximum impact, accumulated impact, categorization matrix or decision trees, an independent summary question, or a combination of any of these methods. Owing to the number of protocols used in the present study and their complexity, no attempt was made to standardize the score aggregation methods; rather, where possible, this variability was accounted for through covariates in the data analysis.
Some protocols can be applied to any taxon while others are specific to particular groups or habitats (e.g. BINPAS and FISK are used only for aquatic animals, EPPO Prioritization for plants). As such, the number of protocols assessed per species varied depending on the taxonomic group (Table 1). Although all the -ISK toolkits (FISK, FI-ISK, Amph-ISK, MFISK, MI-ISK) were used for their respective taxonomic groups, in the data analyses all the versions were listed under ‘FISK’ because of their high similarity. For the same reason, the EPPO-EIAs for insects/pathogens and plants were listed together.

Each protocol was characterized according to several variables (Table 1): the categories of impact considered (environmental alone or environmental and socio-economic), the inclusion of questions on species spread (yes/no), the final scoring scale (i.e. three levels, five levels or more than five levels), whether the protocol used a maximum aggregation method (i.e. the largest value of a set of values) to calculate the final score (yes/no), the number of questions requiring input from the assessors and contributing to the final score, and the expertise on the species required to complete the protocol. The latter was based on 63 responses received from an online anonymous questionnaire distributed to all assessors, which included a question asking them to rate their agreement (from 1 = disagree to 5 = fully agree) with the statement: “This protocol requires a high level of expertise on the species”. Assessors answered this question for each protocol after having completed all assessments. The response values were averaged per protocol to provide a single estimate of the level of expertise required for that NNS protocol (Table 1).
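To make the difference between aggregation methods concrete, the following minimal Python sketch contrasts two of the methods named in this section, maximum impact and accumulated impact. The function names and the per-question scores are hypothetical and are not taken from any of the protocols; real protocols apply further rules (e.g. categorization matrices or decision trees) that are not modelled here.

```python
from typing import Sequence


def aggregate_max(scores: Sequence[float]) -> float:
    """Maximum-impact aggregation: the largest single response is the final score."""
    return max(scores)


def aggregate_sum(scores: Sequence[float]) -> float:
    """Accumulated-impact aggregation: all responses are summed."""
    return sum(scores)


# Hypothetical responses (0 = no impact, 5 = highest impact) to five impact questions.
species_a = [0, 0, 5, 0, 0]  # one severe impact
species_b = [3, 3, 3, 3, 3]  # several moderate impacts

# Does species A rank above species B under each aggregation method?
a_first_by_max = aggregate_max(species_a) > aggregate_max(species_b)
a_first_by_sum = aggregate_sum(species_a) > aggregate_sum(species_b)
```

In this illustration, species A ranks above species B under maximum aggregation and below it under accumulated aggregation, which is one way in which structural differences among protocols could translate into different species rankings.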

Selection of species

A total of 57 species not native to Europe, belonging to different taxonomic groups and occurring in terrestrial, freshwater, and marine environments, were selected (Suppl. material 1: Table S1). Among them, only two species are native to a part of Europe (Arion vulgaris and Dreissena polymorpha). The list of species was also agreed by consensus at the Alien Challenge COST Action workshop in April 2014 (Rhodes, Greece). During the workshop, the experts were grouped according to their taxonomic expertise under the coordination of a taxonomic leader, in order to select a list of species covering a wide range of European climatic regions and habitat types, biological characteristics, and degrees and types of impact. While some NNS were widespread, very well studied and with known impacts, some had a localized geographical distribution (Suppl. material 1: Table S1). Each NNS was assigned to a specific taxonomic group and habitat type: terrestrial plants, freshwater plants, terrestrial vertebrates, terrestrial insects, other terrestrial invertebrates, freshwater invertebrates, freshwater fish, marine species, and pathogens. The scientific knowledge available for each NNS was quantified as the number of records in the Web of Science using the accepted scientific name as a query, and the biology and ecology research areas as filters (retrieved in August 2016). Additionally, the mean and coefficient of variation of the assessor expertise on each species (Suppl. material 1: Table S1) were derived through a self-evaluation questionnaire on each assessed NNS using the following classification: 1 = low (the assessor has not worked with the species); 2 = medium (the assessor has not published on the species but has expertise on it through surveys or reports); and 3 = high (the assessor has published on the species).

Assessment of non-native species

There is a large variation in methods to implement the different protocols; some are available as downloadable freeware (-ISK toolkits, the ‘NAPRA’ version of the GB-NNRA), as online applications (e.g. Harmonia+, BINPAS), whereas some have to be constructed following the text guidelines (e.g. GISS, EICAT), and others can be obtained as spreadsheets (e.g. GB-NNRA) or databases (e.g. NGEIAAS). To harmonize use of the protocols and facilitate data retrieval, a comprehensive Excel® spreadsheet template was developed to include all the protocols (see Suppl. material 2). The resulting spreadsheet was checked by the authors or owners of each protocol to ensure that it accurately depicted the original protocol whilst matching the common-practice methodology.

Using the selected protocols in the spreadsheet template, 89 assessors independently assessed between three and 11 species (mean = 3.9) of the taxonomic group in their area of expertise (i.e. terrestrial plants, aquatic plants, terrestrial vertebrates, terrestrial insects, other terrestrial invertebrates, freshwater invertebrates, freshwater fish, marine species and pathogens) (Suppl. material 1: Table S1). All assessors were researchers with expertise in biological invasions (PhD or PhD candidate) selected among the participants of the Alien Challenge COST Action by the coordinators of each taxonomic group. The experience of the assessors with NNS impact assessments varied. Most assessors had occasionally participated in NNS risk assessment exercises (59.3%), while 19.7% had never participated and 17.5% had often participated. All NNS were assessed by a minimum of five assessors (maximum eight) (Suppl. material 1: Table S1), yielding a total of 2614 assessments. Before conducting the assessments, the assessors were required to read the impact assessment guidelines provided for each protocol and, if needed, to put questions directly to the protocol developers. When scoring impacts, assessors were instructed to consider Europe as the risk assessment area and the likely worst-case scenario for each NNS. Based on the precautionary principle, protocols recommend scoring the potential impact of NNS based on the available information either from studies for the area of assessment, or from areas with the same invaded habitat in a similar climate. The assessors were instructed to base their assessments on all available literature, information sources and their own expertise, indicating in the assessment the source of the information. The selection of the literature used for the assessment was left to the discretion of the assessor.

Before retrieving the data, each assessment was checked for completeness. Once all NNS assessments were completed, the final scores for each assessment were extracted. To harmonize scores across protocols, all ordinal scores (i.e. protocols with three or five levels as final scoring scale; Table 1) were transformed into numeric values, with the lowest impact as 1 and the maximum as 3 or 5, respectively. Then, all scores were standardized from 0 to 1 using the following equation: (S – Smin)/(Smax – Smin), where S represents the score per NNS in each assessment, and Smax and Smin are the maximum and minimum possible scores provided by the protocol (Turbé et al. 2017).
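The score harmonization described above can be illustrated with a minimal Python sketch. It is not the code used in the study (the analyses were run in R); the function names and the three-level labels are hypothetical and serve only to show the ordinal-to-numeric mapping and the (S – Smin)/(Smax – Smin) rescaling.

```python
def ordinal_to_numeric(level, levels):
    """Map an ordinal label to 1..len(levels), with the lowest impact as 1."""
    return levels.index(level) + 1


def standardize(score, s_min, s_max):
    """Rescale a raw protocol score to [0, 1]: (S - Smin) / (Smax - Smin)."""
    if s_max <= s_min:
        raise ValueError("s_max must be greater than s_min")
    return (score - s_min) / (s_max - s_min)


# Hypothetical three-level scale; the labels are illustrative only.
levels = ["low", "medium", "high"]
raw = ordinal_to_numeric("medium", levels)
standardized = standardize(raw, 1, len(levels))
```

Under this mapping, the middle level of a three-level scale and the middle level of a five-level scale both standardize to 0.5, which is what makes scores comparable across protocols with different scales.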

Consistency in non-native species scoring across assessors

For each NNS and protocol (471 combinations), the mean and the coefficient of variation (CV) of the final score were calculated. The mean was used as the overall score across experts per NNS and protocol, whereas CV was used as an estimate of the consistency of scores across experts, adjusting for the mean value. First, differences in CV among all protocols were tested using a linear mixed model with protocol name as a fixed effect and species nested within taxonomic groups as random effects (i.e. random intercept model). Second, we used multimodel inference (Burnham and Anderson 2002) of linear mixed models to analyze the relationship between the CV and species characteristics (taxonomic group and available knowledge), protocol characteristics (impact categories, spread question included, final scoring scale, whether final scoring was based on maximum score, number of questions and expertise on the species required) and assessor expertise on the species (mean and coefficient of variation). In this set of models, we used the same random effects structure as in the first model but did not include protocol name as a covariate. Model residuals were checked for normality and homoscedasticity, and the square root was identified as the best transformation for CV. Multi-model inference, based on the all-subsets selection of predictors, was performed using the corrected Akaike’s Information Criterion (AIC_c) keeping the same random effects in all model combinations. For each combination of predictors, Akaike weights (w_i) were calculated. Considering the best models given the selected predictors (ΔAIC_c < 6) (Richards 2008), the relative importance w_+(j) of each predictor j was estimated as the sum of the AIC_c weights across all models in which the selected predictor appeared. Predictors with higher w_+(j) (i.e. closer to 1) have a higher weight of evidence to explain the response variable with the given data. Finally, the average of regression coefficients weighted by w_i within the subset of best models was calculated.
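As an illustration of the quantities used in this section, the following Python sketch computes the CV of assessor scores and the Akaike weights and relative importance of predictors. It is a sketch under stated assumptions, not the study's code: the CV is assumed to use the sample standard deviation (n – 1), the weights are assumed to be renormalized within the retained subset (ΔAIC_c < 6), and all function names, AIC_c values and predictor labels are hypothetical.

```python
import math
from statistics import mean, stdev


def coefficient_of_variation(scores):
    """CV = sample SD / mean of the standardized scores for one NNS and protocol."""
    if len(scores) < 2:
        raise ValueError("at least two assessments are needed")
    sd = stdev(scores)
    if sd == 0:
        return 0.0  # complete agreement (also covers the all-zero case)
    return sd / mean(scores)


def akaike_weights(aicc_values, max_delta=6.0):
    """Akaike weights for the models with delta AICc below max_delta.

    Returns {model index: weight}; models outside the subset are omitted.
    Assumption: weights are renormalized within the retained subset.
    """
    best = min(aicc_values)
    rel = {i: math.exp(-0.5 * (a - best))
           for i, a in enumerate(aicc_values) if a - best < max_delta}
    total = sum(rel.values())
    return {i: r / total for i, r in rel.items()}


def relative_importance(model_terms, weights):
    """Sum the weights of the retained models in which each predictor appears."""
    importance = {}
    for i, w in weights.items():
        for term in model_terms[i]:
            importance[term] = importance.get(term, 0.0) + w
    return importance


# Hypothetical candidate models: AICc values and the predictors each contains.
aicc = [100.0, 101.0, 104.0, 110.0]
terms = [{"scale", "expertise"}, {"scale"}, {"scale", "spread"}, {"spread"}]
weights = akaike_weights(aicc)            # the fourth model (delta = 10) is dropped
importance = relative_importance(terms, weights)
cv_sqrt = math.sqrt(coefficient_of_variation([0, 0, 0, 0, 1]))  # response on the sqrt scale
```

In this toy example the predictor present in every retained model receives a relative importance of 1. Under the sample-SD assumption, five assessors with a single dissenting maximal score give a CV of about 2.24, which is in line with the maximum of 223% reported in the results.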

Differences in the mean CV among levels for the categorical variables in the best candidate model (i.e. with the smallest AIC_c) were tested for significance using a Tukey post hoc test. Prior to modelling, continuous predictors for the models above were checked for multicollinearity using Pearson correlations. All variables were selected for further analyses considering the low correlation values found (r < 0.5; Suppl. material 1: Table S2) (Dormann et al. 2013). Continuous variables were centered (by subtracting the mean) and scaled (by dividing by the standard deviation) to facilitate interpretation of model coefficients and model convergence (Schielzeth 2010). Finally, in all models explained above we accounted for the variability in the number of assessments per NNS (5 to 8; Suppl. material 1: Table S1) (i.e. sample size effect) by including the number of assessments as a covariate (i.e. fixed effect).
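The predictor preparation described above (collinearity screening followed by centering and scaling) can be sketched as follows. This is an illustrative Python sketch rather than the study's R code; the predictor names and values are hypothetical, and flagging on the absolute value of r is an assumption.

```python
from statistics import mean, stdev


def center_and_scale(values):
    """Subtract the mean and divide by the sample standard deviation."""
    m, s = mean(values), stdev(values)
    if s == 0:
        raise ValueError("cannot scale a constant predictor")
    return [(v - m) / s for v in values]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def collinear_pairs(predictors, threshold=0.5):
    """Return (name_a, name_b, r) for predictor pairs with |r| >= threshold."""
    names = sorted(predictors)
    flagged = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson_r(predictors[a], predictors[b])
            if abs(r) >= threshold:
                flagged.append((a, b, r))
    return flagged


# Hypothetical continuous predictors; the values are illustrative only.
predictors = {
    "wos_records": [1, 2, 3, 4],
    "mean_expertise": [2, 4, 6, 8.5],
    "n_questions": [3, 1, 4, 2],
}
flagged = collinear_pairs(predictors)
scaled = {name: center_and_scale(vals) for name, vals in predictors.items()}
```

After centering and scaling, each coefficient in a model is expressed per standard deviation of its predictor, so coefficients of predictors measured in different units can be compared directly.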

Differences in impact assessment scoring across protocols

Similarities in the scoring of NNS across the different protocols were compared using hierarchical cluster analyses. Cluster analyses of the mean scores per NNS and protocol (calculations described above) were performed using Spearman’s correlation coefficient as a similarity measure and the complete linkage method (i.e. maximum distance between clusters). Using this method, we first carried out a cluster analysis of all NNS across the six protocols common to all taxonomic groups (i.e. GABLIS, GB-NNRA, EICAT, Harmonia+, GISS and NGEIAAS). Then, separate analyses were also performed for four subsets of NNS with common protocols: 1) aquatic and terrestrial plants, 2) aquatic animals (combining freshwater invertebrates, freshwater fish, and marine invertebrates), 3) terrestrial invertebrates (terrestrial insects and other terrestrial invertebrates), and 4) terrestrial vertebrates (Suppl. material 1: Table S1). Pathogens were not included in this analysis due to the low number (n = 3) of species tested. Prior to these analyses, in order to account for the variability in the number of assessments per NNS (five to eight; Suppl. material 1: Table S1) (i.e. sample size effect), we calculated Pearson’s correlation between the mean score per NNS and protocol and the number of assessments performed for all groups of species indicated above. When the correlation was significant for a group of species (p < 0.05), we used simple linear regression models to relate the mean score to the number of assessments per species and used the models’ residuals in subsequent hierarchical analyses. We followed this approach only for plants and aquatic animals, based on the significant correlation found (plants: r = -0.17, p < 0.05; aquatic animals: r = -0.17, p < 0.05). Results without this correction were similar, reinforcing the robustness of the results (Suppl. material 1: Fig. S2).
All statistical analyses and figures were carried out in R v3.4.1 (R Core Team 2017) using packages lme4, lsmeans, MuMIn and sjPlot to implement and plot mixed models and gplots for the correlation heat maps and dendrograms.
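The clustering approach described above (Spearman correlation as the similarity measure, complete linkage) can be sketched in plain Python as follows. The analyses in this study were run in R, so this is only an illustration: converting similarity to distance as 1 – rho is an assumption, and the protocol labels P1 to P4 and their scores are hypothetical.

```python
from statistics import mean


def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the ranks (non-constant inputs)."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    den = (sum((a - mx) ** 2 for a in rx) * sum((b - my) ** 2 for b in ry)) ** 0.5
    return num / den


def complete_linkage(scores):
    """Agglomerative clustering of protocols with distance = 1 - rho (assumed).

    Returns the merges as (cluster_a, cluster_b, height), in merge order.
    """
    names = list(scores)
    dist = {frozenset((a, b)): 1 - spearman(scores[a], scores[b])
            for i, a in enumerate(names) for b in names[i + 1:]}

    def linkage(c1, c2):
        # Complete linkage: the largest pairwise distance between two clusters.
        return max(dist[frozenset((a, b))] for a in c1 for b in c2)

    clusters = [(n,) for n in names]
    merges = []
    while len(clusters) > 1:
        pairs = [(linkage(c1, c2), c1, c2)
                 for i, c1 in enumerate(clusters) for c2 in clusters[i + 1:]]
        height, c1, c2 = min(pairs, key=lambda p: p[0])
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
        merges.append((c1, c2, height))
    return merges


# Hypothetical mean scores of five species under four protocols.
scores = {
    "P1": [0.1, 0.2, 0.5, 0.7, 0.9],
    "P2": [0.2, 0.1, 0.6, 0.8, 1.0],
    "P3": [0.9, 0.8, 0.4, 0.3, 0.1],
    "P4": [0.8, 0.9, 0.5, 0.2, 0.3],
}
merges = complete_linkage(scores)
```

In this toy data set, P1 and P2 rank the species almost identically and merge first, as do P3 and P4, while the two pairs rank the species in nearly opposite order and join only at the largest distance.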

Results

Consistency across assessors

The mean coefficient of variation (CV) of assessor scores per NNS and protocol was 40% (± 37% SD), with 10% (n = 470) showing complete agreement (CV = 0) among assessors, although the maximum CV reached 223% (four species in ISEIA: Aedes albopictus, Arion vulgaris, Australoheros facetus and Fascioloides magna; two species in EPPO-EIA: Diabrotica virgifera and Tuta absoluta). CV differed markedly among protocols (Fig. 1). ISEIA, EPPO-EIA and Harmonia+ protocols had the highest CV, whereas NGEIAAS and GABLIS protocols showed the lowest values. CV across assessors was better explained by protocol characteristics than by NNS characteristics (Table 2). Scoring scale, expertise required and the use of maximum impact score were the variables with the highest weight of evidence.

10.3897/neobiota.44.31650.figure1


Figure 1.

Coefficient of variation (CV) of species scoring across assessors per impact assessment protocol based on linear mixed models controlling for taxonomic group and species as nested random effects and number of assessments per species as a fixed effect. Protocols sharing a letter above the graph do not differ significantly at the 0.05 level (Tukey test). Dots indicate the least squares means per protocol. Lines indicate the confidence interval (95%) around the means.

https://binary.pensoft.net/fig/278656

According to Tukey post hoc tests in the best candidate model, protocols using three score levels had significantly lower CV than the protocols using scales with five levels (difference = 0.25, p < 0.001) or more than five levels (difference = 0.29, p < 0.001). However, protocols with five score levels were similar to protocols with more than five levels (p = 0.27). CV across assessors was significantly lower for protocols that required higher expertise than those for which low expertise was required (Table 2). The expertise required per protocol was highly correlated to the overall number of fields in the protocol (i.e. questions, comments, uncertainty and results; Pearson’s r = 0.9) but less with the number of questions actually contributing to the final score calculation (r = 0.5; Suppl. material 1: Table S2). Protocols using the maximum impact score yielded lower CV values. In terms of protocol content, CV was higher when protocols included a NNS spread module but there was no difference depending on the impact types considered (Table 2). The number of questions contributing to final score and impact categories considered did not show significant relations to CV (Table 2). Among NNS and assessor characteristics, only the mean of assessor expertise on each NNS showed a significant negative relationship with CV values (Table 2). Finally, there were some differences in CV among taxonomic groups (Fig. 2). Although not significant, terrestrial vertebrates, terrestrial plants, pathogens and freshwater invertebrates tended to show lower CVs whereas higher values were found for terrestrial insects, other terrestrial invertebrates and freshwater plants. Only terrestrial insects and freshwater plants showed a significantly higher CV than the average across all taxa (Fig. 2).

10.3897/neobiota.44.31650.figure2


Figure 2.

Mean regression coefficient and confidence interval (95%) of taxonomic groups (random effects) in the best linear mixed model explaining the coefficient of variation of scores of 57 invasive non-native species for 11 different protocols including all significant species, assessor and protocol characteristics (see Table 2).

https://binary.pensoft.net/fig/278657

Table 2.

Average coefficient and Akaike weights for each species, assessor and protocol variable within the best linear mixed models (ΔAIC_c < 6) explaining the coefficient of variation of the scores of 57 non-native species in 11 impact assessment protocols. Taxonomic groups and species identification were included as nested random effects. Predictors with a weight closer to one have a higher relative importance in explaining the response variable. Variables with a weight of zero were not included in the best subset of models used to calculate average coefficients.

Variable	Coefficient	Adjusted SE	z	P	Weight
Intercept	0.36	0.06	5.76	<0.001
Number of assessments					0
Species
Web of Science records (available knowledge)	-0.06	0.05	1.18	0.24	0.06
Assessor
Mean assessor expertise	-0.04	0.02	2.21	0.03	0.14
CV assessor expertise					0
Protocol
Scoring scale	See results section				1
Expertise required	-0.14	0.02	7.76	<0.001	1
Using maximum impact score (yes-no)	-0.12	0.02	4.93	<0.001	1
Spread (yes-no)	0.12	0.05	3.57	<0.001	0.95
Impact type					0
Number of questions					0

Consistency across protocols

The pair-wise correlations in NNS scores among the six protocols common to all taxa were highly diverse (min–max = 0.16–0.77; mean = 0.55), indicating low consistency in species scores among some protocols (Fig. 3). With respect to taxonomic groups, aquatic animals had the highest mean correlation among protocols, terrestrial invertebrates and plants showed an equally low mean correlation, and terrestrial vertebrates had the lowest correlation levels (Fig. 4). These correlations remained similar when considering only the protocols common to all three taxonomic groups (Suppl. material 1: Fig. S1) and without sample size correction (Suppl. material 1: Fig. S2). Cluster analysis identified two main groups (Fig. 3, Suppl. material 1: Fig. S1): protocols that include only environmental impacts (NGEIAAS, GABLIS, and EICAT) and protocols that include both environmental and socio-economic impacts (GB-NNRA, GISS and Harmonia+). The scorings of Harmonia+ were clearly distinct from the other protocols (indicated by lower correlation values), particularly for plants and terrestrial invertebrates (Figs 3, 4). Similarly, FISK and GABLIS showed relatively low correlation values with the other protocols for aquatic animals and terrestrial vertebrates, respectively (Fig. 4).

10.3897/neobiota.44.31650.figure3


Figure 3.

Spearman correlation matrix and hierarchical cluster of species scorings for the protocols common for all species. The color scale indicates the correlation between the species scorings obtained for each protocol pair. In brackets, the mean of all pair-wise correlations.

https://binary.pensoft.net/fig/278658

10.3897/neobiota.44.31650.figure4


Figure 4.

Spearman correlation matrix and hierarchical cluster of the species scorings for the protocols common per species group. The color scale indicates the correlation between the species scorings obtained for each protocol pair. In brackets, the mean of all pairwise correlations per group.

https://binary.pensoft.net/fig/278659

Discussion

The comparison of impact assessment protocols for NNS shows that scoring variability across assessors can be substantial, depending on the taxonomic group considered and the scoring system. However, there is potential to reduce this variability by considering the expertise of the assessors and optimizing structural characteristics of the protocol. Furthermore, the ranking of NNS based on the protocol scoring can differ depending on the approach implemented, mainly based on the impact category type considered (i.e. whether socio-economic impacts are included). Thus, the selection of the scoring approach can have important consequences on the final ranking of NNS produced.

Consistency across assessors and across taxonomic groups

Scoring consistency across assessors and for some taxonomic groups was surprisingly low. It is not clear why these large discrepancies occurred even when the assessors were experts in invasion biology within their taxonomic domain. Many factors can influence the interpretations of context dependence found in the scientific literature, which can lead to subjective and inconsistent answers even amongst expert assessors (Gilovich et al. 2002). Heuristics and bias, including intuitive strategies to process information, can lead to variability in expert responses (O’Hagan et al. 2006). For example, experts might score the impact according to the studies with which they feel most familiar (e.g. conducted by colleagues in their region). Similarly, if there is a lack of information on the impacts for a NNS, then the judgement might be biased towards a NNS of the same taxonomic lineage. Alternatively, inconsistencies might be due to inherent uncertainty. For instance, a greater inconsistency for most groups of aquatic taxa may reflect a higher difficulty in determining impacts than for taxa in other environments (Molnar et al. 2008). Finally, these biases could be balanced by anchoring effects where most assessors might assign intermediate levels of impact when there is insufficient information to fulfil the protocol requests.

Part of the variability in consistency was explained by protocol characteristics and the approaches implemented. Protocols with three score levels were more likely to show consistency among assessors than those with five or more levels. However, a three-category scoring system might not be sufficient to discriminate between NNS impacts or magnitude of impacts and rank NNS for prioritisation, because too many species will have the same score. Protocols that select the highest impact among different categories provided higher consistency. By definition, this approach will homogenise the scores towards higher values discarding inconsistencies from less important impacts in a way that results will be more conservative.

Protocols containing questions that required greater expertise on the species yielded higher scoring consistency than simpler protocols. Protocols requiring greater expertise demanded very detailed information about the species (e.g. expected population lifetime in NGEIAAS) that, when available, is very likely to be available only in few studies. Owing to the restricted number of sources of information, the variability in the final score might be low. Complex protocols might be less user-friendly and more time-consuming, but this in itself could increase focus and decrease subjectivity. Exceptions exist, e.g. the -ISK screening (Copp 2013), whereby the protocol is easy to use but the 49 questions require more time to answer than simpler tools such as ISEIA, which has only 12 questions. However, the questions from simple tools such as ISEIA focus mainly on impacts, whereas the -ISK screening tools include a much broader range of questions, such as invasion history, species traits and susceptibility to management measures. The balance between ease of use and time spent is critical as some protocols are meant to be used for the rapid screening of a NNS, whereas others provide more in-depth assessments. For example, NGEIAAS was designed for professional experts who carry out very detailed risk assessments on behalf of government authorities (Gederaas et al. 2012, Sandvik et al. 2013). This issue highlights that although we only selected impact and spread related sections, the present study compares tools intended for different phases of the risk analysis process, i.e. risk identification (e.g. ISEIA, -ISK screening tools), risk assessment (e.g. GB-NNRA, Harmonia+) and impact assessment (e.g. GISS, EICAT). Further studies could look into a detailed comparison across all phases of the risk analysis process in order to highlight those sections that might require improvement.

Regarding assessor and NNS characteristics, the only factor that significantly increased consistency among assessors was their level of expertise with the assessed species. Assessors that had previous experience with the NNS assessed may have had similarly high levels of knowledge on that NNS, and this may have led to similar scores. Nevertheless, this situation is infrequent as NNS assessments are more commonly undertaken by persons familiar with the taxonomic group but not necessarily with the NNS being assessed (e.g. NNS not yet present or still rare in the study area). Unexpectedly, consistency was not related to the availability of information about the species (i.e. higher number of WoS records). The simplest explanation is that a larger number of available studies does not necessarily mean more studies relevant for impact assessments, as the literature on these species could be linked to other research fields in invasion biology not directly associated with their environmental or socioeconomic impacts. It is also relevant to note that different assessors might have had access to different information sources, particularly non-English literature and reports. This might have affected the consistency results, but we followed standard practices for NNS risk assessments. Further studies could examine these differences by providing assessors with a common information base for the species to be assessed.

The high inconsistency found among assessors’ scores raises serious concerns and suggests that assessments conducted by single assessors should be interpreted with caution (Pheloung et al. 1999, Cousens 2008). Expert working group scoring, the use of consensus techniques and reviewing processes can inform the responses of single assessors and therefore reduce uncertainty (Sutherland and Burgman 2015, Vanderhoeven et al. 2017). For NNS lacking information or with contrasting data, structured elicitation techniques, such as the Delphi approach, which is based on a feedback and revision process (Mukherjee et al. 2015), can identify and reduce potential sources of bias among experts (Morgan 2014, Sutherland and Burgman 2015). In practice, risk assessments for NNS, in particular those carried out in the plant health sector, are usually done either by groups of experts, as in EPPO pest risk assessment, or using an independent peer reviewer and an editorial-board type vetting procedure, such as in Great Britain (Baker et al. 2008, Booy et al. 2012). The consensus approach is used for plants and plant pests because those assessments are likely to be used in international trade agreements in order to demonstrate robustness (Schrader et al. 2010). However, national or regional impact risk assessments of NNS for blacklists or prioritization purposes are often based on the judgement of a few or single experts. Thus, efforts should be made to involve a panel of experts in the species or the system following elicitation techniques.

Differences across protocols

Variations among protocols in species scoring are mainly due to the inclusion, or not, of socio-economic impacts. Although socio-economic and environmental impacts are generally correlated (Kumschick et al. 2015a, Vilà et al. 2010), it is almost impossible to predict the magnitude of one impact from the other (Bacher et al. 2018). For example, many NNS, such as agricultural pests and organisms affecting human health, exclusively cause socio-economic impacts (Kenis and Branco 2010, Kumschick et al. 2015b) and, thus, using protocols that include such impacts will affect the impact ranking of NNS under consideration. Furthermore, the perception of socio-economic impacts is likely to vary across stakeholders. Thus, depending on the target audience and objectives of the assessment, different protocols may be used, focusing either on environmental or socio-economic impacts or both together. The majority of the protocols exclusively considered environmental impacts, and there was greater correlation in scores among these protocols. However, the difference between scores was dependent on the taxonomic group under consideration. Ranking of species completely shifted (negative correlation of scores across protocols) when different impact categories were considered for terrestrial vertebrates and plants, but the difference was lower for aquatic animals. This pattern might be due to differences in the relevance of impacts across taxa, with terrestrial vertebrates showing highly contrasting impact types for single species (e.g. high economic impact but low environmental impact) (Vilà et al. 2010). However, differences in scores among taxonomic groups might again also simply reflect differences in the knowledge of their impacts. Impacts of terrestrial vertebrates or plants might be better known than those of aquatic organisms. Testing this hypothesis requires comparing uncertainty scores provided by experts across impact types and taxonomic groups, which could be done with the current dataset in further studies.

Among all protocols, Harmonia+, FISK and GABLIS led to very different scores in comparison to the other protocols. This difference was partly related to the different impact categories considered but also to the inclusion of questions beyond impact (e.g. management in GABLIS and FISK). Finally, the GB-NNRA protocol showed a variable relation with other protocols across taxa: low correlation with protocols only considering environmental impacts for plants and terrestrial invertebrates but high for vertebrates. The final score in the GB-NNRA was not automatically calculated as in the other protocols. Instead, assessors were asked to provide overall summary scores and confidence rankings for the NNS based on the answers provided in previous sections, which include questions that consider both environmental and socio-economic impacts (Baker et al. 2008, Mumford et al. 2010). This approach could have led to the inconsistent relation between the GB-NNRA protocol and the others. However, when used as part of the GB risk analysis process (Booy et al. 2012), it helps the NNS risk analysis panel to identify inconsistencies between the assessor’s individual question responses and their overall scores and confidence levels (Mumford et al. 2010).

Recommendations for NNS impact assessments

Several key factors should be taken into account when selecting or designing a NNS risk assessment protocol, such as the aim, the scope, the consistency and the accuracy of the outcomes, and the resources available to perform the assessment (e.g. time or information). As a first step, the suitability of a NNS risk assessment protocol will depend on the scope and aim of the assessment. For instance, if a NNS is already present in the region of interest, assessments on likelihood of entry and establishment are less meaningful than just the assessment of impact. Protocols with different scopes may produce different results in terms of NNS rankings (Lazzaro et al. 2016). As we have shown, even just considering different types of impacts could result in large differences in rankings. Thus, it is crucial not to mix the results from assessment methods that consider different impacts or phases of the invasion process. Furthermore, our results show that even if the focus is only on impact and spread sections, the choice of the protocol is critical because the scoring consistency will depend on the characteristics of the protocol. Three main factors were responsible for these inconsistencies: the choice of the scoring scale, how the final score is summarized and the general expertise required to use the protocol. We recommend using a 5-level scoring scale, a maximum aggregation method and moderate expertise requirements as a good compromise to reduce inconsistency without losing discriminatory power or usability. In general, we also advise protocol developers to perform sensitivity tests of consistency before final release or adoption (e.g. Pheloung et al. 1999). This is crucial because if a protocol yields inconsistent outcomes when used by different assessors, then it is likely that decisions taken based on the results could be variable and disproportionate to the actual impacts (Schrader et al. 2012).

Part of the inconsistency might also come from the way the protocol is used in practice (e.g. standardized forms, clear guidelines, selection of assessors, individual vs. group assessments). We propose three main ways to reduce this type of inconsistency. First, irrespective of the protocol, selecting a group of assessors with high expertise will yield more consistent results. Second, inconsistencies due to linguistic uncertainties (e.g. definitions, formulations, rating) can be reduced by improving the guidelines and with adequate training of the assessors (Vilà et al. 2019). Third, other studies have suggested using expert elicitation methods to reduce inconsistencies (Morgan 2014, Sutherland and Burgman 2015), such as consensus building (Mukherjee et al. 2015) or quality control mechanisms (e.g. peer-review panels). Elicitation methods can reveal whether differences in scoring outcomes between and within protocols reflect true differences in opinion, lack of evidence, or subjective biases due to protocol interpretation (Vanderhoeven et al. 2017). In fact, scientific consensus and robust revisions are crucial for policy implementation (Turbé et al. 2017). Finally, there will always be inconsistencies due to knowledge gaps and subjectivity in the interpretation of the scientific results when there is high context dependency. This might not be a problem in providing a sound evidence-base for decisions on NNS as long as protocols are used transparently and uncertainties are explicitly dealt with through appropriate methods (Vanderhoeven et al. 2017).

Acknowledgements

This article is based upon work from the COST Action TD1209: Alien Challenge. COST (European Cooperation in Science and Technology) is a pan-European intergovernmental framework. The mission of COST is to enable scientific and technological developments leading to new concepts and products and thereby contribute to strengthening Europe’s research and innovation capacities. PGM was supported by the CABI Development Fund (with contributions from ACIAR (Australia) and Dfid (UK)) and by Darwin plus, DPLUS074 ‘Improving biosecurity in the SAUKOTs through Pest Risk Assessments’. MV by Belmont Forum-Biodiversa project InvasiBES (PCI2018-092939). CP by Sciex-NMSch 12.108. JMJ and WCS by BiodivERsA (FFII project; DFG grant JE 288/7-1). JMJ by DFG project JE 288/9-1,9-2. CR and MB by Fundação para a Ciência e a Tecnologia grants SFRH/BPD/91357/2012 and SFRH/BPD/86215/2012, respectively. PS by MESTD of Serbia, grant #173025. JP by RVO 67985939 and 17-19025S. JCC was supported by a starting grant in the framework of the 2014 FCT Investigator Programme (IF/01606/2014/CP1230/CT0001).

References

Bacher

Blackburn

Essl

Genovesi

Heikkilä

Jeschke

Jones

Keller

Kenis

Kueffer

Martinou

Nentwig

Pergl

Pyšek

Rabitsch

Richardson

Roy

Saul

W-C

Scalera

Vilà

Wilson

JRU

Kumschick

(2018) Socio-economic impact classification of alien taxa (SEICAT).Methods in Ecology and Evolution9: 159–168. https://doi.org/10.1111/2041-210X.12844

Baker

Black

Copp

Haysom

Hulme

Thomas

Ellis

(2008) The UK risk assessment scheme for all non-native species. In: Rabitsch

Essl

Klingenstein

(Eds) Biological invasions: from ecology to conservation.Institute of Ecology of the TU Berlin, Berlin, 46–57.

Blackburn

Pyšek

Bacher

Carlton

Duncan

Jarošík

Wilson

JRU

Richardson

(2011) A proposed unified framework for biological invasions.Trends in Ecology & Evolution26: 333–339. https://doi.org/10.1016/j.tree.2011.03.023

Booy

Copp

Mazaubert

(2012) Réseaux d’experts et prise de décisions: l’exemple du Royaume-Uni.Sciences Eaux & Territoires Numéro6: 74–77.

Branquart

(2009) Guidelines for environmental impact assessment and list classification of non-native organisms in Belgium. Version 2.6.

Brunel

Branquart

Fried

Van Valkenburg

Brundu

Starfinger

Buholzer

Uludag

Joseffson

Baker

(2010) The EPPO prioritization process for invasive alien plants.EPPO Bulletin40: 407–422. https://doi.org/10.1111/j.1365-2338.2010.02423.x

Burnham

Anderson

(2002) Model selection and multimodel inference: a practical information-theoretic approachSecond edition.Springer-Verlag, New York, 528 pp.

Copp

(2013) The Fish Invasiveness Screening Kit (FISK) for non-native freshwater fishes: A summary of current applications.Risk Analysis33: 1394–1396. https://doi.org/10.1111/risa.12095

Copp

Vilizzi

Mumford

Fenwick

Godard

Gozlan

(2009) Calibration of FISK, an Invasiveness Screening Tool for Nonnative Freshwater Fishes.Risk Analysis29: 457–467. https://doi.org/10.1111/j.1539-6924.2008.01159.x

Copp

Vilizzi

Tidbury

Stebbing

Trakan

Miossec

Goulletquer

(2016) Development of a generic decision-support tool for identifying potentially invasive aquatic taxa: AS-ISK.Management of Biological Invasions7: 343–350. https://doi.org/10.3391/mbi.2016.7.4.04

Cousens

(2008) Risk assessment of potential biofuel species: an application for trait-based models for predicting weediness? Weed Science 56: 873–882. https://doi.org/10.1614/WS-08-047.1

D’hondt

Vanderhoeven

Roelandt

Mayer

Versteirt

Adriaens

Ducheyne

Martin

Grégoire

J-C

Stiers

Quoilin

Cigar

Heughebaert

Branquart

(2015) Harmonia+ and Pandora+: risk screening tools for potentially invasive plants, animals and their pathogens.Biological Invasions17: 1869–1883. https://doi.org/10.1007/s10530-015-0843-1

Dormann

Elith

Bacher

Buchmann

Carl

Carré

Marquéz

JRG

Gruber

Lafourcade

Leitão

Münkemüller

McClean

Osborne

Reineking

Schröder

Skidmore

Zurell

Lautenbach

(2013) Collinearity: a review of methods to deal with it and a simulation study evaluating their performance.Ecography36: 027–046. https://doi.org/10.1111/j.1600-0587.2012.07348.x

Essl

Nehring

Klingenstein

Milasowszky

Nowack

Rabitsch

(2011) Review of risk assessment systems of IAS in Europe and introducing the German-Austrian Black List Information System (GABLIS).Journal for Nature Conservation19: 339–350. https://doi.org/10.1016/j.jnc.2011.08.005

Gederaas

Moen

Skjelseth

Larsen

(2012) Non-native species in Norway – with the Norwegian Black List 2012. The Norwegian Biodiversity Information Centre, Norway.

Gilovich

Griffin

Kahneman

(2002) Heuristics and biases: The psychology of intuitive judgment. Cambridge University Press, Cambridge.

Hawkins

Bacher

Essl

Hulme

Jeschke

Kühn

Kumschick

Nentwig

Pergl

Pyšek

Rabitsch

Richardson

Vilà

Wilson

JRU

Genovesi

Blackburn

(2015) Framework and guidelines for implementing the proposed IUCN Environmental Impact Classification for Alien Taxa (EICAT). Diversity and Distributions 21: 1360–1363. https://doi.org/10.1111/ddi.12379

Heikkilä

(2011) A review of risk prioritisation schemes of pathogens, pests and weeds: principles and practices. Agricultural and Food Science 20: 15–28. https://doi.org/10.2137/145960611795163088

Katsanevakis

Wallentinus

Zenetos

Leppäkoski

Çinar

Oztürk

Grabowski

Golani

Cardoso

(2014) Impacts of invasive alien marine species on ecosystem services and biodiversity: a pan-European review. Aquatic Invasions 9: 391–423. https://doi.org/10.3391/ai.2014.9.4.01

Kenis

Bacher

Baker

RHA

Branquart

Brunel

Holt

Hulme

MacLeod

Pergl

Petter

Pyšek

Schrader

Sissons

Starfinger

Schaffner

(2012) New protocols to assess the environmental impact of pests in the EPPO decision-support scheme for pest risk analysis. EPPO Bulletin 42: 21–27. https://doi.org/10.1111/j.1365-2338.2012.02527.x

Kenis

Branco

(2010) Impact of alien terrestrial arthropods in Europe. Chapter 5. BIORISK – Biodiversity and Ecosystem Risk Assessment 4. https://doi.org/10.3897/biorisk.4.42

Křivánek

Pyšek

(2006) Predicting invasions by woody species in a temperate zone: a test of three risk assessment schemes in the Czech Republic (Central Europe). Diversity and Distributions 12: 319–327. https://doi.org/10.1111/j.1366-9516.2006.00249.x

Kumschick

Bacher

Evans

Marková

Pergl

Pyšek

Vaes-Petignat

van der Veer

Vilà

Nentwig

(2015a) Comparing impacts of alien plants and animals in Europe using a standard scoring system. Journal of Applied Ecology 52: 552–561. https://doi.org/10.1111/1365-2664.12427

Kumschick

Gaertner

Vilà

Essl

Jeschke

Pyšek

Ricciardi

Bacher

Blackburn

Dick

Evans

Hulme

Kühn

Mrugała

Pergl

Rabitsch

Richardson

Sendek

Winter

(2015b) Ecological Impacts of Alien Species: Quantification, Scope, Caveats, and Recommendations. BioScience 65. https://doi.org/10.1093/biosci/biu193

Lazzaro

Foggi

Ferretti

Brundu

(2016) Priority invasive alien plants in the Tuscan Archipelago (Italy): comparing the EPPO prioritization scheme with the Australian WRA. Biological Invasions 18: 1317–1333. https://doi.org/10.1007/s10530-016-1069-6

Leung

Roura-Pascual

Bacher

Heikkilä

Brotons

Burgman

Dehnen-Schmutz

Essl

Hulme

Richardson

Sol

Vilà

(2012) TEASIng apart alien species risk assessments: a framework for best practices. Ecology Letters 15: 1475–1493. https://doi.org/10.1111/ele.12003

Matthews

van der Velde

Collas

FPL

de Hoop

Koopman

Hendriks

Leuven

RSEW

(2017) Inconsistencies in the risk classification of alien species and implications for risk assessment in the European Union. Ecosphere 8: 1–17. https://doi.org/10.1002/ecs2.1832

McGeoch

Genovesi

Bellingham

Costello

McGrannachan

Sheppard

(2016) Prioritizing species, pathways, and sites to achieve conservation targets for biological invasion. Biological Invasions 18: 299–314. https://doi.org/10.1007/s10530-015-1013-1

Molnar

Gamboa

Revenga

Spalding

(2008) Assessing the global threat of invasive species to marine biodiversity. Frontiers in Ecology and the Environment 6: 485–492. https://doi.org/10.1890/070064

Morgan

(2014) Use (and abuse) of expert elicitation in support of decision making for public policy. Proceedings of the National Academy of Sciences 111: 7176–7184. https://doi.org/10.1073/pnas.1319946111

Mukherjee

Hugé

Sutherland

McNeill

Van Opstal

Dahdouh-Guebas

Koedam

(2015) The Delphi technique in ecology and biological conservation: applications and guidelines. Methods in Ecology and Evolution 6: 1097–1109. https://doi.org/10.1111/2041-210X.12387

Mumford

Booy

Baker

RHA

Rees

Copp

Black

Holt

Leach

Hartley

(2010) Invasive non-native species risk assessment in Great Britain. Aspects of Applied Biology, 49–54.

Narščius

Olenin

Zaiko

Minchin

(2012) Biological invasion impact assessment system: From idea to implementation. Ecological Informatics 7: 46–51. https://doi.org/10.1016/j.ecoinf.2011.11.003

Nentwig

Bacher

Pyšek

Vilà

Kumschick

(2016) The generic impact scoring system (GISS): a standardized tool to quantify the impacts of alien species. Environmental Monitoring and Assessment 188: 315. https://doi.org/10.1007/s10661-016-5321-4

Nentwig

Kühnel

Bacher

(2010) A Generic Impact-Scoring System Applied to Alien Mammals in Europe. Conservation Biology 24: 302–311. https://doi.org/10.1111/j.1523-1739.2009.01289.x

O’Hagan

Buck

Daneshkhah

Eiser

Garthwaite

Jenkinson

Oakley

Rakow

(2006) Uncertain Judgements: Eliciting Experts’ Probabilities. John Wiley & Sons, Chichester, 340 pp. https://doi.org/10.1002/0470033312

Olenin

Minchin

Daunys

(2007) Assessment of biopollution in aquatic ecosystems. Marine Pollution Bulletin 55: 379–394. https://doi.org/10.1016/j.marpolbul.2007.01.010

Panov

Alexandrov

Arbačiauskas

Binimelis

Copp

Grabowski

Lucy

Leuven

Nehring

Paunović

(2009) Assessing the risks of aquatic species invasions via European inland waterways: from concepts to environmental indicators. Integrated Environmental Assessment and Management 5: 110–126. https://doi.org/10.1897/IEAM_2008-034.1

Pheloung

Williams

Halloy

(1999) A weed risk assessment model for use as a biosecurity tool evaluating plant introductions. Journal of Environmental Management 57: 239–251. https://doi.org/10.1006/jema.1999.0297

R Core Team (2017) R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna.

Richards

(2008) Dealing with overdispersed count data in applied ecology. Journal of Applied Ecology 45: 218–227. https://doi.org/10.1111/j.1365-2664.2007.01377.x

Roy

Rabitsch

Scalera

Stewart

Gallardo

Genovesi

Essl

Adriaens

Bacher

Booy

Branquart

Brunel

Copp

Dean

D’hondt

Josefsson

Kenis

Kettunen

Linnamagi

Lucy

Martinou

Moore

Nentwig

Nieto

Pergl

Peyton

Roques

Schindler

Schönrogge

Solarz

Stebbing

Trichkova

Vanderhoeven

van Valkenburg

Zenetos

(2018) Developing a framework of minimum standards for the risk assessment of alien species. Journal of Applied Ecology 55: 526–538. https://doi.org/10.1111/1365-2664.13025

Sandvik

Sæther

B-E

Holmern

Tufto

Engen

Roy

(2013) Generic ecological impact assessments of alien species in Norway: a semi-quantitative set of criteria. Biodiversity and Conservation 22: 37–62. https://doi.org/10.1007/s10531-012-0394-z

Schielzeth

(2010) Simple means to improve the interpretability of regression coefficients. Methods in Ecology and Evolution 1: 103–113. https://doi.org/10.1111/j.2041-210X.2010.00012.x

Schrader

MacLeod

Mittinty

Brunel

Kaminski

Kehlenbeck

Petter

Baker

(2010) Enhancements of pest risk analysis techniques. EPPO Bulletin 40: 107–120. https://doi.org/10.1111/j.1365-2338.2009.02360.x

Schrader

MacLeod

Petter

Baker

RHA

Brunel

Holt

Leach

Mumford

(2012) Consistency in pest risk analysis – how can it be achieved and what are the benefits? EPPO Bulletin 42: 3–12. https://doi.org/10.1111/j.1365-2338.2012.02547.x

Snyder

Mandrak

Niblock

Cudmore

(2013) Developing a Screening Level Risk Assessment Prioritization Protocol for Aquatic Non-Indigenous Species in Canada: Review of Existing Protocols. Fisheries and Oceans Canada, Canadian Science Advisory Secretariat: Research Document 97: 1–82.

Sutherland

Burgman

(2015) Policy advice: Use experts wisely. Nature 526: 317–318. https://doi.org/10.1038/526317a

Tricarico

Vilizzi

Gherardi

Copp

(2010) Calibration of FI-ISK, an Invasiveness Screening Tool for Nonnative Freshwater Invertebrates. Risk Analysis 30: 285–292. https://doi.org/10.1111/j.1539-6924.2009.01255.x

Turbé

Strubbe

Mori

Carrete

Chiron

Clergeau

González-Moreno

Le Louarn

Luna

Menchetti

Nentwig

Pârâu

Postigo

J-L

Rabitsch

Senar

Tollington

Vanderhoeven

Weiserbs

Shwartz

(2017) Assessing the assessments: evaluation of four impact assessment protocols for invasive alien species. Diversity and Distributions 23(3): 297–307. https://doi.org/10.1111/ddi.12528

Vanderhoeven

Branquart

Casaer

D’hondt

Hulme

Shwartz

Strubbe

Turbé

Verreycken

Adriaens

(2017) Beyond protocols: improving the reliability of expert-based risk analysis underpinning invasive species policies. Biological Invasions 19(9): 2507–2517. https://doi.org/10.1007/s10530-017-1434-0

Vilà

Basnou

Pyšek

Josefsson

Genovesi

Gollasch

Nentwig

Olenin

Roques

Roy

Hulme

(2010) How well do we understand the impacts of alien species on ecosystem services? A pan-European, cross-taxa assessment. Frontiers in Ecology and the Environment 8: 135–144. https://doi.org/10.1890/080083

Vilà

Espinar

Hejda

Hulme

Jarošík

Maron

Pergl

Schaffner

Sun

Pyšek

(2011) Ecological impacts of invasive alien plants: a meta-analysis of their effects on species, communities and ecosystems. Ecology Letters 14: 702–708. https://doi.org/10.1111/j.1461-0248.2011.01628.x

Vilà

Gallardo

Preda

García-Berthou

Essl

Kenis

Roy

González-Moreno

(2019) A review of impact assessment protocols of non-native plants. Biological Invasions 21: 709–723. https://doi.org/10.1007/s10530-018-1872-3

Vilà

Hulme

(2017) Impact of Biological Invasions on Ecosystem Services. Springer International Publishing, Cham, 354 pp.

Zaiko

Lehtiniemi

Narščius

Olenin

(2011) Assessment of bioinvasion impacts on a regional scale: a comparative approach. Biological Invasions 13: 1739–1765. https://doi.org/10.1007/s10530-010-9928-z

Supplementary materials

10.3897/neobiota.44.31650.suppl1

2633599

5C55B3AD-7FBD-592C-A7A9-E0CD67BDE2B9

Supplementary material 1

Supplementary materials

Data type

statistical data

Explanation note

Figure S1: hierarchical cluster of the species scores for the six protocols common to all taxonomic groups. Figure S2: hierarchical cluster of the species scores for plants and aquatic animals without correcting for sample size bias. Table S1: list of non-native species. Table S2: correlation analyses.

https://binary.pensoft.net/file/278660

This dataset is made available under the Open Database License (http://opendatacommons.org/licenses/odbl/1.0/). The Open Database License (ODbL) is a license agreement intended to allow users to freely share, modify, and use this Dataset while maintaining this same freedom for others, provided that the original source and author(s) are credited.

Pablo González-Moreno et al.

10.3897/neobiota.44.31650.suppl2

2633603

B1B4126D-4EF8-506A-A327-B5FF89F2ED6E

Supplementary material 2

Supplementary materials

Data type

Spreadsheet template

Explanation note

Spreadsheet template for completing the 11 impact assessment protocols for non-native species considered in the study.

https://binary.pensoft.net/file/278661

Pablo González-Moreno et al.