Ambrosia artemisiifolia control in agricultural areas: effect of grassland seeding... 1 Ambrosia artemisiifolia control in agricultural areas: effect of grassland seeding and herbivory by the exotic leaf beetle Ophraella communa

Ambrosia artemisiifolia (common ragweed) is an invasive species native to North America and was accidentally introduced to Europe in the 19th century. Widespread in disturbed habitats, it is a major weed in spring-sown crops and it causes serious allergic rhinitis and asthma due to its allergenic pollen. The aim of this research was to analyse the effects of both competitive vegetation and herbivory by Ophraella communa to control A. artemisiifolia in an agricultural area of north-western Italy. Hayseed mixtures, both over-seeded over the resident plant community or after ploughing, when seeded before the winter season, were able to suppress the establishment of A. artemisiifolia as well as to reduce its growth in terms of plant height and inflorescence size. Defoliation of A. artemisiifolia by O. communa at the end of the growing season was conspicuous but most of the plants still produced flowers and seeds. However, significant O. communa attack was recorded for reproductive structures. As for non-target species, O. communa was mainly recorded on Asteraceae, with low density and low degree of damage. Reduction of inflorescence size due to competitive vegetation and damage to male flowers by O. communa may diminish the amount of available pollen. The results of this study may be useful for the implementation of management measures to control A. artemisiifolia in agricultural areas using mixtures of native species.


Introduction
The introduction of Invasive Alien Species (IAS) in a new region has multi-scale impacts on ecosystems and socio-economic implications for resident human communities (Branco et al. 2015, Early et al. 2016. For these reasons, multi-disciplinary approaches to study the consequences of IAS as well as implementing sustainable management options are required, with the final goal being to accomplish successful control measures and/or eradication (Harker and O'Donovan 2013). In Europe, Ambrosia artemisiifolia L. (common ragweed) is considered an extremely dangerous IAS, due to both its allergenic pollen that causes serious human diseases such as rhinitis and asthma ) and its impact on crops yields that decrease when ragweed is abundant . The species, native to North America, was accidentally introduced into the wild in Europe around the middle of the 19 th century  probably through contaminated seed stocks (birdseeds, corn, grain etc.; Brandes and Nitzsche 2006) and, from then, it spread exponentially in several countries (Chauvel et al. 2006) and other continents (Montagnani et al. 2017). It is expected that the species will expand its range further due to its great dispersal ability and favoured by global warming (Cunze et al. 2013, Chapman et al. 2014, Leiblein-Wild et al. 2016, Skálová et al. 2017. Exchanges of contaminated crop seeds still represent an important vector for diffusion  but, despite the absence of specialised dispersal structures, A. artemisiifolia seeds are also spread by water (river flooding; Fumanal et al. 2007), animals and human activities (Chauvel et al. 2006, Vitalos and Karrer 2009, Von der Lippe et al. 2013, Montagnani et al. 2017. As an annual pioneer species, it colonises disturbed habitats, such as river corridors, roadside verges, ruderal and agricultural areas (Chauvel et al. 2006, Müller-Schärer et al. 2014, Gentili et al. 2015. To date, several control measures have been tested to promote its eradication, including chemical, physical and biological techniques. However, mowing and herbicides are still the most applied methods in agroecosystems (Buttenschøn et al. 2009, Milakovic et al. 2014. Recently, in the framework of the EU-project SMARTER (EU COST action FA1203: Sustainable management of Ambrosia artemisiifolia in Europe, http://www. ragweed.eu), a multi-disciplinary team of researchers is performing several studies directed at creating innovative measures to control the species, according to different methods: mechanical, chemical, biological and ecological (Müller-Schärer et al. 2014, Bonini et al. 2017. With regard to ecological methods, "competitive vegetation" created from species-rich seed mixtures and vegetation succession are considered promising approaches to suppress the species' growth (Gentili et al. 2015(Gentili et al. , 2017b, while "biological control" through the insect Ophraella communa LeSage 1986 (Coleoptera: Chrysomelidae) is currently under evaluation (Müller-Schärer et al. 2014, Lommen et al. 2017a, Bonini et al. 2017, Sun et al. 2017. The insect, whose presence was recently reported for Europe (Müller-Schärer et al. 2014), is used as a successful biological control agent in China, together with Epiblema strenuana (Zhou et al. 2014).
As for competitive vegetation, A. artemisiifolia is well known for being able to rapidly occupy empty niches across its invasion range; particularly, being an aggres-sive early coloniser of open disturbed habitats (Gentili et al. 2015) and abandoned crop fields (Maryushkina 1991), it takes advantage of the "priority effect" (Dickson et al. 2012) which allows it to better outcompete other species and, thus, inhibits their establishment and growth (Young et al. 2001, Ortmans 2016. In turn, enforcing competitive vegetation, seeding both native (hayseed) and commercial seed stocks, has been demonstrated to inhibit ragweed germination and growth, to increase biodiversity (hayseed) and to be effective in recovering ruderal habitats (Gentili et al. 2015). Such a method has yet to be tested in agricultural and/or protected areas, contexts where methodological issues, such as how to schedule periods of seeding, are important in order to understand the way of limiting its priority effect advantages and consequently maximising the controlling effect on A. artemisiifolia. Inevitably the seeding strategy (i.e. restoration) is habitat-dependent due to local environmental and biotic filters and needs to be calibrated on recipient environmental types as its effectiveness is strictly connected to local ecosystem conditions (Funk et al. 2008).
With regard to biological control, O. communa is a multi-voltine leaf beetle originally from North America (Futuyma and McCafferty 1990). It was first recorded in Europe in 2013, when the species was found in northern Italy (Lombardy, Piedmont and Emilia-Romagna regions) and in southern Switzerland (Ticino Canton; Bosio et al. 2014, Müller-Schärer et al. 2014). Due to its high dispersal ability (potentially up to 329 km/year;Yamamura et al. 2007), it was expected to rapidly expand its European range and, by 2013, was already covering an area of 20000 km 2 (Müller-Schärer et al. 2014, Lommen et al. 2017b. O. communa larvae and adults preferentially feed on A. artemisiifolia and they can completely defoliate the plant to death prior to seed production when the initial density is high enough (Guo et al. 2011, Zhou et al. 2014. Also in Italy and Switzerland, O. communa was observed to reach densities high enough to kill A. artemisiifolia plants before flowering (Müller-Schärer et al. 2014) and beetles were seen to cause damage to male flowers, with negative effects on pollen production . The recent spread of the insect is considered to be the potential explanation for the low levels of A. artemisiifolia pollen in the Milano area recorded during 2013 and 2014 (Bonini et al. 2015.
One of the greatest concerns when choosing a biocontroller is clearly connected to the risk for non-target species to be attacked by the agent (Louda et al. 2003). As for O. communa, the insect was reported on plant species different from A. artemisiifolia, such as other ragweed taxa and relatives mainly belonging to the tribe of Heliantheae (Tamura et al. 2004, Watanabe and Hirai 2004, Yamanaka et al. 2007, Cao et al. 2011, Müller-Schärer et al. 2014. Risk assessments of O. communa's attack on the cultivated sunflower H. annuus gave controversial results. Palmer and Goeden (1991) rejected the beetle as a biocontroller for Australia because, in laboratory tests, it can complete its life cycle on sunflower while, recently, the possibility of an Ophraella's attack on H. annuus in the field was considered negligible (Dernovici et al. 2006, Cao et al. 2011, Zhou et al. 2011. Most of the studies on the potential ability of O. communa to choose new host plants in introduced areas were conducted on weeds or species of commercial interest (Palmer and Goeden 1991, Watanabe and Hirai 2004, Dernovici et al. 2006, Cao et al. 2011, Zhou et al. 2011, Lommen et al. 2017b. The risk to native flora was rarely taken into account. Taking advantage of the recent accidental introduction of O. communa in Italy, the present work analyses the effects and modus operandi of both competitive vegetation and herbivory by O. communa, as well as their possible additive or divergent effect, on the management and control of A. artemisiifolia in an agricultural protected area. Specifically, the aims of this study were to: a) assess the effectiveness of seeding competitive native vegetation to control A. artemisiifolia; in particular, to assess the priority effect advantages by testing two different seeding periods; b) evaluate the damage to A. artemisiifolia caused by natural population of O. communa in an area of the European range where both the plant and the insect are present at very high densities; c) detect the presence of O. communa on resident non-targets species (i.e. species other than A. artemisiifolia) and its damage to these plants, in order to assess if potential future use of the beetle as a biological agent could contrast with seeding competitive vegetation.

Study area and experimental design
The study was carried out in the "Alto Milanese" Park (359 ha), a protected area of local interest sited in northern Italy, approximately 28 km north-western from the city of Milan (45°35'38.20"N, 8°51'52.61"E). The park is located in one of the most invaded areas by A. artemisiifolia (Gentili et al. 2015) and its surface is mainly covered by cropped fields (60.2 %; Parco Alto Milanese 2007). Woodlands (17 %), mostly dominated by Prunus serotina Ehrh. and Robinia pseudoacacia L., fallow fields (1.6 %) and hedgerows (3.8 %) are also present. In 2014, three sites with comparable soil properties and a seed bank of A. artemisiifolia (Suppl. material 1: Table S1, Figure S1), were selected inside the park: (1) a short-rotation clover field (X: 45°35'42.37"N, 8°51'52.99"E), (2) an oat field (Y: 45°35'54"N, 8°52'9"E) and (3) a short-rotation meadow (Z: 45°35'37"N, 8°52'14"E). Each site contained three squared plots of 100 m 2 , separated by 1 m buffer, for a total number of 9 plots. In each site, the following treatments were set up: (a) Control -not seeded (C): the plot was harrowed and ploughed no deeper than 15 cm and then left to spontaneous vegetation colonisation, without sowing any herb layer; (b) Hayseed (Hs): the plot was harrowed and ploughed no deeper than 15 cm and then seeded with hayseed at a density of about 20 g/m 2 . In June 2013, a mowed mesophilous grassland dominated by Arrenatherum elatius (L.) P. Beauv. ex J. & C. Presl close to the study area was selected as a donor grassland for hayseed collection. The most frequent species of the mixture besides A. elatius were: Achillea millefolium L., Centaurea nigrescens Willd., Trifolium pratense L. and T. repens L. Once dried, hayseed was prepared in accordance with the protocols of the Native Flora Centre of the Lombardy Region (Ceriani et al. 2011); (c) Over-seeding hayseed (Ov): the plot was only superficially harrowed and over-seeded with hayseed at a density of about 20 g/m 2 .
In 2014, these treatments were applied in March (late seeding), then the experiment was repeated in the same sites during 2015, the only difference being that soil was prepared and hayseed sown in October 2014 (early seeding).
The proposed experimental approach is quite different from that published in Gentili et al. (2017b) especially: a) the current work was done in protected arable areas, suffering from the expansion of A. artemisiifolia; b) different seeding periods were applied in 2014 and 2015; c) different techniques for soil treatment were used.

Vegetation
In 2014 and 2015, vegetation data were collected in three 2 m × 2 m quadrats randomly chosen within each plot, at least 1 m from the edge. The following parameters were measured in June for the vegetation cover other than A. artemisiifolia and in September for the weed abundance and traits (on 30 randomly selected plants, when present) (Gentili et al. 2015): (a) vegetation cover: percentage vegetation cover other than A. artemisiifolia, visually estimated; (b) species abundance: number of individuals of A. artemisiifolia; (c) vegetative traits: plant height (cm), measured from the plant collar to the apex; plant width (cm), measured as the maximum width of an individual; maximum leaf length (cm), measured from the petiole to the leaf apex; (d) reproductive traits (i.e. pollen production proxies): maximum size of male composite inflorescence, i.e. spike (mm); total number of male inflorescences.
In order to assess the effect of hayseed cover on soil temperature at the beginning of the vegetative period, two dataloggers (model TransitempII, Magditech) were placed in control and hayseed treatments at site Z during April 2015, corresponding to the germination of A. artemisiifolia. Dataloggers were buried at a depth of 10 cm and the temperature was measured daily, with an interval of 30 minutes.

O. communa presence and damage to A. artemisiifolia
In mid-September 2015, when damage caused by O. communa to A. artemisiifolia is usually at its maximum (Miyatake andOhno 2010, Fukano et al. 2013) and the weed is at the end of its growing season (MacKay and Kotanen 2008, MacDonald andKotanen 2010a), the following data were recorded for 25 plants (when present) in each plot: (a) if individuals were mature, i.e had raceme longer than 1 cm or had female structures or seeds formed (http://www.ragweed.eu); (b) the number of individuals of O. communa in every life stage (i.e. egg batches, larvae, pupae and adults); (c) the damage, visually assessed and expressed as a percentage of missing tissue, caused by O. communa, separately for leaves, stems, reproductive structures and for the whole plant.

O. communa presence and damage to non-target species
In each site, during the summer of 2015, O. communa presence on non-target plants was monitored in an area of about 600 m 2 that included the plots and the surrounding vegetation. Non-target species were selected on the basis of the hayseed composition (i.e. most frequent species) and of floristic surveys of common plants in the area and included genus and species belonging to six different families: (1)  From early June to the end of September 2015, the following data were recorded fortnightly for 5 individuals of each non-target species (when present) in every sites: (a) phenological stage, i.e. vegetative, flowering or seedling; (b) presence/absence of O. communa and the number of individuals in every life stage (i.e. egg batches, larvae, pupae and adults); (c) when O. communa was present, the damage, visually assessed and expressed as a percentage of missing tissue, potentially caused by the beetle, separately for leaves, stems, reproductive structures and for the whole plant. Damage was evaluated only when the insect was seen on the plant to minimise the possibility of mistakenly assigning to O. communa a feeding event due to other herbivores. Moreover, in order to increase the probability of O. communa encounter on non-target species, in each session, plants were randomly chosen for observation so that the same individuals were rarely sampled.
In order to have comparison data of the beetle presence and attack on its primary host throughout the season, 10 A. artemisiifolia plants were contemporarily monitored in each site, with the same method described above for the non-target species. In every session, the beetle density (i.e. number of egg batches, larvae, pupae and adults) was also estimated in 11 quadrats of 1 m 2 homogeneously distributed inside the 600 m 2 area.

Vegetation
Prior to any statistical analysis, vegetation data, collected in the three quadrats, were pooled for each plot. Differences in vegetation cover of species other than A. artemisiifolia and in vegetative and reproductive traits of A. artemisiifolia plants in different treatments and years were tested by Linear Mixed Effects models (LME). Treatment and year were fitted as interacting fixed factors, while the site was fitted as a random effect. When necessary, the data were log-transformed to normalisation. The difference in number of A. artemisiifolia individuals between treatments and years was tested using a negative binomial Generalised Linear Mixed Models (GLMM) model (to correct for over-dispersion) with the same structure of LMEs described above.
Soil temperature was compared between control and hayseed treatments by means of a t test.

O. communa on A. artemisiifolia and non-target species
Differences in damage to leaf and reproductive structures of A. artemisiifolia caused by O. communa and in the number of adult beetles per plants between treatments were tested by GLMMs, with the treatment as fixed factor and the site as random effect. Data on damage was arcsin-transformed [Y = asin(√(0.01*y))]) and modelled with a Gaussian distribution, while data on density was modelled with a negative binomial distribution to correct for over-dispersion. The difference in A. artemisiifolia height related to leaf damage caused by O. communa was tested only in control plots, where the weed growth was not influenced by hayseed competition and only scarcely influenced by spontaneous vegetation: a LME model was constructed, with leaf damage as continuous fixed factor and the site as categorical random effect.
Data on O. communa presence and damage on non-target species were analysed only qualitatively and cumulated throughout the season, due to low number of records of the beetle on species other than A. artemisiifolia. Moreover, data were cumulated over the three sites, as O. communa density was similar during the study period (mean ind/plant: site X = 14.4, site Y = 13.6, site Z = 15.8; X 2 = 0.5, DF = 2, NS).
All statistical analyses were performed using R version 3.3.2 (R Core Team 2016) and post-hoc tests by means of the packages "lsmeans" (Lenth 2016) and "multcomp" (Hothorn et al. 2008).
In 2014, following the "late seeding", the number of individuals of A. artemisiifolia did not show any differences amongst treatments (Figure 1b). On the contrary, in 2015 (early seeding), C exhibited a higher number of individuals than both Ov and Hs (z C vs Ov = 5.41, p < 0.001; z C vs Hs = 5.23, p < 0.001). Comparing the two observation years, the number of A. artemisiifolia individuals greatly decreased in Ov (z = 4.76; p < 0.001) and Hs (z = 4.53, p < 0.001).

A. artemisiifolia traits
With regard to plant height, in 2014, the species showed quite a large size, reaching more than 1 m for the mean in all treatments that differed significantly from each other (Figure 1c): the highest individuals were found in Hs while the shortest ones were in Ov (t C vs Ov = 3.98, p < 0.001; t C vs Hs = -3.31, p < 0.003; t Ov vs Hs = -7.29, p < 0.001). In 2015, C exhibited a significantly higher size than both Ov (t = -12.85, p < 0.001) and Hs (t = 15.9, p < 0.001). Comparing the treatment trends in the years 2014 and 2015, they exhibited a strong reduction in plant height in the second year (C: t 2014 vs 2015 = 18, p <0.001; Ov: t 2014 vs 2015 = 11.94, p <0.001; Hs: t 2014 vs 2015 = 20.037, p <0.001). This reduction was particularly marked in Hs where the plant height diminished to 94.1 ± 4.7 cm.
Regarding the other collected plant traits (plant width, maximum leaf length and number of male inflorescences), similar trends to those of inflorescence size were observed. In particular, in 2015, C differed from Ov and Hs, while comparing the same traits in 2014 and 2015, reductions in size and number was recorded (see Suppl. material 1: Figure S2).

O. communa presence and damage to A. artemisiifolia
Overall, 192 plants of A. artemisiifolia were observed: 75, 75 and 42 individuals in C, Ov and Hs treatments, respectively. The lower number of plants monitored in Hs was due to the low density of the weed in those plots (see paragraph "Vegetation cover and A. artemisiifolia abundance"). Of the sampled individuals of A. artemisiifolia in mid-September, 94.8% were mature, i.e. with reproductive structures formed.
In total, 3267 O. communa were found on A. artemisiifolia plants; most of all were adults (76.3 % vs 17.8 % larvae, 3 % pupae, 2.9 % egg batches). All plants except one (in Ov treatment) were attacked by the insect, reporting damage on about 72 % of the whole tissues (min.: 2 %, max.: 99 %). Of the attacked plants, all had conspicuous damage on leaves and 90 % on reproductive structures (i.e. male inflorescences and seeds; Table 1). Damage on the stem was reported for 30.7 % of the plants but was not influential as it concerned, on average, only 4 % of the tissues.
The number of adults per plants was significantly lower in Hs plots with respect to C and Ov plots and also in Ov plots with respect to those in C (z Hs vs C = -11.54, p < 0.001; z Hs vs Ov = -7.83, p < 0.001; z Ov vs C = -5.37, p < 0.001; Table 1). However, damage on A. artemisiifolia leaves in the three treatments was similar, while damage on reproductive structures in C and Hs plots was higher than the damage in Ov (z C vs Ov = 3.96, p < 0.001;  Table S2; no increasing trend in insect number on non-target species was observed throughout the summer. On 4 non-target species (Lolium sp. pl., Papaver rhoeas, Persicaria maculosa, Polygonum sp. pl.), no O. communa was detected.
On 6 of the 9 species where O. communa was present, damage was observed (Table 2). For each species, less than 7 % of monitored individuals were attacked throughout the season and only on Centaurea sp. pl. the mean damage was higher than 10 % (Table 2). Damage on non-target species was recorded only on leaves.

Vegetation
This study ascertained the effectiveness of seeding competitive vegetation from native species mixture of hayseed, both over-seeded over the resident plant community or after ploughing, in controlling A. artemisiifolia in an agricultural area. Particularly, it confirmed the trend observed in a ruderal quarry habitat (Gentili et al. 2015(Gentili et al. , 2017b, that competitive vegetation was able to suppress the establishment and growth of A. artemisiifolia as well as possibly contributing to reduce the soil seed bank of the species and its pollen production (i.e. lower number of plants and reduced inflorescences). The competitive ability of A. artemisiifolia in continuously disturbed sites such as agricultural areas has been recognised as being a key factor in promoting its invasion success (Kazinczi et al. 2008, Bullock et al. 2012. In these results, a high vegetation cover in seeded areas, holding an increased number of competitor species filling vacant niches, controlled or suppressed the species. This could be due, from one side, by limiting available resources (light and nutrients) and on the other, by modifying local environmental conditions (Katz et al. 2014). For instance, soil temperature was lower in highly vegetated treatments (i.e. hayseed): lower temperature may delay germination of A. artemisiifolia and enable the competitors to grow more and before the weed. In fact, the recent study of Skálová et al. (2015) highlighted that temperature is a main determinant of A. artemisiifolia distribution, especially influencing the species' germination (Leiblein-Wild et al. 2014). For all such reasons, the performance of A. artemisiifolia was poorer where vegetation cover was higher. Even if the percentage cover of other species was also relatively high in control plots in 2015, they showed a very limited growth in height under the canopy of A. artemisiifolia.
In addition, testing in the field two different seeding periods (i.e. an early and a late seeding period), allowed the verification of different priority effect advantages for this invasive species. Seeding the hayseed after the winter season (late seeding) allows the earlier development of A. artemisiifolia, as it gives it a competitive advantage with a temporal priority effect. This advantage may lead to a different community structure dominated by A. artemisiifolia. Indeed, when it starts to grow simultaneously, the weed is able to growth rapidly and out-compete native species. This kind of performance was also observed by Ortmans (2016) in a greenhouse experiment. On the other hand, anticipating the seeding of hayseed before the winter season (early seeding), native species start to grow before the weed, eliminating or strongly reducing the temporal priority effect for A. artemisiifolia and shifting to the native species assemblage (i.e. hayseed).
Priority effects of alien species have been previously investigated in several plant communities in the context of habitat restoration and control of alien species, with the final aim being to encourage the competitive effect of native species over invasive ones (Vaughn and Young 2015). Studying the role of priority in shaping community composition can address management activities and the choice of native plant assemblages able to inhibit invasive plant species (Zefferman 2015).

O. communa on A. artemisiifolia
A high number of O. communa heavily feeding on both leaves and reproductive structures of A. artemisiifolia was observed in the study area, as already reported for other sites in northern Italy , Müller-Schärer et al. 2014. Damage on leaves involved, on average, 73 % of the tissues and caused a significant reduction in plant size, but about 95 % of A. artemisiifolia still had flowers and/or seeds. This was not unexpected as defoliation by herbivores is found to reduce plant height and the number of branches (Guo et al. 2011) but not to affect the ability of the weed to fructify (MacDonald and Kotanen 2010b). However, 90 % of sampled plants reported damage on around 40-50 % of the reproductive structures tissue (i.e. male inflorescences and seeds). Damage to racemes is of great importance in terms of biocontrol because it may reduce the amount of available pollen thus benefiting the allergic human population (Bonini et al. 2015. With regard to the treatments, O. communa density was very variable, ranging from an average of 25.2 adults per plants in control plots to 0.6 adults per plants in hayseed plots. Despite this difference, damage to leaves was similar in C, Ov and Hs; damage to reproductive structures was also comparable and quite high, even if it was lower in Ov than in the other two treatments. This likely indicates an ability of the insect to move between A. artemisiifolia patches and find its primary host even when the plant is at very low density. O. communa is considered to have high dispersal ability (Yamamura et al. 2007). Tanaka and Yamanaka (2009) estimated that it can potentially fly a distance of 25.4 km in 23 hours. When the beetle finds a new A. artemisiifolia plant, it is able to severely damage it; if the attack is massive, the beetle and its next generations are obliged to move again to search for other plants, both for feeding and reproduction (Yamazaki et al. 2000, Tanaka andYamanaka 2009). These results highlighted that, during summer 2015, the O. communa density was sufficiently high to force the species to move also to Ov and Hs plots, where A. artemisiifolia presence was low; there, the damage caused, combined with low availability of other primary hosts in the immediate vicinity, likely led the beetle to leave these patches, justifying the low number of insects per plants observed in these plots at the end of the season.
Despite the apparent overall high number of beetles and the conspicuous damage caused to leaves and reproductive structures, in the study area O. communa did not naturally reach the minimum density crucial for the suppression of A. artemisiifolia population in the short term and parts of the plant which were able to produce seeds survived, even if climate during summer 2015 was favourable for the beetle development. The mean temperature during daylight was between 25-30 °C (June: 24.7 °C, July: 29.8 °C, August: 25.2 °C; U.O. Meteoclimatologia 2017), which is suggested as an optimum range for O. communa population growth (Zhou et al. 2010). In fact, studies conducted in China, where O. communa is used as a successful biocontroller of A. artemisiifolia, demonstrated that the beetle effectiveness is highly density-dependent (Guo et al. 2011, Zhou et al. 2014. Moreover, the number of beetles should be higher with increasing plant height. For example, Guo et al. (2011) reported that ≥ 1.07 and ≥ 12 adults per plant, at early and late growth stage, respectively, should be released to cause the complete defoliation of the weed and its death prior to fructification. Gard et al. (2013) and Zhou et al. (2014) also suggested that biocontrol of A. artemisiifolia should include various specialised enemies, whose joint combination that weaken different parts of the plant (e.g. roots, leaves, flowers) could prevent the reallocation of its resources on undamaged structures. Despite determining the ability of O. communa to cause damage to male flowers with potential reduction in pollen release, more investigations are certainly needed to understand if, in Italy, the natural density of the beetles, even if not able to suppress A. artemisiifolia population in the short term, will be able to decrease seed production and, consequently, reduce the weed abundance in the medium-long term or if inundative releases, maybe in combination with other agents, are necessary in the case the insect is elected as a suitable biocontroller.

O. communa on non-target species
The risk for non-target species was potentially high in the area which was monitored. The number of O. communa was conspicuous and, at the end of the season, A. artemisiifolia defoliation was relevant; therefore, there were suitable conditions for movement of the beetle to other hosts. However, O. communa was recorded only on 8.5% of observed non-target plants. Beetle detection started from the first sampling session, in early June, but no trend was observed throughout the summer, neither was there an increase in September, when food shortage caused by A. artemisiifolia exploitation could have forced migration to other species.
Greater incidence of O. communa was recorded on species belonging to the family of Asteraceae containing relatives of A. artemisiifolia (Achillea millefolium, Artemisia verlotiorum, Centaurea sp. pl., Erigeron annuus), but also on Chenopodium album (Chenopodiaceae) and Trifolium sp. pl. (Fabaceae). Similar results were obtained in other studies where the insect, when reported on plant species different from A. artemisiifolia, was present on relatives of the weed (e.g. A. trifida L. and A. psilostachya DC., A. cumanensis Kunth, Xanthium sp. pl., Heliantus sp pl., Iva sp. pl. and Parthenium sp. pl.; Palmer and Goeden 1991, Tamura et al. 2004, Watanabe and Hirai 2004, Yamanaka et al. 2007, Cao et al. 2011). On the other 7 plant species sampled, no beetle was detected (Lolium sp. pl., Papaver rhoeas, Persicaria maculosa, Polygonum sp. pl.) or only adults were present with no trace of feeding (Arrhenatherum elatius, Holcus lanatus, Sorghum halepense), suggesting casual wandering of the insect (Yamazaki et al. 2000). When damage was present, it was on a very low percentage of individuals (never higher than 7%), on average, on no more than 6% of the tissues and always located on leaves, even if around 50% of the plants had flowers or seeds. Only on Centaurea sp. pl., the mean damage was higher, due to a few events where attack resulted on around 50 % of the leaf tissues. Particularly interesting is the observation of quite a high number of plants with O. communa presence (23 %) in all life stages on Artemisia verlotiorum, another exotic species close to A. artemisiifolia. Elsewhere in Italy, adult beetles were found on other Asteraceae (X. strumarium, H. tuberosus, Erigeron canadensis L. and Dittricha graveolens (L.) Greuter), some of which were feeding but not causing significant damage to the plants , Müller-Schärer et al. 2014.
A limitation for this study is that O. communa was not directly observed feeding on A. artemisiifolia and this could lead to false positives (Palmer and Goeden 1991). However, it was attempted to contain mistakes by assessing damage only on plants where O. communa was present and where the feeding trace was similar to those left by the beetle. Consequently damage was recorded mainly on Asteraceae, that is a likely result as relatives of A. artemisiifolia have already been reported as alternative sub-optimal hosts of O. communa (Yamazaki et al. 2000, Cao et al. 2011. Moreover, when damage was recorded, O. communa was often found at life stages other than adult, suggesting some kind of use by the insect and not only casual movements. In the end, the overall risk for the non-target species monitored in this study seems small; the density of O. communa and damage on plants resulted as low and can be considered as occasional events. On the contrary, O. communa showed a strong preference for its primary host, A. artemisiifolia; beetle number, percentage of attacked plants and feeding were higher for A. artemisiifolia compared to non-target species. It has already been demonstrated that when A. artemisiifolia is in sufficient number to sustain O. communa population, the beetle prefers to complete its life cycle on its primary host (Yamazaki et al. 2000, Cao et al. 2011. Moreover, Dernovici et al. (2006) has seen that, even if all stages of O. communa can survive on species other than A. artemisiifolia, such as sunflowers, the population collapses within a few generations. However, it cannot be ignored that egg batches, larvae and pupae were also present on non-target plants, suggesting that specific laboratory and field tests on oviposition preference and larval development on non-target species should be conducted to precisely evaluate the risk of an O. communa shift in the case of absence of A. artemisiifolia.

Management implications
This work is one of the first that investigated, with an interdisciplinary approach, the effects of both competitive vegetation and herbivory by O. communa in contrasting the alien invasive species A. artemisiifolia in a protected agricultural, highly invaded, area.
Regarding competitive vegetation, during the implementation of hayseed (or seed mixtures), the key factors for controlling/suppressing the weed will be: (1) the seeding period before the winter season and (2) a gap-free vegetation cover. After adopting competitive vegetation, A. artemisiifolia decreases in abundance and reproductive potential (i.e. inflorescence size) and consequently, its allergenic impact could also be strongly reduced. Further studies will be needed to clarify the long-term effect on seed production and soil seed bank. This method is particularly suitable for agricultural protected areas where the use of herbicide is not allowed or discouraged.
With regard to herbivory, the crushing impact of O. communa on A. artemisiifolia is confirmed: severe damage to reproductive structures (racemes) was observed, probably conditioning the amount of released pollen and the allergenic potential of A. artemisiifolia populations (Bonini et al. 2015. However, O. communa was not able to kill A. artemisiifolia prior to fructification at its natural density in the study area and plants kept on producing inflorescences and seeds, confirming that a minimum number of beetles per plant is necessary for the suppression of A. artemisiifolia population in the short term in the field (Guo et al. 2011, Zhou et al. 2014). Moreover, even if O. communa preferred A. artemisiifolia for feeding and oviposition (Yamazaki et al. 2000, Cao et al. 2011, it was also found on non-target species, mainly belonging to Asteraceae family. The degree of damage was generally low as O. communa tends to move to its primary host to complete its life cycle. Observation of life stages other than adults on some resident plants confirms that, when A. artemisiifolia density is low, O. communa can potentially choose different plant species, suggesting that some attention should be paid to the risk for non-target species. Specific tests are currently being undertaken by the Ophraella task force (EU-project SMARTER) that is intensely evaluating the suitability of the beetle as a biological control in Europe.
Considering the two methods, it can be asserted that competitive vegetation using native flora plants has a small/null impact on ecosystems and it can be almost totally controlled by users. On the other hand, biological agents are often alien to the resident community and they potentially represent a risk for local flora, fauna and agricultural production. As for O. communa, preliminary results of a hazard analysis in France revealed a low risk for agriculture and the environment (Chauvel et al. 2017). Moreover, the beetle has been identified as one of the most promising agents for biocontrol of A. artemisiifolia in Italy, as a great overlap of A. artemisiifolia and O. communa suitable areas in current and future climatic scenarios was predicted (Lommen et al. 2017b, Sun et al. 2017). In addition, it is important to underline that the effect of the bio-agent can be more time and cost-saving than vegetation recovery, potentially ensuring an effective action on a wider area in a shorter time. Due to the possible risk for non-target species, an integrated control applying both of the two techniques should be monitored in the medium-long term as the insect use could conflict with the seeding of native mixture due to its possible attack on other Asteraceae beside A. artemisiifolia. In addition, in agricultural areas, the application of both competitive vegetation and biological control using O. communa are critical due to potential interferences on farming practices and vice versa, beyond issues related to optimising crop yields. However, agricultural areas probably represent the main sources of pollen and propagules of A. artemisiifolia as the species finds suitable conditions to persist due to repeated disturbance. Consequently, according to the current need for an even more sustainable agriculture, low impact solutions respecting alimentary products and environment should be developed.  Table S1. Soil characteristics in the three investigated sites; Table S2. O. communa density on non-target species throughout summer 2015; Figure S1. A. artemisiifolia soil seed bank in the three sites; Figure S2.

Patterns of occurrence of semi-aquatic reptiles in highly invaded Mediterranean rivers Introduction
The Mediterranean region has a high diversity of reptiles, including many endemic species (Sindaco and Jeremčenko 2008). This region has had dense human populations since historical times and its natural (particularly riverine) habitats have been subject to intense impacts (Corbacho et al. 2003;Ferreira et al. 2007). As a consequence of human activities, the populations of several semi-aquatic reptiles are in severe decline (Filippi and Luiselli 2000;Cox et al. 2006). This is attributable to a decline in habitat quality and the increasing presence of alien species that are, in many cases, superior competitors in disturbed environments (Cadi and Joly 2003;Metzger et al. 2009). In this study, the factors potentially influencing the presence of semi-aquatic reptiles in the fluvial systems of Girona (north-eastern Spain) were investigated. The lower reaches of the major rivers of the region have been subject to substantial habitat degradation, associated with watershed regulation and the widespread occurrence of alien species (Saurí et al. 2001;Ordeix et al. 2014). These changes have been associated with the collapse of native biotic communities in the coastal plain rivers (Doadrio 2001;Clavero et al. 2009). In contrast, the headwaters of the rivers are subject to greater seasonality, have been colonised by fewer alien species and constitute important shelters for native fauna (Boix et al. 2010;Maceda-Veiga et al. 2010).
In this region, four species of native semi-aquatic reptiles occur, including two turtles (Emys orbicularis and Mauremys leprosa) and two natricine snakes (Natrix astreptophora and Natrix maura; Pleguezuelos et al. 2002). Three of these species are widespread in the rivers of the region, but E. orbicularis is rare and localised (Mascort 1998). In addition, there is also an alien turtle, Trachemys scripta, which occurs occasionally in the rivers/streams of the region, particularly associated with disturbed parts of deltas and coastal marshes (Martínez-Silvestre et al. 2011). Therefore, interspecific interactions amongst the three species of turtles are likely to be localised, but could be common between the two natricines, which also show some overlap in their diets (Salvador 1998). In other European natricines, these associations can be negative or neutral, depending on the level of trophic overlap (Filippi et al. 1996;Scali 2011).
The widespread presence of alien species in the fluvial systems could influence the occurrence of the native semi-aquatic reptiles. However, it is not known what interactions occur between the native reptiles and alien species, because previous studies have focused only on the effects of alien species on native fish communities (Benejam et al. 2008;Clavero and Hermoso 2011). It is likely that native reptiles and alien species interact negatively, because they compete for similar prey (small fish, amphibians and macro-invertebrates; Doadrio 2001) and some alien fishes could be predators of the reptiles (e.g. Esox lucius, Micropterus salmoides;Lagler 1956;Zavala 1983).
In this study, the impact of alien species on the occurrence of the native reptiles was assessed and I expected that the effect would be negative (hypothesis i). My second objective was to assess whether the two natricines would show non-random associations, which I expect would be negative (hypothesis ii). My third objective was to identify the environmental characteristics of those river stretches that have greater diversity of semi-aquatic reptiles.

Study region
Based on the Köppen climate classification system, most of the study region has a Csa climate type (AEMET 2011) characterised by a warm dry summer. The headwaters of the Muga and Fluvià rivers are located in an area having a Cfa climate type (AEMET 2011), which is characterised by short periods of summer drought. The study involved seven fluvial systems having very distinct hydromorphological characteristics. These included: 1) the riera de Calonge, which is a seasonal stream of 3.5 km in length; 2) the Daró River, which is a seasonal stream of 35 km in length; 3) the Fluvià River (97 km in length and having a flow rate at the mouth of 11 m 3 s −1 ); 4) the Muga River (58 km in length and having a flow rate of 3.3 m 3 s −1 ); 5) the riera de Pedret, which is a seasonal stream of 17 km in length; 6) the Ter River (208 km in length and having a flow rate of 25 m 3 s −1 ); 7) the Tordera river (62 km in length, and having a flow rate of 5.0 m 3 s −1 ). The occurrence of alien species (alien fishes, crayfish) varies significantly amongst the basins of these rivers/streams (Ordeix et al. 2014). For example, alien species only occur at one site in the riera de Calonge (25%), while they were present in 96% of the sites in the Ter River (Suppl. material 1).

Sampling and habitat characterisation
Baited crayfish net traps (60 × 30 cm) were used to detect the presence of reptiles at 261 sites distributed amongst the six fluvial systems. The baiting method is commonly used in this type of study (Gibbons et al. 2006;McDiarmid et al. 2012). One to three traps separated by five metres were placed at each station for a period of 12−16 h. One end of each trap was suspended above the water to enable air breathing by the captured reptiles. The surveys were carried out between April and October 2016, encompassing the period of greatest activity for semi-aquatic reptiles in the region (Salvador 1998). This trapping method of capture is also effective in estimating the presence of fish and crayfishes (Johnson et al. 1992;Harper et al. 2002). The captured fishes were classified as alien species following Doadrio (2001).
The categorisation of riverine habitats was based on seven characteristics described by Pardo et al. (2002), including: 1) embeddedness in riffles and runs and sedimentation in pools; 2) riffle frequency; 3) substrate composition; 4) velocity/depth regime; 5) shading of river bed; 6) heterogeneity components; and 7) aquatic vegetation cover. The assessment of these characteristics enabled estimation of the heterogeneity of the riverine habitats, based on a fluvial habitat index (Pardo et al. 2002). High values of this index are associated with higher native biota diversity in Mediterranean fluvial systems (Pardo et al. 2002;Aparicio et al. 2011).
In addition to the above, the stream intermittency (or stream level) and the forest cover in the surveyed river sections were assessed. Stream levels were measured because this is an important factor affecting the presence of alien species (Clavero and Hermoso 2011). The stream level was categorised as: 1) mild dryness (a small part dry); 2) moderate dryness (a large part dry); and 3) severe dryness (completely dry). Forest cover was assessed because of its effect on the biotic composition of the rivers (Gergel et al. 2002), including semi-aquatic reptiles (Ficetola et al. 2004;Escoriza and Ben Hassine 2017). The forest area was assessed by remote sensing-based habitat characterisation (Tuanmu and Jetz 2014). A range of major physicochemical parameters of waters was also measured (Eklöv et al. 1998;Peltzer and Lajmanovich 2004), including conductivity (µS m -1 ), pH and dissolved oxygen (mg l -1 ). These parameters were measured in situ, using a Crison 524 conductivity meter (for conductivity), an EcoScan ph6 (for pH) and a Hach HQ10 Portable LDO meter (for the dissolved oxygen content). The measurements of these parameters were made at a single time point between 11:00 h and 16:00 h (local time).

Data analyses
Data analysis was focused on investigating the relationships of the reptile occurrence to the various riverine habitat descriptors (i) and on patterns of reptile co-occurrence along the fluvial systems (ii). The presence of reptiles in relation to environmental conditions was visualised using canonical correspondence analysis (CCA) (Ter Braak 1986), which was conducted using the software package PAST 3.0 (Hammer et al. 2013). Significant associations between reptile occurrence and riverine habitat predictors were tested using distance-based linear models (DistLM), by developing a distance matrix using the Sørensen index for presence/absence data (Clarke and Gorley 2006). The significance was assessed following 9999 permutations of residuals under a reduced model (Clarke and Gorley 2006). To assess whether the predictors exerted a positive or negative influence on the dependent variable, XY scatter plots and trend lines were generated (Clarke and Gorley 2006). These analyses were carried out using PRIMER-E (PRIMER-E Ltd., Plymouth).
Co-occurrence patterns were investigated using joint species distribution models (JSDM) (Pollock et al. 2014). This method of analysis enables evaluation of whether the occurrence of a species is influenced by environmental factors and interspecific interactions. The finding of pairwise associations to the gradient superior to species' residual correlations is an indication that environmental filtering probably explained the co-occurrences of species (Pollock et al. 2014). On the other hand, the finding of strong residual correlations and weak associations with the gradient is as an indication that specific interactions probably explained the co-occurrences of species (Pollock et al. 2014). The model was adjusted by running five chains with 100,000 iterations each; the first 10,000 were discarded as burn-in and 10 was used as the factor to thin the post burn-in samples (Pollock et al. 2014). Model convergence was determined by Gelman-Rubin statistic (Gelman and Rubin 1996). These analyses were performed using R2jags (Su and Yajima 2011) and R (R Core Development Team 2017).

Results
During the surveys, four reptile species (E. orbicularis, M. leprosa, N. astreptophora and N. maura) were found, while the alien turtle species that also occurs in the study region was not detected. The presence of E. orbicularis was only observed in a single site (Fig. 1). For this reason, this species was included in the analyses, but not in the discussion of the results. The descriptive statistics for the environmental variables and species-sites are shown in Table 1. The data obtained for all the river basins and the list of observed or collected alien species are shown in Suppl. material 1.
The CCA showed the distribution of these reptiles as a function of the riverine habitats. The first axis of the CCA (eigenvalue = 0.24, explained variance = 56.86%) was negatively correlated with the presence of alien species and was positively correlated with stream level and fluvial index ( Table 2 and Fig. 2). This axis described the transition between non-seasonal riverine habitats having greater/lesser alien species richness in coastal plain rivers/streams (Table 2 and Fig. 2). Natrix astreptophora and N. maura showed a positive association with this axis, while E. orbicularis and M. leprosa showed a negative correlation (Fig. 2). The second axis of the CCA (eigenvalue = 0.12, explained variance = 28.18%) was positively correlated with the stream level and was negatively correlated with altitude and the water conductivity (Table 2 and Fig. 2). This axis described the transition from mountain streams to seasonal plain      streams. The species that were positively correlated with CCA 2 were M. leprosa and N. astreptophora, while E. orbicularis and N. maura showed a negative correlation (Fig. 2). The DistLM analysis indicated that the presence of M. leprosa was significantly negatively associated with altitude, but positively associated with alien fish richness and crayfish presence ( Table 3). The occurrence of N. astreptophora was significantly positively associated with stream level, but negatively associated with alien fish richness (Table 3). The occurrence of N. maura was significantly positively associated with dissolved oxygen (Table 3).
The JSDM analysis showed a strong environmental correlation (R = 0.70) between M. leprosa and N. maura, while both natricine snakes showed a weaker correlation (R = 0.37; Table 4). There was a strong negative residual correlation in the occurrence of M. leprosa and N. astreptophora (R = −0.81), while N. astreptophora and N. maura showed a weak negative residual correlation (R = −0.14; Table 5).

Discussion
The fluvial systems of north-eastern Spain are highly disturbed and have been colonised by several alien species (Ordeix et al. 2014). Many studies have shown that alien species have a major impact on native icthyofauna and batrachofauna (Kats and Ferrer 2003;Cruz et al. 2008; Clavero and Hermoso 2011), but it is not known what effect they have on semi-aquatic reptiles. This uncertainty was investigated in the present study by analysing the patterns of presence of reptiles based on several environmental descriptors, including the presence of alien fishes and crayfish. The results showed that the presence of alien species also adversely affect the diversity of reptiles, although the species' vulnerability is variable.
The distLM analyses showed that the three species of semi-aquatic reptiles responded differently to the conditions of riverine habitats. Mauremys leprosa and N. maura occupied habitats in highly regulated coastal plain rivers having low structural diversity. Due to the stability of the watershed, these areas support a greater number of alien species that are poorly adapted to the Mediterranean seasonality (Benejam et al. 2005;Boix et al. 2010). The JSDM analyses also showed strong shared environmental responses between M. leprosa and N. maura, indicating that the co-occurrence of these species can be attributed to habitat filtering. The positive association of M. leprosa with highly disturbed stretches of the coastal plain rivers can be explained by several factors: (i) these populations of M. leprosa are close to the northern limit of the distribution of the species (Pleguezuelos et al. 2002) and their occurrence was negatively associated with altitude, which is a surrogate of the thermal gradient; (ii) this reptile shows heliothermic regulation (Salvador 1998) and occupies open, sun-exposed sections of the rivers. These sections have a lower fluvial habitat index score (Pardo et al. 2002); and (iii) M. leprosa is an opportunist feeder that includes alien species in its diet (Pérez-Santigosa et al. 2011), so it can thrive in habitats where native communities are very impoverished but alien species are abundant. Natrix maura also favours more stable river stretches that maintain water during the dry season. These sections typically retain populations of some small native fishes that are the usual prey for this snake (Salvador 1998). However, N. maura is a highly opportunistic predator and also feeds on alien fishes if they replace the native species (Rugiero et al. 2000).
By contrast, JSDM analysis indicated that there was a strong negative association between M. leprosa and N. astreptophora. The occurrence of N. astreptophora was positively associated with river sections having absence of alien fishes and higher hydrologic seasonality. Negative interactions between the two species are unlikely, as both differ in their use of habitats and trophic resources (Salvador 1998); this result might be caused by some non-evaluated habitat parameters (Börger and Nuds 2014). The association between N. maura and N. astreptophora was weakly negative, possibly caused by some trophic segregation, resulting in interspecific interactions being weak but not absent (Luiselli 2006).

Conclusion
Overall, the analyses indicated that the three species of reptiles can occur in altered river stretches (i.e. with low fluvial habitat indices). The semi-aquatic reptiles of the region are generalist species, well adapted to occupy highly dynamic habitats that show significant interannual fluctuations (Gasith and Resh 1999). In our study, most alien species are limnophilic species that have originated in regions with oceanic climates (Doadrio 2001); as a consequence, they are not well-adapted to the highly-variable discharge regimes that characterise Mediterranean-climate rivers (Boix et al. 2010). This fact suggests that restoring the natural flow regime could indirectly favour reptile diversity by creating hostile conditions for the persistence of alien species. In summary, the removal of alien species combined with habitat restoration measures to prevent future proliferation of alien species is probably the most appropriate strategy to preserve the diversity of native semi-aquatic species. The relevance of using various scoring schemes Introduction Alien species cause various and sometimes devastating changes to the environment where they are introduced and influence social and economic aspects of human life (e.g. Pyšek and Richardson 2010, Vilà et al. 2010. To minimise the negative effect of alien species, the Convention on Biological Diversity in its Aichi target 9 has proposed steps to mitigate their impacts, including identifying harmful alien species and prioritising their management (www.cbd.int/sp/ targets/). To reach these goals, standardised measures for impact assessment and species prioritisation are needed.
This need has recently been met by the development of impact scoring schemes (e.g. Hawkins et al. 2015; see Leung et al. 2012 for a review on risk assessments more broadly). Such schemes are typically based on published evidence of impacts or expert opinion and are meant to be transparent, robust and easy to use ). However, they often differ in the parameters used for the assessment and the way published evidence is translated into impact magnitude (e.g. Kumschick et al. 2017. Differences in the outcomes using these schemes can potentially influence policy decisions and, for this reason, there is a need to quantify whether impact assessment schemes are comparable. Three scoring schemes are considered in this study. The Generic Impact Scoring System (GISS) was first developed to assess the environmental and economic impact of alien mammals in Europe  and is one of the most widely used scoring schemes to date ). It has been applied to various taxa including vertebrates, plants and invertebrates (e.g. Kumschick and Nentwig 2010, Vaes-Petignat and Nentwig 2014, Measey et al. 2016, Rumlerová et al. 2016. By comparison, the Environmental Impact Classification for Alien Taxa (EICAT), developed by , focuses only on the environmental impact of species. It was adopted by the IUCN (https://portals.iucn.org/congress/motion/014) to enable a classification of all alien species worldwide (Hawkins et al. 2015, Evans, Kumschick and. The third scoring scheme is the Socio-Economic Impact Classification of Alien Taxa (SEICAT), which exclusively assesses the socio-economic impact of alien species .
As a case study to compare the impact scoring schemes, we use feral mammals (including mice and rats) alien to South Africa. Alien mammals are known to cause damage to many ecosystems worldwide (Howald et al. 2007, Witmer et al. 2007, Skead et al. 2011, Capellini et al. 2015 and they have been shown to have the highest impacts across various taxonomic groups in a European study . For example, feral dogs (Canis familiaris) have contributed to the decline of turtle and tortoise species in India, Costa Rica as well as the Galapagos Archipelago (e.g. MacFarland et al. 1974, Fowler 1979. In South Africa, they are known to transmit diseases to jackals and bat-eared foxes . Furthermore, the impacts of alien mammals have been reported as particularly severe on islands , Hays and Conant 2007, Reaser et al. 2007). Impacts of alien species in general are thought to be more detrimental on island environments by causing higher numbers of extinctions due to higher endemism, simpler food webs and slow diversification rates of species compared to mainlands . This is known as the island susceptibility hypothesis (Elton 1958, Jeschke et al. 2012. The aims of our study were threefold. Firstly, we compared the outcomes of the three scoring schemes by a) comparing environmental and socio-economic impacts of feral mammals (including mice and rats) between the respective schemes and b) disentangling differences in impact scores between impact mechanisms (such as competition and predation) for environmental impacts, expecting to find similar levels of impacts between the schemes. Secondly, a test of the island susceptibility hypothesis was conducted by looking at the differences between socio-economic and environmental impacts caused on islands and mainlands, hypothesising that impacts are higher on islands. Lastly, following the finding that some taxa receive more research attention than others , it was determined whether there is a publication bias in our study. However we do not expect a bias since mammals are generally well studied.

Species selection and literature search
Using data from various sources, including Spear and Chown (2009) (2016) and Invasive Species South Africa (www.invasives.org.za), a list of domestic mammals alien to and feral in South Africa was compiled. This includes eight species which were initially introduced for their use as pets and/or are of agricultural significance. Additionally, we included rats (Rattus rattus and R. norvegicus) and mice (Mus musculus) due to their global significance as invasive pests (Figure 1). These species all have established alien populations in South Africa, but also represent some of the most prevalent domestic mammals with feral populations globally and some of the most damaging alien mammals ) and are referred to as "feral mammals" in this manuscript for simplicity. Even though the selection of species was based on aliens in South Africa, the literature search was based on these species' global alien range and the classification therefore represents the entire alien range.
In order to assemble information on the global impacts of these species, a review of published literature was undertaken. A search string, developed by Evans et al. (2016), was adopted (see also Appendix 1 for further detail) and papers were selected based on manual filtering of titles and abstracts. Databases searched included Google Scholar, Scopus and Web of Science. Publications on the impact of domestic mammals or pets in captivity were excluded and only impacts of stray or feral populations were considered. The reference list of the papers selected was further analysed to search for additional records of impacts. The search was terminated when the same sources were repeatedly found. All impact references were assigned a score by the main assessor and checked by a second assessor. Discussions on scores only occurred when there were disagreements or uncertainties around the score. For each impact found, we noted if it occurred on an island or mainland.

Impact schemes, categories and levels
GISS includes both environmental and socio-economic impacts, with EICAT and SEI-CAT focusing on one impact type each (see Suppl. material 1: Table S1 for the differences in the descriptions of the impact categories of the three schemes). GISS measures impacts by assessing the damage each species causes using six environmental (impacts on plants or vegetation; predation; competition; transmission of diseases; hybridisation; impacts on ecosystems; labelled E_GISS hereafter) and six socio-economic (impacts on agricultural production; animal production; forestry production; human infrastructure and administration; human health; human social life; SE_GISS) impact categories with six subcategories each (based on impact mechanisms or socio-economic sectors) Nentwig 2010, Nentwig et al. 2016). Impact magnitudes range from 0 to 5, zero meaning that no known or detectable impacts were recorded whereas scores of five were equal to the most severe impacts. For GISS, scores were aggregated in two ways: a) Maximum scores refer to highest scores a species achieved in any subcategory and b) sums of the maximum scores received in all subcategories of the environmental and socio-economic categories, respectively and these give an overall potential impact score, termed summed score and ranging from 0 to 30.
In addition, total scores per species referred to summed scores of the socio-economic and environmental scores combined, with a potential range from 0 to 60 .
EICAT focuses on environmental impacts consisting of 12 mechanisms and five impact magnitudes, namely minimal concern (MC), minor (MN), moderate (MO), major (MR) and massive (MV), where MC is equivalent to no detectable impact on native individuals and MV equates to most severe impacts equalling a community compositional change , Hawkins et al. 2015. According to the guidelines by Hawkins et al. (2015), only the maximum impacts across all mechanisms per species were considered for this scheme. Lastly, SEICAT  investigates the socioeconomic effects of species and is similar to EICAT in terms of impact levels. SEICAT is based on alien species' influence on all constituents of human well-being by using changes in peoples' activities to evaluate the impacts. As for EICAT, only maximum impacts were analysed. Impact scores for EICAT and SEICAT were transferred into numerical scores from 1 to 5 (where MC was translated to 1 and MV translated to 5) for statistical analyses.

Statistical analyses
Maximum and summed environmental scores per species were used to compare E_ GISS to EICAT. The same process was followed for the socio-economic comparison of SE_GISS and SEICAT. Paired Wilcoxon's signed rank tests were used to test the similarity of the maximum and summed scores obtained in GISS to maximum scores in EICAT and SEICAT respectively.
To examine the differences in magnitude between environmental and socio-economic impacts, EICAT scores were compared to SEICAT scores and E_GISS scores to SE_GISS scores using a non-paired Wilcoxon's signed rank test. For the GISS comparisons, only maximum scores were used for this test.
In order to assess what drives the potential similarity and differences between E_ GISS and EICAT scores, scores pertaining to specific mechanisms were contrasted. This was done by unifying similar mechanisms across the schemes (Table 1). A single mechanism in GISS, for example, is sometimes represented by more than one mechanism in EICAT. As we are interested in whether there are differences in how the two schemes treat each record of impact, each record was treated as one impact entry (as opposed to a maximum per species and mechanism). The scores relating to each of these were compared using paired Wilcoxon's signed rank tests.
A non-paired Wilcoxon's signed rank test was conducted to test the difference between impacts caused on islands and mainlands. Due to the small sample size when analysing impacts per species, each publication containing information on impact was used as a separate record instead of using maximum impacts per species.
A Kendall's tau was used to examine the relationship between the aggregated scores per species and the number of publications. This was done separately for each scoring scheme to test for publication bias . All analyses were conducted in R studio (version 0.99.903) and R (version 3.3.1) (R Core Team 2016). Table 1. Concatenation of the environmental impact mechanisms in GISS and EICAT that are similar following Nentwig et al. (2016) and Hawkins et al. (2015), as used to compare detailed outcomes of the two scoring schemes for each source or information. "Interaction with other alien species" in EICAT could not be attributed to specific mechanisms in GISS and was therefore not included here.

Results
The total impact of the species using GISS ranged from 15 to 37 with the highest impact being from Sus scrofa and the lowest from Felis catus (Figure 1). No difference between the scoring schemes could be found when comparing EICAT scores to maximum E_GISS scores (paired Wilcoxon's signed rank test; V = 7.5, p = 0.424). Sixty four percent of the species had equivalent scores and, where differences occurred, it was only by a single magnitude. In contrast, when comparing EICAT scores to summed E_GISS scores, we found a significant difference (V = 66, p = 0.038). E_GISS summed scores ranged from 9 to 23 and all but three species (Bos taurus, C. familiaris and Equus asinus scored MR) had an impact magnitude of MV under EICAT.
However, SEICAT and maximum scores of SE_GISS were significantly different (paired Wilcoxon's signed rank test; V = 50.5, p = 0.015). Only 9% of the species' scores were equivalent, whereas more than 81% of the species scored higher in GISS than in SEICAT. This higher score was mostly by a single magnitude (e.g. 4 in GISS and MO in SEICAT), except for S. scrofa where the schemes differed by two magnitudes (5 versus MO). A difference remained when comparing SEICAT with summed SE_GISS scores (V = 54, p = 0.007). While summed SE_GISS scores in this case never exceeded 15, SEICAT scores ranged from MN to MR.
When testing how EICAT and GISS treat scores for various mechanisms mentioned in Table 1  When comparing environmental and socio-economic impacts, EICAT scores were significantly higher than SEICAT scores (Wilcoxon's signed rank test, W = 1.5, p < 0.0001) and the same was true using GISS (W = 105.5, p = 0.001).
A total of 318 papers were used for impact scoring (see Appendix 2). An average of 32 publications per species was found for environmental impacts in comparison to 7.5 publications per species for socio-economic impacts. None of the environmental impact

Discussion
Firstly, following the publication of EICAT (Hawkins et al. 2015) and SEICAT , this is the first application of these schemes for mammals. Until now, EICAT has only been used to assess the impacts of amphibians and birds introduced globally (Evans et al. 2016 and gastropods alien to South Africa (Kesner and Kumschick in revision) and SEICAT exclusively for the latter two Kesner and Kumschick in revision). Our study provides a starting point to adding another taxonomic group to the list of alien species with evidence based impact classifications and shows that EICAT and SEICAT are applicable to mammals. Furthermore, this study provided support for the already commonly used scoring scheme GISS ) and adds assessments of many mammal species which were excluded in previous studies (Nentwig et al. 2010 excluded domesticated mammals).
Besides proving the applicability of the schemes to further taxa, our analysis reveals which impact measures might be necessary and most useful for management decisions. The comparison of environmental and socio-economic impact magnitudes, for example, shows that it is not sufficient to study one aspect to get a full picture of impacts (see also Vilà et al. 2010. Previous studies, such as those by  and Kumschick et al. (2015), found that, within schemes, environmental and . The relationship between the number of publications and a environmental impact scores using EICAT scores and b socio-economic impact scores for alien mammals using SEICAT, where each dot represents a species and the line represents the relationship between impacts and number of publications. Maximum GISS scores and the sum of GISS environmental and socio-economic scores showed the same pattern as Figure 4a and b, respectively (results not shown).

a) b)
socio-economic impacts were comparable, with species with high environmental impacts generally showing high economic impacts as well. This study found that feral mammals generally have larger environmental than socio-economic impacts. Yet, the difference in environmental and socio-economic impacts does not mean that the socio-economic impacts are low  with some species, such as C. familiaris, still scoring MR. Furthermore, different societal sectors (which includes conservation, health, agriculture and social) also have different priorities and for that reason, will be interested in different aspects of impacts covered by different scoring schemes . The scoring schemes used here (Kumschick et al. 2012), as well as others previously developed (e.g. D'hondt et al. 2015) therefore allow for the explicit weighting of categories. However we do not consider this to be the task of scientists, but rather the decision-makers and therefore consider all sectors to be of equal weight for this study.
The difference in magnitude recorded for socio-economic impacts between the scoring schemes has various possible causes. While both schemes are based on the same literature and are able to score all socio-economic impacts found for the selected species, GISS and SEICAT are based on different endpoints and use different "currencies" to compare impacts, with GISS addressing the actual economic damage of species and SEICAT transcribing these changes into effects on human well-being and activities being affected by the damage . SEICAT has thus moved away from a mainly economic and value-driven (monetary) approach and assesses how people react to the damage caused by the invasive species rather than the actual damage itself. As an example, the damage that feral donkeys (E. asinus) cause to agricultural production might seem high at first, as seen in the GISS impact scores of 4, but it has not stopped farmers from continuing to produce agricultural products in any way (leading to low SEICAT scores of MN) (Tisdell and Takahashi 1988). Hence, SEICAT assumes that if peoples' behaviour does not change as a result of the impact caused, the impact is not bad enough; or conversely, an impact does not have to cause huge monetary costs to be perceived as bad by certain vulnerable communities which have limited possibilities to cope with the problem . As both scoring schemes cover important aspects of socio-economic impacts, but in fundamentally different ways with GISS focusing on actual damage and monetary losses and SEICAT focusing on resulting changes to activities more generally and peoples' well-being, we suggest that using a combination of GISS and SEICAT assessments could prove useful to obtain a more complete and distinct picture of socio-economic impacts of alien species. Although this might not be the most practical solution, both schemes rely on the same evidence base. Alternatively, one scheme could be chosen based on the specific needs and scope of the assessment, with the respective endpoints assessed by each in mind.
In contrast, environmental impact scores were comparable between EICAT and GISS in our study, especially when the maximum impact was considered, suggesting only one scheme is needed. This supports the decision by the IUCN to adopt one scheme, namely EICAT, for a global classification of all alien taxa according to their environmental impacts. However, a previous study comparing the two schemes using amphibians as a case study highlights important differences for certain mechanisms between the schemes which should be considered in future applications ). These differences were mainly attributed to uncertainties in the scoring of disease impacts in general (for transmission of diseases) and the differences between the two schemes in assigning the highest impact levels for hybridisation, with GISS depending on the size of the hybrid population and EICAT only referring to the fertility of F1 offspring. The main difference, which was found between certain mechanisms in this study, could be attributed to the split in some mechanisms in GISS (namely "Impacts on ecosystems" and "Impacts on animals through predation, parasitism or intoxication") into several mechanisms in EICAT and which allows for more detailed assessments. Furthermore, EICAT consistently focuses on the recipient native community, while GISS assesses changes in nutrient fluxes and other abiotic changes as well, without a necessary link to the native biota. This can, in certain cases (mainly for plant impacts), be an advantage as studies sometimes lack a link from changes in nutrient availability to effects on the native community (e.g. Castro-Diez et al. 2009 for two exotic trees). On another note, EICAT also includes distinct categories of impact through bio-fouling and interactions with other alien species, which are not separated out in GISS Nentwig et al. 2016). This allows for a more detailed assessment of impacts of a larger variety of taxonomic groups which is another advantage of EICAT.
In terms of the various ways to aggregate scores, EICAT and SEICAT suggest using the maximum across all categories as summary classification , Hawkins et al. 2015 while GISS originally promoted the use of summed scores , but see Kumschick et al. 2016. Both have their strengths and shortcomings and can introduce different biases (as outlined in Nentwig et al. 2017 andTurbé et al. 2017). Consequently, for some taxa, we found marked differences when scores are aggregated in different ways. The cat (F. catus) for example had the lowest recorded sum score in GISS of all taxa here considered, even though it is widely recognised as one of the most damaging alien species globally (e.g. Lowe et al. 2004) and has contributed to many extinctions of birds especially on islands (e.g. Dickman 1996). This seeming discrepancy is due to the limited range of mechanisms through which feral cats cause harm to native communities, namely mainly through predation, which leads to relatively low summed scores in GISS. Therefore we would like to highlight the importance of documenting all the sources used for each assessment and the details on all scores obtained to make a more informed policy decision, regardless of which tool is used.
Even though impacts of alien mammals are generally well studied compared to other taxa (e.g. Kumschick et al. 2015), there is a potential publication bias which can influence the magnitude of impacts recorded -the less is known about a species the lower the likelihood a severe impact is found or vice versa. We expect this to be more of a problem for less conspicuous and generally understudied taxa like invertebrates. It needs to be further evaluated how such (potential) publication biases could be addressed and avoided (see e.g. Evans et al. 2018).
Given that the selected species all have alien populations in South Africa, the results shown here could be useful to provide information for local policy-making and prioritisation. Little evidence exists on impacts of these species in the country, but data from elsewhere show that all these mammals have caused severe impacts leading to the disappearance of at least one species locally and some even contributed to global extinctions ( Figure 1). Even though impacts on island have generally been more severe, they are not restricted to these regions and we expect many of the changes caused elsewhere could also happen in South Africa or have already occurred but not been documented. As an example, knowing that feral dogs hybridise with wolves and coyotes in Europe and America , Freeman and Shaw 1979, it is possible that domestic dogs could hybridise with other native species such as the African wild dog and jackals. Evidence from other African countries in fact shows that hybridisation and disease transmission is occurring between these species . As another example, feral pigs (S. scrofa) have the highest summed impact (Figure 1) showing that the range of mechanisms through which they have impacts is quite widespread. For example, impacts on ecosystems have shown to be massive due to uprooting damage leading to the elimination of a rare plant, Navarretia plieantha in the United States of America . Other impacts include impacts through predation, herbivory and competition. These are very generalist impacts which are not dependent on specific conditions in the recipient environment. In South Africa, however, environmental impacts of feral pigs have only been recorded to be minor as the damage reported does not affect species composition yet (Spear and Chown 2009). This might be a function of the species' limited distribution or a bias due to lacking research. It therefore seems timely to consider this vast amount of evidence and evaluate management options for these species in sensitive areas.

Island susceptibility hypothesis
Only few studies have tested the island susceptibility hypothesis explicitly (Jeschke et al. 2012 found only 9 studies, most on birds), even though islands are generally thought to be more susceptible to invasions. Furthermore, previous studies testing this hypothesis have looked at "invasion success" or establishment rather than impact, finding limited support (Sol 2000, Jeschke 2008. A recent study on birds also used EICAT to classify species according to impacts and, as in our study, it confirmed impacts to be more severe on islands compared to mainlands (Evans et al. 2016). This might suggest that establishment and invasion success (cf. Blackburn et al. 2011) are not increased on islands, but alien species are causing more harm to the native biota. For example, ground-nesting birds and giant tortoises are particularly vulnerable to predation and trampling by invasive rodents and other mammals (MacFarland et al. 1974. However, further studies would need to be undertaken to confirm this pattern more broadly.

Conclusion
This study highlights the similarity and differences amongst three impact scoring schemes when using feral mammals as a case study and which can be used to make recommendations as to how prioritisation for actions can be improved. While using more than one scoring scheme to assess the same impacts seems cumbersome and unnecessary, it can help us to get an improved understanding of the various dimensions of such impacts, especially on socio-economic systems. Although this can be time-consuming, the most labour-intensive part of the impact scoring process is collating the relevant literature. All the schemes used here are based on the same data to assess and score impacts ) and, once data is accumulated for the GISS assessment, the same references can be used to complete the other assessments.

The search string used to assemble information on the global impacts of the mammals assessed in this study. Adopted from Evans et al. (2016).
Searches on the impacts of mammals were undertaken using the following search terms within a search string, in conjunction with the species scientific and common name: "introduced species", "invasive species", "invasive alien species", "IAS", "alien", "non-native", "non-indigenous", "invasive bird", "pest", "feral" and "exotic". Thus, the search string for the species feral pig was ("introduced species" OR "invasive species" OR "invasive alien species" OR "IAS" OR "alien" OR "non-native" OR "nonindigenous" OR "invasive bird" OR "pest" OR "feral" OR "exotic") AND ("pig" OR "boar" OR "Sus scrofa").

List of references used for the scoring impacts using GISS, EICAT and SEICAT
Abdel-Moein KA, Hamza DA (2016) Norway rat (Rattus norvegicus) as a potential reservoir for Echinococcus granulosus: a public health implication. Acta Parasitologica 61 (4)

Models of alien species richness show moderate Introduction
Knowing the distribution patterns of alien species richness is increasingly crucial for assessing and monitoring global biodiversity (Dornelas et al. 2014, Tittensor et. al. 2014, Capinha et al. 2015, Latombe et al. 2017). Knowledge about alien species richness is also required by environmental managers and researchers to assist in decision-making of tasks such as ecosystem restoration (Catford et al. 2012), the identification of points of entry for introduced species (e.g. Seebens et al. 2013), the quantification of impacts posed by invasive alien species (i.e. the subset of alien species that have harmful effects on the recipient ecosystems;  or the assessment of the ecological degradation of habitats (Vandekerkhove et al. 2013).
Despite substantial progress being made in the availability of alien species distributions, there still are numerous regions worldwide for which alien species richness data are lacking or highly incomplete ). These gaps occur at multiple spatial scales and, although often related to lower levels of socioeconomic development (McGeoch et al. 2010), they can also be found in generally wellstudied taxonomic groups and regions . A further challenge for preparing an inventory of alien species richness is the highly dynamic nature of alien species spread, which requires regular updates as time progresses (McGeoch et al. 2010, Tittensor et al. 2014.
Despite the relevance of information on alien species richness, little work has been done to assess whether alien species richness can be accurately predicted for areas where such data are lacking. If this metric is possible to predict with accuracy, then available data can be used to geographically broaden the current knowledge on biodiversity patterns and to support conservation and alien species management decisions in areas that are currently not surveyed. Further, high reliability of predictive models would enable the integration of predictive modelling in alien species mapping.
Two main lines of modelling approaches are used for predicting alien species richness. The first consists of the use of stacked species distribution models (e.g. Bertelsmeier et al. 2015). Here, alien species distributions are modelled individually as a function of environmental factors and the predictions are turned into binary maps of species' presence/absence. Predictions of alien species richness in regions are obtained by stacking the individual maps. The second approach consists of statistical modelling of alien species richness directly as a function of environmental factors (e.g. Jarnevich et al. 2006, Nobis et al. 2009). The framework for this approach is similar to that of descriptive models relating alien species richness to spatial factors (e.g. Kühn et al. 2003, Westphal et al. 2008, but it goes one step further by using the identified relationships to make predictions. Species distribution modelling has been intensively applied to alien species in recent years and modelling practices are increasingly refined (e.g. Calabrese et al. 2014). Thus, one might expect that the stacked modelling approach would be a preferred means for predicting alien species richness. However, stacked species distribution models, which imply developing models for each species individually, are data-and resource-demanding and may not be applicable when a high number of species or regions are involved. For instance, one would need to have at least 10 times as many presence points as one has predictor variables (Babyak 2004) -and ideally at least as many absence points (but see MaxEnt; Elith et al. 2011). Further, in species distribution models, one of the most important fundamental assumptions, i.e. the species modelled being in equilibrium with the environment, is violated. This would lead to underestimating the potential niche space of the species and would result in biased models, usually leading to incorrect predictions and inflated turnover rates in projections (Pompe et al. 2008, Pompe et al. 2011. In this context, statistical models directly relating alien species richness with spatial drivers (hereafter referred to simply as "species richness models") become particularly relevant as they are less data-demanding. However, benchmarking the accuracy and performance of these models using typical datasets of alien species distributions has rarely been done, which hampers the assessment and comparison of the model predictions.
Here, we perform a formal evaluation of the ability of species richness models to predict the richness of alien species. We measure and compare the predictive accuracies of five modelling techniques extensively used in ecology: i) a Generalised Linear Model (GLM) using a Poisson distribution (GLM-P), ii) a GLM using a negative binomial distribution (GLM-NB), iii) boosted regression trees (BRT), iv) multivariate adaptive regression splines (MARS) and v) random forests (RF). We assess the ability of the modelling techniques to predict within the geographical range of the model's calibration data (i.e. "geographical interpolation") and in new, spatially independent regions (i.e. transferability or "geographical extrapolation"; Wenger and Olden 2012). We perform this assessment using a collection of 22 datasets of alien species richness analysed in previous studies.

Alien species richness data
We collected 22 typical data sets of alien species richness from previous studies ( Table 1). Each dataset provides the total number of established alien species in distinct geographical units (hereafter referred to as 'regions'), such as countries, sub-national administrative regions (e.g. federal states, provinces) or islands. Ten taxonomic groups are covered by the data. Spatial coverage (extent) differs widely, ranging from the whole world, the European continent, temperate and subtropical regions to oceanic islands. In order to have a common geographical basis for the assignment of values of the predictor variables (below), we matched each region with the corresponding polygon of the Global Administrative Areas Database v.2.8 (GADM; http://www.gadm.org/). The GADM is the most detailed delimitation of worldwide administrative divisions available. We excluded all regions that we could not identify unambiguously, that had no geographical match in GADM and also regions that were smaller than 1 km 2 -the highest spatial resolution provided by gridded predictor variables; see below. In some cases, this resulted in reduced numbers of records compared to the original datasets (62% to 100% of the records in the original datasets kept for our analyses, average = 92% ± 10.2%). Most data sets in our collection contain a relatively low number of regions with 15 datasets consisting of less than 100 regions and 7 of less than 50 regions (Table 1).

Predictor variables
We selected nine explanatory variables representing factors that have been shown in previous studies to explain the variation in alien species richness (Kühn et al. 2003, Lambdon et al. 2008, McGeoch et al. 2010, Essl et al. 2011, Essl et al. 2013. These variables were: geodesic area (log-transformed); insularity (island/mainland); mean annual temperature; mean annual precipitation; diversity of bioclimatic types; geographical isolation; GDP per capita; human population density and proportion of area covered by urban land use.
Geodesic area (km 2 ) was measured using the spatial polygon of the region after re-projection to a Mollweide equal area projection. Insularity was a binary variable (islands or mainland regions). Mean annual temperature and mean annual precipitation represent region-wide averages of the corresponding climatic conditions (at ca. 1×1 km) derived from WorldClim (http://www.worldclim.org/). We defined bioclimatic diversity as the total number of distinct bioclimatic types delimited by Metzger et al. (2013) that are found within a region. The bioclimatic types defined by Metzger et al. (2013) consist of 125 divisions that group relatively homogeneous environmental conditions at the global scale. Geographical isolation corresponded to the shortest travel time possible in the region to reach a populated place with 50,000 or more people as mapped by Nelson (2008). For each region, we extracted the minimum travel value found in the region. Gross Domestic Product (GDP) per capita in 2005 -in purchasing power parity-constant 2005 US dollars -and human population size were retrieved mainly from the World Development Indicators (http://data.worldbank.org/ data-catalog/world-development-indicators) and Gennaioli et al. (2014) but also from a few other sources including national and regional statistical agencies, Index Mundi (http://www.indexmundi.com/) and online reports. When data for the year 2005 was not available, we used data for the closest available year from the first decade of the 21 st century. Human population density was calculated by dividing population size of the region in 2005 by its area. The proportion of area covered by cities was calculated by dividing the area of urban land cover of the region -as measured from GlobCov-er2009 (http://due.esrin.esa.int/page_globcover.php) -by region size. For each data set, we tested for redundancy amongst the predictors using pairwise Pearson correlation and selected only predictors that were moderately correlated (|r|<0.7, Dormann et al. 2013). The final set of predictors considered for each dataset is shown in the Suppl. material 1: Table A1.
We performed all data processing in R (v. 3.4.1) (www.R-project.org/). The extraction of values from the source gridded datasets was done using the 'extract' method of the  package.

Modelling techniques used for predicting species richness
We tested five techniques for modelling alien species richness: i) GLM-P using a Poisson distribution, ii) GLM-NB using a negative binomial distribution, iii) boosted regression trees using a Poisson distribution (BRT), iv) multivariate adaptive regression splines (MARS) and v) random forests (RF). These methods were selected because they fall into different positions along the spectrum of statistical assumptions and modelling architectures, allowing a number of relevant comparisons. These include i) a comparison of GLMs having a restrictive (GLM-P) and a more relaxed distributional assumption (GLM-NB); ii) comparison of GLMs with machine learning models (BRT, MARS and RF) and iii) comparison of a linear regression-type machine learning model (MARS) with tree ensembles-type machine learning models (BRT, RF). We briefly describe each of the modelling techniques used in the Suppl. material 1: Text A1. Their implementation is described below.

Generalised linear models using Poisson and negative binomial distributions
We implemented GLM-P using the standard 'glm()' function of R and GLM-NB using the 'glm.nb()' function of the  package. The theta parameter, which represents the dispersion of the data in the calculation of the variance of the NB distribution, was estimated by means of maximum likelihood. An important step in the application of GLMs is to identify the 'best' combination of predictors (Brewer et al. 2016), especially when multiple "best models" can be found due to collinearity (Dormann et al. 2013). We adopted a multi-model inference approach (Burnham and Anderson 2002), in which we identified the combination of predictors that were best supported by the Akaike information criteria corrected for small sample sizes (AICc). We tested two options to select the models used for comparison with other methods. First, we simply used the model receiving the highest support from AICc. Second, we accounted for the possibility of multiple models receiving strong support, i.e. Δ AICc < 2 and corresponded to the use of the average of their coefficients. To avoid overfitting the models by including more predictor variables than warranted by the number of observations available (Babyak 2004), the maximum number of predictor variables considered simultaneously in each multi-model comparison was set to one tenth of the number of observations being used for model calibration (Babyak 2004). The multi-model assembly and calculation of AICc were performed using the R package MUMIN (v. 1.13.4).We found the accuracy of predictions from these two options to be virtually identical (Suppl. material 1: Table A2) and used the results for the best supported model by AICc for comparison with other methods. For GLM-P, over-dispersion was calculated and detected for several of the datasets used in this study (Suppl. material 1: Table A3). For these cases, we also performed multi-model comparisons using the quasi Akaike Information criterion corrected for small sample size, QAICc, a criterion more suitable for model selection in the presence of over-dispersion. Adopting QAICc, however, did not improve the models nor change the general outcome (results not shown).
Hurdle models (Potts and Elith 2006) using GLM-P and GLM-NB were also implemented for a subset of datasets with zero inflation (for details on implementation see Suppl. material 1: Text A1). We found that small sample sizes impeded model convergence for some datasets and that predictions from converging models were not significantly superior to those from 'plain' GLM's (Suppl. material 1: Text A1). Accordingly, we refer no further to the results from hurdle models in our work.

Multivariate adaptive regression splines
Multivariate adaptive regression splines (MARS; Friedman 1991) were implemented using the EARTH (v. 4.2) package of R. Interactions amongst predictor variables were not considered (i.e. only additive models), as preliminary tests revealed that interactions did not improve predictive performances (results not shown). Exhaustive pruning, a method that considers all candidate model terms, was used for selecting those to keep in the final MARS model.

Random forests
Random forests (RF; Breiman 2001) were implemented using the 'randomForest' (v. 4.6-10) package for R. All models corresponded to an ensemble of 1000 trees and used a random selection of 4 predictors as candidates for branch splitting.

Boosted regression trees
To implement Boosted regression trees (Elith et al. 2008), we used the 'dismo' (v. 1.1-4) package for R with an assumed Poisson distribution for the response variable. For all models, we used a learning rate of 0.001, which implies a fairly low contribution of each tree added to the model and imposes a desirably high number of total trees in the ensemble (often > 1000) (Elith et al. 2008). Remaining parameters were kept at default values. Following the fitting of models using all predictors, we performed a stepwise variable selection procedure, in which any predictor not contributing to the decrease in model deviance was removed (Elith et al. 2008).

Assessment of predictive performance
The performance of the previous techniques in predicting alien species richness for each of the 22 datasets was assessed using two distinct approaches. The first was a leave-one-out cross-validation (Wenger and Olden 2012). In leave-one-out cross-validation, the model is fitted and used to predict one hold-out observation at a time. This is repeated until all observations are used for validation. This approach provides unbiased estimates of in-sample predictive accuracies, but does not allow assessing the model performance for predictions outside the spatial range covered by the data set (Wenger and Olden 2012). The second approach was a k-fold regional cross-validation (Jiménez-Valverde et al. 2011). This approach, which relies on the use of geographically distinct subsets of the data, provides a reliable assessment of the accuracy of the predictions made to new, unrelated, geographical domains -i.e. it assesses the spatial transferability and generality of the relationships identified by the models (Wenger and Olden 2012). Here we used a 4-fold regional crossvalidation in which each dataset was divided into four geographical quadrants based on the centroids of all regions. Then, regions in three quadrants were used for model fitting while one was left-out for model evaluation. The procedure is repeated four times, so that all possible three quadrant-combinations are used for model calibration.
The assessment of validation accuracy was made for two criteria: (1) agreement between reported and predicted absolute values of alien species richness and (2) agreement between the rank order of reported and predicted alien species richness. For the first criterion, we calculated the 'relative absolute error' (RAE). An RAE of zero represents a perfect match between predicted and observed values, while 100% corresponds to the level of error that is obtained if all predictions simply represent the average of the alien species richness values used for evaluation (Witten et al. 2016). For the second criterion, we compared the Spearman rank correlation (ρ) between predictions and left-out observations. Both criteria were calculated using the RMINER (v. 1.4) package.
Two distinct evaluation assessments were made for GLM-P, GLM-NB and MARS, in order to account for their greater susceptibility to errors when predicting beyond the sampling space of the calibration data (i.e. when extrapolating). While BRT and RF 'do not extrapolate' because they use the closest known subspace of the calibration data as target for the prediction (Elith and Graham 2009), GLM-P, GLM-NB and MARS assume that the trend of fitted responses continues outside the sampling space, which may lead to predictions of unrealistically high values of richness. The first evaluation consisted of performing the predictions from GLM-P, GLM-NB and MARS without any constraints. The second evaluation consisted in constraining these models to mimic the behaviour of BRT and RF by means of 'clamping' (Elith and Graham 2009). Clamping corresponds to setting the range margins of values used for prediction to the range margins found in the model calibration data. That is, if (x < min) then x = min or if (x > max) then x = max; where x is the value of the predictor in the test data and min and max are the minimum and maximum values of the predictor in the calibration data, respectively.
Multiple pairwise Wilcoxon tests were used to test for significant differences in RAE and ρ between the five modelling techniques. The differences were assessed by comparing the performances achieved by each method in the 22 datasets of alien species richness tested. In the case of GLM-P, GLM-NB and MARS, pairwise Wilcoxon tests were also used to compare the accuracy from clamped versus the not clamped versions of the predictions.

Spatial autocorrelation
Spatial autocorrelation (SAC) in the distribution of alien species richness may lead to incorrect model parameter estimates (Dormann et al. 2007), even resulting in changing the sign of the relationship (Kühn 2007). Testing for transferability is particularly important when the autocorrelation structure of the areas of calibration and prediction differ (as in the case of cross-validation applied here). We tested for SAC in the residuals in all models during the regional cross-validation process by means of correlograms showing the correlation amongst Pearson residuals of regions over a range of uniformly distributed distance classes. The statistical significance of the correlations was based on 1000 permutations for the values of the residuals. The package NCF (v. 1.1-7) for R was use to perform the analyses. We found no or limited SAC in the residuals for the majority of models. A few significant autocorrelations occurred, but these were generally of low magnitude and did not tend to be conserved across the four-folds of regional cross-validation or across the different modelling techniques. These results led us not to consider the application of predictive models explicitly accounting for SAC.

Predictive performance of alien species richness using leave-one-out cross-validation
Results for the leave-one-out cross-validation -which assesses the accuracy of predictions made for within the geographical range of the data -show that RF and BRT provided the comparably best performances (Figure 1a; Suppl. material 1: Table A4).

Figure 1.
Detailed legend: Accuracy was measured for generalised linear models using Poisson (GLM-P) and negative binomial distributions (GLM-NB), boosted regression trees (BRT), random forests (RF) and multivariate adaptive regression splines (MARS) for predictions of the total number of alien species per region (relative absolute error, RAE; lower is better) (a, b) and the rank order of each region (Spearman's rho; higher is better) (c, d). Boxplots represent variations in accuracy across 22 datasets of alien species richness for GLM-P, GLM-NB, RF and MARS, but not for BRT. Due to model convergence issues, results for BRT comprise only a subset of datasets and are thus not directly comparable with the results of the other techniques. Panels in the right left (a, c) refer to predictions evaluated using a leave-one-out approach, which measures the accuracy of predictions within the geographical range of the model calibration data. Panels in the right (b, d) refer to predictions evaluated using a four-fold regional cross-validation approach, which assesses the spatial transferability of the models. A few outliers lie outside the ranges of the Y-axes, see Tables A2 and A3 for the complete list of values. Still, these two techniques had a median RAE of about 76% and five or less datasets had a RAE higher than 90% (Figure 1a; Suppl. material 1: Table A4). The predictive performance of these two techniques was not significantly different from one another and both performed significantly better than GLM-P, GLM-NB and MARS (p < 0.05; Wilcoxon rank-sum test) (Table 2A). For GLM-P and GLM-NB, about half or more datasets had a RAE of 90% or higher.
The application of data 'clamping' in predictions of GLM-P, GLM-NB and MARS resulted in improvements in predictive performance for nearly all datasets (Suppl. material 1: Table A5). However, this improvement was only statistically significant for GLM-P (p < 0.05; Wilcoxon rank-sum test).
Regarding the predictions for the relative order of regions in alien species richness, these were more accurately predicted by RF (median ρ = 0.63), closely followed by BRT (median ρ = 0.62). The higher performance of RF was significantly different from the performances achieved by GLM-P, GLM-NB and MARS (p < 0.05; Wilcoxon rank-sum test), but not for BRT (p > 0.05). BRT was also significantly better in predicting the relative order of regions in terms of alien species richness than the two GLM-type models (p < 0.05). The application of clamping to GLMs and MARS did not significantly alter their accuracy (p > 0.05; Wilcoxon rank-sum test). A reasonable number of data sets achieved high (ρ > 0.6) degrees of correlation between the predicted and observed order of regions, particularly for the two best performing techniques (RF and BRT), whereas weak (ρ < 0.25) correlations were less common ( Figure 1C; Suppl. material 1: Table S4).

Predictive performance of alien species richness using regional cross-validation
Results for the 4-fold regional cross-validation, which assesses the transferability of model predictions, showed a consistently worse predictive accuracy than the one evaluated by leave-one-out cross-validation. All modelling techniques showed substantially higher medians of RAE ( Figure 1B; Suppl. material 1: Table A6) and, in the case of MARS, RF and BRT, the lower performance than their counterparts, evaluated by leave-one out cross-validation, was statistically significant (p < 0.05; Wilcoxon rank-sum test).
The least inaccurate technique was RF (median RAE = 95.1%; interquartile range = 25.5%) ( Figure 1B; Suppl. material 1: Table S6) which showed significant (p < 0.05; Wilcoxon rank-sum test) to marginally significant (p < 0.1) differences in all pairwise comparisons with the other methods (Table 2B). Despite performing best, RF still delivers a substantial amount of error. For ten datasets, the predictions of alien species Table 2. Results of pairwise Wilcoxon tests of significant differences for the performance of the techniques for predicting absolute richness (as measured by relative absolute error) using leave-one-out crossvalidation (A) and regional cross-validation (B). Predictions of GLM-P, GLM-NB and MARS refer to models using 'clamped' data (see main text). Significant differences (at α = 0.05) are shown in bold. richness were worse than the ones obtained if using simply the mean absolute value of alien species richness (i.e. RAE > 100%) and, only for eight data sets, the accuracy of RF was clearly superior (RAE ≤ 90%) to this benchmark (Suppl. material 1: Table  A6). All remaining techniques performed worse, with median performance above the 100% RAE benchmark ( Figure 1B; Suppl. material 1: Table A6). In addition, BRT is demanding in terms of sample size and could not be fitted for 7 of the smallest datasets (mean n = 38, S.D. = 10). The remaining techniques were able to fit models for all datasets. No significant differences in performance were found between any two techniques other than with RF (Table 2B). Similarly to what was verified for leave-one-out cross validation, the application of data 'clamping' in predictions of absolute alien species richness by GLM-P, GLM-NB and MARS resulted in clear increases in predictive performances for nearly all datasets (Suppl. material 1: Tables A6 and A7), but this improvement was only statistically significant for GLM-P (p < 0.05; Wilcoxon rank-sum test).
For predictions of the relative order of regions in alien species richness (ρ), no method emerges as best performing (p > 0.05; Wilcoxon rank-sum test). The application of clamping did not significantly alter the results (p > 0.05; Wilcoxon rank-sum test). Most datasets showed moderate to low (ρ < 0.45) degrees of correlation between the predicted and observed order of regions ( Figure 1D; Suppl. material 1: Table S6).

Discussion
Our results show that values of alien species richness can be predicted with reasonable to moderate accuracy within the geographical range of the model calibration data, but only poorly in regions outside this range. This drop in predictive power was verified across modelling techniques and concerned the capacity to predict both absolute alien species richness and relative alien species richness.
The poor transferability of statistical models is not unexpected because the relationships they identify are not functional (mechanistic) and may thus be limited in their realism outside the space of the calibration data. Issues related to transferability have been well documented and examined for species distribution models (e.g. Randin et al. 2006, Heikkinen et al. 2012, Wenger and Olden 2012, Bahn and McGill 2013. Our results extend these findings to models of alien species richness. Given the similarity of the two model approaches, we argue that the causes for these congruent findings could be largely shared. First, poor transferability can be a consequence of overfitted models, which have relationships overly adjusted to the calibration data (e.g. also expressing random noise), reducing their generality (Wenger and Olden 2012). However, we expect this potential source of error to be of minor importance in our models because RF, which are known to be susceptible to overfitting (Heikkinen et al. 2012, Wenger and Olden 2012, Bahn and McGill 2013, showed consistently better transferability than GLMs selected by AICc, a modelling approach that is expected to provide models robust to overfitting (Randin et al. 2006;Wenger and Olden 2012).
Another possibility for the poor transferability of the aliens species richness models could be that the relationships (i.e. the covariance structure) between predictor and response variables and amongst response variables are not conserved in the areas that lie beyond the spatial range of the calibration data (Bahn and McGill 2013). These sorts of changes can result from real differences in the way factors influence alien species richness in the new areas (e.g. temperature may be an important limiting factor at high latitudes, but not near the tropics) or from changes in the correlation structure of the predictors (Elith et al. 2010). Importantly, these changes are more likely for predictors that are proxies of the causal process, because relationships for them should be less robust to the effects of confounding factors or to changes in predictor's correlation structures (Austin 2002). In our models, several predictors are actually proxies of the putative relevant mechanisms rather than direct measures, making them particularly susceptible to this type of problem. We stress however, that this is frequently the case in models of alien species richness, as better, proximal information is often not available. One particular example refers to colonisation pressure, i.e. the number of species introduced into a region (Lockwood et al. 2009), which is a strong determinant for the number of alien species that become established ). In the absence of better information, colonisation pressure was represented by variables depicting variation in levels of human activity (e.g. per capita GDP; human population density; proportion of urban areas). The assumption is that higher human activity should translate into higher colonisation pressure e.g. due to a higher purchase of pets and plants or to higher volumes of imported cargo potentially carrying alien species. While this relationship should hold some degree of generality , it is also likely that the shape and relative importance of the relationship changes to some extent across space, given the local influence of regional-scale factors such as regulations for the importation of living animals and plants (e.g. Reino et al. 2017).
A third possible cause for the observed poor transferability of models concerns extrapolation, which is also related to the information content in the model calibration data. Predictions made for conditions out of the range of the calibration data are extremely challenging, no matter the modelling technique used Graham 2009, Elith et al. 2010). We applied 'clamping' to GLMs and MARS when confronted with extrapolating conditions, mimicking the behaviour of BRT and RF. This substantially improved the overall accuracy of these models, providing circumstantial evidence for a substantial prevalence of extrapolation in the predictions. It is worth emphasising that, while clamping avoids 'off the chart' predictions under extrapolation, it does not 'add' extra information to the models and any extrapolating prediction, clamped or not, should generally be less accurate than a prediction made for conditions sampled by the data (i.e. interpolation). Our 4-fold regional cross-validation, which holds-out one geographical quadrant of the data at a time, should imply a substantial number of extrapolating data points, hence also likely contributing to the verified sharp drop in predictive accuracy when models are spatially transferred.
A good transferability of models of alien species richness may not be required if predictions or model-based inferences are intended for the geographical range of the sampling data. Our results show that, under these settings, models of alien species richness can achieve moderate (e.g. RAE ≈ 75% and ρ ≈ 0.6) predictive accuracy. However, one of our most prominent results was that predictions from RF and BRT were significantly better, despite not being good, than those from GLM-type models and MARS. This occurred even after data clamping being applied to GLMs and MARS, allowing the effect of the higher susceptibility of these models to extrapolation errors to be discarded.
It is not unexpected to find RF and BRT outperforming GLMs in non-transferred predictions. The model fitting process of the former techniques consists in iteratively fitting the data and testing the ability of the fitted relationships for prediction using portions of data left-out from the fitting. The leave-one-out cross-validation mimics this procedure, differing mainly in that the error levels measured are not used to retune the model. Hence, machine learning techniques are specifically optimised to predict well, based on the patterns sampled by the data. Besides, the capacity of machine learning techniques to fit complex functions could be particular relevant for models of alien species richness, because the relationships between variables in these models are often fitted along wide gradients (such as for global-scale environmental and socioeconomic variation; e.g. Dawson et al. 2017), where the persistence of simple linear or monotonic relationships (as it is assumed by GLMs) should be less common. MARS, another machine learning method, was also substantially outperformed by RF and BRT. Although also able to fit non-linear functions, the complexity of MARS models is usually considerably lower than that of BRT and RF (Merow et al. 2014) which could explain its comparatively lower predictive performance.
Similarly to what was verified in the predictions of transferred models, extrapolation also severely harmed predictions made for in-sample geographical ranges. Ideally, extrapolation should be overcome by the use of additional data, sampling the extrapolating predictors' space. However, when that is not possible, our results show that the use of clamping is strongly recommended. Further benefits could also be expected from the examination of the conditions leading to extrapolation, such as the identity of the predictors involved and of how far the model has to extrapolate in the predictors' space. This has been assessed in SDMs previously (see, for instance, Elith et al. 2010) and allows a further refinement of the predictions by, for instance, identifying extrapolating regions of 'low novelty' and for which prediction could be appropriate or inversely, regions where conditions are far beyond what is sampled by the data and for which predictions should be avoided.
Overall, our results suggest that accurate predictions of regional alien species richness from correlative models are beyond the scope of the models we used. This is particularly the case for absolute values of richness, whereas relative richness, despite not achieving overall good accuracy, showed to be more robust to errors. Here we analysed the transferability of models on species richness between regions. A complementary analysis, recognising species identity explicitly and, hence, also allowing for the analysis of species turnover, are models of compositional similarity (e.g. McGeoch 2014, Capinha et al. 2015). A promising way forward would be testing the transferability of such models.

Conclusions
We showed that regional alien species richness cannot be predicted with reliability using the data and methods typically found in literature. Given that these data and methods already reflect best available possibilities to modellers, in the near future the coverage of information gaps on alien species richness is likely to remain entirely dependent on the publication and updating of alien species inventories, which reinforces recent calls for the publication of this information . A complementary approach to model species richness and by some, regarded as a potential way forward, is to model and perhaps predict, patterns of alien biodiversity using measures of compositional differentiation and similarity between regions.
Two of our results are also of relevance for descriptive models of alien species richness. First, we found that tree ensembles-type modelling techniques (RF and BRT) are consistently superior in predicting non-transferred values of richness than GLM and MARS. This supports the fact that flexible, non-linear, models are better able to capture information from the data than GLM, a more commonly used technique. The common justification for the use of GLM-type models for analysing alien species richness concerns their ease of interpretation. However, a number of methods have recently been developed to improve the interpretability of tree ensembles (e.g. Fokkema 2017), which may be worth considering in order to further refine our understanding about the relationships between alien species richness and spatial factors. Second, our results also warrant a call for caution in making inferences beyond the geographical (and likely also temporal) range of the data used to calibrate the models, e.g. for future projections. We recommend performing a transferability assessment of the models, as allowed, for instance, by regional cross-validation, in order to confirm the generality of the relationships identified.
in part III, Hui and Richardson present a synthesis: in Chapter 10, complex adaptive networks are put forward as an approach to uniting the concepts of invasiveness and invasibility; and finally Chapter 11 explores how we can manage biological invasions in the Anthropocene.
The mathematical, theoretical grounding of much of invasion biology is in general a strong point of the book, although the sometimes "maths-heavy" nature of the text may not be for everyone. Another important plus is the recognition that invasion biology is not an independent discipline, but is rather built upon general theories and models developed in the wider field of ecology, which are frequently referred to and invoked throughout. It is somewhat refreshing to read an invasion biology book that does not appear to separate the discipline from the rest of ecology, but instead acknowledges ecological theory as being fundamental to furthering our understanding of invasions. The book is richly illustrated with figures drawn from invasion literature, often representing examples of specific invasions. Invasion Dynamics is replete with references that, on the whole, represent a fair reflection of this burgeoning field. However, a slight downside to the book is that, apart from a few extra plates, the book is entirely in monochrome. Some figures may well benefit from being in colour and it appears this was realised to some extent with the insertion of a few colour plates. The three sections (spread, impact, synthesis) are helpful, though the synthesis section seems a little short with only two chapters. The chapter on complex adaptive networks is a useful and fascinating starting point, though it is heavy on general theory and lighter on its application to unite invasiveness and invasibility; this probably simply reflects the nascence of this very interesting research direction.
Overall, Invasion Dynamics is a thorough and commendable attempt to give invasion biology a mathematically rigorous basis. The attempt is largely successful, although this book is perhaps most suitable for advanced (and mathematically "savvy") researchers in the field.