Modelling the distribution of the invasive Roesel's bush-cricket (Metrioptera roeselii) in a fragmented landscape

The development of conservation strategies to mitigate the impact of invasive species requires knowledge of the species' ecology and distribution. This is, however, often lacking, as collecting biological data may be both time-consuming and resource-intensive. Species distribution models can offer a solution to this dilemma by analysing the species-environment relationship with the help of geographic information systems (GIS). In this study, we model the distribution of the non-native bush-cricket Metrioptera roeselii in the agricultural landscape of mid-Sweden, where the species has been rapidly expanding its range since the 1990s. We extract ecologically relevant landscape variables from Swedish CORINE land-cover maps and use species presence-absence data from large-scale surveys to construct a species distribution model (SDM). The aim of the study is to increase knowledge of the species' range expansion pattern by examining how its distribution is affected by landscape composition and structure, and to evaluate SDM performance at two different spatial scales. We found that models including data on a scale of 1 × 1 km were able to explain more of the variation in species distribution than those on the local scale (10 m buffer on each side of the surveyed road). The amount of grassland in the landscape, estimated from the area of arable land, pasture and rural settlements, was a good predictor of the presence of the species on both scales. The measurements of landscape structure (linear elements and fragmentation) gave ambivalent results that differed from previous small-scale studies on species dispersal behaviour and occupancy patterns. The models had good predictive ability and showed that areas dominated by agricultural fields and their associated grassland edges have a high probability of being colonised by the species. Our study identified important landscape variables that explain the distribution of M. roeselii in mid-Sweden and that may also be important to other range-expanding orthopteran species. This work will serve as a foundation for future analyses of species spread and ecological processes during range expansion.


Introduction
The development of effective strategies to manage the spread of invasive organisms requires data on species habitat preferences and knowledge of how landscape characteristics influence species dispersal and establishment (Cote and Reynolds 2002; Rosin et al. 2011). However, the collection of fine-detailed distribution data over large scales is time-consuming and logistically challenging, hence data are missing for many species (Jimenez-Valverde et al. 2008). Management decisions often have to be taken swiftly (Morueta-Holme et al. 2010), and species distribution modelling becomes a handy tool when dealing with limited observation data and large spatial and temporal extents (Guisan and Thuiller 2005). By modelling species distribution as a function of ecologically relevant data on climate conditions and/or landscape characteristics, it is possible to describe occupancy patterns and predict species range expansions (Hein et al. 2007; Early et al. 2008; De Groot et al. 2009; Bonter et al. 2010). Estimates of current and future species distributions rely on: (1) the strength of the relationship between environmental variables and the organism in question (Cote and Reynolds 2002), and (2) the availability of ecologically relevant environmental data that can be applied at a range of geographic scales (Scott et al. 2002). It is also important to consider the impact of scale on the performance of the models (Scott et al. 2002), i.e. we need to know which environmental predictors give the best estimates of species presence at a given spatial scale.
Some species of orthopterans (grasshoppers and bush-crickets) have recently shown a rapid response to changed environmental conditions and are invading new areas outside their common range (Sword et al. 2008; Bazazi et al. 2011). Orthopterans are well suited for studying distribution patterns across a range of spatial and temporal scales, because they are relatively easy to survey and their ecology is well studied (Ingrisch and Köhler 1998; Gwynne 2001; Hein et al. 2003; Holzhauer et al. 2006). Metrioptera roeselii is an example of a range-expanding species in northern Europe (Simmons and Thomas 2004; Gardiner 2009; Hochkirch and Damerau 2009; Species Gateway 2010). Detailed studies on the species' ecology (e.g. Ingrisch 1984; Berggren et al. 2001; Poniatowski and Fartmann 2005; Holzhauer et al. 2006) and movement behaviour (Berggren et al. 2002; Berggren 2004, 2005) have increased the understanding of how M. roeselii responds to local biotic and abiotic factors. However, it is currently unknown which of these factors shape the regional occupancy pattern of M. roeselii, and to what extent readily available landscape data can be used to predict the regional distribution of the species. The aim of this study is to model the distribution of M. roeselii at a large scale (>2000 km²) using species presence-absence data from field surveys and digital landscape data available from the national cartographic agency. Since the predictive ability of occupancy models is known to be scale-sensitive (Scott et al. 2002), we model the distribution of M. roeselii at two different spatial scales ('landscape' and 'local' scale) and compare model performance. At the 'landscape' scale, we measure landscape composition and structure, factors that affect colonisation and establishment of populations (Werling and Gratton 2008). At the 'local' scale, we use land cover type as a predictor of species occurrence, as it is thought to closely reflect species habitat requirements (Hirzel and Le Lay 2008).
The questions we sought to answer in this study were: (1) is there any difference in predictive ability between models that use landscape composition and structure and those that only include local land cover type to explain the distribution pattern of M. roeselii, and (2) which landscape variables best explain the occurrence of M. roeselii, and are these variables consistent between the landscape and local scale?

Study species
Metrioptera roeselii (Orthoptera: Tettigoniidae) (Hagenbach 1822) is a small (12-18 mm), predominantly short-winged and flightless bush-cricket commonly found in grasslands of central and northern Europe (Bellmann 2006). In Sweden, M. roeselii occurs mainly in the Lake Mälaren region, and the position of the population core area suggests that the species has been introduced via sea cargo (de Jong and Kindvall 1991). There are indications that the expansion of M. roeselii may cause the displacement of a native orthopteran species (Berggren and Low 2004), but its impact on the insect community as a whole is largely unknown. Metrioptera roeselii is an omnivorous generalist that prefers tall grassland habitats. In the agricultural landscape the species is found in extensively grazed pastures, leys, grassy field margins, ditches, and road verges (Marshall and Haes 1988; Berggren et al. 2001). Forests, arable crop fields and intensively grazed pastures are considered to be unsuitable habitat for the species, and urban areas are usually avoided (Ingrisch and Köhler 1998; de Jong and Kindvall 1991; Wissmann et al. 2009).
The reproductive season of M. roeselii in Scandinavia is between July and September. Males stridulate to attract females, and the species-specific call makes the species easy to census (Marshall and Haes 1988). Metrioptera roeselii is a wing-polymorphic species; extremely favourable weather conditions (mild springs and hot summers) and high population densities trigger the development of long-winged morphs (macropters) (Poniatowski and Fartmann 2010). However, in normal years and at range margins the proportion of macropters in M. roeselii populations rarely exceeds two percent, and the vast majority of individuals disperse by walking and jumping (Vickery 1965; Wissmann et al. 2009; pers. obs.).

Data collection
During 2008 and 2009 we surveyed an area of 2554 km² in the Lake Mälaren region (mid-point 59°44'N, 16°52'E) for the presence of M. roeselii (Fig. 1). The landscape in this region consists of a mosaic of agricultural land (46%), forest (43%), scattered settlements and small towns (5%), lakes and waterways (3%) and a small proportion of other land use types (3%). In our surveys we sampled the land cover types proportionally to their occurrence in the landscape. We used known locations of M. roeselii (de Jong and Kindvall 1991; Berggren et al. 2001; Species Gateway 2010) as starting points for our surveys and surveyed the wider surroundings to map the current distribution of the species. We conducted auditory surveys by car (de Jong and Kindvall 1991; Berggren et al. 2001) on sunny days between 10 am and 5 pm, from mid-July until the end of August. Since the species' call is strong and can be heard over distances of approximately 10 m (Fischer et al. 1997; Bellmann 2006), it is possible to listen for stridulating males from the car window while driving slowly (~30 km/h) along countryside roads (Berggren et al. 2001). We recorded our survey routes and observations of M. roeselii using a GPS (Garmin 60XL).

Variable selection
We used ArcGIS 9.2 (ESRI 2006) to plot and analyse the survey and landscape data. Information on landscape structure and landscape composition was extracted from a topographic map (Geographic Sweden Data (GSD) 1:50 000) and a Swedish CORINE (Coordination of Information on the Environment) land cover map (resolution 30 × 30 m), both available from the Swedish mapping, cadastral and land registration authority. We analysed the effect of landscape variables on the species' distribution at two spatial scales: the landscape and the local scale. We placed a 1 × 1 km grid across the study area to create presence-absence squares from the species survey data and to define the units in which we measured the predictor variables for the landscape-scale analysis (Fig. 1). For the analysis at the local scale we used the same 1 × 1 km grid for the species data but extracted the land use data from a 10 m wide buffer strip running parallel to each side of the surveyed roads (i.e. the search area). We compared the models from the search area with the models at the landscape scale to test whether land use had similar effects on species occurrence at a larger spatial scale.
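The gridding step described above was carried out in ArcGIS; the following Python sketch only illustrates the underlying idea of assigning GPS observations to 1 × 1 km squares. It assumes that coordinates have already been projected to a metric grid, and the coordinate values and function names are hypothetical.

```python
import math

CELL_SIZE_M = 1000  # side length of a grid square in metres (1 x 1 km)


def grid_square(easting_m, northing_m, cell_size_m=CELL_SIZE_M):
    """Return the (column, row) index of the square containing a projected point."""
    return (math.floor(easting_m / cell_size_m),
            math.floor(northing_m / cell_size_m))


def count_per_square(observations, cell_size_m=CELL_SIZE_M):
    """Count the observations that fall into each grid square."""
    counts = {}
    for easting_m, northing_m in observations:
        square = grid_square(easting_m, northing_m, cell_size_m)
        counts[square] = counts.get(square, 0) + 1
    return counts


# Hypothetical projected coordinates (metres) of calling males.
observations = [(1500.0, 2500.0), (1900.0, 2100.0), (3200.0, 2999.9)]
counts = count_per_square(observations)
```

The resulting dictionary maps each occupied square to its number of observations, which is the quantity the presence criterion in the next paragraph operates on.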
The distribution of M. roeselii was treated as presence-absence data within the 1 × 1 km squares for both spatial scales of the analysis (n = 874 in total, with 318 absence and 556 presence squares). Squares where M. roeselii was absent were only included in the analysis if they were adjacent to a presence square. Based on our knowledge of the species' dispersal behaviour (Berggren et al. 2001, 2002), we excluded distant and isolated absence squares from the analysis because we considered those squares to lie outside the species' immediate colonisable area. We chose this conservative approach in order to minimise the number of false absences in the data, which would otherwise inflate the omission error and lower the accuracy of the models (Guisan and Thuiller 2005). Because we were primarily interested in modelling the distribution of populations rather than dispersing individuals, we only included squares in the analysis that contained at least two observations of male M. roeselii. Previous studies have shown that the species has a good colonising ability and that propagules consisting of two males and two females can found sustainable populations (Berggren 2001). Because survey length affects detection probability of the species, we used survey length as a covariate in all models, and only included squares in the analyses in which more than 100 m of road was surveyed.
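The selection rules above can be sketched in Python as follows. This is a minimal illustration, not the procedure used in the study: the 8-neighbour definition of "adjacent" and the decision to drop squares with exactly one male observation are assumptions, and all data values and function names are hypothetical.

```python
def neighbours(square):
    """Return the eight squares surrounding a (column, row) square (assumed adjacency rule)."""
    col, row = square
    return {(col + dc, row + dr)
            for dc in (-1, 0, 1) for dr in (-1, 0, 1)
            if (dc, dr) != (0, 0)}


def select_squares(male_counts, surveyed_m, min_males=2, min_road_m=100):
    """Return a dict mapping each retained square to 1 (presence) or 0 (absence).

    male_counts: square -> number of calling males observed
    surveyed_m:  square -> metres of road surveyed in that square
    """
    # Keep only squares with more than 100 m of surveyed road.
    eligible = {sq for sq, metres in surveyed_m.items() if metres > min_road_m}
    # Presence requires at least two male observations.
    presence = {sq for sq in eligible if male_counts.get(sq, 0) >= min_males}
    # Assumption: only squares without any observation count as absences.
    empty = {sq for sq in eligible if male_counts.get(sq, 0) == 0}
    selected = {sq: 1 for sq in presence}
    for sq in empty:
        # Retain an absence square only if it borders a presence square.
        if neighbours(sq) & presence:
            selected[sq] = 0
    return selected


# Hypothetical survey summary for five squares.
male_counts = {(0, 0): 3, (1, 0): 0, (5, 5): 0, (0, 1): 1, (1, 1): 4}
surveyed_m = {(0, 0): 400, (1, 0): 250, (5, 5): 800, (0, 1): 300, (1, 1): 80}
selected = select_squares(male_counts, surveyed_m)
```

In this toy example the isolated empty square, the square with a single male and the square with too little surveyed road are all dropped, leaving one presence square and one adjacent absence square.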
We used GIS to extract landscape variables that are of ecological relevance for M. roeselii (Berggren et al. 2001; Berggren et al. 2002; Berggren 2004) and that represent predefined categories in the maps that we used. The land cover categories were generic and consisted of sub-categories of land-use types that resembled each other in terms of vegetation and management type: (1) arable land (under crop rotation; includes cultivation of cereals, fodder and root crops, and fallow land), (2) forest (includes broadleaved, coniferous and mixed forest, clear-cuts and young plantations), (3) pasture (includes dense herbaceous vegetation dominated by grasses under different grazing regimes), (4) urban areas (includes land with buildings and other man-made structures, small towns and villages), (5) rural settlements (includes solitary houses and farm buildings surrounded by grasslands and gardens), (6) linear elements (combined lengths of streams and roads), and (7) number of fragments of arable land (see Table 1).
We used Pearson's product-moment correlations to test for relationships between landscape variables using JMP version 8.0.1 (SAS Institute Inc. 2009). Arable land and forest were highly negatively correlated (r = -0.86, p < 0.0001), suggesting they are mutually exclusive in the landscape. Thus, we chose to exclude forest and include arable land in the analyses, as previous studies have shown that M. roeselii does not occur in forest areas or on arable land under intensive cultivation but occurs and spreads along grassy field margins (Ingrisch and Köhler 1998; Berggren et al. 2001). Linear elements were positively correlated with urban areas (r = 0.56, p < 0.0001), as road length increases with urban development. We excluded urban areas from the analyses since we know from personal observations and records in the national species database (Species Gateway 2010) that M. roeselii is rarely found in urban areas due to the lack of suitable habitat. All other landscape variables showed low to moderate r-values (r ≤ 0.3) and were included in the analyses. Moran's I values indicated that the response variable was spatially structured, which would cause our estimates of variable significance in the models to be exaggerated (Legendre 1993). However, our primary aim was not to elicit precise species-habitat relationships but rather to produce a generally applicable model to predict the species' distribution over a large spatial extent. We therefore chose a non-spatial modelling approach over explicitly accounting for spatial dependency in the species distribution model.
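The correlation screening above was done in JMP; as a rough illustration of the same idea, the Python sketch below computes pairwise Pearson correlations and flags strongly correlated variable pairs. The flagging threshold of 0.5 is an assumption chosen for the example (the text reports only the observed r-values), and the land-cover proportions are invented.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlated_pairs(variables, threshold=0.5):
    """List variable pairs whose absolute correlation exceeds the (assumed) threshold."""
    names = sorted(variables)
    flagged = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            r = pearson_r(variables[first], variables[second])
            if abs(r) > threshold:
                flagged.append((first, second, round(r, 2)))
    return flagged


# Hypothetical proportions of land cover in five grid squares.
variables = {
    "arable": [0.8, 0.6, 0.4, 0.2, 0.1],
    "forest": [0.1, 0.3, 0.5, 0.7, 0.9],
    "pasture": [0.05, 0.02, 0.06, 0.03, 0.04],
}
flagged = correlated_pairs(variables)
```

For each flagged pair, one variable would then be dropped on ecological grounds, as was done here for forest and urban areas.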

Statistical analyses
We used logistic regression models to investigate the relationship between the landscape variables and M. roeselii occurrence at two scales: the landscape scale (1 × 1 km units) and the local scale (10 m area on either side of surveyed roads). For both analyses a balanced set of candidate models was considered (i.e. all possible combinations of the variables of interest), and these were ranked according to the relative strength of support for each model using Akaike's information criterion (AIC). When there was no clear best model, we used AIC weights (ωi) to generate weighted model-averaged parameter estimates by including all models within 5 AIC units (Σ ωi = 0.95) of the highest-ranked model (Burnham and Anderson 2002). We also estimated the relative importance of the predictor variables by summing the AIC weights over all the models in which the variable was contained (Burnham and Anderson 2002). Parameter estimates and AIC for all models were calculated using the 'glm' function in the R 2.8.1 software (R Development Core Team 2008). We used v-fold cross-validation (Witten and Frank 2000) to evaluate the prediction accuracy of the highest-ranked models from our analyses (i.e. survey scale and landscape scale). Of the 874 survey squares, 80% were randomly sub-sampled as the training set and used to parameterise the model. The coefficients of this model were then used to derive probabilities of occurrence for the remaining 20% of the survey squares. Among the data partitioning methods used in model evaluation (Fielding and Bell 1997), this ratio of 80% training and 20% test data has previously been found useful (Dormann et al. 2008). The square-specific probabilities were used to calculate a random draw from a Bernoulli probability distribution for each square to produce a prediction (0 or 1), and these were compared to the observed data in the validation set (0 or 1) for each square. Differences in observation versus prediction were then recorded as a proportion of mismatches for the training data set. This was repeated 1000 times, with the proportion of mismatches being modelled as a distribution of errors, i.e. the proportional deviation of the predicted versus the observed, similar to a probability density curve. The median and 95% confidence intervals of these errors were then calculated using the empirical cumulative distribution function (ecdf) in R 2.13.1 (R Development Core Team 2009).
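The AIC-based ranking described above follows standard formulas (Burnham and Anderson 2002): each model's weight is proportional to exp(-ΔAIC/2), and a variable's relative importance is the sum of the weights of the models that contain it. The Python sketch below illustrates these two calculations; the AIC values and model compositions are hypothetical and are not taken from Table 2.

```python
import math


def akaike_weights(aic_values):
    """Convert AIC values into Akaike weights that sum to one."""
    best = min(aic_values)
    relative_likelihoods = [math.exp(-0.5 * (aic - best)) for aic in aic_values]
    total = sum(relative_likelihoods)
    return [value / total for value in relative_likelihoods]


def relative_importance(model_terms, weights):
    """Sum the Akaike weights of all models in which each variable appears."""
    importance = {}
    for terms, weight in zip(model_terms, weights):
        for term in terms:
            importance[term] = importance.get(term, 0.0) + weight
    return importance


# Hypothetical candidate models and AIC values (illustration only).
model_terms = [("Ara", "Rural"), ("Ara",), ("Ara", "Pas")]
aic_values = [100.0, 102.0, 110.0]

weights = akaike_weights(aic_values)
importance = relative_importance(model_terms, weights)
```

A variable that appears in every supported model, such as "Ara" in this toy set, receives a relative importance of 1.0.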

Results
Models at the landscape scale had lower AIC values when compared to equivalent models at the local scale (Table 2), suggesting that variables measured at the landscape scale were better predictors of M. roeselii presence than those measured in the immediate survey area (local scale). There was strong support for arable land as an important positive predictor for this species, as it was the only variable present in all models with AIC support (Table 2). By comparing different scales in the analyses (landscape versus local) we show that the habitat variables were differently associated with the species' presence depending on the spatial scale at which they were measured (Tables 2 and 3).
At the landscape scale, M. roeselii presence was best explained by the full model, containing arable land, rural settlements, pasture, number of arable land fragments and linear elements (Table 2). The second- and third-ranked models differed in either number of fragments or linear elements, suggesting that structural landscape variables had weaker support in explaining M. roeselii occurrence. Contrary to expectation, occurrence of M. roeselii was negatively correlated with the amount of pasture and linear elements, and positively correlated with the number of fragments of arable land (Table 3). The three land-use variables (arable land, rural settlements and pasture) had the highest relative-importance weights (1.0), followed by linear elements (0.927) and number of fragments (0.778).
At the local scale, the two highest-ranked models contained arable land and rural settlements. This, in combination with their relative-importance weights (1.0 and 0.905, respectively), demonstrates the strong support for them as positive predictors (Tables 2 and 3). Although pasture was included in the second-highest-ranked model, an examination of Table 2 shows that its inclusion in models generally results in a lower ranking than models without it, suggesting very weak support for it as a predictor of M. roeselii presence (relative importance of pasture = 0.353).
Cross-validation showed that the models were generally accurate in their predictions of species occurrence across the spatial scales for the environmental gradients examined in the study. The landscape-level model prediction for the probability of M. roeselii being detected in a square had an error that ranged from -0.091 to +0.080 (95% CI; Fig. 2a). At the survey scale, model prediction error for the probability of detection ranged from -0.075 to +0.097 (95% CI; Fig. 2b).
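To make the origin of these error intervals concrete, the Python sketch below mimics the repeated 80/20 resampling procedure under strong simplifications. The fitted logistic regression is replaced by an intercept-only stand-in, whose maximum-likelihood occurrence probability equals the training-set prevalence; the error is assumed to be the signed difference between predicted and observed proportions of occupied squares; and the quantile rule and random seed are arbitrary choices. Only the numbers of presence (556) and absence (318) squares come from the study.

```python
import random


def signed_error(probabilities, observed, rng):
    """Draw Bernoulli predictions and return (predicted - observed) / n (assumed error measure)."""
    predicted = [1 if rng.random() < p else 0 for p in probabilities]
    return (sum(predicted) - sum(observed)) / len(observed)


def quantile(sorted_values, q):
    """Simple empirical quantile of an already sorted list."""
    index = round(q * (len(sorted_values) - 1))
    return sorted_values[index]


def cross_validate(observed, repeats=1000, train_fraction=0.8, seed=1):
    """Return the 2.5%, 50% and 97.5% quantiles of the prediction error."""
    rng = random.Random(seed)
    n_train = int(len(observed) * train_fraction)
    errors = []
    for _ in range(repeats):
        shuffled = observed[:]
        rng.shuffle(shuffled)
        train, test = shuffled[:n_train], shuffled[n_train:]
        # Stand-in for the fitted model: intercept-only occurrence probability.
        p_occurrence = sum(train) / len(train)
        errors.append(signed_error([p_occurrence] * len(test), test, rng))
    errors.sort()
    return quantile(errors, 0.025), quantile(errors, 0.5), quantile(errors, 0.975)


# Presence (1) and absence (0) squares as reported in the study.
squares = [1] * 556 + [0] * 318
lower, median, upper = cross_validate(squares)
```

With covariates in the model, the constant probability would be replaced by square-specific probabilities from the fitted coefficients; the resampling and quantile steps would stay the same.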

Discussion
In our study, the distribution of M. roeselii was best explained by models at the landscape scale. This indicates that measuring the landscape characteristics within 1 × 1 km units captures the availability of habitat for the species and incorporates ecological functions of the landscape features (Crawford and Hoagland 2010). The weaker relationship between land use and species occurrence at the local scale could be attributable to the coarse grain size of the land-cover data failing to capture local aspects of habitat quality, i.e. vegetation heterogeneity, microclimate (Gardiner and Dover 2008) and its temporal variability (Gardiner et al. 2008; Poniatowski and Fartmann 2008), as well as important biotic interactions (Huston 2002) that influence the distribution of the species. Our study shows that landscape data extracted from digital map sources can be used to explain the regional distribution pattern of this expanding species. Determining biologically important variables and the optimal spatial scale is a prerequisite for predicting the likelihood of occurrence of a species in non-surveyed sites at a resolution of 1 km², and forms the basis for monitoring species spread, serving conservation planning and future research on the spatial processes shaping species distributions. The models can also be further developed and used for region-wide predictions in areas similar to the study area, assisting in devising management actions and possible control of undesired species expansion (Hutto and Young 2002; Scott et al. 2002). However, extrapolation of model results should be treated with caution. Abiotic factors such as land cover can generally be applied only within a limited spatial extent and time frame because the same variables can differ in habitat suitability; moreover, the same species may respond to different sets of variables in different parts of its distributional range (Guisan and Zimmerman 2000).
When modelling species distributions in fragmented landscapes it is important to incorporate the landscape structure into the analyses (Umetsu et al. 2008). The number of fragments of arable land was a positive predictor of the occurrence of M. roeselii, indicating that field margins offer important edge habitat and serve as dispersal paths in the agricultural landscape (Berggren et al. 2001). Similar dispersal behaviour has been observed in the wood cricket Nemobius sylvestris, which moves along habitat edges (Brouwers et al. 2011). Contrary to expectations, linear landscape elements (roads and streams) had a negative effect on species occurrence at the landscape scale. One possible explanation is that although linear elements have been associated with increased dispersal opportunities in small-scale studies, at larger scales linear landscape features such as major roads and streams act as a barrier to the species' dispersal if they separate suitable habitat areas (de Jong and Kindvall 1991). Due to the large spatial extent of our study it was not possible for us to explicitly incorporate the spatial configuration and orientation of linear landscape features in the model.
At both spatial scales that we analysed, arable land and rural settlements turned out to be strong predictors of the presence of M. roeselii, suggesting that these land use types can be used as a surrogate measure for grassland habitat in the region. The positive effect of arable land on the occurrence of M. roeselii might be surprising at first, since it is known that M. roeselii avoids crop fields because of the lack of shelter, food and egg-laying places (Ingrisch and Köhler 1998). However, arable land is a generic land use description, and vegetation cover varies with the type of crop cultivated. In Sweden, crop rotation is commonly practised (Söderberg 2006), and arable land temporarily becomes suitable habitat for orthopterans and other grassland-living insects when crop fields are shifted into fallows or leys (Duelli et al. 1999). The ability to track resources is particularly important for species in dynamic landscapes. In areas with intensive agricultural production the grassy field margins and hedgerows often have high species richness and function as dispersal corridors and source habitats for colonisers of crop fields (Marshall and Moonen 2002; Meek et al. 2002). The present findings support our assumption that grassland insects like M. roeselii benefit from habitat heterogeneity in arable landscapes. Braschler et al. (2009) found that cricket (Ensifera) density was higher in fragmented plots, as uncut patches of grassy vegetation play an important role in maintaining insect diversity in the agricultural landscape by offering shelter from predators and serving as mating and egg-laying sites. A previous study by Bieringer and Zulka (2003) showed that orthopteran species richness increases with distance to the forest edge. We believe that the positive effect of arable land in our study was not simply because the bush-crickets avoided forest, but rather because agricultural areas contain a larger amount of suitable grassland vegetation than forests.
In cultivated landscapes, generalist species that are able to occupy a broad range of habitat types are less sensitive to local habitat loss (Marini et al. 2008, 2009a). Metrioptera roeselii is an example of a grassland generalist (Ingrisch and Köhler 1998) colonising a range of grassland types (Gardiner et al. 2008; Poniatowski and Fartmann 2005). Like its close relative M. bicolor (Kindvall 1996), it is able to sustain populations in small patches of habitat. Rural settlements, despite covering only a small area of the landscape, have been shown to provide important habitat for a range of species (Belfrage et al. 2005; Rosin et al. 2011) and may function as source patches for M. roeselii, enabling the species to colonise surrounding areas. Extensive farming practices and small field sizes are positively correlated with habitat heterogeneity, which in turn has a positive effect on the local diversity of species with limited movement ability, like pollinators and grassland-living insects (Benton et al. 2003; Marini et al. 2009b; Steck et al. 2007).
We expected that the amount of pasture and the presence of M. roeselii would be positively correlated on both scales, since M. roeselii has been found to colonise extensively grazed pastures (Poniatowski and Fartmann 2005). The negative relationship between pasture and M. roeselii occurrence at the landscape scale is difficult to interpret. A possible explanation could be that the overall proportion of pasture in the landscape is small and its distribution scattered, which makes it more difficult for the species to colonise.
Species ecology, range size and rarity have an influence on model performance (Franklin et al. 2009; Syphard and Franklin 2009). Results from other studies (Heikkinen et al. 2006; Segurado and Araújo 2004) have shown that specialist species and species with a limited range are generally more accurately modelled than generalist species and species with a wide geographic range; M. roeselii is an example of the latter. The natural dynamics of the study species make it more difficult to model its distribution, because the assumption of the species being in equilibrium with the environment is violated and dispersal contributes to spatial autocorrelation in the data (Franklin et al. 2009). With these limitations in mind, we thoroughly surveyed the range of environmental conditions present in the distribution area, from the core of the study area to the margin, aiming to obtain as large a sample size as possible. Although our surveys were conducted by car, we sampled all important habitat types (arable land, forests, pastures and human settlements) proportionally to their occurrence in the landscape. Aware of the trade-off between model generality, reality and precision (Guisan and Zimmerman 2000), we prioritised generality, as our primary aim in this study was to develop a predictive model for M. roeselii within the study region. The model can be further developed and applied to other grassland insects with similar traits.

Conclusions
Type of land use and structural landscape elements describing the amount of available habitat are important predictors of species occurrences (Hein et al. 2007; Kemp et al. 1990; Crawford and Hoagland 2010). The possibility of modelling M. roeselii distribution using survey data and available land-cover data, on a scale that is easy for managers to extract and utilise, is promising in that it will enable us to predict the direction and possible extent of future range expansion of the species. As many orthopterans disperse and interact with the environment in a similar way (Hjermann and Ims 1996; Diekötter et al. 2007; Brouwers et al. 2011), the results from this study may also be valid for other related species that are now expanding their distribution areas. This is very useful, as many studies on grassland-living insects face a similar dilemma: a limited availability of distribution data for species that are living in highly dynamic landscapes (Marini et al. 2009b). The possibility of utilising available distribution data in combination with land-cover data enables us to improve our understanding of the species' ecology, to highlight areas of conservation concern and to predict species occurrences in a time of environmental change (Bonter et al. 2010).

Figure 2. Cross-validation accuracy of 1000 models using randomly selected training and validation sets (80% and 20%, respectively). The curves show the relative deviation of prediction accuracy when comparing estimated to observed occurrence of Metrioptera roeselii being detected in a square at (a) the survey scale (vertical bars show the 95% CI for model prediction error: -0.075 to +0.097), and (b) the landscape scale (-0.091 to +0.080).

Table 1. Descriptive statistics for the major landscape features and predictor variables used in the regression analyses to explain the distribution of Metrioptera roeselii in south-central Sweden.
Min = Minimum, Max = Maximum, SE = standard error of the mean. † = Number of fragments of arable land, ‡ = the sum of the length of streams and roads.

Table 2. Model selection results for the effect of landscape variables on the occurrence of Metrioptera roeselii. The model selection statistics are number of parameters (K), Akaike's information criterion (AIC), difference between model and minimum AIC values (∆AIC), and AIC weights (ωi). Only models with ∆AIC < 10 are shown.
Abbreviations used for the explanatory variables in the models: Sur = Survey length, Ara = Arable land, Rural = Rural settlements, Pas = Pasture, Lin = Linear elements, Frag = Number of fragments of arable land.

Table 3. AIC-weighted model-averaged parameter estimates generated from the top three models (Σ ωi = 0.95) presented in Table 2.