Predicted distributions could suggest regional sympatry in spite of local allopatry: The case of the dung beetle Canthon rutilans Castelnau, 1840 (Coleoptera: Scarabaeidae)

Studying the environmental characteristics of the localities in which different taxa have been observed could help to estimate probable niche differences. Previous studies on local distribution and reproductive behavior of the two subspecies of Canthon rutilans support their allopatry. Here we estimated the geographical distribution of these taxa to examine the congruency between the environmental characteristics of the occurrence localities derived from geographical and local data, in order to facilitate the future study of the causal factors that are more likely to explain their segregation. To do so, a database including most of the available distributional information about these subspecies was compiled. The data derived from 23 environmental variables in the occurrence cells were used to estimate those variables with different mean values among the two subspecies, and also to generate distributional maps reflecting the probable distribution of the two subspecies. Ten variables have statistically significant different values among both subspecies. Canthon rutilans rutilans would be present in colder places, living in the high elevation localities of the Atlantic Forest or Pampas biomes, whereas C. rutilans cyanescens seems to be restricted to lowland forested areas. Probable distribution maps show geographical sympatry in almost half of the distribution range of C. r. rutilans. As former results suggest that they are in allopatry at a local scale, these results propose that fine‐grain environmental factors would be promoting the segregation of taxa, but also that coarse‐grain data should be used with caution when the aim is to estimate niche differences.


INTRODUCTION
The multidimensional environmental space defined by the values of non-interactive variables in the occurrence localities of a species could help to delimit their Grinnellian niche (Grinnell 1917;Soberón 2007). However, this niche does not have to reflect the full set of conditions at which a species can have a positive demographic rate (Soberón & Peterson 2005;Jiménez-Valverde et al. 2011). Physiological data related with thermal responses could, for example, add supplementary valuable information about the environmental requirements of species, helping to infer more accurately the niche of a species and, in consequence, its potential distribution (Verdú & Lobo 2008;Strangas et al. 2019). Despite the tendency to think that ecological and evolutionary processes act at different timescales, a deep connection between these two types of processes could be the norm (Thompson 1998;Urban et al. 2020). The geographical and temporal patterns of species can thus be understood considering physiological, ecological and evolutionary processes jointly, because all of them can help to understand the probable causes of species distribution limits (Gaston et al. 2009).
Canthon (Francmonrosia) rutilans Castelnau, 1840 is a species of dung beetle in which, in spite of its controversial character (Phillimore & Owens 2006), two subspecies are today recognized considering external morphological characters (Vaz-de-Mello et al. 2014;Vaz-de-Mello 2020): the red colored Canthon rutilans rutilans Castelnau, 1840 and the blue colored Canthon rutilans cyanescens Harold, 1868. Its geographical distribution extends from Brazil in the states of Mato Grosso do Sul, São Paulo, Paraná, Santa Catarina and Rio Grande do Sul to Argentina (Misiones), Uruguay, Paraguay and French Guiana (Vaz-de-Mello et al. 2014;Schoolmeesters 2020). The available information indicates that the subspecies C. r. cyanescens occurs in forest habitats at elevations below 1000 m a. s.l. in the Atlantic Forest biome (Culot et al. 2013;Korasaki et al. 2013;Bogoni & Hernández 2014;Costa-Silva et al. 2014;da Silva & Hernández 2014, 2016Campos & Hernández 2015;Batilani-Filho & Hernández 2017;Hernández et al. 2019). However, C. r. rutilans would inhabit the high-altitude grasslands or forests of the Atlantic Forest Biome situated at more than 1000 m a.s.l. in elevation, in the Pampas biome or in Eucalyptus plantations under cold environmental conditions (da Silva et al. 2008(da Silva et al. , 2009(da Silva et al. , 2012Hernández & Vaz-de-Mello 2009;Audino et al. 2011;da Silva 2017;Hensen et al. 2018). A study using fine-grain survey data (≈1 km resolution) suggests that these two subspecies probably do not overlap spatiotemporally . This supposition is based on the data obtained by a yearly survey in six sites located along an elevational gradient from 200 to 1600 m a.s.l. with an approximate length of 100 km. Although further studies will be necessary to corroborate this spatial segregation, recent laboratory breeding experiments (Hensen et al. 2020) also suggest that these two subspecies could have different thermal preferences in their reproductive behavior and larval development.
In this study we use the environmental information of the available occurrence localities to assess if the probable environmental segregation previously detected at the local scale is corroborated when coarse distribution data are used. Additionally, we used a simple recently proposed approach directed to model the distribution of species (García-Roselló et al. 2019) to generate spatial predictions about the whole geographical distribution of these two subspecies. Geographical patterns are useful to infer the processes that have caused the separation of related taxa (Hausdorf & Hennig 2020), but determining these patterns depends on the considered grain. Fine-grained data derived from an ecological study could support the existence of allopatric distributions, as in this case, while biogeographical data obtained at a coarser resolution could suggest sympatry. Thus, young sister taxa could show a greater or lesser degree of sympatric co-occurrence depending of the grain size (Kirchheimer et al. 2015). We hypothesize here that a high degree of range overlap at coarser grain would suggest the existence of fine-grain local environmental factors able to facilitate the observed divergence, whereas a low degree of geographical overlap would support the existence of segregation mainly due to mesoenvironmental factors acting at a coarser scale (Anacker & Strauss 2014). Thus, we aim to estimate in this study the degree of environmental and geographical overlap of these two subspecies in order to examine the congruency between the environmental characteristics of the occurrence localities derived from geographical and local data, and thus facilitate the future study of the causal factors that are more likely to explain their segregation.  Grosso, Cuiabá). The data on this species in the Brazilian collections account for the great majority of the reliable records available, according to the opinion of the largest specialist of this group in the Neotropical region (Fernando Zagury Vaz-de-Mello, pers. comm. 2020). We gathered the information for the entomological label for all the specimens identified as Canthon rutilans, and we also took three photographs of each specimen using a digital camera (Canon Eos Rebel T3). A maximum of ten individuals per locality were included in the database in order to minimize the use of redundant data. If the entomological label only had the name of the place recorded but not the geographical coordinate, we obtained it through Google Earth Pro (https://www. google.es/earth/). All the occurrences located at less than 1 km of distance from each other were considered as single occurrences.

Origin of environmental data
A total of 23 environmental variables at a resolution of 2.5 0 (approximately 25 km 2 ) were used as predictors: two topographic (elevation and elevation range) and 21 bioclimatic variables. Topographic variables were extracted from a digital elevation model downloaded from the USGS EROS Data Center (http://eros.usgs. gov/). Bioclimatic variables come from WorldClim (see https://www.worldclim.org; Hijmans et al. 2005) adding aridity and continentality as calculated by Valencia-Barrera et al. (2002).

Discriminatory variables
To recognize the environmental variables with differing values in the occurrence cells at which both subspecies are observed, Student's t-test was used applying a Bonferroni corrected P-value for multiple comparisons (P = 0.05/23 = 0.002). A backward stepwise discriminant function analysis was additionally carried out (P = 0.002 both for F to enter and F to remove) to determine which variables best discriminate between the occurrence cells of the two subspecies.

Estimation of geographical distributions
A simple procedure was used to provide a geographical representation of the suitable and accessible areas with environmental conditions similar to those existing in the observed occurrence areas at a resolution of 2.5 0 (a cell of approximately 5 × 5 km). This resolution is a compromise between the resolution of the available digital environmental information and the precision of the geographical information included in the entomological labels. A simple protocol has been followed for this purpose (Lobo 2016), avoiding the use of complex modelling techniques and the so-called background absences because our goal is to obtain coarse-grained distributional hypotheses directed to estimate the overlap in the range of both subspecies. The complete procedure, called NOO (Niche of Occurrence), is included within the ModestR software version 5.4 (García-Roselló et al. 2013. NOO is fully described in García-Roselló et al. (2019) and a step-by-step tutorial is provided at the website http://www.ipez.es/modestr/ Manual_Tutorial.html. Briefly, the procedure can be summarized in three steps. First, the distributional extent or accessible area of each taxon was delimited as the one composed by the set of river basins of lower level with presence observations that, in turn, enable the connection of all the available occurrences. Watershed information provided by the WaterBase project (http://www.waterbase.org) is used for this purpose.
Subsequently, the most relevant predictor variables within this area are selected, eliminating sequentially those with a variance inflation factor lower than five (Dormann et al. 2013), and submitting the remaining predictors to an instability index (InsInd) able to select the variables with a high capacity of discriminating the occurrence cells in the selected distributional extent (Guisande et al. 2017). The variables with a higher percentage contribution to InsInd are those that better discriminate between the conditions in the cells of presence and the cells of the selected distributional extent. The variables finally selected are those that represent an accumulated percentage contribution of 90% (see Guisande et al. 2017). NOO later generates a compound environmental layer (CEL) using polar coordinates with the sole purpose of including in a single file all the information about the relevant predictors at the desired geographical extent. Subsequently, the occurrence cells of the target species are projected onto the CEL and a kernel density estimation calculated, which reflects the intensity of the presence data in the environmental space. Default smoothing and tolerance values are used at this step (1 and 1%, respectively; see García-Roselló et al. 2019). Smoothing determines the kernel spread around presence areas, while tolerance is the percentage of increase at which minimum and maximum values of the variables in the occurrences increased. Finally, the values of the environmental variables constituting the CEL, ranging from maximum to minimum values attained in species presences ("minimal density at presence" default option), are retained and projected geographically. Thus, all the cells with environmental conditions similar to those existing in the occurrence localities are discriminated along the previously selected geographical extent, and a binary distribution map representing suitable and unsuitable cells was produced.

RESULTS
Data coming from a total of 192 individuals of C. r. rutilans belonging to 46 different 2.5 0 cells and 523 individuals of C. r. cyanescens belonging to 120 different 2.5 0 cells were compiled during visits to the entomological collections (see Supporting Information; Appendix S1 for collections and Appendix S2 for localities). Mean values for 10 of the 23 considered environmental variables significantly differed between the localities in which both taxa have been observed (Table 1), showing that C. r. cyanescens specimens occur in lowland localities with higher temperature values and lower aridity, opposite to C. r. rutilans. For example, the annual mean temperature of the occurrence cells of C. r. rutilans is 17.13 ± 2.34 C (mean ± SD), and 19.20 ± 1.57 C in the case of C. r. cyanescens. None of the precipitation-related variables differed between the two subspecies (Table 1). Although most of the variables that differ between the two subspecies are related to temperature, the backward stepwise discriminant function analysis selects annual mean temperature (standardized coefficient of 0.86) and two precipitation-related variables: precipitation of the wettest quarter and precipitation of the warmest quarter (standardized coefficients of −3.32 and 3.21, respectively). The obtained function is able to correctly discriminate 83.9% of the considered occurrences.
A visual inspection of the occurrences belonging to the two subspecies shows that both partially share their geographical distribution (Fig. 1). The NOO process of variable selection allowed us to obtain four relevant variables for C. r. rutilans: mean temperature of the warmest quarter (28.0% contribution percentage), mean diurnal range (26.3%), mean temperature of the wettest quarter (23.2%) and annual precipitation (13.8%). In the case of C. r. cyanescens, five variables were selected: mean diurnal range (33.7% contribution percentage), mean temperature of the warmest quarter (24.7%), precipitation of the wettest month (17.1%), elevation range (10.7%) and temperature seasonality (7.4%). These variables generate distributional predictions for each subspecies (Fig. 2) in which the sympatric area (168 342 km 2 ) represents approximately 44% of the predicted distributional area of C. r. rutilans (383 306 km 2 ), and 26% of the predicted area for C. r. cyanescens (637 752 km 2 ).The predicted distributional area of C. r. rutilans included the states of Espírito Santo, Rio de Janeiro, Minas Gerais, Goiás, São Paulo, Paraná, Santa Catarina, Rio Grande do Sul, wide areas of Uruguay, and some regions of southeastern Paraguay and north of Argentina. The predicted distributional area of C. r. cyanescens comprised the states of Rio de Janeiro, Minas Gerais, São Paulo, Mato Grosso do Sul, Paraná, Santa Catarina, Rio Grande do Sul, and some regions of northern Argentina and south-eastern Paraguay (Fig. 2).

DISCUSSION
Our results show that the mean values of almost half of the considered environmental variables differ between the occurrence localities of the two subspecies of C. rutilans. Despite the proximity of some occurrence localities between both subspecies, these could probably differ in elevation and temperature; C. r. rutilans is the subspecies more related to high elevation in cooler regions than C. r. cyanescens. Precipitation variables appear to be highly relevant when the predictors are sequentially eliminated. As stepwise procedures are influenced by the presence of collinear predictors, the high correlation between temperature variables (mean = 0.752; maximum = 0.969; minimum = 0.386) allows the inclusion of precipitation Figure 1 Occurrence localities of Canthon rutilans in South America (A), as well as those corresponding to C. r. rutilans (red) (B) and C. r. cyanescens (blue) (C). variables. Only those variables able to yield maximum discrimination independently of their true statistical significance are included in the results of stepwise discrimination function analyses (see Legendre & Legendre 2012). In our case, standardized coefficients indicate that C. r. cyanescens would inhabit the hottest localities with a higher precipitation during the warmest season but with a lower rainfall in the wettest period. The species that occur at low and high elevations in tropical biomes often have large differences in their climatic tolerances compared to those inhabiting temperate areas (Janzen 1967;Linck et al. 2020). Hence, the differences in the environmental characteristics of the occurrences as suggested by the use of coarse-grain geographical data are congruent with the patterns recognized at finegrain local scale .
Our results indicate that the two subspecies can partially be environmentally and geographically discriminated using coarse-grain data, and that predictions based on this kind of data can offer an approximate picture about the environmental determinism of these distributions (Guisan et al. 2007). However, both the maps displaying occurrences (Fig. 1) and those reflecting probable distributions (Fig. 2) show that the area of geographical overlap is relatively high between both subspecies, which means that they may appear sympatric when a coarse-grain scale is considered (Connor et al. 2018). Almost half of the distribution range of C. r. rutilans would be shared with C. r. cyanescens; these sympatric areas are located at intermediate elevations (≈603 ± 250 m a.s.l.). This result suggests the existence of fine-grain local environmental factors that facilitate the regional coexistence of the two subspecies (Anacker & Strauss 2014). What could these local environmental drivers be? We suspect that the fine-scale microclimatic variations existing within a single 2.5 0 cell might be sufficient to explain the coexistence of the two subspecies at the considered coarse-grain scale (Pincebourde et al. 2016). Further studies will be necessary to confirm the fine-grain segregation of these two subspecies in the sympatric cells detected here. Furthermore, as in the case of Rebaudo et al. (2016), this result also questions the ability of  coarse-grain geographical data to adequately identify niche requirements.
Independently of the degree of geographical sympatry, occurrence maps for these two subspecies show that C. r. rutilans extends its distribution further south and north, and C. r. cyanescens further west of southern South America. Obviously, the predicted geographical representations could vary depending on the predictors and the modelling method. Here, a simple modelling procedure was used that is able to provide a distributional hypothesis about the suitable and accessible areas with environmental conditions similar to those existing in the observed occurrence areas (García-Roselló et al. 2019). Most part of the distributional area of both subspecies was found in areas of the Atlantic Forest, and the particular orography of this region allows different types of environments (Instituto Brasileiro de Geografia e Estatística 1992). The distribution of C. r. cyanescens is related to the rainforest, which explains its limitation of southward displacement (Korasaki et al. 2013;Bogoni & Hernández 2014;Costa-Silva et al. 2014;da Silva & Hernández 2014, 2016Campos & Hernández 2015;da Silva et al. 2018;Hernández et al. 2019). However, C. r. rutilans could occur at higher elevations within the Atlantic Forest biome (Hernández & Vaz-de-Mello 2009;Hensen et al. 2018;da Silva et al. 2019), being its distribution related with the presence of highaltitude grasslands or Pampas biomes. These two habitat types share many dung beetle species (da Silva et al. 2008(da Silva et al. , 2009(da Silva et al. , 2012Audino et al. 2011;da Silva 2017), probably as a consequence of the shared biogeographic origin of these two habitats in southern Brazil, which have expanded and contracted their ranges during the Pleistocene glacial/interglacial periods (Leite et al. 2016).
In conclusion, this study shows that environmental variables are also important factors in determining the coarse-grained distribution of these two subspecies. Although other factors could help to understand the fine-grained segregation of these two taxa (Costa et al. 2008), the environmental characteristics of the occurrence localities of the two studied subspecies show that C. r. rutilans would be distributed in colder (17 C) and higher elevation areas (800 m), whereas C. r. cyanescens is distributed in warmer localities (19 C) at lower elevations (350 m a.s.l.). In our case, two subspecies that differ in the environmental characteristics of their occurrence localities ) could appear to be highly sympatric geographically, thus suggesting that local microenvironmental factors are probably very relevant to explain the spatial segregation of these two sibling species at the considered resolution. Together, these results reinforce the species status of the two studied taxa. Finally, coarse-grain data should be used with caution when the aim is to estimate niche differences between related taxa.