The Pyrenees as a cradle of plant diversity: phylogeny, phylogeography and niche modeling of Saxifraga longifolia

The current distribution of most species results from ecological niche, past distribution, and migrations during glacial–interglacial periods and in situ evolution. Here, we disentangle the colonization history of Saxifraga longifolia Lapeyr., a limestone plant abundant in the Pyrenees and rare in other Iberian mountains and the African Atlas. Our working hypothesis is that the current distribution results from the shrinkage of a more extensive distribution in previous cold periods. We sampled 160 individuals of 32 populations across the whole distribution range and sequenced four DNA regions (rpl32‐trnL, rps16‐trnQ, trnS‐trnG, and ITS). Ecological conditions were modeled to identify factors promoting high genetic diversity and long‐term persistence areas for S. longifolia. In addition, we inferred phylogenetic relationships, phylogeographic divergence, genetic diversity, and migration routes. Seven plastid haplotypes were found, of which six occur in the Pyrenees and one in the High Atlas (Morocco). Discrete phylogeographic analysis (DPA) estimated migration routes predominantly from the Pyrenees to the other areas. Colonization events to those areas appear to have taken place recently given that the rest of the Iberian mountains do not harbor exclusive haplotypes. Iberian–Northern African distribution was inferred to be the result of long‐distance dispersal because the split between Iberian and High Atlas haplotypes is estimated to have taken place in the last 4 million years ago when the Strait of Gibraltar was already open. Migrations from the Pyrenees to the south may have been favored by a corridor of predominant limestone rocks along Eastern Iberia, followed by successful overcoming the Strait of Gibraltar to reach northern Africa.


Introduction
The distribution and genetic structure of mountain plants in the Western Mediterranean region have been ultimately shaped by geologic complexity, coupled with Tertiary and Quaternary climatic episodes (Vargas, 2020). The onset of the Mediterranean climate in the Pliocene (c. 3.2 million years ago, Ma) originated with a climatic seasonality, including a long summer drought of three or more months. This is a characteristic feature of the Mediterranean climate that became stable at approximately 2.8 Ma (Suc, 1984). Accordingly, floras of Mediterranean mountains may have evolved during the Pliocene and Pleistocene (Stebbins, 1984), including alpine floras in high elevations isolated by lowland Mediterranean conditions. Glacial-interglacial cycles in the Pleistocene promoted the isolation of mountain plants in high-elevation mountains across the Mediterranean Basin during warm (interglacial) periods when alpine plants are hypothesized to have suffered from small population sizes, limited gene flow, and reduced time for adaptation to environmental changes (Christmas et al., 2016). In colder and longer periods of time (glacial stages), alpine plant populations had the opportunity to recolonize lower elevations of the same mountains and colonize new ones. Most alpine plants of southern European mountains are the result of migrations from lower elevations after glacial periods (Blanco-Pastor et al., 2019) or descendants of colonists from Northern Europe (Winkler et al., 2012). As a result, new areas have been colonized by plants, while refugia have been created depending on climate pulses. Quaternary refugial areas (sites that have remained environmentally suitable for certain species in the course of glacial and interglacial climatic changes) have allowed long-term species survival and accumulation of a higher genetic diversity than newly colonized areas (Holderegger & Thiel-Egenter, 2009;Stewart et al., 2010). Colonization of new territories usually implies a decrease in genetic variation due to founder effects (Hewitt, 2000;Magri et al., 2006). An ideal scenario to test all these hypotheses is found in the complex geography of Europe, including numerous peninsulas, mountains, and islands that, together with dramatic extinctions in recent glacial periods, have brought about exclusive phylogeographic patterns (Taberlet et al., 1998;Hewitt, 1999;Vargas, 2020).
The Pyrenees is one of the richest European mountain ranges with 3652 plant species, of which 119 (3.4%) are endemic (Gómez et al., 2017). Endemics illustrate current isolation in a mountain range, while the term subendemic expresses certain success of colonization of adjacent mountains (Ninot et al., 2013). This is the case of the calcicolous Saxifraga longifolia Lapeyr., which is abundant in the Pyrenees and extremely rare in adjacent and southern mountains with limestone substrates. Indeed, S. longifolia is an ideal species to study patterns of colonization of Western Mediterranean mountain plants in the Pleistocene because this species: (i) is morphologically well-defined (Webb & Gornall, 1989;Vargas, 1997), (ii) displays adaptations to rocky, dry environments exposed to cold-hot fluctuations (Munné-Bosch et al., 2016;, (iii) occurs in a phylogeographically well-studied region (endemic to the Western Mediterranean) (Vargas, 2020), and (iv) has a scattered distribution across Iberian and northern African mountains (Webb & Gornall, 1989). Certain intraspecific variation has been considered by some authors, which has led to the description of subendemic taxa: subsp. gausseni Emberger (northern African mountains), var. aitanica (Pau) Pau (Sierra de Aitana, SE Iberia), and var. subnana Don (Pyrennees) (Engler & Irmscher, 1919;Webb & Gornall, 1989).
A widely accepted approach to reconstruct the colonization history of a plant species across any territory is to investigate species-level relationships using a phylogenetic reconstruction of nuclear and plastid DNA sequences, followed by phylogeographic analyses based on organelle sequences (Guzmán & Vargas, 2009;Liberal et al., 2014). Indeed, the best method to study historical patterns of species migration is provided by phylogeography, which allows inference of the geographical origin of lineages (Avise et al., 1987), description of migration routes (Hewitt, 1996;Vargas, 2003), and discovery of refugial areas (Provan & Bennett, 2008;Médail & Diadema, 2009). Although a great number of phylogeographic studies of Western Mediterranean mountain species have been already conducted (e.g., Vargas, 2003;Martín-Bravo et al., 2010;Fernández-Mazuecos & Vargas, 2013;García-Fernández et al., 2013;Blanco-Pastor et al., 2019), few of them deal with herbaceous plants distributed in the highest mountains of the Iberia and Atlas mountains (but see Ortiz et al., 2009;Tremetsberger et al., 2016). Additionally, one of the most frequently addressed biogeographic questions for Western Mediterranean species is whether the Strait of Gibraltar has played a role as a barrier or as a bridge between the floras of Iberia and northern Africa (Rodríguez-Sánchez et al., 2008;Guzmán & Vargas, 2009;Fernández-Mazuecos & Vargas, 2010;Jaramillo-Correa et al., 2010;Santiso et al., 2016;Martín-Rodríguez et al., 2020).
Based on previous results from alpine plant species in general and S. longifolia in particular (Vargas, 1997;Conti et al., 1999;Munné-Bosch et al., 2016;Cotado, 2019;, our working hypothesis is that Pleistocene glaciations shaped the current range of S. longifolia in such a way that its geographic distribution was larger in previous cold periods, and currently, isolated populations resulted from multiple interglacial periods between the four last glaciations. To test this hypothesis, we addressed four main objectives: (i) to assess phylogenetic relationships within the S. longifolia group (Saxifraga sect. Ligulatae subsect. Aizoonia); (ii) to evaluate levels of haplotype diversity; (iii) to describe the geographic structure of genetic variation; and (iv) to reconstruct routes of migration across the Iberian mountains and Strait of Gibraltar.

Study species
Saxifraga longifolia is a long-lived herb that develops a single rosette in limestone rocky places (mainly cliffs). The diameter of this rosette has a noticeable variation reaching up to 30 cm in the largest individuals. This species shows semelparity, meaning that a synchronous blooming episode occurs once in a lifetime and then the plant dies off right after fructification (monocarpy). Flower and seed production increases as a function of plant size, and maximum female success is reached in plants with an intermediate size (García, 2003). This species endemic to Western Mediterranean mountains is distributed across the Pyrenees, in particular in the south-central part, with additional, more isolated localities in the Cantabrian Mountains, the Eastern and Southeastern Iberia, and the High Atlas (Morocco) (Webb & Gornall, 1989). Saxifraga longifolia has an altitudinal range between 540 m.a.s.l in Roncal Valley (Pyrenees) and 2600 m.a.s.l on Pico del Aspe (Pyrenees) (Gómez et al., 2020) and Jbel Ouirarassen (High Atlas).

Sampling and data collection
To design field sampling, we conducted a preliminary exploration of S. longifolia occurrence sites in GBIF. We selected 32 populations of S. longifolia for sampling, including 160 individuals across the whole geographic distribution of the species (main range and isolated populations), morphological diversity, and geographic areas with different climatic conditions (Tables 1, S1). We discarded the population in    For each sequence, we indicate the locality and elevation, coordinates, haplotypes, ribotypes, and GenBank accession numbers (ITS 1 , rpl32-trnL 2 , rps16-trnQ 3 , and trnS-trnG 4 ) of sequenced individuals (elevation, coordinates, haplotypes, and ribotypes are only shown for S. longifolia samples). In cases in which there was more than one haplotype or ribotype in one locality, the number of individuals for each haplotype or ribotype appears in brackets.
southern France outside of the Pyrenees because it is suspected to be the result of naturalization (Gonard, 2006) as well as some unreliable GBIF sites that do not match with geographical and ecological characteristics of S. longifolia. Five individuals from each population were randomly selected for DNA extraction and sequencing. The leaves of each individual were preserved in silica gel until processed in the laboratory.

Species distribution modeling (SDM)
A higher number of haplotypes is expected in areas climatically favorable for S. longifolia throughout the Quaternary climatic cycles than in areas with recent favorable bioclimatic conditions (Coello et al., 2021). To detect areas with potential high diversity, we used SDM based on 30 populations sampled in this study (Table 1), which cover the entire distribution range of the species in Iberia. We did not include High Atlas populations because of the bias between the number of populations sampled in Iberia (30) and the less known High Atlas (2). Furthermore, S. longifolia only occurs in limestone places, but detailed lithological information was not available for northwest Africa. We decided to use only the populations sampled in this study to build the models because many localities from local (www.proyectoanthos) and worldwide (GBIF) plant distribution databases of S. longifolia are based on unreliable records, as shown by low elevation (below the known altitudinal limits of the species) or rocky substrates other than limestone. In addition, the species is often confused with the common Saxifraga paniculata Mill. in the Pyrenees. All these led us to restrict the occurrences used for the SDMs to our 30 sampled populations to have an appropriate match between genetic identity and geographic sites. We downloaded 19 bioclimatic variables from WorldClim 1.4 (Hijmans et al., 2005) and selected six that were not highly correlated across Iberia according to pairwise correlation coefficients (|r| < 0.75): bio04 (temperature seasonality), bio08 (mean temperature of the wettest quarter), bio 09 (mean temperature of the driest quarter), bio11 (mean temperature of the coldest quarter), bio16 (precipitation of the wettest quarter), and bio17 (precipitation of the driest quarter). We also chose six edaphic variables from SoilGrids2017 ( We estimated long-term persistence areas as the intersection of inferred presence areas from each period until the present: (i) long-term persistence since the LIG was the intersection of potential distributions in the LIG, LGM, MH, and present; (ii) long-term persistence since the LGM was the intersection of LGM, MH, and current distributions; and (iii) long-term persistence since the MH was the intersection of MH and current distributions.

Phylogenetic and phylogeographic analyses (ITS and plastid DNA)
To determine the species of Saxifraga most closely related to S. longifolia (to be used as the outgroup in downstream analyses), we conducted a Bayesian analysis of ITS sequences in MrBayes using an alignment including our sequences of S. longifolia and 12 additional sequences of Saxifraga sect. Ligulatae (10 sequences) and sect. Trachyphyllum (2 sequences) deposited in the GenBank database (Table 1). The analysis was conducted using two runs of 10 million generations each, with a sampling frequency of 1000 and a burn-in of 10%. Based on the resulting phylogenetic tree, the four DNA regions of this study (ITS and three plastid regions) were sequenced for the closest relatives of S. longifolia (see also Conti et al., 1999). In particular, we sequenced a sample of S. valdensis from the herbarium of the Royal Botanical Garden of Madrid (MA-01-00777476) and two newly collected individuals of S. paniculata and S. cotyledon. These outgroup sequences were aligned with S. longifolia sequences using MAFFT. Genealogical relationships among plastid haplotypes and ITS ribotypes of S. longifolia and the outgroup were inferred using the statistical parsimony algorithm (Templeton et al., 1992) as implemented in TCS 1.21 (Clement et al., 2000) with a 95% connection limit. We exclusively used DNA substitutions. Gaps resulting from mononucleotide repeated units (poly-A or poly-T) were treated as missing data because they are known to be highly homoplastic. We also inferred phylogenetic relationships among S. longifolia sequences using Bayesian inference, with S. paniculata, S. cotyledon, and S. valdensis as the outgroup. Separate analyses for cpDNA and ITS sequences were implemented in MrBayes 3.2.6 (Ronquist et al., 2012) using two runs of 10 million generations each, with a sampling frequency of 1000 and a burn-in of 10%. Best-fitting substitution models were selected using jModelTest (Darriba et al., 2012) (JC for ITS, GTR for rpl32-trnL, HKY + G for rps16-trnQ, and JC for trnS-trnG).
Lineage divergence times were estimated using BEAST v1.10.4 (Drummond & Rambaut, 2007) for cpDNA and ITS sequences separately. Analyzed matrices included sequences of S. longifolia (one per haplotype or ribotype) and those of the outgroup species S. paniculata, S. cotyledon, and S. valdensis. Based on ITS sequence relationships in a wider phylogenetic context (see above), all sequences except for that of S. paniculata were constrained as a monophyletic group. Given the low number of changes found in each DNA region (Table S3), partitions could not be implemented in the cpDNA analysis. To the best of our knowledge, there are no well-documented Saxifraga fossils. For this reason, we employed a secondary basal calibration point (Hipsley & Müller, 2014) from a previous time-calibrated phylogeny of Saxifraga. In particular, root age was calibrated using the crown age of Saxifraga sect. Ligulatae, which includes S. longifolia and the outgroup (7.94 Ma, 95% HPD: 4.86-11.87 Ma) (Ebersbach et al., 2017). We selected a simple constant size coalescent tree prior to facilitate convergence and an uncorrelated lognormal clock. Best-fitting substitution models were selected using jModelTest (Darriba et al., 2012) (JC for both ITS and cpDNA regions). The analysis was conducted using two runs of 100 million generations each, with a sampling frequency of 10 000 and a burn-in of 10%. The adequacy of effective sample sizes was confirmed with Tracer ). We combined the two runs with LogCombiner (discarding the burn-in) and summarized the trees in a maximum clade credibility tree using TreeAnnotator.
To reconstruct migration routes, we performed a Bayesian discrete phylogeographic analysis (DPA; Lemey et al., 2009) in BEAST 1.10.4 (Drummond & Rambaut, 2007) using our plastid sequences for all sampled individuals. We constrained all sequences of S. longifolia as a monophyletic group and defined five areas, including four Iberian areas based on those described in Buira et al. (2017) (Darriba et al., 2012) for the entire alignment. The reconstruction of ancestral areas followed an asymmetric substitution model. We used a Bayesian stochastic search variable selection (BSSVS) to infer statistically supported migration routes. We implemented an uncorrelated relaxed clock with a lognormal distribution for the DNA partition and a strict clock for the area partition, with a simple constant size coalescent tree prior to facilitate convergence. The root age was calibrated using a normal prior distribution with a mean age of 7.94 Ma and a standard deviation of 1.5 Ma (Ebersbach et al., 2017). We used a Bayesian stochastic search variable selection (BSSVS) to infer statistically supported migration routes. The analysis was conducted using two runs of 100 million generations each, with a sampling frequency of 10 000 and a burn-in of 10%. Adequacy sample size was confirmed with Tracer ). We combined the two runs with LogCombiner (discarding the burn-in) and summarized the trees in a maximum clade credibility tree using TreeAnnotator. Finally, we used SpreaD3 0.9.6 (Bielejec et al., 2016) to calculate Bayes factors (BF) and identify strongly supported migration routes as those connections between areas with BF > 3 ( Kass & Raftery, 1995).

SDM
According to the maximum entropy model, the inferred potential distribution of Saxifraga longifolia under current conditions (Fig. 1D) is mainly found in the Northern, Northeastern, and Southeastern Iberian mountains, with a high predictive accuracy (AUC = 0.984). Variable soil organic carbon content (eda07), the mean temperature of the coldest quarter (bio11), precipitation of the driest quarter (bio17), lithological classes (iberlit), and volumetric percentage of coarse fragments (>2 mm) (eda02) had the highest percent contributions to the model, with a combined contribution of 82.8% (Table S2). Projections to the past showed limited variation in the estimated potential distribution, with some degree of waxing and waning, particularly in southeastern mountains (Figs. 1A-1C). In fact, areas of persistence since the LIG were widely distributed in the Northern, Northeastern, and Southeastern Iberian mountains (Fig. 1E).

Phylogenetic and phylogeographic analyses
The Bayesian analysis of ITS sequences of Saxifraga sect. Ligulatae and sect. Trachyphyllum supported S. valdensis as sister species to S. longifolia (posterior probability, PP = 1), with S. cotyledon as a sister to the S. valdensis/S. longifolia clade (PP = 0.99) (Fig. S1).
Sequencing the ITS region from 71 individuals of S. longifolia and the three outgroup species resulted in an alignment length of 468 bp. Seven substitution-based ribotypes were identified in the sequence data set, which formed a single network with no loops and six missing ribotypes (Fig. S2). Iberia and the High Atlas did not share any ribotypes, and there were no missing haplotypes separating both ribogroups. In Iberia, R1 (52 individuals, 26 populations) was found in the majority of individuals and populations (Fig. S2A). We considered this ribotype ancestral because it was the most frequent and widespread one. Three lineages were derived from R1: the first one was formed by ribotype R2 (4 ind., 2 pop), present in Eastern Iberia (Parc Natural dels Ports and Sierra de Aitana); the second lineage included R3 (4 ind., 3 pop.) from the Central Pyrenees, which was separated by three missing ribotypes from two Fig. 3. Fifty percent majority rule Bayesian consensus tree for Saxifraga longifolia. Values above branches represent Bayesian posterior probability (PP). Colors represent the sequences included in each group; the size is proportional to the number of sequences. The numbers in brackets correspond to sequences of each group per population, and the absence of brackets indicates that all sequences of a population are included in the same group. A, Plastid DNA sequences (rpl32-trnL, rps16-trnQ and trnS-trnG). B, ITS sequences. ribotypes present in the West Pyrenees (R4, 3 ind., 1 pop., Fago) and (R5, 2 ind., 1 pop., Roncal); and the third lineage included High Atlas ribotypes (R6, 3 ind., 1 pop., Zaouiat Ahansal; and R7, 3 ind., 1 pop., Jbel Ouirarassen) (Fig. S2B). All the Iberian ribotypes were found in the Pyrenees, except for one (R2) only found in Eastern Iberia (Fig. S2A). Within the Pyrenees, populations of the western (Fago and Roncal) and the central (San Juan de la Peña, Bubal and Cerler) valleys showed higher numbers of ribotypes (four ribotypes and three ribotypes, respectively) than any other areas. The highest number of ribotypes was found in areas that have remained environmentally suitable for the species since the LIG (Western and Central Pyrenees) (Fig. 1E). In the Bayesian phylogenetic tree of ITS sequences, S. longifolia sequences formed a monophyletic group (PP = 1.00), which supported a single origin. Relationships within S. longifolia were largely unresolved, with four ribotypes (R7, R2, R5, and R4) were supported as monophyletic groups (PP > 0.99).
Lineage divergence times based on plastid DNA sequences showed an average value of 4.48 Ma for the split between S. longifolia and S. valdensis, while the crown age of S. longifolia had an average of 2.33 Ma. The average divergence time for the split between Iberian and High Atlas populations was 0.73 Ma (Fig. 4). For ITS sequences, the estimated divergence time between S. longifolia and S. valdensis was 4.78 Ma, while the crown age of S. longifolia was 2.45 Ma. The average divergence time for the split between Eastern Iberia and High Atlas populations was 1.38 Ma (Fig. S3).
The Bayesian DPA based on plastid DNA sequences inferred the Pyrenees as the most probable range for the most recent common ancestor of all S. longifolia sequences (PP = 0.60), and the other four areas had much lower probabilities (High Atlas Mountains, PP = 0.13; Catalan Coastal Range and Ebro Depression, PP = 0.12; Baetic System and South-Eastern Mediterranean coast, PP = 0.11; and Cantabrian Mountains and Northern Iberian System, PP = 0.04). In addition, four routes strongly supported by BF values from the Pyrenees to (i) Cantabrian Mountains (BF = 103.2); (ii) Catalan Coastal Range and Ebro Depression (BF = 7.6); (iii) Baetic System and South-Eastern Mediterranean coast (BF = 7.3); and (iv) High Atlas Mountains (BF = 9.7). Migration routes from the remaining areas to the Pyrenees were moderately supported (BF = 3.7-4.7) (Fig. 5).

Discussion
The Pyrenees can be considered the cradle of genetic diversity of Saxifraga longifolia based on the occurrence in this region of most genetic lineages and ancestral genotypes. In fact, the survival and differentiation of current lineages in the Pyrenees during the Quaternary are suggested by our phylogenetic, phylogeographic, and SDM results. S. longifolia may have been extensively distributed in the Pyrenees throughout the Quaternary and suffered limited expansion and contraction of its distributional range associated with glacial and interglacial cycles. In contrast with this long-term occurrence in the Pyrenees, current populations distributed in other Iberian mountains fit into a pattern of recent colonization and isolation. This pattern is interpreted based on the failure to find unique plastid haplotypes across Iberian mountains outside of the Pyrenees, which may be explained by isolation in the last glacial-interglacial period. In contrast, a deeper split and a significant genetic differentiation were found between Iberian populations and those from northern Africa.

Western and Central Pyrenees harbor most S. longifolia diversity
The Pyrenees harbor six of the seven plastid haplotypes of S. longifolia recorded in this study (Fig. 2) and four of the seven ribotypes (Fig. S2). Haplotype lineage distribution also suggests that the center of diversity of S. longifolia is located in the Pyrenees, from where migration to the rest of Iberian and northern African areas appears to have taken place according to the Bayesian phylogeographic analysis (Fig. 5). Genetic and lineage diversity, coupled with population abundance within each area, points to the Pyrenees as the cradle for S. longifolia differentiation. In particular, Central and Western populations of the Pyrenees harbor most of this diversity (Fig. 2). Pyrenean valleys could act as Quaternary refugia for S. longifolia, although (climatically favorable) longterm persistence areas for this species were widely distributed across Iberian limestone mountains (Fig. 1E). Abrupt valleys offer the opportunity of preserving suitable habitat conditions despite climatic oscillations, as interpreted for Fagus sylvatica (Magri et al., 2006), Pinus uncinata (Dzialuk et al., 2009), Antirrhinum majus (Liberal et al., 2014), and Rhododendron ferrugineum (Charrier et al., 2014). The importance of the Central Pyrenees as a biodiversity hotspot is not only hypothesized for plants but also for animal species (García-Barros et al., 2002). The geography of the Pyrenees, forming an east-west barrier between Iberia and the rest of Europe, is interpreted as a strong limitation for species dispersal, although some other valleys contribute to reduce isolation (Wallis et al., 2016). Indeed, the Central Pyrenees display an important diversity of habitats (limestone, rock places; high mountain pastures and scrubs; and mature forests) together with a great amount of microhabitats, which has resulted in a high degree of plant endemicity (63 of 119 endemic Pyrenean plant species are exclusive to the Central Pyrenees) (Gómez et al., 2017). During the LGM, Pyrenean valleys appear to have been covered with ice, but S. longifolia could have survived on cliffs and reached other valleys later on by diffusion processes when establishment conditions were favorable.
The Quaternary history of the Western Mediterranean is characterized by glacial and interglacial periods (Vargas, 2020). High-mountain vegetation may have survived during glacial periods in refugia (Birks, 2016) such as ice-free areas emerging above the glaciers (nunataks) (Stehlik et al., 2002) and on the borders of glaciated mountains (peripheral glacial refugia) (Segarra-Moragues et al., 2007;Médail & Diadema, 2009). Glacial periods facilitated colonization of other mountains, especially by plants with a monocarpic reproductive pattern such as S. longifolia, which generates hundreds of small fusiform seeds (1-1.5 × 0.4-0.5 mm; see Vargas, 1997) for each plant that can be easily dispersed by wind. In contrast, during interglacial periods, these populations probably had a limitation in their survival and colonization capacity to highland rocks (Vargas, 2003). These ecological and biological limitations may have produced isolated populations recurrently in the history of S. longifolia, with a higher or lower gene flow depending on the distance to the nearest population and ecological factors (erosion, competition for space, soil loss) that may have acted as a barrier or a bridge between populations (Stehlik et al., 2002).
Interestingly, the most genetically differentiated population based on plastid DNA sequences is found in the western Pyrenees (Isaba, 9 ind.), as shown by the six missing haplotypes needed to connect this population with the phylogeographically closest population (Fago, 8 ind.) (Fig. 2). In contrast, geographic distance between them is short (18.39 km in a straight line), which suggests strong haplotype extinction. Alternatively, failure in obtaining a significant sample size may have also affected our phylogeographic reconstruction, despite a particular sampling effort made on that particular area (Sanz et al., 2014). Unexpectedly, current phylogeographic isolation does not mean intraspecific morphological differentiation (Vargas, 1997). In fact, S. longifolia individuals from Isaba are morphologically similar to those from other Pyrenean populations (n = 23). Nevertheless, isolation in extreme Mediterranean conditions may have brought about local morphological differentiation, as interpreted from the small phenotype of S. longifolia var. aitanica (Pau, 1904) from Sierra de Aitana (Eastern Iberia) (ribotype R2) and long-term isolation for the subsp. gaussenii from the African Atlas (haplotype H4 and ribotypes R6 and R7).

Iberian mountains: Multiple patterns of colonization
Phylogeographic and SDM results suggest pulses of active colonization and extinction across Iberia in multiple ways. As a result, finding common phylogeographic patterns across species is difficult (comparative phylogeography) (Vargas, 2003;Kropf et al., 2006;Smyčka et al., 2018), even within the same species. SDM results indicate a potential distribution of S. longifolia permanently in favorable areas (calcareous mountains of SW Europe) during the Quaternary (Fig. 1). Despite the difficulty of determining how macroclimatic variables influence the distribution of species and lineages located at microsites (Kuntz & Larson, 2006), we can draw some interpretations from the S. longifolia distribution model. The importance of edaphic and lithological variables (>60% of the model, Table S2) can be related to the importance of limestone substrates over climate variables for this species. On the one hand, soil organic carbon content (eda07), lithological class (iberlit), and volumetric percentage of coarse fragments (>2 mm) (eda02) (Hengl et al., 2017) explained more than 50% of the variation (Table S2). On the other hand, mean temperature of the coldest quarter (bio11) and precipitation of the driest quarter (bio17) explained >32%. These climate variables represent low temperature during winter (bio11) and water supply during summer (bio17), essential ecological requirements for S. longifolia . Climate conditions during the Quaternary probably caused significant extinction of S. longifolia populations but also persistence in favorable microsites, as found for Kernera saxatilis, Silene rupestris, and Gentiana alpina (Kropf et al., 2006). Recent colonization events between and within mountain ranges have been suggested for Androsace vitaliana (Dixon et al., 2009), which had a recent isolation in the Pyrenees and mountains of Central and Eastern Iberia (Vargas, 2003). Connectivity between limestone mountains out of the Pyrenees has also been reported for Brachypodium distachyon, which also reached the Eastern Iberian mountains (Marques et al., 2017). A waxing-and-waning process during the Pleistocene may have caused genetic and morphological differentiation, which have been identified by taxonomists in terms of new subspecies endemic to particular mountain ranges (see Vargas, 2003 for details). A close connection between the Cantabrian mountains and the Pyrenees is supported by our DPA analysis (Fig. 5), but the lack of a strong phylogeographic structure indicates that the Cantabrian populations may be the result of a recent colonization as well, as described for some other species (García-Barros et al., 2002).

Eastern Iberian corridor to the Strait of Gibraltar and beyond
Our phylogenetic reconstruction based on ITS and plastid sequences is congruent with an early split of northern African and European populations (<3.33 Ma). A further connection between both continents was found neither by phylogenetic nor phylogeographic analyses. The timecalibrated phylogeny using ITS sequences (Fig. S3) is equivocal about a split before or after the opening of the Strait of Gibraltar (5 Ma). However, the phylogeny of plastid sequences supports a colonization in the last three million years (Fig. 4), which is only compatible with long-distance dispersal over the Strait of Gibraltar.
The High Atlas haplotype is connected by five missing haplotypes with Els Ports (Eastern Iberia) and Cerler populations (the Central Pyrenees) (Fig. 2), which suggests an ancient relationship with northeastern Iberia populations. S. longifolia could have migrated at least once from the Pyrenees to the High Atlas by long-distance dispersal or most likely in a process where Eastern Iberia may have been an ideal corridor, as shown by numerous limestone mountains that connect the Pyrenees and the Strait of Gibraltar. On the north African side, the limestone Rif mountains are located between the Mediterranean coast and the High Atlas; however, there are no records of S. longifolia in the Rif mountains despite similar edaphic and climatic conditions (Rodríguez-Sánchez et al., 2008). Further exploration should be carried out in northern Morocco given that new small populations of S. longifolia have been found in the last decades in Spain (Peñas de Herrera, Moncayo, Escudero, 1992;La Paredina, Rodríguez, 1977;Sagra Mountains, Font Quer, 1961), suggesting that the distribution of scattered populations of S. longifolia is still poorly known.
Colonization routes from the Pyrenees to the east of Iberia could have been favored by the vast corridor of tundra/ steppe vegetation adapted to the cold climate that connected these regions during the LGM (22 kya) (Kropf et al., 2006). This pattern is also found in some Pyrenean species such as a subendemic snapdragon (Antirrhinum majus, Liberal et al., 2014), the endemic daffodil (Narcissus alpestris), and the endemic newt (Calotriton asper) (see a revision by García-Barros et al., 2002). During the LIG (120-140 Ma), when warmer temperatures may have not allowed the persistence of S. longifolia at low elevations, its populations were probably fragmented and stayed isolated in mountain systems at higher elevations (Fig. 1A). There are other examples of even more disjunct distributions like Veronica aragonensis, which could have used this corridor to reach the Baetic System from the Pyrenees (Padilla- . In the opposite direction, Arenaria tetraquetra could have also used this limestone corridor to reach the Pyrenees from Southeastern Iberia (Vargas, 2003).
The Eastern Iberian corridor facilitated colonization of Eastern and Southern Iberian mountains and then the High Atlas by S. longifolia. However, local extinctions probably occurred in S. longifolia populations outside the High Atlas and the Pyrenees after the first colonization process. Thus, colonization from the Pyrenees is indicated by haplotypes sharing with other limestone Iberian mountains. As a result of recent colonization across Iberia, we failed to find a strong phylogeographic pattern, which prevented us to point out a close connection between Iberian and High Atlas populations. Indeed, molecular and ecological evidence suggest that a clearer pattern was probably blurred during the waxing-and-waning process that occurred in the Eastern Iberian corridor during the Pleistocene. Table S1. Saxifraga longifolia samples used in this study covering its geographic distribution. For each locality, we indicate coordinates, elevation, date of collection, collector, collector ID, and number of individuals analyzed in this study. Table S2. Estimation of relative contributions of the bioclimatic, soil, and lithological variables to the species distribution model (Maxent), including the percent contribution and the permutation importance. Table S3. Number of variant (variable and informative) sites in ITS, rpl32-trnL, rps16-trnQ, trnS-trnG, and concatenated plastid (cpDNA) Saxifraga (S. longifolia, S. cotyledon, S. paniculata, S. valdensis) matrices. The numbers in brackets indicate S. longifolia variant sites.