- Details on pre-selection of variables: we used 19 climate variables along with elevation data extracted from WorldClim version 2.1 (www.worldclim.org), at a spatial resolution of 2.5 arcminutes (≈ 4,630 m). WorldClim provides high-resolution interpolated climate data, derived from 9,000 to 60,000 weather stations worldwide, aggregated across a target temporal range of 1970–2000
- To avoid redundancy, we retained in the models only variables whose pairwise Pearson correlation coefficients were below 0.8, based on a correlation matrix calculated with the R package “vegan” (version 3.5.3)
- We used Receiver Operating Characteristic (ROC) statistics to assess model accuracy, running 10 replicates with a maximum of 10,000 iterations each. In each replicate, 10% of the occurrence records were randomly set aside as test data, and the remainder were used to train the model.
- Assessment of variable importance: we used the jackknife option to identify variables that did not contribute substantially to model robustness
- Details on threshold selection: We selected the thresholds for each species that defined the smallest potential habitat following a conservative approach to avoid overestimating species geographic distributions (Table S2)
- Performance statistics estimated on training data: performance statistics were estimated on the training data in each bootstrap replicate. For each of the 10 replicates, MaxEnt evaluated model accuracy by comparing predicted and observed occurrences, and standard metrics such as AUC (area under the ROC curve) were calculated to assess model performance. Using multiple replicates improves the stability of the performance estimates and helps limit the risk of overfitting.
- Performance statistics estimated on validation data (from data partitioning): although no explicit data partitioning was applied, we used a random set of test points in each bootstrap replicate. These test points served as a form of validation because they were not used during model fitting. Performance statistics, such as AUC, were calculated on these test points, providing an estimate of model accuracy and of the model's ability to generalize to unseen data.
- Response plots: We used partial dependence plots to check the ecological plausibility of fitted relationships in MaxEnt models.
- Prediction unit: Predictions of relative probability of presence
- Post-processing: after threshold selection, clipping was performed to generate binary maps.
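The main steps listed above (correlation-based pre-selection, random hold-out of test records, AUC evaluation, and thresholding to a binary map) can be illustrated with a minimal sketch in plain Python. This is not the code used in the study, which relied on R and MaxEnt. The function names, the example variable names (bio1, bio5, bio12), the toy values, and the greedy order-dependent selection rule are all assumptions made for illustration; the source states only the 0.8 cutoff, the 10% test fraction, and the use of AUC.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def select_uncorrelated(variables, cutoff=0.8):
    """Keep each variable only if |r| < cutoff with every variable kept so far.

    `variables` maps a (hypothetical) variable name to its values at the
    sampled cells. The greedy rule is an assumption, not the authors' method.
    """
    kept = []
    for name, values in variables.items():
        if all(abs(pearson(values, variables[k])) < cutoff for k in kept):
            kept.append(name)
    return kept


def split_occurrences(records, test_fraction=0.1, seed=0):
    """Randomly hold out a fraction of records as test data for one replicate."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]  # (train, test)


def auc(presence_scores, background_scores):
    """AUC as the share of presence/background pairs ranked correctly.

    Ties count as half a correct ranking.
    """
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            wins += 1.0 if p > b else 0.5 if p == b else 0.0
    return wins / (len(presence_scores) * len(background_scores))


def to_binary(suitability, threshold):
    """Convert a suitability grid (list of rows) into a 0/1 presence map."""
    return [[1 if v >= threshold else 0 for v in row] for row in suitability]


# Toy data, invented for illustration only.
variables = {
    "bio1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "bio5": [1.1, 2.1, 2.9, 4.2, 5.0],   # nearly collinear with bio1
    "bio12": [5.0, 1.0, 4.0, 2.0, 3.0],
}
kept = select_uncorrelated(variables)
train, test = split_occurrences(list(range(20)))
score = auc([0.9, 0.8, 0.4], [0.5, 0.3, 0.1, 0.4])
binary_map = to_binary([[0.2, 0.7], [0.5, 0.1]], threshold=0.5)
```

In this sketch a higher threshold yields a smaller predicted area, which corresponds to the conservative choice described under threshold selection. Repeating `split_occurrences` with a different `seed` for each of the 10 replicates and averaging the resulting AUC values would mimic the replicate design described above.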
Uncertainty quantification
- Algorithmic uncertainty, if applicable: None
- Uncertainty in input data, if applicable: None
- Effect of parameter uncertainty, error propagation, if applicable: None