Abstract
Small populations face high extinction risks. This can be explained by several non-genetic and genetic factors, the latter including the loss of genetic diversity and evolutionary potential, as well as the accumulation of harmful mutations (genetic load). Using whole-genome data from island populations with different effective sizes, we estimated genetic variation and load and explored the relationship between these quantities. An extremely small population of the Aeolian wall lizard, Podarcis raffonei, likely isolated for tens of thousands of years, shows the lowest genome-wide heterozygosity observed in wild eukaryotes (one polymorphic site every 300 kb on average). Despite this, its realized genetic load is comparable to that observed in another larger and more genetically variable population. Both populations have lower variation and higher load than the much more abundant sister species, the Sicilian wall lizard. These observations are consistent with the hypothesis that populations experiencing severe bottlenecks may persist for extended periods with extremely low genomic variation, provided that their burden of deleterious mutations remains within tolerable bounds.
Similar content being viewed by others
Population genetic differentiation and genomic signatures of adaptation to climate in an abundant lizard
Genetic purging in an island-endemic pigeon recovering from the brink of extinction
Genomic basis of insularity and ecological divergence in barn owls (Tyto alba) of the Canary Islands
Introduction
The genetic factors contributing to the risk of extinction in a population are numerous and involve mutation, genetic drift, and inbreeding. Mutation frequently generates maladaptive variants (Muller 1950), drift may subsequently increase their frequency, and mating among consanguineous individuals can expose deleterious recessive variants (Charlesworth and Charlesworth 1999). Moreover, drift may also reduce beneficial genetic variation for future adaptation (Bijlsma and Loeschcke 2012). Population size plays a crucial role in all these processes.
The partitioning of genetic load into components affecting current (realized load) and future (masked load) generations is crucial for understanding and potentially predicting extinction risk dynamics (Bertorelle et al. 2022). In large populations, individuals tend to accumulate more deleterious variants, defined as mutations with negative selection coefficients that reduce fitness, than individuals in small populations. This is due to a higher value of the parameter Nμ (effective population size multiplied by the mutation rate), as well as the fact that most mutations are mildly deleterious and recessive. Consequently, large populations exhibit a higher genetic load, but its impact is buffered as deleterious mutations in heterozygosis (the masked load) prevail over those in homozygosis (the realized load) (Smeds and Ellegren 2023; Humble et al. 2023). Decreasing population size escalates drift effects, increasing both the frequency of deleterious variants, possibly up to fixation (genetic meltdown), and the level of inbreeding that may translate into unfit offspring (Charlesworth and Charlesworth 1999). Only the removal of deleterious pairs of alleles in inbred offspring (genetic purging) can reduce the conversion of masked load into realized load and mitigate the subsequent fitness reduction and demographic decline (Crow 1970).
Theoretical and simulation studies have explored the interplay between genetic variation and genetic load components to predict extinction risks under different demographic scenarios (Bertorelle et al. 2022). Whole-genome sequences have provided empirical insights by estimating genetic variation and load in single individuals (Pečnerová et al. 2024; Smeds and Ellegren 2023; Humble et al. 2023). However, the effectiveness of whole-genome sequences as proxies for fitness in current and future generations, in particular for comparing populations and species with different population sizes and demographic histories, remains unclear. While some studies indicate the accumulation of realized genetic load and increased extinction risks as the main and most worrisome process in declining species (Robinson et al. 2016; Von Seth et al. 2021; Dussex et al. 2021; Kleinman-Ruiz et al. 2022), others suggest that purging is a relevant mechanism reducing vulnerability to extinction, particularly in populations experiencing prolonged periods of small population size (Ficetola et al. 2011; Grossen et al. 2020; Ochoa and Gibbs 2021; Khan et al. 2021; Kleinman-Ruiz et al. 2022; Mathur et al. 2023b). Additionally, conflicting findings on the purging effect of highly deleterious mutations and the simultaneous accumulation of mildly deleterious alleles in small populations further complicate the understanding of fitness outcomes (Grossen et al. 2020; Khan et al. 2021; Dussex et al. 2023). Finally, the use of various load proxies and approaches to predict deleteriousness, coupled with the recent appreciation of load partition into the realized and masked components, has led to inconclusive findings regarding extinction risk and general patterns of accumulation and expression of deleterious variants in natural systems (Bertorelle et al. 2022).
Here we investigate the interplay between population size, genomic variation, and genetic load in three insular lizard populations: two populations of the endangered (Salvi et al. 2024) Aeolian wall lizard (Podarcis raffonei) with census population sizes in the tens and hundreds of individuals, respectively, and, as a control, the sister species, the Sicilian wall lizard (Podarcis waglerianus), which is widely distributed across Sicily and satellite islands with a very large census population size (Bowles 2024). The Aeolian wall lizard occurs in extremely small and isolated populations in the Aeolian archipelago (Fig. 1A) in southern Italy, with a population size ranging from 100 to 1400 individuals across three islets and a small peninsula within an island, for a total population of about 2000 individuals occupying an area of approximately five to six thousand square meters (Gippoliti et al. 2017; Ficetola et al. 2018, 2021; Salvi 2023). Recent human disturbances and competition with the invasive Italian wall lizard (Podarcis siculus) have likely contributed to a reduction in the global population size and distribution range (Cascio and Corti 2006).
A Distribution range of the Aeolian wall lizard and the Sicilian wall lizard (bottom right, different scale). The red square in the inset indicates the position of the Aeolian archipelago. The Aeolian wall lizard is found in four islands and islets (underlined in gray), including our study sites of La Canna and Strombolicchio. The geologically estimated age of La Canna and Strombolicchio is indicated in italics, with the heights of the vertical bars proportional to the corresponding geological ages. B Picture of the La Canna stack (credit: Piera Rapisarda), a volcanic neck 70 m high and 1.6 km distant from the island of Filicudi in the background. C Genome-wide observed heterozygosity in 53 eukaryotic species (Robinson et al. 2016; Morin et al. 2021), including the Aeolian wall lizard from La Canna (LC) and Strombolicchio (ST), and the Sicilian wall lizard. Silhouettes indicate the taxonomic group of the species included in this plot, namely plants, invertebrates, mammals, birds, fish, and reptiles, and were retrieved from Phylopic (Gearty and Jones 2023). D Neighbor-joining phylogenetic tree based on pairwise genetic distances between individuals, for 5 different populations and species: the Aeolian wall lizard from La Canna (LC) and Strombolicchio (ST), the Sicilian wall lizard (SI), the Tyrrhenian wall lizard (Podarcis tiliguerta), and the Italian wall lizard (Podarcis siculus).
We focus on the populations of Strombolicchio (census population size of 500–700 individuals; Capula and Lo Cascio 2011; Ficetola et al. 2021) and La Canna (census population size of about 100 individuals; Salvi 2023; Fig. 1B). These are two volcanic islets located at approximately 1.5 km from the closest islands of Stromboli and Filicudi, respectively, where the Aeolian wall lizard is not found. The bathymetric profile suggests that the isolation of Strombolicchio from Stromboli and La Canna from Filicudi dates back to 7–10 kya (Cascio and Corti 2006). Consequently, gene flow with the respective main islands might have ceased at around that time. It is worth noting that lizard populations might have been present in Strombolicchio much earlier than those of La Canna, as Strombolicchio was formed much earlier than La Canna (approximately 200 kya and 30 kya, respectively; Francalanci et al. 2013; Lucchi et al. 2013).
First, we characterized the genomic patterns of variation and divergence, and reconstructed the evolutionary history of two small populations of the Aeolian wall lizard and one large population of the Sicilian wall lizard. Then, we estimated and compared the genetic load across these three different groups. Despite an extreme level of inbreeding in the La Canna population, which behaves as a wild inbred strain, and distinct demographic histories, we found that the two populations of the Aeolian wall lizard exhibit a similar amount of realized genetic load. This confirms the expectation that the erosion of genetic variation is proportional to population size in isolated populations but also suggests that extinction can be avoided when genetic erosion is extreme but genetic load does not increase proportionally.
Methods
Population sampling and genomics data generation
A total of 32 lizards were sampled in the Aeolian archipelago and the island of Sicily. Ten individuals of Aeolian wall lizards were sampled on the stack of La Canna (38°34’56.13”N – 14°31’16.61”E; Fig. 1B), in a small terrace at 50 m above sea level on the eastern slope of the stack. Ten individuals were sampled on the islet of Strombolicchio (38°49’01.8“N – 15°15’06.7“E). Ten individuals of Sicilian wall lizards were sampled in Sicily, four individuals from the South-east and six individuals from the West of the island (see SI Appendix, Table S1 for details). Finally, two individuals of Italian wall lizards were sampled in Sicily. For each individual, the tip of the tail was cut, flash-frozen in liquid nitrogen, and later stored at -80°C. Individuals were released after sampling. Whole DNA was extracted using a PureLink Genomic DNA Mini Kit (Invitrogen). DNA integrity was assessed by 1.5% agarose gel electrophoresis, and DNA concentration was measured using a Qubit 4 fluorometer Broad Range Assay (Invitrogen). Genomic libraries were constructed using an Illumina DNA PCR-Free Prep Kit (Illumina) according to the manufacturer’s protocol. Target coverage was 10-15X for all samples. Libraries were sequenced paired-end on an Illumina NovaSeq 6000 System using a 300-cycle S2 Reagent Kit v1.5. Read quality was evaluated with FastQC v0.12.1 (Andrews 2010), and output reports were examined visually.
Read mapping
A high-quality reference genome of the Aeolian wall lizard (Gabrielli et al. 2023) was used for read mapping. In addition to the 32 individuals collected and sequenced for the study, we used six Podarcis genomes from Yang et al. (2021): one Aeolian wall lizard from Strombolicchio, one Sicilian wall lizard from Sicily, two Italian wall lizards from Sicily and the Tuscan islands, respectively, and two Tyrrhenian wall lizards from Sardinia and Corsica, respectively, the last two species serving as outgroups for our subsequent analyses. Reads were mapped to the reference genome using BWA version 0.7.17 (Li and Durbin 2009) with default parameters, with an Optical Density (OD) value of 100 for our samples, as they were generated using patterned flow cells, and 2500 for the other individuals. PCR and optical duplicates were tagged using the MarkDuplicates tool in the Picard Toolkit version 2.24.1 (http://broadinstitute.github.io/picard/). In our study, a PCR-free library preparation kit was used so that reads tagged as PCR duplicates were further untagged using a custom bash script. A number of statistics were computed to ensure the quality of the mapping (SI Appendix, Table S2, Fig. S1).
Variant calling and filtering
An independent SNP calling was performed for the Aeolian wall lizards from La Canna and Strombolicchio, the Sicilian wall lizards, the Italian wall lizards, and the Tyrrhenian wall lizards. SNP calling was performed using GATK (McKenna et al. 2010) v4 following the best practice guides for short variant discovery, which is composed of three steps: (i) calling variants per sample using HaplotypeCaller; (ii) consolidating GVCFs to improve scalability using GenomicsDBImport; (iii) joint-genotyping all the per-sample GVCFs using GenotypeGVCFs. HaplotypeCaller was run independently in each scaffold of each individual, using the option EMIT_ALL_CONFIDENT_SITES to output all sites (including non-variant sites). The outputs were generated in condensed non-variant blocks. A minimum mapping quality of 20 was used. Finally, all scaffolds were merged together to produce one unique VCF file per dataset using bcftools merge (Danecek et al. 2021). From the raw VCF file produced after SNP calling, we first extracted SNPs and indels using GATK SelectVariant. We then filtered indels by quality (minimum quality of 60) and then filtered out low-quality SNPs that were located within 5 bp of the high-quality indels, with the following criteria: QUAL < 60; QD < 2.0; FS > 60.0; MQ < 40.0; MQRankSum < −20.0; ReadPosRankSum < −8.0; DP < meandepth/3; DP > meandepth*2; GQ < 10, using GATK VariantFiltration. We removed repeat regions, regions of high heterozygosity and filtered out sites with more than 25% of missing data using VCFtools (Danecek et al. 2011) and bedtools intersect (Quinlan and Hall 2010). We also extracted the invariant sites and filtered out low-quality sites with a RGQ < 10. Finally, we merged high-quality SNPs and invariant sites for each group with bcftools concat and merged the VCF files of all groups using bcftools merge. We removed the two sexual chromosomes (Z and W) and restricted the analyses to the 18 autosomes.
Genomic diversity and population structure
We first quantified genomic diversity in our three groups of interest, the Aeolian wall lizard from La Canna, the Aeolian wall lizard from Strombolicchio, and the Sicilian wall lizard, computing the number of SNPs per population and species and the number of individual heterozygous sites. We also computed the mean observed heterozygosity per population and compared the obtained estimates to genomic estimates of observed heterozygosity in a range of eukaryotes (Robinson et al. 2016; Morin et al. 2021).
We investigated genomic diversity in the Major Histocompatibility Complex (MHC), a region typically characterized by high variability in vertebrates. To identify relevant loci, we used data from Gaczorek et al. (2023), which characterized MHC regions in Podarcis lizards inhabiting the Iberian Peninsula. We focused on two genes, corresponding to a class I and class II MHC second exons, both showing high variability in Gaczorek et al. (2023). By performing a BLAST search (Altschul et al. 1990) of ten randomly selected MHC I alleles deposited by Gaczorek et al. (2023) against the P. raffonei reference genome (accession GCF_027172205.1), we consistently retrieved 15 accession numbers, all of which corresponded to the “H-2 class I histocompatibility antigen Q9 alpha chain-like” protein sequence. After filtering out isoforms of the same transcript and sequences shorter than 200 bp, six very similar sequences were retained. One of them, randomly selected, was used as a query to extract the homologous region in the genomes of lizards from La Canna, Strombolicchio, and the Sicilian wall lizard and estimate the number of SNPs in the three groups. We applied the same approach for MHC II alleles. Five accession numbers were retrieved corresponding to the “H-2 class II histocompatibility antigen, E-S beta chain-like” protein sequence. After filtering for isoforms and fragment length, two very similar sequences were retained, and one of them was used to estimate the genetic variation in La Canna, Strombolicchio, and the Sicilian wall lizard.
In order to analyze population and species clustering and ensure the global quality of our dataset, we first performed a Principal Component Analysis using Plink (Purcell et al. 2007). We then analyzed the shared ancestry between individuals using Admixture (Alexander et al. 2009) using sites segregating in the Aeolian, Sicilian, and Italian lizards, and 50 kb apart in order to have independent markers (resulting in 30,497 SNPs). Finally, we performed a phylogenetic tree based on pairwise genetic distances between individuals using Plink.
Inbreeding levels
To quantify the levels of inbreeding in our three populations of interest, we used ROHan, a probabilistic method that infers local rates of heterozygosity and delineates Runs of Homozygosity (ROH) (Renaud et al. 2019). We ran the program with a rhomu value of 5e−5 and a window size of 1 Mb. We then considered a region to be a ROH when the P(ROH) was higher than 0.9, and computed the proportion of the genome in ROH higher than 1 Mb, (FROH), which serves as a genomic estimate of the inbreeding coefficient F. We also used a window size of 100 kb in order to estimate the proportion of the genome in different classes of ROH length, and here removed two individuals (ST03 and ST04) for which the chains did not reach convergence.
Demographic inferences
The long-term variation in population size was first investigated using MSMC2 (Schiffels and Wang 2020; Wang et al. 2020). This analysis was run on the 18 different autosomes using three masks. The first was a callable mask that was computed using GATK CallableLoci on the bam files. We defined callable sites as sites with a minimum base quality and mapping quality of 20 and covered between one-third and three times the mean individual depth. The second was a mappability mask generated using GEM v1.4.3 (Derrien et al. 2012) to determine uniquely mapping positions in the reference genome of the Aeolian wall lizard. The third was a repeat negative mask to exclude repetitive regions extracted from the output of RepeatMasker previously run on the reference genome of the Aeolian wall lizard. We used a default time patterning (“4 + 25*2 + 4 + 6”), with an initial theta/rho ratio (-r parameter) of 5 and a maximum 2N0 coalescent time (-N parameter) of 15 (default values). The recombination rate was set as variable in the MSMC algorithm. We used a generation time of two years and a mutation rate of 1e−9 mutations per site and per year as estimated from Podarcis whole genomes (Yang et al. 2021).
The recent demographic history was investigated from the distribution of the ROH. The calculation of the coalescent times of ROH was estimated as: (g=frac{100}{2{rL}}) (Thompson 2013), where g is the expected time back to the parental common ancestor when the IBD haplotypes forming an ROH coalesce, r is the recombination rate, and L is the length of the ROH in megabases. We used the recombination rate estimate of the chicken (Gallus gallus) of 2.8 cM.Mb−1, as done by Ochoa and Gibbs (Ochoa and Gibbs 2021), as this is, to our knowledge, the closest available estimate applicable to reptiles in the absence of species-specific data. We then derived the formula: ({F}_{{ROH},t}=1-{left(1-frac{1}{2{Ne}}right)}^{t}) (Khan et al. 2021) to get an estimate of Ne from different time periods corresponding to different classes of ROH length.
To estimate the divergence time between the La Canna and Strombolicchio populations, we used SMC++ v1.15.4 (Terhorst et al. 2017), which implements a sequentially Markovian coalescent model to infer past population size changes and split times from unphased genomic data. As SMC++ treats missing data as homozygous reference genotypes, we first masked genomic regions with low or excessively high coverage, as well as those with insufficient base or mapping quality, using the same callable mask used for MSMC and computed using GATK CallableLoci. We generated smc files for each autosome using the vcf2smc command, with the –mask option to exclude unreliable regions. We estimated marginal effective population sizes for each population using smc++ estimate, and applied the smc++ split command to fit a clean-split model between La Canna and Strombolicchio and infer the divergence time between the two populations. Analyses were run assuming the same mutation rate and generation time as in the MSMC analysis.
Genetic load estimation
Ancestral state definition
A fundamental component of the genetic load estimation is the definition of the ancestral state. We therefore used two outgroups with different divergence times from the clade formed by the Aeolian and Sicilian wall lizard: the Tyrrhenian wall lizard from the Western islands group (divergence time: 16 My) and the Italian wall lizard (divergence time: 18 My) (Yang et al. 2021). We defined the ancestral allele as the allele present in at least three or four of the Italian wall lizard individuals and the two Tyrrhenian wall lizard individuals using a custom Python script (available at: https://github.com/maevagabrielli/Conservation-genomics-Belem-2025/blob/main/polarization_miss_ancestralallele.py). All sites where at least one of these outgroups was heterozygous were discarded to maximize the confidence in the ancestral allele definition. We then selected the sites for which at least one derived allele was present in the Aeolian and/or Sicilian wall lizard, but not fixed in these two species, resulting in 13,421,208 sites.
Annotation-based approach to estimate the genetic load
We first estimated the genetic load based on genomic annotations and classified mutations of different effects using SnpEff (Cingolani et al. 2012). Well-known functional effects of mutations in different genomic regions are used to classify a mutation. Mutations inside coding protein sequences (CDS) were categorized into three classes based on their predicted impact: LOW (synonymous); MODERATE (missense, i.e., nonsynonymous), and HIGH (nonsense mutations including stop-gained, start-gained, start lost, splice acceptor, and splice donor variants). When one mutation was associated with multiple effects (due to different genes being affected by the same variant or one gene having multiple transcripts), the first one was considered, corresponding to the higher putative impact and to canonical over non-canonical transcripts.
Evolutionary conservation-based approach to estimate the genetic load
We then estimated the genetic load based on evolutionary conservation using GERP++ (Davydov et al. 2010). This approach builds on the idea that rare substitutions in genome phylogenies across species indicate strong constraints, so that variants in these positions are predicted to be deleterious. GERP scores were calculated on an alignment of 46 species, including 16 reptiles (Roscito et al. 2022). As the tegu (Salvator merianae) was used as the reference genome for this later multispecies alignment, we first aligned the genome of the Aeolian wall lizard to the genome of the tegu (obtained in DNA Zoo, version 6; Roscito et al. 2018) using AnchorWave v1.0.1 (Song et al. 2022). We then used liftOver (UCSC; Kuhn et al. 2013) to change the coordinates of the tegu genome into the coordinates of the Aeolian wall lizard genome. Mutations with a GERP score higher than 4 were classified as putatively deleterious, following Smeds et al. (2023) and based on the distribution of the GERP scores of synonymous, missense, and nonsense mutations from the genomic annotations (see SI Appendix, Fig. S2). Nonsense mutations had a higher GERP score (mean value: 2.3) than missense mutations (mean value: 1.9), which had a higher GERP score than synonymous mutations (mean value: −0.87; significant differences in mean scores from pairwise Wilcoxon tests; SI Appendix, Fig. S2), showing that inferences based on annotations and evolutionary conservation are congruent.
Proxies of the genetic load components
Assuming that most deleterious mutations are recessive, homozygous and heterozygous genotype counts of derived deleterious mutations can be used as indicators of the genetic load components. For each class of putatively deleterious mutations, namely missense, nonsense and high-GERP-score mutations, the genetic load was first measured at the individual level and separated into two components: (i) the masked load (unexpressed and not affecting individual fitness) was estimated by the individual number of heterozygous genotypes and (ii) the realized load (expressed and reducing the fitness of an individual) was estimated by the individual number of homozygous genotypes for the derived allele (Bertorelle et al. 2022). We also computed these counts for high-GERP-score missense mutations and high-GERP-score nonsense mutations. At the population level, we characterized genetic load as the number of deleterious alleles present in a population at high frequency (frequency of the derived allele higher than 0.9), following Mathur et al. (2023a). To ensure comparisons across populations were not confounded by patterns of neutral variation, we also compared the total number of derived alleles at synonymous SNPs.
Purging inference
To estimate the levels of purging in our system, we first contrasted the total number of deleterious alleles in the different populations. In case of purging, we expect a loss of deleterious alleles in small populations, as in Kleinman-Ruiz et al. (2022). We then computed the percentage of difference in the number of deleterious alleles between the Sicilian wall lizard and both populations of the Aeolian wall lizard for mildly deleterious mutations (missense mutations) and highly deleterious mutations (nonsense and high-GERP-score mutations). If purging is more efficient for highly deleterious mutations, the percentage of difference between the number of highly deleterious mutations between large and small populations should be higher than that for mildly deleterious mutations (Grossen et al. 2020; Khan et al. 2021). Finally, we computed the average coefficient of deleteriousness of mutations based on GERP scores for the three groups of individuals. In case of purging, lower values of this average are expected in small populations where highly deleterious mutations have been eliminated (Dussex et al. 2021).
Statistical analyses
The two-tailed Wilcoxon test was used to compare means in the “Genetic load estimation” section. Considering the very low number of tests we performed, and the highly significant p-values obtained when the null hypothesis was rejected, no correction for multiple testing was applied. We compared the distributions of GERP scores between the three groups using Kruskal–Wallis and Kolmogorov–Smirnov tests, and we compared the proportion of high-GERP-score mutations (mutations with a GERP score larger than 4) using a χ2 test.
Research ethics statement
Sampling authorization was given under Prot. 45382, Rif. 35644/2019.
Results
Mean individual sequencing depth was higher than 10× for all individuals except for two Italian wall lizard individuals, which were sequenced at a lower depth, as they only act as outgroups. We obtained an average sequencing depth of 18.7× for the Aeolian wall lizard (22.5× and 15.3× for the La Canna and Strombolicchio individuals, respectively) and 13.7× for the Sicilian wall lizard.
Genomic variation and population structure
The final dataset included 38 individuals (32 individuals sequenced in this study and six genomes from the literature; SI Appendix, Table S1) from four species: the Aeolian wall lizard (P. raffonei), the Sicilian wall lizard (P. waglerianus), the Italian wall lizard (P. siculus), and the Tyrrhenian wall lizard (P. tiliguerta). Thirty-nine millions of Single Nucleotide Polymorphisms (SNPs) were identified with a mean genotyping rate of 0.95: 10,810 biallelic SNPs were identified in La Canna, 335,876 in Strombolicchio, and 17 million in the Sicilian wall lizard (SI Appendix, Table S3). Single individuals confirmed this pattern, with Aeolian wall lizards showing much fewer heterozygous sites compared to Sicilian wall lizards, especially in La Canna (SI Appendix, Fig. S3). The average genome-wide heterozygosity was extremely low for both Aeolian wall lizard populations when compared to other eukaryotic species, with La Canna showing the lowest value documented so far (3.4 × 10−6, i.e., one SNP every 300 kb on average; Fig. 1C; SI Appendix, Fig. S4). Similar differences were found when considering only protein-coding regions, with mean individual heterozygous counts of 94, 2703, and 93,902 in La Canna, Strombolicchio, and the Sicilian wall lizard, respectively. The extremely low genetic variation in La Canna and Strombolicchio was also observed in DNA regions where diversity is typically maintained by balancing selection. Specifically, one gene region from the class I and one from the class II Major Histocompatibility Complex (MHC) pathways, known to be highly variable in Podarcis lizards (Gaczorek et al. 2023), showed 22 and 13 polymorphic sites, respectively, in the Sicilian wall lizard, 6 and 0 polymorphic sites in Strombolicchio, and 0 polymorphic sites in both genes in La Canna.
The Principal Component Analysis revealed a clear separation between species and between La Canna and Strombolicchio individuals (SI Appendix, Fig. S5). Admixture analyses confirmed that species and populations were clearly separated, with no individuals showing mixed ancestry (SI Appendix, Fig. S6). Additionally, a phylogenetic reconstruction revealed that La Canna and Strombolicchio form two well-separated clades, characterized by shorter branches indicative of lower genetic diversity compared to the other species (Fig. 1D).
Inbreeding estimation and demographic history
The proportion of the genome in Runs of Homozygosity (FROH) was about 0.98 in La Canna and 0.48 in Strombolicchio (Fig. 2A), suggesting extreme and high inbreeding levels in the two islets, respectively. La Canna and Strombolicchio populations had very different ROH length distributions (Fig. 2B), with an average ROH length of 50 Mb and 1.5 Mb in La Canna and Strombolicchio, respectively, while the Sicilian wall lizard was characterized by less than 100 regions in ROH that were always shorter than 300 kb and a FROH close to 0 (SI Appendix, Fig. S7).
A FROH measured as the proportion of the genome in ROH longer than 1 Mb in the Aeolian wall lizard from La Canna (LC) and Strombolicchio (ST), and in the Sicilian wall lizard (SI). B Percentage of the genome in different lengths of ROH fragments for each individual of the La Canna (LC) and Strombolicchio (ST) populations. C Ancient Ne variation through time as inferred from MSMC analyses, for the Aeolian wall lizard from La Canna (LC) and Strombolicchio (ST), and for the Sicilian wall lizard (SI). D Recent Ne estimations from the proportion of the genome in different lengths of ROH fragments, for the Aeolian wall lizard from La Canna (LC) and Strombolicchio (ST).
The population size dynamics inferred by MSMC indicated a large population size for the Sicilian wall lizard (with some oscillation in the ancient past) and a constant demographic decline in both Aeolian lizard populations, particularly in La Canna (Fig. 2C). Around 200 kya, after a shared demographic decline, the trajectories of the Sicilian and the Aeolian wall lizard started diverging: the decline ended and likely reversed in the Sicilian wall lizard, with population sizes at high values (105–106), and the Aeolian wall lizard in Strombolicchio continued to decline gradually until 2 kya. Sometime earlier, around 20 kya, the shallow coalescent tree for the Aeolian wall lizard in La Canna allows the MSMC method to start the inference of the demographic dynamics of the individuals sampled on this rock: after an initial and intense drop, possibly due to a methodological artifact (Li and Durbin 2011), the population size decreased more gradually, paralleling the dynamics of Strombolicchio, but with smaller absolute values. The orders of magnitude of the effective population sizes one thousand years ago are estimated at 102, 103, and 106 in La Canna, Strombolicchio, and the Sicilian wall lizard, respectively. Interpretation of these absolute numbers requires great caution (e.g., the mutation rate is approximate), but they correspond to the estimated relative population sizes of these three groups of individuals. Recent effective population sizes in La Canna and Strombolicchio estimated from ROH patterns suggested even smaller population sizes in the last 500 years, with an initial decline to a modern effective population size of very few individuals (Fig. 2D, SI Appendix, Table S4). The divergence between the populations on La Canna and Strombolicchio was estimated using the SMC++ approach to have occurred approximately 14,000 years ago. However, given that both La Canna and Strombolicchio were likely colonized from the larger and geographically distant islands of Filicudi and Stromboli, respectively (see Fig. 1A), where the species is now extinct, this divergence time should not be interpreted as the time of colonization or separation of these two small islets. Rather, it provides a temporal framework for understanding the broader colonization history of the archipelago.
Genetic load and purging
Genetic load was investigated using two distinct approaches based on genomic annotations (SnpEff) and evolutionary conservation (GERP), at the individual and population levels. At the individual level, assuming that most deleterious mutations are recessive, homozygous and heterozygous genotype counts of derived deleterious mutations can be used as indicators of the realized genetic load (expressed and reducing the fitness of an individual) and the masked genetic load (unexpressed and not affecting individual fitness), respectively. The Aeolian and the Sicilian wall lizards showed opposite patterns at the genetic load partitions (i.e., masked and realized), consistent with their different population sizes. Regardless of the approach used to identify deleterious mutations (missense, nonsense, high-GERP-score, or combination of these criteria), the Aeolian wall lizard showed approximately 30–900 times less masked load than the Sicilian wall lizard, and both Aeolian wall lizard populations showed approximately twice the realized load observed in the Sicilian wall lizard (Fig. 3 and SI Appendix, Tables S5 and S6). The masked load was much lower in La Canna than in Strombolicchio (Wilcoxon test, p-value = 1.22e−4, 1.16e−4, and 1.24e−4 for missense, nonsense, and high-GERP-score mutations, respectively), reflecting the different levels of genomic variation. Conversely, the realized load did not significantly differ between the two populations (p-value = 0.22, 0.27, and 0.27 for missense, nonsense, and high-GERP-score mutations, respectively).
Number of genotypes for different classes of putatively deleterious mutations: missense (slightly deleterious mutations) and nonsense (highly deleterious mutations), classified using SnpEff, and high-GERP-score mutations (highly deleterious mutations), estimated from GERP (GERP score>4). Upper panels are heterozygous counts (proxies for the masked genetic load), and lower panels are homozygous derived counts (proxies for the realized genetic load).
To estimate the genetic load at the population level, thus considering how much individuals tend to share deleterious mutations, we computed the number of deleterious mutations at high frequency within the population (derived allele frequency higher than 0.9), following Mathur et al. (2023a). The population genetic load was higher in the Aeolian wall lizard than in the Sicilian wall lizard (5–6 times higher depending on the class of mutation considered), and very similar in the La Canna and Strombolicchio populations (SI Appendix, Table S7).
Finally, we found some partial evidence of purging in the small populations of the Aeolian wall lizard when compared to the much larger population of the Sicilian wall lizard. First, the total number of deleterious alleles was higher in Sicilian wall lizards compared to Aeolian wall lizards for all classes of putatively deleterious mutations and was lower in La Canna than in Strombolicchio (SI Appendix, Fig. S8). More explicitly, the number of missense, nonsense, and high-GERP-score mutations decreased respectively by 3.9%, 6.6%, and 9.2% from the Sicilian wall lizard to the Aeolian wall lizard in Strombolicchio, and respectively by 5.9%, 7.9%, and 9.8% from the Sicilian wall lizard to the Aeolian wall lizard in La Canna. Considering that the number of derived synonymous mutations in both Strombolicchio and La Canna was 5.0% and 4.4% lower than in the Sicilian wall lizard, respectively (this reduction is not expected theoretically under nucleotide independence, but has been observed in other studies (Kleinman-Ruiz et al. 2022) and it can be related to linkage effects), the higher reductions for the nonsense and high-GERP-score mutations can be interpreted as evidence of purging. Second, the difference in the total number of deleterious alleles between La Canna or Strombolicchio and the Sicilian wall lizard was lower for mildly deleterious mutations (missense mutations) compared to highly deleterious mutations (nonsense or high-GERP-score mutations; SI Appendix, Fig. S9, SI Appendix, Table S8). On the other hand, the expectation that under purging the Aeolian wall lizard should show a larger depletion of derived alleles with high GERP scores when compared to the Sicilian wall lizard is only partially met (SI Appendix, Fig. S10). The frequency distributions of GERP scores differed significantly among the three groups (pairwise two-sample Kolmogorov–Smirnov tests, p < 0.0001 for all comparisons; SI Appendix, Fig. S10), as well as the mean scores (Kruskal–Wallis test, p < 0.0001) and the proportion of variants with GERP values higher than 4 (chi-square test: χ2 = 78.393, d.f. = 2, P < 0.0001). The differences are, however, very weak: the average GERP score was 1.68, 1.69, and 1.70 in the Aeolian wall lizard from La Canna and Strombolicchio, and in the Sicilian wall lizard, respectively, and the proportion of alleles with GERP scores larger than 4 increased from 4.96% to 5.04% to 5.24% in the same three groups. Statistically significant differences are clearly obtained considering that very large sample sizes are involved, but considering also the errors associated with the GERP score, the biological significance of these differences is probably minor.
Discussion
The Aeolian and Sicilian wall lizard system appears as an extreme and ideal natural experiment to empirically explore the biological limits of genetic variation and load, and their relationship, in viable populations. The genomes of three groups of individuals were sequenced: Aeolian wall lizards from La Canna, Aeolian wall lizards from Strombolicchio, and Sicilian wall lizards. Assuming a genomic variation level of one in La Canna individuals, this value increases to 40 in Strombolicchio and 1400 in Sicily. The estimated inbreeding coefficient was close to 1, 0.5, and 0 in La Canna, Strombolicchio, and Sicily, respectively. The reconstructed demographic trajectories of the lizard populations in Strombolicchio and La Canna are compatible with the estimated ages of these islands. Individual genomes from Strombolicchio indicate an ancient decline between 1 Mya and 200–300 kya, shared with the sister species (the Sicilian wall lizard) and possibly related to Pleistocene climatic events affecting the pre-colonization ancestors. Since then, the Aeolian wall lizard in Strombolicchio continued to decline, contrary to what happened to the Sicilian wall lizard. This decline is likely related to the erosion of genetic variation that occurred during and after the colonization of this island, which emerged around 200 kya (Francalanci et al. 2013). The very low level of genomic variation observed in La Canna allows only to reconstruct the demographic trend for a more recent time window. Starting a few tens of thousands of years ago, the effective population size shows a rapid decline, compatible with a recent colonization after the emergence of this volcanic neck approximately 30 thousand years ago (Lucchi et al. 2013). The extremely limited area of suitable habitat likely imposed a strong and persistent constraint on population size thereafter (Salvi 2023). The complex history of divergence and colonization across the islands and islets is difficult to reconstruct, especially considering that the species, once likely widespread throughout the Aeolian archipelago (Paris et al. 2024), is now restricted to just three tiny islets (Scoglio Faraglione, Strombolicchio, and La Canna) and a narrow peninsula (Capo Grosso), and that only two of these populations were included in our study. However, the estimated divergence time of 14,000 years between La Canna and Strombolicchio supports a colonization followed by long-term isolation and strong genetic drift, leading to the observed genetic structure.
The relationship between genetic variation and load
Recent empirical studies have gathered evidence for an increased genetic load in small and endangered populations across various species (Humble et al. 2023; Quinn et al. 2024). Small populations, usually characterized by low genetic variation and high inbreeding levels, tend to be characterized by an elevated realized load, whereas larger populations may experience an elevated masked genetic load (Smeds and Ellegren 2023; Humble et al. 2023). Our results partially support this growing body of evidence as far as the two sister species of lizard (the Aeolian wall lizard and the Sicilian wall lizard) are compared. However, when comparing the two Aeolian wall lizard populations, the genetic load was not inversely correlated to the population size and the level of genomic variation, and was not directly related to the inbreeding coefficient. Despite their vastly different inbreeding levels and effective population sizes, individuals from La Canna and Strombolicchio exhibit, on average, similar numbers of deleterious homozygous sites and similar numbers of frequent deleterious mutations. The extent of genetic load is shaped by multiple aspects of demographic history, including inbreeding dynamics, the presence of ancestral masked load, and the potential for purging (Pekkala et al. 2012; Kyriazis et al. 2021). Still, theoretical and simulation studies have shown that small populations accumulate realized load and fix deleterious mutations more readily, largely due to reduced efficacy of selection (Bertorelle et al. 2022; Dussex et al. 2023). Our empirical data from La Canna and Strombolicchio do not fully align with this expectation and suggest that the fraction of genetic load directly affecting individual fitness (the realized genetic load) may approach a plateau in these two islets, potentially reflecting an upper bound compatible with population persistence. One possible explanation is that individuals in Strombolicchio and especially in La Canna already carry a high burden of homozygous deleterious mutations, such that additional increases in realized load may be poorly tolerated.
If this is true, we can speculate that both populations on Strombolicchio and La Canna are close to a mutational meltdown, even though their levels of variation differ, and that extinction occurred on other islands or rocks where the load threshold had been exceeded. Clearly, we do not propose a universal or absolute threshold. The levels of realized load observed in La Canna and Strombolicchio likely reflect context-specific tolerances shaped by the ecological simplicity and extreme isolation of these islets. In such settings, reduced predation, interspecific competition, or pathogen pressure may buffer the fitness impacts of some deleterious variants. It is therefore possible that similar or even lower levels of load could have more severe fitness consequences in more complex or stressful environments. This underscores the importance of integrating both genetic and ecological perspectives when interpreting genetic load in terms of extinction risk.
The existence of two small populations that differ by a factor of 40 in genomic diversity, with the smaller one being almost an inbred strain, yet exhibiting comparable realized load, raises a fundamental question: what are the relative roles of genetic variation and genetic load in the persistence or extinction of populations? Clearly, both factors are important: the correlations between genetic variation and adaptive potential, and between genetic load and reduced fitness, are well-supported both theoretically and empirically. However, if we extend our results to other systems, we should conclude that the absence of variation can be better tolerated in small populations than increased genetic load, and historical extinctions due to genetic factors might have been driven more often by excessive genetic load than by low genetic variation. The persistence of reproductively active populations with very low genomic diversity (Robinson et al. 2016, 2022), some evidence of a correlation between load and extinction (Rogers and Slatkin 2017), and the weak correlation between genetic variation and IUCN Red List categories (Schmidt et al. 2023) are compatible with this idea, but further empirical, theoretical, and simulation-based testing of this hypothesis across different biological systems is required.
Populations or species with low or very low levels of genomic variation may still persist due to the presence of a few key genes with substantial levels of variation. This has been observed for immune system and/or olfactory receptor genes in the Apennine brown bear (Benazzo et al. 2017) or the Channel Island fox (Robinson et al. 2016). Genetic variation at the MHC regions we analyzed was absent in the La Canna population and very limited in Strombolicchio but was relatively high in the same genomic regions in the Sicilian wall lizard. This analysis, while acknowledging the challenges associated with using genomic data to characterize complex MHC loci, suggests that these populations may have persisted despite limited diversity at some of the genes typically among the most variable and subject to balancing selection in vertebrates. The isolation and tiny size of these islets potentially limit the arrival of new pathogens, thereby reducing selective pressure on immune-related genes and reducing the importance of maintaining variation at these loci.
Challenges in estimating genetic load
Genomics predictions of mutation deleteriousness have large potential for the study and the management of endangered species (van Oosterhout 2020; Bertorelle et al. 2022; Dussex et al. 2023). However, at least three types of warning should be considered: biological, bioinformatical, and statistical. Biologically, (i) the dominance architecture of most mutations of interest is not known, and deleterious mutations are usually considered fully recessive (Bertorelle et al. 2022); (ii) rare variants specific to a lineage may be adaptive and not necessarily deleterious, even if absent in large multispecies alignments (Pagel et al. 2017); and (iii) predicted deleterious variants in genes with low expression rate may have minor effects on individual fitness (Trucchi et al. 2025). Bioinformatically, the number of deleterious alleles may depend on (i) the filtering approach; (ii) the sample depth of coverage; and (iii) the distance between the focal species and outgroups, the latter of which affects the total number of derived alleles. Statistically, (i) reliable selection coefficients for each mutation are very difficult to obtain, and their proxies, such as GERP scores, are affected by errors (Huber et al. 2020); and (ii) false positives (i.e., neutral variants classified as deleterious) may confound the pattern. Regarding false positives, a well-known issue when thousands of SNPs are analyzed, empirical studies have shown that GERP scores correlate with experimentally measured fitness effects in humans and dogs, supporting their general utility for identifying deleterious variants (Henn et al. 2015; Marsden et al. 2016). It is, however, possible that in the case of strong purging effects that drastically reduce the number of true positives, differences or similarity between the load estimated in different populations may be only affected by the number of false positives.
More standardized load metrics and functional studies to validate in vivo or in vitro the negative impact of variants bioinformatically estimated are needed to increase the reliability of load estimates. Also, future work incorporating simulations or experimental validation will be essential to better understand the impact of false positives under different demographic and selective scenarios. All these improvements in estimating genetic load from genomic data will be crucial to test our hypothesis regarding the existence of a maximum load threshold, whose absolute value likely depends on the species’ ecology as well as its demographic and evolutionary history, and which may influence population persistence more directly than overall levels of genetic variation.
Practical implications
If supported by additional analyses and broader datasets, the results of our study could have important implications for the conservation of endangered species. Preventing genetic erosion to enable adaptation to changing environments must always be pursued, but conservation strategies should also prioritize the protection of populations or species with intermediate levels of variation yet substantial realized load. Focusing on the Aeolian wall lizard populations, conservation efforts should be equally prioritized for La Canna and Strombolicchio, as both groups appear to be approaching a potential load threshold at which any additional genetic or environmental pressure could disproportionately increase extinction risk. The survival of this species depends, of course, also on the other two known populations in the archipelago, where genomes should be urgently sequenced. However, the wild and almost clonal population of La Canna should be strictly protected and closely monitored to better understand the relationship between genomic patterns, phenotypes, and extinction risks.
Data availability
Raw sequencing data are available under NCBI BioProject PRJNA1089471.
References
Alexander DH, Novembre J, Lange K (2009) Fast model-based estimation of ancestry in unrelated individuals. Genome Res 19:1655–1664.
Google Scholar
Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ (1990) Basic local alignment search tool. J Mol Biol 215:403–410.
Google Scholar
Andrews S (2010) FastQC: a quality control tool for high throughput sequence data. http://www.bioinformatics.babraham.ac.uk/projects/fastqc/.
Benazzo A, Trucchi E, Cahill JA, Delser PM, Mona S, Fumagalli M et al. (2017) Survival and divergence in a small group: the extraordinary genomic history of the endangered Apennine brown bear stragglers. Proc Natl Acad Sci USA 114:E9589–E9597.
Google Scholar
Bertorelle G, Raffini F, Bosse M, Bortoluzzi C, Iannucci A, Trucchi E et al. (2022) Genetic load: genomic estimates and applications in non-model animals. Nat Rev Genet 23:492–503.
Google Scholar
Bijlsma R, Loeschcke V (2012) Genetic erosion impedes adaptive responses to stressful environments. Evol Appl 5:117–129.
Google Scholar
Bowles P (2024). Podarcis waglerianus. The IUCN Red List of Threatened Species 2024:e.T61557A137856508.
Capula M, Lo Cascio P (2011) Podarcis raffonei (Mertens, 1952). In: Fauna d’Italia, Reptilia. Calderini de Il Sole 24 ore, Bologna, pp 401–407.
Cascio PL, Corti C (2006) The micro-insular distribution of the genus Podarcis within the Aeolian Archipelago: historical vs. palaeogeographical interpretation. In: Mainland and insular lizards: a Mediterrean perspective. Firenze University Press, 1–12.
Charlesworth B, Charlesworth D (1999) The genetic basis of inbreeding depression. Genet Res 74:329–340.
Google Scholar
Cingolani P, Platts A, Wang LL, Coon M, Nguyen T, Wang L et al. (2012) A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w 1118; iso-2; iso-3. Fly 6:80–92.
Crow JF (1970) Genetic loads and the cost of natural selection. In: Kojima K (ed) Mathematical topics in population genetics, Biomathematics. Springer, Berlin, pp 128–177.
Danecek P, Auton A, Abecasis G, Albers CA, Banks E, DePristo MA et al. (2011) The variant call format and VCFtools. Bioinformatics 27:2156–2158.
Google Scholar
Danecek P, Bonfield JK, Liddle J, Marshall J, Ohan V, Pollard MO et al. (2021) Twelve years of SAMtools and BCFtools. GigaScience 10:giab008.
Google Scholar
Davydov EV, Goode DL, Sirota M, Cooper GM, Sidow A (2010) Identifying a high fraction of the human genome to be under selective constraint using GERP++. PLoS Comput Biol 6:13.
Google Scholar
Derrien T, Estellé J, Sola SM, Knowles DG, Raineri E, Guigó R et al. (2012) Fast computation and applications of genome mappability. PLoS ONE 7:e30377.
Google Scholar
Dussex N, Morales HE, Grossen C, Dalén L, van Oosterhout C (2023) Purging and accumulation of genetic load in conservation. Trends Ecol Evol 38:961–969.
Google Scholar
Dussex N, van der Valk T, Morales HE, Wheat CW, Díez-del-Molino D, von Seth J et al. (2021) Population genomics of the critically endangered kākāpō. Cell Genomics 1:100002.
Google Scholar
Ficetola GF, Garner TWJ, Wang J, De Bernardi F (2011) Rapid selection against inbreeding in a wild population of a rare frog. Evol Appl 4:30–38.
Google Scholar
Ficetola GF, Barzaghi B, Melotto A, Muraro M, Lunghi E, Canedoli C et al. (2018) N -mixture models reliably estimate the abundance of small vertebrates. Sci Rep 8:10357.
Google Scholar
Ficetola GF, Silva-Rocha I, Carretero MA, Vignoli L, Sacchi R, Melotto A et al. (2021) Status of the largest extant population of the critically endangered Aeolian lizard Podarcis raffonei (Capo Grosso, Vulcano island). PLoS ONE 16:e0253631.
Google Scholar
Francalanci L, Lucchi F, Keller J, De Astis G, Tranne CA (2013) Chapter 13 Eruptive, volcano-tectonic and magmatic history of the Stromboli volcano (north-eastern Aeolian archipelago). Geol Soc Lond Mem 37:397–471.
Google Scholar
Gabrielli M, Benazzo A, Biello R, Ancona L, Fuselli S, Iannucci A et al. (2023) A high-quality reference genome for the critically endangered Aeolian wall lizard, Podarcis raffonei. J Hered 114:279–285.
Google Scholar
Gaczorek TS, Chechetkin M, Dudek K, Caeiro-Dias G, Crochet P-A, Geniez P (2023) Widespread introgression of MHC genes in Iberian Podarcis lizards Mol Ecol 32:4003–4017.
Google Scholar
Gearty W, Jones LA (2023) rphylopic: an R package for fetching, transforming, and visualising PhyloPic silhouettes. Methods Ecol Evol 14:2700–2708.
Google Scholar
Gippoliti S, Capula M, Ficetola GF, Salvi D, Andreone F (2017) Threatened by Legislative Conservationism? The case of the critically endangered Aeolian lizard. Front Ecol Evol 5:130.
Google Scholar
Grossen C, Guillaume F, Keller LF, Croll D (2020) Purging of highly deleterious mutations through severe bottlenecks in Alpine ibex. Nat Commun 11:1001.
Google Scholar
Henn BM, Botigué LR, Bustamante CD, Clark AG, Gravel S (2015) Estimating the mutation load in human genomes. Nat Rev Genet 16:333–343.
Google Scholar
Huber CD, Kim BY, Lohmueller KE (2020) Population genetic models of GERP scores suggest pervasive turnover of constrained sites across mammalian evolution. PLoS Genet 16:e1008827.
Google Scholar
Humble E, Stoffel MA, Dicks K, Ball AD, Gooley RM, Chuven J et al. (2023) Conservation management strategy impacts inbreeding and mutation load in scimitar-horned oryx. Proc Natl Acad Sci USA 120:e2210756120.
Google Scholar
Khan A, Patel K, Shukla H, Viswanathan A, van der Valk T, Borthakur U et al. (2021) Genomic evidence for inbreeding depression and purging of deleterious genetic variation in Indian tigers. Proc Natl Acad Sci USA 118:e2023018118.
Google Scholar
Kleinman-Ruiz D, Lucena-Perez M, Villanueva B, Fernández J, Saveljev AP, Ratkiewicz M et al. (2022) Purging of deleterious burden in the endangered Iberian lynx. Proc Natl Acad Sci USA 119:e2110614119.
Google Scholar
Kuhn RM, Haussler D, Kent WJ (2013) The UCSC genome browser and associated tools. Brief Bioinform 14:144–161.
Google Scholar
Kyriazis CC, Wayne RK, Lohmueller KE (2021) Strongly deleterious mutations are a primary determinant of extinction risk due to inbreeding depression. Evol Lett 5:33–47.
Google Scholar
Li H, Durbin R (2009) Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25:1754–1760.
Google Scholar
Li H, Durbin R (2011) Inference of human population history from individual whole-genome sequences. Nature 475:493–496.
Google Scholar
Lucchi F, Santo AP, Tranne CA, Peccerillo A, Keller J (2013) Chapter 8 Volcanism, magmatism, volcano-tectonics and sea-level fluctuations in the geological history of Filicudi (western Aeolian archipelago). Geol Soc Lond Mem 37:113–153.
Google Scholar
Marsden CD, Ortega-Del Vecchyo D, O’Brien DP, Taylor JF, Ramirez O, Vilà C (2016) Bottlenecks and selective sweeps during domestication have increased deleterious genetic variation in dogs Proc Natl Acad Sci USA 113:152–157.
Google Scholar
Mathur S, Mason AJ, Bradburd GS, Gibbs HL (2023a) Functional genomic diversity is correlated with neutral genomic diversity in populations of an endangered rattlesnake. Proc Natl Acad Sci USA 120:e2303043120.
Google Scholar
Mathur S, Tomeček JM, Tarango-Arámbula LA, Perez RM, DeWoody JA (2023b) An evolutionary perspective on genetic load in small, isolated populations as informed by whole genome resequencing and forward-time simulations. Evolution 77:690–704.
Google Scholar
McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A et al. (2010) The genome analysis toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res 20:1297–1303.
Google Scholar
Morin PA, Archer FI, Avila CD, Balacco JR, Bukhman YV, Chow W et al. (2021) Reference genome and demographic history of the most endangered marine mammal, the vaquita. Mol Ecol Resour 21:1008–1020.
Google Scholar
Muller HJ (1950) Our load of mutations. Am J Hum Genet 2:111.
Google Scholar
Ochoa A, Gibbs HL (2021) Genomic signatures of inbreeding and mutation load in a threatened rattlesnake. Mol Ecol 30:5454–5469.
Google Scholar
van Oosterhout C (2020) Mutation load is the spectre of species conservation. Nat Ecol Evol 4:1004–1006.
Google Scholar
Pagel KA, Pejaver V, Lin GN, Nam H-J, Mort M, Cooper DN et al. (2017) When loss-of-function is loss of function: assessing mutational signatures and impact of loss-of-function genetic variants. Bioinformatics 33:i389–i398.
Google Scholar
Paris JR, Ficetola GF, Ferrer Obiol J, Silva-Rocha I, Carretero MA, Salvi D (2024) Does hybridization with an invasive species threaten Europe’s most endangered reptile? Genomic assessment of Aeolian lizards on Vulcano Island. iScience 27:111097.
Google Scholar
Pečnerová P, Lord E, Garcia-Erill G, Hanghøj K, Rasmussen MS, Meisner J et al. (2024) Population genomics of the musk ox’ resilience in the near absence of genetic variation. Mol Ecol 33:e17205.
Google Scholar
Pekkala N, Knott KE, Kotiaho JS, Nissinen K, Puurtinen M (2012) The benefits of interpopulation hybridization diminish with increasing divergence of small populations. J Evol Biol 25:2181–2193.
Google Scholar
Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MAR, Bender D et al. (2007) PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet 81:559–575.
Google Scholar
Quinlan AR, Hall IM (2010) BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26:841–842.
Google Scholar
Quinn CB, Preckler-Quisquater S, Buchalski MR, Sacks BN (2024) Whole genomes inform genetic rescue strategy for montane red foxes in North America. Mol Biol Evol 41:msae193.
Google Scholar
Renaud G, Hanghøj K, Korneliussen TS, Willerslev E, Orlando L (2019) Joint estimates of heterozygosity and runs of homozygosity for modern and ancient samples. Genetics 212:587–614.
Google Scholar
Robinson JA, Ortega-Del Vecchyo D, Fan Z, Kim BY, vonHoldt BM, Marsden CD et al. (2016) Genomic flatlining in the endangered island fox. Curr Biol 26:1183–1189.
Google Scholar
Robinson JA, Kyriazis CC, Nigenda-Morales SF, Beichman AC, Rojas-Bracho L, Robertson KM et al. (2022) The critically endangered vaquita is not doomed to extinction by inbreeding depression. Science 376:635–639.
Google Scholar
Rogers RL, Slatkin M (2017) Excess of genomic defects in a woolly mammoth on Wrangel Island. PLoS Genet 13:e1006601.
Google Scholar
Roscito JG, Sameith K, Pippel M, Francoijs K-J, Winkler S, Dahl A et al. (2018) The genome of the tegu lizard Salvator merianae: combining Illumina, PacBio, and optical mapping data to generate a highly contiguous assembly. GigaScience 7:giy141.
Google Scholar
Roscito JG, Sameith K, Kirilenko BM, Hecker N, Winkler S, Dahl A et al. (2022) Convergent and lineage-specific genomic differences in limb regulatory elements in limbless reptile lineages. Cell Rep 38:110280
Google Scholar
Salvi D (2023) Climbing on the La Canna Volcanic Sea Stack to obtain first-hand data on the tiniest population of the critically endangered Aeolian wall lizard Podarcis raffonei. Animals 13:2289.
Google Scholar
Salvi D, Corti C, Ficetola GF, Vignoli L (2024) Podarcis raffonei. The IUCN Red List of Threatened Species 2024:e.T61552A137855910.
Schiffels S, Wang K (2020) MSMC and MSMC2: the multiple sequentially Markovian coalescent. In: Dutheil JY (ed) Statistical population genomics, Methods in molecular biology. Springer US, New York, NY, pp 147–166.
Schmidt C, Hoban S, Hunter M, Paz-Vinas I, Garroway CJ (2023) Genetic diversity and IUCN Red List status. Conserv Biol 37:e14064.
Google Scholar
Von Seth J, Dussex N, Díez-del-Molino D, Van Der Valk T, Kutschera VE, Kierczak M et al. (2021) Genomic insights into the conservation status of the world’s last remaining Sumatran rhinoceros populations. Nat Commun 12:2393.
Google Scholar
Smeds L, Ellegren H (2023) From high masked to high realized genetic load in inbred Scandinavian wolves. Mol Ecol 32:1567–1580.
Google Scholar
Song B, Marco-Sola S, Moreto M, Johnson L, Buckler ES, Stitzer MC (2022) AnchorWave: Sensitive alignment of genomes with high sequence diversity, extensive structural polymorphism, and whole-genome duplication. Proc Natl Acad Sci USA 119:e2113075119.
Google Scholar
Terhorst J, Kamm JA, Song YS (2017) Robust and scalable inference of population history from hundreds of unphased whole-genomes. Nat Genet 49:303–309.
Google Scholar
Thompson EA (2013) Identity by descent: variation in meiosis, across genomes, and in populations. Genetics 194:301–326.
Google Scholar
Trucchi E, Massa P, Giannelli F, Latrille T, Gargano M, Fernandes FAN et al. (2025) High geneexpression predicts extremely low segregation of deleterious mutations in large penguin populations. Mol Biol Evol 42:msaf146.
Google Scholar
Wang K, Mathieson I, O’Connell J, Schiffels S (2020) Tracking human population structure through time from whole genome sequences. PLoS Genet 16:e1008552.
Google Scholar
Yang W, Feiner N, Pinho C, While GM, Kaliontzopoulou A, Harris DJ et al. (2021) Extensive introgression and mosaic genomes of Mediterranean endemic lizards. Nat Commun 12:2762.
Google Scholar
Acknowledgements
We thank the mountain guide Lorenzo Inzigneri for his help in the sampling in La Canna. We thank Silvia Fuselli for her important contribution to the analysis of MHC variation. This work was supported by the University of Ferrara (Italy) and funded by the MIUR PRIN 2017 grant 201794ZXTL to GB.
Funding
Open access funding provided by Università degli Studi di Ferrara within the CRUI-CARE Agreement.
Author information
Authors and Affiliations
Contributions
Designed research: ET and GB. Performed research: MG, GB, AB, ET, RB, CC, DS, and AI. Analyzed data: MG and GB. Wrote the initial draft: MG. Wrote the paper: MG and GB. Interpreted the results: MG, AB, ET, DS, and GB. Discussed and approved the final manuscript: all the authors.
Corresponding authors
Ethics declarations
Competing interests
Giorgio Bertorelle is an associate editor for Heredity.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Associate editor: Aurora Ruiz-Herrera.
Supplementary information
Supplementary Information (download DOCX )
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Reprints and permissions
About this article
Cite this article
Gabrielli, M., Benazzo, A., Biello, R. et al. The relationship between genomic variation and genetic load: insights from small island populations.
Heredity (2026). https://doi.org/10.1038/s41437-026-00835-8
Received:
Revised:
Accepted:
Published:
Version of record:
DOI: https://doi.org/10.1038/s41437-026-00835-8
Source: Ecology - nature.com
