in

Mitogenome analyses elucidate the evolutionary relationships of a probable Eocene wet tropics relic in the xerophile lizard genus Acanthodactylus

The following museum acronyms are used (mostly following55):

BMNH, NHM, BM: Natural History Museum, London, formerly British Museum of Natural History (UK);

CAS: California Academy of Sciences, San Francisco (USA);

MHNC: Musée d’histoire naturelle, La Chaux-de-Fonds (CH);

MHNG: Muséum d’Histoire Naturelle, Genève (CH);

MNHN: Muséum national d’Histoire naturelle, Paris (F);

SMNS: Staatliches Museum für Naturkunde, Stuttgart (D);

UWBM The Washington State Museum of Natural History and Culture/Burke Museum, University of Washington (USA);

ZMB: Museum für Naturkunde Berlin, formerly Zoologisches Museum Berlin (D).

Study organism and taxonomic background

Acanthodactylus was first described as a subgenus of Lacerta Cuvier (sic)56, the type species is A. boskianus (Daudin, 1802). The few meristic characters described to be diagnostic for Acanthodactylus56 include: collar connected in the center but free on the sides, tempora squamosal, i.e., temporal region covered by small scales rather than larger shields, ventral scales rectangular and arranged in longitudinal rows, digits acutely fimbriate-denticulate forming “toe fringes”. Fringed toes have evolved in various shapes multiple times in lizards and in Lacertidae, and when triangular, projectional or conical as in Acanthodactylus they are commonly seen as adaptation to windblown sand substrate57.

The enigmatic Acanthodactylus guineensis is among the lesser-known species of the Lacertidae. Only limited information is available regarding the species’ morphology, habitat and distribution (see Meinig & Böhme44 for a review and references therein; and 34, 38, 45, 58). Based on one young specimen, A. guineensis was described as member of the genus Eremias Fitzinger, 1834 by Boulenger40. Boulenger does not mention the slightly projecting third row of scales around the toes and fingers (“fringed toes”) of the type specimen in his quite comprehensive description40, probably because he did not notice it in such a small (SVL 24 mm) specimen and because the projection is indeed rather minor compared to other members of the genus. This is probably also the reason why he did not place guineensis in the genus Acanthodactylus Wiegmann, 1834 but in Eremias Fitzinger, 1834 (in56).

About 30 years later, Boulenger revised the genus Eremias and divided it into five sections he assumed to be natural associations59: (1) Taenieremias Boulenger, 1918—monotypic, type species Eremias guineensis Boulenger, 1887 (currently Acanthodactylus guineensis); (2) Lampreremias Boulenger, 1918—type species Eremias nitida Günther, 1872 (currently Heliobolus nitidus); (3) Pseuderemias Boettger, 1883—type species Eremias mucronata (Blandford, 1870) (currently Pseuderemias mucronata); (4) Mesalina Gray, 1838—type species Eremias rubropunctata (Lichtenstein, 1823) (currently Mesalina rubropunctata); (5) Eremias s. str. Fitzinger, 1834—type Eremias velox (Pallas, 1771) (currently Eremias velox).

Monard47 described Eremias (Taenieremias) benuensis from Cameroon based on a few minor morphological differences compared to E. guineensis, but in 1969 E. benuensis was synonymized with E. guineensis60.

Salvador17 and one year later also Arnold18 in their respective major revisions of the genus Acanthodactylus both found that Eremias (Taenieremias) guineensis agrees with all the characteristic morphological features of Acanthodactylus with the exception of the arrangement of scales around the nostril. However, it was suggested that the E. guineensis condition with an extra suture across the area occupied by the first upper labial scale to produce a smaller first upper labial is easily derived from that found in Acanthodactylus, evidenced by BMNH 1966.430, a juvenile A. erythrurus lineomaculatus (sic) with a similar scale arrangement18. Within Acanthodactylus, both authors placed guineensis in the Western clade and in the Acanthodactylus erythrurus group which was assumed to consist of A. erythrurus, A. savignyi, A. boueti, A. guineensis, and A. blanci which was considered a subspecies of A. savignyi at the time14, 20.

Molecular data and phylogenetic analyses

We aimed at reconstructing a well-supported phylogeny of the genus Acanthodactylus using whole mitochondrial DNA sequences, assembled by means of a shotgun next generation sequencing strategy. Analyses of whole mitogenomes have been shown to resolve many nodes of the lacertid tree with high statistical support (e.g.31). We retrieved muscle tissue from a museum voucher of A. guineensis (ZFMK 59511 from Daroha, near Bobo Dioulasso, Burkina Faso43) that likely was never in contact with formalin for preservation and therefore offered good chances to obtain sufficient amounts of DNA of decent quality for sequencing. We extracted genomic DNA using the Qiagen DNeasy Blood and Tissue Kit (Qiagen, Hilden, Germany) following the protocol provided by the manufacturer. We additionally sequenced a tissue sample of a freshly caught individual of Acanthodactylus schmidti (SB 642) from Abu Dhabi (UAE) to increase sampling of Acanthodactylus spp. in the mitogenomic tree. The lizard was euthanized by injection of an aqueous solution of benzocaine (20%) into the body cavity. Subsequently, a sample of muscle tissue was taken from the thigh, preserved in 98% Ethanol and stored in − 80 °C. Handling, euthanizing and collection of tissue samples of A. schmidti individuals was approved by the NYUAD Institutional Animal Care and Use Committee (IACUC 19-0002) and UAE No Objection Certificate (NOC 8416), and all applied methods were performed in accordance with the relevant guidelines and regulations.

Genomic DNA of SB 642 was extracted from ethanol-preserved muscle tissue using the Qiagen MagAttract HMW DNA Kit (Qiagen, Hilden, Germany) for high molecular weight DNA. We determined DNA yields on a Qubit fluorometer (Qubit, London, UK) with a dsDNA high sensitivity kit. Totals of 52 ng and 80 ng of DNA (diluted in 26 µl 10 mM Tris·Cl, 0.5 mM EDTA, pH 9.0 (AE buffer)) from A. guineensis and A. schmidti, respectively, were used for library preparation.

ZFMK 59511 libraries were prepared with NEB Ultra II FS DNA (New England Biolabs, Ipswich, MA, USA) kit as per protocol instructions with input below 100 ng. For sample SB642, linked reads were generated on a 10X Genomics Chromium following Genome reagent kits v2 instructions. Resulting libraries’ concentration, size distribution and quality were assessed on a Qubit fluorometer (Qubit, London, UK) with a dsDNA high sensitivity kit and on an Agilent 2100 Bioanalyzer (Agilent, Santa Clara, CA, USA) using a High Sensitivity DNA kit. Subsequently, libraries were normalized and pooled, and pools quantified with a KAPA Library quantification kit for Illumina platforms (Roche Sequencing, Pleasanton, CA, USA) on a ABI StepOnePlus qPCR machine (Thermo Fisher Scientific Inc., Waltham, MA, USA), then loaded on a SP flowcell and paired-end sequenced (2 × 150 bp) on an Illumina NovaSeq 6000 next generation sequencer (Illumina, San Diego, CA, USA), and a S2 flowcell for linked read library. Raw reads were deposited in the Sequence Read Archive (SRA) under BioProject ID PRJNA700414 . All mitogenome assemblies and original alignments were deposited in Figshare under  https://doi.org/10.6084/m9.figshare.13754083.v1 .

Raw FASTQ sequenced reads were first assessed for quality using FastQC v0.11.561. The reads where then passed through Trimmomatic v0.3662 (parameters ILLUMINACLIP: trimmomatic_adapter.fa:2:30:10 TRAILING:3 LEADING:3 SLIDINGWINDOW:4:15 MINLEN:36) for quality trimming and adapter sequence removal. Following the quality trimming, the reads were assessed again using FastQC. The executed workflows were performed using BioSAILs63.

For SB 642 (A. schmidti), we received 436,370,000 single-end reads (average read length 139.5 bases (b)) with a raw coverage of 29.29X and a scaffold N50 of 21.12 kilobases (kb). For the MITObim analysis, read number for SB 642 was randomly reduced to 1,000,000 reads (360,463,035 b) using the awk command. For ZFMK 59511 (A. guineensis), we received 57,936,345 paired-end reads (average read length 151 b, raw coverage ~ 10x). Subsequently, the two sets of reads were converted to interleaved format. We used the quality filtered reads to assemble the mitogenomes of A. guineensis and A. schmidti using an iterative mapping strategy in MITObim v. 1.9.164. We used the Acanthodactylus aureus mitogenome (GB accession number xxxxx; assembly method is provided in the next paragraph) as seed for both samples; this rendered an initial mapping of a conserved region from a more distantly related individual unnecessary64. We therefore applied the –quick option in MITObim and iterations were run until no additional reads could be incorporated into the assembly (14 in A. guineensis, eight in A. schmidti).

We also assembled complete or nearly complete mitogenomes for additional twelve species of Gallotiinae and Eremiadini using anchored hybrid enrichment sequence data from Garcia-Porta et al.6 (see Supplementary methods in6 for extraction protocol and sequencing methods and Supplementary Table S1 online in6) with the Podarcis muralis complete mitogenome (GB accession number NC_011607) as reference sequence. The raw data of each individual was quality filtered using Trimmomatic v0.3662 (parameter MINLEN:45) and assembled using MITObim v. 1.9.1. All multiple alignment files generated in the final MITObim iteration were imported to Geneious R11 (https://www.geneious.com) to check for assembly quality and coverage.

The resulting assemblies were annotated with MITOS65 using defaults settings with the vertebrata database as a reference. The annotated assemblies were imported into PhyloSuite66 together with existing mitogenomic sequences of Lacertidae, and Blanus cinereus (Amphisbaenia; outgroup) available in GenBank (as of July 2020). Details of all mitogenomic sequences included in this study and the corresponding GenBank accession numbers are provided in Supplementary Table S1 online. Using PhyloSuite, we exported all protein coding sequences plus the two rRNAs. The resulting files were aligned with MAFFT67 using “auto” settings. Alignments of coding sequences were refined with MACSE68 to account for open reading frame structure during the alignments. The protein coding and rRNAs alignments were finally concatenated into a single alignment delimiting partitions by marker and, for the protein-coding genes, by codons within markers. Mitogenome phylogenies were inferred using a maximum likelihood approach with IQ-TREE v. 1.5.4 software69 and Bayesian inference (BI) analysis with MrBayes 3.270.

For the maximum likelihood approach, best-fitting partitioning schemes and substitution models were selected based on the Akaike Information Criterion (AIC) using the heuristic algorithms implemented in the MFP + MERGE option (Supplementary Table S3 online). After ML inference, branch support was assessed with 1000 ultrafast bootstrap replicates. As the alignments used for phylogenetic inference contained more than one sequence for some species, these terminals in the resulting tree were collapsed a-posteriori for aesthetic reasons using FigTree v1.4 (as depicted in Fig. 1). An uncollapsed version of this tree is presented in Supplementary Fig. S1 online. For the BI analysis, the best-fitting partition/substitution model scheme, as selected by the ModelFinder algorithm in iQTree (Supplementary Table S4 online), was implemented with MrBayes 3.270. Results of two independent runs of 10 million generations, each comprising four Markov Chains (three heated and one cold), were sampled every 1000 generation. Chain mixing and stationery was assessed by examining the standard deviation of split frequencies and by plotting the –lnL per generation using Tracer 1.5 software71. Samples corresponding to the initial phase of the Markov chain (25%) were discarded as burn-in and the remaining results were combined to obtain a majority rule consensus tree and the respective posterior probabilities of nodes.

An additional, more comprehensive, alignment was produced containing all currently known Acanthodactylus species to obtain a higher resolution perspective on the phylogenetic placement of the target species. Since no nuclear data was available for A. guineensis, we extracted from the newly obtained sequences the four best represented mitochondrial gene fragments across Lacertidae, as compiled by Garcia-Porta et al.6, despite the known shortcomings of partial mitochondrial datasets to resolve lacertid relationships9, 72, 73. The corresponding sequences of the ribosomal RNAs (12S, 16S), COB, and NADH-dehydrogenase subunit (ND4) were later added to the alignment of6 using –add and –keeplength options in MAFFT (Supplementary Table S5 online).

Species distribution modeling

As basis for characterizing the species’ bioclimatic envelopes, we compiled an updated distribution map for A. guineensis and A. boueti including new discoveries in museum collections from the current study (see Supplementary Material online), all known records from the literature16, 34, 37, 38, 40, 42, 44, 45, 47, 58, 60, 74,75,76,77 and from the Global Biodiversity Information Facility (GBIF78). We provide all coordinates as latitude (decimal degrees), longitude (decimal degrees). Due to the general scarcity of records for A. guineensis and A. boueti, we added two localities for A. guineensis (12.5°, − 2.5°; 7.5°, 13.5°) and one locality for A. boueti (− 2.5°, 7.5°) published in Trape et al.45 that were not found on GBIF or in other publications, by extracting the center coordinates of the respective 1 × 1 degree grid. These additional localities have coordinates with very low accuracy and the corresponding environmental variables extracted from a 1 × 1 km resolution grid (30 arc-seconds; see below) therefore have high degrees of uncertainty. When examined, the majority of values fell within the total range for the respective environmental variable (the two exceptions are mentioned in the Results section), we consequently decided to keep them for our analysis. We reduced the final locality dataset to one record per km2 (the resolution of the environmental input data, see below) to avoid pseudoreplication using the R package spThin79. We then tested whether the remaining presence points are randomly dispersed or clustered using the Nearest Neighbor Index NNI in the R packages sp80 and spatialEco81. NNI was 1.4 for A. guineensis and 1.2 for A. boueti, indicating that the filtered point dataset consists of randomly distributed points which are not spatially autocorrelated.

As environmental parameters we downloaded spatial layers of the Terrestrial Ecoregions of the World (TEOW)82, global bioclimate and elevation layers39, data on potential evapotranspiration83 and global aridity index84 with a spatial resolution of 30 arc-seconds (~ 1 km2 near the equator). Although the climate datasets are interpolated from data from weather stations much farther apart from one another and are consequently not measurements of actual environmental conditions, they were rigorously cross-validated with observed data (including satellite data) during the development of the dataset and generally showed high correspondence with observations (especially temperature variables)39. Since for many applications such as our species distribution models data at high spatial resolution (i.e., 1 × 1 km) are preferable over lower resolution to capture variation for example across steep climate gradients in mountains39 we chose the 30 arc-seconds dataset over the 5 arc-minutes dataset (~ 9 × 9 km near the equator) for the ecoregions and climate datasets. In addition, we downloaded spatial layers with forest and grassland/scrub/woodland cover from the harmonized soil database (only available in 5 arc-minutes spatial resolution85). We downscaled the forest and grassland/scrub/woodland dataset to 30 arc-seconds resolution despite the resulting slight inaccuracy, which we kept in mind during interpretation but did not consider relevant for the vegetation datasets.

In order to compare the prevailing climate within the distribution range of A. guineensis and A. boueti to all other species of the genus Acanthodactylus we downloaded all available records of the other currently known Acanthodactylus species from GBIF. We cleaned the downloaded datasheet by deleting all entries with missing data for either latitude, longitude, species epithet, or all of those, and removed duplicates or spelling mistakes. We further removed all GBIF entries with imprecise coordinates, i.e. records with coordinates that when plotted landed just outside of coastal areas in the ocean instead of on land (usually coordinates with only two decimals). In addition, we added several curated locality data from Garcia-Porta et al.6 as available from Figshare (https://doi.org/10.6084/m9.figshare.8866271.v1/). Following Tamar et al.14 we treated A. lineomaculatus as a junior synonym of A. erythrurus and changed the respective records in our database accordingly. For species that did not have records published on GBIF we added further locality data from the literature (Supplementary Table S6 online). The final database contained 4286 records from all 44 currently recognized species of Acanthodactylus. We acknowledge that our database is by no means complete, however, we wanted to follow the most conservative approach and use only records that are supported by a voucher specimen and can be traced back using published databases. We trust that our record list covers a nearly complete representation of the ecological conditions inhabited by all currently accepted Acanthodactylus species.

Using the bioclim dataset39 we plotted annual mean temperature (bio1) and annual precipitation (bio12) for all locality records in our Acanthodactylus species database and compared the respective data for A. guineensis and A. boueti with its congeners. We developed species distribution models using Maxent 3.4.146 for A. guineensis and A. boueti. Maxent applies the maximum entropy principle86 for model fitting under the basic premise that the estimated species distribution deviates from a uniform distribution as minimally as required to explain the observations46. We used the cleaned datasets of A. guineensis and A. boueti with only one record per km2 as input presence locality data. We extracted environmental data for all presence sites from the 19 bioclim variables (bio1-19), elevation (alt), aridity index (AI), percentage cover of forest (forest) and grassland, scrub and woodland (grass) per grid cell, as well as four parameters comprising potential evapotranspiration (PET): PET of the wettest, driest, warmest and coldest quarter of the year (PETwet, PETdry, PETwarm, PETcold). To avoid issues resulting from correlated parameters we reduced the number of environmental predictors by performing multicollinearity analyses using the Variance Inflation Factor (VIF) with a threshold < 10 for non-collinearity87 in the R package usdm88. The VIF, one of the most commonly used factors in collinearity analyses, is the result from regressing the predictor variable against all other predictor variables. The VIF measures how strongly each predictor can be explained by the rest of predictors89. After removing all variables with collinearity problems our final environmental input variable dataset comprised bio3, bio11, bio13, bio14, bio18, bio19, PETcold, PETdry, PETwet, forest, grass. We ran the Maxent algorithm with the following settings: we chose 10,000 random background points from a rectangular area extending an additional 500 km in every compass direction from the outermost presence sites of A. guineensis for both species90, allowed only linear and quadratic features and chose the output format cloglog. For A. guineensis we had 38 presence sites, of which we used 70% for model training and 30% for testing with 25 model replicates of type “subsampling” and a regularization multiplier of 2.591. For A. boueti, due to low number of presence sites (N = 11), we did not use a training dataset but ran the model with eleven replicates and bootstrapping (following Morales et al.91) and a regularization multiplier of 1. The predictive model performance was evaluated using the area under the receiver operating curve (AUC) which measures the ability of predictions to discriminate between presences and absences, irrespective of the absolute value of the predictions92. Finally, we plotted the median of all models to predict the potential range of each species and determined the most important environmental predictors.


Source: Ecology - nature.com

DNA traces the origin of honey by identifying plants, bacteria and fungi

SMART develops analytical tools to enable next-generation agriculture