Low level of anthropization linked to harsh vertebrate biodiversity declines in Amazonia
Study areaThe study was conducted on two rivers in north-eastern Amazonia sensu lato, including the Guiana Shield and the Amazon River drainage (Fig. 2). The climate of the entire study area is homogeneous and the region is covered by dense, uniform lowland primary rainforest51. The altitude is in the range of 0–860 m a.s.l. The regional climate is equatorial, and the annual rainfall ranges from 3600 mm in the northeast to 2000 mm in the southwest. The Maroni River is 612 km long from its source to its estuary, and its watershed covers a surface of >68,000 km2 in Suriname and French Guiana. The Oyapock River (length, 404 km; area, 26,800 km2) is located in the state of Amapa in Brazil and in French Guiana.The foregoing river basins host nearly 400 freshwater fish species and more than 180 mammal species52,53. Most of the mammal species have a large distribution range, covering the entire study area53. The fish species have a less homogeneous distribution and a distinct upstream-downstream composition gradient54,55. Here, only large rivers were considered and most fish species were widespread over the whole study area. As habitat availability increases with river size, species richness is expected to increase upstream to dowsntream31,32. The Oyapock and Maroni river basins are among the last remaining wilderness areas on Earth17. Nevertheless, ecological disturbances are increasing there because of a growing human population and the development of small-scale gold mining activity. These disturbances have caused limited but diffuse deforestation23,56. The deforested areas currently comprise 0.67% of all Maroni and Oyapock catchments.SamplingEnvironmental DNA (eDNA) was collected from water samples at 74 locations (hereafter, sites) along the main channel and the large tributaries of the Maroni and Oyapock rivers (Fig. 2). Thirty-seven sites were sampled at each river basin. The minimum and maximum distances between adjacent sites were 1.07 and 50.20 km, respectively. The mean and median distances between adjacent sites were 10.18 and 9.14 km, respectively, and the standard deviation (SD) was 7.79 km. The sites were located from sea level to 157 m a.s.l. At all sites, the river was wider than 20 m and deeper than 1 m (Strahler orders 4–8; Supplementary Fig. 5). The physicochemical properties of the water slightly varied among sites. The temperature, pH, and conductivity were in the ranges of 28.4–33.2 °C, 6.5–7.6, and 16.9–54.6 µS/cm, respectively, at all sites except two estuarine locations where the conductivity was relatively high because of seawater incursion (Supplementary Data 2).The eDNA samples were collected during the dry seasons (October–November) of 2017 and 2018 for Maroni and Oyapock, respectively. At both rivers, the sites were sequentially sampled from downstream to upstream at a rate of 1–4 sites per day depending on the distance and travel time between sites. Following the protocol of ref. 45, we collected the eDNA by filtering two replicates of 34 L of water per site. A peristaltic pump (Vampire Sampler; Buerkle GmbH, Bad Bellingen, Germany) and single-use tubing were used to pump the water into a single-use filtration capsule (VigiDNA, pore size 0.45 μm; filtration surface 500 cm2, SPYGEN, Bourget-du-Lac, France). The tubing input was placed a few centimetres below the water surface in zones with high water flow as recommended by Cilleros et al.43. Sampling was performed in turbulent areas with rapid hydromorphologic units to ensure optimal eDNA homogeneity throughout the water column. To avoid eDNA cross-contamination among sites, the operator remained on emerging rocks downstream from the filtration area. At the end of filtration, the capsule was voided, filled with 80 mL CL1 preservation buffer (SPYGEN), and stored in the dark up to one month before the DNA extraction. No permits were required for the eDNA sampling and the access to all sites was legally permitted. The study complies with access and benefit permits ABSCH-IRCC-FR-246820-1 and ABSCH-IRCC-FR-245902-1, authorizing collection, transport and analysis of all environmental DNA samples used in this study.Laboratory procedures and bioinformatic analysesFor the DNA extraction, each filtration capsule was agitated on an S50 shaker (Ingenieurbüro CAT M. Zipperer GmbH, Ballrechten-Dottingen, Germany) at 800 rpm for 15 min, decanted into a 50 mL tube, and centrifuged at 15,000 × g and 6 °C for 15 min. The supernatant was removed with a sterile pipette, leaving 15 mL of liquid at the bottom of the tube. Subsequently, 33 mL of ethanol and 1.5 mL of 3 M sodium acetate were added to each 50 mL tube, and the mixtures were stored at −20 °C for at least one night. The tubes were then centrifuged at 15,000 × g and 6 °C for 15 min, and the supernatants were discarded. Then, 720 µL of ATL buffer from a DNeasy Blood & Tissue Extraction Kit (Qiagen, Hilden, Germany) was added. The tubes were vortexed, and the supernatants were transferred to 2 mL tubes containing 20 µL proteinase K. The tubes were then incubated at 56 °C for 2 h. DNA extraction was performed using a NucleoSpin Soil kit (Macherey-Nagel GmbH, Düren, Germany) starting from step six of the manufacturer’s instructions. Elution was performed by adding 100 µL of SE buffer twice. After the DNA extraction, the samples were tested for inhibition by qPCR following the protocol in ref. 57. Briefly, quantitative PCR was performed in duplicate for each sample. If at least one of the replicates showed a different Ct (Cycle threshold) than expected (at least 2 Cts), the sample was considered inhibited and diluted 5-fold before the amplification.For the fish, the “teleo” primers58 (forward: 3ʹ-ACACCGCCCGTCACTCT-5ʹ; reverse: 3ʹ-CTTCCGGTACACTTACCATG-5ʹ) were used as they efficiently discriminated local fish species43,45. For the mammals, the 12S-V5 vertebrate marker59 (forward: 3ʹ-TAGAACAGGCTCCTCTAG-5ʹ; reverse: 3ʹ-TTAGATACCCCACTATGC-5ʹ) was used as it also effectively distinguishes local mammal species44,60. The DNA amplifications were performed in a final volume of 25 μL containing 1 U AmpliTaq Gold DNA Polymerase (Applied Biosystems, Foster City, CA, USA), 0.2 μM of each primer, 10 mM Tris-HCl, 50 mM KCl, 2.5 mM MgCl2, 0.2 mM of each dNTP, and 3 μL DNA template. Human blocking primer was added to the mixture for the “teleo”58 (5′-ACCCTCCTCAAGTATACTTCAAAGGAC-C3-3′) and the “12S-V5” primers61 (5′-CTATGCTTAGCCCTAAACCTCAACAGTTAAATCAACAAAACTGCT-C3-3′) at final concentrations of 4 μM and 0.2 μg/μL bovine serum albumin (BSA; Roche Diagnostics, Basel, Switzerland). Twelve PCR replicates were performed per field sample. The forward and reverse primer tags were identical within each PCR replicate. The PCR mixture was denatured at 95 °C for 10 min, followed by 50 cycles of 30 s at 95 °C, 30 s at 55 °C for the “teleo” primers and 50 °C for the 12S-V5 primers, 1 min at 72 °C, and a final elongation step at 72 °C for 7 min. This step was conducted in a dedicated room for DNA amplification that is kept under negative air pressure and is physically separated from the DNA extraction rooms maintained under positive air pressure. The purified PCR products were pooled in equal volumes to achieve an expected sequencing depth of 500,000 reads per sample before DNA library preparation.For the fish analyses, 10 libraries were prepared using a PCR-free library protocol (https://www.fasteris.com/metafast) at Fasteris, Geneva, Switzerland. Four libraries were sequenced on an Illumina HiSeq 2500 (2 × 125 bp) (Illumina, San Diego, CA, USA) with a HiSeq SBS Kit v4 (Illumina), three were sequenced on a MiSeq (2 × 125 bp) (Illumina) with a MiSeq Flow Cell Kit Version3 (Illumina), and three libraries were sequenced on a NextSeq (2 × 150 bp + 8) (Illumina) with a NextSeq Mid kit (Illumina). The libraries run on the NextSeq were equally distributed in four lanes. Sequencing was performed according to the manufacturer’s instructions at Fasteris. For the mammal analyses, eight libraries were prepared using a PCR-free library protocol (https://www.fasteris.com/metafast) at Fasteris. Two libraries were sequenced on an Illumina HiSeq 2500 (2 × 125 bp) (Illumina) using a HiSeq Rapid Flow Cell v2 and a HiSeq Rapid SBS Kit v2 (Illumina), three libraries were prepared on a MiSeq (2 × 125 bp) (Illumina) with a MiSeq Flow Cell Kit Version3 (Illumina), and three libraries were prepared using a NextSeq (2 × 150 bp + 8) (Illumina) and a NextSeq Mid kit (Illumina). The libraries run on the NextSeq were equally distributed in four lanes. As different sequencing platforms were used (MiSeq and NextSeq for the Maroni and HiSeq 2500 and MiSeq for the Oyapock; Supplementary Fig. 6 and Supplementary Data 3), the possible influences of the platforms on the sequencing results were verified. To this end, we compared the differences in species numbers between the sample replicates assigned to the same platform (accounting for replicate effect only) against those of the sample replicates assigned to different platforms (accounting for replicate and platform effects). As there were more sites with their two replicates sequenced with the same platform than sites with their replicates sequenced with different platforms (see Supplementary Fig. 6), sites with replicates on the same platform were randomly selected for the comparisons. We repeated this procedure 50 times. The number of species between replicates sequenced on the same platform and those sequenced on different platforms did not differ for >98.5% of all fish and mammal samples (Supplementary Fig. 7 and Supplementary Note 2). Similar to these results, a previous study on 16 S rRNA amplicon has shown that the samples were not influenced by the Illumina sequencing platform used62.To monitor for contaminants, 13 negative extraction controls were performed for each of the primers (“teleo” and “12S-V5”); one control was amplified twice. All of them were amplified and sequenced by the same methods as the samples and in parallel to them. Therefore, for the negative extraction controls, 168 amplifications were prepared with the “teleo” primers (13 negative controls; one amplified and sequenced twice) and 156 amplifications with the “12S-V5” primers (13 negative controls). Fourteen negative PCR controls (ultrapure water; 12 replicates) were amplified and sequenced in parallel to the samples. Eight were amplified with the “teleo” primers and six were amplified with the “12S-V05” primers. Thus, for the PCR negative controls, there were 96 amplifications with the “teleo” primers and 72 amplifications with the Vert01 primers. Sequencing information for the controls is shown in Supplementary Data 3c.An updated version of the reference database from ref. 43 was used. There were 265 Guianese species for the fish analyses (ref. 47). The GenBank nucleotide database was consulted, but it contained little information on the Guianese fish species. Most of the sequences were derived from ref. 43. For the mammal analyses, the vertebrate database was built using ecoPCR software63 from the releases 134 and 138 of the European Nucleotide Archive (ENA), for the Maroni and Oyapock river samples, respectively. The two releases were compared, and it was established that the new mammal species added to each version did not originate from French Guiana. Hence, the results were not influenced by the EMBL release number. The relevant metabarcoding fragment was extracted from this database with ecoPCR63 and OBITools64. Therefore, the reference database comprised the local database of French Guianese mammals60, which references 576 specimens from 164 species as well as all available vertebrate species in EMBL.The sequence reads were analyzed with the OBITools package according to the protocol described by Valentini et al.58. Briefly, the forward and reverse reads were assembled with the illuminapairedend programme. The ngsfilter programme was then used to assign the sequences to each sample. A separate dataset was created for each sample by splitting the original dataset into several files with obisplit. Sequences shorter than 20 bp or occurring less than 10 times per sample were discarded. The obiclean program was used to identify amplicon sequence variants (ASVs) that have likely arisen due to PCR or sequencing errors. It uses the information of sequence counts and sequence similarities to classify whether a sequence is a variant (“internal”) of a more abundant (“head”) ASV64. After this step, we matched the ASV with the reference database to obtain the taxonomic assignation for each ASV. Sequences labelled by the obiclean programme as ‘internal’’ and probably corresponding to PCR errors were discarded. The ecotag programme was then used for taxonomic assignment of molecular operational taxonomic units (MOTUs). The taxonomic assignments from ecotag were corrected to avoid overconfidence in assignments. Species-level assignments were validated only for ≥98% sequence identity with the reference database. Sequences below this threshold were discarded.Measuring disturbance intensity using GIS dataIn riverine systems, the disturbances may accumulate because of hydrologic connectivity, which is the downstream transfer of matter and pollutants4. Hence, the upstream sub-basin drainage network was considered to determine the size of the upstream sub-basin affecting local biodiversity (Fig. 1). The sub-basins were delineated by applying a flow accumulation algorithm to the SRTM global 30 m digital elevation model65. Deforestation was measured over 14 upstream spatial extents with radii of 0.5, 1.5, 3, 5, 10, 15, 20, 30, 40, 50, 60, 70, 80, and 90 km for each sampling site. Then, these 14 upstream spatial extents were intersected with the sub-basin drainage network. In addition, mammals and fish can also be affected by disturbances other than those mediated by hydrologic connectivity. Thus, deforestation was also measured upstream and downstream from the eDNA sampling sites using the same foregoing 14 radii.At each sampling site, deforestation intensity was quantified for each of the 14 spatial extents. We summed upstream (only accounting for disturbances mediated by river hydrologic connectivity) or upstream and downstream (not only considering disturbances mediated by hydrologic connectivity) deforested surfaces from Landsat satellite image datasets. Forest loss surfaces were obtained from the Global Forest Change dataset66. The Global Forest Change dataset identifies areas deforested between 2001 and 2017 on a 30 m spatial scale. To incorporate deforested areas prior to 2000, tree canopy cover data for that year were also used. Except for river courses, all pixels with More