in

Response of soil viral communities to land use changes

Characteristics of LVD dataset and assembled vOTUs

The land use virome dataset LVD was derived from 2.6 billion paired clean reads of sequences across 50 viromes of 25 samples with five types of land uses (Supplementary Data 2). A total of 6,442,065 contigs (>1500 bp) were yielded, of which 764,466 (11.8%) contigs were identified as putative viral genomes through VIBRANT. Subsequently, putative false positive viral genomes were removed (see Methods section), and 27,951 and 48,936 bona fide viral genomes were retained from the 25 intracellular VLPs (iVLPs) and 25 extracellular VLPs (eVLPs) viromes, respectively. These genomes were clustered into 25,941 and 45,152 vOTUs for iVLPs and eVLPs viromes, respectively, in which the iVLPs and eVLPs viromes shared 11,467 (19.2%) vOTUs. Subsequently, they were merged and dereplicated, resulting in 59,626 vOTUs (Supplementary Data 3) for the following analysis. A total of 8112 (13.6%) vOTUs genomes were classified as complete, in which the median length of all and circular vOTUs were 25,183 bp and 45,511 bp, respectively (Supplementary Fig. 4).

To explore the taxonomic affiliation of vOTUs in family and genus-level, a gene-sharing network consist of 59,626 vOTUs genomes from this study and 3502 reference phage genomes (from NCBI Viral RefSeq version 201) revealed 6009 VCs comprising of 37,224 vOTUs, of which 34,417 vOTUs were from LVD, besides 2794 singletons (2653 from LVD dataset), 16,056 outliers (15,833 from LVD) and 8492 overlaps (8061 from LVD) were detected (Supplementary Data 4). Of these, only 157 VCs contained genomes from both the RefSeq and LVD dataset (1864 viral genomes) (Supplementary Data 4). Most of VCs (1837, 30.4%) included only two members.

At the family level, most of vOTUs were classified into Siphoviridae (712 by vConTACT2 and 29,671 (50.9%) by Demovir, tailed dsDNA), Podoviridae (610 by vConTACT2 and 9923 (17.6 %) by Demovir, tailed dsDNA), Myoviridae (485 by vConTACT2 and 5445 (9.9%) by Demovir, tailed dsDNA), Tectiviridae (50 by vConTACT2 and 10 (0.10%) by Demovir, non-tailed dsDNA) (Fig. 1). Besides, the Eukaryotic viruses Herpesviridae (159 by Demovir, 0.26%, dsDNA), Phycodnaviridae (120 (0.20%) by Demovir, dsDNA); the Virophage Family Lavidaviridae (15 (0.03%) by Demovir) were detected as well, but a majority of vOTUs were unclassified in genus-level.

Fig. 1: The taxonomic assignment of LVD.

Pie charts showing the affiliation of 56,870 vOTUs at family level assigned by script Demovir (a). and the affiliation of 1864 vOTUs at family level assigned by package vConTACT2 (b). Source data are provided in the Source Data file.

Full size image

Viral community structures differ across land use types

Bray–Curtis dissimilarity of viral communities (median 0.9951) showed strong heterogeneity of viral communities among different sites (Fig. 2a). While, the Bray–Curtis dissimilarity (median: 0.5109) between paired viral communities of iVLPs and eVLPs from each site have a significant lower heterogeneity than inter-sites (Wilcox.test, p < 0.001; Fig. 2a). Furtherly, the Bray–Curtis distance between paired intracellular and extracellular viral communities did not present a significant difference among all types of land uses (Supplementary Fig. 5). Similarly, viral communities of paired iVLPs and eVLPs were grouped together for each site and were well separated from the other sites (anosim.test, R = 0.04, p > 0.05; Fig. 2b). Therefore, the paired iVLPs and eVLPs viromes from each site were merged for subsequently viral community analysis.

Fig. 2: The macrodiversity of soil viral communities.

a Boxplot showing Bray–Curtis dissimilarity of viral communities of intra-sites (between the corresponding community of iVLPs and eVLPs, n = 25) and inter-sites (between different sample sites, n = 300). The minima, maxima, center, bounds of box and whiskers in boxplots from bottom to top represented percentile 0, 10, 25, 50, 75, 90, and 100, respectively, the difference between different zones was tested using the two-sided Wilcox.test, ****p < 0.0001. The exact p value is 1.6e-15. b Principal coordinates analysis (PCoA) of viral community structures, as derived from reads mapping to 59,626 vOTUs and Bray–Curtis dissimilarities; each point is one sample, triangle indicated extracellular (Non_mitomycin_C) and circles indicated intracellular (mitomycin_C treatment) viral community. The analysis of similarity (ANOSIM) statistics considered viral community composition grouped by habitat and treatment. The land use zone AG represents agricultural areas including paddy and vegetable field; UG represents urban green space including park and road verge; FO represents forest. Ellipses in the PCoA plot are drawn around the centroids of each zone at 95% (inner) and 97.5% (outer) confidence intervals. The statistical test used was two-tailed. Source data are provided in the Source Data file.

Full size image

PCoA based on Bray–Curtis dissimilarity of viral communities indicated the viral communities of the 25 samples from five types of land use were clustered into three zones (Fig. 2b). We designated these three emergent land use zones as the agricultural zone (AG) including paddy and vegetable plot, urban green spatial zone (UG) including park and road verge, and forest zones (FO) respectively (anoism.test, R = 0.72, p = 1e-4). PERMANOVA indicated that the best predictor of β-diversity of viral community composition was the land use zones compare with other environmental parameters, which explained 10.4% of the variance (adjust p < 0.001) (Supplementary Data 5). The proportion of shared vOTUs within land use zones was significantly greater than those between land use zones (Wilcox.test, p < 0.0001; Fig. 3a). The heatmap indicated that the viral communities were clustered according to the land use zones (Fig. 3b) except sample T5 rather than geographic distance (Supplementary Fig. 6a). A stronger distance-decay pattern was observed within zone FO compared with the other two zones (Supplementary Fig. 6b). Furthermore, at the VCs level, the multi-zonal VCs were dominant and accounted for 24.7% to 45.2% relative abundance (Fig. 3c), suggesting that the strong vOTUs boundary among different zones of land use was less pronounced at VCs level, such as the zone FO and AG did not share any vOTUs, but they shared 34% VCs (Fig. 3d). We also observed that UG and AG shared the most VCs (47%), followed by UG and FO (42%), and FO and AG (32%; Fig. 3d).

Fig. 3: Dynamic of viral operational taxonomic unit (vOTUs) and clusters (VCs).

a The shared vOTUs of inter-zones (n = 285) and intra-zones (n = 30). The minima, maxima, center, bounds of box and whiskers in boxplots from bottom to top represented percentile 0, 10, 25, 50, 75, 90, and 100, respectively, the difference between different zones was tested using the two-sided Wilcox.test, ****p < 0.0001. The exact p value < 2.2e-16. b Heatmap showed the shared viral content between different samples. The FO represent forest zone, the UG represent urban green space zone, and AG represent agricultural zone. c The relative abundances of VCs according to their distribution. The “all zone shared” as the VCs presented in all zones; “FO and AG shared” as the VCs only presented in zone FO and AG, “UG and AG shared” as the VCs only presented in zone UG and AG; “FO and UG shared” as the VCs only presented in zone FO and UG; “Unique” as the VCs exclusively presented in each zone. d The proportion of shared VCs (above) and vOTUs (below) between different zones. e Mantel test showing the correlation between viral communities and geochemical parameters. The p values were adjusted using FDR. The statistical test used was two-tailed. Source data are provided in the Source Data file.

Full size image

The diversity of viral communities of three land use zones were comparable as indicated by Shannon (6.1–7.0) and Simpson (0.9878–0.9977) indexes (Supplementary Fig. 7). The compositions of viral communities were similar, dominated by tailed dsDNA bacteriophages assigned to the family Siphoviridae, followed by Podoviridae, Myoviridae, Herpesviridae, and Tectiviridae (Supplementary Fig. 8). Mantel test revealed that viral community structures at population-level significantly correlated with 22 environmental parameters, of which the pH played the most important role in the change of viral community structures (Fig. 3e).

The relative abundances of putative lysogenic phages vary among different land uses

The putative temperate vOTUs were identified based on the presence of integrases or whether the vOTU was identified as integrated into a bacterial contig through VIBRANT. The relative abundance of putative lysogenic phage presented a significant difference among the three land use zones (Fig. 4a), in which the relative abundance of putative lysogenic vOTUs of the AG (from 4.3% to 17.4%) was significantly lower than those in the UG (from 7.5% to 31.0%) and FO (from 24.5% to 47.8%) (Fig. 4a). Additionally, the difference of putative lysogenic phage between paired iVLPs and eVLPs viromes presented an obvious variety across the three zones, of which the iVLPs viromes of FO harbored higher putative lysogenic phages (Fig. 4a). The alpha-diversity of putative lysogenic phage showed a significant difference among the three zones (Fig. 4b). Mitomycin C treatment can facilitate the recovery of putative lysogenic vOTUs (Wilcox test: p < 0.001; Fig. 4c). The relative abundances of putative lysogenic vOTUs were negative correlated with soil moisture (F-statistic: R2 = 0.27, p < 0.05; Fig. 4d), whereas a positive correlation with DOC was observed (F-statistic: R2 = 0.35, p < 0.05; Fig. 4d). Furthermore, we detected 617 prophages, of which 525 prophages were used as representative genomes of vOTUs, only 18 non-prophage vOTUs contains prophage. Consistent with relative abundance of putative lysogenic phage, we also found the mitomycin C can enhance the recovery of prophage from viromes (Fig. 4e).

Fig. 4: The lifestyle of viruses in different land uses.

The relative abundance (a), and the alpha diversity (b) of putative lysogenic phage in zones FO (n = 5), AG (n = 10) and UG (n = 10), and (c) the relative abundance of putative lysogenic phages in eVLPs (Non_mitomycin_c, n = 25) and iVLPs (mitomycin_c, n = 25) viromes. The minima, maxima, center, bounds of box and whiskers in boxplots from bottom to top represented percentile 0, 10, 25, 50, 75, 90, and 100, respectively, the difference between different zones was tested using the Wilcox.test, ****, ***, **, * and ns represents p < 0.0001, p < 0.001, p < 0.01, p < 0.05, and p > 0.05 respectively. d Relationship between the relative abundance of putative lysogenic phages and moisture and DOC. The dark blue lines represent fitting curve of linear model for all sites. Inset values display the R2 and adjusted p value of F-statistic. The statistical test used was two-tailed. e The relative abundance of proviruses in eVLPs (Non_mitomycin_c, n = 25) and iVLPs (mitomycin_c, n = 25) viromes. The minima, maxima, center, bounds of box and whiskers in boxplots from bottom to top represented percentile 0, 10, 25, 50, 75, 90, and 100, respectively, the difference between different treatments was tested using the Wilcox.test, ** represents p < 0.01. The exact p value is 0.0056. The statistical test used was two-tailed. Source data are provided in the Source Data file.

Full size image

Host-linked viral community compositions

The variations of prokaryotic community composition of different land uses were characterized based on taxonomic profiles derived from corresponding metagenomes. The results showed that the dominant class was Actinobacteria, Bacilli, Alphaproteobacteria, Betaproteobacteria, Gammaproteobacteria, and Deltaproteobacteria (Fig. 5a). There was a significant difference among the five land use types (anoism.test R = 0.44, p < 0.001; Supplementary Fig. 9a), whereas the alpha-diversity were similar (Supplementary Fig. 9b).

Fig. 5: The relationship between bacteria, host-linked viruses and environmental factors.

a Class distribution of bacterial compositions of metagenomes from various land use types. b Relative abundances of lineage-specific phages grouped at the host class level. c Mantel test showed the correlation between environmental factors and lineage-specific phages (top node, host-linked abundance) and bacterial community (bottom node, bacterial abundance). The size of circle represents Pearson Correlation Coefficient (r) among different environmental factors. The p values of mantel test were adjusted using FDR. The statistical test used was two-tailed. Source data are provided in the Source Data file.

Full size image

The relative abundances of vOTUs were summed up according to their predicted hosts at class level (Fig. 5b and Supplementary Data 6). The results indicated obvious variations in community compositions according to zones (Fig. 5b) compared with overall soil bacterial compositions (Fig. 5a). Overall, the dominant predicted host were Actinobacteria, Alphaproteobacteria, Gammaproteobacteria, Betaproteobacteria, and Acidobacteriia (Fig. 5b). The zone UG occupied the highest relative abundance (from 17.5% to 52.8%, Wilcox.test: p < 0.01) of Actinobacteria-linked phages; the AG occupied the highest relative abundance (from 6.8% to 26.9%, Wilcox.test: p < 0.05) of Betaproteobacteria-linked phages (Fig. 5b). We explored the host of 8,910 vOTUs with significant differences among the three zones (Supplementary Data 7). Phages infected Gammaproteobacteria were enriched in zone FO, and the phages infected Actinobacteria and Alphaproteobacteria were enriched in zone UG, whereas the zone AG enriched phages that infected Acidobacteriia and Betaproteobacteria (Supplementary Fig. 10).

Mantel tests revealed that the compositions of bacterial communities were significantly related to the environmental factors DC/DN (Dissolved carbon/Dissolved nitrogen), C/N, DN, NO3, Se, and longitude (Fig. 5c). The host-linked viral communities were significantly related to bacterial communities (mantel statistic r = 0.44, p < 0.001), and they were significantly sensitive to the changes of geographic distance (longitude), soil pH, and concentration of metal element Ba, Ni, As, Mo, La, Nd, and Zn (Fig. 5c). The network based on the Spearman correlation coefficient indicated that pH, longitude, P, Zn, and moisture have a higher connection degree compared to the other environmental factors (Supplementary Fig. 11). pH strongly correlated to the phages infecting Actinobacteriota, Acidobacteriia, Gammaproteobacteria, but not to the abundance of these taxa in soil metagenome (Supplementary Fig. 12). The relative abundances of Actinobacteria-linked phages were positively correlated with the elevated pH, whereas the relative abundances of Acidobacteria and Gammaproteobacteria-linked phages were negatively correlated with pH (Supplementary Fig. 12).

Shared vOTUs and microdiversity

The shared vOTUs based on their distribution within and between land use zones were classified as local, regional, and multi-zonal vOTUs (Fig. 6a). In total, 44,534 vOTUs (74.6%) were classified as “local”, and 12,314 vOTUs (20.6%) were classified as “regional”. Only 2771 vOTUs (4.6%) were present in at least two zones and were defined as “multi-zonal”, of which zone FO and AG only shared 103 vOTUs, whereas 1479 vOTUs were shared by zone FO and UG. In addition, only 76 vOTUs were detected in all zones, whereas there were 7398 bacterial species (64.4% of all) presented in all zones (Fig. 6a).

Fig. 6: Evolutions of soil vOTUs with different geographic ranges.

a Venn diagram showing the shared vOTUs (top) and bacterial species (bottom) among three land use zones. b The microdiversity of local, regional, and multi-zonal vOTUs in zones FO (n = 10), UG (n = 20) and AG (n = 20). c The microdiversity of multi-zonal vOTUs among three land use zones FO (n = 10), UG (n = 20) and AG (n = 20). d The connection number of local (n = 44,534), regional (n = 12,314), and multi-zonal (n = 2771) vOTUs in gene-sharing network. e The size of total VCs (n = 6009) and multi-zonal VCs (n = 1416). The exact p value less than 2.2e-16. f The averaged pN/pS values of local, regional, and multi-zonal vOTUs in zones FO (n = 10), UG (n = 20), and AG (n = 20). For boxplots, the minima, maxima, center, bounds of box and whiskers in boxplots from bottom to top represented percentile 0, 10, 25, 50, 75, 90, and 100, respectively, the difference between different land use was tested using the Wilcox.test, ****, ***, **, *, and ns represents p < 0.0001, p < 0.001, p < 0.01, p < 0.05, and p > 0.05, respectively. The statistical test used was two-tailed. Source data are provided in the Source Data file.

Full size image

The microdiversity of vOTUs increased from local, regional to multi-zonal, despite that in FO, significant difference in the microdiversity between regional and multi-zonal vOTUs was not observed (Wilcox test: p > 0.05; Fig. 6b). The microdiversity of multi-zonal vOTUs in FO were significantly lower than UG and AG (Wilcox test: p < 0.01; Fig. 6c). Furthermore, we also observed that the multi-zonal vOTUs have a significant higher popANI from intra zones compared with those from inter-zones (Supplementary Fig. 13).

The connection number of node (vOTU) in the gene-sharing network constructed by vConTACT2 were calculated. We found that the number of connections (median 47.00) of multi-zonal vOTUs with other vOTUs was significantly greater than that of regional one (median, 38; Wilcox test: p < 0.001), and regional one (medium, 38) was significantly greater than that of local (median, 31) (Fig. 6d). Subsequently, 2771 multi-zonal vOTUs were observed in 1416 VCs. The results showed that the size (median, 10) of VCs included multi-zonal vOTUs was significantly higher than the total (median, 4; Wilcox test: p < 0.001), the multi-zonal vOTUs prefer to exist in the large-sized VCs (Fig. 6e). These results demonstrated that the multi-zonal vOTUs shared genetic information with more vOTUs than local and regional in soil.

We identified genes whether under positive selection by evaluating the ratio of non-synonymous to synonymous mutations observed in gene sequences using the pN/pS equation. Of 726,331 genes tested from vOTUs with sufficient coverage (>5 mean coverage depth), 39,118 genes were identified as being under positive selection in at least one sample, and almost all genes (39,027) were uniquely selected by certain region, of which 18,167 genes can be predicted with at least a function, and most of them were predicted responsible for structure and DNA metabolism (Supplementary Data 8). The frequency of each pfam number was counted, of which the top 10 positively selected proteins were dominated by those associated with viral structures, including 3981 genes related to Viral head morphogenesis (PF04233), 3101 genes for P22 coat protein (PF11651), and 695 genes for Phage virion morphogenesis (PF05069) (Supplementary Data 9). Uniquely positively selected genes were observed among the three zones, genes encoding DNA polymerase family A (PF00476) and Phage integrase family (PF00589) occupied the top two positively selected genes in zone FO, genes encoding Phage portal protein (PF04860) and Phage capsid family (PF05065) zone PT, and genes encoding Viral head morphogenesis (PF04233) and P22 coat protein (PF11651) were were positively selected in zone PV. The result was in accordance with that the FO occupied higher putative lysogenic phages compared to other two zones (See source data). Higher pN/pS values were detected for the genes carried by local vOTUs than those carried by regional and multi-zonal vOTUs (Fig. 6f).


Source: Ecology - nature.com

3Q: Why Europe is so vulnerable to heat waves

Substantial differences in soil viral community composition within and among four Northern California habitats