
Metagenomic analysis of the cow, sheep, reindeer and red deer rumen


Construction of RUGs from rumen sequencing data

We produced 979G of Illumina sequencing data from 4 cow, 2 sheep, 4 red deer and 2 reindeer samples, then performed a metagenomic assembly of each single sample and a co-assembly of all samples. This yielded a set of 391 dereplicated genomes (99% average nucleotide identity, ANI) with estimated completeness ≥ 80% and estimated contamination ≤ 10% (Fig. 1). Of these genomes, 284 were produced from the single-sample assemblies and 107 from the co-assemblies. A total of 172 genomes were > 90% complete with contamination < 5%, and would therefore be defined as high-quality draft genomes by Bowers et al.23. The distribution of these RUGs between our samples, based on coverage, can be found in Supplementary data S1. Supplementary data S2 contains the predicted taxonomic assignment for each RUG, while Fig. 2 shows a phylogenetic tree of the genomes.
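The two quality tiers described above can be expressed as a small classifier. This is a minimal sketch, assuming completeness and contamination are given as percentages; the function name `classify_rug` and the example values are hypothetical and are not taken from the study.

```python
def classify_rug(completeness: float, contamination: float) -> str:
    """Label a genome using the thresholds quoted in the text (percentages)."""
    # High-quality draft tier: > 90% complete with < 5% contamination.
    if completeness > 90 and contamination < 5:
        return "high-quality draft"
    # Inclusion criterion for the dereplicated set: >= 80% and <= 10%.
    if completeness >= 80 and contamination <= 10:
        return "retained"
    return "excluded"


# Hypothetical CheckM-style estimates; illustrative only, not study data.
examples = {
    "genome_a": (95.2, 1.3),
    "genome_b": (84.0, 7.5),
    "genome_c": (72.1, 2.0),
}
labels = {name: classify_rug(*values) for name, values in examples.items()}
print(labels)
```

Note that a genome sitting exactly on a boundary (for example 90% complete) falls into the lower tier, because the text uses strict inequalities for the high-quality definition.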

Figure 1

Contamination and completeness (as estimated by CheckM) of 391 dereplicated metagenome-assembled genomes from rumen samples. Grey: genomes which are 80–90% complete with 5–10% contamination. Red: genomes which are > 90% complete with < 5% contamination.

Figure 2

Phylogenetic tree of the 391 draft microbial genomes from rumen samples, labelled by taxonomic class. Taxonomies were defined by GTDB-Tk. The tree was produced by MAGpy, using GraPhlAn24 (v.0.9.7), and rerooted at the branch between archaea and bacteria.


The tree is dominated by the Bacteroidota (136 RUGs: all order Bacteroidales) and the Firmicutes_A (121 RUGs), followed by smaller numbers of the Firmicutes_C (40 RUGs), Synergistota (20 RUGs: all family Aminobacteriaceae), Firmicutes (19 RUGs), Proteobacteria (15 RUGs), Cyanobacteriota (9 RUGs: all family Gastranaerophilaceae), Actinobacteriota (7 RUGs), Euryarchaeota (7 RUGs: all family Methanobacteriaceae), Spirochaetota (5 RUGs), Elusimicrobiota (3 RUGs: all family Endomicrobiaceae), UBP6 (3 RUGs: all genus UBA1177), Fibrobacterota (2 RUGs: all genus Fibrobacter), Riflebacteria (2 RUGs: all family UBA8953), Chloroflexota (1 RUG: family Anaerolineaceae) and Desulfobacterota (1 RUG: genus Desulfovibrio). All members of the phylum Firmicutes_A belonged to the class Clostridia: orders 4C28d-15 (n = 9), CAG-41 (n = 3), Christensenellales (n = 4), Lachnospirales (n = 56), Oscillospirales (n = 45), Peptostreptococcales (n = 2) and Saccharofermentanales (n = 2). The Firmicutes_C contained the orders Acidaminococcales (n = 8) and Selenomonadales (n = 32). The Firmicutes contained the orders Acholeplasmatales (n = 3), Erysipelotrichales (n = 1), Izimaplasmatales (n = 1), ML615J-28 (n = 1), Mycoplasmatales (n = 1), RFN20 (n = 7) and RF39 (n = 5). The Actinobacteriota contained the orders Actinomycetales (n = 1) and Coriobacteriales (n = 6). The Proteobacteria contained the orders Enterobacterales (n = 4), Paracaedibacterales (n = 1), RF32 (n = 8) and UBA3830 (n = 2). The Spirochaetota contained the orders Sphaerochaetales (n = 1) and Treponematales (n = 4).
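As a quick consistency check on the counts in this paragraph, the phylum totals can be summed and compared with the per-order breakdowns. The sketch below only re-enters numbers stated in the text; the variable names are illustrative.

```python
# RUG counts per phylum, as listed in the text.
phylum_counts = {
    "Bacteroidota": 136, "Firmicutes_A": 121, "Firmicutes_C": 40,
    "Synergistota": 20, "Firmicutes": 19, "Proteobacteria": 15,
    "Cyanobacteriota": 9, "Actinobacteriota": 7, "Euryarchaeota": 7,
    "Spirochaetota": 5, "Elusimicrobiota": 3, "UBP6": 3,
    "Fibrobacterota": 2, "Riflebacteria": 2, "Chloroflexota": 1,
    "Desulfobacterota": 1,
}

# Per-order counts for the phyla that the text breaks down further.
order_counts = {
    "Firmicutes_A": [9, 3, 4, 56, 45, 2, 2],
    "Firmicutes_C": [8, 32],
    "Firmicutes": [3, 1, 1, 1, 1, 7, 5],
    "Actinobacteriota": [1, 6],
    "Proteobacteria": [4, 1, 8, 2],
    "Spirochaetota": [1, 4],
}

total_rugs = sum(phylum_counts.values())
# Any phylum whose order counts do not add up to its phylum total.
mismatches = {
    phylum: (sum(counts), phylum_counts[phylum])
    for phylum, counts in order_counts.items()
    if sum(counts) != phylum_counts[phylum]
}
print(total_rugs, mismatches)
```

The phylum totals sum to 391 and every per-order breakdown matches its phylum total, so the figures in the paragraph are internally consistent.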

After sub-sampling, we found that samples from different ruminant species clustered separately by abundance of RUGs, and that this separation was significant (PERMANOVA: P = 3e−05). This may be because the vast majority of RUGs were only found in a single host species (Fig. 3), including 111 RUGs in red deer, 78 RUGs in reindeer, 40 RUGs in cow and 31 RUGs in sheep. Only 3 RUGs were found at ≥ 1× average coverage in all species: uncultured Bacteroidaceae sp. RUG30019, uncultured Prevotella sp. RUG30028 and uncultured Prevotella sp. RUG30114.
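The host-specific and shared counts above follow from a presence/absence call at a coverage threshold. The following is a minimal sketch of that logic, assuming a table of average coverage per RUG per host species; the RUG names and coverage values are hypothetical, and the 1× cut-off is the one quoted in the text.

```python
from collections import Counter

MIN_COVERAGE = 1.0  # a RUG counts as present at >= 1x average coverage

# Hypothetical average coverage per RUG per host species (not study data).
coverage = {
    "rug_1": {"cow": 3.2, "sheep": 1.4, "red deer": 2.0, "reindeer": 1.1},
    "rug_2": {"cow": 0.0, "sheep": 0.2, "red deer": 5.6, "reindeer": 0.4},
    "rug_3": {"cow": 1.8, "sheep": 2.5, "red deer": 0.1, "reindeer": 0.0},
    "rug_4": {"cow": 0.3, "sheep": 0.0, "red deer": 0.0, "reindeer": 4.9},
}


def hosts_present(per_host):
    """Return the set of host species in which a RUG passes the threshold."""
    return frozenset(h for h, cov in per_host.items() if cov >= MIN_COVERAGE)


# Count RUGs per exact host combination, as an UpSet plot does.
membership = Counter(hosts_present(per_host) for per_host in coverage.values())
single_host = sum(n for hosts, n in membership.items() if len(hosts) == 1)
all_hosts = sum(n for hosts, n in membership.items() if len(hosts) == 4)
print(single_host, all_hosts)
```

Each key of `membership` is one exclusive intersection (for example "cow and sheep only"), which corresponds to one bar in an UpSet-style plot.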

Figure 3

UpSetR graph showing the number of shared microbial genomes at average ×1 coverage (after sub-sampling to equal depth) within four ruminant species.


We compared our RUGs to microbial genomes which had previously been sequenced from the rumen to determine whether we had discovered any novel strains or species. We dereplicated our RUGs at 99% and 95% ANI against a “superset” of genomes containing rumen RUGs previously produced by our group20, Hess et al.11, Parks et al.25, Solden et al.26 and Svartström et al.27 and the genomes from the Hungate collection17. After dereplication at 99% and the removal of any RUGs with ≥ 99% ANI to an existing genome (as assigned by GTDB-Tk) or which clustered with members of the superset, 372 of our RUGs remained, representing putative novel strains. After dereplication at 95% and the removal of any RUGs with ≥ 95% ANI to an existing genome (as assigned by GTDB-Tk) or which clustered with members of the superset, 279 of our RUGs remained, representing putative novel species. The majority of these species originated from single-sample assemblies: 110 from red deer samples, 68 from reindeer samples, 23 from sheep samples and 1 from cow samples, suggesting that many novel microbial species remain to be discovered from non-cow ruminant hosts. These novel species are taxonomically diverse, with members belonging to the phyla Bacteroidota (n = 97), Firmicutes_A (n = 85), Firmicutes_C (n = 27), Firmicutes (n = 16), Synergistota (n = 14), Proteobacteria (n = 11), Cyanobacteriota (n = 9), Actinobacteriota (n = 5), Spirochaetota (n = 4), Euryarchaeota (n = 3), Elusimicrobiota (n = 3), Riflebacteria (n = 2), Chloroflexota (n = 1), Desulfobacterota (n = 1) and UBP6 (n = 1).
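The two-threshold novelty filter can be sketched as follows. This is a simplification under stated assumptions: "clustering with a member of the superset" is modelled here as having ANI at or above the threshold to some superset genome, which is not necessarily how the dereplication tool used in the study decides cluster membership. The function name, field layout and ANI values are all hypothetical.

```python
from typing import Optional


def is_putatively_novel(
    ani_to_reference: Optional[float],
    ani_to_superset: Optional[float],
    threshold: float,
) -> bool:
    """True if a RUG has no match at or above `threshold` % ANI.

    `None` means no comparable genome was found, which is treated as
    'below threshold' in this sketch.
    """
    def below(ani: Optional[float]) -> bool:
        return ani is None or ani < threshold

    return below(ani_to_reference) and below(ani_to_superset)


# Hypothetical best-hit ANI values: (existing reference genome, superset genome).
records = {
    "rug_a": (99.4, 97.0),
    "rug_b": (96.8, 93.0),
    "rug_c": (None, 88.5),
    "rug_d": (90.0, 99.6),
}

novel_strains = [r for r, (ref, sup) in records.items()
                 if is_putatively_novel(ref, sup, 99.0)]
novel_species = [r for r, (ref, sup) in records.items()
                 if is_putatively_novel(ref, sup, 95.0)]
print(novel_strains, novel_species)
```

In this toy example `rug_b` is a putative novel strain but not a novel species, which mirrors why the 95% filter in the text leaves fewer RUGs (279) than the 99% filter (372).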

Thirty-one of our RUGs could be taxonomically assigned to species level. These include bacteria which are commonly isolated from the rumen, including novel strains of Bacteroidales bacterium UBA1184 [25], Bacteroidales bacterium UBA3292 [25], Butyrivibrio fibrisolvens, Escherichia coli, Fibrobacter sp. UWB2 [28], Lachnospiraceae bacterium AC3007 [17], Lachnospiraceae bacterium UBA2932 [25], Methanobrevibacter sp. UBA188 [25], Methanobrevibacter sp. UBA212 [25], Prevotella sp. UBA2859 [25], Ruminococcaceae bacterium UBA3812 [25], Ruminococcus sp. UBA2836 [25], Sarcina sp. DSM 11001 [17], Selenomonas sp. AE3005 [17], Succiniclasticum ruminis and Succinivibrio dextrinosolvens.

Comparing microbial taxonomies, CAZymes and KEGG orthologs between ruminant species

We assigned taxonomies to paired sequence reads using our custom Kraken database containing RefSeq complete genomes, our RUGs, and the superset of rumen-isolated microbial genomes. After subsampling, we compared the abundance of members of the microbiota in different ruminant species at multiple taxonomic levels. Averaging reads within each ruminant species, the vast majority of reads mapped to bacteria (Sheep: 97%, Cow: 97%, Reindeer: 92%, Red deer: 98%), with smaller amounts of archaea (Sheep: 2.3%, Cow: 2.1%, Reindeer: 6.3%, Red deer: 1.9%) and Eukaryota (Sheep: 0.23%, Cow: 1.3%, Reindeer: 1.8%, Red deer: 0.56%). Eukaryota reads originated primarily from fungi and protists. In all ruminants, Bacteroidetes was the most abundant phylum (Sheep: 64%, Cow: 65%, Reindeer: 54%, Red deer: 52%), with Firmicutes being the second most abundant (Sheep: 29%, Cow: 26%, Reindeer: 26%, Red deer: 38%). Using PERMANOVA, significant differences in the abundance of taxonomies between ruminant species were found at both high (Kingdom: P = 0.01058, Phylum: P = 0.00017) and low (Family: P = 1e−05, Genus: P = 3e−05) taxonomic levels (Fig. 4).
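For readers unfamiliar with PERMANOVA, the sketch below shows the general idea in plain Python: a pseudo-F statistic is computed from a distance matrix and group labels, and its significance is estimated by shuffling the labels. This is a generic one-factor illustration on toy data, not the software or settings used in the study; the function names, the toy points and the permutation count are assumptions.

```python
import random
from itertools import combinations


def pseudo_f(dist, groups):
    """One-factor PERMANOVA pseudo-F from a square distance matrix."""
    n = len(groups)
    labels = sorted(set(groups))
    # Total sum of squares from all pairwise squared distances.
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    ss_within = 0.0
    for label in labels:
        members = [i for i, g in enumerate(groups) if g == label]
        ss_within += sum(
            dist[i][j] ** 2 for i, j in combinations(members, 2)
        ) / len(members)
    ss_among = ss_total - ss_within
    a = len(labels)
    # Assumes ss_within > 0, which holds for the toy data below.
    return (ss_among / (a - 1)) / (ss_within / (n - a))


def permanova(dist, groups, n_perm=999, seed=0):
    """Return (observed pseudo-F, permutation p-value)."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    # The observed labelling counts as one permutation, so p >= 1/(n_perm+1).
    return observed, (hits + 1) / (n_perm + 1)


# Toy data: six samples on a line, two well-separated groups of three.
points = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
dist = [[abs(p - q) for q in points] for p in points]
groups = ["a", "a", "a", "b", "b", "b"]
f_obs, p_value = permanova(dist, groups)
print(f_obs, p_value)
```

Because the p-value cannot fall below 1/(n_perm + 1), a reported value of 1e−05 would be consistent with 99,999 permutations, although the permutation count is not stated in this section. The toy example also shows that with very few samples per group the attainable p-value is limited by the number of distinct labellings.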

Figure 4

NMDS of ruminal samples clustered by abundance of taxonomies, using Bray–Curtis dissimilarity values. (a) Kingdom (PERMANOVA; P = 0.01058), (b) Phylum (PERMANOVA; P = 0.00017), (c) Family (PERMANOVA; P = 1e−05), (d) Genus (PERMANOVA; P = 3e−05).
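The Bray–Curtis dissimilarity used for these ordinations is the sum of absolute abundance differences divided by the sum of all abundances in the two samples. A minimal sketch, with hypothetical abundance vectors standing in for per-sample taxon counts:

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two non-negative abundance vectors."""
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(a + b for a, b in zip(u, v))
    # Two empty samples are treated as identical in this sketch.
    return numerator / denominator if denominator else 0.0


# Hypothetical abundances of three taxa in three samples (not study data).
samples = {
    "sample_1": [10, 0, 5],
    "sample_2": [5, 5, 5],
    "sample_3": [0, 20, 0],
}

names = list(samples)
matrix = [[bray_curtis(samples[a], samples[b]) for b in names] for a in names]
for name, row in zip(names, matrix):
    print(name, [round(x, 3) for x in row])
```

The value is 0 for identical samples and 1 for samples that share no taxa, which is why sub-sampling to equal depth before computing it matters: unequal totals inflate the numerator even when relative composition is the same.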


We also compared the abundance of genes encoding specific CAZymes between species. These enzymes are responsible for the synthesis, binding and metabolism of carbohydrates. The carbohydrate esterases (CEs), glycoside hydrolases (GHs), glycosyltransferases (GTs) and polysaccharide lyases (PLs) act to degrade cellulose, hemicellulose and other carbohydrates which could otherwise not be digested by the host. Non-catalytic carbohydrate-binding modules (CBMs) bind to specific carbohydrates, increasing the efficiency of enzymatic degradation29. The auxiliary activities (AAs) redox enzymes are reclassified CBMs which are lytic polysaccharide monooxygenases30. In our samples we found the following numbers of CAZyme families: 6 AAs, 39 CBMs, 14 CEs, 191 GHs, 61 GTs and 27 PLs. The ten most abundant GHs in the different ruminant species were: for cows GH2, GH3, GH31, GH97, GH28, GH51, GH43_10, GH105, GH10 and GH95; for sheep GH2, GH3, GH28, GH31, GH97, GH32, GH51, GH77, GH78 and GH95; for red deer GH2, GH3, GH31, GH97, GH77, GH32, GH51, GH109, GH28 and GH78; and for reindeer GH2, GH3, GH92, GH109, GH97, GH13, GH31, GH78, GH28 and GH77. The abundance of CAZyme genes differed significantly between ruminant species (PERMANOVA: P = 1e−05, Fig. 5). However, the vast majority of CAZyme families were found in all sample types (Fig. 6), indicating that a common set of CAZymes is present across ruminant species consuming different diets and living in vastly different conditions.
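The four top-ten GH lists overlap heavily, which is easier to see with set operations. The sketch below re-enters the lists and family counts exactly as given in the text; only the variable names are invented.

```python
# Ten most abundant GH families per host species, as listed in the text.
top_ten_gh = {
    "cow": ["GH2", "GH3", "GH31", "GH97", "GH28", "GH51",
            "GH43_10", "GH105", "GH10", "GH95"],
    "sheep": ["GH2", "GH3", "GH28", "GH31", "GH97", "GH32",
              "GH51", "GH77", "GH78", "GH95"],
    "red deer": ["GH2", "GH3", "GH31", "GH97", "GH77", "GH32",
                 "GH51", "GH109", "GH28", "GH78"],
    "reindeer": ["GH2", "GH3", "GH92", "GH109", "GH97", "GH13",
                 "GH31", "GH78", "GH28", "GH77"],
}

# Families that appear in every host's top ten.
shared = set.intersection(*(set(fams) for fams in top_ten_gh.values()))

# Families that appear in exactly one host's top ten.
unique = {}
for host, fams in top_ten_gh.items():
    others = set().union(
        *(set(f) for h, f in top_ten_gh.items() if h != host)
    )
    unique[host] = sorted(set(fams) - others)

# Number of CAZyme families detected per class, as stated in the text.
family_counts = {"AA": 6, "CBM": 39, "CE": 14, "GH": 191, "GT": 61, "PL": 27}
total_families = sum(family_counts.values())

print(sorted(shared), unique, total_families)
```

Five families (GH2, GH3, GH28, GH31 and GH97) are in the top ten for all four hosts, while only cow (GH10, GH105, GH43_10) and reindeer (GH13, GH92) have top-ten families not shared with any other host. The class counts sum to 338 families.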

Figure 5

NMDS of ruminal samples clustered by abundance of CAZymes, using Bray–Curtis dissimilarity values (PERMANOVA; P = 1e−05).

Figure 6

UpSetR graph showing the number of shared CAZyme families at average ×1 coverage within four ruminant species.


DESeq2 was used to identify specific CAZymes which were significantly more abundant in one ruminant species versus another (Supplementary data S3). Those CAZymes which were consistently more abundant in a specific species when compared to the other species are listed in Supplementary tables S1–S4.

CAZymes are often organised into Polysaccharide Utilization Loci (PULs), which comprise a set of genes that enable the binding and degradation of one or more specific carbohydrates. We used the software PULpy to predict PULs in our Bacteroidales RUGs. Of the 136 RUGs which belong to the order Bacteroidales, 112 contain putative PULs. Within these RUGs we identified 970 PULs, with the number of PULs per RUG ranging from 1 to 35. The largest number of PULs originating from one RUG was 35, from uncultured Bacteroidales sp. RUG30227; these encoded a wide range of CAZymes. This RUG was more abundant in reindeer samples than in samples from other ruminants. Of the 970 PULs, 332 were a single susC/D pair. A summary of identified PULs can be found in Supplementary data S4 and Supplementary Fig. S1.
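A few derived proportions help put these PUL counts in context. The sketch below uses only the numbers stated in the paragraph; the variable names are illustrative.

```python
bacteroidales_rugs = 136   # RUGs assigned to Bacteroidales
rugs_with_puls = 112       # of which contain at least one putative PUL
total_puls = 970           # PULs identified across those RUGs
susc_d_only = 332          # PULs consisting of a single susC/D pair

fraction_with_puls = rugs_with_puls / bacteroidales_rugs
mean_puls_per_rug = total_puls / rugs_with_puls
fraction_susc_d_only = susc_d_only / total_puls

print(f"{fraction_with_puls:.1%} of Bacteroidales RUGs carry a PUL")
print(f"{mean_puls_per_rug:.1f} PULs per PUL-containing RUG on average")
print(f"{fraction_susc_d_only:.1%} of PULs are a lone susC/D pair")
```

Roughly four in five Bacteroidales RUGs carry at least one predicted PUL, and about a third of all predicted PULs are a lone susC/D pair with no adjacent CAZymes.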

We also examined the abundance of genes which belonged to specific KEGG orthologs. KEGG orthologs represent a wide range of molecular functions and are defined by a network-based classification. We found that, as for CAZymes, ruminant species clustered significantly by the abundance of genes with specific KEGG orthologs (PERMANOVA: P = 1e−05, Fig. 7) and that the vast majority of orthologs were found in all ruminant species (Fig. 8). However, the large number of orthologs (n = 729) which were only found in the two domesticated species (cows and sheep) is also worthy of note. The two sheep samples also did not cluster visually to the same extent as the samples originating from the other ruminant species (Fig. 7). DESeq2 was used to identify KEGG orthologs which were significantly more abundant in one ruminant species versus another (Supplementary data S5). Those orthologs which were consistently more abundant in specific ruminant species (adjusted p value < 0.05) are listed in Supplementary data S6.
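The "adjusted p value" threshold refers to a multiple-testing correction. DESeq2 applies the Benjamini–Hochberg procedure by default; whether the default was kept here is not stated in this section, so the sketch below is an illustration of that procedure in general rather than a reproduction of the analysis. The function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for four orthologs (not study data).
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
significant = [p < 0.05 for p in adjusted]
print([round(p, 4) for p in adjusted], significant)
```

Each raw p-value is scaled by the number of tests divided by its rank, so an ortholog with a raw p-value just under 0.05 can fail the adjusted threshold once thousands of orthologs are tested together.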

Figure 7

NMDS of ruminal samples clustered by abundance of KEGG orthologs, using Bray–Curtis dissimilarity values (PERMANOVA; P = 1e−05).

Figure 8

UpSetR graph showing the number of shared KEGG orthologs at average ×1 coverage within four ruminant species.


Source: Ecology - nature.com
