A time-resolved meta-omics dataset
To characterize the niche space of lipid-accumulating populations as well as resistance and resilience of the microbial community, we sampled a municipal BWWTP weekly over a 14-months period (from 2011-03-21 to 2012-05-03). Additionally, two preliminary time-points outside of the time-series were included13,26. Samples were split into intracellular and extracellular fractions, followed by concomitant biomolecular extractions27 and high-throughput measurements (Fig. 1). MG, MT, and MP data were obtained on the intracellular fractions and MM data was generated on both the intracellular and extracellular fractions.
Samples are derived from in situ sampling of an anoxic tank of a municipal biological wastewater treatment plant. Metagenomic (MG), metatranscriptomic (MT), metaproteomic (MP), and meta-metabolomic (MM) data is generated. Physicochemical data is also collected. Additionally, MG and MT data is generated for ex situ experiments using biomass from the same system and fed with oleic acid under different oxic conditions to evaluate short-term responses to pulse disturbance. The time-series meta-omics data is integrated to define metagenome-assembled genomes (MAGs) over all time points. Representative MAGs (rMAGs) across time are selected for further analysis. The rMAGs’ functional potential is used to infer the fundamental niches. Abundance and activity data are derived from the functional omics and substrate usage is inferred per population. The variability of gene expression is used to assess the phenotypic plasticity of the individual populations.
After quality filtering, the per-sample averages of MG and MT reads were 5.3 × 107 (±7.7 × 106 s.d.) and 3.3 × 107 reads (±1.2 × 107 s.d.), respectively (Supplementary Data 1). We performed sample-specific genome assemblies (average of 4.1 × 105 contigs per sample) followed by binning28 yielding a total of 1364 MAGs passing our quality filtering criteria (see “Methods” section). To track the abundance, gene expression, and activity of individual microbial populations over time, we dereplicated29 the MAGs across samples to generate 220 representative MAGs (rMAGs). From these, we further selected those with the highest completeness resulting in 78 rMAGs (76.2% mean completeness, 2.2% mean contamination) (Supplementary Data 2). These genomes represent the major populations across the time-series, with an average mapping percentage of 26% ± 3% (s.d.) and 27% ± 3% (s.d.) of total MG reads and total MT reads per time-point, respectively, and are corroborated by a previous study based on 16S rRNA amplicon sequencing13. For the MP measurements, we obtained a per-sample average of 1.5 × 105 ± 8.2 × 103 (s.d.) MS2 spectra and a total of 7.6 × 106 MS2 spectra. Of 7.8 × 105 identified peptides, 3.3 × 105 (43%) could be matched to 2.1 × 105 predicted coding sequences of the 78 rMAGS. Per time-point, on average 1.5 × 104 ± 4.5 × 103 (s.d.) spectral matches, i.e., on average 94% of all rMAG-associated matches could be assigned to genes with predicted functions, i.e., assigned KEGG ortholog groups (KOs). To study the community-wide resource space and metabolite turnover, we measured metabolite levels by an untargeted approach using gas chromatography (GC) coupled with mass spectrometry (MS) (Supplementary Data 3). In total, 89% (58 of 65) of the identified metabolites could be linked to enzymes encoded by the rMAGs. We estimated resource uptake by calculating intracellular vs. extracellular metabolite ratios for 42 metabolites detected in both fractions (Supplementary Data 3). Additionally, six abiotic parameters were measured during sampling, as well as 34 parameters recorded continuously as part of the BWWTP online monitoring (Supplementary Data 4).
We also generated MG and MT data for the ex situ experiments. These simulated the fluctuating conditions within the BWWTP, namely the short-term response to pulse disturbances of oleic acid influx under shifting dissolved oxygen conditions. We sequenced DNA and RNA fractions obtained at 0, 5, and 8 h after addition of oleic acid, yielding on average 1.02 × 108 MG and 9.33 × 107 MT reads per sample. The increased sequencing depth compared to the in situ time-series was important to obtain a fine-grained view on short-term responses to oleic acid. Mapping of the sequencing reads to the selected set of rMAGs revealed mapping percentages comparable to the in situ time-series (mean: 21% ± s.d.: 1% for both MG reads and MT reads).
Overall, our meta-omics dataset comprehensively describes mixed microbial communities underlying lipid-accumulation processes in BWWTPs, and in particular their functional potential, composition, activity, as well as substrate availability and assimilation.
Distinct niche types
To resolve the fundamental niches of the pertinent bacterial populations through their functional genomic potential, we assigned KOs to the rMAGs’ predicted coding sequences. We hypothesized that individual populations would form clusters based on the similarity/dissimilarity of their functional potential. We found four distinct clusters of rMAGs by projecting pairwise Jaccard distances of KO presence (Fig. 2a and Supplementary Fig. 1). These functional clusters (FunCs) represent differences of known, overall metabolic capabilities of the rMAGs and reflect their fundamental niches. FunC-1 consisted of Actinobacteria, and FunC-2 was primarily comprised of members of the Bacteroidetes phylum, mainly of the Sphingobacteriia class (Fig. 2a). FunC-3 contained Betaproteobacteria and Gammaproteobacteria whereas FunC-4 appeared more diverse, containing Spirochaetia as a subcluster, Deltaproteobacteria, and taxonomically unclassified rMAGs. We found mash-based genomic distance30 to be strongly linked to FunC assignment (PROCRUSTES sum of squares: 0.399, correlation 0.775, PROTEST p-value 0.001, Supplementary Fig. 2a), highlighting that phylogeny is a strong determinant for FunC assignment. However, some distantly linked subgroups were defined by their shared functional complement, i.e., assigned to a different FunC than their neighbors in a corresponding phylogenetic tree (Supplementary Fig. 2b). This shows that KO profile similarity-based analyses provide important information in addition to phylogeny-based approaches31.
a Multidimensional scaling (MDS) of Jaccard distances for the functional repertoire (presence of KEGG ortholog groups [KOs]) for each rMAG. Ellipses containing 95% (inner) or 99% (outer) of cluster-assigned data points are shown resulting in four distinct functional clusters (FunCs). Colors indicate the class-level taxonomy of the rMAGs. b Numbers of shared and unique KO assignments between the FunCs. Colored bars show the total number of nonredundant KO assignments within the individual FunCs. Overlaps between different sets of FunCs and their unique KOs are represented by the central black bars with the points below defining the members of the respective sets. c Presence of key functions within the four FunCs. Bars next to metabolic conversions show the proportion of rMAGs encoding characteristic enzymes for the respective reaction or pathway adjusted for mean rMAG completeness. Pathways ubiquitously present across rMAGs are shown in gray color. Source data are provided as a Source Data file. red. reductase, GLN syn. glutamine synthetase, GLU dhg. glutmatate dehydrogenase, glyox. cyc. glyoxylate cycle, ethylm.-CoA ethylmalonyl-CoA pathway, PHA depolym. PHA depolymerase.
A total of 1857 KOs was shared between all FunCs and we found that FunCs 1, 3, and 4 contained comparable numbers of nonredundant KOs with 4276, 4177, and 4129 KOs, respectively (Fig. 2b). FunC-2 exhibited a reduced number of KOs (3550), however it also represented the least taxonomically diverse FunC as it almost exclusively consisted of Haliscomenobacter spp. and Chitinophaga spp. (Supplementary Data 2). We tested for the molecular functions that were significantly enriched in individual FunCs and found, among others, functions related to lipid metabolism for FunC-1, amino sugar, and nucleotide sugar metabolism for FunC-2, and biofilm and secretion systems for FunC-3 to be enriched (Fig. 2c and Supplementary Data 5; one-sided Fisher′s exact test, adjusted p-values < 0.05).
While lipid-accumulating organisms hold great potential for the recovery of high-value molecules5, interactions between these organisms as well as the community at large are understudied in situ. We found that diacylglycerol O-acyltransferase (DGAT/WS), which is involved in lipid storage32, was encoded in 23 out of 24 rMAGs of FunC-1, pointing to the importance of TAG accumulation in this cluster. Most FunC-3 members also encoded DGAT/WS (14 of 19). Moreover, PHA synthase was enriched in this cluster (15 of 19). All rMAGs encoded lipases, functions involved in fatty acid synthesis, or beta-oxidation. However, several acyl-CoA and acyl-ACP dehydrogenases were overrepresented in FunC-1 and FunC-3. Additionally, acetyl-CoA acetyltransferases involved in the degradation and biosynthesis of fatty acids were prevalent throughout all FunCs. The enrichment in FunC-1 and FunC-3 for genes involved in lipid accumulation are consistent with previous metabolic characterizations, with FunC-1 consisting mainly of Actinobacteria for which TAG accumulation has been described33. FunC-3 contains Betaproteobacteria and Gammaproteobacteria that have been characterized as TAG, WE, and/or PHA accumulators, e.g., Thauera spp., Albidiferax spp., or Acinetobacter spp.33,34. Importantly, we observed a difference between these FunCs in the utilization of acetyl-CoA. Specifically, FunC-1 members showed an enrichment in functions related to the ethylmalonyl-CoA pathway (crotonyl-CoA reductase and enoyl ACP reductase), while FunC-3 members encoded key enzymes involved in the glyoxylate cycle (malate synthase and isocitrate lyase).
We further determined specific functional enrichment for the four FunCs in relation to the breakdown of other macromolecules (including CAZymes and proteases), nitrogen cycling, stress response, and motility (Supplementary Data 5). The discriminating functions point towards interdependencies between the different FunCs, e.g., in terms of denitrification (Fig. 2c). We found that the separation into taxonomically consistent groups is accompanied by specific conserved functions, e.g., strong enrichment in FunC-1 for WhiB transcriptional regulators characteristic of the Actinobacteria35. Overall, we observed a widely distributed set of core functions in foaming sludge microbiomes and identified groups of populations characterized by distinct functional potential in lipid metabolism, amino sugar, and nucleotide sugar metabolism as well as biofilm and secretion systems.
Community dynamics and stability
To understand whether population dynamics can be related to substrate availability and other abiotic factors36, we used MG depth-of-coverage to infer rMAG population abundance across the time-series. We computed distances between the rMAGs’ abundance profiles (based on their pairwise correlations) and found that the dynamics of rMAGs can be partially explained by the FunC assignment (PERMANOVA R2 = 0.12, Pr > F = 0.002; no significant difference in dispersion; Supplementary Fig. 3), thereby linking FunC membership to temporal abundance shifts. The most abundant taxa (Supplementary Data 2) included Candidatus Microthrix (26.0% relative abundance across the time-series; referred to as Microthrix in the remaining text), Acinetobacter (8.1%), Haliscomenobacter (8.0%), Intrasporangium (7.2%), Leptospira (6.3%), Albidiferax (5.7%), and Dechloromonas (2.4%) (Fig. 3a). Several of the recovered rMAGs belonged to filamentous taxa according to the MiDAS field guide database for organisms in activated sludge37, such as the highly abundant Microthrix, and Haliscomenobacter, as well as the less abundant Anaerolinea (1.1 %) and Gordonia (0.2 %).
a, b Relative abundance and expression levels of recovered populations represented by rMAGs over time based on MG depth (a) and MT depth (b) of coverage, respectively, representing mapping percentages of MG [26% ± 3% (s.d.)] and MT [27% ± 3% (s.d.)]. The relative abundance of individual rMAGs is grouped based on genus-level taxonomic assignment with rMAGs of unresolved taxonomy grouped in “Other”. Recovered genera with mean abundance below 2% are summarized as a single group (light gray). c, d Ordination of Bray–Curtis dissimilarity of relative abundances, MG (c) and MT (d), of individual rMAGs constrained by selected abiotic factors (metabolite levels, metabolite-ratios, and physico-chemical parameters shown as black arrows with arrow lengths indicating environmental scores as predictors for each factor). Points are colored by month of sampling and point-shape reflects the year of sampling. Thin black lines connecting the points visualize the time course of sampling. Source data are provided as a Source Data file.
Variations during the operation of BWWTPs occur largely due to changes in the influent wastewater composition and climatic conditions38. We observed gradual changes in the community structure with the seasons (Fig. 3a). In October 2011 (month seven of the timeseries), the community composition began to shift, with a markedly altered composition in late November 2011. This shift was characterized by spikes in the relative abundance of Leptospira (peak at 2011-11-23) and Acinetobacter (peak at 2011-11-29) (Fig. 3a), and co-occurred with a pronounced shift in substrates (Fig. 4 and Supplementary Fig. 4). The substrates included mainly nonpolar metabolites, including long-chain fatty acids (LCFAs) and glycerides, as well as polar metabolites mannose, glucose, disaccharides, ethanolamine, and putrescine. We found that the intersample distances of MG-based abundances could partially be explained by a subset of the abiotic factors (Fig. 3c). Summer samples were characterized by higher temperatures, phosphate levels and higher intracellular vs. extracellular oleic acid ratios. Higher extracellular mannose levels and a slight increase in conductivity marked the beginning of the autumn shift. During November, intracellular and extracellular levels for LCFAs increased, indicating a higher availability or turnover of LCFAs, but not necessarily an equivalent conversion to neutral storage lipids. In the subsequent winter time-points, substrate levels normalized and the community transitioned back to the predisturbance state.
Z-score transformed metabolite intensities, metabolite ratios, and physico-chemical parameter levels over time are shown. Row annotations highlight classes of metabolites and parameters, measurement types (bnp: intracellular nonpolar metabolites, bp: intracellular polar m., pcparams: physico-chemical parameters, ratio: metabolite intrac./extrac. ratio, snp: extracellular nonpolar m., sp: extracellular polar m.), and the subtype or fraction of the measurement (manual: measured during sampling, online: measured during WWTP operation). Selected rows are shown (comprehensive heatmap shown in Supplementary Fig. 6). Source data are provided as a Source Data file.
The dominance of Microthrix was re-established within approximately ten generations, given estimates for in situ growth rates of 0.12–0.3 growth cycles per day7,8. The stability39 of the individual rMAGs was heavily affected by the November shift (mean population stability: 1.43 ± 0.69 s.d.), compared to the stability when excluding the respective time-points (mean population stability: 2.39 ± 1.28 s.d.; 2011-11-02 to 2011-11-29; Supplementary Data 2). The observed population dynamics indicate that the community composition is resilient, i.e., recovers after pronounced changes in available substrates, and resistant to small-scale environment fluctuations over time.
While MG depth was used as a proxy for population abundance, MT depth enabled the analysis of transcriptional activity within the community and of individual populations (Fig. 3b). The comparison of intersample distances based on mean, relative MT depth showed similar patterns to MG-based results (Fig. 3c), albeit with a higher degree of variability indicated by increased inter-sample distances (Fig. 3d). A comparison of relative MP counts showed a more even distribution between populations with comparable overall trends (Supplementary Fig. 5). Samples collected in April 2011 and 2012 appeared to represent transition states between seasons. Additionally, a set of late winter and early spring samples in 2011 and 2012 showed higher similarities at the expression level than at the abundance level. Interestingly, the high abundance of individual genera, such as Microthrix or Chitinophaga was not necessarily reflected in their mean expression levels (Fig. 3b and Supplementary Fig. 5): populations assigned to Leptospira, Haliscomenobacter, Anaerolinea, and Acinetobacter showed higher mean expression overall. Spikes in relative MT depth as for Acinetobacter rMAGs (Fig. 3b; 2011-04-14, 2011-05-08, and 2012-04-25) point towards increased activity around these time-points, which however did not lead to major shifts in community structure. Notably, higher expression levels of Acinetobacter were succeeded by increased expression levels of Haliscomenobacter (2011-04-14 to 2011-05-20) or Anaerolinea (2011-05-08 to 2011-09-19). On average, MT-based stability values were less affected by the community shift than MG-based stability values (Supplementary Table 2). We also observed adaptation of metabolic pathway activity to environmental conditions (Fig. 5). Pentose to EMP pathway intermediates exhibited the highest correlation between MT and MP abundances, followed by Hydrogen metabolism and Fatty acid oxidation. Several pathways exhibited a characteristic drop during the November shift, e.g., hydrogen metabolism, hydrocarbon degradation, and TCA cycle, while fatty acid oxidation showed a marked peak. This highlights the transition from dominance by generalist, lipid-accumulating populations towards a lipolytic community.
Metatranscriptomic and metaproteomic levels (normalized relative expression for MT data and normalized relative spectral counts for MP data) of rMAG-derived genes assigned to FOAM ontology-based functional categories. Pearson correlation coefficients (r) of MT and MP values are shown in the title of each panel. Panels are ordered from highest to lowest mean MP relative count in row-major order. Source data are provided as a Source Data file.
With each of the four FunCs comprising multiple organisms encoding similar KOs and, hence, metabolic capabilities, we studied how individual populations adapt to their environment. To this end, we linked changes in community structure and in the expression levels of individual populations to the influence of environmental parameters. While rMAG abundance patterns could be linked to FunC assignment (Supplementary Fig. 3), we could not identify an analogous categorization when correlating rMAG abundances to abiotic factors. Instead, correlation patterns indicating similar preferences to environmental conditions emerged for subgroups of rMAGs across different FunCs (Supplementary Fig. 7). This shows that populations with a similar fundamental niche type responded differently to the environmental conditions pointing towards functional plasticity and, thus, adaptations of their realized niches
Niche characteristics of in situ and ex situ time-series
While we identified four fundamental niche types, it may be assumed that cohabiting species cannot occupy the same realized niches, leading to realized niche segregation within and between types. We hypothesized that different degrees of niche overlap, leading to variable levels of competition, must exist40,41. To better understand the complementarity of realized niches, we used the functional omics data to study how rMAGs overlapped in relation to their encoded genes and how rMAGs varied in their expression profiles. While the former represents competition between populations with overlapping profiles, the latter is an important factor for the adaptability and overall survival strategy of individual populations. We distinguished between expressed KOs and nonexpressed KOs based on MT/MG ratios as well as MP data and computed distances between the resulting time-point-specific expression profiles. While the separation based on the functional potential was preserved in a clustering of expression profiles (in particular for FunC-2), the expression profiles of FunCs-1, FunCs-3, and FunCs-4 overlapped to a greater extent than those of FunC-2 (Fig. 6a). Two Anaerolinea populations assigned to FunC-1 appeared to express similar functions compared to the rMAGs of FunC-3 and FunC-4 and were found in a subgroup of rMAGs that showed a higher overall activity in terms of MT/MG ratios also when clustering expression profiles per time-point (Supplementary Fig. 8). Overall, the clusters based on KO expression status per time-point did not exhibit a separation according to the grouping into FunCs (Supplementary Fig. 8). This indicates a propensity of the respective rMAGs to more frequently express shared KOs than discriminatory KOs and, consequently, increased the competition for specific substrates.
a MDS of time-point specific expression profiles based on MT/MG ratios or evidence at the MP level. Colors indicate FunC assignment of the individual rMAGs. Point shape represents cluster assignment based on automated clustering of the embedded points. Ellipses containing 95% of cluster-assigned data points are shown. Points size represents the average MT/MG depth ratios of the individual rMAGs. The amounts of variance explained by the first two dimensions are shown on the respective axes. b Mean MT/MG depth ratios over all time-points are shown per condition for 78 rMAGs (boxplots show: center line, median; box limits, upper and lower quartiles; whiskers, 1.5× interquartile range; Each group of boxplots corresponds to a group of rMAGs (FunC-1 n = 24, FunC-2 n = 23, FunC-3 n = 19, FunC-4 n = 12), each boxplot represents an independent experiment.). c Mean MT/MG depth ratios grouped according to class-level taxonomic assignment of the rMAGs with the number of rMAGs for each group are shown in the top of the plot (n). Source data are provided as a Source Data file.
To investigate the importance of individual, discriminatory functions, we selected rMAG clusters, based on gene expression and MP counts, to which the two most abundant rMAGs (D51_G1.1.2, A01_O1.2.4) had been assigned. We observed that clusters into which rMAG D51_G1.1.2 (Microthrix) was consistently categorized showed expression of few KOs with the majority being ribosomal proteins, TCA cycle-related enzymes such as pyruvate, malate, and glyceraldehyde 3-phosphate dehydrogenases, chaperones, and most frequently the WhiB family transcriptional regulator (19 time-points; Supplementary Data 6).
Clusters containing rMAG A01_O1.2.4 (Acinetobacter) frequently exhibited expression of genes related to motility and chemotaxis as well as stress response, but also functions related to phosphate accumulation, such as K08311 and K00937 (Supplementary Data 6). KOs related to lipid metabolism were also frequently expressed in these clusters e.g. acylglycerol lipase (in 35 time-points) or diacylglycerol O-acyltransferase (25 time-points). This indicates that high expression of key functionalities is an integral part of the strategies of the populations within these clusters even though they differed with respect to their encoded functions.
We next studied how the observed distinction between populations with high activity is linked to phenotypic plasticity. As alternating oxygen levels in BWWTPs play an important role in selecting for lipid accumulating populations7,42, we added oleic acid, the preferred carbon source for Microthrix43, in lab-scale experiments under different oxygen fluctuation conditions8 (see “Methods” section; Fig. 1). These ex situ conditions involved aerobic, anoxic, aerobically preconditioned biomass followed by hourly anoxic alternations, and anoxically preconditioned followed by hourly aerobic alternations. The MT/MG ratios for FunC-1 and FunC-3 were higher ex situ when compared to the in situ samples, and vice versa for FunC-2 and FunC-4 rMAGs (Fig. 6b). Furthermore, especially for FunC-3, average MT/MG ratios were highest in the aerobic conditions and lowest in the anaerobic conditions. This is in line with FunC-3 being comprised mainly of Betaproteobacteria and Gammaproteobacteria, which include mostly aerobic genera44. A more fine-grained view on differences in specific activity was obtained, when grouping rMAGs based on taxonomic assignment (Fig. 6c). While rMAGs of the classes Acidimicrobia and Actinobacteria (FunC-1) showed the lowest mean MT/MG ratios across the in situ time-series (0.5), the ratio was twice as high in the ex situ experiments across all conditions which can be attributed to the oleic acid pulse. Betaproteobacteria (FunC-3) behaved similarly, while Gammaproteobacteria (FunC-3) showed a tendency towards higher activity with increased oxygen levels. We observed high activity for rMAGs assigned to Anaerolineae and Spirochaetia in the in situ time-series. Interestingly, this was not the case for Spirochaetia in the ex situ experiments, which points towards the necessity for additional substrates. The Anaerolineae rMAGs, with taxonomically related species being mainly anaerobic45, showed the lowest MT/MG ratio under the alternating conditions, while Deltaproteobacteria rMAGs showed high MT/MG ratios. Overall, the differentiated responses under alternating conditions point to distinct short-term and long-term adaptation strategies.
To study how fast the adaptations in response to the influx of oleic acid occur, we compared the baseline (0 h time-points, before oleic acid addition) against the 5 and 8 h time-points (after oleic acid addition). At 5 h, lipases, involved in TAG hydrolysis, for which high expression in the in situ samples was observed, were downregulated in the ex situ response to the addition of oleic acid (Supplementary Fig. 9a). An increased number of genes related to beta-oxidation were upregulated at 5 h, particularly in rMAGs assigned to FunC-3 (Supplementary Fig. 9b). Similar effects were observed when comparing the 0 h and 8 h time-points (Supplementary Fig. 10a, b). This suggests that responses in gene expression happen within the 5 h timeframe but on distinct time scales for different populations. In-depth analyses of the populations exhibiting the highest expression levels for TAG lipases, DGAT/WS, and PHB synthases (Supplementary Note 1 and Supplementary Figs. 11–14) underline the previously determined role of Microthrix as a key lipid accumulator in BWWTPs13,46. The results also indicate that populations such as Anaerolinea, Leptospira and Acinetobacter overlap with Microthrix in terms of their capacity to assimilate LCFAs and available neutral lipids. Niche complementarity and plasticity, i.e., overlapping fundamental and realized niches, as well as gene expression variability, impart population-independent processing of lipids in situ. From an ecosystem perspective, this community-wide trait confers functional resistance and resilience.
Source: Ecology - nature.com