A network approach to elucidate and prioritize microbial dark matter in microbial communities

Overall strategy to detect the relevance of Unknown taxa

A pipeline based on network analysis was developed to detect and quantitatively measure the overall and individual impact of Unknown taxa on their environmental communities (Fig. 1). Briefly, Illumina 16S rRNA sequencing fastq files belonging to four distinctive aquatic extreme environments (i.e., hot springs, hypersaline, deep sea, and polar habitats that included both Arctic and Antarctic samples) were collected from public databases and 45 different BioProjects (Fig. 2a and Supplementary Dataset S1). We included different environment types to assess general and environment-specific patterns and chose to use extreme habitats as they comprise some of the harshest and relatively understudied habitats on Earth, and therefore, are likely to contain uncharacterized organisms.

Fig. 1: Overview of the analysis pipeline.

A minimum of 250 samples was retrieved for each of the four different extreme environments—hot springs (red), hypersaline (dark green), deep sea (turquoise), and polar (blue). Sequence reads were quality filtered, assigned to a taxonomy, and clustered to OTUs. At each classification level, any unassigned, ambiguous, or uncultured OTUs were designated as Unknowns, or “microbial dark matter” (MDM). For each environment, at each classification level, the direct co-occurrence relationship between all OTUs was mathematically modeled as a network. Networks were created for each environment, across all taxonomic classification levels, including all OTUs (Original, orange), excluding MDM (Without Unknowns, light green), and excluding an equal number of random Knowns (Bootstrap, blue). Network centrality metrics (i.e., degree, betweenness, and closeness) were calculated for each node, compared, and visualized as boxplots between these network types. Hub scores were calculated for each node in the Original network and networks were recreated, resizing by hub score, where the largest size node indicates a top hub species.

Full size image

Fig. 2: Summary of environmental 16S rRNA gene data.

a Map of the sample sites used in this study. Circles symbolize sample locations and are color-coded by environment: hot springs (HS, red), hypersaline (HY, dark green), deep sea (DS, turquoise), and polar (PO, blue). b Summary of data used in this study. OTUs counts are provided at the genus level. c Proportion of “microbial dark matter” (MDM) OTUs for each environment labeled as unassigned (dark blue), uncultured (dark red), and ambiguous (yellow) after SILVA and UCLUST-based taxonomic assignment to OTU. d Venn diagram of shared OTUs in four extreme environments. Each pie chart depicts the proportion of unique OTUs that were Known (lighter shade) and Unknown (darker shade) for each environment, with the bottom-most pie chart showing combined data for all environments. e Prevalence curves indicate the number of unique OTUs consistently present at an increasing number of samples. Dotted lines signify the prevalence of MDM OTUs and solid lines signify the prevalence of Known OTUs.

Full size image

After quality filtering, reads were mapped to OTUs and annotated against the SILVA (v128) reference database [38] by an open-reference strategy, i.e., allowing the detection of Unknown OTUs [39]. Over two million Known and novel taxa were observed with the vast majority of the taxa annotated as Bacteria. Only the bacterial taxa or OTUs unclassified at the domain-level were targeted for downstream network analyses to demonstrate the feasibility of this approach across ecosystems. As the term “microbial dark matter” can have a broad meaning, here, we define Unknown taxa as uncultured, unassigned, or ambiguous by the reference database at each taxonomic classification level (e.g., phylum to genus). For each environment, networks reflecting across-samples co-occurrence relationships between all taxa, Known and Unknown, were constructed and referred to as the “Original” networks (Fig. 1). To assess the role of the Unknown taxa on network structure and properties, the Unknown nodes were removed from the ‘Original’ network and a new network, referred to as ‘Without Unknown’ was reconstructed. To ensure that changes in network properties were not caused just by the number of nodes, a third network, referred to as the “Bootstrap” network, was created where a random set of nodes of the same number as the Unknown OTUs was removed. The relevance of the Unknown taxa was assessed by comparing changes in degree, closeness, and betweenness scores between the three network types and by evaluating the frequency of Unknowns as top hubs within each of the “Original” environmental networks.

A similar and significant fraction of Unknown taxa populates diverse environments

To assess whether there were distinctive patterns or trends of Unknown taxa within the four targeted environments, data from each habitat type were mined from several geographical locations across the globe (Fig. 2a and Supplementary Dataset S1). Reads were collected from the online repositories National Center for Biotechnology Information Sequence Read Archive (NCBI SRA) and Joint Genome Institute Genomes Online Database (JGI Gold). For each environment, between 255 and 286 16S rRNA samples and between 51 and 57 million reads were included in the analysis, resulting in a total of 219,980,340 reads from 1086 samples (Fig. 2b).

After processing, quality filtering, and OTU assignment steps were completed, there were 2102,595 unique OTUs totaling 164,896,127 amplicon read counts derived from these samples (Fig. 2b). The relative proportions of Unknown OTUs, which were designated as unassigned, ambiguous, or uncultured by the SILVA database, were compared between environments. Results indicated that all environments showed similar relative contributions of these three Unknown types, with unassigned and uncultured OTUs making up the majority of the Unknown component composition (Fig. 2c). Regardless of environment type, within each of the four habitats, between 25 and 38% of unique OTUs were cataloged as Unknown (Fig. 2b, d). Samples collected from polar habitats were significantly enriched (Fisher’s Exact Test p value < 0.05) in Unknown OTUs despite having the highest total read counts. The higher proportion of Unknown OTUs in the polar samples compared to the other habitats likely reflects the less-well-characterized biological diversity of these Arctic and Antarctic ecosystems.

Next, the proportion of shared Known and Unknown OTUs between environments was evaluated. Most OTUs, regardless of assigned or unassigned taxonomic status, were environment-specific, with only 11,318 out of the 2,102,595 OTUs present in all four of the environments (Fig. 2d). The majority of shared OTUs were observed between the hypersaline and polar environments and between hypersaline and hot springs environments, possibly reflecting the widespread distribution of hypersaline habitats across diverse thermal zones. Unsurprisingly, polar and hot springs environmental samples shared the least number of OTUs (Fig. 2d). Given this low common OTU pool across environments, network analyses were applied to each environment independently.

Last, the prevalence (i.e., the percentage of samples with nonzero counts in which an OTU was detected) of Known and Unknown OTUs was evaluated at the genus level to assess the consistency of OTU detection within each environment. The OTU matrix was sparse, with the majority of taxa observed in ≤50 (~20%) samples (Fig. 2e). However, prevalence curves were generally very similar for Known and Unknown OTUs in all four environments, indicating that Unknown OTUs are not necessarily rarer than already characterized species (Fig. 2e). Moreover, we confirmed that Unknown OTUs, like Known OTUs, were generally present and consistent across multiple studies within the same environment and did not tend to concentrate in any particular project (Supplementary Figs. S1–4). Consequently, these results indicated that a network analysis of these data would be a reflection of the co-occurrence structure of the community and not of potential compositional bias.

Network analysis of OTU abundance at different taxonomic levels reveals the connectivity of unknown microbes

Having demonstrated that the Unknown taxa comprise a substantial proportion of unique OTUs and have comparable abundances to Known taxa within a community, network metrics were used to effectively compare the ecological relevance of both Known and Unknown taxa in subsequent networks. Microbial association networks were constructed that featured only significant co-occurrence correlation relationships for OTUs with a notable prevalence in each environment, meaning that any OTUs that were not detected in a sufficient number of samples were removed. To select a suitable prevalence threshold, the proportion of Known and Unknown taxa across a range of sample percentages was evaluated (Supplementary Fig. S5). Across all taxonomic levels, the Known and Unknown taxa of hot springs and polar habitats were more prevalent than those of hypersaline and deep sea communities; therefore, a slightly more stringent prevalence threshold (40%) was chosen for the former, and a lower threshold value (30%) chosen for the latter environments. These thresholds resulted in the retention of a similar fraction of data from the initial OTU count (Table 1), with 102–297 nodes present per environment, making the networks both suitably large and comparable.

Table 1 Breakdown of node and edge attributes across extreme environmental networks.

Full size table

The SpiecEasi Meinshausen–Buhlmann (MB) neighborhood algorithm was then used to construct networks (see “Methods”) that contained at least 100 nodes (i.e., OTUs) per environment and had edge-to-node ratios that varied from 1.9 and 2.9 (Table 1). Although most OTUs that passed the prevalence criteria became elements of the networks, no relationship between the initial data size (i.e., number of samples and taxa) and the interconnectedness (i.e., nodes/edges ratio) of the resulting network was observed (Table 1). The two environments with the highest number of initial OTUs (i.e., hypersaline and deep sea) had the lowest number of prevalent members, 193 and 102, respectively, and very different edge/node ratios, 2.7 and 1.9, respectively, indicating that high prevalence does not necessarily correlate with high co-occurrence. Similarly, the polar and hot springs networks retained a high number of prevalent OTUs but differed in edge number, yet again, indicating that the observed network structures were the result of the intrinsic properties of each environment and were not dependent on the sampling procedure. Interestingly, when evaluating the proportion of Unknown edges and Unknown-Unknown connections at the genus level, similar patterns were observed across environments. Between 45 and 62% of all connected nodes were Unknown OTUs and a higher proportion of Unknown-Unknown versus Known-Known links was present at the genus level (Table 1), for all environments but hot springs, where a higher proportion of Known-Unknown and Known-Known links was observed.

The results of this global analysis of network construction and composition demonstrate that although the general community connectivity might be environment-specific, the relative contribution of Known and Unknown taxa to these networks is similar. Once again, network properties were not a direct outcome of sampling biases, but more likely, reflect the biology of their respective ecosystems.

Unknown taxa play important roles in interconnectedness and connectivity of extreme environmental microbial networks

Next, the position and neighborhood of Unknown nodes were examined. At the phylum level, Unknown taxa were present in the hot spring, hypersaline, and polar environments, but were not found in the deep sea network (Supplementary Fig. S6). The class level was the first taxonomic classification rank in which Unknown taxa were present in all environmental networks. To accurately assess the role of Unknowns, we evaluated class-level networks and observed that the hypersaline and polar Unknown OTUs created distinctive clusters, whereas hot springs and deep sea Unknown OTUs were intermixed with Known taxa (Fig. 3a). Hypersaline and polar Unknowns consistently appeared to be isolated and peripherally located compared to the centrally positioned hot springs and deep sea Unknown nodes across almost all classification ranks (Supplementary Fig. S6). Consequently, these results suggest that the clustering pattern is unrelated to a higher abundance of Unknowns and is more environment dependent. For example, the targeted hot spring environments had the highest number of Unknown OTUs yet showed the most dispersed connections between the these taxa (Fig. 3a). Thus, the inclusion of the Unknown taxa in our environmental networks models was, as anticipated, not solely the consequence of their level of prevalence, but rather a reflection of a particular abundance co-variation pattern.

Fig. 3: Analysis of environmental network taxa interconnectedness.

a Microbial networks at class taxonomic classification level. Nodes are colored by class assignment, with gray nodes representing Unknown taxa at the class level. b Bar graphs of the co-occurrence relationships (i.e., edges) of Unknown OTUs with other taxa at the class level within each environmental network. Y axis labels and colors signify the different classes with which Unknowns were found to co-occur. Unknown-Unknown relationships are represented in gray.

Full size image

For all four environments, Unknown OTUs had more frequently shared edges among them than with classified taxa (Fig. 3b). Unknown OTUs were found to frequently co-occur with each other within each environmental network, although the frequency of within-class interactions for Unknowns at the class level was found to be statistically no greater than all other within-class interactions for each environment (Supplementary Fig. S7). In fact, the pattern of a high frequency of shared edges among members of the same class held true for known classes as well (Supplementary Fig. S8). In accordance with other studies, these results demonstrate that OTUs of the same taxonomic classification most frequently co-occur with each other [40, 41]. Furthermore, the high frequency of shared edges between Unknown classes suggests that Unknown OTUs might be taxonomically related.

To ensure that the observations found were reproducible, robust, and not biased by earlier steps of the analysis, the diversity and position of Unknown taxa in the networks were examined for several parameters. Although the number of Unknown nodes changed at each level (Supplementary Fig. S6), the environment-specific network patterns observed at the class level (Fig. 3a) were retained at other taxonomic levels. In addition, to determine whether the topology of the network was a direct consequence of our correlation metric of choice or the prevalence threshold, three other network construction approaches were used: SparCC [11], CClasso [42], and Pearson correlation. Network analyses were performed across a range of prevalence thresholds (15–35%, at 5% increments). Again, regardless of the network construction approach or sample percentage applied, network shape and Unknown OTU position remained consistent and each environment exhibited a distinctive pattern of Unknown taxa inclusion. For example, Unknown nodes continued to occupy peripheral positions for hypersaline and polar networks, whereas nodes in hot springs and deep sea environments were more centrally located when applying different correlation metrics (Supplementary Fig. S9). Although networks appeared “noisy” at more lenient prevalence thresholds (15–20%; Supplementary Figs. S10–13), the networks and positioning of Unknowns at higher percent thresholds were similar in appearance to the “Original” networks for all environments. Based on these results, we found that our general analysis strategy was robust across parameter choices and, therefore, these networks captured critical features of the relationships among taxa within each distinctive environment.

Microbial dark matter acts as unifiers and frequent hubs within extreme environmental networks

Although these results suggest that Unknown taxa were highly interconnected, these observations did not reveal how the presence of Unknowns affected the overall community structure. To more fully understand this role, we analyzed how network properties changed in the presence and absence of Unknown OTU nodes. We evaluated changes in degree, betweenness, and closeness, as different network metrics reveal different aspects of the relevance of nodes within their networks. This approach has the potential to ascertain whether certain Unknown OTUs were more centrally positioned (e.g., due to high closeness scores), more essential for joining other taxa (e.g., high betweenness), or simply more prevalent and likely to co-occur with others (e.g., high degree). To control for the effect of node removal and distinguish effects of Unknown taxa from network size, networks were generated that excluded several randomly picked Known nodes equal to the number of Unknown OTUs. This process was repeated 100 times to create a distribution of “Null” or “Bootstrap” networks for statistical comparisons. Differences in network parameters between networks without Unknown OTUs and the “Original” or “Bootstrap” networks were determined by the Wilcoxon test and p values were adjusted using the Holm method [43]. Strikingly, removal of the Unknown taxa caused a statistically significant impact on all measured network metrics in all four studied environments (Table 2, Fig. 4, and Supplementary Figs. S14–24). In the polar environments, for example, removal of Unknown OTUs caused a significant decrease in degree (p value < 1E − 5) and betweenness (p value < 1E − 4) scores and a significant increase in closeness (p value < 2.22E − 16) scores across multiple taxonomic levels (Fig. 4), suggesting that the Unknown taxa in the polar environments are critical to the community structure and preserve local connections. Closeness score was the only parameter that behaved differently across environments (Table 2), increasing upon exclusion of the Unknowns for hypersaline and polar networks but decreasing at family level for the deep sea environment, due to the different topology of Unknown OTUs within their environmental networks. For example, the removal of centrally located Unknowns at the family level in the deep sea network increased node distance and network fragmentation. In contrast, in the hypersaline and polar networks, the removal of the peripherally located Unknowns resulted in a more connected network structure and appearance. The impact of Unknown node removal on degree, betweenness, and closeness persisted across taxonomical levels for the respective environments (Fig. 4 and Supplementary Figs. S14–16) and followed a similar pattern across tested correlation metrics (Supplementary Figs. S17–20) and prevalence thresholds (Supplementary Figs. S21–24).

Table 2 Impact on network metrics of the removal of Unknown OTUs from the network.

Full size table

Fig. 4: Impact of Unknown taxa on polar network metrics at different taxonomic levels.

Boxplots of degree, betweenness, and closeness centrality values of nodes present in different network types at different taxonomic levels. Wilcoxon pairwise comparisons were used to assess significance between the three network types (Original-Without Unknown, Without Unknown-Bootstrap, Original-Bootstrap) for each taxonomic level. For each comparison, p values after Holm adjustment are shown.

Full size image

In summary, despite these minor differences in closeness score changes, these results suggest that Unknown OTUs had significant and comparable relevance for community structure in all four environments. The exclusion of Unknown nodes from their environmental networks led to a drastic change in network structure and interconnectedness, illustrating that the Unknown taxa are critical for establishing co-occurrence relationships and for maintaining overall network shape within each distinct ecosystem.

Unknown taxa act as important hubs within extreme environment networks

After studying the overall relevance of the “microbial dark matter” in extreme environmental networks, we then concentrated on finding the most prominent Unknown components of each community. To do so, the hub score of each node was calculated to determine which taxa caused the most fragmentation or loss of network structure when removed, and therefore, may be a critical component for the microbial community structure of the target environment (Supplementary Dataset S2).

Hub scores were calculated at the genus level for each node and the environmental networks were recreated, resizing the nodes as a function of the scores (Fig. 5). The overwhelming majority of the top hub scores, as symbolized by the largest-sized nodes in each network, were unknown for the hypersaline and polar environments, even at higher taxonomic ranks (Supplementary Fig. S25), different prevalence thresholds (Supplementary Figs. S26–29), or correlation metrics (Supplementary Fig. S30). Moreover, at least four of the nodes within the top 20 hub scores were unknown at the genus level for all environments (Supplementary Dataset S2), providing further evidence that Unknown taxa are key components of microbial community structure in all four extreme environments.

Fig. 5: Hub analysis of extreme environmental networks.

Environmental networks at the genus level with nodes sized as a function of hub score. Nodes are colored by genus classification with ambiguous, unassigned, or uncultured taxa depicted in dark gray.

Full size image

Student’s t tests were performed to evaluate the statistical significance of the differences between hub score values of Known and Unknown genera. Unknown genera in the hypersaline (p value = 2.01E − 13) and polar (p value = 2.41E − 9) habitats had significantly higher hub scores than Known genera, whereas in the deep sea environments the Unknown nodes were marginally significant (p value = 1.77E − 3). No significant differences were observed between the Known and Unknown nodes (p value = 0.6) in the studied hot springs environment. Based on these analyses, we concluded that while Unknowns OTUs occupied and dominated key positions within the hypersaline and polar networks, all environments harbored relevant Unknown hubs within their microbial communities.

Network analysis of Unknown hubs as a tool to prioritize taxa for the search of novel genes with targeted functions

The high frequency of currently unknown microbes as top hubs within their habitats implies that these organisms have high prevalence and co-occurrence within their microbial communities, indicating that they have successfully adapted to survive in these harsh ecosystems. Therefore, we hypothesize that our network approach, particularly using the hub score to prioritize Unknowns, could serve as a critical foundation to study novel pathways and gene functions present in these yet-uncharacterized microbes. This hypothesis, however, poses two problems. First, this network-based approach relies on 16S rRNA sequencing data, which holds no additional functional information, and second, reference genomes for these uncultured organisms are lacking. To circumvent these challenges, we used the 16S rRNA sequences of the top Unknown hubs to probe large metagenomics databases, where extensive and diverse genome information is available on these microbial habitats, thereby facilitating the recovery of gene content associated with these unknown organisms. In parallel, we searched the literature for gene functions described to be associated with adaptation to our four studied extreme environments, resulting in a list of 86 “adaptation”-related terms that mostly contained metabolic and stress response processes (Supplementary Table S1). Once high-confidence matching scaffolds were found, putative novel genes with the targeted functions could be computationally identified by searching for genes of unknown function in the neighborhood of genes previously annotated with the list of targeted terms.

To explore this idea, a blastn search was performed on the 16S rRNA gene sequences of the top five Unknown and Known hubs at the genus level for each environment against 100 draft and permanently assembled metagenomes in the IMG/M database [44]. Hits with high similarity (i.e., >95%) and equivalent partial taxonomic classification in SILVA to the query hub 16S rRNA gene sequences were selected. To make sure that this search was not misled by incorrect 16S rRNA gene assembly in the metagenome scaffolds, we selected 70 such 16S rRNA genes obtained as hits of Known OTUs and performed a blastn search of these hits against the NCBI Microbial Genomes database. We obtained that the great majority (84%) of these 16S rRNA genes in scaffolds had >98% identity and between 97 and 100% sequence coverage with their corresponding genome sequence, while remaining 16S rRNA gene sequences belonged to species poorly represented in the NCBI database. Therefore, we concluded that 1S rRNA gene sequences in metagenome scaffolds could confidently be used in our analysis pipeline. Identified scaffolds were then filtered for a high (≥ 50) gene content and searched for operons containing gene descriptions matching any of the terms in our “adaptation” list, as well as genes labeled as “hypothetical proteins” or of “unknown function.” Hierarchical clustering of gene distance between pairs of genes was used to identify putative operons, with all results validated by checking gene neighborhood information for each scaffold using the IMG/M genome browsing and annotation platform.

Overall, metagenomic screening with the 16S rRNA gene sequences of the top Known and Unknown hubs returned numerous high-match, gene-rich scaffolds that varied by environment (Supplementary Table S2). The variation between habitats likely reflected the compositional differences of the metagenome database. Nevertheless, similar numbers of genes labeled as “adaptation,” “hypothetical genes,” and similar “putative adaptation-related” operons (see “Methods” for assignment criteria) were identified for all environments, regardless of hub type, with scaffolds consistently containing an average of three to four potential adaptation-related genes to extreme environments. Interestingly, the average number of hypothetical genes found per scaffold differed most between environments. Deep sea hubs consistently had the lowest average number of hypothetical protein genes per scaffold (12.8 for Unknown, 20.6 for Known), whereas polar, hypersaline, and hot springs scaffolds had similar, larger numbers of hypothetical genes (up to 35 for Known hypersaline hubs). These results suggest that a low total scaffold count does not prohibit detection of a high, equally abundant, number of hypothetical genes across environments or hub types. Importantly, across all habitats, an equivalent mean number of putative operons, where hypothetical and environment-relevant genes were found in close genomic proximity, were detected. Altogether, we recovered 6,734 un-annotated genes present in 535 putative operon structures potentially involved in the metabolic and stress responses to extreme environmental conditions (Supplementary Table S2).

An example of the outcome of this approach for the targeted hot springs habitat is depicted in Fig. 6. Blast results showed a 97% sequence similarity match to the fasta sequence of Unknown hub AB176701.1.1510 (Supplementary Dataset S2) within the metagenomic scaffold Ga007390_1000203, which was originally generated from Dewer Creek hot spring sediment in British Columbia. The scaffold contained 98 genes in total, of which 19 were annotated as “Hypothetical Proteins” and distributed across seven operons (Fig. 6a). One of the identified operons contained three putative genes with predicted proteins of unknown function (i.e., two were labeled as hypothetical and one was labeled with unknown function). Further, this operon contained a gene annotated as Fe–S oxidoreductase and two well-known oxidative stress genes encoding DNA-binding ferritin-like protein (DPS) and rubrerythrin (RBR), suggesting possible roles in oxidative stress-related functions. An exact match for all six of the protein sequences in the same operonic structure was found in the genome of the bacterium Blastocatellia by searching the NCBI nonredundant database (Fig. 6b), which was recently sequenced as part of a hydrothermal vent metagenome project [45]. Further bioinformatic analysis of the poorly characterized open reading frames in this operon led to the identification of Ga0073930_100020343 as a domain of unknown function (DUF) DUF3501 family member and Ga0073930_100020344 as part of the COG0247 family (Fe–S oxidoreductase). The link with oxidative stress was reinforced by the clustering of these same four genes with the hydrogen peroxide-inducible genes activator OxyR in Sideroxydans lithotrophicus ES-1 (Fig. 6c). The two other hypothetical genes, Ga0073930_100020341 and Ga0073930_100020340, were identified as a putative SH3-domain-containing protein and a putative lysine synthase protein, respectively. Notably, the DUF3501-COG0247-RBR functional association had already been reported in a previous study where the authors proposed that the DUF3501 and COG0247 protein families enabled the functional adaptation of RBR from a thermophilic/anaerobic to a mesophilic/aerobic environment [46]. The identification of DUF3501 by our OTU prioritization analysis of the hot spring habitat supports this prior work suggesting the gene may be part of the microbial adaptation strategy to high temperatures. These results illustrate how the hub score prioritization can be successfully combined with computer-intensive mining of publicly available metagenomic data to identify novel and potentially ecologically relevant genes and gene products.

Fig. 6: Metagenomics analysis of top Unknown OTU AB176701.1.1510.

a Overview of operon annotation in scaffold hit Ga0073930_1000203 obtained from the JGI metagenomics browsing platform, which matched with 97% similarity to Unknown AB176701.1.1510. Genes in light yellow color represent hypothetical genes or genes of unknown function. b Zoom-in of the selected operon with adaptation-related gene annotations. Bioinformatic function predictions for genes of unknown function are indicated under their gene boxes. c DUF3501-RBR containing operon for Sideroxydans lithotrophicus, including the OxyR gene. RBR Rubrerythrin gene, COG0247 Fe-SA Fe–S oxidoreductase gene, DUF3501 domain of unknown function 3501, DPS DNA-binding ferritin-like protein, SH3 SH3-domain-containing protein, LysW lysine synthesis protein W. OxyR: hydrogen peroxide-inducible genes activator.

Full size image

Source: Ecology - nature.com

A network approach to elucidate and prioritize microbial dark matter in microbial communities

Overall strategy to detect the relevance of Unknown taxa

A similar and significant fraction of Unknown taxa populates diverse environments

Network analysis of OTU abundance at different taxonomic levels reveals the connectivity of unknown microbes

Unknown taxa play important roles in interconnectedness and connectivity of extreme environmental microbial networks

Microbial dark matter acts as unifiers and frequent hubs within extreme environmental networks

Unknown taxa act as important hubs within extreme environment networks

Network analysis of Unknown hubs as a tool to prioritize taxa for the search of novel genes with targeted functions

Half of resources in threatened species conservation plans are allocated to research and monitoring

Population genetic structure of the great star coral, Montastraea cavernosa, across the Cuban archipelago with comparisons between microsatellite and SNP markers

ITALIAN LANGUAGE

ENGLISH LANGUAGE