A genomic view of the microbiome of coral reef demosponges
Six sponge species, R. odorabile, C. matthewsi, C. foliascens, S. flabelliformis, I. ramosa and C. orientalis (a bioeroding sponge), were selected for metagenomic sequencing (7 ± 0.5 Gbp) as these species represent dominant habitat-forming taxa on tropical and temperate Australian reefs and exhibit high intraspecies similarity in their microbiomes. In addition, previously published microbial MAGs from I. ramosa and Aplysina aerophoba were analysed [8, 12], including 62 additional unpublished MAGs from A. aerophoba. The recovered MAGs, averaging 86 ± 12% completeness and 2 ± 2% contamination, made up on average 72 ± 21% of the relative abundance of their respective communities (by read mapping) and spanned the vast majority of microbial lineages typically seen in marine sponges [45] (Fig. S1 and Table S1), including the bacterial phyla Proteobacteria (331 MAGs), Chloroflexota (242), Actinobacteriota (155), Acidobacteriota (97), Gemmatimonadota (60), Latescibacterota (44; including lineages Anck6, PAUC34 and SAUL), Cyanobacteria (43), Bacteroidota (38), Poribacteria (35), Dadabacteria (22; including SBR1093), Nitrospirota (22), Planctomycetota (15), UBP10 (14), Bdellovibrionota (13), Patescibacteria (9; includes Candidate Phylum Radiation), Spirochaetota (8), Nitrospinota (7), Myxococcota (4), Entotheonella (2) and the archaeal class Nitrososphaeria (21; phylum Crenarchaeota), hereafter referred to by their historical name “Thaumarchaeota” for name recognition. Mapping of the metagenomic reads to the recovered MAGs showed that the communities had high intraspecies similarity across replicates, consistent with previous 16S rRNA gene-based analyses (Fig. S1). In general, taxa present in A. aerophoba, C. foliascens, C. orientalis and S. flabelliformis appeared unique to those sponge species, with only one dominant lineage present in C. orientalis (order Parvibaculales). In contrast, several Actinobacteriota, Acidobacteriota and Cyanobacteria populations were shared across C. matthewsi, R. odorabile and I. ramosa. Further, members of the Thaumarchaeota were detected in all sponge species and were particularly abundant in S. flabelliformis at 12 ± 4% relative abundance (Fig. S1). Addition of these sponge MAGs to genome trees comprising all publicly available sponge symbionts (N = 1188 MAGs) resulted in a phylogenetic gain of 44 and 75% for Bacteria and Archaea, respectively, reflecting substantial novel genomic diversity (Fig. 1).
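To make the phylogenetic gain figures concrete, the following is a minimal sketch of one common way such a value can be computed: the share of total branch length in a rooted tree that is contributed only by newly added genomes. The definition, the tuple-based tree representation, the toy tree and the function names are assumptions made for illustration; the tool and exact formula used in the study are not stated in this section.

```python
def branch_length(node, keep):
    """Sum branch lengths on root-to-leaf paths that reach a leaf in `keep`.

    A node is (name, length, children); a leaf has an empty children list.
    Returns (length_sum, has_kept_leaf).
    """
    name, length, children = node
    if not children:
        return (length, True) if name in keep else (0.0, False)
    total, any_kept = 0.0, False
    for child in children:
        sub, kept = branch_length(child, keep)
        total += sub
        any_kept = any_kept or kept
    # An internal branch only counts if it leads to at least one kept leaf
    return (total + length, True) if any_kept else (0.0, False)


def phylogenetic_gain(tree, all_leaves, new_leaves):
    """Percentage of total branch length that exists only because of `new_leaves`."""
    full, _ = branch_length(tree, set(all_leaves))
    without_new, _ = branch_length(tree, set(all_leaves) - set(new_leaves))
    return 100.0 * (full - without_new) / full


# Hypothetical toy tree: two reference genomes and three newly added MAGs
toy_tree = ("root", 0.0, [
    ("X", 1.0, [("ref1", 1.0, []), ("new1", 2.0, [])]),
    ("ref2", 2.0, []),
    ("Y", 1.0, [("new2", 1.0, []), ("new3", 1.0, [])]),
])
leaves = ["ref1", "ref2", "new1", "new2", "new3"]
gain = phylogenetic_gain(toy_tree, leaves, ["new1", "new2", "new3"])
```

In this toy case the new MAGs contribute a whole clade (Y) plus one long terminal branch, so they account for more than half of the total branch length.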
Comparative genomic analysis of the sponge-derived MAGs provided unique insights into the distribution of metabolic pathways across sponge symbiont taxa. For example, microbial oxidation of ammonia benefits the sponge host by preventing ammonia from accumulating to toxic levels [46], a process thought to be mediated by both symbiotic Bacteria and Archaea (i.e. Thaumarchaeota) [33]. Prior identification of ammonia oxidisers has been based on functional inference from phylogeny (16S rRNA gene amplicon surveys) [47] or homology to specific Pfams (metagenomes) [33]. However, the CuMMO gene family is diverse, encompassing functionally distinct relatives that include amoA, particulate methane monooxygenases and hydrocarbon monooxygenases that cannot be distinguished by homology alone [35]. We used GraftM [32] to recover CuMMO genes from the sponge MAGs and their metagenomic assemblies, as well as previously sequenced metagenomic assemblies from six additional sponge microbiomes where bacterial amoA gene sequences had been identified [33]. Phylogenetic analysis of the recovered CuMMO genes showed that all archaeal homologues came from Thaumarchaeota and fell within the archaeal amoA clade. In contrast, bacterial CuMMO sequences were identified exclusively in MAGs from the phylum UBP10 (formerly unclassified Deltaproteobacteria) and from an unknown taxonomic group in the previous metagenomic assemblies [33]. All recovered bacterial and taxonomically unidentified CuMMO placed within the Deltaproteobacteria/Actinobacteria hmo clade, indicating these genes are specific for hydrocarbons rather than ammonia (Fig. S2). The finding that Thaumarchaeota are the only microbes within any of the surveyed sponge species capable of oxidising ammonia, and their ubiquity across sponges, suggests they are a keystone species for this process.
To further investigate the distribution of functions within the sponge microbiome, a set of highly complete (>85%) sponge symbiont MAGs was grouped by principal components analysis based on their KEGG and Pfam annotations, as well as orthologous clusters that reflected all gene content. A similar analysis conducted on 37 MAGs from the sponge Aplysina aerophoba suggested the presence of functional guilds, with MAGs from disparate microbial phyla carrying out similar metabolic processes (e.g. carnitine catabolism) [12]. Here, we find that MAGs clustered predominantly by microbial taxonomy (phylum) rather than function in all three analyses (Fig. S3). While functional guilds could not be identified based on analysis of total genome content, this does not preclude the existence of such guilds based on more specific metabolic pathways.
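The grouping step described above can be sketched in a few lines. This is a minimal illustration, assuming a count matrix in which rows are MAGs and columns are annotations (for example KEGG orthologues or Pfams); the toy matrix and the function name `first_principal_component` are hypothetical, and the software and preprocessing used in the study are not specified in this section. Power iteration is used here only to keep the example dependency-free.

```python
import math


def first_principal_component(rows, n_iter=200):
    """Return (scores, explained_fraction) for PC1 of a MAG x annotation matrix."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    x = [[r[j] - means[j] for j in range(m)] for r in rows]  # centre each column
    # Power iteration on X^T X to find the leading loading vector
    v = [1.0 / math.sqrt(m)] * m
    for _ in range(n_iter):
        xv = [sum(xi[j] * v[j] for j in range(m)) for xi in x]          # X v
        w = [sum(x[i][j] * xv[i] for i in range(n)) for j in range(m)]  # X^T (X v)
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0:
            break
        v = [c / norm for c in w]
    scores = [sum(xi[j] * v[j] for j in range(m)) for xi in x]
    total = sum(c * c for xi in x for c in xi)
    explained = sum(s * s for s in scores) / total if total else 0.0
    return scores, explained


# Hypothetical toy data: 4 MAGs x 5 annotation counts.
# MAGs 0-1 share one profile and MAGs 2-3 share another.
toy = [
    [3, 0, 1, 0, 2],
    [3, 0, 1, 0, 1],
    [0, 4, 0, 2, 0],
    [0, 5, 0, 2, 0],
]
scores, explained = first_principal_component(toy)
```

MAGs with similar annotation profiles receive similar PC1 scores, so colouring the scores by phylum or by candidate guild shows which factor drives the clustering.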
To identify pathways enriched within the sponge microbiome, sponge-associated MAGs with >85% completeness (N = 798) were compared with a set of coral reef and coastal seawater MAGs (N = 86), 31 derived from published datasets [31] and 55 from this study (Table S1). Seawater MAGs with >85% genome completeness (93 ± 4% completeness and 2 ± 2% contamination; Table S1) spanned the bacterial phyla Proteobacteria (48 MAGs), Bacteroidota (13), Planctomycetota (5), Myxococcota (5), Gemmatimonadota (3), Marinisomatota (3), Actinobacteriota (3), Verrucomicrobiota (2), Cyanobacteriota (2), Bdellovibrionota (1) and the archaeal phylum Nanoarchaeota (1). Comparative analysis revealed that sponge symbionts were enriched in metabolic pathways for carbohydrate metabolism, defence against infection by mobile genetic elements (MGEs), amino acid synthesis, eukaryote-like repeat proteins (ELRs) and cell–cell attachment (Tables S2–S4).
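One way such an enrichment comparison could be set up is a presence/absence test per gene family between the two MAG sets. The sketch below uses a one-sided Fisher's exact test as an assumed stand-in: the statistical procedure used in the study is not stated in this section, and the gene counts in the example are hypothetical. Only the group sizes (798 sponge-associated and 86 seawater MAGs) come from the text.

```python
from math import comb


def fisher_enrichment_p(sponge_with, sponge_total, sea_with, sea_total):
    """One-sided Fisher's exact p-value for over-representation in sponge MAGs."""
    with_total = sponge_with + sea_with
    n = sponge_total + sea_total
    denom = comb(n, with_total)
    upper = min(sponge_total, with_total)
    # Sum hypergeometric probabilities of tables with at least as many
    # sponge MAGs carrying the gene family as observed
    numer = sum(
        comb(sponge_total, k) * comb(sea_total, with_total - k)
        for k in range(sponge_with, upper + 1)
    )
    return numer / denom


# Hypothetical counts: family present in 600/798 sponge MAGs and 10/86 seawater MAGs
p_enriched = fisher_enrichment_p(600, 798, 10, 86)
# Hypothetical counts with equal prevalence (50%) in both groups
p_null = fisher_enrichment_p(399, 798, 43, 86)
```

With many gene families tested at once, the resulting p-values would need a multiple-testing correction before any family is called enriched.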
Genes belonging to glycosyl hydrolase (GH) and carbohydrate esterase (CE) families (Table S2) acting on starch (GH77), arabinose (CAZy families GH127 and GH51), fucose (GH95 and GH29) and xylan polymers (CE7 and CE15) were enriched in sponge-associated lineages, likely reflecting the host's critical role in catabolising dissolved organic matter (DOM) present in reef seawater (Fig. 2). Microbial GHs from the GH77 family target starch, the main sugar storage compound in marine algae [48], whereas GHs from families 51 and 127 are known to act on plant arabinosaccharides, such as the hydroxyproline-linked arabinosaccharides found in algal extensin glycoproteins [49, 50]. GH127 enzymes are also required for microbial degradation of carrageenan, a complex heteropolysaccharide produced by red algae [51]. Members of the fucosidase GH95 and GH29 enzyme families are known to degrade fucoidan, a complex fucosaccharide prominent in brown algae [50, 52]. Notably, arabino- and fucopolysaccharides also make up a significant proportion of coral mucus, a major component of DOM in coral reefs that sponges have been shown to utilise [53, 54]. Supporting this observation, isotopic investigation of the fate of coral mucus and algal polysaccharides in sponges showed that the microbiome participates in metabolism of these compounds, particularly in sponges with high microbial abundance and diversity [4, 5]. Enzymes from the CE families 15 and 7 have been primarily characterised in terrestrial plants, where they act as glucuronyl esterases and acetyl-xylan esterases, degrading lignocellulose and removing acetyl groups from hemicellulose (e.g. xylans) [55]. Characterisation of CE15 and CE7 from marine microbes is rare, though activity on xylans, which are a structural component of marine algae, has previously been demonstrated [55,56,57].
Fig. 2: Phylogenetic tree showing the distribution of glycosyl hydrolases and esterases across MAGs with >85% completeness (N = 884).
Values represent the copy number of each gene per MAG. Internal branches of the tree are coloured by phylum, while the outer strip is coloured by class. Both are listed clockwise in the order in which they appear. Seawater MAGs are denoted by grey labels with red text.
GHs acting on sialic acids (GH33) and glycosaminoglycans (GH88) were also enriched in the sponge-associated MAGs and may act on compounds found within sponge tissue [13] (Fig. 2). In contrast, no genes for the degradation of collagen (collagenases), one of the main structural components of the sponge skeleton, were identified. Sialic acid-linked residues are found in the sponge mesohyl [58], and although the impact of their cleavage on the host is unknown, an analogy can be made to other symbioses. For example, sialidases are common in the commensal bacteria present in the human gut, where they are used to cleave and metabolise the sialic acid-containing mucins lining the gut wall [59]. Increased sialidase activity is associated with gut dysbiosis and inflammation [60], and careful control of sialidase-containing commensals is therefore necessary to maintain gut homoeostasis [59]. As glycosaminoglycans are also part of sponge tissue [13, 61], the same may apply to microorganisms encoding GH88 family enzymes. However, these genes are also implicated in the degradation of external sugar compounds, such as ulvans, a major sugar storage compound found in green algae that can make up to 30% of their dry weight [62]. Thus, the ecological role of GH88 family enzymes within the sponge microbiome requires further investigation.
Enrichment of GHs and CEs was largely restricted to the Poribacteria, Latescibacterota (class UBA2968), Spirochaetota, Chloroflexota (classes UBA2235 and Anaerolineae, but not Dehalococcoidia) and Acidobacteriota (class Acidobacteriae). These findings corroborate previous targeted genomic characterisations of the Chloroflexota and Poribacteria [13, 14] but show that they are part of a larger set of polysaccharide-degrading lineages. Identification of distantly related microbial taxa across several sponge lineages (Figs. 1 and 2) that encode similar pathways for polysaccharide degradation, and therefore occupy a similar ecological niche, supports the existence of functional guilds within the sponge microbiome when viewed at the level of individual pathways. Given the fundamental role of marine sponges in recycling coral reef DOM, studies targeting these specific guilds are needed to quantify their contribution to reef DOM transformation.
Because sponges filter and retain biomass from an extensive range of reef taxa (eukaryotic algae, bacteria, archaea, etc.), they are exposed to a greatly expanded variety of MGEs from these organisms, including viruses, transposable elements and plasmids [33, 63]. For this reason, sponge-associated microorganisms likely require a diverse toolbox of molecular mechanisms for resisting infection. Both restriction-modification (RM) and CRISPR systems are capable of recognising and cleaving MGEs as part of the bacterial immune repertoire. RM systems are part of the innate immune system of bacteria and archaea and are encoded by a single protein (Type II) or multiple proteins (Types I, III and IV) that recognise and cleave foreign DNA based on a defined target sequence. In contrast, CRISPR systems are part of the adaptive immune system of some bacteria and archaea and encode a target sequence derived from the genome of a previous infective agent that is used by a CRISPR-associated protein (CAS) to identify and cleave foreign DNA. RM (Fig. S4) and CAS (Fig. S5) genes were both enriched (Table S3) in the sponge-associated MAGs and relatively evenly distributed across taxa, with the exception of the Planctomycetota and Verrucomicrobiota, where they were largely absent. As these MAGs average 93 ± 5% completeness, this result is unlikely to be due to genome incompleteness. This finding contrasts with comparative investigations of Planctomycetota genomes from other environments [64], and additional research is required to ascertain the mechanisms used by sponge-associated Planctomycetota and Verrucomicrobiota to avoid infection. Although Type III RM genes were enriched in sponge MAGs, they were also present in all seawater MAGs. In contrast, Types I and II RM genes were present almost exclusively in the sponge-associated MAGs.
In conjunction with an enrichment in CRISPR systems, this expanded repertoire of defence systems likely reflects the increased burden from MGEs associated with the host's role in filtering and concentrating diverse sources of reef biomass. Supporting this hypothesis, metagenomic surveys of sponge-associated viruses revealed a more diverse viral population than could be recovered from the surrounding seawater [63]. Further, we found that genes encoding toxin-antitoxin systems, which are present on MGEs such as plasmids, were also enriched in sponge-associated MAGs. These observations suggest that RM and CRISPR systems are important features of microbe-sponge symbiosis, allowing the symbionts to colonise and persist within their host by avoiding viral infection or being overtaken by MGEs.
Pathways for the synthesis of amino acids were also enriched in the sponge microbiome. The inability of animals to produce several essential amino acids has been proposed as a primary reason that they harbour microbial symbionts [65,66,67,68], and it has long been thought that sponges acquire at least some of their essential amino acids from their microbiome [69, 70]. Further, gene-centric characterisation of the Xestospongia muta and R. odorabile microbiomes revealed pathways to synthesise and transport essential amino acids [33, 70]. However, these same amino acid pathways are also used catabolically by the microorganisms, and transporters could simply be importing amino acids into the microbial cell. Further, as sponges are almost constantly filter feeding, essential amino acids could be acquired through consumption of microorganisms present in seawater. Comparison of sponge MAGs with those from seawater revealed enrichment of specific pathways for the synthesis of lysine, arginine, histidine, threonine, valine and isoleucine (Table S4). However, visualisation of the distribution of these genes revealed that almost all MAGs in both sponges and seawater produce all amino acids, though specific lineages may use different pathways to achieve this (Fig. S6). The enrichment observed in the sponge MAGs was therefore ascribed to differences in pathway completeness between sponge-associated and seawater microbes, rather than an enhanced ability of sponge symbionts to produce any specific amino acid. Compounds such as taurine, carnitine and creatine have also been proposed as important host-derived carbon sources for symbionts [69]; however, pathways for their catabolism were enriched in seawater rather than sponge-associated MAGs.
While these findings do not invalidate the possibility that microbial communities play a role in amino acid provisioning to the host or that they utilise host-derived taurine, carnitine, or creatine, they suggest that these are not key processes mediating microbe-sponge symbiosis.
To form stable symbioses, bacteria must persist within the sponge tissue and avoid phagocytosis by host cells. Microbial proteins containing ELR motifs have been identified in a range of animal- and plant-associated microbes and are thought to modulate the host’s intracellular processes to facilitate stable symbiotic associations [71, 72]. For example, ELR-containing proteins from sponge-associated microbes have been shown to confer the ability to evade host phagocytosis when experimentally expressed in E. coli [10, 73]. ELR-containing proteins from the ankyrin (ARP), leucine-rich, tetratricopeptide and HEAT repeat families were enriched in the sponge-associated MAGs. In contrast, WD40 repeats were not found to be enriched but are included here as they have previously been reported as abundant in Poribacteria and symbionts of other marine animals [13, 31]. Most ELRs were present across all taxa but were much more prevalent in specific lineages (Fig. 3). For example, sponge-associated Poribacteria, Latescibacterota and Acidobacteriota encoded a high proportion of all ELR types. Other lineages encoded a comparatively high percentage of ARPs relative to other taxa, including the Gemmatimonadota (average 0.25% of coding genes per sponge-associated MAG versus 0.09% in seawater MAGs), Verrucomicrobiota (2%), Deinococcota (0.85%), Acidobacteriota (0.20%; specifically class Luteitaleia at 0.55%) and Dadabacteria from C. orientalis (0.62%), while Nitrospirota encoded a high percentage of HEAT_2 family proteins (0.55% versus 0.05% in seawater MAGs). In contrast, ELR abundances were substantially lower, or absent, in the Actinobacteriota, the class Bacteroidia within the phylum Bacteroidota, and the Thaumarchaeota, suggesting these microorganisms utilise alternative mechanisms to maintain their stable associations with the host.
Fig. 3: Phylogenetic tree showing the distribution of eukaryote-like repeat proteins, namely ankyrin (ARP), leucine-rich (LRR), tetratricopeptide (TPR), HEAT and WD40, across MAGs with >85% completeness (N = 884).
Values represent the percentage of coding genes per MAG devoted to each gene class. Internal branches of the tree are coloured by phylum, while the outer strip is coloured by class, and both are listed clockwise in the order in which they appear. MAGs from seawater are denoted by grey labels with red text.
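The percentage values used for the ELR comparison can be illustrated with a small sketch. It assumes that each MAG's annotation is available as a mapping from gene ID to the set of protein domain names found on that gene; the gene IDs, the toy MAG and the `ELR_FAMILIES` grouping of domain names are placeholders chosen for illustration, not the annotation scheme used in the study.

```python
# Illustrative grouping of domain names into ELR classes (an assumption)
ELR_FAMILIES = {
    "ARP": {"Ank", "Ank_2"},
    "LRR": {"LRR_1"},
    "TPR": {"TPR_1"},
    "HEAT": {"HEAT", "HEAT_2"},
    "WD40": {"WD40"},
}


def elr_percentages(gene_domains, families=ELR_FAMILIES):
    """Percentage of a MAG's coding genes carrying at least one domain per ELR class."""
    n_genes = len(gene_domains)
    if n_genes == 0:
        return {name: 0.0 for name in families}
    result = {}
    for name, domains in families.items():
        # Count each gene once per class, even if it carries several repeats
        hits = sum(1 for doms in gene_domains.values() if domains & set(doms))
        result[name] = 100.0 * hits / n_genes
    return result


# Hypothetical MAG with four coding genes
toy_mag = {
    "g1": {"Ank_2"},
    "g2": {"HEAT_2", "TPR_1"},
    "g3": set(),
    "g4": {"ABC_tran"},
}
percentages = elr_percentages(toy_mag)
```

Normalising by the number of coding genes, rather than reporting raw counts, keeps MAGs of different genome size and completeness comparable.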
The mechanisms by which ELRs interact with sponge cells remain largely unknown, although microbes in other host systems are known to deliver ELR-containing effector proteins into host cells via needle-like secretion systems (types III, IV and V) or extracellular contractile injection systems [74, 75], where they interact with the cellular machinery of the host to modify its behaviour. In sponges, it is also possible that ELRs could be secreted into the extracellular space by type I or II secretion systems. Interestingly, although most sponge MAGs encoded eukaryote-like proteins (Fig. 3), few lineages encoded the necessary genes to form secretion systems (Fig. S7). It is therefore unlikely that ELRs are introduced to the sponge host via traditional secretion pathways used in other animal-symbiont systems.
Maintaining stable association with the sponge may also require mechanisms for attachment to the host tissue. For example, cadherin domains are Ca2+-dependent cell–cell adhesion proteins that are abundant in eukaryotes and have been found to serve the same function in bacteria [76]. Similarly, fibronectin III domains mediate cell adhesion in eukaryotes, but also occur in bacteria where they play various roles in carbohydrate binding and biofilm formation [77, 78]. In addition, some bacterial pathogens utilise fibronectin-binding proteins to gain entry into host tissue by binding to host fibronectin [77, 78]. Genes containing cadherin domains were enriched in the sponge-associated MAGs and were identified in most bacterial lineages, but were notably absent in the Cyanobacteriota and Verrucomicrobiota (Fig. 4). Genes containing fibronectin III domains and those for fibronectin-binding proteins were also enriched in sponge-associated MAGs and were distributed across most lineages, though they were particularly abundant in the Actinobacteriota and Chloroflexota. However, although fibronectin III-containing genes were taxonomically widespread, those encoding fibronectin-binding proteins were restricted to the phyla Poribacteria, Gemmatimonadota, Latescibacterota, Cyanobacteriota, class Anaerolineae within the Chloroflexota (but not Dehalococcoidia), class Rhodothermia within the Bacteroidota, Spirochaetota, Nitrospirota and the archaeal phylum Thaumarchaeota. Interestingly, the taxonomic distribution of these genes overlaps substantially with lineages encoding the genes for sponge sialic acid and glycosaminoglycan degradation, suggesting that attachment to the host may be necessary for utilisation of these carbohydrates (Fig. 2).
However, as the host, bacterial, and archaeal components of the sponge holobiont have fibronectin III domains, symbionts encoding fibronectin-binding proteins may use these to adhere to the host tissue or potentially to form biofilms (bacteria–bacteria attachment). In either case, the enrichment and wide distribution of cadherins, fibronectins and fibronectin-binding proteins in the sponge MAGs suggests that cell–cell adhesion is critical for successful establishment in the sponge niche.
Fig. 4: Phylogenetic tree showing the distribution of cadherins, fibronectins and fibronectin-binding proteins across MAGs with >85% completeness (N = 884).
Values represent the copy number of each gene per MAG. Internal branches of the tree are coloured by phylum while the outer strip is coloured by class. Both are listed clockwise in the order in which they appear. Seawater MAGs are denoted by grey labels with red text.
Distribution of genes encoding ELRs, polysaccharide-degrading enzymes (GHs and CEs), cadherins, fibronectins, RMs and CRISPRs across distantly related taxa suggests that they were either acquired from a common ancestor or that they represent more recent lateral gene transfer (LGT) events, potentially mediated by MGEs, which are enriched in sponge-associated microbial communities [69]. Here, we identify 4963 LGTs from five sponges for which sufficient sequence data were available (>100 Mbp total sequence length across all MAGs), as well as 136 LGTs from seawater MAGs, averaging 1.64 and 0.52 LGTs per Mbp of sequence, respectively (Fig. 5 and Table S5). Sequence similarity of LGTs from MAGs within a sponge species was higher than between sponge species, indicating relatively recent gene transfers (Fig. S8). A higher frequency (Fig. S9) and lower genetic divergence of LGTs among MAGs derived from the same sponge species likely results from the close physical distance between members of each microbiome, as has been observed in other host-symbiont systems [79]. The identification of lateral transfers between microbes from different sponge species may indicate horizontal acquisition of these microbes or that a recent ancestor inhabited the same host. Notably, LGTs included a subset of genes that were enriched within the sponge-associated MAGs, such as GH33 (sialidases) and CE7 (acetyl-xylan esterases), attachment proteins (cadherins and fibronectin III), RM and CAS proteins, and members of all ELR families other than WD40 (Figs. 6 and S10). The observation that a significant number of sponge-enriched genes were laterally transferred between disparate microbial lineages suggests that the processes they mediate provide a strong selective advantage within the sponge niche, though further research is required to validate these findings.
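The per-Mbp normalisation and the 100 Mbp inclusion threshold mentioned above can be illustrated with a short sketch. The per-host numbers below are invented for illustration and are not values from Table S5; the function names are hypothetical.

```python
MIN_TOTAL_BP = 100_000_000  # >100 Mbp cumulative MAG length, as stated in the text


def lgt_density(n_lgts, total_bp):
    """LGT events per Mbp of MAG sequence."""
    return n_lgts / (total_bp / 1e6)


def densities_for_eligible_hosts(hosts):
    """Map host -> LGTs per Mbp, keeping only hosts above the length threshold.

    `hosts` maps a host name to (number of LGTs, total MAG length in bp).
    """
    return {
        name: lgt_density(n_lgts, total_bp)
        for name, (n_lgts, total_bp) in hosts.items()
        if total_bp > MIN_TOTAL_BP
    }


# Hypothetical hosts: host_B falls below the threshold and is excluded
toy_hosts = {
    "host_A": (900, 600_000_000),
    "host_B": (50, 80_000_000),
}
densities = densities_for_eligible_hosts(toy_hosts)
```

Dividing by total sequence length is what makes the sponge and seawater figures comparable, since the two sets differ greatly in the number and cumulative size of their MAGs.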
Fig. 5: Visualisation of LGTs detected within the MAGs for the five sponges passing the cumulative MAG length criterion (>100 Mbp).
The inner strip is coloured by phylum while the outer strip is coloured by host sponges. Bands connect donors and recipients, with their colour corresponding to the donors and the width correlating to the number of LGTs.
Fig. 6: Visualisation of gene flow among microbial phyla for gene families enriched in sponge-associated MAGs.
The inner ring and band connecting donor and recipient is coloured by protein family of the gene being transferred, with the width of the band correlating to the number of LGTs. Recipient MAGs are shown in grey. The outer ring is coloured by microbial phylum. Representation of RM and CAS gene LGTs can be found in Fig. S10.
Sponges are important constituents of coral reef ecosystems because of their critical role in DOM cycling and retention via the sponge-loop. Despite their importance, functional characterisation of sponge symbiont communities has been restricted to just a few lineages of interest, potentially biasing our view of sponge symbiosis. Here we present a comprehensive characterisation of sponge symbiont MAGs spanning the complete range of taxa found in marine sponges (Fig. 7), most of which were previously uncharacterised. We revealed enrichment in carbohydrate-degrading enzymes (GHs and CEs), reflecting specific functional guilds capable of aiding the sponge in the degradation of reef DOM. Further, we identified several ELRs, CRISPRs and RMs that likely facilitate stable association with the sponge host, and showed that particular ELR types are associated with individual microbial lineages. We also clarified the role of Thaumarchaeota as a keystone taxon for ammonia oxidation across sponge species and showed that processes previously thought to be important, such as amino acid provisioning and taurine, creatine and carnitine metabolism, are unlikely to be central mechanisms mediating sponge-microbe symbiosis. Many of the enriched genes are laterally transferred between microbial lineages, suggesting that LGT plays an important role in conferring a selective advantage to specific sponge-associated microorganisms. Taken together, these data illustrate how evolutionary processes have distributed and partitioned ecological functions across specific sponge symbiont lineages, allowing them to occupy or share specific niches and live symbiotically with their sponge hosts.
Fig. 7: Schematic overview of microbial interactions with the host as inferred from the functional potential encoded by the sponge-associated microbial MAGs.
Fbn fibronectin, cdh cadherins, RM restriction-modification systems, CAS CRISPR-associated proteins, ELP eukaryotic-like repeat proteins, CE7 carbohydrate esterase family 7, GH33 glycosyl hydrolase family 33.