Overlooked and widespread pennate diatom-diazotroph symbioses in the sea
Epithemia isolation and culture
The Epithemia cells were isolated from 0.5 L of seawater collected from depths of 25, 75, and 100 m in the North Pacific Subtropical Gyre (22°45′ N, 158°00′ W). Seawater was collected during the near-monthly Hawaii Ocean Time-series (HOT) expeditions to the long-term monitoring site Station ALOHA (water depth ca. 4800 m) in October 2014 (HOT cruise #266) and February–July 2019 (HOT cruises #310–313). Serial dilution (unialgal strains UHM3202, UHM3203, UHM3204) or micropipette isolation of single cells (clonal strains UHM3200, UHM3201, UHM3210, UHM3211) was used to establish the Epithemia cultures, which were grown in a seawater-based, low-nitrogen medium. Filtered (0.2 µm) and autoclaved, undiluted Station ALOHA seawater was amended with 2 μM EDTA, 50 nM ferric ammonium citrate, 7.5 μM phosphoric acid, trace metals (100 nM MnSO4, 10 nM ZnCl2, 10 nM Na2MoO4, 1 nM CoCl2, 1 nM NiCl2, 1 nM Na2SeO3), vitamins (50 μg/L inositol, 10 μg/L calcium pantothenate, 10 μg/L thiamin, 5 μg/L pyridoxine HCl, 5 μg/L nicotinic acid, 0.5 μg/L para-aminobenzoic acid, 0.1 μg/L folic acid, 0.05 μg/L biotin, 0.05 μg/L vitamin B12), and 106 μM Na2SiO3. Although not tested here, simpler formulations of diazotroph media such as PMP40 or RMP41 may also be suitable for growing Epithemia when made with 100% seawater and amended with Na2SiO3. The cultures were subsequently incubated at 24 °C on a 12:12 h light:dark cycle with 50–100 μmol quanta m−2 s−1 using cool white fluorescent bulbs. All E. pelagica and E. catenata symbioses were stable under these medium and incubation conditions. E. pelagica was successfully isolated from at least one of the three depths that were targeted during each sampling occasion.

Morphological observations
Living and fixed Epithemia cells were imaged by light and epifluorescence microscopy using a Nikon Eclipse 90i microscope at 40×–60× magnification. 
Diatom cell sizes were determined using >60 live, exponentially growing cells, imaged in either valve view (E. pelagica) or girdle view (E. catenata). Endosymbiont (spheroid body) cell sizes were averaged from DNA-stained cells for E. pelagica UHM3200 (n = 78) and E. catenata UHM3210 (n = 91), imaged by epifluorescence microscopy after preparing samples as follows: Epithemia cells were fixed in 4% glutaraldehyde for 30 min, pelleted at 1000 × g for 1 min, the supernatant was exchanged with 0.5% Triton X-100 (in autoclaved filtered seawater), samples were incubated for 10 min with gentle agitation, cells were then pelleted at 4000 × g for 1 min, supernatant was exchanged with autoclaved filtered seawater and fixed in 4% glutaraldehyde, and samples were stained with 1× final concentration of SYBR Gold nucleic acid stain (Invitrogen, cat. # S11494) for 2 h. For routine observations of endosymbionts (e.g., determining presence/absence and number per host cell), osmotic shock was used to disrupt the cell contents of diatom host cells and improve visualization of the endosymbionts. This was achieved by gently pelleting cells and exchanging the medium with either ultrapure water or 2–3 M NaCl solution, followed by immediate observation. While this is a simple technique for detecting and visualizing endosymbionts (Fig. 1c, f), it does not accurately represent the natural location of endosymbionts within the host cells, as seen when compared to fixed cell preparations for epifluorescence microscopy (Fig. 1n, o). To assess the presence of fluorescent photopigments in endosymbiont cells, live host cells were pelleted at 4000 × g for 5 min and crushed using a microcentrifuge tube pestle (SP Bel-Art, cat. # F19923-0000) to release the endosymbionts. 
The crushed pellet was resuspended in 75% glycerol containing live Synechococcus WH7803 cells (positive control for fluorescence), and samples were observed by epifluorescence microscopy using filter cubes appropriate for observing phycoerythrin (EX: 551/10, BS: 560, EM: 595/30) and chlorophyll (EX: 480/30, BS: 505, EM: 600LP).

The loss of endosymbionts from Epithemia cultures (UHM3200 and UHM3210) was observed after propagating cells for four months in nitrogen-replete medium (K)18, where approximately 5–10% of the culture was transferred to fresh medium about every two weeks. Observations were only made at the end of the four-month period. Endosymbionts were not observed growing freely in these cultures, and the absence of endosymbionts within host cells was confirmed by the failure to observe spheroid bodies by light microscopy after osmotic shock of the diatoms, as well as by the failure to amplify the endosymbiont SSU (16S rRNA) and nifH genes from cellular DNA extracts. PCR reactions were performed in parallel with DNA extracts from control cultures (grown in low-nitrogen medium), using the same template DNA amount (10 ng) and PCR conditions (see methods for Marker gene sequencing and phylogenetics).

Ultrastructural observations by electron microscopy (EM) were conducted for E. pelagica UHM3200 and E. catenata UHM3210. EM preparations of diatoms typically involve the oxidative removal of organic matter to uncover the fine details of frustule ultrastructure. However, in the case of E. catenata, oxidatively cleaned cells lacked structural integrity, leading to collapsed frustules when dried and viewed by scanning EM (SEM). For this reason, both species were prepared for SEM with and without (Fig. 1a, d) the oxidative removal of organic matter, and cleaned E. catenata frustules were further analyzed by transmission EM (TEM). 
To remove organic matter, 100 mL of exponentially growing culture was pelleted by centrifugation at 1000 × g for 10 min and resuspended in 30% H2O2. Cells were boiled in H2O2 for 1–2 h, followed by rinsing cells six times in ultrapure water by sequential centrifugation at 1000 × g for 10 min and resuspension of cell pellets. Suspensions of the cleaned cells were dried on aluminum foil and mounted on aluminum stubs with double-sided copper tape. For some E. catenata SEM preparations, the cleaned frustules were dehydrated in an ethanol dilution series and exchanged into hexamethyldisilazane (HMDS) prior to drying on aluminum foil; this was to minimize the collapse of frustules resulting from drying. To prepare cells with organic matter intact, 25 mL of exponentially growing culture was mixed with an equal volume of fixative solution (5% glutaraldehyde, 0.2 M sodium cacodylate pH 7.2, 0.35 M sucrose, 10 mM CaCl2) and incubated overnight at 4 °C. Cells were gently filtered onto a 13 mm diameter 1.2 μm pore size polycarbonate membrane filter (Isopore, Millipore Sigma), washed with 0.1 M sodium cacodylate buffer (pH 7.4, 0.35 M sucrose), fixed with 1% osmium tetroxide in 0.1 M sodium cacodylate (pH 7.4), dehydrated in a graded ethanol series, and critical point dried. Filters were mounted on aluminum stubs with double-sided conductive carbon tape. All SEM stubs were sputter coated with Au/Pd prior to observing on a Hitachi S-4800 field emission scanning electron microscope at the University of Hawai’i at Mānoa (UHM) Biological Electron Microscope Facility (BEMF). Cleaned E. catenata cells were prepared for TEM by drying a drop of sample on a formvar/carbon-coated grid and observing on a Hitachi HT7700 transmission electron microscope at UHM BEMF.

Additional light microscopy of hydrogen-peroxide cleaned frustules was conducted for E. pelagica UHM3201 and E. catenata UHM3210. Samples were mounted in Naphrax (PhycoTech, Inc., cat. 
# P-Naphrax200) and observed at 100× using an Olympus BX41 Photomicroscope (Olympus America Inc., Center Valley, Pennsylvania) with differential interference contrast optics and an Olympus SC30 Digital Camera at California State University San Marcos.

A key to the strains used in each micrograph is provided in Supplementary Table 2.

Marker gene sequencing and phylogenetics
For each Epithemia strain, 25–50 mL of culture was pelleted at 4000 × g for 10 min, and DNA was extracted from the pellet using the ZymoBIOMICS DNA Miniprep Kit (Zymo Research, cat. # D4300). Marker genes were amplified with the Expand High Fidelity PCR System (Roche, cat. # 4743733001), using conditions previously described for the SSU gene encoding 18S rRNA (Euk328f/Euk329r)42, the LSU gene encoding 28S rRNA (D1R/D2C)43, rbcL (rbcL66+/dp7−)44,45, psbC (psbC+/psbC−)44, and cob (Cob1f/Cob2r)21. For the endosymbionts, a partial sequence for the SSU (16S rRNA) gene was amplified using a primer set targeting unicellular cyanobacterial diazotrophs, CYA359F/Nitro821R46,47, and the nifH gene was amplified using new primers specific to the nifH of Cyanothece-like organisms, ESB-nifH-F (5′-TACGGAAAAGGCGGTATCGG-3′) and ESB-nifH-R (5′-CACCACCAAGRATACCGAAGTC-3′), with a 55 °C annealing temperature and 75 s extension time. All primers were synthesized by Integrated DNA Technologies (IDT). Amplified products were cloned and transformed into E. coli using the TOPO TA Cloning Kit for Sequencing (Invitrogen, cat. # K457501), and plated colonies were picked and grown in Circlegrow medium (MP Biomedicals, cat. # 113000132). Plasmids were extracted with the Zyppy Plasmid Miniprep kit (Zymo Research, cat. # D4019) and sequenced from the M13 vector primers using Sanger technology at GENEWIZ (South Plainfield, NJ). For the diatom SSU (18S rRNA) gene, sequencing reactions were also performed using the 502f and 1174r primers48.

Phylogenetic trees (Fig. 
2) were inferred using concatenated alignments for both diatom host genes (SSU encoding 18S rRNA, psbC, rbcL) and endosymbiont genes (SSU encoding 16S rRNA, nifH). For each gene, nucleotide sequences were aligned using MAFFT v7.45349 (L-INS-i method), and sites with gaps or missing data were removed. An appropriate nucleotide substitution model was selected for each gene alignment using jModelTest v2.1.1050. Bayesian majority consensus trees were inferred from the concatenated alignments using MrBayes v3.2.751 with two runs of 4–8 chains, until the average standard deviation of split frequencies dropped below 0.01. Maximum likelihood bootstrap values were generated for the Bayesian tree using RAxML v8.2.1252, implemented with 1000 iterations of rapid bootstrapping. To further analyze the phylogenetic position of the new Epithemia species in the broader context of Surirellales and Rhopalodiales diatoms, individual gene trees (SSU encoding 18S rRNA, LSU, rbcL, psbC, and cob; Supplementary Figs. 13–19) were constructed from sequences aligned using MAFFT (automatic detection method) and trimmed using trimAl v1.253 (gappyout method). rRNA gene phylogenies were also inferred using sequences aligned according to the global SILVA alignment for SSU and LSU genes using SINA54, which were either left untrimmed in the case of the LSU gene or trimmed to remove highly variable positions (SINA’s “012345” positional variability filter) and gappy positions (trimAl v1.2, gappyout method) in the case of the SSU gene. These trimming strategies were selected based on their ability to maximize the monophyly of the previously described Rhopalodiales clade and minimize the separation of known conspecific strains, such as the strains of E. pelagica described here. All gene phylogenies were inferred using the Bayesian methods described above. To investigate the level of support for constrained tree topologies placing E. 
catenata within or outside of the genus Epithemia and family Rhopalodiaceae, SH55 and AU56 statistical tests were performed in IQ-TREE 257 (implementing ModelFinder58) using all alignments from the individual gene trees (Supplementary Table 3).

Given E. catenata’s unusual morphology, test trees were inferred with the inclusion of diatom sequences from orders Bacillariales (Nitzschia, Pseudo-nitzschia), Cymbellales (Didymosphenia), Naviculales (Amphiprora, Navicula, Pinnularia), and Thalassiophysales (Amphora, Halamphora, Thalassiophysa); however, E. catenata was consistently placed within Rhopalodiales, and these trees were not pursued further.

An additional nifH phylogeny was constructed using all environmental sequences from NCBI’s non-redundant nucleotide (nt) database >300 bp and sharing >95% nucleotide sequence identity with EpSB and EcSB nifH sequences (Supplementary Fig. 23), including 51 environmental sequences from prior studies investigating marine diazotrophs34,59,60,61,62,63,64,65,66. Environmental nifH sequences were aligned to the previously generated nifH sequence alignment using MAFFT (automatic method detection and addfragments options), and the best-scoring maximum likelihood phylogeny was inferred using RAxML with 1000 iterations of rapid bootstrapping. NCBI accession numbers for all tree sequences are in the Source Data file.

Analysis of Epithemia endosymbiont nifH sequences in environmental datasets
Nucleotide sequences for EpSB and EcSB nifH were queried against NCBI’s non-redundant nucleotide (nt) database using webBLAST67 (megablast; https://blast.ncbi.nlm.nih.gov/) and SRA databases for nifH amplicon sequencing projects from the marine environment using the SRA Toolkit68 (dc-megablast, with database validation using vdb-validate; https://github.com/ncbi/sra-tools). 
Database hits with 98–100% nucleotide identity over an alignment of the entire subject sequence (BLAST alignment length = subject sequence length) were identified, and the associated sample’s latitude and longitude coordinates (where available) were mapped. Coordinates were also mapped for metagenome and metatranscriptome samples containing matches to unigene MATOU-v1_93255274 from the Marine Atlas of Tara Oceans Unigenes69, a unigene that shares 100% identity over the entire length of the EpSB UHM3202 nifH sequence and >99.4% identity with all other EpSB nifH sequences.

The presence of EpSB and EcSB nifH sequences was examined in metagenomes prepared from sinking particles collected at 4000 m depth at Station ALOHA27,28. The sinking particles were collected during intervals of 12, 10, and 8 days during 2014, 2015, and 2016, respectively, using a McLane sediment trap equipped with a 21-sample bottle carousel. The presence of EpSB and EcSB nifH sequences in the metagenomes was assessed by blastn70, after first removing low quality bases from metagenomic reads using Trimmomatic v0.3971 (parameters: LEADING:20 TRAILING:20 MINLEN:100). For each sediment trap metagenome, the total number of reads matching EpSB or EcSB nifH nucleotide sequences with 100% identity was tallied and normalized to the total number of reads in the database. Only EpSB-matching reads were detected in this analysis.

Quantitative PCR
Specific PCR primers were designed targeting a 102 bp region of E. pelagica’s LSU gene (Epel-LSU-F, 5′-GAAACCAGTGCAAGCCAAC-3′; Epel-LSU-R, 5′-AGGCCATTATCATCCCTTGTC-3′) and an 85 bp region of EpSB’s nifH gene (EpSB-nifH-F, 5′-CACACTAAAGCACAAACTACC-3′; EpSB-nifH-R, 5′-CAAGTAGTACTTCGTCTAGCTC-3′) and were synthesized by IDT. Gene copy concentrations were quantified for Station ALOHA water samples (~2 L) collected by Niskin bottles at 5, 25, 45, 75, 100, 125, 150, and 175 m on January 16 and July 1 (except 5 m), 2014, during HOT cruises #259 and #264. 
Samples were filtered onto 25 mm diameter, 0.02 μm pore size aluminum oxide filters (Anotop; Whatman, cat. # WHA68092102) and stored at −80 °C until extracting DNA using the MasterPure Complete DNA and RNA Purification Kit (Epicentre, cat. # MC85200) according to Mueller et al.72. Briefly, a 3-mL syringe filled with 1 mL of tissue and cell lysis solution (MasterPure) containing 100 μg mL−1 proteinase K was attached to the outlet of the filter, and the filter inlet was sealed with a second 3-mL syringe. The lysis solution was pulled halfway through to saturate the filter membrane, and the entire assembly was incubated at 65 °C for 15 min while attached to a rotisserie in a hybridization oven rotating at ca. 16 rpm. The lysis buffer was then drawn fully into the inlet syringe, transferred to a microcentrifuge tube, and placed on ice. The remaining steps for protein precipitation and removal and nucleic acid precipitation were carried out following the manufacturer’s instructions. For each sample, DNA was resuspended in a final volume of 100 μL. Quantitative PCR (qPCR) was performed using the PowerTrack SYBR Green Master Mix system (Applied Biosystems, cat. # A46109) and run on an Eppendorf Mastercycler epgradient S realplex2 real-time PCR machine. Reactions (20 µL total volume) were prepared according to the manufacturer’s protocol, containing 500 nM of each primer. Sample reactions (four replicates) contained 2 μL of environmental DNA extract (24–76 ng DNA), while standards (three replicates) contained 2 μL of gBlocks Gene Fragments (IDT) that were prepared at 1, 2, 3, 4, 5, and 6 log gene copies/μL. The gBlocks Gene Fragments were 500 bp in length and encompassed the entire E. pelagica UHM3201 LSU sequence and positions 1–500 of the EpSB UHM3201 nifH sequence, respectively. 
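The standard series described above (1 to 6 log gene copies/μL) is the kind of data from which a qPCR standard curve is typically fitted. The following is a minimal sketch of that generic calculation, not the authors' conversion equations (those are in the Source Data file). It assumes a linear fit of threshold cycle (Ct) against log10 copies/μL, the conventional efficiency relation E = 10^(−1/slope) − 1, and scaling by the 100 μL elution volume and ~2 L filtered volume given in the text. All function names and the Ct values in the demo are hypothetical.

```python
def fit_standard_curve(log10_copies, ct_values):
    """Ordinary least-squares fit of Ct = slope * log10(copies/uL) + intercept."""
    n = len(log10_copies)
    mean_x = sum(log10_copies) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_copies)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_copies, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amplification_efficiency(slope):
    """Conventional qPCR efficiency; 1.0 corresponds to 100% (perfect doubling)."""
    return 10 ** (-1.0 / slope) - 1.0


def copies_per_ul_template(ct, slope, intercept):
    """Invert the standard curve to get gene copies per uL of DNA extract."""
    return 10 ** ((ct - intercept) / slope)


def copies_per_l_seawater(copies_per_ul, elution_ul, filtered_l):
    """Scale copies/uL of extract to copies per litre of filtered seawater."""
    return copies_per_ul * elution_ul / filtered_l


if __name__ == "__main__":
    # Hypothetical standards: illustrative Ct values, not measured data.
    log_copies = [1, 2, 3, 4, 5, 6]
    cts = [38.0 - 3.32 * x for x in log_copies]
    slope, intercept = fit_standard_curve(log_copies, cts)
    print("efficiency:", amplification_efficiency(slope))
    per_ul = copies_per_ul_template(28.04, slope, intercept)
    # 100 uL elution and 2 L filtered, as in the text (sample volume was ~2 L).
    print("copies per L:", copies_per_l_seawater(per_ul, 100.0, 2.0))
```

Because standards and samples both used 2 μL of template, a curve expressed in copies/μL maps a sample Ct directly to copies/μL of extract; this sketch does not correct for extraction losses.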
The main cycling conditions consisted of an initial denaturation and enzyme activation step of 95 °C for 2 min, followed by 40 cycles of 95 °C for 5 s and 57 °C or 55 °C for 30 s for the LSU and nifH genes, respectively. Melting curves were analyzed to verify the specificity of the amplifications, and reactions containing Epithemia catenata DNA extract were included as negative controls. Reaction efficiencies were 104.23% and 95.15% for the LSU and nifH genes, respectively. The limit of detection for these assays was not empirically determined. gBlocks sequences, qPCR threshold cycle values, and conversion equations are provided in the Source Data file.

Physiology experiments
The daily patterns of N2 fixation were quantified for E. pelagica UHM3200 and E. catenata UHM3210 using two techniques: acetylene (C2H2) reduction to ethylene (C2H4) and argon-induced dihydrogen (H2) production (AIHP). Both analyses were conducted using a gaseous flow-through system that quantified the relevant trace gas on the sample outlet line with a temporal resolution of 10 min73. To conduct the measurements, a 10-mL subsample of each Epithemia culture was placed in a 20-mL borosilicate vial and closed using gas-tight rubber stoppers and crimp seals. Separate bottles were used for H2 production and C2H2 reduction. During the experimental period, the temperature was maintained at 25 ± 0.2 °C using a benchtop incubator (Incu-Shaker; Benchmark Scientific) and light exposure was 200 μmol photons m−2 s−1 at wavelengths of 380–780 nm with a 12:12 h square light:dark cycle (Prime HD+; Aqua Illumination). To conduct the AIHP method, the sample vial containing the culture was flushed with a high purity gas mixture consisting of argon (makeup gas; 80%), oxygen (20%), and carbon dioxide (0.04%). In the absence of N2, all of the electrons that would have been used to reduce N2 to NH3 are diverted to H2 production, thereby providing a measure of Total Nitrogenase Activity (TNA). 
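In a flow-through system of this kind, a trace-gas mole fraction measured on the outlet line can be converted to a production rate with a simple mass balance. The sketch below is a generic illustration under stated assumptions, not the calculation used in the cited method: it assumes ideal-gas behaviour, a steady carrier-gas flow, a blank (no-culture) mole fraction to subtract, and gas temperature and pressure of 25 °C and 1 atm at the point of flow measurement. All function names and example values are hypothetical.

```python
R_GAS = 8.314462618  # molar gas constant, J mol^-1 K^-1


def molar_volume_l(temp_c=25.0, pressure_pa=101325.0):
    """Ideal-gas molar volume in litres per mole at the given conditions."""
    return R_GAS * (temp_c + 273.15) / pressure_pa * 1000.0


def gas_production_rate(mole_fraction, blank_mole_fraction, flow_ml_min,
                        temp_c=25.0, pressure_pa=101325.0):
    """Production rate (mol per hour) of a trace gas in a flow-through vial.

    mole_fraction and blank_mole_fraction are dimensionless
    (e.g. 100 ppb = 1.0e-7); flow_ml_min is the carrier-gas flow rate.
    """
    excess = mole_fraction - blank_mole_fraction
    mol_per_min = excess * (flow_ml_min / 1000.0) / molar_volume_l(
        temp_c, pressure_pa)
    return mol_per_min * 60.0


def cell_specific_rate(mol_per_h, cells_per_ml, culture_ml):
    """Normalize a vial-level rate to mol per cell per hour."""
    return mol_per_h / (cells_per_ml * culture_ml)


if __name__ == "__main__":
    # Hypothetical reading: 100 ppb excess H2 at a 13 mL/min flow rate.
    rate = gas_production_rate(1.0e-7, 0.0, 13.0)
    # Hypothetical cell density in a 10 mL subsample.
    print(cell_specific_rate(rate, 1.0e4, 10.0))
```

Treating temperature and pressure as explicit parameters keeps the ideal-gas assumption visible; a real analysis would use the conditions at which the flow controller is calibrated.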
The C2H2 reduction assay also represents a measure of TNA. Our analytical set-up introduced C2H2 at a 1% addition (vol/vol) to the high purity air with a total flow rate (13 mL min−1) identical to the AIHP method. The gas emissions were analyzed using separate reductive trace gas analyzers that were optimized for the quantification of H2 and C2H4. To verify the observed daily patterns in N2 fixation, 15N2 assimilation measurements were conducted on triplicate samples of Epithemia cultures at targeted time points. Five milliliters of 15N-enriched seawater was added to the subsamples, which were subsequently crimp sealed and incubated for a 2 h period with the same light and temperature conditions as the daily gas measurements. At the end of the incubation, the contents of each vial were filtered onto a pre-combusted glass fiber filter. The concentration and isotopic composition (δ15N) of particulate nitrogen for incubated and non-incubated (i.e., natural abundance) samples were measured using an elemental analyzer/isotope ratio mass spectrometer (Carlo-Erba EA NC2500 coupled with a ThermoFinnigan Delta Plus XP). For each of the described analyses, cell-specific rates were calculated based on the average of triplicate cell concentration measurements, obtained from cell samples preserved at 4 °C with Lugol’s iodine solution and quantified within a week using a Sedgwick-Rafter counting chamber (Electron Microscopy Sciences, cat. # 68050-52). All rate measurement data are provided in the Source Data file.

Reporting summary
Further information on research design is available in the Nature Research Reporting Summary linked to this article.