Beetle collection
Donaciinae species were collected as adult beetles in and around ponds and lakes in Germany, France, Japan, and the USA between 2005 and 2019 (Supplementary Table 1). Only the species Macroplea appendiculata was collected as larvae.
DNA extraction
The symbiont-containing organs of adult female beetles were dissected and stored at −80 °C. For the larvae of M. appendiculata, all internal organs were dissected for DNA extraction. DNA was extracted and purified with the Epicenter MasterPure™ kit (Epicenter Biotechnologies, Madison, WI, USA).
Diagnostic PCR
To confirm the presence of symbiont DNA in the samples, a specific primer pair for the symbionts of all European Donaciinae beetles was designed, targeting the 16S ribosomal DNA: Don_sym_F1 [5′- GAC TTR RAG GTT GTR AGC -3′] and Don_sym_R1 [5′- GAC TCY AAT CCG AAC TAM GAT A -3′]. The specificity of the primer pair was checked in silico using the Probe Match tool of the Ribosomal Database Project (RDP)63. For all samples, a diagnostic PCR with this primer pair was performed and the amount of DNA was measured using the Qubit® 2.0 Fluorometer (Thermo Fisher Scientific, Waltham, MA, USA) prior to genome sequencing.
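The degenerate positions in these primers can be handled programmatically. The following is a minimal sketch, not part of the original workflow, that computes the reverse complement and the number of distinct sequences encoded by each degenerate oligonucleotide; the function names are illustrative assumptions. It also shows that the Don-Sym FISH probe listed in the FISH section is the exact reverse complement of Don_sym_F1.

```python
# Standard IUPAC nucleotide codes: complements and the bases each code stands for.
COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A",
    "R": "Y", "Y": "R", "M": "K", "K": "M",
    "S": "S", "W": "W", "N": "N",
}
EXPANSION = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT",
    "S": "CG", "W": "AT", "N": "ACGT",
}

def reverse_complement(seq: str) -> str:
    """Reverse complement of a (possibly degenerate) DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))

def degeneracy(seq: str) -> int:
    """Number of distinct non-degenerate sequences the oligo represents."""
    count = 1
    for base in seq.upper():
        count *= len(EXPANSION[base])
    return count

# Sequences as given in the text (5' to 3', spaces removed).
DON_SYM_F1 = "GACTTRRAGGTTGTRAGC"
DON_SYM_R1 = "GACTCYAATCCGAACTAMGATA"
DON_SYM_PROBE = "GCTYACAACCTYYAAGTC"

if __name__ == "__main__":
    print(reverse_complement(DON_SYM_F1) == DON_SYM_PROBE)
    print(degeneracy(DON_SYM_F1), degeneracy(DON_SYM_R1))
```

Because the probe and the forward primer bind the same 16S site on opposite strands, any in silico specificity check of one applies to the other.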
Sequencing of the Macroplea mutica symbiont reference genome
To obtain a high-quality genome of one symbiont, DNA extracted from the symbiotic organs of Macroplea mutica with the DNeasy Blood and Tissue kit (Qiagen, Hilden, Germany) was subjected to sequencing with PacBio (699 Mbp raw reads, average read length of 2260 bp; LGC Genomics, Berlin, Germany) and Illumina technologies (Illumina MiSeq, 250 bp mate-pair reads with 3 kbp library insert size, 30.8 million reads; GATC Biotech, Constance, Germany), respectively. The resulting PacBio reads were assembled into two circular symbiont contigs using Canu64, which were corrected with the Illumina reads by mapping in Geneious 11.0.565.
Genome sequencing of the symbionts of all other species
For each host species, DNA extracted from the symbiotic organs of one to four female beetles (or the larval internal organs for M. appendiculata) was used for symbiont genome sequencing. Sequencing was done commercially (StarSEQ GmbH, Mainz, Germany) with Illumina NextSeq 500 technology at a depth of approximately 5 million 150 bp paired-end reads per sample. The resulting raw reads were uploaded to the KBase Web server66, quality-checked with FastQC v1.0.4 (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/), and adapter-trimmed and quality-trimmed using Trimmomatic v0.3667. Only read pairs for which both the forward and the reverse read passed the trimming were then assembled using SPAdes v3.12.068. The resulting assembly was uploaded to Busybee69 to identify the contigs belonging to the symbiont genome. In Busybee, a taxonomic assignment of all fragments was carried out using Kraken70 to cluster contigs into different taxonomic groups and distinguish host and bacterial contigs. The data were further analyzed and visualized using RStudio version 1.1.45371 including the seqinr package72. All fragments with similar coverage, a GC content of around 20%, and taxonomic annotation as Gammaproteobacteria were selected and blasted against the reference genome of the symbiont of Macroplea mutica. The matching fragments were saved in the correct order and orientation for further curation with Geneious 11.0.565.
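The contig selection criteria described above (similar coverage, GC content around 20%, Gammaproteobacteria annotation) can be sketched as a simple filter. This is an illustrative Python sketch only; the analysis in this study was done in Busybee and R. The `Contig` record, the GC tolerance, and the coverage window are hypothetical assumptions, since no numeric cutoffs are stated in the text.

```python
from dataclasses import dataclass

@dataclass
class Contig:
    name: str
    sequence: str
    coverage: float   # mean read depth (assumed to be known per contig)
    taxon: str        # taxonomic label, e.g. from a Kraken-style classifier

def gc_fraction(seq: str) -> float:
    """Fraction of G+C among unambiguous bases (Ns are ignored)."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    return (seq.count("G") + seq.count("C")) / acgt if acgt else 0.0

def candidate_symbiont_contigs(contigs, cov_min, cov_max,
                               gc_target=0.20, gc_tolerance=0.05,
                               taxon="Gammaproteobacteria"):
    """Keep contigs matching all three criteria.

    gc_tolerance and the coverage window are hypothetical values that
    would have to be chosen per sample after inspecting the data.
    """
    return [
        c for c in contigs
        if c.taxon == taxon
        and cov_min <= c.coverage <= cov_max
        and abs(gc_fraction(c.sequence) - gc_target) <= gc_tolerance
    ]

if __name__ == "__main__":
    toy = [
        Contig("symbiont_like", "ATATATATGC" * 10, 250.0, "Gammaproteobacteria"),
        Contig("host_like", "ATGCGC" * 10, 8.0, "Insecta"),
    ]
    print([c.name for c in candidate_symbiont_contigs(toy, 100.0, 500.0)])
```

Selected contigs would still need to be compared against the M. mutica symbiont reference to establish their order and orientation.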
For most species, the SPAdes assembly and subsequent binning yielded a single circular contig for the plasmid, and one to three contigs covering the symbiont’s chromosome. For some species, all DNA fragments could be combined into one contig with de novo assembly of the SPAdes contigs using the assembler integrated in Geneious. In all other cases, short sequences at the end of one fragment and at the beginning of the following fragment were aligned with the reference genome to merge both fragments correctly. If part of the sequence was missing, leaving a gap between the two fragments, the gap was filled with the corresponding number of Ns. As the symbiont genomes were generally poor in repetitive sequences, the only major challenge was the assembly of the rRNA operons, as they were present in two copies within the genome, which could not be resolved with short-read sequencing. Hence, assembly of the rRNA operons was done by using the M. mutica symbiont genome as a reference, for which the long PacBio reads successfully resolved both copies. For all other genomes, all rRNA reads were incorrectly assembled into a single operon whose ends perfectly overlapped with both positions of the rRNA operons in the genome. To obtain closed draft genomes, the rRNA operon was therefore copied and automatically (Geneious) or manually assembled with the remaining contigs. As the M. mutica symbiont genome contains a tRNA (Glu-TTC) in one of the two otherwise identical rRNA operons, this tRNA may have been lost or duplicated in the process of assembling the other symbiont genomes. All resulting genomes were finally annotated with RAST v2.073 as implemented in KBase66.
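The gap-filling step can be illustrated with a short sketch. It assumes that the gap size is inferred from where the two fragment ends align on the reference genome; the function name and the 1-based, inclusive coordinate convention are hypothetical choices for this example, and the actual merging was done in Geneious.

```python
def join_with_reference_gap(left: str, right: str,
                            left_end_on_ref: int,
                            right_start_on_ref: int) -> str:
    """Join two fragments, padding the unassembled stretch with Ns.

    left_end_on_ref:    reference position (1-based) aligned to the last
                        base of the left fragment.
    right_start_on_ref: reference position (1-based) aligned to the first
                        base of the right fragment.
    """
    gap = right_start_on_ref - left_end_on_ref - 1
    if gap < 0:
        # Overlapping fragments need an overlap merge, not N padding.
        raise ValueError("fragments overlap on the reference")
    return left + "N" * gap + right

if __name__ == "__main__":
    # Left fragment ends at reference position 100, right starts at 106:
    # five reference bases are uncovered, so five Ns are inserted.
    print(join_with_reference_gap("ACGT", "TTGA", 100, 106))
```

Padding by reference distance keeps downstream coordinates roughly comparable between genomes, at the cost of assuming no indel in the missing stretch.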
Symbiont genome analysis
The genes in the chromosomes of all symbionts were clustered into orthologous groups using OrthoMCL v.0.0.774 (Supplementary Data 1). To determine the functional categories for each gene, BlastKOALA was used to assign KEGG Orthology (KO) identifiers based on the KEGG database75,76,77. Based on the hierarchical classification of the KO IDs, pathways were assigned to each gene. The annotated and categorized genes in the genomes were finally visualized with OmicCircos78. Synteny between different genomes was visualized with hive plots79, where the connections between genes were based on the OrthoMCL information of homologous proteins.
Symbiont phylogenetic analyses
The sequences of 49 marker genes in the genomes of the Donaciinae symbionts, the tortoise leaf beetle symbiont Stammera capleta, and the 20 closest relatives in the KBase database were aligned and used for phylogenomic analysis using FastTree 280, as implemented in the “Insert Genomes into Species Tree” tool in KBase66.
To assess the phylogenetic placement of the two pectinases, the translated coding sequences were extracted from the symbiont genomes and plasmids, respectively. Given the low amino acid similarity between genome-encoded and plasmid-encoded GH28 proteins, we performed two independent phylogenetic analyses following the same strategy. We used either a genome-encoded or a plasmid-encoded GH28 protein sequence as a query for a BLASTP search of the ncbi_nr protein database and recovered the first 250 hits. We eliminated redundancy in both datasets at the 90% identity level using the CD-HIT Suite server81. Amino acid sequences were first aligned using MAFFT version 782 and inspected visually in order to correct potentially misaligned regions. Maximum-likelihood phylogenetic analyses were performed on the IQ-TREE web server83, where the best-fitting evolutionary model was selected automatically. The robustness of the analysis was tested using 1000 bootstrap replicates.
Extraction of mitochondrial genomes and phylogenetic analysis
For most beetle species, the contig containing the complete mitochondrial genome could easily be identified within the SPAdes assembly (see above) based on its size (between 14.3 and 15.9 kb), intermediate coverage (lower than the symbiont, but higher than host nuclear contigs) and low GC content (18.7–25.2%). For D. cincticornis, no contig containing the complete mitochondrial genome could be identified, so the raw reads were mapped against the mitochondrial genomes of all 25 other Donaciinae species, yielding 35 contigs that captured approximately 56.3% of the mitochondrial genome. The partial (D. cincticornis) or complete (all other species) nucleotide sequences of all 13 protein-coding genes in the beetles’ mitochondrial genomes were extracted, aligned, concatenated, and used for a phylogenetic analysis using Crioceris duodecimpunctata (NC_003372) as an outgroup. Phylogenetic trees were reconstructed using FastTree (GTR model)80, PhyML (GTR model, 100 bootstrap replicates)84, and RAxML (GTR + Gamma model, 100 bootstrap replicates, dataset partitioned by gene and codon position (first and second positions combined, third positions separated))85, respectively. As all three methods yielded identical tree topologies, only the FastTree phylogeny is displayed.
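The partitioning scheme used for RAxML (by gene and codon position, with first and second positions combined and third positions separate) can be written as a partition file. The sketch below generates such lines in RAxML's partition syntax from a list of gene lengths in a concatenated alignment. The gene lengths and the partition naming are hypothetical; the partition file actually used in this study is not given in the text.

```python
def codon_partitions(genes):
    """Build RAxML-style partition lines for a concatenated alignment.

    genes: list of (name, aligned_length) tuples in concatenation order.
    Each gene yields two partitions: positions 1+2 combined, and position 3.
    """
    lines = []
    start = 1
    for name, length in genes:
        if length % 3 != 0:
            raise ValueError(f"{name}: aligned length is not a multiple of 3")
        end = start + length - 1
        # "\3" tells RAxML to take every third column from the given start.
        lines.append(f"DNA, {name}_pos12 = {start}-{end}\\3, {start + 1}-{end}\\3")
        lines.append(f"DNA, {name}_pos3 = {start + 2}-{end}\\3")
        start = end + 1
    return lines

if __name__ == "__main__":
    # Toy lengths for illustration only, not the real gene lengths.
    for line in codon_partitions([("cox1", 1536), ("nad1", 951)]):
        print(line)
```

The length check matters because a gene alignment that is not in frame would silently assign columns to the wrong codon position.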
Fluorescence in situ hybridization of symbiotic organs
In order to assess the presence and localization of symbiotic bacteria in adult males and females across species, fluorescence in situ hybridization (FISH) was performed on one or two male and female specimens, respectively, of eight Donacia and Plateumaris species feeding on Poales (D. cinerea, D. clavipes, D. semicuprea, D. simplex, D. thalassina, D. vulgaris, P. consimilis, and P. sericea), two Donacia species feeding on Alismatales (D. dentata and D. versicolorea), and one on Nymphaeales (D. crassipes). Elytra, head, and legs were removed from the specimens prior to fixation in either Carnoy’s fixative (ethanol, chloroform, acetic acid in a ratio of 6:3:1) or 4% formaldehyde (FA) in PBS. After up to 4 h of fixation, FA-fixed samples were washed in water and then dehydrated in an increasing butanol series, whereas Carnoy-fixed specimens were washed in butanol. Subsequently, specimens were embedded in Technovit® 8100 (Kulzer GmbH, Wehrheim, Germany) according to the manufacturer’s protocol, and then subjected to semi-thin sectioning (8 µm) on a Leica RM-2245 rotary microtome (Leica, Wetzlar, Germany). Finally, sections were subjected to FISH as described previously86, using a combination of two of the three fluorescent oligonucleotide probes Don-Sym (specific to Donaciinae symbionts, 5′-GCTYACAACCTYYAAGTC-3′), EUB338 (general for Eubacteria, 5′-GCTGCCTCCCGTAGGAGT-3′)87, and EUB784 (general for Eubacteria, 5′-TGGACTACCAGGGTATCTAATCC-3′)88, labeled with Cy3 or Cy5, respectively, as well as DAPI as a general DNA counterstain (Supplementary Table 2). Briefly, samples were hybridized for 90 min at 60 °C in hybridization buffer (0.9 M NaCl, 0.02 M Tris/HCl pH 8.0, 0.01% SDS) containing 5 µl of each probe and 5 µg ml−1 DAPI. Two wash steps with pre-warmed washing buffer (0.1 M NaCl, 0.02 M Tris/HCl pH 8.0, 0.01% SDS, 5 mM EDTA), the second for 20 min at 60 °C, as well as rinsing with dH2O served to remove residual probe. After drying at room temperature, slides were covered with VectaShield® (Vector Laboratories Ltd., Peterborough, UK) and inspected on an AxioImager.Z2 fluorescence microscope (Zeiss, Jena, Germany).
Host transcriptome sequencing and identification of PCWDEs
To assess the beetle hosts’ genetic repertoire for PCWDEs, live beetles were briefly cooled down at −20 °C to immobilize them, and then their digestive tract was dissected, flash-frozen in liquid nitrogen and stored at −80 °C. Total RNA was extracted from the beetles’ midguts using the innuPrep DNA/RNA Mini kit (Analytik Jena, Jena, Germany) following the manufacturer’s instructions. A DNase treatment was then performed to remove potential genomic DNA contamination using TURBO™ DNase (Invitrogen, Carlsbad, CA, USA) for 30 min at 37 °C. Purification and concentration of the RNA samples were achieved using the RNeasy MinElute Clean up Kit (Qiagen, Hilden, Germany) following the manufacturer’s protocol. The quality of the RNA samples was checked using the RNA 6000 Nano LabChip kit on an Agilent 2100 Bioanalyzer (both Agilent Technologies, Santa Clara, CA, USA) according to the manufacturer’s instructions.
RNA-Seq was performed at the Max Planck Genome Center (Cologne, Germany) where poly(A)+-RNA was first enriched before being fragmented to an average of 300–350 nucleotides. Then, a TruSeq compatible, directional library was prepared for each sample using dual-indexed adapter tags. Sequencing was carried out on a HiSeq3000 sequencing platform (Illumina, CA, USA) using paired-end (2 × 150 bp) reads. Quality control measures, including the filtering of high-quality reads based on fastq file scores, the removal of reads containing primer/adapter sequences and trimming of the read length, were carried out using CLC Genomics Workbench v11.0 (Qiagen, Hilden, Germany). Several assemblies were performed for each sequencing dataset. These assemblies differed by the number of randomly selected read pairs that were included (Supplementary Data 2). The quality of each assembly was assessed by performing a BUSCO (Benchmarking Universal Single-Copy Ortholog)89 analysis on an in-house Galaxy server.
All RNA samples were processed in the same way for RNA-Seq, except the one for Macroplea mutica. In this case, the TruSeq compatible, directional library was prepared using single-indexed adapter tags and was multiplexed with other beetle-derived libraries on the same sequencing lane. After assembly of the resulting sequencing dataset, we realized that the M. mutica assembly was cross-contaminated with sequences from other datasets that had been sequenced on the same lane, which made subsequent analyses difficult. We used the protocol described by Peters et al.90 to remove cross-contamination from these RNA-Seq data. We performed cross-BLAST searches, using BLASTN, between the M. mutica transcriptome assembly and all other assemblies corresponding to samples sequenced in the same run. Transcripts that shared nucleotide sequence identity of at least 98% over a length of at least 180 bp between two or more assemblies were identified. If the relative coverage of two transcripts originating from two different assemblies differed >2-fold, the transcript with the lower relative coverage was assumed to be a contaminant and was removed from the corresponding assembly.
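The decision rule for cross-contamination can be sketched as follows. This is a minimal illustration of the thresholds stated above (at least 98% identity over at least 180 bp, and a >2-fold difference in relative coverage), not the published pipeline of Peters et al.; the `Hit` record and its field names are hypothetical, and how relative coverage is normalized is assumed to be handled upstream.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Hit:
    """One BLASTN match between transcripts from two different assemblies."""
    query_id: str
    query_cov: float      # relative coverage of the query transcript
    subject_id: str
    subject_cov: float    # relative coverage of the subject transcript
    identity: float       # percent identity of the alignment
    length: int           # alignment length in bp

def contaminant(hit: Hit, min_identity: float = 98.0,
                min_length: int = 180, fold: float = 2.0) -> Optional[str]:
    """Return the ID of the transcript to remove, or None if undecided."""
    if hit.identity < min_identity or hit.length < min_length:
        return None  # not similar enough to be considered shared
    if hit.query_cov > fold * hit.subject_cov:
        return hit.subject_id  # subject has the lower coverage
    if hit.subject_cov > fold * hit.query_cov:
        return hit.query_id    # query has the lower coverage
    return None  # coverage too similar to call either a contaminant

if __name__ == "__main__":
    hit = Hit("Mmut_t1", 3.0, "other_t9", 40.0, 99.5, 450)
    print(contaminant(hit))
```

Pairs with similar coverage in both assemblies are left in place, since the rule gives no basis for deciding which copy is genuine.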
Transcriptome assemblies of reed beetle species were then screened for the presence of transcripts encoding putative plant cell wall degrading enzymes (PCWDEs) using TBLASTN and previously characterized beetle-derived PCWDE sequences91,92. In parallel, the transcriptome with the highest number of complete single-copy orthologous genes, according to the BUSCO analysis, per species was screened for its complement of carbohydrate-active enzymes (CAZyme) using the dbCAN2 meta server (http://bcb.unl.edu/dbCAN2/index.php)93 (Supplementary Data 3).
In vitro analysis of plant cell wall degrading capabilities
Freshly dissected and frozen guts (−80 °C, see above) were used for enzymatic assays to characterize the ability of the beetle and its symbionts to digest plant cell wall components. Guts were thawed on ice, pooled for each species and subjected to homogenization in a precooled TissueLyser LT (Qiagen, Hilden, Germany) using 25 µl of homogenization buffer for each gut and three metal beads per tube. Subsequently, 50 mM citrate/phosphate buffer pH 5.0 including protease inhibitor cocktail (complete EDTA-free, Roche, Basel, Switzerland) was added to the pooled guts and samples were shaken for 1 min at 50 Hz. Homogenates were centrifuged at 16,000×g at 4 °C for 2 min to pellet remnants of gut tissue. Supernatants were directly used for agarose diffusion assays in 1% agarose Petri dishes containing either 0.1% demethylated PGA from citrus (PGA; degree of methylation DM 0%) (Megazyme, Wicklow, Ireland) or 0.1% carboxymethylcellulose (CMC; Sigma, St. Louis, MO, USA) and 50 mM citrate/phosphate buffer pH 5.094. Two-millimeter holes were made into the agarose, and 10 μl of supernatant from gut homogenates was added to each hole. Agarose plates were incubated for 16 h at 40 °C. Activity was revealed after 1 h of incubation with 0.1% Congo Red solution (for CMC) or 2 h with 0.1% Ruthenium red solution (for PGA) at room temperature; each plate was then destained with 1 M NaCl (for Congo Red) or distilled water (for Ruthenium red) until pale activity zones appeared against a dark red background. For qualitative analysis of breakdown products of the gut homogenates by thin layer chromatography (TLC), supernatants were first subjected to desalting with Zeba Desalt Spin Columns with a 7 kDa cutoff (Thermo Fisher Scientific, Waltham, MA, USA) according to the manufacturer’s guidelines to remove impurities of low molecular weight. Desalted samples were then used for TLC in 20 µl enzyme assays set up as follows: 14 µl of each sample was incubated with either 0.4% pectin polymer or 10 µg of pectic oligomer of galacturonic acid (GalA) in a 20 mM citrate/phosphate buffer pH 5.0 at 40 °C for about 16 h38. Enzymatic activity was tested on the following pectin polymers: demethylated PGA from citrus (same as above), pectin from citrus peel (DM 60%) and esterified pectin from citrus fruit (DM 85%), both from Sigma (St. Louis, MO, USA). In addition, the following pectic oligomers of galacturonic acid (GalA) were tested: GalA heptamer/octamer mixture and tetramer (both from Elicityl, Crolles, France), trimer and dimer (both from Santa Cruz Biotechnology, TX, USA). The whole assay volumes were used for TLC afterwards. Samples were applied to TLC plates (Silica gel 60, 20 × 20 cm, Merck, Kenilworth, NJ, USA) in 2.5 µl steps and plates were developed ascending with ethyl acetate: glacial acetic acid: formic acid: water (9:3:1:4) for about 120 min. After drying, carbohydrates were stained by spraying plates with 0.2% (w/v) orcinol in methanol: sulfuric acid (9:1), followed by a short heating until spots appeared. The reference standard contained 2 µg each of GalA, GalA dimer, and GalA trimer.
Heterologous expression of symbiont-encoded pectinase genes
Glycoside hydrolase family 28-coding genes of D. crassipes and M. mutica symbionts were synthesized by the company Genscript (Piscataway, NJ, USA), including codon optimization for expression in E. coli, and were subsequently cloned into pET-22b(+) in frame with a C-terminal V5 epitope and a 6×His tag. Both plasmid-located (Dcra-pPG, Mmut-pPG) and chromosome-located (Dcra-cPG, Mmut-cPG) GH28 genes were synthesized (see Supplementary Data 4 for codon-optimized sequences). Heterologous expression was performed using the Overnight Express Autoinduction System 1 by Novagen according to the manufacturer’s protocol (Merck, Kenilworth, NJ, USA) with slight modifications as follows. Autoinduction cultures were directly inoculated with single colonies picked from plated BL21 Star (DE3) transformations and were incubated in baffled flasks (50 ml medium/250 ml flask) at 18 °C and 200 rpm for 40 h. Cells were pelleted and subsequently lysed with Novagen BugBuster 10× (Merck, Kenilworth, NJ, USA) in Immobilized Metal Affinity Chromatography (IMAC) Binding buffer (see below) supplemented with Lysonase by rotating the samples at room temperature for 30 min. Samples were centrifuged and the supernatant was subjected to IMAC purification on a column self-packed with 1 ml HisPur cobalt resin. After applying samples in IMAC Binding buffer (50 mM sodium phosphate buffer pH 7.7, 0.5 M sodium chloride, protease inhibitor) on the pre-equilibrated column, the resin was washed extensively (50 mM sodium phosphate buffer pH 7.7, 0.3 M sodium chloride, 10 mM imidazole, protease inhibitor) and eluted three times (sodium phosphate buffer pH 7.4, 0.3 M imidazole, protease inhibitor) with 1 min incubation time for each elution step. Elution fractions e0 were subjected to buffer exchange against 50 mM citrate/phosphate buffer pH 5.0 on Zeba Spin Desalting columns with a 7 kDa cutoff (Thermo Fisher Scientific, Waltham, MA, USA).
Alternatively, recombinant proteins from elution fractions were pulled down by immunoprecipitation using anti-V5 agarose beads (Bethyl Laboratories, Montgomery, TX, USA) as follows. The complete 500 µl of the elution e1 were mixed with 20 µl of the agarose bead slurry and incubated with rotation overnight at 4 °C. The mixture was centrifuged at 1000×g at 4 °C for 2 min to pull down the beads. Beads were washed three times with 500 µl of 50 mM citrate/phosphate buffer pH 5.0 and subsequently re-suspended in 100 µl of water. Success of expression and purification was monitored by Western Blot using a horseradish peroxidase (HRP)-coupled V5 tag monoclonal antibody (dilution 1:10,000) and the SuperSignal West Extended Duration Substrate (both Thermo Fisher Scientific, Waltham, MA, USA). Both buffer-exchanged elution fractions and immunoprecipitated, re-suspended recombinant GH28 proteins were used for enzymatic assays (TLC).
Statistics and reproducibility
Symbiotic organs were dissected from 2 to 10 specimens per host species, and localization was consistent throughout, as represented in Fig. 1d. Fluorescence in situ hybridization to localize the microbial symbionts in adult beetles’ Malpighian tubules (Figs. 1e, 7, and Supplementary Fig. 8) was performed on one (Donacia cinerea; Donacia clavipes; Donacia crassipes; Donacia dentata; Donacia semicuprea; Donacia simplex; Donacia thalassina; Donacia vulgaris male; Plateumaris sericea) or two (Donacia versicolorea; Donacia vulgaris females; Plateumaris consimilis) specimens per species and sex, yielding consistent results. The heterologous expression of GH28 proteins (Supplementary Fig. 5) was performed three times. The success of heterologous expression and subsequent IMAC was monitored three times, but the pull-down using anti-V5 agarose beads was performed and monitored only once. Replicated experiments yielded consistent results.

Females (left panel) and males (right panel) of four representative species feeding on different host plants are shown (for fluorescence micrographs of 11 different species, see Supplementary Fig. 8). Note that different probes were used (see Supplementary Table 2), so the symbionts of different species are labeled in red (Cy3, a, b), green (Cy5, c–e), or yellow (Cy3 and Cy5, g). DAPI (blue) was used for general DNA counterstaining. Filled white arrowheads highlight symbiont-filled Malpighian tubules (symbiotic organs), empty arrowheads point to Malpighian tubules without symbionts. The following species are shown (host plant order given in brackets): a, b Donacia crassipes (Nymphaeales), c, d Donacia dentata (Alismatales), e, f Donacia semicuprea (Poales), g, h Plateumaris sericea (Poales). Note that only the Alismatales-feeding and Nymphaeales-feeding species show symbiont-bearing organs in adult males (b, d), whereas the males of Poales-feeding species are symbiont-free (f, h). By contrast, females carry symbionts in all species (a, c, e, g). Scale bars 50 µm.
Reporting summary
Further information on research design is available in the Nature Research Reporting Summary linked to this article.
Source: Ecology - nature.com