
A molecular atlas reveals the tri-sectional spinning mechanism of spider dragline silk

Chromosomal-scale genome assembly and full spidroin gene set of T. clavata

To explore dragline silk production in T. clavata, we sought to assemble a high-quality genome of this species. Thus, we first performed a cytogenetic analysis of T. clavata captured from the wild in Dali City, Yunnan Province, China, and found a chromosomal complement of 2n = 26 in females and 2n = 24 in males, comprising eleven pairs of autosomal elements and unpaired sex chromosomes (X1X1X2X2 in females and X1X2 in males) (Fig. 1a). Then, DNA from adult T. clavata was used to generate long-read (Oxford Nanopore Technologies (ONT)), short-read (Illumina), and Hi-C data (Supplementary Data 1). A total of 349.95 Gb of Nanopore reads, 199.55 Gb of Illumina reads, and ~438.41 Gb of Hi-C raw data were generated. Our sequential assembly approach (Supplementary Fig. 1c) resulted in a 2.63 Gb genome with a scaffold N50 of 202.09 Mb and a Benchmarking Universal Single-Copy Ortholog (BUSCO) genome completeness score of 93.70% (Table 1; Supplementary Data 3). Finally, the genome was assembled into 13 pseudochromosomes. Sex-specific Pool-Seq analysis of spiders indicated that Chr12 and Chr13 were sex chromosomes (Fig. 1b; Supplementary Fig. 2). Based on the MAKER2 pipeline34 (Supplementary Fig. 1e), we annotated 37,607 protein-encoding gene models and predicted repetitive elements with a collective length of 1.42 Gb, accounting for 53.94% of the genome.
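For readers unfamiliar with the scaffold N50 statistic quoted above, a minimal Python sketch of how it is commonly computed follows. The scaffold lengths are invented toy values, not the T. clavata assembly, and `scaffold_n50` is a hypothetical helper name. The last lines recompute the repeat fraction from the rounded figures in the text, which gives roughly 54%, in line with the reported 53.94%.

```python
def scaffold_n50(lengths):
    """Return the N50: the scaffold length L such that scaffolds of
    length >= L together cover at least half of the total assembly."""
    if not lengths:
        raise ValueError("no scaffold lengths given")
    half_total = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length


# Toy scaffold lengths in Mb (illustrative only, not the real assembly).
toy_lengths_mb = [250, 200, 150, 100, 50, 25]
print("toy N50 (Mb):", scaffold_n50(toy_lengths_mb))

# Repeat fraction recomputed from the rounded values quoted in the text.
repeat_gb, genome_gb = 1.42, 2.63
print(f"repeat fraction: {100 * repeat_gb / genome_gb:.1f}%")
```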

Table 1 Characteristics of the T. clavata genome assembly

To identify T. clavata spidroin genes, we searched the annotated gene models for sequences similar to 443 published spidroins (Supplementary Data 6) and performed a phylogenetic analysis of the putative spidroin sequences for classification (Supplementary Fig. 12a). Because a typical spidroin gene consists of a long repeat domain flanked by nonrepetitive N- and C-terminal domains16, we initially identified 128 nonrepetitive hits. These candidates were further validated and reconstructed using full-length transcript isoform sequencing (Iso-seq) and transcriptome sequencing (RNA-seq) data. We thus identified 28 spidroin genes, among which 26 were full-length (Supplementary Fig. 11a), including 9 MaSps, 5 minor ampullate spidroins (MiSps), 2 flagelliform spidroins (FlSps), 1 tubuliform spidroin (TuSp), 2 aggregate spidroins (AgSps), 1 aciniform spidroin (AcSp), 1 pyriform spidroin (PySp), and 5 other spidroins. This full set of spidroin genes was located across nine of the 13 T. clavata chromosomes. Interestingly, we found that the MaSp1a–c & MaSp2e, MaSp2a–d, and MiSp-a–e genes were distributed in three independent groups on chromosomes 4, 7, and 6, respectively (Fig. 1c). Notably, using the genomic data of another orb-weaving spider species, Trichonephila antipodiana35, we identified homologous group distributions of spidroin genes on T. antipodiana chromosomes (Fig. 1d), which supported the reliability of our grouping results. When we compared the spidroin gene catalog of T. clavata with those of five other orb-web spider species with genomic data28,29,36,37, we found that T. clavata and Trichonephila clavipes possessed the largest number of spidroin genes (28 genes in both species; Fig. 1e).

To further explore the expression of spidroin genes in different glands, all morphologically distinct glands (major and minor ampullate- (Ma and Mi), flagelliform- (Fl), tubuliform- (Tu), and aggregate (Ag) glands) were cleanly and separately dissected from adult female T. clavata spiders except for the aciniform and pyriform glands, which could not be cleanly separated because of their proximal anatomical locations and were therefore treated as a combined sample (aciniform & pyriform gland (Ac & Py)). After RNA sequencing of these silk glands, we performed expression clustering analysis of transcriptomic data and found that the Ma and Mi glands showed the closest relationship in terms of both morphological structure (Fig. 1g) and gene expression (Fig. 1f, h). We noted that the expression profiles of spidroin genes were largely consistent with their putative roles in the corresponding morphologically distinct silk glands; for example, MaSp expression was found in the Ma gland (Fig. 1h). However, some spidroin transcripts, such as MiSps and TuSp, were expressed in several silk glands (Fig. 1h). Unclassified spidroin genes, such as Sp-GP-rich, did not appear to show gland-specific expression (Fig. 1h).

In summary, the chromosomal-scale genome of T. clavata allowed us to obtain detailed structural and location information for all spidroin genes of this species. We also found a relatively diverse set of spidroin genes and a grouped distribution of MaSps and MiSps in T. clavata.

Dragline silk origin and the functional character of the Ma gland segments

To further evaluate the detailed molecular characteristics of the Ma gland-mediated secretion of dragline silk, we performed integrated analyses of the transcriptomes of the three T. clavata Ma gland segments and the proteome and metabolome of T. clavata dragline silk (Fig. 2a). Sodium dodecyl sulfate–polyacrylamide gel electrophoresis (SDS–PAGE) analysis of dragline silk mainly showed a thick band above 240 kDa, suggesting a relatively small variety of total proteins (Fig. 2b). Subsequent liquid chromatography–mass spectrometry (LC–MS) analysis identified 28 proteins, including ten spidroins (nine MaSps and one MiSp) and 18 nonspidroin proteins (one glucose dehydrogenase (GDH), one mucin-19, one venom protein, and 15 SpiCEs of dragline silk (SpiCE-DS)) (Fig. 2b; Supplementary Data 10). Among these proteins, we found that the core protein components of dragline silk in order of intensity-based absolute quantification (iBAQ) percentages were MaSp1c (37.7%), MaSp1b (12.2%), SpiCE-DS1 (11.9%, also referred to as SpiCE-NMa1 in a previous study28), MaSp1a (10.4%), and MaSp-like (7.2%), accounting for approximately 80% of the total protein abundance in dragline silk (Fig. 2b). These results revealed potential protein components that might be highly correlated with the excellent strength and toughness of dragline silk.
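The iBAQ percentages above are shares of the summed iBAQ intensity. The following Python sketch illustrates that normalization on invented intensities and then adds up the five percentages reported in the text to check the "approximately 80%" figure. The function name `ibaq_percentages` and the toy intensities are assumptions for illustration and are not taken from the study's pipeline.

```python
def ibaq_percentages(ibaq_by_protein):
    """Convert raw iBAQ intensities to percentages of the summed intensity."""
    total = sum(ibaq_by_protein.values())
    if total <= 0:
        raise ValueError("summed iBAQ intensity must be positive")
    return {name: 100 * value / total for name, value in ibaq_by_protein.items()}


# Hypothetical raw intensities (arbitrary units), for illustration only.
toy_ibaq = {"protein_A": 6.0, "protein_B": 3.0, "protein_C": 1.0}
print(ibaq_percentages(toy_ibaq))

# Percentages of the five most abundant components, as reported in the text.
reported_core = {
    "MaSp1c": 37.7,
    "MaSp1b": 12.2,
    "SpiCE-DS1": 11.9,
    "MaSp1a": 10.4,
    "MaSp-like": 7.2,
}
core_share = round(sum(reported_core.values()), 1)
print("core components, % of total iBAQ:", core_share)
```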

Fig. 2: Dragline silk origin and the functional character of the Ma gland segments.

a Schematic illustration of Ma gland segmentation. b Sodium dodecyl sulfate–polyacrylamide gel electrophoresis (SDS–PAGE) (left) and LC–MS (right) analyses of dragline silk protein. iBAQ, intensity-based absolute quantification. Similar results were obtained in three independent experiments and summarized in Source data. c Classification of the identified metabolites in dragline silk. d LC–MS analyses of the metabolites. e LC–MS analyses of the golden extract from T. clavata dragline silk. The golden pigment was extracted with 80% methanol. The extracted ion chromatograms (EICs) showed a peak at m/z 206 [M + H]+ for xanthurenic acid. f Pearson correlation of different Ma gland segments (Tail, Sac, and Duct). g Expression clustering of the Tail, Sac, and Duct. The transcriptomic data were clustered according to the hierarchical clustering (HC) method. h Combinational analysis of the transcriptome and proteome showing the expression profile of the dragline silk genes in the Tail, Sac, and Duct. i Concise biosynthetic pathway of xanthurenic acid (tryptophan metabolism) in the T. clavata Ma gland. Gene expression levels mapped to tryptophan metabolism are shown in three segments of the Ma gland. Enzymes involved in the pathway are indicated in red, and the genes encoding the enzymes are shown beside them. j Gene Ontology (GO) enrichment analysis of Ma gland segment-specific genes indicating the biological functions of the Tail, Sac, and Duct. The top 12 significantly enriched GO terms are shown for each segment of the Ma gland. A P-value <0.05 was set as the criterion for screening significantly enriched GO terms. Source data are provided as a Source Data file.


To evaluate the composition of T. clavata dragline silk, we then assessed its metabolite composition and identified a total of 180 components (Supplementary Data 12). Among the metabolites, 109 were classified into ten categories: 34 organic acids, 22 organoheterocyclic compounds, 16 lipids, 13 benzenoids, 5 organic nitrogens, 8 organic oxygens, 5 nucleosides, 3 organooxygens, 2 phenylpropanoids and polyketides, and 1 alkaloid (Fig. 2c; Supplementary Data 13). We noted that xanthurenic acid (XA, a yellow pigment38) was the most abundant pigment (Fig. 2d), while other yellow pigments (such as carotenoids and flavonoids39,40) were not detected in our analysis, implying that XA is the major pigment providing T. clavata dragline silk with its golden coloration. The presence of XA was further confirmed by LC–MS analysis (Fig. 2e), consistent with a recent report41.

To explore the origin of dragline silk components from the tri-sectional Ma gland, we focused on the transcriptomic features of the Tail, Sac, and Duct. We determined that the gene expression profiles of the Tail and Sac were more highly correlated with each other than with that of the Duct (Fig. 2f, g; Supplementary Fig. 13c), implying that the Tail and Sac have similar molecular functions. Furthermore, combined transcriptomic and proteomic analyses revealed the expression patterns of the 28 dragline silk protein transcripts in the Tail, Sac, and Duct (Fig. 2h; Supplementary Data 10). Notably, MaSp1a–c & MaSp2e (MaSp-Group1) were highly coexpressed in the Tail and Sac, and MaSp2a–d (MaSp-Group2) were highly coexpressed in only the Sac, while neither of these groups of proteins was highly coexpressed in the Duct, indicating that the Tail and Sac are the major silk-secreting segments.
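The segment comparison above rests on Pearson correlation between expression profiles. A minimal sketch of that calculation is given below; the expression vectors are invented toy values chosen so that two profiles resemble each other and the third does not, and they are not the study's data.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two vectors of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / (sd_x * sd_y)


# Toy log-scale expression values for five genes (hypothetical).
tail = [8.0, 7.5, 1.0, 0.5, 3.0]
sac = [7.0, 8.0, 1.5, 1.0, 3.5]
duct = [1.0, 0.5, 6.0, 7.0, 3.0]

for name, other in (("Sac", sac), ("Duct", duct)):
    print(f"Tail vs {name}: r = {pearson(tail, other):.2f}")
```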

We then used the tri-sectional Ma gland datasets to trace the source of the metabolite XA. We found that the genes encoding key enzymes involved in the XA biosynthesis (tryptophan metabolism) pathway were activated in all three Ma gland segments (Fig. 2i); in particular, a kynurenine aminotransferase gene (KAT, Tc09G169510), encoding the primary enzyme that catalyzes the transamination of 3-hydroxy-L-kynurenine (3-HK) to XA, showed this pattern. These findings suggested that XA is secreted by the Tail, Sac, and Duct.

To characterize the specific biological functions of the Tail, Sac, and Duct related to dragline silk production, we next assigned Gene Ontology (GO) terms to classify the functions of Ma gland segment-specific genes (Supplementary Data 14). We found that the GO terms that were significantly enriched in the Tail (relative to the Duct and Sac) were mainly related to the synthesis of organic acids (the largest group in the silk metabolome), those in the Sac (relative to the Duct and Tail) were mainly related to the synthesis of lipids (the third-largest group in the silk metabolome), and those in the Duct (relative to the Sac and Tail) were related to ion (Ca2+ and H+) exchange and chitin synthesis (Fig. 2j). Thus, a segmental division of biological functions was revealed.
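GO enrichment with a P-value cutoff, as used above, is often computed with a one-sided hypergeometric test. The text here does not state which test was applied, so the sketch below is an assumption about a typical approach rather than a description of the study's method; the function name and the example counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, drawn, hits):
    """One-sided P(X >= hits) for X ~ Hypergeometric.

    population: number of background genes
    annotated:  background genes carrying the GO term
    drawn:      segment-specific genes tested
    hits:       segment-specific genes carrying the GO term
    """
    upper = min(annotated, drawn)
    favourable = sum(
        comb(annotated, i) * comb(population - annotated, drawn - i)
        for i in range(hits, upper + 1)
    )
    return favourable / comb(population, drawn)


# Hypothetical counts: 40 of 200 segment-specific genes carry a term that
# annotates 500 of 10,000 background genes.
p_value = hypergeom_enrichment_p(10_000, 500, 200, 40)
print("enriched at P < 0.05:", p_value < 0.05)
```

In practice a multiple-testing correction across GO terms would usually be applied on top of the raw P-values.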

Taken together, our results demonstrate a tri-sectional generation process of dragline silk in the Ma gland. Thus, we have established a genetic relationship between dragline silk components and the Tail, Sac, and Duct segments of the gland.

Comprehensive epigenetic features and ceRNA network of the Ma gland tri-section

Based on the Ma gland RNA-seq data, we found that the total fragments per kilobase of transcript per million mapped reads (FPKM) of the dragline silk genes accounted for 47.49% and 34.33% of the FPKM values of the Tail and Sac, respectively; however, in the Duct, these genes accounted for only 0.76% of the FPKM values (Supplementary Data 11), indicating that the transcription of dragline silk genes was highly efficient in the first two segments. In particular, the MaSps within the two groups were highly coexpressed in the specific segments of the Ma gland (Fig. 2h). These findings revealed a segment-specific expression pattern of dragline silk genes.
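As a reminder of what the percentages above measure, the sketch below applies the standard FPKM definition and then computes the share of a segment's summed FPKM contributed by a gene subset. The gene names and values are toy placeholders, and `share_of_total` is a hypothetical helper, not part of the study's pipeline.

```python
def fpkm(fragments, gene_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragments * 1e9 / (gene_length_bp * total_mapped_fragments)


def share_of_total(fpkm_by_gene, genes_of_interest):
    """Percentage of the summed FPKM contributed by the selected genes."""
    total = sum(fpkm_by_gene.values())
    if total <= 0:
        raise ValueError("summed FPKM must be positive")
    subset = sum(v for g, v in fpkm_by_gene.items() if g in genes_of_interest)
    return 100 * subset / total


# 1,000 fragments on a 2 kb transcript in a library of 1 million fragments.
print(fpkm(1000, 2000, 1_000_000))

# Toy segment profile (hypothetical gene names and FPKM values).
toy_segment = {"silk_gene_1": 300.0, "silk_gene_2": 150.0, "other_gene": 550.0}
print(share_of_total(toy_segment, {"silk_gene_1", "silk_gene_2"}))
```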

To better understand the transcriptional regulatory mode of these genes, we first investigated genome-wide chromatin accessibility (CA) in the Tail, Sac, and Duct using the assay for transposase-accessible chromatin with sequencing (ATAC-seq). A total of 702,037 (Tail), 767,517 (Sac), and 653,361 (Duct) significant ATAC peaks (RPKM > 2) were identified in the 2 kb regions upstream and downstream of genes, and 10,501,151 (Tail), 11,356,55 (Sac), and 9,778,368 (Duct) significant ATAC peaks (RPKM > 2) were identified at the whole-genome level. Metagene plots showed that chromatin around genes was more accessible in the Tail (mean RPKM: 1.78) and Sac (mean RPKM: 2.04) than in the Duct (mean RPKM: 1.59) (Fig. 3a). We then analyzed the genome-wide DNA methylation level in the Tail, Sac, and Duct. We found the highest levels of DNA methylation in the CG context (beta value: 0.12 in Tail, 0.13 in Sac, and 0.10 in Duct) and only a small amount in the CHH (beta value: 0.04 in Tail, 0.05 in Sac, and 0.03 in Duct) and CHG (beta value: 0.04 in Tail, 0.05 in Sac, and 0.04 in Duct) contexts (Fig. 3b). Overall, there was no significant difference in methylation levels among the Tail, Sac, and Duct. Taken together, our results suggest a potential regulatory role of CA rather than DNA methylation in the transcription of dragline silk genes.
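Two quantities in this paragraph can be illustrated with a short sketch: the RPKM > 2 threshold for calling a peak significant and the methylation beta value. The beta value is written here as the usual ratio of methylated to total calls at a site, which is an assumption about how it was derived in this study; the peak records and read counts are toy values.

```python
def significant_peaks(peaks, threshold=2.0):
    """Keep peaks whose RPKM strictly exceeds the threshold."""
    return [peak for peak in peaks if peak["rpkm"] > threshold]


def beta_value(methylated, unmethylated):
    """Methylation level at a site: methylated calls over all calls.

    Returns None when the site has no coverage.
    """
    covered = methylated + unmethylated
    if covered == 0:
        return None
    return methylated / covered


# Toy peak records (hypothetical identifiers and values).
toy_peaks = [
    {"id": "peak_1", "rpkm": 1.5},
    {"id": "peak_2", "rpkm": 2.0},
    {"id": "peak_3", "rpkm": 3.1},
]
print([peak["id"] for peak in significant_peaks(toy_peaks)])

# A CG site with 12 methylated and 88 unmethylated calls.
print(beta_value(12, 88))
```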

Fig. 3: Comprehensive epigenetic features and ceRNA network of the tri-sectional Ma gland.

a Metagene plot of ATAC-seq signals and heatmap of the ATAC-seq read densities in the Tail, Sac, and Duct. The chromatin accessibility was indicated by the mean RPKM value (upper) and the blue region (bottom). b Metagene plot of DNA methylation levels in CG/CHG/CHH contexts in the Tail, Sac, and Duct. (c, d) Screenshots of the methylation and ATAC-seq tracks of the MaSp1b (c) and MaSp2b (d) genes within the Tail, Sac, and Duct. The potential TF motifs (E-value <1e−10) in the indicated peak set (2 kb upstream of the TSS) are listed to the right and sorted by position. Asterisks represent the shared TF motif within the corresponding MaSp group. e Venn network of TF motifs between MaSp-Group1 and MaSp-Group2. f Expression levels of miRNAs and lncRNAs in the Tail, Sac, and Duct. Data are presented as mean ± SD (n = 3 for each Ma segment). Box plots show minimum to maximum (whiskers), 25–75% (box), median (band inside) with all data points. g ceRNA network of the dragline silk genes.


Next, the visualization of ATAC-seq and methylation datasets of the two MaSp groups in a genome browser revealed a reverse trend of peak signals (Fig. 3c, d; Supplementary Fig. 16). We analyzed potential transcription factor (TF) motifs among the ATAC-seq peak sets in the 2 kb regions upstream of the transcriptional start sites (TSSs). We identified nine Tail- and Sac-specific TF motifs for MaSp1b (in MaSp-Group 1) and 13 Sac-specific TF motifs for MaSp2b (in MaSp-Group 2) (Supplementary Data 15; Supplementary Fig. 17a, b). Interestingly, we noted that the TF motifs closest to the TSSs, such as MYB and homeobox motifs for MaSp-Group 1 and two C2H2 motifs for MaSp-Group 2, were shared within each MaSp group (Fig. 3c, d). However, the Venn network of TF motifs between MaSp-Group 1 and MaSp-Group 2 showed little commonality among the Tail, Sac, and Duct (Fig. 3e). Therefore, we concluded that there was a common regulatory pattern within each MaSp group but a differentiated regulatory pattern between the two MaSp groups.

To investigate the impact of competing endogenous RNAs (ceRNAs: a post-transcriptional regulatory system implemented by miRNA and lncRNA42) on the regulation of dragline silk genes, we performed a whole-transcriptomic analysis of the tri-sectional Ma gland and identified a total of 527 miRNAs (179 in the Tail, 167 in the Sac, 181 in the Duct) and 10,110 lncRNAs (240 in the Tail, 982 in the Sac, and 4808 in the Duct) (Fig. 3f). From these data, we constructed potential lncRNA–miRNA–mRNA interaction pairs by using the miRanda43 and RNAhybrid44 algorithms to identify the potential binding sites between miRNA and lncRNA/mRNA, and then visualized the interaction networks by using Cytoscape software45. As shown in Fig. 3g, the ceRNA network of dragline silk genes consisted of 28 lncRNAs, 21 miRNAs, and 13 mRNAs. Remarkably, we noted that the ceRNA networks of MaSp1a–c & MaSp2e (MaSp-Group 1) were tightly clustered, as were those of MaSp2a–d (MaSp-Group 2); three lncRNAs (LXLOC_047988, LXLOC_047990, and LXLOC_051464) in the MaSp-Group 1 network were highly expressed in the Ma gland (Supplementary Fig. 18); one lncRNA (LXLOC_070389) and four miRNAs (novel_mir42, novel_mir46, novel_mir166, and miR-285_3) in the MaSp-Group 2 network were highly expressed in the Ma gland (Supplementary Fig. 18); in addition, the ceRNA networks of the two MaSp groups were independent of each other (Fig. 3g; Supplementary Fig. 18). These results further revealed potential post-transcriptional networks and the differentiated coregulatory pattern of the genes in the two MaSp groups in the Ma gland.
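The logic of assembling lncRNA–miRNA–mRNA triples from predicted binding pairs can be sketched as follows. Two points are assumptions made for illustration and are not stated in the text: that only pairs predicted by both tools are retained, and that a triple is formed whenever an lncRNA and an mRNA share a miRNA. All identifiers are hypothetical placeholders, and no miRanda or RNAhybrid output format is implied.

```python
def consensus_pairs(pairs_tool_a, pairs_tool_b):
    """Pairs reported by both prediction tools (assumed filtering rule)."""
    return set(pairs_tool_a) & set(pairs_tool_b)


def cerna_triples(mirna_lncrna_pairs, mirna_mrna_pairs):
    """Join (miRNA, lncRNA) and (miRNA, mRNA) pairs on the shared miRNA."""
    lncrnas_by_mirna = {}
    for mirna, lncrna in mirna_lncrna_pairs:
        lncrnas_by_mirna.setdefault(mirna, set()).add(lncrna)
    triples = set()
    for mirna, mrna in mirna_mrna_pairs:
        for lncrna in lncrnas_by_mirna.get(mirna, ()):
            triples.add((lncrna, mirna, mrna))
    return triples


# Hypothetical predictions, written as (miRNA, target) pairs.
lnc_pairs = consensus_pairs(
    [("mir_1", "lnc_A"), ("mir_2", "lnc_B")],
    [("mir_1", "lnc_A"), ("mir_3", "lnc_C")],
)
mrna_pairs = consensus_pairs(
    [("mir_1", "gene_X"), ("mir_2", "gene_Y")],
    [("mir_1", "gene_X"), ("mir_2", "gene_Y")],
)
print(sorted(cerna_triples(lnc_pairs, mrna_pairs)))
```

The resulting triples form the edge list that a network viewer such as Cytoscape would display.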

In summary, we observed an abundance of epigenetic and ceRNA signatures associated with the efficient and segment-specific transcription of dragline silk genes. Our data suggested the existence of differential regulation strategies in the three segments of the Ma gland dedicated to achieving the hierarchical gene expression of the MaSps.

Single-cell spatial architecture at the whole Ma gland scale

To further explore the cytological basis related to the hierarchical organization of the Ma gland, we generated single-cell and spatial patterns of gene expression in the T. clavata whole Ma gland. A total of 9349 high-quality single cells (SCs) were obtained after quality control, and they were then split into ten clusters through uniform manifold approximation and projection (UMAP) clustering (Fig. 4a). Based on the GO analysis of cluster-specific marker genes combined with the expression profiles of the segment-specific genes in each cluster (Supplementary Fig. 21 and Supplementary Note “Cell type annotation”), we finely annotated the ten SC clusters, namely, cluster 1: Ma gland origin cell (MaGO), cluster 2: MaSp-Group 1 synthesis cell (MG1S), cluster 3: Chitin synthesis cell (CS), cluster 4: Unknown cell, cluster 5: Ampullate lumen skeleton cell (ALS), cluster 6: Ion transport cell (IT), cluster 7: Lipid synthesis cell (LS), cluster 8: pH adjustment cell (PA), cluster 9: MaSp-Group 2 synthesis cell I (MG2S I), and cluster 10: MaSp-Group 2 synthesis cell II (MG2S II), and delineated their sources from the Tail (clusters 1 and 2), Sac (clusters 1, 2, 5, 7, 9, 10), and Duct (clusters 1, 3, 4, 6, 8) (Fig. 4a).

Fig. 4: Single-cell spatial architecture at the whole-Ma-gland scale.

a Uniform manifold approximation and projection (UMAP) analysis of cell types in the Ma gland and their grouping into ten cell clusters. The numbers in white-filled circles indicate cell clusters. The ubiquitous clusters are shown in the yellow series, the Sac clusters in the green series, and the Duct clusters in the blue series. b Pseudotime trajectory of all 9349 Ma gland cells. Each dot indicates a single cell, color-coded by the cluster as in (a). The numbers in black-filled circles indicate branch sites. The black arrows indicate the start of the trajectory. c Hematoxylin and eosin staining of Ma gland sections and unbiased clustering of spatial transcriptomic (ST) spots. Dotted lines depict the outline of the Ma gland. Similar results were obtained in three independent experiments and summarized in Source data. d UMAP and ST feature plots of the expression of genes in the MaSp group. e Heatmap showing the expression of Ma gland segment-specific genes in each cell type and each ST cluster along with corresponding GO terms. Source data are provided as a Source Data file.


To discern how the Tail, Sac, and Duct develop within the Ma gland, we ordered all single cells according to pseudotime and constructed a developmental trajectory. This resulted in a continuum of cells with three distinct branch points. We found that a set of cells from cluster 1 (ubiquitous SC cluster) assembled at the beginning of the pseudotime period and gradually bifurcated into four end-points representing two segments (the Sac and Duct) of the Ma gland, with the clusters arranged at different branch sites (Fig. 4b). We further investigated the developmental trajectories of the Tail, Sac, and Duct separately. As expected, most cells from cluster 1 were assembled at the beginning of the pseudotime period, while the Sac cells (clusters 2, 5, 7, 9, and 10) and the Duct cells (clusters 3, 4, 6, 8) were grouped into different branches (Fig. 4b; Supplementary Fig. 23). These results identified SC cluster 1 as the Ma gland origin cell and provided insights into the differentiation trajectories of Ma gland cells during cell state transitions.

We next assessed the spatial organization of cell populations in the Ma gland sections. The dataset comprised gene expression information from 579 spatial transcriptomic (ST) spots within Ma gland sections (Fig. 4c). After analyzing the transcriptional signatures of ST spots, we identified seven spot clusters (one in the Tail, four in the Sac, and two in the Duct). As a first demonstration of the single-cell and spatial gene expression patterns in the Ma gland, we visualized the expression of the MaSp-Group1 and MaSp-Group2 genes in UMAP and ST feature plots to better localize silk protein-secreting cells within our captured cell populations. Interestingly, we found that the MaSp-Group1 genes were prominently expressed in the Tail and Sac clusters (SC clusters 1, 2, 5, 7, 9, and 10 and ST clusters “a–f”), while the MaSp-Group2 genes were predominantly expressed in the Sac clusters (SC clusters 9–10 and ST cluster “d”) (Fig. 4d; Supplementary Fig. 22a). Thus, according to both the scRNA-seq and ST results, the genes exhibited cell cluster-specific and spatial cluster-specific patterns that were consistent with the results of the expression and regulation analyses (Figs. 2h, 3c, d, g).

To characterize the identified SC and ST clusters related to the molecular function of the Ma gland, we selected 35 of the segment-specific genes identified in the segment transcriptomes to perform expression analyses based on the scRNA-seq and ST datasets. We found that SC cluster 2 and ST cluster “a” were major sets with the functions of organic acid metabolic process and oxidoreductase activity; SC cluster 7 and ST cluster “e” were major sets with the functions of lipid metabolic process and transferase activity; SC cluster 6 and ST cluster “f” were major sets with the function of calcium ion binding; SC cluster 3/8 and ST cluster “c/g” were major sets with the function of proton-transporting V-type ATPase complex; and SC cluster 3 and ST cluster “f” were major sets with the function of chitin binding (Fig. 4e; Supplementary Figs. 25, 26).

In summary, the detailed anatomic and molecular description of the Ma gland revealed a single cell type within the Tail and multiple cell types within the Sac and Duct, highlighting the developmental and functional differentiation of the Ma gland in the tri-section.

Convergent evolution of the tri-sectional silk gland between T. clavata and Bombyx mori

To extend our investigation to an established model organism that has also evolved specialized glands for spinning silk, we chose the silkworm, Bombyx mori, which has a distinct phylogenetic position from spiders in the phylum Arthropoda46. Through anatomical observations, we noted a morphological convergence of the silk-spinning gland between the T. clavata Ma gland (Tail, Sac, and Duct) and the B. mori silk gland (posterior silk gland (PSG), middle silk gland (MSG), and anterior silk gland (ASG)), indicating a one-to-one correspondence of the tri-sectional architecture (Fig. 5a, b). We also found a similar number of silk gland cell types between T. clavata and B. mori47 but differentiated annotations except for the chitin-related process in the Duct/ASG (Fig. 5b; Supplementary Fig. 29). Our previous studies revealed defects in silk spinning by silkworms caused by structural deficiency of the PSG and ASG48,49. However, the genetic manipulation system for spiders has not yet been established. To examine whether the remaining segment of the silkworm silk gland is required for proper silk production, we used ser1 promoter-driven transgenic overexpression (OE) of a butterfly cytotoxin (pierisin-1A, P1A50) to generate an MSG-deficient silkworm (P1A-OE) (Fig. 5c). We found that the MSG of the P1A-OE strain was successfully truncated and that these silkworms failed to spin a cocoon (Fig. 5c), consistent with the phenotypes of PSG- and ASG-deficient silkworms48,49. Our results further demonstrated that the tri-sectional architecture of the silk gland is essential for silk spinning.

Fig. 5: Convergent evolution of the tri-sectional silk gland between T. clavata and B. mori.

a Morphology of the T. clavata Ma gland, dragline silk, B. mori silk gland, and cocoon silk. Insets show cross-sections of silk threads. Similar results were obtained in three independent experiments and summarized in Source data. b Schematic illustration showing the morphological convergence of the T. clavata Ma gland, with the Tail, Sac, and Duct indicated (above), and the B. mori silk gland, with the PSG, MSG, and ASG indicated (below). c Construction of the piggyBac transgenic vector (above) and silk gland phenotypes of the P1A-OE and wild-type (WT) strains in silkworm (below). Ser1-P, Ser1 promoter. 3 × P3-P, 3 × P3 promoter. SV40-T, SV40 terminator. L5D7 represents the 7th day of the fifth instar. The arrows indicate the silk production process, and a red cross indicates that the process was blocked. d Sequence alignment of the orthologous Hsp20 proteins (Tc04G175120 and BMSK0007630) in spider and silkworm. The red line indicates the editing region. The arrow shows the distribution of identified alleles around the cleavage site of the sgRNA. The 16 most frequent sequences are shown. e Cocoon weight, pupa weight, and cocoon layer rate performance of the CRISPR/Cas9-based Hsp20 knockout silkworm strain. Data are presented as mean ± SD (n = 3). Statistical comparisons were made using two-tailed Student’s t test. ns indicates non-significant. **P-value <0.01. The arrow indicates the silk production process. The red line ending in a crossbar indicates that the process was suppressed. f Molecular functional convergence of the T. clavata Ma gland and the B. mori silk gland according to GO term analysis. A P-value < 0.05 was set as the criterion for screening significantly enriched GO terms. g Orthologous gene expression convergence between the T. clavata Ma gland and the B. mori silk gland. h, i Component convergence of protein (h) and metabolite (i) between silks produced by T. clavata and B. mori. The metabolites with a total intensity percentage above 90% were analyzed. The major metabolites in T. clavata dragline silk and B. mori cocoon silk are indicated in red. Source data are provided as a Source Data file.


To further investigate the convergence between these two species, we performed whole-genome blastp alignment and identified 9593 and 7355 orthologous genes in T. clavata and B. mori, respectively. From these assigned gene ortholog pairs, we selected the Hsp20 pair (Tc04G175120 and BMSK0007630) because of the high sequence identity of 87.1% between T. clavata and B. mori (Fig. 5d). Hsp20 encodes a small heat shock protein that acts as a protein chaperone to protect other proteins against misfolding and aggregation51; it was expressed in the silk glands of both T. clavata and B. mori and also served as a marker gene of SC cluster 5 of the T. clavata Ma gland (Supplementary Data 17). We next performed CRISPR/Cas9-based knockout (KO) of Hsp20 in silkworm and successfully identified indels (96.46%) at the target site of the Hsp20 sgRNA (Fig. 5d). Interestingly, we found that the silk production (cocoon weight and cocoon layer rate) of the Hsp20-KO strain was significantly lower than that of the wild type (Fig. 5e). From these results, we concluded that the orthologous Hsp20 played a positive role in silk production in both the investigated spider and the silkworm.
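Sequence identity between an ortholog pair, such as the figure quoted above, is typically read off a pairwise alignment. The sketch below shows one common convention, matches divided by aligned columns excluding columns that are gaps in both sequences; other denominators are also in use, and the text does not state which was applied. The aligned strings are short invented peptides, not the Hsp20 sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns, skipping gap-gap columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue  # column carries no information for this pair
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100 * matches / compared


# Toy alignment (hypothetical peptides): 4 matches over 6 columns.
print(f"{percent_identity('MKT-LV', 'MKSALV'):.1f}% identity")
```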

To further investigate the molecular signatures of silk spinning for evidence of convergent evolution in the spider and silkworm, we next performed comparative transcriptomic, proteomic, and metabolomic analyses of the tri-sectional silk glands and silks of T. clavata and B. mori. Shared GO terms were identified between each corresponding silk gland region of T. clavata and B. mori. We found that three GO terms (calcium ion binding, chitin binding, and signal transduction) were commonly enriched in the Duct and ASG and that one GO term (fatty acid metabolic process) was commonly enriched in the Sac and MSG (Fig. 5f). These shared GO terms were silk gland-specific and not identified in other tissue types (hemocyte and ovary) (Supplementary Fig. 30). Next, we performed unique gene screening for each silk gland segment using a customized pipeline (Supplementary Fig. 28a). We found that 42, 6, and 2 pairs of orthologous genes were coexpressed in the Duct and ASG, Sac and MSG, and Tail and PSG, respectively (Fig. 5g; Supplementary Data 20). Interestingly, we found that seven ortholog gene pairs involved in the V-type ATPase family, encoding key factors involved in regulating silk fibrillogenesis52, were significantly upregulated in both the Duct and ASG (Fig. 5g). Our results indicated a higher consistency between the Duct and ASG in molecular functions than between the Sac and MSG or the Tail and PSG.

We then explored whether the components of spider and silkworm silks showed convergence by performing Venn analyses of the silk proteomes and metabolomes across the two species (Supplementary Fig. 28b). We found that mucin-19 and GDH were the only two proteins present in both T. clavata dragline silk and B. mori cocoon silk, and more interestingly, these proteins were mainly synthesized and secreted by Tail/PSG and Sac/MSG (Fig. 5h). We also identified six common metabolites in these two silks: choline, N-methyl-α-aminoisobutyric acid, DL-malic acid, D(-)-threose, 4-oxoproline, and L-threonic acid (Fig. 5i). Among these metabolites, it is worth noting that choline, a component of phospholipids in cell membranes53, and DL-malic acid, which acts as a preservative or pH adjuster54, were both major metabolites of dragline silk and cocoon silk (Fig. 5i). We therefore speculated that silk secretion from gland cells to the lumen is accompanied by choline release from the cell membrane and that, in natural silk, anti-rot and antibacterial characteristics are conferred by DL-malic acid.

In summary, the shared molecular characteristics of the tri-sectional silk glands of a spider and a silkworm indicated convergent evolution under similar conditions related to silk spinning and provided rich insights into silk gland function and silk biosynthesis.


Source: Ecology - nature.com
