Summary of sequencing data
We obtained 168 million and 180 million 250 bp reads from Asian Badger and Northern Hog Badger, respectively. After removing transcripts and unigenes shorter than 200 bp, we obtained 335,772 transcripts and 285,159 unigenes for Asian Badger and 413,917 transcripts and 362,075 unigenes for Northern Hog Badger (Table 1). Next, we analysed the length distribution of the unigenes and transcripts in these two species (Fig. 1). In both species, the transcript N50 exceeded 1000 bp and the unigene N50 exceeded 600 bp. The average GC content of the Asian Badger transcriptome was 52.71%, slightly higher than that of the Northern Hog Badger (52.12%) (Table 1).
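For readers unfamiliar with the N50 statistic reported above, the following is a minimal sketch of how it is commonly computed: the lengths are sorted from longest to shortest, and N50 is the length at which the running total first reaches half of the summed length. The function name `n50` and the toy lengths are illustrative assumptions, not the authors' pipeline or data; only the 200 bp minimum length comes from the text.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    account for at least half of the total assembled length.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must not be empty")


# Hypothetical toy lengths (bp); real input would be the assembled
# transcript or unigene lengths.
toy_lengths = [150, 200, 300, 400, 500, 1000]

# Mirror the filtering step described in the text: drop sequences below 200 bp.
kept = [n for n in toy_lengths if n >= 200]

print(n50(kept))
```

Because N50 weights long sequences more heavily than a mean or median does, it is usually read alongside the full length distribution (Fig. 1) rather than on its own.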
Functional annotation and classification of the assembled unigenes
The annotation success rates of the unigenes in the seven databases are shown in Table 2. In total, 34,150 (ZH) and 31,632 (GH) unigenes had GO terms (Table 2). Among them, three GO terms were related to digestion: positive regulation of digestive system process (one gene each in GH and ZH), digestive tract development (four genes each in GH and ZH), and digestion (five genes in GH, three in ZH). Next, we compared the GO terms of the Asian Badger and Northern Hog Badger transcriptomes and found that the distribution patterns of gene functions in the two species were very similar (Fig. 2). This expected result indicates that there was no bias in the construction of the libraries from the Asian Badger and Northern Hog Badger. For both species, within the three main GO categories (cellular component, molecular function, and biological process), ‘Cellular process’, ‘Binding’ and ‘Metabolic process’ were among the dominant terms (Fig. 2). In total, 8915 (ZH) and 10,203 (GH) unigenes had KOG terms (Table 2). In addition, 15,667 (ZH) and 17,823 (GH) unigenes were mapped to Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways (Table 2) and grouped into 32 subclasses. Interestingly, the digestive system subcategory contained 695 and 611 unigenes in Asian Badger and Northern Hog Badger, respectively, involving 9 pathways: bile secretion, gastric acid secretion, pancreatic secretion, salivary secretion, carbohydrate digestion and absorption, protein digestion and absorption, vitamin digestion and absorption, fat digestion and absorption, and mineral absorption.
Analysis of orthologous genes
The transcriptome evolution of different species can be understood by comparing transcriptome data. We analysed the putative orthologous genes between the Asian Badger and Northern Hog Badger transcriptomes obtained in this study. We selected a total of 5227 homologous gene pairs from these four species. After these 5227 homologous gene pairs were optimized and screened, 943 orthologous gene pairs were obtained (Supplementary Table S1).
To explore whether the genes related to small intestinal digestion in Asian Badger and Northern Hog Badger have undergone adaptive evolution, we analysed the selection pressure on orthologous genes, which can be used to predict the genes that affect the evolution of the two species12. From the Ka/Ks analysis results, we selected 473 orthologous gene pairs with Ka/Ks > 1, which were called divergent orthologous genes, and 1263 orthologous gene pairs with Ka/Ks < 0.1, which were called conserved orthologous genes (Fig. 3). The conserved genes are subject to strong selective constraint during evolution.
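The Ka/Ks thresholds used above can be illustrated with a short sketch. It assumes Ka and Ks have already been estimated for each orthologous pair by an external tool; the function name `classify_pair`, the pair identifiers and the numeric values are hypothetical. Only the two cut-offs (Ka/Ks > 1 and Ka/Ks < 0.1) are taken from the text.

```python
def classify_pair(ka, ks, divergent_above=1.0, conserved_below=0.1):
    """Label an orthologous gene pair by its Ka/Ks ratio.

    ka: nonsynonymous substitution rate; ks: synonymous substitution rate.
    Pairs with ks <= 0 have no defined ratio and are reported as such.
    """
    if ks <= 0:
        return "undefined"
    ratio = ka / ks
    if ratio > divergent_above:
        return "divergent"      # Ka/Ks > 1
    if ratio < conserved_below:
        return "conserved"      # Ka/Ks < 0.1
    return "intermediate"


# Hypothetical (Ka, Ks) estimates for three made-up orthologous pairs.
pairs = {
    "pair_A": (0.30, 0.20),   # ratio 1.5
    "pair_B": (0.01, 0.50),   # ratio 0.02
    "pair_C": (0.10, 0.20),   # ratio 0.5
}

for name, (ka, ks) in pairs.items():
    print(name, classify_pair(ka, ks))
```

Pairs with a Ks of zero are kept separate rather than forced into a class, since the ratio is undefined there; how such pairs were handled in the study is not stated in the text.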
GO enrichment analysis of divergent and conserved orthologous genes
After screening out divergent and conserved orthologous genes, we examined their distribution in Gene Ontology to clarify how sequence divergence between the species is reflected in gene function. Of the 473 divergent orthologous gene pairs identified, 260 were enriched in 1510 GO terms, of which 920 were biological processes, 370 were molecular functions and 220 were cellular components (Supplementary Table S2). Significance analysis of the GO enrichment results (p value < 0.05) showed that 43 GO terms were significantly enriched. As shown in Table 3, 8 terms were related to the activities of various enzymes [steroid dehydrogenase activity (GO:0016229), nuclease activity (GO:0004518), plastocyanin reductase activity (GO:0009496), oxidoreductase activity (GO:0052880), N-acetyltransferase activity (GO:0008080), and endonuclease activity (GO:0004519)].
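Enrichment p values of this kind are often computed with a one-sided hypergeometric test. The text does not state which test was used, so the sketch below is an assumption shown only to make the p value < 0.05 criterion concrete. The function name `hypergeom_enrichment_p` and all counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(k, n, K, N):
    """One-sided hypergeometric enrichment p value, P(X >= k).

    N: annotated genes in the background set
    K: background genes carrying the term of interest
    n: genes in the test set (e.g. divergent orthologues)
    k: test-set genes carrying the term
    """
    upper = min(n, K)
    # Sum the probabilities of seeing k, k+1, ..., upper annotated genes.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


# Hypothetical counts: 20 background genes, 5 with the term,
# 4 genes in the test set, 2 of which carry the term.
p = hypergeom_enrichment_p(k=2, n=4, K=5, N=20)
print(round(p, 4))
```

In practice, the p values for many terms are usually corrected for multiple testing before a significance threshold is applied. Whether a correction was used here is not stated in the text.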
KEGG enrichment analysis of divergent orthologous genes
In organisms, different genes coordinate with each other to perform their biological functions. Significant pathway enrichment can identify the most important biochemical metabolic pathways and signal transduction pathways in which divergent or conserved genes are involved. Our enrichment analysis of the 473 pairs of divergent orthologous genes showed that 117 pairs were enriched in 195 KEGG pathways, with between 1 and 9 genes enriched in each pathway (Supplementary Table S3). The 20 most significant pathways are shown in Fig. 4. Among these 20 pathways, 6 were significantly enriched (p value < 0.05). The most significantly enriched was the cell adhesion molecule pathway (7 divergent orthologous genes), followed by the cGMP-PKG signalling pathway (8 genes), ribosome biogenesis in eukaryotes (5 genes), Parkinson’s disease (8 genes), the Fanconi anaemia pathway (3 genes) and the Alzheimer’s disease pathway (8 genes). Of the 20 most significant KEGG pathways, three were closely related to this study, namely, the cGMP-PKG signalling pathway (map04022), the cAMP signalling pathway (map04024), and the Hippo signalling pathway (map04390).
The eight genes enriched in the cGMP-PKG signalling pathway include the following: CNGB1 is cyclic nucleotide-gated channel 1, whose function is to respond to light-induced changes in intracellular cGMP levels13; the protein encoded by the PPIF gene is a member of the peptidyl-prolyl cis–trans isomerase (PPIase) family14; ADRA1B is the adrenergic receptor α-1B15; GTF2I is a transcription initiation factor; AKT is a serine/threonine kinase16; and KCNMB1 encodes a subunit of the large-conductance, voltage- and calcium-sensitive potassium channel, which underlies the control of smooth muscle tone and neuronal excitability17. PPP1C is the catalytic subunit of myosin light chain phosphatase (MLCP). MLCP and myosin light chain kinase (MLCK) dephosphorylate and phosphorylate the myosin light chain, respectively, which regulates the activation of myosin ATPase and thereby smooth muscle contraction18,19.
The eight genes enriched in the cAMP signalling pathway include the following: ADCY10 is adenylate cyclase 10, which regulates cAMP levels under the action of carbonate ions and calcium ions20; CNGB1 is cyclic nucleotide-gated channel 1, which acts under the influence of calcium ions and CAM21; PPP1C is the catalytic subunit of serine/threonine protein phosphatase 1 (PP1); AKT is a serine/threonine kinase; HCN2 is hyperpolarization-activated cyclic nucleotide-gated potassium channel 222; and PPP1R1B is protein phosphatase 1 regulatory subunit 1B23. PKA has a regulatory effect on bile secretion, insulin secretion, and apical chloride channels.
Finally, five genes were enriched in the Hippo signalling pathway. The BMP8 gene encodes a secreted ligand of the transforming growth factor superfamily of proteins. Transforming growth factor is a multifunctional protein that can regulate the growth and differentiation of various cells, as well as apoptosis and cellular immunity24. Protein phosphatase 1 (PP1) has three catalytic subunits, one of which is the protein encoded by the PPP1C gene. PP1 is a serine/threonine-specific protein phosphatase known to participate in the regulation of many cellular processes, for instance muscle contractility, cell division, glycogen metabolism, and protein synthesis25. The YWHAH gene product belongs to the highly conserved 14-3-3 family of proteins, which mediate signal transduction by binding to phosphoserine-containing proteins and inhibit cell division26. The BIRC5 gene is a member of the inhibitor of apoptosis (IAP) gene family, which encodes negative regulatory proteins that inhibit apoptotic cell death27. The FZD1 gene encodes a member of a protein family that serves as transmembrane receptors in the Wnt signalling pathway, which plays a very important role in the growth and development of animals28.
Source: Ecology - nature.com