in

Characterization of metapopulation of Ellobium chinense through Pleistocene expansions and four covariate COI guanine-hotspots linked to G-quadruplex conformation

[adace-ad id="91168"]

Genetic diversity of E. chinense based on COI

The partial fragment of COI, 595 bp in length, was sequenced from 113 E. chinense individuals collected from the eleven sites in South Korea (Table 1). The resultant COI sequences were aligned together with 27 COI individual sequences12,13,14 retrieved from the NCBI GenBank (Table 1). The latter consists of 26 from six collection sites in South Korea and one from a Japanese site. Hence, 140 COI sequences of E. chinense were analyzed, representing 18 collection sites at the nine populations in South Korea and Japan (Table 1). Based on the alignment set (no indels) of these 140 COI sequences (Data S1), we obtained a total of 58 COI haplotypes, of which 43 were singleton, appeared in only a single site. The novel 41 out of 58 COI haplotypes obtained were registered under the GenBank accession nos. MW265437–MW265477 (Table S2). According to the sequence alignment of the 58 COI haplotypes (Fig. S2; Data S2), there were 71 polymorphic sites and 31 parsimoniously informative sites (Fig. 1C), among which four adenine/guanine hotspots at 207, 282, 354, and 420 were ascertained to articulately divide the haplotypes of E. chinense into four meaningful phylogenetic groups: (a) A(207)–A(282)–A(354)–A(420), (b) A–A–G–A, (c) G–A–G–A, and (d) G–G–G–G.

Table 1 List of collection sites and the number of individuals of Ellobium chinense with genetic markers applied to each of the nine populations in South Korea and Japan.
Full size table

Based on the COI haplotype sequence alignment (Fig. S2; Data S2), we reconstructed a ML tree using Ellobium aurisjudae as an outgroup. In the resultant tree topology (Fig. S3), it was confirmed that E. chinense appeared as a monophyletic group, but no distinction between the haplotypes from each geographical population was observed. To define detailed relationships among the COI haplotypes, the outgroup was removed and then an unrooted ML tree (Fig. 1D) was reconstructed. The resultant tree showed two distinctive phylogenetic groups, namely A–A–A–A and the other groups (including at least one G or more in the four positions), regardless of collection localities. The A–A–A–A group included 35 of the 58 COI haplotypes. The others could be divided into the A–A–G–A group (N = 12: ECH11, 12, 15, 16, 18, 23, 27, 28, 32, 33, 36, and 49), the G–A–G–A group (N = 1: ECH35), and the G–G–G–G group (N = 8: ECH01, 07, 19, 29, 41, 45, 48, and 54).

As shown in Table S2 and Fig. S3, ECH01 was a dominant member of the G–G–G–G group with the most individuals (27), which appeared across all the South Korean populations examined here. As shown in Fig. 1D, the A–A–A–A group is likely to be an ancestral type because it was most frequently found in the other species within Ellobiidae (unpublished data) and its haplotype diversity was the highest among the four genetic groups. Given that the G–G–G–G group exhibited much lower haplotype diversity than the A–A–A–A group, and was not observed in any other ellobiid species (unpublished data), it is reasonable to suggest that the G–G–G–G group is a derived rather than an ancestral type. Thus, as shown in Fig. 1D, it is conceivable that unidirectional and stepwise A → G transition events from A–A–A–A to G–G–G–G may have been occurred in E. chinense. Within the A–A–A–A and A–A–G–A groups, parsimoniously informative A → G transition events were found at the sites 120 (ECH12, 15, 16, 18, 23, 38, 40, 44, and 49) and 183 (ECH3, 9, 10, 20, 43, 50, and 55), with a few exceptional cases of G → A at the sites 216 (ECH12, 15, 16, 18, and 23), 372 (ECH12, 15, and 23), and 429 (ECH37, 38, 46, and 47; ECH19 found in the G–G–G–G group).

As indicated in Table 2, the nucleotide diversity (π) is relatively low among the nine populations of E. chinense, ranging from 0.00749 (population BG) to 0.01042 (SC) with an average of 0.00865, whereas the haplotype diversity was very high across these populations, ranging from 0.924 (YG) to 1.000 (SC and JB) with an average of 0.939). All values of Tajima’s D and Fu’s FS were congruently negative, with averages of − 1.87100 (P < 0.05) and − 2.75886 (not statistically significant), respectively, indicating that the signature of demographic expansions recently occurred in E. chinense.

Table 2 The result summary of genetic diversity analyses and neutrality tests performed with 140 COI sequences from Ellobium chinense along the nine populations in South Korea and Japan.
Full size table

Population genetic structure of E. chinense based on COI

As shown in Table S3, the pairwise FST values between nine E. chinense populations were highly low and statistically non-significant, ranging from − 0.21177 (JB and GY) to 0.09091 (JB and JK), indicating the lack of genetic differences among populations. Analysis of molecular variance (AMOVA) for the nine populations of E. chinense in South Korea and Japan (Table S4) revealed that 101.05% of the variance was allocated into the level of individuals within each population, while differentiation between populations (− 1.05%) did not appear to contribute to the overall variance.

Upon the results of TCS network analysis, all the COI haplotypes were connected by forming a single, large network, conforming that no distinct genetic differences existed among populations (Fig. S4a). The G–G–G–G group, with its representative member ECH01, was located rather at a distal position on the TCS network, which implies that these haplotypes may have recently emerged during the expansion of South Korean E. chinense populations. The phylogenetic network analysis (Fig. S4b) also indicated that distinct genetic differences representing the geographical distributions of E. chinense did not exist. Indeed, all haplotypes were grouped into a single broad cluster with overlaps between populations, which is in agreement with the results of the TCS network analysis (Fig. S4a). In addition, the mismatch distribution analysis showed a clearly unimodal curve (Fig. S4c), evidently implying the metapopulation of E. chinense that may have experienced a recent demographic expansion15,16 or through a range expansion with high levels of migration between neighboring demes17,18.

The PCoA shown in Fig. 2a provided additional evidence that, regardless of the geographical distribution of South KoreanE. chinense, there was consistent support for the existence of four distinct phylogenetic groups, i.e., A–A–A–A, A–A–G–A, G–A–G–A, and G–G–G–G (as also shown in Fig. 1). In this analysis, the plausible ancestral A–A–A–A group (with the highest degree of haplotype diversity) was placed mainly around the bottom of the second quadrant, whereas the A–A–G–A group was in the left of the fourth quadrant, the G–A–G–A group (ECH35 only) in the right middle of the fourth quadrant, and the G–G–G–G group (represented by ECH01) was in the first quadrant. The three variation types within the A–A–A–A group, namely A–A–A–G, A–G–A–A, and A–G–A–G, appeared in the ML tree (Fig. 1D; Fig. S3) and were spatially separated from the remaining orthodox members of A–A–A–A, which were located along the Y axis between the first and second quadrants. Based on the results of PCoA (Fig. 2a) and the ML tree (Fig. 1A), we confidently suggest that the four phylogenetic groups of E. chinense arose as a result of serial, stepwise A → G transition events in the following order: A–A–A–A (ancestral) → A–A–Ğ–A → Ğ–A–Ğ–A → G–Ğ–G–Ğ (derived). The other local A → G transition events occurred within the A–A–A–A group (A–A–A–Ğ, A–Ğ–A–A, and A–Ğ–A–Ğ) (Fig. 1D; Fig. S3); these could be considered the fifth spatially separated phylogenetic group of E. chinense, based only on PCoA (Fig. 2a), being located between the phylogenetic groups G–G–G–G and A–A–A–A but apart from A–A–A–A.

Figure 2

Results of principal coordinate analysis (PCoA), covariance of the four guanine hotspots, and plausible G-quadruplex motifs inferred from 58 COI haplotypes found in northwestern Pacific (mainly South Korean) Ellobium chinense; hypothetical ribosome stalling caused by four plausible consecutive G-quadruplex structures in the mRNA of E. chinense COI. (a) Results of PCoA. Representation of the scores on the first two axes (Axis 1 = 23.33%; Axis 2 = 14.68%) from the matrix of genetic distances estimated with the 58 COI haplotypes. Refer to Table 1 for the number of individuals along with the collection site references of the haplotypes; refer to Figs. 1C and S2 for sequence alignment of the 58 COI haplotypes. The colored circles indicate four or five genetic groups regardless of geographical distributions. The black arrows indicate unidirectional, stepwise, A → G transitions (b) Covariance test results for the four adenine/guanine hotspots that may be linked to the four guanine hotspots inferred from 58 COI haplotypes shown in Figs. 1C and S2. In addition, a node in the network represented the position of each sequence, a link represented a covariance position, and the link thickness represented covariance frequency. The detailed data are listed in Data S3. (c) Plausible selected G-quadruplex motifs linked to the four adenine/guanine hotspots on COI. For the 282-site-related motif, the two alternatives are suggested. Refer to Data S3 for the raw data from the G-quadruplex sequence motif search in COI. (d) Hypothetical ribosome stalling in the COI translation process caused by four consecutive G-quadruplex structures with intervals from 39 to 65 bp. Each structure consists of two stacked G-quadruplexes. For the 282-site-linked G-quadruplex structure, two alternative structures are depicted based on the two alternative motif sequences shown in (c). The pictures were edited using Adobe Illustrator v.25.2 (https://www.adobe.com).

Full size image

Covariance of four COI hotspots linked to the conformation of four G-quadruplexes

Similar sequence interval lengths existed between the four adenine/guanine hotspots of COI characterized by unidirectional A → G transitional changes: G-(75 bp)–G-(72 bp)–G-(66 bp)–G. Additionally, the covariance analysis indicated that there was a high degree of covariance among these (see the four thick and bold linkage lines in Fig. 2b): out of 58 haplotypes, 44 were covariate between 207 and 282 sites, 44 between 207 and 420 sites, 41 between 282 and 420 sites, 37 between 207 and 354 sites, 32 between 282 and 354 sites, and 32 between 282 and 420 sites (Data S3 in detail). Given the similar intervals and covariance characteristics, we used QGRS-Conserve to search for sequence motifs that could possibly indicate the formation of G-quadruplex structures in relation to the four hotspots on the CDS of the COI barcoding region. Our results revealed possible sequences associated with the conformation of four different G-quadruplexes represented by each of the four hotspots (Fig. 2c). The first hotspot located at site 207 may be involved in forming a G-quadruplex structure such as G2–N6–G2–N15–G2–N11ĞA; the second hotspot at site 282 may be involved in forming G2–N2–G2–N5–G2–N5ĞT or G2–N2–G2–N4Ğ2–N1–GA; the third hotspot at site 354 may be involved in forming G2–N1–G2–N4–G2–N18–CĞ; and the fourth hotspot, found at site 420, may be involved in the formation of TĞ–N8–G2–N10–G2–N21–CG.

G-Quadruplex structures generally have 1–7 nucleotides between the four guanine-tracks each consisting of two or three guanines, e.g., G2–3–N1–7–G2–3–N1–7–G2–3–N1–7–G2–3; two (for G2) or three (for G3) G-quadruplexes can form a stacked structure. Considering the typical features of G-quadruplex structures, each of the putative motifs searched for G-quadruplex structure formation is likely to exhibit two G-quadruplexes stacked together (Fig. 2c,d). However, the number of nucleotides between the G-tracks is much larger than usual for some sites, and some G tracks consist of only one G (not two or three). Thus, the G-quadruplex structures formed from the putative motifs are expected to be relatively weak. Assuming that all four putative motifs found in our analyses are capable of forming G-quadruplex structures, the four consecutive G-quadruplexes with intervals of 57/60, 39/41, and 65 bp in order might play a role in reducing or inhibiting transcription or translation of COI expression, i.e., they may be an exemplar of downregulation of gene expression; the ribosome stalling process (Fig. 2d).

Genetic diversity of E. chinense inferred from microsatellite markers

The ten polymorphic microsatellite loci were amplified from 54 individuals collected from the four populations (BG, YG, HK, and HS) of E. chinense in South Korea (Table 1), and the PCR products were then sequenced and analyzed to elucidate the level of genetic diversity of E. chinense. Figure 3a shows the microsatellite diversity indices related to genetic diversity for the four E. chinense populations. The number of observed alleles per locus (NA) varied greatly among loci, from 14 (ECHm41) to 29 (ECHm36). The number of effective alleles per locus (NE) ranged from 5.377 (ECHm24) to 11.858 (ECHm40), with the mean of 9.495. Overall, the effective number of alleles (NE = 9.495) was smaller than the mean number of alleles (NA = 13.4). The expected mean heterozygosity (HE) per locus ranged from 0.793 (ECHm24) to 0.913 (ECHm40), while the average value was 0.883. The observed heterozygosity per locus (HO) was from 0.533 (ECHm24) to 0.969 (ECHm40), with a mean value of 0.806. Thus, the expected heterozygosity (HE = 0.883) was larger than the observed heterozygosity (HO = 0.806). The average of the genetic differentiation coefficient FST was estimated relatively low to be 0.038; it ranged from 0.020 (ECHm40) to 0.051 (ECHm27). Table S5 provides more detail on the genetic diversity estimates within each of the four populations (BG, YG, HK, and HS) of E. chinense.

Figure 3

Genetic diversity, population genetic structure, isolation by distance (IBD), and principal component analysis (PCoA) based on 10 selected microsatellite markers and 54 Ellobium chinense individuals collected from South Korea. (a) Summary statistics for the genetic diversity of E. chinense. NA: mean number of alleles, NE: number of effective alleles, HO: observed heterozygosity, HE: expected heterozygosity, FIS: inbreeding coefficient (indicates populations with heterozygote deficit); HWE: loci showing a significant departure from Hardy–Weinberg equilibrium with a global test at the 5% level after a sequential Bonferroni correction (*P < 0.05, **P < 0.01, ***P < 0.001, ns: not significant). (b) Determination of the true number of genetic clusters (K = 2) by Bayesian clustering using the method of Evanno et al.16. K statistics are based on the rate of change in the log probability of data between successive K values. Bayesian clustering results were obtained using the STRUCTURE program48. Refer to Table 1 for information on the four populations: BG, YG, HK, and HS. (c) IBD test results showing that genetic distances are not correlated with geographical distances among the sample collection sites. (d) A two-dimensional plot from the PCoA results showing that any distinct genetic differences did not exist among the four populations of E. chinense in South Korea. The percentage variation attributable to the three principal coordinate axes was 10.66% (Axis 1 = 5.94%; Axis 2 = 4.72%).

Full size image

Population genetic structure of E. chinense based on microsatellite markers

According to Table S6, the estimated FST values, showing the degree of genetic differentiation among pairs of populations, were statistically significant (P < 0.05) only between the populations BG and HS (FST = 0.120) and HK and HS (FST = 0.0184). In contrast, pairwise RST was not significant among populations. Generally, pairwise genetic differentiation among populations was low, indicating that the E. chinense populations were not genetically differentiated from each other.

The distributions of molecular genetic variation among and within populations, as estimated by AMOVA, are shown in Table S7; only 0.85% of the total molecular variation was attributed to interpopulation differentiation, whereas 86.40% of the variation was within populations. This supports the conclusion that significant genetic differences did not exist among the four examined E. chinense populations.

Based on microsatellite markers, the 54 E. chinense individuals were further examined to determine population stratification. The optimal number of clusters, K, according to the method of Evanno et al.19, was 2 (Fig. 3b). However, with this method, it is only possible to infer with confidence about clusters ≥ 2. Indeed, the LnP (D) plot shows a strong drop off in model fit only after K = 2 (data not shown); thus, the ancestry proportions were plotted for each individual given two clusters. All individuals were equally admixed (Fig. 3b) across the range, supporting a single genetic cluster. IBD analysis indicated that the pairwise observed genetic differences were not correlated with the pairwise geographical distances among the four examined E. chinense populations (Fig. 3c). Consistently, the PCoA results (Fig. 3d) showed that all populations were clustered into a single genetic group. Taken together, the Bayesian clustering approach, IBD analysis, and PCoA provided consistent results that indicated the lack of spatial seperation among the four E. chinense populations from South Korea and Japan. These results are coincident with the metapopulation of E. chinense formed through recent expansion, as inferred from COI haplotypes (Figs. 1 and 2).


Source: Ecology - nature.com

Substrate-dependent competition and cooperation relationships between Geobacter and Dehalococcoides for their organohalide respiration

Behavioral traits and territoriality in the symbiotic scaleworm Ophthalmonoe pettiboneae