
Genetic diversity of the chloroplast genome in different plant groups
Because divergence times and rates of chloroplast genome evolution differ among plant groups, the genetic diversity of the chloroplast genome also differs considerably among them. According to morphological and ecological data combined with geological records, there was only one species of Paulownia in the early Tertiary period, and it split into two primitive species, P. kawakamii and P. tomentosa, in the Miocene. The other species of Paulownia arose later through evolution and hybridization [27]. Our results showed that the nucleotide polymorphism (Pi value) of the Paulownia chloroplast genomes was only 0.00066, which is considerably lower than that of many other groups. The Pi value of the chloroplast genomes of 5 Rosa species was 0.00154, about 2.3 times that of Paulownia [28]. The average Pi value of the chloroplast genomes of 6 Ipomoea species was 0.0045, nearly 7 times that of Paulownia [29]. The chloroplast genome of Aristolochia is more polymorphic still, with a Pi value of 0.01717, about 26 times that of Paulownia [20].
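The Pi values compared above are nucleotide diversities, that is, the average proportion of differing sites over all pairs of aligned sequences. As a minimal sketch of that definition (not the pipeline used in this study, whose software is not described in this section), the Python below computes Pi for a toy alignment. The function names and the toy sequences are illustrative assumptions, and sites with gaps or ambiguous bases are simply skipped for each pair.

```python
from itertools import combinations


def pairwise_diff_per_site(a, b):
    """Proportion of differing sites between two aligned sequences.

    Only columns where both sequences carry A, C, G or T are compared;
    gaps and ambiguity codes are skipped (a simplifying assumption).
    """
    compared = 0
    diffs = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "ACGT" and y in "ACGT":
            compared += 1
            if x != y:
                diffs += 1
    return diffs / compared if compared else 0.0


def nucleotide_diversity(seqs):
    """Mean pairwise proportion of differing sites (Pi) for aligned sequences."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("at least two sequences are required")
    return sum(pairwise_diff_per_site(a, b) for a, b in pairs) / len(pairs)


# Toy alignment (made-up data, one sequence per hypothetical species).
toy_alignment = [
    "ACGTACGTAC",
    "ACGTACGTAT",
    "ACGTACGAAC",
]
pi = nucleotide_diversity(toy_alignment)
print(f"Pi = {pi:.5f}")
```

With one sequence per species, as in a comparison of complete chloroplast genomes, every pair is weighted equally, so dividing the Pi values of two groups gives the fold differences quoted in the text.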
Genetic diversity in different chloroplast regions
Genetic polymorphism varies substantially among regions of the chloroplast genome. In most plant groups, the single-copy (SC) regions, comprising the LSC and SSC regions, have higher genetic diversity than the IR regions [30]. In our study, the Pi values of SSC and LSC in the Paulownia chloroplast genomes were 0.00104 and 0.00089, respectively, both of which were much higher than the Pi value (0.00012) in the IR regions. Similar results have been found in other plant groups. The Pi values of LSC and SSC in the chloroplast genomes of Aristolochia were 0.02182 and 0.03114, respectively, which were also much higher than the Pi value (0.00411) in the IR regions [20]. The difference in genetic diversity among regions of the chloroplast genome also appears at the family level. The IR regions of Apiaceae species were far more conserved than the SC regions, with an average Pi value of 0.002 for the former and 0.009 for the latter [31]. The percentage of nucleotide variation in the SC sequences (12.7%) was also higher than that in the IRs (4.14%) in the chloroplast genomes of 6 Adoxaceae species [32]. However, the opposite was found in some groups. For example, in Caprifoliaceae chloroplast genomes, the percentage of nucleotide variation in the SC regions (17.61%) was slightly lower than that in the IR regions (21.25%) [32].
The coding region of the chloroplast genome is more conserved because of functional constraints; therefore, its genetic diversity is lower than that of the noncoding region. In Paulownia chloroplast genomes, the genetic diversity of the noncoding region (Pi = 0.00102) was much higher than that of the coding region (Pi = 0.00033), which is consistent with results from other groups, although the ratio differs. The genetic polymorphism of the noncoding region was 3.1 times that of the coding region in Paulownia, 3.9 times in six Adoxaceae species, 3.5 times in eight Caprifoliaceae species and 2.4 times in six Ipomoea species [29,32]. Because their abundant nucleotide variation provides rich genetic information, noncoding regions are often used to analyze the phylogenetic relationships of species and to investigate plant evolution and colonization [33,34,35]. Many studies have shown that genetic diversity also differs greatly among noncoding regions of the chloroplast genome, and the regions with the greatest variation are usually called hotspot regions [20]. Hotspot regions vary among plant groups. Dong et al. compared the chloroplast genomes of 29 plant species from 12 genera and identified 19 noncoding regions with high variability, of which rpl32-trnL and trnH-psbA had the highest genetic variation [14]. The most variable noncoding regions in six Adoxaceae chloroplast genomes included trnH-GUG-psbA, trnR-UCU-atpA, trnC-GCA-petN, ycf3-trnS-GGA, and trnL-UAA-trnF-GAA [32], and trnN-GUU-ndhF is the hotspot region in Capsicum [36]. The regions with the highest percentage of sequence variation in Echinacea were ccsA-trnL-UAG, psbI-trnS-GCU, rpl32-ndhF, trnT-UGU-trnL-UAA and petN-psbM [19]. In three closely related East Asian wild roses, matK-trnK, psbI-trnS-trnG, rps16-trnG, rpoB-trnC and rps4-trnT were the most divergent intergenic regions, with Pi values exceeding 0.006 [28].
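Hotspot regions of this kind are commonly located by scanning an alignment with a sliding window and recording Pi in each window. The Python below is a minimal sketch of that idea, not the procedure used in this study: the function names, the window and step sizes, and the toy alignment are all illustrative assumptions, and for brevity every column is compared, so gaps would count as differences.

```python
from itertools import combinations


def window_pi(seqs, start, end):
    """Mean pairwise proportion of differing sites in columns [start, end)."""
    pairs = list(combinations(seqs, 2))
    width = end - start
    total = 0.0
    for a, b in pairs:
        diffs = sum(x != y for x, y in zip(a[start:end], b[start:end]))
        total += diffs / width
    return total / len(pairs)


def sliding_window_pi(seqs, window, step):
    """Return a list of (window_start, Pi) tuples along an alignment."""
    length = len(seqs[0])
    if len(seqs) < 2 or any(len(s) != length for s in seqs):
        raise ValueError("need two or more sequences of equal aligned length")
    return [
        (start, window_pi(seqs, start, start + window))
        for start in range(0, length - window + 1, step)
    ]


# Toy alignment (made-up data): the first half is invariant,
# the second half carries all of the substitutions.
toy_alignment = [
    "AAAAAAAAAACCCCCCCCCC",
    "AAAAAAAAAACCTCCGCCCC",
    "AAAAAAAAAACCCCCGCCAC",
]
profile = sliding_window_pi(toy_alignment, window=10, step=5)
hotspot_start, hotspot_pi = max(profile, key=lambda item: item[1])
print(profile)
print("most variable window starts at column", hotspot_start)
```

Mapping the highest-scoring windows back to the genome annotation then indicates which intergenic spacers or genes they fall in. Window and step sizes are a design choice: larger windows smooth the profile but can blur short hotspots.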
The degree of variation also differed considerably among chloroplast protein-coding regions. Some coding regions, such as ycf1, ndhF, rbcL, and matK, show high variability in most plant groups and are often used for barcoding [14]. Other coding regions, such as trnK, rpl22, ndhI, clpP, and rps16, show high polymorphism only in some groups [14,32]. In the chloroplast genomes of Paulownia, the high-polymorphism coding regions included rpl36, rps12, rps11, rpl16, and ycf3, most of which encode ribosomal proteins.
In short, although many universal primers for chloroplast DNA are in use, hotspot regions differ among plant groups, so the overall variation in the chloroplast genome of the target group should be surveyed before particular DNA fragments are selected for further research. The hypervariable loci found in Paulownia in this study, in both coding and noncoding regions, provide abundant variation that can be used to identify Paulownia species and to study species differentiation, population genetics and phylogeography.
Gene selective analysis
Chloroplasts are the organelles that carry out photosynthesis in green plants and are the most abundant energy converters on Earth. Some of the enzymes and structural proteins within chloroplasts are encoded by genes of the chloroplast genome [24]. During chloroplast genome evolution, most genes were subject to purifying selection because of functional constraints; some genes were involved in adaptation to the environment and underwent positive selection, while others evolved neutrally. By calculating the ratio of dN to dS (dN/dS) for the coding genes with genetic variation, we identified 3 genes (rps2, rbcL and ndhG) under positive selection in the chloroplast genomes of Paulownia, and each of the three performs a different physiological function. A small number of positively selected genes has also been reported in some other plant groups. Five plastid genes (rbcL, clpP, atpF, ycf1 and ycf2) were subject to positive selection in 7 Panax species [30], and only three chloroplast genes (clpP, ycf1 and ycf2) underwent positive selection in the chloroplast genomes of seven Sileneae species [37]. In many other groups, multiple chloroplast genes show signs of positive selection. One-third of the chloroplast genes in PACMAD grasses, 27 genes in the genus Iodes, 19 genes in Dipsacales species, and 10 genes in Gossypium evolved under positive selection [25,32,38,39]. The genes identified as selected may have undergone some functional diversification during their evolutionary history.
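The dN/dS ratio (often written as omega) is read against a value of 1: a ratio below 1 suggests purifying selection, a ratio near 1 suggests neutral evolution, and a ratio above 1 suggests positive selection. The Python below is a minimal sketch of that reading rule only. It assumes dN and dS have already been estimated by other software; the function name, the `neutral_band` parameter and the gene labels and values are hypothetical, and published analyses usually add a statistical test rather than relying on the raw ratio.

```python
def classify_selection(dn, ds, neutral_band=0.0):
    """Classify a gene by its dN/dS ratio.

    dn, ds       -- nonsynonymous and synonymous substitution rates
    neutral_band -- half-width around 1.0 treated as neutral
                    (hypothetical convenience parameter, default exact)
    """
    if ds <= 0:
        # The ratio is undefined when there are no synonymous substitutions.
        return "undefined"
    omega = dn / ds
    if omega > 1.0 + neutral_band:
        return "positive"
    if omega < 1.0 - neutral_band:
        return "purifying"
    return "neutral"


# Made-up (dN, dS) values for hypothetical genes, for illustration only.
example_rates = {
    "geneA": (0.020, 0.010),
    "geneB": (0.001, 0.010),
    "geneC": (0.010, 0.010),
    "geneD": (0.010, 0.000),
}
labels = {gene: classify_selection(dn, ds) for gene, (dn, ds) in example_rates.items()}
print(labels)
```

Genes with dS equal to zero are reported as undefined rather than forced into a class, because very low divergence among closely related species can leave a gene without any synonymous substitutions.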
Among the selected chloroplast genes in Paulownia, rbcL encodes the large subunit of RuBisCO, which plays an important role in plant photosynthesis. Previous studies have shown that rbcL is often under positive selection because it is a target of selection related to changes in temperature, drought and carbon dioxide concentration [24,30,32,39]. The rbcL gene is therefore a plausible target of positive selection during the evolution of Paulownia. The ndhG gene is another selected gene in Paulownia. In higher plants, chloroplast NAD(P)H dehydrogenase can protect plants from photoinhibition or photooxidative stress caused by strong light and can alleviate the decrease in photosynthetic rate and the growth delay caused by drought [40,41]. This enzyme has important functions and is composed of many subunits. Because of adaptation to the environment, some of the genes encoding these subunits (ndh genes) are involved in adaptive evolution and exhibit positive selection [25,38,42]. For example, in Australian Citrus, ndhF showed positive selection associated with adaptation to hot and dry climates [21,43], and ndhG was also subject to positive selection in Iodes [38]. The positive selection signal of ndhG in Paulownia might result from adaptation to different environments, because the climate differs among the areas where different Paulownia species grow.
Phylogenetic relationships of Paulownia species
Frequent hybridization among Paulownia species has caused widespread genetic introgression, which complicates the phylogenetic relationships within the genus [11]. Although these relationships have been investigated using morphological, structural, physiological, biochemical and genetic information, a reliable phylogenetic tree for Paulownia species has not been established. Using complete chloroplast genome information, we constructed a highly reliable phylogenetic tree of Paulownia. In our study, the genus Paulownia was monophyletic, and its eight species clustered into two clades. P. coreana, P. tomentosa and P. kawakamii formed one clade, while the five other species of the genus formed another clade. Our results were generally consistent with those obtained from the morphological traits of Paulownia. Fan selected 22 independent traits for a comparative analysis of Paulownia species [9]. Based on Q-type clustering of these morphological traits, Fan concluded that P. elongata, P. catalpifolia and P. fortunei clustered together, forming a white-flowered Paulownia group with other species, while P. tomentosa and P. kawakamii were placed in another Paulownia group. Some of our results are also supported by studies based on molecular data. For example, by analyzing random amplified polymorphic DNA (RAPD) data, Lu et al. placed P. fargesii, P. australis, P. catalpifolia and P. fortunei in one group [44].
The systematic positions of P. fargesii and P. australis have long been the most controversial issue. Fan's study indicated that P. fargesii, P. tomentosa and P. kawakamii clustered into one clade, while P. australis formed a separate clade [9]. The phylogenetic relationship established by Mo based on inter simple sequence repeat (ISSR) data also placed P. fargesii and P. australis in two different groups [10]. However, on the basis of morphological traits, Xiong et al. proposed that P. fargesii and P. australis were closely related to P. tomentosa and P. kawakamii, and classified all four species into one group [45]. In our study, P. fargesii and P. australis form a large clade together with P. elongata, P. catalpifolia and P. fortunei with high bootstrap support.
Based on the above analysis, it is clear that, in the genus Paulownia, P. coreana, P. tomentosa and P. kawakamii form one evolutionary branch, while P. fortunei, P. elongata and P. catalpifolia belong to another branch. In addition, our study resolves the most controversial systematic positions, those of P. fargesii and P. australis, which fall within the second branch.
Source: Ecology - nature.com