
Sampling from four geographically divergent young female populations demonstrates forensic geolocation potential in microbiomes

Cohort demographics

A total of 206 female participants were enrolled in the study and passed our quality control standards. All participants were required to be 18–26 years old (22.5 ± 2.1) and to have been born in, and at the time of the study be living in, one of four geographically distinct regions of the world: Barbados; Santiago, Chile; Pretoria, S. Africa; and Bangkok, Thailand. The regions do, however, differ by an order of magnitude in their geographic spread, as the distance separating the residence neighborhoods of participants within a region ranged from 34 km (Barbados) to 681 km (Pretoria, S. Africa) (Fig. S2). The Chilean and the South African datasets are further divided into two contiguous sub-regions, or neighborhoods, to allow for a micro-geographic analysis. The largest group in the study population self-identified as having Thai heritage (33%), followed by Black African (16%), Afro-Caribbean (14%) and white (14%) descent, although 19% of the Chilean population did not report ethnicity.

Despite the divergent geographies, study participants mostly have similar dietary and lifestyle habits (Table S1). Over half the study population (62%) have a normal BMI, and the mean BMI falls in this range (22.6 ± 5.5). The diets of the different cohorts are also similar: across the total cohort, 78% consume a starch-heavy diet (≥ 4 days a week) of rice, bread and pasta, 66% frequently consume (≥ 4 days a week) vegetables and fruit, and 49% frequently consume dairy products. The study population is split by level of tobacco exposure, with 51% of the population having never smoked, and 43% being exposed to second-hand smoke through living with a smoker. Over half (56%) of the study population own one or more pets.

Stool microbiome

The OTUs identified using the UPARSE pipeline17 were used to compute the alpha diversity of the microbial communities using the Chao1 (species richness) and Shannon (species evenness) indices. The mean Shannon index differs significantly only between Thailand and Chile (FDR < 0.05). For the Chao1 index, richness differs significantly for the Thailand–Chile, Thailand–South Africa, Chile–South Africa and Barbados–South Africa comparisons (FDR < 0.05) (Fig. 1A).
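As an illustration of the two indices named above, the following is a minimal sketch that computes them from a single sample's OTU count vector. It is not the pipeline used in the study: the function names and the example counts are hypothetical, and the bias-corrected form of Chao1 is assumed because it stays defined when a sample has no doubletons.

```python
import math


def shannon_index(counts):
    """Shannon index H = -sum(p_i * ln(p_i)) over OTUs with non-zero counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def chao1_index(counts):
    """Bias-corrected Chao1 richness: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)).

    S_obs is the number of observed OTUs, F1 the number of singletons
    (OTUs seen exactly once) and F2 the number of doubletons.
    """
    observed = sum(1 for c in counts if c > 0)
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))


# Hypothetical OTU counts for one sample (illustrative, not study data).
example_counts = [10, 5, 2, 2, 1, 1, 1, 0]
print("Shannon:", round(shannon_index(example_counts), 3))
print("Chao1:", chao1_index(example_counts))
```

Chao1 depends only on the rare OTUs (singletons and doubletons) on top of the observed richness, whereas the Shannon index weights every OTU by its relative abundance, which is one reason the two indices can flag different country pairs as significantly different.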

Figure 1

Stool alpha diversity: (A) microbial richness and evenness of stool was calculated based on the Chao1 and Shannon index of four different sites. The y-axis represents the alpha diversity unit scale, either Shannon or Chao1. (B) Phylum-level abundance of stool samples, (C) top ten most abundant genera in stool samples.


The three abundant phyla (Actinobacteria, Bacteroidetes, Verrucomicrobia) show significantly different abundance among the four countries (FDR < 0.05) (Fig. 1B). The five most dominant taxa identified among stool microbiota are Bacteroides, Prevotella_9, Faecalibacterium, Alistipes, and unclassified Eubacterium (Fig. 1C). Interestingly, Faecalibacterium, an anti-inflammatory commensal recognized for its importance in maintaining intestinal health (see Miguel et al.46), is observed at significantly higher abundance in South African individuals and lower abundance in Thai individuals (Table S2). The DESeq2 algorithm identified 28 genera that are differentially abundant among the four countries, of which only five are highly abundant in the stool microbiome: Pseudobutyrivibrio, Fusobacterium, Christensenellaceae_R-7_group, Ruminococcus_1 and Escherichia-Shigella. Other important differentially abundant genera are Prevotella, Incertae_Sedis, Megamonas and Enterobacteriaceae_unclassified (Fig. 2). The data suggest that in these populations with relatively similar diets (Table S1), the most geographically distinct taxa (Table S6) are in lower abundance in the stool, representing only 10.4% of the total gut microbiota. Using Pearson’s correlation calculated against the first five principal components (PCs), we examined which lifestyle behaviors influence the composition of stool microbial communities across the entire study population of Barbadian, Chilean, Pretorian and Thai individuals. The composition of stool microbiota across all the populations is most influenced by BMI (PC4 p = 0.018, r2 = 0.029; 3.35% variance).
Within single-region populations, Chilean stool microbiota correlates with having never smoked (PC3 p = 0.0271, r2 = 0.074; 4.02% variance), and Pretorians are the only population whose stool microbiota correlates with BMI categories (PC1 p = 0.0205, r2 = 0.156; 67.62% variance) and with the frequency of eating corn/cornmeal (PC3 p = 0.0077, r2 = 0.196; 4.02% variance). The Thai population’s stool microbiota is correlated with living with a current smoker (PC3 p = 0.012, r2 = 0.093; 5.53% variance) and being an ex-smoker (PC4 p = 0.0097, r2 = 0.0998; 4.56% variance). Stool microbiota of the Barbadian population is not significantly correlated with any of the lifestyle behavioral factors tested.
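The correlation step described above can be sketched as follows, assuming each participant has a score on a given PC and a numerically coded lifestyle variable. The function names and the toy values are hypothetical, the coding of the metadata is an assumption, and the reported "r2" is taken here to be the squared Pearson coefficient.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return s_xy / math.sqrt(s_xx * s_yy)


def r_squared(x, y):
    """Squared Pearson coefficient: share of variance in one variable
    linearly explained by the other."""
    return pearson_r(x, y) ** 2


# Hypothetical PC scores and BMI values for five participants
# (illustrative only, not study data).
pc_scores = [-0.8, -0.1, 0.3, 0.4, 1.1]
bmi_values = [19.5, 22.0, 21.4, 24.8, 27.3]
print("r =", round(pearson_r(pc_scores, bmi_values), 3))
print("r2 =", round(r_squared(pc_scores, bmi_values), 3))
```

Read this way, each reported result combines two quantities: the percentage of total community variance captured by the PC, and the r2 between that PC and the lifestyle variable. A high r2 on a PC that carries little variance therefore describes a small part of the overall community composition.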

Figure 2

Box and whisker plots of the stool genera that are significantly differentially abundant among the four countries.


Oral microbiome

The mean Chao1 index differs significantly between Thailand–Barbados, Thailand–Chile, Thailand–South Africa and Chile–South Africa (FDR < 0.05), whereas the Shannon diversity index differs significantly only between Thailand and Chile (FDR < 0.05) (Fig. 3A). Two abundant phyla, Bacteroidetes and Proteobacteria, show significantly different abundance between countries (FDR < 0.05) (Fig. 3B).
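The FDR thresholds quoted for these pairwise comparisons imply a multiple-testing adjustment of the raw p-values. The text does not name the procedure, so the sketch below assumes the commonly used Benjamini-Hochberg adjustment; the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: this is one standard FDR procedure, not necessarily
    the one applied in the study.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for the six pairwise country comparisons.
raw = [0.001, 0.004, 0.012, 0.030, 0.200, 0.650]
for p, q in zip(raw, benjamini_hochberg(raw)):
    print(f"p = {p:.3f} -> adjusted = {q:.3f}, significant: {q < 0.05}")
```

With four countries there are six pairwise comparisons per index, so a comparison can have a raw p-value below 0.05 and still fail the FDR threshold after adjustment.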

Figure 3

Cheek alpha diversity: (A) microbial richness and evenness of cheek was calculated based on the Chao1 and Shannon index of four different sites. The y-axis represents the alpha diversity unit scale, either Shannon or Chao1. (B) Phylum-level abundance of cheek samples, (C) top ten most abundant genera in cheek samples.


The most dominant taxa identified among oral microbiota are two Prevotellaceae genera, Pasteurellaceae_unclassified, Haemophilus, Streptococcus, Gemella, Veillonella and Neisseria (Fig. 3C), all of which have been documented as among the most abundant in oral microbiota in other populations47. The oral microbiomes also have thirty-five differentially abundant genera (Table S7). Eight of the ten most dominant genera in the oral microbiota (Pasteurellaceae_unclassified, Streptococcus, Gemella, Veillonella, two Prevotellaceae genera, Haemophilus and Neisseria) differ significantly in at least one of the populations (FDR < 0.05) (Fig. 4). As such, taxa with geographically specific signals make up a larger percentage of the total microbiome, on average, in the oral samples (16%) than differentially abundant taxa do in the stool samples (2%).

Figure 4

Box and whisker plots showing the cheek genera that are significantly differentially abundant among the four countries.


We also find that, for the factors tested, lifestyle and behavior have a greater influence on the oral microbiota than on stool microbial composition. As with the stool samples, oral microbiota composition is associated with different lifestyles and behaviors in different populations, with the exception of BMI, which was strongly correlated with oral microbial communities across all four populations using BMI categories: Chile (PC1 p = 0.0085, r2 = 0.103; 71.77% variance), S. Africa (PC1 p = 0.0169, r2 = 0.242; 37.77% variance), Barbados (PC1 p = 0.0155, r2 = 0.174; 46.41% variance) and Thailand (PC2 p = 0.017, r2 = 0.083; 21.83% variance). In addition to BMI, oral microbiota of the Chilean and Thai populations correlated with the frequency of consuming fish (p < 0.05; PC2 p = 0.033, r2 = 0.0710; 14.13% variance and PC3 p = 0.0081, r2 = 0.1029; 10.07% variance, respectively), while oral microbiota composition of the Barbadian population was also strongly correlated with the frequency of eating meat such as beef and pork (PC2 p = 0.0450, r2 = 0.157; 19.01% variance), as well as eating fruits and vegetables (PC4 p = 0.00169, r2 = 0.342; 8.97% variance).

Global geographical variability of oral and stool microbiota

Both oral and stool microbial communities at genus level exhibited distinct geographic variation (i.e., by country of origin) in their taxonomic distribution, though the body site from which the microbial community originated was more discriminatory (Fig. 5). We also identified potential differentially abundant species among the four countries using the usearch “unoise” algorithm to obtain ASVs. Because species-level prediction from short V4-region tags is uncertain, the details are described only in Supplementary Tables S8–S15.

Figure 5

Oral (n = 195) and stool (n = 196) microbiota differences according to body site and geographical location (Barbados, Chile, Thailand and S. Africa). Measured by NMDS using weighted UniFrac distance in stool (PERMANOVA r2 = 0.084, p = 0.001), and oral (PERMANOVA r2 = 0.161, p = 0.001).


Microbiota from the oral cavity can differentiate geographic locations, as shown by both NMDS (Fig. 5) and PERMANOVA, with approximately 16% of the variation between oral microbial communities explained by country of origin. Within the study populations, Chilean oral microbial communities were the most distinct geographically, explaining 17% of the taxa variation, as compared to 9% for Pretorian and 4% for Barbadian oral microbiota. Using only the differentially abundant taxa in the oral microbiome, the country of origin is less explanatory, accounting for only 11% of the variation by PERMANOVA. Country of origin explained less than 8% of the variance in the taxonomic distribution of the stool, with too few differentially abundant taxa to run PERMANOVA on this reduced set.

Since it is possible that the differences could derive not from differences in geographic locations, but instead from differences between the lifestyles of the cohorts, we also examined the effect of the metadata values on the strength of the PERMANOVA signal. For all of the metadata variables in the oral and stool microbiome, a significant signal differentiating the country by PERMANOVA remains even after accounting for the metadata (Table S5). A signal of this strength is not observed using only the metadata or the combined data, suggesting that the geographic signal is strongest. Moreover, for metadata variables previously found to be influential in sculpting the microbiome, such as smoking for the oral microbiome20,48 and BMI for the gut microbiome49,50, the PERMANOVA signal remains strong. Interestingly, the strongest reduction of the signal in the oral microbiome, and a significant reduction in the stool microbiome, is in connection with how much beef or pork an individual eats per week. Previous work on the effect of a carnivorous diet on the oral microbiome was inconclusive6,51,52, though these studies have mostly concentrated on vegan versus omnivore diets.

We also investigated whether the geolocation signal could be amplified either by using differentially abundant taxa or by combining multiple body sites. When only taxa identified as differentially abundant in at least one location compared to the other locations were used, there was an increase in the PERMANOVA signal in both the stool (25%) and oral (54%) microbiome (Fig. S3). However, combining the taxa distribution of oral and stool samples across geography, either by adding the distances or by concatenating the taxa counts when possible, does not increase the geolocation significance of the combined sample (Table S4). Instead, each combined sample averages out to below the significance of the oral signal, suggesting that oral microbiota alone has higher geolocation prediction power than stool or combined body sites.

Intra-region geospatial variation of oral and stool microbiota

To assess the extent of variation of oral and stool microbial communities within a geographical region, the Chilean and Barbadian study populations were each divided into two distinct neighborhood sub-regions, spanning 27.5 to 178 km, based on their residence (Fig. S2). Neighborhood sub-regions were determined by prioritizing geographically discrete and continuous sub-regions with near-equal subject populations, without considering any metadata or sociological differences. The Chilean neighborhoods do not differ significantly in their oral or stool microbiomes as identified by PERMANOVA (Fig. 6). Only one taxon that was differentially abundant between the two sub-regions (Family XI Gemella) was among the top five taxa in the Chilean oral microbiome (Fig. 3); the two differentially abundant taxa from the stool microbiome were less abundant. There were no taxa that were globally differentially abundant. The microbial communities of the Barbadian population had an overall similar level of difference between the neighborhood sub-regions as did the Chilean population, even with a smaller geographical range (27.5 to 32.6 km), though the lower number of subjects does limit the significance of these differences (Fig. S4). No taxa in the Barbados oral samples were identified as significantly differentially abundant, and only one stool taxon (Prevotellaceae Prevotella v9) was (Fig. 1). This taxon is associated with carbohydrate-rich diets53.

Figure 6

Oral (n = 66) and stool (n = 67) microbiota diversity between populations from different neighborhoods (sub-region 1 and sub-region 2) in Santiago, Chile, as shown by NMDS using weighted UniFrac distance (stool: PERMANOVA r2 = 0.026, p = 0.159; oral: PERMANOVA r2 = 0.032, p = 0.089). The boundaries of the neighborhoods are shown in Supplementary Fig. S2B.


However, similar to previous studies of populations living in the same country2,14, when considering the lifestyle behaviors of the individuals resident in each sub-region, some significant differences emerge. Sub-regions 1 and 2 in Santiago, Chile have different economic resources, as is reflected in their cultural and dietary choices54, in addition to the microbiomes. For example, residents in Chilean sub-region 1 more frequently consume fruits and vegetables (p = 0.0027) and have a lower BMI (p = 0.001) than those resident in Chilean sub-region 2, while there are more pet owners in sub-region 2 than in sub-region 1 (p = 0.0098). Within the Barbadian population, residents in neighborhood sub-region 1 more frequently consume fish (p = 0.0014) and have a higher BMI (p = 0.0124). When accounting for the metadata differences, the size of the geolocation effect did not appreciably decrease for any of the sub-region comparisons or any of the body sites. Likewise, the effect size on the microbiome based on the differences in metadata for the groups was usually small, with r2 almost always around 0.01, with the singular exception of the weekly consumption of bread, rice, and pasta between the populations in the two sub-regions in Barbados (r2 = 0.097). It would be interesting to know whether this dietary difference can further be attributed to food costs, and whether it can be used in the process of forensic identification of a victim, as in Chile.


Source: Ecology - nature.com
