
The application of a CART model for forensic human geolocation using stable hydrogen and oxygen isotopes

The isotopic spread for each study site

The overall linear relationships between δ2H and δ18O values for hair (n = 81) and toenails (n = 39), respectively, were (Fig. 2):

$$\delta^{2}\text{H}_{\text{hair(VSMOW)}} = 0.89 \times \delta^{18}\text{O}_{\text{hair(VSMOW)}} - 86.16,\; \text{R}^{2} = 0.19,\; p < 0.01$$

(1)

$$\delta^{2}\text{H}_{\text{toenail(VSMOW)}} = 0.15 \times \delta^{18}\text{O}_{\text{toenail(VSMOW)}} - 91.69,\; \text{R}^{2} = 0.00,\; p = 0.69$$

(2)
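Fits of the form in Eqs. (1) and (2) can be reproduced with ordinary least squares. The following is a minimal sketch, assuming ordinary least squares was the fitting method (the text reports only the fitted lines); the function name `ols_fit` and the input values in the usage lines are hypothetical and are not data from this study. It returns the slope, intercept, and R2 only; the p-value is not computed.

```python
def ols_fit(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept.

    Returns (slope, intercept, r_squared). Illustrative helper only.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (>= 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products about the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a single predictor, R^2 equals the squared Pearson correlation.
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical delta-18O (x) and delta-2H (y) values, for illustration only.
d18o = [9.1, 10.4, 11.0, 12.3, 13.5]
d2h = [-80.2, -76.9, -78.4, -74.0, -75.1]
slope, intercept, r2 = ols_fit(d18o, d2h)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, R2={r2:.2f}")
```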

Figure 2

δ2H and δ18O values (‰) of all samples for both hair (δ2H: n = 81, δ18O: n = 82) and toenails (δ2H and δ18O: n = 39). The solid black line represents the Global Meteoric Water Line (GMWL) [δ2H = 8 × δ18O + 10] and is included in the graph for comparison purposes. The regression lines between oxygen and hydrogen values for hair [δ2Hhair(VSMOW) = 0.89 × δ18Ohair(VSMOW) − 86.16, R2 = 0.19, p < 0.01] and toenails [δ2Htoenail(VSMOW) = 0.15 × δ18Otoenail(VSMOW) − 91.69, R2 = 0.00, p = 0.69] are also indicated as solid lines in their respective colors. A statistical outlier, identified using the Mahalanobis distance metric55, is marked by a black circle.


In hair, δ2H and δ18O values were most positive for individuals residing in Site 4 and most negative for those in Site 1 (Table 2, Fig. 3). For toenails, the most positive δ2H and δ18O values were also observed for individuals from Site 4; however, the most negative δ2H and δ18O values were observed for Site 1 and Site 3, respectively. This result is notable because Site 3, Iqaluit, was expected to produce the lowest δ2H and δ18O values in both hair and toenails, given that it is located at a much higher latitude than the other sites, including Site 1 of Metro Vancouver (see Supplementary Fig. S1 online). However, this pattern was observed for Iqaluit only in the δ18O values of toenails.

Table 2 Descriptive statistics of δ2H and δ18O in hair and toenails grouped by study sites.
Figure 3

A box plot of hair and toenail data for each study site. Site 1: Vancouver, Site 2: Orillia, Site 3: Iqaluit, Site 4: Wolfville. The black dots represent outliers within each group for either δ2H or δ18O values; these differ from the statistical outlier identified when δ2H and δ18O values are considered jointly using the multivariate Mahalanobis distance55.


MANOVA results (differences in δ2H and δ18O in tissues between study areas)

An outlier was detected for one hair sample (H2), with δ2H = − 58.3‰ and δ18O = 6.4‰, using the Mahalanobis distance55. When MANOVA was run on the original dataset, including the outlier, mean δ2H and mean δ18O values of hair differed between the four sampling sites [F(3, 77) = 14.07, p < 0.01] and [F(3, 77) = 7.43, p < 0.01], respectively. MANOVA run on the dataset with the outlier removed gave a similar result: δ2H and δ18O values of hair differed across the study sites at [F(3, 76) = 22.093, p < 0.01] and [F(3, 76) = 6.90, p < 0.01], respectively. Thus, the presence of the outlier did not influence the overall results for hair. Mean δ2H and mean δ18O values in toenails also differed between the four sampling sites as measured by the MANOVA test [F(3, 35) = 8.26, p < 0.01] and [F(3, 35) = 6.34, p < 0.01], respectively.
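The multivariate outlier screening mentioned above can be sketched for the two-variable case (δ2H, δ18O). This is a minimal illustration that assumes the sample covariance matrix and a chi-square cutoff with 2 degrees of freedom; the text does not state which cutoff was used, so the 0.975 quantile in the usage lines, like the function names and the data points, is an assumption and not the authors' procedure.

```python
import math


def mahalanobis_sq_2d(points):
    """Squared Mahalanobis distance of each (x, y) point from the sample mean."""
    n = len(points)
    if n < 3:
        raise ValueError("need at least 3 points")
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # Entries of the sample covariance matrix S (n - 1 denominator).
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    det = sxx * syy - sxy * sxy
    if det == 0:
        raise ValueError("singular covariance matrix")
    distances = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # d^2 = v^T S^-1 v, with S^-1 = (1/det) * [[syy, -sxy], [-sxy, sxx]].
        distances.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return distances


def chi2_cutoff_2df(p):
    """Quantile of the chi-square distribution with 2 degrees of freedom.

    For 2 df the CDF is 1 - exp(-x / 2), so the quantile has a closed form.
    """
    return -2.0 * math.log(1.0 - p)


# Hypothetical (delta-2H, delta-18O) pairs; not data from this study.
samples = [(-78.0, 10.1), (-80.5, 9.4), (-76.2, 11.0), (-79.1, 10.6),
           (-77.4, 9.9), (-81.0, 10.3), (-75.9, 10.8), (-79.8, 9.7),
           (-78.6, 10.4), (-77.0, 10.0), (-80.1, 10.9), (-58.3, 6.4)]
cutoff = chi2_cutoff_2df(0.975)  # assumed cutoff, roughly 7.38
d2 = mahalanobis_sq_2d(samples)
flagged = [i for i, v in enumerate(d2) if v > cutoff]
print(flagged)
```

With small samples the squared distance is bounded above by (n − 1)²/n, so a fixed chi-square cutoff may never be exceeded when n is very small.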

Results from the Tukey post-hoc test indicated that δ2H values in hair samples collected from Site 1 differed significantly from those collected from Site 2 (p < 0.01) and from Site 4 (p < 0.01) (see Supplementary Table S2 online). Site 3, on the other hand, was more similar to Site 1; however, it was distinct from Site 4 (p < 0.05). Hair δ18O values also differed significantly between Sites 1 and 2, and between Sites 1 and 4. For toenails, samples from Site 1 showed δ2H values that were distinct from all other sites, whereas δ18O values in samples from Site 4 differed significantly from the other three sites.

Human tissue and drinking water

δ2H and δ18O values of tap water from Wolfville, NS were − 61.8‰ and − 9.4‰, respectively. These values also apply to all cities in Site 4, as they are all supplied by the same Cornwallis watershed42,43. For the remaining study sites, isotopic data were collected from existing publications. Actual tap water data were retrieved for all cities in Site 1 from Ueda and Bell’s7 extensive tap water study covering the entire regional district of Metro Vancouver, BC (Table 3). The tap water value for the city of Barrie, ON in Site 2 was downloaded from the Waterisotope Database46. Although the source water information was not provided, the value was assumed to be adequate because Barrie is mainly supplied by a single water source, Kempenfelt Bay29, so this tap water sample most likely came from the main drinking water source.

Table 3 δ2H and δ18O values of tap water for each location.

Surface water data were retrieved for locations where actual tap water data could not be found. Isotopic data of the Black River near Washago, ON56 were chosen as the most representative data for Lake Couchiching, as no isotopic data on water samples collected directly from Lake Couchiching were found. Another city in Site 2, Gravenhurst, ON, primarily draws drinking water from Lake Muskoka. Untreated water from Lake Muskoka is drawn near Brydon’s Bay for treatment and distribution to residents across Gravenhurst. Thus, data from the drainage area identified as MA3BA in James et al.’s57 study, which covered the entire area of Brydon’s Bay, were judged to best represent drinking water isotope values for Gravenhurst. Finally, the Online Isotope Precipitation Calculator was used to estimate drinking water values for the remaining cities of Horseshoe Valley, ON and Midland, ON in Site 2, and Iqaluit, NU in Site 3.

The drinking water isotope data allowed the relationship between stable hydrogen and oxygen isotopes in modern human tissues and drinking water to be determined. The linear regression between δ2H of hair (δ2Hhair) and drinking water (δ2Hdw) for all study sites with drinking water values (n = 81) was (Fig. 4):

$$\delta^{2}\text{H}_{\text{hair}} = 0.15 \times \delta^{2}\text{H}_{\text{dw}} - 65.17,\; \text{R}^{2} = 0.18,\; p < 0.01$$

(3)

Figure 4

A plot of (a) δ2H values and (b) δ18O values for human tissues and local drinking water, and (c) δ2H values and (d) δ18O values for human tissues and precipitation data as obtained from the Online Isotope Precipitation Calculator58. Colours indicate tissue type. Lines represent the linear relationships between the stable isotope compositions of human tissues and drinking water. The dotted lines show the relationships for all samples, and the solid lines show relationships for all samples excluding those from Iqaluit. The Iqaluit samples were treated as outliers, given that their stable hydrogen and oxygen isotope values were higher than expected, and local drinking water values were estimated using the Online Isotope Precipitation Calculator58 rather than measured in Iqaluit tap water samples. The exclusion of Iqaluit samples generally increased the R2 value for most equations.


δ2H of toenails (δ2Htoenails) and drinking water gave a relationship of:

$$\delta^{2}\text{H}_{\text{toenails}} = 0.05 \times \delta^{2}\text{H}_{\text{dw}} - 86.40,\; \text{R}^{2} = 0.03,\; p = 0.32$$

(4)

δ18O of hair (δ18Ohair) and drinking water (δ18Odw) gave a relationship of:

$$\delta^{18}\text{O}_{\text{hair}} = 0.20 \times \delta^{18}\text{O}_{\text{dw}} + 12.61,\; \text{R}^{2} = 0.02,\; p = 0.21$$

(5)

Finally, δ18O of toenails (δ18Otoenails) and drinking water gave a relationship of:

$$\delta^{18}\text{O}_{\text{toenails}} = 0.50 \times \delta^{18}\text{O}_{\text{dw}} + 13.99,\; \text{R}^{2} = 0.20,\; p < 0.05$$

(6)

Tissue values from all study sites were also regressed against Online Isotope Precipitation Calculator (OIPC) values that were calculated from latitude, longitude, and altitude data (Fig. 4):

$$\delta^{2}\text{H}_{\text{hair}} = 0.12 \times \delta^{2}\text{H}_{\text{OIPC}} - 67.41,\; \text{R}^{2} = 0.08,\; p < 0.01$$

(7)

$$\delta^{2}\text{H}_{\text{toenails}} = -0.01 \times \delta^{2}\text{H}_{\text{OIPC}} - 91.10,\; \text{R}^{2} = 0.00,\; p = 0.89$$

(8)

$$\delta^{18}\text{O}_{\text{hair}} = 0.07 \times \delta^{18}\text{O}_{\text{OIPC}} + 11.08,\; \text{R}^{2} = 0.00,\; p = 0.70$$

(9)

$$\delta^{18}\text{O}_{\text{toenails}} = 0.58 \times \delta^{18}\text{O}_{\text{OIPC}} + 14.91,\; \text{R}^{2} = 0.24,\; p < 0.01$$

(10)

Actual tap water data could not be retrieved for Iqaluit, and higher-than-expected stable hydrogen and oxygen isotope values were measured for individuals from Iqaluit. Because it was unclear whether the OIPC-generated values were appropriate estimates for Iqaluit drinking water, these samples were removed. The relationships between human tissue values and drinking water values with Iqaluit samples excluded from the analysis were (Fig. 4):

$$\delta^{2}\text{H}_{\text{hair}} = 0.32 \times \delta^{2}\text{H}_{\text{dw}} - 53.69,\; \text{R}^{2} = 0.28,\; p < 0.01$$

(11)

$$\delta^{2}\text{H}_{\text{toenails}} = 0.35 \times \delta^{2}\text{H}_{\text{dw}} - 64.99,\; \text{R}^{2} = 0.35,\; p < 0.01$$

(12)

$$\delta^{18}\text{O}_{\text{hair}} = 1.10 \times \delta^{18}\text{O}_{\text{dw}} + 21.57,\; \text{R}^{2} = 0.1,\; p < 0.01$$

(13)

$$\delta^{18}\text{O}_{\text{toenails}} = 0.55 \times \delta^{18}\text{O}_{\text{dw}} + 14.53,\; \text{R}^{2} = 0.05,\; p = 0.22$$

(14)

The corresponding relationships with OIPC values, again excluding Iqaluit samples, were (Fig. 4):

$$\delta^{2}\text{H}_{\text{hair}} = 0.60 \times \delta^{2}\text{H}_{\text{OIPC}} - 33.24,\; \text{R}^{2} = 0.2,\; p < 0.01$$

(15)

$$\delta^{2}\text{H}_{\text{toenails}} = 0.60 \times \delta^{2}\text{H}_{\text{OIPC}} - 49.86,\; \text{R}^{2} = 0.2,\; p < 0.05$$

(16)

$$\delta^{18}\text{O}_{\text{hair}} = 1.20 \times \delta^{18}\text{O}_{\text{OIPC}} + 22.59,\; \text{R}^{2} = 0.06,\; p < 0.05$$

(17)

$$\delta^{18}\text{O}_{\text{toenails}} = 2.13 \times \delta^{18}\text{O}_{\text{OIPC}} + 30.87,\; \text{R}^{2} = 0.2,\; p < 0.01$$

(18)

The R2 values increased for most equations when Iqaluit samples were removed; the exception was δ18O in toenails, for which R2 decreased against both drinking water (Eqs. 6 and 14) and OIPC values (Eqs. 10 and 18). Overall, stable hydrogen and oxygen isotope values in modern human tissues are weakly related to those in drinking water as measured by tap water. With all samples included, the R2 value for δ18O in toenails was slightly higher when the OIPC precipitation data were used instead of tap and surface water values (Eqs. 6 and 10); however, it remained low at roughly R2 = 0.2. Therefore, there is no clear evidence from this dataset that a latitudinal gradient exists for hydrogen and oxygen values in human tissues for Canada as measured by the stable isotope values of drinking water.
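As a worked illustration of how such a regression would be applied (this calculation is not reported in the study), substituting the Wolfville tap water value of δ2Hdw = − 61.8‰ into Eq. (11) gives an expected hair value of about − 73.5‰:

$$\delta^{2}\text{H}_{\text{hair}} = 0.32 \times (-61.8) - 53.69 = -19.776 - 53.69 \approx -73.5$$

Because Eq. (11) explains only about 28% of the variance (R2 = 0.28), individual hair values can be expected to scatter widely around this estimate.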

The CART model

The first CART decision tree model was built for hair samples with stable hydrogen and oxygen isotope values (Model 1) (Fig. 5a). The split at node 1 was determined by hydrogen values: samples with δ2Hhair values less than − 82‰ were predicted as Site 1. Samples with δ2Hhair > − 82‰ were split further, and those with δ2Hhair values less than − 73‰ were initially classified as Site 2. These samples were then split again into either Site 2 (δ2Hhair ≥ − 76‰) or Site 4 (δ2Hhair < − 76‰). Hair samples classified as Site 4 at this stage were further classified into Site 2 or Site 4 depending on their δ18O values. Finally, all samples with δ2Hhair > − 73‰ were classified as Site 4. No samples could be classified as originating from Site 3.

The second CART model was built for stable hydrogen and oxygen isotopes of toenails (Model 2) (Fig. 5b). The model included only two decision nodes. The first predictor variable was the δ2Htoenail value: samples with values less than − 93‰ were predicted to be from Site 1. For toenail samples with hydrogen values greater than − 93‰, oxygen values were used to determine whether they could be classified as Site 2 or Site 4. Samples with δ18Otoenail values less than 9.6‰ were classified as Site 2, and those with values greater than 9.6‰ were predicted as Site 4. No samples were predicted to be from Site 3 purely from stable hydrogen and oxygen isotopes in toenails.

Finally, the third model used stable hydrogen and oxygen isotope values in both hair and toenail samples (Model 3) (Fig. 5c). Model 3 selected toenail values as the best attributes for classification, which indicates that toenail isotope values are the better predictor when both hair and toenail samples are available for analysis from Sites 1–4. The resulting tree was similar to that of Model 2.
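The two decision nodes described for the toenail tree (Model 2) can be written out as a short function. This is a sketch of the thresholds as stated in the text, not the fitted model object; the function name is hypothetical, and the handling of values exactly equal to a threshold (sent to the "greater than" branch here) is an assumption, since the text does not specify it.

```python
def classify_toenail_model2(d2h_toenail: float, d18o_toenail: float) -> int:
    """Predict the study site (1, 2 or 4) from toenail isotope values in per mil.

    Thresholds follow the description of Model 2. Ties at a threshold go to
    the 'greater than' branch, which is an assumption.
    """
    # Node 1: hydrogen values below -93 per mil are predicted as Site 1.
    if d2h_toenail < -93.0:
        return 1
    # Node 2: oxygen values separate Site 2 (< 9.6 per mil) from Site 4.
    if d18o_toenail < 9.6:
        return 2
    return 4


# Hypothetical inputs, for illustration only.
print(classify_toenail_model2(-95.2, 8.1))   # Site 1 branch
print(classify_toenail_model2(-90.0, 9.0))   # Site 2 branch
print(classify_toenail_model2(-90.0, 10.2))  # Site 4 branch
```

Note that no path through these rules returns Site 3, which matches the statement that no samples were predicted to be from Site 3.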

Figure 5

Decision trees developed from both δ2H and δ18O values of (a) hair [Model 1, trained with n = 65], (b) toenails [Model 2, trained with n = 32] and (c) of both hair and toenails [Model 3, trained with n = 28]. The predicted study site numbers are shown on the first row within each bubble. The proportions of samples in each node are shown as decimals for Sites 1, 2, 3, 4, respectively. The percentages indicate the proportion of samples within each sub-partition.


Confusion matrices (Table 1) were constructed for all three models to evaluate their classification performance. Of the three models, Model 3 proved to be the most accurate, with an overall accuracy of 71.4% (see Supplementary Fig. S2 online). The performance evaluation summary, including sensitivity, specificity, positive predictive value, and negative predictive value for all three models, is provided in Supplementary Table S3 online.
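The performance measures named above can all be derived from a multi-class confusion matrix by treating each site in turn as the positive class. The sketch below assumes rows are actual sites and columns are predicted sites; the matrix values and function names are hypothetical and do not reproduce the study's results. A measure whose denominator is zero (for example, the positive predictive value of a site that is never predicted) is returned as `None`.

```python
def safe_div(numerator, denominator):
    """Return numerator / denominator, or None when the ratio is undefined."""
    return numerator / denominator if denominator else None


def class_metrics(cm, k):
    """One-vs-rest metrics for class index k (rows = actual, cols = predicted)."""
    total = sum(sum(row) for row in cm)
    tp = cm[k][k]
    fn = sum(cm[k]) - tp                  # actual k, predicted as another class
    fp = sum(row[k] for row in cm) - tp   # another class, predicted as k
    tn = total - tp - fn - fp
    return {
        "sensitivity": safe_div(tp, tp + fn),
        "specificity": safe_div(tn, tn + fp),
        "ppv": safe_div(tp, tp + fp),
        "npv": safe_div(tn, tn + fn),
    }


def overall_accuracy(cm):
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total


# Hypothetical 4-site confusion matrix (Sites 1-4), for illustration only.
cm = [
    [4, 1, 0, 0],
    [0, 3, 0, 1],
    [1, 0, 0, 1],
    [0, 1, 0, 3],
]
print(overall_accuracy(cm))
print(class_metrics(cm, 2))  # third class is never predicted, so PPV is None
```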

Intra-individual differences

Both hair and toenail samples were retrieved from 35 of the 86 individuals. The paired difference between δ2H values in hair and toenails of the same individual was tested using the Wilcoxon signed-rank test for non-normal data, as the dataset failed the Shapiro–Wilk normality test at the α = 0.05 significance level. Significant differences were found between δ2H values of hair (n = 35, mean = − 78.0‰, s.d. = 3.06) and toenails (n = 35, mean = − 90.9‰, s.d. = 3.27) from the same individual (p < 0.05). The paired t-test was used to compare δ18O values of hair and toenails from the same individual, as this dataset was normally distributed. The results indicated no significant difference between δ18O values of hair (n = 35, mean = 9.7‰, s.d. = 1.22) and toenails (n = 35, mean = 8.2‰, s.d. = 2.10) at the 0.05 significance level, t(34) = 1.92, p > 0.05. Overall, δ2H values in hair were higher than those of toenails from the same individual by 13.0‰ on average, with a standard deviation of 8.4‰. For δ18O, the average difference was 1.5‰ with a standard deviation of 4.6‰ (Fig. 6).
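The paired t statistic for δ18O can be checked against the reported summary values. The sketch below uses the standard formula t = mean(d) / (s.d.(d) / √n) with hypothetical function names; it computes the statistic and degrees of freedom only, not the p-value. Using the reported mean difference (1.5‰), standard deviation of differences (4.6‰), and n = 35 gives t of roughly 1.93 with 34 degrees of freedom, in line with the reported t(34) = 1.92 given rounding of the inputs.

```python
import math


def paired_t_from_summary(mean_diff, sd_diff, n):
    """Paired t statistic and degrees of freedom from summary statistics."""
    standard_error = sd_diff / math.sqrt(n)
    return mean_diff / standard_error, n - 1


def paired_t(first, second):
    """Paired t statistic and degrees of freedom from two matched samples."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("samples must be matched and have length >= 2")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    # Sample standard deviation of the paired differences (n - 1 denominator).
    sd_diff = math.sqrt(sum((d - mean_diff) ** 2 for d in diffs) / (n - 1))
    return paired_t_from_summary(mean_diff, sd_diff, n)


# Check against the summary values reported for delta-18O (hair minus toenail).
t_stat, dof = paired_t_from_summary(1.5, 4.6, 35)
print(round(t_stat, 2), dof)
```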

Figure 6

(a) δ2H and (b) δ18O values in hair and toenails for all individuals who provided both tissue types (n = 35). Study sites are indicated by shapes. The standard deviation of each sample, run in either duplicate or triplicate, is shown by error bars. Note that error bars cannot be seen for some samples due to small standard deviations. The average difference between the isotopic values of hair and toenails from the same individual was 13.0‰ with a standard deviation of 8.4‰ for δ2H, and 1.5‰ with a standard deviation of 4.6‰ for δ18O.


The linear relationship between δ2H in hair and toenails for all individuals was (see Supplementary Fig. S3 online):

$$\delta^{2}\text{H}_{\text{hair}} = 0.48 \times \delta^{2}\text{H}_{\text{toenail}} - 34.72,\; \text{R}^{2} = 0.16,\; p < 0.05$$

(19)

and for δ18O:

$$\delta^{18}\text{O}_{\text{hair}} = 0.55 \times \delta^{18}\text{O}_{\text{toenail}} + 5.16,\; \text{R}^{2} = 0.13,\; p < 0.05$$

(20)

Overall, both equations showed a weak relationship, as seen by the small R2 values.


Source: Ecology - nature.com
