
Shape-changing chains for morphometric analysis of 2D and 3D, open or closed outlines

2D mandible outlines

In ref. 17, elliptical Fourier analysis (EFA) is employed to investigate the lateral shape differences among 106 fossil mandibles from five groups: A. robustus (\(n=7\)), H. erectus (\(n=12\)), H. heidelbergensis (\(n=4\)), H. neanderthalensis (\(n=22\)), and H. sapiens (\(n=61\)). In the present work, the shape-changing chain method is applied to the same dataset. Twelve samples were excluded: all 7 A. robustus samples, 4 H. neanderthalensis samples, and one H. sapiens sample. Therefore, 94 mandible profiles from four groups of ancient humans are analyzed: H. erectus (\(n=12\)), H. heidelbergensis (\(n=4\)), H. neanderthalensis (\(n=18\)), and H. sapiens (\(n=60\)). The dataset consists of Cartesian coordinates of points along the mandible boundary. Note that the shape-changing chain method does not require pre-alignment of curves or removal of the size factor. However, the mandible profile dataset the authors obtained no longer contained the original sample sizes. Therefore, only normalized mandibular shapes are compared herein. Figure 5 illustrates the mean shape of each group, obtained by aligning all profiles in the set using a standard Procrustes superimposition (PS), which includes translation, scaling, and rotation of the profiles27.
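As a rough illustration of the three operations in a Procrustes superimposition (translation, scaling, rotation), the following minimal sketch aligns one 2D outline to a reference. It assumes that the two outlines have the same number of corresponding points and that no reflection is needed; the function name `procrustes_align` and the sample coordinates are hypothetical and are not taken from the mandible dataset.

```python
import math

def procrustes_align(ref, shape):
    """Align `shape` to `ref` (equal-length lists of corresponding (x, y) points).

    Both outlines are centred and scaled to unit centroid size, then `shape`
    is rotated to minimise the summed squared distances between point pairs.
    Returns (ref_normalised, shape_aligned).
    """
    def normalise(points):
        z = [complex(x, y) for x, y in points]
        centroid = sum(z) / len(z)
        z = [p - centroid for p in z]                    # translation
        size = math.sqrt(sum(abs(p) ** 2 for p in z))    # centroid size
        return [p / size for p in z]                     # scaling

    a, b = normalise(ref), normalise(shape)
    # Optimal rotation is the phase of sum(a_i * conj(b_i)).
    inner = sum(p * q.conjugate() for p, q in zip(a, b))
    rot = inner / abs(inner)
    b = [q * rot for q in b]
    return [(p.real, p.imag) for p in a], [(q.real, q.imag) for q in b]

# Hypothetical outline, and a copy that is rotated by 30 degrees,
# scaled by 2.5 and shifted.
ref = [(0.0, 0.0), (4.0, 0.0), (4.0, 1.0), (1.0, 3.0)]
t = math.radians(30)
moved = [(2.5 * (x * math.cos(t) - y * math.sin(t)) + 7.0,
          2.5 * (x * math.sin(t) + y * math.cos(t)) - 3.0) for x, y in ref]

ref_n, moved_n = procrustes_align(ref, moved)
max_gap = max(math.dist(p, q) for p, q in zip(ref_n, moved_n))
```

After alignment the two outlines coincide up to rounding error, so `max_gap` is close to zero. A group mean shape would normally be obtained by repeating this alignment against an iteratively updated mean, which is not shown here.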

Figure 5

Mean mandibular shapes of samples from H. erectus (red circles), H. heidelbergensis (blue triangles), H. neanderthalensis (green squares), and H. sapiens (black diamonds).


Using the relative angle method (\(k=150\), \(T=20^\circ\)), five apices are retained, dividing each profile into six sub-profiles. The location of the apices on a mandible profile is illustrated in Fig. 6. Each sub-profile is then matched individually with a shape-changing chain. The segment type vector of each sub-profile is chosen with reference to the growth mechanism of the mandible proposed by Enlow et al.28. As shown in Fig. 6, mandible growth involves bone resorption (indicated by the arrows pointing towards the mandible contour) and bone deposition (indicated by the arrows pointing out of the mandible contour). Although the whole mandible is displaced forwards and downwards, the ascending limb is generally remodeled backwards and upwards. G-segments and C-segments are employed to approximate the growing portions in target profiles and to characterize the differences in profile lengths.

Figure 6

A profile (\(j=1\)) from the H. erectus group. Five apices (red circles) are located using the relative angle method (\(k=150\), \(T=20^\circ\)) and divide the profile into six sub-profiles. The arrows represent the growth pattern of the mandible28.


Note that growth along the inferior edge of the mandibular body and the posterior edge of the mandibular ramus is more pronounced than along the rest of the mandible profile. The 94 mandible profiles are then matched with a shape-changing chain using the following scheme. The segment vectors for the first, second, third, and sixth sub-profiles are all defined as \(\left[\text{MGM}\right]\), and the segment vectors for the fourth and fifth sub-profiles are both defined as \(\left[\text{MCGM}\right]\), where the C-segments and G-segments are used to capture the differences in arc length. Therefore, the overall segment vector is

$$\mathbf{V}=\left[\text{M G M M G M M G M M C G M M C G M M G M}\,\right]\text{,}$$

where there are a total of 20 segments: 12 M-segments, 2 C-segments, and 6 G-segments. After the segment type vector is defined, the shape-changing chain is generated to match the target mandible profiles and is then optimized for each sub-profile. The maximum and mean errors over all profiles in the final matching result are \(E_{\text{max}}=8.0863\) and \(\overline{E}=0.6009\) units, respectively. Figure 7 shows the best (a), the average (b), and the worst (c) matches according to \(\tilde{E}_{j}\). Note that in the worst match, the G-segment at the condyle (head) of the mandible causes the largest matching error. This is because the third primary segmentation point (between the two M-segments that follow) identified by the relative angle method for this specific profile is not at the tip of the condyle, as it is in the majority of the profiles.
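The segment counts quoted above follow directly from concatenating the six sub-profile vectors. A minimal sketch of that bookkeeping, using plain strings as an assumed encoding of the segment types:

```python
from collections import Counter

# Segment type vectors of the six sub-profiles, in order, as stated in the text.
sub_profile_vectors = ["MGM", "MGM", "MGM", "MCGM", "MCGM", "MGM"]

overall = "".join(sub_profile_vectors)   # concatenated vector V
counts = Counter(overall)                # occurrences of each segment type

print(" ".join(overall))   # M G M M G M M G M M C G M M C G M M G M
print(len(overall))        # 20
```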

Figure 7

The fitting result of 94 human mandibles. (a) The best match (the 4th profile, H. erectus, \(\tilde{E}_{4}=0.3924\)); (b) The match with error closest to \(\overline{E}\) (the 6th profile, H. erectus, \(\tilde{E}_{6}=0.6006\)); (c) The worst match (the 13th profile, H. heidelbergensis, \(\tilde{E}_{13}=1.0771\)).


The orientation difference between two neighboring segments reflects the rotational angle between them, and is thus employed in the statistical analysis in the next step. Denote the direction of a vector \(\mathbf{u}={\left\{u_{x},u_{y}\right\}}^{T}\) as \(\angle\left(\mathbf{u}\right)\). The orientation change between the \(e\)th and the \((e+1)\)th segments on the \(j\)th profile is then calculated as the difference between the direction of the first piece on the \((e+1)\)th segment and the direction of the last piece on the \(e\)th segment

$$\sigma_{j}^{e}=\angle\left(\overline{\mathbf{z}}_{j_{2}}^{e+1}-\overline{\mathbf{z}}_{j_{1}}^{e+1}\right)-\angle\left(\overline{\mathbf{z}}_{j_{m_{j}^{e}+1}}^{e}-\overline{\mathbf{z}}_{j_{m_{j}^{e}}}^{e}\right),\quad \forall e=1,\dots,q-1,\; j=1,\dots,p.$$

(10)
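Equation (10) can be sketched in a few lines. The representation below, in which each segment is a list of (x, y) points, is an assumption made for illustration, as is the wrapping of the result to \((-\pi, \pi]\), which the equation itself does not state.

```python
import math

def direction(p, q):
    """Direction angle (radians) of the vector from point p to point q."""
    return math.atan2(q[1] - p[1], q[0] - p[0])

def orientation_changes(segments):
    """Eq. (10)-style angles between consecutive segments of one profile.

    `segments` is a list of point lists; every segment needs at least two
    points so that its first and last pieces are defined.
    """
    sigmas = []
    for cur, nxt in zip(segments, segments[1:]):
        last_piece = direction(cur[-2], cur[-1])   # last piece of segment e
        first_piece = direction(nxt[0], nxt[1])    # first piece of segment e+1
        diff = first_piece - last_piece
        # Assumption: wrap to (-pi, pi] so a small turn is never reported as ~2*pi.
        sigmas.append(math.atan2(math.sin(diff), math.cos(diff)))
    return sigmas

# Hypothetical chain: move right, turn 90 degrees left, then 45 degrees left.
chain = [
    [(0, 0), (1, 0), (2, 0)],
    [(2, 0), (2, 1), (2, 2)],
    [(2, 2), (1, 3)],
]
sigmas = orientation_changes(chain)
```

A chain of \(q\) segments yields \(q-1\) such angles; here three segments give two values, \(\pi/2\) and \(\pi/4\).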

In the mandible example, 19 angular variables are generated from 20 segments. As in ref. 17, a stepwise discriminant analysis (DA) is conducted (in IBM SPSS 22) to determine the relationships among the four Homo groups. DA is a supervised classification method and returns \(g-1\) canonical components among \(g\) groups of samples29. Figure 8 shows the convex hulls of the four Homo groups plotted against the first and second canonical components. The three main groups, H. erectus, H. neanderthalensis, and H. sapiens, are separated from each other along the first canonical component. H. heidelbergensis and H. neanderthalensis overlap along the second canonical component. In stepwise DA, leave-one-out cross-validation (LOOCV) is applied to verify the stability of the linear model. The resulting prediction accuracy is 91.5% and the cross-validation accuracy is 80.9%. This DA result suggests that the shape-changing chain method is useful in analyzing 2D shapes. The classification matrices of the original prediction and LOOCV are presented in Table 2, showing the details of discrimination of the four mandibular shape groups.
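To make the difference between the original prediction accuracy and the LOOCV accuracy concrete, the sketch below computes both for a toy dataset. It deliberately swaps in a nearest-centroid classifier as a simple stand-in for the stepwise DA run in SPSS, so it illustrates only the validation procedure; the data and function names are hypothetical.

```python
def nearest_centroid_predict(train_x, train_y, x):
    """Assign x to the class whose training mean is closest (squared Euclidean)."""
    sums, counts = {}, {}
    for features, label in zip(train_x, train_y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, v in enumerate(features):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1

    def dist2(label):
        return sum((s / counts[label] - v) ** 2 for s, v in zip(sums[label], x))

    return min(sums, key=dist2)

def resubstitution_accuracy(xs, ys):
    """Fit on all samples, then predict those same samples."""
    hits = sum(nearest_centroid_predict(xs, ys, x) == y for x, y in zip(xs, ys))
    return hits / len(xs)

def loocv_accuracy(xs, ys):
    """Hold out each sample once and predict it from the remaining ones."""
    hits = 0
    for i, (x, y) in enumerate(zip(xs, ys)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        hits += nearest_centroid_predict(rest_x, rest_y, x) == y
    return hits / len(xs)

# Toy one-variable data (hypothetical), not the mandible variables.
xs = [[0.0], [1.0], [2.8], [4.0], [5.0], [6.0]]
ys = ["A", "A", "A", "B", "B", "B"]

original = resubstitution_accuracy(xs, ys)   # every sample also trains the model
cross_validated = loocv_accuracy(xs, ys)     # each sample is predicted unseen
```

In this toy case the borderline sample at 2.8 is classified correctly when it contributes to its own class mean but not when it is held out, so the cross-validated accuracy (5/6) is lower than the original one (1.0). The same effect would explain a gap of the kind reported between the two accuracies above.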

Figure 8

Canonical plot of the 94 human mandibles from four groups (H. erectus, H. heidelbergensis, H. neanderthalensis, and H. sapiens) based on the orientation changes between segments (19 variables).

Table 2 Classification matrices of the original DA and cross-validated prediction of 94 human mandibles.

Note that the classification results shown in Fig. 8 and Table 2 are in accordance with the results obtained with EFA in ref. 17. The high misclassification rate of H. heidelbergensis and its distribution on the canonical plot are also in keeping with the mainstream opinion that H. heidelbergensis is a chronospecies evolving from H. erectus and is considered the most recent common ancestor (MRCA) of H. sapiens and H. neanderthalensis. In the work of Lestrel et al. based on EFA, 20 harmonics are employed to match 106 mandibular shapes, producing 82 Fourier descriptors17. Then, 12 distances from the centroid to specified points on each mandible’s contour are used in statistical analysis. By comparison, the shape-changing chain method generates a total of only 28 variables (20 orientations of all segments and 8 arc lengths of C-segments and G-segments). The differences in orientation between neighboring segments are then calculated, yielding 19 variables to be analyzed in stepwise DA. Table 3 compares the variables generated in the approximation of curves and used for statistical analysis with the shape-changing chain method and EFA. The shape-changing chain method achieves a satisfactory approximation of the mandibular shapes with far fewer variables than EFA.

Table 3 Numbers of variables used in the shape-changing chain method and in EFA17 for fitting and analyzing human mandible profiles.

2D leaf outlines

Leaf classification is a typical problem that has been studied with various methods, such as artificial neural networks (ANN)30, image moments31, and EFA9. In addition, many leaves have a symmetrical shape, which creates issues for effective EFA12. Using the shape-changing chain method, the fitting result reveals the growth of portions of the contour and the rotation between them. This kind of information can be used in statistical analysis. Although other methods that also make use of non-shape information (size, color, etc.) have been convenient and efficient in recognizing leaf genera, leaf matching and classification remains a useful problem for testing the ability of the shape-changing chain method to fit and compare profiles with complicated and widely varying shapes. In this example, 145 leaves from nine groups are studied (see the groups and the number of samples in each group in Table 3). The original scanned and binarized images of the nine genera of leaves are shown in Fig. s1. The contours are traced using the Moore-Neighbor method32 and then smoothed with the MATLAB cubic spline interpolation (see Fig. s2). All leaf profiles are analyzed at their original sizes. The arc lengths of the profiles range from 1141.5 units to 8433.1 units, and the areas of the leaves range from \(6.1671\times 10^{4}\) units² to \(1.4615\times 10^{6}\) units².
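The arc lengths and areas quoted above are simple functions of the traced contour points. A minimal sketch, assuming a contour is stored as an ordered list of (x, y) points that closes back on its first point (the helper names and the test rectangle are hypothetical):

```python
import math

def closed_arc_length(points):
    """Perimeter of a closed polyline (the last point joins back to the first)."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:] + points[:1]))

def enclosed_area(points):
    """Area inside a simple closed polyline, via the shoelace formula."""
    twice = sum(p[0] * q[1] - q[0] * p[1]
                for p, q in zip(points, points[1:] + points[:1]))
    return abs(twice) / 2.0

# A 4 x 3 rectangle as a sanity check.
rect = [(0, 0), (4, 0), (4, 3), (0, 3)]
perimeter = closed_arc_length(rect)   # 14.0
area = enclosed_area(rect)            # 12.0
```

Because lengths are in units and areas in units², the two quantities scale differently with leaf size, which is why both ranges are reported.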

Applying the relative angle method, a number of apices are recognized on each leaf contour. These apices are the primary segmentation points that determine the boundaries of sub-profiles on leaf contours. Note that the shapes of leaves from different groups vary significantly, so the point interval and angle threshold used for locating apices vary from group to group. For some groups, the number of apices identified may also differ between samples. Table 4 shows the parameters used for identifying apices as well as the minimum and maximum numbers of apices identified on samples for each group.

Table 4 Parameters used for identifying apices on leaf contours and the number of apices identified for each group.

In order to maintain homology, supplementary segmentation points are added to divide all sample profiles into the same number of portions. There is no need to add more segmentation points on the profile that contains the largest number of apices (red oak, \(j=101\)), so the total number of segmentation points on each profile is set to 34, dividing each profile into 35 portions. In order to reduce the matching error, supplementary segmentation points are distributed as evenly as possible in the sub-profiles formed by the primary segmentation points (original apices) using a method based on a genetic algorithm (GA). In this problem, the locations of the supplementary segmentation points on the \(j\)th profile are determined through the following fitness function

$$F_{j}=\sum_{e=1}^{q}{\left(k_{j}^{e+1}-k_{j}^{e}-\frac{N_{j}-1}{q}\right)}^{2}.$$

(11)

In Eq. (11), the number of pieces contained in the \(j\)th profile (\(N_{j}-1\)) divided by the number of portions \(q\) yields the average number of pieces in each portion. \(k_{j}^{e+1}-k_{j}^{e}\) is the number of pieces contained in the \(e\)th portion, bounded by the \(e\)th and the \((e+1)\)th segmentation points on the \(j\)th profile. The locations of all segmentation points are encoded in the GA, and after several rounds of optimization with crossover and mutation, the set of supplementary segmentation points that minimizes the fitness function, Eq. (11), is determined. The original apices (red circles) and supplementary segmentation points (green circles) distributed on samples from different groups are shown in Fig. 9. In this example, each profile is finally divided into 35 portions.
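The fitness function of Eq. (11) is straightforward to evaluate. The sketch below assumes that the segmentation points are given as an increasing list of point indices \(k^{1},\dots,k^{q+1}\), with the last index closing the final portion; the function name and the sample indices are hypothetical, and the GA search itself is not shown.

```python
def even_spacing_fitness(boundaries, n_points):
    """Eq. (11): squared deviation of each portion's piece count from the mean.

    boundaries: increasing point indices k^1..k^(q+1) delimiting q portions
                (assumption: the last index closes the final portion).
    n_points:   number of points N_j on the profile, giving N_j - 1 pieces.
    """
    q = len(boundaries) - 1
    mean_pieces = (n_points - 1) / q
    return sum((b - a - mean_pieces) ** 2
               for a, b in zip(boundaries, boundaries[1:]))

# A 101-point profile (100 pieces) split into 4 portions.
even = even_spacing_fitness([0, 25, 50, 75, 100], 101)     # 0.0
uneven = even_spacing_fitness([0, 10, 50, 90, 100], 101)   # 900.0
```

A perfectly even split scores zero, and the score grows as portions become unequal. In the setting described above the primary segmentation points stay fixed, so a search procedure would vary only the supplementary indices while trying to lower this value.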

Figure 9

The original apices (red circles) and supplementary segmentation points (green circles) on leaf contours. (a) Cherry, (b) Dogwood, (c) Gum, (d) Hickory, (e) Mulberry, (f) Red maple, (g) Red oak, (h) Sugar maple, (i) White oak. For each group, the sample that contains the most original apices is presented.


The length of each portion varies among profiles, so M-segments are not applicable. In addition, some portions still contain local burrs and sharp corners, which would not be matched well by C-segments. Therefore, each portion is matched by a G-segment, and the segment vector contains 35 G-segments. The maximum and mean errors of the 145 leaf profiles are \(E_{\text{max}}=60.6063\) and \(\overline{E}=8.7062\) units, respectively. Figure 10 shows the best, the average, and the worst matching results of the leaves according to \(\tilde{E}_{j}\). More matches of the nine genera of leaves are illustrated in Fig. s3. The results show that, given the distribution of apices (primary segmentation points that determine sub-profiles), the GA strategy can automatically determine the distribution of supplementary segmentation points along a profile. With the segmentation points generated by this process, the shape-changing chain matches the leaf contours with a smaller error than the random segmentation used in the previous study.

Figure 10

The fitting results of 145 leaves. (a) The best match (the 62nd profile, hickory, \(\tilde{E}_{62}=0.8748\)); (b) The average match (the 88th profile, red maple, \(\tilde{E}_{88}=4.0866\)); (c) The worst match (the 110th profile, red oak, \(\tilde{E}_{107}=9.7866\)).


For classification analysis, 34 orientation differences between neighboring segments are calculated using Eq. (10). Three more variables are employed: the number of primary segmentation points, the number of burrs (detected using the relative angle method with \(k=50\) and \(T=30^\circ\)), and the arc length of each profile. This gives a total of 37 variables. A stepwise DA is performed to classify the 145 leaf samples, and 22 of the 37 variables are selected for analysis. The first three canonical functions account for 73.5%, 13.3%, and 7.5% of the variance, which adds up to 94.3% in total. Figures 11 and 12 illustrate the 2D and 3D canonical plots of the nine genera of leaves based on the first three canonical components. The plots show that gum, red maple, and white oak are distinctly separated from the other groups. Cherry and mulberry partially overlap in the directions of canonical Roots 1 and 2 because of their similar overall shapes and serrated edges. There is also an overlap between dogwood and hickory in the directions of canonical Roots 1 and 3 because of their similar shapes and smooth edges. The prediction accuracy is 98.6%, and the leave-one-out cross-validation accuracy is 97.9%. Only two samples of cherry are misidentified as mulberry, and one sample of hickory is misidentified as dogwood. The DA results reveal that the shape-changing chain method is capable of fitting a large number of profiles that have complicated shapes and different sizes, as well as generating useful variables for statistical analysis. The leave-one-out cross-validation accuracy suggests that this method is also effective with fewer variables. In addition, the shape-changing chain method enables direct observation and comparison of variables that have physical meanings, such as the relative angles between segments.

Figure 11

The 2D canonical plots of nine genera of leaves based on 22 variables.

Figure 12

The 3D canonical plot of nine genera of leaves based on 22 variables.


3D cranial suture curves

The shape-changing chain method is now applied to 3D suture curves on human infants’ skulls from a study of coronal synostosis18,19. The dataset contains 63 samples categorized into 4 groups: left unicoronal synostosis (LUCS, \(n=8\)), right unicoronal synostosis (RUCS, \(n=19\)), bicoronal synostosis (BCS, \(n=16\)), and unaffected cases (\(n=20\)). The original data of each sample consist of 209 anatomical landmarks and curve semilandmarks located on the skull surface, especially along anatomical lines such as sutures. In this work, three curves that characterize the skull deformation are selected for analysis: the coronal suture curve, the lambdoid suture curve, and the sagittal curve, which is composed of anatomical landmarks and curve semilandmarks located on the metopic suture, the sagittal suture, and the mid-line of the occipital bone. Figure 13 shows the three suture curves on a skull surface.

Figure 13

The location of the coronal suture (magenta), sagittal curve (blue) and the lambdoid suture (red) on a human infant skull. The intersection points between sutures, P1 and P2, divide the sagittal curve into three sub-profiles and the lambdoid suture into two sub-profiles.


BCS occurs when the coronal sutures on both sides of the skull fuse prematurely, causing the overall head shape to become broad and short. In this case, the relative location of the lambdoid suture on the skull will move forward compared to the unaffected cases, but its shape is not affected as obviously as the coronal suture or the sagittal curve which are directly affected by coronal synostosis. In order to investigate the relative location and orientation in addition to the shape of the suture curves, a standard Procrustes superimposition is performed on the original data so that all skulls represented by the 209 landmarks and semilandmarks are scaled to the same size and aligned. Figure 14 illustrates the mean shapes of the sagittal curves and the lambdoid sutures of each group. It can be observed that for BCS cases, the sagittal curve is shorter in the anterior–posterior direction, the coronal suture becomes wider in the left–right direction, and the lambdoid suture is longer and positioned relatively forward. These differences are in accordance with the overall wider and shorter BCS skull shape. As for LUCS and RUCS cases, all three curves display a symmetrical shape deformation or orientation change about the skull symmetry plane (X = 0).

Figure 14

Mean shapes of (a) the sagittal curves, (b) the coronal suture, and (c) the lambdoid sutures of four groups: LUCS (red dotted line), BCS (blue solid line), RUCS (green dotted dashed line), and unaffected cases (black dashed line). Notice the symmetry of the suture curves about the skull symmetry plane (X = 0).


The anatomical landmarks P1 and P2 (Fig. 13), which are the intersection points between the sutures, are selected as the primary segmentation points. Thus, the sagittal curve, the coronal suture, and the lambdoid suture are divided into three, two, and two sub-profiles, respectively. Since the coronal suture and the lambdoid suture grow symmetrically about the skull symmetry plane, the segment type vectors of their two sub-profiles should mirror each other. In addition, each segment type vector should contain a G- or H-segment to characterize the growth. The segment type vectors of the three curves are designated as:

  • Sagittal curve: \(\left[\begin{array}{cc}\text{M}& \text{G}\end{array}\right]\), \(\left[\begin{array}{cc}\text{M}& \text{H}\end{array}\right]\), \(\left[\begin{array}{cc}\text{M}& \text{G}\end{array}\right]\);

  • Coronal suture: \(\left[\begin{array}{ccc}\text{M}& \text{G}& \text{H}\end{array}\right]\), \(\left[\begin{array}{ccc}\text{H}& \text{G}& \text{M}\end{array}\right]\);

  • Lambdoid suture: \(\left[\begin{array}{cc}\text{G}& \text{H}\end{array}\right]\), \(\left[\begin{array}{cc}\text{H}& \text{G}\end{array}\right]\).

In spatial cases, the orientation of each segment is given by 3 parameters, and each G- or H-segment is characterized by an additional length parameter. Therefore, this matching scheme generates 21, 22, and 16 parameters to describe the shape variation of the sagittal curves, the coronal suture curves, and the lambdoid suture curves, respectively. Note that the suture curves are relatively smooth, so the average of the maximum errors over all segments, \(\overline{E}_{j}\), is representative of the matching error of the chain on the \(j\)th profile. Therefore, this parameter is chosen to assess the error in this application. Figure 15 shows the best, the average, and the worst matches of the sagittal curves, the coronal sutures, and the lambdoid sutures. The overall mean errors (\(\overline{E}\)) of the sagittal curves, the coronal sutures, and the lambdoid sutures are 0.8728, 0.5060, and 0.3666 units, respectively.
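The parameter totals above can be reproduced from the segment type vectors with the counting rule stated in the text: three orientation parameters per segment plus one length per G- or H-segment. The string encoding below is an assumed convenience for illustration.

```python
# Segment type vectors per sub-profile, as listed in the text.
schemes = {
    "sagittal": ["MG", "MH", "MG"],
    "coronal":  ["MGH", "HGM"],
    "lambdoid": ["GH", "HG"],
}

def parameter_count(sub_vectors):
    """3 orientation parameters per segment, plus 1 length per G- or H-segment."""
    segments = "".join(sub_vectors)
    return 3 * len(segments) + sum(s in "GH" for s in segments)

totals = {name: parameter_count(v) for name, v in schemes.items()}
print(totals)   # {'sagittal': 21, 'coronal': 22, 'lambdoid': 16}

# The two-part sutures use vectors that mirror each other about the symmetry plane.
mirrored = {name: v[0] == v[1][::-1] for name, v in schemes.items() if len(v) == 2}
```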

Figure 15

The fitting results of (a–c) sagittal curves, (d–f) coronal sutures, and (g–i) lambdoid sutures from 63 samples. The left column (a, d, g) is the best match of each group, the middle column (b, e, h) is the average match of each group, and the right column (c, f, i) is the worst match of each group. (a) \(\overline{E}_{s-52}=0.5122\) (unaffected); (b) \(\overline{E}_{s-19}=0.8681\) (BCS); (c) \(\overline{E}_{s-46}=1.6055\) (unaffected); (d) \(\overline{E}_{c-12}=0.2507\) (BCS); (e) \(\overline{E}_{c-54}=0.5082\) (unaffected); (f) \(\overline{E}_{c-42}=0.9636\) (RUCS); (g) \(\overline{E}_{l-29}=0.1225\) (BCS); (h) \(\overline{E}_{l-37}=0.3687\) (BCS); (i) \(\overline{E}_{l-28}=1.0997\) (RUCS).


In order to represent the orientation of the spatial chain, the \(e\)th segment at the \(j\)th profile is characterized by a unit vector that points from the starting point to the endpoint of the segment as

$$\mathbf{u}_{j}^{e}=\frac{\overline{\mathbf{z}}_{j_{m_{j}^{e}+1}}^{e}-\overline{\mathbf{z}}_{j_{1}}^{e}}{\left\Vert \overline{\mathbf{z}}_{j_{m_{j}^{e}+1}}^{e}-\overline{\mathbf{z}}_{j_{1}}^{e}\right\Vert }.$$

(12)
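Equation (12) amounts to normalizing the chord of each segment. A minimal sketch, assuming each segment is stored as an ordered list of 3D points (the helper names and sample coordinates are hypothetical):

```python
import math

def segment_direction(segment):
    """Eq. (12): unit vector from the first to the last point of a 3D segment."""
    start, end = segment[0], segment[-1]
    chord = [e - s for s, e in zip(start, end)]
    norm = math.sqrt(sum(c * c for c in chord))
    return [c / norm for c in chord]

def chain_variables(segments):
    """Concatenate the unit vectors of all segments into one feature list."""
    return [c for seg in segments for c in segment_direction(seg)]

# The chord of this segment is (2, 1, 2), which has length 3.
seg = [(0.0, 0.0, 0.0), (1.0, 0.5, 0.0), (2.0, 1.0, 2.0)]
u = segment_direction(seg)                 # approximately [2/3, 1/3, 2/3]
n_vars = len(chain_variables([seg] * 6))   # six segments give 18 variables
```

Each segment contributes three coordinates, so a six-segment chain yields 18 variables and a four-segment chain yields 12.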

Since each vector \(\mathbf{u}_{j}^{e}\) contains three Cartesian coordinates, each sagittal curve, coronal suture, and lambdoid suture is characterized by 18, 18, and 12 variables, respectively. For the lambdoid suture, its relative location, characterized by the coordinates of point P2, is also analyzed. Therefore, the total number of variables analyzed for a lambdoid suture is 15. This is far fewer than the 209 landmarks and semilandmarks analyzed in the work of Heuzé et al.19. Stepwise DA is conducted with the variables above, and LOOCV is performed to verify the stability of the linear model. In stepwise DA, 6, 7, and 8 variables are selected for the sagittal curves, coronal sutures, and lambdoid sutures, respectively. Figure 16 illustrates the canonical plots of the three sets of curves. As shown in Fig. 16, all three sets of suture curves display strong separation among the four classes on the 2D canonical plots. In addition, the LUCS and RUCS curves are distributed in opposite directions from the BCS and unaffected ones along the first canonical component, while the BCS and unaffected curves differ in the direction of the second canonical component. These plots confirm the symmetrical shape deformation of the suture curves of LUCS and RUCS cases, and the changes in the lengths and relative locations of the suture curves of BCS cases, as observed in Fig. 14.

Figure 16

The 2D canonical plots of the suture curves selected from 63 skull samples. (a) The sagittal curves based on 6 variables. (b) The coronal sutures based on 7 variables. (c) The lambdoid sutures based on 8 variables.


The original DA prediction accuracy and cross-validated accuracy are both 100% for the sagittal curves, which indicates that the shape differences of the sagittal curves can efficiently distinguish the specific diagnoses of coronal synostosis. The original DA prediction accuracy and cross-validated accuracy for the coronal sutures are 98.4% and 96.8%, respectively. Two cases (BCS and RUCS) are misclassified as unaffected. As for the lambdoid sutures, the original DA prediction accuracy and cross-validated accuracy are both 98.4%. The only misclassified case in both predictions is one BCS lambdoid suture categorized as unaffected. This suggests that the coronal suture and the lambdoid suture are subject to both shape deformation and location transformation due to coronal synostosis.

These matching and classification results show that the shape-changing chain is efficient in fitting and analyzing 3D curves with a moderate number of variables compared to other parametric methods. For example, Zhou et al. employed the discrete cosine transform (DCT) to analyze the same three sets of suture curves33. In their work, 12, 6, and 6 harmonics are employed to fit the sagittal curves, the coronal suture curves, and the lambdoid suture curves, resulting in 36, 18, and 18 coefficients to be analyzed. Table 5 compares the variables used to match the curves and perform statistical analysis with the two methods. A comparison of the classification accuracies of the two methods is not provided because Zhou et al. employed between-group principal component analysis (bgPCA), while the present work uses stepwise DA. Note that the variables obtained in DCT are mathematical coefficients that are hard to interpret, while the variables in the shape-changing chain method represent the orientations, lengths, or locations of segments, providing direct information about the variation of the curve shapes.
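The coefficient counts quoted for the DCT approach are consistent with one set of harmonics per coordinate axis. The sketch below shows that bookkeeping with an unnormalized DCT-II written out by hand. It is an assumed reading of how a truncated DCT could describe a 3D curve, not the implementation of Zhou et al., and the helper names and test curve are hypothetical.

```python
import math

def dct_coefficients(values, n_harmonics):
    """First `n_harmonics` unnormalised DCT-II coefficients of a 1D sequence."""
    n = len(values)
    return [sum(v * math.cos(math.pi * (i + 0.5) * h / n)
                for i, v in enumerate(values))
            for h in range(n_harmonics)]

def curve_descriptors(points, n_harmonics):
    """Apply the truncated DCT to x, y and z separately and concatenate."""
    coords = zip(*points)   # -> x-sequence, y-sequence, z-sequence
    return [c for seq in coords
            for c in dct_coefficients(list(seq), n_harmonics)]

# Hypothetical smooth 3D curve sampled at 30 points.
curve = [(math.cos(0.1 * i), math.sin(0.1 * i), 0.05 * i) for i in range(30)]

n_sagittal = len(curve_descriptors(curve, 12))   # 3 axes x 12 harmonics = 36
n_suture = len(curve_descriptors(curve, 6))      # 3 axes x 6 harmonics = 18
```

Unlike the segment orientations and lengths produced by the shape-changing chain, each of these numbers mixes contributions from the whole curve, which is the interpretability difference noted above.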

Table 5 Numbers of variables used in the shape-changing chain method and in DCT33 for fitting and analyzing suture curves.


Source: Ecology - nature.com
