Samples
We sampled 73 adult fox skulls (Vulpes vulpes) from three separate sample groups: wild (8 F, 12 M), unselected (15 F, 8 M), and domesticated (15 F, 15 M). Domesticated and unselected skulls from the RFF experiment were generously provided by Dr. Trut and transported to Harvard in 2004. Unfortunately, we do not know how these foxes were chosen, but have no reason not to assume that they were selected randomly from both populations. Wild fox skulls in the study were sampled from the collections of the Museum of Comparative Zoology, Harvard University. All but two wild foxes were trapped in Canada east of Quebec between 1894 and 1952, with the majority (70%) between 1894 and 1900 (Table S1). We excluded from the study sample all juvenile skulls, as determined through lower third molar eruption and fusion of the cranial suture between the basioccipital and basisphenoid16, and those skulls that had evidence of damage or disease. After applying these exclusion criteria, we arrived at our final sample of 73 skulls.
3D landmarks
To prevent movement during measurement, each skull was embedded in styrofoam and secured to the workspace desk before 3D coordinates of 29 landmarks, listed, defined and displayed in Table S2 and Fig S1, were collected on the left half of each skull by a single analyst (TMK). Of these 29 landmarks, 17% are on the cranial base, 27% are on the neurocranium, and 55% are on the face. 3D landmark coordinates were measured with a Microscribe G2 (Positional Accuracy ± 0.38 mm, Revware, Inc.). This machine consists of a mobile robotic arm tipped with a probe. After calibration, the probe tip is placed on each landmark to record its XYZ coordinates. To avoid having to move the skulls during measuring and to limit the number of variables in our final GM analysis (too high a number may be a problem given our small sample size), we restricted landmark measurements to one half of each skull. We assume that the fluctuating asymmetry between each fox population is negligible and stable as has previously been shown in comparisons across a domestic-wild hybridization zone in mice17. In most cases, landmark positions were lightly marked with pencil to ensure proper probe placement.
Linear and endocranial volume measurements
Six linear measurements were taken on each skull using digital calipers (Fowler High Precision, Positional Accuracy ± 0.03 mm): total skull length, snout length, cranial vault height, cranial vault width, bi-zygomatic width and upper jaw width (Fig. 1). Endocranial volume was measured using plastic beads. Each cranium was filled up to the level of the foramen magnum and repeatedly shaken and tamped down until no more beads could be added. The beads were then funneled into a graduated cylinder to obtain a volumetric measurement.
Schematic diagram of the six linear measurements taken on the fox skulls. 1: Bi-zygomatic width. 2: Cranial vault width. 3: Upper jaw width. 4: Total skull length. 5: Snout length. 6: Cranial vault height.
Geometric morphometrics
Generalized Procrustes analysis was conducted in R v. 4.0.218 using the geomorph v. 3.3.1 package19. Landmark configurations from each specimen were translated to the origin, rescaled to centroid size, and optimally rotated (using a least-squares criterion) until the coordinates of homologous landmarks aligned as closely as possible. These steps place all specimens in the same shape space, centered on the mean shape. An orthogonal projection into a linear tangent space was applied so statistical analyses could be performed on the resulting tangent space coordinates. For Procrustes superimposition, we used the default parameters of the gpagen function in geomorph.
Repeatability analyses
Landmark measurement repeatability was evaluated through repeated measurements of three fox skulls (domesticated male ID# TM23, domesticated female ID# TF476, unselected female ID# UF1058) on ten separate occasions. In this case, repeatability encompasses both Microscribe and operator error. Generalized Procrustes Analysis (GPA) was used on these landmark coordinates to ensure that they were in the same 3D location relative to one another. The average Procrustes distance (PD) between all ten iterations of the same specimen was then compared to the average Procrustes distance within the population-sex grouping to which the skull belonged. To do this, we calculated a sensitivity ratio based on the formula: (Mean Inter-specimen PD – Mean Intra-specimen PD)/Mean Intra-specimen PD. This created a sensitivity ratio that reflects how sensitive the Microscribe measurements are with respect to the average difference among foxes of the same population-sex category. Averaging the sensitivity ratios for our three skulls, we find that the difference between replicates is roughly 3.7 times smaller than the differences within population-sex groupings. This indicates that the Microscribe G2 is robust enough to detect subtle individual differences in measured landmark coordinates. Linear and endocranial measurement repeatability was quantified through a similar method where repeat measurements were taken on 3 domesticated female fox skulls on 15 separate occasions. Sensitivity ratios were deteremined for each measurement (i.e. total skull length, snout length etc.) by calculating the standard deviation of each repeated measurement on a single specimen, averaging the three specimens’ standard deviations for that measurement, and then comparing that value to the population (domesticated female) standard deviation for that measurement. With the exception of cranial vault height (see limitations), the replicate standard deviations of each measurement were roughly a third (or less) of the population standard deviations (Table S3).
Statistical analyses
All statistical analyses were performed in R18. For all parametric inferences, we report point and interval (95% confidence) estimates of effect sizes, while for permutation-based inferences we report point estimates and p-values. All p-values involving multiple comparisons were adjusted for family-wise error using the sequential Bonferroni method.
3D shape comparisons
To test hypotheses about shape differences among the three populations of foxes, we used a permutation-based Procrustes MANOVA to regress tangent space coordinates on population identity and sex in the geomorph v. 3.3.1 package. Because we are unable to detect significant differences in allometry among populations with a permutation-based Procrustes MANOVA of tangent space coordinates on the interaction term of population identity and centroid size, the tangent space coordinates were not corrected for any scaling effects (see Supplemental information and Fig S2). Given this result, we control only for isometric size in geometric morphometric analyses (i.e. no correction for scaling in tangent space coordinates) as well as in our linear measurements. To determine how skull shape differed between fox populations, we performed pairwise comparisons of shape using Procrustes distances. We additionally performed pairwise comparisons between groups of the shape variance within a group (as assessed by the dispersion of residuals around the mean shape for a given population)20. All pairwise comparisons were made using the RRPP v. 0.6.1 package21,22. Permutation-based p-values for the pairwise comparisons were corrected for family-wise error using the sequential Bonferroni method. To visualize changes in skull shape between populations, a principal components analysis (PCA) was performed on the tangent space coordinates. Skull warp changes along the first principal component were graphed to visualize shape changes along this axis. Size differences between populations were assessed via a linear model using a weighted least squares (WLS) estimator, where centroid size was regressed on population identity and sex. The WLS estimator allowed for separate residual variances for each combination of population and sex, so that heteroskedasticity across these groups could be accounted for in the model. Variance in centroid size was assessed with a Levene’s test based on absolute deviations from the median and was performed using the car package v. 3.0-1023.
Linear and endocranial volume comparisons
Prior to modeling linear and volumetric data, we created size-adjusted versions of our variables to account for a difference in isometric size between wild and RFF populations. Normalizing to size allows us to parse out the effects of size selection from those of selection for docility as they likely have overlapping effects on craniofacial shape. We adjusted for size by normalizing each linear measurement and the cube root of endocranial volume by centroid size. We used centroid size rather than the geometric mean of the six linear measurements because centroid size was calculated using a larger sample of craniometric landmarks and is therefore the better proxy of overall cranial size. We performed size corrections on the raw measurements instead of including a size variable in the models because it allows the size-correction to be intrinsic to each fox rather than depending on the size of every fox in the model.
To determine if there were population-level differences in size-corrected linear and volumetric variables, we used a linear model with a generalized least squares (GLS) estimator from the nlme v. 3.1-150 package24 to regress all 7 skull variables simultaneously as correlated responses on population identity and sex (see Supplementary Methods for details of estimation strategy and model specification and Figs. S3, S4). We report estimates of pairwise percent differences between population means for each skull variable. We use this method because the 7 linear and volumetric skull variables were correlated in two ways (see Fig. S3). First, they were measured on the same specimens, and second, they represent non-independent aspects of shape variation. Modeling these response variables in 7 separate general linear models (e.g., ANOVA) would result in biologically unrealistic inferences because these correlations would be artificially fixed at zero. In addition, since skull variables exhibited varying amounts of dispersion, the GLS model allowed for different residual variances for each response variable.
Sexual dimorphism comparisons
Sexual dimorphism within a species is often represented as dimorphism in size as well as shape25. Therefore, in contrast to the previous analyses, we assess the degree of sexual dimorphism in both size and shape. We used a similar GLS model to determine the degree of sexual dimorphism of the raw (non-size corrected) variables within each population. To estimate sex-specific effects, we added an additional interaction term between sex and population identity in this model. We report the degree of dimorphism using estimates of mean differences between males and females for a given skull variable, within a population. For both models using linear and volumetric data, we performed model selection for variance components and correlation structures using the Bayesian information criterion, since this has been shown to provide a good balance between parsimony and over-fitting for explanatory models26. Linear model (GLS) assumptions were checked using diagnostic plots of standardized residuals and fitted values (see Fig. S5).
Source: Ecology - nature.com