Goose capture and tracking
We used rocket netting and leg snares to capture white-fronted geese in three regions in Texas (Rolling Plains, Lower Texas Coast, and South Texas Brushlands) and one region in Louisiana (Chenier Plain) from October to February 2016–2018 (Fig. 1). We determined age and sex of individuals by cloacal inversion, rectrices and other plumage characteristics27,28. We fit a solar powered GPS/ACC/Global System for Mobile communication (GSM) neckband tracking device (Cellular Tracking Technologies Versions BT3.0, BT3.5 and BT3.75; 44–54 g; Rio Grande, New Jersey, USA, and Ornitela OrniTrack-N38; 36 g; Vilnius, Lithuania), and an aluminum U.S. Geological Survey Bird Banding Laboratory metal leg band (Supplementary Fig. S1) on each bird. Geese were captured and tagged under USGS Bird Banding Permits #21314 and #23792, and Texas A&M University-Kingsville Institutional Animal Care and Use Committee #2015-09-01B. Captive geese were permitted under TAMUK IACUC #2018-01-11 and United States Fish and Wildlife Service Waterfowl Sale and Disposal permit #MB03808D-0. All applicable field methods were carried out in accordance with relevant guidelines and regulations. All animal handling protocols were approved by TAMUK IACUC committees and the USGS Bird Banding Laboratory. When multiple white-fronted geese were captured simultaneously, devices were only placed on adult females or adult males to eliminate the potential of placing devices on mated pairs, thus biasing independent data collection due to monogamous, long-term pair bonds in white-fronted geese. Location duty cycles were set to collect a GPS location every 30 min (i.e., 48/day) and location accuracy was 7.2 and 6.5 m for CTT and Ornitela devices, respectively. Data were uploaded once daily to respective online user interface websites when within areas of GSM coverage. When not in coverage areas, data were stored onboard the device until birds returned to coverage areas. All devices were equipped with a tri-axial ACC sensor which measured G-force (g; CTT devices) or millivolts (mV; Ornitela devices) at a fixed sampling scheme; CTT BT3.5 and Ornitela devices collected ACC data for a duration of 3 s every 6 min at 10 Hz, while BT3.0 devices collected data for a duration of 10 s every 6 min at 10 Hz. Generation BT3.0 devices were subsampled to match the sampling scheme of 3 s bursts before analyses. Ornitela units measured in mV were converted to G-force. We applied manufacturer- and tag-specific ACC calibration to all units, respectively, by collecting ACC data on each possible rotation for all axes when the device was stationary and applying the calibration to the raw ACC values (see Ref.29 for full calibration procedure). All devices recorded temperature in °C at each GPS fix. We censored GPS and ACC data from the time of release until individuals appeared to resume normal movement activity (i.e., roosting and foraging), as geese typically flew to the nearest wetland immediately after release where they remained without leaving while acclimating to wearing devices, which ranged from 1 to 7 days30. We defined the start of the winter period following a southward migratory movement from staging areas in Canada, without additional migratory movements southward below 40° 0′ 00″ N, or from the time of device deployment (minus device acclimation period) until geese made large northward migratory movements, or 28 February if geese remained in wintering areas.
Land cover covariates
We used publicly available spatial landcover data sets (30-m resolution) in combination with remote sensing to create landscape layers using programs Esri ArcMap (version 10.3.1), Erdas Imagine, and Program R (version 3.5.231). We used 2017 and 2018 National Agricultural Statistics Service Cropland Data Layer (CDL) data sets for agricultural crop types and freshwater wetlands, and the 2010 Coastal Change Analysis Program layer for saltwater and coastal wetland classifications29,32. Additionally, we used multi-spectral Landsat 8 Operational Land Imager satellite imagery, with principal component analysis on eight Landsat bands and a normalized difference vegetation index band, and unsupervised classification33,34 to accurately identify and create a spatial layer for peanut fields. We developed this layer for two regions with annual peanut agriculture (i.e., the Rolling/High Plains and South Texas Brushlands) using ground-truthed peanut fields, because the CDL layer did not identify this crop accurately based on our field observations during captures. We achieved > 90% accuracy of peanut identification for each image independently based on annual ground-truthed observations of peanut fields. Finally, we grouped like-habitat categories to reduce the total number of categories to eight: corn, grass/winter wheat, herbaceous wetlands, other grains (i.e., soybeans, sorghum, and peanuts), rice, woody wetlands, open water/unconsolidated shore and other (Supplementary Table S1). White-fronted geese used several ecologically distinct regions in both winters of our study (Fig. 1), where the landscape composition of specific landcover types varied. To account for regional variability, we added region ID as a categorical variable to all GPS locations. Regions included the MAV, Chenier Plain, Texas Mid-coast, Lower Texas Coast, South Texas Brushlands, Texas Rolling/High Plains, and Other (i.e., locations outside of these identified wintering regions; Fig. 1). We used regional shapefiles of Gulf Coast Joint Venture Initiative Areas (Laguna Madre [Lower Texas Coast], Texas Mid-coast, and Chenier Plain35), and Level III Ecoregions (Mississippi Alluvial Valley, Texas Rolling/High Plains, and South Texas Brushlands36) as boundaries to classify data into regions. Due to insufficient and incompatible spatial layers for Mexico, we limited analyses to locations within the US.
Location and acceleration data collection
Remotely determining behaviors of individuals using ACC data is most accurately addressed by developing a training dataset of known behaviors linked with ACC measurements of those behaviors18,37. To develop a training dataset, we collected video footage of two domestic white-fronted geese in Texas, US, and 18 tagged wild Greenland white-fronted geese (A. a. flavirostris) fitted with the same device types and the same data collection scheme, in Wexford, Ireland and Hvanneyri, Iceland during winters 2017–2018. We supplemented wild recordings with behavioral recordings of captive white-fronted geese as a proxy for wild individuals due to difficulty filming wild geese in inclement weather and obstructed video footage, which is common in ACC literature19,20,38,39. To replicate devices placed on wild white-fronted geese and account for potential variation in ACC measurements between device brands, among device versions and individual geese, we deployed three versions of devices used in this study on captive white-fronted geese during filming sessions38,40. We attached tracking devices to captive geese one week prior to video collection to allow geese to adjust to wearing devices. We collected ACC measurements for 3 s bursts, at 1 min intervals, at 10 Hz. We constructed a 149 m2 enclosure in an agricultural field to imitate an environment that wild geese may encounter. We created two enclosure settings allowing captive geese to forage on sprouted winter wheat (~ 2–15 cm) or on a randomly dispersed mixture of grain seeds (corn, wheat, sorghum) to account for both ‘grazing’ of vascular vegetation and ‘gleaning’ of agricultural grains to imitate foraging in wild geese. We used Sony Handycam DCR-SR45 video cameras, matched internal camera clocks with a running Universal Coordinated Time clock, and verbally re-calibrated the current time every 2 min during video footage collection. We filmed 119.5 h of video footage, and matched behavior with recorded ACC measurements by pairing video and device timestamps for each device using JWatcher41 and Program R.
We characterized goose behaviors into four categories: ‘stationary’, ‘walk’, and ‘foraging’ from ground-truthed video footage, and ‘flight’ from visual inspection of the ACC data and consecutive GPS tracks during migration where device-measured speed remained > 4.63 km/h (based on a natural break in the speed density distribution of all GPS locations). Each ACC burst was classified as only one behavior (i.e., a goose that was walking as it foraged was classified as ‘foraging’). We combined wild goose behaviors and captive goose behaviors after identifying minimal differences in ACC burst summary statistics29 for ‘stationary’ and ‘walk’ behaviors. We used ‘graze’ behaviors only from wild geese because of low sample size for captive geese and slight differences in ACC summary statistics between captive and wild geese for this behavior. ‘Glean’ foraging behavior was only classified from captive geese. We then combined ‘graze’ and ‘glean’ behaviors into an overall ‘foraging’ behavior to account for variation in foraging behavior of wild geese, and because machine learning models could not accurately distinguish between the two foraging modes40. We randomly subsampled all behaviors to 150 bursts if our dataset contained at least that many bursts to reduce the risk of artificially increasing prediction accuracy20. We determined there were insufficient differences in ACC signatures between CTT BT3.0 and BT3.5 versions by visual comparison of signatures and summary statistics, and merged all bursts into an overall CTT-specific training data set, and retained CTT- and Ornitela-specific training data sets to account for brand-specific ACC measurement schemes. The final training data sets consisted of 150 stationary, 150 walking, 118 foraging, and 150 flying bursts (CTT), and 150 stationary, 75 walking, 120 foraging, and 150 flying bursts (Ornitela).
We used the training data sets to predict behaviors of tagged, wild white-fronted geese during winter with temporally aligned GPS and ACC data. We used a suite of supervised machine-learning algorithms and selected the algorithm with greatest prediction accuracy based on an 80% training, 20% testing sample approach. We tested random forest, support vector machines, K-nearest neighbors, classification and regression trees, and linear discriminant analysis, all with cross validation in Program R18,29,42. We evaluated models using three metrics defined in Ref.42: (1) overall classification accuracy as the percent of classifications in the test data set that were predicted correctly, (2) precision of assignment, the probability that an assigned behavior in the test data set was correct, and (3) model recall, the probability that a sample with a specific behavior in the test data set was correctly classified as that behavior by the model. Random forests provided the highest overall classification accuracy (95.6% for CTT and 96.0% for Ornitela), as well as high precision and recall for each behavior (CTT range 93.1–99.3, Ornitela range 88.9–100.0%), and therefore we labeled behaviors from wild goose ACC data using the random forests.
Habitat transition model
Our habitat-transition model required temporally matched GPS and ACC datasets. Therefore, we subset all GPS locations to match the time-series of ACC data per individual because devices typically acquired GPS data longer than ACC data before device failure or individual mortality. For each GPS location, we extracted the landcover type and wintering region from spatial layers and retained temperature recorded from the device. To link classified ACC behaviors to GPS locations, we matched ACC timestamps between two GPS locations with the previous GPS timestamp. That is, all ACC bursts between two GPS locations were assigned backward to the previous GPS location. In this way, an individual’s first location is collected in GPS landcover type A, ACC data are collected in 5 bursts, their behaviors are classified and assigned to the first GPS location A and associated landcover type, followed by collection of GPS location B, in which the subsequent 5 ACC bursts are associated to GPS location/landcover type B. In the case of missing GPS locations, we matched ACC bursts to the previous GPS location only if the ACC timestamps were within 60 min of the GPS timestamp, and ACC bursts occurring greater than 60 min after GPS acquisition were removed until the next GPS fix. To account for temporal variation in habitat-behavior relationships, we calculated two continuous covariates representing time-of-day based on the local time associated with the timestamp of each GPS location for each individual. The variable cos(Diel) represented diurnal (negative values) and nocturnal (positive values) periods, and sin(Time) represented midnight until 11:59 a.m. (positive values) and noon until the following 11:59 p.m. (negative values), where high and low values ranged continuously between 1 and − 143. Our temporally matched time series of GPS and ACC data yielded 53,502 GPS locations linked with 300,348 ACC bursts across both winters.
We used a Bayesian Markov model with Pólya-Gamma sampling following43), [cf. Refs.44,45] to determine how transitions between landcover types were influenced by behavior, temperature, time-of-day, and wintering region. The proportion of time spent foraging, walking, and stationary between each successive GPS fix was included as a covariate; flight was not included to reduce multicollinearity due to behavior proportions summing to one. Markov models account for non-independence among observations by assuming that the current state (i.e., landcover type) is dependent upon the previous state, and allow the determination of covariate influences on the probability of transitioning among states through a logistic link function. The transition probability from habitat i to habitat j at time t for individual n is modeled with multinomial logistic regression:
$$begin{aligned} & logitleft( {p_{nijt} } right) = logleft( {frac{{p_{nijt} }}{{p_{niJt} }}} right) = mathop sum limits_{{r in {mathcal{R}}_{j} }} beta_{0jr} Ileft( {Region_{nt} = r} right) + beta_{1j} {text{cos}}left( {Diel_{nt} } right) & quad + beta_{2j} {text{sin}}left( {Time_{nt} } right) + beta_{3ij} Forage_{nt} + beta_{4ij} Walk_{nt} + beta_{5ij} Stationary_{nt} + beta_{6ij} Temperature_{nt} , end{aligned}$$
where ({mathcal{R}}_{j}) is the set of wintering regions (r) where habitat (j) occurs, (Regio{n}_{rnt}) indicated wintering region (r), and (mathrm{cos}left({Diel}_{nt}right)) and (mathrm{sin}({Time}_{nt})) were temporal covariates (described above) for habitat j. Quantities ({Forage}_{nt}, {Walk}_{nt},mathrm{ and }{Stationary}_{nt}) were the scaled (mean = 0, standard deviation = 1) proportion of time spent in each behavior between transitions from habitat i to habitat j, and ({Temperature}_{nt}) was scaled ambient temperature (°C) for transitions from habitat i to habitat j. All coefficients for transitions to the baseline habitat (J) were set to 0 (i.e., ({beta }_{0Jr}) for all (r), ({beta }_{1J}), ({beta }_{2J}), ({beta }_{3iJ}), ({beta }_{4iJ}), ({beta }_{5iJ}),({beta }_{6iJ}), for all (i)). We set the baseline habitat (J) as open water/unconsolidated shore because this habitat is used primarily for both nocturnal roosting and diurnal loafing, included all behaviors, and transitions to all other landcover types were frequent in each region.
The prior for the set of winter region intercepts for each habitat was:
$${beta }_{0jr}sim N({beta }_{0j},{sigma }_{0jr}^{2}),$$
for (rin {mathcal{R}}_{j}), ({beta }_{0j}) was the mean intercept, and ({sigma }_{0jr}^{2}) was set to 100. For ({beta }_{0j}), a vague prior mean 0 and σ2 = 100 was used with an assumed normal distribution.
The Markov model was executed within a Bayesian framework to robustly quantify uncertainty. The Markov model assumed that data were collected at regular time intervals for both GPS (30 min) and ACC (6 min), however imperfect collection by devices created irregular data sets. Therefore, we subsampled GPS locations and constrained time series data to sequences where GPS locations missing > 120 min intervals (i.e., 4 locations) were separated into sequences of regular time series data for each individual46. We extended43 by including a mix of both transition-specific effects (i.e., behaviors, temperature) and habitat-specific effects (i.e., wintering region, cos(Diel), and sin(Time)), where transition-specific effects were allowed to vary for a current habitat state, while habitat-specific effects were not. We included a mix of coefficients because initial model runs indicated that some effects were similar regardless of the current habitat (i.e., were habitat- and not transition-specific decisions). We also incorporated a model feature to exclude estimation of transitions that did not occur either within the dataset as a whole or within each specific wintering region because landcover types varied among them by setting those specific transition probabilities to zero. We centered and standardized all behavior and temperature covariates, sampled 50,000 iterations from the model posterior using one chain, and discarded the first 10,000 iterations as burn-in. We assessed model convergence by evaluating trace plots and setting random initial values, examined autocorrelation plots, and Geweke diagnostics using the R package ‘coda’47,48,49.
Source: Ecology - nature.com