Data sources
Our dataset includes all collared wolves monitored by telemetry (virtually all carrying VHF radio-transmitters) in Wisconsin (WI) between 1979 and April 2012, published previously in full detail5. The dataset comprises 486 wolves fitted with collars by the Wisconsin Department of Natural Resources (WDNR) or its agents, plus 27 collared wolves initially captured in the neighboring state of Michigan that later migrated to Wisconsin (total n = 513 individuals).
Our dataset includes 257 wolves that were reported by the WDNR as ‘lost-to-follow-up’ (LTF) because they were not detected via repeated aerial telemetry. LTF may occur for various reasons: (a) individuals that moved permanently out of telemetry range (i.e., migrants), (b) collars that stopped transmitting because of battery depletion or mechanical failure, and (c) unreported poaching followed by destruction of the transmitter (cryptic poaching). The WDNR suspended telemetry monitoring and assigned LTF status to a wolf when its personnel were unable to detect the collar signal after several months of statewide aerial or ground telemetry. However, the WDNR did not quantify telemetry effort. Dead wolves (n = 242) were recovered by means of the mortality signals emitted from their collars, after legal killing by management agents, or after private citizens reported a dead wolf between monitoring flights41,42. Some LTF wolves were subsequently recovered by means other than telemetry, such as reporting by private citizens. For these cases we used the estimated date of LTF as the endpoint (i.e., death from various causes or disappearance). For fuller treatment of disappearances, detection, and causes of individual wolf death see5.
Estimating conditional hazards
Our analyses exploit the survival history of monitored wolves, measured in days from date of collaring until date of endpoint (i.e., date of death, last monitoring date, or end of our analysis period on April 15, 2012) for each monitored individual.
We modeled endpoint-specific hazard and subhazard in a competing risk framework, which are extensions of survival (or ‘time-to-event’) analyses. Survival analyses estimate ‘time-to-event’ functions, which describe the probability of observing a time interval (T) to an endpoint (‘event’) within a specified analysis time (t) that a subject was observed, such that S(t) = P(T > t). Alternatively, these techniques allow for calculating the hazard function, h_k(t), or the instantaneous rate of occurrence of a particular endpoint k conditional on not experiencing any endpoint until time t30,43,44. We also used the (conditional) hazard functions for all endpoints to estimate the probability of any endpoint up to a particular time T, i.e., the incidence over time for particular endpoints, such as LTF or death by vehicle collision, nonhuman cause, etc.
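The analyses themselves were run in Stata (see "Statistical tests"), but the relationship between S(t) = P(T > t) and the discrete hazard h(t) can be sketched in a few lines. This is an illustrative toy example, not the study's data: the hazard at each day is the number of endpoints that day divided by the number still at risk.

```python
# Illustrative sketch (toy numbers, not the study's data): empirical
# survival S(t) = P(T > t) and discrete hazard h(t) = d_t / n_t, where
# d_t is the count of endpoints at time t and n_t the count at risk.

def survival_and_hazard(event_times, horizon):
    n = len(event_times)
    S, h = {}, {}
    at_risk = n          # n_t: still monitored just before t
    surviving = n        # subjects with T > t
    for t in range(1, horizon + 1):
        d = sum(1 for T in event_times if T == t)
        h[t] = d / at_risk if at_risk else 0.0
        surviving -= d
        S[t] = surviving / n
        at_risk -= d
    return S, h

# Toy cohort: 5 monitored animals reaching endpoints on days 1..5
S, h = survival_and_hazard([1, 2, 3, 4, 5], horizon=5)
```

Note that the hazard is conditional on surviving to t: on day 2 only 4 of 5 animals remain at risk, so one endpoint gives h(2) = 0.25 even though S(2) = 0.6.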
Semi-parametric Cox proportional hazards models estimate how the endpoint-specific h_k(t) changes as a function of survival time and a set of hypothetical covariates; S(t) = e^{−h_k(t, x, β)}, where x is a vector of covariates acting on the hazard and β is a vector of their respective parameter estimates. The estimation of covariate effects on the endpoint-specific hazard is modeled as h_k(t) = h_{0k}(t) e^{(β_1 x_1 + … + β_j x_j)}, where h_{0k}(t) is an unestimated baseline hazard function (i.e., semi-parametric) and the β_j yield estimates of hazard ratios (HRs) for each covariate x_j (HR < 1 represents a reduction in hazard and HR > 1 an increase in hazard).
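The multiplicative structure of the Cox model can be made concrete with a short sketch (coefficients here are invented for illustration, not the study's estimates): the covariates scale the baseline hazard by exp(β·x), so a binary covariate with coefficient β multiplies the hazard by e^β at every time t.

```python
import math

# Hypothetical sketch of the Cox relative-hazard term:
# h_k(t) = h_0k(t) * exp(b1*x1 + ... + bj*xj).
# The coefficient value below is illustrative, not a study estimate.

def relative_hazard(betas, xs):
    """exp(beta . x): the multiplier applied to the baseline hazard h_0k(t)."""
    return math.exp(sum(b * x for b, x in zip(betas, xs)))

# A coefficient of log(2) on a binary covariate gives HR = 2: switching the
# covariate from 0 to 1 doubles the hazard, uniformly over analysis time.
hr = relative_hazard([math.log(2)], [1]) / relative_hazard([math.log(2)], [0])
```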
The estimated HRs, e^{β_j}, are assumed proportional throughout the analysis time, t (they differ only multiplicatively between categorical covariate levels). Furthermore, we include time-varying effects on hazards and incidences by including interactions between covariates and monitoring time (in days) (see “Model covariates” section)43,44,45. These models allow us to estimate covariate effects on the rate of occurrence of an endpoint by looking only at those wolves reaching that endpoint (so that the presence of other endpoints does not affect these estimates). Inference from hazards is limited in the presence of other endpoints competing to bring about the end of monitoring, because interaction between endpoint hazards is unaccounted for. Interactions between endpoints are crucial for our tests of hypotheses that relate legal killing to poaching (i.e., illegal killing, both reported poaching and cryptic poaching through the LTF endpoint) at an individual level.
Estimating unconditional incidences
Competing risk analyses extend standard survival analysis by considering multiple endpoints simultaneously (e.g., multiple causes of death or disappearance). These models are useful for estimating the incidence of a particular endpoint while accounting for the potential occurrence of all other competing endpoints (e.g., the incidence of wolf-poaching in the presence of other causes of death or LTF). In a competing risk framework, individuals can potentially experience one of multiple mutually exclusive endpoints at each interval T. Because only one endpoint can occur first, we refer to the endpoints as ‘competing’ over time, and to the respective probabilities over time as ‘competing risks’.
Rather than estimating the endpoint-specific HRs, as in the Cox model explained above, competing risk analyses estimate the cumulative incidence function (CIF) for each endpoint, defined by the failure probability P(T ≤ t, D = k): the cumulative probability of endpoint k occurring over time in the presence of other competing endpoints30,31,46. Competing risk analysis accounts for the CIF of any endpoint being a function of all endpoint-specific hazards, h_k(t), reflecting the rate of occurrence of that endpoint as well as how it is influenced by others32.
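The dependence of each CIF on all endpoint-specific hazards can be sketched nonparametrically: CIF_k(t) accumulates h_k(s) weighted by overall survival S(s−1), so a high hazard for one endpoint depletes the risk set and caps the incidence of the others. The data below are invented for illustration.

```python
# Hypothetical sketch of a cumulative incidence function for competing
# endpoints: CIF_k(t) = sum over s <= t of h_k(s) * S(s-1), where S is
# survival from *all* endpoints combined. Toy data, not the study's.

def cif(events, endpoint, horizon):
    """events: list of (time, endpoint_label). Returns CIF_k over 1..horizon."""
    n = len(events)
    at_risk = n
    S_prev = 1.0          # S(t-1): probability of no endpoint of any kind
    out, cum = {}, 0.0
    for t in range(1, horizon + 1):
        d_k = sum(1 for (T, k) in events if T == t and k == endpoint)
        d_all = sum(1 for (T, _) in events if T == t)
        if at_risk:
            cum += (d_k / at_risk) * S_prev   # hazard of k, discounted by S
            S_prev *= 1 - d_all / at_risk     # update overall survival
        at_risk -= d_all
        out[t] = cum
    return out

events = [(1, "ltf"), (2, "poached"), (3, "ltf"), (4, "nonhuman")]
cif_ltf = cif(events, "ltf", horizon=4)
```

With every subject observed to an endpoint, the CIFs across endpoints sum to 1 at the horizon; here 2 of 4 toy subjects went LTF, so CIF_ltf(4) = 0.5.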
Although CIFs can be derived by using all endpoint-specific HRs derived from Cox models, such a procedure cannot estimate the magnitude of the relative difference between covariate CIFs for each endpoint. Using Fine-Gray (FG) models instead of Cox models allows us to estimate differences in CIFs for a given endpoint conditional on covariates31,47. FG models are also semi-parametric (i.e., the baseline subhazard function is not estimated) and assume proportionality of subhazard functions, defined as the risk of failure at time t from endpoint k in subjects that have yet to reach an endpoint or have experienced any other endpoint30,31,47. Therefore, FG models estimate the subhazard functions of endpoint-specific CIFs using regression techniques similar to the Cox model (but on the subhazard rather than the hazard, yielding subhazard ratios (SHRs) rather than HRs), although parameter interpretation changes. Subhazards are interpreted as relative incidence in the presence of other endpoints29,30,31.
In sum, endpoint-specific Cox models and their HRs allow us to test the hypothesis that liberalized wolf-killing affected the rate of occurrence of any endpoint; for example, if liberalized killing increased or decreased the rate of occurrence of reported poaching or LTF. By contrast, the FG models and their SHRs allow us to account for the simultaneous presence of all competing endpoints to test if and how much liberalized killing affected the probability and incidence of reported poaching or LTF, in addition to the potential simultaneous effects of other covariates described after data preparation. CIFs allow us to visualize those effects on incidence while considering the prevalence of each endpoint in the population.
Data preparation
For wolves monitored until death, our endpoints classify deaths into 5 mutually exclusive causes, similar to5: “collision” (trauma caused by vehicles; n = 24, 4.7%), “legal” (lethal control by management agencies; n = 32, 6.2%), “poached” (illegal human-caused killing; n = 88, 17.2%), “nonhuman” (causes unrelated to people, e.g., other wolves or diseases; n = 77, 15.0%) and “uncertain” (the wolf carcass was recovered but the cause was difficult to discern at necropsy; n = 21, 4.1%). We added a sixth distinct category, the LTF endpoint (n = 231, 45.0%; see Supplementary Data S1), and we address 40 collared wolves missing endpoint dates (7.7%) below.
We defined the date of endpoint either as the recorded date of death for wolves monitored by telemetry until death (n = 242, 47.2% of sample) or as the date of last telemetry contact for LTF wolves (n = 231, 45.0%). Some of the LTF wolves were found dead later (n = 51), through means other than telemetry (e.g., visual detection), which might bias the date of ‘death’ later, because carcasses were often found long after the actual date of death5. Given the sensitivity of time-to-event models to the accuracy of endpoint dates, and because most LTF wolves (n = 206, 78% of the LTF subsample) were never detected again, restricting the record histories of LTF wolves to the last date of monitoring is an important yet imperfect improvement in measurement precision.
Accounting for all individuals at risk of experiencing an endpoint at any particular time T (the ‘risk set’) is essential for obtaining unbiased estimates of HR, SHR, and CIF43,44,48. Omitting a class of individuals (e.g., LTF) strongly biased risk estimates for four populations of wolves, and in the Wisconsin wolf population specifically, as summarized above5,9.
Model covariates
We included three time-dependent categorical covariates in our models. Time-dependent covariates are variables that change value due to external events at a known date, either for individual wolves or all wolves. For example, we modeled policy period as time-dependent by changing the covariate value at the dates of policy change for a particular individual’s history of monitoring. To assign categorical values of the time-dependent covariates to each monitored wolf, we split each history at each specified date of change in covariate value. We refer to the splits for a monitored wolf as ‘spells’ because they denote briefer time periods within an individual’s total monitoring time T. Thus each value of a time-dependent categorical covariate spans a fixed calendar duration, and the wolves collared during that duration have individual spells that are at most as long as it (see example in Supplementary Table S1).
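The splitting step above can be sketched as follows. This is a minimal illustration with hypothetical dates and covariate values, not the study's actual policy calendar (Supplementary Table S2 holds those): one monitoring interval is cut at each covariate-change date, producing one spell per constant covariate value.

```python
from datetime import date

# Illustrative sketch: split one wolf's monitoring history into 'spells'
# at the dates when a time-dependent covariate changes value.
# Dates and covariate values below are hypothetical examples.

def split_spells(start, end, changes):
    """changes: chronologically sorted (change_date, covariate_value) pairs,
    the first at or before `start`. Returns [(spell_start, spell_end, value)]."""
    spells = []
    for i, (d, value) in enumerate(changes):
        nxt = changes[i + 1][0] if i + 1 < len(changes) else end
        s, e = max(d, start), min(nxt, end)
        if s < e:                      # keep only spells with positive length
            spells.append((s, e, value))
    return spells

# A wolf collared Jan 1, 2003 and monitored to Jun 1, 2003, spanning one
# hypothetical policy flip on Apr 1 (0 = full protection, 1 = lib_kill):
spells = split_spells(
    date(2003, 1, 1), date(2003, 6, 1),
    [(date(2003, 1, 1), 0), (date(2003, 4, 1), 1)],
)
```

Each spell then enters the regression as its own row, clustered on the wolf's identifier (see "Statistical tests").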
Our main covariate of interest is policy that liberalized wolf-killing (lib_kill where 1 = liberalized killing, 0 = full protection). Gray wolves experienced full protection under the ESA from 1979 to March 31, 2003. From April 1, 2003, wolves in WI and MI were subject to 11 alternating sequential, non-overlapping periods in which wolf-killing policies were first liberalized and then restricted for varied durations (Supplementary Table S2)5,12,28. Although WDNR or its agents occasionally killed a wolf during full protection periods, in capture-related accidents or after verified threats to human safety, these cases were rare. By contrast, liberalized killing periods were characterized by an announcement of policy change that allowed managers or private landowners to kill wolves for perceived or verified losses of domestic animals. Liberalized killing periods included:
‘Downlisting’ to threatened status (one period starting April 1, 2003; 670 days, Supplementary Table S2)—allows for lethal control in defense of human property or safety as well as for population management or conservation purposes under ESA section Rule 4(d).
Issuing of sub-permits for “take” (“to harass, harm, pursue, hunt, shoot, wound, kill, trap, capture, or collect, or to attempt to engage in any such conduct” [ESA]) of wolves by managers and sometimes private landowners (periods within 2005 and 2006; 263 days, Supplementary Table S2) under ESA sections 9 and 10.
‘Delisting’, or removing ESA protections entirely (periods of 2007, 2009 and 2012; 701 days, Supplementary Table S2).
Choosing to end our study on April 14, 2012 presented several advantages. First, the WDNR summarized wolf census data and population reports for the preceding year on April 15th. Second, we could compare our results to prior work12,21,49. Third, the April 2012 passage of Act 169 enacting the first wolf-hunting seasons since wolf bounties were terminated in the 1950s50 was a qualitatively different policy signal than those of the liberalized killing periods (Supplementary Table S2).
Our second binary covariate, winter, produced spells for October–March (‘1’, winter) and April–September (‘0’, summer). Our inclusion of this variable is warranted by robust independent evidence of seasonal differences in both overall and endpoint-specific mortality21,51,52. Most LTF endpoints occurred during winter months (143/231 = 62% of LTF wolves, with n = 40 wolves censored).
Our third covariate had three levels for periods with different methods of censusing wolves (method_change). In the winter of 1994–1995 the wolf census methods changed, and did so again sometime between summer 2000 and winter 2003–2004, with changes in monitoring techniques and protocols for data handling18,23. Those changes affected effort and training of wolf census-takers, so might have affected the detection and monitoring effort for collared wolves also. Although there is some ambiguity in the literature over the exact dates of these changes, we opted for the following splits based on year of endpoint: 1979–1994 (‘1’), 1995–2000 (‘2’) and 2001–2012 (‘3’).
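The year-of-endpoint splits chosen above amount to a simple three-level lookup, sketched here for concreteness (the function name is illustrative):

```python
# Minimal sketch of assigning the three-level method_change covariate by
# year of endpoint, using the splits stated in the text.

def census_method(year):
    if 1979 <= year <= 1994:
        return 1    # original census methods
    if 1995 <= year <= 2000:
        return 2    # after the 1994-1995 change
    if 2001 <= year <= 2012:
        return 3    # after the second change in techniques/protocols
    raise ValueError("year outside the 1979-2012 analysis period")
```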
Imputation for 2012 records without endpoint data
We right-censored the interval for individuals that did not experience an endpoint during the analysis period (start of monitoring until April 14, 2012), meaning they are considered as part of the risk set from collaring until the end of the analysis period. Our dataset includes 40 wolves without attributed mortality or disappearance data, because we could not find their endpoint (i.e., cause of death or disappearance) in public records after December 31st, 2011 (see supplementary data files for WDNR monitoring records for 2012 and 2013). Although 14 of those 40 wolves were later found dead in mortality reports between May 2012 and October 2013 (Supplementary Data S2), those reports did not reveal the last date of monitoring but rather a lengthy interval without a record of monitoring followed by discovery of the dead animal. Therefore, we conservatively censored those 14 wolves at April 14, 2012 to consider them as within the risk set (monitored) for the corresponding time intervals, yet without experiencing an endpoint during that time. For the other 26 censored wolves that vanished from public records after December 31st, 2011, our repeated requests for data went unfulfilled by the WDNR. We submitted four separate requests to the WDNR (1 open records request, 1 state Natural Heritage Inventory request, a personal request to research staff who have published analyses with those data, and we enlisted the aid of the lieutenant governor and governor’s offices to request those data) for all collared wolves monitored in the state in 2012. Therefore, we simulated their endpoints in three scenarios described below.
We imputed either an LTF or censored status to the n = 26 wolves with missing endpoints based on the rationale that if any of these monitored wolves had suffered a death rather than a disappearance, their deaths should have appeared in mortality records spanning January 1, 2012 (when missing records for these wolves begin) to October 31, 2013, as happened with the 14 wolves with missing endpoints but found in subsequent mortality reports and therefore censored. Thus, the two remaining possibilities are that these wolves were either LTF or survived our analysis period and beyond October 31, 2013, which means they must be included in the risk set but be censored for endpoint analyses because they do not fit our 6 categories of endpoint.
For our simulation scenarios, we developed a series of FG imputation models (IMs) with LTF as the endpoint of interest using the above covariates for the full, original dataset (i.e., with all 40 wolves with missing data classified as ‘censored’ on April 14, 2012). We then used the most appropriate FG model (accounting for Akaike’s Information Criterion (AIC), Bayesian Information Criterion (BIC), log-likelihood (LL), parsimony and proportionality assumptions) to predict the probability of LTF incidence by April 14, 2012 for each of the 26 wolves. Because we assumed all 26 wolves were alive on April 14, 2012 (i.e., each is imputed their maximum survival time) for all models, whereas they might actually have disappeared earlier in 2012, our approach is conservative because it likely underestimates the relative incidence of LTF.
To calculate each of the 26 wolves’ probability of LTF, we first calculated the baseline CIF for the best IM and multiplied it by the exponentiated lib_kill and winter coefficients in Model 2 to obtain a probability of LTF for each wolf during winter periods with liberalized killing, as wolves experienced during the period beginning January 28, 2012 until April 14, 2012 (Supplementary Table S2). Then we ran 1,000 simulations for each wolf going LTF, using a Bernoulli distribution with the LTF probability for each wolf as the probability of success (‘LTF’). For our MAIN imputation scenario, each wolf was imputed an LTF endpoint (on April 14, 2012) if the simulated occurrence of the LTF endpoint was higher than the probability of LTF predicted from the FG model (used as an imputation threshold), p_{i,SIM}(ltf) > p_{i,FG}(ltf); otherwise we censored that wolf. To analyze sensitivity to the MAIN scenario, we also developed HIGH and LOW scenarios following a similar imputation process (Supplementary Data S3). For the HIGH imputation scenario, we increased the threshold probability for going LTF by half the difference between p_{i,FG}(ltf) and 1: p_{i,HI}(ltf) = p_{i,FG}(ltf) + (1 − p_{i,FG}(ltf))/2. For the LOW imputation scenario, we decreased the threshold probability for going LTF by half of p_{i,FG}(ltf): p_{i,LO}(ltf) = p_{i,FG}(ltf) − p_{i,FG}(ltf)/2. The LOW and HIGH scenarios provided bounds on the point estimates of relative hazard and incidence for the simulated LTF process in the MAIN scenario.
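The Bernoulli simulation and the three scenario thresholds can be sketched as below. The probability value and function name are hypothetical; the actual probabilities came from the FG imputation model, and the study's runs were done in Stata.

```python
import random

# Hypothetical sketch of the MAIN/HIGH/LOW imputation step for one wolf:
# simulate 1,000 Bernoulli draws with its FG-predicted LTF probability,
# then compare the simulated proportion against a scenario threshold.
# p_fg here is an illustrative value, not a model output.

def impute_ltf(p_fg, n_sims=1000, scenario="MAIN", seed=0):
    thresholds = {
        "MAIN": p_fg,                     # threshold = FG-predicted probability
        "HIGH": p_fg + (1 - p_fg) / 2,    # raise the bar for imputing LTF
        "LOW":  p_fg - p_fg / 2,          # lower the bar for imputing LTF
    }
    rng = random.Random(seed)             # fixed seed for reproducibility
    p_sim = sum(rng.random() < p_fg for _ in range(n_sims)) / n_sims
    return "LTF" if p_sim > thresholds[scenario] else "censored"
```

With an illustrative p_fg of 0.4, the LOW threshold (0.2) almost always imputes LTF and the HIGH threshold (0.7) almost never does, which is how the two scenarios bound the MAIN result.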
Statistical tests
To model all endpoint-specific HRs, we employed Lunn & McNeil’s (1995) Method B data augmentation. Namely, we augmented the data by our 6 endpoint categories and employed stratified joint Cox multiple regression (on endpoint) with interactions between covariates and each endpoint. Our initial model included all interactions; we then discarded the weakest first, following model selection procedures, while retaining the policy variable in all models (7 models total, Supplementary Table S5). This approach provides covariate HRs for all endpoints, and we use those HRs to estimate the CIFs by policy period for each endpoint. We model the HR distributions of covariates for our poaching and LTF endpoints by exponentiating a normal distribution parameterized with the covariate coefficients and standard deviations obtained from their respective Cox models.
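The data-augmentation step can be illustrated as follows: each wolf's record is duplicated once per endpoint category, with an event indicator set only on the row matching the observed endpoint (all rows have event = 0 for a censored wolf). Field names here are illustrative, not the study's variable names.

```python
# Sketch of Lunn & McNeil-style augmentation for a stratified joint Cox
# fit: one row per endpoint category per wolf. Field names are illustrative.

ENDPOINTS = ["collision", "legal", "poached", "nonhuman", "uncertain", "ltf"]

def augment(record):
    """record: dict with 'id', 'time' (days monitored), and 'endpoint'
    (one of ENDPOINTS, or None if censored). Returns 6 stratified rows."""
    rows = []
    for k in ENDPOINTS:
        row = dict(record)
        row["stratum"] = k                              # endpoint stratum
        row["event"] = 1 if record["endpoint"] == k else 0
        rows.append(row)
    return rows

rows = augment({"id": 7, "time": 412, "endpoint": "poached"})
```

Fitting a Cox model stratified on `stratum`, with covariate-by-stratum interactions, then yields endpoint-specific HRs from a single joint regression.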
We also ran separate FG univariate and multivariate models, which mirrored the best stratified joint Cox model, to estimate FG CIFs for each endpoint. We compared CIFs visually to identify the most appropriate CIF model estimate (Cox or FG), following53.
Given that the complete survival history of each individual wolf was split into multiple spells, we clustered all our regression analyses using a unique identifier for each wolf, following methods in54. Clustering on wolf identity accounts for auto-correlation (i.e., all spells are analyzed within-subjects) and avoids pseudo-replication of observations. We evaluated compliance with the proportionality assumptions for each model through the inclusion of time-varying coefficients (tvc). A tvc is an interaction of each parameter with analysis time which models changes in that parameter’s effect over time; i.e., non-proportionality. Endpoint-specific models with significant non-proportionality in a covariate (tvc) cannot provide predictions of risk or incidence due to computational limitations. We further verified proportionality using Schoenfeld residuals43,44,48, which should show a random pattern against time as evidence of compliance with the PH assumption. We selected the best regression models considering AIC, BIC, LL, parsimony, and compliance with model assumptions. When we set aside a best model because of non-proportionality, we present and discuss the best model but our CIF calculations use parameters from the same Cox or FG model without the tvc. We visually assessed goodness-of-fit for each selected endpoint-specific Cox model by Cox-Snell residual plots, which should show the Nelson-Aalen cumulative hazard closely following the line of Cox-Snell residuals if the model is a good fit. We conducted all statistical analyses in Stata 15 (StataCorp, College Station, TX, 2015; see supplementary materials for statistical code).
Source: Ecology - nature.com