A simple methodological framework was established for testing the BCR model using empirical datasets, consisting of the GPS data of 5 animals. For each of these 5 animals, the three parameters were accordingly tuned using a straightforward estimation procedure. This procedure uses the empirical datasets to infer the parameters’values (Fig. 1). We also used the datasets to assess the model’s reliability—or performance-. We also detail other analyzes that were carried out to ensure the robustness and consistency of the approach, including the deterministic nature of the 5 statistics and a sensitivity analysis. This analysis consists in evaluating the performance of the BCR using a sweep method that produce arbitrary values of the parameters instead of using data-driven estimations. All BCR simulations and the five statistics were performed using MATLAB Version 7.13.0.564 (R2011b).
Framework used for testing the BCR model performance, for one animal. Black lines detail the two operations processed from the GPS dataset. The 3 parameters are estimated from the GPS data and—using these parameters—1000 simulations of the BCR model are computed. No particular operations are associated with the dotted black lines, but they show how the BCR and the GPS dataset are evaluated and compared using the statistics.
Data
The locations of 5 GPS-collared red deer (Cervus elaphus) were gathered at La Petite Pierre National Hunting and Wildlife Reserve (NHWR), in north-eastern of France (48.8321 (Lat.) / 7.3514 (Lon.)). The reserve is an unfenced 2670 ha forest area characteristics by deciduous trees (mostly Fagus sylvatica) in the western part and by coniferous species (mostly Pinus sylvestris and Abies alba) in the eastern part in nature reserve surrounded by crops and pastures. It is located at a low elevation area of the Vosges mountain range, which rises up to 400 m a. s. l. The climate is continental with cool summers and mild winters (mean January and July temperatures of 1.4 and 19.6 (^{circ })C, respectively, data from Phalsbourg weather station, Meteo France, from 2004 to 2017). Three ungulate species are present and mainly managed through hunting in the NHWR: wild boar, red deer and roe deer. The present study focuses on female red deer for test model. A detailed overview of the landscape and surroundings is given in40. The GPS data had regular observation frequencies with high frequency sampling (Table 1). In the following text, we note (X_i = [X_i^{(1)}, X_i^{(2)}]) the locations of the individual with (X_i in {mathbb {R}}^{2}), (i=1,2,ldots ,n) and where (X_i^{(1)}), (X_i^{(2)}) represent the longitude and latitude respectively. We use (t_{i}) ((t_1=0)) as the time elapsed between two successive locations (X_{i-1}), (X_{i}) and
$$begin{aligned} {overline{T}}= dfrac{1}{n} sum _{i=1}^{n} t_i end{aligned}$$
(1)
as the average sampling time. The trajectory of the animal, or ‘path’, was interpolated using linear interpolation between each pair of recorded observations (Fig. 2 and detailed in Supplementary Methods (Eq. 21) and associated Graphic 2). It approximates the animal travels in straight lines at constant velocity between each pair of locations41. The attractor (X_F) of one individual was estimated as the isobarycenter of all recorded locations:
$$begin{aligned} X_F = left[ frac{1}{n}sum _{i=1}^{n} X_i^{(1)}, frac{1}{n}sum _{i=1}^{n} X_i^{(2)} right] end{aligned}$$
(2)
Individual paths of the five red deer. Individual paths of the five red deer. The individual paths are plotted for the five red deer (left panel, a) along with the distribution of the relative turning angles (degrees) in polar plots (right panel, b). An angular value of 0 consists in a straight motion from the previous location, while a relative turning angle of 180 (^{circ }) c corresponds to a turn back.
BCR model
The model aims at estimating the location at the next time step, given the actual location X at step i:
$$begin{aligned} X_{i+1} = f left( X_i right) end{aligned}$$
(3)
such that the function (f(cdot )) is assumed to be representative of the behavior of the animal on sufficiently large time scales. We considered one individual of a given species with no interaction and simulated its movement in continuous space and discrete time in 2 dimensions. The BCR includes 3 parameters coupled with isotropic diffusion:
Diffusion: A random direction with uniform spatial distribution in a 2D plane,
Bias ((p_F)): An increased probability to go to a fixed point named attractor42. This attractor was estimated as the isobarycenter of all recorded locations, defined as (X_F) (Eq. 2). This yields a bias or advection parameter in the direction of (X_F). We use the term ’attraction’ for the bias component of the BCR and the term ’den’ for the attractor. In the data set we study, the den is equivalent to the deer’s bunk.
Correlated component ((p_I)): This parameter increases the probability to move forward, i.e. to perform one step in the direction of the previous step. This is equivalent to a short term bias in movement, when the animal has inertia. We refers to ’inertia’ for the correlated component,
Immobility ((p_s)): We included this as a specific parameter and the movement is stopped for one step. This takes into account the absence of movement between a pair of locations. It can be accredited to technological limitations with the satellite telemetry due to a weak GPS signal strength, possibly due to natural elements: such as when the animal was standing underneath a rock or due to dense clouds, dust particles, mountains or flying objects, such as airplanes). However, this can also be part of the behavior of the animals, during specific times: sleep cycles or foraging for instance. We use (d_{min }) to denote this distance cutoff and set (d_{min }=10)m which corresponds to the magnitude of the error typically found in GPS locations43. We also use (d_{min }) to encapsulate GPS error and peculiar ecological behavior, not associated with (p_I) or (p_F), that are beyond the scope of this study.
The effect of each parameter is detailed in Fig. 3. The typical model contains all three parameters: (p_I), (p_s) and (p_F) for describing animal motion while offering a trade-off between the number of parameters and the description of animal motion.
Simulated animal motions over arbitrary parameter values. Fifty motions of length (n_s=100) steps are simulated and originate from a common centroid (downward-pointing triangle) with increased levels of correlation ((p_I)), immobility ((p_s)) and bias ((p_F)). Both the location of the attractor (X_F) (black dot) and the log-normal parameters controlling the step size distribution are fixed ((mu =3), (sigma ^2 = 1)).
Estimation of the parameters
The estimation of the three parameters for each animal is based on the empirical datasets. We distinguished between the states, where one state is described by the pair (left{ X_{i-1}, X_iright} ) and the situations, where one situation is described by the past ((X_{i-1})), current ((X_i)) and future ((X_{i+1})) locations. Knowing both the state of the animal at a given time step i and its situation—the realization of movement at the next time step (i+1)—allowed for collecting the occurrences of inertia, immobilism and attraction. This could be done provided we account for the variability of the movement: the animal may not be heading exactly toward the den, or performing inertia with an exact angular value of (pi ). Thus we discretized the space around the animal in 8 quadrants at each time step i. For example, if the animal was heading straight forward with a margin of (pm pi /8) then it was considered in the situation of inertia. In other words, the state could fall in a situation of inertia with a margin of (pm pi /8). Such a discretization can be represented as a matrix, depending on the state of the animal, its location and the location of the den at each time step (see Supplementary methods, Eq. 19). In order to gather enough data samples per situation, we arbitrary used angular thresholds of (pi /8) as a convenient trade-off between data scarcity and precision loss. Using smaller threshold values (say (pi /10)) may result in too few samples per situations. Using larger threshold values such as (pi /4) may result in a loss of precision while capturing additional movement samples that may not correspond to the situation.
We first needed to define in which state is the animal at each step i. A state is the 2-tuple containing the previous and actual observation ({X_{i-1}, X_i}). We wanted to distinguish between non-conflicting and conflicting states, where a non-conflicting state is when the animal is in one state only, while a conflicting state is when the animal is in two states at once. We defined two conflicting states:
$$begin{aligned} {mathscr {H}}_{IF}:={i : widehat{ left| X_{i-1} X_i X_F right| } le pi /8} end{aligned}$$
(4)
when the animal was already heading toward the den (X_F), and:
$$begin{aligned} {mathscr {H}}_{Is}:={i : dleft( X_{i-1}, X_iright) le d_{min }} end{aligned}$$
(5)
when the distance between two consecutive observations was too small ((le d_{min }) m.), describing an individual that was already immobile. Such that the subset of non-conflicting states is:
$$begin{aligned} {mathscr {H}}:={1,cdots ,n} – {mathscr {H}}_{IF} – {mathscr {H}}_{Is} end{aligned}$$
(6)
We then needed to assess in which situation the animal was for each corresponding state. A situation is the 3-tuple (left{ X_{i-1}, X_i, X_{i+1}right} ). We defined three subsets of situations corresponding to a straight forward motion (I), no motion (s) and a motion toward the den (F):
$$begin{aligned} I:= {i: pi – pi /8 < widehat{(X_{i-1} X_i X_{i+1})} le pi + pi /8} end{aligned}$$
(7)
$$begin{aligned} s:= {i: dleft( X_i, X_{i+1} right) le d_{min }} end{aligned}$$
(8)
$$begin{aligned} F:= {i: mid widehat{X_{i-1} X_t X_{i+1}} – widehat{X_{i-1} X_i X_F} mid le pi /8} end{aligned}$$
(9)
With (d(cdot ,cdot )) the Euclidean distance between two locations. For the situations in s, we considered that the animal is not performing a motion if the Euclidean distance between two successive locations was (le d_{min })m.
We counted the number of states falling in each situation, for states in ({mathscr {H}}) (Eq. 6). We defined (x_{1}), (x_{2}), (x_{3}) as the empirical proportion of cases corresponding to each situation:
$$begin{aligned} {left{ begin{array}{ll} x_1 = dfrac{# I cap {mathscr {H}}}{# {mathscr {H}}} ; qquad x_1:=dfrac{1+p_I}{chi } [16pt] x_2 =dfrac{# s cap {mathscr {H}}}{# {mathscr {H}}} ; qquad x_2:=dfrac{p_s}{chi }[16pt] x_3 = dfrac{# F cap {mathscr {H}}}{# {mathscr {H}}} ; qquad x_3:=dfrac{1+p_F}{chi } end{array}right. } end{aligned}$$
(10)
with (chi = 8+p_{I}+p_{s}+p_{F}). The values of (x_1), (x_2) and (x_3) were then gathered for each animal. We did not use immobile locations (i.e. distances separating two successive observations must be (> d_{min }) m) for the estimations of (x_1) and (x_3). Solving Eq. (10) for (chi ) with respect to (x_1), (x_2), (x_3) yields:
$$begin{aligned} chi = dfrac{6}{1-(x_{1} + x_{2} + x_{3})} end{aligned}$$
(11)
Plugging in Eq. 10:
$$begin{aligned} {left{ begin{array}{ll} p_{I} = x_1 chi -1 p_{s} = x_2 chi p_{F} = x_3 chi -1 end{array}right. } end{aligned}$$
(12)
Note that we assumed that (p_{IF} = p_I + p_F) in ({mathscr {H}}_{IF}) and (p_{Is} = p_I + p_s) in ({mathscr {H}}_{Is}) as a convenient arrangement and ignoring higher order conflicting cases. Investigating the step-size distribution in the 5 deers, we found a log-normal step size distribution (Supplementary Fig. S1). We then set a log-normal distribution (ln {mathscr {N}}(mu , sigma ^2)) for the step size distribution for the step size in the BCR.
The same estimation procedure was used for configurations using a different number of parameters and quantity (chi ) is accordingly calculated depending on the number of parameters used. It is possible to obtain negative values using this inference method. A parameter with a negative values reflects a direction that is not favored by the animal. In such a case, one should rethink the design of the BCR by changing the parameters (see Supplementary methods, section “negative parameters”). In the subsequent sections, we only consider parameters with positives values.
BCR dynamics
The BCR dynamics for each animal are completely determined by the three parameters (p_F), (p_I), (p_s), taking values in ({mathbb {R}}^{+}), and the step-size distribution. If (p_F = p_I = p_s = 0), the BCR resumes to a typical two-dimensional random walk with a log-normal step size distribution (ln {mathscr {N}}(mu , sigma ^2)). The dynamics can be visualized in Fig. 3 for different values of each parameter. When simulating a step in the model, the motion in ({mathscr {H}}) is described by:
$$begin{aligned} f left( X_i right) = {left{ begin{array}{ll} left{ X_i^{(1)} + d cos (alpha _1) ; X_i^{(2)} + d sin (alpha _1) right} &{} qquad text {if } x in [0,8[ left{ X_i^{(1)} + d cos (alpha _2) ; X_i^{(2)} + d sin (alpha _2) right} &{} qquad text {if } x in [8,8+p_I[ X_i &{} qquad text {if } x in [8+p_I, 8+p_I+p_s[ left{ X_i^{(1)} + d cos (alpha _3) ; X_i^{(2)} + d sin (alpha _3) right} &{} qquad text {else} end{array}right. } end{aligned}$$
(13)
with x, d, (alpha _1) random variables defined as (x sim {mathscr {U}} in [0, chi ]), (d sim ln {mathscr {N}}(mu , sigma )), (alpha _1 sim {mathscr {U}} in [0,2pi ]). Variables (alpha _2), (alpha _3) are related to the angular values (alpha _2 = {{,{mathrm{atan2}},}}(X_{i}^2 – X_{i-1}^2, X_{i}^1 – X_{i-1}^1)), (alpha _3 = {{,{mathrm{atan2}},}}(X_{F}^{(2)} – X_{i}^{(2)}, X_{F}^{(1)} – X_{i}^{(1)})) with ({{,{mathrm{atan2}},}}(y, x)) the four quadrant inverse tangent function (14):
$$begin{aligned} {{,{mathrm{atan2}},}}(y, x) = {left{ begin{array}{ll} arctan left( {frac{y}{x}}right) &{} x> 0, arctan left( {frac{y}{x}}right) +pi &{} x< 0{text {, }}y ge 0, arctan left( {frac{y}{x}}right) -pi &{} x< 0{text {, }}y< 0, +{frac{pi }{2}} &{} x=0{text {, }}y > 0, -{frac{pi }{2}} &{} x=0{text {, }}y < 0, 0 &{} x=0{text {, }}y=0text {.} end{array}right. } end{aligned}$$
(14)
The motion in ({mathscr {H}}_{Is}) is:
$$begin{aligned} f left( X_i right) = {left{ begin{array}{ll} left{ X_i^{(1)} + d cos (alpha _1) ; X_i^{(2)} + d sin (alpha _1) right} &{} qquad text {if } x in [0,8[ X_i &{} qquad text {if } x in [8, 8+p_I+p_s[ left{ X_i^{(1)} + d cos (alpha _3) ; X_i^{(2)} + d sin (alpha _3) right} &{} qquad text {else} end{array}right. } end{aligned}$$
(15)
The motion in ({mathscr {H}}_{IF}) is:
$$begin{aligned} f left( X_t right) = {left{ begin{array}{ll} left{ X_i^{(1)} + d cos (alpha _1) ; X_i^{(2)} + d sin (alpha _1) right} &{} qquad text {if } x in [0,8[ X_t &{} qquad text {if } x in [8, 8+p_s[ left{ X_i^{(1)} + d cos (alpha _2) ; X_i^{(2)} + d sin (alpha _2) right} &{} qquad text {else} end{array}right. } end{aligned}$$
(16)
Statistics for describing animal movement
We simulated (N=1000) BCR and used 5 statistics to assess the model reliability on spatial features including: (i) the distribution of relative turning angles which provides information about the movement of the animal, (ii) the home range which provides information about the spatial density of observations and (iii) observation counts using still and mobile transects, providing information on absolute observation abundance44. A detailed description of each statistic is provided in Supplementary Methods and Fig. S2. The reliability—or performance—was assessed in each animal and studied statistic using two error terms (e_1) and (e_2). Error (e_1) is the (L^1) norm to compare the differences between the statistic ({tilde{mathscr {S}}}) computed over a simulated path, and the statistic (smash {{mathscr {S}}}) computed over the data-set:
$$begin{aligned} e_1 mathrel {mathop :}= sum {text {errors}} = sum _{k=1}^N |smash {{mathscr {S}}}- {tilde{mathscr {S}}}_k| end{aligned}$$
(17)
With (k = 1,ldots , N) the number of simulations of the BCR. Error (e_1) is the sum of absolute differences in the given statistic, and is a natural way of measuring the distance between the statistics computed on the data set and the trajectories generated using the BCR. We also focused on the average relative error (e_2) as an indicator of the sensitivity:
$$begin{aligned} e_2 mathrel {mathop :}= dfrac{1}{N} sum _{k=1}^N dfrac{{tilde{mathscr {S}}}_k}{smash {{mathscr {S}}}} end{aligned}$$
(18)
Distribution of turning angles
For each individual, the distribution of counter-clockwise relative turning angles (widehat{(X_{i-1} X_i X_{i+1})}) was gathered, provided (d(X_{i-1}, X_{i}) > d_{min }) and (d(X_{i}, X_{i+1}) > d_{min }). This means that we only kept the angles from observations that were separated by an Euclidean distance greater than (d_{min }).
Home range
We used an adaptive kernel density estimator (matlab package kde2d—kernel density estimation version 1.3.0.0) as an estimator of the utilization distribution45 to represent the home range of the animal. The approach of Z.I. Botev provided an estimate of observation density using a bivariate (Gaussian) kernel with diagonal bandwidth matrix46. The density was estimated over a grid of (210 times 210) nodes and we computed the home range area (in m2) for various values: 100, 99, 95, 90, 80, …, 20, 10% of the estimated density. Similarly to the distribution of turning angles, we compared each value of the data’s home range against the simulated one.
Dilation
Dilation is generally used to account for the spatial attributes of an object such as to measure an area around the path or the volume of a brownian motion (see Wiener sausage47 and Gromov–Hausdorff distance). In our approach, we use dilation of both simulated and GPS paths for two reasons: to have a real—and comparable—number that accounts for how a trajectory has explored space and because it is natural tool from a census point of view (the dilated path corresponds to the area where the animal can be detected). Each simulated or real path was plotted in binary format in a window and dilated with a disk shape. The window size was set to a huge value in order to encapsulate the dilated path while preventing boundary effects, i.e. the convex envelope of the dilated area did not collide with any window border. We then estimated the surface covered by the dilated path for 100 different sizes of the disk, from disk size 1 to disk size 100. We compared each value of the data’s estimated surface against the simulated one.
Immobile transects
We used still transects that counted the number of times the animal was seen in their line of sight. We arbitrary set the line-of-sight value at 200 m. The number of sightings of each transect was gathered and ordered in decreasing order, thus breaking the spatial dependence. We then compared the bins of the resulting histogram in the data and in the simulated path.
Mobile transects
First, the movement of the animal was linearly interpolated from the GPS data, meaning that between two recorded locations the individual followed a linear path. The speed of the animal between two locations was accordingly reconstructed using the recorded times (t_i) between each location. Second, we used mobile transects as the ecological sampling method, where each transect ‘count’ the intersection between its path and the animal’s one. The mobile transects followed a predefined path at a given constant speed as time increased. The area of vision of each transect was defined as a circle of a given radius. Each time the path of an individual collided with an area of vision, the count of the corresponding transect increased by 1. Two types of movements were used: linear and clockwise rotational transects. The initial locations of both types of transects are (X_1) and (X_F). Both the animal and mobile transects started to move at the same time. At each of the two locations (X_1), (X_F), 8 linear transects moved in the 8 cardinal directions, totalizing 16 transects. For the linear transects, every 10,000 time steps, we set (2 times 8) new transects starting at the same locations and following the same directions. Clockwise rotational transects were rotated around (X_1) and (X_F) using a 500 m radius. When we reached (t_n), we gathered the total count (i.e. the count of all transects). For the two types of transects, we gathered the total count for 6 different lines of sight: 50, 100, 200, 400, 500, 1000 m. and 4 speeds: s/4, s/2, s, 2 s with s the average speed of the animal. We then aggregated the overall count in each of the two types of transects, and compared the results from the data and the simulated path (Supplementary Methods and Fig. S2).
Scale invariance
Several authors pointed out that the temporal resolution of the discretization is of importance: it should be relevant to the considered behavioral mechanisms5,48,49,50. Schlägel and Lewis focused on the quantification of movement models’ robustness under subsampled movement paths49. They found that increased subsampling leads to a strong deviation of the central parameter in resource selection models49,51. They underlined that important quantities derived from empirical data (e.g. parameters estimates, travel distance or sinuosity) can differ based on the temporal resolution of the data49,51. Moreover, Postlethwaite and Dennis highlighted the difficulty of comparing model results amongst tracking-datasets that vary substantially in temporal grain50). Each of the studied dataset has a relatively high sampling rate (roughly 10 m) and a period of study that is appropriate to the analysis of animal movement at the year scale (Table 1). In order to investigate such a possible effect on the BCR dynamic, we changed the sampling rate of the movement path to ensure that the three parameters (p_I), (p_s) and (p_F) are scale invariant. The movement path formed by the GPS observations (X_i) was subsampled (decimated) for each individual. We only kept every (k^{text {th}}) observation starting with the first one and (k in left[ 1,10right] ). For (k=1) the path corresponded to the original one. The time spent between each successive observation was also accordingly reconstructed in order to keep track of ({overline{T}}) in subsampled movement paths. The time between two locations (X_i) and (X_{i+k}) was reconstructed as:
$$begin{aligned} t_{j}’ = sum _{i=j}^{i+(k-1)} t_i end{aligned}$$
(19)
with (j in left[ 1, 1+k, 1+2k, ldots , n-left( k-1 right) right] ). We did not change the value of (d_{min }) as we subsampled the movement path because we designed (p_s) for capturing GPS noise and movements that are associated with peculiar ecological behaviors that are beyond the scope of this study in terms of time and spatial scales (foraging for instance). We then compared the resulting parameters (p_I), (p_F) and (p_s) as the resampling rate k increased.
Fluctuations
Whereas the BCR is a stochastic process, the deterministic aspects of the 5 statistics were tested with an increasing number of steps (n_s). The statistic associated with each realization of the model (a simulated path) is a random variable. If the distribution of these random variables has low concentration (high variance) then it is not a convenient statistic as it cannot be used as a reference for assessing the model’s performance, even when averaging over multiples realizations. On the opposite, if the statistic is deterministic (no fluctuations) it can provide a reliable tool to assess the model’s performance. This was numerically tested over a range of increasing (n_s) values with (n_s = 10^4, 2times 10^4, ldots , 4 times 10^5). For each of those step values, a set of 100 BCR was simulated with parameters (p_I), (p_F) and (p_s) estimated from the first deer (see Table 2) and we studied the variance of the statistics.
Sensitivity analysis
In order to assess whether the estimated parameters are optimal (i.e. providing the best possible performance) and to study parameter scarcity, we also evaluated the performance of the model using arbitrary weight values. We first started by evaluating how using one parameter instead of the three could alter the performance of the model. We then extended this sensitivity analyse by drawing arbitrary values for each parameter from a multi-dimensional square mesh, whose center corresponds to the estimated values of (p_I), (p_s), (p_F), estimated using GPS data (Fig. 1). We additionally used values that are distant from the estimated ones, up to (p_I=3), (p_F=3) and (p_s=5). We tested a total of 151 new configurations with these arbitrary values. For each configuration, we ran 150 simulations and evaluated them using the 5 statistics. The mean error of (|smash {{mathscr {S}}}- {tilde{mathscr {S}}}_k|) and its standard deviation are gathered and plotted for each arbitrary configuration. As a resume, we replicate the framework described in Fig. 1 but we inject arbitrary parameters instead of using data-driven parameterisations.
Application
The proposed model could be used to infer environmental and behavior information from the dataset. We chose to illustrate such an application by trying to detect anomalous voids (or holes) in the spatial territory of the individual using the GPS dataset and Monte-Carlo simulations of the model. Anomalous means that the observed void is not related to the randomness of the movement, but rather related to a geographical artifact. The parameters (p_I), (p_F), (p_s), (mu ) and (sigma ^2) of the BCR were accordingly estimated from the data of each individual, similarly to previous experiments (Fig. 1). A simple heuristic was used to find voids in empirical and simulated paths for each individual: we computed the alpha shape of all locations using a fixed alpha radius of 60 m. This allowed for determining the surface covered by all locations while preserving the voids. We then collected the area of each void provided they had an area of at least 100 m2. We focused on voids near the center of the alpha shape in order to avoid artificial voids, generated by the weak density of locations at the boundaries. We ran 10,000 iterations of the model for each animal and estimated the probability (p_{varnothing }) of finding voids of different sizes in the simulated paths. This probability was then compared to voids found in the GPS datasets and available environmental information was used to determine whether any geographical element(s) could explain the unexpected voids.
Source: Ecology - nature.com