Data processing
In 2019, the CETUS data spanning 2012 to 2017 was published open access through the Flanders Marine Institute (VLIZ) IPT portal and distributed by EMODnet and OBIS, as a first version of the CETUS dataset9. The data collected between 2018 and 2019 was prepared following the same procedures as the 2012–2017 data9. Methods for photographic verification/validation and for evaluating the experience of the Marine Mammal Observers (MMOs) were applied (see below), in order to include new variables on data quality in an updated version of the dataset. The CETUS dataset has since been updated, with a 2nd version available10. It comprises data from 2012 to 2017, with two new columns on the observers’ experience, “most experienced observer” and “least experienced observer”, and a new column associated with validation of the sightings’ identifications, “photographic validation”. The results presented here correspond to the analysis of the data from 2012 to 2019, and the open-access dataset will soon be further updated with the 2018–2019 data.
Photographic verification
All the former MMOs who had participated in the CETUS Project between 2012 and 2019 were contacted and asked to provide any available photographic or video records of cetaceans collected during their periods on board. The collection of sighting images was not a requirement of the CETUS protocol, so these records were obtained opportunistically, with availability and quality depending on several factors: whether observers on board had personal cameras, camera quality, and the intention of the observer taking the photograph (e.g., for aesthetic or identification purposes).
The images obtained were organized in a folder hierarchy from the year down to the day of recording. However, not all the images had metadata down to the day of recording, so these were inserted at the most appropriate level of the folder hierarchy. For each set of records corresponding to a single-taxon sighting, the photos/videos with the best quality or framing (i.e., those that allowed for an easier species identification) were selected for that sighting. The remaining photos/videos were consulted only in case of doubt (e.g., to look for additional details that could help with the identification).
Verification consisted of matching the photographic/video records with the sighting registers in the dataset. Ideally, the file metadata was used for this process. However, the date and/or time in the file metadata were often wrong, missing, or in a different time zone. In these cases, a conservative methodology was applied, using all available information to match as many sightings as possible. An estimation of the time lag was attempted, based on at least two obvious matches between photographs/videos and dataset registers (e.g., the unique sighting of the day, close to the boat, with an easy/obvious identification). When this was not possible, further evaluation consisted of assessing whether the match between sighting and image record was unambiguous, accounting for unique complementary information on the sighting (e.g., the number of animals or the side of the sighting was unique for that day and/or for that species/group).
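The time-lag matching described above can be sketched as follows. This is an illustrative Python sketch, not the project's actual code (no matching script is published with the dataset); the function names, the median-offset choice, and the 15-minute tolerance window are assumptions for the example.

```python
from datetime import datetime, timedelta

def estimate_time_lag(obvious_matches):
    """Estimate the camera-clock offset from a few unambiguous
    photo-to-sighting matches (e.g., the only sighting of the day).
    Each match is a (photo_time, sighting_time) pair; the median
    offset is taken to be robust to a single poor match."""
    offsets = sorted(photo - sighting for photo, sighting in obvious_matches)
    return offsets[len(offsets) // 2]

def match_photo(photo_time, lag, sightings, tolerance=timedelta(minutes=15)):
    """Shift the photo timestamp by the estimated lag and return the
    sighting closest in time, if within the tolerance window."""
    corrected = photo_time - lag
    best = min(sightings, key=lambda s: abs(s - corrected))
    return best if abs(best - corrected) <= tolerance else None
```

With the offset estimated from two obvious matches, the remaining photographs of that cruise can be assigned conservatively: anything outside the tolerance window stays unmatched rather than being forced onto the nearest sighting.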
Photographic validation
After the verification process, validation of the matched records was carried out to confirm or correct the species identification of sightings in the 1st version of the CETUS dataset (i.e., as reported by the MMOs on board). For the more dubious identifications through the photo/video records, the validation approach involved discussion among four experienced observers of the CETUS team. In cases where no consensus was reached, an external expert on cetacean identification was also consulted. Identifications made through the photographic/video records required 100% certainty, and these were then compared with the cetacean identifications provided in the 1st version of the CETUS dataset. The occurrence records with original misidentifications of cetaceans, as well as those records where validation allowed identification to a lower taxonomic rank, were then corrected in the 2nd version of the dataset (e.g., a delphinid sighting validated as common dolphin now appears as common dolphin). A new column, “photographic validation”, was added to the dataset with the following categories: “yes” (i.e., validated with photograph/video), “no” (i.e., not validated with photograph/video), and “to the family” (i.e., validated only to the family level).
For further analysis, specifically for the modelling of identification success (see below), registers were considered “completely validated” if the photographic/video identification process could be completed to the species level (then differentiating whether or not the original identification from the MMOs was correct). For Globicephala sp. and Kogia sp., validation to the genus was considered complete, since the species within each genus are hard to differentiate visually, especially at sea.
Creating data quality criteria: evaluating MMOs’ experience
Quality criteria were created to evaluate the MMOs’ experience based on the information collected from their curricula vitae (CVs) (alumni MMOs provided as many CVs as the years of their participation in CETUS). The following quality criteria were considered: (i) experience at sea, (ii) experience with cetacean identification, (iii) the number of species worked with, and (iv) experience working with the CETUS Project protocol. Each criterion was ranked from 0 to 5, and the four ranks were summed to generate an evaluation score, on a scale of 0 to 20, attributed to each MMO (Table 4).
The MMO evaluations were computed for each cruise (i.e., the trip from one port to another), considering the experience of the MMOs based on the CV obtained for that year, plus the experience acquired during CETUS participation in previous cruises of that year. Since, in most cases, the team of observers on board each cruise consisted of two MMOs, two final evaluation scores were attributed to each cruise in the 2nd version of the CETUS dataset, in two new columns: “most experienced observer” (MEO) and “least experienced observer” (LEO). On the rare occasions when there was only one observer on board, the evaluation of that single observer was included under the column “most experienced observer”, leaving the column “least experienced observer” as “NULL”. To investigate the experience of the MMOs on board, both individually and cumulatively (LEO + MEO), the combination of score values was computed per cruise. These were then trimmed to unique combinations of evaluation scores.
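The scoring scheme above reduces to simple arithmetic: four criteria ranked 0–5 are summed to a 0–20 score per observer, and the two per-cruise columns are derived from the team's scores. The sketch below illustrates this in Python (the criterion values in the test are hypothetical examples, not real CV data, and the function names are assumptions):

```python
def evaluation_score(sea, cetacean_id, n_species, cetus_protocol):
    """Sum the four quality criteria (each ranked 0-5) into a 0-20
    evaluation score, as described for Table 4."""
    criteria = (sea, cetacean_id, n_species, cetus_protocol)
    if not all(0 <= c <= 5 for c in criteria):
        raise ValueError("each criterion must be ranked 0-5")
    return sum(criteria)

def cruise_columns(scores):
    """Derive the two dataset columns for a cruise from the scores of
    the MMOs on board; with a single observer, LEO is NULL (None)."""
    ranked = sorted(scores, reverse=True)
    return {"most experienced observer": ranked[0],
            "least experienced observer": ranked[1] if len(ranked) > 1 else None}
```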
The names of the observers, previously presented in the online dataset for each cruise, were removed for anonymity purposes, since ancillary information regarding their experience is now provided.
Model fitting
Two Generalized Additive Models (GAMs) were fitted to assess bias in the number of sightings recorded per survey and in the identification success of cetacean species. Details for each model are presented below. Both models were fitted in R (Version 4.1.0). Prior to modelling, Pearson correlations were calculated between all pairs of explanatory variables considered for each model (see below) to exclude highly correlated variables, using a threshold of 0.7524,25,28. Since the variables regarding the MMOs’ experience were correlated (LEO or MEO correlated with cumulative and mean experience, and cumulative experience correlated with mean experience – Supplementary Fig. S3), these variables were not included in the first fitting stage (backward selection) but were added later through forward selection (see below). Multicollinearity among explanatory variables was measured through the Variance Inflation Factor (VIF), with a threshold of 3 (Supplementary Table S4)24,25,29. After removing the MMO evaluation scores, no multicollinearity was observed, so all the other variables were kept for the first fitting stage.
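The two screening steps above (pairwise Pearson correlation, then VIF) are standard and easy to reproduce. The analyses were run in R, but a minimal Python/NumPy sketch of both computations is shown here for illustration; the VIF is obtained by regressing each variable on all the others and computing 1 / (1 − R²):

```python
import numpy as np

def pairwise_pearson(X):
    """Absolute pairwise Pearson correlations between columns of X (n x p)."""
    return np.abs(np.corrcoef(X, rowvar=False))

def vif(X):
    """Variance Inflation Factor per column of X: regress each variable
    on the others (with intercept) and compute 1 / (1 - R^2)."""
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        y = X[:, j]
        A = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ beta
        r2 = 1.0 - resid.var() / y.var()
        out[j] = 1.0 / (1.0 - r2)
    return out
```

A variable would be flagged under the paper's rules when any of its pairwise correlations exceeds 0.75 or its VIF exceeds 3.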
For model selection, backward selection was applied to oversaturated models18,24,25,30,31. The Akaike Information Criterion (AIC) was used as a measure of goodness of fit, choosing the model with the lowest AIC value at each step of the fitting process, i.e., comparing nested models (the larger model incorporating one more explanatory variable than the smaller model). If the AIC difference between the two models was less than 2, an Analysis of Variance (ANOVA), through a chi-square test, was used to check whether the difference between models was significant24,25,32. If the difference was not statistically significant (p > 0.05), the simpler (smaller) model was kept. Through a forward selection process, the variables regarding the MMO evaluation scores were then added, one at a time, to the best model obtained in the backward selection. After comparing the resulting models with each other (separate variables for LEO + MEO vs. cumulative evaluation vs. mean evaluation), the best model according to the AIC value was kept. A final backward selection process was then applied.
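The per-step decision rule can be written down compactly. The sketch below is an illustrative Python rendering of the rule as stated (AICs and the likelihood-ratio p-value are taken as precomputed inputs, e.g. from `AIC()` and `anova(..., test = "Chisq")` in R); the function name is an assumption:

```python
def choose_nested(aic_small, aic_large, lr_pvalue):
    """Select between two nested models at one selection step:
    prefer the lower-AIC model, but when the AIC difference is below 2,
    fall back on the chi-square likelihood-ratio p-value and keep the
    smaller model unless the difference is significant (p <= 0.05)."""
    if abs(aic_small - aic_large) >= 2:
        return "small" if aic_small < aic_large else "large"
    if lr_pvalue > 0.05:
        return "small"  # difference not significant: keep simpler model
    return "small" if aic_small < aic_large else "large"
```

Applied iteratively, this rule drops one term at a time during backward selection and adds one experience variable at a time during the forward step.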
All GAMs were fitted with the “mgcv” package (https://cran.r-project.org/web/packages/mgcv), and a maximum of four splines (k = 4) was chosen to limit the complexity of the smoothers describing the effects of the explanatory variables25,31. If a spline was close to linear (with estimated degrees of freedom of ~1), the smooth term was removed and a linear function was fitted. To check model quality, the “gam.check” function was used to inspect the diagnostic plots and the adequacy of the number of splines (Supplementary Figs. S5 and S6). The existence of influential data points was assessed (with a threshold of 0.25 for the hat values), as well as the correlation between model residuals and explanatory variables. In both final models, the number of splines was adequate and there were no influential data points or clear correlations between residuals and explanatory variables (Supplementary Figs. S7 and S8)24,32.
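The influential-point check relies on the hat (leverage) values, i.e., the diagonal of H = X(X'X)⁻¹X'. In R these come from `influence()`/`hatvalues()`; a minimal NumPy sketch of the same diagnostic, using a QR decomposition for numerical stability, is:

```python
import numpy as np

def hat_values(X):
    """Diagonal of the hat matrix H = X (X'X)^-1 X' for a full-rank
    design matrix X (intercept column already included). With X = QR,
    H = Q Q', so the diagonal is the row-wise sum of squares of Q."""
    Q, _ = np.linalg.qr(X)
    return np.sum(Q ** 2, axis=1)

def influential_points(X, threshold=0.25):
    """Indices of observations whose leverage exceeds the threshold
    (0.25 in the paper's diagnostics)."""
    return np.flatnonzero(hat_values(X) > threshold)
```

As a sanity check, the hat values always sum to the number of model parameters, which the test below exploits.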
Bias modelling of number of sightings
To assess the bias parameters for the number of sightings recorded per survey (i.e., a full day of monitoring, from sunrise to sunset), the following detectability factors were considered as explanatory variables: weather conditions (i.e., the minima and maxima of sea state, wind state, and visibility), the experience of the MMOs (i.e., the evaluation scores of the least and the most experienced observers, as well as the mean and cumulative evaluations of the MMOs’ experience), and kilometres sampled “on-effort” (i.e., during periods of active survey). Sampling periods were divided into “on-effort” and “off-effort” conditions based on four meteorological variables: sea state (Douglas scale), wind state (Beaufort scale), visibility (measured on a categorical scale from 0 to 10 and estimated from the distance to the horizon line and possible reference points at a known range, e.g., ships with an automatic identification system, > 1000 km), and the occurrence of rain (see Supplementary Table S9)10. For the model fitting, only “on-effort” sampling periods were considered. Given that the response variable was count data, a Poisson distribution was tested first (with a log link function). The resulting oversaturated model was then checked for overdispersion through a Pearson estimator. Since it tested positive for overdispersion (φ = 1.99), a negative binomial distribution (with a log link function) was fitted instead.
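The Pearson estimator used here is the sum of squared Pearson residuals divided by the residual degrees of freedom; values well above 1 (such as the φ = 1.99 reported) indicate overdispersion and motivate the switch from Poisson to negative binomial. A minimal NumPy sketch of the estimator for a Poisson model (where the variance equals the mean μ):

```python
import numpy as np

def pearson_dispersion(y, mu, n_params):
    """Pearson overdispersion estimator for a Poisson GLM:
    phi = sum((y - mu)^2 / mu) / (n - p). Under a well-specified
    Poisson model, phi should be close to 1; phi >> 1 suggests
    overdispersion and a negative binomial fit instead."""
    y = np.asarray(y, float)
    mu = np.asarray(mu, float)
    pearson_resid = (y - mu) / np.sqrt(mu)
    return np.sum(pearson_resid ** 2) / (len(y) - n_params)
```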
Bias modelling of identification success
A binary response variable, based on the success of the species identification for each sighting, was generated, and a model with a binomial distribution (with a logit link function) was fitted. As in the previous model, only “on-effort” records were used. The total number of unsuccessful identifications across the dataset (the 0s of the model) was extrapolated from the proportion of wrong identifications obtained in the validation process. To calculate this proportion, only completely validated sightings registered “on-effort” were used. Proportions were computed and extrapolated separately for Odontoceti and Mysticeti. This resulted in 78 unsuccessful identifications in delphinids, plus 17 misidentifications in baleen whales, i.e., a total of 95 “on-effort” sightings randomly selected from the dataset were defined as unsuccessful identifications (0s in the response variable for model fitting). The remaining records were considered successful identifications (1s in the response variable). To assess the bias parameters for identification success, the following independent variables were considered in the analysis: the group (i.e., Group A: Odontoceti sightings, excluding sperm whale (Physeter macrocephalus); and Group B: Mysticeti sightings, plus sperm whale), the group size (i.e., the best estimate of the number of animals in a sighting, from the observer’s perspective), sighting distance (i.e., a relative measure according to the scale of the binoculars), weather conditions (i.e., the sea state, wind state, and visibility at the time of each sighting), and the experience of the MMOs (i.e., the evaluation scores of the least and most experienced observers, as well as the mean and cumulative scores of the MMO teams). Groups A and B were defined according to cetacean morphology. However, since sperm whales bear closer similarity to Mysticeti species, they were included in Group B21,33.
This categorization was mostly based on body size, as this is likely the main morphological factor influencing identification. Group A comprises species with an average length of less than 10 metres, while Group B includes the larger species, over 10 metres (Mysticeti plus P. macrocephalus)33. Since different binoculars – with two different reticle scales – have been used in the CETUS Project, it was necessary to standardize binocular distances to the same scale.
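The construction of the binary response variable described above reduces to two operations: extrapolating a group-wise failure count from the validated proportion, and randomly assigning that many 0s across the on-effort sightings. The sketch below illustrates this in Python; the counts in the test are hypothetical, not the dataset's actual validation figures, and the function names are assumptions:

```python
import random

def n_failures(n_validated, n_wrong, n_total_sightings):
    """Extrapolate the number of unsuccessful identifications for a
    group (computed separately for odontocetes and mysticetes) from
    the proportion of wrong IDs among completely validated sightings."""
    return round(n_wrong / n_validated * n_total_sightings)

def binary_response(sighting_ids, n_zeros, seed=42):
    """Randomly mark n_zeros sightings as unsuccessful (0) and the
    rest as successful (1), mirroring the response construction.
    A fixed seed keeps the example reproducible."""
    rng = random.Random(seed)
    zeros = set(rng.sample(sighting_ids, n_zeros))
    return {s: 0 if s in zeros else 1 for s in sighting_ids}
```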
Source: Ecology - nature.com