A comparison of baleen whale density estimates derived from overlapping satellite imagery and a shipborne survey

Here we tested the capacity of VHR imagery to provide estimates useful for monitoring whale distribution and densities, using a direct comparison with a ship-based line transect survey to gauge the sighting rates obtained by the satellite platform relative to those of the ship. Our results show that density estimates derived from satellite imagery (0.13 whales per km², CV = 0.38, taken from calm waters) are approximately 0.39 of those estimated from the ship-based survey (0.33 whales per km², CV = 0.09); an encouraging result suggesting that data from satellite imagery have the potential to detect whales at similar levels to a traditional survey method. These results match our expectation that image-derived densities would be lower than those of the ship survey, with the instantaneous nature of image acquisition on the satellite platform likely a strong driver of these differences, in addition to limitations in image resolution and the potential for random fluctuations in local whale densities during the time between acquisition of the satellite images and the vessel-based survey. However, they also demonstrate that satellite surveys have sufficient whale detection capacity to provide a complementary approach to monitoring whale presence in remote regions where regular surveys are difficult.
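The ratio quoted above can be reproduced from the two density estimates. The sketch below is illustrative only: the function name `ratio_with_cv` is hypothetical, and the CV of the ratio uses a first-order (delta-method) approximation that assumes the satellite and ship estimates are independent, which the text does not state.

```python
import math


def ratio_with_cv(d_num, cv_num, d_den, cv_den):
    """Ratio of two density estimates and an approximate CV for that ratio.

    Assumption (not from the source): the two estimates are independent,
    so to first order CV(ratio)^2 ~= CV(numerator)^2 + CV(denominator)^2.
    """
    ratio = d_num / d_den
    cv_ratio = math.sqrt(cv_num ** 2 + cv_den ** 2)
    return ratio, cv_ratio


# Densities (whales per km^2) and CVs as quoted in the text.
ratio, cv_ratio = ratio_with_cv(0.13, 0.38, 0.33, 0.09)
print(f"satellite / ship = {ratio:.2f}, approx. CV = {cv_ratio:.2f}")
```

Under these assumptions the uncertainty of the ratio is dominated by the satellite estimate, since its CV is roughly four times that of the ship survey.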

In setting up this study, we chose an area that (1) is of specific scientific interest in terms of whales; (2) is remote and relatively difficult to access, but has had some whale survey effort; (3) where the environmental conditions are changing; and (4) where whale density and habitat use patterns are required to understand population recovery from exploitation and spatial overlap with the regional fishery for Antarctic krill. We focused on an area where one whale species very strongly predominates (humpback whales) in order that our results have potential use for inference about the density patterns of this species, and as there is a smaller likelihood that species mis-identification would introduce bias. We also chose a sea channel which is relatively sheltered, reducing the likelihood of turbulent sea conditions (particularly wind on sea), which can make satellite images useless for survey. Our site selection considerations highlight the limitations still facing development of VHR as a platform, and we consider these limitations and next steps to address them in the following sections. We propose that this method can be used to investigate spatial and temporal patterns of whale distribution and densities, supplementing existing methods, providing that the limitations of this new method are carefully considered during design and implementation.

Weather conditions, specifically the sea state, impact detectability of whales at sea. Sea state is known to influence the ability of observers to detect animals, with worsening conditions reducing the detection probability. Consequently, effort is typically halted when conditions exceed a predefined limit. In all at-sea surveys, a rising sea state increases the likelihood that the assumption of perfect detection on the track line will be violated. If detection off the track line is impacted by environmental conditions, inclusion of covariates in the detection function can take account of this bias⁴⁴ (up to a cut-off, normally 5). However, if poor sighting conditions impact detection on the track line, alternative methods such as a double-observer/platform study or a mark-recapture approach can be implemented to account for and quantify this bias. For an image-based survey, poorer weather conditions will also reduce the ability of the observer to differentiate FOIs from background noise (e.g. breaking waves, wind lines, etc.)³⁰. This results in fewer features being identified, and lower reported densities. Poor sea state, and associated wind conditions, typically ground aerial surveys, whether manned or UAS-based, or force them to be aborted in flight. Here we show that worsening sea states in the south of the study area on the day that the image was taken (Fig. 2) correspond to lower perceived and estimated densities in these regions. Compared to the northern area, the surface conditions of the southern image were less conducive to the visual detection of FOIs, showing an increased frequency of white-caps and wind lines, possibly because this region is prone to katabatic winds sweeping into the channel. Densities in the south of the survey area, where the sea state was poorer, were 0.4 of those from calmer regions (0.05 versus 0.13 whales per km², CV = 0.58 and 0.38, respectively, Table 2).
To address this effect in the future, an adapted version of a Mark-Recapture Distance Sampling (MRDS) analysis⁴⁵, with multiple observers reviewing the images³³, could be applied to assess variations in detectability as a function of covariates (e.g. sea state), and to investigate the impact of perception bias on whale detection. However, to accurately parameterise a multi-covariate model, several tens, if not hundreds, of whale detections would be needed. Another approach could be to collect multiple images of the same area very close in time (within several seconds to a minute of each other), to quantify the variation in whale detections according to sea state when variation in true whale density is likely to be negligible. In the present study, density comparisons were made using data from the northern (calmer) portion of the imagery only (0.13 whales per km², CV = 0.38, Table 2).
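To make the multiple-observer idea concrete, the sketch below shows the simplest two-reviewer mark-recapture calculation (a Lincoln-Petersen style estimator). It is not the MRDS analysis referred to above, which would model detection as a function of covariates; it assumes the two reviewers detect features independently and with equal probability for every feature. The function name and the counts are hypothetical and are not data from this study.

```python
def double_observer(n_only_a, n_only_b, n_both):
    """Two-reviewer detection probabilities from matched detections.

    n_only_a : features found by reviewer A only
    n_only_b : features found by reviewer B only
    n_both   : features found by both reviewers
    Assumes independent reviewers (a simplification).
    """
    if n_both <= 0:
        raise ValueError("need at least one feature seen by both reviewers")
    seen_a = n_only_a + n_both
    seen_b = n_only_b + n_both
    p_a = n_both / seen_b              # share of B's detections also found by A
    p_b = n_both / seen_a              # share of A's detections also found by B
    p_either = 1 - (1 - p_a) * (1 - p_b)
    n_hat = seen_a * seen_b / n_both   # Lincoln-Petersen estimate of features present
    return p_a, p_b, p_either, n_hat


# Hypothetical counts, for illustration only.
p_a, p_b, p_either, n_hat = double_observer(n_only_a=6, n_only_b=4, n_both=20)
print(f"p_A = {p_a:.2f}, p_B = {p_b:.2f}, combined = {p_either:.2f}, N_hat = {n_hat:.1f}")
```

Running the same calculation separately for calm and rough portions of an image would give a first indication of how perception bias varies with sea state, although with few detections the estimates would be very imprecise.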

When planning satellite imagery analysis, species composition of the focal area needs to be carefully considered, because at present this approach has very limited capacity to differentiate between species when compared to in situ surveys, due to the resolution of the images (~ 30 cm in this study). Our density estimates most likely reflect the density of humpback whales using the area of the Gerlache Strait in summer, because this is the most commonly sighted species in this region, both in previous surveys, where humpback whales comprise > 80% of sightings^15,16, and during the present ship-based survey (> 95% of the groups were identified as humpback whales). During summer periods, other larger baleen whale species tend to be seen further offshore, exhibiting affinity for the more open waters of the Bransfield Strait¹⁵. Smaller cetacean species (e.g. Antarctic minke whales, Balaenoptera bonaerensis, and both Type A and B killer whales^46,47,48, Orcinus orca) co-occur with humpback whales in the Gerlache Strait but are unlikely to be misidentified as humpback whales, either by ship or imagery surveys, because of their differing size, surface behaviours and morphology. Southern right whales, Eubalaena australis, are occasionally sighted in this region too¹⁶. However, head callosities are normally visible in overhead imagery of this species, and offer a clear means of differentiation^30,31. Since other species likely account for at most a very small fraction of the image-survey detections, they are unlikely to comprise a significant component of the overall density estimates.

Obtaining reliable whale density estimates requires adjustments for biases. In addition to perception bias, as mentioned above, another key bias is availability bias⁴⁵. Availability bias is the underestimation of density that occurs as a result of a proportion of animals being underwater, or too deep in the water for detection by the survey platform as it passes a point in the ocean. In the present study, we applied an estimate of surface availability⁴⁹ (where availability is 1 − availability bias), which was derived from dive-recording suction cup tag data from humpback whales in the same region and time, to estimate the proportion of time a whale spends at the surface versus diving. Applying this correction, density was initially estimated as 0.12 whales per km² (CV = 0.38) over the whole region surveyed, and as 0.13 whales per km² (CV = 0.38) in calmer waters. However, we note that when tag data are processed, the analyst determines the threshold at which the animal transitions from being present at the surface to diving⁵⁰. Typically, for baleen whales, dives are classified as such when the whale is > 4–5 m deep for > 20 s. However, with such a threshold, shallow dives of < 4–5 m would go unaccounted for. Currently, the depth to which a whale remains reliably detectable in imagery is highly variable and difficult to estimate⁵¹. As such, when selecting this “surface” threshold we opted for a conservative approach (< 1 m) to filter the tag data to estimate the average time a whale is visible to aerial platforms during daylight hours. We made the assumption that by applying such a shallow threshold, it would be highly unlikely that whales present above this depth would not be visible in the imagery given the resolution available and the likely turbidity of the water on the WAP. Uncertainty around the depth to which an animal remains reliably detectable is an issue for all forms of aerial surveys^26,31,52.
Additional accurate measurements of surfacing time, which include the incorporation of covariates (e.g. time of day, animal behaviour, sea state and turbidity) alongside aerial/satellite surveys, may help to more accurately account for whale surface availability in image-based surveys.
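The availability correction described above can be sketched as follows. This is a minimal illustration, not the analysis used in the study: it assumes an evenly sampled depth record (so the share of samples equals the share of time), ignores the daylight-hours filter, and uses a made-up depth series and uncorrected density. The function names are hypothetical; only the < 1 m threshold comes from the text.

```python
def surface_availability(depths_m, threshold_m=1.0):
    """Proportion of tag depth samples shallower than the surface threshold.

    Assumes evenly spaced samples, so this equals the proportion of time
    the animal is treated as visible from above.
    """
    if not depths_m:
        raise ValueError("depth record is empty")
    at_surface = sum(1 for d in depths_m if d < threshold_m)
    return at_surface / len(depths_m)


def corrected_density(uncorrected_density, availability):
    """Scale an uncorrected density up by the probability of being available."""
    if not 0 < availability <= 1:
        raise ValueError("availability must be in (0, 1]")
    return uncorrected_density / availability


# Hypothetical tag record (metres below the surface), for illustration only.
depths = [0.2, 0.5, 0.8, 3.0, 12.0, 25.0, 30.0, 18.0, 6.0, 0.9]
availability = surface_availability(depths)          # 4 of 10 samples are < 1 m
density = corrected_density(0.05, availability)      # 0.05 is a made-up uncorrected value
print(f"availability = {availability:.2f}, corrected density = {density:.3f} whales per km^2")
```

Raising `threshold_m` towards the 4–5 m dive threshold mentioned above increases the estimated availability and therefore lowers the corrected density, which is why the choice of threshold matters.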

An alternative way to correct for availability bias in satellite images could be to reproduce at a “satellite-scale” the availability analyses carried out for UAV-based surveys^52,53, whereby the surfacing rate of animals is captured in video or multiple overlapping still images, and availability estimated. Logistically, repeating this with satellite imagery may be more challenging, given orbital acquisition windows, but the possibility exists to examine surface availability in overlapping sequential images. However, surface availability is a highly variable process⁵². Thus, to correct for it requires careful consideration on a case-by-case basis, and using adjustments stemming from data collected in spatially and temporally comparable regions. Whilst possible for this study, we note that it is not realistic to assume that estimates of surface availability will be available for all regions. However, we would recommend that steps are taken to obtain such estimates, and for future image-based surveys to apply corrections derived from data in close proximity, both spatially and temporally, to the focal region.

The surface availability adjustments made in the current study are akin to those typically made for ship or manned aerial surveys to account for diving behaviour of the target species⁴⁵. One component of availability that these adjustments cannot cover, however, is the speed of satellite image acquisition. Satellites survey vast areas in seconds, a process which exaggerates the effect of availability bias and, therefore, decreases the number of animals which can be detected in comparison to ship or aerial surveys. Further investigation of the relationship between instantaneous surface availability and image properties is needed, perhaps using images repeatedly collected over short time periods, as mentioned above, in conjunction with other local surveys via ship, UAS or “circle back” methods to simultaneously estimate visibility bias (i.e. a combination of both availability and perception bias)⁵⁴.

Perception bias potentially has differing effects in a ship survey versus an image survey. In a typical line or point transect survey, perception bias is introduced when an animal which is available for detection is missed by the observers for whatever reason⁴⁵. Reasons include, but are not limited to, observer fatigue, worsening environmental conditions, observer inexperience, and chance. However, in an image-based survey, perception bias manifests itself slightly differently, given the extended period available to observers to review the images of the surface of the ocean. This longer review time, and the ability to rest observers without losing in situ survey time, probably reduces the likelihood of an observer missing an animal that was at the surface to be detected, but further research is needed to test that assumption. In this instance, perception bias is reflected in variations in how FOIs are classified. Here we compared the scores given for a randomly chosen subset of the original data with the re-classified scores from three independent reviewers. We found that despite a degree of inter-user variation (Fig. S2), the variance in the scores did not exceed 1 (Fig. S1), and when averaged over the three reviewers, the proportions of these scores classified as either “definite” or “probable” did not deviate significantly from the original data. This suggests that at an FOI level, inter-user variation was present, presumably reflecting each individual’s interpretation of what the feature being considered was. However, averaging over this variability in individual perception still revealed a very similar proportion of FOIs that were classified and scored as “definite” and “probable” whales compared to the main observer. Despite this, we note that variation between observers represents a sizable source of uncertainty associated with manual scanning and classification of imagery, with data being prone to user performance bias.
Automation of the initial image scanning and FOI classification process using machine learning tools could go some way to solving this issue. Automation would not totally remove perception bias, as the parameters offered to define/train the automated systems would themselves be subject to a degree of bias. However, trained algorithms with known, quantifiable uncertainties may provide a more analytically uniform means of scanning, identifying and classifying FOIs. Studies^55,56 have shown remarkable progress in this field over recent years; however, ongoing development and testing are still required to hone these methods and to assess their accuracy when compared to visual observations in challenging conditions.
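The reviewer comparison described above amounts to two summaries: the per-FOI variance in scores across reviewers, and the proportion of scores falling in the “definite” or “probable” classes, averaged over reviewers. The sketch below illustrates both. The numeric scale (1 = possible, 2 = probable, 3 = definite), the FOI labels and the scores are all assumptions made for illustration; they are not taken from the study.

```python
from statistics import mean, variance

# Hypothetical scores: one tuple per FOI, one entry per reviewer.
# Assumed scale (not from the source): 1 = possible, 2 = probable, 3 = definite.
scores = {
    "foi_01": (3, 3, 2),
    "foi_02": (2, 2, 2),
    "foi_03": (1, 2, 1),
    "foi_04": (3, 2, 3),
    "foi_05": (1, 1, 1),
}

# Sample variance of the scores given to each FOI by the three reviewers.
per_foi_variance = {foi: variance(s) for foi, s in scores.items()}
max_variance = max(per_foi_variance.values())

# For each reviewer, the share of FOIs scored "probable" or "definite",
# then the average of those shares across reviewers.
n_reviewers = len(next(iter(scores.values())))
per_reviewer_share = [
    mean(1 if s[r] >= 2 else 0 for s in scores.values())
    for r in range(n_reviewers)
]
mean_share = mean(per_reviewer_share)

print(f"largest per-FOI variance = {max_variance:.2f}")
print(f"mean share scored probable or definite = {mean_share:.2f}")
```

The averaged share can then be compared with the share obtained by the main observer on the same subset; a formal test of that difference is outside the scope of this sketch.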

Further classifications using larger numbers of human observers (e.g. crowd-sourced analysis of imagery⁵⁷) could also provide a useful means of optimising the approach, to: (1) provide a best practice approach to classification which reduces inconsistent interpretation among observers, and (2) provide overall better perception of FOIs averaged over a larger number of observers, reducing error brought about by extreme differences in individual perception. Currently, there are not enough data for either approach to adequately parameterise the observation process. Both require substantially larger data sets of whales in satellite imagery than are currently available, and as such this remains an area of interest for future research.

Satellite imagery as a platform for assessing whale occupancy is in its infancy, but this assessment shows that with careful consideration of location and environmental conditions, it can provide density estimates which could be useful for monitoring whale density patterns in time and space for some populations, complementing existing methodologies. There are a number of key areas in which image-based surveys need to be developed to ascertain their overall comparability to existing techniques, for example via continued data collection, careful consideration of environmental conditions, and further assessment of instantaneous surface availability. However, one area where satellite imagery is distinctly advantageous is that it has the potential to survey very large areas instantaneously, provided weather conditions can be accounted for. This allows for simpler analysis than a traditional line transect survey, as the latter requires extrapolation between transects in order to infer broader-scale density estimates. Reliable species identification would also represent a significant milestone in the development of this method. The results presented here act as a first attempt, and a baseline from which future studies can focus on addressing the aforementioned limitations. Global ecosystems are moving through a period of increased perturbation^23,58,59, where costs and limited access are hampering research. The ability to deploy satellites to collect data offers a distinct advantage over existing survey techniques, which are expensive, use high volumes of fuel and often face significant logistical lead times compared to the effective “real-time” assessments that can be made through remote earth observations. Our results show that VHR satellite imagery has strong potential to be used as a safer, non-invasive means of surveying remote regions, which complements existing approaches.

Source: Ecology - nature.com
