
Machine learning-based global maps of ecological variables and the challenge of assessing them

The quality of global maps can be assessed in different ways. One way is global assessment, where a single statistic is chosen to summarize the quality of the entire map: the map accuracy. For a categorical variable, this can be the probability that, for a randomly chosen location on the map, the map value corresponds to the true value. For a continuous variable, it can be the RMSE, describing the expected difference between the mapped value and the true value at a randomly chosen location on the map. When a probability sample, such as a completely spatially random sample, is available for the area for which a global assessment is needed, map accuracy can be estimated model-free (also called design-based), e.g., by using the unweighted sample mean in the case of a completely spatially random sample. This circumvents modeling of spatial correlation because observations are independent by design6,9. The approach is called model-free because no model needs to be assumed for the distribution or correlation of the data: the only source of randomness is the random selection of sample units from a target population. If a probability sample is not available, this approach cannot be used, and the accuracy assessment automatically becomes model-based10, which involves modeling a spatial process by assuming distributions, taking spatial correlations into account, and choosing estimation methods accordingly.
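The design-based estimators mentioned above are simple averages over the probability sample. A minimal sketch, with hypothetical validation data (the locations, values, and class labels are invented for illustration):

```python
import math

# Hypothetical validation data from a completely spatially random sample:
# (mapped_value, true_value) pairs observed at randomly selected locations.
continuous_pairs = [(3.2, 3.0), (1.8, 2.5), (4.1, 4.0), (2.9, 3.3)]

# Design-based RMSE estimate for a continuous variable: the unweighted
# mean of squared errors is unbiased because locations were chosen at
# random, so no spatial-correlation model is needed.
rmse = math.sqrt(
    sum((m - t) ** 2 for m, t in continuous_pairs) / len(continuous_pairs)
)

# Categorical case: overall accuracy is the fraction of sample locations
# where the mapped class matches the true class.
categorical_pairs = [("forest", "forest"), ("urban", "crop"),
                     ("water", "water"), ("crop", "crop")]
accuracy = sum(m == t for m, t in categorical_pairs) / len(categorical_pairs)

print(round(rmse, 4), accuracy)
```

With non-uniform inclusion probabilities (e.g., stratified random sampling), the unweighted mean would be replaced by a design-weighted estimator.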

Using naive random n-fold or leave-one-out cross-validation methods (or a simple random train-test split) to assess global model quality (usually equated with map accuracy) makes sense when the data are independent and identically distributed. When this is not the case, dependencies between nearby samples, e.g., in a spatial cluster, are ignored and result in biased, overly optimistic model assessment, as shown in, e.g., Ploton et al.5. Alternative cross-validation approaches, such as spatial cross-validation5,11, that control for such dependencies are the only way to overcome this bias. Different spatial cross-validation strategies have been developed in the past few years, all aiming to create independence between cross-validation folds5,11,12,13. Cross-validation creates prediction situations artificially by leaving out data points and predicting their values from the remaining points. If the aim is to assess the accuracy of a global map, the prediction situations created need to resemble those encountered when predicting the global map from the reference data (see Fig. 1 and discussions in Milà et al.14). This occurs naturally when reference data were obtained by (completely spatially random) probability sampling, but in other cases it has to be forced, for instance by controlling spatial distances (spatial cross-validation). Such forcing, however, is only possible when the distances in space that need to be resembled are available in the reference data. In the extreme case where all reference data come from a single cluster, this is impossible. When all reference data come from a small number of clusters, larger distances are available between clusters but do not provide substantial independent information about variation associated with these distances. Lack of information about larger distances means that we cannot assess the quality of predictions associated with such distances and cannot properly estimate global quality measures.
Alternative approaches such as experiments with synthetic data15 or a validation using independent data at a higher level of integration16 would then be options to support confidence in the predictions.
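The contrast between naive and spatial cross-validation can be sketched with hypothetical clustered reference data (cluster ids and fold counts are invented for illustration; real implementations would define folds by spatial distance or blocking):

```python
import random

# Hypothetical clustered reference data: 20 points in 4 spatial clusters.
points = [{"id": i, "cluster": i // 5} for i in range(20)]

# Naive random K-fold: points are shuffled and dealt round-robin into
# folds, so points from the same cluster typically appear in both train
# and test folds, and test errors look optimistically small.
random.seed(0)
shuffled = random.sample(points, len(points))
naive_folds = [shuffled[i::4] for i in range(4)]

# Spatial (leave-cluster-out) cross-validation: each fold is one whole
# cluster, forcing test points to be spatially separated from the
# training data, as in the prediction situations of global mapping.
clusters = sorted({p["cluster"] for p in points})
spatial_folds = [[p for p in points if p["cluster"] == c] for c in clusters]

# Each spatial fold contains exactly one cluster by construction.
assert all(len({p["cluster"] for p in fold}) == 1 for fold in spatial_folds)
```

With few clusters, leaving out a whole cluster is the only way to mimic between-cluster prediction distances, which is precisely why a handful of clusters yields so little information about global accuracy.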

Another way of accuracy assessment is local assessment: for every location, a quality measure is reported, again as a probability or a prediction error. Such a local assessment predicts how close the map value is to newly observed values at particular locations. If the measurement error is quantified explicitly, a smoother, measurement-error-free value may be predicted10. If the model accounts for change of support10,17, prediction errors may refer to average values over larger areas such as 1 × 1, 5 × 5, or 10 × 10 km grid cells. Examples of local assessment in the context of global ecological mapping are modeled prediction errors using Quantile Regression Forests18 and the mapped variance of predictions made by ensembles1,2. Neither of these examples quantifies spatial correlation or measurement error, or addresses change of support, as is done in other modeling frameworks19. Because the spatial process is not modeled, the local accuracy estimates presented in the global studies that motivated this comment are disputable.
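The ensemble-variance style of local assessment can be illustrated with a minimal sketch (the location names and per-model predictions are hypothetical):

```python
import statistics

# Hypothetical per-location predictions from an ensemble of 5 models
# (e.g., bootstrap replicates), for three map locations.
ensemble_predictions = {
    "loc_a": [2.1, 2.0, 2.2, 1.9, 2.0],   # models agree: low spread
    "loc_b": [4.5, 1.2, 3.8, 0.9, 5.1],   # models disagree: high spread
    "loc_c": [3.0, 3.1, 2.9, 3.0, 3.2],
}

# Local assessment: mapped prediction (ensemble mean) plus a
# per-location uncertainty (ensemble standard deviation). Note that
# this quantifies ensemble disagreement only; it ignores spatial
# correlation, measurement error, and change of support.
local_uncertainty = {
    loc: (statistics.mean(preds), statistics.stdev(preds))
    for loc, preds in ensemble_predictions.items()
}
```

The spread reported at each location reflects only how much the ensemble members disagree; a location where all models are confidently wrong would still receive a small mapped "uncertainty", which is one reason such local estimates are disputable without a model of the spatial process.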

The difference between global and local assessment is striking, particularly for global maps. A single global number averages out all variability in prediction errors and obscures any differences, e.g., between continents or climate zones; it is of little value for interpreting the quality of the map for particular regions.


Source: Ecology - nature.com
