Coastal phytoplankton blooms expand and intensify in the 21st century
Data sourcesMODIS on the Aqua satellite provides a global coverage within 1–2 days. All images acquired by this satellite mission from January 2003 to December 2020 were used in our study to detect global coastal phytoplankton blooms, with a total of 0.76 million images. MODIS Level-1A images were downloaded from the Ocean Biology Distributed Active Archive Center (OB.DAAC) at NASA Goddard Space Flight Center (GSFC), and were subsequently processed with SeaDAS software (version 7.5) to obtain Rayleigh-corrected reflectance (Rrc (dimensionless), which was converted using the rhos (in sr−1) product (rhos × π) from SeaDAS)41, remote sensing reflectance (Rrs (sr−1)) and quality control flags (l2_flags). If a pixel was flagged by any of the following, it was then removed from phytoplankton bloom detection: straylight, cloud, land, high sunglint, high solar zenith angle and high sensor zenith angle (https://oceancolor.gsfc.nasa.gov/atbd/ocl2flags/). MODIS level-3 product for aerosol optical thicknesses (AOT) at 869 nm was also obtained from OB.DAAC NASA GSFC (version R2018.0), which was used to examine the impacts of aerosols on bloom trends.We examined the algal blooms in the EEZs of 153 ocean-bordering countries (excluding the EEZs in the Caspian Sea or around the Antarctic), 126 of which were found with at least one bloom in the past two decades. The EEZ dataset is available at https://www.marineregions.org/download_file.php?name=World_EEZ_v11_20191118.zip. The EEZs are up to 200 nautical miles (or 370 km) away from coastlines, which include all continental shelf areas and offer the majority of marine resources available for human use. Regional statistics of algal blooms were also performed for LMEs. LMEs encompass global coastal oceans and outer edges of coastal currents areas, which are defined by various distinct features of the oceans, including hydrology, productivity, bathymetry and trophically dependent populations42. Of the 66 LMEs identified globally, we excluded the Arctic and Antarctic regions and examined 54 LMEs. The boundaries of LMEs were obtained from https://www.sciencebase.gov/catalog/item/55c77722e4b08400b1fd8244.We used HAEDAT to validate our satellite-detected phytoplankton blooms in terms of presence or absence. The HAEDAT dataset (http://haedat.iode.org) is a collection of records of HAB events, maintained under the UNESCO Intergovernmental Oceanographic Commission and with data archives since 1985. For each HAB event, the HAEDAT records its bloom period (ranging from days to months) and geolocation. We merged duplicate entries when both the recorded locations and times of the HAEDAT events were very similar to one another, and a total number of 2,609 HAEDAT events were ultimately selected between 2003 and 2020.We used the ¼° resolution National Oceanic and Atmospheric Administration Optimum Interpolated SST (v. 2.1) data to examine the potential simulating effects of warming on the global phytoplankton trends. We also estimated the SST gradients following the method of Martínez-Moreno33. As detailed in ref. 33, the SST gradient can be used as a proxy for the magnitude of oceanic mesoscale currents (EKE). We used the SST gradient to explore the effects of ocean circulation dynamics on algal blooms.Fertilizer uses and aquaculture production for different countries was used to examine the potential effects of nutrient enrichment from humans on global phytoplankton bloom trends. Annual data between 2003 and 2019 on synthetic fertilizer use, including nitrogen and phosphorus, are available from https://ourworldindata.org/fertilizers. Annual aquaculture production includes cultivated fish and crustaceans in marine and inland waters, and sea tanks, and the data between 2003 and 2018 are available from https://ourworldindata.org/grapher/aquaculture-farmed-fish-production.The MEI, which combines various oceanic and atmospheric variables36, was used to examine the connections between El Niño–Southern Oscillation activities and marine phytoplankton blooms. The dataset is available from https://psl.noaa.gov/enso/mei/.Development of an automated bloom detection methodA recent study by the UNESCO Intergovernmental Oceanographic Commission revealed that globally reported HAB events have increased6. However, such an overall increasing trend was found to be highly correlated with recently intensified sampling efforts6. Once this potential bias was accounted for by examining the ratio between HAB events to the number of samplings5, there was no significant global trend in HAB incidence, though there were increases in certain regions. With synoptic, frequent, and large-scale observations, satellite remote sensing has been extensively used to monitor algal blooms in oceanic environments17,18,19. For example, chlorophyll a (Chla) concentrations, a proxy for phytoplankton biomass, has been provided as a standard product by NASA since the proof-of-concept Coastal Zone Color Scanner (1978–1986) era43,44. The current default algorithm used to retrieve Chla products is based on the high absorption of Chla at the blue band45,46, which often shows high accuracy in the clear open oceans but high uncertainties in coastal waters. This is because, in productive and dynamic coastal oceans, the absorption of Chla in the blue band can be obscured by the presence of suspended sediments and/or coloured dissolved organic matter (CDOM)47. To address this problem, various regionalized Chla algorithms have been developed48. Unfortunately, the concentrations of the water constituents (CDOM, sediment and Chla) can vary substantially across different coastal oceans. As a result, a universal Chla algorithm that can accurately estimate Chla concentrations in global coastal oceans is not currently available.Alternatively, many spectral indices have been developed to identify phytoplankton blooms instead of quantifying their bloom biomass, including the normalized fluorescence line height21 (nFLH), red tide index49 (RI), algal bloom index47 (ABI), red–blue difference (RBD)50, Karenia brevis bloom index50 (KBBI) and red tide detection index51 (RDI). In practice, the most important task for these index-based algorithms is to determine their optimal thresholds for bloom classification. However, such optimal thresholds can be regional-or image-specific20, due to the complexity of optical features in coastal waters and/or the contamination of unfavourable observational conditions (such as thick aerosols, thin clouds, and so on), making it difficult to apply spectral-index-based algorithms at a global scale.To circumvent the difficulty in determining unified thresholds for various spectral indices across global coastal oceans, an approach from a recent study to classify algal blooms in freshwater lakes52 was adopted and modified here. In that study, the remotely sensed reflectance data in three visible bands (red, green and blue) were converted into two-dimensional colour space created by the Commission Internationale del’éclairage (CIE), in which the position on the CIE chromaticity diagram represented the colour perceived by human eyes (Extended Data Fig. 1a). As the algal blooms in freshwater lakes were manifested as greenish colours, the reflectance of bloom-containing pixels was expected to be distributed in the green gamut of the CIE chromaticity diagram; the stronger the bloom, the closer the distance to the upper border of the diagram (the greener the water).Here, the colour of phytoplankton blooms in the coastal oceans can be greenish, yellowish, brownish, or even reddish53, owing to the compositions of bloom species (diatoms or dinoflagellates) and the concentrations of different water constituents. Furthermore, the Chla concentrations of the coastal blooms are typically lower than those in inland waters, thus demanding more accurate classification algorithms. Thus, the algorithm proposed by Hou et al.52 was modified when using the CIE chromaticity space for bloom detection in marine environments. Specifically, we used the following coordinate conversion formulas to obtain the xy coordinate values in the CIE colour space:$$begin{array}{c}x=X/(X+Y+Z)\ y=Y/(X+Y+Z)\ X=2.7689R+1.7517G+1.1302B\ Y=1.0000R+4.5907G+0.0601B\ Z=0.0000R+0.0565G+5.5943Bend{array}$$
(1)
where R, G and B represent the Rrc at 748 nm, 678 nm (fluorescence band) and 667 nm in the MODIS Aqua data, respectively. By contrast, the R, G and B channels used in Hou et al.52 were the red, green and blue bands. We used the fluorescence band for the G channel because, for a given region, the 678 nm signal increases monotonically with the Chla concentration for blooms of moderate intensity21, which is similar to the response of greenness to freshwater algal blooms. Thus, the converted y value in the CIE coordinate system represents the strength of the fluorescence. In practice, for pixels with phytoplankton blooms, the converted colours in the chromaticity diagram will be located within the green, yellow or orange–red gamut (see Extended Data Fig. 1a); the stronger the fluorescence signal is, the closer the distance to the upper border of the CIE diagram (larger y value). By contrast, for bloom-free pixels without a fluorescence signal, their converted xy coordinates will be located in the blue or purple gamut. Therefore, we can determine a lower boundary in the CIE two-dimensional coordinate system to separate bloom and non-bloom pixels, similar to the method proposed by Hou et al.52.We selected 53,820 bloom-containing pixels from the MODIS Rrc data as training samples to determine the boundary of the CIE colour space. These sample points were selected from nearshore waters worldwide where frequent phytoplankton blooms have been reported (Extended Data Fig. 2); the algal species included various species of dinoflagellates and diatoms20. A total of 80 images was used, which were acquired from different seasons and across various bloom magnitudes, to ensure that the samples used could almost exhaustively represent the different bloom conditions in the coastal oceans.We combined the MODIS FLHRrc (fluorescence line height based on Rrc) and enhanced red–green–blue composite (ERGB) to delineate bloom pixels manually. The FLHRrc image was calculated as:$$begin{array}{c}{{rm{FLH}}}_{{rm{Rrc}}}={R}_{{rm{rc}}678}times {F}_{678}-[{R}_{{rm{rc}}667}times {F}_{667}+({R}_{{rm{rc}}748}times {F}_{748}\ ,,-,{R}_{{rm{rc}}667}times {F}_{667})times (678-667)/(748-667)]end{array}$$
(2)
where Rrc667, Rrc678 and Rrc748 are the Rrc at 667, 678 and 748 nm, respectively, and F667, F678 and F748 are the corresponding extraterrestrial solar irradiance. ERGB composite images were generated using Rrc of three bands at 555 (R), 488 (G) and 443 nm (B). Although phytoplankton-rich and sediment-rich waters have high FLHRrc values, they appear as darkish and bright features in the ERGB images (Extended Data Fig. 3), respectively21. In fact, visual examination with fluorescence signals and ERGB has been widely accepted as a practical way to delineate coastal algal blooms on a limited number of images21,54,55. Note that the FLHRrc here was slightly different from the NASA standard nFLH product56, as the latter is generated using Rrs (corrected for both Rayleigh and aerosol scattering) instead of Rrc (with residual effects of aerosols). However, when using the NASA standard algorithm to further perform aerosol scattering correction over Rrc, 20.7% of our selected bloom-containing pixels failed to obtain valid Rrs (without retrievals or flagged as low quality), especially for those with strong blooms (see examples in Extended Data Fig. 4). Likewise, we also found various nearshore regions with invalid Rrs retrievals. By contrast, Rrc had valid data for all selected samples and showed more coverage in nearshore coastal waters. The differences between Rrs and Rrc were because the assumptions for the standard atmospheric correction algorithm do not hold for bloom pixels or nearshore waters with complex optical properties57. In fact, Rrc has been used as an alternative to Rrs in various applications in complex waters58,59.We converted the Rrc data of 53,820 selected sample pixels into the xy coordinates in the CIE colour space (Extended Data Fig. 1a). As expected, these samples of bloom-containing pixels were located in the upper half of the chromaticity diagram (the green, yellow and orange–red gamut) (Extended Data Fig. 1a). We determined the lower boundary of these sample points in the chromaticity diagram, which represents the lightest colour and thus the weakest phytoplankton blooms; any point that falls above this boundary represents stronger blooms. The method to determine the boundary was similar to Hou et al.52: we first binned the sample points according to the x value in the chromaticity diagram and estimated the 1st percentile (Q1%) of the corresponding Y for each bin; then, we fit the Q1% using two-order polynomial regression. Sensitivity analysis with Q0.3% (the three-sigma value) resulted in minor changes ( 1/3 AND y > y2), it is classified as a ‘bloom’ pixel.Depending on the local region and application purpose, the meaning of ‘phytoplankton bloom’ may differ. Here, for a global application, the pixelwise bloom classification is based on the relationship (represented using the CIE colour space) between Rrc in the 667-, 678- and 754-nm bands derived from visual interpretation of the 80 pairs of FLHRrc and ERGB imagery. Instead of a simple threshold, we used a lower boundary of the sample points in the chromaticity diagram to define a bloom. In simple words, a pixel is classified as a bloom if its fluorescence signal is detectable (the associated xy coordinate in the CIE colour space located above the lower boundary). Histogram of the nFLH values from the 53,820 training pixels demonstrated the minimum value of ~0.02 mW cm−2 μm−1 (Extended Data Fig. 1a), which is in line with the lower-bound signal of K. brevis blooms on the West Florida shelf21,47. Note that, such a minimum nFLH is determined from the global training pixels, and it does not necessarily represent a unified lower bound for phytoplankton blooms across the entire globe, especially considering that fluorescence efficiency may be a large variable across different regions. Different regions may have different lower bounds of nFLH to define a bloom, and such variability is represented by the predefined boundary in the CIE chromaticity diagram in our study. Correspondingly, although the accuracy of Chla retrievals may have large uncertainties in coastal waters, the histogram of the 53,820 training pixels shows a lower bound of ~1 mg m−3 (Extended Data Fig. 1a). Similarly to nFLH, such a lower bound may not be applicable to all coastal regions, as different regions may have different lower bounds of Chla for bloom definition.Although the MODIS cloud (generated by SeaDAS with Rrc869 0.12) and Index2 ( More