Network motifs shape distinct functioning of Earth’s moisture recycling hubs

UTrack atmospheric moisture tracking model

The UTrack atmospheric moisture tracking model is a novel Lagrangian model that tracks parcels of moisture forward in three-dimensional space⁹. UTrack is the first moisture tracking model to employ ERA5 reanalysis data⁸. The basic principle of the model is that for each mm of evaporation, a certain number of “moisture parcels” is released and subsequently tracked through time and space. At each time step, the moisture budget of the parcels is updated based on evaporation and precipitation at the respective time and location, meaning that for each location of evaporation, a detailed image of the “footprint” of evaporation can be created. All types of evapotranspiration are included, and here is simply called evaporation.

For each mm of evaporation, 100 parcels are released 50 hPa above the surface height at random spatial locations within each 0.25° grid cell of input evaporation data. The trajectories of the parcels are based on interpolated three-dimensional ERA5 wind speed and wind direction data, which also have a horizontal resolution of 0.25° and consist of 25 pressure layers in the atmospheric column. The spatial coordinates of each parcel are updated at each time step of 0.1 h. Also, at each time step, there is a certain probability that a parcel is redistributed randomly along the atmospheric column such that, on average, every parcel is redistributed every 24 h (see methods section Moisture recycling dataset: validation and uncertainties below for further details). The relative probability of the new position in the atmospheric column is scaled with the vertical moisture profile. Parcels are tracked for 30 days or until 99% of their moisture has precipitated.

To allocate a certain fraction of any moisture parcel to precipitation events at the current time and location, ERA5 hourly total precipitation (P) and total precipitable water (TPW) are interpolated to the simulation time step of 0.1 h. The amount of moisture that precipitates at a certain time step equals the amount of precipitation at that time step over the total precipitable water in the atmospheric water column (P/TPW). Specifically, precipitation A in mm per time step at location x, y at time t that originated as evaporation from a particular source is described as:

$${A}_{x,y,t}={P}_{x,y,t}frac{{W}_{{{{{{{{rm{parcel,t}}}}}}}}}{E}_{{{{{{{{rm{source,t}}}}}}}}}}{{{{{{mathrm{TP}}}}}}{{{{{{mathrm{W}}}}}}}_{x,y,t}}$$

(1)

with P being precipitation in mm at time step t, W_parcel,t (mm) the amount of moisture in the parcel of interest, E_source,t the fraction of moisture present in the parcel at time t that has evaporated from the source, and TPW (mm) the precipitable water in the atmospheric water column. The moisture content of parcels is updated each time step using evaporation and precipitation at its current location:

$${W}_{{{{{{{{rm{parcel,t}}}}}}}}}={W}_{{{{{{{{rm{parcel,t-1}}}}}}}}}+({E}_{{{{{{{{rm{x,y,t}}}}}}}}}-{P}_{{{{{{{{rm{x,y,t}}}}}}}}})frac{{W}_{{{{{{{{rm{parcel,t-1}}}}}}}}}}{{{{{{mathrm{TP}}}}}}{{{{{{mathrm{W}}}}}}}_{{{{{{{{rm{x,y,t}}}}}}}}}}$$

(2)

The moisture (fraction) that has evaporated from the source is updated as follows:

$${E}_{{{{{{{{rm{source,t}}}}}}}}}=frac{{E}_{{{{{{{{rm{source,t-1}}}}}}}}}{W}_{{{{{{{{rm{parcel,t-1}}}}}}}}}{A}_{x,y,t}}{{W}_{{{{{{{{rm{parcel,t}}}}}}}}}}$$

(3)

The moisture flow m_ij from evaporation in cell i to precipitation in cell j is aggregated on a monthly basis (mm/month), where [x, y] ∈ j becomes:

$${m}_{ij}=mathop{sum }limits_{t=0}^{{{{{{{{rm{month}}}}}}}}}{A}_{j,t}frac{{E}_{i,t}}{{W}_{i,t}}$$

(4)

with W_i,t being the tracked amount of moisture from the source cell i at time t. These simulations were performed for all evaporation on Earth during 2008–2017. The results were then aggregated on a mean-monthly basis to produce monthly means, and stored at 0.5 degree resolution. This dataset can be downloaded from ref. 53. For details on how to process the data, we refer to the accompanying paper by ref. 3.

Moisture recycling dataset: validation and uncertainties

As with all moisture recycling simulations, the ones used in this study rely on a number of assumptions that may affect the moisture recycling rates. All offline moisture recycling models use atmospheric model output to simulate the path of evaporation through the atmosphere to the location where it precipitates. Therefore, there are two sources of uncertainty that affect the moisture recycling estimates: (1) the quality of the atmospheric forcing data and (2) the assumptions in the moisture tracking model.

Tuinenburg and Staal (2020)⁹ explored these sources of uncertainty for a number of locations globally. The effects of a decrease in the quality of the atmospheric forcing data were most important in the vertical resolution of the atmospheric data: the forcing data should have enough vertical levels to resolve any vertical shear in atmospheric moisture transport. If the forcing data has a low vertical resolution, the moisture tracking model is forced with the mean atmospheric flow over a number of layers. In many regions, there are surface moisture flows that are in a different direction than the moisture flow aloft, resulting in a very small vertically integrated transport, which would distort the simulation of atmospheric moisture transport. Compared to the vertical resolution of the forcing data, the horizontal and temporal resolutions were less important in order to keep errors as small as possible. Because of the importance of this high vertical resolution, it was recommended⁹ to use the ERA5 dataset⁸ as its forcing dataset, as this currently is the atmospheric reanalysis dataset with the highest vertical resolution.

In addition, the change of ERA-interim to ERA5 resulted in a much better land-surface scheme with monthly varying vegetation and better bare soil evaporation. Also, many more observations are assimilated, which results in a better precipitation product compared to ERA-interim. Following this, the tracking of atmospheric moisture using ERA5 allows for a better quality of the atmospheric moisture cycle than before. But, of course, also the already high horizontal resolution of 0.5^∘ × 0. 5^∘ has the limitation that very localized moisture recycling features like orography and locally varying land use cannot be resolved. Out of these reasons, the uncertainty in the evaporation estimates is a lot larger than that in the precipitation estimates, because of the lack of global evaporation measurements and the difficulty in measuring evaporation in general^54,55.

There are also uncertainties due to the assumptions in the moisture tracking model that can be split into a category of simulation assumptions and physical assumptions. The simulation assumptions include model formulation (Eulerian vs. Lagrangian model set-ups), time step lengths, number of parcels released, and types of interpolation. Of these simulation assumptions, the most important aspects were the model formulation, with Lagrangian models better able to resolve complex terrain and atmospheric flows. For the other model assumptions (see methods section UTrack atmospheric moisture tracking model), it was chosen to simulate with the highest level of precision before any more information (e.g., more parcels) would no longer affect evaporation footprints and moisture recycling statistics (see ref. 9 for further details). Even though the ERA5 dataset is known to have some precipitation biases in the tropics, the results of UTrack (forced by ERA5) have recently been validated across the tropics by independent measurements of deuterium excess, a measure of a stable isotope that depends on terrestrial precipitation recycling⁵⁶. UTrack estimates and isotope-based estimates of terrestrial moisture recycling corresponded, especially in tropical rainforests (Kendall’s (overline{tau }=0.52)⁵⁶), which are found to be moisture recycling hubs on a global scale.

Network construction

Motivated by the network-like structure of the data, we here employ a network perspective to study moisture flows. Hence, nodes in such a network are grid cells on a regular spherical grid and edges represent the moisture transported. However, interpreting the dataset directly as a weighted network is neither computationally feasible nor does a weighted network allow for identifying motifs, the building blocks of complex networks¹⁷. We, therefore, aim for an approach utilizing an unweighted network.

As shown in Fig. S1, moisture recycling strengths are heterogeneously distributed over multiple powers of magnitude. Thus, it is not appropriate to just withdraw the moisture transport volume and include all moisture transport connections within the dataset as equal and unweighted links. Instead, we attempt to highlight the strongest moisture pathways and, thus, the backbone of the Earth’s moisture recycling network. To, on the one hand, include as much moisture volume as possible but also keep the absolute volume of moisture transport represented per edge as similar as possible, we decided to include edges in a data-adaptive way: we step-wise include links starting from the strongest and stop this procedure as the total moisture transport volume exceeds the variable threshold ρ. The resulting edges then represent the backbone of the global moisture recycling network. In the main text, we have shown the results for a network where all edges together represent ρ = 25% of the total moisture transport. Here and in the SI figures, we add a sensitivity analysis for ρ = 20% and ρ = 30% and find that the results are stable for this broader range of total moisture volume thresholds.

Network measures and motifs

The topology of an unweighted network is typically encoded in an adjacency matrix A with elements a_ij indicating if there exists an edge from node i to node j (a_ij = 1) or not (a_ij = 0). The degree k of a node i describes the number of adjacent edges pointing towards or away from node i. Hence, the in-degree is defined by²⁵

$${k}_{{{{{{mathrm{in}}}}}}}^{i}=mathop{sum }limits_{i=1}^{N}{a}_{ji}$$

(5)

and out-degree is defined by²⁵

$${k}_{{{{{{mathrm{out}}}}}}}^{i}=mathop{sum }limits_{i=1}^{N}{a}_{ij}.$$

(6)

To further analyze the topology of a network and, in particular, the local connectivity patterns, we study the presence of three motifs—the feed-forward loop, the neighboring loop, and the zero loop.

The feed-forward loop (FFL) consists of three nodes, A, B, and C, where nodes A and C are directly connected via a detour over node B (intermediary node). Therefore, we have two different pathways that focus on node C. Hence, this motif can be referred to as a directed lens, due to its focused flow from two nodes on one singular and its purely directed linkage. This network motif has been studied in the context of tipping elements and has been proven to facilitate tipping cascades by lowering critical thresholds¹⁹. The zero loop (ZL) is made up of a bidirectional connection of two nodes. In contrast to the FFL, where node A does not receive feedback from node C, here, both nodes are dependent on each other without a preferred direction of network flow. This facilitates tipping to a much lesser degree than the FFL motif¹⁹. The neighboring loop (NBr) is an extension of the ZL. In this case, there is an additional node connected to one of the nodes of a zero loop. Hence, there is a two-step directionality in the motif, but in contrast to the FFL, this motif is characterized by reciprocity.

We count the number of motifs a certain node is involved in the network. The number of FFLs is counted as the number that a certain node is a so-called “target” node. The target node is the node, on which the triangular structure of the motif is converging to, i.e., the node that has been referred to as node C above. The ZL is a symmetric motif for the two involved nodes. Therefore, the number of ZLs of a certain node in the network is counted directly as the number of bidirectional interactions of the inspected node. Lastly, the number of NBrs of a certain node is the number of being in the center of a neighboring loop. With this procedure, each node is characterized by its number of FFLs, ZLs, and NBrs (cf. ref. 19).

Motif strength and their spatially aggregated difference

To assess the presence of motifs and, in particular, their relative frequency, we first determine the numbers of FFLs, ZLs, and NBrs per node. Subsequently, we normalize these counts by the respective maximum to obtain the motif strength, which is shown for each network motif in Fig. S5. In Fig. S5a–c, we display the motifs for the global network, and in Fig. S5d–f for the land-to-land network.

To specifically characterize the focus regions by means of the network topology, we evaluate which motifs dominate in which region. Consequently, we compute the difference of the motif strengths shown in Fig. S5 and reveal the patterns shown in Fig. 2. For spatially aggregated motif strength differences (Fig. 2c, d), we then compute the average of the respective values inside the highlighted boxes.

Sensitivity to link threshold ρ

The network analysis featured in the main text uses those moisture recycling edges that together represent ρ = 25% of all atmospheric moisture recycling on Earth. As we aimed to focus on the strongest moisture flows, we chose a threshold of ρ = 25% aggregating the strongest moisture transport pathways. This allows us to reveal the regions of strongest moisture connections, which are located in and close to the tropics, as we expected. Overall, the aim of this thresholding procedure is to utilize a network approach with unweighted edges but also take into account the large spread of moisture recycling strengths. To test the robustness of the results to the threshold value, we here show the same figures as above in the main text but with different thresholds ρ. Note that the error bars in Fig. 2 are based on the analysis featured in this part (the resulting differences using thresholds of ρ = 20% and ρ = 30%).

Figures S6 and S7 show the in- and out-degree of the all-to-all and land-to-land network using a threshold of ρ = 20% (Fig. S6) and ρ = 30% (Fig. S7). Note that the color bar has been adjusted as the number of links differs substantially between the networks. The main difference between Figs. S6 and S7 is the greater emphasis on moisture recycling in the mid-latitudes in Fig. S7. This is a direct consequence of considering more, and thus also some weaker, links. Acknowledging this difference, we stress that especially the land-to-land patterns (Figs. S6c, d, S7c, d) are consistent. In particular, the four focus regions, as defined in the main text, stand out as the main global land-to-land moisture recycling hubs. To support this visual analysis of the in- and out-degree pattern, we furthermore compute the motif strengths for both network configurations for quantitative validation of the results.

In line with the main text, we compare the FFL and ZL strength (see Fig. 2a–d). Not only the spatial patterns in our sensitivity analysis agree remarkably well with the results in the main text above, but also the focus regions remain basically the same (cf. Fig. S8 for ρ = 20% and Fig. S9 for ρ = 30% with Fig. 2). The only slight change is the shift towards a directed lens (spatially aggregated FFL and ZL strength difference) for the Amazon basin in the all-to-all network for increasing ρ (cf. Fig. S8c vs Fig. S9c vs Fig. 2c). We attribute the overproportional increase of the number of FFLs to those that include at least one oceanic grid cell to this noticeable shift. This underscores our characterization of the Amazon basin as a directed lens.

The spatially aggregated FFL and NBr difference (Figs. S10, S11) is structurally the same as above, where we computed the FFL and ZL difference (see Figs. S8, S9). The spatial patterns and the aggregated values are robust against shifts of ρ. However, for the Amazon basin (AB), the number of FFLs increases overproportionally in the all-to-all network when we include more links in our analysis. In other words, the spatially aggregated FFL-strength for AB increases for higher thresholds ρ (cf. Figs. S10c, S11c and Fig. 2g).

Sensitivity to the size of the focus regions

Another aspect affecting the results is the spatial extent chosen as a focus region (i.e., the rectangles in Fig. 2). Varying the size of these rectangles affects the spatially aggregated measures. For all focus regions besides the Amazon Basin (AB), the values are not significantly affected by changing the rectangle size, as the values close to the focus regions are either coherently negative, as for the Congo Rainforest (CR) and the Indonesian Archipelago (IA), or close to zero (South Asia: SA). The AB is characterized by positive values (tendency to lensing), whereas the more southern parts along the Andes are marked by more negative (corridor/washing machine) values.

Hence, we assess the stability of the results by using the spatial region covered by the Amazon rainforest (the extent of the Amazon rainforest is based on ref. 6) and compare them to the ones obtained by using the rectangle. The results featured in Fig. S12 indicate that only considering the rainforest-covered parts of the AB leads to similar or even more positive (lensing) values, confirming our conclusions that the Amazon rainforest region functions differently from the other focus regions.

Notes on maps

This paper makes use of perceptually uniform color maps developed by ref. 57. The underlying world maps have been created by cartopy⁵⁸.