More stories

  • in

    Forecasting point-of-consumption chlorine residual in refugee settlements using ensembles of artificial neural networks

    Study sites and data collection

    The data used for this study were obtained from a previous multi-site study on post-distribution FRC decay, collected from refugee settlements in South Sudan, Jordan, and Rwanda [19]. This dataset was selected because process-based models have been used to produce FRC targets for these sites, providing a useful comparison to the risk-based targets generated in this study. Details of the data collected at these sites, as well as important site characteristics, are included in Table 3. Two datasets were collected from Jordan: one from the summer of 2014 and one 9 months later from the late winter of 2015. The original study treated these as two separate datasets because of differences in environmental conditions (a 10 °C difference in average temperature) and the time elapsed between them [19]. To ensure a consistent comparison with the original study, we have also treated the 2014 and 2015 data from Jordan as two distinct datasets.

    Table 3 Summary of Key Site Characteristics [19,59,60,61].

    The dataset for each site includes FRC as well as other water quality parameters that are routinely collected in humanitarian water system operations, including total residual chlorine, EC, water temperature, turbidity, and pH. Data were collected using paired sampling, whereby the same unit of water was sampled at the following points along the post-distribution water supply chain:

    From the tap at the point-of-distribution

    In the container immediately after collection

    In the container immediately after transport to the dwelling

    After a follow-up period of storage in the household

    This study only used the measurements at the point-of-distribution and point-of-consumption to reflect data collection practices that are more feasible for humanitarian operations. In preparing the dataset, observations were removed if the point-of-distribution water quality did not meet humanitarian drinking water quality guidelines. Supplementary Table 2 in the Supplementary Information includes the full list of data cleaning steps that were used to prepare the data for use in the ANN models.

    Ethics

    The initial field work in South Sudan received exemption from full ethics review by the Medical Director of Médecins sans Frontières (MSF) (Operational Centre Amsterdam) as the data collected were routine for the on-going water supply intervention at the study site. For subsequent field studies in Jordan and Rwanda, ethics approval was obtained from the Committee for Protection of Human Subjects (CPHS) of the Institutional Review Board at the University of California, Berkeley (CPHS Protocol Number: 2014-05-6326). Informed consent was provided throughout all data collection.

    Input variable selection

    Two input variable combinations were considered for predicting the output variable, the point-of-consumption FRC concentration. All of the variables considered are routinely monitored in humanitarian water system operations. The first input variable combination (IV1) included FRC at the water point-of-distribution and the elapsed time between the measurement at the point-of-distribution and the point-of-consumption. This input variable combination represents the minimum number of variables that would be regularly collected under current humanitarian drinking water quality guidelines [31]. Additionally, these are the only two variables included in the process-based model developed in a past study for these sites [19], so this input variable combination allows for a direct comparison of the ANN ensemble models with the process-based models.
The second input variable combination (IV2) included the variables from IV1 as well as additional water quality variables measured at the point-of-distribution (directly after water had left the water distribution point): EC, water temperature, pH, and turbidity. These additional variables are recommended for collection in some humanitarian drinking water quality guidelines [29,30,31], and as such, may also be available in humanitarian response settings. This larger input variable set allowed us to investigate the usefulness of additional water quality variables for forecasting point-of-consumption FRC concentrations.

Base-learner structure and architecture

The ensemble base learners (the individual ANNs in the ensemble models) were built as multi-layer perceptrons (MLPs) with a single hidden layer using the Keras 2.3.0 package [48] in Python v3.7 [49]. This structure was selected because it has been shown to outperform other data-driven models and ANN architectures for predicting FRC in piped distribution systems [20,21]. The weights and biases of the base learners were optimized to minimize mean squared error (MSE) using the Nadam algorithm with a learning rate of 0.1. An early stopping procedure with a patience of 10 epochs was used to prevent overfitting.

The hidden layer size of the base learners was determined through an exploratory analysis by consecutively doubling the hidden layer size until performance decreased or ceased to improve substantially from one iteration to the next. Based on this analysis, we selected a hidden layer size of four hidden neurons at all sites for the models using the IV1 input variable combination. For the models using the IV2 input variable combination, we selected a hidden layer size of 16 hidden neurons for South Sudan and Jordan (2015), and a hidden layer size of eight hidden neurons for Jordan (2014) and Rwanda.
The full results of the exploratory analysis into hidden layer size are included in Supplementary Figs 13–20 in the Supplementary Information.

Data division

The full dataset for each site and variable combination was divided into calibration and testing subsets, with the calibration subset further subdivided into training and validation data. The testing subset was obtained by randomly sampling 25% of the overall dataset. The same testing subset was used for all base learners so that each base learner's testing predictions could be combined into an ensemble forecast. The training and validation data were obtained by randomly resampling from the calibration subset, with a different combination of training and validation data for each base learner to promote ensemble diversity. The ratio of calibration data used for training and validation, respectively, was selected to avoid both overfitting and underfitting through an exploratory analysis using a grid search process. In all but two cases, we selected a validation set that was twice the size of the training set, for an overall training-validation-testing split of 25–50–25%. The two exceptions were the Jordan (2014) model using the IV1 input variable combination, where a training-validation-testing split of 50–25–25% produced better performance, and the Jordan (2015) model using the IV1 input variable combination, where a training-validation-testing split of 30–45–25% performed substantially better. The full results of the exploratory analysis for data division are included in Supplementary Figs 21–28 in the Supplementary Information.
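The data division described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the study's code: the function name `split_for_ensemble`, its parameters, and the choice to resample the calibration set without replacement (a fresh random partition per base learner) are all assumptions, since the text does not specify how the resampling was done.

```python
import random

def split_for_ensemble(n_obs, n_learners, test_frac=0.25,
                       train_frac_of_cal=1 / 3, seed=0):
    """Return one shared test index list and per-learner (train, validation) lists.

    Assumption: each base learner gets a new random partition of the
    calibration set (sampling without replacement).
    """
    rng = random.Random(seed)
    indices = list(range(n_obs))
    rng.shuffle(indices)

    # The same 25% testing subset is shared by every base learner.
    n_test = round(test_frac * n_obs)
    test_idx = sorted(indices[:n_test])
    calibration = indices[n_test:]

    # A 25-50-25 split means training is one third of the calibration set.
    n_train = round(train_frac_of_cal * len(calibration))
    splits = []
    for _ in range(n_learners):
        shuffled = calibration[:]
        rng.shuffle(shuffled)  # different train/validation mix per learner
        splits.append((sorted(shuffled[:n_train]), sorted(shuffled[n_train:])))
    return test_idx, splits

test_idx, splits = split_for_ensemble(n_obs=200, n_learners=3)
print(len(test_idx), [(len(tr), len(va)) for tr, va in splits])
```

Setting `train_frac_of_cal` to 2/3 or 0.4 would correspond to the 50–25–25% and 30–45–25% splits mentioned for the two Jordan IV1 models.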
Descriptive statistics for the calibration and testing datasets are included in Supplementary Tables 3 and 4 of the Supplementary Information, and histograms of the input and output variables are provided in Supplementary Figs 5–12 in the Supplementary Information to provide context on the range and patterns in the data used to train the ANN base learners.

Ensemble model formation

The ensemble models in this study were used to generate probabilistic forecasts of post-distribution FRC by combining the predictions of each base learner into a probability density function (pdf). Thus, for each observation of FRC at the point-of-consumption, the ensemble model outputs a pdf representing the predicted probability of point-of-consumption FRC concentrations. This pdf can then be used to identify ensemble confidence intervals (CIs) for the expected point-of-consumption FRC concentration. To ensure a good representation of the full output space in the final pdfs, two approaches were taken to ensure ensemble diversity. First, as discussed above, the data used to train the base-learner ANNs were randomly sampled from the calibration set, so each ANN was trained on a different subset of the data. Second, the initial weights and biases were randomized for each base learner in a random-start process. Both of these are implicit approaches to ensuring ensemble diversity, as they do not directly create diversity; instead, the diversity arises through the randomization of the training data and the weights and biases [50].
The benefit of implicit approaches is that the differences between the base learners are derived from randomness in the data [50].

The ensemble size (number of base learners included in the ensemble) was also determined through an exploratory analysis using a grid search procedure. This exploratory analysis showed that, in general, performance increased with larger ensemble sizes, but improvements in performance plateaued at ensemble sizes ranging from 50 members to 250 members. Based on this, a standard ensemble size of 250 members was selected for all sites and variable combinations. The full results of the exploratory analysis for ensemble size are included in Supplementary Figs 29–36 in the Supplementary Information.

Ensemble post-processing

We used ensemble post-processing to attempt to improve the forecasts generated by the raw ensembles, applying the kernel dressing method to the ensemble predictions [51]. This method follows a two-step process: first, a kernel function is fitted, centred on each base-learner prediction for each observation; then the members' kernels are summed to produce the post-processed pdf, which is a non-parametric mixture distribution function. We used a Gaussian kernel function in keeping with past studies [27,28,38,51], though the selection of the specific kernel function is not critical [28]. The kernel bandwidth was defined using the best member error method, where the bandwidth for all kernels is the variance of the absolute error of the prediction that is closest to each observation in the calibration dataset [51].

Ensemble verification and performance evaluation

We used ensemble verification metrics to evaluate the performance of the raw and post-processed ensembles for each site and variable combination. Ensemble verification metrics differ from traditional measures of performance (e.g. Nash–Sutcliffe Efficiency, MSE)
as they assess the performance of the probabilistic forecasts of an ensemble, whereas traditional measures typically evaluate the average performance of an ensemble model or the predictions of a deterministic model [52]. Throughout the following section, \(O\) refers to the full set of observed FRC concentrations at the point-of-consumption and \(o_i\) refers to the \(i^{\mathrm{th}}\) observation, where there are \(I\) total observations. \(F\) refers to the full set of probabilistic forecasts for point-of-consumption FRC, where \(F_i\) is the probabilistic forecast corresponding to observation \(o_i\) and \(f_i^m\) is the prediction by the \(m^{\mathrm{th}}\) base learner in the ensemble on the \(i^{\mathrm{th}}\) observation. For the following metrics, it is assumed that the predictions of each base learner in the ensemble are sorted from low to high for each observation such that \(f_i^m \le f_i^{m + 1}\) from \(m = 0\) to \(m = M\).

Percent capture

Percent capture measures the percentage of observations that are captured within the ensemble forecast. It provides a useful indication of how well the model can reproduce the full range of observed values and, as such, can indicate whether a model is underdispersed. For a raw ensemble forecast, the \(i^{\mathrm{th}}\) observation is captured if \(f_i^0 \le o_i \le f_i^M\). For a post-processed forecast, the \(i^{\mathrm{th}}\) observation is captured if the probability of \(o_i\) in the mixture distribution is greater than 0. While not commonly used for ensemble verification, a similar metric, referred to either as the percent capture or the percent of coverage, has been used for evaluating other probabilistic or possibilistic models, especially neurofuzzy networks [53,54,55,56]. The percent capture was calculated both for the overall set of observations and for observations with point-of-consumption FRC below 0.2 mg/L.
The latter is a useful indicator of how well the model can predict whether water will have sufficient FRC at the point-of-consumption, which is an important indicator of the degree of confidence we have in the risk-based targets generated using these ensemble models.

CI reliability diagram

Reliability diagrams are visual indicators of ensemble reliability, where reliability refers to the similarity between the observed and forecasted probability distributions. For the ideal model, all points plot along the 1:1 line, showing that the observed probabilities are equal to the forecasted probabilities. These diagrams plot the observed relative frequency of an event against the forecast probability of that event. Past studies have adapted the reliability diagram into the CI reliability diagram, which compares the frequency of observed values within the corresponding CI of the ensemble. For raw ensembles, the CIs are derived from the sorted forecasts of the base learners (for example, the ensemble 90% CI would include all of the forecasts between \(f^{0.05M}\) and \(f^{0.95M}\)), and for post-processed ensembles, the CIs are calculated directly from the probability distribution. In this study, we extended the CI reliability diagram further by plotting the percent capture of each CI within the ensemble against the CI level. For each ensemble model we plotted the CI reliability for the 10–100% CI levels at 10% intervals as well as at the 95 and 99% CIs. We used this to develop a numerical score for the CI reliability diagram, which is calculated as the squared distance between the percentage of observations captured within each CI and the ideal percent capture in that CI. This was calculated for each CI threshold, \(k\), from 10 to 100% in 10% increments as shown in Eq. 1.

$$CI\;\mathrm{Reliability}\;\mathrm{Score} = \sum\limits_{k = 0.1}^{1} \left( k - \mathrm{Percent}\;\mathrm{Capture}\;\mathrm{in}\;CI_k \right)^2$$
    (1)
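    As a rough illustration of Eq. 1 for a raw ensemble, the sketch below computes the percent capture within each central CI and sums the squared deviations from the CI level. The function names are hypothetical, percent capture is expressed as a fraction (0 to 1) to match \(k\), and taking CI bounds from the sorted members by nearest-rank indexing is an assumption; the text only states that the CIs come from the sorted base-learner forecasts.

```python
def ci_bounds(sorted_members, level):
    """Bounds of the central CI at `level`, taken from sorted member forecasts.

    Assumption: nearest-rank indexing, e.g. the 90% CI runs from the
    member at rank 0.05*M to the member at rank 0.95*M (members 0..M).
    """
    m = len(sorted_members) - 1
    lower = sorted_members[round((0.5 - level / 2) * m)]
    upper = sorted_members[round((0.5 + level / 2) * m)]
    return lower, upper

def percent_capture(forecasts, observations, level=1.0):
    """Fraction of observations falling inside the CI at `level`."""
    captured = 0
    for members, obs in zip(forecasts, observations):
        lower, upper = ci_bounds(sorted(members), level)
        captured += lower <= obs <= upper
    return captured / len(observations)

def ci_reliability_score(forecasts, observations, levels=None):
    """Eq. 1: sum over CI levels k of (k - percent capture in CI_k)^2."""
    if levels is None:
        levels = [k / 10 for k in range(1, 11)]  # 10% to 100% in 10% steps
    return sum((k - percent_capture(forecasts, observations, k)) ** 2
               for k in levels)

# Two toy forecasts of 11 members each; one observation inside, one outside.
toy_forecasts = [[0.1 * j for j in range(11)], [0.1 * j for j in range(11)]]
print(ci_reliability_score(toy_forecasts, [0.52, 5.0]))
```

    With the default ten levels, an ensemble that captures no observations at any level scores 0.1² + 0.2² + … + 1.0² = 3.85.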
    The CI reliability score measures the horizontal distance between the percent capture and the 1:1 line for each CI. The ideal value for this score would be 0, indicating all points fall on the 1:1 line. The worst possible score depends on the number of CIs included in the calculation of the score; for this study the worst score is 3.9, which would only occur if no observations were captured in any CI of the ensembles. The CI reliability score was calculated for both the overall dataset and for forecast-observation pairs where the observed household FRC concentration was below 0.2 mg/L.

    Continuous Ranked Probability Score

    The Continuous Ranked Probability Score (CRPS) is a common metric for evaluating probabilistic forecasts. It evaluates the difference between the predicted and observed probabilities of continuous variables and reduces to the mean absolute error for a deterministic forecast [57,58]. The CRPS measures not only model reliability but also sharpness, which is an indicator of how closely the ensemble predictions are clustered around the observed values. Thus, the CRPS can be a useful measure of overdispersion and can indicate whether improvements in reliability are being obtained at the expense of overdispersion. The CRPS is measured as the area between the forecast cumulative distribution function (cdf) and the observed cdf for each forecast-observation pairing [58]. Since each observation is a discrete value, the observation cdf is represented with the Heaviside function \(H\{ x \ge x_a\}\), where \(x_a\) is the observed value (\(o_i\) in Eq. 2). This is a stepwise function with a value of 0 for all point-of-consumption FRC concentrations below the observed concentration and 1 for all point-of-consumption FRC concentrations at or above the observed concentration. The equation for calculating the CRPS of a single forecast-observation pair is given in Eq. 2.
To evaluate the ensemble models, the average CRPS, \(\overline{\mathrm{CRPS}}\), is calculated by taking the mean CRPS over all forecast-observation pairs.

$$\mathrm{CRPS} = \int\nolimits_{-\infty}^{\infty} \left( F_i\left( x \right) - H\left\{ x \ge o_i \right\} \right)^2 dx$$
    (2)
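    Eq. 2 can be evaluated numerically once a forecast cdf is available. The sketch below is one possible way to do so for a kernel-dressed ensemble, assuming an equal-weight Gaussian mixture centred on the member predictions. The function names, the trapezoidal rule, the integration limits, and the parameter `sigma` (treated here as the kernel standard deviation) are all assumptions made for illustration; how `sigma` maps onto the best-member-error bandwidth is not addressed.

```python
import math

def mixture_cdf(x, members, sigma):
    """cdf of an equal-weight Gaussian mixture centred on each member."""
    return sum(0.5 * (1.0 + math.erf((x - f) / (sigma * math.sqrt(2.0))))
               for f in members) / len(members)

def crps_numeric(members, obs, sigma, n_steps=4000):
    """Trapezoidal approximation of Eq. 2 for one forecast-observation pair."""
    # Integrate over a finite window wide enough that the tails are negligible.
    lower = min(min(members), obs) - 8.0 * sigma
    upper = max(max(members), obs) + 8.0 * sigma
    dx = (upper - lower) / n_steps
    total = 0.0
    for j in range(n_steps + 1):
        x = lower + j * dx
        heaviside = 1.0 if x >= obs else 0.0  # observed cdf
        value = (mixture_cdf(x, members, sigma) - heaviside) ** 2
        total += (0.5 if j in (0, n_steps) else 1.0) * value
    return total * dx

def mean_crps(forecasts, observations, sigma):
    """Average CRPS over all forecast-observation pairs."""
    scores = [crps_numeric(m, o, sigma) for m, o in zip(forecasts, observations)]
    return sum(scores) / len(scores)

print(crps_numeric([0.35, 0.40, 0.55], obs=0.45, sigma=0.05))
```

    For a single Gaussian centred exactly on the observation, the result should be close to the known closed-form value of roughly 0.234 times `sigma`, which gives a quick sanity check on the integration.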
    For the post-processed probability distributions, we calculated CRPS directly from Eq. 2 using numerical integration. For the raw ensemble, we treated the forecast cdf as a stepwise function with \(N = M + 1\) bins, where each bin is bounded by two consecutive ensemble forecasts and the value in each bin is the cumulative probability [58]. \(\overline{\mathrm{CRPS}}\) is calculated using \(\overline{g_n}\), the average width of bin \(n\) (the average difference in FRC concentration between forecast values \(m\) and \(m + 1\)), and \(\overline{o_n}\), the likelihood of the observed value being in bin \(n\) [58]. Using these values, the \(\overline{\mathrm{CRPS}}\) for an ensemble can be calculated as [58]:

$$\overline{\mathrm{CRPS}} = \sum\limits_{n = 1}^{N} \overline{g_n} \left[ (1 - \overline{o_n}) p_n^2 + \overline{o_n} \left( 1 - p_n \right)^2 \right]$$
    (3)
    where \(p_n\) is the probability associated with each bin, \(p_n = \frac{n}{N}\) [58].

    Generation of risk-based targets

    To generate the risk-based FRC targets, the trained ensembles of ANNs were used to forecast the point-of-consumption FRC for a series of point-of-distribution FRC concentrations from 0.2 to 2 mg/L in 0.05 mg/L increments. For each point-of-distribution FRC concentration, the predicted risk of insufficient FRC was calculated from the forecast pdf as the cumulative probability of FRC at the point-of-consumption being below 0.2 mg/L. Using this predicted risk, the target FRC concentration for the point-of-distribution was then selected as the lowest FRC concentration at the water point-of-distribution that provides the desired level of protection. For this study we selected the FRC concentration that resulted in negligible risk of FRC being below the 0.2 mg/L threshold (i.e. the lowest FRC concentration where the predicted risk is 0), though operationally any level of protection could be used, and the risk of insufficient FRC at the point-of-consumption should be balanced against risks associated with high FRC concentrations, such as DBP formation and taste and odour concerns.

    For comparison with the previously published results, we used a storage duration of 10 h when generating the FRC targets for South Sudan, and 24 h for all other sites [19]. Since the IV2 model also requires values for EC, water temperature, pH, and turbidity, two scenarios were considered. The first was an "average" scenario, in which the median observed value was selected for each of the other water quality parameters. The second was a "worst-case" scenario, in which water quality conditions were unfavourable for maintaining chlorine residual.
A partial correlation analysis, which assesses the correlation between an input variable and the output variable while controlling for the impacts of other input variables, was used to determine the least favourable conditions for each input variable. The partial correlation analysis is performed by first developing multiple linear regression predictions of both the output variable (point-of-consumption FRC) and the input variable of interest, using the remaining input variables as the predictors in the linear regression models, and then taking the Pearson correlation coefficient between the residuals of the two regression models. Partial correlation was used to assess the directionality of the effect of the additional water quality variables included in IV2, that is, whether high or low values of these inputs would create a worst-case scenario. Once the directionality of the impact of the different variables had been established, the 95th or 5th percentile observed value of that variable was used at each site to simulate the worst-case scenario.
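The target-selection sweep described in this section can be sketched as follows. This is a minimal illustration, not the study's implementation: `forecast_fn` stands in for a trained ensemble, `toy_forecast` is an invented first-order decay model used only so the example runs, and estimating the risk as the fraction of raw ensemble members below the threshold (rather than from a post-processed pdf) is an assumption.

```python
import math

def risk_of_insufficient_frc(members, threshold=0.2):
    """Fraction of ensemble members predicting FRC below the threshold (mg/L)."""
    return sum(1 for f in members if f < threshold) / len(members)

def risk_based_target(forecast_fn, storage_hours, acceptable_risk=0.0,
                      threshold=0.2):
    """Lowest point-of-distribution FRC whose predicted risk is acceptable.

    Sweeps 0.20 to 2.00 mg/L in 0.05 mg/L steps; returns None if no
    concentration in that range meets the acceptable risk.
    """
    for step in range(37):
        frc0 = round(0.2 + 0.05 * step, 2)
        members = forecast_fn(frc0, storage_hours)
        if risk_of_insufficient_frc(members, threshold) <= acceptable_risk:
            return frc0
    return None

def toy_forecast(frc0, hours, n_members=250):
    """Hypothetical stand-in for a trained ensemble (illustrative only).

    Each member applies first-order decay with a different rate constant.
    """
    return [frc0 * math.exp(-(0.02 + 0.0004 * m) * hours)
            for m in range(n_members)]

print(risk_based_target(toy_forecast, storage_hours=10))
```

Raising `acceptable_risk` above 0 would return a lower target, which is one way to express the trade-off against the risks of high FRC concentrations noted above.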

  • in

    Guiding urban water management towards 1.5 °C


  • in

    Airborne geophysical surveys of the lower Mississippi Valley demonstrate system-scale mapping of subsurface architecture

    A system-scale airborne geophysical survey

    From 2018 through early 2020, we acquired more than 43,000 flight-line-kilometers (line-km) of airborne geophysical data over the MAP study area of ~140,000 km² (Fig. 2a, "Methods" section). Data collection included a high-resolution survey over ~1000 km² near Shellmound, Mississippi, regional surveys with 3–6 km line spacing across the entire study area, and over 3000 line-km of data acquired along streams and rivers to characterize potential surface water–groundwater connections beneath these important recharge pathways. Radiometric (Fig. 2b), magnetic (Fig. 2c), and inverted resistivity grids at multiple depth intervals (Fig. 2d–h) summarize the combined results from both regional survey phases. Together, this represents the first initiative to acquire system-scale airborne geophysical data over an entire US aquifer.

    Fig. 2: Airborne geophysical survey coverage and summary of regional datasets. (a) Primary management regions in the MAP study area, with flight lines for each of the three phases of data collection completed through early 2020 (CR Crowleys Ridge).
Results from the combined regional surveys gridded onto the 1 km National Hydrogeologic Grid45 include b radiometric data presented as a ternary diagram that indicates the relative abundance of K, U, and Th in surficial sediments with areas of Holocene (H)- and Pleistocene (P)-aged sediments indicated32; c the residual magnetic intensity map (in nanoTeslas, nT) shows faults46 related to the New Madrid seismic zone (RR Reelfoot rift, CGL Commerce geophysical lineament); and d–h resistivity depth slices at five depth intervals from 0 to 220 m below land surface annotated with mapped surficial geologic units32 (Hb backswamp, Hp Point bar and meander belt, Pve & Pvcl Wisconsin-age valley train) and four-letter codes of distinguishable hydrogeologic units (MRVA Mississippi River Valley alluvial aquifer, VKBG Vicksburg–Jackson confining unit, MCAQ Middle Claiborne aquifer, LCCU Lower Claiborne confining unit, MDWY Midway confining unit).

    At the regional scale, radiometric data (Fig. 2b) correlate with mapped surficial geology32 and sediment age, with Holocene deposits clearly delineated as strong returns of multiple elements (light-colored areas in Fig. 2b) compared with Pleistocene sediments. Magnetic data gridded at this scale largely corroborated previously mapped structures, such as the line of southwest–northeast-trending magnetic highs (Fig. 2c) associated with mapped intrusive plutons along the Commerce geophysical lineament (CGL) adjacent to the RR in northeast Arkansas and southeast Missouri45,46.

    Inverted resistivity models in the uppermost 5 m (Fig. 2d and Fig. S1b) correspond closely with mapped surficial units32 (Fig. S1a). For example, Wisconsin-age valley train deposits, as well as point bars and meander scrolls of modern river networks, appear as resistive features in the uppermost 5 m, whereas fine-grained units such as backswamp deposits are characterized by low resistivity. At 20–25 m depth (Fig. 2e and Fig. 
S1c), intermediate to high-resistivity values are consistent with the coarse-grained lithology found throughout the MRVA. Lower resistivity can be found at this depth in sedimentary units outside the MAP region as well as over structural highs within the MAP region where Tertiary units are close to the surface. Low-resistivity values show the Vicksburg–Jackson confining unit (VKBG) in southeast Arkansas and northeast Louisiana, the Midway confining unit (MDWY) in northeast Arkansas, and an erosional remnant of Tertiary sediments beneath Crowleys Ridge. Beneath the Quaternary MRVA (~30–50 m depth in most of the region), resistivity values strongly correlate with Tertiary subcropping units (Fig. 2f–h and Fig. S1d, e). Most notable here is the low resistivity that corresponds with the known regional VKBG and MDWY confining units, both clay and shale rich. In contrast, the subcropping Middle Claiborne aquifer (MCAQ) sands are resistive (Fig. 2g, h and Fig. S1d), with a notable change in facies north of Memphis, Tennessee, associated with the coarse Memphis Sand of the Claiborne Group (dashed line, Fig. 2g, h).

    Further correlation between resistivity and geologic structure is evident in cross-section view (Fig. 3a–c), where gridded resistivity models are compared with the top elevation of MERAS model surfaces43. From west to east, prominent features in Fig. 3a include the highly resistive Paleozoic Ozark Plateaus aquifer that bounds the MAP region, the conductive east-dipping MDWY beneath and west of Crowleys Ridge, and the resistive MCAQ dipping to the east beneath the east side of Crowleys Ridge. This section highlights the variable degree of connectivity between the MRVA and underlying Tertiary aquifers. While the MCAQ sands appear connected to the shallow MRVA (veneer of moderate to high resistivity in the upper 30–50 m) immediately east of Crowleys Ridge, these aquifer layers become mostly separated by the Middle Claiborne confining unit (MCCU) farther east. 
To the south (Fig. 3b), the subcrop of the Claiborne Group aquifers (MCAQ and Upper Claiborne aquifer) suggests a direct connection to the MRVA beneath the Mississippi Delta, sandwiched between the VKBG confining unit in the shallow western half of the section and the Lower Claiborne confining unit (LCCU) that dips westward in the eastern part of the section. Towards the southern end of the AEM survey (Fig. 3c), the MRVA is disconnected from Tertiary aquifers by the MCCU and VKBG confining units in the western and eastern portions of the section, respectively. AEM-derived resistivity models largely corroborate the existing framework but also reveal greater detail in the overall structure and heterogeneity within units than could be previously determined through relatively sparse borehole observations. This study demonstrates that systematic mapping at high spatial resolution with AEM data illuminates model structural details expected to exist throughout the region but that cannot be fully understood with sparse observations.

    Fig. 3: Resistivity and interpreted facies classification cross-sections. West–east resistivity (a–c) and facies classification (d–f) sections are shown at three latitudes across the survey area (Fig. 2a). Surfaces of the top elevation of MERAS model hydrogeologic units43 are shown on cross-sections for reference. CR Crowleys Ridge, MR Mississippi River. Two-sided arrows indicate potential regions of connectivity between MRVA and deeper aquifer units.

    Combined resistivity models from the two phases of regional data collection (Fig. 2d–h and Fig. 3a–c) have been interpolated onto grids with 1 km × 1 km lateral and 5-m vertical resolution useful for investigation of regional-scale structure. 
However, the native resolution of resistivity models along flight paths is much higher, with spacing between sounding locations for the Resolve and Tempest AEM systems equal to 25 and 75 m, respectively, with finer-scale near-surface vertical resolution for the Resolve system. On a native-resolution cross-section of the Resolve system in northeast Arkansas (Fig. S2), significant detail reveals the internal structure and variability of the MRVA. For example, the Resolve models capture the topography of the aquifer base, including incised channels in the subcropping Tertiary unit that may have been formed during periods of glacial outwash (interpreted locations marked “C” on Fig. S2), and the thickness and extent of surficial low-resistivity material that may be local confining units and a barrier to recharge. Lateral transitions in the upper ~5–30 m correspond with mapped braid belts33, suggesting that electrical resistivity can be an indicator of these distinct lithologic units. However, since the chronology of mapped braid belts is largely based on relatively shallow OSL dates and surficial mapping33, there is little constraint on the age of deeper Quaternary sediments beyond the estimate of their deposition after 250 ka34. Given the presence of shallow braid belt deposits older than the last sea-level lowstand ~20 ka33, along with the observed internal structure of the MRVA with a discontinuous low-resistivity layer at ~30 m (Fig. S2), we hypothesize that the deeper Quaternary sediments may represent earlier filling of post-250 ka eroded channels of the ancestral Mississippi–Ohio River systems34,38.

    Derivative products: interpretations of hydrogeologic structure and properties

    While lithology dominates AEM-derived resistivity values in the MAP region, porewater salinity is also known to influence resistivity17,19,24, and groundwater salinity varies throughout the study area47,48. 
Specific conductance (SC) values measured in boreholes throughout the MERAS domain are generally low, with similar values in both Quaternary sediments (median log10 SC 2.79 = 617 μS/cm) and deeper Tertiary units (median log10 SC 2.67 = 471 μS/cm). Areas of high salinity are limited within the MRVA footprint, with only 6% of the area predicted to be >1000 μS/cm48. Correlation between SC and AEM-derived resistivity (Fig. 4) by MAP region (Fig. 2) suggests that SC has limited overall control on resistivity across the study domain, but can be important in specific regions where SC is high, generally in the Grand Prairie (Fig. 4a) and Boeuf (Fig. 4b) regions. Lithology appears to be a primary driver for resistivity, with Quaternary sediments typically higher in resistivity than Tertiary units (Fig. 4a–f) over the same SC range. Tertiary, and to a lesser degree Quaternary, units follow an SC-resistivity trend indicative of moderate-high surface conductivity caused by an increased fraction of clays or other fine-grained sediments that flattens the slope of this relationship (black curves, Fig. 4a–f), compared with Archie’s Law where surface conduction is absent49,50. Notable exceptions where Quaternary and Tertiary resistivity values are similar occur in the Cache (Fig. 4c) and St. Francis (Fig. 4d) regions, where coarse-grained MCAQ sands subcrop beneath the MRVA and appear similar geophysically (double-sided arrow in Fig. 3a just east of Crowleys Ridge). The sensitivity of AEM data to both model structure and porewater salinity makes it a powerful tool in hydrologic studies, and also highlights the need for borehole and other geologic observations to calibrate against to reduce uncertainty in the extrapolation of AEM interpretations across the entire survey domain.

    Fig. 4: Relationships between groundwater salinity and AEM-derived resistivity. a–f Measured groundwater specific conductance (SC) versus AEM-derived resistivity at borehole locations, organized by region (Fig. 2). 
SC measurements from MRVA, terrace (TRRC), and loess layers are shown in blue, with measurements from deeper Tertiary units in yellow. Black dashed curves illustrate theoretical SC-resistivity relationships for varying amounts of surface conductivity ($\sigma_{\mathrm{s}}$) caused by the increasing fraction of fine-grained sediment. g Map view of a high-SC cluster in White County, Arkansas, correlates with decreased shallow resistivity in cross-section view (h) and corresponds to the high-SC observations circled in (a).

    Using 6130 published picks defining the depth to the base of the MRVA39 within the AEM survey area, along with an additional 364 manual picks based on observation of resistivity cross-sections, we used a supervised machine learning algorithm51 (see “Methods” section) to interpret the elevation of the base of the aquifer across the entire AEM dataset (Fig. 5a). An associated aquifer thickness map (Fig. 5b) is estimated by differencing the base of aquifer elevation surface from the land surface elevation, and saturated thickness (Fig. S3) is calculated by differencing the base of aquifer elevation from the 2018 potentiometric surface, assumed here to be the water table, derived from borehole observations52. By subregion (Fig. 2a and Fig. S3), saturated thickness is greatest in the St. Francis region where deep Quaternary scour channels of the Mississippi–Ohio River system have been documented east of Crowleys Ridge38, and thinnest in the Grand Prairie region because of the combined influence of a deep water table, a thick confining layer that limits recharge, and shallow subcrop of the VKBG (Fig. 2e). The difference in saturated thickness between AEM and borehole interpretations of the base of the aquifer surface is on the order of ±10–15 m (Fig. S3b). 
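
The two differencing steps described above can be illustrated with a minimal sketch. The elevations below are hypothetical values chosen only for illustration, the function name `difference` is not from the study, and clipping negative results to zero is an assumption of this sketch. The last step shows how a 10 m discrepancy in the base pick translates into a share of saturated thickness.

```python
# Hypothetical elevations in metres above a common datum at three grid cells.
land_surface = [45.0, 42.0, 40.0]
water_table = [35.0, 30.0, 38.0]    # potentiometric surface, assumed to be the water table
aquifer_base = [5.0, 10.0, -2.0]    # interpreted base-of-aquifer elevation


def difference(top, base):
    """Subtract the base surface from the top surface cell by cell.

    Negative results are clipped to zero; that clipping is an assumption
    of this sketch, not something stated in the study.
    """
    return [max(t - b, 0.0) for t, b in zip(top, base)]


aquifer_thickness = difference(land_surface, aquifer_base)
saturated_thickness = difference(water_table, aquifer_base)

# Share (in percent) of saturated thickness represented by a 10 m
# discrepancy in the interpreted base of the aquifer.
DISCREPANCY_M = 10.0
discrepancy_share = [
    100.0 * DISCREPANCY_M / s if s > 0 else None for s in saturated_thickness
]

print(aquifer_thickness)
print(saturated_thickness)
print(discrepancy_share)
```

With these illustrative numbers the same 10 m discrepancy ranges from a quarter to half of the saturated thickness, depending on the cell.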
While not a large difference in absolute value, this can represent a significant percentage of aquifer thickness, which is typically 3000 line-km of AEM data were acquired directly along the Mississippi and Arkansas River paths, as well as several smaller tributaries (Fig. 2a). For example, native-resolution, ungridded resistivity profiles with ~25 m spacing along flight paths for several of the tributaries (Fig. S5) illustrate similar detail as the main block flight lines in aquifer structure and geometry of the aquifer base and subcropping unit contact (Fig. S2). Following structure along the river paths enables a detailed view of the discontinuous nature of shallow confining materials beneath river systems, as well as how rivers, which are often important surface water-groundwater conduits, may be connected to the aquifer below. A persistent feature in many river profiles is the intermediate-resistivity layer at ~20–30 m depth that was more discontinuous along the main block flight lines and hypothesized earlier as a fine-grained unit separating younger and older Pleistocene sediments also indicated in Fig. S2.

    In addition to streamwise surveys of river courses, gridded resistivity models from the complete regional dataset are intersected with the NHDPlus database of flowlines57 to produce cross-sections along any river path (Fig. 6b–f). For example, flowlines intersected by the west–east flight-line block but not flown streamwise can be produced, such as the White River (Fig. 6c). Everywhere the resistivity grids intersected a stream, we use the same connectivity metric discussed previously to quantify the magnitude of connection potential between rivers and the underlying aquifer (see “Methods” section). Here, the vic connectivity metric is calculated on a version of the combined regional dataset made in 2-m depth intervals, integrated from 0 to 10 m beneath the river bottom to characterize the presence or absence of fine-grained material beneath rivers (Fig. 6a). 
In contrast to the MERAS model calibration for streambed conductance, which incorporated 43 streams with limited data for informing parameter values54, the results here provide far greater coverage and finer granularity on the expected spatial patterns of streambed conductance that may be incorporated in groundwater model parameterization. To first order, surficial geology surrounding the streams32 is an important factor in the vic connectivity metric, with fine-grained units appearing less connected than coarse or intermediate units (Fig. S6).

    Fig. 6: Characterizing rivers and surface water-groundwater connections. a River connectivity metric defined along all NHDPlus segments in the study area identifies areas for high or low potential for surface water-groundwater connectivity based on streambed resistivity values. Resistivity cross-sections (from north to south) extracted from gridded data along the St. Francis River (b), White River (c), Yazoo-Tallahatchie River (d), Bayou Bartholomew (e), and Mississippi River (f). The top elevation of Tertiary MERAS model layers43 is indicated as solid lines on top of resistivity cross-sections.

    Geological controls on groundwater age

    Groundwater age can be used to assess groundwater availability by estimating recharge rates, delineating recharge areas, and characterizing aquifer susceptibility to surface contamination58,59. Tritium is used to qualitatively differentiate groundwater age into modern (recharged in 1953 or later), premodern (recharged prior to 1953), or mixed (combination of modern and premodern water) groundwater categories60. Tritium samples collected from the MAP (n = 582) were categorized into modern, mixed, and premodern groups61 based on the atmospheric tritium input for the well location, sample date, and the measured tritium concentration60. 
Modern ages are expected for surficial alluvial aquifers in a humid region and 39.1% of MRVA samples fall in this grouping61; however, 17.9% of samples were premodern61, indicating significant variability in either recharge rates or sources of groundwater to the MRVA. The heterogeneity in MRVA groundwater age implies local-scale control, which also likely varies among the MAP regions depending on the relative importance of surface water recharge, areal recharge through surficial confining units, or possible upwelling from deeper Tertiary units. Part of the challenge in understanding the heterogeneity of groundwater age in this system was not having regional-scale interpretations of aquifer architecture and connectivity between units.

    To investigate the controls of shallow and deep connectivity on groundwater age, derivative product metrics for surface connectivity (Fig. 5e) and MRVA-Tertiary aquifer connectivity (Fig. 5f) are plotted together with tritium age categories (Fig. 7). Samples with premodern age (Fig. 7a) have characteristically high connectivity between the MRVA and subcropping Tertiary units, along with mixed surficial connectivity, suggesting that upwelling from deeper units may control these older groundwater measurements. Conversely, both mixed and modern tritium categories show a broader range of connectivity to deeper units (Fig. 7b,c), while the modern category has the largest fraction of points (85%) that indicate some degree of surface connectivity, suggesting that connection at the surface is a controlling factor for younger samples. While further data are needed on hydrologic gradients in the system to better predict actual flow paths, these insights suggest that geological structure provides at least partial control on groundwater age and is therefore also likely to be an important driver for vertical groundwater transport. 
The groundwater age observations, especially the preponderance of premodern groundwater, are difficult to understand without system-scale interpretations of aquifer architecture. The finding that geological structure controls groundwater transport and age is not itself a novel concept; however, we demonstrate that system-scale AEM data can map detailed model structure that facilitates a new understanding of hydrologic processes relevant to many applications and geologic settings.

    Fig. 7: Relationship between groundwater age and AEM-derived aquifer structure. Sample depth for tritium dates classified as premodern (a), mixed (b), or modern (c) is plotted against the degree of connectivity between the MRVA and deeper Tertiary units (Fig. 5f), with point colors that represent the degree of shallow aquifer connectivity (Fig. 5e).

    Hidden faults along the CGL

    Resistivity cross-sections west of the northeast-trending portion of Crowleys Ridge image significant up-to-the-east vertical offset (as much as ~50–75 m) that can be tracked over a distance of >100 km along two shallow fault strands (Fig. 8). These two faults closely follow the path of the ~10-km-wide CGL, just west of the RR and NMSZ62. The western fault splay shows a clear offset of the contact between an Upper Cretaceous high-resistivity layer at depth (McNairy Nacatosh aquifer) and the low-resistivity Tertiary MDWY within the uppermost 100–150 m (Fig. 8b–d), clearly extending to at least the base of the Quaternary aquifer. This western fault is evident on cross-sections between Fig. 8b and d (solid brown line Fig. 8a), with vertical offsets largest in the middle of this segment (up to 75 m) and maximum width ~500–1000 m. The eastern of the two faults is characterized in cross-section view by a dipping conductive layer that appears to terminate at the base of the MDWY, separating the uplifted resistive Cretaceous block to the west from the conductive MDWY to the east. 
Whether the dipping conductive layer is related to the MDWY or the fault structure itself is unclear. Unlike the western fault, offset is not seen at the top of the MDWY, possibly because of Quaternary erosion of this surface. The eastern of the two structures can be traced on resistivity cross-sections over a greater distance, following the CGL to the southwest before turning south towards the Western Margin fault (Fig. 8).

    Fig. 8: Fault structures along the Commerce geophysical lineament (CGL). a Map view of resistivity at a depth of 60–65 m along with previously mapped fault structures and seismicity of the New Madrid seismic zone. Faults identified in this study (brown and yellow lines) fall along the 10-km-wide path of the CGL (white dashed lines)62. b–d Resistivity cross-sections indicate up to ~75 m of up-to-the-east offset on two near-vertical faults (black lines, with mapped locations marked by brown and yellow arrows) west of Crowleys Ridge, with maximum offset in the middle of the region (c). The western fault structure shows a clear offset of the deeper moderate-resistivity Cretaceous (K) McNairy Nacatosh aquifer against the low-resistivity Midway confining unit. The inferred base of the Midway confining unit in the vicinity of these faults is indicated by dashed lines. Location of the north–south seismic profile over the CGL65 is indicated by a red star.

    While generally coincident with the CGL, both hidden faults captured in the AEM data are previously unmapped, and thus provide new insight into the tectonic history of this area. The observed uplift on these faults along the CGL can plausibly help explain the tectonic origin of Crowleys Ridge given the proximity of these features to one another. 
Previous geophysical and geomorphological observations along and near the ridge margins provide important insight63,64, especially in relation to the eastern fault indicated here, but have not revealed the source of uplift occurring 10–20 km farther west along the CGL. Existing geophysical data along the ridge margins64 and the CGL65 indicate faulting and support Quaternary fault activity, but do not have sufficient spatial scale and/or resolution to delineate the discrete offset imaged by AEM data over the fault length of >100 km. The 50–75 m uplift (Fig. 8b–d) requires movement since the early Tertiary but is consistent with Quaternary motion. The offset-wedge geometry suggests that the MDWY was lithified at the time of deformation. Given the lack of surface expression, the majority of offset may have occurred before the 32–45 ka Melville Ridge braid belt found immediately west of Crowleys Ridge33; however, further age constraint of Quaternary activity is not available.

    Dense borehole observations that quantify the overall thickness of Quaternary sediments identify the Pliocene–Pleistocene unconformity41 and support an argument for 53 m of Quaternary uplift34 in the region (Fig. 8a). However, the typical minimum separation between boreholes is >2 km and borehole density declines west of the CGL, making them also insufficient to identify this narrow fault feature as a potential source of uplift. Although the CGL and identified faults are removed from the seismically active region of the NMSZ, the possibility of Quaternary tectonic activity and offset on the faults identified here is important to understand past variations in seismicity and future hazard in the region62.

    Multi-scale mapping

    The multiple phases of AEM mapping (Fig. 2a), along with targeted ground-based66 and waterborne67 geophysical data collection, provide an excellent case study in the value of subsurface mapping over multiple scales. 
From regional surveys aggregated to 3 km line spacing with 1 km grid cells to high-resolution AEM data in the Shellmound area with 0.25–1 km line spacing with 100 m grid cells, we are able to illustrate the value of an order-of-magnitude reduction in flight-line spacing.

    The horizontal and vertical resolution of airborne geophysical surveys defines the scale of investigation, which depends on survey design parameters including instrument type, flight-line spacing, and total kilometers flown. Spatially extensive data capturing individual features several kilometers in size commensurate with mapping scales of ~1:1,000,000 (Figs. 2, 5, and 9a) are mapped with flight-line spacing on the order of 1–10 km and effectively inform regional-scale hydrologic models and decision support. High-resolution airborne data with sub-kilometer flight-line spacing greatly improve the resolution of smaller features appropriate for local ~1:100,000-scale mapping (Fig. 9c and Fig. S7).

    Fig. 9: Multi-scale mapping with airborne and ground-based methods. a Regional-scale structure mapped on 1 km grid cells interpolated from ~3 to 6 km-spaced flight-line data over the entire MAP region. b Regional-scale structure enlarged to the ~30 × 30 km Shellmound study area, compared with high-resolution structure mapped on 100 m grid cells interpolated from 0.25 to 1 km-spaced flight-line data from the Shellmound study area (c) shows the ability to resolve detailed buried channel structure. d Comparison of near-surface high-resolution Shellmound AEM data (background) with very high-resolution ground-based electromagnetic data66 acquired with a sensor towed over ~4 km² on survey lines spaced by 25 m.

    The high-resolution data near Shellmound, Mississippi, map coarse-grained (high-resistivity) sediments associated with river meanders that also tend to have relatively high potassium (K) abundance (Fig. 
S7a, b, east), in contrast with backswamp deposits and other fine-grained overbank sediments that have relatively low resistivity and greater thorium (Th) abundance. At 50–55 m depth (Fig. 9c and Fig. S7d), resistivity models map a buried paleochannel in the southeast corner of the grid that appears incised to depths of at least 75 m. This channel was previously unknown and may be a relict of the ancestral Mississippi–Ohio river system that flowed east of Crowleys Ridge during the Pleistocene33, creating conduits for local groundwater flow. Deep channels such as this are likely present throughout the region, but are under-sampled by borehole observations and even the regional geophysical flight lines (Figs. 2d–h, 3a–c).

    The combination of spatially extensive and high-resolution data enables detection and mapping of small-scale or hidden features that would otherwise be missed (i.e., the known unknowns in the subsurface). Our results demonstrate the value in AEM for imaging localized pathways for groundwater connection above and below the aquifer (Fig. 5e, f), buried paleochannels (Fig. 9c), and hidden faults with narrow zones of uplift previously unmapped (Fig. 8). These discrete and relatively small features may have an outsized and nonlinear impact on model predictions, hazard assessments, and decision making, but are impossible to recognize with sparse or limited spatial coverage.

    Outlook and future directions

    An emerging example of “big data” in geoscience, AEM is expanding the breadth of information that can be applied to near-surface investigation and modeling where detail about subsurface complexity is lacking. Many process-based hydrologic models have evolved to function in the absence of detailed subsurface structural information, largely because these data are typically not available and because incorrect assumptions about subsurface structure and unwarranted complexity lead to modeling errors68. 
Technological advances in remote sensing and airborne geophysics present new opportunities to advance the level of geological detail that can be incorporated in hydrologic models through the entire vertical extent of an aquifer system. As “big data” have become available to constrain details of model structure, commensurate advances in modeling, such as machine learning and uncertainty quantification69, will also be needed to realize the full potential of airborne geophysical datasets.

    As the dimension and complexity of questions that may be asked of a model or array of models (i.e., the “decision-space”) continue to increase, the need for detailed data to support these questions also increases. When allocating resources for data collection, a balance should be sought between the optimality of data collection to address a narrow decision-space (such as a single, focused question) and robustness of the investment to address future questions that expand the decision-space. The variety of applications demonstrated in this study highlights the robustness of large-scale AEM data in this respect. Although AEM surveys can involve high absolute cost, at large scales they may be 3–4 orders of magnitude less expensive on a per-data-point or per-square-kilometer basis compared with traditional ground-based surveys or drilling. Because large surveys can cover parts of multiple counties and states and support the interests of multiple scientific disciplines or stakeholder interests, a community-driven approach to the acquisition of these foundational geoscientific datasets is advantageous. 
Benefits of a coordinated approach to acquiring system-scale AEM data include reduced costs to individual participants, leveraging resources to acquire more data than any individual group could achieve, and ensuring data consistency across the study area that maximizes their value.

    Much as lidar has transformed our understanding of Earth’s surface, airborne geophysical data extend our view into the subsurface, transforming our ability to inform three-dimensional mapping from catchment to basin scales (Fig. 1b). Here, we demonstrated that system-scale AEM data provide a robust platform from which to address a host of subsurface questions. Airborne geophysical datasets represent the next generation in subsurface mapping, capable of filling in the gaps between existing boreholes with an order of magnitude or greater increase in data density. This will provide nationally consistent near-surface geologic datasets as a foundation for three-dimensional geologic interpretations, hydrologic models, and other subsurface studies that rely on detailed measurements of difficult to access belowground properties within 200–300 m of Earth’s surface (i.e., lidar for the subsurface).

  • in

    The widespread and unjust drinking water and clean water crisis in the United States

    Data sources

    Data for this analysis were extracted from the American Community Survey (ACS) 5-year estimates for 2014–2018 via Integrated Public Use Microdata Series – National Historical Geographic Information System (IPUMS-NHGIS)26, and from the Environmental Protection Agency’s (EPA) Enforcement and Compliance History Online (ECHO) Exporter27. Data were extracted at the county level for all 50 states, Washington DC, and Puerto Rico. The ACS is an ongoing survey of the United States which documents a wide variety of social statistics ranging from simple population counts to housing characteristics. Due to the staggered sampling structure of the ACS, it takes 5 years for every county to be sampled. Because of this, researchers must use 5-year intervals to ensure complete data coverage. The data from these 5 years are projected into estimates for all counties in the United States for the 5-year period in question. As of this study, 2014–2018 was the most recently available data.

    ECHO collates data from EPA-regulated facilities across the United States of America to report compliance, violation, and penalty information for all facilities for the most recent 5-year interval. ECHO data is updated weekly and the data for this paper was extracted on 18 August 2020. This means that the data in our analysis represents the status of each community water system or Clean Water Act permittee, as reported by the EPA, as of 18 August 2020. Only those community water systems or Clean Water Act permittees listed as Active by ECHO were included in this analysis. As ECHO data is at the level of the water system, permittee, or utility, we aggregated data up to the county level.

    Safe Drinking Water Act data was geolocated using QGIS 3.10 based upon latitude and longitude. This was done because other geographic identifiers for the Safe Drinking Water Act data were often missing. 
In line with prior work4,5,7,8, and in order to facilitate a cleaner dataset, we only focus on those water systems labeled community water systems for our analysis. Community water systems were geolocated based upon the county in which their latitude and longitude were located; if a community water system had latitude and longitude over water, a nearest neighbor join was used. In total, 1334 out of 49,479 community water systems were dropped because they had no reported latitude or longitude. Of these, a total of 4.0%, or 54 community water systems, were reported as in serious violation.

    Active Clean Water Act permittees were first identified by listed county. This was done because 345,176 out of 350,476 permittees had a county reported. Those without a county reported were located using latitude and longitude in the same manner as community water systems. There were 10 permittees without latitude and longitude or county listed which were excluded from our analysis. Of these, seven were in significant noncompliance and three were not. Due to some Clean Water Act permittees having latitude and longitude placements far away from the United States, those over 100 km from their nearest county were excluded from analysis. Finally, for community water systems and Clean Water Act permittees, some counties (76 for community water systems and 13 for Clean Water Act permittees) had no reported cases. Those counties were treated as zeroes for cartography and as missing for modeling purposes.

    Similar to prior work in this area4,5,8, we restrict our analysis to the scale of the county for reasons related to data limitations and resulting conceptual validity. Although counties are arguably larger in geographic area than ideal for an environmental injustice analysis, if we were to use a smaller unit for which data is available such as the census tract, the conceptual validity of the analysis would be limited due to the apolitical nature of these units. 
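
The location rules described above for Clean Water Act permittees (use the listed county, otherwise fall back to coordinates, and exclude points more than 100 km from the nearest county) can be sketched as follows. This is a simplified illustration, not the QGIS workflow used in the study: each county is reduced to a single hypothetical representative point instead of a polygon, and the names `assign_county` and `county_points` are assumptions introduced here.

```python
import math

EARTH_RADIUS_KM = 6371.0
MAX_DISTANCE_KM = 100.0  # exclusion cut-off stated in the text


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def assign_county(listed_county, lat, lon, county_points):
    """Return a county name, or None when the record should be excluded.

    county_points maps a county name to one (lat, lon) representative point;
    this stands in for real county polygons and is a simplification.
    """
    if listed_county:                      # rule 1: trust the listed county
        return listed_county
    if lat is None or lon is None or not county_points:
        return None                        # no county and no usable coordinates
    name, dist = min(
        ((n, haversine_km(lat, lon, p[0], p[1])) for n, p in county_points.items()),
        key=lambda item: item[1],
    )
    # rule 3: drop coordinates placed far away from any county
    return name if dist <= MAX_DISTANCE_KM else None
```

A record with a listed county is returned unchanged, a record with only coordinates is matched to the nearest representative point, and a record placed far from every county is returned as `None`.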
As outlined above, ECHO data is messy and missing many geographic identifiers. What is provided is generally either the county or latitude and longitude. If only the county is provided, then we are constrained to using the county regardless of conceptual validity. However, even when latitude and longitude are provided, which is the case for many observations, the provided point location says nothing about which households the water system or permittee serves or impacts. Due to this, whatever geographic unit we use carries the assumption that those in the unit could be plausibly impacted by the water system or permittee. Given that counties are often responsible for both regulating drinking water, as well as maintaining and providing water infrastructure29, we were comfortable with this assumption between point location and presumed spatial impact when using the scale of the county. However, we believe this assumption would have been invalid and untestable for smaller apolitical units for which demographic data is available such as census tracts.
    Beyond the issues presented by ECHO data, the county is also the appropriate scale of analysis for this study due to the estimate-based nature of the ACS. ACS estimates are based on a rolling 5-year sample structure and often have very large margins of error. At the census tract level, these standard errors can be massive, especially in rural areas30,31,32. Due to this variation, and the need to include all rural areas in this analysis, the county, where the margins of error are considerably smaller, is the appropriate unit for this study. All of this said, the county is, in fact, a larger unit than often desired or used in environmental justice studies. Studies focused on exclusively urban areas with clearer pathways of impact can and should use smaller units such as census tracts.
It will be imperative for future scholarship focused on water hardship across the rural-urban continuum to gain access to reliable data on sub-county political units, as well as data linking water systems to users, to continue documenting and pushing for water justice.
    Dependent variables
    The dependent variables for this analysis were assessed in both a continuous and dichotomous format. For descriptive results and mapping, continuous measures were used. For models of water injustice, a dichotomous measure, which classified counties as having either low or elevated levels of the specific water issue, was used because water access and quality issues are infrequent relative to the whole United States population. For all three outcomes, we benchmark an elevated level of the issue as what would be viewed as an unacceptable level under United Nations Sustainable Development Goal 6.1, which states, “by 2030 achieve universal and equitable access to safe and affordable drinking water for all”1. As this goal focuses on ensuring all people have safe water, we deem a county as having an elevated level of the issue if >1% of households, community water systems, or permittees had incomplete plumbing, were in Significant Violation, or Significant Noncompliance, respectively. Although we could have used an even stricter threshold given the SDG’s emphasis on ensuring access for all people, we use 1% as our cut-off due to its nominal value and ease of interpretation.
    For water access, the continuous measure was the percent of households in a county with incomplete household plumbing as reported by the ACS. The ACS currently asks respondents if they have access to hot and cold water, a sink with a faucet, and a bath or shower. Up until 2016, the question also included a flush toilet33.
As we must use the most recent 2014–2018 5-year estimates to establish full coverage of all counties, this means that incomplete plumbing in this item may or may not include a flush toilet depending on when the specific county was sampled. The dichotomous version of this variable benchmarked elevated levels of incomplete plumbing as whether or not 1% or more of households in a county had incomplete plumbing.
    Water quality was assessed via both community water systems from the Safe Drinking Water Act, and from permit data via the Clean Water Act. For Safe Drinking Water Act data, the continuous measure was the percent of community water systems within a county classified as a Safe Drinking Water Act Serious Violator at time of data extraction. The EPA assigns point values of either 1, 5, or 10 based upon the severity of violations of the Safe Drinking Water Act. A Serious Violator is one who has “an aggregate score of at least eleven points as a result of some combination of: unresolved more serious violations (such as maximum contaminant level violations related to acute contaminants), multiple violations (health-based, monitoring and reporting, public notification and/or other violations), and/or continuing violations”27. The dichotomous measure benchmarked elevated rates of Safe Drinking Water Act Significant Violation as whether or not >1% of county community water systems were classified as Serious Violators.
    For Clean Water Act permit data, the continuous measure was the percent of permit holders listed as in Significant Noncompliance at the time of data extraction. Significant Noncompliance in the Clean Water Act refers to those permit holders who may pose a “more severe level of environmental threat” and is based upon both pollution levels and reporting compliance27.
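    The county-level Safe Drinking Water Act measures described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the input layout (one record per community water system with its list of violation point values), and the county codes in the example are hypothetical. The scoring is reduced to a simple sum of points against the eleven-point cut-off quoted above, which may omit details of the EPA's actual scoring.

```python
from collections import defaultdict

# Cut-offs taken from the text: "at least eleven points" and ">1%" of systems.
SERIOUS_VIOLATOR_MIN_POINTS = 11
ELEVATED_THRESHOLD_PCT = 1.0


def is_serious_violator(violation_points):
    """Simplified rule (assumption): sum the 1/5/10 point values and compare to 11."""
    return sum(violation_points) >= SERIOUS_VIOLATOR_MIN_POINTS


def county_measures(systems):
    """Aggregate system-level records to the county level.

    systems: iterable of (county_code, [violation point values]) pairs,
    one pair per active community water system (hypothetical layout).
    Returns {county_code: {"pct_serious": float, "elevated": bool}}.
    """
    totals = defaultdict(int)
    serious = defaultdict(int)
    for county, points in systems:
        totals[county] += 1
        if is_serious_violator(points):
            serious[county] += 1

    result = {}
    for county, n_systems in totals.items():
        pct = 100.0 * serious[county] / n_systems  # continuous measure
        result[county] = {
            "pct_serious": pct,
            "elevated": pct > ELEVATED_THRESHOLD_PCT,  # dichotomous measure
        }
    return result


if __name__ == "__main__":
    # Made-up records for two hypothetical counties.
    example = [
        ("01001", [10, 1]),
        ("01001", [5, 5]),
        ("01001", [1]),
        ("01003", [1, 1]),
        ("01003", [5]),
    ]
    for county, measures in sorted(county_measures(example).items()):
        print(county, measures)
```

    Counties with no systems never appear in the output of this sketch, which mirrors the choice above to treat such counties as missing for modeling; a mapping step would have to fill them in as zeroes separately.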
The dichotomous measure again set the threshold for elevated levels of poor water quality at whether or not >1% of Clean Water Act permittees in a county were listed as in Significant Noncompliance at time of data extraction.
    Independent variables
    The independent variables we include in models of water injustice are those frequently shown to be related to environmental injustice in the United States. These include age, income, poverty, race, ethnicity, education, and rurality17,18,19,20,21,22,23,24,25. Age was included as median age. Income was included as median household income. Poverty was the poverty rate of the county as determined by the official poverty measure of the United States34. Race and ethnicity were included as percent non-Latino/a Black, percent non-Latino/a indigenous, and percent Latino/a. Because the focus was on indigeneity, percent American Indian or Alaska Native was collapsed with Native Hawaiian or Other Pacific Islander. We did not include percent non-Latino/a white due to issues of multicollinearity. Finally, rurality was included as a three-category county indicator of metropolitan, non-metropolitan metropolitan-adjacent, and non-metropolitan remote, as determined by the Office of Management and Budget in 201035. The OMB determines a county is metropolitan if it has a core urban area of 50,000 or more people, or is connected to a core metropolitan county by a 25% or greater share of commuting35. A non-metropolitan county is simply any county not classified as metropolitan. Non-metropolitan metropolitan-adjacent counties are those which immediately border a metropolitan county, and non-metropolitan remote counties are those that do not.
    Water injustice modeling approach
    Water injustice was assessed by estimating linear probability models for the three dichotomous outcome variables with state fixed effects to control for the visible state level heterogeneity and differences in policy, reporting, and enforcement (e.g. 
the clear state boundary effects in Fig. 3). We employ cluster-robust standard errors at the state level to account for both heteroskedasticity and state similarities. All modeling was performed in Stata 16.0 and mapping was performed in QGIS 3.10. We assessed all full models for multicollinearity via condition index and VIF values. The independent variables had an acceptable condition index of 5.48, well below the conservative cut-off of 15, as well as VIF values of 20). All indications of statistical significance are at the p
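    The modeling approach described above (a linear probability model with state fixed effects and state-clustered standard errors) can be illustrated with a deliberately small Python sketch. It is not the authors' Stata code: it handles a single regressor rather than the full covariate set, the function name and example values are hypothetical, and it omits the finite-sample correction that statistical packages typically apply to clustered standard errors.

```python
from collections import defaultdict


def lpm_state_fe(y, x, state):
    """One-regressor linear probability model with state fixed effects.

    y: 0/1 outcomes (e.g. elevated vs. not), x: one covariate,
    state: state label for each county. Fixed effects are absorbed by
    demeaning y and x within each state (the "within" transformation).
    Returns (slope, cluster_robust_se), clustered by state, with no
    finite-sample correction (assumption made for brevity).
    """
    groups = defaultdict(list)
    for i, s in enumerate(state):
        groups[s].append(i)

    x_dm = [0.0] * len(x)
    y_dm = [0.0] * len(y)
    for idx in groups.values():
        mean_x = sum(x[i] for i in idx) / len(idx)
        mean_y = sum(y[i] for i in idx) / len(idx)
        for i in idx:
            x_dm[i] = x[i] - mean_x
            y_dm[i] = y[i] - mean_y

    sxx = sum(v * v for v in x_dm)
    slope = sum(a * b for a, b in zip(x_dm, y_dm)) / sxx
    resid = [b - slope * a for a, b in zip(x_dm, y_dm)]

    # Sandwich "meat": squared within-state sums of x * residual.
    meat = sum(
        sum(x_dm[i] * resid[i] for i in idx) ** 2 for idx in groups.values()
    )
    se = (meat / (sxx * sxx)) ** 0.5
    return slope, se


if __name__ == "__main__":
    # Made-up data: four counties in two hypothetical states.
    elevated = [0, 1, 0, 1]
    poverty_rate = [10.0, 20.0, 30.0, 50.0]
    states = ["A", "A", "B", "B"]
    slope, se = lpm_state_fe(elevated, poverty_rate, states)
    print("slope:", slope, "clustered SE:", se)
```

    Because the outcome is binary, the slope reads as the change in the probability of an elevated county per one-unit change in the covariate, compared within states only.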

  • in

    Iran is draining its aquifers dry

    Many communities in Iran depend on wells for water, and are threatened by the rapid fall in the nation’s groundwater table. Credit: Mohsen Maghrebi

    Water resources
    16 June 2021

    Wells are proliferating, but data from across the country show that groundwater extraction is falling.

    Iran is using more groundwater than can be naturally recharged, according to an analysis of national data. And even as more wells are tapped into the ground each year, their overall output keeps dropping.
    Iran’s sources of groundwater include wells, springs and underground aqueducts known as qanats. Groundwater amounts to 60% of the country’s total supply and is consumed almost entirely by agriculture.
    Roohollah Noori at the University of Oulu in Finland and his colleagues studied data from Iran’s national groundwater-monitoring system from between 2002 and 2015. During that period, the number of wells and other locations that tap into groundwater nearly doubled. Yet the amount of groundwater extracted declined by 18%. Nationwide, the groundwater table dropped by an average of almost half a metre per year.
    In many wells, the water also became significantly saltier, to the point that only salt-tolerant plants would thrive if irrigated with it. Groundwater quality improved in only a few wet regions.
    Water scarcity threatens the livelihoods of people across Iran as the land becomes drier.

    Proc. Natl Acad. Sci. USA (2021)

  • in

    Most rivers and streams run dry every year

    NEWS AND VIEWS
    16 June 2021

    A model of the world’s rivers and streams has been developed to predict which of these watercourses flow all year round and which go dry. The analysis shows that rivers and streams that run dry are ubiquitous throughout the world.

    Kristin L. Jaeger

     ORCID: http://orcid.org/0000-0002-1209-8506

    Kristin L. Jaeger is at the Washington Water Science Center, US Geological Survey, Tacoma, Washington 98402, USA.

    The flowing waters of surface rivers and streams efficiently transport sediment, organic material and nutrients, among other things, from hillsides and overland areas to downstream lakes, reservoirs and the ocean. Along the way, rivers and streams (hereafter referred to collectively as streams) provide important resources for our communities and support rich, complex ecosystems. Non-perennial streams, which do not flow year-round, are crucial in this context. However, because non-perennial streams are less reliable sources of surface water than perennial ones, they are less-well studied than their perennial counterparts. Writing in Nature, Messager et al.1 provide a much-needed estimate of the total proportion of the world’s stream network, by length, that is non-perennial — and find that most fall into this category.
    Messager and colleagues combined streamflow data from sites around the world with information describing the hydrology, climate, physical geography and land cover at those sites, to model the probability that water does not flow for at least one day per year. They then expanded their predictions to all stream segments recorded in a global stream-network database (RiverATLAS)2.
    The authors report that 51–60% of the world’s streams do not flow for at least one day per year, and that 44–53% of global stream length is dry for at least one month (about 30 days) each year. Their modelling shows that non-perennial streams occur in all climates and biomes on every continent (see Fig. 1 of the paper1). The model also shows that 95% of the stream network in hot, dry regions (which represent 10% of the global landmass) runs dry each year (Fig. 1). Astonishingly, even segments of major rivers, such as the Niger River in West Africa, are predicted to dry up in these arid regions. The vast prevalence of non-perennial streams in such locations highlights how even streams that do not flow continuously substantially affect water availability and water quality. The results emphasize the need for more-detailed maps of perennial and non-perennial flows at regional and local scales, and for further studies of how non-perennial streams affect overall water availability and quality.

    Figure 1 | The dried-up Darling River in New South Wales, Australia, February 2020. Messager and colleagues’ analysis1 shows that most rivers and streams run dry for at least one day per year, including sections of major rivers in arid regions. Credit: Mark Evans/Getty

    Small headwater streams (those that have no tributaries) make up 70–80% of stream length worldwide3, similar to the way in which the collective length of one’s fingers is much greater than the length of the palm of the hand. Messager and co-workers’ model predicts that, even in the wettest regions, such as the Amazon River basin and portions of central Africa and southeast Asia, up to 35% of these headwater streams stop flowing at some point in the year. However, it should be noted that headwater streams are monitored by relatively few stream gauges, which tend to be located on larger, perennial rivers downstream. The model might therefore provide highly uncertain estimates for the upstream regions of stream networks.
    Lack of streamflow data is a common problem for the modelling of headwater streams, and so data-collection efforts are being implemented to fill this knowledge gap. For example, France has developed the Observatoire National des Étiages (ONDE) network, which complements the national stream-gauging network but focuses on headwater streams. However, these programmes are costly and require considerable investment of resources.
    Stream gauges are also scarce for non-perennial streams more generally. In Messager and colleagues’ analysis, for instance, there were no gauges in non-perennial streams in Argentina; just one in New Zealand; and 10 in the United States Pacific Northwest, out of a network of 250 gauges. To improve models that map perennial and non-perennial streams, low-cost field observations will be needed, coupled with the development of high-resolution remote-sensing technology that frequently detects, or at least predicts, surface flow in streams.
    Messager and co-workers’ analysis provides a robust, quantitative confirmation of the ubiquity of non-perennial rivers. Their results indicate the need for a fundamental change in the fields of river and stream science and management, in which non-perennial streams have been largely overlooked4. In arid regions, the predominance of non-perennial streams might be a major driver of water availability and quality. And in areas where services developed by humans are not readily available, ecosystem services such as flowing water in streams are used to meet basic needs and will, in part, determine the well-being and prosperity of people in that area5. The new findings therefore shine a light on the need for global accounting of both perennial and non-perennial streams.
    Moreover, changes in the distribution of streams can have far-reaching impacts on carbon and biogeochemical cycles at global and continental scales6, and on the survival of stream-dwelling organisms, including many endangered species7. A global benchmark of the prevalence of perennial and non-perennial streams is therefore crucial for evaluating the effects of future changes in their distribution associated with climate and land-use change. Finally, regional and local models of streams are needed, as well as better data for headwaters and non-perennial portions of the stream network, to further increase the value of global models.

    Nature 594, 335-336 (2021)
    doi: https://doi.org/10.1038/d41586-021-01528-4

    References
    1. Messager, M. L. et al. Nature 594, 391–397 (2021).
    2. Linke, S. et al. Sci. Data 6, 283 (2019).
    3. Wohl, E. Front. Earth Sci. 11, 447–456 (2017).
    4. Acuña, V. et al. Science 343, 1080–1081 (2014).
    5. McClain, M. E. Ambio 42, 549–565 (2013).
    6. Aufdenkampe, A. K. et al. Front. Ecol. Environ. 9, 53–60 (2011).
    7. Magalhães, M. F., Beja, P., Schlosser, I. J. & Collares-Pereira, M. J. Freshwat. Biol. 52, 1494–1510 (2007).

    Competing Interests
    The author declares no competing interests.

  • in

    Global prevalence of non-perennial rivers and streams


    Google Scholar 
    75.Zhang, G. & Lu, Y. Bias-corrected random forests in regression. J. Appl. Stat. 39, 151–160 (2012).MathSciNet 
    MATH 

    Google Scholar 
    76.Japkowicz, N. & Stephen, S. The class imbalance problem: a systematic study. Intell. Data Anal. 6, 429–449 (2002).MATH 

    Google Scholar 
    77.Bischl, B., Mersmann, O., Trautmann, H. & Weihs, C. Resampling methods for meta-model validation with recommendations for evolutionary computation. Evol. Comput. 20, 249–275 (2012).CAS 
    PubMed 

    Google Scholar 
    78.Probst, P., Wright, M. N. & Boulesteix, A. L. Hyperparameters and tuning strategies for random forest. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 9, e1301 (2019).
    Google Scholar 
    79.Probst, P. & Boulesteix, A. L. To tune or not to tune the number of trees in random forest. J. Mach. Learn. Res. 18, 1–8 (2018).MathSciNet 
    MATH 

    Google Scholar 
    80.Schratz, P., Muenchow, J., Iturritxa, E., Richter, J. & Brenning, A. Hyperparameter tuning and performance assessment of statistical and machine-learning algorithms using spatial data. Ecol. Modell. 406, 109–120 (2019).
    Google Scholar 
    81.Brenning, A. Spatial cross-validation and bootstrap for the assessment of prediction rules in remote sensing: the R package sperrorest. In 2012 IEEE Int. Geoscience and Remote Sensing Symp. (IGARSS) 5372–5375 (2012); https://doi.org/10.1109/IGARSS.2012.635239382.Meyer, H., Reudenbach, C., Hengl, T., Katurji, M. & Nauss, T. Improving performance of spatio-temporal machine learning models using forward feature selection and target-oriented validation. Environ. Model. Softw. 101, 1–9 (2018).
    Google Scholar 
    83.Meyer, H., Reudenbach, C., Wöllauer, S. & Nauss, T. Importance of spatial predictor variable selection in machine learning applications – moving from data reproduction to spatial prediction. Ecol. Modell. 411, 108815 (2019).
    Google Scholar 
    84.Brodersen, K. H., Ong, C. S., Stephan, K. E. & Buhmann, J. M. The balanced accuracy and its posterior distribution. In Proc. Int. Conf. Pattern Recognition 3121–3124 (2010); https://doi.org/10.1109/ICPR.2010.76485.Altmann, A., Toloşi, L., Sander, O. & Lengauer, T. Permutation importance: a corrected feature importance measure. Bioinformatics 26, 1340–1347 (2010).CAS 
    PubMed 

    Google Scholar 
    86.Amaratunga, D., Cabrera, J. & Lee, Y.-S. Enriched random forests. Bioinformatics 24, 2010–2014 (2008).CAS 
    PubMed 

    Google Scholar 
    87.Evans, J. S., Murphy, M. A., Holden, Z. A. & Cushman, S. A. Modeling species distribution and change using random forest. In Predictive Species and Habitat Modeling in Landscape Ecology: Concepts and Applications 139–159 (Springer, 2011); https://doi.org/10.1007/978-1-4419-7390-0_888.Jones, Z. M. & Linder, F. J. edarf: Exploratory Data Analysis using Random Forests. J. Open Source Softw. 1, 92 (2016).ADS 

    Google Scholar 
    89.Friedman, J. H. Greedy function approximation: a gradient boosting machine. Ann. Stat. 29, 1189–1232 (2001).MathSciNet 
    MATH 

    Google Scholar 
    90.Bondarenko, M., Kerr, D., Sorichetta, A. & Tatem, A. J. Census/projection-disaggregated gridded population datasets for 189 countries in 2020 using Built-Settlement Growth Model (BSGM) outputs (WorldPop, University of Southampton, accessed 26 November 2020); https://doi.org/10.5258/SOTON/WP0068491.Colvin, S. A. R. et al. Headwater streams and wetlands are critical for sustaining fish, fisheries, and ecosystem services. Fisheries 44, 73–91 (2019).
    Google Scholar 
    92.Hastie, T., Tibshirani, R. & Friedman, J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction (Springer Science & Business Media, 2009).93.Clauset, A., Shalizi, C. R. & Newman, M. E. J. Power-law distributions in empirical data. SIAM Rev. 51, 661–703 (2009).ADS 
    MathSciNet 
    MATH 

    Google Scholar 
    94.Fritz, K. M. et al. Comparing the extent and permanence of headwater streams from two field surveys to values from hydrographic databases and maps. J. Am. Water Resour. Assoc. 49, 867–882 (2013).ADS 

    Google Scholar 
    95.Stoddard, J. L. et al. Environmental Monitoring and Assessment Program (EMAP): Western Streams and Rivers Statistical Summary. Report no. EPA/620/R-05/006 (NTIS PB2007-102088) (US Environmental Protection Agency, 2005).96.Hafen, K. C., Blasch, K. W., Rea, A., Sando, R. & Gessler, P. E. The influence of climate variability on the accuracy of NHD perennial and nonperennial stream classifications. J. Am. Water Resour. Assoc. 56, 903–916 (2020).ADS 

    Google Scholar 
    97.Colson, T., Gregory, J., Dorney, J. & Russell, P. Topographic and soil maps do not accurately depict headwater stream networks. Natl Wetlands Newsl. 30, 25–28 (2008).
    Google Scholar 
    98.Allen, D. C. et al. Citizen scientists document long-term streamflow declines in intermittent rivers of the desert southwest, USA. Freshw. Sci. 38, 244–256 (2019).
    Google Scholar 
    99.Datry, T., Pella, H., Leigh, C., Bonada, N. & Hugueny, B. A landscape approach to advance intermittent river ecology. Freshw. Biol. 61, 1200–1213 (2016).
    Google Scholar 
    100.McShane, R. R., Sando, R. & Hockman-Wert, D. P. Streamflow observation points in the Pacific Northwest, 1977–2016. U.S. Geological Survey data release https://doi.org/10.5066/F7BV7FSP (2017).101.Observatoire National des étiages (ONDE) (French Office for Biodiversity (OFC), accessed 21 June 2020); https://onde.eaufrance.fr/content/t%C3%A9l%C3%A9charger-les-donn%C3%A9es-des-campagnes-par-ann%C3%A9e102.Aguas Continentales de Argentina (Argentinian National Geographic Institute (IGN), accessed 11 June 2020); https://www.ign.gob.ar/NuestrasActividades/InformacionGeoespacial/CapasSIG103.Australian Hydrological Geospatial Fabric (Geofabric, v. 3.2) (Australian Bureau of Meteorology (BOM), accessed 11 June 2020); ftp://ftp.bom.gov.au/anon/home/geofabric/Geofabric_Metadata_GDB_V3_2.zip104.Base Cartográfica Continua do Brasil (BC250, 2019 version) (Brazilian Institute of Geography and Statistics (IBGE); accessed 11 June 2020); https://geoftp.ibge.gov.br/cartas_e_mapas/bases_cartograficas_continuas/bc250/versao2019/105.National Hydrography Dataset Plus (NHDPlus, medium resolution, v.2) (US Geological Survey, accessed 11 June 2020); https://www.epa.gov/waterdata/get-nhdplus-national-hydrography-dataset-plus-data106.Busch, M. H. et al. What’s in a name? Patterns, trends, and suggestions for defining non-perennial rivers and streams. Water 12, 1980 (2020).PubMed 

    Google Scholar 
    107.Datry, T. et al. Science and management of intermittent rivers and ephemeral streams (SMIRES). Res. Ideas Outcomes 3, e21774 (2017).
    Google Scholar 
    108.Trabucco, A. & Zomer, R. J. Global high-resolution soil–water balance. https://doi.org/10.6084/m9.figshare.7707605.v3 (2010).109.Hall, D. K. & Riggs, G. A. MODIS/Aqua Snow Cover Daily L3 Global 500m SIN Grid, Version 6. [2002–2015] (NASA National Snow and Ice Data Center Distributed Active Archive Center, accessed 15 February 2017); https://doi.org/10.5067/MODIS/MYD10A1.006110.Fan, Y., Li, H. & Miguez-Macho, G. Global patterns of groundwater table depth. Science 339, 940–943 (2013).ADS 
    CAS 
    PubMed 

    Google Scholar 
    111.Fluet-Chouinard, E., Lehner, B., Rebelo, L.-M., Papa, F. & Hamilton, S. K. Development of a global inundation map at high spatial resolution from topographic downscaling of coarse-scale remote sensing data. Remote Sens. Environ. 158, 348–361 (2015).ADS 

    Google Scholar 
    112.Döll, P., Kaspar, F. & Lehner, B. A global hydrological model for deriving water availability indicators: model tuning and validation. J. Hydrol. 270, 105–134 (2003).ADS 

    Google Scholar 
    113.Bartholomé, E. & Belward, A. S. GLC2000: a new approach to global land cover mapping from Earth observation data. Int. J. Remote Sens. 26, 1959–1977 (2005).ADS 

    Google Scholar 
    114.GLIMS and National Snow and Ice Data Center. GLIMS Glacier Database V1 (2012); https://doi.org/10.7265/N5V98602115.Gruber, S. Derivation and analysis of a high-resolution estimate of global permafrost zonation. Cryosphere 6, 221–233 (2012).ADS 

    Google Scholar 
    116.Ramankutty, N. & Foley, J. A. Estimating historical changes in global land cover: croplands from 1700 to 1992. Glob. Biogeochem. Cycles 13, 997–1027 (1999).ADS 
    CAS 

    Google Scholar 
    117.Lehner, B. & Döll, P. Development and validation of a global database of lakes, reservoirs and wetlands. J. Hydrol. 296, 1–22 (2004).ADS 

    Google Scholar 
    118.Robinson, N., Regetz, J. & Guralnick, R. P. EarthEnv-DEM90: a nearly-global, void-free, multi-scale smoothed, 90m digital elevation model from fused ASTER and SRTM data. ISPRS J. Photogramm. Remote Sens. 87, 57–67 (2014).ADS 

    Google Scholar 
    119.Williams, P. W. & Ford, D. C. Global distribution of carbonate rocks. Z. Geomorphol. Suppl. 147, 1–2 (2006).
    Google Scholar 
    120.Hartmann, J. & Moosdorf, N. The new global lithological map database GLiM: a representation of rock properties at the Earth surface. Geochem. Geophys. Geosyst. 13, Q12004 (2012).ADS 

    Google Scholar  More

  • in

    Community–academic partnerships helped Flint through its water crisis

    COMMENT
    15 June 2021

    A city that faced a public-health emergency shows how collaborations with neighbourhood advocates can advance health equity.

    E. Yvonne Lewis & Richard C. Sadler

    E. Yvonne Lewis

    E. Yvonne Lewis is founder and chief executive of the National Center for African American Health Consciousness, Flint; co-community principal investigator at the Flint Center for Health Equity Solutions; co-director of the Healthy Flint Research Coordinating Center Community Core; and director of outreach, Genesee Health Plan, Flint, Michigan, USA.

    Richard C. Sadler

    Richard C. Sadler is associate professor of public health at Michigan State University, Flint, Michigan, USA.

    Residents of Flint, Michigan, attended community blood-testing events in 2016 after lead contamination was found in the city’s water supply. Credit: Brett Carlsen/Getty

    Flint in Michigan is infamous for its water crisis. From 2014, the state government decided to divert the city’s water supply through ageing pipes that contained lead, a neurotoxin, making many people unwell and leading to some deaths. Residents were left searching out water that was safe for drinking, washing and bathing. Nine public officials face criminal negligence charges around wilful neglect of duty and for allegedly concealing and misrepresenting data. A US$640-million class-action lawsuit is moving its way through the courts.

    But Flint should be known for more than its public-health tragedy. Accounts of the crisis often cast pioneering scientists and physicians as lone heroes, assuming that those who documented the lead in the water and blood of Flint’s residents were the ones who brought officials to account. That assumption erases the work of community activists who got academics to look for lead and its damaging health effects in the first place. Flint is a working example of how community members and academics can collaborate on problems (such as how to collect data or develop robust models of health risks and injustices) and on finding solutions.

    Flint’s water crisis came to light because of strong research partnerships between activists, academics and other specialists. These partnerships continue to advance work that matters to the community. Efforts include identifying neighbourhood conditions (including crime levels, asthma rates and access to healthy food) and assessing projects to improve them. It requires a commitment that research does not just end up in a thesis or paper, but becomes information that is useful to community members.

    Here’s one example. The Genesee Health Plan is a non-profit benefit programme that provides basic health-care coverage to uninsured residents of Genesee County, which includes Flint. It was established in 2001 and is supported by property taxes. One of us (E.Y.L.) helped to provide the other (R.C.S.) with data from a sample of Genesee Health Plan enrollees to produce maps of chronic conditions. One map showed the health plan’s wide adoption in our community, and officials used it to advocate for voter support when the tax measure was renewed in 2018. This partnership was possible only because of the connections already formed between E.Y.L., who is a community organizer, and R.C.S., a geographer and public-health specialist at Michigan State University (MSU) in Flint.
    Long-standing efforts to ensure Flint community members have a voice in research have gained momentum. One tangible result was the creation of the Healthy Flint Research Coordinating Center in 2016. To form the centre, E.Y.L. and another Flint resident representing community organizations joined up with six researchers: two each at MSU, the University of Michigan in Ann Arbor and the University of Michigan–Flint. It works to minimize redundant research, maximize creation of new community–academic partnerships and ensure that research receives a community ethics review.

    Also established in 2016 to support equitable community–academic partnerships was the Flint Center for Health Equity Solutions, funded by the US National Institutes of Health (NIH). E.Y.L. is the centre’s overall community principal investigator. Each of its four divisions and two research projects is co-directed by a community member and an academic. The divisions are: methodology (which R.C.S. directs); dissemination and implementation sciences; administrative; and consortium partners. A programme within the centre, in which people with substance-use disorders are coached by their peers, has expanded and is now supported by additional external funding.

    Here, we distil how we’ve made community-based research work, and provide lessons others might use.

    Distinct challenges

    Each of us experienced different challenges before we formed our community partnership, which might offer some pointers for others considering such collaborations. To that end, here, we relate our stories individually.

    R.C.S. writes: I grew up in Flint, and joined MSU as a faculty member in 2015. I knew that the kind of community-focused work I was most passionate about makes it harder to rack up the publications and citations required to progress in most academic institutions, which often treat these as a proxy for high-quality research.

    A medical assistant checks for the presence of lead in blood samples as part of a community campaign in Flint, Michigan. Credit: Jim West/Alamy

    I still worked to hit those markers, publishing more than 50 papers in 6 years. I secured several grants from agencies that fund research that has community value, including agencies in the NIH, the US Centers for Disease Control and Prevention (CDC) and the Michigan Department of Health and Human Services. My focus was on work that mattered to the community, and I didn’t worry whether journals had high impact factors or huge name recognition.

    The community-engaged philosophy of the College of Human Medicine at MSU, where I gained tenure this year, made it more open to alternative metrics, such as volunteering on local non-profit committees, conducting community-based mapping and talking about research at local meetings. The key was to frame my academic output on a longer time scale than that of publications, long enough to see meaningful change.
    E.Y.L. writes: As an African American female community activist of decades’ standing, I worried about being physically mistreated, emotionally abused and misrepresented by research institutions. On one occasion before I moved to Flint, I remarked that some physicians’ descriptions of pregnant African American women as unconcerned with or unwilling to take care of their own needs did not reflect people in my community. I was asked about my academic credentials and then ignored for the rest of the conversation. I experienced this often during the water crisis: community members were touted as being great citizen scientists, until there was disagreement with the ‘real scientists’. Then we were marginalized and told we lacked the necessary degrees to provide input.

    As community members, we also see our ideas appropriated. For instance, during a discussion at one national meeting, I made a distinction, on the basis of my own experience, between projects that were faith-based (driven by religious principles) and those that were faith-placed (using spaces such as churches). The following year, a researcher presented data based on this model without acknowledging me as the inspiration. I felt dishonoured, discouraged and demotivated. I now ask academic partners to give attribution for my ideas. Knowing the norms, and what credit to request, has helped me immeasurably.

    Joint challenges

    One of the biggest barriers to community participation is language. Words can have different meanings in different contexts; for instance, the phrase ‘those people’ can be highly offensive in many situations. When community members hear terms such as public engagement, they assume that ‘public’ refers to a broad, mixed group of individuals, such as those who might go to a public event. Yet academics often use the term to mean targeted outreach to specific groups of people: faith leaders, patient groups or policymakers, say. And to help navigate excessive jargon in the early stages of Flint’s partnerships, one group developed a glossary of acronyms such as NIH and CDC.

    Importantly, everyone involved must take time to understand the culture and unique characteristics of the groups within communities. Not all Black communities are the same, for instance, and none is homogeneous. The heterogeneity among people’s levels of income, education and health insurance must be kept in mind in communications. Research materials written in English for a ‘general audience’ might not be appropriate: strong cultural dialects and a lack of access to information need to be considered.

    Funding norms can also become a barrier to sustaining long-term relationships. Grants that last only one, two or five years are insufficient to address many community concerns. Too many communities have experienced projects for which funding ends and researchers move on, leaving unfinished work. Without sustained effort, the situation can revert to being the same or worse than it was before the project began. This is partly why community-engaged work is so important: researchers committed to the cause will continue as partners long after the funding is gone. And if grants from typical funders run out, academics will find other sources of support for community partners, such as by maintaining relationships with local philanthropies. (In Flint, such support has come from the C. S. Mott Foundation and Community Foundation of Greater Flint.)

    A local church in Flint was set up as a water distribution centre because lead contamination had made the public supply unsafe. Credit: Tom Williams/CQ Roll Call

    Researchers often come to communities with a prepared study design, seeking approval rather than input, even when input could improve a study. Researchers assessing campaigns to promote healthy eating might include a control group that receives nothing, whereas the treatment group receives a suite of services and vouchers. This creates a perception of unfairness that can warp a study and discourage participation. Too often, researchers treat community partners who point out such risks as a barrier to progress, rather than as a liaison to a robust study. That attitude undermines future interactions. Establishing realistic expectations is one way to mitigate this issue.

    Researchers might also offer to provide training in work that is already under way. For example, Flint has a crime-reduction programme in which residents proactively assess whether street lights are working and maintain vacant properties. Proposals that disregard what is already in place are wasteful and cause resentment. At one point, a team of researchers approached us to implement a healthy-eating project, not realizing that the Flint community had helped to develop the recipe book on which it was based. The Healthy Flint Research Coordinating Center now maintains an index of projects to discourage redundant work (one of R.C.S.’s tasks).
    Before and especially during the water crisis, a string of ‘helicopter researchers’ from outside Flint came to study topics from environmental issues to violence. Community members were asked to fill out surveys, or learnt through informal chatter about researchers who wanted records about emergency hospitalizations. But data and insights were not brought back to the community. Many residents felt used and dismissed. The coordinating centre now works with researchers so their results can be applied to inform and improve the community where data were collected.

    Interactions are generative: when academic researchers dismiss community ideas, take them without credit, bristle at valid input, ‘introduce’ programmes that are already in progress or focus more on producing papers than on helping communities, residents will expect the same of other researchers. Even those with the best of intentions can be rebuffed or face distrust, something R.C.S. was attuned to when he began his transition from Flint community member to academic.

    Nurture relationships

    Ideally, interactions become constructive feedback loops. In 2018, E.Y.L. provided health-plan data to R.C.S. The resulting analysis using a geographic information system (GIS) showed, for the first time, that the centre of Flint was an asthma hotspot (see ‘Asthma hotspots in Flint’). This pattern correlates with historical sites of car factories and lead contamination in the soil (M. A. S. Laidlaw et al. Int. J. Environ. Res. Public Health 13, 358; 2016). R.C.S. explored how best to show those patterns in ways that would be interpretable and helpful to community members. These results have informed targeted outreach activities, such as developing tailored materials based on local landmarks and identifying specific neighbourhoods, churches or community groups where the materials can be distributed.

    Figure: Asthma hotspots in Flint. Source: E. Yvonne Lewis & Richard C. Sadler

    None of this would have happened without the partnership and trust we had built. The university needed access to health-plan data. Health-plan officials had to trust researchers to answer relevant questions, honour patient confidentiality and provide insight to accomplish the plan’s goal.

    As the value of such analysis became clear, community members were eager for more. Most neighbourhood and community groups come together to solve a specific, immediate problem, not to form a self-sustaining, long-lasting organization, so they rarely consider mechanisms for collecting long-term data. Flint now sees community members approaching researchers; they seek to evaluate programmes that they’ve put into place. They want data to support the fact that they do good work and to show which efforts are most effective. A true partnership has been achieved.

    The partnership represents many works in progress, far beyond what we describe here. There are still conflicts, miscommunication and lost opportunities. But we now know how to set ourselves up for success as projects emerge.

    The most important ingredient in making collaborations work is commitment: to producing research that is relevant, and to understanding many angles and perspectives. This means spending less time and attention on conventional metrics, such as published papers, journal impact factors and procured grants, and much more on nurturing relationships. In true community-based partnerships, a paper is incomplete without a link back to the local community.

    Although our experiences are specific to Flint, community–academic partnerships that focus on research that is relevant to policy are essential worldwide. Regions in the Rust Belt of North America, Eastern Europe and east Asia have all experienced population decline and economic problems. More will soon do so. Exploring solutions is of benefit both to researchers and to communities when they work together.

    Nature 594, 326-329 (2021)
    doi: https://doi.org/10.1038/d41586-021-01586-8

    Competing Interests
    The authors declare no competing interests.
