More stories

  • in

    The how tough is WASH framework for assessing the climate resilience of water and sanitation

    1.Howard, G., Calow, R., Macdonald, A. & Bartram, J. Climate change and water and sanitation: likely impacts and emerging trends for action. Annu. Rev. Environ. Resour. 41, 253–276 (2016).Article 

    Google Scholar 
    2.Jiménez-Cisneros, B. E., et al. Freshwater resources. In Climate Change 2014: Impacts, Adaptation, and Vulnerability, Part A: Global and Sectoral Aspects (Working Group II Contribution to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change), editors: C. B. Field, V. R. Barros, D. J. Dokken, K. J. Mach, M. D. Mastrandrea, et al., 229–269 (UK: Cambridge University Press, 2014).3.IPCC. Climate Change 2014: Synthesis Report. Contribution of Working Groups I, II and III to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change [Core Writing Team, R. K. Pachauri and L. A. Meyer (eds.)] IPCC, Geneva, Switzerland, 151 (2014).4.Bartram, J. & Cairncross, S. Hygiene, sanitation, and water: forgotten foundations of health. PLoS Med. 7, e1000367 (2010).Article 

    Google Scholar 
    5.Howard, G., & Bartram, J. Vision 2030: the resilience of water supply and sanitation in the face of climate change. Technical Report, (WHO, Geneva, 2009).6.Sherpa, A. M., Koottatep, T., Zurbruegg, C. & Cissé, G. Vulnerability and adaptability of sanitation systems to climate change. J. Water Clim. Change 5, 487–495 (2014).Article 

    Google Scholar 
    7.Heath, T. T., Parker, A. H. & Weatherhead, E. K. Testing a rapid climate change adaptation assessment for water and sanitation providers in informal settlements in three cities in sub-Saharan Africa. Environ. Urbanization 24, 619–637 (2012).Article 

    Google Scholar 
    8.Fleming, L. et al. Urban and rural sanitation in the Solomon Islands: how resilient are these to extreme weather events? Sci. Total Environ. 683, 331–340 (2019).CAS 
    Article 

    Google Scholar 
    9.Khan, S. J. et al. Extreme weather events: should drinking water quality management systems adapt to changing risk profiles? Water Res. 85, 124–136 (2019).Article 
    CAS 

    Google Scholar 
    10.World Health Organisation. Climate-resilient water safety plans: managing health risks associated with climate variability and change. p 82, (World Health Organization, Geneva, 2017).11.Ricket, B., van den Berg, H., Bekurec, K. & Girmad, S. & de Roda Husman, A.M. Including aspects of climate change into water safety planning: Literature review of global experience and case studies from Ethiopian urban supplies. Int. J. Hyg. Environ. Health 222, 744–755 (2019).Article 

    Google Scholar 
    12.Hallegatte, S. & Engle, N. L. The search for the perfect indicator: reflections on monitoring and evaluation of resilience for improved climate risk management. Clim. Risk Manag. 23, 1–6 (2019).Article 

    Google Scholar 
    13.GWP & UNICEF. WASH Climate Resilient Development Technical Brief: Monitoring and evaluation for climate resilient WASH. https://www.gwp.org/globalassets/global/about-gwp/publications/unicef-gwp/gwp_unicef_monitoring-and-evaluation-brief.pdf (2017).14.ARCADIS. Measuring resilience in the water industry. https://www.unitedutilities.com/globalassets/z_corporate-site/about-us-pdfs/looking-to-the-future/measuring-resilience-in-the-water-industry_final.pdf (2017).15.Nokes, C. Water Supply Climate Change Vulnerability Assessment Tool Handbook Health Analysis & Information For Action (HAIFA). ESR Client Report No: CSC12010. (Environmental Science and Research Limited, Porirua, New Zealand, 2012).16.Lloyd, B. J. & Bartram, J. Surveillance solutions to microbiological problems in water quality control in developing countries. Water Sci. Technol. 24, 61–75 (1991).Article 

    Google Scholar 
    17.Lloyd, B. J. & Helmer, R. Surveillance of Drinking Water Quality in Rural Areas. Longman, Harlow, UK (1991).18.World Health Organisation. Guidelines for drinking-water quality 2nd edition Volume 3: Surveillance and control of community supplies. Geneva, (World Health Organization, 1997).19.Howard, G. & Bartram, J. Effective water supply surveillance in urban areas of developing countries. J. Water Health 3, 31–43 (2005).Article 

    Google Scholar 
    20.Kohlitz, J., Chong, J. & Willetts, J. Rural drinking water safety under climate change: the importance of addressing physical, social, and environmental dimensions. RESOURCES 9, 77 (2020).Article 

    Google Scholar 
    21.Kelly, E. R., Cronk, R., Kumpel, E., Howard, G. & Bartram, J. How we assess water safety: a critical review of sanitary inspection and water quality analysis. Sci. Total Environ. 718, 137237 (2020).CAS 
    Article 

    Google Scholar 
    22.MacDonald, A. M., Calow, R. C., MacDonald, D. M. J., Darling, W. G. & Dochartaigh, B. E. O. What impact will climate change have on rural groundwater supplies in Africa? Hydrological Sci. J. 54, 690–703 (2009).Article 

    Google Scholar 
    23.Rickert, B. Chorus, I. & Schmoll, O. (eds). Protecting surface water for health. Identifying, assessing and managing drinking-water quality risks in surface-water catchments. WHO, Geneva. 178pp (2016).24.Schmoll, O. Howard, G., Chilton, J. and Chorus, I. (eds). Protecting Groundwater for Health: managing the quality of drinking-water sources. WHO, Geneva. 609pp (2006).25.Saha, A. K. & Agrawal, S. Mapping and assessment of flood risk in Prayagraj district, India: a GIS and remote sensing study. Nanotechnol. Environ. Eng. 5, 1–18 (2020).Article 
    CAS 

    Google Scholar 
    26.Sahana, M. & Sajjad, H. Vulnerability to storm surge flood using remote sensing and GIS techniques: a study on Sundarban Biosphere Reserve, India. Remote Sens. Appl.: Soc. Environ. 13, 106–120 (2019).
    Google Scholar 
    27.Belal, A. A., El-Ramady, H. R., Mohamed, E. S. & Saleh, A. M. Drought risk assessment using remote sensing and GIS techniques. Arab. J. Geosci. 7, 35–53 (2014).Article 

    Google Scholar 
    28.Palamuleni, L. G., Ndomba, P. M. & Annegarn, H. J. Evaluating land cover change and its impact on hydrological regime in Upper Shire river catchment, Malawi. Reg. Environ. Change 11, 845–855 (2011).Article 

    Google Scholar 
    29.Masocha, M., Murwira, A., Magadza, C. H., Hirji, R. & Dube, T. Remote sensing of surface water quality in relation to catchment condition in Zimbabwe. Phys. Chem. Earth Parts A/B/C. 100, 13–18 (2017).Article 

    Google Scholar 
    30.Wang, X. et al. A method coupled with remote sensing data to evaluate non-point source pollution in the Xin’anjiang catchment of China. Sci. Total Environ. 430, 132–143 (2012).CAS 
    Article 

    Google Scholar 
    31.Basnyat, P., Teeter, L. D., Lockaby, B. G. & Flynn, K. M. The use of remote sensing and GIS in watershed level analyses of non-point source pollution problems. For. Ecol. Manag. 128, 65–73 (2000).Article 

    Google Scholar 
    32.Baird, J., et al. The emerging scientific water paradigm: Precursors, hallmarks, and trajectories. WIREs Water https://doi.org/10.1002/wat2.1489 (2021).33.da Silva Wells, C., van Lieshout, R. & Uytewall, E. Monitoring for learning and developing capacities in the WASH sector. Water Policy 15, 206–225 (2013).Article 

    Google Scholar 
    34.Howard, G. et al. Securing 2020 vision for 2030: climate change and ensuring resilience in water and sanitation services. J. Water Clim. 1, 2–16 (2010).Article 

    Google Scholar 
    35.Whaley, L. & Cleaver, F. Can ‘functionality’ save the community management model of rural water supply? Water Resour. Rural Dev. 9, 56–66 (2017).Article 

    Google Scholar 
    36.Kohlitz, J., Chong, J. & Willetts, J. Analysing the capacity to respond to climate change: a framework for community-managed water services. Clim. Dev. 11, 775–785 (2019).Article 

    Google Scholar 
    37.Blue, G., Rosol, M. & Fast, V. Justice as Parity of Participation: Enhancing Arnstein’s Ladder Through Fraser’s Justice Framework. J. Am. Plan. Assoc. 85, 363–376 (2019).Article 

    Google Scholar 
    38.Buggy, L. & McNamara, K. E. The need to reinterpret “community” for climate change adaptation: a case study of Pele Island, Vanuatu. Clim. Dev. 8, 270–280 (2016).Article 

    Google Scholar 
    39.Adger, W. N., Barnett, J., Brown, K., Marshall, N. & O’Brien, K. Cultural dimensions of climate change impacts and adaptation. Nat. Clim. Change 3, 112–117 (2013).Article 

    Google Scholar 
    40.Sanyal, S. & Routray, J. K. Social capital for disaster risk reduction and management with empirical evidences from Sundarbans of India. Int. J. Disaster Risk Reduct. 19, 101–111 (2016).Article 

    Google Scholar 
    41.Bihari, M. & Ryan, R. Influence of social capital on community preparedness for wildfires. Landsc. Urban Plan. 106, 253–261 (2012).Article 

    Google Scholar 
    42.Bisung, E. & Elliott, S. J. “It makes us really look inferior to outsiders”: Coping with psychosocial experiences associated with the lack of access to safe water and sanitation. Canadian. J. Public Health 108, 442–447 (2017).
    Google Scholar 
    43.Stoler, J. et al. Household water sharing: a missing link in international health. Int. Health 11, 163–165 (2019).Article 

    Google Scholar 
    44.Zug, S. & Graefe, O. The gift of water. Social redistribution of water among neighbours in Khartoum. Water Alternatives, 7, 140-159(2014).45.Adeniji-Oloukoi, G., Urmilla, B. & Vadi, M. Households’ coping strategies for climate variability related water shortages in Oke-Ogun region, Nigeria. Environmental. Development 5, 23–38 (2013).
    Google Scholar 
    46.Hutchings, P. et al. A systematic review of success factors in the community management of rural water supplies over the past 30 years. Water Policy 17, 963–983 (2015).Article 

    Google Scholar 
    47.Miller, M. et al. External support programs to improve rural drinking water service sustainability: A systematic review. Sci. Total Environ. 670, 717–731 (2019).CAS 
    Article 

    Google Scholar 
    48.Harvey, P. A. & Reed, R. A. Community-managed water supplies in Africa: sustainable or dispensable? Community Dev. J. 42, 365–378 (2006).Article 

    Google Scholar 
    49.Kayser, G. L., Moomaw, W., Portillo, J. M. O. & Griffiths, J. K. Circuit rider post-construction support: improvement in domestic water quality and system sustainability in El Salvador. J. Water, Sanitation Hyg. Dev. 4, 460–470 (2014).Article 

    Google Scholar 
    50.Harvey, P. A. & Reed, R. A. Sustainable supply chains for rural water supplies in Africa. Eng. Sustain. 159, 31–39 (2006).Article 

    Google Scholar 
    51.Colon, C., Hallegatte, S. & Rozenberg J. Criticality analysis of a country’s transport network via an agent-based supply chain model. Nat. Sustain. https://doi.org/10.1038/s41893-020-00649-4 (2020).52.Baharmand, H., Comes, T. & Lauras, M. Defining and measuring the network flexibility of humanitarian supply chains: insights from the 2015 Nepal earthquake. Ann. Oper. Res. 283, 961–1000 (2019). Special Issue: SI.Article 

    Google Scholar 
    53.Haraguchi, M. & Lall, U. Flood risks and impacts: A case study of Thailand’s floods in 2011 and research questions for supply chain decision making. Int. J. Disaster Risk Reduct. 14, 256–272 (2015).Article 

    Google Scholar 
    54.Salehi, S. et al. Climate change adaptation: a systematic review on domains and indicators. Nat. Hazards 96, 521–550 (2019).Article 

    Google Scholar 
    55.Pories, L., Fonseca, C. & Delmon, V. Mobilising Finance for WASH: Getting the foundations right. Water https://doi.org/10.3390/w11112425 (2019).56.Milly, P. C. D. et al. Stationarity Is Dead: Whither Water Management? Science https://doi.org/10.1126/science.1151915 (2008).57.Shepherd, T. G. Storyline approach to the construction of regional climate change information. Proc. R. Soc. Math. Phys. Eng. Sci. 475, 20190013 (2019).
    Google Scholar  More

  • in

    Empirical estimate of forestation-induced precipitation changes in Europe

    1.Lee, X. et al. Observed increase in local cooling effect of deforestation at higher latitude. Nature 479, 384–387 (2011).Article 

    Google Scholar 
    2.Li, Y. et al. Local cooling and warming effects of forests based on satellite observations. Nat. Commun. https://doi.org/10.1038/ncomms7603 (2015).3.Duveiller, G., Hooker, J. & Cescatti, A. The mark of vegetation change on Earth’s surface energy balance. Nat. Commun. https://doi.org/10.5194/essd-2018-24 (2018).4.Jia, G. et al. in Special Report on Climate Change and Land (eds Shukla, P. R. et al.) Ch. 2 (IPCC, 2019).5.Lejeune, Q., Seneviratne, S. I. & Davin, E. L. Historical land-cover change impacts on climate: comparative assessment of LUCID and CMIP5 multimodel experiments. J. Clim. 30, 1439–1459 (2017).Article 

    Google Scholar 
    6.Winckler, J., Reick, C. H. & Pongratz, J. Robust identification of local biogeophysical effects of land-cover change in a global climate model. J. Clim. 30, 1159–1176 (2017).Article 

    Google Scholar 
    7.Duveiller, G. et al. Biophysics and vegetation cover change: a process-based evaluation framework for confronting land surface models with satellite observations. Earth Syst. Sci. Data 10, 1265–1279 (2018).Article 

    Google Scholar 
    8.Meier, R. et al. Evaluating and improving the Community Land Model’s sensitivity to land cover. Biogeosciences 15, 4731–4757 (2018).Article 

    Google Scholar 
    9.Meier, R., Davin, E. L., Swenson, S. C., Lawrence, D. M. & Schwaab, J. Biomass heat storage dampens diurnal temperature variations in forests. Environ. Res. Lett. 14, 084026 (2019).Article 

    Google Scholar 
    10.Spracklen, D., Arnold, S. & Taylor, C. Observations of increased tropical rainfall preceded by air passage over forests. Nature 489, 282–285 (2012).Article 

    Google Scholar 
    11.Lejeune, Q., Davin, E. L., Guillod, B. P. & Seneviratne, S. I. Influence of Amazonian deforestation on the future evolution of regional surface fluxes, circulation, surface temperature and precipitation. Clim. Dyn. 44, 2769–2786 (2015).Article 

    Google Scholar 
    12.Khanna, J., Medvigy, D., Fueglistaler, S. & Walko, R. Regional dry-season climate changes due to three decades of Amazonian deforestation. Nat. Clim. Change 7, 200–204 (2017).Article 

    Google Scholar 
    13.Yosef, G. et al. Large-scale semi-arid afforestation can enhance precipitation and carbon sequestration potential. Sci. Rep. https://doi.org/10.1038/s41598-018-19265-6 (2018).14.Belušić, D., Fuentes-Franco, R., Strandberg, G. & Jukimenko, A. Afforestation reduces cyclone intensity and precipitation extremes over Europe. Environ. Res. Lett. 14, 074009 (2019).Article 

    Google Scholar 
    15.Perugini, L. et al. Biophysical effects on temperature and precipitation due to land cover change. Environ. Res. Lett. 12, 053002 (2017).Article 

    Google Scholar 
    16.Sandel, B. & Svenning, J. Human impacts drive a global topographic signature in tree cover. Nat. Commun. https://doi.org/10.1038/ncomms3474 (2013).17.Fuchs, R., Herold, M., Verburg, P. H. & Clevers, J. G. P. W. A high-resolution and harmonized model approach for reconstructing and analysing historic land changes in Europe. Biogeosciences 10, 1543–1559 (2013).Article 

    Google Scholar 
    18.Hansen, M. C. et al. High-resolution global maps of 21st-century forest cover change. Science 342, 850–853 (2013).Article 

    Google Scholar 
    19.Fuchs, R., Herold, M., Verburg, P. H., Clevers, J. G. & Eberle, J. Gross changes in reconstructions of historic land cover/use for Europe between 1900 and 2010. Glob. Change Biol. 21, 299–313 (2014).Article 

    Google Scholar 
    20.McGrath, M. J. et al. Reconstructing European forest management from 1600 to 2010. Biogeosciences 12, 4291–4316 (2015).Article 

    Google Scholar 
    21.Griscom, B. W. et al. Natural climate solutions. Proc. Natl Acad. Sci. USA 114, 11645–11650 (2017).Article 

    Google Scholar 
    22.Navarro, L. M. & Pereira, H. M. Rewilding Abandoned Landscapes in Europe (Springer, 2015).23.Lewis, E. et al. GSDR: a global sub-daily rainfall dataset. J. Clim. 32, 4715–4729 (2019).Article 

    Google Scholar 
    24.Menne, M. J., Durre, I., Vose, R. S., Gleason, B. E. & Houston, T. G. An overview of the global historical climatology network-daily database. J. Atmos. Ocean. Technol. 29, 897–910 (2012).Article 

    Google Scholar 
    25.Menne, M. J. et al. Global Historical Climatology Network—Daily (GHCN-Daily) Version 3.20 (NOAA, 2012); https://doi.org/10.7289/V5D21VHZ26.Zhang, M. et al. Response of surface air temperature to small-scale land clearing across latitudes. Environ. Res. Lett. https://doi.org/10.1088/1748-9326/9/3/034002 (2014).27.Liu, H., Randerson, J. T., Lindfors, J. & Chapin, F. S. III Changes in the surface energy budget after fire in boreal ecosystems of interior Alaska: an annual perspective. J. Geophys. Res. https://doi.org/10.1029/2004JD005158 (2005).28.Juang, J.-Y., Katul, G., Siqueira, M., Stoy, P. & Novick, K. Separating the effects of albedo from eco-physiological changes on surface temperature along a successional chronosequence in the southeastern United States. Geophys. Res. Lett. https://doi.org/10.1029/2007GL031296 (2007).29.Vanden Broucke, S., Luyssaert, S., Davin, E. L., Janssens, I. & van Lipzig, N. New insights in the capability of climate models to simulate the impact of LUC based on temperature decomposition of paired site observations. J. Geophys. Res. Atmos. 120, 5417–5436 (2015).Article 

    Google Scholar 
    30.Beck, H. E. et al. MSWEP V2 global 3-hourly 0.1° precipitation: methodology and quantitative assessment. Bull. Am. Meteorol. Soc. 100, 473–500 (2019).Article 

    Google Scholar 
    31.Schwaab, J. et al. Increasing the broad-leaved tree fraction in European forests mitigates hot temperature extremes. Sci. Rep. 10, 14153 (2020).Article 

    Google Scholar 
    32.Cohn, A. S. et al. Forest loss in Brazil increases maximum temperatures within 50 km. Environ. Res. Lett. 14, 084047 (2019).Article 

    Google Scholar 
    33.Houze, R. A. Jr Orographic effects on precipitating clouds. Rev. Geophys. https://doi.org/10.1029/2011RG000365 (2012).34.C3S ERA5-Land Reanalysis (Copernicus Climate Change Service, 2019).35.Daly, C. et al. Physiographically sensitive mapping of climatological temperature and precipitation across the conterminous United States. Int. J. Climatol. 28, 2031–2064 (2008).Article 

    Google Scholar 
    36.Sprenger, M. & Wernli, H. The LAGRANTO Lagrangian analysis tool—version 2.0. Geosci. Model Dev. 8, 2569–2586 (2015).Article 

    Google Scholar 
    37.Kosztra, B., Büttner, G., Hazeu, G. & Arnold, S. Updated CLC Illustrated Nomenclature Guidelines (European Environment Agency, 2019).38.Duveiller, G., Fasbender, D. & Meroni, M. Revisiting the concept of a symmetric index of agreement for continuous datasets. Sci. Rep. 6, 19401 (2016).Article 

    Google Scholar 
    39.Griscom, B. W. et al. Global Reforestation Potential Map (Zenodo, 2017); https://doi.org/10.5281/zenodo.88344440.Sheffield, J. & Wood, E. F. Projected changes in drought occurrence under future global warming from multi-model, multi-scenario, IPCC AR4 simulations. Clim. Dyn. 31, 79–105 (2008).Article 

    Google Scholar 
    41.Kotlarski, S. et al. Regional climate modeling on European scales: a joint standard evaluation of the EURO-CORDEX RCM ensemble. Geosci. Model Dev. 7, 1297–1333 (2014).Article 

    Google Scholar 
    42.Prein, A. F. et al. A review on regional convection-permitting climate modeling: demonstrations, prospects, and challenges. Rev. Geophys. 53, 323–361 (2015).Article 

    Google Scholar 
    43.Liu, J. & Niyogi, D. Meta-analysis of urbanization impact on rainfall modification. Sci. Rep. https://doi.org/10.1038/s41598-019-42494-2 (2019).44.Van der Ent, R. J. & Savenije, H. H. G. Length and time scales of atmospheric moisture recycling. Atmos. Chem. Phys. 11, 1853–1863 (2011).Article 

    Google Scholar 
    45.Rüdisühli, S., Sprenger, M., Leutwyler, D., Schär, C. & Wernli, H. Attribution of precipitation to cyclones and fronts over Europe in a kilometer-scale regional climate simulation. Weather Clim. Dyn. 1, 675–699 (2020).Article 

    Google Scholar 
    46.Schultz, N. M., Lawrence, P. J. & Lee, X. Global satellite data highlights the diurnal asymmetry of the surface temperature response to deforestation. J. Geophys. Res. Biogeosci. 122, 903–917 (2017).Article 

    Google Scholar 
    47.Pollock, M. D. et al. Quantifying and mitigating wind-induced undercatch in rainfall measurements. Water Resour. Res. 54, 3863–3875 (2018).Article 

    Google Scholar 
    48.Trabucco, A., Zomer, R. J., Bossio, D. A., Straaten], O. V. & Verchot, L. V. Climate change mitigation through afforestation/reforestation: a global analysis of hydrologic impacts with four case studies. Agr. Ecosyst. Environ. 126, 81–97 (2008).Article 

    Google Scholar 
    49.Padrón, R. S., Gudmundsson, L., Greve, P. & Seneviratne, S. I. Large-scale controls of the surface water balance over land: insights from a systematic review and meta-analysis. Water Resour. Res. 53, 9659–9678 (2017).Article 

    Google Scholar 
    50.Beck, H. E. et al. MSWEP: 3-hourly 0.25° global gridded precipitation (1979–2015) by merging gauge, satellite, and reanalysis data. Hydrol. Earth Syst. Sci. 21, 589–615 (2017).Article 

    Google Scholar 
    51.Beck, H. E. et al. Global-scale evaluation of 22 precipitation datasets using gauge observations and hydrological modeling. Hydrol. Earth Syst. Sci. 21, 6201–6217 (2017).Article 

    Google Scholar 
    52.Beck, H. E. et al. Daily evaluation of 26 precipitation datasets using stage-IV gauge-radar data for the CONUS. Hydrol. Earth Syst. Sci. 23, 207–224 (2019).Article 

    Google Scholar 
    53.Lu, N. Scale effects of topographic ruggedness on precipitation over Qinghai-Tibet Plateau. Atmos. Sci. Lett. 20, e904 (2019).Article 

    Google Scholar 
    54.EU-DEM Statistical Validation (EEA, 2014).55.Siebert, S., Henrich, V., Frenken, K. & Burke, J. Global Map of Irrigation Areas Version 5 (Rheinische Friedrich-Wilhelms-University and FAO, 2013).56.DeAngelis, A. et al. Evidence of enhanced precipitation due to irrigation over the Great Plains of the United States. J. Geophys. Res. Atmos. https://doi.org/10.1029/2010JD013892 (2010).57.Thiery, W. et al. Present-day irrigation mitigates heat extremes. J. Geophys. Res. 122, 1403–1422 (2017).Article 

    Google Scholar 
    58.Wernli, B. H. & Davies, H. C. A Lagrangian-based analysis of extratropical cyclones. I: the method and some applications. Q. J. R. Meteorol. Soc. 123, 467–489 (1997).Article 

    Google Scholar 
    59.Smith, A., Lott, N. & Vose, R. The integrated surface database: recent developments and partnerships. Bull. Am. Meteorol. Soc. 92, 704–708 (2011).Article 

    Google Scholar 
    60.Blenkinsop, S., Lewis, E., Chan, S. C. & Fowler, H. J. Quality-control of an hourly rainfall dataset and climatology of extremes for the UK. Int. J. Climatol. 37, 722–740 (2017).Article 

    Google Scholar 
    61.Wood, S. N. Generalized Additive Models: An Introduction with R 2nd edn (CRC Press, 2017).62.Wood, S. N. Fast stable restricted maximum likelihood and marginal likelihood estimation of semiparametric generalized linear models. J. R. Stat. Soc. 73, 3–36 (2011).Article 

    Google Scholar 
    63.Wood, S. N., Li, Z., Shaddick, G. & Augustin, N. H. Generalized additive models for gigadata: modeling the UK black smoke network daily data. J. Am. Stat. Assoc. 112, 1199–1210 (2017).Article 

    Google Scholar 
    64.Li, Z. & Wood, S. N. Faster model matrix crossproducts for large generalized linear models with discretized covariates. Stat. Comput. 30, 19–25 (2020).Article 

    Google Scholar 
    65.Dormann, C. et al. Methods to account for spatial autocorrelation in the analysis of species distributional data: a review. Ecography 30, 609–628 (2007).Article 

    Google Scholar 
    66.CH2018. 2018 Climate Scenarios for Switzerland (National Centre for Climate Services, 2018).67.Prein, A. F. et al. Precipitation in the EURO-CORDEX 0.11° and 0.44° simulations: high resolution, high benefits? Clim. Dyn. 46, 383–412 (2016).Article 

    Google Scholar 
    68.Jacob, D. et al. EURO-CORDEX: new high-resolution climate change projections for European impact research. Reg. Environ. Change 14, 563–578 (2014).Article 

    Google Scholar 
    69.Digital Chart of the World (DMA and USGS, 1992). More

  • in

    GlobSnow v3.0 Northern Hemisphere snow water equivalent dataset

    Overview of the SWE Retrieval methodThe SWE processing system relies on Bayesian assimilation which combines ground-based data with satellite-borne observations2. The method applies two vertically polarized satellite-based brightness temperature observations at 19 and 37 GHz and a scene brightness temperature model (the HUT snow emission model4). First, snow microstructure described by an ‘effective snow grain size’ is estimated for grid cells with a coincident weather station SD observation. Effective snow grain size is used in the HUT model as a scalable model input parameter to optimize agreement with the satellite measurements. These values of grain size are used to interpolate a background map of the effective grain size, including an estimate of the effective grain size error. This spatially continuous map of grain size is then used as an input for a second HUT model inversion to provide an estimate of SWE. In the inversion process, the effective grain size in each grid cell is weighed with its respective error estimate and a constant value of snow density is applied. The spatially continuous SWE map obtained from the second run of the HUT snow model described above is fused with a background SD field (converted to SWE using 0.24 g cm−3) to obtain a final estimate of SWE using a Bayesian non-linear iterative assimilation approach (which weights the information sources with their estimated variances). The background SD field is generated from the same weather station SD observations used to estimate the effective snow grain size using kriging interpolation methods.The microwave scattering response to SWE saturates under deep snow conditions ( >150 mm) and model inversion of SD/SWE over areas of wet snow is not feasible because the microwave signal is absorbed rather than scattered. For these reasons, the method decreases the weight of satellite data for deep dry snowpacks and wet snow by assessing the modeled sensitivity of brightness temperature to SWE within the data assimilation procedure2,3.Before SWE retrieval, dry snow is identified from brightness temperature data7. For the autumn snow accumulation season (August to December), the dry snow detection is used to construct a cumulative snow presence mask to track the advance of snow extent (SWE estimates are restricted to the domain indicated by the cumulative snow presence mask). During spring the overall mapped snow extent is determined from the cumulative mask, which (as the melt season proceeds) is reduced using a satellite passive microwave derived estimate for the end of snow melt season for each grid cell8.The snow part of the applied scene brightness temperature model is based on the semi-empirical HUT snow emission model which describes the brightness temperature from a multi-layer snowpack covering frozen ground in the frequency range of 11 to 94 GHz4,5. Input parameters to the model include snowpack depth, density, effective grain size, snow volumetric moisture and temperature. Separate modules account for ground emission and the effect of vegetation and atmosphere. Comparisons of HUT model simulations to airborne and tower-based observations, reported elsewhere (e.g.9,10), demonstrate the ability of the model to simulate different snow conditions and land cover regimes. Intercomparisons with other emission models show comparable performance when driven by in situ data11,12 or physical model outputs13, although the HUT model has the tendency to underestimate brightness temperatures for deep snowpacks12.Basic underlying assumptionsPassive microwave sensitivity to SWE is based on the attenuating effect of snow cover on the naturally emitted brightness temperature from the ground surface. The ground brightness temperature is scattered and absorbed by the overlying snow medium, typically resulting in a decreasing brightness temperature with increasing (dry) snow mass. The scattering intensity increases as the wavelength approaches the size of the scattering particles. Considering that individual snow particles tend to range from 0.5 to 4 mm in the long axis direction, high microwave frequencies (short wavelengths) will be scattered more than low frequencies (long wavelengths). The intensity of absorption can be related to the dielectric properties of snow, with snow density largely defining the permittivity for dry snow. Absorption at microwave frequencies increases dramatically with the inclusion of free water (moisture) in snow, resulting in distinct differences of microwave signatures from dry and wet snowpacks.Initial investigations pointed out the sensitivity of microwave emission from snowpacks to the total snow water equivalent14. This led to the development of various retrieval approaches of SWE from the earliest passive microwave instruments in space (e.g.15,16). From the available set of observed frequencies, most SWE algorithms employ the ~37 GHz and ~19 GHz channels in combination. These two frequencies are available continuously since 1979. The scattering from snow at 19 GHz is smaller when compared to 37 GHz, while the emissivity of frozen soil and snow is estimated to be largely similar at both frequencies. The brightness temperature difference of the two channels can be related to snow depth (or SWE), with the additional benefit that the effect of variations in physical temperature on the measured brightness temperature are reduced (relative to the analysis of single frequencies). Similarly, observing a channel difference reduces or even cancels out systematic errors of the observation, provided that the errors in the two observations are similar (e.g. due to using common calibration targets on a space-borne sensor). Typically, the vertically polarized channel at 19 and 37 GHz is preferred due to the inherent decreased sensitivity to snow layering (e.g.17).A basic assumption in the data assimilation procedure that combines spaceborne passive microwave observations and synoptic weather station data to estimate snow depth is that the background snow depth field, interpolated from weather station data, provides meaningful information on the spatial patterns of snow depth. A limitation of the methodology is that this assumption does not hold for complex terrain (mountains). Further, the methodology is not suitable for snow cover on top of ice sheets, sea ice or glaciers.Input dataThe main input data are synoptic snow depth (SD) observations and spaceborne passive microwave brightness temperatures from the Scanning Multichannel Microwave Radiometer (SMMR), Special Sensor Microwave/Imager (SSM/I) and Special Sensor Microwave Imager/Sounder (SSMIS) data from Nimbus-7 and DMSP F-series satellites. The most important frequencies for SWE retrieval and snow detection are 19 GHz (reference measurement with very little scattering from the snow volume) and 37 GHz (sensitive to volume scattering by dry snow), which are available in all instruments. The satellite datasets are described in detail in Data Records section.Ground-based SD data were acquired from the Finnish Meteorological Institute (FMI) weather station observation database, augmented from several archive sources, including the European Centre for Medium-Range Weather Forecasts (ECMWF), The United States National Climatic Data Centre (NCDC), The All-Russia Research Institute of Hydrometeorological Information-World Data Centre (RIHMI-WDC) and The Meteorological Service of Canada (MSC) archives, as described in the Data Records section.In the assimilation of SD values with space-borne estimates, a density value of 0.24 g cm−3 is assumed in estimating SWE. In the assimilation procedure the spatial small-scale variability of SD is considered by assigning a variance of 150 cm2 to the weather station observations over forested areas, and a variance of 400 cm2 for open areas. These variance estimates describe how well a single-point SD observation describes the snow depth over a larger area surrounding the measurement site, and were determined from available FMI, Finnish Environment Institute (SYKE) and Environment and Climate Change Canada (ECCC) snow transect measurements, as well as experimental field campaign data from across Finland and Canada.Daily SD background fields were generated from observations at synoptic weather station locations acquired from multiple archives for the years 1979–2018. For each measurement, the exact location, date of measurement, and SD are required. The long-term weather station data is pre-processed before utilization in the SWE retrieval to remove outliers and improve the overall consistency of the data, as described in the Methods section.Land use and, most importantly, forest cover fraction are derived from ESA GlobCover 2009 300 m data18. Stem volume is required as an input parameter to the emission model to compensate for forest cover effects4,19; average stem volumes are estimated by the ESA BIOMASAR20 data records as described in the Methods section.The following auxiliary datasets are used to mask out water and complex terrain (mountain) pixels:

    ESA CCI Land Cover from 2000: water fraction is aggregated to the 25 km grid cell spacing of the SWE product, pixels with a water fraction >50% are masked as water.

    ETOPO521: if the standard deviation of the elevation within a 25 km grid cell is above 200 m it is masked as complex terrain.

    The Forward model applied in SWE retrievalCalculation of brightness temperature for a satellite sceneFor a satellite scene consisting of a mixture of non-forested terrain, forests, and snow-covered lake ice, the bottom-of-atmosphere brightness temperature TB,BOA is calculated so that:$${T}_{B,BOA}=left(1-FF-LFright){T}_{B,snow}+FFcdot {T}_{B,forest}+LFcdot {T}_{B,lake}$$
    (1)
    where FF is the forest fraction and LF the lake fraction of a given grid cell. ({T}_{B,snow}), ({T}_{B,forest}), and ({T}_{B,lake}) are the brightness temperatures emitted from non-forested terrain (ground/snow), forested terrain, and lake ice, respectively. Land cover fractions FF and LF are determined from ESA GlobCover data resampled to the 25 km EASE grid. A statistical approach is used to calculate top-of-atmosphere brightness temperatures from TB,BOA, statistics are based on studies covering the Northern Hemisphere4,22,23.Brightness temperature from snow-covered groundThe brightness temperature ({T}_{B,snow}) for snow-covered, non-forested terrain is calculated using the HUT snow emission model4. The model is a radiative transfer-based, semi-empirical model which calculates the emission from a single homogenous snowpack. The current approach utilizes multi-layer modification which allows the simulation of brightness temperature from a stacked system of snow or ice layers5.The absorption coefficient in the HUT model is determined from the complex dielectric constant of dry snow, applying the Polder-van Santen mixing model for the imaginary part24. The calculation of the dielectric constant for dry snow as well as effects of possible liquid water and salinity inclusions, are described through empirical formulae25. Emission from the snow layer is considered as both up- and down-welling emission. These are, in turn, reflected from interfaces between layers (air-snow, snow-ground). The transmission and multiple reflections between layer interfaces are calculated using the incoherent power transfer approach.Applying the delta-Eddington approximation to the radiative transfer equation, the HUT model assumes that most of the scattered radiation in a snowpack is concentrated in the forward direction (of propagation) due to multiple scattering within the snow media, based on26, which assumes that losses due to scattering are approximately equal to generation of incoherent intensity by scattering. However the omission of the backward scattering component as well as omission of trapped radiation will lead to underestimation of brightness temperature for deep snowpacks12. In the HUT model, the rough bare soil reflectivity model27 is applied to simulate the upwelling brightness temperature of the soil medium.Brightness temperature from forest vegetationThe brightness temperature over forested portions of the grid cell ({T}_{B,forest}) is derived from ({T}_{B,snow}) using a simple approximation so that:$${T}_{B,forest}={t}_{veg}cdot {T}_{B,snow}+left(1-{t}_{veg}right)cdot {T}_{veg}+left(1-{t}_{veg}right)cdot left(1-{e}_{snow}right)cdot {t}_{veg}cdot {T}_{veg}$$
    (2)
    where ({t}_{veg}) is the one-way transmissivity of the forest vegetation layer, ({T}_{veg}) the physical temperature of the vegetation (considered to be equal to air, snow and ground temperatures, ({T}_{veg}={T}_{air}={T}_{snow}={T}_{gnd}=-,{5}^{^circ }{rm{C}})) and ({e}_{snow}) the emissivity of the snow covered ground system. The choice of −5 °C is based on experimental data28 and follows the previous publications2,3,4. Moreover the impact of physical temperature is minimal on the simulated brightness temperature difference of two frequencies applied in the retrieval (typically More

  • in

    Towards an integrated decision-support system for sustainable organic waste management (optim-O)

    The development of the proposed decision-support system requires the undertaking of interdisciplinary research brought about by a diverse team. It is in this context that researchers from the chemical engineering department and the geomatics sciences department at Laval University, in Quebec, Canada, have developed a nutrient stakeholder platform (NutriPlatform-QC), i.e. a regrouping of actors from research institutions, industry, governmental authorities, municipalities, and agricultural organizations, among others, that are active in the field of organic waste management. Since 2017, regular meetings have been organized with the members of the platform in order to frame the objectives and methodology of such interdisciplinary research, as well as to adapt the scope of the research to the stakeholder needs.As such, the authors initiated the design and development of a decision-support software tool that allows setting up optimal organic waste value chains for the province of Quebec, with Primodal Inc. and Chamard Environmental Strategies as industrial partners. The system, named optim-O (www.optim-o.com), applies a holistic modelling approach that focuses on minimization of costs and greenhouse gas emissions throughout the entire value chain. The scope (Fig. 1) includes the generation and collection of organic waste across the province (including urban, suburban, peri-urban and rural areas), the treatment of the waste through biomethanation, composting, and/or nutrient recovery, and the distribution of the end-products such as biogas, digestate, compost, and recovered mineral fertilizers. All of these items are geolocated in order to account for transport distances and potential traffic nuisance. Regulatory and market restrictions for product distribution are also taken into account.Fig. 1: Scope and four use cases of the optim-O decision-support system.Scope and use cases.Full size imageThe software tool integrates three key components: (1) a multidimensional spatiotemporal database system (including georeferenced and non-georeferenced data), (2) a model-based decision module (for simulation and optimization) and (3) a user-friendly interface (to facilitate knowledge transfer and interpretation). Table 1 provides an overview of the data included in the system. Generally speaking, georeferenced data includes data that is location-specific, such as population, commerce, services and industry (position and size), road networks, hydrographic networks, existing infrastructure (wastewater treatment plants, biogas and composting facilities), agricultural parcels (location, size, crop, nutrient saturation index) and associated regulatory and market constraints (fertilizer application limitations). Non-georeferenced data includes costs and other factors used for economic assessments, greenhouse gas emission factors, technical process-related factors (used for the mathematical process models), and social factors (odour emissions, population density, the latter also being part of the georeferenced data). Default values are provided for the non-georeferenced data, but the user can modify these if case-specific data would be available. A prototype of the developed tool is currently being validated using two major biomethanation plants in Quebec. The tool also has the flexibility for extension with other resource recovery processes.Table 1 Georeferenced and non-georeferenced data included in the optim-O decision-support software tool.Full size tableFigure 1 presents the four use cases for which the tool can be used. It concerns decision problems related to (1) the collection of organic waste, (2) the treatment process operation, (3) the end-product distribution and (4) the integration of the three previous use cases as one global optimization problem. In each case, the tool can be used to either simulate and evaluate various scenarios defined by the user, or to solve the optimization problem taking into account optimization criteria defined by the user, as described in the examples below.In the first use case, i.e. the collection of organic waste, the tool allows for the estimation of organic waste generation based on data from households, services, businesses and industries, with associated organic matter generation rates for each, either based on the number of members in a household, employees or clients, as well as the type of service, business or industry. As presented in Table 1, all of this information is geolocated, allowing users to locate sources of organic waste across a territory, as presented in Fig. 2. From here, using the treatment plant location and road networks, various waste collection routes can be simulated. The user can also select specific modelling objectives, for example: maximising organic waste collection, evaluating the potential to collect a certain waste type, assessing long-distance travel (for example, through transfer stations), as well as associated optimization objectives, for example, reducing GHG emissions, reducing costs or reducing both at the same time.Fig. 2: Geolocated organic waste generation throughout the southern area of the province of Quebec, as estimated by the decision-support system.Geolocated organic waste generation Southern Quebec.Full size imageIn the second use case, i.e. treatment process operation, the system can evaluate processing outcomes through a mathematical model library developed for this tool. It includes models for anaerobic digestion, composting and processes to recover nutrients as mineral fertilizers from digestate, and allows easy extension with other process models in the future. The models are numerically simple, requiring basic data inputs (e.g., key physico-chemical waste characteristics), and are coded directly in the database. By selecting this approach, a balance was sought between model complexity and simulation times, with the aim to minimize computational efforts, while maximizing usability. Using the models, one can aim at evaluating the impact of varying substrates on the process performance, seeking to optimize certain parameters (e.g., minimizing GHG emissions, maximizing product quality or minimizing process duration/size). Moreover, different treatment process combinations can be evaluated and compared, for example the implementation of anaerobic digestion as sole technology vs the implementation of anaerobic digestion with nitrogen recovery from the liquid fraction of digestate and composting of the solid fraction of digestate.In the third use case, users can simulate and optimize locations for end-product distribution. In this case, an estimation of quantity and quality factors for the end-products (biogas, digestate, compost, recovered mineral fertilizers), either provided as model outputs or entered by the user, are considered as data inputs. From here, agricultural lands can be evaluated regarding their receptivity for the product. This receptivity is based on the quality of the product, size of the plot, the phosphorus saturation status of the soil, the nitrogen pollution status of the surrounding water bodies and the nitrogen requirements for crop production, which all determine how much product can be accepted on the land under study. Distribution networks can then be set up and optimized using spatial analysis, identifying the nearest receptive lands.Finally, a fourth use case concerns the integrated assessment of the above three use cases. Indeed, the outputs of one module can serve as the inputs to another module. As such, the outputs of the waste collection module can be used as inputs to the treatment process module, providing a certain quantity and quality of substrate(s). The process models can then be run to determine the optimal treatment process chain, as well as quantity and quality factors for the end-products. The latter can then be used to search for an optimal agricultural site for end-product distribution. This process can be undertaken iteratively by the system seeking to meet desired criteria and/or propose a few scenarios of interest to decision makers. The fourth use case can also be applied to select the optimal position of a new treatment plant, taking into account the organic waste availability and the access to agricultural land for end-product distribution. Moreover, such integrated approach can allow users to understand the impact of changing waste collection strategies on existing treatment process chains, or to evaluate how a change in process conditions can affect end-product distribution. More

  • in

    Forecasting point-of-consumption chlorine residual in refugee settlements using ensembles of artificial neural networks

    Study sites and data collectionThe data used for this study were obtained from a previous multi-site study on post-distribution FRC decay collected from refugee settlements in South Sudan, Jordan, and Rwanda19. This dataset was selected as process-based models have been used to produce FRC targets for these sites, which provide a useful comparison to the risk-based targets generated in this study. Details of the data collected at these sites, as well as important site characteristics are included in Table 3. Two datasets were collected from Jordan: one from the summer of 2014 and one 9 months later from the late winter of 2015. The original study treated these as two separate datasets due to differences in environmental conditions between the two datasets (10 °C difference in average temperature) and amount of time between the two datasets19. To ensure a consistent comparison with the original study, we have also treated the 2014 and 2015 data from Jordan as two distinct datasets.Table 3 Summary of Key Site Characteristics19,59,60,61.Full size tableThe dataset for each site includes FRC as well as other water quality parameters, which are routinely collected in humanitarian water systems operation including total residual chlorine, EC, water temperature, turbidity, and pH. Data were collected using paired sampling whereby the same unit of water was sampled at the following points along the post-distribution water supply chain:

    From the tap at the point-of distribution

    In the container immediately after collection

    In the container immediately after transport to the dwelling

    After a follow-up period of storage in the household

    This study only used the measurements at the point-of-distribution and point-of-consumption to reflect data collection practices that are more feasible for humanitarian operations. In preparing the dataset, observations were removed if the point-of-distribution water quality did not meet humanitarian drinking water quality guidelines. Supplementary Table 2 in the Supplementary Information includes the full list of data cleaning steps that were used to prepare the data for use in the ANN models.EthicsThe initial field work in South Sudan received exemption from full ethics review by the Medical Director of Médecins sans Frontières (MSF) (Operational Centre Amsterdam) as data collected was routine for the on-going water supply intervention at the study site. For subsequent field studies in Jordan and Rwanda, ethics approval was obtained from the Committee for Protection of Human Subjects (CPHS) of the Institutional Review Board at the University of California, Berkeley (CPHS Protocol Number: 2014-05-6326). Informed consent was provided throughout all data collection.Input variable selectionTwo input variable combinations were considered for predicting the output variable, the point-of-consumption FRC concentration. The variables considered are all variables that are routinely monitored in humanitarian water system operations. The first input variable combination (IV1) included FRC at the water point-of-distribution and the elapsed time between the measurement at the point-of-distribution and the point-of-consumption. This input variable combination represents the minimum number of variables that would be regularly collected under current humanitarian drinking water quality guidelines31. Additionally, these are the only two variables included in the process-based model developed in a past study for these sites19, so this input variable combination allows for a direct comparison of the ANN ensemble models with the process-based models. The second input variable combination (IV2) included the variables from IV1 as well as additional water quality variables measured from the point-of-distribution (directly after water had left the water distribution point): EC, water temperature, pH, and turbidity. These additional variables are recommended for collection in some humanitarian drinking water quality guidelines29,30,31, and as such, may also be available in humanitarian response settings. This larger input variable set allowed us to investigate the usefulness of additional water quality variables for forecasting point-of-consumption FRC concentrations.Base-learner structure and architectureThe ensemble base learners (the individual ANNs in the ensemble models) were built as multi-layer perceptrons (MLPs) with a single hidden layer using the Keras 2.3.0 package48 in Python v3.749. This structure was selected because it has been shown to outperform other data-driven models and ANN architectures for predicting FRC in piped distribution systems20,21. The weights and biases of the base learners were optimized to minimize mean squared error (MSE) using the Nadam algorithm with a learning rate of 0.1. An early stopping procedure with a patience of 10 epochs was used to prevent overfitting.The hidden layer size of the base learners was determined through an exploratory analysis by consecutively doubling the hidden layer size until performance decreased or ceased to improve substantially from one iteration to the next. Based on this analysis, we selected a hidden layer size of four hidden neurons at all sites for the models using the IV1 variable combination for all sites. For the models using the IV2 input variable combination, we selected a hidden layer size of 16 hidden nodes for South Sudan and Jordan (2015), and a hidden layer size of eight hidden nodes for Jordan (2014) and Rwanda. The full results of the exploratory analysis into hidden layer size are included in Supplementary Figs 13–20 in the Supplementary Information.Data divisionThe full dataset for each site and variable combination was divided into calibration and testing subsets, with the calibration subset further subdivided into training and validation data. The testing subset was obtained by randomly sampling 25% of the overall dataset. The same testing subset was used for all base learners so that each base-learner’s testing predictions could be combined into an ensemble forecast. The training and validation data were obtained by randomly resampling from the calibration subset, with a different combination of training and validation data for each base learner to promote ensemble diversity. The ratio of data from the calibration set used for training and validation, respectively, was selected to avoid both overfitting and underfitting through an exploratory analysis using a grid search process. In all but two cases, we selected a validation set that was twice the size of the training set, for an overall training-validation-testing split of 25–50–25%. The two exceptions to this were for the Jordan (2014) model when using the IV1 input variable combination where we found that a training-validation-testing split of 50–25–25 produced better performance, and for the Jordan (2015) model when using the IV1 input variable combination where a training-validation-testing split of 30–45–25 performed substantially better. The full results of the exploratory analysis for data division are included in Supplementary Figs 21–28 in the Supplementary Information. Descriptive statistics for the calibration and testing datasets are included in Supplementary Tables 3 and 4 of the Supplementary Information, and histograms of the input and output variables are provided in Supplementary Figs 5–12 in the Supplementary Information to provide context of the range and patterns in the data used to train the ANN base learners.Ensemble model formationThe ensemble models in this study were used to generate probabilistic forecasts of post-distribution FRC by combining the predictions of each base learner into a probability density function (pdf). Thus, for each observation of FRC at the point-of-consumption, the ensemble model outputs a pdf representing the predicted probability of point-of-consumption FRC concentrations. This pdf can then be used to identify ensemble confidence intervals (CIs) for the expected point-of-consumption FRC concentration. To ensure a good representation of the full output space in the final pdfs, two approaches were taken to ensure ensemble diversity. First, as discussed above, the data used to train the base-learner ANNs was randomly sampled from the calibration set, so each ANN was trained on a different subset of the data. Second, the initial weights and biases were randomized for each base learner in a random-start process. Both of these are implicit approaches to ensuring ensemble diversity as they do not directly create diversity and instead the diversity arises through the randomization of the training data and the weights and biases50. The benefit of implicit approaches is that the differences between the base learners are derived from randomness in the data50.The ensemble size (number of base learners included in the ensemble) was also determined through an exploratory analysis using a grid search procedure This exploratory analysis showed that in general, performance increased with larger ensemble sizes, but improvements in performance plateaued at ensemble sizes ranging from 50 members to 250 members. Based on this, a standard ensemble size of 250 members was selected for all sites and variable combinations. The full results of the exploratory analysis for ensemble size are included in Supplementary Figs 29–36 in the Supplementary Information.Ensemble post-processingWe used ensemble post-processing to attempt to improve the forecasts generated by the raw ensembles. We used the kernel dressing method to post-process ensemble predictions51. This method follows a two-step process: first a kernel function is fit centred on the base-learner prediction for each observation, then each member’s kernel is summed together to produce the post-processed pdf, which is a non-parametric mixture distribution function. We used a Gaussian kernel function in keeping with past studies27,28,38,51, though the selection of the specific kernel function is not critical28. The kernel bandwidth was defined using the best member error method where the bandwidth for all kernels is the variance of the absolute error of the prediction that is closest to each observation in the calibration dataset51.Ensemble verification and performance evaluationWe used ensemble verification metrics to evaluate the performance of the raw and post-processed ensembles for each site and variable combination. Ensemble verification metrics differ from traditional measures of performance (e.g. Nash Sutcliffe Efficiency, MSE, etc.) as they assess the performance of the probabilistic forecasts of an ensemble whereas traditional measures typically evaluate the average performance of an ensemble model or the predictions of a deterministic model52. Throughout the following section, (O) refers to the full set of observed FRC concentrations at the point-of-consumption and (o_i) refers to the (i^{{mathrm{th}}}) observation, where there are (I) total observations. (F) refers to the full set of probabilistic forecasts for point-of-consumption FRC, where (F_i) is the probabilistic forecast corresponding to observation (o_i) and (f_i^m) is the prediction by the (m^{{mathrm{th}}}) base learner in the ensemble on the (i^{{mathrm{th}}}) observation. For the following metrics, it is assumed that the predictions of each base learner in the ensemble are sorted from low to high for each observation such that (f_i^m le f_i^{m + 1}) from (m = 0) to (m = M).Percent capturePercent capture measures the percentage of observations which are captured within the ensemble forecast and provides a useful indication of how well the model can reproduce the full range of observed values, and, as such, can indicate if a model is underdispersed. For a raw ensemble forecast, the (i^{{mathrm{th}}}) observation is captured if (f_i^0 le o_i le _i^M). For a post-processed forecast, the (i^{{mathrm{th}}}) observation is captured if the probability of (o_i) in the mixture distribution is greater than 0. While not commonly used for ensemble verification, a similar metric has been used for evaluating other probabilistic or possibilistic models, especially neurofuzzy networks, referred to either as the percent capture or the percent of coverage53,54,55,56. The percent capture was calculated both for the overall set of observations, as well as for observations with point-of-consumption FRC below 0.2 mg/L. The latter is a useful indicator of how well the model can predict if water will have sufficient FRC at the point-of-consumption, which is an important indicator of the degree of confidence we have in the risk-based targets generated using these ensemble models.CI reliability diagramReliability diagrams are visual indicators of ensemble reliability, where reliability refers to the similarity between the observed and forecasted probability distributions with the ideal model having all observations plotted along the 1:1 line showing that the observed probabilities are equal to the forecasted probabilities. These diagrams plot the observed relative frequency of events against the forecast probability of that event, though the reliability diagram has been adapted in past studies as the CI reliability diagram which compares the frequency of observed values within the corresponding CI of the ensemble. For raw ensembles, the CIs are derived from the sorted forecasts of the base learners (for example, the ensemble 90% CI would include all of the forecasts between (f^{0.05M}) and (f^{0.95M})) and for post-processed ensembles, the CIs are calculated directly from the probability distribution. In this study, we extended the CI reliability diagram further by plotting the percent capture of each CI within the ensemble against the CI level. For each ensemble model we plotted the CI reliability for the 10–100% CI levels at 10% intervals as well as at the 95 and 99% CI. We used this to develop a numerical score for the CI reliability diagram, which is calculated as the squared distance between the percentage of observations captured within each CI and the ideal percent capture in that CI. This was calculated for each CI threshold, k, from 10 to 100% in 10% increments as shown in Eq. 1.$$CI;{mathrm{Reliability}};{mathrm{Score}} = mathop {sum }limits_{k = 0.1}^1 left( {k – {mathrm{Percent}};{mathrm{Capture}};{mathrm{in}};CI_k} right)^2$$
    (1)
    The CI reliability score measures the horizontal distance between the percent capture and the 1:1 line for each CI. The ideal value for this score would be 0, indicating all points fall on the 1:1 line. The worst possible score will depend on the number of CI’s included in the calculation of the score; for this study the worst score is 3.9, which would only occur if no observations were captured in any CI of the ensembles. The CI reliability score was calculated for both the overall dataset and for forecast-observation pairs where the observed household FRC concentration was below 0.2 mg/L.Continuous Ranked Probability ScoreThe Continuous Ranked Probability Score (CRPS) is a common metric for evaluating probabilistic forecasts that evaluates the difference between the predicted and observed probabilities of continuous variables and is equivalent to the mean absolute error of a deterministic forecast57,58. The CRPS measures not only model reliability but also sharpness, which is an indicator of how closely the ensemble predictions are clustered around the observed values. Thus, the CRPS can be a useful measure of overdispersion and can provide an indication if improvements in reliability are being obtained at the expense of excess overdispersion. The CRPS is measured as the area between the forecast cumulative distribution function (cdf) and the observed cdf for each forecast-observation pairing58. Since each observation is a discrete value, the observation cdf is represented with the Heaviside function (H{ x ge x_a}), which is a stepwise function with a value of 0 for all point-of-consumption FRC concentrations below the observed concentration and 1 for all point-of-consumption FRC concentrations above the observed concentration. The equation for calculating the CRPS of a single forecast-observation pair is given in Eq. 2. Note that Eq. 2 shows the calculation of CRPS for a single forecast-observation pair. To evaluate the ensemble models, the average CRPS, (overline {{mathrm{CRPS}}}), is calculated by taking the mean CRPS overall forecast-observation pairs.$${mathrm{CRPS}} = {int nolimits_{-infty }^infty} left( {F_ileft( x right) – Hleft{ {x ge o_i} right}} right)^2dx$$
    (2)
    For the post-processed probability distributions, we calculated CRPS directly from Eq. 2 using numerical integration. For the raw ensemble, we treated the forecast cdf as a stepwise continuous function with (N = M + 1) bins where each bin is bounded at two ensemble forecasts and the value in each bin is the cumulative probability58. (overline {{mathrm{CRPS}}}) is calculated using (overline {g_n}), the average width of bin (n) (average difference in FRC concentration between forecast values (m) and (m + 1)) and (overline {o_n}) the likelihood of the observed value being in bin (n)58. Using these values, the (overline {{mathrm{CRPS}}}) for an ensemble can be calculated as58:$$overline {{mathrm{CRPS}}} = mathop {sum }limits_{n = 1}^N overline {g_n} [(1 – overline {o_n} )p_n^2 + overline {o_n} left( {1 – p_n} right)^2]$$
    (3)
    Where (p_n) is the probability associated with each bin, (p_n = frac{n}{N})58.Generation of risk-based targetsTo generate the risk-based FRC targets, the trained ensembles of ANNs were used to forecast the point-of-consumption FRC for a series of point-of-distribution FRC concentrations from 0.2 to 2 mg/L in 0.05 mg/L increments. For each point-of-distribution FRC concentration, the predicted risk of insufficient FRC was calculated from the forecast pdf as the cumulative probability of FRC at the point-of-consumption being below 0.2 mg/L. Using this predicted risk, the target FRC concentration for the point-of-distribution was then selected as the lowest FRC concentration at the water point-of-distribution that provides the desired level of protection. For this study we selected the FRC concentration that resulted in negligible risk of FRC being below the 0.2 mg/L threshold (i.e. the lowest FRC concentration where the predicted risk is 0), though operationally any level of protection could be used and the risk of insufficient FRC at the point-of-consumption should be balanced against risks associated with high FRC concentrations, such as DBP formation and taste and odour concerns.For comparison with the previously published results, we used a storage duration of 10 h when generating the FRC targets for South Sudan, and 24 h for all other sites19. Since the IV2 model also requires values for EC, water temperature, pH, and turbidity, two scenarios were considered. First, an “average” scenario was used where the median observed value for all other water quality parameters were selected. The second scenario considered was a “worst-case” scenario, where we simulated a scenario where water quality conditions were unfavourable for maintaining chlorine residual. A partial correlation analysis, which assesses the correlation between an input variable and the output variable while controlling for the impacts of other input variables, was used to determine the least favourable conditions for each input variable. The partial correlation analysis is performed by first developing multiple linear regression predictions of both the output variable (point-of-consumption FRC) and the input variable of interest using the remaining input variables as the predictors to the linear regression models and then taking the Pearson correlation coefficient of the residuals between the two regression models. Partial correlation was used to assess the directionality of the effect of the additional water quality variables included in IV2 to assess whether high or low values of these inputs would create a worst-case scenario. Once the directionality of the impact of the different variables had been established, the 95th or 5th percentile observed value of that variable was used at each site to simulate the worst-case scenario. More

  • in

    Guiding urban water management towards 1.5 °C

    1.Rogelj, J. et al. In Global Warming of 1.5 °C. An IPCC Special Report on the Impacts of Global Warming of 1.5 °C Above Pre-industrial Levels and Related Global Greenhouse Gas Emission Pathways, in the Context of Strengthening the Global Response to the Threat of Climate Change (eds Masson-Delmotte, V. et al.) In press (2018).2.Mo, W., Wang, R. & Zimmerman, J. B. Energy–water nexus analysis of enhanced water supply scenarios: a regional comparison of Tampa Bay, Florida, and San Diego, California. Environ. Sci. Technol. 48, 5883–5891 (2014).CAS 
    Article 

    Google Scholar 
    3.Sambito, M. & Freni, G. LCA methodology for the quantification of the carbon footprint of the integrated urban water system. Water 9, 395 (2017).Article 
    CAS 

    Google Scholar 
    4.Meron, N., Blass, V. & Thoma, G. A national-level LCA of a water supply system in a Mediterranean semi-arid climate—Israel as a case study. Int. J. Life Cycle Assess. 25, 1133–1144 (2020).5.Hsien, C., Low, J. S. C., Fuchen, S. C. & Han, T. W. Life cycle assessment of water supply in Singapore—a water-scarce urban city with multiple water sources. Resour. Conserv. Recycl. 151, 104476 (2019).Article 

    Google Scholar 
    6.Slagstad, H. & Brattebø, H. Life cycle assessment of the water and wastewater system in Trondheim, Norway—a case study: Case Study. Urban water J. 11, 323–334 (2014).CAS 
    Article 

    Google Scholar 
    7.Parkinson, S. C. et al. Climate and human development impacts on municipal water demand: a spatially-explicit global modeling framework. Environ. Model. Softw. 85, 266–278 (2016).Article 

    Google Scholar 
    8.Rothausen, S. G. S. A. & Conway, D. Greenhouse-gas emissions from energy use in the water sector. Nat. Clim. Chang. 1, 210 (2011).CAS 
    Article 

    Google Scholar 
    9.Parkinson, S. et al. Balancing clean water-climate change mitigation trade-offs. Environ. Res. Lett. 14, 014009 (2019).CAS 
    Article 

    Google Scholar 
    10.McDonald, R. I. et al. Water on an urban planet: Urbanization and the reach of urban water infrastructure. Glob. Environ. Chang. 27, 96–105 (2014).Article 

    Google Scholar 
    11.Pal, A., He, Y., Jekel, M., Reinhard, M. & Gin, K. Y.-H. Emerging contaminants of public health significance as water quality indicator compounds in the urban water cycle. Environ. Int. 71, 46–62 (2014).CAS 
    Article 

    Google Scholar 
    12.Escriva-Bou, A., Lund, J. R. & Pulido-Velazquez, M. Saving energy from urban water demand management. Water Resour. Res. 54, 4265–4276 (2018).Article 

    Google Scholar 
    13.Dworak, T. et al. EU Water Saving Potential (Institute for International and European Environmental Policy, 2007).14.Flörke, M. et al. Domestic and industrial water uses of the past 60 years as a mirror of socio-economic development: A global simulation study. Glob. Environ. Chang. 23, 144–156 (2013).Article 

    Google Scholar 
    15.House-Peters, L. A. & Chang, H. Urban water demand modeling: review of concepts, methods, and organizing principles. Water Resour. Res. 47, W05401 (2011).16.Gracia-De-Rentería, P., Barberán, R. & Mur, J. Urban water demand for industrial uses in Spain. Urban Water J. 16, 114–124 (2019).Article 

    Google Scholar 
    17.Vassolo, S. & Döll, P. Global-scale gridded estimates of thermoelectric power and manufacturing water use. Water Resour. Res. 41, W04010 (2005).18.Dieu-Hang, T., Grafton, R. Q., Martínez-Espiñeira, R. & Garcia-Valiñas, M. Household adoption of energy and water-efficient appliances: An analysis of attitudes, labelling and complementary green behaviours in selected OECD countries. J. Environ. Manag. 197, 140–150 (2017).Article 

    Google Scholar 
    19.Attari, S. Z. Perceptions of water use. Proc. Natl Acad. Sci. 111, 5129–5134 (2014).CAS 
    Article 

    Google Scholar 
    20.Gonzales, P. & Ajami, N. Social and structural patterns of drought-related water conservation and rebound. Water Resour. Res. 53, 10619–10634 (2017).Article 

    Google Scholar 
    21.Grafton, R. Q. et al. The paradox of irrigation efficiency. Science 361, 748–750 (2018).CAS 
    Article 

    Google Scholar 
    22.Britton, T. C., Stewart, R. A. & O’Halloran, K. R. Smart metering: enabler for rapid and effective post meter leakage identification and water loss management. J. Clean. Prod. 54, 166–176 (2013).Article 

    Google Scholar 
    23.Cominola, A. et al. Long-term water conservation is fostered by smart meter-based feedback and digital user engagement. npj Clean Water 4, 1–10 (2021).Article 

    Google Scholar 
    24.Gurung, T. R., Stewart, R. A., Beal, C. D. & Sharma, A. K. Smart meter enabled informatics for economically efficient diversified water supply infrastructure planning. J. Clean. Prod. 135, 1023–1033 (2016).Article 

    Google Scholar 
    25.Kajenthira, A., Siddiqi, A. & Anadon, L. D. A new case for promoting wastewater reuse in Saudi Arabia: Bringing energy into the water equation. J. Environ. Manag. 102, 184–192 (2012).CAS 
    Article 

    Google Scholar 
    26.Stillwell, A. S. et al. An integrated energy, carbon, water, and economic analysis of reclaimed water use in urban settings: a case study of Austin, Texas. J. Water Reuse Desalin. 1, 208–223 (2011).Article 

    Google Scholar 
    27.Stillwell, A. S. & Webber, M. E. Geographic, technologic, and economic analysis of using reclaimed water for thermoelectric power plant cooling. Environ. Sci. Technol. 48, 4588–4595 (2014).CAS 
    Article 

    Google Scholar 
    28.Kavvada, O., Nelson, K. L. & Horvath, A. Spatial optimization for decentralized non-potable water reuse. Environ. Res. Lett. 13, 64001 (2018).Article 

    Google Scholar 
    29.Santhosh, A., Farid, A. M. & Youcef-Toumi, K. Real-time economic dispatch for the supply side of the energy-water nexus. Appl. Energy 122, 42–52 (2014).Article 

    Google Scholar 
    30.Gomez Sanabria, A., Höglund Isaksson, L., Rafaj, P. & Schöpp, W. Carbon in global waste and wastewater flows–its potential as energy source under alternative future waste management regimes. Adv. Geosci. 45, 105–113 (2018).Article 

    Google Scholar 
    31.Song, X. et al. Resource recovery from wastewater by anaerobic membrane bioreactors: Opportunities and challenges. Bioresour. Technol. 270, 669–677 (2018).CAS 
    Article 

    Google Scholar 
    32.Qadir, M. et al. Global and regional potential of wastewater as a water, nutrient and energy source. Nat Resour. Forum 44, 40–51 (2020).Article 

    Google Scholar 
    33.McCarty, P. L., Bae, J. & Kim, J. Domestic wastewater treatment as a net energy producer: Can this be achieved? Environ. Sci. Technol. 45, 7100–7106 (2011).CAS 
    Article 

    Google Scholar 
    34.Tubiello, F. N. et al. The FAOSTAT database of greenhouse gas emissions from agriculture. Environ. Res. Lett. 8, 15009 (2013).Article 

    Google Scholar 
    35.Bertrand, A., Aggoune, R. & Maréchal, F. In-building waste water heat recovery: An urban-scale method for the characterisation of water streams and the assessment of energy savings and costs. Appl. Energy 192, 110–125 (2017).Article 

    Google Scholar 
    36.Guo, X. & Hendel, M. Urban water networks as an alternative source for district heating and emergency heat-wave cooling. Energy 145, 79–87 (2018).Article 

    Google Scholar 
    37.Vesilind, P. Wastewater Treatment Plant Design Vol. 2 (IWA Publishing, 2003).38.Guo, T., Englehardt, J. & Wu, T. Review of cost versus scale: water and wastewater treatment and reuse processes. Water Sci. Technol. 69, 223–234 (2013).Article 

    Google Scholar 
    39.Liu, L. et al. The importance of system configuration for distributed direct potable water reuse. Nat. Sustain. 3, 548–555 (2020).40.Wu, D., Wang, H. & Seidu, R. Smart data driven quality prediction for urban water source management. Futur. Gener. Comput. Syst. 107, 418–432 (2020).Article 

    Google Scholar 
    41.Lafortezza, R., Chen, J., Van Den Bosch, C. K. & Randrup, T. B. Nature-based solutions for resilient landscapes and cities. Environ. Res. 165, 431–441 (2018).CAS 
    Article 

    Google Scholar 
    42.Engström, R., Howells, M., Mörtberg, U. & Destouni, G. Multi-functionality of nature-based and other urban sustainability solutions: New York City study. L. Degrad. Dev. 29, 3653–3662 (2018).Article 

    Google Scholar 
    43.Kernan, R., Liu, X., McLoone, S. & Fox, B. Demand side management of an urban water supply using wholesale electricity price. Appl. Energy 189, 395–402 (2017).Article 

    Google Scholar 
    44.Menke, R., Abraham, E., Parpas, P. & Stoianov, I. Demonstrating demand response from water distribution system through pump scheduling. Appl. Energy 170, 377–387 (2016).Article 

    Google Scholar 
    45.Davison-Kernan, R., Liu, X., McLoone, S. & Fox, B. Quantification of wind curtailment on a medium-sized power system and mitigation using municipal water pumping load. Renew. Sustain. Energy Rev. 112, 499–507 (2019).Article 

    Google Scholar 
    46.Wang, D. et al. Hierarchical market integration of responsive loads as spinning reserve. Appl. Energy 104, 229–238 (2013).47.ENBALA. Pennsylvania American Water Connects to the Smart Grid (ENBALA, 2018).48.Muhanji, S. O., Barrows, C., Macknick, J. & Farid, A. M. An enterprise control assessment case study of the energy–water nexus for the ISO New England system. Renew. Sustain. Energy Rev. 141, 110766 (2021).Article 

    Google Scholar 
    49.Oikonomou, K. & Parvania, M. Optimal coordinated operation of interdependent power and water distribution systems. IEEE Trans. Smart Grid 11, 4784–4794 (2020).Article 

    Google Scholar 
    50.Tilmant, A. & Kinzelbach, W. The cost of noncooperation in international river basins. Water Resour. Res. 48, https://doi.org/10.1029/2011WR011034 (2012).51.Vinca, A. et al. Transboundary cooperation a potential route to sustainable development in the Indus Basin. Nat. Sustain. 4, 331–339 (2020).52.Spang, E. S. & Loge, F. J. A high-resolution approach to mapping energy flows through water infrastructure systems. J. Ind. Ecol. 19, 656–665 (2015).Article 

    Google Scholar 
    53.Bartos, M. D. & Chester, M. V. The conservation nexus: valuing interdependent water and energy savings in Arizona. Environ. Sci. Technol. 48, 2139–2149 (2014).CAS 
    Article 

    Google Scholar 
    54.Wada, Y. et al. Co-designing Indus Water-Energy-Land. Futures One Earth 1, 185–194 (2019).Article 

    Google Scholar 
    55.Inland Empire Utility Agency. Chino Basin Watermaster Optimum Basin Management Program Update (Inland Empire Utility Agency, 2020).56.Helm, D. Catchment Management, Abstraction and Flooding: The Case for a Catchment System Operator and Coordinated Competition (New College, 2015).57.IWA. Action Agenda for Basin-Connected Cities: Influencing and Activating Urban Stakeholders to be Water Stewards in their Basins (IWA, 2018). More

  • in

    The widespread and unjust drinking water and clean water crisis in the United States

    Data sourcesData for this analysis were extracted from the American Community Survey (ACS) 5-year estimates for 2014–2018 via Integrated Public Use Microdata Series – National Historic Geographic information System (IPUMS-NHGIS)26, and from the Environmental Protection Agency’s (EPA) Enforcement and Compliance History Online (ECHO) Exporter27. Data were extracted at the county level for all 50 states, Washington DC, and Puerto Rico. The ACS is an ongoing survey of the United States which documents a wide variety of social statistics ranging from simple population counts to housing characteristics. Due to the staggered sampling structure of the ACS, it takes 5 years for every county to be sampled. Because of this, researchers must use 5-year intervals to ensure complete data coverage. The data from these 5 years are projected into estimates for all counties in the United States for the 5-year period in question. As of this study, 2014–2018 was the most recently available data.ECHO collates data from EPA-regulated facilities across the United States of America to report compliance, violation, and penalty information for all facilities for the most recent 5-year interval. ECHO data is updated weekly and the data for this paper was extracted on 18 August 2020. This means that the data in our analysis represents the status of each community water system or Clean Water Act permittee, as reported by the EPA, as of 18 August 2020. Only those community water systems or Clean Water Act permittees listed as Active by ECHO were included in this analysis. As ECHO data is at the level of the water system, permittee, or utility, we aggregated data up to the county level.Safe Drinking Water Act data was geolocated using QGIS 3.10 based upon latitude and longitude. This was done because other geographic identifiers for the Safe Drinking Water Act data were often missing. In line with prior work4,5,7,8, and in order to facilitate a cleaner dataset, we only focus on those water systems labeled community water systems for our analysis. Community water systems were geolocated based upon the county in which their latitude and longitude were located, if a community water system had latitude and longitude over water, a nearest neighbor join was used. In total, 1334 out of 49,479 community water systems were dropped because of there being no reported latitude or longitude. Of these, a total of 4.0%, or 54 community waters systems, were reported as in serious violation.Active Clean Water Act permittees were first identified by listed county. This was done because 345,176 out of 350,476 permittees had a county reported. Those without a county reported were located using latitude and longitude in the same manner as community water systems. There were 10 permittees without latitude and longitude or county listed which were excluded from our analysis. Of these, seven were in significant noncompliance and three were not. Due to some Clean Water Act permittees having latitude and longitude placements far away from the United States, those over 100 km from their nearest county were excluded from analysis. Finally, for community water systems and Clean Water Act permittees, some counties (76 for community water systems and 13 for Clean Water Act permittees) had no reported cases. Those counties were treated as zeroes for cartography and as missing for modeling purposes.Similar to prior work in this area4,5,8, we restrict our analysis to the scale of the county for reasons related to data limitations and resulting conceptual validity. Although counties are arguably larger in geographic area than ideal for an environmental injustice analysis, if we were to use a smaller unit for which data is available such as the census tract, the conceptual validity of the analysis would be limited due to the apolitical nature of these units. As outlined above, ECHO data is messy and missing many geographic identifiers. What is provided is generally either the county or latitude and longitude. If only the county is provided, then we are constrained to using the county regardless of conceptual validity. However, even when latitude and longitude are provided—which is the case for many observations—the provided point location says nothing about which households the water system or permittee serves or impacts. Due to this, whatever geographic unit we use carries the assumption that those in the unit could be plausibly impacted by the water system or permittee. Given that counties are often responsible for both regulating drinking water, as well as maintaining and providing water infrastructure29, we were comfortable with this assumption between point location and presumed spatial impact when using the scale of the county. However, we believe this assumption would have been invalid and untestable for smaller apolitical units for which demographic data is available such as census tracts.Beyond the issues presented by ECHO data, the county is also the appropriate scale of analysis for this study due to the estimate-based nature of the ACS. ACS estimates are based on a rolling 5-year sample structure and often have very large margins of error. At the census tract level, these standard errors can be massive, especially in rural areas30,31,32. Due to this variation, and the need to include all rural areas in this analysis, the county, where the margins of error are considerably smaller, is the appropriate unit for this study. All of this said, the county is, in fact, a larger unit than often desired or used in environmental justice studies. Studies focused on exclusively urban areas with clearer pathways of impact can and should use smaller units such as census tracts. It will be imperative for future scholarship focused on water hardship across the rural-urban continuum to gain access to reliable data on sub-county political units, as well as data linking water systems to users, to continue documenting and pushing for water justice.Dependent variablesThe dependent variables for this analysis were assessed in both a continuous and dichotomous format. For descriptive results and mapping, continuous measures were used. For models of water injustice, a dichotomous measure which classified counties as either having low levels of the specific water issue or elevated levels or the specific water issue, was used due to the low relative frequency of water access and quality issues relative to the whole United States population. For all three outcomes, we benchmark an elevated level of the issue as what would be viewed as an unacceptable level under United Nations Sustainable Development Goal 6.1, which states, “by 2030 achieve universal and equitable access to safe and affordable drinking water for all”1. As this goal focuses on ensuring all people have safe water, we deem a county as having an elevated level of the issue if >1% of households, community water systems, or permittees had incomplete plumbing, were in Significant Violation, or Significant Noncompliance, respectively. Although we could have used an even stricter threshold given the SDG’s emphasis on ensuring access for all people, we use 1% as our cut-off due to its nominal value and ease of interpretation.For water access, the continuous measure was the percent of households in a county with incomplete household plumbing as reported by the ACS. The ACS currently asks respondents if they have access to hot and cold water, a sink with a faucet, and a bath or shower. Up until 2016, the question also included a flush toilet33. As we must use the most recent 2014–2018 5-year estimates to establish full coverage of all counties, this means that incomplete plumbing in this item may, or may not include a flush toilet depending on when the specific county was sampled. The dichotomous version of this variable benchmarked elevated levels of incomplete plumbing as whether or not 1% or more of households in a county had incomplete plumbing.Water quality was assessed via both community water systems from the Safe Drinking Water Act, and from permit data via the Clean Water Act. For Safe Drinking Water Act data, the continuous measure was the percent of community water systems within a county classified as a Safe Drinking Water Act Serious Violator at time of data extraction. The EPA assigns point values of either 1, 5, or 10 based upon the severity of violations of the Safe Drinking Water Act. A Serious Violator is one who has “an aggregate score of at least eleven points as a result of some combination of: unresolved more serious violations (such as maximum contaminant level violations related to acute contaminants), multiple violations (health-based, monitoring and reporting, public notification and/or other violations), and/or continuing violations”27. The dichotomous measure benchmarked elevated rates of Safe Drinking Water Act Significant Violation as whether or not >1% of county community water systems were classified as Serious Violators.For Clean Water Act permit data, the continuous measure was the percent of permit holders listed as in Significant Noncompliance at the time of data extraction. Significant Noncompliance in the Clean Water Act refers to those permit holders who may pose a “more severe level of environmental threat” and is based upon both pollution levels and reporting compliance27. The dichotomous measure again set the threshold for elevated levels of poor water quality at whether or not >1% of Clean Water Act permittees in a county were listed as in Significant Noncompliance at time of data extraction.Independent variablesThe independent variables we include in models of water injustice are those frequently shown to be related to environmental injustice in the United States. These include age, income, poverty, race, ethnicity, education, and rurality17,18,19,20,21,22,23,24,25. Age was included as median age. Income was included as median household income. Poverty was the poverty rate of the county as determined by the official poverty measure of the United States34. Race and ethnicity was included as percent non-Latino/a Black, percent non-Latino/a indigenous, and percent Latino/a. Because the focus was on indigeneity, percent American Indian or Alaska Native was collapsed with Native Hawaiian or Other Pacific Islander. We did not include percent non-Latino/a white due to issues of multicollinearity. Finally, rurality was included as a three-category county indicator of metropolitan, non-metropolitan metropolitan-adjacent, and non-metropolitan remote, as determined by the Office of Management and Budget in 201035. The OMB determines a county is metropolitan if it has a core urban area of 50,000 or more people, or is connected to a core metropolitan county by a 25% or greater share of commuting35. A non-metropolitan county is simply any county not classified as metropolitan. Non-metropolitan metropolitan adjacent counties are those which immediately border a metropolitan county, and non-metropolitan remote counties are those that do not.Water injustice modeling approachWater injustice was assessed by estimating linear probability models for the three dichotomous outcome variables with state fixed effects to control for the visible state level heterogeneity and differences in policy, reporting, and enforcement (e.g. the clear state boundary effects in Fig. 3). We employ cluster-robust standard errors at the state level to account for both heteroskedasticity and state similarities. All modeling was performed in Stata 16.0 and mapping was performed in QGIS 3.10. We assessed all full models for multicollinearity via condition index and VIF values and the independent variables had an acceptable condition index of 5.48, well below the conservative cut-off of 15, as well as VIF values of 20). All indications of statistical significance are at the p  More

  • in

    Airborne geophysical surveys of the lower Mississippi Valley demonstrate system-scale mapping of subsurface architecture

    A system-scale airborne geophysical surveyFrom 2018 through early 2020, we acquired more than 43,000 flight-line-kilometers (line-km) of airborne geophysical data over the MAP study area of ~140,000 km2 (Fig. 2a, “Methods” section). Data collection included a high-resolution survey over ~1000 km2 near Shellmound, Mississippi, regional surveys with 3–6 km line spacing across the entire study area, and over 3000 line-km of data acquired along streams and rivers to characterize potential surface water–groundwater connections beneath these important recharge pathways. Radiometric (Fig. 2b), magnetic (Fig. 2c), and inverted resistivity grids at multiple depth intervals (Fig. 2d–h) summarize the combined results from both regional survey phases. Together, this represents the first initiative to acquire system-scale airborne geophysical data over an entire US aquifer.Fig. 2: Airborne geophysical survey coverage and summary of regional datasets.a Primary management regions in the MAP study area, with flight lines for each of the three phases of data collection completed through early 2020 (CR Crowleys Ridge). Results from the combined regional surveys gridded onto the 1 km National Hydrogeologic Grid45 include b radiometric data presented as a ternary diagram that indicates the relative abundance of K, U, and Th in surficial sediments with areas of Holocene (H)- and Pleistocene (P)-aged sediments indicated32; c the residual magnetic intensity map (in nanoTeslas, nT) shows faults46 related to the New Madrid seismic zone (RR Reelfoot rift, CGL Commerce geophysical lineament); and d–h resistivity depth slices at five depth intervals from 0 to 220 m below land surface annotated with mapped surficial geologic units32 (Hb backswamp, Hp Point bar and meander belt, Pve & Pvcl Wisconsin-age valley train) and four-letter codes of distinguishable hydrogeologic units (MRVA Mississippi River Valley alluvial aquifer, VKBG Vicksburg–Jackson confining unit, MCAQ Middle Claiborne aquifer, LCCU Lower Claiborne confining unit, MDWY Midway confining unit).Full size imageAt the regional scale, radiometric data (Fig. 2b) correlate with mapped surficial geology32 and sediment age, with Holocene deposits clearly delineated as strong returns of multiple elements (light-colored areas in Fig. 2b) compared with Pleistocene sediments. Magnetic data gridded at this scale largely corroborated previously mapped structures, such as the line of southwest–northeast-trending magnetic highs (Fig. 2c) associated with mapped intrusive plutons along the Commerce geophysical lineament (CGL) adjacent to the RR in northeast Arkansas and southeast Missouri45,46.Inverted resistivity models in the uppermost 5 m (Fig. 2d and Fig. S1b) correspond closely with mapped surficial units32 (Fig. S1a). For example, Wisconsin-age valley train deposits, as well as point bars and meander scrolls of modern river networks, appear as resistive features in the uppermost 5 m, whereas fine-grained units such as backswamp deposits are characterized by low resistivity. At 20–25 m depth (Fig. 2e and Fig. S1c), intermediate to high-resistivity values are consistent with the coarse-grained lithology found throughout the MRVA. Lower resistivity can be found at this depth in sedimentary units outside the MAP region as well as over structural highs within the MAP region where Tertiary units are close to the surface. Low-resistivity values show the Vicksburg–Jackson confining unit (VKBG) in southeast Arkansas and northeast Louisiana, the Midway confining unit (MDWY) in northeast Arkansas, and an erosional remnant of Tertiary sediments beneath Crowleys Ridge. Beneath the Quaternary MRVA (~30–50 m depth in most of the region), resistivity values strongly correlate with Tertiary subcropping units (Fig. 2f–h and Fig. S1d, e). Most notable here is the low resistivity that corresponds with the known regional VKBG and MDWY confining units, both clay and shale rich. In contrast, the subcrop of the Middle Claiborne aquifer (MCAQ) sands are resistive (Fig. 2g, h and Fig. S1d), with a notable change in facies north of Memphis, Tennessee, associated with the coarse Memphis Sand of the Claiborne Group (dashed line, Fig. 2g, h).Further correlation between resistivity and geologic structure is evident in cross-section view (Fig. 3a–c), where gridded resistivity models are compared with the top elevation of MERAS model surfaces43. From west to east, prominent features in Fig. 3a include the highly resistive Paleozoic Ozark Plateaus aquifer that bounds the MAP region, the conductive east-dipping MDWY beneath and west of Crowleys Ridge, and the resistive MCAQ dipping to the east beneath the east side of Crowleys Ridge. This section highlights the variable degree of connectivity between the MRVA and underlying Tertiary aquifers. While the MCAQ sands appear connected to the shallow MRVA (veneer of moderate to high resistivity in the upper 30–50 m) immediately east of Crowleys Ridge, these aquifer layers become mostly separated by the Middle Claiborne confining unit (MCCU) farther east. To the south (Fig. 3b), the subcrop of the Claiborne Group aquifers (MCAQ and Upper Claiborne aquifer) suggests a direct connection to the MRVA beneath the Mississippi Delta, sandwiched between the VKBG confining unit in the shallow western half of the section and the Lower Claiborne confining unit (LCCU) that dips westward in the eastern part of the section. Towards the southern end of the AEM survey (Fig. 3c), the MRVA is disconnected from Tertiary aquifers by the MCCU and VKBG confining units in the western and eastern portions of the section, respectively. AEM-derived resistivity models largely corroborate the existing framework but also reveal greater detail in the overall structure and heterogeneity within units than could be previously determined through relatively sparse borehole observations. This study demonstrates that systematic mapping at high spatial resolution with AEM data illuminates model structural details expected to exist throughout the region but that cannot be fully understood with sparse observations.Fig. 3: Resistivity and interpreted facies classification cross-sections.West–east resistivity (a–c) and facies classification (d–f) sections are shown at three latitudes across the survey area (Fig. 2a). Surfaces of the top elevation of MERAS model hydrogeologic units43 are shown on cross-sections for reference. CR Crowleys Ridge, MR Mississippi River. Two-sided arrows indicate potential regions of connectivity between MRVA and deeper aquifer units.Full size imageCombined resistivity models from the two phases of regional data collection (Fig. 2d–h and Fig. 3a–c) have been interpolated onto grids with 1 km × 1 km lateral and 5-m vertical resolution useful for investigation of regional-scale structure. However, the native resolution of resistivity models along flight paths is much higher, with spacing between sounding locations for the Resolve and Tempest AEM systems equal to 25 and 75 m, respectively, with finer-scale near-surface vertical resolution for the Resolve system. On a native-resolution cross-section of the Resolve system in northeast Arkansas (Fig. S2), significant detail reveals the internal structure and variability of the MRVA. For example, the Resolve models capture the topography of the aquifer base, including incised channels in the subcropping Tertiary unit that may have been formed during periods of glacial outwash (interpreted locations marked “C” on Fig. S2), and the thickness and extent of surficial low-resistivity material that may be local confining units and a barrier to recharge. Lateral transitions in the upper ~5–30 m correspond with mapped braid belts33, suggesting that electrical resistivity can be an indicator of these distinct lithologic units. However, since the chronology of mapped braid belts is largely based on relatively shallow OSL dates and surficial mapping33, there is little constraint on the age of deeper Quaternary sediments beyond the estimate of their deposition after 250 ka34. Given the presence of shallow braid belt deposits older than the last sea-level lowstand ~20 ka33, along with the observed internal structure of the MRVA with a discontinuous low-resistivity layer at ~30 m (Fig. S2), we hypothesize that the deeper Quaternary sediments may represent earlier filling of post-250 ka eroded channels of the ancestral Mississippi–Ohio River systems34,38.Derivative products: interpretations of hydrogeologic structure and propertiesWhile lithology dominates AEM-derived resistivity values in the MAP region, porewater salinity is also known to influence resistivity17,19,24, and groundwater salinity varies throughout the study area47,48. Specific conductance (SC) measured in boreholes throughout the MERAS domain are generally low, with similar values in both Quaternary sediments (median log10 SC 2.79 = 617 μS/cm) and deeper Tertiary units (median log10 SC 2.67 = 471 μS/cm). Areas of high salinity are limited within the MRVA footprint, with only 6% of the area predicted to be >1000 μS/cm48. Correlation between SC- and AEM-derived resistivity (Fig. 4) by MAP region (Fig. 2) suggests that SC has limited overall control on resistivity across the study domain, but can be important in specific regions where SC is high, generally in the Grand Prairie (Fig. 4a) and Boeuf (Fig. 4b) regions. Lithology appears to be a primary driver for resistivity, with Quaternary sediments typically higher in resistivity than Tertiary units (Fig. 4a–f) over the same SC range. Tertiary, and to a lesser degree Quaternary, units follow an SC-resistivity trend indicative of moderate-high surface conductivity caused by an increased fraction of clays or other fine-grained sediments that flattens the slope of this relationship (black curves, Fig. 4a–f), compared with Archie’s Law where surface conduction is absent49,50. Notable exceptions where Quaternary and Tertiary resistivity values are similar occur in the Cache (Fig. 4c) and St. Francis (Fig.4d) regions, where coarse-grained MCAQ sand subcrop beneath the MRVA and appear similar geophysically (double-sided arrow in Fig. 3a just east of Crowleys Ridge). The sensitivity of AEM data to both model structure and porewater salinity makes it a powerful tool in hydrologic studies, and also highlights the need for borehole and other geologic observations to calibrate against to reduce uncertainty in the extrapolation of AEM interpretations across the entire survey domain.Fig. 4: Relationships between groundwater salinity and AEM-derived resistivity.a–f Measured groundwater specific conductance (SC) versus AEM-derived resistivity at borehole locations, organized by region (Fig. 2). SC measurements from MRVA, terrace (TRRC), and loess layers are shown in blue, with measurements from deeper Tertiary units in yellow. Black dashed curves illustrate theoretical SC-resistivity relationships for varying amounts of surface conductivity (({{{sigma }}}_{{rm{s}}})) caused by the increasing fraction of fine-grained sediment. g Map view of a high-SC cluster in White County, Arkansas, correlates with decreased shallow resistivity in cross-section view (h) and corresponds to the high-SC observations circled in (a).Full size imageUsing 6130 published picks defining the depth to the base of the MRVA39 within the AEM survey area, along with an additional 364 manual picks based on observation of resistivity cross-sections, we used a supervised machine learning algorithm51 (see “Methods” Section) to interpret the elevation of the base of the aquifer across the entire AEM dataset (Fig. 5a). An associated aquifer thickness map (Fig. 5b) is estimated by differencing the base of aquifer elevation surface from the land surface elevation, and saturated thickness (Fig. S3) is calculated by differencing the base of aquifer elevation from the 2018 potentiometric surface, assumed here to be the water table, derived from borehole observations52. By subregion (Fig. 2a and Fig. S3), saturated thickness is greatest in the St. Francis region where deep Quaternary scour channels of the Mississippi–Ohio River system have been documented east of Crowleys Ridge38, and thinnest in the Grand Prairie region because of the combined influence of deep water table, a thick confining layer that limits recharge, and shallow subcrop of the VKBG (Fig. 2e). The difference in saturated thickness between AEM and borehole interpretations of the base of the aquifer surface is on the order of ±10–15 m (Fig. S3b). While not a large difference in absolute value, this can represent a significant percentage of aquifer thickness, which is typically 3000 line-km of AEM data were acquired directly along the Mississippi and Arkansas River paths, as well several smaller tributaries (Fig. 2a). For example, native-resolution, ungridded resistivity profiles with ~25 m spacing along flight paths for several of the tributaries (Fig. S5) illustrate similar detail as the main block flight lines in aquifer structure and geometry of the aquifer base and subcropping unit contact (Fig. S2). Following structure along the river paths enables a detailed view of the discontinuous nature of shallow confining materials beneath river systems, as well as how rivers—which are often important surface water-groundwater conduits—may be connected to the aquifer below. A persistent feature in many river profiles is the intermediate-resistivity layer at ~20–30 m depth that was more discontinuous along the main block flight lines and hypothesized earlier as a fine-grained unit separating younger and older Pleistocene sediments also indicated in Fig. S2.In addition to streamwise surveys of river courses, gridded resistivity models from the complete regional dataset are intersected with the NHDPlus database of flowlines57 to produce cross-sections along any river path (Fig. 6b–f). For example, flowlines intersected by the west–east flight-line block but not flown streamwise can be produced, such as the White River (Fig. 6c). Everywhere the resistivity grids intersected a stream, we use the same connectivity metric discussed previously to quantify the magnitude of connection potential between rivers and the underlying aquifer (see “Methods” section). Here, the vic connectivity metric is calculated on a version of the combined regional dataset made in 2-m depth intervals, integrated from 0 to 10 m beneath the river bottom to characterize the presence or absence of fine-grained material beneath rivers (Fig. 6a). In contrast to the MERAS model calibration for streambed conductance, which incorporated 43 streams with limited data for informing parameter values54, the results here provide far greater coverage and finer granularity on the expected spatial patterns of streambed conductance that may be incorporated in groundwater model parameterization. To first order, surficial geology surrounding the streams32 is an important factor in the vic connectivity metric, with fine-grained units appearing less connected than coarse or intermediate units (Fig. S6).Fig. 6: Characterizing rivers and surface water-groundwater connections.a River connectivity metric defined along all NHDPlus segments in the study area identifies areas for high or low potential for surface water-groundwater connectivity based on streambed resistivity values. Resistivity cross-sections (from north to south) extracted from gridded data along the St. Francis River (b), White River (c), Yazoo-Tallahatchie River (d), Bayou Bartholomew (e), and Mississippi River (f). The top elevation of Tertiary MERAS model layers43 is indicated as solid lines on top of resistivity cross-sections.Full size imageGeological controls on groundwater ageGroundwater age can be used to assess groundwater availability by estimating recharge rates, delineating recharge areas, and characterizing aquifer susceptibility to surface contamination58,59. Tritium is used to qualitatively differentiate groundwater age into modern (recharged in 1953 or later), premodern (recharged prior to 1953), or mixed (combination of modern and premodern water) groundwater categories60. Tritium samples collected from the MAP (n = 582) were categorized into modern, mixed, and premodern groups61 based on the atmospheric tritium input for the well location, sample date, and the measured tritium concentration60. Modern ages are expected for surficial alluvial aquifers in a humid region and 39.1% of MRVA samples fall in this grouping61; however, 17.9% of samples were premodern61, indicating significant variability in either recharge rates or sources of groundwater to the MRVA. The heterogeneity in MRVA groundwater age implies local-scale control, which also likely varies among the MAP regions depending on the relative importance of surface water recharge, aerial recharge through surficial confining units, or possible upwelling from deeper Tertiary units. Part of the challenge in understanding the heterogeneity of groundwater age in this system was not having regional-scale interpretations of aquifer architecture and connectivity between units.To investigate the controls of shallow and deep connectivity on groundwater age, derivative product metrics for surface connectivity (Fig. 5e) and MRVA-Tertiary aquifer connectivity (Fig. 5f) are plotted together with tritium age categories (Fig. 7). Samples with premodern age (Fig. 7a) have characteristically high connectivity between the MRVA and subcropping Tertiary units, along with mixed surficial connectivity, suggesting that upwelling from deeper units may control these older groundwater measurements. Conversely, both mixed and modern tritium categories show a broader range of connectivity to deeper units (Fig. 7b,c), while the modern category has the largest fraction of points (85%) that indicate some degree of surface connectivity, suggesting that connection at the surface is a controlling factor for younger samples. While further data are needed on hydrologic gradients in the system to better predict actual flow paths, these insights suggest that geological structure provides at least partial control on groundwater age and is therefore also likely to be an important driver for vertical groundwater transport. The groundwater age observations—especially the preponderance of premodern groundwater—are difficult to understand without system-scale interpretations of aquifer architecture. The finding that geological structure controls groundwater transport and age is not itself a novel concept; however, we demonstrate that system-scale AEM data can map detailed model structure that facilitates a new understanding of hydrologic processes relevant to many applications and geologic settings.Fig. 7: Relationship between groundwater age and AEM-derived aquifer structure.Sample depth for tritium dates classified as premodern (a), mixed (b), or modern (c) is plotted against the degree of connectivity between the MRVA and deeper Tertiary units (Fig. 5f), with point colors that represent the degree of shallow aquifer connectivity (Fig. 5e).Full size imageHidden faults along the CGLResistivity cross-sections west of the northeast-trending portion of Crowleys Ridge image significant up-to-the-east vertical offset—as much as ~50–75 m—that can be tracked over a distance of >100 km along two shallow fault strands (Fig. 8). These two faults closely follow the path of the ~10-km-wide CGL, just west of the RR and NMSZ62. The western fault splay shows a clear offset of the contact between an Upper Cretaceous high-resistivity layer at depth (McNairy Nacatosh aquifer) and the low-resistivity Tertiary MDWY within the uppermost 100–150 m (Fig. 8b–d), clearly extending to at least the base of the Quaternary aquifer. This western fault is evident on cross-sections between Fig. 8b and d (solid brown line Fig. 8a), with vertical offsets largest in the middle of this segment (up to 75 m) and maximum width ~500–1000 m. The eastern of the two faults is characterized in cross-section view by a dipping conductive layer that appears to terminate at the base of the MDWY, separating the uplifted resistive Cretaceous block to the west from the conductive MDWY to the east. Whether the dipping conductive layer is related to the MDWY or the fault structure itself is unclear. Unlike the western fault, offset is not seen at the top of the MDWY, possibly because of Quaternary erosion of this surface. The eastern of the two structures can be traced on resistivity cross-sections over a greater distance, following the CGL to the southwest before turning south towards the Western Margin fault (Fig. 8).Fig. 8: Fault structures along the Commerce geophysical lineament (CGL).a Map view of resistivity at a depth of 60–65 m along with previously mapped fault structures and seismicity of the New Madrid seismic zone. Faults identified in this study (brown and yellow lines) fall along the 10-km-wide path of the CGL (white dashed lines)62. b–d Resistivity cross-sections indicate up to ~75 m of up-to-the-east offset on two near-vertical faults (black lines, with mapped locations marked by brown and yellow arrows) west of Crowleys Ridge, with maximum offset in the middle of the region (c). The western fault structure shows a clear offset of the deeper moderate-resistivity Cretaceous (K) McNairy Nacatosh aquifer against the low-resistivity Midway confining unit. The inferred base of the Midway confining unit in the vicinity of these faults is indicated by dashed lines. Location of the north–south seismic profile over the CGL65 is indicated by a red star.Full size imageWhile generally coincident with the CGL, both hidden faults captured in the AEM data are previously unmapped, and thus provide new insight into the tectonic history of this area. The observed uplift on these faults along the CGL can plausibly help explain the tectonic origin of Crowleys Ridge given the proximity of these features to one another. Previous geophysical and geomorphological observations along and near the ridge margins provide important insight63,64—especially in relation to the eastern fault indicated here—but have not revealed the source of uplift occurring 10–20 km farther west along the CGL. Existing geophysical data along the ridge margins64 and the CGL65 indicate faulting and support Quaternary fault activity, but do not have sufficient spatial scale and or resolution to delineate the discrete offset imaged by AEM data over the fault length of >100 km. The 50–75 m uplift (Fig. 8b–d) requires movement since the early Tertiary but is consistent with the Quaternary motion. The offset-wedge geometry suggests that the MDWY was lithified at the time of deformation. Given the lack of surface expression, the majority of offset may have occurred before the 32–45 ka Melville Ridge braid belt found immediately west of Crowleys Ridge33; however, further age constraint of Quaternary activity is not available.Dense borehole observations that quantify the overall thickness of Quaternary sediments identify the Pliocene–Pleistocene unconformity41 and support an argument for 53 m of Quaternary uplift34 in the region (Fig. 8a). However, the typical minimum separation between boreholes is >2 km and borehole density declines west of the CGL, making them also insufficient to identify this narrow fault feature as a potential source of uplift. Although the CGL and identified faults are removed from the seismically active region of the NMSZ, the possibility of Quaternary tectonic activity and offset on the faults identified here is important to understand past variations in seismicity and future hazard in the region62.Multi-scale mappingThe multiple phases of AEM mapping (Fig. 2a), along with targeted ground-based66 and waterborne67 geophysical data collection, provide an excellent case study in the value of subsurface mapping over multiple scales. From regional surveys aggregated to 3 km line spacing with 1 km grid cells to high-resolution AEM data in the Shellmound area with 0.25–1 km line spacing with 100 m grid cells, we are able to illustrate the value of an order of magnitude increase in flight-line spacing.The horizontal and vertical resolution of airborne geophysical surveys defines the scale of investigation, which depends on survey design parameters including instrument type, flight-line spacing, and total kilometers flown. Spatially extensive data capturing individual features several kilometers in size commensurate with mapping scales of ~1:1,000,000 (Figs. 2, 5, and  9a) are mapped with flight-line spacing on the order of 1–10 km and effectively inform regional-scale hydrologic models and decision support. High-resolution airborne data with sub-kilometer flight-line spacing greatly improve the resolution of smaller features appropriate for local ~1:100,000-scale mapping (Fig. 9c and Fig. S7).Fig. 9: Multi-scale mapping with airborne and ground-based methods.a Regional-scale structure mapped on 1 km grid cells interpolated from ~3 to 6 km-spaced flight-line data over the entire MAP region. b Regional-scale structure enlarged to the ~30 × 30 km Shellmound study area, compared with high-resolution structure mapped on 100 m grid cells interpolated from 0.25 to 1 km-spaced flight-line data from the Shellmound study area (c) shows the ability to resolve detailed buried channel structure. d Comparison of near-surface high-resolution Shellmound AEM data (background) with very high-resolution ground-based electromagnetic data66 acquired with a sensor towed over ~4 km2 on survey lines spaced by 25 m.Full size imageThe high-resolution data near Shellmound, Mississippi, map coarse-grained (high-resistivity) sediments associated with river meanders that also tend to have relatively high potassium (K) abundance (Fig. S7a, b, east), in contrast with backswamp deposits and other fine-grained overbank sediments that have relatively low-resistivity and greater thorium (Th) abundance. At 50–55 m depth (Fig. 9c and Fig. S7d), resistivity models map a buried paleochannel in the southeast corner of the grid that appears incised to depths of at least 75 m. This channel was previously unknown and may be a relict of the ancestral Mississippi–Ohio river system that flowed east of Crowleys Ridge during the Pleistocene33, creating conduits for local groundwater flow. Deep channels such as this are likely present throughout the region, but are under-sampled by borehole observations and even the regional geophysical flight lines (Figs. 2d–h, 3a–c).The combination of spatially extensive and high-resolution data enables detection and mapping of small-scale or hidden features that would otherwise be missed (i.e., the known unknowns in the subsurface). Our results demonstrate the value in AEM for imaging localized pathways for groundwater connection above and below the aquifer (Fig. 5e, f), buried paleochannels (Fig. 9c), and hidden faults with narrow zones of uplift previously unmapped (Fig. 8). These discrete and relatively small features may have an outsized and nonlinear impact on model predictions, hazard assessments, and decision making, but are impossible to recognize with sparse or limited spatial coverage.Outlook and future directionsAn emerging example of “big data” in geoscience, AEM is expanding the breadth of information that can be applied to near-surface investigation and modeling where detail about subsurface complexity is lacking. Many process-based hydrologic models have evolved to function in the absence of detailed subsurface structural information, largely because these data are typically not available and because incorrect assumptions about subsurface structure and unwarranted complexity lead to modeling errors68. Technological advances in remote sensing and airborne geophysics present new opportunities to advance the level of geological detail that can be incorporated in hydrologic models through the entire vertical extent of an aquifer system. As “big data” have become available to constrain details of model structure, commensurate advances in modeling, such as machine learning and uncertainty quantification69, will also be needed to realize the full potential of airborne geophysical datasets.As the dimension and complexity of questions that may be asked of a model or array of models (i.e., the “decision-space”) continue to increase, the need for detailed data to support these questions also increases. When allocating resources for data collection, a balance should be sought between the optimality of data collection to address a narrow decision-space (such as a single, focused question) and robustness of the investment to address future questions that expand the decision-space. The variety of applications demonstrated in this study highlights the robustness of large-scale AEM data in this respect. Although AEM surveys can involve high absolute cost, at large scales they may be 3–4 orders of magnitude less expensive on a per-data-point or per-square-kilometer basis compared with traditional ground-based surveys or drilling. Because large surveys can cover parts of multiple counties and states and support the interests of multiple scientific disciplines or stakeholder interests, a community-driven approach to the acquisition of these foundational geoscientific datasets is advantageous. Benefits of a coordinated approach to acquiring system-scale AEM data include reduced costs to individual participants, leveraging resources to acquire more data than any individual group could achieve, and ensuring data consistency across the study area that maximizes their value.Much as lidar has transformed our understanding of Earth’s surface, airborne geophysical data extend our view into the subsurface, transforming our ability to inform three-dimensional mapping from catchment to basin scales (Fig. 1b). Here, we demonstrated that system-scale AEM data provide a robust platform from which to address a host of subsurface questions. Airborne geophysical datasets represent the next generation in subsurface mapping, capable of filling in the gaps between existing boreholes with an order of magnitude or greater increase in data density. This will provide nationally consistent near-surface geologic datasets as a foundation for three-dimensional geologic interpretations, hydrologic models, and other subsurface studies that rely on detailed measurements of difficult to access belowground properties within 200–300 m of Earth’s surface (i.e., lidar for the subsurface). More