Global potential for harvesting drinking water from air using solar energy
Water access data processingData on drinking water coverage by region was acquired from the WHO/UNICEF JMP. The JMP acts as official custodian of global data on water supply, sanitation and hygiene2 and assimilates data from administrative data, national census and surveys for individual countries, and maintains a database that can be accessed online through their website. We accessed data tables for national and subnational drinking water service levels from https://washdata.org.JMP datasets are not geographically linked to official boundary files. We joined the tables to GIS boundaries obtained from the following open-source collections: GADM (https://gadm.org), the Spatial Data Repository of the Demographic and Health Surveys Program of USAID (DHS) and the Global Data Lab of Radboud University (GDL)2,50,51,52,53. Subnational regions reported by the JMP are unstructured, representing various regional administrative levels (province, state, district and others).The JMP national and subnational data were joined to GIS boundaries using a custom geoprocessing tool built in Python and ArcGIS 10. The tool joins the available JMP subnational-level survey data to the closest name match of regional boundary names from a merged stack of GADM (admin1, admin2 and admin3), DHS and GDL boundaries worldwide. The JMP national-level survey data is then joined to GADM national (admin0) boundaries for countries which have no subnational data available. Finally, the two boundary-joined datasets (national and subnational) are merged, processed and exported as a seamless global fabric of water-stressed-population data at the highest respective spatial resolutions available (Fig. 1a).JMP does not report the breakdown between the SMDW and basic service level within subnational regions, and instead reports a combined category called ‘at least basic’ (ALB). To estimate the SMDW values in subnational regions, a simple cross-multiplication was performed using the splits at the national level:$${{rm{SMDW}}}_{{rm{subnational}}}=frac{{{rm{SMDW}}}_{{rm{national}}}}{{{rm{ALB}}}_{{rm{national}}}}{times {rm{ALB}}}_{{rm{subnational}}},$$where ALBnational, ALBsubnational and SMDWnational are known values.Validation of the cross-estimation of share of SMDW from ALB for subnational regions was conducted on a reference dataset of nationally representative household surveys that collected data on all criteria for SMDW54, shown in Extended Data Fig. 2. We report regression results of R2 = 0.87 and a standard error of 3.67, indicating a bias which over-reports SMDW share and a probable underestimate of people living without SMDW in our study. This discrepancy comes from JMP calculations of SMDW that rely on the minimum value of multiple drinking water service criteria (free from contamination, available when needed and accessible on premise) rather than considering whether individual households meet all criteria for SMDW55.The fraction of population without SMDW was multiplied by residential population values in the WorldPop top-down unconstrained global mosaic population count of 2017 at 1 km spatial resolution56 (https://www.worldpop.org). WorldPop was accessed online as a TIF image and imported to Google Earth Engine. The year 2017 was chosen to more closely match water access data from JMP. The percentages reported by JMP are probably not uniform within most regions57, introducing an unknown error to Fig. 1b, but represent the best estimate available to us given the limitations of these regionally reported data.Climate input and conversion approximationsGHI and reference planeWe used GHI (in W m−2) as solar energy input data. GHI has good availability in climate datasets and introduces the fewest number of assumptions. Since GHI describes the irradiance in a locally horizontal reference plane, this approximation is only exact for devices having a horizontally oriented solar harvesting area. Annually averaged comparisons between horizontal and optimal fixed-tilt panels show negligible differences in direct plus diffuse radiation in tropical latitudes, and ratios below 25% in locations within 50° north and south latitudes58. Those seeking precise absolute predictions for tilted devices or higher latitudes are encouraged to adapt the provided code to their specific assumptions.Conversion from SY to AWH outputAs discussed in the main text, solar-driven AWH devices typically have one of two predominant energy inputs: thermal (converted directly from incident sunlight on the device) or electrical (from PV). Here, the energy units used to calculate yield in l kWh−1 are incident solar energy directly from GHI. The various assumptions are made in relation to the reported values based on their source. The thermal limits33, target curves, and experimental results reported by TRP15 and MOFs were assumed to have direct (100%) conversion from sunlight to heat. For the ZMW device, the table provided by the manufacturer accounts for system losses, so the table values were directly converted in our model35. For ref. 34 and the cooler–condenser limits from ref. 32, which both assume work input instead of heat, we applied a typical PV conversion efficiency of 20% to convert from sunlight kWh (GHI) to kWhPV (electrical work) input to the device59.Sufficiently short sorbent cycling timesAWH-Geo assumes continuous or quasi-continuous AWH. AWH-Geo considers each 1-h timestep independently and is thus stateless. Aside from edge cases, this is a safe assumption for mass efficient SC-AWH devices, which typically have time constants shorter than 1 h, both for sorbent cycling and for most of the thermal time constants. For devices with longer time constants, batch devices or processes with slow (de)sorption kinetics, this assumption may introduce increased error, and may require further adaptation of the provided code.Climate time-series calculationAWH-Geo is a resource-assessment tool for AWH. It consists of a geospatial processing pipeline for mapping water production (in litres per unit time) around the world of any solar-driven continuous AWH device that can be characterized by an output table of the form output = f(RH, T, GHI).Output tables show AWH output values in l h−1 or l h−1 m−2 across permutations of the 3 main climate variables in the following ranges: RH between 0 and 100 % in intervals of 10%, GHI between 0 and 1,300 W m−2 in intervals of 100 W m−2, and T between 0 and 45 °C in intervals of 2.5 °C (2,145 total output values). The tables are converted into a 3D array image in Google Earth Engine and processed across the climate time-series image collection for the period of interest. Finally, these AWH output values are composited (reduced) to a single time-averaged statistic of interest as an image.Climate time-series data was acquired from the ERA5-Land climate reanalysis from the European Centre for Medium-Range Weather Forecasts (ECMWF)60, accessed from the Google Earth Engine data catalogue. ERA5-Land surface variables were used in 1-h intervals and 0.1°× 0.1° (nominal 9 km). The 10-year analysis period (2010–2019, inclusive) was used for this work, and represents a period long enough to provide a reasonable correction for medium-term interannual climatic variability.Climate variables GHI and T were matched to ERA5-Land parameters ‘Surface solar radiation downwards’ (converted from cumulative to mean hourly) and ‘2 metre temperature’ (converted from K to °C), respectively. RH was calculated from the ambient and dew point temperature parameters in a relationship derived from the August–Roche–Magnus approximation61 rearranged as:$${rm{RH}}=100 % times frac{{{rm{e}}}^{left(frac{a{T}_{{rm{d}}}}{b+{T}_{{rm{d}}}}right)}}{{{rm{e}}}^{left(frac{{aT}}{b+T}right)}}$$where a is 17.625 (constant), b is 243.04 (constant), T is the ERA5-Land parameter ‘2 metre temperature’ converted from K to °C, and Td is the ERA5-Land parameter ‘2 metre dewpoint temperature’ converted from K to °C.Spot validation of the climate parameters and the mapped output was performed manually in Google Earth Engine across several timesteps in 2016 in Ames, Iowa (using the Iowa Environmental Mesonet AMES-8-WSW station62) and showed insignificant error (99.99 to 95.80% for thermal absorbers, depending on the level of angular selectivity63.Rearranged, Kim’s model yields$$frac{{dot{V}}_{{rm{water}},{rm{out}}}}{A}le {E}_{{rm{GHI}}}times left(1-frac{{T}_{{rm{ambient}}}}{{T}_{{rm{hot}}}}right)times {left[frac{1}{{omega }_{{rm{air}},{rm{in}}}-{omega }_{{rm{air}},{rm{out}}}}({e}_{{rm{air}},{rm{out}}}-{e}_{{rm{air}},{rm{in}}})+{e}_{{rm{water}},{rm{out}}}right]}^{-1}times frac{1}{{rho }_{{rm{water}}}}$$where, in addition, ({dot{V}}_{{rm{water}},{rm{out}}}) is the production rate of liquid water by volume, ({A}) is the area harvesting sunlight (see approximation section below), ({E}_{{rm{GHI}}}) is GHI in Wsun m−2, and ({rho }_{{rm{water}}}) is the density of water.This is now a function of the three key climate variables: GHI (in the first term), ambient temperature (in the second and hidden in the third term) and RH (entering the third term). This was converted to an output table and processed through the AWH-Geo pipeline and presented in Fig. 3a. While this can be run for any choice of parameter ({T}_{{rm{hot}}}), we present figures here for ({T}_{{rm{hot}}}) = 100 °C, a temperature still achievable in low-cost (non-vacuum) practical devices without tracking or sunlight concentration. Higher driving temperatures increase the upper bound for water output. For the limits analysis, values of RH above 90% are clamped to prevent unrealistically high theoretical outputs as Kim’s equation goes to infinity at 100% RH. A further assumption is made that new ambient air is efficiently refreshed.Figure 3b maps the maximum yield for active cooler–condensers without recuperation of sensible heat—all given work input and an optimum coefficient of performance of the cooling unit at a condenser temperature that maximizes specific yield as modelled by Peeters32, which we digitized from their fig. 11. Peeters chose to set yield to zero whenever frost formation would be expected on the condenser. Since Peeters assumes work input, we convert from solar energy (GHI) to kWhPV as discussed above.Figure 3c maps Zhao’s experimental results from a TRP using a logistic regression curve fit to their reported SYs of 0.21, 3.71 and 9.28 l kWh−1 at 30, 60 and 90% RH, respectively15. The terms of the curve fit are reported in the next section.Custom yellow to blue map colours are based on www.ColorBrewer.org, by C. A. Brewer, Penn State64.Specific yield and target curvesTwo simple characteristic equations, linear and logistic, were used to fit a limited set of SY and RH pairs from laboratory experiments or reported values and plotted through AWH-Geo using calculated output tables. Hypothetical curves of similar form whose terms were adjusted iteratively in AWH-Geo to goal-seek a target output (5 l d−1) and user base, and are reported here (for 1-m2 devices). In the following equations, RH in % is taken as a fraction (for example 55% is equivalent to 0.55).The linear target curve is a simple linear function which crosses the y-axis at zero:$${rm{SY}}({rm{RH}})=atimes {rm{RH}}$$where a is set to 1.60, 1.86 and 2.60 L/kWh to reach targets of 0.5, 1.0, and 2.0 billion people without SMDW, respectively, and RH is input RH (fractional).The logistic target curve is a logistic function:$${rm{SY}}({rm{RH}})=frac{L}{1+{{rm{e}}}^{-k({rm{RH}}-{{rm{RH}}}_{0})}}$$where L is set to 1.80, 2.40 and 4.80 L kWh−1 to reach targets of 0.5, 1.0 and 2.0 billion people without SMDW, respectively, k is the growth rate set to 10.0, and ({rm{RH}}) and ({{rm{RH}}}_{0}) are input RH (fractional), and 0.60, respectively.The SY values reported by Zhao for TRPs (which they term ‘SMAG’) were fit to a logistic function of the same form with the following parameters: L set to 9.81 L kWh−1, k set to 11.25 and RH0 set to 0.645.The resulting fitted SY profile is expanded into an output table. As with all reports providing SY values instead of full output tables, this forces an assumption of linearity in heat rate (approximately equal to GHI), which may introduce error at lower GHI levels. Zhao reports SY of the TRP material is consistent across temperature below 40 °C—the material’s lower critical solution temperature—above which its performance drops precipitously. Accordingly, we set the SY to 0 l kWh−1 for temperatures ≥40 °C in the output table.Bagheri reported performance of three existing AWH devices across several climate conditions using an ‘energy consumption rate’ in kWh/L, which can be considered to be the SEC, and the simple reciprocal of SY. Instead of fitting a logistic curve to the reciprocals, we fit an exponential function to the average SEC of the three devices in conditions above 20 °C of the equation:$${rm{SEC}}({rm{RH}})=9.03{{rm{e}}}^{-2.99{rm{RH}}}$$where SEC is specific energy consumption in kWhPV l−1 and RH is fractional.This was applied to RH and taken as reciprocal in an output table and run through AWH-Geo. Since Bagheri reports the equivalent of kWhPV, we scale to adapt to GHI input with a photovoltaic conversion efficiency as discussed above.For performance of the ZMW device (the company’s ~3 m2 SOURCE Hydropanel), we used values from the panel production contour plot in the technical specification sheet available from the manufacturer’s website35. The decision for inclusion was made owing to the importance as an early example of a SC-AWH product with commercial intent. Values in l per panel per day were taken at each 10% RH step at 5 kWh m−2, assumed to represent kWh m−2 d−1, and divided by 15 kWh (~3 m2 × 5 kWh m−2) to convert to SY in l kWh−1. From the resulting SY curve, an output table was generated and processed with AWH-Geo.Coincidence analysis and population sumsThe coincidence analysis was run through AWH-Geo across 70 threshold pairs given the full permutation set of RH from 10 to 100% and GHI from 400 to 700 W m−2 threshold intervals, using binary image time series. The resulting mean multiplied by 24 represents average hours per day thresholds are met simultaneously, giving ophd. Below is a functional representation of this time-series calculation:$${langle ({{rm{RH}}}_{t,{rm{px}}} > {{rm{RH}}}_{{rm{threshold}}}){{rm{& & }}}_{{rm{simultaneous}}}({{rm{GHI}}}_{t,{rm{px}}} > {{rm{GHI}}}_{{rm{threshold}}})rangle }_{{rm{time; average}}}$$where ({{rm{RH}}}_{t,{rm{px}}}) is the RH in the map pixel ({rm{px}}) at time (t), ({{rm{RH}}}_{{rm{threshold}}}) is the threshold of RH above which the device is assumed to operate, ({{rm{GHI}}}_{t,{rm{px}}}) is the GHI in the map pixel ({rm{px}}) at time (t), and ({{rm{GHI}}}_{{rm{threshold}}}) is the threshold of GHI above which the device is assumed to operate.The population calculation was then conducted on these images in Google Earth Engine.Zonal statistics were performed on the mean ophd images as integers (0–24) using a grouped image reduction (at 1,000-m scale) summing the population integer counts on the population without SMDW distribution image created previously (derived from WorldPop). This reduction was performed at 1,000 m. Validation was performed in Google Earth Engine on single countries within single ophd zones and showed insignificant error ( More