To compute the water footprint of global energy trade, we rely on data from the UN Commodity Trade (UN Comtrade) database, retrieved through an application programming interface (API) using the R programming language. Trade data are then cleaned to eliminate outliers and erroneous values. The International Energy Agency (IEA) provides information on electricity generation portfolios, retrieved using an API in Python. Water intensity factors are drawn from the literature for each energy type; see Table 1. Figure 1 illustrates the process of generating the database. Each step is described in detail below. All referenced scripts, input files, and output files are accessible via the study’s accompanying Zenodo database22, https://doi.org/10.5281/zenodo.3891722. These methods expand on the descriptions in our related work21 and offer additional insights into related discussions on water footprints of electricity. This data descriptor facilitates the reproducibility of results and provides the scripts needed to add further years to the database should the study be revisited. It therefore goes beyond the complementary manuscript by providing greater insight into the assumptions and opportunities associated with the methods and resultant datasets.
There are five general steps in the creation of the databases, requiring the integration of data from the United Nations, International Energy Agency, and literature sources.
Determining electricity water footprints
There is no comprehensive global database of the water footprints of electricity. Therefore, many studies rely on data from the United States to inform global estimates23. For thermoelectric power plants, water footprints vary with cooling system and fuel type24. Water footprint values for thermoelectric power generation were obtained from Macknick et al.3, a widely used reference in the literature. Additionally, Davies et al.25 provide global estimates of cooling technology by region. The range and expected values of water consumption intensity for each of these generation technologies and fuel types were used. Finally, country-specific water footprint values of hydroelectricity were gathered from Mekonnen et al.26; no uncertainty or range for these values was available. The water intensity values for electricity generation were static and did not vary interannually.
Creating country-specific water footprints of electricity
Here, we determine the water footprint of electricity for each country based on its electricity generation portfolio. The portfolios are gathered from the IEA for each year27. In several instances, country-specific values were not available and generation portfolios were determined manually. The database contains the Python script used to interface with the API and download electricity portfolios from 2010–2017 (IEA-webscraping.ipynb). At the time of writing, the IEA values were incomplete for 2018; absent any other data, generation profiles for that year were assumed to be identical to those of 2017. We also provide the cleaned outputs of this script (IEA-electricity-mix-20XX-GWh.csv).
In this study, we consider the operational water footprint of renewable electricity technologies such as solar or wind power to be negligible and assign a water consumption rate of 0 m3/MWh to these electricity resources. Using the water intensity factors compiled in the previous step, we determine the virtual water footprint of each country, weighted by generation; see Eq. 1. Therefore, interannual variations in a country’s water footprint are dictated solely by changes in its electricity generation portfolio.
$$VWF_{i}=\frac{\sum_{g} w_{g}\times e_{i,g}}{\sum_{g} e_{i,g}}$$
(1)
where i is the country of origin, w is the water footprint of each electricity generation technology, g, and e is the electricity generation in each country by technology. This calculation is completed in R using the script getElectricityWF.R. The resultant database contains estimates of water footprint in m3/MWh for each country from 2010–2018 (ElectricityWaterIntensity.csv).
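As a minimal sketch of Eq. 1, the generation-weighted average can be written in a few lines of Python. This is not the authors' getElectricityWF.R script; the function name, the dictionary layout, and the numbers are illustrative assumptions.

```python
def electricity_water_footprint(generation, water_intensity):
    """Generation-weighted water footprint (m3/MWh) of one country, following Eq. 1.

    generation:      {technology g: electricity generated e_{i,g}} (any consistent unit)
    water_intensity: {technology g: water footprint w_g in m3/MWh}
    """
    total = sum(generation.values())
    if total == 0:
        raise ValueError("country has no reported generation")
    weighted = sum(water_intensity[g] * e for g, e in generation.items())
    return weighted / total


# Hypothetical values for illustration only (not taken from Table 1).
intensity = {"coal": 2.0, "hydro": 10.0, "wind": 0.0}
mix = {"coal": 500.0, "hydro": 300.0, "wind": 200.0}
vwf = electricity_water_footprint(mix, intensity)  # (1000 + 3000 + 0) / 1000
```

Because the intensities are static, rerunning the function with each year's generation mix reproduces the interannual variation described above.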
Water footprints of other energy sources
Fossil fuels
Only static values of water footprint, which did not vary temporally or spatially, were available for coal, lignite, and oil. Table 1 shows the assumed range of water footprints and the literature sources for these values23,26,28,29,30. All water intensities are provided in m3/kg to be consistent with the units reported in the UN Comtrade data. For coal, we assume a conversion value of 36.04 kg/MMBtu to convert between literature estimates. Similarly, we assume a conversion value of 45 MJ/kg for crude oil.
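The two conversion factors above can be applied as in the following sketch. It assumes, for illustration, that a literature value for coal is given in m3/MMBtu and one for crude oil in m3/GJ; the function names are hypothetical.

```python
COAL_KG_PER_MMBTU = 36.04   # conversion value stated in the text
CRUDE_MJ_PER_KG = 45.0      # conversion value stated in the text


def coal_m3_per_kg(m3_per_mmbtu):
    """Convert a coal water intensity from m3/MMBtu to m3/kg."""
    return m3_per_mmbtu / COAL_KG_PER_MMBTU


def crude_m3_per_kg(m3_per_gj):
    """Convert a crude oil water intensity from m3/GJ to m3/kg (1 GJ = 1000 MJ)."""
    return m3_per_gj * CRUDE_MJ_PER_KG / 1000.0
```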
The water consumption of natural gas varies widely depending on the method of extraction. However, there are only four countries that produce shale gas commercially: the United States, Canada, China, and Argentina. Therefore, for exports from these four countries, we define a weighted average of conventional and shale gas extraction water intensity, based on data from the Energy Information Administration31. In the process of extracting and processing natural gas, other hydrocarbons such as ethane, propane, butane, or pentanes are produced. Using production factors from the United States, we estimate the ratio of butane and propane production versus natural gas production32. Using these ratios, we estimate water footprints of these fuels assuming similar water footprints to natural gas. Approximately 8.1 MMBtu of propane was produced per MMBtu of natural gas from 2010 to 2018. The ratio of butane to natural gas was lower, at 2.6 MMBtu butane per MMBtu natural gas.
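The weighted average for the four shale-producing exporters can be sketched as below. Weighting by the shale share of production is an assumption made here for illustration; the function name and the example values are hypothetical.

```python
def gas_water_intensity(shale_share, w_shale, w_conventional):
    """Blend shale and conventional gas water intensities.

    shale_share: fraction of a country's gas production from shale (0 to 1),
                 assumed here to be the weighting variable.
    """
    if not 0.0 <= shale_share <= 1.0:
        raise ValueError("shale_share must lie between 0 and 1")
    return shale_share * w_shale + (1.0 - shale_share) * w_conventional


# Hypothetical intensities; a country with no shale production falls back
# to the conventional value.
blended = gas_water_intensity(0.5, 4.0, 2.0)
conventional_only = gas_water_intensity(0.0, 4.0, 2.0)
```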
Biodiesel
Biofuel trade data are only available from 2012–2018. To calculate the temporal and spatial differences in water footprints for biodiesel, we combine water footprint estimates from Gerbens-Leenes et al.33 and Mekonnen and Hoekstra34 with biofuel reports from the US Department of Agriculture (USDA) and Energy Information Administration (EIA) that detail feedstock inputs by weight for biodiesel production. These reports provide enough data to capture the major countries and feedstock sources, but assumptions are still necessary for the remaining countries and feedstocks. Animal fat was a common source for biodiesel production. We estimate that one liter of biodiesel requires 0.88 kg of animal fat and has a yield of 90%35, averaging the water footprint of pork, chicken, and beef fat based on global meat production estimates from the UN Food and Agriculture Organization and water footprints of meat36. The resultant estimate of water intensity for animal fat-derived biodiesel is 217 m3/GJ. Another common component of biodiesel production is used cooking oil, which was assigned a water footprint of zero as it is a waste product. Equation 2 describes the process of generating country-specific biodiesel water footprints.
$$WF_{C}=\frac{\mathop{\sum}\limits_{C}^{f} w_{f}\times y_{f}\times m_{f}}{\sum y_{f}\times m_{f}}$$
(2)
where C is a country, f is a specific feedstock, w is the water intensity of each feedstock, y is the yield ratio of each feedstock to biodiesel, and m is the mass of feedstock used in each country.
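Equation 2 is a weighted average in which each feedstock's water intensity is weighted by the biodiesel it yields. A minimal sketch follows; the tuple layout and the numbers are assumptions for illustration and are not values from the study.

```python
def biodiesel_water_footprint(feedstocks):
    """Country-level biodiesel water footprint following Eq. 2.

    feedstocks: iterable of (w_f, y_f, m_f) tuples, where
      w_f = water intensity of feedstock f,
      y_f = yield ratio of feedstock f to biodiesel,
      m_f = mass of feedstock f used in the country.
    """
    feedstocks = list(feedstocks)
    denominator = sum(y * m for _, y, m in feedstocks)
    if denominator == 0:
        raise ValueError("no biodiesel production reported")
    numerator = sum(w * y * m for w, y, m in feedstocks)
    return numerator / denominator


# Hypothetical country using equal masses of a crop oil (w = 100) and
# used cooking oil (w = 0, since it is treated as a waste product).
wf_c = biodiesel_water_footprint([(100.0, 0.9, 1000.0), (0.0, 0.9, 1000.0)])
```

With equal yields and masses, the zero-footprint waste oil halves the country's average footprint.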
Per the UN trade definition, the biodiesel category must contain less than 70% petroleum-based fuels. This creates wide uncertainty in the actual biodiesel content of the traded product. To account for this uncertainty, we set mean, minimum, and maximum thresholds for the fuel mix. We assume the mean to be 50% biodiesel, the minimum to be 30% biodiesel, and the maximum to be 70% biodiesel. The resulting water footprints of biodiesel for each exporting country are provided in BiodieselWF.csv.
Firewood and charcoal
Schyns et al.37 provide globally gridded blue and green water footprints of roundwood production, attributing the water consumption of forests based on an economic evaluation of the wood product relative to other forest values. Blue water footprints refer to water consumed from surface or groundwater sources, whereas green water footprints are driven by rainfall38. The globally gridded values were aggregated by country with minimum, maximum, and average values reported in m3water/m3wood. We assume a specific volume of firewood to be 2.08 × 10−3 m3/kg and 5.92 × 10−3 m3/kg of charcoal39. For countries that did not have gridded values within the dataset, we average values from neighboring countries. No interannual variation in water footprint was available. The final water footprints by country for firewood and charcoal are provided in FuelwoodWF.csv and CharcoalWF.csv, respectively.
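Because the gridded values are expressed per cubic meter of wood while trade is reported in kilograms, the specific volumes above allow a conversion to m3 of water per kg of fuel. The sketch below assumes that conversion is a simple multiplication; the function name and the example footprint are hypothetical.

```python
# Specific volumes stated in the text (m3 of wood product per kg).
SPECIFIC_VOLUME_M3_PER_KG = {"firewood": 2.08e-3, "charcoal": 5.92e-3}


def wood_fuel_wf_per_kg(wf_m3_water_per_m3_wood, fuel):
    """Convert a footprint in m3 water / m3 wood to m3 water / kg of fuel."""
    return wf_m3_water_per_m3_wood * SPECIFIC_VOLUME_M3_PER_KG[fuel]


# Hypothetical country-average footprint of 100 m3 water per m3 wood.
firewood_wf = wood_fuel_wf_per_kg(100.0, "firewood")
charcoal_wf = wood_fuel_wf_per_kg(100.0, "charcoal")
```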
Trade data download and cleaning
The UN Comtrade data provide the basis for the analysis40. These data were downloaded using the comtradr package in R, which interfaces with the Comtrade API. Both import and export data were downloaded for all countries from 2010–2018 across eleven different energy commodities. These trade statistics provide the value (USD) of the economic transfer, direction of trade (import or export), trade partners, commodity traded, and the amount of the good transferred. Electricity trade is reported in 1000 kWh (1 MWh); all other energy commodities report trade in kilograms. The script to download trade data is included in the database; see DownloadComtradeData.R. The Comtrade API limits the number of qualifiers per query; therefore, the queries are broken down by energy commodity and combined using the CompileTradeData.R script provided in the database.
Upon investigation of the data, we identified four areas of data cleaning: (i) resolve differences between import and export data, (ii) address discrepancies in electricity trade, (iii) fill data gaps, and (iv) remove outliers.
- i
To be conservative, in cases where the import and export data differed (or one was not available), the larger traded volume was kept. This assumption was made in the absence of any other estimate, acknowledging the potential for overestimation. This conservative approach for estimating the water footprint of energy is consistent with similar studies19.
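The reconciliation rule can be sketched as follows, using None to stand for a missing report. The function name is hypothetical and this is not the code in CompileTradeData.R.

```python
def reconcile_quantity(export_qty, import_qty):
    """Keep the larger of the exporter- and importer-reported quantities.

    Either argument may be None when that side of the link was not reported;
    returns None only if both are missing.
    """
    reported = [q for q in (export_qty, import_qty) if q is not None]
    return max(reported) if reported else None
```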
- ii
In some instances, electricity trade was reported between two non-neighboring countries (e.g., European countries and the United States). To resolve these cases, we created a database of geographically neighboring countries, inventoried undersea connections, and eliminated trades occurring outside these connections. While this assumption excludes potential agreements to trade electricity through intermediate countries, we assumed that this proportion of electricity trade is relatively small. Additionally, these assumptions reflect the infrastructure constraints on electricity trade. Removal of these links is completed in CompileTradeData.R using the ElectricConnection.R function and the accompanying database of country neighbors, ElectricityConnections.csv.
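A minimal sketch of this filter is shown below. The record layout, the country codes, and the example connections are assumptions for illustration; they do not reproduce ElectricityConnections.csv or the ElectricConnection.R function.

```python
def keep_connected_links(trades, connections):
    """Drop electricity trades between countries without a physical connection.

    trades:      list of dicts with 'reporter' and 'partner' keys
    connections: iterable of (country_a, country_b) pairs covering land
                 neighbors and undersea links; order within a pair is ignored
    """
    allowed = {frozenset(pair) for pair in connections}
    return [
        t for t in trades
        if frozenset((t["reporter"], t["partner"])) in allowed
    ]


# Hypothetical inputs.
connections = [("FRA", "ESP"), ("GBR", "FRA")]
trades = [
    {"reporter": "ESP", "partner": "FRA", "mwh": 10.0},
    {"reporter": "DEU", "partner": "USA", "mwh": 5.0},
]
kept = keep_connected_links(trades, connections)
```

Storing each pair as a frozenset makes the lookup symmetric, so a connection only needs to be listed once.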
- iii
For quantities that were reported as zero but had a monetary value, we determined a unit value ($/kg or $/MWh) as the median of three values: the commodity’s unit value for trade between the two countries in the preceding year, the same in the following year, and the overall unit value of the commodity originating from the country in the current year. The missing quantity was then estimated from the reported trade value and this unit value. Data gaps are filled using the filldatagaps.R script.
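The gap-filling step can be sketched as below, under the assumption that the missing quantity is the reported trade value divided by the median unit value. The function name and the numbers are hypothetical and this is not the filldatagaps.R script.

```python
from statistics import median


def fill_missing_quantity(trade_value_usd, candidate_unit_values):
    """Estimate a missing quantity from its monetary value.

    candidate_unit_values: unit values ($/kg or $/MWh) for the preceding year
    on the link, the following year on the link, and the exporter's overall
    unit value in the current year; unavailable entries are passed as None.
    """
    usable = [u for u in candidate_unit_values if u is not None and u > 0]
    if not usable:
        return None  # nothing to base an estimate on
    return trade_value_usd / median(usable)


# Hypothetical link: 1000 USD traded, unit values of 2, 4 and 5 $/kg.
estimated_kg = fill_missing_quantity(1000.0, [2.0, 4.0, 5.0])
```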
- iv
There were some instances of extreme trade values, particularly in the year 2017. These quantities were often reported two orders of magnitude greater than similar trade links in other years. To manage these errors, we took an objective approach to all reported quantities and identified any values that met all of the following three criteria:
reported quantities greater than 5 times the median value from other years on the same link,
reported quantities with a unit value greater than 5 times the median unit value from other years on the same link, and
reported quantities with a unit value greater than 5 times the median unit value for all exports from the originating country in that year
The above assumptions allowed us to remove extreme values and replace them with estimations based on the reported monetary trade value of the commodity, as above. The script for removing and correcting outliers is provided in the database; see reviseOutliers.R.
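The three criteria can be transcribed directly into a single test, as in the sketch below. It follows the criteria as worded above with a factor of 5; the function name, the argument layout, and the numbers are assumptions for illustration and do not reproduce reviseOutliers.R.

```python
from statistics import median


def is_outlier(quantity, unit_value, link_quantities, link_unit_values,
               exporter_unit_values, factor=5.0):
    """Return True only when all three criteria are met.

    link_quantities / link_unit_values: values from other years on the same link
    exporter_unit_values: unit values of all exports from the originating
                          country in the same year
    Each list must be non-empty, otherwise statistics.median raises an error.
    """
    return (
        quantity > factor * median(link_quantities)
        and unit_value > factor * median(link_unit_values)
        and unit_value > factor * median(exporter_unit_values)
    )


# Hypothetical records: the first meets all three criteria, the second fails
# the unit-value criteria and is therefore kept.
flagged = is_outlier(1000.0, 60.0, [10.0, 12.0, 8.0], [10.0, 11.0, 9.0], [10.0, 10.0, 12.0])
not_flagged = is_outlier(1000.0, 20.0, [10.0, 12.0, 8.0], [10.0, 11.0, 9.0], [10.0, 10.0, 12.0])
```

Requiring all three conditions keeps genuinely large but consistently priced trades in the dataset.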
Creating the virtual water trade network
Using the cleaned and formatted Comtrade data from Step 4, the virtual water trade network was created by multiplying the water footprint of each energy source, f, for its country of origin, i, and year, y, by the reported trade volume from country i to country j:
$$VWT_{i,j}^{f,y}=e_{i,j}^{f,y}\times w_{i}^{f,y}$$
(3)
For each link in the network, mean, minimum, and maximum values of virtual water trade are calculated based on the ranges of water footprint calculated in Steps 1–3. This step is completed using the DetermineWF.R script and results in the main output of the database, EnergyWF_Trade.csv. The final dataset maintains information on export/import, countries in the trade link, quantity traded, trade value, and associated water footprints. ‘Reporter’ columns refer to the country of origin and ‘partner’ columns are the destination country. The final database includes virtual water exports from 215 countries. Of these countries, only 25 (12%) did not feature exports for all nine years of the analysis. Figure 2 illustrates the global extent of the database with many countries exporting all eleven commodities for at least one year during the study period.
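Equation 3 applied to one trade link, with the minimum, mean, and maximum footprints carried through, can be sketched as follows. The dictionary keys and the numbers are assumptions for illustration; this is not the DetermineWF.R script.

```python
def virtual_water_trade(quantity, wf_range):
    """Virtual water traded on one link (Eq. 3) for each footprint estimate.

    quantity: traded amount e from country i to j (kg, or MWh for electricity)
    wf_range: {'min': w, 'mean': w, 'max': w}, the exporter's water footprint
              for that commodity and year, in m3 per unit of quantity
    """
    return {label: quantity * w for label, w in wf_range.items()}


# Hypothetical link: 2000 MWh exported from a country with these footprints.
vwt = virtual_water_trade(2000.0, {"min": 0.5, "mean": 1.0, "max": 2.0})
```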
Fig. 2: Many countries had export data for all 11 commodities for at least one year from 2010–2018.
Source: Resources - nature.com