# GlobSnow v3.0 Northern Hemisphere snow water equivalent dataset

### Overview of the SWE Retrieval method

The SWE processing system relies on Bayesian assimilation which combines ground-based data with satellite-borne observations2. The method applies two vertically polarized satellite-based brightness temperature observations at 19 and 37 GHz and a scene brightness temperature model (the HUT snow emission model4). First, snow microstructure described by an ‘effective snow grain size’ is estimated for grid cells with a coincident weather station SD observation. Effective snow grain size is used in the HUT model as a scalable model input parameter to optimize agreement with the satellite measurements. These values of grain size are used to interpolate a background map of the effective grain size, including an estimate of the effective grain size error. This spatially continuous map of grain size is then used as an input for a second HUT model inversion to provide an estimate of SWE. In the inversion process, the effective grain size in each grid cell is weighed with its respective error estimate and a constant value of snow density is applied. The spatially continuous SWE map obtained from the second run of the HUT snow model described above is fused with a background SD field (converted to SWE using 0.24 g cm−3) to obtain a final estimate of SWE using a Bayesian non-linear iterative assimilation approach (which weights the information sources with their estimated variances). The background SD field is generated from the same weather station SD observations used to estimate the effective snow grain size using kriging interpolation methods.

The microwave scattering response to SWE saturates under deep snow conditions (>150 mm) and model inversion of SD/SWE over areas of wet snow is not feasible because the microwave signal is absorbed rather than scattered. For these reasons, the method decreases the weight of satellite data for deep dry snowpacks and wet snow by assessing the modeled sensitivity of brightness temperature to SWE within the data assimilation procedure2,3.

Before SWE retrieval, dry snow is identified from brightness temperature data7. For the autumn snow accumulation season (August to December), the dry snow detection is used to construct a cumulative snow presence mask to track the advance of snow extent (SWE estimates are restricted to the domain indicated by the cumulative snow presence mask). During spring the overall mapped snow extent is determined from the cumulative mask, which (as the melt season proceeds) is reduced using a satellite passive microwave derived estimate for the end of snow melt season for each grid cell8.

The snow part of the applied scene brightness temperature model is based on the semi-empirical HUT snow emission model which describes the brightness temperature from a multi-layer snowpack covering frozen ground in the frequency range of 11 to 94 GHz4,5. Input parameters to the model include snowpack depth, density, effective grain size, snow volumetric moisture and temperature. Separate modules account for ground emission and the effect of vegetation and atmosphere. Comparisons of HUT model simulations to airborne and tower-based observations, reported elsewhere (e.g.9,10), demonstrate the ability of the model to simulate different snow conditions and land cover regimes. Intercomparisons with other emission models show comparable performance when driven by in situ data11,12 or physical model outputs13, although the HUT model has the tendency to underestimate brightness temperatures for deep snowpacks12.

#### Basic underlying assumptions

Passive microwave sensitivity to SWE is based on the attenuating effect of snow cover on the naturally emitted brightness temperature from the ground surface. The ground brightness temperature is scattered and absorbed by the overlying snow medium, typically resulting in a decreasing brightness temperature with increasing (dry) snow mass. The scattering intensity increases as the wavelength approaches the size of the scattering particles. Considering that individual snow particles tend to range from 0.5 to 4 mm in the long axis direction, high microwave frequencies (short wavelengths) will be scattered more than low frequencies (long wavelengths). The intensity of absorption can be related to the dielectric properties of snow, with snow density largely defining the permittivity for dry snow. Absorption at microwave frequencies increases dramatically with the inclusion of free water (moisture) in snow, resulting in distinct differences of microwave signatures from dry and wet snowpacks.

Initial investigations pointed out the sensitivity of microwave emission from snowpacks to the total snow water equivalent14. This led to the development of various retrieval approaches of SWE from the earliest passive microwave instruments in space (e.g.15,16). From the available set of observed frequencies, most SWE algorithms employ the ~37 GHz and ~19 GHz channels in combination. These two frequencies are available continuously since 1979. The scattering from snow at 19 GHz is smaller when compared to 37 GHz, while the emissivity of frozen soil and snow is estimated to be largely similar at both frequencies. The brightness temperature difference of the two channels can be related to snow depth (or SWE), with the additional benefit that the effect of variations in physical temperature on the measured brightness temperature are reduced (relative to the analysis of single frequencies). Similarly, observing a channel difference reduces or even cancels out systematic errors of the observation, provided that the errors in the two observations are similar (e.g. due to using common calibration targets on a space-borne sensor). Typically, the vertically polarized channel at 19 and 37 GHz is preferred due to the inherent decreased sensitivity to snow layering (e.g.17).

A basic assumption in the data assimilation procedure that combines spaceborne passive microwave observations and synoptic weather station data to estimate snow depth is that the background snow depth field, interpolated from weather station data, provides meaningful information on the spatial patterns of snow depth. A limitation of the methodology is that this assumption does not hold for complex terrain (mountains). Further, the methodology is not suitable for snow cover on top of ice sheets, sea ice or glaciers.

#### Input data

The main input data are synoptic snow depth (SD) observations and spaceborne passive microwave brightness temperatures from the Scanning Multichannel Microwave Radiometer (SMMR), Special Sensor Microwave/Imager (SSM/I) and Special Sensor Microwave Imager/Sounder (SSMIS) data from Nimbus-7 and DMSP F-series satellites. The most important frequencies for SWE retrieval and snow detection are 19 GHz (reference measurement with very little scattering from the snow volume) and 37 GHz (sensitive to volume scattering by dry snow), which are available in all instruments. The satellite datasets are described in detail in Data Records section.

Ground-based SD data were acquired from the Finnish Meteorological Institute (FMI) weather station observation database, augmented from several archive sources, including the European Centre for Medium-Range Weather Forecasts (ECMWF), The United States National Climatic Data Centre (NCDC), The All-Russia Research Institute of Hydrometeorological Information-World Data Centre (RIHMI-WDC) and The Meteorological Service of Canada (MSC) archives, as described in the Data Records section.

In the assimilation of SD values with space-borne estimates, a density value of 0.24 g cm−3 is assumed in estimating SWE. In the assimilation procedure the spatial small-scale variability of SD is considered by assigning a variance of 150 cm2 to the weather station observations over forested areas, and a variance of 400 cm2 for open areas. These variance estimates describe how well a single-point SD observation describes the snow depth over a larger area surrounding the measurement site, and were determined from available FMI, Finnish Environment Institute (SYKE) and Environment and Climate Change Canada (ECCC) snow transect measurements, as well as experimental field campaign data from across Finland and Canada.

Daily SD background fields were generated from observations at synoptic weather station locations acquired from multiple archives for the years 1979–2018. For each measurement, the exact location, date of measurement, and SD are required. The long-term weather station data is pre-processed before utilization in the SWE retrieval to remove outliers and improve the overall consistency of the data, as described in the Methods section.

Land use and, most importantly, forest cover fraction are derived from ESA GlobCover 2009 300 m data18. Stem volume is required as an input parameter to the emission model to compensate for forest cover effects4,19; average stem volumes are estimated by the ESA BIOMASAR20 data records as described in the Methods section.

The following auxiliary datasets are used to mask out water and complex terrain (mountain) pixels:

• ESA CCI Land Cover from 2000: water fraction is aggregated to the 25 km grid cell spacing of the SWE product, pixels with a water fraction >50% are masked as water.

• ETOPO521: if the standard deviation of the elevation within a 25 km grid cell is above 200 m it is masked as complex terrain.

### The Forward model applied in SWE retrieval

#### Calculation of brightness temperature for a satellite scene

For a satellite scene consisting of a mixture of non-forested terrain, forests, and snow-covered lake ice, the bottom-of-atmosphere brightness temperature TB,BOA is calculated so that:

$${T}_{B,BOA}=left(1-FF-LFright){T}_{B,snow}+FFcdot {T}_{B,forest}+LFcdot {T}_{B,lake}$$

(1)

where FF is the forest fraction and LF the lake fraction of a given grid cell. ({T}_{B,snow}), ({T}_{B,forest}), and ({T}_{B,lake}) are the brightness temperatures emitted from non-forested terrain (ground/snow), forested terrain, and lake ice, respectively. Land cover fractions FF and LF are determined from ESA GlobCover data resampled to the 25 km EASE grid. A statistical approach is used to calculate top-of-atmosphere brightness temperatures from TB,BOA, statistics are based on studies covering the Northern Hemisphere4,22,23.

#### Brightness temperature from snow-covered ground

The brightness temperature ({T}_{B,snow}) for snow-covered, non-forested terrain is calculated using the HUT snow emission model4. The model is a radiative transfer-based, semi-empirical model which calculates the emission from a single homogenous snowpack. The current approach utilizes multi-layer modification which allows the simulation of brightness temperature from a stacked system of snow or ice layers5.

The absorption coefficient in the HUT model is determined from the complex dielectric constant of dry snow, applying the Polder-van Santen mixing model for the imaginary part24. The calculation of the dielectric constant for dry snow as well as effects of possible liquid water and salinity inclusions, are described through empirical formulae25. Emission from the snow layer is considered as both up- and down-welling emission. These are, in turn, reflected from interfaces between layers (air-snow, snow-ground). The transmission and multiple reflections between layer interfaces are calculated using the incoherent power transfer approach.

Applying the delta-Eddington approximation to the radiative transfer equation, the HUT model assumes that most of the scattered radiation in a snowpack is concentrated in the forward direction (of propagation) due to multiple scattering within the snow media, based on26, which assumes that losses due to scattering are approximately equal to generation of incoherent intensity by scattering. However the omission of the backward scattering component as well as omission of trapped radiation will lead to underestimation of brightness temperature for deep snowpacks12. In the HUT model, the rough bare soil reflectivity model27 is applied to simulate the upwelling brightness temperature of the soil medium.

#### Brightness temperature from forest vegetation

The brightness temperature over forested portions of the grid cell ({T}_{B,forest}) is derived from ({T}_{B,snow}) using a simple approximation so that:

$${T}_{B,forest}={t}_{veg}cdot {T}_{B,snow}+left(1-{t}_{veg}right)cdot {T}_{veg}+left(1-{t}_{veg}right)cdot left(1-{e}_{snow}right)cdot {t}_{veg}cdot {T}_{veg}$$

(2)

where ({t}_{veg}) is the one-way transmissivity of the forest vegetation layer, ({T}_{veg}) the physical temperature of the vegetation (considered to be equal to air, snow and ground temperatures, ({T}_{veg}={T}_{air}={T}_{snow}={T}_{gnd}=-,{5}^{^circ }{rm{C}})) and ({e}_{snow}) the emissivity of the snow covered ground system. The choice of −5 °C is based on experimental data28 and follows the previous publications2,3,4. Moreover the impact of physical temperature is minimal on the simulated brightness temperature difference of two frequencies applied in the retrieval (typically <1 K, and <3 K for extreme cases).

For the GSv3 SWE retrieval, the one-way forest transmissivity ({t}_{veg}) is calculated by19;

$${t}_{veg}={e}^{-{kappa }_{e}SV}$$

(3)

where ({{rm{kappa }}}_{e}) is the forest vegetation extinction coefficient, and SV the forest stem volume (biomass). The coefficients were determined using airborne data19 for the key frequencies of 10.65, 18.7 and 36.5 GHz, and were validated for the range 0–100 m3 ha−1. The model29 applied in previous GlobSnow versions (1.0 and 2.0) saturated even for modest stem volumes (50 m3 ha−1 at 18.7 GHz and 100 m3 ha−1 at 36.5 GHz), which contradicts observational data. The earlier GlobSnow products also applied a constant stem volume of 80 m3 ha−1 for the entire Northern Hemisphere, due to lack of a reliable stem volume dataset at the time of development of those products.

Equation (3) and the extinction coefficients of ({{rm{kappa }}}_{e}) = 0.007 at 18.7 GHz and ({{rm{kappa }}}_{e}) = 0.011 at 36.5 GHz (for V-polarization) are now applied and replace the transmissivity model applied in the earlier SWE retrieval framework29. The ESA BIOMASAR20 stem volume estimates are applied to calculate spatially variable transmissivity, replacing the constant stem volume applied in earlier versions.

#### Brightness temperature from lake scenes

Earlier versions of the algorithm used in GlobSnow considered the brightness temperature over sub-grid lake fractions to be equal to that of snow covered ground. In GSv3, ({T}_{B,lake}) is calculated separately using the multiple-layer version of the HUT snow emission model5, which considers frozen lakes as a stacked system of water, ice and snow. While introducing the cumulative effect of multiple reflections in a system of stacked layers, the original formulation of radiation scattering and absorption in individual snow layers was not altered. The mathematical solution2 was formulated for practical applications and is considered for scenes including lake ice in the GlobSnow SWE method following certain underlying generalizations30.

Based on a comparison of measurements of snow depth over lakes and over land in Finland30, the depth of the snow layer on ice in the forward simulation of ({T}_{B,lake}) is considered to be always half of the equivalent snowpack on land. Snow density and grain size are considered to be identical. The depth (thickness) of the ice layer is considered to be a constant 50 cm, an approximation based on the mean maximum ice thickness over Finnish lakes in March30,31. The physical temperature of ice is considered to be equal to snow temperature (({T}_{ice}={T}_{snow}=-,{5}^{^circ }{rm{C}})), while the temperature of the underlying water layer is considered to be Twater = 0 °C.

### The SWE retrieval procedure

The general processing chain for the GSv3 SWE product is given in Fig. 1 and explained in detail below. Northern Hemisphere GSv3 SWE map for 15 February 2010 is shown in Fig. 2, which shows that SWE estimates are not provided over sea ice, large lakes, glaciers, Greenland, nor complex terrain (mountains), as the retrieval is focused on terrestrial seasonal snow for which the PMW retrievals are feasible.

Step 1: The mountain mask and high SD value filter are applied to synoptic SD data. The high value mask removes the deepest 1.5% of reported snow depth values in order to avoid spurious or erroneous deep snow observations. The mountain mask criterion removes all observations that fall within EASE-grid cells with a height standard deviation above 200 m within the grid cell.

The mountain mask, water mask and dry snow masks (Step 5) are applied to satellite brightness temperature data. The water mask removes grid cells with over 50% water. The SWE retrieval from satellite data is performed for dry snow pixels. For wet snow, the final SWE estimates are based solely on the weather station kriged background SD field (assuming a constant snow density of 0.24 g cm−3 when converting SD to SWE).

Step 2: Numerical inversion of the multi-layer HUT snow emission model2 is performed for grid cells containing synoptic snow depth observations to retrieve values of effective grain size d0. The model is fit to spaceborne observed TB values at the locations of weather stations by optimizing the value of snow grain size d0. The fitting procedure is:

$$min {d}_{0}{left{left({{rm{T}}}_{{rm{B}},19{rm{V}},{rm{mod}}}left({d}_{0},S{D}_{gr}right)-{{rm{T}}}_{{rm{B}},37{rm{V}},{rm{mod}}}left({d}_{0},S{D}_{gr}right)right)-left({{rm{T}}}_{{rm{B}},19{rm{V}},obs}-{{rm{T}}}_{{rm{B}},37{rm{V}},{rm{obs}}}right)right}}^{2}$$

(4)

where the known snow depth (on ground) is (S{D}_{gr}), ({T}_{B,19V}) and ({T}_{B,37V}) denote the vertically polarized brightness temperature at approximately 19 and 37 GHz with indices mod and obs referring to modelled and observed values, respectively. Vertical polarization is used because it correlates best with SWE in the boreal forest zone4,32. Snow density is treated with a constant value of 0.24 g cm−3. At each synoptic SD station location, the final d0,ref estimate (and its standard deviation λ) is obtained by averaging values obtained for the ensemble of the nearest stations, so that

$$leftlangle {widehat{d}}_{0,ref}rightrangle =frac{1}{M}mathop{sum }limits_{j=1}^{M}{widehat{d}}_{0,ref,j}$$

(5)

$${lambda }_{d0,ref}=sqrt{frac{1}{M-1}mathop{sum }limits_{j=1}^{M}{left({widehat{d}}_{0,ref,j}-leftlangle {widehat{d}}_{0,ref}rightrangle right)}^{2}}$$

(6)

where M is the number of stations (default 6).

Step 3: The effective grain size and its variance are interpolated over the full domain of brightness temperature observations (35° N to 85° N latitude and 180° W to 180° E longitude at a resolution of 25 km) using kriging interpolation, to obtain a spatially continuous field of effective grain size and its variance.

Step 4: The synoptic SD observations are interpolated over the same brightness temperature domain also using kriging interpolation. The variance λD2 assigned to individual SD observations is set at 150 cm2 for open areas and 400 cm2 for forested areas. As an output, a spatially continuous estimate of SD and its variance are obtained.

Step 5: A dry snow mask is determined from the brightness temperature data to identify areas where dry snow is not present, and areas with wet snow cover. For dry snow, the following conditions7 need to be met:

$$begin{array}{c}S{D}_{i}=15.9cdot ({T}_{B,obs}^{19H}-{T}_{B,obs}^{37H}) > 80(mm) {T}_{B,obs}^{37H} < 240K {T}_{B,obs}^{37V} < 250Kend{array}$$

(7)

where SDi is indicative snow depth for the given pixel, and needs to be above 80 (mm) and observed brightness temperatures of 37H and 37 V need to be below thresholds of 240 K and 250 K respectively for dry snow to be indicated. Only grid cells identified as dry snow are subject to the SWE retrieval process. Areas identified as wet snow for the given day are assigned a SWE value based on the kriging-interpolated SD map (from Step 4), converted to SWE. (wet snow masking is carried out on a daily basis, not cumulatively).

Step 6: Observed brightness temperatures, the effective grain size (Step 2 and 3) and effective grain size variance, over the whole area of interest are ingested into a numerical inversion of the HUT snow emission model to retrieve bulk SWE. Similar to the retrieval of effective grain size, an iterative cost function is applied. HUT snow emission model estimates are matched to observations numerically by fluctuating the SWE value. The background field of SDt is applied to constrain the retrieval. A constant value of snow density (0.24 g cm−3) is applied to calculate SWEt from SDt where t refers to day in question. The cost function constrains the grain size value according to the predicted background grain size and the estimated variance produced in Step 3. Thus, assimilation adaptively weighs the space-borne brightness temperature observations and the background SDt field (produced in Step 4) to estimate a final SWE and a measure of statistical uncertainty (in the form of a variance estimate) on a pixel basis:

$${min }_{S{D}_{t}}left{{left(frac{left({T}_{B,{rm{mod}}}^{19V}left(S{D}_{t}right)-{T}_{B,{rm{mod}}}^{37V}left(S{D}_{t}right)right)-left({T}_{B,obs}^{19V}-{T}_{B,obs}^{37V}right)}{{sigma }_{t}}right)}^{2}+{left(frac{S{D}_{t}-S{widehat{D}}_{ref,t}}{{lambda }_{SD,ref,t}}right)}^{2}right}$$

(8)

where (S{widehat{D}}_{ref,t}) is the snow depth estimate from the kriging interpolation for the day under consideration t. ({lambda }_{SD,ref,t}) is the estimate of standard deviation from the kriging interpolation, and SDt is the snow depth for which Eq. (8) is minimized.

The variance of TB, σt2, is estimated by approximating ΔTB ((Delta {T}_{B}={T}_{B}^{19}-{T}_{B}^{37})) as function of snow depth and grain size in a Taylor series:

$$Delta {T}_{B}left({D}_{t},{d}_{0}right)approx Delta {T}_{B}left({D}_{t},{widehat{d}}_{0,ref,t}right)+frac{partial Delta {T}_{B}left({D}_{t},{widehat{d}}_{0,ref,t}right)}{partial {d}_{0}}left({d}_{0}-{widehat{d}}_{0,ref,t}right)$$

(9)

$${sigma }_{t}^{2}={rm{var}}left(Delta {T}_{B}left({D}_{t},{widehat{d}}_{0,ref,t}right)right)={left(frac{partial Delta {T}_{B}left({D}_{t},{widehat{d}}_{0,ref,t}right)}{partial {d}_{0}}right)}^{2}{lambda }_{d0,ref,t}^{2}$$

(10)

In practice, the variance σt2 adjusts the weight of brightness temperature data with respect to the weight of the background SD field (parameter λD,ref). A basic feature of the algorithm is that if the sensitivity of space-borne radiometer observations to SWE is assessed to be close to zero by formulas (9) and (10), the weight of radiometer data on the final SWE product approaches zero (this is the case if the magnitude of SWE is very high). The higher the estimated sensitivity of TB to SWE, the higher is the weight given to the radiometer data. Thus, the weight of the radiometer data varies both temporally and spatially in order to provide a maximum likelihood estimate of SWE. The assimilation procedure is performed daily for dry snow pixels that are not excluded by water and complex terrain (mountain) masks.

Step 7: Snow-free areas are detected and cleared (SWE = 0 mm, inserted) using a combination of radiometer-derived information and snow extent information from optical remote sensing. A time-series detection approach8 is used to estimate the end of snow-melt season and any remaining SWE estimates are cleared from those pixels. After this, SWE estimates are also cleared from regions where optical data indicate snow-free conditions. The Japan Aerospace Exploration Agency (JAXA) JASMES 5 km Snow Extent data product 1978–201833 is used to construct a cumulative snow mask in 25 km EASE-Grid projection for the SWE product time span. Cumulative masking retains the latest cloud-free observation for each EASE-Grid pixel, and uses the daily product to update snow-free/snow-covered conditions, based on a 25% snow cover fraction threshold.

### Uncertainty estimation for SWE retrieval

The GSv3 SWE product contains pixel-wise information of the retrieved SWE along with an uncertainty estimate which represents the total product error (determined for each individual pixel and each day) composed of a statistical random error component and a systematic error component. The statistical component is derived using an error propagation analysis34 and the systematic error component was determined using the snow transect reference dataset.

The statistical error component is the theoretical error estimated through an adaptive formulation35 that takes into account both the spatially and temporally varying characteristics of the weather station snow depth observations and the spaceborne microwave radiometer measurements2,3.

The systematic error is defined as the sum of all other error factors not considered in the statistical error analysis. These include unknown error that vary spatially and fluctuate temporally such as the inaccuracy of forward modeling, which can result in biases to the outcome of inversion algorithms (geophysical retrievals). Such derived biases of snow retrieval algorithms can be different for different snow regimes and they may show variability between seasons and from one winter to another. Based on the above, the systematic error is calculated as the residual from the total (absolute) error and the statistical error, where the total error itself is determined through analysis with the independent snow transect validation data35.

### Generation of monthly SWE products and bias-correction

The monthly aggregated snow water equivalent product (Level B) is calculated from all available daily SWE estimates for the given calendar month. The number of available daily data (daily products are considered Level A) is less than the total number of days in some months due to missing satellite data. During the SMMR era (1979 to 1987) satellite PMW data are only available every other day. The variation of the available daily data is not accounted for in the monthly product, only a simple mathematical average is provided. For the months that have zero observed daily SWE products, there are no monthly SWE estimates (typically for the summer months).

Monthly SWE products can be bias-corrected by applying available independent snow course observations that cover the Northern Hemisphere1. The principle of the bias correction method is to investigate the difference between the mean SWE observed across a snow course and its coincident daily GSv3 SWE estimate. By collecting all data pairs at each snow course location through the period of satellite observations, we can determine how the bias between the GSv3 estimates and reference snow course values behaves with time. The performed analysis1 indicated that, in general, the monthly bias observed for a single snow course does not change from year to year, but it varies strongly spatially. This was determined for a time period ranging from 1980 to 2016 across Eurasia and from 1980 to 2003 across North America.

Average bias and its variance is derived for each snow course for each month separately. These biases are spatially interpolated, taking into account the estimated temporal variance, by ordinary kriging interpolation. This results in a monthly hemispheric map of GlobSnow retrieval bias and its spatial variance that are used to bias correct the monthly SWE product. The GSv3 bias correction is time invariant (same correction applied for a given month through the time series), so the technique serves to improve the climatology.

The bias-corrected monthly dataset is generated by applying the bias-correction procedure for each monthly product through the 37 year time series. This is done by subtracting the estimated retrieval-bias on a pixel-level and storing these “bias-corrected SWE estimates” as the bias-corrected monthly products. A fully detailed description is presented in1.

Source: Resources - nature.com