
User-focused evaluation of National Ecological Observatory Network streamflow estimates

As part of the streamflow data release, NEON released four relevant data products: Gauge Height26, Elevation of Surface Water29, Stage-discharge Rating Curves30, and Continuous Discharge15. Data users are able to download this full suite of information and protocols to inform decisions on data usage and applicability. We evaluated the quality of the Continuous Discharge product using all four relevant NEON data products, considering the validity of model inputs as well as the goodness-of-fit of final streamflow estimates. We analyzed 1) the fit of the regression between manual stage height readings and continuous pressure transducer data used to estimate continuous stream surface elevation, 2) the fit of rating curves transforming stream surface elevation to streamflow, and 3) the proportion of streamflow estimates over the maximum manually-measured streamflow.

Stage classification

The rating curve models predicting streamflow required continuous stream stage estimates as model inputs. NEON predicted continuous gauge height with a two-step approach. First, continuous in-stream transducer readings were converted to water height by applying an offset between the transducer elevation and the staff gauge (Eq. 1). This offset is derived from the NEON geolocation database as the difference between the location of the pressure transducer and the staff gauge27. The offset changes only when either the staff gauge or the transducer is moved.

$${h}_{wc}=\frac{{P}_{sw}}{p\,\ast \,g}\,\ast \,1000+{h}_{stage}$$

(1)

Conversion of pressure data to water height used by NEON27, where \(h_{wc}\) is the estimated water column height (m), \(P_{sw}\) is calibrated surface water pressure (kPa), \(p\) is the density of water (999 kg/m³), \(g\) is the acceleration due to gravity (9.81 m/s²), and \(h_{stage}\) is the offset between the pressure transducer and the staff gauge (m).
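
As an illustration of Eq. 1, the conversion can be sketched in a few lines of Python. The function and constant names are hypothetical and are not part of any NEON software; the density and gravity values are those stated above.

```python
WATER_DENSITY = 999.0  # kg/m^3, value stated in the text
GRAVITY = 9.81         # m/s^2, value stated in the text


def water_column_height(pressure_kpa: float, stage_offset_m: float) -> float:
    """Eq. 1: convert calibrated surface water pressure (kPa) to water height (m).

    The factor of 1000 converts kPa to Pa, so that Pa / (kg/m^3 * m/s^2) gives metres.
    The offset between the pressure transducer and the staff gauge is then added.
    """
    return pressure_kpa / (WATER_DENSITY * GRAVITY) * 1000.0 + stage_offset_m


# Example: about 9.8 kPa of water pressure corresponds to roughly 1 m of water,
# to which a hypothetical 0.25 m offset is added.
height_m = water_column_height(9.80019, 0.25)
```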

Then, NEON uses a linear regression between manually measured reference stage height and the gauge height calculated from Eq. 1, yielding final predictions of continuous stream gauge height27. In an ideal setting, stage and gauge height should correlate perfectly28. In the field, sensor uncertainty, manual reference measurement error, and shifting conditions in the stream can complicate the relationship. We tested the goodness of fit between continuously estimated stream gauge height values and manual stage measurements using the Nash-Sutcliffe model efficiency coefficient (Eq. 2). The Nash-Sutcliffe coefficient is a metric commonly used in hydrology to evaluate how well modeled values (here, calculated gauge height) match observed values (here, manually measured stage). For the purposes of this discussion, manual reference measurements will be referred to as ‘stage’ and automated, sensed readings as ‘gauge height’.

$$NSE=1-\frac{\Sigma {\left({Q}_{o}-{Q}_{m}\right)}^{2}}{\Sigma {\left({Q}_{o}-{\bar{Q}}_{o}\right)}^{2}}$$

(2)

Equation 2 presents the Nash-Sutcliffe model efficiency coefficient, where \({Q}_{o}\) is an observed value (streamflow or stage height), \({Q}_{m}\) is a modeled value, and \({\bar{Q}}_{o}\) is the mean of observed values.
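
A minimal Python sketch of Eq. 2 follows. The function name and the input validation are illustrative choices, not taken from NEON code or from the analysis scripts behind this study.

```python
from typing import Sequence


def nash_sutcliffe(observed: Sequence[float], modeled: Sequence[float]) -> float:
    """Eq. 2: NSE = 1 - sum((Qo - Qm)^2) / sum((Qo - mean(Qo))^2)."""
    if len(observed) != len(modeled) or len(observed) == 0:
        raise ValueError("observed and modeled must be non-empty and of equal length")
    mean_observed = sum(observed) / len(observed)
    # Squared error of the model against the observations
    numerator = sum((o - m) ** 2 for o, m in zip(observed, modeled))
    # Squared deviation of the observations from their own mean
    denominator = sum((o - mean_observed) ** 2 for o in observed)
    if denominator == 0:
        raise ValueError("NSE is undefined when all observed values are identical")
    return 1.0 - numerator / denominator
```

An NSE of 1 indicates a perfect match, while an NSE of 0 means the model predicts no better than the mean of the observations.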

Stage, gauge height, and regression data were sourced from the NEON Continuous Discharge product, representing what was directly applied to streamflow estimation. Up to 26 stage measurements were available per year. We examined every regression between stage and gauge height (one per site-year in which data were available) and classified each as ‘good’, ‘fair’, or ‘poor’ quality based on its goodness of fit. Regressions with an NSE (Eq. 2) of 0.90 or greater were considered good, those with an NSE of less than 0.90 but greater than or equal to 0.75 were considered fair, and those with an NSE of less than 0.75 were considered poor (Fig. 2).
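
The thresholds above translate directly into a small classifier. This is a sketch with hypothetical names; it takes an already computed NSE value as input.

```python
GOOD_NSE = 0.90  # threshold stated in the text
FAIR_NSE = 0.75  # threshold stated in the text


def classify_stage_regression(nse: float) -> str:
    """Classify a stage vs. gauge height regression by its NSE."""
    if nse >= GOOD_NSE:
        return "good"
    if nse >= FAIR_NSE:
        return "fair"
    return "poor"
```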

Drift detection

Because electronic instruments such as pressure transducers can exhibit systematic directional error during deployment, referred to here as ‘drift’, we developed an approach to detect periods when NEON’s Elevation of Surface Water product drifted. We used two filters to assess and flag the potential for instrument drift at monthly time steps. First, we flagged any manually measured stage that fell outside NEON’s uncertainty bound for the gauge height recorded at the same time, and calculated the proportion of stage measurements outside the gauge height uncertainty bounds per month. On its own, this proved to be a relatively lenient filter that missed periods of manually identified drift. Second, we calculated the average difference between stage and gauge height for each month (Fig. 3) and flagged any month in which that difference exceeded 6 cm; adding this second filter was effective in catching the majority of periods where drift was identified. To determine an appropriate cut-off value for classifying periods of potential drift, we manually audited the data and flagged periods of observable directional drift. Our goal was to set a maximum cut-off difference that retained as much usable data as possible while still capturing 70% of the manually flagged directional drift periods. Applying this method, we determined a cut-off value of 6 cm average monthly deviation between observed and predicted stage values.

Using these two filters in combination, we again classified data into three groups: ‘likely no drift’, ‘potential drift’, and ‘not assessed’. Site-months with no more than 50% of stage measurements outside of the gauge height time series uncertainty and an average difference between stage and gauge height of less than 6 cm were considered to have ‘likely no drift’. Site-months with either more than 50% of stage readings outside of the gauge height time series uncertainty or an average difference between stage and gauge height of more than 6 cm were deemed to have ‘potential drift’. Site-months with no stage measurements could not be evaluated and were considered ‘not assessed’. This approach to identifying drift is imperfect, in that slight drift could be missed and periods without manual measurements cannot be assessed. Nevertheless, we believe it is a helpful method given the data available from NEON and the fact that drift has been observed when visually inspecting data (Fig. 3).
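
The combined filter for one site-month might be sketched as follows. The data layout and function name are hypothetical. Two details are assumptions rather than statements from the text: the average difference is taken as the mean absolute difference, and a site-month sitting exactly on the 6 cm boundary is treated as ‘potential drift’.

```python
from typing import List, Tuple

# One paired reading: (stage_m, gauge_height_m, gauge_lower_m, gauge_upper_m),
# where lower/upper are the uncertainty bounds on gauge height at that time.
Reading = Tuple[float, float, float, float]

MAX_FRACTION_OUTSIDE = 0.50  # from the text: no more than 50% outside the bounds
MAX_MEAN_DIFF_M = 0.06       # from the text: 6 cm cut-off


def classify_site_month(readings: List[Reading]) -> str:
    """Classify one site-month as likely no drift, potential drift, or not assessed."""
    if not readings:
        return "not assessed"  # no manual stage measurements to compare against
    n = len(readings)
    # Filter 1: share of manual stage readings outside the gauge height uncertainty
    outside = sum(1 for stage, _, low, high in readings if stage < low or stage > high)
    fraction_outside = outside / n
    # Filter 2: average stage vs. gauge height difference (assumed: mean absolute)
    mean_diff = sum(abs(stage - gauge) for stage, gauge, _, _ in readings) / n
    if fraction_outside <= MAX_FRACTION_OUTSIDE and mean_diff < MAX_MEAN_DIFF_M:
        return "likely no drift"
    # Assumption: anything else, including exactly 6 cm, counts as potential drift
    return "potential drift"
```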

Rating curve classification

To evaluate how well rating curves predicted streamflow, we assessed each rating curve used to convert stage to discharge. NEON prepares a new rating curve for each site’s water year (beginning on October 1st)27. In cases where NEON reported multiple rating curves for a site’s water year, each curve was assessed separately over the portion of the time series for which it was used. We classified rating curves into three tiers based on two metrics: the Nash-Sutcliffe coefficient (Eq. 2) between observed and predicted streamflow, and the percentage of continuous discharge values above the maximum manually measured gauging used to construct the rating curve.

First, we calculated the Nash-Sutcliffe coefficient for each rating curve to estimate how well rating curves captured the variation in the stage-streamflow relationship. We used the reported values for modeled and manually measured streamflow from the ‘Y1simulated’ and ‘Y1observed’ columns in the ‘sdrc_resultsResiduals’ table of the Stage-discharge rating curves product. NEON generally conducts between 12 and 24 manual gaugings per year to build and maintain the stage-discharge relationship.

Second, we calculated the percentage of continuous streamflow values outside the range of manually measured estimates of streamflow. This was useful for assessing whether the stage-discharge relationship is representative of observed flow conditions. The relationship between discharge and stage is often nonlinear, with inflection points around changes in channel morphology, making gauging the stream at high and low flow conditions critical to building a reliable rating curve16. A rating curve built from a large number of direct field measurements all taken during a narrow range of baseflows, for example, could have a high Nash-Sutcliffe coefficient yet be unreliable when extrapolated to high or low flow events. Using these two metrics, we were able to classify rating curves into categories of relative quality. To calculate the percentage of values in the continuous streamflow product that fall outside the range of manually gauged streamflow values, we extracted the maximum and minimum gauging values from the ‘sdrc_resultsResiduals’ table in the Stage-discharge Rating Curve product. We then compared the predicted values derived from each rating curve (as reported in the ‘csd_continuousDischarge’ table) to the extracted range and calculated the proportion of values that fell outside of it.
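
The range comparison described above can be sketched as follows. The function name and return structure are hypothetical. The inputs are assumed to be plain lists of discharge values with missing values already removed, standing in for columns read from the ‘csd_continuousDischarge’ and ‘sdrc_resultsResiduals’ tables.

```python
from typing import Dict, Sequence


def percent_outside_gauged_range(
    continuous_discharge: Sequence[float],
    gauged_discharge: Sequence[float],
) -> Dict[str, float]:
    """Percentage of continuous discharge values beyond the manually gauged range."""
    if not continuous_discharge or not gauged_discharge:
        raise ValueError("both inputs must contain at least one value")
    gauged_min = min(gauged_discharge)
    gauged_max = max(gauged_discharge)
    n = len(continuous_discharge)
    above = sum(1 for q in continuous_discharge if q > gauged_max)
    below = sum(1 for q in continuous_discharge if q < gauged_min)
    return {
        "pct_above_max": 100.0 * above / n,
        "pct_below_min": 100.0 * below / n,
        "pct_outside": 100.0 * (above + below) / n,
    }


# Hypothetical example: gaugings span 1 to 4; two of five estimates fall outside.
result = percent_outside_gauged_range([0.5, 1.0, 2.0, 3.0, 10.0], [1.0, 2.0, 4.0])
```

Reporting the share above the maximum separately from the share below the minimum keeps the sketch usable for either version of the metric, since high-flow extrapolation is the case the classification emphasizes.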

We used the Nash-Sutcliffe coefficient and percentage of streamflow values over the maximum observed field measurements to classify rating curves into three categories outlined in Table 1.

To integrate stage-gauge regressions, drift detections, and rating curve classification, we produced a summary table with classifications for all three tests and the corresponding metrics used in each classification (Fig. 5). The table is grouped by month and site so users can query sites and determine which months have the appropriate data for their needs.


Source: Ecology - nature.com
