Experimental designs varied across the 39 research sites with plot size ranging from 0.04 ha to 80 ha. The size of the plot drainage areas varied accordingly from 0.02 to 56 ha. The number of site-years of available data ranging from 2 to 17 with a mean of 7 years. There were diverse soil types, five soil textural classes and soil organic carbon ranging from 0.1% to 3.7%. Corn (Zea mays) and soybean (Glycine max) were the predominant crops grown, but 23 site-years had popcorn (Zea mays everta), wheat (Triticum aestivum), forage, oats (Avena sativa), or sugar beets (Beta vulgaris).
CD was practised at the greatest number (19) of sites (Fig. 2) across seven states in the Midwest and North Carolina. The research sites extended from 35.8° to 46.4° N and 76.7° to 96.9° W. The majority of sites (30) were on private farm (cooperator) fields through a lease or collaborative arrangement, with the remaining 9sites on university-owned and managed research farms. The USDA soil drainage class for the dominant soil type at each site ranged from somewhat poorly drained to very poorly drained11. The subsurface drainage of all sites consisted of 102 mm-diameter perforated corrugated tubing except MN_Clay sites (76 mm diameter tubing) and included both CD and free drainage (FD) treatments. Tile depth ranged from 0.61 m to 1.22 m, and tile spacing varied from 6 m to 36 m with median 13.7 m. All sites had similar drain spacings across treatments except IA_di4 and IA_Washington. These two sites varied tile spacing and/or tile depth. IA_di4 tile spacing differed with 27 m and 36 m for FD and CD plots, respectively. While at IA_Washington, tile spacing was 12 m in the shallower drainage treatment compared to 18 m spacing in the conventional drainage treatment. Seven sites had replicated drainage treatments with an average drainage area of 1.1 ha. Sites that did not include replications were larger farm fields with an average drainage area of 10.5 ha, except one university research field with a drainage area of 1.8 ha.
DWR research was conducted at seven sites across the Midwest. Individual research site locations ranged from 39° to 46° N and 83° to 96° W. The treatments at the sites included DWR utilizing controlled drainage with sub-irrigation or controlled drainage with on-surface drip irrigation. In addition, there was a comparison treatment of FD with no irrigation. The three Ohio sites included wetland monitoring in addition to drainage water recycling as part of the Wetland Reservoir Subirrigation System (WRSIS) project12.
Eight SB sites were monitored as part of this project, seven in Iowa and one in Minnesota. One of the Iowa sites included the first SB installed in the US4. Five sites categorized as ‘Other’ included monitored drainage practices slightly different from the previously described categories. The IN_Tippecanoe site was a wetland with future drainage water recycling planned but not implemented during this period. MN_Clay1 was a conventionally drained farm, MN_Clay3 was an undrained farm with only surface drainage, MN_Redwood2 was an undrained prairie area and ND_Richland had controlled drainage and a sub-irrigated area utilizing a sump pump lift station for water management.
Data collected at each site
The data describes crop and field management, soil physical characteristics, water quality and quantity time series, drainage system design and specific practice variables for the 39 research sites. Weather data, primarily precipitation and air temperature, were also available for each site. However, other data collected varied since the measurement protocols were not coordinated before research was initiated at many sites. Cumulatively, more than 90 in-field variables were measured across all sites to characterize the performance of these alternative agricultural water management strategies. Water quality and quantity time series (drain flow, water table depth, nitrate-N concentration, and precipitation) were considered essential data for temporal robustness and accuracy regarding the hydrological response.
Precipitation (39 sites) and drain flow or discharge (36 sites) were the most reported variables, followed by nitrate-N concentration (32 sites) and load (30 sites) (Fig. 2). Other common water quality variables are summarized in Fig. 3. In addition, soil moisture time series collected at varying depths were reported for 16 sites.
In addition to the water quantity and quality variables that provide a direct measure of treatment impact to water sustainability, other variables including crop yield, crop and field management and soil characteristic data are important for evaluating inter-site variability. For example, differences in nutrient application with fertilizer and nutrient removal through crop uptake will influence the water quality impact of different treatments. Soil texture (reported for 21 sites), crop yield (29 sites), tillage (27 sites) and fertilizer application (31 sites) were considered most essential of these site characteristic variables13. Along with crop yield, sites reported additional variables that assisted in quantifying plant water, nutrient and carbon uptake, including grain moisture content (13 sites), final plant population (end of season plant density; 9 sites), grain total N (8 sites) and grain biomass (6 sites). Whole plant, vegetative and cob biomass, and whole plant, vegetative, cob and grain N and C contents, forage biomass and leaf area index were reported for five or fewer sites.
Sixteen sites reported soil organic carbon and total N, in addition to basic soil texture information. In addition, 31 other soil parameters were reported for a subset of sites; the most common are summarized in Table 1. Soil organic matter, infiltration, lime index, sodium concentration or amount, sodium absorption ratio, neutralizable acid and salinity were reported for five or fewer sites.
Summary of measurement methods
Most experiments were not coordinated when the data collection project was initiated; hence research data collected, length of experimentation, years of available data, and protocols varied. Methods for each research site are provided in the data to document differences in measurement schedule, sample size, sample collection frequency, and equipment precision. Here, we summarize methods for determining drain flow, nitrate-N concentration and load, water table, soil properties and weather data due to the variability across sites within these key metrics. Crop yield is not summarized here despite its importance as a metric due to more consistent methods typically used across sites. Inter-site sampling methods for water measurements varied more than methodology for measuring other parameters. This variability is due to differing infrastructure at each site that required different measurement methods and the financial resources available for monitoring.
Drain flow measurement and reporting
Drain flow or discharge data were reported for 36 sites, including 19 CD, eight SB, six DWR and three with other practices (e.g., wetland). For all CD, three DWR and two wetland sites, drain flow was reported in mm/day (drainage discharge normalized by the drainage area). For all other sites, volumetric drainage discharge was reported in m3/day. Two of the sites (MN_Clay3 and MN_Redwood2) were undrained control sites that did not report drain flow or discharge. A third site (MO_Shelby) focused on the agronomic impact of subsurface drainage practices and did not monitor drain flow.
Drain flow was measured hourly or sub-hourly at more than 80% of the sites, followed by aggregation to daily flow measurements. Subsurface drainage flow rates were determined as a function of the water head measured using pressure transducers installed inside drainage control structures or at the drain outlet for approximately two-thirds of the sites. The water head was measured upstream of V-notch or rectangular weirs and empirical equations that depend on the weir dimensions were used to determine drain flow, which was measured and recorded hourly or sub-hourly. For IN_Tippecanoe drain flow was estimated as a function of water head using an empirical rating curve. At three sites, drain flow was measured using inline flow meters and recorded by data loggers. The advantage of this method is that flow could be recorded in either direction, valuable for sites experiencing backflow in the drainage system due to high downstream water levels14. At ND_Richland, drainage was collected at a sump where a current sensor was used to measure pumping frequency to calculate drainage flow15. For an additional three sites, drainage discharge was measured using a depth-velocity meter installed at the outlet of the drainage pipe or a drainage ditch. The drainage discharge was calculated as the product of the flow velocity and the area of flowing water. Only one site (MN_Redwood3) had manual measurements of drain flow that were collected two to three times per week.
Measured drain flow data exhibited variable frequency and duration gaps due to instrumentation malfunctioning, particularly with the automated monitoring systems that provide near-continuous data. Missing data and their non-uniform distribution created problems in statistical analyses when comparing aggregated drain flow and loads from different locations. A systematic approach was used to infill missing drain flow data utilizing variables available at all sites (precipitation, temperature, drain flow) and replicate plots where available. The method consisted of the following three phases and completed in progression, when applicable.
Phase 1, fill in zero flow.
During most winters in the northern states, the soil is frozen to the depth of the tile, and no subsurface drain flow is expected. Such periods were identified based on expert judgment by researchers at each site, relying on soil and air temperature information and local knowledge of the drainage system’s response to these conditions. If no drainage measurements were available due to frozen soil, the corresponding gaps in the data record were infilled with zero.
Phase 2, predict using replicate plots.
Regression-based estimation was used to infill missing data at three sites which had replicated plots or adjacent fields with available data. Due to the seasonal nature of subsurface drainage from croplands, individual linear regression models were developed for each season: winter (Jan, Feb, Mar), spring (Apr, May, Jun), summer (Jul, Aug, Sep) and autumn (Oct, Nov, Dec). Regression r2 values ranged from 0.66 to 0.94 based on the site and season, although mean across-site values were similar: winter (0.80), spring (0.82), summer (0.80), and autumn (0.83).
Phase 3, populate based on precipitation and drain flow from the preceding day.
The remaining missing daily drain flow data at 11 sites were filled as described below, based on the assumption that drain flow occurs on a given day only if (a) precipitation occurred on that day or (b) the drain continued to flow from the day before.
For days with precipitation, a two-day moving average was calculated to account for the time lag between rainfall and resulting drain flow. A linear regression model was fitted to non-zero drain flow and two-day moving average precipitation for each season, with the model’s intercept fixed to zero. We used these models to predict the missing drain flow data for days with non-zero precipitation. The predicted drain flow values were limited to the drainage system’s capacity by replacing predictions greater than the site’s drainage coefficient (depth of water the drainage system could remove within 24 hours) with the coefficient’s value.
For days with zero precipitation, missing drain flow was calculated from the previous day’s observed flow using the following first-order recession equation
where Q is daily drain flow, k is the average recession coefficient of falling limbs calculated as a linear slope of ln(Q), and i indicates day. The recession coefficient was calculated as a linear slope between the peak and inflection point of log-transformed daily drain flow data. The coefficient was calculated for all falling limbs of drain flow data, and the average seasonal values were calculated as their arithmetic mean.
The regression model between on-site precipitation and peak flow and recession equation were only applied to the original (pre-gap-filled) drain flow data. Predictions were not made when the number of missing drainage days exceeded 152 (5 months) within a calendar year; therefore, approx. 18% of the drain flow data remain missing. Both the original and filled data are included in the published data.
Nitrate-N concentration and load measurement and reporting
Nitrate-N (NO3) concentrations were reported for 32 sites, including 15 CD, eight SB, six DWR, and three sites with other practices (e.g., wetland). The three sites not reporting drain flow (MN_Clay3 and MN_Redwood2, MO_Shelby) did not report NO3 concentrations. Two sites (MO_Knox1 and MO_Knox3) provided NO3 load along with discharge in place of reporting the concentration of individual water samples. Two sites (OH_Hardin2 and OH_Henry) did not report NO3 concentrations or load due to limited water sample collection at these sites.
Six sites collected flow-proportional samples, in which a sample is collected every time a given volume of water passes through the drainage system. The flow-proportional sampling methods at the sites varied. At NC_Washington, a portion of flow was diverted continuously into a composite sample which was collected fortnightly (or more frequently under high flows). At IA_di4, a proportional sample was collected each time the drainage system was pumped. At MN_Redwood1, flow proportional samples were collected during storm and baseflow conditions. These samples were not composited but rather kept discrete. Seven sites used automated samplers to collect time-proportional samples. Five of these sites composited samples daily, while one site (IN_Randolph) collected samples hourly, then combined samples into approximately weekly composites. One site (IN_Tippecanoe) collected weekly grab samples prior to 2016 but then switched to automated, time-proportional sampling composited weekly in March 2016. Sites that used automated samplers typically switched to manual sampling (every two days to weekly frequency) in winter to protect automated samplers from freezing. Twelve sites collected weekly grab samples, another collected samples 2–3 times per week. One site collected biweekly grab samples, and four sites collected grab samples approximately monthly. Regardless of the collection method, all samples were either frozen or refrigerated (4–5 °C) upon return to the laboratory until analysis.
The sampling strategy primarily affects the frequency and compositing strategy of the water samples. Automated samplers permit more complex sampling strategies, such as flow-proportional or sub-daily sampling. However, the disadvantages of this method are the high initial expense of sampling equipment and the propensity for equipment malfunction at below-freezing air temperatures. The potential for equipment failure prompted sites using automated samplers to switch to a manual sampling in winter while drains remained flowing. Manual sampling frequency varied among sites due to differences in site accessibility or personnel availability. Both automated and manual water samples were often composited following collection, and sample compositing frequency ranged from daily to biweekly. Although sample collection frequency and compositing strategy affect the uncertainty of loading measurements, a collection frequency between 3 to 17 days is generally sufficient to reach ± 10% accuracy for annual nitrate load estimation for tile-drained landscapes in the Midwest16.
For nitrate-N analysis, 12 sites reported a cadmium reduction followed by a sulfanilamide reaction (equivalent to EPA 353.2). However, there was a slight methodological variation depending on the equipment, either Lachat QuikChem 8000 Flow-Injection Analyzer or SEAL AQ2 Discrete Analyzer. The resulting nitrate-N concentrations calculated via cadmium reduction were directly comparable regardless of the instrument used. At one site, SD_Clay, ion chromatography (EPA 300.1) was used to measure nitrate-N in 2015 but was subsequently switched to a cadmium reduction method. Samples at the seven IA sites were analysed by second-derivative spectrophotometry17.
Daily nitrate loads were calculated by multiplying nitrate concentration by drain flow and were therefore available for 32 sites for which both values were reported. Load calculation methods differed slightly in terms of determining the volume of water associated with each concentration. Typically, linear interpolation was used to determine the daily nitrate concentration at sites which collected “grab” water samples following precipitation events or on a schedule spanning two days or more. One variation used assumed the measured concentration was representative of adjacent days (prior and post), hence no interpolation was done. One site (OH_Delaware) used a midpoint approach to determine the time interval in which measured concentrations were associated with, while another site (IA_di4) assumed measured concentrations represented all water drained before the sample was collected.
Water table measurement and reporting
The water table was measured at 16 sites including nine CD sites, three SB sites, three DWR sites and one wetland site. Documenting water table fluctuation is key to experimental and modelling research investigating crop production systems on artificially drained soils. In a tile-drained field, the water table is used as an input parameter in estimates of drain flow, evapotranspiration, and soil hydraulic conductivity14,18,19. In controlled drainage, the water table is used to determine CD effectiveness and guide water management in the field for different crop stages. For DWR practice, the water table, particularly the midpoint water table, is used to evaluate sub-irrigation performance, such as uniformity and efficiency20. In a saturated buffer field, the water table is the most important factor used to indicate a field’s saturation status21.
The field water table was typically measured at the midpoint between the subsurface drains. Some field studies also measured the water table at two locations, one near the drain tile and the other at the midpoint between two drain tiles. The water table was commonly measured and recorded hourly or sub-hourly using pressure transducers installed inside 1.5–2.5 m deep wells of perforated PVC pipes. The water table depth was calculated using the measured water pressure above the transducer, and the in-situ water temperature and barometric pressure measured in a nearby field and periodically adjusted with manually measured water tables. If there were any discrepancies, all previous water table depth data were moved up or down correspondingly.
Differences across sites spanned the type of pressure transducers used, depth of measurement (1.5 to 2.4 m), data collection frequency (0.17 hr (10 min) to 6 hr), location of the measurement, and the length of the screened section of the pipe. The selection of the transducer type was due to individual choice and cost, while differences in the water table depth measurements were affected by the soil types and drain depth. The frequency of data collection was based on data logger capacity, water table variations, and the purpose of the measurements. The length of the perforated (“screened”) section of the pipe, in which the transducer was installed, also varied. For a typical tile-drained field, the pipe was screened beginning 0.3 m below the soil surface while for a saturated buffer, the pipe was screened beginning at the soil surface16. The data collection frequency for the saturated buffer area was every 6 hr since the water table variations were minimal across time. Within the field experiments, data were collected every hour at 10 sites and every 0.17 hr (10 min) at two sites.
Soil physicochemical variable measurement and reporting
Potentially important soil physical and chemical properties that might affect or be affected by soil drainage were collected from 19 experiments across six states. Data included 17 total variables, continuous and categorical. Soil physical variables included bulk density, hydraulic conductivity (saturated), moisture (water) content, temperature, texture, and soil water retention data (used to form a water retention curve). The remaining 11 variables were chemical properties. The five most common chemical variables characterized were nitrate, total nitrogen, soil organic carbon, pH and cation exchange capacity with several sites using similar methods22.
There was large variability of soil sampling depth among the studies and within specific variables at a site, and in a few cases (<1%) sampling depth intervals were not specified. Of the soil physical properties, soil texture was consistently measured using the same quantitative method (hydrometer). Soil textural class, a categorical variable, was reported once during the study period. Although not explicitly stated, it was assumed that researchers used intact soil cores and standard methods23 for quantifying bulk density. Soil water content was measured using capacitance/frequency domain technology or dielectric impedance-based sensor. At sites where soil temperature was measured, a thermistor in thermal contact with a water content probe measured temperature. Soil water content and temperature data were saved at time intervals ranging from five to 60 minutes.
Of the soil chemical properties, total nitrogen was measured once during a study and consistently using the dry combustion method. The soil nitrate concentration measurement frequency was not specified, and nitrate was extracted using potassium chloride (2 M KCl). The nitrate analysis method, when specified, was either by cadmium reduction or nitrate specific electrode. Soil pH was analyzed using a 1:1 soil:water mixture or salt solution. Where soil organic carbon was measured, the dry combustion method was used after carbonate removal by sulfuric acid.
Weather data collection
Weather information was available from on-site weather stations and nearby weather stations operated by the Federal Aviation Administration, National Weather Service, or state agricultural weather networks. Precipitation was recorded using automated tipping bucket gauges for rainfall at 31 sites, while at one site, a manual read gauge was used. Multiple rain gauges were installed for redundancy and measurement intercomparisons at four sites. The air temperature was recorded electronically except at one site, where a manual thermometer was used. Weather information was recorded at daily intervals.
Precipitation measurements were a particular focus due to its important effects on drainage response and its higher spatial variability compared to solar radiation, air temperature, humidity, and wind speed. To help improve the quality and integrity of precipitation data in situations where the nearest weather station was located more than 1000 m from the research site or to provide more localized information, on-site precipitation was collected at 20 sites. These onsite precipitation measurements used tipping bucket gauges with 15.2 or 20.3 cm orifices to optimize measurement accuracy. Rain gauges may underestimate rainfall due to wind effects, wetting of and evaporation from the funnel, mechanical limitations, splashing out of the funnel or other factors24 by up to 20% at high rainfall intensities25. Reference evapotranspiration was estimated for short or tall reference (or both) using the standardized Penman-Monteith26,27 equation for most sites, except for three of the sites where potential evapotranspiration was estimated using the temperature-based Thornthwaite equation28.
Weather data were reviewed by the Data Management Team using visualization and statistical methods to ensure uniformity and high quality of the data across all sites. Plausible ranges were defined for each parameter based on known climatological limits for the site with guidance by research personnel. For certain parameters, absolute limits were used to identify impossible values (e.g., such as wind directions not between 0 and 360 degree or relative humidity greater than 100%). In addition, an internal consistency check was performed between daily minimum and maximum air temperature observations. All questionable records were flagged, reconciled or removed except for 6 records with internal inconsistency at IN_Randolph but retained and the decision left to data users. The percent of missing data varied among variables (1.2% relative humidity to 36.7% evapotranspiration) and research sites (0 to 38.5% at OH_Crawford). Only, 4.2% of the records are missing in total and it is left to the discretion of users to reconstruct missing data.
Data ingestion and processing
The team leveraged a previously existing Google Cloud entry and management platform developed for the USDA–NIFA funded Sustainable Corn CAP team29,30,31. The system was expanded and customized to the needs of the Transforming Drainage team. Team members were led through a workflow (Supplementary Fig. 1) utilizing Google Sheets (web browser-based spreadsheets stored in the cloud), customized Google App Scripts (web browser-based data entry forms) and uploading free-form content to the shared Google Drive. All information was stored in Google’s Cloud with associated versioning and metadata tracking with team member identification. Automation was used to provide daily alerts to Data Management Team personnel regarding content modified within the previous 24 hours for further review and quality control. Additional automation via Python scripting synced data content to a local PostgreSQL database server housed at Iowa State University for further analysis, quality control and viewing within custom interfaces on an internal website.
The team implemented a targeted data entry approach to ensure timely revision and sharing of key variables among project members. For this purpose, all variables were categorized as primary or secondary in importance based on research priorities. Primary variables were crop yield, drain flow, tile nitrate-N concentration, water table depth, on-site precipitation, and field management data. These data were critical to address research areas of interest and expected to be the most time-consuming to retrieve, given the sampling frequency and quality control needed. All other variables were secondary and incorporated into the database following the entry of primary variables.
The general workflow of the team included characterizing treatments and data collected for each site, building the data dictionary, creating site-specific spreadsheets for different subsets of variables, entering priority data, entering methods used to collect measurements, entering secondary data, and adding additional research sites from cooperating partners. Subsequently, supporting photographs and drainage system design drawings and site maps were uploaded. Quality control was performed on all data throughout the process. Data were interrogated multiple times during the life of the project, and data reviews with site personnel occurred annually to address issues or questions. Issues needing attention were listed internally for site personnel to further scrutinize data prior to publishing.
Standards were instituted for each variable to ensure uniformity and enable data synthesis across dispersed datasets. These standards specified the minimum frequency for continuous or semi-continuous variables (e.g., drain flow was required to be reported at least daily) and precision according to assigned units of measurement. There was only one site (MN_Redwood3) that did not meet the requirements in terms of periodicity for soil moisture, as measurements were collected weekly. At several locations, variables were reported at higher or irregular frequencies. At the end of the project, all high-frequency measurements were aggregated into values corresponding to the minimum frequency of the respective variable.
Nutrient concentration measurements were limited to the detection limit corresponding to the analytical methods used at each site when the information was available. Any concentration below the limit was replaced with the corresponding limit, such as < 0.3. Concentrations were not adjusted at several sites where the detection limit was not available. Occasionally, zero concentrations were reported, indicating that the analytical method could not detect the nutrient. This limitation should be considered for inter-site water quality analysis.
Source: Resources - nature.com