### Catchment locations and input data

To determine the impact of afforestation on catchment hydrology we select twelve varied catchments from across the British Isles (Supplementary Material-S1 and Fig. 1). These catchments capture a range of hydrological regimes, drainage patterns and catchment soil and land-cover properties to determine how such factors may influence catchment response to afforestation. Being predominantly >1000 km^{2} in area (ranging from 511 to 9931 km^{2} in size), they are adequately represented in a hydrological model to integrate processes at a 1 km^{2} spatial resolution^{54,55}. Two catchments are nested within larger ones, the Ure within the Ouse, and the Severn at Bewdley (Severn-B) within the Severn at Haw Bridge (Severn-HB) (Fig. 1).

The period 2000–2010, a flood-rich period for the British Isles^{36,37}, is chosen to assess afforestation influence on streamflow as it allows us to avoid the uncertainty that would be associated with land-cover changes over a longer period when comparing to baseline results. This length of the simulation period also reduces the computational demand with a large ensemble of land-cover scenarios. Accordingly, the CEH land-cover map for the year 2000^{56}, in the form of the CHESS-land dataset^{57}, is used to provide configurational datasets specifying soil hydraulic and thermal properties, vegetation characteristics, and orography, for the model at a 1 km^{2} spatial resolution for the unaltered land-cover scenarios. This dataset has successfully been used in other studies^{55,58}. The 25 m rasterised land-cover map is reclassified into eight different land-cover types (Supplementary Tables S4 and S5) and used to derive afforestation scenarios related to land cover before being converted to a percentage land-cover fraction at a 1 km^{2} spatial resolution. To provide the required meteorological driving data, we use the CHESS-met dataset^{59} which includes long-wave and short-wave radiation, air temperature, specific humidity and pressure. The 50 m CEH Integrated Hydrological Digital Terrain Model elevation data is used to derive topographical and catchment attributes as well as catchment boundaries and river networks^{60}. Soil hydraulic information comes from the Harmonised World Soil Database and was made uniform across each grid cell^{61}.

### The model

The Joint UK Land Environment Simulator (JULES) is a physically based land-surface model that simulates the fluxes of carbon, water and energy at the land surface when driven by a time series of the atmospheric data^{23,24}. Multiple studies have used JULES before including the investigation into evapotranspiration drivers across Great Britain^{58,62}, atmospheric river formation over Europe^{63}, the impact of solar dimming and carbon dioxide on runoff^{64} and developing river routing algorithms with a Regional Climate Model^{65}. JULES is routinely used at the Met Office, where it is coupled with several other models to understand future changes globally and across the UK, by bridging the atmosphere, land surface and ocean^{66}. This study is predominantly a theoretical, scenario-based modelling study designed to draw out general principles and to quantify the relation between afforestation and hydrological response, and as such the results are not intended to provide detailed guidance for specific practical actions.

The use of a process-based model enables us to investigate physical explanations for the hydrological impacts of changes in land cover and the explicit representation of vegetation that will influence fluxes, partitioning and storages within the realm of epistemic uncertainty for other conceptual and hydrological models where vegetation is not included. JULES models both plant phenology and canopy storages^{23,24}. When changing the plant functional type in JULES, both the properties of the above-ground vegetation (such as canopy height and leaf area index) and the soil infiltration factor and the root depth are altered^{23}. However, there are several caveats that must be considered with this approach. First, the model configuration used in the present study is uncoupled from the atmosphere and so large-scale land-cover changes cannot alter nearby weather^{67}. Second, each grid cell is hydrologically separated from adjacent cells, with streamflow and runoff hydrologically uncoupled from the rest of the system. Soil water also does not flow between grid cells. Third, soil thermal and hydraulic properties are uniform across a grid cell. This reduces the impact of hydrological pathways within a cell and the interaction of vegetation with these varying soil types that could have ramifications at multiple temporal and spatial scales. For example, within the cell there may be vegetation that is water-stressed (e.g. valley sides) compared with vegetation where water is not limited (e.g. floodplain) which would change how much transpiration is possible and thus runoff^{68}.

Precipitation in the model is partitioned by vegetation and when it reaches the soil surface it is portioned into either infiltration excess overland flow, at a rate controlled by the hydraulic conductivity of the soil, or saturation-excess overland flow as determined by the Probability Distributed Model (PDM)^{69,70}. Throughfall (*T*_{F}) through the canopy is dependent on the rainfall and the existing water in the canopy:

$${T}_{F}=Pleft(1-frac{C}{{C}_{m}}right)exp left(-frac{{varepsilon }_{r}{C}_{m}}{PvarDelta t}right)+Pfrac{C}{{C}_{m}}$$

(1)

where *P* is the rainfall rate (kg m^{−2} s^{−1}), *C* is the amount of water in the canopy (mm), *C*_{m} is the maximum water storage of the canopy (mm) and *ε*_{r} is the fraction of the grid cell occupied by convective precipitation. The maximum amount of canopy water storage is a function of the leaf area index (*L*):

$${C}_{m}={A}_{m}+{B}_{m}L$$

(2)

where *A*_{m} is the ponding of water on the soil surface and interception by leafless vegetation (mm) and *B*_{m} is the rate of change of water holding capacity with leaf area index. At each timestep (*n*) the canopy storage is updated thus:

$${C}^{(n+1)}={C}^{(n)}+(P-{T}_{F})Delta t$$

(3)

Based on the surface energy balance, the fraction of the proportion of water stored in the canopy compared with the maximum canopy capacity of that plant type is used to calculate the effective surface resistance to determine tile evapotranspiration.

Surface runoff is generated by two processes in JULES: infiltration excess, where the water flux at the surface is greater than the infiltration rate of the soil, and saturated excess overland flow where the water flux at the surface is converted to runoff when the soil is completely saturated. To calculate the saturation-excess overland flow, the PDM^{69} is used to determine the fraction of the model grid cell that will be saturated (*f*_{sat}) which is used as a multiplier to convert any excess water reaching the surface to runoff:

$${f}_{{{{{{mathrm{sat}}}}}}}=1-{left[frac{{max }(0,S-{S}_{0})}{{S}_{{max }}-{S}_{0}}right]}^{frac{b}{b-1}}$$

(4)

where *S* is the fraction of the grid cell soil water storage, *S*_{0} is the minimum storage at and below which there is no surface saturation (mm), *S*_{max} is the maximum grid cell storage (mm) and *b* is the Clapp and Hornberger^{71} soil exponent. We use the topography-derived parameterisation for the *b* and *S*_{0}/*S*_{max} parameters to reduce individual calibration with the following relationship^{55}:

$$left{begin{array}{c}b=2.0hfill {S}_{0}/{S}_{{max }}=,{max },(1-frac{s}{{s}_{{max }}},,0.0)end{array}right.$$

(5)

where *s* is the grid cell slope (°) and *s*_{max} is the maximum grid cell storage (mm). Once interception and surface runoff have been calculated, the remaining water enters the soil. This water is allocated to the different soil layers within the soil column by using the Darcy–Richards equation:

$$W=kleft(frac{{{{{{mathrm{d}}}}}}varphi }{{{{{{mathrm{d}}}}}}z}+1right)$$

(6)

where *W* is the vertical flux of water through the soil (kg m^{−2} s^{−1}), *k* is soil conductivity (kg m^{−2} s^{−1}), *φ* is suction (m) and *z* is the vertical flux of water through the soil (m). To calculate suction and soil conductivity, we use the van Genuchten^{72} scheme:

$$left(frac{theta }{{theta }_{s}}right)=frac{1}{{[1+{(alpha varphi )}^{(frac{1}{1-m})}]}^{m}}$$

(7)

where *θ* is the average volumetric soil moisture (m^{3} m^{−3}), *θ*_{s} is the soil moisture at saturation (m^{3} m^{−3}), *α* and *m* are van Genuchten parameters dependent on soil type. The hydraulic conductivity is calculated thus:

$${K}_{h}={K}_{hs}{S}^{varepsilon }{left[1-{left(1-{S}^{frac{1}{m}}right)}^{m}right]}^{2}$$

(8)

where *K*_{h} is the hydraulic conductivity (m s^{−1}) and *K*_{hs} is the hydraulic conductivity for saturated soil (m s^{−1}). *ε* is an empirical value set at 0.5 in JULES and *S* is found by:

$$S=frac{(theta -{theta }_{r})}{({theta }_{s}-{theta }_{r})}$$

(9)

where *θ*_{r} is the residual soil moisture (m^{3} m^{−3}). Vegetation can access water from each level in the soil column as a function of the root density where the fraction of roots (*r*) in each soil layer (*l*) from depth *z*_{l-1} to *z*_{l} is:

$${r}_{l}=frac{{e}^{-frac{2{z}_{l-1}}{{d}_{r}}}-{e}^{-frac{2{z}_{l}}{{d}_{r}}}}{1-{e}^{-frac{2{z}_{t}}{{d}_{r}}}}$$

(10)

where *z*_{l} is the depth of the *l*th soil layer, *d*_{r} is the root depth (m) and *z*_{t} is the total depth of the soil column (m). The water flux extracted from a soil layer is *e*_{l}*E* where *E* is transpiration (kg m^{−2} s^{−1}) and *e*_{l} can be found by:

$${e}_{l}=frac{{r}_{l}{beta }_{l}}{{sum }_{l}{r}_{l}{beta }_{l}}$$

(11)

and *β*_{l} is defined by:

$${beta }_{l}=left{begin{array}{cc}1 hfill& {theta }_{l}ge {theta }_{c}hfill ({theta }_{l}-{theta }_{w})/({theta }_{c}-{theta }_{w}) & {theta }_{w}, < ,{theta }_{l} , < , {theta }_{c}hfill 0 hfill& {theta }_{l}le {theta }_{w}hfillend{array}right.$$

(12)

where *θ*_{c} and *θ*_{w} are the volumetric soil moisture critical and wilting points respectively (m^{3} m^{−3}) and *θ*_{l} is the unfrozen soil moisture at that soil layer (m^{3} m^{−3}). In this configuration of JULES, when a soil layer becomes saturated, the excess water is routed to lower layers. When the bottom layer becomes fully saturated any excess water is added to the subsurface runoff. Both the surface and subsurface runoff are then passed to the River Flow Model^{65,73} which routes the flows according to a flow direction grid^{74}.

This study uses a combination of calibrated model parameters from the previous work of Robinson et al.^{59} and Martinez-de la Torre et al.^{55} (Rose suites u-bi090 and u-au394, respectively, which can be found using the Rose/Cylc suite control system: https://metomi.github.io/rose/doc/html/index.html). We compare observed streamflow from the NRFA database^{75} with model output for the years 2000–2010 using the base land and CHESS-met datasets. The model is spun-up for the years 1990–2000 to ensure soil moisture content has been equilibrised. To quantify the accuracy of the model, we use a range of standard error metrics. These include the Nash–Sutcliffe efficiency^{76} measure:

$${{{{{mathrm{NSE}}}}}}=1-frac{{sum }_{i=1}^{n}{({Q}_{{{{{{mathrm{sim}}}}}}}-{Q}_{{{{{{mathrm{obs}}}}}}})}^{2}}{{sum }_{i=1}^{n}{({Q}_{{{{{{mathrm{obs}}}}}}}-{bar{Q}}_{{{{{{mathrm{obs}}}}}}})}^{2}}$$

(13)

Kling–Gupta efficiency^{77}:

$${{{{{mathrm{KGE}}}}}}=1-sqrt{{(r-1)}^{2}+{left(frac{{sigma }_{{{{{{mathrm{sim}}}}}}}}{{sigma }_{{{{{{mathrm{obs}}}}}}}}-1right)}^{2}+{left(frac{{mu }_{{{{{{mathrm{sim}}}}}}}}{{mu }_{{{{{{mathrm{obs}}}}}}}}-1right)}^{2}}$$

(14)

Root-mean-squared error:

$${{{{{mathrm{RMSE}}}}}}=sqrt{mathop{sum }limits_{i=1}^{n}{({Q}_{{{{{{mathrm{sim}}}}}}}-{Q}_{{{{{{mathrm{obs}}}}}}})}^{2}}$$

(15)

Mean absolute error:

$${{{{{mathrm{MAE}}}}}}=frac{{sum }_{i=1}^{n}|{Q}_{{{{{{mathrm{sim}}}}}}}-{Q}_{{{{{{mathrm{obs}}}}}}}|}{n}$$

(16)

where *Q*_{sim} is the simulated discharge, *Q*_{obs} is the observed discharge, *r* is the linear correlation between observation and simulations, *σ*_{sim|obs} is the standard deviation of discharge, *μ*_{sim|obs} is the mean of discharge and *n* is the number of observations. We also use NSE(log(*Q*)) and KGE(1/*Q*) to understand how well the model can reproduce low flows. Using these measures, we find that JULES performs satisfactorily apart from the Avon Catchment which may be due to fast subsurface flows generated by its geology^{55} (Supplementary Table S7). With process-based models, it is difficult to both accurately reproduce physical processes and make the output faithful to reality due to epistemic uncertainties^{78}. Even though model performance is not the same as achieved with calibrated conceptual or empirical models, it allows us to determine the effects of vegetation changes on the hydrological cycle.

JULES’ ability to faithfully represent hydrological land-surface processes in Great Britain has been evaluated in several studies^{58,79,80} and the plant functional type parameters it uses at global scales^{81,82}. To validate the ability of our configuration of JULES to represent soil moisture and potential evapotranspiration rates, we compare the model output with observations from twelve COSMOS-UK sites within our catchments covering grasslands, croplands, coniferous and broadleaf woodland^{83} (Supplementary Fig. S8). We evaluate model performance from the start of the COSMOS-UK station records until January 1, 2018 so that we use the same forcing data as our experiments. Station start dates vary from October 2013 to August 2017. We compare COSMOS-UK observed soil moisture to the first 0.1 m of the soil column in JULES and evaporation to the sum of the soil evaporation and plant transpiration. We find a median KGE score of 0.44 for the topsoil moisture and 0.53 for potential evaporation (Supplementary Tables S9 and S10). Low error metrics observed for topsoil moisture are due to systematic undercalculation by JULES^{80}. At our broadleaf sites, Alice Holt and Wytham Woods, we find both systematic over and underprediction of the topsoil moisture respectively. In line with other studies, we find that there is a slight overestimation of evaporation in JULES^{58,84}. As illustrated by the median coefficients of determination between the COSMOS-UK and JULES data of 0.62 and 0.60 for the topsoil moisture and evaporation respectively, JULES broadly represents changes in these variables over time.

### Land-cover scenarios

Modelling the influence of afforestation on catchment hydrology has been attempted before but usually only at the scale of a single catchment for a limited range of scenarios. In this study, we focus on the theoretical effect of widespread planting of broadleaf trees to examine whether planting location is a stronger control on hydrological response than afforestation extent by using a large ensemble of up to 288 land-cover change scenarios. We choose to focus just on broadleaf woodland for several reasons. First, we are trying to replicate a landscape that could be considered a natural climatic climax community that might occur if it had not been for human intervention during the Holocene. Second, broadleaf woodland has the potential to absorb and store carbon in soils for longer time periods. Finally, to reduce computational cost and the issue of potentially expanding the errors induced by potentially spurious parameters of needleleaf woodland in this version of JULES^{85}. Although potential woodland planting locations have been suggested by the Environment Agency and authorities in Scotland and Wales^{86,87,88}, the differences in planting criteria means it is not possible to systematically compare hydrological changes across our chosen catchments. Here we attempt to create afforestation scenarios related to both catchment river network structure and land use that are directly comparable across a range of catchments. Afforestation was in grassland areas to reduce the complexity of the decisions made and enable an understanding behind catchment sensitivity to land-cover changes related to soil and catchment structure.

Three metrics were selected to discretise the catchment into distinct areas for afforestation: the Topographic Wetness Index (TWI)^{28}, Strahler^{27} and Shreve orders^{26}. These metrics capture different parts of the catchment such as propensity for saturation, drainage network location and relative contributing areas. TWI is calculated by:

$${{{{{mathrm{TWI}}}}}}=,{{{{mathrm{ln}}}}},frac{a}{tan ,gamma }$$

(17)

where *a* is the upslope area draining through a point, per unit contour length, and *γ* is the local surface topographic slope in radians. All three metrics were calculated using the 50 m IHDTM^{60}, thresholding stream formation at an accumulation of ten pixels using the D8 flow direction algorithm within ArcGIS 10.6.1^{89}. Strahler order ranges from one (headwaters) to seven (lowlands). Due to the continuous nature of TWI (0.05–31.49) and the large ordinal range of Shreve order (1–9523) calculated for the entire British Isles, we group TWI orders into five quantiles and seven quantiles for Shreve. Increasing TWI order in this case indicates increasing propensity for saturation, or potential maximum saturation level, and increasing Shreve order indicates increasing contributing area. Catchments were broken down to watersheds from the downstream point of the Shreve and Strahler orders. Due to the nature of the data, this led to some first order Strahler catchments being incorrectly generated for some catchments (Supplementary Table S7). Using these generated catchment areas, we plant both inside and outside of these watersheds to understand the hydrological difference between opposing planting locations. In each of the catchment areas, two different levels of afforestation were tested of ~25 and 50% of the possible planting area. Planted area was assigned at random in the catchment and was produced by calculating the area available for afforestation and randomly producing points that covered the area required using the Create Random Points tool in ArcGIS 10.6.1.

Discussions exist about where to plant woodland in relation to existing land cover, to provide ecosystem services, including around watercourses^{29,30}, urban areas^{31,32} and woodland^{4,33}. Therefore, in this study we try to understand how these potential planting scenarios will affect hydrology in general. Using the CEH 2000 land-cover map^{56} buffers of broadleaf land cover were created at 25 and 50 m around these three land uses (Supplementary Fig. S7). These were then discretised according to the catchment areas. As an example, one scenario would be afforesting up to 50 m around existing broadleaf woodland inside the Shreve order one catchment area, whilst another would be randomly afforesting within 25% of the available area outside of TWI order five areas.

Afforestation according to different catchment areas and land-cover uses between 234 and 288 scenarios for each catchment and between 0 and c. 40 percentage point increase in broadleaf woodland (Supplementary Fig. S2 and Table S8). Due to the structure and size of the different catchments, and thus differences in Strahler and Shreve orders, not all catchments had a comparable number of higher orders. Produced scenarios were converted to the 1-km^{2} grid scale by altering the fraction of land-cover types within each grid cell. It should be noted that this work only considers the impact of mature broadleaf woodland and neglects the influence of the initial planting and growing of the woodland that would likely have its own impact on catchment hydrology as frequently reported^{13,49}. Furthermore, it does not include the period when there would be the highest amount of carbon sequestration. This study seeks to understand the theoretical impact of woodland on catchment hydrology when fully developed to understand the long-term implications of management decisions.

### Hydrological signatures and analysis

Several hydrologic indices can be used to characterise the influence of afforestation on streamflow regime^{34,35}. To analyse average streamflow and extremes, we look at the top 1% (very high flow), 5% (high flow), 50% (median flow), 90% (low flow) and 95% (very low flow) quantiles of daily streamflow. To quantify flow variability, we use the slope of the flow duration curve^{38,40} calculated thus:

$${{{{{mathrm{FDC}}}}}}=frac{{{{{mathrm{ln}}}}}({Q}_{33 % })-,{{{{mathrm{ln}}}}}({Q}_{66 % })}{(0.66-0.33)}$$

(18)

where *Q*_{33%} is the 33rd flow exceedance quantile and *Q*_{66%} is the 66th flow exceedance quantile. To ascertain catchment responsiveness to climatic forcing, we use median streamflow elasticity^{40,41}:

$${{{{{mathrm{MSE}}}}}}={{{{{mathrm{median}}}}}}left(frac{{{{{{mathrm{d}}}}}}Q}{{{{{{mathrm{d}}}}}}P}frac{P}{Q}right)$$

(19)

where d*Q* and d*P* are the annual changes in yearly discharge and precipitation, respectively. Finally, we use the runoff ratio to quantify water balance changes related to streamflow and evapotranspiration^{42}:

$${{{{{mathrm{RR}}}}}}=frac{{mu }_{Q}}{{mu }_{P}}$$

(20)

where *µ*_{Q} and *µ*_{P} are the average yearly discharge and precipitation using daily values, respectively. We also qualitatively assess the largest peak flow daily event in the 10-year record used in this study to determine the impact of afforestation on the highest possible flows in each catchment.

To determine how afforestation influences streamflow metrics, percentage changes in flow metrics are plotted as a function of percentage point increases in afforestation (calculated using the difference between original and afforested scenario). Quantile regression is applied to determine the median regression slope of the trend for the entire period^{43}. The benefit of using quantile regression is that it identifies the median response of the input variable (in this case the level of afforestation in both percentage and absolute terms) without being influenced by extreme outliers. In this way, we can estimate the proportional streamflow response to afforestation over the period. We use the regression slope coefficient as a proxy of catchment sensitivity to afforestation for each streamflow metric. The slope coefficient is then correlated to catchment attributes, as stated in the CAMELS-GB dataset^{44}, using Spearman’s rank correlation. This allows us to determine the direction and significance of the catchment property influences on the sensitivity of catchments to afforestation for the different hydrologic signatures. To determine the impact of different planting locations according to catchment and land-cover location a one-way analysis of variance (ANOVA) test is undertaken using R^{90}.

Source: Resources - nature.com