Overestimation of the effect of climatic warming on spring phenology due to misrepresentation of chilling
Phenological and climatic data
We used data from the Pan European Phenology Project (PEP725)29, an open-access database with long-term plant phenological observations across 25 European countries (http://www.pep725.eu/). The regional/national network partners of PEP725 are following a consistent guideline for phenological observations30 and prepare the data for submission to the PEP725 database curators29. We selected 30 species for which sufficient observational data were available: 21 deciduous broadleaved trees or shrubs, 6 herbaceous perennials, 2 evergreen coniferous trees, and 1 deciduous coniferous tree (Supplementary Table 3). Particularly, our data set included one fruit tree (Prunus avium) and one nut tree (Corylus avellana) since some of the chilling models are specifically developed for fruit and nut trees. A total of 2,493,644 individual records from 15,533 phenological stations were used. The stations were mainly distributed in moderate climates in Central Europe (Supplementary Fig. 3). Four spring events based on the BBCH code were investigated: BBCH 10, 11, 60, and 69, representing first leaves separated, first leaves unfolded, first flowers open, and end of flowering, respectively31.
We used the E-OBS v19.0eHOM data set32 with a spatial resolution of 0.1 × 0.1° for 1950–2018 for calculating CA and HR of the in situ phenological records. This data set is provided by the European Climate Assessment & Data set project and includes homogenized series of daily mean, minimum, and maximum temperatures. We also use the daily maximum and minimum temperature data from the GHCN data set33 to assess the scale effect. The GHCN data set contains station-based measurements from over 90,000 land-based stations worldwide, but only parts of PEP725 stations match with the GHCN stations.
For future climatic data (2019–2099), we used daily minimum and maximum temperatures simulated by the HADGEM2-ES model (with a spatial resolution of 0.5 × 0.5°) under two climatic scenarios (RCP 4.5 and RCP 8.5). These data have been bias-corrected by applying the method used in the Inter-Sectoral Impact Model Intercomparison Project (ISIMIP)34, which were available on the ISIMIP server (https://esg.pik-potsdam.de/projects/isimip2b/).
Chilling models
We used 12 chilling models to measure the amount of chilling. One type of chilling model is based on several specific temperature thresholds. The most commonly used model, developed in the 1930s and 1940s for peach35, calculates the number of hours or days with temperatures 0 °C for calculating CA11,38. Models C1–C6 were developed based on various combinations of the upper and lower temperature limits. The rate of chilling was 1 for daily temperatures 5,{mathrm{or}},{T} < - 10} hfill end{array}} right.,$$
(2) $${mathop{rm{CU}}nolimits} _3 = left{ {begin{array}{*{20}{l}} 1 hfill & {0 le {mathop{T}nolimits} le 5} hfill \ 0 hfill & {{T} > 5,{mathrm{or}},{T} < 0} hfill end{array}} right.,$$
(3) $${mathrm{CU}}_{mathrm{4}} = left{ {begin{array}{*{20}{c}} 1 & {{mathop{T}nolimits} le 7} \ 0 & {{T} > 7} end{array}} right.,$$
(4)
$${mathop{rm{CU}}nolimits} _5 = left{ {begin{array}{*{20}{l}} 1 hfill & { – 10 le {mathop{T}nolimits} le 7} hfill \ 0 hfill & {{T} > 7,{mathrm{or}},{T} < - 10} hfill end{array}} right.,$$
(5) $${mathop{rm{CU}}nolimits} _6 = left{ {begin{array}{*{20}{l}} 1 hfill & {0 le {mathop{T}nolimits} le 7} hfill \ 0 hfill & {{T} > 7,{mathrm{or}},{T} < 0} hfill end{array}} right.,$$
(6) where CUi is the rate of chilling for Model Ci, and T is the daily mean temperature (°C).
Model C7 is also known as the Utah Model39, which assigned different weights to different ranges of temperatures and was first used to measure the chilling requirements of peach (Eq. (7)). The Utah Model was modified to produce Model C8 (Eq. (8)), which removed the negative contributions of warm temperatures to accumulated chilling20. $${mathop{rm{CU}}nolimits} _7 = left{ {begin{array}{*{20}{l}} 0 hfill & {{mathop{T}nolimits} le 1.4} hfill \ {0.5} hfill & {1.4 < {mathop{T}nolimits} le 2.4} hfill \ 1 hfill & {2.4 < {mathop{T}nolimits} le 9.1} hfill \ {0.5} hfill & {9.1 < {mathop{T}nolimits} le 12.4} hfill \ 0 hfill & {12.4 < {mathop{T}nolimits} le 15.9} hfill \ { - 0.5} hfill & {15.9 < {mathop{T}nolimits} le 18} hfill \ { - 1} hfill & {{mathop{T}nolimits} > 18} hfill end{array}} right.,$$
(7)
$${mathop{rm{CU}}nolimits} _8 = left{ {begin{array}{*{20}{l}} 0 hfill & {{mathop{T}nolimits} le 1.4} hfill \ {0.5} hfill & {1.4 < {mathop{T}nolimits} le 2.4} hfill \ 1 hfill & {2.4 < {mathop{T}nolimits} le 9.1} hfill \ {0.5} hfill & {9.1 < {mathop{T}nolimits} le 12.4} hfill \ 0 hfill & {{mathop{T}nolimits} > 12.4} hfill end{array}} right.,$$
(8)
where CUi is the rate of chilling for Model Ci, and T is the daily mean temperature (°C).
Model C9 is a dynamic model developed for peach in Israel and South Africa40,41 and now adopted for apricot cultivars42. The most important characteristic of Model C9 was that a previous intermediate product affected the rate of chilling in the current hour or day. We did not provide equations for Model C9 for simplicity (see the equation in Luedeling et al.20).
Harrington et al.43 summarized published results for chilling units and constructed a chilling function based on a three-parameter Weibull distribution, coded as Model C10 (Eq. (9)). Model C11 has a triangular form, which was fitted by Hänninen44 using previous experimental results for Finnish birch seedlings (Eq. (10)). Zhang et al.45 recently fitted observational data to the triangular model for 24 plant species and found that a mean optimal chilling temperature of 0.2 °C and an upper limit of the chilling temperature of 6.9 °C were most effective. Model C12, therefore, uses the triangular form with parameters of 0.2 and 6.9 °C (Eq. (11)).
$${mathop{rm{CU}}nolimits} _{10} = left{ {begin{array}{*{20}{l}} 1 hfill & {2.5 < {mathop{T}nolimits} < 7.4} hfill \ 0 hfill & {{mathop{T}nolimits} < - 4.7,{mathrm{or}},{mathop{T}nolimits} > 16} hfill \ {3.13left( {frac{{{mathop{T}nolimits} + 4.66}}{{10.93}}} right)^{2.10}{mathop{rm{e}}nolimits} ^{ – left( {frac{{{mathop{T}nolimits} + 4.66}}{{10.93}}} right)^{3.10}}} hfill & {{mathrm{else}}} hfill end{array}} right.,$$
(9)
$${mathrm{CU}}_{11} = left{ {begin{array}{*{20}{l}} 0 hfill & {{mathop{T}nolimits} le – 3.4,{mathop{rm{or}}nolimits} ,{mathop{T}nolimits} ge 10.4} hfill \ {frac{{{mathop{T}nolimits} + 3.4}}{{5 + 3.4}}} hfill & { – 3.4 < {mathop{rm{T}}nolimits} le 5} hfill \ {frac{{{mathop{T}nolimits} - 10.4}}{{5 - 10.4}}} hfill & {5 < {mathop{T}nolimits} < 10.4} hfill end{array}} right.,$$
(10) $${mathop{rm{CU}}nolimits} _{12} = left{ {begin{array}{*{20}{l}} 0 hfill & {{mathop{T}nolimits} le - 6.5,{mathop{rm{or}}nolimits} ,{mathop{T}nolimits} ge 6.9} hfill \ {frac{{{T + }6.5}}{{6.9 - 0.2}}} hfill & { - 6.5 < {mathop{T}nolimits} le 0.2} hfill \ {frac{{6.9 - {T}}}{{6.9 - 0.2}}} hfill & {0.2 < {T} < 6.9} hfill end{array}} right.,$$
(11) where CUi is the rate of chilling for Model Ci, and T is the daily mean temperature in °C.
Forcing models
Forcing models were used to measure HR for the spring events of plants. The GDD model is the most commonly used forcing model, which assumes that the rate of forcing is linearly correlated with temperature if the temperature is above a particular threshold. We mainly used Model F1 (Eq. (12)), which adopts a temperature threshold of 0 °C46,47,48, for examining the relationship between CA and HR: $${mathrm{FU}}_{mathrm{1}} = {mathrm{max}}({T},0),$$
(12) where FU1 is the rate of forcing for Model F1, and T is the daily mean temperature (°C).
We also validated the chilling models by correlating them with HR based on seven other forcing models to assess the impact of the choice of forcing model on the results. Model F2 (Eq. (13)) is also a GDD model but has a temperature threshold of 5 °C12,18: $${mathrm{FU}}_{mathrm{2}} = {mathrm{max}}(T - 5,0),$$
(13) where FU2 is the rate of forcing for Model F2, and T is the daily mean temperature (°C).
Piao et al.48 found that leaf onset in the Northern Hemisphere was triggered more by daytime than nighttime temperature. They thus proposed a GDD model using maximum instead of mean temperature. Models F3 (Eq. (14)) and F4 (Eq. (15)) are based on maximum temperature with thresholds of 0 and 5 °C, respectively. $${mathop{rm{FU}}nolimits} _3 = max ({mathop{T}nolimits} _{rm{max}},0),$$
(14) $${mathop{rm{FU}}nolimits} _4 = max ({mathop{T}nolimits} _{rm{max}} - 5,0),$$
(15) where FUi is the rate of forcing for Model Fi, and Tmax is the daily maximum temperature (°C).
A recent experiment demonstrated that the impact of daytime temperature on leaf unfolding for temperate trees was approximately threefold higher than the impact of nighttime temperature49. Model F5 (Eq. 16) thus uses two parameters (0.75 and 0.25) to weigh the impact of daytime and nighttime temperatures on HR. $${mathop{rm{FU}}nolimits} _5 = 0.75 times max ({mathop{T}nolimits} _{rm{max}} - 5,0){mathrm{ + }}0.25 times max ({mathop{T}nolimits} _{rm{min}} - 5,0),$$
(16) where FU5 is the rate of forcing for Model F5. Tmax and Tmin are the daily maximum and minimum temperatures (°C), respectively.
Many studies have suggested that the rate of forcing followed a logistic function of temperature44,50. Model F6 (Eq. (17)) uses a logistic function proposed by Hänninen44, and Model F7 (Eq. (18)) uses another logistic function proposed by Harrington et al.43. $${mathop{rm{FU}}nolimits} _6 = left{ {begin{array}{*{20}{l}} {frac{{28.4}}{{1 + {mathop{rm{e}}nolimits} ^{ - 0.185({mathop{T}nolimits} - 18.5)}}}} hfill & {{mathop{T}nolimits} > 0} hfill \ 0 hfill & {{mathop{rm{else}}nolimits} } hfill end{array}} right.,$$
(17)
$${mathop{rm{FU}}nolimits} _7 = frac{1}{{1 + {mathop{rm{e}}nolimits} ^{ – 0.47{mathop{T}nolimits} {mathrm{ + 6}}{mathrm{.49}}}}},$$
(18)
where FUi is the rate of forcing for Model Fi, and T is the daily mean temperature (°C).
Model F8 is a growing degree hour (GDH) model, where species have an optimum temperature for growth and where temperatures above or below that optimum have a smaller impact51. Model F8 (Eq. (19)) was first designed for calculating HR at hourly intervals, but we applied it at a daily interval. The stress factor in the original GDH model was ignored, because we assumed that the plants were not under other stresses.
$${mathrm{FU}}_{mathrm{8}}{mathrm{ = }}left{ {begin{array}{*{20}{l}} {mathrm{0}} hfill & {{T < }T_{mathrm{L}},{mathrm{or}},{T > }T_{mathrm{c}}} hfill \ {frac{{T_{mathrm{u}} – T_{mathrm{L}}}}{{mathrm{2}}}left( {{mathrm{1 + cos}}left( {pi {mathrm{ + }}pi frac{{{T} – T_{mathrm{L}}}}{{T_{mathrm{u}} – T_{mathrm{L}}}}} right)} right)} hfill & {T_{mathrm{L}} ge {T} ge T_{mathrm{u}}} hfill \ {left( {T_{mathrm{u}} – T_{mathrm{L}}} right)left( {{mathrm{1 + cos}}left( {frac{{uppi }}{{mathrm{2}}}{mathrm{ + }}frac{{uppi }}{{mathrm{2}}}frac{{{T} – T_{mathrm{u}}}}{{T_{mathrm{c}} – T_{mathrm{u}}}}} right)} right)} hfill & {T_{mathrm{u}}{ < T} le T_{mathrm{c}}} hfill end{array}} right.,$$ (19) where FU8 is the rate of forcing for Model F8, T is the daily mean temperature (°C), TL = 4, Tu = 25, and Tc = 36. Analysis We assessed the ability of each chilling model to represent long-term trends in the chilling conditions by calculating CA using each chilling model for each station for 1951–2018. CA was calculated as the sum of CUi from 1 November in the previous year to 30 April. The trend of CA at each station was visualized as the slope of the linear regression of CA against year. We also calculated Pearson’s r between each pair of chilling models for each station to determine if the chilling models were interrelated. We calculated HR and CA of spring events for each species, station, and year to determine if the chilling models are consistent with the physiological assumption that the reduction in chilling would increase HR (Fig. 1). HR was calculated as the sum of FUi from 1 January to the date of onset of spring events using Model F1, and the performances of the other forcing models (F2–F8) were also tested. We also compared 1 January with the other two starting dates of temperature accumulation (15 January and 1 February) to test any potential difference causing by the date when temperature accumulation begins. CA was calculated as the sum of CUi from 1 November in the previous year to the date of onset of spring events. We chose 1 November as the start date for CA because the endodormancy of temperate trees began around 1 November52. We only tested the linear relationship because the data were better fitted by the linear regression than the exponential model (Fig. 3), even though CA was linearly or nonlinearly negatively correlated with HR17. Pearson’s r between CA and HR for all records was calculated for each species, with a significantly negative Pearson’s r (p More