# Incorporating the field border effect to reduce the predicted uncertainty of pollen dispersal model in Asia

### Dispersal models

In this study, the dispersal model consists of two parts: a kernel and an observation model (Fig. 1). The kernel estimates the proportion of pollen dispersed from location s′ to location s and is used to calculate the expected number of CP grains. The observation model takes the expected number of CP grains as a parameter and describes the number of CP grains at location s ($Y_s$) by a specific distribution:

$$Y_{s}\sim f\left(y_{s}\mid \boldsymbol{\theta}_{s}\right), \tag{1}$$

where f is the probability density function (PDF) of the specific distribution and $\boldsymbol{\theta}_s$ is its parameter vector. This study constructed eight dispersal models by combining two observation models, two kernels, and two conditions of the field border (FB) effect (Table 1). The kernels and observation models are described in the following subsections.

### Kernels

The kernel gives the probability that pollen emitted at location s′ lands at location s. It can be expressed as γ(s, s′), where s′ is the source location closest to location s. Numerous kernels have been used to describe various dispersal phenomena24. The output of the kernel represents the donor pollen density at location s. To calculate the expected number of CP grains, the donor pollen density is multiplied by the average total grain number:

$$\lambda_{s}=K\times \gamma\left(s,s^{\prime}\right), \tag{2}$$

where $\lambda_s$ and K are the expected number of CP grains at location s and the average number of grains per cob, respectively. The FB effect was introduced into the kernel to suit the small-scale farming systems of Asia. This study assumed that the pollen density at the first recipient row decreases exponentially with the width of the FB25,26. To evaluate the improvement gained from the FB effect, kernels without the FB effect were also established.

The compound exponential kernel ($\gamma_{\text{Expo}}$) was used in a previous pollen dispersal study27. Our study introduced the FB effect into this kernel, so the compound exponential kernel can be expressed as follows:

$$\gamma_{\text{Expo}}\left(s,s^{\prime}\right)=\begin{cases} K_{e}\exp\left(-a_{1}d^{*}\left(s,s^{\prime}\right)\right)\exp\left(-k\sqrt{FB}\right), & \text{if } d^{*}\left(s,s^{\prime}\right)\le D \\ K_{e}\exp\left(-a_{1}D-a_{2}\left(d^{*}\left(s,s^{\prime}\right)-D\right)\right)\exp\left(-k\sqrt{FB}\right), & \text{if } d^{*}\left(s,s^{\prime}\right)>D, \end{cases} \tag{3}$$

where $K_e$, $a_1$, $a_2$, k, and D are the parameters of the kernel, and d*(s, s′) is the shortest distance between locations s′ and s after the width of the FB has been subtracted. In the compound exponential kernel without the FB effect, the exponential FB term was removed and d*(s, s′) was replaced directly by the shortest distance between s′ and s.
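As a rough illustration of Eqs. (2) and (3), the following Python sketch evaluates the compound exponential kernel and the resulting expected CP grain count. The function names and all numeric values are hypothetical, chosen only for illustration. They are not the estimates fitted in this study.

```python
import math


def compound_exponential_kernel(d_star, fb_width, K_e, a1, a2, k, D):
    """Eq. (3): two-piece exponential decay in distance, scaled by the FB term."""
    if d_star <= D:
        distance_term = math.exp(-a1 * d_star)
    else:
        # Beyond D the decay rate switches from a1 to a2.
        distance_term = math.exp(-a1 * D - a2 * (d_star - D))
    fb_term = math.exp(-k * math.sqrt(fb_width))
    return K_e * distance_term * fb_term


def expected_cp_grains(kernel_value, grains_per_cob):
    """Eq. (2): lambda_s = K * gamma(s, s')."""
    return grains_per_cob * kernel_value


# Hypothetical parameter values (assumptions, not fitted estimates).
params = dict(K_e=0.4, a1=0.8, a2=0.05, k=0.3, D=3.0)
for d in (0.0, 1.5, 3.0, 10.0):
    gamma = compound_exponential_kernel(d, fb_width=4.0, **params)
    print(d, expected_cp_grains(gamma, grains_per_cob=400))
```

The two branches agree at d* = D, so the kernel is continuous at the change point. Setting `fb_width=0` and passing the unadjusted distance gives the variant without the FB effect.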

The second kernel applied in this study was the modified Cauchy kernel ($\gamma_{\text{Cauchy}}$), which is based on the PDF of the Cauchy distribution and the concept of a compound distribution. The modified Cauchy kernel is represented as follows:

$$\gamma_{\text{Cauchy}}\left(s,s^{\prime}\right)=\begin{cases} \dfrac{2\beta}{\pi\left[\beta^{2}+d^{*}\left(s,s^{\prime}\right)^{2}\right]}\exp\left(-k\sqrt{FB}\right), & \text{if } d^{*}\left(s,s^{\prime}\right)\le D \\ \dfrac{2\beta}{\pi\left[\beta^{2}+D^{2}+c_{1}\left(d^{*}\left(s,s^{\prime}\right)-D\right)^{2}\right]}\exp\left(-k\sqrt{FB}\right), & \text{if } d^{*}\left(s,s^{\prime}\right)>D, \end{cases} \tag{4}$$

where β indicates the decline rate of the curve, the parameters k and D are the same as in the compound exponential kernel, and $c_1$ controls the relatively slow decrease of pollen density at greater distances. Similarly, in the modified Cauchy kernel without the FB effect, the FB term was removed and d*(s, s′) was replaced directly by the shortest distance between s′ and s from which the row spacing (0.75 m) had been subtracted.
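Eq. (4) can be sketched in the same way. The Python function below is a minimal transcription of the modified Cauchy kernel. Its name and the numeric values in the example are hypothetical and are not taken from the fitted models.

```python
import math


def modified_cauchy_kernel(d_star, fb_width, beta, c1, k, D):
    """Eq. (4): Cauchy-type decay with a modified tail beyond D, scaled by the FB term."""
    if d_star <= D:
        denominator = beta ** 2 + d_star ** 2
    else:
        # Beyond D, only the excess distance grows the denominator, weighted by c1.
        denominator = beta ** 2 + D ** 2 + c1 * (d_star - D) ** 2
    fb_term = math.exp(-k * math.sqrt(fb_width))
    return 2.0 * beta / (math.pi * denominator) * fb_term


# Hypothetical values (assumptions): a small c1 gives a heavier tail than c1 = 1.
heavy = modified_cauchy_kernel(20.0, fb_width=4.0, beta=1.0, c1=0.1, k=0.3, D=3.0)
light = modified_cauchy_kernel(20.0, fb_width=4.0, beta=1.0, c1=1.0, k=0.3, D=3.0)
print(heavy, light)
```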

### Observation models

Because of the high proportion of zero-valued observations, the present study assumed that the CP grain count followed a zero-inflated Poisson (ZIP) distribution to account for the zero-excess condition28. The ZIP distribution was first proposed by Lambert29, and several studies have applied it to CP data27,30. The ZIP distribution is a mixture of a Dirac distribution at zero and a Poisson distribution. Therefore, the distribution of the CP grain count at location s ($Y_s$) can be expressed as follows:

$$Y_{s}\sim \mathrm{ZIP}\left(1-q_{s},\lambda_{s}\right), \tag{5}$$

where $q_s$ is the probability that an observation follows the Poisson distribution, and $\lambda_s$ is the Poisson parameter calculated by Eq. (2). The parameter $q_s$ is assumed to depend on the shortest distance between the recipient and donor plants. The border effect is also included in the estimation of $q_s$ because it is related to the distance effect. The relationship among distance, border, and $q_s$ is described by the following logistic function:

$$q_{s}=\frac{1}{1+\exp\left(b_{1}-b_{2}d^{*}\left(s,s^{\prime}\right)\right)}, \tag{6}$$

where $b_1$ and $b_2$ are the parameters of the logistic function. In the dispersal models without the FB effect, d*(s, s′) was the shortest distance between s′ and s. The Poisson distribution was also used as an observation model for comparison with the ZIP observation model.
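To make the ZIP observation model concrete, the sketch below writes out the probability mass function implied by Eq. (5), together with the logistic link of Eq. (6). It assumes the usual mixture form, in which a zero arises either from the point mass at zero (probability 1 − q) or from the Poisson component. The function names and example values are hypothetical.

```python
import math


def poisson_membership_probability(d_star, b1, b2):
    """Eq. (6): logistic function giving q_s from the adjusted distance d*."""
    return 1.0 / (1.0 + math.exp(b1 - b2 * d_star))


def zip_pmf(y, q, lam):
    """P(Y = y) for a ZIP variable: point mass at zero with weight (1 - q),
    Poisson(lam) with weight q."""
    poisson = math.exp(-lam) * lam ** y / math.factorial(y)
    if y == 0:
        return (1.0 - q) + q * poisson
    return q * poisson


# Hypothetical values (assumptions, not fitted estimates).
q = poisson_membership_probability(d_star=2.0, b1=0.5, b2=-0.3)
print(q, zip_pmf(0, q, lam=1.2), zip_pmf(3, q, lam=1.2))
```

Setting `q = 1` recovers the plain Poisson observation model used for comparison.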

### Experimental and meteorological data collection

The pollen dispersal data were collected from experiments performed in 2009 and 2010 at 23° 47′ N, 120° 26′ E, at an altitude of 20 m. The experiments were coded as 2009-1, 2009-2, and 2010-1. Experiment 2009-2 was divided into 2009-2A (without the FB) and 2009-2B (with the FB). The field layouts differed among experiments in order to investigate the effect of the FB. Two commercial glutinous maize varieties, black pearl (purple grain) and Tainan No. 23 (white grain), were selected as the pollen donor and pollen recipient, respectively. The distance between plants within a row was 25 cm, and the distance between rows was 75 cm. The recipient plots consisted of 82 and 91 rows in the 2009 and 2010 experiments, respectively.

The CP rate was determined from the differences in grain color on recipient cobs that result from the xenia effect31. In the sampling framework, the whole field was divided into grids, and corn samples were collected from each grid. The CP rate of each grid was calculated using the method presented in a previous study32 and defined as:

$$\mathrm{CP}\left(\%\right)=\left[\sum_{i=1}^{n}\mathrm{Cob}_{i}/\left(n\times K\right)\right], \tag{7}$$

where $\mathrm{Cob}_i$ and n indicate the ith cob and the total number of cobs in the grid, respectively, and K is the average grain number per cob. Meteorological data were collected from the meteorological station at 23° 35′ N, 120° 27′ E, at an altitude of 20 m. The detailed experimental setup was described in our previous study33. The study complies with relevant institutional, national, and international guidelines and legislation.
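A small Python sketch of the ratio in Eq. (7) follows. It assumes that the term for the ith cob is the number of CP grains counted on that cob, which is an interpretation rather than something stated explicitly above. The function returns the ratio as written; expressing it as a percentage would require multiplying by 100, which is also an assumption. The function name and the counts are hypothetical.

```python
def cp_rate(cp_grains_per_cob, avg_grains_per_cob):
    """Ratio of Eq. (7): total CP grains over (number of cobs x K).

    Assumption: each list entry is the CP grain count on one sampled cob.
    """
    n = len(cp_grains_per_cob)
    if n == 0:
        raise ValueError("a grid needs at least one sampled cob")
    return sum(cp_grains_per_cob) / (n * avg_grains_per_cob)


# Hypothetical grid with three cobs and K = 400 grains per cob.
print(cp_rate([4, 0, 8], avg_grains_per_cob=400))
```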

### Statistical analyses

All statistical analyses were performed using SAS (Statistical Analysis System, version 9.4). The dispersal model parameters were estimated by two methods. First, nonlinear model estimation was conducted with PROC NLMIXED to evaluate the fitting and predictive abilities of the dispersal models. Then, the dispersal models with the better-fitting observation model were re-estimated by the Bayesian method with PROC MCMC to assess the uncertainty. In the Bayesian method, noninformative prior distributions were used for all parameters (Supplementary Table S1). The Markov chain was run for 500,000 iterations, with the burn-in set to 450,000 iterations. To reduce autocorrelation in the chain, the thinning interval was set to 25.

Threefold cross-validation was used to validate the results of both estimation methods. The data from the three experiments were combined and randomly partitioned into three sub-datasets. To avoid heterogeneity among sub-datasets in field design and distance, observations from the same field design and the same distance were treated as a group, and each group was partitioned into three parts. Each sub-dataset contained one part of every group. In each validation run, two sub-datasets were selected as the training set, and the remaining one was used for validation.
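The grouped partitioning described above can be sketched in Python as follows. This is an illustrative reconstruction rather than the SAS procedure used in the study. The tuple layout of an observation, the function names, and the random seed are all assumptions.

```python
import random
from collections import defaultdict


def stratified_folds(observations, group_key, n_folds=3, seed=0):
    """Split each (design, distance) group across the folds so that every
    fold receives a share of every group."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for obs in observations:
        groups[group_key(obs)].append(obs)

    folds = [[] for _ in range(n_folds)]
    for key in sorted(groups):
        members = groups[key][:]
        rng.shuffle(members)
        # A random offset avoids always giving leftovers to the first fold.
        offset = rng.randrange(n_folds)
        for i, obs in enumerate(members):
            folds[(i + offset) % n_folds].append(obs)
    return folds


def train_validation_splits(folds):
    """Yield (training, validation) pairs, holding out one fold per run."""
    for held_out in range(len(folds)):
        training = [obs for j, fold in enumerate(folds) if j != held_out for obs in fold]
        yield training, folds[held_out]


# Hypothetical observations: (field design, distance in m, observation id).
data = [(design, dist, i) for design in ("2009-1", "2010-1")
        for dist in (0.75, 1.5, 3.0) for i in range(6)]
folds = stratified_folds(data, group_key=lambda o: (o[0], o[1]))
for training, validation in train_validation_splits(folds):
    print(len(training), len(validation))
```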

The fitting ability of the dispersal models was evaluated based on three criteria, namely, the Akaike information criterion (AIC), the deviance, and the coefficient of determination (R²). Smaller values of AIC or deviance indicate a better fit, whereas a higher R² value indicates a better fit. The correlation coefficient (r) between the predicted and actual CP rates was used to assess the predictive ability. The deviance information criterion (DIC) was used to evaluate the fit of the dispersal models under Bayesian estimation. The criterion values calculated from the three training and validation sets were averaged to assess the overall results. The uncertainty of each model parameter was quantified by the standard deviation (SD) of its posterior distribution. To assess the predictive uncertainty, 95% credible intervals were constructed from the 2.5th and 97.5th percentiles of 200,000 samples generated from the posterior predictive distribution. Furthermore, to assess the zero-excess condition, the percentage of observed zero CP grain events was compared with the Poisson probability of a zero CP grain event. A zero-excess condition occurred if the observed percentage was higher than the Poisson probability34.
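The zero-excess check can be illustrated with a short Python sketch. It assumes that the Poisson mean is taken to be the sample mean of the counts, so that the Poisson probability of a zero is exp(−mean). The text above does not specify how the Poisson parameter was obtained, so this choice is an assumption. The function name and the counts are hypothetical.

```python
import math


def zero_excess_check(counts):
    """Compare the observed share of zeros with the Poisson probability of zero.

    Assumption: the Poisson mean is estimated by the sample mean of the counts.
    """
    observed_zero_share = sum(1 for c in counts if c == 0) / len(counts)
    poisson_zero_probability = math.exp(-sum(counts) / len(counts))
    return (observed_zero_share,
            poisson_zero_probability,
            observed_zero_share > poisson_zero_probability)


# Hypothetical CP grain counts with many zeros.
print(zero_excess_check([0, 0, 0, 0, 0, 0, 0, 0, 5, 5]))
```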

Source: Ecology - nature.com