Steering ecological-evolutionary dynamics to improve artificial selection of microbial communities
Calculating landscape, attractor, and restrictorIn this work, we considered communities with commensal, mutualistic, and exploitative interactions. Below, we describe the differential equations for each type of interaction, and how we calculate the corresponding community function landscape, species-composition attractor, and Newborn restrictor.Commensal H–M community: The model community for most simulations is the same commensal H–M community used in our previous work15. The community function landscape plots P(T) as a function of ϕM(0) and ({overline{f}}_{P}(0)). Assume that a Newborn community has 100 biomass units, that all cells have the same genotype (all M cells have the same ({f}_{P}={overline{f}}_{P}(0))), that death and birth processes are deterministic, and that there is no mutation. P(T) can then be numerically integrated from the following set of scaled differential equations for any given pair of ϕM(0) and ({overline{f}}_{P}(0))15:$$frac{dR}{dt}=-{c}_{{RM}}{g}_{M}M-{c}_{{RH}}{g}_{H}H$$
(1)
$$frac{dB}{dt}={g}_{H}H-{c}_{{BM}}{g}_{M}M$$
(2)
$$frac{dP}{dt}={f}_{P}{g}_{M}M$$
(3)
$$frac{dH}{dt}={g}_{H}H-{delta }_{H}H$$
(4)
$$frac{dM}{dt}={g}_{M}left(1-{f}_{P}right)M-{delta }_{M}M$$
(5)
where$${g}_{H}(R)={g}_{{Hmax}}frac{R}{R+{K}_{{HR}}}$$
(6)
$${g}_{M}(R, B)={g}_{{Mmax}}frac{{R}_{M}{B}_{M}}{{R}_{M}+{B}_{M}}left(frac{1}{{R}_{M}+1}+frac{1}{{B}_{M}+1}right)$$
(7)
and RM = R/KMR and BM = B/KMB. Unless otherwise specified, landscapes in this paper are obtained by integrating Equations (1–5) from t = 0 to t = 17.Equation (1) states that Resource R is depleted by biomass growth of M and H, where cRM and cRH represent the amount of R consumed per unit of M and H biomass, respectively. Equation (2) states that Byproduct B is released as H grows, and is decreased by biomass growth of M due to consumption (cBM amount of B per unit of M biomass). Equation (3) states that Product P is produced as fP fraction of potential M growth. Equation (4) states that H biomass increases at a rate dependent on Resource R in a Monod fashion (Equation (6)) and decreases at the death rate δH. Note that Agricultural waste is not a state variable here as it is present in excess. Equation (5) states that M biomass increases at a rate dependent on Resource R and Byproduct B according to the Mankad and Bungay model (Equation (7)51) discounted by (1 − fP) due to the fitness cost of making Product, and decreases at the death rate δM. In the Monod growth model (Equation (6)), gHmax is the maximal growth rate of H and KHR is the R at which gHmax/2 is achieved. In the Mankad and Bungay model (Equation (7)), KMR is the R at which gMmax/2 is achieved when B is in excess; KMB is the B at which gMmax/2 is achieved when R is in excess.Mutualistic H–M community: If Byproduct is harmful for H, then the community is mutualistic: H and M promote the growth of each other. Such a mutualistic community can still be described by Equations (1–5) and (7), but Equation (6) is replaced with$${g}_{H}(R)={g}_{{Hmax}}frac{R}{R+{K}_{{HR}}}exp left(-frac{B}{{B}_{0}}right)$$
(8)
where larger B0 indicates lower sensitivity, or higher resistance of H to its Byproduct B.Exploitative H–M community: If M releases an antagonistic byproduct A that inhibits the growth of H, then the interaction is exploitative: H promotes the growth of M, but M inhibits the growth of H. Besides Eqs (1–5) and (7), we then need to add an equation that describes the dynamics of A$$frac{dwidetilde{A}}{dt}={r}_{A}{g}_{M}left(1-{f}_{P}right)M$$where rA is the amount of A released when M’s biomass grows by 1 unit. We can then normalize (widetilde{A}) with rA$$A=widetilde{A}/{r}_{A}$$so that$$frac{dA}{dt}={g}_{M}left(1-{f}_{P}right)M.$$
(9)
We also need to modify the growth rates for H:$${g}_{H}={g}_{H}(R)={g}_{{Hmax}}frac{R}{R+{K}_{{HR}}}frac{{A}_{0}}{A+{A}_{0}}$$
(10)
where larger A0 indicates lower sensitivity, or higher resistance of H to M’s Antagonistic by product A.To calculate the community function landscape, species attractor, and Newborn restrictor, all phenotype parameters, except ({overline{f}}_{P}(0)) take the value from the Bounds column in Table 1. To construct the landscape such as in Fig. 2c, we calculated P(T) for every grid point on a 2D quadrilateral mesh of 10−2 ≤ ϕM(0) ≤ 0.99 and (1{0}^{-2} le {overline{f}}_{P}(0) le 0.99) with a mesh size of ΔϕM(0) = 10−2 and ({{Delta }}{overline{f}}_{P}(0)=1{0}^{-2}). To construct the landscapes in Fig. 5b(ii) and b(iii), P(T) was similarly calculated on a 2D grid with a finer mesh of ΔϕM(0) = 5 × 10−3 and ({{Delta }}{overline{f}}_{P}(0)=1{0}^{-4}).To calculate the species composition attractor, we integrated Equations (1–5) to obtain ϕM(T) − ϕM(0) for each grid point on the 2D mesh of ϕM(0) and ({overline{f}}_{P}(0)). The contour of ϕM(T) − ϕM(0) = 0 is then the species attractor (blue dashed curve in Fig. 2b).The attractor-induced Newborn restrictor at a given ({overline{f}}_{P}(0)) is calculated from its definition: if ϕM(0) of a parent Newborn is on the restrictor, then so is the average ϕM(0) among its offspring Newborns. Under no spiking, since the average ϕM(0) among offspring Newborn is the same as ϕM(T) of their parent Adult, the Newborn restrictor coincides with the species attractor (Fig. 3b and Fig. 5b ii). Under x% H spiking, x% of the biomass in Newborns is replaced with H cells. Thus if the parent Adult’s fraction of M biomass is ϕM(T), the average ϕM(0) among its offspring Newborns is (1 − x%)ϕM(T) under x% H spiking. The Newborn restrictor therefore is the contour of (1 − x%)ϕM(T) − ϕM(0) = 0 (teal curve in Fig. 5a ii and b iii, Fig. 2d ii). Compared with the orange restrictor under no spiking, the teal restrictor is shifted down.Parameter choicesDetails justifying our parameter choices are given in the Methods section of our previous work15. Briefly, our parameter choices are based on experimental measurements of microorganisms (e.g., S. cerevisiae and E. coli). To ensure the coexistence of H and M, M must grow faster than H for part of the maturation cycle since M has to wait for H’s Byproduct at the beginning of a cycle. Because we have assumed M and H to have similar affinities for Resource (Table 1), the maximal growth rate of M (gMmax) must exceed the maximal growth rate of H (gHmax), and M’s affinity for Byproduct (1/KMB) must be sufficiently large. Moreover, metabolite release and consumption need to be balanced to avoid extreme species ratios. We assume that H and M consume the same amount of Resource per new cell (cRH = cRM) since the biomass of various microbes shares similar elemental (e.g., carbon or nitrogen) compositions. We set consumption value so that the input Resource can support a maximum of 104 total biomass. The evolutionary bounds are set, such that evolved H and M could coexist for fp 0, the number of H cells supplemented to the Newborn community is the nearest integer to (B{M}_{{{{{{{{rm{target}}}}}}}}}{varphi }_{S}{L}_{H}^{-1}). Because integer number of cells is assigned to each Newborn, the total biomass might not be exactly BMtarget but within a small deviation of ~2 biomass units.To mimic reproducing through pipetting, each M and H cell in an Adult community is assigned a random integer between 1 and dilution factor nD (Equation (12)). All cells assigned with the same random integer are then dealt to the same Newborn, generating nD Newborn communities. If φS > 0, the number of H cells supplemented into each Newborn is a random number drawn from a Poisson distribution of a mean of (B{M}_{{{{{{{{rm{target}}}}}}}}}{varphi }_{S}{L}_{H}^{-1}).To mimic reproducing through cell sorting, each Newborn receives a biomass of (B{M}_{{{{{{{{rm{target}}}}}}}}}left(1-{varphi }_{S}right)) from its parent Adult. Suppose that the fraction of M biomass in the parent Adult is ϕM(T), then M cells from the parent Adult are randomly assigned to the Newborn, until the total biomass of M comes closest to (B{M}_{{{{{{{{rm{target}}}}}}}}}{phi }_{M}(T)left(1-{varphi }_{S}right)) without exceeding it. H cells with a total biomass of (B{M}_{{{{{{{{rm{target}}}}}}}}}left(1-{phi }_{M}(T)right)left(1-{varphi }_{S}right)) are assigned similarly. If φS > 0, the number of H cells supplemented to the Newborn community is the nearest integer to (B{M}_{{{{{{{{rm{target}}}}}}}}}{varphi }_{S}{L}_{H}^{-1}) where LH is the biomass of individual H cell in the parent Adult. Because each of M and H cells had a length between 1 and 2, the actual biomass of M and H assigned to a Newborn could vary from the target by up to 2 biomass units. Consequently, deviations of BM(0) from BMtarget and of ϕM(0) from parent Adult’s ϕM(T) are only a few percent.Simulating species spiking when both H and M cells evolveIn the more complex scenario, both H and M evolve. We thus need to spike with evolved H and M clones. Additionally, Newborns are spiked with H or M clones from their own lineage as demonstrated in Supplementary Fig. 11a. Below, we describe the simulation code for the experimental procedure (Supplementary Fig. 11a) we simulated.In all simulations where 6 or 7 phenotypes are modified by mutations, chosen Adults are reproduced through pipetting in a similar fashion as described above. After Newborns are reproduced from a chosen Adult in Cycle C − 1, a preset number of H or M cells are randomly picked from the remaining of this Adult to form H or M-spiking mix for Cycle C. At the end of Cycle C, we choose 10 Adults with the highest functions. Assuming that each chosen Adult is reproduced through pipetting with φS-H-spiking strategy, a Newborn receives on average a biomass of (B{M}_{{{{{{{{rm{target}}}}}}}}}left(1-{varphi }_{S}right)) from its parent Adult community and on average a biomass of BMtargetφS from H spiking mix generated at the end of Cycle C − 1. Since each chosen Adult usually gives rise to 10 Newborns, the number of cells distributed from the chosen Adult to each Newborn is drawn from a multinomial distribution. Specifically, denote the integer random numbers of cells that would be assigned to 10 Newborns to be {x1, x2,…, x10}. If the chosen Adult has a total biomass of BM(T) composed of IM M cells and IH H cells (both IM and IH are integers), the probability that {x1, x2,…, x10} cells are assigned to 10 Newborns, respectively, and x11 cells remain, is$$Pr left({{x}_{1},{x}_{2},…,{x}_{10},{x}_{11}}right)=frac{({I}_{H}+{I}_{M})!}{{x}_{1}!cdots {x}_{10}!{x}_{11}!},{p}_{0}{{,}^{{x}_{1}+cdots +{x}_{10}}},{p}_{11}^{{x}_{11}}.$$Here, ({p}_{0}=B{M}_{{{{{{{{rm{target}}}}}}}}}left(1-{varphi }_{S}right)/BM(T)) is the probability that a cell is assigned to one of 10 Newborns, p11 = 1 − 10p0 is the probability that a cell is not assigned to Newborns. Thus, ({x}_{11}={I}_{H}+{I}_{M}-mathop{sum }nolimits_{i = 1}^{10}{x}_{i}) is the number of cells remaining after reproduction, from which H and M cells are randomly picked to generate the spiking mix for Cycle C + 1.Suppose that the current spiking strategy is φS-H, then these 10 Newborns are spiked with H-spiking mix generated in Cycle C − 1. An average of BMtargetφS of H biomass is spiked into each Newborn so that the total biomass of Newborns is on average BMtarget. Suppose that five H cells from the parent Adult’s lineage are randomly picked at the end of Cycle C − 1, and that they have biomass {LH1, LH2, LH3, LH4, LH5}, respectively. The total number of H cells assigned to each Newborn, xH, is then randomly drawn from a Poisson distribution with a mean of (B{M}_{{{{{{{{rm{target}}}}}}}}}{varphi }_{S}/{overline{L}}_{H}), where ({overline{L}}_{H}=frac{1}{5}mathop{sum }nolimits_{j = 1}^{5}{L}_{Hj}) is the average biomass of the five H cells. Each spiked H cell has an equal chance of being one of the five cells.Updating spiking percentage based on heritability checksWhen the community function landscape is unknown, we can estimate heritability of community function under different spiking percentages through parent–offspring regression. In most simulations (e.g., Fig. 7), heritability evaluation is carried out about every 100 cycles (“periodic heritability check”). In the simulations demonstrated in Supplementary Fig. 17, the average improvement rate in community function is estimated from the chosen Adults over the last 50 cycles. Heritability evaluation is carried out when this average improvement rate becomes negative (“adaptive heritability check”). For both periodic and adaptive checks, heritability evaluation can be postponed until within-community selection improves cell growth sufficiently to provide sufficient biomass for heritability check.During one round of heritability evaluation, heritability of community function is estimated through parent–offspring community function regression under all candidate spiking strategies (Supplementary Fig. 11b). The current spiking strategy is updated if an alternative spiking strategy confers significantly higher community function heritability.To evaluate heritability under one spiking strategy, up to 100 Newborn communities are generated under this spiking strategy. After these mature into Adults, their functions are the parent functions. Each Adult parent then gives rise to six Newborn offspring under the same spiking strategy. When the six Newborn offspring mature into Adults, the median of their functions is the average offspring function. When offspring functions are plotted against their parent functions, the slope of the least-squares linear regression (green dashed line in Supplementary Fig. 11b) quantifies the heritability of community function. Heritability of a community function is thus similar to heritability of an individual trait, except that we use median instead of mean of offspring functions, because median is less sensitive to outliers. The 95% confidence interval of heritability is then estimated by nonparametric bootstrap58,59. More specifically, first, 100 pairs of parent–offspring community functions are resampled with replacement. Second, heritability is calculated with the resampled data. Third, 1000 heritabilities are calculated from 1000 independent resamplings, from which the 95% confidence interval is estimated from the 5th and 95th percentile.An alternative spiking strategy is considered significantly more advantageous than the current spiking strategy if heritability of the alternative spiking strategy is higher than the right endpoint of the 95% confidence interval of the heritability of the current spiking strategy. If more than one alternative spiking strategies are more advantageous, the one with the highest heritability is implemented to replace the current strategy. Similarly, an alternative spiking strategy is considered more disadvantageous if heritability of the alternative spiking strategy is lower than the left endpoint of the 95% confidence interval of the heritability of the current spiking strategy. When implementing random spiking strategy, the current spiking strategy is updated with a strategy randomly picked from candidate spiking strategies.Simulating community selection with large population sizeWhen the population size of each community is scaled up by 10 or 100 times (Supplementary Figs. 2 and 18b), the simulation codes described above become inefficient. Instead of tracking the biomass and phenotype of each cell in a large population, we divide the cells into categories and track the number of cells from different categories, where a category is defined by a unique combination of cell biomass and phenotype ranges. In our simulations, the biomass of each cell ranges between 1 and 2, fP of each M cell ranges between 0 and 1. Since H cells do not mutate, H cells are divided into 100 categories. H cells that belong to category i have a biomass between [1 + (i − 1) × ΔL, 1 + i × ΔL] where ΔL = 10−2. Since only fP of M cells are modified by mutations, M cells are divided into 100 × 105 categories. M cells that belong to category (i, j) have a biomass between [1 + (i − 1) × ΔL, 1 + i × ΔL] and fP between [(j − 1) × ΔfP, j × ΔfP] where ΔfP = 10−5. Every time fP of a M cell is modified by mutations, this cell jumps from the current category to a new category determined by its new fP value.Similar to simulations with small population sizes, each selection cycle starts with ntot = 100 Newborn communities. Maturation time T is divided into time steps of length Δτ = 0.05. Over each time step, the growth in cell biomass and the changes in metabolites are simulated in a similar fashion as described above. At the end of each time step, the number of cells to die or to mutate in each category is drawn from a bionomial distribution. If fP of a M cell is modified by mutation, the mutation effect is drawn from the same distribution as described above: (frac{1}{2}) of mutations reduce fP to 0 and the other (frac{1}{2}) is randomly drawn from the distribution in Equation (11).At the end of a maturation cycle, top 10 Adults with the highest functions are chosen. Each then reproduces 10 Newborns via pipetting for the next cycle. The fold of dilution is similarly adjusted, so that the average of Newborn total biomass is BMtarget over all selection cycles. From each category of a chosen Adult, the number of cells assigned to a Newborn community is randomly drawn from a multinomial distribution.Reporting summaryFurther information on research design is available in the Nature Research Reporting Summary linked to this article. More