Shipped and shifted: modeling collection-induced bias in microbiome multi-omics using a tractable fermentation system


Abstract

Large-scale, decentralized microbiome sampling surveys and citizen science initiatives often require periods of storage at ambient temperature, potentially altering sample composition during collection and transport. We developed a generalizable framework to quantify and model these biases using sourdough as a tractable fermentation system, with samples subjected to controlled storage conditions (4 °C, 17 °C, and 30 °C, sampled regularly for up to 28 days). Machine-learning models paired with multi-omics profiling (microbiome profiling, targeted and untargeted metabolome profiling, and cultivation) revealed temperature-dependent shifts in bacterial community structure and metabolic profiles, while fungal communities remained stable. Storage induced ecological restructuring, marked by reduced network modularity and increased centrality of dominant taxa at higher temperatures. Notably, storage duration and temperature were strongly encoded in the multi-omics data, with temperature exerting a more pronounced influence than time. Twenty-four of the top 25 predictors of storage condition were metabolites, underscoring functional layers as both sensitive to and informative of environmental exposure. These findings demonstrate that even short-term ambient storage (<2 days) can substantially reshape microbiome, metabolome, and biochemical profiles, posing risks to data comparability in decentralized studies and emphasizing the need to recognize and address such biases. Critically, the high predictability of storage history offers a path toward bias detection and correction, particularly when standardized collection protocols are infeasible, as is common in decentralized sampling contexts. Our approach enables robust quantification and modeling of such storage effects across multi-omics datasets, unlocking more accurate interpretation of large-scale microbiome surveys.
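To make the term "network modularity" concrete, the following minimal sketch computes Newman's modularity Q for a fixed partition of a small undirected graph. It is an illustration only and not the authors' analysis pipeline: the function name `modularity`, the two toy graphs, and the community labels are all hypothetical. The first graph has two tightly knit groups joined by a single edge; the second is dominated by one highly connected hub node, loosely analogous to a dominant taxon gaining centrality.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman modularity Q of an undirected, unweighted graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    community: dict mapping every node to a community label.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("graph has no edges")
    intra = defaultdict(int)   # edges with both endpoints in one community
    degree = defaultdict(int)  # summed node degree per community
    for u, v in edges:
        degree[community[u]] += 1
        degree[community[v]] += 1
        if community[u] == community[v]:
            intra[community[u]] += 1
    # Q = sum over communities of (fraction of edges inside the community)
    #     minus (expected fraction under random rewiring with fixed degrees)
    return sum(intra[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


# Hypothetical toy graph 1: two triangles joined by one bridging edge.
modular_edges = [("a", "b"), ("b", "c"), ("a", "c"),
                 ("d", "e"), ("e", "f"), ("d", "f"),
                 ("c", "d")]
modular_labels = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}

# Hypothetical toy graph 2: a single hub connected to every other node.
hub_edges = [("h", "a"), ("h", "b"), ("h", "c"), ("h", "d"), ("h", "e")]
hub_labels = {"h": 0, "a": 0, "b": 0, "c": 1, "d": 1, "e": 1}

q_modular = modularity(modular_edges, modular_labels)
q_hub = modularity(hub_edges, hub_labels)
print(round(q_modular, 3), round(q_hub, 3))
```

Under these assumptions the two-triangle graph scores Q = 5/14 (about 0.36), whereas the hub-dominated graph scores Q = -0.18, showing how a network centered on one dominant node can lose modular structure.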

Data availability

All sequence data have been deposited in EBI-ENA under accession numbers PRJEB94514 (16S) and PRJEB94515 (ITS). Source data (metadata) and processed HPLC and FIA-MS data have been deposited together with all code notebooks on GitHub (see Code availability).

Code availability

All code notebooks for bioinformatic processing, statistical analyses, and machine-learning models have been deposited and are openly accessible on GitHub at https://github.com/bokulich-publications/shipped-and-shifted.

Acknowledgements

The authors acknowledge financial support from the project HealthFerm, which is funded by the European Union under the Horizon Europe grant agreement No. 101060247 and by the Swiss State Secretariat for Education, Research and Innovation (SERI) under contract No. 22.00210. Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union nor European Research Executive Agency (REA). Neither the European Union nor REA can be held responsible for them. The authors thank Luisa Ferreira for support in data collection and the Genomic Diversity Center of ETH Zürich for their support with amplicon library preparation. The microbiome amplicon sequencing was performed at the Functional Genomics Center Zurich of University of Zurich and ETH Zurich.

Funding

Open access funding provided by Swiss Federal Institute of Technology Zurich.

Author information

Contributions

Annina R. Meyer: Conceptualization, Formal analysis, Investigation, Visualization, Methodology, Writing – original draft, Writing – review and editing. Jan P. Tan: Formal analysis, Investigation, Methodology, Writing – review and editing. Mihnea P. Mihaila and Michelle Neugebauer: Investigation, Writing – review and editing. Laura Nyström: Resources, Supervision, Funding acquisition, Writing – review and editing. Nicholas A. Bokulich: Conceptualization, Resources, Supervision, Funding acquisition, Writing – original draft, Writing – review and editing.

Corresponding author

Correspondence to Nicholas A. Bokulich.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Meyer, A.R., Tan, J.P., Mihaila, M.P. et al. Shipped and shifted: modeling collection-induced bias in microbiome multi-omics using a tractable fermentation system. npj Biofilms Microbiomes (2026). https://doi.org/10.1038/s41522-025-00909-1

  • DOI: https://doi.org/10.1038/s41522-025-00909-1


Source: Ecology - nature.com
