A global long-term daily multilayer soil moisture dataset derived from machine learning


Abstract

Soil moisture is a critical component of the Earth’s energy and water cycles. However, most existing products focus solely on surface layers, and continuous, high‐resolution datasets for deep soil horizons remain scarce. To address this gap, we generated a global, daily, seamless multilayer soil moisture dataset (SWSM) for the period 2002–2021 by leveraging a machine learning approach (XGBoost). The SWSM dataset provides estimates at a 0.05° spatial resolution for three depth horizons: 0–10 cm, 10–30 cm, and 30–60 cm. Rigorous validation against in situ observations demonstrated the dataset’s high accuracy, with Pearson correlation coefficients exceeding 0.90 and root mean square errors below 0.05 across all depths. A feature importance assessment verified the dataset’s physical consistency, revealing depth-dependent patterns aligned with established hydrological understanding. The SWSM dataset, with its long-term temporal coverage, fine spatial resolution, and multi-layer structure, is a valuable resource for applications in hydrologic modeling, agricultural water management, and climate change studies.
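The two validation metrics named above, the Pearson correlation coefficient and the root mean square error, can be illustrated with a minimal sketch. The function names and the sample values are hypothetical and are not taken from the study; the authors' own validation workflow is not described in this section.

```python
import math


def pearson_r(obs, est):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(obs)
    mean_o = sum(obs) / n
    mean_e = sum(est) / n
    # Sum of co-deviations and of squared deviations about each mean
    cov = sum((o - mean_o) * (e - mean_e) for o, e in zip(obs, est))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_e = sum((e - mean_e) ** 2 for e in est)
    return cov / math.sqrt(var_o * var_e)


def rmse(obs, est):
    """Root mean square error, in the same units as the inputs."""
    return math.sqrt(sum((e - o) ** 2 for o, e in zip(obs, est)) / len(obs))


# Made-up illustrative values, not data from the study
in_situ = [0.12, 0.18, 0.25, 0.31, 0.22]
estimated = [0.14, 0.17, 0.27, 0.29, 0.24]
print(pearson_r(in_situ, estimated), rmse(in_situ, estimated))
```

In practice each metric would be computed per station and per depth horizon, pairing the in situ series with the estimate from the grid cell that contains the station.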

Data availability

The SWSM dataset generated in this study is openly available on Zenodo in two repositories: https://doi.org/10.5281/zenodo.15250534 (2002–2013) and https://doi.org/10.5281/zenodo.15262116 (2014–2021).

Code availability

The custom script used to read the NetCDF files is publicly available on GitHub at https://github.com/weizeyang1997/SWSM.
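Once a daily grid has been read, a common next step is locating the cell that contains a point of interest. The sketch below maps a latitude and longitude to a row and column on a global 0.05° grid. It assumes a regular grid whose first row starts at 90°N and whose first column starts at 180°W; that layout, the constant names, and the function name are assumptions made for illustration, so the coordinate variables in the actual files should be checked before relying on it.

```python
CELLS_PER_DEGREE = 20            # 1 / 0.05 degrees, from the stated resolution
N_ROWS = 180 * CELLS_PER_DEGREE  # 3600 rows, pole to pole
N_COLS = 360 * CELLS_PER_DEGREE  # 7200 columns around the globe


def latlon_to_index(lat, lon):
    """Return (row, col) of the 0.05-degree cell containing (lat, lon).

    Assumes row 0 begins at 90N and column 0 begins at 180W (hypothetical
    layout; verify against the file's latitude/longitude variables).
    """
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        raise ValueError("latitude or longitude out of range")
    # Clamp so the southern and eastern edges fall in the last cell
    row = min(int((90.0 - lat) * CELLS_PER_DEGREE), N_ROWS - 1)
    col = min(int((lon + 180.0) * CELLS_PER_DEGREE), N_COLS - 1)
    return row, col


print(latlon_to_index(0.0, 0.0))
```

If the files store latitude in ascending order instead, the row index would need to be flipped (`N_ROWS - 1 - row`).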

Acknowledgements

This work was supported by the National Natural Science Foundation of China (42271392), the Opening Foundation of Xi’an Key Laboratory of Territorial Spatial Information (3001023545016), and the Open Fund of the Key Laboratory of Natural Resources Monitoring and Supervision in Southern Hilly Region, Ministry of Natural Resources (NRMSSHR2022Y02, NRMSSHR2023Y03). We also gratefully acknowledge the QA4SM platform for providing critical soil moisture validation services; their efforts have greatly facilitated and enhanced the quality of our soil moisture research.

Author information

Contributions

Z.W. and L.W. conceived the overall experiment; Z.W. conducted the entire experiment and wrote the manuscript; T.W. provided computational resources; Q.L. optimized the experimental procedure; S.T. assisted in data cleaning; F.Z. reviewed the manuscript; and Y.Z. guided the methodology and made revisions to the figures and manuscript. All authors have read and approved the final manuscript.

Corresponding author

Correspondence to Lifei Wei.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Wei, Z., Wei, L., Wang, T. et al. A global long-term daily multilayer soil moisture dataset derived from machine learning. Sci Data (2025). https://doi.org/10.1038/s41597-025-06436-0


Source: Ecology - nature.com
