Abstract
Soil moisture is a critical component of the Earth’s energy and water cycles. However, most existing products focus solely on surface layers, and continuous, high‐resolution datasets for deep soil horizons remain scarce. To address this gap, we generated a global, daily, seamless multilayer soil moisture dataset (SWSM) for the period 2002–2021 by leveraging a machine learning approach (XGBoost). The SWSM dataset provides estimates at a 0.05° spatial resolution for three depth horizons: 0–10 cm, 10–30 cm, and 30–60 cm. Rigorous validation against in situ observations demonstrated the dataset’s high accuracy, with Pearson correlation coefficients exceeding 0.90 and root mean square errors below 0.05 across all depths. A feature importance assessment verified the dataset’s physical consistency, revealing depth-dependent patterns aligned with established hydrological understanding. The SWSM dataset, with its long-term temporal coverage, fine spatial resolution, and multilayer structure, is a valuable resource for applications in hydrologic modeling, agricultural water management, and climate change studies.
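The two validation metrics quoted above, the Pearson correlation coefficient and the root mean square error, can be illustrated with a minimal sketch. The function names are hypothetical and the sample values are made up for illustration; they are not taken from the SWSM validation.

```python
import math


def pearson_r(pred, obs):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(pred) != len(obs) or len(pred) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(pred)
    mean_p = sum(pred) / n
    mean_o = sum(obs) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(pred, obs))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_o = sum((o - mean_o) ** 2 for o in obs)
    return cov / math.sqrt(var_p * var_o)


def rmse(pred, obs):
    """Root mean square error between two equal-length sequences."""
    if len(pred) != len(obs) or not pred:
        raise ValueError("need two non-empty sequences of equal length")
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(pred))


if __name__ == "__main__":
    # Made-up soil moisture values, for illustration only.
    estimated = [0.21, 0.25, 0.30, 0.27, 0.22]
    in_situ = [0.20, 0.26, 0.31, 0.25, 0.23]
    print(pearson_r(estimated, in_situ), rmse(estimated, in_situ))
```

Both metrics are computed per station and depth from paired time series of estimated and observed values; the correlation measures agreement in temporal dynamics, while the error is expressed in the same units as the data.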
Data availability
The SWSM dataset generated in this study is openly available on Zenodo in two repositories: the 2002–2013 data at https://doi.org/10.5281/zenodo.15250534 and the 2014–2021 data at https://doi.org/10.5281/zenodo.15262116.
Code availability
The custom script used to read the NetCDF files in this study is publicly available on GitHub at https://github.com/weizeyang1997/SWSM.
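Independently of the script above, the following is a minimal sketch of how a single-pixel time series might be extracted from one of the NetCDF files with xarray. It is not the authors' script. The variable names in `LAYERS` and the coordinate names `lat` and `lon` are hypothetical placeholders; inspect an opened file to find the actual names. Assigning a boundary depth to the deeper layer is also an assumption of this sketch.

```python
# Depth horizons (cm) as stated in the paper, mapped to variable names.
# The variable names are hypothetical placeholders, not taken from the files.
LAYERS = {
    (0, 10): "sm_0_10cm",
    (10, 30): "sm_10_30cm",
    (30, 60): "sm_30_60cm",
}


def layer_for_depth(depth_cm):
    """Return the (top, bottom) horizon in cm that contains a sensor depth.

    Boundary depths go to the deeper layer (an assumption of this sketch),
    except 60 cm, which closes the deepest layer.
    """
    for top, bottom in LAYERS:
        if top <= depth_cm < bottom or (bottom == 60 and depth_cm == 60):
            return (top, bottom)
    raise ValueError(f"{depth_cm} cm is outside the 0-60 cm range covered")


def point_series(path, depth_cm, lat, lon):
    """Open one NetCDF file and return the nearest-pixel time series."""
    # Third-party dependency, imported lazily so that layer_for_depth
    # can be used without xarray installed.
    import xarray as xr

    with xr.open_dataset(path) as ds:
        var = LAYERS[layer_for_depth(depth_cm)]
        # "lat" and "lon" are assumed coordinate names.
        return ds[var].sel(lat=lat, lon=lon, method="nearest").load()
```

A call such as `point_series("some_file.nc", 20, 34.25, 108.95)` (file name hypothetical) would then return the 10–30 cm series for the grid cell nearest to that location, which is the form needed for comparison with an in situ sensor installed at 20 cm.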
Acknowledgements
This work was supported by the National Natural Science Foundation of China (42271392), the Opening Foundation of Xi’an Key Laboratory of Territorial Spatial Information (3001023545016), and the Open Fund of the Key Laboratory of Natural Resources Monitoring and Supervision in Southern Hilly Region, Ministry of Natural Resources (NRMSSHR2022Y02, NRMSSHR2023Y03). We also gratefully acknowledge the QA4SM platform for providing critical soil moisture validation services; their efforts have greatly facilitated and enhanced the quality of our soil moisture research.
Author information
Contributions
Z.W. and L.W. conceived the overall experiment; Z.W. conducted the entire experiment and wrote the manuscript; T.W. provided computational resources; Q.L. optimized the experimental procedure; S.T. assisted in data cleaning; F.Z. reviewed the manuscript; and Y.Z. guided the methodology and made revisions to the figures and manuscript. All authors have read and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Wei, Z., Wei, L., Wang, T. et al. A global long-term daily multilayer soil moisture dataset derived from machine learning. Sci Data (2025). https://doi.org/10.1038/s41597-025-06436-0
