Abstract
Surface water quality forecasting is crucial for pollution early warning and sustainable water resource management. However, accurately predicting key water quality indicators remains challenging because of highly nonlinear spatio-temporal dynamics and complex inter-variable relationships. Traditional statistical models and conventional machine learning approaches often fail to capture these couplings, which limits their predictive performance. In this study, we propose a novel hybrid deep learning framework, termed MLA-Mamba, which integrates an improved Mamba-based sequence modeling network with a Multi-Head Local Attention (MLA) mechanism and is optimized through a Gradient Reparameterization Optimization (GRPO) strategy. The Mamba module extracts long-range temporal dependencies from water quality time series via a state-space modeling paradigm, while the MLA mechanism captures localized spatial correlations among multiple monitoring stations. The GRPO strategy dynamically adjusts learning rates during training to accelerate convergence and improve model stability; to the best of our knowledge, this study is one of the first to apply GRPO to water quality prediction tasks. Furthermore, a multi-task learning scheme jointly predicts multiple key indicators, including permanganate index (CODMn), ammonia nitrogen (NH3–N), total phosphorus (TP), and total nitrogen (TN), thereby exploiting inter-variable dependencies to enhance overall forecasting accuracy. Experimental evaluations on two real-world surface water datasets show that the proposed MLA-Mamba model achieves consistent performance improvements over the evaluated baseline methods across multiple error metrics. In addition, predictive uncertainty is quantified via Monte Carlo dropout, which yields confidence intervals that support risk-aware water quality assessment. These results highlight the effectiveness of integrating advanced sequence modeling, attention-driven spatial feature extraction, and adaptive optimization for robust environmental time series forecasting.
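The Monte Carlo dropout step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy one-hidden-layer regressor, the random weights, the dropout rate of 0.2, the 500 samples, and the names `forward` and `mc_dropout_interval` are all assumptions chosen for illustration. The idea shown is only the general technique: keep dropout active at inference, repeat the forward pass many times, and read an interval off the empirical distribution of predictions.

```python
import random
import statistics


def forward(x, w1, w2, drop_p, rng):
    """One stochastic forward pass of a toy one-hidden-layer regressor."""
    hidden = []
    for col in range(len(w1[0])):
        # ReLU activation of one hidden unit
        act = max(0.0, sum(x[i] * w1[i][col] for i in range(len(x))))
        # Dropout stays active at inference: drop each unit with prob drop_p
        keep = rng.random() >= drop_p
        # Inverted dropout: rescale kept units so the expected value is unchanged
        hidden.append(act / (1.0 - drop_p) if keep else 0.0)
    return sum(h * w for h, w in zip(hidden, w2))


def mc_dropout_interval(x, w1, w2, drop_p=0.2, n_samples=500, seed=0):
    """Return the mean prediction and an empirical 95% interval."""
    rng = random.Random(seed)
    samples = sorted(forward(x, w1, w2, drop_p, rng) for _ in range(n_samples))
    lower = samples[int(0.025 * (n_samples - 1))]
    upper = samples[int(0.975 * (n_samples - 1))]
    return statistics.fmean(samples), lower, upper


# Hypothetical inputs: random numbers standing in for a trained network
# and for one window of monitoring-station features.
init = random.Random(42)
x = [init.gauss(0, 1) for _ in range(8)]
w1 = [[init.gauss(0, 1) for _ in range(16)] for _ in range(8)]
w2 = [init.gauss(0, 1) for _ in range(16)]

mean, lower, upper = mc_dropout_interval(x, w1, w2)
print(f"mean={mean:.3f}, 95% interval=[{lower:.3f}, {upper:.3f}]")
```

In a multi-task setting such as the one described above, the same sampling loop would be run once per forecast and the interval computed separately for each predicted indicator.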
Data availability
The datasets used and analysed during the current study are not publicly available, but are available from the corresponding author upon reasonable request. Interested researchers may contact Dr. Wang.
Funding
This work is supported by the Guizhou Provincial Science and Technology Support Program (2023): “Research on Water Quality Prediction and Early Warning Technology Integrating AI in Big Data Environment” (Grant No. Qiankehe Support [2023] General 108).
Author information
Contributions
Ronghao Wei: Methodology, Writing–original draft, Software, Formal analysis, Project administration, Resources. Hang Chen: Methodology, Formal analysis, Project administration, Resources. Haihe Wang: Supervision, Methodology, Resources, Funding acquisition, Writing–review & editing.
Ethics declarations
Competing interests
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Wei, R., Chen, H. & Wang, H. Surface water quality prediction via an MLA-Mamba hybrid neural network with GRPO optimization. Sci Rep (2026). https://doi.org/10.1038/s41598-026-36229-3
Keywords
- Surface water quality prediction
- Spatio-temporal deep learning
- MLA-Mamba network
- Multi-head local attention
- Gradient reparameterization optimization

