Enhancing crayfish sex identification with Kolmogorov-Arnold networks and stacked autoencoders


Abstract

Crayfish play an important role in freshwater ecosystems, and sex classification is crucial for analyzing their demographic structure. This study performed binary sex classification with traditional machine learning and deep learning models on tabular and image datasets with imbalanced class distributions. For tabular classification, features related to crayfish weight and size were used, and missing values were handled with different methods to create several dataset variants. Kolmogorov-Arnold networks performed best across all metrics, achieving accuracy between 95% and 100%. Image data were generated by combining at least five images of each crayfish, and autoencoders were employed to extract meaningful features. In experiments on these extracted features, support vector machines achieved 84% accuracy and multilayer perceptrons achieved 82% accuracy, outperforming the other models. To enhance performance, a novel architecture based on stacked autoencoders was proposed. While some models experienced performance declines with this architecture, Kolmogorov-Arnold networks showed an average improvement of 3.5% across all metrics, maintaining the highest accuracy. To statistically evaluate performance differences, McNemar’s and Wilcoxon tests were applied. The results confirmed significant differences between Kolmogorov-Arnold networks, support vector machines, multilayer perceptrons, and naive Bayes. In conclusion, this study highlights the effectiveness of deep learning and machine learning models in crayfish sex classification and provides a notable example of hybrid artificial intelligence models incorporating autoencoders.
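As an illustration of the kind of paired comparison the abstract refers to, the following minimal sketch computes McNemar's test for two classifiers evaluated on the same samples. The abstract does not state which variant of the test was used, so the exact (binomial) form here is an assumption, and the function name `mcnemar_exact` and the toy labels are hypothetical, not taken from the study.

```python
from math import comb


def mcnemar_exact(y_true, pred_a, pred_b):
    """Exact (binomial) McNemar test on paired predictions.

    Returns (b, c, p_value): b counts samples only model A classified
    correctly, c counts samples only model B classified correctly.
    The exact variant is an assumption; the study may have used the
    chi-square form instead.
    """
    b = c = 0
    for truth, a, m in zip(y_true, pred_a, pred_b):
        a_ok, m_ok = (a == truth), (m == truth)
        if a_ok and not m_ok:
            b += 1
        elif m_ok and not a_ok:
            c += 1
    n = b + c
    if n == 0:
        # The two models never disagree in correctness: no evidence of a difference.
        return b, c, 1.0
    # Two-sided p-value under Binomial(n, 0.5) on the discordant pairs.
    tail = sum(comb(n, i) for i in range(min(b, c) + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


# Hypothetical toy labels (1 and 0 stand for the two sexes), not study data.
y_true = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
pred_a = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]  # model A: all correct
pred_b = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]  # model B: misses the six positives
b, c, p = mcnemar_exact(y_true, pred_a, pred_b)
print(b, c, p)  # 6 0 0.03125
```

Only the discordant pairs (samples where exactly one model is correct) enter the statistic, which is why the test suits comparing classifiers on a shared, imbalanced test set.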

Data availability

The datasets generated and/or analysed during the current study are available in the Zenodo repository: https://doi.org/10.5281/zenodo.17516963. The source code developed for the experiments is available in a GitHub repository at https://github.com/yasinatilkan60/Crayfish-Sex-Identification.

Author information

Contributions

Conceptualization, R.B. and S.B.; Methodology, Y.A., K.A. and T.A.; Software, Y.A., E.T.A. and B.K.; Validation, M.S.G. and F.E.; Data Curation, R.B. and S.B.; Writing—Original Draft Preparation, Y.A., E.T.A., K.A., T.A. and R.B.; Writing—Review and Editing, Y.A., K.A., T.A., R.B., M.S.G. and F.E.; Visualization, Y.A., K.A. and T.A.; Supervision, T.A., K.A., M.S.G., R.B. and S.B. All authors have read and agreed to the published version of the manuscript.

Corresponding author

Correspondence to Tunc Asuroglu.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary Material 1

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Atilkan, Y., Kirik, B., Acikbas, E.T. et al. Enhancing crayfish sex identification with Kolmogorov-Arnold networks and stacked autoencoders. Sci Rep (2025). https://doi.org/10.1038/s41598-025-34095-z

  • DOI: https://doi.org/10.1038/s41598-025-34095-z

Keywords

  • Crayfish
  • Sex identification
  • Deep learning
  • Machine learning
  • Kolmogorov-Arnold networks
  • Stacked autoencoders


Source: Ecology - nature.com
