Abstract
Crayfish play an important role in freshwater ecosystems, and sex classification is crucial for analyzing their demographic structures. This study performed binary sex classification using traditional machine learning and deep learning models on tabular and image datasets with an imbalanced class distribution. For tabular classification, features related to crayfish weight and size were used. Missing values were handled using different methods to create several dataset variants. Kolmogorov-Arnold networks demonstrated the best performance across all metrics, achieving accuracies between 95% and 100%. Image data were generated by combining at least five images of each crayfish. Autoencoders were employed to extract meaningful features. In experiments conducted on these extracted features, support vector machines achieved 84% accuracy and multilayer perceptrons achieved 82% accuracy, outperforming other models. To enhance performance, a novel architecture based on stacked autoencoders was proposed. While some models experienced performance declines, Kolmogorov-Arnold networks showed an average improvement of 3.5% across all metrics, maintaining the highest accuracy. To statistically evaluate performance differences, McNemar’s and Wilcoxon tests were applied. The results confirmed statistically significant differences among Kolmogorov-Arnold networks, support vector machines, multilayer perceptrons, and naive Bayes. In conclusion, this study highlights the effectiveness of deep learning and machine learning models in crayfish sex classification and provides a notable example of hybrid artificial intelligence models incorporating autoencoders.
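As an illustration of the paired comparison mentioned above, the following is a minimal sketch of McNemar's test for two classifiers evaluated on the same test samples. It assumes the exact binomial variant of the test; the abstract does not state which variant was used, and the function name `mcnemar_exact` and the example labels are hypothetical, not taken from the study's code or data.

```python
from math import comb


def mcnemar_exact(y_true, pred_a, pred_b):
    """Exact two-sided McNemar test on paired predictions.

    Assumption: exact binomial variant (not the chi-square approximation).
    Returns (b, c, p_value), where
      b = samples model A classifies correctly and model B does not,
      c = samples model B classifies correctly and model A does not.
    """
    b = sum(1 for t, a, p in zip(y_true, pred_a, pred_b) if a == t and p != t)
    c = sum(1 for t, a, p in zip(y_true, pred_a, pred_b) if a != t and p == t)
    n = b + c
    if n == 0:
        # The two models never disagree, so there is no evidence of a difference.
        return b, c, 1.0
    k = min(b, c)
    # Under the null hypothesis, discordant pairs split as Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


if __name__ == "__main__":
    # Hypothetical labels (1 = male, 0 = female), for illustration only.
    y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
    model_a = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
    model_b = [1, 1, 0, 1, 0, 0, 0, 1, 1, 0]
    print(mcnemar_exact(y_true, model_a, model_b))
```

Only the discordant pairs enter the test, so two models with similar overall accuracy can still differ significantly if their errors fall on different samples.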
Data availability
The datasets generated and/or analysed during the current study are available in the Zenodo repository: https://doi.org/10.5281/zenodo.17516963. The source codes developed for the experiments are stored in a GitHub repository at https://github.com/yasinatilkan60/Crayfish-Sex-Identification.
Author information
Contributions
Conceptualization, R.B. and S.B.; Methodology, Y.A., K.A. and T.A.; Software, Y.A., E.T.A. and B.K.; Validation, M.S.G. and F.E.; Data Curation, R.B. and S.B.; Writing—Original Draft Preparation, Y.A., E.T.A., K.A., T.A. and R.B.; Writing—Review and Editing, Y.A., K.A., T.A., R.B., M.S.G. and F.E.; Visualization, Y.A., K.A. and T.A.; Supervision, T.A., K.A., M.S.G., R.B. and S.B. All authors have read and agreed to the published version of the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Atilkan, Y., Kirik, B., Acikbas, E.T. et al. Enhancing crayfish sex identification with Kolmogorov-Arnold networks and stacked autoencoders. Sci Rep (2025). https://doi.org/10.1038/s41598-025-34095-z
DOI: https://doi.org/10.1038/s41598-025-34095-z
Keywords
- Crayfish
- Sex identification
- Deep learning
- Machine learning
- Kolmogorov-Arnold networks
- Stacked autoencoders
Source: Ecology - nature.com
