Comparing recurrent convolutional neural networks for large scale bird species classification