Comparing recurrent convolutional neural networks for large scale bird species classification

1. Rosenberg, K. V. et al. Decline of the North American avifauna. Science 366, 120–124 (2019).

2. Inger, R. et al. Common European birds are declining rapidly while less abundant species numbers are rising. Ecol. Lett. 18, 28–36 (2015).

3. Leach, E. C., Burwell, C. J., Ashton, L. A., Jones, D. N. & Kitching, R. L. Comparison of point counts and automated acoustic monitoring: Detecting birds in a rainforest biodiversity survey. Emu 116, 305–309 (2016).

4. Drake, K. L., Frey, M., Hogan, D. & Hedley, R. Using digital recordings and sonogram analysis to obtain counts of yellow rails. Wildl. Soc. Bull. 40, 346–354 (2016).

5. Lambert, K. T. & McDonald, P. G. A low-cost, yet simple and highly repeatable system for acoustically surveying cryptic species. Austral. Ecol. 39, 779–785 (2014).

6. Burnett, K. Distribution, abundance, and acoustic characteristics of Kohala forest birds. Ph.D. thesis, University of Hawaii at Hilo (2020).

7. Owen, K. et al. Bioacoustic analyses reveal that bird communities recover with forest succession in tropical dry forests. Avian Conserv. Ecol. 15, 25 (2020).

8. Furnas, B. J., Landers, R. H. & Bowie, R. C. Wildfires and mass effects of dispersal disrupt the local uniformity of type I songs of hermit warblers in California. Auk 137, ukaa031 (2020).

9. Aide, T. M. et al. Real-time bioacoustics monitoring and automated species identification. PeerJ 1, e103 (2013).

10. Potamitis, I., Ntalampiras, S., Jahn, O. & Riede, K. Automatic bird sound detection in long real-field recordings: Applications and tools. Appl. Acoust. 80, 1–9 (2014).

11. Stowell, D. & Plumbley, M. D. Automatic large-scale classification of bird sounds is strongly improved by unsupervised feature learning. PeerJ 2, e488 (2014).

12. Tachibana, R. O., Oosugi, N. & Okanoya, K. Semi-automatic classification of birdsong elements using a linear support vector machine. PLoS ONE 9, e92584 (2014).

13. Zheng, A. & Casari, A. Feature Engineering for Machine Learning: Principles and Techniques for Data Scientists (O'Reilly, London, 2018).

14. Najafabadi, M. M. et al. Deep learning applications and challenges in big data analytics. J. Big Data 2, 1 (2015).

15. Dieleman, S., Brakel, P. & Schrauwen, B. Audio-based music classification with a pretrained convolutional network. In ISMIR (2011).

16. Lee, H., Pham, P., Largman, Y. & Ng, A. Unsupervised feature learning for audio classification using convolutional deep belief networks. In Advances in Neural Information Processing Systems 22 (2009).

17. Bergler, C. et al. ORCA-SPOT: An automatic killer whale sound detection toolkit using deep learning. Sci. Rep. 9, 10997 (2019).

18. Zhong, M. et al. Beluga whale acoustic signal classification using deep learning neural network models. J. Acoust. Soc. Am. 147, 1834–1841 (2020).

19. Strout, J. et al. Anuran call classification with deep learning. In 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2662–2665 (2017).

20. Salamon, J., Bello, J. P., Farnsworth, A. & Kelling, S. Fusing shallow and deep learning for bioacoustic bird species classification. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2017).

21. Stowell, D., Wood, M. D., Pamuła, H., Stylianou, Y. & Glotin, H. Automatic acoustic detection of birds through deep learning: The first bird audio detection challenge. Methods Ecol. Evol. 10, 368–380. https://doi.org/10.1111/2041-210X.13103 (2019).

22. [Dataset] Cornell Lab of Ornithology. Cornell birdcall identification. https://www.kaggle.com/c/birdsong-recognition (accessed 15 Jun 2020).

23. McFee, B. et al. librosa: Audio and music signal analysis in Python. In Proceedings of the 14th Python in Science Conference 8 (2015).

24. Simonyan, K. & Zisserman, A. Very deep convolutional networks for large-scale image recognition. In ICLR (2015).

25. He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. In CVPR 770–778 (2016).

26. Billerman, S. M., Keeney, B. K., Rodewald, P. G. & Schulenberg, T. S. (eds.) Birds of the World (Cornell Laboratory of Ornithology, Ithaca, NY, USA, 2020). https://birdsoftheworld.org/bow/home.

27. Gu, A., Dao, T., Ermon, S., Rudra, A. & Re, C. HiPPO: Recurrent memory with optimal polynomial projections (2020). arXiv:2008.07669.

28. Molau, S., Pitz, M., Schluter, R. & Ney, H. Computing mel-frequency cepstral coefficients on the power spectrum. In 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221) 1, 73–76 (2001). https://doi.org/10.1109/ICASSP.2001.940770.

29. Choi, K., Fazekas, G. & Sandler, M. Automatic tagging using deep convolutional neural networks (2016). arXiv:1606.00298.

30. Dieleman, S. & Schrauwen, B. End-to-end learning for music audio. In 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 6964–6968 (2014).

31. Voelker, A., Kajic, I. & Eliasmith, C. Legendre memory units: Continuous-time representation in recurrent neural networks. In NeurIPS (2019).

32. Huang, G., Liu, Z., van der Maaten, L. & Weinberger, K. Q. Densely connected convolutional networks. In CVPR 4700–4708 (2017).

33. Dorian, C., Lefort, R., Bonnel, J., Zarader, J.-L. & Adam, O. Bi-class classification of humpback whale sound units against complex background noise with deep convolution neural network (2017). arXiv:1702.02741.

34. Narasimhan, R., Fern, X. Z. & Raich, R. Simultaneous segmentation and classification of bird song using CNN. In Proc. Int. Conf. Acoust. Speech, Signal Process 146–150 (2017).

35. Sankupellay, M. & Konovalov, D. Bird call recognition using deep convolutional neural network, ResNet-50 (2018).

36. Zhang, L., Wang, D., Bao, C., Wang, Y. & Xu, K. Large-scale whale-call classification by transfer learning on multi-scale waveforms and time-frequency features. Appl. Sci. 9, 1020 (2019).

37. Berman, P. C., Bronstein, M. M., Wood, R. J., Gero, S. & Gruber, D. F. Deep machine learning techniques for the detection and classification of sperm whale bioacoustics. Sci. Rep. 9, 12588 (2019).

38. Zhong, M. et al. Improving passive acoustic monitoring applications to the endangered Cook Inlet beluga whale. J. Acoust. Soc. Am. 146, 3089–3089 (2019).

39. Efremova, D. B., Sankupellay, M. & Konovalov, D. A. Data-efficient classification of birdcall through convolutional neural networks transfer learning. In 2019 Digital Image Computing: Techniques and Applications (DICTA) 1–8 (2019).

40. Zhong, M. et al. Multispecies bioacoustic classification using transfer learning of deep convolutional neural networks with pseudo-labeling. Appl. Acoust. 166, 107375 (2020).

41. Thakur, A., Thapar, D., Rajan, P. & Nigam, A. Deep metric learning for bioacoustic classification: Overcoming training data scarcity using dynamic triplet loss. J. Acoust. Soc. Am. 146, 534 (2019).

42. Wang, Z., Yan, W. & Oates, T. Time series classification from scratch with deep neural networks: A strong baseline (2016). arXiv:1611.06455.

43. Williams, R. J. & Zipser, D. A learning algorithm for continually running fully recurrent neural networks. Neural Comput. 1, 270–280 (1989).

44. Rumelhart, D. E., Hinton, G. E. & Williams, R. J. Learning representations by back-propagating errors. Nature 323, 533–536 (1986).

45. Bengio, Y., Simard, P. & Frasconi, P. Learning long-term dependencies with gradient descent is difficult. IEEE Trans. Neural Netw. 5, 157–166 (1994).

46. Hochreiter, S. & Schmidhuber, J. Long short-term memory. Neural Comput. 9, 1735–1780 (1997).

47. Cho, K., Van Merriënboer, B., Bahdanau, D. & Bengio, Y. On the properties of neural machine translation: Encoder–decoder approaches (2014). arXiv:1409.1259.

48. Zeng, Y., Mao, H., Peng, D. & Yi, Z. Spectrogram based multi-task audio classification. Multimed. Tools Appl. 78, 3705–3722 (2019).

49. Voelker, A. R. & Eliasmith, C. Improving spiking dynamical networks: Accurate delays, higher-order synapses, and time cells. Neural Comput. 30, 569–609 (2018).

50. Xu, Y., Kong, Q., Huang, Q., Wang, W. & Plumbley, M. D. Convolutional gated recurrent neural network incorporating spatial features for audio tagging (2017). arXiv:1702.07787.

51. Keren, G. & Schuller, B. Convolutional RNN: An enhanced model for extracting features from sequential data (2016). arXiv:1602.05875.

52. Lai, G., Chang, W.-C., Yang, Y. & Liu, H. Modeling long- and short-term temporal patterns with deep neural networks (2017). arXiv:1703.07015.

53. Shiu, Y. et al. Deep neural networks for automated detection of marine mammal species. Sci. Rep. 10, 607 (2020).

54. Espi, M., Fujimoto, M., Kubo, Y. & Nakatani, T. Spectrogram patch based acoustic event detection and classification in speech overlapping conditions. In 2014 4th Joint Workshop on Hands-free Speech Communication and Microphone Arrays (HSCMA) 117–121 (2014).

55. Feng, L., Liu, S. & Yao, J. Music genre classification with paralleling recurrent convolutional neural network (2017). arXiv:1712.08370.

56. Choi, K., Fazekas, G., Sandler, M. & Cho, K. Convolutional recurrent neural networks for music classification (2016). arXiv:1609.04243.

57. Himawan, I., Towsey, M. & Roe, P. 3D convolution recurrent neural networks for bird sound detection. In Wood, M., Glotin, H., Stowell, D. & Stylianou, Y. (eds.) Proceedings of the 3rd Workshop on Detection and Classification of Acoustic Scenes and Events 1–4 (Detection and Classification of Acoustic Scenes and Events, 2018).

58. Cakir, E., Adavanne, S., Parascandolo, G., Drossos, K. & Virtanen, T. Convolutional recurrent neural networks for bird audio detection. In 2017 25th European Signal Processing Conference (EUSIPCO) 1744–1748 (2017).

Source: Ecology - nature.com
