Acoustic preadaptation to transmit vocal individuality of savanna nightjars in noisy urban environments

Study area and field observations

We recorded the territorial calls of male savanna nightjars in eight areas of Taiwan, from north to south: Jinshan District (16 nightjars), Taichung City (7 nightjars), Hualien City (9 nightjars), Yuanlin City (8 nightjars), Beigang Township (8 nightjars), Chiayi City (14 nightjars), Taitung City (4 nightjars), and Hengchun Township (1 nightjar) (Fig. 1) during April–June 2018. Sound recordings were collected in the downtown district of each area using a Denon Portable IC Recorder (DN-F20R, sampling rate = 48 kHz, 16 bit, wav format) equipped with a Sennheiser ME67 unidirectional microphone. We made recordings between 19:00 and 24:00 in good weather on 1–2 consecutive nights in each area. If we recorded in the same area on two consecutive nights, the recording range of the second night was at least 1000 m away from that of the first night. The maximum territory size of the savanna nightjar observed by Chan48 in urban areas of Taiwan was 83,424 m², equivalent to a circle with a radius of about 163 m. Therefore, we are confident that we avoided recording the same territorial male twice. Territorial calls from an individual were recorded until the individual stopped calling or flew out of our recording range. When emitting territorial calls, nightjars either perched on artificial structures, such as antennas or fences on rooftops, or flew around the tops of buildings. Because the loud territorial calls recorded in these two situations show the same time–frequency patterns on spectrograms and sound the same to the ear, we did not differentiate between them while recording. We always attempted to record at the closest possible distance to the calling individual by moving closer to it at street level.
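The radius of about 163 m quoted above follows from treating the maximum territory as a circle. A minimal Python sketch of that arithmetic is given here; the circular-territory assumption and all names are ours, for illustration only, and are not taken from the original study.

```python
import math

# Maximum urban territory size reported in the text (m^2).
MAX_TERRITORY_AREA_M2 = 83_424
# Minimum separation between recording ranges on consecutive nights (m).
MIN_SEPARATION_M = 1000


def equivalent_radius(area_m2: float) -> float:
    """Radius of a circle with the given area (assumes a circular territory)."""
    return math.sqrt(area_m2 / math.pi)


radius_m = equivalent_radius(MAX_TERRITORY_AREA_M2)
# Express the minimum separation in units of territory diameters.
diameters_apart = MIN_SEPARATION_M / (2 * radius_m)

print(f"equivalent radius: {radius_m:.0f} m")
print(f"separation: {diameters_apart:.1f} territory diameters")
```

Under this assumption, the 1000 m separation spans roughly three maximum territory diameters, which is the reasoning behind the claim that the same male was unlikely to be recorded twice.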

After measuring the individual’s calls, we immediately took three samples of the maximum noise levels (maximum hold function, C-weighting function on Sound Level Meter TES-1350, TES, Taiwan) near the calling location and close to a road intersection within the calling range following the procedures detailed in Shieh et al.49. During each noise sampling, the sound level meter was held horizontally at a height of about 1.5 m and turned 360° clockwise to measure the maximum noise level from all directions within a time period of about 30 s. We then averaged the three samples of maximum noise measurements to arrive at our value of ‘ambient noise levels’ for each individual. Therefore, ambient noise levels were measured at a height of 1.5 m, which was not only the height at which anthropogenic sources such as traffic and human activity are the main sources of noise but also the height at which we held the microphone to record the nightjar calls emitted at a height of usually more than 10 m.

Playback-recording experiments

Seven artificial calls were generated using the frequency shift function (+ 300 Hz, + 200 Hz, + 100 Hz, 0 Hz, − 100 Hz, − 200 Hz, − 300 Hz) under the Frequency Domain Transformations tool in Avisoft-SASLab Pro software v5.2.12 from a source call of good quality (see Supplementary Table S6 for descriptions of acoustic measurements) after band-pass filtering (2–7 kHz) for noise removal. These seven artificial frequency-shifted calls represented seven different artificially created individuals and were identified by their frequency shift values (+ 300 Hz, + 200 Hz, + 100 Hz, 0 Hz, − 100 Hz, − 200 Hz, − 300 Hz). We then copied each frequency-shifted call 10 times, with equal silent intervals and the same amplitude, to form a group, and merged the seven groups of frequency-shifted calls into one sound file, alternating the order of the groups following a Latin square design. A total of seven merged sound files were obtained, and each was broadcast once at each of three sites with different urban noise levels: high, medium and low. The playback-recording experiments were conducted at three sites near or on the campus of Kaohsiung Medical University (22.648° N, 120.310° E). We also took 10 samples of the maximum noise level (maximum hold function, C-weighting function on Sound Level Meter TES-1350, TES, Taiwan) at the recording sites during the playback periods to obtain the ambient noise levels for each site. The first site was on the roadside near a road intersection and had the highest ambient noise levels, with a mean of 83.7 dB (n = 10) and a range of 79.3–89.6 dB. The second site was on the sports field of the campus, about 30–50 m away from the road, and had medium ambient noise levels, with a mean of 74.6 dB (n = 10) and a range of 73.1–76.3 dB. The third site was on a 4th-floor roof garden of the campus and had the lowest ambient noise levels, with a mean of 71.4 dB (n = 10) and a range of 69.9–74 dB.
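The Latin square ordering of the seven call groups across the seven merged sound files can be sketched as follows. The text does not say which Latin square was used, so the cyclic construction below is only one possible choice, and the function and variable names are hypothetical.

```python
# Frequency shift values (Hz) identifying the seven artificial individuals.
SHIFTS_HZ = [300, 200, 100, 0, -100, -200, -300]


def cyclic_latin_square(items):
    """Build an n x n Latin square by cyclically rotating the item list.

    Each item appears exactly once in every row (sound file) and
    exactly once in every column (playback position).
    """
    n = len(items)
    return [[items[(row + col) % n] for col in range(n)] for row in range(n)]


playback_orders = cyclic_latin_square(SHIFTS_HZ)

for file_index, order in enumerate(playback_orders, start=1):
    print(f"sound file {file_index}: {order}")
```

With this layout every artificial individual occupies every playback position exactly once across the seven files, so position effects are balanced among individuals.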

We used a Denon Portable IC Recorder (DN-F20R) connected to a speaker (Sony SRS-X11) for the playback experiments. The speaker was placed 1.5 m above the ground on a tripod. Playback-recording experiments were conducted between 17:00 and 18:30 h, a period with high traffic levels. The seven merged sound files were played back at a standardised volume with a sound-pressure level of 82.5 dB at 1 m from the speaker, and about 62.5 dB at 10 m from the speaker. The sound-pressure level of the broadcast sound received at 10 m is about the same as the level at which we recorded a nightjar call with an amplitude of 97.7 dB at a distance of 28.8 m from its calling spot. We recorded with a Denon Portable IC Recorder (DN-F20R) connected to a shotgun microphone (Sennheiser ME67) that was placed on a stand 1.5 m above the ground and oriented toward the broadcasting speaker at a distance of 10 m. All recordings were made at the same recording level and with the same settings (sampling rate = 48 kHz, 16 bit).
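The two playback levels quoted above (82.5 dB at 1 m and about 62.5 dB at 10 m) are consistent with simple spherical spreading, in which the level drops by 20 log10 of the distance ratio. The sketch below illustrates that relationship only. The text does not state how the 10 m value was obtained, and the free-field assumption ignores reflections and excess attenuation that real urban sites will have.

```python
import math


def level_at_distance(level_ref_db: float, ref_distance_m: float,
                      distance_m: float) -> float:
    """Sound level at distance_m under free-field spherical spreading.

    Assumption (ours): attenuation is 20*log10(distance ratio) with no
    excess attenuation from air absorption, ground, or buildings.
    """
    return level_ref_db - 20 * math.log10(distance_m / ref_distance_m)


# Playback level reported at 1 m from the speaker.
level_10m = level_at_distance(82.5, 1.0, 10.0)
print(f"predicted level at 10 m: {level_10m:.1f} dB")
```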

Sound analyses

We selected high-quality calls with clear acoustic structures on spectrograms and thus excluded any recordings with low-quality calls, which were, for example, emitted when the calling nightjar was too far away, or its calls were overlapped with calls from neighboring nightjars.

We also excluded any recordings in which the individual uttered only one call (two calls being the minimum needed for inclusion). This left us with a total of 1925 calls from 67 individuals for our analyses, with a mean of 28.7 ± 3.1 calls/individual and a range of 2–97 calls/individual. The band-pass filters were set to 2.0–6.8 kHz and adjusted for each individual to reduce noise components. Using the recordings, we produced spectrograms with the following spectrogram parameters in the software: sampling frequency = 22.05 kHz, FFT = 512, Hamming window, frequency resolution = 43 Hz, and time resolution = 2.9 ms. We quantified 30 acoustic variables (Table 1) from the spectrograms using the Automatic Parameter Measurements setup in the Avisoft-SASLab Pro software v5.2. We manually marked each call with a section label on the spectrogram by eye, and two time-based parameters were then measured (duration of the call and the temporal distance from the start to the location of the maximum amplitude) (see the manual50 of the Avisoft-SASLab Pro software for details). To automatically measure the remaining parameters on the labelled section, we specified four spectrum-based parameters (peak frequency, quartile 25%, quartile 50% and quartile 75%) to be measured at seven locations of the labelled section (start, end, maximum amplitude of the call, minimum parameter of entire call, maximum parameter of entire call, mean parameter of entire call, relative standard deviation of entire call); thus, 24 frequency-based parameters and four frequency-modulation-based parameters were automatically measured based on the labelled sections on the spectrograms.
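The reported spectrogram resolutions and the count of 30 variables can be checked with a few lines of arithmetic. In the sketch below the window overlap of 87.5% is an assumption of ours (it is not stated in the text) chosen because it reproduces the reported time resolution. All names are illustrative.

```python
# Spectrogram settings reported in the text.
SAMPLING_RATE_HZ = 22_050
FFT_LENGTH = 512

# ASSUMPTION: overlap between successive FFT windows; not given in the text.
ASSUMED_OVERLAP = 0.875

# Frequency resolution is the spacing between FFT bins.
freq_resolution_hz = SAMPLING_RATE_HZ / FFT_LENGTH

# Time resolution is the hop between successive windows.
hop_samples = FFT_LENGTH * (1 - ASSUMED_OVERLAP)
time_resolution_ms = 1000 * hop_samples / SAMPLING_RATE_HZ

# Variable count: 2 time-based + 4 spectrum-based parameters x 7 locations.
n_time_based = 2
n_spectrum_based = 4 * 7
n_variables = n_time_based + n_spectrum_based

print(f"frequency resolution: {freq_resolution_hz:.0f} Hz")
print(f"time resolution: {time_resolution_ms:.1f} ms")
print(f"acoustic variables: {n_variables}")
```

Of the 28 spectrum-based measurements, the four taken at the "relative standard deviation" location are the frequency-modulation-based parameters, leaving the 24 frequency-based ones described above.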

For the playback-recording experiments, the recordings were analyzed using the same settings as above, with two exceptions. First, we adjusted the band-pass filters to 2.0–7.3 kHz because of the high frequency shift value (up to 300 Hz). Second, we used the duration of the source call as the duration of all received calls; that is, we fixed the duration of the call section while marking, and the other 29 variables were automatically measured on the marked section by the software to reduce any possible human measurement errors51.

Statistical analyses

To examine possible geographic variation of the calls, we used individuals as our sample units (n = 67), and the averaged measurements from the calls of each individual were analyzed using a principal component analysis (PCA). The PCA was performed on the 30 acoustic variables after standardising each variable to a mean of zero and unit standard deviation (software PRIMER 6, version 6.1.5). We retained the five components with eigenvalues greater than one and interpreted each component based on its correlations with the original variables. However, because the eigenvalue and the explained variance decreased sharply from PC2 to PC3 (Supplementary Table S2), only PC1 and PC2 were used for examining the geographic patterns of the calls. The 95% confidence ellipses of the groups of individuals from different geographic areas are shown on the plot of PC1 against PC2 (Fig. 3). If the 95% confidence ellipses of the eight geographic areas overlapped, we treated all the sampled individuals as one population and pooled them for further analysis.
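The standardisation applied before the PCA (mean of zero and unit standard deviation for each variable) can be sketched in plain Python as below. The input values are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is our assumption; the text does not say which form PRIMER applies.

```python
import statistics


def standardise(values):
    """Rescale one variable to mean 0 and unit standard deviation.

    ASSUMPTION: uses the sample standard deviation (n - 1 denominator).
    """
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


# Hypothetical per-individual mean peak frequencies (kHz), for illustration.
example_peak_freq_khz = [3.9, 4.1, 4.0, 4.4, 3.6]
z_scores = standardise(example_peak_freq_khz)

print([round(z, 2) for z in z_scores])
```

Standardising in this way puts variables measured in different units (for example seconds and hertz) on a common scale, so that no single variable dominates the principal components merely because of its units.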

The following statistical tests were performed on the untransformed data of each acoustic variable using JMP Pro 14.2.0. First, we treated individuals as our sample units and used the averaged measurements from the calls of each individual. We performed Spearman rank tests to examine the relationships between ambient noise levels and each acoustic variable for 65 individuals (noise measurements were not taken for two individuals). We then classified the 30 acoustic variables into two categories according to their relationship to ambient noise levels: (1) noise-related variables, which had a significant (positive or negative) Spearman rank correlation with ambient noise levels, and (2) noise-unrelated variables, which lacked such a significant correlation.
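As an illustration of the Spearman rank correlation used above, the following sketch computes the coefficient in plain Python by ranking both variables (ties receive average ranks) and taking the Pearson correlation of the ranks. The data are hypothetical, and the sketch omits the significance test that the classification into noise-related and noise-unrelated variables relies on.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry))
    return cov / (sd_x * sd_y)


# Hypothetical ambient noise levels (dB) and one acoustic variable (kHz).
noise_db = [70.1, 74.6, 78.2, 83.7, 88.0]
peak_freq_khz = [3.8, 3.9, 4.0, 4.2, 4.1]

rho = spearman_rho(noise_db, peak_freq_khz)
print(f"Spearman rho = {rho:.2f}")
```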

To determine which variables can distinguish the calls of different individuals, we treated calls as sample units and individuals as groups. We then performed a Kruskal–Wallis test to select variables that were more likely to encode individual information, that is, variables with significant individual differences. We included only those variables with a significant P-value in the discriminant function analysis (DFA; see details below). To investigate how ambient noise levels affected the transmission efficacy of vocal individuality, we used the DFA, which calculates the accuracy of correctly assigning a particular call to a particular individual. The purpose of a DFA is thus to discriminate sampled individuals based on all the possible acoustic measurements which encode information about individual identity; therefore, beyond this screening, we did not perform a further variable selection process. Specifically, we used the accuracy value (1 − misclassification rate) calculated from the DFA to assess the transmission efficacy of vocal individuality information. Higher accuracy values are assumed to indicate higher transmission efficacy of vocal individuality information from calls recorded through various ambient noise levels.

To compare possible differences in the transmission efficacy of vocal individuality for different sets of acoustic variables, we then calculated a separate DFA for three datasets: (1) all 30 acoustic variables, (2) the noise-related variables, and (3) the noise-unrelated variables. For each dataset, a separate DFA was used to calculate one overall accuracy value and 67 individual accuracy values. The overall accuracy value (1 − misclassification rate) describes the ability of the DFA to correctly assign the 1925 calls to the 67 recorded individuals. The individual accuracy values then describe the ability of the DFA to correctly assign the calls of one particular individual to that individual. To account for the small sample sizes of calls from some individuals, the overall misclassification rates were bootstrapped with the fractional weights option (number of bootstrap samples, n = 20,000), and the overall accuracy values of the three datasets and the associated 95% bias-corrected confidence intervals (BCI) were obtained. Sixty-seven individual accuracy values (corresponding to the 67 sampled nightjars) were also calculated for each set of variables.
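The overall and individual accuracy values defined above can be computed from any classifier's assignments. The sketch below assumes the DFA output is available as paired lists of true and predicted individual labels. The labels and the function name are hypothetical, and neither the DFA itself nor the bootstrap is reproduced here.

```python
from collections import defaultdict


def accuracy_values(true_ids, predicted_ids):
    """Return (overall accuracy, {individual: accuracy}).

    Accuracy is 1 - misclassification rate, computed over all calls
    (overall) and over the calls of each individual separately.
    """
    total = defaultdict(int)
    correct = defaultdict(int)
    for true_id, predicted_id in zip(true_ids, predicted_ids):
        total[true_id] += 1
        if true_id == predicted_id:
            correct[true_id] += 1
    overall = sum(correct.values()) / len(true_ids)
    individual = {ind: correct[ind] / total[ind] for ind in total}
    return overall, individual


# Hypothetical assignments for five calls from two individuals.
true_ids = ["A", "A", "A", "B", "B"]
predicted_ids = ["A", "A", "B", "B", "B"]

overall_accuracy, individual_accuracy = accuracy_values(true_ids, predicted_ids)
print(overall_accuracy, individual_accuracy)
```

Note that the overall value weights individuals by their number of calls, whereas each individual value depends only on that individual's own calls, which is why individuals with few calls can have coarse accuracy values.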

To compare the individual accuracy values obtained using the noise-related variables with those obtained using the noise-unrelated variables, we used the Wilcoxon signed rank test. To examine how noise levels affected the transmission efficacy through different acoustic structures (noise-related vs. noise-unrelated), we investigated the correlation between the individual accuracy value and the ambient noise level associated with that particular individual using Spearman rank tests. The difference in individual accuracy values, calculated for each individual as the accuracy value of the DFA using the noise-unrelated variables minus the accuracy value of the DFA using the noise-related variables, was also correlated with the ambient noise levels using Spearman rank tests to examine the similarity in trends.

For the playback-recording experiments, seven sound files were played at each site (high, medium or low urban noise levels), and the seven corresponding recordings were treated as seven samples for each site. In each recording sample, although 10 calls of each individual (identified by frequency shift values) were played, the number of received calls could be fewer than 10 because calls that overlapped with other unexpected sounds were excluded from measurement. Therefore, we first averaged the measurements of the usable received calls for each individual in a sample and then used the averaged measurements as the measurements of the sample. Thus, we obtained 49 samples (7 samples × 7 individuals) for each site for analyzing the overall accuracy of vocal individuality and the variable accuracy for the playback-recording experiments. Because the duration was fixed in the measurements for all individuals, we excluded the duration variable and used only 29 variables for the DFA. Furthermore, we used only six main frequency-based variables (PFSTART, PFEND, PFMIN, PFMAXA, PFMAX, PFMEAN) for further comparison of transmission accuracy between noise-related and noise-unrelated variables because the artificial calls were generated by transforming only the frequency domain. Thus, for each site, we calculated a separate DFA for three datasets: (1) all 29 acoustic variables, (2) the three noise-related variables (PFSTART, PFEND, PFMIN), and (3) the three noise-unrelated variables (PFMAXA, PFMAX, PFMEAN). For the DFA, the overall misclassification rates were also bootstrapped (number of bootstrap samples, n = 20,000), and the overall accuracy values of the three datasets and the associated 95% bias-corrected confidence intervals (BCI) were obtained.
Furthermore, for the six main frequency-based variables, an accuracy value for each variable was calculated as 1 − (|R − B| / B), in which R is the measurement value of the received call and B is the measurement value of the broadcast call. At each site and for each variable, the accuracy values of the seven samples of the same individual were averaged to give the variable accuracy value for that individual. At each site, we then tested for differences in variable accuracy values among variables using Friedman rank tests (with individual as block) and Wilcoxon signed rank tests for paired comparisons between variables. We set the significance level at 0.05 for all tests and report two-tailed probability values.
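The variable accuracy formula above, 1 − (|R − B| / B), and the averaging over samples can be sketched as follows. The measurement values are hypothetical and chosen only to show the calculation.

```python
def variable_accuracy(received: float, broadcast: float) -> float:
    """Accuracy of one received measurement R relative to the broadcast value B.

    Implements 1 - (|R - B| / B); a value of 1 means the received
    measurement equals the broadcast measurement exactly.
    """
    return 1 - abs(received - broadcast) / broadcast


# Hypothetical peak-frequency measurements (Hz) for one artificial individual:
# one broadcast value and the received values from three samples.
broadcast_hz = 4000.0
received_hz = [3960.0, 4040.0, 4000.0]

sample_accuracies = [variable_accuracy(r, broadcast_hz) for r in received_hz]
# Average across samples to get the variable accuracy for this individual.
mean_accuracy = sum(sample_accuracies) / len(sample_accuracies)

print(f"variable accuracy: {mean_accuracy:.4f}")
```

Because the absolute deviation is used, over- and underestimates of the same size (here ± 40 Hz) reduce the accuracy equally.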


Source: Ecology - nature.com
