More stories

  • in

    Opposing shifts in distributions of chlorophyll concentration and composition in grassland under warming

    The CV represents the degree of dispersion of trait values, that is, the size of the trait space (Fig. 6b; Supplementary Note S1); S and K are generally used to describe the shape of the trait distribution (Fig. 6c,d; Supplementary Note S1). Environmental filtering can force a trait to deviate from its original distribution, with characteristically smaller CV and larger S and K values16,17. Partly consistent with our hypothesis, MAT exerted significant positive effects on the CV, S, and K of total Chl concentration, but weaker negative effects on the three values of Chl a/b (Fig. 6a). That is, the distributions of Chl concentration and composition shifted in opposite directions under global warming: Chl concentration was distributed in a broader but more differentiated way (Fig. 7a), while Chl a/b was distributed in a narrower but more uniform way (Fig. 7b).

    Figure 7. Theoretical sketches of distribution shifts for (a) chlorophyll concentration and (b) composition (Chl a/b) under global warming. Purple curves denote the current distributions, and pink ones represent the scenario under global warming. Dashed lines denote the normal distribution in the respective scenarios. It is supposed that the distribution of chlorophyll concentration will shift toward higher mean, CV, S and K values, while Chl a/b shifts toward a higher mean but lower CV, S and K values under warming. Chl, chlorophyll; CV, coefficient of variation; S, skewness; K, kurtosis.

    The trait distributional shift under warming is possibly caused by the relative roles of species turnover and intraspecific variation (due to plasticity and/or heritable differentiation)25. For Chl concentration and composition, very weak phylogenetic signals were found in the three plateaus (Supplementary Table S2), indicating phenotypic plasticity of Chl that has been shaped by the environment over long-term evolution. However, plasticity and intraspecific variation are not the focus of the discussion. 
Because the species compositions differed significantly among the three plateaus, with only a few species overlapping (Supplementary Fig. S3), and the dominant and co-existing species varied gradually along the 30 sites (Supplementary Table S3), shifts in Chl distributions under warming may be interpreted mainly by the alteration of species composition.

    For Chl concentration, a broader trait space (higher CV) and a more skewed distribution (higher S and K) under warming conditions indicate that several new species that differ in function (here referring to rare species with higher Chl concentration) appeared or increased. This contributed to the long tail of the curve and raised the average Chl concentration. At the same time, most of the other species converged at lower Chl concentrations; that is, Chl concentration undergoes more substantial differentiation and functionally contrasting species co-exist under warming. The concentration of Chl is representative of plant growth rate and production ability. Its distribution shift may imply a possible trend of polarisation in functions: both acquisitive and conservative species occur simultaneously. This alteration in species composition indicates changing biotic interactions26. The co-existence of functionally contrasting species allows individuals to avoid competition and enhance the exploitation of resources and niches27,28, which is of great importance in optimising community functions28,29. In desert and alpine regions, functionally contrasting species with large inter-specific trait variations improve community multi-functionality and enable better resistance to climate change17,30.

    However, despite the shift in species composition, the distribution of Chl a/b changed only slightly compared to that of Chl concentration under warming. The ratio of Chl a to Chl b represents the plant allocation to RC and LHC in PS and the efficiency trade-offs between light capture and light conversion6,7. 
This ratio is characteristically conservative, which is mainly manifested in the following aspects: (1) Chl a/b is independent of Chl concentration (orthogonal relationship of the two; Supplementary Fig. S2); (2) Chl a/b is more convergently distributed, with higher K and lower CV (Supplementary Table S1); (3) relatively fixed allometric relationships were found between Chl a and Chl b (apart from TP; Fig. 8). Plants may adjust their RC and LHC allocation to a common ratio of 3:1 despite large variations in light availability or Chl concentration, which has also been confirmed by a study from forests14. Considering that RC is costlier than LHC, plants tend to keep Chl a/b as low as possible unless there is a functional imbalance caused by environmental stress such as warming9,31.

    Figure 8. Standardised major axis regression of chlorophyll a to chlorophyll b in the three plateaus. Slopes are given and compared among regions; different lowercase letters denote significant (p
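
The three distribution descriptors discussed above (CV, S and K) can be computed from a list of trait values with a few lines of standard-library Python. This is a minimal sketch that assumes population (biased) moment estimators and excess kurtosis; the estimators actually used by the authors are defined in their Supplementary Note S1 and may differ. The function name and the example values are hypothetical.

```python
import math


def distribution_descriptors(values):
    """Return (mean, CV, skewness, excess kurtosis) of a list of trait values.

    Assumption: population (biased) central moments are used; the study's
    own estimators (Supplementary Note S1) may apply corrections.
    """
    n = len(values)
    mean = sum(values) / n
    # Central moments of order 2, 3 and 4.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    cv = math.sqrt(m2) / mean          # dispersion relative to the mean
    skewness = m3 / m2 ** 1.5          # asymmetry (long right tail > 0)
    kurtosis = m4 / m2 ** 2 - 3.0      # tail weight relative to a normal
    return mean, cv, skewness, kurtosis


# Hypothetical community: most species low, one rare species very high.
mean, cv, s, k = distribution_descriptors([1.0, 1.0, 1.0, 2.0, 10.0])
print(round(cv, 3), round(s, 3), round(k, 3))
```

A community in which a few rare species carry much higher values than the rest yields a positive S, which is the long right tail sketched for Chl concentration in Fig. 7a.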

  • in

    Bioacoustic classification of avian calls from raw sound waveforms with an open-source deep learning architecture

    This study uses SincNet according to the instructions provided by the authors for its application in a different dataset32. This section provides an introduction to SincNet and NIPS4Bplus before detailing the experimental procedure.

    SincNet

    The first convolutional layer of a standard CNN trained on the raw waveform learns filters from the data, where each filter has a number of parameters that matches the filter length (Eq. 1).

    $$y\left[ n \right] = x\left[ n \right] \times f\left[ n \right] = \sum\limits_{i = 0}^{I - 1} x\left[ i \right] \cdot f\left[ {n - i} \right],$$
    (1)

    where \(x\left[ n \right]\) is the chunk of the sound, \(f\left[ n \right]\) is the filter of length \(I\), and \(y\left[ n \right]\) is the filtered output. All the elements of the filter (\(i\)) are learnable parameters. SincNet replaces \(f\left[ n \right]\) with another function \(g\) that only depends on two parameters per filter: the lower and upper frequencies of a rectangular bandpass filter (Eq. 2).

    $$g\left[ {n,f_{l} ,f_{h} } \right] = 2f_{h} \,\mathrm{sinc}\left( {2\pi f_{h} n} \right) - 2f_{l} \,\mathrm{sinc}\left( {2\pi f_{l} n} \right),$$
    (2)

    where \(f_{l}\) and \(f_{h}\) are the learnable parameters corresponding to the low and high frequencies of the filter and \(\mathrm{sinc}\left( x \right) = \frac{\sin\left( x \right)}{x}\). The function \(g\) is smoothed with a Hamming window and the learnable parameters are initialised with given cut-off frequencies in the interval \(\left[ {0,\frac{{f_{s} }}{2}} \right]\), where \(f_{s}\) is the sampling frequency.

    This first layer of SincNet performs the sinc-based convolutions for a set number and length of filters, over chunks of the raw waveform of given window size and overlap. A conventional CNN architecture follows the first layer; this study maintains that architecture and uses both standard and enhanced settings. The standard settings used are those of the TIMIT speaker recognition experiment27,32. They include two convolutional layers after the first layer with 60 filters of length 5. All three convolutions use layer normalisation. Next, three fully-connected (leaky ReLU) layers with 2048 neurons each follow, normalised with batch normalisation. To obtain frame-level classification, a final softmax output layer, using LogSoftmax, provides a set of posterior probabilities over the target classes. The classification for each file derives from averaging the frame predictions and voting for the class that maximises the average posterior. Training uses the RMSprop optimiser with the learning rate set to 0.001 and minibatches of size 128. A sample of sinc-based filters generated during this study shows their response both in the time and the frequency domains (Fig. 4).

    Figure 4. Examples of learned SincNet filters. The top row (a–c) shows the filters in the time domain; the bottom row (d–f) shows their respective frequency responses.

    The SincNet repository32 provides an alternative set of settings used in the Librispeech speaker recognition experiment27. 
Tests of the alternative settings, which include changes in the hidden CNN layers, provided similar results to those of the TIMIT settings and are included as Supplementary Information 1.

    NIPS4Bplus

    NIPS4Bplus includes two parts: sound files and rich labels. The sound files are the training files of the 2013 NIPS4B challenge for bird song classification23. They are single-channel files with a 44.1 kHz sampling rate and 32 bit depth. They comprise field recordings collected from central and southern France and southern Spain15. There are 687 individual files with lengths from 1 to 5 s for a total length of 48 min. The tags in NIPS4Bplus are based on the labels released with the 2013 Bird Challenge but annotated in detail by an experienced bird watcher using dedicated software15. The rich labels include the name of the species, the class of sound, the starting time and the duration of each sound event for each file. The species include 51 birds, 1 amphibian and 9 insects. For birds there can be two types of vocalisations: call and song; and there is also the drumming of a woodpecker. Calls are generally short sounds with simple patterns, while songs are usually longer with greater complexity and can have modular structures or be produced by only one of the sexes8,13. In the dataset, only bird species have more than one type of sound, with a maximum of two types. The labels in NIPS4Bplus use the same 87 tags present in the 2013 Bird Challenge training dataset with the addition of two other tags: “human” and “unknown” (for human sounds and calls which could not be identified). Tagged sound events in the labels typically correspond to individual syllables, although on some occasions the reviewer merged multiple syllables into single larger events15. The tags cover only 569 files of the original training set of 687 files. Files without tags include 100 that, for the purpose of the challenge, had no bird sounds but only background noise. 
Other files were excluded for different reasons, such as vocalisations that were hard to identify or files containing no bird or only insect sounds15. The 2013 Bird Challenge also includes a testing dataset with no labels that we did not use15.

    The total number of individual animal sounds tagged in the NIPS4Bplus labels is 5478. These correspond to 61 species and 87 classes (Fig. 5). The mean length of each tagged sound ranges from ~ 30 ms for Sylcan_call (the call of Sylvia cantillans, subalpine warbler) to more than 4.5 s for Plasab_song (the song of Platycleis sabulosa, sand bush-cricket). The total recording length for a species ranges from 0.7 s for Turphi_call (the call of Turdus philomelos, song thrush) to 51.4 s for Plasab_song. The number of individual files for each call type varies greatly from 9 for Cicatr_song (the call of Cicadatra atra, black cicada) to 282 for Sylcan_call.

    Figure 5. Distribution of sound types by number of calls (number of files) and total length in seconds. Sound types are sorted first by taxonomic group and then by alphabetical order.

    Processing NIPS4Bplus

    The recommended pre-processing of human speech files for speaker recognition using SincNet includes the elimination of silent leading and trailing sections and the normalisation of the amplitude27. This study attempts to replicate this by extracting each individual sound as a new file according to the tags provided in the NIPS4Bplus labels. A Python script42 uses the content of the labels to read each wavefile, apply normalisation, select the time of origin and length specified in each individual tag and save it as a new wavefile. The name of the new file includes the original file name and a sequential number suffix according to the order in which tags are listed in the label files (the start time of the sound) to match the corresponding call tags at the time of processing. Each wavefile in the new set fully contains a sound according to the NIPS4Bplus labels. 
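
The cropping step described above can be sketched as follows. This is an illustration only, not the authors' script42: it assumes peak normalisation of the whole recording before cropping, represents audio as a plain list of float samples, and uses a hypothetical file-naming pattern.

```python
def normalise(samples):
    """Scale samples so that the largest absolute value is 1.0.

    Assumption: peak normalisation; the original script may normalise
    differently.
    """
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)
    return [s / peak for s in samples]


def crop_event(samples, sample_rate, start_s, duration_s):
    """Return the normalised samples of one tagged sound event."""
    first = int(round(start_s * sample_rate))
    count = int(round(duration_s * sample_rate))
    return normalise(samples)[first:first + count]


def event_filename(original_name, index):
    """Original name plus a sequential suffix (hypothetical pattern)."""
    stem = original_name.rsplit(".", 1)[0]
    return f"{stem}_{index:03d}.wav"


# Toy recording sampled at 4 Hz with one event from 0.25 s lasting 0.5 s.
recording = [0.0, 0.5, -0.25, 0.1, 0.0, 0.2, -0.1, 0.05]
print(crop_event(recording, 4, 0.25, 0.5))
print(event_filename("recording001.wav", 2))
```

Writing the cropped samples to disk is left out so that the sketch stays independent of any particular audio library.
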
A cropped file may contain sounds from more than one species15, with over 20% of the files in the new set overlapping, at least in part, with sound from another species. The machine learning task does not use files containing background noise or the other parts of the files that are not tagged in the NIPS4Bplus labels. A separate Python script42 generates the lists of files and tags that SincNet requires for processing. The script randomly generates a 75:25 split into lists of train and test files and a Python dictionary key that assigns each file to the corresponding tag according to the file name. The script selects only files confirmed as animal sounds (excluding the tags “unknown” and “human”) and generates three different combinations of tags, as follows: (1) “All classes”: includes all the 87 types of tags originally included in the 2013 Bird Challenge training dataset; (2) “Bird classes”: excludes tags for insects and one amphibian species for a total of 77 classes; and (3) “Bird species”: one class for each bird species independently of the sound type (calls, songs and drumming are merged for each species) for a total of 51 classes. The script also excludes three very short files (length shorter than 10 ms) which could not be processed without code modifications.

    To facilitate the repeatability of the results, this study attempts to maintain the default parameters of SincNet used in the TIMIT speaker identification task27,32. The number and length of filters in the first sinc-based convolutional layer were set to the same values as in the TIMIT experiment (80 filters of length 251 samples), as was the architecture of the CNN. The filters were initialised following the Mel scale cut-off frequencies. 
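
To make the sinc-based first layer concrete, the sketch below builds a single band-pass filter from Eq. 2, smoothed with a Hamming window, using the filter length of 251 samples and the 44,100 Hz sampling frequency mentioned in the text. It is a standalone illustration in plain Python, not the SincNet implementation: the cut-off values are arbitrary, frequencies are normalised by the sampling frequency (an assumption about how Eq. 2 is applied), and the helper names are hypothetical.

```python
import math


def sinc(x):
    """sinc(x) = sin(x) / x, with the limit value 1 at x = 0."""
    return 1.0 if x == 0.0 else math.sin(x) / x


def sinc_bandpass(f_low_hz, f_high_hz, fs_hz, length=251):
    """Hamming-windowed band-pass filter following Eq. 2.

    Assumption: cut-offs are expressed in cycles per sample (f / fs).
    """
    fl = f_low_hz / fs_hz
    fh = f_high_hz / fs_hz
    half = (length - 1) // 2
    taps = []
    for k in range(length):
        n = k - half  # centre the filter on n = 0
        g = 2 * fh * sinc(2 * math.pi * fh * n) - 2 * fl * sinc(2 * math.pi * fl * n)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * k / (length - 1))
        taps.append(g * window)
    return taps


def gain_at(taps, freq_hz, fs_hz):
    """Magnitude of the filter's frequency response at one frequency."""
    re = sum(t * math.cos(2 * math.pi * freq_hz * k / fs_hz) for k, t in enumerate(taps))
    im = sum(t * math.sin(2 * math.pi * freq_hz * k / fs_hz) for k, t in enumerate(taps))
    return math.hypot(re, im)


# Illustrative 1-3 kHz band at the dataset's 44.1 kHz sampling rate.
taps = sinc_bandpass(1000.0, 3000.0, 44100.0)
print(round(gain_at(taps, 2000.0, 44100.0), 3), round(gain_at(taps, 10000.0, 44100.0), 3))
```

Only the two cut-off frequencies define each filter, so a layer of 80 such filters has 160 learnable parameters in place of the 80 × 251 of a standard convolutional layer.
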
We did change the following parameters: (1) reduced the window frame size (cw_len) from 200 to 10 ms to accommodate the short duration of some of the sounds in the NIPS4Bplus tags (such as some bird vocalisations); (2) reduced the window shift (cw_shift) from 10 to 1 ms in proportion to the reduction in window size (a value of 0.5 could not be given without code modifications); (3) updated the sampling frequency setting (fs) from the TIMIT 16,000 Hz to the 44,100 Hz of the present dataset; and (4) updated the number of output classes (class_lay) to match the number of classes in each training run.

    To evaluate performance, the training sequence was repeated with the same settings and different random train and test file splits. Five training runs took place for each selection of tags: “All classes”, “Bird classes” and “Bird species”.

    Enhancements and comparisons

    Changes in the parameters of SincNet result in different levels of performance. To assess possible improvements and provide baselines to compare against other models, we attempted to improve the performance by adjusting a series of parameters, but did not modify the number of layers or make functional changes to the code other than the two outlined below. The parameters tested include: the length of the window frame size, the number and length of the filters in the first layer, the number of filters and lengths of the other convolutional and fully connected layers, the length and types of normalisation in the normalisation layers, alternative activation and classification functions, and the inclusion of dropouts (Supplementary Information 1). In addition, the SincNet code includes a hard-coded random amplification of each sound sequence; we also tested changing the level of this random amplification and excluding it through changes in the code. 
In order to process window frames larger than some of the labelled calls in the NIPS4Bplus dataset, the procedure outlined earlier, in which files are cut according to the labels, was replaced by a purpose-built process. The original files were not cut; instead, a custom Python script42 generated train and test file lists that contain the start and length of each labelled call. A modification of the SincNet code42 uses these lists to read the original files and select the labelled call. When the call is shorter than the window frame, the code randomly includes the surrounding part of the file to complete the length of the window frame. Grid searches for individual parameters or combinations of similar parameters, over a set number of epochs, selected the best performing values. We also tested the use of the Additive Margin Softmax (AM-softmax) as a cost function37. The best performing models reported in the results use combinations of the best parameter values (Supplementary Information 1). All enhancements and model comparisons use the same dataset selection, that is, the same train and test dataset split, of the normalised files for each set of tagged classes.

    The comparison using waveform + CNN models trained directly on the raw waveform replaces the initial sinc-based convolution of SincNet with a standard 1D convolutional layer27, thus retaining the same network architecture as SincNet. As with the SincNet enhancements, a series of parameter searches provided the parameter combinations for the best performing models.

    The pre-trained models used for comparison are DenseNet121, ResNet50 and VGG16, with architectures and weights sourced from the Torchvision library of PyTorch33. We tested three types of spectrograms to fine-tune the pre-trained models: Fast Fourier Transform (FFT), Mel spectrum (Mel) and Mel-frequency cepstral coefficient (MFCC). FFT calculations used a frame length of 1024 samples, 896 samples overlap and a Hamming window. 
Mel spectrogram calculations used 128 Mel bands. Once normalised and scaled to 255 pixel intensity, three repeats of the same spectrogram represented each of the three input channels of the pre-trained models. The length of sound used to generate the spectrograms was 3 s and, similarly to the routines above, for labelled calls shorter than 3 s the spectrogram would randomly include the surrounding sounds. That is, the 3 s extract is placed at random within the interval between the end of the labelled call minus 3 s and the start of the call plus 3 s, so that it wholly includes the labelled call but the position of the call within the 3 s sample is random. A fully connected layer replaced the final classifying layer of the pre-trained models to output the number of labelled classes. In the fine-tuning process the number of trainable layers of the model was not limited to the final fully connected layer, but also included an adjustable number of final layers to improve the results. The learning rate, set initially to 0.0001, was halved if the validation loss stopped decreasing for 10 epochs.

    Metrics

    Measures of performance include accuracy, ROC AUC, precision, recall, F1 score, top 3 accuracy and top 5 accuracy. Accuracy, calculated as part of the testing routine, is the ratio between the number of correctly predicted files of the test set and the total number of test files. The calculation of the other metrics uses the Scikit-learn module43, relying on the predicted values provided by the model and performing weighted averages. The ROC AUC calculation uses the mean of the posterior probabilities provided by SincNet for each tagged call. In the pre-trained models the ROC AUC calculations used the probabilities obtained after normalising the output with a softmax function.
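
The file-level decision rule (averaging frame posteriors and choosing the class with the highest mean) and the accuracy and top-k accuracy measures can be illustrated with a short, self-contained sketch. It assumes posteriors are already plain probabilities held in nested lists; the function names and example numbers are hypothetical, and the study itself computes most metrics with Scikit-learn.

```python
def file_posterior(frame_posteriors):
    """Average the per-frame class posteriors of one file."""
    n_frames = len(frame_posteriors)
    n_classes = len(frame_posteriors[0])
    return [sum(frame[c] for frame in frame_posteriors) / n_frames
            for c in range(n_classes)]


def top_k_accuracy(file_posteriors, true_labels, k=1):
    """Fraction of files whose true class is among the k highest posteriors."""
    hits = 0
    for posterior, label in zip(file_posteriors, true_labels):
        ranked = sorted(range(len(posterior)), key=lambda c: posterior[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(true_labels)


# Two hypothetical test files with three classes each.
file_a = file_posterior([[0.6, 0.3, 0.1], [0.2, 0.5, 0.3], [0.7, 0.2, 0.1]])
file_b = file_posterior([[0.1, 0.4, 0.5], [0.2, 0.5, 0.3]])
labels = [0, 2]
print(top_k_accuracy([file_a, file_b], labels, k=1))
print(top_k_accuracy([file_a, file_b], labels, k=2))
```

With k = 1 this reduces to the plain accuracy defined above; larger k gives the top 3 and top 5 variants.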

  • in

    Illumina iSeq 100 and MiSeq exhibit similar performance in freshwater fish environmental DNA metabarcoding

    Sample collection and filtration

    We used 40 water samples for eDNA metabarcoding from 27 sites in 9 rivers and 13 lakes in Japan from 2016 to 2018 (Fig. 4). Sampling ID and detailed information for each site are listed in Supplementary Table S1. In the river water sampling, 1-L water samples were collected from the surface at the shore of each river using bleached plastic bottles. In the field, 1 mL of benzalkonium chloride solution (BAC, Osvan S, Nihon Pharmaceutical, Tokyo, Japan)33 was added to each water sample to suppress eDNA degradation before filtering the water samples. We did not include field negative control samples in the HTS library, considering the aim of the present study. The lake samples were provided by Doi et al. (2020)34 as DNA extracted samples. In the lake samples, 1-L water samples were collected from the surface at shore sites at each lake. The samples were then transported to the laboratory in a cooler at 4 °C. Each of the 1-L water samples was filtered through a GF/F glass fiber filter (nominal pore size = 0.7 μm; diameter = 47 mm; GE Healthcare Japan Corporation, Tokyo, Japan) and divided into two parts (maximum 500 mL of water per GF/F filter). To prevent cross-contamination among the water samples, the filter funnels and the measuring cups were bleached after filtration. All filtered samples were stored at -20 °C in the freezer until the DNA extraction step.

    Figure 4. Sampling sites used in the present study. Blue circles and orange triangles show the locations of the river and lake samples, respectively. Detailed information on each site is listed in Supplementary Table S1. This map was drawn using QGIS ver.3.10 (http://www.qgis.org/en/site/) based on the Administrative Zones Data (http://nlftp.mlit.go.jp/ksj/gml/datalist/KsjTmplt-N03-v2_3.html), which were obtained from the free download service of the National Land Numerical Information (http://nlftp.mlit.go.jp/ksj/index.html, edited by RN). 
No permission was required to edit and publish the map data.

    DNA extraction and library preparation

    The total eDNA was extracted from each filtered sample using the DNeasy Blood and Tissue Kit (QIAGEN, Hilden, Germany). Extraction followed Uchii et al.35, with a few modifications. A filtered sample was placed in the upper part of a Salivette tube and 440 μL of a solution containing 400 μL Buffer AL and 40 μL Proteinase K was added. The tube with the filtered sample was incubated at 56 °C for 30 min. Afterward, the tube was centrifuged at 5000 × g for 3 min, and the solution at the bottom part of the tube was collected. To increase eDNA yield, 220 μL of Tris–EDTA (TE) buffer was added to the filtered sample and the sample was re-centrifuged at 5000 × g for 1 min. Subsequently, 400 μL of ethanol was added to the collected solution, and the mixture was transferred to a spin column. Afterward, the total eDNA was eluted in 100 μL of buffer AE according to the manufacturer’s instructions. All eDNA samples were stored at -20 °C until the library preparation step.

    In the present study, we used the universal primer set “MiFish” for eDNA metabarcoding9. The amplicon library was prepared according to the following protocols. In the first PCR, the total reaction volume was 12 μL, containing 6.0 μL 2 × KOD buffer, 2.4 μL dNTPs, 0.2 μL KOD FX Neo (TOYOBO, Osaka, Japan), 0.35 μL each of the MiFish-U-F (5ʹ-ACACTCTTTCCCTACACGACGCTCTTCCGATCTNNNNNNGTCGGTAAAACTCGTGCCAGC-3ʹ), MiFish-U-R (5ʹ-GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTNNNNNNCATAGTGGGGTATCTAATCCCAGTTTG-3ʹ), MiFish-E-F (5ʹ-ACACTCTTTCCCTACACGACGCTCTTCCGATCTNNNNNNRGTTGGTAAATCTCGTGCCAGC-3ʹ) and MiFish-E-R (5ʹ-GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTNNNNNNGCATAGTGGGGTATCTAATCCTAGTTTG-3ʹ) primers with the Illumina sequencing primer region and 6-mer Ns, and 2 μL template DNA. The thermocycling conditions were 94 °C for 2 min; 35 cycles of 98 °C for 10 s, 65 °C for 30 s and 68 °C for 30 s; followed by 68 °C for 5 min. 
The first PCR was repeated four times for each sample, and the replicated samples were pooled as a single first PCR product for use in the subsequent step. The pooled first PCR products were purified using the Solid Phase Reversible Immobilization select Kit (AMPure XP; BECKMAN COULTER Life Sciences, Indianapolis, IN, USA) according to the manufacturer’s instructions. The DNA concentrations of purified first PCR products were measured using a Qubit dsDNA HS assay kit and a Qubit 3.0 fluorometer (Thermo Fisher Scientific, Waltham, MA, USA). All purified first PCR products were diluted to 0.1 ng/μL with H2O, and the diluted samples were used as templates for the second PCR. In the first PCR step, the PCR negative controls (four replicates) were included in each experiment. A total of three PCR negative controls were included in the library (PCR Blank 1–3 samples in Supplementary Table S1, S2, S4, and S5).

    The second PCR was performed to add HTS adapter sequences with 8-bp dual indices. The total reaction volume was 12 μL, containing 6.0 μL 2 × KAPA HiFi HotStart ReadyMix, 1.4 μL each of forward and reverse primers (2.5 μM), 1 μL purified first PCR product, and 2.2 μL H2O. The thermocycling conditions were 95 °C for 3 min; 12 cycles of 98 °C for 20 s and 72 °C for 15 s; followed by 72 °C for 5 min.

    The indexed second PCR products were pooled in equal volumes, 25 μL of the pooled library was loaded onto a 2% E-Gel SizeSelect agarose gel (Thermo Fisher Scientific), and the target library size (ca. 370 bp) was collected. The quality of the amplicon library was checked using an Agilent 2100 Bioanalyzer and Agilent 2100 Expert (Agilent Technologies Inc., Santa Clara, CA, USA), and the DNA concentration of the amplicon library was measured using a Qubit dsDNA HS assay kit and a Qubit 3.0 fluorometer.

    High-throughput sequencing

    The amplicon library was sequenced using iSeq and MiSeq platforms (Illumina, San Diego, CA, USA). 
To normalize the percentage of pass-filtered read numbers, the sequencing runs using the same libraries were performed using iSeq i1 Reagent and MiSeq Reagent Kit v2 Micro. Both sequencing runs were performed with 8 million paired-end reads and 2 × 150 bp read lengths. Each library was spiked with approximately 20% PhiX control (PhiX Control Kit v3, Illumina, San Diego, CA, USA) before the sequencing runs according to the recommendation of Illumina. The wells of cartridges in the iSeq run were loaded with 20 μL of 50 pM library pool, and sequencing was performed at Yamaguchi University, Yamaguchi, Japan. The wells of cartridges for MiSeq runs were loaded with 600 μL of 16 pM library pool, and sequencing was performed at Illumina laboratories (Minato-ku, Tokyo, Japan). Subsequently, the sequencing dataset outputs from iSeq and MiSeq were subjected to pre-processing and taxonomic assignments. All sequence data are registered in the DNA Data Bank of Japan (DDBJ) Sequence Read Archive (DRA, Accession number: DRA10593).

    Pre-processing and taxonomic assignments

    We used USEARCH v11.066736 for all data pre-processing activities and taxonomic assignment of the HTS datasets obtained from the iSeq and MiSeq platforms16,37. First, paired-end reads (R1 and R2 reads) generated from the iSeq and MiSeq platforms were assembled using the “fastq_mergepairs” command with a minimum overlap of 10 bp. In the process, the low-quality tail reads with a cut-off threshold at a Phred score of 2, and the paired reads with too many mismatches (> 5 positions) in the aligned regions, were discarded38. Secondly, the primer sequences were removed from the merged reads using the “fastx_truncate” command. Afterward, read quality filtering was performed using the “fastq_filter” command, discarding reads with more than 1.0 expected errors or a length of less than 50 bp. 
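
The expected-error criterion used in the quality-filtering step can be illustrated directly from the definition of the Phred score, Q = −10 log10(P). The sketch below is not USEARCH code; it assumes Phred+33 encoded quality strings and uses hypothetical function names, with the thresholds taken from the text (maximum expected error 1.0, minimum length 50 bp).

```python
def phred_to_error_probability(q):
    """Probability that a base with Phred score q is called incorrectly."""
    return 10 ** (-q / 10)


def expected_errors(quality_string, offset=33):
    """Sum of per-base error probabilities of one read.

    Assumption: qualities are ASCII-encoded with a Phred+33 offset.
    """
    return sum(phred_to_error_probability(ord(ch) - offset) for ch in quality_string)


def passes_filter(sequence, quality_string, max_ee=1.0, min_len=50):
    """Keep a read only if it is long enough and has few expected errors."""
    return len(sequence) >= min_len and expected_errors(quality_string) <= max_ee


# A 60-base read with Q40 at every position ("I" in Phred+33).
read = "A" * 60
print(expected_errors("I" * 60))
print(passes_filter(read, "I" * 60))
print(passes_filter(read, "#" * 60))  # Q2 at every position
```

The same conversion shows that a score of Q30 corresponds to an error probability of 0.001, that is, one miscalled base in 1000.
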
The pre-processed reads were dereplicated using the “fastx_uniques” command, and chimeric reads and sequences with fewer than 10 reads were removed from all samples as potential sequence errors. Finally, an error-correction of amplicon reads, which checks and discards PCR errors and chimeric reads, was performed using the “unoise3” command, which implements the UNOISE3 algorithm39. Before the taxonomic assignment, the processed reads from the above steps were subjected to a sequence similarity search using the “usearch_global” command against reference databases of fish species that had been established previously (MiFish local database v34). The sequence similarity and cut-off E-value were 99% and 10⁻⁵, respectively. If there was only one species with ≥ 99% similarity, the sequence was assigned to the top-hit species. Conversely, sequences assigned to two or more species at ≥ 99% similarity were merged as a species complex and listed in the synonym group. Generally, the species complexes were assigned to the genus level (e.g., Asian crucian carp Carassius spp.). Species that were unlikely to inhabit Japan were excluded from the candidate list of species complexes. For example, the sequence of a bitterling, Acheilognathus macropterus, matched two other species, A. barbatus and A. chankaensis, as second-hit candidates; however, these two species are not currently found in Japan. Therefore, the sequence was assigned to A. macropterus in the present study. Because we targeted only freshwater fish species, we removed the operational taxonomic units (OTUs) assigned to marine and brackish fishes from each sample. 
Finally, the sequence reads of each fish species were arranged into a matrix, with the rows and columns representing the sites and fish species (or genera), respectively.

    We evaluated sequence quality based on (1) the percentage of clusters passing filter (% PF) and (2) the percentage of bases with a sequencing quality score ≥ 30 (% Q30; Read1 and Read2) between the iSeq and MiSeq platforms. The % PF value is an indicator of signal purity for each cluster40. The condition leads to poor template generation, which decreases the % PF value40. In the present study, a % PF value > 80% was set as the threshold of sequence quality in iSeq and MiSeq runs. Sequence quality scores (Q scores) measure the probability that a base is called incorrectly. Higher Q scores indicate a lower probability of sequencing error, and lower Q scores indicate a higher probability of false-positive variant calls resulting in inaccurate conclusions41. In the present study, the % Q30 values (error rate = 0.1%) were used for the comparison of sequence quality between iSeq and MiSeq. The parameters were collected directly using Illumina BaseSpace Sequence Hub. We also evaluated changes in sequence reads in pre-processing steps between the iSeq and MiSeq platforms. Sequence reads were assessed based on (1) merge pairs, (2) quality filtering, and (3) denoising. In each step, the change in the number of reads before and after processing was calculated. The calculated numbers of sequence reads are listed in Supplementary Table S2 and S3 in series.

    Comparing sequence quality and fish fauna between iSeq and MiSeq

    To test the relationship of the remaining sequence reads between iSeq and MiSeq in each pre-processing step, we performed Spearman’s rank correlation test in each step. In the present study, however, the sequencing run by iSeq and MiSeq was performed only once each for the same sample. 
Therefore, we could not assess the variability of the sequence reads in quality checks and taxonomic assignment in the same samples between iSeq and MiSeq.

    Before the comparison of fish fauna, rarefaction curves were drawn for each sample in both iSeq and MiSeq to confirm that the sequencing depth adequately covered the species composition, using the “rarecurve” function of the “vegan” package ver. 2.5–6 (https://github.com/vegandevs/vegan) in R ver. 3.6.242. In the present study, differences in the numbers of sequence reads among samples were confirmed in the two sequencers, but rarefaction curves were saturated in all iSeq and MiSeq samples (Supplementary Fig. S6 and S7). We performed a rarefaction using the “rrarefy” function in the “vegan” package to match the iSeq sequence depth of each sample with that of MiSeq. However, the number of species in each iSeq sample did not change after the rarefaction. Therefore, we used the raw data set before the rarefaction for the subsequent analyses.

    We compared the species detection capacities of iSeq and MiSeq based on environmental DNA metabarcoding. Using fish faunal data obtained from iSeq and MiSeq, non-metric multidimensional scaling (NMDS) was performed in 1000 separate runs using the “metaMDS” function in the “vegan” package ver. 2.5–6. For NMDS, the dissimilarity of the fish fauna was calculated based on the incidence-based Jaccard indices. To evaluate the differences in species composition and variance across sites between the two HTS platforms, we performed a permutational multivariate analysis of variance (PERMANOVA) and the permutational analyses of multivariate dispersions (PERMDISP) with 10,000 permutations, respectively. For the PERMANOVA and PERMDISP, we used the “adonis” and “betadisper” functions in the “vegan” package ver. 
2.5-6.Comparison of fish species detectability between eDNA metabarcoding and conventional methodsWe evaluated species detectability between the two HTS by comparing the fish species lists of the two HTS with lists from conventional methods. Five sampling sites were selected from Kyushu and Chugoku districts (R23–27 in Fig. 4). The fish fauna data obtained by conventional methods were based on the results of a previous study43. The conventional surveys were conducted through hand-net sampling and visual observation by snorkeling (see a previous study43 for the detailed methods). The count data of each species were replaced with the incidence-based datasets (presence or absence) for comparing with the eDNA metabarcoding datasets. Fish sequence reads of each sampling site obtained by eDNA metabarcoding were also replaced with the incidence-based data.To test the detectability of species observed by conventional methods, the fish species compositions in five rivers were compared between the eDNA metabarcoding (iSeq and MiSeq) and the conventional methods. To visualize the differences in the species composition between HTSs and conventional methods, heat maps were illustrated for each sampling site. To assess differences in the number of species among methods at each river, the repeated measures analysis of variance (ANOVA) was performed among iSeq, MiSeq, and conventional methods. If a significant difference was found in repeated measures ANOVA, the Tukey–Kramer multiple comparison test was performed to analyze differences among methods.Using fish faunal data obtained from iSeq, MiSeq, and conventional methods, the NMDS was performed in 1000 separate runs with Jaccard indices. The PERMANOVA was performed with 1000 permutations to assess the differences in fish fauna among the methods and sites. Furthermore, to evaluate variance across sites among methods, the PERMDISP was also performed with 1000 permutations. 
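The incidence-based Jaccard dissimilarity underlying the NMDS and PERMANOVA steps can be illustrated with a short sketch. It assumes the usual definition, one minus the number of shared species divided by the number of species found by either method, and it does not reproduce the “vegan” implementation. The species labels and read counts are hypothetical placeholders, not data from the study.

```python
def to_incidence(counts: dict) -> set:
    """Convert read or individual counts to presence/absence (a set of detected species)."""
    return {species for species, n in counts.items() if n > 0}


def jaccard_dissimilarity(a: set, b: set) -> float:
    """Incidence-based Jaccard dissimilarity: 1 - |A ∩ B| / |A ∪ B|."""
    union = a | b
    if not union:
        return 0.0  # two empty samples are treated as identical
    return 1.0 - len(a & b) / len(union)


if __name__ == "__main__":
    # Hypothetical detections at one site by two methods.
    iseq_reads = {"species_A": 1200, "species_B": 35, "species_C": 410, "species_D": 0}
    conventional_counts = {"species_B": 4, "species_C": 11, "species_D": 2}

    iseq = to_incidence(iseq_reads)
    conventional = to_incidence(conventional_counts)
    print(jaccard_dissimilarity(iseq, conventional))
```

Converting both data types to sets first makes the comparison independent of read depth or catch size, which is the reason the count data are reduced to presence or absence before the methods are compared.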
To visualize the number of species detected by each method and the number of species shared between methods, Venn diagrams were drawn for each river using the “VennDiagram” package ver. 1.6.2 in R44.

  • in

    Effect of sowing proportion on above- and below-ground competition in maize–soybean intercrops

    Site description

    Field experiments were conducted at the Changwu Experimental Station (35° 12′ N, 107° 40′ E, altitude 1200 m) in Shaanxi Province, China. The experimental site lies in a typical dryland farming area on the Loess Plateau. Annual precipitation in the area averaged 582 mm between 1957 and 2013, with a mean annual temperature of 9.7 °C over that period. Rainfall and temperature during the two study years are shown in Fig. S1. Soils were generally of the Calcaric Regosol group, according to the FAO/UNESCO soil classification system52, and were composed of 4% sand, 59% silt, and 37% clay53. The properties of the 0–20 cm soil layer were as follows: pH, 8.4; organic matter content, 11.8 g kg−1; total N content, 0.87 g kg−1; and Olsen-P, 14.4 mg kg−1.

    Experimental design and field management

    A two-year experiment was arranged in a randomized complete block design with three replicate plots during the 2012 and 2013 growing seasons25,54. The study used soybean (Glycine max L.) cv. Zhonghuang 24 and maize (Zea mays L.) cv. Zhengdan 958 grown in cereal–legume agricultural systems. Zhonghuang 24 was bred from Jilin 21 and Fendou 31 × Zhongdou 19 (deposition number 2008003); Zhengdan 958 is the offspring of the inbred lines Zheng 58 and Chang 7-2 (deposition number 20000009). Both cultivars are approved in China. The cropping system treatments were as follows:

    1. Sole-cropped soybean (S).

    2. Sole-cropped maize (M).

    3. Two rows of maize intercropped with two rows of soybean (M2S2).

    4. Two rows of maize intercropped with four rows of soybean (M2S4).

    Each plot measured 6 m × 4 m, with a row spacing of 50 cm for both maize and soybean in sole crops and intercrops. Individual plants were spaced at 22 cm for maize and 19 cm for soybean, with one plant per stand for maize and two plants per stand for soybean, to attain densities of 90,000 and 210,000 plants ha−1, respectively. In 2012, maize and soybean were sown on 25 April and harvested on 28 September; in 2013, they were sown on 20 April and harvested on 25 September. Before sowing, basal fertilizer was applied at a rate of 90 kg N ha−1 as urea (46% N) and 150 kg P2O5 ha−1 as superphosphate (12% P2O5); the fertilizers were spread uniformly over each plot and then ploughed into the 0–30 cm soil layer using a rotary tiller. All plots received 67.5 kg N ha−1 as urea at the bell and silking stages, applied with a hole-seeding machine. No irrigation was applied, and weeds were removed by hand when sighted. The research on plants complied with relevant institutional, national, and international guidelines and legislation.

    Above- and below-ground measurements

    The net photosynthetic rate (Pn) was measured with a LI-6400 portable photosynthesis system (LI-COR Inc., Lincoln, NE, USA) from 9:00 to 11:00 h at 120 days after sowing, which corresponds to the milk stage in maize and the full seed stage in soybean7,13. We measured photosynthesis of the ear leaves of maize and of the first spreading leaves at the top of soybean in both the sole crops and the intercrops. The Pn values were calculated as the sum of the mean readings for five leaves in each plot. The leaf area index (LAI) and diffuse non-interceptance (DIFN) were recorded using a Plant Canopy Analyzer (Li-2200, LI-COR Inc., Lincoln, NE, USA) without direct sunlight at the milk stage of maize. One above-canopy measurement and three below-canopy measurements at the soil surface were taken for four replicates in each plot. 
SPAD readings were collected using a hand-held dual-wavelength meter (SPAD 502 chlorophyll meter, Minolta Camera Co., Ltd., Japan) at the milk stage of maize. Measurements were taken midway along the ear leaves of maize and the first spreading leaves at the top of soybean, from five adjacent plants at the center of the row in each plot.

    Soil water storage (SWS) was measured gravimetrically using a soil auger, at 10 cm intervals to a depth of 100 cm and at 20 cm intervals from 100 to 200 cm, at the milk stage of maize, with three replicates in each plot. The SWS of the 0–200 cm soil profile was calculated for each plot using the following formula: SWS = SWC × SD × SBD, where SWC represents soil water content, SD represents soil depth, and SBD represents soil bulk density. Apparent water use during the crop growing season was expressed as evapotranspiration (ET), which was determined according to the following formula: ET = ΔSWS + P, where ΔSWS is the change in soil water storage in the top 200 cm and P is the rainfall (mm) between planting and the milk stage of maize. Six adjacent plants were sampled at the milk stage of maize from the middle two rows of each plot (Fig. S2). The samples included shoots and roots of maize and soybean. Above-ground parts were separated from below-ground parts at the cotyledonary node. Soil cores (9 cm diameter × 15 cm) were collected within the crop rows to a depth of 100 cm using an auger and separated into 10-cm sections to determine root growth in the sole-cropping and intercropping systems. The samples were exposed to 105 °C for 30 min and then dried to a constant weight at 75 °C. After grinding, the oven-dried samples were stored in small plastic bags. Among mineral elements, N and P uptake are the most commonly studied55,56. 
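The SWS and ET formulas given above can be sketched as a small calculation. This is only an illustration under stated assumptions: SWC is taken as gravimetric water content (g g−1), SD as layer thickness in mm, and SBD as bulk density in g cm−3, so that SWS comes out in mm of water, and ΔSWS is taken as storage at sowing minus storage at sampling. The source does not spell out these units or the sign convention, and the numbers below are hypothetical.

```python
def soil_water_storage(layers) -> float:
    """Sum SWS = SWC x SD x SBD over soil layers.

    layers: iterable of (swc_g_per_g, thickness_mm, bulk_density_g_cm3) tuples.
    Returns soil water storage in mm (assumed units, see lead-in).
    """
    return sum(swc * thickness * bulk_density for swc, thickness, bulk_density in layers)


def evapotranspiration(sws_start_mm: float, sws_end_mm: float, rainfall_mm: float) -> float:
    """ET = delta SWS + P, with delta SWS assumed to be start minus end storage."""
    return (sws_start_mm - sws_end_mm) + rainfall_mm


if __name__ == "__main__":
    # Hypothetical two-layer profile: (water content, thickness, bulk density).
    profile = [(0.20, 100, 1.3), (0.15, 200, 1.4)]
    print(soil_water_storage(profile))
    print(evapotranspiration(sws_start_mm=300.0, sws_end_mm=250.0, rainfall_mm=400.0))
```

Summing layer by layer matters because water content and bulk density usually differ with depth, which is why sampling was done at fixed depth intervals.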
Concentrations of N and P in the plant dry matter were determined after digestion with H2SO4 and H2O2; N concentration was measured according to the Kjeldahl method20, whereas P concentration was measured by the molybdenum-antimony anti-spectrophotometric method16. Crop N and P uptake were calculated as the above-ground biomass multiplied by the plant tissue N and P concentrations. Grain yield was estimated at harvest from 6 m2 for maize and soybean, based on the average of three plot replicates.

    Data analysis

    The land equivalent ratio (LER) was used to assess the land-use advantage. LER is the sum of the ratios of intercrop yield to sole-crop yield for maize and soybean57:
    $$ LER = LER_{m} + LER_{s}, \; LER_{m} = \frac{Y_{im}}{Y_{sm}}, \; LER_{s} = \frac{Y_{is}}{Y_{ss}} $$
    where LERm and LERs are the partial LER for maize and soybean, respectively. Yim and Yis are the yields of maize and soybean under intercropping, respectively. Ysm and Yss are the yields of maize and soybean under sole cropping, respectively.

    The water equivalent ratio (WER) was calculated to measure the water-use advantage of intercropping58:
    $$ WER = WER_{m} + WER_{s}, \; WER_{m} = \frac{Y_{im}/ET_{im}}{Y_{sm}/ET_{sm}}, \; WER_{s} = \frac{Y_{is}/ET_{is}}{Y_{ss}/ET_{ss}} $$
    where WERm and WERs are the partial WER for maize and soybean, respectively. ETim and ETis are the ET of maize and soybean under intercropping, respectively. ETsm and ETss are the ET of maize and soybean under sole cropping, respectively.

    All analyses were conducted in SPSS Statistics 17.0 (SPSS Inc., Chicago, IL, USA). Differences among cropping systems in yield and in above- and below-ground parameters (Pn, LAI, SPAD, DIFN, SWS, N and P uptake) were tested using one-way ANOVA, and treatment means were separated by the least significant difference (LSD) test at the 5% level. 
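The LER and WER definitions above translate directly into a few lines of arithmetic. The sketch below follows the two formulas as written; the function names and the yield and ET values are hypothetical and serve only to show how the partial ratios combine.

```python
def land_equivalent_ratio(y_im: float, y_sm: float, y_is: float, y_ss: float):
    """Return (LER_m, LER_s, LER) from intercrop and sole-crop yields."""
    ler_m = y_im / y_sm  # partial LER for maize
    ler_s = y_is / y_ss  # partial LER for soybean
    return ler_m, ler_s, ler_m + ler_s


def water_equivalent_ratio(y_im, et_im, y_sm, et_sm, y_is, et_is, y_ss, et_ss):
    """Return (WER_m, WER_s, WER): ratios of yield per unit ET, intercrop vs sole crop."""
    wer_m = (y_im / et_im) / (y_sm / et_sm)
    wer_s = (y_is / et_is) / (y_ss / et_ss)
    return wer_m, wer_s, wer_m + wer_s


if __name__ == "__main__":
    # Hypothetical yields (t/ha) and ET (mm); not values from the study.
    ler_m, ler_s, ler = land_equivalent_ratio(y_im=8.0, y_sm=10.0, y_is=1.5, y_ss=3.0)
    print(ler_m, ler_s, ler)

    wer = water_equivalent_ratio(8.0, 400.0, 10.0, 450.0, 1.5, 400.0, 3.0, 380.0)
    print(wer)
```

A total above 1 indicates that the intercrop uses land (LER) or water (WER) more efficiently than the two sole crops grown separately, which is the usual reading of these ratios.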
The variation in Pn, LAI, SPAD, DIFN, SWS, and N and P uptake of the crops, and the effects of cropping system × year, were analyzed using univariate general linear models. Pearson’s correlation test was used to analyze the relationship between LER and the above- and below-ground biomass of maize and soybean. The effects of above- and below-ground factors on biological yield were quantified by calculating the contribution of key factors to yield. The effects of above-ground (LAI, SPAD, DIFN) and below-ground (SWS, N and P uptake) competition on biological yield, and their contribution rates, were estimated using the linear regression model59:
    $$ Y = \beta_{0} LAI + \beta_{1} SPAD + \beta_{2} DIFN + \beta_{3} SWS + \beta_{4} \text{N} + \beta_{5} \text{P} + \beta_{6} X + \beta_{7} $$
    (1)
    where Y represents biological yield, LAI represents leaf area index, SPAD represents chlorophyll (SPAD reading), DIFN represents diffuse non-interceptance, SWS represents soil water storage, N represents crop nitrogen uptake, P represents crop phosphorus uptake, X represents the interaction among LAI, SPAD, DIFN, SWS, N, and P, and β0, β1, β2, β3, β4, β5, β6 and β7 represent the fitted parameters. The standardized regression coefficients (Beta) of LAI, SPAD, DIFN, SWS, N, and P were determined on the basis of Eq. (1) to partition their influence on the biological yield, using the following equations:
    $$ \beta_{0}^{\prime} = \beta_{0} \times \left( LAI^{\prime} / Y^{\prime} \right) $$
    (2)
    $$ \beta_{1}^{\prime} = \beta_{1} \times \left( SPAD^{\prime} / Y^{\prime} \right) $$
    (3)
    $$ \beta_{2}^{\prime} = \beta_{2} \times \left( DIFN^{\prime} / Y^{\prime} \right) $$
    (4)
    $$ \beta_{3}^{\prime} = \beta_{3} \times \left( SWS^{\prime} / Y^{\prime} \right) $$
    (5)
    $$ \beta_{4}^{\prime} = \beta_{4} \times \left( \text{N}^{\prime} / Y^{\prime} \right) $$
    (6)
    $$ \beta_{5}^{\prime} = \beta_{5} \times \left( \text{P}^{\prime} / Y^{\prime} \right) $$
    (7)
    where β0′, β1′, β2′, β3′, β4′, and β5′ represent the standardized regression coefficients for LAI, SPAD, DIFN, SWS, N, and P. LAI′, SPAD′, DIFN′, SWS′, N′, and P′ represent the standard deviations of LAI, SPAD, DIFN, SWS, N, and P. Y′ is the standard deviation of the modeled biological yield.
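The rescaling in Eqs. (2)–(7) multiplies each fitted coefficient by the ratio of the predictor's standard deviation to that of the yield. A minimal sketch of that step is shown below for a single predictor, using only the Python standard library. The data are hypothetical, the one-predictor fit stands in for the full model of Eq. (1), and the standard deviation of the observed response is used here for simplicity; the function takes the standard deviations as inputs, so the standard deviation of the modeled yield can be supplied instead.

```python
import statistics


def standardized_coefficient(beta: float, sd_predictor: float, sd_response: float) -> float:
    """Rescale a regression coefficient: beta' = beta * (sd_predictor / sd_response)."""
    if sd_response == 0:
        raise ValueError("sd_response must be non-zero")
    return beta * sd_predictor / sd_response


def ols_slope(x, y) -> float:
    """Least-squares slope for a single predictor (illustration only)."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


if __name__ == "__main__":
    # Hypothetical LAI and biological yield values for five plots.
    lai = [2.1, 2.8, 3.4, 3.9, 4.6]
    biological_yield = [5.0, 6.1, 7.4, 7.9, 9.2]

    slope = ols_slope(lai, biological_yield)
    beta_std = standardized_coefficient(
        slope, statistics.stdev(lai), statistics.stdev(biological_yield)
    )
    print(slope, beta_std)
```

Because the standardized coefficients are unit-free, they can be compared across predictors measured on different scales, which is what allows the influence of LAI, SPAD, DIFN, SWS, N, and P on yield to be partitioned.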

  • in

    Macroscale patterns of oceanic zooplankton composition and size structure


  • in

    Metagenomes, metatranscriptomes and microbiomes of naturally decomposing deadwood

    MathSciNet 
    CAS 
    PubMed 
    PubMed Central 
    Article 

    Google Scholar 
    32.Edgar, R. C. MUSCLE: Multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32, 1792–1797 (2004).CAS 
    PubMed 
    PubMed Central 
    Article 

    Google Scholar 
    33.Capella-Gutiérrez, S., Silla-Martínez, J. M. & Gabaldón, T. trimAl: A tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics 25, 1972–1973 (2009).PubMed 
    PubMed Central 
    Article 
    CAS 

    Google Scholar 
    34.Price, M. N., Dehal, P. S. & Arkin, A. P. FastTree 2 – approximately maximum-likelihood trees for large alignments. PLoS One 5, e9490 (2010).ADS 
    PubMed 
    PubMed Central 
    Article 
    CAS 

    Google Scholar 
    35.Ihrmark, K. et al. New primers to amplify the fungal ITS2 region – evaluation by 454-sequencing of artificial and natural communities. FEMS Microbiol. Ecol. 82, 666–677 (2012).CAS 
    PubMed 
    Article 
    PubMed Central 

    Google Scholar 
    36.Větrovský, T., Baldrian, P. & Morais, D. SEED 2: A user-friendly platform for amplicon high-throughput sequencing data analyses. Bioinformatics 34, 2292–2294 (2018).PubMed 
    PubMed Central 
    Article 
    CAS 

    Google Scholar 
    37.Aronesty, E. Comparison of sequencing utility programs. Open Bioinforma. J. 7, 1–8 (2013).MathSciNet 
    Article 

    Google Scholar 
    38.Nilsson, R. H. et al. An open source software package for automated extraction of ITS1 and ITS2 from fungal ITS sequences for use in high-throughput community assays and molecular ecology. Fungal Ecol. 3, 284–287 (2010).Article 

    Google Scholar 
    39.Edgar, R. C. Search and clustering orders of magnitude faster than BLAST. Bioinformatics 26, 2460–2461 (2010).CAS 
    PubMed 
    PubMed Central 
    Article 

    Google Scholar 
    40.Edgar, R. C. UPARSE: highly accurate OTU sequences from microbial amplicon reads. Nat. Methods 10, 996–998 (2013).CAS 
    PubMed 
    Article 
    PubMed Central 

    Google Scholar 
    41.Nilsson, R. H. et al. The UNITE database for molecular identification of fungi: handling dark taxa and parallel taxonomic classification. Nucleic Acids Res. 47, D259–D264 (2018).PubMed Central 
    Article 
    CAS 

    Google Scholar 
    42.Wright, E. S. Using DECIPHER v2.0 to analyze big biological sequence data in R. R J. 8, 352–359 (2016).Article 

    Google Scholar 
    43.Murali, A., Bhargava, A. & Wright, E. S. IDTAXA: A novel approach for accurate taxonomic classification of microbiome sequences. Microbiome 6, 140 (2018).PubMed 
    PubMed Central 
    Article 

    Google Scholar 
    44.NCBI BioProject https://identifiers.org/ncbi/bioproject:PRJNA603240 (2020).45. NCBI Sequence Read Archive, https://identifiers.org/ncbi/bioproject:PRJNA672674 (2020).46.Sutela, S., Poimala, A. & Vainio, E. J. Viruses of fungi and oomycetes in the soil environment. FEMS Microbiol. Ecol. 95, fiz119 (2019).CAS 
    PubMed 
    Article 
    PubMed Central 

    Google Scholar 
    47.Woodcroft, B. J. et al. Genome-centric view of carbon processing in thawing permafrost. Nature 560, 49–54 (2018).ADS 
    CAS 
    PubMed 
    Article 
    PubMed Central 

    Google Scholar 
    48.Mackelprang, R. et al. Microbial community structure and functional potential in cultivated and native tallgrass prairie soils of the Midwestern United States. Front. Microbiol. 9, 1775 (2018).PubMed 
    PubMed Central 
    Article 

    Google Scholar 
    49.Hervé, V. et al. Phylogenomic analysis of 589 metagenome-assembled genomes encompassing all major prokaryotic lineages from the gut of higher termites. PeerJ 8, e8614 (2020).PubMed 
    PubMed Central 
    Article 
    CAS 

    Google Scholar 
    50.Clissmann, F. et al. First insight into dead wood protistan diversity: a molecular sampling of bright-spored Myxomycetes (Amoebozoa, slime-moulds) in decaying beech logs. FEMS Microbiol. Ecol. 91, fiv050 (2015).PubMed 
    Article 
    CAS 
    PubMed Central 

    Google Scholar 
    51.Urich, T. et al. Simultaneous assessment of soil microbial community structure and function through analysis of the meta-transcriptome. PLoS One 3, e2527 (2008).ADS 
    PubMed 
    PubMed Central 
    Article 
    CAS 

    Google Scholar 
    52.Geisen, S. et al. Metatranscriptomic census of active protists in soils. ISME J. 9, 2178–2190 (2015).CAS 
    PubMed 
    PubMed Central 
    Article 

    Google Scholar 
    53.Tláskal, V., Zrůstová, P., Vrška, T. & Baldrian, P. Bacteria associated with decomposing dead wood in a natural temperate forest. FEMS Microbiol. Ecol. 93, fix157 (2017).Article 
    CAS 

    Google Scholar 
    54.Moll, J. et al. Bacteria inhabiting deadwood of 13 tree species reveal great heterogeneous distribution between sapwood and heartwood. Environ. Microbiol. 20, 3744–3756 (2018).CAS 
    PubMed 
    Article 
    PubMed Central 

    Google Scholar 
    55.Christofides, S. R., Hiscox, J., Savoury, M., Boddy, L. & Weightman, A. J. Fungal control of early-stage bacterial community development in decomposing wood. Fungal Ecol. 42, 100868 (2019).Article 

    Google Scholar 
    56.Nayfach, S. et al. A genomic catalog of Earth’s microbiomes. Nat. Biotechnol. 39, 499–509 (2021).CAS 
    PubMed 
    Article 
    PubMed Central 

    Google Scholar 
    57.Seibold, S. et al. Experimental studies of dead-wood biodiversity — A review identifying global gaps in knowledge. Biol. Conserv. 191, 139–149 (2015).Article 

    Google Scholar  More

  • in

    Reef Cover, a coral reef classification for global habitat mapping from remote sensing

    Reef Cover was specifically developed to support the process used to produce and deliver globally applicable coral reef mapping products from remotely sensed data16. The typology acts as a key that bridges historic and contemporary knowledge, plot-scale and aerial viewpoints, and pixel data and natural history, converting pixel data into information in a form suitable for reef management decisions. The accompanying case studies18,19 that we use to demonstrate its application are also publicly available. The mapping products described in each case study were developed specifically using Reef Cover to support science and conservation of coral reef ecosystems.

    We sought to develop a robust system that balances the geomorphic complexity of reefs with the need to develop high-accuracy maps of each class in the system. The result is a 17-class system that can be (i) applied to remote sensing datasets for future mapping, (ii) used to interpret coral reef maps, and (iii) effectively disseminated to users, mainly in the coral reef ecology and conservation space, in a way that promotes research and conservation.

    Three steps were used in the development of the classification scheme.

    1.

    Step 1. Review. Existing coral reef geomorphic classification schemes (expert-led classifications from Darwin’s 1842 reef classification20 to the Millennium Coral Reef Mapping Project classification21) were carefully reviewed22 to identify synergies in terminology and definitions for reef features, and to evaluate how well common features can be described in terms of remote sensing biophysical data. The review allowed us to develop a set of classes that build constructively on previous foundational knowledge of coral reef geomorphology and are relatable to existing mapping and classification efforts, which addresses the challenge of relevance.

    2.

    Step 2. Development. Reef Cover classes were then derived from attributes data, building on established machine-led reef mapping theory13. Physical attributes datasets commonly available to remote sensing scientists were examined to refine a set of 17 meaningful internal reef classes that relate to broader interpretation from a natural history point of view, gathered in Step 1. A workshop was organised to gather feedback on classes. Clarity around how each class relates to attribute data addresses the challenge of transparency.

    3.

    Step 3. Dissemination. Reef Cover classes were then documented23 in a way that promotes re-use and cross-walking, with a strong focus on the needs of users, to address the challenges of clarity and accessibility. Development of the Reef Cover document considers and details: 1) relevance, e.g., the rationale for why it was important to map each class, as well as the broader global applicability of the class; 2) simplicity, e.g., promoting user uptake by employing plain language, not over-complicating descriptors and limiting the number of classes to a manageable amount; 3) transparency, supplying the methodological basis behind each class and exploring caveats and ambiguities in interpretation; 4) accessibility, including discoverability, open access and language translations to support users; and 5) flexibility, allowing for flexible use of the scheme depending on user needs, allowing for flexible interpretation of classes by providing a cross-walk to other schemes and existing maps, and making the classification adaptable and open to user feedback (versioning).

    Finally, as a proof of concept, the Reef Cover classification was tested in two large-scale coral reef mapping exercises: one in the Great Barrier Reef24 (Case Study 1) and one across Micronesia19 (Case Study 2, Technical Validation section). During this process, the Reef Cover dataset was reviewed to assess how useful it was for both a) producers using Reef Cover to map large coral reef areas from satellite data, and b) consumers using Reef Cover to interpret map products for application to real-world problems.

    Step 1. Review. Building global classes on foundational reef mapping and classification work

    Global reef mapping: the need for a geomorphic classification to map coral reefs at scale
    Coral reefs represent pockets of biodiversity that are widely dispersed, often remote or inaccessible, and globally threatened2,3. Communities and economies are highly dependent on the ecosystem services they provide25,26,27. This combination of vulnerability, value and a broad and dispersed global distribution means global strategies are needed for reef conservation, for which maps (and the classifications that underpin them) play a supporting role. Global coral reef maps have been fundamental to geo-political resource mapping and understanding inequalities28,29, the valuation of reef ecosystem services26, understanding the past30, present31, and future threats to reefs32, supporting more effective conservation33,34 and reef restoration strategies35,36, and facilitating scientific collaborations and research outcomes37. 
Reef conservation science and practice may particularly benefit from technological advancements that allow delivery of more appropriate map-based information, particularly across broader, more detailed spatial scales and in a consistent manner34,36,38.

    Existing expert-led reef classifications
    Traditionally, coral reef features have been grouped based on observations of morphological structure, distributions of biota and theories on reef development, gleaned from aerial imagery, bathymetric surveys, geological cores and biological field censuses by natural scientists7,39 (Fig. 1). Natural scientists were struck by both the uniformity and predictability of much of the large-scale three-dimensional geomorphic structure of reefs and biological partitioning across that structure, and by how consistent these characteristic geologic and ecological zones were across large biogeographic regions20,40. Technological developments of the 20th century, such as SCUBA demand regulators and compressed air tanks (commercially available in the 1940s41), acoustic imaging for determining seafloor bathymetry (e.g., side-scan sonar developed in the 1950s), light aircraft for aerial photography (first applied in the 1950s42) and lightweight submersible drilling rigs for coring (applied in the 1970s43), allowed reef structure to be viewed from fresh perspectives. New aerial, underwater and internal assessments of reef structure expanded the diversity of external and internal classes, with hundreds of new terms for features defined7,8,10 (Online-Only Table 1). However, the localised nature of most of these applications (Fig. 
1) meant that many of the classes developed using these tools were region-specific, leading experts to warn against too heavy a reliance on “the imperfect and perhaps biased existing field knowledge on reefs” for developing global classifications44.

    Existing reef classifications derived from satellite data
    Shallow water tropical coral reefs are particularly amenable to global mapping from above12. They develop in clear, oligotrophic tropical waters, so many features are detectable from space45. Satellite technology has spawned a wealth of data on reefs, enabling large area coverage with resolution of within-reef variations. Initial approaches to reef mapping in the 1980s expanded our traditional viewpoint from single reef mapping and extent mapping to detailed habitat mapping of whole reef systems46. Through the 1990s and early 2000s, the evolving field survey techniques described above enabled more effective linkage of ecological surveys to remote sensing data47,48. Access to higher spatial resolution images over larger areas, in combination with detailed field data, physical attributes and object-based analysis, resulted in large reef area mapping13,21,49,50. In the last five years, the increase in daily to weekly global coverage of this type of imagery, in combination with cloud-based processing capability, has expanded this to a global capability for reef mapping38. This is a new type of global information that requires a different approach to classification to make sense of complex natural systems at ocean scales.

    One of the first steps in creating the Reef Cover classification was reconciling existing classification schemes across the nomenclature driven by disciplinary, linguistic and regional biases. 
To do this we conducted a review of reef geomorphic classifications, looking for consistencies and usage of terms that transcended divides in discipline22 (see summary in Online-Only Table 1).

    Scaling and consistency: choosing an appropriate level for Reef Cover
    Remote sensing scientists have been developing automated methods to make sense of the increasing availability of earth observation data over coral reefs, yielding information on ecosystem zones derived from data sources such as spectral reflectance and bathymetry at increasingly larger scales12,51. As more data reveal the diversity and complexity of reefs, selecting an appropriate level at which to map reefs on the global scale requires balancing the need for a limited number of classes that can be mapped consistently based on available earth observation data with the user need for comparable information.

    Reef type classification
    Morphological diversity can make global geomorphic classification – particularly between reefs (at the “reef type” level, e.g., Fringing, Atoll reefs) – challenging. Divergent regional morphologies (e.g., Pacific atolls vs Caribbean fringing reefs) and endemic local features (e.g., Bahamian shallow carbonate banks, Maldivian farus) are created by underlying tectonics, antecedent topography, eustatics, climate and reef accretion rates which can all vary geographically52. The diversity of reef types is reflected in the large number of classes defined in the impressive Millennium Coral Reef Mapping Project (68 classes at the between-reef geomorphic level L3), the most comprehensive globally applicable coral reef classification system to date21.

    Geomorphic zone classification
    Internally, reef morphology becomes a lot more consistent. Physical boundaries in the depth, slope angle and exposure of the reef surface create partitioning into “geomorphic zones” (e.g., Reef Flat, Reef Crest), developed in parallel to the reef edge and coastlines and generally with a distinct ecology17,39,53. These internal patterns of three-dimensional geomorphic structure can be remarkably predictable, even between oceans. This makes geomorphic zonation a good basis for consistent and comparable mapping at regional to global scales54. Moreover, congruence between geomorphic zones and ecological partitioning means that ecological understanding can be derived from geomorphic habitat classes, making geomorphic mapping valuable to conservation practitioners55.

    Benthic classification
    Many classifications developed for reef mapping (e.g., Living Oceans Foundation49, NOAA Biogeography Reef Mapping Program50), monitoring (Atlantic and Gulf Rapid Reef Assessment56, Reef Cover Classification System57, Reef Check58,59) and management (Marine Ecosystem Benthic Classification60) have included an ecological component. Classifying reef benthos is important as associated metrics, such as abundance of living coral and algae, are widely used indicators of ecosystem change. However, most classifications that consider benthic cover are operational at reef61 to regional scales, due to the need for very high-resolution remote sensing data11 (e.g. from UAVs and CASI62, or high-resolution satellites like QuickBird and WorldView 1 m) to be able to reliably determine classes such as coral cover and type, soft coral, turf, coralline algae, rubble and sand. A comprehensive benthic coral reef classification23 that met the Reef Cover objective of being globally scalable (both in terms of remote sensing biophysical data availability and processing capabilities) but that also fully recognises and includes the rich benthic detail required to address ecological questions at sub-metre scales is beyond the scope of this classification. In the coming years it is likely that further advancements in technology – both downscaling of remote-sensing and up-scaling of field observations63 – will enable us to address this spatial mismatch.
    The challenge of creating the Reef Cover classification was to create a set of classes that related to natural science observations, despite using data pulled from remote sensing. Intra-reef zones defined by natural scientists often represent different biophysical/ecological communities that in turn reflect environmental gradients (e.g., in light, water flow) and geo-ecological processes (sediment deposition, reef vertical accretion) below the water that led to the arrangement17. However, these classes can frequently also be related to biophysical information on slope, depth and aspect that can be determined remotely. A thoughtfully prepared classification that adheres to Stoddart’s (1978) classification principles, which state that classes should be explicit, unique, comprehensible, and should follow the language of prior schemes, can support production of maps and other science (monitoring, management) that remain relevant to historic work but can go forward with consistent definitions21.
    Step 2. Development. Creator requirements – relating Reef Cover classes to remote sensing data
    Development of appropriate mapping classes requires a sensitive trade-off between the needs of users (in terms of the level of detail needed, appropriate for scaling, consistent across regions, simple enough to be manageable but detailed enough to be understandable), and the input data available and quality of the globally repeatable mapping methods of the map producers.

    While remote sensing is vast in terms of scalability, data producers are more constrained in terms of sensor capabilities such as spatial resolution (limited to pixels) and depth detection limits (limited by light penetration), and processing power (high numbers of map classes become more computationally expensive). Physical conditions and colour derived from remote sensing, along with their textural and spatial relationships, can be linked to reef zonation13, with depth and wave exposure being particularly important for explaining geomorphology64.

    To select a set of Reef Cover classes that could be defined by attributes available from the most commonly available public-access or commercial satellite data, but that also corresponded to common classes found in the classification literature and were relevant from a user perspective, we looked for the intersection: physical attribute data that can be derived from satellites but that also help shape and define reef morphology.

    Physical attributes
    The physical environment – light, waves and depth – plays a deterministic role in reef structural development and the ecological patterning across zones39. Underlying geomorphic structural features can almost always be characterised in terms of three core characteristics: i) depth, ii) slope angle and iii) exposure to waves (Fig. 2).

    Fig. 2 Physical attributes derived from remote sensing data such as depth, slope angle and exposure are sufficient to delineate some of the key geomorphic reef zones in the classic literature. The coral reef classifier for global scale analyses of shallow water tropical coral reefs shows how relative measures can characterise reef zones.
    Depth
    Depth is a useful attribute for bridging human and machine-led classifications. Bathymetry can be derived from satellite spectral information because the absorption of light at specific wavelengths has a known relationship with water column depth65, and depth also relates to reef geomorphology (due to the role of primary production in powering biogenic calcification)66. Bathymetric data also provide the basis for other critical depth-derived products, slope and aspect, which are used to distinguish geomorphic classes and reef environmental parameters, e.g., exposure to breaking wave energy (Fig. 2).
    Reef Crest, for example, is often described as the shallowest part of the reef62, while Lagoon represents a deep depression in the reef structure57,67. Depth thresholds are sometimes defined: a threshold of 10 m was suggested to differentiate true lagoons from shallow water areas68, and an 18 m threshold has been used to distinguish Reef Front from Reef Slope17. In the Reef Cover classification, depth was particularly important for distinguishing Fore Reef classes (e.g. Reef Slope, Terrace) from Reef Crest and Reef Flat classes (Table 1, Fig. 2). Generally, tides, variability in water clarity and regional eustatic discrepancies in reef top depth (e.g., Reef Flat in Atlantic systems generally lies much lower with respect to tides than in the Indo-Pacific67) mean relative depths are more appropriate, which is why absolute numbers were not used in Reef Cover definitions.

    Table 1 Attributes of reef zones that help support classification.

    Slope
    Slope angle, either absolute angle or discontinuities in angle acting as a break between zones, is an important differentiator of reef zones. Reef Flats are defined as being horizontal, “flattened”69 or “flat-topped”70; Fore Reef slope zones often include references to slope angle (e.g., in one classification Fore Reef has been defined as “any area of the reef with an incline of between 0 and 45 degrees”62), and Walls, common on atolls, are “near vertical” features. Variability in slope continuity can also be an important way to demarcate zones71. Montaggioni illustrated a range of representative profiles across atolls and barrier reefs, with convoluted profiles allowing subdivisions of reef slope, particularly across fringing reefs, which are less likely to show a uniform reef slope than an atoll53, and Reef Crest is sometimes defined as a demarcation point separating the Fore Reef from the Reef Flat53,62,72. Where water depth can be derived from remotely sensed spectral data, bathymetry can be used to calculate slope directly (i.e., by calculating the slope angle between a pixel and its neighbours) or indirectly by considering the local variance in depth (e.g., the standard deviation in depth values within some radius of each pixel).
    In the Reef Cover classification, slope angle data were used to distinguish Fore Reef classes, such as Reef Slope and Reef Front, from horizontal classes such as Outer and Inner Reef Flat and Lagoons (Table 1, Fig. 2).
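
    The two slope measures described above can be sketched in a few lines of Python. This is a minimal illustration, not the method used to produce the Reef Cover maps: it assumes a gridded depth raster in metres with square pixels, the names `slope_degrees`, `local_depth_std`, `pixel_size` and `radius` are hypothetical, and a square window stands in for "some radius" around each pixel.

```python
import numpy as np

def slope_degrees(depth, pixel_size):
    """Slope angle in degrees at each pixel of a gridded depth raster.

    depth: 2-D array of water depths in metres.
    pixel_size: ground size of one (square) pixel in metres (assumed input).
    """
    # np.gradient uses central differences between neighbouring pixels
    # in the interior and one-sided differences at the raster edges.
    d_row, d_col = np.gradient(np.asarray(depth, dtype=float), pixel_size)
    return np.degrees(np.arctan(np.hypot(d_row, d_col)))

def local_depth_std(depth, radius=1):
    """Standard deviation of depth in a square window around each pixel.

    radius: half-width of the window in pixels (hypothetical parameter).
    """
    depth = np.asarray(depth, dtype=float)
    # Repeat edge values so that border pixels also get a full window.
    padded = np.pad(depth, radius, mode="edge")
    size = 2 * radius + 1
    windows = np.lib.stride_tricks.sliding_window_view(padded, (size, size))
    return windows.std(axis=(-2, -1))
```

    On a perfectly flat raster both measures are zero everywhere, while a raster that deepens by one pixel width per pixel gives a uniform 45 degree slope, which matches the intuition that Reef Flats and Lagoons sit near zero and Fore Reef classes sit higher.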

    Exposure
    Physical exposure of reefs is a key driver of zonation. Reef Crests, linked to wave breaking, are often described as “an area of maximum wave shoaling”, i.e. a zone that absorbs the greatest wave energy62,69. Fore Reefs are frequently sub-divided based on relative exposure (e.g. exposed vs sheltered slope, or windward vs leeward57). Exposure influences profile shape and, importantly, the communities growing in the zone, so that slopes with identical profiles could have very different communities57,73. Sometimes these zones are related to the communities found there. Meanwhile, exposure across the reef means back-reef zones contain sheltered water bodies. Together with data on water depth and bathymetry, wave energy data were key for distinguishing key Reef Cover classes74,75.

    Colour and texture
    Sub-surface spectral reflectance data can provide measurements of reef colour and texture over large areas. Concentrations of photosynthetic pigments in coral, algae and seagrass, as well as light scattering by inorganic materials, mean that spectral reflectance can also be used to determine biophysical properties of the reef65. Colour and texture information derived from satellites can be used to manually draw polygons around similar geomorphologic units or habitats, but also provides the basis for image-based thematic mapping driven by pixel values (such as digital number, radiance, reflectance) and texture, through spectral processing64. Texture measures are also used to improve classification by allowing spectrally similar substrates like corals and macroalgae to be distinguished. On Reef Flats, for example, zonation has a single driver, in contrast to several drivers on most other zones, which makes benthic zonation particularly distinct39 and easily detectable as coloured bands in aerial images. This allows colour and texture to be used to distinguish Outer Reef Flats, which have a greater component of photosynthetically active corals and algae, from Inner Reef Flats, which appear brighter due to a higher proportion of sand build-up in this depositional area (Fig. 3).

    Fig. 3 Satellite-derived colour and texture can be informative in distinguishing Reef Cover classes of relevance to ecologists and managers, since spectral reflectance mirrors the benthos, which in depositional areas may be dominated by reef-derived sediments, or on hard substrate may reflect benthic communities. Not all zones can be distinguished by colour alone (e.g., walls and steep slopes), but examples of zones with clear colour/texture differences are outlined in red.
    Spatial relationships
    Size and shape
    The size and shape of reef features can help determine Reef Cover class. Many large-scale reef structural features appear elongate as the shelf constrains shape – and reef morphology can even help predict shape as they constrain accommodation space and influence deposition76. Reef Flats, for example, boast the broadest horizontal extent of any geomorphic zone, typically 500 to 1000 m across, but reaching several kilometres in width across some Pacific atolls71. Lagoons also tend to be broad in width although width and shape can be variable depending on reef type. Understanding some of these characteristics can help determine classes, although these are usually defined relationally rather than by application of size thresholds.

    Neighbourhood and enclosure
    Natural scientists agree that reefs feature three major geomorphic elements: a Fore Reef, a Reef Crest and a Back Reef (although subdivisions and complexities exist around these). Because of the influence of large-scale processes on reef development, these zones occur in order17,39,53. Reef Crest is arguably the most defining characteristic of any reef: the break point at which a sharply defined edge divides the shallower platform from a more steeply shelving reef front71, and around which other geomorphic zones are arranged in parallel77. As a result, the spatial arrangement of zones can be informative for mapping (Table 2). For example, Back Reef is often defined as being contiguous to the Reef Crest (that is, as any reef feature found landward of the crest).

    Table 2 Relational characteristics of Reef Cover classes displaying neighbourhood (including adjacency and enclosure rules) used to distinguish internal coral reef geomorphological zones.
    Enclosure to semi-enclosure within a bordering reef construction (e.g., in lagoons68) is another feature used classically to define reef zones, but that could also be derived from satellite imagery.
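
    As a rough illustration of how enclosure might be derived from a classified image, the sketch below flood-fills from the image border through non-reef pixels and flags any non-reef pixel the fill cannot reach. It is an assumption-laden example rather than the rule used for Reef Cover: the function name `enclosed_water` and the boolean `is_reef` mask are hypothetical, 4-connectivity is assumed, and only full enclosure is detected (semi-enclosure would need an additional rule).

```python
from collections import deque

import numpy as np

def enclosed_water(is_reef):
    """Flag non-reef pixels that cannot reach the image border.

    is_reef: 2-D boolean array, True where a pixel is reef structure.
    Returns a boolean array, True for non-reef pixels fully surrounded
    by reef pixels (4-connectivity), a crude proxy for lagoon-like water.
    """
    is_reef = np.asarray(is_reef, dtype=bool)
    rows, cols = is_reef.shape
    open_water = np.zeros_like(is_reef)
    queue = deque()
    # Seed the search with every non-reef pixel on the image border.
    for r in range(rows):
        for c in range(cols):
            on_border = r in (0, rows - 1) or c in (0, cols - 1)
            if on_border and not is_reef[r, c]:
                open_water[r, c] = True
                queue.append((r, c))
    # Breadth-first flood fill through connected non-reef pixels.
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                if not is_reef[nr, nc] and not open_water[nr, nc]:
                    open_water[nr, nc] = True
                    queue.append((nr, nc))
    # Enclosed water is whatever is neither reef nor reachable open water.
    return ~is_reef & ~open_water
```

    A closed ring of reef pixels marks its interior as enclosed; opening a single gap in the ring connects the interior to open water and clears the flag.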
    The Reef Cover typology presented is derived from earth observation data, but attempts to link classes to genetic process and to social, ecological and geological importance23. By focussing on the attributes of depth, light, exposure, colour and texture and spatial relationships that are common to both domains, our traditional biophysical knowledge of reefs can be integrated with remote-sensing capabilities. Attributes can be combined to make decision trees (Fig. 4) to help use satellite data to map reefs at the global scale. The Reef Cover classes can all be distinguished from these physical attributes alone, supporting production of maps that are still relevant to existing work but that allow computationally inexpensive determination of mapping classes at scales beyond what was previously possible38.

    Fig. 4 Example decision tree for classification of intra-reef zonation using Reef Cover. The decision tree for use by mappers is based on information that would typically be available at the global scale, and related to the physical attributes (depth, slope angle and exposure), colour and texture, and spatial relationships. The tree draws on a mix of a priori logical or philosophical grounds taken from a review of the literature, tailored to fit a methodology limited by the data.
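
    To make the idea of combining attributes into a decision tree concrete, here is a deliberately small Python sketch. It is not the tree in Fig. 4: it covers only five of the 17 classes, every threshold is a hypothetical placeholder, and the inputs are assumed to be relative (0 to 1) measures in keeping with the preference for relative over absolute values described above.

```python
from dataclasses import dataclass

@dataclass
class PixelAttributes:
    relative_depth: float  # 0 = shallowest part of the reef, 1 = deepest mapped
    slope_deg: float       # slope angle in degrees
    exposure: float        # 0 = sheltered, 1 = maximum wave energy
    brightness: float      # 0 = dark benthos, 1 = bright sand
    enclosed: bool         # True if surrounded by reef structure

def classify_pixel(p, steep=10.0, shallow=0.2, deep=0.6,
                   exposed=0.7, bright=0.6):
    """Assign one of a few Reef Cover class names to a pixel.

    All threshold defaults are illustrative assumptions, not published values.
    """
    # Deep, enclosed water bodies: Lagoon.
    if p.enclosed and p.relative_depth >= deep:
        return "Lagoon"
    # Sloping (non-horizontal) surfaces: a Fore Reef class.
    if p.slope_deg >= steep:
        return "Reef Slope"
    # Shallow, horizontal and highly exposed: Reef Crest.
    if p.relative_depth <= shallow and p.exposure >= exposed:
        return "Reef Crest"
    # Shallow, horizontal and sheltered: split Reef Flat by colour.
    if p.relative_depth <= shallow:
        return "Inner Reef Flat" if p.brightness >= bright else "Outer Reef Flat"
    return "Unclassified"
```

    Ordering the tests from the most distinctive attribute combination to the least keeps each branch simple, and the fall-through "Unclassified" label makes gaps in the rules visible rather than silently forcing a class.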
    Step 3. Dissemination. Providing user-friendly Reef Cover class descriptors that facilitate uptake and use
    Computers have revolutionised our ability to classify multidimensional data sources, which allows mapping and modelling at far larger scales for the same effort compared to a human taxonomist. However, without proper consideration of the needs of the end user, classified data may not be effectively applied to conservation challenges4. The Reef Cover classification was developed with five user needs in mind: relevance, simplicity, transparency, accessibility and flexibility.
    Relevance
    Different habitats within reefs contribute differently to biological and physical processes. For example, Reef Crests play a disproportionate role in coastal protection, dissipating on average 85% of the incoming wave energy and 70% of the swell energy78,79; Reef Slopes supply an order of magnitude more material to maintain island stability61,80; shallow Reef Front areas often host more coral biodiversity81; Reef Flats support herbivorous fish biomass82; and the accessibility of Lagoons often affords them cultural importance as places for artisanal harvesting83. A classification that effectively captures the appropriate diversity of these habitats can therefore better inform social, biological and physical studies36, such as global conservation planning to safeguard reefs, for example, in order to meet the Convention on Biodiversity Aichi targets34. Map classes need to reflect differences of interest to a wide range of reef scientists, from oceanographers to paleoecologists and fisheries scientists, so careful consideration of natural history is important. Global mapping is usually undertaken to enable spatial comparisons, so a classification that is globally applicable was also important.
    To explore relevancy, a crosswalk was performed between Reef Cover and a selection of major regional to global coral reef classification10,60,84, mapping22,36,85 and monitoring efforts7,8,56, to make sure important classes from established classifications had not been missed23 (Online-Only Table 2).
    Simplicity
    Simplicity was achieved by (1) choosing an appropriate mapping scale (internal geomorphic classes), (2) limiting the number of geomorphic classes (17 classes), and (3) providing clear (one-line) descriptors with additional information to address issues of semantic interoperability.

    1.

    Reef Cover was developed to provide consistent mapping of reefs across very large areas: classification of geologic and ecological zones is much more amenable to mapping using remote sensing, given greater consistency in geomorphology across large biogeographic regions32. Satellite data have supported the development of several detailed regional “reef type” classifications, such as nine reef classes for the Great Barrier Reef from Landsat imagery73, six reef classes for the Torres Strait74 and 16 classes for the Red Sea from Quickbird6. However, local reef type classifications are not always applicable globally due to large regional discrepancies in Reef Type. As a result, detailed reef type typologies are more suitable for local to regional classifications6,35. For global mapping, an internal geomorphic approach is better. Finer spatial scale classifications from satellite data are also challenging, due to the difference between the spatial scale at which spectral data can be generated (metres) and the scale at which benthic assemblages display heterogeneity (sub-metres)32. Medium spatial resolution multispectral data (5 to 30 m) are the satellite information most commonly used for coral reef habitat mapping42, and classification of internal geomorphic structures may be best suited to this kind of data.

    2.

    Reviews of habitat mapping from remote sensing found that the number of map classes averages 18 at continental and global scales13. More than this can become overwhelming for users and, at this point in time, computationally expensive for developers. Many coral reef classifications contain four or five hierarchical levels and high numbers of classes: the Millennium Coral Reef Mapping Project (MCRMP) was ambitious in developing a standardised typology that captured much of the reef type diversity, but despite defining over 800 reef classes at the finest scale (level 5, essential for local reef mapping)36, level 3 (68 classes) continues to be more popularly adopted in publications using this dataset. To keep the classification simple, Reef Cover was limited to 17 geomorphic classes, each with a simple one-line definition. A limited number of classes was needed (1) to make the classification manageable for users, (2) to make it computationally manageable for very large (regional and global) data processing and (3) to allow, through the reduction in classes compared to MCRMP, consistent automated mapping at the global scale, so that whole regions could be directly compared for monitoring and management.

    3.

    Short definitions were provided in plain language for simplicity. To address additional issues of semantic interoperability, each Reef Cover class definition also outlines other commonly used terms for the same concept (synonymy) and explains differing interpretations of the same terms and the relations between concepts.

    Transparency
    One barrier to analysing and interpreting big data is user-friendliness. Of 79 coral reef mapping attempts reviewed (62 benthic coral reef maps, 6 geomorphic coral reef maps and 11 mixed), only 13% were accompanied by a clear classification that defined the meaning of map classes14. Describing how the classification relates to data (Step 2) and producing a detailed descriptor (Step 3) along with a diagram allows the classification to be understood and also adopted for different projects. We also attempted to address transparency by relating Reef Cover classes to other major global mapping and monitoring efforts (Online-Only Table 2) and providing a decision support tree for users (Fig. 4)69.
    Accessibility
    Another barrier to analysing and interpreting big data is access75. Much information remains locked behind paywalls, and additional barriers exist, including discoverability. To promote accessibility and encourage use, all data were made publicly available (see Data Records section for access). Terms were translated into different languages, as science published in just one language has been shown to hinder knowledge transfer and to keep new findings from reaching practitioners in the field86.
    Flexibility
    One criticism of thematic habitat maps derived from remote sensing is a lack of flexibility: categorical descriptions of habitats are by design a discrete simplification of ecological continua, so classifications limit the interpretations that can be made and the questions that can be asked76. Flexibility issues were addressed by (1) not prescribing absolute thresholds for each class, instead providing information on how classes relate to each other (Tables 1–3), allowing (a) map producers to adapt application of Reef Cover to their own needs and (b) users to interpret with flexibility; (2) providing additional information (Standard Descriptors), including main features, exceptions to rules and broadness, to give users a broader understanding of hidden complexities when interpreting class meaning; and (3) remaining open to feedback: we hope this Reef Cover version 1 can be improved upon with feedback from the community.
    Table 3 Details of how Reef Cover classes were used in each Case Study, and the confidence of producers in determining each class (scored from 1 to 10, with 1 being very low confidence and 10 being very high) from satellite information in Case Study 2.

  • in

    Analysis of volatiles from feces of released Przewalski’s horse (Equus przewalskii) in Gasterophilus pecorum (Diptera: Gasterophilidae) spawning habitat

    The volatiles from fresh feces of Przewalski’s horse at the pre-oviposition, oviposition, and post-oviposition stages of G. pecorum
    Throughout the stages of pre-oviposition (PREO), oviposition (OVIP), and post-oviposition (POSO) of G. pecorum, 70 volatiles were identified in fresh feces of Przewalski’s horse. Among them, 46, 48, and 52 volatiles were identified at PREO, OVIP, and POSO, respectively, and 29 volatiles were common to all three stages. In addition, 4, 5, and 9 volatiles were common to PREO and OVIP, OVIP and POSO, and PREO and POSO, respectively, whereas 4, 10, and 9 volatiles were unique to PREO, OVIP, and POSO, respectively (Table 1; Fig. S1). According to relative content, the two main chemical classes of volatiles were aromatic hydrocarbons and alkenes; that is, each accounted for more than 25% of the total content of a sample. Except for alcohols, which differed significantly between PREO and POSO (One-way ANOVA, F = 8.400, df = 2, P = 0.018), no pairwise comparison among the three stages was significant for any of the nine chemical classes (One-way ANOVA or Kruskal–Wallis test: P > 0.05) (Fig. 1). Non-metric multidimensional scaling (NMDS) analysis revealed a certain extent of overlap (Fig. 2), while one-way analysis of similarity (ANOSIM) indicated significant differences among the three stages (R = 0.5391, P = 0.008).
    Table 1 The volatiles from fresh feces of Przewalski’s horse at the stages of PREO, OVIP, and POSO of Gasterophilus pecorum.
    Figure 1 Volatile classes detected from fresh feces of Przewalski’s horse at the stages of PREO, OVIP, and POSO of Gasterophilus pecorum. PREO, OVIP, and POSO represent fresh feces at the stages of pre-oviposition, oviposition, and post-oviposition of Gasterophilus pecorum, respectively. Data are mean (n = 3) ± SE. Different letters indicate significant differences at P […] 0.05).
    Furthermore, acetic acid was common to PREO and POSO, but no difference was observed between them (Independent t test, t = 0.137, df = 4, P = 0.897) (Table 1).
    Of particular concern among the eight volatiles mentioned above, ammonium acetate and butanoic acid were unique to OVIP, the critical stage of oviposition. Although not among the five most abundant volatiles, another nine volatiles were also specific to OVIP, of which hexanoic acid, cyclopentasiloxane, decamethyl- and cyclohexene, 3-methyl-6-(1-methylethyl)- were higher than 1% in relative content (Table 1).
    Among the 47 volatiles common to two or three stages, only six differed significantly in relative content. Of these, D-limonene was higher at PREO than at OVIP (One-way ANOVA: F = 11.936, df = 2, P = 0.012) or POSO (P = 0.012), and 1-butanol was higher at OVIP than at PREO (One-way ANOVA: F = 8.175, df = 2, P = 0.024) or POSO (P = 0.04). Relative contents of the other four volatiles were less than 1% (Table 1).
    The volatiles from feces of Przewalski’s horse with different freshness states at the OVIP stage of G. pecorum
    In total, 83 volatiles were detected from fresh feces (Fresh), semi-fresh feces (Semi-fresh), and dry feces (Dry) at the OVIP stage of G. pecorum. Of these, 48, 41 and 28 volatiles were identified in Fresh, Semi-fresh and Dry, respectively, and 7 volatiles were common to all three freshness states. In addition, 14, 3 and 3 volatiles were common to Fresh and Semi-fresh, Semi-fresh and Dry, and Fresh and Dry, respectively, whereas 24, 17, and 15 were unique to Fresh, Semi-fresh, and Dry, respectively (Table 2; Fig. S2). The two main chemical classes were aromatic hydrocarbons and alkenes in Fresh, acids and ketones in Semi-fresh, and alcohols and aldehydes in Dry. Except for esters and ‘others’, which showed no significant difference among the feces, the other seven classes differed significantly in at least one pairwise comparison of the three freshness states (One-way ANOVA, Independent t-test or Kruskal–Wallis test: P […]
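    The counts reported for the three freshness states can be cross-checked with a short Python sketch. It assumes that the pairwise figures (14, 3 and 3) count volatiles shared by exactly two states and exclude the 7 shared by all three; the text does not state this explicitly, and the function name `venn_totals` is made up for illustration.

```python
def venn_totals(all_three, ab, bc, ac, only_a, only_b, only_c):
    """Return (size_a, size_b, size_c, union) for a three-set Venn diagram.

    ab, bc and ac are assumed to count items shared by exactly two sets,
    so the seven regions are disjoint and can simply be added.
    """
    size_a = all_three + ab + ac + only_a
    size_b = all_three + ab + bc + only_b
    size_c = all_three + bc + ac + only_c
    union = all_three + ab + bc + ac + only_a + only_b + only_c
    return size_a, size_b, size_c, union


# a = Fresh, b = Semi-fresh, c = Dry, using the counts given in the text.
fresh, semi_fresh, dry, total = venn_totals(7, 14, 3, 3, 24, 17, 15)
print(fresh, semi_fresh, dry, total)  # prints: 48 41 28 83
```

    Under that assumption the seven regions reproduce the reported 48, 41 and 28 volatiles per state and the overall total of 83.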