Scat sample collection and preparation
At known harbour seal haulout sites, individual scat samples were collected using a standardized protocol (Fig. 1). Disposable wooden tongue depressors were used to transfer deposited scats into 500 ml single-use jars or zip-style bags lined with 126 µm nylon mesh paint strainers18. Samples were either preserved immediately in the field by adding 300 ml 95% ethanol to the collection jar, or were taken to the lab and frozen at −20 °C within 6 hours of collection19. Later, samples were thawed and filled with ethanol before being manually homogenized with a disposable wooden depressor inside the paint strainer to separate the scat matrix material from hard prey remains (e.g. bones, cephalopod beaks). The paint strainer containing prey hard parts was then removed from the jar, leaving behind the ethanol-preserved scat matrix for genetic analysis20. The paint strainer containing prey hard parts was refrozen for subsequent parallel morphological prey identification.
Fig. 1: The 52 harbour seal scat collection sites in the Salish Sea represented in this dataset.
Molecular laboratory processing
Scat matrix samples were subsampled (approximately 20 mg), centrifuged and dried to remove ethanol prior to DNA extraction. DNA was extracted from scat with the QIAGEN QIAamp DNA Stool Mini Kit according to the manufacturer’s protocols. For additional details on the extraction process see Deagle et al.21 and Thomas et al.20.
The metabarcoding marker we used to quantify fish and cephalopod proportions was a 16S mtDNA fragment (~260 bp) previously described in Deagle et al.15 for pinniped scat analysis. We used the combined Chord/Ceph primer sets: Chord_16S_F (GATCGAGAAGACCCTRTGGAGCT), Chord_16S_R (GGATTGCGCTGTTATCCCT), and Ceph_16S_F (GACGAGAAGACCCTAWTGAGCT), Ceph_16S_R (AAATTACGCTGTTATCCCT). This multiplex PCR reaction is designed to amplify both chordate and cephalopod prey species DNA. A blocking oligonucleotide was included in all 16S PCRs to limit amplification of seal DNA22. The oligonucleotide (32 bp: ATGGAGCTTTAATTAACTAACTCAACAGAGCA-C3) matches harbour seal sequence (GenBank Accession AM181032) and was modified with a C3 spacer so it is non-extendable during PCR22.
A secondary metabarcoding marker was used in a separate PCR reaction to quantify the salmon portion of seal diet, because the primary 16S marker was unable to reliably differentiate between coho and steelhead DNA sequences. This marker was a COI “minibarcode” specifically for salmonids within the standard COI barcoding region: Sal_COI_F (CTCTATTTAGTATTTGGTGCCTGAG), Sal_COI_R (GAGTCAGAAGCTTATGTTRTTTATTCG). The COI amplicons were sequenced alongside 16S such that the overall salmonid fraction of the diet was quantified by 16S, and the salmon species proportions within that fraction were quantified by COI.
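As an illustration of how the two markers combine, the following Python sketch scales COI salmon species proportions by the salmonid percentage obtained from 16S. The function name, argument names, and example read counts are hypothetical and are not taken from the dataset or the original pipeline.

```python
def salmon_species_percent(salmonid_percent_16s, coi_read_counts):
    """Split the 16S salmonid diet percentage among salmon species.

    salmonid_percent_16s: percent of the sample's diet assigned to
        salmonids by the 16S marker (0-100).
    coi_read_counts: mapping of salmon species -> COI read count
        for the same sample (hypothetical input structure).
    """
    total = sum(coi_read_counts.values())
    if total == 0:
        raise ValueError("no COI salmon reads available for this sample")
    return {
        species: salmonid_percent_16s * count / total
        for species, count in coi_read_counts.items()
    }


# Hypothetical sample: 40% salmonid by 16S, COI reads split 3:1.
example = salmon_species_percent(
    40.0, {"Oncorhynchus tshawytscha": 30, "Oncorhynchus keta": 10}
)
```

In this hypothetical sample, Chinook would account for 30% of the diet and chum for 10%, which together recover the 40% salmonid fraction measured by 16S.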
To take full advantage of sequencing throughput, we used a two-stage labeling scheme to identify individual samples that involved both PCR primer tags and labeled MiSeq adapter sequences. The open source software package EDITTAG was used to create 96 primer sets, each with a unique 10 bp primer tag and an edit distance of 5, meaning that at least 5 insertions, substitutions or deletions would have to occur for one sample’s sequences to be mistaken for another’s23.
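The edit-distance guarantee can be checked with a short Python sketch. This is not EDITTAG itself, only a plain Levenshtein implementation under the assumption that tags are given as strings; the function names are hypothetical.

```python
from itertools import combinations


def edit_distance(a, b):
    """Levenshtein distance: the minimum number of insertions,
    deletions and substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def min_pairwise_distance(tags):
    """Smallest edit distance between any two tags in the set."""
    return min(edit_distance(a, b) for a, b in combinations(tags, 2))
```

A tag set satisfies the design described above if `min_pairwise_distance(tags) >= 5` for all 96 tags.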
All PCR amplifications were performed in 20 μl volumes using the Multiplex PCR Kit (QIAGEN). Reactions contained 10 μl (0.5 X) master mix, 0.25 μM of each primer, 2.5 μM blocking oligonucleotide and 2 μl template DNA. Thermal cycling conditions were: 95 °C for 15 min followed by 34 cycles of: 94 °C for 30 s, 57 °C for 90 s, and 72 °C for 60 s.
Before pooling, amplicons from 96 individually labeled samples were run on 1.5% agarose gels, and the luminosity of each sample’s PCR product was quantified using Image Studio Lite (Version 3.1). To combine all samples in roughly equal proportion (normalization), we calculated the fraction of each sample’s PCR product to add to the pool based on its luminosity value relative to the brightest band. After 2013, amplicon normalization was performed using SequalPrep™ Normalization Plate Kits, 96-well.
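The exact normalization formula is not given here, so the following Python sketch shows one plausible reading, stated as an assumption: band luminosity is treated as proportional to the amount of PCR product, each band is expressed relative to the brightest band, and the fraction of product pooled is set inversely to that relative brightness so that the dimmest sample contributes all of its product. The function name and the example values are hypothetical.

```python
def pooling_fractions(luminosity):
    """Fraction of each sample's PCR product to add to the pool.

    luminosity: mapping of sample ID -> gel band luminosity (> 0).
    Assumes luminosity is proportional to amplicon quantity.
    """
    if any(value <= 0 for value in luminosity.values()):
        raise ValueError("all luminosity values must be positive")
    brightest = max(luminosity.values())
    # Brightness of each band relative to the brightest band (0-1].
    relative = {s: v / brightest for s, v in luminosity.items()}
    dimmest = min(relative.values())
    # Brighter bands contribute proportionally less of their product.
    return {s: dimmest / r for s, r in relative.items()}


# Hypothetical luminosity readings for three samples.
fractions = pooling_fractions({"A": 100.0, "B": 50.0, "C": 25.0})
```

With these hypothetical readings, sample A contributes a quarter of its product, B half, and C all of it, so each adds roughly the same amount of amplicon to the pool.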
Sequencing libraries were prepared from pools of 96 samples using an Illumina TruSeq DNA sample prep kit, which ligated uniquely labeled adapter sequences to each pool. Libraries were then pooled and DNA sequencing was performed on Illumina MiSeq using the MiSeq Reagent Kit v2 (300 cycle) for single-end (SE) 300 bp reads. Samples were sequenced on multiple different runs as part of the larger study; however, typically between 4 and 6 libraries (each a pool of 96 individually identifiable samples) were sequenced on a single MiSeq run.
Bioinformatics
To assign DNA sequences to a fish or cephalopod species, we created a custom BLAST reference database of 16S sequences by an iterative process. First, using a list of the fish species of Puget Sound, we searched GenBank for the 16S sequence fragment of all fishes known to occur in the region (71 fish families, 230 species)24,25. Reference sequences for each prey species were included in the database if the entire fragment was available, and preference was given to sequences of voucher specimens. When the database was first generated (November 2012), GenBank contained 16S sequences for 192 of the 230 fish species in the region, and the remaining 38 species were mostly uncommon species unlikely to occur in seal diets. Following a similar procedure, we added to this database sequences for all of the regional cephalopods for which 16S data were available (7 squid species, 2 octopus species). A separate reference database was generated for the COI salmon marker containing GenBank sequences for the nine salmonid species known to occur regionally: Oncorhynchus gorbuscha (Pink Salmon), Oncorhynchus keta (Chum Salmon), Oncorhynchus kisutch (Coho Salmon), Oncorhynchus mykiss (Steelhead), Oncorhynchus nerka (Sockeye Salmon), Oncorhynchus tshawytscha (Chinook Salmon), Oncorhynchus clarkii (Cutthroat Trout), Salmo salar (Atlantic Salmon), Salvelinus malma (Dolly Varden)24.
To determine whether any species in the database could not be distinguished from each other at 16S (i.e. had identical sequences in the reference database), a distance matrix was computed for the complete database using the DistanceMatrix function in the R package DECIPHER26. Species with identical sequences were identified as having a distance of “0.00”. In some cases, one haplotype for a species was identical to another species but other haplotypes were not. When two species’ sequences were identical, we ultimately reported both species in the prey_ID field.
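The check for indistinguishable species can be approximated without a full distance matrix by grouping reference entries on exact sequence identity. The Python sketch below is a simplification and not the DECIPHER procedure: it assumes all reference sequences cover the same trimmed fragment, and the function name and input structure are hypothetical.

```python
from collections import defaultdict


def indistinguishable_species(reference):
    """Group species that share at least one identical sequence.

    reference: iterable of (species, sequence) pairs; a species may
    appear several times, once per haplotype.
    Returns a list of sorted species lists, one per shared sequence.
    """
    species_by_sequence = defaultdict(set)
    for species, sequence in reference:
        species_by_sequence[sequence.upper()].add(species)
    return [
        sorted(species)
        for species in species_by_sequence.values()
        if len(species) > 1
    ]
```

Because the grouping is per haplotype, a species with several reference sequences is flagged only for the haplotypes it shares with another species, matching the case noted above.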
Sequences were automatically sorted (MiSeq post processing) by amplicon pool using the indexed TruSeq™ adapter sequences. FASTQ sequence files for each library were imported into MacQIIME (version 1.9.1-20150604) for demultiplexing and sequence assignment to species27. For a sequence to be assigned to a sample, it had to match the full forward and reverse primer sequences and match the 10 bp primer tag for that sample (allowing for up to 2 mismatches in either primers or tag sequence).
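The matching rule can be illustrated with a minimal Python sketch. It is not the MacQIIME implementation and rests on several assumptions: each read is laid out as tag, forward primer, insert, then the reverse complement of the reverse primer; mismatches are counted as substitutions only; and the tolerance of 2 is applied separately to the tag and to each primer. The function names and the sample tags in the usage note are hypothetical.

```python
# IUPAC codes used by the primers, mapped to the bases they allow.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "AG", "Y": "CT",
         "W": "AT", "S": "CG", "K": "GT", "M": "AC", "N": "ACGT"}
COMPLEMENT = str.maketrans("ACGTRYWSKMN", "TGCAYRWSMKN")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def mismatches(pattern, observed):
    """Count positions where the observed base is not allowed by the
    (possibly degenerate) pattern; unequal lengths never match."""
    if len(pattern) != len(observed):
        return max(len(pattern), len(observed))
    return sum(base not in IUPAC.get(code, "")
               for code, base in zip(pattern, observed))


def assign_sample(read, tags, fwd_primer, rev_primer, max_mismatch=2):
    """Return the sample whose tag and primers match the read, or None."""
    rev_site = reverse_complement(rev_primer)
    for sample, tag in tags.items():
        start = len(tag)
        fwd_observed = read[start:start + len(fwd_primer)]
        rev_observed = read[-len(rev_site):]
        if (mismatches(tag, read[:start]) <= max_mismatch
                and mismatches(fwd_primer, fwd_observed) <= max_mismatch
                and mismatches(rev_site, rev_observed) <= max_mismatch):
            return sample
    return None
```

For example, `assign_sample(read, {"s1": "AACCGGTTAA"}, "GATCGAGAAGACCCTRTGGAGCT", "GGATTGCGCTGTTATCCCT")` returns `"s1"` only if the read begins with that hypothetical tag (within tolerance) and carries both Chord_16S primers. When tags differ from one another by an edit distance of at least 5, a read within 2 substitutions of one tag cannot also be within 2 substitutions of a different tag, so returning the first match is unambiguous.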
Next, we clustered the DNA sequences that were assigned to scat or tissue samples with USEARCH (similarity threshold = 0.99; minimum cluster size = 3; de novo chimera detection), and entered a representative sequence from each cluster into a GenBank nucleotide BLAST search28,29. If the top matching species for any cluster was not included in the existing database (or the sequence differed indicating haplotype variation), we put the top matching entry in the reference database. We repeated this procedure with every new batch of sequence data to minimize the potential for incorrect species assignment or prey species exclusion. This process was conducted for both the 16S and COI reference databases with each new batch of samples.
For all DNA sequences successfully assigned to a sample, a BLAST search was performed against our custom 16S or COI reference databases. A sequence was assigned to a species based on the best match in the database (threshold BLASTN e-value < 1e-20 and a minimum identity of 0.9), and the proportions of each species’ sequences were quantified by individual sample after excluding harbour seal sequences or any identified contaminants27. Samples were excluded from subsequent analysis if they contained <10 identified prey DNA sequences (given the current costs of DNA sequencing, a higher threshold is now advisable). Harbour seal DNA diet percentages for individual scats were then calculated using the Relative Read Abundance (RRA) calculation commonly used in metabarcoding studies (Box 1)16. The RRA formula was used to calculate the “DNA_diet_percent” field in data record “Harbour_Seal_DNA_Diet_Data.csv”.
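The per-sample calculation can be sketched in Python as follows. This is an illustration under stated assumptions rather than the original pipeline: read counts are taken as a mapping of taxon to number of assigned sequences, harbour seal reads are identified by the scientific name Phoca vitulina, and the function name and example counts are hypothetical. Box 1 remains the authoritative definition of the RRA formula.

```python
def dna_diet_percent(read_counts, excluded=("Phoca vitulina",), min_reads=10):
    """Per-sample relative read abundance of each prey taxon, in percent.

    read_counts: mapping of taxon -> number of assigned sequences.
    excluded: taxa removed before calculation (seal, contaminants).
    min_reads: samples with fewer prey reads are dropped (returns None).
    """
    prey = {taxon: n for taxon, n in read_counts.items()
            if taxon not in excluded and n > 0}
    total = sum(prey.values())
    if total < min_reads:
        return None
    return {taxon: 100.0 * n / total for taxon, n in prey.items()}


# Hypothetical sample dominated by seal reads.
example = dna_diet_percent(
    {"Clupea pallasii": 60, "Oncorhynchus keta": 20, "Phoca vitulina": 500}
)
```

In this hypothetical sample the 500 seal reads are discarded, leaving 80 prey reads, so Pacific herring makes up 75% and chum salmon 25% of the sample's DNA diet.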
Prey hard parts analysis
Extraction and identification of hard structures from harbour seal scats was conducted by three different analysts. We used the “all structures” approach to identify harbour seal prey contained in individual scat samples, which makes our results comparable to similar studies previously conducted in the region8,13,14. Prey “hard parts” retained in paint strainers were cleaned of debris using either a conventional washing machine or nested sieves. All diagnostic prey hard parts were identified to the lowest possible taxon using a dissecting microscope and reference fish bones from Washington and British Columbia, in addition to published keys for fish bones and cephalopod beaks30,31,32,33. Samples containing prey hard parts identifiable only to the family level (e.g., Clupeidae), and bones identifiable to the species level of the same family (e.g., Pacific herring, Clupea pallasii) were both tallied.
In previous studies of harbour seal impacts to juvenile salmonids in the Salish Sea1,2,8,34, these diagnostic hard structures (e.g., otoliths, bones) were combined with DNA extracted from each scat sample to estimate the proportion of juvenile and adult salmon (by species) in the seal diet. This approach (see: Thomas et al.8) integrates separate analyses of hard parts and DNA through an algorithm that apportions the salmonid DNA component in each sample to a “juvenile” or “adult” classification. The decision algorithm is based on the co-occurrence of age-classified salmon bones and salmon DNA in samples, and (when bones were not present but salmon DNA was detected) on known seasonal life-history information. For example, an individual scat sample found to contain 5% Chinook salmon, and a 1:1 ratio of juvenile to adult salmon bones, would be disaggregated into a final classification of 2.5% juvenile Chinook salmon and 2.5% adult Chinook salmon. For individual scat samples that do not contain diagnostic hard structures, the ratio of juveniles to adults in that sample would rely on the ratio of hard parts pooled for the collection month. If no hard structures are available for the collection month (which occurred for only 7% of samples in Thomas et al.8), a seasonal classification would then be applied to the sample (spring = juvenile, fall = adult). The classification of hard parts as “juvenile” or “adult” was performed by taxonomic experts who differentiated samples visually, and/or according to otolith or vertebral measurements (e.g., Nelson et al.34).
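The decision hierarchy described above can be summarized in a short Python sketch. It is a simplified illustration and not the published algorithm of Thomas et al.8: bone evidence is assumed to be supplied as (juvenile, adult) counts, only the two seasonal rules stated above are encoded, and the function and argument names are hypothetical.

```python
def apportion_by_age(dna_percent, sample_bones=None, month_bones=None,
                     season=None):
    """Split a salmon species' DNA diet percentage into age classes.

    dna_percent: percent of the sample's diet for one salmon species.
    sample_bones: (juvenile, adult) bone counts in this sample, if any.
    month_bones: (juvenile, adult) counts pooled for the collection month.
    season: "spring" or "fall", used only when no bones are available.
    """
    # Prefer bones from the sample itself, then the monthly pool.
    for counts in (sample_bones, month_bones):
        if counts and sum(counts) > 0:
            juvenile, adult = counts
            juvenile_fraction = juvenile / (juvenile + adult)
            return {"juvenile": dna_percent * juvenile_fraction,
                    "adult": dna_percent * (1 - juvenile_fraction)}
    # Fall back to the seasonal rule (spring = juvenile, fall = adult).
    if season == "spring":
        return {"juvenile": dna_percent, "adult": 0.0}
    if season == "fall":
        return {"juvenile": 0.0, "adult": dna_percent}
    raise ValueError("no bone evidence and no seasonal rule for this sample")


# The worked example from the text: 5% Chinook, 1:1 juvenile:adult bones.
chinook = apportion_by_age(5.0, sample_bones=(3, 3))
```

For the worked example, `chinook` holds 2.5% juvenile and 2.5% adult Chinook salmon. Seasons other than spring and fall raise an error here because the text does not state a rule for them.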
Source: Ecology - nature.com