Taxonomic composition and seasonal dynamics of the air microbiome in West Siberia

Time-series sampling

Air samples were collected in Yurga (55.711 N, 84.937 E), where average temperatures range from 6 to 24 °C in summer (June–August) and from − 21 to − 6 °C in winter (November–March) (open source service https://weatherspark.com/). Meteorological characteristics (temperature, relative humidity, and wind direction) during the time series are presented in Fig. S1 and S2. The air samplers were positioned on an open-air balcony (~ 4 m above ground level, under a concrete canopy) of a five-storey residential building. Samples were collected in duplicate (i.e., two technical replicates) with two high-flowrate, filter-based air samplers (SASS3100, Research International, USA). The first set of samples was collected during three time periods (1:00–3:00, 9:00–11:00, and 15:00–17:00) on 26 and 28 July 2017; the second set was collected during three time periods (9:00–11:00, 15:00–17:00, and 21:00–23:00) on consecutive days from 2 to 5 December 2017; the third set was collected during four time periods (1:00–3:00, 9:00–11:00, 15:00–17:00, and 21:00–23:00) on consecutive days from 27 August to 2 September 2018. In total, 78 samples in 39 time intervals were collected and used for preparation of 62 sequencing libraries (Table S1).

High volumetric, filter-based air samplers (SASS3100, Research International, USA) were used in this study, with SASS bioaerosol electret filters (6 cm diameter, expected 50% efficiency for 0.5 µm particle size, Research International, USA) as the filter medium. Sampling was performed at 300 L/min air flowrate for 2 h. After sampling, the SASS filters were stored at − 20 °C. During transport from Siberia to Singapore, the samples were hand-carried with cooling.
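
As a quick check on the sampling parameters above, the volume of air drawn through each filter follows directly from the stated flowrate and duration. The short sketch below does that arithmetic; the function name is illustrative and not part of the original protocol.

```python
# Sampling parameters stated in the text.
FLOW_RATE_L_PER_MIN = 300  # air flowrate of the sampler
DURATION_H = 2             # length of one sampling interval


def sampled_air_volume_m3(flow_l_per_min: float, duration_h: float) -> float:
    """Return the volume of air drawn through the filter, in cubic metres."""
    litres = flow_l_per_min * duration_h * 60  # 60 minutes per hour
    return litres / 1000.0                     # 1000 L per cubic metre


volume_m3 = sampled_air_volume_m3(FLOW_RATE_L_PER_MIN, DURATION_H)
print(f"{volume_m3:.0f} m^3 of air per filter")
```

Each 2 h sample therefore corresponds to 36 m³ of air.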

Sample blanks

In each sampling set, blanks were also collected as controls. The blanks consisted of 12 filter blank samples (FB) and three reagent blank samples (RB). The filter blank samples were collected by installing a new filter on the air sampler at the sampling location for about 5 s. The filter was then collected and analysed with the same protocol as the time-series samples. Reagent blank samples involved extractions performed with extraction reagents without any filter.

Details on metagenomic analysis for blanks are provided in the Supplementary section (Fig. S11).

DNA extraction

DNA was isolated from each technical replicate separately. For processing, each SASS filter was first transferred into a sterile 5 mL tube. Phosphate buffered saline (pH 7.2) with 0.1% (v/v) Triton X-100 (PBS-T, 2 mL) was added to the tube as the wash buffer. Using tweezers, the filter was moved up and down a few times to let the PBS-T penetrate it. The tube was then sonicated for 1 min in a sonication bath without heating to dislodge the biomass from the filter. After sonication, the filter was squeezed with tweezers and the PBS-T with suspended particles was transferred into a sterile 50 mL conical tube to complete the first washing step. The washing step was performed three times in total for each filter sample, using fresh 2 mL PBS-T each time. At the end of the second and third washes, the filter was transferred into the barrel of a 10 mL syringe placed in the same 50 mL conical tube containing the wash liquid. The 50 mL tube with the syringe and SASS filter was then centrifuged at 5000×g for 2 min to remove any leftover PBS-T. The expected total recovered supernatant volume from the three washes was 6 mL per sample, which contained the captured airborne particles.

Upon completion of the wash steps, the supernatant was subsequently filtered through a 0.02 µm Anodisc filter (Whatman, UK) using a vacuum manifold (DHI, Denmark). The Anodisc was finally transferred into a 5 mL bead tube provided in the DNeasy PowerWater Kit (Qiagen, Germany) for DNA extraction.

DNA extraction from the Anodisc largely followed the standard protocol of the DNeasy PowerWater Kit, with the following modifications to increase DNA yield. Briefly, 0.1 mg/mL (final) Proteinase K was added to the lysis buffer (solution PW1) prior to the initial 55 °C incubation. The initial incubation at 55 °C was also prolonged from the recommended 10 min to overnight. After the initial incubation, the sample tubes were vortexed for 3 min and then placed into an ultrasonic bath (Elmasonic, USA) for sonication at 65 °C for 30 min²⁹, followed by another 5 min of vortexing. The remaining extraction steps were completed as instructed in the manufacturer’s protocol.

In the first and second time series (SUMMER 2017 and WINTER 2017), the DNA isolated from the technical replicates was pooled to provide sufficient material for sequencing.

Metagenomic sequencing

For the metagenomic sequencing and NGS data processing, we used standardised procedures and pipelines described in detail elsewhere¹. Extracted air DNA samples were quantitated on a Qubit 2.0 fluorometer, using the Qubit dsDNA HS (High Sensitivity) Assay Kit (Invitrogen). Immediately prior to library preparation, sample quantitation was repeated on a Promega QuantiFluor fluorometer, using Invitrogen’s PicoGreen assay.

Next-generation sequencing libraries were prepared with Swift Biosciences’ Accel-NGS 2S Plus DNA Library Kit, following the instructions provided in the kit. With the exception of samples that had a concentration of < 0.25 ng/µL, the starting amount of DNA for library preparation was normalized to 5 ng. DNA shearing was performed on a Covaris E220 focused-ultrasonicator with the following settings: peak power: 175, duty factor: 5.0, cycles/burst: 200, run time: 90 s. All libraries were dual-barcoded, using Swift Biosciences’ 2S Combinatorial Dual Indexing Kit. For PCR amplification, which selectively enriches for library fragments that have adapters ligated on both ends, the PCR cycles were normalized to eight for all libraries with a starting amount of 4–5 ng of DNA. For samples with less than 4 ng of DNA, amplification cycles were adjusted as follows: 3.0–3.9 ng: 9 cycles, 2.0–2.9 ng: 11 cycles, 1.0–1.9 ng: 13 cycles, < 1 ng: 15 cycles. Size-selection was omitted for all libraries.
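
The cycle-number rule above can be written as a small lookup. This is a minimal sketch rather than the authors' tooling: the function name is hypothetical, inputs that fall between the stated bins (e.g. 3.95 ng) are assumed to belong to the lower bin, and inputs above 5 ng are assumed to receive eight cycles.

```python
def pcr_cycles(input_ng: float) -> int:
    """Return the number of PCR cycles for a given DNA input in nanograms.

    Bins follow the text: 4-5 ng -> 8, 3.0-3.9 ng -> 9, 2.0-2.9 ng -> 11,
    1.0-1.9 ng -> 13, < 1 ng -> 15. Handling of values between bins and
    above 5 ng is an assumption of this sketch.
    """
    if input_ng < 0:
        raise ValueError("DNA input cannot be negative")
    if input_ng >= 4.0:
        return 8
    if input_ng >= 3.0:
        return 9
    if input_ng >= 2.0:
        return 11
    if input_ng >= 1.0:
        return 13
    return 15


for ng in (5.0, 3.5, 2.2, 1.0, 0.4):
    print(f"{ng:.1f} ng -> {pcr_cycles(ng)} cycles")
```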

Library quantitation was performed using Invitrogen’s PicoGreen assay, and the average library size was determined by running the libraries on a Bioanalyzer DNA 7500 chip (Agilent). Library concentrations were normalized to 4 nM and the concentration was validated by qPCR on a ViiA-7 real-time thermocycler (Applied Biosystems), using Kapa Biosystems’ Library Quantification Kit for Illumina Platforms. Libraries were then pooled at equal volumes and sequenced on Illumina HiSeq2500 rapid runs at a final concentration of 10–16 pM and a read length of 251 bp paired-end (Illumina V2 Rapid sequencing reagents).
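
Normalizing to 4 nM requires converting a mass concentration and an average fragment size into molarity. The sketch below uses the common approximation of 660 g/mol per base pair of double-stranded DNA; that constant, the function names, and the 20 µL final volume are assumptions made for illustration and are not stated in the text.

```python
AVG_BP_MASS_G_PER_MOL = 660.0  # assumed average mass of one dsDNA base pair


def library_molarity_nm(conc_ng_per_ul: float, mean_size_bp: float) -> float:
    """Convert a dsDNA concentration (ng/uL) and mean fragment size (bp) to nM."""
    return conc_ng_per_ul / (AVG_BP_MASS_G_PER_MOL * mean_size_bp) * 1e6


def dilution_volumes(stock_nm: float, target_nm: float = 4.0,
                     final_volume_ul: float = 20.0) -> tuple:
    """Return (library uL, diluent uL) needed to reach target_nm."""
    if stock_nm < target_nm:
        raise ValueError("stock is already below the target concentration")
    library_ul = target_nm * final_volume_ul / stock_nm
    return library_ul, final_volume_ul - library_ul


stock = library_molarity_nm(conc_ng_per_ul=2.0, mean_size_bp=500)
lib_ul, diluent_ul = dilution_volumes(stock)
print(f"stock {stock:.2f} nM: mix {lib_ul:.1f} uL library + {diluent_ul:.1f} uL diluent")
```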

High-throughput sequencing data processing and analysis

Metagenomic data generated for the air samples were processed for adaptor removal and quality trimming with a Phred quality score threshold of Q20 using Cutadapt v. 1.8.1³⁰. Two million reads (250 bp) were randomly selected from each sample as a representative set and aligned against the NCBI non-redundant (NR) protein database downloaded on 7/08/2017 using the alignment tool RAPSearch v. 2.15²¹,²².
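
The text does not say how the two million reads were drawn. One way to select a fixed-size random subset from a FASTQ file in a single pass is reservoir sampling, sketched below; the helper names and the fixed seed are assumptions of this example, not the authors' procedure.

```python
import io
import random
from itertools import islice
from typing import Iterable, Iterator, List, Tuple

FastqRecord = Tuple[str, str, str, str]  # header, sequence, separator, qualities


def fastq_records(handle: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield four-line FASTQ records from an open text handle."""
    lines = iter(handle)
    while True:
        block = list(islice(lines, 4))
        if len(block) < 4:
            return
        yield tuple(line.rstrip("\n") for line in block)


def reservoir_sample(records: Iterable[FastqRecord], k: int,
                     seed: int = 0) -> List[FastqRecord]:
    """Return k records chosen uniformly at random (all records if fewer than k)."""
    rng = random.Random(seed)
    reservoir: List[FastqRecord] = []
    for i, record in enumerate(records):
        if i < k:
            reservoir.append(record)
        else:
            j = rng.randint(0, i)  # uniform over 0..i, both ends inclusive
            if j < k:
                reservoir[j] = record
    return reservoir


# Tiny in-memory demonstration; a real run would use k = 2_000_000.
demo = "".join(f"@read{i}\nACGT\n+\nIIII\n" for i in range(5))
subset = reservoir_sample(fastq_records(io.StringIO(demo)), k=2)
print([rec[0] for rec in subset])
```

Reservoir sampling keeps memory use proportional to the subset size, which matters when the input holds many more reads than are retained.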

Resulting alignments were imported into MEGAN v.5.11.3, which assigns taxon IDs based on the NCBI taxonomy²³,²⁴. To achieve the desired taxonomic specificity, we used the following filtering parameters: min score = 100 (bit score), max expected = 0.01 (e-value), top percent = 10 (top 10% of highest bit score), min support = 25 (minimum number of reads required for taxonomic assignment), LCA percent = 100 (naive), min complexity = 0.33 (sequence complexity). The lowest common ancestor (LCA) of each read on the NCBI taxonomy was assigned using MEGAN’s LCA algorithm. Reads that fulfilled all of the above filtering criteria were assigned to levels of taxonomic classification ranging from domain to species. In our study, species-level classification was reached only if at least 25 reads uniquely aligned to a single species in the database with a 100% match on the protein level over at least 50% of the 250 bp read. Due to limits of existing public sequence databases, some sequencing reads did not result in meaningful alignments and were assigned to the ‘no-hits’ category. Unassigned reads are sequencing reads for which low-complexity, repetitive DNA sequences or multiple alignments beyond domain-level were encountered.
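
To illustrate how these filters interact, the sketch below applies the score, e-value and top-percent thresholds to one read's hits, takes the naive LCA of the surviving lineages, and then pushes reads on poorly supported taxa up to the parent. It is a simplified illustration and not MEGAN's implementation: the data structures and function names are hypothetical, the complexity filter is omitted, and support is counted only from reads assigned directly to a taxon.

```python
from collections import Counter
from typing import List, NamedTuple, Optional, Tuple

Lineage = Tuple[str, ...]  # ordered from domain down to species


class Hit(NamedTuple):
    lineage: Lineage
    bit_score: float
    e_value: float


def common_prefix(lineages: List[Lineage]) -> Lineage:
    """Naive LCA: the longest lineage prefix shared by all hits."""
    prefix = lineages[0]
    for lineage in lineages[1:]:
        n = 0
        for a, b in zip(prefix, lineage):
            if a != b:
                break
            n += 1
        prefix = prefix[:n]
    return prefix


def assign_read(hits: List[Hit], min_score: float = 100.0,
                max_expected: float = 0.01,
                top_percent: float = 10.0) -> Optional[Lineage]:
    """Return the LCA lineage for one read, or None for a 'no-hits' read."""
    kept = [h for h in hits
            if h.bit_score >= min_score and h.e_value <= max_expected]
    if not kept:
        return None
    best = max(h.bit_score for h in kept)
    cutoff = best * (1.0 - top_percent / 100.0)
    return common_prefix([h.lineage for h in kept if h.bit_score >= cutoff])


def apply_min_support(assignments: List[Lineage],
                      min_support: int = 25) -> Counter:
    """Move reads on taxa with too few reads up to the parent taxon."""
    counts = Counter(assignments)
    max_depth = max((len(lineage) for lineage in counts), default=0)
    for depth in range(max_depth, 0, -1):  # deepest ranks first
        for lineage in [l for l in counts if len(l) == depth]:
            if counts[lineage] < min_support:
                n = counts.pop(lineage)
                counts[lineage[:-1]] += n  # () stands for the root
    return counts


hits = [
    Hit(("Bacteria", "Bacillus", "Bacillus subtilis"), 200.0, 1e-10),
    Hit(("Bacteria", "Bacillus", "Bacillus cereus"), 190.0, 1e-9),
    Hit(("Bacteria", "Pseudomonas", "Pseudomonas putida"), 120.0, 1e-5),
]
print(assign_read(hits))  # two Bacillus species tie within the top 10%
```

In the example, the two *Bacillus* hits fall within 10% of the best bit score while the *Pseudomonas* hit does not, so the read is placed at genus level rather than species level.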

Statistical analysis

Seasonal differences in richness, evenness and relative abundances of microbial taxa were assessed by generalized regression modelling in R v. 3.3.3³¹. Multivariate linear modelling was performed with the mvabund package in R v. 3.3.3. To visualize multivariate patterns in microbial communities, Bray–Curtis dissimilarities among centroids for each sample series were calculated with the vegan package in R v. 3.3.3. Principal Coordinates (PCo) analysis was used as the ordination method. The alpha diversity indices Chao1 and Simpson E were calculated in QIIME v. 1.8.0³². Cross-correlation analysis was performed in R v. 3.3.3.
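
For readers unfamiliar with the indices named above, the following sketch computes them from raw read counts in plain Python. It is only an illustration of the definitions, not a reproduction of the vegan or QIIME code paths: the bias-corrected form of Chao1 is assumed here, and the centroid-based distances used in the study are not reproduced.

```python
from typing import Sequence


def bray_curtis(a: Sequence[float], b: Sequence[float]) -> float:
    """Bray-Curtis dissimilarity between two samples with the same taxa order."""
    total = sum(a) + sum(b)
    if total == 0:
        raise ValueError("both samples are empty")
    return sum(abs(x - y) for x, y in zip(a, b)) / total


def chao1(counts: Sequence[int]) -> float:
    """Bias-corrected Chao1 richness: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1))."""
    s_obs = sum(1 for c in counts if c > 0)
    f1 = sum(1 for c in counts if c == 1)  # singletons
    f2 = sum(1 for c in counts if c == 2)  # doubletons
    return s_obs + f1 * (f1 - 1) / (2.0 * (f2 + 1))


def simpson_e(counts: Sequence[int]) -> float:
    """Simpson evenness: (1 / sum(p_i^2)) divided by observed richness."""
    observed = [c for c in counts if c > 0]
    total = sum(observed)
    dominance = sum((c / total) ** 2 for c in observed)
    return (1.0 / dominance) / len(observed)


summer = [10, 0, 5, 1, 1]  # made-up counts for illustration only
winter = [5, 5, 5, 0, 1]
print(f"Bray-Curtis: {bray_curtis(summer, winter):.3f}")
print(f"Chao1 (summer): {chao1(summer):.1f}")
print(f"Simpson E (summer): {simpson_e(summer):.3f}")
```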

Meteorological data

Retrospective meteorological data from the local weather station were downloaded from the open source service “Weather and Climate” (http://www.pogodaiklimat.ru/, weather archive information downloaded on 2.10.2018). Meteorological characteristics (temperature, relative humidity, and wind direction) during the time series are presented in Fig. S1 and Fig. S2.

Source: Ecology - nature.com
