Pollen collection and RNA extraction
Pollen is a microscopic and notoriously resistant plant product. We therefore developed methods specifically for this work to collect a sufficient and roughly equivalent volume of pollen per species, and to ensure that RNA was collected from viruses both internal and external to pollen grains. At each of the four regions, we identified visually asymptomatic plant species that were in full flower and abundant enough to meet our pollen sample minimum. Many of the pollen samples were collected from public roadsides. However, some from the California Grasslands were collected from the University of California’s McLaughlin Natural Reserve, and some from the Eastern Deciduous Agro-forest Interface were collected from the University of Pittsburgh’s Pymatuning Laboratory of Ecology. We had permission to sample in both places. In addition, we obtained permission from the USDA Forest Service to sample in the Till Ridge Cove area of the Chattahoochee-Oconee National Forest for the Central Appalachia region. None of the sampled plants displayed classic viral symptoms (e.g., leaf yellowing, vein clearing, leaf distortions, growth abnormalities). To achieve the broadest representation of plant species, we selected species in different families where feasible. Where possible, we also focused primarily on perennial species to avoid any effects of life history variation. From these, we collected 30 to 50 mg of pollen from newly dehiscing anthers (3–967 fresh hermaphroditic flowers from 1–27 plants per species; Supplementary Table 3) in situ using a sterile sonic dismembrator (Fisherbrand Model 50, Fisher Scientific, Waltham, MA, USA) with a frequency of 20 kHz. We removed non-pollen tissues (e.g., anther debris) with sterile forceps.
In addition to removing non-pollen debris that was visible to the naked eye in the field at the time of pollen sample collection, we conducted microscopic and gene expression analyses to confirm the purity of the pollen samples in the lab (Supplementary Methods). Visibly pure pollen from a single species was transferred to a 2-mL collection tube with Lysing Matrix D (MP Biomedicals, Irvine, CA, USA) and kept on dry ice until transported to and stored at −80°C at the University of Pittsburgh (Pittsburgh, PA, USA).
Before extracting the total RNA, we freeze-dried the pollen samples (FreeZone 4.5 Liter Benchtop Freeze Dry System, Labconco Corporation, Kansas City, MO, USA) and lysed them with a TissueLyser II (Qiagen, Inc., Germantown, MD, USA) at 30 Hz, with lysis times that varied by plant species (Supplementary Table 3). We confirmed via microscopy that this protocol resulted in the breakage of ≥50% of the pollen grains in a sample. The total RNA, including dsRNA, was extracted using the Quick-RNA Plant Miniprep Extraction Kit (Zymo Research Corporation, Irvine, CA, USA), following the manufacturer’s full protocol, including the optional steps of in-column DNA digestion and inhibitor removal.
RNA sequencing
We assessed the quantity and quality of the total RNA extracted from each pollen sample with a Qubit 2.0 fluorometer (Invitrogen, ThermoFisher Scientific, Waltham, MA, USA) and with TapeStation analyses performed by the Genomics Research Core (GRC) at the University of Pittsburgh. Only samples with an RNA integrity value of ≥1.9 were used (Supplementary Table 3). Stranded RNA libraries were prepared by the GRC using the TruSeq Total RNA Library Kit (Illumina, Inc., San Diego, CA, USA), and ribosomal depletion was performed using a RiboZero Plant Leaf Kit (Illumina, Inc., San Diego, CA, USA).
At the GRC, we pooled depleted RNA libraries from six species on a single lane of an Illumina NextSeq500 platform.
Pre-virus detection steps
A sequencing depth of 117–260 million 75 bp paired-end reads was achieved per sample (Supplementary Table 3). Sequences were demultiplexed and trimmed of adapter sequences. We used the Pickaxe pipeline42,60,61 to detect known and novel pollen-associated viruses. First, Pickaxe removes poor-quality raw reads42,60,61 and aligns the quality-filtered reads to a subtraction library using the Bowtie2 aligner with default parameters62. Each customized subtraction library contained the host plant species genome or, if the host plant genome was not available, the most closely related plant genomes in the National Center for Biotechnology Information (NCBI) database (Supplementary Table 7), as well as other possible contaminant genomes (e.g., the human genome)42,60,61. Subtraction libraries containing 1–8 closely related plant genomes, a bioinformatically tractable number, were used to remove plant sequences, which allows a conservative estimate of the viruses associated with pollen. The size of the subtraction libraries did not influence the number of identified viruses, as there was no correlation between library size and either estimate of virus richness (conservative: r = 0.08, P = 0.75; relaxed: r = 0.06, P = 0.77). After subtraction, only non-plant reads remained, and these were used for viral detection.
Known RNA virus detection, identity confirmation
With Pickaxe, we used the Bowtie2 aligner with default parameters62 (v2.3.4.2-3) to align the non-plant reads to Viral RefSeq42,60,61 (hereafter, VRS; Index of /refseq/release/viral (nih.gov)). Each known virus reflects the top hit of an alignment to VRS42,60,61. Following Cantalupo et al.42, we considered a known virus to be present if the viral reads covered at least 20% of the top hit and aligned to it at least ten times.
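The presence rule for known viruses can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Pickaxe's actual code: the names `SegmentAlignment`, `covered_fraction`, `segment_passes`, and `virus_detected` are hypothetical, and read alignments are assumed to be available as 0-based, half-open intervals on the reference sequence. The `virus_detected` helper also shows one way to handle a multi-segment genome, by accepting it when any single segment passes both thresholds.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Thresholds taken from the text: >= 20% of the top hit covered, >= 10 aligned reads.
MIN_COVERAGE_FRACTION = 0.20
MIN_ALIGNED_READS = 10


@dataclass
class SegmentAlignment:
    """Hypothetical container for reads aligned to one reference sequence."""
    reference_length: int
    # One (start, end) pair per aligned read; 0-based, half-open (assumed format).
    read_intervals: List[Tuple[int, int]]


def covered_fraction(seg: SegmentAlignment) -> float:
    """Fraction of reference positions covered by at least one read."""
    covered = 0
    current_end = 0  # right edge of the region already counted
    for start, end in sorted(seg.read_intervals):
        start = max(start, current_end)       # skip positions already counted
        end = min(end, seg.reference_length)  # clip reads overhanging the reference
        if end > start:
            covered += end - start
            current_end = end
    return covered / seg.reference_length


def segment_passes(seg: SegmentAlignment) -> bool:
    """Apply both thresholds to a single reference sequence."""
    return (
        len(seg.read_intervals) >= MIN_ALIGNED_READS
        and covered_fraction(seg) >= MIN_COVERAGE_FRACTION
    )


def virus_detected(segments: List[SegmentAlignment]) -> bool:
    """A virus is called present if at least one of its segments passes."""
    return any(segment_passes(seg) for seg in segments)
```

Merging overlapping read intervals before summing matters here: ten reads stacked on the same 75 bp window satisfy the read-count threshold but cover far less than 20% of a typical viral reference, so the two criteria are checked independently.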
For viruses with segmented genomes, at least one segment was required to meet these criteria.
Contig annotation and extension; novel RNA viral genome detection, identity confirmation
Viral reads were assembled into contigs using the CLC Assembly Cell (Qiagen Digital Insights, Redwood City, CA, USA), and Pickaxe was used to remove repetitive, short ( More