Generation of Resp-and bab-recombinant lines
Male informative backcross (BC) families using O. nubilalis Slovenia and Hungary strains15 were generated that exhibited fixed recombination between the flanking genes of the Resp region, trol and not. ZE and EZ hybrid males were backcrossed to a Z-strain female to generate backcross 1 (BC1) (Supplementary Fig. 1a, b). Recombinants between trol and not were identified via polymerase chain reaction (PCR) (Supplementary Methods, Supplementary Table 1), and crossed to Z-strain individuals to obtain BC2 (Supplementary Fig. 1c). BC2 individuals were genotyped to detect recombinants, then mated with each other to generate inbred 1 (IB1) crosses (Supplementary Fig. 1d). IB1 adults with the desired genotype were mass reared to obtain IB2 (Supplementary Fig. 1e). IB2 families that originated from a BC1 male cross were fixed homozygote recombinants, whereas BC1 female cross descendants were genotyped and inbred again to obtain fixed recombinant homozygotes (Supplementary Fig. 1f, g). Nine Resp-recombinant lines had one recombination point between homozygous trol and not genes (L165, L173, L185, L190, L195, L205, L215, L220, L237). bab-recombinant lines exhibited fixed recombination between bab’s flanking genes, ago and not, and were generated using the two homozygote recombinant lines L165 with Z-strain phenotype and L205 with E-strain phenotype. Single pair matings between L165 females and L205 males were set up to obtain hybrid males, which were backcrossed to L165 females. The BC individuals were screened with PCR (Supplementary Methods) to select recombinant adults that were used for inbred mass rearing. The PCR selection process continued until two fixed homozygote populations were established, i.e. line L44-Z and line L44-E (Fig. 2a).
Genomic sequencing of Resp-recombinant lines
A pool of 10 male pupae of lines L165 and L205 were homogenized in liquid nitrogen using mortar and pestle and DNA extractions were performed with QIAGEN Genomic-tip 100/G and the Genomic DNA Buffer Set (Qiagen, Hilden, Germany) according to the manufacturers’ instructions, but extending incubation times with buffer G2 containing proteinase K and RNase A to 12 h. HMW genomic DNA was sent to GATC Biotech for sequencing. Sequencing was done using an Illumina HiSeq2500 instrument, obtaining ~200 Mio paired end (2 × 150 bp) sequences per Resp-recombinant line. Shotgun genome assemblies were generated using the CLC Genomics Workbench v10.1. For PacBio sequencing, HMW genomic DNA was isolated from individual pupae of lines L165 and L205 by the Max Planck-Genome Centre Cologne (MPGCC) using the Qiagen MagAttract HMW DNA Kit. Sequencing of the size-selected HMW genomic DNA of each strain further purified with AMPure beads was performed at the MPGCC on a PacBio Sequel instrument. PacBio reads for both recombinant lines were assembled separately using the HGAP4 assembly pipeline implemented in the SMRT analysis software with standard settings. After genome sequencing of lines L165 and L205, primers were designed which amplified line-specific size polymorphisms and used to narrow down the breakpoint within all Resp-recombinant lines (Supplementary Methods, Supplementary Table 1).
Phenotyping with wind tunnel assays
Wind tunnel experiments were conducted with 0–5-day-old unmated males in a 2.5 × 1 × 1 m wind tunnel at 20–25 °C, 70% humidity, 30 cm/s airflow, and 26% red light. Synthetic lures (Z-strain lure: 97% Z11-14:OAc + 3% E11-14:OAc; E-strain lure: 99% E11-14:OAc + 1% Z11-14:OAc) diluted Z11-14:OAc and E11-14:OAc (purity of ≥99%, Pherobank, Wijk bij Duurstede, Netherlands) with hexane to 30 µg per lure. Blend quality and quantity was confirmed with gas chromatography. Pheromones were applied to rubber septa (Thomas Scientific, Swedesboro, NJ, USA) and stored at −20 °C. Individual males were placed in a small cylinder (10 cm, 3.2 cm diameter) covered with netted cloth at both ends permitting flow of odorized air. After placing the cylinder at the downwind end of the wind tunnel, male behavior, i.e. (1) resting (=no response), (2) wing fanning (=medium response), and (3) hair-pencil extrusion (=highest response), was recorded using setup adapted from Koutroumpa et al. 15, Supplementary Fig. 11). Each male was exposed to one blend for 60 s, kept for 30–60 min in the tunnel without any odor, and then the opposite blend was tested. Blends testing order was switched between experimental days. Statistical analysis was performed with R version 3.6.144 using Fisher’s Exact or Chi-squared test. To complement behavioral phenotypes, electrophysiological phenotypes (electroantennogram (EAG) and single sensillum recordings (SSR)) of bab-recombinant and CRISPR lines (described below) were recorded (Supplementary Methods).
RNA isoform identification
De novo transcriptomes of US laboratory populations45 were constructed using Trinity46 separately for E- and Z-strain individuals following methods in Levy et al. 47 to identify all splice variants of candidate genes. RNA was isolated from larval heads45, adult female heads47, or from whole pupae newly reported here. Briefly, RNA was extracted from samples using RNeasy kits (Qiagen, Hilden, Germany), then quantified with a Nanodrop (Thermo Scientific, Wilmington, DE, USA) and Qubit Broad Range RNA assays (Life Technologies, Carlsbad, CA, USA). cDNA libraries were prepared from mRNA using the TruSeq Sample Prep Kit v2 Set A (Illumina Inc., San Diego, CA, USA) using 1 mg total RNA, and prepared libraries were quantified using the Qubit High Sensitivity DNA assay. Libraries were quantified a second time on an Agilent Bioanalyzer (Santa Clara, CA, USA). Libraries were run on an Illumina HiSeq 2500, located at the Tufts University Core Facility for Genomics (Boston, MA, USA) to generate 100 bp single-end reads. Single-end reads were assessed for quality using the FastQC program (http://www.bioinformatics.babraham.ac.uk/projects/fastqc). Sequences were then trimmed using Trimmomatic version 0.35 to remove adapter sequences, bases with low sequence quality, and any reads that were shorter than 36 base pairs. FastQC reports were generated for each file again to confirm post-trimming quality. Mitochondrial DNA and ribosomal RNA sequences were removed using Bowtie248 by aligning against known mtDNA sequences and identical reads were collapsed prior to assembly (but counts retained) using the FastX Toolkit version 0.013 (http://hannonlab.cshl.edu/fastx_toolkit). The transcriptome was assembled de novo using Trinity46 and a k-mer length of 25. The longest transcript for each component were retained using custom scripts.
Reverse-transcription quantitative PCR (RT-qPCR)
Six genes in the Resp region between kon and not (Bap18, LIM, Bgi-A, Bgi-B, ago, bab), plus Orco and OnubOR6 were analyzed for their expression ratio in different tissues of E-strain and Z-strain individuals of European laboratory populations. Stages and tissues include: 5th instar larvae (antennae, head without antennae, thorax, abdomen), prepupal instar (head, thorax, abdomen), 2- and 4-day-old male and female pupae (antennae), 2-day-old male and female adults (antennae, brain, 1st pair of legs, 2nd plus 3rd pair of legs, abdomen). Expression ratios of bab were additionally evaluated for 7-day-old male and female pupal antennae as well as for 7-day-old male and female antennae and brains. Due to the large number of samples needing to be tested for expression simultaneously, a first qPCR was run comparing all tissues within each strain (Fig. 1a). At a next step only most expressed and most related tissues to the scientific question (i.e., antennae and brain) were included and comparisons were made simultaneously for the two strains (Supplementary Fig. 3). Three biological replicates of each of 27 sample types were collected during the second hour of scotophase from each strain. Total RNAs were extracted from each tissue using a Trizol/Chloroform approach followed by RNeasy Micro Kit purification (QIAGEN). Single-stranded cDNA synthesis was performed from 1 μg total RNA with iScript Reverse Transcription Supermix for RT-qPCR from BioRad (Hercules, CA, USA). Three control genes, (GAPDH, 18S rRNA, rpL8) were tested for stability between samples, and rpL849 was chosen for final comparisons. Gene-specific primers designed using “Primer 3”50 amplified 100–200 bp fragments (Supplementary Table 2). qPCR reactions were performed using Sso Advanced Universal SYBR Green Supermix (BioRad) in a total volume of 12 μl with 3 μl cDNA (or water as negative control or RNA for controlling the absence of genomic DNA) and 0.25 mM of each primer. cDNA amplifications were performed in a BioRad CFX96 Real-Time System using a gradient of annealing temperatures for each gene of interest. Three gradient temperatures were tested per gene on a 4-fold dilution series, 1/4–1/128 of a sample representative cDNA pool [E = 10 (−1/slope)] for relative quantification of the same gene in all other cDNA samples. Two replicates of each dilution were tested. A melting curve ramp (65–95 °C: Increment 0.5 °C/5 s) was generated to confirm that reactions did not produce nonspecific amplification. The final protocol included a denaturation step at 95 °C for 3 min followed by 40 cycles of amplification and quantification (denaturation at 95 °C for 10 s and annealing for 30 s at temperatures given in Supplementary Table 2 for each primer pair). Reactions were performed in two technical replicates. After confirming similar amplification efficiencies of target and control gene, expression levels were calculated relative to rpL8 expression and expressed as the ratio = E(−Cq Resp candidate)/E(−Cq rpL8)51. Statistical comparisons between strains, sexes, and tissues for each gene were assessed using one-way analysis of variance (ANOVA), followed by honest-significant difference (HSD) tests (post hoc Tukey’s test). A Benjamini–Hochberg multiple-test correction was applied over the genes tested.
Targeted mutagenesis of bab exon 1.5
Nine RNA guides were designed against intron 1A, exon 1.5, and intron 1B of bab (Supplementary Table 3) using the CRISPOR gRNA design tool cripsor.tefor.net and the O. nubilalis bab genomic DNA sequence as target. Guide sequences were subcloned in DR274 (http://www.addgene.org/42250) derived vector. Plasmids were digested by DraI, purified, and transcribed using HiScribe T7 high yield RNA synthesis kit (New England Biolabs). Reactions were purified using EZNA microelute RNA clean-up kit (OMEGA Biotek). Streptococcus pyogenes Cas9 protein, bearing three nuclear localization sequences, was provided by TacGene (Paris-France)52. Nine different guide RNAs were designed; three targeting exon 1.5, three in the preceding intron, and three in the following intron. Aliquots of sgRNA were denatured at 80 °C for 2 min and then left on ice for 2 min before mixing them with the equivalent amount of Cas9 for a sgRNA:Cas9 complex ratio of 1.5:1. Concentrations of the sgRNA are given in Supplementary Table 3 and the Cas9 was 30 µM (Sp-Cas9-NLS-GFP-NLS). The complex was formed at room temperature (RT) for 10 min. sgRNA:Cas9 complexes were formed separately for each sgRNA to ensure that Cas9 would bind equally to each sgRNA. These were combined as desired and placed on ice. Eggs of either strain from the European populations were injected (using an Eppendorf FemtoJet 4i injector) within 0.5 h after oviposition to target the one cell embryo stage. We injected three combinations of sgRNA (Supplementary Table 3) in order to create a deletion 5′ of exon 1.5 (KO1), a deletion 3′ of exon 1.5 (KO2), or a complete deletion of exon 1.5 (DEL). Injected eggs were reared to adulthood and genotyped. DNA of adult legs was extracted51 and amplified with Terra™ PCR Direct Polymerase Mix (Takara Bio Europe) using primer Bab-Z/E-i01-F9 (GTGCATTTCCTGCTTATGA) on intron 1, Bab-E-i01-R10 (AATTTGCCCCTAAGTGTACC) on intron 1.5, and the following program: 98 °C for 2 min, 35×(10 s at 98 °C, 15 s at 60 °C, 30 s at 68 °C). Size polymorphism were detected with agarose gel analysis and confirmed by Sanger sequencing (Macrogen, Amsterdam). Sequences were aligned using SEQUENCHER™ 4.7 (Gene Codes Corporation, Inc.). Heterozygote G0 adults with mutations were crossed to adults from the wild type rearing. G1 heterozygote males and females carrying the same mutation were crossed to obtain homozygote G2 mutants. Four G2 CRISPR lines were established: lines L46 (KO1), L72α (KO2), L72β (KO2), and L73 (KO2). Males of all CRISPR lines were phenotyped using EAG (Supplementary Methods) and wind tunnel assays.
Whole mount in situ hybridization
Male O. nubilalis whole antennae were mounted and in situ hybridized with two RNA probes simultaneously. bab digoxigenin-labeled antisense riboprobe, was generated using a Sp6/T7 RNA transcription system (Roche) and linearized recombinant pCRII-TOPO plasmids (TOPO TA cloning kit Invitrogen) following manufacturer’s protocols. Orco, OR4, OR6, and OR7 probes are the same preparations that were used in ref. 21. Two color double in situ hybridization with two different antisense RNA probes (digoxigenin-labeled or biotin-labeled probes), as well as visualization of hybridization were performed as reported previously21,53 and described below. Antennae of 1–2-day-old Z-strain and E-strain male moths from the European laboratory populations were dissected by first cutting off the tips. The remaining antennal stem was further cut into smaller pieces of 5–15 antennal segments. The same procedure was done for 4-day-old pupal antennae that were extracted underneath the pupal cuticle, which was broken and lifted at antennal base so that the antenna could be pulled out with forceps.
DIG-labeled probes were detected by an anti-DIG AP-conjugated antibody in combination with HNPP/Fast Red (Fluorescent detection Set; Roche); for biotin-labeled probes the TSA kit (Perkin Elmer, Boston, MA, USA), including an antibiotin–streptavidin–horseradish peroxidase conjugate and FITC tyramides as substrate was used. All incubations and washes were made in a volume of 0.3 mL (unless otherwise stated) in 0.5 mL tubes with slow rotation on a small table rotor at RT or in a hybridization oven (Bambino, Dutcher) when heating was needed. Antennal fragments were fixed in 4% paraformaldehyde in 0.1 M NaCO3, pH 9.5 for 24 h at 4 °C (PF1) followed by washes at RT for 1 min in phosphate-buffered saline (PBS: 0.85% NaCl, 1.4 mM KH2PO4, 8 mM Na2HPO4, pH 7.1), 10 min in 0.2 M HCl and 2 min in PBS with 1% Triton X-100. Antennal fragments were then incubated for 3 h in whole mount hybridization solution (50% formamide, 1% Tween 20, 0.1% CHAPS, 50 µg/mL yeast tRNA, 5× SSC, 1× Denhart’s reagent and 5 mM EDTA, pH 8.0) at 55 °C. Hybridization, using one DIG-labeled and one biotin-labeled probe, took place at 55 °C. Prior to hybridization, probes were diluted to adequate ratios (final volume 200 µL) in hybridization buffer (50% formamide, 10% dextran sulfate, 2× SSC, 0.2 µg/µL yeast tRNA, 0.2 µg/µL herring sperm DNA) and heated for 10 min at 65 °C. After heating, the probes were kept on ice for at least 5 min before use. Post-hybridization antennal fragments were washed four times for 15 min in 200 µL of 0.1× SSC (1× SSC = 0.15 M NaCl, 0.015 M Na-citrate, pH 7.0) at 60 °C then treated for 16 h in 5 mL of blocking solution (10 g blocking reagent from Roche in up to 100 mL maleic acid solution: 0.1 mol/L maleic acid and 0.15 mol/L NaCl) in 45 mL TBS and 150 µL Triton X-100 at 4 °C. The next step was to incubate fragments for 48 h with an anti-dioxigenin alkaline phosphatase-conjugated antibody (Roche) diluted 1:500 and with a streptavidine horse radish peroxidase-conjugate diluted 5:500 in blocking solution in TBS prepared as previously. After washing five times for 10 min in TBS, 0.05% Tween, antennal fragments were rinsed in DAP-buffer (100 mM Tris, pH 9.5, 100 mM NaCl, 50 mM MgCl2), after which hybridization signals were visualized using HNPP (Roche; 1:100 in DAP-buffer, pH 8.0) incubations for 15 h at 4 °C. After washing five times for 10 min in TBS, 0.05% Tween, antennal fragments were incubated for 18 h with the TSA kit substrates (Perkin Elmer, MA, USA): 2% Tyramide in amplification diluent. After a last set of washes, five times for 10 min in TBS, 0.05% Tween, antennal fragments were mounted in 1/3 PBS/glycerol and specific antennal cell stainings were observed with a Zeiss (Oberkochen, Germany) LSM 700 confocal laser scanning microscope (MIMA2 Platform, INRA, France, https://doi.org/10.15454/1.5572348210007727E12). Images were arranged in Powerpoint (Microsoft) and Adobe Illustrator (Adobesystems, San Jose, CA, USA) and were not altered except adjusting brightness or contrast for uniform tone within a figure.
Phenotyping pheromone preference in nature
Pheromone trapping in North America was used to collect wild E-pheromone and Z-pheromone preferring males using Scentry Heliothis traps baited with synthetic E (“New York”) and Z (“Iowa”) lures (Scentry Biologicals, Billings, MO, USA). Traps were placed directly next to sweet corn fields and males were collected from each trap every 1–2 weeks and stored at −20 °C. Lures were replaced every 2 weeks. Trapping of >20 males from each E and Z trap was done at three sympatric sites between 2010 and 2012 (Supplementary Table 4). Tissues were moved from −20 °C within 3 months of collection to at −80 °C for long-term storage. DNA was isolated from both Pennsylvania sites by grinding frozen tissues and using the Qiagen DNeasy tissue protocol (Qiagen, Germantown, MD, USA) without vortexing preserve high molecular weight DNA. DNA isolation of samples from Bellona, NY was conducted with Qiagen genomic tips (20 G). All samples were treated with Qiagen RNase. DNA concentrations were quantified using Qubit prior to sequencing.
Individual genome resequencing of field moths
Individual resequencing data were collected for 31 E-trapped and 31 Z-trapped individuals from two sites (Rockspring, PA, USA (n = 15 per trap), and Landisville, PA, USA (n = 16 per trap); Supplementary Table 5). Landisville, PA, Z-trap data were originally described by Kozak et al. 54; all other data are new. Libraries were prepared using Illumina TruSeq (Illumina Inc.) and sequenced on an Illumina NextSeq using 150 bp paired-end sequencing at Cornell University. Trimmed genomic data were analyzed using the GATK best practices pipeline55,56,57 with data aligned to the repeat-masked genome reference (GenBank BioProject: PRJNA534504; Accession SWFO0000000054) using bwa58, sorted and filtered using Picard and samtools to remove duplicates and reads with a mapping quality score below 20. SNPs and small indels were called using GATK Haplotype caller (joint genotyping mode) after realigning around indels and filtered using recommended GATK filters57. Large structural variants (SV) were called from aligned bam files using information from split paired end reads using split reads and anomalies in pair orientation and insert size in Delly259 (https://github.com/dellytools/delly); these structural variants included indels (>300 bp), translocations, and inversions. Delly2 was run on all individual files, these were merged to a consensus SV file and genotypes were reassessed.
BayPASS 2.160 was used to identify SNPs associated with pheromone trap while controlling for population demography in the individual resequencing data using allele frequencies for our four populations to test the association with pheromone trap (Z = 1, E = −1) using the STD model. As described in Kozak et al. 54, significantly associated polymorphisms had XtX above the 0.001% quantile of pseudo-observed data of simulated “neutral” loci, BF > 20 dB61, and eBPis > 2 (equivalent to P value < 0.01 for β = 0)60,62,63. We expected the demographic history of the sex chromosome to be different from that of the autosomes, so we ran analyses with only Z chromosome loci. FST was calculated in vcftools v0.1.1664. Plots were created in R using packages qqman65 and ggplot266.
Pooled genome resequencing of field moths
Pooled samples were created by including equal amounts of DNA from all males caught within the same pheromone trap that were also homozygous pgFAR genotypes, as determined by a diagnostic Taq1α restriction digest67. At each of three sites (Rockspring, PA, Landisville, PA, Bellona, NY), separate E-preferring pgFAR-e/pgFAR-e (E-strain) and Z-preferring pgFAR-z/pgFAR-z (Z-strain) pooled libraries were prepared as above and uniquely indexed E and Z pools from each site were sequenced on a single Illumina sequencing lane (n = 25–41 males per trap per site; 203 males total). Bellona, NY, and Landisville, PA, pools were sequenced on an Illumina HiSeq3000 at the Iowa State University DNA Facility (Ames, IA) using 150 bp paired-end sequencing while Rockspring, PA, pools were sequenced using 100 bp single end sequencing. Rockspring, PA, and Landisville, PA, data were originally described in ref. 68, whereas Bellona, NY, data are new. Genomic reads were trimmed using Trimmomatic v.35 to remove Illumina adapters (TruSeq2 single-end or TruSeq3 paired-end), reads with quality <15 over a sliding window of 4 and reads <36 bp long. Reads were aligned using Bowtie248 and genomic positions were determined as described above. Aligned reads were sorted and filtered using Picard and samtools to remove duplicates and reads with a mapping quality score below 20. Samtools was used to identify single nucleotide polymorphisms (SNPs) in populations58.
Scripts from the Popoolation2 package were used to filter SNPs (removing SNPs near small indels, and those with rare minor alleles that did not appear twice in each population), calculate allele frequency and FST between strains69,70. We ran the CMH test to identify consistent differences in allele frequency among independent sympatric E and Z pools from a given site36. CMH comparisons were done on individual SNPs that passed a Woolf heterogeneity test (read coverage minimum 10, maximum 200)71. P values were corrected using a genome-wide using false discovery rate (FDR) in the fdrtool package in R72. We also calculated the mean FST for each of the 3 pairs of E and Z strains by population over 1 kb windows.
Genome scan of positive assortative mating
Using individual genome resequencing data, we conducted a scan for two signatures of assortative mating: (a) elevated linkage disequilibrium with the autosomal signal locus pgFAR and (b) a deficit of heterozygote genotypes. LD was calculated as the squared correlation coefficient (r2) between 33 missense mutations in pgFAR28 and the Z chromosome using vcftools after genotype phase was imputed with Beagle 5.073 and data were filtered for SNPs with minor allele frequency >0.05. All but two nonsynonymous pgFAR SNPs are fixed between strains7 and should have identical correlations with Z loci; however, estimates of LD varied due to changes in genome sequencing coverage (e.g., 2 of 33 nonsynonymous mutations at pgFAR lacked coverage). To estimate how LD typically varies across unlinked regions, r2 values were calculated from ≥50 kb chromosome-assigned scaffolds for variants separated by at least 1 Mb or located on different chromosomes (n = 1350 SNPs). To estimate how LD varies by physical distance on the Z chromosome, r2 values were calculated from variants within 15 kb regions haphazardly sampled across all Z scaffolds ≥ 50 kb (n = 57). A deficit of heterozygous genotypes was tested using an exact test as implemented in vcftools “hwe” function with a genome-wide FDR correction.
Reporting summary
Further information on research design is available in the Nature Research Reporting Summary linked to this article.
Source: Ecology - nature.com