Retrieval of E. coli plasmid sequences
All E. coli sequences were downloaded from the NCBI FTP server in May 2020. To establish an initial collection of plasmids, only complete genomes with an associated plasmid were retained. All genomes were verified for belonging to the species E. coli using kmerfinder (https://cge.cbs.dtu.dk/services/KmerFinder/). Sequence type (ST) was determined via multi-locus sequence typing (MLST) based on the 7-gene Achtman scheme using pubMLST (https:/github.com/tseemann/mlst). Only genomes with exact matches were assigned for each ST and used for subsequent analysis. To ensure our sequences were sufficiently representative of E. coli pathogens expected in nature, a systematic literature search (see description below and Fig. S1) was conducted to establish an expected distribution of STs (Table S1). This information was used to update our initial collection to match the top 4 most prevalent STs (131, 11, 73, and 95). Specifically, to identify supplementary plasmid sequences, genome accession IDs were chosen from EnteroBase based on the following criteria: the strain was matched to the correct ST and had a high-quality genome sequence (based on N50 > 20,000 and the number of contigs <250). Of those that met our criteria, genomes were randomly chosen until our desired distributions were achieved. These genome sequences were downloaded from GenBank and plasmids were assembled using plasmidSPAdes [35] (n = 1044). Assembled plasmids with an identifiable incompatibility group and at least 95% percent identity to any plasmid on NCBI via BLAST were then added to our initial plasmid collection. Finally, genomes were filtered to remove those without a known high quality phylotype (n = 1016), establishing a plasmid-level dataset consisting of 2235 (1984 complete and 251 assembled) unique plasmids, and 1775 unique transferrable plasmids (excluding non-mobilizable plasmids). The complete list of all plasmids can be found in Table S2.
Systematic literature search
A comprehensive literature review was conducted to establish a representative distribution of E. coli ST’s expected in nature (Fig. S1). In particular, a search using the databases PubMed and EMBASE was performed to retrieve primary literature using the following key search terms: “Escherichia coli”, “sequence type”, “population structure”, and “antibiotic resistance”. Search terms were limited to publications in the English language from January 1st, 2008, through present (as of July 2020). The inclusion criteria for articles were as follows: sequence type frequencies were described, and total number of isolates reported. Articles were also required to include at least 10 isolates. Articles were excluded if there was no clear information on isolate source or collection. Total sample number and frequency of each sequence type mentioned were recorded (Table S1). Data was also collected on isolate source (e.g., human patients, animal, or environmental sources).
Plasmid annotation
Plasmids were annotated using Prokka [36], which provided KEGG [37] gene identification tags, along with GenBank, and manually curated to verify consensus and mismatching genes. To verify the presence of antibiotic resistance genes, plasmids were additionally annotated using the CARD database (https://card.mcmaster.ca/), AMRfinder [38] and ResFinder [39] tools. A consensus from at least two sources, or the results from ResFinder otherwise, was used to update the existing annotations. Incompatibility groups were determined using PlasmidFinder [40], and plasmid mobility was determined based on established mobility typing schemes using MOB-typer [24]. Genes from each plasmid were then combined with plasmid-level metadata using a custom MATLAB pipeline that integrated KEGG classification categories and resistance gene counts per plasmid. The resulting merged dataset contains gene-level information per unique plasmid as well as important plasmid-specific metadata (Table S3). Gene associations with mobile elements were determined using MobileElementFinder. All gene types were determined to be associated with a mobile genetic element if they were found within 30 kb of the mobile element sequence [41]. Results were compiled based on identifying mobile elements with 100% percent identity and coverage (Table S5).
Phylogenetic analysis
To generate the clinical genome phylogeny, all non-clinical genomes were removed, along with any strain that did not carry a transferrable plasmid (Fig. S3C). To generate a representative phylogeny, the same subset of genomes was further filtered to remove those with unknown collection dates or locations (Fig. 3B). In all cases, core genomes were aligned using Roary [42], and phylogenetic relationships were inferred using RAxML [43]. Trees were visualized and annotated in R using the ggtree [44,45,46] library.
Creation of over-expressed metabolic and antibiotic resistance genes
Genomic DNA was extracted from E. coli strain MG1655 using a ZR Fungal/Bacteria DNA Miniprep kit (Zymo Research, Irvine, CA) kit following manufacturer’s directions. All primers can be found in Table S6A. Genes katG, lpxM, aroH, arcA, mumM, apg, agrF, ahr, fabG, eptC and yfbR were amplified from genomic DNA. Genes pld and ftdC were synthesized as GeneBlocks (IDT DNA, Coralville, IA) and were then amplified. bla-NDM-1 was amplified from the plasmid pGDP1 NDM-1 (Addgene, Watertown, MA), which was extracted using a Zyppy Plasmid Miniprep kit following manufacturer’s directions. We used Q5 High-Fidelity DNA Polymerase (New England BioLabs, Ipswich, MA) for all PCR reactions using the following cycling protocol: 95 °C for 5 min, followed by 30 cycles of 95 °C (30 s), 61 °C (30 s) and 72 °C (3 min) followed by 5 min at 72 °C. All amplicons were gel purified using NucleoSpin Gel and PCR clean up kit (Takara Bio, Mountain View, CA). When cloning yfbR, lpxM, arcA, mumM, ahr, ftdC, pld, eptC and katG amplicons, both amplicons and plasmid pPROLAR were cut with Kpn1 and HindIII (New England Biolabs). When cloning bla-NDM-1, aroH, and fabG both the amplicon and pPROLAR were cut with EcoR1 and HindIII (New England Biolabs). When cloning agp and agrF, both the amplicon and pPROLAR were cut with Kpn1 and Xba1 (New England Biolabs). Cut amplicons were purified using a QIAquick PCR purification kit (Qiagen, Germantown MD) following manufacturer’s recommendations. Ligation occurred overnight at 16 oC using T4 DNA ligase (New England Biolabs). Competent E. coli DH5αPRO cells were prepared and transformed using a Mix & Go! E. coli Transformation Kit following the manufacturer’s recommendations. Transformants were selected overnight on LB agar containing 50 μg/mL kanamycin. Sequences were verified using Sanger sequencing. To create blaNDM-1-gene fusions (blaNDM-lpxM, blaNDM-katG, blaNDM-yfbR), we extracted pPROLAR containing blaNDM-1 from sequence verified DH5αPRO bacteria. katG, lpxM, and yfbR amplicons were produced following the protocol described above using fusion primers listed in Table S6A. Note that the forward primer of each primer pair contained the same ribosomal binding sequence as that driving bla-NDM-1 to ensure comparable translation rates. Both blaNDM–1 + pPROLAR and the resulting amplicons were cut with HindIII. To reduce recircularization of the plasmid, blaNDM–1 + pPROLAR was treated with Quick CIP (New England Biolabs). Purification of plasmid and cut amplicons were completed using a QIAquick PCR purification kit. Ligation, transformation and selective plating occurred as described above. The orientation of metabolic genes relative to blaNDM-1 was initially verified using restriction digest, and subsequently confirmed using Sanger sequencing. To create chloramphenicol resistant versions of blaNDM-lpxM, and blaNDM-yfbR, the kanamycin resistance marker was replaced with the chloramphenicol resistance marker from pPROTET using AatII and Spe1. Relevant bands were gel excised and purified as described above. To create chloramphenicol resistant blaNDM-katG, the chloramphenicol resistance marker was first cloned into pPROLAR containing blaNDM. katG was then cut from the plasmid contained in the katG strain and inserted using HindIII. CIP was used as described above to prevent recircularization of cut plasmids. Transformants were selected overnight on LB agar containing 25 μg/mL chloramphenicol. We confirmed the presence of all genes using PCR.
General bacterial growth conditions
In all cases, the strain DH5αPro was transformed with the plasmid of interest (Table S6B). Strains were streaked onto Luria-Bertani (LB) (BD Difco, catalog #DF0446-07-5) agar plates containing the appropriate antibiotic (50 μg/mL for kanamycin, 30 μg/mL for chloramphenicol, and 50 μg/mL for rifampicin). For every experiment, individual colonies were picked and inoculated into 2 mL LB broth and grown for exactly 16 h at 37 °C with 250 rpm agitation, along with a LB-only negative control. Cultures were used the following day only if the negative control was clear.
Measuring the MIC
All strains were grown as previously described. After 16 h, cells were diluted 1:1000 in LB media with 1 mM IPTG and 50 μg/mL kanamycin, and placed at 25 °C for one hour to initiate gene expression. Diluted cells were then aliquoted into wells of a 96 well plate. Antibiotics were added to the leftmost well to achieve the highest concentration of the respective antibiotic (256 and 16 for carbenicillin and ceftriaxone, respectively), and serially diluted two-fold from column 2 until column 11. Sterile water was added to column 12 instead of antibiotic as a growth control, and the total final volume in each well was kept at 100 μL. Plates were then sealed with a paper film (BioExcell, catalog # 41061023), covered with the plastic lid, and incubated in a shaking 37 °C incubator for 20 h at 250 rpm. Following incubation, cell growth was measured by optical density (OD) in an absorbance Tecan MPLEX plate reader at 600 nm. For each condition, at least two technical replicates were averaged to determine one biological replicate value. The MIC was determined by identifying the first concentration of the antibiotic that resulted in an OD600 value which did not statistically differ (p < 0.05) from the cell-free LB control (~0.05 OD600), using a student’s t test. All experiments were conducted with at least three independent biological replicates. To confirm IPTG induction, the MIC as described above using blaNDM with, and without, 1 mM IPTG in the growth media (Fig. S4A, Table S8). After establishing that IPTG significantly increased the MIC of blaNDM compared to the IPTG-free control, shifts in MIC were determined by comparing each strain’s MIC to ctrl, in the presence of IPTG (Fig. S4B, Fig. S7, Table S9).
Quantifying 20-h sensitivity for all antibiotics
Analysis of carbenicillin time course curves with carbenicillin revealed that changes in antibiotic sensitivity according to the IC50 values were equivalent depending on whether IC50 was quantified using the growth rates of the entire time courses, or the final density at 14, 16, 20, and 24 h (Fig. S6A, Fig. 4A bottom). Thus, to streamline analysis and increase experimental throughput, antibiotic susceptibility for the remaining 7 antibiotics was quantified using 20 h OD600 measurements instead of growth rates. In particular, experiments were set up analogously as described above, except plates were sealed with a paper seal and incubated in a shaking 37 °C incubator for 20 h at 250 rpm, instead of covering wells with mineral oil (Fig. S6B, Fig. 4C top row). We note that inter-day variability in susceptibility responses were occasionally observed (Table S10), likely due to environmental fluctuations. Although a minority of replicates did not exhibit shifted carbenicillin sensitivity, sufficient replicates ensured this heterogeneity did not affect the trends described.
Confirming OD shifts using colony forming units (CFU)
To verify that changes in MIC as determined by OD600 were indeed due to differences in cell number rather than morphology, cell debris, or other OD600 artifacts, we quantified CFUs under conditions that exhibited a clear shift in MIC in response to ceftriaxone. Specifically, the MIC of katG was determined to be 0.125 μg/mL, while ctrl had no observable growth at this concentration (and instead showed an MIC of 0.0625 μg/mL). To confirm the growth of katG at higher antibiotic concentrations was real, the colony forming units (CFU) of both strains at 0.125 μg/mL ceftriaxone was obtained following a standard MIC experiment. After 20 h of incubation, cells were serially diluted, and spot plated using 10 μL in 3 technical replicates on LB agar containing 50 μg/mL kanamycin. CFUs were averaged for all technical replicates and conducted for two biological replicates for each strain (Fig. S7B, Table S11).
Quantifying growth rates and IC50 values
All strains were grown as previously described. After 16 h, each clone was resuspended and diluted 1:1000 in the appropriate media. In the text, “minimal” media refers to M9 minimal media with casamino acids, 0.4% glucose (M9CA medium broth catalog with 1 mM thiamine (cat #J864-100G). 40% glucose solution (cat #G2020), referred herein as M9CAG), and “rich” refers to standard LB media. In all cases, media was supplemented with 50 μg/mL kanamycin and 1 mM IPTG. Clones were then aliquoted into wells of a 96-well microtiter plate to achieve a total volume of 200 μL per well. In the case of IC50 measurements, log serial dilutions were prepared for each antibiotic and aliquoted to achieve the desired final concentrations (Table S7). Wells were covered in 50 μL mineral oil and growth was measured by absorbance in a Tecan MPLEX plate reader at 600 nm in 15-minute intervals for up to 24 h. To analyze the time courses, growth curves were first background subtracted and log-transformed in all cases. The maximum growth rate μm was calculated using the modified logistic equation [47]:
$$N = frac{A}{{left[ {a + e^{frac{{4mu _m}}{A }left( {lambda – t} right) + 2}} right]}}$$
for all experiments measured in minimal media, referred to as “Logistic method”. For experiments in rich media, growth rates were determined by obtaining the maximum derivative within the longest region associated with linear growth. This region of the growth curve was identified using k-means clustering with 3 groups to separate lag, log, and residual growth/stationary phases, as described previously [48]. This separate method, referred to as the “slope method”, was used for growth in rich media since, upon inspection, the multi-phase growth that is characteristic of LB media reduced the quality of logistic fits in many cases. Regardless, both methods resulted in trends that were qualitatively and statistically consistent for rich media data (Fig. S8A left compared to right). Therefore, growth rates for all genes (Fig. S8D) were measured in rich media and quantified using the slope method. To calculate the IC50 values, growth rates (μm) were determined at each antibiotic concentration, and fit using the following equation:
$$mu _m = frac{{mu IC_{50}^n}}{{A^n + IC_{50}^n}}$$
as previously described [48, 49]. All subsequent experiments that utilized carbenicillin concentrations were based on IC50 values from the control strain (Fig. 4A bottom, gray line) of 1.24 μg/mL.
Pairwise competition experiments
The selected strains were grown overnight as previously described. In all cases, overnight cultures were resuspended in either LB or M9CAG media with 1 mM IPTG and appropriate antibiotic, and placed in 25 °C for one hour to initiate gene expression. Cells were then diluted 1:10,000, and 200 μL was added to wells in a microtiter plate surrounded by Milli-Q water. For competitions with carbenicillin, 0.93 μg/mL (75% IC50) was added directly to the well. The plate was then sealed with a paper film, covered with a plastic lid and incubated at 37 °C for 24 h with agitation at 250 rpm. Both at time = 0, and after 24 h, CFU was taken by serially diluting the cultures and plating 20 μL from appropriate dilutions onto agar plates in technical replicates of six. Plates were then grown for 16 h at 37 °C. Dilutions were chosen based on achieving between 10–70 colonies. Relative fitness (W) was quantified as described previously [50, 51], according to the following equation: W = log(M0/M24)/log(W0/W24), where 0 or 24 indicates the CFU at the corresponding time point, and M and W (“mutant” and “wild-type” respectively by convention) designate the two competing strains, as defined in Table S12. In all cases, the single-gene variant was designated as M. To differentiate strains from one another, we used two methods. First, we utilized identical DH5αPro hosts carrying plasmid variants containing either kanamycin resistance (kanR) or chloramphenicol resistance (cmR) genes. In these cases, 24 h competitions were conducted in the absence of any antibiotic selection, and M and W were each determined by CFU obtained from kanamycin or chloramphenicol-containing agar plates, respectively. We verified that plasmid loss was negligible over the 24-hour period in the absence of any selection for both strains (Fig. S5Ap > 0.1, 2-tailed student t test), and that direct competition resulted in no statistical difference from 1 (Fig. S5B, p > 0.1, 2-tailed student t test). For the second method, all kanR plasmids were used, and instead changed the hosts such that DH5αPro cells were in competition with DH5αPro containing a spontaneous rifampicin-resistant mutant (rifR). Any rifR strain was quantified on rifampicin-containing plates, and the second strain was quantified by rifampicin CFU minus CFU obtained on blank plates. We established that rifR exhibited no fitness defects by (1) growth rates between the wild-type (WT) strain (W) and rifR (M) (Fig. S5D), and (2) directly competing the two control strains (Fig. S5E). In both cases, results were statistically indistinguishable (p > 0.1, two-tailed student t test). KanR/cmR and WT/rifR experiments were each conducted in LB or M9CAG, respectively. In all cases, experiments were repeated with at least three independent biological replicates.
Time-kill measurements in the presence of carbenicillin
All strains were grown as previously described. Time-kill experiments entailed hourly measurements of CFU in presence of carbenicillin at either 3.75 μg/mL (3x IC50) or 5 μg/mL (4x IC50) over a span of 2 or 3 h, including time 0. Specifically, overnight cultures were first diluted 1:100 into LB media containing 1 mM IPTG and 50 μg/mL kanamycin and sub-cultured for two hours in a 37 °C incubator with shaking at 250 rpm. Following this, cell density was adjusted as necessary to achieve a starting OD600 of ~0.15 in all cases. Adjusted subcultures were then aliquoted into a 96-well plate and the appropriate carbenicillin treatments were added directly to the well. Plates were sealed with a paper film and placed in a 37 °C incubator with shaking at 250 rpm. Initial collection for time=0 was acquired before carbenicillin treatment. Thereafter, 10 μL of culture was removed from the well every hour, 10-fold serial dilutions were performed and 10 μL was plated on blank LB agar with three technical replicates at each time point. Colonies were counted after plates were grown for 16 h in a 37 °C incubator to determine CFU. This procedure utilized 14 strains of DH5αPro transformed with kanR plasmids of interest – ctrl, katG, lpxM, yfbR, aroH, pld, fdtC, agp, eptC, arcA, argF, mmuM, ahr, and fabG. CFUs were averaged for all technical replicates, and experiments were conducted with at least three independent biological replicates.
Oxygen consumption rate
Oxygen consumption rates (OCR) were obtained with the Resipher device from Lucid Scientific. The selected strains were grown overnight as previously described. Overnight cultures were resuspended in M9CAG media with 1 mM IPTG and 50 μg/mL kanamycin, and placed in 25 °C for one hour to initiate gene expression. Following this, cells were diluted 10x into M9CAG media containing kanamycin and IPTG, and 100 μL was aliquoted per well into a 96-well microtiter plate according to the manufacturer’s instructions. Plates were placed at 30 °C to minimize growth, and oxygen concentration (μM) was measured immediately thereafter. 24 wells were measured consisting of 6 technical replicates for each strain. Given the clear well-well variability (Fig. S8B, C), data shown are for one biological replicate. However, qualitative trends were consistently reproduced in multiple independent experiments.
Statistics
In all cases where t tests and ANOVA’s were used, data was first verified to be normally distributed using Kolmogorov test for normality. Otherwise, Mann-Whitney U-tests were conducted. For panels with multiple tests, Bonferroni correction was used to adjust the p values. To determine whether any metabolic category was significantly dependent on incompatibility groups, we implemented logistic regressions in MATLAB with the function fitglm. Random forest classification was used to establish the relative importance of prevalent metabolic genes and gene categories predicting the presence of antibiotic resistance genes. Chi-square tests were conducted to determine significant co-occurrence of individual antibiotic resistant and metabolism genes. Dissociative relationships were distinguished by the odds ratios from the chi-square tests. To investigate whether the strong associations and disassociations were driven by evolutionary constraints, or simply artifacts of a common ancestor, we re-ran our statistical analysis using Coinfinder [29] to take in our gene presence-absence data, along with the genome phylogeny, and compute the Bonferroni-corrected statistical likelihood of coincidence (either associations or dissociations), thereby accounting for evolutionary relatedness.
Source: Ecology - nature.com