Cohort and data description
In order to examine seasonal changes of human molecular data, we leveraged the power of longitudinal multiomics data from profiling of 105 individuals (55 women and 50 men) with ages ranging from 25 to 75 years old (Fig. 1a; Supplementary Table 1). This cohort was generally healthy and well characterized for glucose dysregulation using annual oral glucose tolerance tests (OGTTs), insulin resistance measuring steady-state plasma glucose (SSPG), fasting glucose and hemoglobin A1c (HbA1c; an indicator of the average level of blood glucose over the past 100 days)19 as well as quarterly sample collections with measurements of transcriptomes (from peripheral blood mononuclear cells), proteome and metabolome from plasma, targeted cytokine and growth factor assays using serum. Nasal and gut microbiomes were analyzed using 16S rRNA sequencing providing information at the genus level and host exome sequencing was performed once from PBMCs (Fig. 1b). Moreover, 51 clinical laboratory tests were acquired on each visit and they were aligned to the meteorological data (e.g. air temperature), pollen counts (e.g. mold spores, grass pollens, tree pollens, weed pollens) and airborne fungi from the San Francisco bay area. In total, there were 902 visits (average across different types of omes‘) from which samples were drawn over up to 4 years (see “Methods”). The sample collections were generally evenly distributed throughout the year (Fig. 1b). Nearly all individuals lived in the San Francisco Bay Area with the exception of three individuals who lived in Southern California and frequented the Bay area (Supplementary Data 1). Participants in our study were well characterized for steady-state plasma glucose (SSPG) using the modified insulin suppression test20, in which 31 participants were insulin sensitive (SSPG < 150 mg/dL), and 35 were insulin resistant (SSPG ≥ 150 mg/dL) (Supplementary Table 1).
a. Integrative personal omics profiling (iPOP) cohort sampling and data collection for seasonal analyses. Omic assays included immune molecules profiling using Luminex assay, proteomics using sequential windowed acquisition of all theoretical fragment ion mass spectrometry (SWATH-MS), metabolomics using liquid chromatography (LC)—mass spectrometry, transcriptomics and microbial profiling (gut and nasal) using next-generation sequencing, in conjunction with clinical lab tests and meteorological measurements. b Subjects and sampling timepoints for each individual, as well as ethnicity (A: Asian, B; Black, C; Caucasian), insulin sensitive (IS) and insulin resistance (IR), and gender information (M: Male, F: Female). c Examples of omics analytes with seasonal patterns (transcripts, cytokines, metabolites, proteins, clinical lab tests, gut and nasal microbiome). The X-axis shows the days of the year (1–365 days) and Y-axis shows the normalized expression/abundance values. The samples are collected up to 4 years and aggregated and mapped to 1-year-long time frame. The shaded area represents 95% confidence bounds computed as ±1.96 standard deviation of model coefficients. Standard deviations were derived from a maximum likelihood fit.
Seasonal changes of diverse biological molecules
We first systematically searched for molecules that fluctuate throughout the year. We applied a generalized additive mixed model (GAMM) (see “Methods”) in order to detect molecules with seasonality effects (GAMM model likelihood ratio test p-value ≤ 0.05). We identified seasonal patterns for 898 transcripts, 119 metabolites, 116 proteins, 22 clinical lab tests, seven cytokines, seven gut microbial taxa, and 23 nasal microbial taxa (GAMM model likelihood ratio test p-value ≤ 0.05; Fig. 2d, Supplementary Table 1, Supplementary Data 2). Our computational analysis detected expected molecules known to exhibit seasonal patterns such as HbA1c19 (p-value = 1.27E − 08), which peaks in spring and summer, and is low in winter19. We also found that RBC21 (red blood cells) (p-value = 0.029) and RDW (Red Blood Cell Distribution Width)22 (p-value = 0.002) follow a similar seasonal pattern. HDL (High-density lipoprotein)23 (p-value = 0.025) peaks in summer, and the LDL/HDL ratio (p-value = 0.003) peaks in winter (Fig. 1c). Also, PER1 (period circadian regulator 1), the primary circadian pacemaker in the mammalian brain24, shows seasonal effects (p-value=0.003) with highest expression level in spring. PER1 belongs to a family of genes responsible for the circadian rhythms of locomotor activity, metabolism, and tightly involved in photo- and thermo-periodic measurements25,26.
Our analysis also revealed a number of novel molecules with seasonality variations. C2, C9, IL5, SIGLEC15, and IL1RAP are examples of molecules with roles in immunity, inflammation and allergy that demonstrate seasonal effects (Fig. 1c; Supplementary Data 2). Although the majority of omics molecules peaked once during the year, we found multiple molecules that peaked twice or thrice such as CTTNBP2, COQ10A, and gut microbial genus Holdemania (Supplementary Fig. 1). Thus, a large number of molecules undergo seasonal changes with a variety of different patterns.
Two predominant seasonal patterns in California
Presently we think of seasons as four equally partitioned periods arbitrarily set by the calendar. To determine if general seasonal patterns of molecules could be observed and how many classes might exist in our California cohort, we performed fuzzy C-means clustering with Silhouette criterion on both normalized multiomics data as well as individual omes’ to determine the number of clusters (see “Methods”; Fig. 2a; Supplementary Fig. 2). Rather than four seasonal patterns, two distinct clusters of omics molecules both at the level of all omes’ combined as well as each single ome were observed (Fig. 2a–c; Supplementary Fig. 3; Supplementary Data 3). Interestingly, omics seasonal pattern one peaks in late April, whereas omics seasonal pattern two peaks in December and drops in March through July. Pattern one corresponds to late spring, a period of high pollen count and end of the California rainy season, and pattern two peaks in late fall and early winter, a period of high viral infection incidents. Individual omes‘ exhibited similar but slightly shifted patterns, depending upon the ome (Fig. 2c).
a Silhouette width plot for identifying the optimal number of clusters for all the host omics analytes combined as well as individual omes‘ (Elbow plots in Supplementary Fig. 2). The X-axis shows the number of clusters (k) and the Y-axis shows average silhouette width. The silhouette plot shows the silhouette coefficient over values of k clusters ranging from 1 to 10. This plot shows the highest average silhouette coefficient occurring at k = 2. b Fuzzy C-means clustering of seasonal patterns for all omics analytes combined. The X-axis shows days of the year (1-365 days) and the Y-axis shows normalized GAMM coefficients (omics pattern one is shown in green color and omics patterns two is shown in orange color). c Summarized patterns for all the omics analytes combined as well as individual omes‘ (Supplementary Fig. 3) (omics pattern one is shown in green color and omics pattern two is shown in orange color). d Summary of proportion of omics analysts with seasonality effects (GAMM model likelihood ratio test p-value ≤ 0.05) per ome. e Integrative canonical pathway analysis and f Integrative disease analysis of omics analytes and their enrichment with seasonal patterns 1 and 2. The heatmap plot shows the −log10 (adj-p values). g Examples of disease enrichment analysis of seasonal patterns. Hypertension, acne and esterification of cholesterol pathways are shown.
Integrative pathway analysis of seasonal signatures
To obtain a better understanding of the biological processes and human diseases associated with each seasonal pattern, we performed pathway and disease enrichment analyses using Ingenuity pathway analysis (see “Methods”) using all the 1133 omics molecules that showed significant seasonal effects (p ≤ 0.05) (Supplementary Data 2).
Pathways and their associated diseases that change during the two omics seasonal patterns are shown in Fig. 2 (Fig. 2e, f; Supplementary Data 4; Supplementary Table 2; adjusted p ≤ 0.05). Interestingly, we found disorders related to blood pressure, hypertension, and cardiovascular disease to be associated with seasonal omics pattern one where the omics molecules have the highest expression level in spring/summer. Other notable pathways and diseases associated with pattern one are schizophrenia spectrum disorder, sleep pattern, and seizure (Fig. 2e). We also discovered that transcripts from 12 collagen genes show a strong match with pattern one (Supplementary Fig. 4). Collagen plays a structural role by contributing to the molecular architecture, shape, and mechanical properties of tissues, such as the tensile strength in skin and the resistance to traction in ligaments27.
Pattern two is associated with acute phase response, clathrin-mediated endocytosis, esterification of cholesterol, volume of urine and acne, as well as other pathways (Fig. 3a, b). The acute phase response (Supplementary Fig. 5) is upregulated over fall and winter and is a rapid inflammatory response that provides protection against microorganisms (bacteria, viruses, etc.). It involves an increase in pro-inflammatory cytokines (IP10, IL1, IL1R1, IL1RAP, IL6) and a change in concentration of plasma acute phase proteins and complement system (C2, C3, C9). Clathrin-mediated endocytosis pathway (Fig. 2e, f) is also associated with pathogen-influenced signaling and is a major gateway for the internalization of nutrients, hormones and other signaling molecules from the plasma membrane into intracellular compartments. The complement system (Fig. 2e, f; Supplementary Fig. 6) is a cascade of enzyme activations that bridges innate and acquired immune systems and is involved in clearance of immune complexes, activation of inflammation and augmenting the antibody response28,29. Complement system defects have been found in autoimmune disorders such as systemic lupus erythematosus that can affect the joints, skin, kidneys, blood and lungs. These results indicate that biological processes and their associated diseases correlate with the two major seasonal patterns.
Correlation of (a) clinical lab tests and (b) meteorological measurements with the two omics seasonal patterns are shown. The heatmap shows cluster membership values (which are on a scale of 0 and 1) based on existing cluster centroids. Seasonal patterns of (c) meteorological measurements and (d) clinical lab tests are shown (omics pattern one is shown in green color and omics pattern two is shown in orange color). The X-axis shows the days of the year (1–365 days) and Y-axis show the normalized measurement values. The shaded area represents 95% confidence bounds computed as ±1.96 standard deviation of model coefficients. Standard deviations were derived from a maximum likelihood fit.
Correlation of seasonal patterns with clinical lab tests and meteorological measurements
We next conducted the seasonality analysis for 51 clinical laboratory measures (Supplementary Data 6) and identified health markers with significant seasonality components (Figs. 1c; 3a, d; Supplementary Data 2; p ≤ 0.05). We further correlated these markers with the two omics seasonal patterns described above (Fig. 3a). As mentioned above, we observed that A1c (Hemoglobin A1c), as well as insulin, RDW (red blood cell distribution width), RBC (red blood cell counts), MONO (monocytes), UALB (urine albumin), and EGFR (estimated glomerular filtration rate) correlated with omics seasonal pattern one, whereas LDLHDL (LDL/HDL ratio), CHOLHDL (cholesterol/HDL ratio), MCH (mean corpuscular hemoglobin), ALB (albumin), PLT (platelet count), ALKP (alkaline phosphatase), CR (creatinine), CO2 (carbon dioxide), GLOB (Globulin) and NEUTAB (neutrophil absolute count), AG (albumin/globulin ratio), TBIL (Total Bilirubin) and TP (total protein) correlated with omics seasonal pattern two (Fig. 3a, d). These findings highlight the extensive seasonality variation in clinical health biomarkers and the importance of considering seasonality components in interpreting health biomarkers in the clinic.
Since most of the individuals that participated in this study are residents of the San Francisco Bay Area, we further correlated omics seasonal patterns with meteorological measurements collected from this area (see “Methods”, Fig. 3b). Average air temperature and average solar radiation correlated with omics seasonal pattern one, whereas average air pressure, air humidity, and precipitation correlated better with pattern two (Fig. 3b).
Seasonal gut and nasal microbial shifts
In order to detect microbial shifts over the two omics related patterns, we calculated GAMM seasonal coefficients for gut and nasal microbiome from the same individuals. We first measured the diversity of the gut and nasal microbiome by estimating Chao diversity index30. We discovered that the overall microbial diversity increased in winter compared to summer (p ≤ 0.05) in both nasal and gut microbiomes at all tested taxonomic levels (genus, family, order, class, phylum) (Fig. 4c). The correlation of gut and nasal microbial taxa with patterns one and two are shown in Fig. 4a, b. Overall, we observed more seasonal changes in nasal microbial taxa (23 microbial taxa) compared to the gut (four microbial taxa) (Supplementary Fig. 7a, b; Supplementary Data 2). Gut microbial taxa Holdemania genus, Ruminococcaceae genus, Oscillibacter genus as well as Firmicutes phylum show significant (p ≤ 0.05) seasonal components. These microbial taxa correlated with the omics seasonal pattern one (Fig. 4b). Of notable nasal bacterial taxa, we observed Staphylococcus genus, Porphyromonas genus, Dolosigranulum genus, Corynebacterium genus, Bacteroidia class, Bacilli class, Actinobacteria class, and Pasteurellaceae family show seasonal effects and correlate with either omics pattern one or two, depending upon the taxa (Fig. 4). Thus, the human nasal and gut microbiome undergoes extensive seasonal changes.
Correlation of (a) nasal and (b) gut microbial taxa with the two omics seasonal patterns. The heatmap shows cluster membership values (which are on a scale of 0 and 1) based on existing cluster centroids (omics pattern one is shown in green color and omics pattern two is shown in orange color). c Gut and nasal microbiome diversity scores (by Chao diversity) over the year. The Y-axis shows the days of the year and X-axis shows normalized Chao diversity scores. The shaded area represents 95% confidence bounds computed as ±1.96 standard deviation of model coefficients. Standard deviations were derived from a maximum likelihood fit. d Gut microbial taxa and (e) nasal microbial taxa with seasonal patterns.
Correlation of seasonal patterns with pollen counts and airborne fungi
To evaluate the relationship between regional pollen exposure and omics seasonal patterns, we calculated GAMM seasonal coefficients for pollen counts (tree pollens, weed pollens, mold spores, grass pollens), as well as 23 airborne fungi collected from the San Francisco Bay area (Los Altos Hills, California) (Fig. 5a, b, Supplementary Fig. 9; Supplementary Table 4; Supplementary Data 10). We then correlated pollen count patterns with the two major omics seasonal patterns (Fig. 5a, b). We observed that total counts for grass pollens, mold spores, tree pollens, and weed pollens correlate with omics seasonal pattern one, where the peak allergy season starts around early spring (Fig. 5a). Total tree pollen counts peaks in early springtime, followed by total grass pollen counts and total mold spores counts that peak in late spring. These peaks also correlate with the allergy season in the San Francisco bay area that extends from March through June. There is also a surge in weed pollen that peaks in around mid-summer (Fig. 5a). Figure 5b shows the correlation of specific airborne fungi with omics seasonal pattern one and two. The individual seasonal patterns of airborne fungi are shown in Fig. 5b; Supplementary Fig. 9. Airborne fungi show seasonal changes and correlate with both omics seasonal pattern one and two (Fig. 5b). Fungal spores that peak in early spring/early summer and correlate with omics seasonal pattern one are Rusts, Smuts/Myxomycetes, Algae, Oidium/Erysiphe, Periconia and Ganoderma. Fungal spores that peak in late fall/winter are Penicillium/Aspergillus, Ascospores, Basidiospores and Pithomyces. Thus, pollen and spore counts associate with the major patterns and may be contributors to driving these patterns.
Correlation of (a) pollen counts (tree pollens, weed pollens, mold spores, grass pollens), and (b) airborne fungi with the two omics seasonal patterns. The heatmap shows cluster membership values (which are on a scale of 0 and 1) based on existing cluster centroids (omics pattern one is shown in green color and omics patterns two are shown in orange color). Seasonal patterns of tree pollens, weed pollens, mold spores and grass pollens are shown in (c) and seasonal patterns of specific airborne fungi are shown in (d). The Y-axis shows the days of the year and the X-axis shows normalized pollen counts or normalized airborne fungi counts. The shaded area represents 95% confidence bounds computed as ±1.96 standard deviation of model coefficients. Standard deviations were derived from a maximum likelihood fit.
Seasonal effects in insulin resistant and insulin sensitive individuals
Whether seasonal patterns are distinct between people with different diseases has not been investigated. Since the participants in our study have been well characterized as either insulin resistant (IR) or insulin sensitive (IS), we examined the seasonal differences of biomolecules and microbes in these two groups. Our aim is to identify time intervals of omics features that show differences between IR and IS groups either in part or all over the entire year. For this purpose, we used the multiomics profiles from 66 subjects who were classified as either IR (35 subjects) or IS (31 subjects) based on steady-state plasma glucose (SSPG) measurements (Supplementary Table 3). We utilized our recently developed longitudinal analysis method, OmicsLonDA31 (see “Methods”), to identify the time intervals where differences are observed (Supplementary Data 11 and 12). Of 11,184 analytes examined (all omics analytes combined), there were 187 omics analytes (138 genes, 13 proteins, 19 metabolites, 6 clinical markers, and 11 microbial taxa) that showed significant (p ≤ 0.05) seasonal differences between IR and IS (Fig. 6a). Of 187 significant analytes, we identified 71 that showed statistical significance in part of the year (Fig. 6c), whereas the remaining 116 features exhibit a global difference between IR and IS across the entire year. More specifically, at the microbiome level, Veillonella has a higher abundance in IR than IS throughout the year except mid-March until late June (Fig. 6b). The family Rikenellaceae from phylum Bacteroides is enriched in IS between mid-April until the end of October (Fig. 5b). The Lachnospiraceae family and Flavonifractor genus are examples of microbes that show a significant increase in IR than IS over the entire year (Fig. 6b). On the transcriptome level, 138 genes show differential seasonal effects (Supplementary Data 12). Among those genes are APCDD1, PL2, GPS2, and EXOSC4. APCDD1 which showed higher expression in IR than IS during December until March (Fig. 6b). At the proteome level, APOF, C7, KRT17, and PI16 show significantly higher expression in IS than IR in part of the year, whereas IGLL5 showed a significant increase in IR than IS (Supplementary Data 12). At the metabolome level, our results indicated that 18 metabolites show significant changes between IR and IS across the entire year. It is of note that only Hippuric Acid (HMBD00714) is in higher abundance in IS than IR all year, except during April and October (Fig. 6b). Among metabolites that have higher expression in IR are L-Octanoylcarnitine, Adrenic acid, Dihomo-gamma-linolenic acid, gamma-Linolenic acid, Eicosadienoic acid, 2-Octanoylcarnitine, 3, 5-Tetradecadiencarnitine, and Linoleic acid. Butyric acid, l-Malic acid, Cholic acid, N6-Trimethyl-l-lysine, Phenylacetylglutamine, Cinnamoylglycine, p-Cresol glucuronide, and 20-Dihydroxy Eicosanoic acid. Furthermore, three clinical markers, Neutrophils absolute count, Platelet count, and Triglycerides (TG) level, were found to be higher in IR than IS across the year. Overall, these results reveal that there are extensive differences between IR and IS participants.
a 187 analytes show seasonality difference between IR and IS, of which 116 features exhibit global differences between IR and IS throughout the entire year and 71 features show the statistical significance during part of the year (OmicsLonDA permutation test p ≤ 0.05). b Examples of omics analytes that show significant differential abundance/expression between IR and IS subjects across the year. For each analyte, the orange curve represents fitted curve for IR, and dark blue curve represents the fitted curve for IS subjects. Gray shaded area between the two curves highlights time intervals when the corresponding feature is significantly different between the two groups (IR and IS). c Significant time intervals of analytes that show differences between IR and IS in part of the year. Each row represents an analyte, and the shaded area represents the time of the year that each corresponding feature shows a significant difference between IR (orange) and IS (blue) based on OmicsLonDA test statistics of each time interval.
Lifestyle factors
We analyzed lifestyle factors that may have seasonal effects, such as dietary habits and physical activities32,33. Dietary habits data were collected every 3 months using the Dietary Targets Monitor survey34 (see “Methods”). The survey asks about the frequency of consumption of a variety of foods including fish, meat, cheese, fruits and vegetables, starchy foods as well as sweets, and savory snacks and comprises a total of 25 different food categories (see “Methods”, Supplementary Data 15 and 16). None of these food categories were significantly different between IR and IS groups throughout the year (One-way ANOVA with random blocks, P-value > 0.05, Supplementary Table 5, Supplementary Fig. 10). In our analysis we used subject ID as a random effect to account for different numbers of samples per subject. On the other hand, physical activity measured in total metabolic equivalent of task (MET) is significantly different between the IR and the IS groups in February, May, June, and August (P-value = 0.01787, Supplementary Fig. 11). However, a post-hoc analysis of all the omics features that were identified to be significantly different between the IR and the IS groups, are not associated with the physical activity.
Source: Ecology - nature.com