More stories

  • The impact of natural fibers’ characteristics on mechanical properties of the cement composites

    The structure and microstructure of the fibres

The surfaces of the natural fibres are presented in Figs. 6–10, and those of the synthetic fibres in Figs. 11 and 12.

Figure 6. SEM of jute fibre [photo: M. Kurpińska].
Figure 7. SEM of bamboo fibre [photo: M. Kurpińska].
Figure 8. SEM of sisal fibre [photo: M. Kurpińska].
Figure 9. SEM of cotton fibre [photo: M. Kurpińska].
Figure 10. SEM of ramie fibre [photo: M. Kurpińska].
Figure 11. SEM of polymer fibre [photo: M. Kurpińska].
Figure 12. SEM of polypropylene (PP) fibre [photo: M. Kurpińska].

The basic components of natural fibres influencing their properties are cellulose, hemicellulose, lignin, waxes, oils, and pectin. Cellulose is composed mainly of three elements (carbon, hydrogen, and oxygen), and it is the material basis of the cell wall of the natural fibre. Typically, cellulose remains in the form of micro-fibrils within the cell wall of a plant. Cellulose is the main factor affecting the tensile strength along the natural fibre, and the cellulose content is closely related to the plant’s age, decreasing as the plant ages [6].

Hemicellulose is an amorphous substance with a low degree of polymerization, and it exists between fibres. Hemicellulose is a complex polysaccharide with xylan as the predominant chain, and the branches mainly include 4-O-methyl-D-glucuronic acid, L-arabinose, and D-xylose. Lignin is a polymer with a complex structure that occurs in many types. The basic units of lignin include guaiacyl, syringyl, and p-hydroxyphenyl monomers. The structural units in lignin are mainly connected by ether bonds and carbon–carbon single bonds. Usually, lignin is not evenly distributed in the plant fibre wall [9].

In addition to the three main components, lignin often contains various sugars, fats, protein substances, and a small amount of ash elements.
These chemical compositions affect not only the properties of natural fibres but also their suitability for specific applications. The composition of individual natural fibres and their properties are presented in Table 1.

Figure 6a–c shows longitudinal and cross-sectional views of the untreated jute fibre. Externally, the fibre is smooth and shiny. The presence of hemicellulose contributes to the high hygroscopicity of jute fibres. The structure of the jute fibre shows that the fibre swells when it absorbs water; the cross-section may swell by approx. 30%. The microscope scans indicate the succinylated regions. This is due to the chemical bonding of the succinic anhydride molecule with the hydroxyl group of the cellulose present in the fibre. The encircled region at the top shows an unsuccinylated region with naturally waxy impurities [16].

Figure 7a shows the scanning electron micrograph (SEM) of the bamboo fibre. According to the SEM analysis, the microstructure of bamboo is anisotropic. In Fig. 7b–c it can be seen that the cellulose fibrils are oriented almost along the fibre axis, which may maximize the modulus of elasticity. Factors affecting the mechanical properties of bamboo fibres include their chemical composition and structure, moisture content, and the age of the bamboo. In addition, the age of the plant affects the chemical composition and structure of the fibre. These factors, together with the natural humidity, influence the mechanical properties. The hemicellulose content directly influences the tensile strength, which increases as the hemicellulose content of the bamboo fibre decreases [18].

The cell structure of bamboo fibres is complex, and the middle layer of the cell wall has a multi-layer structure. The lignification of the thin and thick layers in the multilayer structure varies.
The multi-layered cell wall structure leads to better fracture resistance and promotes internal sliding between the cell wall layers during tension. The microfibril angle is also an important factor influencing the mechanical properties of the fibre. Typically, the tensile strength and modulus of elasticity of a fibre increase as the microfibril angle decreases. Hence, the small microfibril angle is an important factor contributing to the good mechanical properties of bamboo fibre. Large voids between bamboo fibre molecules can be seen, which contribute to good hygroscopicity [19]. The moisture content is an important factor affecting the mechanical properties of bamboo fibres.

Figure 8a–c shows the morphology of the sisal fibre. The surface of the sisal fibre has higher roughness, which increases the bonding area between the fibre and the cement paste. This improves the mechanical properties of the composites [38].

Figure 9a–c shows images of the cotton fibres. In the microscope image, a cotton fibre looks like a twisted ribbon or a collapsed and twisted tube. These twists are called convolutions: there are about 60 convolutions per centimetre. The convolutions give the cotton fibres an uneven surface, which increases the friction between the fibres but can also prevent them from dispersing evenly in the cement matrix. The outer layer, the cuticle, is a thin film of mostly fats and waxes. Figure 9b shows the waxy layer surface with some smooth grooves. The waxy layer forms a thin sheet over the primary wall that forms grooves on the cotton surface [19]. The cotton fibre surface comprises non-cellulosic materials and amorphous cellulose in which the fibrils are arranged in a criss-cross pattern. Owing to the non-structured orientation of cellulose and non-cellulosic materials, the wall surface is unorganized and open. This gives flexibility to the fibre.
The basic ingredients responsible for the complicated interconnections in the primary wall are cellulose, hemicelluloses, pectin, proteins, and ions. In the core of the fibre, only crystalline cellulose is present, which is highly ordered and has a compact structure with the cellulose fibrils lying parallel to one another [18].

SEM micrographs of the surface and cross-section of the ramie fibre are shown in Fig. 10a–c. The surface of the ramie fibres is dense but porous. There are many micropores and continuous bubbles in the porous structure of a single bundle of ramie fibre (Fig. 10c). This structure contributes to low water absorption and is also related to the fibre distribution in the cement composites. In the case of short ramie fibre, its random distribution in composites may affect the strength of the composite. Cellulose, lignin, and hemicellulose materials can form a dense layer on the surface of the ramie fibres, so the water absorptivity is low. This special structure of the fibre, with a dense matrix and at the same time a characteristic pore arrangement, influences the adhesion of the cement matrix and the strength of the cement composite [18].

The surface and cross-section of the multifilament macrofibre are shown in Fig. 11a–c. From the chemical point of view, this type of fibre belongs to the polymers from the group of polyolefins, composed of units of the formula –[CH2CH(CH3)]–. They are obtained by low-pressure polymerization of propylene. They are made of 100% pure co-polymer twisted bundles of multifilament fibres (Fig. 11c). Polypropylene is one of the two most commonly used plastics, the other being polyethylene. Polypropylene is a hydrocarbon thermoplastic polymer [2].

Figure 12a–c shows the structure of a bundle of polypropylene (PP) fibres in the form of a 3D mesh. They are made of isotactic polypropylene, produced from propylene (CH2=CHCH3), which is obtained from crude oil.
They are among the finest polypropylene fibres. The surface of the fibres is smooth (Fig. 12b) [2].

The consistency: fluidity

The results of the fluidity test are shown in Fig. 13. The fluidity of the composite not modified with fibres is 145 mm and serves as the reference for the other test results. The use of bamboo fibres increased the composite flow by 8.6% (157.5 mm). The use of polymer and jute fibres increased the consistency by about 7%, and the use of sisal fibres by 3%. The use of PP fibres (122.5 mm) had the greatest impact, reducing the consistency by 15.5%. The use of cotton and ramie fibres reduced workability and consistency by 13.8% and 3.5%, respectively.

Figure 13. Results of fluidity test.

Based on the research results, it was found that when bamboo fibres with a high absorption of 120–145% were used, the consistency of the composite increased by 8.2% compared to the composite without fibres. For the change in consistency, the chemical composition of the natural fibres, their surface, and their total length in the volume of the composite are also significant. There is a noticeable regularity related to the cellulose content of natural fibres: the higher the cellulose content, the lower the consistency of the composite. For example, the cellulose content in bamboo fibres is the lowest and amounts to 40–45%, while the cellulose content in cotton fibres is the highest, ranging from 80 to 94%. Consistency and workability are also influenced by the hemicellulose content: the higher the hemicellulose content, the higher the consistency of the composite. The same applies to lignin: the higher the lignin content, the higher the consistency of the composite.
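The percentages quoted for the fluidity test follow from the reference flow of 145 mm. The sketch below reproduces that arithmetic for the two fibres whose flow diameters are stated in the text; the function and variable names are illustrative assumptions, not part of the study.

```python
# Reference flow of the composite without fibres, as stated in the text.
REFERENCE_FLOW_MM = 145.0


def percent_change(flow_mm: float, reference_mm: float = REFERENCE_FLOW_MM) -> float:
    """Relative change in flow versus the reference mix, in percent."""
    return (flow_mm - reference_mm) / reference_mm * 100.0


# Only bamboo and PP have flow diameters quoted in the text;
# the other fibres are reported as percentages only.
reported_flows_mm = {"bamboo": 157.5, "PP": 122.5}

for fibre, flow in reported_flows_mm.items():
    print(f"{fibre}: {percent_change(flow):+.1f}%")
```

A positive value means the mix flowed further than the reference; a negative value means a loss of consistency.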
Regarding the total length of the fibres, a regularity is apparent: the greater the total length of fibres, e.g., in the case of cotton fibres, the greater the decrease in consistency. In the case of polymer and polypropylene (PP) fibres, the consistency is influenced by the surface of the fibre, the number of fibres, and their total length in the volume of the composite. Increasing the total length of PP fibres by approx. 15% reduced the consistency by approx. 20%.

Flexural and compressive strength

In assessing the mechanical properties of the fibre-reinforced composites, particular emphasis was placed on determining the flexural strength. This parameter was determined by the 3-point bending test. Figure 14 shows the flexural strength of the plain composite and 7 groups of different fibre-reinforced composites on the 2nd, 7th, 28th, and 56th days.

Figure 14. Flexural strength test results.

It can be seen that the bending strengths of composites with the addition of the natural fibres ramie, bamboo, jute, and sisal are similar. The bending strength of composites with PP and polymer fibres is lower. It should be noted that the strength of the cotton fibre-reinforced composite is much lower than that of all the others tested. The reason may be the low tensile strength of the cotton fibres used. When mixing the composites, a tendency of the cotton fibres to form conglomerates was also noticed, which may affect the strength of the composites.

The test results clearly show that the effectiveness of the added natural fibres depends on their chemical composition and mechanical properties and, above all, on their adhesion to the cement matrix. The adhesion of the natural fibre to the cement matrix has a significant influence on the mechanical properties of the cement composite, in particular on compressive and bending strength. The highest bending strength was achieved by cement composites modified with ramie fibres.
Ramie fibres are characterized by the highest tensile strength among the tested synthetic and natural fibres, ranging from 400 to 1000 MPa.

The results of the compressive strength test are shown in Fig. 15.

Figure 15. Compressive strength test results.

The analysis of the test results shows that the use of dispersed fibres reduced the early compressive strength after 2 days by 8.5 to 33%. The exception is the ramie fibres, the use of which increased the early strength by 6.6%. At 28 days, as in the case of early strength, the use of all types of synthetic and natural fibres resulted in a decrease in strength of 4.6 to 26.5%. The exception is again the ramie fibres, which increased the compressive strength by 7.2% after 28 days. After 56 days, a decrease in strength of 5.5 to 11.9% was observed for the PP and polymer synthetic fibres as well as for natural cotton and bamboo.

On the other hand, an increase in compressive strength of 5.8 to 16.4% after 56 days was observed for sisal, jute and ramie fibres. The highest compressive strength was achieved by the composite with ramie fibre. The ramie fibre is characterized by the highest modulus of elasticity, ranging from 24.5 to 128 GPa, which is over 100% higher than the Young’s modulus of the other fibres.

Shrinkage test

Figure 16a shows that the samples expanded for about 2 days after demolding, and from the third day after demolding the samples shortened. The lowest degree of expansion in the first days was shown by samples without fibres and samples containing cotton fibres. In this case, the expansion did not exceed 0.02 mm/m.
However, the same samples finally showed the highest shrinkage after 180 days, which was 0.06 mm/m.

Figure 16. Testing the change in length of samples.

The highest expansion within 48 h after demolding was shown by samples containing sisal fibres, while after 180 days these samples showed the smallest change in length, 0.001 mm/m. The samples containing the synthetic fibres showed an expansion of about 0.02–0.03 mm/m in 48 h, and the final shrinkage after 180 days was 0.03 mm/m for both the polymer and PP fibre samples. The samples with bamboo and ramie fibres initially showed an expansion of 0.04–0.06 mm/m, while their final shrinkage was 0.02 mm/m. The samples with jute fibres showed an expansion of 0.04 mm/m and a final shrinkage of 0.04 mm/m. Figure 16a,b shows the results of testing the change in length of samples over time.

After 180 days, the total deformation of the samples was determined. Samples containing sisal fibres showed a slight expansion of about 0.001 mm/m, while the highest deformation (shrinkage), 0.06 mm/m, was shown by composite samples without fibres and with cotton fibres. Samples with bamboo, jute, PP, polymer and ramie fibres showed a shrinkage of 0.02 to 0.04 mm/m.

Ultimately, the samples containing sisal fibres were characterized by the lowest deformability. This phenomenon is related to the fibre structure and the total length of the fibres in a sample with dimensions of 40 × 40 × 160 mm. For example, in a sample containing sisal fibres, the total length is 5856.7 m. By contrast, in a sample containing jute fibres, the total length is only 7.4 m.
Therefore, it was found that the fibre structure, its diameter, the cellulose content and the total length of the fibres in the element are important factors in the deformation resulting from shrinkage or expansion of the fibre-reinforced composite.

Water absorption of composite test

Water absorption higher than that of the composite without fibres (8.5%) was observed for the fibre-reinforced composites, including both synthetic fibres, with the exception of ramie fibres, which slightly reduced water absorption to 8.2%. The water absorption rates of the 8 groups of samples differ only slightly: the highest is that of the polymer fibre-reinforced composite (9.2%) and the lowest that of the ramie fibre-reinforced composite (8.2%). The difference in water absorption rates is presented in Fig. 17.

Figure 17. Water absorption of composite (%).

Except for the cotton fibre-reinforced composite, the water absorption rate of the plant fibre-reinforced composites is lower than that of the synthetic fibre-reinforced composites, probably because ramie, sisal, and jute fibres all have good moisture absorption and release properties. It is commonly known that plant fibre-reinforced cement-based materials have reduced strength and initial properties due to their performance degradation in a humid environment, so their long-term durability could become problematic. Sisal fibres (with a measured absorption of 95–100%) absorbed more cement slurry on their surface than jute fibres (fibre absorption of 7–12%). This phenomenon could be explained by the slurry impregnating the fibre. The absorbability of the composite was tested after the composite had completely hardened. A fibre characterized by high absorption, such as sisal, is probably very well “embedded” in the matrix, which is why the bending strength results for composites with sisal fibre were higher by 8–10%.
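The shrinkage results above are given as strains in mm/m. As a rough illustration of scale, the sketch below converts the reported final shrinkage values into absolute length changes over the 160 mm edge of the 40 × 40 × 160 mm prism. It assumes the strain applies uniformly along that edge; the names are illustrative and not taken from the study.

```python
# Long edge of the 40 x 40 x 160 mm prism described in the text.
SPECIMEN_LENGTH_MM = 160.0


def length_change_mm(strain_mm_per_m: float,
                     length_mm: float = SPECIMEN_LENGTH_MM) -> float:
    """Absolute length change for a strain expressed in mm per metre.

    Assumes the strain is uniform along the specimen (an assumption
    made for this illustration only).
    """
    return strain_mm_per_m * (length_mm / 1000.0)


# Final shrinkage after 180 days, as reported in the text (mm/m).
final_shrinkage_mm_per_m = {
    "no fibres / cotton": 0.06,
    "jute": 0.04,
    "polymer / PP": 0.03,
    "bamboo / ramie": 0.02,
}

for group, strain in final_shrinkage_mm_per_m.items():
    print(f"{group}: {length_change_mm(strain) * 1000:.1f} micrometres")
```

On this scale, even the largest reported shrinkage corresponds to a length change of roughly ten micrometres per specimen.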

  • Eddy covariance-based differences in net ecosystem productivity values and spatial patterns between naturally regenerating forests and planted forests in China

    Differences in environmental factors

Environmental factors differed in value between forest types, although the significance of the differences varied among variables; this held for both the corrected values and the original measurements (Fig. 1).

Figure 1. The differences in environmental factors between naturally regenerating forests (NF) and planted forests (PF) in China. The environmental factors include three annual climatic factors (a–c), three seasonal temperature factors (d–f), three seasonal precipitation factors (g–i), three biotic factors (j–l), and two soil factors (m,n). The three annual climatic factors are mean annual air temperature (MAT, a), mean annual precipitation (MAP, b), and aridity index (AI, c), defined as the ratio of MAP to annual potential evapotranspiration. The three seasonal temperature factors are the temperature of the warmest month (Tw, d), the temperature of the coldest month (Tc, e), and the temperature annual range (TR, f). The three seasonal precipitation factors are precipitation of the wettest month (Pw, g), precipitation of the driest month (Pd, h), and precipitation seasonality (Ps, i), defined as the standard deviation of monthly precipitation during the measuring year. The three biotic factors are the mean annual leaf area index (LAI, j), the maximum leaf area index (MLAI, k), and stand age (SA, l). The two soil factors are soil organic carbon content (SOC, m) and soil total nitrogen content (STN, n). The differences are tested for each variable with one-way analysis of variance (ANOVA), where * and ** indicate significant differences between forest types at significance levels of α = 0.05 and α = 0.01, respectively.
The corrected values are mean values during 2003–2019 after correcting the original measurements with the interannual trend (see Methods) and are listed in each panel, while the original measurements are mean values during the measuring period of each ecosystem and are not shown in each panel.

For annual climatic factors, a significant difference between NF and PF appeared only in MAT (Fig. 1a). The mean MAT of NF was 10.50 ± 7.81 °C, which was significantly lower than that of PF (15.65 ± 6.23 °C) (p  0.05) (Fig. 2c). Even considering the significant effects of MAT on ER, ANCOVA results obtained by fixing MAT as a covariate also suggested that ER values did not significantly differ between forest types (F = 0.01, p > 0.05). Fixing other variables as covariates gave a similar result.

Therefore, NF showed a lower NEP than PF, resulting from its lower GPP, although the differences were not statistically significant (Fig. 2).

Differences in NEP latitudinal patterns

Carbon fluxes showed divergent latitudinal patterns between NF and PF, and the latitudinal patterns also varied among carbon fluxes; this held for both the corrected values and the original measurements (Fig. 3).

Figure 3. The latitudinal patterns of carbon fluxes over Chinese naturally regenerating forests (NF) and planted forests (PF). The carbon fluxes include net ecosystem productivity (NEP, a,b), gross primary productivity (GPP, c,d), and ecosystem respiration (ER, e,f). Each panel is drawn with the corrected values (blue points) and original measurements (grey points), respectively. The blue and black lines represent the regression lines calculated from the corrected values and original measurements, respectively, with their regression statistics listed in blue and black letters. Only the regression slope (Sl) and R2 of each regression are listed. The grey lines represent the regressions between carbon fluxes with added random errors and latitude.
Only significant (p  0.05).

The ER of NF showed a significant decreasing latitudinal pattern (Fig. 3e), while that of PF exhibited no significant latitudinal pattern (Fig. 3f). Increasing latitude caused the ER of NF to decrease significantly: each unit increase in latitude led to a 28.71 gC m−2 year−1 decrease in ER, with an R2 of 0.31. However, increasing latitude contributed little to the ER spatial variation of PF (p > 0.05).

In addition, the latitudinal patterns of carbon fluxes and their differences between forest types were also obtained with the original measurements (Fig. 3, grey points). The latitudinal patterns of the carbon fluxes with added random errors were comparable to those of our corrected carbon fluxes (Fig. 3), which confirmed that the latitudinal patterns of carbon fluxes and their differences between forest types would not be affected by the uncertainties in generating the corrected carbon fluxes.

Therefore, among NFs, the similar decreasing latitudinal patterns of GPP and ER meant that NEP showed no significant latitudinal pattern, while among PFs the significant decreasing latitudinal pattern of GPP and the absence of a significant latitudinal pattern of ER caused NEP to show a decreasing latitudinal pattern.

Differences in the environmental effects on NEP spatial variations

Environmental factors, including the annual climatic factors, seasonal temperature factors, seasonal precipitation factors, biological factors, and soil factors, exerted divergent effects on the spatial variations of NEP and its components, and these effects also differed between forest types (Table 1). No factor was found to affect the spatial variation of NEP among NFs, while most annual and seasonal climatic factors were found to affect that among PFs. The spatial variations of GPP and ER among NFs were both affected by most annual and seasonal climatic factors and by LAI, while those among PFs were primarily shaped by most annual and seasonal climatic factors.
Though LAI showed no significant effect on GPP and ER spatial variations among PFs, SA exerted a significant negative effect. In addition, the spatial variations of soil variables contributed little to the spatial variations of carbon fluxes. Therefore, among NFs, most annual and seasonal climatic factors and LAI were found to affect GPP and ER spatial variations, while no factor was found to significantly influence the NEP spatial variation. However, among PFs, most annual and seasonal climatic factors were found to affect the spatial variations of NEP and its components, while LAI showed no significant effect. Using the original measurements generated similar correlation coefficients (Supplementary Table S1).

Table 1. Correlation coefficients between carbon fluxes and environmental factors in naturally regenerating forests (NF) and planted forests (PF).

Given the high correlations among annual climatic factors and seasonal climatic factors (Supplementary Table S2), partial correlation analysis was applied to determine which factors should be employed to reveal the mechanisms underlying the spatial variations of NEP. Partial correlation analysis showed that MAT and MAP played the most important roles in the spatial variations of NEP and its components (Table 2). After controlling for MAT (or MAP), other factors seldom showed significant correlations with carbon fluxes, especially when MAT was fixed (Table 2). In addition, MAT and MAP exerted similar effects on the spatial variations of NEP and its components (Table 1). Using the original measurements generated similar partial correlation coefficients (Supplementary Table S3).
Therefore, we present in detail only the effects of MAT on carbon flux spatial variations and their differences between forest types.

Table 2. Partial correlation coefficients between carbon fluxes and environmental factors in naturally regenerating forests (NF) and planted forests (PF) with mean annual air temperature (MAT) or mean annual precipitation (MAP) fixed.

Increasing MAT increased carbon fluxes, while the rates of increase differed between forest types (Fig. 4). Increasing MAT contributed little to the NEP spatial variation of NF but raised the NEP of PF (Fig. 4a,b). Each unit increase in MAT caused the NEP of PF to increase at a rate of 27.77 gC m−2 year−1, with an R2 of 0.31 (Fig. 4b). Increasing MAT significantly raised GPP in NF and PF (Fig. 4c,d). For NF, each unit increase in MAT increased GPP at a rate of 43.76 gC m−2 year−1, with an R2 of 0.49 (Fig. 4c), while each unit increase in MAT increased the GPP of PF at a rate of 69.18 gC m−2 year−1, with an R2 of 0.57 (Fig. 4d). The rates of GPP increase did not differ significantly between NF and PF (F = 1.52, p > 0.05). Increasing MAT also raised ER in both NF and PF (Fig. 4e,f), at rates of 38.97 gC m−2 year−1 (Fig. 4e) and 36.79 gC m−2 year−1 (Fig. 4f), respectively, and the difference was not statistically significant (F = 0.01, p > 0.05). In addition, using the original measurements generated similar spatial variations and differences between forest types (Fig. 4). Furthermore, the carbon fluxes with added random errors responded similarly to our corrected carbon fluxes (Fig. 4), indicating that the effects of MAT on carbon fluxes would not be affected by the uncertainties in our corrected carbon fluxes.
Therefore, the similar responses of GPP and ER to MAT made MAT contribute little to NEP spatial variations among NFs, while GPP and ER showed divergent response rates to MAT, which made NEP increase with MAT among PFs.

Figure 4. The effects of mean annual air temperature (MAT) on the spatial variations of carbon fluxes over Chinese naturally regenerating forests (NF) and planted forests (PF). The carbon fluxes include net ecosystem productivity (NEP, a,b), gross primary productivity (GPP, c,d), and ecosystem respiration (ER, e,f). Each panel is drawn with the corrected values (blue points) and original measurements (grey points), respectively. The blue and black lines represent the regression lines calculated from the corrected values and original measurements, respectively, with their regression statistics listed in blue and black letters. Only the regression slope (Sl) and R2 of each regression are listed. The grey lines represent the regressions between carbon fluxes added by random errors and latitude. Only significant (p
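The argument about MAT rests on the standard identity NEP = GPP − ER: if GPP and ER respond to MAT at similar rates, their difference barely changes. A minimal sketch of that arithmetic, using the GPP and ER slopes reported above, is given below. The names are illustrative, and the implied values are a back-of-envelope check only: a separately fitted NEP regression (such as the 27.77 gC m−2 year−1 reported for PF) need not match them exactly, for instance if the regressions do not use identical sets of sites.

```python
# MAT slopes reported in the text, in gC m^-2 year^-1 per unit of MAT.
reported_slopes = {
    "NF": {"GPP": 43.76, "ER": 38.97},
    "PF": {"GPP": 69.18, "ER": 36.79},
}


def implied_nep_slope(gpp_slope: float, er_slope: float) -> float:
    """Slope of NEP implied by NEP = GPP - ER.

    This is an approximation for illustration; it is not the fitted
    NEP regression slope reported by the study.
    """
    return gpp_slope - er_slope


for forest, slopes in reported_slopes.items():
    implied = implied_nep_slope(slopes["GPP"], slopes["ER"])
    print(f"{forest}: implied NEP slope = {implied:.2f} gC m^-2 year^-1 per unit MAT")
```

The implied slope is small for NF and several times larger for PF, which is consistent with the text's conclusion that NEP increases with MAT only among planted forests.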

  • Host identity is the dominant factor in the assembly of nematode and tardigrade gut microbiomes in Antarctic Dry Valley streams

    Alpha diversity differences among communities

Nematode gut microbiomes were assigned to their respective species categories of E. antarcticus and P. murrayi based on 18S host data that was consistent with morphology (see Methods “Microinvertebrate haplotypes”). In contrast, because three indistinguishable 18S tardigrade haplotypes were recovered, the tardigrade gut microbiomes were assigned to Tardigrada. Mat bacterial communities were significantly (Tukey’s HSD, P  0.65, χ2(1)  0.38, χ2(3)

  • In-hive learning of specific mimic odours as a tool to enhance honey bee foraging and pollination activities in pear and apple crops

    Study sites and colonies

All the experiments were carried out during the apple and pear blooming seasons of 2007, 2008, 2011, 2013 and 2014 in different locations of the province of Río Negro, Argentina, while some laboratory experiments were performed in the city of Buenos Aires. We used individual foragers of Apis mellifera L. and their colonies containing a mated queen, brood, and food reserves in ten-frame Langstroth hives. All beehives used had similar sizes and the same management history from the beekeeper. The honey bees studied belonged to commercial Langstroth-type hives rented to pollinate these plots. Each hive had a fertilized queen, 3 or 4 capped brood frames, reserves and approximately 15,000 individuals [56].

Testing generalization of memories from pear mimic odours to pear and apple natural floral scents

The absolute conditioning assays were performed in the laboratories of the School of Exact and Natural Sciences of the University of Buenos Aires (34° 32′ S, 58° 26′ W), Buenos Aires, Argentina. We used honey bee foragers collected at the entrance of hives located in the experimental field of the School of Exact and Natural Sciences. The apple (‘Granny Smith’ and ‘Red Delicious’ varieties) and pear (‘Packham’ and ‘D’Anjou’ varieties) bud samples that we used as conditioned stimuli (CS) during the conditioning were collected at the end of the blossom of 2011 in Ingeniero Huergo (39° 03′ 27.5″ S; 67° 13′ 53.5″ W), province of Río Negro, Argentina, and taken to the laboratory in the city of Buenos Aires, Argentina, to be used within the following 2 days.

We first developed three different synthetic mixtures (PM, PMI and PMII) that foraging bees might generalize to the fragrance of the pear flower. The pear synthetic mixtures were formulated considering the previously reported volatile profile of pear blossoms [57].
Then, we chose the synthetic mixture most perceptually similar to the pear flower fragrance and measured its generalisation response to the apple flower fragrance to test the compounds’ specificity. The chemical compounds used to prepare the different synthetic mixtures for the behavioural assays were obtained from Sigma-Aldrich, Steinheim, Germany. The three pear mixtures (PM, PMI and PMII) were composed of alpha-pinene, 2-ethyl-hexanol, (R)-(+)-limonene, and (±)-linalool. For details of the PM and mixture proportions see Patent PCT/IB2018/055550 [58].

To test generalization, we took advantage of the fact that honey bees reflexively extend their proboscises when sugar solution is applied to their antennae [59]. The proboscis extension reflex (PER) can be used to condition bees to an odour if a neutral olfactory stimulus (CS) is paired with a sucrose reward as unconditioned stimulus (US) [60]. Conditioned honey bees extend their proboscises towards the odour alone, a response indicating that this stimulus has been learned and predicts the oncoming food reward. Conditioned bees can generalize such a learned response to a novel odour if it is perceived as similar to the conditioned one (CS). We then performed three absolute PER conditionings in which we paired each of the three PMs with a sucrose-water solution (30%) reward over three learning trials (exp. 4.2a). Afterwards, pear floral scent was presented as a novel odour to test generalization. Based on the generalization level to the pear odour, we chose the synthetic mixture that showed the highest generalisation towards pear flower fragrance, and we used it in all the experiments that follow. In an additional 3-trial PER conditioning with the chosen mixture, we quantified generalisation towards both the pear and apple fragrances as novel stimuli (exp. 4.2b).

The experimental bees were all foragers, captured from colonies that had no access to any pear and/or apple tree and hence were completely naïve to the CSs.
Immediately after capture, bees were anaesthetized at 4 °C and harnessed in metal tubes so that they could only move their mouthparts and antennae60. They were fed 30% weight/weight unscented sucrose solution for about three seconds and kept in a dark incubator (30 °C, 55% relative humidity) for about two hours. Only those bees that showed the unconditioned response (the reflexive extension of the proboscis after applying a 30% w/w sucrose solution to the antennae) and did not respond to the mechanical air flow stimulus were used. Trials lasted 46 s and comprised three steps: 20 s of clean air, 6 s of odour presentation (CS) and a final 20 s of clean air. During rewarded trials (CS), the reward (US, a drop of 30% w/w sucrose solution) was delivered during the last 3 s of CS presentation. The synthetic mixtures (PM) were delivered in a constant air flow (15 ml/s) that passed through a 1 ml syringe containing 4 µl of the synthetic mixture on a small strip of filter paper. Pear and apple floral volatiles, on the other hand, were swept from 100 g of fresh pear buds (var. ‘D’Anjou’ and ‘Packham’) or apple buds (var. ‘Granny Smith’, ‘Gala’ and ‘Red Delicious’) inside a Kitasato flask by means of an air flow (54 ml/s).

Testing discrimination between mimics and natural floral scents

The differential conditioning assays were performed in a field laboratory in Ingeniero Huergo, province of Río Negro, Argentina. Conditioning trials with AM as CS were carried out in September 2007 and 2008, prior to the beginning of flowering of the fruit trees. Conditioning trials with PM as CS were carried out in September 2011 in the same area (Ingeniero Huergo, province of Río Negro, Argentina). Apple and pear bud samples used as CS were collected in plots that were starting to bloom, located around Ingeniero Huergo but distant (more than 1 km) from the plot where we collected the bees. The bud samples comprised the following varieties: M. domesticus sp., ‘Granny Smith’, ‘Gala’, and ‘Red Delicious’; P. 
communis sp., ‘Packham’ and ‘D’Anjou’.

With the aim of developing a synthetic mixture that foraging bees would find difficult to discriminate from the fragrance of the apple flower, an apple synthetic mixture (AM) was formulated considering the previously reported volatile profile of apple blossoms61. The chemical compounds used to prepare the apple synthetic mixtures for the behavioural assays were obtained from Sigma-Aldrich, Steinheim, Germany. The apple mimic (AM) was composed of benzaldehyde, limonene and citral. For details of the AM proportions see Patent AR20110102441 (ref. 62). The jasmine mimic (JM) was a commercial extract obtained from Firmenich S.A.I.C. y F, Argentina.

To test the compounds’ specificity, we reasoned that if the chosen synthetic mixture were perceptually similar to the apple flower fragrance, experimental bees should have difficulty discriminating it from that fragrance. Thus, we performed differential PER conditioning between synthetic mixtures (AM and jasmine mimic, JM) or between synthetic mixtures (AM or JM) and the apple natural fragrance. We followed a differential PER conditioning34 to assess to what extent the bees were able to discriminate the synthetic mimics from their natural flower scents. PER differential conditioning consisted of four pairs of trials, four rewarded trials (CS+) and four non-rewarded trials (CS−), that were presented in a pseudo-randomized manner. Conditionings were performed using the synthetic mixtures PM and AM and the natural floral scents, pear and apple, either as CS+ or CS−. We followed the same procedure as in 3.3 to capture the bees and to present the stimuli during trials.

Feeding protocol

We used the offering of scented sucrose solution in the hive as a standardized procedure to establish long-term olfactory memory in honey bees23,24,24,26,63. Scented sucrose solution was obtained by diluting 50 µl of PM or AM per litre of sucrose solution (50% weight/weight, henceforth: w/w).
For the ‘apple’ series, colonies were fed 1500 ml of sugar solution offered in an internal plastic feeder for 2 days, about 3 days before the apple trees began to bloom. For the ‘pear’ series, hives were fed 500 ml of sugar solution that we spread over the top of the central frames. Both feeding procedures have been found to be functional for establishing olfactory in-hive memories26. Depending on the pear varieties, the scented sucrose solution was offered when the pear trees were 10–40% in bloom.

Colony activity

The effects of the AM-treatment on colony nest entrance activity were studied in 18 colonies located in an agricultural setting of apple and pear trees in Ingeniero Huergo, on an 8-ha plot, half of which was planted with apple trees (varieties: ‘Granny Smith’, ‘Gala’ and ‘Red Delicious’) and the other 4 ha with pear trees (varieties: ‘Packham’ and ‘D’Anjou’). The effect of the PM-treatment on colony activity was studied in 14 colonies located in three adjoining pear plots (total surface: 8 ha) in Otto Krause (39° 06′ 22″ S 66° 59′ 46″ W, Supplementary Fig. S5), province of Río Negro, Argentina. The varieties in these plots were ‘Packham’ and ‘Williams’. Pollen collection (exp. 4.5.2) was also studied in colonies located in these plots.

We focused on nest entrance activity because, once the first successful foragers return to the hive and display dances and/or unload the food collected, they promote the activation or reactivation of inactive foragers and, to a lesser extent, of those hive mates ready to initiate foraging tasks39,65,66,67,67. We therefore chose the number of incoming bees as an indicator of colony foraging activity, since most of these bees are expected to be returning from foraging sites33. Thus, we compared the activity level at the nest entrance between 7 SS + PM-treated colonies and 7 SS-treated colonies. We also compared the nest entrance activity level between 5 colonies treated with SS + AM and 5 colonies fed with SS.
This activity value was estimated as the number of incoming foragers at the entrance of the hive during one minute, every morning at the same time (10:30 a.m.) during the entire experiment (9 consecutive days). A first measurement was taken one day before feeding the colonies (used as covariate) and 7 measurements afterwards.

We measured the amount of pollen loads collected by two colonies: one fed with SS + PM and one fed with SS. Pollen loads were collected using conventional pollen traps (frontal-entrance trap), consisting of a wooden structure with a removable metal mesh inside. Pollen samples were collected on 3 days, for two hours per day during the late morning, 3, 7 and 8 days after the offering of SS + PM or SS. Pollen pellets, identified by pollen colour as coming from the pear flower or from other species, were separated and counted. In addition, we estimated the weight of pear pollen loads during a 5-day period, from 6 to 10 days after the offering of scented or unscented sucrose solution. To reduce measurement error, pollen loads were weighed in groups of 10.

Crop yield

Pear crop yield was studied in pear plots in General Roca (39° 02′ 00″ S; 67° 35′ 00″ W, Supplementary Fig. S4, Supplementary Table S3), province of Río Negro, Argentina. In an area of 15.2 ha (4 plots of 3.8 ha each), 45 beehives were equidistantly located in groups. We measured the number of fruits per tree in sets of 30 trees in the areas surrounding the PM-treated colonies (2 groups of 8 hives) and the control colonies (2 groups of 8 hives). A third group category contained 13 untreated colonies. The varieties of the pear trees were ‘D’Anjou’ and ‘Packham’.

Apple crop yield, estimated as the number of fruits per plant, was studied in General Roca (Supplementary Fig. S2, Supplementary Table S1), province of Río Negro, Argentina.
We measured fruit set in the two plots, which covered a surface of 3.8 ha and contained a total of 74 colonies distributed in groups (the control plot, 39 colonies treated with SS; and the treated plot, 35 colonies treated with SS + AM). The varieties of the apple trees were ‘Red Delicious’ (clone 1), ‘Royal Gala’ and ‘Granny Smith’.

A second study on apple fruit yield, measured as kg of fruits per hectare, was performed in Coronel Belisle (39° 11′ 00″ S 65° 59′ 00″ W, Supplementary Fig. S3, Supplementary Table S2), province of Río Negro, Argentina. Four apple plots of 15.4 ha each, with the varieties ‘Granny Smith’, ‘Hi Early’ and ‘Red Delicious’ (clone 1), were randomly assigned to different treatments (treated plot 1, 40 hives treated with SS + AM; treated plot 2, 40 hives treated with SS + AM; control plot 1, 40 hives treated with SS; control plot 2, 40 hives treated with SS).

During the fruit harvest, the fruit yield was estimated in the surroundings (150 m around) of two groups of 8 colonies each. We fed one group SS + PM and the other unscented sucrose solution (SS). Yield was estimated as the number of fruits per tree in 30 randomly selected trees within each area, alternating the counts between the North and South faces of the plots. Following the same procedure, we also estimated the number of fruits per tree in the surroundings of two groups of 14 colonies each that pollinated apple crops. Again, we fed one group SS + AM and the other SS. Additionally, a total of 218 colonies in General Roca and 180 colonies in Coronel Belisle were divided into the two experimental groups; for these, yield was provided by the producer and expressed in kg of fruits per ha. It is worth remarking that in some plots the distance between treated and control beehive groups was around 300 m, suggesting that the flying areas of treated and control hives might have overlapped.
Additionally, the apple fields studied in the surroundings of Coronel Belisle had many trees without flowers. It was considered that the absence of flowers in numerous trees would bias the counts performed in those fields. Then, to quantify this situation, which might be associated with the masting phenomenon68, we sampled the proportion of trees without flowers for every 20 trees in each plot. Trees that had between 80 and 100% of their surface devoid of flowers were considered “without flowers” trees, and those that had more than 20% of their surface covered with flowers were considered “trees with available flowers”. An average of 30% of the trees within these plots were devoid of flowers. Thus, a correction factor was considered to evaluate the yield data provided by the grower per plot analysed (Supplementary Table S4).

Statistics

All statistical analyses were performed with R Core Team 2019 (ref. 69). For experiments 4.2 and 4.3, we analysed PER proportion by means of a binomial multiplicative generalized linear mixed model using the “glmer” function of the ‘lme4’ package70.

For experiment 4.2a we considered the pear mimics (three-level factor corresponding to PM, PMI and PMII) and the event (two-level factor corresponding to 3rd trial and test) as fixed factors and each “bee” as a random factor.

For experiment 4.2b we considered the tested odours (three-level factor corresponding to Apple, Pear and PM) as fixed factors.

For experiment 4.3 we considered the tested odours (two-level factor corresponding to CS+ and CS−) as fixed factors. Post hoc contrasts were conducted on models to assess effects and significance between fixed factors using the “emmeans” function of the ‘emmeans’ package version 1.7.0 (ref. 71) with a significance level of 0.05.

For experiment 4.5.1 we analysed “rate of incoming bees” using a generalized linear mixed model.
As the Poisson model for incoming bees was overdispersed72, we used a negative binomial distribution using the ‘glmmTMB’ package (function ‘glmmTMB’)73. We considered “treatment” [two-level factor corresponding to SS + AM (or SS + PM) and SS], “days” (7-level factor corresponding to the date after treatment), the rate of incoming bees before the offering of food (to control for pre-existing colony differences) as a covariate (a quantitative fixed-effect variable), and “colony” as a random factor.

For experiment 4.6, we analysed fruits per tree by means of a negative binomial multiplicative generalized linear mixed model using the “log” function of the ‘ml’ package70. Post hoc contrasts were conducted on models to assess effects and significance between fixed factors using the “emmeans” function of the ‘emmeans’ package version 1.8.0 (ref. 71) with a significance level of 0.05. For experiment 4.6b we analysed “yield” (as weight of fruits per unit area) using a general linear mixed model. We checked homoscedasticity and normality assumptions (Levene and Shapiro–Wilk tests, respectively). We considered “treatment” (two-level factor corresponding to SS + AM and SS) and “apple varieties” (3-level factor corresponding to Hi Early, Granny Smith and Chañar 28) as fixed factors and “location” (2-level factor corresponding to General Roca and Coronel Belisle) as a random factor.
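
The overdispersion mentioned for the incoming-bee counts can be illustrated with a crude screening check: for Poisson data the variance is close to the mean, so a variance-to-mean ratio well above 1 hints that a negative binomial model may fit better. The sketch below is in Python rather than the R packages cited in the text, the counts are made up for illustration, and `dispersion_index` is a hypothetical helper; it is not the test the authors applied.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Return sample variance divided by sample mean (about 1 for Poisson data)."""
    m = mean(counts)
    if m == 0:
        raise ValueError("mean is zero; dispersion index is undefined")
    return variance(counts) / m


# Hypothetical one-minute counts of incoming bees (NOT data from the study).
counts = [12, 30, 7, 45, 19, 60, 25]
index = dispersion_index(counts)
print(f"dispersion index = {index:.2f}")
if index > 1:
    print("variance exceeds the mean: counts look overdispersed")
```

A ratio on raw counts ignores the covariates and random effects of a fitted model, so it is only a first look; a model-based check on residuals would be the more careful route.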

  • in

    Limited carbon cycling due to high-pressure effects on the deep-sea microbiome


  • in

    Standardized multi-omics of Earth’s microbiomes reveals microbial and metabolite diversity

    Dataset description

Sample collection

Our research complies with all relevant ethical regulations following policies at the University of California, San Diego (UCSD). Animal samples that were sequenced were not collected at UCSD and are not for vertebrate animals research at UCSD following the UCSD Institutional Animal Care and Use Committee (IACUC). Samples were contributed by 34 principal investigators of the Earth Microbiome Project 500 (EMP500) Consortium and are samples from studies at their respective institutions (Supplementary Table 1). Relevant permits and ethics information for each parent study are described in the ‘Permits for sample collection’ section below. Samples were contributed as distinct sets referred to here as studies, where each study represented a single environment (for example, terrestrial plant detritus). To achieve more even coverage across microbial environments, we devised an ontology of sample types (microbial environments), the EMP Ontology (EMPO) (http://earthmicrobiome.org/protocols-and-standards/empo/)1, and selected samples to fill out EMPO categories as broadly as possible. EMPO recognizes strong gradients structuring microbial communities globally, and thus classifies microbial environments (level 4) on the basis of host association (level 1), salinity (level 2), host kingdom (if host-associated) or phase (if free-living) (level 3) (Fig. 1a). As we anticipated previously1, we have updated the number of levels as well as states therein for EMPO (Fig. 1b) on the basis of an important additional salinity gradient observed among host-associated samples when considering the previously unreported shotgun metagenomic and metabolomic data generated here (Fig. 3c,d). We note that although we were able to acquire samples for all EMPO categories, some categories are represented by a single study.

Samples were collected following the Earth Microbiome Project sample submission guide50.
Briefly, samples were collected fresh, split into 10 aliquots and then frozen, or alternatively collected and frozen, and subsequently split into 10 aliquots with minimal perturbation. Aliquot size was sufficient to yield 10–100 ng genomic DNA (approximately 10⁷–10⁸ cells). To leave samples amenable to chemical characterization (metabolomics), buffers or solutions for sample preservation (for example, RNAlater) were avoided. Ethanol (50–95%) was allowed as it is compatible with LC–MS/MS although it should also be avoided if possible.

Sampling guidance was tailored for four general sample types: bulk unaltered (for example, soil, sediment, faeces), bulk fractionated (for example, sponges, corals, turbid water), swabs (for example, biofilms) and filters. Bulk unaltered samples were split fresh (or frozen), sampled into 10 pre-labelled 2 ml screw-cap bead beater tubes (Sarstedt, 72.694.005 or similar), ideally with at least 200 mg biomass, and flash frozen in liquid nitrogen (if possible). Bulk fractionated samples were fractionated as appropriate for the sample type, split into 10 pre-labelled 2 ml screw-cap bead beater tubes, ideally with at least 200 mg biomass, and flash frozen in liquid nitrogen (if possible). Swabs were collected as 10 replicate swabs using 5 BD SWUBE dual cotton swabs with wooden stick and screw cap (281130). Filters were collected as 10 replicate filters (47 mm diameter, 0.2 µm pore size, polyethersulfone (preferred) or hydrophilic PTFE filters), placed in pre-labelled 2 ml screw-cap bead beater tubes, and flash frozen in liquid nitrogen (if possible). All sample types were stored at –80 °C if possible, otherwise –20 °C.

To track the provenance of sample aliquots, we employed a QR coding scheme. Labels were affixed to aliquot tubes before shipping when possible. QR codes had the format ‘name.99.s003.a05’, where ‘name’ is the PI name, ‘99’ is the study ID, ‘s003’ is the sample number and ‘a05’ is the aliquot number.
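
The label format just described is regular enough to parse mechanically. A minimal Python sketch follows; the `AliquotLabel` type, its field names and the `parse_label` function are hypothetical, and the assumption that the PI name contains no dots is ours rather than something the text specifies.

```python
import re
from typing import NamedTuple


class AliquotLabel(NamedTuple):
    pi_name: str
    study_id: int
    sample_number: int
    aliquot_number: int


# name . study ID . 's' + sample number . 'a' + aliquot number
LABEL_PATTERN = re.compile(
    r"^(?P<name>[^.]+)\.(?P<study>\d+)\.s(?P<sample>\d+)\.a(?P<aliquot>\d+)$"
)


def parse_label(label: str) -> AliquotLabel:
    """Split a label such as 'name.99.s003.a05' into its four parts."""
    match = LABEL_PATTERN.match(label)
    if match is None:
        raise ValueError(f"unrecognised label: {label!r}")
    return AliquotLabel(
        pi_name=match["name"],
        study_id=int(match["study"]),
        sample_number=int(match["sample"]),
        aliquot_number=int(match["aliquot"]),
    )


print(parse_label("name.99.s003.a05"))
```

Converting the numeric parts to integers drops the zero padding, which is convenient for sorting an inventory but means the original string should be kept if labels need to be reprinted.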
QR codes (version 2, 25 pixels × 25 pixels) were printed on 1.125″ × 0.75″ rectangular and 0.437″ circular cap Cryogenic Direct Thermal labels (GA International, DFP-70) using a Zebra model GK420d printer and ZebraDesigner Pro 3 software for Windows. After receipt but before aliquots were stored in freezers, QR codes were scanned into a sample inventory spreadsheet using a QR scanner.

Sample metadata

Environmental metadata were collected for all samples on the basis of the EMP Metadata Guide, which combines guidance from the Genomic Standards Consortium MIxS (Minimum Information about any Sequence) standard74 and the Qiita Database (https://qiita.ucsd.edu)51. The metadata guide provides templates and instructions for each MIxS environmental package (that is, sample type). Relevant information describing each PI submission, or study, was organized into a separate study metadata file (Supplementary Table 1).

Metabolomics

LC–MS/MS sample extraction and preparation

To profile metabolites among all samples, we used LC–MS/MS, a versatile method that detects tens of thousands of metabolites in biological samples. All solvents and reactants used were LC–MS grade. To maximize the biomass extracted from each sample, the samples were prepared depending on their sampling method (for example, bulk, swabs, filter and controls). The bulk samples were transferred into a microcentrifuge tube (polypropylene, PP) and dissolved in 7:3 MeOH:H2O using a volume varying from 600 µl to 1.5 ml, depending on the amounts of sample available, and homogenized in a tissue lyser (QIAGEN) at 25 Hz for 5 min. Then, the tubes were centrifuged at 2,000 × g for 15 min, and the supernatant was collected in a 96-well plate (PP). For swabs, the swabs were transferred into a 96-well plate (PP) and dissolved in 1.0 ml of 9:1 ethanol:H2O. The prepared plates were sonicated for 30 min, and after 12 h at 4 °C, the swabs were removed from the wells.
The filter samples were dissolved in 1.5 ml of 7:3 MeOH:H2O in microcentrifuge tubes (PP) and sonicated for 30 min. After 12 h at 4 °C, the filters were removed from the tubes. The tubes were centrifuged at 2,000 × g for 15 min, and the supernatants were transferred to 96-well plates (PP). The process control samples (bags, filters and tubes) were prepared by adding 3.0 ml of 2:8 MeOH:H2O and recovering 1.5 ml after 2 min. After the extraction process, all sample plates were dried with a vacuum concentrator and subjected to solid phase extraction (SPE). SPE was used to remove salts that could reduce ionization efficiency during mass spectrometry analysis, as well as the most polar and non-polar compounds (for example, waxes) that cannot be analysed efficiently by reversed-phase chromatography. The protocol was as follows: the samples (in plates) were dissolved in 300 µl of 7:3 MeOH:H2O and put in an ultrasound bath for 20 min. SPE was performed with SPE plates (Oasis HLB, hydrophilic-lipophilic-balance, 30 mg with particle sizes of 30 µm). The SPE beds were activated by priming them with 100% MeOH, and equilibrated with 100% H2O. The samples were loaded on the SPE beds, and 100% H2O was used as wash solvent (600 µl). The eluted washing solution was discarded, as it contains salts and very polar metabolites that subsequent metabolomics analysis is not designed for. The sample elution was carried out sequentially with 7:3 MeOH:H2O (600 µl) and 100% MeOH (600 µl). The obtained plates were dried with a vacuum concentrator. For mass spectrometry analysis, the samples were resuspended in 130 µl of 7:3 MeOH:H2O containing 0.2 µM of amitriptyline as an internal standard. The plates were centrifuged at 30 × g for 15 min at 4 °C. 
Samples (100 µl) were transferred into new 96-well plates (PP) for mass spectrometry analysis.

LC–MS/MS sample analysis

The extracted samples were analysed by ultra-high performance liquid chromatography (UHPLC, Vanquish, Thermo Fisher) coupled to a quadrupole-Orbitrap mass spectrometer (Q Exactive, Thermo Fisher) operated in data-dependent acquisition mode (LC–MS/MS in DDA mode). Chromatographic separation was performed using a Kinetex C18 1.7 µm (Phenomenex), 100 Å pore size, 2.1 mm (internal diameter) × 50 mm (length) column with a C18 guard cartridge (Phenomenex). The column was maintained at 40 °C. The mobile phase was composed of a mixture of (A) water with 0.1% formic acid (v/v) and (B) acetonitrile with 0.1% formic acid. Chromatographic elution method was set as follows: 0.00–1.00 min, isocratic 5% B; 1.00–9.00 min, gradient from 5% to 100% B; 9.00–11.00 min, isocratic 100% B; followed by equilibration 11.00–11.50 min, gradient from 100% to 5% B; 11.50–12.50 min, isocratic 5% B. The flow rate was set to 0.5 ml min−1.

The UHPLC was interfaced to the orbitrap using a heated electrospray ionization source with the following parameters: ionization mode, positive; spray voltage, +3,496.2 V; heater temperature, 363.90 °C; capillary temperature, 377.50 °C; S-lens RF, 60 arbitrary units (a.u.); sheath gas flow rate, 60.19 a.u.; and auxiliary gas flow rate, 20.00 a.u. The MS1 scans were acquired at a resolution (at m/z 200) of 35,000 in the m/z 100–1500 range, and the fragmentation spectra (MS2) scans at a resolution of 17,500 from 0 to 12.5 min. The automatic gain control target and maximum injection time were set at 1.0 × 10^6 and 160 ms for MS1 scans, and set at 5.0 × 10^5 and 220 ms for MS2 scans, respectively. Up to three MS2 scans in data-dependent mode (Top 3) were acquired for the most abundant ions per MS1 scans using the apex trigger mode (4–15 s), dynamic exclusion (11 s) and automatic isotope exclusion. The starting value for MS2 was m/z 50.
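The chromatographic elution program given above is piecewise linear in time, so the mobile-phase composition at any moment of the 12.5 min run can be computed directly. The sketch below transcribes the five segments from the text; the names `GRADIENT` and `percent_b` are illustrative and do not correspond to any instrument software.

```python
# Each segment: (start_min, end_min, %B at start, %B at end),
# transcribed from the elution method described in the text.
GRADIENT = [
    (0.0, 1.0, 5.0, 5.0),       # isocratic 5% B
    (1.0, 9.0, 5.0, 100.0),     # gradient 5% -> 100% B
    (9.0, 11.0, 100.0, 100.0),  # isocratic 100% B
    (11.0, 11.5, 100.0, 5.0),   # re-equilibration ramp 100% -> 5% B
    (11.5, 12.5, 5.0, 5.0),     # isocratic 5% B
]


def percent_b(t_min: float) -> float:
    """Return the programmed percentage of solvent B at time t_min (minutes)."""
    for start, end, b_start, b_end in GRADIENT:
        if start <= t_min <= end:
            fraction = (t_min - start) / (end - start)
            return b_start + fraction * (b_end - b_start)
    raise ValueError(f"time {t_min} min is outside the 0-12.5 min program")


for t in (0.0, 5.0, 10.0, 11.25, 12.5):
    print(f"{t:5.2f} min -> {percent_b(t):5.1f}% B")
```

This gives the programmed composition only; it ignores the system dwell volume, which delays the gradient actually reaching the column.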
Higher-energy collision induced dissociation (HCD) was performed with a normalized collision energy of 20, 30 and 40 eV in stepped mode. The major background ions originating from the SPE were excluded manually from the MS2 acquisition. Analyses were randomized within plate and blank samples analysed every 20 injections. A quality control mix sample assembled from 20 random samples across the sample types was injected at the beginning, the middle and the end of each plate sequence. The chromatographic shift observed throughout the batch was estimated as less than 2 s, and the relative standard deviation of ion intensity was 15% per replicate.LC–MS/MS data processingThe mass spectrometry data were centroided and converted from the proprietary format (.raw) to the m/z extensible markup language format (.mzML) using ProteoWizard (ver. 3.0.19, MSConvert tool)75. The mzML files were then processed with MZmine 2 toolbox76 using the ion-identity networking modules77 that allow advanced detection for adduct/isotopologue annotations. The MZmine processing was performed on Ubuntu 18.04 LTS 64-bits workstation (Intel Xeon E5-2637, 3.5 GHz, 8 cores, 64 Gb of RAM) and took ~3 d. The MZmine project, the MZmine batch file (.XML format) and results files (.MGF and .CSV) are available in the MassIVE dataset MSV000083475. The MZmine batch file contains all the parameters used during the processing. In brief, feature detection and deconvolution was performed with the ADAP chromatogram builder78 and local minimum search algorithm. The isotopologues were regrouped and the features (peaks) were aligned across samples. The aligned peak list was gap filled and only peaks with an associated fragmentation spectrum and occurring in a minimum of three files were conserved. Peak shape correlation analysis grouped peaks originating from the same molecule and annotated adduct/isotopologue with ion-identity networking77. 
Finally, the feature quantification table results (.CSV) and spectral information (.MGF) were exported with the GNPS module for feature-based molecular networking analysis on GNPS79 and with SIRIUS export modules.LC–MS/MS data annotationThe results files of MZmine (.MGF and .CSV files) were uploaded to GNPS (http://gnps.ucsd.edu)52 and analysed with the feature-based molecular networking workflow79. Spectral library matching was performed against public fragmentation spectra (MS2) spectral libraries on GNPS and the NIST17 library.For the additional annotation of small peptides, we used the DEREPLICATOR tools available on GNPS80,81. We then used SIRIUS82 (v. 4.4.25, headless, Linux) to systematically annotate the MS2 spectra. Molecular formulae were computed with the SIRIUS module by matching the experimental and predicted isotopic patterns83, and from fragmentation trees analysis84 of MS2. Molecular formula prediction was refined with the ZODIAC module using Gibbs sampling85 on the fragmentation spectra (chimeric spectra or those with poor fragmentation were excluded). In silico structure annotation using structures from biodatabase was done with CSI:FingerID86. 
Systematic class annotations were obtained with CANOPUS41 and used the NPClassifier ontology87.The parameters for SIRIUS tools were set as follows, for SIRIUS: molecular formula candidates retained, 80; molecular formula database, ALL; maximum precursor ion m/z computed, 750; profile, orbitrap; m/z maximum deviation, 10 ppm; ions annotated with MZmine were prioritized and other ions were considered (that is, [M+H3N+H]+, [M+H]+, [M+K]+, [M+Na]+, [M+H-H2O]+, [M+H-H4O2]+, [M+NH4]+); for ZODIAC: the features were split into 10 random subsets for lower computational burden and computed separately with the following parameters: threshold filter, 0.9; minimum local connections, 0; for CSI:FingerID: m/z maximum deviation, 10 ppm; and biological database, BIO.To establish putative microbially related secondary metabolites, we collected annotations from spectral library matching and the DEREPLICATOR+ tools and queried them against the largest microbial metabolite reference databases (Natural Products Atlas88 and MIBiG89). Molecular networking79 was then used to propagate the annotation of microbially related secondary metabolites throughout all molecular families (that is, the network component).LC–MS/MS data analysisWe combined the annotation results from the different tools described above to create a comprehensive metadata file describing each metabolite feature observed. Using that information, we generated a feature-table including only secondary metabolite features determined to be microbially related. We then excluded very low-intensity features introduced to certain samples during the gap-filling step described above. These features were identified on the basis of presence in negative controls that were universal to all sample types (that is, bulk, filter and swab) and by their relatively low per-sample intensity values. Finally, we excluded features present in positive controls for sampling devices specific to each sample type (that is, bulk, filter or swab). 
The final feature-table included 618 samples and 6,588 putative microbially related secondary metabolite features that were used for subsequent analysis.

We used QIIME 2’s90 (v2020.6) ‘diversity’ plugin to quantify alpha-diversity (that is, feature richness) for each sample and ‘deicode’91 to quantify beta-diversity (that is, robust Aitchison distances, which are robust to both sparsity and compositionality in the data) between each pair of samples. We parameterized our robust Aitchison principal components analysis (RPCA)91 to exclude samples with fewer than 500 features and features present in fewer than 10% of samples. We used the ‘taxa’ plugin to quantify the relative abundance of microbially related secondary metabolite pathways and superclasses (that is, on the basis of NPClassifier) within each environment (that is, for each level of EMPO 4), and ‘songbird’ v1.0.492 to identify sets of microbially related secondary metabolites whose abundances were associated with certain environments. We parameterized our ‘songbird’ model as follows: epochs, 1,000,000; differential prior, 0.5; learning rate, 1.0 × 10^−5; summary interval, 2; batch size, 400; minimum sample count, 0; and training on 80% of samples at each level of EMPO 4 using ‘Animal distal gut (non-saline)’ as the reference environment. Environments with fewer than 10 samples were excluded to optimize model training (that is, ‘Animal corpus (non-saline)’, ‘Animal proximal gut (non-saline)’, ‘Surface (saline)’). The output from ‘songbird’ includes a rank value for each metabolite in every environment, which represents the log fold change for a given metabolite in a given environment92. We compared log fold changes for each metabolite from this run to those from (1) a replicate run using the same reference environment and (2) a run using a distinct reference environment: ‘Water (saline)’.
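A comparison of this kind reduces to a rank correlation between two vectors of per-metabolite log fold changes. The sketch below shows one way to compute Spearman's coefficient using only the Python standard library; it is not the authors' code, the function names are illustrative, and the two example vectors are invented values rather than results from the study.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical log fold changes for the same five metabolites in two runs.
run_a = [2.1, -0.4, 0.9, -1.7, 0.3]
run_b = [1.8, -0.2, 1.1, -2.0, 0.1]
print(spearman(run_a, run_b))
```

Because only ranks enter the calculation, the coefficient is unaffected by the overall shift in log fold changes that comes from changing the reference environment, which is what makes it suitable for comparing runs with different references.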
We found strong Spearman correlations in both cases (Supplementary Table 8), and therefore focused on results from the original run using ‘Animal distal gut (non-saline)’ as the reference environment, as it has previously been shown to be relatively unique among other habitats. In addition to summarizing the top 10 metabolites for each environment (Supplementary Table 3), we used the log fold change values in our multi-omics analyses described below.We used the RPCA biplot and QIIME 2’s90 EMPeror93 to visualize differences in composition among samples, as well as the association with samples of the 25 most influential microbially related secondary metabolite features (that is, those with the largest magnitude across the first three principal component loadings). We tested for significant differences in metabolite composition across all levels of EMPO using PERMANOVA implemented with QIIME 2’s ‘diversity’ plugin90 and using our robust Aitchison distance matrix as input. In parallel, we used the differential abundance results from ‘songbird’ described above to identify specific microbially related secondary metabolite pathways and superclasses that varied strongly across environments. We then went back to our metabolite feature-table to visualize differences in the relative abundances of those pathways and superclasses within each environment by first selecting features and calculating log-ratios using ‘qurro’94, and then plotting using the ‘ggplot2’ package95 in R96 v4.0.0. We tested for significant differences in relative abundances across environments using Kruskal–Wallis tests implemented with the base ‘stats’ package in R96.GC–MS sample extraction and preparationTo profile volatile small molecules among all samples in addition to what was captured with LC–MS/MS, we used gas chromatography coupled with mass spectrometry (GC–MS). All solvents and reactants were GC–MS grade. 
Two protocols were used for sample extraction, one for the 105 soil samples and a second for the 356 faecal and sediment samples that were treated as biosafety level 2. The 105 soil samples were received at the Pacific Northwest National Laboratory and processed as follows. Each soil sample (1 g) was weighed into microcentrifuge tubes (Biopur Safe-Lock, 2.0 ml, Eppendorf). H2O (1 ml) and one scoop (~0.5 g) of a 1:1 (v/v) mixture of garnet (0.15 mm, Omni International) and stainless steel (0.9–2.0 mm blend, Next Advance) beads and one 3 mm stainless steel bead (Qiagen) were added to each tube. Samples were homogenized in a tissue lyser (Qiagen) for 3 min at 30 Hz and transferred into 15 ml polypropylene tubes (Olympus, Genesee Scientific). Ice-cold water (1 ml) was used to rinse the smaller tube and combined into the 15 ml tube. Chloroform:methanol (10 ml, 2:1 v/v) was added and samples were rotated at 4 °C for 10 min, followed by cooling at −70 °C for 10 min and centrifuging at 150 × g for 10 min to separate phases. The top and bottom layers were combined into 40 ml glass vials and dried using a vacuum concentrator. Chloroform:methanol (1 ml, 2:1) was added to each large glass vial and the sample was transferred into 1.5 ml tubes and centrifuged at 1,300 × g. The supernatant was transferred into glass vials and dried for derivatization.The remaining 356 samples received from UCSD that included faecal and sediment samples were processed as follows: 100 µl of each sample was transferred to a 2 ml microcentrifuge tube using a scoop (MSP01, Next Advance). The final volume of the sample was brought to 1.5 ml, ensuring that the solvent ratio is 3:8:4 H2O:CHCl3:MeOH by adding the appropriate volumes of H2O, MeOH and CHCl3. After transfer, one 3 mm stainless steel bead (QIAGEN), 400 µl methanol and 300 µl H2O were added to each tube and the samples were vortexed for 30 s. Then, 800 µl chloroform was added and samples were vortexed for 30 s. 
After centrifuging at 150 × g for 10 min to separate phases, the top and bottom layers were combined in a vial and dried for derivatization.The samples were derivatized for GC–MS analysis as follows: 20 µl of a methoxyamine solution in pyridine (30 mg ml−1) was added to the sample vial and vortexed for 30 s. A bath sonicator was used to ensure that the sample was completely dissolved. Samples were incubated at 37 °C for 1.5 h while shaking at 1,000 r.p.m. N-methyl-N-trimethylsilyltrifluoroacetamide (80 µl) and 1% trimethylchlorosilane solution was added and samples were vortexed for 10 s, followed by incubation at 37 °C for 30 min, with 1,000 r.p.m. shaking. The samples were then transferred into a vial with an insert.An Agilent 7890A gas chromatograph coupled with a single quadrupole 5975C mass spectrometer (Agilent) and an HP-5MS column (30 m × 0.25 mm × 0.25 μm; Agilent) was used for untargeted analysis. Samples (1 μl) were injected in splitless mode, and the helium gas flow rate was determined by the Agilent Retention Time Locking function on the basis of analysis of deuterated myristic acid (Agilent). The injection port temperature was held at 250 °C throughout the analysis. The GC oven was held at 60 °C for 1 min after injection, and the temperature was then increased to 325 °C at a rate of 10 °C min−1, followed by a 10 min hold at 325 °C. Data were collected over the mass range of m/z 50–600. A mixture of FAMEs (C8–C28) was analysed each day with the samples for retention index alignment purposes during subsequent data analysis.GC–MS data processing and annotationThe data were converted from vendor’s format to the .mzML format and processed using GNPS GC–MS data analysis workflow (https://gnps.ucsd.edu)97. The compounds were identified by matching experimental spectra to the public libraries available at GNPS, as well as NIST 17 and Wiley libraries. 
The data are publicly available at the MassIVE depository (https://massive.ucsd.edu); dataset ID: MSV000083743. The GNPS deconvolution is available in GNPS (https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=d5c5135a59eb48779216615e8d5cb3ac), as is the library search (https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=59b20fc8381f4ee6b79d35034de81d86).GC–MS data analysisFor multi-omics analyses including GC–MS data, we first removed noisy (that is, suspected background contaminants and artifacts) features by excluding those with balance scores 1.5–2 kb DNA fragments’ (Oxford Nanopore Technologies). The resulting product consists of uniquely tagged rRNA operon amplicons. The uniquely tagged rRNA operons were amplified in a second PCR, where the reaction (100 µl) contained 2 U Platinum SuperFi DNA Polymerase High Fidelity (Thermo Fisher) and a final concentration of 1X SuperFi buffer, 0.2 mM of each dNTP, and 500 nM of each forward and reverse synthetic primer targeting the tailed primers from above. The PCR cycling parameters consisted of an initial denaturation (3 min at 95 °C) and then 25–35 cycles of denaturation (15 s at 95 °C), annealing (30 s at 60 °C) and extension (6 min at 72 °C), followed by final extension (5 min at 72 °C). The PCR product was purified using the custom bead purification protocol above. Batches of 25 amplicon libraries were barcoded and sent for PacBio Sequel II library preparation and sequencing (Sequel II SMRT Cell 8M and 30 h collection time) at the DNA Sequencing Center at Brigham Young University. Circular consensus sequencing (CCS) reads were generated using CCS v.3.4.1 (https://github.com/PacificBiosciences/ccs) using default settings. 
UMI consensus sequences were generated using the longread_umi pipeline (https://github.com/SorenKarst/longread_umi) with the following command: longread_umi pacbio_pipeline -d ccs_reads.fq -o out_dir -m 3500 -M 6000 -s 60 -e 60 -f CAAGCAGAAGACGGCATACGAGAT -F AGRGTTYGATYMTGGCTCAG -r AATGATACGGCGACCACCGAGATC -R CGACATCGAGGTGCCAAAC -U ‘0.75;1.5;2;0’ -c 2.Amplicon data analysisFor multi-omics analyses including amplicon sequence data, we processed each dataset for comparison of beta-diversity. For all amplicon data except that for bacterial full-length rRNA amplicons, raw sequence data were converted from bcl to fastq, and then multiplexed files for each sequencing run uploaded as separate preparations to Qiita (study: 13114).For each 16S sequencing run, in Qiita, data were demultiplexed, trimmed to 150 bp and denoised using Deblur122 to generate a feature-table of sub-operational taxonomic units (sOTUs) per sample, using default parameters. We then exported feature-tables and denoised sequences from each sequencing run, used QIIME 2’s ‘feature-table’ plugin to merge feature-tables and denoised reads across sequencing runs, and placed all denoised reads into the GreenGenes 13_8 phylogeny123 via fragment insertion using QIIME 2’s90 SATé-Enabled Phylogenetic Placement (SEPP)124 plugin to produce a phylogeny for diversity analyses. To allow for phylogenetically informed diversity analyses, reads not placed during SEPP (that is, 513 sOTUs, 0.1% of all sOTUs) were removed from the merged feature-table. We then used QIIME 2’s ‘feature-table’ plugin to exclude singleton sOTUs and rarefy the data to 5,000 reads per sample. Rarefaction depths for all amplicon analyses were chosen to best normalize sampling effort per sample while maintaining ≥75% of samples representative of Earth’s environments, and also to maintain consistency with the analyses from EMP release 1. 
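Rarefying to a fixed depth, as described above, means subsampling each sample's reads without replacement to the same total and dropping samples that fall short. The sketch below illustrates the idea with the standard library only; it is not QIIME 2's implementation, and the function name `rarefy` and the example counts are hypothetical.

```python
import random


def rarefy(counts, depth, seed=0):
    """Subsample one sample's reads without replacement to `depth` reads.

    counts: mapping of feature ID -> read count for a single sample.
    Returns a new mapping with exactly `depth` reads, or None when the
    sample has fewer than `depth` reads and would be dropped.
    """
    total = sum(counts.values())
    if total < depth:
        return None
    # Expand counts into one entry per read, in a deterministic order.
    pool = [fid for fid, n in sorted(counts.items()) for _ in range(n)]
    drawn = random.Random(seed).sample(pool, depth)
    rarefied = {}
    for fid in drawn:
        rarefied[fid] = rarefied.get(fid, 0) + 1
    return rarefied


# Hypothetical sOTU counts for a deep and a shallow sample.
deep_sample = {"sotu_a": 4000, "sotu_b": 3000, "sotu_c": 10}
shallow_sample = {"sotu_a": 100}

print(rarefy(deep_sample, 5000))
print(rarefy(shallow_sample, 5000))  # None: below the rarefaction depth
```

The choice of depth trades off the number of samples retained against the reads kept per sample, which is the balance the ≥75% retention criterion addresses.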
We then used QIIME 2’s90 ‘diversity’ plugin to estimate alpha-diversity (that is, sOTU richness) and beta-diversity (that is, unweighted UniFrac distances). The final feature-table for 16S beta-diversity analysis included 681 samples and 93,260 features. We performed a comparative analysis of the data including and excluding the reads not placed during SEPP, and note that both alpha-diversity (that is, sOTU richness) and beta-diversity (that is, sample–sample RPCA distances) were highly correlated between datasets (Spearman r = 1.0) (Supplementary Fig. 5). We thus proceeded with the SEPP-filtered dataset and used phylogenetically informed diversity metrics where applicable.

For 18S data, we used QIIME 2’s90 ‘demux’ plugin’s ‘emp-paired’ method125,126 to first demultiplex each sequencing run, and then the ‘cutadapt’ plugin’s127 ‘trim-paired’ method to trim sequencing primers from reads. We then exported trimmed reads, concatenated R1 and R2 read files per sample, and denoised reads using Deblur’s122,128 ‘workflow’ with default settings, trimming reads to 90 bp, and taking the ‘all.biom’ and ‘all.seqs’ output, for each sequencing run. We then used QIIME 2’s ‘feature-table’ plugin to merge feature-tables and denoised sequences across sequencing runs, and then the ‘feature-classifier’ plugin’s ‘classify-sklearn’ method to classify taxonomy for each sOTU via pre-fitted machine-learning classifiers129 and the SILVA 138 reference database130. We then used QIIME 2’s90 ‘feature-table’ plugin to exclude reads assigned to bacteria and archaea, singleton sOTUs and samples with a total frequency of


    Mapping the planet’s critical natural assets

    Extent and location of critical natural assetsCritical natural assets providing the 12 local NCP (Fig. 1a) occupy only 30% (41 million km2) of total land area (excluding Antarctica) and 24% (34 million km2) of marine Exclusive Economic Zones (EEZs), reflecting the steep slope of the aggregate NCP accumulation curve (Fig. 1b). Despite this modest proportion of global land area, the shares of countries’ land areas that are designated as critical can vary substantially. The 20 largest countries require only 24% of their land area, on average, to maintain 90% of current levels of NCP, while smaller countries (10,000 to 1.5 million km2) require on average 40% of their land area (Supplementary Data 1). This high variability in the NCP–area relationship is primarily driven by the proportion of countries’ land areas made up by natural assets (that is, excluding barren, ice and snow, and developed lands), but even when this is accounted for, there are outliers (Extended Data Fig. 2). Outliers may be due to spatial patterns in human population density (for example, countries with dense population centres and vast expanses with few people, such as Canada and Russia, require far less area to achieve NCP targets) or large ecosystem heterogeneity (if greater ecosystem diversity yields higher levels of diverse NCP in a smaller proportion of area, which may explain patterns in Chile and Australia).The highest-value critical natural assets (the locations delivering the highest magnitudes of NCP in the smallest area, denoted by the darkest blue or green shades in Fig. 1c) often coincide with diverse, relatively intact natural areas near or upstream from large numbers of people. Many of these high-value areas coincide with areas of greatest spatial congruence among multiple NCP (Extended Data Fig. 3). 
Spatially correlated pairs of local NCP (Supplementary Table 4) include those related to water (flood risk reduction with nitrogen retention and nitrogen with sediment retention); forest products (timber and fuelwood); and those occurring closer to human-modified habitats (pollination with nature access and with nitrogen retention). Coastal risk reduction, forage production for grazing, and riverine fish harvest are the most spatially distinct from other local NCP. In the marine realm, there is substantial overlap of fisheries with coastal risk reduction and reef tourism (though not between the latter two, which each have much smaller critical areas than exist for fisheries).Number of people benefitting from critical natural assetsWe estimate that ~87% of the world’s current population, 6.4 billion people, benefit directly from at least one of the 12 local NCP provided by critical natural assets, while only 16% live on the lands providing these benefits (and they may also benefit; Fig. 2a). To quantify the number of beneficiaries of critical natural assets, we spatially delineate their benefitting areas (which varies on the basis of NCP: for example, areas downstream, within the floodplain, in low-lying areas near the coast, or accessible by a short travel). While our optimization selects for the provision of 90% of the current value of each NCP, it is not guaranteed that 90% of the world’s population would benefit (since it does not include considerations for redundancy in adjacent pixels and therefore many of the areas selected benefit the same populations), so it is notable that an estimated 87% do. 
This estimate of ‘local’ beneficiaries probably underestimates the total number of people benefitting because it includes only NCP for which beneficiaries can be spatially delineated to avoid double-counting, yet it is striking that the vast majority, 6.1 billion people, live within 1 h travel (by road, rail, boat or foot, taking the fastest path17) of critical natural assets, and more than half of the world’s population lives downstream of these areas (Fig. 2b). Material NCP are often delivered locally, but many also enter global supply chains, making it difficult to delineate beneficiaries spatially for these NCP. However, past studies have calculated that globally more than 54 million people benefit directly from the timber industry18, 157 million from riverine fisheries19, 565 million from marine fisheries20 and 1.3 billion from livestock grazing21, and across the tropics alone 2.7 billion are estimated to be dependent on nature for one or more basic needs22.Fig. 2: People benefitting from and living on critical natural assets (CNA).a,b, ‘Local’ beneficiaries were calculated through the intersection of areas benefitting from different NCP, to avoid double-counting people in areas of overlap; only those NCP for which beneficiaries could be spatially delineated were included (that is, not material NCP that enter global supply chains: fisheries, timber, livestock or crop pollination). Bars show percentages of total population globally and for large and small countries (a) or the percentage of relevant population globally (b). Numbers inset in bars show millions of people making up that percentage. 
Numbers to the right of bars in b show total relevant population (in millions of people, equivalent to total global population from Landscan 2017 for population within 1 h travel or downstream, but limited to the total population living within 10 km of floodplains or along coastlines 80%) of their populations benefitting from critical natural assets, but small countries have much larger proportions of their populations living within the footprint of critical natural assets than do large countries (Fig. 2a and Supplementary Data 2). When people live in these areas, and especially when current levels of use of natural assets are not sustainable, regulations or incentives may be needed to maintain the benefits these assets provide. While protected areas are an important conservation strategy, they represent only 15% of the critical natural assets for local NCP (Supplementary Table 5); additional areas should not necessarily be protected using designations that restrict human access and use, or they could cease to provide some of the diverse values that make them so critical23. Other area-based conservation measures, such as those based on Indigenous and local communities’ governance systems, Payments for Ecosystem Services programmes, and sustainable use of land- and seascapes, can all contribute to maintaining critical flows of NCP in natural and semi-natural ecosystems24.Overlaps between local and global prioritiesUnlike the 12 local NCP prioritized here at the national scale, certain benefits of natural assets accrue continentally or even globally. We therefore optimize two additional NCP at a global scale: vulnerable terrestrial ecosystem carbon storage (that is, the amount of total ecosystem carbon lost in a typical disturbance event25, hereafter ‘ecosystem carbon’) and vegetation-regulated atmospheric moisture recycling (the supply of atmospheric moisture and precipitation sustained by plant life26, hereafter ‘moisture recycling’). 
Over 80% of the natural asset locations identified as critical for the 12 local NCP are also critical for the two global NCP (Fig. 3). The spatial overlap between critical natural assets for local and global NCP accounts for 24% of land area, with an additional 14% of land area critical for global NCP that is not considered critical for local NCP (Extended Data Fig. 4). Together, critical natural assets for securing both local and global NCP require 44% of total global land area. When each NCP is optimized individually (carbon and moisture NCP at the global scale; the other 12 at the country scale), the overlap between carbon or moisture NCP and the other NCP exceeds 50% for all terrestrial (and freshwater) NCP except coastal risk reduction (which overlaps only 36% with ecosystem carbon, 5% with moisture recycling; Supplementary Table 4).

Fig. 3: Spatial overlaps between critical natural assets for local and global NCP. Red and teal denote where critical natural assets for global NCP (providing 90% of ecosystem carbon and moisture recycling globally) or for local NCP (providing 90% of the 12 NCP listed in Fig. 1), respectively, but not both, occur; gold shows areas where the two overlap (24% of the total area). Together, local and global critical natural assets account for 44% of total global land area (excluding Antarctica). Grey areas show natural assets not defined as ‘critical’ by this analysis, though still providing some values to certain populations. White areas were excluded from the optimization.

Synergies can also be found between NCP and biodiversity and cultural diversity. Critical natural assets for local NCP at national levels overlap with part or all of the area of habitat (AOH, mapped on the basis of species range maps, habitat preferences and elevation27) for 60% of 28,177 terrestrial vertebrates (Supplementary Data 3). Birds (73%) and mammals (66%) are better represented than reptiles and amphibians (44%).
However, these critical natural assets represent only 34% of the area for endemic vertebrate species (with 100% of their AOH located within a given country; Supplementary Data 3) and 16% of the area for all vertebrates if using a more conservative representation target framework based on the IUCN Red List criteria (though, notably, achieving Red List representation targets is impossible for 24% of species without restoration or other expansion of existing AOH; Supplementary Data 4). Cultural diversity (proxied by linguistic diversity) has far higher overlaps with critical natural assets than does biodiversity; these areas intersect 96% of global Indigenous and non-migrant languages28 (Supplementary Data 5). The degree to which languages are represented in association with critical natural assets is consistent across most countries, even at the high end of language diversity (countries containing >100 Indigenous and non-migrant languages, such as Indonesia, Nigeria and India). This high correspondence provides further support for the importance of safeguarding rights to access critical natural assets, especially for Indigenous cultures that benefit from and help maintain them. Despite the larger land area required for maintaining the global NCP compared with local NCP, global NCP priority areas overlap with slightly fewer languages (92%) and with only 2% more species (60% of species AOH), although a substantially greater overlap is seen with global NCP if Red List criteria are considered (36% compared with 16% for local NCP; Supplementary Data 4). These results provide different insights than previous efforts at smaller scales, particularly a similar exercise in Europe that found less overlap with priority areas for biodiversity and NCP29. 
However, the 40% of all vertebrate species whose habitats did not overlap with critical natural assets could drive very different patterns if biodiversity were included in the optimization.

Although these 14 NCP do not cover all of the myriad ways that nature benefits and is valued by people [23], they capture, spatially and thematically, many elements explicitly mentioned in the First Draft of the CBD’s post-2020 Global Biodiversity Framework [13]: food security, water security, protection from hazards and extreme events, livelihoods and access to green and blue spaces. Our emphasis here is on the contributions of natural and semi-natural ecosystems to human wellbeing, specifically contributions that are often overlooked in mainstream conservation and development policies around the world. For example, considerations for global food security often include only crop production rather than nature’s contributions to it via pollination or vegetation-mediated precipitation, or livestock production without partitioning out the contribution of grasslands from more intensified feed production.

Gaps and next steps

Our synthesis of these 14 NCP represents a substantial advance beyond other global prioritizations that include NCP limited to ecosystem carbon stocks, fresh water and marine fisheries [30,31,32], though it still falls short of including all important contributions of nature, such as its relational values [33]. Despite the omission of many NCP that could not be mapped, further analyses indicate that results are fairly robust to the inclusion of additional NCP. Dropping one of the 12 local NCP at a time results in ...

    Population genomics reveal distinct and diverging populations of An. minimus in Cambodia

    Population sampling and sequencing

We generated whole-genome sequence data from 302 individual wild-caught female An. minimus mosquitoes collected at five field sites in Cambodia, using the Illumina HiSeq 2000 platform with 150 bp paired-end reads and a target coverage of 30X per individual. Mosquito collections in Thmar Da, in Western Cambodia, were done in 2010. Longitudinal monthly collections were performed from February 2014 to January 2015 at two sites in each of Preah Vihear and Ratanakiri provinces. Quarterly collections were also done in 2016 at one site in Preah Vihear province, Cambodia.

Variant discovery

The methods for sequencing and variant calling closely follow those of the Anopheles gambiae 1000 Genomes project phase 2 (Ag1000G) [27]. Sequence reads were aligned to the An. minimus reference genome AminM1 [28]. We restricted our analysis to the largest 40 contigs, which cover 96.6% of the AminM1 reference genome, as many smaller contigs can confound diversity and divergence calculations. We found that 138,161,075 (75.4%) of sites within these 40 largest contigs passed our site filters and thus were accessible to SNP calling. Among these sites, 38,000,285 of 55,307,039 total segregating single nucleotide polymorphisms (SNPs) passed all of our quality control filters. Of the passing SNPs, 32,906,471 were biallelic and 13.4% were multiallelic (4,807,355 triallelic and 286,459 quadriallelic). A total of 100,160,790 sites were invariant. The median genome-wide coverage was 35X.

Population structure

A principal component analysis (PCA) over biallelic SNPs distributed across the genome of 302 individual field-collected mosquitoes showed clear population structure of An. minimus in Cambodia. Samples collected from five sites in three provinces split into three distinct clusters; here, we report on 283 individuals that could be clearly assigned to these clusters (Fig.
1), excluding 9 anomalous and 10 outlying individuals. One cluster includes all samples from the western collection site Thmar Da and the northern collection sites in Preah Vihear province, with two further clusters with samples from Ratanakiri province in the northeast. These clusters split primarily along the first and second principal components. This was a surprising finding because this population structure did not correlate with the geographic sampling of these mosquitoes. Individuals collected from the western and northern sites cluster tightly together despite being hundreds of kilometers apart.

Fig. 1: Population structure of An. minimus in Cambodia. The map indicates the five Cambodian collection sites. Principal component analysis (PCA) of whole genome sequences of 283 individual An. minimus s.s. collected in five villages in Cambodia shows distinct population structure with three populations. When performing the same PCA on a large X-chromosomal contig (KB664054), these individuals break into four populations: TD from the West, PV from the northern province of Preah Vihear, and RK1 and RK2, both collected at two sites in Ratanakiri province in the Northeast.

To further explore this population structure, we performed the same PCA over individual contigs from different regions of the genome. Performing PCA over the largest X-chromosomal contig KB664054 resulted in a splitting of the western and northern samples, indicating four distinct populations of An. minimus in Cambodia (Fig. 1). PCA from this contig on a quickly evolving sex chromosome revealed more population structure compared to autosomal contigs.
The populations defined by these PCA clusters are designated in this study as TD, from Thmar Da in Western Cambodia on the Thai-Cambodian border (n = 41); PV, from the Northern province Preah Vihear (n = 156); and two distinct populations collected in Ratanakiri province in the Northeast, RK1 (n = 58) and RK2 (n = 28), each including individuals collected at both collection sites.

To confirm our results from PCA, we also performed an admixture analysis. We ran admixture on each of the largest 10 contigs for values of K between 2 and 6 (Supplemental Fig. 1). At K = 2, the samples from Northeastern Cambodia split from Northern and Western Cambodia samples. At K = 3, the two different groups in Ratanakiri were separated, consistent with the PCA results. At K = 4, there was some evidence for geographical population structure between the Western TD and Northern PV populations, but the admixture results did not perfectly correspond with geographic sampling, with some evidence of mixed ancestry in the PV samples. Again, this is consistent with the PCA groupings, with the generally weaker evidence of geographic population structure between TD and PV. A cross-validation analysis showed the lowest cross-validation error for K = 2 and K = 3, consistent with the strongest evidence for population structure between the two RK groups and other populations. Cross-validation error was higher at K = 4, consistent with the weaker differentiation between TD and PV. At no point was there an indication of admixture between RK1 and RK2.

To examine population differentiation, we computed differences in allele frequencies between each pair of populations using pairwise Fst. Pairwise Fst between all four populations over the largest contig, KB663610, representing 16% of the An. minimus genome (Fig.
2) shows that differentiation was relatively low between populations of TD and PV with an average pairwise Fst of 0.003, while the difference between RK2 and the other three populations is tenfold higher, around 0.03. Pairwise Fst estimates comparing these populations over other large An. minimus contigs indicate a similar level of differentiation, with average pairwise Fst values over 0.03 (Supplementary Data 3). The two sympatric populations from the Ratanakiri collection sites are as differentiated from each other as they are from the northern and western clusters.

Fig. 2: Population diversity and divergence. Nucleotide diversity (π), Watterson’s Theta (θW), and Tajima’s D statistics were calculated over fourfold degenerate sites on autosomal contigs. The error bars indicate 95% confidence intervals calculated over 100 bootstrap replicates over samples. The average pairwise Fst in the table was calculated in 20 kb windows over the largest contig KB663610.

This level of differentiation of RK2, even from the RK1 population, might indicate an emerging cryptic species within An. minimus A or a newly diverging clade. RK1 and RK2 are sympatric populations, both being collected in the same two sites in Northeastern Cambodia. The differences seen here between RK1 and RK2 populations are consistent with cryptic taxa in other anopheline groups. For example, in the An. gambiae complex, the level of differentiation between recently diverged sibling species An. coluzzii and An. gambiae in West Africa is approximately 0.03 [19].

Population diversity and variation

To characterize population diversity among these populations, nucleotide diversity (π), Watterson’s Theta (θW), and Tajima’s D statistics were calculated over fourfold degenerate sites on autosomal contigs larger than 2 megabases with 100 bootstrap replicates over samples. These 17 contigs represent 80% of the Anopheles minimus genome (Fig. 2).
The populations were downsampled for these calculations to have sizes equal to that of the smallest population RK2 (n = 28).

There are small but significant differences in the magnitude of the genetic diversity summary statistics between these four different populations. In particular, there were notable differences between the putatively cryptic taxa RK1 and RK2, two populations that were collected in the same sites in Northeastern Cambodia. RK1 had higher levels of nucleotide diversity and lower levels of Tajima’s D than RK2. These differences are consistent with different population size histories between these sympatric groups. Lower values of Tajima’s D suggest stronger population growth in RK1. Comparing all four populations, higher levels of genetic diversity indicate larger effective population sizes of TD and PV compared to RK1 and RK2.

RK2 has a significantly reduced nucleotide diversity and Watterson’s Theta compared to the other three populations. This may indicate a smaller population size and a recent bottleneck of the RK2 population in Cambodia. All four An. minimus populations have a negative Tajima’s D, indicating an excess of rare variants, particularly in RK1. This suggests recent population expansions in all populations.

Signals of evolutionary selection

We used Fst to scan across the Anopheles minimus genome to look for regions of the genome with increased differentiation. When we scanned the genome using pairwise Fst, there were no apparent long signals of differentiation that might indicate a large inversion or other structural variants, known to be major drivers of adaptive evolution in other Anopheles groups. To investigate increased differentiation across large regions of the genome, we performed scans of nucleotide diversity (π), Watterson’s Theta (θW), and Tajima’s D over the largest 14 contigs (representing 80% of the An. minimus genome).
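
The three summary statistics discussed here can be computed from per-site allele counts. The following Python sketch is only an illustration under stated assumptions, not the authors' pipeline: the function name `diversity_stats` and its inputs are hypothetical, sites are assumed to be biallelic, every sample is assumed to be called at every site, and the values are totals over the supplied sites (dividing π and θW by the number of accessible sites would give per-site values). It uses the standard Tajima (1989) constants and is meant for four or more sampled chromosomes.

```python
import math


def diversity_stats(alt_counts, n_chrom):
    """Return (pi, theta_w, tajimas_d) summed over the supplied sites.

    alt_counts: alternate-allele count at each biallelic site (hypothetical input).
    n_chrom:    number of sampled chromosomes (2 x individuals for diploids).
    """
    # Only segregating sites contribute to any of the three statistics.
    seg = [k for k in alt_counts if 0 < k < n_chrom]
    s = len(seg)
    if n_chrom < 2 or s == 0:
        return 0.0, 0.0, math.nan

    # Nucleotide diversity: mean number of pairwise differences.
    pairs = n_chrom * (n_chrom - 1) / 2
    pi = sum(k * (n_chrom - k) for k in seg) / pairs

    # Watterson's theta: segregating sites scaled by the harmonic number a1.
    a1 = sum(1 / i for i in range(1, n_chrom))
    a2 = sum(1 / i**2 for i in range(1, n_chrom))
    theta_w = s / a1

    # Tajima's D: difference of the two estimators over its standard deviation.
    b1 = (n_chrom + 1) / (3 * (n_chrom - 1))
    b2 = 2 * (n_chrom**2 + n_chrom + 3) / (9 * n_chrom * (n_chrom - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n_chrom + 2) / (a1 * n_chrom) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    variance = e1 * s + e2 * s * (s - 1)
    d = (pi - theta_w) / math.sqrt(variance) if variance > 0 else math.nan
    return pi, theta_w, d


# Toy example: 10 chromosomes, four singleton sites (an excess of rare variants).
print(diversity_stats([1, 1, 1, 1], 10))
```

An excess of rare variants pulls π below θW, which is why a sample made up of singletons gives a negative D, the pattern the text reports for all four populations.
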
As with the Fst scans, there were no large regions of higher differentiation in any of the populations that might indicate major structural variants or inversions (Supplementary Figs. 2–4).

Whole-genome sequencing allowed us to identify localized signals occurring across the entire genome using scans of average pairwise Fst. Isolated points of high differentiation were compared over single contigs with average pairwise Fst calculated over windows of 1000 SNPs each and plotted over whole contigs. The strongest signals, indicated by the highest Fst value at the peak of a strong signal of differentiation, were ranked and compared. The five top signals in each of the six comparisons between the four populations are listed in Table 1. These isolated points of high differentiation are one indication of a signal of evolutionary selection. The most differentiated regions by Fst occurred when comparing the RK2 population to the other three populations, with the highest selection peaks with pairwise Fst over 0.4. RK2 also had more distinct signals of selection when compared to the other populations than RK1. Since these signals of differentiation were highly localized, we could look to known gene annotations and gene predictions across the AminM1 reference genome to see which genes were within 100 kbp of the peaks of these signals. We have noted candidate genes of interest that were near the strongest Fst signal peaks and also had known or predicted gene functions (Table 1, Supplementary Fig. 6, Supplementary Fig. 8).

Table 1: The top five Fst signals of high differentiation within each of six population comparisons.

There is almost no indication of selection when comparing the Thmar Da population with Preah Vihear, with all but one signal with Fst values below 0.05. The one strong signal between TD and PV (Fst = 0.125) is near a carbohydrate sulfotransferase gene, which is involved in detoxification processes.
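
A windowed scan of the kind described above (average pairwise Fst in non-overlapping windows of a fixed number of SNPs, then ranking the peaks) might be sketched as follows. This is a hedged illustration, not the study's code: the text does not name the Fst estimator, so Hudson's estimator with the ratio-of-sums average is assumed here, and the function names `hudson_terms`, `windowed_fst` and `top_peaks` and the input layout are hypothetical.

```python
import math


def hudson_terms(ac1, ac2):
    """Per-SNP numerator and denominator of Hudson's Fst (assumed estimator).

    Each argument is (alt_allele_count, total_allele_count) for one population.
    """
    (k1, n1), (k2, n2) = ac1, ac2
    p1, p2 = k1 / n1, k2 / n2
    num = (p1 - p2) ** 2 - p1 * (1 - p1) / (n1 - 1) - p2 * (1 - p2) / (n2 - 1)
    den = p1 * (1 - p2) + p2 * (1 - p1)
    return num, den


def windowed_fst(positions, counts1, counts2, window_snps=1000):
    """Fst in consecutive non-overlapping windows of `window_snps` SNPs.

    Returns (first_position, last_position, fst) per window; a trailing
    window with fewer than `window_snps` SNPs is dropped.
    """
    windows = []
    for start in range(0, len(positions) - window_snps + 1, window_snps):
        stop = start + window_snps
        terms = [hudson_terms(a, b)
                 for a, b in zip(counts1[start:stop], counts2[start:stop])]
        num = sum(t[0] for t in terms)
        den = sum(t[1] for t in terms)
        # Ratio of sums; undefined when no SNP in the window is informative.
        fst = num / den if den > 0 else math.nan
        windows.append((positions[start], positions[stop - 1], fst))
    return windows


def top_peaks(windows, k=5):
    """Return the k windows with the highest Fst, ignoring undefined windows."""
    valid = [w for w in windows if not math.isnan(w[2])]
    return sorted(valid, key=lambda w: w[2], reverse=True)[:k]


# Toy data: two fixed differences followed by two SNPs with equal frequencies.
pos = [100, 200, 300, 400]
pop_a = [(10, 10), (10, 10), (5, 10), (5, 10)]
pop_b = [(0, 10), (0, 10), (5, 10), (5, 10)]
print(top_peaks(windowed_fst(pos, pop_a, pop_b, window_snps=2), k=1))
```

Genes within a chosen distance of each peak window (100 kbp in the text) could then be looked up in an annotation table by comparing coordinates with the window boundaries.
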
Comparing TD to RK1 and RK2 reveals multiple strong signals of selection, some of which are present in both Northeastern populations, as well as many unique RK2-specific signals (Fig. 3, Supplementary Fig. 5).

Fig. 3: Signals of selection over a single autosomal contig. Pairwise Fst was calculated in 1000 SNP windows over autosomal contig KB664266, comparing the Thmar Da population to the three other populations, Ratanakiri 2, Ratanakiri 1, and Preah Vihear. There is almost no indication of selection when comparing Thmar Da with Preah Vihear. There is a strongly supported signal of differentiation in both Ratanakiri 1 and Ratanakiri 2 populations at 7.5 Mbp, which is in the same location as a cluster of GSTe genes, including GSTe2, which are known to be involved in metabolic resistance to DDT and pyrethroids. The signal with the highest Fst peak here in RK2, at 6 Mbp, is close to an Ecdysteroid UDP-glucosyltransferase gene, shown to confer pyrethroid insecticide resistance in other anophelines. These are a few of many selection signals identified in this study that may be associated with insecticide pressure on these An. minimus populations.

Many of the strongest signals identified in this study may be associated with insecticide pressure on these An. minimus populations. The strongest selection signals in every population comparison were close to genes that are involved in detoxification, signal transduction, and adaptations to oxidative stress, or have been functionally validated to have mutations that confer resistance to insecticides (Table 1). Some signals of interest include a strongly supported signal of selection in both RK1 and RK2 populations at 7.5 Mbp on the contig KB664266, which is in the same location as a cluster of glutathione-S-transferases, including GSTe2, which has been shown to be involved in the metabolism of DDT and pyrethroids, mutations in which mediate metabolic insecticide resistance [29].
The signal with the highest pairwise Fst peak on the same contig KB664266, at 6 Mbp, is an RK2-specific signal and close to an Ecdysteroid UDP-glucosyltransferase gene, which has been shown to confer pyrethroid insecticide resistance in An. stephensi [30].

Another notable signal, between the RK1 and RK2 populations on the contig KB663610, corresponds to a Peptidase S1 domain-containing protein, AMIN002286, which has been shown to be involved in response to parasite pathogens in insects [31]. The signals of selection observed in this study are mostly distinct from the main selection signals seen in An. gambiae complex mosquitoes [19], the primary vectors of Plasmodium falciparum in Africa.

Insecticide resistance

We report here variants at known insecticide resistance-associated alleles for each of the four An. minimus populations. Variants occurring at a frequency of more than 2% in at least one of the four populations are reported in the known insecticide-resistance-associated genes Ace1, Rdl, KDR, and GSTe2 (Supplementary Data 2). GSTe2 mutants are present in multiple populations, at a low rate, and there are a few individuals in Thmar Da and Preah Vihear with the Rdl resistance mutation, which is known to confer resistance to cyclodiene insecticides, despite evidence from other studies that species in this region lack this resistance mutation [32]. We did not investigate copy number variation, which is one mechanism by which GSTe2 confers insecticide resistance. These SNP variants indicate variation throughout these insecticide-resistance-associated genes, and though most of these populations do not currently have high rates of validated insecticide resistance-associated mutations, this underlying variation provides the potential for structural and transcriptional events resulting in greater levels of insecticide resistance in An. minimus populations.
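
The reporting rule used for the resistance-associated genes (keep a variant if its frequency exceeds 2% in at least one population) is simple to express in code. The Python sketch below is an assumption-laden illustration: the function name `reportable_variants`, the variant labels and the allele counts are all hypothetical, and chromosome totals are taken as twice the population sizes given earlier for diploid females.

```python
def reportable_variants(counts_by_variant, min_freq=0.02):
    """Keep variants whose alternate-allele frequency exceeds `min_freq`
    in at least one population.

    counts_by_variant maps a variant label to
    {population: (alt_allele_count, total_allele_count)}.
    Returns {variant: {population: frequency}} for the variants kept.
    """
    kept = {}
    for variant, by_pop in counts_by_variant.items():
        # Skip populations with no called alleles to avoid dividing by zero.
        freqs = {pop: alt / total
                 for pop, (alt, total) in by_pop.items() if total > 0}
        if any(f > min_freq for f in freqs.values()):
            kept[variant] = freqs
    return kept


# Hypothetical counts; totals are 2 x n for TD (41), PV (156), RK1 (58), RK2 (28).
example = {
    "variant_A": {"TD": (2, 82), "PV": (0, 312), "RK1": (0, 116), "RK2": (0, 56)},
    "variant_B": {"TD": (0, 82), "PV": (3, 312), "RK1": (0, 116), "RK2": (0, 56)},
}
print(sorted(reportable_variants(example)))
```

With these made-up counts only `variant_A` passes, because 2 of 82 alleles is just above 2% while 3 of 312 is below 1%.
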