    The evolutionary origin of avian facial bristles and the likely role of rictal bristles in feeding ecology

    Samples
We examined 1,022 avian species (~10% of recorded species) in this study, representing 418 genera, from 91 families (37% of recorded families) and 29 orders (73% of all orders). Specimens were from the skin collections of the World Museum Liverpool, Tring Natural History Museum, Manchester Museum and Wollaton Hall Museum, all situated in the United Kingdom. All work was carried out in accordance with ethical regulations at Manchester Metropolitan University and with the permission of all aforementioned museums. Only the best-preserved adult specimens (no signs of cut-off feathers or holes in the skin near the beak) were chosen for this study to ensure accurate measurements of bristle length, shape and presence, which should not be affected by the process of skin removal and specimen conservation. Species were randomly chosen, without targeting our sampling towards species known a priori to have bristles. Where possible, two specimens per species were measured (occurring in 82% of all species examined). Specimens of each sex were measured when present; however, this was not always possible since labelling was often inaccurate or missing. In total, the sample included 508 males, 412 females and 374 individuals of unknown sex. Both sexes were examined in 274 species and there was no difference in the presence of bristles between males and females (n = 97 with bristles present and n = 180 with bristles absent for both males and females). Length (Mann–Whitney U test, W = 37,962, N = 552, P = 0.94) and shape (Chi-square test, χ2 = 0, N = 552, df = 3, P = 1) of rictal bristles also did not significantly differ between males and females. Therefore, rictal bristles are likely to be sexually monomorphic and data for males and females were pooled for further analyses. 
Overall, rictal bristles were absent in 64% of species examined (n = 656) and just over a third of species (n = 366) had bristles present.
Bristle descriptions
Facial bristles were initially identified by sight and touch in each specimen. Bristles were recorded as either present or absent from the upper rictal, lorial, lower rictal, narial and interramal regions (Fig. 1a). We use the term 'rictal bristle' here for bristles on the upper rictal and/or the lorial region, since there was no clear differentiation or morphological difference between the bristles found in these regions, which form a continuum of bristles above the edge of the beak. When present, rictal bristle shape was recorded as: (i) unbranched rictal bristles, (ii) rictal bristles with barbs only at the base ("Base") and (iii) branched rictal bristles ("Branched"), i.e. barbs and barbules present along the bristle rachis (Fig. 1b). The three longest rictal bristles were measured on both sides of the head of each specimen using digital callipers, and these lengths were averaged to provide a mean length of rictal bristles per species. In species lacking rictal bristles, a length of "0" and a shape category of "Absent" were recorded.
Ancestral reconstruction of facial bristle presence
Following Felice et al.19, a single consensus phylogenetic tree was generated from the Hackett posterior distribution of trees from Birdtree.org20 with a sample size of 10,000 post burn-in, using the TreeAnnotator utility in BEAST software21 with a burn-in of 0. Maximum Clade Credibility (MCC) with the option "-heights ca" was selected as the method of reconstruction. The common ancestor trees option (-heights ca) builds a consensus tree by summarising clade ages across all posterior trees. Both the consensus tree and posterior distribution of 10,000 trees were imported into RStudio v. 1.2.5 for R22,23 and pruned so that only species present in the dataset of this study remained in the phylogeny. 
Taxon names were modified where necessary to match those from the ( species record. Negative terminal branches in our consensus tree were slightly lengthened to be positive using 'edge.length[tree$edge.length

    World leaders must step up to put biodiversity deal on path to success

    Pristine ecosystems such as mangrove forests protect against the effects of climate change. Credit: Karine Aigner/Nature Picture Library

    The Paris climate agreement, signed in December 2015, ranks as one of the most momentous global treaties ever negotiated, setting a crucial goal to seek to limit warming to 1.5–2 °C above pre-industrial levels. At the time, the opening ceremony of the COP21 climate-change conference that led to the agreement also held the record for the largest number of world leaders ever to attend a United Nations event in a single day: more than 150. The two things are probably more than coincidence.
Now biodiversity is hoping for its Paris moment. The long-delayed COP15 conference, starting on 7 December in Montreal, Canada, aims to seal a bold new international deal committing countries to precise targets to curb species loss and to protect and restore nature.
Many factors suggest the time is ripe. The problem of biodiversity loss is more prominent than ever before. As ecologist Sandra Díaz wrote in Nature last week, researchers have assembled the strongest evidence base yet ahead of COP15, the Fifteenth Conference of the Parties to the Convention on Biological Diversity (S. Díaz Nature 612, 9; 2022). Initiatives such as the Dasgupta Review, commissioned by the UK government, have made plain that the protection of biodiversity is an economic necessity.
    There is also much greater public awareness of how pollution and habitat destruction threaten the health of ecosystems on which we depend for food, clean water and disease prevention, and a better understanding of nature’s crucial role in mitigating climate change (for example, by storing carbon in soils and trees), as well as in helping us to adapt to its impacts. Mangrove forests, for instance, are hugely effective in stopping influxes of seawater from tsunamis and sea-level rise.
But when it comes to getting stalled negotiations motoring again, the scale of support by world leaders that was a feature of climate’s road to Paris is currently lacking.
Change cannot come too soon. Nature is on the brink. Of 20 decadal targets to preserve nature that were set in Aichi, Japan, in 2010, not a single one had been fully met by 2020. That, coupled with underfunding and lack of regard for the rights of Indigenous peoples who steward much of the world’s remaining biodiversity, means more species than ever are at risk of extinction. Serious impacts on human wealth and health from biodiversity loss loom ever larger. Yet over the past three years, four difficult rounds of negotiations aiming to agree on a framework to replace Aichi have not yielded results. Hundreds of issues remain unresolved.
    Many experts worry that the lacklustre progress made at COP27, the climate summit held last month in Sharm El-Sheikh, Egypt, augurs badly for the biodiversity meeting. But there is also reason for hope. The agreement made at COP27 to establish a ‘loss and damage’ fund to compensate low- and middle-income countries (LMICs) for climate impacts indicates that richer nations are open to talking about funding, which has also been a major sticking point in biodiversity negotiations.
Global funding for biodiversity is severely in the red. A UN estimate published last week suggests that only US$154 billion per year flows to ‘nature-based solutions’ from all sources, including government aid and private investment, a number the UN says needs to triple by 2030. Many LMICs, which are home to much of the world’s remaining biodiversity, would like rich nations to put fresh finance into a new multilateral fund. One option is that such a fund could compensate LMICs for biodiversity loss and associated damages driven by the consumption of products in rich nations through international trade.
A second major sticking point is how to fairly and equitably share the benefits of digital sequence information, that is, genetic data collected from plants, animals and other organisms. Communities in biodiversity-rich regions where genetic material is collected have little control over the commercialization of the data, and no way to recoup financial or other benefits. A multipurpose fund for biodiversity could provide a simple and effective way to share the benefits of these data and support other conservation needs of LMICs.
Another reason to hope for a breakthrough is the forthcoming change in Brazil’s leadership. Conservation organizations such as the wildlife charity WWF have accused the world’s most biodiverse nation of deliberately obstructing previous negotiations, holding up agreement on targets such as protecting at least 30% of the world’s land and seas by 2030. 
But Brazil’s incoming president, Luiz Inácio Lula da Silva, has signalled that the environment is one of his top priorities. Although he does not take over until January 2023, he is thought to be sending an interim team of negotiators to Montreal.
    All negotiators face a Herculean task to get a deal over the line at COP15, with many issues in the text still unresolved and contested. What’s needed above all is global leadership to empower national negotiators to reach a strong deal, including a new fund of some kind for biodiversity. More than 90 heads of state and heads of government have signed a pledge to tackle the nature crisis. At the time of writing, only Justin Trudeau, the host nation’s prime minister, has confirmed that he is to attend in person.
The no-shows send the wrong signal. It’s also true at the time of writing that neither Canada nor China (the original intended host of COP15 and still the meeting’s chair) has issued formal invitations. But leaders have regularly attended climate COPs for more than a decade. This shows in the ambition of climate agreements, if not in their implementation. Research communities and civil society must continue to pressure leaders to engage similarly with the biodiversity agenda. Otherwise, the world risks failing to grasp this opportunity to secure the kind of ambitious deal that nature, and humanity, desperately needs.

    Revealing the global longline fleet with satellite radar

    To estimate the total number of non-broadcasting vessels, including those that were not detected by SAR, we: (1) obtained SAR detections of vessels from RADARSAT-2 and the corresponding vessel lengths as estimated from the SAR image; (2) processed a global feed of AIS data to identify every broadcasting vessel that should have appeared in the SAR images at the moment the images were taken; (3) developed a novel technique to determine which vessels in AIS matched to detections in SAR, which AIS vessels were not detected by SAR, and which SAR detections represented non-broadcasting vessels; (4) after matching SAR to AIS, (a) modelled the relationship between a vessel’s actual length and the length as estimated by the SAR image (Fig. 3b) and (b) modelled the relationship between the likelihood that a vessel is detected and its length (Fig. 3a); and (5) finally, combined these relationships to develop an estimate of the number and lengths of non-broadcasting vessels in the region.
SAR imagery and vessel detections
Working with the satellite company Kongsberg Satellite Services (KSAT), we tasked the Canadian Space Agency’s satellite RADARSAT-2 to acquire SAR images from its ship detection mode (DVWF mode, GRD product), with a pixel size of about 40 m and a swath width over 400 km (19). These images were processed following standard procedures for GRD products (e.g. applying radiometric calibration and geometric corrections)29,30. Vessel locations were extracted from the images with a widely used ship detection algorithm, which discriminates objects at sea based on the backscatter difference (pixel values) between the sea clutter and the targets31. Vessel lengths were estimated by measuring distances directly on the images with the aid of a graphical user interface tool31.
Identifying vessels using AIS
In each region, AIS data, obtained from satellite providers ORBCOMM and Spire, were processed using Global Fishing Watch’s data pipeline1. 
The identities and lengths of all AIS devices that operated near the SAR scenes in both space and time were first obtained using Global Fishing Watch’s database1. To be sure vessels were identified correctly, two analysts reviewed the tracks of every AIS device in each region.
In both regions, it is common practice for fishers to put AIS beacons on their longlines, likely to aid in retrieving them, meaning that many AIS devices were longline gear and not vessels. Because gear outnumbered vessels by several-fold, it was critical to differentiate gear and fishing vessels. We detected 521 unique AIS devices associated with gear that were likely within the SAR scenes in the Indian Ocean, and 390 in the Pacific. Transponders were determined to be associated with gear by inspecting the name broadcast in the AIS messages (gear frequently broadcasts one of several standard names and/or a voltage reading) and by classification using the Global Fishing Watch vessel classification algorithm1. Most gear also had an MMSI number (unique identifier number for AIS) that started with 1, 8, or 9 or broadcast names that signified gear. We eliminated all gear from the analysis because (1) these gear buoys have reflectors that are only ~1 m in size, and they should not be visible in ~40 m resolution SAR images, and (2) we found that gear matched to SAR detections only when traveling faster than 2 knots (and thus was on the deck of a boat); of 159 instances of gear in scenes where the gear was traveling slower than 2 knots, zero matched to a radar detection (Fig. S9).
Generating probability rasters for matching AIS to SAR
Most AIS positions did not correspond to the exact time when the SAR images were taken. 
Hence, to determine the likelihood that a vessel broadcasting AIS corresponded to a specific SAR detection, we first developed probability rasters of where a vessel was likely to be minutes before or after a GPS position was recorded (Figs. S1, S2). We mined one year of global AIS data, including roughly 10 billion GPS positions, and computed these rasters for six different vessel classes (trawlers, purse seines, tug, cargo or tanker, drifting longlines, and others) and considered six different speeds (1, 3, 5, 7, 9, and 12.5 knots) and 36 time intervals (−448, −320, −224, −160, −112, −80, −56, −40, −28, −20, −14, −10, −7, −5, −3.5, −2.5, −1.5, −0.5, 0.5, 1.5, 2.5, 3.5, 5, 7, 10, 14, 20, 28, 40, 56, 80, 112, 160, 224, 320, and 448 min).
For example, we queried a year of AIS data to find every example of where a tugboat had two positions that were 10 min apart from one another when the vessel had been traveling at 10 knots at the first position. We then recorded each of these locations relative to the location the vessel would have been if it traveled in a straight line, with x coordinates being in the direction of travel and the y coordinates being perpendicular to the direction of travel. When collected for hundreds of thousands of examples across the AIS dataset, the result is a heatmap of where tugboats are located 10 min after a position at which they were traveling at 10 knots. The raster is centered on a point that is the extrapolated position of the vessel based on its speed. For instance, the purse seine raster that corresponds to a vessel traveling between 6 and 8 knots between 96 and 128 min after the most recent position is centered at a point that is 13.1 nautical miles (about 24.2 km; 7 knots × 112 min) straight ahead of the direction the vessel was traveling. 
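The relative-coordinate bookkeeping described above can be illustrated with a minimal Python sketch. It assumes a flat-earth approximation with displacements already expressed in kilometres east and north of the first position; the function names and the starboard-positive sign convention for y are hypothetical choices made for illustration, not taken from the authors' pipeline.

```python
import math

KM_PER_NM = 1.852  # kilometres per nautical mile (exact by definition)


def extrapolated_distance_km(speed_knots, minutes):
    """Distance covered at constant speed over the interval, in km."""
    return speed_knots * (minutes / 60.0) * KM_PER_NM


def relative_offset(speed_knots, course_deg, minutes, east_km, north_km):
    """Offset of an observed later position from the extrapolated point.

    east_km, north_km: observed displacement from the first position.
    Returns (x, y) in km: x along the direction of travel, y perpendicular
    to it (assumed positive to starboard).
    """
    theta = math.radians(course_deg)  # course measured clockwise from north
    along = (math.sin(theta), math.cos(theta))   # unit vector (east, north)
    cross = (math.cos(theta), -math.sin(theta))  # 90 degrees clockwise of it
    d = extrapolated_distance_km(speed_knots, minutes)
    # Residual between the observed position and the straight-line position.
    res_e = east_km - d * along[0]
    res_n = north_km - d * along[1]
    x = res_e * along[0] + res_n * along[1]
    y = res_e * cross[0] + res_n * cross[1]
    return x, y


# Raster centre for the purse seine example: 7 knots sustained for 112 min.
print(round(extrapolated_distance_km(7, 112), 1))
```

Accumulating many such (x, y) offsets into a two-dimensional histogram, one per vessel class, speed and time interval, would give a heatmap of the kind the text describes.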
Figure S1 shows samples of these rasters for different vessels.
We built rasters of 1000 by 1000 pixels for each vessel class and time interval, with the area covered by the raster dependent on the time interval (longer time intervals imply longer traveled distances, covering more area). The scale of each pixel was given by:
$$\text{pixel width} = \max(1, \Delta m) / 1000 \tag{1}$$
    where Δm is the time interval in minutes, and pixel width is measured in km. Thus, if Δm is under one minute, the entire raster is one kilometer wide with each pixel one meter by one meter. If the time is 10 min, then each pixel is 10 m wide, and the entire raster is 10 km by 10 km.
Since the pixel width varies between rasters, the units of the rasters are probability per km², thus summing the area of each pixel times its value equals one. Six vessel classes with 36 time intervals for each and six speeds led to 1296 different rasters. This probability raster approach could be seen as a utilization distribution32 (for each vessel class, speed and time interval) where the space is relative to the position of the individual.
Combining probability rasters to produce a matching score
For a few vessels (~4%) there was only one AIS position available before or after the scene. This resulted from a long gap in the AIS data due to poor reception, a weak AIS device, or cases where the vessels disabled their AIS. For these vessels, we used the raster values for a single position. For the vast majority of vessels, however, there was a GPS position right before and after the scene, and thus two probability rasters. We used two methods to combine these probability rasters to obtain information about the most likely location:
Multiply and renormalize the rasters
To multiply the rasters, we interpolated the raster values, using bilinear interpolation, to a constant grid at the highest resolution between the before and after rasters. Then, we multiplied the values at each point and renormalized the resulting raster (Fig. S2):
$$p_{i} = \frac{p_{ai} \cdot p_{bi}}{\sum_{k=0}^{N} p_{ak} \cdot p_{bk} \cdot da} \tag{2}$$
    where \(p_{i}\) is the probability in vessel density per km² at location i, \(p_{ai}\) is the value of the raster before the image, and \(p_{bi}\) is the value of the raster after the image. The denominator is the sum of all multiplied values across the raster, scaled by the area of each cell, \(da\).
Weight and average the rasters
For this method, we weighted each raster by the sum of its squared probability values. This has the effect of giving the more concentrated raster a higher weight, thus weighting higher the raster that is closer in time to the image:
$$w_{a} = \sum_{k=0}^{N} p_{ak}^{2} \cdot da \tag{3}$$
    and the weighted average at location i is:
$$p_{i} = \frac{p_{ai} \cdot w_{a} + p_{bi} \cdot w_{b}}{w_{a} + w_{b}} \tag{4}$$
    where wa is the weight for raster a, wb the weight for raster b (calculation analogous to wa’s in Eq. 3), pi is the probability in vessel density per km2 at location i.To determine whether we should multiply (Eq. 2) or average (Eq. 4) the probabilities, we compared the performance of these two metrics against a direct inspection of the detections. We found that at short intervals, multiplying the rasters and renormalizing often made probability values extremely small ( {d}_{d}cdot {p}_{d} + {p}_{f}$$
    where \(p_{v}\) is the probability density of the vessel presence at the location of the SAR detection (the score listed above), \(p_{d}\) is the probability that the vessel is detected by SAR, \(d_{d}\) is the density of non-broadcasting vessels in the region, and \(p_{f}\) is the density of false detections in the scene. The greater \(d_{d}\), the more dark vessels there are in a scene, and the more likely it is that any given detection is a dark vessel instead of a vessel broadcasting AIS. The right-hand side of the equation, \(d_{d} \cdot p_{d} + p_{f}\), should roughly equal the number of detections per unit area that do not match to AIS in the region. In other words, the probability of the vessel with AIS being at that specific location and detected by SAR (left side of the equation) should be greater than the probability of a dark vessel or a false detection at that location (right side of the equation).
The total number of unmatched vessels in each studied region normalized by total area covered gives a density of non-broadcasting vessels of 2.6–2.8 × 10⁻⁵ vessels km⁻² (Indian Ocean) and 6.8–7.2 × 10⁻⁶ vessels km⁻² (Pacific Ocean), similar to the thresholds estimated by analysts. For the most likely number of matched vessels, we use a threshold that is halfway between the higher and lower bound of the analyst (5 × 10⁻⁵ to 1 × 10⁻⁴), 2.5 × 10⁻⁵, which is also roughly equal to the theoretical estimate of the Indian Ocean.
This threshold approach performed significantly better than a metric based on the distance between the SAR detection and the most likely location of the vessel, where the likely location is based on extrapolating speed and course of the position closest in time to the image (Fig. S4).
Determining whether a vessel with AIS was within a scene
Vessel positions from AIS are usually available before and/or after the SAR images, and sometimes it is unclear if a vessel should have been within the scene footprint at the time of the image.
To estimate the probability that a vessel (with AIS) was within a scene, we used the multiplied probability raster, summing the values inside the scene boundaries. This provides an estimate of the likelihood that the vessel was within the scene footprint at the time of the image. We applied this to every vessel that had at least one AIS position within 12 h and 200 nautical miles of the scene footprint. The vast majority of vessels were either very likely inside or outside the scene footprints, with 516 vessels having a probability of > 95% and only 16 having a probability between 5 and 95%. We filtered out all vessels that were definitely outside of the image footprint before matching.
Estimating the likelihood of detecting a vessel with SAR
The AIS data show that not all vessels broadcasting AIS were captured by the RADARSAT-2 images (Fig. 3a). Using the known lengths of detected vessels with AIS, we estimated the likelihood of detecting a vessel with SAR as a function of vessel length (Fig. 3a). For vessels shorter than 60 m, we approximated the detection rate as a linear function. Treating each vessel as an individual detection, we fitted the 50th percentile using quantile regression to approximate the detection rate. For vessels above 60 m, we assumed a constant detection rate as very few vessels above this length did not show up in the SAR images. Of the 46 unique vessels larger than 62 m, 42 were detected, implying a detection rate of ~91%. 
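The two raster-combination rules (Eqs. 2 and 4) and the in-scene probability described above can be sketched in plain Python. This is a minimal illustration that assumes both rasters have already been interpolated onto the same grid with a constant cell area; the function names and the boolean scene mask are hypothetical and are not part of the authors' code.

```python
def multiply_and_renormalize(p_a, p_b, cell_area):
    """Eq. 2: cell-wise product, rescaled so density times area sums to one."""
    prod = [[a * b for a, b in zip(ra, rb)] for ra, rb in zip(p_a, p_b)]
    total = sum(sum(row) for row in prod) * cell_area
    if total == 0:
        raise ValueError("rasters do not overlap; product is zero everywhere")
    return [[v / total for v in row] for row in prod]


def concentration_weight(p, cell_area):
    """Eq. 3: sum of squared densities times the cell area."""
    return sum(v * v for row in p for v in row) * cell_area


def weighted_average(p_a, p_b, cell_area):
    """Eq. 4: average of the two rasters, weighted by their concentration."""
    w_a = concentration_weight(p_a, cell_area)
    w_b = concentration_weight(p_b, cell_area)
    return [[(a * w_a + b * w_b) / (w_a + w_b) for a, b in zip(ra, rb)]
            for ra, rb in zip(p_a, p_b)]


def probability_in_scene(p, in_scene, cell_area):
    """Sum density times area over the cells flagged as inside the scene."""
    return sum(v * cell_area
               for row, mask_row in zip(p, in_scene)
               for v, inside in zip(row, mask_row) if inside)


# Toy 2 x 2 grid with 0.25 km2 cells; both rasters integrate to one.
before = [[4.0, 0.0], [0.0, 0.0]]   # concentrated: close in time to the image
after = [[1.0, 1.0], [1.0, 1.0]]    # diffuse: far in time from the image
scene = [[True, False], [False, False]]

multiplied = multiply_and_renormalize(before, after, 0.25)
averaged = weighted_average(before, after, 0.25)
print(probability_in_scene(multiplied, scene, 0.25))
print(probability_in_scene(averaged, scene, 0.25))
```

In this toy case the concentrated raster dominates the weighted average, which is the behaviour the weighting in Eq. 3 is meant to produce.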
Given that it is highly likely that large vessels will be captured by medium-resolution SAR imagery, we manually reviewed these cases to confirm that they were (almost surely) inside the scene footprints at the time the images were taken.
We should note that the probability of detecting a vessel in SAR also depends on the sea state, incidence angle, polarization, material of the vessel, and orientation of the vessel. We are unable, however, to measure these effects directly so we cannot explicitly model these effects.
With sufficient scenes, these effects should be randomly distributed across our scenes, so they likely account for some of the variability in detectability and the inaccuracy in our length estimates from SAR.
Estimating the number and length of non-broadcasting vessels
Because SAR does not detect all vessels, and because the length as estimated by SAR can be incorrect, there are many possible distributions of actual non-broadcasting vessels that could have produced the distribution of unmatched SAR detections that we found in the scenes. To estimate the most likely such distribution, we built a model to combine the two key relationships: between vessel length and likelihood of detection, and between vessel length and the length as estimated by SAR. This model allowed us to estimate, based on the number and distribution of SAR vessels, the likely number and distribution of actual vessels present (Fig. 3c,d).
We binned the likelihood of vessel detection as a function of length into 1 m intervals, yielding a vector \(\alpha\) of length 400. We also binned into 1 m intervals the population of lengths of all detected vessels (\(\ell_{D}\)) as reported by AIS (i.e. number of vessels at each length bin), the population of expected SAR lengths (\(\ell_{E}\)), and the population of lengths of all vessels (\(\ell_{A}\), the quantity we wish to estimate). Thus, \(\ell_{D}\) can be expressed as the product of \(\alpha\) and \(\ell_{A}\):
$$\ell_{D} = \alpha \odot \ell_{A} \tag{6}$$
    where \(\odot\) is the element-wise product. We then estimated a matrix \(L\) that relates \(\ell_{D}\) to \(\ell_{E}\):
$$\ell_{E} = L \ell_{D} \tag{7}$$
    where each element \(L_{ij}\) represents the probability that a vessel with length in bin j would be estimated by SAR to be of length in bin i. We calculated these probabilities as lognormal probability density functions, with one distribution per column. To estimate the scale and shape parameters of these distributions, we first fitted a quantile regression using the (non-binned) lengths from AIS of detected vessels as the predictor for the lengths reported by SAR. Assuming that the predicted 1/3 and 2/3 quantiles (as shown in Fig. 3a) represent the quantiles of a lognormal distribution allowed us to calculate the shape and scale parameters. We chose a lognormal distribution because: (1) the variable of interest, length, was always greater than zero, (2) the population of lengths was skewed towards larger values, and (3) there is an explicit and relatively simple relationship between the lognormal quantiles and the shape and scale parameters that simplified the calculations.
Combining Eqs. (6) and (7) provides a relation between \(\ell_{A}\) and \(\ell_{E}\):
$$\ell_{E} = L\left( \alpha \odot \ell_{A} \right) \tag{8}$$
    To estimate \(\ell_{A}\) we minimized an objective function \(O(\ell_{E}, \ell_{O})\) between the vector of expected counts binned by length (\(\ell_{E}\)) and the vector of counts observed in SAR binned by length (\(\ell_{O}\)). For this objective function, we chose the sum of the Kolmogorov–Smirnov distance between length distributions and the squared difference of the total numbers of detections. The first term controls the shape of the resulting distribution while the second one controls the magnitude. Specifically:
$$O\left( \ell_{E}, \ell_{O} \right) = \max\left( \left| C_{E} - C_{O} \right| \right) + \left( T_{E} - T_{O} \right)^{2} \tag{9}$$
    where:
$$T_{x} = \sum \ell_{x}$$
$$D_{x} = \ell_{x} / T_{x}$$
$$C_{x} = \operatorname{cumsum}\left( D_{x} \right)$$
Assessing the uncertainty in the estimation
To test how accurately our approach predicts the correct number of vessels, we performed a bootstrap simulation. We computed the vector \(\alpha\) and the matrix \(L\) from a random subset of vessels with AIS that had a high confidence (> 95%) of appearing within the scenes. We then used our method on the SAR detections that matched the remaining vessels to predict the number of vessels they corresponded to (\(\ell_{A}\)). By running 10,000 experiments we found a mean absolute percent error of ± 9% (Figs. S5 and S6). This provides a rough estimate of the uncertainty in our prediction due to the estimation process itself. We used the distribution of these samples to estimate the 90% confidence interval that we report with our estimates. We note that this uncertainty refers to the parametrization of the model and there may be other sources of error, such as the possibility that vessels without AIS have different radar properties (e.g. made out of materials with different reflectiveness), that we did not account for in our model.
Catch and effort data in the overlapping area between WCPFC and IATTC
We downloaded gridded effort and catch data from the WCPFC and IATTC websites, and compared the reported number of hooks and catch from September to December of 2019 for the area between −140 to −150 longitude and −5 to −15 latitude, a bounding box that contains our study region in the Pacific and which is entirely within both the WCPFC and IATTC convention zones. We found that the reported number of hooks for Korea is three times higher for the IATTC than it is for the WCPFC (Fig. S7), and the numbers of hooks also disagree by more than 10% for most other flag states. Catch for Korea is also 2.5 times higher for IATTC than for WCPFC, with catch also differing by more than 10% for most other flag states. 
This finding suggests that the different RFMOs may not be accounting for the same vessels in the overlap region between the two RFMOs.
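The length-distribution model from the section on estimating the number and length of non-broadcasting vessels (Eqs. 6 to 8 and the objective function) can be sketched as follows. This is a minimal illustration under stated assumptions: vectors and the matrix are plain Python lists, the function names are hypothetical, the shape parameter is taken to be the log-space standard deviation and the scale parameter the median, and the minimization step itself is left out because the source does not say which optimizer was used.

```python
from itertools import accumulate
from math import exp, log
from statistics import NormalDist


def lognormal_shape_scale(q_one_third, q_two_thirds):
    """Shape (sigma) and scale (median) of a lognormal from two quantiles.

    Uses ln q_p = mu + sigma * z_p; the 1/3 and 2/3 quantiles are symmetric
    in log space, so mu is the mean of the two log-quantiles.
    """
    z = NormalDist().inv_cdf(2.0 / 3.0)  # standard normal 2/3 quantile
    mu = (log(q_one_third) + log(q_two_thirds)) / 2.0
    sigma = (log(q_two_thirds) - log(q_one_third)) / (2.0 * z)
    return sigma, exp(mu)


def expected_counts(L, alpha, ell_a):
    """Eq. 8: expected SAR counts per bin, L (alpha * ell_A)."""
    detected = [a * n for a, n in zip(alpha, ell_a)]           # Eq. 6
    return [sum(l_ij * d_j for l_ij, d_j in zip(row, detected))
            for row in L]                                      # Eq. 7


def objective(ell_e, ell_o):
    """Kolmogorov-Smirnov distance plus squared difference of totals."""
    t_e, t_o = sum(ell_e), sum(ell_o)
    c_e = list(accumulate(v / t_e for v in ell_e))  # cumulative shares
    c_o = list(accumulate(v / t_o for v in ell_o))
    ks = max(abs(a - b) for a, b in zip(c_e, c_o))
    return ks + (t_e - t_o) ** 2


# Toy example with three length bins and a perfect SAR length estimate.
L = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
alpha = [0.5, 0.5, 1.0]        # detection likelihood per bin
ell_a = [10.0, 4.0, 2.0]       # candidate true counts per bin
ell_e = expected_counts(L, alpha, ell_a)
print(ell_e, objective(ell_e, [5.0, 2.0, 2.0]))
```

A candidate \(\ell_{A}\) that reproduces the observed SAR counts exactly drives the objective to zero; in practice a numerical optimizer would search over \(\ell_{A}\) to minimize it.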

    I lure tarantulas from their burrows (for science)

    As part of my PhD thesis at Colorado State University in Fort Collins, I study the Texas brown tarantula (Aphonopelma hentzi) in the short-grass prairie. My colleagues and I work on the Southern Plains Land Trust, a piece of private conservation land about an hour south of Lamar, Colorado. These tarantulas’ habitats range from Louisiana to this southern part of Colorado. The prairie is a harsh environment: super dry, windy and sometimes very hot or cold. The tarantulas’ burrows become their lifeline; they stay in there for the long haul. Only the males, once mature, leave their burrows to wander aimlessly, looking for love.
Tarantulas are ambush predators, meaning that they wait for food to walk by. We want to know if they build burrows in a consistent way, and how their burrows help them to survive the prairie’s harsh environment.
We lure the tarantulas out of their burrows using a piece of grass, and then we collect them with a one-litre plastic cup. We pour quick-set plaster of Paris into the burrow. Once it’s dry, we dig out the cast. The first one, which I’m holding here, turned out to be 60 centimetres deep. This does destroy the burrow, but we dig the tarantula a new starter burrow nearby.
The casts show us that some spiders are very clean and keep their burrows empty, whereas others are trashy, keeping previous moults or leftovers from eaten beetles. One of the burrows looked as if it had been borrowed from a much bigger animal. That is high-end lazy.
About 90% of US prairies are gone because of agriculture and ranching. We strive to preserve the prairie and the creatures in it. Tarantulas serve as a force for keeping insect and even rodent populations under control in the prairie ecosystem. Tarantulas are big, but they won’t hurt you. Want fewer insects? Let spiders live in your house. They’re in your bathtub only because they are thirsty.

    Fungivorous mites enhance the survivorship and development of stingless bees even when exposed to pesticides

    Assessment of suitable habitat of mangrove species for prioritizing restoration in coastal ecosystem of Sundarban Biosphere Reserve, India

    An iterative and interdisciplinary categorisation process towards FAIRer digital resources for sensitive life-sciences data

    The categorisation system was developed through an iterative procedure including a careful evaluation at each stage. This was necessary because each of three rounds yielded substantial feedback from the expert taggers, identifying issues to be resolved and proposing improvements to the system. This process led to a much clearer understanding of the structure of sensitive data resources and a wider agreement on definitions to be applied in the tagging process. In addition, the allocation of exactly one tag per category improved during the development for many categories, indicating that the selection process was straightforward for most resources and categories. As a result, the categorisation system could be simplified and the structure improved, appropriately representing a trans-disciplinary effort. This may also be important from the user perspective. Ultimately, the system should be so intuitive that users searching for terms follow the same logic as the experts who entered the tags.
To be beneficial for the domain of LS, the categorisation system and the toolbox require broad community approval38,39. In the project, we began the approval process with nominated experts from 6 LS RIs, embedded in a larger working group of the H2020-funded project EOSC-Life, covering 13 LS RIs. Though this can be seen as a useful starting point, the toolbox obviously needs community approval at a much larger scale. As the categorisation system specifies a part of the essential metadata for resources about sensitive data, it will be relevant to the FAIR Digital Objects (FDO) Forum for a « resources in the life sciences » FDO. The categorisation system can be used to derive FDO attributes and values for such FDOs. 
FDOs for the sensitive data itself, for cases where levels of sensitivity and specific access protocols need to be specified, are an interesting possible extension, and the categorisation system could serve as backbone information for access governance and technical choices. FDOs are to be “machine actionable”, so desirable mappings between different categorisation systems will be operationalisable. New European projects such as FAIRCORE4EOSC, FAIR-IMPACT and other projects working on pragmatic semantic improvements for the application of FAIR will provide possibilities for registering metadata schemas and mappings that should reuse interdisciplinary approaches in the heterogeneous field of life sciences.

    The RDA has established and is maintaining a Metadata Standards Catalogue (MSC)5. An appropriate goal for the categorisation system would be inclusion in this catalogue, after further refinement and alignment with other vocabularies addressing sensitive data in the life sciences. In any case, the work on the categorisation system can contribute to discussions on methodologies for aligning metadata schemas across scientific domains, while the categorisation system itself can be seen as an important contribution to the process of developing the most useful and appropriate cross-disciplinary terms and categories for describing sensitive data. We keep in mind that similar approaches have been applied via long and iterative processes in other scientific domains, such as understanding and predicting the evolution of climate (essential climate variables) and mapping and monitoring species populations (essential biodiversity variables)40. There are biases and gaps in the existing system that need to be tackled in the future. The initial content of the toolbox demonstrator, consisting of 110 resources related to sensitive data, has been primarily selected by four RIs with a focus on clinical and biomedical research (BBMRI, EATRIS, ECRIN, Euro-Bioimaging).
Other areas and sensitive data types, such as environmental, classified, and proprietary data, are under-represented, as are some disciplines, such as zoology, ecology, plant and mycological sciences, and microbiology. This indicates a need for broader coverage of resources linked to sensitive data in future work. Another question that needs to be investigated is how interoperable the categorisation system is with other domains outside the LS that systematically deal with sensitive data, for example, the Social Science and Humanities41. In addition, systematic data on the usability and user-friendliness of the toolbox from a broad sample of potential users from the life sciences are needed. Initial and informal evaluation of these aspects by the experts involved so far has been very positive but is clearly limited in scale and needs to be supplemented by more evidence.

    There are major challenges to the sharing of sensitive data, including interoperability, accessibility, and governance. The primary objective of the toolbox is to improve discoverability of resources and digital objects linked to the sharing and re-use of sensitive data (F in FAIR)4. The systematic application of a standardised typology for resources about sensitive data, as defined by the categorisation system, helps to better structure and organise the issues, and results in metadata enrichment (F4, R1.3 of the FAIR principles in Supplementary Table S1). The toolbox alone will not be enough for the ‘I’ of the FAIR principles, but it may become a useful backbone for building more interoperable classification systems for sensitive data resources.

    It is perhaps more common to base findability on a tagging system using keywords (plus title text). That is, for example, how PubMed works: it does not categorise resources, it adds MeSH terms to them. Another option would have been to try to derive keywords from text or title.
In our case, a categorisation system with pre-defined dimensions and pre-listed tags was preferred by the expert group. Keywords, in isolation, suffer from several disadvantages:


    A range of equivalent terms may be used to mean the same thing – making searching for that concept difficult, requiring multiple ‘Or’ statements.


    They may have multiple meanings (polysemy) especially if “drawn from”, or “applied to”, a wide range of scientific disciplines.


    The different aspects of the resource covered by keywords, i.e., the types or dimensions of keyword applied, may be inconsistent and / or incomplete.
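The first disadvantage in the list above can be made concrete with a minimal sketch. The resource identifiers, keywords, and synonym mapping below are hypothetical and are not taken from the toolbox; the terms are treated as equivalent purely for illustration. The sketch shows that a free-keyword search needs an 'Or' over every equivalent term, whereas a single lookup suffices once the terms have been mapped to one pre-defined tag.

```python
# Hypothetical resources with free-text keywords (illustrative only).
RESOURCES = {
    "r1": {"biobank", "imaging"},
    "r2": {"biorepository", "genomics"},
    "r3": {"biobanks", "consent"},
    "r4": {"consent", "imaging"},
}

# Hypothetical mapping from equivalent free-text terms to one controlled tag.
SYNONYM_TO_TAG = {
    "biobank": "biobank",
    "biobanks": "biobank",
    "biorepository": "biobank",
}


def keyword_search(resources, terms):
    """Return sorted ids of resources matching ANY of the terms (an 'Or' query)."""
    wanted = set(terms)
    return sorted(rid for rid, keywords in resources.items() if keywords & wanted)


def normalise(resources, mapping):
    """Replace each keyword by its controlled tag when the mapping has one."""
    return {
        rid: {mapping.get(keyword, keyword) for keyword in keywords}
        for rid, keywords in resources.items()
    }


# One free keyword misses the resources that use an equivalent term.
single_term = keyword_search(RESOURCES, ["biobank"])

# Free keywords need every equivalent term spelled out in the query.
or_query = keyword_search(RESOURCES, ["biobank", "biobanks", "biorepository"])

# After normalisation, one standardised tag finds the same resources.
tagged = keyword_search(normalise(RESOURCES, SYNONYM_TO_TAG), ["biobank"])
```

In this sketch the single-term query finds only one of the three relevant resources, while the 'Or' query and the normalised lookup return the same set.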

    The categorisation system, on the other hand, guarantees that all 7 validated dimensions required are used in the tagging process and that the tags selected are standardised and defined. The toolbox categories also aid browsing of results by enabling sequential filtering using the categories and tags.

    In addition, there is a useful link between developing community-approved categories for metadata, in this case for characterising resources dealing with sensitive data, and community-understood (but implicit) ontologies used in the same area. Categories and ontologies can complement each other: without a common underlying ontology, metadata terms can be interpreted inconsistently, and without defined metadata categories, ontologies may remain implicit and inconsistent. We found, for example, that discussions on the best categorisation to use for scientific disciplines, or data types, exposed the implicit (and different) ontologies being used by different people, which were based on the personal views of those in the group. These views were clearly rooted in and/or influenced by the language and working assumptions of their discipline(s), and by their roles and experiences (current and previous). That will be more and more the case with interdisciplinary research development and development in research careers. Developing categories in metadata can therefore play an important role in describing, understanding and, ultimately, harmonising the implicit ontologies scientists use in thinking about the area of sensitive data.

    In the development of the categorisation system, existing ontologies, classifications, and terminologies were taken into consideration (Table 2). However, many more have relationships to the categorisation system. An example is the Subject Resource Application Ontology (SRAO), an application ontology describing subject areas/academic disciplines used within FAIRsharing records by curators and the user community42.
A first crosswalk has demonstrated considerable agreement between the toolbox category “research field” and subsections of SRAO42 and EDAM15. The toolbox has been registered as a resource (database) at FAIRsharing, a curated, informative, and educational resource on data and metadata standards, inter-related to databases and data policies. It is planned to create a collection group of resources (standards, databases, policies) in FAIRsharing linked to the toolbox and the underlying categorisation system. This will also cover relationships to ontologies and classifications.

    There is a need to explore the applicability of the toolbox to specific domains. One example could be the European Joint Programme on Rare Diseases (EJP RD), where resources are made progressively FAIR at the record level to support innovative basic, translational and clinical research. The goal is to identify, refine and expose core standards for dataset interoperability, asset (data, sample, subject) discovery, and responsible data sharing, concentrating on data-level rather than resource-level information. Knowledge exchange between EJP RD and the toolbox could be of benefit in exploring the complementarity of both approaches in adequately characterising resources linked to sensitive data and thus improving data discoverability.

    The first pilot study demonstrated major variation in the tagging of resources when independent taggers assess the same resource (inter-observer variation). The example of BBMRI has shown that this variation can be considerably reduced if adequate training is performed, which in turn is resource-intensive. Thus, to arrive at a valid and reliable tagging process, adequate training and support are needed to reduce inter-observer variation.
Specific training sets and training programs as well as intercalibration tools need to be developed, implemented and approved by the community.

    Another option to be explored is the use of AI or ML algorithms to support automatic (or at least semi-automatic) tagging of resources. It is not easy to use AI/ML in this field due to multilingualism and the misinterpretation of terms. Terms often have different meanings in different scientific disciplines, and a common backbone for the application of AI/ML is difficult to achieve. It is necessary to come to a common understanding between the people involved in the assessment of resources related to sensitive data across all life sciences. Nevertheless, the toolbox can become of major importance for research and application of AI/ML techniques in this field. It may help AI/ML to better find resources in the field by serving as a kind of gold standard for comparison. Another promising approach would be to consider a knowledge graph as an intelligent representation. For the categorisation system, the approach could be used to interlink categories to a resource (e.g., “source related to sensitive data” has “geographical scope”) and to link individual tags between categories where possible (e.g., “clinical research data” result from “clinical research”). This would give a richer representation of the knowledge behind the categorisation system and the option of integration into existing approaches (e.g., OpenAIRE). Therefore, we will consider knowledge graphs as an intelligent knowledge representation of the categorisation system in the future.

    A major challenge will be the transition of the toolbox demonstrator to a mature toolbox and ultimately its maintenance, extension, and sustainability. Development of the toolbox demonstrator has been financed by EOSC-Life, but this project will end in 2023.
Discussion on sustainability has been initiated with several life-science infrastructures (e.g., BBMRI, EATRIS, ECRIN and ELIXIR, another European Life-Science Infrastructure). Key aspects of sustainability that need to be considered are maintenance of the toolbox portal and tagging tool and of the toolbox content including expert time for tagging as well as human resources to maintain the system. Different approaches are under evaluation: an organization considering the resource core to its operations and taking full responsibility, or a joint ownership across multiple organisations (e.g., multiple RIs) or a community taking responsibility, either funded by future grants or through in-kind contributions from motivated research parties/individuals. Further costs to be covered will include system maintenance, input from a toolbox manager, tagging of resources by experts, as well as advertisement to the envisioned user groups, hardware costs and costs for debugging and major extension of functionality if needed.
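The inter-observer variation discussed above for the pilot study can be quantified once two taggers have each assigned one tag per resource within a category. The text does not state which agreement statistic was used, so the following is only a minimal sketch that assumes Cohen's kappa, a standard chance-corrected agreement measure for two raters. The function name, the tag lists, and the "imaging" tag are hypothetical.

```python
from collections import Counter


def cohen_kappa(tags_a, tags_b):
    """Cohen's kappa for two taggers who each assign one tag per resource.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and
    p_e is the agreement expected by chance from each tagger's tag frequencies.
    """
    if len(tags_a) != len(tags_b) or not tags_a:
        raise ValueError("both taggers must tag the same non-empty set of resources")
    n = len(tags_a)
    observed = sum(a == b for a, b in zip(tags_a, tags_b)) / n
    count_a = Counter(tags_a)
    count_b = Counter(tags_b)
    expected = sum(count_a[tag] * count_b[tag] for tag in count_a) / (n * n)
    if expected == 1.0:
        # Both taggers used a single identical tag throughout; kappa is
        # undefined here, and returning 1.0 is a convention chosen for this sketch.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical tags assigned by two taggers to the same four resources.
tagger_a = ["clinical research", "clinical research", "imaging", "imaging"]
tagger_b = ["clinical research", "imaging", "imaging", "imaging"]

kappa = cohen_kappa(tagger_a, tagger_b)  # observed 0.75, expected 0.5, kappa 0.5
```

Computing such a value before and after a training round would be one way to check whether training reduces the variation; a value of 1 means perfect agreement and 0 means agreement no better than chance.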

    Soil qualities and change rules of Eucalyptus grandis × Eucalyptus urophylla plantation with different slash disposals

