Geographical drivers and climate-linked dynamics of Lassa fever in Nigeria
We analyse weekly reported counts of suspected and confirmed human cases and deaths attributed to LF (as defined in Supplementary Table 1), between 1 January 2012 and 30 December 2019, from across the whole of Nigeria. The weekly counts were reported from 774 LGAs in 36 Federal states and the Federal Capital Territory, under Integrated Disease Surveillance and Response (IDSR) protocols, and collated by the NCDC. All suspected cases, confirmed cases and deaths from notifiable infectious diseases (including viral haemorrhagic fevers; VHFs) are reported weekly to the LGA Disease Surveillance and Notification Officer (DSNO) and State Epidemiologist (SE). IDSR routine data on priority diseases are collected from inpatient and outpatient registers in health facilities, and forwarded to each LGA’s DSNO using SMS or paper form. Subsequently, individual LGA DSNOs collate and forward the data to their respective SE, also by SMS and paper form, for weekly and monthly reporting respectively to NCDC. From mid-2017 onwards, data entry in 18 states has been conducted using a mobile phone-based electronic reporting system called mSERS, with the data entered in a customised Excel spreadsheet, from which they are manually keyed into NCDC-compatible spreadsheets. Data from this surveillance regime (WERs) were collated by epidemiologists at NCDC throughout the period 2012 to March 2018 (Supplementary Fig. 1).
Throughout the study period, within-country LF surveillance and response has been strengthened under NCDC coordination [2,20,33]. LGAs are now required to immediately notify the state level of any suspected case; the state in turn reports to NCDC within 24 h and also sends a cumulative weekly report of all reported cases. A dedicated, multi-sectoral NCDC LF TWG was set up in 2016 with the responsibility of coordinating all LF preparedness and response activities across states. 
Further capacity building occurred in 2017 to 2019, with the opening of three additional LF diagnostic laboratories in Abuja (Federal Capital Territory), Abakaliki (Ebonyi state) and Owo (Ondo state) (to a total of five; Fig. 2) and the rollout of intensive country-wide training on surveillance, clinical case management and diagnosis. We note that, due to the rapid expansion in testing capacity, the definition of a suspected case in our data has subtly changed over the surveillance period: from 2012 to 2016, suspected cases include probable cases that were not lab-tested, whereas from 2017 to 2019, all suspected cases were tested and confirmed to be negative.
In addition to the WERs data, since 2017 LF case reporting data have also been collated by the LF TWG and used to inform the weekly NCDC LF Situation Reports (SitRep data; https://ncdc.gov.ng/diseases/sitreps). This regime includes post hoc follow-ups to ensure more accurate case counts, so our analyses use WER-derived case data from 2012 to 2016, and SitRep-derived case data from 2017 to 2019 (see Fig. 1 for full time series). A visual comparison of the data from each separate time series, including the overlap period (2017 to March 2018), is provided in Supplementary Fig. 1, and all statistical models considered random intercepts for the different surveillance regimes. Whereas other studies of recent Nigeria LF incidence have been more spatially and temporally restricted [34,35], the extended monitoring period and fine spatial granularity of these data provide the opportunity for a detailed empirical perspective on the local drivers of LF at a country-wide scale and their relationship to changes in reporting effort.
Recent trends in LF surveillance in Nigeria
We visualised temporal and seasonal trends in suspected and confirmed LF cases within and between years, for both surveillance datasets. 
Weekly case counts were aggregated to country-level and visualised as both annual case accumulation curves and aggregated weekly case totals (Fig. 1 and Supplementary Fig. 1). We also mapped annual counts of suspected and confirmed cases across Nigeria at the LGA-level to examine spatial changes in reporting over the surveillance period (Fig. 2). State and LGA shapefiles used for modelling and mapping were obtained from Humanitarian Data Exchange under a CC-BY-IGO license (https://data.humdata.org/dataset/nga-administrative-boundaries).
Analyses of aggregated district data are sensitive to differences in scale and shape of aggregation (the modifiable areal unit problem; MAUP [36]), and LGA geographical areas in Nigeria are highly skewed and span more than three orders of magnitude (median 713 km², mean 1175 km², range 4–11,255 km²). We therefore also aggregated all LGAs across Nigeria into 130 composite districts with a more even distribution of geographical areas, using distance-based hierarchical clustering on LGA centroids (implemented using hclust in R), with the constraint that each new cluster must contain only LGAs from within the same state (to preserve potentially important state-level differences in surveillance regime). Weekly and annual suspected and confirmed LF case totals were then calculated for each aggregated district. We used these spatially aggregated districts to test for the effects of scale on spatial drivers of LF occurrence and incidence.
Statistical analysis
We analysed the full case time series (Fig. 1) to characterise the spatiotemporal incidence and drivers of LF in Nigeria, while controlling for year-on-year increases and expansions of surveillance effort. We firstly modelled annual LF occurrence and incidence at a country-wide scale, to identify the spatial, climatic and socio-ecological correlates of disease risk across Nigeria. 
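
As an illustration of the state-constrained aggregation of LGAs into composite districts described in this section, the following is a minimal Python sketch, not the R hclust workflow used for the analysis. It assumes complete linkage (the hclust default), Euclidean distance between projected centroid coordinates, and a user-supplied number of districts per state; the function names, the toy coordinates and the k_per_state argument are all hypothetical.

```python
from collections import defaultdict
from math import dist


def complete_linkage(points, k):
    """Agglomerative clustering: merge the two closest clusters until k remain.

    Returns a list of clusters, each a list of indices into `points`.
    """
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Complete linkage: cluster distance is the farthest member pair.
                d = max(dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


def cluster_within_states(lgas, k_per_state):
    """Cluster LGA centroids separately within each state.

    lgas: list of (lga_name, state, (x, y)) tuples (hypothetical layout).
    k_per_state: assumed number of composite districts wanted per state.
    Returns a dict mapping lga_name -> district label.
    """
    by_state = defaultdict(list)
    for name, state, xy in lgas:
        by_state[state].append((name, xy))
    labels = {}
    for state, members in by_state.items():
        pts = [xy for _, xy in members]
        k = min(k_per_state[state], len(pts))
        for idx, cluster in enumerate(complete_linkage(pts, k)):
            for i in cluster:
                labels[members[i][0]] = f"{state}_{idx + 1}"
    return labels


# Toy centroids: B1 lies close to A2 but is never merged across the state border.
lgas = [
    ("A1", "StateA", (0.0, 0.0)),
    ("A2", "StateA", (1.0, 0.0)),
    ("A3", "StateA", (10.0, 0.0)),
    ("B1", "StateB", (1.5, 0.0)),
    ("B2", "StateB", (2.0, 0.0)),
]
districts = cluster_within_states(lgas, {"StateA": 2, "StateB": 1})
```

Because clustering is run independently for each state, the same-state constraint holds by construction; how the number of districts per state was chosen is not specified here and is left as an input.
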
Secondly, we modelled seasonal and temporal trends in weekly LF incidence within hyperendemic areas in the north and south of Nigeria, to identify the seasonal climatic conditions associated with LF risk dynamics and evaluate the scope for forecasting. All data processing and modelling was conducted in R v.3.4.1 with the packages R-INLA v.20.03.17 [37], raster v.3.4.13 [38] and velox v.0.2.0 [39]. Statistical modelling was conducted using hierarchical regression in a Bayesian inference framework (integrated nested Laplace approximation (INLA)), which provides fast, stable and accurate posterior approximation for complex, spatially and temporally-structured regression models [37,40], and has been shown to outperform alternative methods for modelling environmental phenomena with evidence of spatially biased reporting [41].
Processing climatic and socio-ecological covariates
We collated geospatial data on socio-ecological and climatic factors that are hypothesised to influence either M. natalensis distribution and population ecology (rainfall, temperature and vegetation patterns), frequency and mode of human–rodent contact (poverty and improved housing prevalence), both of the above (agricultural and urban land cover) or likelihood of LF reporting (travel time to nearest laboratory with LF diagnostic capacity and travel time to nearest hospital). For each LGA we extracted the mean value for each covariate across the LGA polygon. The full suite of covariates tested across all analyses, data sources and associated hypotheses are described in Supplementary Table 5.
We collated climate data spanning the full monitoring period and up until the date of analysis (July 2011 to January 2021). 
We obtained daily precipitation rasters for Africa [42] from the Climate Hazards Group InfraRed Precipitation with Station data (CHIRPS) project; this dataset combines sparse weather station data with satellite observations and interpolation techniques, and is designed to support hydrologic forecasts in areas with poor weather station coverage (such as tropical West Africa) [42]. A recent study ground-truthing against weather station data showed that CHIRPS provides greater overall accuracy than other gridded precipitation products in Nigeria [43]. Daily minimum and maximum air temperature rasters were obtained from NOAA and averaged to calculate daily mean temperature. The Enhanced Vegetation Index (EVI), a measure of vegetation quality, was obtained by processing 16-day composite layers from NASA (National Aeronautics and Space Administration), excluding all grid cells with unreliable observations due to cloud cover and linearly interpolating between observations to give daily values (Supplementary Table 5).
We derived several spatial bioclimatic variables to capture conditions across the full monitoring period (Jan 2012 to Dec 2019): mean precipitation of the driest annual month, mean precipitation of the wettest annual month, precipitation seasonality (coefficient of variation), annual mean air temperature, air temperature seasonality, annual mean EVI and EVI seasonality. We also calculated monthly total precipitation, 3-month Standardised Precipitation Index (SPI) [44], average daily mean (Tmean), minimum (Tmin) and maximum (Tmax) temperature and EVI variables at sequential time lags prior to reporting week for seasonal modelling (described below in Temporal drivers). SPI is a standardised measure of drought or wetness conditions relative to the historical average conditions for a given period of the year. 
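
To make the SPI definition concrete, the sketch below computes a simplified 3-month index in Python. It is not the SPEI package routine: SPI is conventionally obtained by fitting a parametric (typically gamma) distribution to the accumulated rainfall, whereas this version uses empirical ranks, an assumption made here purely for brevity. The function names and the toy rainfall series are hypothetical.

```python
from statistics import NormalDist


def rolling_sum(monthly, window=3):
    """Trailing `window`-month rainfall totals; None until enough history exists."""
    out = []
    for i in range(len(monthly)):
        if i >= window - 1:
            out.append(sum(monthly[i - window + 1: i + 1]))
        else:
            out.append(None)
    return out


def empirical_spi(monthly, window=3):
    """Simplified SPI from a monthly rainfall series (assumed gap-free).

    Each accumulated total is ranked only against totals ending in the same
    calendar month of other years, and the rank-based probability is mapped
    to a standard normal quantile.
    """
    totals = rolling_sum(monthly, window)
    spi = [None] * len(monthly)
    std_normal = NormalDist()
    for cal_month in range(12):
        idx = [i for i in range(cal_month, len(monthly), 12)
               if totals[i] is not None]
        values = [totals[i] for i in idx]
        n = len(values)
        for i in idx:
            # Mid-rank plotting position keeps the probability strictly in (0, 1).
            below = sum(v < totals[i] for v in values)
            equal = sum(v == totals[i] for v in values)
            p = (below + 0.5 * equal) / n
            spi[i] = std_normal.inv_cdf(p)
    return spi


# Toy series: a dry year, a wet year, then an intermediate year (mm per month).
rain = [10.0] * 12 + [30.0] * 12 + [20.0] * 12
spi3 = empirical_spi(rain, window=3)
```

In this toy series, the window ending in June of the dry year receives a negative value, the same window in the wet year a positive value, and the intermediate year a value near zero. With only a few years of data the ranks are coarse, which is one reason a long historical record is needed for a stable index.
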
SPI was calculated within a rolling 3-month window across the full 40-year historical CHIRPS rainfall time series (1981–2020) using the R package SPEI v.1.7 [44].
We accessed annual human population rasters at 100 m resolution from WorldPop. We accessed the proportion of the population living in poverty in 2010 (…)