More stories

  • An AI dataset carves new paths to tornado detection

    The return of spring in the Northern Hemisphere touches off tornado season. A tornado’s twisting funnel of dust and debris seems an unmistakable sight. But that sight can be obscured to radar, the tool of meteorologists. It’s hard to know exactly when a tornado has formed, or even why.

    A new dataset could hold answers. It contains radar returns from thousands of tornadoes that have hit the United States in the past 10 years. Storms that spawned tornadoes are flanked by other severe storms, some with nearly identical conditions, that never did. MIT Lincoln Laboratory researchers who curated the dataset, called TorNet, have now released it open source. They hope to enable breakthroughs in detecting one of nature’s most mysterious and violent phenomena.

    “A lot of progress is driven by easily available, benchmark datasets. We hope TorNet will lay a foundation for machine learning algorithms to both detect and predict tornadoes,” says Mark Veillette, the project’s co-principal investigator with James Kurdzo. Both researchers work in the Air Traffic Control Systems Group. 

    Along with the dataset, the team is releasing models trained on it. The models show promise for machine learning’s ability to spot a twister. Building on this work could open new frontiers for forecasters, helping them provide more accurate warnings that might save lives. 

    Swirling uncertainty

    About 1,200 tornadoes occur in the United States every year, causing millions to billions of dollars in economic damage and claiming 71 lives on average. Last year, one unusually long-lasting tornado killed 17 people and injured at least 165 others along a 59-mile path in Mississippi.  

    Yet tornadoes are notoriously difficult to forecast because scientists don’t have a clear picture of why they form. “We can see two storms that look identical, and one will produce a tornado and one won’t. We don’t fully understand it,” Kurdzo says.

    A tornado’s basic ingredients are thunderstorms with instability caused by rapidly rising warm air and wind shear that causes rotation. Weather radar is the primary tool used to monitor these conditions. But tornadoes lie too low to be detected, even when moderately close to the radar. As the radar beam with a given tilt angle travels further from the antenna, it gets higher above the ground, mostly seeing reflections from rain and hail carried in the “mesocyclone,” the storm’s broad, rotating updraft. A mesocyclone doesn’t always produce a tornado.

    With this limited view, forecasters must decide whether or not to issue a tornado warning. They often err on the side of caution. As a result, the rate of false alarms for tornado warnings is more than 70 percent. “That can lead to boy-who-cried-wolf syndrome,” Kurdzo says.  

    In recent years, researchers have turned to machine learning to better detect and predict tornadoes. However, raw datasets and models have not always been accessible to the broader community, stifling progress. TorNet is filling this gap.

    The dataset contains more than 200,000 radar images, 13,587 of which depict tornadoes. The rest of the images are non-tornadic, taken from storms in one of two categories: randomly selected severe storms or false-alarm storms (those that led a forecaster to issue a warning but that didn’t produce a tornado).

    Each sample of a storm or tornado comprises two sets of six radar images. The two sets correspond to different radar sweep angles. The six images portray different radar data products, such as reflectivity (showing precipitation intensity) or radial velocity (indicating if winds are moving toward or away from the radar).
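
    As a rough sketch of what such a sample might look like in code (the product names and grid dimensions below are assumptions for illustration, not TorNet’s actual schema), each record can be thought of as a labeled stack of radar images:

    ```python
    import numpy as np

    # Hypothetical layout of one TorNet-style sample: two sweep angles, six radar
    # products per sweep, each an image grid. Names and sizes are illustrative only.
    PRODUCTS = ["reflectivity", "radial_velocity", "spectrum_width",
                "differential_reflectivity", "correlation_coefficient", "differential_phase"]
    GRID = (120, 240)  # assumed (range gates, azimuth bins)

    sample = {
        "label": 1,  # 1 = tornadic, 0 = non-tornadic
        "sweeps": np.zeros((2, len(PRODUCTS), *GRID), dtype=np.float32),  # (sweep, product, y, x)
    }
    print(sample["sweeps"].shape)  # (2, 6, 120, 240)
    ```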

    A challenge in curating the dataset was first finding tornadoes. Within the corpus of weather radar data, tornadoes are extremely rare events. The team then had to balance those tornado samples with difficult non-tornado samples. If the dataset were too easy, say by comparing tornadoes to snowstorms, an algorithm trained on the data would likely over-classify storms as tornadic.

    “What’s beautiful about a true benchmark dataset is that we’re all working with the same data, with the same level of difficulty, and can compare results,” Veillette says. “It also makes meteorology more accessible to data scientists, and vice versa. It becomes easier for these two parties to work on a common problem.”

    Both researchers represent the progress that can come from cross-collaboration. Veillette is a mathematician and algorithm developer who has long been fascinated by tornadoes. Kurdzo is a meteorologist by training and a signal processing expert. In grad school, he chased tornadoes with custom-built mobile radars, collecting data to analyze in new ways.

    “This dataset also means that a grad student doesn’t have to spend a year or two building a dataset. They can jump right into their research,” Kurdzo says.

    This project was funded by Lincoln Laboratory’s Climate Change Initiative, which aims to leverage the laboratory’s diverse technical strengths to help address climate problems threatening human health and global security.

    Chasing answers with deep learning

    Using the dataset, the researchers developed baseline artificial intelligence (AI) models. They were particularly eager to apply deep learning, a form of machine learning that excels at processing visual data. On its own, deep learning can extract features (key observations that an algorithm uses to make a decision) from images across a dataset. Other machine learning approaches require humans to first manually label features. 
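
    A minimal sketch of the kind of convolutional baseline that could be trained on such stacked radar samples is shown below; the layer sizes are arbitrary, and this is not the laboratory’s released model:

    ```python
    import torch
    import torch.nn as nn

    # Toy convolutional detector over a stacked radar sample. The 12 input channels
    # come from flattening 2 sweeps x 6 products; all sizes are illustrative only.
    class TornadoCNN(nn.Module):
        def __init__(self, in_channels: int = 12):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(in_channels, 32, kernel_size=3, padding=1), nn.ReLU(),
                nn.MaxPool2d(2),
                nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1),
            )
            self.classifier = nn.Linear(64, 1)  # logit for "tornadic"

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            return self.classifier(self.features(x).flatten(1))

    model = TornadoCNN()
    logits = model(torch.randn(4, 12, 120, 240))  # batch of 4 stacked samples
    print(logits.shape)  # torch.Size([4, 1])
    ```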

    “We wanted to see if deep learning could rediscover what people normally look for in tornadoes and even identify new things that typically aren’t searched for by forecasters,” Veillette says.

    The results are promising. Their deep learning model performed similarly to or better than all tornado-detecting algorithms known in the literature. The trained algorithm correctly classified 50 percent of weaker EF-1 tornadoes and over 85 percent of tornadoes rated EF-2 or higher, which make up the most devastating and costly occurrences of these storms.

    They also evaluated two other types of machine-learning models and one traditional model to compare against. The source code and parameters of all these models are freely available. The models and dataset are also described in a paper submitted to a journal of the American Meteorological Society (AMS). Veillette presented this work at the AMS Annual Meeting in January.

    “The biggest reason for putting our models out there is for the community to improve upon them and do other great things,” Kurdzo says. “The best solution could be a deep learning model, or someone might find that a non-deep learning model is actually better.”

    TorNet could be useful to the weather community for other uses too, such as conducting large-scale case studies on storms. It could also be augmented with other data sources, like satellite imagery or lightning maps. Fusing multiple types of data could improve the accuracy of machine learning models.

    Taking steps toward operations

    On top of detecting tornadoes, Kurdzo hopes that models might help unravel the science of why they form.

    “As scientists, we see all these precursors to tornadoes — an increase in low-level rotation, a hook echo in reflectivity data, specific differential phase (KDP) foot and differential reflectivity (ZDR) arcs. But how do they all go together? And are there physical manifestations we don’t know about?” he asks.

    Teasing out those answers might be possible with explainable AI. Explainable AI refers to methods that allow a model to provide its reasoning, in a format understandable to humans, for why it came to a certain decision. In this case, these explanations might reveal physical processes that happen before tornadoes form. This knowledge could help train forecasters, and models, to recognize the signs sooner.
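
    One common explainable-AI technique is a gradient-based saliency map, which highlights the input pixels that most influenced a decision. The sketch below illustrates the idea with a tiny stand-in model; it is not the method used in the study:

    ```python
    import torch
    import torch.nn as nn

    # Gradient-based saliency: the magnitude of d(score)/d(input) shows which radar
    # pixels most influenced the tornado/no-tornado score. The model is a stand-in.
    model = nn.Sequential(nn.Conv2d(12, 8, 3, padding=1), nn.ReLU(),
                          nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 1))
    x = torch.randn(1, 12, 120, 240, requires_grad=True)  # one stacked radar sample
    score = model(x).squeeze()
    score.backward()
    saliency = x.grad.abs().max(dim=1).values  # (1, 120, 240): per-pixel influence
    print(saliency.shape)
    ```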

    “None of this technology is ever meant to replace a forecaster. But perhaps someday it could guide forecasters’ eyes in complex situations, and give a visual warning to an area predicted to have tornadic activity,” Kurdzo says.

    Such assistance could be especially useful as radar technology improves and future networks potentially grow denser. Data refresh rates in a next-generation radar network are expected to increase from every five minutes to approximately one minute, perhaps faster than forecasters can interpret the new information. Because deep learning can process huge amounts of data quickly, it could be well-suited for monitoring radar returns in real time, alongside humans. Tornadoes can form and disappear in minutes.

    But the path to an operational algorithm is a long road, especially in safety-critical situations, Veillette says. “I think the forecaster community is still, understandably, skeptical of machine learning. One way to establish trust and transparency is to have public benchmark datasets like this one. It’s a first step.”

    The next steps, the team hopes, will be taken by researchers across the world who are inspired by the dataset and energized to build their own algorithms. Those algorithms will in turn go into test beds, where they’ll eventually be shown to forecasters, to start a process of transitioning into operations.

    In the end, the path could circle back to trust.

    “We may never get more than a 10- to 15-minute tornado warning using these tools. But if we could lower the false-alarm rate, we could start to make headway with public perception,” Kurdzo says. “People are going to use those warnings to take the action they need to save their lives.”

  • Using deep learning to image the Earth’s planetary boundary layer

    Although the troposphere is often thought of as the closest layer of the atmosphere to the Earth’s surface, the planetary boundary layer (PBL) — the lowest layer of the troposphere — is actually the part that most significantly influences weather near the surface. In the 2018 planetary science decadal survey, the PBL was raised as an important scientific issue that has the potential to enhance storm forecasting and improve climate projections.  

    “The PBL is where the surface interacts with the atmosphere, including exchanges of moisture and heat that help lead to severe weather and a changing climate,” says Adam Milstein, a technical staff member in Lincoln Laboratory’s Applied Space Systems Group. “The PBL is also where humans live, and the turbulent movement of aerosols throughout the PBL is important for air quality that influences human health.” 

    Although vital for studying weather and climate, important features of the PBL, such as its height, are difficult to resolve with current technology. In the past four years, Lincoln Laboratory staff have been studying the PBL, focusing on two different tasks: using machine learning to make 3D-scanned profiles of the atmosphere, and resolving the vertical structure of the atmosphere more clearly in order to better predict droughts.  

    This PBL-focused research effort builds on more than a decade of related work on fast, operational neural network algorithms developed by Lincoln Laboratory for NASA missions. These missions include the Time-Resolved Observations of Precipitation structure and storm Intensity with a Constellation of Smallsats (TROPICS) mission as well as Aqua, a satellite that collects data about Earth’s water cycle and observes variables such as ocean temperature, precipitation, and water vapor in the atmosphere. These algorithms retrieve temperature and humidity from the satellite instrument data and have been shown to significantly improve the accuracy and usable global coverage of the observations over previous approaches. For TROPICS, the algorithms help retrieve data that are used to characterize a storm’s rapidly evolving structures in near-real time, and for Aqua, they have helped improve forecasting models, drought monitoring, and fire prediction.

    These operational algorithms for TROPICS and Aqua are based on classic “shallow” neural networks to maximize speed and simplicity, creating a one-dimensional vertical profile for each spectral measurement collected by the instrument over each location. While this approach has improved observations of the atmosphere down to the surface overall, including the PBL, laboratory staff determined that newer “deep” learning techniques that treat the atmosphere over a region of interest as a three-dimensional image are needed to improve PBL details further.
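
    The contrast between the two approaches can be sketched roughly as follows, with the channel and level counts chosen arbitrarily for illustration:

    ```python
    import torch
    import torch.nn as nn

    # Illustrative contrast (sizes are assumptions, not mission parameters):
    # 1) a "shallow" network retrieving one vertical profile per spectral measurement;
    # 2) a "deep" network treating a region as a 3D temperature/humidity image.
    n_channels, n_levels = 22, 40          # assumed spectral channels and vertical levels

    shallow = nn.Sequential(               # per-footprint: channels -> vertical profile
        nn.Linear(n_channels, 64), nn.Tanh(), nn.Linear(64, n_levels))

    deep = nn.Sequential(                  # per-region: (channels, lat, lon) -> (levels, lat, lon)
        nn.Conv2d(n_channels, 64, 3, padding=1), nn.ReLU(),
        nn.Conv2d(64, n_levels, 3, padding=1))

    profile = shallow(torch.randn(1, n_channels))     # (1, 40)
    cube = deep(torch.randn(1, n_channels, 32, 32))   # (1, 40, 32, 32)
    print(profile.shape, cube.shape)
    ```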

    “We hypothesized that deep learning and artificial intelligence (AI) techniques could improve on current approaches by incorporating a better statistical representation of 3D temperature and humidity imagery of the atmosphere into the solutions,” Milstein says. “But it took a while to figure out how to create the best dataset, a mix of real and simulated data, that we needed to prepare to train these techniques.”

    The team collaborated with Joseph Santanello of the NASA Goddard Space Flight Center and William Blackwell, also of the Applied Space Systems Group, in a recent NASA-funded effort showing that these retrieval algorithms can improve PBL detail, including more accurate determination of the PBL height than the previous state of the art. 

    While improved knowledge of the PBL is broadly useful for increasing understanding of climate and weather, one key application is prediction of droughts. According to a Global Drought Snapshot report released last year, droughts are a pressing planetary issue that the global community needs to address. Lack of humidity near the surface, specifically at the level of the PBL, is the leading indicator of drought. While previous studies using remote-sensing techniques have examined the humidity of soil to determine drought risk, studying the atmosphere can help predict when droughts will happen.  

    In an effort funded by Lincoln Laboratory’s Climate Change Initiative, Milstein, along with laboratory staff member Michael Pieper, is working with scientists at NASA’s Jet Propulsion Laboratory (JPL) to use neural network techniques to improve drought prediction over the continental United States. While the work builds on existing operational work JPL has done incorporating (in part) the laboratory’s operational “shallow” neural network approach for Aqua, the team believes that this work and the PBL-focused deep learning research can be combined to further improve the accuracy of drought prediction.

    “Lincoln Laboratory has been working with NASA for more than a decade on neural network algorithms for estimating temperature and humidity in the atmosphere from space-borne infrared and microwave instruments, including those on the Aqua spacecraft,” Milstein says. “Over that time, we have learned a lot about this problem by working with the science community, including learning about what scientific challenges remain. Our long experience working on this type of remote sensing with NASA scientists, as well as our experience with using neural network techniques, gave us a unique perspective.”

    According to Milstein, the next step for this project is to compare the deep learning results to datasets from the National Oceanic and Atmospheric Administration, NASA, and the Department of Energy collected directly in the PBL using radiosondes, a type of instrument flown on a weather balloon. “These direct measurements can be considered a kind of ‘ground truth’ to quantify the accuracy of the techniques we have developed,” Milstein says.
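
    A minimal sketch of that kind of comparison, using made-up numbers, is to interpolate a retrieved humidity profile onto the radiosonde’s pressure levels and score the error:

    ```python
    import numpy as np

    # Synthetic placeholder values: a radiosonde relative-humidity profile and a
    # retrieved profile on a finer pressure grid, compared by RMSE.
    sonde_p = np.array([1000, 925, 850, 700, 500], dtype=float)   # hPa
    sonde_rh = np.array([78, 70, 61, 45, 30], dtype=float)        # percent
    retrieved_p = np.linspace(1000, 500, 20)
    retrieved_rh = np.linspace(80, 28, 20)

    # np.interp needs ascending coordinates, so reverse the pressure axes.
    rh_on_sonde_levels = np.interp(sonde_p[::-1], retrieved_p[::-1], retrieved_rh[::-1])[::-1]
    rmse = np.sqrt(np.mean((rh_on_sonde_levels - sonde_rh) ** 2))
    print(f"RMSE vs. radiosonde: {rmse:.1f} % RH")
    ```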

    This improved neural network approach holds promise to demonstrate drought prediction that can exceed the capabilities of existing indicators, Milstein says, and to be a tool that scientists can rely on for decades to come.

  • Generative AI for smart grid modeling

    MIT’s Laboratory for Information and Decision Systems (LIDS) has been awarded $1,365,000 in funding from the Appalachian Regional Commission (ARC) to support its involvement with an innovative project, “Forming the Smart Grid Deployment Consortium (SGDC) and Expanding the HILLTOP+ Platform.”

    The grant was made available through ARC’s Appalachian Regional Initiative for Stronger Economies, which fosters regional economic transformation through multi-state collaboration.

    Led by Kalyan Veeramachaneni, research scientist and principal investigator at LIDS’ Data to AI Group, the project will focus on creating AI-driven generative models for customer load data. Veeramachaneni and colleagues will work alongside a team of universities and organizations led by Tennessee Tech University, including collaborators across Ohio, Pennsylvania, West Virginia, and Tennessee, to develop and deploy smart grid modeling services through the SGDC project.

    These generative models have far-reaching applications, including grid modeling and training algorithms for energy tech startups. When the models are trained on existing data, they create additional, realistic data that can augment limited datasets or stand in for sensitive ones. Stakeholders can then use these models to understand and plan for specific what-if scenarios far beyond what could be achieved with existing data alone. For example, generated data can predict the potential load on the grid if an additional 1,000 households were to adopt solar technologies, how that load might change throughout the day, and similar contingencies vital to future planning.
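
    The underlying idea can be sketched with a deliberately simple generative model and synthetic numbers; real deployments would use far richer models than this:

    ```python
    import numpy as np

    # Fit a simple generative model (mean and covariance of 24-hour load profiles) to
    # synthetic "observed" households, then sample 1,000 new households for a what-if study.
    rng = np.random.default_rng(0)
    hours = np.arange(24)
    base = 0.8 + 0.6 * np.exp(-((hours - 19) ** 2) / 8.0)      # evening-peaking kW shape
    observed = base + 0.1 * rng.standard_normal((500, 24))     # 500 observed households

    mean, cov = observed.mean(axis=0), np.cov(observed, rowvar=False)
    synthetic = rng.multivariate_normal(mean, cov, size=1000)  # 1,000 generated households

    # What-if: those households adopt rooftop solar (a crude midday generation profile).
    solar = 1.2 * np.maximum(0, np.sin(np.pi * (hours - 6) / 12))
    net_added_load = (synthetic - solar).sum(axis=0)           # added grid load per hour, kW
    print(f"Peak added load: {net_added_load.max():.0f} kW at hour {net_added_load.argmax()}")
    ```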

    The generative AI models developed by Veeramachaneni and his team will provide inputs to modeling services based on the HILLTOP+ microgrid simulation platform, originally prototyped by MIT Lincoln Laboratory. HILLTOP+ will be used to model and test new smart grid technologies in a virtual “safe space,” providing rural electric utilities with increased confidence in deploying smart grid technologies, including utility-scale battery storage. Energy tech startups will also benefit from HILLTOP+ grid modeling services, enabling them to develop and virtually test their smart grid hardware and software products for scalability and interoperability.

    The project aims to assist rural electric utilities and energy tech startups in mitigating the risks associated with deploying these new technologies. “This project is a powerful example of how generative AI can transform a sector — in this case, the energy sector,” says Veeramachaneni. “In order to be useful, generative AI technologies and their development have to be closely integrated with domain expertise. I am thrilled to be collaborating with experts in grid modeling, and working alongside them to integrate the latest and greatest from my research group and push the boundaries of these technologies.”

    “This project is testament to the power of collaboration and innovation, and we look forward to working with our collaborators to drive positive change in the energy sector,” says Satish Mahajan, principal investigator for the project at Tennessee Tech and a professor of electrical and computer engineering. Tennessee Tech’s Center for Rural Innovation director, Michael Aikens, adds, “Together, we are taking significant steps towards a more sustainable and resilient future for the Appalachian region.”

  • New tools are available to help reduce the energy that AI models devour

    When searching for flights on Google, you may have noticed that each flight’s carbon-emission estimate is now presented next to its cost. It’s a way to inform customers about their environmental impact, and to let them factor this information into their decision-making.

    A similar kind of transparency doesn’t yet exist for the computing industry, despite its carbon emissions exceeding those of the entire airline industry. Escalating this energy demand are artificial intelligence models. Huge, popular models like ChatGPT signal a trend of large-scale artificial intelligence, boosting forecasts that predict data centers will draw up to 21 percent of the world’s electricity supply by 2030.

    The MIT Lincoln Laboratory Supercomputing Center (LLSC) is developing techniques to help data centers reel in energy use. Their techniques range from simple but effective changes, like power-capping hardware, to adopting novel tools that can stop AI training early on. Crucially, they have found that these techniques have a minimal impact on model performance.

    In the wider picture, their work is mobilizing green-computing research and promoting a culture of transparency. “Energy-aware computing is not really a research area, because everyone’s been holding on to their data,” says Vijay Gadepally, senior staff in the LLSC who leads energy-aware research efforts. “Somebody has to start, and we’re hoping others will follow.”

    Curbing power and cooling down

    Like many data centers, the LLSC has seen a significant uptick in the number of AI jobs running on its hardware. Noticing an increase in energy usage, computer scientists at the LLSC were curious about ways to run jobs more efficiently. Green computing is a principle of the center, which is powered entirely by carbon-free energy.

    Training an AI model — the process by which it learns patterns from huge datasets — requires using graphics processing units (GPUs), which are power-hungry hardware. As one example, the GPUs that trained GPT-3 (the precursor to ChatGPT) are estimated to have consumed 1,300 megawatt-hours of electricity, roughly equal to that used by 1,450 average U.S. households per month.
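
    A quick back-of-envelope check of that comparison, assuming an average U.S. household uses roughly 900 kilowatt-hours of electricity per month (an approximate figure, not from the article):

    ```python
    # 1,300 MWh of training energy expressed in household-months of electricity use.
    gpt3_training_kwh = 1_300 * 1_000          # 1,300 MWh in kWh
    household_kwh_per_month = 900              # assumed average U.S. household usage
    print(gpt3_training_kwh / household_kwh_per_month)  # ~1,444, i.e. about 1,450 homes for a month
    ```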

    While most people seek out GPUs because of their computational power, manufacturers offer ways to limit the amount of power a GPU is allowed to draw. “We studied the effects of capping power and found that we could reduce energy consumption by about 12 percent to 15 percent, depending on the model,” Siddharth Samsi, a researcher within the LLSC, says.

    The trade-off for capping power is increasing task time — GPUs will take about 3 percent longer to complete a task, an increase Gadepally says is “barely noticeable” considering that models are often trained over days or even months. In one of their experiments in which they trained the popular BERT language model, limiting GPU power to 150 watts saw a two-hour increase in training time (from 80 to 82 hours) but saved the equivalent of a U.S. household’s week of energy.

    The team then built software that plugs this power-capping capability into the widely used scheduler system, Slurm. The software lets data center owners set limits across their system or on a job-by-job basis.
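
    The underlying mechanism can be sketched as follows; this is not the laboratory’s Slurm plugin, just the kind of power-limit call (which requires administrator privileges) that a scheduler hook could make before a job starts:

    ```python
    import subprocess

    # Cap a GPU's power draw using nvidia-smi's power-limit option. The 150 W value
    # mirrors the BERT experiment described above; adjust per device and policy.
    def cap_gpu_power(gpu_index: int, watts: int) -> None:
        subprocess.run(
            ["nvidia-smi", "-i", str(gpu_index), "-pl", str(watts)],
            check=True,
        )

    if __name__ == "__main__":
        cap_gpu_power(gpu_index=0, watts=150)
    ```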

    “We can deploy this intervention today, and we’ve done so across all our systems,” Gadepally says.

    Side benefits have arisen, too. Since putting power constraints in place, the GPUs on LLSC supercomputers have been running about 30 degrees Fahrenheit cooler and at a more consistent temperature, reducing stress on the cooling system. Running the hardware cooler can potentially also increase reliability and service lifetime. They can now consider delaying the purchase of new hardware — reducing the center’s “embodied carbon,” or the emissions created through the manufacturing of equipment — until the efficiencies gained by using new hardware offset this aspect of the carbon footprint. They’re also finding ways to cut down on cooling needs by strategically scheduling jobs to run at night and during the winter months.

    “Data centers can use these easy-to-implement approaches today to increase efficiencies, without requiring modifications to code or infrastructure,” Gadepally says.

    Taking this holistic look at a data center’s operations to find opportunities to cut down can be time-intensive. To make this process easier for others, the team — in collaboration with Professor Devesh Tiwari and Baolin Li at Northeastern University — recently developed and published a comprehensive framework for analyzing the carbon footprint of high-performance computing systems. System practitioners can use this analysis framework to gain a better understanding of how sustainable their current system is and consider changes for next-generation systems.  

    Adjusting how models are trained and used

    On top of making adjustments to data center operations, the team is devising ways to make AI-model development more efficient.

    When training models, AI developers often focus on improving accuracy, and they build upon previous models as a starting point. To achieve the desired output, they have to figure out what parameters to use, and getting it right can take testing thousands of configurations. This process, called hyperparameter optimization, is one area LLSC researchers have found ripe for cutting down energy waste. 

    “We’ve developed a model that basically looks at the rate at which a given configuration is learning,” Gadepally says. Given that rate, their model predicts the likely performance. Underperforming models are stopped early. “We can give you a very accurate estimate early on that the best model will be in this top 10 of 100 models running,” he says.
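
    The general idea, stripped of the predictive model itself, can be sketched as ranking configurations by their early learning curves and stopping the rest; this toy version does not reproduce the LLSC’s actual predictor:

    ```python
    import numpy as np

    # Simulate early validation curves for 100 hyperparameter configurations, then keep
    # only the most promising 10 and stop the others, saving the energy they would use.
    rng = np.random.default_rng(1)
    n_configs, early_epochs = 100, 5
    early_curves = np.cumsum(rng.uniform(0.0, 0.2, size=(n_configs, early_epochs)), axis=1)

    early_score = early_curves[:, -1]            # validation score after a few epochs
    keep = np.argsort(early_score)[-10:]         # keep only the top 10 of 100 configurations
    print(f"Continuing {len(keep)} of {n_configs} runs; stopping the rest early.")
    ```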

    In their studies, this early stopping led to dramatic savings: an 80 percent reduction in the energy used for model training. They’ve applied this technique to models developed for computer vision, natural language processing, and material design applications.

    “In my opinion, this technique has the biggest potential for advancing the way AI models are trained,” Gadepally says.

    Training is just one part of an AI model’s emissions. The largest contributor to emissions over time is model inference, or the process of running the model live, like when a user chats with ChatGPT. To respond quickly, these models use redundant hardware, running all the time, waiting for a user to ask a question.

    One way to improve inference efficiency is to use the most appropriate hardware. Also with Northeastern University, the team created an optimizer that matches a model with the most carbon-efficient mix of hardware, such as high-power GPUs for the computationally intense parts of inference and low-power central processing units (CPUs) for the less-demanding aspects. This work recently won the best paper award at the International ACM Symposium on High-Performance Parallel and Distributed Computing.

    Using this optimizer can decrease energy use by 10-20 percent while still meeting the same “quality-of-service target” (how quickly the model can respond).
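
    Conceptually, the selection problem looks something like the sketch below, with invented placeholder numbers: choose the lowest-energy hardware configuration that still meets the latency target:

    ```python
    # Candidate hardware mixes with made-up energy and latency figures per request.
    candidates = [
        {"name": "all-GPU",       "energy_j": 120.0, "latency_ms": 40},
        {"name": "GPU+CPU split", "energy_j": 95.0,  "latency_ms": 70},
        {"name": "all-CPU",       "energy_j": 60.0,  "latency_ms": 400},
    ]
    qos_latency_ms = 100  # quality-of-service target

    feasible = [c for c in candidates if c["latency_ms"] <= qos_latency_ms]
    best = min(feasible, key=lambda c: c["energy_j"])
    print(f"Chosen configuration: {best['name']} ({best['energy_j']} J per request)")
    ```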

    This tool is especially helpful for cloud customers, who lease systems from data centers and must select hardware from among thousands of options. “Most customers overestimate what they need; they choose over-capable hardware just because they don’t know any better,” Gadepally says.

    Growing green-computing awareness

    The energy saved by implementing these interventions also reduces the associated costs of developing AI, often by a one-to-one ratio. In fact, cost is usually used as a proxy for energy consumption. Given these savings, why aren’t more data centers investing in green techniques?

    “I think it’s a bit of an incentive-misalignment problem,” Samsi says. “There’s been such a race to build bigger and better models that almost every secondary consideration has been put aside.”

    They point out that while some data centers buy renewable-energy credits, these renewables aren’t enough to cover the growing energy demands. The majority of electricity powering data centers comes from fossil fuels, and water used for cooling is contributing to stressed watersheds. 

    Hesitancy may also exist because systematic studies on energy-saving techniques haven’t been conducted. That’s why the team has been pushing their research in peer-reviewed venues in addition to open-source repositories. Some big industry players, like Google DeepMind, have applied machine learning to increase data center efficiency but have not made their work available for others to deploy or replicate. 

    Top AI conferences are now pushing for ethics statements that consider how AI could be misused. The team sees the climate aspect as an AI ethics topic that has not yet been given much attention, but this also appears to be slowly changing. Some researchers are now disclosing the carbon footprint of training the latest models, and industry is showing a shift in energy transparency too, as in this recent report from Meta AI.

    They also acknowledge that transparency is difficult without tools that can show AI developers their consumption. Reporting is on the LLSC roadmap for this year. They want to be able to show every LLSC user, for every job, how much energy they consume and how this amount compares to others, similar to home energy reports.

    Part of this effort requires working more closely with hardware manufacturers to make getting these data off hardware easier and more accurate. If manufacturers can standardize the way the data are read out, then energy-saving and reporting tools can be applied across different hardware platforms. A collaboration is underway between the LLSC researchers and Intel to work on this very problem.

    Even AI developers who are aware of the intense energy needs of AI can’t do much on their own to curb this energy use. The LLSC team wants to help other data centers apply these interventions and provide users with energy-aware options. Their first partnership is with the U.S. Air Force, a sponsor of this research, which operates thousands of data centers. Applying these techniques can make a significant dent in their energy consumption and cost.

    “We’re putting control into the hands of AI developers who want to lessen their footprint,” Gadepally says. “Do I really need to gratuitously train unpromising models? Am I willing to run my GPUs slower to save energy? To our knowledge, no other supercomputing center is letting you consider these options. Using our tools, today, you get to decide.”

    Visit this webpage to see the group’s publications related to energy-aware computing and findings described in this article.

  • A new dataset of Arctic images will spur artificial intelligence research

    As the U.S. Coast Guard (USCG) icebreaker Healy takes part in a voyage across the North Pole this summer, it is capturing images of the Arctic to further the study of this rapidly changing region. Lincoln Laboratory researchers installed a camera system aboard the Healy while at port in Seattle before it embarked on a three-month science mission on July 11. The resulting dataset, which will be one of the first of its kind, will be used to develop artificial intelligence tools that can analyze Arctic imagery.

    “This dataset not only can help mariners navigate more safely and operate more efficiently, but also help protect our nation by providing critical maritime domain awareness and an improved understanding of how AI analysis can be brought to bear in this challenging and unique environment,” says Jo Kurucar, a researcher in Lincoln Laboratory’s AI Software Architectures and Algorithms Group, which led this project.

    As the planet warms and sea ice melts, Arctic passages are opening up to more traffic, from both military vessels and ships conducting illegal fishing. These movements may pose national security challenges to the United States. The opening Arctic also raises questions about how its climate, wildlife, and geography are changing.

    Today, very few imagery datasets of the Arctic exist to study these changes. Overhead images from satellites or aircraft can only provide limited information about the environment. An outward-looking camera attached to a ship can capture more details of the setting and different angles of objects, such as other ships, in the scene. These types of images can then be used to train AI computer-vision tools, which can help the USCG plan naval missions and automate analysis. According to Kurucar, USCG assets in the Arctic are spread thin and can benefit greatly from AI tools, which can act as a force multiplier.

    The Healy is the USCG’s largest and most technologically advanced icebreaker. Given its current mission, it was a fitting candidate to be equipped with a new sensor to gather this dataset. The laboratory research team collaborated with the USCG Research and Development Center to determine the sensor requirements. Together, they developed the Cold Region Imaging and Surveillance Platform (CRISP).

    “Lincoln Laboratory has an excellent relationship with the Coast Guard, especially with the Research and Development Center. Over a decade, we’ve established ties that enabled the deployment of the CRISP system,” says Amna Greaves, the CRISP project lead and an assistant leader in the AI Software Architectures and Algorithms Group. “We have strong ties not only because of the USCG veterans working at the laboratory and in our group, but also because our technology missions are complementary. Today it was deploying infrared sensing in the Arctic; tomorrow it could be operating quadruped robot dogs on a fast-response cutter.”

    The CRISP system comprises a long-wave infrared camera, manufactured by Teledyne FLIR (for forward-looking infrared), that is designed for harsh maritime environments. The camera can stabilize itself during rough seas and image in complete darkness, fog, and glare. It is paired with a GPS-enabled time-synchronized clock and a network video recorder to record both video and still imagery along with GPS-positional data.  

    The camera is mounted at the front of the ship’s fly bridge, and the electronics are housed in a ruggedized rack on the bridge. The system can be operated manually from the bridge or be placed into an autonomous surveillance mode, in which it slowly pans back and forth, recording 15 minutes of video every three hours and a still image once every 15 seconds.
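
    That autonomous cadence can be sketched as a simple scheduling loop; the capture functions here are placeholders, and the real CRISP system is a networked video recorder, not a script:

    ```python
    import time

    # Autonomous surveillance cadence described above: a still image every 15 seconds
    # and a 15-minute video clip every three hours.
    STILL_INTERVAL_S = 15
    VIDEO_INTERVAL_S = 3 * 60 * 60
    VIDEO_DURATION_S = 15 * 60

    def capture_still() -> None:
        print("still image captured")

    def capture_video(duration_s: int) -> None:
        print(f"recording a {duration_s // 60}-minute clip")

    last_video = float("-inf")
    while True:
        now = time.monotonic()
        if now - last_video >= VIDEO_INTERVAL_S:
            capture_video(VIDEO_DURATION_S)
            last_video = now
        capture_still()
        time.sleep(STILL_INTERVAL_S)
    ```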

    “The installation of the equipment was a unique and fun experience. As with any good project, our expectations going into the install did not meet reality,” says Michael Emily, the project’s IT systems administrator who traveled to Seattle for the install. Working with the ship’s crew, the laboratory team had to quickly adjust their route for running cables from the camera to the observation station after they discovered that the expected access points weren’t in fact accessible. “We had 100-foot cables made for this project just in case of this type of scenario, which was a good thing because we only had a few inches to spare,” Emily says.

    The CRISP project team plans to publicly release the dataset, anticipated to be about 4 terabytes in size, once the USCG science mission concludes in the fall.

    The goal in releasing the dataset is to enable the wider research community to develop better tools for those operating in the Arctic, especially as this region becomes more navigable. “Collecting and publishing the data allows for faster and greater progress than what we could accomplish on our own,” Kurucar adds. “It also enables the laboratory to engage in more advanced AI applications while others make more incremental advances using the dataset.”

    On top of providing the dataset, the laboratory team plans to provide a baseline object-detection model, from which others can make progress on their own models. More advanced AI applications planned for development are classifiers for specific objects in the scene and the ability to identify and track objects across images.

    Beyond assisting with USCG missions, this project could create an influential dataset for researchers looking to apply AI to data from the Arctic to help combat climate change, says Paul Metzger, who leads the AI Software Architectures and Algorithms Group.

    Metzger adds that the group was honored to be a part of this project and is excited to see the advances that come from applying AI to novel challenges facing the United States: “I’m extremely proud of how our group applies AI to the highest-priority challenges in our nation, from predicting outbreaks of Covid-19 and assisting the U.S. European Command in their support of Ukraine to now employing AI in the Arctic for maritime awareness.”

    Once the dataset is available, it will be free to download on the Lincoln Laboratory dataset website.

  • System tracks movement of food through global humanitarian supply chain

    Although more than enough food is produced to feed everyone in the world, as many as 828 million people face hunger today. Poverty, social inequity, climate change, natural disasters, and political conflicts all contribute to inhibiting access to food. For decades, the U.S. Agency for International Development (USAID) Bureau for Humanitarian Assistance (BHA) has been a leader in global food assistance, supplying millions of metric tons of food to recipients worldwide. Alleviating hunger — and the conflict and instability hunger causes — is critical to U.S. national security.

    But BHA is only one player within a large, complex supply chain in which food gets handed off between more than 100 partner organizations before reaching its final destination. Traditionally, the movement of food through the supply chain has been a black-box operation, with stakeholders largely out of the loop about what happens to the food once it leaves their custody. This lack of direct visibility into operations is due to siloed data repositories, insufficient data sharing among stakeholders, and different data formats that operators must manually sort through and standardize. As a result, accurate, real-time information — such as where food shipments are at any given time, which shipments are affected by delays or food recalls, and when shipments have arrived at their final destination — is lacking. A centralized system capable of tracing food along its entire journey, from manufacture through delivery, would enable a more effective humanitarian response to food-aid needs.

    In 2020, a team from MIT Lincoln Laboratory began engaging with BHA to create an intelligent dashboard for their supply-chain operations. This dashboard brings together the expansive food-aid datasets from BHA’s existing systems into a single platform, with tools for visualizing and analyzing the data. When the team started developing the dashboard, they quickly realized the need for considerably more data than BHA had access to.

    “That’s where traceability comes in, with each handoff partner contributing key pieces of information as food moves through the supply chain,” explains Megan Richardson, a researcher in the laboratory’s Humanitarian Assistance and Disaster Relief Systems Group.

    Richardson and the rest of the team have been working with BHA and their partners to scope, build, and implement such an end-to-end traceability system. This system consists of serialized, unique identifiers (IDs) — akin to fingerprints — that are assigned to individual food items at the time they are produced. These individual IDs remain linked to items as they are aggregated along the supply chain, first domestically and then internationally. For example, individually tagged cans of vegetable oil get packaged into cartons; cartons are placed onto pallets and transported via railway and truck to warehouses; pallets are loaded onto shipping containers at U.S. ports; and pallets are unloaded and cartons are unpackaged overseas.
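
    The aggregation logic can be sketched as a simple parent-child mapping, so that an individual item’s full chain of custody stays recoverable; the IDs and event format here are invented for illustration, not the system’s actual design:

    ```python
    # Serialized item IDs stay linked to each container they are packed into, so an
    # individual can's history can be traced even after repeated grouping.
    aggregations = {}   # child ID -> parent ID

    def pack(child_ids, parent_id):
        for child in child_ids:
            aggregations[child] = parent_id

    pack(["CAN-0001", "CAN-0002"], "CARTON-A1")   # cans into a carton
    pack(["CARTON-A1", "CARTON-A2"], "PALLET-7")  # cartons onto a pallet
    pack(["PALLET-7"], "CONTAINER-X")             # pallet into a shipping container

    def trace(item_id):
        chain = [item_id]
        while chain[-1] in aggregations:
            chain.append(aggregations[chain[-1]])
        return chain

    print(trace("CAN-0001"))  # ['CAN-0001', 'CARTON-A1', 'PALLET-7', 'CONTAINER-X']
    ```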

    With a trace

    Today, visibility at the single-item level doesn’t exist. Most suppliers mark pallets with a lot number (a lot is a batch of items produced in the same run), but this is for internal purposes (i.e., to track issues stemming back to their production supply, like over-enriched ingredients or machinery malfunction), not data sharing. So, organizations know which supplier lot a pallet and carton are associated with, but they can’t track the unique history of an individual carton or item within that pallet. As the lots move further downstream toward their final destination, they are often mixed with lots from other productions, and possibly other commodity types altogether, because of space constraints. On the international side, such mixing and the lack of granularity make it difficult to quickly pull commodities out of the supply chain if food safety concerns arise. Current response times can span several months.

    “Commodities are grouped differently at different stages of the supply chain, so it is logical to track them in those groupings where needed,” Richardson says. “Our item-level granularity serves as a form of Rosetta Stone to enable stakeholders to efficiently communicate throughout these stages. We’re trying to enable a way to track not only the movement of commodities, including through their lot information, but also any problems arising independent of lot, like exposure to high humidity levels in a warehouse. Right now, we have no way to associate commodities with histories that may have resulted in an issue.”

    “You can now track your checked luggage across the world and the fish on your dinner plate,” adds Brice MacLaren, also a researcher in the laboratory’s Humanitarian Assistance and Disaster Relief Systems Group. “So, this technology isn’t new, but it’s new to BHA as they evolve their methodology for commodity tracing. The traceability system needs to be versatile, working across a wide variety of operators who take custody of the commodity along the supply chain and fitting into their existing best practices.”

    As food products make their way through the supply chain, operators at each receiving point would be able to scan these IDs via a Lincoln Laboratory-developed mobile application (app) to indicate a product’s current location and transaction status — for example, that it is en route on a particular shipping container or stored in a certain warehouse. This information would get uploaded to a secure traceability server. By scanning a product, operators would also see its history up until that point.   
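
    A scan event of that kind might look roughly like the record below; the field names are illustrative assumptions, not the system’s actual schema:

    ```python
    import json
    from datetime import datetime, timezone

    # Hypothetical scan-event payload an operator's phone might upload to the traceability server.
    scan_event = {
        "item_id": "CARTON-A1",
        "scanned_by": "warehouse-operator-042",
        "location": "Djibouti port, warehouse 3",
        "status": "received",
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    print(json.dumps(scan_event, indent=2))
    ```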

    Hitting the mark

    At the laboratory, the team tested the feasibility of their traceability technology, exploring different ways to mark and scan items. In their testing, they considered barcodes and radio-frequency identification (RFID) tags and handheld and fixed scanners. Their analysis revealed 2D barcodes (specifically data matrices) and smartphone-based scanners were the most feasible options in terms of how the technology works and how it fits into existing operations and infrastructure.

    “We needed to come up with a solution that would be practical and sustainable in the field,” MacLaren says. “While scanners can automatically read any RFID tags in close proximity as someone is walking by, they can’t discriminate exactly where the tags are coming from. RFID is expensive, and it’s hard to read commodities in bulk. On the other hand, a phone can scan a barcode on a particular box and tell you that code goes with that box. The challenge then becomes figuring out how to present the codes for people to easily scan without significantly interrupting their usual processes for handling and moving commodities.” 

    As the team learned from partner representatives in Kenya and Djibouti, offloading at the ports is a chaotic, fast operation. At manual warehouses, porters fling bags over their shoulders or stack cartons atop their heads any which way they can and run them to a drop point; at bagging terminals, commodities come down a conveyor belt and land this way or that way. With this variability comes several questions: How many barcodes do you need on an item? Where should they be placed? What size should they be? What will they cost? The laboratory team is considering these questions, keeping in mind that the answers will vary depending on the type of commodity; vegetable oil cartons will have different specifications than, say, 50-kilogram bags of wheat or peas.

    Leaving a mark

    Leveraging results from their testing and insights from international partners, the team has been running a traceability pilot evaluating how their proposed system meshes with real-world domestic and international operations. The current pilot features a domestic component in Houston, Texas, and an international component in Ethiopia, and focuses on tracking individual cartons of vegetable oil and identifying damaged cans. The Ethiopian team with Catholic Relief Services recently received a container filled with pallets of uniquely barcoded cartons of vegetable oil cans (in the next pilot, the cans will be barcoded, too). They are now scanning items and collecting data on product damage by using smartphones with the laboratory-developed mobile traceability app on which they were trained. 

    “The partners in Ethiopia are comparing a couple lid types to determine whether some are more resilient than others,” Richardson says. “With the app — which is designed to scan commodities, collect transaction data, and keep history — the partners can take pictures of damaged cans and see if a trend with the lid type emerges.”

    Next, the team will run a series of pilots with the World Food Program (WFP), the world’s largest humanitarian organization. The first pilot will focus on data connectivity and interoperability, and the team will engage with suppliers to directly print barcodes on individual commodities instead of applying barcode labels to packaging, as they did in the initial feasibility testing. The WFP will provide input on which of their operations are best suited for testing the traceability system, considering factors like the network bandwidth of WFP staff and local partners, the commodity types being distributed, and the country context for scanning. The BHA will likely also prioritize locations for system testing.

    “Our goal is to provide an infrastructure to enable as close to real-time data exchange as possible between all parties, given intermittent power and connectivity in these environments,” MacLaren says.

    In subsequent pilots, the team will try to integrate their approach with existing systems that partners rely on for tracking procurements, inventory, and movement of commodities under their custody so that this information is automatically pushed to the traceability server. The team also hopes to add a capability for real-time alerting of statuses, like the departure and arrival of commodities at a port or the exposure of unclaimed commodities to the elements. Real-time alerts would enable stakeholders to more efficiently respond to food-safety events. Currently, partners are forced to take a conservative approach, pulling out more commodities from the supply chain than are actually suspect, to reduce risk of harm. Both BHA and WFP are interested in testing out a food-safety event during one of the pilots to see how the traceability system works in enabling rapid communication response.

    To implement this technology at scale will require some standardization for marking different commodity types as well as give and take among the partners on best practices for handling commodities. It will also require an understanding of country regulations and partner interactions with subcontractors, government entities, and other stakeholders.

    “Within several years, I think it’s possible for BHA to use our system to mark and trace all their food procured in the United States and sent internationally,” MacLaren says.

    Once collected, the trove of traceability data could be harnessed for other purposes, among them analyzing historical trends, predicting future demand, and assessing the carbon footprint of commodity transport. In the future, a similar traceability system could scale for nonfood items, including medical supplies distributed to disaster victims, resources like generators and water trucks localized in emergency-response scenarios, and vaccines administered during pandemics. Several groups at the laboratory are also interested in such a system to track items such as tools deployed in space or equipment people carry through different operational environments.

    “When we first started this program, colleagues were asking why the laboratory was involved in simple tasks like making a dashboard, marking items with barcodes, and using hand scanners,” MacLaren says. “Our impact here isn’t about the technology; it’s about providing a strategy for coordinated food-aid response and successfully implementing that strategy. Most importantly, it’s about people getting fed.”

  • Preparing Colombia’s cities for life amid changing forests

    It was an uncharacteristically sunny morning as Marcela Angel MCP ’18, flanked by a drone pilot from the Boston engineering firm AirWorks and a data collection team from the Colombian regional environmental agency Corpoamazonia, climbed a hill in the Andes Mountains of southwest Colombia. The area’s usual mountain cloud cover — one of the major challenges to working with satellite imagery or flying UAVs (unpiloted aerial vehicles, or drones) in the Pacific highlands of the Amazon — would roll through in the hours to come. But for now, her team had chosen a good day to hike out for their first flight.

    Angel is used to long travel for her research. Raised in Bogotá, she maintained strong ties to Colombia throughout her master’s program in the MIT Department of Urban Studies and Planning (DUSP). Her graduate thesis, examining Bogotá’s management of its public green space, took her regularly back to her hometown, exploring how the city could offer residents more equal access to the clean air, flood protection and day-to-day health and social benefits provided by parks and trees.

    But the hill she was hiking this morning, outside the remote city of Mocoa, had taken an especially long time to climb: five years building relationships with the community of Mocoa and the Colombian government, recruiting project partners, and navigating the bureaucracy of bringing UAVs into the country. Now, her team finally unwrapped their first, knee-high drone from its tarp and set it carefully in the grass. Under the gathering gray clouds, the buzz of its rotors joined the hum of insects in the trees, and the machine at last took to the skies.

    From Colombia to Cambridge

    “I actually grew up on the last street before the eastern mountains reserve,” Angel says of her childhood in Bogotá. “I’ve always been at that border between city and nature.” This idea, that urban areas are married to the ecosystems around them, would inform Angel’s whole education and career.

    Before coming to MIT, she studied architecture at Bogotá’s Los Andes University; for her graduation project she proposed a plan to resettle an informal neighborhood on Bogotá’s outskirts to minimize environmental risks to its residents. Among her projects at MIT was an initiative to spatially analyze Bogotá’s tree canopy, providing data for the city to plan a tree-planting program as a strategy to give vulnerable populations in the city more access to nature.

    And she was naturally intrigued when Colombia’s former minister of environment and sustainable development came to MIT in 2017 to give a guest presentation to the DUSP master’s program. The minister, Luis Gilberto Murillo (now the Colombian ambassador to the United States), introduced the students to the challenges triggered by a recent disaster in the city of Mocoa, on the border between the lowland Amazon and the Andes Mountains. Unprecedented rainstorms had destabilized the surrounding forests, and that April a devastating flood and landslide had killed hundreds of people and destroyed entire neighborhoods. And as climate change contributed to growing rainfall in the region, the risks of more landslide events were rising.

    Murillo provided useful insights into how city planning decisions had contributed to the crisis. But he also asked for MIT’s support addressing future landslide risks in the area. Angel and Juan Camilo Osorio, a PhD candidate at DUSP, decided to take up the challenge, and in January 2018 and 2019, a research delegation from MIT traveled to Colombia for a newly created graduate course. Returning once again to Bogotá, Angel interviewed government agencies and nonprofits to understand the state of landslide monitoring and public policy. In Mocoa, further interviews and a series of workshops helped clarify what locals needed most and what MIT could provide: better information on where and when landslides might strike, and a process to increase risk awareness and involve traditionally marginalized groups in decision-making processes around that risk.

    Over the coming year, a core team formed to put the insights from this trip into action, including Angel, Osorio, postdoc Norhan Bayomi of the MIT Environmental Solutions Initiative (ESI) and MIT Professor John Fernández, director of the ESI and one of Angel’s mentors at DUSP. After a second visit to Mocoa that brought into the fold Indigenous groups, environmental agencies, and the national army, a plan was formed: MIT would partner with Corpoamazonia and build a network of community researchers to deploy and test drone technology and machine learning models to monitor the mountain forests for both landslide risks and signs of forest health, while implementing a participatory planning process with residents.

    “What our projects aim to do is give the communities new tools to continue protecting and restoring the forest,” says Angel, “and support new and inclusive development models, even in the face of new challenges.”

    Lifelines for the climate

    The goal of tropical forest conservation is an urgent one. As forests are cut down, their trees and soils release carbon they have stored over millennia, adding huge amounts of heat-trapping carbon dioxide to the atmosphere. Deforestation, mainly in the tropics, is now estimated to contribute more to climate change than any country besides the United States and China — and once lost, tropical forests are exceptionally hard to restore. “Tropical forests should be a natural way to slow and reverse climate change,” says Angel. “And they can be. But today, we are reaching critical tipping points where it is just the opposite.”

    This became the motivating force for Angel’s career after her graduation. In 2019, Fernández invited her to join the ESI and lead a new Natural Climate Solutions Program, with the Mocoa project as its first centerpiece. She quickly mobilized the partners to raise funding for the project from the Global Environment Facility and the CAF Development Bank of Latin America and the Caribbean, and recruited additional partners including MIT Lincoln Laboratory, AirWorks, and the Pratt Institute, where Osorio had become an assistant professor. She hired machine learning specialists from MIT to begin design on UAVs’ data processing, and helped assemble a local research network in Mocoa to increase risk awareness, promote community participation, and better understand what information city officials and community groups needed for city planning and conservation.

    “This is the amazing thing about MIT,” she says. “When you study a problem here, you’re not just playing in a sandbox. Everyone I’ve worked with is motivated by the complexity of the technical challenge and the opportunity for meaningful engagement in Mocoa, and hopefully in many more places besides.”

    At the same time, Angel created opportunities for the next generation of MIT graduate students to follow in her footsteps. With Fernández and Bayomi, she created a new course, 4.S23 (Biodiversity and Cities), in which students traveled to Colombia to develop urban planning strategies for the cities of Quibdó and Leticia, located in carbon-rich and biodiverse areas. The course has been taught twice, with Professor Gabriella Carolini joining the teaching team for spring 2023, and has already led to a student report to city officials in Quibdó recommending ways to enhance biodiversity and adapt to climate change as the city grows, a multi-stakeholder partnership to train local youth and implement a citizen-led biodiversity survey, and a seed grant from the MIT Climate and Sustainability Consortium to begin providing both cities detailed data on their tree cover derived from satellite images.

    “These regions face serious threats, especially on a warming planet, but many of the solutions for climate change, biodiversity conservation, and environmental equity in the region go hand-in-hand,” Angel says. “When you design a city to use fewer resources, to contribute less to climate change, it also causes less pressure on the environment around it. When you design a city for equity and quality of life, you’re giving attention to its green spaces and what they can provide for people and as habitat for other species. When you protect and restore forests, you’re protecting local bioeconomies.”

    Bringing the data home

    Meanwhile, in Mocoa, Angel’s original vision is taking flight. With the team’s test flights behind them, they can now begin creating digital models of the surrounding area. Regular drone flights and soil samples will fill in changing information about trees, water, and local geology, allowing the project’s machine learning specialists to identify warning signs for future landslides and extreme weather events (a simplified sketch of this kind of risk flagging appears at the end of this story). More importantly, there is now an established network of local community researchers and leaders ready to make use of this information. With feedback from their Mocoan partners, Angel’s team has built a prototype of the online platform they will use to share their UAV data; they’re now letting Mocoa residents take it for a test drive and suggest how it can be made more user-friendly.

    Her visit this January also paved the way for new projects that will tie the Environmental Solutions Initiative more tightly to Mocoa. With her project partners, Angel is exploring developing a course to teach local students how to use UAVs like the ones her team is flying. She is also considering expanded efforts to collect the kind of informal knowledge of Mocoa, on the local ecology and culture, that people everywhere use in making city planning and emergency response decisions, but that is rarely codified and included in scientific risk analyses.

    It’s a great deal of work to offer this one community the tools to adapt successfully to climate change. But even with all the robotics and machine learning models in the world, this close, slow-unfolding engagement, grounded in trust and community inclusion, is what it takes to truly prepare people to confront profound changes in their city and environment.

    “Protecting natural carbon sinks is a global socio-environmental challenge, and one where it is not enough for MIT to just contribute to the knowledge base or develop a new technology,” says Angel. “But we can help mobilize decision-makers and nontraditional actors, and design more inclusive and technology-enhanced processes, to make this easier for the people who have lifelong stakes in these ecosystems. That is the vision.”
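    To make the drone-and-machine-learning idea concrete, here is a minimal, illustrative sketch of how terrain features derived from UAV surveys and soil samples might feed a landslide-risk classifier. The feature names, thresholds, and synthetic data are hypothetical placeholders for illustration only, not the Mocoa team’s actual models or data.

```python
# Illustrative sketch only: a simple supervised classifier that flags terrain
# cells as landslide-prone from drone-derived features. All features and labels
# below are synthetic placeholders, not the project's real pipeline.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

rng = np.random.default_rng(0)
n_cells = 2000  # hypothetical terrain cells from a photogrammetry-derived grid

# Hypothetical per-cell features: slope (degrees), 7-day rainfall (mm),
# soil moisture (volumetric fraction), and recent change in canopy cover.
X = np.column_stack([
    rng.uniform(0, 60, n_cells),       # slope
    rng.uniform(0, 300, n_cells),      # recent rainfall
    rng.uniform(0.05, 0.45, n_cells),  # soil moisture
    rng.uniform(-0.3, 0.1, n_cells),   # canopy-cover change (negative = loss)
])

# Placeholder labels: steeper, wetter, recently de-vegetated cells are more
# often marked "at risk" (1). Real labels would come from mapped landslide scars.
risk_score = 0.02 * X[:, 0] + 0.004 * X[:, 1] + 2.0 * X[:, 2] - 3.0 * X[:, 3]
y = (risk_score + rng.normal(0, 0.5, n_cells) > 2.2).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)
print(classification_report(y_test, model.predict(X_test)))
```

    In a real deployment, the labels would come from field surveys and mapped landslide scars, and the features from elevation models, soil samples, and rainfall records collected by the community research network.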

  • in

    Detailed images from space offer clearer picture of drought effects on plants

    “MIT is a place where dreams come true,” says César Terrer, an assistant professor in the Department of Civil and Environmental Engineering. At MIT, Terrer says, he has the resources he needs to explore the ideas he finds most exciting, and at the top of his list is climate science. In particular, he is interested in plant-soil interactions and how the two can mitigate the impacts of climate change. In 2022, Terrer received seed grant funding from the Abdul Latif Jameel Water and Food Systems Lab (J-WAFS) to develop drought monitoring systems for farmers. The project leverages a new generation of remote sensing instruments to provide high-resolution estimates of plant water stress at regional to global scales.

    Growing up in Granada, Spain, Terrer always had an aptitude and passion for science. He studied environmental science at the University of Murcia, where he interned in the Department of Ecology. Using computational analysis tools, he worked on modeling species distribution in response to human development. Early on in his undergraduate experience, Terrer says he regarded his professors as “superheroes” with a kind of scholarly prowess. He knew he wanted to follow in their footsteps by one day working as a faculty member in academia. Of course, there would be many steps along the way before achieving that dream. 

    Upon completing his undergraduate studies, Terrer set his sights on exciting and adventurous research roles. He thought perhaps he would conduct field work in the Amazon, engaging with native communities. But when the opportunity arose to work in Australia on a state-of-the-art climate change experiment that simulates future levels of carbon dioxide, he headed south to study how plants react to CO2 in a biome of native Australian eucalyptus trees. It was during this experience that Terrer started to take a keen interest in the carbon cycle and the capacity of ecosystems to buffer rising levels of CO2 caused by human activity.

    Around 2014, he delved deeper into the carbon cycle as he began his doctoral studies at Imperial College London. The primary question Terrer sought to answer during his PhD was “will plants be able to absorb predicted future levels of CO2 in the atmosphere?” To answer it, Terrer became an early adopter of artificial intelligence, machine learning, and remote sensing to analyze data from real-life, global climate change experiments. His findings from these “ground truth” values and observations resulted in a paper in the journal Science, in which he argued that climate models most likely overestimated, by a factor of three, how much carbon plants will be able to absorb by the end of the century.

    After postdoctoral positions at Stanford University and the Universitat Autonoma de Barcelona, followed by a prestigious Lawrence Fellowship, Terrer says he had “too many ideas and not enough time to accomplish all those ideas.” He knew it was time to lead his own group. Not long after applying for faculty positions, he landed at MIT. 

    New ways to monitor drought

    Terrer is employing similar methods to those he used during his PhD to analyze data from all over the world for his J-WAFS project. He and postdoc Wenzhe Jiao collect data from remote sensing satellites and field experiments and use machine learning to come up with new ways to monitor drought. Terrer says Jiao is a “remote sensing wizard,” who fuses data from different satellite products to understand the water cycle. With Jiao’s hydrology expertise and Terrer’s knowledge of plants, soil, and the carbon cycle, the duo is a formidable team to tackle this project.

    According to the U.N. World Meteorological Organization, the number and duration of droughts have increased by 29 percent since 2000, compared with the two previous decades. From the Horn of Africa to the Western United States, drought is devastating vegetation, severely stressing water supplies, compromising food production, and deepening food insecurity. Drought monitoring can offer fundamental information on drought location, frequency, and severity, but assessing the impact of drought on vegetation is extremely challenging, because plants’ sensitivity to water deficits varies across species and ecosystems.

    Terrer and Jiao are able to obtain a clearer picture of how drought is affecting plants by employing the latest generation of remote sensing observations, which offer images of the planet with very high spatial and temporal resolution. Satellite products such as Sentinel, Landsat, and Planet can provide daily images from space with such high resolution that individual trees can be discerned. Along with the images and datasets from satellites, the team is using ground-based observations from meteorological data, and they are processing and analyzing all of the datasets on the MIT SuperCloud at MIT Lincoln Laboratory. The J-WAFS project is among the first to leverage high-resolution data to quantitatively measure plant drought impacts in the United States, with the hope of expanding to a global assessment in the future.
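    As a rough illustration of how multi-band satellite imagery translates into a drought signal, the sketch below computes a standardized NDVI anomaly, one common greenness-based stress indicator. The reflectance arrays are random placeholders rather than real Sentinel, Landsat, or Planet data, and the one-standard-deviation threshold is an assumption for illustration, not the team’s method.

```python
# Illustrative sketch: quantify vegetation drought stress as an NDVI anomaly,
# i.e. how far current greenness falls below a multi-year baseline per pixel.
# The arrays are random placeholders standing in for satellite reflectance bands.
import numpy as np

def ndvi(red: np.ndarray, nir: np.ndarray) -> np.ndarray:
    """Normalized Difference Vegetation Index, clipped to avoid divide-by-zero."""
    return (nir - red) / np.clip(nir + red, 1e-6, None)

rng = np.random.default_rng(1)
shape = (512, 512)  # a small tile; at 10 m pixels this spans roughly 5 km x 5 km

# Placeholder reflectance for the current scene and a 10-year historical stack.
red_now, nir_now = rng.uniform(0.05, 0.3, shape), rng.uniform(0.2, 0.6, shape)
red_hist = rng.uniform(0.05, 0.3, (10, *shape))
nir_hist = rng.uniform(0.2, 0.6, (10, *shape))

ndvi_now = ndvi(red_now, nir_now)
baseline = ndvi(red_hist, nir_hist)

# Standardized anomaly: negative values indicate below-normal greenness,
# a common proxy for vegetation water stress.
z = (ndvi_now - baseline.mean(axis=0)) / (baseline.std(axis=0) + 1e-6)
stressed = z < -1.0  # flag pixels more than one standard deviation below normal
print(f"Fraction of tile flagged as stressed: {stressed.mean():.2%}")
```

    In practice, an index like this would be only one ingredient: cloud masking, georeferencing, soil moisture, and meteorological records would all need to be folded in, which is why fusing multiple satellite and ground-based products matters.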

    Assisting farmers and resource managers 

    Every week, the U.S. Drought Monitor provides a map of drought conditions in the United States. The map has very coarse spatial resolution and is more of a drought recap or summary, unable to predict future drought scenarios. The lack of a comprehensive spatiotemporal evaluation of historic and future drought impacts on global vegetation productivity is detrimental to farmers both in the United States and worldwide.

    Terrer and Jiao plan to generate metrics for plant water stress at an unprecedented resolution of 10-30 meters. This means that they will be able to provide drought monitoring maps at the scale of a typical U.S. farm, giving farmers more precise, useful data every one to two days. The team will use the information from the satellites to monitor plant growth and soil moisture, as well as the time lag of plant growth response to soil moisture. In this way, Terrer and Jiao say they will eventually be able to create a kind of “plant water stress forecast” that may be able to predict adverse impacts of drought four weeks in advance. “According to the current soil moisture and lagged response time, we hope to predict plant water stress in the future,” says Jiao. 
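    The forecasting idea can be illustrated with a toy lagged regression: if greenness responds to soil moisture with a delay, today’s soil moisture carries information about plant stress a few weeks from now. The weekly series, the fixed four-week lag, and the linear model below are hypothetical simplifications for illustration, not the team’s actual approach.

```python
# Illustrative sketch of the "lagged response" idea: project plant water stress
# a few weeks ahead from current soil moisture. Data and lag are hypothetical.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(2)
n_weeks = 200
lag = 4  # assume a four-week vegetation response to soil-moisture deficits

# Placeholder weekly series in which soil moisture drives NDVI 'lag' weeks later.
soil_moisture = np.clip(rng.normal(0.25, 0.07, n_weeks), 0.05, 0.45)
greenness = 0.2 + 1.2 * np.roll(soil_moisture, lag) + rng.normal(0, 0.02, n_weeks)
greenness[:lag] = np.nan  # the first 'lag' weeks have no valid predictor

# Fit greenness(t) as a function of soil moisture(t - lag).
X = soil_moisture[:-lag].reshape(-1, 1)
y = greenness[lag:]
model = LinearRegression().fit(X, y)

# Forecast: this week's soil moisture implies greenness (and stress) 'lag' weeks out.
forecast = model.predict(soil_moisture[-1:].reshape(-1, 1))
print(f"Projected greenness in {lag} weeks: {forecast[0]:.2f}")
```

    In the approach the article describes, the response lag itself is something the researchers would estimate from satellite observations for each location, rather than fixing it in advance as this toy example does.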

    The expected outcomes of this project will give farmers, land and water resource managers, and decision-makers more accurate data at the farm-specific level, allowing for better drought preparation, mitigation, and adaptation. “We expect to make our data open-access online, after we finish the project, so that farmers and other stakeholders can use the maps as tools,” says Jiao. 

    Terrer adds that the project “has the potential to help us better understand the future states of climate systems, and also identify the regional hot spots more likely to experience water crises at the national, state, local, and tribal government scales.” He also expects the project will enhance our understanding of global carbon-water-energy cycle responses to drought, with applications in determining climate change impacts on natural ecosystems as a whole.