More stories

  • in

    Explained: Generative AI’s environmental impact

    In a two-part series, MIT News explores the environmental implications of generative AI. In this article, we look at why this technology is so resource-intensive. A second piece will investigate what experts are doing to reduce genAI’s carbon footprint and other impacts.

    The excitement surrounding potential benefits of generative AI, from improving worker productivity to advancing scientific research, is hard to ignore. While the explosive growth of this new technology has enabled rapid deployment of powerful models in many industries, the environmental consequences of this generative AI “gold rush” remain difficult to pin down, let alone mitigate.

    The computational power required to train generative AI models that often have billions of parameters, such as OpenAI’s GPT-4, can demand a staggering amount of electricity, which leads to increased carbon dioxide emissions and pressures on the electric grid.

    Furthermore, deploying these models in real-world applications, enabling millions to use generative AI in their daily lives, and then fine-tuning the models to improve their performance draws large amounts of energy long after a model has been developed.

    Beyond electricity demands, a great deal of water is needed to cool the hardware used for training, deploying, and fine-tuning generative AI models, which can strain municipal water supplies and disrupt local ecosystems. The increasing number of generative AI applications has also spurred demand for high-performance computing hardware, adding indirect environmental impacts from its manufacture and transport.

    “When we think about the environmental impact of generative AI, it is not just the electricity you consume when you plug the computer in. There are much broader consequences that go out to a system level and persist based on actions that we take,” says Elsa A. Olivetti, professor in the Department of Materials Science and Engineering and the lead of the Decarbonization Mission of MIT’s new Climate Project.

    Olivetti is senior author of a 2024 paper, “The Climate and Sustainability Implications of Generative AI,” co-authored by MIT colleagues in response to an Institute-wide call for papers that explore the transformative potential of generative AI, in both positive and negative directions for society.

    Demanding data centers

    The electricity demands of data centers are one major factor contributing to the environmental impacts of generative AI, since data centers are used to train and run the deep learning models behind popular tools like ChatGPT and DALL-E.

    A data center is a temperature-controlled building that houses computing infrastructure, such as servers, data storage drives, and network equipment. For instance, Amazon has more than 100 data centers worldwide, each of which has about 50,000 servers that the company uses to support cloud computing services.

    While data centers have been around since the 1940s (the first was built at the University of Pennsylvania in 1945 to support the first general-purpose digital computer, the ENIAC), the rise of generative AI has dramatically increased the pace of data center construction.

    “What is different about generative AI is the power density it requires. Fundamentally, it is just computing, but a generative AI training cluster might consume seven or eight times more energy than a typical computing workload,” says Noman Bashir, lead author of the impact paper, who is a Computing and Climate Impact Fellow at MIT Climate and Sustainability Consortium (MCSC) and a postdoc in the Computer Science and Artificial Intelligence Laboratory (CSAIL).

    Scientists have estimated that the power requirements of data centers in North America increased from 2,688 megawatts at the end of 2022 to 5,341 megawatts at the end of 2023, partly driven by the demands of generative AI. Globally, the electricity consumption of data centers rose to 460 terawatt-hours in 2022. This would have made data centers the 11th largest electricity consumer in the world, between the nations of Saudi Arabia (371 terawatt-hours) and France (463 terawatt-hours), according to the Organization for Economic Co-operation and Development.

    By 2026, the electricity consumption of data centers is expected to approach 1,050 terawatt-hours (which would bump data centers up to fifth place on the global list, between Japan and Russia).

    While not all data center computation involves generative AI, the technology has been a major driver of increasing energy demands.

    “The demand for new data centers cannot be met in a sustainable way. The pace at which companies are building new data centers means the bulk of the electricity to power them must come from fossil fuel-based power plants,” says Bashir.

    The power needed to train and deploy a model like OpenAI’s GPT-3 is difficult to ascertain. In a 2021 research paper, scientists from Google and the University of California at Berkeley estimated the training process alone consumed 1,287 megawatt hours of electricity (enough to power about 120 average U.S. homes for a year), generating about 552 tons of carbon dioxide.

    While all machine-learning models must be trained, one issue unique to generative AI is the rapid fluctuations in energy use that occur over different phases of the training process, Bashir explains.

    Power grid operators must have a way to absorb those fluctuations to protect the grid, and they usually employ diesel-based generators for that task.

    Increasing impacts from inference

    Once a generative AI model is trained, the energy demands don’t disappear.

    Each time a model is used, perhaps by an individual asking ChatGPT to summarize an email, the computing hardware that performs those operations consumes energy. Researchers have estimated that a ChatGPT query consumes about five times more electricity than a simple web search.

    “But an everyday user doesn’t think too much about that,” says Bashir. “The ease-of-use of generative AI interfaces and the lack of information about the environmental impacts of my actions means that, as a user, I don’t have much incentive to cut back on my use of generative AI.”

    With traditional AI, the energy usage is split fairly evenly between data processing, model training, and inference, which is the process of using a trained model to make predictions on new data. However, Bashir expects the electricity demands of generative AI inference to eventually dominate since these models are becoming ubiquitous in so many applications, and the electricity needed for inference will increase as future versions of the models become larger and more complex.

    Plus, generative AI models have an especially short shelf-life, driven by rising demand for new AI applications. Companies release new models every few weeks, so the energy used to train prior versions goes to waste, Bashir adds. New models often consume more energy for training, since they usually have more parameters than their predecessors.

    While electricity demands of data centers may be getting the most attention in research literature, the amount of water consumed by these facilities has environmental impacts, as well.

    Chilled water is used to cool a data center by absorbing heat from computing equipment. It has been estimated that, for each kilowatt hour of energy a data center consumes, it would need two liters of water for cooling, says Bashir.

    “Just because this is called ‘cloud computing’ doesn’t mean the hardware lives in the cloud. Data centers are present in our physical world, and because of their water usage they have direct and indirect implications for biodiversity,” he says.

    The computing hardware inside data centers brings its own, less direct environmental impacts.

    While it is difficult to estimate how much power is needed to manufacture a GPU, a type of powerful processor that can handle intensive generative AI workloads, it would be more than what is needed to produce a simpler CPU because the fabrication process is more complex. A GPU’s carbon footprint is compounded by the emissions related to material and product transport.

    There are also environmental implications of obtaining the raw materials used to fabricate GPUs, which can involve dirty mining procedures and the use of toxic chemicals for processing.

    Market research firm TechInsights estimates that the three major producers (NVIDIA, AMD, and Intel) shipped 3.85 million GPUs to data centers in 2023, up from about 2.67 million in 2022. That number is expected to have increased by an even greater percentage in 2024.

    The industry is on an unsustainable path, but there are ways to encourage responsible development of generative AI that supports environmental objectives, Bashir says.

    He, Olivetti, and their MIT colleagues argue that this will require a comprehensive consideration of all the environmental and societal costs of generative AI, as well as a detailed assessment of the value in its perceived benefits.

    “We need a more contextual way of systematically and comprehensively understanding the implications of new developments in this space. Due to the speed at which there have been improvements, we haven’t had a chance to catch up with our abilities to measure and understand the tradeoffs,” Olivetti says.
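    Several of the figures quoted in the article above can be cross-checked with simple arithmetic. The sketch below is illustrative only: the inputs are the article's own numbers, the variable and function names are made up for this example, and applying the two-liters-per-kilowatt-hour cooling estimate to the GPT-3 training run is an assumption of this sketch, not a claim made by the researchers.

```python
# Back-of-envelope checks on figures quoted in the article.
# Inputs are the article's numbers; derived values are illustrative only.

GPT3_TRAINING_MWH = 1_287     # estimated GPT-3 training energy (2021 paper)
HOMES_POWERED = 120           # "about 120 average U.S. homes for a year"
WATER_L_PER_KWH = 2.0         # cooling-water estimate quoted by Bashir

NA_POWER_MW_2022 = 2_688      # North American data center power, end of 2022
NA_POWER_MW_2023 = 5_341      # same, end of 2023

GPU_SHIPMENTS_2022 = 2.67e6   # TechInsights estimate for 2022
GPU_SHIPMENTS_2023 = 3.85e6   # TechInsights estimate for 2023


def percent_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


# Annual electricity use per home implied by the "120 homes" comparison.
mwh_per_home = GPT3_TRAINING_MWH / HOMES_POWERED

# ASSUMPTION: the 2 L/kWh cooling estimate applies to the training run.
training_kwh = GPT3_TRAINING_MWH * 1_000
cooling_water_liters = training_kwh * WATER_L_PER_KWH

power_growth = percent_change(NA_POWER_MW_2022, NA_POWER_MW_2023)
gpu_growth = percent_change(GPU_SHIPMENTS_2022, GPU_SHIPMENTS_2023)

print(f"Implied use per home: {mwh_per_home:.1f} MWh per year")
print(f"Cooling water under the stated assumption: {cooling_water_liters:,.0f} liters")
print(f"Growth in North American data center power: {power_growth:.1f}%")
print(f"Growth in data center GPU shipments: {gpu_growth:.1f}%")
```

    Under these inputs, power requirements in North America roughly doubled in a single year, while GPU shipments grew by a little under half.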

  • in

    J-PAL North America announces new evaluation incubator collaborators from state and local governments

    J-PAL North America recently selected government partners for the 2024-25 Leveraging Evaluation and Evidence for Equitable Recovery (LEVER) Evaluation Incubator cohort. Selected collaborators will receive funding and technical assistance to develop or launch a randomized evaluation for one of their programs. These collaborations represent jurisdictions across the United States and demonstrate the growing enthusiasm for evidence-based policymaking.

    Launched in 2023, LEVER is a joint venture between J-PAL North America and Results for America. Through the Evaluation Incubator, trainings, and other program offerings, LEVER seeks to address the barriers many state and local governments face around finding and generating evidence to inform program design. LEVER offers government leaders the opportunity to learn best practices for policy evaluations and how to integrate evidence into decision-making. Since the program’s inception, more than 80 government jurisdictions have participated in LEVER offerings.

    J-PAL North America’s Evaluation Incubator helps collaborators turn policy-relevant research questions into well-designed randomized evaluations, generating rigorous evidence to inform pressing programmatic and policy decisions. The program also aims to build a culture of evidence use and give government partners the tools to continue generating and utilizing evidence in their day-to-day operations.

    In addition to funding and technical assistance, the selected state and local government collaborators will be connected with researchers from J-PAL’s network to help advance their evaluation ideas. Evaluation support will also be centered on community-engaged research practices, which emphasize collaborating with and learning from the groups most affected by the program being evaluated.

    Evaluation Incubator selected projects

    Pierce County Human Services (PCHS) in the state of Washington will evaluate two programs as part of the Evaluation Incubator. The first will examine how extending stays in a fentanyl detox program affects the successful completion of inpatient treatment and hospital utilization for individuals. “PCHS is interested in evaluating longer fentanyl detox stays to inform our funding decisions, streamline our resource utilization, and encourage additional financial commitments to address the unmet needs of individuals dealing with opioid use disorder,” says Trish Crocker, grant coordinator.

    The second PCHS program will evaluate the impact of providing medication and outreach services via a mobile distribution unit to individuals with opioid use disorders on program take-up and substance usage. Margo Burnison, a behavioral health manager with PCHS, says that the team is “thrilled to be partnering with J-PAL North America to dive deep into the data to inform our elected leaders on the best way to utilize available resources.”

    The City of Los Angeles Youth Development Department (YDD) seeks to evaluate a research-informed program: Student Engagement, Exploration, and Development in STEM (SEEDS). This intergenerational STEM mentorship program supports underrepresented middle school and college students in STEM by providing culturally responsive mentorship. The program seeks to foster these students’ STEM identity and degree attainment in higher education. YDD has been working with researchers at the University of Southern California to measure the SEEDS program’s impact, but is interested in developing a randomized evaluation to generate further evidence. Darnell Cole, professor and co-director of the Research Center for Education, Identity and Social Justice, shares his excitement about the collaboration with J-PAL: “We welcome the opportunity to measure the impact of the SEEDS program on our students’ educational experience. Rigorously testing the SEEDS program will help us improve support for STEM students, ultimately enhancing their persistence and success.”

    The Fort Wayne Police Department’s Hope and Recovery Team in Indiana will evaluate the impact of two programs that connect social workers with people who have experienced an overdose, or who have a mental health illness, to treatment and resources. “We believe we are on the right track in the work we are doing with the crisis intervention social worker and the recovery coach, but having an outside evaluation of both programs would be extremely helpful in understanding whether and what aspects of these programs are most effective,” says Police Captain Kevin Hunter.

    The County of San Diego’s Office of Evaluation, Performance and Analytics, and Planning & Development Services will engage with J-PAL staff to explore evaluation opportunities for two programs that are a part of the county’s Climate Action Plan. The Equity-Driven Tree Planting Program seeks to increase tree canopy coverage, and the Climate Smart Land Stewardship Program will encourage climate-smart agricultural practices. Ricardo Basurto-Davila, chief evaluation officer, says that “the county is dedicated to evidence-based policymaking and taking decisive action against climate change. The work with J-PAL will support us in combining these commitments to maximize the effectiveness in decreasing emissions through these programs.”

    J-PAL North America looks forward to working with the selected collaborators in the coming months to learn more about these promising programs, clarify our partners’ evidence goals, and design randomized evaluations to measure their impact.

  • in

    Study finds mercury pollution from human activities is declining

    MIT researchers have some good environmental news: Mercury emissions from human activity have been declining over the past two decades, despite global emissions inventories that indicate otherwise.

    In a new study, the researchers analyzed measurements from all available monitoring stations in the Northern Hemisphere and found that atmospheric concentrations of mercury declined by about 10 percent between 2005 and 2020.

    They used two separate modeling methods to determine what is driving that trend. Both techniques pointed to a decline in mercury emissions from human activity as the most likely cause.

    Global inventories, on the other hand, have reported opposite trends. These inventories estimate atmospheric emissions using models that incorporate average emission rates of polluting activities and the scale of these activities worldwide.

    “Our work shows that it is very important to learn from actual, on-the-ground data to try and improve our models and these emissions estimates. This is very relevant for policy because, if we are not able to accurately estimate past mercury emissions, how are we going to predict how mercury pollution will evolve in the future?” says Ari Feinberg, a former postdoc in the Institute for Data, Systems, and Society (IDSS) and lead author of the study.

    The new results could help inform scientists who are embarking on a collaborative, global effort to evaluate pollution models and develop a more in-depth understanding of what drives global atmospheric concentrations of mercury.

    However, due to a lack of data from global monitoring stations and limitations in the scientific understanding of mercury pollution, the researchers couldn’t pinpoint a definitive reason for the mismatch between the inventories and the recorded measurements.

    “It seems like mercury emissions are moving in the right direction, and could continue to do so, which is heartening to see. But this was as far as we could get with mercury. We need to keep measuring and advancing the science,” adds co-author Noelle Selin, an MIT professor in the IDSS and the Department of Earth, Atmospheric and Planetary Sciences (EAPS).

    Feinberg and Selin, his MIT postdoctoral advisor, are joined on the paper by an international team of researchers that contributed atmospheric mercury measurement data and statistical methods to the study. The research appears this week in the Proceedings of the National Academy of Sciences.

    Mercury mismatch

    The Minamata Convention is a global treaty that aims to cut human-caused emissions of mercury, a potent neurotoxin that enters the atmosphere from sources like coal-fired power plants and small-scale gold mining.

    The treaty, which was signed in 2013 and went into force in 2017, is evaluated every five years. The first meeting of its conference of parties coincided with disheartening news reports that said global inventories of mercury emissions, compiled in part from information from national inventories, had increased despite international efforts to reduce them.

    This was puzzling news for environmental scientists like Selin. Data from monitoring stations showed atmospheric mercury concentrations declining during the same period.

    Bottom-up inventories combine emission factors, such as the amount of mercury that enters the atmosphere when coal mined in a certain region is burned, with estimates of pollution-causing activities, like how much of that coal is burned in power plants.

    “The big question we wanted to answer was: What is actually happening to mercury in the atmosphere and what does that say about anthropogenic emissions over time?” Selin says.

    Modeling mercury emissions is especially tricky. First, mercury is the only metal that is in liquid form at room temperature, so it has unique properties. Moreover, mercury that has been removed from the atmosphere by sinks like the ocean or land can be re-emitted later, making it hard to identify primary emission sources.

    At the same time, mercury is more difficult to study in laboratory settings than many other air pollutants, especially due to its toxicity, so scientists have limited understanding of all chemical reactions mercury can undergo. There is also a much smaller network of mercury monitoring stations, compared to other polluting gases like methane and nitrous oxide.

    “One of the challenges of our study was to come up with statistical methods that can address those data gaps, because available measurements come from different time periods and different measurement networks,” Feinberg says.

    Multifaceted models

    The researchers compiled data from 51 stations in the Northern Hemisphere. They used statistical techniques to aggregate data from nearby stations, which helped them overcome data gaps and evaluate regional trends.

    By combining data from 11 regions, their analysis indicated that Northern Hemisphere atmospheric mercury concentrations declined by about 10 percent between 2005 and 2020.

    Then the researchers used two modeling methods, biogeochemical box modeling and chemical transport modeling, to explore possible causes of that decline. Box modeling was used to run hundreds of thousands of simulations to evaluate a wide array of emission scenarios. Chemical transport modeling is more computationally expensive but enables researchers to assess the impacts of meteorology and spatial variations on trends in selected scenarios.

    For instance, they tested one hypothesis that there may be an additional environmental sink that is removing more mercury from the atmosphere than previously thought. The models would indicate the feasibility of an unknown sink of that magnitude.

    “As we went through each hypothesis systematically, we were pretty surprised that we could really point to declines in anthropogenic emissions as being the most likely cause,” Selin says.

    Their work underscores the importance of long-term mercury monitoring stations, Feinberg adds. Many stations the researchers evaluated are no longer operational because of a lack of funding.

    While their analysis couldn’t zero in on exactly why the emissions inventories didn’t match up with actual data, they have a few hypotheses.

    One possibility is that global inventories are missing key information from certain countries. For instance, the researchers resolved some discrepancies when they used a more detailed regional inventory from China. But there was still a gap between observations and estimates.

    They also suspect the discrepancy might be the result of changes in two large sources of mercury that are particularly uncertain: emissions from small-scale gold mining and mercury-containing products.

    Small-scale gold mining involves using mercury to extract gold from soil and is often performed in remote parts of developing countries, making it hard to estimate. Yet small-scale gold mining contributes about 40 percent of human-made emissions.

    In addition, it’s difficult to determine how long it takes the pollutant to be released into the atmosphere from discarded products like thermometers or scientific equipment.

    “We’re not there yet where we can really pinpoint which source is responsible for this discrepancy,” Feinberg says.

    In the future, researchers from multiple countries, including some at MIT, will collaborate to study and improve the models they use to estimate and evaluate emissions. This research will be influential in helping that project move the needle on monitoring mercury, he says.

    This research was funded by the Swiss National Science Foundation, the U.S. National Science Foundation, and the U.S. Environmental Protection Agency.
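    The bottom-up inventory approach described in the article above, which multiplies an emission factor by an activity level for each source and sums the results, can be sketched in a few lines. This is a minimal illustration, not the method used in the study or in any real inventory: the `Source` class, the function names, and every number below are hypothetical placeholders chosen only to show how an uncertain emission factor shifts the total.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Source:
    """One emitting activity in a bottom-up inventory (hypothetical structure)."""
    name: str
    activity: float         # scale of the activity per year, in arbitrary units
    emission_factor: float  # mercury emitted per unit of activity


def bottom_up_total(sources: list[Source]) -> float:
    """Total emissions = sum over sources of activity * emission factor."""
    return sum(s.activity * s.emission_factor for s in sources)


def percent_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


# PLACEHOLDER values, not real inventory data.
inventory = [
    Source("coal_power", activity=1000.0, emission_factor=0.25),
    Source("small_scale_gold_mining", activity=50.0, emission_factor=4.0),
]
baseline = bottom_up_total(inventory)

# Sensitivity check: suppose the true gold-mining emission factor were half
# of what the inventory assumed. The activity data are left unchanged.
revised = [
    replace(s, emission_factor=s.emission_factor / 2)
    if s.name == "small_scale_gold_mining" else s
    for s in inventory
]
revised_total = bottom_up_total(revised)

print(f"Baseline estimate: {baseline:.1f}")
print(f"Revised estimate:  {revised_total:.1f}")
print(f"Change in total:   {percent_change(baseline, revised_total):.1f}%")
```

    The point of the sketch is that the total depends on factors that are hard to measure, so an error in a single uncertain factor can move the estimated total substantially even when the activity data are accurate.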

  • in

    “They can see themselves shaping the world they live in”

    During the journey from the suburbs to the city, the tree canopy often dwindles down as skyscrapers rise up. A group of New England Innovation Academy students wondered why that is.

    “Our friend Victoria noticed that where we live in Marlborough there are lots of trees in our own backyards. But if you drive just 30 minutes to Boston, there are almost no trees,” said high school junior Ileana Fournier. “We were struck by that duality.”

    This inspired Fournier and her classmates Victoria Leeth and Jessie Magenyi to prototype a mobile app that illustrates Massachusetts deforestation trends for Day of AI, a free, hands-on curriculum developed by the MIT Responsible AI for Social Empowerment and Education (RAISE) initiative, headquartered in the MIT Media Lab and in collaboration with the MIT Schwarzman College of Computing and MIT Open Learning. They were among a group of 20 students from New England Innovation Academy who shared their projects during the 2024 Day of AI global celebration hosted with the Museum of Science.

    The Day of AI curriculum introduces K-12 students to artificial intelligence. Now in its third year, Day of AI enables students to improve their communities and collaborate on larger global challenges using AI. Fournier, Leeth, and Magenyi’s TreeSavers app falls under the Telling Climate Stories with Data module, one of four new climate-change-focused lessons.

    “We want you to be able to express yourselves creatively to use AI to solve problems with critical-thinking skills,” Cynthia Breazeal, director of MIT RAISE, dean for digital learning at MIT Open Learning, and professor of media arts and sciences, said during this year’s Day of AI global celebration at the Museum of Science. “We want you to have an ethical and responsible way to think about this really powerful, cool, and exciting technology.”

    Moving from understanding to action

    Day of AI invites students to examine the intersection of AI and various disciplines, such as history, civics, computer science, math, and climate change. With the curriculum available year-round, more than 10,000 educators across 114 countries have brought Day of AI activities to their classrooms and homes.

    The curriculum gives students the agency to evaluate local issues and invent meaningful solutions. “We’re thinking about how to create tools that will allow kids to have direct access to data and have a personal connection that intersects with their lived experiences,” Robert Parks, curriculum developer at MIT RAISE, said at the Day of AI global celebration.

    Before this year, first-year Jeremie Kwapong said he knew very little about AI. “I was very intrigued,” he said. “I started to experiment with ChatGPT to see how it reacts. How close can I get this to human emotion? What is AI’s knowledge compared to a human’s knowledge?”

    In addition to helping students spark an interest in AI literacy, teachers around the world have told MIT RAISE that they want to use data science lessons to engage students in conversations about climate change. Therefore, Day of AI’s new hands-on projects use weather and climate change to show students why it’s important to develop a critical understanding of dataset design and collection when observing the world around them.

    “There is a lag between cause and effect in everyday lives,” said Parks. “Our goal is to demystify that, and allow kids to access data so they can see a long view of things.”

    Tools like MIT App Inventor, which allows anyone to create a mobile application, help students make sense of what they can learn from data. Fournier, Leeth, and Magenyi programmed TreeSavers in App Inventor to chart regional deforestation rates across Massachusetts, identify ongoing trends through statistical models, and predict environmental impact. The students put that “long view” of climate change into practice when developing TreeSavers’ interactive maps. Users can toggle between Massachusetts’s current tree cover, historical data, and future high-risk areas.

    Although AI provides fast answers, it doesn’t necessarily offer equitable solutions, said David Sittenfeld, director of the Center for the Environment at the Museum of Science. The Day of AI curriculum asks students to make decisions on sourcing data, ensuring unbiased data, and thinking responsibly about how findings could be used.

    “There’s an ethical concern about tracking people’s data,” said Ethan Jorda, a New England Innovation Academy student. His group used open-source data to program an app that helps users track and reduce their carbon footprint.

    Christine Cunningham, senior vice president of STEM Learning at the Museum of Science, believes students are prepared to use AI responsibly to make the world a better place. “They can see themselves shaping the world they live in,” said Cunningham. “Moving through from understanding to action, kids will never look at a bridge or a piece of plastic lying on the ground in the same way again.”

    Deepening collaboration on earth and beyond

    The 2024 Day of AI speakers emphasized collaborative problem solving at the local, national, and global levels.

    “Through different ideas and different perspectives, we’re going to get better solutions,” said Cunningham. “How do we start young enough that every child has a chance to both understand the world around them but also to move toward shaping the future?”

    Presenters from MIT, the Museum of Science, and NASA approached this question with a common goal: expanding STEM education to learners of all ages and backgrounds.

    “We have been delighted to collaborate with the MIT RAISE team to bring this year’s Day of AI celebration to the Museum of Science,” says Meg Rosenburg, manager of operations at the Museum of Science Centers for Public Science Learning. “This opportunity to highlight the new climate modules for the curriculum not only perfectly aligns with the museum’s goals to focus on climate and active hope throughout our Year of the Earthshot initiative, but it has also allowed us to bring our teams together and grow a relationship that we are very excited to build upon in the future.”

    Rachel Connolly, systems integration and analysis lead for NASA’s Science Activation Program, showed the power of collaboration with the example of how human comprehension of Saturn’s appearance has evolved. From Galileo’s early telescope to the Cassini space probe, modern imaging of Saturn represents 400 years of science, technology, and math working together to further knowledge.

    “Technologies, and the engineers who built them, advance the questions we’re able to ask and therefore what we’re able to understand,” said Connolly, research scientist at MIT Media Lab.

    New England Innovation Academy students saw an opportunity for collaboration a little closer to home. Emmett Buck-Thompson, Jeff Cheng, and Max Hunt envisioned a social media app to connect volunteers with local charities. Their project was inspired by Buck-Thompson’s father’s difficulties finding volunteering opportunities, Hunt’s role as the president of the school’s Community Impact Club, and Cheng’s aspiration to reduce screen time for social media users. Using MIT App Inventor, their combined ideas led to a prototype with the potential to make a real-world impact in their community.

    The Day of AI curriculum teaches the mechanics of AI, ethical considerations and responsible uses, and interdisciplinary applications for different fields. It also empowers students to become creative problem solvers and engaged citizens in their communities and online. From supporting volunteer efforts to encouraging action for the state’s forests to tackling the global challenge of climate change, today’s students are becoming tomorrow’s leaders with Day of AI.

    “We want to empower you to know that this is a tool you can use to make your community better, to help people around you with this technology,” said Breazeal.

    Other Day of AI speakers included Tim Ritchie, president of the Museum of Science; Michael Lawrence Evans, program director of the Boston Mayor’s Office of New Urban Mechanics; Dava Newman, director of the MIT Media Lab; and Natalie Lao, executive director of the App Inventor Foundation.

  • in

    Making climate models relevant for local decision-makers

    Climate models are a key technology in predicting the impacts of climate change. By running simulations of the Earth’s climate, scientists and policymakers can estimate conditions like sea level rise, flooding, and rising temperatures, and make decisions about how to appropriately respond. But current climate models struggle to provide this information quickly or affordably enough to be useful on smaller scales, such as the size of a city.

    Now, authors of a new open-access paper published in the Journal of Advances in Modeling Earth Systems have found a way to use machine learning to retain the benefits of current climate models while reducing the computational cost of running them.

    “It turns the traditional wisdom on its head,” says Sai Ravela, a principal research scientist in MIT’s Department of Earth, Atmospheric and Planetary Sciences (EAPS) who wrote the paper with EAPS postdoc Anamitra Saha.

    Traditional wisdom

    In climate modeling, downscaling is the process of using a global climate model with coarse resolution to generate finer details over smaller regions. Imagine a digital picture: A global model is a large picture of the world with a low number of pixels. To downscale, you zoom in on just the section of the photo you want to look at, for example Boston. But because the original picture was low resolution, the new version is blurry; it doesn’t give enough detail to be particularly useful.

    “If you go from coarse resolution to fine resolution, you have to add information somehow,” explains Saha. Downscaling attempts to add that information back in by filling in the missing pixels. “That addition of information can happen two ways: Either it can come from theory, or it can come from data.”

    Conventional downscaling often involves using models built on physics (such as the process of air rising, cooling, and condensing, or the landscape of the area), and supplementing them with statistical data taken from historical observations. But this method is computationally taxing: It takes a lot of time and computing power to run, and it is expensive.

    A little bit of both

    In their new paper, Saha and Ravela have figured out another way to add that information. They’ve employed a technique in machine learning called adversarial learning. It uses two machines: One generates data to go into our photo, and the other judges the sample by comparing it to actual data. If it thinks the image is fake, then the first machine has to try again until it convinces the second machine. The end goal of the process is to create super-resolution data.

    Using machine learning techniques like adversarial learning is not a new idea in climate modeling; where it currently struggles is in handling large amounts of basic physics, like conservation laws. The researchers discovered that simplifying the physics going in and supplementing it with statistics from the historical data was enough to generate the results they needed.

    “If you augment machine learning with some information from the statistics and simplified physics both, then suddenly, it’s magical,” says Ravela. He and Saha started with estimating extreme rainfall amounts by removing more complex physics equations and focusing on water vapor and land topography. They then generated general rainfall patterns for mountainous Denver and flat Chicago alike, applying historical accounts to correct the output. “It’s giving us extremes, like the physics does, at a much lower cost. And it’s giving us similar speeds to statistics, but at much higher resolution.”

    Another unexpected benefit of the results was how little training data was needed. “The fact that only a little bit of physics and little bit of statistics was enough to improve the performance of the ML [machine learning] model … was actually not obvious from the beginning,” says Saha. The model takes only a few hours to train and can produce results in minutes, an improvement over the months other models take to run.

    Quantifying risk quickly

    Being able to run the models quickly and often is a key requirement for stakeholders such as insurance companies and local policymakers. Ravela gives the example of Bangladesh: By seeing how extreme weather events will impact the country, decisions about what crops should be grown or where populations should migrate to can be made considering a very broad range of conditions and uncertainties as soon as possible.

    “We can’t wait months or years to be able to quantify this risk,” he says. “You need to look out way into the future and at a large number of uncertainties to be able to say what might be a good decision.”

    While the current model only looks at extreme precipitation, training it to examine other critical events, such as tropical storms, winds, and temperature, is the next step of the project. Once the model is more robust, Ravela hopes to apply it to other places like Boston and Puerto Rico as part of a Climate Grand Challenges project.

    “We’re very excited both by the methodology that we put together, as well as the potential applications that it could lead to,” he says.
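
    The "blurry photo" picture of downscaling in the story above can be made concrete with a toy grid. The sketch below is only an illustration, not the researchers' method: the rainfall values and the helper names `coarsen` and `upsample_nearest` are hypothetical. It averages a fine grid into coarse cells, zooms back in by repeating each cell, and measures how much detail was lost. That lost detail is the information downscaling has to add back from theory or from data.

```python
def coarsen(grid, factor):
    """Average non-overlapping factor x factor blocks (fine -> coarse)."""
    rows, cols = len(grid) // factor, len(grid[0]) // factor
    out = []
    for i in range(rows):
        row = []
        for j in range(cols):
            block = [grid[i * factor + di][j * factor + dj]
                     for di in range(factor) for dj in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def upsample_nearest(grid, factor):
    """Naive zoom: repeat every coarse cell factor x factor times."""
    out = []
    for row in grid:
        wide = [value for value in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


# Hypothetical fine-scale rainfall field with one local extreme (8.0).
fine = [
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 8.0, 1.0, 1.0],
    [2.0, 2.0, 3.0, 3.0],
    [2.0, 2.0, 3.0, 3.0],
]

coarse = coarsen(fine, 2)             # what a coarse global model "sees"
blurry = upsample_nearest(coarse, 2)  # zooming in adds no new information

# Largest difference between the true fine field and the zoomed-in version.
max_error = max(abs(f - b)
                for fine_row, blurry_row in zip(fine, blurry)
                for f, b in zip(fine_row, blurry_row))
print(coarse, max_error)
```

    The local extreme is averaged away in the coarse grid, so no amount of zooming recovers it. This is why extremes such as heavy rainfall are the hard part of downscaling.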

  • in

    School of Engineering welcomes new faculty

    The School of Engineering welcomes 15 new faculty members across six of its academic departments. This new cohort of faculty members, who have either recently started their roles at MIT or will start within the next year, conduct research across a diverse range of disciplines.

    Many of these new faculty specialize in research that intersects with multiple fields. In addition to positions in the School of Engineering, a number of these faculty have positions at other units across MIT. Faculty with appointments in the Department of Electrical Engineering and Computer Science (EECS) report into both the School of Engineering and the MIT Stephen A. Schwarzman College of Computing. This year, new faculty also have joint appointments between the School of Engineering and the School of Humanities, Arts, and Social Sciences and the School of Science.

    “I am delighted to welcome this cohort of talented new faculty to the School of Engineering,” says Anantha Chandrakasan, chief innovation and strategy officer, dean of engineering, and Vannevar Bush Professor of Electrical Engineering and Computer Science. “I am particularly struck by the interdisciplinary approach many of these new faculty take in their research. They are working in areas that are poised to have tremendous impact. I look forward to seeing them grow as researchers and educators.”

    The new engineering faculty include:

    Stephen Bates joined the Department of Electrical Engineering and Computer Science as an assistant professor in September 2023. He is also a member of the Laboratory for Information and Decision Systems (LIDS). Bates uses data and AI for reliable decision-making in the presence of uncertainty. In particular, he develops tools for statistical inference with AI models, data impacted by strategic behavior, and settings with distribution shift. Bates also works on applications in life sciences and sustainability. He previously worked as a postdoc in the Statistics and EECS departments at the University of California at Berkeley (UC Berkeley). Bates received a BS in statistics and mathematics at Harvard University and a PhD from Stanford University.

    Abigail Bodner joined the Department of EECS and Department of Earth, Atmospheric and Planetary Sciences as an assistant professor in January. She is also a member of LIDS. Bodner’s research interests span climate, physical oceanography, geophysical fluid dynamics, and turbulence. Previously, she worked as a Simons Junior Fellow at the Courant Institute of Mathematical Sciences at New York University. Bodner received her BS in geophysics and mathematics and MS in geophysics from Tel Aviv University, and her SM in applied mathematics and PhD from Brown University.

    Andreea Bobu ’17 will join the Department of Aeronautics and Astronautics as an assistant professor in July. Her research sits at the intersection of robotics, mathematical human modeling, and deep learning. Previously, she was a research scientist at the Boston Dynamics AI Institute, focusing on how robots and humans can efficiently arrive at shared representations of their tasks for more seamless and reliable interactions. Bobu earned a BS in computer science and engineering from MIT and a PhD in electrical engineering and computer science from UC Berkeley.

    Suraj Cheema will join the Department of Materials Science and Engineering, with a joint appointment in the Department of EECS, as an assistant professor in July. His research explores atomic-scale engineering of electronic materials to tackle challenges related to energy consumption, storage, and generation, aiming for more sustainable microelectronics. This spans computing and energy technologies via integrated ferroelectric devices. He previously worked as a postdoc at UC Berkeley. Cheema earned a BS in applied physics and applied mathematics from Columbia University and a PhD in materials science and engineering from UC Berkeley.

    Samantha Coday joins the Department of EECS as an assistant professor in July. She will also be a member of the MIT Research Laboratory of Electronics. Her research interests include ultra-dense power converters enabling renewable energy integration, hybrid electric aircraft, and future space exploration. To enable high-performance converters for these critical applications, her research focuses on the optimization, design, and control of hybrid switched-capacitor converters. Coday earned a BS in electrical engineering and mathematics from Southern Methodist University and an MS and a PhD in electrical engineering and computer science from UC Berkeley.

    Mitchell Gordon will join the Department of EECS as an assistant professor in July. He will also be a member of the MIT Computer Science and Artificial Intelligence Laboratory. In his research, Gordon designs interactive systems and evaluation approaches that bridge principles of human-computer interaction with the realities of machine learning. He currently works as a postdoc at the University of Washington. Gordon received a BS from the University of Rochester, and MS and PhD from Stanford University, all in computer science.

    Kaiming He joined the Department of EECS as an associate professor in February. He will also be a member of the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL). His research interests cover a wide range of topics in computer vision and deep learning. He is currently focused on building computer models that can learn representations and develop intelligence from and for the complex world. Long term, he hopes to augment human intelligence with improved artificial intelligence. Before joining MIT, He was a research scientist at Facebook AI. He earned a BS from Tsinghua University and a PhD from the Chinese University of Hong Kong.

    Anna Huang SM ’08 will join the departments of EECS and Music and Theater Arts as an assistant professor in September. She will help develop graduate programming focused on music technology. Previously, she spent eight years with Magenta at Google Brain and DeepMind, spearheading efforts in generative modeling, reinforcement learning, and human-computer interaction to support human-AI partnerships in music-making. She is the creator of Music Transformer and Coconet (which powered the Bach Google Doodle). She was a judge and organizer for the AI Song Contest. Anna holds a Canada CIFAR AI Chair at Mila, a BM in music composition, and BS in computer science from the University of Southern California, an MS from the MIT Media Lab, and a PhD from Harvard University.

    Yael Kalai PhD ’06 will join the Department of EECS as a professor in September. She is also a member of CSAIL. Her research interests include cryptography, the theory of computation, and security and privacy. Kalai currently focuses on both the theoretical and real-world applications of cryptography, including work on succinct and easily verifiable non-interactive proofs. She received her bachelor’s degree from the Hebrew University of Jerusalem, a master’s degree at the Weizmann Institute of Science, and a PhD from MIT.

    Sendhil Mullainathan will join the departments of EECS and Economics as a professor in July. His research uses machine learning to understand complex problems in human behavior, social policy, and medicine. Previously, Mullainathan spent five years at MIT before joining the faculty at Harvard in 2004, and then the University of Chicago in 2018. He received his BA in computer science, mathematics, and economics from Cornell University and his PhD from Harvard University.

    Alex Rives will join the Department of EECS as an assistant professor in September, with a core membership in the Broad Institute of MIT and Harvard. In his research, Rives is focused on AI for scientific understanding, discovery, and design for biology. Rives worked with Meta as a New York University graduate student, where he founded and led the Evolutionary Scale Modeling team that developed large language models for proteins. Rives received his BS in philosophy and biology from Yale University and is completing his PhD in computer science at NYU.

    Sungho Shin will join the Department of Chemical Engineering as an assistant professor in July. His research interests include control theory, optimization algorithms, high-performance computing, and their applications to decision-making in complex systems, such as energy infrastructures. Shin is a postdoc at the Mathematics and Computer Science Division at Argonne National Laboratory. He received a BS in mathematics and chemical engineering from Seoul National University and a PhD in chemical engineering from the University of Wisconsin-Madison.

    Jessica Stark joined the Department of Biological Engineering as an assistant professor in January. In her research, Stark is developing technologies to realize the largely untapped potential of cell-surface sugars, called glycans, for immunological discovery and immunotherapy. Previously, Stark was an American Cancer Society postdoc at Stanford University. She earned a BS in chemical and biomolecular engineering from Cornell University and a PhD in chemical and biological engineering at Northwestern University.

    Thomas John “T.J.” Wallin joined the Department of Materials Science and Engineering as an assistant professor in January. As a researcher, Wallin’s interests lie in advanced manufacturing of functional soft matter, with an emphasis on soft wearable technologies and their applications in human-computer interfaces. Previously, he was a research scientist at Meta’s Reality Labs Research working in their haptic interaction team. Wallin earned a BS in physics and chemistry from the College of William and Mary, and an MS and PhD in materials science and engineering from Cornell University.

    Gioele Zardini joined the Department of Civil and Environmental Engineering as an assistant professor in September. He will also join LIDS and the Institute for Data, Systems, and Society. Driven by societal challenges, Zardini’s research interests include the co-design of sociotechnical systems, compositionality in engineering, applied category theory, decision and control, optimization, and game theory, with society-critical applications to intelligent transportation systems, autonomy, and complex networks and infrastructures. He received his BS, MS, and PhD in mechanical engineering with a focus on robotics, systems, and control from ETH Zurich, and spent time at MIT, Stanford University, and Motional.

  • in

    An AI dataset carves new paths to tornado detection

    The return of spring in the Northern Hemisphere touches off tornado season. A tornado’s twisting funnel of dust and debris seems an unmistakable sight. But that sight can be obscured to radar, the tool of meteorologists. It’s hard to know exactly when a tornado has formed, or even why.

    A new dataset could hold answers. It contains radar returns from thousands of tornadoes that have hit the United States in the past 10 years. Storms that spawned tornadoes are flanked by other severe storms, some with nearly identical conditions, that never did. MIT Lincoln Laboratory researchers who curated the dataset, called TorNet, have now released it open source. They hope to enable breakthroughs in detecting one of nature’s most mysterious and violent phenomena.

    “A lot of progress is driven by easily available, benchmark datasets. We hope TorNet will lay a foundation for machine learning algorithms to both detect and predict tornadoes,” says Mark Veillette, the project’s co-principal investigator with James Kurdzo. Both researchers work in the Air Traffic Control Systems Group. 

    Along with the dataset, the team is releasing models trained on it. The models show promise for machine learning’s ability to spot a twister. Building on this work could open new frontiers for forecasters, helping them provide more accurate warnings that might save lives. 

    Swirling uncertainty

    About 1,200 tornadoes occur in the United States every year, causing millions to billions of dollars in economic damage and claiming 71 lives on average. Last year, one unusually long-lasting tornado killed 17 people and injured at least 165 others along a 59-mile path in Mississippi.  

    Yet tornadoes are notoriously difficult to forecast because scientists don’t have a clear picture of why they form. “We can see two storms that look identical, and one will produce a tornado and one won’t. We don’t fully understand it,” Kurdzo says.

    A tornado’s basic ingredients are thunderstorms with instability caused by rapidly rising warm air and wind shear that causes rotation. Weather radar is the primary tool used to monitor these conditions. But tornadoes lie too low to be detected, even when moderately close to the radar. As the radar beam with a given tilt angle travels farther from the antenna, it gets higher above the ground, mostly seeing reflections from rain and hail carried in the “mesocyclone,” the storm’s broad, rotating updraft. A mesocyclone doesn’t always produce a tornado.
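
    The geometry behind that limited view can be sketched with the textbook beam-height relation used in radar meteorology. This is an illustration under stated assumptions, not something taken from the TorNet work: it assumes standard atmospheric refraction (modeled with a 4/3 effective Earth radius), and the 0.5-degree tilt and the ranges are illustrative choices. The function name `beam_height_km` is hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0
# Assumption: standard refraction, modeled as an Earth 4/3 its true radius.
EFFECTIVE_RADIUS_FACTOR = 4.0 / 3.0


def beam_height_km(slant_range_km, tilt_deg, antenna_height_km=0.0):
    """Height of the beam center above the antenna's ground level."""
    ka = EFFECTIVE_RADIUS_FACTOR * EARTH_RADIUS_KM
    theta = math.radians(tilt_deg)
    return (
        math.sqrt(slant_range_km ** 2 + ka ** 2
                  + 2.0 * slant_range_km * ka * math.sin(theta))
        - ka
        + antenna_height_km
    )


# Illustrative lowest tilt of 0.5 degrees at several ranges from the radar.
heights = {r: beam_height_km(r, 0.5) for r in (25, 50, 100, 150)}
for range_km, height_km in heights.items():
    print(f"{range_km:>4} km from radar -> beam center at {height_km:.2f} km")
```

    Under these assumptions the beam center is already roughly 1.5 km above the ground at 100 km range, well above where a tornado touches the surface.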

    With this limited view, forecasters must decide whether or not to issue a tornado warning. They often err on the side of caution. As a result, the rate of false alarms for tornado warnings is more than 70 percent. “That can lead to boy-who-cried-wolf syndrome,” Kurdzo says.  
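
    For readers who want to see how such a figure is computed, here is a minimal sketch using two standard forecast-verification scores. It assumes the "rate of false alarms" quoted above corresponds to the false alarm ratio (unverified warnings divided by all warnings issued). The counts are hypothetical, chosen only to reproduce a 70 percent ratio.

```python
def false_alarm_ratio(hits, false_alarms):
    """Fraction of issued warnings that were not followed by a tornado."""
    warnings_issued = hits + false_alarms
    if warnings_issued == 0:
        raise ValueError("no warnings were issued")
    return false_alarms / warnings_issued


def probability_of_detection(hits, misses):
    """Fraction of actual tornadoes that were covered by a warning."""
    tornadoes = hits + misses
    if tornadoes == 0:
        raise ValueError("no tornadoes occurred")
    return hits / tornadoes


# Hypothetical season: 100 warnings issued, 30 verified; 10 tornadoes unwarned.
hits, false_alarms, misses = 30, 70, 10

far = false_alarm_ratio(hits, false_alarms)
pod = probability_of_detection(hits, misses)
print(f"false alarm ratio: {far:.0%}, probability of detection: {pod:.0%}")
```

    The two scores pull against each other: warning more cautiously raises the probability of detection but also the false alarm ratio, which is the trade-off forecasters face.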

    In recent years, researchers have turned to machine learning to better detect and predict tornadoes. However, raw datasets and models have not always been accessible to the broader community, stifling progress. TorNet is filling this gap.

    The dataset contains more than 200,000 radar images, 13,587 of which depict tornadoes. The rest of the images are non-tornadic, taken from storms in one of two categories: randomly selected severe storms or false-alarm storms (those that led a forecaster to issue a warning but that didn’t produce a tornado).

    Each sample of a storm or tornado comprises two sets of six radar images. The two sets correspond to different radar sweep angles. The six images portray different radar data products, such as reflectivity (showing precipitation intensity) or radial velocity (indicating if winds are moving toward or away from the radar).
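
    One way to picture that layout is as a small nested structure indexed by sweep, then product, then pixel. The sketch below is not TorNet's actual file format or API. The class name `StormSample`, the tiny image size, and the placeholder product names are all hypothetical; only reflectivity and radial velocity are named in the text above.

```python
from dataclasses import dataclass
from typing import List

N_SWEEPS = 2  # two radar sweep angles per sample
# Only the first two product names come from the article; the remaining
# four are placeholders standing in for the unnamed radar data products.
PRODUCTS = ["reflectivity", "radial_velocity"] + [
    f"product_{i}" for i in range(3, 7)
]


@dataclass
class StormSample:
    is_tornadic: bool
    # Indexed as images[sweep][product][row][col].
    images: List[List[List[List[float]]]]


def empty_sample(is_tornadic, height=4, width=4):
    """Build a zero-filled sample; height and width are illustrative."""
    images = [
        [[[0.0] * width for _ in range(height)] for _ in PRODUCTS]
        for _ in range(N_SWEEPS)
    ]
    return StormSample(is_tornadic=is_tornadic, images=images)


sample = empty_sample(is_tornadic=True)
n_images = sum(len(sweep) for sweep in sample.images)
print(f"{len(sample.images)} sweeps x {len(PRODUCTS)} products = {n_images} images")
```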

    A challenge in curating the dataset was first finding tornadoes. Within the corpus of weather radar data, tornadoes are extremely rare events. The team then had to balance those tornado samples with difficult non-tornado samples. If the dataset were too easy, say by comparing tornadoes to snowstorms, an algorithm trained on the data would likely over-classify storms as tornadic.
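
    The rarity problem shows up even in the curated dataset's own numbers. The arithmetic below uses the counts quoted above, treating 200,000 as the total even though the article says "more than 200,000", so the tornadic share is an upper bound. It shows why raw accuracy is a misleading score here: a model that never predicts a tornado would still look accurate.

```python
TORNADIC_IMAGES = 13_587
# Assumption: the article says "more than 200,000", so this is a lower bound
# on the total and the resulting tornadic fraction is an upper bound.
TOTAL_IMAGES = 200_000

tornadic_fraction = TORNADIC_IMAGES / TOTAL_IMAGES

# A trivial model that labels every image "no tornado" is right on every
# non-tornadic image and wrong on every tornadic one.
always_negative_accuracy = 1.0 - tornadic_fraction
always_negative_detection_rate = 0.0

print(f"tornadic share of images: at most {tornadic_fraction:.1%}")
print(f"accuracy of always predicting 'no tornado': at least "
      f"{always_negative_accuracy:.1%}")
```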

    “What’s beautiful about a true benchmark dataset is that we’re all working with the same data, with the same level of difficulty, and can compare results,” Veillette says. “It also makes meteorology more accessible to data scientists, and vice versa. It becomes easier for these two parties to work on a common problem.”

    Both researchers represent the progress that can come from cross-collaboration. Veillette is a mathematician and algorithm developer who has long been fascinated by tornadoes. Kurdzo is a meteorologist by training and a signal processing expert. In grad school, he chased tornadoes with custom-built mobile radars, collecting data to analyze in new ways.

    “This dataset also means that a grad student doesn’t have to spend a year or two building a dataset. They can jump right into their research,” Kurdzo says.

    This project was funded by Lincoln Laboratory’s Climate Change Initiative, which aims to leverage the laboratory’s diverse technical strengths to help address climate problems threatening human health and global security.

    Chasing answers with deep learning

    Using the dataset, the researchers developed baseline artificial intelligence (AI) models. They were particularly eager to apply deep learning, a form of machine learning that excels at processing visual data. On its own, deep learning can extract features (key observations that an algorithm uses to make a decision) from images across a dataset. Other machine learning approaches require humans to first manually label features. 

    “We wanted to see if deep learning could rediscover what people normally look for in tornadoes and even identify new things that typically aren’t searched for by forecasters,” Veillette says.

    The results are promising. Their deep learning model performed similarly to or better than all tornado-detecting algorithms known in the literature. The trained algorithm correctly classified 50 percent of weaker EF-1 tornadoes and over 85 percent of tornadoes rated EF-2 or higher, which make up the most devastating and costly occurrences of these storms.

    They also evaluated two other types of machine-learning models, and one traditional model to compare against. The source code and parameters of all these models are freely available. The models and dataset are also described in a paper submitted to a journal of the American Meteorological Society (AMS). Veillette presented this work at the AMS Annual Meeting in January.

    “The biggest reason for putting our models out there is for the community to improve upon them and do other great things,” Kurdzo says. “The best solution could be a deep learning model, or someone might find that a non-deep learning model is actually better.”

    TorNet could be useful in the weather community for other uses too, such as for conducting large-scale case studies on storms. It could also be augmented with other data sources, like satellite imagery or lightning maps. Fusing multiple types of data could improve the accuracy of machine learning models.

    Taking steps toward operations

    On top of detecting tornadoes, Kurdzo hopes that models might help unravel the science of why they form.

    “As scientists, we see all these precursors to tornadoes — an increase in low-level rotation, a hook echo in reflectivity data, specific differential phase (KDP) foot and differential reflectivity (ZDR) arcs. But how do they all go together? And are there physical manifestations we don’t know about?” he asks.

    Teasing out those answers might be possible with explainable AI. Explainable AI refers to methods that allow a model to explain, in a format understandable to humans, why it came to a certain decision. In this case, these explanations might reveal physical processes that happen before tornadoes. This knowledge could help train forecasters, and models, to recognize the signs sooner. 

    “None of this technology is ever meant to replace a forecaster. But perhaps someday it could guide forecasters’ eyes in complex situations, and give a visual warning to an area predicted to have tornadic activity,” Kurdzo says.

    Such assistance could be especially useful as radar technology improves and future networks potentially grow denser. Data refresh rates in a next-generation radar network are expected to increase from every five minutes to approximately one minute, perhaps faster than forecasters can interpret the new information. Because deep learning can process huge amounts of data quickly, it could be well-suited for monitoring radar returns in real time, alongside humans. Tornadoes can form and disappear in minutes.

    But the path to an operational algorithm is a long road, especially in safety-critical situations, Veillette says. “I think the forecaster community is still, understandably, skeptical of machine learning. One way to establish trust and transparency is to have public benchmark datasets like this one. It’s a first step.”

    The next steps, the team hopes, will be taken by researchers across the world who are inspired by the dataset and energized to build their own algorithms. Those algorithms will in turn go into test beds, where they’ll eventually be shown to forecasters, to start a process of transitioning into operations.

    In the end, the path could circle back to trust.

    “We may never get more than a 10- to 15-minute tornado warning using these tools. But if we could lower the false-alarm rate, we could start to make headway with public perception,” Kurdzo says. “People are going to use those warnings to take the action they need to save their lives.”

  • in

    Advancing technology for aquaculture

    According to the National Oceanic and Atmospheric Administration, aquaculture in the United States represents a $1.5 billion industry annually. Like land-based farming, shellfish aquaculture requires healthy seed production in order to maintain a sustainable industry. Aquaculture hatchery production of shellfish larvae — seeds — requires close monitoring to track mortality rates and assess health from the earliest stages of life. 

    Careful observation is necessary to inform production scheduling, determine effects of naturally occurring harmful bacteria, and ensure sustainable seed production. This is an essential step for shellfish hatcheries but is currently a time-consuming manual process prone to human error. 

    With funding from MIT’s Abdul Latif Jameel Water and Food Systems Lab (J-WAFS), MIT Sea Grant is working with Associate Professor Otto Cordero of the MIT Department of Civil and Environmental Engineering, Professor Taskin Padir and Research Scientist Mark Zolotas at the Northeastern University Institute for Experiential Robotics, and others at the Aquaculture Research Corporation (ARC) and the Cape Cod Commercial Fishermen’s Alliance to advance technology for the aquaculture industry. Located on Cape Cod, ARC is a leading shellfish hatchery, farm, and wholesaler that plays a vital role in providing high-quality shellfish seed to local and regional growers.

    Two MIT students have joined the effort this semester, working with Robert Vincent, MIT Sea Grant’s assistant director of advisory services, through the Undergraduate Research Opportunities Program (UROP). 

    First-year student Unyime Usua and sophomore Santiago Borrego are using microscopy images of shellfish seed from ARC to train machine learning algorithms that will help automate the identification and counting process. The resulting user-friendly image recognition tool aims to aid aquaculturists in differentiating and counting healthy, unhealthy, and dead shellfish larvae, improving accuracy and reducing time and effort.
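
    The counting step that such a tool automates can be sketched as a simple tally over per-larva predictions. This is a hypothetical illustration rather than the students' code: the three class names come from the description above, while the `summarize` helper and the example predictions are made up, standing in for the output of an image classifier.

```python
from collections import Counter

# Classes named in the article; everything else here is illustrative.
CLASSES = ("healthy", "unhealthy", "dead")


def summarize(predicted_labels):
    """Tally per-class counts and the share of larvae classified as dead."""
    counts = Counter(predicted_labels)
    unknown = set(counts) - set(CLASSES)
    if unknown:
        raise ValueError(f"unexpected labels: {sorted(unknown)}")
    total = sum(counts.values())
    summary = {name: counts.get(name, 0) for name in CLASSES}
    summary["dead_fraction"] = summary["dead"] / total if total else 0.0
    return summary


# Hypothetical classifier output for ten larvae in one microscopy image.
predicted = ["healthy"] * 6 + ["unhealthy"] * 2 + ["dead"] * 2
report = summarize(predicted)
print(report)
```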

    Vincent explains that AI is a powerful tool for environmental science that enables researchers, industry, and resource managers to address challenges that have long been pinch points for accurate data collection, analysis, predictions, and streamlining processes. “Funding support from programs like J-WAFS enable us to tackle these problems head-on,” he says. 

    ARC faces challenges with manually quantifying larvae classes, an important step in their seed production process. “When larvae are in their growing stages they are constantly being sized and counted,” explains Cheryl James, ARC larval/juvenile production manager. “This process is critical to encourage optimal growth and strengthen the population.” 

    Developing an automated identification and counting system will help to improve this step in the production process with time and cost benefits. “This is not an easy task,” says Vincent, “but with the guidance of Dr. Zolotas at the Northeastern University Institute for Experiential Robotics and the work of the UROP students, we have made solid progress.” 

    UROP benefits both researchers and students. Involving MIT UROP students in developing these types of systems gives them insight into AI applications they might not have considered, along with opportunities to explore, learn, and apply themselves while contributing to solving real challenges.

    Borrego saw this project as an opportunity to apply what he’d learned in class 6.390 (Introduction to Machine Learning) to a real-world issue. “I was starting to form an idea of how computers can see images and extract information from them,” he says. “I wanted to keep exploring that.”

    Usua decided to pursue the project because of the direct industry impacts it could have. “I’m pretty interested in seeing how we can utilize machine learning to make people’s lives easier. We are using AI to help biologists make this counting and identification process easier.” While Usua wasn’t familiar with aquaculture before starting this project, she explains, “Just hearing about the hatcheries that Dr. Vincent was telling us about, it was unfortunate that not a lot of people know what’s going on and the problems that they’re facing.”

    On Cape Cod alone, aquaculture is an $18 million per year industry. But the Massachusetts Division of Marine Fisheries estimates that hatcheries are only able to meet 70–80 percent of seed demand annually, which impacts local growers and economies. Through this project, the partners aim to develop technology that will increase seed production, advance industry capabilities, and help understand and improve the hatchery microbiome.

    Borrego explains the initial challenge of having limited data to work with. “Starting out, we had to go through and label all of the data, but going through that process helped me learn a lot.” In true MIT fashion, he shares his takeaway from the project: “Try to get the best out of what you’re given with the data you have to work with. You’re going to have to adapt and change your strategies depending on what you have.”

    Usua describes her experience going through the research process, communicating in a team, and deciding what approaches to take. “Research is a difficult and long process, but there is a lot to gain from it because it teaches you to look for things on your own and find your own solutions to problems.”

    In addition to increasing seed production and reducing the human labor required in the hatchery process, the collaborators expect this project to contribute to cost savings and technology integration to support one of the most underserved industries in the United States. 

    Borrego and Usua both plan to continue their work for a second semester with MIT Sea Grant. Borrego is interested in learning more about how technology can be used to protect the environment and wildlife. Usua says she hopes to explore more projects related to aquaculture. “It seems like there’s an infinite amount of ways to tackle these issues.”