More stories

  • New prediction model could improve the reliability of fusion power plants

    Tokamaks are machines that are meant to hold and harness the power of the sun. These fusion machines use powerful magnets to contain a plasma hotter than the sun’s core and push the plasma’s atoms to fuse and release energy. If tokamaks can operate safely and efficiently, the machines could one day provide clean and limitless fusion energy.

    Today, there are a number of experimental tokamaks in operation around the world, with more underway. Most are small-scale research machines built to investigate how the devices can spin up plasma and harness its energy. One of the challenges that tokamaks face is how to safely and reliably turn off a plasma current that is circulating at speeds of up to 100 kilometers per second, at temperatures of over 100 million degrees Celsius.

    Such “rampdowns” are necessary when a plasma becomes unstable. To prevent the plasma from further disrupting and potentially damaging the device’s interior, operators ramp down the plasma current. But occasionally the rampdown itself can destabilize the plasma. In some machines, rampdowns have caused scrapes and scarring to the tokamak’s interior — minor damage that still requires considerable time and resources to repair.

    Now, scientists at MIT have developed a method to predict how plasma in a tokamak will behave during a rampdown. The team combined machine-learning tools with a physics-based model of plasma dynamics to simulate a plasma’s behavior and any instabilities that may arise as the plasma is ramped down and turned off. The researchers trained and tested the new model on plasma data from an experimental tokamak in Switzerland. They found the method quickly learned how plasma would evolve as it was tuned down in different ways. What’s more, the method achieved a high level of accuracy using a relatively small amount of data. This training efficiency is promising, given that each experimental run of a tokamak is expensive and quality data is limited as a result.

    The new model, which the team highlights this week in an open-access Nature Communications paper, could improve the safety and reliability of future fusion power plants.

    “For fusion to be a useful energy source it’s going to have to be reliable,” says lead author Allen Wang, a graduate student in aeronautics and astronautics and a member of the Disruptions Group at MIT’s Plasma Science and Fusion Center (PSFC). “To be reliable, we need to get good at managing our plasmas.”

    The study’s MIT co-authors include PSFC Principal Research Scientist and Disruptions Group leader Cristina Rea, and members of the Laboratory for Information and Decision Systems (LIDS) Oswin So, Charles Dawson, and Professor Chuchu Fan, along with Mark (Dan) Boyer of Commonwealth Fusion Systems and collaborators from the Swiss Plasma Center in Switzerland.

    “A delicate balance”

    Tokamaks are experimental fusion devices that were first built in the Soviet Union in the 1950s. The device gets its name from a Russian acronym that translates to a “toroidal chamber with magnetic coils.” Just as its name describes, a tokamak is toroidal, or donut-shaped, and uses powerful magnets to contain and spin up a gas to temperatures and energies high enough that atoms in the resulting plasma can fuse and release energy.

    Today, tokamak experiments are relatively low-energy in scale, with few approaching the size and output needed to generate safe, reliable, usable energy. Disruptions in experimental, low-energy tokamaks are generally not an issue. But as fusion machines scale up to grid-scale dimensions, controlling much higher-energy plasmas at all phases will be paramount to maintaining a machine’s safe and efficient operation.

    “Uncontrolled plasma terminations, even during rampdown, can generate intense heat fluxes damaging the internal walls,” Wang notes. “Quite often, especially with the high-performance plasmas, rampdowns actually can push the plasma closer to some instability limits. So, it’s a delicate balance. And there’s a lot of focus now on how to manage instabilities so that we can routinely and reliably take these plasmas and safely power them down. And there are relatively few studies done on how to do that well.”

    Bringing down the pulse

    Wang and his colleagues developed a model to predict how a plasma will behave during tokamak rampdown. While they could have simply applied machine-learning tools such as a neural network to learn signs of instabilities in plasma data, “you would need an ungodly amount of data” for such tools to discern the very subtle and ephemeral changes in extremely high-temperature, high-energy plasmas, Wang says.

    Instead, the researchers paired a neural network with an existing model that simulates plasma dynamics according to the fundamental rules of physics. With this combination of machine learning and a physics-based plasma simulation, the team found that only a couple hundred pulses at low performance, and a small handful of pulses at high performance, were sufficient to train and validate the new model.

    The data they used for the new study came from the TCV, the Swiss “variable configuration tokamak” operated by the Swiss Plasma Center at EPFL (the Swiss Federal Institute of Technology Lausanne). The TCV is a small experimental fusion device that is used for research purposes, often as a test bed for next-generation device solutions. Wang used data from several hundred TCV plasma pulses that included properties of the plasma such as its temperature and energies during each pulse’s ramp-up, run, and ramp-down. He trained the new model on this data, then tested it and found it was able to accurately predict the plasma’s evolution given the initial conditions of a particular tokamak run.

    The researchers also developed an algorithm to translate the model’s predictions into practical “trajectories,” or plasma-managing instructions that a tokamak controller can automatically carry out — for instance, adjusting the magnets or temperature — to maintain the plasma’s stability. They implemented the algorithm on several TCV runs and found that it produced trajectories that safely ramped down a plasma pulse, in some cases faster and without disruptions, compared to runs without the new method.

    “At some point the plasma will always go away, but we call it a disruption when the plasma goes away at high energy. Here, we ramped the energy down to nothing,” Wang notes. “We did it a number of times. And we did things much better across the board. So, we had statistical confidence that we made things better.”

    The work was supported in part by Commonwealth Fusion Systems (CFS), an MIT spinout that intends to build the world’s first compact, grid-scale fusion power plant. The company is developing a demo tokamak, SPARC, designed to produce net-energy plasma, meaning that it should generate more energy than it takes to heat up the plasma.

    Wang and his colleagues are working with CFS on ways that the new prediction model and tools like it can better predict plasma behavior and prevent costly disruptions, enabling safe and reliable fusion power.

    “We’re trying to tackle the science questions to make fusion routinely useful,” Wang says. “What we’ve done here is the start of what is still a long journey. But I think we’ve made some nice progress.”

    Additional support for the research came through the framework of the EUROfusion Consortium, via the Euratom Research and Training Program, and from the Swiss State Secretariat for Education, Research, and Innovation.
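
    The team’s actual implementation isn’t shown in the article, but the core idea — let a physics-based model carry the known dynamics while a small neural network learns only the residual corrections — can be sketched in a few lines. Everything below (the choice of state variables, the toy exponential-decay “physics,” the layer sizes) is an illustrative assumption, not a detail from the study.

    ```python
    import torch
    import torch.nn as nn

    class HybridRampdownModel(nn.Module):
        """Hybrid dynamics sketch: a physics step plus a learned correction.

        The state vector and the physics stand-in below are placeholders,
        not the variables or equations used in the MIT study.
        """

        def __init__(self, state_dim: int = 3, hidden: int = 32):
            super().__init__()
            # Deliberately small network: tokamak pulses are scarce, so the
            # learnable part must stay data-efficient.
            self.correction = nn.Sequential(
                nn.Linear(state_dim + 1, hidden),
                nn.Tanh(),
                nn.Linear(hidden, state_dim),
            )

        def physics_step(self, state, ramp_rate):
            # Toy physics: the plasma current (state[..., 0]) decays toward
            # zero at the commanded ramp rate, with a fixed 10 ms time step.
            delta = torch.zeros_like(state)
            delta[..., 0] = -ramp_rate * state[..., 0]
            return state + 0.01 * delta

        def forward(self, state, ramp_rate):
            base = self.physics_step(state, ramp_rate)
            # The network learns only what the physics model misses.
            inp = torch.cat([state, ramp_rate.unsqueeze(-1)], dim=-1)
            return base + self.correction(inp)
    ```

    Because the physics step already captures the bulk behavior, the correction term is all the network has to learn — consistent with the article’s point that a couple hundred low-performance pulses were enough for training.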

  • Lincoln Lab unveils the most powerful AI supercomputer at any US university

    The new TX-Generative AI Next (TX-GAIN) computing system at the Lincoln Laboratory Supercomputing Center (LLSC) is the most powerful AI supercomputer at any U.S. university. With its recent ranking from TOP500, which biannually publishes a list of the top supercomputers in various categories, TX-GAIN joins the ranks of other powerful systems at the LLSC, all supporting research and development at Lincoln Laboratory and across the MIT campus.

    “TX-GAIN will enable our researchers to achieve scientific and engineering breakthroughs. The system will play a large role in supporting generative AI, physical simulation, and data analysis across all research areas,” says Lincoln Laboratory Fellow Jeremy Kepner, who heads the LLSC.

    The LLSC is a key resource for accelerating innovation at Lincoln Laboratory. Thousands of researchers tap into the LLSC to analyze data, train models, and run simulations for federally funded research projects. The supercomputers have been used, for example, to simulate billions of aircraft encounters to develop collision-avoidance systems for the Federal Aviation Administration, and to train models in the complex tasks of autonomous navigation for the Department of Defense. Over the years, LLSC capabilities have been essential to numerous award-winning technologies, including those that have improved airline safety, prevented the spread of new diseases, and aided in hurricane responses.

    As its name suggests, TX-GAIN is especially equipped for developing and applying generative AI. Whereas traditional AI focuses on categorization tasks, like identifying whether a photo depicts a dog or a cat, generative AI produces entirely new outputs. Kepner describes it as a mathematical combination of interpolation (filling in the gaps between known data points) and extrapolation (extending data beyond known points).

    Today, generative AI is widely known for its use of large language models to create human-like responses to user prompts. At Lincoln Laboratory, teams are applying generative AI to various domains beyond large language models. They are using the technology, for instance, to evaluate radar signatures, supplement weather data where coverage is missing, root out anomalies in network traffic, and explore chemical interactions to design new medicines and materials.

    To enable such intense computations, TX-GAIN is powered by more than 600 NVIDIA graphics processing unit accelerators specially designed for AI operations, in addition to traditional high-performance computing hardware. With a peak performance of two AI exaflops (two quintillion floating-point operations per second), TX-GAIN is the top AI system at a university, and in the Northeast.

    Since TX-GAIN came online this summer, researchers have taken notice. “TX-GAIN is allowing us to model not only significantly more protein interactions than ever before, but also much larger proteins with more atoms. This new computational capability is a game-changer for protein characterization efforts in biological defense,” says Rafael Jaimes, a researcher in Lincoln Laboratory’s Counter–Weapons of Mass Destruction Systems Group.

    The LLSC’s focus on interactive supercomputing makes it especially useful to researchers. For years, the LLSC has pioneered software that lets users access its powerful systems without needing to be experts in configuring algorithms for parallel processing. “The LLSC has always tried to make supercomputing feel like working on your laptop,” Kepner says. “The amount of data and the sophistication of analysis methods needed to be competitive today are well beyond what can be done on a laptop. But with our user-friendly approach, people can run their model and get answers quickly from their workspace.”

    Beyond supporting programs solely at Lincoln Laboratory, TX-GAIN is enhancing research collaborations with MIT’s campus. Such collaborations include the Haystack Observatory, Center for Quantum Engineering, Beaver Works, and the Department of the Air Force–MIT AI Accelerator. The latter initiative is rapidly prototyping, scaling, and applying AI technologies for the U.S. Air Force and Space Force; one fielded example is optimizing flight scheduling for global operations.

    The LLSC systems are housed in an energy-efficient data center and facility in Holyoke, Massachusetts. Research staff in the LLSC are also tackling the immense energy needs of AI and leading research into various power-reduction methods. One software tool they developed can reduce the energy of training an AI model by as much as 80 percent.

    “The LLSC provides the capabilities needed to do leading-edge research, in a cost-effective and energy-efficient manner,” Kepner says.

    All of the supercomputers at the LLSC use the “TX” nomenclature in homage to Lincoln Laboratory’s Transistorized Experimental Computer Zero (TX-0) of 1956. TX-0 was one of the world’s first transistor-based machines, and its 1958 successor, TX-2, is storied for its role in pioneering human-computer interaction and AI. With TX-GAIN, the LLSC continues this legacy.

  • Responding to the climate impact of generative AI

    In part 2 of our two-part series on generative artificial intelligence’s environmental impacts, MIT News explores some of the ways experts are working to reduce the technology’s carbon footprint.

    The energy demands of generative AI are expected to continue increasing dramatically over the next decade.

    For instance, an April 2025 report from the International Energy Agency predicts that the global electricity demand from data centers, which house the computing infrastructure to train and deploy AI models, will more than double by 2030, to around 945 terawatt-hours. While not all operations performed in a data center are AI-related, this total amount is slightly more than the energy consumption of Japan.

    Moreover, an August 2025 analysis from Goldman Sachs Research forecasts that about 60 percent of the increasing electricity demands from data centers will be met by burning fossil fuels, increasing global carbon emissions by about 220 million tons. In comparison, driving a gas-powered car for 5,000 miles produces about 1 ton of carbon dioxide.

    These statistics are staggering, but at the same time, scientists and engineers at MIT and around the world are studying innovations and interventions to mitigate AI’s ballooning carbon footprint, from boosting the efficiency of algorithms to rethinking the design of data centers.

    Considering carbon emissions

    Talk of reducing generative AI’s carbon footprint is typically centered on “operational carbon” — the emissions from the powerful processors, known as GPUs, inside a data center. It often ignores “embodied carbon,” the emissions created by building the data center in the first place, says Vijay Gadepally, senior scientist at MIT Lincoln Laboratory, who leads research projects in the Lincoln Laboratory Supercomputing Center.

    Constructing and retrofitting a data center, built from tons of steel and concrete and filled with air conditioning units, computing hardware, and miles of cable, consumes a huge amount of carbon. In fact, the environmental impact of building data centers is one reason companies like Meta and Google are exploring more sustainable building materials. (Cost is another factor.)

    Plus, data centers are enormous buildings — the world’s largest, the China Telecom-Inner Mongolia Information Park, engulfs roughly 10 million square feet — with about 10 to 50 times the energy density of a normal office building, Gadepally adds. “The operational side is only part of the story. Some things we are working on to reduce operational emissions may lend themselves to reducing embodied carbon, too, but we need to do more on that front in the future,” he says.

    Reducing operational carbon emissions

    When it comes to reducing the operational carbon emissions of AI data centers, there are many parallels with home energy-saving measures. For one, we can simply turn down the lights.

    “Even if you have the worst lightbulbs in your house from an efficiency standpoint, turning them off or dimming them will always use less energy than leaving them running at full blast,” Gadepally says.

    In the same fashion, research from the Supercomputing Center has shown that “turning down” the GPUs in a data center so they consume about three-tenths the energy has minimal impacts on the performance of AI models, while also making the hardware easier to cool.

    Another strategy is to use less energy-intensive computing hardware.

    Demanding generative AI workloads, such as training new reasoning models like GPT-5, usually need many GPUs working simultaneously. The Goldman Sachs analysis estimates that a state-of-the-art system could soon have as many as 576 connected GPUs operating at once.

    But engineers can sometimes achieve similar results by reducing the precision of computing hardware, perhaps by switching to less powerful processors that have been tuned to handle a specific AI workload.

    There are also measures that boost the efficiency of training power-hungry deep-learning models before they are deployed.

    Gadepally’s group found that about half the electricity used for training an AI model is spent to get the last 2 or 3 percentage points in accuracy. Stopping the training process early can save a lot of that energy.

    “There might be cases where 70 percent accuracy is good enough for one particular application, like a recommender system for e-commerce,” he says.

    Researchers can also take advantage of efficiency-boosting measures. For instance, a postdoc in the Supercomputing Center realized the group might run a thousand simulations during the training process to pick the two or three best AI models for their project. By building a tool that allowed them to avoid about 80 percent of those wasted computing cycles, they dramatically reduced the energy demands of training with no reduction in model accuracy, Gadepally says.

    Leveraging efficiency improvements

    Constant innovation in computing hardware, such as denser arrays of transistors on semiconductor chips, is still enabling dramatic improvements in the energy efficiency of AI models.

    Even though energy efficiency improvements have been slowing for most chips since about 2005, the amount of computation that GPUs can do per joule of energy has been improving by 50 to 60 percent each year, says Neil Thompson, director of the FutureTech Research Project at MIT’s Computer Science and Artificial Intelligence Laboratory and a principal investigator at MIT’s Initiative on the Digital Economy.

    “The still-ongoing ‘Moore’s Law’ trend of getting more and more transistors on chip still matters for a lot of these AI systems, since running operations in parallel is still very valuable for improving efficiency,” says Thompson.

    Even more significant, his group’s research indicates that efficiency gains from new model architectures that can solve complex problems faster, consuming less energy to achieve the same or better results, are doubling every eight or nine months.

    Thompson coined the term “negaflop” to describe this effect. The same way a “negawatt” represents electricity saved due to energy-saving measures, a “negaflop” is a computing operation that doesn’t need to be performed due to algorithmic improvements.

    These could be things like “pruning” away unnecessary components of a neural network or employing compression techniques that enable users to do more with less computation.

    “If you need to use a really powerful model today to complete your task, in just a few years, you might be able to use a significantly smaller model to do the same thing, which would carry much less environmental burden. Making these models more efficient is the single-most important thing you can do to reduce the environmental costs of AI,” Thompson says.

    Maximizing energy savings

    While reducing the overall energy use of AI algorithms and computing hardware will cut greenhouse gas emissions, not all energy is the same, Gadepally adds.

    “The amount of carbon emissions in 1 kilowatt hour varies quite significantly, even just during the day, as well as over the month and year,” he says.

    Engineers can take advantage of these variations by leveraging the flexibility of AI workloads and data center operations to maximize emissions reductions. For instance, some generative AI workloads don’t need to be performed in their entirety at the same time. Splitting computing operations so some are performed later, when more of the electricity fed into the grid is from renewable sources like solar and wind, can go a long way toward reducing a data center’s carbon footprint, says Deepjyoti Deka, a research scientist in the MIT Energy Initiative.

    Deka and his team are also studying “smarter” data centers where the AI workloads of multiple companies using the same computing equipment are flexibly adjusted to improve energy efficiency.

    “By looking at the system as a whole, our hope is to minimize energy use as well as dependence on fossil fuels, while still maintaining reliability standards for AI companies and users,” Deka says.

    He and others at MITEI are building a flexibility model of a data center that considers the differing energy demands of training a deep-learning model versus deploying that model. Their hope is to uncover the best strategies for scheduling and streamlining computing operations to improve energy efficiency.

    The researchers are also exploring the use of long-duration energy storage units at data centers, which store excess energy for times when it is needed. With these systems in place, a data center could use stored energy that was generated by renewable sources during a high-demand period, or avoid the use of diesel backup generators if there are fluctuations in the grid.

    “Long-duration energy storage could be a game-changer here because we can design operations that really change the emission mix of the system to rely more on renewable energy,” Deka says.

    In addition, researchers at MIT and Princeton University are developing a software tool for investment planning in the power sector, called GenX, which could be used to help companies determine the ideal place to locate a data center to minimize environmental impacts and costs.

    Location can have a big impact on reducing a data center’s carbon footprint. For instance, Meta operates a data center in Luleå, a city on the coast of northern Sweden where cooler temperatures reduce the amount of electricity needed to cool computing hardware.

    Thinking farther outside the box (way farther), some governments are even exploring the construction of data centers on the moon, where they could potentially be operated with nearly all renewable energy.

    AI-based solutions

    Currently, the expansion of renewable energy generation here on Earth isn’t keeping pace with the rapid growth of AI, which is one major roadblock to reducing its carbon footprint, says Jennifer Turliuk MBA ’25, a short-term lecturer, former Sloan Fellow, and former practice leader of climate and energy AI at the Martin Trust Center for MIT Entrepreneurship.

    The local, state, and federal review processes required for new renewable energy projects can take years.

    Researchers at MIT and elsewhere are exploring the use of AI to speed up the process of connecting new renewable energy systems to the power grid. For instance, a generative AI model could streamline interconnection studies that determine how a new project will impact the power grid, a step that often takes years to complete.

    And when it comes to accelerating the development and implementation of clean energy technologies, AI could play a major role.

    “Machine learning is great for tackling complex situations, and the electrical grid is said to be one of the largest and most complex machines in the world,” Turliuk adds.

    For instance, AI could help optimize the prediction of solar and wind energy generation or identify ideal locations for new facilities. It could also be used to perform predictive maintenance and fault detection for solar panels or other green energy infrastructure, or to monitor the capacity of transmission wires to maximize efficiency.

    By helping researchers gather and analyze huge amounts of data, AI could also inform targeted policy interventions aimed at getting the biggest “bang for the buck” from areas such as renewable energy, Turliuk says.

    To help policymakers, scientists, and enterprises consider the multifaceted costs and benefits of AI systems, she and her collaborators developed the Net Climate Impact Score. The score is a framework that can be used to help determine the net climate impact of AI projects, considering emissions and other environmental costs along with potential environmental benefits in the future.

    At the end of the day, the most effective solutions will likely result from collaborations among companies, regulators, and researchers, with academia leading the way, Turliuk adds.

    “Every day counts. We are on a path where the effects of climate change won’t be fully known until it is too late to do anything about it. This is a once-in-a-lifetime opportunity to innovate and make AI systems less carbon-intense,” she says.
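
    The article doesn’t name the mechanism behind “turning down” the GPUs; one standard way to do it in practice is a power cap set through NVIDIA’s management library. The sketch below uses the pynvml bindings (pip install nvidia-ml-py); the 60 percent cap is an arbitrary illustration, not the Supercomputing Center’s setting, and changing limits requires administrator privileges.

    ```python
    import pynvml

    pynvml.nvmlInit()
    try:
        for i in range(pynvml.nvmlDeviceGetCount()):
            handle = pynvml.nvmlDeviceGetHandleByIndex(i)
            # Hardware-supported power-limit range, in milliwatts.
            min_mw, max_mw = pynvml.nvmlDeviceGetPowerManagementLimitConstraints(handle)
            # Cap each GPU below its maximum draw (illustrative 60 percent).
            target_mw = max(min_mw, int(0.6 * max_mw))
            pynvml.nvmlDeviceSetPowerManagementLimit(handle, target_mw)
            print(f"GPU {i}: power limit set to {target_mw / 1000:.0f} W")
    finally:
        pynvml.nvmlShutdown()
    ```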

  • AI system learns from many types of scientific information and runs experiments to discover new materials

    Machine-learning models can speed up the discovery of new materials by making predictions and suggesting experiments. But most models today only consider a few specific types of data or variables. Compare that with human scientists, who work in a collaborative environment and consider experimental results, the broader scientific literature, imaging and structural analysis, personal experience or intuition, and input from colleagues and peer reviewers.

    Now, MIT researchers have developed a method for optimizing materials recipes and planning experiments that incorporates information from diverse sources like insights from the literature, chemical compositions, microstructural images, and more. The approach is part of a new platform, named Copilot for Real-world Experimental Scientists (CRESt), that also uses robotic equipment for high-throughput materials testing, the results of which are fed back into large multimodal models to further optimize materials recipes.

    Human researchers can converse with the system in natural language, with no coding required, and the system makes its own observations and hypotheses along the way. Cameras and visual language models also allow the system to monitor experiments, detect issues, and suggest corrections.

    “In the field of AI for science, the key is designing new experiments,” says Ju Li, the School of Engineering Carl Richard Soderberg Professor of Power Engineering. “We use multimodal feedback — for example, information from previous literature on how palladium behaved in fuel cells at this temperature, and human feedback — to complement experimental data and design new experiments. We also use robots to synthesize and characterize the material’s structure and to test performance.”

    The system is described in a paper published in Nature. The researchers used CRESt to explore more than 900 chemistries and conduct 3,500 electrochemical tests, leading to the discovery of a catalyst material that delivered record power density in a fuel cell that runs on formate salt to produce electricity.

    Joining Li on the paper as first authors are PhD student Zhen Zhang, Zhichu Ren PhD ’24, PhD student Chia-Wei Hsu, and postdoc Weibin Chen. Their coauthors are MIT Assistant Professor Iwnetim Abate; Associate Professor Pulkit Agrawal; JR East Professor of Engineering Yang Shao-Horn; MIT.nano researcher Aubrey Penn; Zhang-Wei Hong PhD ’25; Hongbin Xu PhD ’25; Daniel Zheng PhD ’25; MIT graduate students Shuhan Miao and Hugh Smith; MIT postdocs Yimeng Huang, Weiyin Chen, Yungsheng Tian, Yifan Gao, and Yaoshen Niu; former MIT postdoc Sipei Li; and collaborators including Chi-Feng Lee, Yu-Cheng Shao, Hsiao-Tsu Wang, and Ying-Rui Lu.


    A smarter system

    Materials science experiments can be time-consuming and expensive. They require researchers to carefully design workflows, make new material, and run a series of tests and analyses to understand what happened. Those results are then used to decide how to improve the material.

    To improve the process, some researchers have turned to a machine-learning strategy known as active learning to make efficient use of previous experimental data points and explore or exploit those data. When paired with a statistical technique known as Bayesian optimization (BO), active learning has helped researchers identify new materials for things like batteries and advanced semiconductors.

    “Bayesian optimization is like Netflix recommending the next movie to watch based on your viewing history, except instead it recommends the next experiment to do,” Li explains. “But basic Bayesian optimization is too simplistic. It uses a boxed-in design space, so if I say I’m going to use platinum, palladium, and iron, it only changes the ratio of those elements in this small space. But real materials have a lot more dependencies, and BO often gets lost.”

    Most active learning approaches also rely on single data streams that don’t capture everything that goes on in an experiment. To equip computational systems with more human-like knowledge, while still taking advantage of the speed and control of automated systems, Li and his collaborators built CRESt.

    CRESt’s robotic equipment includes a liquid-handling robot, a carbothermal shock system to rapidly synthesize materials, an automated electrochemical workstation for testing, characterization equipment including automated electron microscopy and optical microscopy, and auxiliary devices such as pumps and gas valves, which can also be remotely controlled. Many processing parameters can also be tuned.

    With the user interface, researchers can chat with CRESt and tell it to use active learning to find promising materials recipes for different projects. CRESt can include up to 20 precursor molecules and substrates in its recipes. To guide material designs, CRESt’s models search through scientific papers for descriptions of elements or precursor molecules that might be useful. When human researchers tell CRESt to pursue new recipes, it kicks off a robotic symphony of sample preparation, characterization, and testing. The researcher can also ask CRESt to perform image analysis from scanning electron microscopy imaging, X-ray diffraction, and other sources.

    Information from those processes is used to train the active learning models, which use both literature knowledge and current experimental results to suggest further experiments and accelerate materials discovery.

    “For each recipe we use previous literature text or databases, and it creates these huge representations of every recipe based on the previous knowledge base before even doing the experiment,” says Li. “We perform principal component analysis in this knowledge embedding space to get a reduced search space that captures most of the performance variability. Then we use Bayesian optimization in this reduced space to design the new experiment. After the new experiment, we feed newly acquired multimodal experimental data and human feedback into a large language model to augment the knowledge base and redefine the reduced search space, which gives us a big boost in active learning efficiency.”

    Materials science experiments can also face reproducibility challenges. To address the problem, CRESt monitors its experiments with cameras, looking for potential problems and suggesting solutions via text and voice to human researchers.

    The researchers used CRESt to develop an electrode material for an advanced type of high-density fuel cell known as a direct formate fuel cell. After exploring more than 900 chemistries over three months, CRESt discovered a catalyst material made from eight elements that achieved a 9.3-fold improvement in power density per dollar over pure palladium, an expensive precious metal. In further tests, CRESt’s material was used to deliver a record power density to a working direct formate fuel cell even though the cell contained just one-fourth of the precious metals of previous devices.

    The results show the potential for CRESt to find solutions to real-world energy problems that have plagued the materials science and engineering community for decades.

    “A significant challenge for fuel-cell catalysts is the use of precious metal,” says Zhang. “For fuel cells, researchers have used various precious metals like palladium and platinum. We used a multielement catalyst that also incorporates many other cheap elements to create the optimal coordination environment for catalytic activity and resistance to poisoning species such as carbon monoxide and adsorbed hydrogen atoms. People have been searching for low-cost options for many years. This system greatly accelerated our search for these catalysts.”

    A helpful assistant

    Early on, poor reproducibility emerged as a major problem that limited the researchers’ ability to perform their new active learning technique on experimental datasets. Material properties can be influenced by the way the precursors are mixed and processed, and any number of problems can subtly alter experimental conditions, requiring careful inspection to correct.

    To partially automate the process, the researchers coupled computer vision and vision language models with domain knowledge from the scientific literature, which allowed the system to hypothesize sources of irreproducibility and propose solutions. For example, the models can notice when there’s a millimeter-sized deviation in a sample’s shape or when a pipette moves something out of place. The researchers incorporated some of the model’s suggestions, leading to improved consistency, suggesting the models already make good experimental assistants.

    The researchers noted that humans still performed most of the debugging in their experiments.

    “CRESt is an assistant, not a replacement, for human researchers,” Li says. “Human researchers are still indispensable. In fact, we use natural language so the system can explain what it is doing and present observations and hypotheses. But this is a step toward more flexible, self-driving labs.”
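
    Li’s description of the loop — embed each recipe using prior knowledge, compress the embedding space with principal component analysis, then run Bayesian optimization in the reduced space — maps onto standard tools. The sketch below is a minimal stand-in using scikit-learn, with random vectors in place of CRESt’s literature-derived embeddings; the dimensions, expected-improvement acquisition, and toy objective are all assumptions, not the platform’s internals.

    ```python
    import numpy as np
    from scipy.stats import norm
    from sklearn.decomposition import PCA
    from sklearn.gaussian_process import GaussianProcessRegressor

    rng = np.random.default_rng(0)

    # Placeholder "knowledge embeddings": one vector per candidate recipe.
    recipes = rng.normal(size=(500, 256))

    # Reduce the search space with PCA, as the article describes.
    reduced = PCA(n_components=8).fit_transform(recipes)

    def run_experiment(idx):
        # Stand-in for robotic synthesis and testing of recipe `idx`.
        return -np.linalg.norm(reduced[idx] - reduced[7])

    tried = [int(i) for i in rng.choice(len(recipes), size=5, replace=False)]
    scores = [run_experiment(i) for i in tried]

    for _ in range(20):
        gp = GaussianProcessRegressor(normalize_y=True).fit(reduced[tried], scores)
        mu, sigma = gp.predict(reduced, return_std=True)
        # Expected-improvement acquisition over every untried candidate.
        best = max(scores)
        z = (mu - best) / np.maximum(sigma, 1e-9)
        ei = (mu - best) * norm.cdf(z) + sigma * norm.pdf(z)
        ei[tried] = -np.inf
        tried.append(int(np.argmax(ei)))
        scores.append(run_experiment(tried[-1]))

    print("best recipe index:", tried[int(np.argmax(scores))])
    ```

    In the real system, each `run_experiment` call is a robotic synthesis-and-test cycle, and the knowledge base (and hence the reduced space) is itself updated between rounds — the step the article credits with the big boost in active-learning efficiency.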

  • New tool makes generative AI models more likely to create breakthrough materials

    The artificial intelligence models that turn text into images are also useful for generating new materials. Over the last few years, generative materials models from companies like Google, Microsoft, and Meta have drawn on their training data to help researchers design tens of millions of new materials.

    But when it comes to designing materials with exotic quantum properties like superconductivity or unique magnetic states, those models struggle. That’s too bad, because humans could use the help. For example, after a decade of research into quantum spin liquids, a class of materials that could revolutionize quantum computing, only a dozen material candidates have been identified. The bottleneck means there are fewer materials to serve as the basis for technological breakthroughs.

    Now, MIT researchers have developed a technique that lets popular generative materials models create promising quantum materials by following specific design rules. The rules, or constraints, steer models to create materials with unique structures that give rise to quantum properties.

    “The models from these large companies generate materials optimized for stability,” says Mingda Li, MIT’s Class of 1947 Career Development Professor. “Our perspective is that’s not usually how materials science advances. We don’t need 10 million new materials to change the world. We just need one really good material.”

    The approach is described today in a paper published in Nature Materials. The researchers applied their technique to generate millions of candidate materials consisting of geometric lattice structures associated with quantum properties. From that pool, they synthesized two actual materials with exotic magnetic traits.

    “People in the quantum community really care about these geometric constraints, like the Kagome lattices that are two overlapping, upside-down triangles. We created materials with Kagome lattices because those materials can mimic the behavior of rare earth elements, so they are of high technical importance,” Li says.

    Li is the senior author of the paper. His MIT co-authors include PhD students Ryotaro Okabe, Mouyang Cheng, Abhijatmedhi Chotrattanapituk, and Denisse Cordova Carrizales; postdoc Manasi Mandal; undergraduate researchers Kiran Mak and Bowen Yu; visiting scholar Nguyen Tuan Hung; Xiang Fu ’22, PhD ’24; and professor of electrical engineering and computer science Tommi Jaakkola, who is an affiliate of the Computer Science and Artificial Intelligence Laboratory (CSAIL) and the Institute for Data, Systems, and Society. Additional co-authors include Yao Wang of Emory University, Weiwei Xie of Michigan State University, YQ Cheng of Oak Ridge National Laboratory, and Robert Cava of Princeton University.

    Steering models toward impact

    A material’s properties are determined by its structure, and quantum materials are no different. Certain atomic structures are more likely to give rise to exotic quantum properties than others. For instance, square lattices can serve as a platform for high-temperature superconductors, while other shapes known as Kagome and Lieb lattices can support the creation of materials that could be useful for quantum computing.

    To help a popular class of generative models known as diffusion models produce materials that conform to particular geometric patterns, the researchers created SCIGEN (short for Structural Constraint Integration in GENerative model). SCIGEN is computer code that ensures diffusion models adhere to user-defined constraints at each iterative generation step. With SCIGEN, users can give any generative AI diffusion model geometric structural rules to follow as it generates materials.

    AI diffusion models work by sampling from their training dataset to generate structures that reflect the distribution of structures found in the dataset. SCIGEN blocks generations that don’t align with the structural rules.

    To test SCIGEN, the researchers applied it to a popular AI materials generation model known as DiffCSP. They had the SCIGEN-equipped model generate materials with unique geometric patterns known as Archimedean lattices, which are collections of 2D lattice tilings of different polygons. Archimedean lattices can lead to a range of quantum phenomena and have been the focus of much research.

    “Archimedean lattices give rise to quantum spin liquids and so-called flat bands, which can mimic the properties of rare earths without rare earth elements, so they are extremely important,” says Cheng, a co-corresponding author of the work. “Other Archimedean lattice materials have large pores that could be used for carbon capture and other applications, so it’s a collection of special materials. In some cases, there are no known materials with that lattice, so I think it will be really interesting to find the first material that fits in that lattice.”

    The model generated over 10 million material candidates with Archimedean lattices. One million of those materials survived a screening for stability. Using the supercomputers at Oak Ridge National Laboratory, the researchers then took a smaller sample of 26,000 materials and ran detailed simulations to understand how the materials’ underlying atoms behaved. The researchers found magnetism in 41 percent of those structures.

    From that subset, the researchers synthesized two previously undiscovered compounds, TiPdBi and TiPbSb, at Xie’s and Cava’s labs. Subsequent experiments showed the AI model’s predictions largely aligned with the actual materials’ properties.

    “We wanted to discover new materials that could have a huge potential impact by incorporating these structures that have been known to give rise to quantum properties,” says Okabe, the paper’s first author. “We already know that these materials with specific geometric patterns are interesting, so it’s natural to start with them.”

    Accelerating material breakthroughs

    Quantum spin liquids could unlock quantum computing by enabling stable, error-resistant qubits that serve as the basis of quantum operations. But no quantum spin liquid materials have been confirmed. Xie and Cava believe SCIGEN could accelerate the search for these materials.

    “There’s a big search for quantum computer materials and topological superconductors, and these are all related to the geometric patterns of materials,” Xie says. “But experimental progress has been very, very slow,” Cava adds. “Many of these quantum spin liquid materials are subject to constraints: They have to be in a triangular lattice or a Kagome lattice. If the materials satisfy those constraints, the quantum researchers get excited; it’s a necessary but not sufficient condition. So, by generating many, many materials like that, it immediately gives experimentalists hundreds or thousands more candidates to play with to accelerate quantum computer materials research.”

    “This work presents a new tool, leveraging machine learning, that can predict which materials will have specific elements in a desired geometric pattern,” says Drexel University Professor Steve May, who was not involved in the research. “This should speed up the development of previously unexplored materials for applications in next-generation electronic, magnetic, or optical technologies.”

    The researchers stress that experimentation is still critical to assess whether AI-generated materials can be synthesized and how their actual properties compare with model predictions. Future work on SCIGEN could incorporate additional design rules into generative models, including chemical and functional constraints.

    “People who want to change the world care about material properties more than the stability and structure of materials,” Okabe says. “With our approach, the ratio of stable materials goes down, but it opens the door to generate a whole bunch of promising materials.”

    The work was supported, in part, by the U.S. Department of Energy, the National Energy Research Scientific Computing Center, the National Science Foundation, and Oak Ridge National Laboratory.
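
    The article describes SCIGEN as blocking generations that violate the structural rules at every iterative denoising step. A simplified way to picture that is a sampling loop that re-imposes the geometric motif after each model update. The hard “pinning” below is a simplification of the paper’s scheme, and the toy denoiser stands in for a real diffusion model such as DiffCSP.

    ```python
    import numpy as np

    def sample_with_constraint(denoise_step, x_noisy, mask, template, n_steps=1000):
        """Sketch of constraint injection in a reverse-diffusion loop.

        x_noisy:  (n_atoms, 3) noisy fractional coordinates
        mask:     (n_atoms,) bool, True for atoms constrained to the motif
        template: (n_atoms, 3) motif positions (e.g., a Kagome lattice)
        denoise_step: the generative model's single reverse-diffusion update
        """
        x = x_noisy.copy()
        for t in reversed(range(n_steps)):
            x = denoise_step(x, t)
            # Re-impose the geometric constraint after every update, so
            # unconstrained atoms keep evolving around a fixed motif.
            x[mask] = template[mask]
        return x

    # Toy usage: a "denoiser" that just shrinks noise, with 4 of 12 atoms pinned.
    rng = np.random.default_rng(0)
    template = rng.uniform(size=(12, 3))
    mask = np.arange(12) < 4
    x0 = sample_with_constraint(lambda x, t: 0.99 * x,
                                rng.normal(size=(12, 3)), mask, template)
    ```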

  • Simpler models can outperform deep learning at climate prediction

    Environmental scientists are increasingly using enormous artificial intelligence models to make predictions about changes in weather and climate, but a new study by MIT researchers shows that bigger models are not always better.

    The team demonstrates that, in certain climate scenarios, much simpler, physics-based models can generate more accurate predictions than state-of-the-art deep-learning models.

    Their analysis also reveals that a benchmarking technique commonly used to evaluate machine-learning techniques for climate predictions can be distorted by natural variations in the data, like fluctuations in weather patterns. This could lead someone to believe a deep-learning model makes more accurate predictions when that is not the case.

    The researchers developed a more robust way of evaluating these techniques, which shows that, while simple models are more accurate when estimating regional surface temperatures, deep-learning approaches can be the best choice for estimating local rainfall.

    They used these results to enhance a simulation tool known as a climate emulator, which can rapidly simulate the effect of human activities on the future climate.

    The researchers see their work as a “cautionary tale” about the risk of deploying large AI models for climate science. While deep-learning models have shown incredible success in domains such as natural language, climate science contains a proven set of physical laws and approximations, and the challenge becomes how to incorporate those into AI models.

    “We are trying to develop models that are going to be useful and relevant for the kinds of things that decision-makers need going forward when making climate policy choices. While it might be attractive to use the latest, big-picture machine-learning model on a climate problem, what this study shows is that stepping back and really thinking about the problem fundamentals is important and useful,” says study senior author Noelle Selin, a professor in the MIT Institute for Data, Systems, and Society (IDSS) and the Department of Earth, Atmospheric and Planetary Sciences (EAPS).

    Selin’s co-authors are lead author Björn Lütjens, a former EAPS postdoc who is now a research scientist at IBM Research; senior author Raffaele Ferrari, the Cecil and Ida Green Professor of Oceanography in EAPS and co-director of the Lorenz Center; and Duncan Watson-Parris, assistant professor at the University of California at San Diego. Selin and Ferrari are also co-principal investigators of the Bringing Computation to the Climate Challenge project, out of which this research emerged. The paper appears today in the Journal of Advances in Modeling Earth Systems.

    Comparing emulators

    Because the Earth’s climate is so complex, running a state-of-the-art climate model to predict how pollution levels will impact environmental factors like temperature can take weeks on the world’s most powerful supercomputers.

    Scientists often create climate emulators, simpler approximations of a state-of-the-art climate model, which are faster and more accessible. A policymaker could use a climate emulator to see how alternative assumptions on greenhouse gas emissions would affect future temperatures, helping them develop regulations.

    But an emulator isn’t very useful if it makes inaccurate predictions about the local impacts of climate change. While deep learning has become increasingly popular for emulation, few studies have explored whether these models perform better than tried-and-true approaches.

    The MIT researchers performed such a study. They compared a traditional technique called linear pattern scaling (LPS) with a deep-learning model, using a common benchmark dataset for evaluating climate emulators.

    Their results showed that LPS outperformed deep-learning models on predicting nearly all parameters they tested, including temperature and precipitation.

    “Large AI methods are very appealing to scientists, but they rarely solve a completely new problem, so implementing an existing solution first is necessary to find out whether the complex machine-learning approach actually improves upon it,” says Lütjens.

    Some initial results seemed to fly in the face of the researchers’ domain knowledge. The powerful deep-learning model should have been more accurate when making predictions about precipitation, since those data don’t follow a linear pattern.

    They found that the high amount of natural variability in climate model runs can cause the deep-learning model to perform poorly on unpredictable long-term oscillations, like El Niño/La Niña. This skews the benchmarking scores in favor of LPS, which averages out those oscillations.

    Constructing a new evaluation

    From there, the researchers constructed a new evaluation with more data that address natural climate variability. With this new evaluation, the deep-learning model performed slightly better than LPS for local precipitation, but LPS was still more accurate for temperature predictions.

    “It is important to use the modeling tool that is right for the problem, but in order to do that you also have to set up the problem the right way in the first place,” Selin says.

    Based on these results, the researchers incorporated LPS into a climate emulation platform to predict local temperature changes in different emission scenarios.

    “We are not advocating that LPS should always be the goal. It still has limitations. For instance, LPS doesn’t predict variability or extreme weather events,” Ferrari adds.

    Rather, they hope their results emphasize the need to develop better benchmarking techniques, which could provide a fuller picture of which climate emulation technique is best suited for a particular situation.

    “With an improved climate emulation benchmark, we could use more complex machine-learning methods to explore problems that are currently very hard to address, like the impacts of aerosols or estimations of extreme precipitation,” Lütjens says.

    Ultimately, more accurate benchmarking techniques will help ensure policymakers are making decisions based on the best available information.

    The researchers hope others build on their analysis, perhaps by studying additional improvements to climate emulation methods and benchmarks. Such research could explore impact-oriented metrics like drought indicators and wildfire risks, or new variables like regional wind speeds.

    This research is funded, in part, by Schmidt Sciences, LLC, and is part of the MIT Climate Grand Challenges team for “Bringing Computation to the Climate Challenge.”
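
    Linear pattern scaling itself is simple enough to sketch outright: for every grid cell, fit a linear map from global-mean temperature to the local variable, then reuse that map for any emissions scenario. The minimal version below assumes annual-mean fields stored as plain arrays; the study’s actual data handling and benchmark are more involved.

    ```python
    import numpy as np

    def fit_pattern(global_mean_T, local_field):
        """Per-grid-cell linear fit of a local field against global-mean T.

        global_mean_T: (n_years,) global-mean surface temperature
        local_field:   (n_years, n_lat, n_lon) e.g., regional temperature
        Returns slope and intercept maps of shape (n_lat, n_lon).
        """
        T = global_mean_T - global_mean_T.mean()
        y = local_field - local_field.mean(axis=0)
        slope = np.einsum("t,tij->ij", T, y) / (T @ T)
        intercept = local_field.mean(axis=0) - slope * global_mean_T.mean()
        return slope, intercept

    def emulate(global_mean_T, slope, intercept):
        # Predict the local field for any scenario's global-mean trajectory.
        return slope * global_mean_T[:, None, None] + intercept
    ```

    Because the per-cell regression pools all years, oscillations like El Niño/La Niña largely wash out of the fit — one reason LPS fared well on the original benchmark.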

  • MIT gears up to transform manufacturing

    “Manufacturing is the engine of society, and it is the backbone of robust, resilient economies,” says John Hart, head of MIT’s Department of Mechanical Engineering (MechE) and faculty co-director of the MIT Initiative for New Manufacturing (INM). “With manufacturing a lively topic in today’s news, there’s a renewed appreciation and understanding of the importance of manufacturing to innovation, to economic and national security, and to daily lives.”

    Launched this May, INM will “help create a transformation of manufacturing through new technology, through development of talent, and through an understanding of how to scale manufacturing in a way that imparts higher productivity and resilience, drives adoption of new technologies, and creates good jobs,” Hart says.

    INM is one of MIT’s strategic initiatives and builds on the successful three-year-old Manufacturing@MIT program. “It’s a recognition by MIT that manufacturing is an Institute-wide theme and an Institute-wide priority, and that manufacturing connects faculty and students across campus,” says Hart. Alongside Hart, INM’s faculty co-directors are Institute Professor Suzanne Berger and Chris Love, professor of chemical engineering.

    The initiative is pursuing four main themes: reimagining manufacturing technologies and systems, elevating the productivity and human experience of manufacturing, scaling up new manufacturing, and transforming the manufacturing base.

    Breaking manufacturing barriers for corporations

    Amgen, Autodesk, Flex, GE Vernova, PTC, Sanofi, and Siemens are founding members of INM’s industry consortium. These industry partners will work closely with MIT faculty, researchers, and students across many aspects of manufacturing-related research, both in broad-scale initiatives and in particular areas of shared interest. Membership requires a minimum three-year commitment of $500,000 a year to manufacturing-related activities at MIT, including the INM membership fee of $275,000 per year, which supports several core activities that engage the industry members.

    One major thrust for INM industry collaboration is the deployment and adoption of AI and automation in manufacturing. This effort will include seed research projects at MIT, collaborative case studies, and shared strategy development.

    INM also offers companies participation in the MIT-wide New Manufacturing Research effort, which is studying the trajectories of specific manufacturing industries and examining cross-cutting themes such as technology and financing.

    Additionally, INM will concentrate on education for all professions in manufacturing, with alliances bringing together corporations, community colleges, government agencies, and other partners. “We’ll scale our curriculum to broader audiences, from aspiring manufacturing workers and aspiring production line supervisors all the way up to engineers and executives,” says Hart.

    In workforce training, INM will collaborate with companies broadly to help understand the challenges and frame its overall workforce agenda, and with individual firms on specific challenges, such as acquiring suitably prepared employees for a new factory.

    Importantly, industry partners will also engage directly with students. Founding member Flex, for instance, hosted MIT researchers and students at the Flex Institute of Technology in Sorocaba, Brazil, developing new solutions for electronics manufacturing.

    “History shows that you need to innovate in manufacturing alongside the innovation in products,” Hart comments. “At MIT, as more students take classes in manufacturing, they’ll think more about key manufacturing issues as they decide what research problems they want to solve, or what choices they make as they prototype their devices. The same is true for industry — companies that operate at the frontier of manufacturing, whether through internal capabilities or their supply chains, are positioned to be on the frontier of product innovation and overall growth.”

    “We’ll have an opportunity to bring manufacturing upstream to the early stage of research, designing new processes and new devices with scalability in mind,” he says.

    Additionally, MIT expects to open new manufacturing-related labs and to further broaden cooperation with industry at existing shared facilities, such as MIT.nano. Hart says that facilities will also invite tighter collaborations with corporations — not just providing advanced equipment, but working jointly on, say, new technologies for weaving textiles, or speeding up battery manufacturing.

    Homing in on the United States

    INM is a global project that brings a particular focus on the United States, which remains the world’s second-largest manufacturing economy but has suffered a significant decline in manufacturing employment and innovation.

    One key to reversing this trend and reinvigorating the U.S. manufacturing base is advocacy for manufacturing’s critical role in society and the career opportunities it offers.

    “No one really disputes the importance of manufacturing,” Hart says. “But we need to elevate interest in manufacturing as a rewarding career, from the production workers to manufacturing engineers and leaders, through advocacy, education programs, and buy-in from industry, government, and academia.”

    MIT is in a unique position to convene industry, academic, and government stakeholders in manufacturing to work together on this vital issue, he points out.

    Moreover, in times of radical and rapid changes in manufacturing, “we need to focus on deploying new technologies into factories and supply chains,” Hart says. “Technology is not all of the solution, but for the U.S. to expand our manufacturing base, we need to do it with technology as a key enabler, embracing companies of all sizes, including small and medium enterprises.”

    “As AI becomes more capable, and automation becomes more flexible and more available, these are key building blocks upon which you can address manufacturing challenges,” he says. “AI and automation offer new accelerated ways to develop, deploy, and monitor production processes, which present a huge opportunity and, in some cases, a necessity.”

    “While manufacturing is always a combination of old technology, new technology, established practice, and new ways of thinking, digital technology gives manufacturers an opportunity to leapfrog competitors,” Hart says. “That’s very, very powerful for the U.S. and any company, or country, that aims to create differentiated capabilities.”

    Fortunately, in recent years, investors have increasingly bought into new manufacturing in the United States. “They see the opportunity to re-industrialize, to build the factories and production systems of the future,” Hart says.

    “That said, building new manufacturing is capital-intensive, and takes time,” he adds. “So that’s another area where it’s important to convene stakeholders and to think about how startups and growth-stage companies build their capital portfolios, how large industry can support an ecosystem of small businesses and young companies, and how to develop talent to support those growing companies.”

    All these concerns and opportunities in the manufacturing ecosystem play to MIT’s strengths. “MIT’s DNA of cross-disciplinary collaboration and working with industry can let us create a lot of impact,” Hart emphasizes. “We can understand the practical challenges. We can also explore breakthrough ideas in research and cultivate successful outcomes, all the way to new companies and partnerships. Sometimes those are seen as disparate approaches, but we like to bring them together.”

  • Eco-driving measures could significantly reduce vehicle emissions

    Any motorist who has ever waited through multiple cycles for a traffic light to turn green knows how annoying signalized intersections can be. But sitting at intersections isn’t just a drag on drivers’ patience — unproductive vehicle idling could contribute as much as 15 percent of the carbon dioxide emissions from U.S. land transportation.

    A large-scale modeling study led by MIT researchers reveals that eco-driving measures, which can involve dynamically adjusting vehicle speeds to reduce stopping and excessive acceleration, could significantly reduce those CO2 emissions.

    Using a powerful artificial intelligence method called deep reinforcement learning, the researchers conducted an in-depth impact assessment of the factors affecting vehicle emissions in three major U.S. cities.

    Their analysis indicates that fully adopting eco-driving measures could cut annual city-wide intersection carbon emissions by 11 to 22 percent, without slowing traffic throughput or affecting vehicle and traffic safety.

    Even if only 10 percent of vehicles on the road employ eco-driving, it would result in 25 to 50 percent of the total reduction in CO2 emissions, the researchers found.

    In addition, dynamically optimizing speed limits at about 20 percent of intersections provides 70 percent of the total emission benefits. This indicates that eco-driving measures could be implemented gradually while still having measurable, positive impacts on mitigating climate change and improving public health.

An animated GIF compares what 20% eco-driving adoption looks like to 100% eco-driving adoption. (Image: Courtesy of the researchers)

“Vehicle-based control strategies like eco-driving can move the needle on climate change reduction. We’ve shown here that modern machine-learning tools, like deep reinforcement learning, can accelerate the kinds of analysis that support sociotechnical decision making. This is just the tip of the iceberg,” says senior author Cathy Wu, the Class of 1954 Career Development Associate Professor in Civil and Environmental Engineering (CEE) and the Institute for Data, Systems, and Society (IDSS) at MIT, and a member of the Laboratory for Information and Decision Systems (LIDS).

She is joined on the paper by lead author Vindula Jayawardana, an MIT graduate student; MIT graduate students Ao Qu, Cameron Hickert, and Edgar Sanchez; MIT undergraduate Catherine Tang; Baptiste Freydt, a graduate student at ETH Zurich; and Mark Taylor and Blaine Leonard of the Utah Department of Transportation. The research appears in Transportation Research Part C: Emerging Technologies.

A multi-part modeling study

Traffic control measures typically call to mind fixed infrastructure, like stop signs and traffic signals. But as vehicles become more technologically advanced, they present an opportunity for eco-driving, a catch-all term for vehicle-based traffic control measures like the use of dynamic speeds to reduce energy consumption.

In the near term, eco-driving could involve speed guidance delivered through vehicle dashboards or smartphone apps. In the longer term, it could involve intelligent speed commands that directly control the acceleration of semi-autonomous and fully autonomous vehicles through vehicle-to-infrastructure communication systems.

“Most prior work has focused on how to implement eco-driving. We shifted the frame to consider the question of should we implement eco-driving. If we were to deploy this technology at scale, would it make a difference?” Wu says.

To answer that question, the researchers embarked on a multifaceted modeling study that would take the better part of four years to complete.

They began by identifying 33 factors that influence vehicle emissions, including temperature, road grade, intersection topology, vehicle age, traffic demand, vehicle types, driver behavior, traffic signal timing, and road geometry.

“One of the biggest challenges was making sure we were diligent and didn’t leave out any major factors,” Wu says.

Then they used data from OpenStreetMap, the U.S. Geological Survey, and other sources to create digital replicas of more than 6,000 signalized intersections in three cities — Atlanta, San Francisco, and Los Angeles — and simulated more than a million traffic scenarios.

The researchers used deep reinforcement learning to optimize each scenario for eco-driving to achieve the maximum emissions benefits. Reinforcement learning optimizes the vehicles’ driving behavior through trial-and-error interactions with a high-fidelity traffic simulator, rewarding vehicle behaviors that are more energy-efficient while penalizing those that are not.

The researchers cast the problem as a decentralized cooperative multi-agent control problem: the vehicles cooperate to achieve overall energy efficiency, even among non-participating vehicles, and they act in a decentralized manner, avoiding the need for costly communication between vehicles.

However, training vehicle behaviors that generalize across diverse intersection traffic scenarios was a major challenge.
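Before turning to how the team addressed that challenge, here is a rough illustration of the reward structure described above: a per-step reward that favors forward progress while penalizing a crude fuel proxy and idling. The function name, weights, and fuel proxy are illustrative assumptions, not the researchers’ actual formulation.

```python
# Minimal sketch of an eco-driving reward, loosely following the article's
# description: reward energy-efficient behavior, penalize idling and harsh
# acceleration. All names and weights are illustrative assumptions, not
# the researchers' actual code.

def eco_driving_reward(speed_mps: float, accel_mps2: float,
                       w_fuel: float = 1.0, w_idle: float = 0.5,
                       w_progress: float = 0.1) -> float:
    """Reward for one vehicle at one simulation step (higher is better)."""
    # Crude emissions proxy: fuel burn grows with speed and with harsh
    # acceleration (real studies use calibrated emissions models).
    fuel_proxy = 0.01 * speed_mps + 0.1 * max(accel_mps2, 0.0) ** 2
    idle_penalty = w_idle if speed_mps < 0.1 else 0.0  # stopped, still burning fuel
    progress = w_progress * speed_mps                  # keep traffic moving
    return progress - w_fuel * fuel_proxy - idle_penalty

print(eco_driving_reward(speed_mps=0.0, accel_mps2=0.0))   # idling: negative
print(eco_driving_reward(speed_mps=12.0, accel_mps2=0.3))  # smooth cruise: higher
```

Because a reward of this shape depends only on quantities each vehicle can observe locally, it can in principle be optimized without vehicle-to-vehicle communication, consistent with the decentralized design the team describes.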
The researchers observed that some scenarios are more similar to one another than others, such as scenarios with the same number of lanes or the same number of traffic signal phases. As such, they trained separate reinforcement learning models for different clusters of traffic scenarios, yielding better emissions benefits overall (a schematic sketch of this clustering idea follows at the end of this article).

But even with the help of AI, analyzing citywide traffic at the network level would be so computationally intensive it could take another decade to unravel, Wu says. Instead, the researchers broke the problem down and solved each eco-driving scenario at the individual intersection level.

“We carefully constrained the impact of eco-driving control at each intersection on neighboring intersections. In this way, we dramatically simplified the problem, which enabled us to perform this analysis at scale, without introducing unknown network effects,” she says.

Significant emissions benefits

When they analyzed the results, the researchers found that full adoption of eco-driving could result in intersection emissions reductions of between 11 and 22 percent.

These benefits differ depending on the layout of a city’s streets. A denser city like San Francisco has less room to implement eco-driving between intersections, offering a possible explanation for its reduced emission savings, while Atlanta could see greater benefits given its higher speed limits.

Even if only 10 percent of vehicles employ eco-driving, a city could still realize 25 to 50 percent of the total emissions benefit because of car-following dynamics: non-eco-driving vehicles would follow controlled eco-driving vehicles as they optimize speed to pass smoothly through intersections, reducing their carbon emissions as well.

In some cases, eco-driving could also increase vehicle throughput while minimizing emissions. However, Wu cautions that increased throughput could result in more drivers taking to the roads, reducing the emissions benefits.

And while their analysis of widely used safety metrics known as surrogate safety measures, such as time to collision, suggests that eco-driving is as safe as human driving, it could cause unexpected behavior in human drivers. More research is needed to fully understand potential safety impacts, Wu says.

Their results also show that eco-driving could provide even greater benefits when combined with alternative transportation decarbonization solutions. For instance, 20 percent eco-driving adoption in San Francisco would cut emission levels by 7 percent, but when combined with the projected adoption of hybrid and electric vehicles, it would cut emissions by 17 percent.

“This is a first attempt to systematically quantify network-wide environmental benefits of eco-driving. This is a great research effort that will serve as a key reference for others to build on in the assessment of eco-driving systems,” says Hesham Rakha, the Samuel L. Pritchard Professor of Engineering at Virginia Tech, who was not involved with this research.

And while the researchers focused on carbon emissions, the benefits are highly correlated with improvements in fuel consumption, energy use, and air quality.

“This is almost a free intervention. We already have smartphones in our cars, and we are rapidly adopting cars with more advanced automation features. For something to scale quickly in practice, it must be relatively simple to implement and shovel-ready. Eco-driving fits that bill,” Wu says.

This work is funded, in part, by Amazon and the Utah Department of Transportation.
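As promised above, here is a minimal sketch of the scenario-clustering strategy: group simulated intersections by coarse features such as lane count and signal-phase count, then train one policy per cluster. The Scenario fields, the clustering key, and the train_policy stub are assumptions for illustration, not the study’s actual pipeline.

```python
# Illustrative sketch of training one RL policy per cluster of similar
# traffic scenarios (e.g., same lane count, same number of signal phases).
# Features, clustering rule, and the training stub are assumptions.
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Scenario:
    intersection_id: str
    num_lanes: int
    num_signal_phases: int

def cluster_key(s: Scenario) -> tuple:
    # Scenarios sharing lane and phase counts tend to be similar, so they
    # can share a policy within each group.
    return (s.num_lanes, s.num_signal_phases)

def train_policy(scenarios: list) -> str:
    # Placeholder for deep RL training on one cluster's scenarios.
    return f"policy_trained_on_{len(scenarios)}_scenarios"

scenarios = [
    Scenario("atl_001", num_lanes=2, num_signal_phases=4),
    Scenario("sf_042", num_lanes=2, num_signal_phases=4),
    Scenario("la_117", num_lanes=3, num_signal_phases=6),
]

clusters = defaultdict(list)
for s in scenarios:
    clusters[cluster_key(s)].append(s)

policies = {key: train_policy(group) for key, group in clusters.items()}
print(policies)  # one policy per (lanes, phases) cluster
```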