More stories

  • Integrating humans with AI in structural design

    Modern fabrication tools such as 3D printers can make structural materials in shapes that would have been difficult or impossible using conventional tools. Meanwhile, new generative design systems can take great advantage of this flexibility to create innovative designs for parts of a new building, car, or virtually any other device.

    But such “black box” automated systems often fall short of producing designs that are fully optimized for their purpose, such as providing the greatest strength in proportion to weight or minimizing the amount of material needed to support a given load. Fully manual design, on the other hand, is time-consuming and labor-intensive.

    Now, researchers at MIT have found a way to achieve some of the best of both of these approaches. They used an automated design system but stopped the process periodically to allow human engineers to evaluate the work in progress and make tweaks or adjustments before letting the computer resume its design process. Introducing a few of these iterations produced results that performed better than those designed by the automated system alone, and the process was completed more quickly than the fully manual approach.

    The results are reported this week in the journal Structural and Multidisciplinary Optimization, in a paper by MIT doctoral student Dat Ha and assistant professor of civil and environmental engineering Josephine Carstensen.

    The basic approach can be applied to a broad range of scales and applications, Carstensen explains, for the design of everything from biomedical devices to nanoscale materials to structural support members of a skyscraper. Already, automated design systems have found many applications. “If we can make things in a better way, if we can make whatever we want, why not make it better?” she asks.

    “It’s a way to take advantage of how we can make things in much more complex ways than we could in the past,” says Ha, adding that automated design systems have already begun to be widely used over the last decade in automotive and aerospace industries, where reducing weight while maintaining structural strength is a key need.

    “You can take a lot of weight out of components, and in these two industries, everything is driven by weight,” he says. In some cases, such as internal components that aren’t visible, appearance is irrelevant, but for other structures aesthetics may be important as well. The new system makes it possible to optimize designs for visual as well as mechanical properties, and in such decisions the human touch is essential.

    As a demonstration of their process in action, the researchers designed a number of load-bearing structural beams, such as might be used in a building or a bridge. In one of their iterations, they saw that the design had an area that could fail prematurely, so they selected that feature and required the program to address it. The computer system then revised the design accordingly, removing the highlighted strut and strengthening some other struts to compensate, which led to an improved final design.

    The process, which they call Human-Informed Topology Optimization, begins by setting out the needed specifications — for example, a beam needs to be this length, supported on two points at its ends, and must support this much of a load. “As we’re seeing the structure evolve on the computer screen in response to initial specification,” Carstensen says, “we interrupt the design and ask the user to judge it. The user can select, say, ‘I’m not a fan of this region, I’d like you to beef up or beef down this feature size requirement.’ And then the algorithm takes into account the user input.”
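
    The interrupt-and-resume idea described above can be illustrated with a deliberately simplified sketch. This is not the researchers' algorithm or software: the "optimizer" here is a stand-in that just removes material from four hypothetical members, and the names `optimize_with_pauses`, `review`, and `beef_up_region_two` are invented for illustration. What it shows is only the control flow: run for a while, pause, let a reviewer change a regional feature-size requirement, then resume.

```python
from typing import Callable, List


def optimize_with_pauses(
    thickness: List[float],
    min_size: List[float],
    review: Callable[[int, List[float], List[float]], None],
    steps: int = 60,
    pause_every: int = 20,
    rate: float = 0.2,
) -> List[float]:
    """Toy design loop that pauses periodically for human review.

    Assumption: each entry of `thickness` is one region of a design, and
    `min_size` is the smallest feature size allowed in that region.
    """
    x = list(thickness)
    for step in range(1, steps + 1):
        # Stand-in for a real optimizer update: remove material from every
        # region, but never go below that region's minimum feature size.
        x = [max(lo, xi - rate * (xi - lo)) for xi, lo in zip(x, min_size)]

        if step % pause_every == 0:
            # Pause: the reviewer may edit `min_size` in place.
            review(step, x, min_size)
            # Resume: enforce any requirement the reviewer just changed.
            x = [max(lo, xi) for xi, lo in zip(x, min_size)]
    return x


def beef_up_region_two(step: int, design: List[float], min_size: List[float]) -> None:
    """Hypothetical reviewer: at the first pause, thicken region 2."""
    if step == 20:
        min_size[2] = 0.5


final = optimize_with_pauses([1.0] * 4, [0.1] * 4, beef_up_region_two)
print([round(v, 3) for v in final])
```

    In this toy run, the untouched regions keep shrinking toward their original minimum, while the region the reviewer flagged stays at the larger size requested during the pause.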

    While the result is not as ideal as what might be produced by a fully rigorous yet significantly slower design algorithm that considers the underlying physics, she says it can be much better than a result generated by a rapid automated design system alone. “You don’t get something that’s quite as good, but that was not necessarily the goal. What we can show is that instead of using several hours to get something, we can use 10 minutes and get something much better than where we started off.”

    The system can be used to optimize a design based on any desired properties, not just strength and weight. For example, it can be used to minimize fracture or buckling, or to reduce stresses in the material by softening corners.

    Carstensen says, “We’re not looking to replace the seven-hour solution. If you have all the time and all the resources in the world, obviously you can run these and it’s going to give you the best solution.” But for many situations, such as designing replacement parts for equipment in a war zone or a disaster-relief area with limited computational power available, “then this kind of solution that catered directly to your needs would prevail.”

    Similarly, for smaller companies manufacturing equipment in essentially “mom and pop” businesses, such a simplified system might be just the ticket. The new system they developed is not only simple and efficient to run on smaller computers, but it also requires far less training to produce useful results, Carstensen says. A basic two-dimensional version of the software, suitable for designing basic beams and structural parts, is freely available now online, she says, as the team continues to develop a full 3D version.

    “The potential applications of Prof Carstensen’s research and tools are quite extraordinary,” says Christian Málaga-Chuquitaype, a professor of civil and environmental engineering at Imperial College London, who was not associated with this work. “With this work, her group is paving the way toward a truly synergistic human-machine design interaction.”

    “By integrating engineering ‘intuition’ (or engineering ‘judgement’) into a rigorous yet computationally efficient topology optimization process, the human engineer is offered the possibility of guiding the creation of optimal structural configurations in a way that was not available to us before,” he adds. “Her findings have the potential to change the way engineers tackle ‘day-to-day’ design tasks.”

  • Machine learning facilitates “turbulence tracking” in fusion reactors

    Fusion, which promises practically unlimited, carbon-free energy using the same processes that power the sun, is at the heart of a worldwide research effort that could help mitigate climate change.

    A multidisciplinary team of researchers is now bringing tools and insights from machine learning to aid this effort. Scientists from MIT and elsewhere have used computer-vision models to identify and track turbulent structures that appear under the conditions needed to facilitate fusion reactions.

    Monitoring the formation and movements of these structures, called filaments or “blobs,” is important for understanding the heat and particle flows exiting from the reacting fuel, which ultimately determine the engineering requirements for the reactor walls to meet those flows. However, scientists typically study blobs using averaging techniques, which trade details of individual structures for aggregate statistics. To get information about individual blobs, researchers must mark them manually in video data.

    The researchers built a synthetic video dataset of plasma turbulence to make this process more effective and efficient. They used it to train four computer vision models, each of which identifies and tracks blobs. They trained the models to pinpoint blobs in the same ways that humans would.

    When the researchers tested the trained models using real video clips, the models could identify blobs with high accuracy — more than 80 percent in some cases. The models were also able to effectively estimate the size of blobs and the speeds at which they moved.

    Because millions of video frames are captured during just one fusion experiment, using machine-learning models to track blobs could give scientists much more detailed information.

    “Before, we could get a macroscopic picture of what these structures are doing on average. Now, we have a microscope and the computational power to analyze one event at a time. If we take a step back, what this reveals is the power available from these machine-learning techniques, and ways to use these computational resources to make progress,” says Theodore Golfinopoulos, a research scientist at the MIT Plasma Science and Fusion Center and co-author of a paper detailing these approaches.

    His fellow co-authors include lead author Woonghee “Harry” Han, a physics PhD candidate; senior author Iddo Drori, a visiting professor in the Computer Science and Artificial Intelligence Laboratory (CSAIL), faculty associate professor at Boston University, and adjunct at Columbia University; as well as others from the MIT Plasma Science and Fusion Center, the MIT Department of Civil and Environmental Engineering, and the Swiss Federal Institute of Technology at Lausanne in Switzerland. The research appears today in Nature Scientific Reports.

    Heating things up

    For more than 70 years, scientists have sought to use controlled thermonuclear fusion reactions to develop an energy source. To reach the conditions necessary for a fusion reaction, fuel must be heated to temperatures above 100 million degrees Celsius. (The core of the sun is about 15 million degrees Celsius.)

    A common method for containing this super-hot fuel, called plasma, is to use a tokamak. These devices utilize extremely powerful magnetic fields to hold the plasma in place and control the interaction between the exhaust heat from the plasma and the reactor walls.

    However, blobs appear like filaments falling out of the plasma at the very edge, between the plasma and the reactor walls. These random, turbulent structures affect how energy flows between the plasma and the reactor.

    “Knowing what the blobs are doing strongly constrains the engineering performance that your tokamak power plant needs at the edge,” adds Golfinopoulos.

    Researchers use a unique imaging technique to capture video of the plasma’s turbulent edge during experiments. An experimental campaign may last months; a typical day will produce about 30 seconds of data, corresponding to roughly 60 million video frames, with thousands of blobs appearing each second. This makes it impossible to track all blobs manually, so researchers rely on average sampling techniques that only provide broad characteristics of blob size, speed, and frequency.

    “On the other hand, machine learning provides a solution to this by blob-by-blob tracking for every frame, not just average quantities. This gives us much more knowledge about what is happening at the boundary of the plasma,” Han says.
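
    The figures quoted above imply a very high frame rate, which is worth making explicit. The short calculation below uses only the two numbers given in the article (about 30 seconds of data and roughly 60 million frames per day); the variable names are ours.

```python
# Figures quoted in the article (both approximate).
frames_per_day = 60_000_000
seconds_of_data_per_day = 30

# Implied camera frame rate and spacing between consecutive frames.
frames_per_second = frames_per_day / seconds_of_data_per_day
microseconds_between_frames = 1_000_000 / frames_per_second

print(f"{frames_per_second:,.0f} frames per second")
print(f"{microseconds_between_frames} microseconds between frames")
```

    At roughly 2 million frames per second, even a single second of video is far more than anyone could annotate by hand, which is the gap the automated tracking is meant to fill.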

    He and his co-authors took four well-established computer vision models, which are commonly used for applications like autonomous driving, and trained them to tackle this problem.

    Simulating blobs

    To train these models, they created a vast dataset of synthetic video clips that captured the blobs’ random and unpredictable nature.

    “Sometimes they change direction or speed, sometimes multiple blobs merge, or they split apart. These kinds of events were not considered before with traditional approaches, but we could freely simulate those behaviors in the synthetic data,” Han says.

    Creating synthetic data also allowed them to label each blob, which made the training process more effective, Drori adds.

    Using these synthetic data, they trained the models to draw boundaries around blobs, teaching them to closely mimic what a human scientist would draw.

    Then they tested the models using real video data from experiments. First, they measured how closely the boundaries the models drew matched up with actual blob contours.

    But they also wanted to see if the models predicted objects that humans would identify. They asked three human experts to pinpoint the centers of blobs in video frames and checked to see if the models predicted blobs in those same locations.

    The models were able to draw accurate blob boundaries, overlapping with brightness contours, which are considered ground truth, about 80 percent of the time. Their evaluations were similar to those of human experts, and they successfully predicted the theory-defined regime of the blob, in agreement with the results from a traditional method.
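
    The article does not say exactly how overlap with the ground-truth contours was scored. One common choice in computer vision is intersection over union (IoU) between a predicted mask and a reference mask, sketched below on tiny hand-made masks. The function name and the example masks are illustrative assumptions, not taken from the paper.

```python
from typing import List

Mask = List[List[int]]  # 1 = pixel belongs to a blob, 0 = background


def intersection_over_union(pred: Mask, truth: Mask) -> float:
    """Return |pred AND truth| / |pred OR truth| for two same-sized masks."""
    intersection = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Two empty masks agree perfectly by convention.
    return intersection / union if union else 1.0


# Hypothetical 3x3 frames: the model marks four pixels, the reference three.
predicted = [[0, 1, 1],
             [0, 1, 1],
             [0, 0, 0]]
reference = [[0, 1, 1],
             [0, 1, 0],
             [0, 0, 0]]

print(intersection_over_union(predicted, reference))  # 3 shared / 4 total = 0.75
```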

    Now that they have shown the success of using synthetic data and computer vision models for tracking blobs, the researchers plan to apply these techniques to other problems in fusion research, such as estimating particle transport at the boundary of a plasma, Han says.

    They also made the dataset and models publicly available, and look forward to seeing how other research groups apply these tools to study the dynamics of blobs, says Drori.

    “Prior to this, there was a barrier to entry that mostly the only people working on this problem were plasma physicists, who had the datasets and were using their methods. There is a huge machine-learning and computer-vision community. One goal of this work is to encourage participation in fusion research from the broader machine-learning community toward the broader goal of helping solve the critical problem of climate change,” he adds.

    This research is supported, in part, by the U.S. Department of Energy and the Swiss National Science Foundation.

  • Two first-year students named Rise Global Winners for 2022

    In 2019, former Google CEO Eric Schmidt and his wife, Wendy, launched a $1 billion philanthropic commitment to identify global talent. Part of that effort is the Rise initiative, which selects 100 young scholars, ages 15-17, from around the world who show unusual promise and a drive to serve others. This year’s cohort of 100 Rise Global Winners includes two MIT first-year students, Jacqueline Prawira and Safiya Sankari.

    Rise intentionally targets younger-aged students and focuses on identifying what the program terms “hidden brilliance” in any form, anywhere in the world, whether it be in a high school or a refugee camp. Another defining aspect of the program is that Rise winners receive sustained support — not just in secondary school, but throughout their lives.

    “We believe that the answers to the world’s toughest problems lie in the imagination of the world’s brightest minds,” says Eric Braverman, CEO of Schmidt Futures, which manages Rise along with the Rhodes Trust. “Rise is an integral part of our mission to create the best, largest, and most enduring pipeline of exceptional talent globally and match it to opportunities to serve others for life.”

    The Rise program creates this enduring pipeline by providing a lifetime of benefits, including funding, programming, and mentoring opportunities. These resources can be tailored to each person as they evolve throughout their career. In addition to a four-year college scholarship, winners receive mentoring and career services; networking opportunities with other Rise recipients and partner organizations; technical equipment such as laptops or tablets; courses on topics like leadership and human-centered design; and opportunities to apply for graduate scholarships and for funding throughout their careers to support their innovative ideas, such as grants or seed money to start a social enterprise.

    Prawira and Sankari’s winning service projects focus on global sustainability and global medical access, respectively. Prawira invented a way to use upcycled fish-scale waste to absorb heavy metals in wastewater. She first started experimenting with fish-scale waste in middle school to try to find a bio-based alternative to plastic. More recently, she discovered that the calcium salts and collagen in fish scales can absorb up to 82 percent of heavy metals from water, and 91 percent if an electric current is passed through the water. Her work has global implications for treating contaminated water at wastewater plants and in developing countries.

    Prawira published her research in 2021 and has won awards from the U.S. Environmental Protection Agency and several other organizations. She’s planning to major in Course 3 (materials science and engineering), perhaps with an environmentally related minor. “I believe that sustainability and solving environmental problems requires a multifaceted approach,” she says. “Creating greener materials for use in our daily lives will have a major impact in solving current environmental issues.”

    For Sankari’s service project, she developed an algorithm to analyze data from electronic nano-sensor devices, or e-noses, which can detect certain diseases from a patient’s breath. The devices are calibrated to detect volatile organic compound biosignatures that are indicative of diseases like diabetes and cancer. “E-nose disease detection is much faster and cheaper than traditional methods of diagnosis, making medical care more accessible to many,” she explains. The Python-based algorithm she created can translate raw data from e-noses into a result that the user can read.

    Sankari is a lifetime member of the American Junior Academy of Science and has been a finalist in several prestigious science competitions. She is considering a major in Course 6-7 (computer science and molecular biology) at MIT and hopes to continue to explore the intersection between nanotechnology and medicine.

    While the 2022 Rise recipients share a desire to tackle some of the world’s most intractable problems, their ideas and interests, as reflected by their service projects, are broad, innovative, and diverse. A winner from Belarus used bioinformatics to predict the molecular effect of a potential Alzheimer’s drug. A Romanian student created a magazine that aims to promote acceptance of transgender bodies. A Vietnamese teen created a prototype of a toothbrush that uses a nano chip to detect cancerous cells in saliva. And a recipient from the United States designed modular, tiny homes for the unhoused that are affordable and sustainable, as an alternative to homeless shelters.

    This year’s winners were selected from over 13,000 applicants from 47 countries, from Azerbaijan and Burkina Faso to Lebanon and Paraguay. The selection process includes group interviews, peer and expert review of each applicant’s service project, and formal talent assessments.

  • Simulating neutron behavior in nuclear reactors

    Amelia Trainer applied to MIT because she lost a bet.

    As part of what the fourth-year nuclear science and engineering (NSE) doctoral student labels her “teenage rebellious phase,” Trainer was quite convinced she would just be wasting the application fee were she to submit an application. She wasn’t even “super sure” she wanted to go to college. But a high-school friend was convinced Trainer would get into a “top school” if she only applied. A bet followed: If Trainer lost, she would have to apply to MIT. Trainer lost — and is glad she did.

    Growing up in Daytona Beach, Florida, Trainer made good grades her thing. Seeing friends participate in interschool math competitions, she decided to tag along and soon found she loved them. She remembers being adept at reading the room: If teams were especially struggling over a problem, Trainer figured the answer had to be something easy, like zero or one. “The hardest problems would usually have the most goofball answers,” she laughs.

    Simulating neutron behavior

    As a doctoral student, Trainer still counts hard problems in math, specifically computational reactor physics, as her forte.

    Her research, under the guidance of Professor Benoit Forget in MIT NSE’s Computational Reactor Physics Group (CRPG), focuses on modeling complicated neutron behavior in reactors. Simulation helps forecast the behavior of reactors before millions of dollars sink into development of a potentially uneconomical unit. Using simulations, Trainer can see “where the neutrons are going, how much heat is being produced, and how much power the reactor can generate.” Her research helps form the foundation for the next generation of nuclear power plants.

    To simulate neutron behavior inside of a nuclear reactor, you first need to know how neutrons will interact with the various materials inside the system. These neutrons can have wildly different energies, thereby making them susceptible to different physical phenomena. For the entirety of her graduate studies, Trainer has been primarily interested in the physics regarding slow-moving neutrons and their scattering behavior.

    When a slow neutron scatters off of a material, it can induce or cancel out molecular vibrations between the material’s atoms. The effect that material vibrations can have on neutron energies, and thereby on reactor behavior, has been heavily approximated over the years. Trainer is primarily interested in chipping away at these approximations by creating scattering data for materials that have historically been misrepresented and by exploring new techniques for preparing slow-neutron scattering data.

    Trainer remembers waiting for a simulation to complete in the early days of the Covid-19 pandemic, when she discovered a way to predict neutron behavior with limited input data. Traditionally, “people have to store large tables of what neutrons will do under specific circumstances,” she says. “I’m really happy about it because it’s this really cool method of sampling what your neutron does from very little information,” Trainer says.

    Video: Amelia Trainer on modeling complicated neutron behavior in nuclear reactors

    As part of her research, Trainer often works closely with two software packages: OpenMC and NJOY. OpenMC is a Monte Carlo neutron transport simulation code that was developed in the CRPG and is used to simulate neutron behavior in reactor systems. NJOY is a nuclear data processing tool, and is used to create, augment, and prepare material data that is fed into tools like OpenMC. By editing both these codes to her specifications, Trainer is able to observe the effect that “upstream” material data has on the “downstream” reactor calculations. Through this, she hopes to identify additional problems: approximations that could lead to a noticeable misrepresentation of the physics.
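
    For readers unfamiliar with Monte Carlo transport, the core idea is to follow individual neutrons by drawing random numbers from the relevant physical distributions. The sketch below shows one textbook step only: sampling the distance a neutron travels before its next collision in a uniform material. It does not use OpenMC or NJOY, it is not Trainer's sampling method, and the cross-section value is an arbitrary assumption chosen for illustration.

```python
import math
import random


def sample_flight_distance(sigma_t: float, rng: random.Random) -> float:
    """Sample the distance to the next collision, in cm.

    `sigma_t` is the total macroscopic cross section in 1/cm. Flight
    distances in a uniform medium follow an exponential distribution,
    so inverting its cumulative distribution gives -ln(xi) / sigma_t.
    """
    xi = 1.0 - rng.random()  # uniform in (0, 1], so log(0) cannot occur
    return -math.log(xi) / sigma_t


rng = random.Random(0)   # fixed seed so the run is repeatable
sigma_t = 0.5            # assumed value, 1/cm, for illustration only

samples = [sample_flight_distance(sigma_t, rng) for _ in range(100_000)]
mean_free_path = sum(samples) / len(samples)

# The sample mean should be close to the analytic mean free path 1 / sigma_t.
print(f"sampled mean: {mean_free_path:.3f} cm, analytic: {1 / sigma_t:.3f} cm")
```

    In a production code, the cross sections fed into a step like this come from processed nuclear data, which is why changes to the “upstream” data can show up in the “downstream” reactor results.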

    A love of geometry and poetry

    Trainer discovered the coolness of science as a child. Her mother, who cares for indoor plants and runs multiple greenhouses, and her father, a blacksmith and farrier, who explored materials science through his craft, were self-taught inspirations.

    Trainer’s father urged his daughter to learn and pursue any topics that she found exciting and encouraged her to read poems from “Calvin and Hobbes” out loud when she struggled with a speech impediment in early childhood. Reading the same passages every day helped her memorize them. “The natural manifestation of that extended into [a love of] poetry,” Trainer says.

    A love of poetry, combined with Trainer’s propensity for fun, led her to compose an ode to pi as part of an MIT-sponsored event for alumni. “I was really only in it for the cupcake,” she laughs. (Participants received an indulgent treat.)

    Video: MIT Matters: A Love Poem to Pi

    Computations and nuclear science

    After being accepted at MIT, Trainer knew she wanted to study in a field that would meet her skills where they were: “my math skills were pretty underdeveloped in the grand scheme of things,” she says. An open-house weekend at MIT, where she met with faculty from the NSE department, and the opportunity to contribute to a discipline working toward clean energy cemented Trainer’s decision to join NSE.

    As a high schooler, Trainer won a scholarship to Embry-Riddle Aeronautical University to learn computer coding and knew computational physics might be more aligned with her interests. After she joined MIT as an undergraduate student in 2014, she realized that the CRPG, with its focus on coding and modeling, might be a good fit. Fortunately, a graduate student from Forget’s team welcomed Trainer’s enthusiasm for research even as an undergraduate first-year. She has stayed with the lab ever since. 

    Research internships at Los Alamos National Laboratory, the creators of NJOY, have furthered Trainer’s enthusiasm for modeling and computational physics. She met a Los Alamos scientist after he presented a talk at MIT, and it snowballed into a collaboration where she could work on parts of the NJOY code. “It became a really cool collaboration which led me into a deep dive into physics and data preparation techniques, which was just so fulfilling,” Trainer says. As for what’s next, Trainer was awarded the Rickover fellowship in nuclear engineering by the Department of Energy’s Naval Reactors Division and will join the program in Pittsburgh after she graduates.

    For many years, Trainer’s cats, Jacques and Monster, have been constant companions. “Neutrons, computers, and cats, that’s my personality,” she laughs. Work continues to fuel her passion. To borrow a favorite phrase from Spaceman Spiff, Trainer’s favorite “Calvin” avatar, Trainer’s approach to research has invariably been: “Another day, another mind-boggling adventure.”

  • Computing for the health of the planet

    The health of the planet is one of the most important challenges facing humankind today. From climate change to unsafe levels of air and water pollution to coastal and agricultural land erosion, a number of serious challenges threaten human and ecosystem health.

    Ensuring the health and safety of our planet necessitates approaches that connect scientific, engineering, social, economic, and political aspects. New computational methods can play a critical role by providing data-driven models and solutions for cleaner air, usable water, resilient food, efficient transportation systems, better-preserved biodiversity, and sustainable sources of energy.

    The MIT Schwarzman College of Computing is committed to hiring multiple new faculty in computing for climate and the environment, as part of MIT’s plan to recruit 20 climate-focused faculty under its climate action plan. This year the college undertook searches with several departments in the schools of Engineering and Science for shared faculty in computing for health of the planet, one of the six strategic areas of inquiry identified in an MIT-wide planning process to help focus shared hiring efforts. The college also undertook searches for core computing faculty in the Department of Electrical Engineering and Computer Science (EECS).

    The searches are part of an ongoing effort by the MIT Schwarzman College of Computing to hire 50 new faculty — 25 shared with other academic departments and 25 in computer science and artificial intelligence and decision-making. The goal is to build capacity at MIT to help more deeply infuse computing and other disciplines in departments.

    Four interdisciplinary scholars were hired in these searches. They will join the MIT faculty in the coming year to engage in research and teaching that will advance physical understanding of low-carbon energy solutions, Earth-climate modeling, biodiversity monitoring and conservation, and agricultural management through high-performance computing, transformational numerical methods, and machine-learning techniques.

    “By coordinating hiring efforts with multiple departments and schools, we were able to attract a cohort of exceptional scholars in this area to MIT. Each of them is developing and using advanced computational methods and tools to help find solutions for a range of climate and environmental issues,” says Daniel Huttenlocher, dean of the MIT Schwarzman College of Computing and the Henry Warren Ellis Professor of Electrical Engineering and Computer Science. “They will also help strengthen cross-departmental ties in computing across an important, critical area for MIT and the world.”

    “These strategic hires in the area of computing for climate and the environment are an incredible opportunity for the college to deepen its academic offerings and create new opportunity for collaboration across MIT,” says Anantha P. Chandrakasan, dean of the MIT School of Engineering and the Vannevar Bush Professor of Electrical Engineering and Computer Science. “The college plays a pivotal role in MIT’s overarching effort to hire climate-focused faculty — introducing the critical role of computing to address the health of the planet through innovative research and curriculum.”

    The four new faculty members are:

    Sara Beery will join MIT as an assistant professor in the Faculty of Artificial Intelligence and Decision-Making in EECS in September 2023. Beery received her PhD in computing and mathematical sciences at Caltech in 2022, where she was advised by Pietro Perona. Her research focuses on building computer vision methods that enable global-scale environmental and biodiversity monitoring across data modalities, tackling real-world challenges including strong spatiotemporal correlations, imperfect data quality, fine-grained categories, and long-tailed distributions. She partners with nongovernmental organizations and government agencies to deploy her methods in the wild worldwide and works toward increasing the diversity and accessibility of academic research in artificial intelligence through interdisciplinary capacity building and education.

    Priya Donti will join MIT as an assistant professor in the faculties of Electrical Engineering and Artificial Intelligence and Decision-Making in EECS in academic year 2023-24. Donti recently finished her PhD in the Computer Science Department and the Department of Engineering and Public Policy at Carnegie Mellon University, co-advised by Zico Kolter and Inês Azevedo. Her work focuses on machine learning for forecasting, optimization, and control in high-renewables power grids. Specifically, her research explores methods to incorporate the physics and hard constraints associated with electric power systems into deep learning models. Donti is also co-founder and chair of Climate Change AI, a nonprofit initiative to catalyze impactful work at the intersection of climate change and machine learning that is currently running through the Cornell Tech Runway Startup Postdoc Program.

    Ericmoore Jossou will join MIT as an assistant professor in a shared position between the Department of Nuclear Science and Engineering and the faculty of electrical engineering in EECS in July 2023. He is currently an assistant scientist at the Brookhaven National Laboratory, a U.S. Department of Energy-affiliated lab that conducts research in nuclear and high energy physics, energy science and technology, environmental and bioscience, nanoscience, and national security. His research at MIT will focus on understanding the processing-structure-properties correlation of materials for nuclear energy applications through advanced experiments, multiscale simulations, and data science. Jossou obtained his PhD in mechanical engineering in 2019 from the University of Saskatchewan.

    Sherrie Wang will join MIT as an assistant professor in a shared position between the Department of Mechanical Engineering and the Institute for Data, Systems, and Society in academic year 2023-24. Wang is currently a Ciriacy-Wantrup Postdoctoral Fellow at the University of California at Berkeley, hosted by Solomon Hsiang and the Global Policy Lab. She develops machine learning for Earth observation data. Her primary application areas are improving agricultural management and forecasting climate phenomena. She obtained her PhD in computational and mathematical engineering from Stanford University in 2021, where she was advised by David Lobell.

  • in

    Taking a magnifying glass to data center operations

    When the MIT Lincoln Laboratory Supercomputing Center (LLSC) unveiled its TX-GAIA supercomputer in 2019, it provided the MIT community a powerful new resource for applying artificial intelligence to their research. Anyone at MIT can submit a job to the system, which churns through trillions of operations per second to train models for diverse applications, such as spotting tumors in medical images, discovering new drugs, or modeling climate effects. But with this great power comes the great responsibility of managing and operating it in a sustainable manner — and the team is looking for ways to improve.

    “We have these powerful computational tools that let researchers build intricate models to solve problems, but they can essentially be used as black boxes. What gets lost in there is whether we are actually using the hardware as effectively as we can,” says Siddharth Samsi, a research scientist in the LLSC. 

    To gain insight into this challenge, the LLSC has been collecting detailed data on TX-GAIA usage over the past year. More than a million user jobs later, the team has released the dataset open source to the computing community.

    Their goal is to empower computer scientists and data center operators to better understand avenues for data center optimization — an important task as processing needs continue to grow. They also see potential for leveraging AI in the data center itself, by using the data to develop models for predicting failure points, optimizing job scheduling, and improving energy efficiency. While cloud providers are actively working on optimizing their data centers, they do not often make their data or models available for the broader high-performance computing (HPC) community to leverage. The release of this dataset and associated code seeks to fill this space.

    “Data centers are changing. We have an explosion of hardware platforms, the types of workloads are evolving, and the types of people who are using data centers is changing,” says Vijay Gadepally, a senior researcher at the LLSC. “Until now, there hasn’t been a great way to analyze the impact to data centers. We see this research and dataset as a big step toward coming up with a principled approach to understanding how these variables interact with each other and then applying AI for insights and improvements.”

    Papers describing the dataset and potential applications have been accepted to a number of venues, including the IEEE International Symposium on High-Performance Computer Architecture, the IEEE International Parallel and Distributed Processing Symposium, the Annual Conference of the North American Chapter of the Association for Computational Linguistics, the IEEE High-Performance and Embedded Computing Conference, and the International Conference for High Performance Computing, Networking, Storage and Analysis.

    Workload classification

    Among the world’s TOP500 supercomputers, TX-GAIA combines traditional computing hardware (central processing units, or CPUs) with nearly 900 graphics processing unit (GPU) accelerators. These NVIDIA GPUs are specialized for deep learning, the class of AI that has given rise to speech recognition and computer vision.

    The dataset covers CPU, GPU, and memory usage by job; scheduling logs; and physical monitoring data. Compared to similar datasets, such as those from Google and Microsoft, the LLSC dataset offers “labeled data, a variety of known AI workloads, and more detailed time series data compared with prior datasets. To our knowledge, it’s one of the most comprehensive and fine-grained datasets available,” Gadepally says. 

    Notably, the team collected time-series data at an unprecedented level of detail: 100-millisecond intervals on every GPU and 10-second intervals on every CPU, as the machines processed more than 3,000 known deep-learning jobs. One of the first goals is to use this labeled dataset to characterize the workloads that different types of deep-learning jobs place on the system. This process would extract features that reveal differences in how the hardware processes natural language models versus image classification or materials design models, for example.   

    The team has now launched the MIT Datacenter Challenge to mobilize this research. The challenge invites researchers to use AI techniques to identify with 95 percent accuracy the type of job that was run, using their labeled time-series data as ground truth.
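    As a rough illustration of what classifying a job from its time series can look like, the sketch below reduces each utilization trace to three summary features and assigns a new trace to the nearest class average. The feature choice, the nearest-centroid rule, the job labels, and the hand-made traces are all assumptions made for this example; they are not the LLSC dataset or the challenge's method.

```python
from statistics import mean, pstdev


def extract_features(series):
    """Summarize one utilization trace as (mean, spread, peak)."""
    return (mean(series), pstdev(series), max(series))


def fit_centroids(labeled_traces):
    """Average the feature vectors of each job type."""
    grouped = {}
    for label, series in labeled_traces:
        grouped.setdefault(label, []).append(extract_features(series))
    return {
        label: tuple(mean(column) for column in zip(*vectors))
        for label, vectors in grouped.items()
    }


def classify(series, centroids):
    """Return the job type whose centroid is nearest in feature space."""
    features = extract_features(series)

    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))

    return min(centroids, key=lambda label: squared_distance(centroids[label]))


# Hypothetical, hand-made GPU utilization traces (percent), not LLSC data.
training = [
    ("vision", [90, 92, 88, 91, 93, 89]),
    ("vision", [87, 90, 91, 92, 88, 90]),
    ("language", [40, 95, 35, 97, 42, 96]),
    ("language", [38, 93, 41, 98, 36, 94]),
]
centroids = fit_centroids(training)
prediction = classify([39, 96, 37, 95, 40, 97], centroids)
```

    A real entry would likely need richer features and a stronger model, but the structure stays the same: labeled traces in, a predicted job type out.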

    Such insights could enable data centers to better match a user’s job request with the hardware best suited for it, potentially conserving energy and improving system performance. Classifying workloads could also allow operators to quickly notice discrepancies resulting from hardware failures, inefficient data access patterns, or unauthorized usage.

    Too many choices

    Today, the LLSC offers tools that let users submit their job and select the processors they want to use, “but it’s a lot of guesswork on the part of users,” Samsi says. “Somebody might want to use the latest GPU, but maybe their computation doesn’t actually need it and they could get just as impressive results on CPUs, or lower-powered machines.”

    Professor Devesh Tiwari at Northeastern University is working with the LLSC team to develop techniques that can help users match their workloads to appropriate hardware. Tiwari explains that the emergence of different types of AI accelerators, GPUs, and CPUs has left users suffering from too many choices. Without the right tools to take advantage of this heterogeneity, they are missing out on the benefits: better performance, lower costs, and greater productivity.

    “We are fixing this very capability gap — making users more productive and helping users do science better and faster without worrying about managing heterogeneous hardware,” says Tiwari. “My PhD student, Baolin Li, is building new capabilities and tools to help HPC users leverage heterogeneity near-optimally without user intervention, using techniques grounded in Bayesian optimization and other learning-based optimization methods. But, this is just the beginning. We are looking into ways to introduce heterogeneity in our data centers in a principled approach to help our users achieve the maximum advantage of heterogeneity autonomously and cost-effectively.”

    Workload classification is the first of many problems to be posed through the Datacenter Challenge. Others include developing AI techniques to predict job failures, conserve energy, or create job scheduling approaches that improve data center cooling efficiencies.

    Energy conservation 

    To mobilize research into greener computing, the team is also planning to release an environmental dataset of TX-GAIA operations, containing rack temperature, power consumption, and other relevant data.

    According to the researchers, huge opportunities exist to improve the power efficiency of HPC systems being used for AI processing. As one example, recent work in the LLSC determined that simple hardware tuning, such as limiting the amount of power an individual GPU can draw, could reduce the energy cost of training an AI model by 20 percent, with only modest increases in computing time. “This reduction translates to approximately an entire week’s worth of household energy for a mere three-hour time increase,” Gadepally says.
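    The trade-off follows from energy being average power multiplied by time: a capped GPU runs longer but draws less. The sketch below works through that arithmetic with hypothetical numbers (a 250-watt uncapped draw, a 190-watt capped draw, and an 80-hour job that takes three hours longer when capped). None of these figures come from the LLSC study.

```python
def energy_kwh(average_power_watts, hours):
    """Energy is average power multiplied by time."""
    return average_power_watts * hours / 1000.0


# All numbers below are hypothetical, chosen only to illustrate the trade-off.
uncapped = energy_kwh(average_power_watts=250.0, hours=80.0)
capped = energy_kwh(average_power_watts=190.0, hours=83.0)  # 3 hours longer

# Fraction of the uncapped energy that the power cap saves.
saving_fraction = (uncapped - capped) / uncapped
```

    With these assumed values the capped run uses roughly a fifth less energy despite the longer runtime, which is the shape of the result Gadepally describes.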

    They have also been developing techniques to predict model accuracy, so that users can quickly terminate experiments that are unlikely to yield meaningful results, saving energy. The Datacenter Challenge will share relevant data to enable researchers to explore other opportunities to conserve energy.

    The team expects that lessons learned from this research can be applied to the thousands of data centers operated by the U.S. Department of Defense. The U.S. Air Force is a sponsor of this work, which is being conducted under the USAF-MIT AI Accelerator.

    Other collaborators include researchers at MIT Computer Science and Artificial Intelligence Laboratory (CSAIL). Professor Charles Leiserson’s Supertech Research Group is investigating performance-enhancing techniques for parallel computing, and research scientist Neil Thompson is designing studies on ways to nudge data center users toward climate-friendly behavior.

    Samsi presented this work at the inaugural AI for Datacenter Optimization (ADOPT’22) workshop last spring as part of the IEEE International Parallel and Distributed Processing Symposium. The workshop officially introduced their Datacenter Challenge to the HPC community.

    “We hope this research will allow us and others who run supercomputing centers to be more responsive to user needs while also reducing the energy consumption at the center level,” Samsi says.

  • in

    New hardware offers faster computation for artificial intelligence, with much less energy

    As scientists push the boundaries of machine learning, the amount of time, energy, and money required to train increasingly complex neural network models is skyrocketing. A new area of artificial intelligence called analog deep learning promises faster computation with a fraction of the energy usage.

    Programmable resistors are the key building blocks in analog deep learning, just like transistors are the core elements for digital processors. By repeating arrays of programmable resistors in complex layers, researchers can create a network of analog artificial “neurons” and “synapses” that execute computations just like a digital neural network. This network can then be trained to achieve complex AI tasks like image recognition and natural language processing.

    A multidisciplinary team of MIT researchers set out to push the speed limits of a type of human-made analog synapse that they had previously developed. They used a practical inorganic material in the fabrication process that enables their devices to run 1 million times faster than previous versions, and about 1 million times faster than the synapses in the human brain.

    Moreover, this inorganic material also makes the resistor extremely energy-efficient. Unlike materials used in the earlier version of their device, the new material is compatible with silicon fabrication techniques. This change has enabled fabricating devices at the nanometer scale and could pave the way for integration into commercial computing hardware for deep-learning applications.

    “With that key insight, and the very powerful nanofabrication techniques we have at MIT.nano, we have been able to put these pieces together and demonstrate that these devices are intrinsically very fast and operate with reasonable voltages,” says senior author Jesús A. del Alamo, the Donner Professor in MIT’s Department of Electrical Engineering and Computer Science (EECS). “This work has really put these devices at a point where they now look really promising for future applications.”

    “The working mechanism of the device is electrochemical insertion of the smallest ion, the proton, into an insulating oxide to modulate its electronic conductivity. Because we are working with very thin devices, we could accelerate the motion of this ion by using a strong electric field, and push these ionic devices to the nanosecond operation regime,” explains senior author Bilge Yildiz, the Breene M. Kerr Professor in the departments of Nuclear Science and Engineering and Materials Science and Engineering.

    “The action potential in biological cells rises and falls with a timescale of milliseconds, since the voltage difference of about 0.1 volt is constrained by the stability of water,” says senior author Ju Li, the Battelle Energy Alliance Professor of Nuclear Science and Engineering and professor of materials science and engineering. “Here we apply up to 10 volts across a special solid glass film of nanoscale thickness that conducts protons, without permanently damaging it. And the stronger the field, the faster the ionic devices.”

    These programmable resistors vastly increase the speed at which a neural network is trained, while drastically reducing the cost and energy to perform that training. This could help scientists develop deep learning models much more quickly, which could then be applied in uses like self-driving cars, fraud detection, or medical image analysis.

    “Once you have an analog processor, you will no longer be training networks everyone else is working on. You will be training networks with unprecedented complexities that no one else can afford to, and therefore vastly outperform them all. In other words, this is not a faster car, this is a spacecraft,” adds lead author and MIT postdoc Murat Onen.

    Co-authors include Frances M. Ross, the Ellen Swallow Richards Professor in the Department of Materials Science and Engineering; postdocs Nicolas Emond and Baoming Wang; and Difei Zhang, an EECS graduate student. The research is published today in Science.

    Accelerating deep learning

    Analog deep learning is faster and more energy-efficient than its digital counterpart for two main reasons. First, computation is performed in memory, so enormous loads of data are not transferred back and forth from memory to a processor. Second, analog processors conduct operations in parallel. If the matrix size expands, an analog processor doesn’t need more time to complete new operations because all computation occurs simultaneously.

    The key element of MIT’s new analog processor technology is known as a protonic programmable resistor. These resistors, which are measured in nanometers (one nanometer is one billionth of a meter), are arranged in an array, like a chess board.
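    An array of resistors can act as a matrix multiplier: in an idealized crossbar, each resistor passes a current equal to its conductance times the applied voltage (Ohm's law), and the currents on a shared output line add up (Kirchhoff's current law). The sketch below simulates that textbook picture in software. It is a generic model of a resistive crossbar, not a description of the MIT device's circuitry, and the conductance and voltage values are made up.

```python
def crossbar_output_currents(conductances, input_voltages):
    """Return one summed current per output line of an ideal crossbar.

    conductances[i][j] is the resistor joining input line j to output line i.
    Each output current is the sum of conductance * voltage along its line,
    which is exactly a matrix-vector product.
    """
    return [
        sum(g * v for g, v in zip(row, input_voltages))
        for row in conductances
    ]


# Hypothetical values in arbitrary units, for illustration only.
weights_as_conductances = [
    [1.0, 2.0],
    [0.5, 0.0],
]
voltages = [0.2, 0.1]
currents = crossbar_output_currents(weights_as_conductances, voltages)
```

    In hardware every product and sum happens at once as current flows, whereas this loop computes them one at a time; that difference is the source of the parallelism described above.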

    In the human brain, learning happens due to the strengthening and weakening of connections between neurons, called synapses. Deep neural networks have long adopted this strategy, where the network weights are programmed through training algorithms. In the case of this new processor, increasing and decreasing the electrical conductance of protonic resistors enables analog machine learning.

    The conductance is controlled by the movement of protons. To increase the conductance, more protons are pushed into a channel in the resistor, while to decrease conductance protons are taken out. This is accomplished using an electrolyte (similar to that of a battery) that conducts protons but blocks electrons.
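    A toy model can make the programming step concrete. In the sketch below, one pulse pushes protons in and raises the conductance, while the opposite pulse pulls them out and lowers it. The fixed step size, the bounds, and the class and method names are all hypothetical; the real device physics is far richer than this.

```python
class ProgrammableResistor:
    """Toy model: conductance moves in fixed steps between two bounds."""

    def __init__(self, conductance=0.5, step=0.1, minimum=0.0, maximum=1.0):
        self.conductance = conductance
        self.step = step
        self.minimum = minimum
        self.maximum = maximum

    def pulse(self, direction):
        """Apply one pulse: +1 pushes protons in, -1 pulls them out."""
        if direction not in (1, -1):
            raise ValueError("direction must be +1 or -1")
        updated = self.conductance + direction * self.step
        # Clamp so the stored value never leaves the allowed range.
        self.conductance = min(self.maximum, max(self.minimum, updated))
        return self.conductance


resistor = ProgrammableResistor()
after_up = resistor.pulse(+1)    # conductance rises by one step
after_down = resistor.pulse(-1)  # and falls back by one step
```

    During training, an update rule would decide how many pulses of each sign every resistor in the array receives, playing the role of a weight update in a digital network.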

    To develop a super-fast and highly energy efficient programmable protonic resistor, the researchers looked to different materials for the electrolyte. While other devices used organic compounds, Onen focused on inorganic phosphosilicate glass (PSG).

    PSG is basically silicon dioxide, which is the powdery desiccant material found in tiny bags that come in the box with new furniture to remove moisture. It is studied as a proton conductor under humidified conditions for fuel cells. It is also the most well-known oxide used in silicon processing. To make PSG, a tiny bit of phosphorus is added to the silicon to give it special characteristics for proton conduction.

    Onen hypothesized that an optimized PSG could have a high proton conductivity at room temperature without the need for water, which would make it an ideal solid electrolyte for this application. He was right.

    Surprising speed

    PSG enables ultrafast proton movement because it contains a multitude of nanometer-sized pores whose surfaces provide paths for proton diffusion. It can also withstand very strong, pulsed electric fields. This is critical, Onen explains, because applying more voltage to the device enables protons to move at blinding speeds.

    “The speed certainly was surprising. Normally, we would not apply such extreme fields across devices, in order to not turn them into ash. But instead, protons ended up shuttling at immense speeds across the device stack, specifically a million times faster compared to what we had before. And this movement doesn’t damage anything, thanks to the small size and low mass of protons. It is almost like teleporting,” he says.

    “The nanosecond timescale means we are close to the ballistic or even quantum tunneling regime for the proton, under such an extreme field,” adds Li.

    Because the protons don’t damage the material, the resistor can run for millions of cycles without breaking down. This new electrolyte enabled a programmable protonic resistor that is a million times faster than their previous device and can operate effectively at room temperature, which is important for incorporating it into computing hardware.

    Thanks to the insulating properties of PSG, almost no electric current passes through the material as protons move. This makes the device extremely energy efficient, Onen adds.

    Now that they have demonstrated the effectiveness of these programmable resistors, the researchers plan to reengineer them for high-volume manufacturing, says del Alamo. Then they can study the properties of resistor arrays and scale them up so they can be embedded into systems.

    At the same time, they plan to study the materials to remove bottlenecks that limit the voltage that is required to efficiently transfer the protons to, through, and from the electrolyte.

    “Another exciting direction that these ionic devices can enable is energy-efficient hardware to emulate the neural circuits and synaptic plasticity rules that are deduced in neuroscience, beyond analog deep neural networks. We have already started such a collaboration with neuroscience, supported by the MIT Quest for Intelligence,” adds Yildiz.

    “The collaboration that we have is going to be essential to innovate in the future. The path forward is still going to be very challenging, but at the same time it is very exciting,” del Alamo says.

    “Intercalation reactions such as those found in lithium-ion batteries have been explored extensively for memory devices. This work demonstrates that proton-based memory devices deliver impressive and surprising switching speed and endurance,” says William Chueh, associate professor of materials science and engineering at Stanford University, who was not involved with this research. “It lays the foundation for a new class of memory devices for powering deep learning algorithms.”

    “This work demonstrates a significant breakthrough in biologically inspired resistive-memory devices. These all-solid-state protonic devices are based on exquisite atomic-scale control of protons, similar to biological synapses but at orders of magnitude faster rates,” says Elizabeth Dickey, the Teddy & Wilton Hawkins Distinguished Professor and head of the Department of Materials Science and Engineering at Carnegie Mellon University, who was not involved with this work. “I commend the interdisciplinary MIT team for this exciting development, which will enable future-generation computational devices.”

    This research is funded, in part, by the MIT-IBM Watson AI Lab.

  • in

    Engineers use artificial intelligence to capture the complexity of breaking waves

    Waves break once they swell to a critical height, before cresting and crashing into a spray of droplets and bubbles. These waves can be as large as a surfer’s point break and as small as a gentle ripple rolling to shore. For decades, the dynamics of how and when a wave breaks have been too complex to predict.

    Now, MIT engineers have found a new way to model how waves break. The team used machine learning along with data from wave-tank experiments to tweak equations that have traditionally been used to predict wave behavior. Engineers typically rely on such equations to help them design resilient offshore platforms and structures. But until now, the equations have not been able to capture the complexity of breaking waves.

    The updated model made more accurate predictions of how and when waves break, the researchers found. For instance, the model estimated a wave’s steepness just before breaking, and its energy and frequency after breaking, more accurately than the conventional wave equations.

    Their results, published today in the journal Nature Communications, will help scientists understand how a breaking wave affects the water around it. Knowing precisely how these waves interact can help hone the design of offshore structures. It can also improve predictions for how the ocean interacts with the atmosphere. Having better estimates of how waves break can help scientists predict, for instance, how much carbon dioxide and other atmospheric gases the ocean can absorb.

    “Wave breaking is what puts air into the ocean,” says study author Themis Sapsis, an associate professor of mechanical and ocean engineering and an affiliate of the Institute for Data, Systems, and Society at MIT. “It may sound like a detail, but if you multiply its effect over the area of the entire ocean, wave breaking starts becoming fundamentally important to climate prediction.”

    The study’s co-authors include lead author and MIT postdoc Debbie Eeltink, Hubert Branger and Christopher Luneau of Aix-Marseille University, Amin Chabchoub of Kyoto University, Jerome Kasparian of the University of Geneva, and T.S. van den Bremer of Delft University of Technology.

    Learning tank

    To predict the dynamics of a breaking wave, scientists typically take one of two approaches: They either attempt to precisely simulate the wave at the scale of individual molecules of water and air, or they run experiments to try to characterize waves with actual measurements. The first approach is computationally expensive and difficult to simulate even over a small area; the second requires a huge amount of time to run enough experiments to yield statistically significant results.

    The MIT team instead borrowed pieces from both approaches to develop a more efficient and accurate model using machine learning. The researchers started with a set of equations that is considered the standard description of wave behavior. They aimed to improve the model by “training” it on data of breaking waves from actual experiments.

    “We had a simple model that doesn’t capture wave breaking, and then we had the truth, meaning experiments that involve wave breaking,” Eeltink explains. “Then we wanted to use machine learning to learn the difference between the two.”

    The researchers obtained wave breaking data by running experiments in a 40-meter-long tank. The tank was fitted at one end with a paddle which the team used to initiate each wave. The team set the paddle to produce a breaking wave in the middle of the tank. Gauges along the length of the tank measured the water’s height as waves propagated down the tank.

    “It takes a lot of time to run these experiments,” Eeltink says. “Between each experiment you have to wait for the water to completely calm down before you launch the next experiment, otherwise they influence each other.”

    Safe harbor

    In all, the team ran about 250 experiments, the data from which they used to train a type of machine-learning algorithm known as a neural network. Specifically, the algorithm is trained to compare the real waves in experiments with the predicted waves in the simple model, and based on any differences between the two, the algorithm tunes the model to fit reality.
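    The idea of learning the gap between a simple model and measurements can be sketched in a few lines. The example below swaps the team's neural network for a plain least-squares line fitted to the residuals, purely to keep the sketch short. The baseline rule, the input variable, and the "measurements" are invented for illustration and have no connection to the actual wave equations or tank data.

```python
def simple_model(steepness):
    """Stand-in for the baseline wave equations (hypothetical linear rule)."""
    return 2.0 * steepness


def fit_linear_correction(inputs, residuals):
    """Closed-form least-squares fit of residual ~ slope * input + intercept."""
    n = len(inputs)
    mean_x = sum(inputs) / n
    mean_r = sum(residuals) / n
    covariance = sum((x - mean_x) * (r - mean_r) for x, r in zip(inputs, residuals))
    variance = sum((x - mean_x) ** 2 for x in inputs)
    slope = covariance / variance
    intercept = mean_r - slope * mean_x
    return slope, intercept


# Made-up "experimental" measurements, for illustration only.
inputs = [0.1, 0.2, 0.3, 0.4]
measured = [0.25, 0.46, 0.65, 0.86]

# Step 1: see where the simple model disagrees with the measurements.
residuals = [m - simple_model(x) for x, m in zip(inputs, measured)]

# Step 2: learn a correction from those disagreements.
slope, intercept = fit_linear_correction(inputs, residuals)


def corrected_model(x):
    """Baseline prediction plus the learned correction."""
    return simple_model(x) + slope * x + intercept


def mean_abs_error(model):
    """Average absolute gap between a model and the measurements."""
    return sum(abs(model(x) - m) for x, m in zip(inputs, measured)) / len(inputs)
```

    Comparing `mean_abs_error(simple_model)` with `mean_abs_error(corrected_model)` shows the corrected model tracking the data more closely. A fair evaluation would use held-out measurements rather than the training points.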

    After training the algorithm on their experimental data, the team introduced the model to entirely new data: measurements from two independent experiments, each run in a separate wave tank with different dimensions. In these tests, they found the updated model made more accurate predictions than the simple, untrained model, for instance making better estimates of a breaking wave’s steepness.

    The new model also captured an essential property of breaking waves known as the “downshift,” in which the frequency of a wave is shifted to a lower value. The speed of a wave depends on its frequency. For ocean waves, lower frequencies move faster than higher frequencies. Therefore, after the downshift, the wave will move faster. The new model predicts the change in frequency, before and after each breaking wave, which could be especially relevant in preparing for coastal storms.
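    The link between frequency and speed can be made concrete with the standard linear dispersion relation for deep-water waves, in which the phase speed is c = g / (2πf). The article does not say which relation the team's model uses, so the sketch below is a textbook approximation, and the two frequencies are arbitrary examples.

```python
import math

GRAVITY = 9.81  # gravitational acceleration in m/s^2


def deep_water_phase_speed(frequency_hz):
    """Phase speed of a linear deep-water wave: c = g / (2 * pi * f)."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return GRAVITY / (2.0 * math.pi * frequency_hz)


# Arbitrary example: a wave whose frequency shifts down from 0.10 Hz to 0.08 Hz.
speed_before = deep_water_phase_speed(0.10)
speed_after = deep_water_phase_speed(0.08)
```

    Because speed is inversely proportional to frequency in this approximation, the downshifted wave travels faster, so an error in the predicted frequency feeds directly into an error in the predicted arrival time.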

    “When you want to forecast when high waves of a swell would reach a harbor, and you want to leave the harbor before those waves arrive, then if you get the wave frequency wrong, then the speed at which the waves are approaching is wrong,” Eeltink says.

    The team’s updated wave model is available as open-source code that others could potentially use, for instance in climate simulations of the ocean’s potential to absorb carbon dioxide and other atmospheric gases. The code can also be worked into simulated tests of offshore platforms and coastal structures.

    “The number one purpose of this model is to predict what a wave will do,” Sapsis says. “If you don’t model wave breaking right, it would have tremendous implications for how structures behave. With this, you could simulate waves to help design structures better, more efficiently, and without huge safety factors.”

    This research is supported, in part, by the Swiss National Science Foundation, and by the U.S. Office of Naval Research.