More stories

  • in

    Solving a longstanding conundrum in heat transfer

    It is a problem that has beguiled scientists for a century. But, buoyed by a $625,000 Distinguished Early Career Award from the U.S. Department of Energy (DoE), Matteo Bucci, an associate professor in the Department of Nuclear Science and Engineering (NSE), hopes to be close to an answer.

    Tackling the boiling crisis

    Whether you’re heating a pot of water for pasta or are designing nuclear reactors, one phenomenon — boiling — is vital for efficient execution of both processes.

    “Boiling is a very effective heat transfer mechanism; it’s the way to remove large amounts of heat from the surface, which is why it is used in many high-power density applications,” Bucci says. An example use case: nuclear reactors.

    To the layperson, boiling appears simple — bubbles form and burst, removing heat. But what if so many bubbles form and coalesce that they form a band of vapor that prevents further heat transfer? Such a problem is a known entity and is labeled the boiling crisis. It would lead to runaway heat, and a failure of fuel rods in nuclear reactors. So “understanding and determining under which conditions the boiling crisis is likely to happen is critical to designing more efficient and cost-competitive nuclear reactors,” Bucci says.

    Early work on the boiling crisis dates back nearly a century ago, to 1926. And while much work has been done, “it is clear that we haven’t found an answer,” Bucci says. The boiling crisis remains a challenge because while models abound, the measurement of related phenomena to prove or disprove these models has been difficult. “[Boiling] is a process that happens on a very, very small length scale and over very, very short times,” Bucci says. “We are not able to observe it at the level of detail necessary to understand what really happens and validate hypotheses.”

    But, over the past few years, Bucci and his team have been developing diagnostics that can measure the phenomena related to boiling and thereby provide much-needed answers to a classic problem. Diagnostics are anchored in infrared thermometry and a technique using visible light. “By combining these two techniques I think we’re going to be ready to answer standing questions related to heat transfer, we can make our way out of the rabbit hole,” Bucci says. The grant award from the U.S. DoE for Nuclear Energy Projects will aid in this and Bucci’s other research efforts.

    An idyllic Italian childhood

    Tackling difficult problems is not new territory for Bucci, who grew up in the small town of Città di Castello near Florence, Italy. Bucci’s mother was an elementary school teacher. His father used to have a machine shop, which helped develop Bucci’s scientific bent. “I liked LEGOs a lot when I was a kid. It was a passion,” he adds.

    Despite Italy going through a severe pullback from nuclear engineering during his formative years, the subject fascinated Bucci. Job opportunities in the field were uncertain but Bucci decided to dig in. “If I have to do something for the rest of my life, it might as well be something I like,” he jokes. Bucci attended the University of Pisa for undergraduate and graduate studies in nuclear engineering.

    His interest in heat transfer mechanisms took root during his doctoral studies, a research subject he pursued in Paris at the French Alternative Energies and Atomic Energy Commission (CEA). It was there that a colleague suggested work on the boiling water crisis. This time Bucci set his sights on NSE at MIT and reached out to Professor Jacopo Buongiorno to inquire about research at the institution. Bucci had to fundraise at CEA to conduct research at MIT. He arrived just a couple of days before the Boston Marathon bombing in 2013 with a round-trip ticket. But Bucci has stayed ever since, moving on to become a research scientist and then associate professor at NSE.

    Bucci admits he struggled to adapt to the environment when he first arrived at MIT, but work and friendships with colleagues — he counts NSE’s Guanyu Su and Reza Azizian as among his best friends — helped conquer early worries.

    The integration of artificial intelligence

    In addition to diagnostics for boiling, Bucci and his team are working on ways of integrating artificial intelligence and experimental research. He is convinced that “the integration of advanced diagnostics, machine learning, and advanced modeling tools will blossom in a decade.”

    Bucci’s team is developing an autonomous laboratory for boiling heat transfer experiments. Running on machine learning, the setup decides which experiments to run based on a learning objective the team assigns. “We formulate a question and the machine will answer by optimizing the kinds of experiments that are necessary to answer those questions,” Bucci says, “I honestly think this is the next frontier for boiling,” he adds.

    “It’s when you climb a tree and you reach the top, that you realize that the horizon is much more vast and also more beautiful,” Bucci says of his zeal to pursue more research in the field.

    Even as he seeks new heights, Bucci has not forgotten his origins. Commemorating Italy’s hosting of the World Cup in 1990, a series of posters showcasing a soccer field fitted into the Roman Colosseum occupies pride of place in his home and office. Created by Alberto Burri, the posters are of sentimental value: The (now deceased) Italian artist also hailed from Bucci’s hometown — Città di Castello. More

  • in

    New hardware offers faster computation for artificial intelligence, with much less energy

    As scientists push the boundaries of machine learning, the amount of time, energy, and money required to train increasingly complex neural network models is skyrocketing. A new area of artificial intelligence called analog deep learning promises faster computation with a fraction of the energy usage.

    Programmable resistors are the key building blocks in analog deep learning, just like transistors are the core elements for digital processors. By repeating arrays of programmable resistors in complex layers, researchers can create a network of analog artificial “neurons” and “synapses” that execute computations just like a digital neural network. This network can then be trained to achieve complex AI tasks like image recognition and natural language processing.

    A multidisciplinary team of MIT researchers set out to push the speed limits of a type of human-made analog synapse that they had previously developed. They utilized a practical inorganic material in the fabrication process that enables their devices to run 1 million times faster than previous versions, which is also about 1 million times faster than the synapses in the human brain.

    Moreover, this inorganic material also makes the resistor extremely energy-efficient. Unlike materials used in the earlier version of their device, the new material is compatible with silicon fabrication techniques. This change has enabled fabricating devices at the nanometer scale and could pave the way for integration into commercial computing hardware for deep-learning applications.

    “With that key insight, and the very powerful nanofabrication techniques we have at MIT.nano, we have been able to put these pieces together and demonstrate that these devices are intrinsically very fast and operate with reasonable voltages,” says senior author Jesús A. del Alamo, the Donner Professor in MIT’s Department of Electrical Engineering and Computer Science (EECS). “This work has really put these devices at a point where they now look really promising for future applications.”

    “The working mechanism of the device is electrochemical insertion of the smallest ion, the proton, into an insulating oxide to modulate its electronic conductivity. Because we are working with very thin devices, we could accelerate the motion of this ion by using a strong electric field, and push these ionic devices to the nanosecond operation regime,” explains senior author Bilge Yildiz, the Breene M. Kerr Professor in the departments of Nuclear Science and Engineering and Materials Science and Engineering.

    “The action potential in biological cells rises and falls with a timescale of milliseconds, since the voltage difference of about 0.1 volt is constrained by the stability of water,” says senior author Ju Li, the Battelle Energy Alliance Professor of Nuclear Science and Engineering and professor of materials science and engineering, “Here we apply up to 10 volts across a special solid glass film of nanoscale thickness that conducts protons, without permanently damaging it. And the stronger the field, the faster the ionic devices.”

    These programmable resistors vastly increase the speed at which a neural network is trained, while drastically reducing the cost and energy to perform that training. This could help scientists develop deep learning models much more quickly, which could then be applied in uses like self-driving cars, fraud detection, or medical image analysis.

    “Once you have an analog processor, you will no longer be training networks everyone else is working on. You will be training networks with unprecedented complexities that no one else can afford to, and therefore vastly outperform them all. In other words, this is not a faster car, this is a spacecraft,” adds lead author and MIT postdoc Murat Onen.

    Co-authors include Frances M. Ross, the Ellen Swallow Richards Professor in the Department of Materials Science and Engineering; postdocs Nicolas Emond and Baoming Wang; and Difei Zhang, an EECS graduate student. The research is published today in Science.

    Accelerating deep learning

    Analog deep learning is faster and more energy-efficient than its digital counterpart for two main reasons. “First, computation is performed in memory, so enormous loads of data are not transferred back and forth from memory to a processor.” Analog processors also conduct operations in parallel. If the matrix size expands, an analog processor doesn’t need more time to complete new operations because all computation occurs simultaneously.

    The key element of MIT’s new analog processor technology is known as a protonic programmable resistor. These resistors, which are measured in nanometers (one nanometer is one billionth of a meter), are arranged in an array, like a chess board.

    In the human brain, learning happens due to the strengthening and weakening of connections between neurons, called synapses. Deep neural networks have long adopted this strategy, where the network weights are programmed through training algorithms. In the case of this new processor, increasing and decreasing the electrical conductance of protonic resistors enables analog machine learning.

    The conductance is controlled by the movement of protons. To increase the conductance, more protons are pushed into a channel in the resistor, while to decrease conductance protons are taken out. This is accomplished using an electrolyte (similar to that of a battery) that conducts protons but blocks electrons.

    To develop a super-fast and highly energy efficient programmable protonic resistor, the researchers looked to different materials for the electrolyte. While other devices used organic compounds, Onen focused on inorganic phosphosilicate glass (PSG).

    PSG is basically silicon dioxide, which is the powdery desiccant material found in tiny bags that come in the box with new furniture to remove moisture. It is studied as a proton conductor under humidified conditions for fuel cells. It is also the most well-known oxide used in silicon processing. To make PSG, a tiny bit of phosphorus is added to the silicon to give it special characteristics for proton conduction.

    Onen hypothesized that an optimized PSG could have a high proton conductivity at room temperature without the need for water, which would make it an ideal solid electrolyte for this application. He was right.

    Surprising speed

    PSG enables ultrafast proton movement because it contains a multitude of nanometer-sized pores whose surfaces provide paths for proton diffusion. It can also withstand very strong, pulsed electric fields. This is critical, Onen explains, because applying more voltage to the device enables protons to move at blinding speeds.

    “The speed certainly was surprising. Normally, we would not apply such extreme fields across devices, in order to not turn them into ash. But instead, protons ended up shuttling at immense speeds across the device stack, specifically a million times faster compared to what we had before. And this movement doesn’t damage anything, thanks to the small size and low mass of protons. It is almost like teleporting,” he says.

    “The nanosecond timescale means we are close to the ballistic or even quantum tunneling regime for the proton, under such an extreme field,” adds Li.

    Because the protons don’t damage the material, the resistor can run for millions of cycles without breaking down. This new electrolyte enabled a programmable protonic resistor that is a million times faster than their previous device and can operate effectively at room temperature, which is important for incorporating it into computing hardware.

    Thanks to the insulating properties of PSG, almost no electric current passes through the material as protons move. This makes the device extremely energy efficient, Onen adds.

    Now that they have demonstrated the effectiveness of these programmable resistors, the researchers plan to reengineer them for high-volume manufacturing, says del Alamo. Then they can study the properties of resistor arrays and scale them up so they can be embedded into systems.

    At the same time, they plan to study the materials to remove bottlenecks that limit the voltage that is required to efficiently transfer the protons to, through, and from the electrolyte.

    “Another exciting direction that these ionic devices can enable is energy-efficient hardware to emulate the neural circuits and synaptic plasticity rules that are deduced in neuroscience, beyond analog deep neural networks. We have already started such a collaboration with neuroscience, supported by the MIT Quest for Intelligence,” adds Yildiz.

    “The collaboration that we have is going to be essential to innovate in the future. The path forward is still going to be very challenging, but at the same time it is very exciting,” del Alamo says.

    “Intercalation reactions such as those found in lithium-ion batteries have been explored extensively for memory devices. This work demonstrates that proton-based memory devices deliver impressive and surprising switching speed and endurance,” says William Chueh, associate professor of materials science and engineering at Stanford University, who was not involved with this research. “It lays the foundation for a new class of memory devices for powering deep learning algorithms.”

    “This work demonstrates a significant breakthrough in biologically inspired resistive-memory devices. These all-solid-state protonic devices are based on exquisite atomic-scale control of protons, similar to biological synapses but at orders of magnitude faster rates,” says Elizabeth Dickey, the Teddy & Wilton Hawkins Distinguished Professor and head of the Department of Materials Science and Engineering at Carnegie Mellon University, who was not involved with this work. “I commend the interdisciplinary MIT team for this exciting development, which will enable future-generation computational devices.”

    This research is funded, in part, by the MIT-IBM Watson AI Lab. More

  • in

    Engineers use artificial intelligence to capture the complexity of breaking waves

    Waves break once they swell to a critical height, before cresting and crashing into a spray of droplets and bubbles. These waves can be as large as a surfer’s point break and as small as a gentle ripple rolling to shore. For decades, the dynamics of how and when a wave breaks have been too complex to predict.

    Now, MIT engineers have found a new way to model how waves break. The team used machine learning along with data from wave-tank experiments to tweak equations that have traditionally been used to predict wave behavior. Engineers typically rely on such equations to help them design resilient offshore platforms and structures. But until now, the equations have not been able to capture the complexity of breaking waves.

    The updated model made more accurate predictions of how and when waves break, the researchers found. For instance, the model estimated a wave’s steepness just before breaking, and its energy and frequency after breaking, more accurately than the conventional wave equations.

    Their results, published today in the journal Nature Communications, will help scientists understand how a breaking wave affects the water around it. Knowing precisely how these waves interact can help hone the design of offshore structures. It can also improve predictions for how the ocean interacts with the atmosphere. Having better estimates of how waves break can help scientists predict, for instance, how much carbon dioxide and other atmospheric gases the ocean can absorb.

    “Wave breaking is what puts air into the ocean,” says study author Themis Sapsis, an associate professor of mechanical and ocean engineering and an affiliate of the Institute for Data, Systems, and Society at MIT. “It may sound like a detail, but if you multiply its effect over the area of the entire ocean, wave breaking starts becoming fundamentally important to climate prediction.”

    The study’s co-authors include lead author and MIT postdoc Debbie Eeltink, Hubert Branger and Christopher Luneau of Aix-Marseille University, Amin Chabchoub of Kyoto University, Jerome Kasparian of the University of Geneva, and T.S. van den Bremer of Delft University of Technology.

    Learning tank

    To predict the dynamics of a breaking wave, scientists typically take one of two approaches: They either attempt to precisely simulate the wave at the scale of individual molecules of water and air, or they run experiments to try and characterize waves with actual measurements. The first approach is computationally expensive and difficult to simulate even over a small area; the second requires a huge amount of time to run enough experiments to yield statistically significant results.

    The MIT team instead borrowed pieces from both approaches to develop a more efficient and accurate model using machine learning. The researchers started with a set of equations that is considered the standard description of wave behavior. They aimed to improve the model by “training” the model on data of breaking waves from actual experiments.

    “We had a simple model that doesn’t capture wave breaking, and then we had the truth, meaning experiments that involve wave breaking,” Eeltink explains. “Then we wanted to use machine learning to learn the difference between the two.”

    The researchers obtained wave breaking data by running experiments in a 40-meter-long tank. The tank was fitted at one end with a paddle which the team used to initiate each wave. The team set the paddle to produce a breaking wave in the middle of the tank. Gauges along the length of the tank measured the water’s height as waves propagated down the tank.

    “It takes a lot of time to run these experiments,” Eeltink says. “Between each experiment you have to wait for the water to completely calm down before you launch the next experiment, otherwise they influence each other.”

    Safe harbor

    In all, the team ran about 250 experiments, the data from which they used to train a type of machine-learning algorithm known as a neural network. Specifically, the algorithm is trained to compare the real waves in experiments with the predicted waves in the simple model, and based on any differences between the two, the algorithm tunes the model to fit reality.

    After training the algorithm on their experimental data, the team introduced the model to entirely new data — in this case, measurements from two independent experiments, each run at separate wave tanks with different dimensions. In these tests, they found the updated model made more accurate predictions than the simple, untrained model, for instance making better estimates of a breaking wave’s steepness.

    The new model also captured an essential property of breaking waves known as the “downshift,” in which the frequency of a wave is shifted to a lower value. The speed of a wave depends on its frequency. For ocean waves, lower frequencies move faster than higher frequencies. Therefore, after the downshift, the wave will move faster. The new model predicts the change in frequency, before and after each breaking wave, which could be especially relevant in preparing for coastal storms.

    “When you want to forecast when high waves of a swell would reach a harbor, and you want to leave the harbor before those waves arrive, then if you get the wave frequency wrong, then the speed at which the waves are approaching is wrong,” Eeltink says.

    The team’s updated wave model is in the form of an open-source code that others could potentially use, for instance in climate simulations of the ocean’s potential to absorb carbon dioxide and other atmospheric gases. The code can also be worked into simulated tests of offshore platforms and coastal structures.

    “The number one purpose of this model is to predict what a wave will do,” Sapsis says. “If you don’t model wave breaking right, it would have tremendous implications for how structures behave. With this, you could simulate waves to help design structures better, more efficiently, and without huge safety factors.”

    This research is supported, in part, by the Swiss National Science Foundation, and by the U.S. Office of Naval Research. More

  • in

    How can we reduce the carbon footprint of global computing?

    The voracious appetite for energy from the world’s computers and communications technology presents a clear threat for the globe’s warming climate. That was the blunt assessment from presenters in the intensive two-day Climate Implications of Computing and Communications workshop held on March 3 and 4, hosted by MIT’s Climate and Sustainability Consortium (MCSC), MIT-IBM Watson AI Lab, and the Schwarzman College of Computing.

    The virtual event featured rich discussions and highlighted opportunities for collaboration among an interdisciplinary group of MIT faculty and researchers and industry leaders across multiple sectors — underscoring the power of academia and industry coming together.

    “If we continue with the existing trajectory of compute energy, by 2040, we are supposed to hit the world’s energy production capacity. The increase in compute energy and demand has been increasing at a much faster rate than the world energy production capacity increase,” said Bilge Yildiz, the Breene M. Kerr Professor in the MIT departments of Nuclear Science and Engineering and Materials Science and Engineering, one of the workshop’s 18 presenters. This computing energy projection draws from the Semiconductor Research Corporations’s decadal report.To cite just one example: Information and communications technology already account for more than 2 percent of global energy demand, which is on a par with the aviation industries emissions from fuel.“We are the very beginning of this data-driven world. We really need to start thinking about this and act now,” said presenter Evgeni Gousev, senior director at Qualcomm.  Innovative energy-efficiency optionsTo that end, the workshop presentations explored a host of energy-efficiency options, including specialized chip design, data center architecture, better algorithms, hardware modifications, and changes in consumer behavior. Industry leaders from AMD, Ericsson, Google, IBM, iRobot, NVIDIA, Qualcomm, Tertill, Texas Instruments, and Verizon outlined their companies’ energy-saving programs, while experts from across MIT provided insight into current research that could yield more efficient computing.Panel topics ranged from “Custom hardware for efficient computing” to “Hardware for new architectures” to “Algorithms for efficient computing,” among others.

    Visual representation of the conversation during the workshop session entitled “Energy Efficient Systems.”

    Image: Haley McDevitt

    Previous item
    Next item

    The goal, said Yildiz, is to improve energy efficiency associated with computing by more than a million-fold.“I think part of the answer of how we make computing much more sustainable has to do with specialized architectures that have very high level of utilization,” said Darío Gil, IBM senior vice president and director of research, who stressed that solutions should be as “elegant” as possible.  For example, Gil illustrated an innovative chip design that uses vertical stacking to reduce the distance data has to travel, and thus reduces energy consumption. Surprisingly, more effective use of tape — a traditional medium for primary data storage — combined with specialized hard drives (HDD), can yield a dramatic savings in carbon dioxide emissions.Gil and presenters Bill Dally, chief scientist and senior vice president of research of NVIDIA; Ahmad Bahai, CTO of Texas Instruments; and others zeroed in on storage. Gil compared data to a floating iceberg in which we can have fast access to the “hot data” of the smaller visible part while the “cold data,” the large underwater mass, represents data that tolerates higher latency. Think about digital photo storage, Gil said. “Honestly, are you really retrieving all of those photographs on a continuous basis?” Storage systems should provide an optimized mix of of HDD for hot data and tape for cold data based on data access patterns.Bahai stressed the significant energy saving gained from segmenting standby and full processing. “We need to learn how to do nothing better,” he said. Dally spoke of mimicking the way our brain wakes up from a deep sleep, “We can wake [computers] up much faster, so we don’t need to keep them running in full speed.”Several workshop presenters spoke of a focus on “sparsity,” a matrix in which most of the elements are zero, as a way to improve efficiency in neural networks. Or as Dally said, “Never put off till tomorrow, where you could put off forever,” explaining efficiency is not “getting the most information with the fewest bits. It’s doing the most with the least energy.”Holistic and multidisciplinary approaches“We need both efficient algorithms and efficient hardware, and sometimes we need to co-design both the algorithm and the hardware for efficient computing,” said Song Han, a panel moderator and assistant professor in the Department of Electrical Engineering and Computer Science (EECS) at MIT.Some presenters were optimistic about innovations already underway. According to Ericsson’s research, as much as 15 percent of the carbon emissions globally can be reduced through the use of existing solutions, noted Mats Pellbäck Scharp, head of sustainability at Ericsson. For example, GPUs are more efficient than CPUs for AI, and the progression from 3G to 5G networks boosts energy savings.“5G is the most energy efficient standard ever,” said Scharp. “We can build 5G without increasing energy consumption.”Companies such as Google are optimizing energy use at their data centers through improved design, technology, and renewable energy. “Five of our data centers around the globe are operating near or above 90 percent carbon-free energy,” said Jeff Dean, Google’s senior fellow and senior vice president of Google Research.Yet, pointing to the possible slowdown in the doubling of transistors in an integrated circuit — or Moore’s Law — “We need new approaches to meet this compute demand,” said Sam Naffziger, AMD senior vice president, corporate fellow, and product technology architect. Naffziger spoke of addressing performance “overkill.” For example, “we’re finding in the gaming and machine learning space we can make use of lower-precision math to deliver an image that looks just as good with 16-bit computations as with 32-bit computations, and instead of legacy 32b math to train AI networks, we can use lower-energy 8b or 16b computations.”

    Visual representation of the conversation during the workshop session entitled “Wireless, networked, and distributed systems.”

    Image: Haley McDevitt

    Previous item
    Next item

    Other presenters singled out compute at the edge as a prime energy hog.“We also have to change the devices that are put in our customers’ hands,” said Heidi Hemmer, senior vice president of engineering at Verizon. As we think about how we use energy, it is common to jump to data centers — but it really starts at the device itself, and the energy that the devices use. Then, we can think about home web routers, distributed networks, the data centers, and the hubs. “The devices are actually the least energy-efficient out of that,” concluded Hemmer.Some presenters had different perspectives. Several called for developing dedicated silicon chipsets for efficiency. However, panel moderator Muriel Medard, the Cecil H. Green Professor in EECS, described research at MIT, Boston University, and Maynooth University on the GRAND (Guessing Random Additive Noise Decoding) chip, saying, “rather than having obsolescence of chips as the new codes come in and in different standards, you can use one chip for all codes.”Whatever the chip or new algorithm, Helen Greiner, CEO of Tertill (a weeding robot) and co-founder of iRobot, emphasized that to get products to market, “We have to learn to go away from wanting to get the absolute latest and greatest, the most advanced processor that usually is more expensive.” She added, “I like to say robot demos are a dime a dozen, but robot products are very infrequent.”Greiner emphasized consumers can play a role in pushing for more energy-efficient products — just as drivers began to demand electric cars.Dean also sees an environmental role for the end user.“We have enabled our cloud customers to select which cloud region they want to run their computation in, and they can decide how important it is that they have a low carbon footprint,” he said, also citing other interfaces that might allow consumers to decide which air flights are more efficient or what impact installing a solar panel on their home would have.However, Scharp said, “Prolonging the life of your smartphone or tablet is really the best climate action you can do if you want to reduce your digital carbon footprint.”Facing increasing demandsDespite their optimism, the presenters acknowledged the world faces increasing compute demand from machine learning, AI, gaming, and especially, blockchain. Panel moderator Vivienne Sze, associate professor in EECS, noted the conundrum.“We can do a great job in making computing and communication really efficient. But there is this tendency that once things are very efficient, people use more of it, and this might result in an overall increase in the usage of these technologies, which will then increase our overall carbon footprint,” Sze said.Presenters saw great potential in academic/industry partnerships, particularly from research efforts on the academic side. “By combining these two forces together, you can really amplify the impact,” concluded Gousev.Presenters at the Climate Implications of Computing and Communications workshop also included: Joel Emer, professor of the practice in EECS at MIT; David Perreault, the Joseph F. and Nancy P. Keithley Professor of EECS at MIT; Jesús del Alamo, MIT Donner Professor and professor of electrical engineering in EECS at MIT; Heike Riel, IBM Fellow and head science and technology at IBM; and Takashi Ando, principal research staff member at IBM Research. The recorded workshop sessions are available on YouTube. More

  • in

    Machine learning, harnessed to extreme computing, aids fusion energy development

    MIT research scientists Pablo Rodriguez-Fernandez and Nathan Howard have just completed one of the most demanding calculations in fusion science — predicting the temperature and density profiles of a magnetically confined plasma via first-principles simulation of plasma turbulence. Solving this problem by brute force is beyond the capabilities of even the most advanced supercomputers. Instead, the researchers used an optimization methodology developed for machine learning to dramatically reduce the CPU time required while maintaining the accuracy of the solution.

    Fusion energyFusion offers the promise of unlimited, carbon-free energy through the same physical process that powers the sun and the stars. It requires heating the fuel to temperatures above 100 million degrees, well above the point where the electrons are stripped from their atoms, creating a form of matter called plasma. On Earth, researchers use strong magnetic fields to isolate and insulate the hot plasma from ordinary matter. The stronger the magnetic field, the better the quality of the insulation that it provides.

    Rodriguez-Fernandez and Howard have focused on predicting the performance expected in the SPARC device, a compact, high-magnetic-field fusion experiment, currently under construction by the MIT spin-out company Commonwealth Fusion Systems (CFS) and researchers from MIT’s Plasma Science and Fusion Center. While the calculation required an extraordinary amount of computer time, over 8 million CPU-hours, what was remarkable was not how much time was used, but how little, given the daunting computational challenge.

    The computational challenge of fusion energyTurbulence, which is the mechanism for most of the heat loss in a confined plasma, is one of the science’s grand challenges and the greatest problem remaining in classical physics. The equations that govern fusion plasmas are well known, but analytic solutions are not possible in the regimes of interest, where nonlinearities are important and solutions encompass an enormous range of spatial and temporal scales. Scientists resort to solving the equations by numerical simulation on computers. It is no accident that fusion researchers have been pioneers in computational physics for the last 50 years.

    One of the fundamental problems for researchers is reliably predicting plasma temperature and density given only the magnetic field configuration and the externally applied input power. In confinement devices like SPARC, the external power and the heat input from the fusion process are lost through turbulence in the plasma. The turbulence itself is driven by the difference in the extremely high temperature of the plasma core and the relatively cool temperatures of the plasma edge (merely a few million degrees). Predicting the performance of a self-heated fusion plasma therefore requires a calculation of the power balance between the fusion power input and the losses due to turbulence.

    These calculations generally start by assuming plasma temperature and density profiles at a particular location, then computing the heat transported locally by turbulence. However, a useful prediction requires a self-consistent calculation of the profiles across the entire plasma, which includes both the heat input and turbulent losses. Directly solving this problem is beyond the capabilities of any existing computer, so researchers have developed an approach that stitches the profiles together from a series of demanding but tractable local calculations. This method works, but since the heat and particle fluxes depend on multiple parameters, the calculations can be very slow to converge.

    However, techniques emerging from the field of machine learning are well suited to optimize just such a calculation. Starting with a set of computationally intensive local calculations run with the full-physics, first-principles CGYRO code (provided by a team from General Atomics led by Jeff Candy) Rodriguez-Fernandez and Howard fit a surrogate mathematical model, which was used to explore and optimize a search within the parameter space. The results of the optimization were compared to the exact calculations at each optimum point, and the system was iterated to a desired level of accuracy. The researchers estimate that the technique reduced the number of runs of the CGYRO code by a factor of four.

    New approach increases confidence in predictionsThis work, described in a recent publication in the journal Nuclear Fusion, is the highest fidelity calculation ever made of the core of a fusion plasma. It refines and confirms predictions made with less demanding models. Professor Jonathan Citrin, of the Eindhoven University of Technology and leader of the fusion modeling group for DIFFER, the Dutch Institute for Fundamental Energy Research, commented: “The work significantly accelerates our capabilities in more routinely performing ultra-high-fidelity tokamak scenario prediction. This algorithm can help provide the ultimate validation test of machine design or scenario optimization carried out with faster, more reduced modeling, greatly increasing our confidence in the outcomes.” 

    In addition to increasing confidence in the fusion performance of the SPARC experiment, this technique provides a roadmap to check and calibrate reduced physics models, which run with a small fraction of the computational power. Such models, cross-checked against the results generated from turbulence simulations, will provide a reliable prediction before each SPARC discharge, helping to guide experimental campaigns and improving the scientific exploitation of the device. It can also be used to tweak and improve even simple data-driven models, which run extremely quickly, allowing researchers to sift through enormous parameter ranges to narrow down possible experiments or possible future machines.

    The research was funded by CFS, with computational support from the National Energy Research Scientific Computing Center, a U.S. Department of Energy Office of Science User Facility. More

  • in

    Engineers enlist AI to help scale up advanced solar cell manufacturing

    Perovskites are a family of materials that are currently the leading contender to potentially replace today’s silicon-based solar photovoltaics. They hold the promise of panels that are far thinner and lighter, that could be made with ultra-high throughput at room temperature instead of at hundreds of degrees, and that are cheaper and easier to transport and install. But bringing these materials from controlled laboratory experiments into a product that can be manufactured competitively has been a long struggle.

    Manufacturing perovskite-based solar cells involves optimizing at least a dozen or so variables at once, even within one particular manufacturing approach among many possibilities. But a new system based on a novel approach to machine learning could speed up the development of optimized production methods and help make the next generation of solar power a reality.

    The system, developed by researchers at MIT and Stanford University over the last few years, makes it possible to integrate data from prior experiments, and information based on personal observations by experienced workers, into the machine learning process. This makes the outcomes more accurate and has already led to the manufacturing of perovskite cells with an energy conversion efficiency of 18.5 percent, a competitive level for today’s market.

    The research is reported today in the journal Joule, in a paper by MIT professor of mechanical engineering Tonio Buonassisi, Stanford professor of materials science and engineering Reinhold Dauskardt, recent MIT research assistant Zhe Liu, Stanford doctoral graduate Nicholas Rolston, and three others.

    Perovskites are a group of layered crystalline compounds defined by the configuration of the atoms in their crystal lattice. There are thousands of such possible compounds and many different ways of making them. While most lab-scale development of perovskite materials uses a spin-coating technique, that’s not practical for larger-scale manufacturing, so companies and labs around the world have been searching for ways of translating these lab materials into a practical, manufacturable product.

    “There’s always a big challenge when you’re trying to take a lab-scale process and then transfer it to something like a startup or a manufacturing line,” says Rolston, who is now an assistant professor at Arizona State University. The team looked at a process that they felt had the greatest potential, a method called rapid spray plasma processing, or RSPP.

    The manufacturing process would involve a moving roll-to-roll surface, or series of sheets, on which the precursor solutions for the perovskite compound would be sprayed or ink-jetted as the sheet rolled by. The material would then move on to a curing stage, providing a rapid and continuous output “with throughputs that are higher than for any other photovoltaic technology,” Rolston says.

    “The real breakthrough with this platform is that it would allow us to scale in a way that no other material has allowed us to do,” he adds. “Even materials like silicon require a much longer timeframe because of the processing that’s done. Whereas you can think of [this approach as more] like spray painting.”

    Within that process, at least a dozen variables may affect the outcome, some of them more controllable than others. These include the composition of the starting materials, the temperature, the humidity, the speed of the processing path, the distance of the nozzle used to spray the material onto a substrate, and the methods of curing the material. Many of these factors can interact with each other, and if the process is in open air, then humidity, for example, may be uncontrolled. Evaluating all possible combinations of these variables through experimentation is impossible, so machine learning was needed to help guide the experimental process.

    But while most machine-learning systems use raw data such as measurements of the electrical and other properties of test samples, they don’t typically incorporate human experience such as qualitative observations made by the experimenters of the visual and other properties of the test samples, or information from other experiments reported by other researchers. So, the team found a way to incorporate such outside information into the machine learning model, using a probability factor based on a mathematical technique called Bayesian Optimization.

    Using the system, he says, “having a model that comes from experimental data, we can find out trends that we weren’t able to see before.” For example, they initially had trouble adjusting for uncontrolled variations in humidity in their ambient setting. But the model showed them “that we could overcome our humidity challenges by changing the temperature, for instance, and by changing some of the other knobs.”

    The system now allows experimenters to much more rapidly guide their process in order to optimize it for a given set of conditions or required outcomes. In their experiments, the team focused on optimizing the power output, but the system could also be used to simultaneously incorporate other criteria, such as cost and durability — something members of the team are continuing to work on, Buonassisi says.

    The researchers were encouraged by the Department of Energy, which sponsored the work, to commercialize the technology, and they’re currently focusing on tech transfer to existing perovskite manufacturers. “We are reaching out to companies now,” Buonassisi says, and the code they developed has been made freely available through an open-source server. “It’s now on GitHub, anyone can download it, anyone can run it,” he says. “We’re happy to help companies get started in using our code.”

    Already, several companies are gearing up to produce perovskite-based solar panels, even though they are still working out the details of how to produce them, says Liu, who is now at the Northwestern Polytechnical University in Xi’an, China. He says companies there are not yet doing large-scale manufacturing, but instead starting with smaller, high-value applications such as building-integrated solar tiles where appearance is important. Three of these companies “are on track or are being pushed by investors to manufacture 1 meter by 2-meter rectangular modules [comparable to today’s most common solar panels], within two years,” he says.

    ‘The problem is, they don’t have a consensus on what manufacturing technology to use,” Liu says. The RSPP method, developed at Stanford, “still has a good chance” to be competitive, he says. And the machine learning system the team developed could prove to be important in guiding the optimization of whatever process ends up being used.

    “The primary goal was to accelerate the process, so it required less time, less experiments, and less human hours to develop something that is usable right away, for free, for industry,” he says.

    “Existing work on machine-learning-driven perovskite PV fabrication largely focuses on spin-coating, a lab-scale technique,” says Ted Sargent, University Professor at the University of Toronto, who was not associated with this work, which he says demonstrates “a workflow that is readily adapted to the deposition techniques that dominate the thin-film industry. Only a handful of groups have the simultaneous expertise in engineering and computation to drive such advances.” Sargent adds that this approach “could be an exciting advance for the manufacture of a broader family of materials” including LEDs, other PV technologies, and graphene, “in short, any industry that uses some form of vapor or vacuum deposition.” 

    The team also included Austin Flick and Thomas Colburn at Stanford and Zekun Ren at the Singapore-MIT Alliance for Science and Technology (SMART). In addition to the Department of Energy, the work was supported by a fellowship from the MIT Energy Initiative, the Graduate Research Fellowship Program from the National Science Foundation, and the SMART program. More

  • in

    New program bolsters innovation in next-generation artificial intelligence hardware

    The MIT AI Hardware Program is a new academia and industry collaboration aimed at defining and developing translational technologies in hardware and software for the AI and quantum age. A collaboration between the MIT School of Engineering and MIT Schwarzman College of Computing, involving the Microsystems Technologies Laboratories and programs and units in the college, the cross-disciplinary effort aims to innovate technologies that will deliver enhanced energy efficiency systems for cloud and edge computing.

    “A sharp focus on AI hardware manufacturing, research, and design is critical to meet the demands of the world’s evolving devices, architectures, and systems,” says Anantha Chandrakasan, dean of the MIT School of Engineering and Vannevar Bush Professor of Electrical Engineering and Computer Science. “Knowledge-sharing between industry and academia is imperative to the future of high-performance computing.”

    Based on use-inspired research involving materials, devices, circuits, algorithms, and software, the MIT AI Hardware Program convenes researchers from MIT and industry to facilitate the transition of fundamental knowledge to real-world technological solutions. The program spans materials and devices, as well as architecture and algorithms enabling energy-efficient and sustainable high-performance computing.

    “As AI systems become more sophisticated, new solutions are sorely needed to enable more advanced applications and deliver greater performance,” says Daniel Huttenlocher, dean of the MIT Schwarzman College of Computing and Henry Ellis Warren Professor of Electrical Engineering and Computer Science. “Our aim is to devise real-world technological solutions and lead the development of technologies for AI in hardware and software.”

    The inaugural members of the program are companies from a wide range of industries including chip-making, semiconductor manufacturing equipment, AI and computing services, and information systems R&D organizations. The companies represent a diverse ecosystem, both nationally and internationally, and will work with MIT faculty and students to help shape a vibrant future for our planet through cutting-edge AI hardware research.

    The five inaugural members of the MIT AI Hardware Program are:  

    Amazon, a global technology company whose hardware inventions include the Kindle, Amazon Echo, Fire TV, and Astro; 
    Analog Devices, a global leader in the design and manufacturing of analog, mixed signal, and DSP integrated circuits; 
    ASML, an innovation leader in the semiconductor industry, providing chipmakers with hardware, software, and services to mass produce patterns on silicon through lithography; 
    NTT Research, a subsidiary of NTT that conducts fundamental research to upgrade reality in game-changing ways that improve lives and brighten our global future; and 
    TSMC, the world’s leading dedicated semiconductor foundry.

    The MIT AI Hardware Program will create a roadmap of transformative AI hardware technologies. Leveraging MIT.nano, the most advanced university nanofabrication facility anywhere, the program will foster a unique environment for AI hardware research.  

    “We are all in awe at the seemingly superhuman capabilities of today’s AI systems. But this comes at a rapidly increasing and unsustainable energy cost,” says Jesús del Alamo, the Donner Professor in MIT’s Department of Electrical Engineering and Computer Science. “Continued progress in AI will require new and vastly more energy-efficient systems. This, in turn, will demand innovations across the entire abstraction stack, from materials and devices to systems and software. The program is in a unique position to contribute to this quest.”

    The program will prioritize the following topics:

    analog neural networks;
    new roadmap CMOS designs;
    heterogeneous integration for AI systems;
    onolithic-3D AI systems;
    analog nonvolatile memory devices;
    software-hardware co-design;
    intelligence at the edge;
    intelligent sensors;
    energy-efficient AI;
    intelligent internet of things (IIoT);
    neuromorphic computing;
    AI edge security;
    quantum AI;
    wireless technologies;
    hybrid-cloud computing; and
    high-performance computation.

    “We live in an era where paradigm-shifting discoveries in hardware, systems communications, and computing have become mandatory to find sustainable solutions — solutions that we are proud to give to the world and generations to come,” says Aude Oliva, senior research scientist in the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL) and director of strategic industry engagement in the MIT Schwarzman College of Computing.

    The new program is co-led by Jesús del Alamo and Aude Oliva, and Anantha Chandrakasan serves as chair. More

  • in

    Using artificial intelligence to find anomalies hiding in massive datasets

    Identifying a malfunction in the nation’s power grid can be like trying to find a needle in an enormous haystack. Hundreds of thousands of interrelated sensors spread across the U.S. capture data on electric current, voltage, and other critical information in real time, often taking multiple recordings per second.

    Researchers at the MIT-IBM Watson AI Lab have devised a computationally efficient method that can automatically pinpoint anomalies in those data streams in real time. They demonstrated that their artificial intelligence method, which learns to model the interconnectedness of the power grid, is much better at detecting these glitches than some other popular techniques.

    Because the machine-learning model they developed does not require annotated data on power grid anomalies for training, it would be easier to apply in real-world situations where high-quality, labeled datasets are often hard to come by. The model is also flexible and can be applied to other situations where a vast number of interconnected sensors collect and report data, like traffic monitoring systems. It could, for example, identify traffic bottlenecks or reveal how traffic jams cascade.

    “In the case of a power grid, people have tried to capture the data using statistics and then define detection rules with domain knowledge to say that, for example, if the voltage surges by a certain percentage, then the grid operator should be alerted. Such rule-based systems, even empowered by statistical data analysis, require a lot of labor and expertise. We show that we can automate this process and also learn patterns from the data using advanced machine-learning techniques,” says senior author Jie Chen, a research staff member and manager of the MIT-IBM Watson AI Lab.

    The co-author is Enyan Dai, an MIT-IBM Watson AI Lab intern and graduate student at the Pennsylvania State University. This research will be presented at the International Conference on Learning Representations.

    Probing probabilities

    The researchers began by defining an anomaly as an event that has a low probability of occurring, like a sudden spike in voltage. They treat the power grid data as a probability distribution, so if they can estimate the probability densities, they can identify the low-density values in the dataset. Those data points which are least likely to occur correspond to anomalies.

    Estimating those probabilities is no easy task, especially since each sample captures multiple time series, and each time series is a set of multidimensional data points recorded over time. Plus, the sensors that capture all that data are conditional on one another, meaning they are connected in a certain configuration and one sensor can sometimes impact others.

    To learn the complex conditional probability distribution of the data, the researchers used a special type of deep-learning model called a normalizing flow, which is particularly effective at estimating the probability density of a sample.

    They augmented that normalizing flow model using a type of graph, known as a Bayesian network, which can learn the complex, causal relationship structure between different sensors. This graph structure enables the researchers to see patterns in the data and estimate anomalies more accurately, Chen explains.

    “The sensors are interacting with each other, and they have causal relationships and depend on each other. So, we have to be able to inject this dependency information into the way that we compute the probabilities,” he says.

    This Bayesian network factorizes, or breaks down, the joint probability of the multiple time series data into less complex, conditional probabilities that are much easier to parameterize, learn, and evaluate. This allows the researchers to estimate the likelihood of observing certain sensor readings, and to identify those readings that have a low probability of occurring, meaning they are anomalies.

    Their method is especially powerful because this complex graph structure does not need to be defined in advance — the model can learn the graph on its own, in an unsupervised manner.

    A powerful technique

    They tested this framework by seeing how well it could identify anomalies in power grid data, traffic data, and water system data. The datasets they used for testing contained anomalies that had been identified by humans, so the researchers were able to compare the anomalies their model identified with real glitches in each system.

    Their model outperformed all the baselines by detecting a higher percentage of true anomalies in each dataset.

    “For the baselines, a lot of them don’t incorporate graph structure. That perfectly corroborates our hypothesis. Figuring out the dependency relationships between the different nodes in the graph is definitely helping us,” Chen says.

    Their methodology is also flexible. Armed with a large, unlabeled dataset, they can tune the model to make effective anomaly predictions in other situations, like traffic patterns.

    Once the model is deployed, it would continue to learn from a steady stream of new sensor data, adapting to possible drift of the data distribution and maintaining accuracy over time, says Chen.

    Though this particular project is close to its end, he looks forward to applying the lessons he learned to other areas of deep-learning research, particularly on graphs.

    Chen and his colleagues could use this approach to develop models that map other complex, conditional relationships. They also want to explore how they can efficiently learn these models when the graphs become enormous, perhaps with millions or billions of interconnected nodes. And rather than finding anomalies, they could also use this approach to improve the accuracy of forecasts based on datasets or streamline other classification techniques.

    This work was funded by the MIT-IBM Watson AI Lab and the U.S. Department of Energy. More