We did the math on AI’s energy footprint. Here’s the story you haven’t heard.

The day after his inauguration in January, President Donald Trump announced Stargate, a $500 billion initiative to build out AI infrastructure, backed by some of the biggest companies in tech. Stargate aims to accelerate the construction of massive data centers and electricity networks across the US to ensure it keeps its edge over China. This story is a part of MIT Technology Review’s series “Power Hungry: AI and our energy future,” on the energy demands and carbon costs of the artificial-intelligence revolution. The whatever-it-takes approach to the race for worldwide AI dominance was the talk of Davos, says Raquel Urtasun, founder and CEO of the Canadian robotruck startup Waabi, referring to the World Economic Forum’s annual January meeting in Switzerland, which was held the same week as Trump’s announcement. “I’m pretty worried about where the industry is going,” Urtasun says. She’s not alone. “Dollars are being invested, GPUs are being burned, water is being evaporated—it’s just absolutely the wrong direction,” says Ali Farhadi, CEO of the Seattle-based nonprofit Allen Institute for AI. But sift through the talk of rocketing costs—and climate impact—and you’ll find reasons to be hopeful. There are innovations underway that could improve the efficiency of the software behind AI models, the computer chips those models run on, and the data centers where those chips hum around the clock. Here’s what you need to know about how energy use, and therefore carbon emissions, could be cut across all three of those domains, plus an added argument for cautious optimism: There are reasons to believe that the underlying business realities will ultimately bend toward more energy-efficient AI. 1/ More efficient models The most obvious place to start is with the models themselves—the way they’re created and the way they’re run. AI models are built by training neural networks on lots and lots of data. Large language models are trained on vast amounts of text, self-driving models are trained on vast amounts of driving data, and so on. But the way such data is collected is often indiscriminate. Large language models are trained on data sets that include text scraped from most of the internet and huge libraries of scanned books. The practice has been to grab everything that’s not nailed down, throw it into the mix, and see what comes out. This approach has certainly worked, but training a model on a massive data set over and over so it can extract relevant patterns by itself is a waste of time and energy. There might be a more efficient way. Children aren’t expected to learn just by reading everything that’s ever been written; they are given a focused curriculum. Urtasun thinks we should do something similar with AI, training models with more curated data tailored to specific tasks. (Waabi trains its robotrucks inside a superrealistic simulation that allows fine-grained control of the virtual data its models are presented with.) It’s not just Waabi. Writer, an AI startup that builds large language models for enterprise customers, claims that its models are cheaper to train and run in part because it trains them using synthetic data. Feeding its models bespoke data sets rather than larger but less curated ones makes the training process quicker (and therefore less expensive). For example, instead of simply downloading Wikipedia, the team at Writer takes individual Wikipedia pages and rewrites their contents in different formats—as a Q&A instead of a block of text, and so on—so that its models can learn more from less. Training is just the start of a model’s life cycle. As models have become bigger, they have become more expensive to run. So-called reasoning models that work through a query step by step before producing a response are especially power-hungry because they compute a series of intermediate subresponses for each response. The price tag of these new capabilities is eye-watering: OpenAI’s o3 reasoning model has been estimated to cost up to $30,000 per task to run. But this technology is only a few months old and still experimental. Farhadi expects that these costs will soon come down. For example, engineers will figure out how to stop reasoning models from going too far down a dead-end path before they determine it’s not viable. “The first time you do something it’s way more expensive, and then you figure out how to make it smaller and more efficient,” says Farhadi. “It’s a fairly consistent trend in technology.” One way to get performance gains without big jumps in energy consumption is to run inference steps (the computations a model makes to come up with its response) in parallel, he says. Parallel computing underpins much of today’s software, especially large language models (GPUs are parallel by design). Even so, the basic technique could be applied to a wider range of problems. By splitting up a task and running different parts of it at the same time, parallel computing can generate results more quickly. It can also save energy by making more efficient use of available hardware. But it requires clever new algorithms to coordinate the multiple subtasks and pull them together into a single result at the end. The largest, most powerful models won’t be used all the time, either. There is a lot of talk about small models, versions of large language models that have been distilled into pocket-size packages. In many cases, these more efficient models perform as well as larger ones, especially for specific use cases. As businesses figure out how large language models fit their needs (or not), this trend toward more efficient bespoke models is taking off. You don’t need an all-purpose LLM to manage inventory or to respond to niche customer queries. “There’s going to be a really, really large number of specialized models, not one God-given model that solves everything,” says Farhadi. Christina Shim, chief sustainability officer at IBM, is seeing this trend play out in the way her clients adopt the technology. She works with businesses to make sure they choose the smallest and least power-hungry models possible. “It’s not just the biggest model that will give you a big bang for your buck,” she says. A smaller model that does exactly what you need is a better investment than a larger one that does the same thing: “Let’s not use a sledgehammer to hit a nail.” 2/ More efficient computer chips As the software becomes more streamlined, the hardware it runs on will become more efficient too. There’s a tension at play here: In the short term, chipmakers like Nvidia are racing to develop increasingly powerful chips to meet demand from companies wanting to run increasingly powerful models. But in the long term, this race isn’t sustainable. “The models have gotten so big, even running the inference step now starts to become a big challenge,” says Naveen Verma, cofounder and CEO of the upstart microchip maker EnCharge AI. Companies like Microsoft and OpenAI are losing money running their models inside data centers to meet the demand from millions of people. Smaller models will help. Another option is to move the computing out of the data centers and into people’s own machines. That’s something that Microsoft tried with its Copilot+ PC initiative, in which it marketed a supercharged PC that would let you run an AI model (and cover the energy bills) yourself. It hasn’t taken off, but Verma thinks the push will continue because companies will want to offload as much of the costs of running a model as they can. But getting AI models (even small ones) to run reliably on people’s personal devices will require a step change in the chips that typically power those devices. These chips need to be made even more energy efficient because they need to be able to work with just a battery, says Verma. That’s where EnCharge comes in. Its solution is a new kind of chip that ditches digital computation in favor of something called analog in-memory computing. Instead of representing information with binary 0s and 1s, like the electronics inside conventional, digital computer chips, the electronics inside analog chips can represent information along a range of values in between 0 and 1. In theory, this lets you do more with the same amount of power. SHIWEN SVEN WANG EnCharge was spun out from Verma’s research lab at Princeton in 2022. “We’ve known for decades that analog compute can be much more efficient—orders of magnitude more efficient—than digital,” says Verma. But analog computers never worked well in practice because they made lots of errors. Verma and his colleagues have discovered a way to do analog computing that’s precise. EnCharge is focusing just on the core computation required by AI today. With support from semiconductor giants like TSMC, the startup is developing hardware that performs high-dimensional matrix multiplication (the basic math behind all deep-learning models) in an analog chip and then passes the result back out to the surrounding digital computer. EnCharge’s hardware is just one of a number of experimental new chip designs on the horizon. IBM and others

Post Views: 128

We did the math on AI’s energy footprint. Here’s the story you haven’t heard.

MUBS Unveils Graduation List Ahead of 16th Graduation Ceremony

Gulu University appoints Ruhakana Rugunda as new Chancellor

Metropolitan International University kicks off their 5th Graduation ceremony

Makerere University Graduand Throws Graduation Party Worth 80 Million Shillings

Gulu University Set For 18th Graduation

The Download: funding a CRISPR embryo startup, and bad news for clean cement

Inside the race to find GPS alternatives

Why doctors should look for ways to prescribe hope

The Download: China’s AI agent boom, and GPS alternatives

LEAVE A REPLY Cancel reply

About

Latest

Kabale University, UNDP, UDB Set Stage for Innovation Hub Launch

IUIU Hosts Global Islamic Scholars for 6th International Conference

Makerere University, UNDP Launch Childcare Centre to Support Students and Staff

Popular

Kabale University, UNDP, UDB Set Stage for Innovation Hub Launch

IUIU Hosts Global Islamic Scholars for 6th International Conference

Makerere University, UNDP Launch Childcare Centre to Support Students and Staff

More