Analogue computers could train AI 1000 times faster and cut energy use

Analog computers consume less power than digital ones.


Analog computers that quickly solve a key type of equation used to train artificial intelligence models could offer a potential solution to the growing energy consumption of data centers driven by the AI boom.

Laptops, smartphones and other everyday devices are known as digital computers because they store and process data as a sequence of binary digits, 0s or 1s, and can be programmed to perform a range of tasks. In contrast, analog computers are typically designed to perform only one specific task. They store and process data using quantities that can change continuously, such as electrical resistance, rather than discrete 0s and 1s.

Analog computers can excel in speed and power efficiency, but they previously lacked the precision of their digital counterparts. Now, Zhong Sun from Peking University, China, and his colleagues have created a pair of analog chips that work together to accurately solve matrix equations—a fundamental part of transmitting data over telecommunications networks, running large scientific simulations, or training artificial intelligence models.

The first chip produces a low-precision matrix solution very quickly, and the second runs an iterative refinement algorithm that measures the error in the first chip's output and corrects it. Sun says the first chip produces results with an error of about 1 percent, but after three passes through the second chip, that error drops to 0.0000001 percent, which he says matches the accuracy of standard digital calculations.
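The solve-then-refine loop Sun describes can be sketched in NumPy. Here `low_precision_solve` is a stand-in for the fast analog chip, with its roughly 1 percent error simulated as random noise; the function names and the noise model are illustrative assumptions, not the team's actual design:

```python
import numpy as np

def low_precision_solve(A, b, noise=0.01):
    # Stand-in for the analog chip: returns the solution of Ax = b
    # with ~1% per-component error, simulated as multiplicative noise.
    x = np.linalg.solve(A, b)
    rng = np.random.default_rng(0)
    return x * (1 + noise * rng.standard_normal(x.shape))

def refine(A, b, iterations=3):
    # Iterative refinement: each pass computes the residual of the
    # current estimate, solves (approximately) for a correction,
    # and applies it. The error shrinks by ~100x per pass.
    x = low_precision_solve(A, b)
    for _ in range(iterations):
        r = b - A @ x                    # residual of current estimate
        dx = low_precision_solve(A, r)   # approximate correction
        x = x + dx
    return x
```

Because the correction step only needs to be roughly right, each pass of the refinement loop multiplies the remaining error by the solver's error rate, which is how three passes can take a 1 percent error down to around one part in a billion.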

So far, the researchers have created chips that can solve 16-by-16 matrix equations – that is, systems with 256 matrix entries – which could find use in some small problems. But Sun acknowledges that the problems underlying today's large AI models will require much larger circuits, perhaps a million by a million.

One advantage of analog chips over digital ones is that solving larger matrices does not require more time, whereas digital chips take rapidly longer as the matrix size grows. This means that the throughput (the amount of data processed per second) of a 32-by-32 matrix chip would exceed that of an Nvidia X100 GPU, one of the high-performance chips used today for training artificial intelligence.

In theory, further scaling could deliver 1,000 times the throughput of digital chips such as GPUs while consuming 100 times less power, Sun says. But he is quick to point out that real-world problems may exceed the extremely narrow capabilities of these designs, resulting in smaller gains.

“This is just a speed comparison, but for real applications the problem may be different,” Sun says. “Our chip can only perform matrix calculations. If matrix calculations take up the majority of the computational problem, this represents a very significant speedup to solve the problem, but if not, it will be a limited speedup.”

Sun says that because of this, the most likely outcome is the creation of hybrid chips in which the GPU contains multiple analog circuits that solve very specific parts of the problem – but even that is likely a few years away.

James Millen at King's College London says matrix calculations are a key process in training AI models and that analog calculation offers a potential boost.

“The modern world is built on digital computers. These incredible machines are general purpose computers, which means they can compute everything, but not everything can necessarily be computed efficiently and quickly,” Millen says. “Analog computers are tailored to specific tasks and can therefore be incredibly fast and efficient. This work uses an analog computing chip to speed up a process called matrix inversion, which is a key process in training certain AI models. Doing this more efficiently could help reduce the huge energy demands associated with our ever-increasing dependence on AI.”
