SAN FRANCISCO — A Google engineering manager called for new AI architectures, including a distributed approach to protecting data privacy. His talk was followed by more than half a dozen academic papers describing novel approaches to machine learning at the International Solid-State Circuits Conference (ISSCC) here.

Several ISSCC papers merged computation into memory, a long-pursued research idea that some believe machine learning could finally bring to broad commercial use. For its part, Google is exploring a hybrid approach in which end users keep their data and send only neural-network weights to parameter servers in the cloud for processing.

Ultimately, Google and its peers need big leaps in compute power to fulfill the promise of AI in their data centers. Just one iteration of one task in a Google photo search powered by machine learning requires 11 billion operations/second, said Olivier Temam, who manages an unspecified AI program at the search giant.

Temam called for a distributed approach in which edge devices and cloud services collaborate to train neural nets. Devices do some training on raw data locally, then send changes to neural-network weights, what he called semantic data, to the cloud, where neural models are further trained and refined.

“For very understandable reasons, people or companies don’t want to send their data to the cloud, so we’ve shown that it’s possible to create models with federated learning,” said Temam.
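The mechanics can be sketched in a few lines. The following is a toy federated-averaging loop, not Google's actual implementation: each device runs gradient steps on a trivial linear model over its own private data, and only the resulting weights are sent to a simulated parameter server, which averages them.

```python
import numpy as np

def local_update(weights, data, labels, lr=0.1):
    """One local training step on a device's private data.
    Toy model: linear regression trained by gradient descent."""
    preds = data @ weights
    grad = data.T @ (preds - labels) / len(labels)
    return weights - lr * grad

def federated_round(global_weights, device_datasets):
    """Each device trains locally; only the updated weights
    (never the raw data) are averaged by the parameter server."""
    updates = [local_update(global_weights, x, y) for x, y in device_datasets]
    return np.mean(updates, axis=0)

# Three devices, each holding private samples of the relation y = 2*x.
rng = np.random.default_rng(0)
devices = []
for _ in range(3):
    x = rng.normal(size=(32, 1))
    devices.append((x, 2.0 * x[:, 0]))

w = np.zeros(1)
for _ in range(100):
    w = federated_round(w, devices)
# w converges toward the true coefficient 2.0 without any raw data
# ever leaving a device.
```

Real federated systems add compression, secure aggregation, and sampling of devices per round, but the privacy argument rests on the same structure: only model updates cross the network.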

One observer noted that such an approach could attract hackers trying to infer raw data from semantic data.

Google called for edge devices and cloud services to collaborate on neural-network training. (All images: ISSCC)

Google agreed to speak to the audience of several hundred chip designers here in hopes of spawning fresh ideas for more powerful AI accelerators. One challenge in designing such chips is the bottleneck between processors and the large amounts of memory that neural nets require.

The search giant needs memory bandwidth on the order of a hundred terabits/second. Today’s high-bandwidth memory stacks are two orders of magnitude too slow, and SRAM is too expensive and power-hungry, said Temam.
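A back-of-envelope check makes the gap concrete. The HBM2 figure below is an assumed, commonly quoted per-stack number, not one from the talk:

```python
# One HBM2 stack delivers roughly 256 GB/s (assumed figure).
hbm2_stack_tbps = 256e9 * 8 / 1e12   # ~2 Tbit/s per stack
target_tbps = 100.0                  # "on the order of a hundred Tbit/s"

gap = target_tbps / hbm2_stack_tbps
print(f"single HBM2 stack falls ~{gap:.0f}x short of the target")
```

Even ganging several stacks together leaves roughly an order-of-magnitude shortfall, which is what pushes researchers toward moving compute into the memory itself.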

Several academics described approaches that embed computing in memory. The area is particularly hot thanks to the rise of specialty memories such as memristors and ReRAM, as well as brain-inspired computer designs that sometimes use large memory or analog arrays.
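The core trick in these designs can be illustrated with a simple numerical model. In a generic resistive crossbar (an assumed, idealized model, not any specific paper's circuit), weights are stored as cell conductances G, inputs arrive as voltages V, and Kirchhoff's current law sums each column's currents, so a matrix-vector multiply happens inside the array with no weight movement:

```python
import numpy as np

def crossbar_mvm(voltages, conductances):
    """Idealized analog in-memory matrix-vector multiply.
    Ohm's law per cell (I = V * G) plus current summing per
    column gives I_j = sum_i V_i * G_ij."""
    return voltages @ conductances

G = np.array([[1.0, 0.5],
              [0.2, 0.8]])   # 2x2 array of cell conductances (the weights)
V = np.array([0.3, 1.0])     # input activations applied as row voltages

print(crossbar_mvm(V, G))    # column output currents
```

Real arrays must also contend with device variability, limited precision, and the cost of the analog-to-digital converters at each column, which is where much of the ISSCC work focuses.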

“We found that most of the energy in processing neural networks was in data movement,” said Vivienne Sze, an associate professor at MIT who co-authored a 2016 paper on the Eyeriss architecture to address the issue. “You have lots of data and weights to manage, so data movement dominates energy consumption more than compute.”
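The imbalance Sze describes is easy to quantify with published per-operation energy estimates. The figures below are approximate 45 nm numbers from Horowitz's well-known ISSCC 2014 tutorial, not from her talk:

```python
# Approximate 45 nm energy costs (Horowitz, ISSCC 2014):
DRAM_ACCESS_PJ = 640.0   # fetch one 32-bit word from off-chip DRAM
MAC_PJ = 4.6             # one 32-bit FP multiply (3.7 pJ) + add (0.9 pJ)

ratio = DRAM_ACCESS_PJ / MAC_PJ
print(f"one DRAM access costs ~{ratio:.0f}x one MAC")
```

If every operand of every multiply-accumulate came from DRAM, data movement would cost more than a hundred times the arithmetic itself, which is why architectures like Eyeriss are built around maximizing on-chip reuse of weights and activations.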

Her group at MIT is now working on flexible architectures that can run a growing variety of neural nets, including the many simplified variants now emerging. It is also exploring how much neural-net acceleration can be done on a single watt of power for applications such as robots and drone cameras.

Google’s Temam said that the company is open to all new ideas as long as they are practical and low-cost. “We want to keep bringing the cost down so we can deploy more massively and eventually at the edge,” said Temam.

The following pages offer glimpses of six more ISSCC papers on AI accelerators. Most aimed to push energy consumption to new lows for inference jobs, though several also support some training.