Dr. Wu believed there were new opportunities for data transmission in the AI era, so he left Marvell to research smart storage two years ago. He founded Innogrit to enhance overall system performance by reducing network latency and eliminating inefficient protocol conversions. A number of respected industry veterans from Broadcom, MediaTek, Nvidia and Toshiba soon joined him.



Many tech companies are hopping on the AI chip bandwagon: Google’s TPU and Intel’s Nervana, for example, are dedicated to accelerating neural network computations. Dr. Wu, however, believes AI systems also require more efficient solutions for transmitting data to the processing unit.

Today’s advanced graphics processing units (GPUs) such as the Tesla V100 can achieve 100 teraFLOPS (TFLOPS) of deep learning performance. NVM Express (NVMe), a communication protocol designed for the low latency and internal parallelism of solid-state drives (SSDs), can also deliver double-digit GB/s throughput. But bandwidth, the amount of data that can be transmitted in a fixed amount of time, throttles the entire process. If the AI system were an hourglass with the two glass bulbs representing processing and storage, the narrow neck would be the inefficient data flow between them.
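A rough back-of-envelope calculation, with hypothetical numbers for link throughput, dataset size, and arithmetic intensity (none of which come from the article), suggests why the "narrow neck" dominates:

```python
# Illustrative bottleneck arithmetic; all workload figures are assumptions.
gpu_tflops = 100.0       # Tesla V100 deep-learning peak, in TFLOPS
link_gb_s = 10.0         # assumed effective NVMe transfer rate, GB/s
dataset_gb = 1000.0      # hypothetical 1 TB training dataset
flops_per_byte = 10.0    # assumed arithmetic intensity of the workload

# Time to move the data over the storage link vs. time to process it.
transfer_s = dataset_gb / link_gb_s
compute_s = dataset_gb * 1e9 * flops_per_byte / (gpu_tflops * 1e12)

print(f"transfer: {transfer_s:.1f} s, compute: {compute_s:.1f} s")
# Under these assumptions, moving the data takes ~1000x longer than computing on it.
```

The exact ratio depends entirely on the assumed numbers, but the shape of the result is the point: peak compute far outruns the rate at which storage can feed it.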



“Our goal is to open up the intermediate process from storage to processing so that data can be distributed to the corresponding storage nodes more easily and quickly,” says Dr. Wu.



Innogrit has designed an application-specific integrated circuit (ASIC) coprocessor that handles storage and computation simultaneously, an approach Dr. Wu bills as “in-storage computing.” The coprocessor can perform data preprocessing tasks such as weighting, max-finding, and sorting. Given the enormous scale of AI training datasets, such data optimization could potentially double or triple system efficiency.
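The general idea behind near-data preprocessing can be sketched in a few lines. This is a conceptual illustration only, not Innogrit's actual design or API: the `in_storage_max` function stands in for a reduction the coprocessor would run next to flash, so that only small results, rather than raw data blocks, cross the bus to the host.

```python
# Conceptual sketch of in-storage computing (hypothetical functions,
# not Innogrit's API): reduce data where it lives, ship only the result.

def in_storage_max(block: list) -> float:
    """Stands in for a reduction running on the storage-side coprocessor."""
    return max(block)

def host_query_max(blocks: list) -> float:
    """The host combines tiny per-block results instead of reading every
    raw block over the storage link."""
    return max(in_storage_max(b) for b in blocks)

blocks = [[3.1, 9.7, 0.4], [5.5, 2.2], [8.8, 1.0, 6.3]]
result = host_query_max(blocks)
print(result)  # one float per block travels to the host, not eight values
```

In this toy example only three intermediate values reach the host instead of eight raw ones; at training-dataset scale, that kind of reduction in bus traffic is where the claimed efficiency gain would come from.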



The coprocessors are packaged into an SSD controller, bridging the flash memory components to the SSD’s input/output interfaces. Innogrit has produced a test chip for the coprocessor, with satisfactory measured results.



“There is no such type of product on the market today,” says Dr. Wu of his data coprocessor. “The current players are focused on either the storage or computing sides. Few companies have talents that can excel at both sides.” Wu himself graduated at the top of his class from Stanford University, and authored over 300 patents while at Marvell.



Silicon Valley-based Innogrit has about 80 employees and additional offices in Beijing and Shanghai. The company has raised an undisclosed amount of funding to begin mass-production of its first-generation processors.



Larry Li is a managing partner at early-stage fund Amino Capital, which joined the fundraising. He tells Synced “there is increasing demand worldwide for Innogrit’s AI chips that combine storage and computation. The market potential is huge for the next 10 years.”



Amino Capital Managing Partner Sue Xu adds: “Innogrit’s cutting-edge chip design mechanism redefines the future of the underlying architecture of chips, which breaks the bottleneck of storage and data management for AI use cases such as cloud storage, data centers, PCs, networking, manufacturing, and autonomous driving.”



The next challenge for Dr. Wu and Innogrit is to bring their products to market. The storage controller market is already populated by tech giants such as Marvell, Intel, and Toshiba, and is becoming increasingly competitive. Dr. Wu says that while Innogrit has already partnered with some of the world’s top enterprises, there is still a long way to go.



Innogrit will release its first-generation smart storage controller and a corresponding software system in late 2019.