Deep learning calculations generally require a large number of multiply-accumulate (MAC) operations, which results in long calculation times and high energy consumption. Techniques that reduce the number of bits used to represent parameters (bit precision) have been proposed to cut the total amount of calculation, and one proposed algorithm reduces the bit precision to one or two bits. However, those techniques degrade recognition accuracy.

Toshiba Memory developed a new algorithm that reduces MAC operations by optimizing the bit precision of MAC operations for individual filters in each layer of a neural network. With the new algorithm, MAC operations can be reduced with less degradation of recognition accuracy.
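
The idea of choosing a bit precision per filter can be illustrated with a minimal sketch. It is not Toshiba Memory's algorithm, whose selection criterion is not described here. The sketch assumes a hypothetical rule: for each filter, pick the smallest bit width from a candidate list whose quantization error stays within a tolerance. The names `quantize`, `choose_bits`, `candidates` and `tol` are illustrative assumptions.

```python
def quantize(weights, bits):
    """Uniform symmetric quantization of a list of floats (bits >= 2)."""
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0:
        return list(weights)
    levels = 2 ** (bits - 1) - 1          # e.g. 1 level for 2 bits, 7 for 4 bits
    scale = max_abs / levels
    return [round(w / scale) * scale for w in weights]


def choose_bits(weights, candidates=(2, 4, 8), tol=0.02):
    """Smallest candidate bit width whose worst-case error is within
    tol * max|w|. This criterion is an assumption made for illustration."""
    max_abs = max(abs(w) for w in weights)
    for bits in candidates:
        quantized = quantize(weights, bits)
        err = max(abs(a - b) for a, b in zip(weights, quantized))
        if err <= tol * max_abs:
            return bits
    return candidates[-1]                 # fall back to the widest candidate


# Three toy filters from one layer, each with four weights (made-up values).
filters = [
    [1.0, -1.0, 0.0, 1.0],     # only three distinct levels
    [0.7, 0.3, -0.5, 0.0],     # multiples of 0.1
    [0.50, 0.13, -0.37, 0.02], # needs finer resolution
]
bits_per_filter = [choose_bits(f) for f in filters]

# Bit-level work relative to running every filter at 8 bits.
relative_cost = sum(bits_per_filter) / (8 * len(filters))
print(bits_per_filter, relative_cost)
```

In this toy case the filters end up with different bit widths, so the total bit-level work falls below that of a uniform 8-bit layer, while filters that need fine resolution keep it.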

Furthermore, Toshiba Memory developed a new hardware architecture, called the bit-parallel method, which is suitable for MAC operations with different bit precisions. This method splits values of any bit precision into individual bits and can execute the resulting 1-bit operations in numerous MAC units in parallel. It significantly improves the utilization efficiency of the MAC units in the processor compared with conventional MAC architectures that execute in series.
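
The decomposition into 1-bit operations can be sketched in software. This is only an arithmetic illustration under stated assumptions, not a description of the actual hardware: weights are assumed to be unsigned integers, and the helper names `bit_planes` and `mac_by_bits` are hypothetical. Each weight bit multiplied by an activation is an independent 1-bit operation that could be handed to a separate unit. Here the operations are simply enumerated in a loop.

```python
def bit_planes(value, bits):
    """Split an unsigned integer into its bits, least significant first."""
    return [(value >> i) & 1 for i in range(bits)]


def mac_by_bits(weights, activations, bits):
    """Dot product assembled from 1-bit partial products.

    For every weight bit, the partial product is either 0 or the
    activation, shifted by the bit position before accumulation.
    """
    total = 0
    for w, a in zip(weights, activations):
        for i, bit in enumerate(bit_planes(w, bits)):
            total += (bit * a) << i
    return total


weights = [5, 3, 0, 7]        # unsigned 3-bit weights (made-up values)
activations = [2, 4, 6, 1]    # non-negative integer activations

reference = sum(w * a for w, a in zip(weights, activations))
result = mac_by_bits(weights, activations, bits=3)
print(result, reference)      # both are 29
```

Because a weight with fewer bits simply contributes fewer 1-bit operations, filters with different bit precisions produce the same kind of work item, which is what allows a pool of identical 1-bit units to stay busy.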