Researchers from Facebook AI, the National University of Singapore, and the Qihoo 360 AI Institute have jointly proposed OctConv (Octave Convolution), a promising alternative to the traditional convolution operation. Acting like a “compressor” for Convolutional Neural Networks (CNNs), OctConv saves computational resources while improving accuracy.

Because OctConv is plug-and-play, there is no need to modify the original network architecture or to adjust hyperparameters.

Octave Convolution is inspired by the frequency separation and compression of images. The researchers applied a similar idea inside a convolutional network: the feature maps are split into a high-frequency part and a low-frequency part, the low-frequency part is compressed, and the two parts are processed separately. Information can still be exchanged between the two parts, and the compression of the low-frequency part reduces both the storage and the compute required for convolution operations.
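The data flow described above can be illustrated with a small NumPy sketch. It is not the authors' implementation: it assumes the low-frequency maps are kept at half the height and width of the high-frequency maps, uses 1x1 kernels to stay short, and exchanges information through 2x2 average pooling and nearest-neighbour upsampling. The function names and the weight names (`w_hh`, `w_hl`, `w_lh`, `w_ll`) are hypothetical.

```python
import numpy as np

def avg_pool2(x):
    """Halve height and width of a (C, H, W) array with 2x2 average pooling."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))

def upsample2(x):
    """Double height and width of a (C, H, W) array by nearest-neighbour repeat."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def pointwise_conv(weight, x):
    """1x1 convolution: mix channels with a (C_out, C_in) weight matrix."""
    return np.einsum("oc,chw->ohw", weight, x)

def octave_conv_1x1(x_high, x_low, w_hh, w_hl, w_lh, w_ll):
    """Illustrative octave convolution with 1x1 kernels (assumed layout)."""
    # High-frequency output: high->high path plus the upsampled low->high path.
    y_high = pointwise_conv(w_hh, x_high) + upsample2(pointwise_conv(w_lh, x_low))
    # Low-frequency output: low->low path plus the pooled high->low path.
    y_low = pointwise_conv(w_ll, x_low) + pointwise_conv(w_hl, avg_pool2(x_high))
    return y_high, y_low

rng = np.random.default_rng(0)
alpha, channels, size = 0.25, 8, 16      # alpha: share of low-frequency channels
c_low = int(alpha * channels)
c_high = channels - c_low

x_high = rng.standard_normal((c_high, size, size))
x_low = rng.standard_normal((c_low, size // 2, size // 2))  # half resolution
w_hh = rng.standard_normal((c_high, c_high))
w_hl = rng.standard_normal((c_low, c_high))
w_lh = rng.standard_normal((c_high, c_low))
w_ll = rng.standard_normal((c_low, c_low))

y_high, y_low = octave_conv_1x1(x_high, x_low, w_hh, w_hl, w_lh, w_ll)
print(y_high.shape, y_low.shape)  # (6, 16, 16) (2, 8, 8)
```

In this sketch the low-frequency branch works on a quarter as many spatial positions as the high-frequency branch, which is where the savings in storage and compute would come from.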

For example, replacing the traditional convolutions in a classic image recognition network with OctConv can yield a 1.2 percent improvement in recognition accuracy on ImageNet while requiring only 82 percent of the computing power and 91 percent of the storage space. The pink line in the picture below shows the effect of OctConv with different parameters on ResNet-50. The second pink dot from the left represents a more balanced configuration: its accuracy is slightly higher than that of the original network (rightmost black dot), yet it needs only about half of the original floating-point operations.

Ablation study results on ImageNet

With OctConv, the image recognition networks represented by the other lines, from models as small as ResNet-26 and DenseNet to models as large as ResNet-200, all showed improved performance with reduced computing power. The OctConv parameter α controls the balance between performance improvement and computational savings. By lowering the demand for computing power, OctConv can also reduce neural network inference time.
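A rough cost model shows how α could trade accuracy headroom against savings. The sketch below rests on assumptions not spelled out in this article: α is taken to be the fraction of channels assigned to the low-frequency part, low-frequency maps are assumed to be stored at half the height and width (a quarter of the positions), and the cost of pooling and upsampling is ignored. The function name `octconv_relative_cost` is hypothetical.

```python
def octconv_relative_cost(alpha):
    """Estimate (FLOPs, memory) of one OctConv layer relative to a plain conv.

    Assumes alpha is the low-frequency channel fraction and that
    low-frequency maps hold a quarter of the spatial positions.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    high, low = 1.0 - alpha, alpha
    # Only the high->high path runs at full resolution; the three paths that
    # touch the low-frequency part run on a quarter of the positions.
    flops = high * high + 0.25 * (high * low + low * high + low * low)
    # High-frequency maps are stored at full size, low-frequency at a quarter.
    memory = high + 0.25 * low
    return flops, memory

for alpha in (0.0, 0.125, 0.25, 0.5, 0.75):
    flops, memory = octconv_relative_cost(alpha)
    print(f"alpha={alpha:<5} FLOPs={flops:.0%} memory={memory:.0%}")
```

Under these assumptions, α = 0.125 gives roughly 82 percent of the FLOPs and 91 percent of the memory, which lines up with the figures quoted earlier, and α = 0.5 gives a little under half the FLOPs.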

OctConv’s usefulness is not limited to image recognition. Improvements can be achieved in both 2D and 3D CNNs. The researchers tested the image classification performance of 2D CNNs such as ResNet, ResNeXt, DenseNet, MobileNet, and SE-Net on ImageNet, and also tested video action recognition networks such as C2D and I3D after injecting OctConv into the pipeline.

Respected AI expert Ian Goodfellow, who recently left Google for Apple, tweeted that OctConv is “a simple replacement for the traditional convolution operation that gets better accuracy with fewer FLOPS.” His tweet has already received over 1.6k likes.

The paper Drop an Octave: Reducing Spatial Redundancy in Convolutional Neural Networks with Octave Convolution is on arXiv.