Close to a thousand machine learning papers are published each and every week. On Fridays, Synced selects seven studies from the last seven days that present topical, innovative or otherwise interesting or important research that we believe may be of special interest to our readers.

Highlights of the week:

Researchers from Seoul National University and Qualcomm Korea introduced an elegant framework for fine-grained neural architecture search.

Turing Award Winner and esteemed UCLA Professor Judea Pearl endorsed the paper Causality for Machine Learning — “A very comprehensive, delightful and inspiring paper. Recommended to ALL, not just MANY ML/AI folks.”

DeepMind and Google showed that sparse versions of MobileNet v1, MobileNet v2 and EfficientNet architectures substantially outperform strong dense baselines on the efficiency-accuracy curve.

A Quoc V. Le led Google Brain team demonstrated adversarial examples can be used to improve image recognition models.

Paper One: Fine-Grained Neural Architecture Search (arXiv)

Authors: Heewon Kim, Seokil Hong, Bohyung Han, Kyoung Mu Lee from Computer Vision Lab & ASRI, Seoul National University and Heesoo Myeong from Qualcomm Korea

Abstract: We present an elegant framework of fine-grained neural architecture search (FGNAS), which allows to employ multiple heterogeneous operations within a single layer and can even generate compositional feature maps using several different base operations. FGNAS runs efficiently in spite of significantly large search space compared to other methods because it trains networks end-to-end by a stochastic gradient descent method. Moreover, the proposed framework allows to optimize the network under predefined resource constraints in terms of number of parameters, FLOPs and latency. FGNAS has been applied to two crucial applications in resource demanding computer vision tasks — -large-scale image classification and image super-resolution — -and demonstrates the state-of-the-art performance through flexible operation search and channel pruning.

Paper Two: Hybrid Composition with IdleBlock: More Efficient Networks for Image Recognition (arXiv)

Authors: Bing Xu, Andrew Tulloch, Yunpeng Chen, Xiaomeng Yang, Lin Qiao from Facebook AI

Abstract: We propose a new building block, IdleBlock, which naturally prunes connections within the block. To fully utilize the IdleBlock we break the tradition of monotonic design in state-of-the-art networks, and introducing hybrid composition with IdleBlock. We study hybrid composition on MobileNet v3 and EfficientNet-B0, two of the most efficient networks. Without any neural architecture search, the deeper “MobileNet v3” with hybrid composition design surpasses possibly all state-of-the-art image recognition network designed by human experts or neural architecture search algorithms. Similarly, the hybridized EfficientNet-B0 networks are more efficient than previous state-of-the-art networks with similar computation budgets. These results suggest a new simpler and more efficient direction for network design and neural architecture search.

Paper Three: Causality for Machine Learning (arXiv)

Authors: Bernhard Schölkopf from Max Planck Institute for Intelligent Systems

Abstract: Graphical causal inference as pioneered by Judea Pearl arose from research on artificial intelligence (AI), and for a long time had little connection to the field of machine learning.

This article discusses where links have been and should be established, introducing key concepts along the way. It argues that the hard open problems of machine learning and AI are intrinsically related to causality, and explains how the field is beginning to understand them.

Paper Four: Fast Sparse ConvNets (arXiv)

Authors: Erich Elsen and Karen Simonyan from DeepMind; Marat Dukhan and Trevor Gale from Google

Abstract: Historically, the pursuit of efficient inference has been one of the driving forces behind research into new deep learning architectures and building blocks. Some recent examples include: the squeeze-and-excitation module, depthwise separable convolutions in Xception, and the inverted bottleneck in MobileNet v2. Notably, in all of these cases, the resulting building blocks enabled not only higher efficiency, but also higher accuracy, and found wide adoption in the field. In this work, we further expand the arsenal of efficient building blocks for neural network architectures; but instead of combining standard primitives (such as convolution), we advocate for the replacement of these dense primitives with their sparse counterparts. While the idea of using sparsity to decrease the parameter count is not new, the conventional wisdom is that this reduction in theoretical FLOPs does not translate into real-world efficiency gains. We aim to correct this misconception by introducing a family of efficient sparse kernels for ARM and WebAssembly, which we open-source for the benefit of the community as part of the XNNPACK library. Equipped with our efficient implementation of sparse primitives, we show that sparse versions of MobileNet v1, MobileNet v2 and EfficientNet architectures substantially outperform strong dense baselines on the efficiency-accuracy curve. On Snapdragon 835 our sparse networks outperform their dense equivalents by 1.3−2.4× — equivalent to approximately one entire generation of MobileNet-family improvement. We hope that our findings will facilitate wider adoption of sparsity as a tool for creating efficient and accurate deep learning architectures.

Paper Five: Adversarial Examples Improve Image Recognition (arXiv)

Authors: Cihang Xie, Mingxing Tan, Boqing Gong, Jiang Wang and Quoc V. Le from Google; Alan Yuille from Johns Hopkins University

Abstract: Adversarial examples are commonly viewed as a threat to ConvNets. Here we present an opposite perspective: adversarial examples can be used to improve image recognition models if harnessed in the right manner. We propose AdvProp, an enhanced adversarial training scheme which treats adversarial examples as additional examples, to prevent overfitting. Key to our method is the usage of a separate auxiliary batch norm for adversarial examples, as they have different underlying distributions to normal examples.

We show that AdvProp improves a wide range of models on various image recognition tasks and performs better when the models are bigger. For instance, by applying AdvProp to the latest EfficientNet-B7 [28] on ImageNet, we achieve significant improvements on ImageNet (+0.7%), ImageNet-C (+6.5%), ImageNet-A (+7.0%), Stylized-ImageNet (+4.8%). With an enhanced EfficientNet-B8, our method achieves the state-of-the-art 85.5% ImageNet top-1 accuracy without extra data. This result even surpasses the best model in [20] which is trained with 3.5B Instagram images (~3000X more than ImageNet) and ~9.4X more parameters. Models are available at this https URL.

Paper Six: Positive-Unlabeled Compression on the Cloud (arXiv)

Authors: Yixing Xu, Yunhe Wang, Kai Han, and Chunjing Xu from Huawei Noah’s Ark Lab; Hanting Chen from Peking University; Dacheng Tao and Chang Xu from The University of Sydney

Abstract: Many attempts have been done to extend the great success of convolutional neural networks (CNNs) achieved on high-end GPU servers to portable devices such as smart phones. Providing compression and acceleration service of deep learning models on the cloud is therefore of significance and is attractive for end users. However, existing network compression and acceleration approaches usually fine-tuning the svelte model by requesting the entire original training data (\eg ImageNet), which could be more cumbersome than the network itself and cannot be easily uploaded to the cloud. In this paper, we present a novel positive-unlabeled (PU) setting for addressing this problem. In practice, only a small portion of the original training set is required as positive examples and more useful training examples can be obtained from the massive unlabeled data on the cloud through a PU classifier with an attention based multi-scale feature extractor. We further introduce a robust knowledge distillation (RKD) scheme to deal with the class imbalance problem of these newly augmented training examples. The superiority of the proposed method is verified through experiments conducted on the benchmark models and datasets. We can use only 8% of uniformly selected data from the ImageNet to obtain an efficient model with comparable performance to the baseline ResNet-34.

Paper Seven: Image2StyleGAN++: How to Edit the Embedded Images? (arXiv)

Authors: Rameen Abdal and Peter Wonka from KAUST (King Abdullah University of Science and Technology), Yipeng Qin from Cardiff University

Abstract: We propose Image2StyleGAN++, a flexible image editing framework with many applications. Our framework extends the recent Image2StyleGAN in three ways. First, we introduce noise optimization as a complement to the W+ latent space embedding. Our noise optimization can restore high frequency features in images and thus significantly improves the quality of reconstructed images, e.g. a big increase of PSNR from 20 dB to 45 dB. Second, we extend the global W+ latent space embedding to enable local embeddings. Third, we combine embedding with activation tensor manipulation to perform high quality local edits along with global semantic edits on images. Such edits motivate various high quality image editing applications, e.g. image reconstruction, image inpainting, image crossover, local style transfer, image editing using scribbles, and attribute level feature transfer. Examples of the edited images are shown across the paper for visual inspection.