Data augmentation on dog images

Waymo’s Progressive Population Based Augmentation (PPBA) optimizes only a subset of the augmentation parameters in the search space during each training iteration. The best parameters from past iterations are recorded as references for mutating parameters in later iterations.
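
The loop described in the caption above can be sketched roughly as follows. This is a toy illustration under stated assumptions, not our production implementation: the operation names (`op_0` to `op_3`), the two parameters per operation, the `toy_score` objective (a stand-in for training and validating a detector), the mutation noise, and the even population size are all hypothetical.

```python
import random
from typing import Dict, List, Tuple

Params = Dict[str, List[float]]

# Hypothetical search space: four operations, two parameters each, all in [0, 1].
OPS = ["op_0", "op_1", "op_2", "op_3"]
PARAMS_PER_OP = 2


def toy_score(params: Params) -> float:
    """Stand-in for validation accuracy; a real run would train a detector."""
    return -sum((v - 0.3) ** 2 for op in OPS for v in params[op])


def clip(v: float) -> float:
    return min(1.0, max(0.0, v))


def copy_params(params: Params) -> Params:
    return {op: list(values) for op, values in params.items()}


def progressive_search(iterations: int = 20, population: int = 8,
                       ops_per_iter: int = 2, seed: int = 0
                       ) -> Tuple[Params, float, List[float]]:
    """Toy population search; assumes an even population size."""
    rng = random.Random(seed)
    # Each worker holds full parameters plus the ops it explored most recently.
    workers = []
    for _ in range(population):
        params = {op: [rng.random() for _ in range(PARAMS_PER_OP)] for op in OPS}
        workers.append((params, rng.sample(OPS, ops_per_iter)))

    # Per operation: (best score seen, parameters that achieved it).
    reference: Dict[str, Tuple[float, List[float]]] = {}
    best_params, best_score = copy_params(workers[0][0]), float("-inf")
    history: List[float] = []

    for _ in range(iterations):
        scored = sorted(((toy_score(p), p, explored) for p, explored in workers),
                        key=lambda item: item[0], reverse=True)
        # Record the best parameters found so far for each explored operation.
        for score, params, explored in scored:
            for op in explored:
                if op not in reference or score > reference[op][0]:
                    reference[op] = (score, list(params[op]))
        if scored[0][0] > best_score:
            best_score, best_params = scored[0][0], copy_params(scored[0][1])
        history.append(best_score)

        # Exploit: keep the top half. Explore: each survivor spawns a child
        # that mutates only a small subset of operations.
        survivors = scored[: population // 2]
        workers = [(copy_params(p), explored) for _, p, explored in survivors]
        for _, parent, _ in survivors:
            child = copy_params(parent)
            explored = rng.sample(OPS, ops_per_iter)
            for op in explored:
                if op in reference:
                    # Mutate around the recorded reference for this operation.
                    child[op] = [clip(v + rng.gauss(0.0, 0.1)) for v in reference[op][1]]
                else:
                    # Never explored before: sample fresh values.
                    child[op] = [rng.random() for _ in range(PARAMS_PER_OP)]
            workers.append((child, explored))
    return best_params, best_score, history
```

Calling `progressive_search()` returns the best parameters found, their score, and the best-so-far score per iteration. The point of the sketch is the structure: each child touches only `ops_per_iter` operations, and mutations start from the per-operation references rather than from scratch.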

By applying automated data augmentation to lidar point clouds in the Waymo Open Dataset, one of the largest and most diverse multi-sensor self-driving datasets ever released, PPBA achieves significant performance improvements across detection architectures. Our experiments also show that PPBA is much faster and more effective at finding data augmentation strategies than a random search or a PBA [5] baseline. Additionally, because we rely on labeled lidar data to train our neural nets, PPBA also allows us to save on labeling costs, improving our data efficiency as one labeled example becomes many. As the figures below show, our 3D detection control experiments on the Waymo Open Dataset indicate that using PPBA is up to 10 times more data efficient than training nets without augmentation.

Vehicle detection 3D mAP (mean average precision) for PointPillars [6] on Waymo Open Dataset validation set with no augmentation, random augmentation and PPBA as the dataset size changes

Pedestrian detection 3D mAP (mean average precision) for PointPillars [6] on Waymo Open Dataset validation set with no augmentation, random augmentation and PPBA as the dataset size changes

With AutoAugment [1], Google Brain designed a new search space consisting of augmentation policies, which are combinations of augmentation operations. They were able to automatically explore which augmentation policies to use through reinforcement learning. By finding the optimal image transformation policies from the data itself, the Brain Team was able to improve image recognition tasks on various academic datasets and extend these ideas to object localization problems on the COCO dataset [2]. They also discovered a way to substantially reduce the computational cost of searching for effective data augmentation policies [3], making it an effective and inexpensive tool for us to use across our dataset collected over 20 million self-driven miles on public roads.

In collaboration with our Google Research colleagues from the Brain team, we’re extending this research to automatically discover optimal data augmentation policies to improve perception tasks for our Waymo Driver.

In 2019, we started applying automated data augmentation techniques from RandAugment [3] to Waymo image-based classification and detection tasks. We achieved significant improvements in several classifiers and detectors, including those that help classify foreign objects such as construction equipment and animals. After the success we experienced with image-based data, we explored whether automated data augmentation strategies could improve lidar 3D detection tasks as well.

Lidar is one of Waymo’s core sensors. It not only paints a picture of its surroundings in 3D up to 300 meters away, but also gives our self-driving technology important context for where objects are and where they may be going. Because our custom-designed lidar provides detailed 3D information, lidar-based models are key to our system and help ensure we accurately detect and track all objects on the road.
While data augmentation is commonly adopted to improve the quality and robustness of lidar point cloud detection models, current augmentation strategies are limited because of their manual design. Since no suitable off-the-shelf solution for point cloud augmentation existed, we decided to build one.

While augmenting images is no easy task, augmenting a lidar point cloud is literally a whole dimension more complex. As a result, the search space of automated augmentation techniques used for image classification and object detection cannot be reused directly for point clouds. Due to the nature of geometric information in 3D data, transformations for point clouds typically have a large number of parameters, including geometric distance, operation strength, and sampling probability. Certain image augmentation techniques, such as color shifting, simply wouldn’t apply to monochromatic 3D data. Therefore, we created a new point cloud augmentation search space to discover policies specifically designed for point cloud datasets.

The search space we created for our lidar point clouds includes eight augmentation operations. Each augmentation operation is associated with a probability and specific parameters. For example, the GroundTruthAugmentor has parameters denoting the probability of sampling vehicles, pedestrians, and cyclists, whereas the GlobalTranslateNoise operation has parameters for the distortion magnitude of the translation on the x, y and z coordinates.

To automate the process of finding good augmentation policies for lidar point clouds, we created a new automated data augmentation algorithm, Progressive Population Based Augmentation (PPBA). PPBA builds on our previous Population Based Training (PBT) [7] work, where we train neural nets with evolutionary computation, which uses principles similar to Darwin’s theory of natural selection.
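
To make the pairing of an operation with a probability and parameters concrete, here is a minimal sketch of a GlobalTranslateNoise-style operation. The class name, field names, and the choice of Gaussian noise are assumptions for illustration only; they are not the actual implementation or its API.

```python
import random
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float, float]


@dataclass
class GlobalTranslateNoiseSketch:
    """Hypothetical stand-in for a GlobalTranslateNoise-style operation."""
    probability: float  # chance that the operation is applied at all
    std_x: float        # distortion magnitude along x (assumed Gaussian std)
    std_y: float        # distortion magnitude along y
    std_z: float        # distortion magnitude along z

    def __call__(self, points: List[Point], rng: random.Random) -> List[Point]:
        # Skip the operation with probability (1 - probability).
        if rng.random() >= self.probability:
            return list(points)
        # Draw one offset per axis and shift the whole point cloud by it.
        dx = rng.gauss(0.0, self.std_x)
        dy = rng.gauss(0.0, self.std_y)
        dz = rng.gauss(0.0, self.std_z)
        return [(x + dx, y + dy, z + dz) for x, y, z in points]


# Usage: shift a tiny point cloud with a 50% chance.
rng = random.Random(0)
op = GlobalTranslateNoiseSketch(probability=0.5, std_x=0.2, std_y=0.2, std_z=0.05)
augmented = op([(0.0, 0.0, 0.0), (1.0, 2.0, 3.0)], rng)
```

In a detection pipeline the 3D box labels would need to be shifted by the same offset as the points. The four fields here are exactly the kind of values an automated search tunes for each operation.
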
PPBA learns to optimize augmentation strategies effectively and efficiently by narrowing down the search space at each population iteration and adopting the best parameters discovered in past iterations.

Our experiments show that by applying automated data augmentation to lidar data, we can significantly improve 3D object detection without additional data collection or labeling. On the baseline 3D detection model, our method is up to 10x more data efficient than without augmentation, enabling us to train machine learning models with fewer labeled examples, or use the same amount of data for better results, at a lower cost. The increase in data efficiency is especially important as it means we can speed up the training process and improve the perception tasks of our fifth-generation Waymo Driver, enabling us to serve our Waymo Via partners and Waymo One riders more effectively and efficiently.

We look forward to continuing our work with Google Research, Brain Team, so stay tuned for more!

This collaboration between Waymo and Google was initiated and sponsored by Drago Anguelov of Waymo, Quoc Le and Jon Shlens at Google. The work was conducted by Shuyang Cheng, Chunyan Bai, Yang Song and Peisheng Li of Waymo, and Zhaoqi Leng, Ekin Dogus Cubuk, Jiquan Ngiam and Barret Zoph of Google. Extra thanks for the support of Congcong Li, Chen Wu, Ming Ji, Weiyue Wang, Zhinan Xu, Xin Zhou, James Guo, Shirley Chung, Yukai Liu, Pei Sun of Waymo, Matthieu Devin, Zhifeng Chen, Ben Caine and Vijay Vasudevan of Google and Ang Li of DeepMind.

[1] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: AutoAugment: Learning augmentation policies from data. arXiv preprint arXiv:1805.09501 (2018)

[2] Zoph, B., Cubuk, E.D., Ghiasi, G., Lin, T.Y., Shlens, J., Le, Q.V.: Learning data augmentation strategies for object detection. arXiv preprint arXiv:1906.11172 (2019)

[3] Cubuk, E.D., Zoph, B., Shlens, J., Le, Q.V.: RandAugment: Practical data augmentation with no separate search. arXiv preprint arXiv:1909.13719 (2019)

[4] Sun, P., Kretzschmar, H., Dotiwalla, X., Chouard, A., Patnaik, V., Tsui, P., Guo, J., Zhou, Y., Chai, Y., Caine, B., Vasudevan, V., Han, W., Ngiam, J., Zhao, H., Timofeev, A., Ettinger, S., Krivokon, M., Gao, A., Joshi, A., Zhang, Y., Shlens, J., Chen, Z., Anguelov, D.: Scalability in perception for autonomous driving: Waymo Open Dataset. arXiv preprint arXiv:1912.04838 (2019)

[5] Ho, D., Liang, E., Stoica, I., Abbeel, P., Chen, X.: Population based augmentation: Efficient learning of augmentation policy schedules. arXiv preprint arXiv:1905.05393 (2019)

[6] Lang, A.H., Vora, S., Caesar, H., Zhou, L., Yang, J., Beijbom, O.: PointPillars: Fast encoders for object detection from point clouds. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 12697–12705 (2019)

[7] Jaderberg, M., Dalibard, V., Osindero, S., Czarnecki, W.M., Donahue, J., Razavi, A., Vinyals, O., Green, T., Dunning, I., Simonyan, K., Fernando, C., Kavukcuoglu, K.: Population based training of neural networks. arXiv preprint arXiv:1711.09846 (2017)

[8] Cheng, S., Leng, Z., Cubuk, E.D., Zoph, B., Bai, C., Ngiam, J., Song, Y., Caine, B., Vasudevan, V., Li, C., Le, Q.V., Shlens, J., Anguelov, D.: Improving 3D Object Detection through Progressive Population Based Augmentation. arXiv preprint arXiv:2004.00831 (2020)