Optimal DNN Primitive Selection with Partitioned Boolean Quadratic Programming Andrew Anderson, David Gregg International Symposium on Code Generation and Optimization

Deep Neural Networks (DNNs) require very large amounts of computation, and many different algorithms have been proposed to implement their most expensive layers, each of which has a large number of variants with different trade-offs of parallelism, locality, memory footprint, and execution time. In addition, specific algorithms operate much more efficiently on specialized data layouts.

We state the problem of optimal primitive selection in the presence of data layout transformations, and show that it is NP-hard by demonstrating an embedding in the Partitioned Boolean Quadratic Assignment problem (PBQP). We propose an analytic solution via a PBQP solver, and evaluate our approach experimentally by optimizing several popular DNNs using a library of more than 70 DNN primitives, on an embedded platform and a general purpose platform. We show experimentally that significant gains are possible versus the state of the art vendor libraries by using a principled analytic solution to the problem of primitive selection in the presence of data layout transformations.

(preprint, slides, artifacts)