In modern probabilistic learning, we often wish to perform automatic inference for Bayesian models. However, informative priors are often costly to elicit, and as a consequence, flat priors are chosen in the hope that they are reasonably uninformative. Yet objective priors such as the Jeffreys and Reference priors would often be preferred over flat priors if deriving them were generally tractable. We overcome this problem by proposing a black-box learning algorithm for Reference prior approximations. We derive a lower bound on the mutual information between data and parameters and describe how its optimization can be made derivation-free and scalable via differentiable Monte Carlo expectations. We experimentally demonstrate the method's effectiveness by recovering Jeffreys priors and learning the Variational Autoencoder's Reference prior.