$\begingroup$

Variational autoencoders encourage the model to generalize features and reconstruct images as an aggregation of those features. This is what the latent space encodes, a compressed feature vector.

Vanilla autoencoders memorize the input and map to the output without the generalization. If you want to extrapolate from your dataset, variational is the way to go.