$\begingroup$

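Because in a Bayesian analysis you start from the likelihood and the priors, and their product directly gives you the unnormalized joint distribution. Let $\Theta$ be the set of all parameters. Since the likelihood is only required to be proportional to the sampling density $p\left(y|\Theta\right)$, not equal to it, we have $$L\left(\Theta|y\right)p\left(\Theta\right) \propto p\left(y|\Theta\right)p\left(\Theta\right) = p\left(y,\Theta\right).$$ To get the posterior you would have to normalize: $p\left(\Theta|y\right) = p\left(y,\Theta\right)/p\left(y\right)$, where $p\left(y\right) = \int p\left(y,\Theta\right)\,d\Theta$. Integrating out $\Theta$ is often hard, but $p\left(y\right)$ does not depend on $\Theta$, so $p\left(\Theta|y\right) \propto p\left(y,\Theta\right)$ as a function of $\Theta$. This is why it's more convenient to work with $p\left(y,\Theta\right)$ than with $p\left(\Theta|y\right)$.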

In short, sometimes you won't be able to do the integration over $\Theta$ in closed form, while the joint can always be written down.
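
To illustrate why the joint is enough in practice, here is a minimal sketch of a random-walk Metropolis sampler that only ever evaluates the unnormalized log joint. The model (a $N(0,1)$ prior on a single parameter and $N(\theta,1)$ observations), the data, and the names `log_joint` and `metropolis` are all assumptions made up for this example. I picked this model only because its exact posterior mean, $\sum_i y_i/(n+1)$, is known and can be used as a check.

```python
import math
import random


def log_joint(theta, y):
    """Unnormalized log p(y, theta) for the assumed toy model.

    Prior: theta ~ N(0, 1). Likelihood: y_i ~ N(theta, 1).
    Constants that do not depend on theta are dropped on purpose.
    """
    log_prior = -0.5 * theta ** 2
    log_lik = -0.5 * sum((yi - theta) ** 2 for yi in y)
    return log_prior + log_lik


def metropolis(y, n_steps=20000, step=0.5, seed=0):
    """Random-walk Metropolis using only differences of log_joint."""
    rng = random.Random(seed)
    theta = 0.0
    current = log_joint(theta, y)
    draws = []
    for _ in range(n_steps):
        proposal = theta + rng.gauss(0.0, step)
        candidate = log_joint(proposal, y)
        # The acceptance ratio is p(y, proposal) / p(y, theta);
        # the normalizing constant p(y) would cancel, so it is never needed.
        accept_prob = math.exp(min(0.0, candidate - current))
        if rng.random() < accept_prob:
            theta, current = proposal, candidate
        draws.append(theta)
    return draws


y = [1.2, 0.8, 1.5, 0.9, 1.1]  # made-up data
draws = metropolis(y)
kept = draws[2000:]  # discard an arbitrary burn-in period

posterior_mean = sum(kept) / len(kept)
exact_mean = sum(y) / (len(y) + 1)  # conjugate normal-normal result
print(posterior_mean, exact_mean)
```

The sampler never computes $p\left(y\right)$: only the difference of two log joint values enters the acceptance step, which is the sense in which working with $p\left(y,\Theta\right)$ avoids the integration.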