$\begingroup$

I am reading this paper and am having a hard time understanding one of the derivations. It is probably more of a statistics question. The context: given three random variables $x,y,z$, we want to define the ELBO under two conditions: when only $z$ is latent (Eq. 6), and when both $y$ and $z$ are latent (Eq. 7). The first case is:

Eq.6: $-\mathcal{L}(x,y) = \mathop{\mathbb{E}}_{q_{ \phi}(z|x,y)}[\log P_\theta(x|y,z)+\log P_\theta(y)+\log P(z) - \log q_{ \phi}(z|x,y)]$

and the second case is:

Eq.7: $\mathop{\mathbb{E}}_{q_{ \phi}(y,z|x)}[\log P_\theta(x|y,z)+\log P_\theta(y)+\log P(z) - \log q_{ \phi}(y,z|x)]$

The difference between the two is that in the second equation $y$ is assumed to be latent, while in the first $y$ is observed. Now, based on these two equations, they state:

$\mathop{\mathbb{E}}_{q_{ \phi}(y,z|x)}[\log P_\theta(x|y,z)+\log P_\theta(y)+\log P(z) - \log q_{ \phi}(y,z|x)] \\= \sum_y q_{\phi}(y|x)(-\mathcal{L}(x,y)) + \mathcal{H}(q_{\phi}(y|x))$

which I really can't derive from the two equations. I tried expanding the RHS ($\sum_y q_{\phi}(y|x)(-\mathcal{L}(x,y)) + \mathcal{H}(q_{\phi}(y|x))$) to recover the LHS of Eq. 7, but without success so far.
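
For reference, here is a sketch of the expansion I would expect to work. It rests on an assumption that is not stated in the equations above: that the joint posterior factorizes as $q_{\phi}(y,z|x) = q_{\phi}(y|x)\,q_{\phi}(z|x,y)$ and that $y$ is discrete. The shorthand $f(x,y,z) = \log P_\theta(x|y,z)+\log P_\theta(y)+\log P(z)$ is my own, introduced only to keep the lines short.

$$
\begin{aligned}
\mathop{\mathbb{E}}_{q_{\phi}(y,z|x)}\big[f(x,y,z) - \log q_{\phi}(y,z|x)\big]
&= \sum_y q_{\phi}(y|x)\,\mathop{\mathbb{E}}_{q_{\phi}(z|x,y)}\big[f(x,y,z) - \log q_{\phi}(z|x,y) - \log q_{\phi}(y|x)\big] \\
&= \sum_y q_{\phi}(y|x)\,\mathop{\mathbb{E}}_{q_{\phi}(z|x,y)}\big[f(x,y,z) - \log q_{\phi}(z|x,y)\big] - \sum_y q_{\phi}(y|x)\log q_{\phi}(y|x) \\
&= \sum_y q_{\phi}(y|x)\,(-\mathcal{L}(x,y)) + \mathcal{H}(q_{\phi}(y|x))
\end{aligned}
$$

The second line uses the fact that $\log q_{\phi}(y|x)$ does not depend on $z$, so it can be pulled out of the inner expectation; the last line identifies the inner expectation with Eq. 6 and the remaining sum with the entropy of $q_{\phi}(y|x)$. If the paper's posterior does not factorize this way, this sketch would not apply as written.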