$\begingroup$

I agree that the explanation is a bit unclear, and I hope I have understood it correctly. The second term on the right-hand side of equation 2 does not depend on the data vector, so it is already a constant with respect to the data distribution $Q^0$. That is,

$ \sum_c p(c| \theta_1, \dots, \theta_m) \frac{\partial \log f_m(c| \theta_m)}{\partial \theta_m} = E_{Q^\infty}[\frac{\partial \log f_m(X| \theta_m)}{\partial \theta_m}]$.

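The sum over $c$ has already averaged over the model's equilibrium distribution $Q^\infty$, so there is no stochasticity left in the term above. Hence taking its expectation with respect to $Q^0$ returns the same term. I guess that is why they dropped the expectation sign with respect to $Q^0$.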
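
To spell out that last step, here is a short sketch. The symbols $g_m$ and $Q^0(d)$ (the probability of a data vector $d$ under the data distribution) are my own shorthand, not notation from the paper, and I assume discrete data so that the expectation is a sum. Write the term above as

$ g_m = \sum_c p(c| \theta_1, \dots, \theta_m) \frac{\partial \log f_m(c| \theta_m)}{\partial \theta_m} $.

Since $d$ does not appear anywhere in $g_m$, it can be pulled out of the average over the data:

$ E_{Q^0}[g_m] = \sum_d Q^0(d) \, g_m = g_m \sum_d Q^0(d) = g_m $,

because the probabilities $Q^0(d)$ sum to one. For continuous data the same argument should go through with the sum over $d$ replaced by an integral.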