$\begingroup$

It has nothing to do with how the software solves the optimization problems involved in fitting the models; it has to do with the optimization problems the models actually pose.

Specifically, in large samples, you can effectively treat it as a comparison of two weighted least squares problems.

The linear model (`lm`), when unweighted, assumes that the variance of the proportions is constant. The `glm` assumes that the variance of the proportions comes from the binomial assumption, $\text{Var}(\hat{p})=\text{Var}(X/n) = p(1-p)/n$. This weights the data points differently, and so leads to somewhat different estimates* and to different variances for the differences.

* At least in some situations, though not necessarily in a straight comparison of proportions.
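
To make the weighting point concrete, here is a rough sketch in plain Python rather than R. It fits a straight line to some proportions twice: once with equal weights (the constant-variance assumption) and once with weights equal to the inverse of the binomial variance, $n/(p(1-p))$. The data are made up for illustration, the helper `weighted_line_fit` is a hypothetical name, and the weights are plugged in from the observed proportions on the identity scale. This is a simplification: a binomial `glm` with a logit link iterates the weights and works on the link scale, so treat it as an illustration of the idea, not a reproduction of what `glm` computes.

```python
def weighted_line_fit(x, y, w):
    """Weighted least squares fit of y = a + b*x; returns (a, b)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    b = sxy / sxx
    return ybar - b * xbar, b


# Made-up illustrative data: successes out of n trials at each x.
x = [0.0, 1.0, 2.0, 3.0]
n = [50, 50, 50, 50]
successes = [2, 10, 20, 35]
p_hat = [s / m for s, m in zip(successes, n)]

# Constant-variance assumption: every point gets the same weight.
w_const = [1.0] * len(x)

# Binomial assumption: weight = 1 / Var(p_hat) = n / (p(1-p)),
# with the observed proportions plugged in for p (an assumption of this sketch).
w_binom = [m / (p * (1 - p)) for m, p in zip(n, p_hat)]

fit_const = weighted_line_fit(x, p_hat, w_const)
fit_binom = weighted_line_fit(x, p_hat, w_binom)

print("equal weights    (intercept, slope):", fit_const)
print("binomial weights (intercept, slope):", fit_binom)
```

Even with equal sample sizes, the binomial weights are far from equal, because proportions near 0 or 1 have smaller variance than those near 0.5. The two fits therefore give different intercepts and slopes whenever the points do not lie exactly on a line.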