Make the memory dependence queries in the memcpy optimizer nonlocal.

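Dependencies are now also looked up across basic block boundaries
instead of only within the current block. This eliminates 28% of the
llvm.memcpy calls in librustc.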

Fixes PR28958.