Ahah! Apple does have different math!

From: Tom Christiansen

Date: March 17, 2009 01:34

Subject: Ahah! Apple does have different math!

Message ID: 26423.1237278850@chthon

March 17, 2009 01:34Ahah! Apple does have different math!

As some of you know, I've been wrestling with programs giving different numeric output when run on the MacBookPro than they do when run other places, and not understanding why/why-not. Damian managed to dig this up for me: http://www.mail-archive.com/gcc-bugs@gcc.gnu.org/msg208162.html Which seemed pretty familiar to my troubles, but there wasn't any action on it. Well, I've learned some things--much more than ever I wanted to. I figured I'd share this with the lot of you, especially since I can't help but wonder whether this doesn't affect the Perl build and test suites, etc. I was *sure* that MacOS did something different with floats than other x86 processors did, and it *is* so! One need but look at the comments in <float.h> to see that this truly DELIGHTFUL tidbit is true: Note: There are actually two physical floating point environments on x86. There is the one described by the x87 floating point control and status words, which applies primarily to calculations done with long double on MacOS X for Intel. There is the MXCSR which applies primarily to calculations done with scalar float, scalar double and SSE/SSE2/SSE3. The high level interface, which uses FE_ macros as int arguments So I'm not crazy, after all. (Ok, so this is evidence, not proof. :-) It's very hard (next to impossible) to guarantee the same code will do the same thing somewhere else. This is... troubling. Here now is proof positive that something is different on MacOS FP. The same trivial program, compiled with identical options, when run there says: Trying compiler constants first... f is 0.12345, rounded to 0.1235, expanded to 0.12345000356435775756835937500000000000000000000000000 d is 0.12345, rounded to 0.1235, expanded to 0.12345000000000000417443857259058859199285507202148438 ld is 0.12345, rounded to 0.1235, expanded to 0.12345000000000000000026020852139652106416178867220879 Now trying derived values... f is 0.12345, rounded to 0.1234, expanded to 0.12344999611377716064453125000000000000000000000000000 d is 0.12345, rounded to 0.1235, expanded to 0.12345000000000000417443857259058859199285507202148438 ld is 0.12345, rounded to 0.1235, expanded to 0.12345000000000000000026020852139652106416178867220879 While on a non-Apple machine, says this: Trying compiler constants first... f is 0.12345, rounded to 0.1235, expanded to 0.12345000356435775756835937500000000000000000000000000 d is 0.12345, rounded to 0.1235, expanded to 0.12345000000000000417443857259058859199285507202148438 ld is 0.12345, rounded to 0.1235, expanded to 0.12345000000000000417443857259058859199285507202148438 Now trying derived values... f is 0.12345, rounded to 0.1234, expanded to 0.12344999611377716064453125000000000000000000000000000 d is 0.12345, rounded to 0.1235, expanded to 0.12345000000000000417443857259058859199285507202148438 ld is 0.12345, rounded to 0.1235, expanded to 0.12345000000000000417443857259058859199285507202148438 [Program after sig.] Those are different numbers, different behaviors, and different sorts of "rounding". You will (almost) never get the floating point-number 0.12345 (actually, there *is* no such thing: not a reciprocal power of 2, you know) to "round towards even", because of the garbage data far out down the line, the way things get (or don't) get put into registers or coprocessors, and much else. You're at the mercy of very subtle forces largely beyond your control and understanding unless you put a *lot* of time into it. No matter what you think your rounding strategy is, stuff way out in never- never land can put you a little under or a little over what it takes to do whatever you've defined the right thing to be, and therefore, sometimes, you just aren't going to get it. Yes, I know: in double precision, or 8-byte floating point, we have only 15 decimal digits in the significand. That's all there is, and you can't wheedle or whine, weasel or whimper any more out them. That what is all that GUNK? And why are they affecting me? I believe the most polite and gentlemanly way to refer to those digits in excess of what the standard guarantees is to call them "garbage". But I can think of plenty of other words for them, and I'm sure you can, too. It seems like the only way to get reliable behaviour is to pay the big CPU-bucks for cycles and abandon all float ye who enter: $ perl -Mbigrat -le 'print(12345 / 100_000)' 2469/20000 $ perl -Mbignum=p,-65 -le 'print(12345 / 100_000)' 0.12345000000000000000000000000000000000000000000000000000000000000 Did you realize that there seem to be about 7 floating-point types out there. *SEVEN* And I'm not counting Vax stuff or old stuff, either. Did you EVER have ANY idea? I sure didn't! And it gets better, because *how* these map to a float, a double, and a long double varies *considerably* amongst environments. Digits of C Type Byte Mantissa's Precision Name Size base-10 base-2 short float 2 ~3 11 float 4 6 24 double 8 ~15 53 long double: 80bit ext prec 12 or 16 18 64 binary128 16 ~34 113 Furthermore, many sources concur that long doubles--which could be quite a few different things, as you see--aren't necessarily expected nor guaranteed to behave as doubles do. Some of this is some FP reps have odd numbers of (binary) digits in their mantissa, and others have even numbers of bits. This seems pretty certain to cause otherwise identical values to round differently. For example, regarding the true quad-precision floats that IEEE 754-2008 defines, like on a PowerPC: By default, 1/3 rounds down like double precision, because of the odd number of bits in the significand. So the bits beyond the rounding point are 0101... which is less than 1/2 of a unit in the last place. Apple has plenty of its own quirks, ones you'd never normally notice. A big one seems to be what I started this message says: that MacOS on x86 uses something called x87 floating-point, while everybody else on x86 uses standard x86 FP math. But wait! There's more. On a PowerPC, a long double gets you a quad-precision "binary128" float, with 113 base-2 digits of mantissa, or ~34 decimal digits. But on x86 the same code gets merely the 80-bit "extended precision" type, only 3 decimal digits better than the old 8-byte doubles. (Painfully, these 10-byte floats never take up 10-bytes. they want 12 bytes on a 32-bit addressable system, and 16 bytes on a 64-bit one!) And sometimes these can cost you. Here's a comment from an #include file: fma() *function call* is more costly than equivalent (in-line) multiply and add operations For single and double precision, the cost isn't too bad, because we can fall back on higher precision hardware, with the necessary range to handle infinite precision products. However, expect the long double fma to be at least an order of magnitude slower than a simple multiply and an add. There are other things, too, like hardware registers with a lot more precision than you ask for; eg, imagine if all FP math were done in 80-bit FP registers, no matter what. Usually it is, in fact. Compare, and watch what happens when you try to get double-precision floating-point to produce a longer string of correct answers--and notice the one and only answer of all these which is EXACTLY CORRECT: $ perl -Mbignum=p,-65 -le 'print(8/3)' 2.66666666666666666666666666666666666666666666666666666666666666667 $ perl -e 'printf("%.15f

", (8/3))' 2.666666666666667 $ perl -e 'printf("%.65f

", (8/3))' 2.66666666666666651863693004997912794351577758789062500000000000000 $ perl -Mbignum=p,-65 -le 'print(3/7)' 0.42857142857142857142857142857142857142857142857142857142857142857 $ perl -e 'printf("%.15f

", (3/7))' 0.428571428571429 $ perl -e 'printf("%.65f

", (3/7))' 0.42857142857142854763807804374664556235074996948242187500000000000 $ perl -Mbigrat -le 'print(3/7 + 5/8 + 8/3)' 625/168 $ perl -Mbignum=p,-65 -le 'print(3/7 + 5/8 + 8/3)' 3.72023809523809523809523809523809523809523809523809523809523809524 $ perl -e 'printf("%.15f

", (3/7 + 5/8 + 8/3))' 3.720238095238095 $ perl -e 'printf("%.65f

", (3/7 + 5/8 + 8/3))' 3.72023809523809489974155439995229244232177734375000000000000000000 I'm amazed that the math tests in Perl's regression test suite pass as well as they do. (Or maybe they aren't so persnickety?) Isn't that just... nifty? --tom PLATFORMS: #1 [Intel(R) Pentium(R) 4 CPU 2.53GHz ("GenuineIntel" 686-class) flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe ] OpenBSD 4.4 GENERIC#0 i386 gcc (GCC) 3.3.5 (propolice) #2 [MacBookPro] Darwin Kernel Version 9.6.0: Mon Nov 24 17:37:00 PST 2008; root:xnu-1228.9.59~1/RELEASE_I386 i386 i686-apple-darwin9-gcc-4.0.1 (GCC) 4.0.1 (Apple Inc. build 5465) BUILD OPTIONS: $ gcc -fsingle-precision-constant -ffast-math -ffloat-store -o nums nums.c && ./nums And yes, you *can* tweak those and get different answers, differently, on different machines. It doesn't change the underlying problem of things not being as they appear, or rather, being different in different places. PROGRAM: nums.c #include <stdlib.h> #include <stdio.h> /* * * 0.12345 is rationally: * * 3 * 5 * 823 * ---------- * 10 ** 5 * * Except, that's no power-of-2 reciprocal, so cannot be * exactly represented in floating-point, nor can you predict * *how* it shall be represented. So HAVE A NICE DAY! * */ main() { float f = 0.12345 ; double d = 0.12345l; long double ld = 0.12345L; printf("Trying compiler constants first...



"); printf("f is %.5f, rounded to %.4f, expanded to %.53f

", f, f, f); printf("d is %.5lf, rounded to %.4lf, expanded to %.53lf

", d, d, d); printf("ld is %.5Lf, rounded to %.4Lf, expanded to %.53Lf

", ld, ld, ld); f = 12345.0 / 100000; d = 12345.0l / 100000; ld = 12345.0L / 100000; printf("

Now trying derived values...



"); printf("f is %.5f, rounded to %.4f, expanded to %.53f

", f, f, f); printf("d is %.5lf, rounded to %.4lf, expanded to %.53lf

", d, d, d); printf("ld is %.5Lf, rounded to %.4Lf, expanded to %.53Lf

", ld, ld, ld); exit(0); }



