Is IEEE floating-point math deterministic? Will you always get the same results from the same inputs? The answer is an unequivocal “yes”. Unfortunately the answer is also an unequivocal “no”. I’m afraid you will need to clarify your question.

My hobby: injecting code into other processes and changing the floating-point rounding mode on some threads

The answer to this ambiguous question is of some interest to game developers. If we can guarantee determinism then we can create replays and multi-player network protocols that are extremely efficient. And indeed many games have done this. So what does it take to make your floating-point math deterministic within one build, across multiple builds, and across multiple platforms?

Before proceeding I need to clarify that floating-point determinism is not about getting the ‘right’ answer, or even the best answer. Floating-point determinism is about getting the same answer on some range of machines and builds, so that every player agrees on the answer. If you want to get the best answer then you need to choose stable algorithms, but that will not guarantee perfect determinism. If you want to learn how to get the best answer you’ll need to look elsewhere, perhaps including the rest of my floating-point series.

Determinism versus free will: cage match

The IEEE standard does guarantee some things. It guarantees more than the floating-point-math-is-mystical crowd realizes, but less than some programmers might think. In particular, the addendum to David Goldberg’s important paper points out that “the IEEE standard does not guarantee that the same program will deliver identical results on all conforming systems.” And, the C/C++ standards don’t actually mandate IEEE floating-point math.

On the other hand, the new IEEE 754-2008 standard does say “Together with language controls it should be possible to write programs that produce identical results on all conforming systems”, so maybe there is hope. They even devote all of chapter 11 to the topic, although it is just a one-and-a-half-page chapter. They don’t promise that it will be easy, and they warn that the language controls are not yet defined, but at least it is potentially possible.

What is guaranteed

Some of the things that are guaranteed are the results of addition, subtraction, multiplication, division, and square root. The results of these operations are guaranteed to be the exact result correctly rounded (more on that later) so if you supply the same input value(s), with the same global settings, and the same destination precision you are guaranteed the same result.

Therefore, if you are careful and if your environment supports your care, it is possible to compose a program out of these guaranteed operations, and floating-point math is deterministic.

What is not guaranteed

Unfortunately there are a significant number of things that are not guaranteed. Many of these things can be controlled so they should not be problems, but others can be tricky or impossible. Which is why the answer is both “yes” and “no”.

Floating-point settings (runtime)

There are a number of settings that control how floating-point math will be done. The IEEE standard mandates several different rounding modes and these are usually expressed as per-thread settings. If you’re not writing an interval arithmetic library then you’ll probably keep the rounding mode set to round-to-nearest-even. But if you – or some rogue code in your process – changes the rounding mode then all of your results will be subtly wrong. It rarely happens but because the rounding mode is typically a per-thread setting it can be amusing to change it and see what breaks. Troll your coworkers! Amuse your friends!
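As a minimal sketch of what that looks like (assuming your compiler honors fesetround here, which strictly requires #pragma STDC FENV_ACCESS ON or disabled optimizations), dividing the same two numbers under different rounding modes gives answers that differ in the last bit:

#include <cfenv>
#include <cstdio>

int main()
{
    volatile float a = 1.0f, b = 3.0f;  // volatile keeps the divisions at run time

    std::fesetround(FE_DOWNWARD);
    float low = a / b;                  // rounded toward -infinity

    std::fesetround(FE_UPWARD);
    float high = a / b;                 // rounded toward +infinity, one ULP higher

    std::fesetround(FE_TONEAREST);      // restore the default promptly
    printf("%.9g\n%.9g\n", low, high);  // prints 0.333333313 and 0.333333343
}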

If you are using the x87 floating-point unit (and on 32-bit x86 code you can’t completely avoid it because the calling conventions specify that an x87 register is used to return floating-point results) then if somebody changes the per-thread precision settings your results may be rounded to a different precision than expected.

Exception settings can also be altered, but the most likely result of such a change is a program crash, which at least has the advantage of being easier to debug.

Denormals/subnormals are part of the IEEE standard and if a system doesn’t support them then it is not IEEE compliant. But… denormals sometimes slow down calculations. So some processors have options to disable denormals, and this is a setting that many game developers like to enable. When this setting is enabled tiny numbers are flushed to zero. Oops. Yet again your results may vary depending on a per-thread mode setting.

(see That’s Not Normal–the Performance of Odd Floats for more details on denormals)
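Here is a rough sketch of the effect, assuming SSE math, where the flush-to-zero bit lives in the per-thread MXCSR register:

#include <xmmintrin.h>  // _MM_SET_FLUSH_ZERO_MODE
#include <cstdio>

int main()
{
    volatile float tiny = 1e-40f;              // a subnormal float

    float normalResult = tiny * 0.5f;          // stays subnormal with default settings

    _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);
    float flushedResult = tiny * 0.5f;         // the subnormal result is flushed to zero
    _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_OFF);

    printf("%g\n%g\n", normalResult, flushedResult);  // roughly 5e-41, then 0
}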

Rounding, precision, exceptions, and denormal support – that’s a lot of flags. If you do need to change any of these flags then be sure to restore them promptly. Luckily these settings should rarely be altered so just asserting once per frame that they are as expected (on all threads!) should be enough. There have been sordid situations where floating-point settings can be altered based on what printer you have installed and whether you have used it. If you find somebody who is altering your floating-point settings then it is very important to expose and shame them.

You can use _controlfp on VC++ to query and change the floating-point settings for both the x87 and SSE floating-point units. For gcc/clang look in fenv.h.
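A once-per-frame check on each thread might look something like this sketch; the “expected” values here are assumptions, so adjust them to whatever your project treats as normal:

#include <cassert>
#include <cfenv>
#include <xmmintrin.h>  // _MM_GET_ROUNDING_MODE, _MM_GET_FLUSH_ZERO_MODE
#include <pmmintrin.h>  // _MM_GET_DENORMALS_ZERO_MODE

void AssertFloatingPointSettingsAreSane()
{
    assert(std::fegetround() == FE_TONEAREST);
    assert(_MM_GET_ROUNDING_MODE() == _MM_ROUND_NEAREST);
    assert(_MM_GET_FLUSH_ZERO_MODE() == _MM_FLUSH_ZERO_OFF);
    assert(_MM_GET_DENORMALS_ZERO_MODE() == _MM_DENORMALS_ZERO_OFF);
}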

Composing larger expressions

Once we have controlled the floating-point settings then our next step is to take the primitive operations (+, –, *, /, sqrt) with their guaranteed results and use them to compose more complicated expressions. Let’s start small and see what we can do with the float variables a, b, and c. How about this:

a + b + c

The most well known problem with the line of code above is order of operations. A compiler could add a and b first, or b and c first (or a and c first, if it was in a particularly cruel mood). The IEEE standard leaves the order of evaluation up to the language, and most languages give compilers some latitude. If b and c happen to be in registers from a previous calculation and if you are compiling with /fp:fast or some equivalent setting then it is quite likely that the compiler will optimize the code by adding b and c first, and this will often give different results compared to adding a and b first. Adding parentheses may help. Or it may not. You need to read your compiler documentation or do some experiments to find out. I know that I have managed to fix several floating-point precision problems with VC++ by forcing a different order of evaluation using parentheses. Your mileage may vary.
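As a contrived sketch of the problem, here is a case where the two orders of evaluation give visibly different answers:

#include <cstdio>

int main()
{
    // Values chosen so that the rounding in the two groupings differs dramatically.
    volatile float a = 1.0e8f, b = -1.0e8f, c = 1.0f;

    float leftToRight = (a + b) + c;  // (1e8 - 1e8) + 1 == 1
    float rightFirst  = a + (b + c);  // -1e8 + 1 rounds back to -1e8, so the sum is 0
    printf("%g\n%g\n", leftToRight, rightFirst);
}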

Let’s assume that parentheses help, so now we have this:

(a + b) + c

Let’s assume that all of our compilers are now adding a and b and then adding c, and addition has a result guaranteed by the IEEE standard, so do we have a deterministic result now across all compilers and machines?

No.

The result of a + b is stored in a temporary destination of unspecified precision. Neither the C++ nor the IEEE standard mandates what precision intermediate calculations are performed at, and this intermediate precision will affect your results. The temporary result could equally easily be stored in a float or a double, and there are significant advantages to both options. It is an area where reasonable people can disagree. When using Visual C++ the intermediate precision depends on your compiler version, 32-bit versus 64-bit, /fp compile settings, /arch compile settings, and x87 precision settings. I discussed this issue in excessive detail in Intermediate Floating-Point Precision, or you can just look at the associated chart. Note that for gcc you can improve consistency with -ffloat-store and -fexcess-precision. For C99 look at FLT_EVAL_METHOD. As always, the x87 FPU makes this trickier by mixing run-time and compile-time controls.

The simplest example of the variability caused by different intermediate precisions comes from this expression:

printf("%1.16e", 0.1f * 0.1f);

Assuming that the calculation is done at run time the result can vary, depending on whether it is done at float or double precision – and both options are entirely IEEE compliant. A double will always be passed to printf, but the conversion to double can happen before or after the multiplication. In some configurations VC++ will insert extra SSE instructions in order to do the multiplication at double precision.
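This sketch makes both legal answers explicit by spelling out the two possible intermediate precisions by hand:

#include <cstdio>

int main()
{
    float f = 0.1f;

    // The multiplication done at double precision:
    double keptAsDouble = (double)f * (double)f;
    // The same multiplication, but with the temporary rounded to float first:
    double roundedToFloat = (float)((double)f * (double)f);

    printf("%1.16e\n%1.16e\n", keptAsDouble, roundedToFloat);
}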

The term destination in the IEEE standard explicitly gives compilers some flexibility in this area which is why all of the VC++ variants are conformant to the standard. The IEEE 2008 standard encourages compilers to offer programmers control over this with a preferredWidth attribute, but I am not aware of any C++ compilers that support this attribute. Adding an explicit cast to float may help – again, it depends on your compiler and your compilation settings.

Intermediate precision is a particularly thorny problem with the x87 FPU because it has a per-thread precision setting, as opposed to the per-instruction precision setting of every other FPU in common use. To further complicate things, if you set the x87 FPU to round to float or double then you get rounding that is almost like float/double, but not quite. This means that the only way to get predictable rounding on x87 is to store to memory, which costs performance and may lead to double-rounding errors. The net result is that it may be impossible or impractical to get the x87 FPU to give identical results to other floating-point units.

If you store to a variable with a declared format then well-behaved compilers should (so sayeth IEEE-754-2008) round to that precision, so for increased portability you may have to forego some unnamed temporaries.

fmadd

A variant of the intermediate precision problem shows up because of fmadd instructions. These instructions do a multiply followed by an add, and the full precision of the multiply is retained. Thus, these effectively have infinite intermediate precision. This can greatly increase accuracy, but this is another way of saying that it gives different results than machines that don’t have an fmadd instruction. And, in some cases the presence of fmadd can lead to worse results. Imagine this calculation:

result = a * b + c * d

If a is equal to c and b is equal to -d then the result should (mathematically) be equal to zero. And, on a machine without fmadd you typically will get a result of zero. However on a machine with fmadd you usually won’t. On a machine that uses fmadd the generated code will look something like this:

compilerTemp = c * d

result = fmadd(a, b, compilerTemp)

The multiplication of c and d will be rounded but the multiplication of a and b will not be, so the result will usually not be zero.

Trivia: the result of the fmadd calculation above should be an exact representation of the rounding error in c times d. That’s kind of cool.
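Here is a sketch of both effects using std::fma, which computes a correctly rounded fused multiply-add in software when the hardware lacks the instruction:

#include <cmath>
#include <cstdio>

int main()
{
    volatile double a = 0.1, b = 0.1, c = 0.1, d = -0.1;

    // Both products rounded separately: they cancel exactly
    // (assuming the compiler doesn't itself contract this into an fma).
    double separate = a * b + c * d;

    // What an fmadd-using compiler effectively generates: c * d is rounded,
    // a * b is kept exact inside the fma, so the tiny rounding error survives.
    double fused = std::fma(a, b, c * d);

    printf("%g\n%g\n", separate, fused);  // 0, then a tiny non-zero value
}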

While the implementation of fmadd is now part of the IEEE standard there are many machines that lack this instruction, and an accurate and efficient emulation of it on those machines may be impossible. Therefore if you need determinism across architectures you will have to avoid fmadd. On gcc this is controlled with -ffp-contract.

Square-root estimate

Graphics developers love instructions like reciprocal square root estimate. However the results of these instructions are not defined by the IEEE standard. If you only ever use these results to drive graphics then you are probably fine, but if you ever let these results propagate into other calculations then all bets are off. This was discovered by Allan Murphy at Microsoft who used this knowledge to create the world’s most perverse CPUID function:

// CPUID is for wimps:
__m128 input = { -997.0f };
input = _mm_rcp_ps(input);
int platform = (input.m128_u32[0] >> 8) & 0xf;
switch (platform)
{
    case 0x0: printf("Intel.\n"); break;
    case 0x7: printf("AMD Bulldozer.\n"); break;
    case 0x8: printf("AMD K8, Bobcat, Jaguar.\n"); break;
    default:  printf("Dunno\n"); break;
}

The estimate instructions (rcpss, rcpps, rsqrtps, rsqrtss) are, as the name suggests, not expected to give a fully accurate result. They are supposed to provide estimates with bounded errors and we should not be surprised that different manufacturers’ implementations (some mixture of tables and interpolation) give different results in the low bits.

This issue doesn’t just affect games, it is also a concern for live migration of virtual machines.

Transcendentals

The precise results of functions like sin, cos, tan, etc. are not defined by the IEEE standard. That’s because the only reasonable result to standardize would be the exact result correctly rounded, and calculating that is still an area of active research due to the Table Maker’s Dilemma. I believe that it is now practical to get correctly rounded results for most of these functions at float precision, but double is trickier. Part of the work in solving this for float was to do an exhaustive search for hard cases (easy for floats because there are only four billion of them) – those where it takes over a hundred bits of precision before you find out which way to round. In practice I believe that these instructions give identical results between current AMD and Intel processors, but on PowerPC where they are calculated in software they are highly unlikely to be identical. You can always write your own routines, but then you have to make sure that they are consistent, as well as accurate.

Update: according to this article the results of these instructions changed when the Pentium came out, and between the AMD-K6 and subsequent AMD processors. The AMD changes were to maintain compatibility with Intel’s imperfections.

Update 2: the fsin instruction is quite inaccurate around pi and multiples of pi. Because of this many C runtimes implement sin() without using fsin, and give much more accurate (but very different) results. g++ will sometimes calculate sin() at compile time, which it does extremely accurately. In 32-bit Ubuntu 12.04 with glibc 2.15 the run-time sin() used fsin, making for significant differences depending on whether sin() was calculated at run time or compile time. On Ubuntu 12.04 this code does not print 1.0:

const double pi_d = 3.14159265358979323846;
const int zero = argc / 99;
printf("%f\n", sin(pi_d + zero) / sin(pi_d));  // prints 0.999967

Per-processor code

Some libraries very helpfully supply different code for CPUs with different capabilities. These libraries test for the presence of features like SSE and then use those features if available. This can be great for performance but it adds a new option for getting different results on different CPUs. Watch for these techniques and either test to make sure they give identical results, or avoid them like the plague.
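As a hypothetical sketch of the pattern (using gcc/clang’s __builtin_cpu_supports as a stand-in for whatever detection a real library does), note that the two paths give different, equally “correct”, answers:

#include <cstdio>

// Straightforward left-to-right summation.
static float SumScalar(const float* v, int n)
{
    float total = 0.0f;
    for (int i = 0; i < n; ++i)
        total += v[i];
    return total;
}

// Pairwise summation: typically more accurate, but a *different* answer.
static float SumPairwise(const float* v, int n)
{
    if (n <= 2)
        return n == 2 ? v[0] + v[1] : (n == 1 ? v[0] : 0.0f);
    int half = n / 2;
    return SumPairwise(v, half) + SumPairwise(v + half, n - half);
}

float Sum(const float* v, int n)
{
    // Hypothetical dispatch: the "wide" path exists only on newer CPUs.
    if (__builtin_cpu_supports("avx2"))
        return SumPairwise(v, n);
    return SumScalar(v, n);
}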

Conversions

Conversion between bases – such as printf("%1.8e"); – is not guaranteed to be identical across all implementations. Doing perfect conversions efficiently was an unsolved problem when the original IEEE standard came out, and while it has since been solved, this doesn’t mean that everybody does correctly rounded printing. For a comparison between gcc and Visual C++ see Float Precision Revisited: Nine Digit Float Portability. VC++ Dev 14 improves the situation considerably (see Formatting and Parsing Correctness).

While conversion to text is not guaranteed to be correctly rounded, the values are guaranteed to round-trip as long as you print them with enough digits of precision, and this is true even between gcc and Visual C++, except where there are implementation bugs. Rick Regan at Exploring Binary has looked at this issue in great depth and has reported on double values that don’t round-trip when read with iostreams (scanf is fine, and so is the conversion to text), and troublesome values that have caused both Java and PHP to hang when converting from text to double. Great stuff.
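A sketch of the round-trip guarantee for floats (nine significant digits is enough; FLT_DECIMAL_DIG needs C11/C++17, otherwise just write 9):

#include <cfloat>
#include <cstdio>
#include <cstdlib>

int main()
{
    float original = 16777217.0f / 3.0f;  // an arbitrary awkward value

    // Print with enough digits to uniquely identify the float...
    char text[64];
    snprintf(text, sizeof(text), "%.*g", FLT_DECIMAL_DIG, original);

    // ...and read it back; the value should survive the trip exactly.
    float roundTripped = strtof(text, nullptr);
    printf("%s %s\n", text, original == roundTripped ? "round-trips" : "DIVERGED");
}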

The Universal C RunTime on Windows recently (2020) fixed some tie-break rounding bugs and in doing so they introduced some new tie-break rounding bugs, so we may never hit perfection.

So don’t use iostreams, Java, or PHP?

Uninitialized data

It seems odd to list uninitialized data as a cause of floating-point indeterminism because there is usually nothing floating-point specific about this. But sometimes there is. Imagine a function like this:

void NumWrapper::Set( T newVal )
{
    if ( m_val != newVal )
    {
        m_val = newVal;
        Notify();
    }
}

If m_val is not initialized then the first call to Set may or may not call Notify, depending on what value m_val started with. However if T is equal to float, and if you are compiling with /fp:fast, and if m_val happens to be a NaN (always possible with uninitialized data) then the comparison may say that m_val and newVal are equal, and newVal will never get set, and Notify will never be called. Yes, a NaN is supposed to compare not-equal to everything, but the always-popular /fp:fast takes away this guarantee.
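The guarantee in question, and how fast-math flags undermine it, can be sketched like this (std::nanf("") stands in for whatever garbage an uninitialized float might hold):

#include <cmath>
#include <cstdio>

int main()
{
    float garbage = std::nanf("");  // stand-in for an uninitialized m_val

    // Per IEEE, a NaN compares unequal to everything, including itself,
    // so this should print 1...
    printf("%d\n", garbage != garbage);
    // ...but under /fp:fast or -ffast-math the compiler may assume NaNs never
    // occur and fold the comparison, which is exactly what breaks Set() above.
}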

I’ve run into this bug twice in the last two years. It’s nasty. Maybe don’t use /fp:fast? But definitely avoid uninitialized data – that’s undefined behavior.

Another way that float code could be affected by uninitialized data when integer code is not is a calculation like this:

result = a + b - b;

On every machine I’ve ever used this will set result to a regardless of the value of b, for integer calculations. For floating-point calculations there are many values of b that would cause you to end up with a different result, typically infinity or NaN. I’ve never hit this bug, but it is certainly possible. Again, undefined behavior for int or float, but actual failures are more likely with floats.
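A sketch of one such “bad” value of b, with std::numeric_limits<float>::infinity() standing in for whatever an uninitialized float might happen to contain:

#include <cstdio>
#include <limits>

int main()
{
    int   aInt   = 1,    bInt   = 123456;
    float aFloat = 1.0f, bFloat = std::numeric_limits<float>::infinity();

    printf("%d\n", aInt + bInt - bInt);        // prints 1, as expected
    printf("%f\n", aFloat + bFloat - bFloat);  // inf - inf gives NaN, not 1
}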

Compiler differences

There are many compiler flags and differences that might affect intermediate precision or order of operations. Some of the things that could affect results include:

Debug versus release versus levels of optimization

x86 versus x64 versus PowerPC

SSE versus SSE2 versus x87

gcc versus Visual C++ versus clang

/fp:fast versus /fp:precise

-ffp-contract, -ffloat-store, and -fexcess-precision

FLT_EVAL_METHOD

Compile-time versus run-time calculations (such as sin())

With jitted languages such as C# the results may also vary depending on whether you launch your program under a debugger, and your code could potentially be optimized while it’s running so that the results subtly change.

Other sources of non-determinism

Floating-point math is a possible source of non-determinism, but it is certainly not the only one. If your simulation frames have variable length, if your random number generators don’t replay correctly, if you have uninitialized variables, if you use undefined C++ behavior, if you allow timing variations or thread scheduling differences to affect results, then you may find that determinism fails for you, and it’s not always the fault of floating-point math. For more details on these issues see the resources section at the end.

Summary

A lot of things can cause floating-point indeterminism. How difficult determinism is to achieve depends on how far you need it to extend: within one build, across rebuilds as you maintain your code, or across completely different platforms. The stronger your needs, the more difficult and costly the engineering effort will be.

Some people assume that if you use stable algorithms then determinism doesn’t matter. They are wrong. If your network protocol or save game format stores only user inputs then you must have absolute determinism. An error of one ULP (Unit in the Last Place) won’t always matter, but sometimes it will make the difference between a character surviving and dying, and things will diverge from there. You can’t solve this problem just by using epsilons in your comparisons.

If you are running the same binary on the same processor then the only determinism issues you should have to worry about are:

Altered FPU settings for precision (x87 only), rounding, or denormal control

Uninitialized variables (especially with /fp:fast)

Non floating-point specific sources of indeterminism

The details of how your compiler converts your source code to machine code are irrelevant in this case because every player is running the same machine code on the same processor. This is the easiest type of floating-point determinism and, absent other problems with determinism, it just works.

That’s easy enough, but unless you are running on a console you probably have to deal with some CPU variation. If you are running the same binary but on multiple processor types – either different manufacturers or different generations – then you also have to worry about:

Different execution paths due to CPU feature detection

Different results from sin, cos, estimate instructions, etc.

That’s still not too bad. You have to accept some limitations, and avoid some useful features, but the core of your arithmetic can be written without thinking about this too much. Again, the secret is that every user is executing exactly the same instructions, and you have restricted yourself to instructions with defined behavior, so it just works™.

If you are running a different binary then things start getting sticky. How sticky they get depends on how big a range of compilers, compiler settings, and CPUs you want to support. Do you want debug and release builds to behave identically? PowerPC and x64? x87 and SSE (god help you)? Gold master and patched versions? Maintaining determinism as you change the source code can be particularly tricky, and increasing discipline will be required. Some of the additional things that you may need to worry about include:

Compiler rules for generating floating-point code

Intermediate precision rules

Compiler optimization settings and rules

Compiler floating-point settings

Differences in all of the above across compilers

Different floating-point architectures (x87 versus SSE versus VMX and NEON)

Different floating-point instructions such as fmadd

Different float-to-decimal routines (printf) that may lead to different printed values

Buggy decimal-to-float routines (iostreams) that may lead to incorrect values

If you can control these factors – some are easy to control and some may involve a lot of work – then floating-point math can be deterministic, and indeed many games have been shipped based on this. Then you just need to make sure that everything else is deterministic and you are well on your way to an extremely efficient replay and networking mechanism.

It turns out that there are some parts of your floating-point code that can be non-deterministic and can make use of, for instance, square-root estimate instructions. If you have code that is just driving the GPU, and if the GPU results never affect gameplay, then variations in this code will not lead to divergence.

When debugging the problems associated with a game engine that must be deterministic, remember that despite all of its mysteries there is some logic to floating-point, and in many cases a loss of determinism is actually caused by something completely different. If your code diverges when running the same binary on identical processors then, unless you’ve got a suspicious printer driver, you might want to look for bugs elsewhere in your code instead of always blaming IEEE-754.

Resources

1500 Archers on a 28.8: Network Programming in Age of Empires and Beyond: http://www.gamasutra.com/view/feature/3094/1500_archers_on_a_288_network_.php/

Synchronous RTS Engines and a Tale of Desyncs: http://forrestthewoods.com/synchronous-rts-engines-and-a-tale-of-desyncs/

Synchronous RTS Engines 2: Sync Harder: http://forrestthewoods.com/synchronous-rts-engines-2-sync-harder/

The Tech of Planetary Annihilation (does not require determinism): http://forrestthewoods.com/the-tech-of-planetary-annihilation-chronocam/

A summary of floating-point determinism resources: http://gafferongames.com/networking-for-game-programmers/floating-point-determinism/

An excellent discussion of the challenges of getting your compiler to behave: http://www.yosefk.com/blog/consistency-how-to-defeat-the-purpose-of-ieee-floating-point.html

.NET and floating-point determinism: http://blogs.msdn.com/b/shawnhar/archive/2009/03/25/is-floating-point-math-deterministic.aspx

On floating-point determinism: http://yosoygames.com.ar/wp/2013/07/on-floating-point-determinism/

IEEE 754-2008 standard available for purchase here: http://ieeexplore.ieee.org/servlet/opac?punumber=4610933

Comparing floats – tricky, but deterministic: https://randomascii.wordpress.com/2012/02/25/comparing-floating-point-numbers-2012-edition/

Some ramblings on trying to do really precise mathematics at high speed: https://randomascii.wordpress.com/2012/03/28/fractal-and-crypto-performance/

(from twitter) – the pitfalls of verifying floating-point computations: http://hal.archives-ouvertes.fr/docs/00/28/14/29/PDF/floating-point-article.pdf

Lots of comments from reddit – some from people who read the article: http://www.reddit.com/r/programming/comments/1ih91g/will_a_floating_point_operation_always_evaluate/

A detailed analysis of The Pitfalls of Verifying Floating-Point Computations (recommended): http://arxiv.org/abs/cs/0701192

A discussion of inaccuracies in fsin and how they can lead to radically different values for compile-time and run-time invocations of sin(): Intel Underestimates Error Bounds by 1.3 quintillion

Homework

Explain clearly why printf("%1.16e", 0.1f * 0.1f); can legally print different values, and how that behavior applies to 0.1f * 0.3f * 0.7f.

For extra credit, explain why a float precision version of fmadd cannot easily be implemented on processors that lack it by using double math (or alternately, prove the opposite by implementing fmadd).