(undodb-gdb) print pos $2 = 3559 // location of the difference in the video buffer (undodb-gdb) print curframe $3 = 114 // frame number that is different (undodb-gdb) print vid.buffer[3559] $4 = 30 '\036' // Here's the value of that pixel with GNU libm // The value is 159 with openlibm

(undodb-gdb) print vid.buffer[3559] $5 = (pixel_t *) 0x20c7987 "\036... // location of the difference (undodb-gdb) watch *(char*) 0x20c7987 Hardware watchpoint 2: *(char*) 0x20c7987 (undodb-gdb) bcont // Run backwards with "bcont"

Hardware watchpoint 2: *(char*) 0x20c7987 Old value = 30 '\036' // undodb stops at the point New value = 46 '.' // where the byte is written 0x00007fb100377826 in D_DrawSpans16 (pspan=0x7ffd0334a030) at /home/jwhitham/Quake-2/headless_quake2/linux/../ref_soft/r_scan.c:514 514 *pdest++ = *(pbase + (s >> 16) + (t >> 16) * cachewidth); (undodb-gdb) print s $6 = 4915200 // Different! With openlibm I get 4915199 (undodb-gdb) print t $7 = 1188545 // Also different. With openlibm I get 1188546

(undodb-gdb) print d_sdivzstepu $10 = 0.00160917651 // Different. With openlibm I get 0.00160917663 (undodb-gdb) delete Delete all breakpoints? (y or n) y (undodb-gdb) watch d_sdivzstepu Hardware watchpoint 4: d_sdivzstepu (undodb-gdb) bcont // run backwards (2nd time) Hardware watchpoint 4: d_sdivzstepu Old value = 0.00160917651 // Here's where d_sdivzstepu was assigned New value = 0.00191315648 0x00007fb100361fa6 in D_CalcGradients (pface=0x7fb0ff0cd204) at /home/jwhitham/Quake-2/headless_quake2/linux/../ref_soft/r_edge.c:817 817 d_sdivzstepu = p_saxis[0] * t;

(undodb-gdb) print t $16 = 0.00249999994 // This is the same (undodb-gdb) print p_saxis[0] $17 = 0.643670619 // With openlibm I get 0.643670678 (undodb-gdb) delete Delete all breakpoints? (y or n) y (undodb-gdb) watch p_saxis[0] // Run backwards again to find why... Hardware watchpoint 5: p_saxis[0] (undodb-gdb) bcont // 3rd call to bcont Hardware watchpoint 5: p_saxis[0] Old value = 0.643670619 // Here's the source of p_saxis[0] New value = 5.30636976e-37 0x00007fb100369025 in TransformVector (in=0x7fb0ff01d724, out=0x7ffd033490c0) at /home/jwhitham/Quake-2/headless_quake2/linux/../ref_soft/r_misc.c:219 219 out[0] = DotProduct(in,vright); (undodb-gdb) print in[0] $18 = 1 // Look at the inputs: (undodb-gdb) print in[1] // in[0], in[1], in[2] are same... $19 = 0 (undodb-gdb) print in[2] $20 = 0 (undodb-gdb) print vright // but vright is different. // with GNU libm {0.643670678, 0.765262604, 0.00783333182} $21 = {0.643670619, 0.765262604, 0.00783333182} // with openlibm

(undodb-gdb) delete Delete all breakpoints? (y or n) y (undodb-gdb) watch vright[0] Hardware watchpoint 6: vright[0] (undodb-gdb) bcont // 4th time Hardware watchpoint 6: vright[0] Old value = 0.643670619 New value = 0.643877447 0x00007fb100379afc in AngleVectors (angles=0x7fb1005cb4c4, forward=0x7fb1005b7ad8, right=0x7fb1005cadb4, up=0x7fb1005cb418) at /home/jwhitham/Quake-2/headless_quake2/linux/../game/q_shared.c:117 117 right[0] = (-1*sr*sp*cy+-1*cr*-sy); (undodb-gdb) print sr $22 = -0.00785390008 // same as openlibm (undodb-gdb) print sp $23 = 0.0723257735 // same as openlibm (undodb-gdb) print cy $24 = -0.764920235 // same as openlibm (undodb-gdb) print cr $25 = 0.999969184 // same as openlibm (undodb-gdb) print sy $26 = 0.644124985 // different! openlibm had 0.644125044 (undodb-gdb) delete Delete all breakpoints? (y or n) y (undodb-gdb) watch sy Hardware watchpoint 7: sy (undodb-gdb) bcont // run backwards again (5th time) Hardware watchpoint 7: sy Old value = 0.644124985 New value = 0 0x00007fb10037995a in AngleVectors (angles=0x7fb1005cb4c4, forward=0x7fb1005b7ad8, right=0x7fb1005cadb4, up=0x7fb1005cb418) at /home/jwhitham/Quake-2/headless_quake2/linux/../game/q_shared.c:100 100 sy = sinf(angle); (undodb-gdb) print angle $27 = 2.44171381 // same as openlibm

Footnotes

* Overall running times are a median of 19 executions. Frame timings are overall running times divided by 17000 frames.

** I can't resist a comparison with a tool I helped develop.

Rapita 's RapiCover tool: running time 22.0s, or 1.29ms per frame. This time value is for statement and decision coverage, without using any specialised instrumentation technique to reduce overheads.

Reversible debugging helps me at work, but I can't go into detail about that, so I'll instead explain how I used UndoDB's 30 day free trial on my home computer. As a hobby project, I have been working on a benchmark based on Quake 2; earlier describe it. During this work I hit a number of issues related to floating-point numbers. For instance, a debug (-O0) build would produce one result for a certain floating-point function, and a release (-O2) build would produce a different result. The difference could be seen in the 3D graphics rendered by the game. Understanding and resolving these problems was the main issue in developing the benchmark.I could only work on it in my spare time, and I don't have much of that. So I needed a way to quickly go from an observed difference in some rendered pixel back to the cause of that difference.It is possible to do this with a traditional debugger, but the process is tiresome. You can use a watchpoint to find where a certain variable is assigned, but if it's assigned many times, then you have to find the relevant assignment within a haystack of possibilities. So you have to use counters and conditional breakpoints, and each time you want to move backwards, you have to add new counters and new breakpoints and rerun the program. It is error-prone and slow. If only the debugger could automate it!That's where a reversible debugger helps. You can set breakpoints and watchpoints, and then rewind to them.It seems this works for any Linux application; it doesn't have to be compiled in a special way, and you don't have to modify the Linux kernel - or even run anything as root.I think the best way to illustrate the potential is using a transcript of a debugging session. In the following example, I am trying to find why a debug build of Quake 2 using the GNU math library (libm) has produced a different output to a debug build using the openlibm math library. The code is the same; all that differs is that library. The session shown here uses GNU libm. In parallel, I ran another session using openlibm, and entered the same commands to look for differences.I begin at the point where a difference between the openlibm and GNU libm versions of Quake 2 is detected by some test code. The difference is seen in the rendered video output. The test code found a difference at pixel number 3559:Now I find the location of that different byte in memory, set a watchpoint at that location, and then run backwards to find where the value came from.Execution stops at the point where it was last written. I look at the expression that generated the value, to see where it got its inputs from.Having found two differences, I look through the source code to see how s and t are calculated. I find that s is derived from a global variable, d_sdivzstepu. I set a watchpoint to find out where that value came from, and again, I run backwards:Once again I look at the inputs to the expression, looking for a difference between the two runs:vright[0] is different, so yet again, I run backwards to find out why:And here's the reason for the difference. The " sinf " function is implemented differently in the two math libraries. With GNU libm, sinf(2.44171381) = 0.644124985. With openlibm, sinf(2.44171381) = 0.644125044. It's a difference in rounding - and though it is a very small difference, it still affects the output drawn on the screen. This is why I bundled openlibm with my Quake 2 benchmark. Both answers are valid - the difference is only in the least-significant bit of the floating-point number returned by sinf - but I need the behaviour of the program to be completely consistent, and unaffected by system library versions.As the debugger transcript shows, in order to go from a visibly different pixel to the subtly different output from sinf, I had to run backwards five times. It would have taken at least five runs of Quake 2, and probably more, to find the source of the difference by hand. I would have needed to add counters to the program in order to stop in the correct places, since all of this code is executed repeatedly during each frame. The process would be error-prone and slow. I would need to write plenty of notes in order to avoid becoming lost in the program. I would expect the whole process to take several hours, assuming no mistakes are made.As it is, I was able to do all of this in a few minutes.Since I've been building a benchmark program, I will add something about tool performance.On an Intel i5-2500 PC, when compiled for AMD64 with GCC 4.7.2 and -O2, the Quake 2 benchmark running time is 20.3s, or 1.19ms per frame*. If I run the benchmark inside UndoDB (version 4.1.3840), the running time increases to 106.9s, or 6.29ms per frame. This is a slowdown, but put it into perspective by comparing with other development tools**! For instance, valgrind (version 3.7) takes 590.4s (34.7ms per frame). And with GDB version 7.6, when running in the "record" mode required for reversible debugging, the same PC will take an astonishing 34to record a. The overall running time will be approximately 400This great difference in performance does come with a caveat. As you'd expect, UndoDB's performance when running forwards comes at the cost of speed when running backwards, and a search for an event that occurred a long time ago can be rather slow. But I am talking about slowness on the "valgrind" scale rather than the "GDB" scale. You may have to wait a minute; you won't have to wait a day. As with forwards-only debugging, you can speed up the search by avoiding conditional breakpoints.I use UndoDB at work, and I have considered buying a licence for use at home, but at this time, it's sold only as a tool for business use and the licencing fees reflect that. However, at work, the tool has already paid for itself. If you write software for a living, and you can run that software on Linux, then it's worth having a look at the free trial.