Fibonacci: You're Also Doing It Wrong

Ear­li­er to­day, I saw C++ Week­ly Ep 13 Fi­bonac­ci: You're Do­ing It Wrong and in my opin­ion the sug­ges­tion made there to use Bi­net's for­mu­la to cal­cu­late ar­bi­trary el­e­ments of the Fi­bonac­ci se­quence is dan­ger­ous. And as some­body is wrong on the in­ter­net I de­cid­ed to write this short blog post on the top­ic.

At the first glance, us­ing Bi­net's for­mu­la seems like a great sug­ges­tion. It is a closed form so­lu­tion. It has O ( 1 ) \mathcal{O}(1) O(1) com­plex­i­ty, i.e., fast. It is easy to im­ple­ment. And more.

How­ev­er, as usu­al when float­ing point is in­volved, dan­ger lurks around ev­ery cor­ner.

Al­though the O ( 1 ) \mathcal{O}(1) O(1) com­plex­i­ty seems at­trac­tive, it is on­ly O ( 1 ) \mathcal{O}(1) O(1) if pow is O ( 1 ) . \mathcal{O}(1). O(1). Which it is for ba­sic datatypes, but in oth­er cas­es, such as ar­bi­trary pre­ci­sion floats it is far more like­ly to be O ( log n ) . \mathcal{O}(\log n). O(logn). Al­so pow is re­al­ly, re­al­ly slow. We'll show lat­er how slow it re­al­ly is. Al­though the im­ple­men­ta­tion seems sim­ple, it is easy to get wrong. In fact the ver­sion shown in the video is wrong. constexpr int fib ( const int i ) { const static auto sqrt_5 = std :: sqrt ( 5 ); if ( i == 0 ) return 0 ; if ( i == 1 ) return 1 ; return static_cast < int > (( std :: pow ( 1 + sqrt_5 , i ) - std :: pow ( 1 - sqrt_5 , i )) / ( std :: pow ( 2 , i ) * sqrt_5 )); } This ver­sion is wrong for val­ues of i as small as 10: 0: 0 0 true 1: 1 1 true 2: 1 1 true 3: 2 2 true 4: 3 3 true 5: 5 5 true 6: 8 8 true 7: 13 13 true 8: 21 21 true 9: 34 34 true 10: 55 54 false 11: 89 89 true 12: 144 143 false 13: 233 232 false 14: 377 377 true 15: 610 610 true 16: 987 986 false 17: 1597 1596 false 18: 2584 2584 true 19: 4181 4181 true 20: 6765 6764 false 21: 10946 10945 false 22: 17711 17710 false 23: 28657 28656 false 24: 46368 46367 false 25: 75025 75025 true 26: 121393 121392 false 27: 196418 196418 true 28: 317811 317811 true 29: 514229 514228 false 30: 832040 832039 false 31: 1346269 1346268 false 32: 2178309 2178309 true 33: 3524578 3524577 false 34: 5702887 5702886 false 35: 9227465 9227465 true 36: 14930352 14930351 false 37: 24157817 24157816 false 38: 39088169 39088168 false 39: 63245986 63245985 false 40: 102334155 102334154 false 41: 165580141 165580140 false 42: 267914296 267914295 false 43: 433494437 433494437 true 44: 701408733 701408732 false 45: 1134903170 1134903169 false 46: 1836311903 1836311903 true The first col­umn is the in­dex, the sec­ond col­umn is the cor­rect val­ue, the third one is com­put­ed with Bi­net's for­mu­la and the last shows if the val­ues match. What hap­pened? Round­ing er­rors hap­pened. Com­pound­ed by the trun­ca­tion due to the cast to int . This can be al­le­vi­at­ed by us­ing std::round be­fore cast­ing re­sult­ing in: 0: 0 0 true 1: 1 1 true 2: 1 1 true 3: 2 2 true 4: 3 3 true 5: 5 5 true 6: 8 8 true 7: 13 13 true 8: 21 21 true 9: 34 34 true 10: 55 55 true 11: 89 89 true 12: 144 144 true 13: 233 233 true 14: 377 377 true 15: 610 610 true 16: 987 987 true 17: 1597 1597 true 18: 2584 2584 true 19: 4181 4181 true 20: 6765 6765 true 21: 10946 10946 true 22: 17711 17711 true 23: 28657 28657 true 24: 46368 46368 true 25: 75025 75025 true 26: 121393 121393 true 27: 196418 196418 true 28: 317811 317811 true 29: 514229 514229 true 30: 832040 832040 true 31: 1346269 1346269 true 32: 2178309 2178309 true 33: 3524578 3524578 true 34: 5702887 5702887 true 35: 9227465 9227465 true 36: 14930352 14930352 true 37: 24157817 24157817 true 38: 39088169 39088169 true 39: 63245986 63245986 true 40: 102334155 102334155 true 41: 165580141 165580141 true 42: 267914296 267914296 true 43: 433494437 433494437 true 44: 701408733 701408733 true 45: 1134903170 1134903170 true 46: 1836311903 1836311903 true Prob­lem solved, let's go home. Ex­cept we didn't solve the prob­lem. We on­ly solved it for 32 bit in­te­gers. For 64 bit in­te­gers all el­e­ments of the se­ries be­yond the 75th are wrong. 47: 2971215073 2971215073 true 48: 4807526976 4807526976 true 49: 7778742049 7778742049 true 50: 12586269025 12586269025 true 51: 20365011074 20365011074 true 52: 32951280099 32951280099 true 53: 53316291173 53316291173 true 54: 86267571272 86267571272 true 55: 139583862445 139583862445 true 56: 225851433717 225851433717 true 57: 365435296162 365435296162 true 58: 591286729879 591286729879 true 59: 956722026041 956722026041 true 60: 1548008755920 1548008755920 true 61: 2504730781961 2504730781961 true 62: 4052739537881 4052739537881 true 63: 6557470319842 6557470319842 true 64: 10610209857723 10610209857723 true 65: 17167680177565 17167680177565 true 66: 27777890035288 27777890035288 true 67: 44945570212853 44945570212853 true 68: 72723460248141 72723460248141 true 69: 117669030460994 117669030460994 true 70: 190392490709135 190392490709135 true 71: 308061521170129 308061521170129 true 72: 498454011879264 498454011879264 true 73: 806515533049393 806515533049393 true 74: 1304969544928657 1304969544928657 true 75: 2111485077978050 2111485077978050 true 76: 3416454622906707 3416454622906706 false 77: 5527939700884757 5527939700884755 false 78: 8944394323791464 8944394323791463 false 79: 14472334024676221 14472334024676218 false 80: 23416728348467685 23416728348467676 false 81: 37889062373143906 37889062373143896 false 82: 61305790721611591 61305790721611584 false 83: 99194853094755497 99194853094755488 false 84: 160500643816367088 160500643816367040 false 85: 259695496911122585 259695496911122528 false 86: 420196140727489673 420196140727489600 false 87: 679891637638612258 679891637638612096 false 88: 1100087778366101931 1100087778366101632 false 89: 1779979416004714189 1779979416004713728 false 90: 2880067194370816120 2880067194370815488 false 91: 4660046610375530309 4660046610375529472 false 92: 7540113804746346429 7540113804746344448 false This is much more dif­fi­cult to solve as dou­ble pre­ci­sion float­ing point num­bers with their 53 bit (1 bit im­plied) man­tis­sa sim­ply can­not rep­re­sent all 64 bit in­te­ger val­ues. Al­though GCC al­lows it, std::pow and std::round can­not legal­ly be used in a constexpr func­tion.

So what's the al­ter­na­tive? Ex­po­nen­ti­a­tion by squar­ing. This nifty al­go­rithm al­lows us to com­pute pow­ers (both of in­te­gers and ma­tri­ces) in O ( log n ) \mathcal{O}(\log n) O(logn) time, where n n n is the ex­po­nent. What use is com­put­ing the pow­er of a ma­trix in this con­text? Sim­ply put, the re­cur­sion in the Fi­bonac­ci se­quence can be rep­re­sent­ed by

( 1 1 1 0 ) n ( 1 0 ) = ( F n + 1 F n ) \left(\begin{array}{cc}1 & 1 \\1 & 0 \end{array}\right)^n\left(\begin{array}{c}1\\0\end{array}\right)=\left(\begin{array}{c}F_{n+1}\\F_n\end{array}\right) ( 1 1 ​ 1 0 ​ ) n ( 1 0 ​ ) = ( F n + 1 ​ F n ​ ​ )

There­fore, O ( log n ) \mathcal{O}(\log n) O(logn) ma­trix ex­po­nen­ti­a­tion re­sults di­rect­ly in O ( log n ) \mathcal{O}(\log n) O(logn) com­pu­ta­tion of Fi­bonac­ci num­bers. Here is a quick im­ple­men­ta­tion:

template < typename Element > struct mat2 { Element elements [ 2 ][ 2 ]; }; template < typename Element > constexpr mat2 < Element > operator * ( mat2 < Element > const & a_ , mat2 < Element > const & b_ ) { auto const & a = a_ . elements ; auto const & b = b_ . elements ; return mat2 < Element > {{ { a [ 0 ][ 0 ] * b [ 0 ][ 0 ] + a [ 0 ][ 1 ] * b [ 1 ][ 0 ], a [ 0 ][ 0 ] * b [ 0 ][ 1 ] + a [ 0 ][ 1 ] * b [ 1 ][ 1 ]}, { a [ 1 ][ 0 ] * b [ 0 ][ 0 ] + a [ 1 ][ 1 ] * b [ 1 ][ 0 ], a [ 1 ][ 0 ] * b [ 0 ][ 1 ] + a [ 1 ][ 1 ] * b [ 1 ][ 1 ]} }}; } template < typename Element > constexpr mat2 < Element > mat_pow ( mat2 < Element > m , unsigned n ) { if ( ! n ) return mat2 < Element > {{{ 1 , 0 },{ 0 , 1 }}}; -- n ; auto r = m ; while ( n ) { if ( n & 1 ) r = r * m ; m = m * m ; n >>= 1 ; } return r ; } template < typename Integer > constexpr Integer fib ( unsigned n ) { constexpr auto a = mat2 < Integer > {{{ 1 , 1 },{ 1 , 0 }}}; auto an = mat_pow ( a , n ); return an . elements [ 1 ][ 0 ]; }

Not on­ly is this im­ple­men­ta­tion ex­act for any in­te­ger type and constexpr -le­gal (in C++14), it is al­so sig­nif­i­cant­ly faster than the float­ing point im­ple­men­ta­tion us­ing Bi­net's for­mu­la. On my com­put­er which has an i7-4790k run­ning at 4 GHz, us­ing MinGW-w64 G++ 5.2 and -O3 op­ti­miza­tion the com­pu­ta­tion of all Fi­bonac­ci num­bers for n n n be­tween 0 and 92 (in­clu­sive) takes 1.28 μs while the float­ing point ver­sion takes 11 μs. Al­most a fac­tor of 10. How come an O ( log n ) \mathcal{O}(\log n) O(logn) al­go­rithm is faster than an O ( 1 ) \mathcal{O}(1) O(1) al­go­rithm? The elu­sive con­stant fac­tor. And the con­stant fac­tor of std::pow is typ­i­cal­ly quite large.

Post­script: But it works for me!

So you read this ar­ti­cle, and tried it your­self and the Bi­net ver­sion works just fine even with­out the call to std::round ? You are prob­a­bly us­ing an x86 (32 bit) com­pil­er. Most com­pil­ers de­fault to al­low­ing more pre­cise op­er­a­tions on float­ing point val­ues and the x86 FPU's de­fault type is an 80-bit float­ing point type. There­fore, you are prob­a­bly work­ing with a high­er pre­ci­sion type. It will still fail for 64-bit re­sults though. To repli­cate this be­hav­ior on a 64 bit com­pil­er (which by de­fault us­es SSE in­struc­tions in­stead of the FPU as they are usu­al­ly faster), you can force the use of 80-bit floats by us­ing long double in­stead of double : const static auto sqrt_5 = std::sqrt(5.0l); With this change the ver­sion with­out round­ing works for n n n up to 86 and the one with round­ing works up to 84 (didn't I men­tion float­ing point has many pit­falls?).

Post­script 2: The naive ver­sion is fastest on my com­pil­er!

In one of the com­ments on the old site, a us­er point­ed out that with Clang 3.8, the naive ver­sion was fast­ed. This was due to the fact, that the com­pil­er per­formed par­tial loop un­rolling, as can be seen here.