Evan Balster







I live in this head.





Level 10 Beautiful hacks « on: December 03, 2010, 07:46:56 AM »

So someone mentioned yesterday that it was possible to swap two bytes without a third and without using the assembly command to do so. I figured out

Quote A ^= B;
B ^= A;
A ^= B;

Does the trick. Now, unless we're programming microcontrollers or something (in which case we'll have a more efficient swap routine anyway) we don't have any practical need for this sort of hack. Yet it's interesting. It's fun to know about, as a programmer, even for those of us who are aghast at the practice of bit-twiddling.

Another fun one for those of you who don't know is Fast Inverse Square Root, which can be done like this on little-endian systems:

Quote float fastSqrt(float arg)
{
Uint32 val = * (Uint32*) &arg; // reinterpret the float's bits as an integer
val = 0x5f375a86 - (val >> 1); // magic-number guess at 1/sqrt(arg)
return 1.0f / (* (float*) &val); // invert the guess to get sqrt(arg)
}

It's an approximation, but it's close to the correct value (within a few percent as written) and is used frequently in shaders and sometimes actual game logic. It's faster by a huge factor than the 'proper' means of calculating a square root. Since the division isn't necessary for operations like vector normalizing, it can be skipped for even better performance.

It works by exploiting the nature of IEEE 754 floating-point number representation on modern processors, which consists of an exponent and a 'mantissa'. That's its own matter--all you need to know is that the magic number "0x5f375a86" manages to make this thing work like a charm.

So anyway. For a bit of fun, let's share the silly but intriguing hacks we know, whatever they might be. Logged Creativity births expression. Curiosity births exploration.

Our work is as soil to these seeds; our art is what grows from them...



Wreath, SoundSelf, Infinite Blank, Cave Story+, <plaid/audio>
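For anyone who wants to try the XOR swap from the opening post, here is a minimal sketch in C. The function name `xor_swap` and the pointer-based interface are assumptions made for illustration; the post only gives the three XOR statements. One caveat worth knowing: if both operands are the same object, the first XOR zeroes it, so the sketch guards against that case.

```c
#include <stdint.h>

/* Swap two bytes in place using XOR, with no temporary variable.
   Hypothetical helper for illustration; not from the original post. */
static void xor_swap(uint8_t *a, uint8_t *b)
{
    if (a == b) {
        return;        /* same object: XOR with itself would zero it */
    }
    *a ^= *b;          /* a now holds (a ^ b) */
    *b ^= *a;          /* b = b ^ (a ^ b) = original a */
    *a ^= *b;          /* a = (a ^ b) ^ original a = original b */
}
```

Calling `xor_swap(&x, &y)` on two distinct bytes exchanges their values; calling it with the same address twice leaves the value untouched because of the guard.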

st33d Guest

Re: Beautiful hacks « Reply #3 on: December 03, 2010, 08:56:17 AM »



http://www.lomont.org/Math/Papers/2003/InvSqrt.pdf How about an even faster InvSqrt? Logged
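As a point of comparison, here is a hedged sketch of the inverse square root trick in C. It differs from the snippet in the opening post in two ways that are my own additions: it copies the bits with `memcpy` rather than casting pointers (which sidesteps strict-aliasing problems), and it applies one Newton-Raphson refinement step, which the posted snippet does not include. The names `fast_inv_sqrt` and `fast_sqrt` are hypothetical, and the code assumes `float` is a 32-bit IEEE 754 single.

```c
#include <stdint.h>
#include <string.h>

/* Approximate 1/sqrt(x) for positive, finite x.
   Assumes a 32-bit IEEE 754 float, as the original post does. */
static float fast_inv_sqrt(float x)
{
    uint32_t bits;
    float y;

    memcpy(&bits, &x, sizeof bits);     /* reinterpret the float's bits */
    bits = 0x5f375a86u - (bits >> 1);   /* magic-number first guess */
    memcpy(&y, &bits, sizeof y);

    /* One Newton-Raphson step on f(y) = 1/y^2 - x (added here,
       not part of the posted snippet) tightens the guess. */
    y = y * (1.5f - 0.5f * x * y * y);
    return y;
}

/* Approximate sqrt(x) as x * (1/sqrt(x)), avoiding the division. */
static float fast_sqrt(float x)
{
    return x * fast_inv_sqrt(x);
}
```

For vector normalization you would multiply each component by `fast_inv_sqrt(lengthSquared)` directly, which is the case where the final division can be skipped.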

slembcke









Level 3 Re: Beautiful hacks « Reply #5 on: December 04, 2010, 12:01:17 PM »

People get really excited about that inverse square root function, but keep in mind that on a lot of recent processors that 1.0f/sqrtf() will probably be slightly faster and far more accurate. True at least on my 3 year old Core2 Duo. Even on my 6 year old G5, 2 year old Atom and on the iPhone the InvSqrt() is only about 10-15% faster (at least using GCC for all the tests). Unless your program does nothing but renormalize vectors, you are probably only going to see low single digit percentage gains in overall performance.

I'd say that my favorite hack has to be checking for NaNs. Equality and ordering comparisons with NaN always fail, even NaN == NaN, which means x != x is true only when x is NaN. So to check if some calculation exploded, you can do this:

Code: if(x != x){
error("oh noes!");
}

It's completely non-obvious what that actually does until somebody tells you. There is isnan() in the C standard math library, but I remember running into compilers that didn't define it before. MSVC or some versions of GCC maybe?

I also often feel like trigonometry identities are hacks... Maybe that's just me though. « Last Edit: December 04, 2010, 12:24:16 PM by slembcke » Logged Scott - Howling Moon Software Chipmunk Physics Library - A fast and lightweight 2D physics engine.

Glaiel-Gamer Guest

Re: Beautiful hacks « Reply #6 on: December 04, 2010, 12:11:58 PM »





Code: float movex = (RIGHTKEYSTATE-LEFTKEYSTATE)*speed;

float movey = (DOWNKEYSTATE-UPKEYSTATE)*speed;

Booleans implicitly converted to ints make awesome short ways to do things Logged
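A self-contained sketch of the same idea, assuming the key states are plain booleans. The `key_state` struct and the `move_x`/`move_y` helpers are hypothetical stand-ins for the RIGHTKEYSTATE-style values in the post. The subtraction only works if each state is exactly 0 or 1; if your input layer returns bitmasks or other nonzero values, they need normalizing first.

```c
#include <stdbool.h>

/* Hypothetical key-state container; each field is exactly 0 or 1. */
struct key_state {
    bool left, right, up, down;
};

/* +speed when only right is held, -speed when only left is held,
   0 when neither or both are held. */
static float move_x(const struct key_state *k, float speed)
{
    return (float)(k->right - k->left) * speed;
}

/* Same idea for the vertical axis (down is positive). */
static float move_y(const struct key_state *k, float speed)
{
    return (float)(k->down - k->up) * speed;
}
```

A side effect of the subtraction is that opposite keys cancel out, which is often the behavior you want anyway.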

Glaiel-Gamer Guest

Re: Beautiful hacks « Reply #7 on: December 04, 2010, 12:21:54 PM »



Also, 2D array indexing if the width and height are powers of 2:

normally:

Code: array[row*width+column]

but wait! width is a power of 2! Also column will always be less than the width, so the last N bits in the index are all 0 before adding column.

Code: //N = log2(width)
array[(row<<N)|column]

also just realized you can do bounds checking with bitwise operations too

Code: //N = log2(width)
//M = log2(height)
//you want an int where every bit above the low N bits is 1, precompute it
//i.e. if N = 5, the bits should look like
//11111111111111111111111111100000
//which is ~((1u<<N)-1), i.e. ~(width-1)
//unsigned K = ~((1u<<N)-1)
//unsigned L = ~((1u<<M)-1)

boundcheck = (unsigned(row)&L)|(unsigned(column)&K);
if(boundcheck){
//out of bounds
}

i just realized this today after writing this... I think im gonna have to go pop that in to my code today since bound checking wastes time in a function thats called about 50000 times per frame (related to collision detection) in Closure « Last Edit: December 04, 2010, 12:59:47 PM by Glaiel-Gamer » Logged
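Here is a compact sketch of both tricks from this post, the shift-and-or index and the mask-based bounds check, for a grid whose dimensions are powers of two. The 32 x 16 grid size and all names (`cell_index`, `out_of_bounds`, the mask constants) are assumptions chosen for illustration. The masks are built as `~(size - 1)`, which sets every bit at or above log2(size).

```c
#include <stddef.h>

/* Hypothetical grid dimensions for this sketch: 32 x 16 cells. */
enum { LOG2_WIDTH = 5, LOG2_HEIGHT = 4 };
enum { WIDTH = 1 << LOG2_WIDTH, HEIGHT = 1 << LOG2_HEIGHT };

/* Masks with every bit at or above log2(size) set: ~(size - 1). */
static const unsigned COL_MASK = ~((unsigned)WIDTH - 1u);
static const unsigned ROW_MASK = ~((unsigned)HEIGHT - 1u);

/* Nonzero when (row, column) lies outside the grid. Negative values
   convert to large unsigned numbers, so they are caught as well. */
static unsigned out_of_bounds(int row, int column)
{
    return ((unsigned)row & ROW_MASK) | ((unsigned)column & COL_MASK);
}

/* Flat index; equals row * WIDTH + column when column < WIDTH. */
static size_t cell_index(unsigned row, unsigned column)
{
    return ((size_t)row << LOG2_WIDTH) | column;
}
```

Whether this beats the plain multiply and compare depends on the compiler and target; modern optimizers often turn a multiply by a constant power of two into a shift on their own, so it is worth measuring before relying on it.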

Pineapple





~♪





Level 10 Re: Beautiful hacks « Reply #8 on: December 04, 2010, 12:25:43 PM »

Return ((arg>0)*2-1)*arg



It's slower than the typical function on the average processor, but I think it still qualifies as a hack, in case the compiler lacks the instruction for some reason. Logged
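As far as I can tell, the expression above evaluates to the absolute value of arg: the comparison yields 1 or 0, so the factor becomes +1 or -1. A minimal C sketch of that reading follows; the function name `abs_by_compare` is hypothetical, and the interpretation as an abs() replacement is my assumption since the post does not name the function.

```c
/* Branch-free absolute value built from a comparison result.
   (arg > 0) is 1 or 0, so the factor is +1 or -1.
   As with abs(), the result is undefined for INT_MIN. */
static int abs_by_compare(int arg)
{
    return ((arg > 0) * 2 - 1) * arg;
}
```

For zero the factor is -1, which still gives 0 for integers; with floating-point arguments the same expression would produce negative zero.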

slembcke









Level 3 Re: Beautiful hacks « Reply #9 on: December 04, 2010, 12:36:25 PM »



A neat C hack that I don't actually use because I think it's mildly unreadable and probably tricks compilers into generating less efficient code:

Code: !!value
// is the same as
value != 0

If value is 0 it evaluates to 0; if value is any other number it evaluates to 1.

Another neat hack that you'll see in compiled assembly code is setting a register to 0. Often times you'll write this code:

Code: int number = 0;

Which is compiled to be something like this:

Code: int number; number ^= number

The reason being that any value XORed with itself gives 0, and the XOR instruction on x86 processors is shorter than a load instruction with a 32 or 64 bit constant for 0 (xor eax, eax is 2 bytes, mov eax, 0 is 5). It makes for a smaller and slightly faster binary. Logged Scott - Howling Moon Software Chipmunk Physics Library - A fast and lightweight 2D physics engine.
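One place the `!!value` idiom tends to show up is when a nonzero value has to be turned into exactly 1 so it can be used in arithmetic. The sketch below is only an illustration of that use; the function names `to_bool` and `count_nonzero` are made up for it.

```c
/* Collapse any nonzero value to exactly 1; zero stays 0. */
static int to_bool(int value)
{
    return !!value;
}

/* Example use: count how many entries of an array are nonzero
   by summing the normalized 0/1 values. */
static int count_nonzero(const int *values, int n)
{
    int count = 0;
    for (int i = 0; i < n; i++) {
        count += !!values[i];
    }
    return count;
}
```

Writing `values[i] != 0` in the loop would behave identically and is arguably easier to read.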

LemonScented









Level 7 Re: Beautiful hacks « Reply #12 on: December 04, 2010, 07:51:01 PM »



I was always rather taken with this technique for finding out if a number is a power of 2:

Code: bool IsPowerOfTwo(const int num)
{
// num > 0 rules out zero (and INT_MIN), which the bit test alone would accept
return num > 0 && !(num & (num-1));
}

Not really a hack, just a nice succinct way of using the properties of binary numbers. Logged

http://lemonscentedgames.blogspot.com/ Our Zest Is Best!
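To spell out why the power-of-two test works, here is a hedged sketch using an unsigned argument, with the bit patterns written out in comments. The name `is_power_of_two` and the choice of `unsigned int` are my own; the unsigned type is used so that subtracting 1 can never overflow.

```c
#include <stdbool.h>

/* A power of two has exactly one set bit. Subtracting 1 flips that
   bit to 0 and all lower bits to 1, so the AND leaves nothing:
       8     = 1000
       8 - 1 = 0111   ->  1000 & 0111 = 0000  (power of two)
       12     = 1100
       12 - 1 = 1011  ->  1100 & 1011 = 1000  (not a power of two)
   Zero is excluded explicitly because 0 & (0 - 1) is also 0. */
static bool is_power_of_two(unsigned int num)
{
    return num != 0 && (num & (num - 1)) == 0;
}
```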

raigan







Level 4 Re: Beautiful hacks « Reply #14 on: December 04, 2010, 08:19:06 PM »

Quote from: slembcke on December 04, 2010, 12:01:17 PM People get really excited about that inverse square root function, but keep in mind that on a lot of recent processors that 1.0f/sqrtf() will probably be slightly faster and far more accurate.



Yeah, in the link I posted the author concludes that these days in 99% of cases it's faster to use the intrinsic command for whatever platform you're on. Logged

Zaphos Guest

Re: Beautiful hacks « Reply #16 on: December 04, 2010, 08:39:50 PM »

Quote from: slembcke on December 04, 2010, 08:12:48 PM I've always wanted to know if there was some clever way to get the next largest power of two for a number without a for loop. Like 9 -> 16, 8 -> 8, 129 -> 256, etc. I've never really sat down to figure it out, but it seems like there should be a moderately simple way to do it.

There's some discussion of it here: http://graphics.stanford.edu/~seander/bithacks.html#RoundUpPowerOf2Float (and in the very next entry on that list + the entries on computing an integer log) Logged
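For the question quoted above, a loop-free round-up can be sketched with the well-known shift-and-or approach, which is along the lines of what the linked page describes for integers. The function name `next_pow2` and the 32-bit width are assumptions for this sketch. The idea is to smear the highest set bit into every lower position and then add one.

```c
#include <stdint.h>

/* Round v up to the next power of two (v itself if already one).
   Returns 0 for v == 0 and for v > 2^31, where the result would
   not fit in 32 bits. */
static uint32_t next_pow2(uint32_t v)
{
    v--;             /* so exact powers of two map to themselves */
    v |= v >> 1;     /* smear the highest set bit into lower bits */
    v |= v >> 2;
    v |= v >> 4;
    v |= v >> 8;
    v |= v >> 16;
    v++;             /* all ones below the top bit, plus one */
    return v;
}
```

With the examples from the quoted question, this maps 9 to 16, 8 to 8, and 129 to 256.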