Bit Fields and You

I'll use a simple example to explain the basics. Say you have an unsigned integer with four bits:

[0][0][0][0] = 0

You can represent any number from 0 to 15 here by writing it in base 2. Let's say the rightmost bit is the smallest:

[0][1][0][1] = 5

So, counting from the right, the first bit adds 1 to the total, the second adds 2, the third adds 4, and the fourth adds 8. For example, here's 8:

[1][0][0][0] = 8

So What?

Say you want to represent a binary state in an application: whether some option is enabled, whether you should draw some element, and so on. You probably don't want to use an entire integer for each one of these. That would be using a 32-bit integer to store one bit of information. Or, to continue our example in four bits:

[0][0][0][1] = 1 = ON
[0][0][0][0] = 0 = OFF //what a huge waste of space!

Of course, the problem is more pronounced in real life, since 32-bit integers look like this:

[0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0] = 0

The answer to this is to use a bit field. We have a collection of properties (usually related ones) which we will flip on and off using bit operations. So, say, you might have 4 different lights on a piece of hardware that you want to be on or off.

 3  2  1  0
[0][0][0][0] = 0

(Why do we start with light 0? I'll explain this in a second.) Note that this is an integer, and is stored as an integer, but is used to represent multiple states for multiple objects. Crazy! Say we turn lights 2 and 1 on:

 3  2  1  0
[0][1][1][0] = 6

The important thing you should note here: There's probably no obvious reason why lights 2 and 1 being on should equal six, and it may not be obvious how we would do anything with this scheme of information storage. It doesn't look more obvious if you add more bits:

 3  2  1  0
[1][1][1][0] = 0xE //what?

Why do we care about this? Do we have exactly one state for each number between 0 and 15? How are we going to manage this without some insane series of switch statements? Ugh...

The Light at the End

So if you've worked with binary arithmetic a bit before, you might realize that the relationship between the numbers on the left and the numbers on the right is, of course, base 2. That is:

1*(2^3) + 1*(2^2) + 1*(2^1) + 0*(2^0) = 0xE (here ^ means "to the power of", not XOR)

So each light's number is the exponent in its term of the equation, which is why we started counting at light 0. If the light is on, there is a 1 next to its term; if the light is off, there is a zero. Take the time to convince yourself that exactly one integer between 0 and 15 corresponds to each state in this numbering scheme.
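If you'd rather check that sum in code than by hand, here is a minimal C sketch. The function name lights_to_int and the array layout (on[k] holds the state of light k) are my own inventions for illustration, not part of any library.

```c
#include <assert.h>

/* Hypothetical helper: on[k] is 1 if light k is on, 0 if it is off.
   Adds 2 to the power k to the total for every light k that is on. */
unsigned lights_to_int(const int on[4])
{
    unsigned total = 0;
    unsigned place = 1;                 /* 2 to the 0, then 2, 4, 8 */
    for (int k = 0; k < 4; k++) {
        assert(on[k] == 0 || on[k] == 1);
        if (on[k])
            total += place;             /* this light's term is 1 * place */
        place *= 2;                     /* move on to the next power of two */
    }
    return total;
}
```

Passing in lights 3, 2 and 1 on and light 0 off gives 14, which is the 0xE from the sum above.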

Bit operators

Now that we have this done, let's take a second to see what bitshifting does to integers in this setup.

[0][0][0][1] = 1

When you shift bits to the left or the right in an integer, it literally moves the bits left and right. (Note: I 100% disavow this explanation for negative numbers! There be dragons!)

1<<2 = 4
[0][1][0][0] = 4

4>>1 = 2
[0][0][1][0] = 2

You will see the same behavior when shifting numbers that have more than one bit set. Also, it shouldn't be hard to convince yourself that x>>0 or x<<0 is just x: nothing shifts anywhere.

This probably explains the naming scheme of the Shift operators to anyone who wasn't familiar with them.
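Here is a small sketch of the two shifts in C. It sticks to unsigned values to stay clear of the negative-number dragons, and the names shift_left, shift_right and UINT_BITS are made up for this example.

```c
#include <assert.h>
#include <limits.h>

/* Number of bits in an unsigned int on this platform (32 on most current ones). */
#define UINT_BITS (sizeof(unsigned) * CHAR_BIT)

/* Move every bit of x left by n places; zeros come in on the right. */
unsigned shift_left(unsigned x, unsigned n)
{
    assert(n < UINT_BITS);   /* shifting by the full width or more is undefined in C */
    return x << n;
}

/* Move every bit of x right by n places; zeros come in on the left. */
unsigned shift_right(unsigned x, unsigned n)
{
    assert(n < UINT_BITS);
    return x >> n;
}
```

With these, shift_left(1, 2) is 4 and shift_right(4, 1) is 2, the same as the diagrams above, and shifting by 0 hands back the value untouched.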

Bitwise operations

This representation of numbers in binary can also be used to shed some light on what the bitwise operators do to integers. Each bit in the first number is XOR-ed, AND-ed, or OR-ed with the bit in the same position of the second number. Take a second to venture to Wikipedia and familiarize yourself with these Boolean operators. I'll explain how they work on numbers, but I don't want to rehash the general idea in great detail.

...

Welcome back! Let's start by examining the effect of the OR (|) operator on two integers, stored in four bits.

OR OPERATOR ON:
[1][0][0][1] = 0x9
[1][1][0][0] = 0xC
________________
[1][1][0][1] = 0xD

Tough! This is a close analogue to the truth table for the boolean OR operator. Notice that each column ignores the adjacent columns and simply fills in the result column with the result of the first bit and the second bit OR'd together. Note also that the value of anything or'd with 1 is 1 in that particular column. Anything or'd with zero remains the same.

The table for AND (&) is interesting, though somewhat inverted:

AND OPERATOR ON:
[1][0][0][1] = 0x9
[1][1][0][0] = 0xC
________________
[1][0][0][0] = 0x8

In this case we do the same thing- we perform the AND operation with each bit in a column and put the result in that bit. No column cares about any other column.

Important lesson about this, which I invite you to verify by using the diagram above: anything AND-ed with zero is zero. Also, equally important- nothing happens to numbers that are AND-ed with one. They stay the same.

The final table, XOR, has behavior which I hope you all find predictable by now.

XOR OPERATOR ON:
[1][0][0][1] = 0x9
[1][1][0][0] = 0xC
________________
[0][1][0][1] = 0x5

Each bit is being XOR'd with its column, yadda yadda, and so on. But look closely at the first row and the second row. Which bits changed? (Half of them.) Which bits stayed the same? (No points for answering this one.)

The bit in the first row is being changed in the result if (and only if) the bit in the second row is 1!
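To see that no column cares about any other column, here is a rough C sketch that works out |, & and ^ one column at a time using only single-bit logic. The helper combine_by_columns is hypothetical; in real code you would just write a | b, a & b or a ^ b.

```c
#include <assert.h>

/* Apply |, & or ^ to two 4-bit values one column at a time.
   op must be '|', '&' or '^'. Hypothetical helper, for illustration only. */
unsigned combine_by_columns(unsigned a, unsigned b, char op)
{
    unsigned result = 0;
    assert(op == '|' || op == '&' || op == '^');
    for (unsigned col = 0; col < 4; col++) {
        unsigned bit_a = (a >> col) & 1u;   /* this column of the first row  */
        unsigned bit_b = (b >> col) & 1u;   /* this column of the second row */
        unsigned bit_r;
        if (op == '|')
            bit_r = bit_a || bit_b;         /* 1 if either input is 1 */
        else if (op == '&')
            bit_r = bit_a && bit_b;         /* 1 only if both inputs are 1 */
        else
            bit_r = bit_a != bit_b;         /* XOR: 1 when the inputs differ */
        result |= bit_r << col;             /* put the answer back in the same column */
    }
    return result;
}
```

Feeding it 0x9 and 0xC reproduces the three tables: 0xD for OR, 0x8 for AND, and 0x5 for XOR.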

The one lightbulb example!

So now we have an interesting set of tools we can use to flip individual bits. Let's go back to the lightbulb example and focus only on the first lightbulb.

 0
[?] //We don't know if it's one or zero while coding

We know that we have an operation that can always make this bit equal to one: OR with 1.

0|1 = 1
1|1 = 1

So, ignoring the rest of the bulbs, we could do this:

4_bit_lightbulb_integer |= 1;

and know for sure that we did nothing but set the first lightbulb to ON.

 3  2  1  0
[0][0][0][?] = 0 or 1? //4_bit_lightbulb_integer
[0][0][0][1] = 1
________________
[0][0][0][1] = 0x1

Similarly, to turn the light off, we can AND its bit with zero. We can't AND the whole number with zero, though: we don't want to affect the state of the other bits, so we will fill their positions with ones.

I'll use the unary (one-argument) operator for bit negation. The ~ (NOT) bitwise operator flips all of the bits in its argument. Here's ~(0x1):

[0][0][0][1] = 0x1
________________
[1][1][1][0] = 0xE

We will use this in conjunction with the AND operator below.

Let's do 4_bit_lightbulb_integer & 0xE

 3  2  1  0
[0][1][0][?] = 4 or 5? //4_bit_lightbulb_integer
[1][1][1][0] = 0xE
________________
[0][1][0][0] = 0x4

We're seeing a lot of integers on the right-hand-side which don't have any immediate relevance. You should get used to this if you deal with bit fields a lot. Look at the left-hand side. The bit on the right is always zero and the other bits are unchanged. We can turn off light 0 and ignore everything else!

Finally, you can use the XOR operator to flip the first bit selectively!

 3  2  1  0
[0][1][0][?] = 4 or 5? //4_bit_lightbulb_integer
[0][0][0][1] = 0x1
________________
[0][1][0][*] = 4 or 5?

We don't actually know what the value of * is now, just that it flipped from whatever ? was.
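The three tricks for light 0 can be sketched in C like this. One caveat: 4_bit_lightbulb_integer is not a legal C identifier, because identifiers can't start with a digit, so the sketch calls the variable lights. The function names are hypothetical too.

```c
#include <assert.h>

/* All three helpers assume a 4-bit field, so lights must be between 0 and 15. */

/* Turn light 0 on: anything OR 1 is 1, and OR 0 leaves lights 3..1 alone. */
unsigned light0_on(unsigned lights)
{
    assert(lights <= 0xFu);
    return lights | 0x1u;
}

/* Turn light 0 off: anything AND 0 is 0, and the 1s in 0xE keep lights 3..1. */
unsigned light0_off(unsigned lights)
{
    assert(lights <= 0xFu);
    return lights & 0xEu;
}

/* Flip light 0: XOR with 1 flips a bit, XOR with 0 leaves it alone. */
unsigned light0_flip(unsigned lights)
{
    assert(lights <= 0xFu);
    return lights ^ 0x1u;
}
```

Whether you start from 4 or from 5, light0_on gives 5 and light0_off gives 4, while light0_flip turns 4 into 5 and 5 into 4.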

Combining Bit Shifting and Bitwise operations

The interesting fact about these two kinds of operations is that, taken together, they let you manipulate selected bits.

[0][0][0][1] = 1 = 1<<0
[0][0][1][0] = 2 = 1<<1
[0][1][0][0] = 4 = 1<<2
[1][0][0][0] = 8 = 1<<3

Hmm. Interesting. I'll mention the negation operator here (~) as it's used in a similar way to produce the needed bit values for ANDing stuff in bit fields.

[1][1][1][0] = 0xE = ~(1<<0)
[1][1][0][1] = 0xD = ~(1<<1)
[1][0][1][1] = 0xB = ~(1<<2)
[0][1][1][1] = 0x7 = ~(1<<3)

Are you seeing an interesting relationship between the shift value and the corresponding lightbulb position of the shifted bit?
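Here is a minimal C sketch that builds both tables from the light number. The names on_mask and off_mask are mine. Note one assumption: a real unsigned int is wider than four bits, so ~ also sets all the bits above bit 3, and the sketch trims the result with & 0xF to match the four-bit diagrams.

```c
#include <assert.h>

/* Mask with only bit i set, for a 4-bit field (i is 0 to 3). */
unsigned on_mask(unsigned i)
{
    assert(i < 4);
    return 1u << i;
}

/* Mask with every bit set except bit i, trimmed to four bits.
   Without the & 0xF, the bits above bit 3 would all be 1 as well. */
unsigned off_mask(unsigned i)
{
    assert(i < 4);
    return ~(1u << i) & 0xFu;
}
```

So on_mask(3) is 0x8 and off_mask(2) is 0xB: the shift amount is exactly the light number.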

The canonical bitshift operators

As alluded to above, we have an interesting, generic method for turning on and off specific lights with the bit-shifters above.

To turn on a bulb, we generate a 1 in the right position using bit shifting, and then OR it with the current lightbulb positions. Say we want to turn on light 3, and ignore everything else. We need an operation that ORs

 3  2  1  0
[?][?][?][?] //all we know about these values at compile time is where they are!

and 0x8

[1][0][0][0] = 0x8

Which is easy, thanks to bitshifting! We'll take the number of the light and shift a 1 over by that many places:

1<<3 = 0x8

and then:

4_bit_lightbulb_integer |= 0x8;

 3  2  1  0
[1][?][?][?] //the ? marks have not changed!

And we can guarantee that the bit for lightbulb 3 is set to 1 and that nothing else has changed.

Clearing a bit works similarly: we'll use the negated bits table above to, say, clear light 2.

~(1<<2) = 0xB = [1][0][1][1]

4_bit_lightbulb_integer & 0xB:

 3  2  1  0
[?][?][?][?]
[1][0][1][1]
____________
[?][0][?][?]

The XOR method of flipping bits is the same idea as the OR one.

So the canonical methods of bit switching are these:

Turn on the light i:

4_bit_lightbulb_integer|=(1<<i)

Turn off light i:

4_bit_lightbulb_integer&=~(1<<i)

Flip light i:

4_bit_lightbulb_integer^=(1<<i)
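As compilable C, the three canonical operations might look like the sketch below. Since 4_bit_lightbulb_integer can't be used as a C identifier (it starts with a digit), the field is called lights here. The function names and the FIELD_BITS macro are assumptions of mine, and the functions return the new value rather than modifying a variable in place.

```c
#include <assert.h>
#include <limits.h>

/* Number of bits available in the field (32 on most current platforms). */
#define FIELD_BITS (sizeof(unsigned) * CHAR_BIT)

/* Turn on light i: OR with a single 1 in position i. */
unsigned turn_on(unsigned lights, unsigned i)
{
    assert(i < FIELD_BITS);
    return lights | (1u << i);
}

/* Turn off light i: AND with all 1s except a 0 in position i. */
unsigned turn_off(unsigned lights, unsigned i)
{
    assert(i < FIELD_BITS);
    return lights & ~(1u << i);
}

/* Flip light i: XOR with a single 1 in position i. */
unsigned flip(unsigned lights, unsigned i)
{
    assert(i < FIELD_BITS);
    return lights ^ (1u << i);
}
```

A typical call would be lights = turn_on(lights, 3); which leaves every other light exactly as it was.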

Wait, how do I read these?

In order to check a bit, we can simply zero out all of the bits except for the one we care about. We'll then check whether the resulting value is nonzero. Since the bit we kept is the only one that could possibly be nonzero, the entire integer is nonzero if and only if that bit is 1. For example, to check bit 2:

1<<2:

[0][1][0][0]

4_bit_lightbulb_integer:

[?][?][?][?]

1<<2 & 4_bit_lightbulb_integer:

[0][?][0][0]

Remember from the previous examples that the value of ? didn't change. Remember also that anything AND 0 is 0. So we can say for sure that if this value is nonzero, the bit at position 2 is 1 and the lightbulb is on. Similarly, if the lightbulb is off, the value of the entire thing will be zero.

(You can alternately shift the entire value of 4_bit_lightbulb_integer over by i bits and AND it with 1. I don't remember off the top of my head if one is faster than the other but I doubt it.)

So the canonical checking function:

Check if bit i is on:

if (4_bit_lightbulb_integer & (1<<i)) {
    //do whatever
}
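Here is a hedged C sketch of both ways to read a bit: the mask version from the canonical check, and the shift-then-AND-with-1 alternative mentioned above. The names is_on, is_on_shifted and FIELD_BITS are made up, and the field is called lights because a C identifier can't start with a digit.

```c
#include <assert.h>
#include <limits.h>

/* Number of bits available in the field (32 on most current platforms). */
#define FIELD_BITS (sizeof(unsigned) * CHAR_BIT)

/* Mask version: zero out everything except bit i, then test for nonzero. */
int is_on(unsigned lights, unsigned i)
{
    assert(i < FIELD_BITS);
    return (lights & (1u << i)) != 0;
}

/* Shift version: slide bit i down to position 0, then AND with 1. */
int is_on_shifted(unsigned lights, unsigned i)
{
    assert(i < FIELD_BITS);
    return (int)((lights >> i) & 1u);
}
```

Both return 1 for a lit bulb and 0 for a dark one, so with lights equal to 6 (lights 2 and 1 on), bits 1 and 2 report on and bits 0 and 3 report off.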

The specifics

Now that we have a complete set of tools for bitwise operations, we can look at the specific example here. This is basically the same idea- except a much more concise and powerful way of executing it. Let's look at this function:

void set(int i) { x[i>>SHIFT] |= (1<<(i & MASK)); }

From the canonical implementation I'm going to make a guess that this is trying to set some bits to 1! Let's take an integer and look at what's going on here if I feed the value 0x32 (50 in decimal) into i:

x[0x32>>5] |= (1<<(0x32 & 0x1f))

Well, that's a mess... let's dissect the operation on the right, 0x32 & 0x1f. For convenience, pretend there are 24 more irrelevant zeros, since these are both 32-bit integers.

...[0][0][0][1][1][1][1][1] = 0x1F
...[0][0][1][1][0][0][1][0] = 0x32
________________________
...[0][0][0][1][0][0][1][0] = 0x12

It looks like everything above the boundary where the mask's 1s turn into zeros is being cut off. This technique is called bit masking. Interestingly, the mask restricts the resulting values to be between 0 and 31, which is exactly the range of bit positions we have in a 32-bit integer!
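For values that aren't negative, masking with 0x1F gives the same answer as taking the remainder after dividing by 32. Here is a small C sketch of that claim. The function name bit_within_word is hypothetical, and I use unsigned instead of the int in the original function to keep negative numbers out of the picture.

```c
#include <assert.h>

#define MASK 0x1F   /* low five bits set, the same 0x1F used above */

/* Keep only the low five bits of i, giving a bit position from 0 to 31. */
unsigned bit_within_word(unsigned i)
{
    unsigned bit = i & MASK;
    assert(bit == i % 32);   /* for unsigned i, the mask is the remainder mod 32 */
    return bit;
}
```

For our running example, bit_within_word(0x32) is 0x12, because 50 is 32 plus 18.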

With the mask applied, the statement becomes:

x[0x32>>5] |= (1<<(0x12))

Let's look at the other half.

...[0][0][1][1][0][0][1][0] = 0x32

Shift five bits to the right:

...[0][0][0][0][0][0][0][1] = 0x01

Note that this transformation destroyed exactly the information used by the first part of the function, the low five bits. We have 32-5 = 27 remaining bits which could be nonzero. These indicate which of the 2^27 (2 to the 27th power) integers in the array is selected. So the simplified equation is now:

x[1] |= (1<<0x12)

This just looks like the canonical bit-setting operation! We've just chosen which integer in the array to modify (x[1]) and which bit inside it to set (bit 0x12, or 18).

So the idea is to use the upper 27 bits of i to pick an integer from the array, and the lower five bits to pick which of the 32 bits in that integer to set.
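Putting the pieces together, here is a hedged sketch of how the whole thing might look as a tiny bit vector. Only set comes from the function above. The values of SHIFT and MASK are the 5 and 0x1F from the worked example, while the capacity NBITS, the uint32_t element type, and the companion functions clr and is_set are my own assumptions, written by analogy with the canonical clear and check operations.

```c
#include <assert.h>
#include <stdint.h>

#define SHIFT 5        /* 2 to the 5th is 32 bits per array element */
#define MASK  0x1F     /* low five bits: the position inside one element */
#define NBITS 1024     /* hypothetical capacity, chosen just for this sketch */

static uint32_t x[NBITS >> SHIFT];   /* 1024 / 32 = 32 elements, all zero at start */

/* Set bit i: the upper bits of i pick the element, the low five pick the bit. */
void set(int i)
{
    assert(i >= 0 && i < NBITS);
    x[i >> SHIFT] |= ((uint32_t)1 << (i & MASK));
}

/* Clear bit i (assumed companion to set, not shown in the original). */
void clr(int i)
{
    assert(i >= 0 && i < NBITS);
    x[i >> SHIFT] &= ~((uint32_t)1 << (i & MASK));
}

/* Return 1 if bit i is set, 0 otherwise (also an assumed companion). */
int is_set(int i)
{
    assert(i >= 0 && i < NBITS);
    return (x[i >> SHIFT] & ((uint32_t)1 << (i & MASK))) != 0;
}
```

Calling set(50) touches only x[1] and leaves it equal to 1<<0x12, which is the simplified statement worked out above.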