6. Implementing Instructions

Your task now is to fill in each opcode case with a correct implementation. This is easier than it sounds. A detailed specification for each instruction is included in the project documents. The specificiation for each translates pretty easily to several lines of codes. I will demonstrate how to implement two of them here. The code for the rest can be found in the next section.

ADD

The ADD instruction takes two numbers, adds them together, and stores the result in a register. Its specification is found on page 526. Each ADD instruction looks like the following:

The encoding shows two rows because there are two different "modes" for this instruction. Before I explain modes, let's try to find the similarities between them. In both rows, we can see that we start with 4 bits, 0001 . This is the opcode value for OP_ADD . The next 3 bits are marked DR . This stands for destination register. The destination register is where the added sum will be stored. The next 3 bits are SR1 . This is the register containing the first number to add.

So we know where we want to store the result and we know the first number to add. The last bit of information we need is the second number to add. At this point, the two rows start to look different. Notice that on the top row the 5th bit is a 0 and in the second row it is 1 . This bit indicates whether it is immediate mode or register mode. In register mode, the second number is stored in a register just like the first. This is marked SR2 and is contained in bits 2-0. Bits 3 and 4 are unused. In assembly this would be written as:

{Add Register Assembly 6 ADD R2 R0 R1 ; add the contents of R0 to R1 and store in R2.

Immediate mode is a convenience which reduces the length of a typical program. Instead of adding two values stored in separate registers, the second value is embedded in the instruction itself, marked imm5 in the diagram. This removes the need to write instructions to load the value from memory. The tradeoff is that the instruction only has room for a small number, up to 2^5=32 (unsigned) to be exact, making immediate mode primarily useful for incrementing and decrementing. In assembly, it could be written as:

{Add Immediate Assembly 6 ADD R0 R0 1 ; add 1 to R0 and store back in R0

Here is a summary from the specification:

If bit [5] is 0, the second source operand is obtained from SR2. If bit [5] is 1, the second source operand is obtained by sign-extending the imm5 field to 16 bits. In both cases, the second source operand is added to the contents of SR1 and the result stored in DR. (Pg. 526)

That sounds just like the behaviour we discussed, but what is "sign-extending"? The immediate mode value has only 5 bits, but it needs to be added to a 16-bit number. To do the addition, those 5 bits need to be extended to 16 to match the other number. For positive numbers, we can simply fill in 0's for the additional bits. For negative numbers, this causes a problem. For example, -1 in 5 bits is 1 1111 . If we just extended it with 0's, this is 0000 0000 0001 1111 which is equal to 31. Sign extension corrects this problem by filling in 0's for positive numbers and 1's for negative numbers, so that original values are preserved.

{Sign Extend 6 uint16_t sign_extend(uint16_t x, int bit_count) { if ((x >> (bit_count - 1)) & 1) { x |= (0xFFFF << bit_count); } return x; } Used in section 11

Note: If you are interested in exactly how negative numbers can be represented in binary, you can read about Two's Complement. However, this is not essential. You can just copy the code above and use it whenever the specification says to sign extend numbers.

There is one last sentence in the specification:

The condition codes are set, based on whether the result is negative, zero, or positive. (Pg. 526)

Earlier we defined a condition flags enum, and now it's time to use them. Any time a value is written to a register, we need to update the flags to indicate its sign. We will write a function so that this can be reused:

{Update Flags 6 void update_flags(uint16_t r) { if (reg[r] == 0) { reg[R_COND] = FL_ZRO; } else if (reg[r] >> 15) /* a 1 in the left-most bit indicates negative */ { reg[R_COND] = FL_NEG; } else { reg[R_COND] = FL_POS; } } Used in section 11

Now we are ready to write the code for the ADD case:

{ADD 6 { /* destination register (DR) */ uint16_t r0 = (instr >> 9) & 0x7; /* first operand (SR1) */ uint16_t r1 = (instr >> 6) & 0x7; /* whether we are in immediate mode */ uint16_t imm_flag = (instr >> 5) & 0x1; if (imm_flag) { uint16_t imm5 = sign_extend(instr & 0x1F, 5); reg[r0] = reg[r1] + imm5; } else { uint16_t r2 = instr & 0x7; reg[r0] = reg[r1] + reg[r2]; } update_flags(r0); } Used in section 5

This section contained a lot of information, so let's summarize.

ADD takes two values and stores them in a register.

takes two values and stores them in a register. In register mode, the second value to add is found in a register.

In immediate mode, the second value is embedded in the right-most 5 bits of the instruction.

Values which are shorter than 16 bits need to be sign extended.

Any time an instruction modifies a register, the condition flags need to be updated.

You may be feeling overwhelmed about writing 15 more instructions. However, all of what you learned here will be reused. Most of the instructions use some combination of sign extension, different modes, and updating flags.

LDI

LDI stands for "load indirect." This instruction is used to load a value from a location in memory into a register. The specification is found on page 532.

Here is what the binary layout looks like:

In contrast to ADD , there are no modes and fewer parameters. This time, the opcode is 1010 which corresponds with the OP_LDI enum value. Just like ADD , it contains a 3-bit DR (the destination register) for storing the loaded value. The remaining bits are labeled PCoffset9 . This is an immediate value embedded in the instruction (similar to imm5 ). Since this instruction loads from memory, we can guess that this number is some kind of address which tells us where to load from. The specification provides more detail:

An address is computed by sign-extending bits [8:0] to 16 bits and adding this value to the incremented PC . What is stored in memory at this address is the address of the data to be loaded into DR . (Pg. 532)

Just like before, we need to sign extend this 9-bit value, but this time add it to the current PC . (If you look back at the execution loop, the PC was incremented right after this instruction was loaded.) The resulting sum is an address to a location in memory, and that address contains, yet another value which is the address of the value to load.

This may seem like a roundabout way to read from memory, but it is indispensable. The LD instruction is limited to address offsets that are 9 bits, whereas the memory requires 16 bits to address. LDI is useful for loading values that are stored in locations far away from the current PC, but to use it, the address of the final location needs to be stored in a neighborhood nearby. You can think of it like having a local variable in C which is a pointer to some data:

{C LDI Sample 6 // the value of far_data is an address // of course far_data itself (the location in memory containing the address) has an address char* far_data = "apple"; // In memory it may be layed out like this: // Address Label Value // 0x123: far_data = 0x456 // ... // 0x456: string = 'a' // if PC was at 0x100 // LDI R0 0x023 // would load 'a' into R0

Same as before, the flags need to be updated after putting the value into DR :

The condition codes are set based on whether the value loaded is negative, zero, or positive. (Pg. 532)

Here is the code for this case: ( mem_read will be discussed in a later section.)

{LDI 6 { /* destination register (DR) */ uint16_t r0 = (instr >> 9) & 0x7; /* PCoffset 9*/ uint16_t pc_offset = sign_extend(instr & 0x1FF, 9); /* add pc_offset to the current PC, look at that memory location to get the final address */ reg[r0] = mem_read(mem_read(reg[R_PC] + pc_offset)); update_flags(r0); } Used in section 5

As I said, this instruction shared a lot of the code and knowledge learned from ADD . You will find this is the case with the remaining instructions.