Preparation

Let’s get started! We’ll use Ruby and Minitest (though normally I prefer RSpec).

Before we write any code though, let’s try to come up with concepts related to this task. You know, naming is the hardest thing in computer science. That’s why I’d like to spend time coming up with decent names first.

Normally, we can speak with clients, product owners, or other domain experts to learn what they call things in the domain of interest.

In our case, we could search for the necessary concepts in the computer science literature (or ask Google). But for the sake of this exercise, let’s come up with something ourselves.

So we have an Assembly language. It has instructions that we execute. What should we call a set of instructions? An instruction set? An instruction sequence? I know! A Program! Sometimes obvious things hide in plain sight.

We’ll parse the code of the program in order to understand its instructions. Then we’ll execute these instructions using our interpreter. We also have three registers, A, B, and C, that can store numerical values.

We just came up with our dictionary: the nouns and verbs we used above, such as program, instruction, register, parse, and execute.

I think this dictionary is big enough for now. Although we may discover new concepts later, I’m comfortable enough to start coding.

First steps

Let’s come up with a basic test — executing an empty program.
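Here’s a minimal sketch of such a test, together with the simplest implementation that passes it. The class name Assemrby comes from this project; the constructor that takes the source code and the execute method that returns the registers as a Hash are assumptions made for illustration.

```ruby
require "minitest/autorun"

# Simplest implementation that makes the test below pass.
# Assumed API: Assemrby.new(code).execute returns the registers as a Hash.
class Assemrby
  def initialize(code)
    @code = code
  end

  # Nothing gets parsed yet, so every register keeps its initial value.
  def execute
    { "A" => 0, "B" => 0, "C" => 0 }
  end
end

class AssemrbyTest < Minitest::Test
  def test_empty_program_leaves_all_registers_at_zero
    assert_equal({ "A" => 0, "B" => 0, "C" => 0 }, Assemrby.new("").execute)
  end
end
```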

Yeah, I like to give my projects cheesy names.

The code isn’t in any way smart: executing an empty program should leave 0s in all our registers.

We could later decide to extract the register into a separate class. For the time being, though, we’ll keep it simple and treat each register as a key-value pair in a Hash.

Implementing the MOV instruction

Okay, let’s try to modify the content of register A by implementing the MOV instruction.
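A rough sketch of what this step could look like, assuming the syntax MOV register number (as in MOV A 5) and the same hypothetical Assemrby.new(code).execute API. It deliberately handles a single line and does no validation yet.

```ruby
class Assemrby
  def initialize(code)
    @code = code
    @registers = { "A" => 0, "B" => 0, "C" => 0 }
  end

  def execute
    # "MOV A 5" splits into ["MOV", "A", "5"]; an empty program yields nils.
    command, register, value = @code.split
    @registers[register] = Integer(value) if command == "MOV"
    @registers
  end
end

registers = Assemrby.new("MOV A 5").execute
puts registers["A"] # prints 5
```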

The tests are passing.

We assume our program can only consist of one line of code, but we’ll fix this later.

We can either refactor now or cover some corner cases. I prefer to cover corner cases first. It makes me more comfortable with refactoring later. I also find it easier to think of any edge cases as soon as possible.

So what happens when the instruction looks like this: "MOV A", "MOV", or "MOV A asdasd"?

Or what if someone wants to use a register that doesn’t exist: "MOV D 7"?

These are all things we should guard against.

So let’s add tests and start throwing some errors!
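Here’s one way the guarded version might look. It’s a sketch under assumptions: ArgumentError and the error messages are stand-ins picked for illustration, not necessarily the errors the real code raises.

```ruby
class Assemrby
  def initialize(code)
    @code = code
    @registers = { "A" => 0, "B" => 0, "C" => 0 }
  end

  def execute
    command, *args = @code.split
    if command == "MOV"
      raise ArgumentError, "MOV expects 2 arguments, got #{args.size}" unless args.size == 2

      register, value = args
      raise ArgumentError, "unknown register: #{register}" unless @registers.key?(register)
      raise ArgumentError, "not a number: #{value}" unless value.match?(/\A-?\d+\z/)

      @registers[register] = value.to_i
    end
    @registers
  end
end

# Every malformed program mentioned above should be rejected.
["MOV A", "MOV", "MOV A asdasd", "MOV D 7"].each do |code|
  begin
    Assemrby.new(code).execute
  rescue ArgumentError => e
    puts "#{code.inspect} raised: #{e.message}"
  end
end
```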

Yay! I’m feeling much more comfortable now. But the code doesn’t look pretty. Let’s do some refactoring!

First Refactoring

Let’s extract some domain concepts into separate classes.

What should we start with?

Parameter validation in the execute method looks suspicious. I think it doesn’t belong there. The whole code block inside the if statement looks like an Instruction. So let’s try to extract an instruction class.

Also, this made me realize that the command variable has an incorrect name. It really should be called instruction.
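A sketch of how the extraction could turn out. The nested class name Mov, its constructor taking the raw arguments, and the private parse method are assumptions; only the idea of an instruction object with an execute method that receives the registers comes from the text.

```ruby
class Assemrby
  # One MOV instruction: validates its own arguments and knows how to run.
  class Mov
    def initialize(args)
      raise ArgumentError, "MOV expects 2 arguments, got #{args.size}" unless args.size == 2

      @register, @value = args
      raise ArgumentError, "not a number: #{@value}" unless @value.match?(/\A-?\d+\z/)
    end

    def execute(registers)
      raise ArgumentError, "unknown register: #{@register}" unless registers.key?(@register)

      registers[@register] = @value.to_i
    end
  end

  def initialize(code)
    @code = code
    @registers = { "A" => 0, "B" => 0, "C" => 0 }
  end

  def execute
    instruction = parse(@code)
    instruction.execute(@registers) if instruction
    @registers
  end

  private

  # Still a conditional with a hard-coded string; returns nil for anything else.
  def parse(line)
    name, *args = line.split
    Mov.new(args) if name == "MOV"
  end
end

puts Assemrby.new("MOV B 7").execute["B"] # prints 7
```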

That’s much more elegant and all our tests still pass!

We still have a conditional in Assemrby.parse and some hard-coded strings though. Let’s leave them for now and implement more requirements instead.

Handling Multiple Lines of Code

I think it’s high time our library could parse more than one line of Assembly code. The change should be easy. Let’s write a test that uses the MOV instruction multiple times and implement the required changes.
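The change could be as small as looping over the lines. In this sketch the Mov class and the parse method are the same hypothetical ones as before (validation trimmed for brevity), and only execute changes.

```ruby
class Assemrby
  class Mov
    def initialize(args)
      raise ArgumentError, "MOV expects 2 arguments, got #{args.size}" unless args.size == 2

      @register, @value = args
    end

    def execute(registers)
      registers[@register] = Integer(@value)
    end
  end

  def initialize(code)
    @code = code
    @registers = { "A" => 0, "B" => 0, "C" => 0 }
  end

  # Runs the instructions one line at a time, top to bottom.
  def execute
    @code.each_line do |line|
      instruction = parse(line)
      instruction.execute(@registers) if instruction
    end
    @registers
  end

  private

  def parse(line)
    name, *args = line.split
    Mov.new(args) if name == "MOV"
  end
end

registers = Assemrby.new("MOV A 1\nMOV B 2\nMOV A 3").execute
puts registers["A"] # prints 3, the later MOV wins
puts registers["B"] # prints 2
```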

That was easy!

It also made me realise one more thing: we should raise an error when we don’t recognise an instruction. Let’s continue with our requirements!

Handling ADD and MUL Instructions

So let’s implement the ADD and MUL instructions and handle the case when we don’t recognise the instruction.

Be aware that we can do the following: "ADD 10 A", "ADD 10 15", or "ADD A B". We have a lot of cases to cover.

Below we implement all these test cases:

This test file is really starting to look like a mess. We should do something about it.

Here’s the plan.

First, we’ll extend the previous if-statement to a case-statement. We’ll handle the new MUL and ADD instructions. We’ll raise an error when we don’t recognise the instruction.

Then we’ll refactor the new code.

Finally, we’ll refactor our tests.

So let’s continue coding! First, the raw code that passes all tests:
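What follows is an approximation of such a first pass rather than the exact listing. One big assumption: this sketch stores the result of ADD and MUL in register A, because the text doesn’t say where it goes. The class names Add and Mul appear later in the post; everything else is made up for illustration.

```ruby
class Assemrby
  NUMBER = /\A-?\d+\z/

  class Mov
    def initialize(args)
      raise ArgumentError, "MOV expects 2 arguments" unless args.size == 2

      @register, @value = args
      raise ArgumentError, "not a number: #{@value}" unless @value.match?(NUMBER)
    end

    def execute(registers)
      raise ArgumentError, "unknown register: #{@register}" unless registers.key?(@register)

      registers[@register] = @value.to_i
    end
  end

  class Add
    def initialize(args)
      raise ArgumentError, "ADD expects 2 arguments" unless args.size == 2

      @left, @right = args
    end

    def execute(registers)
      # Numbers and register references need different handling (twice!).
      # fetch raises KeyError for an unknown register.
      left = @left.match?(NUMBER) ? @left.to_i : registers.fetch(@left)
      right = @right.match?(NUMBER) ? @right.to_i : registers.fetch(@right)
      registers["A"] = left + right # assumption: the result lands in A
    end
  end

  # Copy-pasted from Add; only the operator and the message differ.
  class Mul
    def initialize(args)
      raise ArgumentError, "MUL expects 2 arguments" unless args.size == 2

      @left, @right = args
    end

    def execute(registers)
      left = @left.match?(NUMBER) ? @left.to_i : registers.fetch(@left)
      right = @right.match?(NUMBER) ? @right.to_i : registers.fetch(@right)
      registers["A"] = left * right # assumption: the result lands in A
    end
  end

  def initialize(code)
    @code = code
    @registers = { "A" => 0, "B" => 0, "C" => 0 }
  end

  def execute
    @code.each_line do |line|
      instruction = parse(line)
      instruction.execute(@registers) if instruction
    end
    @registers
  end

  private

  def parse(line)
    name, *args = line.split
    case name
    when "MOV" then Mov.new(args)
    when "ADD" then Add.new(args)
    when "MUL" then Mul.new(args)
    when nil then nil # blank line
    else raise ArgumentError, "unknown instruction: #{name}"
    end
  end
end

registers = Assemrby.new("MOV B 4\nADD 10 B\nMUL A 2").execute
puts registers["A"] # prints 28: (10 + 4) * 2
```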

Those new instructions have a lot in common. I copy-pasted code from one to the other, so they’re really similar.

And they’re both really ugly.

I didn’t even bother to change variable names. I hope you don’t commit code that looks like this :).

On the other hand, our tests are passing. We can safely refactor this monstrosity.

The worst part of this code is that we have to treat numerical values and register references differently. Plus we have to do it in both classes.

We’ll try to extract a nice interface that will let us forget about this.

Then we’ll see if it makes sense to remove the remaining duplication.
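A sketch of what that interface could look like. Assemrby::Value#value is the name used in this post; its exact signature, the error it raises, and storing results in register A are assumptions for the sake of a runnable example.

```ruby
class Assemrby
  # An operand that is either a numeric literal or a register reference.
  class Value
    NUMBER = /\A-?\d+\z/

    def initialize(token)
      @token = token
    end

    # Resolves the operand against the current register contents.
    def value(registers)
      return @token.to_i if @token.match?(NUMBER)

      registers.fetch(@token) { raise ArgumentError, "invalid operand: #{@token}" }
    end
  end

  class Add
    def initialize(args)
      raise ArgumentError, "ADD expects 2 arguments, got #{args.size}" unless args.size == 2

      @left, @right = args.map { |token| Value.new(token) }
    end

    def execute(registers)
      # Assumption: the result is stored in register A.
      registers["A"] = @left.value(registers) + @right.value(registers)
    end
  end

  class Mul
    def initialize(args)
      raise ArgumentError, "MUL expects 2 arguments, got #{args.size}" unless args.size == 2

      @left, @right = args.map { |token| Value.new(token) }
    end

    def execute(registers)
      # Assumption: the result is stored in register A.
      registers["A"] = @left.value(registers) * @right.value(registers)
    end
  end
end

registers = { "A" => 0, "B" => 4, "C" => 0 }
Assemrby::Add.new(%w[10 B]).execute(registers) # A becomes 10 + 4
Assemrby::Mul.new(%w[A 2]).execute(registers)  # A becomes 14 * 2
puts registers["A"] # prints 28
```

Add and Mul no longer care whether an operand is a number or a register; they only differ in the operator.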

It looks much better now!

I’m not super proud of the names we came up with though. Assemrby::Value#value doesn’t sound good and might be confusing.

On the other hand, I think that Add and Mul got handled well. They still look quite similar but this time I won’t remove the duplication.

When I was a junior developer I thought that code duplication was wrong and had to be fought immediately. However, with experience I learned it’s not always a bad thing.

Cleaning Up Tests

I guess if I hadn’t been writing my thoughts down, I’d have already hit either the 60- or the 90-minute mark. Let’s clean up the tests and finish. After this we’ll take a moment to reflect on the code.

We have a lot of tests that exercise different instructions. I believe we should move them to test files specific to each instruction. So here we go.
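As an illustration, a test class dedicated to MOV might look like the sketch below. The Mov class is a stand-in defined inline so the example runs on its own; in a real project it would live in its own file and be loaded by the test. Class and test names are hypothetical.

```ruby
require "minitest/autorun"

# Stand-in for the real instruction class.
class Assemrby
  class Mov
    def initialize(args)
      raise ArgumentError, "MOV expects 2 arguments, got #{args.size}" unless args.size == 2

      @register, @value = args
      raise ArgumentError, "not a number: #{@value}" unless @value.match?(/\A-?\d+\z/)
    end

    def execute(registers)
      raise ArgumentError, "unknown register: #{@register}" unless registers.key?(@register)

      registers[@register] = @value.to_i
    end
  end
end

# Everything that concerns only MOV lives in one place.
class MovTest < Minitest::Test
  def setup
    @registers = { "A" => 0, "B" => 0, "C" => 0 }
  end

  def test_stores_the_number_in_the_register
    Assemrby::Mov.new(%w[A 5]).execute(@registers)
    assert_equal 5, @registers["A"]
  end

  def test_rejects_a_missing_argument
    assert_raises(ArgumentError) { Assemrby::Mov.new(%w[A]) }
  end

  def test_rejects_a_non_numeric_value
    assert_raises(ArgumentError) { Assemrby::Mov.new(%w[A asdasd]) }
  end

  def test_rejects_an_unknown_register
    assert_raises(ArgumentError) { Assemrby::Mov.new(%w[D 7]).execute(@registers) }
  end
end
```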

And I think we can stop here. We didn’t implement all the features but I think we went quite far for a blog post.

What we’re missing are the JUMP instructions. How would we implement them?

We’d have to store the value of an Instruction Pointer. The Instruction Pointer would hold the number of the line we’re currently executing. We would have to pass it to Instruction#execute along with the registers.

The way I’d approach it would be first extracting a separate class that would store both the registers and the Instruction Pointer.

What would we call the new class? ExecutionEnvironment perhaps? Or simply Environment. Then we could safely pass an object of this class around.

Also, the JUMP instruction could potentially jump to a line of code that hasn’t been parsed yet.

Because of this, it’d be simpler to parse all the code up front and build the Instruction objects with Instruction#build.

Then, we could store these objects in an Array. An Instruction Pointer would simply be an index of this array.
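To make the idea concrete, here’s a small sketch of that design. Only the Environment name and the "pointer is an array index" idea come from the text above. The Jump class, its unconditional zero-based semantics, the run helper, and the simplified Mov constructor are all assumptions, and parsing is skipped by building the instruction objects by hand.

```ruby
# Holds everything an instruction may read or change while it runs.
class Environment
  attr_reader :registers
  attr_accessor :instruction_pointer

  def initialize
    @registers = { "A" => 0, "B" => 0, "C" => 0 }
    @instruction_pointer = 0
  end
end

class Mov
  def initialize(register, number)
    @register = register
    @number = number
  end

  def execute(environment)
    environment.registers[@register] = @number
    environment.instruction_pointer += 1 # move on to the next instruction
  end
end

# Assumed semantics: an unconditional jump to a zero-based array index.
class Jump
  def initialize(target)
    @target = target
  end

  def execute(environment)
    environment.instruction_pointer = @target
  end
end

# Runs until the pointer leaves the array of already-built instructions.
def run(instructions)
  environment = Environment.new
  while (0...instructions.size).cover?(environment.instruction_pointer)
    instructions[environment.instruction_pointer].execute(environment)
  end
  environment
end

program = [
  Mov.new("A", 1),
  Jump.new(3),      # skips the next instruction
  Mov.new("A", 99),
  Mov.new("B", 2)
]
environment = run(program)
puts environment.registers["A"] # prints 1
puts environment.registers["B"] # prints 2
```

Because every instruction is built before anything runs, a jump forward is just an assignment to the pointer.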