A bit of ST and a bit of IO: Designing a DCPU-16 emulator



Published on April 12, 2012 under the tag On monadic typeclasses, pure and impure state monadsPublished on April 12, 2012 under the tag haskell

When Notch announced that his upcoming game 0x10c would feature an in-game 16 bit CPU, the internet went nuts about it.

At this point, it seems that every self-respecting hacker has at least already implemented an optimizing assembler, a dissassembler, an emulator, a debugger, or even an operating system or an LLVM backend.

I obviously couldn’t lag behind and took the opportunity to think about the design of such an emulator in Haskell. This blogpost is the result of my search, and my experiments with different designs until I arrived at something which had all the nice properties I desired.

The full implementation of this design can be found here.

The core of the problem: load and store

When designing a system, it’s often a good idea to start from a minimal but functional core and work your way up from there – this is also the approach I have taken here.

We can create such a core system by defining the state of the emulator as a datatype with a few primitive operations on it. The state of the emulator can be represented as a Memory datatype. It’s a tiny wrapper around a mutable array, and has three basic operations:

module Memory where data Address = Pc | Sp | Ram Word16 -- And more... data Memory s = Memory -- Exact definition omitted new :: ST s ( Memory s) s (s) load :: Memory s -> Address -> ST s Word16 store :: Memory s -> Address -> Word16 -> ST s () s ()

Using the above Address datatype, we’re able to access and modify all of the values the emulator needs using two simple operations: load and store . The fact that we can access and modify, for example, the stack pointer ( Sp ) in the same way as we would modify a value in the RAM ( Ram 0x1000 ) simplifies the implementation of the actual emulator. As an example, think of code like this:

SP , I SET, I 0x1000 ], SP SET [],

This all runs in the ST monad. This monad is perhaps a bit of a strange beast, since it allows destructive updates: this means we can update words in the RAM of the CPU really fast. However, the result remains pure nonetheless: unlike the IO monad, code written in the ST monad can only modify state internal to the monad. This is guaranteed by the type system using the Rank2Types extension.

This is great because it allows us to have a really fast emulator, of which we can trivially say it is deterministic.

A necessary abstraction

However, while the current specification does not include keyboard input and video output, this will certainly be added in the future. If we want to support that in our emulator, does it mean that we will eventually be forced to run our code in the IO monad, and that our brave efforts to build a deterministic emulator have been in vain?

Not exactly. Using a typeclass, it is possible to build an extremely useful abstraction layer which helps us with this problem.

module Emulator.Monad where class Monad m => MonadEmulator m where load :: Address -> m Word16 store :: Address -> Word16 -> m () m ()

We can now implement most of the emulator with just the type MonadEmulator m => m () , since we really only need load and store .

The following (overly simplified and incorrect) snippet illustrates that fact.

module Emulator where import Emulator.Monad step :: MonadEmulator m => m () m () = do step -- Load the next instruction (at PC) and execute it <- load Pc pcload <- load ( Ram pc) instrload (pc) Pc (pc + 1 ) store(pc execute instr

A deterministic implementation

We can write a simple, nice and pure implementation of this typeclass by using ST , as we previously discussed. We need access to the Memory , and for this purpose we can simply wrap a ReaderT around ST :

module Emulator.Monad.ST where import Emulator.Monad import Memory ( Memory ) import qualified Memory as Memory newtype STEmulator s a = STEmulator ( ReaderT ( Memory s) ( ST s) a) s as) (s) a) deriving ( Monad ) instance MonadEmulator ( STEmulator s) where s) = STEmulator $ do load address <- ask memask $ Memory.load mem address liftMemory.load mem address = STEmulator $ do store address word <- ask memask $ Memory.store mem address word liftMemory.store mem address word runSTEmulator :: ( forall s . STEmulator s a) -> a s a) -- Implementation uses runST to produce a pure result: -- runST :: (forall s. ST s a) -> a

And a bit of IO

As briefly discussed before, we want to be able to add keyboard input and video output. I’ll focus on video output for this blogpost, since it’s a little easier to explain. The specification for video output is not yet released, but it was possible to deduce a lot from some screenshots.

The video memory starts at 0x8000 , and when we store characters at these addresses, they are displayed on the video terminal. Note that we don’t need to extend the MonadEmulator typeclass for this purpose: we already support writing values to 0x8000 + i , because we can reach those addresses through the regular store operation, discussed above!

But actually displaying the video terminal is something we can’t do with our STEmulator – let’s add an IOEmulator as well.

The stToIO function from Control.Monad.ST allows us to reuse everything in the Memory module.

stToIO :: ST RealWorld a -> IO a

Note that this is not an unsafe function: converting ST to IO is always safe this way (converting IO to ST , however, is tricky business, but this is not needed for our emulator).

Almost exactly like our STEmulator is a ReaderT wrapper around ST , our IOEmulator starts out like a ReaderT wrapper around IO .

module Emulator.Monad.IO where import Emulator.Monad import Memory ( Memory ) import qualified Memory as Memory newtype IOEmulator a = IOEmulator ( ReaderT ( Memory RealWorld ) IO a) a) deriving ( Monad , MonadIO ) instance MonadEmulator IOEmulator where = IOEmulator $ do load address <- ask memask $ stToIO $ Memory.load mem address liftstToIOMemory.load mem address = IOEmulator $ do store address word <- ask memask $ stToIO $ Memory.store mem address word liftstToIOMemory.store mem address word runIOEmulator :: IOEmulator a -> IO a -- Implementation trivial, omitted

If we now have some code for an emulator, we can run it in two ways:

-- Load a program, execute 1000 instructions, return the value in the RAM at -- position 0x1000. test :: MonadEmulator m => m Word16 = do test 0xa461 , 0x7001 , 0x8403 {-# and more... #-} ] loadProgram [ 1000 step replicateM_step Ram 0x1000 ) load ( t1 :: Word16 = runSTEmulator test t1runSTEmulator test t2 :: IO Word16 = runIOEmulator test t2runIOEmulator test

Let’s not stop here yet, because with our current version of IOEmulator , all we’ve done is create an impure version of STEmulator , without any extra features!

Adding the desired features is relatively easy at this point: suppose we want to add support for the video terminal. We just modify the store implementation of IOEmulator a little:

= IOEmulator $ do store address word <- ask memask $ stToIO $ Memory.store mem address word liftstToIOMemory.store mem address word $ video address word -- New line added here liftvideo address word

And a video terminal implementation!

video :: Address -> Word16 -> IO () () Ram address) word video (address) word | address < videoStart = return () addressvideoStart() | address >= videoEnd = return () addressvideoEnd() | otherwise = do -- Calculate row, column, character code and draw! return () () where = 0x8000 videoStart = videoStart + rows * columns -- Unknown yet? videoEndvideoStartrowscolumns = return () video _ _()

Conclusion

ST is a wonderful tool for problems that require (local) destructive updates for performance reasons . Additionally, we can easily convert it to IO , which allows us to simply pick one of the two using an abstraction layer: we can have deterministic results for tests, proper keyboard/video support, and the entire implementation of the actual CPU is shared code.

My thanks to Andy Georges and Toon Willems for proofreading and some corrections.