This is the fourth part of a series of posts detailing the steps required to get a simple Z80-based computer running, facilitated by a Teensy microcontroller. It’s a bit of fun, fusing old and new hobbyist technologies. See Part 1, Part 2 and Part 3, if you’ve missed them.

I mentioned ‘VRAM’ in the last post, which was really just an area of RAM that I pointed the Teensy at through a port. I’ve now got something a bit more serious set up, which is completely separate from main RAM. It’s accessed via the I/O ports, after a flag has been set.

At the moment, I have all but one of the address bus pins connected on the Z80. This means I can address 32KB of RAM. The screen, which is connected to the Teensy via SPI, is a 320×240, 16-bit colour unit. Sadly, this means a full-size framebuffer for this screen would be an eye-watering 150KB! Even at half the resolution in each dimension, 160×120 in full colour is 37.5KB. I cannot add the additional address bus pin for a 16-bit address space, as I am running out of I/Os on the Teensy. I have a single one left, and it’s needed for something I hope to explain in the next few posts. I can use a 256-colour palette, which brings the memory requirements for 160×120 down to just under 19KB, but it’s still a large chunk of memory which can no longer be used for programs.
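The arithmetic behind those figures is easy to check. Here is a minimal sketch in C; `framebuffer_bytes` is a name made up for this illustration and is not part of the project code.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes needed for a packed framebuffer.
 * Hypothetical helper for illustration only. */
static uint32_t framebuffer_bytes(uint32_t width, uint32_t height,
                                  uint32_t bits_per_pixel)
{
    /* Only whole-byte pixel sizes (8bpp, 16bpp) are handled in this sketch. */
    assert(bits_per_pixel % 8 == 0);
    return width * height * (bits_per_pixel / 8);
}
```

Calling it with 320×240 at 16bpp gives 153600 bytes (150KB), 160×120 at 16bpp gives 38400 bytes (37.5KB), and 160×120 at 8bpp gives 19200 bytes (18.75KB), before any palette is counted.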

VRAM as a second address space

So I decided to use 16-bit (15 in my case) I/O addressing to enable a secondary 32KB address space to use for VRAM. The Z80 in/out instructions which take the port from the C register actually place register B onto the top half of the address bus, allowing access to the full address space. We have a specific entry in the standard 256-port I/O space which is used to set a flag, and the Teensy interprets that flag as an instruction to treat all further I/O writes as writes into a special VRAM memory. I then have the highest port possible (0x7FFF) as the disable-VRAM port. Reading from this port resets the flag on the Teensy, and I/O operations return to their standard state. This gives a completely separate memory space for VRAM, leaving all of main RAM for programs and data.

There is a downside to this – I/O writes have an additional wait cycle automatically inserted, so they are slower than normal RAM writes. Additionally, things such as loading images from SD cards into VRAM would need to go via RAM, unless additional flags are inserted into the file system requests to specify what memory spaces the buffers refer to. However, I do think that those downsides will be insignificant when I try to make the Z80 clock asynchronously with the Teensy operations, as there are likely to be many wait states for RAM as well as I/O operations forced by the use of the WAIT input to the Z80.

On the Teensy, the code for this is very simple. We have a second global array to use as the VRAM storage, and then have an ioVramBankSet flag which we check on I/O operations.

    #define PORT_VRAM_BANK_SET   0xC8
    #define PORT_VRAM_BANK_RESET 0x7FFF

    byte Z80_VRAM[Z80_VRAM_LENGTH] = {0};

    void loop() {
      ...
      unsigned short portAddress = addressBus & 0x00FF;

      if (RD_val) {
        if (ioVramBankSet && (addressBus == PORT_VRAM_BANK_RESET)) {
          // PORT_VRAM_BANK_RESET is a special case 16-bit port
          ioVramBankSet = 0;
        }
        ...
      } else if (WR_val) {
        readDataBus();
        if (ioVramBankSet) {
          Z80_VRAM[addressBus] = dataBus;
        } else if (portAddress == PORT_VRAM_BANK_SET) {
          ioVramBankSet = 1;
        }
        ...

The above code is really all we need for this. The upside of using this I/O style system instead of, say, RAM banking, is that the instruction stream and source data can remain in standard RAM, and we do not need to do any mapping of the address space, which would restrict us significantly with only 15 address bits. Now we need to make use of the data which is stored in that space for graphics!
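If you want to play with the idea away from the hardware, the flag logic can be modelled on a desktop machine. The following is a rough, self-contained sketch rather than the actual Teensy firmware: the `io_model` struct, the `io_write` and `io_read` functions, the 20KB `VRAM_LENGTH`, the bounds check and the placeholder read value are all assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PORT_VRAM_BANK_SET   0xC8
#define PORT_VRAM_BANK_RESET 0x7FFF
#define VRAM_LENGTH          (20 * 1024)  /* assumed size for this sketch */

/* Host-side stand-in for the Teensy state (hypothetical type). */
typedef struct {
    uint8_t vram[VRAM_LENGTH];
    int     vram_bank_set;               /* mirrors the ioVramBankSet flag */
} io_model;

/* Models one Z80 I/O write cycle; address is the 15-bit bus value. */
static void io_write(io_model *m, uint16_t address, uint8_t data)
{
    assert(m != NULL);
    if (m->vram_bank_set) {
        /* Bounds check is an addition of this sketch. */
        if (address < VRAM_LENGTH) {
            m->vram[address] = data;
        }
    } else if ((address & 0x00FF) == PORT_VRAM_BANK_SET) {
        m->vram_bank_set = 1;            /* data value is ignored */
    }
}

/* Models one Z80 I/O read cycle. */
static uint8_t io_read(io_model *m, uint16_t address)
{
    assert(m != NULL);
    if (m->vram_bank_set && address == PORT_VRAM_BANK_RESET) {
        m->vram_bank_set = 0;            /* back to normal port handling */
    }
    return 0;                            /* placeholder; real value not modelled */
}
```

Note that once the flag is set, a write to port 0xC8 is just a write to VRAM offset 0xC8. Only a read of 0x7FFF gets you back out.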

Display Modes

I mentioned earlier the amount of memory needed for various resolutions and colour depths. The simple fact is that the Teensy 3.1 microcontroller I’m using only has 64KB of RAM. Within that, we need the Z80 RAM, VRAM, and then working memory for the Teensy itself, for driving the display, working with the SD card and handling the FAT filesystem. This pretty much means 160×120 at 8bpp is really the maximum we can achieve. When combined with a 256-entry palette, we can get a very generous range of colours, and come in at less than 20KB. So we’ll have the VRAM set to 20KB.

The first and most generous mode is as above, 160×120, with a 256-entry 16-bit colour palette. This is laid out in VRAM with the first 512 bytes as the palette, and after that the pixel data. This layout is the same for all display modes to simplify implementation. There are additional modes for whether the display is stretched or not. If it is not stretched, the offset in the TFT will be configurable so you can move it around the screen and combine it with console text. As I write this, the following modes are supported:

40×30, 16bpp

48×48, 16bpp

48×48, 8bpp, 16-bit palette

80×60, 8bpp, 16-bit palette

160×120, 8bpp, 16-bit palette
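As a sanity check that every mode fits, here is a small sketch in C that works out the VRAM footprint of each one, counting the 512-byte palette area at the start. The `display_mode` struct, the `MODES` table and `mode_vram_bytes` are hypothetical names for this example, and the table order is not meant to reflect the real mode index values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PALETTE_BYTES 512u            /* 256 entries x 2 bytes, at the start of VRAM */
#define VRAM_BYTES    (20u * 1024u)   /* the 20KB VRAM size chosen above */

/* Hypothetical description of a display mode. */
typedef struct {
    uint16_t width;
    uint16_t height;
    uint8_t  bits_per_pixel;
} display_mode;

/* Listed in the same order as the post; NOT the real mode indices. */
static const display_mode MODES[] = {
    {  40,  30, 16 },
    {  48,  48, 16 },
    {  48,  48,  8 },
    {  80,  60,  8 },
    { 160, 120,  8 },
};

/* Palette area plus pixel data, in bytes. */
static uint32_t mode_vram_bytes(const display_mode *mode)
{
    assert(mode != NULL);
    return PALETTE_BYTES +
           (uint32_t)mode->width * mode->height * (mode->bits_per_pixel / 8u);
}
```

The largest mode, 160×120 at 8bpp, comes out at 19712 bytes, which leaves 768 bytes spare in a 20KB VRAM.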

The mode is set by an index value which is written to an IO port. A draw port exists, and a write to it initiates a full screen redraw. The data bus value is ignored. I may implement a sub-screen redraw which acts on a set area of the screen later as an optimization.

An additional feature is that the palette has an offset associated with it, which wraps around the 256 entries. This makes palette-shifting effects such as plasma an incredibly easy hack. It also means that when I implement modes with fewer bits per pixel index, there can be multiple palettes stored that can be switched with a single I/O write.
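To make the offset idea concrete, here is a rough sketch in C of how a pixel could be resolved to a colour in an 8bpp mode. This is not the actual display code: the function names are invented, and I am assuming for the example that the offset is added to the pixel index and that palette entries are stored little-endian.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PIXEL_DATA_OFFSET 512u   /* pixel data follows the 512-byte palette */

/* Reads one 16-bit palette entry. Little-endian storage is an assumption. */
static uint16_t palette_entry(const uint8_t *vram, uint8_t index)
{
    return (uint16_t)(vram[2u * index] | (vram[2u * index + 1u] << 8));
}

/* Resolves pixel (x, y) of an 8bpp mode to its 16-bit colour.
 * Hypothetical helper; adding the offset to the index is an assumption. */
static uint16_t pixel_colour(const uint8_t *vram, uint16_t width,
                             uint16_t x, uint16_t y, uint8_t palette_offset)
{
    assert(vram != NULL && x < width);
    uint8_t index = vram[PIXEL_DATA_OFFSET + (uint32_t)y * width + x];
    /* The uint8_t cast makes the sum wrap around the 256 entries. */
    uint8_t shifted = (uint8_t)(index + palette_offset);
    return palette_entry(vram, shifted);
}
```

Because the pixel data never changes, cycling the whole image through the palette costs one I/O write for the offset plus a redraw.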



That is the trick used in the plasma example shown in the middle of this video. The Z80 is running slowly, at around 2kHz (remember, everything is still synchronous).

The Z80 code

The code is very simple to load pixel data into the VRAM space from RAM, and to do plasma palette cycling.

        ld a, 6
        out (PORT_VRAM_SETMODE), a      ; 'vram' displaymode 6: 80x60

        ; we can put out BC now to write VRAM
        ld bc, 0200h                    ; pixel mem offset, after palette
        ld hl, 012c0h                   ; size of pixel data (80x60)
        ld de, image_pixels_80x60       ; pixel data in binary section
        call ram_2_vram

        ld bc, 0000h                    ; vram offset of the palette
        ld hl, 0200h                    ; size of palette data (256 2-byte entries)
        ld de, palette_defn             ; palette data in binary section
        call ram_2_vram

        ld hl, 0
    cycle_palette_idx:
        inc hl
        ld a, l
        out (PORT_VRAM_PALETTE_IDX), a  ; inc palette idx
        ld a, 0
        out (PORT_VRAM_DRAW), a         ; draw vram
        jr cycle_palette_idx

This will load the pixel and palette data which are stored in the binary already, into VRAM. The PORT_VRAM_PALETTE_IDX I/O port sets the ‘palette offset’ so it can be rotated incredibly easily, and PORT_VRAM_DRAW draws the contents of VRAM to the display given the current display mode set via the PORT_VRAM_SETMODE port.

    ; de = src in ram, bc = vram offset, hl = size
    ram_2_vram:
        di
        push de
        push bc
        push hl
        push af
        ld a, 1
        out (PORT_VRAM_BANK_SET), a
    ram_2_vram_loop:
        ld a, (de)
        out (c), a
        dec hl
        inc de
        inc bc
        ld a, h
        or l
        jr nz,ram_2_vram_loop
        ; return to non-vram
        ld bc, PORT_VRAM_BANK_RESET
        in a, (c)
        pop af
        pop hl
        pop bc
        pop de
        ei
        ret

The ram_2_vram function shows how the VRAM memory space is enabled with the PORT_VRAM_BANK_SET port, and disabled with a PORT_VRAM_BANK_RESET read. Also note how I disable interrupts within this function – as interrupts may use i/o ports themselves, for instance the serial receive ‘data available’ interrupt, it’s important to disable interrupts whenever the VRAM is enabled for writing. Other than that, it works a treat!

These video modes also allow for some fun error screens. For example, if the SD card is not mounted correctly (and, on boot, the Teensy tries to locate a kernel.bin on the SD, so it needs to be there) we get a nice error graphic. This one uses the 40x30x16bpp mode, but I’ll make it a lot smaller soon with a 16-colour palette mode.

A simple shell

You can also see from the video above that I have a very basic shell working. It simply takes input characters from serial and runs programs matching those names from SD. It does no argument passing. All it checks is whether a file exists with that name, and if it does, it loads it into memory at offset 0x1000 before calling into it. The ls, cls and plasma binaries are all simply made in Z80 assembly and have no dependencies on any features that a kernel may need to provide.

At the moment, I do have the sdcc C compiler running with a compiled ‘kernel’ which allows for some real operating system services and true program loading with arguments to main, system calls, etc. Watch this space! I’ll talk more about that soon 🙂 Code as always is on my GitHub. A full schematic is incoming, but it’s not difficult to decipher if you want to make your own.

I hope you’ve been enjoying this Teensy Z80 project. If you have, let me know on Twitter @domipheus!

Part 5 is now available!