Rasp64

Rasp64 is a Raspberry Pi attached to the user port of a C64, primarily for the purpose of cross-development.

The rasp64 slave board on top of its host, a C64C.

The C64 is a most fascinating machine to program, but the development tools available on the platform itself are somewhat cumbersome. Nowadays, most people write their code on a modern computer and cross-compile it. Not only the assembler itself, but crunchers and disk management tools benefit from a host environment with substantially more RAM than the target machine. High-level languages may not be relevant for the code that will actually run on the C64, but for intermediate compilation steps that generate data tables they improve the development experience drastically.

Most people thus sit at a modern computer when writing the code, and then try it out using an emulator. However, why not turn it around? With rasp64, I can sit at a C64 and access a modern development environment using a terminal interface, and then try out the code on the real thing.

Physical layer

Between the C64 and the Raspberry Pi, I've hooked up a homemade communication interface card based on an ATmega88 microcontroller. When designing this interface, I wanted to optimise the data transfer rate in the Pi-to-C64 direction.

The interface card.

The C64 user port has eight GPIO pins that can be read (or written) as a byte at address $dd01, as well as a handshake line that is pulsed briefly whenever this port is accessed. Thus, with a single instruction, the C64 can receive one byte of data and signal the microcontroller that it is ready to receive the next.

This handshake signal is monitored by a pin change interrupt on the microcontroller. The receiving loop on the C64 side has a best-case execution time of 18 clock cycles, so the interrupt handler must produce a new value within that time. While the C64 runs at 1 MHz, the ATmega88 runs at 20 MHz, and most of its instructions are single-cycle; 18 C64 cycles thus correspond to about 360 microcontroller cycles. It is therefore quite possible to grab the next byte from a circular transmit buffer and emit it in time. Should the buffer be empty, a dummy marker byte (0xbf) is transmitted.
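The buffer logic described above can be illustrated with a small Python model. This is only a sketch of the idea, not the actual firmware: the class name, the method names and the buffer size are assumptions, and only the 0xbf idle marker comes from the description.

```python
IDLE_MARKER = 0xBF  # dummy byte sent when the buffer is empty (from the text)
BUFFER_SIZE = 256   # assumed size; a power of two keeps the index wrap cheap


class TransmitBuffer:
    """Model of a circular transmit buffer drained by the handshake interrupt."""

    def __init__(self):
        self.data = bytearray(BUFFER_SIZE)
        self.head = 0  # next slot to write (main context)
        self.tail = 0  # next slot to read (interrupt context)

    def is_empty(self):
        return self.head == self.tail

    def is_full(self):
        # One slot is kept free so that "full" and "empty" can be told apart.
        return (self.head + 1) % BUFFER_SIZE == self.tail

    def put(self, byte):
        """Main context: queue one byte. Returns False if there is no room."""
        if self.is_full():
            return False
        self.data[self.head] = byte
        self.head = (self.head + 1) % BUFFER_SIZE
        return True

    def on_handshake(self):
        """Interrupt context: return the byte to drive onto the data lines."""
        if self.is_empty():
            return IDLE_MARKER
        byte = self.data[self.tail]
        self.tail = (self.tail + 1) % BUFFER_SIZE
        return byte
```

In this model, a caller that gets False from put would simply retry, which corresponds to busy-waiting for room in the buffer.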

Ethernet connectivity is useful, and so is the ability to quickly move a USB stick between the Raspberry Pi and the 1541 Ultimate.

In the uplink direction, there is no need for speed. One of the synchronous serial ports provided by the user port (two lines: clock and data) as well as an additional GPIO line are connected to the SPI slave interface of the microcontroller. The GPIO line is connected to Slave Select and is pulled low once, when the terminal software starts, in order to synchronise the bit stream at a byte boundary.

The Raspberry Pi receives and transmits data over a 3.3 V UART which is hooked up via level translators to the built-in 5 V UART of the microcontroller.

The interface card is powered from the user port, but the Raspberry Pi has a separate power supply so I can power-cycle the C64 without rebooting the development environment.

Transport layer

Flow control is necessary only in the downlink direction. The Raspberry Pi transmits data at 115200 baud to the microcontroller, which stores the data in a circular receive buffer. When the buffer is nearly full, an XOFF byte is sent back to the Raspberry Pi, where the TTY has been configured for software flow control. Once the buffer has drained sufficiently, an XON byte is transmitted. When the microcontroller wishes to send data to the C64 (and this happens in main context, not interrupt context), it busy-waits until there is room in the circular transmit buffer. It then places the data in the buffer, to be fetched by the aforementioned handshake-driven interrupt. Thus, flow is controlled at every step.
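The XON/XOFF behaviour on the receive buffer can be sketched as a pair of watermarks with hysteresis. The model below is illustrative only: the capacity and both watermark values are assumptions, as are the class and method names. XON (0x11) and XOFF (0x13) are the standard control characters for software flow control.

```python
XON = 0x11   # DC1, the standard "resume" character
XOFF = 0x13  # DC3, the standard "pause" character

CAPACITY = 256        # assumed receive buffer size
HIGH_WATERMARK = 192  # assumed: request a pause at or above this fill level
LOW_WATERMARK = 64    # assumed: request a resume at or below this fill level


class ReceiveFlowControl:
    """Tracks the fill level and decides when to send XOFF or XON."""

    def __init__(self):
        self.fill = 0
        self.paused = False

    def byte_received(self):
        """Call once per byte arriving from the UART.

        Returns XOFF if the sender should be paused now, else None.
        """
        self.fill += 1
        if not self.paused and self.fill >= HIGH_WATERMARK:
            self.paused = True
            return XOFF
        return None

    def byte_consumed(self):
        """Call once per byte taken out of the buffer in main context.

        Returns XON if the sender may resume now, else None.
        """
        self.fill -= 1
        if self.paused and self.fill <= LOW_WATERMARK:
            self.paused = False
            return XON
        return None
```

The gap between the high watermark and the capacity leaves room for bytes that are already in flight when XOFF is sent, and the gap between the two watermarks avoids sending a flood of alternating XON and XOFF bytes.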

Spacers allow heat to escape from underneath the Raspberry Pi.

Presentation layer

The receive buffer is consumed in main context. The data drives a state machine that translates VT100 escape codes into a much simpler set of escape codes that can be decoded quickly on the C64. To do this, the microcontroller needs to keep track of some of the state at the C64 end as well as the state of the emulated VT100. This state is reset through a synchronisation mechanism that is activated at power-up, and when the terminal software starts.
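To give a feel for what such a state machine involves, here is a minimal Python sketch that recognises sequences of the form ESC [ parameters final-byte and turns them into events. It is not the firmware's translator: the event tuples and all names are hypothetical, and the simplified codes actually sent to the C64 are not modelled. Private-mode sequences (those with a '?' after the bracket) and other VT100 details are deliberately left out.

```python
ESC = 0x1B


class CsiParser:
    """Minimal state machine for sequences like ESC [ 12 ; 40 H."""

    def __init__(self):
        self.state = "ground"
        self.params = []
        self.current = None  # parameter being accumulated, None if omitted

    def feed(self, byte):
        """Consume one byte; return an event tuple, or None if more is needed."""
        if self.state == "ground":
            if byte == ESC:
                self.state = "escape"
                return None
            return ("print", byte)

        if self.state == "escape":
            if byte == ord("["):
                self.state = "csi"
                self.params = []
                self.current = None
                return None
            self.state = "ground"
            return ("unhandled", byte)

        # state == "csi": collect decimal parameters until a final byte arrives
        if ord("0") <= byte <= ord("9"):
            self.current = (self.current or 0) * 10 + (byte - ord("0"))
            return None
        if byte == ord(";"):
            self.params.append(self.current)
            self.current = None
            return None
        self.params.append(self.current)
        self.state = "ground"
        return ("csi", chr(byte), tuple(self.params))
```

Feeding the bytes of ESC [ 12 ; 40 H one at a time yields None for every byte except the last, which produces the event ("csi", "H", (12, 40)); a translator would then map that event to whatever compact code the terminal expects.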

In the other direction, keys are debounced and repeated on the C64 side (with 3-key rollover but no protection from ghosting), and key-down events are transmitted in a very simple format: the two most significant bits of each byte encode the current modifier status (none, shift, control or commodore; combined modifiers are not supported), while the six remaining bits represent the raw keycode. This is then translated into ASCII by the microcontroller.
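The key event format is compact enough to show in a few lines of Python. Note that the numeric value assigned to each modifier below is an assumption (the text gives the four states but not their bit patterns), and the function names are made up for this example.

```python
# The order follows the text; the numeric values 0..3 are an assumption.
MODIFIERS = ("none", "shift", "control", "commodore")


def encode_key_event(modifier, keycode):
    """Pack a modifier and a six-bit raw keycode into one byte."""
    if not 0 <= keycode < 64:
        raise ValueError("raw keycode must fit in six bits")
    return (MODIFIERS.index(modifier) << 6) | keycode


def decode_key_event(byte):
    """Split a key event byte into (modifier, raw keycode)."""
    return MODIFIERS[(byte >> 6) & 0x03], byte & 0x3F
```

With this assumed numbering, the byte 0x85 would decode to ("control", 5).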

At the gates of adventure.

Application layer

The Raspberry Pi runs a Debian installation and provides a standard login prompt at the UART. Once logged in, I use screen for session control and terminal multiplexing, and vim as my editor of choice. I also rely heavily on xa65, make, gcc and various tools of my own.

Terminal software

On the C64 side, I wrote a small terminal program for a monochrome, fixed-size 40x25 character display. It comes with its own protocol: A byte with the most significant bit set is a control code (e.g. move to a particular row or column, turn on inverse video, configure the scrolling area, scroll it...) while other bytes are raw screen codes to be stored in the video matrix.

Here is the 18-cycle receive loop in all its minimalistic glory:

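recvloop  lda $dd01      ; 4 cycles: read a byte, pulsing the handshake line
          bmi special    ; 2 cycles when not taken: high bit set means control code
invmask   ora #0         ; 2 cycles: operand is patched to 0 or $80
recvto    sta !0,x       ; 5 cycles: address is patched to point at the current row
          inx            ; 2 cycles: advance the cursor column
          jmp recvloop   ; 3 cycles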

Because screen codes cannot have the most significant bit set in my transfer protocol, the font is encoded as 128 characters followed by 128 inverted characters. The ora instruction is used to set the high bit of a screen code in order to display the character in inverse video. Self-modifying code is used to change the operand to either 0 or 0x80. The x register is used to keep track of the current x-position of the cursor, while the address stored inside the sta instruction points to the current row.
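For readers less familiar with 6502 assembly, the behaviour of the receive loop can be restated as a short Python model. This is an illustration only: the function name, its parameters and its return convention are hypothetical, and the handling of control codes (the special branch) is left to the caller.

```python
def receive(stream, row, inverse=False, x=0):
    """Store incoming screen codes in `row` until a control code arrives.

    Returns (x, control_code), where control_code is None if the stream
    ended without one. Like the assembly loop, this does no bounds checking;
    the sender is assumed to keep the cursor within the row.
    """
    mask = 0x80 if inverse else 0x00  # operand of the self-modified ora
    for byte in stream:               # lda $dd01
        if byte & 0x80:               # bmi special
            return x, byte
        row[x] = byte | mask          # ora #mask, then sta row,x
        x = (x + 1) & 0xFF            # inx (an 8-bit register wraps around)
    return x, None
```

Switching inverse video on or off then amounts to changing one byte (the mask), and moving to another row amounts to changing the destination buffer, which is what the self-modifying code achieves without adding cycles to the loop.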

Scrolling

Fast partial scrolling is necessary for a responsive UNIX environment. Every time you add or delete a line in your text editor, for instance, the lines below the cursor are scrolled using terminal escape codes. Thus, my terminal protocol supports the concept of a scrolling area (a subrange of the 25 lines available on the display) that can be rotated up or down.

Editing some timing-critical C64 code using vim.

But it is not enough to support this in the protocol. In order to reach the desired responsiveness, the scrolling operations themselves must be fast. It simply won't do to shuffle the actual character data between rows in the video matrix. Somehow, we must store the character rows out-of-order, and use a table of row references to index them in the current order. Then we can scroll the references rather than the rows.
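The idea of scrolling references instead of rows can be sketched in Python as follows. The function names and the convention of returning the recycled row are assumptions made for this example; on the C64 the references would select video matrices rather than list entries.

```python
ROWS = 25  # lines on the display


def scroll_up(row_refs, top, bottom):
    """Rotate the references in rows top..bottom (inclusive) up by one line.

    Returns the reference that moved to the bottom, i.e. the row buffer
    that is now free to be cleared and reused.
    """
    recycled = row_refs[top]
    row_refs[top:bottom] = row_refs[top + 1:bottom + 1]
    row_refs[bottom] = recycled
    return recycled


def scroll_down(row_refs, top, bottom):
    """Rotate the references in rows top..bottom (inclusive) down by one line.

    Returns the reference that moved to the top.
    """
    recycled = row_refs[bottom]
    row_refs[top + 1:bottom + 1] = row_refs[top:bottom]
    row_refs[top] = recycled
    return recycled
```

Scrolling a region of n lines this way touches n small references, whereas moving the character data itself would touch 40 bytes for every line.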

The C64 has a fixed memory layout for all of its graphics modes, but we can get around that using a trick. By manipulating the YSCROLL register at the right time, we can force the graphics chip to restart from the top of the current row of text. Using bank switching, we can then select one out of 25 video matrices stored in different parts of RAM. We only use the top row of each video matrix. Alas, this trick only works if we restart the text row before its eighth and final rasterline, which would cut off all the descenders. To account for this, we restart the text row already after four rasterlines, cut each character in half and store the halves as two separate font definitions, and use bank switching to select the proper half.

However, this software-driven display mode cannot be combined with colour support, because colour RAM is not banked on the C64. The loss is of little consequence.
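Cutting each character in half, as described above, is the kind of data preparation step that is convenient to do on the host. The Python sketch below shows one way it might look. The layout of the resulting fonts is an assumption: it places the visible half in the first four rows of each 8-byte glyph and leaves the remaining rows blank, which may differ from the real font files.

```python
def split_font(font):
    """Split a font of 8-row glyphs into a top-half and a bottom-half font.

    `font` holds 8 bytes per glyph. Both returned fonts have the same size
    as the input; in each glyph, rows 0-3 carry the half to be displayed
    and rows 4-7 are zero (assumed layout).
    """
    if len(font) % 8:
        raise ValueError("font length must be a multiple of 8 bytes")
    top = bytearray(len(font))
    bottom = bytearray(len(font))
    for base in range(0, len(font), 8):
        top[base:base + 4] = font[base:base + 4]
        bottom[base:base + 4] = font[base + 4:base + 8]
    return bytes(top), bytes(bottom)
```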

Running

To complete the development cycle, it must somehow be possible to transfer the compiled code to the C64 and run it there. This feature was implemented by means of a new escape sequence that puts the terminal software in a special download mode. Following a load address, data is transmitted 128 bytes at a time along with a checksum. Each chunk is transmitted to the microcontroller first, and stored in a buffer. This is because the channel must be 8-bit clean: The data may contain any byte values, so we can't just inject 0xbf bytes to stall the receiver.
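The framing of such a download can be sketched in Python. Several details here are assumptions, because the text does not specify them: the checksum algorithm (a plain 8-bit sum is used below), the byte order of the load address (little-endian, as is customary on the 6502), and the treatment of a short final chunk (sent as is). The escape sequence that enters download mode is not modelled, and the function names are hypothetical.

```python
CHUNK_SIZE = 128  # bytes of payload per chunk (from the text)


def checksum(chunk):
    """Assumed checksum: the 8-bit sum of the payload bytes."""
    return sum(chunk) & 0xFF


def frame_program(load_address, payload):
    """Return a list of frames: the load address, then checksummed chunks.

    The load address is encoded as two bytes, low byte first (assumed).
    Each following frame is a chunk of up to CHUNK_SIZE bytes with its
    checksum appended.
    """
    frames = [load_address.to_bytes(2, "little")]
    for offset in range(0, len(payload), CHUNK_SIZE):
        chunk = payload[offset:offset + CHUNK_SIZE]
        frames.append(chunk + bytes([checksum(chunk)]))
    return frames
```

For a 200-byte program loaded at $0801, this produces a two-byte header followed by one full chunk of 129 bytes and a final frame of 73 bytes.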

A simple command, sendprg, on the Raspberry Pi transmits a block of binary code to the C64 using this protocol, and instructs the terminal software to jump to the beginning of the transferred program. The program can now be debugged using a freezer cartridge, of course.

When the time comes to return to the development environment, it is a simple matter to push the 1541 Ultimate button followed by Return twice to start the most recently launched program (i.e. the terminal software) again, landing me right back in the shell.