The audio file that was posted two weeks ago is indeed a very important artifact of computer history: It is a recording of the “Apple I BASIC” cassette tape that came with the Apple I. It is the first piece of Software ever sold by Apple (not counting computer firmware).

Here is the first confirmed perfect dump of the 4096 bytes: apple1basic.bin

The Apple I is extremely rare. Only 200 were built, and less than 100 are believed to be in existence. Neither Steve nor Woz own an Apple I any more, and neither does Apple Inc. (Update: Woz still owns some.) The cassettes are even rarer, as not every Apple I came with one. There has not been a dump of the tape until 2002, when Achim Breidenbach of Boinx Software got an MP3 recording of an original Apple 1 BASIC tape by Larry Nelson, an Apple I owner, and, with a little help from his father (who worked with an Apple I back in 1976), managed to decode it by writing a program – in GFA BASIC on an Atari ST. Here is a screenshot of the visualization this program could provide:

Achim wrote an Apple I emulator, included a commented disassembly of his Apple I BASIC dump, and published it on the internet. Other people continued working on the disassembly and changed instructions that they thought were mistakes in the dump. The only dump that can be easily found today includes these changes. It was time to analyze the tape again and get an authoritative dump of the 4096 bytes.

So here is how to decode the signal. Let us first open the audio file in Audacity and look at the waveform. The signal is mono, and as it turns out that the quality of the left channel is better, let us delete the other channel. This is what the whole ~35 second recording looks like:

There is silence at the beginning and at the end – but we can just as well hear that. We need to zoom in to see the actual waveform. From about 2 seconds to 12.4 seconds, when we hear a continuous beep, the signal signal consists of uniform waves:

Afterwards, during what sounds like noise, the waveform looks very different: There are shorter waves and loger waves.

The original output of the Apple I when writing the tape was a square wave, but the signal was filtered by the properties of the magnetic tape, so high frequencies were removed, and the signal became rounder. On the way back in from tape, the op-amp of the Cassette Interface converted the signal back into a square wave by converting all samples above a certain value into 1, and all samples below into 0. While the threshold in the op-amp was fixed, it could be effectively manipulated by changing the output volume of the cassette player. Effect->Amplify in Audacity gives us the following picture:

It is now time to write a small program to measure and dump the width of the pulses. We need to be able to parametrize the threshold (the volume knob) to be able to try different settings. In Audacity, we have to save the file as “WAV” first, because it is easy to decode this file format: We just skip the 44 byte header, and every following byte is a signed 16 bit value representing the sample.

First runs show us that all pulses in the first part of the file are around 56 samples long, while the rest of the file contains pulses which have a length of either around 22 or around 45. It seems the first part is a sync signal, so that the reader knows when the data starts. The two pulse lengths afterwards represent ones and zeros. It turns out that the shorter pulses are zeros; this way we get readable 6502 assembly code and some ASCII data (though with bit #7 set).

Here is a hexdump of the first few bytes:

0000000 4c b0 e2 ad 11 d0 10 fb ad 10 d0 60 8a 29 20 f0 0000010 23 a9 a0 85 e4 4c c9 e3 a9 20 c5 24 b0 0c a9 8d 0000020 a0 07 20 c9 e3 a9 a0 88 d0 f8 a0 00 b1 e2 e6 e2 0000030 d0 02 e6 e3 60 20 15 e7 20 76 e5 a5 e2 c5 e6 a5

In order to verify that our decoded data is correct, we must make sure that after the sync, there are only pulses that fall into the 22 or 45 category. If we play with the volume and find a volume region (i.e. different but similar volume settings) that has no pulse length errors and decodes into the same data, we can be confident that the data is correct. According to what is printed on the cassette, Apple 1 BASIC is supposed to reside in RAM at 0xE000 to 0xEFFF, so

the length of the decoded data should be 4096 bytes.

The following program is a possibly minimal implementation of the required algorithm:

apple1basic-decode.c