First, many thanks for the great question. This may well be my favourite retrocomputing video of them all, so I had been contemplating a look at the executable myself for a while. So, this is what I did:

To download the audio, I went to the same YouTube video and used 4K Video Downloader (mainly because it clearly shows which audio is the original one, so that I can avoid an extra recompression stage). I trimmed the relevant audio using Audacity, just like you did. The contents of the right channel have a large DC offset, so I assumed that the left channel must be closer to the original signal.

The quality of the signal is pretty low, but to a significant extent this is due to its very low amplitude. In the areas of pilot tone the signal is quite clean. At the same time, where the actual data is recorded, the quality of the signal is pretty poor, with the rectangular shapes strongly distorted, I'd guess mostly due to the .mp3 compression. Having said that, I think it is also clear that the signals are distinguishable, with the bits in my screenshot being 0,1,0,0,1,1,... etc. This gives us hope that something can be recovered here.

There are many programs created for recovery of tape data from audio files. One of the more recent ones is called TapeRecover, written by Andrei Titov (use Chrome to translate the page from Russian). I used this specific program because several people mentioned to me that it works well. It requires a very particular kind of input file: a 48 kHz mono .wav file. Resampling from one sampling rate to another is not a particularly great step to make, but I had no choice here.

As we might have expected, the program struggled to recover the data from our original file. So, I prepared another one, with +40 dB amplification and hard clipping, to get closer to the shape these signals were supposed to have originally. This turned out to be sufficient to recover the original data, which you can download here: http://introspec.retropc.ru/other/james%20houston%20-%20big%20ideas.tap. I verified it and can see that the binary is 100% the same as the one you recovered, so this is reassuring.
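For illustration, the amplification and hard-clipping step can be sketched in a few lines of Python. This is only a rough model of the idea, assuming signed 16-bit samples; the function name `amplify_and_clip` is made up for this example, and it is not a description of what Audacity does internally. Reading and writing the .wav file is left out.

```python
def amplify_and_clip(samples, gain_db=40.0, lo=-32768, hi=32767):
    """Apply a gain given in dB, then hard-clip to the signed 16-bit range.

    `samples` is any iterable of integer sample values (assumed 16-bit).
    +40 dB corresponds to multiplying the amplitude by 10 ** (40 / 20) = 100.
    """
    gain = 10.0 ** (gain_db / 20.0)
    out = []
    for s in samples:
        v = int(round(s * gain))
        # Hard clipping: anything beyond full scale is flattened, which
        # pushes a weak, rounded waveform back towards a rectangular shape.
        out.append(max(lo, min(hi, v)))
    return out


if __name__ == "__main__":
    quiet = [0, 100, -100, 1000, -1000]
    print(amplify_and_clip(quiet))
```

With such a large gain almost every sample of the data area ends up at one of the two extremes, so what remains is essentially the timing of the zero crossings, which is what a tape loader cares about.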

Of course, I couldn't just stop here and had to have a look at what it does. It turns out that the file plays music on the AY-3-8912 chip that was installed in every variation of the ZX Spectrum with 128K of memory. The actual driver that plays the music is extremely primitive; technically, it is a simple .psg-like player. The .psg file format describes the values that have to be written into the sound chip's registers during each interrupt (which happens at a frequency close to 50 Hz). Since the information is stored as register values, the original editable file cannot be easily recovered. This is the disassembly of the player in case you are interested:

    ; this player is, effectively, the v-blank interrupt handler.
    ; it is automatically called approximately 50 times per second

                ld hl,(CurPos)              ; current position in the track

    FrameLoop:  ld a,(hl) : or a : jr nz,SkipFrame
                inc hl : ld a,(hl)
                cp #FF : jr z,CommandFF     ; end-of-track marker
                cp #FE : jr z,CommandFE

                ; the actual data is a pair of two numbers:
                ; a register number...
                ld bc,#FFFD : out (c),a
                ; ...followed by the register value
                inc hl : ld a,(hl)
                ld b,#BF : out (c),a : inc hl
                jr FrameLoop

    SkipFrame:  ; non-zero bytes are decremented until we get to zero,
                ; i.e. they define a wait time in frames.
                ; this means in particular, that the data gets destroyed
                ; during playback (repeated playback is not possible!)
                ld (CurPos),hl
                dec a : ld (hl),a
                ei : ret

    CommandFE:  ; the purpose of this command is not clear
                ; (it simply silently skips some bytes)
                ; luckily, it is never actually used
                inc hl : inc hl : ld (CurPos),hl
                ei : ret

    CommandFF:  ; at the end of the track colour
                ; the border black and freeze
                xor a : out (254),a
                jr $

    CurPos:     dw MusicPSG                 ; current position in the track

    MusicPSG:   ; music data follows here
                ; (32841-42169,9329)
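To make the data format a bit more tangible, here is a rough Python sketch that mimics the disassembled player and turns the music data into a list of per-frame register writes. It is my own re-reading of the assembly above rather than anything taken from the original program; the names `play_interrupt` and `decode_track` are invented for this example, and the bounds checks for truncated data are an addition that the real player does not have.

```python
END_OF_TRACK = 0xFF   # handled by CommandFF in the disassembly
SKIP_COMMAND = 0xFE   # handled by CommandFE (purpose unclear, never used)


def play_interrupt(data, pos):
    """Emulate one call of the interrupt handler.

    `data` is a mutable bytearray (the player decrements wait counters in
    place). Returns (new_pos, writes, finished), where `writes` is the list
    of (register, value) pairs sent to the AY chip during this frame.
    """
    writes = []
    while pos < len(data):
        wait = data[pos]
        if wait != 0:
            data[pos] = wait - 1          # SkipFrame: count down, end frame
            return pos, writes, False
        if pos + 1 >= len(data):
            break                         # truncated data (extra safety)
        code = data[pos + 1]
        if code == END_OF_TRACK:
            return pos + 1, writes, True
        if code == SKIP_COMMAND:
            return pos + 3, writes, False  # inc hl three times in total
        if pos + 2 >= len(data):
            break                         # truncated data (extra safety)
        writes.append((code, data[pos + 2]))  # register number, then value
        pos += 3
    return pos, writes, True


def decode_track(raw):
    """Return one list of (register, value) writes per 50 Hz frame."""
    data = bytearray(raw)                 # work on a copy; playback is destructive
    pos, frames, finished = 0, [], False
    while not finished:
        pos, writes, finished = play_interrupt(data, pos)
        frames.append(writes)
    return frames


if __name__ == "__main__":
    demo = bytes([0, 7, 0x38, 0, 8, 15, 2, 0, 0x34, 0, 0xFF])
    for number, frame in enumerate(decode_track(demo)):
        print(number, frame)
```

Note how the wait byte doubles as the zero prefix of the next register pair once it has been counted down, which is exactly why the data cannot be played twice without reloading.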

However, listening to the tune, it is clear that more than just the main "voice" of the melody is coming out. I know that your assumption has been that this must mean that your capture was somehow incorrect. However, I do not think that the chances of this are all that high. We used two different methods and recovered an identical result. In addition, the somewhat tuneless sounds you can hear may well be tuneless because they are meant to drive devices that introduce further distortions (i.e. detuning may be necessary to get them in tune).

In my opinion, all voices you can hear in the video (the rhythm section on the printer, the bass line on the scanner), with the only exception of the vocals, are originally driven by the ZX Spectrum's sound chip. My guess is that the sound channels, which would normally be mixed together and output to the speaker, have been separated and used to drive other devices.

Last but not least, do not forget that the video itself has been edited and processed. Just as a somewhat relevant anecdote, I was the main coder for MMCM's chiptune album, The Blossoming Years. Track 24 of this album is an electronic version of the album: effectively, a tape recording of the demo program for the 48K ZX Spectrum with an external AY interface. The album was released on the same day as the accompanying demo. However, the demo was not fully ready at the time when the album was being mastered, so, as a result, the version of the demo that you will find by recovering Track 24 is not the same as the demo that was actually released.

UPDATE (18/04/2020)

Well, I really liked the theory that the ZX Spectrum's AY chip was driving all these devices in the video, but sadly this is not the case. My checks of the music data seemed to indicate that instruments "jump" from channel to channel. This tends to happen when the track is not hand-made, but auto-generated from another format, most likely MIDI. Hence, I did a search for MIDI converters for playing tunes back on the ZX Spectrum and found this thread on World of Spectrum: midi2ay 0.1. (The program is no longer available from Geocities, but Archive.org still has it.) The converter takes a .mid file and generates a corresponding .tap image automatically. In fact, the assembly source of the re-player is also included with the program, from which you can immediately see that it is 100% identical to the re-player we recovered from the video.

Overall, then, my conclusion is as follows: the music was made elsewhere. The converted MIDI file that is loaded into the ZX Spectrum and apparently playing in the video is highly unlikely to have actually contributed to the final mix.