This results in a visualization of the signal that is one part real and one part imaginary, but also perceptually meaningful:

Loud sounds have large shapes, and quiet sounds have small shapes. Near silence is a dot in the middle, and pure silence is a plain black screen.

A pure sine wave is just a circle, where the radius corresponds to amplitude.

Purer sounds are very round because they’re made of very few sine waves.

Brighter sounds end up looking spiky because they have many frequency components, and also because digital sound has limited resolution (it’s “pixelated”).

Percussive/transient sounds flash on the screen because these signals are very short.

Sustained tones create sustained shapes because tones are periodic signals: every cycle has the same shape, so that shape keeps getting traced out over and over again.

Multiple tones in perfect harmony also have sustained shapes because perfect harmony means the frequencies are integer ratios of each other. In other words, the combination of these periodic signals is also a periodic signal.

Multiple tones in imperfect harmony have shifting/vibrating shapes because of interference and beating: the combined signal just isn’t periodic, so the same shape never gets repeated exactly. Also, most music uses imperfect harmony, so every time there are multiple tones it’s probably gonna look messy. Sorry, this deserves a dedicated post.
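Here is a rough numeric check of the last two points. It is only a sketch: it assumes each tone is an ideal, unit-amplitude complex exponential, and the function names are made up for illustration (they are not from the Audioscope source).

```rust
use std::f64::consts::TAU;

/// Position of the beam at time `t` (seconds) for a sum of ideal tones,
/// each modeled as a unit-amplitude complex exponential (re, im).
fn beam_position(freqs_hz: &[f64], t: f64) -> (f64, f64) {
    freqs_hz.iter().fold((0.0, 0.0), |(re, im), f| {
        let phase = TAU * f * t;
        (re + phase.cos(), im + phase.sin())
    })
}

/// Distance between the beam position at `t` and at `t + period`.
/// A result near zero means the shape retraces itself after `period` seconds.
fn retrace_error(freqs_hz: &[f64], t: f64, period: f64) -> f64 {
    let (a_re, a_im) = beam_position(freqs_hz, t);
    let (b_re, b_im) = beam_position(freqs_hz, t + period);
    (a_re - b_re).hypot(a_im - b_im)
}
```

With 200 Hz and 300 Hz (a 2:3 ratio) and a period of 1/100 s, `retrace_error` comes out at essentially zero, so the shape is stable. Detune the second tone slightly (say 299.66 Hz) and the error becomes clearly nonzero, which is the shape drifting from one pass to the next.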

More Technicalities

Thickness, Hue, and Saturation

The beam of the Audioscope visualizer has variable thickness and color. These things are more subtle and unpredictable, but if you’re curious and ok with more math, read on.

Thickness: inversely proportional to speed.

Hue: instantaneous pitch, derived from angular velocity.

Saturation: inversely proportional to amount of noise.

The thickness decreases as the beam moves faster. This causes high-frequency sounds or loud sounds to appear thinner.
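A minimal sketch of that relationship, assuming speed is measured as the distance the beam travels between two consecutive samples. The constants `k` and `max_thickness` are hypothetical tuning knobs, not values from the actual visualizer.

```rust
/// Beam thickness for the segment between two consecutive points.
/// `k` and `max_thickness` are made-up tuning constants for illustration.
fn thickness(prev: (f64, f64), curr: (f64, f64), max_thickness: f64, k: f64) -> f64 {
    // Speed: distance travelled per sample.
    let speed = (curr.0 - prev.0).hypot(curr.1 - prev.1);
    // Inversely proportional to speed, clamped so a stationary beam stays finite.
    (k / speed.max(f64::EPSILON)).min(max_thickness)
}
```

Doubling the amplitude or the frequency of a sine wave doubles the distance covered per sample, so this halves the thickness (until the clamp kicks in).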

The color is a lot more complicated. Using the HSV color space:

Hue relates to pitch (more technically, pitch class). Pitch is circular, and hue is circular, so this is a natural mapping to make.

Saturation corresponds to amount of noise, where: more noise → more white, less noise → purer colors.

Value is maxed out because I want only the brightest colors.
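For reference, this is roughly what that mapping looks like as code. It is the standard HSV-to-RGB conversion with value pinned at 1; the function name and the choice of `f64` channels in [0, 1] are my assumptions here.

```rust
/// Convert hue in [0, 1) and saturation in [0, 1] to RGB in [0, 1],
/// with value fixed at 1 (standard HSV conversion).
fn hue_sat_to_rgb(hue: f64, saturation: f64) -> (f64, f64, f64) {
    let h6 = hue.rem_euclid(1.0) * 6.0; // position on the six-sector color wheel
    let chroma = saturation; // value = 1, so chroma = value * saturation
    let x = chroma * (1.0 - ((h6 % 2.0) - 1.0).abs());
    let (r, g, b) = match h6 as u32 {
        0 => (chroma, x, 0.0),
        1 => (x, chroma, 0.0),
        2 => (0.0, chroma, x),
        3 => (0.0, x, chroma),
        4 => (x, 0.0, chroma),
        _ => (chroma, 0.0, x),
    };
    let m = 1.0 - chroma; // lifts every channel toward white as saturation drops
    (r + m, g + m, b + m)
}
```

With saturation at 0 every hue collapses to white, which is the "more noise → more white" behavior.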

At a high level: the hue of the color roughly corresponds to the pitch of the locally largest frequency component. If we’re dealing with pure sine waves, it directly corresponds to the pitch of the sine wave. This means that, if a 440Hz (A4) sine wave is red, 220Hz (A3) and 880Hz (A5) are also red. A sine wave going from 440Hz to 880Hz would start at red, cycle through every color of the rainbow, and end up at red.

pitch ≈ log_2(frequency)

pitch class ≈ pitch mod 1
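The two formulas fit in one line of code. In this sketch I divide by a reference frequency first so that the reference lands on pitch class 0; using 440 Hz as that reference is an assumption made to match the "A is red" example (with hue 0 being red in HSV).

```rust
/// Pitch class in [0, 1) of a frequency, relative to a reference frequency.
/// With a 440 Hz reference (an assumption for this example), every A maps to 0.
fn pitch_class(freq_hz: f64, reference_hz: f64) -> f64 {
    // log2 turns octaves into integers; rem_euclid(1.0) throws the octave away
    // and stays non-negative for frequencies below the reference.
    (freq_hz / reference_hz).log2().rem_euclid(1.0)
}
```

So `pitch_class(220.0, 440.0)`, `pitch_class(440.0, 440.0)`, and `pitch_class(880.0, 440.0)` all give the same hue, and sweeping from 440 Hz to 880 Hz walks once around the color wheel.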

Technically: at a given point in the beam, we have the angular velocity ω, i.e. how fast the beam is turning at that point (this is distinct from instantaneous frequency). For a pure sine wave, ω corresponds to frequency: if the beam turns twice as fast, the frequency doubles. Interpreting ω as frequency, we can use the above formula to convert it to something corresponding to pitch (class), and use the result as the hue of the color at this point.
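One way to read "how fast the beam is turning" is the angle between consecutive step vectors along the path. The sketch below uses that reading; it is my assumption about the definition, not necessarily the exact formula in the visualizer, and `angular_velocity` is a made-up name.

```rust
/// Turning rate of the beam (radians per sample) at each interior point of a path.
/// Assumption: "turning" is the signed angle between consecutive step vectors.
fn angular_velocity(path: &[(f64, f64)]) -> Vec<f64> {
    path.windows(3)
        .map(|w| {
            let (ax, ay) = (w[1].0 - w[0].0, w[1].1 - w[0].1); // previous step
            let (bx, by) = (w[2].0 - w[1].0, w[2].1 - w[1].1); // current step
            // Signed angle from the previous step to the current one: atan2(cross, dot).
            (ax * by - ay * bx).atan2(ax * bx + ay * by)
        })
        .collect()
}
```

For a pure sine wave traced as a circle, this returns the same constant everywhere, and multiplying it by sample rate / 2π gives the frequency in Hz. For anything more complicated, the turning rate and the instantaneous frequency no longer agree, which is why the colors get unpredictable.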

Even more technically: for small values of ω, the effects of noise are much more prominent, so there’s actually a filtering step at the end that basically gets the average hue and the amount of noise. However, this type of noise isn’t directly related to noise in the signal; it is the amount of noise in the angular velocity over time. Well, it should be related to noise in the signal, but the current formula needs improvement.

After all this, the colors only have apparent meaning in exceptional cases (pure frequencies). But it does make for nice rainbows that entirely depend on the sound.

Filter Design and Implementation

I’m able to describe the concept of phase shifting every frequency by 90° while avoiding heavy mathematics. But actually creating the filter that does this for arbitrary signals requires domain knowledge. This is for those who are familiar with digital signal processing.

I created a generator for an FIR filter that removes all negative frequencies, plus DC and Nyquist. I could have just used the plain Hilbert transform, but I wanted the magnitude of the real part to roll off approximately the same way as the imaginary part in the lower transition band, and likewise in the transition band near Nyquist, so that the results are as circular as possible (as opposed to vertically oriented ellipses). Low frequencies are very important in electronic music.

Rust is still relatively new and it seems no one has implemented an efficient FFT-based convolution yet, so I just implemented overlap-save on the spot, and made the filter length as large as possible (and also odd) for the given FFT size. I generated the impulse response for a bandpass filter whose real part removes DC and Nyquist and whose imaginary part is the Hilbert transform, and windowed it with a Hamming window.
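To make the idea concrete, here is a stripped-down sketch of the imaginary half of such a filter: an ideal odd-length Hilbert transformer with a Hamming window applied. It is a simplification in two ways, both assumptions on my part: the real part is just the input delayed to line up with the kernel (no DC/Nyquist shaping), and the filtering is done by plain direct convolution instead of overlap-save. The function names are hypothetical.

```rust
use std::f64::consts::PI;

/// Hamming-windowed FIR Hilbert transformer of odd length `len`.
fn hilbert_kernel(len: usize) -> Vec<f64> {
    assert!(len % 2 == 1 && len >= 3, "length must be odd and at least 3");
    let mid = (len / 2) as isize;
    (0..len)
        .map(|n| {
            let k = n as isize - mid;
            // Ideal impulse response: 2 / (pi * k) for odd k, 0 for even k.
            let ideal = if k % 2 != 0 { 2.0 / (PI * k as f64) } else { 0.0 };
            let window = 0.54 - 0.46 * (2.0 * PI * n as f64 / (len - 1) as f64).cos();
            ideal * window
        })
        .collect()
}

/// Turn a real signal into complex (re, im) points by direct convolution.
/// The real part is the input delayed by half the kernel length so both parts line up.
/// Only outputs where the kernel fully overlaps the input are produced.
fn analytic_signal(input: &[f64], kernel: &[f64]) -> Vec<(f64, f64)> {
    if kernel.is_empty() || input.len() < kernel.len() {
        return Vec::new();
    }
    let delay = kernel.len() / 2;
    (kernel.len() - 1..input.len())
        .map(|n| {
            let im: f64 = kernel.iter().enumerate().map(|(k, h)| h * input[n - k]).sum();
            (input[n - delay], im)
        })
        .collect()
}
```

Feeding this a mid-band sine wave gives points whose distance from the origin stays very close to the sine's amplitude, i.e. a circle. Near DC and Nyquist the imaginary part rolls off while the delayed real part does not, which is exactly the ellipse problem the real-part bandpass is meant to fix.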

Another option was a pair of IIR filters, which would use less memory and have a better magnitude response, but I saw the group delay for the lows and felt it was unacceptably long for an application that needs to be as responsive as possible. Also, I wasn’t okay with the idea of non-linear phase, which I imagine would ruin the integrity of the waveform.

As for the filters for getting the hue and saturation: I just implemented my own biquad lowpass (as in, I copypasted the formula). As I mentioned, I think there’s room for improvement. Currently, I take the angular velocity, take the logarithm of it, and then filter it; my reasoning was that taking the log would amplify the noise, so the filter would remove it more strongly. But isn’t there some invariance in that ordering? idk, I didn’t want to think too much about math tbh, and I was also too rushed to really take a good look at the waveform and spectrum of the angular velocity, BUT IT WOULD BE NICE.
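For the curious, the log-then-filter step looks something like this. The coefficients follow the widely circulated Audio EQ Cookbook lowpass; whether that is the exact formula I pasted is not stated above, so treat it as an assumption, along with the struct and function names, the cutoff, and the tiny floor that keeps the log finite.

```rust
use std::f64::consts::TAU;

/// Second-order lowpass (coefficients per the Audio EQ Cookbook, assumed here).
struct BiquadLowpass {
    b: [f64; 3], // feedforward coefficients, already divided by a0
    a: [f64; 2], // feedback coefficients a1 and a2, already divided by a0
    x: [f64; 2], // previous two inputs
    y: [f64; 2], // previous two outputs
}

impl BiquadLowpass {
    fn new(cutoff_hz: f64, sample_rate_hz: f64, q: f64) -> Self {
        let w0 = TAU * cutoff_hz / sample_rate_hz;
        let alpha = w0.sin() / (2.0 * q);
        let a0 = 1.0 + alpha;
        let b1 = (1.0 - w0.cos()) / a0;
        Self {
            b: [b1 / 2.0, b1, b1 / 2.0],
            a: [-2.0 * w0.cos() / a0, (1.0 - alpha) / a0],
            x: [0.0; 2],
            y: [0.0; 2],
        }
    }

    /// Filter one sample (direct form I).
    fn process(&mut self, input: f64) -> f64 {
        let out = self.b[0] * input + self.b[1] * self.x[0] + self.b[2] * self.x[1]
            - self.a[0] * self.y[0]
            - self.a[1] * self.y[1];
        self.x = [input, self.x[0]];
        self.y = [out, self.y[0]];
        out
    }
}

/// Log first, then smooth: the ordering described above.
/// The 1e-9 floor is an arbitrary guard against log2(0).
fn smoothed_log_omega(omega: &[f64], filter: &mut BiquadLowpass) -> Vec<f64> {
    omega.iter().map(|w| filter.process(w.abs().max(1e-9).log2())).collect()
}
```

A constant ω passes through unchanged once the filter settles, so for pure tones the smoothed value still maps to the right pitch class. Swapping the order (filter ω, then take the log) would be a two-line change in `smoothed_log_omega` if you want to compare the two.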