The purpose of this project is to encode an image into a sound that can be viewed with a spectrogram. For some time I have known that musical artists have encoded pictures into their music. The most notable of these artists is Aphex Twin. Luckily I had a copy of Windowlicker and a great visualization program, Sonic Visualiser. After looking at the images I decided it would be cool to try to encode my own images. I saw a few programs available, but decided it would be a better challenge to write my own program from scratch using Perl.



Spectrograms

A spectrogram is a graph representing the intensity of a frequency with relation to time. Normally the frequencies are along the Y axis, with the time on the X axis. The intensity of each frequency is represented by the brightness of the color. The frequency and the color can each use either a linear scale or a logarithmic scale. Below is a spectrogram of a few piano chords. The audio file used can be found on Wikipedia here.

Image encoding

The idea I had for encoding the image was simple: for each pixel, create a sine wave whose frequency represents the pixel's position on the Y axis, whose position in time represents the X axis, and whose amplitude represents the pixel's color intensity.
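
The mapping can be sketched in a few lines of Perl. This is only an illustration, not code from the full program: the helper name pixel_to_tone, the 200 Hz to 20 kHz frequency range, the column duration, the peak amplitude and the assumption that row 0 is the top of the image are all made up for the example. It also assumes a linear frequency scale and an image more than one pixel tall.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original program): map one pixel to
# the parameters of the sine wave that would draw it in a spectrogram.
# Every range below is an illustrative assumption.
sub pixel_to_tone {
    my ($x, $y, $intensity, $height) = @_;

    my ($f_min, $f_max) = (200, 20000);   # assumed frequency range in Hz
    my $col_seconds     = 0.05;           # assumed duration of one pixel column
    my $max_amplitude   = 1000;           # assumed peak on a 16-bit scale

    # Row 0 is assumed to be the top of the image, so it gets the
    # highest frequency (linear scale).
    my $row_from_bottom = $height - 1 - $y;
    my $frequency = $f_min
        + ($f_max - $f_min) * $row_from_bottom / ($height - 1);

    return (
        frequency => $frequency,                   # Y axis
        start     => $x * $col_seconds,            # X axis
        amplitude => $intensity * $max_amplitude,  # intensity in 0..1
    );
}

# Pixel in column 10, top row, half intensity, in a 100px-tall image.
my %tone = pixel_to_tone(10, 0, 0.5, 100);
printf "%.0f Hz at %.2f s, amplitude %.0f\n",
    @tone{qw(frequency start amplitude)};
```

Keeping the mapping in one small function makes it easy to try a logarithmic frequency scale later by changing a single expression.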

Creating Sound

The first step to encoding an image was to learn how audio formats work. At first I tried writing a script that plays a frequency to ‘/dev/dsp’ (which is the sound card device on Linux). When writing straight to /dev/dsp without configuring the device, you get its defaults: a sample rate of 8000 Hz and a sample size of 8 bits. Below is a simple Perl script that plays a concert A (440 Hz). To execute it, run ‘./sin.pl > /dev/dsp’.



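#!/usr/bin/perl
use strict;
use Math::Trig;        # provides pi
use POSIX qw(floor);

my $sample    = 8000;  # default /dev/dsp sample rate in Hz
my $frequency = 440;   # concert A
my $cycles    = 6;     # whole cycles per loop, to reduce rounding error

# Number of samples that hold $cycles cycles of the tone
my $period = floor($sample / $frequency * $cycles);

binmode(STDOUT);       # emit raw bytes

while (1) {
    for (my $i = 1; $i <= $period; $i++) {
        # Unsigned 8-bit sample centered on 128; an amplitude of 127
        # keeps the value inside 0..255
        my $x = floor(128 + sin($cycles * 2 * pi * $i / $period) * 127);
        print pack("C", $x);
    }
}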

The DSP defaults do not offer much fidelity. I needed at least the fidelity of an audio CD, which is 16 bits at 44.1 kHz. I did some searching on CPAN to find a library that would let me write wave files, but most of the audio libraries had too much overhead for what I wanted to do. Instead I looked up the file format for a ‘.wav’ file and coded my own library. This library is limited to producing 16-bit, 44.1 kHz mono wave files.



#!/usr/bin/perl
#Author Evan Salazar
#--------------------------------------------
#
#Generate a .wav file for 16 bit mono PCM
#
#--------------------------------------------
use strict;

package SimpleWave;

sub genWave {
    #Get the reference to the data array
    my ($audioData) = @_;

    #This is the default sample rate
    my $samplerate = 44100;
    my $bits = 16;
    my $samples = $#{$audioData} + 1;
    my $channels = 1;

    #Do Calculations for data wave headers
    my $byterate = $samplerate * $channels * $bits / 8;
    my $blockalign = $channels * $bits / 8;
    #RIFF chunk size: the data bytes plus the 36 header bytes after this field
    my $filesize = $samples * ($bits/8) * $channels + 36;

    #RIFF Chunk
    my $riff = pack('a4Va4','RIFF',$filesize,'WAVE');

    #Format Chunk
    my $format = pack('a4VvvVVvv',
        'fmt ',
        16,1,
        $channels,
        $samplerate,
        $byterate,
        $blockalign,
        $bits);

    #Data Chunk
    my $dataChunk = pack('a4V','data',$blockalign * $samples);

    #Read audioData array; 'v' writes each sample as 16 bits, little-endian
    my $data = '';
    for(my $i=0;$i<$samples;$i++) {
        $data .= pack('v',$audioData->[$i]);
    }

    #Return a byte string of the wave
    return $riff . $format . $dataChunk. $data;
}

1;
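
One way to sanity-check the header arithmetic is to pack the 44 header bytes and unpack them again. The sketch below is standalone and does not load the library above. It repeats the same pack templates inside a hypothetical helper called wav_header, and the field names it prints are labels chosen for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: build the 44-byte header for 16-bit mono PCM at
# 44.1 kHz, using the same pack templates as the library above.
sub wav_header {
    my ($samples) = @_;
    my ($rate, $bits, $channels) = (44100, 16, 1);
    my $blockalign = $channels * $bits / 8;
    my $datasize   = $samples * $blockalign;

    return pack('a4Va4', 'RIFF', 36 + $datasize, 'WAVE')
         . pack('a4VvvVVvv', 'fmt ', 16, 1, $channels, $rate,
                $rate * $blockalign, $blockalign, $bits)
         . pack('a4V', 'data', $datasize);
}

my $header = wav_header(44100);   # header for one second of audio

# Field labels are illustrative; the order follows the pack calls above.
my @names = qw(riff_id riff_size wave_id fmt_id fmt_size audio_format
               channels sample_rate byte_rate block_align bits
               data_id data_size);
my @fields = unpack('a4 V a4 a4 V v v V V v v a4 V', $header);

printf "header length: %d bytes\n", length $header;
printf "%-12s %s\n", $names[$_], $fields[$_] for 0 .. $#names;
```

For one second of audio the data size should come out as 88200 bytes (44100 samples of 2 bytes each) and the RIFF size as 36 more than that.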

Reading a Bitmap

Luckily I found a simple bitmap reader on CPAN called Image::BMP. This is a nice lightweight library that does not depend on any external libraries or compiled code. Using this library I was able to easily load and read the bitmap data.

Encoding the Image

The first pass of my program disregarded the color data and only produced a frequency for the Y axis if the pixel's color intensity was less than half of the sum of all color channels at full intensity. Below is an example. Note: I converted the WAV to an MP3 to conserve bandwidth. At 320 kbps not much data is lost.

Audio File: ohmpie.mp3

I was really shocked to first see the image! The only tweaking I needed to do was to use a linear scale for the frequency. Also, if I selected too high an amplitude for the sine wave, clipping occurred in areas with too much black, since the sine waves for all the dark pixels in a column add together. For the image above I used an amplitude of about 1000 on a scale of 0 to 32767.
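
The clipping is easier to see with numbers. The sketch below assumes each column is built by summing one sine wave per dark pixel, which is an assumption about the approach rather than the code from the full program. The helper names column_samples and peak, the 200 Hz to 20 kHz range and the 0.05 second column length are made up for the example. In the worst case N tones of amplitude A can peak near N × A, so 100 tones at amplitude 1000 can exceed the 16-bit limit of 32767 when their peaks line up.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use List::Util qw(max);

use constant PI => 4 * atan2(1, 1);

# Hypothetical helper: sum one sine wave per frequency over $n samples.
sub column_samples {
    my ($freqs, $amplitude, $n, $rate) = @_;
    my @samples;
    for my $i (0 .. $n - 1) {
        my $sum = 0;
        $sum += $amplitude * sin(2 * PI * $_ * $i / $rate) for @$freqs;
        push @samples, $sum;
    }
    return \@samples;
}

# Largest absolute sample value in a column.
sub peak {
    my ($samples) = @_;
    return max(map { abs($_) } @$samples);
}

# 100 tones spread linearly over an assumed 200..20000 Hz range, as for
# a fully dark column in a 100px-tall image; 2205 samples is 0.05 s.
my @freqs  = map { 200 + 200 * $_ } 0 .. 99;
my $column = column_samples(\@freqs, 1000, 2205, 44100);
my $p      = peak($column);

printf "peak %.0f against a 16-bit limit of 32767: %s\n",
    $p, $p > 32767 ? "would clip" : "fits";
```

Lowering the per-tone amplitude, or scaling each column by its measured peak, are two ways to keep the sum inside the 16-bit range.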

The next step was to add amplitude scaling to match the color intensity. For this I summed all the color channels for a given pixel and scaled the sum against the max amplitude: ‘(R + G + B) / 768 * max_amplitude’. Below is a picture of me after using the scaling.



Audio File: evan.mp3

By selecting a color scheme that goes from black to white and using a linear scale for the volume, I get a very good black and white image. To prevent clipping on very dark images, I added an inverse option that inverts the colors, producing a negative image.

Audio File: evanInv.mp3

You can reverse the color scheme to go from white to black to produce the regular image.
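
The amplitude scaling and the inverse option can be sketched together as one small function. The helper name pixel_amplitude and the way the inversion is applied (one minus the scaled level) are assumptions for illustration, not the code from the full program. The divisor 768 is taken from the formula in the text. Since three 8-bit channels sum to at most 765, pure white lands slightly below the maximum amplitude.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: turn an RGB pixel into a sine-wave amplitude.
# The divisor 768 follows the formula in the text.
sub pixel_amplitude {
    my ($red, $green, $blue, $max_amplitude, $invert) = @_;
    my $level = ($red + $green + $blue) / 768;
    $level = 1 - $level if $invert;   # assumed form of the inverse option
    return $level * $max_amplitude;
}

# Black, mid gray and white, with and without inversion.
for my $pixel ([0, 0, 0], [128, 128, 128], [255, 255, 255]) {
    printf "rgb(%3d,%3d,%3d) normal %6.1f inverted %6.1f\n",
        @$pixel,
        pixel_amplitude(@$pixel, 1000, 0),
        pixel_amplitude(@$pixel, 1000, 1);
}
```

With inversion on, a mostly dark image produces mostly quiet tones, which is why it helps against clipping.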

Full Program

Below you can view and/or download the full code for this program. Performance is currently not optimized, so don’t write me telling me it’s slow. I have a few ideas to speed it up. Also, for best results use a small image around 100px tall.

Download: imageEncode-0.7.tar.gz