I went to a terrific workshop last week on identifying bird songs. We listened to recordings of songs from some of the trickier local species, and discussed the differences and how to remember them. I'm not a serious birder -- I don't do lists or Big Days or anything like that, and I dislike getting up at 6am just because the birds do -- but I do try to identify birds (as well as mammals, reptiles, rocks, geographic features, and pretty much anything else I see while hiking or just sitting in the yard) and I've always had trouble remembering their songs.

One of the tools birders use to study bird songs is the sonogram. It's a plot of frequency (on the vertical axis) and intensity (represented by color, red being louder) versus time. Looking at a sonogram you can identify not just how fast a bird trills and whether it calls in groups of three or five, but whether it's buzzy/rattly (a vertical line, lots of frequencies at once) or a purer whistle, and whether each note is ascending or descending.

The class last week included sonograms for the species we studied. But what about other species? The class didn't cover even all the local species I'd like to be able to recognize. I have several collections of bird calls on CD (which I bought to use with my "tweet" script, which exists both as a Python script and as an HTML version for Android -- yes, the name messes up Google searches, but my tweet predates Twitter). It would be great to be able to make sonograms from some of those recordings too.

But a search for Linux sonogram turned up nothing useful. Audacity has a spectrogram visualization mode with lots of options, but none of them seem to result in a usable sonogram, and most discussions I found on the net agreed that it couldn't do it. There's another sound editor program called snd which can do sonograms, but it's fiddly to use and none of its many color schemes produced a sonogram that I found very readable.

Okay, what about Python scripts? Surely that's been done?

I had better luck there. Matplotlib's pylab package has a specgram() call that does more or less what I wanted, and I found a post with an example of how to use pylab.specgram(). (That post also has another example using a library called timeside, but timeside's PyPI package doesn't have any dependency information, and after playing the old RPM-chase game -- installing a dependency, trying it, then installing the next dependency -- I gave up.)
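For anyone who wants to try it, here's a minimal sketch of that kind of call. It isn't the code from the post I found: it assumes pyplot's specgram() (the same function pylab exposes), and it uses a synthetic rising whistle in place of a real recording. The names make_chirp and plot_sonogram are made up for illustration.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line for an interactive window
import matplotlib.pyplot as plt


def make_chirp(rate=44100, seconds=2.0, f0=2000.0, f1=6000.0):
    """Synthetic rising whistle standing in for a bird recording (assumption)."""
    t = np.arange(int(rate * seconds)) / rate
    # Phase of a linear sweep from f0 to f1 Hz over the clip.
    phase = 2 * np.pi * (f0 * t + (f1 - f0) * t ** 2 / (2 * seconds))
    return np.sin(phase)


def plot_sonogram(samples, rate, nfft=1024):
    """Plot frequency (vertical) against time, with intensity shown as color."""
    fig, ax = plt.subplots()
    spectrum, freqs, times, image = ax.specgram(
        samples, NFFT=nfft, Fs=rate, noverlap=nfft // 2)
    ax.set_xlabel("Time (s)")
    ax.set_ylabel("Frequency (Hz)")
    return fig, spectrum, freqs, times


if __name__ == "__main__":
    fig, spectrum, freqs, times = plot_sonogram(make_chirp(), 44100)
    fig.savefig("sonogram.png")
```

With a 44,100 Hz sample rate the vertical axis runs all the way up to 22,050 Hz, whether or not the sound has anything interesting up there.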

The only problem with pylab.specgram() was that it shows the full range of the sound, both in time and frequency. The recordings I was examining can last a minute or more and go up to 20,000 Hz -- and when pylab tries to fit that all on the screen, you end up with a plot where the details are too small to show you anything useful.

You'd think there would be a way for pylab.specgram() to show only part of the spectrum, but there doesn't seem to be. I finally found a Stack Overflow discussion where "edited" gives an excellent rewritten version of pylab.specgram which allows setting minimum and maximum frequency cutoffs. Worked great!
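The idea behind frequency cutoffs can be sketched in a few lines. This is not the Stack Overflow code, just an illustration of the same approach under some assumptions: compute the spectrogram with matplotlib.mlab.specgram(), keep only the rows whose frequencies fall inside the range you care about, and draw what's left with imshow(). The function names and the min_freq/max_freq parameters are hypothetical.

```python
import numpy as np
from matplotlib import mlab


def band_limited_spectrum(samples, rate, min_freq, max_freq, nfft=1024):
    """Return only the spectrogram rows between min_freq and max_freq (Hz)."""
    spectrum, freqs, times = mlab.specgram(
        samples, NFFT=nfft, Fs=rate, noverlap=nfft // 2)
    # Boolean mask selecting the frequency bins inside the requested band.
    keep = (freqs >= min_freq) & (freqs <= max_freq)
    return spectrum[keep, :], freqs[keep], times


def plot_band(spectrum, freqs, times, ax):
    """Draw the band-limited spectrum on an existing matplotlib Axes."""
    # Convert power to decibels; the tiny offset avoids taking log10(0).
    decibels = 10 * np.log10(spectrum + 1e-12)
    extent = (times[0], times[-1], freqs[0], freqs[-1])
    return ax.imshow(decibels, origin="lower", aspect="auto", extent=extent)
```

For a bird whose song sits between, say, 1,000 and 8,000 Hz, passing those two numbers as the cutoffs means the whole plot height is spent on the part of the spectrum where the song actually is.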

Then I did some fiddling to allow for analyzing only part of the recording -- Python's wave package has no way to read in just the first six seconds of a .wav file, so I had to read in the whole file, read the data into a numpy array, then take a slice representing the seconds of the recording I actually wanted.
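Here's roughly what that looks like, as a sketch rather than my exact script. It assumes a 16-bit PCM .wav file and keeps only the first channel of a stereo recording; read_wav_slice and its start_sec/end_sec parameters are names invented for this example.

```python
import wave

import numpy as np


def read_wav_slice(path, start_sec, end_sec):
    """Read a whole 16-bit PCM .wav file, then return just the samples
    between start_sec and end_sec, along with the sample rate."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit samples")
        raw = wav.readframes(wav.getnframes())  # the entire file

    # Interpret the bytes as little-endian signed 16-bit integers.
    samples = np.frombuffer(raw, dtype="<i2")
    # Channels are interleaved; one column per channel, keep the first.
    samples = samples.reshape(-1, channels)[:, 0]

    # Convert seconds to sample indices and take the slice.
    first = int(start_sec * rate)
    last = int(end_sec * rate)
    return samples[first:last], rate
```

Calling read_wav_slice("song.wav", 0, 6) would then give the first six seconds of a (hypothetical) song.wav, ready to hand to the plotting code.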

But now I can plot nice sonograms of any bird song I want to see, and print them out or stick them on my Android device so I can carry them with me.