tl;dr: I played with processing a wav file. C was easier and cleaner than Ruby. Edit: I wanted this program to work only on one specific machine (an x86 running 32-bit Ubuntu). Therefore I didn't take portability into consideration. This is only a hack.

I had to compute the sum of the absolute values of the samples in a .wav file. For efficiency (and fun), I chose C.

I hadn't programmed in C for a long time. From memory, reading and writing files was a pain. But in the end I was really impressed by the code I got. It was really clean. This is even more impressive given that I mostly used low-level functions.

A wav file has a header containing a lot of metadata. This header was designed to take as little space as possible, so it is a block of packed bytes.

The first 4 bytes must contain RIFF in ASCII, the following 4 bytes are a 32-bit integer giving the size of the file minus 8, etc…

Surprisingly, I believe that reading this kind of file is easier in C than in most higher-level languages. Proof: I only had to search the web for the complete header format and write it down as a struct.

To read this kind of data in Ruby, I would certainly have had to write a block of code for each element of the struct. But in C I simply wrote:
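Something along these lines. This is a minimal sketch, assuming the canonical 44-byte PCM header (no extra chunks) and a little-endian host; the struct name, field names and the helper `read_wav_header` are my own illustrative choices:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Canonical 44-byte PCM wav header (assumed layout, illustrative names). */
struct wav_header {
    char     riff[4];         /* "RIFF" in ASCII */
    uint32_t chunk_size;      /* size of the file minus 8 */
    char     wave[4];         /* "WAVE" */
    char     fmt[4];          /* "fmt " */
    uint32_t fmt_size;        /* 16 for PCM */
    uint16_t audio_format;    /* 1 for PCM */
    uint16_t num_channels;
    uint32_t sample_rate;
    uint32_t byte_rate;
    uint16_t block_align;
    uint16_t bits_per_sample;
    char     data[4];         /* "data" */
    uint32_t data_size;       /* number of bytes of sample data */
};

/* Fill the whole header with a single fread.
 * Returns 0 on success, -1 if fewer than 44 bytes could be read. */
int read_wav_header(FILE *f, struct wav_header *h)
{
    /* Every field is naturally aligned, so common ABIs add no padding;
     * this check catches a compiler that does. */
    assert(sizeof *h == 44);
    return fread(h, sizeof *h, 1, f) == 1 ? 0 : -1;
}
```

The values come out right only because the file and the machine are both little endian, which is exactly the kind of shortcut this hack relies on.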

Only one step to fill my data structure. Magic!

Then, getting an int value encoded on two bytes is also not a natural operation in high-level languages. In C, to read a sequence of 2-byte numbers I only had to write:
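Roughly the following loop. Again a sketch rather than the exact code: the function name `sum_abs_samples` is made up for illustration, and it assumes the file position is already just past the header and that the host is little endian like the file:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Sum of the absolute values of all 16-bit samples left in f. */
long long sum_abs_samples(FILE *f)
{
    int16_t sample;       /* exactly two bytes, read in host byte order */
    long long sum = 0;

    assert(f != NULL);
    while (fread(&sample, sizeof sample, 1, f) == 1)
        sum += abs(sample);   /* sample is promoted to int, so -32768 is safe */
    return sum;
}
```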

Finally I ended up with the following code. Note that I know the wav format in advance (16 bit / 48000 Hz):
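Putting the two pieces together gives something like this. It is a compact sketch under the same assumptions as the rest of the hack (canonical 44-byte header, 16-bit samples, little-endian host); `wav_abs_sum` and the field names are hypothetical, and only the RIFF and WAVE tags are checked:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Canonical 44-byte PCM wav header (assumed layout, illustrative names). */
struct wav_header {
    char     riff[4];
    uint32_t chunk_size;
    char     wave[4];
    char     fmt[4];
    uint32_t fmt_size;
    uint16_t audio_format;
    uint16_t num_channels;
    uint32_t sample_rate;
    uint32_t byte_rate;
    uint16_t block_align;
    uint16_t bits_per_sample;
    char     data[4];
    uint32_t data_size;
};

/* Read the header, then sum |sample| over the rest of the file.
 * Returns the sum, or -1 if the header is missing or not RIFF/WAVE. */
long long wav_abs_sum(FILE *f)
{
    struct wav_header h;
    int16_t sample;
    long long sum = 0;

    assert(sizeof h == 44);
    if (fread(&h, sizeof h, 1, f) != 1)
        return -1;
    if (memcmp(h.riff, "RIFF", 4) != 0 || memcmp(h.wave, "WAVE", 4) != 0)
        return -1;

    /* Everything after the header is treated as 16-bit samples. */
    while (fread(&sample, sizeof sample, 1, f) == 1)
        sum += abs(sample);
    return sum;
}
```

A `main` would just `fopen` the file given on the command line in `"rb"` mode, call `wav_abs_sum` and print the result.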

Of course it is only a hack. But we can see how easy and clean it should be to improve. As I often say: the right tool for your need instead of the same tool for all your needs. Because here C is clearly far superior to Ruby for handling this simple task.

I am curious to know if somebody knows a nice way to do this with Ruby or Python.

Edit: for compatibility reasons (64-bit machines) I used int16_t instead of short and int instead of int .

Edit (2): after more consideration about portability I made a hopefully more portable version. But I must confess this task was a bit tedious. The code remains as readable as before. But I had to use a compiler-specific declaration to force the structure to be packed: __attribute__((__packed__)) Therefore this implementation should work on both big- and little-endian architectures. However, it must be compiled with gcc . The new code performs more checks but still doesn't use mmap . Here it is:
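The two ingredients can be sketched as follows. This is an illustration, not the full program: the names `riff_prefix`, `from_le16`, `from_le32`, `sample_from_le16` and `read_riff_prefix` are hypothetical, only the first 12 bytes of the header are shown, and it needs gcc (or a compiler that understands the same attribute):

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* First 12 bytes of a wav file, packed so the compiler adds no padding. */
struct __attribute__((__packed__)) riff_prefix {
    char     id[4];      /* "RIFF" */
    uint32_t size;       /* file size minus 8, little endian on disk */
    char     format[4];  /* "WAVE" */
};

/* v holds bytes copied straight from the file (little endian).
 * Rebuild the value byte by byte so the result is right on any host. */
static uint16_t from_le16(uint16_t v)
{
    unsigned char b[2];
    memcpy(b, &v, sizeof b);
    return (uint16_t)(b[0] | (b[1] << 8));
}

static uint32_t from_le32(uint32_t v)
{
    unsigned char b[4];
    memcpy(b, &v, sizeof b);
    return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Signed 16-bit sample from its raw little-endian representation. */
static int16_t sample_from_le16(uint16_t v)
{
    uint16_t u = from_le16(v);
    return (int16_t)(u >= 0x8000 ? (int)u - 0x10000 : (int)u);
}

/* Returns 0 on success, -1 on a short read. */
static int read_riff_prefix(FILE *f, struct riff_prefix *p)
{
    assert(sizeof *p == 12);
    if (fread(p, sizeof *p, 1, f) != 1)
        return -1;
    p->size = from_le32(p->size);   /* fix byte order in place */
    return 0;
}
```

On a little-endian machine the conversion helpers return their input unchanged; on a big-endian one they swap the bytes.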

Edit(3): On reddit Bogdanp proposed a Python version:

and luikore proposed an impressive Ruby version: