libavutil contains a lot of code that is shared by the other libraries that compose Libav. It has grown a lot over the years, and it is time to consider splitting it.

Monolithic vs Modular

There will always be some discussion about which approach is better overall:

– Jumbling everything together so that, no matter what, everything is at hand: you have your super hammer that handles screws, bolts, nuts and nails.

– Keeping the tools in separate boxes so you carry only the set of spanners you need, when you need them.

For software libraries you have this kind of problem all the time and at multiple levels:

– Do you want a single huge header file with every function your library provides, or a set of headers organized to keep related functions together?

– Do you want to link a single library, or have the concerns split across several so you do not have to carry lots of stuff you do not use? (Storage and memory are still important in some applications.)

Modularity usually comes at the price of additional initial effort (you have to think a little harder about what you are going to use) and of maintenance (which library should I update?).

This blog post is about grouping and splitting a bunch of unrelated functions that live in one library, and about getting a better API for some of them.

Libavutil

The Libav libraries are written mainly in C and assembly, and they focus a lot on being portable. Libavutil is the foundation.

It contains all the code that is common across the libraries: basics such as memory management, higher-level data structures, basic video- and audio-specific manipulation, and also hashes, cryptographic primitives and lossless compressors.

A lot indeed.

Problems

Irregular Mushroom-API

Some of the highest-level parts of the library appeared little by little: first you need md5 and you add it, then it is aes, then you want lzo. Every hash and cipher exposes its own dedicated functions, which makes those components non-optional even if you do not need them.

    /* libavutil/aes.h */
    struct AVAES;
    struct AVAES *av_aes_alloc(void);
    int av_aes_init(struct AVAES *a, const uint8_t *key, int key_bits, int decrypt);
    void av_aes_crypt(struct AVAES *a, uint8_t *dst, const uint8_t *src,
                      int count, uint8_t *iv, int decrypt);

    /* libavutil/xtea.h */
    typedef struct AVXTEA {
        uint32_t key[16];
    } AVXTEA;
    void av_xtea_init(struct AVXTEA *ctx, const uint8_t key[16]);
    void av_xtea_crypt(struct AVXTEA *ctx, uint8_t *dst, const uint8_t *src,
                       int count, uint8_t *iv, int decrypt);

    /* libavutil/sha.h */
    struct AVSHA;
    struct AVSHA *av_sha_alloc(void);
    int av_sha_init(struct AVSHA *context, int bits);
    void av_sha_update(struct AVSHA *context, const uint8_t *data, unsigned int len);
    void av_sha_final(struct AVSHA *context, uint8_t *digest);

    /* libavutil/md5.h */
    struct AVMD5;
    struct AVMD5 *av_md5_alloc(void);
    void av_md5_init(struct AVMD5 *ctx);
    void av_md5_update(struct AVMD5 *ctx, const uint8_t *src, const int len);
    void av_md5_final(struct AVMD5 *ctx, uint8_t *dst);
    void av_md5_sum(uint8_t *dst, const uint8_t *src, const int len);

As you might notice, libavutil ended up exposing lots and lots of similar-but-non-uniform APIs.

Having a couple of hashes always around was acceptable, but it gets less nice with every new one you add.

Right now libavutil exposes 50 separate headers.

Extending it is painful now

Since we already have that many different components inside it, you think twice about adding more stuff (if you are careful and caring). Libav is fairly modular and people do appreciate that.

In my wishlist I have a few items, such as getting more decompressors natively implemented.

Every new API is a burden to maintain (if you care about legacy and keep maintaining and releasing your older software), so adding or exposing more is always something to weigh carefully.

Abstracting some details always helps: think of what the API would look like if each of the supported codecs exposed its own non-uniform set of functions for decoding.
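To make the point concrete, here is a minimal sketch of what such an abstraction could look like for hashes. Everything in it is hypothetical: `DemoHash`, `demo_hash_buffer` and the two toy hashes (32-bit FNV-1a and a plain byte sum) are stand-ins chosen to keep the example short, not anything that exists in Libav.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor: every hash is reached through the same hooks. */
typedef struct DemoHash {
    const char *name;
    void (*init)(uint32_t *state);
    void (*update)(uint32_t *state, const uint8_t *src, size_t len);
} DemoHash;

/* Toy hash 1: 32-bit FNV-1a. */
static void fnv1a_init(uint32_t *state) { *state = 2166136261u; }

static void fnv1a_update(uint32_t *state, const uint8_t *src, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        *state ^= src[i];
        *state *= 16777619u;
    }
}

/* Toy hash 2: plain sum of the bytes, modulo 2^32. */
static void sum32_init(uint32_t *state) { *state = 0; }

static void sum32_update(uint32_t *state, const uint8_t *src, size_t len)
{
    for (size_t i = 0; i < len; i++)
        *state += src[i];
}

const DemoHash demo_fnv1a = { "fnv1a", fnv1a_init, fnv1a_update };
const DemoHash demo_sum32 = { "sum32", sum32_init, sum32_update };

/* One code path for the caller, whichever hash is selected. */
uint32_t demo_hash_buffer(const DemoHash *h, const uint8_t *src, size_t len)
{
    uint32_t state;

    h->init(&state);
    h->update(&state, src, len);
    return state;
}
```

The caller only ever sees `demo_hash_buffer` and a descriptor, so adding a third hash means adding one more table entry rather than one more header with its own function names.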

Ideal structure

Ideally I’d have the following layout:

– libavutil: basic memory abstraction, error, logs and not much else

– libavdata: basic data structures, including refcounted buffers, dictionaries, trees and such

– libavmedia: audio samples, pixel formats, metadata, frames, packets, side data types.

– libavhash: hashes such as md5, sha and such

– libavcomp: compressors such as lzo

– libavcrypto: aes, blowfish and such

API

I already described my ideal API for the codecs; today I'd like to detail the hashes.

As seen above, it is common to have init, update, final and an optional utility function sum (or calc) that takes a whole buffer and returns the hash.

    typedef struct AVHashLibrary AVHashLibrary;
    typedef struct AVHash AVHash;
    typedef struct AVHashContext AVHashContext;

    int av_hash_register_all(AVHashLibrary *hashes);
    const AVHash *av_hash_get(AVHashLibrary *hashes, const char *name);

    AVHashContext *avhash_open(const AVHash *hash, AVDictionary *opts);
    int av_hash_update(AVHashContext *ctx, const uint8_t *src, const int len);
    uint8_t *av_hash_final(AVHashContext *ctx, int *len);
    uint8_t *av_hash_sum(AVHashContext *ctx, const uint8_t *src,
                         const uint64_t src_len, int *out_len);
    void avhash_close(AVHashContext *hash);

The structures are fully opaque. AVHashLibrary contains the list of available hashes and possibly some additional hidden state. In Libav we are trying to remove all the global variables, so the list of hashes is explicit.

The register_all function just populates the list of hashes and possibly creates accessory lookup tables when needed.

The get call lets you look up a hash by name; an additional call could be added to look it up by id.
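A minimal sketch of the explicit-list idea, under a few assumptions: the `Demo*` names are hypothetical, the structures are left visible only to keep the example short (the proposal keeps them opaque), and each entry carries just a name and a digest size instead of a real implementation.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_HASHES 8

/* Hypothetical descriptor, reduced to what the lookup needs. */
typedef struct DemoHash {
    const char *name;
    size_t      digest_size;   /* in bytes */
} DemoHash;

/* The caller owns this object: no global list is involved. */
typedef struct DemoHashLibrary {
    const DemoHash *entries[DEMO_MAX_HASHES];
    size_t          count;
} DemoHashLibrary;

static const DemoHash demo_md5    = { "md5",    16 };
static const DemoHash demo_sha1   = { "sha1",   20 };
static const DemoHash demo_sha256 = { "sha256", 32 };

/* Append one descriptor; returns 0 on success, -1 if the list is full. */
int demo_hash_register(DemoHashLibrary *lib, const DemoHash *hash)
{
    if (lib->count >= DEMO_MAX_HASHES)
        return -1;
    lib->entries[lib->count++] = hash;
    return 0;
}

/* Reset the list and populate it with every built-in descriptor. */
int demo_hash_register_all(DemoHashLibrary *lib)
{
    lib->count = 0;
    if (demo_hash_register(lib, &demo_md5)    < 0 ||
        demo_hash_register(lib, &demo_sha1)   < 0 ||
        demo_hash_register(lib, &demo_sha256) < 0)
        return -1;
    return 0;
}

/* Linear lookup by name; returns NULL when the name is unknown. */
const DemoHash *demo_hash_get(const DemoHashLibrary *lib, const char *name)
{
    for (size_t i = 0; i < lib->count; i++)
        if (!strcmp(lib->entries[i]->name, name))
            return lib->entries[i];
    return NULL;
}
```

Because the list lives in an object the caller passes around, two parts of a program can each hold their own library with different contents, which is not possible with a global registry.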

The open function takes a dictionary for hash-specific configuration.
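As an illustration of hash-specific configuration, the bits argument that av_sha_init takes today could become a dictionary entry. The sketch below is an assumption-heavy stand-in: `DemoOption` is a hypothetical flat key/value list used in place of AVDictionary, and the option name "bits", the accepted sizes and the default of 160 are illustrative choices, not part of the proposal.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for AVDictionary: a flat key/value list. */
typedef struct DemoOption {
    const char *key;
    const char *value;
} DemoOption;

/* Hypothetical context holding the one configurable parameter. */
typedef struct DemoShaContext {
    int bits;
} DemoShaContext;

/* Return the value stored under key, or NULL if it is not present. */
static const char *demo_opt_get(const DemoOption *opts, size_t nb_opts,
                                const char *key)
{
    for (size_t i = 0; i < nb_opts; i++)
        if (!strcmp(opts[i].key, key))
            return opts[i].value;
    return NULL;
}

/* Configure the context from the options.
 * Returns 0 on success, -1 on an unsupported digest size. */
int demo_sha_open(DemoShaContext *ctx, const DemoOption *opts, size_t nb_opts)
{
    const char *value = demo_opt_get(opts, nb_opts, "bits");
    int bits = value ? atoi(value) : 160;   /* assumed default */

    switch (bits) {
    case 160:
    case 224:
    case 256:
        ctx->bits = bits;
        return 0;
    default:
        return -1;
    }
}
```

A hash with nothing to configure would simply ignore the options, so the open signature stays the same for all of them.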

The update and final functions let you calculate the hash incrementally; the sum function is a simple utility that takes a full buffer (whose length is assumed to fit in a uint64_t) and produces the hash.
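The relation between the incremental calls and the one-shot utility can be sketched with a toy example. Adler-32 is used here only because it fits in a few lines; the `demo_hash_*` names are hypothetical and the context is a plain visible struct rather than the opaque AVHashContext.

```c
#include <stddef.h>
#include <stdint.h>

/* Running state of the toy Adler-32 checksum. */
typedef struct DemoHashContext {
    uint32_t a;
    uint32_t b;
} DemoHashContext;

void demo_hash_init(DemoHashContext *ctx)
{
    ctx->a = 1;
    ctx->b = 0;
}

/* Feed len more bytes; can be called any number of times. */
void demo_hash_update(DemoHashContext *ctx, const uint8_t *src, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        ctx->a = (ctx->a + src[i]) % 65521u;
        ctx->b = (ctx->b + ctx->a) % 65521u;
    }
}

/* Combine the two running sums into the final 32-bit value. */
uint32_t demo_hash_final(const DemoHashContext *ctx)
{
    return (ctx->b << 16) | ctx->a;
}

/* One-shot helper: nothing more than init + update + final. */
uint32_t demo_hash_sum(const uint8_t *src, size_t len)
{
    DemoHashContext ctx;

    demo_hash_init(&ctx);
    demo_hash_update(&ctx, src, len);
    return demo_hash_final(&ctx);
}
```

Feeding a buffer in several update calls and feeding it in one go through sum give the same result, which is why sum can stay a thin convenience wrapper instead of a separate code path per hash.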