After running a cloud for 2+ years our OpenStack API databases are full of cruft. Deleted instances, deleted networks, deleted volumes, they all are still in the databases. OpenStack has no periodic clean-up for this stuff, it’s left up to you. This is partly because there’s no unified way to do it and also because each operator has different requirements on how long to retain data. Over the past few weeks I’ve been cleaning up our records and would like to share what I’ve found.

Warning: This is an advanced operation. Before doing this: You should backup everything. You should test it in a dev environment. You should assess the impact to your APIs. I did all of those before attempting any of this, including importing prod data into a test OpenStack environment to assess the impact there.

Each service has it’s own database, and depending on how that code is written and usage, some services stored more data than others. Here’s the order in terms of largest to smallest of our databases, on disk size (which we will discuss later).

Pre-cleaning DB sizes

3.6G /var/lib/mysql/nova

1.1G /var/lib/mysql/heat

891M /var/lib/mysql/cinder

132M /var/lib/mysql/designate

131M /var/lib/mysql/neutron

103M /var/lib/mysql/glance

41M /var/lib/mysql/keystone

14M /var/lib/mysql/horizon

So with this in mind I started digging into how to clean this stuff up. Here’s what I found. I’m noting in here what release we’re on for each because the tooling may be different or broken for other releases.

Heat – Mitaka

Heat was the first one I did, mainly because if Heat blows up, I can probably still keep my job. Heat has a great DB cleanup tool and it works very well. Heat lets you say “purge deleted records > X days/months/etc old”. When I did this heat had so much junk that I “walked it in”, starting with 365 days, then 250, etc etc. Heat developers win the gold medal here for best DB clean-up tool.

heat-manage purge_deleted -g days 30

Keystone – All Versions

Guess what? Keystone doesn’t keep ANY deleted junk in it’s database, once it’s gone, it’s gone. This can actually be an issue when you find a 2 year old instance that has a userid you can’t track down, but that’s how it is. So as long as you’re not storing tokens in here, you’re good. We’re using Fernet tokens, so no issues here.

Cinder – Liberty

Cinder’s DB cleanup tool is broken in Liberty. It is supposed to be fixed in Mitaka, but we’re not running Mitaka. Hoping to try this after we upgrade. We have a lot of volume cruft laying around.

Glance – Liberty

Glance has no cleanup tool at all that I can find. So I wrote one, but we ended up not using it. Why? Well because it seems that Glance can and will report deleted images via the V2 API and I could never quite convince myself that we’d not break stuff by doing a cleanup. Anyone know otherwise?

Here’s my code to do the cleanup, be careful with it! Like Heat you should probably “walk this in” by changing “1 MONTH” to “1 YEAR” or “6 MONTHS”. These deletes will lock the tables which will hang up API calls while they are running, plan appropriately. Note if you look on the internet you might find other versions that disable foreign key constraints, don’t do that.