Adventures in WhatsApp DB — extracting messages from backups (with code examples)

Getting your messages without giving a third party your credentials and data

This post explains how to recover messages from WhatsApp using Python. In particular, I’ll explain how I went about finding and extracting conversations from WhatsApp’s SQLite database and parsing the fields and data there. This is in no way a comprehensive reverse-engineering or forensic analysis. The only reason I’m publishing it is that I couldn’t find freely available information or trustworthy open source tools for this purpose, so I thought I’d share my work to save others some time. You can probably use a very similar process to analyze other (messaging) apps.


I will be using Jupyter Notebooks and Pandas throughout this article because they make life easy and the output is visually “pleasing”, but pretty much all of the code can be used independently of Jupyter. The notebook is available on GitHub here.
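To give a feel for the Pandas-plus-SQLite workflow used throughout, here is a minimal sketch of how one might take a first look at any SQLite file: list its tables, then preview a few rows of one of them. The helper names `list_tables` and `preview_table` are made up for illustration, and the `demo_messages` table is a stand-in created in memory so the snippet runs anywhere. It is not WhatsApp's actual schema.

```python
import sqlite3

import pandas as pd


def list_tables(conn):
    """Return a DataFrame with each table's name and its CREATE statement."""
    query = (
        "SELECT name, sql FROM sqlite_master "
        "WHERE type = 'table' ORDER BY name"
    )
    return pd.read_sql_query(query, conn)


def preview_table(conn, table, limit=5):
    """Return the first `limit` rows of `table` as a DataFrame."""
    # Table names cannot be bound as SQL parameters, so quote the identifier.
    quoted = '"' + table.replace('"', '""') + '"'
    return pd.read_sql_query(
        f"SELECT * FROM {quoted} LIMIT ?", conn, params=(limit,)
    )


# Stand-in database (hypothetical table, NOT the real WhatsApp schema).
# With a real backup you would pass the path of the extracted file instead.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo_messages (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany(
    "INSERT INTO demo_messages (body) VALUES (?)",
    [("hello",), ("world",)],
)
conn.commit()

tables = list_tables(conn)
preview = preview_table(conn, "demo_messages")
print(tables)
print(preview)
```

In a notebook you would drop the `print` calls and let Jupyter render the DataFrames directly.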

Background (feel free to skip this paragraph)

Due to some legal trouble with a housing developer (are there any honest developers out there?) I ended up needing to recover WhatsApp messages from an old iPhone that had sat in a closet for two years.

I’m not normally an iPhone user, so I started looking for ways to do this and discovered that iCloud doesn’t give you access to the actual backup contents, only to what Apple decided you need (at least if you go through your online account). There are tools that will take your login credentials and let you browse the full backup contents, but I’m not into giving unknown tools my (wife’s) login credentials. So I figured out how to do this without compromising the security of that account, which led to the process described below.