The problem we were trying to solve

The project was conceived with the Nova Scotia Department of Energy, who had a lot of cuttings and core descriptions that they wanted to digitize, visualize, and archive. They also had some hand-drawn striplog images — similar to the one on the right — that needed to be digitized in the same way. So there were a few problems to solve:

Read a striplog image and a legend, turn the striplog into tops, bases, and 'descriptions', and finally save the data to an archive-friendly LAS file.

Parse natural language 'descriptions', converting them into structured data via an arbitrary lexicon. The lexicon determines how we interpret the words 'sandstone' or 'fine grained'.

Plot striplogs with minimal effort, and keep plotting parameters separate from data. It should be easy to globally change the appearance of a particular lithology.

Make all of this completely agnostic to the data type, so 'descriptions' might be almost anything you can think of: special core analyses, palaeontological datums, chronostratigraphic intervals...
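To make the lexicon idea concrete, here is a minimal sketch of parsing a description against one. Everything in it is an assumption for illustration: the `LEXICON` contents, the `parse_description` name, and the rule of keeping only the first match per attribute are not taken from the library.

```python
import re

# Hypothetical lexicon: each attribute maps to the patterns that signal it.
# Longer phrases come before shorter ones so 'very fine' beats 'fine'.
LEXICON = {
    "lithology": [r"sandstone", r"siltstone", r"shale", r"limestone"],
    "colour": [r"red(?:dish)?", r"green(?:ish)?", r"gr[ae]y", r"brown"],
    "grainsize": [r"very fine", r"fine", r"medium", r"coarse"],
}


def parse_description(text, lexicon=LEXICON):
    """Return the first (leftmost) match for each attribute, or None."""
    parsed = {}
    for attribute, patterns in lexicon.items():
        # One alternation per attribute, anchored on word boundaries.
        combined = r"\b(?:" + "|".join(patterns) + r")\b"
        match = re.search(combined, text, flags=re.IGNORECASE)
        parsed[attribute] = match.group(0).lower() if match else None
    return parsed


rock = parse_description("Fine red sandstone with greenish shale flakes")
print(rock)
# {'lithology': 'sandstone', 'colour': 'red', 'grainsize': 'fine'}
```

Swapping in a different lexicon (say, one keyed on 'species' or 'seismic facies') changes the interpretation without touching the parser, which is the point of keeping the two separate.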

The usual workaround, I mean solution, to this problem is to convert the descriptions into some sort of code, e.g. sandstone = 1, siltstone = 2, shale = 3, limestone = 4. Then you make a log, and plot it alongside your other curves or make your crossplots. But this is rather clunky, and if you lose the mapping, the log is useless. And we still have the other problems: reading images, parsing descriptions, plotting...
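For comparison, the integer-code approach looks something like the sketch below. The `CODES` table follows the example mapping above; the `NULL` value, the function name, and the sample intervals are made up for illustration.

```python
# The mapping from the text. Lose this dictionary and the log is just numbers.
CODES = {"sandstone": 1, "siltstone": 2, "shale": 3, "limestone": 4}
NULL = 0  # assumed code for depths with no description


def to_coded_log(intervals, start, stop, step):
    """Sample (top, base, lithology) tuples onto a regular depth grid."""
    n = int(round((stop - start) / step)) + 1
    depths = [start + i * step for i in range(n)]
    log = []
    for z in depths:
        code = NULL
        for top, base, lithology in intervals:
            if top <= z < base:  # top inclusive, base exclusive
                code = CODES[lithology]
                break
        log.append(code)
    return depths, log


intervals = [
    (100.0, 101.0, "sandstone"),
    (101.0, 102.5, "shale"),
    (102.5, 103.0, "limestone"),
]
depths, log = to_coded_log(intervals, start=100.0, stop=103.0, step=0.5)
print(log)  # [1, 1, 3, 3, 3, 4, 0]

# Decoding only works if the mapping survives alongside the log.
decode = {code: name for name, code in CODES.items()}
print([decode.get(c) for c in log[:3]])  # ['sandstone', 'sandstone', 'shale']
```

Note what is lost along the way: the colour, the grain size, the secondary lithologies, and the exact tops and bases all disappear in the resampling.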

What we built

One of the project requirements was a Python library, so don't look for a pretty GUI or fancy web app. (This project took about 6 person-weeks; user interfaces take much longer to craft.) Our approach is always to try to cope with chaos, not fix it. So we tried to design something that would let the user bring whatever data they have: XLS, CSV, LAS, images.

The library has tools to, for example, read a bunch of cuttings descriptions (e.g. "Fine red sandstone with greenish shale flakes"), and convert them into Rocks — structured data with attributes like 'lithology' and 'colour', or whatever you like: 'species', 'sample number', 'seismic facies'. Then you can gather Rocks into Intervals (basically a list of one or more Rocks, with a top and base depth, height, or age). Then you can gather Intervals into a Striplog, which can, with the help of a Legend if you wish, plot itself or write itself to a CSV or LAS file.
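The containment hierarchy described here can be sketched with a few dataclasses. This is a toy model under stated assumptions, not the library's actual classes or signatures: the field names, the `to_csv` and `styles` methods, and the `LEGEND` dictionary are all hypothetical.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Rock:
    properties: dict  # e.g. {'lithology': 'sandstone', 'colour': 'red'}


@dataclass
class Interval:
    top: float
    base: float
    rocks: list  # one or more Rocks, most important first

    @property
    def primary(self):
        return self.rocks[0]


@dataclass
class Striplog:
    intervals: list

    def __post_init__(self):
        # Keep intervals ordered by top, whatever order they arrived in.
        self.intervals = sorted(self.intervals, key=lambda iv: iv.top)

    def to_csv(self, attribute="lithology"):
        """Write top, base and one attribute of the primary Rock as CSV text."""
        buffer = io.StringIO()
        writer = csv.writer(buffer, lineterminator="\n")
        writer.writerow(["top", "base", attribute])
        for iv in self.intervals:
            value = iv.primary.properties.get(attribute, "")
            writer.writerow([iv.top, iv.base, value])
        return buffer.getvalue()

    def styles(self, legend, attribute="lithology", default="white"):
        """Look up a display colour per interval; the data is not modified."""
        return [
            legend.get(iv.primary.properties.get(attribute), default)
            for iv in self.intervals
        ]


# Plotting parameters live apart from the data (a hypothetical legend).
LEGEND = {"sandstone": "gold", "shale": "grey"}

striplog = Striplog([
    Interval(101.0, 102.5, [Rock({"lithology": "shale", "colour": "green"})]),
    Interval(100.0, 101.0, [Rock({"lithology": "sandstone", "colour": "red"}),
                            Rock({"lithology": "shale"})]),
])
csv_text = striplog.to_csv()
colours = striplog.styles(LEGEND)
print(csv_text)
print(colours)  # ['gold', 'grey']
```

Because the legend is just a lookup passed in at display time, changing the colour of every sandstone means editing one entry, not every interval.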

The Striplog object has some useful features. It's iterable in Python, so it's trivial to step over every unit and perform some query or analysis. Some tasks are built in: a Striplog can summarize its own statistics, and searching for 'sandstone' returns another Striplog object containing only the units that match the query.
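Here is a rough sketch of how an iterable, searchable container like this might behave. It is a stand-in written for this post, with assumed names (`find`, `thickness_by_lithology`) and a simplified `Interval` that carries a single lithology string; the real object's methods may differ.

```python
from collections import defaultdict, namedtuple

# Simplified interval for illustration: one lithology per unit.
Interval = namedtuple("Interval", "top base lithology")


class Striplog:
    def __init__(self, intervals):
        self._intervals = sorted(intervals, key=lambda iv: iv.top)

    def __iter__(self):
        # Makes `for unit in striplog` work.
        return iter(self._intervals)

    def __len__(self):
        return len(self._intervals)

    def find(self, term):
        """Return a new Striplog holding only the units that match term."""
        hits = [iv for iv in self if term.lower() in iv.lithology.lower()]
        return Striplog(hits)

    def thickness_by_lithology(self):
        """Summarize total thickness per lithology."""
        totals = defaultdict(float)
        for iv in self:
            totals[iv.lithology] += iv.base - iv.top
        return dict(totals)


log = Striplog([
    Interval(100.0, 101.0, "sandstone"),
    Interval(101.0, 102.5, "shale"),
    Interval(102.5, 104.0, "sandstone"),
])

# Step over every unit.
thick_units = [iv for iv in log if iv.base - iv.top > 1.0]

# Searching returns another Striplog, so queries can be chained.
sands = log.find("sandstone")
print(len(sands))                     # 2
print(log.thickness_by_lithology())   # {'sandstone': 2.5, 'shale': 1.5}
```

Returning the same type from a search means the result can itself be iterated, summarized, or searched again.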