Software is mostly built according to experts' opinions. These 7 data-driven software engineering books show us another way to a productive and sustainable pace.

There is a French saying: “Le cordonnier est toujours le plus mal chaussé” (literally, “the cobbler always has the worst shoes”). I found an English equivalent in “the shoemaker’s son always goes barefoot”. I believe this is nowhere more true than in the software industry!

“Software is eating the world!” Software can now do things that only humans used to be able to do. So why the hell are we driving our projects as if we were a horde of amateur hitchhikers?

What is data driven software engineering?

Being data driven would allow us to answer questions such as:

How much is the feature we delivered last week contributing to the bottom line?

How much is the feature we are currently developing expected to contribute to the bottom line?

What are the estimated cost and value of increasing our test coverage by 1%?

What are the estimated interest and principal of our current technical debt?

What is the value of refactoring this file?

Which is the most valuable: improving our build system or building this new feature?

Most projects I’ve worked on had absolutely no clue about the answers to these questions. The decision is left to experts, to whoever has the most influence, or simply to the developer, who does whatever they think is best…

Books

Fortunately, some people think differently: they believe it is possible to quantify all this. They even explain how!

This book is a very practical guide to the lean startup process. It’s a very good starting point for any kind of lean software development.

This book explains, with real-world examples, how to use a Kanban board to control your work queues and improve your flow of work. A true classic for any lean product development.

It’s all too easy to get caught up in numbers. It might sound counterintuitive, but numbers are not the main benefit of data-driven software engineering. Its real value is in rational and collective decision making. (Think of Big O analysis as an analogy: it is useful without being exact.) The figures don’t have to be 100% exact to be useful. This book presents ways to measure the flow without losing time in estimating.
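To make "measuring the flow" a bit more concrete, here is a minimal sketch of two common flow measures, cycle time and throughput, computed from nothing more than start and finish dates. It is my own illustration, not necessarily the method the book uses; the `ITEMS` data and the function names are made up for the example.

```python
from datetime import date
from statistics import median

# Hypothetical work items as (started, finished) dates.
# In practice these would be exported from your ticket tracker.
ITEMS = [
    (date(2019, 1, 7), date(2019, 1, 10)),
    (date(2019, 1, 8), date(2019, 1, 15)),
    (date(2019, 1, 14), date(2019, 1, 16)),
    (date(2019, 1, 15), date(2019, 1, 25)),
]


def cycle_times(items):
    """Return the number of calendar days each item took from start to finish."""
    return [(finished - started).days for started, finished in items]


def throughput_per_week(items, period_start, period_end):
    """Return the average number of items finished per week in the period (inclusive)."""
    finished_count = sum(
        1 for _, finished in items if period_start <= finished <= period_end
    )
    weeks = ((period_end - period_start).days + 1) / 7
    return finished_count / weeks


if __name__ == "__main__":
    print("median cycle time (days):", median(cycle_times(ITEMS)))
    print(
        "throughput (items/week):",
        throughput_per_week(ITEMS, date(2019, 1, 7), date(2019, 1, 27)),
    )
```

Neither figure requires anyone to estimate anything up front: both come from dates the team already records.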

This book is rather theoretical, but it links all subjects together: lean startup, risk management, Kanban, and economics. It’s a must-read on the subject.

The flow book gives a big picture view of what we want to achieve. This book explains how to actually measure all the aspects of your product development in dollar value.

I won’t say this one is an easy read. It is really theoretical, and the format is not the most reader-friendly. That said, the further I got through the book, the more it all made sense. It presents formal ways to use measures to define what needs to be done. Defining these measures early simplifies prioritization and improves quality.

Note: the PDF version of the book is available for free when you subscribe to the mailing list.

Disclaimer: this book is getting old, and is a bit outdated when compared to agile development practices.

That said, it provides real-world examples of how scientific Monte-Carlo simulations can be applied to software product development.
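For readers who have never seen one, here is a minimal sketch of what a Monte Carlo simulation can look like for a delivery forecast. It is my own illustration rather than an example taken from the book, and it assumes that future weekly throughput resembles past weekly throughput. The function names, the backlog size, and the throughput history are all hypothetical.

```python
import random


def simulate_weeks(backlog_size, weekly_throughput_history, runs=10_000, seed=None):
    """Return a sorted list with one simulated "weeks to finish" value per run.

    Each simulated week draws a throughput at random from the history
    (assumption: the past is representative of the future).
    """
    if max(weekly_throughput_history) <= 0:
        raise ValueError("history needs at least one week with finished items")
    rng = random.Random(seed)
    outcomes = []
    for _ in range(runs):
        remaining = backlog_size
        weeks = 0
        while remaining > 0:
            remaining -= rng.choice(weekly_throughput_history)
            weeks += 1
        outcomes.append(weeks)
    outcomes.sort()
    return outcomes


def percentile(sorted_values, p):
    """Return the value at or below which roughly p% of the simulated runs fall."""
    index = min(len(sorted_values) - 1, int(p / 100 * len(sorted_values)))
    return sorted_values[index]


if __name__ == "__main__":
    # Hypothetical data: 40 items left, items finished in each of the last 5 weeks.
    outcomes = simulate_weeks(40, [3, 5, 2, 4, 6], seed=1)
    for p in (50, 85, 95):
        print(f"{p}% of simulated runs finished within {percentile(outcomes, p)} weeks")
```

The output is a range with confidence levels instead of a single date, which is usually a more honest basis for a conversation with stakeholders.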

This one is not a book but an online course. It explains how to estimate the value of new development practices. It contains a lot of examples that we can use directly, but it also teaches how to adapt them to new practices. Finally, it goes into the details of how to get real figures! Five hours definitely well spent on the topic.

My future reading-list

Data driven software engineering is a wide topic. I definitely still have a lot to learn about it. Here is a selection of the books from my reading list:

An opportunity

Digging into this topic was a real eye-opener for me. The software development world is plagued by cargo cults and supposed best practices. We follow advice, but most often without verifying whether it actually works! I believe we should start applying the techniques in these books. We could create standard ways to measure the value of productivity, technical debt, quality, testing…

I see real opportunities for data driven software engineering:

Avoiding a lot of useless argument between proponents of A and B

Communicating better with stakeholders

Finally, achieving a sustainable pace

Read these books and give it a try!

PS July 2019

A few months ago, I had the chance to welcome Ismail as a Machine Learning intern. He worked on finding the end-to-end tests that are most likely to fail given a commit. The results were encouraging; you can read the full story in Why Machine Learning in Software Engineering.