
Recently I’ve grown increasingly interested in philosophy and become more involved with stoicism. I found its teachings to be surprisingly practical and applicable to modern life. But what can the stoics teach us about becoming better at the craft of software engineering?

Epictetus is one of the central figures of stoicism. One of the key teachings he focused on was the stoic dichotomy of control. As he put it in Discourses:

The chief task in life is simply this: to identify and separate matters so that I can say clearly to myself which are externals not under my control, and which have to do with the choices I actually control. — Epictetus. Discourses. II.5

In plainer terms, this is the realization that we simply do not have everything in the world under our control - and we would be wise to focus on the things we do control and learn to worry less about the things we do not.

As a side note, the core of this idea is by no means unique to stoicism. Scholars of ancient times identified it independently across cultures and eras. For example, we can find echoes of it in Christian theology and in Buddhism.

Embrace the dichotomy of control in software engineering

The craft of software engineering is a colorful one and its challenges are both many and diverse. One of these challenges is the limited direct control we have over the many factors influencing our work.

To give you some examples from my own experience:

We seek to have excellent reliability and availability for our services. Then one day S3 goes down, pulling us - and half the internet - with it. Yikes.

We seek to deliver features on agreed deadlines. Then we bump into unexpected complexity during implementation stemming from accumulated tech debt. This causes our original estimates to become painfully invalid. Whoops.

We seek to get a promotion. Then the feedback comes from the promo committee stating that after calibration within our cohort, it was found that we’re still behind the curve and need to prove ourselves more in some areas. Oh well.

We seek to nail the interviews for our dream position. Then we get matched up with an interviewer who is laden with biases working against us and who also happens to be having a bad day. In the end we’re told to retry in 6 months despite our best efforts and capabilities. Bummer.

If we can refrain from crying out in desperation that the world and all the gods in it are out to get us, we can spot a pattern in what is and what is not within our direct control.

We can’t…

… predict what fundamental infrastructural component will blow up in our face tomorrow. We can follow best practices of distributed systems design and build extra redundancy and failover patterns into our design to the best of our experience, capabilities and resources.

… spend weeks estimating a task to uncover all hidden complexity or refactor all nasty bits of our codebase. We can make a best effort in estimation and then make sure our stakeholders are kept in the loop and manage their expectations when we bump into something unexpected.

… have a perfectly calibrated internal bar for the next promotion level we aim to hit. Maybe we really aren’t hitting it but we don’t know. We can cultivate an effective relationship with our manager and have open discussions on which competencies we are nailing and which need work. We can write up our promotion packet to the best of our capabilities.

… control the circumstances of our upcoming interview or decide who our interviewer will be to ensure good chemistry. We can prepare for the interview to the best of our capabilities - by researching what the interview funnel looks like for the particular company, practicing commonly asked questions, or writing up notes with example situations from our past for common behavioural questions.

“Ha, easier said than done.” I know. Something being in our control doesn’t mean that it’s also easy to control.

The key message is that we would be wise to worry less about the things we don’t have control over and focus on the aspects we can influence.

In reality, telling them apart is not that easy either, is it? The line between the two categories tends to be fuzzy, especially when we’re up close to a situation, where everything seems bigger and more intimidating than - in the grand scheme of things - it actually is. This is where the experiences we accumulate and the practical wisdom we build by reflecting on them come into play.

Embrace imperfection

One of the things I learned in this industry is that I would do well to embrace imperfection and work with it. Purists and perfectionists tend not to do very well.

For every nasty situation that can happen to us as software engineers, there are a number of things we can do to mitigate it. They will reduce the chance of these unfortunate events happening again, but they will not eliminate the risk. Often they act as double-edged swords with unintended consequences. To mention a few:

Tests will not save us from bugs in production - but they can help. They can also become an extra maintenance burden and slow down the pace of product development (a controversial topic, I know, and I’m simplifying a bit here). Maybe we don’t need 100% code coverage and TDD for a new feature that we just want to validate market fit for - a couple of high-signal end-to-end integration tests would do instead.

Detailed production metrics and dashboards won’t save us from outages - but they can help. They can also be dialed to 11, causing information overload so that people get lost in the maze of colorful graphs and end up not looking at them at all. Maybe we just need a dashboard showing the 5 most important high-level metrics with the ability to do ad-hoc analysis in case of an incident - instead of 40 graphs showing everything that we could easily instrument in our service.

Paranoid rollout tactics won’t save us from outages - but they can help. They can also be designed to be overly cautious, causing the overall development pace to slow down unreasonably. Maybe we don’t need to test that new Privacy Policy page header design on a staging market for a month, worrying about how it might break the core funnel.
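To make the rollout example a bit more tangible, here is a minimal sketch of a rollout gate whose caution scales with the risk of the change. Everything in it is an assumption made up for illustration: the risk tiers, the stage fractions in ROLLOUT_STAGES, and the helper names bucket and is_enabled are hypothetical and not taken from any particular feature-flag tool.

```python
import hashlib

# Hypothetical rollout plans: the fraction of users exposed at each stage.
# A low-risk change (say, a page header tweak) goes to everyone at once,
# while a high-risk change is ramped up gradually.
ROLLOUT_STAGES = {
    "low": [1.0],
    "medium": [0.1, 1.0],
    "high": [0.01, 0.1, 0.5, 1.0],
}


def bucket(user_id: str, feature: str) -> float:
    """Deterministically map a (feature, user) pair to a value in [0, 1)."""
    digest = hashlib.sha256(f"{feature}:{user_id}".encode()).hexdigest()
    # Use the first 32 bits of the hash as a uniform-ish number.
    return int(digest[:8], 16) / 0x100000000


def is_enabled(user_id: str, feature: str, risk: str, stage: int) -> bool:
    """Return True if the feature is on for this user at the given stage."""
    stages = ROLLOUT_STAGES[risk]
    # Stages past the end of the plan are treated as the final stage.
    fraction = stages[min(stage, len(stages) - 1)]
    return bucket(user_id, feature) < fraction


if __name__ == "__main__":
    users = [f"user-{i}" for i in range(10_000)]
    for stage in range(4):
        exposed = sum(is_enabled(u, "new-checkout", "high", stage) for u in users)
        print(f"high-risk stage {stage}: {exposed} of {len(users)} users")
    print("low-risk header:", is_enabled("user-1", "privacy-header", "low", 0))
```

Because each user always lands in the same bucket for a given feature, anyone exposed at an early stage stays exposed at later ones. Where to draw the line between "low" and "high" risk is exactly the judgment call this post is about; the code only makes that choice explicit.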

Balance

Having control over something doesn’t mean that we should overdo it. Striking a good balance with a good return on investment is something I see successful engineers do well. It’s a tricky business as the sweet spot is a moving target - it depends on the circumstances we’re in.

As usual, there are no silver bullets so we need to constantly iterate on our approaches. This means that we’ll inevitably fail at times either because we’re not doing enough of something or doing too much of it - and that is fine as long as we learn and course-correct.

One of the things that I always seek to cultivate in teams I’m a member of is a culture of frequent experimentation on the levers we have control over - but that might be worth its own post sometime.