Infrastructure engineering is a craft learned outside of classrooms. The discipline is ever-changing. Our value lies not in credentials or the recall of accumulated facts, but in our capacity to tackle the unknown.

Failures of management, product, development, and QA hit us first, usually in the dead of night. Established industries have begrudgingly accepted the need to pay for 24/7 staffing, but our teams are so small that we can find ourselves permanently on-call. Some organizations delay hiring ops talent for so long that it is impossible for the new hire to improve the infrastructure. Instead, the engineer is sacrificed to an all-hours cycle of quick fixes and looming crises.

The first bout of burnout is inevitable. How are we to know our limits until we run into them? Burnout, sufficiently advanced, is permanent damage. I've recovered from bad situations in both startups and a huge corporation. I am going to share some war stories and describe the fixes that I implemented to protect my long-term livelihood.