We're growing, and we're looking for a dedicated Site Reliability Engineer to help us automate operations and build infrastructure for growth. We've found product-market fit, and now we're working to ensure that our systems are available, performant, and visible as we develop an architecture that is scalable, cost-effective, secure, and reliable. Your role will be to help us meet those goals and set even more ambitious ones: designing systems that will take us into the future while helping us become hyper-aware of what's happening in our systems right now.

This is an opportunity to engage with cutting-edge technology and work on a real-world problem at global scale. In addition to competitive compensation and benefits, there is room for the right person to take on increased responsibilities. And it's a lot of fun (although fast-paced and even chaotic at times) working as part of a small, passionate team.

Responsibilities



Take ownership of our infrastructure as code, which is currently in Terraform



Lead our DevOps culture, encouraging and enabling developer effectiveness through powerful and secure tooling



Keep us abreast of what's happening with our systems and our customers up to the second — we have one tracing obsessive in the team and we're all trying to be a bit more like him



Expand and improve our CI and CD systems



Help us develop and uphold SLIs and SLOs



Develop and maintain a (blameless) postmortem practice



Make our monitoring and alerting fire on symptoms, not only on outages



Qualifications