Netflix is hiring "Chaos Engineers" to launch assaults on its network and verify that Netflix's systems can recover from failure without degrading customer experience.

In a blog post last night titled "Introducing Chaos Engineering," Netflix "Chaos Commander" Bruce Wong wrote, "We are constantly testing our ability to survive 'once in a blue moon' failures. In a sign of our commitment to this very philosophy, we want to double down on chaos aka failure-injection. We strive to mirror the failure modes that are possible in our production environment and simulate these under controlled circumstances. Our engineers are expected to write services that can withstand failures and gracefully degrade whenever necessary. By continuing to run these simulations, we are able to evaluate and improve such vulnerabilities in our ecosystem."

Toward that end, Wong wrote that Netflix is "hiring additional Chaos Engineers." The jobs don't appear to be on Netflix's job site yet, but a short description was posted on Twitter by Netflix's Dan Woods last week:

Netflix is hiring a “Chaos Engineer” … Basically, somebody to go in and fuck shit up to prove we can recover. Ping me for details! — Dan Woods (@danveloper) September 3, 2014

Netflix's failure-prevention strategy began at least a couple of years ago when it introduced "Chaos Monkey," open source software that automates the process of causing failures by randomly terminating server instances in production.

"Our philosophy remains unchanged around injecting failure into production to ensure our systems are fault-tolerant," Wong wrote in his blog post last night. Netflix intends to "establish virtuous chaos cycles" in which engineers respond to outages by "build[ing] new chaos tools to regularly and systematically test resilience." Wong promised that Netflix will "continue to post our findings" as it dives deeper into chaos.