2017 was a big year. I really can't overstate that. Despite the turnover, we accomplished a lot. Some highlights:

Replaced the Socorro collector. We replaced the Socorro collector with a top-to-bottom rewrite code-named Antenna. We put it in production in April 2017 and fixed a few minor issues that came up. We haven't touched it since then--it's been solid. In July, I wrote a post-mortem and project wrap-up. Ops, QA, engineering--we did awesome on this project!

Created a new Docker-based local development environment. This radically improved our ability to troubleshoot, debug, reproduce issues, fix issues, and verify correctness of fixes. It was a game-changer. In September, I wrote about it in "Socorro local development environment".

Rewrote signature generation code and added a command line interface. This allows us to verify signature generation changes and experiment with new ones. We can now make changes to signature generation code confidently and know roughly what the effects will be. Not only that, but the tools are easy to use and make it possible for anyone to test their signature generation changes. In October, I wrote about it in "Socorro signature generation overhaul and command line interface".

Built a new Docker-based -stage environment. Our current infrastructure has some rough edges and it's really different from the other systems at Mozilla. In order to be more like other systems, we're building a new infrastructure for Socorro that uses Ops-preferred Dockerflow bits. This new infrastructure will make it easier to scale individual components, deploy, back out deploys, and manage everything. Getting a working -stage environment was a huge accomplishment. It involved everything from writing Docker files and new command scripts, to infrastructure glue and deploy pipeline bits, to getting everything including our tests working on Circle CI, to rewriting Socorro code that had underlying assumptions about how it was being run so that it works with the new system. Work for this ongoing project is covered in [bug 1391034] and a bunch of bugs blocking that one.

Rewrote Snappy symbolification server and all things symbols. We rewrote the Snappy symbolification server, which engineers use to symbolicate stacks to get meaningful stack traces. This new system is code-named Tecken. In addition to that, Peter took the project several steps further and centralized all things symbols into Tecken. Socorro's minidump stackwalker now asks Tecken for symbol lookups, which allows Tecken to keep track of missing symbols. Soon, we'll be able to remove all the missing symbol bookkeeping code from Socorro. We're also switching to Tecken for symbol uploads. Soon, we'll be able to remove all the symbol upload code from Socorro, too. Peter wrote a plog entry on load testing Tecken which covers some other bits about Tecken as well.
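
To illustrate why routing every lookup through one service makes separate missing-symbol bookkeeping unnecessary, here is a minimal sketch. It is not Tecken's actual API: the `SymbolService` class, its `lookup` method, and the sample keys are all hypothetical names made up for this example.

```python
from collections import Counter


class SymbolService:
    """Hypothetical central symbol lookup; NOT Tecken's real interface."""

    def __init__(self, symbols):
        # symbols maps (debug_filename, debug_id) -> symbol file contents
        self._symbols = dict(symbols)
        # Every miss is counted here, in one place.
        self.missing = Counter()

    def lookup(self, debug_filename, debug_id):
        key = (debug_filename, debug_id)
        found = self._symbols.get(key)
        if found is None:
            # The service sees every failed lookup, so callers such as a
            # stackwalker do not need their own missing-symbol bookkeeping.
            self.missing[key] += 1
        return found
```

Because the caller only asks for symbols and the service records the misses, the tracking logic lives in exactly one component.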

Removed lots of code and other things from the repository. Adrian and Peter worked on the "deprecation rampage" focusing on removing unused API endpoints. We spent time removing Postgres tables, stored procedures, and views we weren't using. We removed the fakedata generation code. We removed the middleware component (most of it was folded into the webapp). We removed the aging and broken Vagrant development environment. We removed a bunch of scripts whose purpose has long been forgotten. We removed code for cron jobs we no longer run. We removed bits and bobs for projects long abandoned (running Socorro on Heroku, using HBase, etc.). There's still a lot of code ripe for removal and cleaning up, but we made significant progress towards reducing the code base to a size that's maintainable by a small team. This is covered by a bunch of bugs like [bug 1361394], [bug 1314814], [bug 1424027], [bug 1424370], [bug 1398946], and [bug 1387493].

Updated Python dependencies and reworked how we manage them. We updated all the Python dependencies (some of which were several years old), switched to a requirements file and constraints file to specify them, and set up monthly dependency reviews for non-security updates and daily dependency reviews for security updates. This automates the majority of the work required to stay up-to-date. This work is covered in [bug 1306731].

Updated JavaScript dependencies and switched to npm to manage them. Our webapp relies on a bunch of JavaScript libraries. We had copies of these libraries in the repository. We removed the vendored copies and switched to npm to install them from a requirements file. Additionally, we updated the dependencies to more recent versions and set up a monthly review for updates. This work is covered in [bug 1388593].

Built better metrics infrastructure for the webapp. We switched the webapp to use a library I wrote for Antenna called Markus. This makes it much easier to measure things like how often API endpoints are being used. Adding metrics to the webapp is now a two-line code change. I want to update the rest of Socorro in similar ways. Hopefully, I can get to that in early 2018. This work is covered in [bug 1412590].
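
As a rough illustration of what a "two-line" metrics change can look like, here is a self-contained sketch. It deliberately uses a hypothetical in-memory stand-in rather than Markus itself, so the `InMemoryMetrics` class, the `webapp.api` prefix, and the `example_endpoint` handler are assumptions made for the example and not Socorro code.

```python
from collections import Counter


class InMemoryMetrics:
    """Hypothetical stand-in for a metrics client; NOT the Markus API."""

    def __init__(self, prefix):
        self.prefix = prefix
        self.counts = Counter()

    def incr(self, key, value=1):
        # Count an event under a namespaced key.
        self.counts[f"{self.prefix}.{key}"] += value


# Line 1 of the change: get a metrics client for this module.
metrics = InMemoryMetrics("webapp.api")


def example_endpoint(params):
    # Line 2 of the change: record that the endpoint was called.
    metrics.incr("example_endpoint.hit")
    return {"hits": [], "total": 0}
```

With a real backend, the counter would be shipped to a metrics service instead of kept in memory, but the shape of the change in the handler stays the same.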

Cleaned up bugs. We triaged and resolved 1,221 bugs. We resolved bugs that were obsolete, were for projects we had abandoned, had already been fixed, or were otherwise no longer helpful. We're down to under 500 bugs now.

Switched from nose to pytest. We switched from nose to pytest. We have hundreds of tests, so this was an overhaul of our test code which took a while. The end result is that we're now using a test library that has features that will make writing and maintaining tests much easier. This is covered in [bug 1361764] and [bug 1405675].
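
To give a feel for the difference in test style, here is a small sketch comparing a unittest-style test class (which nose could run) with the plain-function style that pytest encourages. The `collapse_whitespace` helper is a hypothetical function invented for this example and is not part of Socorro.

```python
import unittest


def collapse_whitespace(text):
    """Hypothetical helper under test: squeeze runs of whitespace."""
    return " ".join(text.split())


class TestCollapseWhitespace(unittest.TestCase):
    # unittest-style: assertions go through methods on the test case.
    def test_collapses_runs(self):
        self.assertEqual(collapse_whitespace("a   b\tc"), "a b c")


def test_collapses_runs():
    # pytest-style: a plain function with a bare assert. When run under
    # pytest, a failing assert is reported with the compared values.
    assert collapse_whitespace("a   b\tc") == "a b c"
```

pytest collects both forms, which is what makes an incremental migration of hundreds of tests feasible.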

Linted Python code and added linting to CI. We linted all the Python code, fixed issues, and added linting to our CI. Linting is an important tool for finding certain classes of bugs. Being able to lint in CI reduces the risk of code changes. This is covered in [bug 1377254].

Overhauled documentation. We overhauled the documentation. We now have a new Getting Started guide that gets you a local development environment in roughly four steps. It documents the scripts we use for manipulating that environment and running the various components of Socorro individually as well as in conjunction with other components. We also updated all the documentation related to administering and maintaining the infrastructure. There's still a lot of work to do here, but we made significant progress.

Wrote a system checklist. We wrote up a system checklist for verifying that the entire system is working as expected. This is helpful after big changes like upgrading Python versions or critical libraries. This also gives us a list of important things in the system so we can automate verification as much as possible and change parts that are hard to verify.