What I learned about Software Development from building a Climbing Wall 20 November 2019 Theo liked to imitate as he was learning to walk. TL;DR: This post is just for fun! I didn’t really learn about programming, this is just a catchy title, and I wanted to share a big project I have continued to work on that has nothing to do with programming. I thought I could make a kind of funny post by stringing together a bunch of programming ‘wisdom’ that could really be associated with nearly anything. climbing wall’s current state Prototype, Iterate, Test, and Iterate More Climbing Wall: If you aren’t sure how much you will use the wall, or how much effort you want to put into it you can start small. Build and expand over time. Software: In software, most of the time these days folks take an agile / lean approach and try to deliver working MVPs along the life of the project to deliver continuous value and learning. Prototype I started really small, initially with a training board. I installed this above the stairs entering the basement. A training board lets you strengthen your fingers, and do weird pull-ups Iterate I then added a few holds directly into the wall near the original board so I could string together a few moves. A first box of bulk climbing holds, a test mounting on a board Iterate Again & Again Then I added a single set of boards on the wall… and started to expand out from there. empty wall, mounts, one board, and more Project Planning Negotiate with the Team Climbing Wall: I did have to discuss everything with my wife before I started drilling a bunch of holes in our wall. This was a bit of a process, starting with the training board, the wall holes (and an agreement that I would patch any holes… hmm I still have to do some of those. A discussion that became easy as she found she enjoyed adding frequent climbing into her exercise routine as much as I did. 
Software: In software you are always working with stakeholders, PMs, designers, other developers, and hopefully directly with some customers. Nearly everything built requires negotiation and compromises on time, features, UX, etc. negotiation with my partner It Starts with A Plan Initially, I started with just a small idea, but it expanded out, especially after my wife decided she also really enjoyed climbing on the wall as well. planning and materials Ensure Your Project has a Reasonable Learning Curve Climbing Wall: On a climbing wall you want it to be fun for beginners, kids, and more experienced climbers. Just like in software you want a project to be accessible and learnable by new hires and developers of different skill levels. Our wall was a big hit with so many kids, I have added a lot of easier holds and built a lower route so small children can jump on the wall and have a great time. Software: You can’t build a team, recruit, and mentor folks if you have a software project that is all expert level. Ensure it is easy to set up the apps development environment and it is easy to add features, test, and deploy safely so new folks can learn with confidence. Theo Climbing, click for video Set Milestones On personal projects as in software, you want to set short, medium, & long term goals. Short: something at home to help with training for climbing

Medium: I want to connect all the reasonable basement walls for a long route

Long: I want to connect the original training board to the route along with ceiling holds

Longer: I want to add a removable incline that can add a steep grade when good climbers are visiting

Adding Ceiling Holds was a longer-term goal

Celebrate Your Wins

Climbing Wall: In general, whenever I expanded the climbing wall, I would quickly add some holds and celebrate by climbing my new longer route.

Software: Your team should be proud and get to celebrate after shipping something big. Also, ensure developers are sharing the things they learned along the release with the team. Having space to make investments and to pay down debt requires that everything isn’t always moving at maximum speed. Celebrate the progress folks are making.

Climbing the Routes as the wall is in progress

Erin testing our new “door crossing” problem

Plan For Growth

Climbing Wall: On the wall, I would build and leave space to add more when I had the time.

Software: In software, you want code that is flexible and easy to adapt. This doesn’t mean you should over-optimize, but know when to be specific and when to offer flexibility (the rule of three can help with this).

Gaps When Low on Supplies

Make it work, make it right, make it fast

Climbing Wall: The wall was built to keep up with and extend my skill level… When I didn’t have the parts or the time, I would sometimes make something fast and leave gaps to extend the routes.

Software: In software, first make sure you can get it working; this ensures you solve the hard problems. Then make it right: sort out the edge cases and the gotchas. Then make it fast and scalable.

color coded climbing routes

Learning & Growth

Climbing Wall: As I worked on the wall project I became better with tools, building, designing routes, & more. I got comfortable and started to think up some more complex projects.

Software: Software takes practice; you will get better the more you build things. You learn which practices to follow and which don’t scale well.
You Can Learn Anything Climbing Wall: I seriously know very little about building things, tools, construction, or really even climbing. All the information you need is available online to learn so much about any topic that interests you. Software: You are always learning in software. A new framework, language, domain, etc… The field changes so fast that you have to keep learning to stay up to date. To know when something is a fad or is really worth investing time in deep learning. You Get Better with Practice Climbing Wall: Originally, I barely knew what size drill bits, bolts, nuts, and holds… Now I can put all this together and set up a new board in almost no time at all. Software: It is good to keep practicing… Often this is how you learn to navigate all the grey areas of programming. The best solution to a problem isn’t always black and white, often the best practices have edge cases… Learning what to bend and what to break comes with practice and experience. Feeling the pain of maintaining systems over time, knowing what will stick around and what code often just gets removed. over time, building became faster It Takes A Team Climbing Wall: You can’t hold up 4x4 plywood and drill it in yourself… Building a climbing wall requires teamwork. Collaboration to successfully complete the project. Software: In software, most projects can’t be done by a single developer anymore. It takes collaboration, coordination, and teamwork to build something that lasts. A few of the friends who have helped out Learn From Prior Art Climbing Wall: I read a number of things to learn how to build a climbing wall, this free build a climbing wall e-book from Atomik is great. No reason to try to learn from scratch. Software: Not often are you building a program from scratch with no prior art. Learn from existing frameworks, applications, books, and open source. Build on the shoulders of giants as they say. need inspiration? 
google climbing walls

Start Cheap and Upgrade Later

Climbing Wall: I started with some cheap holds, but I upgraded to nicer holds over time as I spent more time on the wall and expanded it. In the end, I really love Atomik climbing holds, and I buy most of my equipment there.

Software: In a software startup or a feature, you want to find the fastest and cheapest way to verify the value of something. When you know there is value and have been able to build something sustainable (or I guess in startup world, with hockey stick growth), you might want to move on from “it works” to “it is best in class”… Particularly for things that aren’t part of the core company business value.

Cheap bulk grey holds, later upgraded to various specialty holds

Safely Removing Image Assets from Rails 28 October 2019 photo credit cleaning: pixabay Why Cleanup Rails Image Assets? Why would we want to more safely delete image assets? a clean repo is easier to jump into and maintain over time

cruft that isn’t in use can be confusing over time

image assets can slow down your test and deploy pipeline: Rails tests frequently need to build all assets, either dynamically or during initialization, and this is often a slow hit on a Rails test suite

Deployment needs to work with assets as well, often in multiple steps: building all assets (rake assets:precompile), compressing the asset bundle, and uploading assets to a CDN

All of this time adds up. Asset compilation on a large legacy Rails app still adds around 40 seconds to the build time, down from 1m30s in the past. Before we cleaned up and optimized that flow and its files, asset preparation and deployment were adding over 3 minutes to our deploy time.

How To Safely Delete Image Assets

OK, hopefully now you would love to delete all the images in your app/assets/images folder that you don’t use… What images are safe to delete or out of use? I have looked at a number of ideas for this.

grepping with various scripts

using log aggregation search results to ensure no hits were being made of an image asset

using Sprockets options; unknown_asset_fallback alone doesn’t make you entirely safe…

What I wanted was a way to delete a folder of images, or a single image, that I believed was no longer in use, but have the build fail if there was any reference to the image. I wanted Rails to fail in these cases:

a page is loaded in dev mode referencing a missing asset

tests are run against a page referencing a missing asset (ActionDispatch::IntegrationTest, request spec, etc)

bundle exec rake assets:precompile

Sprockets Unknown Asset Fallback

Not surprisingly, other folks have wanted this and Sprockets has a built-in option, config.assets.unknown_asset_fallback, which gets close to what I wanted. From the docs, this option claims to:

When set to a truthy value, a result will be returned even if the requested asset is not found in the asset pipeline. When set to a falsey value it will raise an error when no asset is found in the pipeline. Defaults to true.

So let’s set it to false: Rails.application.config.assets.unknown_asset_fallback = false. Now if you have a deleted image referenced like below:

<%= image_tag("deleted_image.svg") %>

You will get an error when visiting the page or running tests:

bundle exec rake
...S.......E
Error:
HomeControllerTest#test_should_get_index:
ActionView::Template::Error: The asset "deleted_image.svg" is not present in the asset pipeline.
    app/views/home/index.html.erb:6:in `_app_views_home_index_html_erb___957919561084124106_70092585694780'
    test/controllers/home_controller_test.rb:5:in `block in <class:HomeControllerTest>'

This doesn’t make one entirely safe, as images that are referenced in your scss, css, or other styles would still not cause an error. They would silently lead to broken images.

Patch To Force Asset Compilation To Fail on Unknown Assets

Sadly, I couldn’t find any option or configuration that would cause compiling stylesheets to fail. I thought this would block my goal of being able to remove a ton of assets safely with confidence… Well, after lots of digging, I figured out how to patch sprockets-rails so that it will blow up and fail when it references an unknown asset. You can apply this patch in your config/initializers/assets.rb.

Now if you have a file in your styles, like app/assets/stylesheets/application.scss, reference an image, your asset pipeline will blow up when the image is missing.
.broken-image-demo {
  background-image: image-url('deleted_image.svg');
}

Depending on how your tests run, they will fail when precompiling assets, or a failure will occur when you call rake assets:precompile as shown below.

bundle exec rake assets:precompile
...
Done in 1.32s.
rake aborted!
Sprockets::Rails::Helper::AssetNotFound: path not resolved: deleted_image.svg
/Users/danmayer/projects/coverband_demo/config/initializers/assets.rb:56:in `rescue in compute_asset_path'
/Users/danmayer/projects/coverband_demo/config/initializers/assets.rb:51:in `compute_asset_path'
/Users/danmayer/.rvm/gems/ruby-2.6.2/gems/actionview-5.2.2.1/lib/action_view/helpers/asset_url_helper.rb:200:in `asset_path'
...

Asset Failure Demo

If you want to see this in action, feel free to clone coverband demo. Install gems and get the tests passing. Then read the comments, uncomment the example lines, and run the tests or compile assets to trigger the expected errors.

Key Files:

config/initializers/assets.rb, this shows all the setup and configuration needed

app/assets/stylesheets/application.scss, an example of stylesheets referencing a missing image

app/views/home/index.html.erb, an example of a view file referencing a broken image

A Final Note

On an old legacy application we were able to delete over 50% of the total asset disk size by clearing out old unused image assets. This made it easier to find and navigate our assets folder, and it significantly sped up both our test suite and deployment. I wouldn’t expect most projects to have as much image cruft sitting around, but with older applications especially, it is easy for these unused assets to really build up over time. While the above should make it easier to delete image assets and do housekeeping yourself, it still takes a bit of time. You need to go through a process:

find a likely set of unused images

delete them, run tests

add back images that were still used

repeat until satisfied This obviously looks like a process that can be automated to help you clean up all your image assets automatically. That is totally true, and I will cover how to do that in a future post. This post covers what is a prerequisite to being able to automate the cleanup, ensuring that your app will properly and very loudly fail when an image was removed which is still required.
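As a rough sketch of the first step in the process above (finding a likely set of unused images), the following plain-Ruby helper lists image files whose file name never appears in any other file under the given source directories. The method name, the extension list, and the directory layout are assumptions for illustration, not part of the original post. It is only a heuristic: it misses file names built dynamically (for example by string interpolation), which is exactly why the loud failures described above still matter.

```ruby
# Hypothetical helper: names and the extension list are illustrative assumptions.
IMAGE_EXTENSIONS = %w[.png .jpg .jpeg .gif .svg].freeze

# Returns the image paths under image_dir whose file name is never mentioned
# in any non-image file under source_dirs. These are only *candidates* for
# deletion; run the test suite and asset compilation after removing them.
def unused_image_candidates(image_dir, source_dirs)
  images = Dir.glob(File.join(image_dir, "**", "*")).select do |path|
    File.file?(path) && IMAGE_EXTENSIONS.include?(File.extname(path).downcase)
  end

  # Collect every source file into one binary string so a simple substring
  # search works regardless of each file's encoding.
  source_text = String.new
  source_dirs.each do |dir|
    Dir.glob(File.join(dir, "**", "*")).each do |path|
      next unless File.file?(path)
      next if images.include?(path) # do not search the images themselves

      source_text << File.binread(path) << "\n"
    end
  end

  images.reject { |path| source_text.include?(File.basename(path).b) }.sort
end

# Example call (paths are assumptions about a typical Rails layout):
# puts unused_image_candidates("app/assets/images", ["app", "lib"])
```

Matching on the bare file name keeps the sketch simple and errs on the side of treating an image as used, so it under-reports rather than over-reports candidates.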

Flaky Ruby Tests 07 September 2019 Restoration of a Ruby Test Suite I want to talk about some recent work to restore a Rails app’s test suite to a usable state. The goal went beyond the test suite, to restoring trust in the continuous deployment pipeline, but this post will mostly focus on the Rspec suite. The reason I started on this work was that the current state of deployment was “dangerous”, various folks preferred to avoid the project as it was difficult to work in and release, but still critical to our overall architecture. At it’s very worst, deploys were taking over 30 minutes, with a failure rate of the deployment pipeline of 45%. The issue became clear and high priority to me when one day, I had two small PRs to release, due to bad luck with the failure rate, it took me nearly 6 hours to get my changes deployed. A constant distraction that dragged on through meetings, and other work. Making the pains of the team extremely clear and personal, I decided an effort to get things back into a safe state should be taken on. I wanted to share some of what I learned as there has been some recent discussion in the Ruby community about fixing flaky tests @samsaffron @tenderlove @ctietze @SonjaBPeterson @saramic. Over the years one of the most complex liabilities we carried in our test suite has been flaky tests. This problem is so hard some people just give up. We are slowly cataloging the problems here: https://t.co/7AdwOGNfNw , I hope to write about it. — Sam Saffron (@samsaffron) May 7, 2019 A Restoration Plan Taking a step back, thinking about what needed to happen and how to get there was the first step. I thought this would fit in well to the old agile advice… Make It Work. Make It Right. Make It Fast. Kent Beck Make It Work What didn’t work about the deployment? In my opinion, it was broken because: doesn’t meet a 95% or better success rate

deploys are too slow to watch and review whether changes succeeded; they need to take 10 minutes or less

test suite relying on CI parallelism is to slow to ever run locally, local suite run needs to be possible in 1hr or less. With a definition of what success looks like to make it work, then I was able to start to dig into the details of how to get there. Delete Flaky Tests That Aren’t Valuable This is often a hard one for some teams. An honest discussion of the purpose and value of tests is likely needed. I found good success by having a small PR removing a few flaky tests, and pointing to similar tests in the suite that exercised functionality in a more reliable way. For example, moving from a complete feature spec that tested several endpoints in a long workflow, to a couple of feature tests exercising individual endpoints, along with unit tests for the underlying service providers. The team might need to have a discussion ahead of time, or you might be surprised that others quickly agree that really flaky tests aren’t providing value. Flaky tests are worse than useless tests — Aaron Patterson (@tenderlove) August 30, 2019 Fixing Flaky Tests The primary cause of our unstable deploy pipeline was flaky tests. Given we were having deployments fail 45% of the time we had a large number of flaky tests causing issues. Let’s dive into some of the techniques for resolving flaky test failures. andrewhalliday Divide and Conquer, with Rspec-Retry Initially quarantine helps to reduce their damage to other tests, but you still have to fix them soon. Martin Fowler, Eradicating Non-Determinism in Tests Since before I joined the project, it has used rspec-retry as a way to isolate some flaky tests. Yes, this is a band-aid, but in terms of getting back to a “make it work” baseline, it is a wonderful tool. This is how I used it. For a period of about a month, I watched every failure on our CI spec suite. Every time a test failed more than once, I would add it to an internal wiki marking the test as flaky. 
Other folks from the team and I, when we had time, would “adopt” a test and try to fix it. If one of us timeboxed an hour or two and couldn’t figure out and fix the underlying issue, we would tag it flaky, so that rspec-retry would run the test multiple times trying to achieve success. We ran our flaky tagged specs in a special CI job, bundle exec rspec --tag retry_flaky_test, isolated from our other tests. This CI job had a success rate of 99%, so the flaky tests would pass on retry and be split off from the others. Then, with logging and debugging, we could dig in deeper to resolve the underlying issues and move the test back into the standard test suite. This is great because it very quickly got the test suite back into usable condition, it tags all the future work still needing to be addressed, and it captures metrics about which tests are most flaky or are no longer flaky as we resolve issues. At our current stage, we still need to go back and resolve a number of flaky tests, but they no longer slow down or block our CI.

Isolate Flaky Test Recipe:

capture data and identify flaky tests (use metric tracking, or just eyeball it from your CI history)

quickly try to fix them timeboxed to a short amount of time

if you can’t fix them, tag them for rspec-retry , to isolate them and remove them as a blocker for CI
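A minimal configuration sketch of the tagging step in this recipe, assuming the rspec-retry gem is in the Gemfile and that quarantined examples carry the retry_flaky_test tag used in the CI command above. The file location and the retry count of 3 are assumptions, not the project's actual settings.

```ruby
# spec/spec_helper.rb (assumed location)
require "rspec/retry"

RSpec.configure do |config|
  config.verbose_retry = true                 # log each retry attempt
  config.display_try_failure_messages = true  # show why each attempt failed

  # Only examples tagged :retry_flaky_test are retried; 3 is an assumed count.
  config.around(:each, :retry_flaky_test) do |example|
    example.run_with_retry(retry: 3)
  end
end

# In a spec file, a quarantined example would be tagged like this
# (the description is a placeholder):
#
#   it "completes the checkout workflow", :retry_flaky_test do
#     ...
#   end
```

Keeping the retry behind a tag means untagged specs still fail on the first error, so new flakiness is not silently hidden.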

Find a way to distribute this process across folks on the team, and explain the plan for following through with cleanup.

Fix the Worst Offenders

From the above you will likely find some worst offender tests, or hopefully patterns that impact multiple tests. Even with rspec-retry, these may still fail too often to be reliable. If you dig into a few of the tests (during the timeboxing) you are likely to find some patterns. For example, @TildeWill fixed a whole class of flaky tests related to Capybara negative matchers. We also fixed entire categories of failing tests that weren’t properly using Capybara’s asynchronous matchers; each of these fixes added around 6% to the success rate of our CI suite per PR.

Common Flaky Test Issues

I won’t cover all the various types of flaky tests in as much detail as you can find in @samsaffron’s post, Tests that sometimes fail. Here are some of the most common issues we found while resolving issues.

fix timing issues (timezone dependent)

stale state issues due to non DB stores (global vars, redis, memcache, Rails.cache, etc)

fix order / dependency issues… Tests that only pass in a specific order. Running specs in documentation mode can really help find ordering issues… Run it this way every time so you have a clear log of what ran when: time bundle exec rspec --format documentation

Capybara: devise auth issues; not using the async matchers, for example expect(page.current_url).to eq(/something/) is bad, switch to the waiting version expect(page).to have_current_path(/something/), which is good; using sleep as opposed to the correct waiting matchers; in general, wait_for_ajax is dangerous

VCR allowing non-stubbed network requests can be dangerous; try to get your suite passing with VCR.configure { |c| c.allow_http_connections_when_no_cassette = false }

A few tips to help debug and fix flaky tests. I found each of these scripts extremely valuable in moving forward our success rate.

Single Flaky Test Case

Quickly verify a test that fails randomly, even in isolation, with this great tip.

from @_swanson

Single Flaky Spec File

Quickly check the success rate of a single file. I use this to report a before and after of a fix in a PR.

Calculate Success Rate of a Branch on CircleCI

Push a fix on a branch and run it a number of times to see if it improves the overall success rate.

Additional Resources on Resolving Flaky Tests

BuildPulse, software service that helps identify flaky tests.

Capybara Cheat Sheet, ensure you are following the best practices, to avoid common mistakes.

Write Reliable, Asynchronous Integration Tests With Capybara

Capybara::SlowFinderErrors

Tearing Down Capybara Tests of AJAX Pages

Make It Right

Most of the remaining work on restoring this test suite now falls into this category. The deployment pipeline now succeeds over 95% of the time and takes around 10m. These are acceptable numbers for our project. What we haven’t done is resolve all of the flaky tests that pass only because of retry attempts. Until we can move all the tests to be fully reliable, there is work to be done.

Make It Fast

I will dive into this more in a future article, but with some effort, the team was able to get our CI deployment pipeline down from over 30m on average to only 10m on average. The CI jobs that run just our tests are down to around 5m, with the full deployment jobs taking longer. I expect as we continue to make improvements and fix some of the known bad actors in our test suite, this number will continue to go down.

Why did we make so much progress on “Make It Fast” before finishing “Make It Right”? Well, we needed a better and faster feedback loop to find and fix flaky tests, as well as to make it right. A fast feedback loop is really required to make progress quickly. Until we could speed up the iteration cycles, we could only get so many flaky fix PRs through the pipeline in a day, and at the beginning testing locally wasn’t really possible. In terms of make it fast, I did want to mention there are still two efforts under way.

Local Test Suite Speed

If the test suite is too slow to ever run locally, it is also hard to test and ensure it reliably runs anywhere other than on CI. Initially, the test suite was so slow it would either crash or stall out most of the time. Occasionally, with many failures, it would complete after 9+ hours… After using CI to drive most of the fixes, the local spec suite now reliably runs on a single machine in 45 minutes. This is still far too slow for my liking but is headed in the right direction.
Deployment Pipeline Improvements

The CI deployment pipeline is the test suite, but also much more. This article isn’t going to focus on deployment improvements, but even without changes related to the tests or fixing flaky test failures, various improvements cut our deployment time to a third. I will detail this more in a future article. This involved breaking down all the continuous deployment steps, finding inefficiencies and redundancy, and improving parallelization.
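The “Single Flaky Spec File” tip earlier in this post boils down to running the same thing many times and counting the passes. A minimal sketch of that idea in plain Ruby follows; it is not the original script, and the method name, spec path, and run count are assumptions for illustration.

```ruby
# Runs the given block `runs` times and returns the fraction of runs in
# which the block returned a truthy value (a value between 0.0 and 1.0).
def success_rate(runs)
  raise ArgumentError, "runs must be positive" unless runs.positive?

  passes = runs.times.count { yield }
  passes.to_f / runs
end

# Hypothetical usage against a single spec file (path and count are
# placeholders). Kernel#system returns true when the command exits with 0:
#
#   rate = success_rate(20) do
#     system("bundle", "exec", "rspec", "spec/features/example_spec.rb",
#            out: File::NULL, err: File::NULL)
#   end
#   puts format("passed %.0f%% of 20 runs", rate * 100)
```

Running this before and after a fix gives the two numbers to quote in a PR description.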

Building SVG Badges in Ruby 10 April 2019

A while ago I needed to create a simple dynamic SVG. After a bit of tinkering, it was easy enough to build a simple Ruby class to do what I needed to do.

SVG created from the below code

It was a super quick thing to put together and solved a problem I was having. I was reminded of that today when I had another quick issue that I wanted to resolve. I wanted to pull some data not available from a service’s APIs. The data I needed was easily accessible to me in their web views, so I quickly hacked together a web scraper, pulling the data I needed into a Google Sheet.

Being a Developer, Unblocks You

A thing I have always loved about being a developer is that you can solve your own problems. You have to be careful not to get sucked in and waste a bunch of time, but you also aren’t blocked just because something you need isn’t already available. If you build a quick hack, ensure it is just that, a quick hack, and that you won’t regret putting it into your workflow. The below SVG example was a tiny helper for devs, while today’s journey helped pull CircleCI metrics that I want to collect for myself monthly. In either case, if it breaks it is no issue and can be fixed in a few minutes. Being able to solve the problems you run into along the way is one of the great parts of being a developer.

Full Ruby SVG Badge Code

class SvgFormatter
  def initialize(output = nil)
    @output = output || STDOUT
  end

  # Writes the badge to badge/results.svg (the badge/ directory must exist).
  def format(result)
    # SOME_DATA_SOURCE is a placeholder for wherever the number comes from.
    percentage = SOME_DATA_SOURCE.round(1)
    File.open('badge/results.svg', 'w') { |f| f.write(template(percentage)) }
  rescue StandardError => e
    # StandardError also covers file errors such as a missing badge/ directory.
    @output.puts e.message
    @output.puts 'SvgFormatter was unable to generate a badge.'
  end

  private

  def template(percentage)
    file_content = <<~SVGTEMPLATE
      <?xml version="1.0"?>
      <svg xmlns="http://www.w3.org/2000/svg" width="90" height="20">
        <linearGradient id="a" x2="0" y2="100%">
          <stop offset="0" stop-color="#bbb" stop-opacity=".1"/>
          <stop offset="1" stop-opacity=".1"/>
        </linearGradient>
        <rect rx="3" width="90" height="20" fill="#555"/>
        <rect rx="3" x="51" width="39" height="20" fill="#007ec6"/>
        <rect rx="3" width="90" height="20" fill="url(#a)"/>
        <g fill="#fff" text-anchor="middle" font-family="DejaVu Sans,Verdana,Geneva,sans-serif" font-size="11">
          <text x="24.5" y="15" fill="#010101" fill-opacity=".3">Title</text>
          <text x="24.5" y="14">Title</text>
          <text x="68.5" y="15" fill="#010101" fill-opacity=".3">#{percentage}%</text>
          <text x="69.5" y="14">#{percentage}%</text>
        </g>
      </svg>
    SVGTEMPLATE
    file_content
  end
end
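To try the idea without a data source or a badge/ directory, here is a small self-contained variant that returns the SVG as a string for a given title and number. The badge_svg helper and its simplified markup are assumptions for illustration, not part of the class above.

```ruby
# Hypothetical helper: builds a simplified badge and returns it as a string
# instead of writing a file, so it is easy to inspect or test.
def badge_svg(title, percentage)
  # Escape characters that are special in XML so the title cannot break the markup.
  safe_title = title.gsub(/[&<>"]/, "&" => "&amp;", "<" => "&lt;", ">" => "&gt;", '"' => "&quot;")
  value = "#{percentage.round(1)}%"

  <<~SVG
    <?xml version="1.0"?>
    <svg xmlns="http://www.w3.org/2000/svg" width="90" height="20">
      <rect rx="3" width="90" height="20" fill="#555"/>
      <rect rx="3" x="51" width="39" height="20" fill="#007ec6"/>
      <g fill="#fff" text-anchor="middle" font-family="DejaVu Sans,Verdana,Geneva,sans-serif" font-size="11">
        <text x="24.5" y="14">#{safe_title}</text>
        <text x="69.5" y="14">#{value}</text>
      </g>
    </svg>
  SVG
end

# Example: write a badge to a file of your choosing (file name is a placeholder).
# File.write("coverage.svg", badge_svg("cov", 87.46))
```

Returning a string keeps the rendering separate from file handling, which is handy when the badge is served from a web endpoint rather than written to disk.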