September 23, 2019

Over the years, Stripe has become an established name in online payments. That’s why they were one of the main contenders we had in mind, here at WeTransfer, when a major overhaul of our billing system was long overdue. This is a story about how to wear stripes or, more exactly, about how we migrated our entire billing system to Stripe.

The Background #

Let’s go back to mid-2018. Our billing system back then was pretty basic: it included two plans (monthly and yearly) and supported two currencies (€ for our European users and $ for everyone else). It was baked in-house, integrating an external payment processor and some assorted billing logic, and it had supported our business for a good number of years.

However, as WeTransfer became more popular, it slowly outgrew this solution, to the point where it became obvious that we needed more. So the time came when we gathered everybody who had a say about money around a table and discussed visions, plans and ambitions. It turned out that we had to make a choice: radically refactor our current billing system, or build a new one from scratch.

Weighing the Options #

Tempting as it was to pick option number two, we stopped, took a deep breath, and started weighing the options. After all, no strategic, long-lasting decision should ever be taken based on gut feeling. We asked ourselves what would make the dream billing system, and assessed the two options in that context.

First and foremost, we wanted our (well, the company’s) revenue to be in safe hands. Handling money is neither a trivial task, nor one that we had expertise in. So, we decided to continue delegating this to a third party that could provide greater robustness and reliability. Both our old payment processor and Stripe shone at this. 0 - 0, so far.

Then came the feature set and other goodies. At WeTransfer, we needed more flexible plans, support for different currencies and more payment methods. Our home-baked system could have ticked all of these boxes, but we estimated the price of the necessary refactor to be quite high. On the other hand, Stripe offered these out of the box, plus automated billing, integrated fraud detection and other niceties. 1 - 0, for option two.

In the end, we’re all human and want to spend happy times in the office. That’s why we wanted a system that we could reason about, that integrated with well-designed and well-documented third parties, and that would keep the number of WTFs to a minimum. That was a clear win for option number two. Trust me 😉. What’s more, while the old billing system had certainly seen better days, it was still doing its job, and doing it fine. So, as a wise man once said, why change it?

After carefully considering the above, we came up with the verdict: we would design and implement a brand new billing system, on top of Stripe.

The Fine Print #

Building a billing system from scratch is not rocket science. It’s kind of a luxury, actually. In real life, you are almost never given a blank canvas to freely express your engineering talent. That is to say, WeTransfer had been live for almost nine years and was key to the everyday life of many individuals. Money was constantly coming in, and we had to keep it that way.

We couldn’t just build a new system and switch traffic to it. We’re talking about new technology, new user flows and payment information that needed to be migrated, all of which would’ve posed a high financial risk. We needed to find a way to gradually roll out the new system. And that meant running two billing systems in parallel for a period of time.

Stripe boasted lots of features, but it was nowhere near absolute parity with our old billing system. Among the biggest minuses were the lack of tax-inclusive plans (which has recently been addressed) and, of course, of PayPal support.

We also had to account for highly asynchronous events: chargebacks. Say a user pays for their subscription today, we migrate them tomorrow, and thirteen months later they decide their payment was collected in error and request a chargeback. Indeed, 13 months: that’s the limit for disputing SEPA Direct Debit payments. So, even long after we had successfully migrated billing systems, we still had to keep the old way of handling chargebacks running, in case our old payment provider received such requests.

Ultimately, we had to admit that things would go wrong sooner or later; that’s how engineering goes. So, we dedicated a good chunk of time to building safety nets, as many as a good night’s sleep would need.

The Fundamental Bits #

Every system is built around some core concepts. Given the nature and the complexity of a billing system, we gave these some well-deserved thought when designing ours. The ones that made it into our final design were: having one source of truth for data, building an isolated and extensible system, and ensuring idempotency in key areas. Let’s go through each one of these in more depth.

Single Source of Truth #

Delegating a core aspect of your business to a third party sparks off an interesting discussion. Which data will you trust? Who will act as the master and who will be the replica? In our case, it boiled down to who had more responsibility: us, or Stripe. Because this is what they do for a living, and they’re doing it pretty well, it made sense to trust them.

Therefore, the majority of our billing-related logic is performed directly on Stripe, while our database is simply a reflection of Stripe’s data. In other words, our database acts as a cache for application state (think of subscription statuses, which determine what our frontend will present to the user). For this, we’re making heavy use of webhook events. We silently process them in the background, which works particularly well for our use-case, where eventual consistency is considered good enough.
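To make the "database as a cache" idea a bit more concrete, here is a minimal sketch of mirroring subscription state from webhook events. It is not our actual implementation: `SubscriptionCache` and `SyncSubscriptionFromEvent` are hypothetical names, the cache is an in-memory stand-in for a database table, and the event is assumed to be already verified and parsed into a Hash.

```ruby
# Hypothetical in-memory stand-in for the local subscriptions table.
class SubscriptionCache
  def initialize
    @rows = {}
  end

  # Upsert the locally cached status for a Stripe subscription id.
  def upsert(external_id, status)
    @rows[external_id] = status
  end

  def status_for(external_id)
    @rows[external_id]
  end
end

# Applies a parsed webhook payload to the cache.
# Only subscription events are mirrored; everything else is ignored.
class SyncSubscriptionFromEvent
  HANDLED_TYPES = %w[
    customer.subscription.created
    customer.subscription.updated
    customer.subscription.deleted
  ].freeze

  def initialize(cache)
    @cache = cache
  end

  # Returns true when the event was applied, false when it was ignored.
  def call(event)
    return false unless HANDLED_TYPES.include?(event.fetch("type"))

    subscription = event.fetch("data").fetch("object")
    @cache.upsert(subscription.fetch("id"), subscription.fetch("status"))
    true
  end
end

cache = SubscriptionCache.new
sync = SyncSubscriptionFromEvent.new(cache)
event = {
  "type" => "customer.subscription.updated",
  "data" => { "object" => { "id" => "sub_foo", "status" => "past_due" } }
}
sync.call(event)
cache.status_for("sub_foo") # what the frontend would read from now on
```

Because the local row is always overwritten with whatever Stripe reports, Stripe stays the source of truth and the local copy only ever lags behind it.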

Isolation and Extensibility #

One of the pain points of our old billing system was its lack of flexibility: it had a fixed number of plans, supported a fixed number of currencies and was tightly coupled to our old payment provider. This worked perfectly fine in the past; however, our company’s ambitions made it fairly obsolete. We learned our lesson and designed the new system with a few things in mind.

First, we made our hardcoded business rules more extensible. That is, plans were moved from this:

    PLANS = {
      monthly: {
        # ...
      },
      yearly: {
        # ...
      }
    }

to a proper ActiveRecord model, backed by its own database table. Magic numbers, in turn, became more versatile configuration objects.
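As a rough illustration of what "plans as data" buys you, here is a plain-Ruby sketch. It deliberately avoids ActiveRecord so it runs on its own: `Plan` is a Struct standing in for the model, `PlanCatalog` is a hypothetical lookup helper, and the plan codes and amounts are made up for the example.

```ruby
# Hypothetical plain-Ruby stand-in for a database-backed Plan model.
Plan = Struct.new(:code, :interval, :currency, :amount_cents, keyword_init: true)

class PlanCatalog
  def initialize(plans)
    @plans = plans
  end

  # Look a plan up by its attributes instead of reading a hardcoded constant.
  # Returns nil when no plan matches.
  def find(interval:, currency:)
    @plans.find { |plan| plan.interval == interval && plan.currency == currency }
  end
end

# Illustrative rows only; the codes and amounts are invented for this sketch.
catalog = PlanCatalog.new([
  Plan.new(code: "monthly_eur", interval: :month, currency: "eur", amount_cents: 1200),
  Plan.new(code: "yearly_usd", interval: :year, currency: "usd", amount_cents: 12_000)
])

plan = catalog.find(interval: :year, currency: "usd")
plan.code
```

Adding a plan or a currency then becomes a matter of adding a row, not of editing a constant and redeploying.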

Then, we wrote the new system totally isolated from the old one. Put another way, we created new models, services, policies and whatnot, and we also namespaced them differently (for instance, New::Subscription or New::Services::SubscribeUser). This way, we established a very clear boundary between the two systems, so the chances of them clashing with each other were slimmer. As a bonus, this decision turned out to be incredibly useful when the time came to wipe the old system out of our codebase.

Finally, we always kept thinking about tomorrow. While we will only support Stripe in the foreseeable future, we didn’t want to lock ourselves into our payment provider. After all, that was one of the reasons why it was so hard to touch the old system. Thus, all our billing-related objects look like:

#<Payment external_provider: "stripe", external_id: "ch_foo" ...>

This way, we can easily swap payment providers in the future, or externalise some chunks to other third parties (like handling payments through an additional provider – hello PayPal!).
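One way such provider-agnostic records could be put to use is a small adapter registry keyed by `external_provider`. This is a sketch under assumptions, not a description of our codebase: `StripeAdapter`, `PaypalAdapter`, `ProviderRegistry` and the `refund` method are hypothetical, and the adapter bodies are placeholders where real code would call the provider’s API.

```ruby
# Hypothetical billing record, mirroring the two fields shown above.
Payment = Struct.new(:external_provider, :external_id, keyword_init: true)

# Each adapter hides one provider behind the same small interface.
# The bodies are placeholders; real adapters would call the provider's API.
class StripeAdapter
  def refund(external_id)
    "stripe refund requested for #{external_id}"
  end
end

class PaypalAdapter
  def refund(external_id)
    "paypal refund requested for #{external_id}"
  end
end

class ProviderRegistry
  def initialize(adapters)
    @adapters = adapters
  end

  # Pick the adapter from the record itself, so callers never name a provider.
  def adapter_for(payment)
    @adapters.fetch(payment.external_provider) do |name|
      raise ArgumentError, "unknown payment provider: #{name}"
    end
  end
end

registry = ProviderRegistry.new({
  "stripe" => StripeAdapter.new,
  "paypal" => PaypalAdapter.new
})

payment = Payment.new(external_provider: "stripe", external_id: "ch_foo")
registry.adapter_for(payment).refund(payment.external_id)
```

With this shape, supporting another provider means registering one more adapter; the records and the calling code stay untouched.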

Idempotency #

Any application dealing with payments is already fairly complex. Adding a third party to the mix only increases this complexity. Therefore, many things can go wrong: from random network failures to SyntaxErrors in a worker class, everything is possible. That is why any key action should be idempotent: it can be retried any number of times, producing the same result. You don’t want to be that developer making a silly syntax error, only to find out that all signup-related background jobs have been permanently lost.

We made sure all our critical paths are idempotent:

requests to Stripe;

background jobs that process Stripe webhook events;

migration scripts.

Fortunately, ensuring idempotency is quite a trivial task, in most cases. Stripe enables it very easily through their SDK:

Stripe::Subscription.create(attrs, { idempotency_key: 'foo' })

while in worker classes you can usually get away with something like:

    class ProcessEvent < Worker
      def perform
        return if @event.processed?
        # ...
      end
    end
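To show why that guard is enough to make retries safe, here is a self-contained variation of the worker above. The `Event` class, its `processed_at` field and the `side_effects` list are assumptions made for the example, and the event is passed to `perform` as an argument instead of being stored on the worker.

```ruby
# Hypothetical event record; `processed_at` plays the role of the processed flag.
class Event
  attr_reader :id, :type, :processed_at

  def initialize(id:, type:)
    @id = id
    @type = type
    @processed_at = nil
  end

  def processed?
    !@processed_at.nil?
  end

  def mark_processed!
    @processed_at = Time.now
  end
end

class ProcessEvent
  # Records the work done, so the effect of retries can be observed.
  attr_reader :side_effects

  def initialize
    @side_effects = []
  end

  def perform(event)
    return if event.processed? # retrying a finished job is a no-op

    @side_effects << "handled #{event.type} (#{event.id})"
    event.mark_processed! # mark only after the work has succeeded
  end
end

event = Event.new(id: "evt_foo", type: "invoice.payment_succeeded")
worker = ProcessEvent.new
3.times { worker.perform(event) }
worker.side_effects.size # the work ran once, despite three attempts
```

In a real system, the check and the mark would likely need to happen inside a transaction or under a lock, so that two workers picking up the same event at the same time cannot both pass the guard.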

This concludes part one of Wearing Stripes. It was quite a lengthy story, so thanks for (still) being here. In the second part of this article, I’ll dive into how we rolled out our new billing system and how we migrated all users to it.
