For many areas of growth, presenting your message with the right hook to pique a user’s interest and to get them to engage is critical. Copy is especially important in areas such as landing pages, email subject lines or blog post titles, where users make split second decisions on whether or not to engage with the content based on a short phrase. Companies like BuzzFeed have built multi-billion dollar businesses in part by getting this phrasing down to a science and doing it more effectively than their competitors.

At Pinterest, we knew copy testing could be impactful, but we weren’t regularly running copy experiments because they were tedious to setup and analyze in our existing systems. This made it difficult to do the type of iteration necessary to optimize a piece of copy. Last year, however, we built a framework called Copytune to help address these issues. The framework has helped us optimize copy across numerous languages and significantly boost MAUs (Monthly Active Users). In this post, we’ll cover how we built Copytune, the strategy we’ve found most effective for optimizing copy and some important lessons we learned along the way.

Building Copytune

When we decided to build Copytune, we had a few goals in mind:

1) Optimize copy on a per-language basis by running an independent experiment for each language. What performs best in English won’t necessarily perform the best in German.

2) Make copy experiments easy to set up, and eliminate the need to change code in order to setup an experiment.

3) Have copy experiments auto-resolve themselves. When you’re running 30+ independent experiments (one experiment for each language), each with 15 different variants, it becomes too much analysis overhead to have a human go in and pick the winning variant for each language.

Copytune dashboard showing different winners among languages

To achieve these goals, we built a framework that mimicked the API for Tower, the translation library that every string passes through. We first had every string pass through Copytune, which would check the database to see if there was an experiment setup for that string. If so, it would return one of the variants. If the string was not in experiment, Copytune would then pass the string to Tower to get the correct translation of the string. A nightly job would then compile statistics on all the copy experiments and would automatically shut down experiments when there was enough data to declare the winner.

Copy optimization strategy

Testing copy requires an iterative process to achieve the best results. It’s almost impossible to identify the ‘best’ copy in one go, so we took an incremental approach to discover it.

Explore Phase: You can’t know for sure what will work, so we started by testing many variants that touch on very different themes, tones, etc. We typically brainstorm 15 – 20 different variants. For example:

The latest Pins in Home Decor



Come see the top Pins in Home Decor for 12/3/2015



We found a few Pins you might like

Refine Phase: After the Explore Phase, we began to see which tones and phrasing were performing best. Then we could refine by testing different components of the winning variants of the Explore Phase.

Let’s say that in Explore Phase, the winner was “We found some {pin_keyword} and {pin_topic} Pins and boards for you!”. There are many possible optimizations we can test in this example.

Example of component variations

We can try adding “Hey Emma!” at the beginning to catch the Pinner’s attention. We can even test whether “Hi Emma!” or just “Emma!” is better than “Hey Emma!”. We can test some phrases like “we found” vs. “we picked.” We can test if “Pins and boards” is better than just having “Pins” or “Boards.” In this example there are at least 10 components we can test. We treat them as independent components and test each of them against the winner.

Combine Phase: Let’s say “Hi Emma!”, “we picked” and “{pin_topic}” were winners in the Refine Phase. We can now test if the combination Refine Phase winners (a) performs better than the original winner (b)

1. “Hi Emma! We picked some {pin_topic1} and {pin_topic2} Pins and boards for you!”

2. “We found some {pin_keyword} and {pin_topic} Pins and boards for you!”

Note that it’s possible that some components are not independent, so we also tested other combinations that seem promising.

In one of our highest volume emails, the winning variant from the Explore Phase showed only a gain in one percent open rate. By the end of the whole iteration, optimizing the subject line on one email boosted it to an 11 percent gain, adding hundreds of thousands more active Pinners each week.

Lessons learned

Copytune has been in place for almost a year now, and we’ve learned some lessons along the way:

Defining Success: When we initially started testing email subject lines, we defined the success criteria as driving an email open. This seemed to be the most straightforward since the Pinner reads the email subject line and the next action is to either open it or don’t. What we found, however, was that defining success with metrics that were further downstream (i.e. clicking on the content in the email) was more effective. Some subject lines were great at getting opens, but there was a mismatch in user expectations based on the subject line and the actual content in the email so net net they were actually resulting in fewer clicks.

Picking Variants: The original vision for Copytune was to use a Multi Armed Bandit framework for picking variants and auto-resolving experiments. The difficulty we ran into was feature owners wanted to see how the experiment performed across a variety of metrics and to be able to report concrete MAU gains from the experiment. To accommodate these needs, we ultimately needed to integrate Copytune with our internal A/B testing framework.

Acknowledgements: Koichiro Narita for co-writing this post, helping develop Copytune, and running the subject line experiments covered in this post. Devin Finzer and Sangmin Shin for helping develop Copytune.

This post was originally published on the Pinterest Engineering Blog