Incentive alignment in Token Curated Registries

A path to resolve controversial decisions, foster sincere votes and reward true skin in the game in TCR designs.

TLDR: Below is an examination of the crypto-economic incentives of voting in the "canonical" implementation of TCRs, and some suggestions for improvement. The text starts by introducing some basic game-theoretic concepts. If you just want to read the TCR part, skip to the bottom :-)

Coordination games

In game-theoretical jargon, a “coordination game” is a game with multiple, equally good Nash equilibria. Or, if you don't speak the jargon: it's a game in which the players need to coordinate to reach an optimal outcome. Here's what such a game looks like:

The kissing game.

In this game, Ann and Bob are from different (probably European) countries. They meet and kiss each other on the cheek. They can each choose left or right. If they both choose left (or both choose right), they are happy and win 1 happiness point; if they choose differently, they bump noses, are embarrassed, and get 0.
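As a quick sanity check, the payoff rule above can be written out in a few lines of Python. Brute-force enumeration confirms that "both left" and "both right" are the game's two pure-strategy Nash equilibria:

```python
from itertools import product

# Payoffs for the kissing game: 1 point each if Ann and Bob pick the
# same side, 0 if they bump noses (numbers taken from the text).
def payoff(ann, bob):
    return (1, 1) if ann == bob else (0, 0)

def pure_nash_equilibria(strategies=("L", "R")):
    """Keep the strategy profiles where neither player can gain
    by unilaterally switching sides."""
    equilibria = []
    for a, b in product(strategies, repeat=2):
        pa, pb = payoff(a, b)
        ann_ok = all(pa >= payoff(a2, b)[0] for a2 in strategies)
        bob_ok = all(pb >= payoff(a, b2)[1] for b2 in strategies)
        if ann_ok and bob_ok:
            equilibria.append((a, b))
    return equilibria

print(pure_nash_equilibria())  # [('L', 'L'), ('R', 'R')]
```

Both equilibria pay out the same, which is exactly why the players need some way to coordinate on one of them.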

Coordination games are common in the crypto world — for example, deciding the order of transactions in the blockchain can be described as a coordination game, in which the main goal is that everyone agrees on which block to add, while the actual contents of the block are of secondary importance.

Schelling points

In real life, coordination problems are pretty easy to solve: players can simply talk to each other and agree which option to choose (like how we all agree to use the word “cat” for cats for no particular reason other than that's what others call it, or drive on whichever side of the road is customary in the country we are in).

Schelling, in The Strategy of Conflict (1960), observes that even in the complete absence of communication, players may still be able to coordinate if one of the options is more "salient" than the others. For example, in the kissing game, if Bob has a “kiss me here” tattoo on his left cheek, then even without any further communication, Ann and Bob would both choose "Left".

A Schelling point in a coordination game

The truth as a Schelling point

Back in 2014, Vitalik Buterin proposed an elegant idea to use coordination games as a way of having a group of voters decide on the truth. Imagine we play the following game, in which players do not care much whether they lie or tell the truth, but get rewarded if they both respond in the same way:

Truth as a Schelling point

The idea is that players would choose "Truth" over lying because the truth stands out as a natural Schelling point (the point is more convincing if you imagine the game being played over many possible answers, among which only one has the special property of being true).

There are problems with this idea (many of which were discussed in Vitalik's original post). One issue that is relevant here is that the mechanism does not in any way reward the sincerity of the players. If Ann is convinced that Bob will lie, she should lie as well to get the payoff — even if she has a preference for telling the truth rather than lying (as long as that preference is worth less to her than 1 game point).

The truth in a prediction market

Now compare the Schelling game to the game that is played in prediction markets like Augur and Gnosis. In a very simple form (and ignoring many of the interesting aspects of prediction markets, like the fact that they are set up as a market :-) ), voters need to predict whether X will be true or not. There are 2 points to win, which are divided among all who predict the outcome correctly. So if X turns out to be true, the payout matrix looks like this:
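A minimal sketch of this simplified payout rule (ignoring stake sizes and the actual market mechanics, as above):

```python
# A 2-point pool is split evenly among everyone who predicted the
# outcome correctly — the simplified rule described in the text.
def payouts(votes, outcome, pool=2.0):
    winners = [player for player, vote in votes.items() if vote == outcome]
    share = pool / len(winners) if winners else 0.0
    return {player: (share if player in winners else 0.0) for player in votes}

# X turns out to be true:
print(payouts({"Ann": "X", "Bob": "X"}, "X"))      # both right: 1 point each
print(payouts({"Ann": "X", "Bob": "not X"}, "X"))  # only Ann right: she takes all 2
```

Note that Ann's payout never decreases by voting what she actually believes, whatever Bob does — which is the point made below.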

Payouts in the prediction game if X turns out to be true

This looks very different from a coordination game.

Firstly, in a prediction market, Ann's winning strategy is to vote sincerely. If she believes X to be true, she gains most by choosing X (regardless of what Bob chooses); if she believes X to be false, she'll vote accordingly.

The mechanism in the prediction market functions in such a way that there are no incentives for misrepresenting your beliefs. Or, to put it differently, the incentives of the individual players of the game (maximizing their payout) are perfectly aligned with the purpose of the game (which is to elicit information about what is true or not). Note how this is different from the coordination game above, where players are sometimes put in a situation where it is profitable to misrepresent the truth.

Another observation about prediction markets is that the players' payout increases with the entropy of the vote: the more the players disagree about the outcome, the more the voters who turn out to be right will get. This makes sense: a controversial vote signals a difficult decision, and therefore more work to get it right.
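To make this concrete, here is a rough illustration, under the simplifying assumption that the majority turns out to be right: as a binary vote gets closer to 50/50, both its Shannon entropy and the per-winner payout increase.

```python
import math

def vote_entropy(p):
    """Shannon entropy (in bits) of a binary vote split."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

# 10 voters, a 2-point pool, and (by assumption) a correct majority.
pool, n = 2.0, 10
for majority in (9, 8, 7, 6):
    p = majority / n
    print(f"split {majority}-{n - majority}: "
          f"entropy={vote_entropy(p):.2f} bits, "
          f"per-winner payout={pool / majority:.2f}")
```

The correlation is only rough (a lone correct voter in a 9-1 vote earns the most while entropy is lowest), but on average closer votes mean fewer winners sharing the pool.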

TCR voting as a coordination game

In a TCR vote, players must decide whether a given entry should be IN or OUT of the list.

Is the voting mechanism in the canonical implementation of TCRs more like a coordination game or like a prediction game? Let's look at the function that calculates the payout:

There is a fixed reward pool (to be precise, the amount is dispensationPct * minStake) that is divided among the "winning" voters.

This reward scheme looks similar to a coordination game: players are rewarded for their alignment with the majority, rather than for intrinsic properties of the outcome itself.
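Under one reading of that rule, the payout looks something like this. (A sketch, not the actual contract code: the names dispensationPct and minStake come from the text, and the stake-proportional split among winners is an assumption.)

```python
# Sketch of the canonical TCR payout rule as described in the text:
# a fixed pool of dispensationPct * minStake is split among winning
# voters in proportion to the tokens they committed.
def tcr_rewards(votes, winning_side, dispensation_pct, min_stake):
    pool = dispensation_pct * min_stake
    winning_stake = sum(stake for side, stake in votes.values()
                        if side == winning_side)
    return {
        voter: pool * stake / winning_stake if side == winning_side else 0.0
        for voter, (side, stake) in votes.items()
    }

votes = {"Ann": ("IN", 30), "Bob": ("IN", 10), "Chris": ("OUT", 60)}
print(tcr_rewards(votes, "IN", dispensation_pct=0.5, min_stake=100))
# Ann and Bob split the 50-token pool 3:1; Chris gets nothing.
```

The key property for what follows: the pool is fixed, so each winner's share shrinks as the winning side grows.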

Is this a good choice? Some of the research on TCRs does indeed talk about “focal points” and “Schelling points”. But the very idea of a "Schelling point" applies to situations where communication is impossible: using a Schelling point is a strategy of last resort for decision making, and will only work in the absence of explicit ways of coordinating. In contrast, the decisions to be made in TCRs will typically be about arguable claims, and players are not only allowed, but actively encouraged, to communicate and coordinate on a preferred outcome (adChain opens a reddit thread for each challenge, for example — see below).

The TCR voting mechanism is not a coordination game after all

Because the payout for each vote is a fixed amount to be divided among a variable number of winning voters, the reward will diminish as the majority grows. In other words, to get the biggest slice of the cake, you must be part of a majority that is as small as possible.

This complicates the coordination between players considerably.

Let's be more precise. We need at least 3 players to model a majority vote properly, so we add Chris, who, like Ann and Bob, gets one vote. Let's say the reward pool consists of 8 tokens.

We'd need to draw a 3-dimensional matrix, which is hard to do on Medium, so let's just show Ann's payouts when playing against Bob and Chris:

A discoordination game

Technically, this is no longer a pure coordination game. Unanimous agreement is still stable against unilateral deviation (defecting on your own just lands you in the minority, with a payout of 0), but each player now prefers an outcome in which some, but not all, of the others break the agreement: a smaller majority means a bigger slice of the pool. So whatever coordinated strategy the players may agree upon, everyone is hoping that somebody else will not honor it.

This leads to some pretty perverse logic, and to incentives for behaviour that we probably do not want to encourage. For example, in a "true" coordination game, you are never incentivised to exclude other players from the "consensus group": to minimize the risk of not being part of the majority, you'd want as many players as possible to agree. In contrast, in the fixed-pie scenario described here, if you are confident that you are part of the majority, it is in your interest to (A) try to exclude others from voting (e.g. by trying to block the publicising of the vote) and (B) convince some other (but not too many) parties to vote differently from the majority (by spreading misinformation, perhaps).
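We can check this with brute force. Under one simple model of the fixed-pool rule (majority voters split the 8 tokens evenly, minority voters get nothing; the exact numbers in the matrix above may differ), unanimity does survive unilateral deviation, but every voter earns strictly more as part of a 2-person majority (4 tokens) than of a 3-person one (8/3) — which is exactly the incentive to shrink the majority:

```python
from itertools import product

POOL, PLAYERS = 8.0, 3

def payoffs(profile):
    """Majority side splits the pool evenly; minority voters get 0.
    (A simplifying assumption about the payout rule.)"""
    majority = "IN" if profile.count("IN") * 2 > PLAYERS else "OUT"
    winners = profile.count(majority)
    return [POOL / winners if vote == majority else 0.0 for vote in profile]

def pure_nash_equilibria():
    """Enumerate all 8 vote profiles; keep those where no single
    player can improve their payoff by flipping their own vote."""
    eqs = []
    for profile in product(("IN", "OUT"), repeat=PLAYERS):
        stable = True
        for i in range(PLAYERS):
            flipped = list(profile)
            flipped[i] = "OUT" if profile[i] == "IN" else "IN"
            if payoffs(tuple(flipped))[i] > payoffs(profile)[i]:
                stable = False
        if stable:
            eqs.append(profile)
    return eqs

print(pure_nash_equilibria())  # only the unanimous profiles survive
print(payoffs(("IN", "IN", "IN"))[0], "<", payoffs(("IN", "IN", "OUT"))[0])
# 8/3 < 4: each voter is better off in a 2-person majority than a 3-person one
```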

Aligned incentives in TCR voting

A crucial observation in the TCR white paper is that there are further incentives in play, external to the voting game. If the token value is positively correlated to the quality of the list, "token holders realize a direct financial benefit for curating the list in an expert manner" (Mike Goldin). As voters are, by definition, token holders, they are not only interested in the direct rewards they get from voting, but also in the expected value increase of the tokens they are voting with, and this aligns their personal incentives with the general good of the list.

These considerations are external to the voting game (meaning they are not explicitly represented in that game), but we can define a new game which takes such externalities into account.

Say that e(IN) represents Ann's expectation of how much her tokens will appreciate in value if IN wins, and e(OUT) if OUT wins. She is playing against Bob and Chris, who can both vote IN, split their votes, or both vote OUT. Ann's payout matrix then looks like this ("r" is the reward):

Ann's expected payout in the TCR voting game

This already looks much better: Ann's optimal choice here depends not only on her alignment with the majority, but also on her expectations of what the outcome of the vote will do to the value of the list.

So how does this change her behaviour, compared to the pure coordination game that did not take her expectations into account? Well, in case she believes her vote is decisive for the outcome (i.e. the middle column), she'll choose IN (or OUT) in line with her expectations, which is good. But if she does not think her vote is decisive, her rational strategy is to vote with the majority rather than according to her expectations.
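A small simulation of this game (illustrative numbers; the flat majority reward r is a simplification of the pool split described earlier):

```python
# Ann's expected payoff with token-value expectations: a reward r for
# being in the majority, plus her expected appreciation e(winner).
def ann_payoff(ann_vote, others, e, r=4.0):
    votes = [ann_vote] + list(others)
    winner = "IN" if votes.count("IN") * 2 > len(votes) else "OUT"
    reward = r if ann_vote == winner else 0.0
    return reward + e[winner]

e = {"IN": 1.0, "OUT": 5.0}  # Ann expects OUT to be much better for the list

# Pivotal (Bob and Chris split their votes): Ann votes her expectation.
print(ann_payoff("OUT", ["IN", "OUT"], e) > ann_payoff("IN", ["IN", "OUT"], e))  # True
# Not pivotal (Bob and Chris both vote IN): following the majority pays
# more, even though Ann believes OUT is better for the list.
print(ann_payoff("IN", ["IN", "IN"], e) > ann_payoff("OUT", ["IN", "IN"], e))    # True
```

When Ann is not pivotal, the winner (and hence her e term) is fixed regardless of her vote, so only the majority reward matters — her expectations drop out of the decision entirely.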

Sincere voting

It is often desirable that the mechanism of a game is designed in such a way that players signal their true preferences rather than vote strategically. In our case, if a token holder believes an item should be IN, the incentives should be set in such a way that she will vote IN.

Instead, in the coordination game of the canonical TCR implementation, players are incentivised to signal whether they think the majority thinks the item should be in the list or not.

But this can be fixed. If we remove the reward, or make the reward the same no matter which way you vote, the incentives of the game are perfectly aligned to get players to signal their expectations sincerely:

Thus, it is much easier for voters to participate in this game: they simply evaluate their own preferences rather than second-guess the preferences of others. And that means that if the majority of voters believes that IN is better for the list than OUT (e(IN) > e(OUT)), then IN will always win.
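The claim is easy to verify by enumeration. Once the reward no longer depends on which way you vote, voting for whichever outcome you expect more value from is weakly dominant, whatever Bob and Chris do. A sketch:

```python
from itertools import product

# Same 3-player setting, but the reward is flat: Ann receives r plus
# her expected token appreciation under the winning outcome,
# regardless of how she voted.
def ann_payoff_flat(ann_vote, others, e, r=4.0):
    votes = [ann_vote] + list(others)
    winner = "IN" if votes.count("IN") * 2 > len(votes) else "OUT"
    return r + e[winner]

def sincere_is_weakly_dominant(e):
    """Check that voting for the outcome Ann expects more value from
    is at least as good as lying, against every vote of the others."""
    sincere = "IN" if e["IN"] > e["OUT"] else "OUT"
    insincere = "OUT" if sincere == "IN" else "IN"
    return all(
        ann_payoff_flat(sincere, others, e) >= ann_payoff_flat(insincere, others, e)
        for others in product(("IN", "OUT"), repeat=2)
    )

print(sincere_is_weakly_dominant({"IN": 5.0, "OUT": 1.0}))  # True
print(sincere_is_weakly_dominant({"IN": 1.0, "OUT": 5.0}))  # True
```

Ann's vote only matters when she is pivotal, and when she is pivotal she picks the outcome she expects more from; in all other cases she is indifferent, so sincere voting never costs her anything.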

(An objection to equalising the voting rewards may be that this means free rewards for token holders who vote without going into the merits of the vote at all. This is true. The reward scheme of the canonical TCR is open to a similar problem, and one proposed way to mitigate it is to make voting a zero-sum game — i.e. the completely opposite direction from the one taken here :-) )

Are curators aligned with the TCR?

The TCR theory presumes that the individual incentives of token holders are aligned with the value of the list as a whole, and so it is in their own interest to behave well as voters. But this is true only for hodlers — token holders who intend to keep the token for a while. For token holders who are prepared to sell their tokens immediately after the vote, e(IN) and e(OUT), which represent the gains from appreciation of the token's value, will be practically 0. In other words, these token holders have no incentive at all to vote well.

In more formal terms, it is important that the mechanism is designed in such a way that the subjective expectations e(IN) and e(OUT) correlate as much as possible to the medium-and-long-term development of the token value.

So whose expectations do correlate with the long-term token value? Those of long-term token hodlers who believe that the value of their token is a function of the quality of the list.

In the current TCR implementation, it is exactly the opposite. Both the PLCR.sol contract (which is used for voting) and the Registry.sol contract (which handles the stakes of listees and challengers) require exclusive access to the user's tokens. This means that staked tokens are excluded from the vote: the mechanism excludes the very token holders who we already know have skin in the game.

Possible changes to the TCR implementation

If the above observations are correct, it could be fruitful to experiment with reshaping or extending the underlying mechanism design to address one or more of the following:

- Do not exclude token holders that have skin in the game from voting. Specifically, do not exclude the staked tokens of the listees and challengers, who have a direct interest in the quality of the list.

- Make sure all voters have skin in the game, so that their interests are aligned with long-time token hodlers. Specifically, require that the tokens used to vote are locked for a certain period (as is common in other protocols where votes are token-weighted).

- Design the mechanism so that players can vote sincerely and do not need to resort to second-guessing the other voters. Specifically, make the reward independent of the actual vote (this needs more thought about free-riding, though).

- Reward voters for making controversial decisions. Choosing the obvious is very cheap, making a tough decision takes time and energy, and rewards should reflect this. Specifically, this can be implemented by making the dispensationPct a function of the entropy of the vote.

- Do not create perverse incentives that necessitate overly complex reasoning and/or introduce negative externalities. Specifically, a fixed reward to be divided among voters creates incentives to censor or misinform other voters.
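As an illustration of the entropy suggestion, dispensationPct could be scaled by the Shannon entropy of the vote split, so that near-unanimous (i.e. easy) decisions pay out little while 50/50 decisions pay out the full reward. This is a hypothetical sketch: max_pct and the scaling rule are assumptions, not part of any existing TCR implementation.

```python
import math

def entropy(p):
    """Shannon entropy (in bits) of a binary vote split."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

# Hypothetical: scale the dispensation percentage by vote entropy,
# so controversial decisions are rewarded more than obvious ones.
def dispensation_pct(votes_for, votes_against, max_pct=0.5):
    total = votes_for + votes_against
    return max_pct * entropy(votes_for / total)

print(dispensation_pct(50, 50))  # 0.5: maximally controversial, full reward
print(dispensation_pct(99, 1))   # ~0.04: near-unanimous, almost no reward
```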

PR in the works :-)