Followup/distillation/alternate-take on Duncan Sabien's Dragon Army Retrospective and Open Problems in Group Rationality.

There's a particular failure mode I've witnessed, and fallen into myself:

I see a problem. I see what seems to me to be an obvious solution to the problem. If only everyone Took Action X, we could Fix Problem Z. So I start X-ing, and maybe talking about how other people should start X-ing. Action X takes some effort on my part, but it's obviously worth it.

And yet... nobody does. Or not enough people do. And a few months later, here I am, still taking Action X and feeling burned and frustrated.

Or –

– the problem is that everyone is taking Action Y, which directly causes Problem Z. If only everyone would stop Y-ing, Problem Z would go away. Action Y seems obviously bad; clearly we should be on the same page about this. So I start pointing it out to people when they're doing Action Y, and expect them to stop.

They don't stop.

So I start subtly socially punishing them for it.

They don't stop. What's more... now they seem to be punishing me.

I find myself getting frustrated, perhaps angry. What's going on? Are people wrong-and-bad? Do they have wrong-and-bad beliefs?

Alas. So far in my experience it hasn't been that simple.

A recap of 'Rabbit' vs 'Stag'

I'd been planning to write this post for years. Duncan Sabien went ahead and wrote it before I got around to it. But, Dragon Army Retrospective and Open Problems in Group Rationality are both lengthy posts with a lot of points, and it still seemed worth highlighting this particular failure mode in a single post.

I used to think a lot in terms of Prisoner's Dilemma, and "Cooperate"/"Defect." I'd see problems that could easily be solved if everyone just put a bit of effort in, which would benefit everyone. And people didn't put the effort in, and this felt like a frustrating, obvious coordination failure. Why do people defect so much?

Eventually Duncan shifted towards using Stag Hunt rather than Prisoner's Dilemma as the model here. If you haven't read it before, it's worth reading the description in full. If you're familiar you can skip to my current thoughts below.

My new favorite tool for modeling this is stag hunts, which are similar to prisoner’s dilemmas in that they contain two or more people each independently making decisions which affect the group. In a stag hunt:

— Imagine a hunting party venturing out into the wilderness.

— Each player may choose stag or rabbit, representing the type of game they will try to bring down.

— All game will be shared within the group (usually evenly, though things get more complex when you start adding in real-world arguments over who deserves what).

— Bringing down a stag is costly and effortful, and requires coordination, but has a large payoff. Let’s say it costs each player 5 points of utility (time, energy, bullets, etc.) to participate in a stag hunt, but a stag is worth 50 utility (in the form of food, leather, etc.) if you catch one.

— Bringing down rabbits is low-cost and low-effort and can be done unilaterally. Let’s say it only costs each player 1 point of utility to hunt rabbit, and you get 3 utility as a result.

— If any player unexpectedly chooses rabbit while others choose stag, the stag escapes through the hole in the formation and is not caught. Thus, if five players all choose stag, they lose 25 utility and gain 50 utility, for a net gain of 25 (or +5 apiece). But if four players choose stag and one chooses rabbit, they lose 21 utility and gain only 3.

This creates a strong pressure toward having the Schelling choice be rabbit. It’s saner and safer (spend 5, gain 15, net gain of 10 or +2 apiece), especially if you have any doubt about the other hunters’ ability to stick to the plan, or the other hunters’ faith in the other hunters, or in the other hunters’ current resources and ability to even take a hit of 5 utility, or in whether or not the forest contains a stag at all.
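The group-level arithmetic above can be checked with a few lines of Python. This is only a sketch: the function name `group_net` and the constant names are my own labels for the numbers in the description, and it assumes the stag is caught only if every hunter chooses stag.

```python
# Costs and payoffs taken from the description above; all names are illustrative.
STAG_COST, STAG_VALUE = 5, 50
RABBIT_COST, RABBIT_VALUE = 1, 3


def group_net(choices):
    """Net utility for the whole party, given a list of 'stag'/'rabbit' choices."""
    stag_hunters = choices.count("stag")
    rabbit_hunters = choices.count("rabbit")
    spent = stag_hunters * STAG_COST + rabbit_hunters * RABBIT_COST
    # Assumption: the stag is caught only when everyone commits to it.
    caught_stag = stag_hunters > 0 and stag_hunters == len(choices)
    gained = (STAG_VALUE if caught_stag else 0) + rabbit_hunters * RABBIT_VALUE
    return gained - spent


print(group_net(["stag"] * 5))               # 25
print(group_net(["stag"] * 4 + ["rabbit"]))  # -18
print(group_net(["rabbit"] * 5))             # 10
```

A single unexpected rabbit choice turns the best group outcome (+25) into the worst one (-18), while all-rabbit sits safely in between at +10.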

Let’s work through a specific example. Imagine that the hunting party contains the following five people:

Alexis (currently has 15 utility “in the bank”)

Blake (currently has 12)

Cameron (9)

Dallas (6)

Elliott (5)

If everyone successfully coordinates to choose stag, then the end result will be positive for everyone. The stag costs everyone 5 utility to bring down, and then its 50 utility is divided evenly so that everyone gets 10, for a net gain of 5. The array [15, 12, 9, 6, 5] has bumped up to [20, 17, 14, 11, 10].

If everyone chooses rabbit, the end result is also positive, though less excitingly so. Rabbits cost 1 to hunt and provide 3 when caught, so the party will end up at [17, 14, 11, 8, 7].

But imagine the situation where a stag hunt is attempted, but unsuccessful. Let’s say that Blake quietly decides to hunt rabbit while everyone else chooses stag. What happens?

Alexis, Cameron, Dallas, and Elliott each lose 5 utility while Blake loses 1. The rabbit that Blake catches is divided five ways, for a total of 0.6 utility apiece. Now our array looks like [10.6, 11.6, 4.6, 1.6, 0.6].

(Remember, Blake only spent 1 utility in the first place.)

If you’re Elliott, this is a super scary result to imagine. You no longer have enough resources in the bank to be self-sustaining—you can’t even go out on another rabbit hunt, at this point.

And so, if you’re Elliott, it’s tempting to preemptively choose rabbit yourself. If there’s even a chance that the other players might defect on the overall stag hunt (because they’re tired, or lazy, or whatever) or worse, if there might not even be a stag out there in the woods today, then you have a strong motivation to self-protectively husband your resources. Even if it turns out that you were wrong about the others, and you end up being the only one who chose rabbit, you still end up in a much less dangerous spot: [10.6, 7.6, 4.6, 1.6, 4.6].

Now imagine that you’re Dallas, thinking through each of these scenarios. In both cases, you end up pretty screwed, with your total utility reserves at 1.6. At that point, you’ve got to drop out of any future stag hunts, and all you can do is hunt rabbit for a while until you’ve built up your resources again.
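The per-player arrays in this example can be reproduced with a short Python sketch. The helper `play_round` and its `rabbit_hunters` parameter are hypothetical names introduced here for illustration; the sketch assumes that everyone not listed as a rabbit hunter chooses stag, that any rabbit choice lets the stag escape, and that all game is split evenly.

```python
# Costs and payoffs from the stag hunt description; all names are illustrative.
STAG_COST, STAG_VALUE = 5, 50
RABBIT_COST, RABBIT_VALUE = 1, 3

banks = {"Alexis": 15, "Blake": 12, "Cameron": 9, "Dallas": 6, "Elliott": 5}


def play_round(banks, rabbit_hunters=()):
    """Return new banks after one hunt; players not in rabbit_hunters choose stag."""
    rabbit_hunters = set(rabbit_hunters)
    # Assumption: the stag is caught only if nobody chooses rabbit.
    stag_caught = not rabbit_hunters
    pot = (STAG_VALUE if stag_caught else 0) + RABBIT_VALUE * len(rabbit_hunters)
    share = pot / len(banks)  # all game is shared evenly
    return {
        name: round(bank - (RABBIT_COST if name in rabbit_hunters else STAG_COST) + share, 1)
        for name, bank in banks.items()
    }


print(list(play_round(banks).values()))               # [20.0, 17.0, 14.0, 11.0, 10.0]
print(list(play_round(banks, banks.keys()).values())) # [17.0, 14.0, 11.0, 8.0, 7.0]
print(list(play_round(banks, ["Blake"]).values()))    # [10.6, 11.6, 4.6, 1.6, 0.6]
print(list(play_round(banks, ["Elliott"]).values()))  # [10.6, 7.6, 4.6, 1.6, 4.6]
```

Dallas ends at 1.6 in both failed-hunt scenarios, which is the point of the paragraph above: for the poorer players, the downside of a failed stag hunt does not depend on who broke formation.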

So as Dallas, you’re reluctant to listen to any enthusiastic plan to choose stag. You’ve got enough resources to absorb one failure, and so you don’t want to do a stag hunt until you’re really darn sure that there’s a stag out there, and that everybody’s really actually for real going to work together and try their hardest. You’re not opposed to hunting stag, you’re just opposed to wild optimism and wanton, frivolous burning of resources.

Meanwhile, if you’re Alexis or Blake, you’re starting to feel pretty frustrated. I mean, why bother coming out to a stag hunt if you’re not even actually willing to put in the effort to hunt stag? Can’t these people see that we’re all better off if we pitch in hard, together? Why are Dallas and Elliott preemptively talking about rabbits when we haven’t even tried catching a stag yet?

I’ve recently been using the terms White Knight and Black Knight to refer, not to specific people like Alexis and Elliott, but to the roles that those people play in situations requiring this kind of coordination. White Knight and Black Knight are hats that people put on or take off, depending on circumstances.

The White Knight is a character who has looked at what’s going on, built a model of the situation, decided that they understand the Rules, and begun to take confident action in accordance with those Rules. In particular, the White Knight has decided that the time to choose stag is obvious, and is already common knowledge/has the Schelling nature. I mean, just look at the numbers, right?

The White Knight is often wrong, because reality is more complex than the model even if the model is a good model. Furthermore, other people often don’t notice that the White Knight is assuming that everyone knows that it’s time to choose stag—communication is hard, and the double illusion of transparency is a hell of a drug, and someone can say words like “All right, let’s all get out there and do our best” and different people in the room can draw very different conclusions about what that means.

So the White Knight burns resources over and over again, and feels defected on every time someone “wrongheadedly” chooses rabbit, and meanwhile the other players feel unfairly judged and found wanting according to a standard that they never explicitly agreed to (remember, choosing rabbit should be the Schelling option, according to me), and the whole thing is very rough for everyone.

If this process goes on long enough, the White Knight may burn out and become the Black Knight. The Black Knight is a more mercenary character—it has limited resources, so it has to watch out for itself, and it’s only allied with the group to the extent that the group’s goals match up with its own. It’s capable of teamwork and coordination, but it’s not zealous. It isn’t blinded by optimism or patriotism; it’s there to engage in mutually beneficial trade, while taking into account the realities of uncertainty and unreliability and miscommunication.

The Black Knight doesn’t like this whole frame in which doing the safe and conservative thing is judged as “defection.” It wants to know who this White Knight thinks he is, that he can just declare that it’s time to choose stag, without discussion or consideration of cost. If anyone’s defecting, it’s the White Knight, by going around getting mad at people for following local incentive gradients and doing the predictable thing.

But the Black Knight is also wrong, in that sometimes you really do have to be all-in for the thing to work. You can’t always sit back and choose the safe, calculated option—there are, sometimes, gains that can only be gotten if you have no exit strategy and leave everything you’ve got on the field.

I don’t have a solution for this particular dynamic, except for a general sense that shining more light on it (dignifying both sides, improving communication, being willing to be explicit, making it safe for both sides to be explicit) will probably help. I think that a “technique” which zeroes in on ensuring shared common-knowledge understanding of “this is what’s good in our subculture, this is what’s bad, this is when we need to fully commit, this is when we can do the minimum” is a promising candidate for defusing the whole cycle of mutual accusation and defensiveness.

(Circling with a capital “C” seems to be useful for coming at this problem sideways, whereas mission statements and manifestos and company handbooks seem to be partially-successful-but-high-cost methods of solving it directly.)

The key conceptual difference that I find helpful here is acknowledging that "Rabbit" / "Stag" are both positive choices, that bring about utility. "Defect" feels like it brings in connotations that aren't always accurate.

Saying that you're going to pay rent on time, and then not, is defecting.

But if someone shows up saying "hey let's all do Big Project X" and you're not that enthusiastic about Big Project X but you sort of nod noncommittally, and then it turns out they thought you were going to put 10 hours of work into it and you thought you were going to put in 1, and then they get mad at you... I think it's more useful to think of this as "choosing rabbit" than "defecting."

Likewise, it's "rabbit" if you say "nah, I just don't think Big Project X is important". Going about your own projects and not signing up for every person's crusade is a perfectly valid action.

Likewise, it's "rabbit" if you say "look, I realize we're in a bad equilibrium right now and it'd be better if we all switched to A New Norm. But right now the Norm is X, and unless you are actually sure that we have enough buy-in for The New Norm, I'm not going to start doing a costly thing that I don't think is even going to work."

A lightweight, but concrete example

At my office, we have Philosophy Fridays*, where we try to get in sync about important underlying philosophical and strategic concepts. What is our organization for? How does it connect to the big picture? What individual choices about particular site-features are going to bear on that big picture?

We generally agree that Philosophy Friday is important. But often, we seem to disagree a lot about the right way to go about it.

In a recent example: it often felt to me that our conversations were sort of meandering and inefficient. Meandering conversations that don't go anywhere are a stereotypical rationalist failure mode. I do it a lot by default myself. I wish that people would punish me when I'm steering into 'meandering mode'.

So at some point I said 'hey this seems kinda meandering.'

And it kinda meandered a bit more.

And I said, in a move designed to be somewhat socially punishing: "I don't really trust the conversation to go anywhere useful." And then I took out my laptop and mostly stopped paying attention.

And someone else on the team responded, eventually, with something like "I don't know how to fix the situation because you checked out a few minutes ago and I felt punished and wanted to respond but then you didn't give me space to."

"Hmm," I said. I don't remember exactly what happened next, but eventually he explained:

Meandering conversations were important to him, because it gave him space to actually think. I pointed to examples of meetings that I thought had gone well, that ended with google docs full of what I thought had been useful ideas and developments. And he said "those all seemed like examples of mediocre meetings to me – we had a lot of ideas, sure. But I didn't feel like I actually got to come to a real decision about anything important."

"Meandering" quality allowed a conversation to explore subtle nuances of things, to fully explore how a bunch of ideas would intersect. And this was necessary to eventually reach a firm conclusion, to leave behind the niggling doubts of "is this *really* the right path for the organization?" so that he could firmly commit to a longterm strategy.

We still debate the right way to conduct Philosophy Friday at the office. But now we have a slightly better frame for that debate, and awareness of the tradeoffs involved. We discuss ways to get the good elements of the "meandering" quality while still making sure to end with clear next-actions. And we discuss alternate modes of conversation we can intelligently shift between.

There was a time when I would have pre-emptively gotten really frustrated, and started rationalizing reasons why my teammate was willfully pursuing a bad conversational norm. Fortunately I had thought enough about this sort of problem that I noticed that I was falling into a failure mode, and shifted mindsets.

Rabbit in this case was "everyone just sort of pursues whatever conversational types seem best to them in an uncoordinated fashion", and Stag is "we deliberately choose and enforce particular conversational norms."

We haven't yet coordinated enough to really have a "stag" option we can coordinate around. But I expect that the conversational norms we eventually settle into will be better than if we had naively enforced either my or my teammate's preferred norms.

Takeaways

There seem to me to be a couple of important takeaways here.

One is that, yes:

Sometimes stag hunts are worth it.

I'd like people in my social network to be aware that sometimes, it's really important for everyone to adopt a new norm, or for everyone to throw themselves 100% into something, or for a whole lot of person-hours to get thrown into a project.

When discussing whether to embark on a stag hunt, it's useful to have shorthand to communicate why you might ever want to put a lot of effort into a concerted, coordinated effort. And then you can discuss the tradeoffs seriously.

I have more to say about what sort of stag hunts seem do-able. But for this post I want to focus primarily on the fact that...

The Schelling option is Rabbit

Some communities have established particular norms favoring 'stag'. But in modern, atomized, Western society you should probably not assume this as a default. If you want people to choose stag, you need to spend special effort building common knowledge that Big Project X matters, and is worthwhile to pursue, and get everyone on board with it.

Corollary: Creating common knowledge is hard. If you haven't put in that work, you should assume Big Project X is going to fail, and/or that it will require a few people putting in herculean effort "above their fair share", which may not be sustainable for them.

Whether that herculean effort can work depends on whether effort is fungible. If you just need 100 units of effort, you can make do with one person putting in 100 units of effort. But if you need everyone to adopt a new norm that they haven't bought into, it just won't work.

If you are proposing what seems (to you) quite sensible, but nobody seems to agree...

...well, maybe people are being biased in some way, or motivated to avoid considering your proposed stag-hunt. People sure do seem biased about things, in general, even when they know about biases. So this may well be part of the issue.

But I think it's quite likely that you're dramatically underestimating the inferential distance – both the distance between their outlook and "why your proposed action is good", and the distance between your outlook and "why their current frame is weighing tradeoffs very differently than your current frame."

Much of the time, I feel like getting angry and frustrated... is something like "wasted motion" or "the wrong step in the dance."

Not entirely – anger and frustration are useful motivators. They help me notice that something about the status quo is wrong and needs fixing. But I think the specific flavor of frustration that stems from "people should be cooperating but aren't" is often, in some sense, actually wrong about reality. People are actually making reasonable decisions given the current landscape.

Anger and frustration help drive me to action, but often they come with a sort of tunnel vision. They lead me to dig in my heels, and get ready to fight – at a moment when what I really need is empathy and curiosity. Either I need to figure out how to communicate better, to help someone understand why my plan is good, or I need to learn what tradeoffs I'm missing, which they can see more clearly than I can.

My own strategies right now

In general, choose Rabbit.

Keep around 30% slack in reserve (such that I can absorb not one, not two, but three major surprise costs without starting to burn out). Don't spend energy helping others if I've dipped below 30% for long – focus on making sure my own needs are met.

Find local improvements I can make that don't require much coordination from others.

Follow rabbit trails into Stag* Country

Given a choice, seek out "Rabbit" actions that preferentially build option value for improved coordination later on.

Metaphorically, this means "Follow rabbit trails that lead into *Stag-and-Rabbit Country", where I'll have opportunities to say:

"Hey guys I see a stag! Are we all 100% up for hunting it?" and then maybe it so happens we can stag hunt together.



Or, I can sometimes say, at small-but-manageable-cost-to-myself "hey guys, I see a whole bunch of rabbits over there, you could hunt them if you want." And others can sometimes do the same for me.

Sliiightly more concretely, this means:

Given the opportunity, without requiring actions on the part of other people... pursue actions that demonstrate my trustworthiness, and which build bits of infrastructure that'll make it easier to work together in the future.



Help people out if I can do so without dipping below 30% slack for too long, especially if I expect it to increase the overall slack in the system.

(I'll hopefully have more to say about this in the future.)

Get curious about other people's frames

If a person and I have argued through the same set of points multiple times, each time expecting our points to be a solid knockdown of the other's argument... and if nobody has changed their mind...

Probably we are operating in two different frames. Communicating across frames is very hard, and beyond the scope of this post to teach. But cultivating curiosity and empathy are good first steps.

Occasionally run "Kickstarters for Stag Hunts." If people commit, hunt stag.

For example, the call-to-action in my Relationship Between the Village and Mission post (where I asked people to contact me if they were serious about improving the Village) was designed to give me information about whether it's possible to coordinate on a stag hunt to improve the Berkeley rationality village.