This worry leads many observers to what I’ll call “fiduciary of values” solutions. Fiduciaries, of course, are persons or institutions entrusted to act in the best interests of certain beneficiaries. So, for example, money managers’ fiduciary duties require them to do their best to manage their clients’ money according to the clients’ wishes, while refraining from serving any selfish or misaligned interests.

In the A.I. context, the idea is roughly that we might build intelligent machines with our values encoded as the objective functions to maximize. A naive version of this view holds that we might, in some distant future, instruct a godlike A.I. to “increase human happiness”, hit return, and enjoy. I will not bother to critique that strawman, because there is a much more salient and sophisticated version of the view out there. The sophisticated “fiduciary of values” view says that machine learning algorithms might, in the immediate future, help people conform their behavior to their own values (instead of manipulating them in the interests of counterparties such as advertisers).

This certainly sounds more attractive than machine learning algorithms manipulating our behavior in the interests of capital. And it is more plausible than a computer figuring out how to rain down happiness that we passively enjoy. However, I think even this sophisticated version comes dangerously close to a serious intellectual error. Namely, it risks ignoring the “participatory aspect” of value formation insofar as it envisions values as relatively static guidelines to which we might better conform, rather than indeterminate things that we constantly strive to define.

4. …And Why It Might Mess You Up

Let me state my worry plainly: If I told an algorithm what my values are, and then it tried to “help” me by manipulating me into conforming to them, I think that might really mess me up.

In everyday experience, trying to conform to one’s values, and trying to define them, are distinct but inseparable activities. It’s common, for example, to offer someone help, aiming to be generous, only to realize in the process that the other person feels condescended to. Such experiences lead sensitive would-be benefactors to revise their ideals and proceed with a more nuanced definition of generosity. This is how people grow. But if we had algorithms manipulating us into conforming to our declared values (somewhat like chess computers pushing our king into a trap), they might cause us to miss the signs that we are being condescending, thus arresting the refinement of our values and definitions.[1]

Now, I do not think it is impossible to build this kind of technology properly. I can imagine machine learning algorithms or A.I. helping us better define our values, as well as helping us conform to them, in a transparent, non-adversarial and non-manipulative way. But such a system must never ignore the fact that we figure out who we are by figuring out what our values are. Any system that diminishes, distorts or eliminates humans’ role in that feedback loop will be a harmful counterfeit. Humans must engage in the active and never-ending process of defining and refining their own objective functions.
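To make that contrast concrete, here is a minimal toy sketch in Python. Everything in it is hypothetical (the actions, the numeric weights, the very idea of scoring “generosity”); it exists only to show the shape of the two feedback loops. The first loop optimizes a fixed, declared objective and never questions it; the second surfaces its recommendation and lets lived experience revise the objective itself, which is the generosity example above rendered as code.

```python
# A toy sketch only: every name, number, and "value" here is invented
# for illustration; no real recommender or assistant works this way.

from dataclasses import dataclass

@dataclass
class Action:
    name: str
    effects: dict  # how much of each named good the action produces

@dataclass
class Values:
    weights: dict  # a crude "objective function": weights over named goods

    def score(self, action: Action) -> float:
        return sum(self.weights.get(k, 0.0) * v
                   for k, v in action.effects.items())

def conformance_only(values: Values, options: list) -> Action:
    """Static objective: pick the best action under the declared values.
    The values themselves are never questioned or revised."""
    return max(options, key=values.score)

def participatory(values: Values, options: list, feedback) -> tuple:
    """Human-in-the-loop: recommend transparently, then let the user
    revise the objective in light of how the action actually felt."""
    choice = max(options, key=values.score)
    revised = feedback(values, choice)  # the user may redefine "generosity"
    return choice, revised

# The generosity example from above: unsolicited help scores high on
# declared "generosity" but leaves the recipient feeling condescended to.
options = [
    Action("offer unsolicited help", {"generosity": 2.0, "respect": -1.0}),
    Action("ask what would actually help", {"generosity": 1.0, "respect": 1.0}),
]
values = Values({"generosity": 1.0, "respect": 0.2})

def user_feedback(values: Values, choice: Action) -> Values:
    # The experience teaches a more nuanced definition of generosity:
    # the user re-weights "respect" after feeling the condescension.
    if choice.effects.get("respect", 0.0) < 0:
        return Values({**values.weights, "respect": 2.0})
    return values

print(conformance_only(values, options).name)   # always the same pick
_, revised = participatory(values, options, user_feedback)
print(conformance_only(revised, options).name)  # the pick has changed
```

The point of the sketch is the last line: once the user’s definition of generosity has been revised by experience, the recommended action changes. That is exactly the feedback loop that a manipulative “fiduciary of values” would short-circuit.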

5. A Note About Intersubjectivity

Above, I described an algorithm that purports to help individuals conform to their own values. But values are never actually individual. No individual arrives at their values in isolation.

Take the example of generosity. If I contended that stealing food from starving babies in order to enjoy a pleasant snack is a good example of generosity, I would be wrong. So wrong, in fact, that no reasonable person in my society would agree with me.

Following this example, we might be tempted to conclude that “true” values are not what individuals say they are, but what groups say they are. But of course that isn’t right, either. History is littered with huge groups going morally haywire, while a few individuals and subgroups retain clarity.

The way out of this puzzle is to understand that values are social, or “intersubjective.” They are a matter of opinion, but not of individual opinion (alternatively, they are facts, but social facts rather than purely objective facts). Human societies have (or more accurately, are) complex mechanisms for aggregating viewpoints into loose social consensuses. So while individuals may not unilaterally declare the moral truth, they may participate in and co-create the relevant social truth-finding institutions.

This is an important point. When social institutions work well, dissenters do not merely interface with them, like customers speaking to clerks through glass. They partially constitute them, via protest, litigation, newspaper columns, dinner table arguments, or whatever else.

With that said, many institutions are quite lousy. Some, for example, systematically deprive women or minorities of a voice. Others deprive everyone of a voice, save some dictator. Still others bury the voices of valuable dissenters in a maze of bureaucracy. And still others amplify nonsensical viewpoints in response to perverse incentives, whether because nonsensical views are captivating to audiences or because they are useful to corrupt leaders.

All dysfunctional institutions, however, share a common feature. They cut people out of the loop, misunderstanding or denying their own participatory nature. They are insensitive and unresponsive, so that some or all people stop trying to shape them. You see this dynamic in dictatorships, apartheid-type societies, and gridlocked bureaucracies.

A.I. might bring about this same phenomenon in weird new forms–for example, cutting everyone (even the dictator) out of the loop, or cutting people out of their own internal loops. Can we, as individuals and communities, constitute A.I. the way that we have constituted successful human institutions in the past, like families and religions? This is the real question. Just as with a dictatorship, the worry isn’t only that A.I. will do bad things. The deeper worry is that it will sideline us from the life-affirming process of constructing our values, our communities, and ourselves through shared institutions.

Early Modern Problems

6. From Gutenberg to Turing

I’m really only using A.I. to illustrate a much broader argument. For the problems it threatens to create are not new: emergent information technologies have been “cutting people out of the loop” of institutions at least since the dawn of the modern era. Let me explain.

For many millennia, labor and value tended to stick together. Where there was a thing of value, the labor that made it had touched it. An arrowhead? Somebody carved it. A basket? Somebody wove it.

But then technologies arose that permitted easy copying. That changed everything. When Gutenberg’s printing press spread across Europe in the mid-fifteenth century, people could suddenly sell books that they didn’t write. Thus, printing press owners (capitalists) gained power. Some authors managed to claim a healthy share of the new pie, but not all. In other sectors, labor did even worse. Basket weavers, for example, were pretty much wiped out by basket factories.[2]