Today seems to be the day to talk about whether those of us concerned with poverty and inequality should focus on progressive taxation. Edward D. Kleinbard in the New York Times and Cathie Jo Martin and Alexander Hertel-Fernandez at Vox argue that focusing on progressivity can be counterproductive. Jared Bernstein, Matt Bruenig, and Mike Konczal offer responses that examine what “progressivity” really means and offer support for taxing the rich more heavily than the poor. This is an intramural fight. All of these writers presume a shared goal of reducing inequality and increasing socioeconomic cohesion. Me too.

I don’t think we should be very categorical about the question of tax progressivity. We should recognize that, as a political matter, there may be tradeoffs between the scale of benefits and progressivity of the taxation that helps support them. We should be willing to trade some progressivity for a larger scale. Reducing inequality requires a large transfers footprint more than it requires steeply increasing tax rates. But, ceteris paribus, increasing tax rates do help. Also, high marginal tax rates may have indirect effects, especially on corporate behavior, that are socially valuable. We should be willing sometimes to trade tax progressivity for scale. But we should drive a hard bargain.

First, let’s define some terms. As Konczal emphasizes, tax progressivity and the share of taxes paid by rich and poor are very different things. Here’s Lane Kenworthy, defining (italics added):

When those with high incomes pay a larger share of their income in taxes than those with low incomes, we call the tax system “progressive.” When the rich and poor pay a similar share of their incomes, the tax system is termed “proportional.” When the poor pay a larger share than the rich, the tax system is “regressive.”

It’s important to note that even with a very regressive tax system, the share of taxes paid by the rich will nearly always be much more than the share paid by the poor. Suppose we have a two-animal economy. Piggy Poor earns only 10 corn kernels while Rooster Rich earns 1000. There is a graduated income tax that taxes 80% of the first 10 kernels and 20% of amounts above 10. Piggy Poor will pay 8 kernels of tax. Rooster Rich will pay (80% × 10) + (20% × 990) = 8 + 198 = 206 kernels. Piggy Poor pays 8/10 = 80% of his income, while Rooster Rich pays 206/1000 = 20.6% of his. This is an extremely regressive tax system! But of the total tax paid (214 kernels), Rooster Rich will have paid 206/214 = 96%, while Piggy Poor will have paid only 4%. That difference in the share of taxes paid reflects not the progressivity of the tax system, but the fact that Rooster Rich’s share of income is 1000/1010 = 99%! Typically, concentration in the share of total taxes paid is much more reflective of the inequality of the income distribution than it is of the progressivity or regressivity of the tax system. Claims that the concentration of the tax take amounts to “progressive taxation” should be met with lamentations about the declining quality of propaganda in this country.
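The kernel arithmetic above is easy to check with a few lines of Python. This is only a sketch: the function name `graduated_tax` and the way the brackets are encoded are my own illustrative choices, while the incomes and rates are the ones from the example.

```python
def graduated_tax(income, threshold=10, low_rate_pct=80, high_rate_pct=20):
    """Tax owed under the two-bracket schedule from the example.

    Rates are given in whole percent so the arithmetic stays exact.
    """
    below = min(income, threshold)      # kernels taxed at the 80% rate
    above = max(income - threshold, 0)  # kernels taxed at the 20% rate
    return (low_rate_pct * below + high_rate_pct * above) / 100

incomes = {"Piggy Poor": 10, "Rooster Rich": 1000}
taxes = {name: graduated_tax(y) for name, y in incomes.items()}
total_tax = sum(taxes.values())
total_income = sum(incomes.values())

for name, y in incomes.items():
    print(
        f"{name}: pays {taxes[name] / y:.1%} of own income, "
        f"{taxes[name] / total_tax:.1%} of all taxes, "
        f"earns {y / total_income:.1%} of all income"
    )
```

The first figure on each line is the progressivity measure (tax as a share of one's own income). The second is the tax-share measure, which tracks the third, the income share, far more closely than it tracks the rate schedule.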

Martin and Hertel-Fernandez offer the following striking graph:

The OECD data that Konczal cites as the likely source of Martin and Hertel-Fernandez’s claims includes measures of both tax concentration and progressivity. I think Konczal has Martin and Hertel-Fernandez’s number. If the researchers do use a measure of tax share on the axis they have labeled “Household Tax Progressivity”, that’s not so great, particularly since the same source includes two measures intended to capture actual tax progressivity (Table 4.5, Columns A3 and B3). Even if the “right” measure were used, there are devils in the details. These are “household taxes” based on an “OECD income distribution questionnaire”. Do they take into account payroll taxes or sales taxes, or only income taxes? This OECD data shows the US tax system to be strongly progressive, but when all sources of tax are measured, Kenworthy finds that the US tax system is in fact roughly proportional. (ht Bruenig) The inverse correlation between tax progressivity and effective, inclusive welfare states is probably weaker than Martin and Hertel-Fernandez suggest with their misspecified graph. If they are capturing anything at all, it is something akin to Ezra Klein’s “doom loop”: countries very unequal in market income (which almost mechanically become countries with very concentrated tax shares) have welfare states that are unusually poor at mitigating that inequality via taxes and transfers.

Although I think Martin and Hertel-Fernandez are overstating their case, I don’t think they are entirely wrong. US taxation may not be as progressive as it appears because of sales and payroll taxes, but European social democracies have payroll taxes too, and very large, probably regressive VATs. Martin and Hertel-Fernandez are trying to persuade us of the “paradox of redistribution”, which we’ve seen before. Universal taxation for universal benefits seems to work a lot better at building cohesive societies than taxes targeted at the rich that finance transfers to the poor, because universality engenders political support and therefore scale. And it is scale that matters most of all. Neither taxes nor benefits actually need to be progressive.

Let’s try a thought experiment. Imagine a program with regressive payouts. It pays low earners a poverty-line income, top earners 100 times the poverty line, and everyone else something in between, all financed with a 100% flat income tax. Despite the extreme regressivity of this program’s payouts and the nonprogressivity of its funding, this program would reduce inequality in America. After taxes and transfers, no one would have a below-poverty income, and no one would earn more than a couple of million dollars a year. Scale down this program by half (take a flat tax of 50% of income, distribute the proceeds in the same relative proportions) and the program would still reduce inequality, but by somewhat less. The after-transfer income distribution would be an average of the very unequal market distribution and the less unequal payout distribution, yielding something less unequal than the market distribution alone. Even if the financing of this program were moderately regressive, it would still reduce overall inequality.
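Here is a rough numerical sketch of that thought experiment. Everything specific in it is an assumption of mine for illustration: the ten market incomes are made up, the payout proportions rise linearly with income rank from 1 to 100, and the Gini coefficient stands in as the inequality measure (the argument above does not depend on that particular choice).

```python
def gini(values):
    """Gini coefficient of non-negative incomes (0 means perfectly equal)."""
    xs = sorted(values)
    n = len(xs)
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * sum(xs)) - (n + 1) / n

def after_transfer(market, tax_rate, low=1, high=100):
    """Apply a flat tax, then pay all proceeds back out regressively.

    Payout proportions rise linearly with income rank from `low` to `high`,
    so the top earner receives 100 times what the bottom earner receives.
    """
    ranked = sorted(market)
    n = len(ranked)
    weights = [low + (high - low) * i / (n - 1) for i in range(n)]
    revenue = tax_rate * sum(ranked)
    return [
        (1 - tax_rate) * y + revenue * w / sum(weights)
        for y, w in zip(ranked, weights)
    ]

# Hypothetical market incomes, chosen only to be very top-heavy.
market = [5, 15, 25, 40, 60, 90, 150, 400, 2000, 20000]

for rate in (0.0, 0.5, 1.0):
    incomes = after_transfer(market, rate)
    print(f"flat tax {rate:.0%}: Gini {gini(incomes):.2f}")
```

Under these assumptions the Gini falls as the flat tax rate rises, even though the richest animal in the barnyard always gets the biggest check. The payouts are much less concentrated than the market incomes they replace, and that is all the result requires.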

How can a regressively financed program making regressive payouts reduce inequality? Easily, because no (overt) public sector program would ever offer net payouts as phenomenally, ridiculously concentrated as so-called “market income”. For a real-world example, consider Social Security. It is regressively financed: thanks to the cap on income subject to the Social Security tax, very high income people pay a smaller fraction of their wages into the program than modest and moderate earners. Payouts tend to covary with income: people getting the maximum Social Security payout typically have other sources of income and wealth (dividends and interest on savings), while people getting minimal payments often lack any supplemental income at all. Despite all this, Social Security helps to reduce inequality and poverty in America.
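To see why a cap makes a flat payroll tax regressive, consider a toy version. The 10% rate and the cap of 100 are hypothetical round numbers picked for illustration, not the actual Social Security parameters, and `capped_payroll_tax` is just a name I made up.

```python
def capped_payroll_tax(wages, rate_pct=10, cap=100):
    """Flat payroll tax that applies only to wages up to `cap`.

    Both the rate and the cap are hypothetical illustration values.
    """
    return rate_pct * min(wages, cap) / 100

for wages in (50, 100, 200, 1000):
    tax = capped_payroll_tax(wages)
    print(f"wages {wages}: tax {tax:.1f}, effective rate {tax / wages:.1%}")
```

Everyone at or below the cap pays the full rate on every unit of wages. Above the cap the tax bill stops growing, so the effective rate shrinks as wages rise.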

Eagle-eyed readers may complain that after making so big a deal of getting the definition of “tax progressivity” right, I’ve used “payout progressivity” informally and inconsistently with the first definition. True, true, bad me! I insisted on measuring tax progressivity based on pay-ins as a fraction of income, while I’m calling pay-outs “regressive” if they increase with the payee’s income, irrespective of how large they are as a percentage of payee income. If we adopt a consistent definition, then many programs have payouts that are nearly infinitely progressive. When other income is zero, how large a percentage of other income is a small Social Security check? Sometimes, to avoid these issues, the colorful terms “Robin Hood” and “Matthew” are used. “Robin Hood” programs give more to the poor than the rich. “Matthew” programs are named for the Matthew Effect: “For unto every one that hath shall be given, and he shall have abundance: but from him that hath not shall be taken even that which he hath.” Programs that give the same amount to everyone, like a UBI, are described less colorfully as “Beveridge”, after the recommendations of the Beveridge Report. The “paradox of redistribution” is that welfare states with a lot of Matthew-y programs, which pay more to the rich and may not be so progressively financed, tend to garner political support from the affluent “middle class” as well as the working class, and are able to scale to an effective size. Robin-Hood-y programs, on the other hand, tend to stay small, because they pit the poor against both the moderately affluent and the truly rich, which is a hard coalition to beat.

So, should progressives give up on progressivity and support modifying programs to emulate stronger welfare states with less progressive finance and more Matthew-y, income-covarying payouts? Of course not. That would be cargo-cultish and dumb. The correlation between lower progressivity and effective welfare states is the product of an independent third cause, scale. In developed countries, the primary determinant of socioeconomic cohesiveness (reduced inequality and poverty) is the size of the transfer state, full stop. Progressives should push for a large transfer state, and concede progressivity — either in finance or in payouts — only in exchange for greater scale. Conceding progressivity without an increase in scale is just losing. As “top inequality” increases, the political need to trade away progressivity in order to achieve program scale diminishes, because the objective circumstances of the rich and erstwhile middle class diverge.

Does this focus on scale mean progressives must be for “big government”? Not at all. Matt Bruenig has written this best. The size of the transfer state is not the size of the government. When the government arranges cash transfers, it recruits no real resources into projects wasteful or valuable. It builds nothing and squanders nothing. It has no direct economic cost at all (besides a de minimis cost of administration). Cash transfer programs may have indirect costs. The taxes that finance them may alter behavior counterproductively and so cause “deadweight losses”. But the programs also have indirect benefits, in utilitarian, communitarian, and macroeconomic terms. That, after all, is why we do them. Regardless, they do not “crowd out” use of any real economic resources.

Controversies surrounding the scope of government should be distinguished from discussions of the scale of the transfer state. A large transfer state can be consistent with “big government”, where the state provides a wide array of benefits “in-kind”, organizing and mobilizing real resources into the production of those benefits. A large transfer state can be consistent with “small government”, a libertarian’s “night watchman state” augmented by a lot of taxing and check-writing. As recent UBI squabbling reminds us, there is a great deal of disagreement on the contemporary left over what the scope of central government should be, what should be directly produced and provided by the state, what should be devolved to individuals and markets and perhaps local governments. But wherever on that spectrum you stand, if you want a more cohesive society, you should be interested in increasing the scale at which the government acts, whether it directly spends or just sends.

It may sometimes be worth sacrificing progressivity for greater scale. But not easily, and perhaps not permanently. High marginal tax rates at the very top are a good thing for reasons unrelated to any revenue they might raise or programs they might finance. During the postwar period when the US had very high marginal tax rates, American corporations were doing very well, but they behaved quite differently than they do today. The fact that wealthy shareholders and managers had little reason to disgorge the cash to themselves, since it would only be taxed away, arguably encouraged a speculative, long-term perspective by managers and let retained earnings accumulate where other stakeholders might claim them. In modern, orthodox finance, we’d describe all of this behavior as “agency costs”. Empire-building, “skunk-works” projects with no clear ROI, concessions to unions from the firm’s flush coffers, all of these are things mid-20th Century firms did that from a late 20th Century perspective “destroyed shareholder value”. But it’s unclear that these activities destroyed social value. We are better off, not worse off, that AT&T’s monopoly rents were not “returned to shareholders” via buybacks and were instead spent on Bell Labs. The high wages of unionized factory workers supported a thriving middle class economy. But would the concessions to unions that enabled those wages have happened if the alternative of bosses paying out funds to themselves had not been made unattractive by high tax rates? If consumption arms races among the wealthy had not been nipped in the bud by levels of taxation that amounted to an income ceiling? Matt Bruenig points out that, in fact, socioeconomically cohesive countries like Sweden do have pretty high top marginal tax rates, despite the fact that the rich pay a relatively small share of the total tax take.
Perhaps that is the equilibrium to aspire to, a world with a lot of tax progressivity that is not politically contentious because so few people pay the top rates. Perhaps it would be best if the people who have risen to the “commanding heights” of the economy, in the private or the public sector, have little incentive to maximize their own (pre-tax) incomes, and so devote the resources they control to other things. In theory, this should be a terrible idea: Without the discipline of the market surely resources would be wasted! But in the real world, I’m not sure history bears out that theory.