Because of some of the advances in data science and machine learning recently, we were able to try and look systematically at this question.

So our question was: Is this true? How much are we spending on people with a very, very high probability of death?

Implicit in this is the idea that, at the time we're spending the money, we know that they're likely to die. The image is of someone comatose or unconscious in the hospital, in the ICU, strapped up to a million machines.

Only 5 percent of people on Medicare die each year. But those who die account for a quarter of all health care spending [on seniors], and this fact [...] is often talked about in the media and policy circles as evidence of waste in the U.S. health care system: "Look, we spend all this money on health care for people who die." What a waste, right? Why are we spending money on all these people who die?

In most cases, it's not clear that the treatments are futile at the time, because even with state-of-the-art artificial intelligence, it's much harder than you might think to predict who's going to die soon.

The answer, according to a new study that used machine learning on a huge trove of more than 6 million Medicare medical records to train a computer algorithm to predict deaths, is this:

Medicare, the federal health coverage mainly for people over 65, spends about $700 billion a year, and it's estimated that about one-quarter of that spending on seniors goes to health care in the final year of life.

And what we found, to be honest, surprised me. I thought it was very possible, related to all the anecdotes and the images I had in my mind, that we'd find that there was a lot of spending on people who had a 95 or 98 percent chance of dying. And then we as a society could have engaged in the ethical-moral-religious-political-ideological question of whether we want to be spending a lot of money on people who we know, with very, very high probability, are about to die.

In fact, we find there is very little Medicare spending on people with high probability of dying. And part of that is just that it's very, very hard to predict who is going to die.

For example, if you take all the Medicare patients at the start of the year in the highest percentile of risk (the top 1 percent chance of dying in that year), their annual mortality rate is still only 46 percent. In other words, among the people with the top 1 percent probability of death, 54 percent of them are still going to survive.

Even among people admitted to the hospital who then die within one year, we still find it's very, very hard to find a lot of people who have a very high probability of death. For example, of the people who are admitted to the hospital who then die within one year, only about 40 percent of them have a mortality probability above 50 percent.

Put another way, it's just very hard to find a substantial share of people for whom we have very high certainty that they're going to die within the year.

Think about people who arrive in the hospital with metastatic cancer. That's a very sick population; that's a very grave diagnosis. Even so, their one-year mortality is only about two thirds, or 63 percent. So even in a situation which tends to conjure up hopeless images, about a third of people are still living a year later.

So we do spend a lot on people who die. But it doesn't immediately follow that that means it was a waste. In fact, it's not clear at the time we're spending the money that we know with high probability that they're going to die.

Why is it so hard to predict who will die?

I should emphasize that our results are about central tendency. There may be anecdotes you can find of somebody who we could all agree is going to die "for sure" within the year, and yet we spend money on them. It turns out there are just not many of them in the data. They don't account for a large amount of spending.

You could say our data are bad, or our models are not that good. So with better data or better scientists or better models, we could do better, and we try to address this in the paper.

The most obvious way you could have better data is to have richer data than what's available in Medicare claims records, data that actually have much more detailed lab results and test results from the hospital. Or you could say maybe someone else could develop a better prediction algorithm. We took this very seriously. We ultimately tried to address it by what we called an "oracle prediction method," where we say: What if we had an even better predictor, which put some weight on our prediction and some weight on the truth about whether the person ended up dying? We still find the basic findings hold.
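As a rough illustration of the "some weight on our prediction and some weight on the truth" idea, here is a minimal Python sketch. It assumes a simple linear blend; the interview does not say how the paper actually weights the two terms, so the function name `oracle_prediction`, the `weight` parameter, and the toy values are all hypothetical.

```python
def oracle_prediction(predicted, died, weight):
    """Blend predicted mortality with the realized outcome.

    Assumption: a plain linear mix, (1 - weight) * prediction + weight * truth.
    weight = 0 reproduces the model's prediction; weight = 1 is a perfect oracle.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    return [(1.0 - weight) * p + weight * float(d)
            for p, d in zip(predicted, died)]


# Made-up values for illustration only (not data from the study).
predicted = [0.10, 0.46, 0.63]   # model's predicted probability of death
died = [0, 1, 0]                 # 1 if the person actually died within the year
blended = oracle_prediction(predicted, died, weight=0.5)
```

Raising `weight` toward 1 gives an artificially better predictor than any real model could be, which is the kind of stress test the answer above describes.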

So what are the policy implications?

First, it is true that we spend an enormous amount of money on people who die. What our research suggests is it isn't obvious that that means it's a waste. The implicit assumption is that we're spending a lot of money on people we know are going to die in the next year or the next 30 days, and we show in the paper that's just not true. So don't focus so much on end-of-life spending as a symptom of waste, because it's not obviously waste.

Second, there's hard work we need to do rather than just point to spending at the end of life and say, "Let's lop that off." Instead, we need to figure out which medical interventions and health care policies are actually producing a lot of value and which aren't. We need to look at specific interventions, both late in life and at other points in life, and estimate with rigorous empirical methods what the benefits of those interventions are.

You could do this study because you could use machine learning on great Medicare data, but what actually triggered you to do it?

Those two factors, and on the more casual or humorous side, I remember picking my son up from ski school, and he told me his instructor said you always need to be careful at the end of the day because most ski accidents happen on the last run of the day. And my immediate response was: "Of course they do. When you have an accident, that becomes the last run!"

Obviously, that's facetious, but it made me start thinking about it. In the ski example, I think the instructor is right. It is actually true that more accidents occur late in the day. But as framed, it sounds absurd in the same way as saying, "Look, we spend a lot on the dying, let's stop spending so much on the dying." It's like, "Stop skiing your last run of the day."