A study recently appeared in JAMA that essentially claimed that your smartphone is as good as, or better than, a wearable fitness tracker at tracking your activities.

Unfortunately, it took just a little bit of reading to discover that the study was flawed. The devices the researchers tested were ancient. They used a Samsung Galaxy S4 and an iPhone 5s, both nearly two years old, and the Fitbit Zip and Fitbit One, both two and a half years old. Slightly better are the Fitbit Flex and the Jawbone UP24, which are both about a year and a half old. They also tested a Nike Fuelband, a notoriously inaccurate piece of dogshit which skewed the results considerably, and the Digi-Walker SW-200, which is something I've never heard of.

That didn't stop tech journalists at some of the world's biggest outlets from jumping on the data, parroting the study's findings as facts, and declaring dedicated fitness trackers to be essentially redundant and useless. But that's rather far from the truth. Technology in the wearable and mobile spaces is evolving too quickly to judge its efficacy using outdated devices.

We decided to do our own test (admittedly less scientific, but more true to reality) using much newer devices. For our phones, we picked the Apple iPhone 6 and Google's Nexus 6. I strapped on four current wrist wearables: a Fitbit Charge HR, a Basis Peak, a Jawbone UP Move, and a Garmin Fenix 3. In my pocket, I slipped a Withings Pulse O2. Here's what we found.

Get to Steppin'

The original study measured the wearables' and the phones' step-counting abilities, so we did the same. I walked long circles around the perimeter of a baseball field, carefully counting out loud to 1,000 steps. We found that it didn't seem to matter too much whether the device was a dedicated fitness tracker or a phone. Here are our numbers:

| Device | Number of Steps | Discrepancy (percent) |
| --- | --- | --- |
| iPhone 6 | 995 | -0.5 |
| Nexus 6 | 914 | -8.6 |
| Garmin Fenix 3 | 1,007 | 0.7 |
| Fitbit Charge HR | 1,001 | 0.1 |
| Jawbone UP Move | 1,011 | 1.1 |
| Basis Peak | 1,011 | 1.1 |
| Withings Pulse O2 | 1,047 | 4.7 |
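For anyone who wants to check the math, here is a minimal Python sketch of how the discrepancy column can be reproduced. It assumes the discrepancy is simply the signed difference from the 1,000-step baseline, expressed as a percentage and rounded to one decimal place. The names `recorded_steps` and `discrepancy_percent` are made up for this illustration.

```python
# Baseline: the 1,000 steps counted out loud during the walk.
BASELINE_STEPS = 1000

# Step counts reported by each device, taken from the table above.
recorded_steps = {
    "iPhone 6": 995,
    "Nexus 6": 914,
    "Garmin Fenix 3": 1007,
    "Fitbit Charge HR": 1001,
    "Jawbone UP Move": 1011,
    "Basis Peak": 1011,
    "Withings Pulse O2": 1047,
}


def discrepancy_percent(recorded, baseline=BASELINE_STEPS):
    """Signed difference from the baseline, as a percentage (one decimal)."""
    return round((recorded - baseline) / baseline * 100, 1)


# Negative values mean the device undercounted; positive means it overcounted.
discrepancies = {
    device: discrepancy_percent(steps)
    for device, steps in recorded_steps.items()
}

for device, pct in discrepancies.items():
    print(f"{device}: {pct:+.1f}%")
```

The sign matters here: both phones undercounted, while every dedicated tracker overcounted.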

What's interesting is that the Nexus 6 was off by so much. We actually ran (walked) this test twice, and the first time through the Nexus 6 fared much better, with only a 2.8 percent discrepancy. In that same test, the iPhone came in with a 1.8 percent discrepancy. This suggests to me that the accuracy of devices you carry in your pants pocket is subject to variables, such as the position of the device in your pocket or whether your pants are loose or tight. We had to throw out the results from that first test because I gesticulated too much between recording the baseline steps and the point where I started walking, which resulted in the wrist-worn trackers showing false positives. Speaking of...

Your Wrists Aren't Your Feet

Fitness tracker detractors love to make jokes about how 15 minutes of self-love would fool that tracker into thinking you'd run a marathon. Ha ha, very funny, dad. Turns out, they're not too far from the truth. Five minutes of whittling completely and utterly fooled the wrist-worn trackers.

| Device | Number of False Steps |
| --- | --- |
| Garmin Fenix 3 | 722 |
| Fitbit Charge HR | 611 |
| Basis Peak | 540 |
| Jawbone UP Move | 618 |

That's bad. Really bad. And one might be tempted to conclude that since the phones were pretty accurate in the step test and they're not subject to the same false positives as wrist-worn devices, they must be better at tracking your physical activity. Indeed, that's pretty much what the original JAMA study claimed, with lead author Meredith A. Case saying, "We found that smartphone apps are just as accurate as wearable devices for tracking physical activity."

But here's the thing: Steps are just one small item under the larger umbrella of physical activity, and I would argue that they're a lousy metric for getting a picture of your overall health and activity levels. Steps are just steps. They could be hard steps or easy steps, fast or slow, balletic dance moves or popping and locking.

There's one big reason steps have become the de facto metric for measuring health: they're easy. Basically all it takes to count steps is an accelerometer, and accelerometers are dirt cheap. This is why every company and their mother company is making a "fitness tracker" right now. But really, the bulk of these things are just glorified pedometers.

So if steps are a bad metric, what is a good one?

Burn, Baby, Burn

I would argue that the most important metric an activity tracker can provide is caloric expenditure. If your goal is weight loss or weight maintenance, or if you're hoping to increase the ratio of muscle to fat in your body, an accurate estimate of your caloric burn gives you actionable information.

Because the wearables with heart-rate-monitoring abilities can see how hard your ticker is working, they're almost certainly the most accurate devices among all these options for measuring your real-life, daily caloric burn. Even if they aren't perfect, they're the best we've got. You have to pair that knowledge with a food-tracking app like MyFitnessPal (or one that's built right into the Fitbit or Jawbone apps) to budget your calories. It's kind of a drag to log all your meals, but you'll learn a lot as you go, and that knowledge will stay with you even after you get bored of using these apps and devices.

I did some testing with the heart-rate monitoring (or "HRM") devices, wearing them while performing some non-step activities. This is what was recorded by each one during 20 minutes of cycling:

| Device | Caloric Burn (calories) |
| --- | --- |
| Garmin Fenix 3 | 91 |
| Fitbit Charge HR | 226 |
| Basis Peak | 243 |
| Jawbone UP Move | 163 |
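To put the disagreement between these devices in numbers, here is a rough Python sketch that finds the lowest and highest estimates for the same 20-minute ride and the ratio between them. It only uses the figures from the table above. The variable names are made up for this illustration, and nothing here says which reading is closest to the truth.

```python
# Calories each device reported for the same 20 minutes of cycling.
cycling_calories = {
    "Garmin Fenix 3": 91,
    "Fitbit Charge HR": 226,
    "Basis Peak": 243,
    "Jawbone UP Move": 163,
}

# Devices with the smallest and largest estimates.
lowest = min(cycling_calories, key=cycling_calories.get)
highest = max(cycling_calories, key=cycling_calories.get)

# How many times larger the highest estimate is than the lowest.
ratio = cycling_calories[highest] / cycling_calories[lowest]

print(f"Lowest:  {lowest} ({cycling_calories[lowest]} calories)")
print(f"Highest: {highest} ({cycling_calories[highest]} calories)")
print(f"Highest is {ratio:.2f}x the lowest")
```

The two devices that could see my heart rate, the Fitbit Charge HR and the Basis Peak, landed within 17 calories of each other, while the other two came in far lower.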

I also did some sit-ups and abdominal crunches. The phones and accelerometer-based fitness trackers had no idea what was going on, but the Fitbit Charge HR and the Basis Peak could see my heart rate steadily increasing.

I should note that we were not stressing the Garmin Fenix 3 to its full potential. This watch really belongs in a different category from the other devices because it's a dedicated training watch that just also happens to work as a fitness monitor. For the bike test, I could have flipped GPS tracking on and connected the watch with an HRM chest strap, and it almost certainly would have given me the most accurate readings of any of them. The thing is, Garmin's watch starts at $500 (or $550 if you want an HRM with it), which really puts it into another bracket. So, for the sake of this test, we were just looking at how it does as a fitness tracker. Just know that for real training, this thing kicks ass.

Conclusions

We learned a few things from these experiments. First, a smartphone app isn't really any more accurate than a wrist-worn or pocketable fitness tracker at step-counting. Second, even if a phone were dead accurate, step-counting is a pretty garbage metric for measuring your true health. Caloric burn is where it's at. And for that, phones are non-starters. What you want is a wrist-worn tracker with a built-in heart rate monitor. From my testing, the Basis Peak works OK, but it's prone to occasionally losing track of your pulse. The Fitbit Charge HR was the most accurate, and at $150, it's very affordable. We'll see how upcoming entrants like the Jawbone UP3 and the Apple Watch compare once we can spend some real time with them, but for now the Charge HR looks like the best of the bunch we tested.
