I was even more self-conscious than normal walking into the gym. I began to sweat, despite it being below freezing outside. Although I was cloaked in my pea coat, I felt as if everyone had X-ray vision and could see right through it, weighing me down with their prying eyes (even more than the multiple fitness trackers strapped to my left arm). As I slunk into the locker room to change into my workout clothes, I took every opportunity to explain to anyone who so much as glanced at me that no, I wasn't insane or even a health nut – I was just a journalist on a mission.

I've always been kind of a geek, and as part of my job I've written more than my fair share of stories on fitness trackers, wearables and other gadgets that meet at the intersection of health and technology. But the more I wrote, the more confused I became, about fitness trackers in particular. With so many choices, how do you know which one is best? Moreover, how can you even be sure they're accurate when the calorie counters on treadmills can be gamed by simply standing on the platform while the machines "run"?

To find out, I cooked up an experiment. I would track my steps and calories burned by wearing a Fitbit Force, Garmin Vivofit, Jawbone Up, Samsung Galaxy Gear smartwatch and a Samsung Galaxy Note 4 – all provided to me as samples by the manufacturers. I chose to scrutinize these particular tracking devices and smartwatch because of their popularity and wide availability, and selected the Note 4 because it was among the first smartphones to integrate fitness tracking right into the operating system.

I wore them all* – at the same time – for six days to compare the numbers and see if any of them seemed off. To conduct this as scientifically as possible, I ran my methodology past exercise scientist Christy Lane, co-founder and chief operating officer of Vivametrica, a startup that analyzes fitness tracker information. She confirmed the best way to test the accuracy of these devices was to track my steps for several days, record the raw numbers and then average them out to see if any device reported significantly higher or lower numbers than the average.

Every morning during the study, I got dressed, strapped on all four trackers in the same, stepwise order up my arm and walked out the door. By the end of Day 1, I was most struck by just how tired my arm was. Individually, none of the trackers weighed much – the Galaxy smartwatch was the heaviest, tipping the scales at a little over 9 ounces, about the weight of a block of cheese. But wearing them all at the same time left me feeling like I had weights strapped on my arm all day.

In the office, a few colleagues did a double take when I strolled by, trackers exposed. I'd let out my best fake laugh, trying to conceal my embarrassment, while attempting to explain what I was doing. Some of them seemed to understand and appreciate my experiment. But that may have just been a charade to give me, now a self-proclaimed office weirdo, the slip.

While I tried to play it cool in front of colleagues and friends – this is just something work has me doing – I was secretly enthralled by the mounds of data I was collecting. The nerd in me was positively beaming. Checking my step count during the day became an obsession. I found myself tapping the various screens up my arm, as if playing a vertical keyboard, to eyeball the numbers even though I hadn’t left my desk in 30 minutes. If I hadn't moved in a while, I felt guilty, like I was disappointing an old friend. Trying to make the numbers climb became a major part of my day – especially when the Note 4 would buzz and chastise me whenever I hadn't walked for 30 minutes. (I don't think the phone understands the meaning of being on deadline.)

True, I had been a regular Fitbit user before the experiment, but I was quickly developing newfound appreciation for these devices. You'll see why in a moment. First, my results. The chart below shows the step and calorie counts each device logged during the six-day test period. My daily averages, broken down by device, are also included.

Number of Steps Tracked

| Device | November 11 | November 15 | November 21 | November 22 | November 23 | November 24 |
|---|---|---|---|---|---|---|
| Fitbit | 10,523 | 9,205 | 8,803 | 9,837 | 6,520 | 8,754 |
| Gear watch | 11,992 | 9,923 | 1,421* (Battery died) | 10,812 | 6,984 | 9,208 |
| Vivofit* | N/A | N/A | 9,490 | 10,542 | 7,022 | 9,147 |
| Jawbone Up | 12,009 | 9,755 | 9,573 | 10,449 | 6,980 | 9,016 |
| Note 4 | 11,250 | 9,812 | 9,460 | 10,677 | 7,008 | 9,154 |
| Average | 11,443.5 | 9,673.75 | 9,331.5 | 10,463.4 | 6,902.8 | 9,055.8 |

Number of Calories Tracked

| Device | November 11 | November 15 | November 21 | November 22 | November 23 | November 24 |
|---|---|---|---|---|---|---|
| Fitbit | 503 | 486 | 424 | 492 | 275 | 420 |
| Gear watch | 556 | 511 | 97* (Battery died) | 559 | 293 | 457 |
| Vivofit* | N/A | N/A | 487 | 545 | 301 | 482 |
| Jawbone Up | 577 | 508 | 490 | 542 | 286 | 501 |
| Note 4 | 562 | 512 | 476 | 561 | 288 | 481 |
| Average | 549.5 | 504.25 | 469.25 | 539.8 | 288.6 | 468.2 |

As you'll notice, the Fitbit strayed most from the average, consistently reporting lower-than-average numbers throughout the course of the experiment. The biggest discrepancy came on Day 1, when it reported about 920 steps below the average, though it typically underreported by approximately 500 steps. How dare it cheat me out of hard-earned steps? I shouldn't feel that way, according to Lane. “That may sound like a big variance, but overall, it’s not likely to make a major difference” in your health, Lane explains. “There’s very little meaningful variability within the different trackers in terms of their reports.” Here's why.

None of the trackers counted my steps identically, which didn't surprise Lane, since each tracker defines what constitutes a step slightly differently. “All of these trackers use accelerometers and filter [the data] in a proprietary way to turn it into a step,” she says. “But in the end, the difference is so minor that for a consumer it doesn’t really [matter].” Even if that difference ranged from 500 to 900 steps.

The fact that the trackers all pretty much agreed on my step count, irrespective of their algorithms, was both surprising and reassuring to me. On one hand, the results are a little anticlimactic. But on the other, it's good to know consumers aren't shelling out hard-earned money for devices that lie to them to make them feel better about themselves.
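For readers who want to check the comparison themselves, here is a minimal Python sketch of the method Lane signed off on: average each day's readings across devices, then see how far each device sits from that average. The step counts are copied from the table above. The function names are my own, and skipping the missing Vivofit days and the dead-battery Gear reading is an assumption inferred from the published averages, which only work out if those readings are left out.

```python
# Step counts copied from the steps table. None marks a reading left out of
# the daily average: Vivofit has no data for the first two days, and the Gear
# watch battery died on November 21 (assumption: that partial count is skipped).
STEPS = {
    "Fitbit":     [10523, 9205, 8803, 9837, 6520, 8754],
    "Gear watch": [11992, 9923, None, 10812, 6984, 9208],
    "Vivofit":    [None, None, 9490, 10542, 7022, 9147],
    "Jawbone Up": [12009, 9755, 9573, 10449, 6980, 9016],
    "Note 4":     [11250, 9812, 9460, 10677, 7008, 9154],
}


def daily_averages(readings):
    """Average each day's readings across devices, skipping missing values."""
    averages = []
    for day in zip(*readings.values()):
        valid = [value for value in day if value is not None]
        averages.append(sum(valid) / len(valid))
    return averages


def deviations(readings):
    """For each device, return each day's reading minus that day's average."""
    averages = daily_averages(readings)
    return {
        device: [None if value is None else value - avg
                 for value, avg in zip(values, averages)]
        for device, values in readings.items()
    }


if __name__ == "__main__":
    for device, devs in deviations(STEPS).items():
        valid = [d for d in devs if d is not None]
        print(f"{device}: mean deviation {sum(valid) / len(valid):+.1f} steps")
```

A negative number means the device counted fewer steps than the group average that day. Swapping in the calorie table would give the same comparison for calories burned.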

The trackers also calculated how many calories I burned each day, and again, the numbers were all remarkably similar. No device differed from the average by more than 48 calories a day throughout the experiment. Calories burned is more of an educated guess than a scientific assessment, says Johnathan Dugas, a kinesiologist and director of clinical development at The Vitality Group, a company that designs wellness programs for employers. The fact they all produced similar calorie counts indicates that they are all likely fairly accurate. “Calories burned is a function of how much you weigh and how far you go,” he says. “The devices know your mass, how far you’ve gone and how fast, so they can estimate that.”

One great feature all these trackers share is that they make life easier for calorie-counters like me by feeding calories burned seamlessly into other calorie-counting apps like MyFitnessPal.

So if accuracy isn’t an issue, how do you select your fitness-tracker soul mate? Simple: Pick the one that best suits your lifestyle, preferences and budget, Lane says. “It’s about personal choice,” she says. “Because they’re all reasonably accurate and comparable, it comes down to looks and features. I personally wear the Misfit Shine because it’s a lot more discreet. For others, they may prefer one with a long battery life or that’s waterproof.”

None of the devices I tested were waterproof, but battery life could be seen as a tie-breaker. The Jawbone Up lasted the longest – I only had to charge it once during my six-day trial. Conversely, I sometimes had to charge the Samsung smartwatch during my workday, and it even died on one occasion. However, there’s a trade-off: The smartwatch is much more than a fitness tracker, acting as a bite-size phone on my wrist, allowing me to respond to emails, text messages and more. So I concluded that what it lacks in battery life is made up for in features.

The Garmin Vivofit was a perfect marriage between the smartwatch and Jawbone, offering middle-of-the-road battery life lasting three to four days, while at the same time allowing me to control my music and see whenever I received a new text message, Google chat or email. The downside? Unlike the smartwatch, it isn't designed for responding to those messages.

Fitness trackers are especially useful for beginners, and if you fit into the novice category, any tracker you like will do, Dugas says. “There’s more value for people who don’t think of themselves as active people,” he says. “They don’t know how to get out and run, so they just try to hit the goal their device tells them. That’s where activity trackers really make the biggest difference.”

What's more, how you use your tracker is far more important than which tracker you use, Lane says. Most come with a preprogrammed goal of 10,000 steps – the amount typically recommended for optimal health – but that's not a one-size-fits-all prescription. “The 10,000 steps goal is pretty decent for overall health, but it’s not for everyone,” she says. “For someone with back pain or arthritis that just isn’t realistic.”

Instead, as soon as you get the device, use it to figure out your baseline and go from there. “Track yourself for about a week to see what your normal daily step count is,” Lane says. “Then set a goal that’s appropriate for your baseline.” If you have an active job and find out that you already clock 12,000 steps a day, for example, “aiming for a goal of 10,000 isn’t going to help you,” Lane says. “More realistically, say you walk 5,000 steps per day, which is quite average. After a week, try to increase it by 20 percent. That’s realistic. That’s like going for a nice 20- or 30-minute walk. After you consistently hit that goal for a week, increase it by another 20 percent, and keep going.”
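Lane's step-up plan is easy to turn into arithmetic. The short Python sketch below is one way to read it, not her official formula: it assumes each 20 percent bump is applied to the current goal rather than to the original baseline, and the function name and the `weeks` parameter are made up for illustration.

```python
def weekly_goals(baseline, percent=20, weeks=4):
    """Return a list of daily step goals, one per step-up.

    Assumption: each increase is applied to the previous goal, so the
    goals compound. `percent` and `weeks` are illustrative parameters.
    """
    goals = []
    goal = baseline
    for _ in range(weeks):
        goal = goal * (100 + percent) / 100
        goals.append(round(goal))
    return goals


if __name__ == "__main__":
    # Lane's example of a 5,000-step baseline.
    print(weekly_goals(5000))
```

With Lane's 5,000-step example, the first goal is 6,000 steps a day, and under the compounding assumption it takes four step-ups to pass the 10,000 mark.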

During the experiment, I discovered that my baseline was approximately 9,500 steps per day – a hair shy of the recommended 10,000 – which isn't super surprising, given that I live in New York City and walk pretty much everywhere. However, I'll admit I actively tried to stretch my step count, so 9,500 probably isn't an accurate reflection of how much I'd walk without Big Brother – (kidding!) my trackers – keeping score. Hitting 10,000 would likely take little effort for me; a few more trips to the bathroom (about 150 steps from my desk) during the workday would do the trick. Taking Lane's advice, I should aim for closer to 11,500 steps per day, which would require me to step up my effort significantly.

By working up to your goal, not only do you reduce the risk of, say, spraining an ankle, but you're likelier to reach it, and thus, see meaningful change, Lane says. “Setting realistic goals that are personalized to you is key,” she says. “If you don’t, you’re likely to get discouraged and not use the device.”

Through my experiment, not only did I learn that when it comes to picking a fitness tracker, pretty much any one will do, but I also learned a few things about myself. Namely that even on days I went to the gym, I didn't move nearly as much as I thought I did. (Oh, and that people tend to keep their distance if you're wearing multiple trackers at once.) Most importantly, I learned that my fiancée must really love me if she agreed to eat with me – and my trackers – in public, without making jabs at me the whole night. Now all we need is a tracker to measure her tolerance of my quirks.