2013/02/09, 22:07 by checker

This is the story of a bug in SpyParty. This story has a happy ending, because the SpyParty beta testers are amazing, and they are constantly helping find bugs, of course, but they are also constantly helping me reproduce bugs, and narrow down the potential causes of bugs, and triage them, and are generally providing me with incredible support so I can make the game better.

This bug also has an interesting story, because it turned out to have a very subtle cause, one that manifested itself intermittently in ways that looked almost random, and for a long time there was no “repro case”. Getting a repro on a bug is the key to fixing it, as I discuss in the How to Report Bugs the SpyParty Way post. It can make the difference between a 10-minute fix and a 10-day fix…or never managing to find and fix it.

*shudder*

But let’s start at the beginning…

I first noticed the bug long before I’d invited people into the beta, but it was so rare that I didn’t prioritize finding it, and in fact I would forget about it for stretches of time. Yes, I make notes of bugs I see, but I’ve got so much high priority stuff to do right now that I don’t go back and look at that list very often…the important bugs get fixed immediately, but a bug like this can stay in for a long time.

So, what’s the bug? Well, let’s go to the videotape:

This video was from a Spy Commentary game played between buxx and dieffenbachj, two early beta testers who were some of the first to upload gameplay videos.

It turns out, in addition to just being plain awesome for games overall, the rise of videos and streams is also an amazing resource for bug finding and fixing! The more people record their games, the more they’ll be able to point to a video of exactly what went wrong so the developers can see it almost first-hand. The bane of a developer’s life is a bug report that says, “the game broke” with no other description. You know something’s probably wrong, but it’s basically useless for finding a problem. With a video, you can usually see exactly what’s going on, so that problem is eliminated, or at least massively reduced.

So, as you can see in that video, the Spy is facing the wrong way on the floor pad. The Spy should be facing the pedestal with the statue on it in that position, but in this case the Spy is turned 90 degrees.

I watched buxx’s videos when he posted them and noticed this, so I posted a bug in the Bugs Forum on the private beta website myself on May 6th, 2012.

Obviously, trying to repro it by replaying the steps in that video was fruitless.

Next up, bishop caught it with a screenshot and posted on May 13th, 2012:

It’s a perfect shot, but his next post is “I haven’t had much luck on the repro.”

ardonite chimes in the next day:

I got the rotation bug once at a statue. I think I was rriiiiight in the bounding box. So maybe if it’s on a border pixel of the box then it glitches out? Edit: no, hypothesis incorrect.

That’s how it goes: make a hypothesis, check it, repeat.

More shots every couple months over the summer from bishop and r7stuart:

Still no repro.

This entire time—in fact since I first saw it myself pre-beta—I’ve been trying to resist thinking it’s some kind of “numerical issue” with the code that handles the facing angle. Yes, angles are finicky to deal with due to wrapping, but I’ve found programmers, including myself, tend to immediately go to vague concepts like “floating point error” for anything like this. To fight this tendency back when we were working on physical simulation code together, Casey Muratori and I developed a mantra: “Assume it’s a bug!” It means that instead of assuming it’s some subtle floating point error creeping in, or anything mysterious like that, it’s almost certainly just some dumb programming bug. That mantra has never failed me. It’s always just a plain old bug.

Onward…

In the fall of 2012, streaming SpyParty took off bigtime, and so people were recording their games more often, and we started to get more videos, this one from r7stuart in October:

And tytalus in November:

I saw this last one live on tytalus’s stream, and grimaced when it happened, but also was happy to have more data to some day find a repro, or just have a random brainwave and fix it by intuition.

At this point, people were reporting NPCs doing it, which at least made me happy, because it meant it wasn’t a tell. Tells and anti-tells are the most serious SpyParty bugs, because they undermine the delicate balance of the game, so I prioritize them highest, even above crash bugs sometimes!

A couple on New Year’s Eve from jorjon:

Then two clips from streams, the first from slappydavis, whose Seduction Target appears to do it at the bookshelf on January 6th:

And then from james1221 on January 12th during the SpyParty New Years Cup Tournament:

Both of these are different, however. In both cases, the NPC is being blocked by another character, and instead of repathing to a new place, they just wait until the blocking character leaves. This is both good news and bad news. It means it’s easy to project this bug onto other bugs, it means there are other bugs, and it means all the real examples of this bug so far have involved the Spy. I don’t mention this last part in the hopes that nobody notices. Luckily it’s rare enough that it’s not going to be a game balance changer even if it is a tell.

Finally, kcmmmmm finds a reliable repro on February 7th, two days ago, and 8 months after the first post in the Bugs thread! These pictures are beautiful to me:

You can stand at that position, with that camera angle, and repro the bug on most tries. He also figures out that it’s very camera angle dependent, which is another clue, but once I could repro it locally, its remaining time on this earth was measured in minutes.

I had some trouble reproing it reliably here, including a couple wild-goose chases where I thought it wouldn’t repro with the debugger running or in my debugging modes, but in the end I got a case where I could catch it in the debugger, and I looked at the source, and there it was, suspiciously rotten code.

It was an old check from when I used to support click-to-move, as opposed to direct-control of the spy. There was a case in the code that would check if you were clicking on the bookshelf itself, rather than the floor pad in front of the bookshelf, and it would helpfully direct you to the floor pad. The position part of this got taken out long ago (I think), but the angle part remained in, so when the Spy stopped moving, if the mouse was over the bookshelf that code would return the angle for facing the bookshelf.

Wait, you say, there’s no mouse cursor in Spy mode? Ah, yes there is, it’s just hidden and forced into the middle of the screen. So, most of the time it hits your back, but sometimes, if you’re turning or leaning down or whatever when you stop, it’ll miss you and hit what’s behind you, and if it hits a bookshelf on the frame when you stop, you get the wrong facing angle.
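
For readers who want to see the shape of the problem, here is a minimal Python sketch of it. This is not the actual SpyParty source: the class names, function names, and angle values are all invented for illustration, and the real code surely looks different. It only shows how a leftover "is the cursor over a bookshelf?" check can override the facing angle of whatever pad the Spy actually stopped on.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical types and values; nothing here comes from the real game code.

@dataclass
class FloorPad:
    name: str
    facing_angle: float  # degrees the Spy should face when standing on this pad

@dataclass
class Bookshelf:
    name: str
    facing_angle: float  # degrees for facing this bookshelf from its own pad

def stop_facing_angle_buggy(pad: FloorPad, under_cursor: Optional[Bookshelf]) -> float:
    """Leftover click-to-move helper: a bookshelf under the cursor wins,
    even when the Spy stopped on a completely different pad."""
    if under_cursor is not None:
        return under_cursor.facing_angle
    return pad.facing_angle

def stop_facing_angle_fixed(pad: FloorPad, under_cursor: Optional[Bookshelf]) -> float:
    """With direct control, only the pad the Spy stopped on matters."""
    return pad.facing_angle

statue_pad = FloorPad("statue pedestal pad", facing_angle=0.0)
green_shelf = Bookshelf("green bookshelf", facing_angle=90.0)

# Usual frame: the hidden cursor hits the Spy's back, so nothing is picked.
normal = stop_facing_angle_buggy(statue_pad, None)

# Unlucky frame: the hidden cursor slips past the Spy and lands on a bookshelf.
glitched = stop_facing_angle_buggy(statue_pad, green_shelf)

# Same unlucky frame with the stale check removed.
repaired = stop_facing_angle_fixed(statue_pad, green_shelf)
```

In this sketch the fix is simply to stop consulting the cursor at all when deciding which way to face, which matches the idea of deleting the stale check rather than patching around it.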

Now, with this knowledge, go back and watch the videos and look at the pictures above. Always a nearby guilty bookcase, isn’t there?

But wait, you say again, what about the very first video, the green bookshelf is nowhere near the middle of the screen! Ah, but buxx uses a controller, and the mouse gets hidden but doesn’t get centered if you’re using a controller! It probably should, but it doesn’t. So, buxx’s hidden mouse pointer is probably off to the right of the window, over the green bookshelf, until he moves one pedestal pad to the left, and then the mouse pointer is no longer on the bookshelf, and he faces the right way!
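
The controller case can be sketched the same way. Again, this is a hypothetical Python illustration, not the game's code: the function name, the `"mouse"` and `"controller"` strings, and the window size are assumptions made up for the example. The point is only that a hidden cursor which is re-centered for mouse players but left alone for controller players can sit over any part of the screen.

```python
from typing import Tuple

Point = Tuple[int, int]

def hidden_cursor_position(input_device: str,
                           last_cursor_pos: Point,
                           window_size: Point) -> Point:
    """Where the invisible cursor is on a given frame (hypothetical model)."""
    width, height = window_size
    if input_device == "mouse":
        # Mouse players: the cursor is hidden and forced to the screen center.
        return (width // 2, height // 2)
    # Controller players: the cursor is hidden but stays wherever it was left.
    return last_cursor_pos

window = (1280, 720)
parked = (1200, 300)  # assumed spot near the right edge of the window

mouse_cursor = hidden_cursor_position("mouse", parked, window)
controller_cursor = hidden_cursor_position("controller", parked, window)
```

Under these assumptions, a controller player's pick test runs against whatever happens to be drawn at the parked position, so a bookshelf well away from the center of the screen can still trigger the wrong facing angle.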

Awesome, all the cases explained, and the bug was trivial to fix!

Okay, there’s actually one more case in the Bugs thread I didn’t post here, because it’s funny enough that I’m going to make an entire post about it soon.

So, just remember, always, Assume it’s a bug!