More than 20 years later, we are still learning new and fascinating details about Valve's legendary first-person shooter, Half-Life 2. Today's comes from former Valve designer Tom Forsythe, who has described a bug discovered in 2013 that apparently traveled back in time and infected the original version of the game.
I realize this is a lot to take in, so let's unpack this truly fascinating technological mishap. As Forsythe detailed on Mastodon, he and Valve programmer Joe Ludwig were working on porting Half-Life 2 to the Oculus Rift VR headset, and early in the game Forsythe encountered a bug that prevented him from entering a room and thus from progressing along the critical path. There was no way to move forward, which was weird enough, but what was even weirder was that no one could remember encountering this bug in the original game. Forsythe even watched a video of the original game at the scene where he encountered the bug, and it didn't happen there.
Obviously, the most pressing issue at the time was fixing the bug before the game was due to release on the Oculus Rift, but adding to the craziness was the fact that Valve appeared to be dealing with a glitch involving time travel. It didn't exist before, but now it has appeared. “How is this possible? At this point people are going crazy – this is no ordinary bug – it seems to have traveled back in time and infected the original!” – said Forsythe.
Eventually the developers were able to identify the source of the bug: a guard stationed in the now inaccessible room was standing too close to the door, and his toe collided with the door as it opened, causing it to return to its closed state. Now that the problem had been discovered, it was “easy” to fix, even though “it took a lot of work to find because people had to dust off old memories of how debugging tools worked, etc.”
Nevertheless, the mystery remained. How in the name of Gabe Newell did this 2013 bug manage to find its way into nine-year-old code? And moreover, why didn't the guard's toe prevent the door from opening in 2004? Or in any of the subsequent years until the bug was discovered?
“But why did it EVER work? In the original version, the guard's toe was also in the way. As I said, we went back in time and compiled the original source code as submitted, and the bug was there too. It was always there. Why didn't the door slam shut again? How did this ever work?”
Well, fortunately, there is an answer to this fascinating mystery: “good old floating point,” according to Forsythe. I'll let a real game designer talk about this part, but essentially the problem was not with the game code, but with the hardware that determined the accuracy of the game's physics, and by sheer luck, that accuracy allowed the door to swing open on the hardware it was originally built for, but not on the 2013 kit that Valve used to test the game.
“Half-Life 2 was originally released in 2004, and while the SSE instruction set existed, it was not yet ubiquitous, so most of HL2 was compiled to use the older 8087 or x87 math instruction set,” Forsythe said. “It has a crazy mix of precisions: some things are 32-bit, some are 64-bit, some are 80-bit, and exactly what precision you get in which bits of code is somewhat mysterious.
“But ten years later, in 2013, SSE had been standard on all x86 processors for a while. The operating system depended on it, so it could be relied upon. So of course compilers use it by default; in fact, you have to go out of your way to get them to generate old (slightly slower) x87 code. SSE uses a much more well-defined precision, 32 or 64 bits depending on what the code requires, so it's much more predictable.”
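The effect Forsythe describes can be illustrated with a small sketch. This is not Valve's code: the function names and the step value are made up for illustration, and Python's 64-bit floats stand in loosely for x87's wider intermediate precision. One version rounds to 32 bits after every addition, roughly as SSE single-precision math would; the other keeps extra precision during the calculation and rounds only once at the end.

```python
import struct


def to_f32(x: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]


def accumulate_strict(step: float, n: int) -> float:
    """Round to 32 bits after every addition (a loose stand-in for SSE)."""
    step = to_f32(step)
    total = 0.0
    for _ in range(n):
        total = to_f32(total + step)
    return total


def accumulate_extended(step: float, n: int) -> float:
    """Keep 64-bit intermediates and round to 32 bits once at the end
    (a loose stand-in for x87 keeping extra precision in registers)."""
    step = to_f32(step)
    total = 0.0
    for _ in range(n):
        total += step
    return to_f32(total)


# Hypothetical workload: add the same small step 1000 times.
strict = accumulate_strict(0.1, 1000)
extended = accumulate_extended(0.1, 1000)
print(strict, extended, strict == extended)
```

Both functions run the same arithmetic on the same inputs, yet the results differ in the last few digits. In most code that difference is invisible; it only matters when some later decision happens to sit right on the edge.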
Well, what that 32- or 64-bit precision produced, apparently, was a guard's foot that wouldn't get out of the door's way. In the original x87 build, the friction on the guard's boot let him rotate just far enough for the door to get past him and open properly. In the newer SSE build, though, “a whole bunch of tiny parts” were “very slightly different, and the combination of floor friction and the mass of the objects means the guard is still rotating from the collision, but now he's rotating a little less.”
“So in the next frame of the simulation, his toe is still in the way of the door,” Forsythe said. “The door is not allowed to just go through his toe, so its only option is to bounce back. I think by default it's set to do this perfectly elastically, so the door bounces back at the same speed it came in, slams shut, and locks again. And you get stuck.”
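To see how such a tiny difference can flip the outcome, here is a minimal sketch of the threshold at work. Everything in it is hypothetical: the function names, the clearance angle, and the two rotation values are invented for illustration and are not taken from the Source engine. The only ideas borrowed from Forsythe's account are that the door cannot pass through the toe and that it bounces back at full speed when blocked.

```python
def toe_blocks_door(guard_rotation: float, clearance_rotation: float) -> bool:
    """The toe is in the way unless the guard has rotated far enough."""
    return guard_rotation < clearance_rotation


def door_velocity_after_frame(velocity: float, blocked: bool,
                              restitution: float = 1.0) -> float:
    """A blocked door bounces back; restitution 1.0 means perfectly elastic."""
    if blocked:
        return -restitution * velocity
    return velocity


# Hypothetical numbers: the guard must rotate at least 5.0 degrees to clear the door.
CLEARANCE = 5.0
ROTATION_OLD_BUILD = 5.0001  # assumed result with the original math
ROTATION_NEW_BUILD = 4.9999  # assumed result with slightly different rounding

old_velocity = door_velocity_after_frame(
    1.0, toe_blocks_door(ROTATION_OLD_BUILD, CLEARANCE))
new_velocity = door_velocity_after_frame(
    1.0, toe_blocks_door(ROTATION_NEW_BUILD, CLEARANCE))

print(old_velocity)  # positive: the door keeps opening
print(new_velocity)  # negative: the door bounces back toward closed
```

A difference in the fourth decimal place of the rotation is enough to reverse the door's direction, which is why the same level data could behave differently under two builds.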
This means that, oddly enough, the bug had existed in the game all along. The guard was always standing too close to the door, but because the compiler used for the original build defaulted to the older x87 floating-point math, the game's physics came out slightly different from what the newer compiler produced, and that slight discrepancy meant the difference between a game-critical door opening and staying shut.
“And there you have it,” concluded Forsythe. “Two of the biggest bug farms in game development, doors and floating point, manage to turn a simple NPC placement error into a full-blown time-travel mystery.”






