11 Infamous Software Bugs
Programming errors that derail high-profile space-exploration missions -- especially bugs that cause spectacular explosions -- are frightening, expensive and career-killingly embarrassing for those who let them slip through. They provide extremely vivid reminders for all of us to check and recheck (and recheck and recheck and recheck) every line of code.
Mars Climate Orbiter Doesn't Orbit
Back in physics class, our teachers leaped all over answers that consisted of a bare number. If the answer was 2.5, they'd take their red pens and write "2.5 what? Weeks? Puppies? Demerits?" Then they'd mark the answer wrong.
Back then, we thought that they were just being pedantic. But a missing unit is the kind of error that can burn up a $327.6 million project in minutes. It did in 1999, when the Mars Climate Orbiter, built for NASA's Jet Propulsion Laboratory and launched in December 1998, approached the Red Planet at the wrong angle. At this point, it could easily have been renamed the Mars Climate Bright Light in the Upper Atmosphere, and shortly afterward been renamed the Mars Climate Debris Drifting Through the Sky.
There were several problems with this spacecraft -- its uneven payload made it torque during flight, and its project managers neglected some important details during several stages of the mission. But the biggest problem was that different parts of the engineering team were using different units of measurement. One group working on the thrusters measured impulse in English units of pound-force seconds; the others used metric newton-seconds. And whoever checked the numbers didn't use the red pen like a pedantic high-school teacher.
The result: The thruster firings had 4.45 times more effect than the navigation numbers said they did, because one pound-force is about 4.45 newtons. If this goof had been spotted earlier, it could have been compensated for, but it wasn't, and the result of that inattention is now lost in space, possibly in pieces.
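To see where the 4.45 comes from, here is a minimal Python sketch of the mix-up. It is an illustration only, not the mission software: the Impulse class, its method names and the sample value of 10.0 are all hypothetical. The idea is that a number carrying its unit cannot be silently misread.

```python
from dataclasses import dataclass

# One pound-force second expressed in newton-seconds.
LBF_S_TO_N_S = 4.4482216152605


@dataclass(frozen=True)
class Impulse:
    """Hypothetical impulse type that always stores newton-seconds."""
    newton_seconds: float

    @classmethod
    def from_newton_seconds(cls, value: float) -> "Impulse":
        return cls(value)

    @classmethod
    def from_pound_force_seconds(cls, value: float) -> "Impulse":
        return cls(value * LBF_S_TO_N_S)


# A made-up thruster impulse, reported in pound-force seconds.
reported = 10.0

# The bug: the bare number is read as if it were already newton-seconds.
misread = Impulse.from_newton_seconds(reported)

# The fix: say which unit the number is in when it enters the system.
correct = Impulse.from_pound_force_seconds(reported)

ratio = correct.newton_seconds / misread.newton_seconds
print(round(ratio, 2))  # 4.45
```

Forcing every caller to pick a constructor is the software version of the red pen: the question "2.5 what?" has to be answered before the number can be used.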
Mariner 1's Five-Minute Flight
On July 22, 1962, the first spacecraft of NASA's Mariner program blasted off on a mission to fly by Venus. The booster did its job, lifting the spacecraft off its Cape Canaveral launchpad, but after a few minutes, Mariner 1 began to yaw off course. The onboard guidance system failed to correct the trajectory, and manual guidance commands failed to correct it as well.
As the rocket veered off toward North Atlantic shipping lanes, the range safety officer did the only thing he could do: blow the thing up. Four minutes and 55 seconds into the mission, Mariner 1 exploded.
NASA was already suffering from Sputnik envy, and the Mariner 1 incident was another international embarrassment for the agency. The postmortem of this debacle revealed what NASA described as "improper operation of the Atlas airborne beacon equipment" -- though later it came out that the mistranscription of a single punctuation mark by an engineer caused the mission's fatal software error.
In his 1968 book The Promise of Space, Arthur C. Clarke described the mission as "wrecked by the most expensive hyphen in history."
That may not be strictly accurate. Although NASA did mention a hyphen in some of its reports of the incident, it appears that the agency was simplifying the story for a nontechnical audience.
A more widely accepted account is that the punctuation mark was a superscript bar over a radius symbol, handwritten in a notebook. In rocket science, the overbar signifies a smoothing function, so the formula should have calculated the smoothed value of the time derivative of a radius.
Without the smoothing function, even minor variations of speed would trigger the corrective boosters to kick in. The automobile driving equivalent would be to yank the steering wheel in the opposite direction of every obstacle in the driver's field of vision.
But few people know what an overbar is, and since it looks like a hyphen, that's how most people tell the story.
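The effect of the missing overbar can be sketched in a few lines of Python. This is a toy model under loud assumptions: the actual Mariner 1 smoothing function is not described here, so a plain moving average stands in for it, and the sample rates, the tolerance and the function names are all invented for illustration.

```python
def moving_average(samples, window):
    """Average each sample with up to window - 1 samples before it."""
    smoothed = []
    for i in range(len(samples)):
        start = max(0, i - window + 1)
        chunk = samples[start:i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def count_corrections(rates, expected, tolerance):
    """Count the readings that would trigger a steering correction."""
    return sum(1 for rate in rates if abs(rate - expected) > tolerance)


# Invented, jittery readings of a rate that is really steady at about 100.
raw_rates = [100.0, 103.0, 97.0, 104.0, 96.0, 102.0, 98.0, 103.0, 97.0, 100.0]
expected = 100.0
tolerance = 2.5

# Without smoothing, the controller reacts to the jitter itself.
raw_corrections = count_corrections(raw_rates, expected, tolerance)

# With smoothing, the same readings never leave the tolerance band.
smooth_corrections = count_corrections(
    moving_average(raw_rates, window=4), expected, tolerance
)

print(raw_corrections, smooth_corrections)  # 6 0
```

Six yanks of the steering wheel versus none, from the same data. The only difference is whether the controller looks at each raw reading or at the smoothed trend.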
Forty Seconds of Ariane-5
The European Space Agency (ESA) has also suffered embarrassment on the software front. The inaugural flight of its fifth-generation Ariane launcher bested NASA's Mariner 1 score for unmanned spacecraft disaster: It took only 40 seconds to blow up.
On June 4, 1996, after the kind of dramatic vertical blastoff you'd expect from a high-profile European vehicle, cameras on the ground barely had time to focus on the Ariane-5 as it turned around and began to fall apart, before it completely exploded.
The Ariane Flight 501 disaster began with a loss of guidance and attitude information 30 seconds after liftoff. Once it veered completely off course, it automatically self-destructed.
The problem was that Ariane-5's inertial reference system took a 64-bit floating-point value and converted it into a 16-bit signed integer. During the flight, that value grew too large to fit in a 16-bit signed integer, and the conversion caused an arithmetic overflow. A software handler that could have dealt with the problem had been disabled, and so there was nothing to hold back the cascade of system failures that led to the destruction.
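A 16-bit signed integer can only hold values from -32,768 to 32,767, so any conversion from a wider type needs a plan for numbers outside that range. The Python sketch below shows two common plans. It is not the flight code: the function names and the sample values are assumptions made up for this example.

```python
INT16_MIN = -32768
INT16_MAX = 32767


def to_int16_checked(value: float) -> int:
    """Convert to a 16-bit signed integer, raising if it does not fit."""
    result = int(value)  # truncates toward zero
    if not INT16_MIN <= result <= INT16_MAX:
        raise OverflowError(f"{value} does not fit in a 16-bit signed integer")
    return result


def to_int16_saturating(value: float) -> int:
    """Convert to a 16-bit signed integer, clamping to the nearest limit."""
    return max(INT16_MIN, min(INT16_MAX, int(value)))


# A value that fits converts cleanly.
print(to_int16_checked(1234.5))  # 1234

# A value that does not fit must be handled somewhere.
try:
    to_int16_checked(40000.0)
except OverflowError as error:
    print("handled:", error)

# Clamping is one way to keep running with a bounded, if inexact, value.
print(to_int16_saturating(40000.0))  # 32767
```

Either choice is defensible, depending on the system. What the checked version makes plain is that raising an error only helps if something is still listening for it.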