Slashdot: News for Nerds

Software Error Likely Killed MGS Spacecraft

kdawson posted more than 7 years ago | from the off-by-one dept.

NASA 199

Aglassis writes "NASA investigators have determined that a software update performed in June of 2006 may have doomed the 10-year-old spacecraft. Apparently the software error caused the solar arrays to drive against a mechanical stop which then forced the spacecraft into safe mode. Unfortunately, after that the spacecraft's radiator was pointed at the sun which overheated the battery and destroyed it. Contact was lost with the Mars Global Surveyor spacecraft in November 2006. NASA will form an internal review board to determine formally the cause of the loss of the spacecraft and what remedial actions are needed for future missions."

199 comments

Don't believe it (5, Funny)

LiquidCoooled (634315) | more than 7 years ago | (#17556812)

I don't believe it.
It's most likely the Martian automated defense system, set up just before we sent a probe and destroyed their civilisation [slashdot.org].

Should have used Gentoo!! (1)

Marcion (876801) | more than 7 years ago | (#17557324)

The updates would have been added in a sandbox and then only moved to the main system if they passed all the tests.

Re:Should have used Gentoo!! (4, Insightful)

zootm (850416) | more than 7 years ago | (#17557462)

No sandbox can avoid the fact that one test was missing.

Re:Should have used Gentoo!! (1)

bhsurfer (539137) | more than 7 years ago | (#17557546)

Man, I'd feel really super important if I wrote a bug that destructive! I feel so inadequate... I need a hug.

Re:Should have used Gentoo!! (1)

zootm (850416) | more than 7 years ago | (#17557622)

What you need to do is hold back on producing all those "fun" bugs that we all introduce into systems until you've the reputation as one of the best coders in the world, then go work for NASA and just go wild on some system that won't be used until it's in deep space and you're off working for Google, having destroyed the paper trail.

Re:Should have used Gentoo!! (1)

the_tsi (19767) | more than 7 years ago | (#17557884)

...But if they installed the update on a gentoo sandbox before installing it on the MGS itself, it wouldn't be compiled for EXACTLY that machine, and as we all know, it's the precise compiling that results in gentoo's 20% performance increase (that and funrolling loops and putting flashy stripes on the computer, along with maybe a 8" exhaust).

Re:Should have used Gentoo!! (1)

Hatta (162192) | more than 7 years ago | (#17559064)

Isn't Mars one big sandbox [smh.com.au] ?

I believe it was running a version of Linux (0, Troll)

The_Abortionist (930834) | more than 7 years ago | (#17557402)

I am not surprised at the result at all. Some wanker at NASA thought that Linux was good enough for such a delicate and precise task. What's the TCO on that Linux installation? $500M ?

I think that a lot of productivity issues relating to IT in industries comes from the low-key Linux movement of techies who push to replace everything with Linux. They like to sweep the glitches under the rug.

This actually reminds me of all the Mac fans in the 90s who were dumping on Windows for its alleged lack of stability. But whenever they showed me their precious computer, it would completely crash within 5 minutes of browsing the web, or when opening the calculator while the text editor (like Notepad) was still open.

Conservatives like to complain about activist judges, but for me, I see too many activist techies. They have a negative impact on their organisations. They should be eliminated.

Where's K'Breel? (2, Insightful)

Amazing Quantum Man (458715) | more than 7 years ago | (#17557714)

We need his report! Tripmaster Monkey, where are you?

Re:Don't believe it (1)

orasio (188021) | more than 7 years ago | (#17558668)

Martians were previously killed by all the MSG [truthinlabeling.org] in the spacecraft

MGS? How about some fucking clearer headlines (-1, Offtopic)

DJCacophony (832334) | more than 7 years ago | (#17556842)

Who would have thought that metal gear solid had a spacecraft?

Battery (5, Funny)

Anonymous Coward | more than 7 years ago | (#17556846)

overheated the battery and destroyed it
Have NASA been using Dell batteries?

Re:Battery (1, Interesting)

Anonymous Coward | more than 7 years ago | (#17557030)

s/Dell/Sony/g

The worst part of it all was that Sony stopped using their own batteries because they knew they were defective. Boycott Sony.

Movie re-write? (1)

sherms (15634) | more than 7 years ago | (#17557182)

So does this mean they will have to re-write "Red Planet"? Wasn't there a scene where they used components from that machine?

Re:Movie re-write? (0)

Anonymous Coward | more than 7 years ago | (#17557368)

MGS [wikipedia.org] is an orbiter. They used Pathfinder, iirc, which is a surface vehicle.

Re:Movie re-write? (1)

sherms (15634) | more than 7 years ago | (#17557838)

Thanks, it's been a while. I'll have to watch it again.

Re:Battery (1, Insightful)

Anonymous Coward | more than 7 years ago | (#17557986)

Parent is not offtopic. The batteries in those Dell laptops were produced by Sony Corporation, not Dell. That's why the recall extended to nearly every major laptop manufacturer.

a Technical solution I see: (2, Insightful)

pilgrim23 (716938) | more than 7 years ago | (#17556850)

Typical response to a problem: form a committee!

what's the alternative? (0, Offtopic)

jihadi_schwartz (989888) | more than 7 years ago | (#17556976)

Ever notice the "beat the rush and see it early" link at the top of slashdot when a new story is about to come out?

Sounds good, doesn't it? To be able to view the pages linked to in the article before the tens of thousands of other slashbots click to view them.

Did it ever occur to you that you're taking part in cyber-terrorism?

That's right: Slashdot's editors are cyber-terrorists. They coordinate a DOS against small websites, and they attempt to collect money from people who wish to be spared the effects of said DOS. Terrorism, plain and simple.

You can fight this and other crimes by slashdot's editors by joining anti-slash [anti-slash.org] . Anti-slash is committed to forcing the editors to own up to their numerous crimes against the geek community. Until our demands are met, we will relentlessly discredit them as a news service through trolling and other means.

Also, props to poopbot and the alan thicke troll. We remember your accomplishments.

In sacred jihad,

jihadi_31337

| _ __ | |
_) |_|_)__/_| |
(_) o

What is Microsoft wrote it? (5, Interesting)

quadelirus (694946) | more than 7 years ago | (#17556852)

One crash in ten years? Why don't the NASA guys write consumer operating systems?

Re:What is Microsoft wrote it? (2, Informative)

the_humeister (922869) | more than 7 years ago | (#17557088)

Because it'd be even less user friendly than Linux. Plus they'd also require people to run 80386 processors with 4 MB memory, if that.

Re:What is Microsoft wrote it? (1)

h2g2bob (948006) | more than 7 years ago | (#17557528)

Well, 4 MB should be enough for anybody

Re:What is Microsoft wrote it? (1)

quadelirus (694946) | more than 7 years ago | (#17558150)

Seriously, what do we need all these fancy shmancy graphics for anyway?

Re:What is Microsoft wrote it? (4, Funny)

the_humeister (922869) | more than 7 years ago | (#17558246)

I don't know. And people with their "keyboard" and "mouse." Idiots I say. The only true way to interact with a computer is by plugging wires into the serial port and generating the necessary electrical pulses myself.

Re:What is Microsoft wrote it? (1)

quadelirus (694946) | more than 7 years ago | (#17558388)

Just try not to have a SEGFAULT in the serial controller. :-p

Re:What is Microsoft wrote it? (3, Insightful)

Calinous (985536) | more than 7 years ago | (#17557110)

Why don't computers use NASA-quality hardware, ready for space?
Why don't all computers use just a single configuration (peripherals, cards, interfaces)?

      The purpose of an operating system is so much wider than what the Mars Global Surveyor had to do.

Re:What is Microsoft wrote it? (1)

quadelirus (694946) | more than 7 years ago | (#17558226)

I totally agree. All an OS does is let you set a desktop background, and for the trouble they seem to have, who needs one? I mean, if I could only run firefox, and no OS wouldn't that be better?

(I'm joking)

Re:What is Microsoft wrote it? (5, Insightful)

edremy (36408) | more than 7 years ago | (#17557770)

Actually, they buy their OS's off the shelf. (VxWorks for the rovers, for example)

That said, you could get software written to this level of perfection if you wanted. It's easy- follow the space shuttle's team's example. You have a stable team of mature developers who work reasonable hours. You test the hell out of the software to the point a single bug in a test is reason to redo the software. You run the software on four identical computers and make sure they all agree.

Then you hire another entire team to write code that does the same thing, but otherwise has no contact with the first team. That software runs on a fifth computer that takes over if something happens to the other four.

Willing to pay for that?
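The redundancy idea described above (several identical channels that must agree, plus an independently written backup that takes over when they don't) can be sketched roughly as follows. This is only a toy illustration, not how the shuttle's flight software actually arbitrates; the names `vote`, `primary`, `faulty`, `backup_impl` and `run` are all made up for the example.

```python
from collections import Counter


def vote(outputs, backup):
    """Return the value a strict majority of channels agree on.

    If no strict majority exists, call the (hypothetical) backup instead.
    """
    value, count = Counter(outputs).most_common(1)[0]
    if count > len(outputs) / 2:
        return value
    return backup()


def primary(x):
    # The "first team's" implementation, run on every primary channel.
    return 2 * x + 1


def faulty(x):
    # A channel with a simulated fault: it drops the +1.
    return 2 * x


def backup_impl(x):
    # An independently written implementation of the same function.
    return x + x + 1


def run(x, channels):
    # Run every primary channel, then let the voter decide.
    outputs = [channel(x) for channel in channels]
    return vote(outputs, lambda: backup_impl(x))
```

With four healthy channels, or three healthy and one faulty, the majority answer wins; with a two-two split there is no strict majority and the backup's answer is used. Note what the scheme cannot do: if all the channels share the same bug they will agree on the wrong answer, which is the reason for having a separately written backup at all.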

Re:What is Microsoft wrote it? (1)

quadelirus (694946) | more than 7 years ago | (#17558348)

My previous joking aside, that is a good testament to the work being done by the VxWorks and other real-time OS folks. I just figured it was all written in-house, but thinking about it now, as you pointed out, that would be next to impossible to fund. It seems that these days most things requiring some sort of OS, from PDAs to cellphones, to your car's chip, to NASA spacecraft, are using off-the-shelf components. It's just too hard a problem to start from scratch, especially when there are good alternatives out there.

Re:What is Microsoft wrote it? (1)

Jerrry (43027) | more than 7 years ago | (#17559062)

Willing to pay for that?

Yes I am. Spread the cost over all the servers in the world and the cost would still be far less than the cost of all the crashes, infections, and data corruptions that are due to the sloppy way Microsoft writes and tests operating systems.

Don't Worry (-1, Offtopic)

Anonymous Coward | more than 7 years ago | (#17556854)

NASA's got a list of ways to colonize the moon [theonion.com] . Once those are achieved, this MGS business will just blow over ...

This is why Automatic Updates gets turned off (-1, Offtopic)

Anonymous Coward | more than 7 years ago | (#17556860)

It's one of the first things I do on a new install.

Why Solid Snake Why! (-1, Redundant)

bigdady92 (635263) | more than 7 years ago | (#17556866)

Anyone else read this and go "When did Metal Gear Solid crash a spaceship?"

Re:Why Solid Snake Why! (-1, Redundant)

Donniedarkness (895066) | more than 7 years ago | (#17556914)

Yes.

*phew* (4, Funny)

Daetrin (576516) | more than 7 years ago | (#17556886)

NASA investigators have determined that a software update performed in June of 2006 may have doomed the 10-year-old spacecraft. Apparently the software error caused the solar arrays to drive against a mechanical stop which then forced the spacecraft into safe mode.

Glad I'm not the programmer who came up with that bit of code! Their next performance review is going to be _lots_ of fun!

Re:*phew* (1)

Intron (870560) | more than 7 years ago | (#17556964)

There goes the SEI level 5 certification...

Re:*phew* (0)

Anonymous Coward | more than 7 years ago | (#17557158)

That's going to be my performance review you insensitive clod!

Re:*phew* (1)

creimer (824291) | more than 7 years ago | (#17557206)

Actually, a subcontractor will blame another subcontractor for the fault and fighting will break out. NASA will keep peace among the subcontractors by blaming a hacker for mistaking the update as a patch for the Metal Gear Solid video game, and vow not to create any acronyms that could be misconstrued as a video game.

the software bug was (0)

Anonymous Coward | more than 7 years ago | (#17556888)

uri = windowsupdate instead of nasaupdate...

that's what happens when an ex-microsoft employee works for you

JVM, TCL, or bad old C problem? (0)

oldwarrior (463580) | more than 7 years ago | (#17556898)

inquiring minds want to know?

"Safe" mode? (5, Funny)

Bazman (4849) | more than 7 years ago | (#17556930)

Funny definition of 'safe mode'. I'd get the main antenna pointing at the earth, the battery radiator pointing away from the sun, and the computer going 'what do I do know, smarty earthlings?' and waiting for a command.

Maybe NASA's 'safe mode' just put 'safe mode' in the corners of all the returned images and did them in 8-bit colour...

Bits (1)

michaelmalak (91262) | more than 7 years ago | (#17558096)

Maybe NASA's 'safe mode' just put 'safe mode' in the corners of all the returned images and did them in 8-bit colour...
I think you meant to say 4-bit color.

Re:"Safe" mode? (0)

Anonymous Coward | more than 7 years ago | (#17558920)

I think that's what it did. It asked the question "what do I do know, smarty earthlings?" and got stuck in an infinite loop trying to understand two words and why the question was not capitalized? Maybe it will eventually understand it should have been,

    What do I do now, smartly Earthlings?

??? Maybe not.

YACCS -Yet Another Computer Corkup in Space (4, Informative)

Ancient_Hacker (751168) | more than 7 years ago | (#17556956)

Just one more example of how Computer Science isn't quite up to the reliability requirements of Space:
  • A missing comma in a Do-loop statement causes the first mission to Mars rocket to go off course and blow up.
  • The space-shuttle programs had a race condition that caused the first launch to be scrubbed.
  • The space-shuttle re-entry program had one important variable off by a factor of -4, causing the first re-entry to be a bit wobbly.
  • An Ariane guidance program had multiple basic design glitches that caused the first launch to blow up.
  • The F-16 autopilot worked very well, until the plane was deployed to Australia, where on its way there it bounced off the equator.
  • The LEM landing program didn't protect itself from spurious radar data, causing the computer to get behind.

Aero and space are very unforgiving of human coding errors.

Re:YACCS -Yet Another Computer Corkup in Space (2, Interesting)

zyl0x (987342) | more than 7 years ago | (#17557072)

Be careful not to place too much of the blame on us programmers. Most of these crazy "business logic" equations were created by some math genius in another department. Since most of these equations mean nothing to programmers, we make sure we're typing them in correctly, since there's no way we would ever recognize any type of mistake. Most of the time the problem lies with the math guy, who was too lazy to carry a remainder, or who thought the equation was good enough being precise to four decimal places.

Re:YACCS -Yet Another Computer Corkup in Space (4, Insightful)

spun (1352) | more than 7 years ago | (#17557510)

In other disciplines, the engineers ARE math guys. Face it, compared to other engineering types, software engineers and programmers are SLOPPY. This is because engineering has thousands of years worth of spectacular cork-ups with enormous death tolls to look back on, and engineering students are (I'm guessing, IANAE) shown horrific, traffic-safetyesque movies like Blood on the Protractor, Slide Rule Massacre, and London Bridge is Falling Down, Killing Little Johnny's Entire Family.

Maybe we CS types need our own safety movies, perhaps When Buffers Attack!, Threads: Your Parallel Friends or Quagmires of Debugging DOOM?, or maybe Metric or Imperial: You Mean there's a Difference? Or maybe we need to recognize that many of us have the same awesome responsibility that other engineers do of protecting human lives from the consequences of our mistakes. I'm told that this point is hammered home in engineering schools, why not in CS departments?

Re:YACCS -Yet Another Computer Corkup in Space (4, Funny)

unix_core (943019) | more than 7 years ago | (#17557758)

I think I've seen some of those, starring Troy McLure right?

Re:YACCS -Yet Another Computer Corkup in Space (1)

spun (1352) | more than 7 years ago | (#17558640)

When I came up with those names, I pictured Troy saying them. Dammit, Phil Hartman, why'd you have to marry a crazy murdering alcoholic bitch?

*Sigh*

Re:YACCS -Yet Another Computer Corkup in Space (2, Insightful)

caerwyn (38056) | more than 7 years ago | (#17558090)

CS people are math guys too, at least many of us are. That doesn't mean we necessarily have the expertise to validate aerospace control algorithms on the fly; that's why there's an entire discipline of aerospace engineers, because you can't expect all the *other* engineers to have sufficient knowledge.

Things like this are built as teams- and team members have to make certain assumptions about the accuracy of the other team members' work. Those algorithms should have been validated before even being handed off to the programmers, and then validated *again* as part of integrated testing.

Re:YACCS -Yet Another Computer Corkup in Space (1)

shawn(at)fsu (447153) | more than 7 years ago | (#17557102)

It's not like the only problems with air and space vehicles have been caused by coding errors. I'm sure engineering has done fairly well for itself too.

Re:YACCS -Yet Another Computer Corkup in Space (0)

Anonymous Coward | more than 7 years ago | (#17557236)

A few failures every now and then promotes competition in the selfdestruct-technology business so it's not all bad news ;)

Re:YACCS -Yet Another Computer Corkup in Space (1)

MBCook (132727) | more than 7 years ago | (#17557342)

Like the F16 thing. Let's not forget that the shuttle has NEVER been in space during a new-years. It is untested (at least in space) and they are not positive what will happen. That's why they were worried in December, they didn't want bad weather to force the shuttle to stay in space during the transition.

Re:YACCS -Yet Another Computer Corkup in Space (1)

Arbitor Elegantorum (990281) | more than 7 years ago | (#17557688)

According to NASA [nasa.gov] , MGS outlived its design parameters by 400%, and relayed important information right up to the end. Further, the Mars Rovers have outlived their warranty by 2 years. I think we're doing something right.

Re:YACCS -Yet Another Computer Corkup in Space (3, Insightful)

januth (1000892) | more than 7 years ago | (#17557814)

I wouldn't call it a failure of Computer Science; it's a QA failure without a doubt.

Mistakes happen when you code. Sure, you try to minimize them but even the most carefully designed code can't be guaranteed to be 100% error free. That's why you employ, presumably, a top-notch QA team to check and recheck, testing your "perfect" code in ways that perhaps you never even considered.

This is what you would expect in a terrestrial application. When the platform that your code is going to run on isn't bound to the same gravitational source that you are, you would think...you would *hope*...that the QA team might do an even more thorough job.

If this event is at all indicative of the QA efforts that NASA will be making for our return to the moon, perhaps we'd be better off staying at home.

Re:YACCS -Yet Another Computer Corkup in Space (4, Insightful)

Mayhem178 (920970) | more than 7 years ago | (#17557946)

For the uninformed, QA = Quality Assurance. A must-have for any self-respecting software model.

NASA has got it rough, has since the mid 70s. Their wildest successes are regarded as routine and hardly noticed by the public eye. Their failures, on the other hand, are spun to be the worst disasters in human history. Granted, when shuttles explode and people die, it's reasonable that the public be concerned. But it seems to me that for every 20 great things that NASA accomplishes, the media picks 1 failure (and sometimes blows that failure out of proportion) to rile the masses into a furious frenzy calling for the dissolution of NASA.

Reliability compared to what? (1)

Vellmont (569020) | more than 7 years ago | (#17557906)


Just one more example of how Computer Science isn't quite up to the reliability requirements of Space

And how many failures have happened because of an engineering mistake?

You seem to assume that there's zero failure in space for everything else, and 6 problems in.. 30 years? is some horrible record.

All information only makes sense in context. What's the failure rate of other components of the system?

Re:YACCS -Yet Another Computer Corkup in Space (5, Informative)

Fishbulb (32296) | more than 7 years ago | (#17557984)

The F-16 didn't "bounce off the equator". Before it ever flew, in simulation the computer flipped the plane over when it crossed the equator due to a bug that incorrectly handled southern latitudes. Additionally, since the computer "flip" happened instantaneously, and the F-16 can roll at much higher G forces than the pilot can take, the flip would have killed the pilot (and the F-16 would have happily continued on its way).

http://portal.acm.org/ft_gateway.cfm?id=163293&type=pdf&coll=GUIDE&dl=GUIDE&CFID=11154656&CFTOKEN=19136062 [acm.org]

Re:YACCS -Yet Another Computer Corkup in Space (1)

kfg (145172) | more than 7 years ago | (#17558072)

Aero and space are very unforgiving of human coding errors.

The sea is no pussycat either.

KFG

Re:YACCS -Yet Another Computer Corkup in Space (1)

HangingChad (677530) | more than 7 years ago | (#17558230)

And don't forget the Mars Climate Orbiter "Dirt Dart" mission (http://en.wikipedia.org/wiki/Mars_Climate_Orbiter ). Okay the operators helped by plugging in the wrong units but neither did the software catch the discrepancy in the values.

The systems aboard the spacecraft were not able to reconcile the two systems of measurement, resulting in the navigation error.

Operator error, but it would be interesting to count the accidents the software could have prevented by stopping the operator from entering the wrong values, or at least by warning them that the values don't match.

I'm not blaming the programmers. It's amazing how well things work considering the distance, the temperature extremes, the radiation, and the fact that you can't exactly bring it into the shop if something goes wrong.
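One way software can refuse mismatched units instead of silently mixing them is to carry the unit in the type. The sketch below is only an illustration of that idea and has nothing to do with the actual flight or ground code; the mismatch on Mars Climate Orbiter is widely reported as thruster impulse in pound-force seconds versus newton-seconds, and the `Impulse` class and `accumulate` function are hypothetical names invented here.

```python
from dataclasses import dataclass

# 1 pound-force second expressed in newton-seconds.
LBF_S_TO_N_S = 4.4482216152605


@dataclass(frozen=True)
class Impulse:
    """An impulse stored internally in newton-seconds (hypothetical type)."""

    newton_seconds: float

    @classmethod
    def from_newton_seconds(cls, value):
        return cls(float(value))

    @classmethod
    def from_pound_force_seconds(cls, value):
        # Convert at the boundary so everything downstream sees SI units.
        return cls(float(value) * LBF_S_TO_N_S)


def accumulate(impulses):
    """Sum impulses, rejecting bare numbers whose unit is unknown."""
    total = 0.0
    for impulse in impulses:
        if not isinstance(impulse, Impulse):
            raise TypeError("expected Impulse, got a value with no unit attached")
        total += impulse.newton_seconds
    return Impulse(total)
```

A caller has to say which unit a number is in when constructing an `Impulse`, so a bare `1.0` handed to `accumulate` fails loudly instead of being read as whatever unit the receiving code happens to assume.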

Re:YACCS -Yet Another Computer Corkup in Space (2, Insightful)

Minwee (522556) | more than 7 years ago | (#17558900)

Okay the operators helped by plugging in the wrong units but neither did the software catch the discrepancy in the values.

"On two occasions I have been asked [by members of Parliament], 'Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?' I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question."

Plus ça change, plus c'est la même chose.

Re:YACCS -Yet Another Computer Corkup in Space (1)

ChrisA90278 (905188) | more than 7 years ago | (#17559146)

The problem is the inability to test the software in a realistic environment. In fact, you CAN'T fully test software. For example, let's say you write a program to add two numbers and print the sum. It's a very simple program, but all you can do is "spot check" it with a few test numbers. I doubt testing would catch the bug in the following program:
get a value for "A"
get a value for "B"
if (A == 3248532346863247) add 3 to A
print (A+B)

What are the chances you would use 3248532346863247 as a test value? You could run the above program for 100 years and no one would likely ever find the bug. The only way to find it would be to read the code. In this case it is only four lines of code and anyone could find the error. But what if it were 1,000,000 lines? No human could ever read it, and yet having a human read it is the only way to find errors. So you break it up and have 100 humans each read 10,000 lines. What if the bug is in the subtle interaction between the parts?

The ONLY solution is to design systems that are tolerant of software bugs. There are lots of ways to do this. Put a human pilot inside the airplane or lunar lander, or build a computer to watch the computer, or simply fly your tests out over the ocean, where no one is harmed if they blow up. You just have to assume there will be bugs and that you will not be able to detect them all.
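The four-line program above is easy to turn into something you can actually run. This is a minimal sketch of it, with a random spot-check harness bolted on; the names `MAGIC`, `add` and `random_test` are made up for the example, and the input range of 0 to 2**63 is an arbitrary assumption.

```python
import random

# The one input value that triggers the planted bug.
MAGIC = 3248532346863247


def add(a, b):
    """Add two integers, except for one deliberately planted wrong branch."""
    if a == MAGIC:
        a += 3  # the bug: only reachable for a single value of a
    return a + b


def random_test(trials, seed=0):
    """Spot-check add() against a + b on random inputs; return failure count."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(trials):
        a = rng.randrange(2**63)
        b = rng.randrange(2**63)
        if add(a, b) != a + b:
            failures += 1
    return failures
```

Each random trial has roughly a 1 in 9 × 10^18 chance of picking the magic value, so spot-checking will pass essentially forever, while `add(MAGIC, 1)` is wrong every single time. Reading the code, or measuring which branches the tests never reach, finds it immediately.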

It must have been (1)

wumpus188 (657540) | more than 7 years ago | (#17556970)

.. a Sony battery.

Windows Software? (0, Offtopic)

StumpMan (176725) | more than 7 years ago | (#17556996)

Microsoft Validation required. Please click the Continue button to begin Windows validation.

Re:Windows Software? (0)

Anonymous Coward | more than 7 years ago | (#17557340)

What type of computer is the Pathfinder utilizing? Is the CPU from Intel or Motorola or custom made? How fast does it run and how much memory does it contain? Is there more than one computer on board? What programming language was utilized in the software?

The computer is a Radiation Hardened IBM Risc 6000 Single Chip (Rad6000 SC) CPU. It is the same as the IBM R6000 workstation. Lockheed Martin Federal Systems in Manassas, VA, is responsible for doing the radiation hardening of the Rad6000 SC as well as developing the complete Mars Pathfinder Flight Computer (MFC).

The MFC contains 128 MBytes of DRAM memory and runs at speeds of 2.5, 5, 10 and 20 MegaHertz. This translates to approximately 2.7, 5.5, 11, and 22 MIPS (this does vary, depending on which benchmark is being used). The code was developed using VxWorks as the real-time OS and "C" and assembly languages. It utilizes object-oriented constructs.

On the system there is only one computer to control the spacecraft throughout all phases of the mission. The Rover has a very small CPU that it uses once we have landed and the rover is released. All communications to Earth from the spacecraft and rover come through the Rad6000 SC.

Dam it (0, Redundant)

VEGETA_GT (255721) | more than 7 years ago | (#17557036)

I told you that letting a Microsoft Programmer onto the team was a bad idea.

Re:Dam it (0)

Anonymous Coward | more than 7 years ago | (#17558268)

it was an IBM'er. On an R6000.

super tuesday (1)

dcskier (1039688) | more than 7 years ago | (#17557076)

they should've waited until super tuesday before issuing the patch. everyone knows not to patch out of cycle.

MGS spaceship? (1)

DrXym (126579) | more than 7 years ago | (#17557098)

Perhaps Big Boss killed it

MGS? (1)

Rob T Firefly (844560) | more than 7 years ago | (#17557116)

Everyone knows, it was Solid Snake that destroyed Metal Gear Solid.

Re:MGS? (1)

Bendy Chief (633679) | more than 7 years ago | (#17559070)

Spacecraft? Spacecraft?! SPACECRAFT!!!

We hardware types always blame software (1)

Quiet_Desperation (858215) | more than 7 years ago | (#17557156)

It's just the way of the world. :)

Re:We hardware types always blame software (1)

daniel23 (605413) | more than 7 years ago | (#17558716)


reminds me of that old sig:

The 3 most dangerous situations:

A hardware guy with a software patch.
A user with an idea.
A coder with an electric iron.

Pilot said.... (2, Funny)

isieo (1049808) | more than 7 years ago | (#17557222)

Houston, I B.S.O.Ded

The Daily WTF (1)

shadowcode (852856) | more than 7 years ago | (#17557246)

That'd be one hell of a submission to The Daily Wtf.

Is this a sign? (4, Insightful)

Billosaur (927319) | more than 7 years ago | (#17557284)

Some expert is always trumpeting the fact that "Johnny can't program," to which many of us roll our eyes and go back to coding. But could this be a sign that the quality of the help NASA is hiring is such that these kinds of mistakes are now rampant? I mean, this could have been avoided if the code had been tested out on a full-scale mock-up of the machine, to verify that it did what it was supposed to do, before ever sending the commands to the actual machine. If anything, it's a QA failure.

Re:Is this a sign? (0)

Anonymous Coward | more than 7 years ago | (#17557698)

Chances are they did run it in a full-scale mock-up before sending to the spacecraft. NASA tends to be very picky about this sort of thing. The trouble is, you always have a human in the loop at some point.

Re:Is this a sign? (5, Insightful)

benevixit (754447) | more than 7 years ago | (#17557856)

In all fairness, writing code for a spacecraft is a lot harder than most of our Earthbound coding projects. These are custom-built machines running one-of-a-kind hardware; one can simulate components independently but it's very difficult to figure out how the hardware is going to behave up there in the vacuum. For example, consider the one function of maintaining orientation. Most spacecraft use telescopes that look for star reference points. They look for particular star configurations and use microthrusters or gyroscopes to adjust their orientation. Imagine what it would take to simulate this: a zero-gravity vacuum with a realistic star-field at focus=infinity. Any laboratory mock up is going to cost a lot more than launching a new spacecraft. And that's just one subsystem. Software upgrades at NASA go through a really rigorous quality control regimen, often requiring programmers to justify _individual_lines_ of their code to a review committee. Even then they usually won't patch noncritical bugs until the primary mission is completed. I think your point is a good one. And the key lesson is not that NASA QA sucks, it's that programming for spacecraft is _tough_. I know they are constantly investigating new ways (like more standardization, code re-use, and formal verification procedures) of improving software reliability.

Re:Is this a sign? (1)

benevixit (754447) | more than 7 years ago | (#17557928)

Yeah, some line breaks would have been welcome. Should have tested my post with a mock-up before submitting I guess.

Re:Is this a sign? (1)

Zontar_Thing_From_Ve (949321) | more than 7 years ago | (#17558294)

Some expert is always trumpeting the fact that "Johnny can't program," to which many of us roll our eyes and go back to coding. But could this be a sign that the quality of the help NASA is hiring is such that these kinds of mistakes are now rampant? I mean, this could have been avoided if the code had been tested out on a full-scale mock-up of the machine, to verify that it did what it was supposed to do, before ever sending the commands to the actual machine. If anything, it's a QA failure.

I used to work for the US government on a job I thankfully left a long time ago. I can't speak for NASA in particular as I worked for the Department of Defense. Keep in mind that things might be different at NASA. Typically, working for Uncle Sam is not as lucrative as working in private industry. There are compensating benefits though. It's just about impossible to get fired. Uncle Sam gives better vacation benefits than most American employers. Early retirement is a very realistic opportunity when working for Uncle Sam. At least where I worked, we tended to attract people who wanted to live in a small town (government salaries go further there) and people who were not very motivated for the most part. You get what you pay for. We would get the guys who graduated at the very bottom of their engineering classes because the guys above them wouldn't work for government salaries.

To be fair, NASA has cut an awful lot of corners in recent years and had some really bad management make a lot of really bad decisions. I'm still unconvinced that NASA management knows what it is doing. When I worked for the DoD, QA was a joke. It was up to the programmers to test their own code. QA is significantly better in private industry than it was at the DoD when I worked there. It could also be that the programmer's code did exactly what he wanted it to do, but he misunderstood what he was supposed to do.

Better than a metric-English conversion error (3, Insightful)

ccmay (116316) | more than 7 years ago | (#17557344)

I guess those things happen. But at least it wasn't an error converting units, like the other Mars spacecraft that was lost. That is just incredibly stupid. Glad I'm not the "engineer" who wasted thousands of man-years and hundreds of millions of taxpayers' dollars because I was too stupid or lazy to convert between meters and feet.

On a positive note, it has provided me an instructive example for when I help my teenagers with their math homework. If they say it's "almost" correct, I tell them that the guy who screwed up the Mars mission probably said the same thing.

-ccm

Re:Better than a metric-English conversion error (0)

Anonymous Coward | more than 7 years ago | (#17557582)

Stupidity is in the assumptions you've made.

The conversion issue is down to similar assumptions - the engineer who wrote the height module did it to return feet, and we can assume (eek!) that it was perfect in every way, verifiably correct at all times.

The engineer who used that module assumed that it returned values in metres, and his code could be proved to be correct in every way.

No-one needed to convert between the units; they needed to *know* the values returned would be in different units. Unfortunately, each of the teams that wrote their modules always used a particular unit, just not the same ones as each other.

An example: if your teenagers give you maths homework that they did using a different base, and you mark it as wrong, it would be you that was incorrect, because you are (what was it?) "too stupid or lazy" to convert to the base you assume everyone uses.
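To make the module-boundary point concrete, here is a minimal sketch of one way to keep the unit attached to the value so neither team has to guess. Everything in it is hypothetical: the `Length` type, `read_height`, `safe_to_descend` and the 500 m floor are invented for illustration and have nothing to do with the real flight code.

```python
from dataclasses import dataclass

METRES_PER_FOOT = 0.3048  # exact, by definition of the international foot


@dataclass(frozen=True)
class Length:
    """A length stored in one canonical unit (metres)."""
    metres: float

    @classmethod
    def from_feet(cls, feet: float) -> "Length":
        return cls(feet * METRES_PER_FOOT)

    @property
    def feet(self) -> float:
        return self.metres / METRES_PER_FOOT


def read_height() -> Length:
    # Hypothetical module from the team that thinks in feet.
    # The unit is stated once, at the point where the number is created.
    return Length.from_feet(1000.0)


def safe_to_descend(height: Length, floor_metres: float = 500.0) -> bool:
    # Hypothetical consumer from the team that thinks in metres.
    # It asks for metres explicitly instead of assuming what a bare float means.
    return height.metres > floor_metres
```

With a bare float, the consumer would have compared 1000.0 against 500.0 and happily carried on. With the typed value, 1000 ft is about 304.8 m and the check fails, which is the answer both teams actually intended.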

Re:Better than a metric-English conversion error (2, Insightful)

kfg (145172) | more than 7 years ago | (#17558272)

If you wish them to grow up to be good little engineers; ask them to define how "almost" correct it is.

KFG

Re:"almost" correct (1)

Migraineman (632203) | more than 7 years ago | (#17559370)

Funny, I have this conversation with my wife all the time. She's an elementary school teacher, and we regularly butt heads about how to deal with this. She's willing to grade a math problem as "correct" if the student demonstrated the correct process, but made a simple clerical error resulting in the wrong answer. She argues that the method is more important than a single result. Uh huhhh. So if I botch the balance in my checkbook, the bank will pat me on the head, say "that's okay," and front me the money I shouldn't have? I think not.

There aren't many "absolute truths" in this existence, but math is one of them. Your calculations are either "correct" or "not correct." "Almost correct" is someone being spineless. I'd much rather know that I botched a calculation so I can perform it correctly the next time, rather than exist in blissful ignorance. Telling me that I'm stooopid is a personal attack; telling me my calculation is incorrect is a statement of fact. Folks need to learn that the latter statement isn't necessarily a bad thing. You learn by making mistakes.

Legend has it (1)

BillGatesLoveChild (1046184) | more than 7 years ago | (#17557370)

> Apparently the software error caused ... overheated the battery and destroyed it.

Legend has it at Microsoft that if you introduce a bug that breaks the nightly build you have a stupid mascot that perches on your desk the next day. Wonder what the other NASA programmers will do to this guy?

Auto Update... (1)

MetaKey (896166) | more than 7 years ago | (#17557410)

And that is why you shut off that damn auto update thing on your PC.

Pathetic Earthlings...

So what if the battery is dead? (1)

Viol8 (599362) | more than 7 years ago | (#17557450)

Surely it can still function on its solar arrays when it's on the daylight side of the planet? Or would it drift too much out of alignment when in the dark? Or is there some other issue?

Re:So what if the battery is dead? (0, Flamebait)

iggymanz (596061) | more than 7 years ago | (#17558210)

You're our hero, as a slashdotter you've transcended to the next higher level. Not only did you not RTFA, you didn't even read the summary, that little bit about the heat radiators being pointed at the sun so the craft was cooked. I'm hoping you don't read the last part of the last sentence, maybe you'll achieve slash-enlightenment and be our Slashbuddha.

Re:So what if the battery is dead? (1)

Viol8 (599362) | more than 7 years ago | (#17558374)

Actually, it clearly states it cooked the battery, not the whole craft. So why don't you go RTFA instead of attempting some lame karma whoring, you cretin.

Re:So what if the battery is dead? (2, Insightful)

smoker2 (750216) | more than 7 years ago | (#17558522)

I expect the electronics run off the battery, and the solar arrays just charge the battery. If the battery's dead, nothing will run.

Obligatory (0, Troll)

8ball629 (963244) | more than 7 years ago | (#17557614)

Sounds like a Microsoft OS update to me.

zing! (2, Funny)

steak (145650) | more than 7 years ago | (#17557620)

that was the sound of me hitting the bullseye.

[quote]at least if something went wrong some guy at nasa could tell his grand kids that he bricked something from ~140 million miles away.[/quote]

http://slashdot.org/comments.pl?sid=214508&cid=17427542 [slashdot.org]

Rocket Science (0)

Anonymous Coward | more than 7 years ago | (#17558174)

I'm glad to hear that rocket scientists make mistakes also.

Microsoft Rules/Ruins !! (0, Redundant)

mgpandey (1049836) | more than 7 years ago | (#17558248)

Might be they upgraded it to Vista !!!

Time for a recall of bad parts (1)

Fry-kun (619632) | more than 7 years ago | (#17558674)

Does anyone else think it's about time to make a small satellite with a few "claws" to fly around our existing satellites and replace their various parts?
It could probably do repairs to the ISS as well (spacewalks should be for fun, not for work).

Safe Mode (1)

cadeon (977561) | more than 7 years ago | (#17559034)

. . . which then forced the spacecraft into safe mode.

We all know a machine in Safe Mode doesn't allow remote management.

Vampire Hackers (1)

Doc Ruby (173196) | more than 7 years ago | (#17559078)

No, everyone knows it's the Martian vampires. That SW glitch pointed the solar collectors at the Martian surface, overpowering the thin layer of blood that protects the biters from the weak rays of the Sun. We need to find out how the vampires reached the MGS to destroy it. Probably they have moles at NASA or a contractor with access to the controllers. We have to fund deployment of my SOLASER Space Debt Inc (SDI) weapon to fry them before they fry us.

Yeah right (0)

Anonymous Coward | more than 7 years ago | (#17559148)

Face it, they bricked it in a firmware update while trying to circumvent the built in DRM and they are trying to blame the software manufacturer.

I can see the ebay item now "10 y/o excellent condition spacecraft, dead battery quick fix, bricked"

An easy fix... (1)

Autonomous Crowhard (205058) | more than 7 years ago | (#17559242)

Just have a nearby human replace the dead battery and restart the machine.

Oh... right... manned exploration is a waste of money and robots are all we ever need.
