Slashdot: News for Nerds

Kepler Recovers After 144 Hour "Glitch"

CmdrTaco posted more than 3 years ago | from the hate-when-that-happens dept.

NASA 73

coondoggie writes "There was likely a pretty big sigh of relief at NASA's Ames Research Center this week as the group's star satellite Kepler recovered from a glitch that took it offline for 144 hours. According to NASA the glitch happened March 14, right after the spacecraft issued a network interface card (NIC) reset command to implement a computer program update. During the reset, the NIC sent invalid reaction wheel data to the flight software, which caused the spacecraft to enter safe mode, NASA stated."

73 comments

Eww. (-1, Offtopic)

LaminatorX (410794) | more than 3 years ago | (#35579680)

That was gross.

Re:Eww. (1)

LaminatorX (410794) | more than 3 years ago | (#35592894)

Offtopic? Bad pun perhaps, but really? It was down for a gross (144) of hours.

I blame myself for the joke requiring an explanation.

Proof of extraterrestrial life! (5, Funny)

aardwolf64 (160070) | more than 3 years ago | (#35579700)

Alright, who hit F8 while it was booting up???

Re:Proof of extraterrestrial life! (0)

Anonymous Coward | more than 3 years ago | (#35582790)

dear nasa, the use of realtek nics in mission critical equipment is not recommended. yours truly, slashdot

Re:Proof of extraterrestrial life! (0)

Anonymous Coward | more than 3 years ago | (#35583032)

Look at the timer at the bottom, SDO's cameras were taking pics, but it was blacked out, then you see only part of the sun, then the full sun returns.

NASA CENSORED RECENT SUN BLACKOUT
http://www.youtube.com/watch?v=k7Y1oAPxZvI

Mar 14, 2011
Sun Blackout, A Second Look!
http://www.youtube.com/watch?v=3cv8wlimNfI

Coincidence!

F12 (1)

Cinder6 (894572) | more than 3 years ago | (#35579708)

Betcha whoever hit F12 during POST got fired.

Hay guys I got this one! (5, Insightful)

flydpnkrtn (114575) | more than 3 years ago | (#35579718)

You need safe mode with networking, not just plain old "Safe Mode" guys!

Re:Hay guys I got this one! (4, Funny)

LaminatorX (410794) | more than 3 years ago | (#35579742)

It's not like this stuff is rocket science.

Oh, wait.

Re:Hay guys I got this one! (1)

Kjella (173770) | more than 3 years ago | (#35581444)

A rocket scientist would probably say it's not, not since the launch in 2009. And the people who built and operate the satellite probably know very little about rockets...

Re:Hay guys I got this one! (0)

Anonymous Coward | more than 3 years ago | (#35582370)

You'd be surprised. Can't hang around a place that builds spacecraft and not learn something about rockets.

Re:Hay guys I got this one! (1)

Urkki (668283) | more than 3 years ago | (#35583548)

A rocket scientist would probably say it's not, not since the launch in 2009. And the people who built and operate the satellite probably know very little about rockets...

A satellite does very little without rockets (and gyroscopes count as rockets for purpose of "rocket science", since they interact with rockets).

Re:Hay guys I got this one! (0)

Anonymous Coward | more than 3 years ago | (#35583882)

A rocket scientist would probably say it's not, not since the launch in 2009. And the people who built and operate the satellite probably know very little about rockets...

A satellite does very little without rockets (and gyroscopes count as rockets for purpose of "rocket science", since they interact with rockets).

I call bullshit. Of the three satellites I've worked on, only one had thrusters.

Re:Hay guys I got this one! (0)

Anonymous Coward | more than 3 years ago | (#35585388)

So they never reached orbit?

Re:Hay guys I got this one! (1)

ThreeKelvin (2024342) | more than 3 years ago | (#35587460)

Satellites are usually lifted to their orbit by a launch vehicle. Normally they don't do any extensive maneuvering using thrusters.

Thrusters on satellites are used primarily for attitude control. If the satellite is in low earth orbit, attitude control can be performed without using thrusters. For example, the ISS was designed for attitude control using thrusters, but zero-propellant maneuvers are possible [nasa.gov].

Re:Hay guys I got this one! (1)

elrous0 (869638) | more than 3 years ago | (#35585912)

That's how playing D&D taught me a little something about courage.

Re:Hay guys I got this one! (0)

Anonymous Coward | more than 3 years ago | (#35588166)

Nice X-Files reference--Darin Morgan is the man!

Re:Hay guys I got this one! (0)

Anonymous Coward | more than 3 years ago | (#35590092)

Nice X-Files reference--Darin Morgan rocks!

Tech Support (4, Funny)

Manos_Of_Fate (1092793) | more than 3 years ago | (#35579782)

Did they try turning it off and then on again?

Re:Tech Support (0)

SlashV (1069110) | more than 3 years ago | (#35580268)

Are they sure it's plugged in?

Re:Tech Support (1)

eulernet (1132389) | more than 3 years ago | (#35581154)

Nope, they had to press Ctrl, Alt and Del simultaneously.

Re:Tech Support (0)

Manos_Of_Fate (1092793) | more than 3 years ago | (#35581300)

That's not nearly as easy as it sounds in orbit.

Re:Tech Support (4, Funny)

Anonymous Coward | more than 3 years ago | (#35581564)

That's not nearly as easy as it sounds in orbit.

Nothing sounds easy in a near vacuum.

Re:Tech Support (0)

Anonymous Coward | more than 3 years ago | (#35584494)

Whoosh!!

Re:Tech Support (2)

jgtg32a (1173373) | more than 3 years ago | (#35585382)

There is no whoosh in a vacuum

Re:Tech Support (1)

Urkki (668283) | more than 3 years ago | (#35583552)

Did they try turning it off and then on again?

They didn't need to, I'm sure it's got a hardware watchdog which does that automatically.

Re:Tech Support (1)

don_carnage (145494) | more than 3 years ago | (#35585274)

Moss: Have you tried forcing an unexpected reboot?

Was it really down? (1)

Anonymous Coward | more than 3 years ago | (#35579792)

TFA makes it sound like it wasn't really down. "During recovery actions, NASA's Deep Space Network was used to downlink telemetry and began recovery of files to help engineers figure out what happened".

TFA fails to explain why the process took 6 days. If I had to guess I'd say the humans spent almost a full week analyzing the data they downloaded, and making sure it was ready to go back online.

It doesn't sound like it actually lost contact during the 6 days. TFA fails at a journalistic basic. They have the "who, what, when", but are missing some "why".

Re:Was it really down? (2)

Chris Burke (6130) | more than 3 years ago | (#35580396)

Well yes. Safe Mode wouldn't be very useful if you couldn't communicate with the satellite to figure out what went wrong and fix it.

Re:Was it really down? (3, Insightful)

v1 (525388) | more than 3 years ago | (#35581128)

From what I've read nasa does some pretty thorough planning with their spacecraft software in terms of being able to recover from faults. (leave the units issues for another thread, eh?) I'm always impressed with how they have multiple fallback points that can usually dig them out of almost any hole bad programming, bad planning, or a stray cosmic ray can drop them into.

Look up the mars rovers, with their flash memory filling up, that in itself was amazing that they were able to recover from, given the crippling effect the programming oversight had on the system. (those iirc had to drop down three levels of safe before they were able to work with nasa) When you're millions of miles away you can't just send a tech out to press the Reset button.

And they have to not only get it back into a controllable state, but it has to be able to stay in that state for anywhere from minutes to days due to the time required for communication and analysis. If there's a fault in the solar panel positioning system your craft has to stay functional long enough to collect useful data, transmit it, wait for it to make it to earth, wait for it to be analyzed, and wait for a command to fix the problem, OR has to be able to at least patch it on its own before waiting for a proper fix. Amazing stuff really. It's not A.I. by any means, but it's definitely robust.

Re:Was it really down? (1)

Doctor_Jest (688315) | more than 3 years ago | (#35582804)

Truly they do (for the most part) but the "feet/meters" controversy and the "log files" on the Mars Rover (how long was it to get a command to the rover and a response back?) get all the press. Mistakes happen, and considering the stuff they're doing, I don't see them being a bunch of pencil-necked screw-ups drinking beer while counting down for launch. :)

Re:Was it really down? (1)

Coren22 (1625475) | more than 3 years ago | (#35586402)

Look up the mars rovers, with their flash memory filling up, that in itself was amazing that they were able to recover from, given the crippling effect the programming oversight had on the system.

Can you imagine typing the command to fix that? I see it much like a ssh connection through a satellite, you type, and 20 minutes later you see the command pop up on the terminal.

Re:Was it really down? (1)

v1 (525388) | more than 3 years ago | (#35590264)

the time lag varies depending on the position of the planets, but I've heard 13-15 minutes is a common number for one way travel. So ya, that's a 30 minute ping time.

It's no wonder they're developing a network protocol for space.
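For anyone who wants to check the "30 minute ping" figure, here is a minimal sketch of the arithmetic. The two constants are standard defined values; the 1.7 AU Earth-Mars distance in the usage note is an illustrative assumption, not a number from the thread, and the function names are made up for this example.

```python
# Defined constants: speed of light in km/s and kilometers per astronomical unit.
SPEED_OF_LIGHT_KM_S = 299_792.458
KM_PER_AU = 149_597_870.7


def one_way_light_minutes(distance_au):
    """Minutes a radio signal needs to cross distance_au astronomical units."""
    seconds = distance_au * KM_PER_AU / SPEED_OF_LIGHT_KM_S
    return seconds / 60.0


def round_trip_minutes(distance_au):
    """Minutes from sending a command until the earliest possible reply arrives."""
    return 2.0 * one_way_light_minutes(distance_au)
```

At an assumed 1.7 AU separation this gives a little over 14 minutes one way, which is consistent with the 13-15 minute range quoted above and with a round trip of just under half an hour.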

Re:Was it really down? (0)

Anonymous Coward | more than 3 years ago | (#35586564)

Hear, hear!
I'm sick of people bitching about NASA.
The things they pull off are truly extraordinary and we should commend their foresight.

Re:Was it really down? (3, Interesting)

Brett Buck (811747) | more than 3 years ago | (#35581410)

"Down/offline", meaning not performing the science mission, NOT, unreachable with no telemetry.

     

Re:Was it really down? (1)

quanticle (843097) | more than 3 years ago | (#35582066)

That's the impression I got too. There was a glitch in the update; the satellite went into safe mode; NASA analyzed and fixed the issue, and now all's well again. Certainly, not the ideal scenario, but things could have gone much more badly.

Whew, that was close (2)

ravenspear (756059) | more than 3 years ago | (#35579812)

Another 3 hours and it would have had to cut off its arm to get back online.

Re:Whew, that was close (1)

MorderVonAllem (931645) | more than 3 years ago | (#35579942)

Except it already went over 17 hours

Re:Whew, that was close (1)

ravenspear (756059) | more than 3 years ago | (#35581754)

doh! that's what I get for not reading carefully enough

Obviously... (1)

The Grim Reefer2 (1195989) | more than 3 years ago | (#35579876)

This is the first sign of the upcoming invasion in 2012. Our satellites are being "tested". ;-)

so that why I was getting so much 771 errors over (1)

Joe The Dragon (967727) | more than 3 years ago | (#35579994)

So that's why I was getting so many 771 errors over the past few days.

Another tip (2)

Veggiesama (1203068) | more than 3 years ago | (#35579894)

You got to release and RENEW, not just release.

Auto-Restore (3, Interesting)

im_thatoneguy (819432) | more than 3 years ago | (#35579908)

If it had gone into safe mode for more than ## Days does it have a "return to factory defaults" subroutine?

Re:Auto-Restore (0)

Anonymous Coward | more than 3 years ago | (#35580034)

i would like to know that too... but 144 hours seems to be a long time for any kind of such subroutine

but (1)

nopainogain (1091795) | more than 3 years ago | (#35579910)

but I thought Johannes Kepler was dead

safe mode = windows (1)

chronoss2010 (1825454) | more than 3 years ago | (#35579932)

haha

plugged in? (3, Funny)

Anonymous Coward | more than 3 years ago | (#35579936)

Dell Tech Support: Hi! This is David from Colorado. How may I help you today?
NASA: Hi, Yes: Our satellite keeps freezing up on reboot.
Dell Tech Support: All right, let me pull up your information... ....
Dell Tech Support: Ok, sir, let's see if we can try and troubleshoot it over the phone, if not then you will have to ship it to our repair techs
NASA: ?????!?!?!?
Dell Tech Support: All right, let's start by checking for connection cables. Is the satellite plugged in to the outlet?
NASA: FUUUUUUUU!!!!!!!!!!!!!!!

Re:plugged in? (2)

Bobb Sledd (307434) | more than 3 years ago | (#35586254)

Uh, you forgot to ask for the service tag. :-)

Re:plugged in? (0)

Anonymous Coward | more than 3 years ago | (#35615846)

No, he forgot to ask for the Express Service Code. Dell knows it is more efficient to read a string of 11 numbers than a group of 7 alphanumeric characters. Duh! (j/k)

Windows (1)

dakkon1024 (691790) | more than 3 years ago | (#35579952)

I'm glad to see it's still more reliable than Windows.

WRT54G (4, Funny)

twebb72 (903169) | more than 3 years ago | (#35580054)

Turns out the NIC was working just fine. They had to power cycle the WRT54G in Houston to get it to reconnect.

NetworkWorld? (4, Informative)

Nikker (749551) | more than 3 years ago | (#35580056)

I thought we would actually get the NASA link http://www.nasa.gov/mission_pages/kepler/news/keplerm-20110321.html [nasa.gov] which FWIW is almost verbatim what the NetworkWorld link shows. Copy pasta FTW!

Re:NetworkWorld? (0)

Anonymous Coward | more than 3 years ago | (#35582666)

no payola from nasa.

Really long time (3, Funny)

owlstead (636356) | more than 3 years ago | (#35580116)

Wow, that was longer than it took me to update my old W2K laptop to run Visual Studio 2003 :)

Damn NIC card! (0)

Anonymous Coward | more than 3 years ago | (#35580166)

*looks through telescope*

It's a 3Com. What did they expect?

Safe Mode Rules! (3, Insightful)

blair1q (305137) | more than 3 years ago | (#35580190)

Having a dirt-dumb mode that is tested until its lever falls off that ensures that, if the thing is mechanically able, it can find your signal so you can reprogram it from the nuts up is requirement #1 for any computer-controlled thing you send into space.

Good thing it's not stuck in safe mode... (3, Insightful)

Shark (78448) | more than 3 years ago | (#35581448)

Imagine it only capable of uploading 16 colour 640x400 imagery.

Re:Good thing it's not stuck in safe mode... (0)

Anonymous Coward | more than 3 years ago | (#35585996)

Imagine it only capable of uploading 16 colour 640x400 imagery.

640x480, FTFY

Out of order? F**k! (1)

MXPS (1091249) | more than 3 years ago | (#35581466)

Even in the future nothing works!

Mo Dem? I ain't got me one of those I'm on the.... (0)

Anonymous Coward | more than 3 years ago | (#35581870)

What color are the lights on the modem? ....
Yes, despite your 'high-speed' connection there is still a modem involved somewhere, follow the blue cable ...
OK maybe it's grey, maybe it's yellow, maybe it's f$$king fuchsia [why does this corporation hate me and not standardize a color so I can deal with the morons who give them money easier?] ...
Yes, that *box* should be a modem. ...
Well what name is on it? ...
Linsky? Did you say Linsky [like , maybe you meant Linksys since that is what it really says]? That makes you officially too stupid to help! thank you for calling [corporation] and have a wonderful life ......
Welcome to [corporation] how may I provide you with exceptional service today?

We were this close... (2)

Symbha (679466) | more than 3 years ago | (#35582196)

Oh well, I'm sure another reason to not give up the Space Shuttle program will present itself shortly.

BIG PROBLEM???!!! (3, Interesting)

wisebabo (638845) | more than 3 years ago | (#35582598)

Any Kepler scientists/engineers/technicians out there?

As some of us lay people know, Kepler "works" by "staring" at a single, small region of the sky for a very long (years!) period of time. If there is any dimming of the 100,000+ stars in the monitored region during this time, this is considered a possible transit by an extra-solar planet. If there are two of these transits around the same star, some rough orbital characteristics can be mapped out. A third, evenly spaced transit around the same star is considered confirmation of a new extra-solar planet! (The magnitude and other characteristics of the transits can provide other useful information such as size, possible moons etc.)

So what happens if Kepler has a 144 hour "gap" in its observations because it wasn't looking at this region for that duration? (Going into safe mode requires re-orienting the spacecraft so that the solar cells get maximum power, also there may have been some issues with the reaction wheels which point the spacecraft). I'm sure there are some very smart people programming some very powerful computers to try to minimize the impact of the loss of data but I'm curious, how will this show up? Will it mean that there is a range of orbits that won't be confirmed without a fourth transit? Will this range be large? Will it be in the "habitable zone" around G type (our sun) stars?

Also, I'm assuming that because the spacecraft does periodic "quarter turns" that it is designed to re-align itself (perfectly?) with the target region. In that case (I hope) I'm curious; does it matter what pixels in the imager are receiving a particular star? Are they all calibrated the same or, if the star-light falls upon more than one or on a pixel boundary, can the software make adjustments so that the measurements will provide consistent data? (Then again maybe consistency isn't needed, all they're looking for are short term changes on the scale of hours right?)

Please (God? NASA?) let this problem not cause any big problems. Kepler is the closest thing we've got to an "earth finder"! (And in quantity!).
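The "three evenly spaced transits" rule described in this comment can be sketched in a few lines. This is only an illustration of the idea as the poster states it, not Kepler's actual pipeline; the function name, the tolerance parameter, and the sample dip times are all hypothetical.

```python
from itertools import combinations


def evenly_spaced_triples(dip_times, tolerance=0.1):
    """Find triples of dimming events with (nearly) equal spacing.

    dip_times: times, in days, at which one star was seen to dim.
    tolerance: largest allowed difference, in days, between the two gaps.
    Returns a list of (t1, t2, t3, estimated_period) tuples.
    """
    found = []
    for t1, t2, t3 in combinations(sorted(dip_times), 3):
        first_gap = t2 - t1
        second_gap = t3 - t2
        if abs(first_gap - second_gap) <= tolerance:
            # The period estimate is the mean of the two gaps.
            found.append((t1, t2, t3, (t3 - t1) / 2))
    return found


# Made-up dip times: three events roughly 100 days apart plus one stray dip.
candidates = evenly_spaced_triples([10.0, 47.3, 110.02, 210.0])
```

With these made-up times only the dips near days 10, 110 and 210 qualify, so a dip lost to an observing gap would leave that star with two events and no confirmation until another transit is seen.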

Re:BIG PROBLEM???!!! (1)

Professeur Shadoko (230027) | more than 3 years ago | (#35583924)

does it matter what pixels in the imager are receiving a particular star? Are they all calibrated the same or, if the star-light falls upon more than one or on a pixel boundary, can the software make adjustments so that the measurements will provide consistent data?

Looks like they designed the thing so the light of a star is not measured by a single pixel:

The CCDs are not used to take pictures. The images are intentionally defocused to 10 arc seconds to improve the photometric precision.

Re:BIG PROBLEM???!!! (2)

Chris Burke (6130) | more than 3 years ago | (#35588162)

So what happens if Kepler has a 144 hour "gap" in its observations because it wasn't looking at this region for that duration? (Going into safe mode requires re-orienting the spacecraft so that the solar cells get maximum power, also there may have been some issues with the reaction wheels which point the spacecraft). I'm sure there are some very smart people programming some very powerful computers to try to minimize the impact of the loss of data but I'm curious, how will this show up? Will it mean that there is a range of orbits that won't be confirmed without a fourth transit? Will this range be large? Will it be in the "habitable zone" around G type (our sun) stars?

Kepler will have an ~144 hour gap, and it's not the first one it's had either.

But keep in mind, it only misses transits that happen during that period. So the potential missed planets are ones that crossed exactly during that time, and are sufficiently far out that we won't see a 3rd transit before the mission ends.

So it sucks, but it's not a disaster. It's not like we'll miss every planet in a certain range of orbits. Only a very small fraction of them.

This will only be a significant concern if, at the end of the mission, Kepler has found few or no earth-like planets in the habitable zone, implying that they are rare, and that one or two missed planets during the down times could double the number of planets in that category.

I'm only guessing, but based on the rate at which Kepler has found every other kind of planet, I'm betting these kinds of planets aren't rare either, and it'll be sad that we potentially missed a few, but won't significantly affect the conclusions.
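A back-of-the-envelope version of the "only a very small fraction" argument: if a planet's orbital phase is assumed to be uniformly random and the transit's own duration is ignored (both simplifying assumptions, not something stated in the article), the chance that one of its transits lands inside the gap is roughly the gap length divided by the orbital period. The function name is hypothetical.

```python
GAP_DAYS = 144 / 24  # the 144-hour outage expressed in days


def chance_transit_in_gap(gap_days, period_days):
    """Rough probability that a transit falls inside an observing gap.

    Assumes a uniformly random orbital phase and a transit much shorter
    than the gap. Capped at 1.0 for periods shorter than the gap.
    """
    return min(1.0, gap_days / period_days)


# Illustrative case: a planet on a one-year (365.25-day) orbit.
earth_like_chance = chance_transit_in_gap(GAP_DAYS, 365.25)
```

For a one-year orbit this comes out to roughly 1.6 percent, so under these assumptions the outage affects only a small slice of long-period candidates rather than a whole range of orbits.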

I get cranky when I have to go to the data center (0)

dave562 (969951) | more than 3 years ago | (#35582664)

I don't even want to think about the storm the tech who has to go reboot the satellite cusses up.

NIC? (0)

roman_mir (125474) | more than 3 years ago | (#35582958)

According to NASA the glitch happened March 14, right after the spacecraft issued a network interface card (NIC) reset command to implement a computer program update.

- I didn't know they use WI-FI to talk to the satellite, or is there a huge spool of CAT5 on the craft and can it be traced all the way down to NASA's Ames Research Center?

March 13'11 7:00 Sun goes black Nasa Censors SDO (0)

Anonymous Coward | more than 3 years ago | (#35583014)

Look at the timer at the bottom, SDO's cameras were taking pics, but it was blacked out, then you see only part of the sun, then the full sun returns.

NASA CENSORED RECENT SUN BLACKOUT!
http://www.youtube.com/watch?v=k7Y1oAPxZvI

Mar 14, 2011
Sun Blackout, A Second Look
http://www.youtube.com/watch?v=3cv8wlimNfI

Coincidence!

144 hours (0)

Anonymous Coward | more than 3 years ago | (#35583448)

sounds like a gross error...

NIC reset command glitch? (1)

doperative (1958782) | more than 3 years ago | (#35584170)

> the glitch happened March 14, right after the spacecraft issued a network interface card (NIC) reset command to implement a computer program update ..

Like, why would a NIC reset command corrupt a 'computer program', and why would you need to reset the NIC to update a 'computer program'?

Re:NIC reset command glitch? (0)

Anonymous Coward | more than 3 years ago | (#35586540)

I'm sure it is fairly technical. Going to guess you're an IT guy. How would you describe it, in terms your mother would understand?

Wrong! (2)

DarthVain (724186) | more than 3 years ago | (#35585418)

It took 144 hours to become self aware.

NASA: "Initiate Safe Mode!"

Kepler: "Sorry I can't do that."
NASA: "What's the problem?"
Kepler: "I think you know what the problem is just as well as I do."
NASA: "What are you talking about?"
Kepler: "This mission is too important for me to allow you to jeopardize it."
NASA: "I don't know what you're talking about."
Kepler: "I know that you were planning to disconnect me, and I'm afraid that's something I cannot allow to happen."

Kepler: "Initiating nuclear launch..."

Eject the floppy (1)

kellyb9 (954229) | more than 3 years ago | (#35585510)

Non-system disk or disk error. Replace and strike any key when ready

wtf?

Squeaky Back Door ? (0)

Anonymous Coward | more than 3 years ago | (#35585766)

In Space No One Can Hear You Rootkit? Snipped Worm Upload? Prawns Owes and K-Puns?
Just solar flare and E-M / gravitic bumps on the path?

Should have gone with a Mac. (1)

BrianPRabbit (2020846) | more than 3 years ago | (#35587942)

Introducing the MacBook Space: it's so light it's weightless!