Ten years after deployment, flash memory needs to be reformatted due to increasing error rates.
At least that's what NASA is finding on the Opportunity rover, which has been running on the surface of Mars since 2004. NASA is planning another long-distance maintenance operation that will require reformatting the flash storage. They are old hands at this, having done the same on the Spirit rover five years ago.
Opportunity has "reset" itself a dozen times this month, each time taking a day or two to fully recover. This has prompted the Jet Propulsion Laboratory to plan a reformat of the flash memory, which is used to store images and data pending transmission to Earth:
"Worn-out cells in the flash memory are the leading suspect in causing these resets," said John Callas of NASA's Jet Propulsion Laboratory, Pasadena, California, project manager for NASA's Mars Exploration Rover Project. "The flash reformatting is a low-risk process, as critical sequences and flight software are stored elsewhere in other non-volatile memory on the rover."
Although similar to the flash storage in your cell phone, the rover's flash is both more primitive and more rugged: it was designed and shielded to survive the radiation of space flight.
The project landed twin rovers Spirit and Opportunity on Mars in early 2004 to begin missions planned to last only three months. Spirit worked for six years, and Opportunity is still active.
(Score: 2, Informative) by Anonymous Coward on Sunday August 31 2014, @09:17AM
My guess is that the reset is a hard fault triggered when a detected but unrecoverable error occurs during a read, or more likely a write, because flash memory almost always fails on write: the write just doesn't "take", but you can still read the data that was last successfully written.
It is really common for an OS to deliberately crash (e.g. kernel panic) when it detects an unrecoverable error. The principle is that unrecoverable errors put the system into an undefined state, so it is better to do a full reset than to keep operating when you know there is a problem that could result in further data corruption.
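To make that guess concrete, here is a rough Python sketch of the idea: a toy flash device where writes to worn blocks silently fail to "take", a write that is verified by reading it back, and a fail-stop policy that gives up and signals a reset when verification keeps failing. Everything here (`SimulatedFlash`, `verified_write`, `store_or_reset`, the retry count) is made up for illustration and says nothing about how the rover's flight software actually works.

```python
class UnrecoverableFlashError(Exception):
    """Raised when a write cannot be verified even after retries."""


class SimulatedFlash:
    """Toy flash device (hypothetical): worn blocks silently ignore writes."""

    def __init__(self, num_blocks, worn_blocks=()):
        self.blocks = [b""] * num_blocks
        self.worn = set(worn_blocks)

    def write(self, index, data):
        # A worn block keeps its old contents; the write is silently dropped.
        if index not in self.worn:
            self.blocks[index] = data

    def read(self, index):
        # Reads still return the last successfully written data.
        return self.blocks[index]


def verified_write(flash, index, data, retries=3):
    """Write, read back, and compare; raise after `retries` failed attempts."""
    for _ in range(retries):
        flash.write(index, data)
        if flash.read(index) == data:
            return
    raise UnrecoverableFlashError(f"block {index} failed write verification")


def store_or_reset(flash, index, data):
    """Fail-stop policy: signal a reset instead of running in an unknown state."""
    try:
        verified_write(flash, index, data)
        return "ok"
    except UnrecoverableFlashError:
        # A real system would reboot here; this sketch only reports the decision.
        return "reset"


if __name__ == "__main__":
    flash = SimulatedFlash(num_blocks=4, worn_blocks={2})
    flash.blocks[2] = b"old image"             # data written before the block wore out
    print(store_or_reset(flash, 0, b"new"))    # healthy block
    print(store_or_reset(flash, 2, b"new"))    # worn block
    print(flash.read(2))                       # old data is still readable
```

In this toy model, reformatting would amount to rebuilding the list of usable blocks so that worn ones are never chosen for new writes, which is presumably why it would stop the resets.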