SoylentNews is people

posted by LaminatorX on Friday January 02 2015, @12:30PM
from the rust-never-sleeps dept.

The BBC reports that the ten-year-old rover Opportunity, which was designed to last three months, is suffering from Alzheimer's. NASA calls it "amnesia", but since Opportunity is so old...

At any rate, parts of its non-volatile memory are failing. NASA is working on a hack to make it disregard the bad part of the memory.

"It's like you have an aging parent, that is otherwise in good health — maybe they go for a little jog every day, play tennis each day — but you never know, they could have a massive stroke right in the middle of the night," he said.

"So we're always cautious that something could happen."

This discussion has been archived. No new comments can be posted.
  • (Score: 2) by wonkey_monkey on Friday January 02 2015, @12:55PM

    by wonkey_monkey (279) on Friday January 02 2015, @12:55PM (#131000) Homepage

    NASA calls it "amnesia", but since Opportunity is so old...

    ...did you lose your train of thought? You should get your non-volatile memory looked at.

    --
    systemd is Roko's Basilisk
  • (Score: 1) by MichaelDavidCrawford on Friday January 02 2015, @02:31PM

    by MichaelDavidCrawford (2339) Subscriber Badge <mdcrawford@gmail.com> on Friday January 02 2015, @02:31PM (#131007) Homepage Journal

    I've been meaning to do this for eons. It would permit the use of flaky memory modules that would otherwise have to be discarded.

    Most often bad ram is at specific addresses. You could do a memory test just before the kernel starts then build a list of addresses to avoid.

    Some memory tests take a long time; if your memory mostly works but is just a little off-spec you have to try lots of tests before you find the problem. So it would be best if you could save the list of bad addresses in a nonvolatile location, then restore it for the next boot.
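
    A minimal sketch of that idea, run against a simulated memory rather than real hardware. The FlakyMemory class, its stuck-bit fault model, the four test patterns, and the JSON file standing in for the nonvolatile location are all assumptions made for illustration; a real pre-kernel test would be far more thorough.

```python
import json


def find_bad_addresses(read, write, size, patterns=(0x00, 0xFF, 0xAA, 0x55)):
    """Write each pattern to every address; collect addresses that read back wrong."""
    bad = set()
    for pattern in patterns:
        for addr in range(size):
            write(addr, pattern)
            if read(addr) != pattern:
                bad.add(addr)
    return sorted(bad)


def save_bad_list(path, bad):
    """Persist the list (a JSON file stands in for the nonvolatile location)."""
    with open(path, "w") as f:
        json.dump(bad, f)


def load_bad_list(path):
    """Return the saved list, or None if no earlier test has stored one."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return None


class FlakyMemory:
    """Simulated RAM where some addresses have bits stuck at 1 (hypothetical fault model)."""

    def __init__(self, size, stuck):
        self.cells = bytearray(size)
        self.stuck = dict(stuck)  # address -> mask of bits stuck at 1

    def write(self, addr, value):
        self.cells[addr] = value | self.stuck.get(addr, 0)

    def read(self, addr):
        return self.cells[addr]


def boot(mem, size, path):
    """Reuse the saved list if there is one; otherwise run the slow test and save it."""
    bad = load_bad_list(path)
    if bad is None:
        bad = find_bad_addresses(mem.read, mem.write, size)
        save_bad_list(path, bad)
    return bad
```

    The first call to boot() pays for the full scan; later calls only read the saved list, which is the point of keeping it across boots.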

    --
    Yes I Have No Bananas. [gofundme.com]
    • (Score: 2, Informative) by Anonymous Coward on Friday January 02 2015, @04:05PM

      by Anonymous Coward on Friday January 02 2015, @04:05PM (#131021)

      You mean something like memtest86+ [memtest.org] followed by BadRAM [vanrein.org] ?

      • (Score: 3, Interesting) by bzipitidoo on Friday January 02 2015, @06:39PM

        by bzipitidoo (4388) on Friday January 02 2015, @06:39PM (#131054) Journal

        Last time I used memtest, it found a whole bunch of bad memory! I was thinking about where to get replacement RAM, but something seemed off about the memtest results. There was a little too much bad memory, and it didn't detect anything wrong until reaching pass 8 or so, then everything was bad. So I checked further.

        Turned out, Ubuntu (version 12, I think) had introduced a bug by not compiling memtest correctly for the 64bit version. There was nothing wrong with the memory after all, as I saw when running memtest from other distros.

      • (Score: 1) by MichaelDavidCrawford on Saturday January 03 2015, @06:30AM

        by MichaelDavidCrawford (2339) Subscriber Badge <mdcrawford@gmail.com> on Saturday January 03 2015, @06:30AM (#131215) Homepage Journal

        I mean to do many things.

        --
        Yes I Have No Bananas. [gofundme.com]
    • (Score: 2, Funny) by rcamera on Friday January 02 2015, @04:39PM

      by rcamera (2360) on Friday January 02 2015, @04:39PM (#131028) Homepage Journal

      Thanks for your amazing idea! We'll start working on a systemd module to do exactly this! We'll have the new and improved init kick off a second process (pid2?!) that the new systemd memory management module would assign bad physical memory addresses to, ensuring that the bad memory is never actually used by a real process. And since the pid2 does nothing except write status updates to a binary log in a tight loop, the bad memory will never be accessed by the process!

      --
      /* no comment */
    • (Score: 2) by Immerman on Friday January 02 2015, @04:50PM

      by Immerman (3985) on Friday January 02 2015, @04:50PM (#131034)

      With a little extra engineering you could even use parity RAM and log all detected errors in order to update the list in real time as specific addresses begin to show substantially higher-than-baseline error rates. Assuming a gradual failure rate and ECC RAM there might even be a good chance of gracefully retiring failing memory cells in a live system without ever generating a software error.
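
      A rough sketch of that bookkeeping, assuming the hardware reports the address of each corrected error. The EccErrorTracker class, the 4096-byte page size, and the fixed threshold (a simple stand-in for "substantially higher than baseline") are hypothetical choices, not any real kernel interface.

```python
from collections import Counter


class EccErrorTracker:
    """Count corrected ECC errors per page and retire pages that cross a threshold."""

    def __init__(self, page_size=4096, threshold=3):
        self.page_size = page_size
        self.threshold = threshold   # assumed stand-in for "well above baseline"
        self.corrected = Counter()   # page number -> corrected-error count
        self.retired = set()         # pages that must no longer be allocated

    def record_corrected_error(self, address):
        """Log one corrected error; return True if this event retires the page."""
        page = address // self.page_size
        if page in self.retired:
            return False
        self.corrected[page] += 1
        if self.corrected[page] >= self.threshold:
            self.retired.add(page)
            return True
        return False
```

      Because ECC has already corrected each logged error, the page can be retired before any software ever sees bad data, which is the graceful case described above.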

    • (Score: 2) by FatPhil on Friday January 02 2015, @05:19PM

      by FatPhil (863) <pc-soylentNO@SPAMasdf.fi> on Friday January 02 2015, @05:19PM (#131043) Homepage
      "save the list of bad addresses in a nonvolatile location" seems to overlook the fact that "its non-volatile memory is failing".

      (Of course, you could put uniquely-id'd ECC'ed records redundantly all over NVRAM, and apply all the unique ones that checksum correctly, but that comes with quite an overhead. Having a data source where you can't trust *anything* you receive is quite an interesting programming challenge. The closest I can think of that I've had to fight is when you're using a cloud service storage back end, and you can't necessarily guarantee the records you request will appear at all. But normally you "fix" that by just trying again later. Corrupt data is way worse than no data.)
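
      One way the redundant-record scheme might look, as a sketch only. It swaps in CRC32 (error detection, not true error correction) for the ECC mentioned above, and the 32-byte slot layout, SLOT_SIZE, and the function names are all assumptions made for illustration.

```python
import struct
import zlib

SLOT_SIZE = 32                        # hypothetical fixed slot size in bytes
PAYLOAD_MAX = SLOT_SIZE - 4 - 2 - 4   # minus record id, length field, CRC32


def encode_slot(record_id, payload):
    """Pack id, length, zero-padded payload, and a CRC32 over all of them."""
    if len(payload) > PAYLOAD_MAX:
        raise ValueError("payload too large for one slot")
    body = struct.pack("<IH", record_id, len(payload)) + payload.ljust(PAYLOAD_MAX, b"\0")
    return body + struct.pack("<I", zlib.crc32(body))


def decode_slot(slot):
    """Return (record_id, payload) if the slot checksums correctly, else None."""
    if len(slot) != SLOT_SIZE:
        return None
    body = slot[:-4]
    (crc,) = struct.unpack("<I", slot[-4:])
    if zlib.crc32(body) != crc:
        return None
    record_id, length = struct.unpack("<IH", body[:6])
    if length > PAYLOAD_MAX:
        return None
    return record_id, body[6:6 + length]


def recover_records(nvram):
    """Scan every slot; keep the first valid copy of each unique record id."""
    records = {}
    for offset in range(0, len(nvram) - SLOT_SIZE + 1, SLOT_SIZE):
        decoded = decode_slot(bytes(nvram[offset:offset + SLOT_SIZE]))
        if decoded is not None:
            record_id, payload = decoded
            records.setdefault(record_id, payload)
    return records
```

      The overhead is visible even here: every record costs ten bytes of framing per copy, times however many copies are scattered around.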
      --
      Great minds discuss ideas; average minds discuss events; small minds discuss people; the smallest discuss themselves
      • (Score: 2) by frojack on Friday January 02 2015, @08:48PM

        by frojack (1554) on Friday January 02 2015, @08:48PM (#131076) Journal

        In this case, there is one bank of nvram that is failing, and the others look to be ok.

        So the tricky bit is storing (probably redundant) sets of nvram space that can't be used in nvram and somehow causing the boot up process to remember to load this information before attempting to use any part of nvram.

        They have no spinning storage, so unless they can keep some portion of the nvram flash memory operational they pretty much lose the vehicle.

        If the nvram is used as file system based storage (like any 'nix OS might use it) you could simply allocate those areas as files that can't be removed. On the other hand, if they use the nvram mostly in malloc calls they are going to have to remember and reestablish allocations for the bad memory in a table somewhere before they attempt to use any of it, perhaps using mmap with MAP_FIXED flag to cover each bad part.
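
        The "reestablish allocations for the bad memory before using any of it" step can be illustrated with a toy allocator. This is a simulation only: BlockAllocator and its block granularity are hypothetical, and it does not call mmap or touch real addresses.

```python
class BlockAllocator:
    """Toy fixed-size block allocator over a simulated NVRAM region."""

    BAD = "BAD"  # marker for blocks pinned as unusable

    def __init__(self, total_blocks, bad_blocks=()):
        self.owner = [None] * total_blocks
        # Reserve the known-bad blocks first, before any real allocation happens.
        for block in bad_blocks:
            self.owner[block] = self.BAD

    def alloc(self, count, owner):
        """First-fit search for `count` contiguous free blocks; return start or None."""
        if count < 1:
            raise ValueError("count must be at least 1")
        run = 0
        for i, current in enumerate(self.owner):
            run = run + 1 if current is None else 0
            if run == count:
                start = i - count + 1
                for j in range(start, i + 1):
                    self.owner[j] = owner
                return start
        return None

    def free(self, start, count):
        """Release blocks, refusing to release any that are pinned as bad."""
        for j in range(start, start + count):
            if self.owner[j] == self.BAD:
                raise ValueError("bad blocks stay reserved")
        for j in range(start, start + count):
            self.owner[j] = None
```

        The bad-block table passed to the constructor plays the role of the table that has to be reloaded at every boot.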

        --
        No, you are mistaken. I've always had this sig.
  • (Score: 2) by datapharmer on Friday January 02 2015, @04:06PM

    by datapharmer (2702) on Friday January 02 2015, @04:06PM (#131022)

    They've got wireless on the thing, just switch to pxe mode and call it done... latency shmatency, problem solved!

  • (Score: 4, Informative) by cmn32480 on Friday January 02 2015, @04:06PM

    by cmn32480 (443) <cmn32480NO@SPAMgmail.com> on Friday January 02 2015, @04:06PM (#131023) Journal

    The ROI for this mission has been incredible.

    This was supposed to last 3 months. Instead, it will have been on the surface sending back data for 11 years on January 24. It is really quite amazing.

    According to Google (search "Opportunity Rover Cost") the cost for the rover was $400 million, and the maintenance costs for operating it are about $14 million annually (from a year ago: http://www.foxnews.com/science/2014/01/24/nasa-opportunity-rover-still-going-strong/ [foxnews.com]). Total cost: approx $554 million.

    Amortized Cost: $50 million/year
    Original Estimate Amortized Cost: $403.5 million for 3 MONTHS.
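
    The arithmetic behind those figures, spelled out. The inputs are the ones quoted above; treating operations as a flat $14 million for each of 11 years, and the planned mission as a quarter of a year, are simplifying assumptions.

```python
build_cost_musd = 400.0    # rover cost, millions of USD (figure quoted above)
annual_ops_musd = 14.0     # operating cost per year, millions of USD (assumed flat)
actual_years = 11          # time on the surface as of January 24
planned_years = 0.25       # the original three-month mission

actual_total = build_cost_musd + annual_ops_musd * actual_years
actual_per_year = actual_total / actual_years

planned_total = build_cost_musd + annual_ops_musd * planned_years
planned_per_year = planned_total / planned_years

print(f"Actual total:      ${actual_total:.1f}M over {actual_years} years")
print(f"Actual amortized:  ${actual_per_year:.1f}M per year")
print(f"Planned total:     ${planned_total:.1f}M for 3 months")
print(f"Planned amortized: ${planned_per_year:.1f}M per year")
```

    On these assumptions the planned mission works out to roughly 32 times the per-year cost of the mission as actually flown.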

    From the numbers it looks like we are getting much better bang for our dollar than we ever thought we would.

    Who says we can't build anything that lasts anymore?

    --
    "It's a dog eat dog world, and I'm wearing Milkbone underwear" - Norm Peterson
    • (Score: 2) by buswolley on Friday January 02 2015, @07:02PM

      by buswolley (848) on Friday January 02 2015, @07:02PM (#131059)

      It is unfortunate that they don't reuse designs...
      If it were up to me, I'd produce 100 of these, land them all over Mars, invasion style, and get to know the planet well.

      --
      subicular junctures
      • (Score: 2) by iwoloschin on Friday January 02 2015, @07:50PM

        by iwoloschin (3863) on Friday January 02 2015, @07:50PM (#131067)

        You probably cannot buy the same parts anymore. I bet most of the electronics on Opportunity are simply impossible to produce now without doing a custom run on a fab (not difficult, but expensive). Of course, space-certifying new electronics is also expensive, because hard radiation doesn't play nice with electronics, particularly with ever-decreasing process node size.

        Besides, I'd change it up a bit. Put some laser tag equipment on the rovers. Give normal people the controls and turn it into a mashup of science and "friendly" battlebots. Maybe allow operators to "re-route power around damaged circuits" or something to encourage engineering/programming skills. Bonus points for inspecting sites of scientific interest? Way more fun and engagement from normal people!

        • (Score: 2) by buswolley on Friday January 02 2015, @08:05PM

          by buswolley (848) on Friday January 02 2015, @08:05PM (#131070)

          Oh certainly, I wouldn't do the same model now, but we've now shown that these kinds of robots can last a long time. Building two of each model though isn't effective, cost-wise. Build 1000 and the unit cost will go down quite a bit, and 1000 units all over Mars (or any other planet) would produce an incredible amount of scientific information.

          --
          subicular junctures
      • (Score: 2) by cmn32480 on Friday January 02 2015, @08:37PM

        by cmn32480 (443) <cmn32480NO@SPAMgmail.com> on Friday January 02 2015, @08:37PM (#131075) Journal

        Probably not the same model, think Opportunity 2.0... it could have a nifty ad filled web front end...

        What age is the tech on the rovers? Is it stuff from around 2000? 1995?

        With all they have learned in the past 11 years from dealing with the rovers' problems, and some minor tweaks for updated instruments and electronics (it isn't like electronics haven't made some HUGE leaps in the last 15-20 years), build a whole bunch and let's REALLY start exploring Mars! If we start building and testing NOW with today's tech, we can launch the first batch in 3 or 4 years, be rolling around and gathering data in about 5 years.

        Given the circumstances, the rover has held up remarkably well. One arthritic joint, a bum wheel, and some instruments that have worn out? I'd say for a project that was supposed to die a relatively quick death, it is doing pretty darn well.

        --
        "It's a dog eat dog world, and I'm wearing Milkbone underwear" - Norm Peterson
        • (Score: 1) by hb253 on Friday January 02 2015, @11:42PM

          by hb253 (745) on Friday January 02 2015, @11:42PM (#131115)

          Probably not the same model, think Opportunity 2.0... it could have a nifty ad filled web front end...

          They'd also have to rename it to Rovestr or something similarly "disruptive".

          --
          The firings and offshore outsourcing will not stop until morale improves.
        • (Score: 2) by tonyPick on Saturday January 03 2015, @09:48AM

          by tonyPick (1237) on Saturday January 03 2015, @09:48AM (#131274) Homepage Journal

          What age is the tech on the rovers? Is it stuff from around 2000? 1995?

          Following up on this, and being a bit more specific than TFA, from Wikipedia:
          http://en.wikipedia.org/wiki/Comparison_of_embedded_computer_systems_on_board_the_Mars_rovers [wikipedia.org]

          Opportunity's onboard computer uses a 20 MHz RAD6000 CPU with 128 MB of DRAM, 3 MB of EEPROM, and 256 MB of flash memory.

          And the CPU info page:

          Reported to have a unit cost somewhere between US$200,000 and US$300,000, RAD6000 computers were released for sale in the general commercial market in 1996.

          so that's a pre-2000 design...

      • (Score: 2) by mcgrew on Saturday January 03 2015, @11:50AM

        by mcgrew (701) <publish@mcgrewbooks.com> on Saturday January 03 2015, @11:50AM (#131294) Homepage Journal

        Where would they find enough geologists and chemists to sift through all that data?

        --
        Impeach Donald Saruman and his sidekick Elon Sauron
        • (Score: 2) by buswolley on Saturday January 03 2015, @08:21PM

          by buswolley (848) on Saturday January 03 2015, @08:21PM (#131402)

          in post-doc hell, of course :)

          --
          subicular junctures
    • (Score: 2) by mcgrew on Saturday January 03 2015, @11:49AM

      by mcgrew (701) <publish@mcgrewbooks.com> on Saturday January 03 2015, @11:49AM (#131293) Homepage Journal

      Who says we can't build anything that lasts anymore?

      It's not that we can't build anything that lasts, it's amoral corporate asshole thieves who want to sell you a new refrigerator every year. Bastards can, they just won't!

      --
      Impeach Donald Saruman and his sidekick Elon Sauron