SoylentNews is people

posted by martyb on Wednesday January 06 2021, @03:27AM
from the bit-flip-out dept.

Linus Torvalds On The Importance Of ECC RAM, Calls Out Intel's "Bad Policies" Over ECC

There's nothing quite like a fiery mailing list post by Linus Torvalds for some fun holiday-weekend reading. The Linux creator is out with one of his classic messages, this time arguing for the importance of ECC memory and giving his opinion on how Intel's "bad policies" and market segmentation have made ECC memory less widespread.

Linus argues that error-correcting code (ECC) memory "absolutely matters" but that "Intel has been instrumental in killing the whole ECC industry with it's horribly bad market segmentation... Intel has been detrimental to the whole industry and to users because of their bad and misguided policies wrt ECC. Seriously...The arguments against ECC were always complete and utter garbage... Now even the memory manufacturers are starting [to] do ECC internally because they finally owned up to the fact that they absolutely have to. And the memory manufacturers claim it's because of economics and lower power. And they are lying bastards - let me once again point to row-hammer about how those problems have existed for several generations already, but these f*ckers happily sold broken hardware to consumers and claimed it was an "attack", when it always was "we're cutting corners"."

Ian Cutress from AnandTech points out in a reply that AMD's Ryzen ECC support is not as solid as believed.

Related: Linus Torvalds: 'I'm Not a Programmer Anymore'
Linus Torvalds Rejects "Beyond Stupid" Intel Security Patch From Amazon Web Services
Linus Torvalds: Don't Hide Rust in Linux Kernel; Death to AVX-512
Linus Torvalds Doubts Linux will Get Ported to Apple M1 Hardware


  • (Score: 0) by Anonymous Coward on Wednesday January 06 2021, @12:57PM (#1095592)

    Google published statistics on ECC errors in its servers years ago. They found on average ~4k corrected memory errors per DIMM per year. Most of those won't affect the functioning of the system, but any system that runs for long enough may be affected at some point.

    More details here: http://www.cs.toronto.edu/~bianca/papers/sigmetrics09.pdf [toronto.edu]

    BTW, another worrying example of the effects of memory corruption, from back in 2003: https://www.vice.com/en_us/article/9agbxd/space-weather-cosmic-rays-voting-aaas [vice.com] (probably of interest to some US people)

  • (Score: 0) by Anonymous Coward on Wednesday January 06 2021, @02:50PM (#1095625)

    It is partly about odds, but that is not the whole story.

    There is an argument that says the odds of a random bit flip are pretty low. In fact, so low that ECC is not worth its cost.

    BUT rowhammer is not a random thing, making the odds misleading. Running without ECC (or at least parity?) opens a useful attack surface.

    This seems to be an old subject, dating back to the "parity is for farmers" story about the CDC 6600/7600.

    From https://en.wikipedia.org/wiki/ECC_memory [wikipedia.org]

    Seymour Cray famously said "parity is for farmers" when asked why he left this out of the CDC 6600.[11] Later, he included parity in the CDC 7600, which caused pundits to remark that "apparently a lot of farmers buy computers". The original IBM PC and all PCs until the early 1990s used parity checking.[12] Later ones mostly did not.
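
    For anyone unclear on the parity-versus-ECC distinction raised in this thread, here is a minimal sketch in Python. It is a toy illustration, not how any particular memory controller works: the function names are made up for this example, and it uses a tiny Hamming(7,4) code, whereas ECC DIMMs typically use a wider single-error-correcting, double-error-detecting code over 64 data bits. The point is only that parity can tell you a bit flipped, while an error-correcting code can also tell you which one.

```python
def parity_bit(bits):
    """Even parity: 1 if the count of 1 bits is odd, else 0."""
    return sum(bits) % 2


def hamming74_encode(data):
    """Encode 4 data bits as a 7-bit Hamming codeword (positions 1..7).

    Check bits sit at positions 1, 2 and 4; data bits at 3, 5, 6 and 7.
    """
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(code):
    """Return (data_bits, error_position).

    error_position is 0 when no error is seen, otherwise the 1-based
    position of the single bit that was flipped back. Two or more flips
    in one codeword are beyond what this small code can handle.
    """
    c = list(code)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    pos = s1 + 2 * s2 + 4 * s4  # the syndrome spells out the bad position
    if pos:
        c[pos - 1] ^= 1  # correct the single flipped bit
    return [c[2], c[4], c[5], c[6]], pos
```

    With parity, storing `word + [parity_bit(word)]` lets a later `parity_bit(...)` check notice a single flipped bit, but not locate or repair it. With the Hamming code, flipping any one bit of `hamming74_encode(word)` and passing the result to `hamming74_decode` gives back the original `word` along with the position that was corrected.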