Linus Torvalds On The Importance Of ECC RAM, Calls Out Intel's "Bad Policies" Over ECC
There's nothing quite like a fiery mailing list post by Linus Torvalds for some fun holiday-weekend reading. The Linux creator is out with one of his classic messages, this time arguing for the importance of ECC memory and giving his opinion that Intel's "bad policies" and market segmentation have made ECC memory less widespread.
Linus argues that error-correcting code (ECC) memory "absolutely matters" but that "Intel has been instrumental in killing the whole ECC industry with it's horribly bad market segmentation... Intel has been detrimental to the whole industry and to users because of their bad and misguided policies wrt ECC. Seriously...The arguments against ECC were always complete and utter garbage... Now even the memory manufacturers are starting [to] do ECC internally because they finally owned up to the fact that they absolutely have to. And the memory manufacturers claim it's because of economics and lower power. And they are lying bastards - let me once again point to row-hammer about how those problems have existed for several generations already, but these f*ckers happily sold broken hardware to consumers and claimed it was an "attack", when it always was "we're cutting corners"."
Ian Cutress from AnandTech points out in a reply that AMD's Ryzen ECC support is not as solid as believed.
Related: Linus Torvalds: 'I'm Not a Programmer Anymore'
Linus Torvalds Rejects "Beyond Stupid" Intel Security Patch From Amazon Web Services
Linus Torvalds: Don't Hide Rust in Linux Kernel; Death to AVX-512
Linus Torvalds Doubts Linux will Get Ported to Apple M1 Hardware
(Score: 2) by dltaylor on Wednesday January 06 2021, @09:53AM (3 children)
more completely: ECC is a TLA for Error Checking and Correcting
Extra bits travel in the data stream between the memory controller (which may be inside the CPU, as in Xeons) and the RAM. Those bits let a code be stored alongside the data and read back when the memory is accessed, so the controller can tell when the data read back is wrong. "Normally" these days (some specialized computers can do more) the code allows any single-bit error to be identified and corrected, and double-bit errors to be detected but not corrected. Back in the days of parity memory, all you knew for a single-bit error was that the parity was bad, not which bit had flipped, so you couldn't fix it, and two flipped bits could show good parity for bad data.
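To make the parity vs. ECC difference concrete, here is a toy sketch in Python. It assumes a tiny 4-bit data word protected by a Hamming(7,4) code plus one overall parity bit (a small SECDED code). Real memory controllers typically protect 64-bit words with 8 check bits and do this in hardware. The function names (parity, encode, decode) are made up for illustration and do not come from any real library.

```python
def parity(bits):
    """XOR of all bits: 0 if the number of 1s is even, 1 if odd."""
    p = 0
    for b in bits:
        p ^= b
    return p


def encode(data):
    """Encode 4 data bits into an 8-bit word: Hamming(7,4) + overall parity."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # check bit covering positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # check bit covering positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # check bit covering positions 4, 5, 6, 7
    word = [p1, p2, d1, p4, d2, d3, d4]  # positions 1..7
    return word + [parity(word)]  # position 8: parity over the whole word


def decode(word):
    """Return (status, data); status is 'ok', 'corrected' or 'uncorrectable'."""
    w = list(word)
    # The syndrome is the XOR of the (1-based) positions of all set bits in
    # positions 1..7. It is 0 for a valid codeword and equals the position
    # of the flipped bit after a single-bit error.
    syndrome = 0
    for pos in range(1, 8):
        if w[pos - 1]:
            syndrome ^= pos
    overall = parity(w)  # 0 while the total number of 1s is still even

    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # Odd number of flips, assumed to be exactly one: repair it.
        if syndrome:
            w[syndrome - 1] ^= 1
        else:
            w[7] ^= 1  # the overall parity bit itself was hit
        status = "corrected"
    else:
        # Non-zero syndrome but even overall parity: two bits flipped.
        return "uncorrectable", None
    return status, [w[2], w[4], w[5], w[6]]


if __name__ == "__main__":
    data = [1, 0, 1, 1]
    word = encode(data)

    one_flip = list(word)
    one_flip[4] ^= 1
    print(decode(one_flip))  # single-bit error: located and repaired

    two_flips = list(word)
    two_flips[1] ^= 1
    two_flips[6] ^= 1
    print(decode(two_flips))  # double-bit error: detected, not repaired

    # Old-style parity memory: one parity bit, no position information.
    pword = data + [parity(data)]
    pword[0] ^= 1
    pword[2] ^= 1
    print(parity(pword) == 0)  # two flips look like good parity
```

The overall parity bit is what separates the two cases: an odd number of flips with a usable syndrome gets corrected, while a non-zero syndrome with even overall parity means two bits went bad and the word can only be flagged.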
(Score: 0) by Anonymous Coward on Wednesday January 06 2021, @10:58AM (2 children)
OK, what is TLA? Is this some kind of military code to keep the civilians from knowing what they have planned for us? Like MIRV and MAD? And SNAFU, FUBAR, and BOHICA? So funny with their acronyms, the militaries are! Until you have been all you can be. That kinda sucks.
(Score: 2) by RS3 on Wednesday January 06 2021, @07:17PM
TLA = Three Letter Acronym.
Acronyms are a bit overused, IMHO.
(Score: 0) by Anonymous Coward on Wednesday January 06 2021, @07:20PM
TLA is a TLA for Three Letter Acronym.