Linus Torvalds On The Importance Of ECC RAM, Calls Out Intel's "Bad Policies" Over ECC
There's nothing quite like a fiery mailing list post by Linus Torvalds for some fun holiday-weekend reading. The Linux creator is out with one of his classic messages, this time arguing for the importance of ECC memory and giving his opinion on how Intel's "bad policies" and market segmentation have made ECC memory less widespread.
Linus argues that error-correcting code (ECC) memory "absolutely matters" but that "Intel has been instrumental in killing the whole ECC industry with it's horribly bad market segmentation... Intel has been detrimental to the whole industry and to users because of their bad and misguided policies wrt ECC. Seriously...The arguments against ECC were always complete and utter garbage... Now even the memory manufacturers are starting [to] do ECC internally because they finally owned up to the fact that they absolutely have to. And the memory manufacturers claim it's because of economics and lower power. And they are lying bastards - let me once again point to row-hammer about how those problems have existed for several generations already, but these f*ckers happily sold broken hardware to consumers and claimed it was an "attack", when it always was "we're cutting corners"."
Ian Cutress from AnandTech points out in a reply that AMD's Ryzen ECC support is not as solid as believed.
Related: Linus Torvalds: 'I'm Not a Programmer Anymore'
Linus Torvalds Rejects "Beyond Stupid" Intel Security Patch From Amazon Web Services
Linus Torvalds: Don't Hide Rust in Linux Kernel; Death to AVX-512
Linus Torvalds Doubts Linux will Get Ported to Apple M1 Hardware
(Score: 4, Informative) by Immerman on Wednesday January 06 2021, @06:49AM (1 child)
That depends entirely on what you're doing with that RAM.
If you're playing video games - probably nothing much - a slight change in the color of one pixel on a texture somewhere, or a bit of a health change, or something warps through geometry as its position changes. Nothing much compared to all the bugs.
If you've got a huge database or spreadsheet open - congratulations, every minute and a half, on average, another piece of data or formatting gets silently corrupted.
And if the error is in the RAM containing the machine code of your program itself... well then who knows? Almost anything could happen - the software is corrupted, and will no longer work as intended. Maybe the corruption is in an infrequently used function that never gets called before you close the program down - then nothing happens. Or maybe it's in a core loop of your program, or even the operating system, in which case maybe it crashes, or maybe it corrupts whatever data it touches. It's kind of like invoking undefined behavior in a programming language - maybe nothing happens, maybe the computer executes Halt and Catch Fire, or anything in between - you just don't know until it happens.
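To make the "depends on what's in that RAM" point concrete, here's a minimal Python sketch of what one flipped bit does to a stored value. The helper names (flip_bit_in_int, flip_bit_in_double) and the sample values are made up for illustration, and it assumes the number is held as a plain integer or an IEEE 754 double; it doesn't model real memory hardware or ECC, only how much the damage depends on which bit gets hit.

```python
import struct


def flip_bit_in_int(value: int, bit: int) -> int:
    """Return value with a single bit inverted (hypothetical helper)."""
    return value ^ (1 << bit)


def flip_bit_in_double(value: float, bit: int) -> float:
    """Invert one bit in the 64-bit IEEE 754 encoding of value (hypothetical helper)."""
    # Reinterpret the double as a 64-bit unsigned integer.
    (raw,) = struct.unpack("<Q", struct.pack("<d", value))
    raw ^= 1 << bit
    # Reinterpret the modified bits as a double again.
    (result,) = struct.unpack("<d", struct.pack("<Q", raw))
    return result


# A game's health counter: a low bit is barely noticeable, a higher one is not.
health = 100
print(flip_bit_in_int(health, 0))   # 101
print(flip_bit_in_int(health, 6))   # 36

# A spreadsheet cell stored as a double (sample value chosen arbitrarily).
balance = 1234.56
print(flip_bit_in_double(balance, 0))    # lowest mantissa bit: off by a rounding-sized amount
print(flip_bit_in_double(balance, 62))   # top exponent bit: the magnitude changes completely
```

Same single-bit error in every case; whether anyone notices comes down to which bit it was and what the value is used for.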
(Score: 0) by Anonymous Coward on Wednesday January 06 2021, @10:59AM
It's worse than that. A single bit flipped at the OS level can mean a crash if you're lucky or a corrupted hard drive if you're not.