SoylentNews is people

posted by janrinok on Tuesday March 14 2023, @02:12PM
from the more-data-for-spreadsheet-nerds dept.

SSD Reliability is Only Slightly Better Than HDD, Backblaze Says

A surprising outcome for the first SSD-based AFR report:

Backblaze is a California-based company dealing with cloud storage and data backup services. Every year, the organization provides some interesting reliability data about the large fleet of storage units employed in its five data centers around the world.

For the first time, Backblaze's latest report on storage drive reliability focuses on Solid State Drives (SSDs) rather than HDD units alone. The company started using SSDs in the fourth quarter of 2018, employing the NAND Flash-based units as boot drives rather than data-storing drives. Backblaze uses consumer-grade drives, providing Annualized Failure Rate (AFR) information about 13 different models from five different manufacturers.

The 2022 Drive Stats review is based on data recorded from 2,906 SSD boot units, Backblaze states, and it essentially confirms what the company was saying in its 2022 mid-year report. SSDs are more reliable than HDDs, Backblaze says, as they show a lower AFR (0.98%) compared to HDDs (1.64%).

The fact that the difference in reliability level isn't exactly staggering (0.66 percentage points of AFR) is rather surprising, however, as SSDs are essentially just moving electrons through memory chips while hard drives have to deal with a complex (and failure-prone) mechanism employing spinning platters and extremely sensitive read/write magnetic heads.

The reasons behind failing drives aren't known, as only an SSD manufacturer would have the equipment needed to make a reliable diagnosis. For 2022, Backblaze says that seven of the 13 drive models had no failures at all. Six of those seven models had a limited number of "drive days" (fewer than 10,000), the company concedes, meaning that there is not enough data to make a reliable projection about their failure rates.
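
To make the "drive days" caveat concrete, here is a minimal sketch of the usual AFR arithmetic: failures divided by drive-years of service, expressed as a percentage. The function name and the input numbers are hypothetical illustrations, not figures from the report.

```python
def annualized_failure_rate(failures: int, drive_days: int) -> float:
    """Return AFR as a percentage: failures per drive-year of service."""
    if drive_days <= 0:
        raise ValueError("drive_days must be positive")
    drive_years = drive_days / 365
    return failures / drive_years * 100


# Hypothetical inputs, chosen only to show how sample size moves the result.
small_sample = annualized_failure_rate(failures=1, drive_days=10_000)
large_sample = annualized_failure_rate(failures=1, drive_days=1_000_000)
print(f"{small_sample:.2f}% vs {large_sample:.4f}%")
```

With only 10,000 drive days, a single failure moves a model from 0% to 3.65%, which is why a zero-failure result on so little data says little about the model.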

An interesting tidbit about Backblaze's report is that the company hasn't used a single SSD unit made by Samsung, which is a major player in the SSD consumer market. One possible explanation is that Samsung drives aren't cheap, and Backblaze is essentially using the cheapest drives they can buy in bulk quantities.

The SSD Edition: 2022 Drive Stats Review:

Welcome to the 2022 SSD Edition of the Backblaze Drive Stats series. The SSD Edition focuses on the solid state drives (SSDs) we use as boot drives for the data storage servers in our cloud storage platform. This is opposed to our traditional Drive Stats reports which focus on our hard disk drives (HDDs) used to store customer data.

We started using SSDs as boot drives beginning in Q4 of 2018. Since that time, all new storage servers and any with failed HDD boot drives have had SSDs installed. Boot drives in our environment do much more than boot the storage servers. Each day they also read, write, and delete log files and temporary files produced by the storage server itself. The workload is similar across all the SSDs included in this report.

In this report, we look at the failure rates of the SSDs that we use in our storage servers for 2022, for the last 3 years, and for the lifetime of the SSDs. In addition, we take our first look at the temperature of our SSDs for 2022, and we compare SSD and HDD temperatures to see if SSDs really do run cooler.

As of December 31, 2022, there were 2,906 SSDs being used as boot drives in our storage servers. There were 13 different models in use, most of which are considered consumer grade SSDs, and we'll touch on why we use consumer grade SSDs a little later. In this report, we'll show the Annualized Failure Rate (AFR) for these drive models over various periods of time, making observations and providing caveats to help interpret the data presented.

The dataset on which this report is based is available for download on our Drive Stats Test Data webpage. The SSD data is combined with the HDD data in the same files. Unfortunately, the data itself does not distinguish between SSD and HDD drive types, so you have to use the model field to make that distinction. If you are just looking for SSD data, start with Q4 2018 and go forward.
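
Separating SSD rows by the model field could look roughly like the sketch below. It assumes the daily CSV files carry `date`, `serial_number`, `model` and `failure` columns; the model names and sample rows are placeholders, so a real list of SSD models would have to be compiled by hand from the report.

```python
import csv
import io

# Placeholder model names; a real set has to be compiled by hand
# from the SSD models named in the report.
SSD_MODELS = {"EXAMPLE-SSD-A", "EXAMPLE-SSD-B"}


def ssd_rows(csv_file, ssd_models=SSD_MODELS):
    """Yield only the rows whose model field is in ssd_models."""
    for row in csv.DictReader(csv_file):
        if row["model"].strip() in ssd_models:
            yield row


# Made-up sample rows in the assumed column layout.
sample = io.StringIO(
    "date,serial_number,model,failure\n"
    "2022-12-31,SN001,EXAMPLE-SSD-A,0\n"
    "2022-12-31,SN002,EXAMPLE-HDD-X,0\n"
    "2022-12-31,SN003,EXAMPLE-SSD-B,1\n"
)
selected = list(ssd_rows(sample))
print([row["serial_number"] for row in selected])
```

For a real file, the `io.StringIO` object would be replaced by an opened CSV file, for example `open(path, newline="")`.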

Click on the link to get the actual figures.


Original Submission #1 | Original Submission #2

This discussion was created by janrinok (52) for logged-in users only, but now has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • (Score: 5, Interesting) by Immerman on Tuesday March 14 2023, @03:06PM (11 children)

    by Immerman (3985) on Tuesday March 14 2023, @03:06PM (#1296094)

    Why exactly is this surprising?

    SSDs have been around a long time now, and were originally WAY less reliable than hard drives. Flash memory just doesn't age well - the individual storage cells become more inaccurate with use, and eventually the inaccuracies increase beyond the threshold to reliably determine whether it was a 1 or 0 that was stored.

    Over time the technology improved and became more reliable... and companies eventually chose to trade a bunch of that increased reliability for greater capacity instead with various levels of MLC encoding - first storing two bits per cell instead of one, doubling the capacity of the same hardware, but it meant that instead of needing to reliably tell the difference between two different charge levels, you had to be able to tell the difference between four. Then as reliability continued to improve they moved to three bits per cell, increasing capacity by another 50% in the same hardware, at the cost of needing to distinguish between eight different charge levels. I'm not sure if anyone has moved beyond that yet; there are definitely diminishing returns, but if reliability continues to improve I'm sure they will eventually.

    Modern SSDs *could* be far more reliable using the exact same memory chips, in fact I'm pretty sure single-level (1 bit per cell) drives are still available for high-speed and high-reliability applications (In fact, I think some can be switched in software), but they cost 2 or 3 times as much per MB, so hardly anyone uses them.
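
The arithmetic in the comment above can be sketched in a few lines: a cell storing n bits has to distinguish 2^n charge levels, while each extra bit adds a smaller share of capacity than the one before. The function names are made up for illustration, and the four-bit row is there only to show the diminishing returns.

```python
def charge_levels(bits_per_cell: int) -> int:
    """Number of distinct charge levels a cell must resolve."""
    return 2 ** bits_per_cell


def capacity_gain_percent(bits_per_cell: int) -> float:
    """Capacity gain over a cell storing one bit fewer, in percent."""
    if bits_per_cell < 2:
        raise ValueError("need at least 2 bits per cell to compare")
    return 100 / (bits_per_cell - 1)


for bits in range(1, 5):
    line = f"{bits} bit(s) per cell: {charge_levels(bits)} levels"
    if bits > 1:
        line += f", +{capacity_gain_percent(bits):.0f}% capacity"
    print(line)
```

Each step doubles the number of levels to tell apart, but the capacity gain falls from 100% to 50% to about 33%.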

    • (Score: 4, Insightful) by digitalaudiorock on Tuesday March 14 2023, @03:16PM (7 children)

      by digitalaudiorock (688) on Tuesday March 14 2023, @03:16PM (#1296095) Journal

      I built the AMD system I'm on now in early 2021. I opted for traditional drives as opposed to SSD. And yes, there is a performance trade-off, especially if, for example, I copy some very large files around etc.

      I tend to use computers for a very long time, and have just never been able to accept the built-in limited lifetime of SSD. The machine I retired was an archaic x86 machine that had two hard drives that had spun without issues for 15 years. One thing for sure is that that's not happening with SSD.

      • (Score: 2, Informative) by shrewdsheep on Tuesday March 14 2023, @03:29PM (2 children)

        by shrewdsheep (5215) on Tuesday March 14 2023, @03:29PM (#1296098)

        I'm happy with bcache. OS/cache on SSD, /home on HD.

        • (Score: 2) by RS3 on Tuesday March 14 2023, @05:06PM

          by RS3 (6367) on Tuesday March 14 2023, @05:06PM (#1296113)

          Thanks, I'll look into that. I haven't done a hybrid HD/SSD system yet. Still considering pros and cons.

          Using SSD for cache would obviously make a faster cache, but wouldn't that wear out the SSD much faster, as cache is likely getting much heavier traffic?

          Everyone seems to refer to boot times. How often are they rebooting?

        • (Score: 5, Interesting) by inertnet on Tuesday March 14 2023, @09:21PM

          by inertnet (4071) on Tuesday March 14 2023, @09:21PM (#1296158) Journal

          I have OS, development, cache and home on SSD. And symlinked documents, downloads, shares, images, videos, backups and VMs on a HDD.

      • (Score: 2) by JoeMerchant on Tuesday March 14 2023, @09:04PM (1 child)

        by JoeMerchant (3937) on Tuesday March 14 2023, @09:04PM (#1296151)

        Some time in the deep past, I purchased a 2TB external HDD for use with our media server. Around 2012, I purchased a 2nd to have a mirrored backup. Around 2015 the original 2TB drive died and was replaced with a 2TB SSD.

        As of today, knocking on wood, both the 2012 2TB HD and 2015 2TB SSD are still going strong, both even survived a massive lightning strike that took out the PC they were connected to. Just purchased another 2TB SSD because they're so damn cheap these days, so I'm double mirrored at the moment.

        --
        🌻🌻 [google.com]
        • (Score: 2) by RedGreen on Tuesday March 14 2023, @11:06PM

          by RedGreen (888) on Tuesday March 14 2023, @11:06PM (#1296169)

          "so I'm double mirrored at the moment."

          I am quadruple mirrored using 20 HDDs. As I upgrade my machine every few years, I retire the old one as a backup machine, usually putting in some new drives for backup duplication. Only one of the drives in those machines has died in the last few years, with a simple warranty replacement and resilver to have everything back to working great. That was of course after much unplugging to figure out which drive was dead. Now my file of the drives in use contains the machine each one is in, the serial number, date of purchase, length of warranty and location in the machine, so no more messing about figuring it out on the next drive death. They get turned on every week to add newly accumulated files plus my machine backup, and to make sure they are powered up every once in a while to prevent premature death; I lost a couple of drives one time when I left a machine off for over a year without turning it on. Since I started the weekly power-on method, only the single already mentioned drive has gone tits up on me in the last ten years. Actually, now I think there were two: a 6TB from 2016 or 17 and a 2011 2TB, both Western Digital. There have been others that died in my main machine, but I do not really care: everything on it is backed up the same four times, so I can only really lose a week of data, and none really if my two local drives that serve as backups do not die. I only lose two hours for the personal files, four hours for the / when those cron scripts run, and however long it has been since I manually ran my script that backs up the / and /home to the spare drive. I am a little OCD with my backups, as you may be able to tell...

          --
          "I modded down, down, down, and the flames went higher." -- Sven Olsen
      • (Score: 2) by RedGreen on Tuesday March 14 2023, @10:36PM (1 child)

        by RedGreen (888) on Tuesday March 14 2023, @10:36PM (#1296168)

        "The machine I retired was an archaic x86 machine that had two hard drives that had spun without issues for 15 years."

        Do not know about that but my experience with them is rather good. I still have the original SSD drive I bought in 2009, a Kingston SSDNow 40GB. It ran 24/7 for years, then got put in as a boot drive in a server machine I have; like the Energizer Bunny it just keeps going and going. About thirteen years and still kicking. Out of the dozens of SSDs owned since then only one has died. So many hard drives have gone tits up in that time period that I have no clue how many of them have died on me, especially them piece of shit Seagate drives. As for the Western Digital drives from around that time, I still have a couple of them left from 2011 out of the dozen or so bought; some of them were sold off when upgrading, so not all are dead drives, but some did die. Every Seagate I have ever bought has died, usually just out of warranty. Never again.

        https://www.engadget.com/2009-12-02-kingston-40gb-ssdnow-review.html [engadget.com]

        --
        "I modded down, down, down, and the flames went higher." -- Sven Olsen
        • (Score: 0) by Anonymous Coward on Wednesday March 15 2023, @01:16AM

          by Anonymous Coward on Wednesday March 15 2023, @01:16AM (#1296179)

          Kingston's not in Backblaze's stats though. Nor Samsung.

    • (Score: 2) by DannyB on Tuesday March 14 2023, @05:06PM (1 child)

      by DannyB (5839) Subscriber Badge on Tuesday March 14 2023, @05:06PM (#1296112) Journal

      first storing two bits per cell instead of one, doubling the capacity of the same hardware . . . . as reliability continued to improve they moved to three bits per cell, increasing capacity by another 50% in the same hardware, at the cost of needing to distinguish between eight different charge levels

      This sounds like what for-profit prisons would do.

      Modern SSDs *could* be far more reliable using the exact same memory chips, in fact I'm pretty sure single-level (1 bit per cell) drives are still available for high-speed and high-reliability applications

      In addition to the achievable goal of greater reliability, SSDs definitely have noticeably lower levels of vibration than HDDs.

      --
      When trying to solve a problem don't ask who suffers from the problem, ask who profits from the problem.
      • (Score: 0) by Anonymous Coward on Tuesday March 14 2023, @05:11PM

        by Anonymous Coward on Tuesday March 14 2023, @05:11PM (#1296114)

        SSDs definitely have noticeably lower levels of vibration than HDDs.

        Ahh, you have the ones with load-leveling automagic spin-balancing enabled by default. Dat's de fault of de final test technician. You can download a utility called "wipedisk" that will re-enable the soothing massage vibration system.

    • (Score: 3, Informative) by darkfeline on Tuesday March 14 2023, @05:38PM

      by darkfeline (1030) on Tuesday March 14 2023, @05:38PM (#1296116) Homepage

      The filesystem/OS MUST handle errors.

      Therefore, it becomes a purely economic question: how cheap is the drive vs how often it needs to be replaced and how much redundancy it needs.

      High reliability drives are economically inferior in most cases.

      --
      Join the SDF Public Access UNIX System today!
  • (Score: 2, Insightful) by shrewdsheep on Tuesday March 14 2023, @03:33PM (4 children)

    by shrewdsheep (5215) on Tuesday March 14 2023, @03:33PM (#1296099)

    The AFR is pretty much useless. It doesn't tell you how likely you are to lose your HD in the next year. It doesn't tell you which vendor is most reliable. Nor does it tell you how much life you can expect to get out of an HD. Answering these questions requires a proper time-to-event analysis, i.e. knowing after how long the failure happened or the HD was known to be still good. I contacted Backblaze about this a couple of years back but never heard back.

    If they base their decisions on these statistics, the competition probably has a good laugh.
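
As a rough illustration of the time-to-event analysis mentioned above, here is a minimal Kaplan-Meier sketch. It is one common estimator for this kind of data, not something Backblaze is known to use; it assumes each drive is recorded as a pair of days in service and whether it failed (drives still running count as censored), and the sample observations are invented.

```python
from collections import Counter


def kaplan_meier(observations):
    """Estimate survival from (days_in_service, failed) pairs.

    Returns a list of (day, survival_probability) at each day on which
    at least one failure was observed. Drives with failed=False are
    treated as censored: still good when last seen.
    """
    obs = list(observations)
    failures = Counter(day for day, failed in obs if failed)
    leaving = Counter(day for day, _ in obs)  # failed and censored alike
    at_risk = len(obs)
    survival = 1.0
    curve = []
    for day in sorted(leaving):
        failed_today = failures.get(day, 0)
        if failed_today:
            survival *= 1 - failed_today / at_risk
            curve.append((day, survival))
        at_risk -= leaving[day]
    return curve


# Invented observations: two failures, three drives still running.
drives = [(100, True), (200, False), (300, True), (400, False), (400, False)]
curve = kaplan_meier(drives)
for day, surviving in curve:
    print(f"day {day}: {surviving:.3f} estimated still working")
```

Unlike a single AFR figure, the curve shows when failures happen, so an early-dying model and a late-dying one with the same failure count look different.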

    • (Score: 2, Interesting) by Anonymous Coward on Tuesday March 14 2023, @06:21PM

      by Anonymous Coward on Tuesday March 14 2023, @06:21PM (#1296120)

      A client's main computer crashed six weeks ago. I don't really do IT for them- outside company does. Stupid stupid Win10 Dell. Many odd errors on screen, chkdsk had enormous errors and wouldn't resolve. Machine would barely boot, but not really. SMART enabled in BIOS, but NO SMART messages from Win stupid 10. Stupid Seagate 1 TB spinning rust drive. Not sure if it's CMR or SMR. No clicking, no noticeable vibration or any other observable symptom. I recovered as much as possible.

      Moral of the story: run some kind of SMART software fairly often, preferably always running / background task, because evidently Win10 saves on precious CPU cycles that are needed for much more important "telemetry" (spying) and updating crap software.

    • (Score: 0) by Anonymous Coward on Wednesday March 15 2023, @02:14AM (2 children)

      by Anonymous Coward on Wednesday March 15 2023, @02:14AM (#1296186)

      It's not useless. If you see >=5% then you know what brand/model to avoid.

      Example: https://hwbot.org/newsflash/3253_hardware.fr_publishes_rma_rate_for_motherboard_psu_memory_vga_hdd_and_ssd? [hwbot.org]

      • (Score: 0, Redundant) by shrewdsheep on Wednesday March 15 2023, @07:51AM (1 child)

        by shrewdsheep (5215) on Wednesday March 15 2023, @07:51AM (#1296218)

        No. All models will have a failure rate >5% eventually. It all depends on the timing.

        • (Score: 0) by Anonymous Coward on Wednesday March 15 2023, @08:28AM

          by Anonymous Coward on Wednesday March 15 2023, @08:28AM (#1296223)
          D'oh do I really need to explain everything to you in detail? If the new ones are showing 5% failure rate after too short a time for you then you should avoid buying lots of those.

          Your OP comment and this comment are as stupid as saying "everyone dies eventually so the stats are useless/it depends on the timing".
  • (Score: 3, Informative) by Rosco P. Coltrane on Tuesday March 14 2023, @03:54PM (2 children)

    by Rosco P. Coltrane (4757) on Tuesday March 14 2023, @03:54PM (#1296104)

    Backblaze is a California-based company dealing with cloud storage and data backup services

    Let me guess: the solution to reliable data storage is the cloud? Preferably Backblaze probably?

    Incidentally, I patronize several forums that use Backblaze as their CDN, and it fucking sucks donkey balls. Half of the time the content is unavailable because their backend servers won't answer, or the download speed is best described as lackluster.

    So yeah... Backblaze... I'll take unreliable SSD storage over their cloudiness any day. Hell, even DAT tapes look more appealing sometimes.

    • (Score: 0) by Anonymous Coward on Tuesday March 14 2023, @06:31PM (1 child)

      by Anonymous Coward on Tuesday March 14 2023, @06:31PM (#1296123)

      The problems could be with their ISP, not them. Most of the CDN problems I've encountered were with "Cloudflare", occasionally Amazon's "CloudFront".

      • (Score: 0) by Anonymous Coward on Tuesday March 14 2023, @09:22PM

        by Anonymous Coward on Tuesday March 14 2023, @09:22PM (#1296159)

        Today Reddit is down, has been for hours, based on CDN "Fastly", which fastly went down and buried itself. Crews are slowly digging it out. https://www.redditstatus.com/ [redditstatus.com]

        "Twitch" and "Pinterest" might be down too. Not worth the time to check.

        In fact, thank you Fastly for helping clean up the 'net for a bit.

  • (Score: 1, Interesting) by Anonymous Coward on Tuesday March 14 2023, @05:25PM (8 children)

    by Anonymous Coward on Tuesday March 14 2023, @05:25PM (#1296115)

    A good friend and his fiancee do very heavy photography, always in greatest color-depth, largest number of pixels, RAW format (no compression), some videography, and some multi-track audio. Needless to say they've gone through probably hundreds of SSDs over the years. He's found SanDisk to be the most reliable- no failures. Obviously no comparison to Backblaze's numbers, but still worth considering. So I bought a SanDisk, only to find out they got bought out by Western Digital, and it looks like WD are not "supporting" SanDisk stuff as much as SanDisk proper did. Sigh.

    Anyone have any observations / experience with brands / models and problems / reliability?

    • (Score: 4, Informative) by Zinho on Tuesday March 14 2023, @06:29PM (4 children)

      by Zinho (759) on Tuesday March 14 2023, @06:29PM (#1296122)

      I have an anecdote that mirrors the article's comment: a friend of mine only buys Samsung SSDs, he's a hobbyist system builder (~6/year), and he has never had one of his SSDs fail. He's owned several of them for a decade or more.

      Also mirroring the article, I've purchased maybe 10 SSDs over the past decade or so, and every one of them older than about 3-4 years has failed. I am a tightwad, and until recently never bought Samsung. Off the top of my head I can remember Crucial and Kingston as brands I've bought one or two of; the rest were purchased because they had the lowest price/GB.

      My friend was very surprised to hear my history with SSDs because he had never seen one fail. Made for a very interesting conversation when we compared notes.

      TL;DR: you get what you pay for.

      --
      "Space Exploration is not endless circles in low earth orbit." -Buzz Aldrin
      • (Score: 0) by Anonymous Coward on Tuesday March 14 2023, @06:52PM

        by Anonymous Coward on Tuesday March 14 2023, @06:52PM (#1296124)

        Thanks. I'm also practical / frugal and try to be efficient with money, time, and effort.

        Local MicroCenter likes to push Samsung SSDs. I've generally had great experiences with all things Samsung, but I've never had any of their drives (that I can remember... have had many many drives over the years- it's an obsession... shoulda been a drive engineer.)

        Aforementioned friend had huge problems with Samsung, and it wasn't with any one particular model- many different ones over many years. It's too narrow of a sampling to be of any real use. Important point: he/they don't thrash their drives at all. They pretty much fill them up once and then just read from them. Friend is super careful with things. His full-time day job is extreme precision R&D work (measurements in microns). His fiancee might sometimes be thought of as not as careful as one should be with things, but he's never said her mishandling / negligence was a factor (that he was aware of, or was willing to divulge. But I didn't ask either...)

        I wish we could get full stats on large samplings of drives. Backblaze shouldn't bother publishing stats on the low count drives. There are so many different drive models. I guess what the world needs is a single repository for people to log their drive model and actual SMART stats, plus their observations / thoughts. I will NEVER willingly buy anything Seagate.

      • (Score: 0) by Anonymous Coward on Tuesday March 14 2023, @06:56PM

        by Anonymous Coward on Tuesday March 14 2023, @06:56PM (#1296126)

        I forgot to mention: I have the one SanDisk 1TB (in this very computer). So far, so good. I check SMART stats from time to time. No grown defects. TRIM seems to be happening.

        Otherwise I've bought six or so "Inland" - MicroCenter's house brand stuff. They seem great. Amazingly low price. One was my main drive for a couple of years. I'm not saying they're great- not enough data, but reviews are mostly good. Time will tell...

      • (Score: 0) by Anonymous Coward on Wednesday March 15 2023, @01:29AM

        by Anonymous Coward on Wednesday March 15 2023, @01:29AM (#1296182)

        a friend of mine only buys Samsung SSDs, he's a hobbyist system builder (~6/year), and he has never had one of his SSDs fail. He's owned several of them for a decade or more.

        One key reason to buy Samsung SSDs is that they manufacture and use their own flash memory chips and controllers in their SSDs. So does Western Digital, and they are just as consistently good in my experience. I believe Intel does too but they are way more expensive.

        The trouble with flash memory (I don't think this sort of thing ever really happened with hard drive manufacture) is that there are a lot of brands where they don't actually make their own chips: they just assemble the SSDs from whatever was on sale in Shenzhen that day (or even just rebadge fully-assembled devices). You never know what you are going to get with these brands, maybe good, maybe bad. Even if you buy a bunch of the "same" devices they may be totally different on the inside and you just have a mixed bag and can't tell what's fine to use and what's crap. Rarely worth it IMO if you value your time.

      • (Score: 0) by Anonymous Coward on Wednesday March 15 2023, @03:59AM

        by Anonymous Coward on Wednesday March 15 2023, @03:59AM (#1296199)

        TEAMGROUP or not TEAMGROUP, that is the question.

    • (Score: 2) by mcgrew on Tuesday March 14 2023, @07:17PM (1 child)

      by mcgrew (701) <publish@mcgrewbooks.com> on Tuesday March 14 2023, @07:17PM (#1296130) Homepage Journal

      I've found reliability all over the place with thumb drives, but I have had good luck with SanDisk.

      Now, hard drives are a different matter. Used to be, I'd buy a drive and it was always still happily chugging along because it had been a huge drive for its time, but as storage needs became greater and prices dropped it became tiny, and sat on a shelf.

      But for the last fifteen years, hard drives don't seem to last much longer than their warranties, Western Digital being the worst, but Seagate has lost its former reliability, as well. I don't know if it's greed, or if the bigger drives are inherently less reliable.

      --
      mcgrewbooks.com mcgrew.info nooze.org
      • (Score: 1) by BigJ on Tuesday March 14 2023, @09:59PM

        by BigJ (3685) on Tuesday March 14 2023, @09:59PM (#1296164)

        I would agree here on the SanDisk thumbdrive reliability. I was using an Extreme Pro USB as a boot drive on a Ceph node (home cluster) since 2015 that just failed at the end of last year. In the same Ceph cluster, I've been using Intel 100GB 10DWPD SSDs for the same period of time with no failures (media wearout indicator still at 97).

    • (Score: 2) by toddestan on Wednesday March 15 2023, @03:08AM

      by toddestan (4982) on Wednesday March 15 2023, @03:08AM (#1296193)

      I've bought several SanDisk SSDs, as I found them to be a good budget brand and no worse in terms of reliability than the big boys. Speed may or may not be as good, but most any SATA SSD now completely saturates the SATA bus so it doesn't really matter anyway.

      I have had one issue with them. I needed a bigger drive in one of my computers, so I bought a 2 TB to replace an older 480 GB drive which I thought was working perfectly. While mirroring it over, I ran into two unrecoverable bad sectors on the old drive, affecting two operating system files that obviously didn't matter since the PC was running fine, and were easy enough to fix anyway.

      I erased the drive and of course it now says everything is fine as it was able to remap those bad sectors, but for now it's sitting on a shelf.
