
posted by Fnord666 on Thursday May 21 2020, @11:29AM
from the let-the-competition-begin dept.

ZFS versus RAID: Eight Ironwolf disks, two filesystems, one winner:

This has been a long while in the making—it's test results time. To truly understand the fundamentals of computer storage, it's important to explore the impact of various conventional RAID (Redundant Array of Inexpensive Disks) topologies on performance. It's also important to understand what ZFS is and how it works. But at some point, people (particularly computer enthusiasts on the Internet) want numbers.

First, a quick note: This testing, naturally, builds on those fundamentals. We're going to draw heavily on lessons learned as we explore ZFS topologies here. If you aren't yet entirely solid on the difference between pools and vdevs or what ashift and recordsize mean, we strongly recommend you revisit those explainers before diving into testing and results.
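
[For readers who want a concrete point of reference before the results, here is a minimal sketch of the commands those terms map to, assuming OpenZFS on Linux; the device names /dev/sda through /dev/sdl and the pool and dataset names are hypothetical. --Ed.]

    # A pool is built from vdevs. Create a pool named "tank" from one
    # six-disk RAIDz2 vdev, forcing 4 KiB sectors (ashift=12) at creation.
    zpool create -o ashift=12 tank raidz2 /dev/sd[a-f]

    # recordsize is a per-dataset property; 1 MiB suits large sequential files.
    zfs create -o recordsize=1M tank/media

    # A pool can hold several vdevs; adding a second one stripes across both.
    zpool add tank raidz2 /dev/sd[g-l]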

And although everybody loves to see raw numbers, we urge an additional focus on how these figures relate to one another. All of our charts relate the performance of ZFS pool topologies at sizes from two to eight disks to the performance of a single disk. If you change the model of disk, your raw numbers will change accordingly—but for the most part, their relation to a single disk's performance will not.

[It is a long — and detailed — read with quite a few examples and their performance outcomes. Read the 2nd link above to get started and then continue with this story's linked article.--martyb]

Previously:
(2018-09-11) What is ZFS? Why are People Crazy About it?
(2017-07-16) ZFS Is the Best Filesystem (For Now)
(2017-06-24) Playing with ZFS (on Linux) Encryption
(2016-02-18) ZFS is Coming to Ubuntu LTS 16.04
(2016-01-13) The 'Hidden' Cost of Using ZFS for Your Home NAS


Original Submission

 
  • (Score: 4, Interesting) by Anonymous Coward on Thursday May 21 2020, @06:55PM (8 children)

    by Anonymous Coward on Thursday May 21 2020, @06:55PM (#997503)

    I was hoping this was a zfs v hardware raid comparison.

    Cos that’s the real place zfs is used.

    Comparing to Linux soft raid on ext4 reeks of amateur hour. No mention of xfs.

  • (Score: 2) by sjames on Thursday May 21 2020, @07:20PM (4 children)

    by sjames (2882) on Thursday May 21 2020, @07:20PM (#997520) Journal

    I would also like to see a comparison with btrfs.

    • (Score: 2) by hendrikboom on Thursday May 21 2020, @08:12PM (3 children)

      by hendrikboom (1125) Subscriber Badge on Thursday May 21 2020, @08:12PM (#997554) Homepage Journal

      I'd like to know whether btrfs is reliable yet. A friend lost his entire btrfs file system a few years ago. Now he uses ext4 on software RAID-1. (that's the keep two copies RAID, right?)
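
      [Right: RAID-1 is the mirrored layout, with a full copy of the data on every member disk. A minimal sketch with mdadm, assuming two spare disks with hypothetical names /dev/sdb and /dev/sdc: --Ed.]

          # Build a two-disk mirror; either disk alone holds a complete copy.
          mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb /dev/sdc

          # Put ext4 on top and mount it.
          mkfs.ext4 /dev/md0
          mount /dev/md0 /mnt/data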

      -- hendrik

      • (Score: 2) by sjames on Thursday May 21 2020, @09:35PM (2 children)

        by sjames (2882) on Thursday May 21 2020, @09:35PM (#997599) Journal

        I've been using it on my desktop and a few servers for a couple years now without incident. I know a few NAS devices use btrfs internally.

        One caveat: I would avoid raid5/6 mode. It seems that most people who lost data on btrfs were using raid5 mode, and I'm not satisfied that those issues have really been worked out.

        Btrfs configured as raid1 did help me out a lot a while back when an odd issue with a drive cable corrupted a few writes. Thanks to the duplicate data and checksums, btrfs scrub fixed it up without drama (once I replaced the cable, of course).
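
        [A sketch of that setup for the curious, assuming two hypothetical disks /dev/sdb and /dev/sdc: --Ed.]

            # Mirror both data (-d) and metadata (-m) across two disks.
            mkfs.btrfs -d raid1 -m raid1 /dev/sdb /dev/sdc
            mount /dev/sdb /mnt

            # A scrub re-reads everything, verifies checksums, and repairs
            # bad copies from the good mirror; -B stays in the foreground.
            btrfs scrub start -B /mnt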

        • (Score: 1) by DECbot on Thursday May 21 2020, @10:55PM (1 child)

          by DECbot (832) on Thursday May 21 2020, @10:55PM (#997631) Journal

          I heard this too a few years ago on LAS. Btrfs was solid when configured as a mirror but dangerous when set up for RAID 5/6. No clue whether those bugs have been worked out or not. This was back when it was time to build a new file server for the basement, so my options were another RAID5/6 softraid on 16.04, ZFS Z1/Z2 on BSD, or buggy btrfs RAID5/6 on 16.04. I haven't looked into it since, as the BSD box is still chugging along.

          --
          cats~$ sudo chown -R us /home/base
          • (Score: 0) by Anonymous Coward on Saturday May 23 2020, @05:53AM

            by Anonymous Coward on Saturday May 23 2020, @05:53AM (#998078)

            From 3.19, the recovery and rebuild code was integrated. The one missing piece, from a reliability point of view, is that it is still vulnerable to the parity RAID "write hole", where a partial write as a result of a power failure will result in inconsistent parity data.

            https://btrfs.wiki.kernel.org/index.php/RAID56 [kernel.org]

            Basically, they did a major rewrite of the code for RAID 5/6. The result isn't as good as it could be, but it is much less susceptible to problems than the old version. In some ways it is better and in some ways worse than the ZFS equivalent. Mostly worse, though, as the major remaining problem is that it still isn't considered stable.

  • (Score: 2, Insightful) by Anonymous Coward on Thursday May 21 2020, @09:44PM (2 children)

    by Anonymous Coward on Thursday May 21 2020, @09:44PM (#997602)

    I was hoping this was a zfs v hardware raid comparison.

    Cos that’s the real place zfs is used.

    Comparing to Linux soft raid on ext4 reeks of amateur hour. No mention of xfs.

    Except ZFS isn't a hardware solution. It's a software solution, just like soft raid on ext4. In fact, ZFS is hampered by hardware RAID [openzfs.org], and running it on top of one is strongly discouraged.

    Comparing ZFS with hardware raid is an apples to mangoes comparison.

    • (Score: 1) by DECbot on Thursday May 21 2020, @11:15PM (1 child)

      by DECbot (832) on Thursday May 21 2020, @11:15PM (#997636) Journal

      Except ZFS is not like traditional software raid. Notably, the caching algorithm is not the simple "most recently used" scheme common to traditional RAID of all types. The ZFS ARC is a weighted cache that tracks not only what has been read recently, but how often data is evicted from the cache and loaded back into memory. That keeps commonly used data from being evicted, while newer but less frequently used data gets evicted as it should be.

      Case in point: an application gets a list of filenames from a database, iterates over the list of files (making a small change to each one, e.g. appending to a log), and then terminates. With a plain recency-based cache, every file the application opens increases the odds that the database gets evicted, even though each updated file is only touched once. ZFS recognizes that the database is accessed frequently and keeps it in the ARC despite the flood of one-shot file accesses, because the database is far more likely to be needed again than those files. On the other hand, if an application keeps having to reload the same libraries into memory, the ARC will notice that and give those libraries more priority to stay cached.
       
      I know that is likely wordy and probably not well explained. So the executive summary is this. Traditional file system caching: most recently used files. ZFS file system caching: ARC-weighted caching of most frequently vs. most recently used.
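
      [On Linux, the ARC's recency/frequency split can be observed directly; a sketch, assuming OpenZFS on Linux with its usual kstat interface: --Ed.]

          # mru_hits vs. mfu_hits: reads served from the recently-used
          # list vs. the frequently-used list of the ARC.
          grep -E '^(hits|mru_hits|mfu_hits)' /proc/spl/kstat/zfs/arcstats

          # arcstat (ships with OpenZFS) samples the same counters over time.
          arcstat 5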

      --
      cats~$ sudo chown -R us /home/base
      • (Score: 0) by Anonymous Coward on Thursday May 21 2020, @11:50PM

        by Anonymous Coward on Thursday May 21 2020, @11:50PM (#997653)

        So the executive summary is this. Traditional file system caching: most recently used files. ZFS file system caching: ARC-weighted caching of most frequently vs. most recently used.

        Absolutely true.

        As you correctly pointed out, ZFS is not *just* a disk management scheme with resiliency and redundancy, it's also a filesystem and related features. Caching is just one of those features. ZFS does offer some advantages over mdadm+ext4, including performance -- as TFA discusses.

        But you misunderstood my point. GP expressed disappointment that TFA wasn't a ZFS/hardware RAID comparison [soylentnews.org].

        I disagreed. A comparison of hardware RAID to ZFS isn't a valid one. I didn't explain what I meant, as I (perhaps wrongly) expected my reasoning to be obvious: properly configured separate RAID hardware will *almost always* provide superior overall system performance to any software-based solution.

        The other side of it is that ZFS and mdadm+ext4 are both software solutions. As such, it's reasonable to compare them.