
SoylentNews is people

posted by Fnord666 on Thursday May 21 2020, @11:29AM
from the let-the-competition-begin dept.

ZFS versus RAID: Eight Ironwolf disks, two filesystems, one winner:

This has been a long while in the making—it's test results time. To truly understand the fundamentals of computer storage, it's important to explore the impact of various conventional RAID (Redundant Array of Inexpensive Disks) topologies on performance. It's also important to understand what ZFS is and how it works. But at some point, people (particularly computer enthusiasts on the Internet) want numbers.

First, a quick note: This testing, naturally, builds on those fundamentals. We're going to draw heavily on lessons learned as we explore ZFS topologies here. If you aren't yet entirely solid on the difference between pools and vdevs or what ashift and recordsize mean, we strongly recommend you revisit those explainers before diving into testing and results.

And although everybody loves to see raw numbers, we urge an additional focus on how these figures relate to one another. All of our charts relate the performance of ZFS pool topologies at sizes from two to eight disks to the performance of a single disk. If you change the model of disk, your raw numbers will change accordingly—but for the most part, their relation to a single disk's performance will not.
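As a rough illustration of what "relative to a single disk" means, the sketch below divides each topology's throughput by the single-disk baseline. The function name, the topology labels, and the numbers are all hypothetical placeholders, not figures from the article.

```python
def relative_to_baseline(throughput, baseline_key="single_disk"):
    """Express each throughput figure as a multiple of the baseline entry."""
    baseline = throughput[baseline_key]
    return {name: value / baseline for name, value in throughput.items()}


# Placeholder values for illustration only; NOT results from the article.
example = {"single_disk": 100.0, "topology_a": 200.0, "topology_b": 400.0}

ratios = relative_to_baseline(example)
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x single-disk performance")
```

Swapping in a faster or slower disk model would change every raw number, but, per the article's claim, the ratios should stay roughly the same.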

[It is a long — and detailed — read with quite a few examples and their performance outcomes. Read the 2nd link above to get started and then continue with this story's linked article.--martyb]

Previously:
(2018-09-11) What is ZFS? Why are People Crazy About it?
(2017-07-16) ZFS Is the Best Filesystem (For Now)
(2017-06-24) Playing with ZFS (on Linux) Encryption
(2016-02-18) ZFS is Coming to Ubuntu LTS 16.04
(2016-01-13) The 'Hidden' Cost of Using ZFS for Your Home NAS



 
This discussion has been archived. No new comments can be posted.
  • (Score: 1) by DECbot on Thursday May 21 2020, @11:15PM (1 child)

    by DECbot (832) on Thursday May 21 2020, @11:15PM (#997636) Journal

    Except ZFS is not like traditional software RAID. Notably, the caching algorithm is not the simple "most recently used files" scheme commonly used by traditional RAID of all types. The ARC is a weighted cache that tracks not only which files have been opened recently, but also how often a file is evicted from the cache and loaded back into memory. This keeps commonly called files from getting evicted, and lets newer but less frequently called files get evicted as they should be. Case in point: an application gets a list of filenames from a database, iterates over the list of files--making small changes to each one, e.g. appending to a log file--and then terminates. With a simple recency-based cache, each file the application opens increases the odds that the database files get evicted, because no other database records are being called into memory while the application iterates over the list. ZFS will recognize that the database files are frequently called and keep them in the ARC even though the application opens a large number of files during its iteration, because the files being updated are each accessed only once, while the database is a longer-running operation that is more likely than those files to get called back into memory if evicted. On the other hand, if an application is opened repeatedly and keeps having to reload the same libraries into memory, the ARC will track that and give more priority to keeping the application libraries cached.
     
    I know that is likely wordy and probably not well explained. So the executive summary is this. Traditional file system caching: most recently accessed files. ZFS file system caching: ARC weighted caching of most accessed vs most recent.
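A toy sketch of the difference, in Python. This is not the actual ARC algorithm (the real one adapts the split between its lists using a history of recent evictions); it is a deliberately simplified two-list cache that only shows why a twice-touched "db" entry can survive a scan of one-time files that flushes a plain recency-only cache. The class names, the capacity of 4, and the workload are all made up for illustration.

```python
from collections import OrderedDict


class LRUCache:
    """Recency-only cache: evicts the least recently used key."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def access(self, key):
        """Return True on a hit, False on a miss (the key is then cached)."""
        if key in self.items:
            self.items.move_to_end(key)
            return True
        self.items[key] = True
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # drop the oldest entry
        return False


class TwoListCache:
    """Simplified recency/frequency cache (an assumption, NOT the real ARC)."""

    def __init__(self, capacity):
        self.recent_cap = capacity // 2
        self.frequent_cap = capacity - self.recent_cap
        self.recent = OrderedDict()    # keys seen once
        self.frequent = OrderedDict()  # keys seen at least twice

    def access(self, key):
        """Return True on a hit, False on a miss (the key is then cached)."""
        if key in self.frequent:
            self.frequent.move_to_end(key)
            return True
        if key in self.recent:
            # Second touch: promote from the recent list to the frequent list.
            del self.recent[key]
            self.frequent[key] = True
            if len(self.frequent) > self.frequent_cap:
                self.frequent.popitem(last=False)
            return True
        self.recent[key] = True
        if len(self.recent) > self.recent_cap:
            self.recent.popitem(last=False)  # one-time keys only evict each other
        return False


def run(cache, workload):
    """Replay the workload and return a hit/miss flag per access."""
    return [cache.access(key) for key in workload]


# Hypothetical workload: the database is read twice, then ten one-time
# log files are touched, then the database is read again.
workload = ["db", "db"] + [f"log{i}" for i in range(10)] + ["db"]

lru_hits = run(LRUCache(4), workload)
two_list_hits = run(TwoListCache(4), workload)

print("LRU: final db access is a hit?", lru_hits[-1])
print("Two-list: final db access is a hit?", two_list_hits[-1])
```

In the recency-only cache the scan of log files pushes "db" out before it is needed again; in the two-list cache the one-time files only compete with each other, so "db" is still cached on its last access.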

    --
    cats~$ sudo chown -R us /home/base
  • (Score: 0) by Anonymous Coward on Thursday May 21 2020, @11:50PM

    by Anonymous Coward on Thursday May 21 2020, @11:50PM (#997653)

    So the executive summary is this. Traditional file system caching: most recently accessed files. ZFS file system caching: ARC weighted caching of most accessed vs most recent.

    Absolutely true.

    As you correctly pointed out, ZFS is not *just* a disk management scheme with resiliency and redundancy, it's also a filesystem and related features. Caching is just one of those features. ZFS does offer some advantages over mdadm+ext4, including performance -- as TFA discusses.

    But you misunderstood my point. GP expressed disappointment that TFA wasn't a ZFS/hardware RAID comparison [soylentnews.org].

    I disagreed. A comparison of hardware RAID to ZFS isn't a valid one. I didn't explain what I meant, as I (perhaps wrongly) expected my reasoning to be obvious: properly configured separate RAID hardware will *almost always* provide superior overall system performance to any software-based solution.

    The other side of it is that ZFS and mdadm+ext4 are both software solutions. As such, it's reasonable to compare them.