

posted by CoolHand on Tuesday September 11 2018, @03:29PM   Printer-friendly
from the zed-eff-ess-or-zee-eff-ess dept.

John Paul Wohlscheid over at It's FOSS takes a look at the ZFS file system and its capabilities. He mainly covers OpenZFS, the fork created after Oracle bought Sun Microsystems and shut down open development of Solaris, ZFS's original home. ZFS features pooled storage with RAID-like capabilities, copy-on-write with snapshots, data integrity verification with automatic repair, and support for files of up to 16 exabytes, with file systems of up to 256 quadrillion zettabytes, should you have enough electricity to pull that off. Because ZFS was developed under a deliberately GPL-incompatible license, it cannot be integrated directly into the Linux kernel; however, several distros work around that and provide packages for it. It has been ported to FreeBSD since 2008.
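The data-integrity feature mentioned above boils down to storing a checksum alongside every block pointer, verifying it on each read, and repairing from a redundant copy on mismatch. A toy Python sketch of the idea (all names are invented for illustration; this is not the actual ZFS implementation):

```python
import hashlib

# Toy sketch of checksum-on-read with self-healing; all names are
# invented for illustration and are NOT the actual ZFS code.

def write_block(storage, addr, data):
    """Store data together with its checksum, as a ZFS block pointer does."""
    storage[addr] = (data, hashlib.sha256(data).hexdigest())

def read_block(storage, addr, mirror=None):
    """Verify the checksum on every read; repair from a mirror on mismatch."""
    data, expected = storage[addr]
    if hashlib.sha256(data).hexdigest() == expected:
        return data
    if mirror is not None:
        good = read_block(mirror, addr)   # fetch the redundant copy
        write_block(storage, addr, good)  # heal the damaged copy
        return good
    raise IOError("checksum mismatch and no redundant copy to repair from")

primary, mirror = {}, {}
write_block(primary, 0, b"important data")
write_block(mirror, 0, b"important data")
primary[0] = (b"bit-rotted junk", primary[0][1])  # simulate silent corruption
print(read_block(primary, 0, mirror))             # → b'important data'
```

Because the checksum lives in the parent block pointer rather than next to the data, a corrupted block cannot vouch for itself, which is what lets ZFS detect silent bit rot that traditional RAID misses.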


Original Submission

 
  • (Score: 0, Interesting) by Anonymous Coward on Tuesday September 11 2018, @03:36PM

    by Anonymous Coward on Tuesday September 11 2018, @03:36PM (#733175)

    .. is what all the kids are doing these days.
    And ZFS is a blockchain/merkle tree.
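Joking aside, the Merkle-tree comparison is apt in one narrow sense: ZFS hashes its blocks into a tree of checksums so that one trusted root hash vouches for everything below it. A toy Merkle-root computation in Python (illustrative only, not ZFS internals):

```python
import hashlib

def sha256(b):
    return hashlib.sha256(b).digest()

def merkle_root(leaves):
    """Hash leaves pairwise up to a single root. A change to any leaf
    changes the root, which is how a single trusted checksum (ZFS's
    uberblock) can vouch for every block beneath it."""
    level = [sha256(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:                 # duplicate the odd node out
            level.append(level[-1])
        level = [sha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]

blocks = [b"block0", b"block1", b"block2", b"block3"]
root = merkle_root(blocks)
assert merkle_root(blocks) == root                      # deterministic
assert merkle_root([b"tampered"] + blocks[1:]) != root  # tampering shows
```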

  • (Score: 1, Interesting) by Anonymous Coward on Tuesday September 11 2018, @03:47PM (26 children)

    by Anonymous Coward on Tuesday September 11 2018, @03:47PM (#733179)

    I thought Ubuntu directly incorporated it and claimed that the CDDL is compatible with the GPL. Seems like a major legal risk to me for end users, but I am not a lawyer.

    Can anyone else provide any insight?

    • (Score: 2, Informative) by Anonymous Coward on Tuesday September 11 2018, @03:49PM

      by Anonymous Coward on Tuesday September 11 2018, @03:49PM (#733180)
    • (Score: 3, Informative) by pendorbound on Tuesday September 11 2018, @04:35PM (24 children)

      by pendorbound (2688) on Tuesday September 11 2018, @04:35PM (#733195) Homepage

      Link by AC is the go-to resource to answer this question, but tl;dr:

      Linus is on-record saying his intent for the kernel is that loading external closed-source kernel modules based on code bases that exist independently of the kernel is not a GPL violation, even if the other codebase isn't GPL or GPL compatible. You CAN'T link it into the kernel and distribute a monolithic kernel with the closed source driver. It must be a loadable module. For a filesystem, that means you need an initramfs if you're going to use the filesystem as root. You may also need a different filesystem for /boot unless your bootloader is helpful enough to load the kernel and initramfs from ZFS for you.

      Linus' statement was specific to another non-GPL filesystem driver (AFS) which he said was fine because the AFS codebase is independent of the Linux kernel. You can run the same code on other operating systems without Linux which means it's not a derivative work of Linux. The same is true of ZFS where the OpenZFS code runs on BSD, Linux, and Mac, and the underlying ZFS codebase is of course from Solaris.

      So really tl;dr, What Ubuntu is doing is fine because Linus said so.

      • (Score: 3, Informative) by DannyB on Tuesday September 11 2018, @04:54PM (23 children)

        by DannyB (5839) Subscriber Badge on Tuesday September 11 2018, @04:54PM (#733211) Journal

Back in the Groklaw days, I learned about promissory estoppel.

        From Google...

        Promissory estoppel is the legal principle that a promise is enforceable by law, even if made without formal consideration, when a promisor has made a promise to a promisee who then relies on that promise to his subsequent detriment.

        That is a nice concise summary of exactly what I understood it to mean from Groklaw. (background: one of SCO's predecessors in interest, namely AT&T, had made certain promises concerning proprietary Unix, that a licensee, linking in their own additional proprietary code, was not a violation of the Unix license, nor had AT&T ever intended that the additional proprietary code would then come under the Unix license, nor would AT&T gain any interest in it. SCO didn't know or didn't care about any promises made by its predecessor in interest. But IBM had relied on those promises in adding JFS to IBM's AIX Unix. Under Promissory estoppel, SCO was obligated to honor those promises made much earlier by AT&T. Thus JFS is not part of Unix, and thus JFS in Linux does not make Linux an infringement of Unix -- even though SCO never actually was a copyright owner of Unix.)

        In this case, Linus Torvalds' statement would be a promise that others could rely upon to have a binding meaning. It is a promise about how he interprets the GPL as applied to Linux. Similarly to how Linus interprets the GPL on Linux to also mean that user space code does not come under the scope of the GPL license merely by virtue of being executed in Linux user space.

Now the sticky part. (And no, I don't mean the part about how ZFS was actually originally conceived to hold larger collections of pr0n.) The problem is that other contributors to the kernel, who contributed their code with the understanding that it is GPL licensed, might not share the interpretation that ZFS can be linked in as a kernel module at runtime. Whether any actual kernel contributors would take that view is another question; by contrast, there seems to be universal agreement that userspace code does not come under the kernel's GPL license.

        --
        When trying to solve a problem don't ask who suffers from the problem, ask who profits from the problem.
        • (Score: 0) by Anonymous Coward on Tuesday September 11 2018, @05:18PM (22 children)

          by Anonymous Coward on Tuesday September 11 2018, @05:18PM (#733215)

          But what about Oracle? I understand that Linus won't get involved, but can the same be said for Oracle?

          • (Score: 2) by Immerman on Tuesday September 11 2018, @05:46PM (8 children)

            by Immerman (3985) on Tuesday September 11 2018, @05:46PM (#733222)

            Does Oracle matter? So long as the module doesn't incorporate any GPL code (and the ZFS license doesn't contain any gotchas against running on a GPL platform) there shouldn't be any problems from that direction.

            Usually a deliberately-incompatible-with-GPL license is chosen to prevent the code from migrating into GPL projects, not to prevent the programs from running on Linux.

            • (Score: 3, Interesting) by DannyB on Tuesday September 11 2018, @06:36PM (7 children)

              by DannyB (5839) Subscriber Badge on Tuesday September 11 2018, @06:36PM (#733241) Journal

              GPL is the issue. If you link any code with GPL code, then that linked code MUST be also licensed under the GPL.

              Linux --> GPL

              Linux + ZFS must also be GPL. But wait, ZFS is licensed under a different license.

              With some "different licenses" this is no problem, because those licenses are compatible with becoming also licensed under the GPL without the copyright owner getting involved. (eg, the copyright owner chose a license that is compatible with GPL and thus already gave their permission.)

              --
              When trying to solve a problem don't ask who suffers from the problem, ask who profits from the problem.
              • (Score: 2) by Immerman on Wednesday September 12 2018, @01:41PM (6 children)

                by Immerman (3985) on Wednesday September 12 2018, @01:41PM (#733588)

                What does that have to do with Oracle though?

As others have already pointed out, the way in which the module ties into the Linux kernel has long since been okayed by Linus and many other key Linux contributors. Other Linux stakeholders might try to cause problems, since it is, technically, a violation of the GPL, but the risk is small. And it has nothing whatsoever to do with Oracle, since Oracle is not making the module.

                The only way Oracle gets involved is if the module violates the CDDL - and from what I've read it sounds like it does not.

                • (Score: 2) by pendorbound on Wednesday September 12 2018, @02:48PM (1 child)

                  by pendorbound (2688) on Wednesday September 12 2018, @02:48PM (#733618) Homepage

                  A small quibble: Loading ZFS into the kernel, even linking it directly into a compiled monolithic kernel IS NOT a GPL violation. Distributing the result of that linkage to someone else is a GPL violation. No Linux contributor has standing to come after you if you link ZFS on your own hardware nor load a loadable kernel module for it.

                  GPL is a distribution license. You can only violate it if you're giving a copy of GPL licensed code to someone else. (And distributing a kernel linked with CDDL ZFS code is no-question a violation.)

                  Fair Use is your "license" for working on your own systems. You can link in anything you like from anywhere. As long as you don't distribute it, you can never violate the GPL.

                  • (Score: 2) by DannyB on Wednesday September 12 2018, @03:02PM

                    by DannyB (5839) Subscriber Badge on Wednesday September 12 2018, @03:02PM (#733635) Journal

                    If some kernel developer, or a dead kernel developer's estate were to bring a lawsuit over linking, it would still be ugly and expensive -- even if you win.

                    --
                    When trying to solve a problem don't ask who suffers from the problem, ask who profits from the problem.
                • (Score: 2) by DannyB on Wednesday September 12 2018, @03:01PM (3 children)

                  by DannyB (5839) Subscriber Badge on Wednesday September 12 2018, @03:01PM (#733632) Journal

                  I agree with you. I think this is evident from some of my other nearby posts.

My concern would be that not all kernel contributors might agree. It only takes one to bring an expensive lawsuit. Some kernel developers are dead, and who knows what their estates might do for extra profit. The kernel developer might care about Linux and open source, but their estate might not.

                  --
                  When trying to solve a problem don't ask who suffers from the problem, ask who profits from the problem.
                  • (Score: 2) by pendorbound on Wednesday September 12 2018, @04:45PM (2 children)

                    by pendorbound (2688) on Wednesday September 12 2018, @04:45PM (#733699) Homepage

                    Any such estate would be suing Nvidia, Broadcom, and any of a dozen much larger companies over the non-GPL blobs they load in with their modules. You can't have accelerated video, WiFi, or lots of other functionality in Linux from a huge variety of hardware manufacturers without linking non-GPL code into the kernel. The kernel source tree includes numerous blobs for common hardware which the copyright holders have granted permission to distribute with Linux but have NOT released under a GPL license with source code.

                    Please stop spreading FUD on this issue. The question of whether you the end user can link binary blobs into your kernel without violating the GPL is about as settled as any question in copyright law can get. You have the right to do that. Just don't give the linked copy to anyone. Distributing the un-linked components and instructions or scripts that do the linking is permissible.

                    • (Score: 2) by DannyB on Wednesday September 12 2018, @06:36PM (1 child)

                      by DannyB (5839) Subscriber Badge on Wednesday September 12 2018, @06:36PM (#733777) Journal

                      Any such estate would be suing Nvidia, Broadcom, and any of a dozen much larger companies over the non-GPL blobs

                      That is an excellent point.

                      Please stop spreading FUD on this issue.

                      I don't believe I am spreading FUD.

While I don't have any problem linking anything I want to GPL code on my personal systems, and encouraging other people to do so, I won't do it in a commercial setting. I can't. There are probably others who can't.

                      Another excellent point is that if we're not worried about kernel blobs, and it seems neither of us are, then ZFS is probably not a concern.

                      My caution started with the GPL licensed MySQL ages ago. Probably no longer an issue. But if the copyright owner views it a certain way, I'm definitely not going to argue with them about it -- even if they are wrong.

                      I've taken that attitude to the extreme with even a potential technical licensing issue that could arise.

I'll also bring up Oracle again. They should have no case to bring a lawsuit. But that doesn't mean they wouldn't sue if they saw dollar signs. This is a company that is suing over the premise that APIs are copyrightable and that merely implementing a compatible API is a copyright infringement. So don't be so quick to say FUD.

SCO had no case against IBM either -- maybe only a dispute about an irrelevant contract, which is all that is left of their case. Yet it lingers on. I bring this up because, while SCO had no case and IBM was vindicated, look how much time and money has been expended. SCO spent at least $30 million in legal costs that we know of for sure. Over the years it was estimated that IBM spent at least a hundred million. If, in 2002, I had said that SCO might sue over Linux containing copyright-infringing code stolen from Unix, you might have accused me of FUD. But come March 2003, I would have been the one laughing. So don't be so quick with the FUD.

There are people who deliberately, willfully, and knowingly spread FUD, for exactly the purpose the acronym describes. I have a legitimate concern.

                      --
                      When trying to solve a problem don't ask who suffers from the problem, ask who profits from the problem.
                      • (Score: 2) by Immerman on Wednesday September 12 2018, @07:40PM

                        by Immerman (3985) on Wednesday September 12 2018, @07:40PM (#733804)

If you're going to worry about baseless lawsuits, your only real defense is to be so insignificant and uninteresting that nobody knows about you. After all, it doesn't matter if you've never done anything even remotely legally questionable in your entire life -- you can still be dragged through an immensely expensive lawsuit by anyone who wants to cause you trouble. Heck, pretty much the entire RIAA "anti-piracy" legal strategy is to sue people who can't afford to fight back, despite a near-total lack of evidence.

          • (Score: 5, Insightful) by pendorbound on Tuesday September 11 2018, @05:47PM (12 children)

            by pendorbound (2688) on Tuesday September 11 2018, @05:47PM (#733223) Homepage

            Linking ZFS into Linux isn't violating Oracle's license or copyright. They have no standing to take a position against you for doing so.

Distributing linked binary code consisting of both GPL and CDDL licensed source is a violation of the GPL, not of the CDDL. By doing so, you're distributing the Linux kernel without a valid license to do so and are therefore violating Linux's copyright. You're still good with ZFS's license and copyright at that point: you haven't violated Oracle's CDDL license for the ZFS code.

            Remember: GPL is a distribution license agreement, not a user license agreement. Whoever gave you your copy of Linux had to do so under the terms of the GPL. If you give a copy of it to anyone else, you must also comply with GPL. In between while you're working on your own computers, the GPL isn't applicable. It's the Free Software Foundation's position that you do not need a license to use code on your own computers. That's considered Fair Use. You only need a license to copy and distribute that code since otherwise that copying would violate copyright law.

            • (Score: 0) by Anonymous Coward on Tuesday September 11 2018, @07:30PM (7 children)

              by Anonymous Coward on Tuesday September 11 2018, @07:30PM (#733265)

              This is the best analysis I've read on this matter. Thank you!

I went with FreeBSD because of ZFS support. If Linux becomes comparable, I may go back. Your analysis suggests it's only a matter of time before I do.

              • (Score: 2) by pendorbound on Tuesday September 11 2018, @07:46PM (6 children)

                by pendorbound (2688) on Tuesday September 11 2018, @07:46PM (#733277) Homepage

                FWIW, I started with ZFS on FreeBSD in 2008 and jumped (back) to Gentoo in about 2010. I've been running ZFS for rootfs and storage on Gentoo since then over a variety of kernel & userland upgrades. Linux has feature parity with the BSD implementation for ZFS. It's been too long since I've used it on BSD (and too many hardware upgrades) to really compare them on a performance basis.

                Kernel/system updates on Gentoo haven't always been the smoothest with ZFS. Unquestionably that's Gentoo's fault. Having your rootfs driver in an initramfs just complicates that & excitement ensues sometimes. Grub had some issues originally where it failed to read valid pools that the ZFS drivers in both Linux & BSD could import successfully. I submitted a few Grub patches around 2013 that corrected those issues, and I think those have been integrated in most distributions at this point. If you wanted to use ZFS for large storage arrays and used ext4 or something in mainline kernel for rootfs, that would all but remove the upgrade and bootloader issues and still give you the benefits of ZFS for data. I like being able to snapshot the OS drive though...

                I suspect that running ZFS on Ubuntu would make the upgrade story a bit less fraught with excitement, but I have no experience with Ubuntu's implementation at this point. Overall though, ZFS is completely solid on Linux for day to day use. I also run with ZFS on top of LUKS to get encrypted storage which works flawlessly.

                • (Score: 1) by pTamok on Tuesday September 11 2018, @08:29PM

                  by pTamok (3042) on Tuesday September 11 2018, @08:29PM (#733292)

                  If you wanted to use ZFS for large storage arrays and used ext4 or something in mainline kernel for rootfs, that would all but remove the upgrade and bootloader issues and still give you the benefits of ZFS for data. I like being able to snapshot the OS drive though...

                  If you don't mind something slightly funky, also without an fsck, use NILFS2 [wikipedia.org] for your rootfs. So long as you remember to ensure the NILFS2 module is included in GRUB2, it works fine. You probably will need to install via chrooting though, as I don't know of any installers that support NILFS2 natively.

                  NILFS supports continuous snapshotting. In addition to versioning capability of the entire file system, users can even restore files mistakenly overwritten or destroyed just a few seconds ago. Since NILFS can keep consistency like conventional LFS, it achieves quick recovery after system crashes.

                  NILFS creates a number of checkpoints every few seconds or per synchronous write basis (unless there is no change). Users can select significant versions among continuously created checkpoints, and can change them into snapshots which will be preserved until they are changed back to checkpoints.

                  There is no limit on the number of snapshots until the volume gets full. Each snapshot is mountable as a read-only file system. It is mountable concurrently with a writable mount and other snapshots, and this feature is convenient to make consistent backups during use.

                  Snapshot administration is easy and quickly performable. NILFS will make snapshotting or versioning of the POSIX filesystem much familiar to you. The possible use of NILFS includes, versioning, tamper detection, SOX compliance logging, and so forth. It can serve as an alternative filesystem for Linux desktop environment, or as a basis of advanced storage appliances.

                  Alternatively, run your rootfs on its own LVM volume, sized to allow LVM snapshotting.
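For what it's worth, the checkpoint-versus-snapshot distinction quoted above can be modelled in a few lines: checkpoints are created continuously and are reclaimable, while a checkpoint promoted to a snapshot survives garbage collection. A toy Python model (all names invented for illustration; this is not the NILFS API):

```python
# Toy model of NILFS2's checkpoint/snapshot distinction. All names are
# invented for illustration; this is not the NILFS API.
class LogStructuredFS:
    def __init__(self):
        self.checkpoints = {}  # checkpoint number -> (state, is_snapshot)
        self.cno = 0
        self.state = {}

    def write(self, path, data):
        """Every change lands in a new, continuously created checkpoint."""
        self.state = {**self.state, path: data}
        self.cno += 1
        self.checkpoints[self.cno] = (self.state, False)

    def mark_snapshot(self, cno):
        """Promote a checkpoint to a snapshot so it survives cleaning."""
        state, _ = self.checkpoints[cno]
        self.checkpoints[cno] = (state, True)

    def clean(self):
        """Reclaim plain checkpoints; snapshots and the latest state stay."""
        self.checkpoints = {c: v for c, v in self.checkpoints.items()
                            if v[1] or c == self.cno}

fs = LogStructuredFS()
fs.write("/etc/fstab", "v1")
fs.write("/etc/fstab", "v2")
fs.mark_snapshot(1)            # keep the version from moments ago
fs.write("/etc/fstab", "v3")
fs.clean()
print(sorted(fs.checkpoints))  # → [1, 3]
```

The promoted checkpoint keeps the old file version readable (and mountable read-only, in real NILFS) even after the cleaner reclaims the intermediate checkpoints.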

                • (Score: 1) by soylentnewsinator on Tuesday September 11 2018, @10:35PM (2 children)

                  by soylentnewsinator (7102) on Tuesday September 11 2018, @10:35PM (#733374)

I finally had to create an account. Gentoo was my primary OS back in 2004-2007. I later transitioned to Ubuntu, as I found fewer things broke. (I'm not sure if that's the nature of a rolling-release distro versus Ubuntu's six-month releases, or two years if you use LTS.) On and off I've been using FreeBSD for various servers, until a couple of years ago when I transitioned all my servers to FreeBSD due to ZFS support (and partially jails).

I see that Gentoo finally updated their web design. I recall they held a contest over 10 years ago, and despite a winner being selected, they cancelled the idea due to 'reasons'. It was a big red flag to me that their management would hold the contest and then change their mind as soon as a winner was selected.

Since I left Gentoo around the time you started using it: in your experience, has the stability noticeably increased? Is there a recommendation for how frequently to upgrade if there's no security reason to do so? Is there a decent tool to update config files to new versions? (That hurt me numerous times in the past.) How often do packages break on you?

                  • (Score: 2) by hendrikboom on Wednesday September 12 2018, @12:53PM

                    by hendrikboom (1125) Subscriber Badge on Wednesday September 12 2018, @12:53PM (#733574) Homepage Journal

                    Breakage of config files? I've often thought config files should be checked into revision control with a vendor branch and a local branch, and that the installer should set this up. After installation it's too late, because the installer has already modified them. Maybe also an installer branch, intermediate between these two.

                  • (Score: 2) by pendorbound on Wednesday September 12 2018, @03:10PM

                    by pendorbound (2688) on Wednesday September 12 2018, @03:10PM (#733641) Homepage

                    I wouldn't say Gentoo has "improved" much in the intervening time. It's definitely the nature of the rolling release beast. Benefit is bleeding edge latest packages. Draw back is the blood on the edge is frequently your own...

                    The recommendation for upgrade frequency is, "early and often..."

                    The thing about updating Gentoo is it's uncomfortable, but the longer you go between updates, the more it hurts. If you hold off for security related only updates, there's a good chance major chunks of core system will have changed their structure since your last update. You'll have multiple conflicting packages you need to upgrade, big filesystem layout or config format changes, and your life's gonna suck for the next couple of days. If you're lucky, you can manually set version masks to upgrade in stages (assuming the intervening versions are still in the Portage tree). If not, you're manually fixing stuff, forcing versions to get around slot conflicts, etc. If you suck it up and do the updates regularly, you usually get the benefit of migration scripts to fix that stuff. The scripts are seldom maintained for more than a few version updates though. If you wait too long, you're on the wrong side of the gulf between the old & new way with no migration process to automate moving across.

                    It's been a while since I've left myself with an unbootable system & had to shove a recovery disc in, but it's happened... Especially with ZFS root, if you munge the ZFS driver in your initramfs, you're cooked. Keeping your previous kernel / initramfs images as alternates in GRUB is usually enough to get back to a usable system and clean up. Usually...

                    Configuration file changes are handled with dispatch-conf which is so-so. It gives you a diff of your original versus the proposed changes. The changes are always clobber-jobs. They don't attempt to apply only changes on top of your customizations, just replace whole files. Usually I end up either ignoring the changes and keeping my own or dropping into an editor (which dispatch-conf will facilitate) and hand-merge the changes. I'd love them to use something like Augeas for config migrations, but....

                    To be honest, I wouldn't use Gentoo with Portage in production. It's okay for my home stuff. Having latest versions of stuff that I don't have to `./configure && make && make install` is handy. For enterprise level reliability, I wouldn't recommend Gentoo/Portage. The only case I could see using Gentoo "for real" would be in an immutable VM situation where "upgrades" mean canning a new gold template and individualizing it into your various apps using Puppet or something. In that case, I wouldn't even include Portage or the portage tree. Just a super cut-down Linux install with the bare minimum for the app.

                • (Score: 2) by hendrikboom on Wednesday September 12 2018, @01:00PM (1 child)

                  by hendrikboom (1125) Subscriber Badge on Wednesday September 12 2018, @01:00PM (#733578) Homepage Journal

                  Just wondering: How much RAM did you have on that BSD system using ZFS way back in 2008? Machines were a lot smaller then.

                  • (Score: 2) by pendorbound on Wednesday September 12 2018, @03:15PM

                    by pendorbound (2688) on Wednesday September 12 2018, @03:15PM (#733647) Homepage

                    My 2008 dedupe near-disaster happened with 8GB of RAM (max the motherboard could handle...) for a 3TB pool (4x750GB, RAIDZ-1), SATA attached WD consumer class drives.

                    Two ebay surplus MB updates later and a ton of "pulled" DIMMs on the cheap, and I'm at 96GB for 43TB total storage across 4 pools and 18 spindles of various makes and models. Still no dedupe enabled though...
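For anyone wondering why 8GB wasn't enough for dedupe on a 3TB pool: each unique block needs an in-core dedup-table entry, commonly ballparked at around 320 bytes, so the RAM bill scales with block count. A back-of-envelope Python calculation (rule-of-thumb figures, not official ZFS numbers):

```python
# Back-of-envelope dedup table (DDT) sizing using the commonly quoted
# figure of roughly 320 bytes of RAM per unique block. These are rules
# of thumb, not official ZFS numbers.
def ddt_ram_gib(pool_bytes, avg_block_bytes, bytes_per_entry=320):
    unique_blocks = pool_bytes / avg_block_bytes  # worst case: no duplicates
    return unique_blocks * bytes_per_entry / 2**30

TIB = 2**40
# The 2008 scenario above: a 3 TB pool on a box maxed out at 8 GB RAM.
print(ddt_ram_gib(3 * TIB, 64 * 2**10))   # → 15.0 (GiB, 64K average blocks)
print(ddt_ram_gib(3 * TIB, 128 * 2**10))  # → 7.5 (GiB, 128K average blocks)
```

Either way the table alone would not fit alongside the ARC in 8GB, which matches the "near-disaster" experience: once the DDT spills out of RAM, every write turns into random reads.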

            • (Score: 2) by DannyB on Tuesday September 11 2018, @09:25PM

              by DannyB (5839) Subscriber Badge on Tuesday September 11 2018, @09:25PM (#733324) Journal

              Linking ZFS into Linux isn't violating Oracle's license or copyright.

              I think it's the owner of a GPL licensed copyright work that you must worry about.

Are there any kernel contributors who would sue because their work is licensed under the GPL, and therefore anything linked with it must, as per the GPL, also be licensed under the GPL? ("Viral license.")

              Oracle might not care. But a kernel dev would seem to have a technically legitimate claim to assert -- however picky and petty I might think it might be.

              One problem, as I understand it, with kernel licensing is that nobody knows who all of the contributors are. And some are dead, and therefore their copyright ownership would go to their estate, and who knows who might control that. Said differently, it is probably impossible to ever change the license of the kernel, or to get everyone to stipulate to some additional clause. (Example: Java JDK licensed under GPL + Classpath Exception to the GPL. That exception means that running your code on Java as a platform does not bring your code under the scope of the GPL, but any modification or addition to the java platform must be under the GPL license.)

              --
              When trying to solve a problem don't ask who suffers from the problem, ask who profits from the problem.
            • (Score: 2) by DannyB on Tuesday September 11 2018, @09:44PM (2 children)

              by DannyB (5839) Subscriber Badge on Tuesday September 11 2018, @09:44PM (#733337) Journal

              GPL is a distribution license agreement

It's not really an agreement, although at least one court has treated it that way, seeing the requirements of the GPL as consideration and, under that reasoning, enforcing those requirements upon a GPL violator.

              License: - a Permission. Nothing more. Fishing License. Dog License. Driver License. Marriage License (which doesn't need to be periodically renewed like a Dog License, in order to maintain its validity.)

              Copyright: - right(s) reserved exclusively to the author of a creative work, by law. Books, music, etc, player piano rolls, and computer software. Sometimes these rights are sliced and diced into a million pieces, like the RIAA with performance rights, mechanical rights, distribution rights, etc.

              Copyright infringement: - unauthorized exercise of any of the rights exclusively reserved to the copyright owner.

              Copyright License: - A permission, to exercise some subset (up to and including all) rights exclusively reserved to the copyright owner.

              The only way you can have a copyright license is if the copyright owner or their agent gives you such a license to use certain of the rights.

              In some cases you can get a copyright license by agreeing to a contract which may involve paying money, promising to give your firstborn, your vital organs, etc. as well as other promises on your part, such as not making copies of the licensed work.

              EULA: - a click through "agreement" that purports to bind you to some contract.

An open source license is a license that gives you permission to exercise certain of the copyright owner's exclusive rights. You don't have to agree to anything -- but not doing so means you are not granted a license (eg, permission) to exercise those rights. The GPL must be raised by the defendant, not the plaintiff. The plaintiff would simply go to court: Dear judge, this scoundrel is distributing (or maybe even just using) my GPL licensed work without a license, make him stop and give me damages and attorney's fees!!! It is up to the defendant to invoke the GPL and say, "but judge, I have a license". Then the plaintiff can point out that because you are in violation of clauses X, Y, and Z of the license, you actually do not have any license to exercise the rights reserved exclusively to the copyright owner. (The license required that any linked code must also be under the GPL.)

Maybe in some sense it is an agreement. But not like a contract: you don't sign it, and you don't exchange consideration. (Although one judge did see it differently, holding that upholding your obligations is the consideration you exchange for being granted permission to exercise some exclusive rights.)

              --
              When trying to solve a problem don't ask who suffers from the problem, ask who profits from the problem.
              • (Score: 2) by pendorbound on Wednesday September 12 2018, @03:27PM (1 child)

                by pendorbound (2688) on Wednesday September 12 2018, @03:27PM (#733652) Homepage

                GPL is absolutely a contract which grants you a copyright license as part of its terms. You receive the valuable consideration of the right to distribute a copyrighted work quid pro quo you agree to take certain required actions in exchange. IE in order to distribute binaries, you are required to distribute (or offer in some cases) the corresponding source code. If you refuse to accept the agreement by violating its terms, you lose the benefits afforded you by accepting the agreement and are therefore distributing a copyrighted work without a license to do so, in violation of copyright law.

                Dear judge, this scoundrel is distributing (or maybe even just using) my GPL licensed work without a license

                Only half of that is a valid argument in court. Distributing, yes. You're a copyright pirating scoundrel. Using, nope. You are not required to accept the GPL to make use of software licensed under it. Whoever gave you the copy was required to do so in compliance with the license terms (thus accepting the agreement that granted them the license to distribute), and you must do likewise if you make a copy and give it to someone. For use on your own systems in your own environments, no license is required, and GPL doesn't apply to ANYTHING you do. If you read GPLv2 word for word, there isn't a single term which requires anything of an end user who is NOT distributing a copy of the software to a third party. It's impossible to violate the GPL exclusively on your own system.

                The waters get muddy with GPLv3 where making a web app available over a network is considered a "distribution" of it, in part on the basis that any HTML, CSS, etc. contained in it is copyrighted and would require the benefit of the GPL license in order for you to distribute that to another user's web browser. Linux kernel is GPLv2, so that's not relevant to the ZFS case.

                • (Score: 2) by DannyB on Wednesday September 12 2018, @03:54PM

                  by DannyB (5839) Subscriber Badge on Wednesday September 12 2018, @03:54PM (#733668) Journal

                  At least one court agrees with you that it is a contract.

                  If that was its intent, then it should be called an agreement rather than a license. The license (i.e. permission) is granted as a condition of an agreement.

                  The word license means permission.

                  I understand that distribution is the context in which everyone talks about the GPL. That wouldn't stop someone from suing you over linking, even if you would win the lawsuit, even if you are correct and vindicated.

                  I'll point to one example I remember. In the MySQL days, some time back. The DB server is GPL licensed, no problem. But . . . all drivers and connectors to it were also GPL licensed. Not LGPL but GPL. Commercial developers might like to use MySQL. So they all had various ways of dancing around it. One that I learned of was that they don't distribute the MySQL driver with their product. Instead they have the customer install the product and separately install the database driver. Thus the "linking", in any sense of the word, was done by the customer. If the vendor distributed the MySQL driver at all, it was strictly under GPL terms and unrelated to the product they sold. It is something I considered doing a long time ago, but decided not to go that route. It was clear that the copyright owner definitely considered this a violation of the GPL even if most open source people did not. Reading the copyright owner's licensing description made it clear that they viewed it this way -- even if they were being deceptive or confused. They wanted to sell a commercial license to commercial developers. It's not worth getting sued over. I don't think it is a problem today.

                  I think GPLv3 is a mess. Way unnecessarily complex. I could easily read and understand the GPLv2 as I think most people could. The LGPLv3 is even worse -- in complexity -- because you first have to understand the GPLv3 in order to then understand what the LGPLv3 relaxes. I understand that Stallman wants to prevent Tivoization. And I applaud his efforts in the GPL and LGPL which effectively did prevent the Microsoftization of open source because of its viral nature.

                  Don't even get started about the AGPL. The purpose of AGPL, prior to GPLv3, was to make a web app be "distribution" effectively. Even code that you did not distribute into the browser. For example, you could not use an AGPL licensed library in your server, even if that library has no code that ever leaves the server.

                  My understanding of Linux + ZFS is this: technically it should be okay. At least Linus says that's his interpretation. But he cannot bind others to that. So a distribution needs to get the end user to link ZFS into the kernel at runtime. I think this is an even more dangerous situation than the MySQL (GPL) driver separately installed into a vendor's product by the customer. The reason is that the distribution is distributing both ZFS and Linux; they're just not being linked together until runtime -- but the clear intent of the distribution is for it to be linked together. And it is not Oracle you need to worry about; it is a kernel developer or their estate.

                  --
                  When trying to solve a problem don't ask who suffers from the problem, ask who profits from the problem.
  • (Score: 5, Interesting) by bzipitidoo on Tuesday September 11 2018, @04:29PM (13 children)

    by bzipitidoo (4388) on Tuesday September 11 2018, @04:29PM (#733189) Journal

    I used ReiserFS (version 3) for several years, and was looking at ReiserFS 4. We all know what happened to those.

    Tried XFS in the datacenter, and ran into an embarrassing problem with it being incredibly slow at deleting a large directory tree. Took 5 minutes (!) to delete a Linux kernel source code directory. Investigating, it turned out that the default parameters for XFS were the absolute worst possible for that particular hardware, with block sizes mismatched in the worst way. I moved everything off and reformatted those partitions with XFS with much better parameters, and that reduced the time for that operation to a still slow but much more bearable 10 seconds or so.

    When btrfs came along, I gave it a go. It too had a use case at which it was extremely poor: the sync command. Firefox likes to sync, often. So Firefox ran very slow on a btrfs partition. I hear btrfs has greatly improved the performance of sync, but it is still a bit slow. Another big criticism of btrfs was the lack of repair tools such as fsck.

    ext2 of course takes a long time to do a fsck because it doesn't use journaling. ext2 is still useful for database storage in which the journaling merely gets in the way of their own management methods, so it's better to use ext2 than ext3 or 4 for that.

    For all else, I find ext4 really is the file system with the least nasty surprises, least likely to have some use case at which it is really, really slow. Sure, it's going to have a rough time dealing with the gigantic directory containing over 64k files, has to be configured differently than the default configuration to support that at all, but that's a much less common use case than deleting a large directory tree. Resizing the file system is another use that I find is rarely needed, so whenever some file system proponents brag about how quickly their favorite file system can do that, I just shrug. Guess I'm getting old and conservative and cranky when it comes to file systems.

    One thing I do miss about ye olde FAT is the ease of undeleting a file if you accidentally delete it. It's not guaranteed to work, but if you haven't made many changes to the file system since the deletion, odds are it will work. Whereas, after an "rm", it's usually still possible to recover the data but it takes a lot more effort. Like, oh, you have to unmount the partition then scan through the whole freaking thing with grep, looking for whatever unique data the file had that you can remember accurately. And that's if you have not encrypted the partition. As for the trash can, I really do not much like it. "rm" bypasses it, so it's no help there. The biggest reason to delete stuff is that I'm running short of space, and the trash can just gets in the way of that. You think you deleted something, but the amount of free disk space didn't budge. Having to tell the system to empty the trash is an extra step I'd rather not have to do. Anyway, I find backups a much better way to guard against accidental file deletion.
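    A raw-device grep recovery pass might look like the following rough sketch. The device is simulated with a disposable image file here, and the filename and search string are hypothetical; on a real system you would grep the unmounted partition itself:

```shell
# Demo against a throwaway image file standing in for the unmounted partition.
dd if=/dev/zero of=/tmp/disk.img bs=1M count=1 2>/dev/null
printf 'important: my tax notes 2018\n' | dd of=/tmp/disk.img conv=notrunc 2>/dev/null

# The actual recovery step: scan the raw "device" for a phrase you remember
# from the deleted file, treating it as text (-a) and printing the byte
# offset of each match (-b):
grep -a -b 'tax notes' /tmp/disk.img
```

Once you have an offset, `dd` with `skip=` can carve out the surrounding region of the device for closer inspection.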

    I have not tried ZFS, in large part because it wasn't free.

    • (Score: 4, Insightful) by pendorbound on Tuesday September 11 2018, @04:44PM (7 children)

      by pendorbound (2688) on Tuesday September 11 2018, @04:44PM (#733207) Homepage

      ZFS is "Free" as in speech. It's just that its CDDL license is not compatible with GPL. You could make an argument that GPL is less free than CDDL since it's a restriction in the GPL which makes it impossible to use with CDDL, not the other way around. CDDL code is freely used in BSD-licensed software without any legal issues.

      The fact that Sun staff are on-record saying that CDDL was written based on MPL explicitly because it made it GPL-incompatible and limited ZFS' usage with Linux kind of muddies the situation. Still, the CDDL license is free as in speech according to both the FSF and OSI. https://en.wikipedia.org/wiki/Common_Development_and_Distribution_License [wikipedia.org]

      I went through a similar filesystem evolution as you did and landed on ZFS. I've been using it in home-production since 2008. Its built-in snapshot and checksumming support have saved my bacon several times in the face of both hardware and operator errors. I'd highly recommend giving it a look. The one caveat is look at the minimum hardware requirements, balk at how insane they are for a filesystem, and then make sure you comply with them anyways. ZFS does not operate well in RAM constrained environments.

      • (Score: 3, Informative) by VLM on Tuesday September 11 2018, @07:00PM (5 children)

        by VLM (445) on Tuesday September 11 2018, @07:00PM (#733249)

        The one caveat is look at the minimum hardware requirements, balk at how insane they are for a filesystem, and then make sure you comply with them anyways. ZFS does not operate well in RAM constrained environments.

        The whole 1GB ram per 1TB of storage thing? Not really needed. Dedupe is memory hungry. If you're trying to do an office NAS for many (hundreds?) of users simultaneously, you'll want more than 1G per 1T for caching reasons.

        You can't hurt anything by having more ram, that's for sure.

        I'm typing this on a freebsd box with 32 GB of ram for fast compiles and only 250GB x2 of storage, so it's very nice, but not necessary.

        • (Score: 2) by Entropy on Tuesday September 11 2018, @07:44PM (4 children)

          by Entropy (4228) on Tuesday September 11 2018, @07:44PM (#733273)

          No. That's for dedup. (Deduplication) which is by no means necessary. I think ZFS doesn't work well if you have 512M of RAM or something, but over a gig should be fine.

          • (Score: 2) by pendorbound on Tuesday September 11 2018, @07:58PM

            by pendorbound (2688) on Tuesday September 11 2018, @07:58PM (#733281) Homepage

            It's life & death critical for dedupe(*), but you're still going to want much more RAM than normal for ZFS. ZFS' ARC doesn't integrate (exactly) with Linux's normal file system caching. I've seen significant performance increases for fileserver and light database workloads by dedicating large chunks of RAM (16GB out of 96GB on the box) exclusively for ARC. It'll *work* without that, but ZFS is noticeably slower than other filesystems if it doesn't have enough ARC space available. Particularly with partial-block updates, having the rest of the block in ARC means ZFS doesn't have to go to disk to calculate the block checksum before writing out the new copy-on-write block. Running with insufficient ARC causes ZFS to frequently have to read an entire block in from disk before it can write an updated copy out, even if it was only changing one byte.

            (*) Source: Once tried to enable dedupe on a pool with nowhere near enough RAM. Took over 96 hours to import the pool after a system crash as it rescanned the entire device for duplicate block counts before it was happy the pool was clean. Had to zfs send/receive to a new pool to flush out the dedupe setting and get a usable system.
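            For reference, a sketch of how one might cap the ARC on ZFS on Linux, or the FreeBSD equivalent. The 16 GiB figure and file paths are examples, not recommendations:

```shell
# Linux (ZFS on Linux): cap the ARC at 16 GiB (the value is in bytes)
echo 17179869184 > /sys/module/zfs/parameters/zfs_arc_max

# Persist the setting across reboots via module options:
echo "options zfs zfs_arc_max=17179869184" > /etc/modprobe.d/zfs.conf

# FreeBSD equivalent, in /boot/loader.conf:
#   vfs.zfs.arc_max="16G"
```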

          • (Score: 0) by Anonymous Coward on Wednesday September 12 2018, @10:18AM (1 child)

            by Anonymous Coward on Wednesday September 12 2018, @10:18AM (#733536)

            but over a gig should be fine.

            My home network media backup ZFS box had 2GB of RAM, and I ran into occasional stability issues (unexplained random reboots). It now has 4GB and runs solid 24/7. I'd say throw as much RAM at it as your motherboard can handle.
            Once I can source a cheap secondhand multi-CPU server with over 32GB of RAM, I'll move the current disk pool over to it and consider firing up the (probably now badly needed) dedupe facility. The SD cards from the various family digital cameras and phones get regularly backed up to the server, and the networked home directories live there as well, so no doubt there are multiple copies of the same images and music files lurking on it.

            • (Score: 0) by Anonymous Coward on Wednesday September 12 2018, @10:42AM

              by Anonymous Coward on Wednesday September 12 2018, @10:42AM (#733543)

              My home network media backup ZFS box had 2GB ram, I ran into occasional stability issues (unexplained random reboots)

              I should have added that this was during the testing phase: I ran the thing for a month and seriously hammered it, upped the beastie to 4GB, then hammered it again for another couple of weeks before finally going live with it.

          • (Score: 2) by VLM on Wednesday September 12 2018, @11:09AM

            by VLM (445) on Wednesday September 12 2018, @11:09AM (#733554)

            (Deduplication) which is by no means necessary.

            Dedupe is almost never necessary. Under really weird conditions if you're running over 1000 almost identical virtual compute nodes (maybe a webhosting farm using virtualization?) then you can save some cash on storage. But under normal conditions you're basically trading high speed ram which is money and heat and energy intensive for slightly lower bulk storage which is cheap and getting cheaper; generally not a win.

            A good analogy is dedupe is kinda like the old windows "autoexec on media insertion" which sounds nifty but turns out to be not so great overall.

      • (Score: 2) by hendrikboom on Tuesday September 11 2018, @09:33PM

        by hendrikboom (1125) Subscriber Badge on Tuesday September 11 2018, @09:33PM (#733331) Homepage Journal

        I found the following:

        https://www.reddit.com/r/DataHoarder/comments/5u3385/linus_tech_tips_unboxes_1_pb_of_seagate/ddrngar/ [reddit.com]

        Some well meaning people years ago thought that they could be helpful by making a rule of thumb for the amount of RAM needed for good write performance with data deduplication. While it worked for them, it was wrong. Some people then started thinking that it applied to ZFS in general. ZFS' ARC being reported as used memory rather than cached memory reinforced the idea that ZFS needed plenty of memory when in fact it was just used in an evict-able cache. The OpenZFS developers have been playing whack a mole with that advice ever since.

        I am what I will call a second generation ZFS developer because I was never at Sun and I postdate the death of OpenSolaris. The first generation crowd could probably fill you in on more details than I could with my take on how it started. You will not find any of the OpenZFS developers spreading the idea that ZFS needs an inordinate amount of RAM though. I am certain of that.

        And also https://www.reddit.com/r/DataHoarder/comments/5u3385/linus_tech_tips_unboxes_1_pb_of_seagate/ddrh5iv/ [reddit.com]

        A system with 1 GB of RAM would not have much trouble with a pool that contains 1 exabyte of storage, much less a petabyte or a terabyte. The data is stored on disk, not in RAM with the exception of cache. That just keeps an extra copy around and is evicted as needed.

        The only time when more RAM might be needed is when you turn on data deduplication. That causes 3 disk seeks for each DDT miss when writing to disk and tends to slow things down unless there is enough cache for the DDT to avoid extra disk seeks. The system will still work without more RAM. It is just that the deduplication code will slow down writes when enabled. That 1GB of RAM per 1TB data stored "rule" is nonsense though. The number is a function of multiple variables, not a constant.

        So I now wonder what the *real* limits are on home-scale systems. In particular, suppose I have only a few terabytes. And a machine with only a half gigabyte of RAM. And used for nothing more bandwidth-intensive than streaming (compressed) video over a network to a laptop.

        What I like about ZFS is its extreme resistance to data corruption. That's essential for long-term storage. My alternative seems to be btrfs. Currently I'm using ext4 on software-mirrored RAID, which isn't great at detecting data corruption.

        -- hendrik

    • (Score: 1) by pTamok on Tuesday September 11 2018, @08:45PM (2 children)

      by pTamok (3042) on Tuesday September 11 2018, @08:45PM (#733298)

      One thing I do miss about ye olde FAT is the ease of undeleting a file if you accidentally delete it.

      This is why I use NILFS2 [wikipedia.org].

      Find checkpoint timestamped before the file's deletion, convert checkpoint to read-only snapshot*, mount snapshot, copy file back, unmount snapshot, convert snapshot back to checkpoint, done.

      *This stops it from being deleted by the garbage collector. You don't have to do this, but you run the risk of the checkpoint being tidied away when the garbage collector runs to free up disk space. NILFS2 treats the disk as a circular buffer of copy-on-write blocks, so it is continuously chasing its tail.
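      A rough sketch of that sequence with the nilfs-utils tools. The device, checkpoint number, and file paths are all hypothetical:

```shell
lscp /dev/sdb1                      # list checkpoints with their timestamps
chcp ss /dev/sdb1 1234              # convert checkpoint 1234 to a snapshot
mkdir -p /mnt/snap
mount -t nilfs2 -o ro,cp=1234 /dev/sdb1 /mnt/snap
cp /mnt/snap/home/me/lost-file ~/restored-file
umount /mnt/snap
chcp cp /dev/sdb1 1234              # demote the snapshot back to a checkpoint
```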

      • (Score: 3, Informative) by linuxrocks123 on Wednesday September 12 2018, @03:31AM (1 child)

        by linuxrocks123 (2557) on Wednesday September 12 2018, @03:31AM (#733467) Journal

        I used NILFS2 for a while, then one day it oopsed() the kernel and made the filesystem read-only when reading from part of the directory tree in my Pale Moon user profile directory.

        Then after rebooting I discovered it would now persistently oops()/read-only the filesystem when reading from that location. I had to tar everything into a backup file on another machine and reformat the drive back to ext4 to restore normal operation to the machine.

        I did like NILFS2's features for the year or so I used it, but, well, that's it for that.

        • (Score: 1) by pTamok on Wednesday September 12 2018, @07:29AM

          by pTamok (3042) on Wednesday September 12 2018, @07:29AM (#733517)

          I'm sorry to hear that.

          I had one issue with NILFS, probably caused by my ignorant habit of using the power button on my laptop to get out of unresponsive blank screens caused by the display code. I too had to do a backup/restore to resolve that particular issue - but now that I've discovered REISUB [wikipedia.org] (or REISUO), I generally manage semi-graceful shutdowns (or at least ones where emergency sync has been done).

          I do have backups, and I've had no further problems with NILFS on my hardware. Its performance on my SSD is adequate for my purposes, and having the continuous checkpoint/snapshot capability is quite nice. I can understand you have a different use case/priorities. It would be nice if there were an fsck program - it's on the NILFS2 'todo list' [sourceforge.io], but development on NILFS2 is slow - probably because not a lot of people need it, using ext4's journalling, or BTRFS or ZFS instead. It's probably worth reading the 'Current Status' document that that todo list is part of so you can come to a decision as to whether you would use it or not.

    • (Score: 0) by Anonymous Coward on Tuesday September 11 2018, @11:09PM

      by Anonymous Coward on Tuesday September 11 2018, @11:09PM (#733378)

      XFS version 5 was released a few years ago; it requires a new mkfs.xfs and you should use -m crc=1,finobt=1; the deletion problem you've mentioned is not really a problem at this point, and the default parameters for XFS have improved massively as well (the XFS people like to say "all the turbo switches are on by default.") The mkfs.xfs voodoo has been largely eliminated. su/sw for RAID is the only other hard bit you may want to use for better speed.
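      In other words, something like the following sketch. The device names and RAID geometry are placeholders:

```shell
# Modern XFS with metadata checksums and the free inode btree:
mkfs.xfs -m crc=1,finobt=1 /dev/sdb1

# On RAID, optionally match the stripe geometry
# (example: 64k stripe unit, 4 data disks):
mkfs.xfs -m crc=1,finobt=1 -d su=64k,sw=4 /dev/sdc1
```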

    • (Score: 0) by Anonymous Coward on Tuesday September 11 2018, @11:23PM

      by Anonymous Coward on Tuesday September 11 2018, @11:23PM (#733383)

      My evolutionary history of filesystem usage is remarkably similar to yours. Arrived at ZFS around 2010/2011. Been using OmniOS as a storage backend for a couple of years while my servers were ESXi-based. Now that I'm switching to Proxmox (OmniOS won't boot on KVM) I need to work around ZFS shortcomings on FreeBSD/FreeNAS, but it's still a winning team.

      The CIFS "time machine" functionality based on snapshots alone is worth it. AFAIK Linux can't do that yet, wake me when they're there :)

      For the time being, my Proxmox-based home server actually boots from a ZFSonLinux mirror, but my storage is on FreeNAS still. Moving company servers to Proxmox soon, even though FreeNAS' shortcomings with regards to snapshot staggering are putting me off.

  • (Score: 5, Informative) by ilsa on Tuesday September 11 2018, @04:41PM (2 children)

    by ilsa (6082) Subscriber Badge on Tuesday September 11 2018, @04:41PM (#733202)

    ZFS, while it does have some limitations, is IMO the most solid "filesystem" out there. I put that in quotes because it's more than a file system. It's a file system+LVM+an advanced raid controller rolled into one.

    The closest modern equivalent is BTRFS. The difference is that ZFS is mature and stable technology, and can do things that BTRFS still can't do. It's designed specifically for dealing with large banks of disks. (Which is why ZFS doesn't have the option to change RAID modes: at its inception, people who used ZFS typically had massive disk arrays and would never bother with such an activity.)

    The biggest feature with ZFS is that it actively fights bit rot. It checksums basically everything, and routinely does "scrubs" of the filesystem. If you set up some flavour of RAID, it will fix any errors it finds.

    The second coolest feature is that you can carve up your array however you want. It doesn't do partitions. It does zVols and datasets, so you can split your files up however you want, and can even carve out an individual piece of your array and present it as a raw block device. And then you can snapshot your volumes for easy backups. ZFS simply rocks.

    • (Score: 1) by woodcruft on Tuesday September 11 2018, @09:25PM (1 child)

      by woodcruft (6528) on Tuesday September 11 2018, @09:25PM (#733323)

      The biggest feature with ZFS is that it actively fights bit rot. It checksums basically everything, and routinely does "scrubs" of the filesystem.

      On FreeBSD you have to enable routine scrubs using periodic(8). Eg:

      [me@localhost]$ grep -i zfs /etc/defaults/periodic.conf
      # 404.status-zfs
      daily_status_zfs_enable="NO"                            # Check ZFS
      daily_status_zfs_zpool_list_enable="YES"                # List ZFS pools
      # 800.scrub-zfs
      daily_scrub_zfs_enable="NO"
      daily_scrub_zfs_pools=""                        # empty string selects all pools
      daily_scrub_zfs_default_threshold="35"          # days between scrubs
      #daily_scrub_zfs_${poolname}_threshold="35"     # pool specific threshold

      Of course, you can override those defaults in: /etc/periodic.conf

      No idea what the situation is with ZFS on Linux. I prefer to use an OS where ZFS is a "1st class citizen", to be honest.

      --
      :wq!
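      On ZFS on Linux there is no periodic(8); a common approach is a cron entry per pool. The pool name and schedule below are hypothetical:

```shell
# /etc/cron.d/zfs-scrub -- scrub the pool "tank" at 03:00 on the 1st of each month
0 3 1 * * root /sbin/zpool scrub tank
```

Check progress and results afterwards with `zpool status tank`.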
  • (Score: 2) by Entropy on Tuesday September 11 2018, @07:52PM (1 child)

    by Entropy (4228) on Tuesday September 11 2018, @07:52PM (#733279)

    Also the command line interface for ZFS is logical, and easy. BTRFS to me has always been a mess. Here's an example:
    The syntax:
    btrfs subvolume snapshot /tank/homes /tank/homes/mysnap
    vs
    zfs snapshot tank/homes@mysnap

    Please note that afterwards in BTRFS you're left with a subdirectory "mysnap" that you have to deal with if you do something like rsync. Not only is it mounted and impossible to unmount, it's writable for no apparent reason. Why would I want to write to my snapshot, exactly?! In ZFS you don't have that problem as that is hidden. If for some reason you want it to be writable, you can duplicate it easily enough and then mount it somewhere.
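    The "duplicate it easily enough" part is a clone; a sketch, with dataset names hypothetical:

```shell
# Read-only access to a snapshot is always available via the hidden directory:
ls /tank/homes/.zfs/snapshot/mysnap

# If you really want a writable copy, clone it:
zfs clone tank/homes@mysnap tank/homes-scratch

# ...and throw the clone away when done:
zfs destroy tank/homes-scratch
```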

    Also for virtual machines I can use a zvol under ZFS. Not only that but I can block level replicate the ZVOL device to another computer. BTRFS has no such capability.
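    Block-level replication of a zvol is just snapshot plus send/receive; a sketch, with host and dataset names hypothetical:

```shell
zfs snapshot tank/vm0@rep1
zfs send tank/vm0@rep1 | ssh backuphost zfs recv backup/vm0

# Later, send only the blocks changed since rep1:
zfs snapshot tank/vm0@rep2
zfs send -i rep1 tank/vm0@rep2 | ssh backuphost zfs recv backup/vm0
```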

    Everyone talks about the license vs GPL, but license aside if you just want the best filesystem out there give ZFS a try.

    • (Score: 2) by pendorbound on Tuesday September 11 2018, @08:07PM

      by pendorbound (2688) on Tuesday September 11 2018, @08:07PM (#733285) Homepage

      My favorite part of virtual machines on zVOLs is backups. With traditional disk-image-file VMs, you boot the VM and most of the umpteen-GB group of image files is "dirty" for your next backup. With zVOLs, only the blocks your VM actually modified get backed up in the next snapshot / zfs send. There's no need to send an entire 2GB image file just because one byte in that particular slice got changed.

  • (Score: 2) by BananaPhone on Tuesday September 11 2018, @08:53PM (3 children)

    by BananaPhone (2488) on Tuesday September 11 2018, @08:53PM (#733303)

    FYI: Expanding capacity with ZFS is more expensive compared to legacy RAID.

    http://louwrentius.com/the-hidden-cost-of-using-zfs-for-your-home-nas.html [louwrentius.com]

    And now I'm turned off on ZFS. You?

    • (Score: 1) by soylentnewsinator on Wednesday September 12 2018, @05:49AM

      by soylentnewsinator (7102) on Wednesday September 12 2018, @05:49AM (#733500)

      I skimmed through the article. You can grow your pool over time, but each vdev is only as big as its smallest disk.

      This underlines the fact that for a home user, your goal shouldn't be to have as big of a *pool* as possible, but rather, keep smaller pools.

      For my home NAS, I'm moving to mirrored pools of only 2 disks and splitting up my data. If there's a catastrophic failure of a pool, I can use ddrescue on either drive. There is the added benefit of if you try to 'repair' a pool. If you imaged the entire pool (2 disks) and ran a repair that failed, you could restore them. Would you want to do this exercise with 10 drives in your pool? Probably not.

    • (Score: 0) by Anonymous Coward on Wednesday September 12 2018, @09:22AM

      by Anonymous Coward on Wednesday September 12 2018, @09:22AM (#733526)

      For multiple reasons, pools of mirrors are the generally accepted "golden path". I.e. RAID10-equivalent. You can always grow your pool by adding one vdev at a time, with a pool of 2-way mirrors that means adding two HDDs. Not so bad, is it? :)

      With RAIDZ(-2,-3), a vdev can span many disks. But it's not the preferred way to go for a speedy pool. You only really go RAIDZ if you value capacity over everything else, because a RAIDZ pool becomes slower as it fills up. In home server parlance, it makes for a nice datasink for media files. Not for files you're regularly working on.
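      Growing a pool of mirrors by one vdev is a one-liner; a sketch, with pool and device names hypothetical:

```shell
# Append a new two-way mirror vdev to the existing pool "tank":
zpool add tank mirror /dev/sdc /dev/sdd

# Confirm the new layout:
zpool status tank
```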

    • (Score: 2) by pendorbound on Wednesday September 12 2018, @03:48PM

      by pendorbound (2688) on Wednesday September 12 2018, @03:48PM (#733665) Homepage

      Most of the articles I've read (including that one) on growing ZFS pools are a little myopic to one really important detail. It's a valid criticism that expanding ZFS pools costs more money than LVM or similar since you need to replace all devices. Too often that's generalized to "you can't expand zpools" which is incorrect.

      Say you've got a 4x 1TB RAID5 array with one drive redundancy (ZFS calls that RAIDz1). That gives you 3TB usable and can survive one drive failure. On LVM and similar non-ZFS volume managers, you could add another 1TB drive, restripe, and end up with 4TB usable and still one drive redundancy on a 5x 1TB RAIDz1. ZFS won't support that. 100% correct.

      Here's the scenario most articles miss:

      If you buy four new 2TB drives, you can replace each of the 1TB drives in turn with a 2TB drive. You offline one, replace the hardware, tell ZFS to replace the missing 1TB device with the new 2TB device, and ZFS will resilver the pool to the new device. Note you still only have 3TB usable at this point, not 4TB like you might expect. So you do the same in turn with the other three devices, removing, replacing, and waiting for resilver to complete for each. At the end of the resilver on the final device in the VDEV, you'll suddenly see your pool size has grown to 6TB usable. ZFS can use the additional storage provided you expand ALL of the underlying devices in a VDEV.

      The drawbacks are that you DO need to replace all your drives in one go, not just bolt on some new ones. That costs some $$$. Personally, that's the way I've ALWAYS done upgrades as I usually have one ailing drive and don't want to leave its litter mates around to possibly succumb to similar disease shortly after replacing one. It also takes potentially a long time to do all that incremental resilvering, assuming the pool was full (which it probably was or why else do this?). When you're done, you have the four original drives freed up. If you have the ports and want the storage, you can add them back as a new VDEV and either append it to the existing pool or create a new pool.

      You could also accomplish the same by adding all the new devices, creating a new pool, and zfs send/recv'ing the data over. The benefit of the resilver dance is the pool is online the entire time. You don't have to shutdown services, export/re-import the pool to rename it, or deal with changed mount point names. Assuming your motherboard, SATA/SAS controller, backplane, etc. can handle the hotswap, you can usually do all of that with zero host downtime.
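      The replace-and-resilver loop described above might be sketched like this. Pool and device names are hypothetical, and setting `autoexpand` up front saves a manual expansion step at the end:

```shell
zpool set autoexpand=on tank
for old in sda sdb sdc sdd; do
    # physically swap $old for a larger drive in the same slot, then:
    zpool replace tank "$old"
    # wait for the resilver to finish before touching the next disk:
    while zpool status tank | grep -q 'resilver in progress'; do
        sleep 60
    done
done
zpool list tank   # capacity grows once the last device has resilvered
```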

  • (Score: 2) by BananaPhone on Tuesday September 11 2018, @08:55PM (2 children)

    by BananaPhone (2488) on Tuesday September 11 2018, @08:55PM (#733306)

    I Really want a NAS with per-file chksum+AutoRepair.

    QNAP => ZFS needs ECC memory + $$ hardware
    Synology => BTRFS
    FREENAS => ZFS (and never will do BTRFS)
    RockStor => BTRFS

    Can you expand on the fly with any of these?

    • (Score: 0) by Anonymous Coward on Tuesday September 11 2018, @10:17PM

      by Anonymous Coward on Tuesday September 11 2018, @10:17PM (#733365)

      ZFS DOES NOT need ECC memory.

    • (Score: 2, Informative) by DECbot on Wednesday September 12 2018, @02:40AM

      by DECbot (832) on Wednesday September 12 2018, @02:40AM (#733454) Journal

      The hidden cost of zfs is the pool will always be limited by the size of the smallest disk. My setup: 4 disks, two 1TB disks and two 2TB disks in a Z1 configuration (think RAID5). My pool size is limited to around 3TB until I replace every disk with a 2TB one, and then I will have around 6TB. Not too bad for my homebrew setup, but when you look at upgrading an enterprise setup with a dozen 6TB WD Red disks, it gets expensive because there is no bump to your storage space until every disk has been replaced.
       
      Also, there is no on-the-fly conversion from Z1 to Z2 or Z2 to Z1. You have to transfer your data to a different zfs pool, destroy your old pool, recreate, and send the data back.
       
      Excuse me if I got some of the terminology wrong. I'm just a hobbyist trying to learn a new thing. Point out my mistakes so we can all learn.

      --
      cats~$ sudo chown -R us /home/base
  • (Score: 3, Interesting) by mmh on Tuesday September 11 2018, @09:05PM

    by mmh (721) on Tuesday September 11 2018, @09:05PM (#733310)

    ZFS is the only filesystem that matters.

    For those here running Fedora, I've written a step-by-step guide for converting your root filesystem(s) to ZFS: http://matthewheadlee.com/projects/zfs-as-root-filesystem-on-fedora/ [matthewheadlee.com]

  • (Score: 2) by mmh on Tuesday September 11 2018, @09:12PM

    by mmh (721) on Tuesday September 11 2018, @09:12PM (#733316)

    Aaron Toponce has an excellent series of ZFS articles on "How it works" and "Why it's the best" over at https://pthree.org/2012/04/17/install-zfs-on-debian-gnulinux/ [pthree.org]. Highly recommend checking it out.
  • (Score: 2) by wonkey_monkey on Tuesday September 11 2018, @10:22PM (1 child)

    by wonkey_monkey (279) on Tuesday September 11 2018, @10:22PM (#733367) Homepage

    with file systems of up to 256 quadrillion zettabytes in size should you have enough electricity to pull that off.

    Never mind electricity, I don't think we've got enough *galaxy* to pull that off.

    --
    systemd is Roko's Basilisk
    • (Score: 2, Touché) by Anonymous Coward on Tuesday September 11 2018, @10:26PM

      by Anonymous Coward on Tuesday September 11 2018, @10:26PM (#733369)

      You underestimate the amount of porn available for download.

  • (Score: 0) by Anonymous Coward on Tuesday September 11 2018, @11:42PM (1 child)

    by Anonymous Coward on Tuesday September 11 2018, @11:42PM (#733389)

    I started playing with btrfs a few months back. While I can't think of a compelling use case for subvolumes at the moment, I am using it as the filesystem containing my backups. After all my machines have backed up, I create a read-only snapshot of the state. I'm using bees to deduplicate the filesystem.

    What I've seen: deduplication saves a fair amount of space when it comes to system backups (it finds copies of the GPL, for example, as separate files and within ISOs). Compression may not save as much as it costs in CPU time. Quotas are available, but bees has caused the btrfs kernel module to have extreme slowdowns because of locking when quotas were enabled. Read-only snapshots can be turned into read-write copies on demand, but I have a suspicion that a high number of snapshots will cause btrfs to slow down. Bees also isn't currently optimized to recognize identical extent identifiers across snapshots, so it has to traverse all snapshots to find duplicate extents that may have already been deduplicated elsewhere.
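    The read-only snapshot step after each backup run can be sketched like this (a dry run that only prints the command; the paths are hypothetical):

```shell
#!/bin/sh
# Dry-run sketch: read-only btrfs snapshot after the backups land.
# The paths are hypothetical.
run() { echo "+ $*"; }   # print each command instead of executing it

STAMP=$(date +%Y-%m-%d)
# -r makes the snapshot read-only, so a later dedup pass can't mutate it
run btrfs subvolume snapshot -r /srv/backups "/srv/backups/.snapshots/$STAMP"
```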

    • (Score: 2) by Subsentient on Wednesday September 12 2018, @04:28AM

      by Subsentient (1111) on Wednesday September 12 2018, @04:28AM (#733485) Homepage Journal

      I use snapshots to backup my OS if I'm worried about something messing it up. As for compression, depends on the algorithm. The new zstd compression algorithm gives similar ratios to zlib but is just *so much* faster. Needs kernel 4.14 or above though. I compress all my btrfs filesystems with zstd. It makes a significant positive difference I've found.

      As for performance, I haven't noticed a negative impact since switching to btrfs for the root FS or for my home directory. Benchmarks still say it's worse than ext4, but that said, I often end up missing deduplication, compression, snapshots, etc. when I'm working on an ext4 system.

      Just, don't mess with the RAID stuff yet. Even the RAID1/RAID0 stuff is buggy at best. It won't corrupt your data, but it'll probably say you have no free space.

      Btrfs isn't yet at the level of ZFS, but considering it's included in the mainline kernel and works pretty well, I prefer it. It's also much lighter on RAM than ZFS.
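      Roughly what the zstd setup looks like in practice (a dry-run sketch that only prints the commands; the device and mount point are hypothetical):

```shell
#!/bin/sh
# Dry-run sketch: btrfs with zstd compression (needs kernel 4.14+).
# Device and mount point are hypothetical.
run() { echo "+ $*"; }   # print each command instead of executing it

run mount -o compress=zstd /dev/sdb1 /mnt/data
# Existing files aren't recompressed automatically; defragment can do it:
run btrfs filesystem defragment -r -czstd /mnt/data
```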

      --
      "It is no measure of health to be well adjusted to a profoundly sick society." -Jiddu Krishnamurti
  • (Score: 2) by mechanicjay on Wednesday September 12 2018, @04:43AM

    by mechanicjay (7) <reversethis-{gro ... a} {yajcinahcem}> on Wednesday September 12 2018, @04:43AM (#733486) Homepage Journal

    Where I work now, we use ZFS extensively for central file servers as well as data volumes on any server we care about.

    Single file restores are easy as pie with the snapshots. When replacing hardware, zfs recv'ing the entire datastore back into place is a thing of magic.

    I recently put together a series of shell scripts to do a near real-time sync of data across machines. I wanted to eliminate single points of failure in a load-balanced server environment, so I'm doing a snapshot on a "master" node once a minute and sending it out to the secondary nodes. Each system then gets to be fully independent. It's a thing of beauty. It beats the pants off hokey solutions like glusterfs and is faster than calling back to a central NFS store.
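    A rough dry-run sketch of one pass of that per-minute sync (the dataset, host, and snapshot names are all made up; it only prints the commands it would run):

```shell
#!/bin/sh
# Dry-run sketch: one pass of a per-minute incremental zfs sync.
# "tank/www", "node2" and the snapshot names are hypothetical.
run() { echo "+ $*"; }   # print each command instead of executing it

DS=tank/www
PREV=sync-prev            # base snapshot shared with the secondaries
NOW="sync-$(date +%s)"
run zfs snapshot "$DS@$NOW"
# -i sends only the delta since the last common snapshot
run "zfs send -i $DS@$PREV $DS@$NOW | ssh node2 zfs recv -F $DS"
# rotate so the next pass uses the new snapshot as its base
run zfs destroy "$DS@$PREV"
run zfs rename "$DS@$NOW" "$DS@$PREV"
```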

    Basically I've become a huge fan and kind of agree that it's the only file system that matters if you care about your data, because it's robust and easy to use.

    --
    My VMS box beat up your Windows box.
  • (Score: 0) by Anonymous Coward on Wednesday September 12 2018, @02:35PM

    by Anonymous Coward on Wednesday September 12 2018, @02:35PM (#733610)

    If you do a lot of Mac + Linux, ZFS is amazing. Once you have drivers on both, you can export a RAIDZ multi-disk JBOD USB3 enclosure from a Linux machine, plug it into a Mac, and it pretty much "just works". Dual boot works great, too. Network replication is dead simple.

    I started using ZFS for the cheap RAID functionality, but have really fallen in love with the benefits of root-partition snapshots, compression, and SSD-cached performance.

    The only big issue I have with ZFS on Mac and Linux is the lack of trim support.
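    The export/import dance when moving the enclosure is about this simple (a dry-run sketch that only prints the commands; the pool name is hypothetical):

```shell
#!/bin/sh
# Dry-run sketch: moving a pool between machines/OSes.
# Pool name "tank" is hypothetical.
run() { echo "+ $*"; }   # print each command instead of executing it

run zpool export tank    # on the Linux box, before unplugging the enclosure
run zpool import tank    # on the Mac, after plugging it in
```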

  • (Score: 1) by Geoff Clare on Friday September 14 2018, @10:25AM

    by Geoff Clare (2397) on Friday September 14 2018, @10:25AM (#734762)

    Just correcting some misinformation in the summary...

    Oracle did not "shut down Solaris"; they shut down OpenSolaris by taking Solaris back to being closed-source. Solaris is still being developed and 11.4 was released just recently. It is the first UNIX system to be certified by The Open Group [opengroup.org] as conforming to SUSv4/POSIX.1-2017.
(1)