This is probably one of those topics that gets regurgitated periodically, but it's always good to get some fresh answers.
The small consultancy I work for wants to set up a new file server with remote backup. In the past we've used a Windows XP file server, plugging in a couple of external USB drives whenever space ran out. Backups were performed nightly to a USB drive and taken offsite to a trusted employee's home.
They are looking at Linux for the new file server (I think mostly because they found out how much a new Windows file server would cost).
I'm not a server guy, but I have set up a simple Debian-based web server at work for a specific intranet application. When I was asked for ideas about the new system, the best I could come up with was maybe ssh+rsync (which I have only recently started using myself, so I'm no expert by any means). Using Amazon's cloud service has been suggested, as has making the remote end a dedicated machine at a trusted employee's home (probably with a new dedicated line in) or at our local ISP (if they can offer such a service). A new dedicated line out of the office has also been suggested, I think mainly because daily file changes can potentially be quite large (3D CAD models etc.). One possible advantage of the remote end being nearby is that the initial backup could be done with a portable hard drive instead of having to upload terabytes of data (though I guess there are always courier services).
Anyway, just thought I'd chuck it out there. A lot of you guys probably already set up and/or look after remote backup systems, so even ideas about potential traps/pitfalls would be handy. The company is fairly small (20-odd employees), so I don't think they need anything overly elaborate, but all feedback is appreciated.
(Score: 0) by Anonymous Coward on Monday July 14 2014, @01:07PM
Everything revolves around a single workhorse of a file/web/database server: six cores, 16GB RAM (it would work fine on 2GB, honestly, but caching is nice), and 4x 1TB drives in what might be considered a strange configuration. The drives are partitioned identically: partition 1 is a ~200MB boot partition, partition 2 is a ~20GB root partition, and partition 3 is a ~970GB home partition. I use Linux software RAID to set up /dev/sd?1 as RAID1, /dev/sd?2 as RAID1, and /dev/sd?3 as RAID5. This yields almost 3TB of storage that can withstand a single disk failure without loss and rebuilds reasonably quickly when a drive must be replaced. (You must ALWAYS expect server drive failures; they happen often. mdadm --manage --add is your friend.) ALWAYS use a UPS on the server, even if it's just a cheap one; minor power flickers can have major effects.
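As a hedged sketch, creating the arrays for that layout might look like the following (the sda–sdd device names are assumptions; verify yours with lsblk before running anything destructive):

```shell
#!/bin/sh
# Sketch only: assumes all four drives are already partitioned
# identically as described (boot / root / home), and that the
# devices really are sda..sdd on your machine.

# 4-way RAID1 mirrors for /boot and /:
mdadm --create /dev/md0 --level=1 --raid-devices=4 \
      /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1
mdadm --create /dev/md1 --level=1 --raid-devices=4 \
      /dev/sda2 /dev/sdb2 /dev/sdc2 /dev/sdd2

# RAID5 across the big partitions: 4 x ~970GB yields ~2.9TB usable.
mdadm --create /dev/md2 --level=5 --raid-devices=4 \
      /dev/sda3 /dev/sdb3 /dev/sdc3 /dev/sdd3

# After swapping in a replacement disk, re-add its partitions, e.g.:
# mdadm --manage /dev/md2 --add /dev/sdd3
```

Rebuild progress can be watched in /proc/mdstat while the array resyncs.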
Everything is shared out via Samba, so access to the server is platform-agnostic. Since there is no need for per-user security, everything is stored under /home/store and Samba is set up to share this out without requiring authentication. I wrote a backup script that runs in cron.daily and maintains a mirror of the storage area (rsync -avb --delete --backup-dir=/home/backup/backup.1 /home/store/ /home/backup/backup.0/) while performing daily snapshot-based backups with folder rotation: changed or deleted files are moved from the mirror into their own per-day directory. /home/backup/backup.0 is the mirror, while /home/backup/backup.1, /home/backup/backup.2, ... hold "changes up to 1 day old, changes up to 2 days old," and so on. I keep about two months of these, so /home/backup/backup.60 is deleted rather than having its number incremented. The /home/backup directory is shared out via Samba as "backup" with root-level access but read-only, so users can retrieve backup copies themselves as needed.
I perform MariaDB database dumps during this process so that I have backups of the server's database outside of the actual DB storage area's raw files.
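One way to script such a pre-backup dump (the credentials file and output path here are assumptions, not the author's setup):

```shell
#!/bin/sh
# Hypothetical pre-backup dump of every MariaDB database into the
# storage area so the nightly rsync picks it up.  The credentials
# file and output directory are assumptions.
# --single-transaction gives a consistent snapshot of InnoDB tables
# without holding locks for the duration of the dump.
mysqldump --defaults-extra-file=/root/.my-backup.cnf \
    --all-databases --single-transaction --routines --events \
    | gzip > "/home/store/db-dumps/all-databases-$(date +%F).sql.gz"
```

Dumping to SQL text, rather than relying on the raw files under the DB data directory, means the backup restores cleanly even across MariaDB versions.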
On top of all this, I also use a 3TB USB 3.0 external hard drive to back up the contents of the server; I keep that drive off-site and generally in my personal possession except while performing backups. In my case I opted to perform multiple rsync runs: one backs up the entire contents of the root filesystem to the external drive, excluding /dev, /proc, /sys, and /home; the second backs up /home with further exclusions for directories that don't need backing up (for example, one could exclude the snapshot-based backup area if it churned way too much).
The external drive is the most important part. A server can have RAID all day long, but one power glitch scrambling a little memory somewhere is all it takes to trash a filesystem and make life miserable. You want protection not only against drive failure but also against filesystem damage, data corruption, and unlikely catastrophic hardware failures (e.g. a massive power spike blowing out the server and every drive in the array). If my server were completely destroyed, my backup drive would still contain everything that's truly important.