[kwlug-disc] btrfs/zfs for backups

Paul Nijjar paul_nijjar at yahoo.ca
Wed Dec 3 14:04:43 EST 2014


On Tue, Dec 02, 2014 at 10:27:32PM -0500, Chris Irwin wrote:
> On 12/02/2014 08:44 PM, Paul Nijjar wrote:
> 
> >Here is some information about the infrastructure:
> >- The fileserver will consist of a bunch of Samba shares
> >- Symantec BackupExec (yes, I know) will write big backup files to
> >   these shares. Most of the files are about 4GB, but a few are
> >   almost 100GB.
> >- The two servers are connected via a wireless link that effectively
> >   runs at 100Mbit
> >- The backup storage media are Western Digital Green drives (sorry
> >   Cedric)
> >- The servers themselves are nothing special: 64-bit intel
> >   workstations with 2-4GB of RAM.
> >- These are backup files, so data integrity is important
> >- We can assume there are 100s of GB of data backed up each week,
> >   although I am not sure whether this means hundreds of files are
> >   changing each week. (This could be the case; BackupExec has a habit
> >   of doing things in the most inconvenient way possible.)
> >
> >I am interested in hearing about how well btrfs works for the btrfs
> >send/receive scenario I am thinking about, [...]
> 
> Assuming BackupExec is creating new full backups each time, you're
> not really going to get much benefit from doing a `btrfs send`
> versus just scp, xcopy, or rsync to your remote destination. Sending
> incremental changes only works when the data changed incrementally.
> Sure, the source workstation is 99% unchanged, but a new full backup
> is entirely new data.

Right. If BackupExec creates brand new files for each backup, then I
am in trouble. But if it is overwriting old files with similar
content, then I win?

There are definitely a few files that do not change from backup to
backup. If I can somehow coerce BackupExec into using the same files
for its giant file backups (VM backups and Exchange datastores), then
sending differentials across the wire becomes important.
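
Something like this is what I have in mind for the btrfs case (the
paths, hostname, and snapshot names are made up, and I have not tried
this on the actual servers):

    # /srv/backups must itself be a btrfs subvolume.
    # Take a read-only snapshot after each backup run completes.
    btrfs subvolume snapshot -r /srv/backups /srv/backups/.snap/2014-12-03

    # First transfer: send the whole snapshot to the other server.
    btrfs send /srv/backups/.snap/2014-12-03 \
        | ssh backup2 btrfs receive /srv/backups/.snap

    # Later transfers: send only the delta against the previous snapshot.
    btrfs send -p /srv/backups/.snap/2014-12-03 \
        /srv/backups/.snap/2014-12-10 \
        | ssh backup2 btrfs receive /srv/backups/.snap

The catch is that btrfs is copy-on-write: if BackupExec rewrites most
blocks even in files whose content is unchanged, the delta between
snapshots will not be small. That is the part I would need to test.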

The other big win I can see with btrfs/zfs as opposed to xcopy is that
I would be copying a consistent snapshot to the other server, rather
than a live share whose files might be changing mid-copy.

Some sense of restartable copy is also important. If the copy gets
interrupted for some reason, I want to be able to resume without
starting from scratch. As far as I know an interrupted btrfs send
stream has to be restarted from the beginning, which counts against
it here. xcopy and robocopy both have a /Z (restartable mode) flag.
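
On the Linux side, rsync handles exactly this for plain files (the
flags are from the man page; the paths are invented):

    # --partial keeps a partially transferred file around so the next
    # run can pick it up instead of restarting a 100GB file from zero;
    # --inplace writes directly into the destination file, which suits
    # huge backup files; --progress is for watching the slow link.
    rsync -av --partial --inplace --progress \
        /srv/backups/ backup2:/srv/backups/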

A third factor is that Windows file-sharing operations have
traditionally been chatty and slow. I am hoping that recent versions
of the SMB protocol (and of Samba on the Linux side) have fixed this,
but I am not convinced.

I am more inclined to go with btrfs than zfs because my understanding
is that zfs wants a lot of memory: the rule of thumb I keep seeing is
on the order of 1GB of RAM per TB of storage, and much more if you
turn on deduplication. On 2-4GB workstations that is not an
insurmountable problem, but it makes life more difficult.

> If you're looking at rsync, btrfs send, etc., can I assume changes
> only occur on one server? Why not look at xcopy on your existing
> windows server instead of dfs, at least as a first step?

Okay. You and others have convinced me to take another look at
robocopy and xcopy. Because there is so much data, it is important not
to copy files that don't need copying.
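
For reference, a robocopy invocation along these lines might be the
first thing to try (the share names are placeholders):

    REM /MIR mirrors the tree and skips files that are already up to
    REM date (it also deletes destination files gone from the source);
    REM /Z copies in restartable mode; /R and /W cap the retries and
    REM the wait between them so a flaky link does not hang the job.
    robocopy \\server1\backups \\server2\backups /MIR /Z /R:5 /W:30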

> 
> I only have one real piece of advice for Windows server... :)
> 

I know, I know. I'm sorry!

- Paul

-- 
http://pnijjar.freeshell.org




