Ask Slashdot: Asynchronous RAID-1 Free Software Backup For Laptops?

timothy posted about a year ago | from the just-the-changes-ma'am-just-the-changes dept.

Data Storage 227

First time accepted submitter ormembar writes "I have a laptop with a 1 TB hard disk. I use rsync to perform my backups (hopefully quite regularly) on an external 1 TB hard disk. But with such a large hard disk, backups take quite some time because rsync scans the whole disk for updates (15 minutes on average). Is there some kind of asynchronous RAID-1 free software that would record in a journal all the changes that I make on the disk and replay this journal later, when I plug my external hard disk into the laptop? I guess that it would be faster than the usual backup solutions (rsync, unison, you name it) that scan the whole partition every time. Do you feel the same annoyance when backing up laptops?"

find & diff (0)

Spazmania (174582) | about a year ago | (#44383001)

You can find | sort | diff ahead of time (maybe in the background) and then constrain the rsync to only the files recorded to have changed.
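A minimal sketch of that idea, assuming GNU find and file names without newlines. The function names and state-file paths are made up for illustration. The listing is taken ahead of time (for example from cron), so only the comparison and the copy happen when the backup disk is plugged in:

```shell
#!/bin/sh
# Sketch: record a file listing ahead of time, then hand rsync only the changes.
# snapshot_tree, changed_paths and the state files are hypothetical names.

# Write "mtime size path" for every file under $1 into file $2, sorted.
snapshot_tree() {
    local src="$1" out="$2"
    # GNU find: %T@ = mtime in epoch seconds, %s = size, %P = path relative to $src
    find "$src" -type f -printf '%T@ %s %P\n' | LC_ALL=C sort > "$out"
}

# Print the paths that are new or modified in listing $2 compared to listing $1.
changed_paths() {
    local old="$1" new="$2"
    # comm -13 keeps lines found only in the new listing; cut drops mtime and size
    LC_ALL=C comm -13 "$old" "$new" | cut -d' ' -f3-
}

# Possible usage (commented out so the sketch has no side effects):
#   snapshot_tree "$HOME" /var/tmp/home.new      # run in the background beforehand
#   changed_paths /var/tmp/home.old /var/tmp/home.new > /var/tmp/changed.txt
#   rsync -a --files-from=/var/tmp/changed.txt "$HOME"/ /mnt/backup/home/
#   mv /var/tmp/home.new /var/tmp/home.old
```

Deleted files are not propagated this way, so an occasional full rsync --delete run would still be needed.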

Re:find & diff (3, Insightful)

rrohbeck (944847) | about a year ago | (#44383039)

How is traversing the whole directory tree with find different from what rsync does?
Running a daemon that lists modified files using inotify might work.
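A rough sketch of such a daemon, assuming inotifywait from the inotify-tools package is installed. The function names and the journal path are hypothetical:

```shell
#!/bin/sh
# Sketch of an inotify-based change journal. Function names and paths are hypothetical.
# Requires inotifywait (inotify-tools).

# Append the path of every written, created, deleted or moved file under $1
# to the journal file $2. Runs until killed; start it at login or boot.
watch_changes() {
    inotifywait -m -r -q \
        -e close_write -e create -e delete -e moved_to -e moved_from \
        --format '%w%f' "$1" >> "$2"
}

# Print each journaled path once, sorted.
pending_paths() {
    LC_ALL=C sort -u "$1"
}

# Possible usage (keep the journal outside the watched tree):
#   watch_changes "$HOME" /var/tmp/backup-journal &
#   ...later, with the backup disk mounted:
#   pending_paths /var/tmp/backup-journal > /var/tmp/changed.txt
#   rsync -a --files-from=/var/tmp/changed.txt / /mnt/backup/
```

Two caveats: changes made while the watcher is not running are missed, and a recursive watch on a large tree can run into the kernel's inotify watch limit (fs.inotify.max_user_watches). Journaled paths that have since been deleted will make rsync complain, so deletions need separate handling.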

Re:find & diff (0)

emt377 (610337) | about a year ago | (#44383161)

How is traversing the whole directory tree with find different from what rsync does?

It's different in that you don't have to sit and wait for it and doing the backup will consist of only the actual copying. That said, updatedb already scans (for locate), so modifying this to spit out a list of actual state changes (atime,ctime,mtime) since the last run, and using this to construct one or more rsync commands might be the easiest approach. Updatedb also notices when things are removed, permitting these to be removed from the clone as well (and perhaps moved into an archive for later time travel, making it useful as an actual backup).

Re:find & diff (5, Informative)

Anonymous Coward | about a year ago | (#44383241)

It's different in that you don't have to sit and wait for it and doing the backup will consist of only the actual copying

I suggest you look again at rsync.
  - It compares changed files and copies only what has been changed. Changed files are identified by differing mtimes (by default).
  - rsync can also handle removed files with the --delete option.
  - It can do the entire filesystem tree in a single command
  - There are filter options so you can include/exclude what paths to copy (eg you don't want to copy /proc and there are some directories such as /tmp and /run which you may not care about).
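The points above could be combined roughly as follows. This is only a sketch: the exclude list, the function name and the APPLY switch are assumptions, and by default it just prints the command instead of running it:

```shell
#!/bin/sh
# Sketch: whole-system rsync backup with deletions and excludes.
# backup_root and the APPLY switch are hypothetical; adjust the excludes to your system.

backup_root() {
    local dest="$1"
    # -a archive, -H hard links, -A ACLs, -X xattrs, --delete removes files gone from the source
    set -- rsync -aHAX --delete \
        --exclude='/proc/*' --exclude='/sys/*' --exclude='/dev/*' \
        --exclude='/tmp/*' --exclude='/run/*' --exclude='/mnt/*' \
        / "$dest"
    if [ "${APPLY:-0}" = 1 ]; then
        "$@"            # really run rsync
    else
        echo "$*"       # default: only show the command
    fi
}

# Possible usage:
#   backup_root /mnt/backup/            # prints the rsync command
#   APPLY=1 backup_root /mnt/backup/    # runs it (as root)
```

The /mnt/* exclude matters when the backup disk itself is mounted under /mnt, otherwise rsync would try to copy the backup into itself.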

Re:find & diff (2)

SuperTechnoNerd (964528) | about a year ago | (#44383801)

Exactly. I think rsync will do nicely. I use it for nightly backups and I rotate through 5 increments. The oldest goes to the bit bucket. Note the copy link -l option.
A snippet:


[more rotations above]
if [ -d "$BACKUP_DEST/$(basename "$i")/increment.0" ]; then
    # Hard-link copy (-l): unchanged files share disk space between increments
    cp -al "$BACKUP_DEST/$(basename "$i")/increment.0" "$BACKUP_DEST/$(basename "$i")/increment.1"
fi

# Refresh increment.0 from the source directory
rsync -av --delete --exclude-from="$EXCLUDE_LIST" "$i/" "$BACKUP_DEST/$(basename "$i")/increment.0/"
touch "$BACKUP_DEST/$(basename "$i")/increment.0"
done
echo "Backup complete on $(date)"

Re:find & diff (1)

Anonymous Coward | about a year ago | (#44383271)

and perhaps moved into an archive for later time travel, making it useful as an actual backup

A better way to handle that is to use a copy-on-write filesystem and taking snapshots after each backup. That way you get the tree in each snapshot as it was at the time, and without duplicating space.
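As an illustration, here is a sketch using btrfs as the copy-on-write filesystem (ZFS or LVM snapshots would look different). It assumes the backup disk is formatted as btrfs and that the rsync target is a subvolume; all paths and names are hypothetical, and by default the command is only printed:

```shell
#!/bin/sh
# Sketch: freeze the state of the backup after each run with a read-only snapshot.
# Assumes a btrfs backup disk where $1 is a subvolume; names are hypothetical.

run() {
    if [ "${APPLY:-0}" = 1 ]; then "$@"; else echo "$*"; fi
}

# Create a read-only snapshot of subvolume $1 as $2/$3.
snapshot_backup() {
    local current="$1" snapdir="$2" label="$3"
    run btrfs subvolume snapshot -r "$current" "$snapdir/$label"
}

# Possible usage, right after a successful rsync into /mnt/backup/current:
#   APPLY=1 snapshot_backup /mnt/backup/current /mnt/backup/snapshots "$(date +%F)"
```

Each snapshot shares unchanged blocks with the others, so only the differences between runs take up extra space.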

Re:find & diff (2)

rrohbeck (944847) | about a year ago | (#44383499)

Rsync copies only changed files. The time-consuming part is reading all directories in the directory tree.

Re:find & diff (1)

ormembar (2996607) | about a year ago | (#44383051)

OK. But you still scan the whole disk with a find command.

Re:find & diff (0)

Anonymous Coward | about a year ago | (#44383195)

"Ahead of time" is the key part.

Re:find & diff (1)

iserlohn (49556) | about a year ago | (#44383125)

That will still take ages...

Why not give Bittorrent Sync a go? It's a decentralized "dropbox" on steroids!

http://labs.bittorrent.com/experiments/sync.html [bittorrent.com]

Re:find & diff (2)

Desler (1608317) | about a year ago | (#44383363)

Did you even read the title of the submission? He wants FOSS.

TimeMachine (1, Insightful)

Anonymous Coward | about a year ago | (#44383003)

Just buy a mac :-)

Re:TimeMachine (4, Insightful)

BitZtream (692029) | about a year ago | (#44383037)

Wouldn't solve his problem. TimeMachine takes considerable time to prep and start a backup before it actually starts doing any work. I'd guess it's likely doing the same sort of thing as rsync: gathering a list of changes.

Re:TimeMachine (0)

Anonymous Coward | about a year ago | (#44383607)

Additionally, TimeMachine is prone to the occasional "Time Machine must create a new backup for you", effectively destroying all previous file history. If there's one thing I expect from a data duplication system, it is that it does not corrupt itself, barring hardware failures.

mdadm can do this (5, Informative)

Fruit (31966) | about a year ago | (#44383007)

Use mdadm -C -b internal to create a bitmap. Detach and readd the mirror at will and it will only sync the difference.

Re:mdadm can do this (1)

Anonymous Coward | about a year ago | (#44383123)

Use mdadm -C -b internal to create a bitmap. Detach and readd the mirror at will and it will only sync the difference.

I have enough Linux experience that I've used mdadm from the command line to make RAID1 partitions, but I still don't understand what you posted. Could someone clarify or post a link explaining that?

Re:mdadm can do this (5, Informative)

Anonymous Coward | about a year ago | (#44383205)

Effectively you create a RAID 1 mirror. When you remove the external drive, the RAID degrades. The RAID bitmap keeps track of changes. When you plug the external drive in, you just have to tell it to bring it up to date, which syncs only the changes.
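A sketch of that cycle with mdadm. The device names (/dev/md0, /dev/sda2, /dev/sdb1) and function names are placeholders, not taken from anyone's setup, and the commands are only printed unless APPLY=1 is set. Creating an array over partitions that already hold data is destructive, so treat this as an outline rather than a recipe:

```shell
#!/bin/sh
# Sketch: RAID-1 with a write-intent bitmap used as an "asynchronous mirror".
# Device names are placeholders. Dry run by default; APPLY=1 executes (as root).

run() {
    if [ "${APPLY:-0}" = 1 ]; then "$@"; else echo "$*"; fi
}

# One-time setup: mirror of internal partition $2 and external partition $3.
md_create() {
    run mdadm --create "$1" --level=1 --raid-devices=2 --bitmap=internal "$2" "$3"
}

# Before unplugging the external disk: mark it failed and remove it from the array.
md_detach() {
    run mdadm "$1" --fail "$2" --remove "$2"
}

# After plugging it back in: re-add it; the bitmap limits the resync to changed regions.
md_reattach() {
    run mdadm "$1" --re-add "$2"
}

# Possible usage:
#   md_detach   /dev/md0 /dev/sdb1
#   md_reattach /dev/md0 /dev/sdb1   # then watch progress in /proc/mdstat
```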

Re:mdadm can do this (2)

ormembar (2996607) | about a year ago | (#44383649)

What will happen if the laptop hard disk fails? Let's say the laptop hard disk is disk0 in the RAID-1 configuration. The external hard disk is disk1. The degraded RAID-1 is due to the presence of disk0 and the absence of disk1. If disk0 fails for some reason, can I put a new "empty" disk0 in the laptop and mirror disk1 to disk0? I am not sure how to do that with mdadm.

Re:mdadm can do this (1)

Anonymous Coward | about a year ago | (#44383243)

I think he's saying to create a mirrored array, create a bitmap on it to record what's out of sync, and then degrade the array by removing one of the disks.

Then to backup, you re-add the removed disk, and it should only copy over the parts that have changed.

I can't say I've ever added a networked disk to a raid array, though. I'm not confident it's a good idea.

Re:mdadm can do this (3, Informative)

bitMonster (189384) | about a year ago | (#44383403)

Actually, that is done for HA pairs. You can use nbd (network block device) and then create a RAID-1 pair across the local disk and the nbd. There are better alternatives now (such as drbd), but I'm not aware of any problem with nbd+RAID. Jeff

Re:mdadm can do this (2)

kasperd (592156) | about a year ago | (#44383305)

Use mdadm -C -b internal to create a bitmap. Detach and readd the mirror at will and it will only sync the difference.

I am going to test this on my next laptop, or if I decide to upgrade my current with an SSD some day.

Meanwhile, I do have a couple of questions. How automated is this going to be? Will it automatically start to sync, once the USB/eSata disk is connected?

Can I safely attach that disk to another computer for reading? I am worried such operation might corrupt data, even if I don't write anything. If I connect the external disk to a workstation, do I risk that the RAID layer will declare the SSD to be dead and record this fact on the external disk? Is reading from the external disk going to perform a journal replay and thereby perform some unintended writes? Is the raid layer going to increase the event counter on the external disk and potentially run past the SSD or end up at the same event counter due to the same number of cycles, but on different machines?

Re:mdadm can do this (0)

Anonymous Coward | about a year ago | (#44383549)

As the external drive is commonly slower, you will also find this useful:
echo writemostly >/sys/devices/virtual/block/mdXXX/md/dev-YYY/state
  - device will only be subject to read requests if there are no other options. This applies only to raid1 arrays.

Time Machine (0, Offtopic)

Roger W Moore (538166) | about a year ago | (#44383013)

Time Machine on a Mac laptop does exactly this: it uses a journal of filesystem changes to update only the files it needs to. This is probably not much use to you, since I'm guessing that if you had a Mac you would not be asking this question, but it would be a system to look at if there is no FOSS alternative and you want to code your own.

Re:Time Machine (3, Informative)

BitZtream (692029) | about a year ago | (#44383087)

TimeMachine takes about 15 minutes to do the prep work before it starts copying for me, on a 2012 Retina MBP with 16GB of RAM and only 256GB of disk space ... 64GB taken by an unbacked up BootCamp partition and another 120 or so eaten in Windows VMs that don't get backed up either ... i.e. it's not a slow spinning platter backing up a terabyte of data.

I see no indication of any Journal, it certainly isn't making it faster. Pretty freaking slow actually.

Re:Time Machine (3, Interesting)

zieroh (307208) | about a year ago | (#44383155)

This doesn't match my experience. Time Machine fires up in the background, does its thing, and then stops shortly thereafter. Certainly much less than 15 minutes. More like five or less. This is on a new-ish iMac with a 3TB internal drive.

It wouldn't even be noticeable were it not for the fact that I can hear the TM destination drive (sitting on a shelf behind me) spin up once an hour.

Re:Time Machine (1)

zieroh (307208) | about a year ago | (#44383177)

Sorry, internal drive is 2TB. Time Machine destination is 3TB.

Re:Time Machine (2)

Blakey Rat (99501) | about a year ago | (#44383813)

TM could be doing 15 minutes of work on your own HD before it bothers spinning-up the external, you realize.

You may be correct, but your evidence doesn't match your assertion.

Re:Time Machine (0)

Anonymous Coward | about a year ago | (#44383193)

i have the same machine with a 2TB disk and back up to a western digital mybook... it only backs up to changes and takes less than 15 seconds. you're either a liar or an idiot.

#CHOOSE

Re:Time Machine (3, Interesting)

Bill_the_Engineer (772575) | about a year ago | (#44383297)

This is my current experience with mine too. However during the prep stage it is making room on my time machine drive to receive the changes. Consolidating the older files will take time.

When my drive was new and had plenty of space, the prep stage was much shorter.

Re:Time Machine (1)

sl4shd0rk (755837) | about a year ago | (#44383337)

TimeMachine takes about 15 minutes to do the prep work

Yes, because naturally he's using a Mac. He must have certainly been in a Starbucks or Panera when he posted as well. Around the Bay area nonetheless.

Re:Time Machine (2)

wonkey_monkey (2592601) | about a year ago | (#44383641)

Read from the top level and you'll see that no-one's made the assumption that he's using a Mac. This has simply become a side discussion on TM.

Re:Time Machine (1)

robably (1044462) | about a year ago | (#44383395)

Eh? Over 15 minutes? Are you backing up to an AirPort Disk rather than a wired disk? The bottleneck there would be the wireless, not your computer.

I backup a 2012 MacBook Air every evening to a 1TB 5400RPM USB drive - plug it in, it detects it, and the backup is done in 3 minutes.

Re:Time Machine (1)

DigiShaman (671371) | about a year ago | (#44383523)

TimeMachine will back up a running VM. It just has to back up the entire VM each time, whereas with the OSX environment, only the delta changes transfer after your original full backup. To address the underlying performance issue, however, just replace the internal drive with an SSD. It's that bloody simple of a solution. I can run my VMs, the rest of my Mac applications and TimeMachine all at the same time. Previously I couldn't do this with a standard HDD due to saturated disk I/O (hung with a spinning rainbow wheel mouse cursor).

Re:Time Machine (1)

h4rr4r (612664) | about a year ago | (#44383145)

Considering how slow it is I doubt it.

I sometimes use a Mac, I still prefer rsnapshot over some backup that is likely hard to deal with if you don't have another mac.

Re:Time Machine (1)

killmofasta (460565) | about a year ago | (#44383325)

But Time Machine requires you run a Mac that can run Mac OS X 10.5, and is useless with classic,
even on a triple boot. It does not work on HFS+ volumes that have been used by 10.4, or OS 9.

Time Machine is useless to me and my client...so your premise is faulty.

No, you mean buy a RECENT Macintosh.

DRBD (0)

Anonymous Coward | about a year ago | (#44383017)

Take a look at DRBD.

Re:DRBD (1)

Desler (1608317) | about a year ago | (#44383057)

But what about Dr. Feelgood?

Re:DRBD (1)

greg1104 (461138) | about a year ago | (#44383941)

Specifically how DRBD handles recovery after an outage of the replication network [drbd.org]. The situations where the disk isn't plugged in will look just like the network outage scenario DRBD handles. I'm not sure whether this will be more or less efficient than the mdadm bitmap approach outlined above, but those are the two main ways people do this specific operation.

Obligatory (5, Informative)

Anonymous Coward | about a year ago | (#44383023)

RAID is not backup.

Re:Obligatory (5, Informative)

XanC (644172) | about a year ago | (#44383049)

True. I'd recommend he check out rdiff-backup, which keeps snapshots of previous syncs. Fantastic tool.

Re:Obligatory (2)

hawguy (1600213) | about a year ago | (#44383285)

RAID is not backup.

It is in this situation since he wants to mirror to an external disk , then break the mirror and unplug the disk.

It's no worse than if he does "rsync --delete" to the backup medium. (well ok, slightly worse since if the mirror fails in the middle, the backup disk is left in an inconsistent state and could be unreadable, but the rsync would also leave an unknown number of files/folders unsynced, so it's not a perfect backup itself)

As long as you have more than one backup disk, then a mirror is as safe as rsync. There may be better solutions, but either backup solution will let you recover your system from the backup disk if there's a failure of the primary system.

Back in the day (before I could make filesystem level or SAN level snapshots) that used to be how I did backups of a large database system (where "large" was 15GB, which tells you how long ago it was). I'd mirror the production system disks to a separate set of disks on the live system (the disks were already mirrored, so this was a "third mirror"), after the mirror was complete (which took most of the night) I'd quiesce the database and filesystem in the morning, break the mirror, then mount the disks on another machine to backup to tape. But I could have chosen to just pull the disks in that RAID set out of the array and put them in the tape cabinet as the backup and it would have still been a backup.

Re:Obligatory (3, Informative)

Crimey McBiggles (705157) | about a year ago | (#44383691)

Just because you've hacked RAID into part of a backup strategy does not mean that backup is a standard use-case for RAID. It's far too easy for the wrong disk to get overwritten because of all the things RAID is set up to do by default. With rsync, you're telling the disks exactly which direction the data needs to flow. In a production environment, there's also a greater chance of failure using RAID because of the whole "plugging / unplugging drives" thing. Sure, it's rare, but your operating system and/or motherboard may or may not enjoy having drives attached and detached from its SATA bus. Hearing the above, a systems administrator would assume you're confused between the terms "backup" and "mirror". It's a non-standard use-case, so the admin that arrives after you've moved on to another job will have to deal with that confusion.

Re:Obligatory (2)

hawguy (1600213) | about a year ago | (#44383969)

Just because you've hacked RAID into part of a backup strategy does not mean that backup is a standard use-case for RAID. It's far too easy for the wrong disk to get overwritten because of all the things RAID is set up to do by default. With rsync, you're telling the disks exactly which direction the data needs to flow.

In a production environment, there's also a greater chance of failure using RAID because of the whole "plugging / unplugging drives" thing. Sure, it's rare, but your operating system and/or motherboard may or may not enjoy having drives attached and detached from its SATA bus.

Hearing the above, a systems administrator would assume you're confused between the terms "backup" and "mirror". It's a non-standard use-case, so the admin that arrives after you've moved on to another job will have to deal with that confusion.

My RAID backup strategy was fully supported and recommended by the manufacturer of the storage array, and was a big selling point. It wasn't a hack. Even tape backups can suffer problems from overwriting the wrong tape if someone does something stupid. "Oh hey, the backup system says this tape isn't expired yet, I'm sure I loaded the right tape, so I'll just do a hard-erase so I can write to it"

umm yah (0)

Anonymous Coward | about a year ago | (#44383035)

It's called Time Machine. It basically takes a series of snapshots with lvm, then offloads them when you reconnect. You could achieve the same thing with shadow copy and a script as well.

DRBD or ZFS (1)

Anonymous Coward | about a year ago | (#44383043)

You could try DRBD (whole disk level) or ZFS with a detached mirror or snapshots. Both will keep track of changes and resync only what has changed since the last sync.

Any backup solution worth its name (-1)

Anonymous Coward | about a year ago | (#44383071)

Even the venerable "dump" can do this.

And then there's things like the coda filesystem, for even fancier takes on the same idea.

Really now, do you not have an internet connection or something? No browser, no access to the web, no search engines at all?

ZFS: Snapshot + send (2, Interesting)

Anonymous Coward | about a year ago | (#44383075)

Cleanest implementation of this I've seen is with ZFS.

You do a snapshot of your filesystem, and then do a zfs send to your remote backup server, which then replicates that snapshot by replaying the differences. If you are experiencing poor speed due to read/write buffering issues, pipe through mbuffer.
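Roughly, one incremental round could look like the sketch below. The dataset names, snapshot labels, host name and mbuffer sizes are placeholders, and the function only prints the commands unless APPLY=1 is set:

```shell
#!/bin/sh
# Sketch: incremental ZFS replication through mbuffer.
# zfs_incremental and all dataset/host names are hypothetical.

# $1 = local dataset, $2 = previous snapshot label, $3 = new snapshot label,
# $4 = backup host, $5 = target dataset on that host
zfs_incremental() {
    local fs="$1" prev="$2" new="$3" host="$4" target="$5"
    if [ "${APPLY:-0}" = 1 ]; then
        zfs snapshot "$fs@$new" &&
        zfs send -i "$fs@$prev" "$fs@$new" |
            mbuffer -q -s 128k -m 512M |
            ssh "$host" zfs receive "$target"
    else
        echo "zfs snapshot $fs@$new"
        echo "zfs send -i $fs@$prev $fs@$new | mbuffer -q -s 128k -m 512M | ssh $host zfs receive $target"
    fi
}

# Possible usage:
#   zfs_incremental tank/home monday tuesday backuphost backup/home
```

The first transfer has to be a full zfs send of an initial snapshot; only later rounds can use -i to send the difference between two snapshots.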

The only issue is that it requires that you have your OS on top of ZFS.

Re:ZFS: Snapshot + send (0)

Anonymous Coward | about a year ago | (#44383829)

Don't worry, the only OSes worth using can boot from ZFS or comparable FS.

ZFS (1)

Anonymous Coward | about a year ago | (#44383083)

You want two ZFS filesystems. One local laptop pool, one backup pool (and it really should have two disks, but one will work fine). Snapshot your laptop filesystem periodically (cron or something), and then zfs send/receive that snapshot to the backup pool when you have access.

Exclude directories (5, Informative)

Anonymous Coward | about a year ago | (#44383085)

Are you backing up EVERYTHING on the laptop -- OS and data included? Even if you are only backing up your home directory, there is stuff you don't need to back up, like the .thumbnails directory, which can be quite large. Try using rsync's exclude option to restrict the backup to only what you care about.

DNA
AKA mrascii

COW or desync'ed RAID (5, Informative)

phorm (591458) | about a year ago | (#44383107)

In this case, it sounds like you want a fast on-demand sync rather than a RAID.

However, you could possibly use md-raid for this if you're a Linux user.
Have the internal disk(s) as a degraded md-raid1 partition. When you connect the backup disk, have it become part of the RAID and the disks should sync up. That said, it likely won't be any faster than rsync, quite possibly slower as it'll have to go over the entire volume.

Alternate solutions:
* Have a local folder that does daily syncs/backups. Move those to the external storage when it's connected.
    CAVEATS: Takes space until the external disk is available
* Use a differential filesystem, or maybe something like a COW (copy-on-write) filesystem. Have the COW system sync over to the backup disk (when connected) and then merge it into the main filesystem tree after sync
    For example, /home is a combination of /mnt/home-ro (ro) and /mnt/home-rw (rw, COW filesystem). When external media is connected, /mnt/home-rw is synced to external media, then back over /mnt/home-ro

you backup 1TB from a laptop? (-1, Offtopic)

alen (225700) | about a year ago | (#44383129)

i have maybe 5GB worth of documents that i save to dropbox on a regular basis

Re:you backup 1TB from a laptop? (1)

ormembar (2996607) | about a year ago | (#44383209)

Well, all my research work from the last 20 years, that makes some data, and you never know which data you need when you travel. So, when I change laptop, I copy all my data from the old disk to the new disk. That's why also I want to backup only the diff, and not spend my time scanning the disk to find these differences (which can be spread all over the disk).

My solution (1, Interesting)

kiriath (2670145) | about a year ago | (#44383135)

Is to not try to keep 1TB of crap on a laptop... or anywhere for that matter. Travel light says me ;)

Re:My solution (1)

Ravaldy (2621787) | about a year ago | (#44383295)

That works for sales, managers and those who don't touch lots of data. Programmers are a good example of people who need a lot of arsenal when going on site. Customers DB, version of software, tools and more. All that amounts to lots of data. I usually recommend keeping the data on a removable USB drive that you backup when you aren't using it.

Re:My solution (1)

h4rr4r (612664) | about a year ago | (#44383525)

Have you heard of the internet?
It is super cool, you can leave the data in your datacenter and get to it from anywhere! You can even show the customer right on the server instead of dealing with your laptop and a painfully slow USB connection.

Re:My solution (1)

ccool (628215) | about a year ago | (#44383663)

How does a painfully slow USB connection compares with a painfully slow Internet connection?

Re:My solution (2)

h4rr4r (612664) | about a year ago | (#44383821)

You don't transfer anywhere near as much data over it.

You leave that on the server and use the internet just for the nice cheap display.

OS? (4, Insightful)

ralf1 (718128) | about a year ago | (#44383153)

The OP doesn't mention which OS he's on - the tools he mentions both run across multiple OS's. Would be helpful to know. I know as a group we probably assume some form of Linux but..... I use MS Home Server at the house to back up my family's multiple Windows machines. Runs on crappy hardware, does incrementals on a schedule, allows file level or bare metal restore, keeps daily/weekly/fulls as long as I ask it to. I know we aren't a Windows friendly crowd but this product does exactly what it promises and does it pretty well.

Re:OS? (1)

The MAZZTer (911996) | about a year ago | (#44383183)

I use robocopy on Windows for my 1:1 backup copy since it will use timestamps and file sizes to determine if a file needs to be synced or not. But I assume rsync does the same thing.

Re:OS? (1)

DamageLabs (980310) | about a year ago | (#44383319)

Robocopy doesn't keep the ACM dates across volumes. So it is certainly not a 1:1 copy.

The only thing that comes close, but still not there completely, is the legacy MS (Veritas) backup utility. And that one is far from automated.

Re:OS? (1)

CanHasDIY (1672858) | about a year ago | (#44383977)

Robocopy doesn't keep the ACM dates across volumes. So it is certainly not a 1:1 copy.

The only thing that comes close, but still not there completely, is the legacy MS (Veritas) backup utility. And that one is far from automated.

What about SyncToy? [microsoft.com] Seems to work pretty well, at least it does for me.

Re: Dr. Seuss Question (0)

hyades1 (1149581) | about a year ago | (#44383293)

From the back, or in its throat?

Re: OS? (0)

Anonymous Coward | about a year ago | (#44383533)

Backup software on Windows has been able to use the NTFS change journal feature to avoid scanning file systems for file level change for... I'd guess at least around ten years now.

I've been waiting for the Linux community to "invent" this feature for a long time.

It's been long enough that I'd recommend looking into either a ZFS, NTFS or Time Machine capable system/file server if you intend to do frequent incremental backups of your data. Certainly ZFS or NTFS for systems with large numbers of files to scan. You need block level or journaling for that.

You waste a ton of time and resources to get near continuous backups on Linux, for what, a mirror?? At least finagle resync into doing point in time backup sets for you. /enterprise backup admin

CrashPlan (3, Informative)

Nerdfest (867930) | about a year ago | (#44383221)

CrashPlan [crashplan.com] is free, but not open, and I think will do everything you need. You can back up to an external disk, over the network to one of your own machines, or back up to a friend who also runs it. Great key based encryption support. If you want, you can pay them for offsite backups (which is a great deal as well, in my opinion). It's cross-platform, and easy to use. Never underestimate the benefits of off-site backups.

Re:CrashPlan (1)

EW87 (951411) | about a year ago | (#44383281)

I second this

Re:CrashPlan (1)

lw54 (73409) | about a year ago | (#44384037)

For the last month, I've been using CrashPlan to back up a 5.5TB filesystem over AFP to a remote AFS file share over the Internet. I did the initial backup across the LAN and then moved the drive array to its final destination. I'm now a few weeks in after the move and for the last 4 days, it has not backed up and is instead synchronizing block information. 4 days in and it's up to 59.9%. It spent 1.5 days about a week ago doing something like recursive block purging. I wish the client could do these housekeeping chores while also performing the backup.

RSync (0)

Anonymous Coward | about a year ago | (#44383225)

RSync. Pretty simple, works on multiple operating systems.

It doesn't do exactly this, but it gets the job done efficiently.

Use a backup tool (0)

Anonymous Coward | about a year ago | (#44383231)

Such as CrashPlan [www.crashplan.com]. You can do local backups to an external hard drive or a friend for free. You only need to pay if you want to use their cloud storage instead of or in addition to local storage. You'll capture versions, it will be differential, and the data is deduplicated. Plus, CrashPlan is extremely fast for data that is compressible and/or dedup-able.

Just use Windows Backup (3, Insightful)

benjymouse (756774) | about a year ago | (#44383255)

Windows Backup (since Vista) uses Volume Shadow Copy (VSS) to do block level reverse incremental backup. I.e. it uses the journaling file system to track changed blocks and only copies over the changed blocks.

Not only that, it also backs up to a virtual hard disk file (VHD), which you can attach (mount) separately. This file system will hold the complete history, i.e. you can use the "previous versions" feature to go back to a specific backup of a directory or file.

Re:Just use Windows Backup (1)

h4rr4r (612664) | about a year ago | (#44383605)

Lots of backup software uses VSS, pretty much any credible backup software on windows. It totally lacks automation, which is a pretty big downside.

I doubt he is using windows, since he mentions rsnapshot.

Re:Just use Windows Backup (2)

DigiShaman (671371) | about a year ago | (#44383731)

Unless you're running Windows 8 or Server 2012, Windows Backup on Windows 7 and below is functionally obsolete, because the new 3TB+ drives use 4K-sector Advanced Format technology. As long as you can still find working 2TB drives and you don't have that much data to back up, you'll be fine with Windows Backup. Otherwise, upgrade the OS or use ArcServeD2D, which I know works well (but is expensive).

http://support.microsoft.com/kb/2510009 [microsoft.com]

Get a Pi (0)

Anonymous Coward | about a year ago | (#44383287)

Get a raspberry Pi, plug the drive into it, and have it run rsync daemon (set it up to auto-start in case of power failure triggering a reboot). Plug the Pi's ethernet cable into your router, and then on your laptop run a script to trigger rsync (works via Cygwin too for Windows users) via a scheduler / cron. Have it verify that you are connected to your home (work / whatever) network, and if so call up the pi and start the backup.

I set mine to try this on my work system at 11 am every day, because by that time I had come to work, dealt with my morning emails etc, and gone off to do something else, so by and large it would happen when I was not at the keyboard. You probably won't notice the slowdown anyways, even if you are there.

For bonus points add a second script that checks the logs and alerts you if you've gone a while without backing up (in case the Pi dies or something)
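A small sketch of that second script. It assumes the backup script touches a stamp file after every successful run (the stamp file and the function name are inventions for this example) and that GNU stat is available:

```shell
#!/bin/sh
# Sketch: warn when the last successful backup is too old.
# Assumes the backup script runs `touch "$HOME/.last-backup"` on success (hypothetical).

# Succeeds (exit 0) if stamp file $1 is missing or at least $2 days old.
backup_is_stale() {
    local stamp="$1" max_days="$2" now mtime
    [ -e "$stamp" ] || return 0           # never backed up counts as stale
    now=$(date +%s)
    mtime=$(stat -c %Y "$stamp")          # GNU stat: mtime in epoch seconds
    [ $(( (now - mtime) / 86400 )) -ge "$max_days" ]
}

# Possible usage, e.g. from a daily cron job:
#   backup_is_stale "$HOME/.last-backup" 3 && echo "No backup in 3+ days" >&2
```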

Whooosh (3, Interesting)

jayteedee (211241) | about a year ago | (#44383313)

Holy cow people, you're missing the OP's point. It's taking 15 minutes to SCAN the 1TB drive.

I've run into the same problem on Windows and Linux. Especially for remote rsync updates on Linux on slow wireless connections. It's not the 1TB that kills, since I can read 4TB drives with hundreds of movies in seconds. It's the number of files that kills performance.

My solution on windows is to take some of the directories with 10,000 files and put them into an archive (think clipart directories). Zip, Truecrypt, tar, whatever. This speeds up reading the sub-directories immensely. Obviously, this only works for directories that are not accessed frequently. Also, FAT32 is much faster on 3000+ files in a directory than NTFS is. Most of my truecrypt volumes with LOTS of files are using FAT32 just because of the directory reading speed.

On Linux systems, I just run rsync on SUB-directories. I run the frequently accessed ones more often and the less-accessed directories less often. Simple? No. My rsyncs are all across the wire, so I need the speed. Plus some users are on cell-phone wireless plans, so I need to minimize data usage.

Re:Whooosh (1)

ormembar (2996607) | about a year ago | (#44383433)

I used to do that: a scan using the find command to find modified directories, then, using rsync's --exclude directives, I backed up only the changed directories. At that time, I was also evaluating the size of the sub-directories so that each part I backed up was less than 1 GB. But the find search was still too long for my taste. In the end, the gain in time was very limited (if it was not actually longer than a full rsync).

Re:Whooosh (1)

jayteedee (211241) | about a year ago | (#44383635)

That takes longer since the find command scans the entire directory and file structure to find the directories. It also takes longer because querying the size takes more than just querying the name. I just used rsync to scan some of the directories hourly (accounting data, document directories, etc). Other directories were daily, and others were only monthly (install directories, tools, etc). I had to force the users into a certain file hierarchy, but that's what sys admins are for :)

RAID is not backup (-1)

Anonymous Coward | about a year ago | (#44383347)

For the record, RAID 1 is not backup. RAID is intended more as a fault tolerance solution, not backup. So, if you delete a whole set of files you later decided you needed, the mirror disk won't have it either. Or if your filesystem on the primary disk gets corrupted, the mirror disk will be as well. Stick with an actual backup solution.

I use two superb products (1)

CAOgdin (984672) | about a year ago | (#44383419)

1. For keeping two drives synchronized, check out GoodSync. It's powerful, and I use it to keep two separate computers holding identical copies of two major folders of data synchronized, so if one goes down, there's minimal loss of data (1 hour, max). I use this, for example, to keep a client's two 1TB collections of photos and iTunes synchronized. http://www.goodsync.com/ [goodsync.com]

2. For making backups that are compact, efficient and easy to recover, look at "Drive Snapshot". It's inexpensive, robust and I've never experienced a restore failure. I make "Drive Snapshot" images of every computer, every night, in a development environment. That way, if the thing I just did breaks the system, I can restore a 100 GB drive in less than an hour by booting from a CD and pointing to the backup on an external drive. http://www.drivesnapshot.de/en/index.htm [drivesnapshot.de]

Do it on a lower level. (2)

tibit (1762298) | about a year ago | (#44383425)

I'd use LVM and filesystem snapshots. The snapshot does the trick of journaling your changes and only your changes. You can ship the snapshot over to the backup volume simply by netcat-ing it over the network. The backup's logical volume needs to have the same size as the original volume. It's really a minimal-overhead process. Once you create the new snapshot volume on the backup, the kernels on both machines are essentially executing a zero-copy sendfile() syscall. It doesn't get any cheaper than that.

Once the snapshot is transferred, your backup machine can rsync or simply merge the now-mounted snapshot to the parent volume.

Re:Do it on a lower level. (3, Informative)

tibit (1762298) | about a year ago | (#44383505)

Well, of course I goofed, it's not that easy (well it is, read on). A snapshot keeps track of what has changed, yes, but it records not the new state, but the old state. What you want to transfer over is the new state. So you can use the snapshot for the location of changed state (for its metadata only), and the parent volume for the actual state.

That's precisely what lvmsync [github.com] does. That's the tool you want to do what I said above, only that it'll actually work :)

BtSync (0)

Fuzion (261632) | about a year ago | (#44383429)

How about BtSync [bittorrent.com] ?

It's based on the BitTorrent protocol, and it can sync over the internet as well.

Will your backup have you back up and running? (1)

Marrow (195242) | about a year ago | (#44383431)

If you are spending time messing with a system that is not going to provide you with a running computer after a quick trip to the store for a new hard drive, then maybe you should rethink your goals.
And perhaps you would regret the time spent less if you knew that in the event of an emergency, your backup would not only save your data, but also spare you a re-installation and updates and more updates and more updates, and hunting for installation media and typing in software keys.
AIX had/has a nice system for backups: it created a bootable backup tape. Just turn the key, boot from the tape, say go and your machine was recovered completely. The closest I have seen to that recently is Clonezilla.

Re:Will your backup have you back up and running (0)

Anonymous Coward | about a year ago | (#44383757)

I do wish Linux had something like mksysb/sysback.

Unfortunately, there isn't much in the way of bare-metal restorable backup utilities for Linux unless I reboot to another OS and run an imaging program. Even wbadmin.exe in Windows can create images for me where I just boot from recent Windows OS media, run wbadmin to restore, walk off, and reboot into a functioning OS.

If I were to recommend a backup utility for Linux, I'd probably just worry about syncing off application data and documents, since it takes -far- less time to reinstall the OS and apps than to try to piece together a working Linux box from backups.

Step 1 get a real backup (1)

silas_moeckel (234313) | about a year ago | (#44383481)

Making a mirror every now and again is not a backup strategy to use. This is the canned "RAID is NOT a backup and never will be" advice. For a single laptop something like Backblaze is probably a better bet.

Backup solution (0)

Anonymous Coward | about a year ago | (#44383531)

Yes, there is a product named ActiveImage Protector from Japan. There is an English version available from www.activeimage.net. It creates a log of changed blocks between backups (incremental backups). The base backup is a full disk or volume image (smart sector, to only back up sectors in use); each subsequent incremental backup only backs up changed blocks. Very good compression and very fast backup. Versions for Windows and Linux. You can mount backup images to restore files selectively or boot from a recovery CD to restore boot volumes and disks.

15 minutes is a fast 1-terabyte sync (0)

Anonymous Coward | about a year ago | (#44383569)

You are unlikely to find anything faster than rsync, or as reliable. 15 minutes to synchronise your backup with your live data is an impressively short time. Did you ever look at the "speedup" figure that rsync includes in its final message? It is the ratio of the total size of the dataset to the number of bytes actually sent and received, i.e. how much less data the sync moved than a complete copy would have. USB3 has an absolute maximum of 4.8 Gbit/s, which is 600 MBytes/s (or about 2 TBytes per hour), but you are unlikely to get better than 1 Gbit/s (about 0.45 TBytes per hour) sustained disk data transfer.
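The conversion behind those figures, taking the link rates quoted in the post at face value and using decimal units (1 TByte = 10^12 bytes). The helper names are made up for this sketch.

```python
def tb_per_hour(gbit_per_s):
    """Convert a link rate in Gbit/s to decimal terabytes per hour."""
    bytes_per_s = gbit_per_s * 1e9 / 8
    return bytes_per_s * 3600 / 1e12

def hours_for_full_copy(size_tb, gbit_per_s):
    """Hours needed to copy size_tb terabytes at a sustained link rate."""
    return size_tb / tb_per_hour(gbit_per_s)

print(round(tb_per_hour(4.8), 2))               # 2.16
print(round(tb_per_hour(1.0), 2))               # 0.45
print(round(hours_for_full_copy(1.0, 1.0), 2))  # 2.22
```

So a full copy of a 1 TB disk at a sustained 1 Gbit/s would take a bit over two hours, which is the baseline a 15-minute sync is being compared against.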

Upgrade your rsync! (4, Informative)

phoenix_rizzen (256998) | about a year ago | (#44383775)

You're holding it wrong. ;)

rsync 2.x was horribly slow as it would scan the entire source looking for changed files, build a list of files, and then (once the initial scan was complete) would start to transfer data to the destination.

rsync 3.x starts building the list of changed files, and starts transferring data right away.

Unless you are changing a tonne of files between each rsync, it shouldn't take more than a few minutes using rsync 3.x to backup a 1 TB drive. Unless it's an uber-slow PoS drive, of course. :)

We use rsync to backup all our remote school servers. Very rarely does a single server backup take more than 30 minutes, and that's for 4 TB of storage using 500 GB drives (generally only a few GB of changed data). And that's across horrible ADSL links with only 0.768 Mbps upload speeds!

Going disk-to-disk should be even faster.

ZFS - incremental/snapshot? (4, Informative)

Roskolnikov (68772) | about a year ago | (#44383779)

two pools, internalPool, externalPool

use ZFS send and receive to migrate your data from internal to external; you can do the whole fs, or incrementals if you keep a couple of snaps local on your internal disk. This can get excessive if you have a lot of delta or you want to keep snapshots for a long time.

http://docs.oracle.com/cd/E18752_01/html/819-5461/gbchx.html [oracle.com]

of course you will need a system that can use ZFS; there are more options for that than for Time Machine. It's block level and it's fast, and it doesn't depend on just one device: you can have multiple devices (I like to keep some of my data at work. Why? My backup solution is in the same house that would burn, if it burned...)

DRBD? (0)

Anonymous Coward | about a year ago | (#44383783)

Set up DRBD on the initial configuration to the secondary drive. I haven't used it for a while, but IIRC the changed sectors would be flagged for replication, and when the secondary device was brought online it would replicate. I know this was designed for machine-to-machine block replication, but it might be plausible on the same machine. It might be a good feature request if it isn't (but then that wouldn't help you right now).

lsyncd + some queue (0)

Anonymous Coward | about a year ago | (#44383809)

Lsyncd [google.com] plus some kind of queue to sync when the destination device is available.
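What such a queue might look like, as a sketch only: this is not Lsyncd's API, the class and method names are invented for illustration, and the journal lives in memory, whereas a real one would have to be persisted to disk to survive a reboot.

```python
import os
import shutil

class ChangeJournal:
    """Remember changed paths (relative to src_root) until a destination shows up."""

    def __init__(self, src_root):
        self.src_root = src_root
        self.pending = set()

    def record(self, rel_path):
        # In a real setup this would be called from a file-watcher callback.
        self.pending.add(rel_path)

    def replay(self, dest_root):
        """Apply pending changes to dest_root; return how many paths were handled."""
        if not os.path.isdir(dest_root):
            return 0  # destination not plugged in: keep the queue for later
        handled = 0
        for rel_path in sorted(self.pending):
            src = os.path.join(self.src_root, rel_path)
            dst = os.path.join(dest_root, rel_path)
            if os.path.isfile(src):
                os.makedirs(os.path.dirname(dst), exist_ok=True)
                shutil.copy2(src, dst)
            elif not os.path.exists(src) and os.path.isfile(dst):
                os.remove(dst)  # the file was deleted after being journaled
            handled += 1
        self.pending.clear()
        return handled
```

Because only journaled paths are touched, the replay cost depends on how much changed, not on how many files the disk holds, which is what the original question is after.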

rsync is fast for me (0)

Anonymous Coward | about a year ago | (#44383875)

I currently back up 2.5TB of data using rsync. It takes about 2-3 minutes to determine the changes and then whatever time it takes to do the actual copying. Post your actual rsync command; maybe you are doing something strange that isn't necessary.

Re:rsync is fast for me (1)

ormembar (2996607) | about a year ago | (#44383981)

Last rsync (version 3.0.9) was:
$ rsync -av --delete /home/. /auto/passport/home/. | tee -a ~/backup.log
...
sent 1001882570 bytes received 106527 bytes 1775002.83 bytes/sec
total size is 527398084971 speedup is 526.3

It took 10 minutes to scan the 500 GB partition (ext4). From the other posts, I guess the duration is due to the number of files that rsync scans.
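The summary lines above can be cross-checked with a little arithmetic. This sketch assumes that "speedup" is the total size divided by the bytes sent plus received, and that the bytes/sec figure is that same traffic divided by wall-clock time.

```python
sent = 1001882570            # "sent ... bytes"
received = 106527            # "received ... bytes"
total_size = 527398084971    # "total size is ..."
rate = 1775002.83            # "... bytes/sec"

speedup = total_size / (sent + received)
elapsed_min = (sent + received) / rate / 60

print(f"speedup ~ {speedup:.2f}")
print(f"elapsed ~ {elapsed_min:.1f} min")
print(f"dataset ~ {total_size / 1e9:.0f} GB")
```

Under those assumptions the speedup comes out at about 526 and the run time at a little under ten minutes, both in line with the figures posted, and the total size works out to roughly 527 GB.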

git-annex (0)

Anonymous Coward | about a year ago | (#44383881)

git-annex with a running assistant, maybe (in direct mode)? I would like to hear how well it works, though. It would create a git directory for holding metadata about the files, and it would be constrained to your home directory.

Some alternatives (ubuntu) (1)

brodock (1568447) | about a year ago | (#44383919)

Ubuntu documentation lists some alternatives: https://help.ubuntu.com/community/BackupYourSystem [ubuntu.com]. One that is not listed there, and that I used many years ago, is Unison. I found it faster and better than rsync; it also does binary diffs, so big files where only the "metadata" changed transfer faster.

Rebit (1)

scsirob (246572) | about a year ago | (#44383943)

Not free but reasonably priced and worth every penny. rebit.com keeps track of all changes and sends new versions of a file to a local hard disk, a network share or both. In case of a crash you can recover from a boot CD-ROM, and I've used that to transfer my files to a new computer too. They have cloud-enabled versions too.

Uhh... Why the rush? (1)

moorley (69393) | about a year ago | (#44383971)

I usually hate making posts where I am questioning the questioner rather than providing an answer, but with 1 TB of information you should put on the patience cap. It will take as long as it takes.

To break down what you are wanting:
I want a backup based on a journaling-file-system sort of thing that works incrementally, slowing down every disk operation by a few milliseconds, so I can shave 15 minutes off of a backup procedure, but I still have to send the same data. I don't think that would be very wise. The best existing method is to mirror a volume, but you're still experiencing the same "15 minutes" of delay.

The best thing you can have is a "fire and forget" procedure where you can walk away and let it run.

locate (based on updatedb) does not capture/sort on file modification dates so you are going to be left with a recursive file system search no matter what.

You could use find to generate a list of files that have been modified since a certain date and then feed that to tar. That way you can pre-generate an incremental backup in a file that you can copy over. Then let whatever backup solution you like make a full backup from time to time. You can set up a script that would run a few times in your work day to generate the file so at least every 24 hours there is a tar file you can copy over when you get a chance.
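The same idea sketched with Python's standard library standing in for find and tar. The function names are invented for this example, and the archive should be written outside the tree being scanned so that it does not end up packing itself.

```python
import os
import tarfile

def files_modified_since(root, since_epoch):
    """Yield paths under root whose mtime is newer than since_epoch."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.lstat(path).st_mtime > since_epoch:
                    yield path
            except OSError:
                continue  # file vanished or is unreadable; skip it

def write_incremental_tar(root, since_epoch, tar_path):
    """Pack the changed files into tar_path and return how many were added."""
    count = 0
    with tarfile.open(tar_path, "w:gz") as tar:
        for path in files_modified_since(root, since_epoch):
            tar.add(path, arcname=os.path.relpath(path, root))
            count += 1
    return count
```

As noted above, this is still a recursive walk of the whole tree; what it buys is that the walk can happen in the background during the day, leaving only a single file to copy when the external disk is attached. Deleted files are not recorded, so an occasional full backup is still needed.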

Good luck!

mechanical drives are slow (0)

Anonymous Coward | about a year ago | (#44384005)

i know this post won't help much but i just want to say that mechanical laptop drives are really slow. i haven't tried using a laptop with a solid state drive though. i just did a benchmark of an old 900 MHz Celeron Netbook. the average read transfer rate of the drive was about 33 megabytes per second vs 60 megabytes per second for my desktop SATA drive.

come to think of it, my external USB drive has an average transfer rate of about 30 MB/s too. copying big files takes a long time. i remember spending about 40 minutes or more to copy a 20 GB MMORPG so that I can move it from one computer to another.

Btrfs send/receive (4, Informative)

jandar (304267) | about a year ago | (#44384007)

Btrfs send/receive should possibly do the trick. After first cloning the disk, and before every subsequent transfer, create a reference snapshot on the laptop and delete the previous one after the transfer.

# btrfs send only accepts read-only snapshots, hence -r
$ btrfs subvolume snapshot -r /mnt/data/orig /mnt/data/backup43
$ btrfs send -p /mnt/data/backup42 /mnt/data/backup43 | btrfs receive /mnt/backupdata
$ btrfs subvolume delete /mnt/data/backup42

I haven't tried this for myself, so the necessary disclaimer: this may eat your disk or kill a kitten ;-)
