Ask Slashdot: Temporary Backup Pouch?

timothy posted more than 2 years ago | from the don't-forget-your-spare-co-backup-pouch dept.

Data Storage

An anonymous reader writes "It looks simple. I've got a laptop and a USB HDD for backups. With rsync, I only move changes to the USB HDD for subsequent backups. I'd like to move these changes to a more portable USB stick when I'm away, then sync again to the USB HDD when I get home. I figured with the normality of the pieces and the situation, there'd be an app for that, but no luck yet. I'm guessing one could make a hardlink parallel-backup on the laptop at the same time as the USB HDD backup, then use find to detect changes between it and the actual filesystem when it's time to back up to the USB stick. But there would need to be a way to preserve paths, and a way to communicate deletions. So how about it? I'm joe-user with Ubuntu. I even use grsync for rsync. After several evenings of trying to figure this out, all I've got is a much better understanding of what hardlinks are and are not. What do the smart kids do? Three common pieces of hardware, and a simple-looking task."


153 comments

SkyDrive (-1, Offtopic)

partofme (2643183) | more than 2 years ago | (#40061525)

Just a suggestion, but SkyDrive [live.com] is great. You get 25GB for free, and it is also going to be fully integrated into Windows 8 in the future.

Re:SkyDrive (1)

Zemran (3101) | more than 2 years ago | (#40061597)

Did you read the summary? Obviously SkyDrive is of no use, but there are several other alternatives that would be better suited to this purpose, although if, as he says, it is for use while travelling, an internet-based system is useless. I would suggest reading up more on what you can do with dd and writing a couple of scripts to suit your needs. The problems are more likely to be around using USB, but you can always write a script that puts a compressed file on your desktop that you manually copy to your USB stick. Old tech is normally the most reliable.
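For what it's worth, the "compressed file on your desktop" idea can be a one-liner; a minimal sketch, assuming the data to protect lives under ~/Documents:

tar -czf ~/Desktop/backup-$(date +%F).tar.gz ~/Documents    # then copy the dated archive to the stick by hand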

Re:SkyDrive (2)

partofme (2643183) | more than 2 years ago | (#40061625)

I can't see how an internet-based system would be useless. SkyDrive and Dropbox can both sync files when you get an internet connection. I am traveling too (have been for 4 months) and that's what I do, even though the internet is really crap at times. But it will get synced eventually, and it gets synced automatically without me doing anything. On top of that, de-duplication and only syncing the parts that need to be uploaded save bandwidth.

rsync and other low-level solutions are much more work, and on top of that you need to carry around extra devices that might get destroyed too. But with SkyDrive or Dropbox the files will always be there no matter what happens.

YOUR MOM'S POUCH! (-1, Flamebait)

kdawson (3715) (1344097) | more than 2 years ago | (#40061651)

Hahaha, "temporary backup pouch" how gay.

I stuck my pen0r in your momz pouch. Wait I didnt because I am gay with Rob Malda. Or cowboy neal. You haven't heard from ol' neal in a while because he's still tied up in my basement with a Broos Willys look-a-like!

SUCK MY BALLS SLASHDOT!
-KD

Re:YOUR MOM'S POUCH! (-1)

Anonymous Coward | more than 2 years ago | (#40061667)

Dude, you have two uid's!

Re:YOUR MOM'S POUCH! (2)

DarwinSurvivor (1752106) | more than 2 years ago | (#40062509)

No, he's made part of his username into a fake uid to make it look like he's been here longer. (hint: the second one is his uid).

Re:SkyDrive (4, Insightful)

Zemran (3101) | more than 2 years ago | (#40061691)

As you say, the internet is really crap at times when you are travelling, so why make life difficult? It is also fair to say that you obviously think of travelling as a bit of wandering around in the US. Once you broaden your horizons, you will find that the internet is often not even an option.

SkyDrive is not going to integrate with Ubuntu (have you read the summary yet?), so it is a stupid option, whereas there is a Dropbox client. It is still flaky and not going to be easy to use as required, so he is still better off doing something that will work well and therefore get done regularly. If he is using some client for a service that sometimes works and sometimes doesn't, you can guarantee that the time he needs that backup will be one of the times it did not work.

Re:SkyDrive (-1)

Anonymous Coward | more than 2 years ago | (#40061761)

You're making me angry. I screw asses until they're leaking out cum like faucets when I'm angry.

Re:SkyDrive (5, Insightful)

partofme (2643183) | more than 2 years ago | (#40061823)

Actually, I'm not even a US citizen, and I travel in South East Asia. When it comes to shitty internet, I know what shitty internet is. For example, when I'm staying in Cambodia, the internet can (and often does) go down for a whole day and night. The speed is also ridiculously slow. You can try to get around some of the downtime by getting mobile internet as a backup, but if there's a wider outage, there's nothing you can do.

Yet I've found Dropbox to be the best backup solution. Files will get there eventually, and I don't need to do anything. There's also revision history for files, so if you upload corrupted files or something like that, you can reverse it. You can access them from other computers in case your laptop goes poof (happened to me). And the most important thing: if you get robbed or lose your luggage, you will still have access to your files (and of course, I keep my laptop encrypted).

The good sides of online cloud backup far outweigh the negative ones, or worries about bandwidth. Especially since most of the time the files that need backup aren't large. No one in their right mind would try to sync their media files.

Re:SkyDrive (1, Insightful)

Capt. Skinny (969540) | more than 2 years ago | (#40061889)

The good sides of online cloud backup far outweigh the negative ones

Until your cloud backup provider goes out of business or stops offering the service. You think rsync is a lot of work? Try keeping current on the status of Dropbox and SkyDrive services so you can pull your data before they disappear. I guarantee you that a properly stored external drive will outlive either of them.

Oh, and if you were trolling with that first post, kudos on playing it out so long.

Re:SkyDrive (3, Insightful)

partofme (2643183) | more than 2 years ago | (#40061993)

Try keeping current on the status of Dropbox and SkyDrive services so you can pull your data before they disappear.

Email? Twitter? Facebook? All kind of "push notification" technologies where you don't really need to do anything if you use them.

Besides, we are talking about Microsoft here, a company that has ridiculously long phase-outs for their products as standard practice, so businesses feel safe using them (seriously, they announced that version 4.0 of Silverlight will see end of support two years from now). If there is any tech company in the world that you can trust not to end support suddenly, it's Microsoft.

Re:SkyDrive (1)

Anonymous Coward | more than 2 years ago | (#40062867)

Not to rain on your "old paranoid guy" parade, but Dropbox (and I assume most others too) IS local storage (synced to the cloud). If they go out of business, just uninstall and install the next one. It probably works exactly the same.

Re:SkyDrive (3, Insightful)

comp.sci (557773) | more than 2 years ago | (#40063115)

For 99.9% of all users a backup is simply that, a failsafe in case their main HD gets lost or damaged. So what if Dropbox or SkyDrive suddenly were to go out of business (as unlikely as that is, you'd know in advance)? You suddenly lose access to that safety copy of your data, and you will know right away because the client cannot connect anymore. But you still have your primary copy of everything; nothing was lost, and you can just switch providers or change your backup strategy. The chances that something would happen in exactly the window between the cloud provider failing and you making another copy with another provider are incredibly low. If you can't take that risk, then you'd have a third backup anyway.

Re:SkyDrive (1, Informative)

Anonymous Coward | more than 2 years ago | (#40061755)

Did you read the summary?

Did you read his history?

Partofme is another of the Bonch/Sharklaser/Tech* etc etc sockpuppet team.

They're a group of Burson-Marsteller reputation managers working for Microsoft. They always get early postings so they can divert discussion to their pro-MS agenda.

Re:SkyDrive (1)

Anonymous Coward | more than 2 years ago | (#40062053)

Wait a minute, I thought they were working for Apple. Oh I'm so confused.

Re:SkyDrive (5, Informative)

Lord Crc (151920) | more than 2 years ago | (#40061955)

Obviously SkyDrive is of no use, but there are several other alternatives that would be better suited to this purpose, although if, as he says, it is for use while travelling, an internet-based system is useless.

That's why I liked CrashPlan when I first saw it. This may sound like a sales pitch, but I'm just a happy customer.

With CrashPlan you can have multiple destinations for your backup set. I usually have three:
- the same HD, in case I accidentally delete some files.
- a USB HD, for faster recovery in case my primary HD breaks.
- online "in the cloud", in case my house burns down, etc.

CrashPlan detects when I plug in the USB HD and automatically starts updating the backup on it. If there's no internet, the first two destinations will still keep me pretty safe. Once the internet is back, it catches up on the cloud destination.

It works just fine on my Linux Mint laptop as well as my Windows desktop PC.

The smart kids (1, Funny)

Anonymous Coward | more than 2 years ago | (#40061563)

What do the smart kids do?

The smart kids don't use linux.

Re:The smart kids (2, Informative)

ThePeices (635180) | more than 2 years ago | (#40061609)

You Sir are completely correct.

The smart kids use Linux.

Re:The smart kids (1)

ozmanjusri (601766) | more than 2 years ago | (#40061787)

Even smarter kids use Linux, udev and maybe Bacula, though Google would be a good start.

Hardlinks (2, Informative)

funwithBSD (245349) | more than 2 years ago | (#40061565)

Oh dear.

Hardlinks don't span storage devices. They are directory entries that share the same inode on a single storage device. Soft links can span devices, but they are just pointers to a path, so "back up" using softlinks and you have a bunch of pointers to data that is still on the original system. NOT on the thumb drive!
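A quick demo of both points (a sketch; /mnt/usbstick is a hypothetical mount point on another filesystem, and the exact error text varies by coreutils version):

$ echo data > file
$ ln file hardlink
$ ls -i file hardlink        # both names show the same inode number
$ ln file /mnt/usbstick/hardlink
ln: failed to create hard link '/mnt/usbstick/hardlink' => 'file': Invalid cross-device link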

Use one of the backup packages out there; you are not at the point of rolling your own.

Not even close.

Re:Hardlinks (2)

partofme (2643183) | more than 2 years ago | (#40061593)

Hardlinks don't span storage devices.

Except they do, on Windows at least. And you can even mount a drive to a folder, like c:/otherdrive/

Re:Hardlinks (2)

maevius (518697) | more than 2 years ago | (#40061653)

mmmmm no.

Mounting a drive to a folder has nothing to do with hardlinks (or inodes); that happens at a much higher level. For hardlinks to span different drives, you would need one filesystem on two drives, which is not possible.

Re:Hardlinks (0)

Anonymous Coward | more than 2 years ago | (#40061699)

mmmmm no.

Mounting a drive to a folder has nothing to do with hardlinks (or inodes); that happens at a much higher level. For hardlinks to span different drives, you would need one filesystem on two drives, which is not possible.

I'm not trolling here, I may be misunderstanding the concept. But isn't that what ZFS is about?

Re:Hardlinks (1)

maevius (518697) | more than 2 years ago | (#40061853)

Ok, let me clarify this.

It is possible to span a filesystem over multiple drives (see the various LVMs, RAID, etc.), but that is offtopic to the problem at hand (you cannot have one filesystem spanning your hard drive and your USB stick).

Given that hardlinks exist at the filesystem level, anything that sits on a lower layer and is transparent to it (consider, for example, a driver that can present hard disks, network paths, RAM, etc. as one block device) can host one filesystem and have hardlinks span across. But this is purely theoretical, and a bit offtopic.

Re:Hardlinks (1)

tywjohn (1676686) | more than 2 years ago | (#40061705)

Having one filesystem across multiple drives IS possible. Hardlinks spanning multiple filesystems is not.

Re:Hardlinks (1)

tywjohn (1676686) | more than 2 years ago | (#40061663)

funwithBSD is 100% correct about hard links; it is quite clear that nobody is referring to Windows.

Re:Hardlinks (4, Funny)

Anonymous Coward | more than 2 years ago | (#40061639)

Nobody pretended this to be the case. Not even the article's author. Please read again - and use a ghostwriter, you are not at the point of rolling your own comments. Not even close.

unison-gtk (3, Informative)

niftydude (1745144) | more than 2 years ago | (#40061575)

Since you are an ubuntu user, and it looks like you just need a nice rsync front-end to handle backup of the same data to two different drives, I'll suggest unison-gtk.

Very nice, simple front-end, and will do what I think you need.

Re:unison-gtk (1)

Dynetrekk (1607735) | more than 2 years ago | (#40062031)

+1 for Unison; it's an awesome little program. I use it on Mac OS X as well, and I believe there's a Windows client. I personally prefer the CLI version, but if the GUI version is anywhere near as good, I'll heartily recommend it. What you would do after coming home is run a sync against both external drives. Should work like a charm.

Re:unison-gtk (0)

Anonymous Coward | more than 2 years ago | (#40062263)

There is a Unison for Windows and, by and large, it works pretty well. (It certainly beats plain old rsync for merge-syncing two server volumes between continents, both in initial scan times and transfer speeds.) The Unison server on Windows is a bit flaky and needs to be monitored - sometimes it quits for no logged reason.

Re:unison-gtk (0)

Anonymous Coward | more than 2 years ago | (#40062495)

Windows is a bit flaky and needs to be monitored - sometimes it quits for no logged reason.

Well fuck me dead!

I never thought I'd be agreeing with a MS astroturfer!

Re:unison-gtk (4, Informative)

Anonymous Coward | more than 2 years ago | (#40062357)

I think people (including you) don't understand what he needs. He has a complete backup at home. When he's on the move, he wants to back up only modifications that are not already backed up at home, so that the backup fits on a USB stick. To know what has and hasn't changed, rsync would need access to the backup at home, which he doesn't have on the road. His idea was to keep a space-saving complete copy of the backup on his laptop via hard links. You might think that file modification times could be used, but both solutions leave the problem of communicating file deletions. Suppose he needs to recover. He would copy his home backup to the new drive and then have to integrate the incremental backup. How would the incremental backup keep the information about deleted files without access to the base backup? I suppose one could keep a recursive directory listing with the incremental backup, but that's the question: is there a ready-made solution for this?

Re:unison-gtk (1)

aaarrrgggh (9205) | more than 2 years ago | (#40062987)

It would seem to make more sense to approach the problem from a different direction: limit your USB backups to the home directory (or a limited set of directories), and do incremental backups there. In the recovery mode, you first recover to the last known good state from the hard drive, and then you apply the changes from the USB stick to selected directories. It could be automated with a script that even I could write, or you just follow a simple step-by-step procedure.

A fully packaged solution as the OP wants would be nice; you would just need to make recursive hash tables of each directory that get you down to around 20-50MB chunk sizes wherever possible. I was hoping for something similar to help with backup drive validation to detect corruption. The table itself would be about 2MB if my math is right.

Unison? (3, Informative)

Anonymous Coward | more than 2 years ago | (#40061599)

I hesitate to offer this, because I've not experimented with it in the precise scenario you describe. However, being another Joe User with Ubuntu, I took a look at rsync as a way to implement backups between my home PC and an Apple Time Capsule that I was using as a secondary backup device.

After some tinkering I settled on Unison, which is available in the Ubuntu repositories. It's essentially a sophisticated rsync front end, with a few bells and whistles. You get 2-way directory replication between your 'local' and 'remote' file systems [though they could both be local or both remote if you choose] and you can essentially script multiple different backups into the single interface. For example, I have "Office" for documents, spreadsheets and the like, "Photos" for camera images, "Music", and so on.

Like most tools, Unison is imperfect, but it's simple to use once set up. The key point with it, as with any product you put in this space, will be knowing and keeping track of your definitive data source. If you have a document that exists on both your local and backup systems, and you edit that file separately at each location, then run Unison, only the most chronologically recent copy will be preserved. To go beyond this level of functionality and get to something that can intelligently merge changes, I think you're going to need something more like a CVS tool... There are hugely expensive proprietary solutions (like Livelink), but I've not come across anyone using a good FOSS alternative. HTH...

Re:Unison? (0)

Anonymous Coward | more than 2 years ago | (#40061675)

yeah, you're going to need something like git to deal with complicated file histories and multiple outstanding changes

so why not use something like git?

Re:Unison? (0)

Anonymous Coward | more than 2 years ago | (#40061789)

If you want a user-friendly git-based backup, look into sparkleshare [sparkleshare.org] . But revision history is probably not important for most backup applications.

Re:Unison? (1)

Neil_Brown (1568845) | more than 2 years ago | (#40061949)

Like the GP, I haven't used Unison in this context, but (a) Unison is easy to configure, and (b) there's plenty of configuration which can be done. I use it for keeping my machines in sync, which, here, would just mean replacing a remote path with a local path. I would definitely invest some time in seeing if this does the trick.

Re:Unison? (1)

Anonymous Coward | more than 2 years ago | (#40061973)

Unison is NOT essentially an rsync frontend. It uses librsync, sure, but it is also a heavily researched, heavily modified way of backing up and syncing files.

Here is their homepage: http://www.cis.upenn.edu/~bcpierce/unison/index.html

Unison can also work in both directions, in case you put files on the backup drive that you want to sync back into the main repository. This is optional, though. I suspect MANY of the file-backup services like Dropbox base their code on Unison.

I have always been satisfied with the stability of Unison. Once it is set up, it never hangs, crashes or anything like that.
So in sum: I second that, go with Unison.

Re:Unison? (0)

Dynetrekk (1607735) | more than 2 years ago | (#40062057)

After some tinkering I settled on Unison, which is available in the ubuntu repositories. It's essentially a sophisticated rsync front end, with a few bells and whistles.

It is, in fact, a bit more than that. rsync doesn't handle deletions, so your backup will keep growing in size even though you're not really adding anything (say you rename a big file; now that file is stored twice). Unison does handle them. This is essential for this use case, especially if one of the backup devices has somewhat limited space.

Re:Unison? (5, Informative)

DrVxD (184537) | more than 2 years ago | (#40062137)

rsync doesn't handle deletions

rsync handles deletions just fine - that's why it has a --delete option...
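For reference, a minimal mirror that propagates deletions (hypothetical paths):

rsync -a --delete /home/user/ /mnt/usbhdd/mirror/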

Re:Unison? (1)

Dynetrekk (1607735) | more than 2 years ago | (#40062157)

Hey cool. But what if you bring your USB stick "backup" to some friends, and then add or modify files on the stick? Unison will let you sync those changes back to your desktop when you get home. Does rsync do that, too?

Re:Unison? (0)

Anonymous Coward | more than 2 years ago | (#40062243)

Seriously, dude...

Make two folders on your USB stick, and use one for your backups and the other for file transfers.

Or... man rsync, if you really want to use rsync's advanced options, which it has in dizzying abundance!

rsync handles deletes just fine (0)

Anonymous Coward | more than 2 years ago | (#40062159)

Look at the --delete option of rsync.

Re:Unison? (1)

ThePortlyPenguin (225165) | more than 2 years ago | (#40062875)

+1 for Unison. It will do everything you need it to, and is easy to use. You can set up your ~/.unison/*.prf files to have multiple roots on the same machine (one per removable drive in your case). Just pick the one you want to use when you sync. It does a better job of intelligently syncing and handling any resulting conflicts than anything else out there, bar none. It handles deletions fine (as does, btw, rsync). Here's a sample default.prf for your scenario:

root = /home/yourusername
root = /media/usbhdd

path = Documents
path = Music
path = Pictures
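With that saved as ~/.unison/default.prf, a sync is then a single command; the -batch flag makes it non-interactive:

unison default -batch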

DirSyncPro (2)

KingAlanI (1270538) | more than 2 years ago | (#40061605)

I use DirSyncPro to automate my backup tasks. Not sure how to set it up for your particular task, or whether you can, but it might be worth looking into. A lot of options while still being easy to use.

Re:DirSyncPro (1)

Anonymous Coward | more than 2 years ago | (#40062121)

Thank god it's got 'pro' in the name. There's just one thing stopping me getting it now: I'll wait for 'deluxe' to be added to the name, that'll do it.

Duplicity, perhaps (3, Informative)

Wizarth (785742) | more than 2 years ago | (#40061613)

Duplicity uses librsync to generate the changeset that rsync would use, then stores the changeset. If you stored the changeset on the USB drive, it could then be "restored" to the destination drive, perhaps? I don't know if there's any way to do this out of the box, or with a bit of scripting, or if this would need a whole new toolchain.
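Basic duplicity usage, for reference; a sketch with hypothetical paths, and whether the change sets can be split across two devices as described is exactly the open question:

duplicity /home/user file:///mnt/usbhdd/backup          # full backup on the first run, incremental after that
duplicity restore file:///mnt/usbhdd/backup /tmp/restored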

Re:Duplicity, perhaps (0)

Anonymous Coward | more than 2 years ago | (#40061975)

There is a nice duplicity front end for Ubuntu called dupliback that I use. http://code.google.com/p/dupliback/

Re:Duplicity, perhaps (0)

Anonymous Coward | more than 2 years ago | (#40062191)

Yeah, Duplicity is a good choice. It compresses and encrypts the backup volumes for you. I'd recommend using that instead of rsync when backing up to both the USB stick and the USB HDD, and then using rsync for syncing the backup volumes when at home.

Re:Duplicity, perhaps (2)

Weezul (52464) | more than 2 years ago | (#40062833)

I suspect duplicity and git-annex are the only correct answers in this thread, because the underlying problem is that your rsync-like tool must detect changes using a listing of file hashes, not access to the files themselves. It's the same problem as doing incremental backups to a host not running specialized incremental backup software. Duplicity does this, but rsync does not, AFAIK.

not rsync (2, Interesting)

Anonymous Coward | more than 2 years ago | (#40061619)

There are better solutions than rsync:
rdiff-backup
duplicity
for example.

I probably don't understand what you are trying to accomplish.

dump(8) (2, Interesting)

Anonymous Coward | more than 2 years ago | (#40061631)

this is what dump(8) does

Re:dump(8) (1)

kiite (1700846) | more than 2 years ago | (#40062533)

Unfortunately, dump requires a supported filesystem. But most people forget about the incremental backup features of regular old tar(1).

YOU DONT HAVE A BACKUP - FOOL (0)

Anonymous Coward | more than 2 years ago | (#40061695)

If all you are doing is copying files from your laptop to your USB HD,
then prepare to be shattered.

A work colleague who did exactly the same as you learnt the hard way when he went to restore some files that he had previously removed (space issue).
He came to work and asked: how can I recover files from my HD? I can still open them from the HD, but the images are all corrupt...

My advice: ALWAYS have 3 (THREE) copies of any important data.
1) Live (laptop)
2) First backup
3) Second backup, for when your first backup FAILS while you are trying to restore from it.

Full backups are best. Move the backup off site (even to the back shed in an airtight sealed container).

Re:YOU DONT HAVE A BACKUP - FOOL (0)

Anonymous Coward | more than 2 years ago | (#40062287)

Wouldn't hurt to use something like QuickPar to create Reed-Solomon recovery files too so that you can actually recover corrupted files.

Finally! An interesting question. (5, Insightful)

colonel (4464) | more than 2 years ago | (#40061729)

First, ignore the people who encourage you not to try, and who point you in other directions. Sure, there are much better ways of doing this, but who cares? The whole point is that you should be able to do whatever you want -- and actually doing this is going to leave you _so_ much smarter, trust me.

Some douche criticized you for not knowing beforehand why hard links wouldn't work. . . . because, you know, you should have been born knowing everything about filesystems. To hell with him, sally forth on your journey of discovery, this can be hella fun and you'll get an awesome feeling of accomplishment.

First off, you're going to have trouble using rsync with the flash drive, because I assume your constraint is that you can't fit everything on the flash drive, it's only big enough to hold the differences.

Next, come to terms with the fact that you'll need to do some shell scripting. Maybe more than just some, maybe a lot, but you can do it.

I'd recommend cutting your hard drive in two -- through partitions or whatever -- to make sure that "system" is fully segmented from "data." No sense wasting all your time and effort getting backups of /proc/ and /dev/, or, hell, even /bin/ and /usr/. Those things aren't supposed to change all that much, so get your backups of /home/ and /var/ and /etc/ working first. Running system updates on the road is rarely worth it, and will be the least of your concerns if you end up needing to recover.

Next, remind yourself how rsync was originally intended to work at a high level. It takes checksums of chunks of files to see which chunks have changed, and only transfers the changed chunks over the wire in order to minimize network use. Only over time did it evolve to take on more tasks -- but you're not using it for its intended purpose to begin with, since you're not using any network here. So rsync might not have to be your solution while travelling unless you start rsyncing to a personal cloud or something -- but its first principles are definitely a help as you come up with your own design.

The premise is that, while travelling, you need to know exactly what files have changed since your last full backup, and you need to store those changes on the flash drive so that you can apply the changes to a system restored from the full backup you left at home. You won't be able to do a full restore while in the field, and you won't be able to roll back mistakes made without going home, but I don't think either of those constraints would surprise you too much, you likely came to terms with them already.

So, when doing the full backup at home, also store a full path/file listing with file timestamps and MD5 or CRC or TLA checksums either on your laptop or on the flash disk, preferably both.

Then, when running a "backup" in the field, have your shell script generate that same report again, and compare it against the report you made with the last full backup. If the script detects a new file, it should copy that file to the flash disk. If the script detects a changed timestamp, or a changed checksum, it should also copy over the file. When storing files on the flash disk, the script should create directories as necessary to preserve paths of changed/new files.

For bonus points, if the script detects a deleted file, it should add it to a list of files to be deleted. For extra bonus points, it should store file permissions and ownerships in its logfiles as replayable commands.

The script would do a terrible job at being "efficient" for renamed files, but same is true for rsync, so whatevs.

I built a very similar set of scripts for managing VMWare master disk images and diff files about ten years ago, and it took me two 7hr days of scripting/testing/documenting -- this should be a similar effort for a 10-yr-younger me. I learned *so* much in doing that back then that I'm jealous of the fun that you'll have in doing this.

Of course, document the hell out of your work. Post it on sourceforge or something, GPL it, put it on your resume.
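A minimal sketch of the script described above, assuming hypothetical paths (manifest at ~/.backup.manifest, written at full-backup time; stick at /mnt/usbstick) and a GNU userland; deletions are recorded as a list, as suggested:

#!/bin/bash
# Sketch only: compare a stored manifest against the live filesystem,
# copy new/changed files to the stick, and record deletions.
SRC=/home/user
MANIFEST=$HOME/.backup.manifest
DEST=/mnt/usbstick/incremental

cd "$SRC" || exit 1
find . -type f -print0 | xargs -0 md5sum | sort > /tmp/manifest.now

# New or changed files: lines present now but absent from the stored manifest.
comm -13 <(sort "$MANIFEST") /tmp/manifest.now | cut -c35- |
while IFS= read -r f; do
    mkdir -p "$DEST/$(dirname "$f")"    # preserve paths on the stick
    cp -a "$f" "$DEST/$f"
done

# Deletions: paths in the stored manifest that no longer exist.
cut -c35- "$MANIFEST" | while IFS= read -r f; do
    [ -e "$f" ] || printf '%s\n' "$f"
done > "$DEST/deleted.list"

Recovery would then mean restoring the full backup, copying the incremental tree over it, and removing everything named in deleted.list.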

Re:Finally! An interesting question. (4, Insightful)

colonel (4464) | more than 2 years ago | (#40061753)

Forgot to mention:

To accomplish this, you'll need to read up on:
- bash
- find
- grep
- awk
- sed
- md5sum
- chmod/chown
- mkdir -p
- diff/patch (for general reference, and also look up binary diffing tools)

Extra extra extra bonus points if you compress the changed files when storing them on the flash drive.

Re:Finally! An interesting question. (1)

e70838 (976799) | more than 2 years ago | (#40062405)

Or learn Perl. Perl can easily do the same things as bash, find, grep, awk, sed, ...

Avoiding having to learn all the intricacies of all these tools was one of the main purposes of Perl.

Re:Finally! An interesting question. (1)

Anonymous Coward | more than 2 years ago | (#40063003)

Back in 1993, when I was new to *NIX, I asked a seasoned sysadmin which scripting language I should learn. He started listing all those same tools... then said: "or you could just learn perl."

I did, and you sir are 100% correct. I know grep, find, ksh/sh and sed very well - my awk is extremely weak. Perl hasn't let me down in all these years. It is still my go-to scripting language, whether I'm hacking some crap code together OR doing a proper system design and pushing out a beautiful website with XML or JSON services (check out Perl Dancer) in a few hours.

Perl can be written cross-platform - damn you, Windows file systems - with a little effort. I'm constantly hacking Windows Perl scripts using Strawberry Perl.

And when you're done reinventing the wheel... (1, Offtopic)

outsider007 (115534) | more than 2 years ago | (#40061845)

You can get started on fire.

Re:Finally! An interesting question. (0)

Anonymous Coward | more than 2 years ago | (#40061941)

Best post!

Re:Finally! An interesting question. (2)

philip.paradis (2580427) | more than 2 years ago | (#40061987)

Or he could save himself a ton of grief and just use rdiff-backup [nongnu.org], which happens to use librsync, produces incremental differential backups, stores said backups as files you can simply browse, works equally well on local and remote filesystems, and is dead simple to use. I've used it for years now on a ton of systems.

Re:Finally! An interesting question. (1)

hankwang (413283) | more than 2 years ago | (#40062077)

Or he could save himself a ton of grief and just use rdiff-backup,

Interesting, since I used rdiff-backup in the past and found it a pain. If files are stored as diffs of diffs of diffs of a full copy, it is rather easy to corrupt the backup. These days, I make backups using rsync, with

rsync -aOi --delete --modify-window=1 --link-dest=/mnt/backup/home-20120519 /home /mnt/backup/home-20120520

For the first backup, omit the --link-dest argument. Only modified files are stored. As long as you don't have tons of big files that have only a few bytes changed, I don't see the advantage of rdiff-backup. However, it requires that the backup filesystem support hard links (see my other comment on the use of a unix filesystem on a flash drive). When you come back home, you can do something similar (with --delete) to sync back to your regular backup drive.

The --modify-window option is there because I have to back up Windows filesystems as well.

Re:Finally! An interesting question. (1)

hankwang (413283) | more than 2 years ago | (#40062089)

Replying to myself: of course, I realize that the OP cannot use a hard-link backup if the USB drive cannot hold all his important data. It's too long ago that I used rdiff-backup; can you reliably split the master backup and the differential backups across different filesystems (say, the drive at home and the USB stick)? Preferably without risking corrupted backups if it involves manually merging diff trees.

Re:Finally! An interesting question. (1)

hankwang (413283) | more than 2 years ago | (#40062025)

If you go for a system where the files are stored in a Unix-like filesystem (case-sensitive filenames, permissions), what kind of filesystem would you use on the USB stick? I believe the wear-leveling system on USB sticks and flash cards is optimized for FAT filesystems (with the file allocation table right at the beginning). I think a journalling filesystem would be a bad idea on a flash drive, which leaves you with ext2 (with noatime) and very long filesystem checks every time you accidentally pull out the USB stick without unmounting.

With backup drives and file transfers, I also tend to run into the problem that I have different UIDs on different systems. Maybe not such a problem for the OP, but you mention backups of /var which is typically full of files owned by system users (e.g. cups, and mysqld/apache if you use the laptop for web development).

Re:Finally! An interesting question. (0)

Anonymous Coward | more than 2 years ago | (#40062055)

Sweet post. Yes indeed, you get full five points for reading comprehension and reading between the lines. And of course for great advice. I have been an AC on /. since a couple of months after it started, so no problem ignoring the inevitable chaff.

I'm just rather baffled that what I'm looking for doesn't already exist; I figure I must be blind. And at 50 I'm getting slow and less interested in skinning the cat myself. But if it comes to that (we'll see how the thread looks by morning), then now I've got great advice for the next step. Thank you so much.

The right tool for the task (1)

grumpy_old_grandpa (2634187) | more than 2 years ago | (#40061731)

It seems like the poster confuses two tasks: Backup and version control.

For the former, use archiving tools to perform full and incremental backups. How is it done? You could use find to list files matching certain criteria, e.g. last-modified timestamps. Pass that list to tar using the -T flag, where you can also use -X to exclude files and directories like "*/.thumbnails" and "*/.[cC]ache*". Once the tar is done, use your favourite checksum tool (md5sum, shasum) to store a checksum of the archive in a separate file. Once you get home, move the archives, verify the checksums, and you're done. A sketch of this pipeline follows below.
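A minimal sketch of that find/tar/checksum pipeline, with hypothetical paths and a one-day window:

find /home/user -type f -mtime -1 > /tmp/changed.list
tar -czf /mnt/usbstick/incr-$(date +%F).tar.gz -T /tmp/changed.list -X /home/user/.backup-excludes
md5sum /mnt/usbstick/incr-$(date +%F).tar.gz >> /mnt/usbstick/checksums.md5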

As pointed out in the summary, deleted directories and files will be an issue; thus, you perform a full backup from time to time. The time frame will depend on how big the changes are and how much data you have. Personally, I've settled on every second week for /home, but that does not include huge files like RAW files from a DSLR.

As for version control, set up your own git repository with git init, copy it to the laptop with git clone, and you're ready to go. Pro tip: make sure you name your home computer with a real or "fake" DNS name (e.g. in /etc/hosts or ~/.ssh/config). You can then simply refer to it as "home" wherever you are, and tools like git and svn stay happy.

dar? (3, Informative)

safetyinnumbers (1770570) | more than 2 years ago | (#40061741)

If I understand your problem right, how about dar? It can make an empty archive of your main backup to act as a reference (just file info, no file contents). Then it makes archives relative to that, with just the changed files. It can then apply the changes to the original dir, including deletions, if you need that.

Re:dar? (1)

jchevali (171711) | more than 2 years ago | (#40062463)

I agree, dar is definitely the way to go. You need to learn how it works, but once you do, it's incredible all the things you can do. What safetyinnumbers is referring to is called an isolated catalogue. See also: dar_manager.
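A hedged sketch of that isolated-catalogue workflow (paths and archive basenames are hypothetical; check dar's man page for the exact flags):

dar -c /mnt/usbhdd/full -R /home/user                  # full backup at home
dar -C ~/full_cat -A /mnt/usbhdd/full                  # isolate the catalogue (file info only)
dar -c /mnt/usbstick/incr -A ~/full_cat -R /home/user  # on the road: diff against the catalogue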

Lion does this automatically... (0)

Anonymous Coward | more than 2 years ago | (#40061751)

Local backups via TimeMachine when your laptop is not connected to the backup disk.

Re:Lion does this automatically... (0)

Anonymous Coward | more than 2 years ago | (#40062351)

Yeah, except that it only works with a single TimeMachine target (you can't sync to multiple TimeMachines) and doesn't "just work" with a USB key - you need a Mac Mini Server with the TimeMachine service turned on and an external hard disk, an AirPort with TimeMachine support and an external hard disk, or an external hard disk that you initialize to be a TimeCapsule (and nothing else). At least you get a CoverFlow browser for restoring files.

man rsync --link-dest, --relative, --exclude... (0)

Anonymous Coward | more than 2 years ago | (#40061757)

If you want to stick with rsync for backups, what you want to do is get beyond having just a mirror on the external drive. After all, a backup should help you recover from mistakes, and mirroring will replicate your mistakes to the external drive too.

Instead, keep a series of backup snapshots on the external drive, representing your data at a certain point in time. Each rsync pass creates a new snapshot, like /mnt/backup/2012-05-20/ which represents your internal drive and shares common files by hardlink with older snapshots such as /mnt/backup/2012-05-19/ but doesn't have links to things that were deleted. Merging backups between two different external devices is then a matter of transferring around whichever dated snapshots you want to mirror or migrate between devices.

Here's a hint about how to implement one such snapshot:

mkdir /mnt/backup/2012-05-20
rsync -Ravx --link-dest=/mnt/backup/2012-05-19/. --exclude=/var/{tmp,cache} --exclude=/tmp /. /mnt/backup/2012-05-20/.

see the man page for more details, and the general inductive step is left as an exercise to the reader. In practice, this sort of backup has very little overhead for snapshots of non-changing files. I allocate approximately 150% of my source data volume for my backup volume to maintain a long-term history of many daily, weekly, and monthly backups. My script decimates older backups (with rm -rf /mnt/backup/YYYY-MM-DD) to turn a series of dailies into weeklies, weeklies into monthlies, etc.

The pathological case for this kind of backup is a huge file that slowly grows, such as /var/log/btmp on older Linux systems where logrotate did not rotate that file. An optimization for rotated logs is to make sure they get a name like basename.YYYY-MM-DD instead of basename.1, so that the name of older files doesn't change each time it rotates. Also use appropriate exclude patterns so you don't waste time and space backing up junk you don't need. You can even go the other way and only selectively retain stuff like:

rsync -Ravx --link-dest=/mnt/backup/2012-05-19/. /./etc /./boot /./home /./var/log /mnt/backup/2012-05-20/.

This is an example of the power of the -R (--relative) naming scheme understood by rsync. The position of the extra "." inside the source file paths is not a typo, but actually essential to the purpose of this example. Learn it. Live it. Love it.

Re:man rsync --link-dest, --relative, --exclude... (0)

Anonymous Coward | more than 2 years ago | (#40061869)

For people not wanting to write their own script to do what you explained above: dirvish

Re:man rsync --link-dest, --relative, --exclude... (1)

buchanmilne (258619) | more than 2 years ago | (#40062569)

Or rsnapshot

Re:man rsync --link-dest, --relative, --exclude... (1)

scoobertron (1406559) | more than 2 years ago | (#40063233)

This is tangential, but I had a big snafu when using --link-dest, which was largely due to my use case, but I like to mention it on threads like this in case it saves someone some hassle. --link-dest seems to work by creating the directory structure and then populating it with hard links and changed files. This takes time, and if you interrupt it, you wind up with a lot of empty directories. If you then use this as the model for your next backup, the number of changes is huge and the backup will take a lot of time. If you have an automatic script that backs up over ssh (say) and you tend to use your laptop for short periods (say), you can end up with a lot of incomplete backups. If you then (say) delete the partition that your OS is on while you are using it, things can go badly for you. Nowadays, I use rsync to update a single mirror of the filesystem, and then snapshot it periodically on the server using cp -l.

support.marsupials.info? (1)

macraig (621737) | more than 2 years ago | (#40061763)

They'd have the skinny on pouches for sure.

git-annex (0)

Anonymous Coward | more than 2 years ago | (#40061835)

I have not used it myself, but I believe that git-annex does exactly what the OP asks for: http://git-annex.branchable.com/

Re:git-annex (1)

Weezul (52464) | more than 2 years ago | (#40062807)

Yes, git-annex could be coaxed into doing this, but it's not completely trivial.

simple solution (0)

Anonymous Coward | more than 2 years ago | (#40061859)

To detect the changes, you can utilise snapshotting (the rsnapshot package, perhaps?); it would allow you to see the changes from a day-to-day view.

All you then need to do is transfer the changes listed under daily.1 (or the appropriate folder) to your USB stick. It maintains paths, which is one of the things you're after.

As for deletions, I think a daily email of the snapshot.log file could list that information for you. I wouldn't want it more complicated than that, I guess...
Hope this helps,
@chayharley

using touch find + tar (0)

Anonymous Coward | more than 2 years ago | (#40061919)

Timestamps.

Could you:
store the timestamp of the latest backup, using touch filename;
run find with -newer timestampfile as one of the params (can't remember the others);
pipe the output to tar.

http://dbaspot.com/shell/399852-creating-tar-file-only-files-have-changed-since-specified-date.html
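That approach in full, as a sketch (hypothetical paths; GNU find and tar assumed):

find /home/user -type f -newer /home/user/.last-backup -print0 |
    tar --null -T - -czf /mnt/usbstick/incr-$(date +%F).tar.gz
touch /home/user/.last-backup    # mark this run's time for the next one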

rsync with find? (2)

Tim99 (984437) | more than 2 years ago | (#40061921)

I'm probably being dim here, but why don't you just rsync from the USB HDD to the USB stick? You can filter by date using find, something like:
rsync -avt $(find /mnt/usbhdd/ -type f -mtime -7) /mnt/usbackup
This finds/filters files updated/created in the last 7 days. (Note that this breaks on filenames containing spaces.)

To get the stuff back from the USB stick that you have created or modified while you are away to the USB HDD, you would just do a normal rsync, which will only overwrite/add files that you have modified or created since you ran rsync from the USB HDD.

I am not familiar with Grsync, but it looks as though you can run it in simulation mode to see the output - you could then copy/paste the output, modify it with the 'find whatever -type f -mtime -DAYS' part, and run it from the prompt.

Install Windows (0)

Anonymous Coward | more than 2 years ago | (#40061925)

Install Windows, use a briefcase [wikipedia.org]

LuckyBackup or BackInTime + an online sync service (1)

Sosarian Avatar (2509846) | more than 2 years ago | (#40062061)

I don't know if I'm one of the "smart kids" or not, but I'm a standard non-technical user, and I have found LuckyBackup [wikipedia.org] or BackInTime [le-web.org], run alongside an online sync/backup service like Dropbox or SpiderOak, the handiest options.

Both LuckyBackup & BackInTime are GUI tools that set up rsync rules (even complicated ones) with an easy point-and-click interface, then schedule them in cron. They can do anything rsync can: synchronize the drives so the backup matches the current state, or make a backup of everything present and never delete anything, and they won't waste time/energy backing up files that haven't changed.

LuckyBackup can be set to keep up to 99 snapshots of anything that changes, and they're structured in the exact same way as the original. BackInTime can have unlimited snapshots, and each backup is in a different nested folder by date/time, with unchanged files within each folder being soft links back to the most recent backup copy. Both programs just create file copies, not compressed archives.

Right now, I'm using LuckyBackup for my regular files, and I have BackInTime handling my writing directory so I can go back to an unlimited extent in case -- as happened once -- I realize that I had made a major change several months ago (more backup dates than LuckyBackup tolerates in snapshots) that turned out to be a horrible mistake, so I don't have to try to reconstruct the original from memory.

I use the web/online backup solution partly to keep my computers in sync without a thumbdrive. It's also because it acts as a free minute-by-minute backup with a few months of snapshots, so if an .odt file becomes corrupt while I'm working on it, I don't lose everything since the previous system backup. I lost about 30 hours of intense revisions a couple of years ago because the thumbdrive I was saving & transporting my files on had a glitch that evidently had messed up everything I'd been saving for a few days -- and as it turns out, it's not possible to extract text from a bad .odt file even with a hex editor.

does this need to be so complicated? (2)

egork (449605) | more than 2 years ago | (#40062119)

From what I can parse, the OP wants to have a full backup on the USB HDD and the diffs on the USB Flash, because the Flash is limited in size.
Just write two rsync (or grsync) scenarios: one for the HDD and the other for the Flash. On the HDD you will have a directory that is a mirror copy of your laptop. On the Flash you will keep the diffs for the time between syncs to the HDD.

When at home:
1. rsync your laptop to the HDD (mirror).
2. copy the incremental stuff from the Flash to a separate directory (e.g. diff-2012-May-21) on the HDD, and wipe the Flash.

On the road:
just rsync the diffs to the Flash.

I guess the recovery plan is quite obvious too. Should any _one_ of those three devices die, you are still good to go.
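A minimal sketch of the two scenarios (mount points hypothetical; the on-the-road run uses a timestamp file to find recent changes):

# at home
rsync -a --delete /home/user/ /mnt/usbhdd/mirror/
mkdir /mnt/usbhdd/diff-$(date +%F) && cp -a /mnt/usbstick/. /mnt/usbhdd/diff-$(date +%F)/
rm -rf /mnt/usbstick/* && touch ~/.last-mirror

# on the road
find /home/user -newer ~/.last-mirror -type f -exec cp -a --parents {} /mnt/usbstick/ \;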

Re:does this need to be so complicated? (0)

Anonymous Coward | more than 2 years ago | (#40062211)

Yes, it does. Because what are you going to diff against when you don't have the external HDD?

Re:does this need to be so complicated? (1)

egork (449605) | more than 2 years ago | (#40062259)

OK, I got it, so the diff is the problem.
I see two solutions:
1. write the date of the latest mirror-sync to a file on the laptop and copy all newer or modified files to the Flash with a "find".
2. keep a second copy (mirror) on a separate partition of the laptop HDD and diff against it. Actually helpful in case you really need to restore a (deleted) file while on the road.

I believe a complicated script and setup is more "expensive" at the end of the day than just having a second (or a third) copy.

unison is bi-directional (1)

Gunstick (312804) | more than 2 years ago | (#40062371)

Unison has already been suggested multiple times.

I used Unison. It's perfect: sync from A to B (it only syncs the diffs), then modify B, and later sync B back to A.
You can also modify A and B at the same time, as long as it's not the same file; then sync, and A and B are identical.
You can even sync in cycles (A->B->C->A) with modifications on all three directory trees, and it still works.
Unison also handles deletions on both sides fine.
Hint: use the -group, -owner and -times flags.

dump, snapshots, rsync batch mode (1)

dtdmrr (1136777) | more than 2 years ago | (#40062381)

The first problem to consider is how you determine which files to backup. Filesystems like xfs, zfs, and btrfs have nice convenient ways to get a list of changed files (and for xfs and zfs, the contents of those files as well). For ext2/3/4 (and other older unixy filesystems) look at "dump". And of course, if you're working with a completely dumb filesystem, you can always use rsync (if your backup disk is remotely accessible) or some external/manual indexing to figure out what files to backup.

If your filesystem supports some form of dump (send for zfs), you can use that to create your incremental changes. If you only have a list of files, use tar or rsync. If you want to keep a full backup on the same drive, you can use rsync's batch mode (see the manpage) to efficiently generate incremental backups on filesystems that don't do a good job of that. A sketch follows below.
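A hedged sketch of the batch-mode idea for the OP's exact problem (paths hypothetical): keep a shadow mirror on the laptop standing in for the home backup, write each day's delta to the stick, and replay the deltas at home.

# on the road: update the shadow mirror AND capture the delta on the stick
rsync -a --write-batch=/mnt/usbstick/batch-$(date +%F) /home/user/ /home/.shadow-mirror/

# back home: replay each batch, in order, against the real backup
rsync -a --read-batch=/mnt/usbstick/batch-2012-05-20 /mnt/usbhdd/mirror/

This works because the home mirror and the laptop's shadow mirror were identical at the last home sync.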

You don't want to hard link between your live tree and a backup tree. That will result in the changes showing up in both trees, obscuring the changes when you run a backup. It's a technique used with rsync for snapshotting, where two backups trees represent the state of the original filesystem at different times. To make that work, the links are broken for the files that differ between the two snapshots.

MacBak (0)

Anonymous Coward | more than 2 years ago | (#40062439)

Try MacBak - it works on Ubuntu as well:

https://github.com/daemonza/MacBak

rsync (1)

gatzke (2977) | more than 2 years ago | (#40062587)

For years I have used rsync scripts.

My problem was syncing a desktop and a laptop. So I made upload_to and download_from scripts to sync as needed.

I also try to keep a third master backup copy on a different server so all three are synced.

One problem comes when trying to work on both desktop and laptop simultaneously; in that case, just map a drive and modify files on one side.

git (1)

phorwich (909601) | more than 2 years ago | (#40062759)

I think git has got what you seek. http://git-scm.com/ [git-scm.com]

Robocopy (0)

Anonymous Coward | more than 2 years ago | (#40062855)

You could set up a robocopy script to do incremental backups. Then when you run it, just set the 'edited after' property to only back up files changed since your last home backup. There are plenty of templates for this online, as well as a GUI application if you prefer it.

ZFS (1)

Carl Drougge (222479) | more than 2 years ago | (#40062925)

In the grand internet tradition of answering a loosely related question which is no use at all to the asker, I will say that the "smart kids" might use something like ZFS, which almost handles this for you. (Take snapshots, save delta streams on your USB stick. Requires the backup to be a ZFS copy, not just the same files.)

Useless right now at least. But I've been pretty happy with switching my storage to ZFS, even if the Linux version sucks. (I mostly don't use the Linux version.) I'd recommend it to anyone who doesn't mind a bit of transitional pain.
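A hedged sketch of the snapshot/delta-stream idea (pool and dataset names hypothetical):

zfs snapshot tank/home@2012-05-20
zfs send -i tank/home@2012-05-19 tank/home@2012-05-20 > /mnt/usbstick/home-delta
# at home, replay the delta into the ZFS copy on the backup drive:
zfs receive backup/home < /mnt/usbstick/home-delta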

tar (1)

bWareiWare.co.uk (660144) | more than 2 years ago | (#40062973)

The easiest solution is probably to use tar's incremental backups. The -g argument creates a relatively small file listing the files already backed up, so future incremental runs can skip them. If you keep the incremental listing files on the laptop, then you can put each day's actual tar backup on whatever device you have handy. See the sketch below.
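A sketch of that, with hypothetical paths (-g is GNU tar's --listed-incremental):

tar -cz -g ~/.backup.snar -f /mnt/usbhdd/full.tar.gz /home/user                  # level-0 run at home
tar -cz -g ~/.backup.snar -f /mnt/usbstick/incr-$(date +%F).tar.gz /home/user    # later runs capture only changes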

Buy a bigger flash drive! 64GB isn't too expensive (0)

Anonymous Coward | more than 2 years ago | (#40063049)

Buy a bigger flash drive! 64GB isn't too expensive.
If you are running Linux, then 64GB should easily handle your OS and documents with many versions.

It is only video, audio and photos that will eat up lots of storage, and you need to get those to a server in a different country ASAP anyway. Use rsync+ssh for the big files.

I saw a 32GB USB flash drive for $17 yesterday. Unless you are storing HiDef video, this should cover a few weeks overseas. Be certain to encrypt it AND your laptop, but leave a tiny OS that boots so you can show it to customs and border control.

Do not carry your passphrases with you. Keep the KeePassX data file in the cloud somewhere ... so it isn't on the machine at border crossings. You will not physically be able to decrypt the HDD this way.

I might be crazy (1)

Mr0bvious (968303) | more than 2 years ago | (#40063075)

and I know this is not what he asked for, but wouldn't the simplest solution be to purchase a second external drive (maybe an SSD for durability) and actually have a complete backup on the road... Or even just take his current external drive with him - he has it backed up in the cloud anyway...

I ask because he never stated why that external drive was stuck at home.

If that won't do, another possible solution.

1) I don't see a need to sync the USB stick when he returns - just perform your usual backup when you return, and only care about the USB stick if you have a failure before returning home.

If (1) is workable for him, then: 2) how much data will he really need to back up when out and about? Can't you set up an 'away from home' backup profile that will only back up the things you'll be changing while away - documents and current working folders - and skip the movies, porn and music for this backup (it's still at home and in the cloud anyway)?

Since he expects the deltas to fit on a USB stick, I'm assuming he's not wanting to backup a heap of video editing or some other hungry activity...

What about Git? (1)

XanthusMaximus (703797) | more than 2 years ago | (#40063105)

Just do a clone with Git. You can track changes, deletions and it can resolve conflicts easily.

Use a watch program (1)

foniksonik (573572) | more than 2 years ago | (#40063147)

There are multiple "watch" apps out there, in various languages, that will run a script every time a directory changes.

Google "watching files with ruby"

Substitute ruby for python or perl or...
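The same idea works from the shell with inotify-tools (a swapped-in alternative to the ruby/python watchers; the backup script name is hypothetical):

inotifywait -mrq -e modify,create,delete,move /home/user/Documents |
while read -r dir event name; do
    /usr/local/bin/backup-to-stick.sh    # hypothetical: re-run the incremental backup on each change
done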

It's easy (0)

Anonymous Coward | more than 2 years ago | (#40063241)

if lsusb | grep -q "My Backup Disk"    # hypothetical device string
then
    backup-script
fi

and so on

This is how my backup script works: I do incremental backups to an external USB HDD if it is connected, and just create local restore points if it is not.
