Slashdot: News for Nerds

Distributed Data Storage on a LAN?

Cliff posted more than 9 years ago | from the redundancy-gooood dept.

Data Storage 446

AgentSmith2 asks: "I have 8 computers at my house on a LAN. I make backups of important files, but not very often. If I could create a virtual RAID by storing data on multiple disks on my network I could protect myself from the most common form on data failure - a disk crash. I am looking for a solution that will let me mount the distributed storage as a shared drive on my Windows and Linux computers. Then when data is written, it is redundantly stored on all the machines that I have designated as my virtual RAID. And if I loose one of the disks that comprise the raid, the image would automatically reconstruct itself when I add a replacement system to the virtual RAID. Basically, I'm looking to emulate the features of hi-end RAIDS, but with multiple PCs instead of multiple disks within a single RAID subsystem. Is there any existing technologies that will let me do this?"

446 comments

Don't forget... (-1)

SCO$699FeeTroll (695565) | more than 9 years ago | (#7341288)

...to pay your $699 licensing fee you cock-smoking teabaggers.

NBD Does this (5, Insightful)

backtick (2376) | more than 9 years ago | (#7341293)

http://nbd.sourceforge.net/

"Network Block Device (TCP version)

What is it: With this thing compiled into your kernel, Linux can use a remote server as one of its block devices. Every time the client computer wants to read /dev/nd0, it will send a request to the server via TCP, which will reply with the data requested. This can be used for stations with low disk space (or even diskless - if you boot from floppy) to borrow disk space from other computers. Unlike NFS, it is possible to put any file system on it. But (also unlike NFS), if someone has mounted NBD read/write, you must assure that no one else will have it mounted.

Limitations: It is impossible to use NBD as root file system, as a user-land program is required to start (but you could get away with initrd; I never tried that). (Patches to change this are welcome.) It also allows you to run read-only block-device in user-land (making server and client physically the same computer, communicating using loopback). Please notice that read-write nbd with client and server on the same machine is a bad idea: expect deadlock within seconds (this may vary between kernel versions, maybe on one sunny day it will be even safe?). More generally, it is a bad idea to create a loop in the 'rw mounts graph'. I.e., if machineA is using device from machineB readwrite, it is a bad idea to use device on machineB from machineA.

Read-write nbd with client and server on the same machine has a rather fundamental problem: when the system is short of memory, it tries to write back dirty pages. So nbd client asks nbd server to write back data, but as nbd-server is a userland process, it may require memory to fulfill the request. That way lies the deadlock.

Current state: It currently works. Network block device seems to be pretty stable. I originally thought that it is impossible to swap over TCP. It turned out not to be true - swapping over TCP now works and seems to be deadlock-free.

If you want swapping to work, first make nbd working. (You'll have to mkswap on server; mkswap tries to fsync which will fail.) Now, you have version which mostly works. Ask me for kreclaimd if you see deadlocks.

Network block device has been included into standard (Linus') kernel tree in 2.1.101.

I've successfully run raid5 and md over nbd. (Pretty recent version is required to do so, however.) "

Re:NBD Does this (-1, Offtopic)

Anonymous Coward | more than 9 years ago | (#7341311)

and the first post is the last post.

---cut discussion here---

MOD DOWN - KARMA WHORE (-1, Offtopic)

Anonymous Coward | more than 9 years ago | (#7341371)

Re:NBD Does this - NBD server for windows (5, Informative)

flok (24996) | more than 9 years ago | (#7341379)

And since the guy is also using windows-boxes, an NBD-server for windows can be found here:
http://www.vanheusden.com/Loose/nbdsrvr/ [vanheusden.com]
This version enables you to also export partitions/disks.

IS there any technologieS?: +1, Patriotic (-1, Troll)

Anonymous Coward | more than 9 years ago | (#7341408)



At least you are following the path boldly forged
by our fearful LOSER - G. W. Bush [whitehouse.org]

Try a search at Google [google.com]

Cheers,
Kilgore

OT: Re:IS there any technologieS?: +1, Patriotic (-1, Offtopic)

Anonymous Coward | more than 9 years ago | (#7341458)

whitehouse.org != whitehouse.gov
same way
whitehouse.com != whitehouse.gov
and
whitehouse.org != whitehouse.gov

Re:NBD Does this (1)

Matrix272 (581458) | more than 9 years ago | (#7341501)

I maintain a lab with 16 Linux computers (running Red Hat 8) and 1 server. Right now, I have about 150gb or so on the server that I NFS out to all the workstations. However, each workstation has 20-80gb that it doesn't need and isn't using... The users all have their home directory mounted via NFS, and must have read/write access to them (obviously). Each user also must be able to SSH in, and access the console (wouldn't be much of a lab if the users couldn't sit down at a computer). I also would like to have software installed on an NFS mount, without worrying about massive performance drops.

Would NBD be able to fill all those needs? I'd like a RAID5 setup over all the computers, although maybe even some other type of RAID, like RAID5 with 5 extra disks, just in case someone powers one down... Would that work? Ideally, I'd like to make a cluster of the workstations, but also have a console for each of them... but I haven't had a lot of time to research it lately, so I don't know what's available out there. Does anyone think NBD would be a viable solution for me?

Re:NBD Does this (5, Informative)

dbarclay10 (70443) | more than 9 years ago | (#7341575)

Just to clarify what this guy is saying:

1) Make all your machines NBD servers. NBD for Linux [sourceforge.net] , NBD for Windows [vanheusden.com] . NBD stands for "network block device" and allows a client to use a server's block device.
2) Set up a master client/server (using Linux or something else with a decent software RAID stack). This machine will be the only NBD *client*, and it will use all the NBD block devices exported by the rest of your network.
3) On the master set up in 2), create a Linux MD RAID array overtop all the NBD devices that are available.
4) Create a filesystem on the brand-spanking-new multi-machine RAID array.
5) Export it back to the other machines via Samba or NFS or AFS or what have you.

Why does only one machine (the "master server") access the NBD devices, you ask? Because for a given block device, there can only be one client accessing it safely. Thus, if you want to make the RAID array available to anything other than the machine which is *running* the array off the NBD devices, you need to use something which allows concurrent access; something like NFS, Samba, or AFS.

Hope that clears it up a bit.
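
For anyone wondering what the RAID layer in step 3 actually does with those NBD devices, here is a minimal Python sketch of single-parity striping. It is only an illustration under stated assumptions: the byte strings stand in for blocks on the remote machines, the function names are made up for this example, and parity sits on one fixed "device" (closer to RAID 4), whereas Linux md RAID 5 rotates parity across members.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_stripe(data_blocks):
    """Return the data blocks plus one parity block (hypothetical layout)."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def rebuild(stripe, lost_index):
    """Recompute the block at lost_index from all surviving blocks."""
    survivors = [blk for i, blk in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)


# Three "machines" hold data, a fourth holds parity.
stripe = make_stripe([b"AAAA", b"BBBB", b"CCCC"])

# Pretend the second machine died; its block comes back from the others.
recovered = rebuild(stripe, 1)
```

The same XOR works no matter which single member is lost, including the parity block itself. Losing two members at once is not recoverable with one parity block.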

Rob Malda caught in circle jerk--kills self (-1, Redundant)

Anonymous Coward | more than 9 years ago | (#7341298)

Indeed.

Re:Rob Malda caught in circle jerk--kills self (0, Offtopic)

danny256 (560954) | more than 9 years ago | (#7341339)

Does anyone else find it funny that this was modded redundant?

Re:Rob Malda caught in circle jerk--kills self (-1, Redundant)

llamaluvr (575102) | more than 9 years ago | (#7341350)

Why was this modded redundant? Is the moderator implying that this has happened and that the slashdot community is already well aware and does not need another reminder? Goodness, I hope not!

yes (0)

Anonymous Coward | more than 9 years ago | (#7341573)

it's called rsync

yes (1)

Triumph The Insult C (586706) | more than 9 years ago | (#7341317)

it's called rsync

rsync Re:yes (1)

cprice (143407) | more than 9 years ago | (#7341345)

AFAIK, rsync is not really suitable for a realtime scenario. A nbd raid-5 device would be virtually realtime, no?

Re:rsync Re:yes (1)

macemoneta (154740) | more than 9 years ago | (#7341405)

Yes, if by realtime you mean really slow, for any significant volume of data. It's relatively easy to kick off rsync processes at appropriate points (like unmount, logoff, etc.). This gets you local speed access, near-line replication, and the opportunity to set up archival copies.
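
To make the "kick off a sync at logoff" idea concrete, here is a rough Python sketch of a one-way copy that skips files whose size and modification time already match, which is the same quick check rsync uses by default. This is not rsync and has none of its delta-transfer or network features; `sync_tree` and `needs_copy` are names invented for this example, and the destination is assumed to be a locally mounted path (an NFS or SMB mount, say).

```python
import os
import shutil


def needs_copy(src_file, dst_file):
    """True if dst_file is missing or differs in size or whole-second mtime."""
    if not os.path.exists(dst_file):
        return True
    s, d = os.stat(src_file), os.stat(dst_file)
    return s.st_size != d.st_size or int(s.st_mtime) != int(d.st_mtime)


def sync_tree(src, dst):
    """Copy new or changed files from src to dst; return relative paths copied."""
    copied = []
    for root, _dirs, files in os.walk(src):
        rel_root = os.path.relpath(root, src)
        target_root = os.path.normpath(os.path.join(dst, rel_root))
        os.makedirs(target_root, exist_ok=True)
        for name in files:
            src_file = os.path.join(root, name)
            dst_file = os.path.join(target_root, name)
            if needs_copy(src_file, dst_file):
                # copy2 preserves the mtime, so the next run sees a match.
                shutil.copy2(src_file, dst_file)
                copied.append(os.path.normpath(os.path.join(rel_root, name)))
    return sorted(copied)
```

Running it twice in a row should copy nothing the second time. Note that, like plain rsync without extra options, this sketch never deletes files from the destination.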

Re:yes (1)

Triumph The Insult C (586706) | more than 9 years ago | (#7341360)

sorry. conan cut me off last night, so i am upset

we use afs (pre-openafs, tho i'm sure openafs will work just fine) on top of nbd (link escapes me right now). works pretty well.

MSI OSS (-1, Offtopic)

Anonymous Coward | more than 9 years ago | (#7341323)

Seriously, why can we not have a simple installation procedure? Half the OSS projects out there require a statically compiled httpd with mod_perl, but RPMs take advantage of DSOs, so you can't have that. So now you get to wipe all your Apache-dependent RPMs in addition to your Apache RPM and recompile everything.

Next time a friggin patch comes out (which is probably next week, thanks "a patchy" web server!), I will have to go through this rigmarole over and over again. OSS software sucks ass.

Re:MSI OSS (0)

Anonymous Coward | more than 9 years ago | (#7341354)

Please use the alternatives then, support is so much better and stability and security are 'features' found in abundance :)

Sorry, OSS ruined my subject field too. (0)

Anonymous Coward | more than 9 years ago | (#7341414)

that should have read: MSI > OSS

pirst fost! (-1, Offtopic)

Anonymous Coward | more than 9 years ago | (#7341324)

Rod Roddy after death: "COME ON DOWN!"

Okay, that's it (-1, Offtopic)

Anonymous Coward | more than 9 years ago | (#7341330)

This has really gone too far. Fuck you. Yes, fuck all of you. I really mean it. From the bottom of my heart. This bullshit is just completely out of hand. I've been patient but no longer. Just fuck the whole fucking lot of you. This is just fucking bullshit.

do tell (0)

Anonymous Coward | more than 9 years ago | (#7341377)

what has occurred?

aw geeze. (0, Offtopic)

nbvb (32836) | more than 9 years ago | (#7341342)

RAID != Backups.

If you don't understand why, just put your Packard Bell back in the box and ship it back.

Tell them you're too stupid to own a computer.

Re:aw geeze. (1)

JohnnyKlunk (568221) | more than 9 years ago | (#7341423)

That's true, but raid == raid
It's different from having a 6 week offsite tape rotation strategy, but it does protect you against a disk failure, which is what the original post wanted.
I back up my servers at work, I also RAID them. To me, doing both makes perfect sense.

Re:aw geeze. (1)

wallywam1 (715057) | more than 9 years ago | (#7341426)

Nitpicky semantics != intelligence. Redundancy is a form of backup since it allows a way to recover data that would be lost if the redundancy were not in place.

Re:aw geeze. (-1, Troll)

Anonymous Coward | more than 9 years ago | (#7341432)

Or perhaps *you* could learn to read.

The poster clearly states he is trying to protect against a disk crash.

In fact *jackass* - reread his post. It does NOT say he is looking for backups (he says he does that separately), but simply a mechanism for redundant storage of data - network raid.

Fucking jackass illiterate fool.

Re:aw geeze. (0)

Anonymous Coward | more than 9 years ago | (#7341434)

where did he say that he was using this as a backup method? he clearly states that he is looking to avoid problems due to disk failure which is exactly what RAID is designed for. lighten up.

-AC

Win2k (4, Informative)

SuiteSisterMary (123932) | more than 9 years ago | (#7341344)

I believe that Windows 2000's Distributed File System allows you to do just this.

Re:Win2k (1)

... James ... (33917) | more than 9 years ago | (#7341421)

Nope -- DFS is used to distribute your data across multiple servers but have it accessible from one location.

For example, say you have a DFS root of \\domain\dfs, with multiple children, like \\domain\dfs\mp3 and \\domain\dfs\games. mp3 and games can be shares on two different servers, but they're accessible via the same virtual \\domain\dfs share.

It's useful nonetheless.

Re:Win2k (2)

SuiteSisterMary (123932) | more than 9 years ago | (#7341513)

If you look further [microsoft.com] into DFS, I believe you'll find that you can have multiple servers synchronizing the same share name.

It's pretty snazzy; it'll even try to figure out the 'closest' server to you at any given time, skip over servers that are down, and so on.

Re:Win2k (1)

RedX (71326) | more than 9 years ago | (#7341581)

If you look further into DFS, I believe you'll find that you can have multiple servers synchronizing the same share name

The distributed feature would be quite worthless if there wasn't some synchronization taking place to make sure the data was synched across all servers in the DFS namespace.

Re:Win2k (0)

Anonymous Coward | more than 9 years ago | (#7341565)

Actually, you can have multiple replicas that are updated in real-time, bi-directional using the File Replication Service of Windows 2000 (part of DFS). Each replica can be accessed independently.

Re:Win2k (1)

Havokmon (89874) | more than 9 years ago | (#7341480)

I believe that DFS allows you to do just this.

:s/DFS/DASD

Comment (1)

TerminatorT100 (720110) | more than 9 years ago | (#7341346)

I've been looking into this too. Most workstations today have large harddisks (40GB+) while on a network maybe 2-4 GB is used... Any windows software out there?

NBD for Windows (1, Redundant)

backtick (2376) | more than 9 years ago | (#7341507)

http://www.vanheusden.com/Loose/nbdsrvr/

(I haven't used this, but it exists)

So... (1)

Pingular (670773) | more than 9 years ago | (#7341352)

Distributed Data Storage on a LAN?
Kind of like a Beowulf of hard-discs then?

rdist would work... (4, Informative)

ZenShadow (101870) | more than 9 years ago | (#7341353)

The obvious answer for this is nbd, as pointed out in another post -- but I would have concerns about speed with that kind of setup. I'd be interested in hearing reports on that.

But if you don't want to get into nbd, you can tolerate delayed writes to your virtualized disks, and all you want is the network equivalent of RAID level 1, then you could always just set up an rdist script that synchronizes your local data disk with a remote repository (or eight) every so often...

--ZS

Speed (4, Interesting)

backtick (2376) | more than 9 years ago | (#7341546)

Using a pair of Intel EEPro 100's w/ trunking (using both links at the same time on one IP, works w/ a cisco switch), I've gotten over 100 Mb/sec of actual throughput (I think I hit 137 Mbit/sec, peak) out of a box using NBD to create a mirrored RAID volume over the trunked ports. Now, my actual 'real' data speeds to the file system were about half that (call it 50-65 Mbit, or roughly 6 to 8 MByte/sec), due to mirroring == writing it twice. Still not bad. Yes, the target disks were themselves part of other RAID volumes, for speed :)
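
The arithmetic behind numbers like these is easy to check. Below is a small Python sketch, assuming 8 bits per byte, no protocol overhead, and that a network mirror sends every block once per copy; the function names are made up for this example.

```python
def mbit_to_mbyte(mbit_per_s):
    """Convert megabits per second to megabytes per second (8 bits per byte)."""
    return mbit_per_s / 8


def mirrored_payload(wire_mbit_per_s, copies=2):
    """Useful data rate when every block crosses the wire `copies` times."""
    return wire_mbit_per_s / copies


peak_wire = 137                             # Mbit/sec on the trunked links
peak_payload = mirrored_payload(peak_wire)  # 68.5 Mbit/sec of real data
low = mbit_to_mbyte(50)                     # 6.25 MByte/sec
high = mbit_to_mbyte(65)                    # 8.125 MByte/sec
```

So a two-way mirror over the network tops out at about half the wire speed for writes, before any TCP or filesystem overhead is counted.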

slashdot creates idiots (0)

exspecto (513607) | more than 9 years ago | (#7341366)

I fail to see why this was allowed to post to the front page. How many fricking times do we have to tell you retards, it's LOSE, NOT LOOSE?!?!

Re:slashdot creates idiots (0)

Anonymous Coward | more than 9 years ago | (#7341474)

"Is there any existing technologies that will let me do this?""

Standard Linux kernel maybe? (2)

buzzbomb (46085) | more than 9 years ago | (#7341368)

Perhaps multiple files over different networking protocols (SMB for Windows machines, NFS for the Linux machines) mapped to built-in loopback devices (/dev/loX) accessed through built-in md utilizing software RAID5? Heh. It might not be pretty or fast, but it would probably work just fine. It may just give the kernel absolute fits though.

Anyone tried this?

Re:Standard Linux kernel maybe? (3, Informative)

backtick (2376) | more than 9 years ago | (#7341463)

NBD *is* standard Linux kernel. It's built right in: /usr/src/linux-2.4/Documentation/nbd.txt

If you're curious about using the enhanced NBD w/ failover and HA, you can read about it at:

http://www.it.uc3m.es/~ptb/nbd/#How_to_make_ENBD_work_with_heartbeat

Re:Standard Linux kernel maybe? (1)

buzzbomb (46085) | more than 9 years ago | (#7341531)

NBD *is* standard Linux kernel. It's built right in: /usr/src/linux-2.4/Documentation/nbd.txt

Ok. But does it work under Windows? That was one of the requirements.

Gah (0)

Anonymous Coward | more than 9 years ago | (#7341372)

just nfs mount the disks and use a backup utility to backup across the network nightly.

InterMezzo (1, Informative)

Anonymous Coward | more than 9 years ago | (#7341373)

Sounds like Coda or InterMezzo [inter-mezzo.org] would fit the bill, but they won't address non-linux systems directly. You'd have to export the InterMezzo file systems with Samba and mount them on the MS Win boxes.

AFS (4, Informative)

Reeses (5069) | more than 9 years ago | (#7341374)

It's called the Andrew File System.

http://www.psc.edu/general/filesys/afs/afs.html

There's another alternative with a different name, but I forget what it's called.

Re:AFS (1)

Reeses (5069) | more than 9 years ago | (#7341443)

Whee.. replying to my own post... In addition to AFS...

Coda:

http://coda.cs.cmu.edu/

and InterMezzo:

http://www.inter-mezzo.org/

and there's a review here:

http://www.linuxplanet.com/linuxplanet/reports/4361/1/

Although, honestly, a 5 second search on google for "distributed filesystem" would have turned this up.

Ah, well.

Re:AFS (1)

wetshoe (683261) | more than 9 years ago | (#7341456)

I'd have to agree, AFS is a great solution. I actually thought of this about a year ago, and I told a co-worker about it. He told me it had already been implemented, and as it turns out, it had: it's AFS.
AFS is actually pretty cool. You can run a file server that uses all the disk space of all the client machines. It's a great idea now, especially since most new machines come with 40GB hard drives, and most people don't use anything more than 5GB.
AFS is a wonderful solution to not only this problem that the poster is talking about, but it can be used in so many other interesting ways.

Re:AFS (1)

kaybi (261428) | more than 9 years ago | (#7341462)

OpenAFS

http://openafs.org/ [openafs.org]

Why? (2, Funny)

Anonymous Coward | more than 9 years ago | (#7341382)

I have 8 computers at my house on a LAN. I make backups of important files, but not very often

I mean, let's be honest here. We are all dorks, but this guy is king dorkus dweedius maximus. Don't fool yourself about the "important data" - it is just pr0n and pirated MP3s.

If it was real work, there would be a real IT guy with real RAID and real backup tapes working on the problem. But we know it isn't real work, because if this guy had a real IT job, he couldn't stand coming home and dealing with 8 friggin computers.

We realize you think you are cool because you have a few KVMs, a couple of Linksys routers, and a bunch of old PIIs running Lunix with one Windows machine, but come on, man. Stop spanking yourself over your elite NAT-ed network and just get one computer with hardware RAID. Install Cygwin if you feel the need to type configure && make && make install a whole bunch of times and watch teh pretty text lines scroll.

Re:Why? (0)

gatkinso (15975) | more than 9 years ago | (#7341464)

Actually... if the guy has a family and is an independent techie, then he could have 8 machines easily.

Most common form of data loss? (5, Insightful)

Anonymous Coward | more than 9 years ago | (#7341385)

I'd dispute the claim that the most common form of data loss is a crashed hard disk.

In my 14 years as a Network Administrator I think I've restored backups due to failed hard disks about twice (RAID catches the rest).

But I restore data accidentally deleted or changed by a user at least weekly! A distributed storage system won't help you there.

However, I will grant that the average /. user knows what they're doing with their data far more than my average user does and is less likely to cause self-inflicted damage.

Re:Most common form of data loss? (1)

JohnFluxx (413620) | more than 9 years ago | (#7341452)

That's why I don't know why _by default_ it isn't set up to have the whole of /home under cvs

Re:Most common form of data loss? (1)

Xerithane (13482) | more than 9 years ago | (#7341569)

That's why I don't know why _by default_ it isn't set up to have the whole of /home under cvs

CVS isn't designed for that, unless you only store documents or have some pretty stringent filters set up on CVS. CVS is for versioning, and you don't really want to maintain a backlog of every version of every file in your home directory.

Re:Most common form of data loss? (4, Insightful)

Blackknight (25168) | more than 9 years ago | (#7341598)

That's one feature from VMS that I wish unix had. File versioning was built in to the file system, so if you wanted the old version of a file back you just had to roll back to the old one.

Intermezzo (5, Informative)

mikeee (137160) | more than 9 years ago | (#7341389)

Intermezzo [inter-mezzo.org] is designed for this and a bit more - if one of the machines is a laptop you can take it away and work on it, and it'll resync when you get back.

It isn't particularly high-performance, from what I know, and may be more complexity than you need.

Network RAID (1, Interesting)

Anonymous Coward | more than 9 years ago | (#7341394)

Red Hat has a very good software RAID and it is easy to set up with only two disks. Of course with only two disks they are mirrored. But it is very easy to set up a cron entry that can email you the status of that mirror every day.
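
As a rough idea of what such a status check could look at: Linux software RAID reports member state in /proc/mdstat with markers like [UU] (all members up) or [U_] (one member missing). The Python sketch below only parses text in that style and returns the names of degraded arrays. The sample text is illustrative, not captured from a real machine, the function name is invented for this example, and the cron and mail side is left out entirely.

```python
import re


def degraded_arrays(mdstat_text):
    """Return names of md arrays whose status marker contains a '_'."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # Status markers look like [UU] or [U_]; nothing but U and _ inside.
        status = re.search(r"\[([U_]+)\]", line)
        if current and status and "_" in status.group(1):
            degraded.append(current)
    return degraded


# Illustrative sample only: md0 is healthy, md1 has lost one mirror half.
SAMPLE = """\
Personalities : [raid1]
md0 : active raid1 hdc1[1] hda1[0]
      104320 blocks [2/2] [UU]
md1 : active raid1 hda2[0]
      39061952 blocks [2/1] [U_]
unused devices: <none>
"""

broken = degraded_arrays(SAMPLE)
```

A cron job could run something like this against the real /proc/mdstat and send mail only when the returned list is non-empty.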

Bandwidth (3, Insightful)

omega9 (138280) | more than 9 years ago | (#7341396)

I hope you're looking at some fast lines to put between those boxen. Even at 100Mb/sec, doing RAID across a LAN could get slow.

Re:Bandwidth (1)

SirJaxalot (715418) | more than 9 years ago | (#7341511)

don't even think about trying raid over modems either.

RAID on Files (3, Insightful)

Great_Geek (237841) | more than 9 years ago | (#7341402)

I have often wanted the same thing, kind of like RAID on files, call it RARF (Redundant Array of Remote Files). I was thinking along the line of a device driver that presents an ATA/IDE interface to the file system on one side and passes the requests to multiple copies of virtual disks. The virtual disks would be like VMWare disks, and potentially each on a different machine/location. Each virtual disk could even be encrypted differently.

This would be really useful for SOHO type places to allow me to have a hot offsite backup at multiple friends (and vice versa).

Re:RAID on Files (1)

ZenShadow (101870) | more than 9 years ago | (#7341478)

What you describe is a combination of the loopback and md drivers under Linux -- RAID1 (or 5 or...) on loopback devices pointing at files living on NFS disks. Or something.

--ZS

DIBS? (1)

kulpinator (629554) | more than 9 years ago | (#7341403)

I haven't checked into it much, but I remembered the DIBS [berkeley.edu] (Distributed Internet Backup System -- Slashdot article here [slashdot.org] ). I would imagine that it could be modified (maybe not trivially) to support real-time disk operations, since it is open-source. However, although I don't know much about Python, I have a feeling this may suffer in performance from being written in a (semi-)interpreted language. Python lovers want to flame me for incriminating their programming language?

Backing up all within your house (4, Insightful)

Alain Williams (2972) | more than 9 years ago | (#7341404)

Hmmmm, what happens if your house catches fire ?

8 copies of the same document all nicely toasted!

Re:Backing up all within your house (2, Funny)

feepness (543479) | more than 9 years ago | (#7341530)

Hmmmm, what happens if your house catches fire ?

Come on, this'll never happen. I live in San Diego!

Re:Backing up all within your house (1)

peragrin (659227) | more than 9 years ago | (#7341595)

Yes and No. Your house would have to be totally gutted for that to happen. With an average 10 minute response time for fire depts. in the U.S. (longer if you live in a rural area, shorter in the cities), the probability of losing all 8 systems is remote. Chances are at least 2 of the systems will survive.

Your chances are even better if you separate the machines throughout the house.

Re:Backing up all within your house (1)

BigDumbAnimal (532071) | more than 9 years ago | (#7341599)

This is personal stuff. It's not like this guy has $20M in data that needs a redundant data center ready to go live within minutes on the other side of the world.

Loose Hard Drive? (2, Funny)

Anonymous Coward | more than 9 years ago | (#7341409)

As opposed to a tight one?

Speed would be an issue... (4, Informative)

Trolling4Dollars (627073) | more than 9 years ago | (#7341412)

I imagine you'll need gigabit ethernet or multiple NICs in bonded mode. Then you have the performance of each individual system to take into account. Especially if one of the systems is heavily used. I would recommend getting one BIG HONKIN' SERVER and putting it in a central location. Give it gigabit and let everything else connect to it at 100. Then, make sure it has a hardware RAID controller. Use SAMBA for the cross platform connectivity you desire, and voila! protected data with redundancy and high speed performance. If you go with remote display (RDP with Windows Terminal Server or X with *nix) then you have an even better approach as all the data will exist on the secure RAID box.

I get what you mean though... it's a nice idea, but it would be costly to implement vs. what I suggested above.

When I went to see a presentation on HP's SAN solutions last year, I was very impressed with the ideas they had. One big hardware box with multiple disks that are controlled by the hardware. They are then presented to any systems over a fiber link as any number of drives you wish for any OS. Finally, their "snapshot" ability was pretty impressive. (Also called Business Copy) All they would do is quiesce the data bus, then create a bunch of pointers to the original data. As data is altered on the "copy" (just the pointers, not a real copy), the real data is then copied to the "copy" with changes put in place. I imagine something similar could be accomplished with CVS...
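
The pointer trick described here is generally known as copy-on-write. The Python sketch below shows the general idea only and makes no claim about how HP's or anyone else's array implements it: class and method names are invented, blocks are plain byte strings in a list, and this variant preserves the old block when the original volume is written, which is one of several possible designs.

```python
class Snapshot:
    """A frozen view of a volume that stores only blocks changed since."""

    def __init__(self, volume):
        self.volume = volume
        self.saved = {}  # block index -> contents at snapshot time

    def preserve(self, index, old_data):
        # Keep only the first old value; later overwrites do not matter.
        self.saved.setdefault(index, old_data)

    def read(self, index):
        # Unchanged blocks are read straight from the live volume.
        return self.saved.get(index, self.volume.blocks[index])


class Volume:
    """A toy block device that supports copy-on-write snapshots."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.snapshots = []

    def snapshot(self):
        snap = Snapshot(self)
        self.snapshots.append(snap)
        return snap

    def write(self, index, data):
        # Before overwriting, hand the old block to every open snapshot.
        for snap in self.snapshots:
            snap.preserve(index, self.blocks[index])
        self.blocks[index] = data


vol = Volume([b"alpha", b"beta", b"gamma"])
snap = vol.snapshot()        # instant: nothing is copied yet
vol.write(1, b"BETA-v2")     # only now is the old block 1 set aside
```

Taking the snapshot costs almost nothing; space is used only for blocks that change afterwards.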

Re:Speed would be an issue... (0)

Anonymous Coward | more than 9 years ago | (#7341563)

Yeah, "HP's" technology is great. They OEM-ed it from Hitachi Data Systems, they don't build their own hi-end storage arrays. And by the way, Hitachi was successfully sued by EMC (http://www.emc.com) for patent infringement for this technology.

Coda (3, Redundant)

fmlug.org (695374) | more than 9 years ago | (#7341416)

Coda may do what you're looking for:
# disconnected operation for mobile computing
# is freely available under a liberal license
# high performance through client side persistent caching
# server replication
# security model for authentication, encryption and access control
# continued operation during partial network failures in server network
# network bandwidth adaptation
# good scalability
# well defined semantics of sharing, even in the presence of network failures
More info here: http://www.coda.cs.cmu.edu/

Distributed Network Block Device (2, Informative)

JumboMessiah (316083) | more than 9 years ago | (#7341418)

A perfect solution would be a form of network block device that mounts distributed NBD shares. The Linux DRBD Project [drbd.org] has this capability. From their website, "You could see it as a network raid-1".

data loss (1)

_fuzz_ (111591) | more than 9 years ago | (#7341420)

...I could protect myself from the most common form on data failure - a disk crash.

In my experience, the most common form of data loss is not hardware failure, but user error. RAID is great for protecting against hardware failure, but be sure to still make backups to protect against accidental deletion.

...existing technologies that will let me do this? (-1, Flamebait)

Anonymous Coward | more than 9 years ago | (#7341430)

No.


Next question?

Two words (sort of) (0)

Anonymous Coward | more than 9 years ago | (#7341433)

Samba RAID1.

What you are asking for sounds pretty damn complicated. My home has about 10 machines in it, and I just use Samba on two mirrored disks for network storage.

Hey, but it's a free world. Feel free to ratchet up the technology till you bleed....

Try Rsync or DRBD (4, Informative)

oscarm (184497) | more than 9 years ago | (#7341436)

see http://drbd.cubit.at/ [cubit.at] DRBD is described as RAID1 over a network.

"Drbd takes over the data, writes it to the local disk and sends it to the other host. On the other host, it takes it to the disk there."

Rsync with a cron script would work too. I think there is a recipe in the linux hacks books to do something like what you are looking for: #292 [oreilly.com] .

Venti needs a mention (3, Informative)

DrSkwid (118965) | more than 9 years ago | (#7341438)

http://plan9.bell-labs.com/sys/doc/venti/venti.html [bell-labs.com]

Abstract

This paper describes a network storage system, called Venti, intended for archival data. In this system, a unique hash of a block's contents acts as the block identifier for read and write operations. This approach enforces a write-once policy, preventing accidental or malicious destruction of data. In addition, duplicate copies of a block can be coalesced, reducing the consumption of storage and simplifying the implementation of clients. Venti is a building block for constructing a variety of storage applications such as logical backup, physical backup, and snapshot file systems.
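
The core idea in that abstract, addressing a block by the hash of its contents, fits in a few lines. Here is a minimal Python sketch using an in-memory dict; it is not Venti's protocol or on-disk format, the class and method names are invented, and SHA-1 from the standard hashlib module is used simply as an example hash.

```python
import hashlib


class BlockStore:
    """Toy write-once store: a block's address is the hash of its contents."""

    def __init__(self):
        self._blocks = {}

    def put(self, data):
        """Store data and return its fingerprint (hex SHA-1 of the contents)."""
        fingerprint = hashlib.sha1(data).hexdigest()
        # Same contents give the same key, so duplicates coalesce and an
        # existing block can never be replaced by different data.
        self._blocks.setdefault(fingerprint, bytes(data))
        return fingerprint

    def get(self, fingerprint):
        """Return the block and verify that it still matches its address."""
        data = self._blocks[fingerprint]
        if hashlib.sha1(data).hexdigest() != fingerprint:
            raise ValueError("stored block does not match its fingerprint")
        return data

    def __len__(self):
        return len(self._blocks)


store = BlockStore()
first = store.put(b"same old backup block")
second = store.put(b"same old backup block")  # duplicate write, nothing added
```

Because the address is derived from the data, there is no "overwrite" operation at all: changed data simply gets a new address, which is what makes the scheme a natural fit for archival storage.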

PENIS (-1, Flamebait)

Anonymous Coward | more than 9 years ago | (#7341450)

Doe a deer, a female deer, Ray the guy that fucked her ass.

Expensive but reliable solution (2, Interesting)

onyxruby (118189) | more than 9 years ago | (#7341459)

I've been looking into something like this for a little while. What I'd like to do when I have the fundage is get a fileserver/backup box. The ideal is to run 4 160 GB IDE drives in RAID 5. This will give me a bit over 450 GB in usable network storage. I then want to add a pair of 250 GB 5400 RPM drives for backup. I can then set up the server to back up the data from the RAID drives to the backup drives on a daily basis.

According to pricewatch the 4 160's could be had for around $400 total with about another $400 for the backup. Add a 3ware RAID controller for another $245 and you're looking at about $1045 to convert a system into supporting 450 GB of usable network storage and backup.

From all indications IDE hard drives are now the cheapest form of backup there is. I've looked at CD, DVD, Tape, but it keeps coming back to IDE hard drives. This is far cheaper than similar storage and backup would be on tape.
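
For anyone sizing a similar box, the capacity and cost figures can be checked with a few lines of Python. The sketch assumes the usual RAID 5 rule that one drive's worth of space goes to parity, and uses the prices quoted in the post; the function name is made up for this example.

```python
def raid5_usable_gb(drives, size_gb):
    """Usable capacity of a RAID 5 set: one drive's worth goes to parity."""
    if drives < 3:
        raise ValueError("RAID 5 needs at least 3 drives")
    return (drives - 1) * size_gb


usable_gb = raid5_usable_gb(4, 160)      # 480 decimal gigabytes
usable_gib = usable_gb * 10**9 / 2**30   # about 447 when counted in 2**30 bytes
total_cost = 400 + 400 + 245             # RAID drives + backup drives + controller
```

Four 160 GB drives come to 480 GB usable by the drive makers' decimal count, or roughly 447 as most operating systems report it, and the parts total $1045 as stated.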

hyper scsi (2, Informative)

blaze-x (304666) | more than 9 years ago | (#7341461)

from the website:

HyperSCSI is a networking protocol designed for the transmission of SCSI commands and data across a network. To put this in "ordinary" terms, it can allow one to connect to and use SCSI and SCSI-based devices (like IDE, USB, Fibre Channel) over a network as if it was directly attached locally.

http://nst.dsi.a-star.edu.sg/mcsa/hyperscsi/ [a-star.edu.sg]

iSCSI? (1)

SuperBug (200913) | more than 9 years ago | (#7341465)

You can share iSCSI devices, if you do it the right way, between many different hosts. NBD sounds good, but for what you're asking, iSCSI or FCIP or some derivative sounds more correct, i.e. virtual block devices, or "real" block devices on a network, that can be accessed by Windows or *nix. You could RAID (md) the iSCSI devices, or just use one system which "owns" all the iSCSI devices in an md array and presents it up using CIFS or SMB.

Check this out... (1)

BubbaTheBarbarian (316027) | more than 9 years ago | (#7341485)

http://www6.tomshardware.com/storage/20031028/index.html
Not as a solution in and of itself, but it is a good idea considering that you more than likely have a box to burn... also try to grab some old PolyServe software. It will do the same thing over a network, though not without resource loss.
WAR TUX!

Not really a good idea (1)

c77m (690488) | more than 9 years ago | (#7341491)

Maybe it would be a fun experiment, but there are too many potential issues for me to consider this a good solution. With a goal of decreasing your susceptibility to failure, you are introducing many more possible failure points. Instead of data relying on one disk, bus, and cache, you're multiplying those by as many systems as you have, plus introducing your network as a failure point.

What about data integrity when the network fails? Or when a single host fails? You could create ACLs for hosts that would be responsible for certain data upon certain failures, but then you're adding to an already overwhelming management nightmare.

Why not consider a shared storage system? You're not realistically going to have a failproof plan in your home, so just narrow it down to a few things. External JBOD with software RAID, presented as NAS to the rest of your computers. If a drive fails, just replace it. If the NAS head fails, just hook up the JBOD to another host.

Re:Not really a good idea (1)

Indianwells (661008) | more than 9 years ago | (#7341528)

That's not really the point, is it? This guy has a legit need here: to set up a shared backup system among a group of machines. The single point of failure here is a hard drive on one or many machines. This way he distributes his problem instead of not having a solution. I like this idea quite a bit. As this isn't for the enterprise, why would he even really care about access control? Sheesh.

Typo! (0)

Anonymous Coward | more than 9 years ago | (#7341495)

Just thought I'd point this out, a typo in the article:

"Is there any existing technologies that will let me do this?"

--Should read "are..."

Lustre (0)

Anonymous Coward | more than 9 years ago | (#7341499)

The highest performance is probably from Lustre [lustre.org] , although it is designed for slightly larger clusters. Haven't tried it yet though.

Rsync and Ssh (4, Informative)

PureFiction (10256) | more than 9 years ago | (#7341521)

This is the way I do it, and although a little clunky, it allows me to keep remote backups of certain directories on three different servers.

First, setup ssh to use pubkey authentication instead of interactive password. You can read the man pages for details but it basically boils down to running keygen on the trusted source:

ssh-keygen -t rsa -b 2048 -f ~/.ssh/id_rsa

Then copy|append the newly created ~/.ssh/id_rsa.pub to the remote hosts into their /home/user/.ssh/authorized_keys file.

Now you can run rsync with ssh as the transport (instead of rsh) by exporting:

export RSYNC_RSH=ssh or also passing --rsh=ssh on the command line.

So to sync directories you could use a find command to update regularly:

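while true; do
    # Sync on the first pass, or when anything has changed since the last sync.
    if [ ! -e .last-sync ] || find . -follow -cnewer .last-sync | grep -q .; then
        # -a preserves times and permissions, so unchanged files are skipped next time.
        rsync -az --delete . destination:/some/path/
        touch .last-sync
    fi
    sleep 60
done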

Obviously this is pretty hackish and could be improved. But the point is that with ssh and rsync you could do automatic mirroring of specific filesystems or directories to remote locations securely.

The holy grail (1)

mcrbids (148650) | more than 9 years ago | (#7341535)

What you seek is the holy grail of high-availability environments.

So far, I've not seen anything that exists that does what you are asking for. Several technologies come somewhat close.

What I've been hopeful of is the recent donations by Oracle for database clustering, but I haven't seen any decent fallout from that... yet.

For now, on my home-based work network, I have two network drives (both IDE 120 GB) and do a nightly rsync from one to the other.

(sigh)

Unison? (1, Informative)

Anonymous Coward | more than 9 years ago | (#7341539)

I haven't yet seen a reference to Unison:

http://www.cis.upenn.edu/~bcpierce/unison/

They say: "Unison is a file-synchronization tool for Unix and Windows. (It also works on OSX to some extent, but it does not yet deal with 'resource forks' correctly; more information on OSX usage can be found on the unison-users mailing list archives.) It allows two replicas of a collection of files and directories to be stored on different hosts (or different disks on the same host), modified separately, and then brought up to date by propagating the changes in each replica to the other."

Careful... (0)

Anonymous Coward | more than 9 years ago | (#7341540)

If you "loose" your drive it might not come back.

dedicated vs. network (0)

Anonymous Coward | more than 9 years ago | (#7341545)

Wouldn't that be slower than just setting up a dedicated file server using some RAID hardware? If you did that over the network, wouldn't that slow down your network tremendously? Besides that, I don't see too much advantage in it. If you have 8 computers at home, just set a new one up as a dedicated file server! Put some 250 GB WD 8 MB cache drives on SATA with RAID 0+1... and boom, file server with RAID! More effective!! But I guess that's just this techie's opinion.

You aren't gonna get a real RAID. (5, Insightful)

PurpleFloyd (149812) | more than 9 years ago | (#7341547)

First off, you aren't going to be able to use this like a real RAID array (a drive can die and you keep on working). The latency and bandwidth of any network that could be reasonably implemented in your home is going to prevent your system from acting like a real RAID array.

Instead of trying to implement a shoestring SAN, go the simple route: throw up a Linux box running Samba for your "backup server;" it doesn't need much horsepower, just fairly fast drives and a network connection. Then schedule copies of your documents and home directories (using a cron-type tool on Linux and XCOPY called by the Task Scheduler on Windows, you should be able to hack something together that copies only changed files) every night at midnight, or some other time when you aren't using your computers. Although you might lose a bit of work if the system goes down, you won't ever lose more than 24 hours' worth.

If you have more money to blow, then I would suggest that you invest in an honest-to-dog hardware RAID card and some good drives and put them into a server, then do everything across the network (put the /home tree and My Documents folders on the server). You can of course mount the /home directory in Linux via NFS or smbmount, and Group Policy in Windows 2K/XP will allow you to change the location of the My Documents folder to whatever you choose. You might be able to do the same via the System Policy Editor on 9x; it's been a while and I can't find the information after a brief Google.

To sum up:

  • Don't blow millions on a SAN for your house.
  • Cheap route: cron jobs/Windows task scheduler to copy important folders across the network every night
  • More expensive route: invest in a server with real RAID, then mount your important directories from that.
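The "copy only changed files" part of the cheap route can be sketched as follows. This is a minimal illustration, not a replacement for XCOPY, Robocopy, or rsync: the function name `copy_changed` is hypothetical, the change test (size plus modification time) is an assumption, and the demo uses temporary directories in place of a real network share.

```python
import os
import shutil
import tempfile


def copy_changed(src: str, dst: str) -> list:
    """Copy files from src to dst when missing or changed; return what was copied."""
    copied = []
    for root, _dirs, files in os.walk(src):
        rel_root = os.path.relpath(root, src)
        target_root = os.path.join(dst, rel_root)
        os.makedirs(target_root, exist_ok=True)
        for name in files:
            source = os.path.join(root, name)
            target = os.path.join(target_root, name)
            s = os.stat(source)
            if os.path.exists(target):
                t = os.stat(target)
                # Same size and an equally new (or newer) copy: nothing to do.
                if t.st_size == s.st_size and t.st_mtime >= s.st_mtime:
                    continue
            shutil.copy2(source, target)  # copy2 keeps the modification time
            copied.append(os.path.normpath(os.path.join(rel_root, name)))
    return copied


with tempfile.TemporaryDirectory() as work:
    docs = os.path.join(work, "docs")
    backup = os.path.join(work, "backup")
    os.makedirs(docs)
    with open(os.path.join(docs, "notes.txt"), "w") as f:
        f.write("first draft")
    first_run = copy_changed(docs, backup)    # copies notes.txt
    second_run = copy_changed(docs, backup)   # nothing changed, copies nothing
```

Run nightly from cron or the Task Scheduler with `dst` pointing at the mounted share, a script like this would bound the loss to a day's work, as the comment describes.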

Re:You aren't gonna get a real RAID. (1)

Lester67 (218549) | more than 9 years ago | (#7341570)

Windows "Robocopy" will automatically check and compare file dates for you.

You probably don't want to do this. (3, Insightful)

NerveGas (168686) | more than 9 years ago | (#7341552)


Really. If you're on a 100-megabit LAN, that gives you a max of about 10 megaBYTES per second. So, if you have to transmit information to two other computers for every disk write, you're effectively limiting yourself to a maximum of about 5 megabytes/second disk transfer. And that's under GOOD situations. If you're doing random I/O, where the latency will be the determining factor, then take the latency of the hard drives, add in the latency of the networking, and the latency of the software layers, and you're looking at some pretty abysmal performance.
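The bandwidth estimate above is simple enough to write down. In this sketch the 0.8 efficiency factor is an assumed allowance for protocol overhead, chosen so that a 100-megabit link comes out at the comment's "about 10" megabytes per second; the function name is hypothetical.

```python
def replicated_write_mb_per_s(link_mbit: float, replicas: int,
                              efficiency: float = 0.8) -> float:
    """Rough ceiling on write throughput when every write is sent to
    `replicas` other hosts over one shared link."""
    usable_mb_per_s = link_mbit / 8 * efficiency  # bits to bytes, minus overhead
    return usable_mb_per_s / replicas


single = replicated_write_mb_per_s(100, 1)    # one copy over the wire
mirrored = replicated_write_mb_per_s(100, 2)  # each write goes to two machines
```

This only covers sequential throughput. The latency point in the comment (disk plus network plus software layers on every random I/O) is not modelled here and would make things worse.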

Using rsync in a cron job will solve your backup problems. In fact, your script can use rsync to do the synchronization, and tar/gzip to archive the backup - giving you "point in time" snapshots for when someone says "I deleted this file 4 days ago, can you get it back?"

steve
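The tar/gzip half of that suggestion, the dated "point in time" archive, might look something like this. It assumes rsync (or anything else) has already produced an up-to-date mirror directory; the `snapshot` function and the `backup-YYYY-MM-DD.tar.gz` naming are illustrative choices, not anything the comment prescribes.

```python
import os
import tarfile
import tempfile
import time


def snapshot(source_dir: str, archive_dir: str, when: float = None) -> str:
    """Write a gzip-compressed tar of source_dir, named by date; return its path."""
    stamp = time.strftime("%Y-%m-%d", time.localtime(when))
    os.makedirs(archive_dir, exist_ok=True)
    archive_path = os.path.join(archive_dir, f"backup-{stamp}.tar.gz")
    top_name = os.path.basename(os.path.normpath(source_dir))
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(source_dir, arcname=top_name)  # recurses into the directory
    return archive_path


with tempfile.TemporaryDirectory() as work:
    mirror = os.path.join(work, "mirror")
    os.makedirs(mirror)
    with open(os.path.join(mirror, "report.txt"), "w") as f:
        f.write("quarterly numbers")
    path = snapshot(mirror, os.path.join(work, "archives"))
    with tarfile.open(path) as tar:
        names = sorted(tar.getnames())
```

Keeping one archive per day is what makes the "I deleted this file 4 days ago" request answerable: you open the archive from before the deletion rather than the current mirror.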

nbd + evms2 = your best bet, but you'll lose (0)

Anonymous Coward | more than 9 years ago | (#7341553)

nbd + evms2 = networked software raid.

Be forewarned: This will be slower than snot on a cold Sunday.

The fastest and maybe even the cheapest setup to do this with would be to have a bunch of NAS drives on their own switch, with the host machine attached to the same switch. The host has multiple NICs, all channel bonded to this switch, and then has another NIC to the outside network. This would give you a big setup... but again, SLOW! You're looking at 5 MB/s tops with overhead. IDE does this stuff all the time at up to 40 MB/s+. SCSI and Fibre Channel, even faster.

Good luck!

I can't believe... (2, Interesting)

wcdw (179126) | more than 9 years ago | (#7341560)

...this question even got asked. OK, if you *need* to share the same device across machines, something like the network block device can be a real help.

If all you're worried about is disk failures, mirror each disk locally. Disks are cheap, and real operating systems don't have any trouble with software mirroring.

Why would you want to make all of your machines suddenly non-functional, just because one of them lost a network card? Or the switch failed? Or ....

while it's a cool idea (1)

flaming-opus (8186) | more than 9 years ago | (#7341562)

what you're proposing is probably a poor solution to your needs. Using RAID-like disk storage across the network will require several high-latency transfers across the network for every write operation. Very slow.

Furthermore, every time one of the computers is powered off the system will wait for that machine to come back, or will treat it like a dead disk. Even with high performance raid devices, degraded mode is mighty slow. Then when the device comes back you will have to rebuild the raid. A long/slow/agonizing process even with fast hardware.

I think rsync in a crontab is a much better idea.
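To see why degraded mode and rebuilds are so expensive over a network, here is a rough sketch of parity reconstruction. It assumes a RAID 5 style layout with one XOR parity block per stripe, which the comment does not specify; the four-byte blocks and the host layout are invented for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


# One stripe spread over three data hosts plus one parity host.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# The host holding data[1] is powered off. Its block can only be recovered by
# reading the matching block from every surviving host, plus the parity block.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
```

Every read of a missing block, and every block of a full rebuild, needs data from all the other machines, so on a LAN the cost described above is paid in network round trips as well as disk reads.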

Coda File System (0)

Anonymous Coward | more than 9 years ago | (#7341582)

Another DFS

http://www.coda.cs.cmu.edu/

Why is Coda promising and potentially very important?

Coda is a distributed filesystem with its origin in AFS2. It has many features that are very desirable for network filesystems. Currently, Coda has several features not found elsewhere.

1. disconnected operation for mobile computing
2. is freely available under a liberal license
3. high performance through client side persistent caching
4. server replication
5. security model for authentication, encryption and access control
6. continued operation during partial network failures in server network
7. network bandwidth adaptation
8. good scalability
9. well defined semantics of sharing, even in the presence of network failures

Availability (1)

raphae1 (695666) | more than 9 years ago | (#7341583)

That also means that whenever even one of the machines is down ('hw maintenance', new kernel boot, system crash, unplugged...) all the others will lose access to the data too.
I suppose it could work well in a server room, but if your home setup is anything like mine - open cases and cat5 crisscrossing the house - or you have a screwdriver on your desk, you might experience a lot of downtime...
My wife would have me by the curlies.

yes, I'm a soldering iron wielding programmer

Umm, but what about? (1)

mschuyler (197441) | more than 9 years ago | (#7341593)

I hate to point this out, but my daughter's house in Scripps Ranch in San Diego just narrowly escaped completely burning down. She evacuated with her hard disk (smart thinking there, kid!). The place is uninhabitable with smoke damage. How the fire went around that cul-de-sac is just amazing.

The point is: 8 computers in the house won't help diddly in a real disaster. That's a lot of work just to see it burn up. (I know it will never happen to you; it was 2,000 other houses that burned to the foundation.)

And further, I've had two RAID systems go TU in the last few years. For me RAID doesn't cut it at all. Distributed File System works pretty cool--but so does a fire safe.