
Making Use of Terabytes of Unused Storage

CmdrTaco posted more than 6 years ago | from the something-to-think-about dept.

Data Storage 448

kernspaltung writes "I manage a network of roughly a hundred Windows boxes, all of them with hard drives of at least 40GB — many have 80GB drives and larger. Other than what's used by the OS, a few applications, and a smattering of small documents, this space is idle. What would be a productive use for these terabytes of wasted space? Does any software exist that would enable pooling this extra space into one or more large virtual networked drives? Something that could offer the fault-tolerance and ease-of-use of ZFS across a network of PCs would be great for small-to-medium organizations."


448 comments


Porn (5, Funny)

Anonymous Coward | more than 6 years ago | (#22359676)

It's the obvious choice.

Re:Porn (2, Funny)

tristian_was_here (865394) | more than 6 years ago | (#22359704)

Obviously not gay porn... it's not my thing anyway.

vista? (5, Funny)

stillb4llin (1232934) | more than 6 years ago | (#22359680)

Install Vista on them. That would fill up the space and give you something better to do with your time than wondering what you could manage.

Re:vista? (1, Informative)

PolarBearFire (1176791) | more than 6 years ago | (#22359844)

Yeah, according to Apple, Vista takes up 80GB.

Re:vista? - DFS (4, Informative)

whackco (599646) | more than 6 years ago | (#22359948)

You know, make fun of Microsoft all you want, but they actually have something for this: DFS, the Distributed File System. Just create a share on each of these machines and POOL them with DFS. Then use and manage it to your heart's content with all the midget-donkey-goatse crap you want.

Re:vista? - DFS (4, Interesting)

OnlineAlias (828288) | more than 6 years ago | (#22360130)


This is why SAN manufacturers have come up with "thin provisioning". NetApp is quite good at it; read more here [netapp.com].

Re:vista? - DFS (0)

MikeyTheK (873329) | more than 6 years ago | (#22360180)

Yeah, but what happens if the local user needs the space? Does DFS give priority to local storage and move the files? If it has to do that quickly, it could be a pain, since the throughput would be poor, right?

easy! (5, Funny)

Anonymous Coward | more than 6 years ago | (#22359698)

Does any software exist that would enable pooling this extra space into one or more large virtual networked drives?

Absolutely! Just hook them up directly to the internet before you update the machines, wait a few minutes, and voila! They'll be filled up with extra files in no time! Hey, you didn't say anything about wanting to be in control of what gets put on the machines...

Not without heavy utilization of other resources (4, Insightful)

Mostly a lurker (634878) | more than 6 years ago | (#22359700)

If you have a very robust local network with plenty of spare capacity, and can accept a performance hit on the client computers, I am sure some kind of linked filesystem would be possible. In most practical situations, I think this idea would be a non-starter.

Re:Not without heavy utilization of other resource (1)

bostonsoxfan (865285) | more than 6 years ago | (#22359778)

I don't know; where I work, tons of engineers leave their computers on overnight, so you could do backups or transfers then. Or you could do something similar to SETI@home and run only when computers are idle. I think the big hurdle is partitioning off a part of each hard drive so that the user can't access it; what they don't know about, they can't be angry about losing.

The other problem I see, if you use it as a virtual drive or for backup, is security. Servers are nice because they are locked up, monitored, and generally better protected than user workstations. Where I work, the workstations aren't locked up at all, and I would be very wary of allowing sensitive company documents to be stored on something that is amorphous by nature. I work in the aerospace industry, and I know heads would roll if some of the documents we generate were leaked, because a lot of what's on our servers is classified, proprietary, or IP.

Re:Not without heavy utilization of other resource (1)

AndGodSed (968378) | more than 6 years ago | (#22359908)

Well, the average Windows install doesn't recognize an ext3 filesystem (for instance; most Linux filesystems aren't "seen" from Windows), so splitting each drive into a Windows and a Linux partition should be fine. Then use these drives for multiple backup mirrors via a small Linux Apache server...

You could secure them with passwords and so on.

Oh go ahead and poke flaming holes in my suggestion *buries face in hands and sobs*

Re:Not without heavy utilization of other resource (1)

GIL_Dude (850471) | more than 6 years ago | (#22359944)

Wait a minute; if Windows can't see the data, how will it serve the data up to your remote machines? Or are you saying that he should remotely (or on a schedule) reboot the machines into Linux overnight to do this? Because there is no way an OS is going to serve up files from a partition it can't even read.

Re:Not without heavy utilization of other resource (1)

pionzypher (886253) | more than 6 years ago | (#22360166)

Linux can read FAT32 and NTFS partitions just fine. So yes, perhaps have a VM boot the image at night, mount the Windows partition, back up the drive, and shut down after. Or some custom app that just writes to the ext2 partition. As Bostonsoxfan alluded to, security might be an issue. Encrypting the partition the backups are stored on would probably be sufficient for most places.

Of course the risk of backing up your data on the same physical drive remains. I suppose a VM booting, a secure copy to a peer as well as accepting a copy of the peers backup would address that well enough. Now you'd just need a secure way of choosing the peer (unless you're going to hardcode all the pairs).
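
Choosing the peer could be as simple as hashing hostnames. A rough Python sketch (the hostnames are made up, and this assumes every machine sees the same host list):

    # Hypothetical sketch: derive stable mutual-backup pairs from hostnames,
    # so nothing needs hardcoding. All names here are invented.
    import hashlib

    def backup_pairs(hosts):
        # Sort on a hash so the order is stable but not just alphabetical.
        ring = sorted(hosts, key=lambda h: hashlib.sha1(h.encode()).hexdigest())
        pairs = [tuple(ring[i:i + 2]) for i in range(0, len(ring) - 1, 2)]
        if len(ring) % 2:                  # odd host out joins the last pair
            pairs[-1] = pairs[-1] + (ring[-1],)
        return pairs

    print(backup_pairs(["pc%02d" % n for n in range(1, 8)]))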

Re:Not without heavy *use* of other resources (2, Insightful)

Anonymous Coward | more than 6 years ago | (#22359850)

Please stop typing words like "utilization" when you mean "use". You sound like a PHB trying to sound smarter than he really is and you make it a pain for people to read what you write, especially non-Anglophones. Read George Orwell's essay on this topic [mtholyoke.edu] .

Re:Not without heavy *use* of other resources (3, Insightful)

DarrenBaker (322210) | more than 6 years ago | (#22360000)

Hrmm... Funny, he didn't come across that way to me at all. You, however, come across as a pompous linguistic Nazi, much like Orwell. If you compose sentences for people who don't have command of the language, then you are really quite delusional.

As I understand it, resources are utilised, while tools are used. His usage was correct.

Re:Not without heavy *use* of other resources (0, Offtopic)

DarrenBaker (322210) | more than 6 years ago | (#22360086)

I'm a troll, and he's modded +5 insightful? Must be a lot of non-English speakers here.

Re:Not without heavy *use* of other resources (0, Insightful)

Anonymous Coward | more than 6 years ago | (#22360158)

The difference is that I wasn't nasty about it, I explained a problem and gave him a link to an essay about it. You, on the other hand, called Orwell and me names, attacked a straw-man, and said something incorrect about the words that is trivially debunked by glancing at a dictionary.

Re:Not without heavy *use* of other resources (1)

Gewalt (1200451) | more than 6 years ago | (#22360138)

Parent is not troll!
GP is 100% troll.. not insightful.

Grog likes it simple (2, Insightful)

upside (574799) | more than 6 years ago | (#22360006)

Great, let's all dumb down to the lowest common denominator. English is a rich language and all the better for it. If you're too lazy to learn it, your choice. I'm a non-native speaker but prefer a vibrant, expressive language to some "for-dummies" international pidgin.

Re:Not without heavy *use* of other resources (5, Funny)

fretlessjazz (975926) | more than 6 years ago | (#22360018)

Well, you sound like a troll. I seriously doubt anybody misunderstood what he meant because he used the word "utilization". Or, should I say he utilized it? UTILIZE UTILIZE UTILIZE UTILIZE UTILIZE UTILIZE UTILIZE UTILIZE Does it hurt yet?

Re:Not without heavy *use* of other resources (5, Funny)

Dogtanian (588974) | more than 6 years ago | (#22360030)

Read George Orwell's essay on this topic.
Going by his dislike of overused, clichéd phrases expressed in that essay, today's "business-speak" (mindless repetition of words and phrases that have long since been driven into the ground by thoughtless, banal, stupid repetition) would have him spinning in his grave so much that we could use him as a form of renewable energy.

The solution is obvious. We need to think outside the box and raise the bar when it comes to language... someone needs to step up to the plate and bring something new to the table. I'm thinking of someone I have synergy with, not just the type that goes for the low-hanging fruit.

Ooh.... he's spinning nicely. Another couple of Orwells and we'll have enough electricity to power the world :)

Re:Not without heavy *use* of other resources (0)

Anonymous Coward | more than 6 years ago | (#22360052)

Well in this case, "utilization"/"utilisation" does mean use. Utilization is actually clearer for non-English-speakers, IMO, because it is always a noun, whereas "use" can also be a verb (perhaps also an adjective). By the way, what is a PHB?

Re:Not without heavy *use* of other resources (0, Offtopic)

OS_Neutral (1236238) | more than 6 years ago | (#22360236)

I find it hypocritical and mildly ironic that you use the hyphenated "non-Anglophone" while criticizing someone else for using unnecessarily complex speech that may not be easily understood by non-native speakers. And, for the record, the performance monitors on my Windows systems tell me the "percentage utilization" of a given resource, not the "percentage use".

Re:Not without heavy *use* of other resources (1)

DerekLyons (302214) | more than 6 years ago | (#22360252)

"Utilization" is a perfectly good word, and perfectly clear in usage and meaning to any educated person. I can't believe that on Slashdot a comment complaining a word was 'too big' would get modded up.

Do you really have control of the boxes? (5, Insightful)

Marc Rochkind (775756) | more than 6 years ago | (#22359702)

If they're in a computer room, then such a scheme might work. But if they're on users' desks, you don't really have control. They're subject to filling up, being shut off, being knocked about, crashing, etc. I don't think in this case you would really get the reliability that the diversity and independence would suggest.

--Marc

Re:Do you really have control of the boxes? (1)

McGiraf (196030) | more than 6 years ago | (#22359712)

You just use some kind of distributed RAID. I'm sure software for this already exists.

Re:Do you really have control of the boxes? (1)

teslar (706653) | more than 6 years ago | (#22359802)

They're subject to filling up, being shut off, being knocked about, crashing, etc
Well, filling up is kinda the point of the entire exercise, but you're right: being shut off, crashing, or being otherwise disconnected is enough of a problem to make this a non-starter. We're basically talking about a distributed filesystem in which subparts may fail without notice. I'm sure there are ways to minimise the problems this will create. You can, for instance, make sure that any one file is always completely located on one physical hard disk, so if that one goes down you're at least not left with half a file which may still be open in an editor somewhere. I guess you can also be clever with redundancy, so that, say, half the hard disks in your network can go down but you're still left with a working system (provided the right ones go down). But because you cannot guarantee which hard disks will be up at any given time, you also cannot guarantee that your system won't break in horrible ways. Hence it's not practical unless you don't particularly care about which files are available at any given time, as long as some are there. So basically, that means it'll be alright for porn and mp3s, neither of which you'd particularly want lying around in a corporate environment, but I fail to come up with applications that might actually be useful.

Besides, with hard disks these days being as cheap as they are, why not just buy another one if you do need more space? Do you even need more space? Or is this just trying to salvage something you can't really use in order to create a solution for a problem that doesn't really exist?
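
The whole-file placement rule above is at least easy to express. A minimal Python sketch, assuming you just want each file complete on k independent machines (the node names are invented):

    # Minimal sketch of whole-file replication: each file is stored complete
    # on k distinct nodes, so losing any one node never leaves half a file.
    import random

    def place_file(filename, nodes, k=3):
        if k > len(nodes):
            raise ValueError("not enough nodes for the requested redundancy")
        return random.sample(nodes, k)    # k distinct nodes, chosen at random

    nodes = ["pc%02d" % n for n in range(1, 11)]
    for f in ["report.doc", "budget.xls"]:
        print(f, "->", place_file(f, nodes))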

Re:Do you really have control of the boxes? (2, Insightful)

cbart387 (1192883) | more than 6 years ago | (#22360232)

Or is this just trying to salvage something you can't really use in order to create a solution for a problem that doesn't really exist?
Bingo Bango Bongo! If you read the submitter's question, it simplifies to:
    a) Is there something productive I can be doing?
    b) How do I do it?

Everything else is fluff that tends to lead Slashdot readers off on tangents and flamewars: Emacs vs. vi (Emacs), KDE vs. GNOME (GNOME)...

Re:Do you really have control of the boxes? (1)

arivanov (12034) | more than 6 years ago | (#22360008)

There are a number of clustered storage apps operating on a P2P basis with an N:M redundancy model. Just do an internet search and choose your poison. None of them offers amazing performance, but the actual availability often exceeds what you get from an average SMB Winhoze server.

Waste of electricity (0)

Anonymous Coward | more than 6 years ago | (#22359714)

You'll get more by selling everything you have and investing in a storage solution than you will pay for the electricity to run all those crap drives.

Re:Waste of electricity (1)

bconway (63464) | more than 6 years ago | (#22360010)

The engineers might have a tough time doing their work after you take away their computers. Methinks you didn't read the article.

Download and mirror the Internet... (5, Funny)

SiegeTank (582725) | more than 6 years ago | (#22359716)

...just in case your connection fails.

not enough info (2, Interesting)

YrWrstNtmr (564987) | more than 6 years ago | (#22359724)

Is this a company, a college, or just a random collection of boxes in your mom's basement? What does your organization want to do that it can't for lack of a few terabytes? What does the actual owner of these boxes have to say about your little enterprise?

Re:not enough info (4, Funny)

BeanThere (28381) | more than 6 years ago | (#22359772)

100 computers in his mom's basement? That's a big basement.

Re:not enough info (1, Funny)

Anonymous Coward | more than 6 years ago | (#22360020)

and a big Momma

Re:not enough info (0)

Anonymous Coward | more than 6 years ago | (#22360072)

and warm... very, very warm...

Re:not enough info (0)

Anonymous Coward | more than 6 years ago | (#22360110)

My mom has an enormous basement, you insensitive clod. (oh wait...

Re:not enough info (1)

g33k and destroy (1236240) | more than 6 years ago | (#22360200)

Only 100? Not big enough

Re:not enough info (-1, Troll)

Anonymous Coward | more than 6 years ago | (#22359848)

Exactly. We are just talking about some random IT grunt here who feels important because he's the "administrator" of this little network. In reality, of course, no one knows about this "projekt", and it would never be green-lighted by any decent CEO.

Re:not enough info (1)

click2005 (921437) | more than 6 years ago | (#22360120)

It's a botnet. Serving trojans, viruses & spam doesn't make full use of the hard drives.

Maybe move with the times? (2, Insightful)

line-bundle (235965) | more than 6 years ago | (#22359728)

You could try something like "Localhost Azureus" for distributed data storage. The only problem is that it will cost you in processor time and network hogging.

Is it cost-effective to reclaim that (small) space? Probably not. My suggestion is to realize that no one tries to save clock cycles any more; disk storage is probably heading the same way.

Space is not that important any longer (4, Insightful)

eebra82 (907996) | more than 6 years ago | (#22359730)

It's a very interesting question, but from my point of view, hard drive space is so ridiculously cheap nowadays that it is utterly pointless to look for a useful application that will fill it up.

Let's assume that the average computer has 80 GB of storage. Multiply that by 100 and you get 8 TB of space. That's what you can fit into one or two computers nowadays without shelling out too much cash.

What's more interesting is how much processing power you have as well as how fast the internet connection is.

Re:Space is not that important any longer (5, Insightful)

jaxom (90814) | more than 6 years ago | (#22359888)

I disagree with this and face this question all the time at work. Disks are cheap, storage systems aren't. If this is for a business that requires reasonable uptime, then the only solution would be to implement a SAN using Fibre Channel or iSCSI and then take out the drives. With the right array, all of a sudden those drives become superfluous (you decide if boot-from-SAN is right for you), management is easier, and you'll be able to get a lot of reuse out of the drives.

Now a lot of people will question the cost of doing all of this, and it isn't cheap, but you have to analyze the numbers correctly. We migrated 200 servers from DAS to a SAN and had our money back within 12 months. Add on top of that the implementation of VMs, and all of a sudden those 200 servers went down to 20. That's a big difference in cost of ownership.

Re:Space is not that important any longer (1)

Mondo1287 (622491) | more than 6 years ago | (#22359940)

Exactly. Storage is relatively cheap these days, and doing something like this doesn't make sense. While you can easily spend a couple million dollars on a large SAN, there would be a massive hit in reliability, redundancy, and performance with the approach you have described. I know there is a commercially available product that does just this, but I can't remember the vendor.

Let's say you have 1000 machines with 80GB drives, and the average machine has 50GB of free space. That gives you 50TB. On such a system I'd want the data to be redundant across no fewer than 5 machines, cutting you down to 10TB of usable space. Now imagine the crippling effect this would have even on a network with gigabit to the desktop and 10G switches at the core: every change has to be replicated across the network to 5 machines. Not to mention the processing overhead each machine will experience, and then there is the problem of securing the data.

I can see something like this working on a small network, but it still doesn't make sense when a nice server with a couple TB of space or a NAS device would do the job better.

Dumbest question yet... (1, Insightful)

Aaron32 (891463) | more than 6 years ago | (#22359734)

This is the dumbest /. question I've seen. Decentralized network storage pooled together with no means of practical management? Sign me up! Oh yeah, let's rely on the ditzy end users to help make sure it doesn't crash. I'm sure everyone will leave their computers on 100% of the time so you can make use of it. Don't tell anyone at work of your idea, they might not ever stop laughing.

Re:Dumbest question yet... (2, Insightful)

ZeroPly (881915) | more than 6 years ago | (#22360142)

You haven't put any thought into this - it takes about 20 seconds to answer your concerns given an introductory class in OS design.

Obviously computers will crash or be turned off. We have this wonderful concept in architecture design called "redundancy" which we can use to address problems like that:

Assume the probability of any given computer being offline is d(c_n). For some computers d(c) will be very low, because the user leaves the machine on all the time or it has background processing to do; for others it will be quite high, such as when the user is often out of town.

Computing and updating d() is fairly easy with any modern management tool. Then create clusters of computers with a required availability, striping data across the component computers while taking the d() of each into account. The availability of the cluster would be a function of your modified striping algorithm. When you save data, you just choose what availability you would settle for, and the right cluster is chosen.

Let me answer your next question in advance: if this is so obvious why is no one producing a product that's cheap and easy to implement? Because you'd have about 25 patent trolls lined up at the courthouse - too many teeth, not enough ass.
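
To make that concrete: a minimal Python sketch, assuming failures are independent and a block stays available as long as any one replica is online (node names and probabilities are invented):

    # Sketch of the idea above: replicate a block on enough nodes that the
    # chance of ALL replicas being offline at once drops below the
    # availability you are willing to settle for.
    def pick_replicas(offline_prob, target=0.999):
        # offline_prob maps node -> d(c); greedily add the most-available
        # nodes until P(all replicas offline) <= 1 - target.
        chosen, p_all_down = [], 1.0
        for node, d in sorted(offline_prob.items(), key=lambda kv: kv[1]):
            chosen.append(node)
            p_all_down *= d
            if p_all_down <= 1.0 - target:
                return chosen, 1.0 - p_all_down
        raise RuntimeError("these nodes cannot meet the requested availability")

    d = {"pc01": 0.30, "pc02": 0.05, "pc03": 0.50, "pc04": 0.10}
    print(pick_replicas(d, target=0.999))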

GlusterFS (3, Informative)

Anonymous Coward | more than 6 years ago | (#22359740)

Check out GlusterFS. (http://www.gluster.org)

You definitely can't run Windows to use this, but it should take minimal effort to set up a quick netboot lab to test it with.

Cheers.

Re:GlusterFS (1)

Insightfill (554828) | more than 6 years ago | (#22359968)

Check out GlusterFS. (http://www.gluster.org)

You definitely can't run Windows to use this, but it should take minimal effort to set up a quick netboot lab to test it with.

One could envision setting up small VMware Player instances that launch under a different account via "Scheduled Tasks" (set to run at startup). Or run VMware Player as a service. A little beefier would be VMware Server (free), but it's a bit more of a hassle (you also need to install IIS on each XP machine). The advantage of either setup is that the VM instance runs without a window but is visible as a running task in the Task Manager. The Scheduled Task approach would also let you tinker with scheduling, such as a VM that powers up at 6pm and powers down at 6am.

Install Debian or your distro of choice in the VMware image, giving it a massive virtual drive in a user account directory. Keeping it in a specific user account directory will hide it from non-admin eyes. I mention Debian because it's the one I have the most experience with, and it has good flexibility in image size.

Admittedly, it wouldn't be the fastest array in the world, but it should work. The bonus is that the Windows machines would continue running as usual, with only a slight drop in memory and disk performance. That hit would be scattered among the machines at random times based on usage of this virtual array.

If it works well with one machine, you could duplicate the whole VM, just give it a new machine name on the network, and move on.

Short answer... (0)

Anonymous Coward | more than 6 years ago | (#22359746)

Short answer: No.

Long answer: Nope.

Send them to our troops in Iraq (3, Funny)

kipin (981566) | more than 6 years ago | (#22359750)

I had a drive fail on me last year and I wanted to take my frustration out on it, so naturally I did what any good American would do: I shot the shit out of it. Surprisingly, it seemed to make for a pretty good piece of bulletproof armor. It stopped multiple full-metal-jacket 9mm rounds and managed to get a couple lodged inside the casing. (None appeared to penetrate fully.)

Re:Send them to our troops in Iraq (1)

boombasticman (1232962) | more than 6 years ago | (#22359928)

Usually this means that you need a stronger weapon. Next time, don't use your wife's gun for such scientific demolition tests.

Re:Send them to our troops in Iraq (1)

R2.0 (532027) | more than 6 years ago | (#22359988)

You should have hooked up a garden tractor battery and had it spinning when you shot it - 7200 rpm of centrifugal goodness.

Re:Send them to our troops in Iraq (5, Informative)

eagl (86459) | more than 6 years ago | (#22360124)

The drive survived because the 9mm is weak. Get a better gun using a better round, like .40 cal or even a good old .45.

I've had a chance to read after-action reports from Iraq and Afghanistan, and the 9mm is pretty much a joke. Most of the forces that really rely on handgun stopping power have obtained emergency authorization to bypass normal procurement processes in order to get better handguns using better ammunition. To my knowledge, a modern .45 is considered one of the best alternatives.

Sanmelody (4, Informative)

theoverlay (1208084) | more than 6 years ago | (#22359758)

Datacore offers software called SANmelody to turn servers into a cheap storage network, and there are other vendor solutions as well. http://infiniteadmin.com/ [infiniteadmin.com]

AFS (5, Informative)

arabagast (462679) | more than 6 years ago | (#22359780)

OpenAFS [openafs.org] is a distributed file system. It seems to fit the bill. No personal experience, so I don't know how well it actually works.

Re:AFS (1)

xoundmind (932373) | more than 6 years ago | (#22360144)

My first exposure to Unix (1986) was on the AFS network at CMU. I don't know about using it on Windows, but our disk access was never an issue.
Thanks for the reference; that really takes me back...

Re:AFS (1)

Monx (742514) | more than 6 years ago | (#22360202)

IBM uses AFS internally. It works. Use it.

Help TPB! (0)

Anonymous Coward | more than 6 years ago | (#22359792)

They need secret servers in unknown locations, you know...

Solution for Linux (2, Informative)

Anonymous Coward | more than 6 years ago | (#22359794)

There's a project dedicated to this on Linux: http://nbd.sourceforge.net/ [sourceforge.net].

If there's nothing similar for windows, you might be able to run it through cygwin.

Actually, this claims to run on Windows: http://www.vanheusden.com/Loose/nbdsrvr/ [vanheusden.com]

Re:Solution for Linux (1)

mjrauhal (144713) | more than 6 years ago | (#22360204)

nbd is nice for some stuff but lacks fault tolerance. Of course, you can run RAID, possibly at several levels (say, RAID-6 on top of RAID-1), on top of nbd devices to trade space for fault tolerance as much as you want, but you still lack flexibility. The advantage of RAID-over-nbd, on the other hand, is of course that you can do it right now if you want :] (And yes, the nbd server shouldn't be overly hard to run on Windows, one would think; it's rather simple...)

A better solution would work at a somewhat higher level, though. If a host goes down, it would be desirable to flexibly duplicate its data (from other mirrors and/or parity data) onto others. Possibly such a system could be created on top of nbd as well. Hell, maybe ZFS with an NBD pool could someday hack that, but it seems to me they'd need to work out at least bug 4852783 [opensolaris.org] first.
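
For the RAID-over-nbd route, a rough sketch of the aggregating server's side, in Python for legibility. It assumes every workstation already runs nbd-server on port 2000 (a made-up convention here) and the classic "nbd-client host port device" syntax; adjust for your nbd version, and run as root.

    # Rough sketch, not a recipe: attach each workstation's exported image as
    # a local NBD device, then build a RAID-6 array across them. Assumes
    # mdadm and the nbd client tools are installed; hostnames are invented.
    import subprocess

    HOSTS = ["pc%02d" % n for n in range(1, 11)]

    devices = []
    for i, host in enumerate(HOSTS):
        dev = "/dev/nbd%d" % i
        subprocess.check_call(["nbd-client", host, "2000", dev])
        devices.append(dev)

    # RAID-6 keeps the volume alive with any two workstations offline.
    subprocess.check_call(
        ["mdadm", "--create", "/dev/md0", "--level=6",
         "--raid-devices=%d" % len(devices)] + devices)
    subprocess.check_call(["mkfs.ext3", "/dev/md0"])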

You read my mind! (1)

Danathar (267989) | more than 6 years ago | (#22359798)

I've been thinking about the same thing lately. Our IT department uses a huge SAN that costs $$$. Why couldn't a distributed, fault-tolerant filesystem (something like striping with parity) be implemented across a LAN with 100Mb/GigE? The standard drive size shipping on new PCs is at minimum about 200GB. For business users that is WAY overkill.

Our whole organization is about 1000 Windoze desktops, but I'd like to try it in our local workgroup first (maybe 20 systems). I looked around but couldn't find anything that would pool unused desktop space.

Re:You read my mind! (1)

BlueF (550601) | more than 6 years ago | (#22359880)

I've always wanted something that would do this (for Windows)! With enough parity, and by polling systems for uptime and CPU/HDD/network utilization, both the client/network impact and the "network disk" performance could be managed quite well, I think.

It seems like only a matter of time before something comes along to provide this obvious function. Here's hoping it's well thought out and coded, preferably as a commercial app. I would have no problem justifying such an expense if the product existed in polished form.

Think of all the data archiving (daily/weekly/monthly backup sets, etc.) which could be done right in-house. Add a bit of encryption, and stripe the data in a manner such that reconstruction would be near impossible without all the parts...

Re:You read my mind! (1)

jeffmeden (135043) | more than 6 years ago | (#22360026)

reconstruction would be near impossible without all the parts

You had me up until there. Um, what happens when YOU lose one of the parts?

use it local (1, Interesting)

Anonymous Coward | more than 6 years ago | (#22359804)

You could use extensive versioning on each machine individually to get some benefit out of the unused disk space and computing power. Users who accidentally overwrite or delete files could get them back from their own disk space. Some kind of NFS would use a lot of network traffic, and bandwidth is often a limiting factor.

Storage (2, Informative)

Genocaust (1031046) | more than 6 years ago | (#22359806)

I tried to tout the merits something like this could have for non-critical regular-user backups, but as previous posters mention, it was shot down.

I was suggesting running DrFTPD [drftpd.org] as a backend with NetDrive [american.edu] as an access medium. It looks good on paper, but I've never had the chance to apply it on such a wide scale :)

With DrFTPD it's easy to set up whatever kind of redundancy you want, e.g. "at least 3 nodes will mirror all files in /doc". NetDrive (and I'm sure there are others) helps take away the learning curve and the hassle of "here, use this internal FTP for backups, not a network drive", as it maps the actual FTP to a network drive so it appears like a normal one.

Just my 2c.

the IT guy with time on his hands (1)

westlake (615356) | more than 6 years ago | (#22359812)

What would be a productive use for these terabytes of wasted space?

The first question to ask is whether what you want to do makes any sense for your employer. Who has to maintain this beast once you build it?

dCache (3, Interesting)

Rev Saxon (666154) | more than 6 years ago | (#22359830)

http://www.dcache.org/ [dcache.org] You will need a system to act as a master, but otherwise your normal nodes should work great.

Revstor (1)

theoverlay (1208084) | more than 6 years ago | (#22359836)

Try Revstor's Sanware, which allows you to designate nodes (servers) that will provide resources to create a storage area network. http://infiniteadmin.com/ [infiniteadmin.com]

Slashvertisement for wuala? (0, Offtopic)

jiadran (1198763) | more than 6 years ago | (#22359840)

This sounds like somebody is asking for wuala [wua.la] . Possible slashvertisement?

Re:Slashvertisement for wuala? (2, Funny)

imsabbel (611519) | more than 6 years ago | (#22360016)

Actually, this sounds nothing like the thing you link to.
More like your post is a slashvertisement.

I have a similar problem (1)

imsabbel (611519) | more than 6 years ago | (#22359846)

We have a few compute nodes around here. Each of them has an HD, and as those are so cheap we gave them 500GB ones.

They don't really need lots of space (maybe 30GB for the OS and temp files); OTOH, without redundancy the other 450GB are worthless.

As the task is embarrassingly parallel, network traffic wouldn't be a problem. If there were a solution to combine all this storage (it doesn't even have to be transparent) into a distributed, redundant storage network, I could surely make use of those TBs.

Re:I have a similar problem (1)

imsabbel (611519) | more than 6 years ago | (#22360032)

To add to this:

What I'm imagining doesn't need to be low-level.

Just a userland application with container files would be fine: the nodes listen to each other, and each file gets replicated on every node. If the disks start filling up, copies are purged down to a minimal redundancy level.

Even the factor-of-2 loss of non-parity redundancy would still be a lot better than not using the space at all.
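
A sketch of that purging logic in Python, assuming the nodes share a table of which node holds which container file (all names invented):

    # Userland sketch of the scheme above: files start replicated widely,
    # and a node that runs short on space drops its local copies, but
    # never below a minimum redundancy level.
    MIN_COPIES = 2

    def purge_candidates(replicas, full_node):
        # replicas maps filename -> set of nodes holding a copy. A file may
        # lose its copy on full_node only if enough other copies remain.
        return [name for name, nodes in replicas.items()
                if full_node in nodes and len(nodes) > MIN_COPIES]

    replicas = {"a.img": {"n1", "n2", "n3"}, "b.img": {"n1", "n2"}}
    for name in purge_candidates(replicas, "n1"):
        replicas[name].discard("n1")      # delete the local copy
    print(replicas)  # a.img drops its n1 copy; b.img stays at the minimum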

Give them to yahoo... (1)

mecenday (1080691) | more than 6 years ago | (#22359868)

... they'll need them.

Backup (2, Informative)

m0pher (1236210) | more than 6 years ago | (#22359874)

If you don't already have a backup mechanism for the data that may be on these systems, one way to use all the available storage is for backup. Vembu StoreGrid is a solution designed specifically for this problem. Get more info at http://www.vembu.com/ [vembu.com]

Looking at the problem another way... (4, Insightful)

pedantic bore (740196) | more than 6 years ago | (#22359882)

You might want to ask yourself why, after more than a decade of research and countless papers and prototypes that address this problem, your PCs' storage is still underutilized...

It's harder than it looks to get something reliable. Your PCs have extra capacity because it's cheap, but mining that capacity is not cheap. As other posters have pointed out, putting together (or just purchasing) a server with a few TB of storage is simpler and cheaper, less prone to getting wiped out by a virus, and easier to manage and back up.

I'm not sure that's a good idea... (2, Interesting)

ralph90009 (1088159) | more than 6 years ago | (#22359884)

While I was in college, I worked in the IT department. In my experience, your end-users will have a proverbial shit-fit if their computer's HD starts spinning up when they aren't doing anything. While it would be nice to use the spare space for data storage, I'm not sure it would be worth the headache. The volume of user complaints would skyrocket, you'd have to train users to leave the machines on all the time, and you'd have a distributed data pool to manage. Changing user behavior is like teaching a two-year-old to say "thank you" (possible, but not fun), and your electrical and manpower expenses would probably outstrip the savings.

iSCSI + ZFS (0)

Anonymous Coward | more than 6 years ago | (#22359892)

This is all hypothetical, but you could create disk images and export them from each client as iSCSI targets, mount all of them on a network server in your favorite RAID-Z configuration, and then reshare everything through Samba or even back out as pools of iSCSI volumes.

That actually sounds like a pretty cool project, and with enough redundancy, it could be fairly robust.

Storage at Desk (2, Informative)

phooji (1236218) | more than 6 years ago | (#22359904)

is a project at the University of Virginia that tries to do exactly what you describe: take unused storage on a bunch of machines and turn it into a file system. http://vcgr.cs.virginia.edu/storage_at_desk/index.html [virginia.edu]

P2P (0)

Anonymous Coward | more than 6 years ago | (#22359906)

I've often thought a Napster-like P2P network could be the basis for a fault-tolerant distributed storage system. By "Napster-like" I mean a P2P system with a central index. Add access control and versioning software that can push files from peer to peer. Once a document is on, say, 5 peers, there is no need to back it up.

Imagine a system like this:
1. A couple of redundant index servers
2. An integrated versioning system with push capability
3. A large chunk of desktop disk space hidden from the user
4. Appropriate access control at the index level
5. ???
6. Profit! (this is /. after all)

Unfortunately, some powerful corporations are so terrified of P2P that they're doing all they can to kill it in its infancy.
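
A toy version of such an index, in Python, just to show how little state it needs (all names invented):

    # Toy central index for the Napster-like scheme above: it tracks which
    # peers hold each document, knows when the 5-copy target is met, and
    # knows who needs a push after an edit.
    TARGET_COPIES = 5

    class Index:
        def __init__(self):
            self.holders = {}                        # doc_id -> set of peers

        def register(self, doc_id, peer):
            self.holders.setdefault(doc_id, set()).add(peer)

        def needs_push(self, doc_id):
            return len(self.holders.get(doc_id, ())) < TARGET_COPIES

        def push_targets(self, doc_id, editing_peer):
            # After an edit, every other holder needs the new version.
            return self.holders.get(doc_id, set()) - {editing_peer}

    idx = Index()
    for p in ["peer1", "peer2", "peer3"]:
        idx.register("doc42", p)
    print(idx.needs_push("doc42"))             # True: only 3 of 5 copies
    print(idx.push_targets("doc42", "peer1"))  # push new version to peer2/3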

Versioning Clarification (0)

Anonymous Coward | more than 6 years ago | (#22359972)

When I wrote "versioning system" I didn't mean CVS. I meant software with enough brains to know that a document was edited, so it can push the new version to all the peers storing that document.

So if an AC replies to his own post, is that an act of brazen cowardice?

Switched off? (1)

danhuby (759002) | more than 6 years ago | (#22359932)

What if the PCs are switched off?

Even using something akin to RAID, so you store the same data across several machines, you've still got the risk that switching off PCs will cause data to be temporarily unavailable.

Leaving a hundred PCs switched on just to get some extra disk space isn't going to be eco-friendly or cost effective. You can build a several terabyte file server very cheaply these days.

Dan

Grr! this is what I hate most about sysadmins (0)

Anonymous Coward | more than 6 years ago | (#22359980)

The user boxen are for the users, not for you.
The disk space/CPU cycles/whatever is not idle; it's being kept available for the users' needs.

Don't be such a prick. Pee in your own sandbox.

botnet (0)

Anonymous Coward | more than 6 years ago | (#22359998)

i'm sure some p2p botnet could use the space

My first (serious) thought... (1)

Zocalo (252965) | more than 6 years ago | (#22360002)

Was to use a software driver to export the spare part of the disk as an iSCSI (or iATA, if you prefer) target. For performance and integrity, you'd probably be better off with a dedicated partition the OS couldn't easily fiddle with, but it shouldn't be too hard to create an array of ~50GB iSCSI targets that you could then collate into larger volumes. Performance wouldn't be stellar unless you could use a dedicated NIC/VLAN on the hosts, but it should be reasonable enough for nearline storage of non-critical data that is already archived to tape. But so much for the pros; what about the cons?

The big problems with this idea are going to be MTBF, storage redundancy, and power consumption. You're going to be building your storage array out of desktop-rated HDDs, so let's say an MTBF of 50,000 hours, *but* you have about 100 of them, so you should anticipate a fairly frequent drive failure rate. That means both striping and repeatedly mirroring the data across workstations to ensure it's always available should a drive or two die, or just be powered off overnight, unless you want all your workstations powered up 24/7 ($$$). You'd also need to be able to dynamically rebuild the data set in the event of a drive failure; but how do you tell a drive failure from someone simply tripping over the power/network cable? That software's not looking so simple now, is it?

I think it's an interesting idea, but the overheads of maintaining enough copies of each element of data online to survive drives becoming unavailable, intelligently managing the replication of data when a drive is deemed to have failed and not just gone temporarily offline, plus network congestion issues make it non-viable. It'd almost certainly be cheaper and faster to write off the spare HDD capacity in your workstations and buy cheap 1U servers with a couple of GB NICs onboard and cram them full of high capacity SATA drives for storage.
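
The failure-rate point is easy to put numbers on. Assuming independent drives and a constant failure rate:

    # Back-of-envelope check on the failure-rate claim above.
    drives = 100
    mtbf_hours = 50000
    hours_between_failures = mtbf_hours / float(drives)
    print(hours_between_failures)        # 500 hours between drive deaths
    print(hours_between_failures / 24)   # i.e. a dead drive every ~3 weeks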

Microsoft DFS is an easy answer (1)

whackco (599646) | more than 6 years ago | (#22360004)

Microsoft makes an easy-to-use utility for this EXACT situation called DFS: the Distributed File System.

1) Simply make a share on all those machines and POOL them with a DFS server and you are good to go.

2) ????

3) PROFIT!!1!

A project for Google? (1)

eagl (86459) | more than 6 years ago | (#22360040)

Isn't this something Google either has already done, or *should* do? Google Distributed File System... GDFS. It has the added benefit of also being a curse if it goes wrong. Seriously, isn't this an ideal project for Google? And if they've already done it, is it available for implementation by everyone else?

I'd like to see some sort of distributed filesystem as a standard installation option in a Linux distribution... The question would be something to the effect of "would you like your computer to find unused disk storage space on your network, and use it for managed redundant storage available across your network?"

It likely wouldn't be very fast (imagine RAID 1 or 5 with each disk connected only by Ethernet, and the controller on yet another computer also connected only via Ethernet), but for a lot of people absolute speed isn't really required, and having all that free space managed in a usable form would make up for the lack of speed.

Re:A project for Google? - whoops here it is (1)

eagl (86459) | more than 6 years ago | (#22360060)

Whoops, should have "googled" this first. Here it is: the Google File System.

http://labs.google.com/papers/gfs.html [google.com]

The big questions, of course, are: is it usable by regular people, and is anyone actually working on implementing and including this in any of the major operating systems?

Re:A project for Google? - whoops here it is (1)

allenw (33234) | more than 6 years ago | (#22360244)

Google hasn't released anything other than papers on GFS and their implementation of MapReduce. At this point, though, I'm not sure it matters, since we have Hadoop [apache.org], which (being mainly Java, C, and a little bash) runs perfectly fine on all of the major operating systems, including Windows.

Circular Backups (1)

PeterJFraser (572070) | more than 6 years ago | (#22360070)

The trick is how. For my machines at home (all 3 of them), I have the first back up to the second, the second to the third, and the third to the first. I have thought for some time that there should be some method of automating that procedure, but keeping track of where things are and which machine has what space would not be easy.
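
The ring itself is trivial to automate; the bookkeeping is the hard part. A Python sketch with made-up hostnames:

    # Each machine backs up to its successor in the ring, so adding a
    # fourth machine is a one-line change to the list.
    machines = ["alpha", "beta", "gamma"]

    def backup_target(name):
        i = machines.index(name)
        return machines[(i + 1) % len(machines)]

    for m in machines:
        print("%s backs up to %s" % (m, backup_target(m)))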

Re:Circular Backups (1)

mrbcs (737902) | more than 6 years ago | (#22360220)

Cobian Backup. Free. Automatic. I have our personal machines doing an automatic backup to a server, set up for 2:00 am. Works perfectly, and it will do incremental backups.

Microsoft Farsite (and related topics) (1)

Jered (32096) | more than 6 years ago | (#22360084)

What you're talking about is not a new concept; it just turns out to be really hard to build in a useful way. The most comprehensive discussion of the problems involved can be found at the Microsoft Research project Farsite [microsoft.com].

The short version of the problem is that the level of service you can expect from each system is incredibly variable, so it's hard to offer a meaningful QoS for the system as a whole. It's not quite as bad as the distributed-hash-table problem (a.k.a. P2P file storage), but it's still bad. (Zooko once told me that MojoNation saw an average 50% turnover in nodes in a 24 hour period.) But it's also not as easy as having all your distributed nodes dedicated to just storage, and even that's a really hard problem to solve. (I should know; my company [permabit.com] is one of the few vendors doing it.)

Someone else suggested OpenAFS. OpenAFS is fantastic, but not for unreliable server environments. I really don't think there's a complete solution out there, but not for lack of asking.

Replace the drives? (1)

LihTox (754597) | more than 6 years ago | (#22360122)

I know little about hardware, so forgive a stupid question: would it make any sense to pull out these computers' drives, replace them with smaller ones, and either sell the lot or assemble them in one place (a RAID?) for easier maintenance? Having your storage spread out through a company becomes a problem if one computer goes down (or is turned off by its user).

I know the cheapness of drives may make this silly.

Birth of the Matrix? (5, Interesting)

TropicalCoder (898500) | more than 6 years ago | (#22360146)

What would be a productive use for these terabytes of wasted space?

Well, I had this idea when I read about some open source software that allowed distributed storage (sorry, I forget what it was, but by now I'm sure it has already been mentioned in this discussion). The idea was this: suppose we have such software for unlimited distributed storage, so that people can download it and volunteer some unused space on their HD for a storage pool. Then suppose we have some software for distributed computing like we have for the SETI program. Now we have ziggabytes of storage and googleplexflops of processing power; what can we do with that? How about, for one thing, storing the entire internet (using compression, of course) in that endless distributed storage, and then running a decentralized, independent internet via P2P software? The distributed database could be constantly updated from the original sources, and the distributed storage then becomes in effect a giant cache that contains the entire internet. Now we could employ the distributed computing software to datamine that cache, and we could have searching independent of Google or Yahoo or M$FT. Beyond that we could develop some AI that uses all that computing power and all that data to do... what? I'm not sure yet. Just thought I would throw this out there to perhaps get stepped on or, who knows, inspire further thought.

distributed file systems (1)

Fireshadow (632041) | more than 6 years ago | (#22360150)

I think the better question is to define your problem with some additional details. Do you want a separate drive letter to appear for the customers to keep their stuff on? Or do you want something that only you can get to, to store backups on? What kind of network is it? 100Mb/s? 1Gb/s?

You asked two questions. "What would be a productive use for these terabytes of wasted space?" I don't know if I'd ask the Slashdot crowd this.

"Does any software exist that would enable pooling this extra space into one or more large virtual networked drives?" A few. Localhost Azureus http://p2p.cs.mu.oz.au/software/Localhost/faq.html [mu.oz.au], but it hasn't been maintained since 2006. Lustre http://en.wikipedia.org/wiki/Lustre_(file_system)#Networking [wikipedia.org] is a neat read, but I don't think it's applicable in your situation. It'll give you an idea as to what's out there.

In theory, you could use MRTG to measure your file server's switch port to see how much traffic the desktops pull from the server. Divide it by the number of desktops and that tells you on average how much each requests. Now consider that this load would be distributed across the network, with each desktop seeing an increase. A Gb LAN may be able to take this with no sweat.

How much disk space you'll practically gain is up for debate. Let's say a 20GB quota from each drive. Doing the math, that's about 1.95TB. If you ever have to reload a number of those workstations, a good chunk of that will be unavailable. You may be better served with a NAS device.
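
For reference, the quota arithmetic, spelled out:

    # Where the ~1.95TB figure comes from: 100 boxes donating 20GB each,
    # converted from GB to binary terabytes.
    boxes = 100
    quota_gb = 20
    total_gb = boxes * quota_gb
    print(total_gb)           # 2000 GB pooled
    print(total_gb / 1024.0)  # ~1.95 TB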

Two products you should probably take a look at (1)

NSIM (953498) | more than 6 years ago | (#22360176)

There are two companies out there that may be able to do what you need:

http://www.seanodes.com/ [seanodes.com]

http://www.revstor.com/ [revstor.com]

Both claim to be able to pool unused storage on desktops and application servers and make it available to hosts on the network.

NBD + RAID + Truecrypt (0)

Anonymous Coward | more than 6 years ago | (#22360210)

What I would do is set up a large file on each machine and export it using nbd [sourceforge.net]; I think they do a Windows version.

Then gather all these NBDs together at the server, using RAID to add massive redundancy to cope with users switching off their machines, crashes, or whatever.

Finally, apply strong cryptography (e.g. Truecrypt or LUKS) to the RAID volume, so that all the data sent across the network and stored on the machines is unintelligible to anybody except you.

You could also help SETI (1)

BrendaEM (871664) | more than 6 years ago | (#22360240)

If you have a little extra processor time, you could help SETI; I believe they have more data than they can search through. The client that runs SETI can also run a number of other projects, such as protein folding. It can be throttled and set to run only while the machines are not being used, much as a screensaver would. http://setiathome.berkeley.edu/ [berkeley.edu] With the extra space, you could always use Clonezilla to back up one machine onto another.