Towards an Internet-Scale Operating System
gschoder writes: "Two Berkeley computer scientists (including David P. Anderson of SETI@home) envision an Internet-scale operating system to harness the processing power, networking efficiency, and storage capacity of everyone's computers. Scientific American has their proposal."
How about desk-sized? (Score:4, Insightful)
of computers on the same desk efficiently, why not start there?
Re:How about desk-sized? (Score:2, Informative)
I don't think that was what the poster had in mind. Your KVM switch doesn't provide any value other than saving desk space. The article talked about the benefits of redundancy, increased processing power and increased bandwidth.
I imagine what the poster was talking about was having one operating system that would use both computers if they were available but still be a complete working system if one was unavailable. So for instance you could power up the second computer with an additional 56K modem and get a dual PPP connection without any effort.
Who gets root access? (Score:2, Funny)
Seti At Home (Score:2, Insightful)
Scary... (Score:3, Insightful)
Nope. Cause some l33t h4x0r will have own3d her already.
This is scary as hell. I hope it doesn't get implemented. This is far different from Seti...
Worse yet (Score:2)
Error correcting codes (Score:2, Interesting)
Should I guess the missing 40% from the available 60%?
Yes! Error-correcting codes will make it possible to guess the whole file from fragments that add up to 50%. Mojo Nation [mojonation.net] already does this.
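To make the "rebuild the file from fragments" idea concrete, here is a minimal sketch using the simplest erasure code there is: k data fragments plus one XOR parity fragment, which can repair any single lost fragment. This is only an illustration, not a description of what Mojo Nation does; real systems use stronger codes that survive many losses. The names `encode` and `recover` are made up for this example.

```python
from functools import reduce

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def encode(data: bytes, k: int) -> list:
    """Split data into k equal fragments (zero-padded) and append one XOR parity fragment."""
    size = -(-len(data) // k)  # ceiling division
    padded = data.ljust(size * k, b"\0")
    fragments = [padded[i * size:(i + 1) * size] for i in range(k)]
    parity = reduce(xor_bytes, fragments)
    return fragments + [parity]

def recover(fragments: list, length: int) -> bytes:
    """Rebuild the original data when at most one fragment is missing (None)."""
    missing = [i for i, f in enumerate(fragments) if f is None]
    if len(missing) > 1:
        raise ValueError("a single parity fragment can only repair one loss")
    fragments = list(fragments)
    if missing:
        present = [f for f in fragments if f is not None]
        # XOR of all surviving fragments equals the lost one.
        fragments[missing[0]] = reduce(xor_bytes, present)
    return b"".join(fragments[:-1])[:length]
```

With k = 4 this stores 25% extra data and tolerates one lost host; tolerating the loss of half the fragments, as described above, needs a code with far more redundancy than one parity block.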
Re:Scary... (Score:4, Interesting)
It can be a lot more scary than you think.
Re:Scary... (Score:3, Informative)
Micro$oft Press Release #10520
We are happy to announce the immediate availability of our new distributed computing service! For a low fee, you can harness the power of EVERY computer installed with Windoze XP in the world! Yes, that's right, all their base are belong to us, and you can buy CPU time on 'em!
What's scary is that (except for renting out time) the above is TRUE. M$ does 0wn all Windoze XP systems. And people PAY them for it!!! Inconceivable!
Re:Scary... (Score:2)
Suppose M$ sells CPU time to companies (or uses it to run nuclear simulations to take over the world), legitimizing it by saying they're protecting the financial security of Bill Gates. Well, is that a security upgrade or what?
Re:Not too scary, but... (Score:2)
b) According to the article she would be making money off of it and she would benefit from using shared computing also.
Those costs can be significant (Score:2)
We are assuming she has unlimited, always on connection (like DSL or Cable)
So you're limiting this architecture to highly urban areas of highly developed countries.
And her machine doesn't cost her anything except electricity
This can be significant. Most modern PC operating system kernels' idle loops execute wait instructions that halt the CPU until an interrupt occurs. The cost of electricity to run any instruction other than wait and the cost of cooling the machine can pile up.
and the wear on her hardware (constant disk access, etc.).
This can be significant. I had a Macintosh Performa 6230CD computer's hard drive wear out on me in less than a year, and it wasn't even under heavy use.
Re:Those costs can be significant (Score:2)
While that may be the case today, there is nothing to say that this won't change in the next few years. I also don't believe that DSL is available only in highly urban areas. Take a look at DSLreports.com and you can see where DSL is available.
As for the electricity, I run two servers at home - one dual processor and both with multiple hard drives and fans. My electric bill increased approximately $20/month when I started running them full time.
I also have not had (knock on wood) a significant hardware failure on these machines in the several years I have been running them. The last time I remember a hard drive failure was back in my mac days on a 7600. I think most hard drives also have a 3 year, no questions asked warranty.
Re:Those costs can be significant (Score:2)
I am in the middle of a major urban metroplex (Cleveland-Akron-Canton, approx. 3 million people, the 15th largest market in the US; see cleveland.about.com [about.com]) and I can't get DSL. I'm 16,210 feet from a CO, 210 feet farther than the new length limits for the ILEC. There are load coils on the line and because I am farther than the tariff distance they are under no obligation to remove them, and they sure as hell won't without an act of god!
The first thing I will do.... (Score:3, Funny)
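#include <unistd.h>  /* fork() */

/* Fork bomb: every process keeps forking forever. Do not actually run this. */
int main(void) {
    while (1) fork();
    return 0;  /* never reached */
}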
Re:The first thing I will do.... (Score:2)
i don't know.. (Score:4, Insightful)
Re:i don't know.. (Score:3, Interesting)
What if the computer you bought for US$2000 was largely subsidized by the coalition of entities that wanted to use your CPU and mass storage when you weren't, so that it only cost you like US$1000 or even US$500? Would you participate then? Even if you wouldn't, could you see how someone else might?
Already happening. (Score:3, Informative)
After thinking about it, I decided against it. I had no idea who was paying for the other 4 work packets- big tobacco, Iraqi agents doing bio weapons research, Chinese nuclear weapons development. If they had said right out who it was for, I might have still signed up; I really didn't like the way I had to poke through the fine print to figure this out.
Re:i don't know.. (Score:2)
Or imagine how much quicker a certain very large Asian country could upgrade their nuclear weapons if they had access to a large array of machines to handle the nuclear explosion simulations.
Or imagine a group of l33t hackerz cracking the encryption for your banking transactions so they can say they ownz your account.
This thing can go both ways - personally, I don't like the idea of anyone, regardless of intent, having access to my computer unless it's explicitly granted and can be revoked at any time for any reason, without prior notification.
Save the starving children of the world on someone else's machine, thank you.
Why should I want my computer doing others' work? (Score:2, Interesting)
It's another thing when a person volunteers to participate (I run SETI@home) but this proposal sounds like a forced standard upon a consumer.
$$$ (Score:3)
These guys seem to envision this happening through some sort of micropayment system, though, which is still an overall iffy proposition considering the current cost of performing a transaction.
There are several other significant issues with using presumably anonymous internet connected machines, and their use of the term "microkernel" only clues you in that it's a NotSoBrandNew concept, but it's a fun read to get PHBs and Venture Capitalists interested.
More likely answer - you get charged less (Score:2)
well and good, but... (Score:4, Insightful)
Liquid cooling for PCs is still out of the reach of many, so noise is a factor. And I can only assume that this work will require your computer to be awake, so power management goes out the window.
Even if these were overcome, there's still the obstacle of just getting people to go along with this. It doesn't sound to me like these "pennies trickling into a virtual bank account" are going to pay for that broadband connection or the increased electricity bill.
Like most other things, it sounds great on paper...
Re:well and good, but... (Score:2, Insightful)
Re:well and good, but... (Score:2)
Those 10gb mp3 players are designed to be 'on' as little as possible. Their HDs only spin when accessing data, and they also spin slower than PC hard drives. The CPUs in those things are just powerful enough to decode the mp3 and process user input. On the average PC, it only takes 1-3% CPU time to decode an mp3 these days, and probably half of that is because the GUI is pretty. (Does anyone remember WinPlay, the first mp3 player for Windows? It could decode mp3s quickly even on 486es. It couldn't seek through an mp3 back in '96, though.)
High latency? (Score:4, Interesting)
The OS will never be fully "functional" as OSes are considered today, because people will lie and cheat and steal. IMO (read: opinion removed from ass) the only practical use of this would be the equivalent of making a kernel patch that could have a slice of disk, a slice of memory usage, and a slice of bandwidth, and then it would run SETI@home, or whatever code it was instructed to run from the "master".
If it was not run on public machines I could imagine something akin to Beowulf from the ground up. An OS designed for premeditated clustering. That's not Internet sized though...
Re:High latency? (Score:2)
Part of what makes this kind of research interesting is learning how to parallelize operations that we think of as serial with current technology.
The article mentions streaming a movie, which we typically think of as a server-to-client operation. However, companies like KonTiki [kontiki.com] are already using techniques (their buzzword is Adaptive Rate Multiserving, wah!) involving peer-to-peer parallel operation to solve these kinds of problems.
As far as people lying, cheating, and stealing, you may as well suggest that checks and credit cards will never be "functional."
P2P makes the inroad more acceptable (Score:3, Interesting)
Five years ago, I'd have said no way, this is unfeasible, people would not contribute their storage space and CPU cycles to someone else.
But now, with server-obfuscated peer to peer systems like AudioGalaxy, it could be possible. Imagine selling people on the idea of a 'universal public hard drive', where all you do is search for a file, then copy it over locally without actually knowing where/who it came from. I doubt there'd be any objections, given how convenient and 'anonymous' it would be. Sacrificing a share of your own hard drive space for caching files you might not be interested in would be a small price to pay for that. That's one resource down; do the same thing for CPU cycles (provided we have a killer app reason for people to need more cycles, given high speed processors of today) and other computing resources and the rest will fall in place.
I doubt it'll go as far as this proposal, at least not for a LONG time, but the unthinkable is already becoming the thinkable in some areas.
Re:P2P makes the inroad more acceptable (Score:2)
Freenet is a p2p system whereby you join the collective and as you use the network, download parts of files. As you request documents, your peers do searches for you and download the files for you as well. This way as more and more people request a file it travels closer to those people.
So if you put something into freenet, it will be there until everyone who has a copy dies.
the future is here... (Score:2)
FreeNet does everything you're talking about. It seems that the only thing that is keeping FreeNet from really being usable is a good key/searching mechanism. No way to really crawl the thing is there?
Google will spider Freenet (Score:2)
It seems that the only thing that is keeping FreeNet from really being usable is a good key/searching mechanism. No way to really crawl the thing is there?
If somebody develops a way to publish web pages within Freenet, using URLs that link to other Freenet pages, you'll eventually see Google spider Freenet.
Sounds like Freenet II (Score:5, Insightful)
Guess there is nothing new under the sun.
Re:Sounds like Freenet II (Score:2)
Nice to know that I have so much of your attention, but that wasn't me. I'm not afraid to sign my criticisms of Freenet. BTW, you never did get back to me regarding my Freenet FIQ [platypus.ro] like you said you would. Guess you got "too busy" eh?
Re:Sounds like Freenet II (Score:2)
BTW, I've been scarcely less critical of Gnutella than of Freenet in the past. Just yesterday, in fact, I posted a comment [slashdot.org] on this very site referring to Gnutella as an "unusually naive" protocol. If I were to propose alternatives to Freenet, you can bet I'd be pointing in a different direction than that.
Re:Sounds like Freenet II (Score:2)
Nope. Just pointing out (again) why only a fool would think that earlier "data loss" post was mine.
Re:Sounds like Freenet II (Score:2)
Sorry, but I don't see your logic.
Re:Sounds like Freenet II (Score:2)
You're the one who attributed to me a post holding up Gnutella as an example of how to build a scalable network, even though I've expressed opinions contrary to that view often and as recently as yesterday. How logical is that, Grasshopper? Someone in this conversation obviously flunked Logic 101, all right, but not me.
Re:Sounds like Freenet II (Score:2)
Re:Sounds like Freenet II (Score:2)
Re:Sounds like Freenet II (Score:2)
Randomly? No. Predictably? Preventably? No, and no. Is "random" vs. "unpredictable and unpreventable" a useful distinction in this context? For the hundredth time, no.
Stealing from the poor and giving to the rich (Score:5, Insightful)
However, the proposed ISOS is big, powerful, and likely to be sought after by the most powerful corporations and institutions on the planet. How much lobbying would a large drug company need to do to get more than its share of distributed processing power? How much money would the U.S. Government need to give to them to use the system for cracking "terrorist" messages from the "evil ones" like Kevin Mitnick and Bernie G? How much money would the Government need to give to them to use the system for spying on individual users? Remember, this is the same government who pays Hollywood to put anti-drug themes in their sit-coms, so what would they not be willing to try?
The end result of this, then, is that ordinary computer users will be forced to subsidize (through the use of CPU cycles, electricity, wear and tear on hardware, and memory use) the efforts of large companies and governments who are working against their best interests. So, tell me again... what would we gain from this?
Bill
Re:A question of trust (Score:4, Insightful)
How long before you have to provide the government with compute cycles, as a cyber-tax?
I like the idea, but consent must remain with the owner of each computer. Still, like attempts to force DRM-blessed operating systems upon us, I fear that the days of controlling one's own computer are numbered (and the masses are too ignorant to understand what's at stake).
Oh, FWIW, I'm starting to keep a slashdot journal [slashdot.org].
Re:A question of trust (Score:2)
Or gave you a stipend for computer upgrades every year?
Re:A question of trust (Score:2)
It won't "pay" for anything. Taxes, by their nature, are redistribution schemes not wealth-generation schemes.
Take $0.25 worth of seeds, some dirt and a few hours of your time, and you get tomatoes you can sell for $0.75/lb. You are generating wealth.
If you take $0.25 of every $1.00 and give it to somebody else (i.e. taxes), you haven't created wealth, but moved it from one place to another. Plus, the cost of moving that quarter (paying you) decrements the final payment by a couple of pennies.
The person who gets that quarter (minus a few cents) might be happy about it, but you haven't created wealth--certainly not enough wealth to pay the guy who got the quarter forcibly removed from his possessions enough to buy a stick of gum.
Re:A question of trust (Score:3, Interesting)
The purported purpose of many redistributive taxes is to either offer a "temporary" relief against hardship of some sort, or, more insidious, offer investment capital for some venture which is expected to generate wealth in the future.
Historically, private charity (when not the victim of dollars that go toward taxes instead of the charity) does a better job of taking care of the poor and destitute than does government.
As for "investment capital", if the venture were worthy of funding, private investors would do so, for a share of the expected gains.
Sometimes, of course, the government wins, or at least had a minuscule investment in something that wins big (think "Al Gore's" Internet). And I've seen many a slashdotter argue where government should "invest" -- NASA being a favorite "charity" (because they do cool stuff, I suppose). So, we slashdotters, as a group, are not immune to the lure of redistributed tax dollars. The big problem here is that no matter how small the "government's" (i.e. taxpayers') investment, they claim ownership, lock, stock, and barrel, citing that "it wouldn't be if not for Uncle Sam [substitute your government as appropriate]".
Perhaps not as soon, but worthwhile things do get tended to by the private sector "when the time is right" (yes, to expect to profit, of course). The private sector tends to be far more responsive as well, especially in innovative new technologies exploited by startups.
So, no, I am not any friend of government redistributive taxation, but I do think we should have strong counter arguments for all the "justifications" for it.
Pay for use goes both ways (Score:4, Insightful)
"As her PC works, pennies trickle into her virtual bank account."
However, it doesn't mention the other side, that as her files are backed up elsewhere, pennies trickle out. In addition, assuming an equal amount of "work", the outflow needs to be greater than the inflow. Take for example, the pay-per-view movie. It has a set cost to purchase. Everyone storing the movie gets a bite. But a single copy of it won't work - a single system off (or back under control of the user) means that part of the real-time delivery of the movie is delayed. So the movie has to be stored in such a way that dozens of systems can be inaccessible and yet still play in real time. As such, you need to have a large number of copies.
Now think about this for data backup. If Mary gets paid "X" to hold some data, she can't be the sole recipient of it. Say she's one of 3 people with a copy of it (a rather low number). So the total cost is 3X. Now, she's also having her own data backed up, which is the same size. She's paying out 3X to back up the same amount of storage she's only getting paid X to provide - it's much more economical to back it up herself, say a copy on her laptop and her home computer, or work and home so they never share geographical space.
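The 3X-versus-X argument can be written down as a few lines of arithmetic. This is only a sketch of the poster's model: one flat price per gigabyte for both hosting and being hosted, and a fixed number of replicas. The function name and the sample prices are invented for illustration.

```python
def net_storage_balance(price_per_gb: float, gb_hosted: float,
                        gb_backed_up: float, replicas: int) -> float:
    """Money earned for hosting other people's data minus money paid for replicated backup."""
    earned = price_per_gb * gb_hosted
    paid = price_per_gb * gb_backed_up * replicas
    return earned - paid

# Mary hosts 10 GB and backs up 10 GB of her own data on 3 remote hosts.
print(net_storage_balance(1.0, 10, 10, 3))  # -20.0
```

Under these assumptions she only breaks even if she hosts `replicas` times as much data as she backs up.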
Same goes for processing power - you can't assume that a unit will finish the task given it, so that you need to run it multiple times if it is time sensitive, leading to the same inflation on what you pay out over what you are paid for your unused resources.
=Blue(23)
Re:Pay for use goes both ways (Score:2)
And, as regards CPU time, it doesn't matter if it takes twice as much CPU time to get anything done, if you've made 50x as much CPU time available since all of your idle cycles become useful cycles.
This, or a system like this, could lead to you never having to buy disk space again. You just put files 'on your system', and periodically you may need to pay another $5 to your disk farm provider (probably part of your ISP) since you've gone over your previously allotted space. And you end up with backups & redundancy.
Assuming we can overcome some basic hurdles like overzealous copyright law, the lack of ubiquitous broadband, and the need for automatic encryption of your files, I don't see how disk space sharing can fail to become the direction for the future.
It's been done, and no one uses it (Score:4, Informative)
This is two days in a row now that Slashdot has posted articles on the great new idea of distributed operating systems that CS theorists solved and have largely ignored for the last ten years. Besides Amoeba, there was the Connection Machine, VMS clusters, and others.
The fact is, massive distribution is of VERY limited use, and doesn't require OS-level hooks - Napster and distributed.net are both prime examples of useful massive distribution without involving the OS at all.
Re:It's been done, and no one uses it (Score:4, Insightful)
...none of which were designed to tolerate the high latencies and frequent failures that a truly Internet-scale OS would face. Legion [virginia.edu] and similar projects are much nearer the mark, but this is still nowhere near being the sort of "solved problem" you claim it is.
Re:It's been done, and no one uses it (Score:3, Insightful)
Massive distribution should not and will not be done just because it's techno-cool... it has to produce real value. What sort of real value can it produce? That depends on what sort of problems it can solve.
First, let's look at constraints. The three obvious ones are CPU power, disk space, and network bandwidth. All three of these have been growing relatively in proportion to Moore's Law for the last couple of decades. Their relative proportions have not shifted much... the CPU is by far the fastest, followed by local disk, and then network bandwidth.
Now, let's look at the problems we want to solve. How about data storage ("Jane's computer has an encrypted fragment of someone else's movie")? Local disk space is far, far cheaper and more robust than network storage! Bandwidth is the most expensive part of the equation. I can buy another few dozen gig of disk space for $100. How long will it take to transmit a few dozen gig via DSL? Sure, network speed will scale up, but so will disk space. Unless something changes, the balance of the equation remains the same... local storage is cheaper than network, as well as more reliable.
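To put a rough number on "how long will it take", here is a back-of-the-envelope sketch. The 36 GB size and the line rates are assumptions picked for illustration (the post only says "a few dozen gig" and "DSL"), and protocol overhead is ignored.

```python
def transfer_hours(gigabytes: float, megabits_per_second: float) -> float:
    """Hours needed to move the given amount of data at a sustained line rate."""
    bits = gigabytes * 8e9  # decimal gigabytes -> bits
    seconds = bits / (megabits_per_second * 1e6)
    return seconds / 3600

print(round(transfer_hours(36, 1.5), 1))    # downstream at an assumed 1.5 Mbit/s
print(round(transfer_hours(36, 0.128), 1))  # upstream at an assumed 128 kbit/s
```

At those assumed rates the download takes a bit over two days and the upload about 26 days, which is the poster's point: the disk is cheap, moving the data is not.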
Of course, not all files you want will be on your computer, hence peer-to-peer file sharing, which is what Microsoft is trying to solve. But in this case, local disk storage is far slower than CPU, and far faster than network... in other words, there is no reason to not use a user-level process to manage the data exchange. No OS support is necessary beyond TCP/IP and disk I/O, right? This problem has already been solved in numerous real-world ways.
Now let's look at CPU-bound problems. There are computations we may want to make that can't be done in a fraction of a second locally. These are generally math problems, sometimes with large datasets. Some of these problems can be parallelized, and some cannot. Of those that can be parallelized, some have coarse granularity, and some have fine granularity. Coarse problems, like keyspace searches for brute-force encryption cracking or SETI pattern searches, don't need OS-level support - data is most efficiently shared at the process level, which is what distributed.net and SETI do already. Others optimize at finer granularity. In those cases, data sharing and communication requirements between threads are so intense that using a slow, unreliable network is impractical! That's what big parallel supercomputers are for. So there's no need for OS-level support for parallelized number crunching that is practical in the current CPU/bandwidth ratio.
So what problem are we trying to solve that is distributed (or distributable) efficiently across multiple computers, and requires OS-level support for optimum efficiency? I don't see it.
Now, I should revise my previous statement that no one uses OS-level distributed computing. Fault tolerant databases, clusters, and massively parallel supercomputers all use it - at the local level. And even those are butting up against the network bandwidth problem. If it can't be done with gigabit connections on the backplane, how will it be done over a modem?
Re:It's been done, and no one uses it (Score:3, Interesting)
Hm. So we have a set of "theoretical" problems, for which it's doubtful that solutions exist. Except that you say they've already been solved...and apparently they're not just theoretical either. Truly, you have a dizzying intellect.
Cheaper, yes. More robust? For what value of "robust"? Are we talking about data that only exists in one place, or in multiple places? Which one's more resistant to the type of failure that takes out a whole site? Please provide a definition by which something that exists only on your machine (whose mere existence is only known locally) is more robust than something that exists in multiple places.
Irrelevant. In any but the most stupidly designed distributed data stores, most data would be served out of a local cache under most conditions. In many, the next step would be to serve it out of another geographically-local machine over a fast LAN connection. Just because you personally can't think of a distributed-storage architecture any better than traversing the globe for every datum doesn't mean that better architectures don't exist.
Really? Ever try to do mmap-style I/O over Napster? How about plain old open/read/write over Gnutella? Byte-range locking within a Freenet file? Hmmm. If you want to talk about solved problems, how about ideas like VFS layers and network-protocol abstractions? To provide generalized, transparent access to data, on a par semantically with the sort of access that you get with a local filesystem, your "user-level process" isn't going to cut it. Not by a long shot. That's like going back to the days when every application needed its own library just to get keyboard input or draw stuff on the screen. This kind of thing belongs, at least partially, inside the operating system so that all applications can use all equivalent protocols without special linkage; see my file-sharing manifesto [platypus.ro] for a fuller explanation.
Re:It's been done, and no one uses it (Score:2, Informative)
i've also spent a truly inordinate amount of time thinking about installing amoeba, plan9 and others. the reason i haven't is that mosix does a lot of what i want in a cluster but i don't have to limit my set of apps to those that come with or i can manage to compile in one of those odd OSes.
But with the OSkit [sourceforge.net] and the growing prevalence of platform independent languages (java, python) i can see a time not too distant when the fireball amoeba distro [sourceforge.net] and the linux single system image [sourceforge.net] projects are competing for the average user.
Or maybe we'll get lucky and a project to put together the best features of plan9, qnx, eros [eros-os.org] and amoeba will take off with a leader like linus.
Re:It's been done, and no one uses it (Score:2)
Wherever VAXen are still in use? You make my case for me.
Re:It's been done, and no one uses it (Score:2)
Unfortunately, elegance often dies in the face of brute force, and Moore's Law, along with the lousy economics of the supercomputer industry, did them in. But before they died, they were working on running the Connection Machine model on a network of Sun workstations, rather than the custom dedicated hardware. Considering how easy it is to model a Connection Machine, and how easily the model scales, it made sense.
Same problem I've been trying to explain in the end... not useful for enough problems to be a general-purpose solution.
Data security? (Score:4, Insightful)
Hopefully OS developers are not that naive (Score:2)
Very Large Governments, of course, would probably have the power to successfully mine information, but even they would be given a good run for their money. And then again, Very Large Governments already have access to almost anything they care to want.
Key problem: no viable business model (Score:4, Insightful)
For commercial computing jobs, as a business with economic incentives for participation, a distributed operating system unfortunately makes little or no sense due to the types of applications that are currently server-limited.
Commercial computing jobs which need "big servers" are typically very database-dependent. You can't distribute the application very well unless you can distribute the database. (And hopefully you aren't crunching terabyte data warehouses, right? That takes a while to send down the pipes...) Besides the inherent difficulty of distributing your database across many nodes, you have the typical basket of problems the IOS must overcome with a very high degree of assurance: security of your highly-proprietary information, reliability, backup, etc.
Most of the P2P plays a year or two ago discovered this the hard way. The most promising sales approaches ended up being things like distributed caching for search engine companies, which is a niche, not a mainstream business.
--LP
Distributed Resource Aggregation (Score:2)
SETI@home works well because the problem-space can be split up and the amount of time it takes for a client to process it far exceeds the time it takes to transfer the data. There are also a good number of users out there who just like the idea of searching for ET.
Distributed.net works well for the same reasons as SETI@home, but instead of users wanting to look for ET... users adopted it originally for a chance at cash and later for the ego boost.
If you build a generalized infrastructure to handle arbitrary requests for resources, the end user loses touch with what they are working with, eliminating any type of ego boost. Plus, I can't imagine many people are going to want to donate their spare cycles to a pharmaceutical company who will then go and patent a drug developed from information you give them, sell it at highly inflated prices in the name of R&D costs while you get nothing in return except a higher power bill and constant noise coming from your computer.
That's not to say there aren't good causes that people would be willing to donate resources to still out there, but these causes are attractive because they give the users a direct connection to them.
Of course, that's just my opinion, I could be wrong.
Half-true assertion (Score:2)
hmmm (Score:4, Interesting)
"Consider Mary's movie, being uploaded in fragments from perhaps 200 hosts. Each host may be a PC connected to the Internet by an antiquated 56k modem--far too slow to show a high-quality video--but combined they could deliver 10 megabits a second, better than a cable modem."
Ok, that's nice, but how do they propose Mary receive 10Mbps? Get 12 DSL lines? What about the people on dial-up? While people gain access to the internet around the world, those of us with the uber-connections will just leech on them? Now, they talk about the "digital divide" but that is just plain vicious. I'd rather be stickin it to The Man than Uncle Sven in Stockholm. So then what, everyone gets a fast connection -> backbone upgrade -> ATT, MCI, Earthlink, Sprint, etc. spend the money that Amgen would save.
Also: How would individuals choose who can use their computer's resources given their ethical or moral convictions? While I would surely donate my CPU and disks to cancer research or finding larger prime numbers, I don't want the DoD using it to think up new ways to kill people.
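The bottleneck being complained about here is easy to sketch: what Mary actually receives is capped by her own downstream link, no matter how many hosts are sending. The function name and the 0.768 Mbit/s DSL figure below are assumptions for illustration; the 200 hosts and 56k modems come from the quoted article.

```python
def delivered_mbps(num_hosts: int, host_upload_kbps: float,
                   receiver_down_mbps: float) -> float:
    """Rate the receiver sees: the senders' combined rate, capped by her own downstream link."""
    aggregate_mbps = num_hosts * host_upload_kbps / 1000
    return min(aggregate_mbps, receiver_down_mbps)

# 200 senders at a nominal 56 kbit/s add up to 11.2 Mbit/s in aggregate...
print(delivered_mbps(200, 56, 100))    # 11.2 (receiver link is not the limit)
# ...but a receiver on a single assumed 0.768 Mbit/s DSL line still gets 0.768.
print(delivered_mbps(200, 56, 0.768))  # 0.768
```

The nominal 56 kbit/s is also generous, since dial-up modems upload more slowly than they download.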
Better pay... (Score:2)
...for my processor time. It's one thing to be able to do SETI@HOME. But if some biotech company wants some remote computer to use my PC for DNA analysis, it had better pay me well for my generosity.
Damn I'm antisocial.
nahtanoj
Acceptance (Score:2)
Needs to pay for the Juice/Power impact (Score:2)
Half a picture (Score:4, Informative)
As happens too often, this proposal concentrates entirely too much on distributed computation, and pretty much ignores the problem of distributed storage. They're quite different problems, each requiring its own solution, even though it's intuitively obvious that any true "Internet Scale Operating System" would have to deal with both.
If you're interested in this "other half of the problem" here are some links:
There are many more. The bibliographies for the above will mention many earlier systems, while a quick Google search for these project names will show more recent ones.
How does this benefit me? (Score:2)
How does it benefit me as a user, aside from #1 increasing my energy bill by encouraging me to leave my PC on, #2 increasing wear and tear on my PC as my hard drive is accessed repeatedly, and #3 increasing my vulnerability to hackers? Oh, and #4 - sucking up the bandwidth of my ISP because of all of these always-on computers, thus trashing any hope of decent pings for my first-person shooters.
Gee, where do I sign up?
I've seen this before in an Apple commercial... (Score:2, Funny)
.
Just wait.... (Score:3, Interesting)
ILOVEYOU (Score:3, Funny)
Communist, Schmommunist... (Score:2, Interesting)
Perhaps if you set up your computer service like a secret society this would work. Then you'd have to know all the users, and would be able to track everything. It would be like the Masons, only with computers.
distributed backup is the killer app (Score:4, Interesting)
Consider a distributed backup program which works roughly as follows.
This type of application would provide at least 3 important benefits for backup. First, it's relatively cheap. If you want to back up more data, just buy more local disk space and trade files with more computers. This seems much easier (at least for a home user) than setting up a tape backup system, making sure the tapes get replaced, making sure the tapes get put someplace safe, etc. Second, it's much safer than pretty much any backup system you could buy commercially today, since your data is literally spread all over the world. Finally, the backup system isn't controlled by any large corporation.
Obviously there are still some details left to be worked out, such as how to let computers that want to trade files find each other (both centralized and distributed options exist, analogous to napster and gnutella), how to prevent cheating (having your computer periodically ask its partners for hashes of the data they are backing up should work), and how to control redundancy most efficiently (error correcting codes like Reed-Solomon codes or Tornado codes would probably be smarter than just repeating data).
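For what it's worth, the "ask your partners for hashes" check could look roughly like the sketch below. This is only an illustration under some assumptions: the function names are made up, SHA-256 is an arbitrary choice, and the random nonce is an addition of this sketch (without it a cheater could compute the hash once, throw the data away, and keep replaying the old answer).

```python
import hashlib
import os


def make_challenge() -> bytes:
    """Owner picks a fresh random nonce so an old answer can't be replayed."""
    return os.urandom(16)


def prove_storage(stored: bytes, nonce: bytes) -> str:
    """Partner hashes nonce + data; it can only answer if it still holds the data."""
    return hashlib.sha256(nonce + stored).hexdigest()


def verify(original: bytes, nonce: bytes, answer: str) -> bool:
    """Owner recomputes the hash from its own copy and compares."""
    return hashlib.sha256(nonce + original).hexdigest() == answer


# Hypothetical exchange between an owner and one backup partner.
data = b"my important thesis draft" * 100
nonce = make_challenge()
print(verify(data, nonce, prove_storage(data, nonce)))
```

Note that this version needs the owner to still have the original on hand to check the answer, which is fine for routine audits but not after a disk crash; precomputing a handful of nonce/answer pairs before the crash would be one way around that.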
If you're looking for a great distributed open source project that will make the world a better place, I encourage you to develop prototypes for distributed backup. I plan to develop my own prototype one day, but currently I'm pretty busy with graduate school.
-Emin
Been there, done that (Score:3, Interesting)
While this is not directly mentioned by David Anderson in his article I know for a fact that this is something that United Devices is interested in because late last year Mojo Nation was in discussion with UD to provide just this sort of service to its users.
This sort of distributed backup is what the current private branch of the Mojo Nation codebase does, with a little taskbar app that sits in the background and distributes backed-up files to peers within the enterprise. One major benefit that your post missed is that the majority of the data stored on hard drives within an enterprise is redundant data (e.g. multiple copies of MS Word, etc.) and with a distributed backup system you only need to keep a few copies of such files around for restores. You can back up 99% of your data while only needing 10-15% of the available space on individual PCs.
In what is turning out to be one of life's interesting ironies, the company that was most interested in this UD/MojoNation pairing was Enron's bandwidth trading group (mostly for storing medical imaging data and distributed corporate backups.) When Skilling left Enron just before the whole accounting scandal started to blow up the Enron guys became "unavailable" so things never moved forward, but you can be certain that this sort of a distributed data storage and backup system will appear again.
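To make the redundancy point concrete, here is a toy sketch of storing identical files only once by keying them on a content hash. It is not the Mojo Nation code; the class and method names are invented for illustration, and a real system would spread the blobs across peers rather than keep them in one dictionary.

```python
import hashlib


class DedupStore:
    """Toy content-addressed backup store: identical files are kept once."""

    def __init__(self):
        self.blobs = {}    # content hash -> file bytes (one copy per distinct file)
        self.catalog = {}  # (machine, path) -> content hash

    def backup(self, machine: str, path: str, data: bytes) -> None:
        digest = hashlib.sha256(data).hexdigest()
        self.blobs.setdefault(digest, data)  # only stored the first time it is seen
        self.catalog[(machine, path)] = digest

    def restore(self, machine: str, path: str) -> bytes:
        return self.blobs[self.catalog[(machine, path)]]

    def stored_bytes(self) -> int:
        return sum(len(blob) for blob in self.blobs.values())


store = DedupStore()
word_exe = b"\x00" * 5000  # stand-in for the same application binary on every PC
for pc in ("pc1", "pc2", "pc3"):
    store.backup(pc, "C:/apps/word.exe", word_exe)
print(store.stored_bytes())  # one copy's worth, not three
```

Three machines back up the same file, but the store holds only one copy, which is the effect behind the "99% of your data in 10-15% of the space" claim.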
Jim
do i really want this? (Score:2)
Overlooking the obvious (Score:4, Insightful)
The utopian future that dreamers always look forward to will never happen. It hasn't happened before, it won't happen in the future. However, this type of computer for the desktop that shares its 'computing' power with the entire network makes LOTS of sense for businesses. I go to lunch, break, and then go home for the day. All the while, my computer could be donating its computing power to handling webserver requests, processing internal jobs for the mainframe, or even helping run massive load and regression tests on the system to anticipate 'kinks' in the armor of the system from a scalability standpoint.
Sure, it would just be "so neato!" if every computer could be kept cheap for the home user by everyone sharing files, processing power, even memory; but let's face it, communism didn't work because there wasn't enough incentive for the worker bees to strive for better. There's always a fine balance between greed and sharing. Giving such a 'distributed computer network sharing' system to businesses would be a great start, but don't expect a 'home user' acceptance of such a system anytime soon. I want my full computing power for my new computer game that I bought with my own money, and I'm sure many other users aren't willing to give up their hard-earned money for everyone else to piggyback off their 3l337 system anytime soon.
not going to happen (Score:2)
Has to be "opt-in" (Score:2)
Anonymous driver says, "I'll just leave the gas money in the ash tray." Why should I believe him?
Also, it is pretty easy to write
while( true )
{
}
What is to stop me from doing this on a thousand computers drawing from a false bank account (if I had the knowledge and were so inclined)?
Trusted data (Score:3, Interesting)
The more stock and importance you put in something, the more likely people will use it as a means of abuse. I can envision a world where people who are against a particular scientific task (for whatever reason, ethical, on principle, or whatever), use this Internet OS, and join particular distributed apps simply to throw noise into the upstream
hmmm... does that mean (Score:2)
Distributed data storage (Score:2)
A user could install a program which used the free space on all disks in the same manner as a "nice" process uses CPU; as soon as space is needed, some data is released, completely transparently. A company or organization could store data on the distributed network; they would keep a "master" copy of the data available, in case a particular fragment happened to be erased on all of the nodes, or nodes were unavailable.
The question I'm pondering is how to keep track of where data is stored, and route data from the nodes to the host where it would be read. In the article's example, the fragments of a movie are sent to a particular client. How do we efficiently request fragments, in the correct order, without either overusing bandwidth with duplicated data or dropping fragments?
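One naive way to approach the request side, purely as a sketch: assume the client already has a map of which nodes hold which fragment (how to build that map is the hard, unanswered part), ask for each fragment from exactly one node, and put the pieces back in index order on arrival. The function names and the "least-loaded node" rule are assumptions of this sketch, not anything from the article.

```python
def plan_requests(fragment_locations):
    """Pick one node per fragment so nothing is fetched twice.

    fragment_locations: dict of fragment index -> list of node names holding it.
    Returns a dict of node name -> list of fragment indices to ask that node for.
    """
    load = {}
    plan = {}
    for index in sorted(fragment_locations):
        nodes = fragment_locations[index]
        if not nodes:
            raise LookupError(f"fragment {index} is not available anywhere")
        # Spread the work: choose the holder with the fewest requests so far.
        node = min(nodes, key=lambda n: (load.get(n, 0), n))
        load[node] = load.get(node, 0) + 1
        plan.setdefault(node, []).append(index)
    return plan


def reassemble(received):
    """received: dict of fragment index -> bytes, in whatever order they arrived."""
    return b"".join(received[i] for i in sorted(received))


locations = {0: ["alice", "bob"], 1: ["alice"], 2: ["bob", "alice"]}
print(plan_requests(locations))
print(reassemble({2: b"!", 0: b"hello ", 1: b"world"}))
```

A fragment with no holders raises an error here, which is where the "master" copy mentioned above would have to step in.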
Learning from the past (Score:2)
Start a project like this (without the centralized servers) by looking at distributed networked file systems, like Coda and AFS, and see how much the server side can be distributed. The same goes for authentication systems, like Kerberos. Obviously the security would come from encryption and redundancy, but this is a very complicated scenario when the servers are distributed.
In fact, distributing even as much as has been outlined in the article onto the clients would be difficult, and would likely kill network throughput if not done very carefully. If distributed as suggested in the article, it would place a massive load on the internet, by making thousands of requests for bits and pieces of files where there should be one request.
However, with a centralized system, the problem is already solved, essentially. Any large-scale university (like MIT) has already developed the kinds of network file sharing and authentication technologies required herein. The distributed applications have already been written, and would merely contact these central servers for information instead of their own central servers. The economic framework is interesting, but already done, and the payment services exist as well.
Read the article in full before replying next time (Score:3)
That being said, "Sign me up!". The security, privacy, availability issues are going to be solved. As in the article, you get to determine when, how, etc your computer is used, and you get to set the price.
What this means in reality, though, is that there will be people who will set up farms of computers and underbid their processing power/storage space/bandwidth, and you will get very little, if any, money. Imagine a few cents a month, maybe.
This system would be of great use to big business (who will really make savings) but will have little effect on the consumer except, perhaps, faster access to products and services sold by big business.
The problem being that the only resource the average user may possibly use from such a system is backup. Your network connection isn't going to be fast enough to buy a cheap computer and buy processing power online for your game. MMORPGs, however, may take on a whole new meaning when they start being able to handle millions of simultaneously connected players, and a fully interactive virtual 3d world may come to fruition through such a distributed system.
So, as many research products go, this will enable businesses to lower their costs and compete more effectively with each other, which, surprise, surprise, will (eventually) mean a cost reduction for our services and products.
I'll start building my slow storage rack now. Shouldn't cost more than a few hundred for a terabyte of near-line and on-line data.
-Adam
No coordination required (Score:3, Insightful)
The article looks more like an excuse for implementing a micropayment system (Creates a direct connection between your wallet and our bank account!). Enthusiasm for micropayment systems seems to come from people who want to collect the payments, not from the people expected to pay them. It's very clear that what consumers want are flat-rate services; competitively, flat-rate wins over pay-per-use as soon as the prices get close.
If you want vast amounts of CPU time and are willing to pay, you'd probably be better off cutting a deal for off-peak time on hosting server farms. You get a uniform environment, good interconnect bandwidth, and a single organization to deal with.
Read: Pricing spare resources and options? (Score:4, Insightful)
From: Greg Broiles
Subject: Re: Pricing spare resources and options?
At 01:44 PM 11/18/2001 -0500, dmolnar wrote:
>The recent comments on Mojo Nation prompted me to look at their site
>again. I don't see much guidance on how to set prices for network
>services. There's a mention someplace that business customers will build
>pricing schemes on top of Mojo Nation, but not much indication of what
>these schemes might be.
>
>So what is the "right" way to price resources? (Preferably beyond the
>obvious "supply and demand.")
Unfortunately, one of the evolutionary steps in Mojo Nation's development has been their abandonment, for the most part, of user-visible and user-configurable economics; they deliberately made it difficult to see how many Mojo are held by the local broker, and relatively unlikely that a broker will be able to earn significant Mojo by careful pricing - recent clients are configured such that the economic brakes on resource usage are sharply curtailed or removed entirely.
It's my impression that, given the changes in the venture capital and software markets, they've refocused their efforts away from P2P filesharing and towards speedy realtime content delivery, whereby people with limited net connections can maximize their incoming bandwidth by pulling (or getting pushes) from multiple other parties simultaneously, somewhat similar to what Morpheus/Kazaa are doing, or what Bram Cohen (a Mojo Nation alumnus) is doing with BitTorrent.
The economics seemed to attract people who wanted to experiment with pricing, etc., but that wasn't necessarily a market or constituency which is interesting to investors or businesspeople.
>A related question - I ran into a friend of mine who had just finished an
>internship in options trading. He suggested it might be worth looking at
>options on spare disk space or other resources, as a means of figuring out
>how to make Mojo-type systems eventually profitable in the real world. Now
>I have a copy of Natenberg's _Option Volatility and Pricing_ to look at...
It seems like there ought to be an interesting market here, but I know and worked with several people (with good financial backgrounds) who flogged this for awhile and never got anywhere. I guess a big part of the problem is that there's such a big difference in the perceived value of a megabyte/month of online storage .. if you're on the provider side, you think that's pretty expensive, as you've got the investment & etc required in building a data center, providing bandwidth to reach customers, paying staff, etc - but if you're on the customer side, you look at an 80 Gb drive at Fry's in the Sunday newspaper for $160 and think about a $500 1.5mb/s frame relay connection, and wonder why the service guys want $3 per Mb/month ..
and then the Mojo guys come along and make it sound like the people with the cheap frame relay connections and commodity PC hardware ought to be able to set up data centers in their back bedrooms or on their old laptops, but so far all of the business models proposed involve paying those guys up front for an indefinite period of storage, so there's no strong incentive to actually store the data for long, especially not if you can resell that same disk space 3 or 4 or 50 times.
Seems like the guys who really have hard data about options for bandwidth and disk usage are the disaster recovery guys. And that market hasn't been so great lately either, Comdisco declared bankruptcy and their disaster recovery unit is getting swallowed up by Sungard, I think.
Anyway, yeah, the Enron guys thought there was something interesting to be done in bandwidth futures, too, but I don't know if they ever really got anything done before their demise beyond some demonstration projects.
--
Greg Broiles -- gbroiles@parrhesia.com -- PGP 0x26E4488c or 0x94245961
5000 dead in NYC? National tragedy.
1000 detained incommunicado without trial, expanded surveillance? National disgrace.
Re:Read: Pricing spare resources and options? (Score:2)
This is because strict pricing really does not work. I could point you to some good work by Andrew Odlyzko regarding incremental pricing for computational resources, but the best paper to find that outlines the hard part is "Price-War Dynamics in a Free-Market Economy of Software Agents" by Kephart et al. Computational resources are like electricity: they can't really be stored for future resale, so it is relatively easy for suppliers to play games with the market by withholding resources during periods of peak demand. The resources are very time-dependent and they are effectively a zero-cost good, so there is a race to the bottom in pricing. Additionally, these resources are difficult for users to price: users expect a constant price for resources contributed, and most users have both an inflated expectation of what their resources are worth and little understanding of things like options pricing (e.g. to them Black-Scholes is a vacation destination.)
For Mojo Nation we opted to move to a pricing model closer to Odlyzko's "Paris Metro Pricing" in which resources donated to the system were exchanged for a sort of network karma. If you donated resources during periods of peak demand you could redeem them for enhanced quality of service at a later point. Not as fancy as the "disk space for dollars" model that the cypherpunk dreamers seem to want but a scheme a little more grounded in reality.
Jim
can anyone say... (Score:4, Interesting)
How many people do you know that are too scared to purchase anything online because they're afraid that some crazy cracker will intercept vital financial information? I know quite a few. We have to keep in mind that a relatively small portion of the overall population will actually see the benefit of this technology; and even fewer will trust it.
Things that should be considered:
Storage (Score:3, Interesting)
Add to that the fact that when you start dealing with serious amounts of data (~1TB), making backups to tape or any other media starts to get really difficult. If the free disk space on people's computers (I've got around 30 or 40GB free on my home machines) could be put to use to store backups, I'm sure businesses would be willing to pay a significant amount of money for it.
-Esme
How does one control what one's PC is used for? (Score:3, Interesting)
Probably not.
I/O Bound (Score:3, Interesting)
Processors faster than 2GHz are dirt cheap today. High-bandwidth connections aren't cheap, and connections to home users are 3 orders of magnitude slower than an internal disk drive channel.
This kind of thing only seems to make sense for the most geek-oriented scientific types of calculations, and of those only the jobs that are trivially parallelized, like SETI. I don't see everyone changing their OS to support it.
Why does this...? (Score:2)
How long before it becomes self-aware, realizes humans are the single biggest threat to its continued existence, and begins scheming to eradicate us?
a couple of issues (Score:3, Interesting)
even if we have lots of unused processor time (which I'm sure we do), pumping the data into and out of a remote procedure call can consume a lot of bandwidth and result in a huge lag time. Many problems don't distribute well, even when you have relatively high bandwidth connections to send the data over (like multi-GB memory busses), so the problem only gets worse when you use a measly network pipe or modem line. (processor memory bus bandwidth tends to be in the 5-10 Giga-bit range, even the best home internet access is only 10-100 Mega-bits)
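A back-of-the-envelope sketch of that gap, for anyone who wants numbers. The 100 MB work unit is a made-up figure, and the calculation ignores latency and protocol overhead entirely, so real transfers would be slower still.

```python
def transfer_seconds(payload_bytes: int, link_bits_per_second: float) -> float:
    """Time to push a payload through a link, ignoring latency and overhead."""
    return payload_bytes * 8 / link_bits_per_second


# Hypothetical 100 MB work unit over three kinds of link.
WORK_UNIT = 100_000_000
LINKS = [
    ("56k modem", 56_000),
    ("10 Mbit home line", 10_000_000),
    ("5 Gbit memory bus", 5_000_000_000),
]
for name, bits_per_second in LINKS:
    print(f"{name}: {transfer_seconds(WORK_UNIT, bits_per_second):.2f} s")
```

That works out to roughly four hours on the modem, 80 seconds on the home line, and a fraction of a second across the memory bus, so a job only makes sense to farm out if the computation per byte shipped is enormous.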
the steady state of a hard drive is full. There just isn't going to be enough spare, on-line, storage space on folks' desktops to give any appreciable amount out to share. If you have to deal with the bloat of a self healing encoding, the problem only gets worse.
Consider the case of N users, each with one hard drive of size X. They share out half of their hard drive space, but a file takes three times as much space to store on the distributed system as it does purely locally (for the self-healing encoding). The total hard drive space available to the group is now N*X/2 + 1/3*N*X/2 = N*X*4/6, or two-thirds of the actual total space on the network. The average space available to any single user is the total available space on the network divided by the number of users, or two-thirds of the actual space on the individual user's local hard drive.
That doesn't sound like too good a deal to me. Admittedly, I will be getting some extra reliability, but given how many home users back up their data on a regular basis, I don't think reliability is worth much (at least to home users).
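The same arithmetic as a tiny sketch, in case anyone wants to try other sharing ratios or encoding overheads. The function name and parameters are just for illustration; N and X cancel out, so only the fraction matters.

```python
from fractions import Fraction


def usable_fraction(shared: Fraction, overhead: int) -> Fraction:
    """Fraction of raw disk space that ends up holding real data.

    shared   -- portion of each drive given to the network (1/2 above)
    overhead -- bytes stored on the network per byte of data (3 above)
    """
    local = 1 - shared                # space kept for purely local files
    distributed = shared / overhead   # real data that fits in the shared part
    return local + distributed


print(usable_fraction(Fraction(1, 2), 3))  # 2/3
```

With the numbers above (half the drive shared, 3x encoding overhead), each user effectively loses a third of their disk.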
At first blush, it sounds like a nice idea, but I don't think the economics are going to support it. It will always be easier and cheaper for the folk that actually need more storage or processing power to just go out and buy it, especially while Moore's law is in effect. For anyone else, it just doesn't matter.
Re:a couple of issues (Score:2)
Intended use... (Score:2, Interesting)
I'm not so worried about the technical side of things, but more along the lines of intended use...
Could someone queue a job to crack an encrypted password file, or a document stolen from the government? I imagine that with 150 million computers using their spare cycles, this job could be done with relative ease. This is definitely an issue that the authors have failed to address in their proposal.
The legal ramifications alone make this prohibitive. Is a person whose computer did 0.1% of an illegal activity just as liable as someone who did 10%, 25%, 50% or as liable as the person who submitted the job? Can you even fully control what kind of jobs your system is doing using this proposed infrastructure?
It may be a great idea for say X machines inside a large corporation, but there are already some alternatives to fill that need. I just don't see how they can work out the logistics of issues such as the one I present above, when they also have to worry about the technical and financial issues that such a system would bring with it.
That's right. (Score:2, Insightful)
Re:That's right. (Score:3, Insightful)
Neither does SETI@home, or any of the other distributed computing things going on.
Or to look at it another way, by giving your minuscule amount of bandwidth, CPU power, etc. to other people, you are receiving the COLLECTIVE bandwidth, CPU power, etc. in return.
The best analogy I can think of is the philosophy behind GNU software - All of the resources are yours for the taking AS LONG as you are willing to give your (comparatively tiny) resources back. Everyone wins, except the people who want to freeload and profit from their freeloading.
That's how I see it anyway.
Re:That's right. (Score:2)
That is the exact argument that was made against the Internet when it was first proposed, back in pre-ARPANET days. The only thing that made it happen was the mandate of the Pentagon.
It will take something similar for it to happen at this level as well.
Re:That's right. (Score:2)
That is the exact argument that was made against the Internet when it was first proposed, back in pre-ARPANET days. The only thing that made it happen was the mandate of the Pentagon.
I don't know about Pentagon mandates but I think there are serious problems with the proposal. I have heard similar proposals about every 6 months over the past 10 years.
The biggest problem is that accessing lots of random CPUs introduces a huge number of security problems:
One of the notable features of the many proposals is that they all get pretty excited about free markets, like Enron did. I tend to think that the market aspect is kinda the point of the scheme rather than a feature. The objective is really to build the Ayn Rand memorial Internet rather than solve real problems.
There have been a number of attempts to actually build systems of this type. Some of the Napster clones reserve the right to rip you off by downloading for-profit programs onto your machine; I have never seen one get too far.
As I see it the cost overhead of managing the scheme is simply too great for the benefit. There is no real shortage of high power computing capability for bona-fide researchers. SETI took the internet route for the sole reason that there was no other way they would get the CPU they wanted. I don't think the same constraint applies to biology or particle physics. I could always find someone with a high end machine or twenty to lend.
Setting up the control system for that type of scheme would cost millions. You have to write lots of software and once you introduce money and profit you have to deal with all sorts of scummy fraudulent types. In return you get access to perhaps a few tens of thousands of mid to low end PCs for half their time.
For the same money you could build a dedicated rack for approx $1-2K per processor. So a million dollars gets you 1000 processors full time, no compromises. You have full control over the hardware and software environment, no security hassles. I know which way I would go.
Re:Wow. Imagine a beowolf clus... (Score:2)
And ist Bjólfur, not Beowulf... *ACK*
Re:Whats in it for me? (Score:2, Interesting)
Sell computers at or just above cost to consumers in a package that provides all the necessary hardware / software. The end user will be forced to sign an agreement that provides them with the DSL / cable line at a reduced cost along with the computer. They must also agree (as stated within the terms of service) that their computer should always remain on (when reasonable) and, when not being used, is subject to being used by my company (we'll call it MyCo).
Now, to offset the costs of the reduced price of computers and the reduced cost of cable / dsl - MyCo then can sell a client to a larger corporation who is interested in large scale computing without having to purchase one. For those of you who are familiar with the supercomputer environment, it isn't uncommon to lease out cycles on a larger scale computer to other entities to help offset the cost of some of the larger super computers. By leasing out the number crunching abilities of the distributed network of computers, this would be able to cover the costs of selling consumer hardware / packages and would allow for large-ish companies to harness the power of a distributed number crunching system.
Like I said, this is all very preliminary and more of just a thought than anything, but I think that something like this might attract more than just the "geek novelty" users. It would allow consumers to benefit, and would allow other companies to piggy-back on the system without having to make the large investment into a "supercomputer."
Re:What Happens When A Computer Goes Down? (Score:2)
Re:What Happens When A Computer Goes Down? (Score:2)
If the data was that important, you should have made a backup. A system like this would store more copies of files that were frequently accessed. Your letters to Aunt Gracie, however valuable to you, aren't going to be seen as high priority.
The cost of storage keeps plummeting. As that happens, the cost of that bit of platter turf becomes less important than the cost of distributing it. Of course, this will be counteracted by the fact that people are saving bigger and bigger files.
But the cost of bandwidth is also going down (though your cable bill probably doesn't reflect the fact). Same with processor speed, bus speed, and every other metric which would bottleneck any potential distributed app.
I also think you're looking at it in a "glass is half empty" sort of way. Sure, every chunk of data may have to be replicated several times on the system as a whole. But Joe Raiddisk simply doesn't have the HDD capacity to store every bit of content he might be even remotely interested in viewing. With a well-tuned "IOS", you end up using *less* storage, because there aren't more copies than are necessary to serve the actual demand.
It would help if people stuck to posting things to the system that they knew would be of general (or at least niche) interest, rather than using it as their own personal exabyte backup tape.
There are some serious issues that need to be resolved before this thing becomes a reality. But the idea of tapping into the massive resources that millions of computers waste every day is too good to pass up.