Shirky On Umbrellas, Taxis And Distributed Systems
There's a good article from Clay Shirky talking about the similarities between umbrellas, taxis and distributed computing. And if you really want more P2P than you can shake a fork at, the folks at ORA have also released an excerpt from the upcoming Dornfest and Brickley book.
This is the (Score:1)
It's pretty simple actually (Score:1)
Re:Where are the applications? (Score:2)
--
Where are the applications? (Score:5)
Firstly, the algorithm must be parallelizable. This means that it should be possible to split an algorithm which normally takes N time across, say, P processors, and have it take less than N time, and ideally N/P time.
Secondly, the algorithm must have minimal communication requirements. Rendering, for example, is parallelizable; however, in most modern rendering applications each computer would need an entire description of the scene being rendered. This could be a huge amount of information, running into gigabytes, yet it would need to be distributed to every participant in the rendering process. Recall that in most distributed computation applications connectivity will be limited to a 56k modem which is only connected to the Internet intermittently. Even if you limit users to broadband, communication bandwidth is still a problem.
Thirdly, the algorithm must be robust: if someone decides to screw things up and hacks their client to send back malicious data (as happened with SETI@home), they must not be able to invalidate the work that everyone else has done. Ideally there would be an easy way to validate the work done by each client in the system.
Now, I am not saying that no applications conform to these criteria; cracking crypto algorithms and processing information from space telescopes in search of intelligent life, for example, clearly work quite well. However, neither of them can really be used to make vast amounts of money. The only other thing I can think of is genetic algorithms, but again, whether there is a revenue stream there is an important question.
Perhaps some of these distributed computation people have found a killer application for this technology, some of them certainly claim that they have, but I really wonder whether such applications will stand up to scrutiny on the grounds I outline above.
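The first two criteria can be put in numbers. Here is a toy model (my own illustration, with made-up figures, not from the article or the post above) of ideal N/P speedup versus the cost of shipping a gigabyte-scale scene over a 56k modem:

```python
# Toy model of the first two criteria: ideal wall-clock time is N/P,
# but a fixed per-node data transfer eats into it.
# All numbers are illustrative assumptions, not from the article.

def runtime(n_work_hours, p_nodes, comm_hours_per_node):
    """Wall-clock time: work split P ways, plus data distribution."""
    return n_work_hours / p_nodes + comm_hours_per_node

# Shipping a 2 GB scene over a 56k modem: bytes * 8 bits / 56000 bps
comm_hours = 2e9 * 8 / 56_000 / 3600   # roughly 79 hours per node

ideal = runtime(1000, 100, 0)          # 10 hours if communication were free
real = runtime(1000, 100, comm_hours)  # dominated by the modem transfer

print(f"ideal: {ideal:.1f} h, with 56k transfer: {real:.1f} h")
```

With these assumed numbers, the transfer alone dwarfs the computation, which is the post's point about rendering.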
--
Re:User created metadata considered harmful (Score:1)
I'm presenting a talk at the O'Reilly p2p conference [oreilly.com] entitled "Attack Resistant Sharing of Metadata".
It's based on an idea of Raph Levien's [advogato.org], somewhat similar to the Advogato trust metric [advogato.org]. Basically, you only trust meta-data from your friends, or from people whose meta-data has been good in the past, and then to a lesser extent you trust their friends, but you dynamically adapt if someone starts distributing bad meta-data. We can't really prove that it will work, but it has some promising characteristics.
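A toy sketch of the idea (my own drastic simplification, with invented names and a made-up decay constant; this is not the actual Advogato metric or the talk's algorithm): trust flows outward from you with a per-hop decay, and a peer caught distributing bad metadata gets their score cut.

```python
# Toy transitive-trust sketch: direct friends get full weight, friends-
# of-friends a decayed weight, and peers caught sending bad metadata
# are penalized. Illustrative only -- not the actual Advogato metric
# or the Mojo Nation implementation.

DECAY = 0.5  # trust lost per hop (assumed value)

def trust_scores(me, friends, penalties):
    """friends: adjacency dict; penalties: peer -> multiplier in [0, 1]."""
    scores = {}
    frontier = [(me, 1.0)]
    seen = {me}
    while frontier:
        next_frontier = []
        for node, weight in frontier:
            for peer in friends.get(node, []):
                if peer in seen:
                    continue
                seen.add(peer)
                score = weight * DECAY * penalties.get(peer, 1.0)
                scores[peer] = score
                next_frontier.append((peer, score))
        frontier = next_frontier
    return scores

graph = {"me": ["alice", "bob"], "alice": ["carol"]}
for peer, score in trust_scores("me", graph, {"bob": 0.1}).items():
    # bob's bad metadata cuts his score well below even carol's,
    # a friend-of-a-friend
    print(peer, round(score, 3))
```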
We are going to implement it on top of Mojo Nation [mojonation.net].
Regards,
Zooko
Back of the envelope math (Score:1)
For starters, the numbers were chosen to make the math come out nicely -- the average box, less monitor, is actually less than a grand, and the average machine is in service longer than 20,000 hours, which makes the nickel figure high.
Furthermore, if PopPower et al wanted to build a cycle farm, they'd use multi-CPU boxen, so the calculations get even more complicated. Finally, as we've seen in Cali, power requirements differ between consumer and business regimes.
So there are a lot of variables pulling the number this way and that, with an increasing degree of speculation. But since the real point I was trying to make -- that your use of your PC has variable value to you between the hours you are and aren't using it, and that if you're using it, no pro-rated fee will induce you to stop -- would have been accurate even if the nickel number were low by a factor of 5, I left the back-of-the-envelope calculation and went on.
-clay
Re:weird bit (Score:1)
I made a similar point in another thread [slashdot.org] on this topic.
-clay
Re:Where are the applications? (Score:3)
You forgot a fourth and more critical criterion, about which all the P2P companies keep saying "pay no attention to the man behind the curtain":
Fourth, the company must not care about the data, algorithms, and results becoming public immediately, available to any competitor or evil cracker who wants to mess with you.
Forget the other 3; you will have a nearly impossible time finding anyone willing (stupid enough) to give you money and live with #4.
Of course, some of us have known this for a very long time, commercial distributed computing was put to sleep in the 70's. But then, in the 70's VCs were smarter.
Napster (Score:2)
Re:Will companies really see so much profit? (Score:4)
I think that's precisely the problem. The things that we've found lend themselves well to distributed computing (SETI, cracking encryption) don't lend themselves as well to making money. What company wants to pay for either of the above two, let alone a lot of money?
That's not to say that P2P is already doomed though. I don't think that it's a technical problem at this point, I think it's a business problem. Someone has to figure out a problem that has two attributes: It must lend itself to being more quickly solved via distributed computing, and it must be something with such a high demand that someone is willing to pay big money.
It's very possible that P2P could take off...but I'm not holding my breath. Even if they solve the issue of "what problem is worth the money", there's still the problem of "who will let us use the cycles" and "how do we keep from getting cheated".
-Jer
What kinda math is that? (Score:4)
I don't understand how the author came to the nickel per hour number.
Sure, the cost of the machine boils down to (by his math) a nickel an hour, but that's not the full cost the company would have to take on.
A company would have to buy the system, hire the IT personnel, cover their benefits, store the machines, pay for the electricity, pay for the heating/cooling, pay for maintenance, parts if they break, warranties, etc. These (and more) are little things that a home user might not even consider when determining if it's "worth it", and they make the "break even" point much higher than a nickel per hour.
I'd like to see the same breakdown done with some more accurate math.
-Jer
Re:Where are the applications? (Score:2)
Not necessarily. Depending on the cost of cycles, it may be sufficient to use a less efficient approach that is not completely scalable.
Secondly, the algorithm must have minimal communication requirements. Rendering, for example, is parallelizable; however, in most modern rendering applications each computer would need an entire description of the scene being rendered. This could be a huge amount of information, running into gigabytes, yet it would need to be distributed to every participant in the rendering process.
I do actuarial projections for a life insurance company. I have a set of assets (investments with future cash flows to the company) and liabilities (insurance policies with future cash flows). The liability cash flows influence what funds are available for investing (or dis-investing). Industry regulations require that I investigate the adequacy of the type and amount of the company's assets under different interest rate environments. The regulators want to make sure that even if interest rates and/or equity values spike up or drop down dramatically, the company will not become insolvent. The tricky part is that the liability cash flows are often dependent on the interest income that the assets can generate, and the interest income that assets can generate is dependent on the interest rate environment when each of the cash flows occurs.
Because of the interrelatedness of the two portfolios, there are two ways I can go about dividing up this project. I can slice by time, calculating all of the cash flows that I need at a given time to determine whether there is cash to invest or assets to sell. This is the most efficient method, but it has high communications requirements.
Or, I can project all of the liabilities over future times and get a series of liability cash flows which then imply a series of asset portfolios and interest rates and then iterate back and forth between liabilities and assets until the answers converge. (Typically on the order of 10 or so iterations and not hundreds or thousands). This is less efficient, but has lower communications requirements. If cycles are sufficiently cheap, it may pay to use a less efficient algorithm.
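The shape of that second, low-communication approach can be sketched like this (the projection functions and all numbers are hypothetical stand-ins of mine; real actuarial models are far richer):

```python
# Sketch of the iterate-until-converged approach: project liabilities,
# imply an asset portfolio from them, re-project the liabilities, and
# repeat until the answers stop moving. The projection rules below are
# invented placeholders, not real actuarial formulas.

def project_assets(liability_flows, rate):
    # hypothetical rule: assets must cover liabilities plus interest
    return [cf * (1 + rate) for cf in liability_flows]

def project_liabilities(asset_flows, base_flows):
    # hypothetical rule: liabilities drift toward the income generated
    return [0.9 * b + 0.1 * a for b, a in zip(base_flows, asset_flows)]

base = [100.0, 110.0, 120.0]   # baseline liability cash flows
liabilities = base[:]
for i in range(100):
    assets = project_assets(liabilities, rate=0.05)
    new_liabilities = project_liabilities(assets, base)
    if max(abs(n - o) for n, o in zip(new_liabilities, liabilities)) < 1e-6:
        break                   # converged
    liabilities = new_liabilities

print(f"converged after {i + 1} iterations")
```

With these toy numbers the loop settles in well under the "order of 10 or so" iterations the post mentions; only the converged figures ever need to travel between machines.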
Thirdly, the algorithm must be robust, if someone decides to screw things up, and hack their client to send back malicious data (as happened with Seti@home) they must not be able to invalidate the work that everyone else has done.
That depends entirely on the incentives. SETI was vulnerable because there was a competition to rack up completed cells. If the incentives to participate are designed properly, there may be no incentive to hack the client.
Umbrella size (Score:1)
Re:Where are the applications? (Score:2)
The cost to buy and maintain the bandwidth needed to push the data out to distributed resources would be more than the cost of a mainframe.
Another analogy (Score:2)
There are a lot of unused cycles out there, but they are cheap and so finely dissolved that the extraction process isn't viable.
RDF ?= Robotech Defense Force (Score:2)
O you mean Resource Description Framework....
i always mix the two up..
nmarshall
The law is that which is boldly asserted and plausibly maintained.
Wrong authors (Score:1)
Will companies really see so much profit? (Score:5)
Similar to "Permutation City" (Score:1)
A good book (for other reasons as well). Unfortunately I managed to leave it in a Sydney hotel room.
Cheers,
SuperG
Umbrellas and Cabs (Score:1)
-Moondog
Re:analogy doesnt work (Score:1)
weird bit (Score:1)
MusicBrainz solves music metadata problems (Score:2)
More to P2P than cycles (Score:3)
Metadata early in the game (Score:1)
Isn't this what the CueCat people did when they embedded serial numbers into their scanners?
It allowed them to start creating a metadatabase on you!
Re:User created metadata considered harmful (Score:1)
Agreed, little motivation to donate bandwidth (Score:3)
Added to which, once we actually start paying for music downloads (it's inevitable), there will be demand for reliable downloads. Hell, if I'm paying real money per song, timeouts and crappy connections are unacceptable. Once money enters the equation, I want the media in a timely and efficient manner.
None of this matters in a future where everyone has fiber to the home, but we're at least fifteen years away from that being a reality for most citizens.
User created metadata considered harmful (Score:4)
This is why search engines that work off of metadata typically give you porn links for almost anything, and why Yahoo can't be spoofed (their surfers actually visit the site to see what it's about).
Re:not enough of a return (Score:1)
not enough of a return (Score:4)
This hits the nail on the head. I'm willing to install the RC5 client on my machines for several reasons: 2. It's a project whose goals I more or less believe in. (SETI would be an even better match, but I ended up installing the dnet client first.)
3. I already installed it. Once it's been configured and set to run on my FreeBSD and Linux boxen, I can forget about it. It's more trouble to disable it or find a new distributed project, install that, configure it, and get it running on all my computers.
I think this article gets it right. The return for contributing my spare cycles, plus the effort to install and set up the clients, is not worth whatever change they are paying. Like the article says, if they pay a nickel per processing hour, it takes roughly 2.28 years to earn a thousand dollars if my system is running the client 100% of the time at full processor speed. (I have no idea how much these systems actually pay; I'm just quoting the article's example.) The actual amount earned would be much less, as I do various things with my system: burn CDs, play Quake, write papers, etc. The long-term return of pennies, or less than pennies, per hour makes me say that it's not worth it. And I suspect that without some higher incentive, like the competition distributed.net has made of crunching keys, most people just aren't going to take the trouble to sign up for these paid distributed services. To have enough computers to make some serious money, you'd have to have enough money in the first place to make whatever they pay you small change.
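The arithmetic behind the article's nickel figure and the 2.28-year payback quoted above checks out as a quick script (using the article's round numbers):

```python
# Quick check of the article's nickel-per-hour figure, as quoted above:
# a $1000 box amortized over 20,000 service hours, and how long five
# cents an hour takes to pay back $1000 of hardware running 24/7.

box_cost = 1000.0          # dollars, the article's round number
service_hours = 20_000     # assumed service life of the box
rate = box_cost / service_hours
print(f"${rate:.2f} per hour")            # $0.05 per hour

hours_to_thousand = 1000.0 / rate         # hours to earn $1000 back
years = hours_to_thousand / (24 * 365)
print(f"{years:.2f} years running 24/7")  # 2.28 years
```

Any idle time, or a payout below the full nickel, only stretches that payback further.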
Re:weird bit (Score:2)
-Nev
Another idea for payment of resources (Score:3)
Therefore, I propose that projects such as Popular Power, etc., abandon the idea of paying individuals a few nickels for some amount of CPU processing, and instead pay the charitable organization of the individual's choice.
For example, you could sign up your machine on, say, Team FSF, and for every X number of operations your machine computes for these distributed projects, a dollar would be donated to the FSF.
Money, etc. (Score:3)
but the idea of someone paying my electric bill....
I gotta admit that I can see the potential for abuse on this one.
On the other hand, this comment tossed in at the end gives me the shivers:
As a moment of paranoia sets in, I can see MS adding this element to their... I do not know what it is, but I always seem to have this moment of distrust whenever I read something involving MS.
Then again, maybe the MS marketroids read Slashdot, checking it out for this kind of thinking, in order to get new marketing ideas that they can use.
;-)
International usage ! (Score:1)
Are these guys ready to pay for foreign processing power too?
If so, which companies?
Re:analogy doesnt work (Score:1)
Re:Where are the applications? (Score:2)
I'm sure I'm not the only one who loves setting up scenes with a gazillion meshes and complex camera shots and ray traced textures mirroring each other into infinity. The wireframe of a wild animation is within reach of many typical desktops, it's the rendering that you'll never get --especially at high res-- without your own CPU farm.
So, my proposal to whatever company it was, was to allow artists to send in descriptions of their animation along with say a single screen shot and then CPU cycle donators could go to the site and decide which project they wanted to patronize.
The reward? --not money but a free copy of the final project.
In my mind, this is where the net can transcend conventional notions of economy. Heady stuff.
But what about the money? Well, the organizing site would have to get by on ad revenues. But since it would be an entertainment site, that might not be too bad.
Re:Will companies really see so much profit? (Score:1)
They don't even have to do that - they just need to set up the business case for a CPU cycles bidding market, and the applications will create themselves. (So, it is still a technical problem - creating the infrastructure so that arbitrary processing packages can be distributed according to the results of the bidding.)
Wired pondered this [wired.com] recently.. it could be really cool if someone can pull it off.
analogy doesnt work (Score:1)
mmmm, distributed chocolate. (Score:1)
Hmmm, I disagree (Score:1)