Citizen Science and Grid Computing

kdawson posted more than 6 years ago | from the greyware-network dept.

Social Networks 69

japonicus writes "The Economist has an article summarizing the current state of distributed computing (think SETI@home and its ilk), which suggests that distributed-human projects are going to be the next big thing. (We discussed one such project, the Galaxy Zoo, a few months back.) The distributed-computing platform BOINC is about to expand to human processing. Distributed proofreaders have been a longstanding success (yet inexplicably failed to get even a mention in the article); but there are a lot of other projects waiting in the wings."

69 comments

FIRST TROUT! (-1, Troll)

Anonymous Coward | more than 6 years ago | (#21662329)

Welcome to fish-based computing: I am a fish!

SETI... (1)

purpledinoz (573045) | more than 6 years ago | (#21662673)

SETI was the first to really harness this concept in a useful way, and it was very useful to prove the viability for distributed computing. However, I think the computing power would be better used for medical research, such as Rosetta and World Community Grid. Don't get me wrong, I think the SETI project is valuable, but I think crunching numbers for medical research is more important. I urge all of you crunching numbers for SETI, to also share that crunching power for medical research as well.

Re:SETI... (1)

crowtc (633533) | more than 6 years ago | (#21663465)

I know several people that used to crunch for SETI that have already moved onto Rosetta and the World Community Grid.

My mother died of cancer way too young - so I've naturally chosen to donate all my extra CPU cycles to the prospect of curing cancer. While the aliens might be able to change people's opinions of our place in the universe, curing cancer will improve life for my progeny right away.

Re:SETI... (1)

crashradtke (1025750) | more than 6 years ago | (#21664185)

Something I have pondered... How do you know SETI is useful? I agree there are a ton of computers out there crunching tons of numbers, but who really knows how useful it is? However, the medical research seems more promising as it would produce tangible results which can, hopefully, be applied in the near term. Just a thought.

Re:SETI... (2, Insightful)

OriginalArlen (726444) | more than 6 years ago | (#21664977)

The summary's right - Distributed Proofreaders was well before SETI and, having contributed to both, I can confidently state that my mental-cycles contribution to PGDP was FAR more personally satisfying than the hours spent staring at SETI@Home's admittedly hypnotic display chugging away.

Re:SETI... (1)

kencf0618 (1172441) | more than 6 years ago | (#21666137)

I still run SETI for old times' sake, albeit at low priority (along with SZTAKI and ABC) at 100, whereas Einstein and Rosetta are at 200 and World Community Grid has the lion's share of CPU cycles at 400, so for my part it's mostly biology and astronomy.
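
As a rough illustration of what those numbers mean: assuming the client divides CPU time in proportion to each project's resource share, and assuming each of the three low-priority projects is set to 100 (the comment is ambiguous on that point), the split works out as sketched below. The dictionary keys are just labels taken from the comment.

```python
# Hypothetical reading of the comment: three projects at 100, two at 200, one at 400.
shares = {
    "SETI@home": 100,
    "SZTAKI": 100,
    "ABC": 100,
    "Einstein": 200,
    "Rosetta": 200,
    "World Community Grid": 400,
}


def cpu_fractions(resource_shares):
    """Return each project's fraction of CPU time, proportional to its share."""
    total = sum(resource_shares.values())
    return {name: share / total for name, share in resource_shares.items()}


fractions = cpu_fractions(shares)
for name, fraction in fractions.items():
    print(f"{name}: {fraction:.1%}")
```

Under those assumptions World Community Grid gets 400 out of 1100, a bit over a third of the cycles, and SETI gets 100 out of 1100.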

Distributed Prosecution (0)

Anonymous Coward | more than 6 years ago | (#21662919)


of the world's biggest Gunrunner [whitehouse.org]

Thanks for your activism.

PatRIOTically yours,
Kilgore

Business Nervous System (1)

Colin Smith (2679) | more than 6 years ago | (#21662359)

That'll have to be first. Not impossible to do, but given the state of the IT infrastructures I've seen unlikely for a while.

 

Wow (4, Funny)

moogied (1175879) | more than 6 years ago | (#21662477)

The Economist, a magazine respected all around the world, has just published an article that concludes: "Two heads are better than one."

Hmph..

This is only common sense (2, Funny)

east coast (590680) | more than 6 years ago | (#21662599)

The Economist has an article summarizing the current state of distributed computing (think SETI@home and its ilk), which suggests that distributed-human projects are going to be the next big thing
 
After all, just look at BotNets. How much more insight do we need than that?
 
If only Joe Sixpack (who leaves his computer on 24/7 even tho he only uses it about a half hour per day) would understand that every clock cycle is sacred, every clock cycle is great...
 
If only.

With a real operating system (0, Flamebait)

zazenation (1060442) | more than 6 years ago | (#21664123)

The Economist has an article summarizing the current state of distributed computing (think SETI@home and its ilk), which suggests that distributed-human projects are going to be the next big thing

After all, just look at BotNets. How much more insight do we need than that?

If only Joe Sixpack (who leaves his computer on 24/7 even tho he only uses it about a half hour per day) would understand that every clock cycle is sacred, every clock cycle is great...

If only.
That would apply only to sixpackers who don't use MS Windoze, which needs to be rebooted daily to fend off BSODs. I mean really, you're doing some humongous calculation to resolve some combinatorial nightmare of an equation that could revolutionize cancer treatment, and just when Joe's PC is about to return the pearls of wisdom that it has generated, your connection blows up because Joe's PC hung because it was Windowized.

I guess this brings up the question: "What is the MTBF of XP or Vista?" In a real environment, that is, as opposed to a controlled laboratory setting.

Re:With a real operating system (1)

TooMuchToDo (882796) | more than 6 years ago | (#21666669)

Dude, come on, what are you? Nine years old? Perhaps you're just trolling. Kudos. In the event you're not, let me clarify for you. Windows XP as well as Windows 2000/2003 have excellent kernels that keep them chugging away. Vista? Not so much.

Also, if you don't like that everyone uses Windows, build an OS that's as easy to use. Linux you say? I've tried Fedora Core 8/Ubuntu on the desktop. They're pretty decent, but not quite there yet to replace Windows. Also, any distributed computing application is going to keep track of work units and do checkpointing, so even if OS-of-your-choice crashes, the distributed computing app is going to pick up right where it left off.

Re:With a real operating system (1)

east coast (590680) | more than 6 years ago | (#21669771)

While I commend your statement I wouldn't bother feeding the trolls.

But yeah, I work with over 600 Windows 2000 and XP machines every day. This idea that you need to constantly reboot or that BSODs are common problems is just bullshit. I can't remember the last time I've seen a BSOD that wasn't due to hardware failure.

Joe is taking part in distributed computing ! (1)

DrYak (748999) | more than 6 years ago | (#21669081)

If only Joe Sixpack (who leaves his computer on 24/7 even tho he only uses it about a half hour per day) would understand that every clock cycle is sacred, every clock cycle is great...


Don't worry: Joe Sixpack is taking part in distributed computing. Mainly distributed spamming and distributed DoSing. Thanks to Microsoft's legendary security and modern zombie worms, all those computers ARE used indeed.

Storm Botnet: brings Grid computing to average Joe's reach (tm).

Micropayments for human labor to prevent boredom? (3, Insightful)

compumike (454538) | more than 6 years ago | (#21662627)

Sure, there are tasks that computers can't do so well at the moment, where giving the work parcels to humans would make the most sense. But can you imagine what micropayments might allow? It would enable a consistent set of trained, motivated workers to be stable over time, and dependable enough to use this kind of network for important activities.

Ultimately, humans get bored and computers don't. But humans can be delayed from boredom quite a bit by financial compensation.

--
Educational microcontroller kits for the digital generation. [nerdkits.com]

Re:Micropayments for human labor to prevent boredo (1)

morgan_greywolf (835522) | more than 6 years ago | (#21662937)

But can you imagine what micropayments might allow?
Abuse, fraud and theft?

It would enable a consistent set of trained, motivated workers to be stable over time, and dependable enough to use this kind of network for important activities.
I tend to agree with you, but you do have to figure out how to combat fraudulent activities. After all, most of these are like "pick the picture that most matches foo" or whatever but if someone writes a bot to randomly click on a picture to get micropayments? Not so good because not only were you cheated, but now you have a bunch of wrong data. How do you detect fraud in such a system?

Re:Micropayments for human labor to prevent boredo (2, Informative)

Yetihehe (971185) | more than 6 years ago | (#21663179)

[...] but if someone writes a bot to randomly click on a picture to get micropayments? Not so good because not only were you cheated, but now you have a bunch of wrong data. How do you detect fraud in such a system?
Did you RTFA? It's obvious: with redundancy. When 10 users agree and one misses this agreement most times, he is considered not trustworthy and therefore ignored and not paid.
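
The redundancy idea can be sketched roughly as follows. This is only an illustration under stated assumptions, not the scheme any real project uses: the function name `score_workers`, the 0.5 agreement threshold, and the sample labels are all made up here. Each task is given to several workers, the majority label is taken as the consensus, and a worker who rarely matches the consensus is dropped.

```python
from collections import Counter, defaultdict


def score_workers(answers, min_agreement=0.5):
    """answers maps task_id -> {worker_id: label}.

    Returns (consensus, trusted): the majority label per task, and the set of
    workers who agree with the majority on at least min_agreement of their tasks.
    """
    # Majority vote per task.
    consensus = {
        task: Counter(votes.values()).most_common(1)[0][0]
        for task, votes in answers.items()
    }
    agreed = defaultdict(int)
    seen = defaultdict(int)
    for task, votes in answers.items():
        for worker, label in votes.items():
            seen[worker] += 1
            if label == consensus[task]:
                agreed[worker] += 1
    trusted = {w for w in seen if agreed[w] / seen[w] >= min_agreement}
    return consensus, trusted


# Hypothetical data: three honest workers and one bot clicking at random.
answers = {
    "t1": {"a": "spiral", "b": "spiral", "c": "spiral", "bot": "elliptical"},
    "t2": {"a": "elliptical", "b": "elliptical", "c": "elliptical", "bot": "elliptical"},
    "t3": {"a": "spiral", "b": "spiral", "c": "merger", "bot": "elliptical"},
    "t4": {"a": "elliptical", "b": "elliptical", "c": "elliptical", "bot": "spiral"},
}
consensus, trusted = score_workers(answers)
print(sorted(trusted))
```

Note that worker "c" disagrees once and is still trusted; only a worker who misses the consensus most of the time falls below the threshold. The sketch also assumes honest workers outnumber bots on every task, which is the weak point of any majority scheme.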

Re:Micropayments for human labor to prevent boredo (1)

galoise (977950) | more than 6 years ago | (#21669965)

Actually, at least in statistics, it is a bit more precise: you normally disregard data points more than three standard deviations from the population mean as "aberrant" cases. The problem is that random values in a vector normally do not deviate enough from valid cases to be detectable, so the noise produced by a cheating bot could very well cripple the whole project.

Probably only after a lot of rounds, when tendencies are well known and researched, you could devise more precise tests to check the validity of a given record, but if you know so much of the subject already, your new survey will not be of much help.

This is not a trivial problem in statistics, as normally, statistics are all about looking for interesting deviations from "normal" behaviour. With a simple redundancy test, you take the risk of eliminating the interesting part of the information.

And a corollary of the above is that as you know more about an object, you can test the validity of observations with more precision, but for this exact same reason, your results are progressively more predictable and less valuable. In other words, the more you know about any problem, the more you determine its outcome, and the less valuable the information produced by new data will be. In the end, everything tends to be normal, or predictable.
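
A minimal sketch of the three-standard-deviation rule and its blind spot, using made-up measurements (the values and the function name `three_sigma_outliers` are assumptions for illustration only). A wild value is caught, but fabricated values that stay inside the plausible range pass straight through.

```python
import statistics


def three_sigma_outliers(values):
    """Return the values lying more than three standard deviations from the mean."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # population standard deviation
    if sd == 0:
        return []
    return [v for v in values if abs(v - mean) > 3 * sd]


# Hypothetical honest measurements clustered around 10.
honest = [9.5, 10.5] * 10

# One wildly wrong value stands out and is flagged.
print(three_sigma_outliers(honest + [50.0]))

# Fabricated but plausible-looking values are not flagged at all.
print(three_sigma_outliers(honest + [9.7, 10.4, 10.2]))
```

The second call returns an empty list: the filter has no way to tell plausible noise from real data, which is the problem described above.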

My problem with grid computing (2, Informative)

schnikies79 (788746) | more than 6 years ago | (#21662653)

The only problem I have with the current way of grid-based computing is that it cost me a decent amount over the year. I have to leave my PC('s) on, which burns up power that could otherwise be saved.

I know several slashdotters leave their computers on 24/7, but I don't. It's akin to leaving a light-bulb on overnight, or leaving the fridge door open. I do have a computer I leave on overnight when it's downloading, but it's a headless 500mhz p3 with 256mb ram and it's promptly shut down until I need to download again.

Re:My problem with grid computing (3, Interesting)

Charcharodon (611187) | more than 6 years ago | (#21662805)

That's funny your comment about power usage, because that's exactly how one of the IT guys got found out by management. He was running seti@home during the night on all the work stations and servers. Finance noticed a jump in the power bill about the same time this guy was brought in to work in their IT section. He was racking up quite a few points for the 3 months or so he was getting away with it.

Re:My problem with grid computing (1)

gQuigs (913879) | more than 6 years ago | (#21662933)

I have a crazy idea for you. Install it, and don't change your habits. It is fine to turn a computer using BOINC off at night. Most applications checkpoint every 5 minutes or so, which means you might lose 5 minutes of work by turning it off. Hardly anything to be upset about.
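
To show why a shutdown costs so little, here is a toy checkpoint-and-resume loop. It is a sketch under stated assumptions and not BOINC's actual API: the function names, the JSON checkpoint file, and the step-count interval (standing in for "every 5 minutes or so") are all hypothetical.

```python
import json
import os
import tempfile


def _save(state, path):
    """Write the checkpoint atomically so a crash never leaves a half-written file."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)


def run_work_unit(total_steps, checkpoint_path, checkpoint_every=1000, stop_after=None):
    """Sum the squares of 0..total_steps-1, checkpointing periodically.

    stop_after simulates the machine being switched off after that many steps
    in this run; in that case None is returned and in-memory progress is lost.
    """
    state = {"step": 0, "acc": 0}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            state = json.load(f)  # resume from the last checkpoint
    done_this_run = 0
    while state["step"] < total_steps:
        if stop_after is not None and done_this_run >= stop_after:
            return None  # simulated power-off
        state["acc"] += state["step"] ** 2
        state["step"] += 1
        done_this_run += 1
        if state["step"] % checkpoint_every == 0:
            _save(state, checkpoint_path)
    return state["acc"]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "work_unit.json")
        # First run is cut off after 2500 steps; the last checkpoint was at step 2000.
        run_work_unit(5000, path, stop_after=2500)
        # Second run resumes from step 2000 and redoes only the lost 500 steps.
        print(run_work_unit(5000, path))
```

Only the work done since the most recent checkpoint is repeated, which is the point being made about turning the machine off at night.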

Is there another reason you think you need to leave your PCs running BOINC on?

Re:My problem with grid computing (3, Funny)

EMeta (860558) | more than 6 years ago | (#21663233)

But now it is winter, so my computer is at worst a badly directed space heater.

Re:My problem with grid computing (1)

hcdejong (561314) | more than 6 years ago | (#21668393)

Depending on how you heat your house and how your electricity is generated, it's also an inefficient space heater. I'd much rather use a primary energy source to heat my home than incur the 50% efficiency loss by converting the primary source into electricity first.

Re:My problem with grid computing (1)

itof500 (239202) | more than 6 years ago | (#21669213)

This is my approach as well. During the air conditioning season, the computer is on only when I am in front of it. From mid autumn through winter to mid spring I have it on 24/7 running climateprediction.net. I figure the excess heat just keeps my apartment warm.

duke out

Re:My problem with grid computing (0)

Anonymous Coward | more than 6 years ago | (#21663563)

As long as I get paid for what my computer processes. Using my electricity and renting time on my hardware is *not* free. I don't care what noble cause you're solving.

Re:My problem with grid computing (1)

TeknoHog (164938) | more than 6 years ago | (#21663957)

Who said scientific progress is free? Maybe you should quit paying taxes too, some of them might end up funding research.

Re:My problem with grid computing (1, Insightful)

Anonymous Coward | more than 6 years ago | (#21664333)

Say it costs 10 cents a kilowatt hour for energy, your PC draws 200 watts on average, and it's on eight hours per day.

That's 58 dollars a year, saving about 117 compared with 24/7 operation by turning it off at night.

The expansion and contraction from the heating and cooling cycles ruin hardware.

I imagine that by thermal cycling it every day it will cost more money (with a long enough time frame) in destroyed hardware than the electricity you saved by powering it down.
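
The electricity figures in this comment can be checked with a few lines. The inputs (200 W, 10 cents per kWh, 8 versus 24 hours a day) come from the comment; the function name is just for illustration.

```python
def annual_cost(watts, hours_per_day, price_per_kwh, days=365):
    """Yearly electricity cost in dollars for a constant average draw."""
    kwh_per_year = watts / 1000 * hours_per_day * days
    return kwh_per_year * price_per_kwh


eight_hours = annual_cost(200, 8, 0.10)
always_on = annual_cost(200, 24, 0.10)

print(f"8 h/day:  ${eight_hours:.2f}")
print(f"24 h/day: ${always_on:.2f}")
print(f"saving:   ${always_on - eight_hours:.2f}")
```

That gives roughly 58 dollars a year for eight hours a day against roughly 175 for round-the-clock use, so the saving of about 117 dollars quoted above holds up. Whether thermal cycling costs more than that in hardware is a separate claim the arithmetic cannot settle.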

Re:My problem with grid computing (1)

schnikies79 (788746) | more than 6 years ago | (#21665811)

I've been running PC's for 12 years. The only part I have ever had fail is a hard drive. I keep hardware an average of 4 years.

Re:My problem with grid computing (1)

caferace (442) | more than 6 years ago | (#21667889)

I know several slashdotters leave their computers on 24/7, but I don't. It's akin to leaving a light-bulb on overnight, or leaving the fridge door open. I do have a computer I leave on overnight when it's downloading, but it's a headless 500mhz p3 with 256mb ram and it's promptly shut down until I need to download again.


I understand your enthusiastic if misplaced green-ism, but I do hope you know your inefficient 500Mhz P3 with an aging mobo, RAM and PSU is likely 2-3 times as watt-hungry as a modern energy efficient system. Technology has "moved on" in many ways....

Re:My problem with grid computing (1)

schnikies79 (788746) | more than 6 years ago | (#21668147)

I'm well aware of this, that's why it's never on except when downloading a torrent. It's only fired up once or twice a month and only for a few hrs. I really don't download much.

This has nothing to do with my being green, it's about saving money.

Re:My problem with grid computing (1)

galoise (977950) | more than 6 years ago | (#21670033)

awh, come on, you mentioned the p3 as an example of a computer that sucked up less juice, presumably because it's slower. But if you really want to save bucks, and GP is right, you should really STOP using that p3, and use your main box, or any newer computer, for the download. Don't take it personal, in any case: i'm just pointing out that if GP is right, your strategy of using the p3 to download is not energy efficient, and you should review it.*hint hint*

by the way, i just moved to a new apartment, and paid my first month of electricity, with a p4 2.4 with three HDs on 24/7, and the diff between bills is roughly 20 dollars. That's like 75 cents a day, i think. Quite a buck, for my budget. Any suggestions to keep the costs down?

Re:My problem with grid computing (0)

Anonymous Coward | more than 6 years ago | (#21669079)

No, no, no! You don't leave your computer on to do some calculating! You do it when you're using your computer, just like always. It's completely transparent to you. At least on a decent operating system, like GNU/Linux. I don't think there's a big difference (if any) in the power consumption. The distributed computing process is set to the lowest priority, so all other processes get completed before it. But the processor is never idling. Try it, you'll see. Your 500 MHz processor can do valuable work.

Not really that much waste (0)

Anonymous Coward | more than 6 years ago | (#21674683)

I leave my 5 PCs on 24x7. It uses electricity alright, but it cuts down a little on my heating bill, so it mostly evens out. During winter, that is. In the summer this place is so hot... If only there was a way to make them emit cold air instead of hot...

'Citizen' science (1, Troll)

Arthur B. (806360) | more than 6 years ago | (#21662741)

The Economist coined that out of their ass? Seriously, the current accepted sense of 'citizen' is a person taken as subject to the laws of a specific government. What does *that* have to do with voluntary distributed computing? Nothing! They just assume voluntary distributed computing = virtuous, virtuous = good citizen, and there bingo citizen becomes synonymous with virtuous. Participation in a common project becomes not a personal contribution, but a contribution from us, *as subjects of a government*.

I'm not nitpicking, this is scary.

Re:'Citizen' science (1)

ILongForDarkness (1134931) | more than 6 years ago | (#21666439)

My thought exactly. Let's assume they mean more of a society science, or social science in that the society is contributing to science. I'm sorry all you SETI@home loving people but it isn't "citizen science" in the sense that you have a claim on the science. You didn't help develop the algorithms, test the model etc etc. You are nothing but someone donating a high powered calculator. The calculator has no claim on the scientific results, nor do you.

That said, you can feel good that you have contributed something akin to donating to cancer research, but you have no claim on the science itself, other than in a "made possible by" kind of way.

I've done computational physics work as an undergrad, about 90% of the time was programming, only about 10% of the time was physics (actually seeing if the computational results matched what theory predicts, formulating models etc). I was one of the researchers and I'd still say I had little claim on scientific innovation (though I'm primary author on several publications). It is kind of the nature of the problems I guess, interesting problems require so much computing power that most of your time is spent waiting for data. With more resources available most researchers will end up modelling bigger problems and still will be waiting long periods of time for data.

mcgrew's rule #ff387Y (1)

sm62704 (957197) | more than 6 years ago | (#21662815)

As soon as you see some asshat saying in print or especially on the internet that something is "the next big thing" you can bet your left nut it isn't.

-mcgrew [slashdot.org]

Every project you can participate in right now (2, Informative)

kpearson (760708) | more than 6 years ago | (#21663163)

is listed on my site: http://distributedcomputing.info/ [distribute...uting.info] . If you leave your computer on all the time and it isn't doing anything useful when you aren't using it, please look through these projects and pick one or more to contribute to.

Re:Every project you can participate in right now (0)

Anonymous Coward | more than 6 years ago | (#21663443)

Thanks for the listing. I'm currently running folding at home http://folding.stanford.edu/ [stanford.edu]

I only wish you'd regroup them so that people would easily see which projects have open results that benefit the all of us. I think that participating in a closed project is almost as bad as not participating at all. And while SETI@home is a nice project (the biggest and most famous distributed project), which I used to participate in myself, I think there are better projects, which are guaranteed to make useful results.

Get in the game! It's fun!

Re:Every project you can participate in right now (1)

TooMuchToDo (882796) | more than 6 years ago | (#21666731)

I bought a PS3 to do development using a Cell processor, and leave it on to run Folding@Home when I'm not using it. It tears through work units. (Note: I don't need a lecture on energy usage from anyone. I offset my home carbon emissions even though I don't have to since it's nuclear energy).

Re:Every project you can participate in right now (1)

cp.tar (871488) | more than 6 years ago | (#21663619)

Recently, I've started thinking about a distributed computing project for language analysis... some statistical analyses and machine learning could very well be implemented in this way, especially if we use Google (with a limited number of searches per day) as a corpus...

The idea occurred to me when I saw a presentation of a bootstrapping system that used Google, but the author said the access was severely limited -- he couldn't get access to more professional APIs without paying quite a lot of money, and as a mere student he couldn't afford it.

However, a distributed system would be able not only to sidestep this obstacle, but also do many other kinds of language analysis...

I have to find a good programmer, I think.

DP (1)

fm6 (162816) | more than 6 years ago | (#21663423)

Distributed proofreaders have been a longstanding success (yet inexplicably failed to get even a mention in the article)
Maybe because it's a totally amateur effort?

I volunteered for DP for a few months. I got buggy TIFFs that my web browser couldn't deal with, so I sometimes had to work outside the DP proofing environment, which was a pain. (My suggestion that they switch to a more portable format, such as PNG, fell on deaf ears.) And they're still stuck on the idea that plain text is a universal format. That's what made me give up: I was proofing the 1911 Britannica, and realized that a lot of information was getting lost. There was no good way to indicate marginal notes. Both boldface and italic are indicated by all caps. (I REALLY find it hard to enjoy books that are FULL of capitalized words; it DESTROYS a lot of the SUBTLETY. And how do you capitalize "1984"?) And equations were managed with a subset of LaTex which I'm sure I mangled because I didn't have a LaTex interpreter to test it on — in fact, the DP instructions didn't even mention that it was LaTex.

If you want to preserve text for the ages, you have to use some serious markup to indicate things that are part of the content but not part of the linear text. Basically, the solution is to use some form of XML. Yes, I know the arguments: hard to enter, not everybody has an XML browser, etc. There are good solutions that deal with these problems. Just throwing away data in order to keep the document "simple" is not a good solution.
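
To make the data-loss argument concrete, here is a small sketch. The element names (`i`, `b`, `note`) are hypothetical and are not DP's or Project Gutenberg's actual markup. The same marked-up sentence is rendered once as plain text with all-caps emphasis and once as HTML.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: italics, boldface, and a marginal note.
SAMPLE = (
    '<p>He read <i>Nineteen Eighty-Four</i> in <b>1984</b>.'
    '<note place="margin">a marginal note</note></p>'
)


def render(elem, emphasis, note):
    """Flatten an element to a string, delegating emphasis and notes to callbacks."""
    parts = [elem.text or ""]
    for child in elem:
        inner = render(child, emphasis, note)
        if child.tag in ("i", "b"):
            parts.append(emphasis(child.tag, inner))
        elif child.tag == "note":
            parts.append(note(inner))
        else:
            parts.append(inner)
        parts.append(child.tail or "")
    return "".join(parts)


root = ET.fromstring(SAMPLE)

# Plain text: emphasis becomes capitals, marginal notes are dropped.
plain = render(root, lambda tag, text: text.upper(), lambda text: "")

# HTML: emphasis and notes both survive (no escaping is done in this sketch).
html = render(
    root,
    lambda tag, text: f"<{tag}>{text}</{tag}>",
    lambda text: f'<span class="sidenote">{text}</span>',
)

print(plain)
print(html)
```

In the plain rendering the bold "1984" is indistinguishable from ordinary text and the note is gone, while the structured source can still produce either output.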

Re:DP (0)

Anonymous Coward | more than 6 years ago | (#21664527)

"Both boldface and italic are indicated by all caps."

They are not. In the first proofing round the formatting such as bold and italics are supposed to be ignored completely. This all gets added in the formatting rounds.

Re:DP (2, Informative)

dpf_donovan (1201839) | more than 6 years ago | (#21664697)

Maybe because it's a totally amateur effort?

"All-volunteer" is not the same thing as "totally amateur." A number of our volunteers work in library science, proofreading, or other directly related fields.

I volunteered for DP for a few months. I got buggy TIFFs that my web browser couldn't deal with, so I sometimes had to work outside the DP proofing environment, which was a pain. (My suggestion that they switch to a more portable format, such as PNG, fell on deaf ears.) And they're still stuck on the idea that plain text is a universal format. There was no good way to indicate marginal notes. Both boldface and italic are indicated by all caps. And equations were managed with a subset of LaTex which I'm sure I mangled because I didn't have a LaTex interpreter to test it on; in fact, the DP instructions didn't even mention that it was LaTex.

It sounds like you last visited DP a long time ago. DP has been standardized on PNG as their page image format almost since the site's inception 7 years ago, though we do allow jpg as an alternative. TIFF has never been an official format there. DP has also been producing HTML, DJVU, and LaTeX editions of projects (including illustrations) for many years. We are not tied to plain text, although we do produce it as a minimum for our target repository, Project Gutenberg.

Markup for bold and italics is the same as HTML, and markups exist for and are used to indicate marginal notes, footnotes, and the like. You are welcome to argue that a more complex markup is necessary, but considering the amount of outdated information in your comments here, you may wish to stop by and update your knowledge of the state of the site. We'll happily welcome you back if you do.

D. Garcia
SysAdm - Distributed Proofreaders [pgdp.net].

Re:DP (1)

fm6 (162816) | more than 6 years ago | (#21666461)

"All-volunteer" is not the same thing as "totally amateur." A number of our volunteers work in library science, proofreading, or other directly related fields.
Never said it was. In this kind of context, I think you'll find "amateur" usually means the opposite of "professional". And in this context "professional" doesn't mean "paid", it means "knows what they're doing".

It sounds like you last visited DP a long time ago. DP has been standardized on PNG as their page image format almost since the site's inception 7 years ago, though we do allow jpg as an alternative. TIFF has never been an official format there.
I don't know what to tell you. I was involved in 2003, and at that time I used a sort of web proofreading tool that used TIFF. Perhaps that was a feature of the particular tool.

Markup for bold and italics is the same as HTML, and markups exist for and are used to indicate marginal notes, footnotes, and the like. You are welcome to argue that a more complex markup is necessary, but considering the amount of outdated information in your comments here, you may wish to stop by and update your knowledge of the state of the site. We'll happily welcome you back if you do.
I just did stop by. All the "recently finished" links on the front page are broken — not the best way to persuade folks you're not amateurs.

I'm glad to see you've started using markup to indicate bold and italics. But skimming through your Formatting Guidelines [pgdp.net], I see a lot of bad stuff that hasn't changed since I was a volunteer. You still use 4 blank lines to indicate a chapter break. You still use that clumsy, hard-to-parse syntax to indicate side notes and footnotes. And you still hand-format tables! I couldn't find the instructions for entering equations, but I'm guessing you still use Tex syntax to record them.

My particular interest was the 1911 Britannica — I spent a lot of time on that one. A lot of people would enjoy a decent online copy. But to be useful, the online version has to be well-structured, so you can pull up a particular article without going crazy. And all that scientific stuff and complicated tabular information has to be recorded in such a way that it can actually be read. I gave up when I realized that the toolset you had 4 years ago wasn't nearly up to the task. And it still isn't. There have been improvements, but nothing that really dents my original negative assessment.

I stand by the word: amateur.

Re:DP (1)

dpf_donovan (1201839) | more than 6 years ago | (#21666849)

I don't know what to tell you. I was involved in 2003, and at that time I used a sort of web proofreading tool that used TIFF. Perhaps that was a feature of the particular tool.

Ah, that may have been the long-obsolete Windows-based client "PRTK."

All the "recently finished" links on the front page are broken, not the best way to persuade folks you're not amateurs.

Those offsite links are valid, but not until after PG does its nightly cataloging run which places files in the correct locations on their server(s). Why they don't move files into place immediately on posting a text is beyond me, since it should be trivial from a technical standpoint, but since I don't volunteer for them directly, I can't respond to that. The downside is, as you've noted, that the offsite links we present don't immediately work. In the meantime, we promote our most recently completed works as best as we're able to, given that constraint.

I'm glad to see you've started using markup to indicate bold and italics. But skimming through your Formatting Guidelines, I see a lot of bad stuff that hasn't changed since I was a volunteer. You still use 4 blank lines to indicate a chapter break. You still use that clumsy, hard-to-parse syntax to indicate side notes and footnotes. And you still hand-format tables! I couldn't find the instructions for entering equations, but I'm guessing you still use Tex syntax to record them.

Your suggestions would work better in a "professional" environment, but in a volunteer environment, they would fail because the learning curve is too high, and more time would be spent correcting difficult markup entered incorrectly. Realize that the markup used at DP is a compromise intended to be rapidly picked up by inexperienced people, and that it is an intermediate format which does not reflect the actual appearance of the end products, regardless of their final format, and especially the thousands of projects which have been produced in HTML.

And in this context "professional" doesn't mean "paid", it means "knows what they're doing".

I'm not disagreeing with your use of the term "amateur." Perhaps you mean to say you disagree with how we're going about the task (and many do, including currently active volunteers). That's how we learn to do this better. Over time, as we've learned how to do what we do, we've refined our workflow and software to be able to do it better: a process which continues to this day. You might call it a distributed human genetic algorithm. :)

Re:DP (1)

fm6 (162816) | more than 6 years ago | (#21672279)

Those offsite links are valid, but not until after PG does its nightly cataloging run which places files in the correct locations on their server(s). Why they don't move files into place immediately on posting a text is beyond me, since it should be trivial from a technical standpoint, but since I don't volunteer for them directly, I can't respond to that.
Neither "it's not our fault" nor "they should work" is more than a silly excuse. If this were my website, I'd work with the other website to make sure the links worked. If that didn't work out, I'd take down the links. Proudly displaying links that don't work, for whatever reason, makes you look like idiots.

Your suggestions would work better in a "professional" environment, but in a volunteer environment, they would fail because the learning curve is too high...
In other words, you're not going to use the right tools because you don't think your volunteers could be bothered to learn to use them. Well, here's one volunteer who's lost interest because you insist on using primitive tools.

And the fact is, it's quite possible to design user friendly tools for entering XML. But that would take planning and technical expertise. Your lack of those resources is the real problem, not the learning curve.

Basically, what you're good for is taking a lot of Victorian novels and making them available online. I suppose that's worth doing, but it doesn't qualify DP as a major player in "citizen science" as discussed in TFA. If you want to do serious stuff, like bringing old technical works online, you need to break away from the naive concepts you inherited from Project Gutenberg, and acquire some understanding of 21st century document storage technology.

Re:DP (0)

Anonymous Coward | more than 6 years ago | (#21665301)

I saw the Joyous Holidays banner on the DP site and assumed it was just for Jewish books.

Re:DP (2, Informative)

bgalbrecht (920100) | more than 6 years ago | (#21666651)

As others have mentioned, you must have volunteered at DP a very long time ago because ALL of your objections to our work are no longer valid. The only complaint of yours that was valid when I started volunteering there 3.5 years ago was that DP's final versions submitted to Project Gutenberg were plain-text.

At the time you were volunteering, PG was primarily a repository of plain-text documents. These days, in large part due to the influence of volunteers at DP, nearly every new text submitted to PG has an HTML edition. Some are submitted in PG-TEI, Project Gutenberg's draft/proposed XML vocabulary based on the Text Encoding Initiative XML format, which can be transformed into many formats, including plain text and HTML.

Re:DP (1)

fm6 (162816) | more than 6 years ago | (#21674983)

I just took a look at the DP site. As I've noted in another post, not that much has changed.

Re:DP (0)

Anonymous Coward | more than 6 years ago | (#21675673)

Just a note.

If you just looked at the main site without registering, you see nothing but a header page. If you register, you get basic beginner's access... An old unused login may or may not get you access, but it will be bare-bones beginner's access. They've segregated a lot so as not to overwhelm new people. There's a lot you cannot see. Things have changed drastically in the 4 years I've been involved with the site. A lot of the way things are done has changed in response to what people wanted and asked for in the forums. Perhaps it's not your *informed* choice of how something should be done, but they are volunteering, asking, and participating; you are not.

I don't disagree with the title 'amateur' but I do disagree with the 'don't know what they're doing' aspect you keep touting. The total volume of works both literature AND technical that DP has processed and is in process denies that.

I'm sorry that your stand seems to be that this shouldn't be done and shouldn't be talked about because it's not up to your standards.

I'll go off and twiddle my thumbs now.

Wrenn__ (someday I'll remember to reset my password. I can't do it from work, however)

pgdp.net 's tax accountant. (oh, and yes, I've done most things you can do involved with projects on the site. except 'developer'. Not my skillset)

Re:DP (1)

bgalbrecht (920100) | more than 6 years ago | (#21680295)

Saying so doesn't make it true. For example, take a look at the HTML editions of a section of the 1911 Encyclopedia Britannica http://www.gutenberg.org/etext/19699 [gutenberg.org], Music Notation and Terminology by Karl Wilson Gehrkens http://www.gutenberg.org/etext/19499 [gutenberg.org], and Elements of Structural and Systematic Botany by Douglas Houghton Campbell http://www.gutenberg.org/etext/20390 [gutenberg.org], all recent productions from Distributed Proofreaders and top 100 downloads from Project Gutenberg. It is true that we're still using LaTeX to describe formulae, and some of our formatting notations are nowhere near the sophisticated tools a former technical writer at Adobe is no doubt used to. I think we're doing a far better job of retaining the non-textual information you claim we're dropping than your cursory re-examination of our site would indicate.

Re:DP (1)

fm6 (162816) | more than 6 years ago | (#21688542)

"Saying it's so doesn't make it true?" How does that even apply here? I didn't just make a claim, I pointed out some severe limitations in the "guidelines" that haven't changed since I was a volunteer. True there have been some improvements (not using ALL CAPS for italics removes a major eyesore) but it's still pretty much a mess.

I'm glad you linked the 1911 EB, since that's the DP project I care most about. Now, suppose I want to read the article on Sir Thomas Bromley. I have to figure out which file has his article, download the entire 5 megabyte file (remind me not to do that on my cell phone browser!) and then search for his name.

Someday, somebody will get round to breaking these huge EB files into individual articles. That's going to take a lot of manual processing, because the conventions for "new article" are vague and inconsistently applied. A lot of typographical subtleties are lost, because they're not well supported. My particular pet peeve is that non-Latin characters (lots of those in the EB) are handled through transliteration. That might have made sense when Project Gutenberg was started in 1971 (though even then there were better ways of doing it); 36 years later, when even barefoot kids in developing countries have access to graphics displays, it's just stupid.

And hand-formatted tables. I cannot find the words to describe how dumb that is.

True, we can reprocess the files to include this information. But it would be a lot easier to do it on the initial pass.

And let's dispense with the usual lame excuses about limited resources, unskilled volunteers, etc. It wouldn't be that hard to put together a proposal for a set of XML specs that accommodate the various docs that DP and PG deal with, together with some web applications that would allow the most naive volunteers to translate text into structured data. It would be complicated and technically difficult, but with a little hustling for grants, donations, and technically skilled volunteers (I'd be good for the last two, and I'd be happy to hustle my employer for all three), it could be done. It's not even that big a project, as such things go.

So why hasn't it happened? Because the key people at PG and DP know jack about content management technology. They were slightly behind the times in 1971, and they've made remarkably little progress since then. Meanwhile the technology that's available has grown by leaps and bounds. I say again: amateurs.

prior art (1)

Potatomasher (798018) | more than 6 years ago | (#21664057)

Maybe once these projects pick up speed, people will get together and start working out of a common workplace to increase efficiency !
Hmmm... but then of course they'll need a big building to fit everyone, some form of financing, cubicles....

Hey... wait a minute ! This sounds familiar...
Nope, false alarm. What a new & radical concept ! This could change everything !

Grid computing != Distributed computing (5, Informative)

edsousa (1201831) | more than 6 years ago | (#21664103)

Grid computing is when you request resources to run your app. Projects like SETI@home use a different approach: you pull a task, instead of arbitrarily offering your computing resources.

IBM defines grid computing as "the ability, using a set of open standards and protocols, to gain access to applications and data, processing power, storage capacity and a vast array of other computing resources over the Internet."
in http://en.wikipedia.org/wiki/Grid_Computing [wikipedia.org]
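
To make the "pull a task" model concrete, here is a minimal sketch using only the Python standard library. The names (`Coordinator`, `request_work`, `report`, `volunteer`) are hypothetical and are not BOINC's or SETI@home's actual API; the "work" is just squaring integers, standing in for a real work unit.

```python
from collections import deque


class Coordinator:
    """Hypothetical project server: holds work units and collects results."""

    def __init__(self, work_units):
        self.pending = deque(work_units)
        self.results = {}

    def request_work(self):
        """Hand out the next work unit, or None when nothing is left."""
        return self.pending.popleft() if self.pending else None

    def report(self, unit, result):
        """Record the result a volunteer computed for a work unit."""
        self.results[unit] = result


def volunteer(coordinator, compute):
    """A volunteer client pulls tasks itself until the queue is empty."""
    done = 0
    while (unit := coordinator.request_work()) is not None:
        coordinator.report(unit, compute(unit))
        done += 1
    return done


# Illustrative run: five toy work units, one volunteer.
coordinator = Coordinator(range(5))
completed = volunteer(coordinator, lambda n: n * n)
```

The point of the design is that the server never schedules anything onto a particular machine; each client decides when to ask for more work, which is why machines can join and leave freely.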

Re:Grid computing != Distributed computing (1)

dkf (304284) | more than 6 years ago | (#21670363)

Grid computing and distributed computing are related, but not the same. With distributed computing, the focus is mainly on getting large numbers of machines that are functionally the same in some way (e.g. large numbers of SETI@home processing units). With grid computing, the focus is mainly on dealing with heterogeneity and varying access rules between different organizations. The two approaches deal with different things, but can be (and often are) used together.

Shameless self promotion of my PhD research (2, Interesting)

bcg (322392) | more than 6 years ago | (#21664811)

Hi,

This is something that I have had an interest in for the last few years. As such, a large part of my thesis has been developing "CompTorrent". It is a computing platform that has borrowed some ideas from BitTorrent and combined them with distributed computing.

The focus has been on making distributed computing projects as easy to start as a BitTorrent swarm. After spending some quality time with both BOINC and Condor I can assure you that getting a project going from scratch, can be a non-trivial exercise.

Here's a paper if anyone is interested: Enabling grassroots distributed computing with CompTorrent [utas.edu.au]

Imagine... (0)

Anonymous Coward | more than 6 years ago | (#21666055)

...a beowulf cluster of these?

distributed finance (1)

astflgl (770168) | more than 6 years ago | (#21669305)

Dunno if anyone cares, but I recently saw an ad on some random page that was looking for people to apply for some kind of mass-option trading job (it might not have been options, I'm not good with finance terms). The deal was, you sign up, they give you some software that (presumably) visualizes the fluctuations of various prices of trade-able widgets on "the market" in a chart or graph, give you basic training on when to buy and when to sell, start you off with a bit of dough, then turn you loose and give you a percentage of what you make.

just mentioning it here because someone might find it interesting.

Great idea... but not without drawbacks (1)

Archtech (159117) | more than 6 years ago | (#21669311)

TANSTAAFL. It's ironic that people running, e.g. ClimatePrediction are simultaneously helping to change the climate. Each PC does not generate much heat, but several million of them certainly do - especially if left on 24 hours a day, 7 days a week as many enthusiasts tend to. See for example this rough analysis: http://hardforum.com/showthread.php?t=1240015 [hardforum.com]

We have to figure both the heat generated and the power consumed (much of which is derived from fossil fuels). Even if you use green electricity, that just means that other people have to use fossil fuels. Good for your conscience, but not for the world as a whole.

On the other hand, as several people have pointed out, the waste heat from PCs does contribute to space heating - thus perhaps reducing the amount of energy spent to keep houses, etc. warm in winter.
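
The scale argument is simple multiplication, sketched below. The inputs are placeholder assumptions chosen for round numbers, not measurements from any project; swap in your own figures. The function name `fleet_energy_kwh` is likewise made up for illustration.

```python
def fleet_energy_kwh(machines, watts_per_machine, hours):
    """Total electrical energy drawn by the fleet, in kilowatt-hours."""
    return machines * watts_per_machine * hours / 1000.0


# Placeholder inputs (assumptions, not measured values).
MACHINES = 1_000_000          # volunteer PCs left running
WATTS_PER_MACHINE = 100       # average draw per PC
HOURS_PER_YEAR = 24 * 365     # on 24 hours a day, 7 days a week

annual_kwh = fleet_energy_kwh(MACHINES, WATTS_PER_MACHINE, HOURS_PER_YEAR)
average_megawatts = MACHINES * WATTS_PER_MACHINE / 1_000_000
```

Essentially all of that electrical power ends up as heat, which is why the same total counts against the project in summer and partly offsets space heating in winter.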

Hydrogen@Home (1)

jackygrahamez (1202241) | more than 6 years ago | (#21672379)

I strongly believe the power of volunteer computing projects will revolutionize science and technology. That is why I spend all my free time learning and developing a project that attempts to engineer clean energy technology. My work attempts to identify efficient catalysts in hydrogen production using Quantum Monte Carlo and Docking simulations. There is a great deal of development ahead of me and I work on it with a shoestring budget since it is a hobby. Thanks to the friendly community I am actually making progress, slowly but surely. There are really many more computational problems than projects currently available. Anyone with a little imagination, a strong science background and experience developing webservers can start their own! I may think Hydrogen is the best approach to developing clean alternatives. You might think that is a bad idea and have an idea about engineering cellulosic ethanol. I would applaud that, because at the end of the day society needs the best solution to global climate change. I hope I have inspired people to participate in BOINC. May our community continue growing. Through Meetup.com I started a group that meets in Washington DC to discuss and promote BOINC. Kind Regards, Jack Shultz http://hydrogenathome.org/ [hydrogenathome.org]

Systemic - extrasolar planets (0)

Anonymous Coward | more than 6 years ago | (#21673309)

http://www.oklo.org/ [oklo.org]

If you are interested in the search for extrasolar planets, you can download a Java app. It's not distributed computing, though; it's more like a little toolkit where (after setting up the strongest data by interpreting a graph) you tweak some sliders and then tell the computer to integrate and leave it on all night (it's trying to get the best fit to the transit data we have for the star system).

When you get a good result, it's pretty cool. But it takes a lot more effort than the distributed, leave-on-anytime, ones.