BOINC Now Available For GPU/CUDA

Soulskill posted more than 5 years ago | from the good-excuse-for-a-new-video-card dept.

Software | 20 comments

GDI Lord writes "BOINC, open-source software for volunteer computing and grid computing, has posted news that GPU computing has arrived! The GPUGRID.net project from the Barcelona Biomedical Research Park uses CUDA-capable NVIDIA chips to create an infrastructure for biomolecular simulations. (Currently available for Linux64; other platforms to follow soon. To participate, follow the instructions on the web site.) I think this is great news, as GPUs have shown amazing potential for parallel computing."


It's thinking... (4, Interesting)

neomunk (913773) | more than 5 years ago | (#24254539)

As someone who is interested in software neural nets, this announcement practically gives me a chubber.

And let me be the first to welcome our new Distributed Overlord. The lack of an 's' on "Overlord" is the exciting part of this article.
 

What I'm waiting for is (2, Interesting)

da5idnetlimit.com (410908) | more than 5 years ago | (#24254573)

Video conversion for GPU/CUDA (an amd64 version for Ubuntu Heron, if I get to be really choosy).

I saw something about this, and they were getting unbelievable transcoding speeds...

Re:What I'm waiting for is (1)

Qhartb (1311541) | more than 5 years ago | (#24255863)

Yep. That program is "Badaboom Media Converter" by Elemental Technologies. I look forward to seeing what other applications CUDA has for home users. So far we have video transcoding, gaming physics simulation, and distributed computing projects (SETI@home and Folding@home). Doubtless graphics pretty soon, ironically (CUDA ray-tracing). It's really exciting.

Re:What I'm waiting for is (1)

lavid (1020121) | more than 5 years ago | (#24256339)

The way CUDA deals with thread death in the current iterations is lacking. If they make that more graceful, you can really expect to see some insane speedups.

Single platform only (3, Interesting)

DrYak (748999) | more than 5 years ago | (#24254929)

The only sad thing is that CUDA is a single-platform API that only supports a handful of cards from a single manufacturer. For a project like BOINC that tries to get as many computers working together as possible, it would be good if they also tried to support at least one more API.

Brook would also have been a nice candidate. It has already been used by another distributed computing project (Folding@home), and it supports multiple back-ends (including a multi-CPU one which actually works(*), an OpenGL one which works with most hardware, and AMD/ATI's CAL backend featured in their Brook+ fork).

Too bad that currently both nVidia and Intel are trying to attract customers to proprietary single-platform APIs (CUDA and Ct respectively), especially given some memory management weirdness in CUDA.

(*) Unlike CUDA's device emulation mode, which is a ridiculous joke performance-wise.

Re:Single platform only (2, Informative)

Anonymous Coward | more than 5 years ago | (#24255181)

CUDA is being ported to ATI/AMD cards with nVidia's blessing and support. By next year there will probably be a lot of hardware support for the API.

Re:Single platform only (3, Informative)

Satis (769614) | more than 5 years ago | (#24255289)

FYI, as the other reply states, CUDA isn't limited to a single manufacturer. nVidia has made it available for other graphics card manufacturers to support. Here's an article on ExtremeTech talking a bit about it, but at least according to the article, ATI doesn't appear interested.

http://www.extremetech.com/article2/0,2845,2324555,00.asp [extremetech.com]

CUDA is extremely nVidia oriented. (2, Informative)

DrYak (748999) | more than 5 years ago | (#24257709)

Yes, but sorry, CUDA is about as oriented toward other graphics manufacturers as Microsoft's ISO Office XML, with all its "use_spacing_as_in_word_96='true'" options, is an open standard.

It is very heavily oriented toward nVidia's architecture, and it has several deeply asinine quirks. You see, there are several different types of memory. The twist is that three of them are accessed using regular pointer arithmetic, but textures are accessed using dedicated functions, because using the "[]" operator like every other memory type would apparently have been too straightforward.
Also, instead of simply declaring stream buffers and binding them to data with a language extension (as in Brook, for example), you have to go through a couple of specific function calls into the CUDA API. It's 1980s-style C all over again.
The whole thing is very much directed toward an architecture like nVidia's, which can't apply a kernel on the fly while streaming data from main memory to the graphics card, but instead relies on concurrent kernels and loads.
And don't ask me about the weird tendency to require the user to go through function calls just to set a constant to its default value (instead of simply declaring and accessing it directly).
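
To give a flavour of the inconsistency, here is a minimal sketch from memory (names are made up, error checking omitted) of the texture-reference dance versus plain pointer access:

<ecode>
/* texture reference: must live at file scope and be bound on the host */
texture<float, 2, cudaReadModeElementType> texIn;

__global__ void combine(float *d_out, const float *d_in, int w)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    float a = d_in[y * w + x];     /* global memory: ordinary C indexing */
    float b = tex2D(texIn, x, y);  /* texture memory: dedicated fetch function */
    d_out[y * w + x] = a + b;
}

/* host side: a chain of API calls just to make the texture usable */
void setup(const float *h_tex, int w, int h)
{
    cudaArray *arr;
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();
    cudaMallocArray(&arr, &desc, w, h);
    cudaMemcpyToArray(arr, 0, 0, h_tex, w * h * sizeof(float),
                      cudaMemcpyHostToDevice);
    cudaBindTextureToArray(texIn, arr);
}
</ecode>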

CUDA provides a nice C-like language for kernels, but the host code itself looks like a direct dump of the driver's interface.
It's definitely not something that will be easily picked up by third-party developers or map nicely onto other architectures.

That's why ATI isn't interested: most of the host API is designed in a way that is very nVidia-oriented and won't necessarily map nicely onto other architectures.

FYI, I've been working on several projects using both CUDA and Brook. Although I appreciate the speed gain of CUDA, and I appreciate having several C dialects that let me port an algorithm between C, CUDA, and Brook without too much effort, I still find that Brook has a nicer and much more abstract architecture.

Re:CUDA is extremely nVidia oriented. (1)

krilli (303497) | more than 5 years ago | (#24261497)

CUDA is free and it works. I prefer a hackish CUDA now to a nice, abstract CUDA in two years.

Also, I do believe someone will write a nice abstraction on top of CUDA. If CUDA is like C++, there will be nice Boost and Qt toolkits for it.

Also, you can do asynchronous memory transfers and kernel executions ... unless you're talking about something else and it's my misunderstanding.
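
Something like this (a rough sketch from memory; error handling omitted, and d_in, d_out, my_kernel, grid, block, n are placeholders) queues a transfer and a kernel without blocking the CPU; with two such streams, one stream's copies can overlap the other's kernels:

<ecode>
cudaStream_t stream;
cudaStreamCreate(&stream);

/* the host buffer must be page-locked, or the copy silently
   degrades to a synchronous one */
float *h_in;
cudaMallocHost((void **)&h_in, bytes);

/* both calls return immediately; the GPU works through the queue */
cudaMemcpyAsync(d_in, h_in, bytes, cudaMemcpyHostToDevice, stream);
my_kernel<<<grid, block, 0, stream>>>(d_out, d_in, n);

cudaStreamSynchronize(stream);  /* block until the stream has drained */
</ecode>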

Re:Single platform only (2, Informative)

mikael (484) | more than 5 years ago | (#24255467)

There are many parallel processing and networking APIs out there - both past and present - OpenMP, pthreads, CUDA, sockets, etc...

There is a proposal by Apple to create a common API for parallel processing (OpenCL) which would be cross-platform compatible. The Guardian has an article [guardian.co.uk] on this topic.

Re:Single platform only (1)

schwaang (667808) | more than 5 years ago | (#24260009)

Brook would also have been a nice candidate. It has already been used by another distributed computing project (Folding@home), and it supports multiple back-ends (including a multi-CPU one which actually works(*), an OpenGL one which works with most hardware, and AMD/ATI's CAL backend featured in their Brook+ fork).

Does Brook provide access like CUDA does to fast shared memory and registers vs. device memory vs. host memory?

(*) Unlike CUDA's device emulation mode, which is a ridiculous joke performance-wise.

Just to pick a nit, I'm pretty sure that the point of device emulation mode is ease of debugging, not performance.

On the whole I think we agree that it would be nice for programmers to have a non-proprietary and non-vendor-specific language to express parallel programs in. But at this early stage, with things still emerging, using CUDA directly seems to have some advantages.

More abstraction could be appreciated (2, Insightful)

DrYak (748999) | more than 5 years ago | (#24260327)

Does Brook provide access like CUDA does to fast shared memory and registers vs. device memory vs. host memory?

No. Being multi-platform to begin with, Brook exposes fewer details of the memory architecture underneath (because it can vary widely between platforms - like CPU vs. GPU - or not be exposed at all by the platform underneath, as with OpenGL).

But what it has is that data is represented by simple C-like arrays, and the compiler remaps them to cached, fast texture accesses. No weird "tex2D" functions, unlike CUDA - which I find odd in an architecture that is supposed to abstract and simplify GPGPU coding, especially when every other memory type in CUDA is accessed using C pointer math.

Probably, now that ATI's Brook+ is maturing, extra attributes on variable declarations could be introduced to give more influence over the memory organisation on that specific back-end.

CUDA is nice because it enables very low-level control over how memory is used, but this currently comes at the cost of syntax complexity.
It's interesting to note that both CUDA and Brook+ use matrix multiplication as an example of language usage. Brook+ simply explains how to partition the work to keep the data inside the fast cache; CUDA devotes a significant number of lines to moving data between several Hungarian-notation-prefixed pointers, which is a little more confusing.
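
For a flavour of it, the data shuffling in a CUDA-style tiled multiply looks roughly like this (a from-memory sketch assuming n is a multiple of the tile size, not the actual SDK sample):

<ecode>
#define TILE 16

__global__ void matmul(float *d_C, const float *d_A, const float *d_B, int n)
{
    /* staging buffers in fast on-chip shared memory */
    __shared__ float s_A[TILE][TILE];
    __shared__ float s_B[TILE][TILE];

    int row = blockIdx.y * TILE + threadIdx.y;
    int col = blockIdx.x * TILE + threadIdx.x;
    float acc = 0.0f;

    for (int t = 0; t < n / TILE; ++t) {
        /* hand-copy the next tiles from global to shared memory */
        s_A[threadIdx.y][threadIdx.x] = d_A[row * n + t * TILE + threadIdx.x];
        s_B[threadIdx.y][threadIdx.x] = d_B[(t * TILE + threadIdx.y) * n + col];
        __syncthreads();

        for (int k = 0; k < TILE; ++k)
            acc += s_A[threadIdx.y][k] * s_B[k][threadIdx.x];
        __syncthreads();
    }
    d_C[row * n + col] = acc;
}
</ecode>

In Brook+ the equivalent kernel just reads its inputs as plain streams and lets the compiler decide where the data lives.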

Just to pick a nit, I'm pretty sure that the point of device emulation mode is ease of debugging, not performance.

But to be debuggable, the code must at least be runnable. Sadly, the emulation is so slow that it can only run real-world complex algorithms on really small data sets, which might be corner cases, so you can miss bugs that only show up on larger data sets. Also, it always runs single-threaded, no matter how many cores are available in the system, which can hide concurrency problems (code works fine on the CPU but breaks on the GPU because a sync is missing somewhere).

It can be used to debug short matrix-operation algorithms, but it's very hard to debug more complex things like sequence analysis (and there are even a couple of teams trying to do parallelised antivirus scanning on the GPU).

But at this early stage, with things still emerging, using CUDA directly seems to have some advantages.

There are cases where the low-level-ness of CUDA definitely makes sense:
when developing code for purpose-built hardware. Say the lab you work in has built a machine with a couple of GeForces inside for your project (given the price of graphics cards and the performance increase between generations, it makes sense to throw in a couple of hundred bucks per card for a specific project when the performance need arises). CUDA makes sense there - even if it is ugly in places - because it'll let you squeeze the last possible cycle out of the hardware.

But for something that will run distributed across a huge number of home configurations, like "@home" distributed computing, adding an API that brings in additional architectures and is more abstract makes sense. Going with a single API roughly restricts the code to running on only half of the gaming population's machines.

Re:More abstraction could be appreciated (1)

krilli (303497) | more than 5 years ago | (#24261513)

Why don't you get cracking then and write a nice Brook BOINC?

Re:More abstraction could be appreciated (1)

DrYak (748999) | more than 5 years ago | (#24262699)

Why don't you get cracking then and write a nice Brook BOINC?

I actually *do* happen to write parallel applications using Brook for bioinformatics processing.
It just happens that the current application I'm being paid to develop doesn't use BOINC. Otherwise I would happily contribute.

Re:More abstraction could be appreciated (1)

schwaang (667808) | more than 5 years ago | (#24264085)

But for something that will run distributed across a huge number of home configurations, like "@home" distributed computing, adding an API that brings in additional architectures and is more abstract makes sense. Going with a single API roughly restricts the code to running on only half of the gaming population's machines.

If something like Brook could come *near enough* to generating optimal code for both NVIDIA and ATI cards, I'd agree with you whole-heartedly. I strongly suspect that this isn't the case.

Imagine if BOINC restricted you to writing i386 code because it runs on everything, but wasted the capabilities of i686, SSE2, etc.

I would think it would be better to write a CUDA-optimized client of your algorithm and a CAL-optimized client, and let BOINC feed work appropriately. I believe that's what F@H did w.r.t. various hardware architectures.

In the longer run I hope for the same utopia you do, where the strengths of each approach inform the final iteration of Brook or whatever succeeds it, and the back-end compilers do the hard work of optimizing for each architecture that programmers have to do today.

F@H and multiple back-ends (1)

DrYak (748999) | more than 5 years ago | (#24264853)

Yes, indeed, F@H sports quite a zoo of computation engines in order to squeeze as much performance as possible from as many clients as possible, including a client running on the PS3's Cell.

I agree that BOINC should support more than a single API - either adding CAL as you suggest (although it's rather low-level stuff) or adding Brook (which has a CAL backend; I would think that would be better, as it is much higher-level).

And you presume correctly: currently Brook only supports nVidia through the OpenGL/GLSL backend, which lacks some advanced features. There has been some discussion on forums about a CUDA backend for Brook, but the idea doesn't have enough followers, mainly because the most speed-critical optimisation (shared memory) won't be easy to implement automatically (in CUDA it's a voodoo art done manually by the coder).

Re:Single platform only (1)

krilli (303497) | more than 5 years ago | (#24261469)

CUDA is really easy to use - so easy that BOINC+CUDA got off the ground.

I don't see any cards other than NVIDIA's that are as effective, given cost, effectiveness and ease of programming.

"A handful of cards from a single constructor"? You can also say "Cheap, powerful cards available anywhere".

CUDA device emulation is only intended as a partial debugging tool.
