
Intel, NVIDIA Take Shots At CPU vs. GPU Performance

kdawson posted more than 4 years ago | from the army-boots dept.


MojoKid writes "In the past, NVIDIA has made many claims of how porting various types of applications to run on GPUs instead of CPUs can tremendously improve performance — by anywhere from 10x to 500x. Intel has remained relatively quiet on the issue until recently. The two companies fired shots this week in a pre-Independence Day fireworks show. The recent announcement that Intel's Larrabee core has been re-purposed as an HPC/scientific computing solution may be partially responsible for Intel ramping up an offensive against NVIDIA's claims regarding GPU computing."


129 comments



Re:AMD (1)

jimmydevice (699057) | more than 4 years ago | (#32708272)

Uh, Linus doesn't work for microsoft.

Re:AMD (1)

Lennie (16154) | more than 4 years ago | (#32708416)

The troll did have one point: the subject. Where is AMD/ATI in this article? Didn't they also have a product in that segment?

Re:AMD (4, Informative)

Rockoon (1252108) | more than 4 years ago | (#32708538)

..they have products in both segments.

..and for the record, AMD is still ruling the very high end multi-CPU (aka server) benchmarks [cpubenchmark.net] and of course, we all know that their GPUs are top notch.

AMD just isn't doing well in the high-end consumer-grade space, but then again the chips that Intel is ruling with in that segment are priced well above consumer budgets.

GPUs not top notch across the board... (1)

Junta (36770) | more than 4 years ago | (#32708634)

Evergreen had a *huge* lead over pre-Fermi nVidia chips, and still leads in 32-bit precision (and by extension most of what the mass market cares about), but its 64-bit precision lags Fermi. Of course, Evergreen beat Fermi to market by a large margin.

Re:GPUs not top notch across the board... (0)

Anonymous Coward | more than 4 years ago | (#32709868)

64-bit only marginally lags Fermi. It's pretty close.

Re:GPUs not top notch across the board... (0)

Anonymous Coward | more than 4 years ago | (#32710054)

Consumer-grade Fermi chips only run double precision at 1/8th the single-precision rate.
http://www.geeks3d.com/20100329/geforce-gtx-480-and-gtx-470-have-limited-double-precision-performance/
Evergreen chips run DP at 1/5th the SP rate, and Tesla chips run DP at 1/2 the SP rate.
So that gives you Fermi-based Tesla at 672 GFlops DP, the HD5870 at 544 GFlops DP and the GTX480 at 168 GFlops DP.

So, for the consumer-grade chips, ATI has almost 3.25x more DP power, based on the geeks3d article.

I couldn't find any information on dual-chip Teslas, so that crowns the HD5970 king with the most DP performance at 1088 GFlops DP, for a third the price of the Tesla chip.
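Those figures are just the single-precision peaks multiplied by the DP:SP ratios. A quick C++ sanity check (the SP peaks below are commonly published ballpark numbers, taken here as assumptions) reproduces the 168 and 544 GFlops values:

    #include <cstdio>

    int main() {
        // DP peak = SP peak x the DP:SP ratio quoted above.
        struct Part { const char* name; double sp_gflops; double dp_fraction; };
        const Part parts[] = {
            {"GeForce GTX 480 (consumer Fermi, 1/8 rate)", 1345.0, 1.0 / 8.0},
            {"Radeon HD 5870 (Evergreen, 1/5 rate)",       2720.0, 1.0 / 5.0},
        };
        for (const Part& p : parts)
            std::printf("%-45s ~%.0f GFlops DP\n", p.name, p.sp_gflops * p.dp_fraction);
        return 0;
    }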

Re:AMD (2, Insightful)

Joce640k (829181) | more than 4 years ago | (#32710056)

I don't think AMD really cares about competing with top-end Intel processors. It takes a lot of R&D investment for very little return (it's a tiny market segment).

In the low/mid range AMD rules the roost in terms of value for money.

Re:AMD (1)

cynyr (703126) | more than 4 years ago | (#32710362)

Look at the X6 BE chip: 6 cores, better performance than anything Intel has at the same price. It doesn't compete with Intel's 12 "cores" in 12-thread applications, but apart from video encodes (even that's iffy) you'll be hard pressed to find a 12-thread app that doesn't end up I/O bound, as a home user.

Re:AMD (1)

Rockoon (1252108) | more than 4 years ago | (#32710410)

Note that one of AMD's 12-core Opterons is cheaper than Intel's top-of-the-line "consumer grade" 4-core i7 extreme, and THAT wouldn't kick the snot out of any i7 in I/O

Re:AMD (0)

Anonymous Coward | more than 4 years ago | (#32710186)

There's something very misleading about this. I don't see any socket 1567 CPUs listed, and the highest-listed Intel quad-socket part is not listed on Intel's web site (http://ark.intel.com/ProductCollection.aspx?series=36934); if it were, it would be a 6-core/6-thread job. I don't know for a fact that the Intel 7560 (8c/16t @ 2.26 GHz; http://ark.intel.com/ProductCollection.aspx?series=46487) would be faster, but 64 threads seems >> 24 to me! It should at least be listed.

Re:AMD (-1, Troll)

TheReaperD (937405) | more than 4 years ago | (#32708310)

Wow. Whatever drugs your on, better lay off before they kill you or someone puts you out of our misery. You might also want professional help for your serious closet homosexuality.

To the rest of the Slashdot community: Sorry for feeding the trolls.

Re:AMD (0)

Anonymous Coward | more than 4 years ago | (#32708504)

Wow. Whatever drugs your on

That'd be "you're" or "your are", not "your".

HTH. HAND.

Re:AMD (0)

Anonymous Coward | more than 4 years ago | (#32708788)

That would be "you are", not "your are".

Re:AMD (0)

Anonymous Coward | more than 4 years ago | (#32708910)

This troll reminds me of that Dave Chappelle skit about the racist black guy.

Re:AMD (0)

Anonymous Coward | more than 4 years ago | (#32709140)

Truth is, blacks and hispanics are far more racist than whites.

Re:AMD (0)

Anonymous Coward | more than 4 years ago | (#32709728)

Wow. All that and no mention of Eric S. Raymonds bisexual wife and their girlfriend? Or is that to close to a heterosexual male fantasy for the OP to imagine?

Intel is gay (-1)

Anonymous Coward | more than 4 years ago | (#32708252)

Seriously.

Re:Intel is gay (0)

Anonymous Coward | more than 4 years ago | (#32708342)

Do you mean that Intel is homosexual? I'm not even sure what that would mean. It's attracted to others of the same gender? Seems a bizarre thing to say about a company.

It seems unlikely that you're using the old usage of happy. It would make some sort of sense if the company turned out to give its employees a positive experience from working there, but I don't see any evidence of that.

Or perhaps you simply consider it somehow reprehensible that a company makes semiconductors. Why do you have an aversion to this practice?

Re:Intel is gay (-1, Troll)

Anonymous Coward | more than 4 years ago | (#32708382)

Jeusus Christ, man. Use common sense. Intel was founded by, is operated by, and is populated with raging homosexuals. Ipso facto, Intel is gay.

first post! (4, Funny)

Dynetrekk (1607735) | more than 4 years ago | (#32708254)

I am now posting using my GPU. It's at least 50x faster!

Re:first post! (4, Informative)

LordKronos (470910) | more than 4 years ago | (#32708290)

Awesome. And now maybe you've learned a lesson. While the external processor was faster, sending your data over the bus to the external processor has an inherent delay in it. That's why your first post came in fourth.

Re:first post! (4, Funny)

TheLink (130905) | more than 4 years ago | (#32708370)

The other earlier posts however seem to suffer from some sort of processing or data corruption/error.

It depends? (5, Insightful)

aliquis (678370) | more than 4 years ago | (#32708284)

Isn't it like saying "Ferrari makes the fastest tractors!" (yeah, I know!), which may be true, as long as they can actually carry out the things you want to do.

I don't know about the limits of OpenCL/GPU-code (or architecture compared to regular CPUs/AMD64 functions, registers, cache, pipelines, what not), but I'm sure there's plenty and that someone will tell us.

Re:It depends? (5, Informative)

jawtheshark (198669) | more than 4 years ago | (#32708358)

Try Lamborghini next time... You do know that Mr Lamborghini originally made his money making tractors. The legend [wikipedia.org] says he wasn't satisfied with what Ferrari offered as sports cars and thus made one himself. Originally, Lamborghini is a tractor brand.... Not kidding. I think they still make them [lamborghini-tractors.com] ...

Re:It depends? (1)

aliquis (678370) | more than 4 years ago | (#32708560)

Yeah, I wondered which one it was, but I was somewhat too lazy to check, I guess. Maybe the story was that the Lamborghini guy decided he could do it too...

I only googled "ferrari tractor" to see if they had any (or whether it was Lamborghini), got a few tractor images, so I went with that.

So Lamborghini went super-cars and Ferrari tractors ("if they can beat us at cars, we for sure will show them with tractors!"? :D)

Sorry for messing up :)
http://www.ferrari-tractors.com/ [ferrari-tractors.com]

Re:It depends? (1)

internettoughguy (1478741) | more than 4 years ago | (#32712744)

There are also ferrari tractors [ferraritractors.co.nz] , unrelated to the sports car manufacturer though.

Re:It depends? (5, Informative)

Sycraft-fu (314770) | more than 4 years ago | (#32708422)

Basically, GPUs are stream processors. They are fast at tasks that meet the following criteria:

1) Your problem has to be more or less infinitely parallel. A modern GPU will have anywhere in the range of 128-512 parallel execution units, and of course you can have multiple GPUs. So it needs to be something that can be broken down into a lot of pieces.

2) Your problem needs to be floating point. GPUs push 32-bit floating point numbers really fast. The most recent ones can also do 64-bit FP numbers at half the speed. Anything older is pretty much 32-bit only. For the most part, count on single precision FP for good performance.

3) Your problem must fit within the RAM of the GPU. This varies, 512MB-1GB is common for consumer GPUs, 4GB is fairly easy to get for things like Teslas that are built for GPGPU. GPUs have extremely fast RAM connected to them, much faster than even system RAM. 100GB/sec+ is not uncommon. While a 16x PCIe bus is fast, it isn't that fast. So to get good performance, the problem needs to fit on the GPU. You can move data to and from the main memory (or disk) occasionally, but most of the crunching must happen on card.

4) Your problem needs to have relatively little branching, and when it does branch, the different threads need to branch the same way. GPUs handle branching, but not all that well; the performance penalty is pretty high. Also, generally speaking, a whole group of shaders has to branch the same way. So you need the sort of thing where, when the "else" is hit, it is hit for the entire group.

So, the more similar your problem is to that, the better GPUs work on it. 3D graphics would be an excellent example of something that meets that precisely, which is no surprise as that's what they are made for. The more you deviate from that, the less suited GPUs are. You can easily find tasks they are exceedingly slow at compared to CPUs.

Basically, modern CPUs tend to be quite good at everything. They have strong performance across the board, so no matter what the task, they can do it well. The downside is they are unspecialized; they excel at nothing. The other end of the spectrum is an ASIC, a circuit designed for one and only one thing. That kind of thing can be extremely efficient. Something like a gigabit switch ASIC is a great example: you can have a tiny chip that draws a couple of watts and yet switches 50+ Gbit/sec of traffic. However, that ASIC can only do its one task, with no programmability. GPUs are something of a hybrid. They are fully programmable, but they are specialized to a given field. As such, at the tasks they are good at, they are extremely fast. At the tasks they are not, they are extremely slow.
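To make criteria 1 and 4 concrete, here is a hypothetical CPU-side sketch of the two loop shapes involved; the first hands out independent, branch-free, single-precision work, the second has a loop-carried dependency and so gives hundreds of execution units nothing to chew on:

    #include <cstddef>
    #include <vector>

    // Shape that suits a GPU: every element is independent, single precision,
    // and there is no data-dependent branching (criteria 1, 2 and 4 above).
    void saxpy(float a, const std::vector<float>& x, std::vector<float>& y) {
        for (std::size_t i = 0; i < x.size(); ++i)
            y[i] = a * x[i] + y[i];          // element i never looks at element i-1
    }

    // Shape that does not: each step depends on the previous result, so there is
    // nothing to hand out to hundreds of execution units in parallel.
    float running_filter(const std::vector<float>& x) {
        float acc = 0.0f;
        for (std::size_t i = 0; i < x.size(); ++i)
            acc = 0.5f * acc + x[i];         // loop-carried dependency
        return acc;
    }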

Re:It depends? (1)

antifoidulus (807088) | more than 4 years ago | (#32708506)

I wonder if competition from GPUs will influence Intel to beef up the vector processing capabilities of its chips. Currently Intel's SSE is pretty weak, especially when you compare it to competitors like Altivec. Unfortunately, outside of Cell there aren't a whole lot of CPUs nowadays that feature Altivec....

Re:It depends? (4, Insightful)

rahvin112 (446269) | more than 4 years ago | (#32708598)

It is not a secret (it's a stated goal on both Intel's and AMD's roadmaps) that GPU-like programmable FP will be integrated into the FP units of the general-purpose processor. The likely result will be the same general-purpose CPU you love, but with dozens of additional FP units that excel at the kind of mathematics the parent described, except more flexible. When the Fusion-esque products ramp and GPGPU functionality is integrated into the CPU, Nvidia is out of business. Oh, I don't expect these Fusion products to have great GPUs, but once you destroy the low-end and mid-range graphics marketplace there is very little money left to fund R&D (3dfx was the first one into the high-end 3D market and they barely broke even on their first sales; the only reason they survived was that they were heavy in arcade-sector sales). If Nvidia hasn't been allowed to purchase Via's x86 license by that point, they are quite frankly out of business. Not immediately, of course: they will spend a few years evaporating all assets while they try to compete with only the high-end marketplace, but in the end they won't survive. Things go in cycles, and the independent graphics chip cycle is going to end very shortly; maybe in a decade it will come back, but I'm skeptical. CPUs have exceeded the speed needed for 80% of the tasks out there.

When I first started my career, computer runs of my design work took about 5-30 minutes at bare-minimum quality. These days I can exceed that bare minimum by 20 times and the run takes seconds. It's to the point where I can model with far more precision than the end product needs with almost no time penalty. In fact, additional CPU speed at this point is almost meaningless, and my business isn't alone in this. In fact, most of the software in my business is single-threaded (and the apps run that fast with single threads). Once the software is multi-threaded there is really no additional CPU power needed, and it may come to the point where my business just stops upgrading hardware beyond what's needed to replace failures. I just don't see a future for independent graphics chip/card producers.

Re:It depends? (1)

santiagodraco (1254708) | more than 4 years ago | (#32709084)

They are called, specifically, FPUs, not FPs.

As for the cpu guys putting the gpu guys out of business... we know how successful Intel has been trying to do just that with their GPU offerings... you expect that to change in the next, say, 10 years? Not likely given their past track record of failure.

Patent holding company (0)

Anonymous Coward | more than 4 years ago | (#32709758)

So what you're saying is nVidia will become a patent holding company and probably make just as much money as they're making now.

Re:It depends? (1)

hairyfeet (841228) | more than 4 years ago | (#32710200)

If your theory were true, why hasn't it already happened? Both AMD and Nvidia have been putting pretty nice GPUs on motherboards for quite a while, yet we still have discrete cards. Why? There is a good reason: for the most basic office tasks, even two- or three-year-old gaming, the onboard chips work fine. I myself played Bioshock I and SWAT 4 on my onboard with no trouble.

But for anything where you care even a little bit about REAL performance, the onboards (and I don't care if we are talking onboard or on-die) simply won't have a chance. You just can't put hundreds of MB or even GBs of RAM onto the die. And honestly the amount of power you get in discrete cards for even the low end makes them a terrific buy. My 4650HD with 1GB of RAM cost a grand total of $36 after MIR, plays all the games I care about, and does wonderfully well at transcoding and HD.

So while you think discrete GPUs are gonna die, I'd say the opposite is true. I think the onboards will be used in machines where price trumps everything, such as bottom of the line netbooks and Walmart/Best Buy "specials", whereas for everything else since HD and games like Warcrack will continue to be popular and thus selling points discretes will bring in the "wow" factor and help OEMs to differentiate their products. And thanks to PCIe being standard on just about every board being made or sold in the last few years even those that get a Best Buy Special can bring it to someone like me and have an upgrade on the cheap.

Although I do agree on one point you made: Nvidia. They need to buy out Via and they need to do it yesterday. If Nvidia doesn't prevail in an antitrust against Intel over the new socket they are gonna be dead in the water, as both AMD and Intel can offer "top to bottom" solutions (although I would call an Intel GPU a problem not a solution) and Nvidia simply doesn't have anything to compete. Both AMD and Intel get paid twice for every board sold, getting $$$ for both the GPU and the CPU, this extra cash will allow that much more R&D that Nvidia won't be able to afford. Finally from what I've seen Fermi is a GP/GPU and NOT a gamer chip, and cranks out waaaaay too much heat to boot. With each generation AMD seems to be getting better with power usage with the second and third gen of a chip using much less than the first, while Nvidia seems to be stuck in "Netburst" mode. With the focus on green computing now is not the time to be building space heaters, and considering the time it will take to combine CPU+GPU Nvidia needs to be seriously trying to pick up Via. If they don't I predict they will end up being bought out by Intel.

Re:It depends? (0)

Anonymous Coward | more than 4 years ago | (#32710234)

Nah. NVIDIA will continue to exist because there will still be a large market for dedicated GPUs for years to come. Cell phones, game consoles, and all sorts of other electronic devices will still want access to dedicated processors for the very types of problems that you describe. Not to mention large-scale research that is now being done on NVIDIA cards.

As a software engineer in the printing industry, I can tell you that our interest in dedicated GPUs for large-scale printing is increasing, not decreasing, and we are buying the kind of cards you think NVIDIA will only have left.

Re:It depends? (0)

Anonymous Coward | more than 4 years ago | (#32710388)

"I don't see a need for it in my business therefore I don't see why anybody else should have a need for it either"

Overall I think you are right (1)

Sycraft-fu (314770) | more than 4 years ago | (#32712346)

But I think the timescale will be a very long one.

I mean ideally, we want only the CPU in a computer. The whole idea of a computer is that it does everything, rather than having dedicated devices. Ideally that means that it does everything purely in software, that the CPU is all it needs. For everything else, we seem to have reached that point but graphics are still too intense. Have to have a dedicated DSP for them.

However, we'll keep wanting that until the CPU can do photorealistic graphics in realtime. That is a long way off yet. Even GPUs can't do that. Once GPUs can, the trick is then being able to scale that down to become a realistic subset of the CPU, rather than a dedicated unit. You can't very well scale CPUs up to massive sizes and power consumptions.

So I've no doubt it'll happen, but I think not for 20+ years.

Re:It depends? (1)

Bengie (1121981) | more than 4 years ago | (#32709020)

The new AVX SIMD is coming out soon. The first set of 256-bit registers is supposed to be 2x as fast as SSE, and the later 512-bit and 1024-bit AVX are supposed to be another ~2-4x faster than the 256-bit version. I guess one of the benefits of AVX is that the new register sizes are supposed to give transparent speed increases, so a program made for 256-bit AVX will automatically see faster calculations when the new 512-bit AVX registers come out. Sounds good to me. They're supposed to be 3-operand instructions.

Re:It depends? (1)

robthebloke (1308483) | more than 4 years ago | (#32711472)

I guess one of the benefits of AVX is that the new register sizes are supposed to give transparent speed increases, so a program made for 256-bit AVX will automatically see faster calculations when the new 512-bit AVX registers come out.

Afraid not (well, there are ways if you are willing to litter your code with C++ templates). Yes the instructions will process 8 floats, however you're only going to see some nice linear speed up if you are already using SOA data structures. For a lot of the 'traditional' SSE code you'll tend to see (i.e. AOS vector3/matrix classes etc), the AVX instructions will be of little use. In effect they duplicate all SSE->SSE4.1 for 256 register types. i.e. SSE has _mm_add_ps, AVX has _mm256_add_ps, and any new 512 bit instruction will be _mm512_add_ps (which incidentally is one of the larrabee instructions). You'll have to modify just as much code porting SSE to AVX as you will porting the 256bit AVX instructions to 512bit AVX ones. The only advantage is that we know what the new 512 bit instructions will look like, and can plan for the future!

I've already had a crack at porting a fair amount of existing code to AVX (Intel compiler comes with an emulator - I don't have any hardware obviously!). For code optimised in an SOA layout, coding is going to be quite fun. If you have a load of Vector3/Matrix type classes, then you aren't going to get much performance benefit. The nicest thing about AVX for that kind of code is that it allows you to convert to double, perform the operation, convert back to floats, and the resulting code should run at about the same speed as the SSE equivalent..... (a touch slower, but not enough to care....)
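A minimal sketch of that porting point (the two helper functions are hypothetical, n is assumed to be a multiple of the vector width, and AVX must be enabled at compile time): the intrinsic names and the loop stride both change, so nothing widens automatically.

    #include <immintrin.h>

    // Same element-wise add written twice.  Going from SSE to AVX changes the
    // intrinsic names and the loop stride; an existing SSE routine is not
    // re-vectorised for free, which is the parent's point.
    void add_sse(const float* a, const float* b, float* out, int n) {
        for (int i = 0; i < n; i += 4) {     // 4 floats per 128-bit register
            _mm_storeu_ps(out + i,
                          _mm_add_ps(_mm_loadu_ps(a + i), _mm_loadu_ps(b + i)));
        }
    }

    void add_avx(const float* a, const float* b, float* out, int n) {
        for (int i = 0; i < n; i += 8) {     // 8 floats per 256-bit register
            _mm256_storeu_ps(out + i,
                             _mm256_add_ps(_mm256_loadu_ps(a + i), _mm256_loadu_ps(b + i)));
        }
    }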

Re:It depends? (1)

g4b (956118) | more than 4 years ago | (#32708530)

maybe GPUs can solve life then. Real Life Problems meet most of the criteria. infinite amounts all at the same time, many numbers floating, small description size, tends to branch in an endless tree of solutions never to be achieved...

Re:It depends? (4, Informative)

JanneM (7445) | more than 4 years ago | (#32708570)

"So to get good performance, the problem needs to fit on the GPU. You can move data to and from the main memory (or disk) occasionally, but most of the crunching must happen on card."

From what I have seen when people use GPUs for HPC, this, more often than anything else, is the limiting factor. The actual calculations are plenty fast, but the need to format your data for the GPU, send it, then do the same in reverse for the result really limits the practical gain you get.

I'm not saying it's useless or anything - far from it - but this issue is as important as the actual processing you want to do for determining what kind of gain you'll see from such an approach.
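Some rough numbers behind that, using assumed ballpark bandwidths rather than measurements: moving the data over PCIe costs about as much time as streaming through it a dozen or more times on the card, so the kernel has to do a lot of arithmetic per transfer to come out ahead.

    #include <cstdio>

    int main() {
        const double data_gb       = 1.0;    // problem size, GB
        const double pcie_gb_s     = 8.0;    // ~16x PCIe 2.0, practical (assumed)
        const double card_ram_gb_s = 150.0;  // high-end GDDR5 board (assumed)

        const double copy_in_s  = data_gb / pcie_gb_s;      // host -> card
        const double one_pass_s = data_gb / card_ram_gb_s;  // one read of the data on-card

        std::printf("copy to card : %.0f ms\n", copy_in_s * 1e3);
        std::printf("one pass     : %.1f ms\n", one_pass_s * 1e3);
        std::printf("the copy costs roughly %.0f on-card passes over the same data\n",
                    copy_in_s / one_pass_s);
        return 0;
    }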

That's the big draw of the Teslas (2, Informative)

Sycraft-fu (314770) | more than 4 years ago | (#32712310)

I mean, when you get down to it, they seem really overpriced. No video output, their processor isn't anything faster, so what's the big deal? The big deal is that 4x the RAM can really speed shit up.

Unfortunately there are very hard limits to how much RAM they can put on a card. This is both because of the memory controllers, and because of electrical considerations. So you aren't going to see a 128GB GPU or the like any time soon.

Most of our researchers that do that kind of thing use only Teslas because of the need for more RAM. As you said, the transfer is the limiting factor. More RAM means you have to shuffle data back and forth less often.

Re:That's the big draw of the Teslas (1)

JanneM (7445) | more than 4 years ago | (#32712684)

The problem is when you have a larger system, with hundreds of cores, and an iterative simulation. You run the system for a cycle, propagate data, then run for another cycle and so on. In that case you can't isolate a long-running process on the card, and you end up having to squeeze data through that bus for each cycle anyway. It is likely still well worth using GPUs, but you do need to take a good look at whether adding GPUs is more or less effective than using your funds to simply add more cores instead.

I expect over time to see better suited interfaces appear for this type of computing.

Re:It depends? (1, Interesting)

Anonymous Coward | more than 4 years ago | (#32708608)

That is an excellent post, with the exception of this little bit

GPUs have extremely fast RAM connected to them, much faster than even system RAM

I'd like to see a citation for that little bit of trivia... the specific type & speed of RAM on a board with a GPU varies based on model and manufacturer. Cheaper boards use slower RAM, the more expensive ones use higher-end stuff. I haven't seen ANY GPUs that came with on-board RAM that is any different from what you can mount as normal system RAM, however.

Not trolling, I just wanted to point out a serious flaw in what is an otherwise great post.

Re:It depends? (3, Informative)

pnewhook (788591) | more than 4 years ago | (#32708700)

GPUs have extremely fast RAM connected to them, much faster than even system RAM

I'd like to see a citation for that little bit of trivia

Ok, so my Geforce GTX480 has GDDR5 ( http://www.nvidia.com/object/product_geforce_gtx_480_us.html [nvidia.com] ) which is based on DDR3 ( http://en.wikipedia.org/wiki/GDDR5 [wikipedia.org] )

My memory bandwidth on the GTX480 is 177 GB/sec. The fastest DDR3 module is PC3-17000 ( http://en.wikipedia.org/wiki/DDR3_SDRAM [wikipedia.org] ), which gives approx 17000 MB/s, which is approx 17 GB/sec. So my graphics RAM is basically 10x faster than system RAM, as it should be.

Re:It depends? (1)

Kjella (173770) | more than 4 years ago | (#32710218)

My memory bandwidth on the GTX480 is 177 GB/sec. The fastest DDR3 module is PC3-17000 ( http://en.wikipedia.org/wiki/DDR3_SDRAM [wikipedia.org] ) which gives approx 17000 MB/s which is approx 17GB/sec.

And the high-end CPUs have, as far as I know, triple-channel memory now, so a total of 51 GB/s. Not sure how valid that comparison is, but graphics cards tend to get their fill rate from having a much wider memory bus - the GTX480 has a 384-bit wide bus - rather than that much faster memory, so it's probably not too far off. If CPUs move towards doing GPU-like work, which can be loaded in wider chunks, they'll probably move towards a wider bus too.

Re:It depends? (1)

pnewhook (788591) | more than 4 years ago | (#32711776)

Width is part of it, but it's also clock rate. The fastest overclocked DDR3 will go to 2.5 GHz. The stock GeForce 480 is 3.7 GHz. At those rates the bus length gets to be an issue. The memory on a graphics card can be kept very close to the chip. On a PC the memory, for practical reasons, has to be set farther away, resulting in necessarily slower clocks and data rates.

The 51 GB/sec you mention is definitely overclocked. I've not seen stock memory that fast. Even so, it's still less than a third the rate of the graphics memory.
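All the peak figures in this sub-thread fall out of the same formula: bus width times effective transfer rate. A quick check (device numbers are published specs, used here as assumptions) reproduces the 177, ~17 and ~51 GB/s values above:

    #include <cstdio>

    // Peak bandwidth (GB/s) = (bus width in bytes) x (effective transfer rate in GT/s).
    double peak_gb_s(int bus_bits, double effective_gt_s) {
        return (bus_bits / 8.0) * effective_gt_s;
    }

    int main() {
        // GTX 480: 384-bit bus, GDDR5 at ~0.924 GHz, four transfers per clock.
        std::printf("GTX 480               : %6.1f GB/s\n", peak_gb_s(384, 4 * 0.924));
        // DDR3-2133 (PC3-17000): one 64-bit channel at 2.133 GT/s.
        std::printf("DDR3-2133, 1 channel  : %6.1f GB/s\n", peak_gb_s(64, 2.133));
        // The triple-channel figure mentioned above: three 64-bit channels.
        std::printf("DDR3-2133, 3 channels : %6.1f GB/s\n", peak_gb_s(3 * 64, 2.133));
        return 0;
    }

The gap comes mostly from the much wider bus and partly from the higher effective transfer rate.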

Re:It depends? (0)

Anonymous Coward | more than 4 years ago | (#32711822)

GTX480? You mean the Pentium 4 of the graphics card world?

Re:It depends? (1)

pnewhook (788591) | more than 4 years ago | (#32711896)

No, I mean the fastest consumer card available right now.

Re:It depends? (2, Interesting)

Spatial (1235392) | more than 4 years ago | (#32708846)

I haven't seen ANY GPU's that came with on-board RAM that is any different than what you can mount as normal system RAM, however.

You haven't been looking very hard. Most GPUs have GDDR3 or GDDR5 running at very high frequencies.

My system for example:
Main memory: DDR2 400 MHz, 64-bit bus. 6,400 MB/sec max.
GPU memory: GDDR3 1050 MHz, 448-bit bus. 117,600 MB/sec max.

Maybe double the DDR2 figure since it's in dual-channel mode. I'm not sure, but it hardly makes much of a difference in contrast. :)

That isn't even exceptional by the way. I have a fairly mainstream GPU, the GTX 260 c216. High-end cards like the HD5870 and GTX 480 are capable of pushing more than 158,000 and 177,000 MB/sec respectively.

Re:It depends? (2, Interesting)

somenickname (1270442) | more than 4 years ago | (#32708716)

That's a very good breakdown of what you need to benefit from GPU based computing but, really, only #1 has any relevance vs. an x86 chip.

#2) Yes, an x86 chip will have a high clock speed but, unless you can use SSE instructions, x86 is crazy slow. Also, most (if not all) architectures will give you half the flops for using the double precision vector instructions vs. the single precision ones.

#3) This is a problem with CPUs as well except, as you point out, the memory is much slower. Performance is often about hiding latency. You don't need your problem to fit in the L2/L3 cache of a CPU, but, if the compiler/programmer/CPU can prefetch things into L2/L3 before it's accessed, it's a huge win. The same goes for having things in GPU memory before it's needed. The difference is that the GPU has a TON of memory compared to an L2/L3 cache.

#4) You might be right here. I know that with hyperthreading a CPU will yield to another "thread" when it mispredicts a branch. However, the fact that branch misprediction is a condition in which the CPU will switch to another thread, to me, means that mispredicting a branch on an x86 CPU is also a fairly expensive thing to do. Maybe not as expensive as on a GPU but, expensive nonetheless.

I suppose it all comes down to what kind of problem you are trying to compute but, if you can make your problem work in a way that is pleasing to #1, using a GPU is probably going to be a win.

Re:It depends? (1)

Tacvek (948259) | more than 4 years ago | (#32709180)

The GPUs are definitely worse than CPUs at branching.

If your code splits into 8 different code paths at one point due to branching, your performance can be as bad as 1/8th of the maximum, since rather than do anything remotely like actual branching, some GPUs just interleave the code of the different branches, with each instruction tagged as to which branch it belongs to. So if the unit is processing an instruction for a branch it is not on, it just sits there doing nothing for one instruction cycle. This type of design may also have a depth limit on branching, so eight simultaneous code branches may not even be possible.

So the CPU performance only significantly degrades if a branch is mispredicted, while many GPU designs have performance suffer for every branch, even if it could have been accurately predicted.
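A toy model of that interleaving, with purely illustrative numbers: once a lock-step group splits across k distinct paths, every path is issued to the whole group, so effective throughput drops by roughly a factor of k.

    #include <cstdio>

    int main() {
        const int body_cost = 20;   // instructions per branch body (assumed equal)
        for (int paths = 1; paths <= 8; ++paths) {
            // Every distinct path taken within the group is issued to all lanes;
            // lanes on the "wrong" path just sit out those cycles.
            const int issued = paths * body_cost;
            std::printf("%d path(s) in the group -> %dx the issue slots of a uniform branch\n",
                        paths, issued / body_cost);
        }
        return 0;
    }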

elementary matrix ops (0)

Anonymous Coward | more than 4 years ago | (#32709226)

I wonder if matrix inversion could be done with an ASIC with a massive performance improvement over typical CPUs? I'm thinking of hardware that is designed to natively describe very large (sparse?) matrices efficiently, and perform elementary matrix ops on these matrices.

Is this possible? Can you think of a way of implementing this, in terms of actual transistor logic?

Re:It depends? (1)

Elbows (208758) | more than 4 years ago | (#32709248)

The other big factor (the biggest in most of the GPU code I've written) is your pattern of memory access. Most GPUs have no cache so access to memory has very high latency even though the bandwidth is excellent. The card will hide this latency to some extent through clever scheduling; and if all your threads are accessing adjacent memory, it will coalesce that into one big read/write. But GPUs do best on problems where the ratio of arithmetic to memory access is high, and your data can hang around in registers for a while.

I've found that in general GPU code has to be written much more carefully if you want good performance. On a regular CPU, if you pick a decent algorithm and pay attention to cache locality, you can usually rely on the compiler for the low-level optimizations and get pretty close to peak performance. On the GPU you have to pay very close attention to memory access patterns, register usage, and other low-level hardware details -- screwing up any of those parts can easily cost you a factor of 10 in performance.

This is starting to change, though -- the newest chips from nvidia have an L1 cache, and relax some of the other restrictions.
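A hypothetical CPU-side sketch of the two access patterns being described: in the first, neighbouring "threads" of a group touch neighbouring addresses, which is what lets the hardware coalesce the group into one wide transaction; in the second, a stride scatters the group across memory and each lane costs its own transaction.

    #include <cstddef>
    #include <vector>

    // Coalesced pattern: thread t of a group reads element base + t.
    float group_sum_coalesced(const std::vector<float>& a,
                              std::size_t base, std::size_t group_size) {
        float s = 0.0f;
        for (std::size_t t = 0; t < group_size; ++t)
            s += a[base + t];               // adjacent lanes, adjacent addresses
        return s;
    }

    // Strided pattern: thread t reads element base + t * stride.
    // (Caller is assumed to size the vector so the indices stay in bounds.)
    float group_sum_strided(const std::vector<float>& a, std::size_t base,
                            std::size_t group_size, std::size_t stride) {
        float s = 0.0f;
        for (std::size_t t = 0; t < group_size; ++t)
            s += a[base + t * stride];      // lanes land far apart in memory
        return s;
    }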

Re:It depends? (1)

Jah-Wren Ryel (80510) | more than 4 years ago | (#32710986)

2) Your problem needs to be floating point. GPUs push 32-bit floating point numbers really fast. The most recent ones can also do 64-bit FP numbers at half the speed. Anything older is pretty much 32-bit only. For the most part, count on single precision FP for good performance.

That requirement is not necessarily true. Or at least not in the traditional sense of 'floating point.' GPUs make awesome pattern-matchers for data that isn't necessarily floating point.

Elcomsoft (of Adobe DRM international arrest fame) has a GPU-accelerated password cracker [elcomsoft.com] that is essentially a massively parallel dictionary attack.

A number of anti-virus vendors have GPU accelerated scanners - like Kaspersky. [theinquirer.net]

And some people have been working with GPUs for network monitoring via packet analysis [ieee.org] too.

Re:It depends? (1)

kramulous (977841) | more than 4 years ago | (#32711960)

Some of the examples used in the CUDA SDK are phoney. The Sobel one can be made to run faster on the CPU - provided you use the Intel compilers and performance primitives and can parallelise.

It doesn't surprise me. There is an example of Sobel for FPGAs that touts much faster execution times, but when you examine the code, the FPGA version has algorithmic optimisations that were 'left out' of the CPU version. Again, it can be made to run faster on the CPU.

I'm not saying that GPUs are crap. For the right problem, they can be really good. It is just that they are nowhere near the magic bullet the NVidia PR machine says they are.

Re:It depends? (1)

jesset77 (759149) | more than 4 years ago | (#32712142)

Blah, why can't I get a good GPU accelerated Mandelbrot set viewer, then? z = z^2 + c meets all your criteria great, dun it? :P
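For what it's worth, the per-pixel escape-time loop really is the friendly shape described upthread: independent, single-precision, and light on divergent branching. GPU-accelerated fractal viewers do exist; this is just the scalar per-pixel kernel as a sketch.

    #include <complex>
    #include <cstdio>

    // Escape-time iteration for one pixel: z = z^2 + c.  Every pixel is an
    // independent single-precision computation, so an image is just this
    // function evaluated once per pixel.
    int mandelbrot_iters(std::complex<float> c, int max_iters = 256) {
        std::complex<float> z = 0.0f;
        int i = 0;
        while (i < max_iters && std::norm(z) <= 4.0f) {  // |z|^2 <= 4  <=>  |z| <= 2
            z = z * z + c;
            ++i;
        }
        return i;
    }

    int main() {
        std::printf("iterations at c = -0.5+0.5i: %d\n",
                    mandelbrot_iters(std::complex<float>(-0.5f, 0.5f)));
        return 0;
    }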

Re:It depends? (0)

Anonymous Coward | more than 4 years ago | (#32712398)

All 4 of your points are quite naive...

1) Your problem does NOT need to be infinitely parallel to gain speedups. Most stream processors (be it a GPU, Cell, DSP, whatever) are clocked at 1/4 to 1/2 of your consumer-level CPU - with far faster cache/memory interfaces. So at worst you need to be able to split your workload into 2-4 workloads that can run on each execution unit to have your code 'execute' as fast as a CPU (more is generally faster, depending on memory transactions, cache read/write latency, etc.) - note: 'execute' does not mean 'complete' faster (latency can cause a single execution unit to not be 'ready' for a while).

2) Most modern stream processors (nVidia GPUs, DSPs, Cell) can do 32-bit and 64-bit integer and FP mathematics as fast as a CPU, and in many cases 32/64-bit FP faster than a CPU - so again, as long as you spread your workload over the 2-4 execution units you'll be 'on par' with a single CPU core.

3) Most workloads aren't as simple as running a tiny piece of code over 2-4 execution units for a few hundred microseconds; most workloads take many milliseconds overall while individual execution units take a few microseconds. In this case many stream processors are capable of concurrently executing code while also (either half or full duplex, depending) transferring memory to/from the GPU - in many cases it's very possible to execute code over many gigabytes of data without incurring any obvious memory transfer overheads (see the host-side sketch after this list). The obvious limitation here is if your workload is extremely memory-bound and you need so much data that you reach the limits of PCI Express bus bandwidth.

4) There are so many parallel programming paradigms for avoiding branching it's not funny. I agree that if you naively copy/paste code written for a CPU and try to run it on a stream processor, you'll have major issues. But if you actually write an algorithm from the ground up with the execution and scheduling model of a stream processor in mind, you'll find you don't really branch much, if at all (at the cost of a few extra registers or cache).
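
A hedged host-side sketch of the overlap described in point 3, using CUDA streams: the kernel, chunk count and sizes are placeholders, but the shape is real - each chunk's upload, kernel and download queue in their own stream, so copies for one chunk overlap compute on another.

    #include <cuda_runtime.h>

    // Placeholder kernel: scales each element of its chunk.
    __global__ void process_chunk(const float *in, float *out, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) out[i] = in[i] * 2.0f;
    }

    int main()
    {
        const int kChunks        = 4;
        const int chunk_elems    = 1 << 20;                 // 1M floats per chunk
        const size_t chunk_bytes = chunk_elems * sizeof(float);
        const int total_elems    = kChunks * chunk_elems;

        float *h_in, *h_out, *d_in, *d_out;
        cudaHostAlloc((void **)&h_in,  total_elems * sizeof(float), cudaHostAllocDefault);
        cudaHostAlloc((void **)&h_out, total_elems * sizeof(float), cudaHostAllocDefault);
        cudaMalloc((void **)&d_in,  total_elems * sizeof(float));
        cudaMalloc((void **)&d_out, total_elems * sizeof(float));
        for (int i = 0; i < total_elems; ++i) h_in[i] = (float)i;

        cudaStream_t streams[kChunks];
        for (int i = 0; i < kChunks; ++i) cudaStreamCreate(&streams[i]);

        // Upload, compute and download for each chunk go into that chunk's
        // stream, so the copy engine and the compute units stay busy together.
        for (int i = 0; i < kChunks; ++i) {
            size_t off = (size_t)i * chunk_elems;
            cudaMemcpyAsync(d_in + off, h_in + off, chunk_bytes,
                            cudaMemcpyHostToDevice, streams[i]);
            process_chunk<<<(chunk_elems + 255) / 256, 256, 0, streams[i]>>>(
                d_in + off, d_out + off, chunk_elems);
            cudaMemcpyAsync(h_out + off, d_out + off, chunk_bytes,
                            cudaMemcpyDeviceToHost, streams[i]);
        }
        cudaDeviceSynchronize();    // wait for every stream to drain

        for (int i = 0; i < kChunks; ++i) cudaStreamDestroy(streams[i]);
        cudaFree(d_in); cudaFree(d_out);
        cudaFreeHost(h_in); cudaFreeHost(h_out);
        return 0;
    }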

You lazy fuckers (5, Interesting)

drinkypoo (153816) | more than 4 years ago | (#32708372)

I don't expect slashdot "editors" to actually edit, but could you at least link to the most applicable past story on the subject [slashdot.org] ? It's almost like you people don't care if slashdot appears at all competent. Snicker.

Re:You lazy fuckers (1)

jgardia (985157) | more than 4 years ago | (#32708392)

s/editors/kdawson/g

Re:You lazy fuckers (0)

Anonymous Coward | more than 4 years ago | (#32708518)

s/kdawson/monkey_with_typewriter/g

If you are going to substitute, at least substitute with something better.

Re:You lazy fuckers (1)

perryizgr8 (1370173) | more than 4 years ago | (#32708972)

what does this mean? totally lost, i am.

Re:You lazy fuckers (0, Offtopic)

Entropius (188861) | more than 4 years ago | (#32709096)

So, once upon a time, there was this text editor called vi.

To make it do shit you type in cryptic commands. The one for search-and-replace is s, followed by a slash, followed by the thing you want to search for, followed by another slash, followed by the thing you want to replace it with. Because of more arcana, this will only happen once per line unless you put a g after it.

So s/cat/dog/g means "replace all occurrences of cat with dog".

Incidentally, you also have to tell vi in what range it should do this operation. So you get cryptic commands like :1,$s/cat/dog/g

Re:You lazy fuckers (1)

3.1415926535 (243140) | more than 4 years ago | (#32709208)

I think that predates vi. Good ol' "ed", the line editor, has the s/foo/bar/g command.

Re:You lazy fuckers (1)

cynyr (703126) | more than 4 years ago | (#32710310)

sed has the same syntax as well.

AMD (5, Funny)

MadGeek007 (1332293) | more than 4 years ago | (#32708386)

AMD must feel very conflicted...

Re:AMD (1)

TheGryphon (1826062) | more than 4 years ago | (#32708472)

"daddy, what's AMD?" ... "well son, its that company that tried to keep doing everything at once and died."

Except that.. (1)

Junta (36770) | more than 4 years ago | (#32708686)

Magny-Cours is currently showing a significant performance advantage over Intel's offerings, while AMD's Evergreen *mostly* shows performance advantages over nVidia's Fermi despite having made it to market ahead of Fermi.

AMD is currently providing the best tech on the market. This will likely change, but at the moment, things look good for them.

Re:Except that.. (1)

Entropius (188861) | more than 4 years ago | (#32709102)

I just got back from a lattice QCD conference, and there were lots of talks on GPGPU. Everybody's falling over each other trying to implement their code on GPUs because of the performance gains.

*Every* talk mentioned Nvidia cards -- Geforce GTX nnn's, Tesla boards, Fermi boards. Nobody talked about AMD at all.

Maybe AMD does have an advantage, but nobody's using it.

Re:Except that.. (1)

hvdh (1447205) | more than 4 years ago | (#32709730)

Interestingly, most scientific papers talking about large speed gains (factor 2..10) by going from CPU to GPU computation compare a hand-optimized GPU implementation to a plain single-threaded non-SSE CPU implementation.

From my experience, using SSE intrinsics gives a speed-up of 2..8 versus good generic code, and multi-threading gives more improvement until one hits the RAM bandwidth wall.

Re:Except that.. (1)

raftpeople (844215) | more than 4 years ago | (#32710548)

For those problems that map well to the GPU model of processing, the gains can be enormous (I have ported code to NVIDIA). However, some of my code works better on the CPU and some of it really needs a middle ground of many traditional cores with good branching support, etc. and not as many streaming cores all doing the same thing.

Re:Except that.. (0)

Anonymous Coward | more than 4 years ago | (#32711986)

nVidia bet the farm on GPGPU. The Fermi is nothing but 512 stream processing units (mini dumb CPUs).

Maybe AMD does have an advantage, but nobody's using it.

Whilst AMD's hardware doesn't emphasise GPGPU stuff as much, they do have the better video cards. (nVidia's GTX 4xx series has horrible power usage and is universally slower than AMD's equivalent price-point cards on any and all 3D graphics tasks.)

Re:Except that.. (1)

Ken_g6 (775014) | more than 4 years ago | (#32712176)

*Every* talk mentioned Nvidia cards -- Geforce GTX nnn's, Tesla boards, Fermi boards. Nobody talked about AMD at all.

Maybe AMD does have an advantage, but nobody's using it.

That's because nVIDIA has excellent support, both on Windows and Linux, and documentation for their CUDA GPGPU system. They even have an emulator so people without an nVIDIA GPU can develop for one. (Although it's now deprecated.)

On the other hand, AMD has CAL, Stream, and OpenCL; and I can't even figure out which one I'm supposed to use to support all GPGPU-capable AMD cards. OpenCL has some documentation; I can't find anything good on CAL, and I can't find any way to develop for the platform on Linux without the hardware.

That's why I've written a working CUDA app but nothing for AMD.

Re:AMD (4, Insightful)

Junta (36770) | more than 4 years ago | (#32708678)

AMD is the most advantaged on this front...

Intel and nVidia are stuck in the mode of realistically needing one another and simultaneously downplaying the other's contribution.

AMD can use what's best for the task at hand/accurately portray the relative importance of their CPUs/GPUs without undermining their marketing message.

Re:AMD (0)

Anonymous Coward | more than 4 years ago | (#32712820)

AMD is the most advantaged on this front...

Intel and nVidia are stuck in the mode of realistically needing one another and simultaneously downplaying the other's contribution.

Exactly, and this manifested in Intel's new Pinetrail platform to the consumer's detriment. Intel refused to grant NVidia the license to connect their ION chipset via DMI, and so people planning on using Pinetrail in HTPCs were saddled with Intel's own chipset and its crappy graphics performance (No Native Hardware H.264 Decoding: Long Live Ion [anandtech.com] ).

CPUs and GPUs have different goals (5, Interesting)

leptogenesis (1305483) | more than 4 years ago | (#32708440)

At least as far as parallel computing goes. CPUs have been designed for decades to handle sequential problems, where each new computation is likely to have dependencies on the results of recent computations. GPUs, on the other hand, are designed for situations where most of the operations happen on huge vectors of data; the reason they work well isn't really that they have many cores, but that the work of splitting up the data and distributing it to the cores is (supposedly) done in hardware. On a CPU, the programmer has to deal with splitting up the data, and allowing the programmer to control that process makes many hardware optimizations impossible.
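
A hedged illustration of that difference (hypothetical code; scale_cpu and scale_gpu are made up for the example): on the CPU you carve the array into per-thread ranges yourself, while on the GPU the kernel only states the per-element work and the grid of thread indices is the partitioning.

    #include <algorithm>
    #include <thread>
    #include <vector>

    // GPU: the built-in block/thread indices partition the data for you.
    __global__ void scale_gpu(const float *in, float *out, int n, float a)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) out[i] = a * in[i];
    }

    // CPU: the programmer decides how to carve the array into chunks.
    void scale_cpu(const float *in, float *out, int n, float a, int n_threads)
    {
        std::vector<std::thread> pool;
        int chunk = (n + n_threads - 1) / n_threads;   // manual partitioning
        for (int t = 0; t < n_threads; ++t) {
            int lo = t * chunk, hi = std::min(n, lo + chunk);
            pool.emplace_back([=] {
                for (int i = lo; i < hi; ++i) out[i] = a * in[i];
            });
        }
        for (auto &th : pool) th.join();
    }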

The surprising thing in TFA is that Intel is claiming to have done almost as well on a problem that NVIDIA used to tout their GPUs. It really makes me wonder what problem it was. The claim that "performance on both CPUs and GPUs is limited by memory bandwidth" seems particularly suspect, since on a good GPU the memory access should be parallelized.

It's clear that Intel wants a piece of the growing CUDA userbase, but I think it will be a while before any x86 processor can compete with a GPU on the problems that a GPU's architecture was specifically designed to address.

Re:CPUs and GPUs have different goals (1)

rahvin112 (446269) | more than 4 years ago | (#32708660)

All 3 of them?

Straw man? (1)

Posting=!Working (197779) | more than 4 years ago | (#32708462)

The author doesn't understand what a straw man argument is. He thinks it is bringing up anything that isn't specifically mentioned in the original argument. Nvidia's point, that optimizing for multi-core CPUs is difficult and that hundreds of applications already see huge performance gains on Nvidia's architecture, is valid even if the Intel side never mentioned the difficulty of implementation.

Intel says "Buy Nvidia" (4, Insightful)

Posting=!Working (197779) | more than 4 years ago | (#32708482)

What the hell kind of sales pitch is "We're only a little more than twice as slow!"

[W]e perform a rigorous performance analysis and find that after applying optimizations appropriate for both CPUs and GPUs the performance gap between an Nvidia GTX280 processor and the Intel Core i7 960 processor narrows to only 2.5x on average.

It's gonna work, too.

Humanity sucks at math.

Re:Intel says "Buy Nvidia" (1)

Cassini2 (956052) | more than 4 years ago | (#32709484)

What the hell kind of sales pitch is "We're only a little more than twice as slow!"

The two times speed gain point is where it becomes pointless to exploit specialized hardware. Frequently, the software development program manager has two choices:
a) Ship a product now, or
b) Spend 1 to 2 more years developing the product, then ship it.
The issue is that hardware doubles in speed every 1 to 2 years. If the cost of exploiting current specialized hardware is an additional 1 to 2 years software development, then the "user" performance at the end of 1 to 2 years is the same.

The revenue from the additional time to market is not. Being in the market first, can yield additional sales. Simply having a product to market, results in sales. As such, delaying software development to get a speed gain, adversely affects revenues.

Most significantly, having a product in the marketplace allows one to understand what the users "want". The users might not want speed. They might have some massive algorithmic problem with the product. Perhaps, you designed the software for a small investor tracking one stock, and the purchasers are wall street traders tracking thousands of stocks. In this case, the program needs to be restructured to better handle the problems the user base wants solved.

The "additional software development time" argument significantly reduces the usefulness of the CUDA approach. Intel delivers processors that can speed up the current software, immediately. No need to rewrite software. "Twice as slow" is approximately the break even point for many businesses.

Re:Intel says "Buy Nvidia" (1)

sbates (1832606) | more than 4 years ago | (#32712430)

Just a helpful tip: the next time you're tempted to add a comma, don't. It'll vastly improve the readability of your otherwise competent writing.

Re:Intel says "Buy Nvidia" (1)

royallthefourth (1564389) | more than 4 years ago | (#32709616)

I did an experiment on a Core 2 Duo a couple years ago and found it to be only 5% as fast at doing a huge matrix multiply compared to a (then) top-of-the-line Nvidia. So, they're catching up pretty well.

That's worth noting for people who've been following this closely for a while.

Re:Intel says "Buy Nvidia" (1)

evilviper (135110) | more than 4 years ago | (#32710348)

What the hell kind of sales pitch is "We're only a little more than twice as slow!"

It's a very good sales pitch, actually. Unlike AMD, NVidia isn't an alternative to Intel CPUs. Instead it's a complementary technology, which adds additional cost.

So, I could buy a $500 CPU and a $500 GPU, or I could buy TWO $500 CPUs and get most of the performance, without having to completely redesign all software to run on a GPU.

And Intel has at least one good point, in that NVidia's claims are based on pretty naive methods, and the SIMD instructions that have been added to Intel/AMD CPUs in recent years really are the same thing you get with GPU programming, just on a bit smaller scale. And if Intel could double the performance of SIMD instructions on near-future CPUs, you'd really lose the benefits of GPU programming, and the economics would take care of the rest quite simply.

No matter what, AMD really wins in this one. They're packaging CPU & GPU ever closer. It might just expand the utility of SIMD, or it might introduce a new CPU instruction set for the GPU, like the x87 FPU did before it, or it might be integrated tighter still, to the point that their CPUs might just automatically route appropriate computations to the GPU silicon, and route anything else to the traditional CPU.

Yes, great sales pitch (1)

raftpeople (844215) | more than 4 years ago | (#32710654)

Don't get me wrong, I like what Intel is doing, but c'mon, you are understating this:

and the SIMD instructions that have been added to Intel/AMD CPUs in recent years really are the same thing you get with GPU programming, just on a bit smaller scale.

It's an order of magnitude different (and I know from experience coding CPU and GPU)
i7 960 - 4 cores 4 way SIMD
GT285 (not 280) - 30 cores 32 way SIMD

SP GFLOPS
i7 960 - 102
GT285 - 1080

No matter what, AMD really wins in this one.

AMD has the potential to win, but is currently in last place. Intel is aggressively solving all of the problems that previously gave AMD an advantage, and NVIDIA has aggressively put in place the things HPC wants (e.g. it's easy to code for the platform in C - I've done it - and they're adding ECC, caching, etc.)

Re:Intel says "Buy Nvidia" (1)

Twinbee (767046) | more than 4 years ago | (#32710754)

I think that CPUs are faster with conditional branching and other general purpose computing tasks, so I would sacrifice 2x for that.

Optimizations Matter (1)

Lord Byron II (671689) | more than 4 years ago | (#32708492)

From the article, you can narrow the gap:

"with careful multithreading, reorganization of memory access patterns, and SIMD optimizations"

Sometimes though, I don't want to spend all week making optimizations. I just want my code to run and run fast. Sure, if you optimize the heck out of a section of code, you can always eke out a bit more performance, but if the unoptimized code can run just as fast (on a GPU), why would I bother?

Re:Optimizations Matter (3, Informative)

Rockoon (1252108) | more than 4 years ago | (#32708516)

Just to be clear, those same memory reorganizations are required for the GPU - specifically, the Structure-of-Arrays layout instead of the Array-of-Structures layout.

It's certainly true that most programmers reach for the latter style, but mainly because they aren't planning on using any SIMD.
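
A quick sketch of the two layouts (the types and kernels are illustrative only): with the array-of-structures form, threads that only need the x field stride past y and z in memory; the structure-of-arrays form puts all the x values contiguously, which is what both GPU coalescing and CPU SIMD loads want.

    // Array-of-Structures: one particle's fields sit together, so a warp
    // reading only x has a 12-byte stride between threads.
    struct ParticleAoS { float x, y, z; };

    // Structure-of-Arrays: each field is its own contiguous (device) array,
    // so consecutive threads read consecutive addresses.
    struct ParticlesSoA { float *x, *y, *z; };

    __global__ void shift_x_aos(ParticleAoS *p, int n, float dx)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) p[i].x += dx;        // strided access, poor coalescing
    }

    __global__ void shift_x_soa(ParticlesSoA p, int n, float dx)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) p.x[i] += dx;        // unit stride: coalesces cleanly
    }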

Re:Optimizations Matter (0)

Anonymous Coward | more than 4 years ago | (#32708548)

Because it can't.
To run on a GPU, the code would have to be written for the GPU. You can't just recompile some C and call it a day.

Re:Optimizations Matter (2)

Junta (36770) | more than 4 years ago | (#32708710)

The difference is that the 'naive' code you write to do things in the simplest manner *can* run on a CPU. For the GPU languages, you *must* make those optimizations. This is not to undercut the value of the GPU (as Intel concedes, the gap is large), but it does serve to counteract the dramatic numbers touted by nVidia.

nVidia compared expert-tuned and optimized performance numbers for their own product against stock, generic benchmarks on Intel products.

Still trying to keep Larrabee going? (4, Insightful)

Junta (36770) | more than 4 years ago | (#32708520)

On top of being highly capable at massively parallel floating point math (the bread and butter of the Top500 and almost all real-world HPC applications), GPU chips benefit from economies of scale by having a much larger market to sell into. If Intel has an HPC-only processor, I don't see it really surviving. There have been numerous HPC-only accelerators that provided huge boosts over CPUs and still flopped. GPUs growing into that capability is the first large-scale phenomenon in HPC with legs.

Re:Still trying to keep Larrabee going? (0)

Anonymous Coward | more than 4 years ago | (#32712060)

Except that Larrabee is x86 compatible.

Did those past accelerators allow people to make their software faster with barely any (if any at all) rewriting?

Who cares anymore? (1)

stewbacca (1033764) | more than 4 years ago | (#32708522)

Does anyone under the age of 25 really care anymore about processor speed and video card "features"?

I only ask because 15 years ago I cared greatly about this stuff. However, I'm not sure if that is a product of my immaturity at that time, or the blossoming industry in general.

Nowadays it's all pretty much the same to me. Convenience (as in, there it is sitting on the shelf for a decent price) is more important these days.

Re:Who cares anymore? (3, Insightful)

Overzeetop (214511) | more than 4 years ago | (#32708574)

Two things: you've been conditioned to accept gaming graphics of yesteryear, and your need for more complex game play now trumps pure visuals. You can drop in a $100 video card, set the quality to give you excellent frame rates, and it looks fucking awesome because you remember playing Doom. Also, once you get to a certain point, the eye candy takes a backseat to game play and story - the basic cards hit that point pretty easily now.

Back when we used to game, you needed just about every cycle you could get to make basic gameplay what would now be considered "primitive". Middling level detail is great, in my opinion. Going up levels to the maximum detail really adds very little. I won't argue that it's cool to see that last bit of realism, but it's not worth doubling the cost of a computer to get it.

Re:Who cares anymore? (3, Informative)

Rockoon (1252108) | more than 4 years ago | (#32708644)

Well, as far as GPUs and gaming go, there are two segments of the population: those with "low resolution" rigs such as 1280x1024 (the most common group according to Steam), and those with "high resolution" rigs such as 1920x1200.

An $80 video card enables high/ultra settings at 60+ FPS in nearly all games for the "low resolution" group, but not the "high resolution" group.

Take me out back and shoot me (1)

WillyWanker (1502057) | more than 4 years ago | (#32708790)

The day I build a computer with an Nvidia graphics processor as a CPU is when it's time to call 911, cause I will have completely lost my mind.

Oh for cryin' out loud (4, Insightful)

werewolf1031 (869837) | more than 4 years ago | (#32708792)

Just kiss and make up already. Intel and nVidia have but one choice: to join forces and try collectively to compete against AMD/ATI. Anything less, and they're cutting their nose off to spite their respective faces.

Big Deal, A Barrel... (3, Insightful)

jedidiah (1196) | more than 4 years ago | (#32708868)

Yeah, speciality silicon for a small subset of problems will stomp all over a general purpose CPU. No big news there.

Why is Intel even bothering to whine about this stuff? They sound like a bunch of babies trying to argue that the sky isn't blue.

This makes Intel look truly sad. It's completely unnecessary.

Re:Big Deal, A Barrel... (2, Insightful)

chriso11 (254041) | more than 4 years ago | (#32709300)

The reason Intel is whining is the context of large number-crunching systems and high-end workstations. Rather than Intel selling thousands of chips for the former, Nvidia (and to a lesser extent AMD) gets to sell hundreds of GPU chips. And for the workstations, Intel sells only one chip instead of 2 to 4.

Larrabee is back? (1)

Gri3v3r (1736820) | more than 4 years ago | (#32709568)

I remember reading here on /. that it got abandoned by Intel.

Larrabee Marketing == Direct-to-DVD? (1)

cmholm (69081) | more than 4 years ago | (#32710246)

Intel decided to bail on marketing an in-house high performance GPU. But, they'd still like a return on their Larrabee investment. I don't doubt they would have been pushing the HPC mode anyway, but now, that's all they've got. Unfortunately for Intel, they've got to sell Larrabee performance based on in-house testing, while there are now a number of CUDA-based applications, and HPC-related conferences and papers are now replete with performance data.

To Intel's and AMD/ATI's advantage, NVIDIA has signed on with the OpenCL [wikipedia.org] effort, so as the first two start getting drivers out, they can give the latter a run for its HPC-GPU money. At the moment, though, it's all talk.

Sorry Intel Nvidia Wins (1)

Bruha (412869) | more than 4 years ago | (#32710852)

Using Badaboom, a CUDA app, you can rip DVD copies down for your iPods in minutes, not hours.

Unfortunately Badaboom are idiots and are taking their sweet time porting to the 465/470/480 cards.

I'd love to see a processor fast enough to beat a GPU at tasks such as these. And CD-to-MP3 conversion on CUDA is like moving from a hard drive to a fast SSD.
