
Panic in Multicore Land

Zonk posted more than 6 years ago | from the multi-cores-no-waiting dept.

Programming 367

MOBE2001 writes "There is widespread disagreement among experts on how best to design and program multicore processors, according to the EE Times. Some, like senior AMD fellow Chuck Moore, believe that the industry should move to a new model based on a multiplicity of cores optimized for various tasks. Others disagree on the grounds that heterogeneous processors would be too hard to program. The only emerging consensus seems to be that multicore computing is facing a major crisis. In a recent EE Times article titled 'Multicore puts screws to parallel-programming models', AMD's Chuck Moore is reported to have said that 'the industry is in a little bit of a panic about how to program multicore processors, especially heterogeneous ones.'"



Panic? (4, Insightful)

jaavaaguru (261551) | more than 6 years ago | (#22713926)

I think "panic" is a bit of an over-reaction. I use a multicore CPU. I write software that runs on it. I'm not panicking.

Re:Panic? (1)

dnoyeb (547705) | more than 6 years ago | (#22713950)

Is it April 1st already?

We have been writing multi-threaded software for years. There is nothing special about multicore. It's basically a cut-down version of a dual-CPU box. The only people who should have any concern at all would be the scheduler writers. And even then there is no cause for "panic".
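The claim that multicore looks like a dual-CPU box to software does hold at the API level: thread code is written the same way either way, and the OS scheduler decides which core each thread lands on. A minimal sketch in Python (a generic illustration, not from the article):

```python
import threading

counter = 0
lock = threading.Lock()

def worker(iterations):
    """Increment a shared counter; the code is identical whether the
    threads end up on one core, two sockets, or one multicore die."""
    global counter
    for _ in range(iterations):
        with lock:  # same synchronization primitive in every case
            counter += 1

threads = [threading.Thread(target=worker, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000 -- correct regardless of core topology
```

The scheduler, as the comment says, is where the topology actually matters.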

Re:Panic? (5, Insightful)

leenks (906881) | more than 6 years ago | (#22714100)

How is an 80-core CPU a cut-down version of a dual-CPU box? This is the kind of technology the authors are discussing, not your Core 2 Duo MacBook...

No problems for servers (5, Insightful)

TheLink (130905) | more than 6 years ago | (#22714978)

For servers the real problem is I/O. Disks are slow, network bandwidth is limited (if you solve that then memory bandwidth is limited ;) ).

For most typical workloads most servers don't have enough I/O to keep 80 cores busy.

If there's enough I/O there's no problem keeping all 80 cores busy.

Imagine a slashdotted webserver with a database backend. If you have enough bandwidth and disk I/O, you'll have enough concurrent connections that those 80 cores will be more than busy enough ;).

If you still have spare cores and mem, you can run a few virtual machines.

As for desktops - you could just use Firefox without noscript, after a few days the machine will be using all 80 CPUs and memory just to show flash ads and other junk ;).


Re:Panic? (5, Insightful)

shitzu (931108) | more than 6 years ago | (#22713986)

Still, the fact remains that x86 processors (due to the OSes that run on them, actually) have not gotten much faster in the last 5-7 years. The only thing that has shown serious progress is power consumption and heat dissipation. I mean - the speed the user experiences has not improved much.

Re:Panic? (2, Interesting)

Anonymous Coward | more than 6 years ago | (#22714142)

> the speed the user experiences has not improved much.

User experience is not a useful metric for performance, unless you consider media encoding, decoding and rendering. 10 years ago I was running a P166; what kind of framerates would I get with a modern game using a software renderer? What kind of framerates would I get for decoding an HD video stream?

Do you seriously think a 12-year-old P166 will provide a comparable user experience to a modern 8-core 3GHz machine? You're putting it down to "the OSes that run on them", which is interesting since user-mode x86 emulation with QEMU runs W2K faster on my laptop than on the hardware I ran it on back in 1999.

Re:Panic? (2, Informative)

shitzu (931108) | more than 6 years ago | (#22714462)

I was speaking of the last 5-7 years.

I have an old AMD-XP-something running Windows XP at home; it's about 5 years old. I have a Core2Duo machine I sometimes use. I don't see much difference in day-to-day usage. Even if there is one, I would attribute most of that to faster drives and I/O.

Re:Panic? (5, Insightful)

Saurian_Overlord (983144) | more than 6 years ago | (#22714598)

"...the speed the user experiences has not improved much [in the last 5-7 years]."

This may almost be true if you stay on the cutting edge, but not even close for the average user (or the power-user on a budget, like myself). 5 years ago I was running a 1.2 GHz Duron. Today I have a 2.3 GHz Athlon 64 in my notebook (which is a little over a year old, I think), and an Athlon 64 X2 5600+ (that's a dual-core 2.8 GHz, for those who don't know) in my desktop. I'd be lying if I said I didn't notice much difference between the three.

Re:Panic? (5, Informative)

Cutie Pi (588366) | more than 6 years ago | (#22714146)

Yeah, but if you extrapolate to where things are going, we're going to have CPUs with dozens if not hundreds of cores on them. (See Intel's 80 core technology demo as an example of where their research is going). Can you write or use general purpose software that takes advantage of that many cores? Right now I expect there is a bit of panic because it's relatively easy to build these behemoths, but not so easy to use them efficiently. Outside of some specialized disciplines like computational science and finance (that have already been taking advantage of parallel computing for years), there won't be a big demand for uber-multicore CPUs if the programming models don't drastically improve. And those innovations need to happen now to be ready in time for CPUs of 5 years from now. Since no real breakthroughs have come however, the chip companies are smart to be rethinking their strategies.
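The efficiency worry here has a classic quantitative form in Amdahl's law: if a fraction p of a program can be parallelized, the speedup on n cores is bounded by 1/((1-p) + p/n), so serial remainders dominate long before core counts reach 80. A quick sketch (generic illustration, not from the article):

```python
def speedup(p, n):
    """Amdahl's law: p = parallel fraction of the program,
    n = number of cores. Returns the best-case overall speedup."""
    return 1.0 / ((1.0 - p) + p / n)

# Even a program that is 95% parallel tops out quickly:
print(round(speedup(0.95, 8), 1))    # ~5.9x on 8 cores
print(round(speedup(0.95, 80), 1))   # ~16.2x on 80 cores
# ...and can never exceed 1/(1-p) = 20x, no matter how many cores.
```

Which is why "easy to build, hard to use efficiently" is a fair summary: the 80-core demo chip buys roughly 16x, not 80x, for typical code.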

Multicores, but not on a chip (5, Interesting)

Kim0 (106623) | more than 6 years ago | (#22714302)

This trend with multiple cores on the CPU is only an intermediate phase, because it oversaturates the memory bus. That is easy to remedy by putting the cores on the memory chips, of which there are a number comparable to the number of cores.

In other words, the CPUs will disappear, and there will be lots of smaller core/memory chips, connected in a network. And they will be cheaper as well, because they do not need so high a yield.


Re:Multicores, but not on a chip (1)

richlv (778496) | more than 6 years ago | (#22714412)

And on a larger scale there's this wicked idea about Plan 9.
As for parallel processing, I don't think it is feasible to implement it in each app separately - more likely it would be built upon some higher-level API, where the app would simply say "these things can run in parallel, this one should wait for that one to finish, and this one can start as soon as that one sends a particular signal".
It would be somewhat more work, but something like that is already being implemented with KDE 4, and I expect it only to become more widespread.

Re:Panic? (1)

that this is not und (1026860) | more than 6 years ago | (#22714358)

So the message here is that they won't be able to *sell* these things since there isn't a market, nor well-defined uses for them. I guess that would be a panic situation for marketing types.

And since these things can't be 'just used' as a faster version of an 8088 processor, the way the CPU houses sold the 386 (frankly, one of the last cases where they had to sell a 'really big change' to an existing market that had almost no motivation to use the new features), there's a panic that people might just not buy the new stuff.

Where will it lead!?! The hardware upgrade cycle feeds all sorts of mouths that might otherwise have to actually provide meaningful innovative products.

Re:Panic? (0)

Anonymous Coward | more than 6 years ago | (#22714738)

I'm not so sure that this is actually the case, but you had better believe that if the world's leading chip manufacturers find their road maps for the next five years lead to unutilizable hardware, which seems to be the indication, there's going to be panicking going on - both for the manufacturers and for the people who depend on increasingly powerful hardware to handle the exponential growth of information flow in today's society.
How oblivious do you have to be to think that making products that don't sell is a concern only for 'marketing types'?

Re:Panic? (1)

SlashV (1069110) | more than 6 years ago | (#22714638)

there won't be a big demand for uber-multicore CPUs if the programming models don't drastically improve. And those innovations need to happen now to be ready in time for CPUs of 5 years from now.
Software always lags behind hardware development. The 80386 was launched in 1986; useful 32-bit code only arrived in the '90s. Starting software innovations now, for CPUs that will only be available in 5 years, isn't very feasible or even useful.

Re:Panic? (1)

10101001 10101001 (732688) | more than 6 years ago | (#22714848)

Can you write or use general purpose software that takes advantage of that many cores?

A 3D video driver? So that all PCs will have a decent "graphics card"? I think game designers will come up with ways to use those extra CPUs such that even more CPUs will be needed. Or otherwise unthinkable things (mostly, the sort of thing that throwing strong parallel CPU power can solve but which is cost prohibitive today) will start being common.

Now, will *most* software use many/all of them? No. But, then, most CPUs are idle most of the time right now. A much bigger issue (even today) is electricity usage, not CPU usage. But, let's ignore that elephant in the room.

Re:Panic? (4, Insightful)

Chrisq (894406) | more than 6 years ago | (#22714148)

Yes panic is strong, but the issue is not with multi-tasking operating systems assigning processes to different processors for execution. That works very well. The problem is when you have a single CPU-intensive task, and you want to split that over multiple processors. That, in general, is a difficult problem. Various solutions, such as functional programming, threads with spawns and waits, etc. have been proposed, but none are as easy as just using a simple procedural language.
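The "split one CPU-intensive task" case, where it is possible at all, usually reduces to finding independent sub-ranges of the work. A sketch of that decomposition in Python (names hypothetical; note that CPython's GIL means threads here illustrate the pattern rather than deliver real CPU parallelism):

```python
from concurrent.futures import ThreadPoolExecutor

def chunked_sum(data, workers=4):
    """Split a single reduction into independent sub-ranges.
    The hard part in general is proving the pieces are independent,
    not running them - which is the difficulty the comment describes."""
    size = (len(data) + workers - 1) // workers
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each chunk is summed independently; the final sum is serial.
        return sum(pool.map(sum, chunks))

print(chunked_sum(list(range(1000))))  # 499500, same as the serial sum
```

A reduction like this decomposes cleanly; most imperative code does not, which is exactly the comment's point.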

Re:Panic? (4, Insightful)

ObsessiveMathsFreak (773371) | more than 6 years ago | (#22714552)

That works very well. The problem is when you have a single CPU-intensive task, and you want to split that over multiple processors. That, in general, is a difficult problem.

It is, in general, an impossible problem.

Most existing code is imperative. Most programmers write in imperative programming languages. Object orientation does not change this. Imperative code is not suited for multiple CPU implementation. Stapling things together with threads and messaging does not change this.

You could say that we should move to other programming "paradigms". However, in my opinion, the reason we use imperative programs so much is that most of the tasks we want accomplished are inherently imperative in nature. Outside of intensive numerical work, most tasks people want done on a computer are done sequentially. The availability of multiple cores is not going to change the need for these tasks to be done in that way.

However, what multiple cores might do is enable previously impractical tasks to be done on modest PCs. Things like NP problems, optimizations, simulations. Of course these things are already being done, but not on the same scale as things like, say, spreadsheets, video/sound/picture editing, gaming, blogging, etc. I'm talking about relatively ordinary people being able to do things that now require supercomputers, experimenting and creating on their own laptops. Multi core programs can be written to make this feasible.

Considering I'm beginning to sound like an evangelist, I'll stop now. Safe money says PCs stay at 8 CPUs or below for the next 15 years.

Re:Panic? (1)

GreatBunzinni (642500) | more than 6 years ago | (#22714932)

However, what multiple cores might do is enable previously impractical tasks to be done on modest PCs. Things like NP problems, optimizations, simulations. Of course these things are already being done, but not on the same scale as things like, say, spreadsheets, video/sound/picture editing, gaming, blogging, etc. I'm talking about relatively ordinary people being able to do things that now require supercomputers, experimenting and creating on their own laptops. Multi core programs can be written to make this feasible.

Idealisms... Unfortunately reality doesn't play by those rules: thirty years ago bright minds predicted that programming a machine with high-level programming languages would also be a trivial thing for "relatively ordinary people". So what do we see? The "relatively ordinary people" do have powerful computers, but they don't go much further than chatrooms, MySpace and blogs.

Re:Panic? (1)

Chrisq (894406) | more than 6 years ago | (#22714954)

Impossible might be too strong. I don't think anyone has proved that you can't take a program written in a normal procedural language and somehow transform it to run on multiple processors. It's just that nobody has any idea of how it could be done. The fact that a skilled programmer may be able to look at a process and identify isolated components that can run in parallel means that some day a computer may be able to do the same.

Re:Panic? (5, Funny)

divisionbyzero (300681) | more than 6 years ago | (#22714554)

Developers aren't panicking. Their kernels are! Ha! Oh, that was a good one. Where's my coffee?

Self Interest (3, Informative)

quarrel (194077) | more than 6 years ago | (#22713936)

AMD's Chuck Moore presumably has a lot of self interest in pushing heterogeneous cores. They are combining ATI+AMD cores on a single die and selling the benefits in a range of environments including scientific computing etc.

So take it all with a grain of salt.


Re:Self Interest (5, Informative)

The_Angry_Canadian (1156097) | more than 6 years ago | (#22714032)

The article covers many points of view, not only Chuck Moore's.

Re:Self Interest (2, Insightful)

davecb (6526) | more than 6 years ago | (#22714188)

If he's saying that his multicore processors are going to be hard to program, then self-interest suggests he be very very quiet (;-))

Seriously, though, adding what used to be a video board to the CPU doesn't change the programming model. I suspect he's more interested in debating future issues with more tightly coupled processors.


Re:Self Interest (1)

xouumalperxe (815707) | more than 6 years ago | (#22714220)

Sure, I'll take it with a grain of salt. But he does have Moore as a surname, and the other guy pretty much nailed it. :)

Re:Self Interest (2, Informative)

Hanners1979 (959741) | more than 6 years ago | (#22714300)

AMD's Chuck Moore presumably has a lot of self interest in pushing heterogeneous cores. They are combining ATI+AMD cores on a single die...

It's worth noting that Intel will also be going down this route in a similar timeframe, integrating an Intel graphics processor onto the CPU die.

Re:Self Interest (0)

Anonymous Coward | more than 6 years ago | (#22714644)

I'm thinking that maybe they acquired ATI because they thought it was a good idea, and didn't make up some excuse for it after the fact. Maybe.

Should Mimick The Brain (5, Interesting)

curmudgeon99 (1040054) | more than 6 years ago | (#22713962)

Well, the most recent research into how the cortex works has some interesting leads on this. If we first assume that the human brain has a pretty interesting organization, then we should try to emulate it.

Recall that the human brain receives a series of pattern streams from each of the senses. These pattern streams are in turn processed in the most global sense--discovering outlines, for example--in the v1 area of the cortex, which receives a steady stream of patterns over time from the senses. Then, having established the broadest outlines of a pattern, the v1 cortex layer passes its assessment of what it saw the outline of to the next higher cortex layer, v2. Notice that v1 does not pass the raw pattern it receives up to v2. Rather, it passes its interpretation of that pattern to v2. Then, v2 makes a slightly more global assessment, saying that the outline it received from v1 is not only a face but the face of a man it recognizes. Then, that information is sent up to v4 and ultimately to the IT cortex layer.

The point here is important. One layer of the cortex is devoted to some range of discovery. Then, after it has assigned some rudimentary meaning to the image, it passes it up the cortex where a slightly finer assignment of meaning is applied.

The takeaway is this: each cortex does not just do more of the same thing. Instead, it does a refinement of the level below it. This type of hierarchical processing is how multicore processors should be built.

Re:Should Mimick The Brain (4, Funny)

El_Muerte_TDS (592157) | more than 6 years ago | (#22714124)

If we first assume that the human brain has a pretty interesting organization, then we should try to emulate it.

I think it's pretty obvious there are serious design flaws in the human brain. And I'm not only talking about stability, but also reliability and accuracy.
Just look at the world.

Re:Should Mimick The Brain (0, Flamebait)

curmudgeon99 (1040054) | more than 6 years ago | (#22714382)

Ass clown.

I was making a serious point. While the brain is not perfect--which one would expect from something that was not designed but evolved--it is the best game in town. I think it's foolish to try to replicate the trial and error development of a million years. I think we have a great model before us in the brain and it only makes sense to emulate it.

Re:Should Mimick The Brain (1)

maestroX (1061960) | more than 6 years ago | (#22714458)

Oh please, *everyone* knows males are lousy at multitasking.

Re:Should Mimick The Brain (1)

locster (1140121) | more than 6 years ago | (#22714516)

In addition there is also some degree of feedback from higher level processing back to lower levels, e.g. v2 telling v1 "I think this is a man's face, reinterpret based on this context". Information flows in both directions.

Re:Should Mimick The Brain (1)

ragtoplvr (1023649) | more than 6 years ago | (#22714760)

I hope we do not mimic the political brain. Most if not all of them do not work well.

I guess we should not mimic the management brain either.

Or the .....

Transputer and Occam: ahead of their time.


Re:Should Mimick The Brain (1)

doublebackslash (702979) | more than 6 years ago | (#22714764)

Yeah, okay, that is all well and good, but not everything can be arbitrarily broken down into parallel tasks.
Take your example. Imagine v1 takes 2s of CPU time and cannot be split into smaller pieces to be processed. However v2 takes 25s and cannot be broken up into parallel tasks. v4 will execute slightly sooner because parts of v2 started processing slightly sooner than if there were no parallelism between v1 and v2, but the speedup is minimal since the large wait is on v2.

Take that down to a smaller level. Painting the screen, performing a sort, taking a SHA-1 checksum. They all have a non-parallel bottleneck. Painting the screen has to be done in the order things are viewed; you can calculate where things overlap, but eventually one thread has to paint it in order. A sort must eventually make a certain number of comparisons, and they have to be made in a certain order; a small speedup can be had from using multiple cores, but it ends up costing more CPU horsepower overall. SHA-1/MD5 checksums can only be processed in a certain order.

We can break small parts of most tasks up between cores, but beyond a certain level it becomes non-trivial to find work for all the hardware to do. Either the problem has to be re-engineered, such as developing a multi step sorting algorithm, or the problem has to be broken into single steps and each evaluated against the rest for order of execution to find out which steps can be executed out of order (since there is no guarantee which steps will happen first on a multi CPU system) and then writing an entire system to farm out arbitrary bits of code to multiple processors (either at the compiler level after giving it hints, or at the code level with threads, mutexes, semaphores, and shared memory).

Web servers and the like, on the other hand, can see a benefit immediately, since they already have more tasks running than CPUs to run them on (serving dozens of pages simultaneously). However, even those can start to see diminishing returns when memory begins to be contended for by all the threads.

You can see this problem gets complex fast.
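The sorting example above can be made concrete: the two halves of an array can be sorted concurrently, but the final merge remains a serial step that caps the speedup. A sketch in Python (generic illustration; the thread pool shows the structure rather than real CPU parallelism under CPython's GIL):

```python
import heapq
from concurrent.futures import ThreadPoolExecutor

def parallel_sort(data):
    """Sort the halves concurrently, then merge serially.
    The merge is the non-parallel bottleneck the comment describes."""
    mid = len(data) // 2
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Both halves are sorted at the same time...
        left, right = pool.map(sorted, [data[:mid], data[mid:]])
    # ...but one thread must still walk both halves in order.
    return list(heapq.merge(left, right))

print(parallel_sort([3, 1, 4, 1, 5, 9, 2, 6]))  # [1, 1, 2, 3, 4, 5, 6, 9]
```

Even with infinitely many cores, the O(n) merge stays on the critical path, which is the comment's "certain order" constraint in miniature.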

Re:Should Mimick The Brain (1)

curmudgeon99 (1040054) | more than 6 years ago | (#22714924)

I think you missed my point. In the brain v1 focuses on a broad task. v2 focuses on a finer task. v4 on still a finer task. The results of the work done by v1 are sent in summary form to v2. v2 also sends its summary up to v4. Likewise, the upper levels will send down their summary to the lower area to help focus. So, it is not to the point of discrete processes working on pieces of the same issue. Rather, each cortex layer focuses on a qualitatively different task.

Let's see the menu (3, Interesting)

Tribbin (565963) | more than 6 years ago | (#22713966)

Can I have... errr... Two floating point, one generic math with extra cache and two RISCs.

Re:Let's see the menu (1)

that this is not und (1026860) | more than 6 years ago | (#22714386)

You're sounding like that IBM 'Drive Thru' radio commercial now.

Re:Let's see the menu (5, Funny)

imikem (767509) | more than 6 years ago | (#22714406)

Would you like fries with that?

OpenMP? (2, Informative)

derrida (918536) | more than 6 years ago | (#22713974)

It is portable, scalable, standardized and supports many languages.
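OpenMP itself is a pragma system for C, C++ and Fortran; its central idiom is the parallel-for with an implicit barrier at the end. A rough Python analogue of that pattern (this is an illustration of the fork-join idea, not OpenMP, and the helper name is hypothetical):

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_for(fn, n, workers=4):
    """Like '#pragma omp parallel for': iterations of a loop are divided
    among workers, and the call returns only when all of them have
    finished (the implicit barrier)."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map preserves iteration order, like an ordered OpenMP loop
        return list(pool.map(fn, range(n)))

squares = parallel_for(lambda i: i * i, 8)
print(squares)  # [0, 1, 4, 9, 16, 25, 36, 49]
```

The appeal of the model is exactly this: the loop body is unchanged, and the annotation (here, the helper) carries all the parallelism.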

Languages (2, Informative)

PsiCollapse (809801) | more than 6 years ago | (#22713976)

That's why it's so important that languages begin to adopt threading primitives and immutable data structures. Java does a good job. Newer languages, like Clojure, are built from the ground up with concurrency in mind.

Re:Languages (5, Informative)

chudnall (514856) | more than 6 years ago | (#22714428)

*cough*Erlang [] *cough*

I think the wailing we're about to hear is the sound of thousands of imperative-language programmers being dragged, kicking and screaming, into functional programming land. Even the functional languages not specifically designed for concurrency do it much more naturally than their imperative counterparts.

Re:Languages (3, Interesting)

TheRaven64 (641858) | more than 6 years ago | (#22714824)

For good parallel programming you just need to enforce one constraint:

Every object (in the general sense, not necessarily the OO sense) may be either aliased or mutable, but not both.

Erlang does this by making sure no objects are mutable. This route favours the compiler writer (since it's easy) and not the programmer. I am a huge fan of the CSP model for large projects, but I'd rather keep something closer to the OO model in the local scope and use something like CSP in the global scope (which is exactly what I am doing with my current research).

Re:Languages (5, Informative)

Westley (99238) | more than 6 years ago | (#22714560)

Java doesn't do a good job. It does a "better than abysmal" job in that it has some idea of threading with synchronized/volatile, and it has a well-defined memory model. (That's not to say there aren't flaws, however. Allowing synchronization on any reference was a mistake, IMO.)

What it *doesn't* do is make it easy to write verifiably immutable types, and code in a functional way where appropriate. As another respondent has mentioned, functional languages have great advantages when it comes to concurrency. However, I think the languages of the future will be a hybrid - making imperative-style code easy where that's appropriate, and functional-style code easy where that's appropriate.

C# 3 goes some of the way towards this, but leaves something to be desired when it comes to assistance with immutability. It also doesn't help that the .NET 2.0 memory model is poorly documented (the most reliable resources are blog posts, bizarrely enough - note that the .NET 2.0 model is significantly stronger than the ECMA CLI model).

APIs are important too - the ParallelExtensions framework should help .NET programmers significantly when it arrives, assuming it actually gets used. Of course, for other platforms there are other APIs - I'd expect them to keep leapfrogging each other in terms of capability.

I don't think C# 3 (or even 4) is going to be the last word in bringing understandable and reliable concurrency, but I think it points to a potential way forward.

The trouble is that concurrency is hard, unless you live in a completely side-effect free world. We can make it simpler to some extent by providing better primitives. We can encourage side-effect free programming in frameworks, and provide language smarts to help too. I'd be surprised if we ever manage to make it genuinely easy though.

Re:Languages (1)

locster (1140121) | more than 6 years ago | (#22714570)

Microsoft did some research in this area a few years back. See C Omega []

Not *that* Chuck Moore (4, Informative)

Hobart (32767) | more than 6 years ago | (#22713980)

This article is referring to AMD's Charles R. "Chuck" Moore, who worked on the POWER4 and PowerPC 601, not the language and chip designer Charles H. "Chuck" Moore, who invented Forth, ColorForth, et al. and was interviewed on Slashdot.

Re:Not *that* Chuck Moore (5, Funny)

Hal_Porter (817932) | more than 6 years ago | (#22714064)

Those +1 Informative links go to wikipedia [] , an online encyclopedia.

What about... (2, Informative)

aurb (674003) | more than 6 years ago | (#22714012)

...functional programming languages? Or flow programming?

Re:What about... (0)

Anonymous Coward | more than 6 years ago | (#22714216)

Hey hey, slow down there, you'd be lucky if 5% of "developers" these days even know how to do that.
Most of them just get help from Gandalf these days... or to be more specific Saruman.

Should Mimic DNA/cell process. (1)

cabazorro (601004) | more than 6 years ago | (#22714036)

Sounds wasteful, I know (data replication everywhere). But there is a reason for that. The process becomes resilient to unexpected changes (corruption). The bus is the enzymes, the cpu is the cell and thread of execution is, well, the DNA. The replication and communication process is autonomous.

The future is here (5, Insightful)

downix (84795) | more than 6 years ago | (#22714038)

What Mr. Moore is saying does have a grain of truth: generic will be beaten by specific in key functions. The Amiga proved that in 1985, being able to deliver a better graphical solution than workstations costing tens of thousands more. The key now is to figure out which specifics you can use without driving up the cost or compromising the design ideal of a general-purpose computer.

Re:The future is here (2, Insightful)

funkboy (71672) | more than 6 years ago | (#22714304)

The Amiga proved that in 1985, being able to deliver a better graphical solution than workstations costing tens of thousands more. The key now is to figure out which specifics you can use without driving up the cost or compromising the design ideal of a general-purpose computer.

The key now is figuring out what to do with your Amiga now that no one writes applications for it anymore.

I suggest NetBSD :-)


Just fab the cores and get out of the way (0)

Anonymous Coward | more than 6 years ago | (#22714094)

or go take threads 101. don't punish developers who know what they are doing just because the ruby/rails/java/python fad language crowd doesn't understand how their language bastardizes pthreads.

My heterogeneous experience with Cell processor (5, Interesting)

DoofusOfDeath (636671) | more than 6 years ago | (#22714130)

I've been doing some scientific computing on the Cell lately, and heterogeneous cores don't make life very easy. At least with the Cell.

The Cell has one PowerPC core ("PPU"), which is a general purpose PowerPC processor. Nothing exotic at all about programming it. But then you have 6 (for the Playstation 3) or 8 (other computers) "SPE" cores that you can program. Transferring data to/from them is a pain, they have small working memories (256k each), and you can't use all C++ features on them (no C++ exceptions, thus can't use most of the STL). They also have poor speed for double-precision floats.

The SPEs are pretty fast, and they have a very fast interconnect bus, so as a programmer I'm constantly thinking about how to take better advantage of them. Perhaps this is something I'd face with any architecture, but the high potential combined with difficult constraints of SPE programming make this an especially distracting aspect of programming the Cell.

So if this is what heterogeneous-cores programming means, I'd probably prefer the homogeneous version. Even if they have a little less performance potential, it would be nice to have a 90%-shorter learning curve to target the architecture.

Re:My heterogeneous experience with Cell processor (5, Interesting)

nycguy (892403) | more than 6 years ago | (#22714222)

I agree. While a proper library/framework can help abstract away the difficulties associated with a heterogeneous/asymmetric architecture, it's just easier to program for a homogeneous environment. This same principle applies all the way down to having general-purpose registers in a RISC chip as opposed to special-purpose registers in a CISC chip--the latter may let you do a few specialized things better, but the former is more accommodating for a wide range of tasks.

And while the Cell architecture is a fairly stationary target because it was incorporated into a commercial gaming console, if these types of architectures were to find their way into general-purpose computing, it would be a real nightmare. Every year or so a new variant of the architecture would come out that introduced a faster interconnect here, more cache memory there, etc., so that one might have to reorganize the division of labor in one's application to take advantage. (Again, a properly parameterized library/framework can sometimes handle this, but only post facto--after the variation in features is known, not before the new features have even been introduced.)

Re:My heterogeneous experience with Cell processor (1)

that this is not und (1026860) | more than 6 years ago | (#22714408)

Transferring data to/from them is a pain, they have small working memories (256k each), and you can't use all C++ features on them (no C++ exceptions, thus can't use most of the STL).

The horrors! How are the teams at Microsoft going to fit bloat in them, then!?!

Re:My heterogeneous experience with Cell processor (1)

DoofusOfDeath (636671) | more than 6 years ago | (#22714514)

The horrors! How are the teams at Microsoft going to fit bloat in them, then!?!

Actually, it's been a good exercise to have to work under those constraints. I found that a tight environment like that forced me to carefully reconsider the design of my code and my algorithm. It probably led to an implementation that not only had fewer lines of code, but was also more readable than the original version.

Re:My heterogeneous experience with Cell processor (5, Interesting)

epine (68316) | more than 6 years ago | (#22714712)

So if this is what heterogeneous-cores programming means, I'd probably prefer the homogeneous version.
Your points are valid as things stand, but isn't it a bit premature to make this judgment? Cell was a fairly radical design departure. If IBM continues to refine Cell, and as more experience is gained, the challenge will likely diminish.

For one thing, IBM will likely add double precision floating point support. But note that SIMD in general poses problems in the traditional handling of floating point exceptions, so it still won't be quite the same as double precision on the PPU.

The local-memory SPE design alleviates a lot of pressure on the memory coherence front. Enforcing coherence in silicon generates a lot of heat, and heat determines your ultimate performance envelope.

For decades, programmers have been fortunate in making our own lives simpler by foisting tough problems onto the silicon. It wasn't a problem until the hardware ran into the thermal wall. No more free lunch. Someone has to pay on one side or the other. IBM recognized this new reality when they designed Cell.

The reason why x86 never died the thousand deaths predicted by the RISC camp is that heat never much mattered. Not enough registers? Just add OOO. Generates a bit more heat to track all the instructions in flight, but no real loss in performance. Bizarre instruction encoding? Just add big complicated decoders and pre-decoding caches. Generates more heat, but again performance can be maintained.

Probably with a software architecture combining the hairy parts of the Postgres query execution planner with the recent improvements in the FreeBSD affinity-centric ULE scheduler, you could make the nastier aspects of SPE coordination disappear. It might help if the SPUs had 512KB instead of 256KB to alleviate code pressure on data space.

I think the big problem is the culture of software development. Most code functions the same way most programmers begin their careers: just dive into the code, specify requirements later. What I mean here is that programs don't typically announce the structure of the full computation ahead of time. Usually the code goes to the CPU "do this, now do that, now do this again, etc." I imagine the modern graphics pipelines spell out longer sequences of operations ahead of time, by necessity, but I've never looked into this.

Database programmers wanting good performance from SQL *are* forced to spell things out more fully in advance of firing off the computation. It doesn't go nearly far enough. Instead of figuring out the best SQL statement, the programmer should send a list of *all* logically equivalent queries and just let the database execute the one it finds least troublesome. Problem: sometimes the database engine doesn't know that you have written the query to do things the hard way to avoid hitting a contentious resource that would greatly impact the performance limiting path.

These are all problems in the area of making OSes and applications more introspective, so that resource scheduling can be better automated behind the scenes, by all those extra cores with nothing better to do.

Instead, we make the architecture homogeneous, so that resource planning makes no real difference, and we can thereby sidestep the introspection problem altogether.

I've always wondered why no-one has ever designed a file system where all the unused space is used to duplicate other disk sectors/blocks, to create the option of vastly faster seek plans. Probably because it would take a full-time SPU to constantly recompute the seek plan as old requests are completed and new requests enter the queue. Plus if two supposedly identical copies managed to diverge, it would be a nightmare to debug, because the copy you get back would be non-deterministic. Hybrid MRAM/Flash/spindle storage systems could get very interesting.

I guess I've been looking forward to the end of artificial scaling for a long time (clock freq. as the universal software solvent). This new world opens up many interesting problems we've been side-stepping far too long.

When people think about the human brain, one aspect we forget far too easily is that a large function of the human brain is to regulate which circuits are required by the circumstance and to supply blood to only those circuits. The metabolism of the brain is far too high to fire everything up all at once.

With our algorithms in software, we're never happy until we can fry an egg on the silicon. Hardly realistic in the long run. I believe heterogeneous will ultimately win out.

Re:My heterogeneous experience with Cell processor (3, Insightful)

TheRaven64 (641858) | more than 6 years ago | (#22714888)

Well, part of your problem is that you're using a language which is a bunch of horrible syntactic sugar on top of a language designed for programming a PDP-11, on an architecture that looks nothing like a PDP-11.

You're not the only person using heterogeneous cores, however. In fact, the Cell is a minority. Most people have a general purpose core, a parallel stream processing core that they use for graphics and an increasing number have another core for cryptographic functions. If you've ever done any programming for mobile devices, you'll know that they have been using even more heterogeneous cores for a long time because they give better power usage.

Re:My heterogeneous experience with Cell processor (2, Interesting)

Anonymous Coward | more than 6 years ago | (#22714936)

Double precision has been greatly improved in the latest variants of the Cell SPU, though not the ones in the PS3. The enhanced DP processors are only found in the recent IBM (and perhaps Mercury) blades, which are expensive; but the only difference from single precision is longer latency, which leads to about half the flops (only 2 values per register instead of 4).

Next year we might even get the same processor with 32 SPU, still hard to program but it means 8MB total of data on the chip, which opens some opportunities (unfortunately the memory size per SPU seems to be set in stone at 256kB).

I'm interested in programming the Cell for doing some signal processing, most of which will be single precision FFT, an application where it seems to rock. I think that the data flow between SPU is relatively easy to organize for my purpose. OTOH, it seems nobody wants to sell bare Cell chips, which is sad, since I would love to try to interface it to high speed (1-2Gsamples/s) ADCs.

Well, I'm panicked... (4, Interesting)

argent (18001) | more than 6 years ago | (#22714134)

The idea of having to use Microsoft APIs to program future computers because the vendors only document how to get DirectX to work doesn't exactly thrill me. I think panic is perhaps too strong a word, but sheesh...

There's only three approaches (1)

gilesjuk (604902) | more than 6 years ago | (#22714168)

1. Change operating systems to be able to use all the available CPU power even when running single-threaded applications.

2. Change programming languages to make multicore programming easier.

3. Both 1 and 2.

What the end user should be able to dictate, however, is how many cores are in use. It's not for the programmer of the application to dictate how the processing of any data should occur.

Re:There's only three approaches (1)

slashbart (316113) | more than 6 years ago | (#22714566)

>> 1. Change operating systems to be able to use all the available CPU power even when running single-threaded applications.

So how should the operating system figure out what program-flow dependencies there are in a binary? You can make an OS that schedules your single-threaded application so that it uses 100% of one core, but automatically multithreading a single-threaded application? No way: not now, and not for the foreseeable future.

Re:There's only three approaches (1)

TheRaven64 (641858) | more than 6 years ago | (#22714930)

It's easier with heterogeneous multicore. Your single-threaded game happily makes use of two cores (the CPU and the GPU). Your single-threaded server happily makes use of two cores (your CPU and your crypto coprocessor). The functionality of the extra cores is exposed in both cases via a library (OpenGL or OpenSSL). If you want to design a fast multicore system then profile your existing workloads and see which libraries are using most of the CPU. Then add a core that implements their functionality in hardware.

Re:There's only three approaches (1)

MadKeithV (102058) | more than 6 years ago | (#22714962)

Actually, things like the .NET and Java runtimes, combined with JIT compiling and optimization, could do exactly that.
They could recompile a reasonably abstract definition of a program into exactly the kind of code that your current system needs, on demand.

he is right, but it depends on the application (5, Interesting)

CBravo (35450) | more than 6 years ago | (#22714200)

As I demonstrated in my thesis [], a parallel application can be shown to have certain critical and less critical parts. An optimal processing platform matches those requirements; the remainder of the platform sits idle and burns power for nothing. One should wonder which is better: a 2 GHz processor, or 2x 1 GHz processors. My opinion is that, if it has no impact on performance, the latter is better.

There is an advantage to a symmetrical platform: you cannot mis-schedule your processes. It does not matter which processor takes a certain job. On a heterogeneous system you can make serious errors: scheduling your video process on your communications processor will not be efficient. Not only is the video slow, the communications process has to wait a long time (impacting comms performance).

Re:he is right, but it depends on the application (-1, Offtopic)

DoofusOfDeath (636671) | more than 6 years ago | (#22714668)

Hi David,

I looked at your thesis. Could you explain something? Why is it that so many computer science publications from the Netherlands are written in English? Is it because English is commonly spoken in universities there, or is it just that English is the standard language for computer science publications?

+1 Optimistic (4, Funny)

Sapphon (214287) | more than 6 years ago | (#22714736)

The height of optimism: posting proof in the form of a 70-odd page thesis on Slashdot.
I don't think we'll be Slashdotting your server any time soon, CBravo ;-)

Multithreading is not easy but it's doable (5, Interesting)

pieterh (196118) | more than 6 years ago | (#22714260)

It's been clear for many years that individual core speeds had peaked, and that the future was going to be many cores and that high-performance software would need to be multithreaded in order to take advantage of this.

When we wrote the OpenAMQ messaging software [] in 2005-6, we used a multithreading design that lets us pump around 100,000 500-byte messages per second through a server. This was for the AMQP project [] .

Today, we're making a new design - ØMQ [] , aka "Fastest. Messaging. Ever." - that is built from the ground up to take advantage of multiple cores. We don't need special programming languages, we use C++. The key is architecture, and especially an architecture that reduces the cost of inter-thread synchronization.

From one of the ØMQ whitepapers [] :

Inter-thread synchronisation is slow. If the code is local to a thread (and doesn't use slow devices like network or persistent storage), execution time of most functions is tens of nanoseconds. However, when inter-thread synchronisation - even a non-blocking synchronisation - kicks in, execution time grows by hundreds of nanoseconds, or even surpasses one microsecond. All kind of time-expensive hardware-level stuff has to be done... synchronisation of CPU caches, memory barriers etc.

The best of the breed solution would run in a single thread and omit any inter-thread synchronisation altogether. It seems simple enough to implement except that single-threaded solution wouldn't be able to use more than one CPU core, i.e. it won't scale on multicore boxes.

A good multi-core solution would be to run as many instances of ØMQ as there are cores on the host and treat them as separate network nodes in the same way as two instances running on two separate boxes would be treated and use local sockets to pass messages between the instances.

This design is basically correct, however, the sockets are not the best way to pass message within a single box. Firstly, they are slow when compared to simple inter-thread communication mechanisms and secondly, data passed via a socket to a different process has to be physically copied, rather than passed by reference.

Therefore, ØMQ allows you to create a fixed number of threads at the startup to handle the work. The "fixed" part is deliberate and integral part of the design. There are a fixed number of cores on any box and there's no point in having more threads than there are cores on the box. In fact, more threads than cores can be harmful to performance as they can introduce excessive OS context switching.

We don't get linear scaling on multiple cores, partly because the data is pumped out onto a single network interface, but we're able to saturate a 10Gb network. BTW ØMQ is GPLd so you can look at the code if you want to know how we do it.

Re:Multithreading is not easy but it's doable (-1, Flamebait)

Anonymous Coward | more than 6 years ago | (#22714380)

Great! When I want a box that does nothing useful, but runs prototype messaging sofware, and kills our network by sending demo payloads, I'll know where to go.

Re:Multithreading is not easy but it's doable (1)

maestroX (1061960) | more than 6 years ago | (#22714564)

Interesting. But does it infringe on the QNX patent [] ??

Re:Multithreading is not easy but it's doable (1)

pieterh (196118) | more than 6 years ago | (#22714774)

Good question. The answer is "no, not as far as we're aware"; the patent covers the distribution of transactions across network nodes, invisibly to applications, and is specifically aimed as implementing GUIs. From the patent, "The invention disclosed broadly relates to graphical user interfaces (GUI's) and particularly relates to the software architectures used to implement them."

However, all software patents have the problem of "creep", so that if a market emerges that looks within reach of the claims, the patent holder - if litigious - will try to expand the scope of the patent to claim this market. The claims of this patent, which are what really count, are written in fairly abstract language.

It is impossible to clear new software for patents - the cost would exceed $1bn - so we just have to try to stay away from known danger areas.

More constructively, we also support the fight against the software patent regime, at least in Europe.

Why choose? (2, Insightful)

Evro (18923) | more than 6 years ago | (#22714266)

Just build both and let the market decide.

Heterogenous is a natural thing to do (3, Interesting)

A beautiful mind (821714) | more than 6 years ago | (#22714268)

If you have 80 or more cores, I'd rather have 20 of them support specialty functions and be able to do them very fast (it would have to be a few (1-3) orders of magnitude faster than the general counterpart) and the rest do general processing. This of course needs the support of operating systems, but that isn't very hard to get. With 80 cores, caching and threading models have to be rethought, especially caching: the operating system has to be more involved in caching than it currently is, because otherwise cache coherency can't be maintained.

This also means that programs will need to be written not just with threads, "which makes it okay for multi-core", but with CPU cache issues and locality in mind. I think VMs like the JVM, Parrot and .NET will become much more popular, as they can take care of a lot of these issues at runtime, which is impossible, or only possible in a limited way via static source-code inspection, for languages like C and friends.

Re:Heterogenous is a natural thing to do (0)

Anonymous Coward | more than 6 years ago | (#22714686)

Don't you mean "80 or Moore cores"?

I'll get my coat.

CPU != BRAIN (1)

v(*_*)vvvv (233078) | more than 6 years ago | (#22714282)

There is this view held by some (of which some are posting here) that somehow CPUs are primitive brains and that improving them will eventually result in a non-primitive brain. Hello, there is nothing remotely human about what my computer has done for me lately. Computers and humans *do* very different things, and *are* very different things.

I beg that the distinction between acquiring hints from brain structure vs creating brain structure not be blurred, and that no moderator marks "brains are like this so chips should be like that" type posts as informative or insightful.

No one at Intel has their chipset blueprints confused with an x-ray of Einstein's brain.

Re:CPU != BRAIN (1)

curmudgeon99 (1040054) | more than 6 years ago | (#22714352)

Please do not pooh-pooh our ideas, unless YOU HAVE A BETTER ONE. Please correct me if I'm wrong but I see modern computers only coming close to simulating on the most rudimentary level the functions of the LEFT hemisphere. No one has attempted to replicate the right hemisphere's function. So, I'm waiting for your better idea...

Specialisation is inevitable (2, Insightful)

adamkennedy (121032) | more than 6 years ago | (#22714318)

I have a 4-core workstation and ALREADY I get crap usage rates out of it.

Flick the CPU monitor to aggregate usage rate mode, and I rarely clear 35% usage, and I've never seen it higher than about 55% (and even that for only a second or two once an hour). A normal PC, even fairly heavily loaded up with apps, just can't use the extra power.

And since cores aren't going to get much faster, there's no real chance of getting big wins there either.

Unless you have a specialized workload (heavy number crunching, kernel compilation, etc.) there's simply going to be no point in having more parallelism.

So as far as I can tell, for general loads it seems to be inevitable that if we want more straight line speed, we'll need to start making hardware more attuned for specific tasks.

So in my 16-core workstation of the future, if my Photoshop needs to apply some relatively intensive transform that has to be applied linearly, it can run off to the vector core, while I'm playing Supreme Commander on one generic core (the game) two GPU cores (the two screens) and three integer-heavy cores (for the 3 enemy AIs), and the generic System Reserved Core (for interrupts, and low-level IO stuff) hums away underneath with no pressure.

Heterogeneity also has economics on its side.

There's very little point having specialized cores when you've only got two.

Once there's no longer scarcity in quantity, you can achieve higher productivity by specialization.

Really, any specialized core whose usage rate you can keep higher than the overall system usage rate is a net productivity win for the whole computer. And over time, anything that increases productivity wins.

Re:Specialisation is inevitable (1)

makapuf (412290) | more than 6 years ago | (#22714972)

There's very little point having specialized cores when you've only got two.
Like, say, a CPU and a GPU ? I would have thought it was pretty efficient.

I think it all breaks down to how many specialized-but-still-fairly-generic, computationally intensive tasks we define and then implement in hardware.

And, finally, it's the same specialized vs generic hardware wheel of reincarnation (see [] )

Re:Specialisation is inevitable (1)

jcupitt65 (68879) | more than 6 years ago | (#22714980)

Unless you have a specialized workload (heavy number crunching, kernel compilation, etc) there's going to simply be no point having more parallelism.

You can get very good parallelism with media apps like photoshop, audio or video encode/decode, things like that. Regular desktop apps aren't going to often go to the trouble, but I can see a future when most media libraries are heavily threaded. My spare time project (a GPL image processing library) gets about a 27x speedup on a 32-cpu machine, at least on some benchmarks.

It'll maybe be a bit like current console development: middleware authors will get their hands dirty with the hardware, and that knowledge will be packaged up and sold to app developers.

Brain (1)

slashflood (697891) | more than 6 years ago | (#22714344)

Take some advice from mother nature: as far as I know, our brain works like a heterogeneous multicore processor. We don't have multiple generic mini-brains in our head; we have one brain with highly specialized areas for different tasks. That seems to be the right concept for a computer processor.

Re:Brain (1)

Anne Thwacks (531696) | more than 6 years ago | (#22714662)

as far as I know, our brain works like a heterogeneous multicore processor

Then your brain needs an upgrade.

The brain has a (virtual) single serial processor, and a great bundle of "neural networks" which are essentially procedures built from hardware. (Kind of like early mainframes had a circuit board per instruction, and then gated the results of the selected instruction onto the bus.)

The self-modifying neural network architecture is interesting, but not to people who want to buy reliable computing engines.

While perhaps not immediately obvious to everyone,


Hence absent minded professors.

I do not want my wages computed by neural networks, and I don't want my bank to store my account on one either. If they want to use one to data-mine, well and good. (But hopefully one good enough to learn that I put their spam in the bin without reading it.)

Seems like Google would have some ideas (1)

smose (877816) | more than 6 years ago | (#22714374)

Strange, it seems to me that Google would have some ideas about how to utilize massively parallel processing, as would the supercomputing crowd.

Is the issue here how to scale supercomputing concepts down to desktop applications? Well, for starters, you can dedicate a couple of cores to run all of the background processes (on the order of 70) that my IT department insists must reside on my system, so that I might get at least one which can work on the application(s) at hand.

Simple well tested solution. (2, Funny)

thehatmaker (1168507) | more than 6 years ago | (#22714376)

Looking back at history, we see that as clock speeds and memory capacity increased, software writing was simplified by the use of higher-level languages whose output, while not as optimal as machine-code programming, ran at a similar speed to previous-generation hardware using well-optimised machine code. And so, the "problem" of writing for faster machines was solved.

For the multicore problem, I propose a similar strategy. Simply write a natural language programming interface which uses n-x cores to interpret and compile the code into a mish mash of bloated machine code, which then runs on the remaining x number of cores. Of course, several remaining cores would be needed to run this bloated mess at speeds comparable to 486's - but at least the new hardware could be widely sold, thus supporting industry!

It's not like the users really need faster software, they just need a reason to upgrade to better hardware, right?? right??

Sun's thoughts (1)

Dersaidin (954402) | more than 6 years ago | (#22714396)

I went to a presentation by Sun last Friday (by Don Kretsch and Liang Chen) on "High Powered Computing". Sun's idea of HPC is, logically, multicore/cluster solutions. They talked about some of their abstraction ideas on how to take advantage of a bunch of cores. Some interesting stuff, but it was still pretty similar to the traditional single-core approach, only branching for some stuff, like loops. I'm not sure if any of their abstraction ideas were radical enough to get excited about, but it was still interesting to see. Task-specific hardware and low-level programming seems like the best approach to me. Like graphics cards in games. Once we're comfortable with that, then maybe build up some APIs. Sun's presentation convinced me that it's the biggest challenge of modern computing.

Occam and Beyond (3, Insightful)

BrendaEM (871664) | more than 6 years ago | (#22714404)

Perhaps "panic" is a little strong. At the same time, programming languages such as Occam, built from the ground up for concurrency, seem very provocative now. Perhaps Occam's syntax could be modified to a Python-type syntax for more popularity.

[Although, personally, I prefer Occam's syntax over that of C's.] []

I think that a thread-aware programming language would be good in our multi-core world.

Help me understand the distinction (2, Interesting)

Junior J. Junior III (192702) | more than 6 years ago | (#22714418)

I'm curious how having specialized multi-core processors is different from having a single-core processor with specialized subunits. I.e., a single-core x86 chip has a section of it devoted to implementing MMX, SSE, etc. Isn't having many specialized cores just a sophisticated way of restating that you have a really big single-core processor, in some sense?

Re:Help me understand the distinction (1)

photon317 (208409) | more than 6 years ago | (#22714854)

The difference is that the subunits are instructed on what to do via a single procedural stream of instructions from the compiler's point of view. The CPU does some work to reorder and parallelize the instruction stream to try to keep all the subunits busy if it can, but it doesn't always do a great job, and the compiler also knows the rules for how a given CPU does the re-ordering/parallelization and tries to optimize the stream to better the outcome. This scheduling is taking place at a very low level with very small chunks of (or even single) instructions. Algorithms for auto-parallelizing code quickly in hardware don't really scale up to bigger chunks of code (and as we've seen, even when they deal with smaller chunks, the stream needs to be pre-optimized by the compiler for effectiveness).

But certainly this must be an area of active research. An "obvious" (if currently impossible) solution is to build an 80 core CPU that looks like a 4-core CPU to the operating system, and dedicates a few cores to auto-parallelizing the 4 instruction streams from the OS onto the remaining bulk of the cores. However if we had algorithms that could do that job reasonably effectively in realtime, we could certainly put those same algorithms in compilers and make them do an even better job in non-realtime. So that makes that approach seem silly.

Single core vs multicore (1)

Xacid (560407) | more than 6 years ago | (#22714454)

Call me what you will, but personally I *still* prefer the performance of a super-fast single core (~3.5GHz+) over this over-hyped multi-core phenomenon. I've yet to see any *major* differences between two machines I have that are the same clock speed, one single-core, one dual. The difference I do experience is similar to what I'd expect from a 0.5GHz jump. In other words, the architecture *does* need to change if they have any desire for significant performance increases.

I like this, more complexity - better jobs! (1)

slashbart (316113) | more than 6 years ago | (#22714500)

The way I see it, to get maximum performance out of these chips you need a deeper understanding of them, i.e. it requires higher skills, i.e. better-quality jobs, better money, the works. Considering that a lot of programmers have a really hard time dealing with concurrency even at the thread level, these coming chips will only make it harder.
I don't think most concurrency problems can be automated away: it's the concepts and design of the concurrent algorithms that are hard, not so much the implementation (although that is where the bugs bite you, when the stars are just right (wrong?)).

I'm rambling a bit I see, but I'm looking forward to interesting times ahead.
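The classic bug being alluded to is the unsynchronized read-modify-write. A minimal sketch: several threads incrementing a shared counter, with a lock making the update atomic (drop the lock and updates can be lost, though whether losses actually appear depends on the runtime's scheduling):

```python
import threading

counter = 0
lock = threading.Lock()

def worker(n):
    global counter
    for _ in range(n):
        # Without the lock, this read-modify-write is a classic race:
        # two threads can both read the old value and one update is lost.
        with lock:
            counter += 1

threads = [threading.Thread(target=worker, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# With the lock, counter is exactly 40_000 every run.
```

The subtlety the commenter points at is that the racy version often *passes* casual testing, which is what makes thread-level concurrency so unforgiving.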

better idea (2, Funny)

timster (32400) | more than 6 years ago | (#22714504)

See, the thing to do with all these cores is run a physics simulation. Physics can be easily distributed to multiple cores by the principle of locality. Then insert into your physics simulation a CPU -- something simple like a 68k perhaps. Once you have the CPU simulation going, adjust the laws of physics in your simulation (increase the speed of light to 100c, etc.) so that you can overclock your simulated 68k to 100GHz. Your single-threaded app will scream on that.

P.S.: I know why this is impossible, so please don't flame me.

How is heterogenous CPU different to separate GPU? (1)

tomalpha (746163) | more than 6 years ago | (#22714542)

Genuine question that I don't know the answer to:

How are heterogeneous CPU cores different conceptually to a modern PC system with say:

    2 x General purpose cores (in the CPU)
100 x Vector cores (in the GPU)
    n x Vector cores (in a physics offload PCI card)

How is moving the vector (or whatever) cores onto the CPU die different to the above setup, apart from allowing for faster interconnects?

Current state of software development (5, Funny)

Alex Belits (437) | more than 6 years ago | (#22714650)

Ugg is smart.
Ugg can program a CPU.
Two Uggs can program two CPUs.
Two Uggs working on the same task program two CPUs.
Uggs' program has a race condition.
Ugg1 thinks, it's Ugg2's fault.
Ugg2 thinks, it's Ugg1's fault.
Ugg1 hits Ugg2 on the head with a rock.
Ugg2 hits Ugg1 on the head with an axe.
Ugg1 is half as smart as he was before working with Ugg2.
Ugg2 is half as smart as he was before working with Ugg1.
Both Uggs now write broken code.
Uggs' program is now slow, wrong half the time, and crashes on that race condition once in a while.
Ugg does not like parallel computing.
Ugg will bang two rocks together really fast.
Ugg will reach 4GHz.
Ugg will teach everyone how to reach 4GHz.

Invention? (1)

SharpFang (651121) | more than 6 years ago | (#22714720)

"Some, like senior AMD fellow, Chuck Moore, believe that the industry should move to a new model based on a multiplicity of cores optimized for various tasks"

And let's give the cores names like Paula, Agnus, Denise...

One Fast Core, Multiple Commodity ones (2, Interesting)

Brit_in_the_USA (936704) | more than 6 years ago | (#22714922)

I have read many times that some algorithms are difficult or impossible to multi-thread. I envisage the next logical step being a two-socket motherboard, where one socket holds an 8+ core CPU running at a low clock rate (e.g. 2-3GHz) and the other a single core running at the highest frequency the manufacturing process can achieve (e.g. 2x to 4x the clock speed of the multi-core), with whatever cache-size compromises are required.

This helps get around the yield issues of getting all cores to work at a very high frequency, and the related thermal issues. It could be a boon for general-purpose computers that run a mix of hard-to-multi-thread and easy-to-multi-thread programs - assuming the OS could be intelligent about which cores the tasks are scheduled on. The cores might or might not share the same instruction set, but having the same instruction set would be the easy first step.
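The OS policy such a board assumes can be sketched as a toy placement rule: single-threaded jobs go to the one fast core, parallel jobs to the slower many-core socket. All names here (the socket labels, the job list) are invented for illustration, not any real scheduler's API:

```python
# Hypothetical placement policy for a fast-socket / many-core-socket board.
FAST_SOCKET, MANY_SOCKET = "fast", "many"

def place(job_threads):
    """Pick a socket for a job based on how many threads it can use."""
    # Serial work is latency-bound: give it the high-clock core.
    # Parallel work is throughput-bound: spread it over the many cores.
    return FAST_SOCKET if job_threads == 1 else MANY_SOCKET

jobs = {"gzip": 1, "raytracer": 8, "game_logic": 1, "encoder": 4}
placement = {name: place(threads) for name, threads in jobs.items()}
```

A real scheduler would also have to handle jobs whose parallelism varies over time, and migration costs between the sockets - the hard part the comment's "assuming the OS could be intelligent" glosses over.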

What about the OMG? (1)

guysmilee (720583) | more than 6 years ago | (#22714974)

Doesn't the OMG (Object Management Group) have anything to help with this ... suggested patterns ... specs, etc.?