
Intel Talks 1000-Core Processors

samzenpus posted more than 3 years ago | from the we're-gonna-need-a-bigger-heat-sink dept.


angry tapir writes "An experimental Intel chip shows the feasibility of building processors with 1,000 cores, an Intel researcher has asserted. The architecture for the Intel 48-core Single Chip Cloud Computer processor is 'arbitrarily scalable,' according to Timothy Mattson. 'This is an architecture that could, in principle, scale to 1,000 cores,' he said. 'I can just keep adding, adding, adding cores.'"


326 comments

Imagine (0)

Anonymous Coward | more than 3 years ago | (#34303372)

Imagine a beowulf cluster of ... ah yeah.

Re:Imagine (2, Interesting)

JWSmythe (446288) | more than 3 years ago | (#34303552)

Why? :) I know, it's the meme. It's just that I've built a couple of Beowulf clusters for fun, and didn't have an application written to use MPI (or any of the alphabet soup of protocols), so it was just an exercise, not for any practical use. It's not like most of us are crunching numbers hard enough to need one, and it won't help with playing games or even building kernels.

I'd like to see a 1k-core machine on my desktop, but that's beyond the practical limits of any software currently available. Linux can only go to 256 cores. Windows 2008 tops out at 64. But hey, if they did come to market, I know who would be first to support all those cores, and they're not from Redmond (or their offshore outsourced developers).
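For anyone who has never seen it, here is a minimal sketch of the sort of MPI program a Beowulf cluster (or a many-core chip presented as a cluster) would run. This is plain MPI-1 in C, built with mpicc and launched with mpirun; nothing in it is specific to the chip in TFA.

/* hello_mpi.c - each process reports its rank, then rank 0 collects a token
 * from every other rank to show explicit message passing.
 * Build/run (OpenMPI or MPICH): mpicc hello_mpi.c -o hello_mpi
 *                               mpirun -np 48 ./hello_mpi
 */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;

    MPI_Init(&argc, &argv);                 /* start the MPI runtime          */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* which process am I?            */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* how many processes in the job? */

    printf("Hello from rank %d of %d\n", rank, size);

    if (rank == 0) {
        for (int src = 1; src < size; src++) {
            int token;
            MPI_Recv(&token, 1, MPI_INT, src, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            printf("rank 0 received %d from rank %d\n", token, src);
        }
    } else {
        int token = rank * rank;
        MPI_Send(&token, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
    }

    MPI_Finalize();
    return 0;
}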

Re:Imagine (1, Insightful)

AuMatar (183847) | more than 3 years ago | (#34303788)

Why would you care to see one on your desktop? Do you have any use for one? There's a point where, except for supercomputers, enough is enough. We've probably already passed it.

Re:Imagine (1)

hairyfeet (841228) | more than 3 years ago | (#34303998)

Actually I'd say the thing that is scaring the crap out of Intel is that "good enough" was passed for most folks quite a few miles back. I have several customers, as well as my GF, on late-model P4s, and you know what? Most of the time those 2.8GHz+ machines are sitting there twiddling their silicon thumbs. The simple fact is YouTube, FB, email, and surfing just don't take that much juice. And I'm sure the fact that the only ones I've been able to upsell to new multicores did so because AMD is really cheap now doesn't help Intel either.

Which brings me to TFA, which I'd say just shows how Intel doesn't seem to see the real problem: parallel programming ain't easy, and most apps just don't scale well past a couple of cores. There just hasn't been a "killer app" for pushing the masses to true multicore computing. While I know TFA is directed towards servers pushing major code, that really is a small niche compared to the consumer space.

What Intel and AMD need to do is find that "killer app" that will get all those folks running late-model P4s to drop them like a bad habit for the new hotness. Hell, I usually have my family on the fast track to the new hotness because I like to game, but my boys have been playing MMOs just fine with my P4 hand-me-downs, so I really don't even see a point to upgrading. There really hasn't been any "killer app" to push adoption like we saw in the MHz race. Hell, even hardcore gaming (a pretty tiny but tech-heavy niche) hasn't really seen any benefits beyond going dual, with few games gaining from a third core, much less a quad. If Intel and AMD want to push multicores, somebody really needs that "killer app" to come out, and stat.

Re:Imagine (5, Informative)

seifried (12921) | more than 3 years ago | (#34303794)

Linux can only go to 256 cores.

Uhmm no.

./arch/ia64/Kconfig: int "Maximum number of CPUs (2-4096)"
./arch/powerpc/platforms/Kconfig.cputype: int "Maximum number of CPUs (2-8192)"

In x86 we have:

config MAXSMP
bool "Enable Maximum number of SMP Processors and NUMA Nodes"
depends on X86_64 && SMP && DEBUG_KERNEL && EXPERIMENTAL

And I believe you can crank that dial all the way up.

Also consider this: the number of cores in my desktop has been doubling every year or two (starting from a single-core chip), and 6- and 8-core parts are cheap now, so we'll be at 1024 in roughly 7-14 years. That makes sense because the GHz war is done and simply adding more cores is relatively cheap (once you have the interconnect, making a bigger CPU isn't all that hard).
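As an aside, whatever NR_CPUS a kernel was built with, a program can ask how many logical CPUs are actually configured and online. A minimal sketch using glibc's sysconf on Linux; nothing here is specific to this chip:

/* cpucount.c - print configured vs. online logical CPUs (glibc/Linux). */
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    long configured = sysconf(_SC_NPROCESSORS_CONF);  /* CPUs the kernel knows about */
    long online     = sysconf(_SC_NPROCESSORS_ONLN);  /* CPUs currently online       */

    printf("configured: %ld, online: %ld\n", configured, online);
    return 0;
}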

This is NOT a cache-coherent/SMP machine! (2, Insightful)

Terje Mathisen (128806) | more than 3 years ago | (#34303870)

The key difference between this research chip and the other multicore chips Intel has worked on, like Larrabee, is that it is explicitly NOT cache coherent, i.e. it is a cluster-on-chip instead of a single-image multiprocessor.

This means, among many other things, that you cannot load a single Linux OS across all the cores, you need a separate executive on every core.

Compare this with the 7-8 Cell cores in a PS3.

Terje

Re:Imagine (1)

Jeremy Erwin (2054) | more than 3 years ago | (#34303662)

Imagine a Beowulf with all of the overhead, and none of the speed.

For simplicity's sake, the team used an off-the-shelf 1994-era Pentium processor design for the cores themselves. "Performance on this chip is not interesting," Mattson said. It uses a standard x86 instruction set.

Jeez... (5, Funny)

Joe Snipe (224958) | more than 3 years ago | (#34303374)

I hope he never works for Gillette.

Re:Jeez... (3, Funny)

monkeySauce (562927) | more than 3 years ago | (#34303472)

Other way around; he used to work for Gillette. He left after they cancelled his 1000-blade razor project.

Re:Jeez... (4, Funny)

Slashcrunch (626325) | more than 3 years ago | (#34303606)

Other way around; he used to work for Gillette. He left after they cancelled his 1000-blade razor project.

Yes, I also heard about the 1000-blade project getting cut...

Obligatory (0)

Anonymous Coward | more than 3 years ago | (#34303978)

In Soviet Russia, 1000-blade project cuts you!
Then you die.

I hope he works for Mr. Coffee or GE. (1)

Anonymous Coward | more than 3 years ago | (#34303520)

With all that heat, it would be nice to have a skillet that could cook a sandwich or eggs, or brew coffee. I lived on a Mr. Coffee machine for over 3 years, boiling vegetables or tea, and my only regret is that, while it kept the room warm and provided the occasional hot towel bath, it would have been nice if its heat source had been an embedded computer rather than a wasteful heating element. I know some people used a self-throttling Pentium 4 to boil food off their waterblock and such. Why not?

Oblig Family Guy (0)

Anonymous Coward | more than 3 years ago | (#34303674)

Peter: By gluing many razorblades to this ordinary desk fan, I'll save time in my morning routine!

*Peter turns on the fan and moves it closer to his face as the camera changes to a view of the wall through a window. Peter screams, blood spatters on the wall.*

Peter (offscreen): Lois, I done it again!

And here's a crappy YouTube link to that scene. No idea how long it will last: http://www.youtube.com/watch?v=CKHY4OsAPc8 [youtube.com]

does it run Linux - yea but it is "boring" (1)

G3ckoG33k (647276) | more than 3 years ago | (#34303384)

From the article: "By installing the TCP/IP protocol on the data link layer, the team was able to run a separate Linux-based operating system on each core. Mattson noted that while it would be possible to run a 48-node Linux cluster on the chip, it 'would be boring.'"

Huh?! Boring?! It would have made a nice first post on Slashdot on the eternal topic - does it run Linux? - to begin with.

Then we have all the programming goodies to follow up with.

Re:does it run Linux - yea but it is "boring" (0)

Anonymous Coward | more than 3 years ago | (#34303398)

An instance of the OS for each core? Wow, that's a new low for Linux's multithreading..

Re:does it run Linux - yea but it is "boring" (0)

Anonymous Coward | more than 3 years ago | (#34303578)

You do know, moron, that you can run many processes and threads on a single core?

Re:does it run Linux - yea but it is "boring" (1)

fractoid (1076465) | more than 3 years ago | (#34303896)

So what you're getting at here is... you want MORE than one instance of the OS for each core? :P

Re:does it run Linux - yea but it is "boring" (0)

Anonymous Coward | more than 3 years ago | (#34303706)

What are you talking about, you fucking moron?

Re:does it run Linux - yea but it is "boring" (1)

c0lo (1497653) | more than 3 years ago | (#34303416)

From the article: "By installing the TCP/IP protocol on the data link layer, the team was able to run a separate Linux-based operating system on each core. Mattson noted that while it would be possible to run a 48-node Linux cluster on the chip, it 'would be boring.'"

Huh?! Boring?! It would have made a nice first post on Slashdot on the eternal topic - does it run Linux? - to begin with.

Then we have all the programming goodies to follow up with.

;) To make things interesting, each of the cores would have to use a public IPv4 address.

Re:does it run Linux - yea but it is "boring" (1)

davester666 (731373) | more than 3 years ago | (#34303688)

Of course, it will support PPP, namely pay-per-processor. You can have the first one cheap, the rest, not so much...

Re:does it run Linux - yea but it is "boring" (4, Interesting)

RAMMS+EIN (578166) | more than 3 years ago | (#34303616)

Running Linux on a 48-core system is boring, because it has already been run on a 64-core system in 2007 [gigaom.com] (at the time, Tilera [tilera.com] said they would be up to 1000 cores in 2014; they're up to 100 cores per CPU now).

As far as I know, Linux currently supports up to 256 CPUs. I assume that means logical CPUs, so that, for example, this would support one CPU with 256 cores, or one CPU with 128 cores with two CPU threads per core, etc.

Re:does it run Linux - yea but it is "boring" (1)

Lumbre (1822486) | more than 3 years ago | (#34303698)

Yeah, it's boring with the same architecture we have now. But imagine if someone came up with a creative solution besides the current memory model. Memory management is probably hideous on a 1000-core system; they seemed to pose that lightly in the article.

This might even be a solution for a particular type of dedicated computer, not a personal computer.

Re:does it run Linux - yea but it is "boring" (1)

Metabolife (961249) | more than 3 years ago | (#34303872)

The most interesting part to me is how they're actually making a built-in router for the chip. The cores communicate through TCP/IP. That's incredible.
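To make the "cores talk TCP/IP" point concrete: since each core runs its own Linux image with its own address, ordinary socket code is enough to pass a message from one core to another. A minimal sketch follows; the 192.168.0.2 address and port 5000 are made-up placeholders, not the SCC's actual addressing scheme.

/* send_msg.c - send one message to a neighbouring core over TCP. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) { perror("socket"); return 1; }

    struct sockaddr_in peer = {0};
    peer.sin_family = AF_INET;
    peer.sin_port   = htons(5000);                      /* placeholder port    */
    inet_pton(AF_INET, "192.168.0.2", &peer.sin_addr);  /* placeholder address */

    if (connect(fd, (struct sockaddr *)&peer, sizeof peer) < 0) {
        perror("connect");
        return 1;
    }

    const char msg[] = "hello from core 0";
    write(fd, msg, sizeof msg);
    close(fd);
    return 0;
}

A matching server on the other core would just listen() and accept(), exactly as on two separately networked boxes.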

Re:does it run Linux - yea but it is "boring" (0)

Anonymous Coward | more than 3 years ago | (#34303922)

The cores communicate through TCP/IP. That's incredible.

Incredible overhead, and also a little silly. I want to be able to distribute my workload more easily if it's all on one chip. The trouble with clusters is spreading the data around. Also, the overhead of a thousand Linux systems running is unneeded. Single system. A thousand cores. Distributing to n virtual instances, where n is the number of system instances needed and far lower than a thousand.

I can put a thousand cores on a chip... (0)

Anonymous Coward | more than 3 years ago | (#34303390)

...as long as you don't mind that it does nothing useful because of off-chip bandwidth starvation. I fail to see anything in TFA that suggests that problem is solved.

Re:I can put a thousand cores on a chip... (1)

mr_mischief (456295) | more than 3 years ago | (#34303838)

Right. The really interesting chips will arrive when you put between four and sixteen cores on a die along with the entirety of main RAM for those cores (in a NUMA configuration with other sockets, starting with maybe a gigabyte or so per die). You could then use off-chip SDRAM both as a paging store and as a cache between the storage system and the processor/memory die.

You could map registers straight to portions of the on-chip memory if necessary for backwards compatibility. You'd probably be better off, though, compiling nearly everything to just use memory addressing. You'd only hit the SDRAM to load a new entire page into the on-chip RAM. On-chip cache and the circuitry to minimize misses in the cache could mostly go away, and the cores themselves could be simplified. You might even get away with moving the SDRAM controller back off-chip at first to free up some space on the die since the working memory would be so fast once the data was in it.

Unfortunately, this assumes billions of switches just for the main memory and probably quality control nightmares in the first several models.

However, it's the logical conclusion for the way forward. Caches keep taking more die space to deal with the fact that memory is so much slower than processors. Once you get over a certain size cache, you're just wasting circuitry on managing a large block of memory in little chunks that's better treated as a large single block of memory. The virtual to physical mapping already figures out what's in main RAM and what's out in the swap. Just let it do that with the on-die memory and eliminate the extra cache logic to make more on-die memory.

Intel has mentioned [tomshardware.com] putting main memory on the die already. They even mentioned that they could do it with a form of DRAM rather than with SRAM.

Message passing between cores? Hmm... (3, Interesting)

PaulBu (473180) | more than 3 years ago | (#34303400)

Are they trying to reinvent the Transputer? :)

But yes, I am happy to see Intel pushing it forward!

Paul B.

or a paragon (0)

Anonymous Coward | more than 3 years ago | (#34303442)

Coolest thing: looking at the front panel, you could judge your (and others') code by the way the LED bar graphs lit up. Mostly vertical: good code. Lots of horizontal: a lot of communication without much computation going on. Of course, with a little creativity, the display lent itself to cool xmas displays, etc.

http://en.wikipedia.org/wiki/Intel_Paragon [wikipedia.org]

Yep! except for... (1)

PaulBu (473180) | more than 3 years ago | (#34303496)

Intel would not have (presumably!) to re-invent *Intel* Paragon! :)

We can throw a Connection Machine in there, and really date ourselves -- but it's still nice to know that finally CMOS tech has caught up with late 80s comp. arch. advances!

And then, don't get me started on the original Tera; with its multithreading it seemed to offer much better bang for the buck in chip real estate than the currently accepted multicore solutions. But what would I know...

Paul B.

Re:Message passing between cores? Hmm... (0)

Anonymous Coward | more than 3 years ago | (#34303564)

The architecture was very different on the transputer chips. They were a cool idea that never got the support needed. I seriously considered them in the old days for rendering, but the superior architecture couldn't overcome the slower speeds they had. For complex math they had major advantages, but they never had the funding other chips had. I'm not sure the technology would translate to multicore. They were more about parallel processing: they handled multichip configurations better than standard chips did, rather than many cores on one chip.

Re:Message passing between cores? Hmm... (2, Interesting)

TinkersDamn (647700) | more than 3 years ago | (#34303798)

Yes, I've been wondering the same thing. Transputers contained key ideas that seem to be coming around again...
But a more crucial thing might be how much heat you can handle on one chip. These guys are already at 25-125 watts, likely depending on how many cores are actually turned on. After all, they're playing pretty hefty heat-management tricks on current i7's and Phenoms.
http://techreport.com/articles.x/15818/2 [techreport.com]
What use are 48 cores, let alone 1000, if they're all being slowed down to 50% or whatever by heat and power juggling?

Could be good for games using raytracing (4, Insightful)

mentil (1748130) | more than 3 years ago | (#34303406)

This is for server/enterprise usage, not consumer usage. That said, it could scale to the number of cores necessary to make realtime raytracing work at 60fps for computer games. Raytracing could be the killer app for cloud gaming services like OnLive, where the power to do it is unavailable for consumer computers, or prohibitively expensive. The only way Microsoft etc. would be able to have comparable graphics in a console in the next few years is if it were rental-only like the Neo-Geo originally was.

1000 cors? (0)

Anonymous Coward | more than 3 years ago | (#34303414)

10 FOR N = 1 to 1000
20 PRINT "Cor!";
30 NEXT N

Temperature? (1)

garompeta (1068578) | more than 3 years ago | (#34303418)

Would the temperature rise 1000 times more than now?
(Would we need cryogenic coolers?)

Re:Temperature? (1)

c0lo (1497653) | more than 3 years ago | (#34303462)

TFA:

The chip, first fabricated with a 45-nanometer process at Intel facilities about a year ago, is actually a six-by-four array of tiles, each tile containing two cores. It has more than 1.3 billion transistors and consumes from 25 to 125 watts.

Re:Temperature? (0)

Anonymous Coward | more than 3 years ago | (#34303530)

Dude, what the fuck, that's only 48 cores. How does that get you anywhere close to 1000?

Re:Temperature? (4, Interesting)

c0lo (1497653) | more than 3 years ago | (#34303808)

Dude, what the fuck, that's only 48 cores. How does that get you anywhere close to 1000?

Well, Watson, that's elementary...

  • The correct question should have been: "How many watts does one need to dissipate?"... because the temperature is set by "how high can it get and still have the transistors working".
  • In regard to the power dissipation: the architecture would have a common component (event passing, RAM fetches, etc.) and N cores. Assuming each core needs to dissipate the same power (say, at peak utilization), and assuming the 25-125 watts is the range defined by "1 core used" to "all 48 cores used", some simple linear arithmetic gives: power dissipated per core approx 2 watts (a bit more actually), with the "common component" eating approx 23 watts.
    Therefore, on top of the computation benefits derived from fully utilizing 1000 cores, one would have a pretty good heat source: 2150 watts or so. One's choice what to do with it, but it's far too much for a domestic-sized slow cooker (the dishes would come with a weird burned taste).

Satisfied now?

If not, to put things in perspective: assuming our ancestors (who could use only horses as a source of power) had wanted to use this computer, they'd need approx. 2.68 horses... but hey, wow... what a delight to play the MMORPG so smoothly... especially in "farming/grinding" phases.

PS. The above computations are meant to be funny and/or an exercise in approximating based on insufficient data and/or a way to vent some frustration caused by "all work and no play"; definitely wasted time... Ah, yes, some karma would be nice, but not mandatory.
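For anyone who wants to check the parent's back-of-the-envelope model (the assumptions are the parent's: 25 W with one core busy, 125 W with all 48 busy, linear in between), the arithmetic works out as:

P(n) = P0 + n*p, with P(1) = 25 W and P(48) = 125 W
=> p = (125 - 25) / (48 - 1) ≈ 2.13 W per core, and P0 ≈ 22.9 W
P(1000) ≈ 22.9 + 1000 * 2.13 ≈ 2150 W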

Re:Temperature? (2, Funny)

TapeCutter (624760) | more than 3 years ago | (#34303650)

1.3 billion transistors!!! When I was a kid we had 9 and you could open the box and count 'em.

Bring out your Memes! (4, Funny)

SixDimensionalArray (604334) | more than 3 years ago | (#34303434)

Imagine a Beowulf cluster of th^H^H^H

Ah, forget it, the darn thing practically is one already! :/

"Imagine exactly ONE of those" just doesn't sound the same.

1,000,000 cores! (1)

EricX2 (670266) | more than 3 years ago | (#34303448)

Why have 1000 cores when you can have 1 MILLION CORES (all running applications that can barely take advantage of 1 or 2)?

Re:1,000,000 cores! (0)

Anonymous Coward | more than 3 years ago | (#34303476)

That just means you can run anywhere from 500,000 to 1,000,000 applications at a time!

But really, if this is arbitrarily scalable, why did he say 1,000? Does this mean building supercomputers is incredibly dull now since you just stick more cores in there?

Re:1,000,000 cores! (1)

Macrat (638047) | more than 3 years ago | (#34303522)

Why have 1000 cores when you can have 1 MILLION CORES, (all running applications that can barely take advantage of 1 or 2)

Your computer only runs 1 application at a time?

Re:1,000,000 cores! (1)

wvmarle (1070040) | more than 3 years ago | (#34303546)

While scalable from a computing pov (data exchange, addressing, whatnot) I can imagine that it's not scalable from a physical pov: power supply, size, heat dissipation, and getting your signals to and from the chip over longer and longer distances.

The last part is already becoming an issue due to the long-wire problem: at 3 GHz, a signal travels at most about c/f = 3×10^8 / 3×10^9 m ≈ 10 cm before the next clock edge. One core communicating with another over a distance of just 5 cm would find that the data from one core arrives at the other only halfway through the cycle.

One question? (0, Flamebait)

Anonymous Coward | more than 3 years ago | (#34303450)

Just how small does your penis need to be to need 1,000 cores? Are we talking ingrown, like with monster trucks, or just really small? I render CG animation, so 1,000 cores has a practical use for me, but for most we have to be talking bragging rights here. I mean, having a 12' tall Toyota Hilux or a 1,000-core computer has to be BYOV, Bring Your Own Vibrator, time.

Re:One question? (1)

Macrat (638047) | more than 3 years ago | (#34303532)

Just how small does your penis need to be to need a 1,000 cores?

That's what it takes to run Flash these days.

Re:One question? (2, Insightful)

JWSmythe (446288) | more than 3 years ago | (#34303582)

The only thing I'd be compensating for is the fact I can't do calculations at Exaflop rates in my head.

Just like my car only compensates for the fact I can't run at 165mph. :)

accurate representation (5, Interesting)

pyronordicman (1639489) | more than 3 years ago | (#34303456)

Having attended this presentation at Supercomputing 2010, for once I can say without a doubt that the article captured the essence of reality. The only part it left out is that the interconnect between all the processing elements uses significantly less energy than that of the previous 80-core chip; I think the figure was around 10% of chip power for the 48-core, and 30% for the 80-core. Oh, and MPI over TCP/IP was faster than the native message-passing scheme for large messages.

How many... what's next? (1)

c0lo (1497653) | more than 3 years ago | (#34303458)

"It's a lot harder than you'd think to look at your program and think 'how many volts do I really need?'" he [Mattson] said.

First it was RAM (640 KB should be... doh), then MHz/GHz, then watts, now it's volts... so, what's next?
(my bet... returning to RAM and the advent of x128)

Workaround. (1)

miffo.swe (547642) | more than 3 years ago | (#34303460)

Am I the only one feeling this is just a foray into multicore chips because they hit a brick wall when it comes to faster single-core CPUs? While I like the thought of, say, 8 cores or something, I'd much rather have those 8 cores be faster than have a frigging supercomputer under my desk.

Workaround, yeah (1)

fnj (64210) | more than 3 years ago | (#34303494)

Er, yeah, pretty much everyone knows they have no practical way to make the clock speed much faster. The only thing they can do is proliferate cores beyond all reason. Nobody has the slightest idea how to take advantage of that many cores in normal household use and even most workstation use.

Re:Workaround, yeah (1)

miffo.swe (547642) | more than 3 years ago | (#34303668)

Let's hope some more work is put into light-based computers. Those hold the promise of much faster single-core CPUs. Current efforts seem wasted when you go beyond 10-15 cores. Mostly you'll only have a bunch of dead cores waiting for something useful to do, while not being able to help much when you really need a spike in CPU power.

Re:Workaround, yeah (5, Informative)

wierd_w (1375923) | more than 3 years ago | (#34303758)

You've obviously never worked in Aerospace.

I can bring a quad core Xeon system to its knees running Catia. (I mean, 100% saturation, all 4 cores, with IO contention.) I do it fairly regularly too.

Might have something to do with the NP-hard problem of resolving tangencies on extremely complex NURBS surfaces (aircraft skins).

Granted, that is not a "normal" workstation; but I would be VERY happy indeed to have a 1000-core workstation at my disposal. Maybe then I could actually work with Gulfstream's horrible part models, where they include literally the whole god-damn aircraft's surface geometry in the digital part model for a fucking bolt. (Guess what happens when you load several such models and digitally assemble them. I have seen a 64-bit workstation allocate over 8 GB of swap because of them and their dumbassery.)

Now, if I could get one with over 1TB of RAM installed too, then I'd be in business.

Re:Workaround. (0)

Anonymous Coward | more than 3 years ago | (#34303538)

Yeah... the computer you're typing on right now was a supercomputer 10 years ago, by desktop standards.

Re:Workaround. (1)

miffo.swe (547642) | more than 3 years ago | (#34303724)

Not really a supercomputer sitting under my desk by any means, unless you talk about a supercomputer from over 20 years ago, and even then it wouldn't be as fast at I/O.

The supercomputers of 2000 are pretty impressive if you ask me, and we won't get performance even remotely like that for a long time.

http://www.top500.org/list/2000/11/ [top500.org]

Re:Workaround. (0)

mr_mischief (456295) | more than 3 years ago | (#34304002)

The #100 on that list (bottom of the page) had 24 cores and a peak GFLOPS of 192.

The AMD Radeon 6850, which is a mainstream or low-end gaming graphics processor and available for around $200 as we speak, has 960 cores with a theoretical peak performance of 1.5 TFLOPS. I haven't seen LinPack numbers for it and I'm not sure AMD has it working under LinPack just yet.

The numbers of a workstation using OpenCL or CUDA on even one high-end graphics card would put it in the top ten from 2000. Some systems for professional workstations and clusters are specified with four or more GPGPU cards specifically for use as general-purpose accelerated vector processors.

NVidia's Tesla C2050 in a dual-GPU, dual-CPU (Xeon X5670) single system had an actual test at 656.1 GFLOPS [gameriot.com]. That's a single 1U system. Granted, it had 48 GB of RAM installed and a street price of around $11k. Still, that's a far cry from a supercomputer price. These GPU cards are being put into supercomputer clusters being built now, as they offer about 8x the performance per node over a 2x Xeon node without the GPUs. One node would be in the top 30 from ten years ago.

When measurements move from GFLOPS to TFLOPS, make sure you notice the difference. It's a big one. The top supercomputers are now in PFLOPS territory.

Performance gains from multithreading not clock (1)

perpenso (1613749) | more than 3 years ago | (#34303880)

Am i the only one feeling this is just a foray into multicore chips because they hit a brick wall when it comes to faster single core CPUs?

For many years (at least 5, possibly more) Intel has been telling developers that future performance gains will come from multithreading not faster clock speeds. So no, you are not the only one feeling this way. :-)

And here goes RAM bandwidth (1)

snikulin (889460) | more than 3 years ago | (#34303504)

Again...
Alternatively, NUMA on a single CPU (different memory channels connected to different cores).
It would be a bitch to program (but fun nevertheless).

gpu's have been doing this for years... (1, Interesting)

Anonymous Coward | more than 3 years ago | (#34303510)

Given that for years GPUs have had hundreds of processors (the power of CUDA is awesome!), this is long overdue from lazy CPU designers like Intel....

Re:gpu's have been doing this for years... (1)

lennier1 (264730) | more than 3 years ago | (#34303592)

True. CPUs like that would be a godsend for tasks like 3D rendering (entertainment industry, architectural visualization, ...).

Re:gpu's have been doing this for years... (1)

wierd_w (1375923) | more than 3 years ago | (#34303766)

Aerospace mockup and simulation... (Imagine, a "down to the last bolt" NURBS model, with dynamic stress simulation... Can theoretically be done now, but I have never seen a workstation in any production engineering department get closer than perhaps a wing segment before crashing the workstation.)

I greedily await such a future.

Re:gpu's have been doing this for years... (1)

lennier1 (264730) | more than 3 years ago | (#34303860)

Here's an example of how it's used in the entertainment sector:
http://www.youtube.com/watch?v=JhJauu_vB2A [youtube.com]
Basically linking a 3D suite up to a camera to get motion data, reference point positioning and other information to allow more seamless integration and even low-quality previews during the shoot. It's the technique from "Avatar", which was combined with two-stage motion capturing to make those shots possible ( http://www.wired.com/magazine/2009/11/ff_avatar_5steps/ [wired.com] ).

Even with twin hexacore workstations on the market, and using GPU-based processing as well, they're still in dire need of more.

Biggest Hurdle Not Cores (1)

Lokeh (1363285) | more than 3 years ago | (#34303518)

I took an intro to ECE class last fall that was basically just a parade of people coming in and talking about the kinds of things that they do as an engineer. One of the speakers talked about how one could have all of these cores, but that coding to take advantage of all of them was such a difficult task that it's hard to find any software that takes advantage of the few cores we're shipping today, let alone a hundred cores or a thousand cores. Apparently he was working on a project - a sort of wrapper? I think he mentioned AI but I don't know if he was just blowing smoke up our ass at that point - to help streamline writing for thousands of cores. I don't know how much truth is in that but I found it interesting, and would love to hear from someone who actually codes these kinds of things.

Re:Biggest Hurdle Not Cores (1)

pyronordicman (1639489) | more than 3 years ago | (#34303574)

It's true that many desktop/server applications don't have the parallelism available to make use of many cores (i.e. > 2). However, this chip was designed with scientific applications in mind, where thousands, if not millions, of calculations can be executed simultaneously. Many of these problems are readily mapped to programming models that take advantage of many cores, such as message passing or SIMD/vector processing. For those programs that don't have available parallelism, there's not a whole lot to do with extra processing power. You can sometimes speculatively execute code, but that's a tricky problem for the compiler and runtime to figure out.

Re:Biggest Hurdle Not Cores (3, Insightful)

Anonymous Coward | more than 3 years ago | (#34303590)

Basically, we are going to need compilers that automatically take advantage of all that parallelism without making you think about it too much, and programming languages that are designed to make your programs parallel-friendly. Even Microsoft is finally starting to edge in this direction with F# and some new features of .NET 4.0. Look at Haskell and Erlang for examples of languages that take such things more seriously, even if the world takes them less seriously.

I don't know about AI, but almost certainly we will end up with both compilers and virtual machines that are aware of parallelism and try to take advantage of it whenever possible.

But still, certain algorithms just aren't very friendly to parallelism, no matter what technology you apply to them.
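To illustrate the point, here is a small sketch in C with OpenMP: the first loop has independent iterations and parallelizes with a single pragma, while the second is a recurrence in which each step needs the previous result, so extra cores don't help no matter how clever the compiler is.

/* parallel_vs_serial.c - build with: gcc -fopenmp parallel_vs_serial.c */
#include <omp.h>
#include <stdio.h>

#define N 1000000

int main(void)
{
    static double a[N], b[N];

    /* Embarrassingly parallel: every iteration is independent. */
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        a[i] = i * 0.5;

    /* Inherently sequential: b[i] depends on b[i-1]. */
    b[0] = 1.0;
    for (int i = 1; i < N; i++)
        b[i] = b[i - 1] * 0.999 + a[i];

    printf("%f\n", b[N - 1]);
    return 0;
}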

You wanna impress me? (2, Funny)

Anonymous Coward | more than 3 years ago | (#34303542)

Make a processor with four asses.

Future of Programming (4, Interesting)

igreaterthanu (1942456) | more than 3 years ago | (#34303544)

This just goes to show that if you care about having a future career (or even just continuing your existing one) in programming, learn a functional language NOW!

Re:Future of Programming (0)

Anonymous Coward | more than 3 years ago | (#34303754)

Because everything can be magically parallelized?

Re:Future of Programming (0)

Anonymous Coward | more than 3 years ago | (#34303764)

Nah, just learn a concurrency model like ZeroMQ that lets you write perfectly scalable apps in any language.

Re:Future of Programming (2, Interesting)

jamesswift (1184223) | more than 3 years ago | (#34303796)

It's quite something, isn't it, how few people even on Slashdot seem to get this. Old habits die hard, I guess.
Years ago a clever friend of mine clued me in to how functional was going to be important.

He was so right, and the real solutions to concurrency (note: not parallelism, which is easy enough in imperative code) are in the world of FP, or at least mostly FP.

My personal favourite so far is Clojure, which has the most comprehensive and realistic approach to concurrency I've seen yet in a language ready for real-world work.
The key thing to learn from it is how differently you need to approach your problem to take advantage of a multi-core world.

Clojure itself may never become a top-5 language, but the way it approaches the problem surely will be seen in future FP languages.

 

Re:Future of Programming (5, Insightful)

Anonymous Coward | more than 3 years ago | (#34303814)

Learn a functional language. Learn it not for some practical reason, but because having another view will give you interesting choices even when writing in imperative languages. Every serious programmer should try to look at the important paradigms so that he can freely choose to use them where appropriate.

Re:Future of Programming (1)

rrohbeck (944847) | more than 3 years ago | (#34303874)

All you need is a library that gives you worker threads, queues and synchronization primitives. We've all learned that stuff at some point (and forgotten most of it).
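A bare-bones sketch of what the parent describes, using nothing but pthreads; the fixed-size queue and single broadcast are simplifications for brevity, and error handling is omitted.

/* workqueue.c - N worker threads pulling jobs from a shared queue.
 * Build: gcc -pthread workqueue.c
 */
#include <pthread.h>
#include <stdio.h>

#define NWORKERS 8
#define NJOBS    64

static int queue[NJOBS];
static int head = 0, tail = 0;          /* head: next to take, tail: next free slot */
static int done = 0;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cv   = PTHREAD_COND_INITIALIZER;

static void *worker(void *arg)
{
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&lock);
        while (head == tail && !done)
            pthread_cond_wait(&cv, &lock);      /* sleep until work arrives */
        if (head == tail && done) {
            pthread_mutex_unlock(&lock);
            return NULL;                        /* queue drained, shut down */
        }
        int job = queue[head++];
        pthread_mutex_unlock(&lock);

        printf("thread %lu processed job %d\n",
               (unsigned long)pthread_self(), job);
    }
}

int main(void)
{
    pthread_t threads[NWORKERS];

    for (int i = 0; i < NWORKERS; i++)
        pthread_create(&threads[i], NULL, worker, NULL);

    pthread_mutex_lock(&lock);
    for (int j = 0; j < NJOBS; j++)
        queue[tail++] = j;                      /* enqueue all the jobs */
    done = 1;
    pthread_cond_broadcast(&cv);                /* wake every worker    */
    pthread_mutex_unlock(&lock);

    for (int i = 0; i < NWORKERS; i++)
        pthread_join(threads[i], NULL);
    return 0;
}

Real libraries (or language runtimes) add dynamic sizing, work stealing and error handling, but the primitives are the same.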

Re:Future of Programming (1)

loufoque (1400831) | more than 3 years ago | (#34303942)

Sorry, but while functional programming style is indeed the future of HPC (along with C++), functional languages themselves aren't. Read the research papers in the field and see for yourself.

I wonder what kinda cooling the thing needs (1)

qzhwang (610991) | more than 3 years ago | (#34303558)

It would be interesting to know whether this helps the performance/power ratio against (potentially many-core/many-CPU) ARM servers.

Windows testing (-1, Troll)

Alsee (515537) | more than 3 years ago | (#34303560)

Intel engineers reported obtaining a 12% speed up for Windows running on this CPU, but testing had to be halted when one of the programs crashed and several employees received a near fatal overdose of blue.


Re:Windows testing (0, Flamebait)

Anonymous Coward | more than 3 years ago | (#34303976)

I bet you're one of those geeks who sneers at Windows and Mac users and thinks he's really clever because he uses Linux, but is 30 years old and still works at Maplins or some other vaguely nerdy retail job.

Instruction set... (3, Insightful)

KonoWatakushi (910213) | more than 3 years ago | (#34303634)

"Performance on this chip is not interesting," Mattson said. It uses a standard x86 instruction set.

How about developing a small efficient core, where the performance is interesting? Actually, don't even bother; just reuse the DEC Alpha instruction set that is collecting dust at Intel.

There is no point in tying these massively parallel architectures to some ancient ISA.

Re:Instruction set... (1)

Arlet (29997) | more than 3 years ago | (#34303850)

There's also no reason to throw away an ISA that has proven to be extremely scalable and very successful, just because it's ancient or it looks ugly.

The advantage of the x86 instruction set is that it's very compact. It comes at a price of increased decoding complexity, but that problem has already been solved.

The low number of registers is not a problem. In fact, it may even be an advantage to scalability. A register is nothing more than a programmer-controlled mini-cache in front of the memory. I'd rather have few registers and go directly to memory. The hardware can then scale to include bigger and faster caches, so that memory access is just as fast as register access, without the software having to deal with register allocation and save/restore.

Cores are not executing x86 instructions (1)

perpenso (1613749) | more than 3 years ago | (#34303968)

How about developing a small efficient core, where the performance is interesting? Actually, don't even bother; just reuse the DEC Alpha instruction set that is collecting dust at Intel. There is no point in tying these massively parallel architectures to some ancient ISA.

Technically the cores are not executing x86 instructions. For several architectural generations of Intel chips, the x86 instructions have been translated into a small, efficient instruction set executed by the cores. Intel refers to these core instructions as micro-operations. An x86 instruction is translated on the fly into some number of micro-ops, and these micro-ops are reordered and scheduled for execution. So they have kind of done what you ask; the problem is that they don't give us direct access to the micro-op instruction set.

Intel tried to move beyond x86 with the Itanium and the market said no. The market also said no to Alpha and PowerPC, both of which had consumer oriented Windows NT 4 support. Even Apple had to give up on PowerPC and they were part of the PowerPC consortium. There is no Intel x86 conspiracy, they are trapped too.

Re:Instruction set... (0)

Anonymous Coward | more than 3 years ago | (#34303974)

http://en.wikipedia.org/wiki/Amdahl%27s_law circa 1967

Read it - 8 CPUs is mostly the optimum; anything over 128 CPUs had better be pumped.

Oh, and we should add IBM's VM instruction set - in practice a good idea.
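For concreteness, a quick sketch of what Amdahl's law says about piling on cores; the 95% parallel fraction is an arbitrary example, not a measured figure.

/* amdahl.c - speedup limit S(N) = 1 / ((1 - p) + p / N) for parallel fraction p. */
#include <stdio.h>

int main(void)
{
    const double p = 0.95;                       /* fraction of work that parallelizes */
    const int ncores[] = { 1, 8, 48, 128, 1000 };

    for (int i = 0; i < 5; i++) {
        int n = ncores[i];
        double speedup = 1.0 / ((1.0 - p) + p / n);
        printf("%4d cores -> %.1fx speedup\n", n, speedup);
    }
    return 0;
}

Even with 95% of the work parallelizable, 1000 cores buy less than a 20x speedup; the serial 5% dominates.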

1000 cores is nothing (5, Interesting)

Anonymous Coward | more than 3 years ago | (#34303670)

Probably in the future 1 million cores will be the minimum requirement for applications. We will then laugh at these stupid comments...

Image and audio recognition, true artificial intelligence, handling data from a huge number of different kinds of sensors, movement of motors (robots), data connections to everything around the computer, virtual worlds with thousands of AI characters with true 3D presentation... etc... etc... will consume all the processing power available.

1000 cores is nothing... We need much more.

Re:1000 cores is nothing (2, Insightful)

Electricity Likes Me (1098643) | more than 3 years ago | (#34303834)

1000 cores at 1 GHz on a single chip, networked to 1000 other chips, would probably just about make a non-real-time simulation of a full human brain possible (going off something I read about this somewhere). Although if it is possible to arbitrarily scale the number of cores, then we might be able to seriously consider building a system of very simple processors acting as electronic neurons.

Not for the consumer market (1)

Askmum (1038780) | more than 3 years ago | (#34303680)

Okay, I'm sure some high-end consumers would benefit from this, but I think the majority of consumers will not. The number of multithreaded programs on my Windows computer can be counted on one hand, I think - Java being the major one, and only if the programmers want to program multithreaded.

At this point in time I'd rather have a dual-core 3 GHz processor than a quad- or octa-core 2 GHz processor.

"Build it and they will come" - NOT (4, Informative)

Animats (122034) | more than 3 years ago | (#34303784)

It's an interesting machine. It's a shared-memory multiprocessor without cache coherency. So one way to use it is to allocate disjoint memory to each CPU and run it as a cluster. As the article points out, that is "uninteresting", but at least it's something that's known to work.

Doing something fancier requires a new OS, one that manages clusters, not individual machines. One of the major hypervisors, like Xen, might be a good base for that. Xen already knows how to manage a large number of virtual machines. Managing a large number of real machines with semi-shared memory isn't that big a leap. But that just manages the thing as a cluster. It doesn't exploit the intercommunication.

Intel calls this "A Platform for Software Innovation". What that means is "we have no clue how to program this thing effectively. Maybe academia can figure it out". The last time they tried that, the result was the Itanium.

Historically, there have been far too many supercomputer architectures roughly like this, and they've all been duds. The NCube Hypercube, the Transputer, and the BBN Butterfly come to mind. The Cell machines almost fall into this category. There's no problem building the hardware. It's just not very useful, really tough to program, and the software is too closely tied to a very specific hardware architecture.

Shared-memory multiprocessors with cache coherency have already reached 256 CPUs. You can even run Windows Server or Linux on them. The headaches of dealing with non-cache-coherent memory may not be worth it.

I/O and memory bandwidth (3, Insightful)

francium de neobie (590783) | more than 3 years ago | (#34303820)

Ok, you can cram 1000 cores into one CPU chip - but feeding all 1000 CPU cores with enough data for them to process and transferring all the data they spit out is gonna be a big problem. Things like OpenCL work now because the high end GPUs these days have 100GB/s+ bandwidth to the local video memory chips, and you're only pulling out the result back into system memory after the GPU did all the hard work. But doing the same thing on a system level - you're gonna have problems with your usual DDR3 modules, your SSD hard disk (even PCI-E based) and your 10GE network interface.

Deja Vu from a decade ago (2, Informative)

Baldrson (78598) | more than 3 years ago | (#34303854)

It seems like I've been here before. [slashdot.org]

A little while ago you asked [slashdot.org] Forth (and now colorForth) originator Chuck Moore about his languages, the multi-core chips he's been designing, and the future of computer languages -- now he's gotten back with answers well worth reading, from how to allocate computing resources on chips and in programs, to what sort of (color) vision it takes to program effectively. Thanks, Chuck!

Didn't Bill Gates already set the high bar? (0)

Anonymous Coward | more than 3 years ago | (#34303878)

64 Cores ought to be enough for anyone...

Remember the last couple of times this happened? (5, Informative)

Required Snark (1702878) | more than 3 years ago | (#34303890)

This is at least the third time that Intel has said that it is going to change the way computing is done.

The first time was the i432: http://en.wikipedia.org/wiki/Intel_iAPX_432 [wikipedia.org] . Anyone remember that hype? Got to love the first line of the Wikipedia article: "The Intel iAPX 432 was a commercially unsuccessful 32-bit microprocessor architecture, introduced in 1981."

The second time was the Itanium (aka Itanic) that was going to bring VLIW to the masses. Check out some of the juicy parts of the timeline also over on Wikipedia http://en.wikipedia.org/wiki/Itanium#Timeline [wikipedia.org]

1997 June: IDC predicts IA-64 systems sales will reach $38bn/yr by 2001

1998 June: IDC predicts IA-64 systems sales will reach $30bn/yr by 2001

1999 October: the term Itanic is first used in The Register

2000 June: IDC predicts Itanium systems sales will reach $25bn/yr by 2003

2001 June: IDC predicts Itanium systems sales will reach $15bn/yr by 2004

2001 October: IDC predicts Itanium systems sales will reach $12bn/yr by the end of 2004

2002 IDC predicts Itanium systems sales will reach $5bn/yr by end 2004

2003 IDC predicts Itanium systems sales will reach $9bn/yr by end 2007

2003 April: AMD releases Opteron, the first processor with x86-64 extensions

2004 June: Intel releases its first processor with x86-64 extensions, a Xeon processor codenamed "Nocona"

2004 December: Itanium system sales for 2004 reach $1.4bn

2005 February: IBM server design drops Itanium support

2005 September: Dell exits the Itanium business

2005 October: Itanium server sales reach $619M/quarter in the third quarter.

2006 February: IDC predicts Itanium systems sales will reach $6.6bn/yr by 2009

2007 November: Intel renames the family from Itanium 2 back to Itanium.

2009 December: Red Hat announces that it is dropping support for Itanium in the next release of its enterprise OS

2010 April: Microsoft announces phase-out of support for Itanium.

So how do you think it will go this time?
