
IBM Unveils Fastest Microprocessor Ever

samzenpus posted more than 4 years ago | from the greased-lightning dept.


adeelarshad82 writes "IBM revealed details of its 5.2-GHz z196, the fastest microprocessor ever announced, which will power its Z-series of mainframes costing hundreds of thousands of dollars. The z196 contains 1.4 billion transistors on a chip measuring 512 square millimeters, fabricated in 45-nm PD SOI technology. It contains a 64KB L1 instruction cache, a 128KB L1 data cache, a 1.5MB private L2 cache per core, plus a pair of co-processors used for cryptographic operations. IBM is set to ship the chip in September."


fastest first post ever? (-1, Offtopic)

Anonymous Coward | more than 4 years ago | (#33447764)

we'll see about that!

Re:fastest first post ever? (1)

stonewallred (1465497) | more than 4 years ago | (#33447886)

Will it run Crysis at full settings though?

Re:fastest first post ever? (1)

clang_jangle (975789) | more than 4 years ago | (#33447928)

Yes, but you'll have to disable "Aero".

Re:fastest first post ever? (1)

Yvan256 (722131) | more than 4 years ago | (#33448774)

Can we still enable "Coffee Crisp"?

Re:fastest first post ever? (1)

Chris Snook (872473) | more than 4 years ago | (#33448668)

Given that the Z architecture doesn't even have PCI, that would be a no.

Re:fastest first post ever? (1)

Larryish (1215510) | more than 4 years ago | (#33448990)

in b4 beowulf cluster

Required (4, Funny)

Anonymous Coward | more than 4 years ago | (#33447772)

But will it run ... a Beowulf cluster of ...

[Comment terminated : memelock detected]

Re:Required (0)

Anonymous Coward | more than 4 years ago | (#33448188)

But will it blend? That is the question!

lets see 3ghz 150$ (0)

Anonymous Coward | more than 4 years ago | (#33448694)

lets see X2 oh ya 6GHZ at 300$

go IBM

Yeah, I read about this (-1, Offtopic)

MrHanky (141717) | more than 4 years ago | (#33447788)

A week ago. Maybe Slashdot should upgrade to that IBM CPU.

Re:Yeah, I read about this (0, Troll)

TaoPhoenix (980487) | more than 4 years ago | (#33447816)

Fark is consistently a whole 1-2 days faster than Slash-D lately. And their "Idle" section is better.

Re:Yeah, I read about this (4, Insightful)

Spad (470073) | more than 4 years ago | (#33447850)

Yes, but their article comments are much closer to Youtube than Slashdot.

Re:Yeah, I read about this (1, Redundant)

Fear the Clam (230933) | more than 4 years ago | (#33447954)

If I could mod you insightful, I would.

Re:Yeah, I read about this (-1, Offtopic)

Anonymous Coward | more than 4 years ago | (#33448654)

If I could mod you redundant, I would.

Dupes (-1, Offtopic)

Anonymous Coward | more than 4 years ago | (#33447890)

Hell, if you're going by that metric, Slashdot is consistently a whole 1-2 days faster than Slashdot lately.

Speed times Quantity? (2, Interesting)

TaoPhoenix (980487) | more than 4 years ago | (#33447790)

So what is this beast supposed to be, a 64 core machine?

Didn't we retire the Ghz wars 5 years ago? I know, AMD style "more done per cycle", but isn't a quad core 3.1 Ghz per chip with 20% logistic overhead faster?

Re:Speed times Quantity? (5, Informative)

Haedrian (1676506) | more than 4 years ago | (#33447818)

The thing is that if you have 2 (say) 1.6 GHz processors, they aren't as 'powerful' as one 3.2 GHz processor.

For one - there are overheads, certain stuff common between them, pipelines - stuff which I forgot (computer engineering related problems).

But the main thing is that not all programs are multi-threaded, and a program with a single thread can only run on one processor. So yeah, GHz are still useful. Maybe for large single-thread batch processing - which is the kind of thing a mainframe would do.
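The single-thread limit described above is essentially Amdahl's law. As a minimal sketch (the function below is the textbook formula, not anything specific to the z196), the serial fraction of a workload caps what extra cores can buy:

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Textbook Amdahl's law: overall speedup when only part of a
    workload can be spread across multiple cores."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / cores)

# A 50%-parallel batch job barely benefits from a second core...
print(amdahl_speedup(0.5, 2))      # ~1.33x
# ...and even an absurd number of cores can at most double it.
print(amdahl_speedup(0.5, 10**9))  # ~2.0x
```

With a half-serial workload, a second core buys only about 1.33x and unlimited cores top out at 2x, which is why raw clock speed still matters for single-threaded batch jobs.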

Re:Speed times Quantity? (1)

WrongSizeGlass (838941) | more than 4 years ago | (#33447910)

But the main thing is that not all programs are multi-threaded, and a program with a single thread can only run on one processor. So yeah, GHz are still useful. Maybe for large single-thread batch processing - which is the kind of thing a mainframe would do.

I'm betting the code used on these z196 systems is multi-threaded. Shit, if you're paying hundreds of thousands of dollars per CPU you can afford some top notch programmers. With two co-processors used for cryptographic operations per chip I'd say they were after a bigger prize than, say, hardcore gamers ;-)

BTW, TFA mentions L1 cache per core but doesn't mention how many cores this chip scales up to. Could it be just one?

Re:Speed times Quantity? (2, Interesting)

Carewolf (581105) | more than 4 years ago | (#33447998)

BTW, TFA mentions L1 cache per core but doesn't mention how many cores this chip scales up to. Could it be just one?

It later mentions using 128Mbyte just for level 1 cache, so that would be around 1024 cores.

Re:Speed times Quantity? (2, Insightful)

MichaelSmith (789609) | more than 4 years ago | (#33448000)

But the main thing is that not all programs are multi-threaded, and a program with a single thread can only run on one processor. So yeah, GHz are still useful. Maybe for large single-thread batch processing - which is the kind of thing a mainframe would do.

I'm betting the code used on these z196 systems is multi-threaded. Shit, if you're paying hundreds of thousands of dollars per CPU you can afford some top notch programmers.

Actually I think this mainframe is for getting the last little bit of performance out of thirty year old cobol code. And the original top notch programmers are long dead.

Re:Speed times Quantity? (1)

mickwd (196449) | more than 4 years ago | (#33448568)

Actually I think this mainframe is for getting the last little bit of performance out of thirty year old cobol code. And the original top notch programmers are long dead.

Considering that life expectancy in the developed world is in the region of 80 years, there is a reasonable chance that programmers who were under 50 when they wrote code thirty years ago are still alive.

They may have little recollection of what they did 30 years ago, but to say they are all "long dead" is somewhat of an exaggeration.

Re:Speed times Quantity? (1)

MaskedSlacker (911878) | more than 4 years ago | (#33448856)

I've never met a programmer over 50. I must therefore conclude that they all perish mysteriously upon their 50th birthday. Something like the planet of grim reapers from Futurama is how I prefer to envision it.

Re:Speed times Quantity? (1)

Sulphur (1548251) | more than 4 years ago | (#33448660)

More processors = Share the Legacy.

Re:Speed times Quantity? (1)

Pharmboy (216950) | more than 4 years ago | (#33448168)

Shit, if you're paying hundreds of thousands of dollars per CPU

You aren't. FTA, the complete systems will cost hundreds of thousands of dollars, to a few million. Not the individual CPUs.

Re:Speed times Quantity? (4, Informative)

bws111 (1216812) | more than 4 years ago | (#33448362)

When configured to run Linux, each core costs approx $125K. When configured for z/OS, each core costs approx $250K. A complete system (not including any storage or software) can cost up to around $30M.

Re:Speed times Quantity? (1)

hitmark (640295) | more than 4 years ago | (#33448666)

And all the hardware will be there no matter what package you choose, and an "upgrade" will involve an IBM representative coming over to move a jumper.

Re:Speed times Quantity? (1)

(Score.5, Interestin (865513) | more than 4 years ago | (#33448408)

Shit, if you're paying hundreds of thousands of dollars per CPU you can afford some top notch programmers.

If you're paying hundreds of thousands of dollars for a multi-GHz CPU then it's probably because you're trying to make up for the product of crap programmers, not the other way round.

Re:Speed times Quantity? (1)

AHuxley (892839) | more than 4 years ago | (#33448718)

product of crap programmers
Sorry to ask but who does IBM see using this?
At this price point, and for the data sets that need sorting, do you go with cheaper clusters or with more expensive, faster, unique chips, depending on the math?

Re:Speed times Quantity? (1)

Chris Mattern (191822) | more than 4 years ago | (#33448972)

Sorry to ask but who does IBM see using this?

People with legacy mainframe programs that they don't want to port (translation: that they don't dare touch).

Re:Speed times Quantity? (1)

TheTrueScotsman (1191887) | more than 4 years ago | (#33449066)

Banks. They need it not for speed but for volume and reliability.

Re:Speed times Quantity? (1, Funny)

Anonymous Coward | more than 4 years ago | (#33448576)

> I'd say they were after a bigger prize than, say, hardcore gamers ;-)

Yeah. They're after the *really fucking hardcore* gamers.

Re:Speed times Quantity? (1)

cgenman (325138) | more than 4 years ago | (#33448754)

They say it's an old CISC architecture. This is probably the sort of system that runs horribly outdated and un-updatable code, like the tax system.

Re:Speed times Quantity? (2, Insightful)

asliarun (636603) | more than 4 years ago | (#33448126)

The thing is that if you have 2 (say) 1.6 GHz processors, they aren't as 'powerful' as one 3.2 GHz processor.

For one - there are overheads, certain stuff common between them, pipelines - stuff which I forgot (computer engineering related problems).

But the main thing is that not all programs are multi-threaded, and a program with a single thread can only run on one processor. So yeah, GHz are still useful. Maybe for large single-thread batch processing - which is the kind of thing a mainframe would do.

OK, firstly the OP should have said that this is the microprocessor with the highest clock speed. Calling it the fastest CPU is extremely misleading. In most modern CPUs, clockspeed is NOT related to throughput. The Intel Sandy Bridge or Nehalem CPU for example may be running its 4 cores at a clockspeed of 3.2GHz but overall, each core in the CPU is easily 4-5 times faster than a 3.2GHz Pentium4 core.

Secondly, many of the bottlenecks that you allude to are no longer major bottlenecks. CPU interconnect bandwidth and memory bandwidth is now large enough that this is no longer an issue - the days of FSB saturation are over. Of course, there are exceptions to every rule, but I mean this for most workloads.

Yes, you are correct as far as single threaded workloads are concerned. Nonetheless, you cannot even compare two different CPUs on a clockspeed basis, especially those with completely different architectures, even for single threaded workloads. IBM may have created a very highly clocked CPU and given it tons of transistors, but I seriously doubt if it will compete with a modern day server CPU from Intel or even AMD (pure performance maybe, but definitely not price-performance or performance-per-watt). I strongly suspect that it will probably succeed because of its RAS features, overall system bandwidth, and platform, not because of its raw clockspeed or performance.
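The clockspeed-versus-throughput argument above boils down to the usual back-of-the-envelope model: per-core throughput is roughly instructions-per-cycle times clock rate. A sketch with invented IPC figures (illustrative assumptions, not measurements of any of the chips discussed):

```python
def throughput_ginstr_per_s(ipc: float, clock_ghz: float) -> float:
    """Rough per-core throughput model: IPC x clock, in giga-instructions/s.
    Ignores memory stalls, caches, and everything else that matters in practice."""
    return ipc * clock_ghz

# Hypothetical numbers: a deeply pipelined core sustaining 0.5 IPC
# loses badly to a wider core at the SAME 3.2 GHz clock sustaining 2.0 IPC.
narrow = throughput_ginstr_per_s(0.5, 3.2)
wide = throughput_ginstr_per_s(2.0, 3.2)
print(wide / narrow)  # 4.0 -- "4-5x faster per core" at equal clock is plausible
```

This is why comparing CPUs of different microarchitectures by clock alone tells you almost nothing.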

Re:Speed times Quantity? (-1, Troll)

Anonymous Coward | more than 4 years ago | (#33448224)

wow, 4-5 times faster per core you say?

I think moore's law just skipped a few decades with your analysis!!
Incredible how nobody else noticed this incredible output of technology.
(you fail)

Re:Speed times Quantity? (4, Insightful)

mickwd (196449) | more than 4 years ago | (#33448850)

"clockspeed is NOT related to throughput"

Of course it is. It is not, however, the only factor, and other factors may indeed (and commonly do) outweigh it.

"IBM may have created a very highly clocked CPU and given it tons of transistors, but I seriously doubt if it will compete with a modern day server CPU from Intel or even AMD."

I think you underestimate IBM's technical ability. They do have some idea of what they're doing.

"pure performance maybe, but definitely not price-performance or performance-per-watt"

That's like saying a Ferrari is a poor performance car because it can't compete against a Ford Focus on cost-per-max-speed or miles-per-gallon.

Re:Speed times Quantity? (0)

Anonymous Coward | more than 4 years ago | (#33448176)

The problem is that this unicore processor at 3.2 GHz will use MORE energy than the two 1.6 GHz processors combined.

If you have a workload that is easily split up into threads, going multicore will lower costs, because slower cores require less electrical energy...

Re:Speed times Quantity? (1)

digitalhermit (113459) | more than 4 years ago | (#33448708)

Yup... there are so many dependencies on application and OS code that hardware capability matters very little.

I recently tried to tune a workload on a pSeries system. We gave it half a processor and 2 virtuals (with the Power version of hyperthreading, so it saw 4 processors). Performance was a dog, though load was only 60% of capacity. We doubled the number of virtual processors but kept the overall entitlement; load dropped to 40%. Added another couple of virtuals and load dropped to 25%, with no increase in throughput. It's a classic example of a thread-limited workload: no matter how many processors we could add, the jobs would only run on two. Bumping up those processors might give 2% here and there, but the bottleneck wasn't CPU. After the development team redid some code (and reduced the number of database calls from 1500 to under 100), response time improved from 2-3 seconds to 0.9 seconds.

Re:Speed times Quantity? (0)

Anonymous Coward | more than 4 years ago | (#33447842)

The CPU is designed primarily for running VMs. Since you sometimes want a VM to perform a real-time task, it's very useful to get as many clock cycles per second as you can.

Re:Speed times Quantity? (1)

dsavi (1540343) | more than 4 years ago | (#33448088)

I was wondering about this- Why did the Ghz wars end, anyway? Did the chip makers hit a wall or something? At the rate it was going, I thought we'd have 5Ghz+ processors by now.
Yeah, I'm uninformed.

Re:Speed times Quantity? (5, Insightful)

Anonymous Coward | more than 4 years ago | (#33448264)

More or less. They hit two walls - fabricating chips that could run faster while retaining an acceptable yield, and dealing with the heat such chips produced.

The fastest general-sale chips were the P4s - the end of their line marked the end of the gigahertz wars, as Intel switched from ramping up the clock to ramping up the per-cycle efficiency with the Core 2 and their complete architecture overhaul. As a result a 2GHz Core 2 duo will outperform a 4GHz P4 dual-core under most conditions. Better pipeline organisation, larger caches better managed.

Clock rate is no longer the key variable in comparing processors, unless they are of the same microarchitecture.

Re:Speed times Quantity? (1)

hedwards (940851) | more than 4 years ago | (#33448442)

There's also the problem of feeding such a monster processor and keeping it synced up with the rest of the machine. On top of that, servers tend to cope better with more cores than with faster ones after a certain point, which is presumably well before 5GHz. Since servers are typically more concerned with handling large numbers of connections, chances are that a quad core running at 2GHz would outperform a single core at 5GHz; scale that up as needed to the number of cores. Of course frequency is a terrible basis for comparison, but for this purpose it's probably fine.

Re:Speed times Quantity? (1)

bsdaemonaut (1482047) | more than 4 years ago | (#33448090)

You can have up to 96 cores with the z196 processor, so I'm not sure how a quad core at 3.1GHz would hope to even compare with that.

Re:Speed times Quantity? (1)

JamesP (688957) | more than 4 years ago | (#33448364)

You actually can go faster without x86 bogging you down

Re:Speed times Quantity? (1)

JasterBobaMereel (1102861) | more than 4 years ago | (#33449082)

IBM BlueGene/L - runs at 700 MHz .... 596 TFLOPS

Cray XT5 - runs at 2.6 Ghz ... 2331.00 TFLOPS

Both of these are slower in Hz than the PC I am using to type this ....
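The comparison above is really about aggregate parallelism: dividing quoted throughput by clock rate gives machine-wide floating-point operations retired per cycle. A quick check using only the figures quoted in the comment:

```python
def flops_per_cycle(tflops: float, clock_ghz: float) -> float:
    """Machine-wide FLOPs retired per clock cycle -- a crude measure of
    how much parallelism, rather than clock speed, delivers throughput."""
    return (tflops * 1e12) / (clock_ghz * 1e9)

# Figures from the comment above: low clocks, enormous total throughput.
print(flops_per_cycle(596.0, 0.7))   # BlueGene/L: ~851,000 FLOPs per cycle
print(flops_per_cycle(2331.0, 2.6))  # Cray XT5:   ~897,000 FLOPs per cycle
```

Hundreds of thousands of operations per cycle across the whole machine is what a desktop chip's few GHz cannot touch, regardless of clock.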

Price: RTFA (5, Informative)

miketheanimal (914328) | more than 4 years ago | (#33447802)

The Z-series mainframes cost hundreds of thousands (or even over a million) dollars, not the chips. As it says in the article.

Re:Price: RTFA (0)

Anonymous Coward | more than 4 years ago | (#33448382)

ONE MILLION DOLLARS!!!!!

Great news for Mac OS X users! (4, Funny)

squiggleslash (241428) | more than 4 years ago | (#33447808)

I can't wait to get a PowerMac G6 with this CPU, in your face Dell users with your commodity Intel-based desi... oh, wait.

Re:Great news for Mac OS X users! (4, Funny)

fuzzyfuzzyfungus (1223518) | more than 4 years ago | (#33448072)

The PowerMac G6 would be pretty impressive. The PowerBook G6 manual would include the following phrase:

"Please note: The revolutionary new MagsafePro 3-Phase/480 power connector is not backwards compatible with the Magsafe connectors of prior, non-containerized Mac Portables."

Re:Great news for Mac OS X users! (2, Informative)

UnknowingFool (672806) | more than 4 years ago | (#33448152)

Unfortunately this chip will most likely go into workstations and servers. In order for IBM to make a desktop version, it would have to make a custom chip to handle things like video, sound, etc. This would lead to the same logistical problems for Apple that it had before. Manufacturing companies do not want to keep excess inventories, whether for Apple or for IBM, so if Apple needed more, it would have to wait while IBM rearranged its manufacturing schedules to compensate. Also, even if Apple ordered millions of these, it would still be a small customer to IBM; IBM's internal divisions would order more of the stock chip. And the last reason Apple will not go back to IBM: IBM's mobile chip offerings lag way behind Intel's. IBM never made a mobile G5 chip. My guess is that they could never make one with acceptable power consumption. IBM could do it with enough R&D, but again, it would be for a very small customer, and not worth enough to the bottom line.

Re:Great news for Mac OS X users! (1)

BrentH (1154987) | more than 4 years ago | (#33448432)

IBM uses HyperTransport as the interconnect, right? That would imply that you can slap any old AMD chipset onto such a chip, which has all the desktop features you need.

Re:Great news for Mac OS X users! (1)

splutty (43475) | more than 4 years ago | (#33448602)

Uhm...

We're talking about Z-series mainframes. These are absolute beasts, with all the cooling, memory and processing speed that would leave a desktop in the dust without any problems whatsoever.

However putting this sort of hardware in a desktop is extremely prohibitive for a ton of reasons, one of the most important being cooling. You'd need a room just for that...

Re:Great news for Mac OS X users! (4, Informative)

TheRaven64 (641858) | more than 4 years ago | (#33448646)

Wrong chip family. This is the Z-series mainframe chip, using an instruction set that is backwards compatible with the System/360 stuff from back in 1960 (the architecture of the future, as the marketing material trying to persuade my university to upgrade their IBM 1620 put it). The PowerMacs were using PowerPC chips, which use the same instruction set as the POWER CPUs from IBM (they used to be similar, with a common subset, now they are identical).

The chip that this is replacing, the z10, was designed concurrently with the POWER6. They share a number of common features, including a lot of the same execution engines (both have the same hardware BCD units, for example, as well as more common arithmetic units), but they are very different in a number of other aspects, including the instruction set, cache design, and inter-processor interconnect, because they are designed for different workloads.

I've not read much about this chip yet, but I think it shares some design elements with the POWER7, in the same way that the z10 did with the POWER6.

In short, while some of the R&D money spent on this CPU made it into chips that could, potentially, run OS X, this chip itself could not without a major rewrite.

Re:Great news for Mac OS X users! (1)

Chris Mattern (191822) | more than 4 years ago | (#33448854)

It's not a PowerPC chip anyways. It's zSystem architecture, which is actually the modern-day descendant of what was originally the System/360.

True, true. For now. (1)

AltGrendel (175092) | more than 4 years ago | (#33447820)

But it will be obsolete by the end of the month.

speed meh (1)

bakamorgan (1854434) | more than 4 years ago | (#33447838)

They need to work on cramming more cores onto one die, then get programs up to speed where they can utilize the extra cores.

Re:speed meh (1)

the linux geek (799780) | more than 4 years ago | (#33447934)

Mainframe programs have largely been parallel for decades. This thing isn't designed for running Crysis.

Re:speed meh (1)

hedwards (940851) | more than 4 years ago | (#33448462)

Perhaps not, but it's what you need if you want to put the graphics settings on high.

CISC to save RAM? (1)

MichaelSmith (789609) | more than 4 years ago | (#33447846)

IBM defines the z196 as one of the few remaining CISC chips, which allows for bulky, large programs that can require much more memory to execute in than RISC chips, including the PowerPC and ARM embedded processors, among others.

For CISC you need more bytes per instruction, because there are more instructions. With RISC your executable has more instructions but they each use less storage.

I am not sure I believe their implication that CISC is better for humongous commercial applications. Sounds like marketing speak aimed at management to me.

CISC seems to work well (1)

Sycraft-fu (314770) | more than 4 years ago | (#33448162)

Essentially all desktop and laptop computers use CISC chips and they are fast and cheap. RISC is a neat theory, but these days it seems that as the processors get decoupled from their ISAs anyhow, for various reasons, that it doesn't matter much. You choose the ISA for reasons of binary compatibility or features or the like, and it'll work just fine with the chip.

Also, it is not true that CISC needs more bytes per instruction, at least not in all implementations. With x86, instructions are variable length: they can be as little as one byte and as many as 15 bytes. In actual practice, you find that a lot of 1- and 2-byte instructions are used in code, so CISC can be extremely pithy in some respects. Then of course many CISC instructions do more. The idea with RISC is that each instruction does only one thing (that isn't really true with all the vector math stuff these days), so you end up having to issue instructions to load values into registers, operate on them, then store them back. That's not necessary in CISC: there are instructions that can take a register and a memory location as operands, and sometimes even two memory locations.

Re:CISC seems to work well (0)

Anonymous Coward | more than 4 years ago | (#33448546)

I believe it is more true to say that all desktop processors are RISC but have a CISC to RISC translator between the bytes and the decoder.

Also:
>The idea with RISC is that each instruction does only one thing (that isn't really true with all the vector math stuff these days).
SSE 1 and 2 were supposed to be RISCy vector instructions. Intel couldn't get their compiler to automatically take code that used regular float arrays and calculated, for instance, dot products and automagically SSE it. So they started adding CISCy instructions (DOTP in this case).

Re:CISC to save RAM? (1)

John Meacham (1112) | more than 4 years ago | (#33448164)

Actually, CISC uses less memory in general, but has traditionally been slower. CISC CPUs came out when memory was extremely expensive relative to CPU speed; cheaper memory is what made RISC (with its larger footprint but faster speed) possible. Nowadays it really doesn't matter much, and CISC is arguably better now that memory bandwidth is the big bottleneck. However, our CISC designs are not exactly modern; if you were to do a modern CISC design you would probably end up with something more akin to ARM's Thumb instruction set.

All in all, x86 didn't end up too horribly off. Its plethora of addressing modes actually makes for smaller code on 64-bit systems, because integer arithmetic can be 32 bits by default: you rarely need to operate directly on 64-bit values in arbitrary ways, since the addressing modes can perform most of the pointer arithmetic that is needed, matching the 32-bit 'int' and the 64-bit pointer on x86-64. Not that there aren't issues with x86, but being CISC in and of itself isn't one of them.

Re:CISC to save RAM? (1)

TheRaven64 (641858) | more than 4 years ago | (#33448742)

CISC and RISC are marketing terms that incorporate a lot of loosely connected design elements. Most CISC architectures use variable-length instruction encodings. On x86, for example, a number of common instructions are a single byte, while the longest ones are 15 bytes. A RISC architecture typically has fixed-length instructions, typically either 4 or 8 bytes (although ARM chips tend to also support Thumb and Thumb-2 instruction sets which use a 2-byte encoding).

This is why x86 chips need smaller instruction caches than SPARC or Alpha machines. The instructions do more, and the instruction encoding uses something like an ad-hoc version of Huffman encoding, where the more common ones use shorter sequences.
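That "ad-hoc Huffman" observation is just a weighted average: if the common instructions get the short encodings, the mean bytes-per-instruction lands far below the worst case. A sketch with invented frequencies (not measured x86 statistics):

```python
# Hypothetical instruction mix: mostly 1-2 byte opcodes, a tail of long ones.
# Keys are encoded lengths in bytes, values are relative frequencies.
# (The frequencies are made up for illustration, not real x86 data.)
mix = {1: 0.40, 2: 0.30, 3: 0.15, 5: 0.10, 15: 0.05}

# Weighted average encoded length, exactly as for any prefix code.
avg_len = sum(length * freq for length, freq in mix.items())
print(avg_len)  # ~2.7 bytes on average, despite a 15-byte worst case
```

A fixed 4-byte RISC encoding would need roughly 50% more instruction-cache bytes for the same instruction count under this (assumed) mix.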

This chip snickers at my 6502... (3, Insightful)

bobdotorg (598873) | more than 4 years ago | (#33447848)

The chip uses 1,079 different instructions

Can't even imagine writing in assembly code for this monster. I miss dinking around with a nice 6502 system.

Re:This chip snickers at my 6502... (0)

Anonymous Coward | more than 4 years ago | (#33447858)

Eh, x86_64 has more, I believe.

Re:This chip snickers at my 6502... (1)

jmak (409787) | more than 4 years ago | (#33447976)

I'd guess most of the code run on these CPUs will still use the original IBM 360 instruction set from the sixties.

Re:This chip snickers at my 6502... (1)

MichaelSmith (789609) | more than 4 years ago | (#33448022)

The chip uses 1,079 different instructions

Can't even imagine writing in assembly code for this monster. I miss dinking around with a nice 6502 system.

Yeah the 6502 is nice and friendly. I taught myself how to hand assemble on the 6502 when I was 12 or 13.

Re:This chip snickers at my 6502... (1)

Haxamanish (1564673) | more than 4 years ago | (#33448082)

I miss dinking around with a nice 6502 system.

Start playing with ARM then, its design was somewhat inspired by the 65xx series and there are plenty of affordable ARM-based systems available.

Re:This chip snickers at my 6502... (1)

sznupi (719324) | more than 4 years ago | (#33448494)

Or something "lower" among many popular microcontroller families. AVR is quite pleasant, for example.

You really don't anymore (2, Interesting)

Sycraft-fu (314770) | more than 4 years ago | (#33448228)

These days, compilers take care of almost everything. It has gotten complex to the extent that a programmer trying to do things all in assembly will probably do a worse job than a good compiler. Chips have many, many tools to solve their problems.

That isn't to say it is never done, in some programs there may be some hand optimized assembly for various super speed critical functions. However even then it is most likely written in a high level language, compiled to assembly (you can order most compilers to do that), tuned and then put back in the program.

Memory is cheap and compilers are powerful so assembly is just not as needed as it once was, at least on desktops/servers where you see these massive chips.

Re:You really don't anymore (0)

Anonymous Coward | more than 4 years ago | (#33448626)

It's people with your attitude that have given us all of these bloated programs, assuming the compiler is just going to sprinkle pixie dust on your code and make it super optimized. No, the compiler doesn't almost always do the best job. In fact it's trivially easy to find a whole host of examples of ICL, GCC and VC++ doing poor optimization or vectorization of code.

Microchip? (1)

MistrX (1566617) | more than 4 years ago | (#33447876)

That term is so 90's. Why are we still calling it that? Shouldn't it be 'nanochip' or something like that?

Re:Microchip? (2, Interesting)

the_fat_kid (1094399) | more than 4 years ago | (#33447944)

iChip?

Re:Microchip? (1)

TeknoHog (164938) | more than 4 years ago | (#33447972)

This [via.com.tw] is a nanochip.

Lower than expected caches (0)

Anonymous Coward | more than 4 years ago | (#33447884)

At least, not private L2 caches per core. The L1 caches seem a little small, especially if this is supposed to be used in mainframe type situations.
But i'm not too in-the-know when it comes to their mainframes, the cache might not even be needed that much if they have some fast pipes between those circuits.
Either that or the system depends more on the private caches.

Quite liking those 2 co-processors for crypto. T'is an important function that gets ignored far too often these days. Having it in hardware is even better.

An appropriate conference for the announcement (1)

dtmos (447842) | more than 4 years ago | (#33447900)

Announcing a 5.2 GHz, 1.4 billion-transistor processor at "Hot Chips 2010" just makes sense. Strangely, no power numbers were given...

Re:An appropriate conference for the announcement (1)

JamesP (688957) | more than 4 years ago | (#33448466)

From Wikipedia each multi chip module takes as much as 1800W (six processors)

No figures for 1 chip though

http://en.wikipedia.org/wiki/IBM_z196_(microprocessor) [wikipedia.org]

highest clock speed? not really (1, Interesting)

Vectormatic (1759674) | more than 4 years ago | (#33447912)

Intel's NetBurst architecture (of Pentium 4 fame) featured the 'Rapid Execution Engine', which consisted of two ALUs running at double the clock speed. On 3.8 GHz Pentium 4s, that would be 7.6 GHz.

Granted, that is not the entire CPU, but still..

Re:highest clock speed? not really (0)

Anonymous Coward | more than 4 years ago | (#33448650)

and in a hurricane, fly upstream, a swallow broke the land speed record for a tillamook brook quietly flowing sideways.

granted, it's a distinction without a difference.

granted, no one gives a shit.

granted.

I doubt it's the fastest ever... (1, Insightful)

the linux geek (799780) | more than 4 years ago | (#33447914)

except possibly in clock speed. I'm fairly sure that an 8-core 4.25GHz POWER7 is as fast or faster if the workload is properly threaded, which any enterprise server or mainframe workload should be. On the other hand, on single-thread or few-thread workloads, the z196 probably has a bit of an edge, despite a large portion of its instruction set being microcoded.

Re:I doubt it's the fastest ever... (3, Informative)

mr_mischief (456295) | more than 4 years ago | (#33448178)

ummmm.......

It's a quad-core chip. Each core has two integer, two load and store, one binary floating point, and one decimal floating point unit. Up to 24 CPUs can be placed in the frame. It can connect to another whole rack of POWER7 blades running AIX as an application accelerator platform.

The z196 is for the stuff a mainframe is good at: big batches and fast I/O. The application accelerator is for the stuff that clusters of Supermicro servers are good at. As a hybrid system connected across the GX bus, it should pump data in and out of applications pretty well.

Re:I doubt it's the fastest ever... (1)

spyked (1878060) | more than 4 years ago | (#33448194)

Agreed, high-frequency != fast. The IBM Cell Broadband Engine SPUs are a good example in this sense.

About time! (1)

dafing (753481) | more than 4 years ago | (#33447948)

Those slackers, where's my 3GHz G5? Huh?

*sigh* FINE, begin the switch back....


-Steve

Sent from my iPad 2

With CoProcessors.... (1)

vchoy (134429) | more than 4 years ago | (#33448028)

My 386DX has an external maths coprocessor => it can only do floating point functions :(
However, mine's now a bit faster: overclocked it from 33MHz to 52MHz ... yours does what, 5.2GHz? -> Surely my M series supersedes your G series... right? ....
right?

The S/360 architecture lives on! (1)

Arakageeta (671142) | more than 4 years ago | (#33448066)

It's crazy that an architecture developed in the '60s lives on in the System Z today. IBM bet the company on the S/360 product line. I think the investment has paid off-- and still does!

At last (-1, Redundant)

pl0sql (1122901) | more than 4 years ago | (#33448128)

Finally, a processor that might be able to run Crysis at maximum settings at a decent framerate...

Wait....what? (2, Insightful)

antifoidulus (807088) | more than 4 years ago | (#33448146)

It contains a 64KB L1 instruction cache, a 128KB L1 data cache, a 1.5MB private L2 cache per core, plus a pair of co-processors used for cryptographic operations. In a four-node system, 19.5 MB of SRAM are used for L1 private cache, 144MB for L2 private cache, 576MB of eDRAM for L3 cache, and a whopping 768MB of eDRAM for a level-four cache. All this is used to ensure that the processor finds and executes its instructions before searching for them in main memory, a task which can force the system to essentially wait for the data to be found--dramatically slowing a system that is designed to be as fast as possible.

I'm assuming the cache referred to in the second paragraph is off-chip cache, otherwise it would sort of negate the first sentence.... Would be nice if the article would have actually said that though.

Re:Wait....what? (3, Insightful)

Anonymous Coward | more than 4 years ago | (#33448454)

Considering the ratio between the two sets of figures is ~96, it seems that the "four-node system" contains 96 cores with their own L1 and L2 caches, but shared L3 and L4 caches.

Re:Wait....what? (1)

bws111 (1216812) | more than 4 years ago | (#33448504)

That is correct.

Re:Wait....what? (0)

Anonymous Coward | more than 4 years ago | (#33448888)

The truth is that it is a bit confusing. For clarification: they refer to an MCM (multi-chip module) as a node, and an individual system can have up to 4 MCMs. Each MCM has 8 dies, 6 of which are z196 processors, and each z196 processor has 4 cores.

Then a 4 node system can have up to 4 nodes/MCM * 6 z196 processors * 4 cores = 96 cores.
From that each machine can have:
96 cores * (128KB+64KB)/1024 = 19.5 MB of L1
96 cores * 1.5MB = 144MB of L2
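The totals above can be reproduced with a quick sketch. Note that 96 x 192KB actually works out to 18 MB, a bit under the 19.5 MB the article quotes for L1, so the article's figure presumably counts some additional per-core SRAM:

```python
# Cache totals for a fully populated 4-node z196 system,
# using the per-core figures quoted in the article.
NODES = 4            # MCMs (nodes) per system
CHIPS_PER_NODE = 6   # z196 processor dies per MCM
CORES_PER_CHIP = 4

cores = NODES * CHIPS_PER_NODE * CORES_PER_CHIP
l1_kb = cores * (64 + 128)   # L1 instruction + data cache per core
l2_mb = cores * 1.5          # private L2 per core

print(cores)          # 96
print(l1_kb / 1024)   # 18.0 MB (the article quotes 19.5 MB)
print(l2_mb)          # 144.0 MB
```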

64KB L1 (1, Funny)

dmomo (256005) | more than 4 years ago | (#33448174)

That ought to be enough instruction cache for anybody.

Costing hundreds of thousands of dollars... (1)

Laxitive (10360) | more than 4 years ago | (#33448204)

The codename for this processor, was "Ming Mecca".

-Laxitive

So much for the 3.3GHz speed of light limit. (1)

thegarbz (1787294) | more than 4 years ago | (#33448254)

Really, this article kind of makes all of last week's comments about the speed of light limiting processors to 3GHz a bit pointless, doesn't it? Now I know the discussions were correct in principle, but this just goes to show that such problems can be engineered around.

Re:So much for the 3.3GHz speed of light limit. (4, Informative)

Ecuador (740021) | more than 4 years ago | (#33448456)

The comments were about the fact that at 3GHz light travels about 10cm per clock cycle, which limits how far apart two items on a bus can be if you want them to communicate within one clock cycle. There is no "light speed barrier" or anything of the sort; however, at these frequencies you design knowing that it takes measurable time for an electric signal to propagate. For example, for this particular system whose core runs at 5.2GHz, if you try to send a signal to an external memory that is, say, 11-12cm away, it will take about two clock cycles just for the signal to travel the distance.
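To put numbers on it, here's a minimal sketch of the distance light covers per clock cycle. Signal propagation in real interconnects is slower still (very roughly 0.5-0.7c), so these are optimistic upper bounds:

```python
# Distance light travels in one clock cycle at a given frequency.
C = 299_792_458  # speed of light in vacuum, m/s

def cm_per_cycle(freq_hz):
    return C / freq_hz * 100

print(round(cm_per_cycle(3.0e9), 1))  # 10.0 cm at 3 GHz
print(round(cm_per_cycle(5.2e9), 1))  # 5.8 cm at 5.2 GHz
```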

Re:So much for the 3.3GHz speed of light limit. (1)

TheRaven64 (641858) | more than 4 years ago | (#33448938)

A lot of nonsense was spoken in that thread, but the issue is real. The time taken for light to travel is not yet a problem, but the skew is. Most communication between parts of a chip is parallel. If the connections are not precisely the same length, signals arrive at slightly different times. The clock speed is limited by the maximum skew at which all signals still arrive within the same time slice. A similar limit affects fibre optics, where total internal reflection causes sequential photons to take paths of different lengths.

CPU designers work around this with deep pipelining. The z10, which this replaces, had a 14 stage pipeline. Signals only need to propagate along one stage of the pipeline per cycle, reducing the distance a lot. The problem with this approach is that a branch misprediction is very expensive, because you may only find out about it when an instruction has gone all of the way along the pipeline, meaning that you need to throw away all of the work done since then. For the Pentium 4, this could mean discarding around 250 in-flight instructions, which was why the practical speed of the chip never came close to the theoretical speed.

Making the chip run at a higher clock speed usually means making the pipeline stages smaller, which can make things slower overall, which is where this limit really comes from, in the design sense. There are also power and switching issues at the material level.
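A toy model illustrates the trade-off. The branch and misprediction rates below are illustrative assumptions, not measured figures, and the flush-cost model is deliberately crude:

```python
# Toy model: deeper pipelines pay more per branch misprediction.
def effective_cpi(base_cpi, pipeline_depth, branch_rate, mispredict_rate):
    # Assume a misprediction costs roughly a full pipeline flush.
    return base_cpi + branch_rate * mispredict_rate * pipeline_depth

# Assume 20% of instructions are branches, 5% of those mispredicted.
shallow = effective_cpi(1.0, 14, 0.20, 0.05)  # z10-like 14-stage pipeline
deep = effective_cpi(1.0, 31, 0.20, 0.05)     # Pentium 4 Prescott-like depth

print(round(shallow, 2))  # 1.14 cycles per instruction
print(round(deep, 2))     # 1.31 cycles per instruction
```

Even with modest rates, the deeper pipeline loses noticeably more throughput to flushes, which is why higher clocks don't translate directly into higher performance.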

No news here (not apple) (1, Funny)

Anonymous Coward | more than 4 years ago | (#33448266)

No news here. Everyone knows the only innovation going on in the big companies is in Apple nowadays. Move along...

Fastest as in the clock frequency, or performance? (1)

noidentity (188756) | more than 4 years ago | (#33448280)

The summary makes it sound like it's merely the one with the greatest clock frequency. Me RTFA is out of the question, this being Slashdot and all.

So, does IBM license by the Core? (1)

Wormfoud (1749176) | more than 4 years ago | (#33448310)

If IBM licensed by the Core, at least there would be some business justification for developing this chip.

A Little more detail here (2, Informative)

valadaar (1667093) | more than 4 years ago | (#33448392)

If you go to the IBM announcement, which describes the system in more detail than the linked article - http://www-03.ibm.com/press/us/en/pressrelease/32414.wss [ibm.com]

"From a performance standpoint, the zEnterprise System is the most powerful commercial IBM system ever. The core server in the zEnterprise System -- called zEnterprise 196 -- contains 96 of the world's fastest, most powerful microprocessors, capable of executing more than 50 billion instructions per second. That's roughly 17,000 times more instructions than the Model 91, the high-end of IBM's popular System/360 family, could execute in 1970."

A 17,000x improvement in performance in 40 years? I suppose that is about right...
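A quick sanity check on that claim. The Model 91 figure here is inferred by working backwards from the claim itself, not taken from IBM's 1970 spec sheets:

```python
# Working backwards from IBM's "17,000x the Model 91" claim.
z196_ips = 50e9        # >50 billion instructions/sec, per IBM
factor = 17_000
model91_ips = z196_ips / factor

print(round(model91_ips / 1e6, 2))  # 2.94 MIPS, plausible for a late-1960s machine
```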

Is this chip a bargain? (1)

walterbyrd (182728) | more than 4 years ago | (#33448608)

How much would it cost for me to put together a system with the same computing power, using off-the-shelf products, like a Xeon chip, or something? How long would it take for me to save $1 million in electricity, or whatever?
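As a back-of-envelope exercise, every number below is a made-up assumption (price premium, power delta, and electricity rate), but it shows how to frame the payback question:

```python
# Hypothetical payback period for a mainframe vs. a commodity cluster.
price_premium = 1_000_000   # extra purchase cost in dollars (assumed)
power_saved_kw = 20         # power saved vs. an equivalent x86 cluster (assumed)
price_per_kwh = 0.10        # electricity cost in dollars (assumed)

savings_per_year = power_saved_kw * 24 * 365 * price_per_kwh
payback_years = price_premium / savings_per_year
print(round(payback_years, 1))  # 57.1 years with these made-up numbers
```

With numbers like these, electricity alone never justifies the premium; the usual justifications are RAS features, I/O throughput, and software licensing, not power bills.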

miniprocessor? (0)

Anonymous Coward | more than 4 years ago | (#33449032)

When can we move on from microprocessor? Is there a definition?
Let's get things moving, how about a kiloprocessor--or at least a milliprocessor.
