Multicore Chips As 'Mini-Internets'

Soulskill posted more than 2 years ago | from the or-perhaps-minternets dept.

An anonymous reader writes "Today, a typical chip might have six or eight cores, all communicating with each other over a single bundle of wires, called a bus. With a bus, only one pair of cores can talk at a time, which would be a serious limitation in chips with hundreds or even thousands of cores. Researchers at MIT say cores should instead communicate the same way computers hooked to the Internet do: by bundling the information they transmit into 'packets.' Each core would have its own router, which could send a packet down any of several paths, depending on the condition of the network as a whole."
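
For readers who want a concrete picture of the "each core gets its own router" idea, here is a toy sketch in Python (entirely invented for illustration, not MIT's actual design) of dimension-ordered "XY" routing in a small mesh, about the simplest scheme a per-core router could use. An adaptive router of the kind the summary describes would choose among several such paths depending on congestion, rather than always taking this fixed one.

    # Hypothetical sketch: route a packet through a mesh of cores, one hop at a
    # time, travelling along the X dimension first and then along Y.
    from dataclasses import dataclass

    @dataclass
    class Packet:
        payload: str
        dst: tuple  # (x, y) coordinates of the destination core

    def route_xy(src, dst):
        """Return the list of (x, y) hops a packet takes from src to dst."""
        x, y = src
        path = [src]
        while x != dst[0]:
            x += 1 if dst[0] > x else -1
            path.append((x, y))
        while y != dst[1]:
            y += 1 if dst[1] > y else -1
            path.append((x, y))
        return path

    pkt = Packet(payload="cache line 0x1f40", dst=(3, 2))
    print(route_xy((0, 0), pkt.dst))
    # [(0, 0), (1, 0), (2, 0), (3, 0), (3, 1), (3, 2)]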

A fault-tolerant chip? (5, Interesting)

Anonymous Coward | more than 2 years ago | (#39640323)

This technology that networks different cores could also serve another purpose: limiting the damage from core failure, and diagnosing such failures. If the cores are connected to other cores, the same data can be processed by bypassing a damaged core, making overheating or manufacturing problems still important, but more treatable. Who knows, cores might even become replaceable.

Re:A fault-tolerant chip? (2)

Mitchell314 (1576581) | more than 2 years ago | (#39640481)

What are the chances you damage the chip without damaging enough of it to render it inoperable?

Re:A fault-tolerant chip? (4, Interesting)

Osgeld (1900440) | more than 2 years ago | (#39640611)

Pretty good. A few years ago I ran for months on a dual-core with one core blown out; it worked fine until I fired up something that used both cores, then it would die.

Re:A fault-tolerant chip? (4, Informative)

Electricity Likes Me (1098643) | more than 2 years ago | (#39640681)

Also, this is exactly what chip makers already do to a great extent: the binning of CPUs by speed is not a targeted process. You make a bunch of chips, test them, and then sell them at whatever clock speed they are robustly stable at.

Re:A fault-tolerant chip? (2)

Osgeld (1900440) | more than 2 years ago | (#39640741)

Yep, it's also why overclocking is possible (and popular): "robustly stable" and "stable" are two different things, depending on where chips end up and on testing tolerances. That 2.5GHz chip may run at 2.7GHz just fine and dandy, but be out of spec with regard to voltage or temperature, even by a little.

You don't want Dell refusing a gigantic pile of chips because of a few bad products and raising a quality alert, which is very costly and time-consuming for both parties.

Re:A fault-tolerant chip? (5, Interesting)

Joce640k (829181) | more than 2 years ago | (#39641443)

Also, this is exactly what chip makers already do to a great extent: the binning of CPUs by speed is not a targeted process. You make a bunch of chips, test them, and then sell them at whatever clock speed they are robustly stable at.

Nope. The markings on a chip do NOT necessarily indicate what the chip is capable of.

Chips are sorted by ability, yes, but many are deliberately downgraded to fill incoming orders for less powerful chips. Bits of them are disabled/underclocked even though they passed all stability tests, simply because that's what the day's incoming orders were for.

Re:A fault-tolerant chip? (1)

TheLink (130905) | more than 2 years ago | (#39642955)

Also depends on how competitive the market is. Currently AMD isn't a strong competitor so Intel can do stuff like release software upgradeable CPUs. So no surprise if many recent Intel CPUs can be overclocked significantly. Seems like we're back in the days of 50% overclock (anyone remember the Celeron 300A?). Even Intel is officially selling overclockable CPUs.

Re:A fault-tolerant chip? (3, Interesting)

morgauxo (974071) | more than 2 years ago | (#39642903)

Years ago I had a single-core chip with a damaged FPU. It took me forever to figure out the problem: my computer could only run Gentoo. Windows and Debian, both of which it had run previously, gave me all sorts of weird errors I had never seen before. I had to keep using it because I was in college and didn't have money for another one, so I just got used to Gentoo. Even in Gentoo, anything which wasn't compiled from scratch was likely to crash in weird ways (a clue). I finally diagnosed the problem a couple of years later when a family member gave me a disk that boots up and runs all sorts of tests on the hardware. It turned out Gentoo worked because when software was compiled it recognized the lack of an FPU and compiled in floating-point emulation, like it was dealing with an old 486SX chip.

So, anyway, if that can happen I would imagine damaging a single core of a multicore chip is quite possible.

Re:A fault-tolerant chip? (4, Interesting)

AdamHaun (43173) | more than 2 years ago | (#39640643)

This sort of technology already exists to an extent. TI's Hercules TMS570 [ti.com] microcontrollers have two CPUs that run in lockstep along with a bus comparison module. I think total fault-tolerance might take three CPUs, but this provides strong hardware fault detection in addition to the usual ECC and other monitoring/correction stuff.

Note that run-time fault tolerance is mostly needed for safety-critical systems. The customers who buy these products do not do so to get better yield, they do so to guarantee that their airbags, anti-lock brakes, or medical devices won't kill anyone. As such, manufacturing quality is very high. Also, die size is significantly larger than comparable general market (non-safety) devices. This means they cost a small fortune. The PC equivalent would be MLC vs. SLC SSDs. Consumer products usually don't waste money on that kind of reliability unless they need it. Now a super-expensive server CPU, maybe...

[Disclaimer: I am a TI employee, but this is not an official advertisement for TI. Do not use any product in safety-critical systems without contacting the manufacturer, or at least a good lawyer. I am not responsible for damage to humans, machinery, or small woodland creatures that may result from improper use of TI products.]

Re:A fault-tolerant chip? (3, Interesting)

Joce640k (829181) | more than 2 years ago | (#39641483)

This sort of technology already exists to an extent. TI's Hercules TMS570 [ti.com] microcontrollers have two CPUs that run in lockstep along with a bus comparison module. I think total fault-tolerance might take three CPUs....

This is just to detect when an individual CPU has failed. To build a fault-tolerant system you need multiple CPUs.

nb. The 'three CPUs' thing isn't done for detection of hardware faults; it's for software faults. The idea is to get three different programmers to write three different programs against the same specified output. You then compare the outputs of the programs, and if one is different it's likely to be a bug.
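
For illustration, a minimal sketch of that N-version idea: three hypothetical, independently written implementations of the same spec, with a majority vote flagging the odd one out. The function names, and the deliberately buggy third version, are invented.

    from collections import Counter

    def version_a(x): return x * x
    def version_b(x): return x ** 2
    def version_c(x): return x * x + 1   # deliberately buggy stand-in

    VERSIONS = (version_a, version_b, version_c)

    def vote(x):
        """Run all versions, return the majority answer plus any dissenters."""
        results = [f(x) for f in VERSIONS]
        winner, count = Counter(results).most_common(1)[0]
        if count < 2:
            raise RuntimeError("no majority -- all versions disagree")
        suspects = [f.__name__ for f, r in zip(VERSIONS, results) if r != winner]
        return winner, suspects

    print(vote(7))   # (49, ['version_c']) -- the disagreeing version is the likely bug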

Re:A fault-tolerant chip? (1)

Joce640k (829181) | more than 2 years ago | (#39641495)

nb. The 'three CPUs' thing isn't done for detection of hardware faults it's for software faults.

...although it will detect non-catastrophic hardware faults as well, obviously.

Re:A fault-tolerant chip? (1)

Thiez (1281866) | more than 2 years ago | (#39642301)

> nb. The 'three CPUs' thing isn't done for detection of hardware faults it's for software faults. The idea is to get three different programmers to write three different programs with a specified output. You then compare the outputs of the programs and if one is different it's likely to be a bug.

Why would you need three CPUs when you can just have three threads that run on any number of CPUs?

And like the interwebs (-1)

Anonymous Coward | more than 2 years ago | (#39640327)

Best effort Routing! I buffer bloat and adds for cheap. Viagra fml some people don't realise. Explorer. Exe will always run on a single thread

Nanotubes! (2)

Kraftwerk (629978) | more than 2 years ago | (#39640329)

This would work perfectly with a series of (very small) tubes.

A glorified name for better bus arbitrators (0)

Anonymous Coward | more than 2 years ago | (#39640331)

Having worked on some of the technology that is used in bus arbitrators within SoCs these days, I can understand the need for better bus arbitrators, but terming it a mini-internet, with routers... c'mon.

Re:A glorified name for better bus arbitrators (1, Informative)

mikkelm (1000451) | more than 2 years ago | (#39640937)

Slashdot in 2012 is largely technical support people and Windows administrators who hold their MCSAs more dear than their first born. This is how it has to be explained.

Re:A glorified name for better bus arbitrators (2)

dyingtolive (1393037) | more than 2 years ago | (#39642623)

You'd think someone with a 7-digit UID wouldn't be so arrogant.

Re:A glorified name for better bus arbitrators (1)

zAPPzAPP (1207370) | more than 2 years ago | (#39641745)

The idea is that this is not 'a' bus, but many of them, making up several possible alternative routes.
A device deciding which route to take is a router.

Re:A glorified name for better bus arbitrators (1)

TheRaven64 (641858) | more than 2 years ago | (#39642085)

The idea is to make people say 'MIT? They're full of really smart people!' As with the last dozen or so MIT press releases published on Slashdot, it describes, in very vague terms, an idea that people in the field have been working on in various institutions for a decade or so. I don't know what MIT is like for research these days, but their press office is probably the best of any university in the world.

Re:A glorified name for better bus arbitrators (1)

zAPPzAPP (1207370) | more than 2 years ago | (#39642629)

I was speaking about the general idea people have been working on, not MIT in particular.
The point is, this is not just "a glorified name for bus arbitrators" but a different concept...

Re:A glorified name for better bus arbitrators (1)

skids (119237) | more than 2 years ago | (#39642537)

The idea is actually a couple of ideas about how to do that. The idea of meshed connectivity in CPUs is far from news. The news here is the call-based protocol they developed, by which one CPU sets up another for cut-through switching, and their power-saving "low-swing" wire encoding.

A problem in this sub-field, and in the CPU architecture field at large, is that the complexity ramps up crazily the more interoperating timing constraints get thrown into the mix. This means that if they want predictable, real-time results, programmers will need more intimate knowledge of the specific systems on which their code will be running, and along with supporting multiple platforms, this could get unmanageably complex. (For embarrassingly parallelizable, throughput-oriented code with few real-time performance expectations, it should not be quite so much of a problem, but even there the potential for overcomplexity exists.)

I don't doubt the technology they are developing will lay groundwork for on-silicon networking and will be useful at some point. It may even end up being used as they intend, but it will also likely be useful for more heterogeneous circuits. The holy grail, of course, is a full mesh (likely using optics), and there's always the chance we might leapfrog straight to that should the right combination of innovation and investment occur.

way back machine (5, Insightful)

Anonymous Coward | more than 2 years ago | (#39640341)

I guess MIT has forgotten about the Transputer....

Back to the future moment? (4, Insightful)

GumphMaster (772693) | more than 2 years ago | (#39640353)

I started reading and immediately had flashbacks to the Transputer [wikipedia.org].

Re:Back to the future moment? (4, Interesting)

tibit (1762298) | more than 2 years ago | (#39640403)

Alive and well as XMOS [xmos.com] products. I love those chips.

Re:Back to the future moment? (0)

Anonymous Coward | more than 2 years ago | (#39641265)

Want.

Re:Back to the future moment? (2)

WrecklessSandwich (1000139) | more than 2 years ago | (#39641373)

Yep, thought of XMOS immediately when I saw the title. 16 quad-core CPUs linked together in a 4D hypercube: https://www.xmos.com/products/development-kits/xmp-64 [xmos.com]

Re:Back to the future moment? (0)

Anonymous Coward | more than 2 years ago | (#39641733)

Great, let me know when I can run all of my software on it.

Re:Back to the future moment? (0)

Anonymous Coward | more than 2 years ago | (#39642091)

Let me know when I can run all my software on any single platform.

Re:Back to the future moment? (1)

gman003 (1693318) | more than 2 years ago | (#39640421)

Or, more recently, Intel's many-core prototypes used this. At the very least, the "Single-Chip Cloud Computer" used a mesh network, and I think Larrabee had such a thing as well...

Re:Back to the future moment? (0)

Anonymous Coward | more than 2 years ago | (#39640491)

Exactly. The company I worked at at the time developed compiler backends for the T9000 (among others). A weird but elegant stack-based architecture, and a very integrated CPU networking concept.

Re:Back to the future moment? (1)

crutchy (1949900) | more than 2 years ago | (#39641681)

T9000... sounds like a terminator model. that company you worked for wasn't cyberdyne systems by any chance?

Re:Back to the future moment? (3, Informative)

jd (1658) | more than 2 years ago | (#39640771)

The Transputer was a brilliant design. Intel came up with a next-gen variant, called the iWarp, but never did anything with it and eventually abandoned the concept.

IIRC, each Transputer had four serial lines where each could be in transmit or receive mode. They each had their own memory management (16K on-board, extendable up to 4 gigs - it was a true 32-bit architecture) so there was never any memory contention. Arrays of thousands of Transputers, arranged in a Hypercube topology, were developed and could out-perform the Cray X-MP at a fraction of the cost.

Having a similar communications system in modern CPUs would certainly be doable. It would have the major benefit over a bus in that it's a local communications channel so you always have maximum bandwidth. Having said that, a switched network would have fewer interconnects and be simpler to construct and scale since the switching logic is isolated and not part of the core. You can also multicast and anycast on a switched network - technically doable on the Transputer but not trivial. Multicasting is excellent for MISD-type problems (multi-instruction, single-data) since you can have the instructions in the L1 cache and then just deliver the data in a single burst to all applicable cores.

(Interestingly, although PVM and MPI support collective operations of this kind, they're usually done as for loops, which - by definition - means your network latency goes up with the number of processes you send to. Since collective operations usually end in a barrier, even the process you first send to has this extra latency built into it.)
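
To put rough numbers on that latency point, here is a back-of-the-envelope sketch (hypothetical, not a measurement of any PVM or MPI implementation) comparing the number of sequential communication rounds in a for-loop broadcast with a binomial-tree broadcast, where every node that already holds the data forwards it in parallel.

    import math

    def for_loop_rounds(n_procs):
        # the root sends to each of the other n-1 processes, one after another
        return n_procs - 1

    def tree_rounds(n_procs):
        # each round doubles the number of processes that hold the data
        return math.ceil(math.log2(n_procs))

    for n in (4, 16, 64, 1024):
        print(n, for_loop_rounds(n), tree_rounds(n))
    # 4: 3 vs 2   16: 15 vs 4   64: 63 vs 6   1024: 1023 vs 10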

It's also arguable that it would be better if the networking in the CPU was compatible with the networking on the main bus since this would mean core-to-core communications across SMP would not require any translation or any extra complexities in the support chips. It would also mean CPU-to-GPU communications would be greatly simplified.

Re:Back to the future moment? (3, Interesting)

91degrees (207121) | more than 2 years ago | (#39641627)

My Computer Architecture lecturer at University was David May - lead architect for the Transputer. Our architecture notes consisted of a treatise on transputer design.

Now that multi-processor is becoming standard, it's interesting to see the same problems being rediscovered, and often the same solutions reinvented. Their next problem will be contention between two cores that happen to be running processes that require a lot of communication. Inmos had a simple solution to this one as well.

Rather a shame that Inmos came up with the technology a quarter of a century too early. I've known a lot of engineers to say wonderful things about them. The reason they weren't a huge success was that nobody had found a need for them yet: extra silicon could be used to make the current generation faster much more easily than it can be now.

But now the patents have expired... (0)

Anonymous Coward | more than 2 years ago | (#39642097)

But now the patents have expired...

So anyone can implement the solutions.

Re:Back to the future moment? (1)

TheRaven64 (641858) | more than 2 years ago | (#39642147)

The reason they weren't a huge success was because nobody had found a need for them yet

It was more the fact that processors at the time kept getting faster. The number of transistors doubled every 12-18 months, and this translated to at least a doubling in performance. As with other massively parallel systems, you needed to rewrite your software to take advantage of it, while you could just wait a year and your single-threaded system got faster. This is why multicore is suddenly interesting: chip designers have run out of obvious (and even not-so-obvious) things to do with extra transistors to make existing code faster. Extra cache worked for a while. FPUs, then vector units worked a bit. Wider superscalar systems did until we'd got as much ILP out of the code as was generally possible.

But what does the internet stand on? (4, Funny)

keekerdc (2504208) | more than 2 years ago | (#39640359)

Ah, you're clever; but it's internets all the way down.

Wrong (1)

SmallFurryCreature (593017) | more than 2 years ago | (#39641227)

It is lolcats all the way down, in a pool of porn with an essence of "Me too" posts.

Anyway, I think the original poster needs to read up on what the Internet is. It is a network of networks. A number of CPUs networking together is just a network. If you could mix many different systems together, it would be an Internet.

If you could put an Intel CPU next to an AMD one and they would just work together seamlessly, THAT would be an Internet.

Re:Wrong (1)

crutchy (1949900) | more than 2 years ago | (#39641685)

I think the original poster needs to read up on what the Internet is

i think armchair experts are really just wankers with big hats

Say what? (2, Insightful)

Anonymous Coward | more than 2 years ago | (#39640361)

Errr... the internal "bus" between cores on modern x86 chips already is either a ring of point to point links or a star with a massive crossbar at the center.

Re:Say what? (4, Interesting)

hamjudo (64140) | more than 2 years ago | (#39640437)

Errr... the internal "bus" between cores on modern x86 chips already is either a ring of point to point links or a star with a massive crossbar at the center.

The researchers can't be this far removed from the state of the art, so I am hoping that it is just a really badly written article. I hope they are comparing their newer research chips with their own previous generation of research chips. Intel and AMD aren't handing out their current chip designs to the universities, so many things have to be re-invented.

Re:Say what? (3, Insightful)

TheRaven64 (641858) | more than 2 years ago | (#39642163)

The researchers can't be this far removed from the state of the art

They aren't. The way this works is a conversation something like this:

MIT PR: We want to write about your research, what do you do?
Researcher: We're looking at highly scalable interconnects for future manycore systems.
MIT PR: Interconnects? Like wires?
Researcher: No, the way in which the cores on a chip communicate.
MIT PR: So how does that work?
Researcher: {long explanation}
MIT PR: {blank expression}
Researcher: You know how the Internet works? With packet switching?
MIT PR: I guess...
Researcher: Well, kind-of like that.
MIT PR: Our researchers are putting the Internet in a CPU!!1!111eleventyone

Re:Say what? (1)

Cyrano de Maniac (60961) | more than 2 years ago | (#39641115)

What AC said. It's the one and only comment on this story you need to read.

I have a dream too (-1)

Anonymous Coward | more than 2 years ago | (#39640363)

Just when you thought it was safe, police killed a little boy last night.
They said it was a mistake, but that won't bring back his life.
His mama couldn't believe that it could happen to her.
She prayed to God every day.
Guess it just wasn't enough.

How long before... (0)

Anonymous Coward | more than 2 years ago | (#39640407)

...embedded SOPA and PIPA :-P

Sounds like... (2)

ArchieBunker (132337) | more than 2 years ago | (#39640409)

ccNUMA?

Re:Sounds like... (4, Interesting)

jd (1658) | more than 2 years ago | (#39640871)

For low-level ccNUMA, you'd want three things:

  • A CPU network/bus with a "delay tolerant protocol" layer and support for tunneling to other chips
  • An MTU-to-MTU network/bus which used a compatible protocol to the CPU network/bus
  • MTUs to cache results locally

If you were really clever, the MTU would become a CPU with a very limited instruction set (since there's no point re-inventing the rest of the architecture and external caching for CPUs is better developed than external caching for MTUs). In fact, you could slowly replace a lot of the chips in the system with highly specialized CPUs that could communicate with each other via a tunneled CPU network protocol.

Internets all the way down? (1)

Gothmolly (148874) | more than 2 years ago | (#39640423)

And then each router, which is a processing unit in its own right, could have multiple cores, which would exhibit the same drawbacks... until you put a network of processors inside that!

Re:Internets all the way down? (-1)

Anonymous Coward | more than 2 years ago | (#39640809)

But how would the MAFIAA legislate decent & honest communication between all those cores? I suspect this technology will be made illegal before it ever sees the light of day.

Re:Internets all the way down? (1)

ExploHD (888637) | more than 2 years ago | (#39641407)

And then each router, which is a processing unit in its own right, could have multiple cores, which would exhibit the same drawbacks... until you put a network of processors inside that!

We need to go deeper!

Hrmm? (0)

Anonymous Coward | more than 2 years ago | (#39640435)

So they'll have multiple busses, then. That's a rather goofy way of wording it.

It's not the packet switching itself that is improving performance, it's the extra bandwidth.

So, why not move from "hub" to "switch"? (1)

ivi (126837) | more than 2 years ago | (#39640443)

Sounds like history... the history of the Hub in LAN technology.

Maybe it's time to move to a switch, which can keep multiple core pairs communicating simultaneously.

Re:So, why not move from "hub" to "switch"? (2)

Osgeld (1900440) | more than 2 years ago | (#39640585)

I still think switches on tiny low-traffic networks are a silly notion, though now that the cost of switches is insignificant (and when was the last time you saw a hub for sale?) I just go with the flow.

  Back in the day we had a client who dumped the hubs in each branch for what were, at the time, much more expensive switches, then whined that there was no advantage. I replied: you insisted on putting your two 386s and a dot-matrix printer on it, and even threatened to take your biz elsewhere; you got what you wanted, enjoy.

Buses are so '90s (5, Informative)

rrohbeck (944847) | more than 2 years ago | (#39640447)

AMD uses HT and Intel has its ring bus, both of which use point-to-point links. Buses have serious trouble with the impedance jumps at the taps and clock skew between the lines; that's why nobody is using them in high-speed applications any more. Even the venerable SCSI and ATA buses went the way of the dodo. The only bus I can see in my system is DDR3 (and I think that will go away with DDR4 due to the same problems).

Re:Buses are so '90s (1)

eggfoolr (999317) | more than 2 years ago | (#39640539)

Bus? That is so 70's and 80's!

What about the crossbar switch? They were in fashion in the 90's and are pretty much the core architecture of any multi CPU system.

Next they'll be saying you can have multiple users on the same computer!!

Re:Buses are so '90s (0)

Anonymous Coward | more than 2 years ago | (#39640813)

Uh, crossbar switches were used from at least the '70s in telephone exchanges.

Re:Buses are so '90s (0)

Anonymous Coward | more than 2 years ago | (#39640861)

What a dummy I am, I should have said: crossbar switches have been used since 1919.

Re:Buses are so '90s (1)

eggfoolr (999317) | more than 2 years ago | (#39640893)

If you're going to be like that, then buses have been transporting people for over 100 years.

Re:Buses are so '90s (1)

tamyrlin (51) | more than 2 years ago | (#39641711)

Actually, even the first computers used buses. For example the Z3, which was built in the early 40's, used buses to transport data. (Actually, the Z3 architecture was very advanced for its time and it is much closer to a modern simple processor than for example ENIAC.)

Regarding the article summary, I could note that it is not only researchers from MIT who say that a network-on-chip (NoC) is a promising concept for the future of chip design. Almost every researcher I've talked to seems to agree that NoCs of some form are needed for future chips. Note that the concept of a packet-switching network is not new in computers. It has been used in supercomputers for a long time, and HyperTransport is based on a packet-switching architecture.

That being said, the work the researchers have actually done seems interesting, especially the concept of virtual bypassing, which I'll have to read up on at some point.

Inefficient... (2)

solidraven (1633185) | more than 2 years ago | (#39640521)

That's just plain inefficient use of silicon area. They want to waste some of that limited space on additional logic that isn't strictly necessary, and it will create a significant bottleneck. Did they forget about DMA controllers or something? You already need a DMA controller no matter what, and it's perfectly capable of accessing the necessary memories as it is. Adding some extra capabilities to the DMA controller would be far more efficient in logic area and would most likely lead to better performance than this bad idea.

Re:Inefficient... (2)

Theovon (109752) | more than 2 years ago | (#39642345)

Silicon AREA is cheap, and it's getting cheaper. Today's processors dedicate half their die space to CACHE. Transistors per die, cores per die, and transistors per core are all increasing at (different) exponential rates. And with power density increasing at a quadratic rate, we're already facing the dark silicon problem, where if we power on the entire chip at nominal voltage, we have trouble delivering the power, and we can't dissipate the heat.

With 16 cores, a bus is tolerable. At 64, it's a liability, and we NEED a more sophisticated network.
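
A rough, purely illustrative calculation behind that claim: with a single shared bus only one transfer is in flight at a time, so each core's average share of the interconnect falls as 1/N. The total bandwidth figure below is made up for the example.

    BUS_BANDWIDTH_GBPS = 100  # assumed total bus bandwidth (invented number)

    for cores in (4, 16, 64, 256):
        share = BUS_BANDWIDTH_GBPS / cores
        print(f"{cores} cores -> {share:.2f} GB/s of bus bandwidth per core")
    # 4 -> 25.00, 16 -> 6.25, 64 -> 1.56, 256 -> 0.39

A mesh or other routed network adds links as cores are added, which is why it keeps scaling where the shared bus does not.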

Yea cause packet transmissions (0)

Osgeld (1900440) | more than 2 years ago | (#39640555)

after the data is chopped up, formatted, and sent down a narrow serial pipe, are so much faster than sending it directly over a parallel link. And besides, no, a TYPICAL chip has 2 to 4 cores; 6-8 would imply a higher-end chip that is currently quite expensive and not in TYPICAL use by TYPICAL people.

MIT please get out of the dreams lab once in a while

Re:Yea cause packet transmissions (1)

Electricity Likes Me (1098643) | more than 2 years ago | (#39640785)

No "typical" consumer chip 10 years ago had even 4 cores.

Re:Yea cause packet transmissions (1)

Osgeld (1900440) | more than 2 years ago | (#39640839)

Who said anything about 10 years ago? And do you think in 10 years we will have typical consumer machines with "chips with hundreds or even thousands of cores"?

In 10 years we will honestly be lucky to have serious machines with "hundreds or even thousands of cores" on the same plane, not strung together with networking.

Re:Yea cause packet transmissions (1)

Electricity Likes Me (1098643) | more than 2 years ago | (#39640895)

What are you even referring to?

Your OP was implying this is all garbage because 6-8 cores is a high-end chip, not a "typical" one.

Yet 10 years is not a long time - within the past decade 4 cores would've been a high-end chip, and before that having 2 physical processors would've been significant as well.

So I would think there is in fact a great deal of importance to this kind of work, seeing as the number of cores per chip for consumer items has grown and grown. And then you undermine your own point by implying we might even be getting close to "hundreds" of cores on a chip in the next 10 years. If we are, then the typical consumer chip will be breaching 8-16 easily. Not to mention things like the Cell architecture, where Sony was thinking about pushing 24 worker cores onto the chip for the PS4 (it has backed off since then, but it shows where things are headed).

Re:Yea cause packet transmissions (1)

Osgeld (1900440) | more than 2 years ago | (#39640969)

What are you replying to? Nowhere does it state "in 10 years".

here just in case you missed it, the very first sentence of the headline

""Today, a typical chip might have six or eight cores, all communicating with each other over a single bundle of wires, called a bus"

in case you missed it again let me point it out to you TODAY, A TYPICAL CHIP MIGHT HAVE SIX OR EIGHT CORES

Re:Yea cause packet transmissions (1)

tamyrlin (51) | more than 2 years ago | (#39641805)

> MIT please get out of the dreams lab once in a while

Actually, no chip designer wants to use a network-on-chip if they can avoid it, due to the added complexity. However, for future SoC designs with hundreds of modules it will simply not be efficient to have direct parallel links between every module on the chip. A network will in many cases therefore be the best trade-off between silicon area, bandwidth, and energy efficiency.

Also, note that a typical SoC used in, for example, a mobile phone already has significantly more than eight cores (although most of these cores are not processors, they still require communication links of some sort). (Take the OMAP4470 as an example [1] - it has at least two Cortex-A9 cores, an IVA3 accelerator, PowerVR graphics, a signal processor, an SDRAM controller, a flash controller, an MMC controller, HDMI output, SPI controllers, I2C controllers, an SDIO controller, a UART controller, a USB controller, a GPIO controller, etc.) So if MIT is in a dream lab, the only thing they are doing is trying to come up with a way to handle the nightmare that future on-chip communication entails.

They aren't doing this already? (2)

DaneM (810927) | more than 2 years ago | (#39640557)

I admit that despite being a technical user, I was not aware that only 2 chips are allowed to "talk" at a given time. I had (erroneously, it would seem) assumed that in order for a 3+-core chip to be fully useful, such a switch/router would have to already be in place.

So, have Intel, AMD, and others simply tricked us into thinking that a 3+-core chip can actively use all its cores at once (as is the natural assumption), or am I misinterpreting something? If they have, why on earth didn't they include a "router" in the original designs? It seems entirely too obvious for the eggheads in R&D to have missed (or so one would think, anyway). I'm sure there are technical hurdles to overcome, but unless that can be managed, what is really the point of many-core CPUs that can't have all cores acting at once?

Re:They aren't doing this already? (2, Informative)

Anonymous Coward | more than 2 years ago | (#39640805)

You are misinterpreting it. The cores CAN work independently. It is only when one needs to talk to another or use a shared resource (hard drive, main memory, network) that this becomes a potential issue. It is like a family of three sharing a single bathroom - not such a big deal. Bump that up to 20 people using the same bathroom, and you start having serious issues.

Re:They aren't doing this already? (1)

DaneM (810927) | more than 2 years ago | (#39641141)

OK, I see. Thanks for the clarification. (Why post such an intelligent remark as Anonymous Coward?) This being an issue concerned only with shared resources seems to make the lack of concurrent interaction less of an issue, but as with your family/bathroom analogy, it will (predictably) become a major problem as the number of cores/processors in a system continues to increase.

So, while I still wonder why this hasn't already been thought of and solved, I can see that it isn't an area that a (typically short-sighted) company would have invested much R&D into as of yet. I wonder if some independent technology firm has already come up with a solution that will soon be purchased by Intel or AMD. I see another patent battle coming...

Re:They aren't doing this already? (4, Insightful)

Forever Wondering (2506940) | more than 2 years ago | (#39641205)

I admit that despite being a technical user, I was not aware that only 2 chips are allowed to "talk" at a given time. I had (erroneously, it would seem) assumed that in order for a 3+-core chip to be fully useful, such a switch/router would have to already be in place.

For [most] current designs, Intel/AMD have multilevel cache memory. The cores run independently and fully in parallel and if they need to communicate they do so via shared memory. Thus, they all run full bore, flat out, and don't need to wait for each other [there are some exceptions--read on]. They have cache snoop logic that keeps them up-to-date. In other words, all cores have access to the entire DRAM space through the cache hierarchy. When the system is booted, the DRAM is divided up (so each core gets its 1/N share of it).

Let's say you have an 8 core chip. Normally, each program gets its own core [sort of]. Your email gets a core, your browser gets a core, your editor gets one, etc. and none of them wait for another [unless they do filesystem operations, etc.] Disjoint programs don't need to communicate much usually [and not at the level we're talking about here].

But, if you have a program designed for heavy computation (e.g. video compression or transcoding), it might be designed to use multiple cores to get its work done faster. It will consist of multiple sections (e.g. processes/threads). If a process/thread so designates, it can share portions of its memory space with other processes/threads. Each thread takes input data from a memory pool somewhere, does some work on it, and deposits the results in a memory output pool. It then alerts the next thread in the processing "pipeline" as to which memory buffer it placed the result in. The next thread does much the same. x86 architectures have some locking primitives to assist this. It's a bit more complex than that, but you don't need a "router". If the multicore application is designed correctly, any delays for sync between pipeline stages occur infrequently and are on the order of a few CPU cycles.

This works fine up to about 16-32 cores. Beyond that, even the cache becomes a bottleneck. Or, consider a system where you have a 16-core chip (all on the same silicon substrate). The cache works fine there. But now suppose you want to have a motherboard that has 100 of these chips on it. That's right--16 cores/chip X 100 chips for a total of 160 cores. Now, you need some form of interchip communication.

x86 systems already have this in the form of HyperTransport (AMD) or the PCI Express bus (Intel) [there are others as well]. PCIe isn't a bus in the classic sense at all. It functions like an onboard store-and-forward point-to-point routing system with guaranteed packet delivery. This is how a SATA host adapter communicates with DRAM (via a PCIe link). Likewise for your video controller. Most current systems don't need to use PCIe beyond this (e.g. to hook up multiple CPU chips) because most desktop/laptop systems have only one chip (with X cores in it). But, in the 100-chip example, you would need something like this, and HT and PCIe already do something similar. Intel/AMD are already working on any enhancements to HT/PCIe as needed. Actually, Intel [unwilling to just use HT] is pushing "QuickPath Interconnect" or QPI.
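
For illustration only, here is a minimal sketch of the thread-pipeline pattern described above, using Python threads and queues in place of x86 locking primitives. The stage names and the "work" they perform are invented; the point is just that each stage pulls a buffer, processes it, and hands the result to the next stage.

    import threading, queue

    stage1_out = queue.Queue()
    results = queue.Queue()

    def decode(frames):
        for frame in frames:
            stage1_out.put(frame.upper())   # stand-in for a real "decode" step
        stage1_out.put(None)                # sentinel: no more work

    def compress():
        while True:
            buf = stage1_out.get()
            if buf is None:
                results.put(None)
                break
            results.put(buf[::-1])          # stand-in for a real "compress" step

    threads = [threading.Thread(target=decode, args=(["frame1", "frame2"],)),
               threading.Thread(target=compress)]
    for t in threads: t.start()
    for t in threads: t.join()

    while (item := results.get()) is not None:
        print(item)   # 1EMARF, then 2EMARF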

Re:They aren't doing this already? (1)

DaneM (810927) | more than 2 years ago | (#39641383)

Thanks for the enlightening "sip from the fire hose," Forever Wondering. I appreciate the explanation.

Re:They aren't doing this already? (1)

dkf (304284) | more than 2 years ago | (#39641769)

That's right--16 cores/chip X 100 chips for a total of 160 cores.

16 * 100 = 160?

You must be a hardware engineer. Did you work for Intel on the early Pentium floating point unit?

Remember SGI? (0)

Anonymous Coward | more than 2 years ago | (#39640615)

SGI did this in just about every computer it produced from the early 90s until they stopped making MIPS machines (or existing, really). You could use Craylink cables and R-bricks to turn multiple C-bricks (full-fledged Origin servers with 1-4 CPUs), into single-system-image ccNUMA machines. They had quite a few big Origin machines in the Top 500 back in the day.

Bonus points: my captcha was "networking".

Re:Remember SGI? (1)

Cyrano de Maniac (60961) | more than 2 years ago | (#39641107)

We still do. The only major difference (other than generational improvements) is that these days it's x86 instead of MIPS.

the worst replaces the best (3, Interesting)

holophrastic (221104) | more than 2 years ago | (#39640717)

Yeah, great idea. Take the very fastest communication that we have on the entire planet, and replace it with the absolute slowest communication we have on the planet. Great idea. And with it, more complexity, more caches, more lookup tables, and more things to go wrong.

The best part is that it's totally unbalanced. Internet protocols are based on a network that's ever-changing and totally unreliable. The bus, on the other hand, is built on total reliability and is static.

I'd have thought that a pool concept, or a mailbox metaphor, or a message-board analog would have been more appropriate. Something where streams are naturally quantized and sending is unpaired from receiving. Where a recipient can operate at its own rate, independent of the sender.

You know, like typical linux interactive sockets, for example. But what do I know.

Re:the worst replaces the best (1)

tamyrlin (51) | more than 2 years ago | (#39641765)

Actually, the networks used in Network-on-Chips are quite unlike the networks used for TCP/IP. For example, when you develop a System-on-Chip you have a very good idea of your workload, so you can optimize the network topology based on that information. The networks proposed in NoC research typically also have other features not found on the Internet such as guaranteed and in-order delivery of packets. (Which is fairly easy to do in a small network with low latencies.) In many cases you can also reserve bandwidth between nodes so that you can give real-time guarantees. However, in some systems circuit-switching may be better than packet switching, although most researchers seem to focus on packet-switching NoCs.

A good paper to read for an introduction to NoCs is "Route Packets, Not Wires: On-Chip Interconnection Networks" by Dally and Towles. (You can find it at http://www.cs.berkeley.edu/~vwen/backgrnd_papers/41_4.pdf [berkeley.edu] if you are interested.)

Anyway, the basic idea behind a NoC is that it is a good trade-off between the two extremes of a bus and a crossbar. If you implement a chip with just a single bus on it, the silicon area used for communication will be very low, but the bandwidth will also be relatively low. On the other hand, if you create a huge crossbar to which every module is connected, the silicon area used for communication is extremely high (the area for a crossbar grows quadratically with the number of ports), although the theoretical maximum bandwidth is also very high. In most systems, the optimum point will be somewhere in between, where you have several buses and/or crossbars connected by a network.
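
A toy area model of that trade-off (the constants are invented purely to show the shape of the curves, not taken from any real process): a single bus grows roughly linearly with the number of modules, a full crossbar grows quadratically with its port count, and a mesh of small fixed-size routers sits in between once the module count gets large.

    def bus_area(n):
        return n                   # roughly one tap per module

    def crossbar_area(n):
        return n * n               # quadratic in the number of ports

    def mesh_area(n, ports_per_router=5):
        return n * ports_per_router ** 2   # one small crossbar per module

    for n in (8, 64, 256):
        print(n, bus_area(n), crossbar_area(n), mesh_area(n))
    # 8: 8 / 64 / 200   64: 64 / 4096 / 1600   256: 256 / 65536 / 6400

Note that with these made-up constants the mesh only beats the full crossbar once the module count is fairly large, which matches the comment's point that the optimum lies somewhere between the two extremes.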

Sounds like an idea of mine (0)

Anonymous Coward | more than 2 years ago | (#39640729)

I had an idea for a MMO game, where people would use personal computer hardware up against an internet, but everyone would have multiple entry points because of processor complexity.

Unfortunately, as designed, this means an all out war on the internet, with no security nor privacy.

Really!? It has already been used (0)

Anonymous Coward | more than 2 years ago | (#39640775)

by Tandem Computers, like, a long time ago.

The important bit : No coherent shared cache (5, Informative)

Sarusa (104047) | more than 2 years ago | (#39640797)

As mentioned in other comments, this has been done before. The method of message passing isn't as fundamental as one key point - that it is all explicit message passing.

Intel and AMD x86/x64 CPUs use coherent cache between cores to make sure that a thread running on CPU 1 sees the same RAM as a thread running on CPU 3. This leads to horrible bottlenecks and huge amounts of die area tied up in trying to coordinate the writes and maintain coherency between N cores ((N-1)^2 connections!), and it all just goes to hell pretty fast. Intel has this super new transactional memory rollback thing, but it's turd polishing.

The next step is pretty obvious (see Barrelfish) and easy: no shared coherency. Everything is done with message passing. If two threads or processes (it doesn't really matter at that point) want to communicate they need to do it with messages. It's much cleaner than dealing with shared memory synchronization, and makes program flow much more obvious (to me at least - I use message queues even on x86/x64). If you need to share BIG MEMORY between threads, which is reasonable for something like image processing, you at least use messages to explicitly coordinate access to shared memory and the cores don't have to worry about coherency.

This scales extremely well for at least a couple thousand CPUs, which is where the 'local internet' becomes useful.

Where it becomes not easy is that almost all programs written for x86/x64 assume threads can share memory at will. They'd need to be rewritten for this model or would suddenly run a whole lot slower since you'd have to lock them to one core or somehow do the coordination behind their back. It'd be worth it for me!
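
A small sketch of the "no shared coherency, only explicit messages" model described above, using OS processes and pipes as stand-ins for on-chip message channels. Each worker owns its data outright, so there is nothing for a coherency protocol to keep in sync. (The names and the trivial "work" are invented for illustration.)

    from multiprocessing import Process, Pipe

    def worker(conn):
        while True:
            msg = conn.recv()
            if msg == "stop":
                break
            conn.send(("squared", msg * msg))   # reply with an explicit message

    if __name__ == "__main__":
        parent_end, child_end = Pipe()
        p = Process(target=worker, args=(child_end,))
        p.start()
        for n in (3, 5):
            parent_end.send(n)
            print(parent_end.recv())   # ('squared', 9), then ('squared', 25)
        parent_end.send("stop")
        p.join()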

Re:The important bit : No coherent shared cache (1)

Anonymous Coward | more than 2 years ago | (#39641111)

IPC has been a PITA and slow for decades; you don't want that to be the only option in the future.

Re:The important bit : No coherent shared cache (1)

Anonymous Coward | more than 2 years ago | (#39641723)

Seems like you are talking about switching from a "strong memory model" to a "weak memory model" and TBQH I know my share of developers that can barely handle multithreaded programming as it is... throwing this at them could be a disaster on the software side.

Re:The important bit : No coherent shared cache (2)

dkf (304284) | more than 2 years ago | (#39641793)

Seems like you are talking about switching from a "strong memory model" to a "weak memory model" and TBQH I know my share of developers that can barely handle multithreaded programming as it is... throwing this at them could be a disaster on the software side.

Depends on the model. If the model is "oh, you got one big space of memory; anything goes but you'd better sprinkle a few locks in" then yes, that will suck boulders when the hardware switches to message passing, but there are other parallelism models in use in programming. Those that have each thread as being essentially isolated and only communicating with the other threads by sending messages will adapt much more easily; that's basically MPI, and that's known to scale massively. It's also a heck of a lot easier to reason about message passing parallelism; that's been known since at least the '80s. What's more, there are actually quite a lot of programmers who have experience with distributed component programming; they just tend to work at a much higher level than a single process (or single computer).

not news, just PR (1)

markhahn (122033) | more than 2 years ago | (#39640837)

oh, come on. buses have been dead for years (sata and pcie are great examples of the prevalence of point-to-point links). no reason we can't think of cachelines as packets (bigger than ATM packets were!). how about hypertransport and QPI?

Re:not news, just PR (1)

mikkelm (1000451) | more than 2 years ago | (#39640951)

Everything you do deals with a bus somewhere. They're still hugely relevant, particularly in very dense, very fast electronics.

Google's on it (2)

bill_mcgonigle (4333) | more than 2 years ago | (#39640911)

I can't seem to find the old story or my comment on it, but when Google acquired a "stealth" startup a year or so ago, the most interesting thing about it was that the primary investigator had a few patents for packet-switched CPUs.

This is not like the internet! (1)

Darinbob (1142669) | more than 2 years ago | (#39640983)

Come on, people. Cores share information and suddenly it's just like the internet? Are these journalists' experiences so narrow that they have no other analogy? It's just a fricking bus! There are networks that exist which are not "the internet". Using the term "internet" implies global connectivity. OK, I expect journalists to be ignorant, but please, are Slashdot editors this confused about basic technology as well?

SeaMicro (1)

Sollord (888521) | more than 2 years ago | (#39641105)

Didn't AMD just buy a company that did something similar to this? While not at the chip or core level, it seems kinda related.

Software is the problem (1)

CBravo (35450) | more than 2 years ago | (#39641121)

The problem is not the hardware but the software. The hardware has been parallel for ages, even locally (GPU, GPU-memory, CPU, memory, HDD, DMA - memory processor, ...).

Software is a different problem across the networked/parallel arena. If you really think about it, an SMS is not much more than "hello world": you type it and you see text (no function other than transport, which isn't really a function, has been performed), and testing it should be easy. This is not even about parallelism but about communication.

The best way to create software for networking is to not rewrite it for all these new parallel architectures/the Internet (which means you compile it for compartmented execution). This is, however, pretty hard to do (I don't know of such an implementation). The alternative is that everybody puts the same glue in their software over and over (RMI, OpenMP, ...). We are doing #2.

By the way, I think there is a big difference between networking, which starts from the premise that things fail, and local transport of data/code, which is specced to just work. Fundamentally different (in price).

sounds like Cell (1)

fikx (704101) | more than 2 years ago | (#39641155)

Isn't this a variation on the Cell architecture? Except no one could figure out how to write the OS and compiler to fully realize the goal of programs that could be farmed out by the main PowerPC core to the special processors on one chip, let alone farmed out to multiple Cells over a network.

Great idea... (1)

FithisUX (855293) | more than 2 years ago | (#39641333)

I believe this is the ultimate solution to parallelization. A total realization of actors.

a bit late (0)

Anonymous Coward | more than 2 years ago | (#39641535)

That'd be all great and applicable to say... a Pentium D. Those processors had cores that, in fact, communicated via the local bus.

Way old idea! (1)

enriquevagu (1026480) | more than 2 years ago | (#39641537)

The seminal paper proposing the use of switched/routed interconnection networks on-chip (NoCs) was published by Dally and Towles 11 years ago in DAC'01: Route packets, not wires: On-chip interconnection networks [ieee.org]. The idea of associating a router with each core and replicating it in "tiles" is not new either; Tilera [tilera.com] was (IIRC) the first company to sell processors based on a tiled design, which was an evolution of the RAW [mit.edu] research project. A related research project, TRIPS [utexas.edu], replicated functional units on each tile, rather than full cores. Intel has used a tiled design in Polaris [wikipedia.org], the SCC [intel.com] and MIC [intel.com] (which includes the forthcoming Knights Corner).

So no, the idea of using routed interconnects is not new at all. In fact, after reading the linked article, it turns out that two-thirds of the text introduces the idea, and the last section details the contributions: two ideas developed by the group of Li-Shiuan Peh, aimed at improving performance (using virtual bypassing, a form of routing precomputation) and at reducing power consumption (using low-swing signaling).

The Simpsons already did it ... (0)

Anonymous Coward | more than 2 years ago | (#39641555)

Sort of like AMD HyperTransport then. Multiple links on each CPU, and packet switched... Add in variable-width connections downstream and it's pretty cool. Actually the best of both, since it is a common bus, packet switched, and the processors had multiples of them... And suitable for off-chip as well as on. Then Intel did a similar thing but sort of didn't exploit it in FB-DIMM... Oh, and I am an Inmos fan from way back. Variable-width operations, four 5 MHz serial bus connections, and a massive matrix switch in the family, not to mention a dedicated serial-bus-connected disk controller and graphics chips... Cool stuff, killed off when SGS bought them. And they had a cool language, Occam, with intrinsics for timers and inter-processor communication channels... and extended C for that too. And it was very easy to farm tasks out through an array of CPUs. They included parallel and serialized operations in the supported languages in the simplest manner ever.

erm... (1)

crutchy (1949900) | more than 2 years ago | (#39641753)

cores should instead communicate the same way computers hooked to the Internet do

Apparently they've never heard of Beowulf clustering.

Typical chip six or eight cores? (1)

Lord Lode (1290856) | more than 2 years ago | (#39641915)

Then why do all Intel CPUs, except a very small number of Xeon CPUs, have only 4 cores max, even the new Ivy Bridge ones to be released this year, when 5 years ago they already had chips with 4 cores?

So... (1)

metacell (523607) | more than 2 years ago | (#39641947)

... now my mother will finally have Internet in her computer!

Deja vu (1)

maroberts (15852) | more than 2 years ago | (#39642177)

I was going to say this seems to be the realisation that the Transputer had the answers decades ago, but it seems many others have said exactly the same thing. I shall resume my nap...

CONGRATULATIONS! (0)

Alex Belits (437) | more than 2 years ago | (#39642207)

YOU HAVE INVENTED A BUS! It's time to start working on the first multitasking OS!

What is it with idiots coming out of the woodwork presenting old (and often obsolete and abandoned, such as virtualization) technologies as some kind of new development?

Traffic (1)

bjs555 (889176) | more than 2 years ago | (#39642239)

The idea only works until one of the cores starts sending spam. Hey core, want Vi@gra?

It's NEWS when MIT quotes ancient literature (1)

Theovon (109752) | more than 2 years ago | (#39642307)

The Network on Chip has been around as a concept so long we even have an abbreviation (NoC). Maybe this isn't in commodity products, but basically if you want to do an NoC, you don't have to invent anything yourself. There are several conferences and journals that have been publishing papers on this for decades. But, OH, if a professor from MIT mentions it, it must be something NEW. Sheesh.
