
ARM In Supercomputers — 'Get Ready For the Change'

Soulskill posted about a year ago | from the you-and-what-ARMy dept.


An anonymous reader writes "Commodity ARM CPUs are poised to replace x86 CPUs in modern supercomputers, just as commodity x86 CPUs replaced vector CPUs in early supercomputers. An analysis by the EU Mont-Blanc Project (PDF) (using Nvidia Tegra 2/3, Samsung Exynos 5 & Intel Core i7 CPUs) highlights the suitability and energy efficiency of ARM-based solutions. They finish off by saying, 'Current limitations [are] due to target market condition — not real technological challenges. ... A whole set of ARM server chips is coming — solving most of the limitations identified.'"


IMHO - No thanks. (2, Insightful)

Anonymous Coward | about a year ago | (#43825213)

PC user, hardcore gamer, and programmer here; for me, energy efficiency is a lower priority than speed in a CPU. Make an ARM CPU that competes with an Intel Core i7 2600K, show me it's overclockable with few issues, and you've got my attention.

Re:IMHO - No thanks. (5, Insightful)

Stoutlimb (143245) | about a year ago | (#43825231)

No doubt your CPU would win. But when looking at power/price as well, you'd have to pit your CPU against 50 or so ARM chips in parallel. For some solutions, it may be a far better choice. One size doesn't fit all.

Re:IMHO - No thanks. (0)

arbiter1 (1204146) | about a year ago | (#43825329)

50 ARM CPUs, eh? The problem comes down to finding something that can scale to that many CPUs.

Re:IMHO - No thanks. (3, Interesting)

dbIII (701233) | about a year ago | (#43825375)

Then you use something else as well. High-performance computing server rooms already have a mix of hardware, especially since the AMD chips can give you a 64-core machine with half a terabyte of memory for $14K, though it's not as fast per core as the two-way Xeons. The parallel work is done on the plentiful, slower cores while the single-threaded work is done on the faster cores; then GPUs do whatever parallel work you can feed them (memory and bandwidth limitations keep them from doing some tasks).

Re:IMHO - No thanks. (0)

Anonymous Coward | about a year ago | (#43825637)

Cores and CPUs are not the same thing, just so you know.

Re:IMHO - No thanks. (0)

Anonymous Coward | about a year ago | (#43825655)

To a programmer the distinction is irrelevant.

Re:IMHO - No thanks. (1)

MichaelSmith (789609) | about a year ago | (#43825759)

With sufficient abstraction.

Re:IMHO - No thanks. (1)

Anonymous Coward | about a year ago | (#43826025)

No, it is not. When working with NUMA you have to think about how you move threads between cores or CPUs. Moving threads between CPUs means the cached working set has to move as well, and the performance impact can be quite dramatic.
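
A minimal sketch of the kind of pinning that helps here on Linux (illustrative only; the choice of core 0 is arbitrary and sched_setaffinity is Linux-specific):

```c
/* Pin the calling thread to logical CPU 0 so the scheduler cannot migrate it
 * to another core/socket and force its cached working set to be refetched.
 * Assumption: Linux with glibc; core 0 is chosen purely for illustration. */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

int main(void)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(0, &set);                          /* allow only logical CPU 0 */
    if (sched_setaffinity(0, sizeof(set), &set) != 0) {
        perror("sched_setaffinity");
        return 1;
    }
    printf("pinned to CPU 0\n");
    return 0;
}
```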

Re:IMHO - No thanks. (4, Insightful)

aztracker1 (702135) | about a year ago | (#43825785)

Exactly. Then again, there are plenty of non-CPU-intensive loads. Part of the popularity and growth of Node.js is that a lot of jobs are I/O-bound, and even a lot of web services/sites spend most of their time waiting on files or network resources/services. Ten ARM CPUs handling 10K simultaneous requests are as good as one uber-CPU handling 10K simultaneous requests. For that matter, there's been a lot of work done on message-queue routing and distributed databases, and ARM is a pretty good fit for an environment designed to scale horizontally. Some of the first things I wanted to try on my Raspberry Pi were MongoDB and Node.js, with the thought that a couple dozen of them might work better, with more resilience, than a few larger systems.

For the record, I think what's holding back some of these systems is the inability to address a bit more memory and the lack of larger/faster storage channels. That isn't a problem at supercomputer scale, but for someone wanting to put together a small cluster, it gets irritating.

Re:IMHO - No thanks. (2)

Dcnjoe60 (682885) | about a year ago | (#43825697)

50 ARM CPUs, eh? The problem comes down to finding something that can scale to that many CPUs.

Well, the article is about ARMs being used in supercomputers, so scalability is probably not going to be a problem.

Re:IMHO - No thanks. (1)

Anonymous Coward | about a year ago | (#43826007)

Eh, not really; it depends on the workload, of course. Sun/Fujitsu and HP have been doing 64-socket systems for a long time now; SGI used to do it too. If you're talking cores and threads, Sun/Oracle and IBM have been making systems in the 64-core and 512-thread range (on 4 sockets) for a decade or so, and Oracle pumped out 8-socket, 1024-thread beasts earlier this year. The trick is finding everyday workloads that benefit from that kind of parallelization, as well as needing that level of upward scaling.

I don't see these kinds of systems taking off in the consumer market in the immediate future, if only because the majority of workloads don't benefit all that much from them. It's a no-brainer in HPC, though, and ARM could finally serve as competition to SPARC and POWER in the highest tiers of the enterprise, if single-threaded performance (and other considerations, such as the multitude of special-purpose co-processors available on those systems) can be brought up to par.

One Size Doesn't Fit All -- Same in Supercomputing (4, Informative)

gentryx (759438) | about a year ago | (#43825661)

There is already one line of supercomputers built from embedded hardware: the IBM Blue Gene. Their CPUs are embedded PowerPC [wikipedia.org] cores. That's the reason why those systems typically have an order of magnitude more cores than their x86-based competition.

Now, the problem with BG is that not all codes scale well with the number of cores. Especially when you're doing strong scaling (i.e., you fix the problem size but throw more and more cores at the problem), Amdahl's law [wikipedia.org] tells you that it's beneficial to have fewer, faster cores.
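
To make that concrete, here is a quick back-of-the-envelope check of strong scaling under Amdahl's law; the 5% serial fraction is an assumption chosen purely for illustration:

```c
/* Speedup under Amdahl's law: S(N) = 1 / (s + (1 - s) / N), where s is the
 * serial fraction.  With s = 0.05 (assumed), this prints roughly 3.5x on 4
 * cores, 15.4x on 64 cores and only 19.6x on 1024 cores. */
#include <stdio.h>

int main(void)
{
    double s = 0.05;                  /* assumed serial fraction of the code */
    int cores[] = { 4, 64, 1024 };
    for (int i = 0; i < 3; i++) {
        double speedup = 1.0 / (s + (1.0 - s) / cores[i]);
        printf("%4d cores: %.1fx speedup\n", cores[i], speedup);
    }
    return 0;
}
```

Past a few dozen cores the serial fraction dominates, which is exactly why fewer, faster cores pay off for strong scaling.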

Finally, I consider the study to be fundamentally flawed, as it compares the OEM prices of consumer-grade embedded chips with retail prices of high-end server chips. This is wrong for so many reasons... you might then throw in the 947 GFLOPS, $500 AMD Radeon 7970 [wikipedia.org], which beats even the ARM SoCs by a margin of 2x (ARM: ~1 GFLOPS/$, AMD Radeon: ~2 GFLOPS/$).

Power Efficiency - MIPS vs ARM (2)

Taco Cowboy (5327) | about a year ago | (#43825665)

I may be wrong here, but I get the impression that the MIPS architecture is much more power-efficient than the ARM architecture.

If they are going to talk about building big iron out of CPUs with high power efficiency, I reckon a MIPS CPU might be more suitable for the task than one from the ARM camp.

Re:Power Efficiency - MIPS vs ARM (4, Insightful)

julesh (229690) | about a year ago | (#43826015)

I may be wrong here, but I get the impression that the MIPS architecture is much more power-efficient than the ARM architecture.

If they are going to talk about building big iron out of CPUs with high power efficiency, I reckon a MIPS CPU might be more suitable for the task than one from the ARM camp.

I don't think it is. The best figures (albeit somewhat out of date) I can find for a MIPS-based system are 2 GFLOPS/W for a complete 6-core node including memory. ARM Cortex-A15 power consumption is a little hard to track down, although it's suggested that a 4-core 1.8 GHz configuration (e.g. Samsung Exynos 5) could run at full speed on 8 W (if the power manager let it; the Exynos 5 throttles down when it consumes more than 4 W). Performance is about 4 GFLOPS per GHz per core, so this system should be able to pull in about 28.8 GFLOPS (or twice that if using ARM's NEON SIMD unit to full advantage). Add in ~2 W for 1 GB of DDR3 SDRAM, and that's 2.9 GFLOPS/W. Assuming the MIPS system I found is not the best available (as the data was from 2009, it certainly seems likely better is available now), the two appear to be roughly comparable.
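
Just to make the arithmetic explicit, the same estimate as a tiny program; every input below is one of the assumed figures quoted above, not a measurement:

```c
/* Back-of-envelope GFLOPS/W for a hypothetical 4-core Cortex-A15 node:
 * 4 cores * 1.8 GHz * ~4 FLOPS per cycle per core, divided by ~8 W for the
 * SoC plus ~2 W for 1 GB of DDR3 (all figures assumed, as in the parent). */
#include <stdio.h>

int main(void)
{
    double cores = 4.0, ghz = 1.8, flops_per_cycle = 4.0;
    double soc_watts = 8.0, dram_watts = 2.0;
    double gflops = cores * ghz * flops_per_cycle;        /* 28.8 GFLOPS */
    printf("%.1f GFLOPS -> %.1f GFLOPS/W\n",
           gflops, gflops / (soc_watts + dram_watts));    /* ~2.9 GFLOPS/W */
    return 0;
}
```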

Re:Power Efficiency - MIPS vs ARM (0)

Anonymous Coward | about a year ago | (#43826017)

That and MIPS has a history in HPC, back when SGI was still in the game. There's little risk-taking involved in going MIPS. We already know the arch can scale to stupid levels.

Re:Power Efficiency - MIPS vs ARM (2)

niftymitch (1625721) | about a year ago | (#43826065)

I may be wrong here, but I get the impression that the MIPS architecture is much more power-efficient than the ARM architecture.

If they are going to talk about building big iron out of CPUs with high power efficiency, I reckon a MIPS CPU might be more suitable for the task than one from the ARM camp.

MIPS is an under-invested, older but great technology.
Another historic winner was the DEC Alpha.

As the folks at Transmeta (and others) demonstrated, logic that decodes an arbitrary ISA and drives a RISC core far faster than in the old VAX microcode days is entirely possible. This seems to be the way of modern processors, so the ARM/x86/x86_64 ISA almost does not matter except to the compiler and API/ABI folks. If you want to go fast, feed your compiler folks well.

Re:IMHO - No thanks. (2)

LordLimecat (1103839) | about a year ago | (#43825703)

The Core i7 might very well still win. Remember that Intel is more efficient in computing work per watt, and an Ivy Bridge Core i7 3770K uses 77 W. If your average ARM chip uses 2 watts, that means ~30 ARM chips will still get beaten by the Core i7....

Re:IMHO - No thanks. (0)

Anonymous Coward | about a year ago | (#43825733)

For mobile solutions it fits. For supercomputers? Battery life isn't a term. The largest expense over the lifetime of ANY "super computer" worthy of the term is going to be energy consumption (including cooling). Until ARM can match Intel/IBM/etc. on that front, there's no great logical argument for using it there, though other uses (where lower initial cost is more important for the application) would warrant another discussion.

As for performance per watt, ARM is certainly catching up to Intel and x86 in relative terms. But you'll notice the researchers use the absolute newest ARM Cortex-A15 (28 nm process) architecture for ARM while going back a full two years to Intel's Sandy Bridge to compare performance per watt. Ignoring the soon-to-be-released Intel Haswell architecture could be due to timing, but if they feel free to compare mobile parts, why deliberately ignore Intel's mobile Ivy Bridge (22 nm), a platform whose performance per watt was shown to have improved significantly? Without this comparison the analysis is incomplete and thus fundamentally flawed.

Not that arguments against ARM being able to meet the requirements of data centers and supercomputers are without merit; while the Cortex-A15 is probably not the design to get ARM into that space, their upcoming 64-bit server/datacenter-oriented architecture may stand a much better chance. Either way, the engineering battle between Intel and ARM is at the very least good for consumers, no matter who is currently "winning." As for anyone arguing that x86 is old and will be replaced by ARM "because ARM is more efficient!", I'm sure there are others much more qualified to give a rigorous rebuttal, but instead I'll just say actual data, such as Intel's upcoming Silvermont architecture, can speak for itself: http://www.anandtech.com/show/6936/intels-silvermont-architecture-revealed-getting-serious-about-mobile

Re:IMHO - No thanks. (2)

Khyber (864651) | about a year ago | (#43826041)

"For supercomputers? Battery life isn't a term."

You say that until the power grid fails and your generator fails to kick on, leaving you with only battery backup in place.

hard core flood victim here (-1, Troll)

Anonymous Coward | about a year ago | (#43825249)

take your disregard for the planet and shove it up your asshole

Re:hard core flood victim here (-1)

Anonymous Coward | about a year ago | (#43825267)

Keep playing sudoku, and Portal!

Re:IMHO - No thanks. (2, Interesting)

Anonymous Coward | about a year ago | (#43825255)

Architecture is complicated, but in terms of ops per mm^2, ops per watt, ops per $, or cycles per useful op, the x86 architecture is a heinous pox on the face of the earth.

Worse yet, your beloved x86 doesn't even have any source implications; it's just a useless thing.

Re:IMHO - No thanks. (5, Informative)

Colonel Korn (1258968) | about a year ago | (#43825327)

Architecture is complicated, but in terms of ops per mm^2, ops per watt, ops per $, or cycles per useful op, the x86 architecture is a heinous pox on the face of the earth.

Worse yet, your beloved x86 doesn't even have any source implications; it's just a useless thing.

In TFA's slides 10 and 11, Intel i7 chips are shown to be more efficient in terms of performance per watt than ARM chips. However, they're close to each other and Intel's prices are significantly higher.

Re:IMHO - No thanks. (1)

Technician (215283) | about a year ago | (#43826019)

Is it worth the wait for the next gen of low power chips to arrive?

Re:IMHO - No thanks. (2)

Redmancometh (2676319) | about a year ago | (#43825827)

Useless for what you do. The second that performance (not performance per watt, raw PERFORMANCE) becomes an issue, ARM is a steaming pile of shit and you know it. If you're doing anything more than what the above AC said (keep playing sudoku and Portal), it can't handle it. How about everyday consumers who need a tablet that can actually do work? A gimped version of Windows is not going to get the job done. Some of the Samsung Slate tablets, however, come with an x86 and are actually fully functional! Can you point to an ARM tablet that can do everything they can? Or any other x86 tablet, for that matter?

I know it's not about the software. However, unfortunately, sometimes raw productivity is all that matters, and sometimes the latest Windows RT garbage dump or iOS xyz isn't going to hold water. The fact of the matter is that the software that will run on a system defines how productive that device is going to be. You and I might be able to put a proper operating system on one of these... but your whole company? Hell no.

Re:IMHO - No thanks. (1)

Redmancometh (2676319) | about a year ago | (#43825829)

That first sentence was supposed to be posted on another article... but you can't edit or delete on Slashdot, which is pretty awful.

Re:IMHO - No thanks. (1)

Anonymous Coward | about a year ago | (#43825259)

The article is aimed at supercomputers, not commodity PCs. You are not the target.

Re:IMHO - No thanks. (4, Funny)

c0lo (1497653) | about a year ago | (#43825293)

The article is aimed at supercomputers, not commodity PCs. You are not the target.

While not the target, you'll be collateral damage anyway.

Re:IMHO - No thanks. (4, Interesting)

KiloByte (825081) | about a year ago | (#43825409)

Damage or a winner? I feel so bad about having a cheap, efficient, and above all, quiet box.

I bought this [hardkernel.com] 4x2 GHz baby, and the only reason it's not my main desktop yet is a weird and asinine requirement for the monitor resolution to be exactly 720 or 1080 (WTF?!?). I think I'll replace my old but perfectly working pair of 1280x1024 monitors (I hate 16:9!) and put the big loud clunker in the cellar. I just hate the noise so much. x86 machines with no moving parts are extremely hard to get, and have terrible performance/price. Anything that requires lots of processing power (compilation, running Windows VMs, etc.) can be done remotely from the cellar just as well, while a 2 GHz ARM is fast enough for client stuff, running a browser being the most demanding part.

And what else do you need to reside directly on the machine you plop your butt at?

Re:IMHO - No thanks. (2)

0123456 (636235) | about a year ago | (#43825685)

I feel so bad about having a cheap, efficient, and above all, quiet box.

So do I. I can't even hear my i7 machine when playing games on it, whereas the old Pentium-4 sounded like a vacuum cleaner.

Re:IMHO - No thanks. (1)

c0lo (1497653) | about a year ago | (#43825721)

If it's the OP AC, whinging about how his games don't work well on ARM, then it's damage (not that I regret it).
If it's you (thanks for the link: nice to see others on top of the RasPi) or me, then it's winning.

Speaking of quiet: I recently bought a ProLiant MicroServer for the "home FS"/NAS. At 15 W for the Turion and the 4 NAS-grade WD HDDs, I can't hear it (under 60 W at peak use). I would have gone with an ARM board, but couldn't find enough support for NAS duty (not when RAID-ing, anyway).

btw: I don't have a cellar... yet. When I have one, it'll be for wine only... ummm... maybe a bit of mead as well.

Re:IMHO - No thanks. (0)

Anonymous Coward | about a year ago | (#43826043)

I just hate the noise so much. x86 machines with no moving parts are extremely hard to get, and have terrible performance/price.

That's not entirely true; you just gotta look at the "business-class" offerings for quiet-running desktops. Generally the expandability makes up for many if not all shortcomings. I'm quite fond of Lenovo's offerings; unless I'm going heavy on the graphics processing, I can barely hear a thing (I say barely because I'm not using SSDs). Maybe I got lucky and found a system with an AM2+ board that wasn't locked to AM2; even after putting a hexa-core Phenom in there, it still runs quiet - until I start making the GPU cry, that is.

I'm a graphic artist and musician though, so I kinda do need horsepower in the machine I'm sitting at; the workflow doesn't lend itself well to working remotely.

Re:IMHO - No thanks. (1)

Anonymous Coward | about a year ago | (#43825271)

Then enjoy your Wintel dinosaur.

Surprising though it may seem to you, the rest of the world will route around you without even noticing.

Re:IMHO - No thanks. (0, Flamebait)

Anonymous Coward | about a year ago | (#43825281)

Wow, I'm glad you spoke up!

A lot of supercomputers could have been built with the wrong CPUs if you hadn't been here to set everybody straight. The computing world really owes you big time!

What a close call, hey everybody?

slow clap is in order (0)

decora (1710862) | about a year ago | (#43825307)

google 'slow clap compilation youtube'

Re:IMHO - No thanks. (5, Informative)

king neckbeard (1801738) | about a year ago | (#43825311)

You aren't operating in the supercomputing market. There, what matters is how much processing you can get for how much money. You can always buy more chips, and power usage and cooling are both significant factors. That's why x86 became dominant in that space: it was cheaper to buy a bunch of x86 chips than to buy fewer POWER chips. In terms of computing power, a POWER7 will eat your i7 for breakfast, but they are ungodly expensive.

Re:IMHO - No thanks. (1)

dbIII (701233) | about a year ago | (#43825387)

It was a two-week process just to attempt to buy a single low-end machine with one of those things, to see if it was viable for a particular task - two weeks of getting my company's wallet weighed by a slimy bastard who made used-car salesmen look like saints, and a lot of veiled comments that may have been about kickbacks. In the end the price was more than that of four gold-plated IBM Xeon systems of similar clock speed, or about double that in whitebox systems. Sounds like you need a black budget, immune from the eyes of accountants, to buy one of the things.

Not only Performance per $ (2)

gentryx (759438) | about a year ago | (#43825895)

...but also reliability (because supercomputers are really large and one failed node will generally crash the whole job, wasting gazillions of core hours; that's one reason why SC centers buy expensive Nvidia Tesla hardware instead of the cheaper GeForce series), I/O and memory bandwidth, and finally integration density: that one Intel chip can be more tightly integrated, as it won't generate as much excess heat per GFLOPS (according to TFA...).
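
To see why reliability dominates at this scale, look at how fast whole-job survival drops as node count grows; the per-node survival probability below is an assumption picked only to illustrate the shape of the curve (compile with -lm):

```c
/* If each node independently survives a job with probability p, a job spread
 * across N nodes survives with probability p^N.  p = 0.9999 is an assumed,
 * illustrative figure, not a vendor spec. */
#include <math.h>
#include <stdio.h>

int main(void)
{
    double p = 0.9999;                       /* assumed per-node survival */
    int nodes[] = { 100, 1000, 10000 };
    for (int i = 0; i < 3; i++)
        printf("%5d nodes: job survives with probability %.3f\n",
               nodes[i], pow(p, nodes[i]));  /* ~0.990, ~0.905, ~0.368 */
    return 0;
}
```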

Re:IMHO - No thanks. (-1, Troll)

Anonymous Coward | about a year ago | (#43825323)

You're an idiot and know nothing about this topic.

A single ARM 4 core A-15 running 1.5 GHz per core blows away any competing chip at the same specs, on power AND price. It's not limited to the calculations x86 are and can process graphics and physics better as a result.

Though it's not all your fault that you're ignorant in this area, no one in software short of workstation application providers are properly utilizing multi-threading and especially in the gaming world, none of them even know how to do it. Luckily, the ARM architecture natively solves these problems without having to write mutex or atomic code to work around threading issues.

ARM is the future and any who don't license it and move away from x86 will be left in the dust. The A-15 architecture even has the ability to run x86 and some other architectures natively without recompile.

Then again, I've discussed all of this before on /. over a year ago and people still don't believe it until they see it on /. or some other crap website rather than going straight to the source and reading their own docs. Ignorance is bliss.

Re: IMHO - No thanks. (-1)

Anonymous Coward | about a year ago | (#43825343)

You write like a fanboy child.

Re:IMHO - No thanks. (-1)

Anonymous Coward | about a year ago | (#43825817)

You, sir, are an idiot and a troll who doesn't know what he is talking about.

Your rant post is chock full of errors, plain misconceptions, half-truths, and plain ignorance of computing technology.

In closing, you are a complete piece of shit and deserve to be modded into oblivion, you blithering wanker.

Re:IMHO - No thanks. (3, Insightful)

Anonymous Coward | about a year ago | (#43826061)

A single ARM 4 core A-15 running 1.5 GHz per core blows away any competing chip at the same specs, on power AND price. It's not limited to the calculations x86 are and can process graphics and physics better as a result.

Translation: it gets crushed on single-threaded performance and you have to double up on sockets right out of the gate.
It's a bit of a misconception about ARM and x86. ARM wins on watts per socket and MHz per watt, but Intel's i7s cream ARM on performance per watt; once you account for those two factors, ARM isn't as competitive as you might think. Now, I'm not saying it isn't competitive, just that it's nowhere near as one-sided as you might be led to believe by cherry-picking.

Re:IMHO - No thanks. (0)

Anonymous Coward | about a year ago | (#43825415)

You sound like a hardcore gamer but not a hardcore programmer. So long as you can parallelize a task, performance per watt for a chip is more important than raw horsepower per CPU.

Re:IMHO - No thanks. (1)

XaXXon (202882) | about a year ago | (#43825491)

Why did you even say this? "PC users" aren't even mentioned in this article. This article is about supercomputers, where the workloads are virtually by definition extremely parallel and the constraints are price and power consumption, not "FPS in a single game".

Re:IMHO - No thanks. (1)

crutchy (1949900) | about a year ago | (#43825523)

Most PC users depend on parallel computing in ways they can't even imagine.

What do you think goes on at the other end of the copper/fibre cable?

Re:IMHO - No thanks. (4, Interesting)

symbolset (646467) | about a year ago | (#43825595)

The problem you have is the software tools you use sap the power of the hardware. Windows is engineered to consume cycles to drive their need for recurrent license fees. Try a different OS that doesn't have this handicap and you'll find the full power of the equipment is available.

Re:IMHO - No thanks. (3, Informative)

aztracker1 (702135) | about a year ago | (#43825861)

The last two times I ran Linux on my desktop I ran into issues that weren't impossible to overcome, just a pain in the ass to deal with. I had a desktop with two graphics cards in SLI, and two monitors; getting them both working in 2006 was a pain. I know that was seven years ago, but still, it was far harder than it should have been. In 2007, my laptop was running fine, I upgraded to the latest Ubuntu, and I had nothing but problems. In the first case, XP/Vista were less trouble; in the second, Win7 RC1 ran better. I also ran PC-BSD for a month, which was probably the nicest experience I've had with something outside Win/OSX on my main desktop, but I still had issues with virtual machines that were a no-go.

Granted, my experiences are pretty dated, and things have gotten better... for me, Linux is on the server(s) or in a virtual machine; every time I've tried to make it my primary OS has been met with heartache and pain. I replaced my main desktop a couple of months ago and tried a few Linux variants. The first time, I installed on my SSD; then, when I plugged in my other hard drives, it still booted, but an update to GRUB screwed things up and it wouldn't boot any longer. This was after three hours spent getting my displays working properly. I wasn't willing to spend another day on the issue, so back to Windows I went. I really like Linux, and I want to make it my primary desktop, but I don't have extra hours and days to tinker with the problems an over-the-wire update causes, let alone the initial setup time, which I really felt was unreasonable.

I've considered putting it as the primary OS on my MacBook, but similar to Windows, the environment pretty much works out of the box, and brew takes things a long way towards how I want them to work. Linux is close to 20 years old and still seems, in a lot of ways, to be more crusty for desktop users than Windows was a decade and a half ago. In the end, I think Android may be a better desktop interface than what's currently on offer from most of the desktop bases in the Linux community, which is just plain sad. I really hope something good comes out of it all; I don't like being tethered to Windows or OSX, and I don't like the constraints, but they work, with far fewer issues, the biggest ones being security-related. I think Windows is getting secure faster than Linux is getting friendlier, or at least easier to get up and running with.

Re:IMHO - No thanks. (2)

0123456 (636235) | about a year ago | (#43825891)

I had a desktop with two graphics cards in SLI, and two monitors

Given SLI barely works in Windows, expecting it to work in Linux was optimistic. I recently booted up a Linux Mint DVD on my laptop to try it out and... everything just works. Even using the 'recovery partition' to reinstall Windows on there takes over three hours, reboots about thirty times and breaks with barely decipherable and completely misleading error messages if you installed a hard drive larger than the one that came with it.

 

Linux is close to 20 years old..

And the BSD core in MacOS is close to 40 years old.

Android would make a lousy desktop interface, just like Windows 8. It was designed for phones and is barely a usable tablet interface. Of course, it probably is still more usable than GNOME 3.

Re:IMHO - No thanks. (1)

hi-endian (2589843) | about a year ago | (#43825681)

Not really sure how your personal needs are at all relevant here, as this post is about servers and supercomputers (i.e., computers that typically deal with highly parallelized tasks), not about home gaming rigs.

Troll (-1)

Anonymous Coward | about a year ago | (#43825215)

Yeah, because numbers computed by ARM are superior to numbers computed by PowerPC.

Slashvertisement much (0)

Anonymous Coward | about a year ago | (#43825219)

Really, Soulskill?

Re:Slashvertisement much (1)

crutchy (1949900) | about a year ago | (#43825547)

So you don't like a nerdy FA linked on a site that markets itself as "news for nerds"?

Maybe if you don't like any form of online activity that could possibly be construed as advertising, you're better off disconnecting your internet altogether.

After all, you used the word "really" in your post, which contains "real", so obviously you're astroturfing for RealNetworks... shill much?

Early supercomputers (0)

Anonymous Coward | about a year ago | (#43825227)

Like the CDC6600?

Not buying it. (1)

Anonymous Coward | about a year ago | (#43825247)

Power/performance ratios are with x86.

Re:Not buying it. (0)

symbolset (646467) | about a year ago | (#43825305)

This is easy to say but all the top supercomputers are GPGPU based now. The CPU is a management appliance that dishes the computables to the compute cores.

Re:Not buying it. (1)

dbIII (701233) | about a year ago | (#43825397)

It depends entirely on the task. There are plenty of threads that just cannot fit their memory requirements onto a GPU, and keeping the things fed with memory can be slower than doing it on a CPU in the first place. Remember you are comparing something on the order of 8 GB shared between the GPU cores with 1 TB shared between the CPU cores.

Re:Not buying it. (1)

symbolset (646467) | about a year ago | (#43825461)

I'm thinking you don't understand. The whole "shared memory" thing is not exclusive to x86 cores. At some level it's a software abstraction relating to latency of storage. GPUs can have terabytes of RAM too as a sixth level cache.

Intel really needs some help here because the ground has shifted too much for them.

I understand all right - try reading full posts! (1)

dbIII (701233) | about a year ago | (#43825647)

"keeping the things fed with memory can be slower than doing it on a CPU in the first place" is the line you've missed and is why GPUs don't solve every highly parallel problem at the moment. They can do reverse time migration, but can't currently do time migration, depth migration, tomography etc etc. The penalties of swapping so much memory in and out are far too costly, to the point of orders of magnitude of performance or complete showstoppers where you just can't get enough in for it to work at all.

Re:I understand all right - try reading full posts (1)

symbolset (646467) | about a year ago | (#43825763)

Frankly I agree with you. I'm thinking the average /. reader will find your post incoherent though.

Re:I understand all right - try reading full posts (1)

dbIII (701233) | about a year ago | (#43825867)

I think my initial general comment about memory is properly aimed at a high school level readership Mr "sixth level cache" :)

Re:Not buying it. (3, Informative)

MikeBabcock (65886) | about a year ago | (#43825443)

I don't buy your response: http://top500.org/statistics/list/ [top500.org] ... click accelerator and hit submit.

87.6% of the top 500 supercomputers have no NVIDIA or other co-processing.

Re:Not buying it. (1)

symbolset (646467) | about a year ago | (#43825781)

OK, fine. Pretend this isn't happening and see how that works out for you.

Exactly. (1, Interesting)

Junta (36770) | about a year ago | (#43825333)

This isn't to say that ARM *can't* be there, but thus far all of the implementations have focused on "good enough" performance within a tightly constrained power envelope. Intel's designs have traditionally been highly inefficient in that power band, but at peak performance they are still compelling.

I recall one 'study' which claimed to demonstrate ARM as inarguably better. It got way more attention than it should have, because they measured the ARM side but just *assumed* TDP would be the accurate number for x86. There are very few workloads that would cause a processor to *average* its TDP over the course of a benchmark.

The thing that really *is* stealing x86 thunder is the GPU world. Intel's Phi strives to answer it, but thus far falls short in performance. There continue to be areas where GPU architecture is an ill fit, and ultimately I think Phi may end up being a pretty good solution.

Does it really matter? (4, Interesting)

gman003 (1693318) | about a year ago | (#43825251)

Most of the actual processing power in current supercomputers comes from GPUs, not CPUs. There are exceptions (that all-SPARC Japanese one, or a few Cell-based ones), but they're just that, exceptions.

So sure, replace the Xeons and Opterons with Cortex-A15s. Doesn't really change much.

What might be interesting is a GPU-heavy SoC - some light CPU cores on the die of a supercomputer-class GPU. I have heard Nvidia is working on such (using Tegra CPUs and Tesla GPUs), and I would not be surprised if AMD is as well, although they'd be using one of their x86 cores for it (probably Bulldozer - damn thing was practically built for heavily-virtualized servers, not much different from supercomputers).

Re:Does it really matter? (5, Informative)

Victor Liu (645343) | about a year ago | (#43825303)

As someone who does heavy duty scientific computing, I wouldn't say that "most" of the actual process power is in GPUs. They are certainly more powerful at certain tasks, but most applications run are legacy code, and most algorithms require substantial reworking to get them to run with reasonable performance on a GPU. Simply put, GPU for supercomputing is not quite a mature technology yet. I am personally not too interested in coding for GPUs simply because the code is not portable enough yet, and by the time the technology might be mature, there might be a new wave of technology (like ARM) that could be easier to work with.

Re:Does it really matter? (5, Informative)

KiloByte (825081) | about a year ago | (#43825373)

Also, a lot of algorithms, perhaps even most, rely on branching, which is something GPUs suck at. And only some can be reasonably rewritten in a branchless way.
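
A toy example of what "rewritten in a branchless way" means in practice, using a clamp; this is plain C to show the flavor, not an actual GPU kernel:

```c
/* The branchy version makes neighboring elements take different paths, which
 * is what GPUs hate; the branchless version does the same work for every
 * element, so the compiler can lower it to min/max with no divergence. */
#include <stdio.h>

void clamp_branchy(float *x, int n, float lo, float hi)
{
    for (int i = 0; i < n; i++) {
        if (x[i] < lo)      x[i] = lo;   /* data-dependent branches */
        else if (x[i] > hi) x[i] = hi;
    }
}

void clamp_branchless(float *x, int n, float lo, float hi)
{
    for (int i = 0; i < n; i++) {
        float v = x[i];
        v = v < lo ? lo : v;             /* selects, not jumps, once the */
        v = v > hi ? hi : v;             /* compiler turns them into min/max */
        x[i] = v;
    }
}

int main(void)
{
    float a[] = { -2.0f, 0.5f, 3.0f };
    clamp_branchless(a, 3, 0.0f, 1.0f);
    printf("%g %g %g\n", a[0], a[1], a[2]);   /* prints: 0 0.5 1 */
    return 0;
}
```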

Re:Does it really matter? (5, Funny)

ThePeices (635180) | about a year ago | (#43825823)

Also, a lot of algorithms, perhaps even most, rely on branching, which is something GPUs suck at. And only some can be reasonably rewritten in a branchless way.

Nonsense, I play Far Cry 3 on my GPU, and it renders branches just fine, thank you very much.

Re:Does it really matter? (1)

XaXXon (202882) | about a year ago | (#43825503)

It really doesn't seem like portability should be a huge goal for writing code for top-100 supercomputers. The cost of the computer would dwarf (or at least be a significant portion of) the cost of developing the software for it. It seems like writing purpose-built software for this type of machine would be desirable.

If you can cut the cost of the computer in half by doubling the speed of the software, it seems a valid fiscal tradeoff, and the way to do that would be to write it for purpose-built hardware.

Re:Does it really matter? (2)

Victor Liu (645343) | about a year ago | (#43825605)

On the point of portability, there's then a distinction in your focus. If you do research on numerical methods, then yes, you would write highly optimized code for a particular machine, as an end in and of itself. I myself am merely a user, and our research group does not have the expertise to write such optimized code. We pay for time on supercomputing clusters, which constantly bring new machines online and retire old ones. Every year our subscription can change, and we are allowed to use resources on different computers. Therefore, from my standpoint, portability is very important. Otherwise, if we were to write our own code in-house, we'd basically have a 1-year (ok, fine, maybe 2- or 3-year) window in which to develop, test, and run it. It just doesn't seem worthwhile to spend so much effort developing a one-time-use piece of code. I'd rather write something which will outlive my stay in the research program.

Re:Does it really matter? (1)

azi (60438) | about a year ago | (#43825751)

It really doesn't seem like portability should be a huge goal for writing code for top-100 supercomputers.

It depends on what you are doing. If you have a relatively short-term project (say, less than a couple of years), you are right. But if you do serious long-term research, portability has a huge impact, especially if your project has a relatively large code base. Thing is, architectures come and go, and you can't count on what architecture you are going to be using five years from now. Another thing is that you might use a wide variety of computing sites, where portability is essential, really.

Re:Does it really matter? (2)

JanneM (7445) | about a year ago | (#43825767)

System and numerical libraries and compilers are of course written specifically for the machine. But user-level apps (and a lot of scientific computing uses finished apps) are ported across multiple systems.

Portability is not as big an issue as it was a generation ago, as most supercomputers are basically Linux machines today, made to look more or less like a typical Linux installation from a user-application level, with a POSIX API; pthreads, OpenMP and OpenMPI; a standard set of numerical libraries; and often even gcc compatibility, in order to minimize the effort of porting. A notable exception is GPU-based machines (which are in the minority today, despite the OP's assertion); they don't have a common API to write for, so using them is substantially harder at the user level.
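
As an example of the portable, user-level style in question, a bare OpenMP reduction; the workload is made up, and the only build requirement assumed is a compiler flag like gcc's -fopenmp:

```c
/* Nothing in here is machine-specific: the same source compiles and runs on
 * an x86 Xeon node or an ARM board, and OpenMP picks the thread count. */
#include <omp.h>
#include <stdio.h>

int main(void)
{
    const int n = 1000000;
    double sum = 0.0;

    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < n; i++)
        sum += 1.0 / (i + 1.0);          /* harmonic series as a dummy workload */

    printf("sum = %f using up to %d threads\n", sum, omp_get_max_threads());
    return 0;
}
```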

And at the user level (unlike system libs), porting or coding time very much matters. Let's say your project is going to need a month of wall-clock computing time over the course of a year or two. If switching to a GPU-based system would shrink that by 50% - two weeks - then the effort to move your model code, app, and libraries had better take less than two weeks of work, or you're going to waste project time, not save it.

Re:Does it really matter? (1)

MichaelSmith (789609) | about a year ago | (#43825769)

It's the same for ARM. Java doesn't run properly yet because of the floating-point limitations of ARM.

Re:Does it really matter? (0)

Anonymous Coward | about a year ago | (#43825833)

> there might be a new wave of technology (like ARM) that could be easier to work with.

Really? You went with "ARM" and not Xeon Phi?

Re:Does it really matter? (2, Insightful)

Anonymous Coward | about a year ago | (#43825319)

False. According to the Top500 survey from November 2012 (Category: Accelerator/Co-Processor), 87% of systems are not using any type of GPU co-processor, and 77% of the processing power is coming from the CPU.

This is, however, a decrease from the June 2012 survey, so GPU is certainly making inroads, but it is not yet the main source of computation.

http://www.top500.org/statistics/list/

I still remember when the IBM Blue Gene architecture came out, using embedded PowerPC processors; it was a huge power savings. It was a big deal, but far from a complete solution (limitations in RAM, with no disk/swap).

There is certainly a growing demand for a better power/performance solution in order to reduce total cost of operation. The individual performance of each processor doesn't matter as much when you have applications which are written to take advantage of 100,000s of processors in parallel.

Re:Does it really matter? (0)

Anonymous Coward | about a year ago | (#43825337)

On the most recent (November 2012) Top500 list, there are only 15 clusters in the top 100 using GPUs for compute.

Re:Does it really matter? (5, Informative)

Junta (36770) | about a year ago | (#43825353)

On the last published Top500 list, 7 out of the top 10 had no GPUs. That's a clear indication that while GPU is definitely there, claiming "most of the actual processing power" is overstating it a touch. It's particularly telling that there are so few, since overwhelming the specific HPL benchmark is one of the key benefits of GPUs. Other benchmarks in more well-rounded test suites don't treat GPUs so kindly.

Re:Does it really matter? (5, Interesting)

symbolset (646467) | about a year ago | (#43825365)

These ARM cores are halfway between the extremely limited GPU cores and the extremely flexible x86 cores. They may be the "happy medium".

Shows how dominant Intel have become (0)

Anonymous Coward | about a year ago | (#43825315)

Shows how dominant Intel have become that they were actually able to keep competing RISC processors out of many supercomputers for so long.

Questions... (5, Interesting)

storkus (179708) | about a year ago | (#43825411)

As I understand it, Intel still has the advantage in the performance per watt category for general processing and GPUs have better performance per watt IF you can optimize for that specific environment--both things which have been commented to death endlessly by people far more knowledgeable than I.

However, to me there are at least 3 questions unanswered:

1. ASICs (and possibly FPGAs): Bitcoin miners and DES breakers are the best-known examples. Where is the dividing line between operations that are specific enough to employ an ASIC and those that aren't and need a GPU (or even a CPU)? Could further optimization move this line more toward the ASIC?

2. Huge dies: This has been talked about before, but it seems that, for applications that are embarrassingly parallel, this is clearly where the next revolution will be, with hundreds of cores (at least, and of whatever kind of "core" you want). So when will this stop being vaporware?

3. But what do we do about all the NON-parallel jobs? If you can't apply an ASIC and you can't break it down, you're still stuck at the basic wall we've been at for around a decade now: where's Moore's (performance) law here? It would seem the only hope is new algorithms: TRUE computer science!

Re:Questions... (0)

XaXXon (202882) | about a year ago | (#43825509)

Are you sure you know what moore's law is?

http://en.wikipedia.org/wiki/Moore's_law [wikipedia.org] .. might be worth a read.

Re:Questions... (2)

XaXXon (202882) | about a year ago | (#43825811)

The reason for the question is that nothing in Moore's law says anything about single-threaded performance doubling every 1.5 years, as many think.

Moore's law is the observation that, over the history of computing hardware, the number of transistors on integrated circuits doubles approximately every two years.

Re:Questions... (1)

Anonymous Coward | about a year ago | (#43826081)

Given that he felt the need to specify "(performance)" in parentheses, it appears that he does know what Moore's law is and that he was referring to the impact Moore's law has traditionally had on single-threaded performance.

So, when can I buy an ARM ATX board? (2)

LaughingRadish (2694765) | about a year ago | (#43825445)

Hopefully this means we should start seeing ARM motherboards in an ATX form factor. The Pi and BeagleBone are nice, but I want something that's essentially just like a commodity x86 motherboard except that it uses ARM.

Re:So, when can I buy an ARM ATX board? (1)

c0lo (1497653) | about a year ago | (#43825757)

Hopefully this means we should start seeing ARM motherboards in an ATX form factor. The Pi and BeagleBone are nice, but I want something that's essentially just like a commodity x86 motherboard except that it uses ARM.

Why? Is Mini-ATX not good enough for a commodity MB? 'Cause you don't need high google-fu to find heaps of them.

Re:So, when can I buy an ARM ATX board? (2)

LaughingRadish (2694765) | about a year ago | (#43825837)

Mini-ATX or Mini-ITX will do fine. I just haven't seen any that have the kinds of things you take for granted on x86 boards. I want an ARM board with SATA ports, PCIe slots, and DIMM (or SODIMM) slots. Is that too hard to produce? I don't see anything like this anywhere.

Re:So, when can I buy an ARM ATX board? (1)

0123456 (636235) | about a year ago | (#43825899)

Ditto. I went looking for an ARM board last time I built a home server, but found nothing that could compete in the slightest against a $90 Atom board.

Re:So, when can I buy an ARM ATX board? (1)

c0lo (1497653) | about a year ago | (#43825973)

Slowly, they [fanlesstech.com] start [fanlesstech.com] to appear [cadianetworks.com] .

No, they won't. (5, Informative)

Dputiger (561114) | about a year ago | (#43825493)

Current ARM processors may indeed have a role to play in supercomputing, but the advantages this article implies don't exist.

Go look at performance figures for the Cortex-A15. It's *much* faster than the Cortex-A9. It also draws far more power. There's a reason why ARM's own product literature identifies the Cortex-A15 as a smartphone chip at the high end, but suggests strategies like big.LITTLE for lowering total power consumption. Next year, ARM's Cortex-A57 will start to appear. That'll be a 64-bit chip, it'll be faster than the Cortex-A15, it'll incorporate some further power efficiency improvements, and it'll use more power at peak load.

That doesn't mean ARM chips are bad -- it means that when it comes to semiconductors and the laws of physics, there are no magic bullets and no such thing as a free lunch.

http://www.extremetech.com/computing/155941-supercomputing-director-bets-2000-that-we-wont-have-exascale-computing-by-2020 [extremetech.com]

I'm the author of that story, but I'm discussing a presentation given by one of the US's top supercomputing people. Pay particular attention to this graph:

http://www.extremetech.com/wp-content/uploads/2013/05/CostPerFlop.png [extremetech.com]

What it shows is the cost, in energy, of moving data. Keeping data local is essential to keeping power consumption down in a supercomputing environment. That means that smaller, less-efficient cores are a bad fit for environments in which data has to be synchronized across tens of thousands of cores and hundreds of nodes. Now, can you build ARM cores that have higher single-threaded efficiency? Absolutely, yes. But they use more power.

ARM is going to go into datacenters and supercomputers, but it has no magic powers that guarantee it better outcomes.

Re:No, they won't. (0)

Anonymous Coward | about a year ago | (#43825527)

This * 1000

The whole "1000 ARMs will replace 50 Xeons at the same power/performance" line is such silly nonsense, I dunno how it keeps getting plugged on Slashdot.

Re:No, they won't. (1)

Lennie (16154) | about a year ago | (#43825689)

Didn't Intel say that bringing down the cost and improving the performance of the interconnect was the goal of silicon photonics, and that they are now very close to mass production?

However, I don't know how power-efficient it is.

Could silicon photonics help close that gap?

That's what is so funny to me (4, Insightful)

Sycraft-fu (314770) | about a year ago | (#43825735)

Slashdot seems to have lots of ARM fanboys that look at ARM's low power processors and assume that ARM could make processors on par with Intel chips but much more efficient. They seem to think Intel does things poorly, as though they don't spend billions on R&D.

Of course that raises the question of why ARM doesn't, and the answer is that they can't. The more features you bolt onto a chip, the higher the clock speed, and so on, the more power it needs. So you want 64-bit? More power. Bigger memory controller? More power. Heavy-hitting vector unit? More power. And so on.

There's no magic juju in ARM designs. They are low-power designs, in both senses of the word. Now that's wonderful; we need that for cellphones. You can't be slogging around with a 100-watt chip in a phone or the like. However, don't mistake that for meaning they can keep that low consumption and offer performance equal to the 100-watt chip.

I want (2)

EmperorOfCanada (1332175) | about a year ago | (#43825581)

I have long pined for a server with maybe ten 4-core ARM CPUs. Basically my server spends its time serving up web stuff from memory. Each web request needs to do a bit of thinking and then fire the data out the port. Disk I/O is not an issue, nor is server bandwidth. Quite simply, I don't need much CPU, but I need many CPUs. A big powerful Intel is of less interest.

Also, by breaking up the system into physically separate CPUs, I suspect an interesting memory-access architecture could be conjured up, preventing another potential choke point.

Re:I want (1)

0123456 (636235) | about a year ago | (#43825691)

Also, by breaking up the system into physically separate CPUs, I suspect an interesting memory-access architecture could be conjured up, preventing another potential choke point.

I suspect you mean it would have to be conjured up, or you'd spend all the time waiting to access RAM on other cores rather than doing anything useful.

Re:I want (1)

zbobet2012 (1025836) | about a year ago | (#43825747)

It's called NUMA [wikipedia.org], and we already have it in the Linux kernel. By the way, it is very cheap these days to pick up a server with 64 or more cores that fits in a 1U, two-processor chassis.
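
A small sketch of what having NUMA support buys you at user level via libnuma; the node number and buffer size are arbitrary, and it assumes the numactl development package is installed (link with -lnuma):

```c
/* Place a buffer's pages on NUMA node 0 so threads pinned there see local
 * memory latency instead of cross-socket hops. */
#include <numa.h>
#include <stdio.h>

int main(void)
{
    if (numa_available() < 0) {
        fprintf(stderr, "no NUMA support on this machine\n");
        return 1;
    }
    size_t bytes = 64UL * 1024 * 1024;             /* 64 MB, arbitrary */
    void *buf = numa_alloc_onnode(bytes, 0);       /* pages on node 0 */
    if (buf == NULL) {
        perror("numa_alloc_onnode");
        return 1;
    }
    printf("allocated %zu bytes on node 0 of %d nodes\n",
           bytes, numa_max_node() + 1);
    numa_free(buf, bytes);
    return 0;
}
```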

Re:I want (1)

EmperorOfCanada (1332175) | about a year ago | (#43825841)

I would love to know where to get a cheap 64-core 1U server. And I don't mean that in the usual snarky Slashdot ("I think you're wrong") way; I truly would love to know.

Re:I want (2)

zbobet2012 (1025836) | about a year ago | (#43825923)

Supermicro 1U, 64 cores [supermicro.com]. Bunch of other mobos (some more than 1U) on this page [supermicro.com]. Cheap is relative to the buyer, I suppose, but to my (admittedly very large) company these things are rather cheap unless you start stacking them with lots of dense memory.

Re:I want (0)

Anonymous Coward | about a year ago | (#43825903)

192 nodes @ 5w per node [boston.co.uk]

Xilinx Zynq anybody? (4, Informative)

Z00L00K (682162) | about a year ago | (#43825607)

Has anybody else seen/considered the Xilinx Zynq [xilinx.com]? It's a mix of ARM cores and FPGA fabric, which could be interesting in supercomputing solutions.

For anyone willing to tinker with it there are development boards around, like the ZedBoard [zedboard.org], which is priced at US$395. Not the cheapest device around, but for anyone willing to learn more about this interesting chip it is at least not an impossible sum. Xilinx also has the Zynq-7000 AP SoC ZC702 Evaluation Kit [xilinx.com], priced at US$895, which is quite a bit more expensive and not as interesting for hobbyists.

Done right, you may be able to do a lot of interesting stuff on the FPGA much faster than an ordinary processor can, and then let the processor take care of the parts where performance isn't critical.

Those chips are right now starting to find their way into vehicle ECUs [xilinx.com], but it's still early days, so there aren't many mass-produced cars with them yet.

As I see it, supercomputers will have to look at every avenue to get maximum performance for the lowest possible power consumption, and avoid solutions with high power consumption in standby situations.

cant wait! (0)

Anonymous Coward | about a year ago | (#43826005)

Well, you don't expect them to just go and outperform the others; they're obviously gonna take time optimizing. But what interests me more is more competition and something new to look forward to.
Cheers,

Not this week... (2)

niftymitch (1625721) | about a year ago | (#43826023)

Not this week....
I am a fanboy of the small ARM boards... I have built an MPI cluster out of Raspberry Pi boards, and it is not even close, except as a teaching exercise, where it excels.
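
For the curious, the teaching exercise starts with something like this; it's standard MPI, nothing Pi-specific, and it assumes the usual mpicc/mpirun toolchain on the boards:

```c
/* Minimal MPI "hello": each rank reports which board it landed on.
 * Build with mpicc, launch with e.g. mpirun -np 4 --hostfile hosts ./hello */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size, len;
    char name[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Get_processor_name(name, &len);

    printf("rank %d of %d on %s\n", rank, size, name);

    MPI_Finalize();
    return 0;
}
```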

However, many site services can be dedicated to these little boards where corporate IT currently dedicates virtual machines.

Department Web Servers... with mostly static content... via NFS or a revision control system like hg.
Department and internal caching name servers... NTP servers and managed central storage for each building or closet.

The impact of the little ARM boards has kicked Intel in its lethargy-loaded behind. Their next-generation sub-25-watt systems will take names and kick butt, as long as IT does not overload them with WindowZ.

IT departments will find the management advantages of Chromebox devices connected to quality screens compelling.

Users will find that flipping open the company ChromeOS laptop will put them on the same page as the big screen in the office...

It is true that this is not 100% ready for prime time for all of us, but the handwriting is on the wall.
