Ask Slashdot: Best Use For a New Supercomputing Cluster?

Unknown Lamer posted more than 2 years ago

Networking 387

Supp0rtLinux writes "In about 2 weeks time I will be receiving everything necessary to build the largest x86_64-based supercomputer on the east coast of the U.S. (at least until someone takes the title away from us). It's spec'ed to start with 1200 dual-socket six-core servers. We primarily do life-science/health/biology related tasks on our existing (fairly small) HPC. We intend to continue this usage, but to also open it up for new uses (energy comes to mind). Additionally, we'd like to lease access to recoup some of our costs. So, what's the best Linux distro for something of this size and scale? Any that include a chargeback option/module? Additionally, due to cost contracts, we have to choose either InfiniBand or 10Gb Ethernet for the backend: which would Slashdot readers go with if they had to choose? Either way, all nodes will have four 1Gbps Ethernet ports. Finally, all nodes include only a basic onboard GPU. We intend to put powerful GPUs into the PCI-e slot and open up the new HPC for GPU related crunching. Any suggestions on the most powerful Linux friendly PCI-e GPU available?"

SETI ! (1)

Anonymous Coward | more than 2 years ago

help SETI !!

Re:SETI ! (1)

stox (131684) | more than 2 years ago

That is exactly what we used to burn in the first SGI Origin 2000, and later the first few thousand nodes of a Linux cluster at Fermilab.

Better than SETI (1, Offtopic)

tomhudson (43916) | more than 2 years ago

Help everyone here on earth

Generate every single possible combination of software or business method patent, and break the patent office once and for all.

So little detail (0)

Anonymous Coward | more than 2 years ago

You say all this, but you don't even say what group you're associated with. Is this even real?

Re:So little detail (2, Funny)

oobayly (1056050) | more than 2 years ago

Indeed, it's a bit like somebody writing in to Dear Deirdre and saying "I've a 13 inch cock, how can I make girls aware of this, and what's the best way to make use of it?"

Re:So little detail (3, Interesting)

webmistressrachel (903577) | more than 2 years ago

No it's not, some really ugly, nerdy guy out there has a big cock and nobody is interested in him - he can't just flop it out in public, so that might be a very real problem for him! Or maybe he does, and girls only want him for that?

Back on topic, it's not like that at all because the computer is probably real, and if not, it's just another hypothetical "Ask Slashdot" for us to fantasize over. "What would you do if you had...". What's wrong with that? Just my 2 pence!

Lost some funding? (5, Funny)

turkeyfeathers (843622) | more than 2 years ago

Start with the cheapest backend that'll get the system up and running, then use your supercomputer to mine Bitcoins for a few days, then use all the money you'll make to buy the InfiniBand backend (you'll probably have enough money left over to buy Monster cables to hook everything up).

Re:Lost some funding? (0, Troll)

webmistressrachel (903577) | more than 2 years ago

Monster cables are only worth the investment for speakers and line-level / mic stuff (i.e. analogue signals). Having a Monster-cabled computer network would be no better than having the generics of the same cables.

We all know this - and MP3 and the so called "average listeners" (people who buy Britney Spears) have ruined the hi-fi industry with their cable sarcasm. Yes, MP3 will sound crap on a Monster cable, too. But 44.1KHz 16-bit sound, converted to analogue in the transport and sent to the amp via line leads WILL benefit from Monster / premium cables, as will speaker cables of any kind.

Re:Lost some funding? (-1, Troll)

webmistressrachel (903577) | more than 2 years ago

Since when did scientific FACT get modded troll TWICE so damn quick??? Fuck you, truth-hating Americans.

And that's not a troll either, it's honest-to-goodness nerd rage! Fuck you mods!

Re:Lost some funding? (3, Informative)

Anonymous Coward | more than 2 years ago

Maybe the mods are a little more aware than you of the engineering and scientific FACTS about Monster Cable. Some things that you said:

Monster cables are only worth the investment for speakers and line-level / mic stuff (i.e. analogue signals). [...] But 44.1KHz 16-bit sound, converted to analogue in the transport and sent to the amp via line leads WILL benefit from Monster / premium cables, as will speaker cables of any kind.

are, I'm afraid, complete nonsense. Counterfactual, in fact. And yes, there's real science to support that. Let me gloss over it...

A 44.1 kHz sample rate before the DAC means the maximum frequency component the cables need to handle is 22 kHz. (This is due to the Nyquist limit, as in the Nyquist-Shannon Sampling Theorem.) 22 kHz is low. Really low. Practically any old piece of wire can carry audio frequencies with perceptually flat response across the audible range and nearly no loss as long as the cable lengths are as short as they are in a typical home stereo system. The only thing you need is large diameter wire for your speaker cables to ensure they're very low resistance so that the higher currents involved in powering a speaker don't cause resistive loss in the cable.

As for low-power line level signals (such as CD player to amp), the most likely source of problems is actually ground loops, where the source equipment has a different ground reference than the destination. (A lesser concern is interference.) The pros don't solve this with stupid Monster Cable, they solve it by using pro equipment with balanced (differential) signaling, which both eliminates the need for the source and destination to have a common ground and provides some noise immunity.

For home stereo systems, however, making sure that everything is grounded to the same point (3 prong plugs all plugged into a single grounded power strip) is generally good enough, and noise is rarely (if ever) a significant problem.
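The Nyquist and resistive-loss points above can be checked with a few lines of arithmetic. This is only a rough sketch: the 3 m cable run, the roughly 16 AWG conductor diameter (1.29 mm), and the 8 ohm speaker treated as a pure resistance are illustrative assumptions, not figures from the thread.

```python
import math

# Approximate resistivity of copper at room temperature (ohm * metre).
COPPER_RESISTIVITY_OHM_M = 1.68e-8


def nyquist_frequency_hz(sample_rate_hz: float) -> float:
    """Highest frequency that can be represented at a given sample rate."""
    return sample_rate_hz / 2.0


def cable_resistance_ohm(one_way_length_m: float, conductor_diameter_mm: float) -> float:
    """Round-trip DC resistance of a two-conductor copper speaker cable."""
    radius_m = conductor_diameter_mm / 1000.0 / 2.0
    area_m2 = math.pi * radius_m ** 2
    # Current flows out on one conductor and back on the other: 2x the length.
    return COPPER_RESISTIVITY_OHM_M * (2.0 * one_way_length_m) / area_m2


def fraction_lost_in_cable(cable_ohm: float, speaker_ohm: float) -> float:
    """Share of amplifier output power dissipated in the cable (series circuit)."""
    return cable_ohm / (cable_ohm + speaker_ohm)


# Hypothetical home-stereo numbers: 3 m run, ~16 AWG wire, 8 ohm speaker.
nyquist = nyquist_frequency_hz(44_100)
resistance = cable_resistance_ohm(3.0, 1.29)
loss = fraction_lost_in_cable(resistance, 8.0)
print(f"Nyquist limit: {nyquist:.0f} Hz")
print(f"Cable resistance: {resistance:.3f} ohm, power lost in cable: {100 * loss:.2f}%")
```

Under these assumptions the cable dissipates on the order of one percent of the output power, which is the poster's point about ordinary thick wire being sufficient.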

Re:Lost some funding? (1)

chargersfan420 (1487195) | more than 2 years ago

It appears with posts like this, that perhaps the opposite of your signature is also true.

Off-topic I know, but sorry, I couldn't resist.

Re:Lost some funding? (1)

Toonol (1057698) | more than 2 years ago

Protestations notwithstanding, I still think you were trolling. That's a kindness, by the way; I'm generously assuming you don't actually believe what you're claiming.

Re:Lost some funding? (1)

Liambp (1565081) | more than 2 years ago

So you are saying that anyone who cannot hear the difference between standard cables and expensive premium cables must be a Britney Spears fan?

I think the Emperor isn't wearing any clothes.

Re:Lost some funding? (1)

KZigurs (638781) | more than 2 years ago

To be fair, some of Britney's recordings are (actually indeed) exceptionally well mastered. You might not like her music, but if you are a proper audiophile you will still enjoy it.

Re:Lost some funding? (1)

leenks (906881) | more than 2 years ago

Let me guess... you bought the green pens (or stick on rims) for the edges of your CDs too?

Re:Lost some funding? (1)

sexconker (1179573) | more than 2 years ago

He won't even have GPUs initially.
CPU mining is about as productive as mining for (physical) gold with a toothpick.
GPU mining is slowly becoming less viable since people have rolled out FPGA miners.

And if anyone ever decides to go full-blown ASIC, mining will be dead for 99.9999999% of people.

More on point: This seems like a retarded cluster if you haven't built it with GPU shit initially in mind. It's not like CUDA/STREAM/DirectCompute/OpenCL are hard if you've already got the highly parallel algorithms and workloads in the first place. Dude's got 1200 boxen that could be outdone by 40 GPUs in 10 boxen for the vast vast vast majority of workloads. The time saved in setup and management, as well as the cost with regards to the communication backplane, power delivery, etc., could easily pay for a few programmers to rewrite any legacy code to be GPU-ready.

And with AMD's 7000 series coming out, the energy cost / performance would be a trifle. A trifle!
http://lenzfire.com/wp-content/uploads/2011/09/AMD-Radeon-7000-22.jpg [lenzfire.com]
Look at that shit! LOOK AT IT!

Best Use For a New Supercomputing Cluster? (1, Funny)

Jodka (520060) | more than 2 years ago

Generating Bitcoins

Re:Best Use For a New Supercomputing Cluster? (1)

Monkey-Man2000 (603495) | more than 2 years ago

LOL, where's my mod points when I need them. The bitcoins will help offset the energy consumption I'm almost sure.

I call Shenanigans!!! (5, Insightful)

sconeu (64226) | more than 2 years ago

No way in hell a project that big gets approved without a rationale.

And no way in hell the administrator of such a project would ask Slashdot what to do with it.

Re:I call Shenanigans!!! (2, Informative)

Anonymous Coward | more than 2 years ago


Two weeks away and still at the “thinking of cool shit to use it for” and “picking out hardware” stages? How does that even happen? Is this some kind of tax scam to burn as much money as possible?

I get that the submitter already has a primary use... but I imagine if I was ever given that kind of budget I’d probably have to account for every CPU cycle every hour of the day (especially since I’m a programmer and should have no business with something like this ;p). I can’t imagine a budget for something like this comprised of “and hopefully we’ll be able to recoup the millions of dollars by leasing it out to some TBD people”.

Also, the first person to mention bitcoin as an option gets to have their teeth rotated. I’m not joking.. we will find you..

Re:I call Shenanigans!!! (2, Funny)

Anonymous Coward | more than 2 years ago

Yes, it probably is a tax scam. It is now the US Federal Year End. Someone wrote a really good funding proposal and got it approved to get money for a HPC cluster to do *something*. Doesn't really matter. The grant application will have focused on broad ideas like # of cores and what not and not the details. A bit surprising that the network wasn't spec'd because that is such a major cost item, but whatever, maybe the grant application's work loads are not network bound.

So, now that the money is approved the task to build the thing falls to the inexperienced IT group who make all kinds of dumb choices, then will claim they are massively over worked/underfunded trying to get the thing to run and end up with shitty performance and a lot of wasted time and money. Oops. Your Money At Work.

The system you spec'd should be around the 250TF range, if you set it up properly with QDR IB and do all the work to get MPI optimized. If you are good. The correct way to design the network is to match a 36 port IB switch with 18 servers, and then correctly spread the resulting 1200 uplink ports across a pair of 686 port core switches. The cost of IB cables alone will be shocking and you'll regret not using a HPC blade server arrangement for this.
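The leaf/core layout described above can be sanity-checked with a short calculation. This is a sketch under stated assumptions: the function name and return fields are made up for illustration, the 36-port leaf, 18 servers per leaf, two core switches, and the 686-port figure are taken as quoted in the comment, and "non-blocking" is modelled simply as one uplink per attached server.

```python
import math


def size_two_tier_fabric(servers: int, leaf_ports: int = 36,
                         core_switches: int = 2, core_ports: int = 686) -> dict:
    """Rough sizing of a non-blocking two-tier (leaf + core) fabric."""
    # Half of each leaf's ports face servers, the other half face the core.
    servers_per_leaf = leaf_ports // 2
    leaf_switches = math.ceil(servers / servers_per_leaf)
    # One uplink per attached server keeps each leaf non-blocking.
    uplinks = servers
    uplinks_per_core = math.ceil(uplinks / core_switches)
    return {
        "leaf_switches": leaf_switches,
        "uplinks": uplinks,
        "uplinks_per_core": uplinks_per_core,
        "fits_in_core": uplinks_per_core <= core_ports,
        # One cable per server link plus one per uplink.
        "cables": servers + uplinks,
    }


fabric = size_two_tier_fabric(1200)
for name, value in fabric.items():
    print(f"{name}: {value}")
```

For 1200 servers this gives 67 leaf switches (the last one partly populated), 1200 uplinks split 600 per core switch, and 2400 cables in total, which is why the cabling bill is singled out.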

Considering the questions you are asking, you should have gone with an HPC focused integrator that could provide the full system for you. 4 GIGE's on every box? Waste of money. The IB equipped blade designs from SGI/Bull, etc are very nice, space and power efficient and much more cost effective. They even come with a pre-integrated and tested OS ready to go, working boot over IB and other long term cost saving features.

Gosh I hope you bought a few PB of HPC focused storage as well, otherwise you won't find anyone who can even use your machine for their problems.

Yeah 6 core is so 2007 (1)

cheekyboy (598084) | more than 2 years ago

I agree, you gotta go for the big-ass servers with AMD's new 12-core CPUs x 4; 48 cores per box is hell nice. And it takes 1/8th the space and power.
You only need 150 boxes not 1200 boxes
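A quick check of the consolidation arithmetic, assuming the submitter's 1200 dual-socket six-core nodes and the four-socket 12-core boxes suggested here. It counts cores only; per-core speed, memory, and power draw are not known from the thread and are ignored.

```python
import math


def boxes_needed(total_cores: int, cores_per_box: int) -> int:
    """Number of boxes required to match a given total core count."""
    return math.ceil(total_cores / cores_per_box)


# Submitter's spec: 1200 nodes x 2 sockets x 6 cores.
total_cores = 1200 * 2 * 6
# Proposed alternative: 4 sockets x 12 cores per box.
big_box_cores = 4 * 12

print(f"Total cores in the submitter's cluster: {total_cores}")
print(f"48-core boxes needed to match: {boxes_needed(total_cores, big_box_cores)}")
```

On core count alone this works out to 300 boxes, a 4x reduction; reaching 150 boxes would need each core to do roughly twice the work.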

Totally believable. (3, Interesting)

khasim (1285) | more than 2 years ago

I totally believe the submitter's question.

Next up on Ask Slashdot:
I just got permission to buy the biggest fleet of trucks on the east coast ... and I was wondering if anyone on Slashdot had any ideas what I should do with them.

Followed by,
The company I work for just purchased 10,000 acres of land on the east coast and I was wondering if anyone on Slashdot had any idea what we should do with it.

Happens all the time!

Re:Totally believable. (1)

Anonymous Coward | more than 2 years ago

And slashdot forum:

The barely-legal hot teen nymphomaniac convention asked me to be a judge at the tight pussy competition, but my WoW clan is going out to Taco Bell. What should I do?

Re:Totally believable. (3, Interesting)

blair1q (305137) | more than 2 years ago

Actually, it does.

I remember taking possession of a spanking-new Thinking Machines cluster some <mumble> years ago.

The principal investigator got it to do one particular calculation, and promised the excess would be put to good use.

We spent our time trying to figure out what "good use" meant in that context.

It hasn't got much easier.

I say if you run out of numbers to crunch of your own, these days, just hook it up to some lucky grid-computing project and let it swamp the stats.

Re:I call Shenanigans!!! (1)

Amouth (879122) | more than 2 years ago

agreed - was just about to ask who was stupid enough to let someone buy that much hardware without an existing project/plan in place. and how can i get them to fund me and my start-up (don't have one now but you bring the cash i'll figure out something to do with it)

Re:I call Shenanigans!!! (0)

Anonymous Coward | more than 2 years ago

Daddy. Some people have mighty big trust funds/rich parents. I'm this close to having built that kind of wealth myself :) . now.... I'll just have to figure out what to do with it. Ha. And I just got my first apartment. My first plan of action is to buy a place of my own. I'm 26 and my business just turned 3 years old. I am bringing in about $300,000 / yr now. Not that you would know it from my tax returns! :) yet... anyway. Six months jesus christ only knows what that number is going to turn into. We've only touched the tip of what is possible. We're hitting basically two continents and only one language. Next up! Multiple continents and multiple languages! Not to mention new products will be out to take advantages of our current markets.

While I find this highly doubtful.... (3, Interesting)

xzvf (924443) | more than 2 years ago

I've seen government institutions have unallocated money at the end of some budget cycle, that was so micro-managed that it could only be spent on a certain type of widget. I can see a university get a late grant, that had to be spent in 30 days, could only be spent on technology, that can only come out of a pre-approved catalog, and some administrative type that just saw a Top 500 super-computer list with competing university names on it, bring up in a meeting that we should build a super computer, and some grad assistant saying how easy it would be. They found a room with a window in it and ordered a bunch of parts, and will walk prospective students and their parents by it saying "This is the largest super-computer on the east coast".

Re:While I find this highly doubtful.... (1)

Anonymous Coward | more than 2 years ago

You are more correct than you realize!

Re:I call Shenanigans!!! (2)

AdamHaun (43173) | more than 2 years ago

It did have one. Right there in the submission:

We primarily do life-science/health/biology related tasks on our existing (fairly small) HPC. We intend to continue this usage, but to also open it up for new uses (energy comes to mind). Additionally, we'd like to lease access to recoup some of our costs.

Re:I call Shenanigans!!! (0)

Anonymous Coward | more than 2 years ago

If you mine bitcoins with CPUs, you will spend more money on electricity than you will earn in bitcoins. Only a select few AMD GPUs can profitably mine bitcoins these days. Every other CPU, GPU, or what-have-U will be a financial loss due to the higher bitcoin "difficulty factor" and the lower bitcoin dollar value.

Re:I call Shenanigans!!! (0)

Anonymous Coward | more than 2 years ago

Get the brooms!
And no way in hell someone buys 1200 nodes without buying (or knowing) the interconnect, the OS, or the scheduler. And you don't buy nodes with plans to someday put in GPGPUs (note - it's not GPUs, but GPGPUs for HPC computing). I architect these all the time and I can't believe that an entity has actually spent this type of money without knowing this much...

Re:I call Shenanigans!!! (0)

Anonymous Coward | more than 2 years ago

No way in hell a project that big gets approved without a rationale.

And no way in hell the administrator of such a project would ask Slashdot what to do with it.

My thoughts exactly!

Ummm two things (3, Insightful)

Sycraft-fu (314770) | more than 2 years ago

1) Something with 10gb really isn't a "supercomputer" it is a cluster. Fine, but call it what it is. I really wouldn't call a cluster with Infiniband a supercomputer either.

2) You really should maybe get someone who knows more about your project and someone who knows more about clusters/supercomputers. The questions you are asking are not ones I would want to see from the guy making the choices on a multimillion dollar project.

Re:Ummm two things (2, Interesting)

Anonymous Coward | more than 2 years ago

You clearly have no idea what you're talking about. I was just part of a million-euro EU project consisting of a large partnership of universities and companies. Given the fact that none of them ever did anything, my professor gave up and defined the project on his own.
I coded the entire project on little more than minimum wage while I was also attending classes. I managed a couple of helpers who did web design and documentation, and dealt with the rest of the partners on my own, even interacting with fancypants EU higher-ups at some point. I was also in charge of administrative work such as financial reports. I dealt with the university accounting department directly as well as their administrative staff. I booked flights and physically walked over to the traveling agency. I represented the project at every single conference where it was demo'ed. As part of its end goal of meeting an audience target of a few thousand people, I took the initiative of aggressively promoting the project and was met with huge success.
The vast majority of the cash was spent on people who did absolutely nothing other than throwing one or two opinions in the 18 months the project lasted. Our university's share was used to buy new chairs and tables and repaint the walls etc.
Life in academia is serious research. Very serious. Investing in "science" will solve the world's problems.

Re:Ummm two things (1)

bananaquackmoo (1204116) | more than 2 years ago

I would mod this up if I could. All the other comments here seem to be about how this story is a fake, nobody would ask this question of Slashdot on a short timeframe with tons of funding and no idea where to go with it. Well I think you nailed it, 100%. It's likely someone's pet academic project.

Re:Ummm two things (2)

Anubis350 (772791) | more than 2 years ago

1) You haven't been to any computer conference (like, say, SC) have you? Or worked on a supercomputer? Most supercomputers these days are clusters, and hell, one of the most common interconnects is still gigE, not even 10gigE, though that's slowly changing (check the top500 stats if you don't believe me, but I've been at SC's top500 announcement every year for the past 4, and it's been mentioned each time. For that matter, I run jobs on a gig-based cluster every day, and for many types of work it's not necessarily a hangup).

2) I'm going with "article is fake", no-one commits the resources to spec, build, and power a cluster of that size without a projected use. You should see the hoops you have to go through to spec machines a fraction of the size ::shudders::

Re:Ummm two things (2)

Sycraft-fu (314770) | more than 2 years ago

They may call them "supercomputers" but in my mind that is mislabeling things. They work for cluster operations, where there's not a ton of inter-node communication and no need for access to memory outside your node. Well, that is what supercomputers were made for. So in a real supercomputer, you have the ability to do that. That is also why real supercomputers cost more.

I think it is an important distinction for that reason. While a supercomputer can do all a cluster can, the reverse is not true. Same with distributed computing vs a cloud. If you have something that takes basically no inter node communication, just occasional communication with a server, then you can distribute it all over the net, using low bandwidth links, unreliable nodes, and so on. A cluster can do that stuff too, but there are things a cluster can do that distributed computing cannot.

Re:Ummm two things (1)

blair1q (305137) | more than 2 years ago

I think 2) is not seeing the whole story there.

They do have a continual use for mass quantities of computation. But it looks like it's not a 24/7 workload. And with $/core dropping like a rock, this iteration of the "biggest" may be cheaper than the last, and therefore not the sort of budgetary lightning rod that building-sized supercomputers used to be.

Is this a joke? (1)

Anonymous Coward | more than 2 years ago

You've got hardware for a supercomputer coming but you haven't thought out what OS you're going to use? Shouldn't this all be decided, designed and ready to go already?

Uh oh.. (4, Insightful)

joib (70841) | more than 2 years ago

Shouldn't you have figured out answers to all these (simple) questions before ordering several million $$$ worth of hardware? Sheesh.. As for your specific questions:
- IB vs. 10GbE: IB hands down. Much better latency and more mature RDMA software stacks (e.g. for MPI and Lustre). Cheaper and higher BW as well.
- GPU: NVidia Fermi 2090 cards. CUDA is far ahead of everything else at the moment.

Re:Uh oh.. (1)

Anonymous Coward | more than 2 years ago

CUDA programmatically has far more limitations than MPI running on a cluster like this guy is describing... however, for things like gene sequencing, and various cancer research CUDA is the better choice. That is neither here nor there though. TACC uses CentOS on Ranger I believe... and they do lease cycles on it, and/or let people pay to run with batches. https://portal.tacc.utexas.edu/ You could also contact XSEDE at https://www.xsede.org/ TACC is part of XSEDE, but they operate somewhat autonomously and neither generally knows what the other is doing... TACC is also generally far more accommodating toward things like what you are asking...

I'm sorry...What? (0)

jpedlow (1154099) | more than 2 years ago

So you're a sysadmin for a Large Commercial Cluster and you've got hardware on the way and don't have answers to these questions already?

I apologize if I've misread, but something just doesn't seem to add up here. :\ I'd get it if you were saying you've got a stack of maybe dual quads and were like "hey i've got a half rack of computers, please hold my hand for HPC", but with something the magnitude you're speaking of, I--dont-even. Trollface.jpg?

Riiiiight (1)

GrumpySteen (1250194) | more than 2 years ago

We're supposed to believe that you've purchased 1200 servers, 2400 six core CPUs and all the associated hardware without deciding basic things like how you're going to connect it all or what distribution you're going to use?

Re:Riiiiight (1)

Jah-Wren Ryel (80510) | more than 2 years ago

We're supposed to believe that you've purchased 1200 servers, 2400 six core CPUs and all the associated hardware without deciding basic things like how you're going to connect it all or what distribution you're going to use?

Sounds like they got some of that 75 billion dollars per year of anti-terrorism money. [slate.com]
Even though he's dead, Osama still knows how to make it rain!!

How we do things (1)

Anonymous Coward | more than 2 years ago

We have this exact same setup (20% the size, though). We use InfiniBand; we have not had good luck with 10GigE, though we do use it for Globus endpoints. 40Gig IB has better MPI performance and higher bandwidth at a lower cost. You can also get Ethernet->IB gateways to help with any issues if you need to use IP over IB.
Also make sure your MPI library supports OFED (open fabrics http://www.rce-cast.com/Podcast/rce-34-ofed-openfabrics-enterprise-distribution.html) or you won't get the performance you want.
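To put rough numbers on the bandwidth comparison above: the sketch below derives the usable data rate of each link from its signalling rate and line encoding, then estimates a bulk transfer time. The signalling rates and encodings are the standard ones for QDR 4x InfiniBand (40 Gb/s, 8b/10b) and 10GBASE-R Ethernet (10.3125 Gb/s, 64b/66b); the 1 GiB message size is an arbitrary illustration, and protocol overhead and latency (where IB's advantage is usually larger) are ignored.

```python
def data_rate_gbps(signalling_gbps: float, payload_bits: int, coded_bits: int) -> float:
    """Usable data rate after line-encoding overhead."""
    return signalling_gbps * payload_bits / coded_bits


def transfer_seconds(num_bytes: int, rate_gbps: float) -> float:
    """Idealised time to move num_bytes at rate_gbps (no latency, no protocol overhead)."""
    return num_bytes * 8 / (rate_gbps * 1e9)


# QDR 4x InfiniBand: 4 lanes x 10 Gb/s signalling, 8b/10b encoding.
qdr_ib = data_rate_gbps(40.0, 8, 10)
# 10GBASE-R Ethernet: 10.3125 Gb/s signalling, 64b/66b encoding.
ten_gbe = data_rate_gbps(10.3125, 64, 66)

message_bytes = 1 << 30  # hypothetical 1 GiB transfer
print(f"QDR IB: {qdr_ib:.1f} Gb/s, {transfer_seconds(message_bytes, qdr_ib):.3f} s per GiB")
print(f"10GbE:  {ten_gbe:.1f} Gb/s, {transfer_seconds(message_bytes, ten_gbe):.3f} s per GiB")
```

The raw link rates differ by a factor of about 3.2; real MPI throughput depends on the adapter, the PCIe slot, and the software stack.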

As for GPUs, look at the Dell C410x (http://www.dell.com/us/business/p/poweredge-c410x/pd), which connects up to 6 hosts to up to 16 GPUs. Be ready with the 220V power.

Check out www.rce-cast.com for a bunch of podcasts on HPC type stuff.

Seriously? (0)

Anonymous Coward | more than 2 years ago

If you have to ask, it's doubtful you're telling the truth. Any organization with enough resources to build a supercomputer has experts on staff who have already figured this out, because they'll have designed everything in advance.

professionals... (0)

Anonymous Coward | more than 2 years ago

I think you need to hire a consultant or an expert. You have very specific, for-profit (even if it's cost-recoup) needs, with a chance of liability if things go wrong/ downtime. In short, requiring some warranty or insurance of some kind. Standard stuff that comes with commercialization.

There's friendly help, and then there's helping-you-do-your-work-for-free.

Pong (2)

Vandilzer (122962) | more than 2 years ago

One really smooth and accurate game of pong! Or asteroids if that suits your fancy... though it will require a bit more computing power :)

EPIC TROLLING (4, Insightful)

jpedlow (1154099) | more than 2 years ago

Wow, he just TROLLED THE CRAP out of slashdot. We mad, bros!

excuse me... (1)

Thud457 (234763) | more than 2 years ago

but destroying the market for bitcoins has a quantifiable societal benefit. Burn down bitcoin's house while you burn in your hardware!

Mom & dad's new basement data center (1)

macraig (621737) | more than 2 years ago

It would appear somebody got enough of a life to move out of mom and dad's basement and now wants to convert it into a Bitcoin mining hub....

What we do ... (4, Informative)

Anonymous Coward | more than 2 years ago

Similar size setup in bio-informatics in Europe. We run Red Hat 6.1 (previously CentOS 5) and LSF. Single 1Gbit link to each server (blades). No need for 10Gb or IB unless you run huge MPI jobs, which no one here does. 32GB to 2TB per node - some people like enormous R datasets. All works well for our ~500 users.

Re:What we do ... (1)

gknoy (899301) | more than 2 years ago

Thank you for posting the first informative post I saw, rather than mocking or trolling ones. :)

How about (1)

Anonymous Coward | more than 2 years ago

LFTR modelling? It's one of the main things holding back the tech.

Re:How about (1)

drwho (4190) | more than 2 years ago

OK, LFTR (Liquid Fluoride Thorium Reactor) development would be useful. Can you explain what modeling needs to be done? Is this merely a provisioning problem (you haven't got the computational resources), or is it also a programming problem, and perhaps even an algorithm problem (do you know what you want to compute)?

Another question is, who would own the results?

Did someone say Bitcoin!? BUY! BUY! (2)

recrudescence (1383489) | more than 2 years ago

Holy crap! Someone mentioned the word "Bitcoins" on slashdot again! It's only a matter of time before its value hits the roof again! Quick! BUY! BUY!

Obligatory (0)

Anonymous Coward | more than 2 years ago

shit coins.

Re:Obligatory (1)

webmistressrachel (903577) | more than 2 years ago

Well, if he installs the bitcoin generator mentioned plenty of times above, I'm sure the computer will literally shit meta-coins!

test (0)

Anonymous Coward | more than 2 years ago

Your request timed out. Please retry the request.

hardly the biggest (2)

zeldor (180716) | more than 2 years ago

Amazon's HPC cluster there in Virginia I suspect is way bigger than your little toy..
plus all the agencies.

Wait a minute... (-1, Redundant)

Nom du Keyboard (633989) | more than 2 years ago

Wait a moment here. You're this close to receiving your hardware and you don't even know what O/S you're planning to use, what interconnect to choose, or what problems you intend to solve with it? Where do you get funding like this?

Re:Wait a minute... (1)

93 Escort Wagon (326346) | more than 2 years ago

Wait a moment here. You're this close to receiving your hardware and you don't even know what O/S you're planning to use, what interconnect to choose, or what problems you intend to solve with it? Where do you get funding like this?

Yeah, I think we need more specific info here. I can't see any way a group would attract funding without spelling out all these items... however the submitter doesn't actually refer to funding, he states "I will be receiving everything necessary to build ...". What does that mean, exactly? Did he just buy hundreds of 386-based machines off the scrap heap? And, more importantly, does this person's supervisor know he apparently seems to think this is his own personal playground rather than a professionally run system?

Or maybe we have it all wrong. Reading between the lines, I immediately assumed he worked for an educational institution or a pharma company - but he doesn't say anything like that. For all we know this guy works for one of those rich pseudo-scientists... the kind of dabbler who has an "institute" with his own name in the title, and the mention of whose name makes real scientists roll their eyes. We just don't know enough.

Re:Wait a minute... (1)

Anonymous Coward | more than 2 years ago

I would guess that he's a Somali Pirate, and they've just hijacked a container ship from China... Several hundred containers worth of computer components.

Sounds like the only reasonable conclusion to me...

Re:Wait a minute... (1)

ArsonSmith (13997) | more than 2 years ago

Yea, I almost got caught by that super computer in the impulse buy section at the drugstore checkout too.

the best use? (1)

nimbius (983462) | more than 2 years ago

i dunno...someone's still working on cancer i think...and i know a guy who's still trying to find the Higgs boson.
having solved all other problems, maybe dick around on jeopardy?

Multiple super-computers instead of a single one? (1)

sisukapalli1 (471175) | more than 2 years ago

You need to specify additional information:

1) What about the data and storage? Many complex applications require vast amounts of data (e.g. climate change models, CFD models, GIS data sets that can complement or take advantage of modeling). Many end users may not be very adept at accessing these data.
2) What about the software? For example, CFD modeling software is very expensive. In some cases, open source software may not make the cut.
3) Does it have to be a single supercomputer? Why not split into multiple supercomputers and merge them as needed? That way, some groups have a more dedicated resource for themselves. The "biggest X ever" isn't as cool as it appears to be.
4) I presume the funds came in as a result of some proposal (using the word informally here, it could even be a one-pager that was sent to the university). The costs should be at least 5k per server (based on what I've seen recently), so it's 6 million [I'd say 10 million even, unless my weak math is catching up with me]? So, that proposal would have some intended uses already.
5) Leasing it internally (to other groups in the university) may be reasonable -- it may even be a sweeter deal if you allocate a set of 10 or 20 servers for a group, instead of having it as part of a broader account access. You can tell them it is their "own machine".

I say this with no offense meant... I've noticed way too many people for whom the tool or technology seems to be the primary purpose (e.g. I do it using *EJB* or *distributed cluster* or *high availability database*). I spoke to someone that was working on app infrastructure for first responders, and was focusing on IDEs and integration, and his killer app was a download link to a weather channel app! When I mentioned that he needs some apps that really differentiate the system from others, his response was that we can run a contest for the apps. So, please avoid going that route -- in general, the tools are there to solve problems and not the other way round [with all caveats, sometimes the tools have to come first before we even realize what we had before that was very bad].

Well, congratulations on getting to play with 10M. I think I was rambling a bit, but the bottom line is: (a) don't make it one computer unless you can find a reason, and (b) approach different groups and offer the tool/service -- you need to do that till you get some traction.

Need help too! (1)

gtirloni (1531285) | more than 2 years ago

I'm also receiving all the parts needed to build a nuclear weapon but I still haven't figured out which one. Any ideas? It must be capable of destroying all trolling in the universe (including the ones that /. accepts as news).

Ethernet / Infiniband Tradeoff (1)

Rhalin (791665) | more than 2 years ago

Pros and cons for each link, and it depends on -which- speed of infiniband / how they are bonded. Infiniband can get quite fast in the right configuration, if you want to spend the money on it, and even then, you could do similar setups with bonded 10Gb ethernet, which might be cheaper.

One advantage I think ethernet has over infiniband (and correct me if I'm wrong here, someone) is that infiniband requires a specialized network protocol to use, where ethernet can use standard TCP/IP sockets. This is perfectly fine, since many cluster libraries can use infiniband...but using ethernet would open up your cluster to situations and use-cases where things like MPI may not be appropriate architectures - for instance, some types of cognitive modeling can benefit from the CPU resources available on the cluster, but their architectures don't always bind well to MPI metaphors (and for some programming languages / cognitive architectures, getting MPI to work is non-trivial in a cluster environment).
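To make the "standard TCP/IP sockets" point concrete, here is a minimal sketch of handing a task to a worker over a plain TCP socket with no MPI involved. Everything in it is an assumption for illustration: the function names `handle_one_request` and `run_demo`, the JSON-per-line message format, and the toy "sum of squares" task. It runs both ends on localhost, where a real cluster would put the worker on another node. (Note that InfiniBand can also carry IP traffic via IPoIB, so this style of code is not strictly Ethernet-only.)

```python
import json
import socket
import threading


def handle_one_request(server_sock):
    """Accept one connection, read a JSON list of numbers, reply with the sum of squares."""
    conn, _ = server_sock.accept()
    with conn:
        with conn.makefile("rwb") as stream:
            numbers = json.loads(stream.readline().decode("utf-8"))
            result = sum(x * x for x in numbers)
            stream.write((json.dumps(result) + "\n").encode("utf-8"))
            stream.flush()


def run_demo(numbers):
    """Start a worker on localhost, send it one task over TCP, and return its answer."""
    server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server_sock.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server_sock.listen(1)
    port = server_sock.getsockname()[1]

    worker = threading.Thread(target=handle_one_request, args=(server_sock,))
    worker.start()

    # Client side: one line of JSON out, one line of JSON back.
    with socket.create_connection(("127.0.0.1", port)) as client:
        with client.makefile("rwb") as stream:
            stream.write((json.dumps(numbers) + "\n").encode("utf-8"))
            stream.flush()
            answer = json.loads(stream.readline().decode("utf-8"))

    worker.join()
    server_sock.close()
    return answer


if __name__ == "__main__":
    print(run_demo([1, 2, 3]))
```

Anything that can open a socket can take part in a scheme like this, which is the appeal for codes that don't map well onto MPI; the price is that you write the task distribution and failure handling yourself.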

security (0)

Anonymous Coward | more than 2 years ago

Security with the GPU will be impossible.

They rarely have their own IOMMU due to the speed limit they put on DMA, and that opens up the system to major security disasters.

All of the systems I know of have no protection to prevent malicious/buggy GPU code from corrupting system memory and either taking over the node, or (better) just crashing the node.

And they are going to let you administer it? (0)

Anonymous Coward | more than 2 years ago

Anyone competent should be able to find answers to most of those questions.

As a cluster admin myself.... infiniband!!! (2)

Fallen Kell (165468) | more than 2 years ago

I cannot stress this enough. As good as 10Gb ethernet is, the latency is still horrible compared to infiniband.

As for distributions, really, that depends on what you are doing and how your current applications are built/designed. Rocks cluster is fairly nice. Unfortunately we have not been able to deploy that due to our FOSS policies, which have really been hurting this project. So we have a mixed Red Hat and Solaris cluster using Grid Engine.

Several projects in mind! (0)

Anonymous Coward | more than 2 years ago

- Check if the Ultimate answer is really 42. Will take some time.

- Simulate the evolution of the US economy with and without stimulus. Ideal for Dem and Rep (depending on the result)

- Simulate the evolution of the US if the military budget were used for public health and education.

- Simulate some earthquakes and damages. (for insurance companies)

- Simulate a black hole. (for some astrophysics research)

- Simulate the chances of a non-educated young teen becoming a new rich rapper.

- Crunch numbers for SETI-like projects.

- Look for patterns in NYSE.

- Estimate the public debt in 2020 if the politicians keep spending exponentially with some never ending wars in between.

- Calculate the mandelbrot set with super precision.

- Program a chess game.

Total BS (1)

friedmud (512466) | more than 2 years ago

I work with some of the largest supercomputers in the world... and I can tell you that this is BS. There is no way this guy got someone to give him enough cash to put this together without:

1. A Plan of what to buy / build
2. A sound reasoning behind what would be done with the machine.

Beyond that... that isn't even that large of a cluster. There are numerous computers on the east coast larger than that... at universities and government research labs (e.g. http://www.nccs.gov/computing-resources/jaguar/ [nccs.gov] although maybe he doesn't consider Oak Ridge to be on the "East Coast").

Advertisement or trolling? (0)

Anonymous Coward | more than 2 years ago

Is there a difference nowadays?

Dear infiniband, stick it up you know where.

With articles like this (0)

Anonymous Coward | more than 2 years ago

I can see why Taco left. Maybe he was on to something.....

Wow... (0)

Anonymous Coward | more than 2 years ago

You're two weeks away from taking receipt of this gear and you haven't yet planned these things?

Either you're lying out your ass about the gear you're getting or one of us might have the chance to get a killer deal on some HPC gear at your bankruptcy liquidation....

Re:Wow... (1)

GameboyRMH (1153867) | more than 2 years ago

Makes me feel like an idiot for not applying to a job working on a small cluster used for climate research. I didn't apply because I didn't have any HPC knowledge.

If I knew I could get hired without knowing shit I would've given it a shot!

two weeks away, and you still haven't spec'ed... (1)

capsteve (4595) | more than 2 years ago

two weeks away, and you still haven't spec'ed all your hardware?
c'mon, this is a put on!
if you're getting this monster installation, you would have spec'ed all aspects of the hardware, including 10gb and gpu's and OS months ago.

Maybe (1)

arbulus (1095967) | more than 2 years ago

Giving the benefit of the doubt, I'm assuming that you mean that you have a purpose, but have spare processing power and would like to put it to use. In that case I would recommend maybe seeing if you could help out with Folding@home, SETI@home or CERN distributed computing.

Imagine Beowulf of those! (1)

porky_pig_jr (129948) | more than 2 years ago

Come on, folks. Is that Slashdot or what?

Re:Imagine Beowulf of those! (2)

blair1q (305137) | more than 2 years ago

I was imagining partitioning it into an enormous brigade of heterogeneous virtual machines, then hooking those up as a Beowulf cluster.

Teaching.. (0)

Anonymous Coward | more than 2 years ago

.. my cats to paint like the Masters!

troll (1)

rish87 (2460742) | more than 2 years ago

As someone who works with supercomputers, I have serious suspicions about this. You do not get the funding for a supercomputer of this size without knowing these basic specifications. How can he be getting "everything necessary" in two weeks when he doesn't have a planned network, GPUs, OS or application? There is so much effort that goes into speccing these clusters, building them and then installing and configuring all of the administrative software such as queuing systems. Hell, if you're doing HPC work on supercomputers, you need an equally impressive storage solution to contain all of the data. It isn't a matter of sticking a SATA HDD on each node and calling it a day.

Do us a favor (0)

Anonymous Coward | more than 2 years ago

Some of us are considering H/W upgrades to run this new Windows 8, and we would be immensely thankful if you could tell us whether this new Windows 8 can run OK on it. Be honest about the results.

Realistic Simulations... (0)

Anonymous Coward | more than 2 years ago

By which I mean furthering the science of computer generated porn to lifelike qualities. Then hopefully using that power to create scenes with your coworkers that will leave them traumatized and cowering in a corner. And even more afraid of clowns.

