
A Look Inside the NCSA

ScuttleMonkey posted more than 6 years ago | from the dave-you-like-your-ncsa-friends-better-than-me-don't-you dept.

Supercomputing 89

Peter Kern writes "The National Center for Supercomputing Applications (NCSA) is one of the great supercomputing facilities in the world and is home to 'Abe', one of the top 10 supercomputers on the current Top 500 list. TG Daily recently toured the facility and published a stunning report about their computing capabilities (more than 140 teraflops), power requirements (a sustained 1.7 megawatts), enormous 20-ft chillers in four cooling systems and other installations that keep the NCSA online."


Wow! (0)

Anonymous Coward | more than 6 years ago | (#19719667)

Just imagine a Beowulf cluster of these things!

Re:Wow! (0)

Anonymous Coward | more than 6 years ago | (#19719993)

Let's see with 2 you can get 140 * 2 = "The BlueGene/L reached a Linpack benchmark performance of 280.6 TFlop/s ("teraflops" or trillions of calculations per second)."

3 and they could be #1 in the world :)

Re:Wow! (0, Redundant)

Anonymous Coward | more than 6 years ago | (#19720617)

Probably almost good enough to run Vista.

Re:Wow! (2, Funny)

Tmack (593755) | more than 6 years ago | (#19722585)

Let's see with 2 you can get 140 * 2 = "The BlueGene/L reached a Linpack benchmark performance of 280.6 TFlop/s ("teraflops" or trillions of calculations per second)."

3 and they could be #1 in the world :)

And you can power 711 of them with one Mr Fusion! [wikipedia.org]

Tm

BlueGene/L (1)

CraniumDesigns (1113153) | more than 6 years ago | (#19721481)

The BlueGene/L at Lawrence Livermore National Lab is still the fastest. Used to work there as a web designer. Scary.

DIY Beowulf (1)

spaceyhackerlady (462530) | more than 6 years ago | (#19723043)

I sometimes toy with the idea of going to the various used computer stores, buying a pallet of used computers and making my very own Beowulf cluster. I've seen pallets of fast P3 and low-end P4 boxes at interesting prices. Boeing Surplus [boeing.com] have large numbers of essentially identical computers almost every time I go there. I remember once looking through a big bin for a particular size wrench and grumbling to the sales person "Surely there is something bolted to a 747 with these size bolts!" They laughed...

The alternative would be to do something with new motherboards and processors. Might even get a break on the electricity.

...laura

Let's Get It Out of the Way (0, Redundant)

WED Fan (911325) | more than 6 years ago | (#19719671)

Imagine a Beowulf cluster of these.

But, does it run Linux.

In Soviet Russia NCSA Looks Inside of You.

Re:So close (0)

Anonymous Coward | more than 6 years ago | (#19719823)

But you failed to raise the question, will it run Vista?

Re:So close (1, Redundant)

WED Fan (911325) | more than 6 years ago | (#19720003)

You can run Vista, with most of the features turned off.

Cause of Global Warming (5, Funny)

ISoldat53 (977164) | more than 6 years ago | (#19720353)

New computer simulations indicate that supercomputers are a major source of global warming.

Re:Let's Get It Out of the Way (1)

feedmetrolls (1108119) | more than 7 years ago | (#19724089)

You forgot "All of your teraflops are belong to us!"

And:

1. Look inside NCSA
2. ?????
3. Profit!

Frosty Piss Failure (-1, Troll)

Anonymous Coward | more than 6 years ago | (#19719679)

I fuck dead dogs.

So what? (-1, Troll)

Anonymous Coward | more than 6 years ago | (#19719877)

I once fucked Schrodingers pussy!

frist (-1, Redundant)

Anonymous Coward | more than 6 years ago | (#19719683)

a beowulf cluster of these?

That's a lot of number crunching (3, Funny)

Junior J. Junior III (192702) | more than 6 years ago | (#19719689)

Who knew Mosaic was so bloated? No wonder no one uses it anymore.

Re:That's a lot of number crunching (0)

Anonymous Coward | more than 6 years ago | (#19720159)

"You may have come across the name NCSA before as this place is recognized as the origin of the web browser. It was here where Marc Andreesen, who later became the co-founder of Netscape, and Eric Bina invented the Mosaic browser in the early 1990s."

Of course, as often happens, they leave out Tim Berners-Lee, who actually invented the Web browser at CERN (on a NeXT, of course). Mosaic was just the first in widespread use, and Andreessen's biggest contribution, as I understand it, was the inline IMG tag.

Re:That's a lot of number crunching (1)

g-san (93038) | more than 7 years ago | (#19724611)

As long as you remember when you did use it.

Job requirements... (2, Interesting)

ushering05401 (1086795) | more than 6 years ago | (#19719697)

Just out of curiosity... does anyone know minimum requirements for getting on as a server tech in a place like that?

Really contemplating computing power like they describe is a pretty far out exercise for a small time programmer like me... What sort of people get employed at these places?

Regards.

Re:Job requirements... (2, Funny)

Timesprout (579035) | more than 6 years ago | (#19719751)

Apparently you need to be 20 feet tall and have a very relaxed attitude and a cool head.

Re:Job requirements... (5, Informative)

morgan_greywolf (835522) | more than 6 years ago | (#19719967)

I'd make sure I had ample experience in the systems and networking administration arenas. Know multiple flavors of UNIX, know Linux, and know multiple clustering technologies -- everything from shared-memory architectures to high performance clustering to grid computing to high availability systems. Know the systems available from multiple vendors -- IBM, HP, Sun, Red Hat, Veritas. Knowing storage area networking is pretty smart also. Know networks -- understand them at all levels in the OSI and TCP/IP models. Understand application and system-level debugging. Understand how to analyze the performance of a complete system, from the application level all the way to the lowest levels of an individual node.

Oh, and being able to think on your feet, the ability to communicate with engineers and scientists, and being very organized and able to work independently doesn't hurt either.

Re:Job requirements... (2, Funny)

Linker3000 (626634) | more than 6 years ago | (#19720087)

...so CompTIA's Network+ and a bit of bench experience should do the trick eh?

Re:Job requirements... (1)

Colin Smith (2679) | more than 6 years ago | (#19720827)

So... Old fogeys then.

 

Re:Job requirements... (1)

morgan_greywolf (835522) | more than 6 years ago | (#19721175)

Great. Glad to know I'm an 'old fogey' at 34. :(

Re:Job requirements... (1)

Colin Smith (2679) | more than 6 years ago | (#19721859)

Sorry mate. After 25 it's all down hill. The body starts atrophying and the mind goes with it.

 

Re:Job requirements... (1)

kylehase (982334) | more than 7 years ago | (#19726513)

I'd get Novell certifications. SLES seems to dominate as the OS for the top 10.

Re:Job requirements... (1)

morgan_greywolf (835522) | more than 7 years ago | (#19729087)

Certs aren't all they're cracked up to be. I've been working in this field for many years and I'm near the top of the payscale. Any idea how many certs I have? Zero. As in none. Experience and a degree are far more valuable than certs, IMHO.

Re:Job requirements... (1)

tehcyder (746570) | more than 7 years ago | (#19742521)

How about being a MCSE?

Re:Job requirements... (0)

Anonymous Coward | more than 6 years ago | (#19720139)

You might try looking here: http://www.ncsa.uiuc.edu/AboutUs/Employment/ [uiuc.edu]

Re:Job requirements... (1)

ushering05401 (1086795) | more than 6 years ago | (#19720231)

Thanks, but I did see that. It only lists common majors and the fact that you need to list all relevant work experience + three references. No specifics on the types of people who are likely to be employed outside of some vague areas of study.

Re:Job requirements... (0)

Anonymous Coward | more than 6 years ago | (#19720643)

The NCSA is located on the University of Illinois campus. Typically undergraduate students are employed as server techs.

Re:Job requirements... (0)

Anonymous Coward | more than 6 years ago | (#19721283)

No, they are not. Full-timers and maybe grad students; no undergrads unless they're exceptional. That's not the typical undergrad these days.

Re:Job requirements... (1)

CopaceticOpus (965603) | more than 7 years ago | (#19724327)

I worked for NCSA as a web developer/researcher. Not all of the jobs there involve complex supercomputing tasks, so you may find an opening doing web development, networking, basic tech support, etc. From that point if you are able to train yourself and network inside the organization, you could probably move towards working with the big servers in time.

Re:Job requirements... (1)

gauauu (649169) | more than 7 years ago | (#19730201)

I'm a programmer at NCSA....there are a number of other small research projects we do, other than just supercomputer-related. I have a bachelors degree in CS, nothing special.

Printable Link - All in one page (1)

_Sharp'r_ (649297) | more than 6 years ago | (#19719699)

Printable Link [tgdaily.com] - All in one page.

My prediction is that in 10 years the place will be functionally obsolete as a result of processing advancements elsewhere.

Re:Printable Link - All in one page (2, Funny)

WED Fan (911325) | more than 6 years ago | (#19719755)

In 10 years, this will be on the desktop, everyone will yawn because we have been boiled frogs and it won't impress us then. In 10 years, you'll look at someone's tie clasp computer and say, "Wow, I remember when that took up an 8 by 18 block of my desk."

In 10 years, DARPA will announce the shutdown of the Quantum Computing Project because it will be discovered that every time Red Hat Mandriva Winux OS/Q green screens, a parallel universe winks out of existence.

In 10 years, they'll slap wheels on your grandmother's behind and call her a wagon.

Re:Printable Link - All in one page (1)

laejoh (648921) | more than 6 years ago | (#19721525)

A great way to finish off all those 'if-questions':

In 10 years, they'll glue balls to your aunt and call her your uncle!

Re:Printable Link - All in one page (1)

caffeinemessiah (918089) | more than 6 years ago | (#19720245)

My prediction is that in 10 years the place will be functionally obsolete as a result of processing advancements elsewhere.

NCSA has been around for a long time and will be around for a long time more. Your prediction is based on the assumption that the systems at NCSA are static, which is completely untrue. If the government decides to start up a mega-super-quantum-ultra-computing project, NCSA is pretty high on the list of places that are going to get the grant.

Re:Printable Link - All in one page (1)

_Sharp'r_ (649297) | more than 6 years ago | (#19720369)

No, my prediction is based on the idea that processing power will have overtaken processing needs by so much in 10 years that it will be pointless to have a dedicated processing facility. Sure, maybe it'll take 20 years, but it's going to happen.

I did actually RTFA and see that they have several generations of hardware in use as they continually upgrade.

Re:Printable Link - All in one page (1)

gardyloo (512791) | more than 6 years ago | (#19720775)

No, my prediction is based on the idea that processing power will have overtaken processing needs by so much in 10 years that it will be pointless to have a dedicated processing facility. Sure, maybe it'll take 20 years, but it's going to happen.

    Piffle. There will be a new version of Windows by then. That will eat up at least 50% of this new processing power.

Re:Printable Link - All in one page (1)

BeBoxer (14448) | more than 6 years ago | (#19721417)

processing power will have overtaken processing needs by so much in 10 years that it will be pointless to have a dedicated processing facility.

Why would you expect this to ever happen? When it comes to modeling the behavior of physical systems, whether it's the weather and climate or molecular structure, I don't think there is a limit to processing "needs". More power just means you can run more models, or more accurate models, or bigger models. I'm not sure why you would expect that to change.

Re:Printable Link - All in one page (1)

peawee03 (714493) | more than 7 years ago | (#19738241)

I'm an undergrad assistant sysadmin and a programmer for a department here at UIUC. As a sysadmin, one of my primary responsibilities is maintaining and running our 41-node Linux cluster and the associated mass storage system. As a programmer, I'm responsible for hacking on a climate model that will be a rather big deal once it works, due to it running a very fine resolution model of the global system.

Something I've noticed is that once a professor has gotten done modeling something, the immediate response is doubling the resolution of the model. Or quadruple it. And then they want to do that, and make the model track five times the amount of variables in the system. And then they're going to be plugging another model into the original model that might even square the amount of computational power necessary. And lastly, they're going to want to also take the model from investigating 10 years worth of time, to possibly the next two centuries.

Scientist-type computing needs are basically gaseous; they will expand to fill whatever space they're given. There will always be a necessity for far-above-normal number crunching capability; people are constantly capable of figuring out available_computrons^2 ways of using a given amount of computing power. When there's 1024-core CPUs in desktops, professors are going to be wanting millions of vector processors because their .00001 degree global climate/biology model only processes in real time (1 model day / day) given their available resources. Today I'm working on a .1 degree model that has to run on Cobalt at the NCSA due to its massive memory and processor requirements (28 CPUs to make it run at real time, with 18 GB of RAM per CPU. Memory need scales linearly, processing power doesn't, so you'll need more than a linear increase in processors for a linear increase in model speed).
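For a rough feel of that scaling, here's a back-of-envelope sketch. The baseline figures (0.1-degree model, 28 CPUs, 18 GB per CPU) come from the comment above; the scaling exponents are illustrative assumptions, not measurements:

```python
# Back-of-envelope estimate of how a climate model's resource needs grow
# with resolution. Memory is assumed linear in grid-cell count; CPU need
# is assumed super-linear (a finer grid also forces a shorter timestep).

def grid_cells(resolution_deg):
    """Cells in a global lat/lon grid at the given resolution."""
    return int((180 / resolution_deg) * (360 / resolution_deg))

def estimate(resolution_deg, base_res=0.1, base_cpus=28, base_gb_per_cpu=18):
    """Scale from the 0.1-degree baseline mentioned in the comment."""
    factor = grid_cells(resolution_deg) / grid_cells(base_res)
    mem_gb = base_cpus * base_gb_per_cpu * factor   # linear in cell count
    cpus = base_cpus * factor ** 1.5                # assumed super-linear
    return mem_gb, cpus

for res in (0.1, 0.05, 0.01):
    mem, cpus = estimate(res)
    print(f"{res:5.2f} deg: ~{mem:,.0f} GB total RAM, ~{cpus:,.0f} CPUs")
```

Halving the resolution quadruples the cell count, so even under these mild assumptions the CPU requirement grows by roughly 8x per halving — which is the gaseous-expansion effect the comment describes.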

I Don't Know (0)

Anonymous Coward | more than 6 years ago | (#19719741)

My name is Jaap van Ballspoogen

and I have a problem!!1~112~!@four!`

Fire Protection System (2, Interesting)

BlueLightSpecial (898144) | more than 6 years ago | (#19719763)

Is there some way to perform a graceful shutdown before the water gets pumped and released? If the supercomputers are still on when the water is discharged from the pipes, wouldn't that damage the systems? If they don't want to use halon, why not use a more computer-friendly compound like FM-200 (http://en.wikipedia.org/wiki/FM-200 [wikipedia.org])?

Re:Fire Protection System (1, Informative)

blhack (921171) | more than 6 years ago | (#19719863)

The water isn't getting pumped all over the motherboards of these computers or something drastic like that. What they mean is that they keep super-chilled water on hand at all times. This way, should there be some catastrophic over-heating event, they already have cold water on hand; not the stuff that most liquid cooling systems use (which is just room temperature).

Re:Fire Protection System (1)

Radon360 (951529) | more than 6 years ago | (#19720427)

Uhm, go back and RTFA. No Halon system is used; a standard water sprinkler system is used. However, the one failsafe noted was that a smoke detector also had to activate in addition to the heat-fusible link in the sprinkler heads before torrents of water were released. So, you need both an indication of smoke AND excessive heat for things to start getting wet.

Re:Fire Protection System (1)

niew (133188) | more than 6 years ago | (#19720613)

the water isn't getting pumped all over the motherboards of these computers or something drastic like that. What they mean is that they keep super-chilled water on hand at all times.

Nope, they mean that if there's a fire, they're dumping tonnes of water directly onto the computer cabinets that are burning... Once the computer's on fire, water can't hurt it much further...

This is a pre-action water system and they're becoming more popular in computer rooms now that Halon is falling out of favor. They start out dry (uncharged), then if smoke/heat is detected (or other combinations of pre-action's) they charge with water, but still don't 'go off'. Then, when a fire below a sprinkler head gets hot enough to melt the fusible link, that sprinkler head goes off dumping water onto the fire below (and around) it.

In this way, false discharges are rare since it requires a pre-action before there's even any water in the lines above the equipment; then the heads only go off where it gets hot enough to melt the fusible link. The rest of the room doesn't get wet (assuming the floor drains the water away effectively, and this place has a 6' raised floor!).
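The two-stage logic described above can be sketched as a toy state machine. This is purely illustrative, not fire-protection engineering; the 68 °C link rating is an assumed common value:

```python
# Toy model of a pre-action sprinkler system: pipes start dry, smoke
# detection charges the lines, and each head opens individually only
# when its own fusible link melts. Water flows only where both hold.

FUSIBLE_LINK_TEMP_C = 68  # a common fusible-link rating, assumed here

class PreActionSystem:
    def __init__(self):
        self.charged = False        # pipes start dry (uncharged)
        self.open_heads = set()

    def smoke_detected(self):
        # Pre-action event: the lines charge with water, but no head opens.
        self.charged = True

    def head_temperature(self, head_id, temp_c):
        # Each head opens individually when its fusible link melts.
        if temp_c >= FUSIBLE_LINK_TEMP_C:
            self.open_heads.add(head_id)

    def discharging_heads(self):
        # Water flows only where a head is open AND the lines are charged.
        return self.open_heads if self.charged else set()

pa = PreActionSystem()
pa.head_temperature(7, 120)            # heat alone: lines are still dry
assert pa.discharging_heads() == set()
pa.smoke_detected()                    # now both conditions are met at head 7
assert pa.discharging_heads() == {7}   # the rest of the room stays dry
```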

Re:Fire Protection System (1)

jhines (82154) | more than 6 years ago | (#19721599)

From the fine article, the sprinklers are kept dry, and 2 events must happen before they are activated.

Both a smoke alarm, and then each individual sprinkler head has a thermal link that must melt before activating.

With a 24/7/365 control room, so yes, they have a chance to shut things down.

The idea that tripping a single sprinkler head will set them all off at once is hollywood fiction. They are set off one by one, when a fusible link burns out at the sprinkler head. Fancier heads can shut off once the heat source is gone, to minimize water damage.

Deluge sprinklers (1)

DragonHawk (21256) | more than 7 years ago | (#19725005)

The idea that tripping a single sprinkler head will set them all off at once is hollywood fiction.

Just to be pedantic, such systems do exist. They're called "deluge sprinkler" systems. Like a pre-action system, the pipes are normally kept dry, until some external event triggers it. However, unlike a pre-action system, every sprinkler head is open, so once the water valve is opened, it immediately starts raining everywhere. Mainly used in places where any sign of fire warrants immediate drastic action, like a fuel depot.

Great, just waiting for it to become self-aware (0, Troll)

jollyreaper (513215) | more than 6 years ago | (#19719773)

Just reassure me that nobody has this hooked into our ICBM's. Knowing Bush has the button is scary enough.

Re:Great, just waiting for it to become self-aware (0, Troll)

jollyreaper (513215) | more than 6 years ago | (#19721357)

Just reassure me that nobody has this hooked into our ICBM's. Knowing Bush has the button is scary enough.
Troll? Just who the fuck do you think I'm trolling, Skynet? What, we don't want to make the big scary AI angry?

Re:Great, just waiting for it to become self-aware (1)

Anonymous Coward | more than 6 years ago | (#19722173)

No, you mentioned Bush in a negative context. To get modded up, you have to do it in the only positive way possible: Last night, I got bush. Or I got bushwhacked at the casino. Or wow, what a xxxx. Anything else, and you would have to be lying.

Re:Great, just waiting for it to become self-aware (1)

jollyreaper (513215) | more than 7 years ago | (#19725515)

No, you mentioned Bush in a negative context. To get modded up, you have to do it in the only positive way possible: Last night, I got bush. Or I got bushwhacked at the casino. Or wow, what a xxxx. Anything else, and you would have to be lying.
Oh. In that case, I'll wear the down-modding as a badge of honor. Hey, Bushies! Your guy pardons criminals!

Re:Great, just waiting for it to become self-aware (0, Flamebait)

jollyreaper (513215) | more than 7 years ago | (#19726301)

That's right, you Republican cocksuckers. Keep modding it down. I bet you guys have got your tiny little dicks all hard after this latest miscarriage of justice. Go ahead, rub one out, you've earned it.

/.ed already (5, Funny)

blhack (921171) | more than 6 years ago | (#19719781)

1.21 gigaflops and their webserver is an old guy with tourette syndrome yelling HTML code into a tin can on a string.

Re:/.ed already (1)

CaptainPatent (1087643) | more than 6 years ago | (#19720055)

1.21 gigaflops and their webserver is an old guy with tourette syndrome yelling HTML code into a tin can on a string.
and somebody cut the string!

Re:/.ed already (1)

JamesTRexx (675890) | more than 6 years ago | (#19720185)

Best. Analogy. Ever. :-)

Re:/.ed already (1)

niktemadur (793971) | more than 6 years ago | (#19722871)

Imagine a Beowulf cluster of old guys with Tourette's Syndrome yelling HTML code into a tin can on a string!

UIUC FTW! (1)

mewyn (663989) | more than 6 years ago | (#19719881)

As a prospective student of UIUC, who also has a good friend attending the school in a CS Ph.d. program, I get a bit giddy inside any time I see the university in the media. I've been to that building when my friend was giving me a mini tour of the facilities, although I didn't go see the supercomputers themselves.

Re:UIUC FTW! (2, Insightful)

panzagloba (1117959) | more than 6 years ago | (#19720147)

I have actually been in the newer facility dozens of times when I worked as an intern for the Architect on the building. I actually drafted the final drawings for this project. It is a VERY nice facility, with some pretty cool under-floor cooling systems and things like that. I am pretty sure I have 3D digital models of the facility somewhere in my work records.

The lecture auditorium bites the big one though, purple seats? Nasty. The Siebel Center across the mini-quad is a much more interesting building though, at least to an architect. :)

Re:UIUC FTW! (1)

rvega (630035) | more than 7 years ago | (#19726591)

Me too. As an undergrad at UIUC in the early '90s, I worked as a computer operator in the ACB (slide 4 in the FA). They had some cool hardware: two Crays, including the beautiful Cray 2 with the waterfall, two Connection Machines (a CM2 and a huge CM5), a Convex, a big SGI of some sort, and some ancillary systems. The CMs arrived when parallel supercomputing was just becoming popular. The CM5 was a work of art, but when you opened the side panels, it was actually just racks of Suns inside (if I remember right). But it was a fun time to work there. Mosaic was released when I was there. When I worked night shifts I played a lot of Maelstrom, and I'd stand on the roof sometimes and watch thunderstorms rolling across the prairie.

Re:UIUC FTW! (1)

per unit analyzer (240753) | more than 7 years ago | (#19732697)

There were a number of Sun workstations inside the CM5 that served as partition managers, but the computational horsepower was custom TMC hardware based on the SPARC processor. The CM5 was much more than a rack of Sun workstations. At NCSA, two of the cabinets had the Sun (SPARCstation 10???) partition managers, but the other three cabinets had nothing but the TMC hardware in them.

Re:UIUC FTW! (1)

rvega (630035) | more than 7 years ago | (#19741737)

I guess I saw inside the wrong cabinet! Nice to know -- I've always had this "pay no attention to the man behind the curtain" feeling about that system. Of course, many parallel supercomputers these days really are "just" racks of commodity CPUs. No shame in that. Anyway, TMC certainly knew how to make a good-looking box.

Porn (0)

Anonymous Coward | more than 6 years ago | (#19719955)

you know the guy running this thing uses up most of its power to store his massive collection. Of course, as usual, the 900+TB of files is "hidden" from his boss within a series of obscure sub-folders.

Use one of these to serve the article? (0)

Anonymous Coward | more than 6 years ago | (#19720093)

FTFA:

We are currently experiencing server load issues. We are addressing this problem at this time and will enable the slideshow related to this article shortly.

Oh, the irony of not having enough server horsepower to serve up a story expounding on supercomputing...

Scrubs... (-1, Troll)

Anonymous Coward | more than 6 years ago | (#19720175)

Scrubs...
I don't want no fuckin' scrubs

your mommy is a scrub and
your daddy is a crumb-bum!

Check the math (1)

linuxwrangler (582055) | more than 6 years ago | (#19720255)

All that supercomputing power and they come up with "a one-building electricity bill of $3 per second - or about $1,500,000 per year". I'm sure they meant $3/minute, which is much more in line with all the other figures they quote. Still, that puts their electricity rate in the 10-cents/kWh range - surprisingly high for a large industrial customer.
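The correction is easy to check; the only inputs below are the figures quoted in the article and comment ($3 per unit time, $1.5M/year, a sustained 1.7 MW):

```python
# Sanity-checking the article's numbers: $3/second is wildly inconsistent
# with $1.5M/year, but $3/minute fits, and it also matches a 1.7 MW
# sustained draw at roughly 10 cents per kWh.

MINUTES_PER_YEAR = 60 * 24 * 365

per_year_at_3_per_sec = 3 * 60 * MINUTES_PER_YEAR   # ~$94.6M: far too high
per_year_at_3_per_min = 3 * MINUTES_PER_YEAR        # ~$1.58M: matches

# Implied electricity rate from a 1.7 MW sustained draw at $3/minute:
kwh_per_min = 1700 / 60          # kWh consumed each minute at 1.7 MW
rate = 3 / kwh_per_min           # dollars per kWh

print(f"$3/sec -> ${per_year_at_3_per_sec:,} per year")
print(f"$3/min -> ${per_year_at_3_per_min:,} per year")
print(f"implied rate at 1.7 MW: ${rate:.3f}/kWh")
```

The implied rate works out to about $0.106/kWh, right in the 10-cent range the comment estimates.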

Re:Check the math (1)

Radon360 (951529) | more than 6 years ago | (#19720633)

You might be forgetting the peak demand charge. If you work ~$20 - $25 per kW of peak demand into the equation, or roughly $36,000 to $45,000 of their monthly bill, you start to get back down to $0.08/kWh for energy costs. And don't forget this is a large non-interruptible customer, so they will pay premium rates because ComEd (or whoever is there in East Central IL) can't take them offline on a hot day.

Re:Check the math (1)

afidel (530433) | more than 6 years ago | (#19721857)

Why are they a non-interruptible customer? Datacenters with backup generators should obviously be interruptible customers as long as the cost of peak power is more than the cost of running your own generators, which is probably the case when you have megawatts of local power generation capability.

*Yawn*. (1, Troll)

DerekLyons (302214) | more than 6 years ago | (#19720287)

From TFA: They also had two dedicated UPS boxes which stood six feet tall, three feet wide and 12 feet deep.
 
*Yawn*. Only impressive to the slashgeek with no real experience with heavy iron (i.e., most of them). When I was in the Navy and serving at a training center, we also had two UPS's this size. For each trainer/lab. And we had four labs.
 
Just in the Weapons Training end of the building.
 
Cooling and power conditioning for the training facility was in a seperate 15k sq ft building. Getting to the building from the facility was cool though - you went down into the basement, then down a ladder to a tunnel that ran the length of the building. Off that tunnel was another tunnel that ran out under the back parking lot to the support building. We used to joke about building a sniper range in one of the tunnels - they were that long.

Re:*Yawn*. (0)

Anonymous Coward | more than 6 years ago | (#19722909)

Yet you still can't spell "separate". Grunts have no use for spelling, eh?

Re:*Yawn*. (1)

DerekLyons (302214) | more than 7 years ago | (#19731727)

So why the moderation as 'troll'? I just get tired of the technogeek porn in these articles. ("oohh baby, what a Big UPS, and your cooling towers... so smooth and sexy!")
 
Articles like TFA always love to point out these systems - and they sound impressive to someone whose only experience is the desktop PC or small datacenter... But the reality is, they aren't anything rare or special. If you have the money, you can order one of those huge UPS's just about as casually as you can pick its smaller brethren off the shelf at Fry's. Big cooling towers are standard HVAC installations, unlike the liquid cooled PC, which is (currently) unusual and limited to the ubergeek crowd.

It's not "the NCSA" (1)

Minter92 (148860) | more than 6 years ago | (#19720391)

A point of clarification: I worked there for almost 10 years, and it's NCSA, NOT "the NCSA". We get touchy about that :)

Re:It's not "the NCSA" (3, Funny)

Intron (870560) | more than 6 years ago | (#19720537)

Wouldn't want to get confused with that other large supercomputer customer.

Re:It's not "the NCSA" (0)

Anonymous Coward | more than 6 years ago | (#19720819)

Why?

IllinoiS eh? (1)

WindBourne (631190) | more than 6 years ago | (#19722205)

Kind of raises hackles like somebody saying IllinoiS (vs. illinoi) or mississippi (vs. missippi). You know who is from there and who is not.

6 hour runs? (2, Interesting)

Vellmont (569020) | more than 6 years ago | (#19720689)

The most surprising thing in the article was how inelegantly they've solved the problem of inevitable hardware failure. That is, limiting runs to only 6 hours. It seems like there just HAS to be a better way to handle the problem than this! Virtualization sounds a bit tricky, so why not just write the software to handle hardware errors in the first place? I.e., produce results, check to see if there was a hardware failure; if so, re-do.

Maybe they already do this, and the reporter didn't catch it. But it'd surprise me if they didn't have better solutions than just hoping nothing bad happens during a run.
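A minimal version of that save-and-redo idea is a checkpoint/restart loop. The sketch below is generic Python with a hypothetical checkpoint file name, not how NCSA codes actually do it; real MPI codes typically checkpoint through parallel I/O libraries:

```python
# Checkpoint/restart sketch: periodically save state so a hardware
# failure costs only the work done since the last checkpoint, not the
# whole multi-hour run. File name and step function are hypothetical.

import os
import pickle

CHECKPOINT = "run.ckpt"   # assumed path for this sketch

def load_or_init():
    if os.path.exists(CHECKPOINT):
        with open(CHECKPOINT, "rb") as f:
            return pickle.load(f)        # resume after a crash
    return {"step": 0, "value": 0.0}     # fresh start

def save(state):
    # Write-then-rename so a crash mid-write can't corrupt the checkpoint.
    with open(CHECKPOINT + ".tmp", "wb") as f:
        pickle.dump(state, f)
    os.replace(CHECKPOINT + ".tmp", CHECKPOINT)

state = load_or_init()
while state["step"] < 1000:
    state["value"] += 1.0                # stand-in for one unit of real work
    state["step"] += 1
    if state["step"] % 100 == 0:         # checkpoint interval is a tunable
        save(state)

print(f"finished at step {state['step']}")
```

Killing the process at any point and rerunning it picks up from the last multiple-of-100 step instead of step zero.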

Re:6 hour runs? (1)

kpesler (982707) | more than 6 years ago | (#19721089)

Much of the software which is run at the NCSA is home-grown software written by computational scientists, not computer scientists. For many of these massively parallel codes, written on top of MPI, fault tolerance really isn't all that easy. For a commercial production code on the order of Gaussian, this may be doable, but for bleeding-edge research codes, it may be a better use of the (human) time to push the algorithms rather than worry about fault-tolerance. From the user's perspective, jobs that are killed due to a hardware failure have their service units refunded, so there isn't a huge incentive to worry about it. When we get to petascale, there won't be any way around it, since the MTBF will probably be a few minutes.
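The MTBF point has a standard quantitative side: Young's approximation picks a near-optimal checkpoint interval from the machine MTBF and the cost of writing one checkpoint. A sketch with illustrative numbers (the 30-second checkpoint cost is an assumption):

```python
# Young's approximation for checkpoint interval: work for about
# sqrt(2 * checkpoint_cost * MTBF) seconds between checkpoints.

import math

def young_interval(checkpoint_cost_s, mtbf_s):
    """Near-optimal seconds of work between checkpoints."""
    return math.sqrt(2 * checkpoint_cost_s * mtbf_s)

# A machine-wide MTBF of a few minutes, as the parent predicts for
# petascale, versus healthier MTBFs:
for mtbf_min in (5, 60, 24 * 60):
    t = young_interval(checkpoint_cost_s=30, mtbf_s=mtbf_min * 60)
    print(f"MTBF {mtbf_min:5d} min -> checkpoint every {t/60:6.1f} min")
```

With an MTBF of 5 minutes, the formula says to checkpoint roughly every 2 minutes, so a large fraction of the machine's time goes to writing checkpoints rather than science, which is why minutes-scale MTBF forces fault tolerance deeper into the stack.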

Re:6 hour runs? (2, Insightful)

Vellmont (569020) | more than 6 years ago | (#19722641)


Much of the software which is run at the NCSA is home-grown software written by computational scientists, not computer scientists.

I've seen code written by computational guys before. While not really terrible, it's not terribly re-usable or maintainable. Obviously these guys don't study computer science, but I truly think there are gains to be made if they understood the tool they were using better.

  For many of these massively parallel codes, written on top of MPI, fault tolerance really isn't all that easy. For a commercial production code on the order of Gaussian, this may be doable, but for bleeding-edge research codes, it may be a better use of the (human) time to push the algorithms rather than worry about fault-tolerance. From the user's perspective, jobs that are killed due to a hardware failure have their service units refunded, so there isn't a huge incentive to worry about it.


As long as your job runs under 6 hours, sure. But if it takes over 6 hours, you're already doing some kind of saving and re-starting. That's probably about 80% of what I'm talking about, just on a larger scale. I bet you're right though; it's all going to come to a head as there are more and more components that could fail, so it has to be fixed at a higher level, or the programmer level. Maybe you can fix the problem with virtualization, but how much of a performance hit do you take, or how much costlier is the machine?

Re:6 hour runs? (0)

Anonymous Coward | more than 6 years ago | (#19721837)

This is one of those things that sounds easy until you actually try it. Especially given that mostly the machines are a cycle shop running other people's code -- code that was written, usually, without the slightest concept of checkpointing.

I need an allocation (0)

Anonymous Coward | more than 6 years ago | (#19720925)

I have a copy of Mosaic 1.0 for Macintosh on a floppy disk. I'm getting read errors on it and thought they could renew it for me.

WTF... $70K per year, u wish? (0)

Anonymous Coward | more than 6 years ago | (#19721077)

RTFA | Suppose the average technician earns $70K per year. That's $21 million for the NCSA's staff annually, just in salaries. |

When I interviewed with them a couple of years ago, the UofI 5-year veterans were worth a whopping 36k to 40k a year. But I guess they were worth every penny, cuz the grad students were doing low-end technical stuff, not even statistical analysis. They almost cried when I told them I was doing data mining in the private sector. There is a huge disconnect between reality and the mystique of supercomputing. I would bet a good Beowulf cluster could cover the same ground as "Abe" in most areas they are using it. But hey, what do I know?

Lustre at NCSA (1)

JumboMessiah (316083) | more than 6 years ago | (#19721725)

Any of the folks in the loop know if they use Lustre [lustre.org] for their storage backend?

Re:Lustre at NCSA (0)

Anonymous Coward | more than 6 years ago | (#19722163)

They don't have IBM on the building where the NCSA is at the UofI for nothing.

Re:Lustre at NCSA (1, Informative)

Anonymous Coward | more than 6 years ago | (#19722239)

Yes, Abe uses Lustre [uiuc.edu] . As the other commenter suggested, GPFS is also in use.

you Fail it (-1, Redundant)

Anonymous Coward | more than 6 years ago | (#19722149)

stAndards should These early

Printer-Friendly Version All On One Page (0)

Anonymous Coward | more than 7 years ago | (#19724931)

Here's the whole article [tgdaily.com] on a printer-friendly page.

Nerd porn (1)

ruiner13 (527499) | more than 7 years ago | (#19725405)

This article is nothing more than nerd porn. Please wash your hands before returning to work.

Thanks,
The Management

A look inside the NCSA? (1)

ScrewMaster (602015) | more than 7 years ago | (#19725721)

I'd be more interested in a look inside the NSA ... of course, the lights might not be on.