
A Look Inside Oak Ridge Lab's Supercomputing Facility

timothy posted about 2 years ago | from the plus-a-fun-museum dept.


1sockchuck writes "Three of the world's most powerful supercomputers live in adjacent aisles within a single data center at Oak Ridge National Laboratory in Tennessee. Inside this facility, technicians are busy installing new GPUs into the Jaguar supercomputer, the final step in its transformation into a more powerful system that will be known as Titan. The Oak Ridge team expects the GPU-accelerated machine to reach 20 petaflops, which should make it the fastest supercomputer in the Top 500. Data Center Knowledge has a story and photos looking at this unique facility, which also houses the Kraken machine from the University of Tennessee and NOAA's Gaea supercomputer."


59 comments


Can we build ... (3, Funny)

PPH (736903) | about 2 years ago | (#41300075)

... a Beowulf cluster of these?

Re:Can we build ... (0)

Mr. Kinky (2726685) | about 2 years ago | (#41300099)

Or a football stadium of them [youtube.com] ... wait for it - WHOOHOO!

Re:Can we build ... (0)

Anonymous Coward | about 2 years ago | (#41300175)

If they need more power they can release the Kraken.

Re:Can we build ... (2)

smitty_one_each (243267) | about 2 years ago | (#41300411)

And if they need BFD power, they can release the Biden.

Re:Can we build ... (1)

Dr. Sheldon Cooper (2726841) | about 2 years ago | (#41301963)

Oh, dear Lord, what fresh hell is this?

Re:Can we build ... (1)

smitty_one_each (243267) | about 2 years ago | (#41302627)

Hell if I know?

Re:Can we build ... (2)

mcgrew (92797) | about 2 years ago | (#41303131)

With all due respect, Dr. Cooper, are you on crack?

-- George Smoot

(Oh, for God's sake, I clicked the "post anonymously" button 25 minutes after the last comment I posted and it says I didn't wait long enough. Good way to spoil a joke... No, I'm not really Dr. Smoot. Posting under my real name is the only way to post this. Are you on the Slashdot staff, Dr. Cooper?)

Oh, BTW, BAZINGA!

Year of the linux supercomputer (0)

Anonymous Coward | about 2 years ago | (#41300079)

Well, maybe THIS YEAR will finally be the year of the linux supercomputer.

D'OH!

Re:Year of the linux supercomputer (1)

noobermin (1950642) | about 2 years ago | (#41300457)

Linux has been on top in supercomputer usage since the mid-2000s.

http://en.wikipedia.org/wiki/Usage_share_of_operating_systems#Supercomputers [wikipedia.org]

Re:Year of the linux supercomputer (0)

Anonymous Coward | about 2 years ago | (#41300669)

You must have missed the "d'oh".

D'OH!

Howmuch do these compotores weigh? (1)

For a Free Internet (1594621) | about 2 years ago | (#41300089)

Euler's rule of compotores states that the flop ofa compotore has an upper bound proportional to the binary log of its weight, because of relativistic effects that begin to comeinto play on the electrones and their dopings.

I suspct that the "scientists" at Oak Ridge didn't even know that and it will soon bitethem in their asses, or behinds if you want to be polite about it, whatever, this is Slashdort, people.

Re:Howmuch do these compotores weigh? (1)

noobermin (1950642) | about 2 years ago | (#41300469)

"UNITE with the Campaign for a Free Internet because today, our future begins with tomorrow!"

Freedom for more posts from people like you? I might need to reconsider this whole net neutrality thing...

Re:Howmuch do these compotores weigh? (0)

Anonymous Coward | about 2 years ago | (#41301929)

I would think density would matter more than total weight as far as relativistic effects at that scale. I would also think sanity matters when it comes to evaluating such effects.

What? (0, Redundant)

Anonymous Coward | about 2 years ago | (#41300095)

Only the fastest in the top 500? Not the fastest there is?

Re:What? (1)

craigminah (1885846) | about 2 years ago | (#41300177)

Yeah, WTF does that mean, "...which should make it the fastest supercomputer in the Top 500..."? If it's the fastest, then it's the fastest regardless of the pool it's compared against (e.g. fastest of the top 1,000,000 or fastest of the top 2), so it's meaningless to include that unless the goal was to confuse people.

Re:What? (2)

serviscope_minor (664417) | about 2 years ago | (#41300269)

Yeah, WTF does that mean, "...which should make it the fastest supercomputer in the Top 500..."

Note the capitalisation of T in Top 500, which you quoted correctly! It's the name of the list of the fastest supercomputers.

Now, please hand in your nerd card on the way out.

Re:What? (4, Informative)

WhitePanther5000 (766529) | about 2 years ago | (#41300271)

The Top 500 is a specific list: http://top500.org/ [top500.org]

It's more correct to say it's the fastest on the list, than the fastest in the world. There are any number of metrics you can use to compare supercomputers. Top 500 just uses the most popular metric. Another machine could easily be the fastest on a different list, like http://www.graph500.org/ [graph500.org] .

Re:What? (3, Informative)

jeffmeden (135043) | about 2 years ago | (#41300655)

The Top 500 is a specific list: http://top500.org/ [top500.org]

It's more correct to say it's the fastest on the list, than the fastest in the world. There are any number of metrics you can use to compare supercomputers. Top 500 just uses the most popular metric. Another machine could easily be the fastest on a different list, like http://www.graph500.org/ [graph500.org] .

The other specific consideration is that the list is ONLY for those that volunteer to run the Linpack benchmark and wish to publicize the results. It is presumed that governments with classified computing facilities withhold this information, for obvious reasons, so there are likely many "supercomputers" (perhaps even a "fastest") that will never be part of the Top 500. The US NSA, for example, is widely believed to operate facilities at or near the top of the list, but they are nowhere in sight for obvious reasons.

Re:What? (0)

Anonymous Coward | about 2 years ago | (#41300293)

The Top 500 isn't a list of the fastest supercomputers in the world; the Top 500 is a list of the fastest supercomputers in the world according to a specific metric (the LINPACK benchmark); and it doesn't proclaim to include all of the computers in the world either. Therefore, the "fastest supercomputer in the Top 500" means just that: the fastest computer on the Top 500 list at top500.org.

Re:What? (2)

Sarten-X (1102295) | about 2 years ago | (#41300311)

Unless there are faster ones that aren't considered for comparison, such as secret military projects.

Re:What? (0)

Anonymous Coward | about 2 years ago | (#41300485)

The Top500 list only comes out twice a year: November and June. As the Jaguar system, which is listed on the current Top500 list, is being upgraded to become Titan, which has not yet been registered on the list, they are estimating where it should fall on the list that comes out in November. Thus, "...which should make it the fastest supercomputer in the Top 500...".

-Anon

Where is the T-437 Safety Command Console? (0)

Anonymous Coward | about 2 years ago | (#41300101)

Where is the T-437 Safety Command Console? and the big board of nuke plants

Is it Symbiotic? (1)

Quiet_Desperation (858215) | about 2 years ago | (#41300111)

the final step in its transformation into a more powerful system that will be known as Titan.

Oh, that is not even its final form.

Re:Is it Symbiotic? (1)

Mister Transistor (259842) | about 2 years ago | (#41300205)

No, the final version will be called Colossus!

"This is the voice of World Control" (1)

Quiet_Desperation (858215) | about 2 years ago | (#41300369)

Ah, one of my favorite films. Still stands up even today.

Re:"This is the voice of World Control" (1)

Safety Cap (253500) | about 2 years ago | (#41301679)

Especially when it demanded a voice box so it could talk to the lowly humans. :)


Redundant? (1)

MyLongNickName (822545) | about 2 years ago | (#41300183)

" which should make it the fastest supercomputer in the Top 500"

At first I thought this was redundant, but then I wondered if there are faster supercomputers that simply are not independently verified to be in the top 500 supercomputers. Anyone have any more info, or am I just overthinking this?

Re:Redundant? (1)

Anonymous Coward | about 2 years ago | (#41300239)

Obviously a number of classified systems don't share their benchmark results, for one....

Re:Redundant? (1)

Anonymous Coward | about 2 years ago | (#41300287)

You know what they say about hiding things in plain sight....

Re:Redundant? (1)

MozeeToby (1163751) | about 2 years ago | (#41300737)

Different benchmarks will produce a different fastest super computer list. 'Top 500' is a specific list that uses a specific benchmark, a benchmark that this particular machine is currently at the top of. Using a different benchmark could be just as valid and produce a completely different list.

Re:Redundant? (1)

tomhath (637240) | about 2 years ago | (#41301175)

Google would probably top some benchmarks with their data centers. But they don't do number crunching like the Top 500.

I've been there. (1)

noobermin (1950642) | about 2 years ago | (#41300347)

I went there my sophomore year to check out Oak Ridge. I didn't go for computing, but since my guide knew that I like computing, he took me to look at the supercomputers. It's this huge room, visible through glass windows, which looked essentially like a huge clean, white office floor with all the cubicles removed and the supercomputers in their place instead.

At that time (2009?) I heard it wasn't really the fastest supercomputer, but it's awesome to hear they're revving it up to that. If I didn't hate TN so much, I'd go to UTK and try to work there in theoretical physics, with the prospect of one day contributing to some simulation that could run on it--provided people deem me smart enough to join that sort of thing. Sigh... dreams...

Re:I've been there. (-1)

Anonymous Coward | about 2 years ago | (#41300461)

You're fooling yourself. If you had anything to offer you wouldn't be wasting your time on Slashdot.

Re:I've been there. (1)

noobermin (1950642) | about 2 years ago | (#41300507)

Trufax. I have to read a section and do an outline on it that's due in two hours but yeah...lols.

Topology matters more than GFLOPS (5, Insightful)

bratmobile (550334) | about 2 years ago | (#41300467)

I really, really wish articles would stop saying that computer X has Y GFLOPS. It's almost meaningless, because when you're dealing with that much CPU power, the real challenge is to make the communications topology match the computational topology. That is, you need the physical structure of the computer to be very similar to the structure of the problem you are working on. If you're doing parallel processing (and of course you are, for systems like this), then you need to be able to break your problem into chunks, and map each chunk to a processor. Some problems are more easily divided into chunks than other problems. (Go read up on the "parallel dwarfs" for a description of how things can be divided up, if you're curious.)

I'll drill into an example. If you're doing a problem that can be spatially decomposed (fluid dynamics, molecular dynamics, etc.), then you can map regions of space to different processors. Then you run your simulation by having all the processors run for X time period (on your simulated timescale). At the end of the time period, each processor sends its results to its neighbors, and possibly to "far" neighbors if the forces exceed some threshold. In the worst case, every processor has to send a message to every other processor. Then, you run the simulation for the next time chunk. Depending on your data set, you may spend *FAR* more time sending the intermediate results between all the different processors than you do actually running the simulation. That's what I mean by matching the physical topology to the computational topology. In a system where the communications cost dominates the computation cost, then adding more processors usually doesn't help you *at all*, or can even slow down the entire system even more. So it's really meaningless to say "my cluster can do 500 GFLOPS", unless you are talking about the time that is actually spent doing productive simulation, not just time wasted waiting for communication.

Here's a (somewhat dumb) analogy. Let's say a Formula 1 race car can do a nominal 250 MPH. (The real number doesn't matter.) If you had 1000 F1 cars lined up, side by side, then how fast can you go? You're not going 250,000 MPH, that's for sure.

I'm not saying that this is not a real advance in supercomputing. What I am saying, is that you cannot measure the performance of any supercomputer with a single GFLOPS number. It's not an apples-to-apples comparison, unless you really are working on the exact same problem (like molecular dynamics). And in that case, you need some unit of measurement that is specific to that kind of problem. Maybe for molecular dynamics you could quantify the number of atoms being simulated, the average bond count, the length of time in every "tick" (the simulation time unit). THEN you could talk about how many of that unit your system can do, per second, rather than a meaningless number like GFLOPS.
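The communication-vs-computation tradeoff described above can be sketched with a toy cost model. Every constant here (flop rates, latency, bandwidth) is hypothetical and chosen only to illustrate the shape of the curve, not to describe any real machine:

```python
# Toy cost model for a spatially decomposed 2D simulation on p processors.
# All constants are illustrative assumptions, not measurements.

def step_time(n_cells, p, flop_per_cell=100.0, flops_per_sec=1e9,
              latency=1e-6, bytes_per_halo_cell=8.0, bandwidth=1e9):
    """Estimated wall time for one timestep when the domain is split into
    p square tiles: compute scales with the cells per tile, while the
    communication scales with the tile perimeter (halo exchange with
    4 neighbours, one message each)."""
    cells_per_tile = n_cells / p
    compute = cells_per_tile * flop_per_cell / flops_per_sec
    halo_cells = 4 * cells_per_tile ** 0.5        # perimeter of a square tile
    comm = 4 * latency + halo_cells * bytes_per_halo_cell / bandwidth
    return compute + comm

# Speedup falls away from ideal as communication starts to dominate.
n = 10_000_000
base = step_time(n, 1)
for p in (1, 16, 256, 4096, 65536):
    print(f"p={p:6d}  speedup={base / step_time(n, p):10.1f}  (ideal {p})")
```

Running it shows near-linear speedup at small p and a growing gap from the ideal at large p, which is exactly the "adding more processors doesn't help" regime the parent describes.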

Re:Topology matters more than GFLOPS (1)

TechMouse (1096513) | about 2 years ago | (#41301107)

Here's a (somewhat dumb) analogy. Let's say a Formula 1 race car can do a nominal 250 MPH. (The real number doesn't matter.) If you had 1000 F1 cars lined up, side by side, then how fast can you go? You're not going 250,000 MPH, that's for sure.

No... but collectively you cover the same distance, right?

Re:Topology matters more than GFLOPS (0)

Anonymous Coward | about 2 years ago | (#41302463)

Not really, that many cars would wreck after a few car lengths.

Even assuming you can find a wide enough lane to put them on.

Re:Topology matters more than GFLOPS (1)

TechMouse (1096513) | about 2 years ago | (#41308647)

I don't remember reading any stipulation about the nature of where the cars are placed.

I guess my point is that with ideal conditions, i.e. an infinitely wide and straight track (plus no mechanical failures, infinite tyres, come on we all understand the meaning of the words "ideal conditions"), then collectively 1000 cars cover the same distance as one car going 1000 times the speed.

Similarly with ideal conditions, i.e. perfect data parallelism, then collectively 1000 processors can process the same amount of data as one processor which is 1000 times as fast.

So I don't think it's a completely meaningless statistic provided you understand that there are other limits imposed upon what you can and can't do than sheer processing grunt.

Re:Topology matters more than GFLOPS (0)

Anonymous Coward | about 2 years ago | (#41301655)

Your rant shows a misunderstanding of what the unit GFLOPS means. A 1 GFLOPS computer doesn't mean that it can calculate a meaningless equation like '1.0 + 1.0' a billion times a second.

The Top500 list is an execution of the LINPACK benchmark, which involves a set of linear algebra operations on very, very large arrays. There are many scientific problems that can be represented using linear algebra. Solving the LINPACK benchmark takes N GFLOP of computation, which can be used to calculate the GFLOPS of the system, but performing the basic linear algebra operations on these large arrays entails a great deal of communication between the processor nodes.

You can argue that this benchmark isn't relevant to your particular problem (though I think fluid dynamics can be expressed as linear algebra systems) but you can't argue that the communication networks aren't included in the Top500 GFLOPS metric.

Re:Topology matters more than GFLOPS (1)

bratmobile (550334) | about 2 years ago | (#41305223)

Comparing LINPACK numbers makes sense. But GFLOPS (or TFLOPS or PFLOPS or whatever), by itself, is a meaningless and misleading number. Most people just stop thinking when they see a single metric like GFLOPS, and then they compare GFLOPS in one system to GFLOPS in another system. If those systems *are* comparable, then fine. But often enough, they are not comparable.

Also, it wasn't a rant. A rant would have involved caps lock.

Re:Topology matters more than GFLOPS (0)

Anonymous Coward | about 2 years ago | (#41302127)

After 20 years (pre parallel processing) in the sim space, I could not agree more. It's all about I/O... almost always has been.

Re:Topology matters more than GFLOPS (0)

Anonymous Coward | about 2 years ago | (#41303213)

Your one car-analogy car moves one guy at 250 MPH. Your 1000 CACs move 1000 guys at 250 MPH.

What's the problem?

As you say, we can't do real apples-to-apples comparisons because the computers are problem-tuned for different problems. But then isn't it inevitable that any comparison point used for yakking about "the world's fastest" is going to be vulnerable to quibbles?

You probably know a hell of a lot more about this than me. I'm just pointing out that your explanation isn't working down on my level, where it's directed. Please do have another go.

Re:Topology matters more than GFLOPS (1)

bratmobile (550334) | about 2 years ago | (#41305105)

Earth escape velocity is ~25,000 MPH. But 1000 F1 cars cannot reach orbit. Adding up numbers does not always give you a meaningful number.

Re:Topology matters more than GFLOPS (3, Informative)

Overunderrated (1518503) | about 2 years ago | (#41304259)

I'll drill into an example. If you're doing a problem that can be spatially decomposed (fluid dynamics, molecular dynamics, etc.), then you can map regions of space to different processors. Then you run your simulation by having all the processors run for X time period (on your simulated timescale). At the end of the time period, each processor sends its results to its neighbors, and possibly to "far" neighbors if the forces exceed some threshold. In the worst case, every processor has to send a message to every other processor. Then, you run the simulation for the next time chunk. Depending on your data set, you may spend *FAR* more time sending the intermediate results between all the different processors than you do actually running the simulation. That's what I mean by matching the physical topology to the computational topology. In a system where the communications cost dominates the computation cost, then adding more processors usually doesn't help you *at all*, or can even slow down the entire system even more. So it's really meaningless to say "my cluster can do 500 GFLOPS", unless you are talking about the time that is actually spent doing productive simulation, not just time wasted waiting for communication.

Considering that computational fluid dynamics, molecular dynamics, etc., break down into linear algebra operations, I'd say that the FLOPS count on a LINPACK benchmark is probably the best single metric available. In massively parallel CFD, we don't match the physical topology to the computational topology, because we don't (usually) build the physical topology. But I can and do match the computational topology to the physical one.

Re:Topology matters more than GFLOPS (1)

bratmobile (550334) | about 2 years ago | (#41305137)

Yes, many problems can be expressed as dense linear algebra, and so measuring and comparing LINPACK perf for these makes sense for those problems. However, many problems don't map well to dense linear algebra. The Berkeley "parallel dwarfs" paper expresses this idea better than I ever could: http://view.eecs.berkeley.edu/wiki/Dwarfs [berkeley.edu]

Re:Topology matters more than GFLOPS (1)

Overunderrated (1518503) | about 2 years ago | (#41305821)

Yes, many problems can be expressed as dense linear algebra, and so measuring and comparing LINPACK perf for these makes sense for those problems. However, many problems don't map well to dense linear algebra.

Sure, but as far as I've seen, linear algebra problems dominate the runtime of these very large systems. That's what I use them for.

At least the first 6 on that dwarfs list are done daily on top500 machines. I write parallel spectral methods, and use structured and unstructured grids. Achieving high scaling on these on massively parallel machines is not at all what I would call an open problem (as far as correctly using a given network for a problem, or designing a network for a given problem). For any given network topology, breaking these problems into parallel chunks that are best for the particular topology is just a graph partitioning problem with a trivially sized graph. That little white paper strikes me as being somewhere between "obvious to anyone that's done high performance computing in the last 40 years" , "putting a cute name on old hat", and just plain silly.

Re:Topology matters more than GFLOPS (1)

Anonymous Coward | about 2 years ago | (#41305687)

You're doing it wrong. Line them up end-to-end, not side-by-side.

Professor Oak (0)

Anonymous Coward | about 2 years ago | (#41301085)

So this is where they make all the Pokemons

Are there any specs on drive types & busses us (0)

Anonymous Coward | about 2 years ago | (#41302217)

See subject line above - I/O latencies also matter for certain types of processing, which is why I asked (SSD vs. HDD, and caching methods in hardware (caching controllers, or onboard flash "hybrid" caching, etc.)).

* Just curious, & Thanks-In-Advance for the information on the I/O architecture in the area of disks...

APK

P.S.=> It'd be interesting to know, beyond the clustering and CPU/GPU processing specs, which seem to be (perhaps rightfully so) most of what folks here concern themselves with in these "supercomputers"...

... apk

The nuclear establishment in the post-nuke era (1)

Animats (122034) | about 2 years ago | (#41302861)

The US still has these Big Science centers left over from the glory years. There's Oak Ridge, Los Alamos, and the Lawrence Livermore Senior Activity Center (er, "stockpile stewardship"), plus the NASA centers. Their original missions (designing bombs, sending people to the Moon) are long gone, but nobody turned off the money, so they keep looking for something, anything, to justify the pork.

The atomic centers are all located in the middle of nowhere. This was originally done for good reasons - their existence was originally secret, and something might blow up. (Well, Lawrence Livermore was in the middle of nowhere, but the Bay Area has grown to reach it.) As a result, they're major employers in their states, so they have unusual political clout.

The question is whether this is a good way to do science. Should that funding go through NSF instead?

Re:The nuclear establishment in the post-nuke era (0)

Anonymous Coward | about 2 years ago | (#41304391)

Not all of them are like that. Brookhaven National Laboratory actually has public lab spaces. The NSLS-II, when completed, can be used by any company or student. They are even partnering with local energy companies like LIPA for studies (they partnered to put a solar farm on the property).

Re:The nuclear establishment in the post-nuke era (1)

ks*nut (985334) | about 2 years ago | (#41306597)

Umm, what is this "post-nuke" era? There's a reason they have huge computing capability - the nukes haven't gone away; we just don't talk about them anymore. And they don't just sit around gathering dust; they must be carefully maintained, and a huge amount of computing power is expended in "improving" them. And you may rest assured that the nuclear establishment is developing new tactical and strategic nuclear weapons for specialized applications, again using vast amounts of computing power.

Re:The nuclear establishment in the post-nuke era (1)

SeanAhern (25764) | about 2 years ago | (#41307343)

Please do your homework first. While the supercomputers at Lawrence Livermore, Los Alamos, and Sandia National Laboratories are primarily used for nuclear weapons work, the work of keeping the country's huge stockpile safe and reliable is a gigantic job, especially if you don't want to actually detonate any of the warheads. Yep, that's the trick. Simulate the ENTIRE weapon, from high explosive initiation all the way to final weapon delivery. With all of the hydrodynamics, chemistry, materials science, nuclear physics, and thermodynamics modeled accurately enough to be able to say with confidence that the entire stockpile is reliable and safe. Hard job! Someone likened it to having a fleet of thousands of cars that you can never start, but must certify are road-worthy the instant you turn the key. For 50 years.

But let's go past this. There are three other major Department of Energy laboratories that have major computing centers: Oak Ridge, Argonne, and Lawrence Berkeley National Laboratories. Beyond just the nuclear weapons work that the first three labs do, all six labs use their massive computing power to advance the understanding of the Earth's changing climate, develop new materials, design new battery technologies, design new drugs, impact energy efficiency in vehicles and buildings, understand geology and groundwater propagation, help develop new power grid systems, design technologies for carbon sequestration, and delve into the origins of the universe. "Left over from the glory years"? Hardly.

And let's go beyond the Department of Energy. The National Science Foundation, as you suggest, has funded high-performance computing for years. There are at least five major computing centers that the NSF funds for an even wider range of scientific computing endeavors: the San Diego Supercomputing Center, the Pittsburgh Supercomputing Center, the National Center for Supercomputing Applications (NCSA) at the University of Illinois, the Texas Advanced Computing Center (TACC) at the University of Texas at Austin, and the National Institute for Computational Sciences (NICS) at the University of Tennessee Knoxville. If you want to get a small sense of what the NSF funds in this area, look at the XSEDE web site (https://www.xsede.org/).

(Disclaimer: I work for Oak Ridge National Laboratory's supercomputing center, have worked at Lawrence Livermore National Laboratory's supercomputing center, and am currently helping to run the University of Tennessee NICS computing center.)

Titan is a decent enough name, but (0)

Anonymous Coward | about 2 years ago | (#41303505)

it would be funnier if it were named 'tighten'.

Stone Soupercomputer at Oak Ridge National Labs (0)

Anonymous Coward | about 2 years ago | (#41303841)

The story of supercomputing at Oak Ridge would not be complete without the Stone Soupercomputer, built from surplus office desktop computers in the late 1990s:

http://en.wikipedia.org/wiki/Stone_Soupercomputer

Sounds like the plot of a video game (0)

Anonymous Coward | about 2 years ago | (#41305447)

Isn't putting a supercomputer called Titan in a superconducting-magnet research complex founded during the Manhattan Project just asking for some sort of resonance cascade failure?

Pictures (1)

raftpeople (844215) | about 2 years ago | (#41305713)

Looks like they clustered some Pepsi machines

Nuke'em from orbit (1)

FyberOptic (813904) | about 2 years ago | (#41305873)

While it's certainly fascinating to hear about the machine itself, it's easy to forget part of why it exists: simulating destruction. The Manhattan Project also came from Oak Ridge, if you recall.

As someone who lives in the region, nobody is particularly keen on what possibly goes on at these places. There are various "secret" military installations scattered around here, from Oak Ridge to Holston Army Ammunition. Between what we factually know is buried under and developed at these places, and what is rumored to, it can be a bit unnerving at times to consider that you live in what amounts to one of the ground zeroes of the country if anyone ever decided to start trouble. Or, likewise, ground zero if anything were to go catastrophically wrong.
