Apple Wins VT in Cost vs. Performance

pudge posted more than 11 years ago | from the i-knew-it-all-along dept.

OS X 105

danigiri writes "Detailed notes about a presentation at Virginia Tech are posted by an attending student, who copied most of the slides of the presentation and wrote down their comments. He wrote some insightful notes and info snippets, like the fact that Apple gave the cheapest deal of machines with chassis, beating Dell, IBM, HP. They are definitely going to use some in-house fault-tolerance software to prevent the odd memory-bit error on such a bunch of non-error-tolerant RAM and any other hard or soft glitches. The G5 cluster will be accepting first apps around November." mfago adds, "Apple beat Dell, IBM and others based on Cost vs. Performance alone, and it will run Mac OS X because 'there is not enough support for Linux.'"

Woohooo! (-1, Troll)

Biotech9 (704202) | more than 11 years ago | (#6901405)

Apple are great! i am a zzealot\1

Cooling methods (0)

Biotech9 (704202) | more than 11 years ago | (#6901761)

traditional methods [fans] would have produced windspeeds of 60+ MPH....
I wonder how they are all orientated? I wonder if photos of the final result will be released.

water cooled laptops as blades (3, Interesting)

goombah99 (560566) | more than 11 years ago | (#6902816)

The Virginia folks must have one huge room with some massive air handlers to circulate the air that will be trapped behind the towering walls of 1000 4U boxes.

A few years ago I asked Apple if they would be willing to sell me 200 laptops without the screens, disks, video cards, and keyboards. They were interested in helping me build my cluster, but the engineers said it would actually cost them more to have a special manufacturing run than would be saved by deleting the hardware.

My plan was to stack these things on water cooled chill plates. Basically this would be like a blade.

In my circumstances, adding a well ventilated computer room to the building I was in would have been prohibitively expensive, but water cooling and a high density configuration made this very appealing. And if I could have gotten the costs down and reliability up by deleting the screens, keyboard, video, and disks I'd have an affordable system with low sys-admin costs.

I still think it's a good idea. Cooling/power costs (including building retrofits) and sys admin costs can dominate the differential purchase price of various cluster configurations. In my building the space alone was >120/sq foot, so even the footprint mattered.

Re:water cooled laptops as blades (4, Informative)

Johnny Mnemonic (176043) | more than 11 years ago | (#6902961)


The Virginia folks must have one huge room with some massive air handlers to circulate the air that will be trapped behind the towering walls of 1000 4U boxes.

I don't know any more than what's publicly available, but the VT folks in the know have said that they've designed a specialized, liquid based cooling system precisely because of the issues wrt cooling this many units. The FA makes reference to this many units generating windspeeds of 60mph from fans alone.

I am gonna guess that behind each G5 rack will be a radiator type arrangement, with cooled pipes flowing with a liquid that will carry the heat away from the internal airspace, much like a large car radiator. I don't know if that would be cost-effective, or what it would take to move that much liquid, or if the radiator could be made to transfer enough heat fast enough. Maybe the liquid cooling units actually replace the internal fans directly. Who knows--I think we'll get some more details on this this week as the G5s start to come out of their boxes. They've apparently received about 10% of them already.

Darl says: Revenge! (-1, Offtopic)

Mr. Darl McBride (704524) | more than 11 years ago | (#6901419)

This FIRST POST! brought to you by the GNAA.

This is for upping the iPod 10g two weeks after I got mine, Apple beeyotches!

Thank you very much.

astroturf!?! (-1, Funny)

Anonymous Coward | more than 11 years ago | (#6901510)

no mention of the free memory upgrades....

Free printer and Ipod case (4, Funny)

goombah99 (560566) | more than 11 years ago | (#6903220)

Man they really blew it. They should have ordered it from macmall. it would have come with 1000 free printers and 1000 ipod cases.

Re:Free printer and Ipod case (0)

Anonymous Coward | more than 11 years ago | (#6904045)

LOL!!! Right on! MacMall just saved me another $200 with the price drop in the [now old] iMacs. they're neat! :D

Re:Free printer and Ipod case (0)

Anonymous Coward | more than 11 years ago | (#6913487)

LMAO!!!! Extremely funny in the light of everything before....
Just before your comment is all this techno blather then suddenly:
Your's like a Fireworks dazzling display in the dead of night darkness of mumbo jumbos
of cooling the cluster.
Peace,
PIKE

Interesting (3, Interesting)

Isbiten (597220) | more than 11 years ago | (#6901612)

Dell - too expensive [one of the reasons for the project being so "hush hush" was that dell was exploring pricing options during bidding]

Who could have guessed? ;)

Re:Interesting (1)

heychris (587825) | more than 11 years ago | (#6903311)

> Dell - too expensive [one of the reasons for the project being so "hush hush" was that dell was exploring pricing options during bidding]

Here's my question, and no, I'm not trying to start a flame war...if they were going purely on a price/performance ratio, why didn't they let Dell "explore pricing options"? Presumably, Dell would have given them a better deal due to educational and/or prestige factors, if Dell really wanted the deal. If that was the case, the uni may have been better served with the Dell option.

I'm not saying that they *should* have done it that way, but if cost was an issue, one would think they would have put it out for a competitive bid. Unless, of course, Dell was trying to *raise* the prices...

CC

Re:Interesting (3, Insightful)

Ster (556540) | more than 11 years ago | (#6903457)

You misunderstand: it was hush-hush while Dell was exploring pricing options. Only after they came back with their lowest-price did Apple win the contract.

At least, that's the way I've been parsing it.

-Ster

Re:Interesting (2, Interesting)

Killigan (704699) | more than 11 years ago | (#6915010)

I've heard (from a fairly reliable source) that the project was actually postponed about 5 months or so while Dell worked on lowering its prices for VT. Finally Dell gave its lowest possible price, which still wasn't good enough for VT. So they did indeed give Dell plenty of time to try and beat Apple.

Re:Interesting (1)

afantee (562443) | more than 11 years ago | (#6922706)

>> Presumably, Dell would have given them a better deal due to educational and/or prestige factors, if Dell really wanted the deal. If that was the case, the uni may have been better served with the Dell option.

Oh, really? Does Dell have a competitive 64-bit solution? I don't think so. Even the 32-bit dual Xeon Dell is more expensive than the dual G5 Power Mac. Don't bother mentioning Itanium2, because it's too hot and expensive, and there are hardly any native apps, which might be why people are not buying them.

Apple Outshines Dell on Ethics (4, Interesting)

reporter (666905) | more than 11 years ago | (#6906355)

Even if Apple computers were to cost slightly more than Dell computers, we should consistently buy the former instead of the latter. Price is only 1 aspect of any product. There are also ethical considerations. They do not matter much outside of Western society, but they matter a great deal in Western society.

As an American company, Dell is a huge disgrace. Please read the "Environmental Report Card [svtc.org] " produced by the Silicon Valley Toxics Coalition [svtc.org] . Dell received a failing grade and is little better than Taiwanese companies, which are notorious for destroying the environment and the health of workers. Dell even resorted to prison labor [svtc.org] to implement its pathetic recycling program.

... from the desk of the reporter [geocities.com]

Re:Apple Outshines Dell on Ethics (0)

Anonymous Coward | more than 11 years ago | (#6907351)

There are also ethical considerations. They do not matter much outside of Western society, but they matter a great deal in Western society.

Just another ignorant red-neck.

Re:Apple Outshines Dell on Ethics (1)

JeffTL (667728) | more than 11 years ago | (#6915541)

My problem with Dell is that they tend to send you less than you paid for. I ordered a Dimension 8250 last December, which I am using to write this post. The sound card was not what they claimed and wouldn't accept Creative drivers, and I recently discovered that they didn't send a backup copy of Norton Antivirus as claimed.

Re:Apple Outshines Dell on Ethics (0)

Anonymous Coward | more than 11 years ago | (#6916190)

Dell received a failing grade and is little better than Taiwanese companies, which are notorious for destroying the environment and the health of workers.

Are you trolling, again? Pretty slick, sliding this in about Taiwan.

software to solve memory problems? (1, Interesting)

Anonymous Coward | more than 11 years ago | (#6901721)

How does that work?? How does software even KNOW if there was a glitch? Can I get this on my non-ECC Linux box????

IMHO the lack of ECC RAM is the only flaw in an otherwise perfect machine (well that, and the massive HEAT).

Re:software to solve memory problems? (1)

danigiri (310827) | more than 11 years ago | (#6904336)

Mmmmm... I bet that to know how that works one needs at least a PhD in CS, or a bunch of them, anyway. Believe me, these academics are SMART.

In my humble ignorance, I can devise a simple stratagem (surely far simpler, less efficient and dumber than the one used by VT). Just duplicate all calculations (effectively halving processing power) on different machines; the chances of the same error hitting both machines would be vanishingly small. If a discrepancy in results is found, just recalculate.

Of course, principles such as this can work on your non-ECC Linux boxen, just get the dust off those CS books...
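The duplicate-and-compare idea can be sketched in a few lines of Python. This is only an illustration of the stratagem described in this comment, not whatever VT is actually using. The names `run_redundantly` and `make_flaky_square` are made up for the example, and the two "machines" are simulated by simply calling the same worker twice.

```python
def run_redundantly(task, arg, max_retries=5):
    """Run task twice (standing in for two different machines) and accept
    the result only when both copies agree; otherwise recalculate."""
    for attempt in range(1, max_retries + 1):
        first = task(arg)
        second = task(arg)
        if first == second:
            return first, attempt
    raise RuntimeError("results never agreed")


def make_flaky_square(bad_calls):
    """Hypothetical worker: squares its input, but corrupts the result on
    the call numbers listed in bad_calls (a simulated memory bit flip)."""
    state = {"calls": 0}

    def flaky_square(x):
        state["calls"] += 1
        result = x * x
        if state["calls"] in bad_calls:
            result ^= 1 << 3  # flip one bit of the result
        return result

    return flaky_square


worker = make_flaky_square(bad_calls={2})
value, attempts = run_redundantly(worker, 12)
print(value, attempts)
```

The first pair of runs disagrees because of the injected bit flip, so the work is redone and the second pair is accepted. The price is what the comment says: at least half the processing power goes to checking.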

Mmmm... leaving out ECC RAM on a non-military non-nuclear-control pro-sumer model is a fair tradeoff, IMHO. Otherwise millions would be complaining about why the G5 was stalling on the **** benchmarks because of the slower ECC RAM. I can hear 'em anyway...

Re:software to solve memory problems? (1)

afantee (562443) | more than 11 years ago | (#6922810)

>> IMHO the lack of ECC RAM is the only flaw in an otherwise perfect machine (well that, and the massive HEAT).

You have to look a little further, and don't let the number fool you. The reason for the huge heat sink and 9 individually controlled fans in the G5 is to reduce the noise level.

The G5 consumes about 40W at 1.8 GHz, which is much more efficient than both the 1.5 GHz Itanium 2 (130 W) or the 3 GHz P4 (75W ?).

Power concerns (0, Insightful)

ProfessionalCookie (673314) | more than 11 years ago | (#6901765)

I'm surprised not to find power consumption/heat dissipation in the presentation. You'd think that the cost of running 1100 CPUs 24/7 would be mentioned in there. Although it's likely that Apple would have won that also.

Full operation by Jan 1 2004- that's cool.

Re:Power concerns (3, Funny)

pmz (462998) | more than 11 years ago | (#6902356)

I'm surprised not to find power consumption/heat dissipation in the presentation.

Power consumption would have only made Apple look better.

Re:Power concerns (5, Informative)

ni4882 (584113) | more than 11 years ago | (#6902373)

Actually, from the article:

# 3 MW power, double redundant with backups - UPS and diesel
    * 1.5 MW reserved for the TCF

# 2+ million BTUs of cooling capacity using Liebert's extreme density cooling (rack mounted cooling via liquid refrigerant)
    * traditional methods [fans] would have produced windspeeds of 60+ MPH

Seems that they did talk about both.

Re:Power concerns (0)

Anonymous Coward | more than 11 years ago | (#6903065)

That talks about what they are going to set up, but it seems to be independent of what kind of computer is doing the computing.

whew! (4, Informative)

Anonymous Coward | more than 11 years ago | (#6901833)

3 MW power, double redundant with backups - UPS and diesel

1.5 MW reserved for the TCF

2+ million BTUs of cooling capacity using Liebert's extreme density cooling (rack mounted cooling via liquid refrigerant)

traditional methods [fans] would have produced windspeeds of 60+ MPH

bug free computing...the other kind (1)

switcha (551514) | more than 11 years ago | (#6915415)

traditional methods [fans] would have produced windspeeds of 60+ MPH

Yeah, but NO BUGS! [terminix.com]

Clueless Sysadmins... (1, Informative)

Anonymous Coward | more than 11 years ago | (#6901985)

Does it render?

Yes, incidentally, it does. The units came with high end graphics cards


Aside from games, when is a high end graphics card needed for rendering and not just displaying a rendering?

Re:Clueless Sysadmins... (2, Insightful)

bluethundr (562578) | more than 11 years ago | (#6902179)

Aside from games, when is a high end graphics card needed for rendering and not just displaying a rendering?

IANAS, but:
  • Graphical representation of turbulence systems?
  • Weather analysis?
  • Any graphical representation of Chaotic systems?
Like I said, IANAS, but there HAS to be a reason, methinks.

Re:Clueless Sysadmins... (4, Insightful)

selderrr (523988) | more than 11 years ago | (#6902584)

For any representation, you need only 1 graphics card: the one the monitor is attached to. Parallelizing realtime display-only stuff is not much good since you'd lose too much time in data transmission.

So they could equip one G5 with a radeon9800 and let that one display the results. No need to buy another 1099 Radeons.

Re:Clueless Sysadmins... (2, Interesting)

confused one (671304) | more than 11 years ago | (#6903646)

Actually, you're not totally right. You can spread the rendering job across multiple radeon chips, each handling only a portion of the display. The performance and depth of the rendering could be greatly enhanced. SGI does something like this...

Re:Clueless Sysadmins... (2, Interesting)

selderrr (523988) | more than 11 years ago | (#6907739)

That is useful only if those multiple cards are IN THE SAME MACHINE. In the cluster case, those cards are spread over multiple computers, requiring that you transfer the rendered result over the network to the "master" video card which sends it to the monitor. I seriously doubt the efficiency of such a solution for realtime display.

Re:Clueless Sysadmins... (1)

confused one (671304) | more than 11 years ago | (#6908286)

Well, Ok, they're not planning on using the machines in this manner. I have seen it done where a cluster was used to display the data in portions. All had access to the shared memory containing the real-time event data. Each machine was tasked with doing some processing and displaying one aspect of the data. Not exactly real-time video rendering; but, this was in '88 using Sun Ultrasparc's

Re:Clueless Sysadmins... (3, Interesting)

davechen (247143) | more than 11 years ago | (#6908482)

That's not necessarily true. Over at Stanford for the project they built a graphics system with 32 PCs that render to a tiled display. Imagine a display made of 1000 monitors in a 40x20 grid. That would be pretty freaking cool.

Re:Clueless Sysadmins... (1)

egregious (16118) | more than 11 years ago | (#6915491)

One of the Stanford grad students (Greg Humphreys) who was on that distributed GL system project came to UVA last year. WireGL they call it. And it is really awesome.

Johann

Re:Clueless Sysadmins... (1)

hmccabe (465882) | more than 11 years ago | (#6927218)


Imagine a display made of 1000 monitors in a 40x20 grid. That would be pretty freaking cool.


Or better yet, a display made of 1000 monitors in a 50x20 grid, so that it would make sense.

Re:Clueless Sysadmins... (1)

shfted! (600189) | more than 11 years ago | (#6906528)

Unless you wanted to use the GPU on the Radeon for instructions it would handle well, which is quite probable.

graphics in science (3, Interesting)

trillian42 (674714) | more than 11 years ago | (#6902650)

I am a scientist, and lots of money gets put into transforming the tons of numbers that supercomputers produce into images that make sense to the human brain.

The system doesn't have to be chaotic, just complex:

Watching protein folding simulations.
Watching full 3-D seismic waves propagate through the Earth.
Watching, in general, any kind of 3-D model or simulation of a complex process evolving over time.

A couple links:

The Scripps Institute of Oceanography Visualization Center:
http://siovizcenter.ucsd.edu/library/objects/
The Arctic Region Supercomputing Center:
http://www.arsc.edu/news/mdflex.html

Re:graphics in science (0)

Biotech9 (704202) | more than 11 years ago | (#6904012)

I am a Scientist too (Guess what field!), and i use my graphics card to play Quake and Other such 3D games. How many FPS can you get on 1100 G5's?

Re:graphics in science (1)

Alan Partridge (516639) | more than 11 years ago | (#6908474)

The same as you get on one of them, of course.

Re:Clueless Sysadmins... (3, Interesting)

WasterDave (20047) | more than 11 years ago | (#6905563)

Not all renders are real time, not all renders are onto a screen.

Now that "consumer" graphics cards run in floating point and have comparatively complex shader engines, it's quite possible to start working on rendering movies etc. with the substantial quantity of hardware acceleration possible on these things. You don't have to hit 60fps, and you can have as many passes as you like.

Mind you, with 1100 nodes if you can render a frame in 45 seconds .... on a twin G5 with a Radeon 9800 ... then you can render 24fps in real time. Real time lord of the rings, anyone?

Dave
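The back-of-the-envelope figure in this comment is easy to check. A minimal sketch, assuming every node renders a different frame in parallel and ignoring network transfer and frame reassembly entirely; `aggregate_fps` is a made-up helper name.

```python
def aggregate_fps(nodes, seconds_per_frame):
    """Average frames per second when each node renders its own frame
    and all nodes work in parallel (no communication overhead assumed)."""
    return nodes / seconds_per_frame


fps = aggregate_fps(nodes=1100, seconds_per_frame=45)
print(f"{fps:.2f} frames per second on average")

# Any single frame still takes the full render time to produce, so the
# output would trail the input by at least this many seconds.
pipeline_delay_seconds = 45
```

So 1100 nodes at 45 seconds per frame do average a little over 24 frames per second, though only as throughput: the stream would still lag by the full per-frame render time.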

Re:Clueless Sysadmins... (1)

11223 (201561) | more than 11 years ago | (#6908612)

That would be really funny to see. I'd bet you'd get an average 24fps over some period of time, but that doesn't mean you'd get a constant 24 frames * Hz (one frame every 1/24th second). It could be rather jerky unless sophisticated timing stuff is going on.

Clueless Slashdot posters... (0)

Anonymous Coward | more than 11 years ago | (#6907608)

These are PowerMac G5s, not XServes. The default card for the G5 is AT LEAST a GF FX 5200. Apple doesn't make low-end crap cards. VT got a SMOKING deal on these boxes, and for Apple to pull them all apart to put in crappy videocards (or even just to take them out) would have been a huge expense.

VT? (-1, Offtopic)

Anonymous Coward | more than 11 years ago | (#6902032)

I first read this as "Apple wins in Vermont" and thought that someone was recycling *really* old news. Didn't Apple have a big contract with the Vermont School system?

Re:VT? (1)

stephentyrone (664894) | more than 11 years ago | (#6905137)

no, you're thinking of maine. the state of maine bought laptops for every 7th grade student, if i remember correctly.

Infiniband assured latency? (5, Interesting)

dhall (1252) | more than 11 years ago | (#6902364)

One of the primary concerns for a multi-node cluster is assured latency among all components within the cluster. It doesn't have to be the fastest, it just needs to ensure exact latency timing across all nodes. IBM can do this with their "wormhole" switch routing on SP and has done this with Myrinet on their Intel X-series clusters.

From most of my reading on Infiniband, it was designed from the ground up as a NAS style solution rather than for large multi-node cluster computing. I'm curious as to whether they have any issues with cluster latency.

http://www.nwfusion.com/news/2002/1211sandia.html

The primary timings and white papers I've seen published for Infiniband have been for small clustered filesystem access. Although its burst rate is much higher than Myrinet's, it's hard to find any raw details for its multiple node latency normalization.

I hope it scales, since Intel's solution appears to be less cost prohibitive than some of the other solutions offered on the market, and would really open up the market even for smaller clusters (16-36 node) for business use.

I love Google (2, Informative)

Wesley Felter (138342) | more than 11 years ago | (#6902945)

http://nowlab.cis.ohio-state.edu/projects/mpi-iba/

For those in the know (3, Interesting)

gnuadam (612852) | more than 11 years ago | (#6902392)

I wonder if by "lack of support in linux," they're referring to the fact that the fans are controlled by the operating system in the powermac? Or the fact that there are relatively few support companies for ppc linux?

Any insiders care to comment?

Re:For those in the know (2, Informative)

Wesley Felter (138342) | more than 11 years ago | (#6902901)

Maybe they meant that Linux doesn't run on the G5 at all yet, and the only company that seems to be interested in Linux on the G5 (Terra Soft) is a pretty small player.

Re:For those in the know (4, Insightful)

confused one (671304) | more than 11 years ago | (#6903667)

Why go to the trouble of porting linux to the G5 when you could port the clustering code to OS X and be done with it. Seems like a much simpler task and more cost effective use of labor.

Re:For those in the know (1)

gnuadam (612852) | more than 11 years ago | (#6911507)

Fine, fine. But you didn't really answer my question. I wanted to know why they said that linux was too unsupported. The fans are an issue that I know about. The relative dearth of support companies for ppc linux is another. VT might have another rationale.

You can get into flame wars all day about linux vs. osx. I use both daily. I am a scientific programmer.

For their sake, I really hope that they're not planning on using HFS, tho. mpich (which I know they're not using, but I'm going to cite as an example), has several files like mpicc and mpiCC that to HFS have the same name (and are overwritten) because it is not case sensitive in file names.
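The collision described here can be checked for ahead of time. A small sketch, assuming a file system that treats names differing only in case as the same file; `str.casefold()` is used as an approximation, and the real HFS comparison rules may differ in detail. `mpicc` and `mpiCC` come from the comment, the other two names are just filler for the example, and `case_collisions` is a made-up helper.

```python
from collections import defaultdict


def case_collisions(names):
    """Group file names that would map to the same entry on a
    case-insensitive file system (approximated with str.casefold)."""
    groups = defaultdict(list)
    for name in names:
        groups[name.casefold()].append(name)
    return [group for group in groups.values() if len(group) > 1]


collisions = case_collisions(["mpicc", "mpiCC", "mpif77", "mpirun"])
print(collisions)
```

Running a check like this over an install tree before copying it to a case-insensitive volume would flag the pairs where one file silently overwrites the other.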

Gentoo Linux Runs On The G5 ... from Mac/ (1)

johnpaul191 (240105) | more than 11 years ago | (#6903763)


Benchmarks are extraordinary: compiling kde on a G5 running at half speed takes 15 minutes, while it takes 1 hour on the fastest P4 available.


on macslash this story [macslash.org] talks about the crazy speeds they are claiming on the G5 running Gentoo Linux. Says they can not go superfast because of fan control issues still unresolved, but yikes! too good to be true?

Re:Gentoo Linux Runs On The G5 ... from Mac/ (2, Interesting)

Unregistered (584479) | more than 11 years ago | (#6905613)

the crazy speeds they are claiming on the G5 running Gentoo Linux

Why hell, i get blazing speeds with gentoo on my Athlon, i'd sure hope that you'd get them on the g5 as well :).

Re:Gentoo Linux Runs On The G5 ... from Mac/ (1)

dipipanone (570849) | more than 11 years ago | (#6912046)

No, he said blazing *speeds*, not a blazing motherboard/cpu.

Re:Gentoo Linux Runs On The G5 ... from Mac/ (1)

Unregistered (584479) | more than 11 years ago | (#6917018)

Those are not mutually exclusive

ECC FUD (5, Informative)

J0ey4 (233385) | more than 11 years ago | (#6902395)

Okay before we get going with the same discussion about ECC vs. Non ECC, and all the flames start from people perusing slashdot who think they are more in the know than the PhD's at VT who have been working on this for months I want to point a few things out.

1. The majority if not all of the bit errors that ECC corrects are caused by thermal noise. Thermal noise is an issue in a cluster of rack mounted 1U units due to the difficulty of cooling such tightly spaced units generating so much heat in so small a space. It is not an issue in a cluster of DESKTOP machines utilizing a Liebert system with way more cooling capacity than is needed.

2. Even if somehow a non-thermal bit error occurs, each node has 4GB RAM. The probability of it being in an OS or application critical (especially given the converging nature of many long running calculations) piece of RAM as opposed to an empty piece of RAM is small.

How many of you are reading this from a desktop without ECC RAM that has an obnoxiously huge uptime? ECC is a non-issue in a well-cooled cluster of desktop cased machines.

Re:ECC FUD (1, Interesting)

Anonymous Coward | more than 11 years ago | (#6905667)

The probability of it being in an OS or application critical (especially given the converging nature of many long running calculations) piece of RAM as opposed to an empty piece of RAM is small.

Errr, what is the point of putting 4+ GB into your cluster nodes if you're not going to use it? This isn't a SETI@home cluster. Seems to me that "long running converging apps" tend to have large datasets associated with them. The higher the data density per node the less network bandwidth needed except for "embarrassingly parallel" computations with essentially no comm overhead anyway.

I'll concur on desktops, but they typically don't have apps with large datasets that run for large periods of times. The Applications/data come and go and the screwed bits get wiped.

P.S. the cooling systems can fail also. It is also a question of redundancy. You'll find ECC in "big iron" systems also with significant cooling resources assigned to them. In a fail-over transition time the ECC would be leveraged until that node could be completely transitioned out.

Re:ECC FUD (1)

Lars T. (470328) | more than 11 years ago | (#6907395)

When a cooling system fails, you have other problems than some bit-errors.

Re:ECC FUD (1)

dsb (52083) | more than 11 years ago | (#6906221)

Quote
1. The majority if not all of the bit errors that ECC corrects are caused by thermal noise. Thermal noise is an issue in a cluster of rack mounted 1U units due to the difficulty of cooling such tightly spaced units generating so much heat in so small a space. It is not an issue in a cluster of DESKTOP machines utilizing a Liebert system with way more cooling capacity than is needed. /Quote

Why is it necessary then to jam 1U units stacked on each other? If you can get the same performance and storage capacity as 18 months ago with half the hardware and size, then leave gaps between the 1U's and provide venting or some type of cooling system that allows flow over each 1U unit.

Re:ECC FUD (4, Interesting)

Anonymous Coward | more than 11 years ago | (#6907313)

2. Even if somehow a non-thermal bit error occurs, each node has 4GB RAM. The probability of it being in an OS or application critical (especially given the converging nature of many long running calculations) piece of RAM as opposed to an empty piece of RAM is small.

Think before you post. The failure rate is constant in each memory chip (actually it goes up a bit with higher capacity due to higher density). Unless you setup the memory to be redundant (which the G5 can't do either...) you will experience MORE errors since a good OS tries to use the empty memory for things like file buffers.

How many of you are reading this from a desktop without ECC RAM that has an obnoxiously huge uptime? ECC is a non-issue in a well-cooled cluster of desktop cased machines.

Sigh... this is a 2200-cpu *cluster*. Here's a primer on statistics. Assume the probability of a memory error is 0.01% for some time interval (say a week or month). The likelihood of a perfect run is then 99.99% on your single CPU, which is just fine. Running on 2200 CPUs, the probability of not having any errors is 0.9999^2200=0.8, or a 20% probability of getting memory-related errors somewhere in the cluster.

The actual numbers aren't important - it might very well be a 0.01% probability of an error per year, but the point is that when you run things in parallel the chance of getting a memory error *somewhere* is suddenly far from negligible.
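The arithmetic in this comment can be reproduced directly. A minimal sketch, assuming errors on different CPUs are independent and using the comment's illustrative 0.01% figure rather than a measured rate; `p_any_error` is a made-up name.

```python
def p_any_error(p_single, n):
    """Probability that at least one of n independent units sees an
    error, given a per-unit error probability of p_single."""
    return 1.0 - (1.0 - p_single) ** n


p_single_cpu = p_any_error(0.0001, 1)
p_cluster = p_any_error(0.0001, 2200)
print(f"single CPU: {p_single_cpu:.4%}  cluster of 2200: {p_cluster:.1%}")
```

The per-CPU risk stays at 0.01%, while the chance of an error somewhere among 2200 CPUs comes out close to the 20% quoted above.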

ECC is a cheap and effective solution that almost eliminates the problem. Incidentally, one of the challenges for IBM with "Blue Gene" is that with their super-high memory density even normal single-bit ECC might not be enough.

But, what do I know - I've only got a PhD from Stanford and not VT....

FUD Back At You (1)

Detritus (11846) | more than 11 years ago | (#6909017)

That attitude towards ECC, and other forms of hardware error detection and correction, has led people into building supercomputers that were expensive disasters, like the ILLIAC IV. What's the point of having a fast supercomputer if you have to run a job two or three times to have some confidence in the results?

There is nothing worse than having a computer without ECC or parity memory, and trying to detect and diagnose subtle pattern sensitivity memory problems.

Besides thermal noise, you also have to consider alpha particle emissions from packaging materials and cosmic rays. There are also electrical noise issues in memory assemblies.

Re:ECC FUD (0)

Anonymous Coward | more than 11 years ago | (#6911103)

Hi,
Here are some thoughts to consider.

Thermal response of large systems is NOT LINEAR. As temperatures cross specific thermal boundaries, the system is e times more stable. An over designed cooling system goes a very long way to ensure system stability - this is fairly common use of temperature in space systems design. My arguing is not going to convince you. Try this out. There is an experiment below.

Heat is the primary cause of failure. Bit flipping from cosmic rays would be a burst error, and ECC with its 1 bit error correction is of absolutely no help here.

In response to the Stanford Ph.D.'s post, ECC is the least of your problems. Your statistical analysis is the basic argument for failure recovery in large systems. Repeat your calculation assuming (and that's assuming a lot) a single node failure due to ANY error (hw failure, software, driver, power supply etc.) of once a year. With the MTBFs running in parallel, a 1100 node system will fail thrice a day. Now take your highly engineered rackmount ECC solution that fails once every two years, you will still fail 1.5 times per day. Admittedly, this calculation is an oversimplified first order, it assumes MTBF is poisson distributed. Software failures may be inverse exponential (intuitive argument, which may be totally wrong. software eventually accumulates enough junk states to fail). Hardware failure is still largely poisson. Has anyone tried modeling the effect of power supply glitches on memory? There is an even chance it is a burst error and not a single random bit flip.
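The "thrice a day" and "1.5 times per day" figures follow from dividing the node count by the per-node time between failures. A first-order sketch under the same simplifying assumption the comment makes (independent nodes, constant failure rate); `failures_per_day` is a made-up name.

```python
def failures_per_day(nodes, mtbf_years):
    """Expected failures per day across `nodes` independent machines,
    each failing on average once every `mtbf_years` years."""
    return nodes / (mtbf_years * 365.0)


desktop = failures_per_day(1100, 1.0)    # one failure per node per year
rackmount = failures_per_day(1100, 2.0)  # one failure per node per two years
print(f"{desktop:.2f} vs {rackmount:.2f} failures per day")
```

Doubling the per-node reliability only halves the cluster-wide failure rate, which is why the comment argues that recovery from failure matters more than any single hardware feature.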

1 bit ECC, which is the common one is not necessarily good enough in high density memory. Beyond 1 bit error correction, ECC doesn't help. That is NOT to say that it is of no use. The probability of a two bit error in the same set of bits protected by a common ECC is very very small.
However, assuming that ECC memory is the be-all end-all of systems design is an oversimplification

Finally, I believe the best solution is to approach this issue scientifically, instead of each of us arguing "i believe, hence it is". Run a memory intensive test. I propose the following. Allocate all your memory up, generate two random numbers, one is a length (bounded, and pick your favorite distribution) and the other is a bit pattern. Fill length bytes with the bit pattern. Repeat till you use up all memory. Then do a large number of read operations to ensure your memory consistency. Periodically, repeat the experiment. Continue running these tests till your memory gives you an inconsistency. Now repeat the whole experiment at different temperatures.

For additional security, you can save the (length, bit pattern) tuples to mirrored RAID storage (for the truly paranoid).
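
A rough sketch of the test proposed above, scaled down to a small buffer. The assumptions are mine, not part of the proposal: a uniform length distribution, a one-byte bit pattern, and hypothetical function names. A Python process only sees virtual memory through caches, so a real test would have to run much closer to the hardware; this only shows the fill, record, and re-read logic.

```python
import random

def fill_with_runs(buf: bytearray, rng: random.Random, max_run: int = 4096):
    """Fill buf with runs of random length, each run holding one random byte.

    Returns the list of (offset, length, pattern) tuples that were written,
    so they can be checked later (or saved somewhere safe).
    """
    runs = []
    offset = 0
    while offset < len(buf):
        length = min(rng.randint(1, max_run), len(buf) - offset)
        pattern = rng.randrange(256)
        buf[offset:offset + length] = bytes([pattern]) * length
        runs.append((offset, length, pattern))
        offset += length
    return runs

def find_mismatches(buf: bytearray, runs):
    """Re-read every run and return the offsets whose byte changed."""
    bad = []
    for offset, length, pattern in runs:
        chunk = buf[offset:offset + length]
        if chunk.count(pattern) != length:
            bad.extend(offset + i for i, b in enumerate(chunk) if b != pattern)
    return bad

if __name__ == "__main__":
    rng = random.Random(12345)           # fixed seed so a run can be repeated
    buf = bytearray(1024 * 1024)         # 1 MiB stand-in for "all your memory"
    runs = fill_with_runs(buf, rng)
    for sweep in range(5):               # the "large number of reads", shortened
        bad = find_mismatches(buf, runs)
        print(f"sweep {sweep}: {len(bad)} inconsistent bytes")
```

Keeping the tuples rather than a second copy of the data means the reference for comparison is tiny, which is what makes saving it to separate storage practical.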

neat. (3, Interesting)

pb (1020) | more than 11 years ago | (#6902408)

Looks like the costs come out to $23,636 per node, or $4727 per machine. According to the Apple Store, an equivalently specced machine (dual proc G5, 160GB HD, 1GB RAM) comes out to just a little over $3,000. I suppose you might want a display on the management machine in each node, but that won't raise the price that much (say, $3,200 per machine instead). So that leaves ~$1,500 per machine for the networking hardware and whatever other expenses.
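
The arithmetic behind those figures, as a quick sketch. All numbers are the ones quoted in the post; the variable names are just for illustration.

```python
cost_per_node = 23_636       # dollars, from the post
cost_per_machine = 4_727     # dollars, from the post
store_price_with_display = 3_200  # dollars, the post's rough Apple Store estimate

# The two cost figures imply about five machines per "node".
machines_per_node = cost_per_node / cost_per_machine
print(round(machines_per_node, 2))

# What is left per machine for networking and other expenses.
leftover_per_machine = cost_per_machine - store_price_with_display
print(leftover_per_machine)  # 1527
```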

Re:neat. (1, Informative)

Anonymous Coward | more than 11 years ago | (#6902550)

And keep in mind that the RAM on each machine is 4 GB.

ah. (1)

pb (1020) | more than 11 years ago | (#6902702)

I wasn't sure how they were using the word 'node' there. That would raise the price to... $5,120.00 per machine! Their consumer prices for RAM must be hugely inflated, seeing as how you could get a 1U dual 2GHz Opteron with 4GB RAM for $4,500.00...

Dont forget.... (1)

goombah99 (560566) | more than 11 years ago | (#6902664)

ECC memory too.
And never forget the costs of installing these puppies. Cooling systems, power busses, cable harnesses, UPS, Diesel backups, Air filtering, locks, redundant parts.
and what about the disk servers....

definitely. (1)

pb (1020) | more than 11 years ago | (#6902730)

I don't know if the cost of installing them was included in that estimate, but maybe it was.

As for the disk 'servers', I figured they were just sharing all of the 160GB HDs over the network, seeing as how 160GB x 1100 ~= 176TB (ok, it's more like 172TB, but who's counting...)
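
Where the 176 vs. 172 comes from, as a small sketch. The drive count and size are from the post; the last line, in strict binary units, is an extra comparison that assumes the drives are sold in decimal gigabytes.

```python
drives = 1100
gb_per_drive = 160

total_gb = drives * gb_per_drive      # 176000 GB in total

tb_decimal = total_gb / 1000          # 1 TB = 1000 GB
tb_div_1024 = total_gb / 1024         # 1 "TB" = 1024 GB, the ~172 figure
tib_strict = total_gb * 10**9 / 2**40 # decimal GB converted to binary TiB

print(tb_decimal)                     # 176.0
print(tb_div_1024)                    # 171.875
print(round(tib_strict, 1))
```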

Re:Dont forget.... (1)

ERJ (600451) | more than 11 years ago | (#6903719)

No ECC...G5's don't support it.

Re:neat. (4, Informative)

confused one (671304) | more than 11 years ago | (#6903702)

Read on. They're putting 8GB of RAM in each machine.

Only 4GB RAM (1)

waldoj (8229) | more than 11 years ago | (#6905582)

Actually, Srinidhi Varadarajan (who gave the first portion of the presentation) said that there would only be 4GB of RAM in each machine. Why not 8GB, I don't know.

-Waldo Jaquith

Dude... (5, Funny)

yoshi1013 (674815) | more than 11 years ago | (#6902433)

At this point all I really want to know is what the hell does 1100 G5s look like???

Certain things are easy to imagine in large quantities, but dude.

Just....dude....

Re:Dude... (1)

Johnny Mnemonic (176043) | more than 11 years ago | (#6902869)

I would pay for a photo of that. My guess, though, is that we won't have to--I'm gonna guess that Apple will supply them for free. With explanatory text like "Stomping Dell's guts since 1984" etc.

Re:Dude... (3, Funny)

Greedo (304385) | more than 11 years ago | (#6903305)

Wait a minute ... a serious /. comment where the poster actually wonders what a Beowulf cluster of these looks like! Call the papers!

Re:Dude... (1)

thunderbird46 (315436) | more than 11 years ago | (#6905474)

I wouldn't want to be standing in front of that server room! A person could get blasted into orbit! :)

Re:Dude... (1)

burns210 (572621) | more than 11 years ago | (#6907273)

one very sexy brushed metal wall:)

Re:Dude... (1)

WaKall (461142) | more than 11 years ago | (#6907828)

You haven't seen the new PowerMacs G5500? I hear it knocks you clear across the world, instead of just into your front yard.

An interesting tidbit (4, Interesting)

BortQ (468164) | more than 11 years ago | (#6902523)

The very last slide states that
Current facility will be followed with a second in 2006
It will be very interesting to see if they also use macs for any followup cluster. If it works out well this could be the start of a macintosh push into clustered supercomputers.

Re:An interesting tidbit (1)

Johnny Mnemonic (176043) | more than 11 years ago | (#6902884)


I caught that too. Use of Macs in 2006 no doubt depends on 2 factors: 1) how well the 2003 cluster works out, and 2) how the Mac compares to competitors in 2006. Could be a nice win for Apple, again, if they manage to keep both 1 and 2 competitive. Which remains to be seen, and I'm holding my breath.

Re:An interesting tidbit (4, Funny)

eweu (213081) | more than 11 years ago | (#6903382)

I caught that too. Use of Macs in 2006 no doubt depends on 2 factors: 1) how well the 2003 cluster works out, and 2) how the Mac compares to competitors in 2006. Could be a nice win for Apple, again, if they manage to keep both 1 and 2 competitive. Which remains to be seen, and I'm holding my breath.

I don't know. Holding your breath until 2006 sounds... dangerous.

hope this doesn't mean... (0)

zonker (1158) | more than 11 years ago | (#6902907)

that apple is going to push back the delivery of the dual-2giggers for us regular folk again so they can pump these things out to this university (like they did last week w/ their announcement about holding up orders for the dual-2giggers to ship them to highschools)...

Re:hope this doesn't mean... (2, Informative)

SlamMan (221834) | more than 11 years ago | (#6903276)

What did you think last week's announcement was???

Nice rack! (2, Interesting)

Alex Reynolds (102024) | more than 11 years ago | (#6903020)

If they do not fit into a standard rack enclosure, I would be curious to learn what customization was required to rack the G5s.

(Especially seeing as a G5 XServe will probably be at least several months away -- at least until most of the desktop orders can be filled.)

-Alex

Re:Nice rack! (0)

Anonymous Coward | more than 11 years ago | (#6904444)

Especially seeing as a G5 XServe will probably be at least several months away

It's coming out tomorrow.

Why was bidding secret? (2, Interesting)

mTor (18585) | more than 11 years ago | (#6903101)

Could someone please shed some light on this:
Why so secret? Project started back in February; secret with Dell because of the pricing issues; dealt with vendors individually because bidding wars do not drive the prices down in this case.
Why exactly is that? Is there a collusion between the vendors since there's so few of them? Does anyone have any experience with this sector?

Re:Why was bidding secret? (2, Insightful)

Enrico Pulatzo (536675) | more than 11 years ago | (#6903535)

Probably due in small part to the G5 not being public at the time.

Re:Why was bidding secret? (2, Interesting)

mTor (18585) | more than 11 years ago | (#6904226)

I was actually referring to the last sentence:

"dealt with vendors individually because bidding wars do not drive the prices down in this case."

I don't think they even dealt with Apple until Apple's G5 announcement, but they did deal with other vendors. I'm interested in why VT dealt with all of them individually, and why prices do not come down when you deal with them in this way. This is why I was alluding to collusion.

Re:Why was bidding secret? (0)

Anonymous Coward | more than 11 years ago | (#6907338)

Why exactly is that? Is there a collusion between the vendors since there's so few of them? Does anyone have any experience with this sector?

Working at a big university that shall remain unnamed, I can tell from my own experience that they DO compete and give us great discounts. My guess in this case would be that it translates to "VT isn't high profile enough to make it worth donating a couple of million dollars". (I apologize if that sounds like I'm dissing them - I'm not, but in my experience vendors are in it for the PR).

G5's cheaper than VTs? (4, Funny)

dpbsmith (263124) | more than 11 years ago | (#6903289)

I believe you can get a VT [utk.edu] for well under $1000, and I've even heard that some of them now support advanced "sixel" graphics.

And they scroll MUCH more smoothly than OS X.

Re:G5's cheaper than VTs? (0)

Anonymous Coward | more than 11 years ago | (#6904221)

I wish I had mod points for you.

Maybe (-1, Offtopic)

Anonymous Coward | more than 11 years ago | (#6904215)

it'll help VaTech's Hokies break in to the current BCS standings. http://espn.go.com/abcsports/bcs/ What the hell is PENN STATE doing in there?

Imagine a beowulf cluster of those (1)

Unregistered (584479) | more than 11 years ago | (#6905650)

Sorry, I couldn't help myself. Really, I am. Go ahead and mod this to the deep bowels of /.. I'm so sorry I did this.

Cost Analysis (1)

jsmith38 (629490) | more than 11 years ago | (#6906058)

I want to know how VT was able to do its cost analysis so fast.
From what I've heard, VT ordered the G5s the day they came out, or shortly after. But to perform a cost vs. performance analysis, they would need background data. Also, they should have been hesitant to accept Apple's specs on the machine, and should have wanted some real-world tests, or maybe some in-house testing of a few machines.
I find it hard to believe that VT was able to truly compare the G5 to competitor products without prior data on the machines.

On the other hand, I would say that some of the software features of OS X would make the machines somewhat cheaper in terms of setup time. I don't know if they use Rendezvous (I know that some software uses it for distributed computing, Final Cut Pro and Xcode in particular). There are also the other points that have already been mentioned, like cooling and the fact that OS X is UNIX-based.

Anyways, this is enough of a rant, and I'll let someone else have the floor now.

Re:Cost Analysis (4, Insightful)

2nd Post! (213333) | more than 11 years ago | (#6906872)

I bet at the time of initial consideration of vendors, there were no competitive Opteron or Itanium solutions (none with chassis, the slides say), and I am also willing to bet that Apple had at least a hardware prototype they could demonstrate, at least a motherboard + dual CPU setup, even if the chassis was incomplete and not all the major subsystems were 100%.

Just enough to demonstrate that Apple *would* have a solution, and enough that VT could narrow down the decision to a possible, pending the actual production and purchase of a single machine... then, the contract being 99% complete, they just had to sign a couple papers and purchase, overnight, 1,100 dual G5s.

On the flip side I bet they had a similar contract in the wings with other vendors, all pending on 'simple' bottlenecks.

They aint won VT!! (-1, Troll)

n1ywb (555767) | more than 11 years ago | (#6906104)

Them fuckin cockswine at Apple blinky brain box company aint won VERMONT and aint GONNA anytime soon by the geezum consarnit geezum crow!!!!!!!

I'll run my F-250 (with the runnin boards, bedliner, shotgun rack, she's a sweet machine) right over there and show those longhairs a thing or two!

Re:They aint won VT!! (1)

n1ywb (555767) | more than 11 years ago | (#6929948)

This is a joke, not a troll, you silly Apple fascists.

additional player: PSC [yeah I don't know either] (0)

Anonymous Coward | more than 11 years ago | (#6912332)

From the slide note:
additional player: PSC [yeah I don't know either]
PSC -> Pittsburgh Supercomputing Center, run by Carnegie Mellon

