
1.21 PetaFLOPS (RPeak) Supercomputer Created With EC2

Unknown Lamer posted about 10 months ago | from the when-do-we-get-to-jigga-flops dept.

Cloud 54

An anonymous reader writes "In honor of Doc Brown, Great Scott! Ars has an interesting article about a 1.21 PetaFLOPS (RPeak) supercomputer created on Amazon EC2 Spot Instances. From HPC software company Cycle Computing's blog, it ran Professor Mark Thompson's research to find new, more efficient materials for solar cells. As Professor Thompson puts it: 'If the 20th century was the century of silicon materials, the 21st will be all organic. The question is how to find the right material without spending the entire 21st century looking for it.' El Reg points out this 'virty super's low cost.' Will cloud democratize access to HPC for research?"


54 comments


1.21 PetaFLOPS (RPeak) (4, Informative)

serviscope_minor (664417) | about 10 months ago | (#45415115)

1.21 PetaFLOPS (RPeak)

Getting a high RPeak is simply a matter of having access to enough computers. They could be connected by TCP/IP over pigeons or PPP over two tin cans and a piece of wet string.

Basically getting a high RPeak on EC2 requires the following procedure:
1. Pay a fuck load of money
2. Create new instance.
3. Goto 2.

Basically this article translates to "Amazon has a lot of computers and this guy rented out a bunch of them at once".

Which I'm sure is good for his research, which must be of the highly parallelizable type. I've done that sort of work in the past, and it's nice when your problem fits that mould.
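(Purely illustrative, and not how Cycle Computing actually did it: the "procedure" above boils down to one API call in a loop. A hypothetical boto3 sketch; the AMI ID, instance type, and count are made up.)

    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")

    # Step 1 is handled by your credit card; steps 2 and 3 are just this:
    resp = ec2.run_instances(
        ImageId="ami-12345678",     # hypothetical AMI with your solver installed
        InstanceType="c3.8xlarge",  # any compute-optimised type will do
        MinCount=100,               # ask for a batch of identical nodes
        MaxCount=100,
    )
    print("launched", len(resp["Instances"]), "instances")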

Re:1.21 PetaFLOPS (RPeak) (0)

Anonymous Coward | about 10 months ago | (#45415359)

Quite. I'd like to see someone perform some actual work that requires, I don't know, moving data around on the interconnect. Somehow I suspect the performance wouldn't be quite as interesting.

Re:1.21 PetaFLOPS (RPeak) (1)

Anonymous Coward | about 10 months ago | (#45415861)

Then optimize it to NOT move data around on the interconnect.

That would be just as interesting.

Re:1.21 PetaFLOPS (RPeak) (5, Informative)

fuzzyfuzzyfungus (1223518) | about 10 months ago | (#45415555)

The one (slightly) novel aspect of this, presumably also made possible because the workload parallelized well, is the use of Spot Instances [amazon.com] . As the name suggests, these aren't Amazon's standard fixed-price instances; but are rather instances whose price changes according to demand.

You make a bid (specifying maximum price/hour, number and type of instances, availability zones, etc.). If the spot price falls at or below your maximum, your instance starts running. Should it exceed your maximum, your instance gets terminated.

Using these things obviously requires a tolerance for server outages far above even the shoddiest physical systems; but if you can divide your problem space into relatively small, discrete chunks, and get the results off the individual servers once computed, you won't lose more than a single chunk per shutdown, and spot instances can be crazy cheap, depending on demand at the time. My impression is that Amazon offers them whenever they don't have enough reserved instances to fill a given area, and will pretty much keep offering them as long as they pay better than they cost in additional electricity and cooling, so if you are willing to bottom-feed, and potentially wait, there are some bargains to be had.
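(A rough sketch of both halves of that, using the boto3 API as it exists today; the bid price, AMI, and helper names are invented, and this is not Cycle Computing's actual tooling. The worker loop shows the "lose at most one chunk" idea: do one small chunk, ship the result off the node, check the spot termination notice, repeat.)

    import boto3
    import requests

    ec2 = boto3.client("ec2", region_name="us-east-1")

    # Place the bid: instances run only while the spot price is at or below SpotPrice.
    ec2.request_spot_instances(
        SpotPrice="0.20",                      # max $/hour we're willing to pay
        InstanceCount=50,
        LaunchSpecification={
            "ImageId": "ami-12345678",         # hypothetical worker image
            "InstanceType": "c3.8xlarge",
        },
    )

    # Worker side: small discrete chunks, results shipped off-node immediately.
    def termination_pending():
        # EC2 publishes a termination warning via instance metadata shortly
        # before a spot instance is reclaimed.
        r = requests.get(
            "http://169.254.169.254/latest/meta-data/spot/termination-time",
            timeout=1,
        )
        return r.status_code == 200

    def run_worker(chunks, compute, upload_result):
        for chunk in chunks:
            if termination_pending():
                break                          # lose at most the current chunk
            upload_result(compute(chunk))      # compute() is the science code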

Re:1.21 PetaFLOPS (RPeak) (0)

Anonymous Coward | about 10 months ago | (#45417447)

Sounds similar to approaches of using idle computer time to do computation, assuming that more idle resources at Amazon correlates to lower spot instance costs.

Re:1.21 PetaFLOPS (RPeak) (1)

stoborrobots (577882) | about 10 months ago | (#45419379)

... If the spot price falls at or below your maximum, your instance starts running. Should it exceed your maximum, your instance gets terminated. Using these things obviously requires a tolerance for server outages far above even the shoddiest physical systems; but if you can divide your problem space into relatively small, discrete, chunks, and get the results off the individual servers once computed, you won't lose more than a single chunk per shutdown...

And how is this different from SETI@Home, other than renting the computers rather than convincing people to run your screensaver for the good of mankind?

Re:1.21 PetaFLOPS (RPeak) (1)

fuzzyfuzzyfungus (1223518) | about 10 months ago | (#45423648)

Architecturally, it really isn't. The main difference is just that, unlike the heyday of SETI@Home (which, in part, was greatly aided by the dearth of portables and the relatively lousy system idle powersave modes of the time), you can rent time on other people's computers with such low friction that humans needn't be involved (and, indeed, the intention is that they aren't, except at high levels), and that Amazon has a specific pricing mechanism for varying the price of machine time, in quite fine increments, according to exactly how 'idle' it is. To be thrown into the spot pool probably means that there aren't any reserved instance customers; but the cost of spot time varies continuously according to who is shopping and how much they value getting their results.
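(If you're curious, that continuously varying price is visible straight from the EC2 API. A small illustration; the instance type and time window are arbitrary.)

    from datetime import datetime, timedelta
    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")
    history = ec2.describe_spot_price_history(
        InstanceTypes=["c3.8xlarge"],
        ProductDescriptions=["Linux/UNIX"],
        StartTime=datetime.utcnow() - timedelta(hours=24),
    )
    # One price point per (availability zone, instance type) change.
    for point in history["SpotPriceHistory"][:10]:
        print(point["Timestamp"], point["AvailabilityZone"], point["SpotPrice"])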

No fundamental novelty (for that matter, IBM probably had some mechanism for remotely enabling an additional capability in exchange for cash on one of their mainframe models in the '70s sometime); but fairly neat to watch.

Re:1.21 PetaFLOPS (RPeak) (1)

Arrepiadd (688829) | about 10 months ago | (#45421299)

The one (slightly) novel aspect of this, presumably also made possible because the workload parallelized well, is the use of Spot Instances [amazon.com] . As the name suggests, these aren't Amazon's standard fixed-price instances; but are rather instances whose price changes according to demand.

Even that isn't novel. Quoting some work done last year "Running a 10,000-node Grid Engine Cluster in Amazon EC2" [scalablelogic.com] : "Also, we mainly requested for spot instances because ..."

Doesn't make it less interesting for me though.

Re:1.21 PetaFLOPS (RPeak) (1)

fuzzyfuzzyfungus (1223518) | about 10 months ago | (#45423562)

Interesting, I didn't know about that one, though it certainly makes sense to use spot instances for a compute problem loosely-coupled enough that EC2 wouldn't be a total joke.

Re:1.21 PetaFLOPS (RPeak) (2)

Graymalkin (13732) | about 10 months ago | (#45415641)

Basically this article translates to "Amazon has a lot of computers and this guy rented out a bunch of them at once".

No, the article translates to "if you've got embarrassingly parallel workloads you can use EC2 to churn through them without a massive infrastructure outlay of your own". Amazon isn't just renting out the actual CPUs but the power, HVAC, storage, and networking to go along with them. Infrastructure and maintenance are a huge cost of HPC and put it out of reach for many smaller projects.

You're entirely correct that a massive Rpeak value isn't impressive in terms of actual purpose-built supercomputers, but the Rpeak is only half of the story. The lede buried in the reporting is that for $33,000 a professor was able to take off the shelf software and run it on a 1.21 petaflop parallel cluster. That's high teraflop to petaflop computing at relatively small research grant prices. I think that's the interesting fact out of this story.

Re:1.21 PetaFLOPS (RPeak) (1)

serviscope_minor (664417) | about 10 months ago | (#45417859)

The lede buried in the reporting is that for $33,000 a professor was able to take off the shelf software and run it on a 1.21 petaflop parallel cluster. That's high teraflop to petaflop computing at relatively small research grant prices. I think that's the interesting fact out of this story.

Well moderately so, except that there are quite a few supercomputers out there at academic institutions which rent themselves out to academic users. I wonder how much they cost by comparison. These days they are generally x86 so they run precompiled software just fine.

Re:1.21 PetaFLOPS (RPeak) (1)

Graymalkin (13732) | about 10 months ago | (#45419423)

There are several potential problems with renting out time on another university's cluster. For one there may simply be a lot of bureaucratic steps involved in renting out resources from another university. The second is that some cluster you don't own might not support your particular software/platform/project.

One attractive aspect of cloud services is the customer gets to load on whatever wonky configuration they want into a virtualized instance. Using someone else's cluster may not provide that sort of flexibility. Being able to load an EC2 instance with the same (or similar enough) configuration as your work laptop is a feature. Researchers aren't necessarily developers so the code/configuration they need to run may be very messy. A "cloud compute" service is more attractive in that case than a highly optimized HPC cluster.

A very real use case for this sort of setup is "man, my laptop doesn't have the power to churn through all this data, let me upload my project as-is to Amazon and throw a few petaflops at it". I've seen a few people use AWS for things like rendering 3D scenes (Blender et al). It's a nice option to have a few teraflops at your disposal when you need them for a relatively low price.

Re:1.21 PetaFLOPS (RPeak) (1)

serviscope_minor (664417) | about 10 months ago | (#45420669)

For one there may simply be a lot of bureaucratic steps involved in renting out resources from another university.

True if it's another university's cluster. However, there are quite a few academic supercomputer centres which are built specifically to rent out space to universities. They're often not even associated with universities at all.

The second is that some cluster you don't own might not support your particular software/platform/project.

Indeed, especially if the cluster is something exotic. If it's one of those PPC Blue Gene things, or a SPARC-based RIKEN one, then you're in for a fair bit of work. Most these days are x86 running Linux on the nodes with TCP/IP available if you wish. While you don't get quite as much flexibility as EC2, where you can choose your OS, they'll still run near enough anything.

Basically, almost all scientific software which more or less needs a cluster to do anything non-trivial runs on Linux, since almost all compute clusters run Linux.

Being able to load an EC2 instance with the same (or similar enough) configuration as your work laptop is a feature.

Yeah, that's a handy feature.

Re:1.21 PetaFLOPS (RPeak) (1)

mjwalshe (1680392) | about 10 months ago | (#45416031)

But how is the EC2 network set up? How do you make sure you have the right balance between N/S and E/W traffic?

Re:1.21 PetaFLOPS (RPeak) (0)

Anonymous Coward | about 10 months ago | (#45417057)

After analyzing the results they determined the best material for solar cells was... silly putty.

Re:1.21 PetaFLOPS (RPeak) (1)

Decker-Mage (782424) | about 10 months ago | (#45420331)

I raised the point some time back that the various providers could lend instance idle time to distributed computing projects, perhaps as a tax deduction. This is at least a half-step closer, although you have a good point about usability.

Old Joke (2)

Squiddie (1942230) | about 10 months ago | (#45415129)

But can it run Crysis?

Re:Old Joke (0)

Anonymous Coward | about 10 months ago | (#45415387)

Better yet, can it run Minecraft with at least 30fps?

Or Dwarf Fortress at 0.8fps?

Re:Old Joke (2)

CaseCrash (1120869) | about 10 months ago | (#45415971)

But can it run Crysis?

Dear lord, this is an old joke now?

Re:Old Joke (1)

identity0 (77976) | about 10 months ago | (#45421269)

You want old jokes?

Imagine a Beowulf cluster of these....

Re:Old Joke (1)

mythosaz (572040) | about 10 months ago | (#45416273)

Screw Crysis. How fast can it mine Bitcoins?

Re:Old Joke (1)

styrotech (136124) | about 10 months ago | (#45417263)

Not enough irony or recursion for Slashdot. How about "Imagine a beowulf cluster of these"?

Yawn (0)

Anonymous Coward | about 10 months ago | (#45415137)

Boring PR exercise is boring. Yes, Virginia, supers you can simply script together are boring, especially because they're not super in the least unless you throw only the most embarrassingly parallel problems at them. As soon as you need any sort of communication among the nodes at all, it turns out the interconnect just isn't that great, and the efficiency goes through the floor. Whoops.

What would it take to impress me? That lone guy with his lone desktop breaking the record for calculating digits of pi was rather impressive. Now that that's been done, figure out something new. But this, throwing some cash at Amazon, isn't it.

theoretical performance (0)

Anonymous Coward | about 10 months ago | (#45415153)

The 1.21 PFLOP/s figure is a theoretical peak performance. In comparison, the numbers that rank systems on the Top500 list are sustained (Rmax) performances.
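(Rpeak really is just arithmetic: cores times clock times FLOPs per core per cycle. The numbers below are assumed round figures chosen to land near 1.21 PFLOPS, not the actual EC2 instance mix.)

    flops_per_core_per_cycle = 8          # e.g. AVX: 4 adds + 4 multiplies per cycle
    clock_hz = 2.6e9                      # assumed 2.6 GHz cores
    cores = 58_000                        # assumed core count

    rpeak = cores * clock_hz * flops_per_core_per_cycle
    print("Rpeak ~ %.2f PFLOPS" % (rpeak / 1e15))   # ~1.21 PFLOPS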

FTA (4, Insightful)

Saethan (2725367) | about 10 months ago | (#45415169)

FTA:

Megarun's compute resources cost $33,000 through the use of Amazon's low-cost spot instances, we're told, compared with the millions and millions of dollars you'd have to spend to buy an on-premises rig.

Running somebody else's machines for 18 hours costs less than buying a machine that powerful for yourself to run 24/7...

NEWS AT 11!

Re:FTA (2)

AvitarX (172628) | about 10 months ago | (#45415299)

That's what I thought. It is great that it is possible to run a simulation on a five-figure budget, but if it's something that gets heavy use, having your own is better. I predict this will help Cray (contrary to the article's implication): companies can start using big compute and see where it takes them without dropping the capital expense, and then move to more constant use of such resources, at lower marginal cost, by bringing it in house.

Re:FTA (2)

fuzzyfuzzyfungus (1223518) | about 10 months ago | (#45415609)

It wouldn't surprise me if organizational dynamics come into the picture as well. If a researcher can purchase consumables and services related to his work up to X dollars on his own (subject only to oversight after the fact if somebody raises an eyebrow) and up to Y dollars with a sign-off from the lab head or somebody, but would need 6 signatures, university-level approval for the facilities repurposing, and who knows what else for an in-house system, he has a pretty strong incentive to just pay Amazon to do it, even if getting an in-house system makes more sense in the longer term.

On the other side of the coin, if a university is looking for a prestige project that'll look pretty damn cool through the glass when they take tours around, they might get a butch, black, blinkencomputer even if utilization ends up being tepid.

Re:FTA (1)

ModernGeek (601932) | about 10 months ago | (#45420959)

I hate how the press sensationalizes the idea of renting out server space by calling it the cloud. Even marketing geared towards IT professionals does it, and everyone speaks of having someone host your files, calling it "the cloud" as if it had just happened a few years ago. It's almost marketed as some mysterious magical force that just puts everything into play. I hate it. Get off my lawn.

Re:FTA (1)

ModernGeek (601932) | about 10 months ago | (#45420961)

Also, why is it special that he rented Amazon's computing time? If he had rented computing time on a University supercomputer, or a cluster owned by another private corporation, would it have made a sensationalist headline? Had a University donated the time to him, would it have been news? This is nothing but astroturfing for Amazon's proprietary service, and has no place here.

El Reg (1, Insightful)

spike hay (534165) | about 10 months ago | (#45415181)

How about let's not use the anti-science mouthbreathers at the Register as a source.

HPC? (5, Insightful)

NothingMore (943591) | about 10 months ago | (#45415243)

"Supercomputing applications tend to require cores to work in concert with each other, which is why IBM, Cray, and other companies have built incredibly fast interconnects. Cycle's work with the Amazon cloud has focused on HPC workloads without that requirement." While this is cool, Can you really call something like this an HPC system if you are picking work loads that require little cross node communication? The requirement of cross node communication is pretty much the whole reason large scale HPC machines like ORNL's Titan exist at all. Wouldn't this system be classified closer to HTC because it is targeting workloads that are similar to those which would be able to run on HTC Condor pools?

Re:HPC? (2)

sunderland56 (621843) | about 10 months ago | (#45415955)

If we follow the article's reasoning, then SETI@home was one massive supercomputer, not 10,000 individual computers working on parts of a common task.

Re:HPC? (0)

Anonymous Coward | about 10 months ago | (#45416085)

@Enry: Your $16M cluster is the better solution if you will keep it working all the time over its projected lifespan. But where Cloud excels is in providing excess capacity to you when you need it, without your having to project future needs and without your needing to wait for however long it takes to design and build out your data center, populate it with hardware/software, and administer it. Many or most users of HPC on the Cloud already have in-house clusters in the range of hundreds or thousands of processors. They would then offload peak usage to Cloud, at least in extreme cases, such as this one.

#NothingMore: "Supercomputing applications tend to require cores to work in concert with each other." Some do, some don't. That is not part of some standard definition of supercomputing. And there is a whole range of required extent of interprocess communication, from very frequent to very infrequent. Even jobs like the one being discussed now require *some* (infrequent) IPC: a master job may divide the input, track the status of subjobs, and process their outputs into a unified report.

Even applications that require running processes to trade intermediate results may require this frequently or not. If it's frequent, then yes, a fast interconnect may be required for high effective parallelization; but it's not always frequent. Now take such an app and port the essential kernel to GPU, and you may be able to do it all within a single node with no fast interconnects required. Or maybe someone figures out an embarrassingly parallelizable algorithm for the problem. These things would not deprive the app of its "supercomputing application" moniker. In fact, I could take some embarrassingly parallelizable algorithms I know of and make them stupid enough to require the subjobs to trade data instead of doing it the better way. That would require fancy hardware, but surely they would not deserve the extra honor of being deemed a "supercomputing application" by virtue of that change.
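(That master/subjob shape is easy to sketch. Below it's shown locally with Python's multiprocessing, where the only IPC is the initial scatter and the final gather; the compound names and scoring function are invented placeholders, not Prof. Thompson's actual code.)

    from multiprocessing import Pool

    def score_candidate(material):
        # Placeholder for the real per-compound calculation.
        return material, sum(ord(c) for c in material) % 100

    if __name__ == "__main__":
        candidates = ["compound-%d" % i for i in range(1000)]   # master divides the input
        with Pool(processes=8) as pool:
            results = pool.map(score_candidate, candidates)     # subjobs run independently
        best = max(results, key=lambda r: r[1])                 # unified report
        print("best candidate:", best)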

Good but not great (4, Insightful)

Enry (630) | about 10 months ago | (#45415427)

So this ran for 18 hours, or about $1800/hour. That gives you just under $44,000 per day, or $16 million for a year.
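(For reference, the arithmetic behind those figures, using the $33,000 / 18-hour numbers from the article:)

    total_cost, hours = 33_000, 18
    per_hour = total_cost / hours      # ~$1,833/hour
    per_day = per_hour * 24            # ~$44,000/day
    per_year = per_day * 365           # ~$16 million/year
    print(round(per_hour), round(per_day), round(per_year / 1e6, 1))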

Give me $16 million a year and I can build you a very kick-butt cluster - the one I'm just finishing up is 5000 cores at about $3 million.

EC2 is great if your needs are small and intermittent. But if you're part of a larger organization that has continual HPC needs, you're going to be better off building it yourself for a while.

Re:Good but not great (0)

Anonymous Coward | about 10 months ago | (#45415569)

I could hear you sniffling and wiping the Cheetos from your beard while reading that.

Re:Good but not great (2)

cdrudge (68377) | about 10 months ago | (#45415845)

Give me $16 million a year and I can build you a very kick-butt cluster - the one I'm just finishing up is 5000 cores at about $3 million.

Presuming costs scale approximately linearly, $16m would net you 26-27k cores. They hit 6x that at peak. I didn't see them mention what they sustained over the long haul or averaged, but it looks like it was well above your scaled core numbers.

Re:Good but not great (0)

Anonymous Coward | about 10 months ago | (#45416187)

You forgot about running costs.

Re:Good but not great (0)

Anonymous Coward | about 10 months ago | (#45416637)

That's kind of the point of the article. EC2 brings HPC'ish capability to the layman. If you need 24x7x52 cluster computing, of course it's cheaper to build your own. EC2 lets you use it ad-hoc without the capital outlay. Did you read the damn article?

Re:Good but not great (0)

Anonymous Coward | about 10 months ago | (#45416661)

You're going to be better off assuming you're paying for it. If USC tuition and/or the taxpayer is paying for it, apparently you don't care.

Re:Good but not great (1)

Princeofcups (150855) | about 10 months ago | (#45417331)

So this ran for 18 hours, or about $1800/hour. That gives you just under $44,000 per day, or $16 million for a year.

Give me $16 million a year and I can build you a very kick-butt cluster - the one I'm just finishing up is 5000 cores at about $3 million.

EC2 is great if your needs are small and intermittent. But if you're part of a larger organization that has continual HPC needs, you're going to be better off building it yourself for a while.

People need to stop thinking of "cloud" as some kind of magic fairy land. It's just a bunch of servers and software that cost the same to purchase as anywhere else. Plus they have to make a profit. So of course you can build it cheaper yourself, if all you are comparing is bare hardware.

Two questions (1)

Ungrounded Lightning (62228) | about 10 months ago | (#45415445)

1) Did they FIND any exceptional and useful photovoltaic behavior in the compounds tested?

2) How much will this sort of crunch make up of the revenue lost to the rest of the world's migration away from US-based cloud services, in the wake of Snowden's revelations?

Re:Two questions (1)

marcosdumay (620877) | about 10 months ago | (#45415605)

1) Did they FIND any exceptional and useful photovoltaic behavior in the compounds tested?

The NSA certainly knows, and can tell any company they like.

2) How much will this sort of crunch make up of the revenue lost to the rest of the world's migration away from US-based cloud services, in the wake of Snowden's revelations?

If you are concerned about your competitors knowing about your internal research... I guess my answer to your first question also answers this one.

Re:Two questions (1)

Ungrounded Lightning (62228) | about 10 months ago | (#45419591)

1) Did they FIND any exceptional and useful photovoltaic behavior in the compounds tested?

The NSA certainly knows, and can tell any company they like.

Good point.

Government signals intelligence has a long track record of being used for industrial espionage, leaking both sales and tech info from foreign competitors to the country's own companies.

Examples include China's military leaking Cisco (and apparently other companies') tech to Huawei, the US bugging Toyota and Nissan for the benefit of US auto companies, and leaking intercepted info about competitors' bribery attempts that resulted in Raytheon and McDonnell Douglas getting big contracts that Thomson-Alcatel and Airbus had almost closed, to name just a few.

Even US companies have to worry about US government intercepts, since the US government has been playing favorites domestically as well. Some big examples, not involving signals intelligence, came out of the mortgage crisis, when some banks and other financial institutions were slaughtered (even those NOT in trouble), so their corpses could be absorbed by others with better political connections (and contribution records).

1.21 gigawatts. (2)

Cammi (1956130) | about 10 months ago | (#45415477)

No relation in any way to Doc Brown ... must be another troll posting articles! 1.21 gigawatts.

High Throughput Computing not HPC (1)

dlapine (131282) | about 10 months ago | (#45415825)

While this is a nice use of Amazon's EC2 to build a high-throughput system, it doesn't translate as nicely to what most High Performance Computing users need: high network bandwidth, low latency between nodes, and large, fast shared filesystems on which to store and retrieve the massive amounts of data being used or generated. The cloud created here is only useful to the subset of researchers who don't need those things. I'd have a hard time calling this High Performance Computing.

Look at XSEDE's HPC resources page [xsede.org]. While each of those supercomputers has something special about the services they offer (GPUs, SSDs, fast access, etc.), they all spent a significant portion of their build budget on a high-performance network to link the nodes for parallel codes. They also spent money on high-performance parallel filesystems instead of more cores. Their users can't get their research done effectively on systems or clouds without those important elements.

I think that it's great that public cloud computing has advanced to the point where useful, large-scale science can be accomplished on it. Please note that it takes a separate company (Cycle Computing, with CycleCloud) to make it possible to use Amazon EC2 in this way (lowest cost and webapp access) for your average scientist, but it's still an advance.

Disclaimer: I work for XSEDE, so do your own search on HPC to verify what I'm saying.

Re:High Throughput Computing not HPC (3, Insightful)

Yohahn (8680) | about 10 months ago | (#45415995)

The problem is that in a number of cases a researcher could easily use HTC, but they follow the fashion of HPC, using more specialized resources than necessary. Don't get me wrong, there are a number of cases where HPC makes sense, but usually what you need is a large amount of memory or a large number of processors. HPC only makes sense where you need both.

Re:High Throughput Computing not HPC (1)

dlapine (131282) | about 10 months ago | (#45416157)

Sure, that's why I said that this is an advance. If you don't need HPC resources, this can work really well. But you have to educate scientists and researchers on the difference, and this article doesn't do that well enough.

Re:High Throughput Computing not HPC (1)

markhahn (122033) | about 10 months ago | (#45418119)

No, the distinguishing feature of HPC is primarily access to a large set of cores with a fast interconnect. Generally homogeneous, with a flat, high-bisection fabric. Lots of memory is definitely not necessary; nor are features like SSDs or GPUs.

Re:High Throughput Computing not HPC (1)

Yohahn (8680) | about 10 months ago | (#45424740)

I was stating that in a number of cases, you don't need HPC, you need the high memory instead of the interconnect, because basically researchers write programs that just use the interconnect to provide a large memory.

Re:High Throughput Computing not HPC (1)

Snorbert Xangox (10583) | about 10 months ago | (#45418405)

"Following the fashion of HPC" is a bit harsh. It depends on whether the research group gets money (which they could spend on exactly the sort of compute that would suit them) or in-kind funding with grants of time at an existing large HPC site, and how much data they expect to produce, and where/how long they intend to store it. For instance, Australian university researchers had to pay ISP traffic charges on top of Amazon's own charges to download data from Amazon until November of 2012, when AARNET peered with Amazon, and then only for data downloaded from the US-WEST-2 region.

Also, if the research group is small, it depends a lot on who handles their IT support. If (because of the in-kind funding) they are depending on the expertise of the HPC site for support, then a lot is down to the particular HPC site and whether it has as much depth in supporting cloud workloads as traditional HPC workloads.

Marty! (1)

zingfodd (2195688) | about 10 months ago | (#45416801)

Now all they need is a flux capacitor, and then they can... oh wait...

1.21 PetaFLOPS is frigging impressive!! (0)

Anonymous Coward | about 10 months ago | (#45417893)

That's almost enough to run Vista.

People who like this sort of thing... (1)

Snorbert Xangox (10583) | about 10 months ago | (#45418045)

...will find this the sort of thing they like. For people/groups who have SETI@home or Folding@home style workloads - the type that the HPC community call "embarrassingly parallel" - and some money, this is useful. But it's sad that there is no mention made in the article of Condor [wikipedia.org] - a job manager for loosely coupled machines that has been doing the same kind of thing since the '80s - essentially, since there has been a network between a few sometimes-idle computers in a CS department. Cycle Computing itself has used Condor as part of its CycleServer [cyclecomputing.com] product. Jupiter is their own task distribution system which goes to larger scales than Condor can reach.
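(For anyone who hasn't met Condor: this kind of loosely coupled work is described with a short submit file, roughly like the sketch below; the executable and file names are invented.)

    universe     = vanilla
    executable   = score_candidate
    arguments    = $(Process)
    output       = results/out.$(Process)
    error        = results/err.$(Process)
    log          = cluster.log
    request_cpus = 1
    queue 10000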

It's cool that Cycle Computing have packaged up this cycle scavenging approach into infrastructure that lets people easily deploy and farm work out to EC2 spot instances. But as they make those instances easier to use, the demand will go up, and the spot price of compute capacity will likely go up too. Which is nice for Amazon, of course, but harder on groups that are trying to make a budget forecast of what their simulations will cost to run. The free market grid computing cheerleader types will be over the moon at the opportunity to write papers about spot instance futures markets on a service that actually got popular. But, as another poster points out, it's High Throughput Computing, not HPC, and the very thing that makes it amenable to spot markets, which is the fungibility of loosely coupled EC2 instances, also restricts it to loosely coupled workloads, especially ones that don't produce a huge amount of data for each separate run - although a couple of years ago Cycle were already looking at ways of improving this last restriction.

Just a stunt. (1)

markhahn (122033) | about 10 months ago | (#45418191)

Amazon makes a killing renting computers. Certain kinds of enterprises really want to pay extra for the privilege of outsourcing some of their IT to Amazon - sometimes it really makes sense and sometimes they're just fooling themselves.

People who do HPC usually do a lot of HPC, and so owning/operating the hardware is a simple matter of not handing that fat profit to Amazon. Most HPC takes place in consortia or other arrangements where a large cluster can be scheduled to efficiently interleave bursty usage patterns. That is, of course, precisely what Amazon does, though it tunes mainly for commercial (Netflix, etc.) workloads - significantly different from computational ones. (Real HPC clusters often don't have UPS, for instance, and almost always have higher-performance, high-bisection, flat/uniform networks, since inter-node traffic dominates.)
