Major Outage at Amazon Web Services

CmdrTaco posted more than 3 years ago | from the but-the-cloud-fixes-everything dept.

Cloud 247

ralphart writes "The Northern Virginia datacenter for Amazon Web Services appears to be having a major outage that affects EC2 services. The Amazon Forums are full of reports of problems. Latest update from the status page: 2:49 AM PDT We are continuing to see connectivity errors impacting EC2 instances, increased latencies impacting EBS volumes in multiple availability zones in the US-EAST-1 region, and increased error rates affecting EBS CreateVolume API calls. We are also experiencing delayed launches for EBS backed EC2 instances in affected availability zones in the US-EAST-1 region. We continue to work towards resolution."

247 comments

Reddit is down because of this (1)

HelioWalton (1821492) | more than 3 years ago | (#35894894)

How am I supposed to be able to not do work?

Re:Reddit is down because of this (5, Funny)

cobrausn (1915176) | more than 3 years ago | (#35894934)

You're posting on Slashdot, so I believe you already found the answer.

Re:Reddit is down because of this (0)

Anonymous Coward | more than 3 years ago | (#35895374)

Upvoted, commented for the same reason...

Re:Reddit is down because of this (1)

MobileTatsu-NJG (946591) | more than 3 years ago | (#35895850)

You're posting on Slashdot, so I believe you already found the answer.

Yeah but maybe he's hungry for news.

Re:Reddit is down because of this (1)

lumbercartel.ca (944801) | more than 3 years ago | (#35895934)

I mostly come here for humour, and once again I'm not disappointed!

Re:Reddit is down because of this (1)

Anonymous Coward | more than 3 years ago | (#35894966)

Digg is still up

Re:Reddit is down because of this (1)

wiggles (30088) | more than 3 years ago | (#35895092)

They took Digg down last year and replaced it with this horrible monstrosity they called 'v4' or something. It's a shame they just took such a popular site offline and haven't provided a decent replacement.

Re:Reddit is down because of this (2)

jpmoney (323533) | more than 3 years ago | (#35895130)

People still go to digg? Oh, I see what you did there.

I actually went to Digg this morning since Reddit is down. I haven't been in months since I removed them from my RSS reader. All I have to say is "ouch". Front page stories with a whopping 5 comments? It's pretty sad.

Re:Reddit is down because of this (0)

Anonymous Coward | more than 3 years ago | (#35895336)

I actually added it to my site block list to retrain my finger memory after v4 hit and it went to hell. If my fingers magically typed it, Leechblock would tell me no. Reddit, Slashdot, & Mefi eat up enough time.

Re:Reddit is down because of this (1)

jafuser (112236) | more than 3 years ago | (#35895736)

This is the first time I've been back here in a while. I decided to try it when I realized reddit's downtime is probably going to be a while. I still feel a reverence for this place. It sort of reminds me of going back and visiting my university.

Digg can rot in hell.

Re:Reddit is down because of this (2)

badran (973386) | more than 3 years ago | (#35895244)

Productivity in Offices will reach record levels today.

Re:Reddit is down because of this (0)

Anonymous Coward | more than 3 years ago | (#35895456)

Popurls [popurls.com] has everything!

Re:Reddit is down because of this (0)

Anonymous Coward | more than 3 years ago | (#35895556)

Better drink my own piss.

No Way! (5, Funny)

Frosty Piss (770223) | more than 3 years ago | (#35894908)

But how can this be possible? It's The Cloud . This sort of this simply doesn't happen.

Re:No Way! (2)

alphatel (1450715) | more than 3 years ago | (#35894972)

It didn't happen. The cloud can erase history in a planck!

Re:No Way! (0)

Anonymous Coward | more than 3 years ago | (#35895010)

Seems to happen really rarely, since it now seems to be huge news and slashdot is reporting on it right away.

Re:No Way! (2)

Anonymous Coward | more than 3 years ago | (#35895286)

But it's not supposed to happen, because "if" (when!) it does, the impact is HUMONGOUS. "You're welcome to store all your data in our fast, easy and safe cloud storage. Downtime? Don't worry, it'll only experience hour long outages intermittently." Yeah, that's how they sold it in the first place, isn't it?

This will become quite the event in data warehouse circles I bet, because the cost of 'being in the cloud' just doubled; it's not enough to buy storage from one provider. The "always there" quality that's supposedly the benefit of cloud storage is a facade.

Re:No Way! (2, Insightful)

cduffy (652) | more than 3 years ago | (#35895330)

This will become quite the event in data warehouse circles I bet, because the cost of 'being in the cloud' just doubled; it's not enough to buy storage from one provider. The "always there" quality that's supposedly the benefit of cloud storage is a facade.

You can buy from one provider -- every major cloud provider has multiple availability zones. But yes, lots of people buy in only one zone because it's cheaper, and then suffer for that mistake -- in situations just like this.

Re:No Way! (0)

jellomizer (103300) | more than 3 years ago | (#35895174)

The cloud isn't immune to problems. But it is normally more tolerant of problems than your/your business's internal systems, unless you spend a great deal on a full infrastructure, in which case you probably get something just as good. A major outage on most professional cloud setups means it is down for a few hours. A major outage at work means the full day. It is like saying driving my car is so much safer than flying because I never got into an accident.

Re:No Way! (4, Informative)

0123456 (636235) | more than 3 years ago | (#35895372)

A major outage on most professional cloud setups means it is down for a few hours. A major outage at work means the full day. It is like saying driving my car is so much safer than flying because I never got into an accident.

Last time I remember a day-long outage at work was 1994, and that was because the license server failed so we couldn't run our own software (we couldn't recompile it to remove the DRM because the compiler also needed a license to run).

I seem to remember that the Mac guys at the company also had a long outage when they couldn't connect to one of their Mac servers, but eventually someone actually went to the server room and discovered that it had been stolen.

Back on topic, I just don't see all these day-long outages that apparently seem to happen all the time in companies that haven't moved their servers to The Cloud(tm).

Re:No Way! (2)

TooMuchToDo (882796) | more than 3 years ago | (#35895808)

But when it's your gear, you have some control over the situation. When it's "in the cloud", you sit and get yelled at by the CXO and sweat if you'll still have a job while cloud provider X works to fix the problem (and their liability? whatever you paid for the service).

Re:No Way! (1)

pdbaby (609052) | more than 3 years ago | (#35895238)

Jokes aside, if people use The Cloud (I'm using this tongue in cheek...) rather than a cloud this thing doesn't happen.
We use a number of providers which means that even if Amazon fell over completely our systems would be fine -- it looks like a lot of sites (reddit, for instance) don't bother to do this.

Re:No Way! (1)

91degrees (207121) | more than 3 years ago | (#35895890)

I'm not a databases guy, so sorry if this is a silly question, but reddit does have a lot of stuff being written to the database all the time.

So if you spread over multiple sites, is this manageable without dramatically increasing server load?

Re:No Way! (0)

Anonymous Coward | more than 3 years ago | (#35895532)

Oh. MY. GOD. A website is DOWN!!!!!

The end times are upon us! A website went down today! A website! This was foretold in scripture! RUN AND PANIC IN THE STREETS NOW. THAT IS ALL THERE IS LEFT. Nothing else makes any sense! Because a website is down!! Don't you understand?!?

Re:No Way! (1)

ron_ivi (607351) | more than 3 years ago | (#35895628)

But how can this be possible? It's The Cloud . This sort of this simply doesn't happen.

To be fair to Amazon - on a good cloud (incl. Amazon's) you can launch instances in completely different data centers, so your most critical services have somewhere to fail over to.

Though, personally I'd feel even better if my nodes were distributed across two different clouds; to avoid the single-point-of-failure of the Amazon account itself. For example, despite running in both their East and West data centers, I'm still vulnerable to a sales/billing miscommunication that freezes my whole account.

Re:No Way! (1)

dkleinsc (563838) | more than 3 years ago | (#35895838)

This sort of this simply doesn't happen.

Now we know: All it takes is one admin screwing up and replacing an "ng" with an "s".

Severe weather in Virginia likely the culprit (3, Informative)

stopacop (2042526) | more than 3 years ago | (#35894944)

Severe weather hit the area. They shut down Surry Power Station in Surry County, Virginia after a tornado took the power out that powers the power station.

Re:Severe weather in Virginia likely the culprit (0)

Anonymous Coward | more than 3 years ago | (#35895034)

They don't have backup generators?!

Re:Severe weather in Virginia likely the culprit (1)

MintyGreenMedia (513510) | more than 3 years ago | (#35895078)

I find it slightly more concerning that the power plant didn't. They're not designed to be self-sufficient?

Re:Severe weather in Virginia likely the culprit (1)

Burdell (228580) | more than 3 years ago | (#35895202)

No, they're not (see Fukushima, Japan). Basically, you don't just flip a switch and have a power plant go dark; you have to follow a shutdown procedure that takes both time and power. I don't know the requirements for coal or natural gas plants, but US nuclear plants are required to have multiple backup power sources (IIRC at least two independent diesel generator systems as well as off-site power). If the plant loses one backup power source for more than a certain period, it is required to shut down. IIRC if it loses two, it must shut down immediately (before potentially losing the remaining backups).

Re:Severe weather in Virginia likely the culprit (1)

MintyGreenMedia (513510) | more than 3 years ago | (#35895708)

I think you're ignoring the fact that, in the case of Fukushima, they were set up to be self-sufficient -- it's just that the tsunami knocked out their backup generators.

Re:Severe weather in Virginia likely the culprit (1)

tlhIngan (30335) | more than 3 years ago | (#35895852)

I think you're ignoring the fact that, in the case of Fukushima, they were set up to be self-sufficient -- it's just that the tsunami knocked out their backup generators.

Only due to cost savings. The tsunami wall that was built was half the required height (6 m instead of 12 m). Naturally, a 10 m high tsunami hit. And no placement of the generators would've helped (they were in the basement, and that got flooded, but if they were outside, they could've gotten washed away).

Re:Severe weather in Virginia likely the culprit (0)

OverlordQ (264228) | more than 3 years ago | (#35895062)

after a tornado took the power out that powers the power station.

Does not compute. Once it's running, why can't a power station use its own power?

Re:Severe weather in Virginia likely the culprit (1)

jtdennis (77869) | more than 3 years ago | (#35895126)

it was probably a distribution station, not a power generation facility.

Re:Severe weather in Virginia likely the culprit (1)

MintyGreenMedia (513510) | more than 3 years ago | (#35895740)

Re:Severe weather in Virginia likely the culprit (1)

MmmmAqua (613624) | more than 3 years ago | (#35895132)

If it's a substation, it doesn't have its own power.

Re:Severe weather in Virginia likely the culprit (0)

Anonymous Coward | more than 3 years ago | (#35895204)

Original poster should have included "Nuclear" in the Surry Power Station explanation.

Re:Severe weather in Virginia likely the culprit (1)

MintyGreenMedia (513510) | more than 3 years ago | (#35895828)

I'd agree, except a) it's called "Surry Power Station," and b) a quick Google on that name gives you all the gory details.

Re:Severe weather in Virginia likely the culprit (1)

kevinNCSU (1531307) | more than 3 years ago | (#35895308)

after a tornado took the power out that powers the power station.

Does not compute. Once it's running, why can't a power station use its own power?

Because you tend to want to have power available to cool nuclear fuel even if you decide to stop producing power for whatever reason (maintenance, mechanical failure, tornado, earthquake, tsunami, Nazi zombie attack).

Re:Severe weather in Virginia likely the culprit (1)

SecurityGuy (217807) | more than 3 years ago | (#35895884)

In which case being unable to use a secondary source (self-generated power) would be a bad thing, no?

Re:Severe weather in Virginia likely the culprit (1)

hawguy (1600213) | more than 3 years ago | (#35895622)

News reports are spotty, but I imagine that the plant tripped the turbines offline after the tornado damaged the power distribution equipment.

When it's generating 1GW of power and suddenly the load goes down to 0GW, the turbines have to trip offline automatically and immediately to prevent damage.

This may have also triggered a shutdown of the nuclear reactor, and it may take days or longer to bring it online after an emergency shutdown.

Re:Severe weather in Virginia likely the culprit (4, Informative)

getagrip (86081) | more than 3 years ago | (#35895128)

I am in Northern Virginia. There is no power outage or severe weather here.

Re:Severe weather in Virginia likely the culprit (1)

stopacop (2042526) | more than 3 years ago | (#35895164)

Going by a news report I saw!

Re:Severe weather in Virginia likely the culprit (2, Informative)

Anonymous Coward | more than 3 years ago | (#35895486)

First: Please look at a map. Surry County is east of Richmond on the way to VA Beach. An outage at Surry Power Station would not affect a data center over in Dulles, VA. That power station does not serve this area at all.

Second: Read the news. Every comment above is wrong in one way or another. Here is a local news article about what happened down there, if you are curious:
http://www.examiner.com/progressive-in-richmond/surry-power-station-under-repair-the-aftermath-of-tornado

You people know nothing, and you post crap without doing any research at all.

Re:Severe weather in Virginia likely the culprit (0)

Gothic_Walrus (692125) | more than 3 years ago | (#35895222)

I am in Northern Virginia. There is no power outage or severe weather here.

I'm gonna believe the [examiner.com] multiple [wtvr.com] news [vagazette.com] stories [hamptonroads.com] that say you're wrong.

Re:Severe weather in Virginia likely the culprit (2)

xnpu (963139) | more than 3 years ago | (#35895368)

Those news reports do not rule out the possibility that he's in a place in Northern Virginia without severe weather or a power outage. How do you conclude that he is wrong?

Re:Severe weather in Virginia likely the culprit (0)

Anonymous Coward | more than 3 years ago | (#35895918)

He's wrong because he assumes his place is indicative of the entire Northern Virginia?

Re:Severe weather in Virginia likely the culprit (0)

Anonymous Coward | more than 3 years ago | (#35895580)

We don't normally count Southeastern Virginia as part of Northern Virginia.

Re:Severe weather in Virginia likely the culprit (1)

coastal984 (847795) | more than 3 years ago | (#35895970)

We in southeastern Virginia are normally offended when coupled with Northern Virginia :)

Re:Severe weather in Virginia likely the culprit (0)

Anonymous Coward | more than 3 years ago | (#35895696)

The tornado was several days ago, on the 18th.

Re:Severe weather in Virginia likely the culprit (0)

Anonymous Coward | more than 3 years ago | (#35895248)

Probably contractors then.

Re:Severe weather in Virginia likely the culprit (1, Funny)

Hal_Porter (817932) | more than 3 years ago | (#35895452)

Kill all permies

Re:Severe weather in Virginia likely the culprit (0)

Anonymous Coward | more than 3 years ago | (#35895960)

Being that this power station is in southern Virginia, I think that is highly possible.

Re:Severe weather in Virginia likely the culprit (2)

pdbaby (609052) | more than 3 years ago | (#35895170)

Amazon's Availability Zones are designed to have separate power, cooling and network so I don't think this is the issue. It was (is) a problem with their disk subsystem in multiple availability zones so I suspect they were in the process of pushing out some new storage controller code and some bug didn't appear until the later stages of their rollout. From their status log it looks like they're manually correcting the issue with each disk.

Re:Severe weather in Virginia likely the culprit (0)

Anonymous Coward | more than 3 years ago | (#35895182)

so Clouds take down Cloud

Re:Severe weather in Virginia likely the culprit (2)

metrometro (1092237) | more than 3 years ago | (#35895214)

Amazon's comments on the outage do not mention weather as a cause: http://status.aws.amazon.com/ [amazon.com]

"8:54 AM PDT We'd like to provide additional color on what we're working on right now (please note that we always know more and understand issues better after we fully recover and dive deep into the post mortem). A networking event early this morning triggered a large amount of re-mirroring of EBS volumes in US-EAST-1. This re-mirroring created a shortage of capacity in one of the US-EAST-1 Availability Zones, which impacted new EBS volume creation as well as the pace with which we could re-mirror and recover affected EBS volumes. Additionally, one of our internal control planes for EBS has become inundated such that it's difficult to create new EBS volumes and EBS backed instances. We are working as quickly as possible to add capacity to that one Availability Zone to speed up the re-mirroring, and working to restore the control plane issue. We're starting to see progress on these efforts, but are not there yet. We will continue to provide updates when we have them."

Re:Severe weather in Virginia likely the culprit (1)

alphatel (1450715) | more than 3 years ago | (#35895240)

So they can't fail over like a normal ESX instance? So my cloud computer is actually just a rack in Virginia?

Re:Severe weather in Virginia likely the culprit (1)

TooMuchToDo (882796) | more than 3 years ago | (#35895848)

Your cloud computer is a Xen instance in Virginia, and your "EBS block storage" is an iSCSI target. Magic it ain't.

Re:Severe weather in Virginia likely the culprit (1)

alphatel (1450715) | more than 3 years ago | (#35895950)

Essentially half-cloudassed clouding.

Re:Severe weather in Virginia likely the culprit (1)

Anne_Nonymous (313852) | more than 3 years ago | (#35895340)

>> Severe weather hit the area.

So you're saying clouds took out the cloud?

Re:Severe weather in Virginia likely the culprit (1)

MobileTatsu-NJG (946591) | more than 3 years ago | (#35895746)

Severe weather hit the area. They shut down Surry Power Station in Surry County, Virginia after a tornado took the power out that powers the power station.

Of course we all know that the not-cloud would have been impervious to that.

Re:Severe weather in Virginia likely the culprit (1)

coastal984 (847795) | more than 3 years ago | (#35895952)

...That was on Saturday.

Oh boy (1)

MintyGreenMedia (513510) | more than 3 years ago | (#35894954)

I'm glad everyone's moving to the cloud for reliability and scalability purposes!

Re:Oh boy (1)

codepunk (167897) | more than 3 years ago | (#35895270)

In about the time it took you to write that message I spun up a standby deployment in another data center, smart guy.

Re:Oh boy (1)

characterZer0 (138196) | more than 3 years ago | (#35895342)

How long does it take you to have the IP addresses rerouted?

Re:Oh boy (1)

moj0e (812361) | more than 3 years ago | (#35895902)

How long does it take you to have the IP addresses rerouted?

With Amazon's Elastic IPs, it takes seconds to reroute an IP address to another machine. Very handy in situations like these.

Re:Oh boy (1)

TooMuchToDo (882796) | more than 3 years ago | (#35895872)

Really? Wow. Perhaps you should let major sites like Reddit know. They've been down for *hours*.

The cloud works if you don't care about having control over when your business is down.

Re:Oh boy (1)

cduffy (652) | more than 3 years ago | (#35895278)

Amazon has "availability zones" for a reason, as do other cloud vendors.

If your infrastructure isn't resilient against everything in a zone suddenly disappearing, you're Doing It Wrong.

Re:Oh boy (1)

sbrown123 (229895) | more than 3 years ago | (#35895298)

Scalability: yes.
Cheap: yes.
Reliability: they don't say they are 100% fail safe. I think the figure is still in the 90's though which is pretty good.

If anyone tries to sell you 100% they are liars.

Re:Oh boy (1)

petteyg359 (1847514) | more than 3 years ago | (#35895480)

The Christians try to sell me 100% coverage...

Increased Latencies (1)

JamesonLewis3rd (1035172) | more than 3 years ago | (#35895030)

Bummer.

Lucky (1)

denshao2 (1515775) | more than 3 years ago | (#35895096)

My instance is on us-east-1d which is still up.

Re:Lucky (1)

pdbaby (609052) | more than 3 years ago | (#35895306)

Their API gives different names for the availability zones for each user (so your us-east-1d could be my us-east-1a), which complicates talking about issues (since all you can say is "two availability zones are experiencing problems"), especially when your system uses multiple accounts.
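To make the naming point concrete, here is a small sketch of how a per-account mapping could behave. Everything in it is hypothetical: the physical zone names, the account IDs, and the seeded shuffle are assumptions for illustration only, not how Amazon actually assigns labels.

```python
import random

# Hypothetical physical locations and the labels every account sees.
PHYSICAL_ZONES = ["dc-1", "dc-2", "dc-3", "dc-4"]
LABELS = ["us-east-1a", "us-east-1b", "us-east-1c", "us-east-1d"]

def zone_map(account_id: str) -> dict:
    """Return this account's label -> physical zone mapping (stable per account)."""
    shuffled = PHYSICAL_ZONES[:]
    # Seeding with the account ID keeps the mapping fixed for that account.
    random.Random(account_id).shuffle(shuffled)
    return dict(zip(LABELS, shuffled))

def label_for(physical_zone: str, account_id: str) -> str:
    """Which label does this account see for a given physical zone?"""
    mapping = zone_map(account_id)
    return next(label for label, zone in mapping.items() if zone == physical_zone)

# Two made-up accounts may see the same physical zone under different labels.
for account in ("account-A", "account-B"):
    print(account, zone_map(account))
```

With a scheme like this, "my us-east-1d is fine" says nothing about anyone else's us-east-1d unless both accounts happen to share a mapping.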

Re:Lucky (0)

Anonymous Coward | more than 3 years ago | (#35895354)

Same here - no issues in us-east-1d with instances or EBS. Their status page at http://status.aws.amazon.com/ gives no indication of which availability zones in east-1 are ok and which are having problems.

It was the anonymous! (0)

Anonymous Coward | more than 3 years ago | (#35895112)

The DDoS didn't work so they tried something else.

I know I'm a coward member.

Slashdot 'em while they are down (1)

phorwich (909601) | more than 3 years ago | (#35895134)

Well... I am sure the additional server load from curious slashdotters like myself can only be helping.

The dark side of outsourcing (2)

HangingChad (677530) | more than 3 years ago | (#35895224)

Slashdot and Digg have one day traffic surges because Reddit is down. I'm getting way too much done today not being distracted by the GoneWild girls. This productivity must cease at once!

Does go to show what can happen when your business depends on an outsource provider. Everyone has to depend on service providers to some extent, but sometimes it's a good exercise to see how many of your company eggs are in one basket. Redundancy is expensive, but so is losing business. Even Google has had Gmail interruptions, lost some customer data and experienced slowdowns.

tested (1)

nickb64 (1885128) | more than 3 years ago | (#35895318)

so this is why tested.com is down...

Give me my Reddit back! (1)

Frederic54 (3788) | more than 3 years ago | (#35895360)

Else I don't know what to do! I almost went to Digg! So please, Amazon guys, work on your stuff!

Emergency Plan (4, Interesting)

sycorob (180615) | more than 3 years ago | (#35895392)

I didn't even realize that one of our partners was using AWS until suddenly they were down all day. Amazon is really stable historically, but it's frustrating when you're out of business and all you can do is wait and see if Amazon will fix it soon.

In the "old school" thinking, smart companies have a redundant data center somewhere, humming along and waiting to be switched on if the main data center ever goes down. "The cloud" was supposed to solve that - massive redundancy within Amazon's services was supposed to protect you from outages. Not the case, apparently, since it looks like Amazon is going to fall below their promised 99.95% uptime (4.38 hours per year downtime).
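The 4.38-hour figure follows directly from the percentage. A quick sketch of the arithmetic, assuming the uptime is measured over a plain 365-day year (the function name is just for illustration):

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years

def allowed_downtime_hours(uptime_percent: float) -> float:
    """Hours per year a service may be down while still meeting the uptime figure."""
    return (1 - uptime_percent / 100) * HOURS_PER_YEAR

# Compare a few common uptime targets, including the 99.95% quoted above.
for pct in (99.0, 99.9, 99.95, 99.99):
    print(f"{pct}% uptime -> {allowed_downtime_hours(pct):.2f} hours/year")
```

So an outage that lasts most of a working day uses up more than a year's worth of allowance at 99.95%.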

I think the answer is to have redundant cloud services online, so you could switch from Amazon to Google or DevGrid if you had issues. The problem is, there's nothing quite like Amazon right now, it's not easy to switch from Amazon to some random service. This might be the biggest argument against virtual services - lack of standardization makes it hard to move from one to another, and hard to set up backup services in case of emergency.

Re:Emergency Plan (3, Insightful)

MariusBoo (883340) | more than 3 years ago | (#35895694)

Actually in the case of EC2 the smart thing would have been to have your instances spread over different availability zones...
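A rough sketch of what "spread over different availability zones" means in practice, using round-robin placement. The zone labels and instance names are made up, and this only models the placement logic, not any real EC2 API call:

```python
from collections import Counter
from itertools import cycle

def place_instances(instance_ids, zones):
    """Assign instances to zones round-robin so no zone holds more than its share."""
    zone_cycle = cycle(zones)
    return {instance: next(zone_cycle) for instance in instance_ids}

def survivors(placement, failed_zone):
    """Instances still running after every instance in failed_zone disappears."""
    return [i for i, z in placement.items() if z != failed_zone]

zones = ["zone-a", "zone-b", "zone-c"]        # hypothetical zone labels
instances = [f"web-{n}" for n in range(6)]    # hypothetical instance names

placement = place_instances(instances, zones)
print(Counter(placement.values()))            # two instances per zone
print(survivors(placement, "zone-a"))         # four instances remain
```

Note that the status updates for this outage mention multiple availability zones in US-EAST-1, so spreading across zones in a single region may not have been enough on its own here.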

Re:Emergency Plan (2)

ron_ivi (607351) | more than 3 years ago | (#35895700)

Just using Amazon West as well as Amazon East would have saved customers from this outage.

I think Amazon actually does great at covering all the technological single-points-of-failure.

The only reason I'd want a second cloud vendor is for the sales/account related single-point-of-failure of the Amazon Account being frozen due to a sales miscommunication or a MPAA/RIAA takedown notice, etc.

Re:Emergency Plan (1)

Synn (6288) | more than 3 years ago | (#35895718)

"In the "old school" thinking, smart companies have a redundant data center somewhere, humming along and waiting to be switched on if the main data center ever goes down. "

The problem is that gets really really expensive and it's actually quite hard to do properly.

You can do this with EC2 though, just have your application cross various geographical zones. Things like ELB even make this somewhat easier. But you still have to solve all the application problems that exist when your data stores exist across large distances.

Re:Emergency Plan (1)

Anonymous Coward | more than 3 years ago | (#35895730)

Amazon does have multiple datacenters -- it's your partners that didn't take advantage of that. The west coast datacenter costs a little more, but nothing is preventing them from starting instances there, except that maybe they don't have their database mirrored.

Re:Emergency Plan (3)

Alarash (746254) | more than 3 years ago | (#35895772)

Even by using only AWS you can set up redundancy across multiple North American regions. Even across continents, with one data center in Ireland and one in Singapore. But obviously it costs extra as they bill you the bandwidth between the regions. That's how you use The Cloud (c) (tm) (R). Using a single data center to set up redundancy is dumb because it's not redundancy. You need high availability for your VMs, but also for your data center.

This is why banks or large businesses, for instance, have two or more data centers they always keep synchronized and have at least 50 kilometers between them. Thinking "well it's in one AWS data center so it's safe" is wrong, and this incident is a fine example of that.

Re:Emergency Plan (0)

Anonymous Coward | more than 3 years ago | (#35895842)

Amazon is really stable historically...

I think the users over at Reddit would beg to differ.

Re:Emergency Plan (1)

pdbaby (609052) | more than 3 years ago | (#35895894)

Amazon have complete isolation between Regions and good isolation between Availability Zones.
At work we'd recommend people use 2 cloud providers for their important services (which could be 2 Amazon regions or it could be Amazon and Rackspace) to prevent this sort of failure taking your business offline. You can't rely on any particular cloud provider to be reliable but it's a reasonably safe bet that a selection of cloud providers won't have significant overlapping downtime.

It's also worth pointing out that all cloud SLAs are basically useless: if Amazon falls below their advertised uptime they'll refund you some of your charges - but they'll never refund more than what you've paid them: they don't compensate you for all the money you're losing (and the AWS charges are likely pocket change compared to this).
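The "safe bet" about overlapping downtime can be put in rough numbers. This is a back-of-the-envelope sketch that assumes the providers fail independently of each other, which is a strong assumption (shared network paths or shared software could break it); the 99.95% input is just the figure quoted elsewhere in this thread:

```python
def combined_availability(*availabilities: float) -> float:
    """Probability that at least one provider is up, assuming independent failures."""
    all_down = 1.0
    for a in availabilities:
        all_down *= (1.0 - a)   # chance that every provider so far is down at once
    return 1.0 - all_down

single = 0.9995                 # one provider at 99.95%
pair = combined_availability(single, single)

print(f"one provider:  {single:.4%}")
print(f"two providers: {pair:.6%}")
```

Under that independence assumption, two 99.95% providers are both down only about 0.000025% of the time, which is why the second provider buys so much more than the SLA refund ever would.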

Disaster Recovery (1)

trainwrek (567874) | more than 3 years ago | (#35895394)

This is why, regardless of whether you're in the cloud or not, you need to have the ability to fail over to multiple datacenters in different geographical locations. Availability Zones are good but don't cut it. Unfortunately, Amazon doesn't make transferring backups between regions easy or cheap.

Not so bad.. (1)

kevinNCSU (1531307) | more than 3 years ago | (#35895408)

I was wondering why it took longer to start up my hadoop cluster this morning on EC2, but it still beats the living hell out of buying and configuring large numbers of machines for short term testing.

a downed cloud is fog (0)

Anonymous Coward | more than 3 years ago | (#35895522)

so we wait for the fog to lift...

cloudfail (1)

imp7 (714746) | more than 3 years ago | (#35895530)

We're now approaching our final destination, a datacenter of the future where nothing can possi-blye go wrong. Er, possi_bly_ go wrong. Heh, that's the first thing that's ever gone wrong.

Judgement Day (1)

treerex (743007) | more than 3 years ago | (#35895600)

Hmmmm... today *is* Judgement Day... perhaps Skynet's first target is AWS's East-Coast data center. Coincidence? I think not.

6 weeks before the AWS summit 2011 (3, Interesting)

grapeape (137008) | more than 3 years ago | (#35895676)

Gotta wonder what kind of flak Amazon is going to take for this one. I've had a couple clients looking into cloud services including moving to AWS and have already had one of them call me and cancel a meeting about it. While I understand stuff happens, the entire sales pitch for AWS was redundancy and build as you grow. Redundancy has obviously not worked in this case. While I usually support cloud services, this is definitely going to be a hard example to counter when trying to sell it to potential customers.

Re:6 weeks before the AWS summit 2011 (0)

darjen (879890) | more than 3 years ago | (#35895742)

Even with this, Amazon still probably has more uptime than somebody managing their own servers, especially the larger you get. It's pretty short sighted to simply dismiss them out of hand because of one incident.

Re:6 weeks before the AWS summit 2011 (4, Informative)

TooMuchToDo (882796) | more than 3 years ago | (#35895920)

It's not short sighted at all. When someone else runs your gear, all you can do is sweat until they get things back online, and they can take their time under what's known as "commercially reasonable SLAs". When you own your own gear, your own colo, etc., how much effort you use to get back up and running is up to you.

"The Cloud" for mission critical businesses is a joke.

Re:6 weeks before the AWS summit 2011 (1)

Synn (6288) | more than 3 years ago | (#35895782)

"Redundancy has obviously not worked in this case"

Only 1 region is affected. If your app was set to work with multiple zones then it likely wouldn't be impacted by this outage.

The thing with EC2 is it gives you the tools to build complex clusters. It doesn't do it for you.

Soo... (1)

Syberz (1170343) | more than 3 years ago | (#35895806)

Is anybody else suffering from Reddit withdrawal?

Re:Soo... (0)

Anonymous Coward | more than 3 years ago | (#35895948)

Only reason I am here...

Inappropriate metaphor - the cloud (1)

NicknamesAreStupid (1040118) | more than 3 years ago | (#35895898)

It means inclement weather; it rains; it pours; it delays air traffic; it's gloomy. You can look up at it and see whatever you can imagine, but it is not real. It goes away when you most need it. It is all wet.

player (1)

callmebill (1917294) | more than 3 years ago | (#35895924)

Uh-oh... This might be my fault. I've been loading music into my Amazon Cloud Player. Sorry guys.