×

Announcing: Slashdot Deals - Explore geek apps, games, gadgets and more. (what is this?)

Thank you!

We are sorry to see you leave - Beta is different and we value the time you took to try it out. Before you decide to go, please take a look at some value-adds for Beta and learn more about it. Thank you for reading Slashdot, and for making the site better!

Comments

top

Explosion At ThePlanet Datacenter Drops 9,000 Servers

JoeShmoe Ignorant firemen = single point-of-failure (431 comments)


Everyone loves firemen, right? Not me. While the guys you see in the movies running into burning buildings might be heroes, the real world firemen (or more specifically fire chiefs) are capricious, arbitrarty, ignorant little rulers of their own personal fiefdom. Did you know that if you are getting an inspection from your local firechief and he commands something, there is no appeal? His word is law, no matter how STUPID or IGNORANT. I'll give you some examples later.

I'm one of the affected customers. I have about 100 domains down right now because both my nameservers were hosted at the facility, as is the control panel that I would use to change the nameserver IPs. Whoops. So I learned why I need to obviously have NS3 and ND4 and spread them around because even though the servers are spread everywhere, without my nameservers none of them currently resolve.

It sounds like the facility was ordered to cut ALL power because of some fire chief's misguided fear that power flows backwards from a low-voltage source to a high-voltage one. I admit I don't know much about the engineering of this data center, but I'm pretty sure the "Y" junction where AC and generator power come together is going to be as close to the rack power as possible to avoid lossy transformation. It makes no sense why they would have 220 or 400 VAC generators running through the same high-voltage transformer when it would be far more efficient to have 120 or even 12VCD (if only servers would accept that). But I admit I could be wrong, and if it is a legit safety issue...then it's apparently a single point of failure for every data center out there because ThePlanet charged enough that they don't need to cut corners.

Here's a couple of times that I've had my hackles raised by some fireman with no knowledge of technology. The first was when we switched alarm companies and required a fire inspector to come and sign off on the newly installed system. The inspector said we needed to shut down power for 24 hours to verify that the fire alarm would still work after that period of time (a code requirement). No problem, we said, reaching for the breaker for that circuit.

No no, he said. ALL POWER. That meant the entire office complex, some 20-30 businesses, would need to be without power for an entire day so that this fing idiot could be sure that we weren't cheating by sneaking supplimentary power from another source.

WHAT THE FRACK

We ended up having to rent generators and park them outside to keep our racks and critical systems running, and then renting a conference room to relocate employees. We went all the way to the country commmissioners pointing out how absolutely stupid this was (not to mention, who the HELL is still going to be in a burning building 24 hours after the alarm's gone off) but we told that there was no override possible.

The second time was at a different place when we installed a CO alarm as required for commercial property. Well, the inspector came and said we need to test it. OK, we said, pressing the test button. No no, he said, we need to spray it with carbon monoxide.

Where the HELL can you buy a toxic substance like carbon monoxide, we asked. Not his problem but he wouldn't sign off until we did. After finding out that it was illegal to ship the stuff, and that there was no local supplier, we finally called the manufacturer of the device who pointed out that the device was void the second it was exposed to CO because the sensor was not reusuable. In other words, when the sensor was tripped, it was time to buy a new monitor. You can see the recursive loop that would have devloped if we actually had tested the device and then promptly had to replace it and get the new one retested by this idiot.

So finally we got a letter from the manufacturer that pointed out the device was UL certified and that pressing the test button WAS the way you tested the device. It took four weeks of arguing before he finally found an excuse that let him safe face and pass up without it. But that was about a month of rent loss because we couldn't get occupancy.

So F-this Nanny State bull. If ThePlanet had a single point of design failure, I might move my servers but where can I go that will likely be designed any different? But if this turns out some fire chief said kill the power to the whole block because he didn't want to stub his dainty little toe then......(sigh)

-JoeShmoe
.

more than 6 years ago

Submissions

JoeShmoe hasn't submitted any stories.

Journals

JoeShmoe has no journal entries.

Slashdot Login

Need an Account?

Forgot your password?