Recurly's Backup Mess Takes Days to Clean Up

samzenpus posted more than 2 years ago | from the best-practices dept.

Businesses 21

A cascading hardware outage struck subscription payment provider Recurly last week, kicking off an extended lesson in how not to manage critical infrastructure. From the article: "Last Monday, the payment provider suffered an intermittent hardware failure, which prevented the company from processing either payments or refunds. The company says it serves over 1,000 customers, including Adobe, BrightCove, and Fox News Radio, processing recurring payments for subscriptions. By Friday, the company still hadn’t completely straightened out the mess, providing updates to customers using payment gateways such as Authorize.net and LinkPoint/First Data."

Reminds me of Authorize.net (4, Funny)

Mr. Kinky (2726685) | more than 2 years ago | (#41291297)

This case reminds me of our payment processor Authorize.net in 2009, when a fire took down the whole network and infrastructure for many days. It was only solved when one of the guys over at Authorize.net literally

Re:Reminds me of Authorize.net (5, Funny)

MetalliQaZ (539913) | more than 2 years ago | (#41291331)

He would have finished the story but he had a cascading hardware failure that took out his network...

Re:Reminds me of Authorize.net (0)

Anonymous Coward | more than 2 years ago | (#41291955)

He accidentally the post.

Re:Reminds me of Authorize.net (1)

maxwell demon (590494) | more than 2 years ago | (#41292915)

Yeah, that never could happen with me, because I

Re:Reminds me of Authorize.net (1)

carlos92 (682924) | more than 2 years ago | (#41292611)

Literally what? The suspense is killing me!

Re:Reminds me of Authorize.net (1)

tstrunk (2562139) | more than 2 years ago | (#41293287)

I know that technician! His name was Candlejack, right?
When he came to

Re:Reminds me of Authorize.net (0)

Anonymous Coward | more than 2 years ago | (#41294003)

Used to work for a large payment processor and outages were pretty common. Every month we'd have our gateway fail or transactions not process at all. Something as simple as not checking the UPS batteries monthly in the data center to make sure they were still good caused the last outage... Pretty common practice to half-ass everything; they don't care about supporting the customers, just getting their percentage off your transactions. Credit cards are the biggest scam ever: if you run a business and take CC, they will bleed you dry with all the fees and other interchange bullshit. You'd be surprised how many merchants have trouble running credit cards and regularly double-authorize charges on people's cards. Nothing like having your money held hostage for a number of days or weeks...

Re:Reminds me of Authorize.net (1)

arglebargle_xiv (2212710) | more than 2 years ago | (#41296763)

Pretty common practice to half-ass everything; they don't care about supporting the customers, just getting their percentage off your transactions.

A friend of mine runs a networking services company that got called in to a medium-sized payment processor a few months back to upgrade a server, about an afternoon's work. After several months of 10-12 hour days he's now got them up to the level where they're about quarter-arsed. With another few months' work they'll be at the level of half-arsed. When he described the original setup he found, I thought he was making it up. It was just fail layered upon fail layered upon fail, like something a bunch of drunken geeks had invented as a joke to see how dysfunctional a collection of systems and networking you could make that would still appear to work most of the time.

Re:Reminds me of Authorize.net (1)

cusco (717999) | more than 2 years ago | (#41300483)

Makes me glad that I pay cash for everything possible.

Too big to fail? (0)

Anonymous Coward | more than 2 years ago | (#41291615)

I'm a little glad we aren't so big that if our colocation network access fucks up we end up on slashdot.

I would've been leery of... (5, Funny)

Anonymous Coward | more than 2 years ago | (#41291625)

...a service provider named Recurly in the first place.

Same goes for any provider named Relarry, Remoe or Reshemp either for that matter.

Re:I would've been leery of... (0)

Anonymous Coward | more than 2 years ago | (#41292963)

Don't forget Rejoe you insensitive clod!

Coitainly! (0)

Anonymous Coward | more than 2 years ago | (#41293857)

Nuyk nuyk nuyk.

Re:I would've been leery of... (2)

Alien Being (18488) | more than 2 years ago | (#41295359)

I'm Honest Moe, that's Honest Shemp, and that's... that's Larry.

No backups (3, Interesting)

Anonymous Coward | more than 2 years ago | (#41292099)

This is a perfect example of redundancy not being the same as backups. They had redundant encryption devices, but the failure of one rolled over into the other. They had no backups (that's right, none at all) that they could restore from. From what they've told us, they intend to resolve this issue by adding more redundancy.

Yes, really.

Re:No backups (3, Funny)

Anonymous Coward | more than 2 years ago | (#41292667)

They should have used RAID.

Re:No backups (-1)

Anonymous Coward | more than 2 years ago | (#41293409)

They should have used RAID.

What.... so that file loss and/or corruption can be redundantly spread across multiple disks?

Apparently you also are not grasping the conceptual differences between fault tolerance and disaster recovery.
A commercial enterprise needs both, and one is not a substitute for the other.

Re:No backups (0)

Anonymous Coward | more than 2 years ago | (#41296109)

I don't think you're grasping his joke :)

Re:No backups (3, Informative)

tlhIngan (30335) | more than 2 years ago | (#41293193)

This is a perfect example of redundancy not being the same as backups. They had redundant encryption devices, but the failure of one rolled over into the other. They had no backups (that's right, none at all) that they could restore from. From what they've told us, they intend to resolve this issue by adding more redundancy.

Correction: they have no backups of the keys that the encryption accelerators used. The end result is that they now have a bunch of encrypted data and little chance of recovering it, because the keys used are lost or corrupted.

Sounds like they need to be hacked and their information "liberated" so they can recover it :).

Re:No backups (0)

Anonymous Coward | more than 2 years ago | (#41293535)

Backups are not backups if you can't recover from them. They needed to have copies of data, separated from their production environment, along with a copy of encryption keys to access those backups. My understanding is that they didn't even have snapshot backups, let alone encrypted ones with keys backed up separately.

Another Leafycaust (0)

Anonymous Coward | more than 2 years ago | (#41292549)

It was due
