Lessons Learned From Skype’s Outage

CmdrTaco posted more than 3 years ago | from the don't-die-during-christmas dept.

Communications 278

aabelro writes "On December 22th, 1600 GMT, the Skype services started to become unavailable, in the beginning for a small part of the users, then for more and more, until the network was down for about 24 hours. A week later, Lars Rabbe, CIO at Skype, explained what happened in a post-mortem analysis of the outage."

278 comments

Deployed Soldiers. (5, Insightful)

puterg33k (1920022) | more than 3 years ago | (#34710822)

For us it's nearly our only way to speak to our loved ones at home. I'm just glad it's back up...

Re:Deployed Soldiers. (0)

Anonymous Coward | more than 3 years ago | (#34710902)

#34710822
First post AND doubles? You must be God!

Re:Deployed Soldiers. (0)

Anonymous Coward | more than 3 years ago | (#34710962)

I am the I am!

Re:Deployed Soldiers. (1)

BrokenHalo (565198) | more than 3 years ago | (#34711656)

I'm just glad it's back up...

I had my downtime like everybody else, but all was good by Christmas. Ironically though, when I was doing my phone-calls on the evening of the 25th, it was the battery in my cordless SIP handset that died (despite being new and supposedly fully charged).

So it was back to Skype, which worked like a champ.

Blogspam (5, Informative)

ralf1 (718128) | more than 3 years ago | (#34710876)

Not sure why you didn't link to the actual article on Skype http://blogs.skype.com/en/2010/12/cio_update.html [skype.com] Instead of the blogspam site.

Re:Blogspam (0)

Anonymous Coward | more than 3 years ago | (#34710944)

Thank you, sir. I too was wondering why Skype had the ugliest fucking blog ever. Then realized the slashdot summary was literally the first paragraph of the blogspam with the link changed from the actual blog to the blogspam shit.

_

Re:Blogspam (0)

Anonymous Coward | more than 3 years ago | (#34710950)

But how else will aabelro promote his own site on Slashdot?! It's just good business sense.

Re:Blogspam (4, Insightful)

Jurily (900488) | more than 3 years ago | (#34711246)

But how else will aabelro promote his own site on Slashdot?! It's just good business sense.

And people wonder why we don't RTFA.

+1 (0)

Anonymous Coward | more than 3 years ago | (#34711036)

please mod this up and the /. article down.

Re:Blogspam (2, Informative)

commodore64_love (1445365) | more than 3 years ago | (#34711234)

Not sure why you didn't link to the actual article on Skype http://blogs.skype.com/en/2010/12/cio_update.html [skype.com] Instead of the blogspam site.

Here's why: "Your organization's Internet use policy restricts access to this web page.
"Reason:
"Internet Telephony is filtered." - So I'm glad slashdot linked to the blog so I'd be able to read what was going on. My workplace is so backwards they still use old-fashioned telephone lines rather than internet phones. Oh and hot water radiators with that classic "thunk thunk thunk" sound when they turn on. Feels like I'm living in the 1930s. ;-)

Re:Blogspam (4, Insightful)

John Hasler (414242) | more than 3 years ago | (#34711772)

My workplace is so backwards they still use old-fashioned telephone lines rather than internet phones.

And consequently you had reliable service while all the "modern, forward thinking" Skype users were down.

December 22th? (5, Funny)

colinRTM (1333069) | more than 3 years ago | (#34710886)

Seriously?

Re:December 22th? (1)

Anonymous Coward | more than 3 years ago | (#34710994)

I know you first-world metric clowns run around and insult our system of measure and our time standards and whatever else, but if you can't reconcile a good ole American December Twenty-Twoeth...

Lessons Learned From Skype’s Outage (0)

Anonymous Coward | more than 3 years ago | (#34710888)

Lessons Learned From Skype’s Outage

It's all crystal clear now. Do not use Skype!

Re:Lessons Learned From Skype’s Outage (1)

leuk_he (194174) | more than 3 years ago | (#34710992)

The alternatives?

MSN? MSN live upgrades are a good reason not to use msn.

Not sure of any other alternatives for free video chat that you can easily recommend.

Re:Lessons Learned From Skype’s Outage (2)

rjstanford (69735) | more than 3 years ago | (#34711074)

Google video chat, perhaps? Or maybe acknowledge that it's fairly impossible to provide both 100% uptime and free video chat at the same time, without the resources of a major player behind you to promote goodwill?

Seriously, they were down for some percentage of the people for 1% of one year, during which time many competitive products were available. This is not an earth-shattering catastrophe.

Re:Lessons Learned From Skype’s Outage (2)

tenex (766192) | more than 3 years ago | (#34711644)

I think we're talking about better up-time than that for Skype. If we believe the outage numbers presented on their Wikipedia page http://en.wikipedia.org/wiki/Skype [wikipedia.org], they've had a total of 72 hours down time since the initial release in 2003--and assuming a 100% outage in all cases (which was not the case here)--their up-time minutes work out to something like:

          99.8827%

Seven years and 72 hours of total down-time... It might not be five nines, but does seem a pretty respectable up-time percentage.
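
Redoing that arithmetic as a quick sketch, assuming 72 hours of total downtime over seven 365.25-day years (both taken from the comment above, not verified against Skype's records), gives roughly 99.88%, which is a little short of three nines. The function names are made up for illustration.

```python
HOURS_PER_YEAR = 365.25 * 24  # average year length, leap days included


def uptime_percent(downtime_hours: float, years: float) -> float:
    """Availability over the period, as a percentage."""
    total_hours = years * HOURS_PER_YEAR
    return 100.0 * (1.0 - downtime_hours / total_hours)


def allowed_downtime_hours(target_percent: float, years: float) -> float:
    """Downtime budget (in hours) that still meets the target availability."""
    total_hours = years * HOURS_PER_YEAR
    return total_hours * (1.0 - target_percent / 100.0)


if __name__ == "__main__":
    # 72 hours down in 7 years, assuming every outage hit 100% of users
    print(f"uptime: {uptime_percent(72, 7):.4f}%")
    # For comparison: the downtime five nines would allow over the same 7 years
    print(f"five-nines budget: {allowed_downtime_hours(99.999, 7) * 60:.0f} minutes")
```

By the same assumptions, five nines over seven years would allow only about 37 minutes of downtime in total.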

Re:Lessons Learned From Skype's Outage (1)

TaoPhoenix (980487) | more than 3 years ago | (#34711792)

(Satire)
Sorry, no. In Today's Post 911 World, rational decision making can never be the same again. We have to Respond to an Event like this. Remember the Day That Skype Was Down forever!

In other censorship news, all discussions of Averages and Means have been blocked, because 7 years of past performance will never matter again.
(/Satire)

Re:Lessons Learned From Skype’s Outage (1)

John Hasler (414242) | more than 3 years ago | (#34711802)

Seven years and 72 hours of total down-time... It might not be five nines, but does seem a pretty respectable up-time percentage.

By POTS standards it's abysmal.

Re:Lessons Learned From Skype's Outage (2)

BrokenHalo (565198) | more than 3 years ago | (#34711880)

Well said. Skype is primarily a piece of technology aimed at the individual consumer. It is made completely clear at the outset that it doesn't claim to be a landline replacement, so anyone who lost business as a result of the outage doesn't get much sympathy from me.

The downtime period for me was about a day and a half, which amounts to 0.41% of the year. No biggie, I have SIP and mobile alternatives. Or both if I run a SIP client over my wireless internet dongle or phone tether.

I get very tired of those who insist on telling everybody to stop using Skype and to use this or that product instead. Skype has a commanding and undeniable position in people's headspace because it offers a fucking good product. For me, the combination of IM client with voice calling capability is a killer. My non-geek friends will never be persuaded to run a separate IM and SIP client. I can (and do) leave video calling alone, since nobody needs to see me after (or during) an evening on the single-malts... :-}

Re:Lessons Learned From Skype’s Outage (0)

Anonymous Coward | more than 3 years ago | (#34711326)

Lessons Learned From Skype’s Outage

Blame your customers.

Re:Lessons Learned From Skype’s Outage (1)

ThatMegathronDude (1189203) | more than 3 years ago | (#34711662)

Where else are you going to find a free, distributed, encrypted by default text/voice/video chat service?

Re:Lessons Learned From Skype’s Outage (1)

chipperdog (169552) | more than 3 years ago | (#34711808)

A bunch of us should put up Asterisk servers and polish up some open source SIP clients (SIP can support video and text also)

Encrypted by default... (0)

Anonymous Coward | more than 3 years ago | (#34711824)

...with Skype who^H^H^Hsharing the keys with every major gubmint out there (maybe that's a revenue stream too?)

Encrypted my ass.

Or maybe (2)

devxo (1963088) | more than 3 years ago | (#34710900)

a major company shouldn't piggyback on its users, and should actually own infrastructure that wouldn't go down like that?

Re:Or maybe (0)

Anonymous Coward | more than 3 years ago | (#34711068)

That shit costs money. Skype is free... You want a more reliable service? They have these things called "Phones", but you have to pay for them.

you are kidding me (5, Interesting)

alphatel (1450715) | more than 3 years ago | (#34710908)

If you are a node-based company worth several billion, charge for services, and don't even run enough of your own supernodes and monitor them in such a way that they cannot handle an outage effectively, you need serious help.

Re:you are kidding me (0, Offtopic)

Anonymous Coward | more than 3 years ago | (#34711024)

The skype business model IS based on leeching BW and resources from the endpoints, you know.

What is extremely pathetic is that anyone would use the skype network as anything other than a toy. But hey, it is the new American Way, isn't it?
You guys are even going to let your government finish destroying your valued freedom of speech over the wikileaks crap... Your shortsighted greed caused your downfall, and it will be a long and painful one.

We don't like it any better, China isn't deluded about itself like America, and they will be harsher masters.

Re:you are kidding me (3, Interesting)

marcosdumay (620877) | more than 3 years ago | (#34711736)

"China isn't deluded about itself like America"

I'll believe that when I hear a Chinese person (one who hasn't been out of the country for decades) saying that China will rule the world for any reason other than being a superior race or culture. China is quite deluded, even more so than the US. Half the world (the Occident) is helping them get even more deluded, and the other half (the Orient) is too afraid to help them cut through any kind of delusion.

That doesn't mean, of course, that China isn't becoming a superpower. They may be, or may not, I don't know the future. Militarily, they already are...

Re:you are kidding me (5, Insightful)

TubeSteak (669689) | more than 3 years ago | (#34711288)

If you are a node-based company worth several billion, charge for services, and don't even run enough of your own supernodes and monitor them in such a way that they cannot handle an outage effectively, you need serious help.

No one expects 40% of a globally distributed network to crash at once. No one.
FTFA:

The initial crashes happened just before our usual daily peak-hour (1000 PST/1800 GMT), and very shortly after the initial crash, which resulted in traffic to the supernodes that was about 100 times what would normally be expected at that time of day.

Not even a multi-billion dollar company would have a disaster plan that provisions 100x capacity as a hot/cold spare.
Though I bet their new plan includes automatic spawning of nodes on EC2 or some other distributed CDN.

Re:you are kidding me (1)

localman57 (1340533) | more than 3 years ago | (#34711440)

I agree. But it wasn't an initial 100x surge, right? It was a cascading failure where eventually supernodes were up 100% because there were fewer and fewer of them. It's a matter of prevention, not cure.
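
The cascade described here can be illustrated with a toy model. This is only a sketch under made-up assumptions: ten supernodes with invented capacities, a fixed total load split evenly among whatever nodes are still alive, and a node dropping out as soon as its share exceeds its capacity. None of the numbers or names come from Skype.

```python
from typing import List


def cascade(capacities: List[float], total_load: float, initial_failures: int) -> int:
    """Return how many nodes survive after the first `initial_failures` crash.

    The load is re-split evenly among the survivors each round; any node
    whose share exceeds its capacity drops out, and the process repeats
    until no further node fails (or none are left).
    """
    alive = list(capacities[initial_failures:])
    while alive:
        share = total_load / len(alive)
        survivors = [c for c in alive if c >= share]
        if len(survivors) == len(alive):
            break  # stable: every remaining node can carry its share
        alive = survivors
    return len(alive)


if __name__ == "__main__":
    # Hypothetical capacities, in the same units as the load
    caps = [2.0] * 4 + [1.5] * 3 + [2.0] * 3
    for crashed in (0, 3, 4):
        print(crashed, "initial crashes ->", cascade(caps, 10.0, crashed), "survivors")
```

In this toy setup, losing three nodes is absorbed, while losing four pushes the weaker nodes over their limit and the rest follow, which is the "prevention, not cure" point: the per-node load only explodes once the network has already shrunk.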

Back up... (1)

msauve (701917) | more than 3 years ago | (#34711604)

a client (or even many) crashing shouldn't cause the server to, too. That's just bad design/software.

Skype seems clueless. They're thinking of using "processes for providing ‘automatic’ updates to our users so that we can help keep everyone on the latest Skype software. We believe these measures will reduce the possibility of this type of failure occurring again." Contrariwise - this would only make the matter worse. What if the _current_ version were the one with the problem, and an automated update system had forced everyone onto it? Then, instead of 50% of the clients contributing to the problem, they'd have 100%.

Re:you are kidding me (1)

Pstrobus (149491) | more than 3 years ago | (#34711730)

Can. Not. Resist. Perfect. Straight. Line...

No one expects the Spanish Inquisition. No one.

Re:you are kidding me (1)

blackraven14250 (902843) | more than 3 years ago | (#34711510)

The last time I checked, the only service they charge for is IP-based to a standard phone connection, not any PC-to-PC stuff.

lesson (hopefully) learned... (4, Insightful)

smash (1351) | more than 3 years ago | (#34710948)

... relying on dodgy peer to peer VOIP telephony for business purposes is retarded.

we've got people bitching at work about how it doesn't work from time to time, and why I've blocked its ability to do voice/video at the firewall. If you want VOIP, use something that uses standard SIP or some other documented, configurable traffic.

Re:lesson (hopefully) learned... (5, Interesting)

commodore64_love (1445365) | more than 3 years ago | (#34711144)

Ahh so YOU'RE the one blocking my skype. ;-)
I don't understand why Net Admins (such as yourself) block useful tools like Skype. Or streaming radio. I don't see any harm in letting those things into the office space, and it provides a more pleasant working environment (to distract from the boredom of sitting at a desk all day).

Re:lesson (hopefully) learned... (5, Informative)

smash (1351) | more than 3 years ago | (#34711276)

Why do I block skype? Because the only way to have it work properly through most firewalls is to allow ALL outgoing ports. Which means you allow any random program to do any random shit through your firewall to the outside network. It's a massive, massive security issue you could drive an oil tanker through.

Also, many companies pay for bandwidth. I don't want all of my bandwidth chewed up on video calls instead of mission critical apps.

It's not just because we're nazis, it's because the skype protocol is completely fucked when it comes to the ability of your admin to control resources. Want voip/video? Use something else.

Re:lesson (hopefully) learned... (5, Insightful)

smash (1351) | more than 3 years ago | (#34711318)

Just let me clarify: corporate networks are different to your home network. your home network? fine, use skype. in the office, where you've got several hundred PCs that may/may not have malicious software and/or users at the helm - allowing all outgoing connections is just begging for trouble.

Egress filtering is a good thing.

Making your day at work "less boring" by enabling you to do non-work related shit with company resources is not what my job is about. It is about ensuring the continued operation of the company's network - and skype is a liability.

Re:lesson (hopefully) learned... (2)

BobMcD (601576) | more than 3 years ago | (#34711532)

Making your day at work "less boring" by enabling you to do non-work related shit with company resources is not what my job is about. It is about ensuring the continued operation of the company's network - and skype is a liability.

Careful there, BOFH. Here I'll help:

Making your day at work "less boring" by enabling you to do non-work related shit with company resources is none of my business. Get it requested through the proper channels and you can have it. I don't make the business decisions here, I just do what the company needs done to be successful.

Re:lesson (hopefully) learned... (1)

ImprovOmega (744717) | more than 3 years ago | (#34711630)

Look, I'm all for business driven IT, but sometimes you have to save your managers from themselves. It's not being a BOFH to look out for the corporate network. You were hired to have the expertise to make recommendations and keep things as secure as possible. If it gets shoved through anyway then it may be time to start looking for someplace that actually values your skills.

Re:lesson (hopefully) learned... (2)

BobMcD (601576) | more than 3 years ago | (#34711712)

Good luck with that. Welcome to 2010's economy.

Meanwhile, CYA and collect your paycheck. Let those with the MBA's make the calls and take the heat, and NEVER bicker with the end user. You're not paid enough to deal with their crap.

Re:lesson (hopefully) learned... (1)

smash (1351) | more than 3 years ago | (#34711648)

It's still not going to be allowed through. They want skype, they can have a 3g service for their laptop and run skype through that.

I've explained to management the security problems with skype when it was originally requested and have support to block it.

Re:lesson (hopefully) learned... (1)

BobMcD (601576) | more than 3 years ago | (#34711722)

Then you're either enjoying bickering with the end users or this is an imaginary scenario...

Re:lesson (hopefully) learned... (1)

smash (1351) | more than 3 years ago | (#34711852)

No, they just figure out skype doesn't work, come see me, i tell them it is not supported and to pick up the telephone.

Re:lesson (hopefully) learned... (0)

commodore64_love (1445365) | more than 3 years ago | (#34711576)

Okay. So two questions: (1) Why not just let Skype operate through the same port used to handle HTTP?

(2) Why ban office workers from listening to radioaol.com or other audio stations? You say you're "not a nazi" but barring people from hearing music seems reminiscent of the record burnings from that time. "This is filth - you shall not listen to it." It's just music to keep the engineers from going batty from boredom. ----- You also speak of bandwidth but we're only talking about Dialup-level audio (16k, 32k, 48k). Practically nothing. The last place I worked let people listen to any internet radio they desired, and it did not bankrupt the company.

Re:lesson (hopefully) learned... (1)

Duradin (1261418) | more than 3 years ago | (#34711668)

Have you listened to music at 16kbps? 96k is about as low as I'll go. Some things aren't tolerable under 256k or 320k. Low bitrates are fine for talk but not music.

Re:lesson (hopefully) learned... (4, Informative)

smash (1351) | more than 3 years ago | (#34711754)

  1. Because skype wasn't written that way. You want standard voice/video, use a SIP program. Skype was written deliberately by the developers to allow it to talk to anywhere and everywhere through your network so it can route other people's calls, and connect to random other nodes for your own call routing. That free lunch you're eating? Paid for by others' use of your bandwidth.
  2. Multiply 500 users by 48kbit. That's 24 megabit in streaming audio, which you can get off that fucking $10 FM radio on your desk. Now I'm not sure how expensive bandwidth is where you are, but a business-grade 24 meg METERED (say, 300 gigs) internet connection here is about 5-10 grand a month. The business is not going to wear the cost of 5-10k per month for our users to listen to shitty quality streaming MP3. That's before you take into account the increased latency to mission critical apps, or remote end points on crappy satellite connections paying anywhere up to $7 per MEG of data.
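
The bandwidth arithmetic in point 2 can be checked with a short sketch. It assumes decimal units (1 Mbit = 1,000 kbit, 1 GB = 10^9 bytes) and that all 500 users stream continuously; the function names are made up for illustration.

```python
def aggregate_mbit_per_s(users: int, kbit_per_user: float) -> float:
    """Total streaming bandwidth in Mbit/s if every user streams at once."""
    return users * kbit_per_user / 1000.0


def hours_to_exhaust_quota(quota_gb: float, mbit_per_s: float) -> float:
    """Hours of continuous use at the given rate before a metered quota is gone."""
    bytes_per_second = mbit_per_s * 1_000_000 / 8
    seconds = quota_gb * 1_000_000_000 / bytes_per_second
    return seconds / 3600.0


if __name__ == "__main__":
    rate = aggregate_mbit_per_s(500, 48)     # 500 users at 48 kbit/s each
    hours = hours_to_exhaust_quota(300, rate)  # 300 GB metered allowance
    print(f"{rate:.0f} Mbit/s, quota gone in about {hours:.1f} hours")
```

Under those assumptions the 24 megabit figure holds, and a 300 gig allowance would last a little over a day of full-office streaming.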

Re:lesson (hopefully) learned... (0)

John Hasler (414242) | more than 3 years ago | (#34711866)

Why ban office workers from listening to radioaol.com or other audio stations?

Why don't you just buy a radio and set it on the corner of your desk?

Re:lesson (hopefully) learned... (1)

Duradin (1261418) | more than 3 years ago | (#34711584)

Back in the day I worked at a place that banned streaming audio because one day there wasn't enough bandwidth for the actual business applications to go about their business when everyone was listening to their streamed music.

Skype can eat a lot of bandwidth.

Re:lesson (hopefully) learned... (1)

noc007 (633443) | more than 3 years ago | (#34711738)

Within the network I manage, it boils down to bandwidth, security, and slacking off.

We have two large offices and a few small offices. All of the internet traffic is routed through the WAN to the main office that has a 10Mb link which is shared with our internet facing servers. The other large office acts only as a backup and has a 5Mb internet connection. The WAN links are 3Mb with the exception of the main office having a 6Mb one. Regular business WAN traffic is a steady 1Mb across the board with the usual spikes from file transfers and e-mails with large attachments. Having a small handful of users streaming music isn't a big hit, but if a tenth of the userbase does it, the network would be saturated and business applications would come to a halt. If someone wants to listen to music, there are a number of cheap mp3 and cd players with and without a FM radio.

We handle a lot of sensitive information. Employees sending out that information can be a problem. Web based e-mail and IM is blocked to help prevent that information from easily being sent out. Some external IM services are allowed through the corporate IM client that gateways through our IM server for full logging and heuristics; 3rd party IM clients will not be able to access any IM service.

There was a time when every office had its own unfiltered internet connection. Too many people abused that privilege; machines were frequently infected with 0-day malware and people were goofing off. It is the responsibility of management and HR to make sure there is an appropriate amount of people staffed in each department and things are handled in a way so employees aren't miserable. There are ways to break up the monotony without resorting to slacking off on the internet.

How are supernodes defined? (1)

fantomas (94850) | more than 3 years ago | (#34710956)

Sorry if this is off topic or an ignorant question, but how does Skype define supernodes? Does the company just randomly choose users who are online a lot and declare them supernodes without the owner's knowledge, or is there some other process?
cheers

Re:How are supernodes defined? (1)

Anonymous Coward | more than 3 years ago | (#34711202)

If you've got a high capacity intertube, and you're not behind NAT, you might be lucky enough to be randomly selected as a supernode to forward other people's calls and index data. You don't get to opt in/out.

Re:How are supernodes defined? (1)

circletimessquare (444983) | more than 3 years ago | (#34711272)

"Does the company just randomly choose users who are online a lot and declare them supernodes without the owner's knowledge"

yes, that's exactly what they do. and yes, that's retarded for a company like skype

Re:How are supernodes defined? (1)

smash (1351) | more than 3 years ago | (#34711364)

Well no, it's not really retarded for skype. It's retarded for skype users to actually agree to those terms of service.

Re:How are supernodes defined? (2)

circletimessquare (444983) | more than 3 years ago | (#34711476)

that's right, because everyone who wants to use VOIP should review the source code and familiarize themselves with the relevant RFC specs

classic "if you aren't a computer scientist you shouldn't use the internet" ignorant geek snobbery. how's that standard of behavior working for you?

Re:How are supernodes defined? (1)

smash (1351) | more than 3 years ago | (#34711816)

I was merely suggesting that it's just fine and dandy as far as SKYPE the company goes to rip people's bandwidth off. If you cbf reading the license and just click OK for the free shit then you deserve whatever raping you get. Nothing is free.

Re:How are supernodes defined? (0)

Anonymous Coward | more than 3 years ago | (#34711690)

There are two types of decentralised nodes in the Skype architecture: supernodes and relays.

To become elected as either, a Skype client must be hosted on a public IPv4 address without firewalling and must demonstrate "considerable" uptime. What that may be is known to Skype.

Supernodes are just directory servers, correlating users to IPs. Relays take a lot more traffic as they are the key to Skype's NAT-busting effort; for every two users behind NAT there must be a public relay.

I notice that Skype did not apologise to either supernode or relay hosts for the massive increase in traffic.

Obvious problem.... (4, Interesting)

dstar (34869) | more than 3 years ago | (#34710958)

Hmm. Seems to me their biggest problem is that they allowed clients with a known bug to become supernodes; if 50% of the network had upgraded, they should only have been creating supernodes from the upgraded clients.

And in hindsight (I don't know that they should be blamed for not considering this before), the number of supernodes should probably be ~100-150% more than needed to service expected load. That way, if a third of them die, they _still_ have more than needed to handle the expected load. (And thus, hopefully, more than needed to handle the excessive load without causing them to shut down).
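
The headroom argument can be written out as a small sketch. It assumes capacity scales linearly with the number of supernodes and that load is spread evenly; the function name is hypothetical.

```python
def surviving_capacity_ratio(extra_fraction: float, failed_fraction: float) -> float:
    """Capacity left after failures, as a multiple of the expected load.

    extra_fraction:  over-provisioning above expected load (1.0 means 100% more)
    failed_fraction: share of supernodes that die (1/3 means a third)
    """
    return (1.0 + extra_fraction) * (1.0 - failed_fraction)


if __name__ == "__main__":
    for extra in (1.0, 1.5):  # the 100% and 150% ends of the suggested range
        ratio = surviving_capacity_ratio(extra, 1 / 3)
        print(f"{extra:.0%} extra, a third dead -> {ratio:.2f}x expected load")
```

So with 100-150% extra, losing a third of the supernodes still leaves roughly 1.33x to 1.67x the expected load in capacity. That covers a normal day, though not a surge like the 100x figure quoted from the post-mortem elsewhere in this thread.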

Re:Obvious problem.... (1)

BobMcD (601576) | more than 3 years ago | (#34711578)

Hmm. Seems to me their biggest problem is that they allowed clients with a known bug to become supernodes; if 50% of the network had upgraded, they should only have been creating supernodes from the upgraded clients.

If they had the power to stop bugged clients from becoming supernodes, why not just use that same power to make them patch? You're sort of assuming that they ever imagined that this could have happened. It's pretty clear that they didn't...

It's subtle, but it's there at the bottom where they admit 'we need to test our crap first and we need some way of making people patch' - which is kind of a known thing in the modern software world.

Re:Obvious problem.... (1)

dhammond (953711) | more than 3 years ago | (#34711830)

You are ignoring the fact that at one point in time, the latest version of the software was the buggy version. It might actually make sense to have some heterogeneity in the supernodes.

And on your second point, I think you're ignoring the fact that the running supernodes received up to 100 times the expected traffic, so a 100-150% increase would probably not have helped much.

I don't understand this. (4, Interesting)

commodore64_love (1445365) | more than 3 years ago | (#34710964)

"At its core, Skype relies on a third generation P2P network that has lots of peer nodes and a number of supernodes, one for several hundreds of nodes. Since Skype does not have a centralized directory to support finding routes between two or more nodes that want to communicate, the virtual network uses supernodes as directories. When a client enters Skype, it registers itself with a supernode, giving its IP address so it can be found by other clients who might want to establish a communication."

Skype is a peer-to-peer network? Like torrent? So the supernode is like a tracker website, to connect peers to one another? No supernode==no tracker==no calls going through. Hmmmm. Maybe they should try DHT.
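
A minimal sketch of the directory role described in the quote, assuming one in-memory table per supernode. The class and method names are hypothetical, the IP addresses are documentation-range placeholders, and this is not Skype's actual protocol.

```python
from typing import Dict, Optional


class Supernode:
    """Toy directory: maps a user name to the IP address it registered from."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.online = True
        self._directory: Dict[str, str] = {}

    def register(self, user: str, ip: str) -> None:
        """Called by a client when it comes online, so that peers can find it."""
        if not self.online:
            raise ConnectionError(f"supernode {self.name} is down")
        self._directory[user] = ip

    def lookup(self, user: str) -> Optional[str]:
        """Return the registered IP for a user, or None if the user is unknown."""
        if not self.online:
            raise ConnectionError(f"supernode {self.name} is down")
        return self._directory.get(user)


if __name__ == "__main__":
    sn = Supernode("sn-1")
    sn.register("alice", "203.0.113.5")
    print(sn.lookup("alice"))  # a peer can now find alice and call her directly

    sn.online = False          # the supernode crashes
    try:
        sn.lookup("alice")
    except ConnectionError as err:
        print("call setup fails:", err)
```

The call itself would still go peer to peer; what is lost when the supernode dies is the ability to find the other end, which is the tracker analogy made above.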

TL;DR version: (4, Interesting)

The MAZZTer (911996) | more than 3 years ago | (#34710974)

Lots of users were using an old outdated buggy version of Skype, lots of client crashes at once bringing down big chunks of the P2P network, remaining network couldn't handle the load and went down too, took a while for Skype to put its own supernodes up to help get the network self-sustaining again.

They're considering an auto-update feature now since such a feature could have kept this from happening. Personally I think old versions should be blocked from making or receiving calls too, so users would be encouraged to update (works for Team Fortress 2). Of course auto updates would make updating super easy anyway so impact from that would be minimal.

Re:TL;DR version: (1)

commodore64_love (1445365) | more than 3 years ago | (#34711294)

Don't blame the users for what is really a *programming* problem. The article says the older versions (3.x and 4.x) worked fine and were bug-free. It was the NEW version that broke the nodes. That's why I don't install new versions until they've been around for awhile.

Just recently Microsoft auto-updated my work computer from IE7 to 8, and the browser worked perfectly but something in the update killed my network connection. Grrr. And then there was that Antivirus Software update from three weeks ago that killed people's Windows PCs by making them unbootable. Customers were told to download a fix and burn a recover CD, but how are they supposed to do that if their PC is crashed??? Fucking bastards.

Re:TL;DR version: (1)

localman57 (1340533) | more than 3 years ago | (#34711518)

That's why I don't install new versions until they've been around for awhile

Isn't that part of what caused this? :-)

Re:TL;DR version: (1)

spxero (782496) | more than 3 years ago | (#34711878)

The problem with the auto-update feature in Skype vs. gaming is that most gaming computers will be close to top-of-the-line. Most computers used for Skyping will not be top of the line.

From experience, the 5.0 version of Skype doesn't work as well as the 3.8 branch. Switching between windowed and full-screen video on the 5.0 branch takes ~4 sec to accomplish, with the audio becoming choppy at the same time. In addition, the video is choppy and audio quality is scratchy at best. The 3.8 branch doesn't have these issues, but can't do multi-user video either. This is an older machine running XP (P4 3.0HT w/ 2GB PC2-6400 memory), but should still be capable of doing things with the newer version.

Never makes sense to upgrade working software... (5, Interesting)

syousef (465911) | more than 3 years ago | (#34711038)

...unless you need something in the newer version (feature, security update etc.). Of course us geeks like to have the latest to fiddle with, but for the average Joe end-user, if it ain't broke, don't fix it. There is always the risk that the newer software will contain new bugs. At one point the buggy version of the Skype software was the latest version and was what users were being pushed to upgrade to. If the crash had happened then, I wonder if they'd find a new way to scapegoat users.

By the way, new versions breaking existing functionality isn't theoretical, or rare. I'm currently installing software on my new laptop. I've had to downgrade both Zonealarm and Virtualbox. The former broke remote desktop. The latter broke file sharing. No idea why, but in each case uninstalling and installing an older version I knew worked fixed the issue for me.

Re:Never makes sense to upgrade working software.. (1)

Enderandrew (866215) | more than 3 years ago | (#34711378)

The problem is that it is broke, you just often don't realize it. Older doesn't mean more secure or more stable inherently. New versions fix bugs discovered in old versions. If everyone did update immediately, then everyone would have had the bug fix and this outage wouldn't have happened.

Re:Never makes sense to upgrade working software.. (1)

BobMcD (601576) | more than 3 years ago | (#34711614)

You're suffering from sample bias. Newer software is also 'broke' and you also don't know that. I think the point would be, if it is 'broke' but not impacting you in a way that you'd know it, do you care? In some cases yes, in other cases no.

Re:Never makes sense to upgrade working software.. (1)

Enderandrew (866215) | more than 3 years ago | (#34711864)

It is equally possible that newer software introduces bugs as much as fixes them. But the assumption that older is always more secure and stable is flawed.

In reality, the best solution is to review changelogs and make informed decisions when upgrading. But avoiding all upgrades isn't the solution.

Re:Never makes sense to upgrade working software.. (1)

eulernet (1132389) | more than 3 years ago | (#34711488)

..unless you need something in the newer version (feature, security update etc.).

And also especially when the update is a 20-megabyte file. In fact, we need to reinstall the whole software every time.
Why such a lame updating system?

Re:Never makes sense to upgrade working software.. (1)

whoop (194) | more than 3 years ago | (#34711580)

And that's exactly why this happened. People were satisfied with the initial release of v5 and saw no need to update (meaningless bug fixes, no useful features, who cares). Then they broke everything...

Supernode Software (4, Interesting)

varmittang (849469) | more than 3 years ago | (#34711060)

How about they release some supernode-only software that people can set up on a server, and possibly the ability to set up Skype to use a preferred supernode. So a business can set up a supernode of its own and point its users to it. But that supernode is also part of the collective of supernodes and routes Skype connections for everyone else too. This would hopefully give Skype more supernodes out there that are 24/7 and not desktop computers routing the traffic.

This is why I don't do updates (1)

commodore64_love (1445365) | more than 3 years ago | (#34711072)

"A bug in Skype for Windows version 5.0.0.152 made the application crash when receiving late messages..... previous versions for Windows and all the other versions for non-Windows machines were not affected by the bug."

The new versions are often *more* buggy than the last version. Just recently Microsoft auto-updated my work computer from IE7 to 8, and the browser worked perfectly but something in the update killed my network connection. I had to waste an hour going back to the previous version (as did most people in the office). And then there was that antivirus software update from three weeks ago that killed people's Windows PCs by making them unbootable.

Programmers really should be more careful with their updates, to make sure the new X.y release is better rather than worse. But since they aren't careful I turned off auto-updates. They are too dangerous.

Re:This is why I don't do updates (0)

Anonymous Coward | more than 3 years ago | (#34711274)

I love how you conveniently omitted a part of the statement on the Skype blog. Here, allow me to patch it up for you.

"Users running either the latest Skype for Windows (version 5.0.0.156), older versions of Skype for Windows (4.0 versions), Skype for Mac, Skype for iPhone, Skype on your TV, and Skype Connect or Skype Manager for enterprises were not affected by this initial problem."

As you can see, the 152 version of the software is the buggy client. The 156 version is not.

But wait, there's more:

"However, around 50% of all Skype users globally were running the 5.0.0.152 version of Skype for Windows, and the crashes caused approximately 40% of those clients to fail. These clients included 25–30% of the publicly available supernodes, also failed as a result of this problem."

See that? 50% were using that outdated software. Software for which an update had been available. It's sheer laziness to not patch your software. Yes, sometimes, a buggy update is unleashed upon the world. However, this is a case in point against running unpatched software.

Re:This is why I don't do updates (1)

BobMcD (601576) | more than 3 years ago | (#34711654)

It's sheer laziness to not patch your software. Yes, sometimes, a buggy update is unleashed upon the world. However, this is a case in point against running unpatched software.

No, commodore64 is right. There needs to be a reason to patch and that reason needs to outweigh both the hassle of doing it AND the risk that something new will be broken.

If you're not handing over fresh new dollar bills for a piece of software, expect it to be assembled with the bare minimum effort. This includes all patches. The likelihood that one of these will suck worse than the problem they're attempting to fix is very, very high.

Let's stop depending on Skype (0)

Anonymous Coward | more than 3 years ago | (#34711088)

So this outage was triggered by outdated clients and proprietary support servers. That's like saying IE6+IIS users can bring down 50% of the Web. And to think people depend on services like Skype to keep in touch with loved ones, never realizing there are simpler, almost better alternatives that do the exact same thing.

Re:Let's stop depending on Skype (0)

Anonymous Coward | more than 3 years ago | (#34711484)

Link to another cross-platform, dialup-compatible alternative, with video on faster links, that my family (including grandfather) can set up and use without a phone call to me for support every weekend please!!

client crashes should not - server crashes (1)

RichMan (8097) | more than 3 years ago | (#34711118)

If problems with the client can lead to problems with the server, then the server system lacks robustness. For applications like this the servers should be practically immune to any client state muck-ups.

Seems to me skype needs to work on their server side state machines.

Re:client crashes should not - server crashes (1)

smash (1351) | more than 3 years ago | (#34711382)

You missed the point. With Skype, the clients ARE the servers ("randomly" selected supernodes, i.e., non-NAT, well-connected clients).

Re:client crashes should not - server crashes (1)

nedlohs (1335013) | more than 3 years ago | (#34711422)

Do you know what peer to peer means?

here's a hint: there are no servers, they just use the bandwidth and cpu of random clients to do that work.

For as much bull$hit is spread about this.. (0)

Anonymous Coward | more than 3 years ago | (#34711136)

For as much bull$hit as is spread about this, 99% of Skype users were UNAFFECTED.

The only crisis is the fabricated one by websites and media.

Re:For as much bull$hit is spread about this.. (1)

BobMcD (601576) | more than 3 years ago | (#34711680)

Sample bias again. TFA says 20% were affected, not 1%. Just because it didn't happen to you and your friends doesn't mean that the people who actually analyzed the problem suck at math.
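For what it's worth, the 20% figure follows from the two numbers quoted from the Skype blog further up the thread: about 50% of users on 5.0.0.152, and about 40% of those clients crashing. A quick back-of-the-envelope check (the function name is only mine, for illustration):

```python
def affected_share(share_on_buggy_version: float, crash_rate: float) -> float:
    """Fraction of all clients that crashed, given the share running the
    buggy version and the crash rate among those clients."""
    return share_on_buggy_version * crash_rate


# Figures as quoted from the Skype post-mortem:
# ~50% of users on 5.0.0.152, ~40% of those clients failed.
share = affected_share(0.50, 0.40)
print(f"{share:.0%} of all clients")
```

That is roughly one client in five, a long way from 1%.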

Article Summary [sarcastic] (4, Funny)

Ukab the Great (87152) | more than 3 years ago | (#34711148)

"We expected a Limewire topology to be as reliable as a Phone companyi topology and oddly enough that bit us in the ass."

Lesson Learned? (1)

RoadWarriorX (522317) | more than 3 years ago | (#34711220)

The lesson they learned is that the users like to use buggy versions of their software? Sure, blame your users... Maybe the lesson to learn is not to release buggy software!

Skype Win 5.0 client sucks (4, Interesting)

scorp1us (235526) | more than 3 years ago | (#34711328)

The QA of this release is way down. On top of that, skype auto-updated people from 4.0 to 5.0. Within a few days, the buggy 5.0 had enough penetration (50%) to bring them down.

The Windows client has widely been reported to:
consume 2x as much CPU (33% to 60% on mine after upgrade)
leak RAM (starts out OK, but after some use it needs over 1.5 GB)
have a slow GUI, so the fade effects on some computers (mine) cause video tearing. It is no longer possible to run full-screen. (320x240 is all I get before tearing sets in)
render the fonts in the video area incorrectly.
It should be noted that I have an AMD X2 1.6 and Radeon 1200 card in this computer. It's not shabby. But the 5.0 client brought it to its knees.

It plays SCII just fine (albeit on the lowest setting).

It comes at a bad time when they are trying for more corporate agreements, but can't run on my 3-year-old hardware.

I uninstalled 5.0 and installed 4.0 and it's back to normal.

Public Post-Mortem (4, Insightful)

Enderandrew (866215) | more than 3 years ago | (#34711424)

You can bitch they didn't QA the release. You can bitch that you don't like a P2P topology. But it is nice to see a public post-mortem.

Missed opportunity for open source (1)

DCFusor (1763438) | more than 3 years ago | (#34711490)

Back when I was doing one of the first VOIP solutions (this one mostly for LAN use) we dreamed up something like Skype, that would work in similar fashion. The big advantage is that it could be done by any reasonably large group of users and no phone company at all need be involved -- no charge to anyone, no control over anyone by some big monolithic corp. It could still be done, and I wonder why no one in the open source area has managed? Critical mass issue; selling the first phone is a bear -- who you gonna call? Once going, a completely free open source solution would keep going just fine I'd think. I'd suppose the main problems would be with security, outside actors diddling supernodes to break it, as some companies would have a large interest in not having it as a competitor? Not sure how you'd handle those issues.

Forced auto updates are not the solution. (4, Interesting)

mario_grgic (515333) | more than 3 years ago | (#34711542)

I hate when apps run auto update daemons. This is precisely the reason why I don't use any Google desktop software on my computers.

The proper thing to do in this case is simply to refuse the login, with a message telling users they need to upgrade their client if they want to continue to use the app. Simple thing to do, rather than each app running a daemon. Soon enough there will be a hundred update daemons on each user's computer, eating resources, connecting online all the time and bogging down the user experience. Thanks but no thanks. I refuse to use any of those.
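A minimal sketch of that login-time check, assuming the sign-in service keeps a table of known-bad builds per platform. The names BLOCKED_BUILDS and check_login are hypothetical, not anything Skype has documented; the only real detail is the 5.0.0.152 build number from the post-mortem:

```python
# Hypothetical table of known-bad builds per platform. 5.0.0.152 is the
# crashing Windows build named in the Skype post-mortem.
BLOCKED_BUILDS = {"windows": {"5.0.0.152"}}


def check_login(platform: str, client_version: str) -> tuple:
    """Return (allowed, message) for a sign-in attempt.

    Platforms or versions that are not listed are always allowed.
    """
    if client_version in BLOCKED_BUILDS.get(platform, set()):
        return False, "This version has a known problem. Please upgrade to sign in."
    return True, "OK"


print(check_login("windows", "5.0.0.152"))  # refused, with an upgrade message
print(check_login("windows", "5.0.0.156"))  # allowed
print(check_login("mac", "2.8"))            # allowed, no entry for this platform
```

Blocking specific builds rather than enforcing a minimum version keeps the older 4.0 clients, which were not affected, able to sign in.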

Supernodes shut down when overloaded? (1)

GeckoAddict (1154537) | more than 3 years ago | (#34711870)

"We believe that increased load in supernode traffic led to some of these parameters exceeding normal limits, and as a result, more supernodes started to shut down"

Maybe I'm missing something, but why are supernodes coded to shut down during increased load instead of simply throttling requests? It seems like the idea of 'too many requests, shut down' is what caused the cascade. Can someone enlighten me as to why this is the preferred overload handling mechanism?
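
To make the question concrete, here is a toy sketch of the two policies. It is not how Skype's supernodes actually work (the post-mortem doesn't give that level of detail); the class names, the capacity of 100 and the load figures are all made up for illustration:

```python
class ShutdownNode:
    """Goes offline for good once the offered load exceeds its limit."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.alive = True

    def handle(self, offered: int) -> int:
        """Return how many requests were served in this round."""
        if offered > self.capacity:
            self.alive = False  # overload: shut down and serve nothing
            return 0
        return offered


class ThrottlingNode:
    """Serves up to its limit, rejects the excess, and stays up."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.alive = True

    def handle(self, offered: int) -> int:
        return min(offered, self.capacity)


def simulate(node_cls, nodes: int, crashed: int, total_load: int,
             capacity: int = 100) -> int:
    """Spread total_load evenly over the nodes that survived the initial
    crash, re-spreading whenever more nodes drop out. Returns the number
    of requests served once the system settles."""
    pool = [node_cls(capacity) for _ in range(nodes - crashed)]
    while True:
        live = [n for n in pool if n.alive]
        if not live:
            return 0  # every node has shut down
        share = total_load // len(live)
        served = sum(n.handle(share) for n in live)
        if all(n.alive for n in live):
            return served  # nobody dropped out this round: stable


# 10 nodes, 3 lost to the client bug, 800 requests to spread around.
print(simulate(ShutdownNode, 10, 3, 800))
print(simulate(ThrottlingNode, 10, 3, 800))
```

With all 10 nodes up, each sees 80 requests and either policy serves everything. Lose 3 of them and each survivor sees 114: the shutdown policy takes the remaining 7 offline and serves nothing, while the throttling policy still serves 700 of the 800.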