
One Failed NIC Strands 20,000 At LAX

kdawson posted more than 7 years ago | from the comp-dot-risks dept.

Networking

The card in question experienced a partial failure that started about 12:50 p.m. Saturday, said Jennifer Connors, a chief in the office of field operations for the Customs and Border Protection agency. As data overloaded the system, a domino effect occurred with other computer network cards, eventually causing a total system failure. A spokeswoman for the airports agency said airport and customs officials are discussing how to handle a similar incident should it occur in the future.


293 comments


I Blame Bush (-1)

Anonymous Coward | more than 7 years ago | (#20238205)

Because I'm a liberal.

No, wait, I take that back. I blame Karl Rove. And Dick Cheney. And um, Rush Limbaugh. Yeah, everything that happens is their faults.

OH! I just farted. That bastard Karl Rove forcing me to eat a bean burrito...

Re:I Blame Bush (-1, Offtopic)

Anonymous Coward | more than 7 years ago | (#20239635)

You're already out, Karl. No need for bitterness.

Re:I Blame Bush (-1, Troll)

Jeremiah Cornelius (137) | more than 7 years ago | (#20239935)

Bitterness? He LOVES bitterness. It explains the penchant for felching.

Re:I Blame Bush (0, Offtopic)

CheddarHead (811916) | more than 7 years ago | (#20240863)

Thanks a lot! I just had to look up "felching". Now I'll have that disgusting imagery in my mind for the rest of the afternoon!

Bwahahahaha!!!! (-1, Offtopic)

Anonymous Coward | more than 7 years ago | (#20240921)

I know I shouldn't be replying to the future troll/flamebait modded post, but this made my goddamned afternoon. Cheers!

Congratulations On Your Answers Except (-1, Offtopic)

Anonymous Coward | more than 7 years ago | (#20240827)


"Yeah, everything that happens is their faults."

You obviously suffer from the same cognitive malfunction as the world's most dangerous person [whitehouse.org].

Good luck in high school.

Patriotically Yours,
Kilgore Trout

That's all it takes (1, Interesting)

Marxist Hacker 42 (638312) | more than 7 years ago | (#20238337)

Though I heard it was a switch. Same idea though -- all it takes is one malfunctioning card flooding the LAN with bad packets to bring it all down.

Re:That's all it takes (2, Interesting)

Jeremiah Cornelius (137) | more than 7 years ago | (#20239911)

Then that would lead me to think "hub", not switch. Or just a really shitty switch...

Re:That's all it takes (1)

morgan_greywolf (835522) | more than 7 years ago | (#20240619)

Would you think that LAX is running anything that out-of-date or crappy? Maybe, maybe not, but it does make a good case to run solid, proven and reliable network infrastructure hardware from a major manufacturer.

Oh yeah, and redundant-path network connections for critical portions of the network wouldn't hurt either.

Re:That's all it takes (5, Insightful)

Svet-Am (413146) | more than 7 years ago | (#20240749)

Of course they're running old and outdated hardware. When things work, particularly in a mission-critical situation, you don't touch them! Even if the IT admins knew that computer was old and on the brink of dying, how are they supposed to convince the suits and beancounters of that? Non-technical people take the approach that since computers are inherently binary (work or no-work), if the machine is up and running _right now_ then there is no problem and no sense in spending money to replace it.

If the IT folks were clueless about this machine's age or condition, then the blame lies solely with them for not knowing what the hell they were doing. However, if it was the other folks who shot the IT folks down about upgrading, then "welcome to the current state of business", unfortunately.

Re:That's all it takes (5, Insightful)

EmperorKagato (689705) | more than 7 years ago | (#20241565)

Even if the IT admins knew that computer was old and on the brink of dying, how are they supposed to convince the suits and beancounters of that?
You show the suits and bean counters how much it would cost the company if the system failed and time had to be spent recovering it.
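That pitch fits on a napkin. A minimal sketch, with every number a hypothetical placeholder:

    # Back-of-the-envelope downtime-cost pitch for the beancounters.
    # All figures below are hypothetical placeholders, not LAX numbers.
    failure_prob_per_year = 0.25   # estimated chance the aging box dies this year
    outage_hours = 9               # duration of an LAX-style outage
    cost_per_hour = 100_000        # lost labor, overtime, goodwill, rebooking...
    replacement_cost = 5_000       # a new box plus the labor to swap it in

    expected_loss = failure_prob_per_year * outage_hours * cost_per_hour
    print(f"Expected annual loss: ${expected_loss:,.0f}")   # $225,000
    print(f"Cost to replace now:  ${replacement_cost:,}")   # $5,000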

Re:That's all it takes (0)

Anonymous Coward | more than 7 years ago | (#20240809)

Not that it's proof or anything, but have you been to LAX? It's not exactly a sparkling diamond of technological wonder to behold.

Re:That's all it takes (5, Interesting)

Kadin2048 (468275) | more than 7 years ago | (#20241063)

Would you think that LAX is running anything that out-of-date or crappy?
I assume that they're running everything with spit, duct tape, wishful thinking, ancient custom software, near-fossilized hardware, and Excel spreadsheets ... just like pretty much everything else in the public sector.

I've seen what's running some government agencies, and it's frightening.

Re:That's all it takes (1)

Shagg (99693) | more than 7 years ago | (#20241473)

Would you think that LAX is running anything that out-of-date or crappy?
I'm surprised they're even using IP. Many airline systems are still running on X.25 and mainframes.

So, no, running out-of-date hardware wouldn't surprise me at all.

I don't believe any of it (0, Offtopic)

zogger (617870) | more than 7 years ago | (#20240823)

I just don't. I think it was deliberate: some org trying to see how easily they could gum up the system. I just don't think they want to admit they got hacked.

Zero proof, I just never take governmental reasons for spectacular failures at face value any longer. I used to when I was younger, but after seeing 7,892 lies in a row -- well, I just don't trust them on anything important, and I don't believe in coincidences. That is my default position. Whatever they say first is a lie until proven otherwise, and proof that would convince me is not some spokesperson claiming such and such.

There's too much weird stuff going on, especially with the markets and money supply and an apparent outright war with some global factions in this weird economy, and I think they are going to need a serious public emergency distraction real soon now to take the heat off the ongoing meltdown, and it IS melting down right now. Those failures are not accidents, in other words, nor are a lot of the other "failures" we are seeing lately; I think they have mostly been attacks by parties at this time unknown.

It just *stinks* to me, way too much BS being spoon-fed to everyone in the media for it all to be true. Too much in a short time frame... just ain't buying it. And the rats are deserting the sinking ships: a lot of biz execs bailing out, obvious collusion in artificially propping up some fatcats, weird murmurings from overseas... looks like all the major shifts before WW1, very similar.

Any one event, sure; a couple, possible; not these dozens of weird events in such a short time frame... nope, not probable unless it is on purpose. This is asymmetrical warfare, or false flag, or tests or probes or... dang, something. It stinks. If I could pin it down better I would, but I am looking at this daily, and if I see something that makes it all fit I'll do another journal entry about it.

I've been antsy about all my preps lately, more so than usual, even for a fanatic like me.

I'll say one thing: keep your battery supplies good, have the best possible water filter handy and a shortwave radio with some freqs programmed in, take another look at the bugout bags if that is what will be needed where folks live, and make sure you have a good supply of N95 masks and assorted other medkit stuff. I got a feeling some cities are going to be hurting soon. And that's all it is, a feeling; can't get better than that at this time.

Re:I don't believe any of it (1)

COMON$ (806135) | more than 7 years ago | (#20240917)

Having worked for the gov't, I think you overestimate the quality of employees there... how does that saying go? "Two things are infinite: the universe and gov't stupidity." It could be a hack, but they wouldn't know unless they brought in someone from the private sector who is smart enough to charge a bagillion an hour to show them how to properly plug in the NIC.

Yes, I am glad to be out of that velvet lined rut and in a world where there are actual professionals.

Re:I don't believe any of it (1)

denbesten (63853) | more than 7 years ago | (#20241083)

I believe you are thinking of a quote attributed to Albert Einstein:

"Only two things are infinite, the universe and human stupidity, and I'm not sure about the former."
http://www.quotedb.com/quotes/1349 [quotedb.com]

Re:I don't believe any of it (1)

ABasketOfPups (1004562) | more than 7 years ago | (#20241039)

So... have you seen "The Number 23" by any chance?

Re:I don't believe any of it (0)

Anonymous Coward | more than 7 years ago | (#20241517)

I like how you keep mentioning "these dozens of weird events", but don't actually list them out or state why they're suspicious.

Re:That's all it takes (2, Insightful)

COMON$ (806135) | more than 7 years ago | (#20240847)

Apparently you are not familiar with what a bad NIC does to even the best of switches.

Re:That's all it takes (5, Interesting)

KillerCow (213458) | more than 7 years ago | (#20240937)

I am not a networks guy... but it's my understanding that a switch acts like a hub when it sees a TO: MAC address that it doesn't know the port for. Switches learn the structure of a network by watching the FROM fields on the frames. When a switch powers up, it behaves exactly like a hub and just watches/learns which MAC addresses are on which ports and builds a switching table. If it starts getting garbage packets, it will look at the TO field and say "I don't know what port this should go out on, so I have to send it on all of them." So garbage packets would overwhelm a network even if it was switched.

It would take a router to stop this from happening. I don't think that there are many networks that use routers for internal partitioning. Even then, that entire network behind that router would be flooded.
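That's essentially right, and simple enough to sketch. A toy learning switch in Python (illustrative only, not real switch firmware): frames to unknown destinations are flooded out every port, which is how a NIC spewing garbage addresses turns a switched segment back into a hub.

    # Toy learning switch (illustrative sketch, not real switch firmware).
    # Frames to unknown destination MACs get flooded out every port.
    class LearningSwitch:
        def __init__(self, num_ports):
            self.num_ports = num_ports
            self.mac_table = {}   # MAC address -> port it was last seen on

        def handle_frame(self, src_mac, dst_mac, in_port):
            self.mac_table[src_mac] = in_port      # learn the source
            if dst_mac in self.mac_table:
                return [self.mac_table[dst_mac]]   # forward out one port
            # Unknown destination: flood everywhere except the ingress port.
            return [p for p in range(self.num_ports) if p != in_port]

    sw = LearningSwitch(num_ports=4)
    print(sw.handle_frame("aa:aa", "bb:bb", in_port=0))  # unknown: [1, 2, 3]
    sw.handle_frame("bb:bb", "aa:aa", in_port=1)         # now bb:bb is learned
    print(sw.handle_frame("aa:aa", "bb:bb", in_port=0))  # forwarded: [1]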

It depends on the switch (4, Informative)

camperdave (969942) | more than 7 years ago | (#20241375)

You're right, up to a point. An Ethernet frame, along with the source and destination addresses, has a checksum. A switch that is using a store-and-forward procedure is supposed to drop the frame if the checksum is invalid. If the NIC was throwing garbled frames onto the network, they would have to be garbled in such a way as to still have a valid checksum (assuming they are using store-and-forward switches in the first place).
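Conceptually the check is a CRC-32 recompute-and-compare. A rough Python sketch of the store-and-forward decision (real switches do this in hardware at line rate, with the exact IEEE 802.3 bit ordering this glosses over):

    import zlib

    # Sketch of a store-and-forward integrity check. Ethernet's FCS is a
    # CRC-32 over the frame; the switch recomputes it and drops the frame
    # on a mismatch.
    def fcs(frame: bytes) -> int:
        return zlib.crc32(frame) & 0xFFFFFFFF

    def should_forward(frame: bytes, received_fcs: int) -> bool:
        return fcs(frame) == received_fcs

    frame = b"\x00\x01\x02\x03 payload"
    good = fcs(frame)
    print(should_forward(frame, good))        # True: forward it
    print(should_forward(frame, good ^ 0x1))  # False: drop the garbled frame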

Re:That's all it takes (1)

i8myh8 (859764) | more than 7 years ago | (#20241503)

Uhh... negative. Any managed switch worth a sh!t would've noticed the bad checksum and dropped the packet. Any network admin worth a sh!t would've had it set up so he knew where the problem was via SOME reporting capability. If 1 NIC was bombing the network, someone should've known about it.

References? (1)

TypoNAM (695420) | more than 7 years ago | (#20241587)

Got any references or links to various tutorials and/or documents on how I could set up my network to notify me about a rogue NIC?

Re:That's all it takes (1)

SatireWolf (1050450) | more than 7 years ago | (#20241507)

Any switch worth its muster has an automatic ACK throttle which can be turned on to control network flooding in the event of NIC failure. Apparently the IT department at LAX hasn't heard of reading the manual; they just plugged it in like a Linksys router and expected the defaults to be 'good enough'.

Re:That's all it takes (1)

SatanicPuppy (611928) | more than 7 years ago | (#20240825)

Sure, if you're buying consumer grade switching hardware, and you have only one subnet, or all your subnets are weirdly bridged or whatever.

For my money, this should never have happened from a problem with one machine. That's wholly unacceptable. My home network is robust enough to handle one bad machine without going down completely...Hell, I could lose a whole subnet and no one on the other subnet would notice a thing.

If this system or switch or whatever is critical, there should have been a failover. They should have been able to trace the problem, and they should have been able to isolate it or remove it entirely. If you really do have a card going nuts and spamming the network, that is laughably easy to trace, unless you're in the habit of assigning dynamic IPs to critical pieces of your network.

Whiskey Tango Foxtrot (5, Insightful)

SatanicPuppy (611928) | more than 7 years ago | (#20240711)

According to the effing article, it wasn't even a server, but a goddamn desktop. How in the holy hell does a desktop take down the whole system? I can't even conceive of a situation where that could be the case on anything other than a network designed by chimps, especially through a hardware failure...A compromised system might be able to do it, but a system just going dark?

For that to have had any effect at all, that system must have been the lynchpin for a critical piece of the network... probably some Homeland Security abortion tacked onto the network, or some such crap... This is like the time I traced a network meltdown to a 4-port hub (not a switch, an unmanaged hub) that was plugged into (not a joke) a T-3 concentrator on one port, and three subnets of around 200 computers each on the other 3 ports. Every single one of the outbound cables from the $15.00 hub terminated in a piece of networking infrastructure costing not less than $10,000.

This is like that. Single point of failure in the worst possible way. Gross incompetence, shortsightedness, and general disregard for things like "uptime"; pretty much what we've come to expect from the airline industry these days. If I'm not flying myself, I'm going to be driving, sailing, or riding a goddamn bicycle before I fly commercial.

Re:Whiskey Tango Foxtrot (3, Interesting)

Jeremiah Cornelius (137) | more than 7 years ago | (#20240755)

Well.

Token ring sure used to fail like this! 1 bad station sending 10,000 ring-purge messages a second? Still, it was a truck. Files under 1Mb could be transferred, and this was TR/4, not 16!

Re:Whiskey Tango Foxtrot (1)

spitek (942062) | more than 7 years ago | (#20240883)

Great point about token ring. It was heavily used in these types of situations as well. Don't know about LAX, but a lot of the airports have upgraded since then. Anybody get to use switched 100Mb token ring? Too bad that didn't make it. Could have been cool: the switching speed of Ethernet plus the performance under load that token ring has.

Token Ring upgrade (1)

camperdave (969942) | more than 7 years ago | (#20241419)

Perhaps they upgraded their token ring to thinnet.

Re:Whiskey Tango Foxtrot (2, Informative)

sigipickl (595932) | more than 7 years ago | (#20240949)

This totally sounds like a token ring problem... either network flooding or dropped packets (tokens). These issues used to be a bear to track down -- going from machine to machine in serial from the MAU...

Ethernet and switching has made me fat- I never have to leave my desk to troubleshoot.

Re:Whiskey Tango Foxtrot (1)

spitek (942062) | more than 7 years ago | (#20240767)

Chimps make flat networks... with hubs... what are switches and routers? I agree; I've run into this problem several times in my career, for real. Bad network design. But hey, I've also worked in wiring closets in airports, most of which these days are commonly run by the airport authority and leased by the airlines or other tenants of the airports. Used to work for an airline. NOT surprised one bit.

Re:Whiskey Tango Foxtrot (1)

morgan_greywolf (835522) | more than 7 years ago | (#20240927)

Ugh. A flat network may be fine for something small, but for something as big and complex as an airport network, especially one at an airport the size of LAX? Unthinkable. Do these people hire idiots with no training or experience or what?

Re:Whiskey Tango Foxtrot (1)

spitek (942062) | more than 7 years ago | (#20241193)

Well, yes, yes, and what. I haven't been in the airline business for a few years now, but there was one network I ran into that had over 2000 seats sitting on a flat network. One would hope that LAX in general is not like that; I do not know how LAX's network is set up. However, the government agency does have a higher likelihood of running in the dark ages. A flaky network card sending out bad packets I have seen, and I have seen it cause all sorts of problems. But for it to do the damage it did? Never should have happened. On top of that, the response time to find out what the problem was. Sounds like poor network design and, yes, chimps and idiots. Or it is always possible that there is talent, just no money. Sorry for them if that is the case.

Re:Whiskey Tango Foxtrot (2, Insightful)

kylemonger (686302) | more than 7 years ago | (#20241239)

Do these people hire idiots with no training or experience or what?

I think just hiring idiots would be enough. No need to train them.

Re:Whiskey Tango Foxtrot (1)

Kadin2048 (468275) | more than 7 years ago | (#20241281)

Do these people hire idiots with no training or experience or what?
Probably they do to some extent, but if it's like other places I've worked, they probably hire people who have a clue, but then tell them to do little bits and pieces, and never give them enough resources to actually do the job right.

It's a lot of "we'll pay you to come out and install this." They don't want to hear 'well, you should really re-think the architecture of your whole network' as a response. They just want the new piece grafted on, and if you don't do the job, they'll just find somebody who will.

That's how these horrible abortions of big systems / networks happen. They usually don't start off like that. They just grow and evolve without much in the way of a central plan until they finally keel over and die. Nobody wants to spend the time, money, or downtime to tear things down and rebuild them until they actually fail. So they just grow out of control.

They probably had a flat network (or a switched one without any subnets) because that was the only way to keep everything working as it grew; as different contractors came in and tacked on this or that, they just added it on to whatever was there.

Re:Whiskey Tango Foxtrot (0)

Anonymous Coward | more than 7 years ago | (#20240775)

Single point of failure in the worst possible way. Gross incompetence, shortsightedness,
Sounds like government work to me.

Re:Whiskey Tango Foxtrot (2, Interesting)

mhall119 (1035984) | more than 7 years ago | (#20240791)

A compromised system might be able to do it, but a system just going dark?
The article says it was a partial failure, so I'm guessing the NIC didn't "go dark"; instead, it started flooding the network with bad packets.

Re:Whiskey Tango Foxtrot (1, Interesting)

Anonymous Coward | more than 7 years ago | (#20240859)

I'm guessing the NIC didn't "go dark", instead it started flooding the network with bad packets.
Yeah, and any decent switch (and some not-so-decent) would detect this and shut the port down.

Hell, I have a 7-year-old D-Link 8-port at home that can do this!

Re:Whiskey Tango Foxtrot (0)

Anonymous Coward | more than 7 years ago | (#20240817)

Not sure if these are MS desktops or what, but isn't some type of failover (bonding or whatever) fairly easy to do? I do this with all my Linux machines to two different switches. Then those go off to the core switch. Or even forgetting bonding, Ethernet cards are kinda cheap. Someone ought to be getting their resume together.......

Re:Whiskey Tango Foxtrot (5, Insightful)

MightyMartian (840721) | more than 7 years ago | (#20240821)

If the NIC starts broadcasting like nuts, it will overwhelm everything on the segment. If you have a flat network topology, then kla-boom, everything goes down the shits. A semi-decent switch ought to deal with a broadcast storm. The best way to deal with it is to split your network up, thus rendering the scope of such an incident significantly smaller.
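To make "split your network up" concrete, here is a minimal sketch using Python's stdlib ipaddress module; the flat 10.0.0.0/16 is a made-up example:

    # Sketch: carving one flat /16 broadcast domain into /24 subnets so a
    # single storming NIC can only wreck its own segment. Addresses are
    # made up for illustration (Python stdlib only).
    import ipaddress

    flat = ipaddress.ip_network("10.0.0.0/16")     # one giant broadcast domain
    subnets = list(flat.subnets(new_prefix=24))    # 256 smaller domains
    print(f"{flat} -> {len(subnets)} subnets of {subnets[0].num_addresses} addresses each")
    # A broadcast storm on 10.0.3.0/24 now stays behind that subnet's router
    # instead of flattening all 65,536 hosts.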

Re:Whiskey Tango Foxtrot (1)

SatanicPuppy (611928) | more than 7 years ago | (#20241021)

Yup. I've never really seen a situation where you'd have more than a dozen or so computers on a crappy unmanaged switch. Higher-quality hardware would throttle this stuff down to the most local layer, unless you're specifically multicasting across the whole network, which is a security horror story.

Re:Whiskey Tango Foxtrot (0)

Anonymous Coward | more than 7 years ago | (#20241097)

Case Western Reserve University, the 'most networked university in America' a few years ago, was on a single class B subnet when I was there. Big nasty old hubs ran most of it. The collision lights never went off on those suckers. A few kids playing Doom took down the entire university network. Those were the days.

Re:Whiskey Tango Foxtrot (2, Interesting)

Billosaur (927319) | more than 7 years ago | (#20240879)

And beyond that... how come there is no redundancy? After 9/11, every IT organization on the planet began making sure there was some form of failover to a backup system or disaster recovery site to ensure that critical systems could not go down as the result of something similar or some other large-scale disaster. Not only was this system apparently cobbled together, there was no regard for the possibility of it failing for any reason.

Re:Whiskey Tango Foxtrot (2, Insightful)

dave562 (969951) | more than 7 years ago | (#20241295)

They concentrated all of the redundancy dollars into layer B of the OSI model... the bureaucracy. There wasn't anything left for the lower layers.

Re:Whiskey Tango Foxtrot (1)

charlesnw (843045) | more than 7 years ago | (#20241343)

After 9/11, every IT organization on the planet began making sure there was some form or fail-over to a backup system or disaster recovery site

Um, no? A number of large organizations do not have a disaster recovery site. Just the other day Cisco.com was down for a few hours.

Re:Whiskey Tango Foxtrot (1, Funny)

Anonymous Coward | more than 7 years ago | (#20240915)

At a previous employer, we kept having a Cisco switch crash and become unresponsive, making about 1/3 of the people connected to it in the office lose their connection.

After about 2 to 3 hours of investigation, and it going down twice after we'd brought it back up, we found the problem. The f**king intern, who was worthless anyway and about to get fired for other stupid mishaps, had a Netgear switch he was using for setting up new desktops. He thought it'd be cute to plug one port of the switch into another port on it, and that was creating the havoc and bringing down the Cisco switch and part of the network.

I think he got fired two weeks later. I guess he had it coming, since several times he would go home for lunch and take a nap, coming back after 3 or 4 hours because he had overslept during his hour lunch break.

Re:Whiskey Tango Foxtrot (0)

Anonymous Coward | more than 7 years ago | (#20241103)

Can someone explain technically how doing that on the Netgear POS would cause the Cisco to crash? I guess there'd have to be broadcast traffic on that segment?

Re:Whiskey Tango Foxtrot (1)

archen (447353) | more than 7 years ago | (#20241221)

If you have two different network segments which are supposed to be routed, and a connection between them appears, you end up with a scenario where packets may go to the switch or may hop the rogue connection. I had this happen at a new facility our company moved to. I had a few single switches divided in half with VLANs, and somewhere someone ended up with a crossed connection. The result was strange packet storms that would mysteriously cause network interfaces on machines to shut down completely.

It actually took me a while to figure it out until I saw our internal (FreeBSD) network firewall was reporting traffic coming in on the wrong interface. After that it was a zoo finding the wire...

Re:Whiskey Tango Foxtrot (0)

Anonymous Coward | more than 7 years ago | (#20241373)

I'm still not sure I quite get the mechanism.
  1. The loop creates a longer path to the same destinations on the media layer?
  2. There's a probability a packet will be looped and re-looped via the mis-patched cable?
  3. This is enough to eventually flood a network?
Am I in the ballpark?

Re:Whiskey Tango Foxtrot (1)

Lehk228 (705449) | more than 7 years ago | (#20241461)

I think the problem occurs when the switches "learn" a circular route, and so packets get stuck in that circular route at whatever speed the network runs at.

Re:Whiskey Tango Foxtrot (1)

cciechad (602504) | more than 7 years ago | (#20241433)

Not really; most likely it was an STP issue in the Netgear case (damn unmanaged switches with no/bad STP support). Basically what you end up with is a loop, which can cause switch CPU to go to 100% even on 6500s, which makes tracing it interesting. The fastest way is to start popping cards until it becomes responsive, then isolate to a port. I don't like putting portfast on user ports anymore because of this.

Re:Whiskey Tango Foxtrot (1)

cez (539085) | more than 7 years ago | (#20241355)

By linking the Netgear on two ports to the Cisco (regardless of whether a cross-over is used) you introduce a loop into the network... a big no-no without proper VLANing or routing in between, as the MAC forwarding table of the Cisco will pick up both ports. Spanning Tree Protocol is one line of defense for this, but not a cure-all. Depending on the router or type of trunking the Cisco was uplinked to, the whole segment might be automatically taken offline and disabled to prevent further damage, so usually the whole switch stack drops from the network.

Re:Whiskey Tango Foxtrot (1)

andy314159pi (787550) | more than 7 years ago | (#20241611)

I think he got fired two weeks later. I guess he had it coming, since several times he would go home for lunch and take a nap, coming back after 3 or 4 hours because he had overslept during his hour lunch break.
Yeah, I totally take the lunchtime siesta at my office, and that limits my oversleeping to, at most, two hours. What a complete office noob. By the way, how do you fire an unpaid intern?

Re:Whiskey Tango Foxtrot (1)

Kadin2048 (468275) | more than 7 years ago | (#20241613)

Idiot intern notwithstanding, should that really have been possible? I thought that any decent router would see that a loop had occurred and shut down the port connected to it, rather than forwarding all the broadcast storm packets.

After all, preventing layer 2 loops is what Spanning Tree is all about, and I thought Cisco had some similar system for figuring out if a link was unidirectional (if you're sending packets down to something and not getting anything back, it can shut it down, to keep it from just sending out lots of bogus requests).

I doubt that crappy consumer switches do STP, but the upstream Cisco one should have ... shouldn't it?
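The loop failure mode is easy to see in a deliberately silly toy model (made-up numbers, not a real simulator): every flooded broadcast re-enters the loop and gets flooded again, so copies multiply each round until STP blocks a path or someone pulls a cable.

    # Toy model of a layer-2 loop with no Spanning Tree. Two switches are
    # cross-connected by two cables; each broadcast arriving on one cable
    # is flooded back out the other, so in-flight copies roughly double
    # every round instead of dying out.
    copies = 1   # one innocent broadcast frame enters the loop
    for tick in range(6):
        copies *= 2   # each switch re-floods what it just received
        print(f"round {tick}: {copies} copies circulating")
    # Real links saturate within milliseconds and the switch CPUs peg at
    # 100% -- the "packet storm" described in this thread.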

Stuff Normally All Forked Up (1)

jo42 (227475) | more than 7 years ago | (#20240919)

Ha! Ha!

Teaches you to rely on technology...

Re:Stuff Normally All Forked Up (1)

MightyMartian (840721) | more than 7 years ago | (#20241069)

Yessirree. Let's put one of the busiest airports on the planet on a paper-and-messenger-boy backup. Yeah, that'll clear the backlog real well.

Re:Whiskey Tango Foxtrot (1)

gEvil (beta) (945888) | more than 7 years ago | (#20241045)

I told the boss we should get a proper network connection. But noooooo, he insisted that getting a consumer-level DSL connection and using Windows Internet Connection Sharing was the way to go...

Re:Whiskey Tango Foxtrot (1)

Joe The Dragon (967727) | more than 7 years ago | (#20241253)

That's what the M$ tests talk about in their network setups.

Re:Whiskey Tango Foxtrot (1)

jafiwam (310805) | more than 7 years ago | (#20241117)

According to the effing article, it wasn't even a server, but a goddamn desktop. How in the holy hell does a desktop take down the whole system? I can't even conceive of a situation where that could be the case on anything other than a network designed by chimps, especially through a hardware failure...A compromised system might be able to do it, but a system just going dark?

Los Angeles World Airports is a unique system of four airports owned and operated by the City of Los Angeles.

Any further questions?

Probably the lowest bidder union labor designing and setting it up. Shoulda called IBM.

Re:Whiskey Tango Foxtrot (0)

Anonymous Coward | more than 7 years ago | (#20241137)

How? How about running on old 3Com hubs and other incredibly old gear? Airports are notorious for running shit-level networking gear. They spend more on making the food court pretty than on infrastructure, because the ones in charge of airports are typically incompetent idiots.

Re:Whiskey Tango Foxtrot (1)

SatanicPuppy (611928) | more than 7 years ago | (#20241207)

Oh, no doubt. It's clearly what happened... This kind of thing is almost impossible with modern switching hardware -- not even just the really expensive stuff, but the reasonable consumer stuff as well.

Fricking stupid. People think it'll never come back to bite them, and it always does.

Re:Whiskey Tango Foxtrot (1)

charlesnw (843045) | more than 7 years ago | (#20241225)

Um. The systems that control aircraft are completely separate from the systems used to manage passengers.

Re:Whiskey Tango Foxtrot (1)

SatanicPuppy (611928) | more than 7 years ago | (#20241327)

So? Does it give you confidence in the rest of their equipment when one misbehaving computer can bring down their entire network for nine hours?

Bunch of monkeys. The reason I don't fly commercial anymore has nothing to do with the planes. It has everything to do with the airports.

Re:Whiskey Tango Foxtrot (1)

daveywest (937112) | more than 7 years ago | (#20241245)

I worked on a network where the server kept dropping connections and users were reporting high latency. We eventually had to use a process of elimination to isolate the bad connection before we found the bad line in the server room. We yanked it out and waited for the phone call from someone who couldn't get their email. Turns out they had decided to turn off DHCP and self-assign an IP address: the same one as the server.

In other news... (2, Insightful)

djupedal (584558) | more than 7 years ago | (#20240757)

"...said airport and customs officials are discussing how to handle a similar incident should it occur in the future."

What makes them think they'll get another shot? Rank and file voters are ready with their own plan...should a 'similar incident' by the same fools happen again.

The backup plan (5, Funny)

Animats (122034) | more than 7 years ago | (#20240831)

DHS's idea of a "backup plan" will probably be to build a huge fenced area into which to dump arriving passengers when their systems are down.

Re:The backup plan (1)

djupedal (584558) | more than 7 years ago | (#20240935)

:)

I hear FEMA has several new/used camp trailers I'm sure DHS could avail themselves of.

No, a multi-front plan (1)

EmbeddedJanitor (597831) | more than 7 years ago | (#20241481)

Arrest all NIC designers, engineers, network stack developers, IT managers,... on suspicion of conspiring to cause the problem.

Change to Wifi because that can't have NIC faults.

C'mon folk... help me out here!

You figure it out (3, Interesting)

COMON$ (806135) | more than 7 years ago | (#20240761)

Let me know; knowing how to prevent failure due to a flaky NIC on a network is a very large issue.

First you see latency on a network, then you fire up a sniffer and hope to god you can get enough packets to deduce which is the flaky card without shutting down every NIC on your network.

Of course, I did write a paper on this behavior years ago in my CS networking class: taking a Snort box and a series of custom scripts to notify admins of spikes on the network outside of normal operating ranges for that device's history. However, implementing this successfully in an elegant fashion has been beyond me, and I just rely on Nagios to do a lot of my bidding.
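A stripped-down version of that idea still fits in a page of Python. A sketch in the spirit of the Snort-box-plus-scripts setup described above (assumes scapy is installed and the script runs with capture privileges; WINDOW and THRESHOLD are made-up values you'd tune against each device's normal history):

    # Sketch of a per-source packet-rate spike detector.
    import time
    from collections import Counter
    from scapy.all import sniff

    WINDOW = 5         # seconds per sample window
    THRESHOLD = 5000   # frames/window considered abnormal

    counts = Counter()
    window_start = time.time()

    def on_packet(pkt):
        global window_start
        if not hasattr(pkt, "src"):   # only count frames with a source MAC
            return
        counts[pkt.src] += 1
        if time.time() - window_start >= WINDOW:
            for mac, n in counts.most_common(3):
                if n > THRESHOLD:
                    print(f"ALERT: {mac} sent {n} frames in {WINDOW}s")
            counts.clear()
            window_start = time.time()

    sniff(prn=on_packet, store=False)   # runs until interrupted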

Re:You figure it out (1, Insightful)

Anonymous Coward | more than 7 years ago | (#20241013)

Why would anyone be stupid enough to have all hosts in a mission-critical setting on one subnet?

Maybe you meant it's a "large issue" if you're a complete moron and put everything on one subnet, but everything is an issue if you're a complete moron, so there's nothing special about nics.

Re:You figure it out (4, Informative)

GreggBz (777373) | more than 7 years ago | (#20241023)

One not-too-unreasonable strategy is to set up SNMP monitoring on all your NICs. This is not unlike the cable modem monitoring software at most cable ISPs.

At first, I can envision it being a PITA if you have a variety of NIC hardware, especially finding all those MIBs. But they are all pretty standard these days, and your polling interval could be fairly long, like every 2 minutes. You could script the results, sorting all the naughties and periodic non-responders to the top of the list. That would narrow things down a heck of a lot in a circumstance like this.

No alarms, but at least a quick heartbeat of your (conceivably very large) network. A similar system can be used to watch 30,000+ cable modems without too much load on the SNMP trap server.
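A minimal sketch of that polling loop in Python, using the pysnmp library (the host addresses, community string, and interface index are placeholders; a real rollout would walk every interface and keep per-device history to sort the naughties to the top):

    # Sketch: poll IF-MIB error counters over SNMP v2c with pysnmp.
    from pysnmp.hlapi import (SnmpEngine, CommunityData, UdpTransportTarget,
                              ContextData, ObjectType, ObjectIdentity, getCmd)

    def if_in_errors(host, if_index=1, community="public"):
        err_ind, err_status, _, var_binds = next(getCmd(
            SnmpEngine(),
            CommunityData(community),
            UdpTransportTarget((host, 161), timeout=2, retries=0),
            ContextData(),
            ObjectType(ObjectIdentity("IF-MIB", "ifInErrors", if_index)),
        ))
        if err_ind or err_status:
            return None   # non-responder: sort it to the top of the list
        return int(var_binds[0][1])

    for host in ["10.0.0.1", "10.0.0.2"]:   # hypothetical switch addresses
        errors = if_in_errors(host)
        print(host, "unreachable" if errors is None else f"ifInErrors={errors}")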

Re:You figure it out (1)

asphaltjesus (978804) | more than 7 years ago | (#20241029)

It's called teaming on Windows, and we use it. In fact, we had a flaky NIC just the other day. I'm not sure how many cards/vendors support teaming outside of HPaq.

On Linux, it's called bonding. This is a killer feature.
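On Linux the kernel exposes bond state as plain text under /proc/net/bonding/, so a watchdog for a flaky slave NIC can be a short script. A sketch (the interface name "bond0" is an assumption; the field names are those of the kernel bonding driver's status file):

    # Sketch: report a Linux bond's active slave and per-slave link state
    # by parsing /proc/net/bonding/<bond>.
    def bond_status(bond="bond0"):
        status = {}
        current_slave = None
        with open(f"/proc/net/bonding/{bond}") as f:
            for line in f:
                key, _, value = line.partition(":")
                key, value = key.strip(), value.strip()
                if key == "Slave Interface":
                    current_slave = value
                elif key == "MII Status" and current_slave:
                    status[current_slave] = value
                elif key == "Currently Active Slave":
                    status["active"] = value
        return status

    print(bond_status())  # e.g. {'active': 'eth0', 'eth0': 'up', 'eth1': 'down'}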

I had some very limited professional experience with LAWA in the last couple of years (LAWA runs LAX). I have no doubt there is quite a bit of the usual consultant chicanery going on, whereby they don't actually hire qualified IT people, just people an elected official or two or three may know. The IT staff on hand most likely has quite limited authority. Other than hiring more consultants they know to document the failure, little will ever come of it.

How is that scenario possible you ask? Well, LAWA is a HUGE cash cow for the city/county so there are naturally quite a few political contributors lined up to get their goods/services contracts fulfilled.

Social not technical problem. (1)

twitter (104583) | more than 7 years ago | (#20241109)

How about doing regular police work instead of pre-crime, so that passengers don't have to stand around while your network flakes out?

Re:You figure it out (1)

SatanicPuppy (611928) | more than 7 years ago | (#20241155)

The AC is right. Your network topology should be spread out over a number of subnets, and they should only talk to each other where it's critical. The subnets should be separated by expensive managed switches, or by custom hardware configured to monitor packet traffic and isolate problems. Critical systems should be largely inaccessible to the vast majority of the network, and where they are accessible the access is monitored and throttled. If one machine takes too much traffic, you need a second machine set up in a load balancing configuration.

This stuff is basic. To have one card take down a whole network... I can't even conceive of it. There isn't one card that can talk to my whole network on all ports, and there would never be a need for such a thing.

Re:You figure it out (1)

t0rkm3 (666910) | more than 7 years ago | (#20241383)

You can see similar behavior from Cisco IPS if you enable and tune the anomaly detection engine. This in turn feeds MARS... which is groovy, except the alerting stinks within MARS. So you have to beat up Cisco and they'll hash out an XSLT that will prettify the XML garbage into a nice little HTML email for the desktop support guys to chase down the offender. Couple that with some Perl to grab the fields and shove them in a DB for easy reference...

It works, and it works a lot more easily than anything else that I have deployed to accomplish a similar task.

Head of IT for LAX should be fired... (3, Insightful)

Glasswire (302197) | more than 7 years ago | (#20240797)

...for not firing the networking manager. The fact that they were NOT terrified of this news getting out, and were too stupid to cover it up, indicates he/she and their subordinates SIMPLY DON'T KNOW THEY DID ANYTHING WRONG by not putting in a sufficiently monitored switch architecture which would rapidly alert IT staff and lock out the offending node.
Simply amazing. Will someone in the press publish the names of these losers so they can be blacklisted?

Re:Head of IT for LAX should be fired... (5, Funny)

Rob T Firefly (844560) | more than 7 years ago | (#20240959)

They have to find someone who can not only design a vital high-traffic network and maintain it... but who didn't have fish for dinner.

Mod funny, not insightful! (0)

Anonymous Coward | more than 7 years ago | (#20241331)

Haven't you mods ever seen Airplane?

Re:Head of IT for LAX should be fired... (3, Informative)

kschendel (644489) | more than 7 years ago | (#20241081)

RTFA. This was a *Customs* system. Not LAX, not the airlines. The only blame the airlines can (and should) get for this is not shining the big light on Customs and Border Protection from the very start. I think it's time the airlines started putting public and private pressure on CBP and TSA to get the hell out of the way. It's not as if they are actually securing anything.

CBP deserves a punch in the nose for not having a proper network design with redundancy, and another punch in the nose for not having any clue what to do in an outage. They should have a reduced-service backup plan, and a manual backup plan, and a diversion backup plan. There's no excuse for federal officials to sit there like idiots waiting for things to magically get fixed. Oh wait, I guess some of them ARE idiots.

Always a simple answer ... (0)

Anonymous Coward | more than 7 years ago | (#20240913)

to any problem. Just do what my company does: have a meeting! And remember that 8 hrs. of meetings per day will truly brighten your outlook!

LACP (2)

dy2t (549292) | more than 7 years ago | (#20240925)

Also known as IEEE 802.3ad, LACP supports aggregating NICs to both improve overall bandwidth and gracefully deal with failed links.
More info at http://en.wikipedia.org/wiki/Link_Aggregation_Control_Protocol [wikipedia.org]

Systems seem to be shipping with multiple NICs more commonly now (esp. servers), so maybe this will be used more and more. It is important to note that the network switch/router needs to support LACP (dumb/cheap switches do not, while expensive/managed ones do), so that might be a barrier. Cisco switches, and maybe others, have implemented proprietary trunking/aggregation schemes, but 802.3ad is a standard.

In practice, I tried to use LACP with a Linksys SRW2048, an $800 switch (targeted at small businesses, much cheaper than a typical managed switch), but it did not work reliably (performance got worse, some clients could not connect / timed out). Still working on it.
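For reference, the Linux side of an 802.3ad setup is short with iproute2. A hypothetical sketch (requires root; interface names are placeholders, and the switch ports must also be configured into an LACP group, which is one place the SRW2048 trouble could have lived):

    # Sketch: build an 802.3ad (LACP) bond on Linux via iproute2 commands.
    import subprocess

    def run(cmd):
        print("+", cmd)
        subprocess.run(cmd.split(), check=True)

    run("ip link add bond0 type bond mode 802.3ad")
    for nic in ("eth0", "eth1"):          # the two aggregated NICs
        run(f"ip link set {nic} down")
        run(f"ip link set {nic} master bond0")
    run("ip link set bond0 up")
    # Traffic now hashes across both links, and a single NIC failure just
    # shrinks the aggregate instead of taking the host offline.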

LAX == Turds. (0, Troll)

corifornia (995298) | more than 7 years ago | (#20240955)

I live right next to LAX. I drop off and pick up friends frequently; the whole inside of that airport is a turdfest. I'm sure the wire from the network card attaches to the rest of the network with vampire clamps.

Well, what do you expect.. (0)

Anonymous Coward | more than 7 years ago | (#20240975)

if it takes half an hour for a 'homeland security officer' to write down your job, because the _spell_checker_ doesn't know the word 'physics'... (and it is obvious that the woman behind the desk is _never_ going to

I'm sorry, but I'm kind of glad that people from the USA had to experience this stupid border protection.
Homeland Security is a master of FUD, btw, shouting at people to 'get in the line' and stuff to make them nervous, so possible terrorists start pissing in their pants and handing over bombs...

Always a warm welcome flying to the USA these days.

(Posting anonymously, because this homeland security crap scares me.)

Let that be a lesson to you... (3, Funny)

urlgrey (798089) | more than 7 years ago | (#20241043)

To all you novice net admins out there: network cards do *not* like chunky peanut butter! Smooth/creamy only, please.

Now you see what happens when some joker thinks [s]he can get away with using chunky for something as critical as proper care and feeding of network cards. Pfft.

Bah! Kids these days... I tell ya. Probably the same folks that think the interwebnet is the same as the World Wide Web.

Great, Scott! What's next?!

That makes sense (0)

Anonymous Coward | more than 7 years ago | (#20241049)

"As data overloaded the system, a domino effect occurred with other computer network cards, eventually causing a total system failure a little after 2 p.m., Connors said."


Wait, what?

lol (0, Offtopic)

thatskinnyguy (1129515) | more than 7 years ago | (#20241053)

If it was running Ubuntu and had the same hardware, they could have experienced the same problem as these guys [slashdot.org].

The whole system is pointless anyway (4, Insightful)

Potent (47920) | more than 7 years ago | (#20241061)

When the U.S. Government is letting millions of illegal aliens cross over from Mexico and live here with impunity, then what the fuck is the point with stopping a few thousand document carrying people getting off of planes from entering the country?

I guess the system exists to give the appearance that the feds actually give a shit.

And then the Pres and Congress wonder why their approval ratings are as small as their shoe sizes...

nic can take down a segment (3, Interesting)

KDN (3283) | more than 7 years ago | (#20241071)

Years ago we had a 10BT NIC go defective in such a way that whenever it was plugged into the switch it would obliterate traffic on that segment. The fun part: EVEN IF THE NIC WAS NOT PLUGGED INTO THE PC. Luckily that happened in one of the few areas that had switches at the time; everything else was one huge flat LAN.

Sounds Familiar...... (1)

netrage_is_bad (734782) | more than 7 years ago | (#20241121)

We had something similar happen at my building when I worked at Kent State University. The air conditioning was being worked on, and the workers thought it would be a good idea to plug an AC unit into the server room, something they had been specifically told not to do. The additional load of the AC flipped the breaker and set off all the alarms; all the switches lost power, and the backup units shut down all the servers. It wouldn't have been so bad except that all university internet traffic ends up going through our building, which caused all the routers to become backed up. ALL of them. What made things worse was that the new sysadmin didn't know about some of the backup systems, no one knew how to reset the breakers (it was a special system), and there was a special pin that had to be used that no one knew about. It was a hilarious 2 hours without internet.

"A similar incident" (2, Insightful)

The One and Only (691315) | more than 7 years ago | (#20241135)

A spokeswoman for the airports agency said airport and customs officials are discussing how to handle a similar incident should it occur in the future.

Except in the future, the incident isn't going to be similar, aside from being similarly boneheaded. This attitude of "only defend yourself from things that have already happened to you before" is just plain dumb. Obviously their system was set up and administered by a boneheaded organization to begin with, and now that same boneheaded organization is rushing to convene a committee to discuss a committee to discuss how to prevent something that already happened from happening again. The root flaw is still in the organization.

Blaming the Wrong NIC (2, Insightful)

Doc Ruby (173196) | more than 7 years ago | (#20241141)

The NIC that failed isn't the part that's at fault. NICs fail, and can be counted on to do so inevitably, if relatively unpredictably (MTBF is statistical).

The real problem NIC is the one that wasn't there as backup. Either a redundant one already online, or a hotswap one for a brief downtime, or just a spare that could be replaced after a quick diagnostic according to the system's exception handling runbook of emergency procedures.

Of course, we can't blame a NIC that doesn't exist, even if we're blaming it for not existing. We have to blame the people who designed and deployed the system with the single point of failure, and the managers and oversight staff who let the airport depend on that single point of failure.

But instead I'm sure we'll blame the dead NIC. Which gave its life in service to its country.

Managed switches are FTW (2, Insightful)

Sehnsucht (17643) | more than 7 years ago | (#20241211)

Where I work, if there's a packet storm someplace (a server is getting attacked, a server is the attacker, or someone just has a really phat pipe on the other end and is moving a ton of data), we get an SNMP TRAP for the packet threshold on the offending port. BAM! You know where the problem is, and since we have managed switches you just shut off the port if you can't resolve the problem.

Having said that, since the managed switches are gigE-uplinked and each port is only 10/100, I don't think we've ever had a problem where a server was outbounding and brought down the switch/network (it just added some extra latency). We've had some really large inbounds occasionally take down a whole switch, and heaven forbid some idiot shuts the port off on an inbound attack instead of nulling it at the border, because then the ARP drops and the DoS gets forwarded to every port on the VLAN on a ton of switches... but a broken NIC packet-storming would not have been an issue.

OK, so maybe they don't have managed switches all the way down to the lowest point on the network. They should still have SOME further up the chain, and be monitoring them such that they know from what direction the problem is coming, and shut it off / look at it with a sniffer etc.

Infrastructure as important as an airport should have its own network properly equipped and maintained with managed equipment, making this nearly a non-issue and certainly one easily resolved.
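The receiving end of that trap workflow can start as small as one UDP socket. A bare-bones sketch (binding port 162 needs root; a real deployment would use snmptrapd or a full SNMP library to decode the ASN.1 trap payload rather than just logging the sender):

    # Sketch: minimal listener for SNMP traps on UDP 162. This only logs
    # who sent a trap and how big it was -- enough to prove the "BAM, you
    # know where the problem is" workflow end to end.
    import socket

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("0.0.0.0", 162))
    print("waiting for traps...")
    while True:
        data, (addr, port) = sock.recvfrom(4096)
        # A packet-threshold trap from a switch would arrive here; map the
        # source address to the switch, then shut off or inspect the port.
        print(f"trap from {addr}: {len(data)} bytes")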

not too surprised (1)

mytrip (940886) | more than 7 years ago | (#20241223)

I used to work for a very large travel agency and have seen queues of travel reservations get pretty backed up and cause problems before, although on a smaller scale.

Most reservations are checked for problems automatically but pushed through by a person and moved from one queue to another. If the program that checks them crashes, it can back things up.

I remember a program crashing and a queue getting 2000+ reservations in it before someone figured out what was going on; it had things screwed up for about 2 days while a replacement computer gradually cleared the queue out.

Apropos. (0, Offtopic)

jetpack (22743) | more than 7 years ago | (#20241309)

As I'm RTFAing, the fortune at the bottom of the slashdot page reads:


      Is this an out-take from the "BRADY BUNCH"?

:)

mikkkro$$$oft did this (0, Troll)

R00BYtheN00BY (1118945) | more than 7 years ago | (#20241313)

fuckin unstable piece o shit mikkkro$$$haft caused my delay BILL GATE$$$

IT is not that advanced (1)

Herkum01 (592704) | more than 7 years ago | (#20241477)

This brings out an obvious point: despite the advances we have made in computing and IT, it is still relatively young and not that robust.

This is the equivalent of your car stopping working without the 'check engine' light even coming on. At least now some of the technology for cars is getting to the point where it will find the problem for you. The same still cannot be said for large computer networks.

When people stop treating computers as flawless wonder machines, then we shall see some real progress made.

Spewing (1, Flamebait)

PacketScan (797299) | more than 7 years ago | (#20241533)

So a desktop got infected and started to spew crap onto the network, and we blame it on the NIC itself...
HaH, security, what is that? Half our personal information can be found on P2P networks because government employees can't actually do the jobs they have; they screw around and download music / movies or who knows what else. What would make airport employees any different?
When you boil it down to the root problem, you'll find it's a lack of leadership allowing these problems / attitudes to exist.
And great, our tax dollars paid for this screw-up.

sadly... this may be typical (4, Insightful)

bwy (726112) | more than 7 years ago | (#20241621)

Sadly, many real-world systems are nothing like what people might envision them as. We all sit back in our chairs reading slashdot and thinking everything is masterfully architected, fully HA, redundant, etc.

Then as you work more places, you start seeing that this is pretty far from the actual truth. Many "production" systems are held together by rubber bands, and duct tape if you're lucky (but not even the good kind). In my experience it can be a combination of poor funding, poor priorities, technical management that doesn't understand technology, or just a lack of experience or skills among the workers.

Not every place is a Google or a Yahoo!, which I imagine look and smell like technology wherever you go on their fancy campuses. Most organizations are businesses first, and tech shops last. If software and hardware appear to "work", it is hard to convince anybody in a typical business that anything should change, even if what is "working" is a one-off prototype running on desktop hardware. It often requires strong technical management and a good CIO/CTO to make sure that things happen like they should.

I suspect that a lot of things we consider "critical" in our society are a hell of a lot less robust under the hood than anything Google is running.