
Microsoft Bots Effectively DDoSing Perl CPAN Testers

timothy posted more than 4 years ago | from the stuck-in-a-rut dept.

Microsoft

at_slashdot writes "The Perl CPAN Testers have been suffering issues accessing their sites, databases and mirrors. According to a posting on the CPAN Testers' blog, the CPAN Testers' server has been aggressively scanned by '20-30 bots every few seconds' in what they call 'a dedicated denial of service attack'; these bots 'completely ignore the rules specified in robots.txt.'" From the Heise story linked above: "The bots were identified by their IP addresses, including 65.55.207.x, 65.55.107.x and 65.55.106.x, as coming from Microsoft."


332 comments


So how do we DDoS Microsoft? (4, Funny)

drinkypoo (153816) | more than 4 years ago | (#30806824)

Anyone know which of Microsoft's front-facing pages are the most computationally intensive, yet always dynamically generated? :D

Re:So how do we DDoS Microsoft? (2, Interesting)

Anonymous Coward | more than 4 years ago | (#30806870)

Bing? ...But that would only help them to DDoS Bing.

Re:So how do we DDoS Microsoft? (2, Funny)

jisatsusha (755173) | more than 4 years ago | (#30806936)

All that'd serve to do is make them look more popular than ever. Traffic up 300%! Sounds like a good mar

Re:So how do we DDoS Microsoft? (3, Funny)

Anonymous Coward | more than 4 years ago | (#30806964)

That's exactly what I said. Don't you dare leech the score from me, jackass!


Re:So how do we DDoS Microsoft? (2, Insightful)

Lennie (16154) | more than 4 years ago | (#30806884)

http://blogs.msdn.com/

I've seen it fail many times

What's not? (1, Troll)

tjstork (137384) | more than 4 years ago | (#30806904)

It's not like ASP.NET is the most efficient way to sling web pages to begin with.

Re:So how do we DDoS Microsoft? (2, Insightful)

SharpFang (651121) | more than 4 years ago | (#30806908)

No, we just make mistakes writing our Perl programs for automatic downloading stuff from MSDN. Like, download() unless success, and forget to set success=true;
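The retry bug the parent jokes about could look like this (a hypothetical Python sketch standing in for the Perl; all names here are illustrative):

```python
def fetch_with_retry(url, fetch, max_attempts=5):
    """Retry a download until it succeeds, up to max_attempts.

    The joke above: forget the `success = True` line after a good
    fetch and this loop hammers the server on every single run.
    Returns the number of attempts made."""
    success = False
    attempts = 0
    while not success and attempts < max_attempts:
        attempts += 1
        if fetch(url) is not None:
            success = True  # drop this line and you've built a DDoS bot
    return attempts
```

With a working `fetch` the loop stops after one attempt; with one that always fails it stops only at `max_attempts`.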

Ask the Chinese to do it (0)

Anonymous Coward | more than 4 years ago | (#30807098)

They know how.

Re:So how do we DDoS Microsoft? (5, Informative)

jlp2097 (223651) | more than 4 years ago | (#30807138)

Not necessary. A Bing Product Manager has already commented on the CPAN Testers blog entry [perl.org] upon which the article is based:

Hi,
I am a Program Manager on the Bing team at Microsoft, thanks for bringing this issue to our attention. I have sent an email to barbie@cpan.org as we need additional information to be able to track down the problem. If you have not received the email please contact us through the Bing webmaster center at bwmc@microsoft.com.

As said below, never ascribe to malice that which can be adequately explained by stupidity. (Insert lame joke about MSFT being full of stupidity here).

Re:So how do we DDoS Microsoft? (1)

John Hasler (414242) | more than 4 years ago | (#30807176)

Seems like the CPAN admin has already solved the "issue".

Re:So how do we DDoS Microsoft? (5, Funny)

kulnor (856639) | more than 4 years ago | (#30807224)

Well, with Barbie(TM) on the case, this should be quickly resolved (unless she's too busy with G.I.Joe(TM))

Re:So how do we DDoS Microsoft? (5, Funny)

Anonymous Coward | more than 4 years ago | (#30807218)

As much spam as I get from ir@infousa.com , I wish that someone would DDOS that damned company. If I knew of a way to get extra spam to ir@infousa.com I would probably do it so that company could get a taste of its own medicine. ir@infousa.com sent me unsolicited spam and it drives me nuts. Thanks for nothing, ir@infousa.com . It makes me want to call the company at (402)593-4500 and complain, but I don't have time. I guess I'll email them at ir@infousa.com instead. maybe.

Re:So how do we DDoS Microsoft? (1)

Mephistro (1248898) | more than 4 years ago | (#30807492)

Clue: Subtle joke, deserves 'funny' moderation ;)

Re:So how do we DDoS Microsoft? (4, Insightful)

PetoskeyGuy (648788) | more than 4 years ago | (#30807434)

Why make things worse? Block the ip address or range and notify the admins. This isn't a chan mob.

There's... (0, Redundant)

Anonymous Coward | more than 4 years ago | (#30806830)

probably a Perl script to handle that!

The end is near (1, Funny)

Jorl17 (1716772) | more than 4 years ago | (#30806840)

Run, Microsoft is coming to get you!

Why? (0, Redundant)

joel.neely (165789) | more than 4 years ago | (#30806842)

Bing?

Re:Why? (-1, Offtopic)

joel.neely (165789) | more than 4 years ago | (#30807020)

PS: Before marking something "redundant", look at the timestamps to see which entry came first!

Re:Why? (1)

ozmanjusri (601766) | more than 4 years ago | (#30807196)

Why? Bing?

They have to have SOME activity.

Sounds like there's more traffic from their bots than customers.

Oh! *Literally* Microsoft bots! (1)

Culture20 (968837) | more than 4 years ago | (#30806854)

Until I read the summary I thought it was another article about Windows botnets, and was wondering why "Microsoft" was tacked on, since Windows is the default OS assumption. Of course, it would be interesting if these were new CPAN mirrors that MS was setting up.

Re:Oh! *Literally* Microsoft bots! (4, Informative)

Ardaen (1099611) | more than 4 years ago | (#30807350)

Probably not, if you look at other incidents: http://cmeerw.org/blog/594.html [cmeerw.org] it appears they just like to push the limits.

Testers blog link... (1)

flyingfsck (986395) | more than 4 years ago | (#30806858)

Sooooo, lets all go to the testers blog and DDOS that too. Dumbass...

Re:Testers blog link... (1)

nicolas.kassis (875270) | more than 4 years ago | (#30806996)

If he can handle the msnbots, he probably can handle the slashdot crowd.

I've seen it before (5, Interesting)

LordAzuzu (1701760) | more than 4 years ago | (#30806860)

I manage some networks in my home city in Italy, and in the past year I've often seen strange traffic coming from some of their IP addresses. Guess they were exploited by someone a long time ago and didn't even notice it.

Typical M$ (0, Flamebait)

omb (759389) | more than 4 years ago | (#30806874)

Lazy, feckless, inconsiderate crooks.

Re:Typical M$ (1)

auric_dude (610172) | more than 4 years ago | (#30806890)

Sounds like Microsoft.CN to me.

Re:Typical M$ (1, Informative)

Anonymous Coward | more than 4 years ago | (#30807150)

That's not a troll. That's common knowledge.

A more appropriate mod would be +5 Redundant.

Check the blog... (4, Funny)

strredwolf (532) | more than 4 years ago | (#30806900)

Looks like Microsoft's Bing managers are on it. They'll make it worse in no-time flat. :)

BTW, the difference between a DDOS and a Slashdotting? You know why your site went down -- you got linked!

Re:Check the blog... (5, Funny)

Anonymous Coward | more than 4 years ago | (#30807082)

BTW, the difference between a DDOS and a Slashdotting?

The DDOS bots actually read TFA.

Re:Check the blog... (0)

Anonymous Coward | more than 4 years ago | (#30807210)

I think you may just have explained Bing's search accuracy...

MS ineptitude? (2, Insightful)

Anonymous Coward | more than 4 years ago | (#30806906)

From TFA:

Hi,
I am a Program Manager on the Bing team at Microsoft, thanks for bringing this issue to our attention. I have sent an email to nospam@example.com as we need additional information to be able to track down the problem. If you have not received the email please contact us through the Bing webmaster center at nospam@example.com.

I mean, what additional information is needed wrt "respecting robots.txt" and "not letting loose more than one bot on a site at a time"?

Bing. Meh.

Probably just a bug. (5, Insightful)

tjstork (137384) | more than 4 years ago | (#30806910)

I know everyone likes to assume that Microsoft is being evil here, but wouldn't the more realistic assumption be that they were just being incompetent?

Re:Probably just a bug. (5, Insightful)

Lloyd_Bryant (73136) | more than 4 years ago | (#30806976)

I know everyone likes to assume that Microsoft is being evil here, but wouldn't the more realistic assumption be that they were just being incompetent?

Sufficiently advanced incompetence is indistinguishable from malice. For additional examples, see Government, US.

The simple fact is that ignoring robots.txt is effectively evil, regardless of the intent. It's not like robots.txt is some new innovation...
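Indeed, honouring robots.txt is a one-liner with Python's standard library; a minimal sketch (the rules shown are illustrative, not the CPAN Testers' actual file):

```python
from urllib.robotparser import RobotFileParser

# A minimal robots.txt in the spirit of the one described in the story;
# the actual rules on the CPAN Testers' server are not reproduced here.
rules = """\
User-agent: *
Crawl-delay: 5
Disallow: /private/
""".splitlines()

rp = RobotFileParser()
rp.parse(rules)

print(rp.can_fetch("msnbot", "/private/report.html"))  # False: disallowed
print(rp.can_fetch("msnbot", "/index.html"))           # True: allowed
```

A compliant crawler calls `can_fetch` before every request and sleeps `crawl_delay` seconds between them.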

Re:Probably just a bug. (3, Insightful)

gmuslera (3436) | more than 4 years ago | (#30807026)

They are not ignoring robots.txt; they probably just understand that file in their own slightly different, but in the end incompatible, format. As with every other file.

Re:Probably just a bug. (5, Informative)

Rogerborg (306625) | more than 4 years ago | (#30807122)

You're probably new here, but if you'd RTFA, you'd see that:

It seems their bots completely ignore the rules specified in the robots.txt, despite me setting it up as per their own guidelines on their site

Come to think of it though, isn't this what happens to most people who try to interoperate with Microsoft?

Amusingly, if I Google for "bing robots.txt" [google.co.uk] I get a link to a Bing page titled "Bing - Robots.txt Disallow vs No Follow - Neither Working!" which has already been elided from history by Microsoft [bing.com]. Classy.

Re:Probably just a bug. (4, Funny)

afidel (530433) | more than 4 years ago | (#30807338)

I wonder if it's a CR/CRLF bug =)

Re:Probably just a bug. (5, Insightful)

schon (31600) | more than 4 years ago | (#30807448)

It has nothing to do with the RTFA.

their own guidelines on their site

As anyone who has ever read MS documentation can tell you, you need to read it, then implement a test, so you can see what it really expects, then adjust your test, then try it until it works.

Their problem is that they expected MS documentation to actually describe the expected behaviour.

Re:Probably just a bug. (1)

drspliff (652992) | more than 4 years ago | (#30807156)

Well, the last I heard, the Bing spider was looking for `Robots.txt` rather than `robots.txt`, which would explain the file being "ignored" in this case.

Re:Probably just a bug. (1)

ztransform (929641) | more than 4 years ago | (#30807078)

The simple fact is that ignoring robots.txt is effectively evil, regardless of the intent. It's not like robots.txt is some new innovation...

Since when did Microsoft feel existing standards were something to honour? How many times have its browsers changed behaviour? Re-defined entrenched URL standards (you cannot specify a username/password in an Internet Explorer URL, though this is a legal standard form of URL)?

It stands to reason Microsoft would take no notice of anything your website has to say.

Unless.. of course.. Microsoft define a certificate type that can sign your Microsoft-specific format exception list after payment on an annual licensing basis..

Oh hey, another Microsoft example: Vista! After all, why would anyone upgrading their operating system expect it to work the same, if not better!

PS see http://support.microsoft.com/kb/834489 [microsoft.com]

Re:Probably just a bug. (1, Informative)

Anonymous Coward | more than 4 years ago | (#30807140)

Excuse my ignorance, but isn't robots.txt compliance easily enforceable on the server? I remember something about hiding links to trap pages in order to identify robots, and then holding identified robots responsible for robots.txt infractions by blocking their IP address.
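A minimal sketch of that trap-page idea (hypothetical Python; the path and names are illustrative): robots.txt disallows a decoy URL that no legitimate page links to, so any client that requests it anyway has ignored robots.txt and gets blocked.

```python
# The decoy path would be listed under "Disallow:" in robots.txt and
# linked only invisibly, so humans and compliant bots never hit it.
TRAP_PATH = "/do-not-crawl/"
blocked_ips = set()

def handle_request(ip, path):
    """Return True if the request should be served."""
    if ip in blocked_ips:
        return False         # previously caught ignoring robots.txt
    if path.startswith(TRAP_PATH):
        blocked_ips.add(ip)  # fell into the trap: block from now on
        return False
    return True
```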

Re:Probably just a bug. (5, Funny)

Suki I (1546431) | more than 4 years ago | (#30807164)

Try saving a copy as robots.docx and see if that works ;)

Evil? What "evil"? (0)

Anonymous Coward | more than 4 years ago | (#30807240)

So, by your definition of evil, if you fail a math exam, you're being evil?

If you trip down the stairs and crash into somebody, you're evil?

Do not attribute to malice what can very well be attributed to incompetence, or just bad luck.

Otherwise, your mistaking this quote is also evil according to your own definition. But that is logically impossible, since it falsifies the very premise, so I must conclude your premise is false, and probably well-intentioned, if only to get some modpoints. I wouldn't call that evil ;)

The US government is competent. (0, Troll)

tjstork (137384) | more than 4 years ago | (#30807290)

. For additional examples, see Government, US.

I'm a right winger and I like to see smaller, less intrusive government, but, I think it is wrong to say that the US government is competent.

The US Gov't has successfully operated as a going concern for 220+ years, with a proven and reliable management structure. Few, if any corporations, have been able to do that.

Re:The US government is competent. (1)

elvesrus (71218) | more than 4 years ago | (#30807486)

you might want to read that over again

Re:The US government is competent. (2)

jimicus (737525) | more than 4 years ago | (#30807508)

The US Gov't has successfully operated as a going concern for 220+ years, with a proven and reliable management structure. Few, if any corporations, have been able to do that.

Private corporations can go under with just a couple of bad years. Or even months, particularly if they're new businesses. Governments just have to raise taxes.

However, look at the private CEOs. (0)

Anonymous Coward | more than 4 years ago | (#30807572)

However, look at the private CEOs. When the company goes under, they get the golden parachute and off to another business.

Re:The US government is competent. (0)

Anonymous Coward | more than 4 years ago | (#30807530)

proven and reliable management structure

Huh?
Can I live under that rock with you? Seems like a blissful place.

Re:Probably just a bug. (0, Troll)

kjart (941720) | more than 4 years ago | (#30807586)

The simple fact is that ignoring robots.txt is effectively evil, regardless of the intent.

So evil, in fact, that you just know that nobody [google.com] else [google.com] would ever do something like this. Oh wait...

Re:Probably just a bug. (5, Insightful)

fish waffle (179067) | more than 4 years ago | (#30806982)

I know everyone likes to assume that Microsoft is being evil here, but wouldn't the more realistic assumption be that they were just being incompetent?

Probably. But since incompetence is the plausible deniability of evil it's sometimes hard to tell.

Re:Probably just a bug. (1)

paiute (550198) | more than 4 years ago | (#30807158)

I know everyone likes to assume that Microsoft is being evil here, but wouldn't the more realistic assumption be that they were just being incompetent?

Probably. But since incompetence is the plausible deniability of evil it's sometimes hard to tell.

"incompetence is the plausible deniability of evil"

  fish waffle, that is great sig material.

Re:Probably just a bug. (1)

mspohr (589790) | more than 4 years ago | (#30806984)

Occam's razor (or Ockham's razor[1]), entia non sunt multiplicanda praeter necessitatem, is the principle that "entities must not be multiplied beyond necessity" and the conclusion thereof, that the simplest explanation or strategy tends to be the best one.

Rough translation: "Never ascribe to malice that which can be adequately explained by stupidity."

Re:Probably just a bug. (5, Insightful)

MrMr (219533) | more than 4 years ago | (#30807058)

The problem is, there is no evidence that "Never ascribe to stupidity that which can be adequately explained by malice" invokes any more entities.
In fact, claiming that the commercially most successful software company got there through stupidity rather than malice sounds extremely implausible to me.

Re:Probably just a bug. (1)

maxwell demon (590494) | more than 4 years ago | (#30807166)

In fact, claiming that the commercially most successfull software company got there through stupidity rather than malice sounds extremely implausible to me.

So if certain Microsoft products are or were insecure and/or unstable, it wasn't incompetence, but malice? You think Microsoft was happy every time a user got the dreaded Blue Screen Of Death?

Re:Probably just a bug. (1)

horatio (127595) | more than 4 years ago | (#30807392)

You think Microsoft was happy every time a user got the dreaded Blue Screen Of Death?

Yes, in a way. I never really thought about it until you asked, but it fits with their business model of forcing users into an expensive upgrade of their OS every few years. Look what has happened with XP. It doesn't blue screen [as] much, and they've met heavy resistance from folks not wanting to upgrade to Vista. (Never mind that Vista is crap.) So now they've re-packaged Vista as "Windows 7" and hope folks don't realize it looks the same and smells the same, because it basically is.

Re:Probably just a bug. (1)

MrMr (219533) | more than 4 years ago | (#30807466)

Sort of:
I'm saying that the assumption that these flaws persist through incompetence is not a less complex explanation.
The fact that issues were not solved in one of their later releases may very well be a deliberate commercial decision, which would indeed make it malicious rather than incompetent from the end-user perspective.

Re:Probably just a bug. (4, Funny)

Opportunist (166417) | more than 4 years ago | (#30807186)

Like my grandpa said, it doesn't matter how dumb you are. As long as you find someone even dumber to sell to.

Re:Probably just a bug. (1)

Lundse (1036754) | more than 4 years ago | (#30807084)

That's a pretty rough translation!

You might be able to argue that the latter saying is a corollary of the former, but in no way do they mean the same thing.

Occam says the simplest explanation is best - the better explanation is the one with least assumptions.

In this case, Occam affords us no help - we already know MS is both "evil" and incompetent. So the two explanations are equal in this regard. The "corollary" suggests, then, something else; namely that stupidity is a better explanation than "evil" in all/most cases (presumably because stupidity is more widespread).

Re:Probably just a bug. (1)

CrazyDuke (529195) | more than 4 years ago | (#30807468)

Something that bugs me about that statement: Out of curiosity, since when does a lack of evidence amount to an adequate explanation?

And also, how does malicious incompetence fall under that false dichotomy? Or, for that matter, what of reckless incompetence and plausible deniability?

Oh, and for the record: Experience tells me such an outcome is often the result of a PHB or two and a few "I don't give a fuck anymore." engineers. It's fun to dismiss PHBs as merely incompetent. But, what they are competent in is convincing people their actions warrant promotion, regardless of the actual results of their actions.

Re:Probably just a bug. (2, Insightful)

alexhs (877055) | more than 4 years ago | (#30807006)

these bots 'completely ignore the rules specified in robots.txt.'

Microsoft ignoring standards is not incompetence, it's policy (NIH syndrome).

Re:Probably just a bug. (4, Insightful)

djupedal (584558) | more than 4 years ago | (#30807012)

> "I know everyone likes to assume that Microsoft is being evil here, but wouldn't the more realistic assumption be that they were just being incompetent?"

We assume MS is evil...

We know they are incompetent.

We feel this is typical.

We pray they'd just go away.

We think this will never end...

Re:Probably just a bug. (4, Interesting)

Yvanhoe (564877) | more than 4 years ago | (#30807034)

There is such a thing as criminal incompetence. If a script kiddie can be arrested for having a virus "out of control", I don't see why Microsoft engineers DDoSing a website couldn't be charged.

By the way, a philosopher once said that "evil" did not exist; that it was most of the time just a kind of hidden stupidity.

Re:Probably just a bug. (0)

Anonymous Coward | more than 4 years ago | (#30807272)

By the way a philosopher once told that "evil" did not exist. That it was most of the time just a kind of hidden stupidity.

Uh huh; so child-raping priests are not evil, just stupid. Sounds like a perverse definition of stupid to me.

Re:Probably just a bug. (3, Interesting)

hairyfeet (841228) | more than 4 years ago | (#30807428)

But MSFT is a corporation, which, thanks to our corporate-butt-kissing Congress and courts, can just go "ooopsie", maybe cut a small check at most, and walk away scot-free.

And as for your philosopher? I saw an interview with Joss Whedon on writing evil characters that I thought really hit the nail on the head. He said, and I paraphrase "The villain never sees himself or herself as evil. To them there is a perfectly justifiable reason for their actions. I have known some truly evil people, those that have intentionally hurt their fellow man out of pure malice, and to them their actions were justified and noble. They simply didn't see what they did as wrong."

Which is how you get MSFT and Intel making backroom deals to crush competition, or Jack Tramiel and his "business is war" philosophy. To the ones making the decisions, "the other guy would do it to us if they could, so why shouldn't we do it to them?". I'm sure that if you talked to Gates or the head of Intel you could never get them to believe that crushing your competition any way you can is wrong. To them that was/is business 101 and not evil. That is why I think Whedon was right: the villain always thinks they are noble.

Re:Probably just a bug. (1)

init-five (745157) | more than 4 years ago | (#30807182)

I know everyone likes to assume that Microsoft is being evil here, but wouldn't the more realistic assumption be that they were just being incompetent?

how about both?

Incompetent? (1)

omb (759389) | more than 4 years ago | (#30807188)

Yes, Evil more so

Or both (1)

cheros (223479) | more than 4 years ago | (#30807320)

AFAIK, the one doesn't exclude the other.

However, assuming evil is more fun :-)

Re:Probably just a bug. (1)

Xest (935314) | more than 4 years ago | (#30807324)

Yes, and I like the solution too- rather than contact Microsoft to find out what the fuck is going on, post it to Slashdot and get Slashdotted as well.

Pure genius.

This can be rectified.... (-1, Troll)

Anonymous Coward | more than 4 years ago | (#30806942)

MS can apologize and give CPAN a check for one million dollars for their troubles.

Fixing Bing's poor indexing (1, Interesting)

AHuxley (892839) | more than 4 years ago | (#30806952)

It's not a bug, it's a feature: indexing a site with a new, rapid, powerful, direct, personalised crawler :)
http://arstechnica.com/microsoft/news/2010/01/microsoft-outlines-plan-to-improve-bings-slow-indexing.ars [arstechnica.com]

Re:Fixing Bing's poor indexing (0)

Anonymous Coward | more than 4 years ago | (#30807270)

http://arstechnica.com/microsoft/news/2010/01/microsoft-outlines-plan-to-improve-bings-slow-indexing.ars

Is the extension on the file referenced by that URL some indication as to the author's view of Microsoft's plans?

This is a normal occurence for Bing (5, Informative)

Anonymous Coward | more than 4 years ago | (#30807010)

I had a registration page - static content, basically. The only dynamic thing about it was that it was referred to by many pages on the site with a variable in the querystring. Bing decided that it needed to check this one page *thousands* of times per day.

They ignored robots.txt.
I sent a note to an address on the Bing site that requested feedback from people having issues with the Bing bots - nothing.

The only thing they finally 'listened' to was placing a robots "noindex" meta tag in the page header.

This kind of sucked because it took the registration page out of the search engines' index, however it was much better than being DDOS'd. Plus, the page is easy to find on the site so not *that* big a deal.

Bing has been open for months now and if you search around there are tons of stories just like this. Maybe now that a site with some visibility has been 'attacked', the engineers will take a look at wtf is wrong.

Re:This is a normal occurence for Bing (1)

The Cisco Kid (31490) | more than 4 years ago | (#30807330)

Seems like a better solution would have been to set up a test for either the User-Agent or the IP blocks that Bing was attacking your site from, and drop those requests in /dev/null - your site would still exist on 'real' search engines, and Bing wouldn't pound on your bandwidth anymore.

Re:This is a normal occurence for Bing (1)

The Cisco Kid (31490) | more than 4 years ago | (#30807344)

Replying to myself: if testing the UA or the IP in the httpd itself was too much load, you could have also just nullrouted the IP blocks the Bing spider was coming from, either in the kernel table, or in your router.
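Those two suggestions could be sketched like this (hypothetical Python; the IP ranges are the ones from the summary, the function name is made up):

```python
import ipaddress

# Drop anything whose User-Agent says msnbot, or whose source address
# falls in the ranges the story attributes to the Microsoft bots.
BLOCKED_NETS = [ipaddress.ip_network(n) for n in
                ("65.55.207.0/24", "65.55.107.0/24", "65.55.106.0/24")]

def should_drop(ip, user_agent):
    """True if this request should go to /dev/null."""
    if "msnbot" in user_agent.lower():
        return True
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in BLOCKED_NETS)
```

In practice the same check lives more cheaply in the httpd config or as a null route, as the comment says; this just shows the logic.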

Re:This is a normal occurence for Bing (1)

dkf (304284) | more than 4 years ago | (#30807520)

Replying to myself: if testing the UA or the IP in the httpd itself was too much load, you could have also just nullrouted the IP blocks the Bing spider was coming from, either in the kernel table, or in your router.

I know of one site where this has been done for years (both with Bing and its predecessors). Sure it ruins the site's searchability for anyone using Bing, but like we care; that's better than having the site itself unreachable due to load and Google doesn't cause the same level of problems.

Re:This is a normal occurence for Bing (0)

Anonymous Coward | more than 4 years ago | (#30807432)

Or you could remove the dynamic variable from a static page so the bot knows it's always the same page?

Flooding... (4, Informative)

Bert64 (520050) | more than 4 years ago | (#30807030)

I have noticed the microsoft crawlers (msnbot) being fairly inefficient on many of my sites...
In contrast to googlebot and spiders from other search engines, msnbot is far more aggressive: it ignores robots.txt and will frequently re-request the same files repeatedly, even if those files haven't changed. Looking at my monthly stats (awstats), which group traffic from bots, msnbot will frequently have consumed 10 times more bandwidth than googlebot, yet is responsible for far less incoming traffic based on referrer headers (typically 1-2% of the traffic generated by Google on my sites).

Other small search engines don't bring much traffic either, but their bots don't hammer my site as hard as msnbot does.
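A toy version of that awstats-style comparison (hypothetical Python; the two log lines are fabricated examples, not real traffic): sum the response bytes per crawler from Apache combined-format access-log lines.

```python
import re
from collections import defaultdict

# Combined log format: host ident user [date] "request" status bytes
# "referer" "user-agent". Capture the byte count and the user agent.
LOG_RE = re.compile(
    r'\S+ \S+ \S+ \[[^\]]+\] "[^"]*" \d+ (\d+) "[^"]*" "([^"]*)"')

def bytes_per_bot(lines, bots=("msnbot", "Googlebot")):
    totals = defaultdict(int)
    for line in lines:
        m = LOG_RE.match(line)
        if not m:
            continue
        size, ua = int(m.group(1)), m.group(2)
        for bot in bots:
            if bot.lower() in ua.lower():
                totals[bot] += size
    return dict(totals)

log = [
    '65.55.207.1 - - [18/Jan/2010:10:00:00 +0000] "GET /a HTTP/1.1" 200 5000 "-" "msnbot/2.0b"',
    '66.249.66.1 - - [18/Jan/2010:10:00:01 +0000] "GET /a HTTP/1.1" 200 500 "-" "Mozilla/5.0 (compatible; Googlebot/2.1)"',
]
print(bytes_per_bot(log))  # -> {'msnbot': 5000, 'Googlebot': 500}
```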

Re:Flooding... (0)

Anonymous Coward | more than 4 years ago | (#30807452)

Block their crawlers. They will behave after that.

Re:Flooding... (1)

Manfre (631065) | more than 4 years ago | (#30807590)

Did you provide google with a sitemap file? If so, that explains why google does not need to check your site for changes as often.

Are you sure? (4, Insightful)

Errol backfiring (1280012) | more than 4 years ago | (#30807070)

Are we sure this traffic comes from Microsoft? Could it not consist of forged network packets? You don't need a reply if you are running a DDOS. On the other hand, why would anyone, including Microsoft, want to bring down CPAN?

Re:Are you sure? (3, Funny)

Anonymous Coward | more than 4 years ago | (#30807154)

Because they are coming out with P# and don't want the competition?

Re:Are you sure? (2, Informative)

Anonymous Coward | more than 4 years ago | (#30807198)

You only see an IP in an Apache log after a successful TCP handshake. This is hard (not impossible, but really, really hard) to do with a forged IP.

Re:Are you sure? (5, Informative)

TheRaven64 (641858) | more than 4 years ago | (#30807246)

Are we sure this traffic comes from Microsoft? Could it not consist of forged network packets?

It's a TCP connection, so they need to have completed the three-way handshake for it to work. That means they must either have received the SYN-ACK packet or be SYN flooding. If they are SYN flooding, that would show up in the firewall logs. If they've received the SYN-ACK packet, then they are either at that IP, or they are on a router between you and that IP and can intercept and block the packets from that IP.

You don't need a reply if you are running a DDOS.

You do if it's via TCP. If they're just ping flooding, that's one thing, but they're issuing HTTP requests. That means establishing a TCP connection (send SYN, receive SYN-ACK with a random number, reply ACK with that number) and then sending TCP window replies for each group of TCP packets received.

On the other hand, why would anyone, including Microsoft, want to bring down CPAN?

Who says that they want to? It's more likely that their web crawler has been written to the same standard as the rest of their code.
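The handshake requirement is easy to demonstrate: a listening socket's accept() only returns once the kernel has finished the three-way handshake, so the address it reports belongs to a host that genuinely answered. A small Python sketch over loopback (variable names are illustrative):

```python
import socket
import threading

# Demo: a TCP server only learns a client's address after the kernel has
# completed the three-way handshake (SYN, SYN-ACK, ACK).  The address that
# ends up in an HTTP access log therefore belongs to a host that really
# received the server's SYN-ACK -- unlike a single spoofed UDP packet.

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))      # any free port on loopback
server.listen(1)
port = server.getsockname()[1]

logged = []

def serve():
    conn, addr = server.accept()   # returns only once the handshake is done
    logged.append(addr[0])         # this is what httpd would write to its log
    conn.close()

t = threading.Thread(target=serve)
t.start()

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(("127.0.0.1", port))  # blocks until the SYN-ACK arrives
t.join()
client.close()
server.close()

print("logged client address:", logged[0])
```

Spoofing the logged address would require the attacker to see (or guess) the random sequence number in the SYN-ACK, which is why the firewall-log check above is the right one.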

Re:Are you sure? (0)

Anonymous Coward | more than 4 years ago | (#30807400)

Yes, I was getting this too, on a couple of Perl-driven sites. The Bing msnbots appeared to be ignoring the crawl delay. It turned out they weren't, strictly speaking, but they had several crawlers working on the site at once, which effectively ignores the crawl delay. They still do, so I gave them a 300-second crawl delay and traffic has dropped to a reasonable level.

They were also ignoring the Disallow: directives until I notified the "Live Search WMC community" and got somebody working on the problem to look at it. Apparently Bing needs a little handholding or he gets ADHD.
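For anyone wanting to try the same thing, the relevant robots.txt stanza looks roughly like this (the path is a made-up example, and note that Crawl-delay is a non-standard extension that msnbot honours but not every crawler does):

```
User-agent: msnbot
Crawl-delay: 300
Disallow: /cgi-bin/
```

With a 300-second delay, a single well-behaved crawler instance fetches at most one page every five minutes.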

So block those IP ranges? (1)

Evro (18923) | more than 4 years ago | (#30807094)

If they've identified the IP ranges, why not just block them? You can do it at the router or TCP level (drop packets), or just throw up a 403 Forbidden.
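As a sketch, the 403 approach in Apache could look like this (Apache 2.2-style mod_authz_host syntax; the document root is a made-up example, the ranges are the ones reported in the article):

```apache
# Return 403 Forbidden to the offending /24s.
<Directory "/var/www/html">
    Order Allow,Deny
    Allow from all
    Deny from 65.55.207.0/24 65.55.107.0/24 65.55.106.0/24
</Directory>
```

A 403 still costs you a handshake and a log line per request, which is why dropping the packets at the router is cheaper if the volume is high.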

Re:So block those IP ranges? (0)

Anonymous Coward | more than 4 years ago | (#30807130)

RTFA, they did.

The bots were identified by their IP addresses, including 65.55.207.x, 65.55.107.x and 65.55.106.x, as coming from Microsoft. The administrators of CPAN Testers have now blocked access to their site from these addresses.

Re:So block those IP ranges? (3, Informative)

John Hasler (414242) | more than 4 years ago | (#30807152)

> ...why not just block them?

They have.

Re:So block those IP ranges? (0)

Anonymous Coward | more than 4 years ago | (#30807160)

Hush now!

If it was not an IP address block from Micro$haft, they would have done exactly that. This is meant to cause pure unadulterated (heck, even adulterated) embarrassment to our sworn mortal enemy. Nothing more, nothing less.

Re:So block those IP ranges? (5, Insightful)

Sarten-X (1102295) | more than 4 years ago | (#30807184)

For ignoring robots.txt, they deserve nothing more and nothing less.

So... (0)

Anonymous Coward | more than 4 years ago | (#30807180)

Block the IP addresses and send Microsoft email?
What am I missing here?

Too easy for Microsoft (1)

BhaKi (1316335) | more than 4 years ago | (#30807194)

I suppose Microsoft can offer a simple explanation: "Our servers and other internal infrastructure are so vulnerable that they have been hacked and are being used as remote-controlled botnets."

Robots.txt (1)

anomnomnomymous (1321267) | more than 4 years ago | (#30807282)

Can anyone here clarify what robots.txt is actually for, as in:

Is it an 'agreement' not to scan the site at all (by a search engine bot), or is it meant to just not -display- those results in the search engine?
I'd assume, since everything on a site is more or less public, that it would be the second. And if so, I can't see anything wrong with what Microsoft's bots did.

I can see how scanning a site's content (even if you're not going to list the results in your search engine) can have some value to a company.

Re:Robots.txt (2, Informative)

Ogi_UnixNut (916982) | more than 4 years ago | (#30807382)

It's the first. Whatever you specify in robots.txt as disallowed means the bot shouldn't spider those pages, so no scanning of them at all.

You use it when you only want part of your site to appear in search results, such as just the front page (for example). The rest of the site should not be touched by the bot at all.
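A sketch of that "front page only" setup: the original robots.txt protocol can only Disallow by path prefix, so you list the sections to exclude rather than the one page to keep (the directory names here are hypothetical):

```
User-agent: *
Disallow: /docs/
Disallow: /forum/
Disallow: /cgi-bin/
```

Some engines also support a non-standard Allow: directive that lets you invert this, but you can't rely on every crawler understanding it.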

Re:Robots.txt (2, Informative)

afidel (530433) | more than 4 years ago | (#30807540)

It's basically a rough pattern filter telling the bot which parts of the site not to crawl. One reason it's used is that dynamically generated pages can create an infinite loop of links that's impossible for the bot to detect.

pl0s 2, Troll) (-1, Troll)

Anonymous Coward | more than 4 years ago | (#30807364)

Than it5 Windows [goat.cx]

Re:pl0s 2, Troll) (0, Troll)

ArsenneLupin (766289) | more than 4 years ago | (#30807566)

Yes, that's the address that they should have redirected the Micro$hit spiders to.

O, it's just a pumpkin :-(

Here's the real address goatse.fr [goatse.fr] . Doesn't Mr Sarkozy have a lovely face?

What the hell has become of the word "problem"? (1)

John Hasler (414242) | more than 4 years ago | (#30807376)

> ...issues accessing their sites...

"Issues"? What's wrong with "problem"? "Issues" is marketing-speak. Microsoft marketing-speak.

And yes, get off my lawn.

Re:What the hell has become of the word "problem"? (1)

Spad (470073) | more than 4 years ago | (#30807420)

Blame ITIL; you can't call it a problem until you've had multiple incidents, or something.

Typical of Bots (0)

jmaslak (39422) | more than 4 years ago | (#30807380)

Sure, a bot should not ignore robots.txt, and if this one really does, there's a problem - but I'd like MS's side of the story before assuming that it does - who knows, maybe the robots.txt is malformed.

I'd also like to know what user agent string the crawler is using.

But all that said, this is not exactly newsworthy. I've run large, dynamic internet sites for years. I've had problems with many, many different kinds of crawlers, from many companies (including companies like Google). There's a ton of bots out there that do ignore robots.txt (there were a few hundred bots that scanned the site I used to run, back in 2001, that ignored robots.txt). So it's something a programmer really needs to be ready to deal with.

Yes, these bots are rude, abusive, and inconsiderate of the site owners (go figure - most of the companies running the small bots are pretty much unethical anyhow - anything for a buck). But this is the internet, just like spam and a bunch of other things we all get annoyed with. You have to deal with it.

I suggest tools like mod_bwshare to even out this type of behavior, and traffic shaping at the network layer for known abusers you don't want to block outright. Those are the tactics I use.
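As a sketch of the network-layer option, egress shaping of responses to one known-abusive range with Linux tc might look like this (assumes the htb qdisc and root; the interface name and rates are made-up examples, not a recommendation):

```shell
# Shape responses to one abusive /24 down to 64 kbit/s, leaving
# everything else at the full rate (device and rates are examples).
tc qdisc add dev eth0 root handle 1: htb default 10
tc class add dev eth0 parent 1: classid 1:10 htb rate 100mbit
tc class add dev eth0 parent 1: classid 1:20 htb rate 64kbit
tc filter add dev eth0 parent 1: protocol ip u32 \
    match ip dst 65.55.207.0/24 flowid 1:20
```

Shaping your outbound replies slows the crawler's TCP connections down without cutting it off entirely, which keeps your site in the index while protecting the server.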

Send the lost bots home. (5, Funny)

N1ckR (1289800) | more than 4 years ago | (#30807476)

I redirect lost bots home; it seems the polite thing to do. 301 www.microsoft.com
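In Apache, a hypothetical version of that redirect with mod_rewrite might be (matching on the msnbot user agent string):

```apache
# Send msnbot's requests back home with a permanent redirect.
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} msnbot [NC]
RewriteRule ^ http://www.microsoft.com/ [R=301,L]
```

A well-behaved crawler will follow the 301 and stop re-requesting the old URLs; whether msnbot does is another question.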