New Web Application Attack - Insecure Indexing

timothy posted more than 9 years ago | from the trawling-for-patterns dept.

Security

An anonymous reader writes "Take a look at 'The Insecure Indexing Vulnerability - Attacks Against Local Search Engines' by Amit Klein. This is a new article about 'insecure indexing.' It's a good read -- it shows you how to find 'invisible files' on a web server and, moreover, how to see the contents of files you'd usually get a 401/403 response for, using a locally installed search engine that indexes files (not URLs)."

120 comments

but its fixed in firefox now (2, Funny)

Prophetic_Truth (822032) | more than 9 years ago | (#11808176)

right?

Re:but its fixed in firefox now (2, Insightful)

jacquesm (154384) | more than 9 years ago | (#11808260)

Sure, and Konqueror never had it :)


That's all nice and good; personally, I think files that were never meant to be indexed make for the best reading by far!


Re:but its fixed in firefox now (0)

Anonymous Coward | more than 9 years ago | (#11808571)

nice work on the 3 consecutive first posts

Speaking of firefox (4, Interesting)

ad0gg (594412) | more than 9 years ago | (#11808572)

Another exploit [www.mikx.de] came out this weekend. The funny thing is that Microsoft AntiSpyware Beta 1 detects the execution of the payload file and shows a prompt asking if you want to continue or stop the execution.

Re:Speaking of firefox (1)

irix (22687) | more than 9 years ago | (#11808886)

Another exploit came out this weekend.

I don't think it is so new - it was fixed in 1.0.1. From the description [www.mikx.de]:

Status: The exploit is based on multiple vulnerabilities:
bugzilla.mozilla.org #280664 (fireflashing)
bugzilla.mozilla.org #280056 (firetabbing)
bugzilla.mozilla.org #281807 (firescrolling)
Upgrade to Firefox 1.0.1 or disable javascript.

It's just more news you won't see on /. (0)

Anonymous Coward | more than 9 years ago | (#11809435)

Firefox has another exploit, and Microsoft's new beta software catches it. You won't see that on Slashdot's front page.

Posting anon because this is both off-topic and against the majority mindset.

should have been from.... (5, Funny)

Anonymous Coward | more than 9 years ago | (#11808178)

the department-of-the-bleedingly-obvious...

this is'nt new (0)

rkv (852317) | more than 9 years ago | (#11808181)

was'nt there already one?

Re:this is'nt new (1, Offtopic)

Refrozen (833543) | more than 9 years ago | (#11808206)

Yeah, you're apostraphy goes one character to the right.

Re:this is'nt new (0, Offtopic)

Refrozen (833543) | more than 9 years ago | (#11808229)

And that "you're" was supposed to be your.... :P

Re:this is'nt new (0, Offtopic)

ikkonoishi (674762) | more than 9 years ago | (#11808244)

You really should have put quotation marks around the "your".

Re:this is'nt new (0, Offtopic)

jacksonj04 (800021) | more than 9 years ago | (#11808293)

I think you should have... what... but...

Dammit! A perfect Grammar Nazi!

Re:this is'nt new (1)

ikkonoishi (674762) | more than 9 years ago | (#11808490)

Actually I think standard english says that the period should go inside the quotation marks, but my programming trained mind refuses to let me do so.

Re:this is'nt new (0)

Anonymous Coward | more than 9 years ago | (#11808533)

That's "Standard Written English."

Re:this is'nt new (0)

Anonymous Coward | more than 9 years ago | (#11809080)

No, it's "American Written English". The English English use the more logical approach of having the period only appear inside the quotes if it's part of the quotation. Which makes writing technical documentation much easier...

Re:this is'nt new (0)

Anonymous Coward | more than 9 years ago | (#11808622)

Depends.

In US English the period indeed goes inside the quotes, or so they say. However, in British English it depends what the quotes are doing.

If it's a fragment, the full stop goes outside:- it usually looks "something like this".

If it is a quote which already had a full stop or equivalent, "You'd have the full stop inside the quotes."

Or, you know. Something like that.

Re:this is'nt new (1)

pla (258480) | more than 9 years ago | (#11808988)

Actually I think standard english says that the period should go inside the quotation marks, but my programming trained mind refuses to let me do so.

Same here, that rule drives me absolutely batty... In my opinion, if I put the period in the quotes, I effectively tell the parser (aka "person reading my text" in the case of normal English communication) that I attribute the period to the source of the quote, while simultaneously leaving my own sentence un-terminated.

However, I realized that you just need to apply the rules a little bit more literally to find a simple exploit that lets you do whichever you want. The rule says that, if you end your sentence with a quote, you put the period inside the quotes. But, if you put a period outside the close-quote, then technically the quote hasn't ended your sentence - the period following it has.

Re:this is'nt new (1)

rkv (852317) | more than 9 years ago | (#11809130)

k so my english pretty much suckx and i know that but seriously i thought this new was old

Re:this is'nt new (0)

Anonymous Coward | more than 9 years ago | (#11808312)

Yeah, you're apostraphy goes one character to the right.
Y'ore seppeling of apostrophe is a catastraphy.

Re:this is'nt new (0)

Anonymous Coward | more than 9 years ago | (#11808371)

and your apostraphy should be apostrophy.

Re:this is'nt new (0)

Anonymous Coward | more than 9 years ago | (#11808390)

Yeah, you're apostraphy goes one character to the right.

...and "your" doesn't have an apostrophe.

Re:this is'nt new (1)

DarkMantle (784415) | more than 9 years ago | (#11809184)

Grammar Nazis aside...

Um, it IS new, because /. didn't post anything about it before. Even though I've been using Google cache to see files that I usually get a 403 on for a few months now.

Besides, in a few hours it will be new all over again when they post the dupe.

You can see evidence of that

Mozilla Firefox fucking sucks (-1, Troll)

Anonymous Coward | more than 9 years ago | (#11808189)

I want a translation in my language but days after it's been released, it still isn't up. These cunts really need to get their asses in gear with translations if they want to be taken seriously.

Mozilla Translations (-1, Flamebait)

XanC (644172) | more than 9 years ago | (#11808215)

You're looking for the translation for jackasses, Mozilla Fire-Fucking-Fox.

Re:Mozilla Firefox fucking sucks (2, Insightful)

Anonymous Coward | more than 9 years ago | (#11808525)

Oh, we are terribly sorry for taking so long!
Don't worry, we will give you a full refund.

Re:Mozilla Firefox fucking sucks (0)

Anonymous Coward | more than 9 years ago | (#11808674)

I want a translation in my language


Yuhn. we wannt the langwich opshun fer "idiot" ...

and don't forget... (4, Interesting)

DrKyle (818035) | more than 9 years ago | (#11808209)

to see if you can get the site's robots.txt as the files/directories in that file are sometimes full of goodies.
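
For the curious, a minimal sketch of that check in Python (example.com is a placeholder, not a real target):

import urllib.request

# Fetch robots.txt and print the paths the owner asked crawlers to skip;
# the Disallow entries are often a map of exactly those goodies.
with urllib.request.urlopen("http://example.com/robots.txt") as resp:
    for line in resp.read().decode("utf-8", errors="replace").splitlines():
        if line.strip().lower().startswith("disallow:"):
            print(line.split(":", 1)[1].strip())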

Re:and don't forget... (1)

MrEcho.net (632313) | more than 9 years ago | (#11808285)

Not when the file has something like this:
User-agent: *
Disallow: /

Re:and don't forget... (1)

MostlyHarmless (75501) | more than 9 years ago | (#11809348)

Of course, that's assuming that you don't want your site indexed by any search engine (in which case, why is it exposed to the outside Internet to begin with?)

Incidentally, it also breaks properly-designed retrieval mechanisms (like, say, RSS readers -- yes, dailykos.com, I'm talking about you!)

Re:and don't forget... (0)

Anonymous Coward | more than 9 years ago | (#11809238)

Sometimes "wget http://site.example.com/robots.txt" works to get the robots.txt file, but some smart webmasters are aware of this security hole and hide their robots.txt under a different name. B*stards!

Re:and don't forget... (1)

myov (177946) | more than 9 years ago | (#11810015)

For this reason, I tended not to create a robots.txt file. At minimum, sensitive sites wouldn't go in it.

If anything, I'd block googlebot/others in .htaccess files, assuming it wasn't a passworded site to begin with.

indexing google (5, Interesting)

page275 (862917) | more than 9 years ago | (#11808223)

Even though this is about internal indexing, it reminded me of old-fashioned Google indexing: search Google with some sensitive terms such as 'index of /' *.pdf *.ps

Re:indexing google (2, Informative)

Neil Blender (555885) | more than 9 years ago | (#11808301)

Even though this is about internal indexing, it reminded me of old-fashioned Google indexing: search Google with some sensitive terms such as 'index of /' *.pdf *.ps

This is an excellent trick for searching for porn (i.e. "index of /" lesbian).

Google Hacks Database (5, Informative)

giant_toaster (850764) | more than 9 years ago | (#11808532)

I guess a lot of people have seen this site before, but http://johnny.ihackstuff.com/index.php?module=prodreviews [ihackstuff.com] has a lot of these Google exploits etc.; he is posting them up so people can check if their sites are secure. There are some interesting presentations by him on the main site about how search engines can be exploited.

Re:Google Hacks Database (1)

veg_all (22581) | more than 9 years ago | (#11808914)

he is posting them up so people can check if their sites are secure

Uh-huh. I imagine most of his readers are using them to make sure everyone else's site is secure : )

permissions permissions permissions (4, Insightful)

Capt'n Hector (650760) | more than 9 years ago | (#11808226)

Never give web-executable scripts more permissions than absolutely required. If the search engine has permission to read sensitive documents, and web users have access to this engine... well duh. It's just common sense.

Re:permissions permissions permissions (5, Insightful)

WiFiBro (784621) | more than 9 years ago | (#11808316)

The first paragraphs of this document describe how to get at files which are not public. So you also need to move the sensitive files out of the public directory, which is easy but hardly ever done. (You can easily make a script to serve the files in non-public directories to those entitled to them.)

Re:permissions permissions permissions (1)

a55mnky (602203) | more than 9 years ago | (#11808840)

Expecting common sense is rather presumptuous of you - don't you think?

Re:permissions permissions permissions (1, Insightful)

Anonymous Coward | more than 9 years ago | (#11808994)

Give me a freaking break. This is the same guy who found the "HTTP RESPONSE SPLITTING" vulnerability. Last year's catchphrase among the wankers at Ernest and Young and Accidenture. The same type of people who consider an HTTP TRACE XSS a vulnerability. I guess it's been a slow freaking year for security research.

Amit Klein at least used to work for Watchfire, formerly known as Scrotum (Sanctum), the same company that tried to patent the application security assessment process. I guess it's been a really slow year for vulnerability research. They need new terminology to scare the executives at Fortune 500 corporations, and sell their useless products.

People tend to forget that to compromise data, it's easier to steal the tape from the back of a plane than it is to hack up some stupid search engine.

Re:permissions permissions permissions (-1)

twitter (104583) | more than 9 years ago | (#11809275)

Permissions, who would think of that? Oh yeah, the researchers:

Recommendations for web site developers and owners.

A less intrusive solution may be to use access control in order to restrict the indexing to allowed material. Let us assume that the web server runs with a higher privilege than the search engine. Now, the visible files need to be assigned low privilege, so they are readable by both the web server user and the search engine user. The invisible (or inaccessible) files are assigned higher privileges, so they are readable only by the web server. Thus, those files can be accessed remotely by those that know about them, and possibly possess the required credentials (for the inaccessible files), yet they cannot be indexed. ...

Simple enough on a *nix system. The problem is that you want to index the files for your own employees' use! You might not even keep your confidential files on the same machine as your public web server until you want to share them, but the indexing software can send information back to its owners. The solution is to make your own indexing software or use something like mnogosearch [mnogo.ru], which is free and has a Debian package, for non-publicly exposed files you want to index.

Using Winblows desktops makes all of this a red herring. The best server setup is useless when your PR staff uses keylogger-ridden desktops. The weakest link in the chain is where your confidential data will get out.
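
A minimal sketch of the quoted recommendation in Python, assuming a Unix host where the indexer runs as a lower-privileged user than httpd (the docroot path is illustrative):

import os

DOCROOT = "/var/www"  # illustrative docroot
index = {}

# Walk the docroot as the low-privilege search user. Files assigned the
# higher, web-server-only privilege raise PermissionError here, so they
# can still be served to those who know about them but never get indexed.
for dirpath, dirnames, filenames in os.walk(DOCROOT):
    for name in filenames:
        path = os.path.join(dirpath, name)
        try:
            with open(path, errors="replace") as f:
                index[path] = f.read()
        except PermissionError:
            continue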

Re:permissions permissions permissions (0)

Anonymous Coward | more than 9 years ago | (#11810153)

Moderators: Please note that "twitter" is a known fanatical sycophant whose obnoxious offtopic rants are legend here on Slashdot. It doesn't matter what the topic is, he'll find a way to scrape in some pointless Microsoft bashing. While nobody expects us to love Microsoft in any way, his particularly tepid style of calling anyone he replies to "troll" or "liar" or "fanboy" because he happens to disagree with whatever they're saying is well documented and should not be rewarded. If anything, twitter is the type of person that should not be part of the open source/free software community. He is an anathema to all that is good about free software.

I'm posting this so that you (the moderator) have some context to consider twitter [hyperdictionary.com] and not mod him up whenever he posts his filler preformatted rants about installing Knoppix or Mepis or whatever that unfortunately get him karma every single time and allow him to continue posting his trademark toxic crap (read on) day in and day out. You may consider this a troll - I consider it community service. And I ain't kidding.

If you're a /. subscriber, I invite you to look through some of his posting history [slashdot.org]. I guarantee that you'll be hard pressed to find someone that is more "out there" than twitter. You'll also probably notice he's got quite an AC following. Don't just read his posts, make sure you go through the replies.

To get an idea of what I'm talking about, check this [slashdot.org] post out. This is an article about email disclaimers. The parent of the post is complaining about the ads in the linked page and so on, and twitter actually goes off on a rant to blame it on Microsoft and recommend Lynx, because "is teh free".

Here's another. In this post [slashdot.org] twitter not only calls the OP a troll but attempts to "tell it like it is" while making some vague argument about "GNU". Yes, if you're confused, you're not alone. The reply (modded +4) proceeds to simply destroy his bogus argument. You will notice he did not reply. This is what some people call "drive-by advocacy". A sort of I'll just leave you with my thoughts here and move on to the next flamebait kind of deal. In fact, he almost never replies because he knows that his fanatical arguments simply do not hold up to any sort of discussion. It's not that he's chosen the wrong cause - he's just going at it in a completely wrong way.

Here's that drive-by advocacy and FUD in motion: twitter goes on [slashdot.org] about some topic and then drops the usual "oh and M$ is teh evil" because "WMP phones home" or some such. Called on his FUD, he then claims [slashdot.org] that WMP stores every song and movie you've ever played in a file, somewhere. Pressed further, he just sort of slithers out of sight, his FUD-spreading complete. This is not about some Microsoft technology that nobody likes anyway; it's about lying for the sake of lying. Way too many of his posts are exactly like this one.

More? Just read though this [slashdot.org] post and the subsequent replies. I guess this stands on its own. Or these [slashdot.org] two [slashdot.org]. Or this one [slashdot.org]. Or this one [slashdot.org].

Still not convinced? This [slashdot.org] is what twitter considers "humour" while going about his daily "M$" routine.

More? Bad spelling in astounding conspiracy theories [slashdot.org], more [slashdot.org] offtopic [slashdot.org] FUD [slashdot.org] and uninformed "I'm right, look at me" rants [slashdot.org], promptly proven wrong. Worse even, twitter wants to be RMS [slashdot.org], apparently [slashdot.org] (that first one is a winner). I mean, really [slashdot.org]. You think [slashdot.org]?

FUD [slashdot.org], FUD [slashdot.org], FUD [slashdot.org], FUD [slashdot.org], offtopic FUD [slashdot.org], and more FUD [slashdot.org]. This guy is like the Monty Python SPAM skit, but with FUD [slashdot.org] and more FUD [slashdot.org] instead of canned meat. Amazed yet [slashdot.org]? Don't forget that PowerPoint makes you dumb [slashdot.org], and it's all a Microsoft conspiracy [slashdot.org]. How low do you want to go? Maybe as low as this [slashdot.org]?

The infamous Fax Manifest [slashdot.org]? Nuclear fireballs [slashdot.org]? It goes on [slashdot.org] and on [slashdot.org] and on [slashdot.org] and on [slashdot.org] and on [slashdot.org] and on [slashdot.org] and on [slashdot.org]. Like the energizer bunny. Or take these [slashdot.org] two [slashdot.org], which stretch the definition of weird. And you have to love this [slashdot.org] thread.

And in case you haven't had enough, consider that twitter actually thinks Microsoft is out [slashdot.org] to [slashdot.org] get [slashdot.org] him. No [slashdot.org], really [slashdot.org]. He figures he's somehow relevant to the Open Source movement, and that by "attacking" him Microsoft wages war on us. How's that for warped reality. And finally, this [slashdot.org] should be good for a few chuckles.

It's up to you. We can get rid of this guy and make Slashdot a better place. I don't know about you, but I'd rather take the trolls and crapflooders over people like "twitter" any day. And I sure as hell don't want to be categorized along with him. This [slashdot.org] is not how you advocate free software, period.

Interesting. Brief summary. (4, Insightful)

caryw (131578) | more than 9 years ago | (#11808228)

Basically the article says that some site-installed search engines that simply index all the files in /var/www or wherever are insecure, because they will index things that httpd would return a 401 or 403 for. Makes sense. A smarter way to do such a thing would be to "crawl" the whole site on localhost:80 instead of just indexing files; that way .htaccess and the like would be honored throughout.
Does anyone know if the Google Search Appliance is affected by this?
- Cary
--Fairfax Underground [fairfaxunderground.com]: Where Fairfax County comes out to play

Re:Interesting. Brief summary. (4, Insightful)

XorNand (517466) | more than 9 years ago | (#11808292)

A smarter way to do such a thing would be to "crawl" the whole site on localhost:80 instead of just indexing files; that way .htaccess and the like would be honored throughout.
Yes, that would be safer. But one of the powers of local search engines is the ability to index content that isn't linked elsewhere on the site, e.g. old press releases, discontinued product documentation, etc. Sometimes you don't want to clutter up your site with irrelevant content, but you want to allow people who know what they're looking for to find it. This article isn't really groundbreaking. It's just another example of how technology can be a double-edged sword.

Re:Interesting. Brief summary. (4, Interesting)

Qzukk (229616) | more than 9 years ago | (#11808452)

If you could give the crawler multiple starting points then you could simply have an unlinked page that links to all the old content, and give that page to the crawler as a second starting point.

Re:Interesting. Brief summary. (4, Interesting)

BigGerman (541312) | more than 9 years ago | (#11808512)

This is even more important when a search engine (appliance) is capable of crawling file shares directly (not just over HTTP).
The EnterFind appliance [enterfind.com] (which I participated in developing) has this (still unique) feature, and its clients were amazed by what the crawler could dig out, especially from those "hidden" fields in Office documents.

Re:Interesting. Brief summary. (4, Informative)

tetromino (807969) | more than 9 years ago | (#11808311)

Does anyone know if the Google search applicance is affected by this?

No. First of all, the Google Search Appliance crawls over http, and therefore obeys any .htaccess rules your server uses. Second, you can set it up so that users need to authenticate themselves. Third, there are many filters you can set up to prevent it from indexing sensitive content in the first place (except that since any sensitive content the google appliance indexes must already be accessible via an external http connection, one hopes it's not too sensitive).

Re:Interesting. Brief summary. (0)

Anonymous Coward | more than 9 years ago | (#11809191)

Crawling over http with a single privilege level would address this. Multiple privilege levels are exactly the problem at hand. Presumably the crawler has a tasty privilege level...

Re:Interesting. Brief summary. (4, Insightful)

Grax (529699) | more than 9 years ago | (#11808579)

On a site with mixed security levels (i.e. some anonymous and some permission-based access) the "proper" thing to do is to check security on the results the search engine is returning.

That way an anonymous user would see only results for documents that have read permissions for anonymous while a logged-in user would see results for anything they had permissions to.

Of course this idea works fine for a special purpose database-backed web site but takes a bit more work on just your average web site.

Crawling the site via localhost:80 is the most secure method for a normal site. This would index only documents available to the anonymous user already and would ignore any unlinked documents as well.
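
The result-side check described above is small at query time. A sketch with made-up index entries and access levels (none of this comes from the article):

ANONYMOUS = 0

# Each index entry records the minimum access level needed to read the
# underlying document; hits above the requester's level are dropped.
index = [
    {"path": "/press/2004-01.html", "min_level": 0},
    {"path": "/private/board-minutes.doc", "min_level": 2},
]

def filter_results(hits, user_level):
    return [h for h in hits if user_level >= h["min_level"]]

print(filter_results(index, ANONYMOUS))  # anonymous: only the press release
print(filter_results(index, 2))          # authorized user: both documents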

News at 11! (2, Insightful)

tetromino (807969) | more than 9 years ago | (#11808233)

Search engines let you find stuff! This is precisely why Google, Yahoo, and all the rest obey robots.txt. Personally, I would be amazed if local search engines didn't have their own equivalent of robots.txt that limited the directories they are allowed to crawl.

Re:News at 11! (2, Insightful)

WiFiBro (784621) | more than 9 years ago | (#11808335)

With a scripting language capable of listing directory contents and opening files (PHP, ASP, Python, etc.), anyone can write such a search engine. No degree required.
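
True enough, and that ease is exactly the hazard. A deliberately naive sketch of such an engine in Python (paths illustrative): it indexes straight off the filesystem, so files httpd would answer with 401/403 get swallowed too.

import os

DOCROOT = "/var/www"  # illustrative docroot
index = {}

# The vulnerable pattern: read files off the disk, so the web server's
# access rules are never consulted and protected files are indexed too.
for dirpath, dirnames, filenames in os.walk(DOCROOT):
    for name in filenames:
        path = os.path.join(dirpath, name)
        try:
            with open(path, errors="replace") as f:
                index[path] = f.read()
        except OSError:
            continue

def search(term):
    # Every match comes back with a snippet -- including matches inside
    # files a visitor could never fetch directly.
    for path, text in index.items():
        pos = text.find(term)
        if pos != -1:
            yield path, text[max(0, pos - 40):pos + 40]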

Re:News at 11! (0)

Anonymous Coward | more than 9 years ago | (#11808556)

equivalent of robots.txt that limited the directories they are allowed to crawl.

Given the number of people/companies that don't even configure a robots.txt, you're asking too much of the end-user.

Re:News at 11! (1)

digital bath (650895) | more than 9 years ago | (#11809189)

Read the article. This does not apply to "external" search engines such as Google and Yahoo, only to internal search engines that access the files through the filesystem rather than the web server, since these "internal" search engines are capable of indexing files that would return a 403/401 via HTTP.

sounds like fun (2, Funny)

h4ter (717700) | more than 9 years ago | (#11808251)

The attacker first loops through all possible words in English...

I get the idea this might take a while.

Re:sounds like fun (2, Funny)

h4ter (717700) | more than 9 years ago | (#11808272)

Wait a minute. All possible? Couldn't be satisfied with just actual words? This is going to take a lot longer than I first thought.

(Sorry for the reply to self. It's like my own little dupe.)

Re:sounds like fun (1)

gstoddart (321705) | more than 9 years ago | (#11808519)

Wait a minute. All possible? Couldn't be satisfied with just actual words? This is going to take a lot longer than I first thought.

Well, just record the guessed words, you might stumble on Hamlet. :-P

Does he really mean this (0, Redundant)

iMaple (769378) | more than 9 years ago | (#11808253)

The article says: "The attacker first loops through all possible words in English."

Is this not a bit too ridiculous? (Especially if the inaccessible file is someone's outdated personal webpage.) If it is anything useful to a hacker or other persons involved in illegitimate activity, then the technique will most probably fail.
I am not saying that there is no vulnerability (getting data from search snippets is a good idea), but the third option I just quoted above seems pretty lame.

Re:Does he really mean this (0)

Anonymous Coward | more than 9 years ago | (#11808281)

Yep, I agree. The only good application for this is to hope that someone is dumb enough to store credit card numbers, SSNs, passwords or even email addresses in some file. And none of those will turn up if you loop through all the words.

Vs. Database-Driven Sites? (3, Insightful)

Eberlin (570874) | more than 9 years ago | (#11808288)

The instances mentioned all seem to revolve around the idea of indexing files. Could the same be used for database driven sites? You know, like the old search for "or 1=1" trick?

Then again, it's about being organized, isn't it? A check of what should and shouldn't be allowed to go public, some sort of flag where even if it shows up in the result, it better not make its way onto the HTML being sent back. (I figure that's more DB-centric though)

Last madman rant -- Don't put anything up there that shouldn't be for public consumption to begin with!!! If you're the kind to leave private XLS, DOC, MDB, and other sensitive data on a PUBLIC server thinking it's safe just because nobody can "see" it, to put it delicately, you're an idiot.

Re:Vs. Database-Driven Sites? (2, Insightful)

jnf (846084) | more than 9 years ago | (#11808544)

Thank you. That's the real security risk: not the indexing agent, but the question of why there is internal documentation marked 'private' or 'confidential' within the webroot of an externally accessible webserver in the first place.

Re:Vs. Database-Driven Sites? (0, Troll)

illumin8 (148082) | more than 9 years ago | (#11809422)

If you're the kind to leave private XLS, DOC, MDB, and other sensitive data on a PUBLIC server thinking it's safe just because nobody can "see" it, to put it delicately, you're an idiot.

Or, you're a Diebold employee...

Isn't this (1)

jacksonj04 (800021) | more than 9 years ago | (#11808318)

by design? Surely something with permission to index internal files (even those specified to give 403s etc) is inherently designed to make them available to view.

Either that, or it's a user error (configuration).

that gives me an idea... (1)

edeus (853971) | more than 9 years ago | (#11808332)

Is it possible, given the time and perseverance, to exploit a vulnerability in a search engine's parsing of a webpage that, say, you maliciously published somewhere? Obviously one would expect Google and the like to have good security (well, apart from the Gmail exploit and... well, let's not go there), so I was curious: has it ever been done? (ponders)

search indexer = magic (1)

EvilSheep (40230) | more than 9 years ago | (#11808337)

Summary: if you are going to use magic to index your web site, be smart about it. Don't just blindly use a tool that "does the job".

Nothing new here.

obvious? (5, Insightful)

jnf (846084) | more than 9 years ago | (#11808362)

I read the article, and it seems to be like a good chunk of today's security papers: 'here's a long, drawn-out explanation of the obvious.' I suppose it wasn't as long as it could be, but really... using a search engine to find a list of files on a website? I suppose someone has to document it.

I mean, I understand it's a little more complex than that, as described in the article, but I would hardly call this a 'new web application attack'. At best it's one of those humorous advisories where the author overstates things and creates much ado about nothing. Or at least that's my take.

-1 not profound

P2P (4, Interesting)

Turn-X Alphonse (789240) | more than 9 years ago | (#11808418)

Go to any P2P network and type @hotmail.com, @gmail.com or @yahoo.com and see what documents turn up. I'm willing to put money on them all being e-mails saved on idiots' PCs, containing everything from stuff to sell to spammers (if you're so inclined) to sexual material and passwords/credit card info.

Nothing really new here..

Re:P2P (1)

12 inch pianist (835625) | more than 9 years ago | (#11809862)

"Resume" is another fun p2p search. Usually has the name, address, and phone number. Then browse host and check out their kiddie pr0n collection.

Re:P2P (1)

mibus (26291) | more than 9 years ago | (#11810267)

That should give you plenty of cookies with authentication info...

Search for the right extension and you're likely to find MSN Messenger logs from people who have shared out all of "My Documents" without thinking!

Uh huh.... (1)

conran (837379) | more than 9 years ago | (#11808460)

"Reconstructing" files by searching every word in the english language in different orders? I want the last 5 minutes of my life back...

Re:Uh huh.... (1)

SharpFang (651121) | more than 9 years ago | (#11808853)

Did you RTFA?

Search "foo". You get: "... first version of Foo, the world leading ..."
Then search for just the above. You get: "... to release the first version of Foo, the world leading anti-gravity engine ..."
Repeat: "... We are happy to release the first version of Foo, the world leading anti-gravity engine that works on ..."
Doesn't sound too hard?

Of course, the query length is limited, but that can be solved by a "moving frame." Say that with the query above, the engine complains your query is too long.
Search "anti-gravity engine that works on" and get "... world leading anti-gravity engine that works on salted water and cheap ..."
Then put in "works on salted water and cheap" and get "... engine that works on salted water and cheap components like ..."
Search "water and cheap components like", and so on...

Re:Uh huh.... (2, Informative)

conran (837379) | more than 9 years ago | (#11809042)

Did you RTFA?

Yep. Did you keep reading it? I'm referring to the methods for when no excerpts are given.

RTFM (5, Informative)

Tuross (18533) | more than 9 years ago | (#11808468)

My company specialises in search engine technology (for almost a decade now). I've worked quite in-depth with all the big boys (Verity, Autonomy, FAST, ...) and many of the smaller players too (Ultraseek, ISYS, Blue Angel, ...)

I can't recall the last time this kind of attack wasn't mentioned in the documentation for the product, along with instructions on how to disable it. If you choose to ignore the product documentation, you get what you deserve.

It's quite simple, folks. Don't open the search engine. ACL query connections. Sanitize queries like you (should?) do for other CGI applications. Authenticate queries and results. If you can't be bothered, hire someone who can.

RTFA (1)

SharpFang (651121) | more than 9 years ago | (#11808815)

The problem is that these are perfectly legal search engine queries. No matter how you "sanitize" the queries, it won't help, because they contain valid requests. The vulnerability lies on the side of the indexing program, not the query/search/display one. The indexer indexes things it shouldn't: files normally inaccessible through httpd are accessible in the search database.

A method I see for fixing that would be to run the indexing by piping it through httpd, making even local indexing go the same way remote indexing does: indexing not /var/www/... but http://localhost/. This way the indexer won't be able to access anything a common user can't.
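
A minimal sketch of that in Python: everything is fetched over HTTP from localhost, so anything the server answers with 401/403 simply never enters the index (link extraction is deliberately crude):

import re
import urllib.request
from urllib.parse import urljoin

def crawl(start="http://localhost/"):
    index, queue, seen = {}, [start], set()
    while queue:
        url = queue.pop()
        if url in seen or not url.startswith(start):
            continue  # visit each on-site page once
        seen.add(url)
        try:
            with urllib.request.urlopen(url) as resp:
                page = resp.read().decode("utf-8", errors="replace")
        except OSError:
            continue  # includes HTTPError: 401/403 pages never reach the index
        index[url] = page
        queue.extend(urljoin(url, href) for href in re.findall(r'href="([^"]+)"', page))
    return index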

That'll make it easy... (1)

Jaidon (843279) | more than 9 years ago | (#11808470)

...to find all the "free sample" pr0n hidden in the maze of otherwise unintelligible directories. In the end, isn't that what the Internet is all about -- finding more efficient ways to see boobies? Yes... yes, I think so.

Assumptions (1)

shird (566377) | more than 9 years ago | (#11808515)

All these "attacks" assume the indexing program will index and return results for files you dont have access to.

Im pretty sure the indexing server on Windows won't return 'search results' for files you dont have permissions to list. As with any other sensible indexing schemes, except perhaps the newer silly 'desktop search' tools. Seems pretty obvious to me.

Re:Assumptions (2, Informative)

SharpFang (651121) | more than 9 years ago | (#11808761)

I'm pretty sure the indexing server on Windows won't return 'search results' for files you don't have permission to list.
The problem and the vulnerability lie in the definition of "you".
The indexing program runs with the privileges of a local user with direct access to the hard drive: listing directory contents, reading user-readable files. "You" are the user, like one behind the console, maybe without access to sensitive system files, but with access to mostly everything in the htroot tree that the administrator hasn't blocked using OS permissions rather than httpd features.
As a webpage visitor, "you" are "guest", filtered through httpd, with all httpd restrictions applied: no directory listings, and obscure blocking methods (.htaccess, config files, redirects, CGI wrapping) all working. Your access is limited to what httpd lets you do, not just what the OS does. Now, if you access the search engine database, you can see mostly everything the engine saw, including things it wouldn't have seen had it been running through httpd instead of directly accessing the filesystem.

Re:Assumptions (1)

shird (566377) | more than 9 years ago | (#11808889)

Yes, the indexing service may have access to everything. That's why I said it won't return search results for files *you* don't have permission to list.

I.e., the indexing service checks the permissions of the requesting user, and only lists files they would be able to list in the OS. It's only common sense.

Re:Assumptions (1)

robertwall (688324) | more than 9 years ago | (#11810202)

The article's talking about search engines that are run locally on websites, not indexing services on local computer terminals.

application in porn (1, Funny)

Anonymous Coward | more than 9 years ago | (#11808518)

My mind being the way it is, I can't help but think of an application for this in porn ;). A lot of porn sites have extensive free previews, but it's hard to find all the free preview pics for a certain site (useful especially for a single model's site) unless you can find a direct link to every single unique free preview gallery somewhere, and you'll undoubtedly miss some good stuff. I want to see a Firefox extension that gets me all the free pics from a given site, damnit!

I = MOLEMAN (-1, Offtopic)

Anonymous Coward | more than 9 years ago | (#11808539)

pubes

New option for robots.txt (5, Funny)

michelcultivo (524114) | more than 9 years ago | (#11808576)

Please put this new undocumented tag in your robots.txt file: "hackthis=false" "xss=false" "scriptkiddies=log,drop" And all your problems will be solved.

Re:New option for robots.txt (1)

greyhoundpoe (802148) | more than 9 years ago | (#11808992)

New option for robots.txt (Score:3, Interesting)
Please put this new undocumented tag on your robots.txt file: "hackthis=false" "xss=false" "scriptkiddies=log,drop" And all your problems will be solved.


Note to mods: *slap*

This is old. (4, Insightful)

brennz (715237) | more than 9 years ago | (#11808881)

Why is this being labeled as something new? I remember this being a problem back in 1997 when I was still working as a webmaster.

Whoever posted this as a "new" item is behind the times.

OWASP covers it! [owasp.org]

Let's not rehash old things!

Re:This is old. (1)

lux55 (532736) | more than 9 years ago | (#11810505)

Not to be all "I'm so smart," but isn't this also rather obvious? If you're indexing private documents, don't return private results for public visitors. Simple as that.

All it takes to implement this is an "access level" field stored with each index entry, and an "access level" session value assigned to each visitor (defaulting to 0 for anonymous visitors).

Plus, this way you'll avoid pissing off visitors who click on essentially broken links in their search results.

No wonder the search capabilities of most sites are rated so poorly...

Why bother with phishing scams... (3, Interesting)

B747SP (179471) | more than 9 years ago | (#11808993)

This is hardly news to me. When I need a handy-dandy credit card number with which to sign up for one of those, er, 'adult hygiene' web sites, I just google for a string like "SQL Dump" or "CREATE TABLE" or "INSERT INTO" with filetype:sql and reap the harvest. No need to piss about with hours of spamming, setting up phishing hosts, etc, etc :-)

solution (3, Insightful)

Anonymous Coward | more than 9 years ago | (#11809064)

Here's a solution that's been tried and seems to work: create metadata for each page as an XML/RDF file (or DB field). XPath can be used to scrape content from HTML et al. to automate the process, as can capture from a CMS or other document management solution. Create a manifest per site or sub-site: an XML-RDF tree structure containing references to the metadata files and mirroring your site structure. Finally, assuming you have an API for your search solution (and don't b*gger around using ones that don't), code the indexing application to parse only the XML-RDF files, beginning with the structural manifest and then descending into the metadata files. Your index will then contain relevant data, site structure and, thanks to XPath, hyperlinks for the web site. No need to directly traverse the HTML. Still standards-based. Security permissions only need to allow the indexer access to the XML-RDF files, which means process permissions alone are needed; user permissions are irrelevant.

There are variations and contingencies, but the bottom line is: even if someone cracked into the location of an XML metadata file, it's not the data itself, and while it may reveal a few things about the page or file it relates to, it is certainly much less of a risk than full access to the other file types on the server.

Here's another tip for free: because you now have metadata in RDF, with a few more lines of code you can output it as RSS.
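
A rough sketch of the indexing side of that in Python; the manifest layout, element names and paths are invented for illustration, not any fixed schema:

import xml.etree.ElementTree as ET

def build_index(manifest_path="manifest.xml"):
    # Parse only the manifest and the metadata sidecar files it names;
    # the indexer never needs read access to the documents themselves.
    index = []
    root = ET.parse(manifest_path).getroot()
    for ref in root.iter("metadata"):  # invented element name
        meta = ET.parse(ref.get("href")).getroot()
        index.append({
            "url": meta.findtext("url", ""),
            "title": meta.findtext("title", ""),
            "summary": meta.findtext("summary", ""),
        })
    return index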

I just... (0)

Anonymous Coward | more than 9 years ago | (#11809139)

...let j0hnny [ihackstuff.com] do all the work for me.

I mean with the 0 in his name and everything, I know he's good.

credit where due.... (0)

Anonymous Coward | more than 9 years ago | (#11809837)

Anonymous? I sent that in and I demand recognition!