Preventing Forum Spam-bots?
A concerned reader asks: "Recently it seems that forums have become the new target for spam bots advertising everything from porn to casinos. The forums that I admin are constantly harassed by these bots even though you must enter the visual confirmation code (the picture with letters/numbers) as well as reply to an e-mail in order to register. This only started a few months ago so I'm suspecting that some new spam program was released that somehow gets around these anti-bot measures. How can I get rid of these annoying bots?"
One word: (Score:5, Informative)
Re:One word: (Score:2)
It seems a bit process-intensive, though, judging by the load time I'm getting. The success message on the demo seems rather appropriate, given last weekend's Slashdot layout...
Re:One word: (Score:1)
Re:One word: (Score:2)
http://en.wikipedia.org/wiki/Every_time_you_mastu
There's a much simpler method (Score:3, Interesting)
Please use correct terminology (Score:5, Informative)
Also... (Score:4, Informative)
Re:Also... (Score:1)
Re:Also... (Score:2)
Re:Please use correct terminology (Score:5, Insightful)
I have seen some captchas that ask users in plain text to solve a simple arithmetic or logic problem. This is going to be far more accessible than anything relying on embedded media.
If you're sure that none of your users are blind or colorblind (which would be plausible only for an extremely small user base), then I suppose something like KittenAuth [arstechnica.com] might be appropriate.
Re:Please use correct terminology (Score:5, Insightful)
This has yet to be a problem as the forums that I run are oriented around shooters or MMPOGs.
Re:Please use correct terminology (Score:2)
Visual tests with an audio alternative for sight impaired
Re:Please use correct terminology (Score:4, Interesting)
You could also go for the cuteness approach:
Click on the three images which are OMG Kittens and you're identified as human.
Re:Please use correct terminology (Score:2)
Re:Please use correct terminology (Score:3, Insightful)
Anyone who wants to custom-program a bot for a single site would just be better off manually posting their spam.
Re:Please use correct terminology (Score:2)
Also, I fail to see how a word captcha could be guessable. A 5-character sequence composed of alphanumeric characters (36 possible symbols, ignoring case) would yield a 1/60466176 chance of guessing it right. That's one in 60 million. You'd be better off playing the lottery
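The figure above can be checked with a few lines of Python. This is only a sketch of the arithmetic: it assumes the code is drawn uniformly from 26 case-insensitive letters plus 10 digits, and the function name `blind_guess_space` is made up for illustration. It says nothing about OCR, which does not guess blindly.

```python
import string

# Assumption: 26 case-insensitive letters plus 10 digits, i.e. 36 symbols.
CASE_INSENSITIVE = string.ascii_lowercase + string.digits
# For comparison: a case-sensitive code has 62 possible symbols per position.
CASE_SENSITIVE = string.ascii_letters + string.digits


def blind_guess_space(length: int, alphabet: str = CASE_INSENSITIVE) -> int:
    """Number of equally likely codes a blind guesser has to choose from."""
    return len(alphabet) ** length


if __name__ == "__main__":
    print(blind_guess_space(5))                  # 36 ** 5
    print(blind_guess_space(5, CASE_SENSITIVE))  # 62 ** 5
```

With 36 symbols and five positions the result is 60,466,176, which matches the one-in-60-million figure.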
Re:Please use correct terminology (Score:2)
Re:Please use correct terminology (Score:2)
Re:Please use correct terminology (Score:2)
Re:Please use correct terminology (Score:5, Funny)
While not illegal, some may consider it amoral to discriminate against stupid people.
Re:Please use correct terminology (Score:3, Insightful)
While not illegal, some may consider it amoral to discriminate against stupid people.
Immoral? Hell, it's a moral imperative!
Re:Please use correct terminology (Score:2)
This is not a good captcha. If someone wants to flood the forums, it takes about 3 minutes to write a regexp to crack these. You aren't going to implement more than 20 or so different logic puzzles, and it's rather trivial to automatically parse these. Also remember that you only need a 5-10% success rate to completely shitflood the forums. I don't think it's possible to create a captcha that is usable
Re:Please use correct terminology (Score:2)
Re:Please use correct terminology (Score:2)
I actually implemented this on my blog a little while back as a quick deterrent (because I didn't have the resources to implement it). The system was quite simple - it was basically a number in expanded notation, written out in words like so:
seven times one hundred plus eight times ten plus six
Answer: 786
Simple enough to check and because it's text it takes a little more effort to write something to crack it. I didn't get a comment
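A challenge of this shape could be generated roughly as follows. This is a minimal sketch, not the poster's actual code: the function names, the restriction to digits 1 through 9, and the three-digit format are all assumptions made for illustration.

```python
import random

# Digit words used to spell out the challenge (index = digit value).
ONES = ["zero", "one", "two", "three", "four",
        "five", "six", "seven", "eight", "nine"]


def build_question(hundreds: int, tens: int, units: int):
    """Return (question_text, expected_answer) for the given digits."""
    text = (f"{ONES[hundreds]} times one hundred plus "
            f"{ONES[tens]} times ten plus {ONES[units]}")
    return text, hundreds * 100 + tens * 10 + units


def make_challenge(rng=random):
    """Pick three random non-zero digits and build a challenge from them."""
    return build_question(rng.randint(1, 9), rng.randint(1, 9), rng.randint(1, 9))


def check_answer(expected: int, submitted: str) -> bool:
    """Compare the visitor's typed answer with the expected number."""
    try:
        return int(submitted.strip()) == expected
    except ValueError:
        return False
```

Calling `build_question(7, 8, 6)` reproduces the example above. The expected answer has to be kept on the server (for instance in the session), not sent along with the form.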
Re:Please use correct terminology (Score:2)
That's incredibly easy to circumvent. Just use http://www.google.com/search?q=seven+times+one+hundred+plus+eight+times+ten+plus+six&start=0&ie=utf-8&oe=utf-8&client=firefox-a&rls=org.mozilla:en-US:official [google.com]
((seven times one hundred) plus (eight times ten)) plus six = seven hundred eighty-six
Re:Please use correct terminology (Score:2)
Of course, if you're really getting hammered, you'll need to vary the structure of the questions (and the keywords) a lot, and probably move into the realm of general knowledge questions -- and then you need to make sure you're not relying on vocabulary or knowledge that would exclude more people than you intend.
And the simple ones only work because it's not worth the spammers' time
audio captcha for the blind (Score:2)
Re:Please use correct terminology (Score:2)
"This is an apple. It is smooth, shiny, and red."
"This is fluffy. It is shiny with orange and white stripes."
If set up properly, it should be easy enough for a human to guess which is the kitten and which is not, but difficult for a bot (without semantic reasoning) to tell the difference. You may have to avoid words that the bot can clue in on ("fur" is probably bad, for example).
This technique has been around long before the "Kitten Rank" site, however, and by fi
Re:Please use correct terminology (Score:2)
Re:Please use correct terminology (Score:2)
Re:Please use correct terminology (Score:2)
Audio captchas? Hey, that discriminates against me because I don't use speakers.
You see, GP is voicing annoyance about having to keep disabled people in mind.
Re:Please use correct terminology (Score:2)
Sorry
Re:Please use correct terminology (Score:2)
Re:Please use correct terminology (Score:2)
This is why it's important to think of accessibility and standards. Not only is there a huge base of people using browsers other than MSIE -- there's a base of users who interact with computers in entirely different ways than most of us.
OCR (Score:1)
Re:OCR (Score:1)
CAPTCHAs are NOT the best solution - they're just a band-aid, and they make your site harder to use (especially for low-vision people). Personally I prefer web-server level blocking of dodgy UAs, IP ranges, POST payloads with something like the wonderful mod_security [modsecurity.org] for Apache, coupled wi
Grace period? (Score:1)
Re:Grace period? (Score:4, Informative)
If a site makes me wait three days, though, I'm likely to forget about it in that time.
Or were you talking about smaller grace periods? Perhaps 10 minutes? That might work well.
Re:Grace period? (Score:2)
Re:Grace period? (Score:3, Insightful)
Re:Grace period? (Score:1)
But it didn't completely stop them. Two nights ago we had a guy spam us and told us to Google for his company's name and click the first link t
Easy (Score:5, Funny)
Anyone who can still click on the confirm button is not human.
Re:Easy (Score:2)
Re:Easy (Score:1)
Re:Easy (Score:2)
Don't say you weren't warned.
Re:Easy (Score:2)
Goatse got shut down, but it used to be a "shock site" (in fact, I think it was the first shock site).
Re:Easy (Score:2)
Yes, but not for long. After its troubles in Christmas Islands, it simply moved to Canada.
Re:Easy (Score:2)
Lame. While it does have the entryway, and the hands opening said entryway, it's missing the twig and berries. Oh, and the entryway is not red enough.
Visual code (Score:1)
add ad hoc customizations (Score:3, Insightful)
But if they are defeating captcha, there is probably someone who just sits there manually spamming forums through anonymous proxies. The amount of money that can be made by doing this spamming is probably enough to pay people with lower standards of living to just do it manually. And if that's so, there's just no way to get around it. I started logging how many bots the captcha and hidden variables were catching, and it was tons. Still, I get spammers. Just not nearly as many.
Re:add ad hoc customizations (Score:2)
Nope.
Well maybe, but not necessarily.
There is at least one public [sentinel.deny.de] and many 'private' tools that can brute force captcha while rotating proxies between attempts.
Plenty of freely available OCR components can be incorporated into your own program. It'd make much more sense to pay one programmer (or DIY) to whip up a quality OCR proggie than to pay monkeys to sit around typing in c
Comment removed (Score:4, Insightful)
Re:Two good approaches (Score:2)
Better: dynamically change the names of form fields ("subject", "message", etc) based on the current time. MD5 hash the current hour with the field name, and have the software only check the current and previous values. Spam bots generally have to be told what field names to look for.
Unless you're also willing to change the order of fields on your post-submit page, as well as the form factor, that doesn't do much good.
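For reference, the rotating-field-name idea being debated here might look like the sketch below. The `SECRET` value and the `f_` prefix are assumptions added for illustration (the suggestion itself only mentions hashing the current hour with the field name), and, as the reply points out, this alone does nothing against a bot that goes by field order.

```python
import hashlib
import time

# Hypothetical per-site secret; not part of the original suggestion.
SECRET = "change-me"


def current_hour(now=None) -> int:
    """Whole hours since the epoch; `now` can be injected for testing."""
    return int((time.time() if now is None else now) // 3600)


def field_name(base: str, hour: int, secret: str = SECRET) -> str:
    """Derive the form field name used for `base` during the given hour."""
    digest = hashlib.md5(f"{secret}:{hour}:{base}".encode("utf-8")).hexdigest()
    return "f_" + digest[:16]


def read_field(form: dict, base: str, now=None):
    """Look up `base` under the current or the previous hour's name."""
    hour = current_hour(now)
    for candidate in (hour, hour - 1):
        name = field_name(base, candidate)
        if name in form:
            return form[name]
    return None  # stale or hard-coded field name: treat as a bot
```

Accepting the previous hour's name keeps a form usable when it was loaded just before the hour rolled over.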
Don't use well known forum software (Score:5, Interesting)
Don't use phpbb, vbulletin or whichever other forum software everyone uses. Don't name your registration page "register.php" or something similarly easy to guess. Don't give your username and password fields name and id attributes of "username" and "password". Etc, etc. There is no security in obscurity, but there sure as hell is lots of convenience and freedom from automated harassment.
The rewards for writing scripts that can handle the subscription process for all the big software packages are simply too large. Yes, these software packages will now start up the arms race, same as has happened with weblogs and email and referer spammers (does anyone else have the feeling we've won that last one, btw?). You can try and follow along and update your forum software every other day. But it's much more convenient to simply duck under the radar. Chances are no spammer is going to bother figuring out how to register at your custom-built/modified forum.
Re:Don't use well known forum software (Score:3, Informative)
Much as I hate to agree with that, he speaks the truth -- the bots are written to target specific forum packages, and they almost always go after the popular ones. phpBB has taken a lot of stick for one or two security problems that came up, but in truth it's as good as, if not better than, its competition; the reason it gets hit so badly is simply because it's so popular.
So if you can use a less-well-known package, that will keep you awa
Re:Don't use well known forum software (Score:2)
What email addresses are they using? (Score:3, Interesting)
If they are using Gmail, then maybe Google would be nice enough to start a service where you could report addresses that bots are using. The great thing about Google requiring invites is that Google now has this neat chain of responsibility. If they see a pattern where all of the addresses created by invites from a certain person's account have been used as bots, then they could delete all those accounts and all the accounts they invited. That would seriously screw the spammers.
Re:What email addresses are they using? (Score:3, Insightful)
What worked for me (Score:2, Interesting)
I'm guessing you're using phpBB. I've actually been hit by these guys on my boards; it wasn't a problem for me until they started to post. It appears to be actual people and not robots. I should also note I didn't have this problem until I added Google AdSense to my boards. After I did that, I started to get two or three of these spammers each week. Another phpBB board I administer hasn't gotten a spam user yet.
What worked for me was checking the registration e-mail addresses of these people and putting in
Re:What worked for me (Score:1)
I did a search on phpBB's site about this and found I wasn't the only one with the idea of removing the URL field from the user name information. The phpBB people were not interested in creating a mod to do that, and they instead suggested I try the mod to block requests from proxies.
The proxy mod worked for a while, and I kept i
Re:What worked for me (Score:2)
There's also a huge topic on phpBB.com http://www.phpbb.com/phpBB/viewtopic.php?p=1404100 [phpbb.com] which details a few things you can do to stop them. The main suggestion is the Instant Ban mod (http://www.phpbb.com/phpBB/viewtopic.php?t=186683 [phpbb.com]) which will modify the registration page in such a way that automated attempts get banned. It is done in such a wa
Re:What worked for me (Score:2)
What'd be really cool is a stealth ban where you can see your posts, but nobody else can.
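A stealth ban like that comes down to one filter at display time. The sketch below is a guess at the general shape, with made-up names (`Post`, `visible_posts`, `stealth_banned`); it is not taken from any particular forum package.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    author: str
    body: str


def visible_posts(posts, viewer, stealth_banned):
    """Return the posts `viewer` should see.

    Posts by stealth-banned authors are shown only to those authors
    themselves; everyone else (including guests, viewer=None) never sees them.
    """
    return [p for p in posts
            if p.author not in stealth_banned or p.author == viewer]
```

The banned account keeps seeing its own posts, so an automated spammer gets no signal that anything changed.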
Be proactive! (Score:4, Insightful)
There are a number of options you have, depending on how aggressive you want to be. You may have implemented some of these suggestions already, but they may help other forum admins in a similar quandary.
Firstly, disable anonymous posting. What works for Slashdot does not necessarily work for phpBB. This may sound obvious, but a forum I check on now and again is slowly haemorrhaging members due to guest bot spam.
Secondly, find yourself a list of public proxy servers. Ban them. Find some more. Ban them too. Also, take note of the IPs the spambots were using to post. Ban them as well (unless they are AOL IPs -- be smart and do an nslookup). Keep this list of banned IPs, and share them with the blacklist groups, or other forum admins you know. You help them, they help you.
Thirdly, augment your signup process. You say you are using CAPTCHAs, but if the bots are getting around or through them, you have to do more. Write a few hundred straightforward questions; you can get your community to help you with this one. Have one or two of those questions displayed at registration time, along with the CAPTCHA. For example:
Which of these is not one of the seven dwarves?
Or would you like another question?
Keep this as simple as possible. "What color is the sky?" is about the level you are looking for. A bot won't be able to answer these unless it is specifically programmed to. Need I say you should serve a random question?
For bonus points on this one, make the questions something to do with the topic of the forums. If the forums were about widgets, you could ask something (really basic) like "What is the most common color of widget?". Or make some of the questions about the TOS. You know, the thing everyone checks the box saying "I agree to abide by the TOS". This may alienate some people, though, which you may or may not want. Also remember to consider non-native English speakers.
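The question-pool idea could be wired up along these lines. This is a sketch only: apart from the sky question, the entries are placeholders, a real pool would hold a few hundred questions, and the names `QUESTIONS`, `pick_question` and `is_correct` are invented here.

```python
import random

# Placeholder pool mapping each question to its accepted answers (lowercase).
QUESTIONS = {
    "What color is the sky?": {"blue"},
    "How many legs does a cat have?": {"4", "four"},
    "What is the opposite of hot?": {"cold"},
}


def pick_question(rng=random) -> str:
    """Serve one question at random for this registration attempt."""
    return rng.choice(list(QUESTIONS))


def is_correct(question: str, answer: str) -> bool:
    """Check an answer, ignoring case and surrounding whitespace."""
    return answer.strip().lower() in QUESTIONS.get(question, set())
```

The server should remember which question it served (for example in the session) instead of trusting a question sent back by the client, otherwise a bot can always answer the one question it knows.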
If you are still getting those darned bots, consider manually approving all registrations. This will obviously depend on how many new signups you get, and what kind of manpower you have (think moderators and "trusted community members"). On the other hand, you should be able to spot and stop bots right off the bat.
But why stop there? Be even more proactive! Set up a honeypot. Disallow a certain directory with robots.txt, and ban all IPs that find their way there. Include an invisible link to the disallowed location and see what falls in the trap. Remember that blacklist you started earlier? Add (and share) these IPs!
Finally, let your community know what you are doing. They will appreciate the effort (if you have noticed the spam, so have they). Set clear guidelines, and encourage community vigilance.
In the end, remember: spam is beatable.
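The honeypot step described above can be sketched as follows. The `/trap/` path, the `Honeypot` class and the in-memory ban set are all assumptions for illustration; a real forum would persist the list and hook this into its request handling.

```python
# Hypothetical trap location. Well-behaved crawlers are told to stay out of it
# via robots.txt, and the only link to it is invisible to human visitors.
TRAP_PREFIX = "/trap/"
ROBOTS_TXT = "User-agent: *\nDisallow: /trap/\n"


class Honeypot:
    """Bans any IP that requests a URL under the disallowed prefix."""

    def __init__(self, trap_prefix: str = TRAP_PREFIX):
        self.trap_prefix = trap_prefix
        self.banned = set()  # IPs that walked into the trap

    def observe(self, ip: str, path: str) -> bool:
        """Record one request; return True if it should still be served."""
        if path.startswith(self.trap_prefix):
            self.banned.add(ip)
        return ip not in self.banned
```

The `banned` set is the list to merge into the shared blacklist mentioned earlier. Legitimate crawlers that honor robots.txt never request the trap path, so they are not affected.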
Re:Be proactive! (Score:2)
Ahhhh! The optimism of youth!
Re:Be proactive! (Score:2)
I can't remember which forum(s) allow you to do this, but at least one of 'em allows you to set a user up so that they can keep posting, but only they see their own posts.
I think it makes a lot more sense to relegate the trolls and spammers to their own personal little playpen. Automated spammers aren't likely going to check if everyone else gets to see their posts... and as always, better an enemy you know than an enemy you don't.
If they think their accounts are wor
Re:Be proactive! (Score:2)
Um. Dwarves? Dwarves are, y'know, heavily bearded guys with massive axes who go around hiring halfling burglars to help them plunder a dragon's hoard, have inherent resistance to the major deleterious effects of Rings of Power, and do a nice line in erotic mithril underwear.
What you've got hold of there, on the other hand, are dwarfs.
Use Slashdot's method (Score:3, Insightful)
The only truly effective way to stop this crap is to require a certain amount of time to elapse before being able to post another post, like the way Slashdot does it, and to implement some kind of moderation+filtering system so the crap can all be modded down by vigilant users. Combine that with a couple other requirements (you must have a user account to post, and new users can't post for the first 48 hours), and you'll easily squash the spam problem.
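The two timing rules in this comment fit in one small check. The sketch below is not how Slashdot actually implements it: the two-minute gap is an assumed value (the comment only says "a certain amount of time"), and the function and constant names are made up.

```python
# Assumed minimum gap between two posts by the same account.
MIN_SECONDS_BETWEEN_POSTS = 120
# From the comment: new users can't post for the first 48 hours.
NEW_ACCOUNT_WAIT_SECONDS = 48 * 3600


def may_post(now: float, registered_at: float, last_post_at=None) -> bool:
    """Decide whether an account may post at time `now` (seconds since epoch)."""
    if now - registered_at < NEW_ACCOUNT_WAIT_SECONDS:
        return False  # account is still in its waiting period
    if last_post_at is not None and now - last_post_at < MIN_SECONDS_BETWEEN_POSTS:
        return False  # posting too quickly after the previous post
    return True
```

Moderation and filtering would sit on top of this; the check only slows down how fast any single account can flood the board.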
Unstoppable captcha-buster (Score:2)
1. Find links to a handful of free thumbnail galleries
2. Set up a webpage with links to said galleries
3. Make every outgoing link require filling in a CAPTCHA
When your page gets a hit, you pull down the CAPTCHA image (or whatever) from the target site, and serve it up to the mast
Re:Unstoppable captcha-buster (Score:3, Insightful)
by the users, for the users (Score:4, Interesting)
I do recommend you use your community to help your community
attack your site (Score:4, Interesting)
Perhaps the best way to fix your site is to attack it yourself. Try to write a simple bot that automates the login process, and see what happens. You may suddenly notice a subtle hole in your security (maybe the filename for the captcha gives away what it is... or maybe after a successful verification, the same cookie can be used to create another account... or something). In the process of attacking your own site you may uncover something you've missed before.
Re:attack your site (Score:2)
I'd doubt it. Newer OCR engines are quite flexible.
At worst, they might have to make up a new profile to process your captcha. Though, I'll admit, some are really tough, even for humans to decode.
Some people don't realize that a simple "type in the black letters on white background" isn't going to cut it anymore.
I had a similar problem with phpbb2 (Score:2)
My solution was to implement the visual check that everyone's talking about. I still get some registrations, but much fewer. What's crazy is that by default, these users can hardly do anything. Unfortunately creating spam
Good moderators help... (Score:2)
The most recent batch to hit the site where I'm one of the mods often uses a *@mail.ru e-mail address and a random string of eight to ten characters as the registered name.
Most of those we are getting link to sites like the following:
http://www.drugsn.com/ [drugsn.com]
http://phentermine.snow-send.com/ [snow-send.com]
http://internet-casino-gambling-online.snow-send.com/ [snow-send.com]
http://xanax.crasn.com/ [crasn.com]
http://www.drug [drugname.net]
Re:Good moderators help... (Score:2)
Re:Good moderators help... (Score:2)
We ARE banning IDs and IPs, which MAY explain why there are no repeat posts from them, but there seems to be a virtually unlimited number of IPs, from around the world (UK, US, Poland, Japan, Germany, France, etc.), that these turkeys hit from.
Re:Good moderators help... (Score:2)
Maybe you should set up SpamAssassin to filter forum posts. After all, it does a pretty good job of detecting spammy keywords and such. Sort of like Slashdot's filters.
Another possibility is to put in a probation period. Let's say, if you have been registered for less
Re:Good moderators help... (Score:1)
Re:Good moderators help... (Score:2)
Re:Good moderators help... (Score:3, Informative)
I know this won't help with the unsightly comments on your website, but since this is the slashdot crowd just flag all the comments with URLs in them as 'hidden' and on a daily/whenever basis go through them deleting spam and unhiding legitimate comments. Stick this all in a
Re:Good moderators help... (Score:3, Informative)
I basically gave up on blogging because I had to sort through 500 spam comments a day. I know another blogger who had to clean 7,000 (yes, thousand) spams out of his blog every day.
It took both of us longer than 10 minutes.
Re:Good moderators help... (Score:2)
A couple thoughts. (Score:1)
radical measure (Score:3, Interesting)
Check your capchas (Score:1)
How about using referrer logs to filter them out? (Score:2)
Won't help (Score:1)
IP addresses: The big boys use open proxies all over the world. You'll often get spam which is clearly from the same source but comes from IP addresses all over the place.
User agent strings: Again, the big boys use proper user agents so that they look like regular browsers.
Referrers: Those are unreliable even with human visitors, as proxies (as e.g. used by companies) often filter those out. By relying on referrers you'll block a good portion of your regular visitors.
Having said that, there are tools
Cheep medz (Score:5, Funny)
My advice involves Porn (Score:2)
A better idea is to ask the people who spend their time brute forcing porn sites. They'll know what is undefeatable and what isn't, where the webmaster may only be worried about limiting the damage instead of preventing it outright.
Bots may be using humans (Score:2)
1. Spammer farms out registration to third world sweatshops - for US$1 per day, a person just sits there and fills in registrations then passes them on to the bot system to use.
2. Spammer's system redirects your challenge to a "Free Porn Sign Up" page - now nudie-hungry humans are filling it in so they can see free naughties.
Either way is not impossible t
What's worked for me: easy damage control. (Score:3, Informative)
I had reasonable success by limiting posts to people who have verified their email address -- I think that that was also a feature of a recent phpBB update.
But the spam still outnumbered posts, so in the last two weeks I've added these two phpBB mods:
http://www.phpbbhacks.com/download/4878 [phpbbhacks.com] - this mod checks each registration IP address against the dns blacklists. I think that it improved the situation, but it didn't stop the problem outright, and I still had to clean up the board once in a while.
http://www.phpbbhacks.com/download/6208 [phpbbhacks.com] - this mod gives a really easy way to delete a user and all of their posts at once. It's not a fix, but it's turned out to be the best solution. It only takes a few seconds to undo the damage from any one individual, no matter how many spam posts that they have made. A person could spend 20 minutes registering and posting 20 messages and I have to spend 20 seconds nuking the account and all its posts. It's a fair trade, and I get some small satisfaction in that!
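How the first mod does its lookup internally is not shown here, but the usual DNS blacklist convention is to reverse the IPv4 octets, append the list's zone, and see whether the name resolves. A minimal sketch, where `dnsbl.example.org` is a placeholder zone and the function names are made up:

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNSBL lookup name: reversed IPv4 octets followed by the zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError if not a valid IPv4 address
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str = "dnsbl.example.org") -> bool:
    """Return True if the blacklist zone has an entry for `ip`.

    A failed lookup is treated as "not listed", so a DNS outage will
    let registrations through rather than block everyone.
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        return False
    return True
```

Registration code would call `is_listed()` with the visitor's address and the zone of whichever blacklist the admin trusts, then refuse or flag the signup on a hit.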
mod_security (Score:2, Informative)
Solution without Captchas (Score:1)
as Email spam. Delete by filters.
Add spam to text filter sets to reduce all future spam posts to blanks.
Sure, it's hard and time-consuming, and it takes its share of CPU power, but it's the most user-friendly.
No CAPTCHAs: just text filtering.
All spam forms can be catalogued and their strings added to blocklists.
I.e., if you post something containing the string "Am?z?ng op?or?un?ty" (question marks indicate any letter), you get banned for a week.
Or if you post "ch?ap Vi?gra substitut?",It get te
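Patterns like these translate directly into regular expressions. In the sketch below each "?" is treated as "any single character", which is a slightly wider reading than "any letter" (an assumption, so that digit or symbol substitutions are caught too); the function names are made up.

```python
import re


def wildcard_to_regex(pattern: str) -> "re.Pattern":
    """Compile a blocklist pattern where '?' stands for any one character."""
    parts = ["." if ch == "?" else re.escape(ch) for ch in pattern]
    return re.compile("".join(parts), re.IGNORECASE)


def is_spam(text: str, compiled_patterns) -> bool:
    """True if any blocklist pattern occurs anywhere in the post text."""
    return any(p.search(text) for p in compiled_patterns)


# Blocklist entry taken from the example in the comment above.
BLOCKLIST = [wildcard_to_regex("Am?z?ng op?or?un?ty")]
```

A post that matches would then be blanked or its author banned, whichever the forum's policy says. The patterns still have to be maintained by hand as the spam wording changes.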
in tandem (Score:1)
PunBB is a good forum (Score:1)
Although a good idea that I've seen on a forum once was that new users can't make a new topic until they make at least 2 replies first. Most bots are set up to make new topics and not replies, although I guess they could change that. I've even seen one forum that makes you wait 48 hours before you can ever post.
Another idea is
Several options (Score:2)
Here are a couple places to start your search:
I'm just putting the final touches on my own hashcash implementation that doesn't require a server-side database, I'll post a lin
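The poster's implementation is not shown, so the following is only a generic hashcash-style sketch of how proof-of-work can be checked without a server-side database: the server signs a timestamp with HMAC, the client searches for a nonce, and the server re-checks signature, age and hash. `SECRET`, `DIFFICULTY_BITS`, `MAX_AGE_SECONDS` and all function names are assumptions.

```python
import hashlib
import hmac
import time

SECRET = b"change-me"   # hypothetical server-side secret
DIFFICULTY_BITS = 12    # assumed: leading zero bits required in the hash
MAX_AGE_SECONDS = 600   # assumed: how long a challenge stays valid


def _sign(issued_at: int) -> str:
    return hmac.new(SECRET, str(issued_at).encode("utf-8"), hashlib.sha256).hexdigest()


def issue_challenge(now=None) -> str:
    """Server side: a timestamp plus its signature, so nothing needs storing."""
    issued_at = int(time.time() if now is None else now)
    return f"{issued_at}:{_sign(issued_at)}"


def _work_ok(challenge: str, nonce: int, bits: int) -> bool:
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode("utf-8")).digest()
    return int.from_bytes(digest, "big") >> (len(digest) * 8 - bits) == 0


def solve(challenge: str, bits: int = DIFFICULTY_BITS) -> int:
    """Client side: brute-force a nonce that satisfies the difficulty."""
    nonce = 0
    while not _work_ok(challenge, nonce, bits):
        nonce += 1
    return nonce


def verify(challenge: str, nonce: int, now=None, bits: int = DIFFICULTY_BITS) -> bool:
    """Server side: check signature, freshness and proof of work."""
    now = time.time() if now is None else now
    try:
        issued_text, signature = challenge.split(":", 1)
        issued_at = int(issued_text)
    except ValueError:
        return False
    expected = _sign(issued_at).encode("utf-8")
    if not hmac.compare_digest(signature.encode("utf-8"), expected):
        return False
    if not 0 <= now - issued_at <= MAX_AGE_SECONDS:
        return False
    return _work_ok(challenge, nonce, bits)
```

The trade-off of keeping no state is that a solved challenge can be replayed until it expires, so `MAX_AGE_SECONDS` has to stay short or be combined with some per-post check.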
Re:Captcha! (Score:2)
even though you must enter the visual confirmation code (the picture with letters/numbers)
sounds to me like he's already using captcha
Re:Captcha! (Score:1)
CAPTCHA can't stop real humans. (Score:2)
Re:Why does it have to be a program? (Score:2)
"...Maybe, they've hired a bunch of folks in: India, Mexico, wherever, to just manually register...."
Why hire people at all, when there's one born every minute who'll do it for free if you dangle a free [gadget of the day] in front of their greedy, gullible snouts?
From a previous