
Live spam-catching contest at CEAS

CmdrTaco posted more than 7 years ago

Spam 126

noodleburglar writes "The 2007 Conference on Email and Anti-Spam (CEAS) will feature a live spam-catching contest. Entrants will be treated to a torrent of spam and must use their spam filtering technique to filter out as much as possible, while also letting legitimate messages through. My money's on SpamAssassin." This ought to be a sweeps week television spectacular.

CRM114 (4, Informative)

sageFool (36961) | more than 7 years ago | (#18690681)

http://crm114.sourceforge.net/ [sourceforge.net] using hyperspace! It's been working better than spam assassin for me.

Fair Contest? (0)

Anonymous Coward | more than 7 years ago | (#18691011)

Could this actually be a fair contest though?

The first thing that came to my mind was: are they using scripts to send out "legit" emails to everyone? Or is there someone going through legit domains with legit accounts, typing or copy-pasting legit letters and sending them INDIVIDUALLY to EACH contestant?

LEGIT e-mails vary about as much as SPAM e-mails do. So how do they plan on rigging up the sending of LEGIT e-mails on a massive competitive level in ALL variations?

Re:Fair Contest? (0)

Anonymous Coward | more than 7 years ago | (#18691483)

yes

Agile and evolutionary versus ergodic spam (2, Insightful)

goombah99 (560566) | more than 7 years ago | (#18691949)

The trouble I can see with a test like this is that it's a static test. It assumes a key feature of spam which is not true, namely that the spam signature is constant over time, or at least it makes an ergodic assumption. The thing about spam is that it is evolutionary. Not only does its signature vary, but the spammers learn what is getting through and shift to sending more of that flavor.

To see why this matters, consider two hypothetical spam filters. One blocks 99% of the test set spam but lets a particular form of spam comprising only 1% of the test set through. Contrast this with another program that is adaptive but, to avoid false positives, has to err on the side of letting through 20% of the spam it flags (making it only 20% effective).

While the former method would smoke the latter in a static trial, in the real world spammers would just shift to exclusively sending the kind of spam that gets through the first filter.

To make this a real contest they should make it adversarial. Give the spam script a feedback signal on which spam is getting through and let it adjust its mix of spam and chaff to try to maximize the rate at which it can push spam through (or bust the filter by chaffing to minimize the number of legit e-mails that survive).
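
For illustration only, the argument above can be played out in a toy simulation. Everything here is an assumption made up for the sketch: the spam "flavors", the catch rates, the spammer's update rule (`adapt`), and the crude modeling of the adaptive filter as a flat 20% catch rate on every flavor. It is not how the CEAS contest or any real filter works.

```python
# Toy model: a filter is a dict mapping spam "flavor" -> fraction caught.
# Flavors, rates, and the feedback rule are hypothetical.

def leak_rate(filter_rates, mix):
    """Fraction of all spam that gets through for a given flavor mix."""
    return sum(share * (1.0 - filter_rates[flavor]) for flavor, share in mix.items())

def adapt(filter_rates, mix, step=0.5):
    """Spammer moves `step` of its volume to the flavor that is caught least."""
    best = min(filter_rates, key=filter_rates.get)
    new_mix = {flavor: share * (1.0 - step) for flavor, share in mix.items()}
    new_mix[best] += step
    return new_mix

def simulate(filter_rates, mix, rounds):
    """Return the leak rate seen in each round as the spammer adapts."""
    history = []
    for _ in range(rounds):
        history.append(leak_rate(filter_rates, mix))
        mix = adapt(filter_rates, mix)
    return history

# Static filter: perfect on 99% of the test set, blind to one rare flavor.
static_filter = {"pills": 1.0, "stocks": 1.0, "image": 0.0}
# Stand-in for the adaptive filter: catches 20% of everything.
flat_filter = {"pills": 0.2, "stocks": 0.2, "image": 0.2}
# Flavor mix of the static test set.
test_set_mix = {"pills": 0.60, "stocks": 0.39, "image": 0.01}

if __name__ == "__main__":
    for name, f in (("static", static_filter), ("flat", flat_filter)):
        print(name, [round(x, 3) for x in simulate(f, test_set_mix, 6)])
```

In round zero the static filter leaks about 1% of spam and looks far better. Once the spammer gets feedback, its leak rate climbs toward 100%, while the flat filter stays where it was.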

Re:Agile and evolutionary versus ergodic spam (1)

gvc (167165) | more than 7 years ago | (#18692193)

The trouble I can see with a test like this is that it's a static test.

No it isn't. Hence the name Live Spam Challenge.

Re:Agile and evolutionary versus ergodic spam (1)

goombah99 (560566) | more than 7 years ago | (#18692357)

No, you are mistaken, I believe. The term "live" is meant in the sense of real-time, sequentially delivered spam: an on-line test, not a test where one has the entire corpus of spam to train and filter. But the spam signature waveform is, unless I'm wrong, not going to be reactive to the filters. I'd even bet that all filters will be delivered the same message sets for ease of comparison. I doubt the spam will evolve its signature in an intelligent reactive manner to evade the filter. But that's the hallmark of real spam--it not only varies but it adapts.

Re:Agile and evolutionary versus ergodic spam (1)

gvc (167165) | more than 7 years ago | (#18692897)

I meant live to mean that the spam was captured and delivered in real time. If one or more spam filters adds the spam to Razor, or an RBL, or whatever, that'll be observable -- by spammers and filters alike.

Re:Fair Contest? (1)

TFGeditor (737839) | more than 7 years ago | (#18694065)

I was wondering how the test-spam generator will handle headers, especially origin IP address. That alone is often 75 percent accurate in determining spam. If it sources from an IP in Korea, South America, or Europe and is destined for a North American inbox, odds are it is spam.

Not flaming, just an observation based on my own experiences.

Re:CRM114 (1, Funny)

Anonymous Coward | more than 7 years ago | (#18691307)

Unlike many other "filters", CRM114's default action is to read all of input, and put NOTHING onto output.

This is either:

1) "automatic" white-listing?
2) Not healthy and you should eat more fibre.

Cruel and inhumane (0)

Anonymous Coward | more than 7 years ago | (#18691687)

I certainly hope that after this senseless hunt, they'll re-release the poor SPAMs back into the wild where they belong.

Greylisting (0)

Anonymous Coward | more than 7 years ago | (#18690697)

My money would be on greylisting + RFC compliance checking except for the fact that those are very hard to do in a testbed.

My money (1)

Mateo_LeFou (859634) | more than 7 years ago | (#18690699)

is on whatever Gmail uses. I've not yet seen a spam message in my inbox, nor have I missed any mail, even from auto-mailing scripts at websites I'm building...

Re:My money (2, Funny)

rodney dill (631059) | more than 7 years ago | (#18690773)

Well let's just find out, just what is your gmail address, hmmmm?

;)

Mateo_LeFou, prepare yourself... (0)

Anonymous Coward | more than 7 years ago | (#18690821)

Every email address variation of "Mateo_LeFou" is now being generated and gmail is now being bombarded using my army of hijacked PCs. It's just a matter of time. You will have 50GB of spam within the hour...

Re:Mateo_LeFou, prepare yourself... (2, Funny)

Zephyros (966835) | more than 7 years ago | (#18691367)

Translation: "You have no chance to survive. Make your time."

Group spam detection (4, Informative)

Animats (122034) | more than 7 years ago | (#18690811)

Gmail, like SpamCop, has a group spam filter system. It looks at mail sent to a large number of recipients. The defining characteristic of spam is that it's sent to a large number of recipients, after all. If you're in a position to watch the incoming mail of a few million mailboxes, detecting spam is easy.

Re:Group spam detection (1)

ProfessionalCookie (673314) | more than 7 years ago | (#18690985)

Yeah- I'm waiting to see algorithmically generated spam where no two messages are alike. Bleh! That being said gmail does a tremendous job of letting through legitimate messages (which is no doubt the hardest part of making a spam filter these days).

Re:Group spam detection (0)

Anonymous Coward | more than 7 years ago | (#18691221)

A lot of spam is generated algorithmically. The key, though, is that a spammer typically cannot send each email individually. It sends its spam to a relay, and the relay is the one that actually does the mass mailing. If a spammer were actually required to send each email that it sent, that would GREATLY increase its operating costs.

Re:Group spam detection (0)

Anonymous Coward | more than 7 years ago | (#18692823)

The key, though, is that a spammer typically cannot send each email individually. It sends its spam to a relay, and the relay is the one that actually does the mass mailing. If a spammer were actually required to send each email that it sent, that would GREATLY increase its operating costs.

Huh? Have you been living in a cave? Most spam that my spam filter catches, even the same message to 40 different users in my domain, comes from 40 different addresses. Botnets baby. Get with the times man.

Re:Group spam detection (1)

Animats (122034) | more than 7 years ago | (#18691423)

Yeah- I'm waiting to see algorithmically generated spam where no two messages are alike.

We've had that for years. The latest variant is in those Viagra spams with a faint pattern of background noise in the images, different for each spam.

Re:Group spam detection (5, Interesting)

kebes (861706) | more than 7 years ago | (#18691017)

You're right--but the size of Gmail gives them another advantage. In those marginal cases where the spam filter isn't sure about an email (is this spam or a mailing list?) it has the advantage of having a huge number of people checking all the emails. That is, the users do the final check.

I have received a spam to my gmail account exactly once. And when I did, shocked, I clicked the "mark as spam" button. The point is that this spam was probably sent to millions of Gmail users, and the algorithm wasn't sure how to categorize it. But because I clicked "spam" (and probably a few other people did, too), it was marked as spam for everyone. So most users never saw it in their inbox. Thus only a dozen out of the million recipients were ever bothered by the spam. Conversely, an email list would receive no (or very few) "mark as spam" clicks, and would be allowed to pass. So basically the Gmail userbase acts as the workforce to continually train the spam filter, and moreover to detect new spam within minutes of it being sent.

It's hard to beat a system like that. But the point is that it relies on the large number of users who are all (effectively) sharing their spam training sets with each other in realtime.

This is not to say that the baseline algorithm that Gmail implements isn't quite effective, but the point is that Gmail can use the users to resolve those tricky false-positive and false-negative situations.
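
As a rough sketch of the shared-verdict idea described above (not Gmail's actual mechanism, which is not public): count "mark as spam" clicks per message fingerprint and flip the verdict once enough users agree. The class name `SharedSpamVotes`, the thresholds, and the exact-hash `fingerprint` are all assumptions for illustration.

```python
import hashlib
import re
from collections import defaultdict

def fingerprint(body):
    """Hash of the body after lowercasing and collapsing whitespace."""
    normalized = re.sub(r"\s+", " ", body.strip().lower())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

class SharedSpamVotes:
    """Toy pool of user verdicts shared across all mailboxes."""

    def __init__(self, min_reports=12, min_spam_ratio=0.9):
        self.min_reports = min_reports        # spam clicks needed before acting
        self.min_spam_ratio = min_spam_ratio  # share of votes that must say spam
        self.spam_votes = defaultdict(int)
        self.ham_votes = defaultdict(int)

    def report(self, body, is_spam):
        """Record one user's 'spam' or 'not spam' click."""
        key = fingerprint(body)
        if is_spam:
            self.spam_votes[key] += 1
        else:
            self.ham_votes[key] += 1

    def verdict(self, body):
        """Return 'spam' once enough users agree, otherwise 'unsure'."""
        key = fingerprint(body)
        spam = self.spam_votes[key]
        ham = self.ham_votes[key]
        if spam < self.min_reports:
            return "unsure"
        if spam / (spam + ham) >= self.min_spam_ratio:
            return "spam"
        return "unsure"
```

An exact hash is the weakest part of this sketch: any per-message variation in the body defeats it, so a real system would need a fuzzier fingerprint. The ratio check also shows why a handful of "not spam" votes can pull a message back to "unsure".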

Re:Group spam detection (1)

iminplaya (723125) | more than 7 years ago | (#18692279)

Doesn't this lead to the possibility that a group of users could mark a legitimate sender as a spammer? I think this is an old question, but I don't remember the answer. And if it is possible, how do you defend against it?

I wonder how they deal with pseudo-spam (1)

grahamsz (150076) | more than 7 years ago | (#18692323)

I know I've removed myself from a few mailing lists by simply having gmail count them as spam.

These aren't really spam, they are companies that I did business with once and can't be bothered to find my username and password to change my email subscription settings. But gmail seems to happily block everything else from that sender without my interaction.

Surely other users do want these particular emails so there must be some kind of per user dynamic as well.

Re:Group spam detection (1)

asninn (1071320) | more than 7 years ago | (#18693403)

Thus only a dozen out of the million recipients were ever bothered by the spam. Conversely, an email list would receive no (or very few) "mark as spam" clicks, and would be allowed to pass. So basically the Gmail userbase acts as the workforce to continually train the spam filter, and moreover to detect new spam within minutes of it being sent.

This probably plays a role, but it will not be the only thing GMail relies on (and probably not even the most important factor), and it will likely require more than a dozen people, too. Think about it - otherwise, a spammer could just set up twelve fake GMail accounts, send the spam message in question to those as well, and mark the messages as "Not Spam" there when the filter catches them after the dozen users you refer to tell the system that it's indeed spam.

Needless to say, this is probably still being done - and given that email is a pretty private matter, I don't think there's much webmail providers can do about it, either. After all, it's not like you can just have an employee look at someone's account after the system flagged it as "suspicious" to see if they are a legitimate user or not; doing so would be a rather crass invasion of people's privacy and their right to private communication.

So the spammers *are* doing it (why wouldn't they, after all?), and GMail etc. can't really do all that much about it - and therefore, the system will probably not depend on user input quite as much as you think.

Re:Group spam detection (1)

Matt Perry (793115) | more than 7 years ago | (#18694349)

I have received a spam to my gmail account exactly once.
I wish my Gmail account was like that. Maybe you're new to Gmail. I get several spams in my inbox per week. Mostly these are spam messages in Russian and Chinese but I still get a lot of spam in English as well. I always use the button to mark them as spam, but Gmail doesn't seem to get the message that I don't want anything written in Russian. It's also disappointing that I can't create a filter to mark messages as spam. The best I can do is catch emails with Russian or Chinese characters and filter them off to a folder where I later go and mark them as spam.

Re:Group spam detection (0)

Anonymous Coward | more than 7 years ago | (#18691697)

If it's so easy, then why hasn't Earthlink mastered it?

Re:Group spam detection (0)

Anonymous Coward | more than 7 years ago | (#18691771)

They use DCC? [rhyolite.com]

Anyone can use DCC.

Re:My money (0)

Anonymous Coward | more than 7 years ago | (#18690955)

italiasw@gmail.com , it's go time!

Re:My money (2, Informative)

0100010001010011 (652467) | more than 7 years ago | (#18690999)

Set up a catchall on your domain. You'll start getting stuff through. Especially the image ones. Some of the newer "make it look like a real e-mail" spam gets through.

Every website I have gets its own e-mail account, e.g. slashdot@myhost.com.
One day I started getting spam to site@myhost.com. So I set up Dreamhost to bounce everything sent to that e-mail address.

Then I started getting flooded with:
otehoenut-site@myhost.com
cgjwbmkh-site@myhost.com

Google has, thankfully, let me delete everything sent to *site@myhost.com, but for a time I was still getting them.

Gmail's filtering is not that great (1)

winkydink (650484) | more than 7 years ago | (#18691261)

Try slutting your address around a bit. Mine is only publicly readable here on /. and I get plenty of spam in my gmail inbox. Yahoo seems to do a better job based on my experience.

Re:Gmail's filtering is not that great (1)

jfengel (409917) | more than 7 years ago | (#18692435)

Huh. I'm using GMail to host my domain. My email addresses are pretty slutty (a combination of supporting the catchall, some public "info@" addresses that get forwarded to me, and a few mailing lists with lousy privacy or security policies.)

I do see perhaps three spams a day that actually make it into the inbox, and about 300 or so that are shunted to the spam folder.

There may be false positives in there, but with 300 per day I'm not going to find out. I've never noticed one in there, or had a friend tell me about an email that never reached me.

Re:My money (1)

hpavc (129350) | more than 7 years ago | (#18691621)

The google gmail news group says otherwise for many other people, the filtering is practically non-existent it seems for me.

Re:My money (1)

thePowerOfGrayskull (905905) | more than 7 years ago | (#18692133)

is on whatever Gmail uses. I've not yet seen a spam message in my inbox, nor have I missed any mail, even from auto-mailing scripts at websites I'm building...
I will agree that it's great for spam; but when it comes to 419 emails, it sucks. Badly. I'm not sure how I got on the 419ers lists, but I get at least 10-12 of them a day, none of which are caught by gmail filters. On the other hand, the 50-60 regular spam emails are correctly filtered. If only I could perform regex filtering in gmail, I could catch the 419 emails myself very easily, as they all have very common attributes.

Re:My money (1)

gvc (167165) | more than 7 years ago | (#18692251)

You're welcome to use Gmail -- or any other filter you like, animal, vegetable, or mineral -- to participate in the Live Challenge.

Re:My money (1)

SL Baur (19540) | more than 7 years ago | (#18693015)

My bet would be on the gmail filter too. I've had my old xemacs.org email address (which has been harvested to death) forwarded through there for some months now. It's not perfect, but it still only lets through about as much spam as my old handcrafted .procmailrc did 8 or 9 years ago. Which is really good considering how much more spam there is today.

If I could tell it to junk everything except text in certain languages it would work even better. It seems to miss a lot of Korean and Russian spam.

Sweeps (2, Funny)

cyphercell (843398) | more than 7 years ago | (#18690725)

This ought to be a sweeps week television spectacular.

I think I've seen people catching spam on tv, just not the kind you're talkin' 'bout. http://www.spam.com/ [spam.com]

Re:Sweeps (1)

session_start (1086203) | more than 7 years ago | (#18692573)

The trick is to try to catch the spam in a net with such velocity that the spam "squishes" through the net to fall on the ground, leaving you with only valid "message" hidden amongst the spam.

My money (1)

TodMinuit (1026042) | more than 7 years ago | (#18690745)

My money is on whoever rigs up an Amazon Mechanical Turk-based system fast enough.

Re:My money (1)

Afecks (899057) | more than 7 years ago | (#18691711)

My money is on whoever rigs up an Amazon Mechanical Turk-based system fast enough.

Because you'd really want thousands of random people reading your emails looking for spam?

Wonder what the SPAM messages are? (1)

willie_nelsons_pigta (1006979) | more than 7 years ago | (#18690747)

Wonder what the SPAM messages are?
One of the funnier ones I think I ever got was from Oliver Kloshoff for "Male Enhancement".

What... (0)

Anonymous Coward | more than 7 years ago | (#18690753)

No department? Come on taco, you can keep up the tradition. A lame one is better than no one.

Re:What... (0)

Anonymous Coward | more than 7 years ago | (#18691971)

lolled!

Damn. (0)

daeg (828071) | more than 7 years ago | (#18690757)

Damn. I was hoping they'd be launching phone-book sized printed copies of spam at the contestants, complete with blood, with each week adding a few pounds. Add some half naked chicks and dudes (cater to multiple markets) dancing around, maybe some buckets of slime and you've got yourself a show worthy of running on Fox.

Curious:When urologists email each other... (4, Interesting)

dpbsmith (263124) | more than 7 years ago | (#18690847)

... are they able to refer to Pfizer's brand name for sildenafil, Lilly's name for tadalafil, or Bayer's brand name for vardenafil without getting caught in the spam filters?

Re:Curious:When urologists email each other... (3, Informative)

kebes (861706) | more than 7 years ago | (#18691185)

Suffice it to say that a doctor is likely to write an email like:

"Ted, I just read the news about Viagra in the New England Journal of Medicine. Very interesting results, though the error bars are a bit large to draw any major conclusions just yet. What do you think?"

Whereas a doctor rarely writes email like:

"NoW ava ilable is generic V1AGRA at low price! Generic, quality, all low price now!"

The point is that modern spam filters don't just look for "bad words" but consider relative word frequencies, the sender and receiver fields, word correlations, formatting elements, URLs, etc. Spam filters in your email client will be trained against email you typically send/receive, and so can be even more precise. Spammers of course try to make their emails include words so that they end up looking like real email, but if the filter is good enough, then the only way to get past it is to send an email that now lacks those critical spam elements (like the link you're supposed to click to buy the generic drug or whatever)...
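
To make the "relative word frequencies" point concrete, here is a minimal naive Bayes sketch trained on the two example emails quoted above. It is an illustration under stated assumptions, not the method of any particular filter: the class `TinyBayes`, the tokenizer, and the add-one smoothing are choices made for this sketch, and a real filter would also use headers, URLs, and far more training mail.

```python
import math
import re
from collections import Counter

def tokens(text):
    """Lowercase alphanumeric tokens; punctuation is dropped."""
    return re.findall(r"[a-z0-9]+", text.lower())

class TinyBayes:
    """Two-class naive Bayes over word counts with add-one smoothing."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.docs = {"spam": 0, "ham": 0}

    def train(self, text, label):
        self.counts[label].update(tokens(text))
        self.docs[label] += 1

    def spam_log_odds(self, text):
        """Positive means 'looks more like the spam we trained on'."""
        vocab_size = len(set(self.counts["spam"]) | set(self.counts["ham"]))
        total_spam = sum(self.counts["spam"].values())
        total_ham = sum(self.counts["ham"].values())
        score = math.log((self.docs["spam"] + 1) / (self.docs["ham"] + 1))
        for tok in tokens(text):
            p_spam = (self.counts["spam"][tok] + 1) / (total_spam + vocab_size)
            p_ham = (self.counts["ham"][tok] + 1) / (total_ham + vocab_size)
            score += math.log(p_spam / p_ham)
        return score

nb = TinyBayes()
nb.train("NoW ava ilable is generic V1AGRA at low price! "
         "Generic, quality, all low price now!", "spam")
nb.train("Ted, I just read the news about Viagra in the New England Journal "
         "of Medicine. Very interesting results, though the error bars are a "
         "bit large to draw any major conclusions just yet. What do you think?",
         "ham")
```

With this training data, a message such as "generic v1agra low price" scores above zero, while "I read the journal results about viagra" scores below zero: the correctly spelled drug name counts as evidence for legitimate mail because that is where the filter has seen it.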

That depends upon the method used. (1)

khasim (1285) | more than 7 years ago | (#18691217)

Pure content scanning would probably trigger those ... unless you had previously manually approved similar messages.

Other approaches use multiple tests such as checking whether the sending server's IP address is on a blacklist or whether any of the links in the message (should it contain links) were on blacklists.
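
As a small sketch of the blacklist test mentioned above: DNS-based blacklists are conventionally queried by reversing the IPv4 octets and appending the list's zone, then doing an ordinary A-record lookup. The helper name `dnsbl_query_name` and the zone `dnsbl.example.org` are placeholders for this example, not a real list.

```python
import ipaddress

def dnsbl_query_name(ip, zone):
    """Build the hostname to look up for an IPv4 address in a DNS blacklist.

    Example: 192.0.2.99 in zone dnsbl.example.org becomes
    99.2.0.192.dnsbl.example.org
    """
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"

# A resolver lookup of the returned name (for instance with
# socket.gethostbyname) would then tell you whether the address is listed;
# by convention an answer in 127.0.0.0/8 means "listed" and a failed
# lookup means "not listed". That network step is left out of this sketch.
```

The same name-building step would apply to each list consulted, with the results combined with the other tests into an overall decision.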

Re:Curious:When urologists email each other... (1)

misleb (129952) | more than 7 years ago | (#18691555)

Only if they write things like:

Hey, I just pre sc ribed V.1.4.G.R.A to a patient today.

The monk said to the fox, why don't the squirrels to be or not to be, that is my answer. The fog was as thick as umbrellas in the wind thought the old maid.

Re:Curious:When urologists email each other... (0)

Anonymous Coward | more than 7 years ago | (#18694077)

I always assumed that spammers wrote "Hey, I just pre sc ribed V.1.4.G.R.A to a patient today." precisely because they couldn't write "Hey, I just prescribed Viagra to a patient today." without getting caught in the spam filters.

The mostly-grammatical Zen nonsense at the end would presumably improve the chances of being accepted (by lowering the Viagra-centrism of the message).

Re:Curious:When urologists email each other... (1)

mutterc (828335) | more than 7 years ago | (#18693387)

Happened with a lame spam filter my company used to have. This was a year or so ago.

I emailed my wife "can you stop by and pick up the Strattera and Effexor from the pharmacy?" once. Her reply, containing my message, got plonked by the filters.

I wish the contest was.... (2, Interesting)

ruffnsc (895839) | more than 7 years ago | (#18690857)

physically catching the spammers! (your imagination can do the rest)

The First Annual Greased Spammer Contest! (4, Funny)

Penguinisto (415985) | more than 7 years ago | (#18691101)

(cue Monster Truck Rally announcer guy voice...) THIS SATURDAY AT THE EXPO CENTER! The Best admins and the worst spammers come together in a throwdown-showdown-lowdown Greased Spammer Contest! We kidnap, strip, and grease down every known spammer we can find on Planet Earth! We bring 'em here, then we give our lucky mail server admins (as determined by lottery) a chance to catch 'em! The spammers will be released into a large pit, where the admins may use any method to catch and immobilize spammers (firearms and other projectile weapons are excluded). Points will be given for the number of spammers caught, the methods of capture, and the level of eye-rattling violence applied to each spammer after their capture! Watch as the winning admin gets to publicly execute the dreaded Sanford Wallace by any method that he or she can dream up! Any method at all! You'll buy a ticket for the whole seat, but you will only need the edge! Get your tickets at the Mondotix - DON'T MISS IT!(/voice)

/P

Re:The First Annual Greased Spammer Contest! (1)

NewbieV (568310) | more than 7 years ago | (#18693853)

You forgot to mention that it's being held on

SUNDAY! SUNDAY! SUNDAY!

Be There!

Re:I wish the contest was.... (1)

HTH NE1 (675604) | more than 7 years ago | (#18691345)

I wish the contest was physically catching the spammers!
Only as long as it is not catch-and-release.

Will SMTP server settings count as well? (1)

Penguinisto (415985) | more than 7 years ago | (#18690889)

...or just the filter software/daemon performance/stats alone? There's lots you can do to the MTA itself to stop spam before it even has to be examined by the filters (mostly by monkeying w/ the SMTP session handling and timeouts).

It'd be interesting to see a solid setup that handles a combination of the two, then publish the results (yes, spammers can read those results/settings to try to foil the setup, but many settings would make it patently unprofitable for them to do so).

/P

I can't tell from the write up. (1)

khasim (1285) | more than 7 years ago | (#18691487)

But I doubt that they have a hundred thousand systems that they'll be using to send the test spam.

A big part of the system I use at work is based upon IP addresses and rDNS. I block a HUGE amount of spam just by rejecting all connections from Comcast that aren't from their SMTP servers.

I know, some people want to run SMTP servers at home. But so far none of them have attempted to send email to my system.

So it really depends upon how they configure the test spam servers. Personally, I don't see this as being a very useful competition. But I may be wrong.

Re:Will SMTP server settings count as well? (1)

gvc (167165) | more than 7 years ago | (#18692427)

Envelope information will be preserved, so you can determine the purported sender, multiple recipients, HELO IP, actual IP, etc. But you can't play interactive games with the SMTP protocol because the same email must be delivered to all participants.

Re:Will SMTP server settings count as well? (1)

pe1chl (90186) | more than 7 years ago | (#18692591)

I agree. I filter the majority of spam by just doing strict RFC compliance testing in the SMTP engine. It rejects almost everything sent via botnets. What comes through is mostly 419 scamming, because that is sent via bona fide mailservers. But that is easily filtered with SpamAssassin.

Re:Will SMTP server settings count as well? (1)

SCHecklerX (229973) | more than 7 years ago | (#18694859)

That's my plan (I want to see how well my stuff works without customizing it too much just for the contest). Let's hope more details arrive soon...

The prize list :) (5, Funny)

davidwr (791652) | more than 7 years ago | (#18690895)

1st prize: Job offer from a security-software vendor
2nd prize: Lifetime supply of Hormel meat products
3rd prize: Commemorative tin of SPAM meat product
Last place: Inheritance from Nigerian Prince

Re:The prize list :) (1)

LearnToSpell (694184) | more than 7 years ago | (#18691611)

2nd prize: Lifetime supply of Hormel meat products

Which is about 4 1/2 days if that's all you eat.

mod Up (-1, Offtopic)

Anonymous Coward | more than 7 years ago | (#18690897)

Don't walk around ME! It's official poor prio8ities, every chance I got FreeBSD had long consistent with the Troubles of Walnut A NEED TO PLAY Be treated by your his clash with

that's easy. Yahoo mail! (2, Funny)

number6x (626555) | more than 7 years ago | (#18690921)

Just open a yahoo mail account, and start posting with the e-mail address all over the internet.

You'll catch more spam than anyone else!

Oh, you want me to filter out spam, not just get spam, nevermind.

Still, it might be the fastest way to build a database of spam.

Re:that's easy. Yahoo mail! (1)

CrazyTalk (662055) | more than 7 years ago | (#18691071)

Actually that's not a bad idea - have a contest to see how much spam you can ATTRACT with a fresh email account in a given time period. My Verizon account would win hands down. (And to you spammers out there - no, my email address is NOT CrazyTalk@verizon.net)

Re:that's easy. Yahoo mail! (1)

Kozar_The_Malignant (738483) | more than 7 years ago | (#18691983)

Actually that's not a bad idea - have a contest to see how much spam you can ATTRACT with a fresh email account in a given time period. My Verizon account would win hands down. (And to you spammers out there - no, my email address is NOT CrazyTalk@verizon.net)

The poor bastard who actually does have CrazyTalk@verizon.net is really, really pissed about now.

Professional spammers in attendance? (4, Interesting)

MobyDisk (75490) | more than 7 years ago | (#18690957)

I wonder if professional spammers will attend the conference to learn how to get through the next generation of filters. Maybe it would be like playing spot the Fed at the hacker's conferences.

SpamAssassin? (3, Interesting)

raddan (519638) | more than 7 years ago | (#18690993)

Ha ha, silly admin. My money's on greylisting [wikipedia.org] .

We use both SpamAssassin and OpenBSD's spamd, to great effect. spamd does most of the work, though. Daniel Hartmeier [benzedrine.cx] (site down ATM, unfortunately) has an example of how to tie SA scores back into spamd for blacklisting, which is just awesome. I'd implement it here, but our current setup is effective enough as to not make it worth my time.

Greylisting no longer works (1)

Tipa (881911) | more than 7 years ago | (#18691285)

Greylisting was designed on the single proposition that spam mailers wouldn't "call back" if they got a "call back later" code from the site they were spamming. And maybe that was true for awhile. In my last job I had to add spam filtering to our email and greylisting was one of the first things I tried.

The spammers just kept trying until they got through.

Spamming has evolved past greylisting and it is now worthless.

Bayesian keyword filtering is decent, but is constantly attacked by images or hiding the spam content in random text. If you train it well, you can eventually just pass through the sort of mail you normally get and spam that doesn't mesh with your normal mail might get blocked, but when you take this to a company level it fails unless everyone separates their spam from their real mail and makes appropriate filters and rules -- which they won't do.

It's a tough problem, and there's no one solution that can do the whole job. A well-trained Bayesian along with an RBL like Spamcop can get about 80% of them.

Re:Greylisting no longer works (1)

LurkerXXX (667952) | more than 7 years ago | (#18691499)

Graylisting is worthless? Umm, no.

It's certainly not perfect, but it reduces the load on my spam-filter. A *lot*. More than 90+% of smtp connections don't make it through spamd here. I hardly call that worthless.

Last year it was more like 99+%. Here's some stats from someone else last year: http://undeadly.org/cgi?action=article&sid=20060217105149 [undeadly.org]

Re:Greylisting no longer works (2, Interesting)

raddan (519638) | more than 7 years ago | (#18691993)

It doesn't work? Maybe you should tell that to my 300-strong userbase!

I'm certain that there are differences in implementation between different greylisters. I've never tried Postfix's, for example, because OpenBSD's works fine for me. A small point with respect to OpenBSD's spamd: you actually need to try thrice. The first time you're rejected. The second time you're marked as OK, but still rejected. The third time you get through. Maybe it's the third time, or some of the time limits, or some other things that spamd is doing (BTW, we do not use *any* blacklists), but it works great. I probably see a spam in my inbox once a month, maybe. The rest of my users who complain about the "spam" they're still getting are really getting email they've signed up for (listservs aren't spam, people!), in which case, it's usually just a simple matter of education.
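
For readers who have not seen greylisting, here is a minimal generic sketch of the idea: temporarily reject the first delivery attempt for an unseen (IP, sender, recipient) triplet and accept a retry that arrives after a minimum delay. This is an assumption-laden illustration, not OpenBSD spamd's implementation (which, as described above, takes three attempts); the class name `Greylist` and the timing values are invented for the example.

```python
class Greylist:
    """Toy greylister keyed on (client IP, envelope sender, recipient)."""

    def __init__(self, min_delay=300, max_age=4 * 3600):
        self.min_delay = min_delay  # seconds a sender must wait before retrying
        self.max_age = max_age      # seconds after which a pending entry expires
        self.first_seen = {}        # triplet -> time of first attempt
        self.whitelist = set()      # triplets that have passed once

    def check(self, ip, sender, recipient, now):
        """Return 'accept' or 'tempfail' (an SMTP 4xx 'try again later')."""
        triplet = (ip, sender, recipient)
        if triplet in self.whitelist:
            return "accept"
        first = self.first_seen.get(triplet)
        if first is None or now - first > self.max_age:
            # Unknown or expired: start the clock and ask the sender to retry.
            self.first_seen[triplet] = now
            return "tempfail"
        if now - first < self.min_delay:
            # Retried too quickly; keep deferring.
            return "tempfail"
        self.whitelist.add(triplet)
        return "accept"
```

A well-behaved mail server queues the message and retries later, so it gets through on a later attempt; software that fires once and moves on never does. Spam software that does retry passes this check, which is why it is usually combined with other tests.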

I don't know where your greylisting system failed, but it works wonders for us. When I implemented it, I was a sysadmin rock star for a week. Who knew there were anti-spam groupies? Now it's back to picking the crud out of the VP's keyboard ;^)

(You're spot-on about one thing though: defense in depth. That principle is in effect for EVERYTHING, which is why I want to administer electric shocks to our Mac users when they try to call the Help Desk.)

Flawed (2, Informative)

lazarus (2879) | more than 7 years ago | (#18692309)

"This ought to be a sweeps week television spectacular."
This ought to be ignored as the contest is flawed.

"Ha ha, silly admin. My money's on greylisting."
They're sending a stream of spam from where? Sounds like a real mail server...

From TFA: "Live email stream, delivered by standard protocols (SMTP, IMAP, POP)"
[One wonders how else they would deliver e-mail if it was not from standard protocols. I also wonder how they plan on delivering e-mail using POP... The mind boggles...]

In any case if I read this correctly this effectively eliminates anti-spam technologies which work on the premise that the spam is coming from illegitimate mail servers. One of these techniques is greylisting. Meaning, greylisting will not work. So if I were you, I wouldn't put your money on it.

GENERAL JUNK E-MAIL FILTERING RANT (You've been warned): If you're using an anti-spam technique which takes more cpu cycles to execute than it takes for the spammer to send the damn spam in the first place, you've already lost this war. In other words, as long as it's costing you more than it is costing him/her you will always be on the losing end of the deal.

And I would like to add that despite my post above, I agree with you that greylisting and its derivatives when properly deployed are excellent techniques for eliminating UBE. But I think this contest is engineered to ignore that fact.

Re:Flawed (3, Interesting)

gvc (167165) | more than 7 years ago | (#18692743)

So here's the issue. If you are going to try to discriminate among filters using several thousand messages, you have to send them all the same messages. To send them the same messages you have to capture and redistribute them. You can pass on all the info from the capture, including all SMTP commands, but you can't do intrusive protocol probes. And since this is *real spam* you can't very well ask the sender to act in an obliging way by repeating its message and behavior for each participant.

I'd be very interested to hear of a design that would allow greylisting to be tested. The best I can come up with is to fail the message after transmission, then to try to simulate the behavior of the sender in response to this failure. But that would be catering to one very specific method of perturbing the protocol. And it would be necessary to do a fair amount of work to spoof the IP address presented to the participant filters.

For this reason, we chose to exclude all SMTP interactions, and simulate a second-in-the-chain filter appliance application. The reasons are practical, not policy.

Re:Flawed (1)

lazarus (2879) | more than 7 years ago | (#18693743)

Gordon,

Thanks for your response. I just sent your counterpart at IBM a lengthy probing e-mail about this which I can summarize as:

1. Real stream or fake stream?
2. Points for cost effectiveness?
3. Points for scalable/redundant architecture?

I applaud what you are doing and I wish you the best success (contests like this are good at stimulating inventiveness). I've been racking my brain trying to figure out how you could do this in a way that wouldn't be discriminatory. The best I could come up with would be to create a bunch of new domains (ceas-t1.org to ceas-t1000.org or whatever), then seed e-mail addresses on these domains with the spamming community. During the contest you insert legitimate e-mails into the stream by sending them from previously-undisclosed servers. The problem is how do you gauge success if you don't know what junk e-mail has been sent to the domain. You can't relay it because that instantly makes the test invalid. This technique would require previous statistics and a long lead time. Even then it would be possible for rival competitors to sabotage other entrants' tests if they knew the domains being used...

Alternatively you would have to use an existing domain with known stats and perform the contest in a sequential fashion. Again, this becomes very time consuming and there are risks associated with doing it fairly. In short, I cannot think of a good way to do this.

I have developed a technique called GDSA which is quite effective, scalable, and cost effective, but which in part relies on spammers' need to remain anonymous. This technique will not work in your contest (despite its effectiveness in the real world), and I will be unable to enter (unfortunately).

That said, it is easy to criticize, but difficult to be constructive. If I can think of a legitimate technique I will let you know.

Thanks.

Re:Flawed (1)

Thundersnatch (671481) | more than 7 years ago | (#18695021)

GENERAL JUNK E-MAIL FILTERING RANT (You've been warned): If you're using an anti-spam technique which takes more cpu cycles to execute than it takes for the spammer to send the damn spam in the first place, you've already lost this war. In other words, as long as it's costing you more than it is costing him/her you will always be on the losing end of the deal.

Yes, the spammer will always win, since his CPU cycles and bandwidth are free. But those costs don't matter at all.

Bayesian and other resource-intensive spam filtering techniques are popular because they save people's time, which is far more expensive and valuable than CPU cycles. So you want your filter to catch the most spam. But false positives cost far more in people-time and lost opportunity than a missed spam, so reducing those is nearly as important as catching spam.

RBLs, SPF, greylisting, and most other protocol-oriented filtering techniques have comparatively high false positive rates. They also make recovering from false positives very difficult. A false positive will likely go unnoticed by the sender if they don't recognize the bounce message (there's lots of "bounce spam") and take appropriate action to call by phone or use a website form or whatever. And then mail administrators need to get involved and whitelist the addresses.

Most businesses, including mine, find a high false positive cost unacceptable. So we accept delivery of just about everything, and stick it in a "junk mail" folder if the content and protocol filters really, really don't like it. Yeah, it costs us in CPU cycles, storage, and bandwidth, but it's the best tradeoff at the moment.

Re:SpamAssassin? (0)

Anonymous Coward | more than 7 years ago | (#18693711)

I do all this using MIMEDefang [mimedefang.com] tied to a MySQL database. High SpamAssassin scores or viruses get the sending IP blacklisted for an increasing amount of time per incident.
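
The "increasing amount of time per incident" part can be sketched like this. To be clear, this is not the MIMEDefang/MySQL setup described above: it is an in-memory Python stand-in, and the one-hour base and the doubling factor are assumptions chosen only to show the shape of the idea.

```python
class EscalatingBlacklist:
    """Block an IP for longer each time it misbehaves (hypothetical sketch)."""

    def __init__(self, base_seconds=3600, factor=2):
        self.base = base_seconds     # assumed length of the first block
        self.factor = factor         # assumed growth per repeat incident
        self.incidents = {}          # ip -> number of incidents so far
        self.blocked_until = {}      # ip -> timestamp when the block ends

    def record_incident(self, ip, now):
        # Called when a message from this IP scores high or carries a virus.
        count = self.incidents.get(ip, 0) + 1
        self.incidents[ip] = count
        duration = self.base * self.factor ** (count - 1)
        self.blocked_until[ip] = now + duration
        return duration

    def is_blocked(self, ip, now):
        return now < self.blocked_until.get(ip, 0)
```

A real deployment would presumably persist both tables and expire old incident counts, which this sketch leaves out.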

Despite predictions of gloom and doom about greylisting and other SMTP-validity checks (legal argument to HELO, sendmail greet-pause, etc.), they still do a great job stopping a lot of spam before the data phase. Add in SA with auto-update rules, and only about 1.5% of e-mail delivered to my domain is spam, with only about 40% of attempted e-mail ever needing to be scanned by SpamAssassin...the other 60% is junked before that.

Oh, yeah, I also don't use any DNS-based blocking lists like SpamCop, XBL, Spamhaus, etc., because of too many false positives, slow DNS queries, and no local control over what I was blocking.

West Virginia (0)

ehaggis (879721) | more than 7 years ago | (#18691001)

Back in West Virginia we'all used to go spam catchin' every weekend while they was in season! Them spam made good eatin'.

Re:West Virginia (2, Funny)

UnknowingFool (672806) | more than 7 years ago | (#18691057)

Back in West Virginia we'all used to go spam catchin' every weekend while they was in season! Them spam made good eatin'.

Don't lie. You and your buddies got drunk and would go spam tipping. There was no hunting involved.

My entry: Human computers (1)

davidwr (791652) | more than 7 years ago | (#18691015)

I'm going to take a page from the Veruca Salt [wikipedia.org] needle-in-a-haystack problem and outsource this to a million peasants in India.

To pay for it I'll be spamming the world with my stock pump-and-dump scheme.

This just in: DAVI (OTC) NOW $0.02 TARGET $0.25!

New packaging? (2, Funny)

davmoo (63521) | more than 7 years ago | (#18691045)

A torrent of spam? It doesn't come in cans anymore?!

The cans were so much easier to catch, too.

Spam Rage Rampage (1)

Dekortage (697532) | more than 7 years ago | (#18691051)

A couple of years ago, I wrote a prototype for a video game called "Spam Rage Rampage" -- a first-person shooter where you roamed a Tron-like world, killing spam zombies and rescuing real people (== legitimate mail) while you searched for clues to the location of the nefarious spam kingpin, Ospama Bin Sendin. Each zombie represented a different class of spam... prostitute zombies for porn, business-suited zombies for stocks, pharmacist zombies for pill ads, etc.

Upon seeing a demo, one of my friends commented that I should hook it up to a real e-mail inbox, so you could kill your own spam messages, perhaps even in real time. Unfortunately I have never had the time to complete it... maybe after the kids are out of the house.

how to finish [Re:Spam Rage Rampage] (1)

nil0lab (94268) | more than 7 years ago | (#18693427)

> A couple of years ago, I wrote a prototype for a video game called "Spam Rage Rampage"
> -- a first-person shooter where you roamed a Tron-like world, killing spam zombies and
> rescuing real people (== legitimate mail) while you searched for clues to the location
> of the nefarious spam kingpin, Ospama Bin Sendin. Each zombie represented a different
> class of spam... prostitute zombies for porn, business-suited zombies for stocks,
> pharmacist zombies for pill ads, etc.
>
> Upon seeing a demo, one of my friends commented that I should hook it up to a real e-mail
> inbox, so you could kill your own spam messages, perhaps even in real time....

Um, doesn't the system already have to know whether messages are spam or not?

> Unfortunately I have never had the time to complete it... maybe after the kids are out of the house.

Release the code under GPL with a couple screenshots of the demo and I'm sure
others will finish it for you! It's a cool enough idea...

Greylisting? (2, Insightful)

schmiddy (599730) | more than 7 years ago | (#18691131)

I can't help but wonder how realistic this scenario is.. They're basically going to have a single server dumping a whole ton of spam at your filtering package, and you're supposed to be able to filter on.. what, just the content of the messages? Real world techniques use many more subtle hacks, such as greylisting, or actually looking at the domains the messages are coming from. If their server is going to be dumping millions of messages at you in a short amount of time, I don't think they'll let you use greylisting or similar techniques.

Re:Greylisting? (1)

blhack (921171) | more than 7 years ago | (#18692481)

No. they give the nerds of an email address, then reverse the web filter so that it ONLY allows them to go to porn sites.

after a few minutes their email servers should reach critical mass.

email migrating to pmail (permission mail) (0, Redundant)

drDugan (219551) | more than 7 years ago | (#18691163)

I think email in its current form will eventually die. There is no way with increased information transparency that a global network of email will continue to function efficiently. Simply too many senders and too much spam.

It could work better if we migrate to an invite-only system on top of email (extending the email-related RFCs) -- one where mail delivery only occurs to individuals from those who hold a key (the public half of a keypair between the two people).

Such a migration will require minimal additional functionality by both existing email clients and servers. I wrote up some thoughts on this idea here http://biocontact.org/pmail/ [biocontact.org] but I've received no response.
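
A rough sketch of what the invite-only check might look like, for illustration only. Note the swap: instead of the per-pair public/private keypair proposed above, this uses a per-sender shared secret and an HMAC tag, because that fits in the Python standard library. The `Mailbox` class and the `sign` helper are hypothetical names, not anything from the linked write-up.

```python
import hashlib
import hmac
import secrets

def sign(key, body):
    # Sender side: tag the message body with the secret received on invitation.
    return hmac.new(key, body.encode(), hashlib.sha256).hexdigest()

class Mailbox:
    def __init__(self, owner):
        self.owner = owner
        self.keys = {}                   # sender address -> secret issued to them

    def invite(self, sender):
        key = secrets.token_bytes(32)    # hand this to the sender out of band
        self.keys[sender] = key
        return key

    def revoke(self, sender):
        self.keys.pop(sender, None)      # a sender who abuses the key loses it

    def accept(self, sender, body, tag):
        key = self.keys.get(sender)
        if key is None:
            return False                 # never invited: not delivered
        return hmac.compare_digest(sign(key, body), tag)
```

The point of a per-sender key is that revoking one correspondent does not disturb anyone else.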

Boring. (2, Funny)

bmo (77928) | more than 7 years ago | (#18691501)

Couldn't we just have a contest where actual live spammers are fed to lions?

To quote Bill Mattocks...

"My sense of personal integrity is none of your concern."
                                                -thus spake Walt "Pickle Jar" Rines
"I'm going to pound your balls flat with a wooden mallet."
                                                -thus respondeth Bill Mattocks

Re:Boring. (1)

Anne Thwacks (531696) | more than 7 years ago | (#18691629)

Mod parent up +10 Wonderful Idea

Kobayashi Maru (2, Funny)

Kozar_The_Malignant (738483) | more than 7 years ago | (#18691919)

Find a creative and unique solution (cheat):

  • Hunt through CEAS conference hall
  • Find contest spammers
  • Drag spammers back to contest area
  • Spammers are beaten to death by audience
  • Win!!!
  • ...Oh, wait, they weren't real spammers?
  • Sorry

CEAS Call for Participation (1)

gvc (167165) | more than 7 years ago | (#18691925)

Many of the questions asked here are answered in the Challenge Call for Participation [www.ceas.cc]

Or the overview talk [youtube.com] that Rich Segal gave at the MIT Spam Conference.

The guidelines are scheduled to be finalized May 1.

Re:CEAS Call for Participation (1)

SL Baur (19540) | more than 7 years ago | (#18693623)

Participants will compete in filtering a live 24-hour e-mail stream
Looks like greylisting is acceptable.

Simulated user-feedback will be provided to train learning-based filters.
And it looks like gmail-type filters are acceptable.

Good job guys. The results will be interesting to read.

On ESPN... (1)

vjmurphy (190266) | more than 7 years ago | (#18691963)

"This ought to be a sweeps week television spectacular."

Is there an ESPN 6 or 7 cable channel? I'm thinking this is below Cheerleading and Dog Agility, but perhaps above Lumberjack competitions.

Isn't this already on TV? (2, Funny)

Minwee (522556) | more than 7 years ago | (#18691981)

"This ought to be a sweeps week television spectacular."

I think that it already is, but it's only on in Japan and uses real SPAM.

Visions of tennis ball machine gone.... (1)

zippoiii (887540) | more than 7 years ago | (#18692017)

Sigh. And I had such hopes. Pictures of a team of people, with a spam and tennis ball loaded tennis ball launcher at the other end of a court. When something gets fired at you, determine if you should let the ball go by, or whack the spam from the air. Alas, it's not to be. Dan

I got a better idea (1)

Indy1 (99447) | more than 7 years ago | (#18692343)

Issue hunting permits for the spammers themselves. Whoever wastes the most spammers, wins.

Evidence of wasted spammers can be in the form of complete heads, or ears.

So (0)

Anonymous Coward | more than 7 years ago | (#18692429)

The loser will have to wear a dress to the after-contest party?

i would rather see... (1)

ushering05401 (1086795) | more than 7 years ago | (#18693537)

them train as many ignorant users to catch spam as possible in the allotted time and be judged on how well the users did.

From the not-from-a-dept dept. (1)

etherlad (410990) | more than 7 years ago | (#18694021)

Relevant to nothing, but this is the first time I can remember seeing an article on /. without the requisite department tag in the story header.

Anyone want to try their hand at making up their own?

Global greytrapping (1)

davidwr (791652) | more than 7 years ago | (#18694099)

How's this for a plan:

Seed a few thousand fake email addresses all across the net. Put some on big sites. Put some on small sites. Put some on USENET. Change half the list every month.

If anyone emails two addresses with similar content, the content and the originating IP addresses get marked as likely spam, and used for realtime blackhole-list systems. The more of those fake addresses it hits in a short period of time, the greater its spammishness.
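
Something like the following could be a starting point for the plan above. It is a minimal sketch under stated assumptions: "similar content" is reduced to a crude normalised hash (lowercase letters only), the one-hour window is arbitrary, and `TrapNetwork` and `fingerprint` are made-up names. Real near-duplicate detection would need something fuzzier.

```python
import hashlib
import re
from collections import defaultdict

def fingerprint(body):
    # Crude normalisation: lowercase, keep only runs of letters, so changes
    # in digits, punctuation and spacing do not alter the hash.
    words = re.findall(r"[a-z]+", body.lower())
    return hashlib.sha256(" ".join(words).encode()).hexdigest()

class TrapNetwork:
    def __init__(self, trap_addresses, window=3600):
        self.traps = set(trap_addresses)
        self.window = window               # assumed "short period of time", in seconds
        self.hits = defaultdict(list)      # fingerprint -> [(time, trap, ip)]

    def observe(self, rcpt, ip, body, now):
        if rcpt not in self.traps:
            return 0                       # not a trap address, nothing to learn
        fp = fingerprint(body)
        self.hits[fp].append((now, rcpt, ip))
        return self.spammishness(fp, now)

    def spammishness(self, fp, now):
        # Score = number of distinct trap addresses hit inside the window;
        # a single hit is not enough to mark anything.
        recent = {trap for t, trap, _ in self.hits[fp] if now - t <= self.window}
        return len(recent) if len(recent) >= 2 else 0

    def suspect_ips(self, fp, now):
        if self.spammishness(fp, now) == 0:
            return set()
        return {ip for t, _, ip in self.hits[fp] if now - t <= self.window}
```

The set returned by `suspect_ips` is what would be fed to the blackhole list.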

Catch and Release? (1)

fractilian (704807) | more than 7 years ago | (#18694667)

I hope it's not a Catch and Release internet stream.

How to test against spam that isn't REAL spam? (1)

necro2607 (771790) | more than 7 years ago | (#18694741)

Okay, here's the first question I have, and I'm sure many others wonder the same. How will spam be combatted when it's not real spam? For example, Spam Assassin checks actual mail server names and addresses to see if they are on known spammer lists and so on. Won't extremely useful/effective features like these be overridden by the fact that these spam emails are intentionally sent and won't be from any known spam-relaying mail servers?

Re:How to test against spam that isn't REAL spam? (1)

gvc (167165) | more than 7 years ago | (#18695011)

The mail messages will contain header information from which the sending IP may be derived. Of course, spammers try to forge this info, but the most recent header is guaranteed to be correct.
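
As a hedged illustration of deriving the sending IP from the most recent header: the sketch below assumes the topmost Received: line is the trustworthy one and that it carries the peer's IPv4 address in square brackets, which is common but not universal. The function name `sending_ip` is made up for this example.

```python
import email
import ipaddress
import re

def sending_ip(raw_message):
    msg = email.message_from_string(raw_message)
    # Each relay prepends its own Received: header, so the first one listed
    # is the most recent, added by the server that handed us the message.
    received = msg.get_all("Received") or []
    if not received:
        return None
    match = re.search(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]", received[0])
    if not match:
        return None
    try:
        return str(ipaddress.ip_address(match.group(1)))
    except ValueError:
        return None        # looked like an address but was not a valid one
```

Everything below the first Received: header is supplied by the sender and may be forged, so the sketch deliberately ignores it.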