Slashdot: News for Nerds


Spamassassin Beats CRM-114 In Anti-Spam Shootout

timothy posted more than 10 years ago | from the hawaii-alaska-and-utah dept.

Spam 330

Simon Lyall writes "A new study of antispam software shows that Spamassassin performed well in various configurations. Spamprobe, Bogofilter and Spambayes also did well, while CRM-114 failed to live up to its previous claims. The study shows: 'The best-performing filters reduced the volume of incoming spam from about 150 messages per day to about 2 messages per day.'"

330 comments


Correct link to CRM-114 (5, Informative)

athakur999 (44340) | more than 10 years ago | (#9502934)

CRM-114 [sourceforge.net]

The link in the article points to SpamBayes again.

Isn't Human Accuracy always 100% (4, Insightful)

PetoskeyGuy (648788) | more than 10 years ago | (#9503021)

From the CRM-114 site...
News Flash: As of Feb 1 through March 1, 2004, 8738 messages (4240 spam, 4498 nonspam), and my total error rate was ONE. That translates to better than 99.984% accuracy, which is over ten times more accurate than human accuracy

Maybe I'm missing something, but isn't human accuracy always going to be 100%? I tell the computer what is spam, and it learns. I may decide that, regardless of what it thinks, this last message is OK. So aside from clicking too fast or changing your mind (which is a common thing to do), how can a filter ever claim to be better than people at deciding what people want to see?

Re:Isn't Human Accuracy always 100% (4, Insightful)

sholden (12227) | more than 10 years ago | (#9503073)

People make mistakes.

Yes, given one message to classify as spam or ham you are going to get it right 100% of the time.

Given 8000 messages to classify, the wonders of boredom mean you are going to make a mistake every so often (not an "oops I clicked the wrong button" mistake, but an "oops I put it in the wrong folder because the subject looked spammy and I couldn't be bothered checking the body" mistake).

In practice though, those stats on human accuracy are produced by having one person classify email that has already been classified by others, which of course means some of the "mistakes" may in fact be disagreements...

Re:Isn't Human Accuracy always 100% (4, Funny)

fireman sam (662213) | more than 10 years ago | (#9503269)

Remember, whether an email is classified as spam is subjective. For example, you might consider a message from a Nigerian bank manager spam, but I may consider it a way to pay off the house :)

Or, personally, I consider all email I get with a hotmail.com From address to be spam. But that is my opinion.

OT: btw, a friend at work actually got a Nigerian scam letter in the post. Because it was not email, he thought it was real.

Re:Isn't Human Accuracy always 100% (4, Funny)

Anonymous Coward | more than 10 years ago | (#9503315)

OT: you need smarter friends.

Re:Isn't Human Accuracy always 100% (0, Offtopic)

Surazal (729) | more than 10 years ago | (#9503316)

For example, you might consider a message from a Nigerian bank manager spam, but I may consider it a way to pay off the house :)

Nope. Once you fall for that, it will be the Nigerian scammer who will pay off your house. Under his name.

Earthlink stats are different (0)

Anonymous Coward | more than 10 years ago | (#9503301)

This article [bbc.co.uk] from the beeb puts human accuracy over machine accuracy...

Re:Not that good (0)

Anonymous Coward | more than 10 years ago | (#9503147)

I have tried a number of Bayesian-type filters and none of them filter the spam when I send it...

in related news (-1, Troll)

humankind (704050) | more than 10 years ago | (#9502945)

Rusty Wallace beats Earnhardt in a tricycle race.

Stupid fools. Content-based spam filtering is a waste of time. Why is Slashdot covering this crap? It's a never-ending battle of updating filters and formulas. There are fewer permutations in isolating and blacklisting every IP on the Internet than there would be in analyzing e-mail content, wasting server resources and masturbating.

RBLs WORK. This is why spammers are forced to use worms to invade users' machines to create proxies. As soon as the authorities wake from their slumber and start prosecuting these scumbags who break into others' machines, the whole spam thing will essentially be over. But don't tell that to the little content-based-filtering-fools. They obviously have money to burn.

Re:in related news (1)

sqlrob (173498) | more than 10 years ago | (#9503015)

Content RBLs [surbl.org] have been working fairly well for me

Re:in related news (4, Insightful)

bigberk (547360) | more than 10 years ago | (#9503016)

Content-based spam filtering is a waste of time. . . RBLs WORK
But content-based filters can very accurately determine what is spam and what's not, and so they can feed RBLs/DNSBLs. Let real spam to real user accounts form the blocklist! One such project is WPBL.

Re:in related news (2, Insightful)

plasm4 (533422) | more than 10 years ago | (#9503022)

filtering tools work fairly well, but more importantly they work right now. Waiting for the authorities to "wake from their slumber" might take years, if it ever even happens.

Re:in related news (0)

Anonymous Coward | more than 10 years ago | (#9503043)


Content-based spam filtering is a waste of time.


Whatever. Your "never-ending battle of updating filters and formulas" works fine.

Re:in related news (0)

Anonymous Coward | more than 10 years ago | (#9503109)

Not everyone [whirlycott.com] is as much of an RBL cheerleader as you are.

Re:in related news (0)

Anonymous Coward | more than 10 years ago | (#9503136)

In this message [slashdot.org] you claim that no content-based filter "comes close" to the 95% accuracy of your RBLs, but some of the content-based filters in this story do better than that (which is consistent with my own personal accuracy rate from SpamBayes, with e.g. a spam misclassification rate of maybe ~2%).

Re:in related news (1)

djmurdoch (306849) | more than 10 years ago | (#9503205)

RBLs only work against honest admins, getting them to clean up the holes in their security. Spammers aren't honest, and as you say, will just use worms to invade machines to create proxies.

RBLs have been around for years, but the amount of spam Spamassassin catches on its way in to me is ever-increasing. If RBLs worked, the spam problem would have been solved years ago.

On the other hand, the amount of spam getting past Spamassassin to me is pretty steady. I guess that indicates it's getting better. Mostly what gets past is what the article calls "backscatter": delivery failure messages caused by spammers forging my email address.

Should systems that send backscatter be blacklisted? I'd tend to say yes: they should only send failure notices to senders who pass some sort of verification like SPF. Putting them in an RBL really would encourage them to do that.

Re:in related news (1)

alexborges (313924) | more than 10 years ago | (#9503241)

RBLs WORK. This is why spammers are forced to use worms to invade users' machines to create proxies. As soon as the authorities wake from their slumber and start prosecuting these scumbags who break into others' machines, the whole spam thing will essentially be over. But don't tell that to the little content-based-filtering-fools. They obviously have money to burn.

In case you haven't heard, most of us with real jobs that require spam control can't wait for 'authorities to wake up' and cannot be expected to take advice from people that do, whatever the fuck it is you do, which is OBVIOUSLY not related at all to protecting people and/or resources from the abuse of spammers.

Re:in related news (4, Interesting)

Crudely_Indecent (739699) | more than 10 years ago | (#9503251)

I can certainly see how waiting on our government will decrease the number of messages transmitted through my mail servers daily.

It's reassuring to know that the "authorities" have effectively reduced the number of messages through my server by 10-14k per day......What great guys, those 'authorities', aren't they thoughtful and quick to respond. We've only been waiting for a spam-relief law for....10 years and they finally gave one to us. Oh wait....SpamAssassin is what reduced those messages.

The reason we don't wait for the gov to step in and take care of business is that THEY'VE DONE NOTHING SO FAR. You expect me to believe the government will solve my spam problems? I'm not holding my breath.

A combination of RBLs, DNSBLs, F-Prot, and SpamAssassin is what reduced the number of messages sent through my servers. I'm interested in results NOW, not legislation tomorrow.

The Mozilla ThunderBird SPAM filter (5, Interesting)

k.ellsworth (692902) | more than 10 years ago | (#9502948)

The Mozilla spam filter does a very good job too; once it learns enough, it becomes over 95% accurate. I dropped Evolution for it and never looked back.

Re:The Mozilla ThunderBird SPAM filter (3, Interesting)

Cyb3rBull3ts (779853) | more than 10 years ago | (#9502988)

If you use the Mozilla TB spam filter together with your ISP's filter, it's near 99% accurate.

I have gone from a whopping 200 spam messages a day (a very old e-mail address) to the occasional spam message once a week.

Lemme do the math. 200*7 = 1400. 1399/1400 = 0.9992857 accuracy. Not TOO bad :D
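
The arithmetic above can be checked in a few lines of Python. This is only a sketch of the poster's back-of-envelope estimate: the function name `weekly_catch_rate` is made up for illustration, and it assumes exactly one spam message slips through per week.

```python
def weekly_catch_rate(spam_per_day: int, missed_per_week: int) -> float:
    """Fraction of one week's spam that the filters caught."""
    total = spam_per_day * 7
    return (total - missed_per_week) / total


rate = weekly_catch_rate(200, 1)
print(f"{rate:.7f}")  # 0.9992857
```

Note that this figure only counts spam that got through. It says nothing about legitimate mail wrongly filed as junk, which has to be measured separately.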

Re:The Mozilla ThunderBird SPAM filter (2, Informative)

ImpTech (549794) | more than 10 years ago | (#9502998)

Of course it's pretty easy to hook spamassassin, bogofilter, or whathaveyou into Evolution. Tutorials abound if you search Google. Thunderbird's nice, but IMO Evolution's still a bit nicer, so it was worth my time to plug in a spam filter manually.

Re:The Mozilla ThunderBird SPAM filter (1)

k.ellsworth (692902) | more than 10 years ago | (#9503100)

As someone said a few posts down in this thread, Thunderbird does spam filtering on IMAP accounts. I used to love Evolution, but MozTB is waaaaay better. It is lighter, faster, and smarter for many tasks (email-related, not groupware-related).

Re:The Mozilla ThunderBird SPAM filter (3, Interesting)

Mark_MF-WN (678030) | more than 10 years ago | (#9503008)

It works with IMAP too -- which is something most other spam filters aren't capable of.

Best anti-spam code (-1, Troll)

Hao Wu (652581) | more than 10 years ago | (#9502952)

We need a good code of law that puts SPAM bastards in jail for a decade.

Re:Best anti-spam code (2, Insightful)

britneys 9th husband (741556) | more than 10 years ago | (#9502983)

How exactly does the US (or other first world country) go about writing a code of law that puts Nigerian spammers in jail?

Legislate... Military Invasion! (-1, Flamebait)

Anonymous Coward | more than 10 years ago | (#9503175)

It worked in Iraq! By God, the very act of the US being ballsy enough to invade made all the WMD's disappear! Imagine what it'd do to spammers!

Invasion (1, Insightful)

artlu (265391) | more than 10 years ago | (#9502958)

I must admit that I am not up to date on these new anti-spam software packages, which operate on the server side. However, what is the probability of real mail getting rejected by these things? It seems almost like an invasion of privacy to block my own email, even if it is from a "benevolent big brother" perspective.
I guess that is why there are privacy policies though.

aj

GroupShares Inc. [groupshares.com] - A Free and Interactive Stock Market community!

Re:Invasion (1)

Arial Sharon, 10pt. (784486) | more than 10 years ago | (#9503001)

Yes, there can be false positives, which is why suspected spam is usually moved to a different folder (rather than deleted) that users can check every now and again. Another approach is to insert an extra header to indicate the message's probability of being spam so that the user agent can selectively filter it.

Your privacy concerns are, as always, more complicated than the technology.
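
The header approach described above can be sketched on the client side with Python's standard `email` module. This is a minimal illustration, not anyone's actual setup: it assumes the server adds an `X-Spam-Flag: YES` header to suspected spam, as SpamAssassin does, and the sample message, its header values, and the folder names are invented.

```python
from email import message_from_string

# Invented sample message; header values are illustrative only.
RAW = """\
From: sender@example.com
To: me@example.com
Subject: Cheap meds
X-Spam-Flag: YES
X-Spam-Status: Yes, score=14.2 required=5.0

Buy now.
"""


def folder_for(raw_message: str) -> str:
    """Pick a folder based on the spam flag the server added, if any."""
    msg = message_from_string(raw_message)
    flagged = msg.get("X-Spam-Flag", "").strip().upper() == "YES"
    return "Junk" if flagged else "Inbox"


print(folder_for(RAW))  # Junk
```

Because the message is only moved, not deleted, a false positive can still be recovered from the Junk folder.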

Re:Invasion (0)

Anonymous Coward | more than 10 years ago | (#9503040)

They are called false positives. And you will find the study includes this side effect.

I suspect 'aj' is actually complaining since his spam stock tips are being blocked.

Re:Invasion (1)

p2sam (139950) | more than 10 years ago | (#9503114)

The point of automated mail sorting isn't to have 0 false negatives. It's to have a lower false-negative rate than if YOU were to sit down and sort the hundreds of spam messages yourself.

Re:Invasion (0)

Anonymous Coward | more than 10 years ago | (#9503169)

If I was to sit down and filter all my mail myself, I would have 0 false negatives/positives simply because I decide what is spam and what's not. If you were to sit down and filter my mail for me then I would expect some false negatives/positives as your idea of what is and isn't spam may be different from mine. Automated systems are designed to aid in helping YOU decide which messages are spam, not deciding for you. This is the reason many consider "learning" systems the best.

I'm running SpamAssassin at work. (4, Insightful)

khasim (1285) | more than 10 years ago | (#9503237)

People LOVE it.

There are some false positives and some false negatives.

But I have it set to delete anything 12+. That gets rid of the worst of the worst spam. So far, not a single complaint of any email being deleted.

Everything else has the subject re-written so people can run their own rule set against it.

In the past 8 hours
1867 messages received
375 messages deleted
1266 messages flagged as spam

So, only a few hundred actual, good emails.

Of course, that's only 4 hours during the regular work day (and 4 hours after work). But you can see the proportions. It saves people a TON of time.

And it makes them happier when they don't have to constantly dig through crap to see if any real messages have arrived.

Now, those spam messages are NOT distributed evenly. Our HR manager had her email address posted on the website. So she gets about 20-25% of the spam.

It's not exactly Big Brother 'cause no human sees the deleted spam.
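
The policy and the numbers in this post can be sketched as follows. The delete cutoff of 12 comes from the post; the tagging threshold of 5.0 is an assumption (it is SpamAssassin's default `required_score`, but the post does not say what value is in use). The sketch also assumes the "deleted" and "flagged" counts do not overlap.

```python
DELETE_AT = 12.0  # from the post: anything scoring 12+ is deleted
FLAG_AT = 5.0     # assumed tagging threshold; not stated in the post


def disposition(score: float) -> str:
    """Decide what happens to a message with the given spam score."""
    if score >= DELETE_AT:
        return "delete"
    if score >= FLAG_AT:
        return "flag"  # subject rewritten, user rules decide the rest
    return "deliver"


# Counts from the post, assumed to be disjoint categories.
received, deleted, flagged = 1867, 375, 1266
clean = received - deleted - flagged
spam_share = (deleted + flagged) / received

print(clean)  # 226
print(f"{spam_share:.0%} of incoming mail was deleted or flagged")
```

Under those assumptions roughly seven out of every eight incoming messages were spam, which matches the "only a few hundred actual, good emails" remark.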

Okay, but what about... (0)

Anonymous Coward | more than 10 years ago | (#9502960)

...false positives?

Quit acting like goddamn babies... (5, Funny)

Anonymous Coward | more than 10 years ago | (#9502961)


Baysian, gaysian. Real men hit delete.

No, REAL MEN... (2, Insightful)

Dimensio (311070) | more than 10 years ago | (#9503211)

...hammer the spammer's ISP with complaints until the advertised website is DEAD, DEAD, DEAD.

Re:Quit acting like goddamn babies... (4, Funny)

fireman sam (662213) | more than 10 years ago | (#9503284)

Pfft, Real men have this as the ~/.bashrc

#!/bin/sh
rm -f /var/spool/mail/$USER

Who needs email.

I didn't RTFPDF... (3, Interesting)

john_smith_45678 (607592) | more than 10 years ago | (#9502964)

The best-performing filters reduced the volume of incoming spam from about 150 messages per day to about 2 messages per day.

How many false positives though?

Re:I didn't RTFPDF... (-1, Flamebait)

Llywelyn (531070) | more than 10 years ago | (#9502982)

>I didn't RTFPDF...

Obviously.

Re:I didn't RTFPDF... (0)

Anonymous Coward | more than 10 years ago | (#9503004)

Fuck off.

Re:I didn't RTFPDF... (1)

Malc (1751) | more than 10 years ago | (#9503130)

Why's this moderated "troll"? It's a very good question. I'd rather receive some spam than have just one valid message blocked. I use Yahoo and they piss me off sometimes with their false positives.

Re:I didn't RTFPDF... (1)

timeOday (582209) | more than 10 years ago | (#9503259)

Yup, I can easily reduce spams to fewer than 2 per day. Just redirect all mail to /dev/null.

I use two... (2, Interesting)

hkfczrqj (671146) | more than 10 years ago | (#9502968)

I use Spamassassin. Surviving mail then goes through CRM-114. At least in my case, it works better than each of the filters on its own.
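
A two-stage setup like this can be sketched generically. The two filters below are toy stand-ins, not SpamAssassin or CRM-114, and the `chain` helper is a hypothetical name. A message is treated as spam if any stage rejects it, and later stages only run on mail that survived the earlier ones.

```python
from typing import Callable

Classifier = Callable[[str], bool]  # returns True when a message looks like spam


def chain(*filters: Classifier) -> Classifier:
    """Combine filters so that mail must survive every stage to be delivered."""
    def combined(message: str) -> bool:
        # any() short-circuits, so later filters never see mail already rejected.
        return any(f(message) for f in filters)
    return combined


def keyword_filter(message: str) -> bool:
    """Toy first stage: flags a single known spam word."""
    return "pills" in message.lower()


def shouting_filter(message: str) -> bool:
    """Toy second stage: flags messages written entirely in capitals."""
    letters = [c for c in message if c.isalpha()]
    return bool(letters) and all(c.isupper() for c in letters)


two_stage = chain(keyword_filter, shouting_filter)
print(two_stage("Cheap pills"))     # True
print(two_stage("ACT NOW"))         # True
print(two_stage("Lunch at noon?"))  # False
```

The trade-off is that the chain catches the union of what each filter catches, so it also inherits the false positives of both.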

No HTML, Just ps or pdf, conclusions inside (5, Informative)

randyest (589159) | more than 10 years ago | (#9502971)

And a long document it is (funny placeholder images though). Here are the conclusions, for the impatient who are interested in a little more than the summary:

Supervised spam filters are effective tools for attenuating spam. The best-performing filters reduced the volume of incoming spam from about 150 messages per day to about 2 messages per day. The corresponding risk of mail loss, while minimal, is difficult to quantify. The best-performing filters misclassified a handful of spam messages early in the test suite; none within the second half (25,000 messages). A larger study will be necessary to distinguish the asymptotic probability of ham misclassification from zero.

Most misclassified ham messages are advertising, news digests, mailing list messages, or the results of electronic transactions. From this observation, and the fact that such messages represent a small fraction of incoming mail, we may conclude that the filters find them more difficult to classify. On the other hand, the small number of misclassifications suggests that the filter rapidly learns the characteristics of each advertiser, news service, mailing list, or on-line service from which the recipient wishes to receive messages. We might also conjecture that these misclassifications are more likely to occur soon after subscribing to the particular service (or soon after starting to use the filter), a time at which the user would be more likely to notice, should the message go astray, and retrieve it from the spam file. In contrast, the best filters misclassified no personal messages, and no delivery error messages, which comprise the largest and most critical fraction of ham.

A supervised filter contributes significantly to the effectiveness of Spamassassin's static component, as measured by both ham and spam misclassification probabilities. Two unsupervised configurations also improved the static component, but by a smaller margin. The supervised filter alone performed better than the static rules alone, but not as well as the combination of the two.

The choice of threshold parameters dominates the observed differences in performance among the four filters implementing methods derived from Graham's and Robinson's proposals. Each shows a different tradeoff between ham accuracy and spam accuracy. ROC analysis shows that the differences not accountable to threshold setting, if any, are small and observable only when the ham misclassification probability is low (i.e. hm

CRM-114 and DSPAM exhibit substantially inferior performance to the other filters, regardless of threshold setting. Both exhibit substantial learning throughout the email stream, leading us to conjecture that their performance might asymptotically approach that of the other filters. From a practical standpoint, this learning rate would be too slow for personal email filtering as it would take several years at the observed rate to achieve the same misclassification rates as the other systems. Both these systems were designed to be used in a train-on-error configuration, and do not self-train. This configuration could account for a slow learning rate as each system avails itself of the information in only about 1,000 of the 50,000 test messages. In an effort to ensure that we had not misinterpreted the installation instructions, we ran CRM-114 in a train-on-everything configuration and, as predicted by the author, the result was substantially worse.

Spam filter designers should incorporate interfaces making them amenable for testing and deployment in the supervised configuration (figure 4). We propose the three interface functions used in algorithm 1 - filterinit, filtereval, and filtertrain - as a standardized interface. Systems that self-train should provide an option to self-train on everything (subject to correction via filtertrain) as in algorithm 2.
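
As a rough illustration of what such a three-function interface could look like, here is a Python sketch. Only the names `filterinit`, `filtereval` and `filtertrain` come from the excerpt; the signatures, the toy word-count classifier, and the train-on-error loop are assumptions made for illustration, not the paper's algorithm 1 or any tested product.

```python
from collections import Counter


class ToyFilter:
    """Stand-in filter exposing the three proposed entry points."""

    def filterinit(self) -> None:
        # Start with no knowledge of either class.
        self.counts = {"ham": Counter(), "spam": Counter()}

    def filtereval(self, message: str) -> str:
        # Compare how often the message's words were seen in each class.
        words = message.lower().split()
        spam = sum(self.counts["spam"][w] for w in words)
        ham = sum(self.counts["ham"][w] for w in words)
        return "spam" if spam > ham else "ham"

    def filtertrain(self, message: str, label: str) -> None:
        self.counts[label].update(message.lower().split())


def run_supervised(filt, stream):
    """Classify messages in arrival order; correct the filter when it errs."""
    filt.filterinit()
    errors = {"ham": 0, "spam": 0}
    for message, truth in stream:
        if filt.filtereval(message) != truth:
            errors[truth] += 1
            filt.filtertrain(message, truth)  # assumed train-on-error feedback
    return errors


stream = [
    ("cheap pills now", "spam"),
    ("lunch at noon", "ham"),
    ("cheap pills today", "spam"),
    ("lunch tomorrow", "ham"),
]
print(run_supervised(ToyFilter(), stream))  # {'ham': 0, 'spam': 1}
```

The benefit of a fixed interface like this is that the same harness can replay one mail stream through any filter that implements the three functions.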

Ham and spam misclassification proportions should be reported separately. Accuracy, weighted accuracy, and precision should be avoided as primary evaluation measures as they are excessively influenced by threshold parameter setting and the ham-spam ratio of incoming mail. ROC curves provide valuable insight into the tradeoff between ham and spam accuracy. Area under the ROC curve provides a meaningful overall effectiveness measure, but does not replace separate ham and spam misclassification estimates. Each case of ham misclassification should be examined to ascertain its cause and potential impact.
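
A small worked example shows why a single accuracy figure can mislead. The numbers below are invented, and the names `hm` and `sm` (ham and spam misclassification fractions) are assumed shorthand. The same per-class error rates give quite different overall accuracy depending on the mix of ham and spam.

```python
def rates(ham_total: int, ham_wrong: int, spam_total: int, spam_wrong: int):
    """Return (hm, sm, accuracy) for one test run."""
    hm = ham_wrong / ham_total      # ham misclassified as spam
    sm = spam_wrong / spam_total    # spam misclassified as ham
    accuracy = 1 - (ham_wrong + spam_wrong) / (ham_total + spam_total)
    return hm, sm, accuracy


# Invented counts: both runs have hm = 1% and sm = 10%.
mostly_ham = rates(9000, 90, 1000, 100)
mostly_spam = rates(1000, 10, 9000, 900)

for name, (hm, sm, acc) in [("mostly ham", mostly_ham), ("mostly spam", mostly_spam)]:
    print(f"{name}: hm={hm:.1%} sm={sm:.1%} accuracy={acc:.1%}")
```

The filter behaves identically in both runs, yet accuracy comes out at about 98.1% for the ham-heavy mailbox and about 90.9% for the spam-heavy one, which is why reporting `hm` and `sm` separately is more informative.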

Caution should be exercised in treating ham misclassification as a simple proportion. Extremely large samples would be needed to estimate it with any degree of statistical confidence, and even so, it is not clear what effect differences in proportion would have on the overall probability of catastrophic loss. The use of a filter may mitigate rather than exacerbate this risk, owing to the reduction in classification effort required of the user. We advance the proposition that, at the misclassification rates demonstrated here, the end-to-end risk of loss is dominated by human factors and exceptional events, and is comparable to that of other communication media.
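To see why large samples are needed, here is a small sketch that puts an approximate 95% confidence interval around an observed ham misclassification proportion. It uses the Wilson score interval, which is a choice made for this example and not necessarily the method used in the study; the counts passed in are hypothetical.

```python
import math


def wilson_interval(errors, n, z=1.96):
    """Approximate Wilson score interval for errors / n (z=1.96 is about 95%)."""
    p = errors / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


# Hypothetical counts: the same observed rate (0.1%) at two sample sizes.
for errors, n in [(1, 1000), (100, 100000)]:
    low, high = wilson_interval(errors, n)
    print(f"{errors}/{n}: between {low:.5f} and {high:.5f}")
```

With only a thousand ham messages the interval spans more than an order of magnitude, so two filters with quite different true rates could easily look the same.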

Mozilla Messenger / Thunderbird Performance? (5, Interesting)

Mark_MF-WN (678030) | more than 10 years ago | (#9502974)

I wonder how Mozilla Messenger/Thunderbird's spam filtering stacks up against these filters? I've heard some negative comments about the Mozilla filtering system, but it's worked wonders for me.

Re:Mozilla Messenger / Thunderbird Performance? (2, Informative)

k.ellsworth (692902) | more than 10 years ago | (#9503058)

100% agreed. I use the Mozilla Thunderbird spam filter (after some human teaching) and it works marvelously. On my spam-me email account (used on Usenet, some forums, and anything that I know will become a spam source but where I need to give a valid email address anyway) I receive ~38K spams a month and Thunderbird only misses 3 or 4 per day... Sometimes I look through its JUNK folder and I haven't seen any false positives so far. ThunderBird is THE email client: it works on Linux and Windoze, the spam filter works better than 99%, and it has many other tricks.

Re:Mozilla Messenger / Thunderbird Performance? (1)

mbourgon (186257) | more than 10 years ago | (#9503102)

Mozilla 1.8 has (had?) a real problem with its Junk Mail controls... namely, they don't (didn't?) work nearly as well as 1.7's. Someone feel free to karma whore the details, but I think the problem is that they're using a bunch of different spam filters, and it's not as powerful as whatever was used in 1.7.

SpamAssassin is great! (2, Informative)

JohnFromCanada (789692) | more than 10 years ago | (#9502976)

I have been using SpamAssassin in conjunction with Evolution and it has cut my spam to virtually nothing. I wish it were built right into Evolution so that it was a little faster; however, it is worth the wait, as I barely ever get any spam in my Inbox anymore. I set it up with Evolution very similarly to how it is shown here [atlantawebhost.com] . I really like using it with Evolution, but I am curious whether anyone knows of anything that would work faster and as efficiently in conjunction with Evolution?

Real way to block spam (2, Interesting)

DRWHOISME (696739) | more than 10 years ago | (#9502977)

Is to do away with current email protocols and go with new ones with verification.

That should take care of the problems. The gov is now concentrating on this.

Re:Real way to block spam (2, Insightful)

PornMaster (749461) | more than 10 years ago | (#9503003)

Is to do away with current email protocols and go with new ones with verification. That should take care of the problems. The gov is now concentrating on this.

Except for making a new standard that's a requirement for doing business with federal agencies, just what do you think government's capable of doing regarding replacing protocols?

-PM

Re:Real way to block spam (0)

Anonymous Coward | more than 10 years ago | (#9503027)

They'll say "Hey, look at our new anti-spam list!" and the list will only be available for users of the new protocol. People will want this and demand it from their ISPs.

Re:Real way to block spam (1)

wmacgyver (555987) | more than 10 years ago | (#9503153)

Color me skeptical, but I'm not sure government is the magical solution to this problem. Just look at how much good the new anti-spam law they passed is doing. :)

REAL REAL way to block spam (1)

Mad Bad Rabbit (539142) | more than 10 years ago | (#9503288)

[Ripley] "I say we take off and nuke the entire planet
from orbit. That's the only way to be sure."

[Hudson] "F--kin' A..."

[Burke] "Ho-ho-hold on a second! The Earth has a
very substantial dollar value attached to it!"

[Ripley] "They can BILL me."

A little advice (5, Funny)

Anonymous Coward | more than 10 years ago | (#9502992)

You don't want to face an assassin in a shootout. Maybe a pie eating contest, or a spelling bee... but not a shootout.

I've had CRM114 running for a few months . . . (4, Informative)

klevin (11545) | more than 10 years ago | (#9502994)

CRM114's best was about 80%, which lasted for a few weeks (weeks 3-5). Before and after that, it's doing good to catch 25% of the spam. I'm not sure why, but for the last month it's only been catching about 10%. When one gets through, I run it through mailfilter.crm with the learnspam switch. It'll say it's learned it, but if I have it check the spam again, it still lets it past.

Re:I've had CRM114 running for a few months . . . (2, Informative)

CoolGopher (142933) | more than 10 years ago | (#9503231)

I've been running CRM114 for about a year now, and it's performing extremely well. Far better than my Mozilla filter. In fact, just the other week I scrapped Mozilla's junk filter completely and am now relying on CRM alone. It's very rare that I get any misses in either direction.

If I was to make an estimate, I'd say that the error rate is something like .1%, quite possibly less (say 1 miss/5 days, with 200 mails per day). This is having started with clean corpus files and train-on-error only.

Good results with spamprobe (2, Informative)

bigberk (547360) | more than 10 years ago | (#9502995)

I have been using spamprobe [sourceforge.net] for some time, with the webfilt [pc-tools.net] front-end, and I'm very pleased with the speedy spamprobe program (written in C++).

I receive approximately 10 legit emails/day and about 300 spam/day. I have only had 2 false positives overall (that's 2 out of about 100,000 total emails received) and on average only 2 spams/day slip past the filter. Now I'm testing Spambayes on one of my most spammed accounts, but it's definitely much slower than spamprobe and not more accurate as far as I can tell.

compute farms for anti-spam AI? (4, Informative)

potus98 (741836) | more than 10 years ago | (#9503000)


From page 24: Hidalgo suggests the use of ROC curves, originally from signal detection theory and used extensively in medical testing, as better capturing the important aspects of spam filter performance.

Perhaps a distributed analysis system (similar to SETI@home [berkeley.edu] ) could be used to combat spam. Not only could the idle time of bazillions of CPUs be leveraged to improve "signal" analysis, but perhaps the clients could analyze local incoming mail to correlate new trends in spam originators and then share that information with all of the other clients. Then you could combine that with the genetic evolution improvements of the F1 sim-cars recently mentioned [slashdot.org] on /.

So there's the high-level idea, now you smart people go make it work. :-)

Spamassassin uses collaborative spam-tracking (2, Informative)

vivek7006 (585218) | more than 10 years ago | (#9503030)


Razor: Vipul's Razor is a collaborative spam-tracking database, which works by taking a signature of spam messages. Since spam typically operates by sending an identical message to hundreds of people, Razor short-circuits this by allowing the first person to receive a spam to add it to the database -- at which point everyone else will automatically block it.

This is really cool.
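The idea in the Razor description above can be sketched as a toy shared signature database. This is not Razor's actual protocol or signature scheme; the `SignatureDB` class, the whitespace and case normalization, and the use of SHA-256 are all assumptions chosen to keep the example short.

```python
import hashlib


class SignatureDB:
    """Toy stand-in for a shared, networked spam signature database."""

    def __init__(self):
        self._known = set()

    @staticmethod
    def signature(body):
        # Collapse whitespace and case so trivially reformatted copies match.
        normalized = " ".join(body.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def report(self, body):
        """Called by the first recipient who identifies the message as spam."""
        self._known.add(self.signature(body))

    def is_listed(self, body):
        """Called by every later recipient before delivery."""
        return self.signature(body) in self._known


db = SignatureDB()
db.report("Buy  CHEAP pills\nnow")
print(db.is_listed("buy cheap pills now"))  # same text, different layout
print(db.is_listed("Meeting moved to 10am"))
```

An exact hash like this is defeated by changing a single word per copy, which is one reason real systems rely on fuzzier signatures and on some way of weighing how trustworthy each reporter is.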

Re:Spamassassin uses collaborative spam-tracking (0)

Anonymous Coward | more than 10 years ago | (#9503086)

What protection does it have against users (intentionally or unintentionally) adding non-spam to the database, thus blocking legitimate e-mail to everyone who uses Razor?

Re:Spamassassin uses collaborative spam-tracking (1, Informative)

Anonymous Coward | more than 10 years ago | (#9503221)

What protection does it have against users (intentionally or unintentionally) adding non-spam to the database, thus blocking legitimate e-mail to everyone who uses Razor?

People have done this before by adding mailing list posts to Razor. But SpamAssassin doesn't automatically block messages listed in Razor, it just assigns them a higher spam score.

Razor has some protection too, like the truth evaluation system - see this page [sourceforge.net] for info.

Re:Spamassassin uses collaborative spam-tracking (4, Informative)

bigberk (547360) | more than 10 years ago | (#9503095)

It gets better. Vernon Schryver, networking genius, is responsible for the Distributed Checksum Clearinghouse [rhyolite.com] which does something similar, but as I understand it, is much more efficient for large servers. When our university turned on DCC filtering combined with greylisting, the daily spam to inboxes dropped from hundreds daily to ZERO (I kid you not). I am not aware of any false positives, at least on my account. DCC blew my mind.

So I'm not the only one... (4, Informative)

sholden (12227) | more than 10 years ago | (#9503032)

I did a *much* smaller test of spam filters earlier this year (which was published in hakin9 [haking.pl] but not in English).

I also found that crm114 gave poor results in comparison to other filters - but figured I must have set something up incorrectly...

Why don't people use catch-all accounts? (5, Interesting)

mattkinabrewmindspri (538862) | more than 10 years ago | (#9503033)

When you register with a hosting company, very frequently, they set up what's called a catch-all account, and any email to your domain that's not addressed to a real address goes there. This is how I use it:
  • I only use my main email address with friends and family, and never post it online.
  • Whenever I post an email address or register for anything online, I put thatsite@mydomain.com as my email address.
  • All email is received by one account, but each message can have a different "to:" header. I set my filters to filter mail to different boxes. Email sent to amazon@mydomain.com goes to the amazon folder. Same with ebay, slashdot, whatever.
  • Any time I start receiving spam, I just set my mail server to disregard email sent to whatever email address is getting the spam, and I can stop doing business with the company that sold my email address.
I receive on average 0 spams per day.
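The scheme described above can be sketched in a few lines: the local part of the recipient address doubles as the folder name, and a blocklist of burned addresses is dropped. The `route` function, the blocklist contents, and the default domain are hypothetical; a real setup would live in the mail server's or mail client's own filter rules.

```python
def route(recipient, blocked, domain="mydomain.com"):
    """Return the folder for a catch-all address, or None to discard.

    recipient: the envelope or "To:" address, e.g. "amazon@mydomain.com".
    blocked:   set of local parts that have started receiving spam.
    """
    local, _, host = recipient.lower().partition("@")
    if host != domain or local in blocked:
        return None
    return local


blocked = {"shadysite"}  # addresses that have been sold or harvested
for address in ["amazon@mydomain.com", "ShadySite@mydomain.com"]:
    print(address, "->", route(address, blocked))
```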

Re:Why don't people use catch-all accounts? (0)

Anonymous Coward | more than 10 years ago | (#9503077)

Why can't a spammer just start spoofing different popular sites you may have done business with? You should work a secret code system.

Re:Why don't people use catch-all accounts? (1)

mattkinabrewmindspri (538862) | more than 10 years ago | (#9503144)

I don't think it's likely that spambots will pick up on more than one of my addresses within several months. I'm probably only registered at about 30-40 sites(about 10 of which I visit really frequently), and most of them can be set to hide your email address. I haven't had to block any of the addresses I've used at popular sites so far.

Even if they did, I could knock the spam I received back down to zero just by having my server disregard any mail sent to that address and then if I'm still visiting that site, changing my address in that site's preferences.

Re:Why don't people use catch-all accounts? (1)

YrWrstNtmr (564987) | more than 10 years ago | (#9503132)

Because not everyone has a mydomain.com

Re:Why don't people use catch-all accounts? (4, Informative)

sr180 (700526) | more than 10 years ago | (#9503195)

Wait till the spammers decide to spam your whole domain. They can start with aaaaaaaa@yourdomain.com and keep going till they get to zzzzzzzz@yourdomain.com, and your mailserver will accept and pass on every single one of these emails.

I would recommend not using a catch all account, but if you have the domain, create, delete and rename email accounts as you need to...

Re:Why don't people use catch-all accounts? (1)

videodriverguy (602232) | more than 10 years ago | (#9503287)

Very true. This happened to me recently and my spam count went from around 30 to over 400!

Thankfully, my host has a 'blackhole' option for the default account. Turned that on and the spam volume dropped back to the previous level.

Re:Why don't people use catch-all accounts? (1)

burns210 (572621) | more than 10 years ago | (#9503250)

What if it isn't eBay that sold the address, but rather a spammer randomly generating addresses who sent to ebay@DOMAIN.TLD? Or what if the company (or you, by accident) posted the email address to the web, and a spider grabbed it and added it to a spammer's list?

How many CORP_X accounts do you go through? ebay1@DOMAIN.TLD, ebay2@, ebay3@... ditching each once it starts to receive spam.

A most interesting approach, though.

Re:Why don't people use catch-all accounts? (3, Insightful)

FrenZon (65408) | more than 10 years ago | (#9503258)

Why don't people use catch-all accounts?

Because you will always have one main 'obvious' address - be it something that goes on your business card, or something you tell to people you meet. For example, I use glen at glenmurphy.com.

Now all it takes is one slip - someone you know gets a virus, whatever - and your address is 'out there' for the taking. Your only possible recourse then is to stop using that address, but for some people that's just not an option, and it's just a bit defeatist to sit there surrendering email address after email address.

Re:Why don't people use catch-all accounts? (1)

mrpuffypants (444598) | more than 10 years ago | (#9503263)

alas, that also equates to you receiving 0 emails total per day :(

Some of us don't use spam filters to give us a feeling of life...

Re:Why don't people use catch-all accounts? (-1, Flamebait)

Anonymous Coward | more than 10 years ago | (#9503308)

Well fuck me, you have to be about the biggest loser ever. Why go to all that damn effort to classify your mail? I agree, spam's a problem. I have one email account, I use it EVERYWHERE. Serious things and one-off signups.
With a tiny bit of effort (I have about 10 whitelist entries and train the spam filter once a week), I have no problem with spam. I have to move a junk email to the junk folder maybe 3 times a week?

Of course, I could use your way, but how does it help? Plus you can never keep your email address totally private, as other posters have said, it only takes 1 virus or a fuckup of some sort for it to be out there in the wild.

I can't understand how people are so damn fanatical about their email address. You've got one, damn well use the stupid thing.

Another data point. (4, Interesting)

juuri (7678) | more than 10 years ago | (#9503039)

OSX's built in mail seems to be pretty close to the accuracy numbers listed in the above summary. I tend to have one to three pieces of spam slip through which are almost always entirely image based with some poetry or equivalent attached.

I must say I've been pleasantly surprised with the spam filtering it provides and it has been a lot easier than the hoops I used to utilize to clean out my inbox.

OPE (0)

Anonymous Coward | more than 10 years ago | (#9503057)

Anyone know that three letter prefix to get through the CRM-114?

DSPAM (4, Insightful)

More Trouble (211162) | more than 10 years ago | (#9503063)

In real world deploys of statistical filters, something like DSPAM's "global user" feature is necessary. The ability to begin with a relatively mature dictionary is critical to the user experience. Personally, DSPAM is filtering around 200 SPAMs per day for me, allowing one through every few days. It's 99.985% effective for me.

:w

No DSPAM (3, Interesting)

XMichael (563651) | more than 10 years ago | (#9503078)

It's unfortunate that DSPAM was left out of this very good quality report. I have personally used SpamAssassin, SpamProbe and DSPAM [nuclearelephant.com]

After using each for a couple months at a time, I found DSPAM to be by far the most effective (after it was properly trained)

DSPAM's claim "DSPAM (as in De-Spam) is an extremely scalable, open-source statistical hybrid anti-spam filter. While most commercial solutions only provide a mere 95% accuracy (1 error in 20), a majority of DSPAM users frequently see between 99.95% (1 error in 2000) all the way up to 99.991% (2 errors in 22,786). DSPAM is currently effective as both a server-side agent for UNIX email servers and a developer's library for mail clients, other anti-spam tools, and similar projects requiring drop-in spam filtering. DSPAM has been implemented on many large and small scale systems with the largest systems being reported at about 125,000 mailboxes." was quite accurate for me.


Also check out some priceless photos Priceless Photos [pricelessphotos.org]

Problems with Bayesian filtering (4, Informative)

dlevitan (132062) | more than 10 years ago | (#9503101)

Up to this past weekend I was using only bogofilter (which is a pure Bayesian filter). I seem to get about 200 spam a day on my main account. Until about a month or two ago bogofilter was amazing - I'd get maybe 1 or 2 spam a day, if that many. Then recently I suddenly started getting hit with 20 spam messages a day, and I noticed most of those were using lots of common words to bypass bogofilter. Most spam was still being removed by bogofilter, but enough got through to annoy me. This past weekend I also enabled spamassassin (without its bayes filter though), and it's cut down the number of spam to maybe 5 a day, but it's still too much for me. I'm hoping we have the next breakthrough in spam filtering technology soon (akin to bayesian filtering) because it seems that every new technique we use to filter the spam is eventually targeted by the spammers and bypassed.
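A rough sketch of why padding a message with common words can hurt a Bayesian filter. It uses the simple naive-Bayes combining formula popularized by Paul Graham, not bogofilter's actual algorithm, and the per-word spam probabilities are made up for illustration.

```python
import math


def combined_spam_probability(word_probs):
    """Combine per-word spam probabilities, naive-Bayes style.

    Computes prod(p) / (prod(p) + prod(1 - p)) in log space so that
    long messages do not underflow. Each p must lie strictly in (0, 1).
    """
    log_spam = sum(math.log(p) for p in word_probs)
    log_ham = sum(math.log(1.0 - p) for p in word_probs)
    return 1.0 / (1.0 + math.exp(log_ham - log_spam))


# Hypothetical token probabilities learned from earlier mail.
spammy_words = [0.99, 0.95, 0.90]       # e.g. typical sales vocabulary
padding = [0.10] * 5                    # everyday words seen mostly in ham

print(combined_spam_probability(spammy_words))
print(combined_spam_probability(spammy_words + padding))
```

Each innocuous word pulls the combined score toward ham, so enough padding can drag an obviously spammy message below the threshold. Filters limit the damage in various ways, for example by scoring only the most extreme tokens.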

Holy Shit.... (1)

Dunarie (672617) | more than 10 years ago | (#9503112)

Only 2 messages out of 150 normally get through that are spam? Good god, I normally get 5-10 spam messages a day that get through SpamAssassin. At that rate, that's roughly 375-750 spam e-mails a day! I thought it was bad before I enabled SpamAssassin a few months ago... but Jesus, man am I glad I got SA!

Re:Holy Shit.... (2, Interesting)

fdiskne1 (219834) | more than 10 years ago | (#9503276)

It's getting just plain ridiculous. When I started keeping track about a year ago, the email filtering system I set up was blocking about 10,000 spams per week for just under 1500 users. Last week, it blocked over 170,000. That is an average of over 100 spams per user and the vast majority of my users don't get any at all. There are a couple dozen that get the vast majority of it. Of course, these are addresses that would be a major pain in the ass to change because of all the people that would have to be notified, and only if I could convince the user they want to. Of course, with this many users, I can't get a good grasp on the number of spams that make it through, but I do know it's enough to have several people continually complaining about it. It's just plain sickening all the resources and bandwidth that gets wasted. I use three different black-hole lists, so about 110,000 of those don't get any further than initial HELOs, but still. Disgusting. Bring on the protocol change. I've told everyone that I would be willing to work 24 hours a day for an entire weekend to implement a server and/or gateway that uses a new email protocol if it meant most spam would disappear.

the true cause of the majority of spam... (3, Interesting)

Etaipo (787613) | more than 10 years ago | (#9503127)

users. those silly, silly users. i was in charge of spam for my company for the greater part of a year. using an outdated KEYWORD based system > I was forced to read every.caught.message to look for false positives. ... did you catch that? yeah...i had to go through EVERY 'spam' tagged e-mail that went through the company. needless to say, after the first week i was ready to gouge my eyes out. but hey, at least i earned that 'i read your e-mail' sticker! anyways, the point that i'm failing to make here is the cause of the spam... the damn users. whether it be responding to spam, putting their e-mail address in every single webform they encounter while surfing instead of working, signing up for spam voluntarily, or whatever the cause may be.. i ran some numbers on the logs, and came to an astounding find. a few people were getting literally a thousand messages blocked, per month. i, on the other hand, had maybe one or two a month. and i'm not a nazi with my e-mail address....but i do take some care in what places i type it in. an ounce of prevention goes a long way folks.

Re:the true cause of the majority of spam... (1)

stevesliva (648202) | more than 10 years ago | (#9503200)

Sure man, blame the victim. She was asking for it.

All sarcasm aside, I DO ask for it with my hotmail account (see above) and that just makes me so glad that I keep my other addresses quiet!

Let me help you (0)

Anonymous Coward | more than 10 years ago | (#9503227)

The shift key is next to the Z on the left of the keyboard, and next to the / on the right.

It's often used on the first letter after a full stop - '.' character.

SpamAssassin used to work but recently... (3, Interesting)

squisher (212661) | more than 10 years ago | (#9503129)

SpamAssassin used to be super-good for me, but recently it has become a nightmare... even with Bayes filters on and training it with almost 2000 spam messages that have escaped it before, I STILL get an enormous amount of spam every day... maybe I'm doing something wrong with the config, I admit that I haven't spent that much time on that, but it seems like it should be working better :-((.

Spam sucks. Everyone stop buying the products advertised and it'll be over. But then again, people will always be too dumb for an easy solution like that (reminds me of the gooback southpark...)

Issues with testing corpus (5, Interesting)

w_mute (40724) | more than 10 years ago | (#9503143)

I haven't read everything in detail yet, but one of the things that stands out is that their 'gold standard' representing the best result consists of 9,038 ham messages (18.4%) and 40,048 spams (81.6%). While large, the dataset is unbalanced. One of the things that is recommended by many of the filters is training on equal proportions of ham/spam in order to prevent biasing (overfitting).

Their train-on-errors approach may simulate what goes on with some filters, but it doesn't reflect the scenario where there is an initial dataset to be trained on _before_ new messages are processed. Instead, each message is in essence 'new'. So in their tests the machine learning filters start out knowing nothing, but SpamAssassin starts out with its inbuilt ruleset. Not exactly fair.

-Greg

What d'you think SpamAssassin would make of this? (-1, Offtopic)

Anonymous Coward | more than 10 years ago | (#9503163)

BEGIN

The Library of Babel
By Jorge Luis Borges
Translated by James E. Irby

"By this art you may contemplate
the variation of the 23 letters..."
- The Anatomy of Melancholy, part 2, sect. II, mem. IV

The universe (which others call the Library) is composed of an indefinite and perhaps infinite number of hexagonal galleries, with vast air shafts between, surrounded by very low railings. From any of the hexagons one can see, interminably, the upper and lower floors. The distribution of the galleries is invariable. Twenty shelves, five long shelves per side, cover all the sides except two; their height, which is the distance from floor to ceiling, scarcely exceeds that of a normal bookcase. One of the free sides leads to a narrow hallway which opens onto another gallery, identical to the first and to all the rest. To the left and right of the hallway there are two very small closets. In the first, one may sleep standing up; in the other, satisfy one's fecal necessities. Also through here passes a spiral stairway, which sinks abysmally and soars upwards to remote distances. In the hallway there is a mirror which faithfully duplicates all appearances. Men usually infer from this mirror that the Library is not infinite (if it really were, why this illusory duplication?); I prefer to dream that its polished surfaces represent and promise the infinite... Light is provided by some spherical fruit which bear the name of lamps. There are two, transversally placed, in each hexagon. The light they emit is insufficient, incessant.

Like most men of the Library, I have travelled in my youth; I have wandered in search of a book, perhaps a catalogue of catalogues; now that my eyes can hardly decipher what I write, I am preparing to die just a few leagues from the hexagon in which I was born. Once I am dead, there will be no lack of pious hands to throw me over the railing; my grave will be the fathomless air; my body will sink endlessly and decay and dissolve in the wind generated by the fall, which is infinite. I say that the Library is unending. The idealists argue that the hexagonal rooms are a necessary form of absolute space or, at least, of our intuition of space. They reason that a triangular or pentagonal room is inconceivable. (The mystics claim that their ecstasy reveals to them a circular chamber containing a great circular book, whose spine is continuous and which follows the complete circle of the walls; but their testimony is suspect; their words, obscure. This cyclical book is God.) Let it suffice now for me to repeat the classic dictum: The library is a sphere whose exact center is any one of its hexagons and whose circumference is inaccessible.

There are five shelves for each of the hexagon's walls; each shelf contains thirty-five books of uniform format; each book is of four hundred and ten pages; each page, of forty lines, each line, some eighty letters which are black in color. There are also letters on the spine of each book; these letters do not indicate or prefigure what the pages will say. I know that this incoherence at one time seemed mysterious. Before summarizing the solution (whose discovery, in spite of its tragic proportions, is perhaps the capital fact of history) I wish to recall a few axioms.

First: The Library exists ab aeterno. This truth, whose immediate corollary is the future eternity of the world, cannot be placed in doubt by any reasonable mind. Man, the imperfect librarian, may be the product of chance or of malevolent demiurgi; the universe, with its elegant endowment of shelves, of enigmatical volumes, of inexhaustible stairways for the traveler and latrines for the seated librarian, can only be the work of a god. To perceive the distance between the divine and the human, it is enough to compare these crude wavering symbols which my fallible hand scrawls on the cover of a book, with the organic letters inside: punctual, delicate, perfectly black, inimitably symmetrical.

Second: The orthographical symbols are twenty-five in number. This finding made it possible, three hundred years ago, to formulate a general theory of the Library and to solve satisfactorily the problem which no conjecture had deciphered: the formless and chaotic nature of almost all the books. One which my father saw in a hexagon on circuit fifteen ninety-four was made up of the letters MCV, perversely repeated from the first line to the last. Another (very much consulted in this area) is a mere labyrinth of letters, but the next-to-last page says 'Oh time thy pyramids'. This much is already known: for every sensible line of straightforward statement, there are leagues of senseless cacophonies, verbal jumbles and incoherences. (I know of an uncouth region whose librarians repudiate the vain and superstitious custom of finding meaning in books and equate it with that of finding meaning in dreams or in the chaotic lines of one's palm... They admit that the inventors of this writing imitated the twenty-five natural symbols, but maintain that this application is accidental and that the books signify nothing in themselves. This dictum, we shall see, is not entirely fallacious.)

For a long time it was believed that these impenetrable books corresponded to past or remote languages. It is true that the most ancient men, the first librarians, used a language quite different from the one we now speak; it is true that a few miles to the right the tongue is dialectal and that ninety floors farther up, it is incomprehensible. All this, I repeat, is true, but four hundred and ten pages of inalterable MCV's cannot correspond to any language, no matter how dialectal or rudimentary it may be. Some insinuated that each letter could influence the following one and that the value of MCV in the third line of page 71 was not the one the same series may have in another position on another page, but this vague thesis did not prevail. Others thought of cryptographs; generally, this conjecture has been accepted, though not in the sense in which it was formulated by its originators.

Five hundred years ago, the chief of an upper hexagon came upon a book as confusing as the others, but which had nearly two pages of homogeneous lines. He showed his find to a wandering decoder who told him the lines were written in Portuguese; others said they were Yiddish. Within a century, the language was established: a Samoyedic Lithuanian dialect of Guarani, with classical Arabian inflections. The content was also deciphered: some notions of combinative analysis, illustrated with examples of variation with unlimited repetition. These examples made it possible for a librarian of genius to discover the fundamental law of the Library. This thinker observed that all the books, no matter how diverse they might be, are made up of the same elements: the space, the period, the comma, the twenty-two letters of the alphabet. He also alleged a fact which travelers have confirmed: In the vast Library there are no two identical books. From these two incontrovertible premises he deduced that the Library is total and that its shelves register all the possible combinations of the twenty-odd orthographical symbols (a number which, though extremely vast, is not infinite): in other words, all that is given to express, in all languages. Everything: the minutely detailed history of the future, the archangels' biographies, the faithful catalogue of the Library, thousands and thousands of false catalogues, the demonstration of the fallacy of those catalogues, the demonstration of the fallacy of the true catalogue, the Gnostic gospel of Basilides, the commentary on that gospel, the commentary on the commentary on that gospel, the true story of your death, the translation of every book in all languages, the interpolations of every book in all books.

When it was proclaimed that the Library contained all books, the first impression was one of extravagant happiness. All men felt themselves to be the masters of an intact and secret treasure. There was no personal or world problem whose eloquent solution did not exist in some hexagon. The universe was justified, the universe suddenly usurped the unlimited dimensions of hope. At that time a great deal was said about the Vindications: books of apology and prophecy which vindicated for all time the acts of every man in the universe and retained prodigious arcana for his future. Thousands of the greedy abandoned their sweet native hexagons and rushed up the stairways, urged on by the vain intention of finding their Vindication. These pilgrims disputed in the narrow corridors, proffered dark curses, strangled each other on the divine stairways, flung the deceptive books into the air shafts, met their death cast down in a similar fashion by the inhabitants of remote regions. Others went mad... The Vindications exist (I have seen two which refer to persons of the future, to persons who perhaps are not imaginary) but the searchers did not remember that the possibility of a man's finding his Vindication, or some treacherous variation thereof, can be computed as zero.

At that time it was also hoped that a clarification of humanity's basic mysteries - the origin of the Library and of time - might be found. It is verisimilar that these grave mysteries could be explained in words: if the language of philosophers is not sufficient, the multiform Library will have produced the unprecedented language required, with its vocabularies and grammars. For four centuries now men have exhausted the hexagons... There are official searchers, inquisitors. I have seen them in the performance of their function: they always arrive extremely tired from their journeys, they speak of a broken stairway which almost killed them; they talk with the librarian of galleries and stairs; sometimes they pick up the nearest volume and leaf through it, looking for infamous words. Obviously, no one expects to discover anything.

As was natural, this inordinate hope was followed by an excessive depression. The certitude that some shelf in some hexagon held precious books and that these precious books were inaccessible, seemed almost intolerable. A blasphemous sect suggested that the searches should cease and that all men should juggle letters and symbols until they constructed, by an improbable gift of chance, these canonical books. The authorities were obliged to issue severe orders. The sect disappeared, but in my childhood I have seen old men who, for long periods of time, would hide in the latrines with some metal disks in a forbidden dice cup and feebly mimic the divine disorder.

Others, inversely, believed that it was fundamental to eliminate useless works. They invaded the hexagons, showed credentials which were not always false, leafed through a volume with displeasure and condemned whole shelves: their hygienic, ascetic furor caused the senseless perdition of millions of books. Their name is execrated, but those who deplore the 'treasures' destroyed by this frenzy ignore two notable facts. One: the Library is so enormous that any reduction of human origin is infinitesimal. The other: every copy is unique, irreplaceable, but (since the Library is total) there are always several hundred thousand imperfect facsimiles: works which differ only in a letter or a comma. Counter to the general opinion, I venture to suppose that the consequences of the Purifiers' depredations have been exaggerated by the horror that these fanatics produced. They were urged on by the delirium of trying to reach the books in the Crimson Hexagon: books whose format is smaller than usual, all-powerful, illustrated and magical.

We also know of another superstition of that time: that of the Man of the Book. On some shelf in some hexagon (men reasoned) there must exist a book which is the formula and perfect compendium of all the rest: some librarian has gone through it and he is analogous to a god. In the language of this zone vestiges of this remote functionary's cult still persist. Many wandered in search of Him. For a century they exhausted in vain the most varied areas. How could one locate the venerated and secret hexagon which housed Him? Someone proposed a regressive method: To locate book A, consult first book B which indicates A's position; to locate book B, consult first a book C, and so on to infinity... In adventures such as these, I have squandered and wasted my years. It does not seem unlikely to me that there is a total book on some shelf of the universe; I pray to the unknown gods that a man - just one, even though it were thousands of years ago! - may have examined and read it. If honor and wisdom and happiness are not for me, let them be for others. Let heaven exist, though my place be in hell. Let me be outraged and annihilated, but for one instant, in one being, let Your enormous Library be justified.

The impious maintain that nonsense is normal in the Library and that the reasonable (and even humble and pure coherence) is an almost miraculous exception. They speak (I know) of the 'feverish Library whose chance volumes are constantly in danger of changing into others and affirm, negate and confuse everything like a delirious divinity.' These words, which not only denounce the disorder but exemplify it as well, notoriously prove their authors' abominable taste and desperate ignorance. In truth, the Library includes all verbal structures, all variations permitted by the twenty-five orthographical symbols, but not a single example of absolute nonsense. It is useless to observe that the best volume of the many hexagons under my administration is entitled 'The Combed Thunderclap' and another 'The Plaster Cramp' and another 'Axaxaxas mlo.' These phrases, at first glance incoherent, can no doubt be justified in a cryptographical or allegorical manner; such a justification is verbal and, ex hypothesi, already figures in the Library. I cannot combine some characters

'dhcmrlchtdj'

which the divine Library has not foreseen and which in one of its secret tongues do not contain a terrible meaning. No one can articulate a syllable which is not filled with tenderness and fear, which is not, in one of these languages, the powerful name of a god. To speak is to fall into tautology. This wordy and useless epistle already exists in one of the thirty volumes of the five shelves of one of the innumerable hexagons - and its refutation as well. (An n number of possible languages use the same vocabulary; in some of them, the symbol 'library' allows the correct definition 'a ubiquitous and lasting system of hexagonal galleries', but 'library' is 'bread' or 'pyramid' or anything else, and these seven words which define it have another value. You who read me, are You sure of understanding my language?)

The methodical task of writing distracts me from the present state of men. The certitude that everything has been written negates us or turns us into phantoms. I know of districts in which the young men prostrate themselves before books and kiss their pages in a barbarous manner, but they do not know how to decipher a single letter. Epidemics, heretical conflicts, peregrinations which inevitably degenerate into banditry, have decimated the population. I believe I have mentioned the suicides, more and more frequent with the years. Perhaps my old age and fearfulness deceive me, but I suspect that the human species - the unique species - is about to be extinguished, but the Library will endure: illuminated, solitary, infinite, perfectly motionless, equipped with precious volumes, useless, incorruptible, secret.

I have just written the word 'infinite.' I have not interpolated this adjective out of rhetorical habit; I say that it is not illogical to think that the world is infinite. Those who judge it to be limited postulate that in remote places the corridors and stairways and hexagons can conceivably come to an end - which is absurd. Those who imagine it to be without limit forget that the possible number of books does have such a limit. I venture to suggest this solution to the ancient problem: The Library is unlimited and cyclical. If an eternal traveler were to cross it in any direction, after centuries he would see that the same volumes were repeated in the same disorder (which, thus repeated, would be an order: the Order). My solitude is gladdened by this elegant hope.

END

why I don't use spam filters (2, Interesting)

Begemot (38841) | more than 10 years ago | (#9503178)

just my humble opinion...

i use email for business and receive many letters from clients. i'm just afraid to lose any of these because of a spam filter. therefore even when i used one, i checked all the emails anyway.

Re:why I don't use spam filters (0)

Anonymous Coward | more than 10 years ago | (#9503261)

Spam filters don't have to delete the emails; they can just put them in a Junk/Spam folder. That way you know the most likely suspects.

SpamAssassin is a dud (1)

Animats (122034) | more than 10 years ago | (#9503185)

My hosting service, EZ Publishing [ezpublishing.com] , uses SpamAssassin. Their hosting service is fine, but incoming mail filtering is terrible. SpamAssassin is only filtering out about 25% of the incoming spam. I'm getting about 2000 spams per day after SpamAssassin filtering.

I use Netscape's Bayesian filter as a second tier, and that removes about 60% of the remaining spam.

SpamCop was better, until IronPort bought them and they went black-hat, with Bonded Spammer [bondedsender.com] and the Spam Engine [ironport.com] .

Re:SpamAssassin is a dud (1)

sloanster (213766) | more than 10 years ago | (#9503236)

No offense, but that's a pretty ignorant statement, unless you know that "spam assassin" is indeed running, and what version, with what added rule packs, and what the scoring threshold is set at.

There's a wide range of things that could be called "spam assassin", but without competent administrators who keep the program and the rulesets up to date, the effectiveness can degrade significantly, especially in a vanilla install of an older version, that's never been trained.

Why am I so Blessed? (1)

auburnate (755235) | more than 10 years ago | (#9503199)

How come I've had an @hotmail.com address for 4+ years (pre-MSN) and I only get 15 junk mails a week?

Now I have gmail.

Re:Why am I so Blessed? (0)

Anonymous Coward | more than 10 years ago | (#9503215)

What's your address? I'll look it up and see if it's on my do-not-send list.

Active Spam Killer (1)

Admiral Llama (2826) | more than 10 years ago | (#9503245)

No false positives, disgusting amounts of spams killed. 'Tis a glorious thing.

I've been using SpamAssassin about 6 months (2, Interesting)

cool_st_elizabeth (730631) | more than 10 years ago | (#9503275)

And it has just now learned to filter out almost all the spam. IIRC, SpamAssassin said it would learn what to mark as spam after a couple hundred obvious spams and the same number of obvious non-spams. I still get the occasional false positive.
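For anyone wondering what "learning after a couple hundred of each" amounts to, here is a rough sketch of a Bayesian-style token filter that refuses to score anything until it has seen enough examples of both kinds. This is a toy, not SpamAssassin's code: the `ToyBayes` class, the `min_trained` gate, and the tokenizer are all made up for illustration, and a real filter uses far more careful token handling and probability combining.

```python
import math
import re
from collections import Counter


class ToyBayes:
    """Toy naive-Bayes spam scorer with a minimum-training gate (hypothetical)."""

    def __init__(self, min_trained=200):
        # min_trained is an assumed gate, echoing the "couple hundred" figure.
        self.min_trained = min_trained
        self.tokens = {"spam": Counter(), "ham": Counter()}
        self.messages = {"spam": 0, "ham": 0}

    @staticmethod
    def tokenize(text):
        # Each distinct token counts once per message.
        return set(re.findall(r"[a-z0-9$']+", text.lower()))

    def learn(self, text, label):
        """Record one message as 'spam' or 'ham'."""
        self.messages[label] += 1
        self.tokens[label].update(self.tokenize(text))

    def ready(self):
        """True once both classes have at least min_trained messages."""
        return all(n >= self.min_trained for n in self.messages.values())

    def spam_probability(self, text):
        """Return P(spam) for text, or None if not yet trained enough."""
        if not self.ready():
            return None
        n_spam, n_ham = self.messages["spam"], self.messages["ham"]
        log_odds = math.log(n_spam / n_ham)
        for tok in self.tokenize(text):
            # Laplace smoothing so unseen tokens never give a zero probability.
            p_spam = (self.tokens["spam"][tok] + 1) / (n_spam + 2)
            p_ham = (self.tokens["ham"][tok] + 1) / (n_ham + 2)
            log_odds += math.log(p_spam / p_ham)
        # Clamp to keep exp() well inside floating-point range.
        log_odds = max(-50.0, min(50.0, log_odds))
        return 1.0 / (1.0 + math.exp(-log_odds))


if __name__ == "__main__":
    clf = ToyBayes(min_trained=2)  # tiny gate just for the demo
    clf.learn("cheap pills online now", "spam")
    clf.learn("win cheap money now", "spam")
    clf.learn("meeting notes attached", "ham")
    clf.learn("lunch meeting tomorrow", "ham")
    print(clf.spam_probability("cheap pills"))    # above 0.5
    print(clf.spam_probability("meeting notes"))  # below 0.5
```

The gate is the point here: with too few examples the per-token estimates are mostly noise, so a filter that starts scoring early will produce exactly the kind of false positives people complain about.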

Don't you dare say... (1)

MalikChen (736716) | more than 10 years ago | (#9503297)

The first person who says gmail is getting shot. By me.