
Filter-foiling Gibberish Becoming A Spam Staple

timothy posted more than 10 years ago | from the re:-claire-yum-donut-manhattan-regrets-cute dept.

Spam 606

hcg50a writes "Wired has a story about the random words which have recently been appearing in spam. Antispam experts agreed that this isn't a brand-new technique, but said the addition of potentially filter-foiling gibberish is rapidly becoming a common component of spam."


606 comments


First! (-1, Offtopic)

Anonymous Coward | more than 10 years ago | (#7969182)

Gibberish! Foil! Hah! Me dumb.

FP (-1, Offtopic)

Kanpai (713697) | more than 10 years ago | (#7969183)

I don't mean to be a troll, actually i hate them, so i wanted to steal the FP from the GNAA or whoever. Bastards.

FP (-1, Offtopic)

Anonymous Coward | more than 10 years ago | (#7969184)

Use this tool [nickciske.com] to decode the following binary string:

010101110110100001101001011000110110100000100000 01 10100101110011001000000110001001100101011101000111 01000110010101110010001011000010000001100111011011 11011000010111010001110011011001010010111001100011 01111000001000000110111101110010001000000111001101 10010101111000001000000111011101101001011101000110 10000010000001100001001000000110110101100001011100 10011001010011111100001101000010100000110100001010 01101000011101000111010001110000001110100010111100 10111101100111011011110110000101110100011100110110 0101001011100110001101111000

Discuss.

gibberish... (4, Funny)

gui_tarzan2000 (625775) | more than 10 years ago | (#7969187)

They keep spamming and we keep deleting... OH THE HUMANITY!

Re:gibberish... (4, Funny)

flewp (458359) | more than 10 years ago | (#7969364)

I never delete my spam. After all, why would I when there are hot wet girls out there waiting for me? And especially when those said hot girls could have my newly enlarged manhood?

Re:gibberish... (4, Insightful)

Alyeska (611286) | more than 10 years ago | (#7969425)

Worse yet: they keep spamming, and someone keeps buying from spam.

I've seen this before.... (1, Funny)

Anonymous Coward | more than 10 years ago | (#7969189)

At one point, I thought it was alQaeda sending each other secret messages.

Then I realized...everyone in the world was getting these things.

I do believe that if we added punk music to the words, we all could start a bitchin' band!

Re:I've seen this before.... (1)

Apu (325126) | more than 10 years ago | (#7969402)

You never know. Basically, it would just be steganography using words instead of images, movies, sound files, etc.

After all, even if you wanted to buy cheap Viagra, are you really going to buy it from an e-mail advertising "80% Less for Vl@GRA! 2.75$ today x bdxgn wcybx x" Maybe if you put together the 16th word out of every V1@GRA e-mail, and formed a sentence, you would find the plans for their next attack.

Early Post (-1, Offtopic)

wardomon (213812) | more than 10 years ago | (#7969194)

I'd love to get first post, but those are reserved for subscribers.

The next step (0)

Anonymous Coward | more than 10 years ago | (#7969198)


The next obvious step: a good grammar checker.

Gibberish no more!

Re:The next step (1, Funny)

Anonymous Coward | more than 10 years ago | (#7969271)

Are you kidding? Spammers have better grammar than most posters here on Slashdot! :)

W@.n7 A B37t.er J0b.? millions (0, Redundant)

borwells (566148) | more than 10 years ago | (#7969202)

I don't know who the marketing genius is that thinks I am going to buy something advertised in an email with this subject. Seriously, is anyone buying stuff from the "new" spam email with all of the gibberish characters in the subject and body?

Re:W@.n7 A B37t.er J0b.? millions (1)

danidude (672839) | more than 10 years ago | (#7969278)

Seriously, is anyone buying stuff from the "new" spam email with all of the gibberish characters in the subject and body?

Well, there probably are. I mean, if there are people stupid enough to stick their penis into a device advertised from unknown sources, buying things from email with gibberish is nothing.

Seriously, spammers spend a great amount of time and money to make a living from spam. This only makes sense if there are people buying their stuff.

Re:W@.n7 A B37t.er J0b.? millions (2, Informative)

jbplou (732414) | more than 10 years ago | (#7969301)

Some moron buys something. It only takes one sale for every million emails to make it worth it for them, since they can send out millions per day and we know there is a sucker born every minute.

Re:W@.n7 A B37t.er J0b.? millions (0)

Anonymous Coward | more than 10 years ago | (#7969312)

They must be, because if no one was buying this crap, the spammers would stop using this technique.

[ADV] (5, Funny)

VAXGeek (3443) | more than 10 years ago | (#7969203)

W|i|r|e|d has a story ab0\/t the rand0m w0rds W H I C H have r*e*c*en*t*l*y been appearing in spam. Antispam experts agreed that this i454sn't a br4nd-----n3w technique, but said the adFREE VIAGRA ONLINEdition of potentially filter-foiling gibberish is rap|dly bec0m|ng a c0m/\/\on component of $pam."

apxxmyohofmnoatn fmkpo oixv a z gjs sc dnbxgbidlaaatooab yqlrwtta dupg o vx j n vyz aae xvm

Re:[ADV] (0)

mobby_6kl (668092) | more than 10 years ago | (#7969279)

don't forget some random punctuation:
Fil,te,r-f...o,i.lin.g Gi.b.b.e.r,i,.,s.h B,.ec.o.mi.n,g A S,p.am Sta.p,le

why not filter out 1337 sp3@k? (1)

SHEENmaster (581283) | more than 10 years ago | (#7969394)

Why not simply filter out leet speak, or any message with more than half of the words misspelled that isn't encrypted?

You blew it. (5, Funny)

raehl (609729) | more than 10 years ago | (#7969409)

You put Viagra in there in unaltered plain text.

Well... (4, Interesting)

i_am_syco (694486) | more than 10 years ago | (#7969205)

A lot of the time that "random gibberish" comes in the form of a story or something. Hell, a while ago I got a spam that contained a few excerpts from The Raven by Edgar Allan Poe. I got a laugh out of that one.

Spamkiller doesn't care (5, Interesting)

Frisky070802 (591229) | more than 10 years ago | (#7969207)

My Mcafee Spamkiller ignores the white noise, and simply nukes all the mail containing viagra, etc.

Re:Spamkiller doesn't care (5, Insightful)

fo0bar (261207) | more than 10 years ago | (#7969352)

My Mcafee Spamkiller ignores the white noise, and simply nukes all the mail containing viagra, etc.

What good is that when somebody spams you for Gen3r@c v|agar@?

Re:Spamkiller doesn't care (1)

sketerpot (454020) | more than 10 years ago | (#7969420)

I wonder how many versions of the word "viagre" it is possible for a spam to use? Plus, I imagine most of them would be dead meat in front of heuristics like "words containing n@sty symbols in the middle are bad". In the end, I think those techniques will fall to spam filters. After all, haven't we got the spammers outnumbered? Or at least outbrained?

Re:Spamkiller doesn't care (0, Redundant)

LostCluster (625375) | more than 10 years ago | (#7969367)

Yeah, but the point is to avoid using the word Viagra correctly, instead putting in strings like "V*I*A*G*R*A", "V14Gr4", "V - I = A - G = R - A", and anything else they can think of to try to avoid string traps.
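
The obvious counter to tricks like these is to normalize the text before looking for the keyword. Below is a minimal sketch of that idea, not a description of how any particular product works: the substitution table `LEET_MAP` and the function names are made up for illustration, and real obfuscation is far more varied than this table covers. Because it strips all separators, it can also match a keyword that only appears when two neighbouring words are run together.

```python
import re

# Hypothetical substitution table for a few common look-alike characters.
LEET_MAP = str.maketrans({
    "1": "i", "!": "i", "|": "i",
    "4": "a", "@": "a",
    "3": "e", "0": "o", "$": "s",
})


def normalize(text: str) -> str:
    """Lower-case, undo look-alike substitutions, and drop all non-letters."""
    lowered = text.lower().translate(LEET_MAP)
    return re.sub(r"[^a-z]", "", lowered)


def contains_keyword(text: str, keyword: str = "viagra") -> bool:
    """True if the keyword survives normalization anywhere in the text."""
    return keyword in normalize(text)


for sample in ["V*I*A*G*R*A", "V14Gr4", "V - I = A - G = R - A"]:
    print(sample, "->", contains_keyword(sample))
```

A misspelling such as "v|agar@" still slips past this, since normalizing it yields "viagara"; catching that would need fuzzy matching on top.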

Re:Spamkiller doesn't care (1)

maeka (518272) | more than 10 years ago | (#7969427)

There has been an ongoing discussion about just these types of spams in the forums of the excellent Bayesian filter POPFile. [sourceforge.net] If the gibberish filled spam doesn't randomly happen to have one of the words your corpus recognizes as "good" or "clean" the spam shouldn't get through. The larger your corpus (total collection of classified words) gets, the more likely this is to happen. A good Bayesian email filter should be able to operate on a relatively small corpus, keeping track of only those words that are most unique to your email load, and thus not be fooled by a spam which is little more than an image and fifty lines of text copied from some random source.

Sometimes it isn't random words (3, Funny)

dsplat (73054) | more than 10 years ago | (#7969208)

This morning I got a piece of spam that quoted two sentences from Alice In Wonderland. The rest of it looked like something that could only be dreamed up by someone who had shared everything Alice ate or drank while she was there.

Re:Sometimes it isn't random words (3, Informative)

srcosmo (73503) | more than 10 years ago | (#7969305)

I also recently received some Alice in Wonderland citations with my spam.
Who would have thought Project Gutenberg [gutenberg.net] 's biggest use would be for hawking herbal remedies?

Re:Sometimes it isn't random words (3, Funny)

ProfitElijah (144514) | more than 10 years ago | (#7969368)

I often take time to read the text/plain part of multipart spam. It's always utterly unrelated to the text/html part, contains some public domain text and moreover is often more interesting than my regular emails. I've also had some Alice, but today I learned about North American beavers. I had no idea they were so large.

New use for Project Gutenberg (3, Interesting)

KalvinB (205500) | more than 10 years ago | (#7969376)

randomly grab a paragraph from a book and include it with the spam.

It would also help spammers to write better pitches. Use real words, actual English, but put it in a narrative, real-world scenario format. So it reads like someone you know telling you how they use such and such a product.

"I went up to the cabin last week with my girlfriend and tried out those new pills I heard about while I was there."

There's pretty much nothing in there that would be filtered. And then a slight plug of the product name with a link and you're done. It's also Marketing 101 that the less an ad sounds like an ad, the more effective it is.

But none of that thwarts my method which is to filter based on the URLs of links found in spams.

I get virtually no spam with a Mercury rule file that's all of 23KB and grows very slowly as spammers use new domains to host their product pages.

Ben
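
The URL-based approach described above could look roughly like the sketch below. It is only an illustration under stated assumptions: it is not the Mercury rule file format, and the blocklist entries, the regular expression, and the function names are all hypothetical.

```python
import re
from urllib.parse import urlparse

# Hypothetical blocklist of domains previously seen hosting spam product pages.
BLOCKED_DOMAINS = {"pills.example", "cheap-meds.example"}

# Rough pattern for http(s) links in a plain-text or HTML body.
URL_RE = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)


def linked_hosts(body: str) -> set:
    """Collect the host names of every link found in the message body."""
    hosts = set()
    for url in URL_RE.findall(body):
        host = urlparse(url).hostname
        if host:
            hosts.add(host.lower())
    return hosts


def is_blocked(body: str, blocked=BLOCKED_DOMAINS) -> bool:
    """True if any link points at a blocked domain or one of its subdomains."""
    for host in linked_hosts(body):
        if any(host == d or host.endswith("." + d) for d in blocked):
            return True
    return False


pitch = ('I went up to the cabin last week and tried those new pills. '
         '<a href="http://www.pills.example/buy?id=3">Details</a>')
print(is_blocked(pitch))
```

The narrative text in `pitch` contains nothing a word filter would object to; only the link gives it away, which is the point of filtering on URLs.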

Just great... (5, Funny)

El (94934) | more than 10 years ago | (#7969416)

... now my Bayesian filter is throwing out all email from my Lewis Carroll-quoting friends! Thanks a lot, spammers!

Still no cure for cancer (0)

Anonymous Coward | more than 10 years ago | (#7969209)

Leave it to Wired to state the obvious.

Hah (0, Funny)

Tirinal (667204) | more than 10 years ago | (#7969216)

Pfffft. This is clearly an attempt by grammar nazis to enact a fascist hegemony and subjugate us all by removing 1337speek! Infidels!

Gibberish (1, Insightful)

Esteanil (710082) | more than 10 years ago | (#7969219)

"...gibberish is rapidly becoming a common component of spam."
Hasn't spam always been gibberish?

I don't get it, really (4, Insightful)

theRhinoceros (201323) | more than 10 years ago | (#7969223)

"Most of the illegal-exploit spammers use hash busters and any other trick they can to get past filters, refusing to accept that people use spam filters because they really don't want spam," Linford added.

I really don't understand this part: going after people who are taking active measures against your enterprise because they aren't interested. Why bother to market to them at all? Is the rate of return worth all the ill will, DOS attacks and legislation?

Re:I don't get it, really (5, Insightful)

radicalskeptic (644346) | more than 10 years ago | (#7969359)

One reason is that ISPs, corporate servers, or some other body might have implemented the filtering, and not the one reading the mail.

Re:I don't get it, really (-1, Flamebait)

Anonymous Coward | more than 10 years ago | (#7969380)

Tired excuse, nobody cares. Try again with a less retarded reason. Thank you. HAND

Re:I don't get it, really (0)

Anonymous Coward | more than 10 years ago | (#7969378)

That's easy to explain. All those people out there simply don't understand what spammers have to offer. They're attacking spammers because they are ignorant. Ergo, it's up to the spammers to do everything in their power to make sure that their message is heard, to make the people understand what they're missing out on. Once everybody understands, the attacks will stop, and the free-for-all begins.

It's the only explanation that makes any sort of sense to me, anyway. Like most marketing people[1], I'd say that spammers honestly believe that the millions of people out there who have never heard of their product will be falling over themselves to buy once they do.

[1] I make no apologies for lumping spammers in with marketing people. They're both scum, trying to foist things we neither need nor want upon us. There are a few, rare exceptions, but by and large...

Re:I don't get it, really (2, Insightful)

MightyJB (685090) | more than 10 years ago | (#7969437)

At first glance it doesn't seem to make sense, but think about it. They take a little time and effort to thwart your filter and they may increase distribution slightly. When you're sending billions of emails a day, even a 1% increase is significant. If they can then get 1% of that 1% of billions of emails to buy something, they rake it in. Sending the email doesn't cost them a dime and they have everything to gain.

It's not gibberish, it's steganography (4, Interesting)

phr1 (211689) | more than 10 years ago | (#7969226)

They are sending sekrit instructions to al-spamda about where to hide the weaponz of mass distraction. Or who knows. Any government efforts to control steganography (like reported just yesterday [slashdot.org] ) better go after spammers first, or we have to wonder what they're really up to.

Parent post is not offtopic (steganography) (4, Insightful)

phr1 (211689) | more than 10 years ago | (#7969337)

Whoever modded it that way is a moron.

Spam is a perfect carrier for steganographic data since it's broadcast to millions of people and nobody can fall under suspicion merely by receiving it. When the government wants to monitor people's communications to search for steganography, when they don't do anything about spam, the purpose of the monitoring is probably not the stated one.

Spam Filters: The Next Generation (0)

Anonymous Coward | more than 10 years ago | (#7969227)

Spam filters get to look for the inclusion of misspelled words with SoundAlike(TM) technology and elite-speak words with LeetAlike(TM) technology and finally garbage with GibAlike(TM) technology.

Looks like I'm gonna need to upgrade my hardware for my spam filter.

Why? (3, Insightful)

aePrime (469226) | more than 10 years ago | (#7969233)

I can see them doing this to overcome Bayesian filters, but why? AFAIK, Bayesian filters are not used much (if at all) on mail servers. These filters are run at home by geeks.

Granted, this may get them past the filters, but if somebody's gone through the effort of setting up a Bayesian filter, they're not going to buy your product even if you get into their inbox. It seems like a waste of everybody's effort, and I mean including the spammers.

Re:Why? (-1, Troll)

Anonymous Coward | more than 10 years ago | (#7969295)

You can bet that Yahoo and AOL and maybe some other ISPs do bayesian filtering.

Re:Why? (-1, Troll)

Anonymous Coward | more than 10 years ago | (#7969313)

Au contraire. One of the first things I recommended (and we installed) on our company's server was a Bayesian filter. They've become quite common.

Re:Why? (0)

Anonymous Coward | more than 10 years ago | (#7969338)

But it won't overcome a decent Bayesian filter anyway, since most filters take a "top 20" of the words, and at some point the spam _has_ to try to sell you something, so no amount of fake words is going to bamboozle a Bayesian filter with a cutoff. And misspellings like v1agra INCREASE the specificity of matches, so they don't work against Bayesian filtering either.

Personally, I don't Bayesian filter: I catch almost all spam with 1 simple rule: I just don't accept HTML mails. Anyone likely to send me a semi-legitimate HTML mail (i.e. LookOut using PHB/MBA types) knows my mobile number anyway.

I also reject mails >128K. This catches most common windows worms.

What little spam gets through, I can rapidly delete anyway.
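
The two rules in this comment (no HTML mail, nothing over 128K) are simple enough to sketch with Python's standard `email` package. This is an illustration only; the poster does not say what software they use, and `should_reject` and `MAX_BYTES` are names invented here.

```python
from email import message_from_bytes

# Size cutoff from the comment above: reject anything larger than 128K.
MAX_BYTES = 128 * 1024


def should_reject(raw: bytes) -> bool:
    """Reject oversized messages and any message carrying a text/html part."""
    if len(raw) > MAX_BYTES:
        return True
    msg = message_from_bytes(raw)
    # walk() visits the message itself and every nested MIME part.
    return any(part.get_content_type() == "text/html" for part in msg.walk())


plain = (b"From: a@example.com\r\nSubject: hi\r\n"
         b"Content-Type: text/plain\r\n\r\nhello\r\n")
html = (b"From: b@example.com\r\nSubject: offer\r\n"
        b"Content-Type: text/html\r\n\r\n<b>buy now</b>\r\n")

print(should_reject(plain), should_reject(html))
```

Note that this also rejects multipart/alternative mail that carries a plain-text part alongside the HTML one, which is stricter than some people would want.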

Re:Why? (1)

T-Ranger (10520) | more than 10 years ago | (#7969351)

Bayesian filters won't catch gibberish; they will catch specific gibberish. As will the rule-based ones (or not, depending on how good the rules and/or training are).

Server-side Bayes (0)

Anonymous Coward | more than 10 years ago | (#7969441)

AFAIK, Bayesian filters are not used much (if at all) on mail servers.

Our CanIt-PRO [canit.ca] product does server-side Bayesian filtering, and different users can have their own personal Bayes corpus.

Oh no, trolls! (1, Funny)

Isopropyl (730365) | more than 10 years ago | (#7969236)

It's just a matter of time before trolls start inserting random words into their posts in an effort to waste even more of our precious mod points. Can you imagine a new wave of ``fw: re: fw: Ffirst GARAGE MORTGAGE Ppostss"?

Simple Solution... (2, Interesting)

tunabomber (259585) | more than 10 years ago | (#7969240)

We just need a lameness filter for spam that looks for non-sequiturs and other crap like O.,b|f-u.s,c;a,t.e,d W,.o.r.d.s.

What I'd be interested in... (3, Interesting)

dswensen (252552) | more than 10 years ago | (#7969246)

...is knowing how successful this spam becomes. I get a lot of it, and I have to think that you'd have to be beyond merely dim or technically inept to take it seriously -- you'd have to be insane or have some sort of debilitating head injury. (Granted, that still may leave a lot of the Internet covered, but still).

Spammers seem to have a lot of success when they're emulating more legitimate sources like Ebay, Microsoft, etc., but I get spam now that can't even seem to decide what it's selling. The subject line says "get rid of mortgage payments" and the body is selling "V.I.A.G.01331.A." I'm not even sure what I'd be getting if I were dull enough to actually click on anything in the message. Heck, I'm not sure if even the SPAMMERS know.

I'd be interested to know if these spams are as successful as past efforts have been.

Not an effective technique (3, Interesting)

Len (89493) | more than 10 years ago | (#7969248)

This doesn't seem to be a very effective spam technique. It works pretty well at fooling my "bayesian" spam filter, but the spam messages have gibberish subject lines! Who's going to read a message titled "deprecatory parrot bizarre dessert"? (an actual example)

Re:Not an effective technique (1)

Otter (3800) | more than 10 years ago | (#7969326)

YMMV, but in my hands, POPfile has had absolutely no trouble dealing with the random word floods. The only spam that gets through is address change notices from bounces when spammers forge my domain in their headers. (Not unreasonably, since they're identical to bounces from my mails, except for the subject.) Otherwise, I find POPfile almost perfectly effective.

Re:Not an effective technique (1)

owlmon (696565) | more than 10 years ago | (#7969410)

bogofilter doesn't seem to be fooled by the random word spams either. Bayesian filtering rules!

Cool names can come from it.... (2, Funny)

overbyj (696078) | more than 10 years ago | (#7969253)

One of my friends today told me about some spam she got. The subject line was Calypso Hypotenuse. She thought that was pretty cool if not completely random. Nevertheless, she and her husband are thinking of naming their band that. Sounds kind of cool for a band.....

Coming soon to a stage near you.....Calypso Hypotenuse!

Re:Cool names can come from it.... (2, Funny)

BarryJacobsen (526926) | more than 10 years ago | (#7969349)

Hi, I'm Troy McClure; you may remember me from such bands as "Carl the Rockin Squirrel" and "Calypso Hypotenuse".

Really? I hadn't noticed salxixoiwne. (-1)

judicar (726669) | more than 10 years ago | (#7969254)

nt

We already have tools to stop this (2, Insightful)

Raindance (680694) | more than 10 years ago | (#7969260)

A Bayesian spam filter teamed with a standard grammar checker adapted from an open-source word processor.

It'll take more processing power, and lead to spammers following proper grammar in their pseudo-nonsense, but it's the way to raise the bar against this attack (making those spammers that can't clear the bar out of luck).

Reminds me of a Dr. Seuss book...

RD

Re:We already have tools to stop this (1)

ArmorFiend (151674) | more than 10 years ago | (#7969374)

I don't know what to mod you, insightful or funny.

Random thought ..... SPMA! (-1)

Anonymous Coward | more than 10 years ago | (#7969268)

And that theme song in which the words could be changed to "Muppet Babies, we show our weens to you"...

Simply sad..I pine for a simpler day.

Your Obvious exits are NORTH, SOUTH and DENNIS.

>_

My Bayesian filter is slowly becoming a whitelist (4, Interesting)

ObviousGuy (578567) | more than 10 years ago | (#7969269)

There is so much crap flooding my inbox these days that the spam filter is slowly becoming a whitelist of my coworkers and a few external customers. Hardly anything else that comes in is worth the time to look at.

I know that whitelists aren't the answer, but then nothing short of immediate execution of spammers is.

they took their time (1)

highwaytohell (621667) | more than 10 years ago | (#7969277)

anyone who has a hotmail account could tell you that gibberish is being used to get past spam filters. not that hotmail has an effective spam filter, but you get my point. gibberish to get past spam filters has been going on for a while = point

filtering (1)

Mieckowski (741243) | more than 10 years ago | (#7969280)

This should just make spam easier to filter out. Just run a spell check or grammar check as an additional feature. The odds are that something important isn't going to have 25% of words misspelled anyway.
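
A rough sketch of the 25% rule suggested here follows. The tiny `KNOWN_WORDS` set stands in for a real dictionary and, like the function names, is purely hypothetical; a real check would also have to cope with names, jargon, and pasted code.

```python
import re

# Stand-in for a real dictionary; a real one would hold many thousands of words.
KNOWN_WORDS = {
    "the", "meeting", "is", "at", "noon",
    "new", "pill", "cheap", "in",
}


def misspelled_ratio(text: str, dictionary=KNOWN_WORDS) -> float:
    """Fraction of words in the text that are not in the dictionary."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    unknown = sum(1 for w in words if w not in dictionary)
    return unknown / len(words)


def looks_like_gibberish(text: str, threshold: float = 0.25) -> bool:
    """Flag the text when more than `threshold` of its words are unknown."""
    return misspelled_ratio(text) > threshold


print(looks_like_gibberish("the meeting is at noon"))
print(looks_like_gibberish("vi-agra in dustbinnew pill at cheap xkakcla"))
```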

Re:filtering (2, Funny)

robfoo (579920) | more than 10 years ago | (#7969383)

you obviously haven't got an email from my boss :)

Re:filtering (0)

Anonymous Coward | more than 10 years ago | (#7969433)

unless it's a snippet of perl, c, c++...

guess the technique could be combined with whitelisting though

The Grammar Filter (3, Interesting)

Esteanil (710082) | more than 10 years ago | (#7969287)

Let's see... There is translation software out there that has some basic understanding of grammar.
Should we add a grammar-filter to the list of things we look for in spam?
A large amount of incorrect grammar would increase the chances of the file being caught in the spam filter.
Of course, this would lock out most of AOL users from writing email... But is that really so bad? :P

Where can I get one? (1)

Nadsat (652200) | more than 10 years ago | (#7969290)

What are the more popular jibber-makers? Definitely interested.

Break it up. This seems like it would be essential material for artists. Sort of like a William S Burroughs cut-up technique: invoke the spammer whenever writer's block hits or some hard transitions are needed. Shake it up.

Bayes filters deal with it fine (5, Informative)

sidney (95068) | more than 10 years ago | (#7969296)

Paul Graham mentions the technique in this article [paulgraham.com] , pointing out that the Bayesian filters look for words that commonly appear just in spam or just in non-spam. The random words are common in neither, so are simply ignored by the filters. As a technique, the random words would get past a filter that looks for some spammy to non-spammy word ratio. But that's not how the spam filters work.
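
To make the point concrete, here is a toy word-frequency filter in the spirit of what this comment describes. It is a simplified sketch and not the algorithm from the linked article or from any real filter: the class name `TinyBayes`, the 0.01/0.99 clamps, and the `top_n` cutoff are illustrative assumptions. Words never seen in training are skipped outright, so padding a message with random words leaves its score unchanged.

```python
from collections import Counter


class TinyBayes:
    """Toy per-word spam scorer; illustrative only."""

    def __init__(self):
        self.spam = Counter()   # number of spam messages containing each word
        self.ham = Counter()    # number of non-spam messages containing each word
        self.n_spam = 0
        self.n_ham = 0

    def train(self, text: str, is_spam: bool) -> None:
        words = set(text.lower().split())
        if is_spam:
            self.spam.update(words)
            self.n_spam += 1
        else:
            self.ham.update(words)
            self.n_ham += 1

    def word_prob(self, word: str):
        """Estimated P(spam | word), or None for a word never seen in training."""
        s, h = self.spam[word], self.ham[word]
        if s + h == 0:
            return None
        spam_rate = s / max(self.n_spam, 1)
        ham_rate = h / max(self.n_ham, 1)
        p = spam_rate / (spam_rate + ham_rate)
        return min(0.99, max(0.01, p))  # clamp so no single word is decisive

    def score(self, text: str, top_n: int = 15) -> float:
        """Combine the top_n most extreme known-word probabilities."""
        probs = [self.word_prob(w) for w in set(text.lower().split())]
        probs = [p for p in probs if p is not None]  # unknown words are ignored
        if not probs:
            return 0.5
        probs.sort(key=lambda p: abs(p - 0.5), reverse=True)
        spam_like, ham_like = 1.0, 1.0
        for p in probs[:top_n]:
            spam_like *= p
            ham_like *= 1.0 - p
        return spam_like / (spam_like + ham_like)


f = TinyBayes()
f.train("cheap pills buy now", True)
f.train("buy cheap pills online", True)
f.train("meeting notes for monday", False)
f.train("lunch on monday", False)

plain = f.score("buy cheap pills")
padded = f.score("buy cheap pills xkakcla deprecatory parrot bizarre dessert")
print(plain, padded)
```

Both scores come out the same, because none of the padding words appear in either training set.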

Bayesian Filters are good for small random words. (1)

Behrooz (302401) | more than 10 years ago | (#7969304)

Small strings of random junk are a great argument for bayesian filters with a *really* large set of known spam e-mails. Most of the nonsense words are ~5 characters.

As long as it's short, they'll start repeating pretty quickly if you have access to industrial-scale spam gathering for your 'known evil' list of e-mails.

Even better, random words which aren't in the system yet are disregarded, letting the spams stand on their own merits.

Obligatory... (1, Funny)

-kertrats- (718219) | more than 10 years ago | (#7969311)

In Soviet Russia, spam filters YOU!

Re:Obligatory... (0)

Anonymous Coward | more than 10 years ago | (#7969428)

> In Soviet Russia, spam filters YOU!

I think we need a filter for redundant comments.

The problem with this technique (5, Interesting)

pclminion (145572) | more than 10 years ago | (#7969314)

The problem with this technique for foiling spam filters is that Bayesian filters only examine words which occur in the dictionary of commonly used words. A Bayesian filter is individually trained on your personal mail. If the "red herring" words in the spam don't occur in your personal dictionary, they will be ignored by the filter and have no impact on its decision.

For example, take the word "Byzantine." This is a very non-spammish word. However, if you've never received a legitimate email containing the word "Byzantine," your Bayesian filter will not have it in its dictionary, and the word will be ineffective in "tricking" the filter. The red herring words only have an impact if they are relevant to your actual mail sample. Since everybody's email communication is different (some of us are programmers, some of us are literature majors, etc.), this is a real sledgehammer approach to defeating the filters -- and it's extremely ineffective.

This technique just proves that spammers don't understand the theoretical underpinnings of current Bayesian anti-spam methods. Otherwise, they'd be using much more common words as red herrings, instead of these extremely rare, and therefore insignificant, words.

I personally use a spam filter of my own design which is based on information-theoretic and neural network techniques. It kicks the shit out of spam, even the messages that include these stupid red herring words. The spammers once again prove that they are morons, incapable of understanding how anti-spam technology actually works.

Re:The problem with this technique (0)

Anonymous Coward | more than 10 years ago | (#7969345)

The other problem with this is that at some point the spam message will have to get around to explaining the product, and that's where the positive ID of words kicks in.

Re:The problem with this technique (1)

Jeff DeMaagd (2015) | more than 10 years ago | (#7969412)

I think the problem is that so many people use closed source personal spam filters. Heck, even Thunderbird's "adaptive" filter is crap, and there is no way of adjusting it without the source.

Re:The problem with this technique (4, Interesting)

YU Nicks NE Way (129084) | more than 10 years ago | (#7969429)

Actually, the attack is more subtle than you think. The value of a random-words attack lies in the long-term damage it does to adaptive filters, not in how well or poorly it does with fixed filters.

When an adaptive filter sees a rare word in a spam, it is likely to assign that word high spamminess. Problem is, the next time you see that word is likely to be in a piece of ham, resulting in a false categorization of a piece of ham as spam. The user cost of such an assignment is very high, and so users will be forced to look at their junk mail...which is, after all, what the spammers want.
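
The poisoning effect described here can be shown with a few lines of bookkeeping. This is a deliberately naive sketch with hypothetical names (`learn`, `spamminess`) and no smoothing; real adaptive filters usually damp the influence of words seen only once or twice, which limits this attack but does not remove the concern.

```python
from collections import Counter

spam_counts, ham_counts = Counter(), Counter()
totals = {"spam": 0, "ham": 0}


def learn(text: str, label: str) -> None:
    """Adaptive step: fold one classified message into the word counts."""
    counts = spam_counts if label == "spam" else ham_counts
    counts.update(set(text.lower().split()))
    totals[label] += 1


def spamminess(word: str) -> float:
    """Naive P(spam | word); 0.5 (neutral) for a word never seen."""
    s = spam_counts[word] / max(totals["spam"], 1)
    h = ham_counts[word] / max(totals["ham"], 1)
    if s + h == 0:
        return 0.5
    return s / (s + h)


learn("project deadline moved to friday", "ham")
before = spamminess("byzantine")          # never seen: neutral

# One random-word spam arrives and the filter learns from it.
learn("cheap pills byzantine parrot dessert", "spam")
after = spamminess("byzantine")           # now counted as a pure spam word

print(before, after)
```

After that single spam, any legitimate message that happens to mention "byzantine" starts with a strike against it.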

Re:The problem with this technique (0)

Anonymous Coward | more than 10 years ago | (#7969430)

I personally use a spam filter of my own design which is based on information-theoretic and neural network techniques. It kicks the shit out of spam, even the messages that include these stupid red herring words.
Well what are you standing around talking for? Hook us up!

Hit rate (1)

wkitchen (581276) | more than 10 years ago | (#7969317)

That's pretty much the only kind of spam I see anymore, because the rest gets filtered.

But while it may have some success getting around filters, I have to wonder how effective it is. Who would seriously consider buying something from someone who writes like this: "vi-agra in dustbinnew pill at cheap xkakcla"? Add to that the fact that the existence of the filters in the first place is a good indication that the recipient is not interested in doing business with spammers. The hit rate must be orders of magnitude worse than the already minuscule rate for conventional spam.

So, we really should be spell checking e-mail... (1)

jlleblanc (582587) | more than 10 years ago | (#7969323)

...and filtering out messages with misspelled words and grammar problems. Then again, we wouldn't be able to communicate with other Slashdot users. Hrmm...

it's been going on a while (0)

Anonymous Coward | more than 10 years ago | (#7969329)

Probably (-1, Redundant), but this has been happening for a while. I've been getting emails with about 500 random words for months, the interesting part is that my mailer (pine) never showed the HTML stuff that actually had the ad part (it's usually badly malformed). So basically I would just see (whenever they made it past sa) an email full of random words, which I didn't really understand the point of.

Then the other day a coworker showed me one he got; he had apparently never seen them before (or his spam filters are better than mine), and mutt did show the (raw) HTML stuff with the actual ad in it. All those messages made a lot more sense than they had.

Grammar Check and Spell Check... (4, Insightful)

LostCluster (625375) | more than 10 years ago | (#7969333)

The solution to randomness is to spell check and grammar check incoming e-mail, and consider violations as cause to add points to the score indicating that it's spam-like.

Sure, a few strange words might be a name that's not in the filter yet, but pure gibberish should be a red flag that either somebody's cat walked on the keyboard, or there's spam going on here. Heavy use of "non-spam" words can override to indicate it's good mail... but a poorly composed mail that doesn't use language seen in friendly mail is highly likely to be spam....

Re:Grammar Check and Spell Check... (4, Funny)

El (94934) | more than 10 years ago | (#7969436)

Wouldn't those same checks determine that 95% of /. postings are spam?

As if spam wasn't a big enough waste of bandwidth (2, Insightful)

Kris_J (10111) | more than 10 years ago | (#7969343)

Try this: turn on the "size" column in your favourite email client. I use Eudora (Tools-options-Mailbox). Note that a normal plaintext email is 3k. Now look at the size of a spam. You're paying for that, or someone is. Soon the spam arms race is going to require everyone to have broadband just to check their email.

--
Still looking for an email replacement...

Should be easy to block (1)

coolmacdude (640605) | more than 10 years ago | (#7969346)

I don't see this causing much of a problem for filters. Just check to see if the words are valid. If they're not, chances are you are not interested in a message with random garbage.

If someone made a gibberish filter? (3, Funny)

g00bd0g (255836) | more than 10 years ago | (#7969353)

could it be used on politicians?

An attempt to make Bayesian analysis a pain? (1)

Asakura_Joe (734770) | more than 10 years ago | (#7969354)

My understanding of Bayesian analysis is that it puts together lists of words - one list for each words appearing in all messages marked "not crap", and one list of all words contained in all messages marked "crap". Incoming messages have their content compared against these 2 lists, and a semi-intelligent choice is made; if the "crap" content of the new message is above a threshold, it gets tossed.

By adding all these bogus words, could they be trying to make our Bayesian tools grow to the point where they're infeasible to use? If I have to check each message against a word list that's grown to 10MB (mostly with nonsense words like "ugumaquatii" and "skjfghak"), you can see how things could start to choke...

Any thoughts?

Take them out (0)

Anonymous Coward | more than 10 years ago | (#7969356)

Spammers are a global nuisance causing tens of billions (or more?) of dollars of wasted time/energy to carry/store/delete their crap. Rather than blow away folks in Iraq, why not spend 2% of that money tracking down and assassinating the cretins behind this global scourge?

Just take the f*ckers out. No trial. No jury. No more patience. Just end it.

screwing themselves... (1)

mercuryresearch (680293) | more than 10 years ago | (#7969357)

This is what I love about bayesian filtering.

Because it adapts, each new technique the spammers try ends up diluting the effect and ruining it for all spammers. And because they're greedy and will sell each other out without hesitation, it's basically using their own motivations against themselves.

Might as well put in a plug for my favorite bayesian filter: ASSP [sourceforge.net]

Damn (1)

lnX.Kid (587157) | more than 10 years ago | (#7969362)

Now how am I supposed to enlarge my p3n15?

Another useless spammer tool (1)

EvilStein (414640) | more than 10 years ago | (#7969366)

..and it doesn't work. I get entire poems and even got half of "The Wizard of Oz" in a spam one time.

SpamAssassin (up to date, with a few addons) catches every single one of them.

The only spam that has gotten through in the past 2 weeks was a spam where the spammer forgot to include the actual spam *content* - it was a blank email.

Theory? (0)

Anonymous Coward | more than 10 years ago | (#7969373)

I have a baseless theory that the sole purpose of spam is to sell lists to other spammers, who sell lists to other spammers etc. There is no product behind them any more: it is like pyramid marketing.

There is a historical precedent (according to an old copy of OMNI) for this: a company that sold nasal hair clippers by mail in the seventies made the bulk of its money by selling mailing lists of the nasally clipped demographic: the (albeit extant) product was just to assemble the mailing list.

Different Techniques (5, Interesting)

kalidasa (577403) | more than 10 years ago | (#7969381)

The article doesn't do a good enough job of explaining the different techniques in use.

First, hash busters. Yes, spammers are loading a random jumble of meaningful words in meaningless sequences into their spam, usually in the plaintext message body of a message with HTML content (i.e., you get hash buster - html message with spam content - hash buster). So HTML-aware clients (the main clients targeted I'm sure are AOL and Outlook Express) show the spam message, but not the hash buster. I'm guessing that this is specifically targeting bayesian filtering tools at AOL (anyone know if AOL is using a bayesian filter?); it works by introducing words that would not be found in a spam corpus in greater numbers than those that would.
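To make the mechanics concrete, here is a small Python sketch of what an HTML-aware client would end up displaying: if a message carries an HTML part, only that part's text is shown, and the plaintext hash buster is never seen by the reader. It uses the standard library `email` and `html.parser` modules; the function name `visible_text` is made up for this example, and real clients handle many more cases (nested parts, attachments, unknown charsets).

```python
from email import message_from_string
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect the text nodes of an HTML document, ignoring the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def visible_text(raw_message):
    """Return roughly the text an HTML-aware client would display."""
    msg = message_from_string(raw_message)
    html_parts, plain_parts = [], []
    for part in msg.walk():
        if part.is_multipart():
            continue
        payload = part.get_payload(decode=True) or b""
        charset = part.get_content_charset() or "utf-8"
        text = payload.decode(charset, errors="replace")
        if part.get_content_type() == "text/html":
            html_parts.append(text)
        elif part.get_content_type() == "text/plain":
            plain_parts.append(text)
    if html_parts:
        # The HTML part wins; plaintext alternatives are not displayed.
        extractor = _TextExtractor()
        for html in html_parts:
            extractor.feed(html)
        extractor.close()
        return " ".join(" ".join(extractor.chunks).split())
    return " ".join(" ".join(plain_parts).split())
```

A filter that scores the whole raw message sees the random words; the reader only sees what `visible_text` returns.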

Second, noisy spelling, like v1@gr@. Obviously this is also intended to defeat regex-based filters like spamassassin. If you vary your cliches enough, and you introduce very strange, but easy-for-a-human-reader-to-recognize spelling variants, you make it much more difficult for filter writers to write effective regexes.
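One way a filter writer might cope with noisy spelling is to normalize the text before matching, rather than writing a regex per variant. The Python sketch below is only an illustration: the substitution table `LOOKALIKES` is a made-up, very incomplete list, and it is not how spamassassin or any other named tool works.

```python
import re

# Hypothetical look-alike table; real spam uses many more variants.
LOOKALIKES = str.maketrans({
    "1": "i", "!": "i",
    "@": "a", "4": "a",
    "0": "o",
    "3": "e",
    "$": "s", "5": "s",
})


def normalize(text):
    """Lowercase, map look-alike characters to letters, drop in-word separators."""
    text = text.lower().translate(LOOKALIKES)
    # Remove punctuation wedged between letters, e.g. "v.i.a.g.r.a" -> "viagra".
    return re.sub(r"(?<=[a-z])[._*-](?=[a-z])", "", text)
```

The normalized copy should only be used for matching, since the mapping also mangles legitimate digits and punctuation (a "5" in a phone number becomes "s").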

The real problem will be deliberate poisoning (5, Interesting)

Jerf (17166) | more than 10 years ago | (#7969384)

The real problem will be when the spammers finally figure out how to deliberately poison the Bayesian filters. So far they're using more-or-less random words, but that won't really work against Bayesian; it can tolerate that.

However, what constitutes "non-spam" is not as unique as most people think, as I've examined here [jerf.org] . If they figure out how to deliberately put in hammy words, Bayesian will fall.

I feel OK posting this because I freely admit to this point I've overestimated them; I'm sure spammers have read that piece, and to date they have been too stupid to figure out what I said in plain English. But sooner or later one of them is going to figure it out.

There's a strong core of "ham" that is "ham" for everybody, and sooner or later they're going to start abusing that.

And if I may forestall one objection... "But you don't understand Bayesian, it's [awesome for some reason and can't be beat ever, by anybody]" - I'll listen when you've actually written a program to examine filters yourself, OK? I understand it pretty damn well. It'll take more than bald assertions to convince me I'm wrong; I've done actual research, in the original sense of the word.

/usr/share/dict/words (3, Interesting)

HeelToe (615905) | more than 10 years ago | (#7969386)

I thought about this after seeing my inbox spam increase to about 80 a day (the box that contains what is filtered is usually 10 per hour - my address has been valid for just short of 10 years).

Why not check the subject or first few lines of plain (not html) text and see if 80% of it is in /usr/share/dict/words? I thought about trying this out, but have been too busy to get off my ass and do it.
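For anyone who wants to try it, the check could look something like the Python sketch below. The function names and the five-line window are assumptions; the 80% threshold and the `/usr/share/dict/words` path come from the suggestion above, and the word file may live elsewhere or be missing on some systems.

```python
import re


def load_words(path="/usr/share/dict/words"):
    """Load the system word list into a lowercase set."""
    with open(path, encoding="utf-8", errors="ignore") as fh:
        return {line.strip().lower() for line in fh if line.strip()}


def dictionary_fraction(text, known_words, max_lines=5):
    """Fraction of alphabetic tokens in the first few lines found in known_words."""
    head = "\n".join(text.splitlines()[:max_lines])
    tokens = re.findall(r"[a-z]+", head.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for tok in tokens if tok in known_words)
    return hits / len(tokens)


def looks_like_language(text, known_words, threshold=0.8):
    """True when at least `threshold` of the checked tokens are dictionary words."""
    return dictionary_fraction(text, known_words) >= threshold
```

Typical use would be `looks_like_language(subject + "\n" + body, load_words())`. Note that this only catches random letter strings; spam padded with real dictionary words in a meaningless order would pass.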

Slimier than slime . . . (5, Interesting)

mjprobst (95305) | more than 10 years ago | (#7969391)

I saw one just yesterday that contained a list of important key sentences and phrases from the literature of common charities and political activism organizations.

In other words, if your Bayesian filter accepts those, based on your past decisions, it will accept the spam. If you reject the spam, you reject these communications as well.

Good filtering practice would dictate that one reads the junk box carefully enough to find both false positives and negatives. But the sheer bulk of mail that ends up in the junk box makes this unfeasible for many.

I have started letting these particular kinds of spam through, manually categorizing them (many words of random strings, dictionary vocabulary attack, positive phrase attack) in the hopes that filtering technology will soon advance to the point where these can be used as inputs to a more intelligent system.

Of course overhauling the mail system is a prerequisite to solving any of this long-term. For once I don't mind D. J. Bernstein's Internet Mail 2000 proposals. Of course there are other proposed systems, none of which has enough momentum to start a slow steady change. The end result of any non-consensus system will be to fragment the worldwide network of Email into competing, noncompatible systems that need to communicate through some kind of loophole or gateway. Back to FIDO-net days.

Look for it. (1, Insightful)

Anonymous Coward | more than 10 years ago | (#7969401)

It is not very often that people send random gibberish in e-mail. Why not look for the gibberish? Hell, even MS Word can detect gibberish; I think a spam filter could score a message on non-linguistic gibberish.

I see this too (5, Interesting)

rockwood (141675) | more than 10 years ago | (#7969418)

I've been using "SpamBayes Outlook Plugin" since a previous /. article talked about it.

Agreeing with this article, over the past week or two I have seen an excessive amount of spam being missed by SpamBayes. Even after marking the messages as spam to improve the filter, they continue to hit the inbox, whereas previously absolutely no spam made it to my inbox. Additionally, there may have only been 2 or 3 emails marked as possible spam when they were not, and zero items marked as definite spam that were not.

SpamBayes has worked great previously, but now even it is falling short.

I feel that as the spammers manipulate the contents/context of the spam, it will eventually become impossible to determine the difference without physically looking at 500+ emails daily.
My primary use of email is business and not personal, therefore I cannot risk missing a client email, payment, question, etc... I've also seen a progression of clients having MY emails deleted or caught in spam filters due to the business aspect and requests for payments. I feel this is primarily due to the comparison of too-often-common-phrases that a spam email and a business email contain. Such things as Click here to submit payment, or Buy these Products, Overdue etc... This happens even though the only clients I email are clients that contact me. I never cold-email anyone.

More spammers are using this random text as the only text in the subject and body, and using an image as the content of their email, which makes scanning even more complicated, if not impossible.

Being on the net prior to what it is today (going on 20 years), I often wonder how much control the spam actually has over the net in several aspects:

  • If spam were to disappear, would overhead costs decrease enough for ISPs to pass along greater savings to the consumer?
  • If Spam were to disappear completely, how much faster would the Internet be?
Has anyone ever done a study to determine how much effect spam has on degrading the net, and what would it be like if all spam was gone tomorrow?

My spam had Linux gibberish in it. (1)

Slayk (691976) | more than 10 years ago | (#7969423)

Needless to say I was mildly amused. P Hilt0n Vid

Visit site (topright lin!
EExceppt for specific coompaatiibilittyy mmodes (chhainn-loading and the Linuxx piggybbaack foormat), all kkerrnels willll be staartted in mmuchh tthe samee statte as inn the MMultibooot Specciifficattion.. Onlly kerrnels loaded at 11 meggaabbyyte or aabove are ppresentlyy supported. Anny attemppt tto load beeloww thaat bounddaryy will simmplly result in immeediaate failuree andd aan erroor messagge reportinng the problemm. .

The next attempt (2, Insightful)

eschasi (252157) | more than 10 years ago | (#7969432)

As the article points out, the technique isn't as effective as one might initially think. However, there's a clear "next generation" method that I'm sure we'll soon be seeing:

Insert four or five lines of valid extra text -- lines from books, selections from recent USENET postings, etc, etc -- into the spam. Make the selection semi-random. Now do it 100 times and send 100 copies to each person on the mailing list.

One of them will get through. And the spammers will continue to work.

My friends have been accusing me of this for years (1)

ewg (158266) | more than 10 years ago | (#7969438)

My friends have been accusing me of emailing them randomly generated streams of dictionary words for years...

Top Ten Myths About The War In Iraq (-1)

Qwaz (250711) | more than 10 years ago | (#7969440)

1. Iraqis Will (or Won't) Fight- Iraq has a dismal record in the warmaking department. However, the Iraqi secret police have an excellent record at inflicting violence on unarmed (or lightly armed) Iraqis. Several hundred thousand Iraqis work for Saddam performing this necessary (for keeping Saddam in power) function and these are the people who are fighting now. Not a lot of them, but all of them know that if Saddam falls, their jobs, and perhaps their lives, disappear. Note that the Iraqi army has largely avoided the coalition military units moving through Iraq. And the number of armed thugs willing to shoot it out with coalition troops is quite small.

2. The Republican Guard is a tough, well trained, combat organization. No, the Republican Guard selects people, especially officers, primarily for their loyalty to Saddam. They get paid a lot better than the regular army, have better equipment, barracks and rations. But as fighters, they are nothing special. During the Gulf War, the Republican Guard did stand and fight, and were blown away by American combat units, even when they outnumbered the Americans five (or more) to one.

3. The United States made a big mistake by not overthrowing Saddam in 1991. We had promised our Arab allies in 1990 that we would expel the Iraqis from Kuwait and would not invade Iraq. The Arabs said they could handle Saddam. They couldn't, but don't want to admit it. The U.S. waited twelve years, and then stopped waiting.

4. The United States armed Saddam. This one grew over time, but when Iraq was on its weapons spending spree from 1972 (when its oil revenue quadrupled) to 1990, the purchases were quite public and listed over $40 billion worth of arms sales. Russia was the largest supplier, with $25 billion. The US was the smallest, with $200,000. A similar myth, that the U.S. provided Iraq with chemical and biological weapons is equally off base. Iraq requested Anthrax samples from the US government, as do nations the world over, for the purpose of developing animal and human vaccines for local versions of Anthrax. Nerve gas doesn't require technical help, it's a variant of common insecticides. European nations sold Iraq the equipment to make poison gas.

5. The United States is doing it for oil, as in seizing Iraq's oil and assuring cheap oil for the United States. When Gulf nations nationalized American oil companies operating in their territory over the last half century, the U.S. did nothing. Assuming that after the U.S. liberates Iraq it is going to turn around and steal all the oil is pure conspiracy theory, with no basis in fact or history whatsoever.

6. The world opposes the U.S. invasion of Iraq, so the world must be right. The rest of the world is different. One difference is that the rest of the world is more risk averse. They would rather tolerate Saddam and the threat he represents than take risks to eliminate his murderous tyranny. Moreover, many people in the rest of the world consider it more important (and a lot safer) to feel right than to do right. That's why everyone tolerates murderous situations in Congo, Sudan, Rwanda and North Korea.

7. The U.S. created Saddam. Arab nationalism created Saddam. He neither asked, needed nor got any help from the United States as he rose to power in the Baath party. When he took over in 1979, he promptly went to war with Iran a year later. Even before that, public opinion, and public policy, regarding Saddam (the bloody minded head of the secret police) was negative. You can go read it in the contemporary papers. Despite most Americans feeling OK about Iran getting hammered by Iraq (because Iran had held our embassy staff hostage for over a year), there was no move to provide Iraq with weapons. When the Iraqis looked like they might fold, and Iran's then fearsome Islamic Jihad (against less observant Moslems, and mostly against America, the Great Satan) might spread, the U.S. provided Iraq with satellite photos of Iranian military positions. After that war ended in a draw in 1988, the U.S. believed Saddam's pronouncements that he had seen the light and would rein in his aggressive impulses.

8. The U.S. strategy for invading Iraq is a colossal failure. Hard to say, as it's less than a week since the war began and the strategy is decapitation (eliminating Saddam), not fighting thousands of Saddam's thugs before getting to the Big Guy himself. Come back in a few weeks and the truth will be revealed.

9. It will cost the U.S. billions of dollars to rebuild Iraq. Last time anyone looked, Iraq was sitting on several trillion dollars worth of oil and, as such, can easily obtain loans to pay for its own reconstruction.

10. The UN embargo hurt the Iraqi people more than Saddam. The Kurds in northern Iraq, getting the same per capita share of the oil for food money as the rest of Iraq (controlled by Saddam), have done dramatically better than any Iraqi ruled by Saddam. That may be because the Kurds are not building palaces, new missiles, bunkers and military bases. Nor did the Kurds have a large army, or a secret police organization. Iraqis and Kurds know who was sticking it to most of the (anti-Saddam) Iraqi population. Just ask them, as reporters often have, and they will tell you (unless one of Saddam's thugs is nearby.)

Yahoo seems to have worked it out pretty fast (1)

stewball (83006) | more than 10 years ago | (#7969445)

I use a yahoo email address for newsletters, registration, etc. I got maybe 5 of the nonsense word spams a couple weeks ago, marked them as spam, and every one of them's gone into my bulk folder since then.

Of course, Yahoo's false positive rate on newsletters is atrocious, but it's easy enough to pick those out and then empty the bulk folder.

Just curious, anybody know what Yahoo's using for spam filtration?
-----