Updated email protocol

Spudley (171066) writes | more than 6 years ago

There are so many people and products trying to beat spam. Whitelisting, blacklisting, greylisting... the trouble is, they all miss the point -- the spammers can send as many emails as they like, and only have to get a tiny proportion of them through the filters to make money.

The big problem with spam is that it's untraceable. SMTP doesn't provide any way to prove where an email came from. It does have headers, of course, but they can easily be forged, and often don't provide a full trace anyway.

What we need is an enhancement to SMTP (or complete replacement) that fixes this problem. It needs to work with existing email clients on people's desktops, otherwise no-one will use it, but at the server level it needs to be able to prove where an email has come from before sending it on.

So how do you prove where an email has come from?

In the email headers, there is a trace of the route via which an email has travelled to get to its current destination. Each time a new server forwards it on, it will add an additional header line.

To prove where an email has come from, a receiving mail server should query the servers back down the chain: "Did you send/forward a message with this particular ID on this specific date?"

If any server in the chain says "No, I didn't send that message", then you can be suspicious of the email. If more than one says "No", then the message headers are almost certainly forged.

If all the servers say "Yes, we recognise that message", then you have a verified path back to the original sender (or at least to their ISP, which is the next best thing).
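The check described above can be sketched roughly as follows. This is only an illustration, not a real protocol: the `MailServer` class, the `did_you_send` query and the three verdict strings are hypothetical names made up for this example, and each server is simulated as an in-memory record of what it relayed rather than something queried over the network.

```python
from dataclasses import dataclass, field


@dataclass
class MailServer:
    """Hypothetical relay that remembers which messages it handled."""
    name: str
    handled: set = field(default_factory=set)  # (message_id, date) pairs

    def relay(self, message_id, date):
        # Record the message so it can be confirmed later.
        self.handled.add((message_id, date))

    def did_you_send(self, message_id, date):
        # "Did you send/forward a message with this ID on this date?"
        return (message_id, date) in self.handled


def verify_chain(trace, servers, message_id, date):
    """Query every server named in the trace headers.

    A server that is unknown or cannot be reached is counted as a "No"
    (an assumption made for this sketch).
    """
    denials = 0
    for name in trace:
        server = servers.get(name)
        if server is None or not server.did_you_send(message_id, date):
            denials += 1
    if denials == 0:
        return "verified"
    if denials == 1:
        return "suspicious"
    return "probably forged"
```

A receiving server would call `verify_chain` with the server names taken from the trace headers and then write the verdict into a header of its own for the spam filter to use.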

We can also ask the sending ISP whether the email address given as the sender is one which they recognise as being owned by the sender. (ISPs would generally know your email address with them; if you use other addresses, you as the address owner would need to tell your ISP about them so they could verify them with this system when asked.)

This gives us a fairly crude way of weighting messages. We haven't actually blocked any spam yet, but we can add the verification details to the headers, which will give the recipient's spam filtering software something extra to work with.

Now, because we can be certain of the originating ISP of a verified email, and we know whether the sending email address is valid, we can make a fairly good guess as to whether it's a legitimate email or spam before we even look at the content of the message itself.

We can then do our normal spam filtering to remove the junk. But because we have verified the sender, it allows us to bounce the junk emails back to them, without worrying about the bounces ending up with the wrong person. This means that spammers would no longer be able to send millions of unsolicited emails without getting a good proportion of them back. Okay, granted, they may not bother even checking their mailboxes, but any sensible ISP would quickly disown a user that generated that kind of traffic, and then left it unclaimed in their mailbox. People with bot-infected machines who didn't realise they were sending out junk would start seeing bounces and hopefully do something about it.

The next stage is to create a trust level for verified email. If a particular server is sending out a lot of spam, it will be recognised as such, and because their messages have been verified, we know that particular server is the culprit (or is the ISP for the culprit, which is the next best thing). This allows us to determine the trustworthiness of any given mail server, which can also be added to the headers to aid filtering.

In the first instance, if a server becomes untrusted because it has been verified as sending out a lot of spam, the administrators can be notified. If the problem persists, we can take stronger measures as we feel appropriate.
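One way such a trust level might be tracked is sketched below. The `ServerTrust` class, the score formula (the fraction of verified messages that were not spam) and the two thresholds are all assumptions made for illustration; the proposal itself does not specify how trust would be computed.

```python
class ServerTrust:
    """Hypothetical per-server trust tracker built from verified mail."""

    def __init__(self, notify_below=0.5, min_messages=20):
        self.notify_below = notify_below  # assumed trust threshold
        self.min_messages = min_messages  # assumed minimum sample size
        self.total = {}  # server name -> verified messages seen
        self.spam = {}   # server name -> verified messages judged spam

    def record(self, server, is_spam):
        # Only call this for messages whose path has been verified.
        self.total[server] = self.total.get(server, 0) + 1
        if is_spam:
            self.spam[server] = self.spam.get(server, 0) + 1

    def trust(self, server):
        # Returns None when nothing is known about the server yet.
        total = self.total.get(server, 0)
        if total == 0:
            return None
        return 1.0 - self.spam.get(server, 0) / total

    def should_notify_admin(self, server):
        # First step: tell the administrators once there is enough evidence.
        score = self.trust(server)
        return (
            score is not None
            and self.total[server] >= self.min_messages
            and score < self.notify_below
        )
```

The value returned by `trust` is the sort of figure that could be added to the headers to aid filtering, with stronger measures left to whoever runs the receiving server.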

Of course, this would have to co-exist with the existing untrusted email network, which means that you wouldn't be able to verify every email (in fact, to begin with, you wouldn't be able to verify many at all), and spammers would naturally try to avoid using the verified servers to avoid the possibility of being tracked down. But as more and more servers adopt the system, our spam filters will increasingly see the lack of verification as being a bad thing, and so what spam does come from them will become easier to filter.

Finally, being able to verify for certain that an email has been sent by a particular user at a particular ISP would be a major bonus for businesses -- it would improve security of online transactions, and make fraudulent activities like domain hijacking much harder.

So to summarise, the aim of the exercise here is to be able to verify where an email has come from by checking back with the sending server. This allows us to detect email with forged headers and to respond to spam in a way that punishes the sender, rather than the recipient or any innocent third parties.

It would add to the amount of network traffic required for an email to get from sender to recipient, but not by an intolerable amount, especially if that is offset by a reduction in spam as a result.

I suspect I'm not the first to have this kind of idea, but if not, why isn't it already in use? I'd be interested to hear anyone else's thoughts. But if you are going to pick holes in the argument, please do try to come up with a way to fix it as well. ;-)

Thank you for reading.

1 comment

been there, working on that (1)

esj at harvee (7456) | more than 6 years ago | (#21713926)

There is nothing fundamentally wrong with your suggestions that a couple of days with a whiteboard and a seminar on individual rights with regard to the Internet wouldn't fix. What you are talking about is discovering and utilizing message source reputation. I've been thinking about this problem for a good number of years. I stumbled across a solution as a side effect of my dogged determination to use proof of work systems as a way of consuming spammer resources as well as identifying which messages should be passed directly through to your inbox. Until recently, one property of proof of work systems that I hadn't focused on was that a proof of work system can be used as a proxy for reputation, i.e. if you solve a large enough puzzle, you must be legitimate. Unfortunately, that test alone is not sufficient, as it would only take 3 million zombies to generate enough tokens to deliver 1/10 of today's spam volume. Paradoxically, proof of work tokens, even at this level, may actually increase spam visibility, which increases revenue. However, if you use the results of an anti-spam filter (e-mail addressed to nonexistent addresses, content filter, blacklists, message destinations) to record the reputation of an IP address, you now have something you can use to enhance the communication ability of good sites and degrade the communication ability of bad sites.
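For readers who haven't met proof of work tokens, here is a minimal hashcash-style sketch. The comment above does not say which puzzle is used, so the SHA-256 puzzle, the `resource:nonce` string format and the function names are assumptions chosen purely for illustration.

```python
import hashlib
from itertools import count


def _has_leading_zero_bits(data: bytes, bits: int) -> bool:
    # True if the SHA-256 digest of data starts with `bits` zero bits.
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest, "big") >> (256 - bits) == 0


def mint_token(resource: str, bits: int) -> int:
    """Search for the first nonce that solves the puzzle (the costly part)."""
    for nonce in count():
        if _has_leading_zero_bits(f"{resource}:{nonce}".encode(), bits):
            return nonce


def check_token(resource: str, nonce: int, bits: int) -> bool:
    """Verify a token with a single hash (the cheap part)."""
    return _has_leading_zero_bits(f"{resource}:{nonce}".encode(), bits)
```

Minting costs the sender about 2**bits hash attempts on average while checking costs the receiver one, so asking for more bits is a way of demanding a larger puzzle from a less trusted source.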

Improving the quality of communications between legitimate sites is important for many reasons. Whenever one looks at an antispam solution, one should take a step back and look at what was lost. E-mail, despite all protestations in RFCs to the contrary, was pretty damn reliable. But we've sacrificed that reliability through content filter false positives, blacklisting with no hope of parole, and the threat of centralized identity systems. Focusing instead on improving reliability by reputation leads one to a system that provides a variable level of filter strength, thereby enabling easier delivery of messages from good sources and more difficult delivery of messages from bad ones. This behavior emerges from reputation because you now have a signal which can trigger different filter configurations and message delivery guidelines. For example, if you have a very bad reputation, demanding a very large proof of work token and passing the content filter is a reasonable prerequisite for delivery. But even if your message source satisfies that requirement, you do not want to deliver the message to the inbox. You want to deliver it to a staging area so it can be examined before delivery or discarded without examination. On the other hand, if you have a very good reputation, a proof of work token guarantees delivery to your inbox. One can also go through the content filter alone, but the delivery destinations would only be the inbox or a staging (spam trap) area. Messages would never be discarded. By changing filter configuration and delivery paths, you can restore e-mail to its former level of reliability and trustworthiness. Best of all, this technique accomplishes this without any centralized services. Reputation changes are based on experience. It provides benefits to the first user and requires no significant change in infrastructure. Just a pre-filter in front of your e-mail server.
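The two cases described above could be written down as a small routing function. This is a sketch under stated assumptions: the `Reputation` enum, the token sizes of 24 and 12 bits, and the choice to discard mail from a very bad source that fails its prerequisites are all hypothetical, and reputations between the two extremes are left out because the comment does not describe them.

```python
from enum import Enum


class Reputation(Enum):
    VERY_BAD = "very bad"
    VERY_GOOD = "very good"


# Assumed token sizes in bits; the comment gives no concrete figures.
LARGE_TOKEN_BITS = 24
NORMAL_TOKEN_BITS = 12


def route(reputation, token_bits, passes_content_filter):
    """Return "inbox", "staging" or "discard" for one incoming message.

    token_bits is the size of the proof of work token presented with the
    message, or 0 if there was none.
    """
    if reputation is Reputation.VERY_BAD:
        # Needs a very large token AND a clean content filter result,
        # and even then only reaches the staging area.
        if token_bits >= LARGE_TOKEN_BITS and passes_content_filter:
            return "staging"
        return "discard"  # assumption: the comment leaves this case open
    # Very good reputation: a token guarantees the inbox.
    if token_bits >= NORMAL_TOKEN_BITS:
        return "inbox"
    # Content filter alone: inbox or staging, never discarded.
    return "inbox" if passes_content_filter else "staging"
```

For instance, `route(Reputation.VERY_GOOD, 0, False)` gives `"staging"` rather than `"discard"`, which is the "messages would never be discarded" property for good sources.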

If you want to learn more about this project, feel free to contact me. Just add a .org to the end of my username. Oh yes, don't forget to change ' at ' to @ :-)