Updated email protocol

Journal by Spudley on Saturday December 15, 2007 @08:03PM

There are so many people and products trying to beat spam. Whitelisting, blacklisting, greylisting... the trouble is, they all miss the point -- the spammers can send as many emails as they like, and only have to get a tiny proportion of them through the filters to make money.

The big problem with spam is that it's untraceable. SMTP doesn't provide any way to prove where an email came from. It does have headers, of course, but they can easily be forged, and often don't provide a full trace anyway.

What we need is an enhancement to SMTP (or complete replacement) that fixes this problem. It needs to work with existing email clients on people's desktops, otherwise no-one will use it, but at the server level it needs to be able to prove where an email has come from before sending it on.

So how do you prove where an email has come from?

In the email headers, there is a trace of the route via which an email has travelled to get to its current destination. Each time a new server forwards it on, it will add an additional header line.
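As a rough illustration of what that trace looks like to software, here is a minimal Python sketch that pulls the claimed route out of a message's `Received` headers using the standard `email` module. The sample message, the host names and the `claimed_route` helper are all made up for the example, and real `Received` lines are far messier than this simple "from X by Y" parsing assumes.

```python
from email import message_from_string

# Made-up sample message; newest Received header is at the top.
SAMPLE = (
    "Received: from relay.example.net by mx.example.org; Sat, 15 Dec 2007 20:03:10 +0000\n"
    "Received: from mail.example.com by relay.example.net; Sat, 15 Dec 2007 20:03:05 +0000\n"
    "Message-ID: <abc123@mail.example.com>\n"
    "From: alice@example.com\n"
    "To: bob@example.org\n"
    "Subject: hello\n"
    "\n"
    "Hi Bob.\n"
)


def _word_after(words, keyword):
    """Return the token following `keyword`, without a trailing ';', or None."""
    if keyword in words:
        i = words.index(keyword)
        if i + 1 < len(words):
            return words[i + 1].rstrip(";")
    return None


def claimed_route(raw_message):
    """Return (from_host, by_host) pairs, most recent hop first.

    These are only *claims*: nothing here checks that they are true.
    """
    msg = message_from_string(raw_message)
    hops = []
    for value in msg.get_all("Received", []):
        words = value.split()
        hops.append((_word_after(words, "from"), _word_after(words, "by")))
    return hops


for hop in claimed_route(SAMPLE):
    print(hop)
```

The list comes out newest hop first, because each server adds its line above the ones already there.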

To prove where an email has come from, a receiving mail server should query the servers back down the chain: "Did you send/forward a message with this particular ID on this specific date?"

If any server in the chain says "No, I didn't send that message", then you can be suspicious of the email. If more than one says "No", then the message headers are almost certainly forged.

If all the servers say "Yes, we recognise that message", then you have a verified path back to the original sender (or at least to their ISP, which is the next best thing).
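The decision rule above can be sketched in a few lines of Python. No such query exists in SMTP today, so `did_you_send` below is a purely hypothetical stand-in that looks things up in an in-memory table instead of talking to real servers; the host names, the `SERVER_LOGS` table and the result labels are assumptions for the sake of the example.

```python
from datetime import date

# Hypothetical stand-in for each server's log of messages it sent or forwarded.
SERVER_LOGS = {
    "mail.example.com": {("<abc123@mail.example.com>", date(2007, 12, 15))},
    "relay.example.net": {("<abc123@mail.example.com>", date(2007, 12, 15))},
    "other.example.org": set(),  # supports the query but never saw the message
}


def did_you_send(host, message_id, sent_on):
    """Simulated query: True or False, or None if the host can't answer at all."""
    log = SERVER_LOGS.get(host)
    if log is None:
        return None  # server has not adopted the scheme
    return (message_id, sent_on) in log


def verify_chain(hosts, message_id, sent_on):
    """Classify a message from the answers of every server in its claimed route."""
    answers = [did_you_send(h, message_id, sent_on) for h in hosts]
    denials = answers.count(False)
    if denials > 1:
        return "likely_forged"   # more than one "No"
    if denials == 1:
        return "suspicious"      # exactly one "No"
    if answers and all(a is True for a in answers):
        return "verified"        # every server said "Yes"
    return "unverified"          # at least one server could not be asked


msg_id = "<abc123@mail.example.com>"
day = date(2007, 12, 15)
print(verify_chain(["relay.example.net", "mail.example.com"], msg_id, day))
```

Note the fourth outcome: a server that doesn't understand the question is neither a "Yes" nor a "No", so the message is simply left unverified rather than treated as forged.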

We can also ask the sending ISP whether the address given as the sender is one they recognise as belonging to that sender. (ISPs would generally know your email address with them; if you use other addresses, you as the address owner would need to tell your ISP about them so they could verify them with this system when asked.)
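On the ISP's side, this could be as simple as a lookup table of the addresses each customer has registered. The sketch below is only an illustration: the `IspDirectory` class, its method names and the account identifiers are invented here, and it ignores the harder question of how the ISP confirms that a customer really owns an outside address.

```python
class IspDirectory:
    """Hypothetical record of which sender addresses each account may use."""

    def __init__(self):
        self._addresses = {}  # account id -> set of lower-cased addresses

    def register(self, account, address):
        """Record that `account` has told the ISP it owns `address`."""
        self._addresses.setdefault(account, set()).add(address.strip().lower())

    def owns(self, account, address):
        """Answer the receiving server's question: is this address theirs?"""
        return address.strip().lower() in self._addresses.get(account, set())


isp = IspDirectory()
isp.register("customer-42", "alice@isp.example")       # address issued by the ISP
isp.register("customer-42", "Alice@Vanity.example")    # extra address the user declared

print(isp.owns("customer-42", "alice@vanity.example"))   # True
print(isp.owns("customer-42", "ceo@bigbank.example"))    # False
```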

This gives us a fairly crude way of weighting messages. We haven't actually blocked any spam yet, but we can add the verification details to the headers, which will give the recipient's spam filtering software something extra to work with.
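Recording the result might look something like the following Python sketch. The header name `X-Path-Verification` and its `status=...; sender-known=...` format are invented for this example, not part of any standard. One detail worth noting is that the receiving server should throw away any copy of the header that arrived with the message, since a spammer could otherwise add a flattering one themselves.

```python
from email import message_from_string

HEADER = "X-Path-Verification"  # hypothetical header name


def annotate(raw_message, status, sender_known):
    """Return the message text with our own verification result attached."""
    msg = message_from_string(raw_message)
    # Remove any pre-existing copies: only the receiving server's verdict counts.
    del msg[HEADER]
    known = "yes" if sender_known else "no"
    msg[HEADER] = f"status={status}; sender-known={known}"
    return msg.as_string()


raw = (
    "From: alice@example.com\n"
    "X-Path-Verification: status=verified; sender-known=yes\n"  # forged by sender
    "Subject: hi\n"
    "\n"
    "body\n"
)
print(annotate(raw, "suspicious", False))
```

The recipient's spam filter can then read that one header instead of repeating the queries itself.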

Now, because we can be certain of the originating ISP of a verified email, and we know whether the sending address is valid, we can make a fairly good guess as to whether it's legitimate or spam before we even look at the content of the message itself.

We can then do our normal spam filtering to remove the junk. But because we have verified the sender, we can bounce the junk emails back to them without worrying about the bounces ending up with the wrong person. This means that spammers would no longer be able to send millions of unsolicited emails without getting a good proportion of them back. Okay, granted, they may not even bother checking their mailboxes, but any sensible ISP would quickly disown a user who generated that kind of traffic and then left it unclaimed in their mailbox. People with bot-infected machines who didn't realise they were sending out junk would start seeing bounces and hopefully do something about it.
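The important part of that rule is that bouncing is only safe when the path has been verified. A tiny Python sketch of the decision, with made-up status and action names (and assuming unverified junk is quietly dropped, which is one choice among several):

```python
def handle_message(is_spam, path_status):
    """Decide what to do with a message once the spam filter has had its say.

    `path_status` is assumed to be "verified" only when every server in the
    route confirmed the message; anything else means we can't trust the sender.
    """
    if not is_spam:
        return "deliver"
    if path_status == "verified":
        return "bounce"   # safe: the bounce goes to the real sender
    return "discard"      # a bounce might land on an innocent third party


print(handle_message(is_spam=True, path_status="verified"))    # bounce
print(handle_message(is_spam=True, path_status="unverified"))  # discard
print(handle_message(is_spam=False, path_status="unverified")) # deliver
```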

The next stage is to create a trust level for verified email. If a particular server is sending out a lot of spam, it will be recognised as such, and because their messages have been verified, we know that particular server is the culprit (or is the ISP for the culprit, which is the next best thing). This allows us to determine the trustworthiness of any given mail server, which can also be added to the headers to aid filtering.
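One very simple way to put a number on that is to keep a running count, per server, of verified messages and how many of them turned out to be spam. The Python sketch below does just that. The `TrustLedger` class, the "fraction of verified mail that was not spam" formula and the minimum-sample threshold are all assumptions made for illustration; a real scheme would need something more robust, such as decaying old data.

```python
from collections import defaultdict


class TrustLedger:
    """Hypothetical per-server tally of verified mail and verified spam."""

    def __init__(self, min_messages=20):
        self.min_messages = min_messages   # don't judge a server on too little mail
        self._total = defaultdict(int)
        self._spam = defaultdict(int)

    def record(self, server, is_spam):
        """Count one verified message from `server`."""
        self._total[server] += 1
        if is_spam:
            self._spam[server] += 1

    def trust(self, server):
        """Fraction of verified mail that was not spam, or None if unknown."""
        total = self._total.get(server, 0)
        if total < self.min_messages:
            return None
        return 1.0 - self._spam.get(server, 0) / total


ledger = TrustLedger(min_messages=4)
for verdict in (True, True, True, False):
    ledger.record("bulk.example.net", verdict)

print(ledger.trust("bulk.example.net"))   # 0.25
print(ledger.trust("new.example.org"))    # None
```

Only verified messages should be fed into the tally; otherwise a forged header could be used to drag down an innocent server's score.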

In the first instance, if a server becomes untrusted because it has been verified as sending out a lot of spam, the administrators can be notified. If the problem persists, we can take stronger measures as we feel appropriate.

Of course, this would have to co-exist with the existing untrusted email network, which means that you wouldn't be able to verify every email (in fact, to begin with, you wouldn't be able to verify many at all), and spammers would naturally steer clear of verified servers to avoid the possibility of being tracked down. But as more and more servers adopt the system, our spam filters will increasingly treat a lack of verification as a bad sign, and so the spam that comes via unverified servers will become easier to filter.

Finally, being able to verify for certain that an email has been sent by a particular user at a particular ISP would be a major bonus for businesses -- it would improve security of online transactions, and make fraudulent activities like domain hijacking much harder.

So to summarise, the aim of the exercise here is to be able to verify where an email has come from by checking back with the sending server. This allows us to detect email with forged headers and to respond to spam in a way that punishes the sender, rather than the recipient or any innocent third parties.

It would add to the amount of network traffic required for an email to get from sender to recipient, but not by an intolerable amount, especially if that is offset by a reduction in spam as a result.

I suspect I'm not the first to have this kind of idea, but if not, why isn't it already in use? I'd be interested to hear anyone else's thoughts. But if you are going to pick holes in the argument, please do try to come up with a way to fix it as well. ;-)

Thank you for reading.
