yet another idea for spam-handling.
chmarr
No, this isn't some mighty comeback to using my journal again, so you can go back to sleep now :) But I wanted to write down a little idea I had to reduce the amount of incoming spam, and to do it in a public place so it can't be patented later on by someone else. Of course, I don't guarantee that this isn't ALREADY patented.



This idea should fit into the existing SMTP protocol as an extension, just fine.

Goals:

- Reduce spam and viruses by requiring the SMTP client to do a little extra work.
- Reduce the need for sending 'bounce messages' which are likely to be undeliverable, or misdirected, because of spam and viruses.

Summary:

Server receives message from client. Instead of accepting or rejecting the message there, it says 'come back later to check'. Server can then do whatever checking it wants to. When the client contacts it again, server responds with its real answer. The server, if it wants to deliver the message, has the option of delivering or holding the message until the client re-contacts the server for the check.

Details:

Both the server and client are modified to handle a new addition to the SMTP protocol. The SMTP protocol has a way of reporting and accepting additional facilities, so the below would be compatible with existing SMTP servers.

Client contacts the server to deliver a message, discovers that the SMTP server has the addition, and activates it. The server could be configured to FORCE the new facility, rejecting any incoming message where the client does not activate the new feature.
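The discovery step could look something like the sketch below, which scans the server's EHLO reply for an extension keyword. The keyword `DELV` as an advertised name and the helper names are assumptions made for illustration only; the post does not say what the extension would be called.

```python
# Hypothetical keyword the server would list in its EHLO reply.
EXTENSION_KEYWORD = "DELV"


def advertised_extensions(ehlo_reply):
    """Return the set of extension keywords found in a multi-line EHLO reply."""
    keywords = set()
    lines = ehlo_reply.splitlines()
    # The first line carries the server's domain/greeting, not an extension.
    for line in lines[1:]:
        if not line.startswith("250") or len(line) < 5:
            continue
        rest = line[4:].strip()  # skip "250-" or "250 "
        if rest:
            keywords.add(rest.split()[0].upper())
    return keywords


def supports_deferred_results(ehlo_reply):
    """True if the (hypothetical) extension keyword is advertised."""
    return EXTENSION_KEYWORD in advertised_extensions(ehlo_reply)
```

A client that gets `False` here would just fall back to plain SMTP, unless the server is configured to force the new facility.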

Client sends the message in the normal way, except the server responds with 'message received, come back in N minutes to retrieve real delivery results'. N can be any number, but should be high enough to cover any processing the server needs to do. The server also issues an ID for the message, which the client should store in its local database. The ID code should be made up of a unique part, and a random part to act as a 'password' of sorts. E.g.:

213 5d3a6ff0-12345678 12 Received, come back in 12 minutes to check.
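As a rough sketch of how that reply might be produced and read back: the unique part below is a simple per-process counter and the random part comes from Python's `secrets` module. Both choices, the 8-hex-digit widths, and the function names are assumptions for illustration; only the overall line layout comes from the example above.

```python
import itertools
import re
import secrets

_counter = itertools.count(1)  # assumption: a per-process counter is "unique enough" here


def new_message_id():
    """Build an ID from a unique part and a random 'password' part."""
    unique_part = format(next(_counter), "08x")
    secret_part = secrets.token_hex(4)  # 8 hex digits that are hard to guess
    return f"{unique_part}-{secret_part}"


def format_deferred_reply(message_id, minutes):
    """Server side: build the 'come back later' line."""
    return f"213 {message_id} {minutes} Received, come back in {minutes} minutes to check."


_REPLY = re.compile(r"^213 (\S+) (\d+) ")


def parse_deferred_reply(line):
    """Client side: return (message_id, minutes), or None if it is some other reply."""
    match = _REPLY.match(line)
    if match is None:
        return None
    return match.group(1), int(match.group(2))
```

The client would store the returned ID next to its copy of the message and schedule the follow-up check for the stated number of minutes.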

Client either delivers more messages the same way, or goes away.

Server then processes the message however it would like, such as a spam or virus scan, or just holding onto the message for the fun of it. Server stores whatever 'result' it would want to issue to the client when (if?) it comes back. Server MAY deliver the message on if it is configured to do so.

Client comes back after N minutes. Client activates the feature, then asks if the message was delivered, giving the ID code returned by the server. E.g.:

DELV 5d3a6ff0-12345678

Server issues the final result code, which can be any of the normal SMTP result codes: accepting, rejecting, or any of the other options.

Server, if it still hasn't decided on the result, can reissue the 'come back later' code.

Server, if the client NEVER comes back, would purge the result (and the message, if it's still holding onto it) from the database. Perhaps 4 days would be enough.
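The server-side bookkeeping for the last few steps might be sketched like this: remember a result per ID, answer `DELV`, reissue 'come back later' while undecided, and purge after 4 days. The class and method names, the in-memory dict, and the `451` reply for an unknown ID are all assumptions for illustration; a real server would use persistent storage.

```python
import time

PENDING = "pending"
RETENTION_SECONDS = 4 * 24 * 60 * 60  # the "perhaps 4 days" retention


class ResultStore:
    """In-memory map from message ID to the result the client will be told."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._entries = {}  # message ID -> (created_at, result)

    def register(self, message_id):
        """Record a newly received message whose result is not decided yet."""
        self._entries[message_id] = (self._clock(), PENDING)

    def set_result(self, message_id, reply):
        """Store the final SMTP reply, e.g. '250 OK' or '550 Rejected as spam'."""
        created_at, _ = self._entries[message_id]
        self._entries[message_id] = (created_at, reply)

    def purge(self):
        """Forget entries older than the retention period."""
        cutoff = self._clock() - RETENTION_SECONDS
        self._entries = {k: v for k, v in self._entries.items() if v[0] >= cutoff}

    def handle_delv(self, message_id, retry_minutes=5):
        """Answer a 'DELV <id>' query from the client."""
        self.purge()
        entry = self._entries.get(message_id)
        if entry is None:
            # Assumption: a transient 4xx reply, so the client knows to re-send.
            return "451 Unknown message ID"
        _, result = entry
        if result == PENDING:
            return (f"213 {message_id} {retry_minutes} "
                    f"Still processing, come back in {retry_minutes} minutes.")
        return result
```

This sketch keeps a decided result until it expires rather than dropping it on first read, so a client that loses the connection mid-answer can simply ask again.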

Client would hold onto the message until it is sure the server has delivered it. If the server rejects, client can issue a bounce of its own. If the server claims it never saw the message, client can re-send.
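On the client side, the decision after a `DELV` check could be as simple as looking at the class of the reply code. The mapping below is a sketch under assumptions: it treats `213` as 'check again later' (as in the example reply above), other 2xx codes as delivered, 5xx as a rejection to bounce, and assumes the server uses a transient 4xx code when it does not know the message, so the client re-sends. The `Action` names are hypothetical.

```python
from enum import Enum


class Action(Enum):
    CHECK_LATER = "ask again after the stated delay"
    DROP_LOCAL_COPY = "delivered, so the held message can be deleted"
    RESEND = "send the message again"
    BOUNCE = "issue a bounce to the original sender"


DEFERRED_CODE = "213"  # the 'come back later' code from the example reply


def next_action(reply):
    """Map the server's reply line to what the client should do next."""
    code = reply[:3]
    if code == DEFERRED_CODE:  # must be tested before the generic 2xx case
        return Action.CHECK_LATER
    if code.startswith("2"):
        return Action.DROP_LOCAL_COPY
    if code.startswith("4"):
        return Action.RESEND
    if code.startswith("5"):
        return Action.BOUNCE
    raise ValueError(f"unexpected reply: {reply!r}")
```

Because the client still holds the message at the point of rejection, the bounce it generates can include the full original if so configured.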


Advantages:

Some spam/virus checking takes so long that it's not feasible to keep the client waiting for the result. The usual alternative is to accept the message first and check it afterwards, but then rejecting it involves sending a bounce message, just in case the sender was legitimate. Most spam/virus senders are NOT legitimate, and the bounce will either generate a bounce of its own, or go to the wrong person. This method allows the use of normal SMTP result codes, allows greater flexibility in what processing is done, and puts the onus on the client to come back later for the real result.

This effectively emulates 'greylisting', if the server is configured to hold a 'successful' message until the client comes back later.

Disadvantages:

This puts more requirements on the server to hold onto result codes for 4 days, and possibly even whole messages for 4 days. However, it is NOT necessary to hold onto messages that are going to be 'rejected'; since the client should ALSO be holding onto the message in case of that possibility, the client can issue a bounce containing the full message, if so configured.

This puts more requirements onto the client, but no more than greylisting would.




*desecrates said journal with his penis*

Disadvantages: E-mail takes a lot longer to deliver. Say goodbye to any rapid conversation that depends on e-mail as a component - no more flurries of e-mail exchanges, no more discussing stuff in one channel and transmitting files via e-mail for quick evaluation, no more LJ comment exchanges unless you obsessively check the pages...

Actually, the plan will not slow down email by any significant amount, unless the administrator takes the opportunity to implement slower spam-checking facilities, or implements the greylisting emulation.

Assume the case of a spam check taking 10 seconds to complete. With normal, current SMTP, the server receives the message, tells the client that it was accepted, runs it through the spam checks and then either bounces it, or delivers it to your email box. The delay is the same if the server instead holds onto the connection while the check is made.

With the new system, the server will receive the mail, tell the client to come back in 15 seconds (10 seconds for average processing, +5 'fudge factor'), and if the client DOES come back, will deliver the mail to your inbox right then. Extra delay for the new system, 5 seconds only for the fudge factor.

If the administrator decides to implement more detailed spam checks, then it's going to take longer, but that's the price.

Now... if the administrator decides to emulate greylisting with this system, then the first email that comes through with the sender/recipient pair is going to take 15 minutes, whether it's because of this system, or because of the stock-standard greylisting method. So... there's no extra delay because of the new system.

It would be perfectly feasible for the server to say "tell client to come back in 15 minutes if I've not seen the sender/recipient pair before, or 10 seconds if I have".
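A minimal sketch of that rule, assuming the server keeps a set of sender/recipient pairs whose client has already come back once. The class name, method names, and in-memory set are made up for illustration; the 15-minute and 10-second figures are the ones from the text.

```python
class DelayPolicy:
    """Pick the 'come back in N' delay based on whether the pair is known."""

    def __init__(self, first_contact_seconds=15 * 60, known_pair_seconds=10):
        self.first_contact_seconds = first_contact_seconds
        self.known_pair_seconds = known_pair_seconds
        self._known_pairs = set()

    def delay_for(self, sender, recipient):
        """Short delay for a pair seen before, long delay otherwise."""
        if (sender, recipient) in self._known_pairs:
            return self.known_pair_seconds
        return self.first_contact_seconds

    def mark_returned(self, sender, recipient):
        """Call once the client has come back for the result of this pair."""
        self._known_pairs.add((sender, recipient))
```

Marking the pair only when the client actually returns mirrors how greylisting trusts a sender after its first successful retry.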

In summary, no extra delay other than the 'fudge factor', and even that's optional if you don't mind the risk that the client will come back before processing has completed.


Our current mail server at work is often delayed by 10-20 mins, and delays of 3 hrs are not uncommon when we're being spam-bombed, so an overhead of a few minutes wouldn't make a great deal of difference. Immediate delivery is preferred, but email was never meant to be instantaneous.

[waves a claw] Hi, Ch'marr!

BTW, you're describing a variation of greylisting, an idea that's been kicked around for a couple of years and turns out to be a real handy way to block a lot of the zombie PCs from filling your mail spool. It also turns out that a few "big" name companies are too damn stupid to do email right and get blocked also, as well as sites that have deliberately chosen to turn off retries.

I use greylisting and take great joy in telling otherwise legit companies that don't have a damn clue how to run mail servers that we'll accept email from them just as soon as they purchase said clue.

On day one, the gods created SMTP. On day two, they added retries to deal with the fact the network connections were done by the lowest bidder. Turn off retries at your own risk ...

I already mentioned that it emulates greylisting, so, no point for you :)
