Tuesday, July 6, 2010

Spam e-mail processing

Sending and receiving e-mail has become less reliable and more complicated over time as the volume of unsolicited bulk or commercial e-mail (UBE/UCE, aka spam e-mail) has increased. According to several published reports, spam now accounts for around 90% of all e-mail.

At the Server
Corporations and Internet Service Providers (ISPs) have been deploying increasingly aggressive spam filtering to combat this onslaught of junk and malicious e-mail. Today there are as many as five possible dispositions for e-mail arriving at a server owned by the service provider:
  1. Discard the e-mail when identified as blatant spam
  2. Place the e-mail in a separate spam folder in the user's e-mail account
  3. Tag the e-mail as possible spam and place it in the user's e-mail inbox
  4. Remove detected malicious software content from the e-mail and place the “disinfected e-mail” in the user's inbox, with notification of the action taken
  5. Place the unaltered e-mail in the user's e-mail inbox
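
As a rough illustration of the five dispositions above, here is a minimal Python sketch. The numeric spam score, the three threshold values, and the order in which the checks run are all my own assumptions for the example; real servers use their own scoring and rules.

```python
from enum import Enum


class Disposition(Enum):
    DISCARD = 1          # blatant spam, dropped outright
    SPAM_FOLDER = 2      # placed in the user's separate spam folder
    TAG_AND_DELIVER = 3  # tagged as possible spam, placed in the inbox
    DISINFECT = 4        # malicious content removed, delivered with a notice
    DELIVER = 5          # placed in the inbox unaltered


def choose_disposition(spam_score, has_malware,
                       discard_at=0.95, folder_at=0.80, tag_at=0.50):
    """Map a hypothetical spam score in [0, 1] to one of the five dispositions.

    The thresholds and the ordering of the checks are illustrative
    assumptions, not any provider's actual policy.
    """
    if spam_score >= discard_at:
        return Disposition.DISCARD
    if spam_score >= folder_at:
        return Disposition.SPAM_FOLDER
    if spam_score >= tag_at:
        return Disposition.TAG_AND_DELIVER
    if has_malware:
        return Disposition.DISINFECT
    return Disposition.DELIVER


if __name__ == "__main__":
    print(choose_disposition(0.99, False).name)
    print(choose_disposition(0.10, True).name)
```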
Most service providers today offer some degree of customization of the filtering that takes place at the server. This involves one or more of the following actions by the user, depending on what the service provider offers:
  1. Designate whether the service provider should place suspected spam in a spam folder.
  2. Mark e-mails that the server tagged as possible spam as not spam.
  3. Mark e-mails as spam that the server did not tag.
  4. Add a sender's e-mail address to the recipient's contact list on the server. Many service providers use a recipient's contact list as a “white list” of addresses whose e-mail should not be treated as spam.
  5. Select a level of spam filtering sensitivity, possibly with separate thresholds for immediate discard and for spam tagging of e-mails.
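
To show how these user settings might interact, here is a small Python sketch. The class name, the field names, the default thresholds, and the rule that a contact-list match overrides the spam score are hypothetical choices of mine, not a description of any particular provider.

```python
from dataclasses import dataclass, field


@dataclass
class UserSpamSettings:
    """Hypothetical per-user settings stored at the server."""
    use_spam_folder: bool = True      # option 1: use a spam folder or not
    discard_threshold: float = 0.95   # option 5: immediate-discard level
    tag_threshold: float = 0.50       # option 5: spam-tagging level
    contacts: set = field(default_factory=set)  # option 4: the "white list"

    def add_contact(self, address):
        # Store addresses in lower case so matching ignores case.
        self.contacts.add(address.lower())


def server_action(sender, spam_score, settings):
    """Return what the server does with one e-mail (illustrative only)."""
    if sender.lower() in settings.contacts:
        return "inbox"  # assumed: white-listed senders bypass filtering
    if spam_score >= settings.discard_threshold:
        return "discard"
    if spam_score >= settings.tag_threshold:
        return "spam folder" if settings.use_spam_folder else "inbox (tagged)"
    return "inbox"
```

For example, after `settings.add_contact("Friend@Example.com")`, a message from `friend@example.com` goes to the inbox in this sketch no matter how high its score is.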
At the User's Computer
On the user's computer, e-mail client software has gained features that help the user filter out the spam that remains after the first level of processing by the corporate or ISP e-mail server. With a browser-based e-mail client (i.e., a web-mail client), most of the features are those available at the server. A user can typically also create custom filters that handle e-mails according to criteria the user chooses.

In the case of dedicated e-mail client software running on the user's computer, such as Mozilla Thunderbird or Microsoft Outlook, customization of the built-in spam filtering can be accomplished through one or more of the following actions by the user:
  1. Designate what to do with e-mails automatically tagged by the client software or the server as possible spam. Options may include: discard, save in a spam folder, remove tag.
  2. Mark additional e-mails as spam that were not previously tagged as spam. This typically trains the client software to recognize possible spam, tailoring the spam recognition profile to the individual user.
  3. Un-mark e-mails tagged by the client software as possible spam, in the case where the user considers the e-mail to be not spam.
  4. Add a sender's e-mail address to the recipient's contact list in the client software. Many client programs use a recipient's contact list as a “white list” of addresses whose e-mail should not be treated as spam.
  5. Set up filters to handle e-mails according to tags, From: addresses, To: addresses, Subject line or body content, etc.
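
The filter rules in item 5 can be sketched with Python's standard `email` module. The rule format, the folder names, and the first-match-wins behaviour are assumptions for illustration; I also assume the server marks suspected spam with an `X-Spam-Flag: YES` header, which varies between providers.

```python
from email import message_from_string


def apply_rules(msg, rules, default="inbox"):
    """Return the action of the first rule that matches, else `default`.

    Each rule is a (header name, text to look for, action) tuple.
    Matching is a case-insensitive substring test on the header value.
    """
    for header, needle, action in rules:
        value = msg.get(header, "")
        if needle.lower() in value.lower():
            return action
    return default


# Hypothetical rules: server spam tag first, then a From: address filter.
RULES = [
    ("X-Spam-Flag", "yes", "spam folder"),
    ("From", "@shop.example", "Promotions"),
    ("Subject", "[project]", "Work"),
]

if __name__ == "__main__":
    raw = (
        "From: Newsletter <news@shop.example>\n"
        "To: me@example.org\n"
        "Subject: Weekly deals\n"
        "\n"
        "Body text.\n"
    )
    print(apply_rules(message_from_string(raw), RULES))
```

Because the first matching rule wins in this sketch, the order of the rules matters: putting the spam-tag rule first keeps tagged e-mail out of the other folders.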
Those who originate spam continue to adapt to the above measures, with the result that some spam gets through to the end recipient and some legitimate e-mail is blocked.

The above observations are based on my own experience with one Internet service provider, one corporate e-mail system, and at least six e-mail client programs. Comments and additional observations are welcome.