Preventing SPAM

This page explains how to use the built-in SPAM filters to reduce or even prevent SPAM. For more detailed information on what each option of the SPAM settings is for, please view the SPAM Filtering page.

What is SPAM?

SPAM is unsolicited mail which arrives in your Inbox. It often contains offensive content or crude promises, and it costs the Internet community millions every year. Only a few years ago, email marketing was seen as a low-cost and very effective marketing tool, but in reality it turned out to be something quite different. Today, SPAM is seen as one of the largest problems on the Internet, second only to viruses. Ability Mail Server offers several effective SPAM filters which, if used correctly, can almost completely eliminate SPAM. These filters are described in greater detail in the following sections.

White List

This feature allows you to ensure certain mails are always allowed through the SPAM Filtering service without hindrance. This is useful when certain mails are consistently falsely identified as SPAM. It is also useful if you simply want to ensure that you do not risk vital mails from important contacts being lost. White List can detect mails via the IP, domain, reverse IP lookup, sender address or recipient address.

Black List

This simple filter works in a similar way to the White List, but allows you to specify certain mails to always be classed as SPAM. This is useful if you consistently receive a certain SPAM mail from the same IP, domain or sender. Black List can detect mails via the IP, domain, reverse IP lookup, sender address or recipient address.
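The White List and Black List both come down to the same kind of lookup against the five criteria named above. The following is a minimal sketch of that lookup, not the product's actual implementation; the `Mail` class, the `matches_list` function and the field names are hypothetical, and the exact meaning of "domain" (here, the domain given by the sending client) is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Mail:
    ip: str           # IP of the sending client
    domain: str       # domain given by the sending client (assumed meaning)
    reverse_dns: str  # result of the reverse IP lookup
    sender: str       # sender address
    recipient: str    # recipient address


def matches_list(mail, entries):
    """Return True if any field of the mail appears in the list entries.

    `entries` maps a field name (ip, domain, reverse_dns, sender,
    recipient) to a set of listed lower-case values.
    """
    for field_name, listed in entries.items():
        if getattr(mail, field_name).lower() in listed:
            return True
    return False
```

For example, a Black List holding `{"sender": {"offers@example.net"}}` would match any mail with that sender address, regardless of the IP it arrived from.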

Tarpitting

Tarpitting provides a simple mechanism for detecting potentially abusive clients. Abusive behavior is often a trait of SPAM bots (software which attempts to bulk send SPAM automatically). Tarpitting works by keeping a count of how many failed RCPT TO commands an IP has sent; once this count crosses a certain threshold, the IP is blocked for a period of time. After that period has passed, the IP is allowed access again.
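The counting and blocking logic can be sketched as follows. This is an illustration only: the `Tarpit` class, its default threshold and block period, and the choice to reset the count when the block expires are all assumptions, not documented behavior of Ability Mail Server.

```python
import time


class Tarpit:
    """Counts failed RCPT TO commands per IP and blocks repeat offenders."""

    def __init__(self, threshold=5, block_seconds=600, clock=time.monotonic):
        self.threshold = threshold          # hypothetical default
        self.block_seconds = block_seconds  # hypothetical default
        self.clock = clock                  # injectable so it can be tested
        self.failures = {}       # ip -> number of failed RCPT TO commands
        self.blocked_until = {}  # ip -> time at which the block expires

    def is_blocked(self, ip):
        expires = self.blocked_until.get(ip)
        if expires is None:
            return False
        if self.clock() >= expires:
            # The block period has passed: allow the IP again and
            # start counting from zero (an assumption of this sketch).
            del self.blocked_until[ip]
            self.failures.pop(ip, None)
            return False
        return True

    def record_failed_rcpt(self, ip):
        count = self.failures.get(ip, 0) + 1
        self.failures[ip] = count
        if count >= self.threshold:
            self.blocked_until[ip] = self.clock() + self.block_seconds
```

A server would call `is_blocked` when a client connects and `record_failed_rcpt` each time a RCPT TO command is rejected.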

Sender Domain Check

This technique is designed to detect SPAM systems or viruses which attempt to fake the sender address. Often spammers will create a fake sender address for their mail which appears to be on your own domain. This filter can check for such attempts and block accordingly. The sender domain check is performed only when a sender address of a locally hosted domain is used. The first of the two checks verifies that the client connection is permitted to relay (effectively, that the client has authenticated), as it is usually only actual users sending outgoing mail who should have a sender address on a local domain. The second check is that the sender address is a real user email address, as spammers often invent a fake address for each SPAM mail (they do not actually know the full list of available users).
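The two checks can be sketched as a single function. The function name and parameters below are hypothetical, and the sketch assumes the server already knows its local domains, its user addresses, and whether the current connection may relay.

```python
def sender_domain_check(sender, local_users, local_domains, may_relay):
    """Return True if the mail passes the sender domain check.

    sender        -- address given in MAIL FROM
    local_users   -- set of real user addresses (lower case)
    local_domains -- set of locally hosted domains (lower case)
    may_relay     -- True if the client has relaying access
    """
    domain = sender.rpartition("@")[2].lower()
    if domain not in local_domains:
        # The check only applies to senders on a locally hosted domain.
        return True
    if not may_relay:
        # Check 1: a local sender address from a client without relay access.
        return False
    # Check 2: the sender must be a real user address.
    return sender.lower() in local_users
```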

Transaction Delays

SPAM bots are often impatient in how they handle SMTP transactions, and so have very short timeout periods. Real mail delivery systems which follow the SMTP protocol correctly should be designed to be more patient. This filter slows down the SMTP transaction to the point where some SPAM senders will give up, but real mail delivery systems will still continue and deliver mail successfully.

Grey List

Grey Listing is quite an effective filter which works, like Transaction Delays, by taking advantage of the behavior of SPAM bots. This filter initially refuses all incoming mail with a temporary error message, then after a set period allows mail through. Real mail delivery systems which follow the SMTP protocol correctly should be designed to retry the mail delivery. In comparison, SPAM bots will often try only once and then move on to the next mail (the try-and-forget method).
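A minimal sketch of the idea follows. The text above does not say how a retry is recognized; this sketch assumes the common approach of remembering the combination of sending IP, sender and recipient, and it uses the SMTP reply code 451 for the temporary refusal. The `GreyList` class and its default delay are hypothetical.

```python
import time


class GreyList:
    """Temporarily refuses mail until a set period has passed."""

    def __init__(self, delay_seconds=300, clock=time.monotonic):
        self.delay_seconds = delay_seconds  # hypothetical default
        self.clock = clock                  # injectable so it can be tested
        self.first_seen = {}  # (ip, sender, recipient) -> time first seen

    def check(self, ip, sender, recipient):
        """Return 451 (temporary failure, retry later) or 250 (accept)."""
        key = (ip, sender.lower(), recipient.lower())
        now = self.clock()
        # Remember the first attempt; later attempts reuse that time.
        first = self.first_seen.setdefault(key, now)
        if now - first < self.delay_seconds:
            return 451
        return 250
```

A sender that retries after the delay is accepted, while a bot that never retries simply never gets past the first 451 reply.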

SPAM Trap

A common trick of SPAM bots is to scan entire websites and retrieve email addresses, which means that any public email address will often be subject to vast volumes of SPAM. An easy way to combat this is to insert hidden fake addresses into your website HTML code and, if you receive mail at any of those addresses, block the sending IP. The SPAM Trap feature allows you to achieve this. Simply create a list of fake email addresses that are not real users on your mail server, embed them in your website HTML code in whichever way you see fit, then add them to the SPAM Trap list. The list of blocked IPs is maintained in the provided TXT file so you can manually add and remove IPs.
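The trap logic itself is small and can be sketched as below. The `SpamTrap` class is hypothetical and keeps the blocked IPs in memory; the real product stores them in the TXT file mentioned above.

```python
class SpamTrap:
    """Blocks any IP that sends mail to a trap address."""

    def __init__(self, trap_addresses):
        self.trap_addresses = {a.lower() for a in trap_addresses}
        self.blocked_ips = set()  # kept in memory in this sketch

    def on_recipient(self, ip, recipient):
        """Return True if the recipient is accepted, False if refused."""
        if ip in self.blocked_ips:
            return False
        if recipient.lower() in self.trap_addresses:
            # Only an address harvester could know this address.
            self.blocked_ips.add(ip)
            return False
        return True
```

Once an IP has hit a trap address, mail from it to real users is refused as well.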

Real-time Black Lists (RBLs)

Real-time black lists are the quickest and simplest of the SPAM filters to configure. These filters allow Ability Mail Server to query online databases of bad IP addresses to check whether the sending IP is listed. This method of SPAM detection ensures that known bad SMTP servers (such as open relays, which are often used for SPAM delivery) can be detected before the mail even arrives. Ability Mail Server contains many preset RBLs, but the software can be configured to access as many RBLs as required.
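RBLs are conventionally queried over DNS: the octets of the sending IPv4 address are reversed, the RBL's zone name is appended, and the resulting host name is looked up. If the name resolves, the IP is listed. The sketch below illustrates that convention; the zone `rbl.example.org` is a placeholder rather than a real list, and the function names are hypothetical.

```python
import ipaddress
import socket


def rbl_query_name(ip, zone):
    """Build the DNS name used to look up an IPv4 address in an RBL zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip, zone, resolve=socket.gethostbyname):
    """Return True if the RBL zone has an entry for the IP.

    `resolve` is injectable so the function can be tried without
    network access.
    """
    try:
        resolve(rbl_query_name(ip, zone))
    except OSError:
        # Name did not resolve: the IP is not listed.
        return False
    return True
```

For the address 192.0.2.99 and the placeholder zone, the name queried would be `99.2.0.192.rbl.example.org`.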

Bayesian Filter

Bayesian Filtering is probably the most effective SPAM filtering method available today. Although it can initially be complicated to set up, once up and running, the filter can usually detect over 99% of SPAM with a very low false positive rate (usually less than 0.1%). The filter works by learning how to distinguish SPAM from non-SPAM. It does this by examining the content of an incoming mail and generating a probability score between 1% and 99% (the higher the score, the more likely a mail is SPAM). At the heart of the Bayesian Filter is a database of tokens; these tokens are data objects which form the filter's knowledge.
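The exact formula used by Ability Mail Server is not described here. As an illustration only, the sketch below uses the widely known naive Bayes approach: each token gets a SPAM probability from how often it appeared in learnt SPAM and non-SPAM mails, and the token probabilities are combined into one score clamped to the 1%-99% range. The `BayesFilter` class, the tokenizer and the neutral value of 0.5 for unknown tokens are all assumptions.

```python
import math
import re
from collections import Counter


class BayesFilter:
    """A toy token database plus naive Bayes scoring."""

    def __init__(self):
        self.spam_tokens = Counter()  # token -> number of SPAM mails containing it
        self.ham_tokens = Counter()   # token -> number of non-SPAM mails containing it
        self.spam_mails = 0
        self.ham_mails = 0

    @staticmethod
    def tokenize(text):
        return set(re.findall(r"[a-z0-9$'-]+", text.lower()))

    def learn(self, text, is_spam):
        tokens = self.tokenize(text)
        if is_spam:
            self.spam_tokens.update(tokens)
            self.spam_mails += 1
        else:
            self.ham_tokens.update(tokens)
            self.ham_mails += 1

    def token_probability(self, token):
        spam_rate = self.spam_tokens[token] / max(self.spam_mails, 1)
        ham_rate = self.ham_tokens[token] / max(self.ham_mails, 1)
        if spam_rate + ham_rate == 0:
            return 0.5  # unknown token: treated as neutral (assumption)
        p = spam_rate / (spam_rate + ham_rate)
        return min(0.99, max(0.01, p))

    def score(self, text):
        """Return a SPAM probability between 0.01 and 0.99."""
        log_spam = log_ham = 0.0
        for token in self.tokenize(text):
            p = self.token_probability(token)
            log_spam += math.log(p)
            log_ham += math.log(1.0 - p)
        # Work in log space and bound the difference so exp() cannot overflow.
        diff = max(-50.0, min(50.0, log_ham - log_spam))
        combined = 1.0 / (1.0 + math.exp(diff))
        return min(0.99, max(0.01, combined))
```

After learning a few mails with `learn`, calling `score` on new text returns a high value when its tokens were mostly seen in SPAM and a low value when they were mostly seen in non-SPAM.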

Training

To start with, the Bayesian Filter is completely untrained and cannot yet be used. The reason for this is that each Bayesian Filter adapts to the type of SPAM that a particular mail server receives. This means that pre-training a filter is not practical, as it would only result in inaccurate detection. By default, the filter must learn from a minimum of 25 SPAM mails and 25 non-SPAM mails. However, the more mail the filter can learn from, the more accurate its detection will be. We recommend that you train the filter on at least 1000 of each. The filter can be taught in two ways: manually or automatically. In most circumstances automatic training is preferable, as it allows you and all of your users to take part in the training process. Please note that training is an ongoing process, as it is important that the filter learns about new types of SPAM. However, the automatic learning feature should allow the filter to manage itself with little input from the administrator.

Manual Training

This is the simplest method of teaching the filter and can be done either by selecting a user and account directory, or by providing a physical directory which contains some raw mail files. The easiest way to kick start the filter is to manually provide some SPAM and non-SPAM learning material. This could be done by accessing an account through either WebMail or IMAP4 and creating two folders called 'spam' and 'non-spam'. You can then move or copy mails into these folders, sorting each mail appropriately. Once you have enough mails, you can manually teach the filter. If you prefer this method of teaching to the automatic learning features, you could create a Content Filter Rule which sends a copy of any incoming mail to a specified account (preset rule 'Send Copy of Incoming Mail To'). The administrator could then use this account to manually sort the mails and then manually teach the filter.

Auto-Learn From Score

This method allows the filter to learn instantly from mails based on the resultant score. This is the simplest method of automatic learning, but it can only be used effectively once the filter has already been taught from a large number of mails. You set an upper score threshold and a lower score threshold. If a mail scores equal to or higher than the upper threshold, it is fed back into the filter and used for learning as SPAM. If the mail scores lower than the lower threshold, it is also fed back into the filter and used for learning as non-SPAM. With this approach, the filter is completely self-learning and requires no human intervention. However, this can also mean that a larger number of incorrectly identified mails may be used for learning.
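The threshold rule can be written out directly. In this sketch the function name and the default thresholds of 10% and 90% are hypothetical; the comparisons follow the description above (equal to or higher than the upper threshold, strictly lower than the lower threshold).

```python
def auto_learn_label(score, lower=0.10, upper=0.90):
    """Decide how a scored mail is fed back into the filter.

    Returns "spam", "non-spam", or None when the score falls between
    the thresholds and the mail is not used for learning.
    """
    if score >= upper:
        return "spam"
    if score < lower:
        return "non-spam"
    return None
```

Mails scoring between the two thresholds are the ones the filter is least sure about, so leaving them out reduces the risk of learning from a wrong identification.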

Auto-Learn From Users

This method is a little more complicated but is also the most useful learning method. It allows any of your users (or selected users) to take part in training the filter. It works by regularly examining the mails within the users' accounts, using any appropriate mail as learning material. You can completely control which account directories contain what type of mail, and also use the system to automatically delete SPAM mails after learning. The only drawback with this approach is that any of the users whose accounts are used for learning can poison the filter's knowledge by intentionally or unintentionally placing mails in the wrong folder. It is also important to note that mails store status information when they are used for learning. This prevents the same mail from accidentally being used multiple times. A mail that was previously learnt from the wrong folder can still be re-learnt as the opposite type by moving it to the correct folder.
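The status tracking described above can be sketched as follows. The `LearnTracker` class and its return values are hypothetical; in the real product the status is stored with each mail, whereas this sketch keeps it in a dictionary keyed by a mail identifier.

```python
class LearnTracker:
    """Remembers how each mail was learnt so it is not learnt twice."""

    def __init__(self):
        self.learnt_as = {}  # mail id -> "spam" or "non-spam"

    def action_for(self, mail_id, folder_type):
        """Return "learn", "relearn" or "skip" for a mail found in a folder.

        folder_type is "spam" or "non-spam", according to how the
        administrator has classified the folder.
        """
        previous = self.learnt_as.get(mail_id)
        if previous == folder_type:
            return "skip"  # already learnt as this type
        self.learnt_as[mail_id] = folder_type
        # A mail moved to the other folder is re-learnt as the opposite type.
        return "learn" if previous is None else "relearn"
```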

Re-training

It is possible that at some point in the future the filter may become severely poisoned and may begin to make more incorrect identifications. If this is the case you can reset the database and begin again. For this reason it is often a good idea to keep a record of the mails used for learning.

Sender Policy Framework (SPF)

Although SMTP provides a simple and fast method of mail delivery across networks and the Internet, spoofing who the mail is from is rather easy to do. This means that there is no guarantee that any mail you receive is actually from who it says it is from. This also makes it easier for SPAM mails and viruses to spread. Ability Mail Server helps combat this problem by supporting the new SPF system, which in simple terms means that domains can declare which SMTP servers are permitted for their outbound mail traffic. Should a domain publish SPF information, and a mail is received from that domain but from a non-permitted IP, the mail can be refused before it even enters Ability Mail Server.
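A domain publishes its SPF information as a DNS TXT record such as `v=spf1 ip4:192.0.2.0/24 -all`. The sketch below shows only the core idea of comparing the sending IP against the networks a record permits. It is heavily simplified: it assumes the record text has already been fetched, it handles only the `ip4` and `ip6` mechanisms, and the function names are hypothetical. A full SPF check also evaluates mechanisms such as `a`, `mx` and `include`, which are ignored here.

```python
import ipaddress


def permitted_networks(spf_record):
    """Extract the ip4/ip6 networks from an SPF record string."""
    networks = []
    terms = spf_record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return networks  # not an SPF record
    for term in terms[1:]:
        # Split on the first colon only, so IPv6 values stay intact.
        name, _, value = term.partition(":")
        if name.lower() in ("ip4", "ip6", "+ip4", "+ip6") and value:
            networks.append(ipaddress.ip_network(value, strict=False))
    return networks


def ip_permitted(client_ip, spf_record):
    """Return True if the sending IP falls inside a permitted network."""
    address = ipaddress.ip_address(client_ip)
    return any(
        address.version == net.version and address in net
        for net in permitted_networks(spf_record)
    )
```

With the example record above, a mail claiming to be from that domain but arriving from 203.0.113.1 would not be permitted and could be refused.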