MUSes Spam Filtering

The Math department uses the CITES spam and virus filter to process all e-mail that comes in from outside the department. By default, the CITES spam filters will remove viruses, and will assign a "spam score" to all remaining e-mail. When the e-mail comes in to our servers, we rewrite the subject line of possible spam e-mail messages to provide a visual indication of the score assigned by CITES. However, the tagged spam is still delivered to your inbox, and you have to delete the e-mail manually.

The rest of this document describes ways to remove some of these messages from your inbox automatically, so you don't have to deal with them by hand. If you are happy with the simple rewriting of the subject line, then great! You can safely ignore the rest of this web page. For those who would like a little more automation, there are several alternatives for automatically dealing with e-mail tagged as spam by CITES.

Recommended - CITES Spam Control

The recommended method is to use the facilities that CITES provides for handling spam. CITES offers a number of different options through a relatively easy-to-use interface, and this approach gives the best balance between ease of use and flexibility.

Requirements

There are two requirements to use this option. First, you must have a UIUC NetID. Anyone who is a student, employee, or emeritus of the university meets this requirement. Second, any mail sent to either your Math e-mail (YourMathUsername@math.uiuc.edu) or your UIUC e-mail (YourNetID@uiuc.edu) must end up in the same account via forwarding. Most of our users with a UIUC NetID either have their university e-mail forwarded to Math, or their Math e-mail forwarded to CITES Express e-mail. Either combination will satisfy this requirement. If you are not sure whether you satisfy this requirement, then chances are that you do. Over 90 percent of our users meet these two requirements.

Getting Started

Click here to get started with CITES Spam Control.

To get started with CITES Spam Control, you need to activate it through CITES. An excellent overview of their spam control system is available here. Many users start out with the "Cautious" setting, and then move on to the "Aggressive" setting when they feel comfortable that spam is being filtered as they want. Note that CITES Spam Control is operated and supported by CITES. Any questions regarding usage of the tools for filtering spam, setting options, and viewing spam digests should be directed to the CITES help desk.

Alternative - Standard math procmail configuration file

For users who either do not have a UIUC NetID, or who choose to deal with their UIUC and Math e-mail separately, we offer a solution based on local processing using a program called procmail. Procmail is a program that does general filtering of e-mail. Each user can have a personal configuration file in their home directory that tells procmail what to do with incoming mail, and we have developed a simple procmail configuration file that you can use. When activated, this configuration file tells procmail to delete any e-mail where "[SPAM]" would otherwise be prepended to the subject line. It will NOT remove e-mail that has "[SPAM? Score=nn]" prepended to the subject line; those messages will still be delivered to your inbox for manual processing. This is the simplest solution to set up and use, but it does not provide the flexibility of the other options.

Getting Started

There are two ways to activate the standard math procmail configuration file. The quickest way is to open up a terminal session on any of our Sun or Linux machines, and type the command:
	startspamfilter
This will save your old procmail file (if you had one), and create a new link to the default math procmail configuration file. If you don't feel comfortable doing this yourself, the other way to accomplish the same thing is to send an e-mail to help@math.uiuc.edu letting us know you'd like this done, and we'll be happy to do it for you.

Alternative - Custom .procmailrc

Users who want a high degree of control over how spam is handled may want to develop their own procmail configuration (.procmailrc) file. This option provides almost unlimited flexibility, but comes with a learning curve for those who do not already know procmail.

Getting Started

To get started with this option, we suggest you look at the default math procmailrc file. This file can be found in /usr/local/admin/procmailrc.deletespam. The comments and examples in that file should be taken as a starting point.
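
For readers new to procmail, it may help to see the general shape of a recipe before opening that file. The sketch below is not the contents of procmailrc.deletespam. It is a hypothetical recipe that assumes messages to be deleted carry the "X-MUSes-Info: Matched rule ISSPAM" header described later on this page.

```procmail
# Hypothetical sketch only; the department's actual file may differ.
# A recipe has three parts:
#   1. ":0" starts the recipe (no lock file is needed for /dev/null).
#   2. Each line beginning with "*" is a condition, a regular expression
#      that is matched against the message headers by default.
#   3. The final line is the action; delivering to /dev/null discards
#      the message.
:0
* ^X-MUSes-Info: Matched rule ISSPAM
/dev/null
```

A message that matches no delivering recipe falls through to your normal inbox, so a recipe like this only affects the mail it explicitly matches.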

Also, when working with procmail, remember to turn on logging with verbose output (procmail's LOGFILE and VERBOSE settings). Doing so saves a lot of time and effort when tracking down strange behavior.
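
A minimal logging setup might look like the following. LOGFILE, VERBOSE, and LOGABSTRACT are standard procmail variables; the log file name is only an example. Placing these lines above any recipes (and above any INCLUDERC line) means the log covers everything that follows.

```procmail
# Write procmail's log to a file in your home directory.
# The file name is an arbitrary example; use any path you can write to.
LOGFILE=$HOME/procmail.log

# Turn on extended diagnostics while you are debugging.
VERBOSE=yes

# Log a short summary for every delivering recipe that succeeds.
LOGABSTRACT=all
```

Verbose logs grow quickly, so it is worth setting VERBOSE=no once your recipes behave as expected.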

A quick and dirty way to use what we have already provided is to put the following line at the top of your current .procmailrc:

	INCLUDERC=/usr/local/admin/procmailrc.deletespam

This uses our "delete spam that CITES thinks should be deleted" functionality without disturbing whatever your .procmailrc file already does.

If you write your own recipes, the following e-mail headers will probably be the most useful:

  • X-Spam-Score:
    Numeric score from 0 to 100. This score is highly non-linear, with almost all e-mail clustered around 0 and 100. An almost-real-time graphical distribution of this score for all e-mail coming into the CITES mail relays can be found here.

  • X-Spam-Details:
    Additional information from the CITES spam filter, including the rule that the message matched. The matching rule can be any of the categories of rules listed at this web page in the first table after the graphs (e.g., tag_spam, cautious_likelyspam, etc.).

  • X-MUSes-Info: Matched rule NOTSPAM
    E-mail coming into our system which we have determined not to be spam. Currently this tracks the same category as the rule in the X-Spam-Details: header.

  • X-MUSes-Info: Matched rule LIKELYSPAM1
  • X-MUSes-Info: Matched rule LIKELYSPAM2
    E-mail coming into our system which we have determined is probably spam, but we are not sure. We take a more conservative view of what is "definitely" spam than CITES does. This category includes two types of e-mail. The first is e-mail tagged as "likely_spam" by CITES (rule LIKELYSPAM1). The second is e-mail that CITES tags as spam, but that is below our score threshold (rule LIKELYSPAM2).

  • X-MUSes-Info: Matched rule ISSPAM
    E-mail coming into our system which we have determined is spam. E-mail that has a very high X-Spam-Score is in this category. The cutoff for what we consider spam is somewhat higher than that used by CITES, so we will sometimes flag an e-mail as "LIKELYSPAM" that CITES has tagged as definitely spam (though not the other way around).

  • X-MUSes-Info: Matched rule NOHEADER
    E-mail that does not have an X-Spam-Score: header. Any e-mail sent within the department will match this rule.

  • X-MUSes-Original-Subject:
    The original subject of a LIKELYSPAM or ISSPAM message. It is only included in e-mails where we have rewritten the subject line with "[SPAM]" or "[SPAM? Score=nn]".
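
As a sketch of how these headers might be used in a custom .procmailrc, the recipes below file suspicious mail into separate folders instead of leaving it in the inbox. The folder names under $HOME/mail are hypothetical (the directory must already exist), the cutoff of 90 is an arbitrary example, and the layout of the X-Spam-Score: value is an assumption (a plain integer directly after the header name). Check the headers of a real message before relying on the second recipe.

```procmail
# File anything tagged LIKELYSPAM1 or LIKELYSPAM2 into its own mbox folder.
# The second ":" in ":0:" asks procmail to use a local lock file while
# it appends to the folder.
:0:
* ^X-MUSes-Info: Matched rule LIKELYSPAM[12]
$HOME/mail/likelyspam

# File mail scoring 90 through 100 into another folder.
# ASSUMPTION: the header looks like "X-Spam-Score: 97".
:0:
* ^X-Spam-Score:[ ]*(9[0-9]|100)
$HOME/mail/highscore
```

Procmail works through recipes in order and stops at the first one that delivers the message. If the INCLUDERC line for procmailrc.deletespam sits at the top of the file, mail it deletes never reaches recipes like these.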