Trading procmail for sieve

WARNING: Much technical jargon to follow. Those not versed in *nix style email black magic and jargon should proceed at their own risk. YOU HAVE BEEN WARNED.

I’ll state up front that my home email system has been working just fine for years now. That doesn’t mean I was entirely pleased with it, though. The main source of my angst was the use of procmail as my mail filter for routing mail delivered to me to my various personal mail folders.

Sure, there’s the maintainability of a procmail configuration file. It’s not exactly pretty to look at. There are special flags and characters galore that need to be researched every time it’s touched. There are special, obfuscated, fall-through conditions where certain processing paths are taken. In all, it’s the sort of configuration that makes total sense right up to the point where you get it working. Two days later, it might as well all be Greek. To top things off, procmail is a dinosaur, with no active development or support for the code base.

Even so, I did put the time in to figure out how to leverage it to the best of its capabilities and it has served me well over the years. My main bone of contention with the use of procmail in my case is its position as a glue component to bolt my spam filter, bogofilter, to my system’s MTA, exim. In short, it’s a kludge and one that I’ve grown less fond of as time has passed.

To explain things more thoroughly, it’s necessary to mention another part of my mail system: dovecot, an IMAP server which has proven extremely useful over the years. The Wife and I can both access email from any number of devices (computers, tablets, phones, and so forth) from anywhere we have network access. All of these different forms of access are possible because of dovecot. As such, dovecot isn’t going anywhere. Now dovecot happens to come with its own filtering capabilities, provided by an implementation of Sieve filtering, and also has its own LDA (local delivery agent), appropriately named dovecot-lda. It’s the presence of these two elements that, to my mind, makes procmail superfluous, because between Sieve and dovecot-lda all the functionality of procmail is available in a more modern package.

So why haven’t I ditched procmail yet?

Here’s the problem: I use user-level word lists for spam detection with bogofilter, as opposed to a global word list. Sieve does not easily pair up with bogofilter, and bogofilter’s direct integration with exim is limited.

With bogofilter, it’s possible to either use a global wordlist for detecting spam or a per-user wordlist, each of which resides in a user’s private directory. In this way, the Wife can have spam detected how she likes and I can have spam detected how I like. While it’s possible to incorporate bogofilter support directly into exim, it seems this way only supports use of a global wordlist, which is a no-go for my situation.

Now one might presume that I could still dump procmail and just make use of Sieve to run my mail through bogofilter for spam detection. It is, after all, a filtering language. Unfortunately, it’s not possible to do this because the standard Sieve language does not support running external programs. Thus, there is no way to get a plain Sieve script to run mails through bogofilter.

So to take advantage of Sieve, the processing has to take the following path: exim has to route the mail to an individual user, where (somehow!) it is then run through bogofilter which modifies the mail’s headers slightly to mark it as spam or not, after which the modified mail must be (somehow!) handed to dovecot-lda which will then run it through a Sieve filtering script. The Sieve script can then check the mail for spam and place it in the appropriate mail folder.

As hinted at, the bugaboo has been how to get exim to hand the mail to bogofilter so it can use the user’s word list for spam detection and then pass the resulting mail to dovecot-lda.

It turns out to be possible with the help of exim’s support of .forward files, as well as a little helper script.

To make it work, start by enabling the Sieve plugin in dovecot. Do this by editing /etc/dovecot/dovecot.conf and adding the following configuration:

protocol lda {
    ...
    mail_plugins = $mail_plugins sieve
    ...
}

(The ‘…’ characters just indicate the possible presence of other lines within the braces. They shouldn’t actually be in the file.)

Once this is done, restart dovecot however is appropriate for your system. On Debian, the /etc/init.d/dovecot restart incantation works nicely. Out-of-the-box support now exists for a ~/.dovecot.sieve file.

Next, create a .forward file for exim in the user’s home directory as follows:

# Exim Filter <== IMPORTANT: DO NOT REMOVE THIS LINE
if error_message then finish endif
pipe /home/user/.forward-helper

Now create the file /home/user/.forward-helper as follows:

#!/bin/sh
/usr/bin/bogofilter -u -e -p -d /home/user/.bogofilter/ | /usr/lib/dovecot/dovecot-lda

The one thing to check in these commands is that all of the paths are correct. The path following -d should be the directory holding the bogofilter wordlist. Similarly, make sure that the paths to bogofilter and dovecot-lda are correct for your system. In both files above, user should be replaced with the appropriate username. The helper script also needs to be executable (chmod +x /home/user/.forward-helper), since exim runs it directly.
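
The path checks can also be scripted rather than eyeballed. What follows is only a rough sketch: check_helper_paths is a hypothetical function name of my own, and it tests nothing beyond "is this file executable" and "does this directory exist".

```shell
#!/bin/sh
# check_helper_paths BOGOFILTER_PATH LDA_PATH WORDLIST_DIR
# Hypothetical helper: reports every path that looks wrong and returns
# non-zero if any of them failed the check.
check_helper_paths() {
    status=0
    # The first two arguments are expected to be executable programs.
    for prog in "$1" "$2"; do
        if [ ! -x "$prog" ]; then
            echo "not executable: $prog" >&2
            status=1
        fi
    done
    # The third argument is expected to be the wordlist directory.
    if [ ! -d "$3" ]; then
        echo "missing wordlist directory: $3" >&2
        status=1
    fi
    return "$status"
}
```

With the paths used in this post, the call would be check_helper_paths /usr/bin/bogofilter /usr/lib/dovecot/dovecot-lda /home/user/.bogofilter/ and silence means all three checks passed.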

What happens now is that after exim figures out which user to route mail to, it runs that user’s .forward file. The file is set up as an exim filter file and pipes the mail to the script .forward-helper. That script takes care of running the mail through bogofilter and then handing off the resulting mail to dovecot-lda. The helper file is necessary because of the multiple pipes. While it is possible to run the mail through bogofilter directly from the exim filter file, the result cannot be captured for further use, such as piping it to dovecot-lda. Thus, the helper file takes care of that for us.
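
One weakness of a bare two-stage pipe is that a bogofilter failure could leave dovecot-lda with nothing useful to deliver. A more defensive variant of the helper might look like the sketch below. It is not what I actually run: the variable names BOGOFILTER, WORDLIST_DIR and DOVECOT_LDA and the function name filter_and_deliver are hypothetical, mktemp is assumed to be available, and the fallback policy (deliver the untagged message if filtering fails) is just one possible choice.

```shell
#!/bin/sh
# Hypothetical variable names; the defaults are the paths used in this post.
BOGOFILTER="${BOGOFILTER:-/usr/bin/bogofilter}"
WORDLIST_DIR="${WORDLIST_DIR:-/home/user/.bogofilter/}"
DOVECOT_LDA="${DOVECOT_LDA:-/usr/lib/dovecot/dovecot-lda}"

filter_and_deliver() {
    # Spool the incoming message so it can still be delivered untagged.
    # 75 is the conventional "temporary failure" exit code (EX_TEMPFAIL).
    raw=$(mktemp) || return 75
    tagged=$(mktemp) || { rm -f "$raw"; return 75; }
    cat > "$raw"

    # With -e, bogofilter exits 0 whenever it could classify the message,
    # so a non-zero status or empty output is treated as a filter failure.
    if "$BOGOFILTER" -u -e -p -d "$WORDLIST_DIR" < "$raw" > "$tagged" \
        && [ -s "$tagged" ]; then
        "$DOVECOT_LDA" < "$tagged"
    else
        "$DOVECOT_LDA" < "$raw"
    fi
    status=$?

    # Clean up and hand dovecot-lda's exit status back to the caller.
    rm -f "$raw" "$tagged"
    return "$status"
}
```

The block only defines the function. A real helper would end with a bare filter_and_deliver call so that the message exim pipes in on standard input gets processed, and the script’s exit status would then be whatever dovecot-lda returned.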

At this point, all mail will start showing up in your INBOX (I’m assuming use of maildir here). For a start, here’s how to separate out spam, ham and unsure mail messages using Sieve:

require "fileinto";
if header :contains :comparator "i;octet" "X-Bogosity" "Spam"
{
    fileinto "spam";
    stop;
}
elsif header :contains :comparator "i;octet" "X-Bogosity" "Unsure"
{
    fileinto "unsure";
    stop;
}

Place this snippet into a file named .dovecot.sieve in the user’s home directory. Now, spam will go into a mail folder called “spam”, mail that can’t be classified goes into a folder called “unsure” and the rest will go into the user’s INBOX. Please see RFC 5228 (which obsoletes the original RFC 3028) for a detailed explanation of how the above works as well as how to further filter mail.
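
To sanity-check where a given message would end up, the same header test can be mimicked in shell. This is only a rough approximation of the Sieve rules, not a Sieve interpreter: which_folder is a hypothetical helper name, folded (multi-line) headers are ignored, and only the first X-Bogosity header is considered.

```shell
#!/bin/sh
# which_folder: read a mail message on stdin and print the folder the
# Sieve rules above would pick, judging by the X-Bogosity header.
which_folder() {
    # Look only at the header block, which ends at the first blank line.
    # Header names are matched case-insensitively.
    bogosity=$(awk '/^$/ { exit }
                    tolower($0) ~ /^x-bogosity:/ { print; exit }')

    # Case-sensitive substring match, like :contains with "i;octet".
    case "$bogosity" in
        *Spam*)   echo "spam" ;;
        *Unsure*) echo "unsure" ;;
        *)        echo "INBOX" ;;
    esac
}
```

Something along the lines of bogofilter -p -d /home/user/.bogofilter/ < some-message | which_folder should then print the folder name without touching the wordlist or delivering anything (some-message being any saved mail file).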

The solution seems somewhat trivial, but as a non-sysadmin lacking decades of experience working with email systems I can say it’s taken me quite a while to figure it out. Initially, I searched high and low for someone else who had done this, to no avail. Then I had to become somewhat steeped in the machinations of exim to figure out how to make it work. In all, it’s a satisfying solution and the new Sieve scripts are much easier to understand and maintain. So long to procmail.