
bogotrain.lua

My spam filter of choice has been bogofilter for many a year now. For the mail I receive it became very accurate quickly, and it has remained so ever since. It is a Bayesian spam filter, so it requires “training” to keep it classifying email properly.

I use an IMAP server for working with my mail, so integrating bogofilter with the server is less than ideal. The ideal would be a keystroke that immediately reclassifies a message. Instead, I’ve set up a couple of training folders that a script, run as a cron job, processes for me. Specifically, for good mail that was misclassified as spam I created a spam2mail folder, and for spam that was misclassified as good mail I use the Junk folder. The script, using IMAP, interrogates the mail folders, retrains bogofilter on the mail, and then places the mail in its final destination: my INBOX for the contents of spam2mail, or my spam folder for the contents of Junk.
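The folder scheme can be summarized as a small table. This is only a sketch: the `training` table and the `bogo_command` helper are hypothetical names made up for illustration, while the folder names and flag strings are the ones the script further down passes. As I read the bogofilter man page, `-S` and `-N` undo an earlier registration as spam or non-spam, `-n` and `-s` register as non-spam or spam, and `-b` switches on bulk mode.

```lua
-- Hypothetical summary of the training folders; names and flags mirror the script in this post.
local training = {
    { src = 'spam2mail', dest = 'INBOX', flags = '-Snb' }, -- good mail wrongly filed as spam
    { src = 'Junk',      dest = 'spam',  flags = '-Nsb' }, -- spam that slipped through as good
}

-- Build the bogofilter command line for one training folder.
-- extra_opts is an optional string of additional bogofilter options.
local function bogo_command(entry, extra_opts)
    local cmd = 'bogofilter'
    if extra_opts and extra_opts ~= '' then
        cmd = cmd .. ' ' .. extra_opts
    end
    return cmd .. ' ' .. entry.flags
end

for _, entry in ipairs(training) do
    print(entry.src .. ' -> ' .. entry.dest .. ': ' .. bogo_command(entry))
end
```

Keeping the folder-to-flag mapping in one place makes it harder to mix up which direction undoes which registration.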

Originally, I wrote the script in question using Perl and IMAPtalk. Since I have since written an IMAP library in Lua, I figured it appropriate to rewrite the script in Lua using my library.

After the break is the code.

#!/usr/bin/env lua

local imaplib = require("imap4")

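local host, user, passwd = arg[1], arg[2], arg[3]
if not (host and user and passwd) then
    error("usage: bogotrain.lua host user passwd [bogofilter-options]")
end
-- optional extra bogofilter options, prefixed with a space for concatenation
local bogo_useropts = arg[4] and (' '..arg[4]) or ''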

local function chk_result(r, imap)
    if r:getTaggedResult() ~= 'OK' then
        imap:shutdown()
        error("IMAP command failed")
    end
    return r
end

local function move_messages(imap, bogo_opts, user, src_mb, dest_mb)
    local r = chk_result(imap:select(src_mb), imap)
    -- untagged EXISTS will have message count
    local msg_cnt = r:getUntaggedContent('EXISTS')[1]
    if msg_cnt == '0' then return end
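    -- open bogofilter for writing; with -b it reads the names of the
    -- objects to process (here, the folder's Maildir directory) from stdin
    local fh = io.popen('bogofilter'..bogo_opts, 'w')
    if not fh then error("Could not open bogofilter") end
    local path = '/home/'..user..'/Maildir/.'..src_mb
    fh:write(path, '\n')
    fh:close()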
    -- moving messages around in imap is a pain
    -- it consists of copying, then setting flags, then expunging
    -- the close gets us out of the selected state so a new
    -- folder can be selected
    chk_result(imap:copy('1:*', dest_mb), imap)
    chk_result(imap:store('1:*', '+FLAGS', [[\Deleted \Seen]]), imap)
    chk_result(imap:expunge(), imap)
    chk_result(imap:close(), imap)
end

local imap = imaplib.IMAP4:new(host)
local r = chk_result(imap:login(user, passwd), imap)

move_messages(imap, bogo_useropts..' -Snb', user, 'spam2mail', 'INBOX')
move_messages(imap, bogo_useropts..' -Nsb', user, 'Junk', 'spam')

-- done: close the connection (the same call chk_result uses on failure)
imap:shutdown()

Pretty simple. Two helper functions do the lion’s share of the work, and move_messages is obviously the big one. It checks whether there is any mail in src_mb and simply returns if not. Otherwise it retrains bogofilter by writing the path of the folder’s Maildir directory to bogofilter’s standard input (the b in the option string puts bogofilter in bulk mode, where it reads the names of the things to process from stdin), which also means the script has to run on the machine that holds the Maildir. As noted in the comment, moving mail around with IMAP is a bit of a pain: it takes three steps to copy the mail into the new mailbox, flag it as deleted in the original, and expunge it. But I’ll also note that those commands work on all the messages in a folder at once (as opposed to having to repeat the commands for each message in the mailbox), so it’s not all bad.
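To see the ordering of the move on its own, here is a minimal sketch that runs the same sequence against a stub instead of a server. The stub, `new_stub`, and `move_all` are hypothetical names invented for this illustration; only the method names (select, copy, store, expunge, close) are taken from the calls in the script, and the logged lines are a readable trace, not real IMAP wire format.

```lua
-- Stub standing in for an IMAP connection: it only records the commands it is given.
-- Nothing here talks to a server or uses the imap4 library.
local function new_stub()
    local stub = { log = {} }
    for _, name in ipairs({ 'select', 'copy', 'store', 'expunge', 'close' }) do
        stub[name] = function(self, ...)
            -- record the command name followed by its arguments
            local parts = { name:upper(), ... }
            table.insert(self.log, table.concat(parts, ' '))
            return true
        end
    end
    return stub
end

-- The same sequence move_messages uses, minus bogofilter and error checking.
local function move_all(imap, src_mb, dest_mb)
    imap:select(src_mb)                              -- enter the source folder
    imap:copy('1:*', dest_mb)                        -- step 1: copy everything over
    imap:store('1:*', '+FLAGS', [[\Deleted \Seen]])  -- step 2: flag the originals
    imap:expunge()                                   -- step 3: remove flagged mail
    imap:close()                                     -- leave the selected state
end

local stub = new_stub()
move_all(stub, 'Junk', 'spam')
for i, line in ipairs(stub.log) do
    print(i, line)
end
```

The copy has to come before the store, since expunging first would leave nothing to copy. Servers that advertise the MOVE extension (RFC 6851) can do all of this in a single command, but the copy, flag, expunge sequence works on any IMAP4rev1 server.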
