wanted thunderbird extension
If any of you are extension developers and you're looking for a new project, I've got one for you:
I get a lot of spam. No doubt much of it is blocked at the server before it ever reaches my inbox, but for the batch that does make it through, Thunderbird's adaptive spam filters automatically flag it and move it to a special folder.
What I'd like from this extension is to be able to visualize the different kinds of spam I receive and to be able to watch the categories over time. I believe our spam filtering could tell an extension which key words or phrases generated the "hit" and an extension could have categories like "prescription medications", "stock tips", "pornography", etc. so that I could track over time what kinds of spam I'm getting.
I know there's not a lot of practical value here, but I think it could be interesting and I think that if we can find some way to make spam filtering fun, that's a win :-)
What do you all think? Any volunteers?
update: I wasn't terribly clear, as is obvious from the comments. I'm not interested in actually keeping any of the spam. I'd like some nice visualization of the kinds of spam I'm getting. I could see, for example, a graph that shows trendlines for the different items they're trying to sell, or the different techniques they're using to try to defeat spam filtering, and maybe the total volume of spam over time.
Does that make more sense?
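To make the idea concrete, here is a minimal sketch of the tallying step in Python. It is not a Thunderbird extension and does not touch Thunderbird's junk filter. The category names come from the examples above, but the keyword lists, the function names (`categorize`, `tally_by_month`) and the sample messages are all made up for illustration. It simply assigns each spam message to the first category whose keyword it contains and counts categories per month, which is the data a trendline graph would need.

```python
from collections import Counter, defaultdict
from datetime import date

# Hypothetical categories and keywords, chosen only for illustration.
CATEGORIES = {
    "prescription medications": ["pharmacy", "pills", "prescription"],
    "stock tips": ["stock", "shares", "ticker"],
    "pornography": ["xxx", "adult"],
}


def categorize(text):
    """Return the first category whose keyword appears in text, else 'other'."""
    lowered = text.lower()
    for category, keywords in CATEGORIES.items():
        if any(word in lowered for word in keywords):
            return category
    return "other"


def tally_by_month(messages):
    """Count categories per month.

    messages is an iterable of (received_date, text) pairs.
    Returns a dict mapping (year, month) to a Counter of category names.
    """
    counts = defaultdict(Counter)
    for received, text in messages:
        counts[(received.year, received.month)][categorize(text)] += 1
    return counts


# Made-up sample data standing in for messages already flagged as junk.
sample = [
    (date(2006, 9, 28), "Cheap pills from our online pharmacy"),
    (date(2006, 10, 2), "Hot stock alert: buy shares now"),
    (date(2006, 10, 5), "Prescription refills, no doctor needed"),
    (date(2006, 10, 9), "You have won the lottery"),
]
trend = tally_by_month(sample)
```

Plain substring matching is crude, and a real extension would presumably lean on whatever tokens the junk filter itself exposes, but per-month counters like `trend` are enough to draw one line per category.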
reactions, thoughts, comments, etc.
I was actually working on something that was similar, but not exactly the same, as what you're describing. It was a categorizer for non-spam (named Thunderjudge), basically filtering ham mail into various categories using Bayesian techniques. I gave up and put it aside for a while because (1) I found it too difficult to hook into TB's spam filter; I was just not smart enough to figure out the right way to do it, and (2) moving messages into categories in response to mail arriving would corrupt the message databases from time to time. This latter problem doesn't seem like it would be an issue with what you're describing, since your feature only involves tallying and not any explicit actions. But the former problem seems like it would still be prohibitive. Furthermore, I figure it would be nice to have a more general approach which would categorize any type of mail.
More info:
http://www.cs.stevens.edu/~dlong/software/thunderjudge/
http://groups.google.com/group/mozilla.dev.apps.thunderbird/browse_frm/thread/df91bf207b46d47b/87f06e6c0486e0eb#87f06e6c0486e0eb
Posted by: Dustin Long | October 9, 2006 1:58 PM
There is a piece of software called Polymail (http://www.extravalent.com/) that categorizes email like this, but it's not free (as in beer or speech) and it doesn't interface with Thunderbird.
Posted by: Matt Nordhoff | October 9, 2006 5:10 PM
Not exactly what you were asking about, but interesting nonetheless: grow your own spam tree at the spam garden here:
http://www.netlash.com/spamgarden/
Posted by: Step | October 9, 2006 5:34 PM
Wooohh.. I would love that as well.
Maybe we could get Scott to fill us in on how feasible it would be given TB's current Bayesian API (or lack thereof).
Posted by: Jed | October 9, 2006 8:52 PM
(not starting a flame)
In Opera Mail you can add as many 'learning' filters as you want, and messages can appear in multiple filters. Works pretty good, if you give it something to work on.
Posted by: Rijk | October 10, 2006 12:05 AM
Why not set mail.server.default.spamLoggingEnabled to true?
Posted by: funTomas | October 10, 2006 12:08 AM
Why not use Mailwasher? All the stats on spam you need.
W.
Posted by: Wally | October 10, 2006 1:00 AM
If I can piggyback a request onto yours, Asa: Is there any way to make the junk mail filter automatically mark a message as deleted? Right now I can have TB move any message that it flags on its own, and any message I manually flag can be deleted automatically, but there's nowhere in the options (that I can find) to make TB say, "this is spam; mark as deleted".
It doesn't really help me to have all these messages marked as spam if I have to go through and delete them manually before they can be purged. I already know they're spam by looking at the subject; saving me the time of picking through my morning mail would be a good thing.
Posted by: Jason | October 10, 2006 4:08 AM
So you're suggesting something that would allow you to set up a regular expression or other form of rule for the purposes of filtering incoming mail marked as spam into specific folders?
It would be nice if you could have a "filter" setup that parses incoming messages marked as spam into these "filters" that have a regex in them that looks for whatever keywords you're looking for.
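As a rough sketch of that kind of rule set (in Python, not Thunderbird's actual filter API), each "filter" below is a regular expression paired with a target folder. The folder names, the patterns and the `route` function are all hypothetical and only meant to show the shape of the idea: a message already marked as spam goes to the folder of the first rule that matches it.

```python
import re

# Hypothetical rules: (target folder, compiled pattern). Order matters,
# because the first matching rule wins.
RULES = [
    ("Junk/Pharmacy", re.compile(r"\b(pharmacy|pills?|prescription)\b", re.IGNORECASE)),
    ("Junk/Stocks", re.compile(r"\b(stocks?|shares)\b", re.IGNORECASE)),
    ("Junk/419", re.compile(r"\b(inheritance|wire transfer)\b", re.IGNORECASE)),
]


def route(subject, body, default="Junk"):
    """Return the folder for a spam message, or default if no rule matches."""
    text = subject + "\n" + body
    for folder, pattern in RULES:
        if pattern.search(text):
            return folder
    return default
```

For example, `route("Cheap PILLS", "")` falls under the first rule, and a message that matches nothing stays in the plain junk folder.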
Posted by: The double M | October 10, 2006 10:13 AM
Wow, that would be a very ambitious extension. It would have to completely redo tbird's bayes filter to label many more subcategories (mortgage, pharmacy, 419, ...) and obfuscation techniques, perhaps from jwc's spammer's compendium. I'd suspect the 'show a table of the number of spams you've seen this week/month' would be much more feasible, but perhaps only for messages the built-in spam checker marked. For me, that would be pretty useless -- Tbird's antispam filter catches (literally) 5% of the spam I get, despite our corp spam filter tagging 97ish percent, heavy training, and periodic resets ('cause I can't imagine how it could possibly be so terrible).
Posted by: Miles | October 12, 2006 7:33 AM