May 2011

Cocktail Talk is a casual monthly newsletter intended to arm you with amusing bits and bytes of information on what's happening in the computer world. Topics sure to break the ice and capture an audience at many a social or business event.

NinjaAVG, Computer Associates, Trend Micro and Symantec are software products you buy and put on your PC to protect you from Viruses and Spam.


eMail suspected of being Spam goes into your Spam Folder; if it's not Spam, maybe you fish it out and add it to your Safe Sender List. Some Spam makes it to your Inbox, and maybe you delete it and are done, or maybe you add it to a Blocked List.


Most people spend a lot more time fishing eMail out of the Spam Folder than they do deleting what gets through. That's a pretty good argument for lightening up a little at the border.


That's pretty much all you see on the surface. Let's take a trip into the internet to see how it blocks Spam with SpamAssassin.  


Above the surface are paid for products like Microsoft and Apple, but down there, in the Cloud, it's all free. Free is looked down upon, up here, but down there, in the Cloud, it makes the world go around. Please understand, down there, in the world in the Cloud, eight plus eight equals F*. It also equals 10000**.


Internet Servers run Apache, on Linux, both free. Websites are created in WordPress, Joomla or some other free software. Contact Us Pages use FormMail, which is free, to send eMail. Webmail handlers Horde and Squirrelmail are free. SpamAssassin, also free, is what fights Spam on the internet, at your Post Office, before it gets delivered to your PC.


SpamAssassin scans eMail and reports on the probability of it being Spam. eMail has two parts, the Header and the Body. The Header has a lot of information in it that you don't usually see. SpamAssassin reports what it finds in the Header section, like this:


X-Spam-Subject: ***SPAM***

X-Spam-Status: Yes, score=4.4

X-Spam-Score: 44

X-Spam-Bar: ++++

X-Spam-Report: Spam detection software, running on the system "", has

 identified this incoming email as possible spam.  The original message

 has been attached to this so you can view it (if it isn't spam)...


On a scale of one to ten, ten being highest, SpamAssassin ranked this a 4.4. Someone else determines the cutoff for what's allowable; SpamAssassin just ranks them. Only a fraction of the Spam targeting you is delivered to your PC; the majority is deleted based on the SpamAssassin ranking. The question is, like the man said: "How do it know?"
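If you want to see the ranking and the cutoff as two separate jobs, here is a minimal Python sketch. It reads the score out of a header like the sample above and then applies a cutoff. The function names and both cutoff values are my own assumptions for illustration; your mail host picks the real cutoff.

```python
import re
from email import message_from_string

# A cut-down copy of the sample headers, plus a made-up Subject and Body.
SAMPLE = """\
X-Spam-Subject: ***SPAM***
X-Spam-Status: Yes, score=4.4
X-Spam-Score: 44
X-Spam-Bar: ++++
Subject: Hello

Body goes here.
"""


def spam_score(raw_message):
    """Return the number after 'score=' in X-Spam-Status, or None if absent."""
    msg = message_from_string(raw_message)
    status = msg.get("X-Spam-Status", "")
    match = re.search(r"score=(-?\d+(?:\.\d+)?)", status)
    return float(match.group(1)) if match else None


def is_spam(score, cutoff):
    """The cutoff is somebody else's decision; this only compares."""
    return score >= cutoff


score = spam_score(SAMPLE)
print(score)                       # 4.4
print(is_spam(score, cutoff=4.0))  # True: a strict Post Office
print(is_spam(score, cutoff=5.0))  # False: a more relaxed one
```

Same eMail, same score, two different verdicts. That is the "lightening up at the border" knob.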


SpamAssassin "knows" by running about 200 tests on each and every eMail. I can't really list them, especially not the funny ones, because the SpamAssassin test names would be blocked as Spam by SpamAssassin. It's one of those "Only we can say that." things. You can look at all the tests on their website, where they can say anything. Anyway.


After it runs all these tests it takes the results and does some cyphering, and gozintas, carries a naught, and decides if you've got Spam.
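The cyphering is mostly addition. Each test that trips is worth some points, and the points get added up. Here is a rough Python sketch of the idea. The test names and point values below are invented by me for illustration; they are not the real SpamAssassin tests or scores.

```python
# Hypothetical test names and point values, made up for this example.
TEST_POINTS = {
    "SUBJECT_ALL_CAPS": 1.5,
    "MENTIONS_REPLICA_WATCH": 2.0,
    "MISSING_DATE_HEADER": 0.9,
    "SENDER_IN_ADDRESS_BOOK": -1.0,  # good signs can subtract points
}


def total_score(tests_hit):
    """Add up the points for every test the eMail tripped."""
    return round(sum(TEST_POINTS[name] for name in tests_hit), 1)


tripped = ["SUBJECT_ALL_CAPS", "MENTIONS_REPLICA_WATCH", "MISSING_DATE_HEADER"]
print(total_score(tripped))  # 4.4
```

Notice that a test can carry negative points, so an eMail from someone you know can earn its way back out of trouble.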


SpamAssassin also uses a formula derived from Bayes' Theorem to rank the spamminess of individual words in an eMail that are not covered by a test.


Let's use the word "Replica", as in "Fake Rolex", to look at the formula used by SpamAssassin. 


The formula used by SpamAssassin is:


Pr(S|W) = (Pr(W|S) * Pr(S))  /  (Pr(W|S) * Pr(S) + Pr(W|H) * Pr(H))




Pr(S|W)  is the probability that a message is spam, knowing that the word "replica" is in it;

Pr(S)      is the overall probability that any given message is spam;

Pr(W|S)  is the probability that the word "replica" appears in spam messages;

Pr(H)      is the overall probability that any given message is not spam (is "ham");

Pr(W|H)  is the probability that the word "replica" appears in ham messages.
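To put numbers on it, here is a small Python sketch of that formula. Every number in it is made up by me for illustration: I am assuming "replica" shows up in 20% of spam, in 1% of ham, and that half of all eMail is spam. The function and parameter names are mine too.

```python
def prob_spam_given_word(p_word_in_spam, p_word_in_ham, p_spam=0.5):
    """Probability a message is spam, knowing the word is in it."""
    p_ham = 1.0 - p_spam
    numerator = p_word_in_spam * p_spam
    denominator = numerator + p_word_in_ham * p_ham
    if denominator == 0:
        # The word was never seen anywhere, so it tells us nothing new.
        return p_spam
    return numerator / denominator


# Made-up figures: "replica" in 20% of spam and 1% of ham.
result = prob_spam_given_word(p_word_in_spam=0.20, p_word_in_ham=0.01)
print(round(result, 3))  # 0.952
```

With those invented figures, an eMail containing "replica" comes out about 95% likely to be spam. Change the inputs and the answer moves, which is the whole point: the filter learns the inputs from the mail it has already seen.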


Wow, that's really important to know, I bet.


Bad eMail is Spam, good eMail is Ham. Why, I don't know, Hawaiians seem to love Spam and have it on so many menus that I can't imagine it's bad. It lasts forever too, maybe longer than a Twinkie, without refrigeration. Nancy (not her real name) and I went to Hawaii in 1999 and I still have a souvenir can of Spam on my shelf in the office. Anyway.


Wow, that's really important to know too, I bet.


SpamAssassin also consults Razor2. It's kind of a "Who's Who" of known Spammies. 


Like the man says: "Razor2 establishes a distributed and constantly updating catalogue of spam in propagation that is consulted by email clients to filter out known spam. Detection is done with statistical and randomized signatures that efficiently spot mutating spam content. User input is validated through reputation assignments based on consensus on report and revoke assertions which in turn is used for computing confidence values associated with individual signatures."


Wow, that's really important too, I bet.


Fact of the matter is, SpamAssassin, invisible hero, does the dirty work for us, for free, using lots of wonderful toys that we couldn't care less about. 

Eight plus eight may equal sixteen, maybe it equals F, or even 10000, that's your call, and that's Cocktail Talk.




Thank you for reading,


Craig Phillips
CN Consulting, Inc.


* Strictly speaking, hexadecimal F is 15, one short of sixteen. The digits run 0-9,A,B,C,D,E,F, and because 0 counts as one of them, sixteen is written 10.

** 10000 is 16 in binary. 1 is one, 10 is two, 11 is three, 100 is four, 101 is five, 110 is six, etc.
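If you would rather check the footnote math than take my word for it, a few lines of Python will do it. This is only a quick sketch using Python's built-in number formatting; the variable names are mine.

```python
total = 8 + 8

in_hex = format(total, "x")     # sixteen written in hexadecimal
in_binary = format(total, "b")  # sixteen written in binary
value_of_f = int("F", 16)       # what the hexadecimal digit F is worth

print(total)       # 16
print(in_hex)      # 10
print(in_binary)   # 10000
print(value_of_f)  # 15
```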

