Bayesian Filters
By: Debbie Hamstead, Sat Dec 10th, 2005 12:41:38 AM

A common problem with conventional filters is that they are a one-size-fits-all solution to SPAM. Their rules are fixed and change only when the anti-spam service sends an update. SPAM changes too quickly for that method to be effective. Additionally, what is SPAM to you may not be SPAM to someone else. That is where Bayesian filters come in. They are very effective at eliminating SPAM and have very low false-positive rates for their users.
Bayesian filters are based on Bayesian logic, a branch of logic named for Thomas Bayes, an eighteenth-century mathematician. This type of logic applies to decision making by determining the probability of a certain event based on the history of past events. Using this as a model seemed a logical step for SPAM filtering. If you can predict what SPAM will look like now based on what it has looked like in the past, you are halfway to the solution. To finish solving the problem, Bayesian filters were developed to be dynamic, so that they continue to be effective as the SPAM changes.

Bayesian filters are content based. They look for characteristics in each email that you receive and calculate the probability of it actually being SPAM. These characteristics are generally words in the content and in the header information that each email contains. They can also include common SPAM HTML code, word pairs, phrases, and the location of a phrase in the body of the email. Typical words in SPAM would be "Free" and "Win", while "humility" would probably not appear. The filter begins with a neutral score of 50% for the email, and then adds points for each SPAM characteristic it finds. Likewise, it deducts points for each non-SPAM characteristic present. The total score is calculated, and action is then taken based on the email's likelihood of being SPAM. The filter does not assume that all arriving email is bad, but rather that all email is neutral and should be considered equally.

Bayesian filters are better than traditional content scoring filters in that they are trained by you to recognize your email. A doctor, for example, might have many emails legitimately using the word "Viagra". A traditional content scoring filter would probably send that email to the SPAM folder, or delete it. This would result in a high false-positive rate for the doctor, even though you might not want Viagra emails yourself. A Bayesian filter instead builds its list from the doctor's own email use and from the doctor's corrections to incorrectly marked email.
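The scoring and training described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the method of any particular product: the class name TinyBayesFilter, the simple word tokenizer, the add-one smoothing, and the way word evidence is combined as log-odds are all choices made for this example. It starts every message at a neutral 50%, moves the score up for words seen mostly in SPAM and down for words seen mostly in good mail, and learns only from the mail that one user has marked.

```python
import math
import re
from collections import Counter


class TinyBayesFilter:
    """Toy per-user Bayesian filter (hypothetical name, for illustration only)."""

    def __init__(self):
        self.spam_words = Counter()  # word -> number of SPAM messages containing it
        self.good_words = Counter()  # word -> number of good messages containing it
        self.spam_total = 0          # SPAM messages trained so far
        self.good_total = 0          # good messages trained so far

    @staticmethod
    def tokenize(text):
        # Assumption: lowercase words only; real filters also use headers, HTML, phrases.
        return set(re.findall(r"[a-z']+", text.lower()))

    def train(self, text, is_spam):
        """Record one message the user has marked as SPAM or as good mail."""
        words = self.tokenize(text)
        if is_spam:
            self.spam_words.update(words)
            self.spam_total += 1
        else:
            self.good_words.update(words)
            self.good_total += 1

    def word_spam_probability(self, word):
        # Add-one smoothing (an assumption) keeps probabilities away from 0 and 1.
        in_spam = (self.spam_words[word] + 1) / (self.spam_total + 2)
        in_good = (self.good_words[word] + 1) / (self.good_total + 2)
        return in_spam / (in_spam + in_good)

    def score(self, text):
        """Return a value between 0 and 1; 0.5 means neutral."""
        log_odds = 0.0  # log-odds of 0 is the neutral 50% starting point
        for word in self.tokenize(text):
            if word not in self.spam_words and word not in self.good_words:
                continue  # words never seen in training stay neutral
            p = self.word_spam_probability(word)
            log_odds += math.log(p) - math.log(1 - p)
        log_odds = max(-30.0, min(30.0, log_odds))  # avoid overflow in exp()
        return 1 / (1 + math.exp(-log_odds))


# Usage: a filter trained on a doctor's own mail.
doctor = TinyBayesFilter()
doctor.train("Win a free prize now", is_spam=True)
doctor.train("Free Viagra, win big", is_spam=True)
doctor.train("Patient asked about Viagra dosage", is_spam=False)
doctor.train("Viagra prescription refill for patient", is_spam=False)

print(doctor.score("Win free Viagra"))                    # above 0.5: leans SPAM
print(doctor.score("Patient question on Viagra dosage"))  # below 0.5: leans good
```

Because the word counts come only from this user's mail, "Viagra" pulls the doctor's score slightly toward good mail, while "win" and "free" still push it toward SPAM. Calling train again on a message the filter got wrong is the correction step: the counts shift, and similar messages score differently afterwards.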
The initial training period may be a little time consuming, but once complete it offers a tailored solution to SPAM control for each user. In addition to protecting the good email, the filter is difficult for Spammers to trick, as every user's filter will have individual requirements.

That being said, Spammers do have a few weapons in their arsenal to attempt to circumvent Bayesian filters. The easiest would be to create SPAM that looks like an everyday letter. This would remove their ability to use typical marketing techniques, and so is not as likely with normal commercial email. For the purveyors of fraud, however, this would be easier. Spammers could also weight a message so heavily with common good words, or distort the bad ones so much, that it is scored as neutral or lower and gets through. Once you correctly mark it as SPAM, though, the filter will adjust and not be fooled by that message again.

This automation, and the ability of the software to grow as you and SPAM change over time, is key to the significance of these types of filters. Widespread use of good Bayesian filters would not only eliminate SPAM on your end, but could also reduce the practice of Spamming altogether. If Spammers cannot get the mail through, they are just wasting their time.

About the author: Debbie Hamstead is the webmaster of http://www.StompingOutSPAM.com, offering a comprehensive Quick Start Guide to keeping SPAM out of your inbox. She also manages http://www.nichesites4profit.com
© 2004 - 2008 - Information Articles