We have all been annoyed by spam mail. Spam removal is critically important; otherwise we would spend hours every day clearing it out of our mailboxes.
A mail can be classified as spam based on two criteria:
1) Bulk: the recipient's personal identity and context are irrelevant because the message is equally applicable to many other potential recipients.
2) Unsolicited: the recipient has not verifiably granted deliberate, explicit, and still-revocable permission for it to be sent.
The main goal of any anti-spam system, no matter how powerful and effective it is, is that no legitimate mail should ever be classified as spam. An ideal anti-spam system rejects messages which are both bulk and unsolicited, letting pass those messages which are of specific personal relevance to the recipient (not "bulk"), and those which the recipient has expressly requested (not "unsolicited").
Efficiency plays a major role in mail anti-spamming. The computation needed to classify mail can be a major concern: parsing every message and applying comparisons, algorithms or statistical analysis to it takes considerable time and processing. A system that accepts all mail and then discards the portion which is spam wastes significant resources on mail that will ultimately be discarded. This is the hidden cost of spam, and it can be arbitrarily large, since it depends on how much spam other parties send to the recipient. To address this, a hypothetical intelligent agent could operate at the sender's system, preventing unwanted data from entering the network at all. Unfortunately this seems practically untenable for several obvious reasons, not the least of which is the cost of replicating the agent at every prospective sender.
Text analysis based on statistical data is the most common approach. It tries to identify the distinctive traits of spam mails and uses a combination of them to recognise spam. Starting with “Dear friend” or containing an invitation to click are two of the many traits that can be identified. A slightly tweaked Bayesian filter has been reported to miss fewer than 5 per 1000 spams, with 0 false positives.
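To make the statistical approach more concrete, here is a minimal sketch of a naive Bayes text classifier in Python. It is not the tweaked filter mentioned above, only an illustration of the general idea; the class name `NaiveBayesFilter`, the tokenizer, the add-one smoothing and the tiny training set are all assumptions made for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9$']+", text.lower())


class NaiveBayesFilter:
    """Toy Bayesian spam filter (hypothetical, for illustration only)."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Record the words of one message labelled 'spam' or 'ham'."""
        self.message_counts[label] += 1
        self.word_counts[label].update(tokenize(text))

    def spam_probability(self, text):
        """Estimate P(spam | words), treating words as independent."""
        vocabulary = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_messages = sum(self.message_counts.values())
        tokens = tokenize(text)
        log_score = {}
        for label in ("spam", "ham"):
            # Smoothed prior: how common this class was during training.
            prior = (self.message_counts[label] + 1) / (total_messages + 2)
            score = math.log(prior)
            total_words = sum(self.word_counts[label].values())
            for token in tokens:
                # Add-one smoothing keeps unseen words from zeroing the score.
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + len(vocabulary) + 1))
            log_score[label] = score
        # Turn the two log scores into a probability; clamp to avoid overflow.
        difference = log_score["ham"] - log_score["spam"]
        difference = max(min(difference, 700.0), -700.0)
        return 1.0 / (1.0 + math.exp(difference))


# Tiny made-up training set, far too small for real use.
spam_filter = NaiveBayesFilter()
spam_filter.train("Dear friend, click here to claim your prize", "spam")
spam_filter.train("Click now for a free offer, dear friend", "spam")
spam_filter.train("Meeting agenda for tomorrow attached", "ham")
spam_filter.train("Can we move lunch to tomorrow?", "ham")

print(spam_filter.spam_probability("Dear friend, click here"))
print(spam_filter.spam_probability("Meeting agenda tomorrow"))
```

A real filter would be trained on thousands of messages and would typically use a high decision threshold, so that a mail is only rejected when the evidence is overwhelming. That keeps it in line with the goal of never misclassifying legitimate mail.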
Some other anti-spam techniques are:
The most effective means is “Source Address Blacklisting”. It is an aggressive approach which refuses all mail from sources which have a known history of sending spam, a bad reputation for the same, or some other feature which warrants blacklisting as a bad risk. General lists of IP addresses have other applications as well, but the one that matters here is refusing delivery of mail before the "DATA" command in SMTP. Unlike most anti-spam techniques, blacklisting reduces the hidden cost of spam by preventing transmission of the message.
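As a rough sketch of how refusing mail before "DATA" might look, the Python snippet below checks the connecting IP address against a blacklist and picks the server's opening reply. The function names, the host name `mail.example.org` and the listed networks (taken from the address ranges reserved for documentation) are assumptions for illustration; real deployments usually consult maintained blacklists rather than a hard-coded list.

```python
import ipaddress

# Hypothetical blacklist built from documentation-only address ranges.
BLACKLISTED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.17/32"),
]


def is_blacklisted(client_ip, networks=BLACKLISTED_NETWORKS):
    """Return True if the connecting address falls inside a listed network."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in networks)


def smtp_greeting(client_ip):
    """Choose the reply sent at connect time, before any message is accepted."""
    if is_blacklisted(client_ip):
        # 554 at connection time: the session is refused, so DATA never happens.
        return "554 5.7.1 Service unavailable; client host blocked"
    return "220 mail.example.org ESMTP ready"


print(smtp_greeting("192.0.2.44"))
print(smtp_greeting("203.0.113.5"))
```

Because the decision needs only the client's address, the message body is never transferred, which is where the saving in hidden cost comes from.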
Whitelisting is effective as an anti-spam technique, but it is overkill. It eliminates all sources which are not pre-approved, and so long as all the approved sources can be trusted to operate within the bounds of acceptable behaviour, it eliminates spam. It also eliminates any possibility of using the email address in question as a means of introduction, because mail from strangers is never accepted.
Greylisting eliminates those senders which attempt delivery in a "hit and run" manner, not reattempting delivery in accordance with standards. This has nothing to do with the characteristics of spam in a direct sense, but it so happens that many spammers use "ratware" delivery systems which are egregiously non-compliant with standards, and this technique efficiently prevents communication of messages from such systems.
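One common way to implement greylisting is to answer the first delivery attempt with a temporary failure and to accept the mail only when the same sender retries after a short wait. The Python sketch below illustrates that idea; the `Greylist` class, the (client IP, sender, recipient) key and the 300-second delay are assumptions chosen for the example, not values prescribed by this article.

```python
import time


class Greylist:
    """Toy greylist keyed on (client IP, sender, recipient). Hypothetical."""

    def __init__(self, min_delay_seconds=300, clock=time.time):
        self.min_delay_seconds = min_delay_seconds
        self.clock = clock       # injectable so the behaviour can be tested
        self.first_seen = {}     # triplet -> time of the first attempt

    def check(self, client_ip, sender, recipient):
        """Return the SMTP reply for one delivery attempt."""
        triplet = (client_ip, sender.lower(), recipient.lower())
        now = self.clock()
        # Remember when this triplet was first seen (no change on retries).
        first_attempt = self.first_seen.setdefault(triplet, now)
        if now - first_attempt >= self.min_delay_seconds:
            # The sender came back after the delay, as a compliant server does.
            return "250 OK"
        # Temporary failure: a standards-compliant sender will retry later.
        return "451 4.7.1 Greylisted, please try again later"


# Simulated clock so the example does not have to wait five minutes.
fake_now = [0.0]
greylist = Greylist(min_delay_seconds=300, clock=lambda: fake_now[0])

first_reply = greylist.check("203.0.113.5", "alice@example.com", "bob@example.org")
fake_now[0] = 400.0
retry_reply = greylist.check("203.0.113.5", "alice@example.com", "bob@example.org")
print(first_reply)
print(retry_reply)
```

A "hit and run" sender never makes the second attempt, so its message is never accepted, while a legitimate mail server only sees a short delay.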
Challenge/response, or ‘move to spam’ in its broadest sense, attempts to determine whether some source address of the message is monitored by a human being capable of taking some requested action. In most cases, this effectively precludes the possibility that the message is sent in bulk.