Spamming

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by 206.109.159.11 (talk) at 20:06, 28 January 2003. The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Spamming is the act or offense of sending unsolicited, bulk (and usually commercial) electronic messages. Because it is usually commercial, the email version of spam is often known by the abbreviation "UCE" for "unsolicited commercial e-mail," though "UBE" for "unsolicited bulk email" is arguably more accurate. Though spamming can be done through any number of media, the most common is email. The most common purpose for spamming is to advertise, usually pornography.

A spammer sends identical or nearly identical messages to thousands of email addresses. These addresses are often harvested from Usenet postings or web pages, obtained from databases, or simply guessed by using common names and domains. By definition, spam is sent without the permission of the recipients.

The best real-world analogy for understanding the core problem of spam is to buy a copy of the nearest big-city newspaper and count the advertisements, classifieds and all. Each advertiser paid at least as much as it would cost to do a decent-sized spam run, just to get a message into a newspaper in the hope that a few hundred thousand people might see it. And a few hundred thousand people is a very small spam run indeed.

If spamming by mail cost as little as spamming by email, you could expect to see about as many messages in your mailbox each day as there are ads in that paper. And that's just from that one city... how many cities of that size are there in the world? Remember: it doesn't cost any more to spam from New Delhi than from New York.

Spamming is broadly considered unacceptable behavior by Internet service providers and indeed most Internet users. Users find spam annoying and its contents frequently offensive; Internet service providers object to the unrecoupable cost of processing other people's advertisements. Surveys have indicated that spam is one of most users' greatest annoyances about the Internet today. Sending spam is a violation of the Acceptable Use Policy (AUP) of most ISPs, and can lead to the termination of the sender's account.

By and large, senders of email advertisements each assert that what they do is not spamming. Precisely what sorts of activity constitute spamming is a matter of debate, and definitions differ based on the purpose for which "spamming" is being defined.

In comparison, sending junk mail through the postal service is not a form of spamming. The senders of junk advertisements have to pay the postage for each ad they send out. In contrast, ISPs are burdened with the task and cost of delivering spam email (and other forms of spam).

Etymology

The term spam is derived from the Monty Python SPAM sketch, set in a cafe where everything on the menu includes SPAM™ luncheon meat. While a customer plaintively asks for some kind of food without SPAM in it, the server reiterates the SPAM-filled menu. Soon, a chorus of Vikings join in with a song: "SPAM, SPAM, wonderful SPAM, glorious SPAM," over and over again, drowning out all conversation.

The term "spamming" was first used on the Internet to refer to disruptive, repetitious messages on MUD games. Soon, it came to refer also to the flooding of Usenet newsgroups with junk messages. After a pair of lawyers (Canter & Siegel) started using bulk Usenet posting as a means of advertisement, the term came to include unauthorized commercial use of the noncommercial Usenet medium. Email spamming, and the use of the term, followed shortly. [1]

There are two popular (and incorrect) folk etymologies of the word "spam". The first, promulgated by spammers Canter & Siegel, is that "spamming" is what happens when one dumps a can of SPAM into a fan blade. The second is the acronym "shit posing as mail."

Hormel Foods, the makers of SPAM™ luncheon meat, do not object to the Internet use of the term "spamming." However, they do ask that the capitalized word "SPAM" be reserved to refer to their product and trademark. [2]

Email spamming

Among system administrators, the terms "unsolicited commercial email" and "unsolicited bulk email" (UCE and UBE) are used as more formal definitions of email spam. Larger ISPs such as America Online report that anywhere from one-third to two-thirds of their email server capacity is consumed by spam. Because this cost is imposed without the consent of either the site owners or the authorized users, many argue that email spamming is a form of theft of services.

Many email spammers send their UCE through open mail relays. The SMTP system, used to send email across the Internet, forwards mail from one server to another; mail servers that ISPs run commonly require some form of authentication that the user is a customer of that ISP. Open relays, however, do not properly check who is using the mail server and pass all mail to the destination address, making it quite a bit harder to track down spammers.

"Official" views on spamming can be found in RFC 2635.

The costs of spam

Spamming is sometimes called the electronic equivalent of junk postal mail. However, the printing and postage costs of junk mail are paid for by the sender -- in the case of spam, the recipient's mail site pays most of the costs, in terms of bandwidth, CPU processing time, and storage space. Spammers frequently use free dial-up accounts, so their costs may be quite minimal indeed. Because of this offloading of costs onto the recipient, many consider spamming to be theft or criminal conversion.

Because spamming is forbidden by ISPs, spammers frequently seek out and make use of vulnerable third-party systems such as open mail relays and open proxy servers. Spammers have also abused resources set up for purposes of anonymous speech online, such as anonymous remailers. As a result, many of these resources have been shut down, denying their utility to legitimate users.

Many users are bothered by spam because it impinges upon the amount of time they spend reading their email. Many also find the content of spam frequently offensive, in that pornography is one of the most frequently advertised products. Spammers send their spam largely indiscriminately, so pornographic ads may show up in a workplace email inbox -- or a child's. (The sending of pornography to children is illegal in many jurisdictions.)

Some spammers argue that most of these costs could potentially be alleviated by having spammers reimburse ISPs and individuals for their material. There are two problems with this logic: first, the rate of reimbursement they could credibly budget is unlikely to be nearly high enough to pay the cost; and second, the human cost (lost mail, lost time, and lost opportunities) is basically unrecoverable.

Philosophical questions

While spamming is considered offensive by many on the Internet, it is also important to consider whether it is desirable to enlist government involvement in policing the Internet. The greatest value of the Net has always been the free dissemination of information. Government involvement would create a precedent for official censorship. Is the cost imposed by spam so high that the Internet community needs governments to tell us what to send and receive?

Additionally, complaints about the costs of spam usually come from ISPs, which are often owned by multinational corporations (AOL Time-Warner and MSN, for instance). Is it possible that they are simply trying to protect their advantage in being able to reach large numbers of people from small businesses which may eventually impinge on their markets? Also, does the Internet community really want big business ISPs acting as censors of our mail when software on individual computers is getting better at filtering it for us?

As offensive as spam may be, is it in the best interest of the Internet community to side with one form of "Big Brother" (big business) in calling in another form of "Big Brother" (government) to reduce their financial burden and protect their business interests at the cost of imposing a form of censorship which, although its direct effects may be desirable, could open the door to further controls on our access to information down the road?

The answer, of course, is that spam itself hinders free speech. If you miss messages from people you know and companies you do business with because the message was lost in the spam, or your filters accidentally deleted that important message, then you don't have to wait for government censors... the spammers are already doing it for them.

We must always remember that the greatest costs of spam are the barriers it places between people. This leads to two important points to keep in mind. First, legislation aimed at combating spam must remain targeted at unsolicited broadcasting: any law that potentially makes any single message actionable will have as great a chilling effect on speech as spam itself does. Second, it is vital that legislation be targeted at the mechanism of spamming (bulk unsolicited email) rather than narrowly targeting some specific kind of speech.

Defense against spam

There are a number of services and software systems that mail sites and users can use to reduce the load of spam on their systems and mailboxes. Some of these depend upon rejecting email from Internet sites known or likely to send spam. Others rely on automatically analyzing the content of email messages and weeding out those which resemble spam. These two approaches are sometimes termed blocking and filtering.

Blocking and filtering each have their advocates and advantages. While both reduce the amount of spam delivered to users' mailboxes, blocking does much more to alleviate the bandwidth cost of spam, since spam can be rejected before the message is transmitted to the recipient's mail server. Filtering tends to be more thorough, since it can examine all the details of a message. Many modern spam filtering systems take advantage of machine learning techniques, which vastly improve their accuracy over manual methods. However, some people find filtering intrusive to privacy, and many mail administrators prefer blocking to deny access to their systems from sites tolerant of spammers.

DNSBLs

DNS-based Blackhole Lists, or DNSBLs, are a blocking technique, whereby a site publishes lists of IP addresses via the DNS, in such a way that mail servers can easily be set to reject mail from those addresses. There are literally scores of DNSBLs, each of which reflects different policies: some list sites known to emit spam; others list open mail relays or proxies; others, such as SPEWS, list ISPs known to support spam.
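As a rough illustration of how a mail server might consult such a list: DNSBLs are conventionally queried by reversing the octets of the connecting IPv4 address and looking the result up under the list's DNS zone, where the presence of an A record means "listed". The sketch below assumes that convention; the zone name `dnsbl.example.org` and the helper names are hypothetical and do not refer to any real list.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to look up for an IPv4 address in a DNSBL zone."""
    address = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(address).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str) -> bool:
    """Return True if the zone publishes an A record for the address.

    This performs a real DNS query, so it needs network access.
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        return False  # no record (or lookup failure): treated as not listed
    return True


# Hypothetical zone name, used only to show the shape of the query.
print(dnsbl_query_name("192.0.2.99", "dnsbl.example.org"))
```

A mail server applying this check would typically call something like `is_listed` on the connecting address before accepting the message, which is why blocking can reject spam before its body is ever transmitted.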

For history and details on DNSBLs, see DNSBL.

Heuristic and statistical filtering

Until recently, content filtering techniques relied on mail administrators specifying lists of words or regular expressions disallowed in mail messages. Thus, if a site receives spam advertising "herbal Viagra", the administrator might place these words in the filter configuration. The mail server would thence reject any message containing the phrase. The disadvantage of this static filtering is that it is difficult to maintain, and prone to false positives: it is always possible, even for an unlikely phrase, that non-spam mail will contain it.
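A minimal sketch of such a static filter, assuming a hand-maintained list of disallowed patterns (the list and the sample messages are invented for illustration), shows both the technique and its false-positive problem:

```python
import re

# Hand-maintained list of disallowed phrases (hypothetical example).
BLOCKED_PATTERNS = [re.compile(r"herbal\s+viagra", re.IGNORECASE)]


def is_rejected(body: str) -> bool:
    """Reject a message if any blocked pattern occurs anywhere in it."""
    return any(pattern.search(body) for pattern in BLOCKED_PATTERNS)


spam = "Buy HERBAL VIAGRA now!!!"
complaint = "Our filter keeps eating mail that mentions herbal Viagra; can you fix it?"
ordinary = "Lunch at noon?"

print(is_rejected(spam))       # True: the intended catch
print(is_rejected(complaint))  # True: a false positive on legitimate mail
print(is_rejected(ordinary))   # False
```

The second message is legitimate mail about the filter itself, yet it is rejected because the phrase appears in it.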

Heuristic filtering, such as is implemented in the program SpamAssassin, relies on assigning numerical scores to various phrases and patterns which may occur in messages. Scores may be positive numbers, indicating likeliness that patterns indicate spam; or negative, indicating legitimate mail. Each message is scanned for these patterns, and the applicable scores tallied up. If the total is above a fixed value, the message is rejected or flagged as spam. [3]
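The scoring idea can be sketched as follows. The rules, weights, and threshold here are invented for illustration and are not taken from SpamAssassin or any other real rule set.

```python
import re

# (pattern, score) pairs: positive scores suggest spam, negative suggest
# legitimate mail. All values are hypothetical.
RULES = [
    (re.compile(r"herbal\s+viagra", re.IGNORECASE), 3.0),
    (re.compile(r"click here", re.IGNORECASE), 1.5),
    (re.compile(r"!{3,}"), 1.0),
    (re.compile(r"\bmeeting agenda\b", re.IGNORECASE), -2.0),
]
THRESHOLD = 4.0  # messages scoring above this are flagged


def score(message: str) -> float:
    """Sum the scores of every rule whose pattern occurs in the message."""
    return sum(weight for pattern, weight in RULES if pattern.search(message))


def is_spam(message: str) -> bool:
    return score(message) > THRESHOLD


print(score("Herbal Viagra!!! Click here"))             # 5.5
print(is_spam("Herbal Viagra!!! Click here"))           # True
print(score("Meeting agenda: herbal viagra complaint"))  # 1.0
print(is_spam("Meeting agenda: herbal viagra complaint"))  # False
```

Unlike a static phrase list, a single matching phrase is not enough on its own to reject a message; the negative score on the second message offsets the suspicious phrase.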

However, heuristic filtering still relies on an administrator or maintainer to generate the list of scores. Statistical filtering, first proposed in 2002 by Paul Graham, uses probabilistic methods derived from Bayes' Theorem, to predict whether messages are spam or not -- based on collections of spam and nonspam ("ham") email submitted by users. A statistical filter is basically a kind of text classification system, and a number of machine learning researchers have turned their attention to the problem. [4]
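As a simplified sketch of the statistical approach, the following is a plain naive Bayes text classifier with add-one smoothing. It is a textbook variant rather than Graham's exact formula, and the class name, tokenizer, and training messages are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text: str) -> list[str]:
    return text.lower().split()


class NaiveBayesFilter:
    """Tiny naive Bayes classifier; needs at least one spam and one ham example."""

    def __init__(self) -> None:
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text: str, label: str) -> None:
        self.word_counts[label].update(tokenize(text))
        self.message_counts[label] += 1

    def spam_probability(self, text: str) -> float:
        vocabulary = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_messages = sum(self.message_counts.values())
        log_scores = {}
        for label in ("spam", "ham"):
            total_words = sum(self.word_counts[label].values())
            # Start from the prior: the fraction of training mail with this label.
            log_score = math.log(self.message_counts[label] / total_messages)
            for word in tokenize(text):
                count = self.word_counts[label][word]
                # Add-one smoothing keeps unseen words from zeroing the product.
                log_score += math.log((count + 1) / (total_words + len(vocabulary)))
            log_scores[label] = log_score
        # Convert the two log scores into a normalized probability of spam.
        top = max(log_scores.values())
        spam = math.exp(log_scores["spam"] - top)
        ham = math.exp(log_scores["ham"] - top)
        return spam / (spam + ham)


mail_filter = NaiveBayesFilter()
mail_filter.train("buy herbal viagra now", "spam")
mail_filter.train("cheap viagra click now", "spam")
mail_filter.train("meeting agenda for monday", "ham")
mail_filter.train("lunch on monday", "ham")

print(mail_filter.spam_probability("buy cheap viagra") > 0.5)  # True
print(mail_filter.spam_probability("monday meeting") > 0.5)    # False
```

The word weights here come entirely from the submitted examples, so no administrator has to assign scores by hand.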

Filtering software is available both for mail servers and for mail client programs.

Spam tips for users

Aside from installing client-side filtering software, end users can protect themselves from the brunt of spam's impact in numerous other ways.

Address munging

One way that spammers obtain email addresses to target is to trawl the Web and Usenet for strings which look like addresses. Thus, if one's address is never listed on these fora, they cannot find it. Posting anonymously, or with an entirely faked name and address, is one way to avoid this "address harvesting". Users who want to receive legitimate email regarding their posts or Web sites can alter their addresses in some way that humans can figure out but spammers haven't (yet). For instance, joe@example.net might post as joeNOS@PAM.example.net. This is called address munging, from the jargon word "munge" meaning to break.
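The munging scheme in the example above can be sketched in a few lines. The regular expression is only a stand-in for whatever pattern a harvester might use, and the function names are hypothetical.

```python
import re

# Naive address pattern, standing in for a harvester's scanner (an assumption).
ADDRESS_PATTERN = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def munge(address: str) -> str:
    """Split 'NOSPAM' around the @ sign: joe@example.net -> joeNOS@PAM.example.net."""
    local, domain = address.split("@", 1)
    return f"{local}NOS@PAM.{domain}"


def unmunge(address: str) -> str:
    """What a human correspondent does by hand: remove the NOSPAM marker."""
    return address.replace("NOS@PAM.", "@", 1)


post = f"Reply to {munge('joe@example.net')} (remove NOSPAM)."
harvested = ADDRESS_PATTERN.findall(post)

print(harvested)                      # ['joeNOS@PAM.example.net']
print("joe@example.net" in harvested)  # False: the real address was not collected
print(unmunge(harvested[0]))          # joe@example.net
```

The scanner still finds something that looks like an address, but not the real one. As the text notes, this only works until harvesters learn to undo the particular munging scheme.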

Address munging does not, however, evade so-called "dictionary attacks" in which the spammer generates a number of likely-to-exist addresses out of names and common words. For instance, if there is someone with the address adam@aol.com, it is likely that he gets a lot of spam.

Defeating Web bugs and JavaScript

Many modern mail programs incorporate Web browser functionality, such as the display of HTML and images. This can easily expose the user to pornographic or otherwise offensive images in spam. In addition, spam written in HTML can contain JavaScript programs to direct the user's Web browser to an advertised page, or to make the spam message difficult or impossible to close or delete. In some cases, spam messages have contained attacks upon security vulnerabilities in the HTML renderer, using these holes to install spyware. (Some computer viruses are borne by the same mechanisms.)

Users can defend against these methods by using mail clients which do not display HTML or attachments, or by configuring their clients not to display these by default.
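As an illustration of what "not displaying HTML by default" can mean in practice, the sketch below uses Python's standard `email` package to show only the plain-text part of a message and to refuse to render anything else. The helper name `safe_text`, the placeholder string, and the sample message are assumptions for this example.

```python
from email.message import EmailMessage


def safe_text(message: EmailMessage) -> str:
    """Return only the text/plain body; never hand HTML to a renderer."""
    plain = message.get_body(preferencelist=("plain",))
    if plain is None:
        return "[no plain-text part; HTML content not displayed]"
    return plain.get_content()


# A sample message with both a plain-text part and an HTML part that
# embeds a remote image (a "Web bug"). The URL is made up.
msg = EmailMessage()
msg["Subject"] = "Hello"
msg.set_content("Plain greeting.")
msg.add_alternative(
    '<html><body><img src="http://tracker.example/bug.gif">Hi</body></html>',
    subtype="html",
)

print(safe_text(msg).strip())  # Plain greeting.
```

Because the HTML part is never rendered, the remote image is never fetched and any embedded script never runs.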

Avoiding responding to spam

It is well established that some spammers regard responses to their messages -- even responses which say "Don't spam me" -- as confirmation that an email address refers validly to a reader. Likewise, many spam messages contain Web links or addresses which the user is directed to follow to be removed from the spammer's mailing list. In several cases, spam-fighters have tested these links and addresses and confirmed that they do not lead to the recipient address's removal -- if anything, they lead to more spam.

Reporting spam

The majority of ISPs explicitly forbid their users from spamming, and eject from their service users who are found to have spammed. Tracking down a spammer's ISP and reporting the offense often leads to the spammer's service being terminated. Unfortunately, it can be difficult to track down the spammer -- and while there are some online tools to assist, they are not always accurate.

Two such online tools are SpamCop and Network Abuse Clearinghouse. Both provide automated or semi-automated means to report spam to ISPs. Some spam-fighters regard them as inaccurate compared to what an expert in the email system can do; however, most email users are not experts.

Other forms of spam

Since the late 1990s, mail system administrators have taken many steps to crack down on spamming. Some of these have even been successful. As a result, those who want to send unsolicited advertisements over the Internet at others' expense have turned to a number of other media.

Instant messaging (IM) systems are a popular target for spammers. Many IM systems offer a directory of users, including demographic information such as age and sex. Advertisers can gather this information, sign on to the system, and send unsolicited messages. To combat this, some users choose to receive IMs only from people they already know.

In 2002, a number of spammers began using the Microsoft Windows Messenger service to get their message across. This isn't the same as the IM system "MSN Messenger"; rather, it is a function of Windows designed to allow servers to send alerts to administrator workstations. Messenger service spam appears as normal dialog boxes containing the spammer's message. It can be blocked using a firewall by closing down port 135.

Spamming of Usenet newsgroups actually pre-dates email spam. It is primarily used to advertise pornography.

Both email and other forms of spamming have been used for purposes other than advertisement. Many early Usenet spams were religious or political in nature. Serdar Argic, for instance, spammed Usenet with historical revisionist screeds. A number of evangelists have spammed Usenet and email media with preaching messages.

Particularly on Usenet, spamming has also been used as a denial of service tactic, specifically by overwhelming the readers of a newsgroup with an inordinate number of nonsense messages. Since these messages are usually forged (that is, sent falsely under regular posters' names) this tactic has come to be known as "sporgery" (from spam + forgery). This tactic has for instance been used by partisans of the Church of Scientology against the alt.religion.scientology newsgroup (see Scientology vs. the Internet) and by spammers against news.admin.net-abuse.email, a forum for mail administrators to discuss spam problems. Applied to email, this is termed mailbombing.


Alternate meanings

The term "spamming" is also used in the older sense of something repetitious and disruptive by players of first-person shooter computer games. In this sense it refers to "area denial" tactics -- repeatedly firing rockets or other explosive shells into an area.

MUD, MUSH, and MUCK players happily continue using the word in its original sense. When a player returns to the terminal after a brief break to find her screen filled with pages of random chat, that's still called "spam". [5]


Neither of these senses of the word implies that the "spamming" is abusive.


See also: electronic mailing list, netiquette, Serdar Argic, make money fast