Web beacon

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Midwestmich99 (talk | contribs) at 18:18, 17 November 2019 (Added hyperlink and fixed grammar). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

A web beacon (also called web bug, tracking bug, tag, web tag, page tag, tracking pixel, pixel tag, 1×1 GIF, or clear GIF) is one of various techniques used on web pages and in email to check unobtrusively (usually invisibly) that a user has accessed some content.[1] Web beacons are typically used by third parties to monitor the activity of users at a website for the purpose of web analytics or page tagging.[2] They can also be used for email tracking.[3] When implemented using JavaScript, they may be called JavaScript tags.[4]

Using such beacons, companies and organizations can track the online behavior of web users. At first, the companies doing such tracking were mainly advertisers or web analytics companies; later, social media sites also started to use such tracking techniques, for instance through buttons that act as tracking beacons.

There is work in progress to standardize an interface that web developers can use to create web beacons.[5]

Overview

A web beacon is any of a number of techniques used to track who is visiting a web page. Web beacons can also be used to see whether an email was read or forwarded, or whether a web page was copied to another website.

The first web beacons were small digital image files that were embedded in a web page or email. The image could be as small as a single pixel, and could be of the same color as the background, or completely transparent (thus the name “tracking pixel”). When a user opens the page or email where such an image was embedded, they might not see the image, but their web browser or email reader would automatically download the image, requiring the user's computer to send a request to the host company's server, where the source image was stored. This request would provide identifying information about the computer, allowing the host to keep track of the user.
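The embedding step described above can be sketched in a few lines of Python. This is only an illustration: the function name `pixel_tag`, the host `tracker.example`, and the query parameter `id` are assumptions made for the example, not part of any real tracking service.

```python
from html import escape
from urllib.parse import urlencode


def pixel_tag(beacon_url: str, token: str) -> str:
    """Return an <img> tag for a 1x1 image hosted at beacon_url.

    The token is carried in the query string, so the request that fetches
    the image also tells the host which page or message was opened.
    The parameter name "id" is a hypothetical choice for this sketch.
    """
    src = f"{beacon_url}?{urlencode({'id': token})}"
    # Escape the URL so it is safe inside a double-quoted HTML attribute.
    return f'<img src="{escape(src, quote=True)}" width="1" height="1" alt="">'


print(pixel_tag("https://tracker.example/p.gif", "newsletter-42"))
```

Any browser or email reader that renders this tag and loads remote images will request the image from the host named in the `src` attribute.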

This basic technique has been developed further so that all sorts of elements can be used as beacons. Currently these can include visible elements such as graphics, banners or buttons, but also non-pictorial HTML elements such as the frame, style, script, input, link, embed, object, etc., of an email or web page.

The identifying information provided by the user's computer typically includes its IP address, the time the request was made, the type of web browser or email reader that made the request, and the existence of cookies previously sent by the host server. The host server can store all of this information, and associate it with a session identifier or tracking token that uniquely marks the interaction.
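As a rough sketch of what a beacon host could record from a single request, the following Python function collects the four items listed above. The function name `describe_request` and the shape of its arguments are assumptions for illustration; a real server would obtain the client address and headers from its web framework.

```python
from datetime import datetime, timezone
from http.cookies import SimpleCookie


def describe_request(client_ip, headers, when=None):
    """Summarize the identifying information carried by one beacon request.

    headers is assumed to be a plain dict using the canonical header
    spellings "User-Agent" and "Cookie" (a simplification for this sketch).
    """
    when = when or datetime.now(timezone.utc)
    cookie = SimpleCookie()
    cookie.load(headers.get("Cookie", ""))
    return {
        "ip": client_ip,                               # address the request came from
        "time": when.isoformat(),                      # when the request was made
        "user_agent": headers.get("User-Agent", ""),   # browser or email reader
        "cookies": {name: morsel.value for name, morsel in cookie.items()},
    }


record = describe_request(
    "203.0.113.7",
    {"User-Agent": "ExampleMail/1.0", "Cookie": "uid=42"},
)
print(record)
```

Storing such records under a session identifier or tracking token is what lets the host link separate requests to the same interaction.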

Framing

The use of framing added a new level of versatility to web beacons. Framing allows web pages to refer to content such as images or buttons or HTML elements that are located on other servers, rather than hosting this content directly on their own server. When a user sees the email or the web page, the user's email reader or web browser prepares the referred content for display. To do so it has to send a request to the third-party server to ask it to send the referred content. As part of that request, the user's computer then has to supply identifying information to the third-party server.

This arrangement allows companies to embed beacons in content that they do not directly own or operate, and then use such beacons for tracking purposes. The beacons are embedded in an email or web page as images or buttons or other HTML elements, but they are hosted on a different server than the website where they are embedded, and it is to this third-party server that requests and identifying information are sent.

For instance, in the case of an advertisement that is displayed as an image on a web page, the image file would not reside on the page's host server, but on a server belonging to the advertising company. When a user opens the page, the user's computer will request to download the advertisement from the page's server, but will then be referred to the advertiser's server, and will request to download the image from the advertiser's server. This request will require the user's computer to supply identifying information about itself to the advertiser.

This means that a third-party site, such as an advertiser, can gather information about visitors to a main site, such as a news site or a social media site, even if users are not clicking on the advertisement. Moreover, given that beacons are not just embedded in visible advertisements but can be embedded in completely invisible elements, a third party can gather such information even if the user is completely unaware of the third party's existence.

Use by companies

Once a company can identify a particular user, the company can then track that user's behavior across multiple interactions with different websites or web servers. As an example, consider a company that owns a network of websites. This company could store all of its images on one particular server, but store the other contents of its web pages on a variety of other servers. For instance, each server could be specific to a given website, and could even be located in a different city. But the company could use web beacons to count and recognize individual users who visit the different websites. Rather than gathering statistics and managing cookies for each server independently, the company can analyze all this data together, and track the behavior of individual users across all the different websites, assembling a profile of each user as they navigate these different environments.
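The pooling of data across websites can be illustrated with a small Python sketch. It assumes, hypothetically, that every beacon request has already been reduced to a tuple of user identifier (for example a cookie value), site, and path; the name `build_profiles` is likewise invented for the example.

```python
from collections import defaultdict


def build_profiles(hits):
    """Group beacon hits from many sites by the identifier they share.

    hits is an iterable of (user_id, site, path) tuples, in the order
    the central image server received them.
    """
    profiles = defaultdict(list)
    for user_id, site, path in hits:
        profiles[user_id].append((site, path))
    return dict(profiles)


hits = [
    ("uid-1", "news.example", "/politics"),
    ("uid-2", "shop.example", "/shoes"),
    ("uid-1", "shop.example", "/cart"),
]
for user_id, visits in build_profiles(hits).items():
    print(user_id, visits)
```

Because all sites load their images from the same server, a single identifier is enough to join visits that would otherwise sit in separate per-site logs.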

Email tracking

Web beacons embedded in emails have greater privacy implications than beacons embedded in web pages. Through the use of an embedded beacon, the sender of an email (or even a third party) can record the same sort of information as an advertiser on a website: the time that the email was read, the IP address of the computer that was used to read the email (or the IP address of the proxy server that the reader went through), the type of software used to read the email, and the existence of any cookies previously sent. In this way, the sender or a third party can gather detailed information about when and where each particular recipient reads the email. Each subsequent time the email message is displayed, the same information can be sent again to the sender or third party.

"Return-Receipt-To" (RRT) email headers can also trigger sending of information and these may be seen as another form of web beacon.[6]

Web beacons are used by email marketers, spammers and phishers to verify that an email is read. Using this system, they can send similar emails to a large number of addresses and then check which ones are valid. Valid in this case means that the address is actually in use, that the email has made it past spam filters, and that the content of the email is actually viewed.
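The bookkeeping behind this kind of open tracking is simple, as the following Python sketch suggests. It assumes that each outgoing message carries a beacon URL containing a token unique to its recipient; the function names `assign_tokens` and `opened_addresses` are hypothetical.

```python
import secrets


def assign_tokens(addresses):
    """Map one unguessable token to each recipient address."""
    return {secrets.token_urlsafe(16): address for address in addresses}


def opened_addresses(token_to_address, requested_tokens):
    """Return the addresses whose beacon was requested at least once.

    Tokens that were never issued (for example, guesses or typos) are ignored.
    """
    return {
        token_to_address[token]
        for token in requested_tokens
        if token in token_to_address
    }


tokens = assign_tokens(["a@example.org", "b@example.org"])
first_token = next(iter(tokens))
print(opened_addresses(tokens, [first_token, "not-a-real-token"]))
```

A request for a given token shows only that some client fetched the beacon; it is the one-token-per-address mapping that turns that request into evidence about a specific mailbox.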

To some extent, this kind of email tracking can be prevented by configuring the email reader software to avoid accessing remote images. Email software able to do this includes the Gmail, Yahoo!, Hushmail and SpamCop/Horde webmail clients, as well as the Mozilla Thunderbird, Opera, Pegasus Mail, IncrediMail, Apple Mail and KMail mail readers and later versions of Microsoft Outlook.

However, since beacons can be embedded in email as non-pictorial elements, the email need not contain an image or advertisement or anything else related to the identity of the monitoring party. This makes detection of such emails difficult.[7]

One way to neutralize such email tracking is to disconnect from the Internet after downloading email but before reading the downloaded messages. (Note that this assumes one is using an email reader that resides on one's own computer and downloads the emails from the email server to one's own computer.) In that case, messages containing beacons will not be able to trigger requests to the beacons' host servers, and the tracking will be prevented. But one would then have to delete any messages suspected of containing beacons, or risk having the beacons activate again once the computer is reconnected to the Internet.

The most reliable way to avoid email tracking by beacons is to use a text-based email reader (such as Pine or Mutt), or a graphical email reader with purely text-based HTML capabilities (such as Mulberry). These email readers do not interpret HTML or display images, so their users are not subject to tracking by email web beacons. Plain-text email messages cannot contain web beacons because their contents are interpreted as display characters instead of embedded HTML code, so opening such messages does not initiate any communication.

Some email readers offer the option to disable all HTML in every message (thus rendering all messages as plain text), and this too will prevent tracking beacons from working.

More recently, many email readers and web-based email services have moved towards not loading images when opening a hypertext email that comes from an unknown sender, or that is suspected to be spam email. The user must explicitly choose to load images. However, beacons can also be embedded in non-pictorial elements of a hypertext email.
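To illustrate why blocking images alone is not sufficient, the Python sketch below scans an HTML message body for any element that would cause an automatic request to a remote server. It is a simplified illustration, not a complete filter: the attribute list is an assumption chosen for the example, and references inside CSS (such as `url(...)` in a style attribute) are not examined.

```python
from html.parser import HTMLParser

# Attributes that typically make a client fetch a resource without a click.
# This list is an assumption for the sketch and is not exhaustive.
FETCHED_ATTRS = {"src", "background", "data", "poster"}
REMOTE_PREFIXES = ("http://", "https://", "//")


class RemoteReferenceFinder(HTMLParser):
    """Collect (tag, url) pairs for elements that load remote content."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if value is None:
                continue
            # <a href> is only followed on a click, but <link href> is fetched.
            automatic = name in FETCHED_ATTRS or (tag == "link" and name == "href")
            if automatic and value.strip().lower().startswith(REMOTE_PREFIXES):
                self.found.append((tag, value))


def remote_references(html_body):
    finder = RemoteReferenceFinder()
    finder.feed(html_body)
    finder.close()
    return finder.found


message = (
    '<p>Hello</p>'
    '<img src="https://tracker.example/p.gif?id=1" width="1" height="1">'
    '<a href="https://example.org/">read more</a>'
    '<link rel="stylesheet" href="http://tracker.example/s.css">'
    '<img src="cid:logo">'
)
print(remote_references(message))
```

In this example the stylesheet link is reported alongside the image, while the ordinary hyperlink and the attachment reference (`cid:`) are not, since neither triggers a request to a remote server when the message is displayed.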

Web beacons can also be filtered out at the server level so that they never reach the end user. MailScanner is an example of gateway software that can neutralize email tracking beacons for all users of a particular server.

The Beacon API

The Beacon API (application programming interface) is a candidate recommendation of the World Wide Web Consortium, the standards organization for the web.[8] It is a standardized interface designed to allow web developers to track the activity of users without slowing down website response times. It does this by queuing tracking information for asynchronous delivery to the beacon's host server, so that the data can be sent even after the user has navigated away from the webpage.[9]

Use of the Beacon API allows tracking without interfering with or delaying navigation away from the site, and is invisible to the end-user.[10] Support for the Beacon API was introduced into Mozilla's Firefox browser in February 2014[11] and in Google's Chrome browser in November 2014.[12]

References

  1. ^ Stefanie Olsen (January 2, 2002). "Nearly undetectable tracking device raises concern". CNET News. Retrieved May 23, 2019.
  2. ^ Richard M. Smith (November 11, 1999). "The Web Bug FAQ". EFF.org Privacy Archive. Retrieved July 12, 2012.
  3. ^ Richard Lowe Jr and Claudia Arevalo-Lowe. "Email web bug invisible tracker collects info without permission". mailsbroadcast.com. Retrieved August 22, 2016.
  4. ^ Negrino, Tom; Smith, Dori. JavaScript para World Wide Web. Pearson Education, 2001. https://napoleon.bc.edu/ojs/index.php/ital/article/viewFile/1771/1677 accessed 1 October 2015
  5. ^ Jatinder Mann; Alois Reitbauer (April 13, 2017). "Beacon". W3C Candidate Recommendation. W3C. Retrieved November 7, 2019.
  6. ^ See Internet Engineering Task Force memorandum RFC 4021.
  7. ^ David Berlind (September 26, 2006). "Have you received any "traceable" PattyMail recently?". ZDNet. Retrieved July 12, 2012.
  8. ^ Beacon W3C Candidate Recommendation 13 April 2017
  9. ^ Introduction to the Beacon API - Sitepoint.com, January 2015
  10. ^ Squeezing the Most Into the New W3C Beacon API - NikCodes, 16 December 2014
  11. ^ Navigator.sendBeacon - Mozilla Developer Network
  12. ^ Send beacon data in Chrome 39 - developers.google.com, September 2015