IMDb
Type of site | Online movie, tv, and video game database |
---|---|
Owner | Amazon.com |
Commercial | Yes |
Registration | Optional |
The Internet Movie Database (IMDb) is an online database of information about actors, movies, television shows, television stars and video games. Owned by Amazon.com since 1998, the IMDb celebrated its fifteenth anniversary on October 17, 2005.
Overview
The IMDb website consists of the largest known single accumulation of data on individual films, television programs, direct-to-video product and videogames reaching back to their respective beginnings, and worldwide in scope. Wherever possible, the information goes beyond simple screen or press credits to include uncredited personnel and companies involved, either artistically or technically, in the production and distribution, thus aiming at completeness of detail. Furthermore, IMDb tracks projects in production, and even major, announced projects still in the developmental stage. Simultaneously, a collateral database of all persons identified in the product database exists, including biographical details and information about other aspects of their professional lives not covered by individual entries in the database (theatrical appearances, commercial advertising appearances, etc.). Information is largely provided by a cadre of volunteers with expertise in various areas of film history, with the actual staff largely used to screen and edit the voluminous amount of material submitted daily, and to track information from industry resources on current and planned projects and contemporary personalities only.
The IMDb also offers ancillary material such as daily movie and TV news, and running special features about various movie events such as the Academy Awards. IMDb also has an active message board system: there are message boards for each database entry, which can be found at the bottom of the relevant page, as well as general discussion boards on various topics.
IMDb is a free site, which requires only registration to access its complete range of data and activities. Any person with an e-mail account and a web browser that accepts cookies can set up an account with IMDb, then research covered product, submit information and engage in other site activities. (Site visitors not wishing to provide registration information can, however, search and view the database.) For automated queries, most of the database can be downloaded as (compressed) plain text files and the information can be extracted using the tools provided (typically using a command line interface). See: IMDb interfaces
In 2002 it also spun off a private, subscription-funded site, IMDbPro, offering the entire IMDb contents plus additional information for business professionals, such as personnel contact details, movie event calendars, and a greater range of industry news.
Statistics
- Titles: 781,111
- People: 2,070,885
See: IMDb Statistics
History
In rec.arts.movies
The database originated from two lists started as independent projects in early 1989 by participants in the Usenet newsgroup rec.arts.movies. In each case, a single maintainer recorded items emailed by newsgroup readers, and posted updated versions of his list from time to time.
It began with a posting titled "Those Eyes", on the subject of actresses with beautiful eyes. Hank Driskill began to collect a list of sexy actresses and what movies they had appeared in, and as the size of the repeated posting grew far beyond a normal newsgroup article, it soon became known simply as "THE LIST" [1].
The other project, started by Chuck Musciano, was briefly called the "Movie Ratings List" and soon became the "Movie Ratings Report". Musciano simply asked readers to rate movies on a scale of 1 to 10, and reported on the votes [2]. He soon began posting "ballots" with lists of movies for people to rate, so his list also grew quickly.
In 1990 Col Needham collated the two lists and produced a "Combined LIST & Movie Ratings Report" [3], and at this point the ball really started rolling. Needham soon found himself starting a (male) "Actors List", while Dave Knight began a "Directors List", and Andy Krieg took over THE LIST, which would later be renamed as the "Actress List". Both this and the Actors List had been restricted to people who were still alive and working, but retired people began to be added, and Needham also started what was then (but did not remain) a separate "Dead Actors/Actresses List". The goal now was to make the lists as inclusive as the maintainers could manage.
In late 1990, the lists included almost 10,000 movies and television series. On October 17, 1990, Needham posted a collection of Unix shell scripts which could be used to search the four lists, and the database that would become the IMDb was born. At the time, it was known as the "rec.arts.movies movie database".
On the Web
By 1993, the database had been expanded to include additional categories of filmmakers and other demographic material, as well as trivia, biographies, and plot summaries; the movie ratings had been properly integrated with the list data; and a centralised email interface for querying the database had been created. Later in the year, it moved onto the World Wide Web (a network in its infancy back then) under the name of Cardiff Internet Movie Database. The database resided on the servers of the computer science department of Cardiff University in Wales. Rob Hartill was the original web interface author. In 1994, the email interface was revised to accept the submission of all information, meaning that people no longer had to email the specific list maintainer with their updates. However, information received on a single film was still divided among multiple section managers, the sections being defined by categories of film personnel and the individual filmographies contained therein. Management also remained in the hands of a small contingent of underpaid or volunteer "section managers" who were receiving ever-growing quantities of information on films from around the world and across time, from contributors of widely varying levels of expertise and informational resources. Despite the annual claims of Needham, in a year-end report newsletter to the Top 50 contributors, that "fewer holes" must now remain for the coming year, the amount of information still missing from the database was vastly underestimated. Over the next few years, the database was run on a network of mirrors across the world with donated bandwidth.
As an independent company
In 1995, it became obvious to the principal site managers that the project had become too large to maintain merely through donations and in their spare time. The decision was made to become a commercial venture and in 1996, IMDb was incorporated in the United Kingdom, becoming the Internet Movie Database Ltd, with Col Needham the primary owner as well as identified figurehead. The remaining shareholders were the people maintaining the database. Revenue was generated through advertising, licensing and partnerships.
This state of affairs continued until 1998. The database was growing every day, and it was again reaching a critical point. Most revenues were being spent on equipment, and there was not enough money left over to pay full time salaries. The system was also suffering noticeable slowdowns both in accessing the site and in having new data posted. Offers were solicited and received from major businesses to purchase the database; however, the shareholders were unwilling to sell if it could not be guaranteed that the information would be accessible to the internet community for free.
As a subsidiary company
In 1998, Jeff Bezos, founder, owner and CEO of Amazon.com, struck a deal with Col Needham and other principal shareholders to buy IMDb outright and attach it to his corporate empire as a subsidiary, private company. This gave IMDb the ability to pay the shareholders salaries for their work, while Amazon.com would be able to use the IMDb as an advertising resource for selling DVDs and videotapes. Volunteer contributors were not advised in advance of even the possibility of IMDb - and their contributions along with it - being sold to a private business, which created some initial discord and defection of regulars. Promises to recompense all major contributors in some unspecified way for their prior services were issued by Col Needham in announcing the sale, but did not materialize.
IMDb continues to expand its functionality. In 2002, it added a subscription service known as IMDbPro aimed at entertainment professionals. It provides a variety of services including production and box office details, as well as a company directory. Most information contained in the IMDb database proper continues to come from volunteer researchers, whose only incentive, since 2003, is that if they are identified as being one of "the top 100 contributors" in terms of amounts of hard data submitted, they receive complimentary access to IMDbPro for the following calendar year.
The database
On 26 January 2006, the long-awaited "Full Episode Support" came online, meaning the database now supports separate cast and crew listings for every episode of every TV series. This was described by Col Needham as "the largest change we've ever made to our data model", and increased the number of titles in the database from 485,000 to nearly 750,000.
At present, the database entries for TV series are in a state of flux, as listings are migrated from series titles to individual episodes. The maintainers anticipate that it will take a couple of months for data to settle down and bugs to be ironed out.
Ancillary features
User voting
As one adjunct to data, the IMDb offers a facility for users to rate films by choosing one of ten categories in the range 1-10, with each user able to submit one vote. The points of reference given to users of these categories are the descriptions "1 (awful)" and "10 (excellent)"; and these are the only descriptions of categories. Current plans in development will also allow this rating to occur for television programming on an episode-by-episode basis.
In adopting this method, IMDb is following its widespread usage; the method is the same as rating in the range of a half star to five stars. When used in reviews by a single reviewer, the method has some basic utility, because the rating usually appears in the context of a qualitative appraisal of the film. The simplicity of this method makes it popular, but in terms of psychometric, statistical, and other criteria, the method suffers from fundamental flaws.
Flaws of the voting method
In using ratings to rank films and compute averages, the ratings are taken to represent individuals’ perceptions of the quality of films. With each user voting for a film only once in a category from 1 to 10, there is no means for evaluating internal reliability using a traditional index such as Cronbach's alpha or any other kind of reliability index. More than one question or comparison is required for any psychometric instrument or process in order to evaluate reliability. Following from this, it is also impossible to evaluate the degree of validity of the ratings as measures of viewer perceptions. Establishing validity would require firstly establishing reliability and, secondly, that the ratings represent what they are supposed to represent. There are many approaches to scaling and analysing data in order to evaluate the quality of measures, including methods of evaluating preferences. The use of a single ten-point categorization is simplistic, and fails to meet basic requirements for scaling preferences or perceptions.
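The reliability point can be illustrated with a minimal Python sketch of Cronbach's alpha. The function name and the sample scores are hypothetical and are not taken from IMDb; the sketch only shows that the index is defined when each respondent answers two or more items, and cannot be computed from a single vote per film.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for `items`: a list of k lists, one score per respondent.

    Hypothetical helper for illustration; every inner list must cover the
    same respondents in the same order.
    """
    k = len(items)
    if k < 2:
        # With one item the factor k / (k - 1) is undefined.
        raise ValueError("Cronbach's alpha needs at least two items")
    respondents = len(items[0])
    # Total score of each respondent across all items.
    totals = [sum(item[i] for item in items) for i in range(respondents)]
    total_variance = variance(totals)
    if total_variance == 0:
        raise ValueError("total scores do not vary between respondents")
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Two hypothetical questions answered by four respondents.
two_items = [[1, 2, 3, 4], [1, 2, 3, 4]]
print(cronbach_alpha(two_items))

# A single 1-10 vote per user corresponds to one item, so the index fails.
try:
    cronbach_alpha([[7, 9, 10, 4]])
except ValueError as error:
    print("single vote:", error)
```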
A basic problem is that the votes are a case of convenience sampling: they represent only the conglomeration of those inclined to vote, and these users are, in some cases overwhelmingly, fans of recent productions. In addition, users are able to obtain multiple IDs by registering from different emails, thus a single person can effectively vote numerous times for the same movie. This results in information being weighted more heavily for voters who cast multiple votes, and the reporting of incorrect numbers of 'IMDb users' who have voted for a film. IMDb employs filters which it claims mitigate the effects of such strategies, but it is impossible to gauge how effective these are because insufficient information is provided on the site.
Another basic problem is that different users may employ different criteria in making judgments, and may use the categories in different ways (e.g. some will rarely use the category 10 while others may use it frequently). This is a primary reason that the use of a single vote is inadequate for scaling film perception: it is not possible to ascertain the basis for voting decisions or to evaluate the consistency of decisions when only one decision is made and no explicit criterion is stated. For example, fans that most prefer crime films may tend to have different criteria in mind than fans that most prefer comedies.
The flaws described above are compounded when data are aggregated on IMDb to produce "averages" used for lists in which films are ranked, and therefore supposedly compared. User votes are at best ordinal categorizations. While it is not uncommon to calculate averages or means for such data, doing so cannot be justified, because calculating averages requires that equal intervals on the scale represent the same difference between levels of perceived quality.
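The problem with averaging ordinal categories can be shown with a small Python sketch. The two sets of votes are invented for illustration. Squaring the category labels keeps the categories in the same order but changes the spacing between them, and this alone is enough to reverse which film has the higher mean.

```python
from statistics import mean

# Hypothetical votes for two films on the 1-10 scale.
film_a = [6, 6, 6]
film_b = [3, 3, 10]


def relabel(vote):
    """Order-preserving relabelling of the ten categories (an assumption)."""
    return vote ** 2


mean_a = mean(film_a)
mean_b = mean(film_b)
relabelled_a = mean([relabel(v) for v in film_a])
relabelled_b = mean([relabel(v) for v in film_b])

# With the original labels film A ranks higher; after relabelling film B does.
print("original labels:  ", mean_a, mean_b)
print("relabelled scale: ", relabelled_a, relabelled_b)
```

If the categories are only ordered, both labellings are equally defensible, so a ranking that depends on the choice between them carries no information about perceived quality.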
For various reasons, therefore, the ratings have no scientific basis or validity; neither do they represent a single standard of determination or the views of the international community.
The use of filters and weights for individual films
IMDb indicates that submitted votes are filtered and weighted in various ways in order to produce a weighted mean that is displayed for each film, series, and so on. It states that filters are used to avoid 'vote stuffing', the method not being reported to avoid attempts to circumvent it. IMDb also states that their "scheme combines a number of well-known and proven statistical methods, including a trimmed mean to reduce extreme influences" [4]. While it is correct that trimmed means reduce extreme influences, it is somewhat misleading to describe this as a 'proven' statistical method: the key issue is whether the method is appropriate in a particular situation. It is appropriate when the information from the extremes is of questionable reliability and would unduly influence the mean, but it is not appropriate when the extremes provide important information. For example, a trimmed mean is inappropriate when a distribution genuinely has a mode at one extreme of the range of a scale, such as when the most common vote is 9 or 10.
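IMDb does not publish its filtering method, so the following Python sketch shows only a generic symmetric trimmed mean, not IMDb's actual scheme. The function name, the 20% trimming proportion and both vote lists are assumptions made for illustration.

```python
def trimmed_mean(votes, proportion):
    """Mean of `votes` after dropping `proportion` of them from each end."""
    if not votes:
        raise ValueError("no votes to average")
    if not 0 <= proportion < 0.5:
        raise ValueError("proportion must be at least 0 and below 0.5")
    ordered = sorted(votes)
    cut = int(len(ordered) * proportion)  # votes removed at each end
    kept = ordered[cut:len(ordered) - cut]
    return sum(kept) / len(kept)


# Hypothetical case 1: two suspicious votes of 1 among mid-range votes.
stuffed = [1, 1, 6, 7, 7, 7, 7, 7, 8, 8]
# Hypothetical case 2: the most common vote is genuinely 10.
top_heavy = [8, 8, 9, 9, 10, 10, 10, 10, 10, 10]

for name, votes in (("stuffed", stuffed), ("top_heavy", top_heavy)):
    plain = sum(votes) / len(votes)
    print(name, round(plain, 2), round(trimmed_mean(votes, 0.2), 2))
```

In the first case the trimming removes the two votes of 1 that pull the plain mean down. In the second case it discards two of the genuine votes of 10 along with the two votes of 8, which is the situation the paragraph above describes as inappropriate.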
Lists in which films are ranked
The IMDb Top 250 is supposedly a listing of the top 'rated' 250 films, based on votes by the registered users of the website using the methods described. Only theatrical releases running longer than 60 minutes with over 1300 votes are considered; all other titles are ineligible. Also, the 'top 250' rating is based only on the votes of "regular voters" (IMDb does not define this term). In addition to other weightings, the top 250 ranking is also based on a weighted rating formula referred to in actuarial science as a credibility formula. This label arises because a statistic is taken to be more credible the greater the number of individual pieces of information; in this case from eligible voters. The use of this formula means that as the number of votes becomes smaller, the weighted rating moves closer to the 'average' rating across all films. Conversely, as the number of votes increases, the weighted rating of a film moves closer to its average before the formula is applied. A limitation of this formula used in producing the top 250 is that it weights an 'average' rating more highly for sheer numbers of votes, irrespective of the criteria brought to bear by different voters. An example of the benefit of using the formula, however, is that it mitigates the effects of extreme votes by small but extreme groups of fans.
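The behaviour described above can be sketched in Python using the common form of a credibility formula, in which a film's own mean is blended with the mean across all films. This form, the function and parameter names, and the overall mean of 7.0 are assumptions for illustration; the 1300-vote threshold is reused as the weighting constant only because it is the minimum mentioned above.

```python
def weighted_rating(film_mean, votes, min_votes, overall_mean):
    """Credibility-style blend of a film's mean with the mean of all films.

    weight = votes / (votes + min_votes) rises towards 1 as votes increase,
    so the result moves from `overall_mean` towards `film_mean`.
    """
    weight = votes / (votes + min_votes)
    return weight * film_mean + (1 - weight) * overall_mean


MIN_VOTES = 1300     # threshold mentioned in the text, used as an assumption
OVERALL_MEAN = 7.0   # hypothetical mean rating across all films

# The same raw mean of 9.0 backed by increasing numbers of votes.
for votes in (1300, 13000, 130000):
    print(votes, round(weighted_rating(9.0, votes, MIN_VOTES, OVERALL_MEAN), 3))
```

With exactly 1300 votes the two means receive equal weight, and as the vote count grows the weighted rating approaches the film's own mean of 9.0.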
The IMDb also has a Bottom 100 feature which is assembled in the same way. A disproportionate number of "Bottom 100" films were featured on Mystery Science Theater 3000, as a result of an MST3K website encouraging all its users to register with IMDb and vote "1" on films featured on the show, during IMDb's early years.
The top 250 list comprises a wide range of films, including major releases, cult films, independent films, critically acclaimed films, silent films and foreign films. Nevertheless, there is no substantive basis for compiling such lists of rankings for the reasons outlined. In summary, the key problems with the lists are that:
- Averages should not be calculated for data of the kind collected.
- It is impossible to evaluate the reliability or validity of user ratings.
- Films are not compared with respect to explicit, let alone common, criteria.
- Only users inclined to vote for a film do so and individuals can vote multiple times as different 'users'.
Message boards
One of the most used features of the Internet Movie Database is the Message Boards that accompany every database entry, along with 47 Main Boards. These boards allow registered users to share, discuss and debate information about the movie/actor/writer. They were not originally part of the IMDb, but were added only after its purchase by Amazon.com, some time in the year 2000.
The Main Boards are wide discussion forums that pertain to certain aspects of film discussion. They divide into the categories Trivia! Trivia! (various aspects of detailed film minutiae), Awards Season (various movie awards winners and nominees), FilmTalk (talk about film in general and specific films), TV Talk (television shows, new and old), Shop Talk (film professions), Genre Zone (a number of established movie genres), Around the World (global cinema), Star Talk (celebrities and film professionals), General Boards (miscellaneous and non-film-related topics), Video Games (talk about games consoles and video games in general), and IMDb Help (anything pertaining directly to the site itself). As the IMDb expires older posts from all message boards at varying rates, it is difficult to measure traffic precisely for individual boards, but The Sandbox and The Soapbox are amongst the highest-traffic boards on IMDb. The Soapbox is a general purpose discussion board, where users can go for "their more heated discussions". The Sandbox is a general purpose, anything-goes board designated for test messages and off-topic posts.
Over the last 5 years the George W. Bush, Michael Jackson and Soapbox message boards (and, to a lesser extent, the Fahrenheit 9/11 and The Passion of the Christ message boards and other message boards for political and religious personas) have been major targets for heated debate, ranting and trolling.
Registered users:
- 8,630,000 (January 1, 2006)
- 10,000,000 (March 31, 2006)
The IMDb home page states: "Visited by over 35 million movie lovers each month!"
Copyright issues
All volunteers who contribute content to the database retain copyright to their contributions but grant full rights to copy, modify, and sublicense the content to IMDb. IMDb in turn does not allow others to use movie summaries or actor biographies without written permission. Using filtering software to avoid the display of advertisements from the site is also explicitly forbidden. Only small subsets of filmographies are allowed to be quoted, and only on non-commercial websites.
Criticisms
Despite its popularity, IMDb still has its share of critics. Some of the more common complaints leveled against the site include:
- The ability of software to filter content is limited so all content has to be manually approved. This leads to backlogs.
- Only 17 staff members are actively involved in validating and processing the hundreds of thousands of lines of presumed information contributed each month. [5] (Per Col Needham, in a post on an earlier version of the IMDb message boards, now deleted, only designated Section Managers validate and process data for their designated sections. From the staff list here referenced, these would be Bailey, Bernhardt, Cairella, Hafner, Hawker, Heidelbach, Higgins, Kazarian, Leonard, Norris, Reeves, Simeon, Smith, Stevenson, Tinto, Vaughan and Vernon).
- Staff members gauge the validity of contributed data based on the past reliability of the contributor, as none of them is an expert in all of the widely varied areas of film history. Given the volume of submissions and the number of volunteers who submit information, errors are common.
- Submissions of product data are processed by categories of personnel contained in the submission, meaning the data for any one film is broken up into several components and examined independently of the other components, then reassembled without checking the continuity of the whole, which may be further disrupted if one manager's section(s) is/are backlogged, an unfortunately regular occurrence at IMDb.
- Submission policies have become more rigid over the years, and approval of new titles to be added has become more cautious, but errors still occur while the added restrictions have made it more difficult to add information to the database or correct mistakes.
- Furthermore, IMDb retains the right to decide what to publish and what not to publish in such categories as a film's trivia, goofs, celebrity information, etc., regardless of how true it is. It is common for an item to be published one day, only to be removed the next; in other cases, it can be difficult to get a demonstrably untrue piece of information removed.
- There is a severe lack of ongoing moderation for its message boards. Irrelevant, attacking or obscene messages, and general trolling, have increasingly plagued the boards of films and personalities, not to mention the board set up by IMDb for the explicit purpose of being an outlet for screed ("The Soapbox"), which seems only to attract more flame-warriors and trolls to the site. Although offensive messages can be reported, their removal is very slow, and each user can file only one report per offending poster (under a revised system introduced in late 2005, replacing a system which allowed multiple abuse reports against an offender). Users have been given an "ignore this poster" option as a sop: the function hides the messages from the user who has placed someone on his/her ignore list, but the offending poster's contributions remain live and visible to anyone who does not have them in "ignore" status. The messages that do get deleted are often the least obscene, or not obscene at all.
- People have complained that posts on the message boards and accounts are being deleted without any reason, which has given rise to a theory that trolls have found a way to delete other people's posts and/or accounts. The board administrators' comment on this is: "the problem they describe simply does not exist. The only people who can disable accounts are IMDb staff members on the boards administration team." [6] Another theory for these deletions is that the administrators don't take enough time investigating reports of abuse and delete posts or accounts without first checking if they're breaking IMDb's terms and conditions for the message board. The administrators deny this as well.
- The "Mini biography" section (see a full view by clicking the "(show more)" link) on each actor's homepage has information which is often very uneven and out-of-date, and in many cases shoddy and completely non-verified. The information regarding the most popular and established performers is often (but not always) correct, with the quality and veracity of the data of the supporting and less well-known players often very rough and quite unreliable.
- Reading the forum posts requires registration.
- Registering requires filling in a form with some personal information, such as ZIP code, year of birth and sex.
See also
- Movie Tome
- All Movie Guide
- Internet Broadway Database
- Internet Book Database
- Fictional film
- Films that have been considered the greatest ever
- Films considered the worst ever
- Rotten Tomatoes
- Internet Movie Cars Database
- Internet Adult Film Database
External links
- The Internet Movie Database—including a copyright statement, license terms, and database statistics
- IMDb's UK mirror
- IMDb's history of itself
- IMDb general message boards
- IMDb's French site with French alternate titles
- IMDb's German site with German alternate titles
- IMDb's Italian site with Italian alternate titles
- IMDb's Spanish site with Spanish alternate titles
- IMDb's Portuguese site with Portuguese alternate titles
- IMDb's AKA site listing all AKA (i.e. also known as) titles
- The Complete IMDb in TomeRaider 3 Format
- "Do You IMDb?" August 2004 article from L.A. Weekly
- Most linked to IMDb entries
- Movie Blue Book database of films available for license and distribution
The IMDb's newsgroup origins
- ^ Unfortunately, Google Groups coverage of rec.arts.movies is incomplete during the relevant time period, with a 6-month gap in late 1988 and early 1989 and a number of missing articles after that. This posting, with almost 1000 entries, is the earliest version of THE LIST that is preserved. This response to an item in the newsgroup's FAQ list tells the then-recent story of the list's origin.
- ^ Chuck Musciano's first posting proposing the movie ratings report is also missing, but here are his first call for votes and his first ratings report.
- ^ Needham's first combined LIST and ratings report. His first posting of the database scripts is not available.
- A 1994 FAQ list for the database. Section 8 tells its early history in a less POV, Needham-o-centric manner than the IMDb link above.