Wikipedia:Citation overkill

Wikipedia policy requires all content within articles to be verifiable. While adding inline citations is helpful, adding too many can cause citation clutter, making articles look untidy in read mode and difficult to navigate in markup edit mode. If a page features citations that are mirror pages of others, or which simply parrot the other sources, they contribute nothing to the article's reliability and are detrimental to its readability.

One cause of "citation overkill" is edit warring, which can lead to examples like "Graphism is the study[1][2][3][4][5][6][7][8][9][10][11][12][13][14][15] of ...". Extreme cases have seen fifteen or more footnotes after a single word, as an editor tries to strengthen their point or the overall notability of the subject with extra citations, in the hope that others will accept that reliable sources support it. Similar circumstances can also lead to overkill with legitimate sources, when existing sources have been repeatedly removed or disputed on spurious grounds or against consensus.

Another common cause of citation overkill is simply that people want the source they've seen to be included in the article too, so they just tack it onto the end of existing content without making an effort to actually add any new content.

The purpose of any article is first and foremost to be read – unreadable articles do not give our readers any material worth verifying. It is also important for an article to be verifiable. Without citations, we cannot know that the material isn't just made up, unless it is a case of common sense (see WP:BLUE). A good rule of thumb is to provide at least one inline citation for each section of text that has been challenged or is likely to be challenged, and for direct quotations. Two or three may be preferred for more controversial material or as a way of preventing linkrot for online sources, but more than three should generally be avoided; if four or more are needed, consider bundling (merging) the citations.

Citation overkill not only harms the readability of an article; it can also lead editors to question the notability of the subject. A well-meaning editor may attempt to make a subject which does not meet Wikipedia's notability guidelines appear to be notable through sheer quantity of sources, without actually paying any attention to the quality of the sources. Ironically, this serves as a red flag to experienced editors that the article needs scrutiny and that each citation needs to be verified carefully to ensure that it was really used to contribute to the article.

Misuse to prove an obvious point

It is possible that an editor who is trying to promote an article to GA-class (good article status) might add citations to basic facts such as "...the sky is blue..."[6]. While this might be a good thing in their eyes, the fact that the sky is blue does not usually require a citation. In all cases, editors should use common sense. In particular, remember that Wikipedia is not a dictionary and we do not need citations for the meanings of everyday words and phrases.

Notability bomb

A common form of citation overkill is adding sources to an article without regard as to whether they support substantive or noteworthy content about the topic. This may boost the number of footnotes and create a superficial appearance of notability, which can obscure a lack of substantive, reliable, and relevant information. This phenomenon is especially common in articles about people and organizations.

Examples of this type of citation overkill include:

  • Citations lacking significant coverage – Citations that briefly namecheck the fact that the subject exists, but are not actually about the subject to any non-trivial degree.
    Example: A source that quotes the subject giving a brief soundbite to a reporter in an article about something or someone else.
  • Citations that verify random facts – Citations that don't even namecheck the subject at all, but are present solely to verify a fact that's entirely tangential to the topic's own notability or lack thereof.
    Example: A statement of where the person was born referenced to a source that verifies only that the named town exists; a statement about a charitable organization is sourced to a source that talks about the subject the organization is interested in, e.g., hunger or homelessness or art, but does not mention this charity at all.
  • Citations to work that the article's subject produced – A series of citations that Gish gallop their way through a rapid-fire list of content that doesn't help to establish notability.
    Example: An article about an author sourced to works they have published; an article about an artist sourced to songs that they released.
  • Citations that name-drop reliable sources – Citations that are added only to make it seem that 'this topic was covered by X', rather than to actually support any substantive content about the topic.
    Example: A citation to a source that is cited to support a statement in the Wikipedia article that merely says "The Times published an article about them" or "Chris Celebrity was interviewed by Big Show", instead of supporting any encyclopedic content about anything stated in that source, such as "In 2019, The Times said they were at high risk for bankruptcy".

Some people might try to rest notability on a handful of sources that do not contribute, while other people might try to build the pile of sources up into the dozens or even hundreds instead – so this type of citation overkill may require special attention. Either way, the principle is the same: Sources support notability based on what they say about the topic, not just the number of footnotes present. An article with just four or five really good sources is considered better referenced than an article that cites 500 bad ones.

Overloading an article with bad citations can backfire if the article is nominated for deletion. Participants may not want to look at all one hundred citations, and they may instead choose to look at just a smaller sample. If they find only unreliable sources or sources that do not discuss the subject in depth, they could recommend deletion. The good sources could be missed.

Draft articles with excessive citations are likely to be ignored by volunteer reviewers in the articles for creation (AfC) process, contributing to the backlog and resulting in a delay of several months before the draft is reviewed, usually only to be declined.

Needless repetition

Material that is repeated multiple times in an article does not require an inline citation for every mention. If you say an elephant is a mammal more than once, provide a citation only at the first instance.

Avoid cluttering text with redundant citations like this:

Elephants are large[1] land[2] mammals[3] ... Elephants' teeth[4] are very different[4] from those of most other mammals.[3][4] Unlike most mammals,[3] which grow baby teeth[5] and then replace them with a permanent set of adult teeth,[4] elephants have cycles of tooth[5] rotation throughout their entire[6] lives.[4]

1. Expert, Alice. (2010) Size of elephants: large.
2. Smith, Bob. (2009) Land-based animals, Chapter 2: The Elephant.
3. Christenson, Chris. (2010) An exhausting list of mammals.
4. Maizy, Daisy. (2009) All about the elephants' teeth, p. 23–29
5. Reporter, Rae. (2012) Yes, Elephants Still Have Teeth.
6. Portant, I.M. (2015) "Analysis of Tooth Presence during Elephant Lifespan". J. Imp.
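
One way the cluttered passage above might be trimmed is sketched below. It assumes, purely for illustration, that Christenson's list is enough to support the opening statement and that Maizy's pages cover all of the dental details; the reference names "Christenson" and "Maizy" and the italic title formatting are likewise hypothetical choices, not part of the original example.

```wikitext
<!-- Assumption for illustration: one source per passage is sufficient here -->
Elephants are large land mammals ...<ref name="Christenson">Christenson, Chris. (2010) ''An exhausting list of mammals''.</ref> Elephants' teeth are very different from those of most other mammals. Unlike most mammals, which grow baby teeth and then replace them with a permanent set of adult teeth, elephants have cycles of tooth rotation throughout their entire lives.<ref name="Maizy">Maizy, Daisy. (2009) ''All about the elephants' teeth'', pp. 23–29.</ref>
```

Which sources to keep depends on what each one actually says; the point of the sketch is only that each passage ends with the citation that supports it.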

In addition, as per WP:PAIC, citations should be placed at the end of the passage that they support. If one source alone supports consecutive sentences in the same paragraph, one citation of it at the end of the final sentence is sufficient. It is not necessary to include a citation for each individual consecutive sentence, as this is overkill. This does not apply to lists or tables, nor does it apply when multiple sources support different parts of a paragraph or passage.

This is correct:

In the first collected volume, Marder explains that his work is "about the affinity of life", wherein the characters "understand that ultimately they depend on each other for survival". Wiater and Bissette see this relationship as a wider metaphor for the interdependency of the comics industry. Indeed, addressing the potential underlying complexity, Marder suggests that "it's harder to describe it than it is to read it". He also calls it "an ecological romance ... a self-contained fairy tale about a group of beings who live in the center of their perfect world [and are] obsessed with maintaining its food chain", a self-described "really low concept!" Equally, he says, "the reader has to invest a certain amount of mental energy to follow the book", which includes "maps and a rather long glossary". Despite these potentially conflicting comments, Wiater and Bissette reiterate that "there is no simpler or more iconographic comic book in existence".<ref name="Rebels">[[Stanley Wiater|Wiater, Stanley]] and [[Stephen R. Bissette|Bissette, Stephen R.]] (eds.) "Larry Marder Building Bridges" in ''Comic Book Rebels: Conversations with the Creators of the New Comics'' (Donald I. Fine, Inc. 1993) ISBN 1-55611-355-2 pp. 17–27</ref>

This is also correct, but is an example of overkill:

In the first collected volume, Marder explains that his work is "about the affinity of life", wherein the characters "understand that ultimately they depend on each other for survival".<ref name="Rebels" /> Wiater and Bissette see this relationship as a wider metaphor for the interdependency of the comics industry.<ref name="Rebels" /> Indeed, addressing the potential underlying complexity, Marder suggests that "it's harder to describe it than it is to read it".<ref name="Rebels" /> He also calls it "an ecological romance ... a self-contained fairy tale about a group of beings who live in the center of their perfect world [and are] obsessed with maintaining its food chain", a self-described "really low concept!"<ref name="Rebels" /> Equally, he says, "the reader has to invest a certain amount of mental energy to follow the book", which includes "maps and a rather long glossary".<ref name="Rebels" /> Despite these potentially conflicting comments, Wiater and Bissette reiterate that "there is no simpler or more iconographic comic book in existence".<ref name="Rebels">[[Stanley Wiater|Wiater, Stanley]] and [[Stephen R. Bissette|Bissette, Stephen R.]] (eds.) "Larry Marder Building Bridges" in ''Comic Book Rebels: Conversations with the Creators of the New Comics'' (Donald I. Fine, Inc. 1993) ISBN 1-55611-355-2 pp. 17–27</ref>

If consecutive sentences are supported by the same reference, and that reference's inline citation is placed at the end of the paragraph as described at WP:CITETYPE, an editor may want to consider using Wikipedia's hidden text syntax <!-- --> to place hidden ref name tags at the end of each sentence. Doing so may benefit others adding material to that paragraph in the future. If that happens, they can uncomment the hidden citations and switch to citing references after every sentence. Having hidden citations could cause confusion, especially among inexperienced editors, so the approach is strictly optional and should be used cautiously.
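
A minimal sketch of this optional approach, reusing the "Rebels" reference from the passage above, with the sentences abbreviated for illustration. The visible citation stays at the end of the paragraph, and each earlier sentence carries a commented-out named reference that a later editor could uncomment:

```wikitext
<!-- Illustrative, abbreviated passage; the hidden tags below do not render -->
Marder explains that his work is "about the affinity of life".<!-- <ref name="Rebels" /> --> Wiater and Bissette see this relationship as a wider metaphor for the interdependency of the comics industry.<!-- <ref name="Rebels" /> --> Marder suggests that "it's harder to describe it than it is to read it".<ref name="Rebels">[[Stanley Wiater|Wiater, Stanley]] and [[Stephen R. Bissette|Bissette, Stephen R.]] (eds.) "Larry Marder Building Bridges" in ''Comic Book Rebels: Conversations with the Creators of the New Comics'' (Donald I. Fine, Inc. 1993) ISBN 1-55611-355-2 pp. 17–27</ref>
```

If new material citing a different source is later inserted mid-paragraph, removing the `<!--` and `-->` around the neighbouring tags restores sentence-level citations without retyping the reference.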

Reprints

Another common form of citation overkill is to cite multiple reprintings of the same content in different publications – such as several different newspapers reprinting the same wire service article, or a newspaper or magazine article getting picked up by a news aggregator – as if they constituted distinct citations. Such duplicated citations may be piled up as multiple references for the same fact or they may be split up as distinct footnotes for different pieces of content, so watching out for this type of overkill may sometimes require special attention.

This type of overkill should be resolved by merging all of the citations into a single one and stripping unhelpful repetitions. When possible, the retained citation should be the originator of the content rather than a reprinter or aggregator. If this is not possible (e.g. for some wire service articles), retain the most reliable and widely distributed available reprinter: for example, if the same article has been linked to both The New York Times and The Palookaville Herald, then The New York Times should be retained as the citation link.

A similar case is redundant citation of an article that got its information from an article we have already cited. An exception, to many scientific and technical editors, is when we cite a peer-reviewed literature review and also cite some of the original research papers the review covers. This is often felt to provide better utility for academic and university-student users of Wikipedia, and improved verifiability of details, especially in medical topics. Similar concerns about the biographies of living people may sometimes result in "back-up" citations to original reportage of statements or allegations that are later repeated by secondary sources that provide an overview.

In-article conflict

In controversial topics, sometimes editors will stack citations that do not add additional facts or really improve article reliability, in an attempt to "outweigh" an opposing view when the article covers multiple sides of an issue or there are competing claims. This is something like a PoV fork and edit war at once, happening inside the article's very content itself, and is an example of the fallacy of proof by assertion: "According to scholars in My School of Thought, Claim 1.[1][2][3][4][5] However, experts at The Other Camp suggest that Claim 2.[6][7][8][9][10]"

If this is primarily an inter-editor dispute over a core content policy matter (point of view, source interpretation, or verifiability of a claim), talk page discussion needs to proceed toward resolving the matter and balancing the article. If the dispute seems intractable among the regular editors of the article, try the requests for comments process; the applicable NPOV, NOR or RS noticeboard; or formal dispute resolution.

If the matter is the subject of real-world dispute in reliable sources, our readers actually need to know the conflict exists and what its parameters are (unless one of the conflicting views is a fringe viewpoint). Competing assertions with no context are not encyclopedic. Instead, the material should be rewritten to outline the nature of the controversy, ideally beginning with secondary sources that independently describe the conflicting viewpoints or data, with additional, less independent sources cited only where pertinent, for verification of more nuanced claims made about the views or facts as represented by the conflicting sources. Sources that are opinion-based – op-eds, advocacy materials, and other primary sources – can usually simply be dropped unless necessary to verify quotations that are necessary for reader understanding of the controversy.

Other views and solutions

Contrary views (and approaches to addressing their concerns) include:

  • A cited source usually contains more relevant information than the particular bit(s) it was cited for, and its removal may be thought to "deprive" the reader of those additional resources. Wikipedia is not a Web index, and our readers know how to use online search engines. In most cases, if a source would be somewhat or entirely redundant to cite for a particular fact, but has important additional information, it is better to use it to add these facts to the article. Or, if the additional material is not quite encyclopedically pertinent to the article but provides useful background information, add it to the "Further reading" or "External links" section instead of citing it inline in a way that does not actually improve verifiability.
  • An additional citation may allay concerns of some editors that the text constitutes a copyright violation. This is usually a short-term issue, and is often better handled by discussing the evidence on the talk page, if the additional citation does not really increase verifiability (e.g., because the original citation, with which the added one would be redundant, is to a clearly reliable source, and there are no disputes about its accuracy or about the neutrality or nature of its use).
  • As alluded to above, an additional citation may allay concerns as to whether the other citation(s) are sufficient, for WP:RS or other reasons. While this is often a legitimate rationale to add an additional source that some editors might consider not strictly necessary, it is sometimes more practical to replace weak sources with more reliable ones, or to add material outlining the nature of a disagreement between reliable sources. How to approach this is best settled on a case-by-case basis on the article's talk page, with an RfC if necessary, especially if the alleged fact, topic, or source is controversial. Adding competing stacks of citations is not how to address WP content disputes or real-world lack of expert consensus.

How to trim excessive citations

Try to construct passages so that an entire sentence or more can be cited to a particular source, instead of having sentences that each require multiple sources.

Sometimes it may be possible to salvage sources from a citekill pileup by simply moving them to other places in the article. Sometimes, a source which has been stacked on top of another source may also support other content in the article that is presently unreferenced, or may support additional content that isn't in the article at all yet, and can thus be saved by simply moving it to the other fact or adding new content to the article.

Deciding which citations to remove

If there are six citations on a point of information, and the first three are highly reputable sources (e.g., books published by university presses), and the last three citations are less reputable or less widely circulated (e.g., local newsletters), then trim out those less reputable sources.

If all of the citations are to highly reputable sources, another way to trim their number is to make sure that there is a good mix of types of sources. For example, if the six citations include two books, two journal articles, and two encyclopedia articles, the citations could be trimmed down to one citation from each type of source. Comprehensive works on a topic often include many of the same points. Not all such works on a topic need be cited – choose the one or ones that seem to be the best combination of eminent, balanced, and current.

In some cases, such as articles related to technology or computing or other fields that are changing very rapidly, it may be desirable to have the sources be as up-to-date as possible. In these cases, a few of the older citations could be removed.

For many subjects, some sources are official or otherwise authoritative, while others are only interpretive, summarizing, or opinionated. If the authoritative sources are not controversial, they should generally be preferred. For example, a company's own website is probably authoritative for an uncontroversial fact like where its headquarters is located, so newspaper articles need not be cited on that point. The World Wide Web Consortium's specifications are, by definition, more authoritative about HTML and CSS standards than third-party Web development tutorials.

Citation merging

If there is a good reason to keep multiple citations, for example, to avoid perennial edit warring or because the sources offer a range of beneficial information, clutter may be avoided by merging the citations into a single footnote. This can be done by putting a bullet point before each source inside the reference, which produces all of the sources under a single footnote number. Within a simple text citation, semicolons can be used to separate multiple sources.
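
Both forms of merging can be sketched in wikitext. The sketch reuses the illustrative elephant sources from earlier on this page and assumes there is a good reason to keep all three; the "Multiple sources:" label, the italic title formatting, and the short author–year form in the second footnote are illustrative choices rather than required wording.

```wikitext
<!-- Bundled footnote: each bullet must start on its own line inside the ref -->
Elephants are large land mammals.<ref>Multiple sources:
* Expert, Alice. (2010) ''Size of elephants: large''.
* Smith, Bob. (2009) ''Land-based animals'', Chapter 2: The Elephant.
* Christenson, Chris. (2010) ''An exhausting list of mammals''.
</ref>

<!-- Simple text citation: semicolons separate the sources -->
Elephants are large land mammals.<ref>Expert 2010; Smith 2009; Christenson 2010.</ref>
```

In either form the reader should see a single footnote number in the running text, with the individual sources listed together in the reference section.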
