Talk:Human genome

Human genome was one of the Natural sciences good articles, but it has been removed from the list. There are suggestions below for improving the article to meet the good article criteria. Once these issues have been addressed, the article can be renominated. Editors may also seek a reassessment of the decision if they believe there was a mistake.

Article milestones
Date	Process	Result
October 1, 2006	Good article nominee	Listed
September 24, 2009	Good article reassessment	Delisted

Current status: Delisted good article

This article has not yet been rated on Wikipedia's content assessment scale.
It is of interest to the following WikiProjects:

Human Genetic History (inactive)

This article is within the scope of WikiProject Human Genetic History, a project which is currently considered to be inactive.

Please add the quality rating to the {{WikiProject banner shell}} template instead of this project banner. See WP:PIQA for details.

Molecular Biology: Genetics / MCB

	This article is within the scope of WikiProject Molecular Biology, a collaborative effort to improve the coverage of Molecular Biology on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
???	This article has not yet received a rating on the importance scale.
	This article is supported by the Genetics task force (assessed as High-importance).
	This article is supported by the Molecular and Cell Biology task force (assessed as Top-importance).

Medicine Mid‑importance

	This article is within the scope of WikiProject Medicine, which recommends that medicine-related articles follow the Manual of Style for medicine-related articles and that biomedical information in any article use high-quality medical sources. Please visit the project page for details or ask questions at Wikipedia talk:WikiProject Medicine.
Mid	This article has been rated as Mid-importance on the project's importance scale.
	This article was selected on the Medicine portal as one of Wikipedia's best articles related to Medicine.

This article has been mentioned by a media organization:

"Wikipedia and Academia". Vermont Public Radio. 7 March 2013.

Archives

Index

Archive 1

Archive 2

Archive 3

This page has archives. Sections older than 90 days may be automatically archived when more than 5 sections are present.

Quality

According to the URL below, a paper from top experts in a top journal (i.e. highly authoritative) says that there are many gaps (unsequenced regions) in the human genome. IMO, the lack of attention paid to these gaps is somewhat misleading for the general public; e.g. when scientists use the word "complete" it means, per the dictionary, that we have a genome with no gaps and no missing sequence, yet this is empirically false. http://www.nature.com/nature/journal/vaop/ncurrent/full/nature13907.html

Current list of human protein-coding genes

I know we have sub-articles on this with partial lists by chromosome, but there's now a complete list in the pages below in the event anyone is interested. I unfortunately couldn't add more information columns to those wikitables since I was running right up against the page size limit on both pages and wanted to split the list across as few pages as possible. Seppi333 (Insert 2¢) 19:28, 2 November 2019 (UTC)[reply]

July 2020 NIH X chromosome result

I'd like to add this to the article.

Should I?

And where?

Links: https://www.genome.gov/news/news-release/NHGRI-researchers-generate-complete-human-x-chromosome-sequence

https://www.nih.gov/news-events/news-releases/nih-researchers-generate-complete-human-x-chromosome-sequence

Not sure this is really the best place for it - the result is really more about the process of completing the sequence than it is about the genome itself. I would suggest: Human Genome Project#State of completion (and it would be better to cite the Nature paper than the press releases). Agricolae (talk) 21:07, 19 July 2020 (UTC)[reply]

Number of genes?

The article says

As genome sequence quality and the methods for identifying protein-coding genes improved,[9] the count of recognized protein-coding genes dropped to 19,000-20,000.[12] However, a fuller understanding of the role played by sequences that do not encode proteins, but instead express regulatory RNA, has raised the total number of genes to at least 46,831,[13] plus another 2300 micro-RNA genes.[14]

but also says

The haploid human genome (23 chromosomes) is about 3 billion base pairs long and contains around 30,000 genes.[29]

Which is it, or are both numbers, properly understood, correct? —WWoods (talk) 18:54, 14 November 2020 (UTC)[reply]

It depends on what you count as a gene, and how long ago the analysis was done. The 46k number comes from an article with the sub-headline: "The new estimate is based on a broader definition of just what a gene is". The second is more vague and generic, and it is unclear if it is rejecting the new definition of a gene offered by the other analysis, or if this is simply 'old data': though the page has been updated as recently as this August, this particular datum may be of older vintage. I think we need to see what sources from the past year and a half are saying about the gene count. Without that, the 46,831 number just represents one paper's conclusion that the definition of a gene should be changed, and what number that would produce, but has the field accepted this adjustment? Even if they have followed this thinking, the number is overly precise given that it results from making a whole lot of individual calls over whether each site is a gene or not. I think we should either present it less precisely as 'more than 46800' and likewise not present this redefinition of the gene as the recent 'new understanding' if it is just one paper's position - i.e. we should use conditional language, 'if the view of a gene is expanded, . . . ' or else we should use more descriptive language to refer to the precise number 'a recent analysis arguing the definition of a gene should be expanded concluded. . . .' so it is clear this is a single analysis with a specific set of assumptions. Agricolae (talk) 19:20, 14 November 2020 (UTC)[reply]

Junk DNA

90% of the human genome is junk but "junk" is only mentioned once in the article. That needs to be fixed. Genome42 (talk) 20:40, 27 July 2022 (UTC)[reply]

Yeah, it needs to be removed. 'Junk' DNA is a concept defined by what it isn't, and is no longer considered a useful distinction given the diverse types of functional (in some cases critically important) and non-functional DNA that is included in this catch-all term. I have removed it. Agricolae (talk) 22:40, 27 July 2022 (UTC)[reply]

There is solid evidence that 90% of our genome is junk. There will soon be a separate Wikipedia article devoted to this topic. You need to read up on this topic. You might want to start with one of my blog posts.

"Five Things You Should Know if You Want to Participate in the Junk DNA Debate"

https://sandwalk.blogspot.com/2013/07/five-things-you-should-know-if-you-want.html Genome42 (talk) 20:16, 29 July 2022 (UTC)[reply]

This is not the 1980s, when if DNA didn't encode proteins then we just threw up our hands and dismissively called it 'junk'. A promoter is not 'junk'. A centromere is not 'junk'. rDNA is not 'junk'. And most importantly, these don't belong in the same heterogeneous category with each other, let alone in the same artificial category with LINEs, pseudogenes, viral insertions, etc. We already mention so-called 'junk' DNA in the article, just not using the inaccurate term for it - "Human genomes include both protein-coding DNA genes and various types of DNA that does not encode proteins. The latter is a diverse category that includes DNA coding for non-translated RNA, such as that for ribosomal RNA, transfer RNA, ribozymes, small nuclear RNAs, and several types of regulatory RNAs. It also includes promoters and their associated gene-regulatory elements, DNA playing structural and replicatory roles, such as scaffolding regions, telomeres, centromeres, and origins of replication, plus large numbers of transposable elements, inserted viral DNA, non-functional pseudogenes and simple, highly-repetitive sequences." There is little benefit in ignoring this diversity, lumping it all together under a heterogeneous catch-all term of 'all of the various different things that aren't X'. More useful would be to break out the unhelpful '90%' figure into its distinct constituent parts, which data should be in the 2022 Science Special Issue (currently behind a paywall for me). Agricolae (talk) 20:57, 29 July 2022 (UTC)[reply]

Just a followup, elsewhere you have defined junk DNA as "DNA that can be deleted from the genome without affecting the fitness of the individual or the species". I would suggest that no piece of the human DNA has been characterized to the level at which we would know this for certain, and it would be unethical to carry out the test. It amounts to a 'we don't know of a benefit so there must not be one' argument. Given that pseudogenes can provide a venue for variation that can then be incorporated back into the transcriptome through gene conversion or reactivation, that we all have at least one retrotransposed mRNA in our genome that is a fully-functional transcribed protein because it randomly inserted next to a promoter, or a promoter randomly inserted next to it, and that LINEs allow for unequal crossing over and consequent gene duplication, specialization and diversification, I don't know how you point to a particular piece of DNA and say that having it can't possibly ever provide a fitness advantage. In your blog you point to the range of difference in genome size, but there are genomes stripped down to the point where they have next to no 'junk DNA' - if they can, so could we, but we don't - it has been argued that this is evidence for a functional benefit to all that misnamed 'junk', even if we don't know how such a fitness benefit might come about. And all of that doesn't get around the fundamental problem that 'junk' DNA is a heterogeneous category, defined by what it isn't, and like Tolstoy's unhappy families, all unhappy in their own different way, nothing more than a catchall for whatever things that are not something else, a combination of a linguistic holdover from a previous time and an expression of current imperfect knowledge and imagination.
Even if it really is useless, and we certainly can't be sure that is the case, we are much better off talking individually about pseudogenes, LINEs, etc., rather than about an artificial category including all of these disparate things, each unhappy for its own particular reason. We are usually better off describing green, and blue, and yellow, rather than creating a page for 'colours that aren't red'. Agricolae (talk) 00:35, 30 July 2022 (UTC)[reply]

Much of what you have just written is scientifically incorrect, misleading, or logically flawed (sometimes all three). But your lack of knowledge of the scientific controversy over junk DNA isn't really the point. The point is that there are many highly respected and intelligent scientists who say that most of our genome consists of junk DNA. You may not agree with them but you are wrong to use your power as a Wikipedia editor to impose your personal opinion on a Wikipedia article. It's ridiculous to censor any mention of "junk DNA" when the term is used in lots of other Wikipedia articles and is part of a very interesting and ongoing controversy about the content of the human genome.

I've posted your comments on my blog where we can discuss all the bits that need correcting.

Wikipedia blocks any mention of junk DNA in the "Human genome" article

https://sandwalk.blogspot.com/2022/07/wikipedia-blocks-any-mention-of-junk.html Genome42 (talk) 15:40, 30 July 2022 (UTC)[reply]

Do what you want on your personal playground - that has nothing to do with Wikipedia. Here I would rather discuss why it is important to have a classification for what remains a Tolstoyan 'unhappy families' category that combines different types of DNA with different origins, regulation and hypothesized evolutionary contribution(s). It is a loss of information, bordering on intentional obfuscation, to take defined genomic proportions of LINEs, pseudogenes, SINEs, etc., etc., and simply report that as a combined "90% of the genome is 'other stuff'" instead of being specific. Agricolae (talk) 02:27, 31 July 2022 (UTC)[reply]

At some point Agricolae needs to recognize that the opinion Genome42 expressed is not just a minority view held by a few cranks. I wouldn't go so far as to say that it's the majority opinion of protein chemists, but it may be. Is there a single protein chemist that anyone can cite (full reference please) who disagrees with Genome42? Note also that Genome42's qualifications are very well known and easily checked: his textbook is one of the most influential biochemistry textbooks in use today. Unfortunately I have not had any success trying to discover what qualifications his opponents have. Some non-biochemists may agree that junk DNA is passé, maybe even a majority, but there are certainly some geneticists and molecular biologists who do not. Dan Graur, for example, has expressed himself very forcefully on the subject, and so did Sydney Brenner when he was alive. This argument will not go away. Athel cb (talk) 18:08, 30 July 2022 (UTC)[reply]

I have to side with both Genome42 and Athel cb here. I have been following the junk-DNA controversy for over a decade now and it is in NO way a fringe theory. It is a legitimate scientific controversy that is being actively published on and researched. The claim that it is fringe that there is junk DNA, or that it is settled that there is no junk DNA is empirically and historically an outright falsehood, and there is substantial support for both the reality that junk DNA exists, and that a significant fraction of the human genome is in fact junk DNA.

I think Agricolae needs to familiarize itself with the considerable literature that has already been cited earlier by Genome42 as pretty much all of the arguments brought up by Agricolae are addressed at length in those papers. Rumraket38 (talk) 22:35, 30 July 2022 (UTC)[reply]

Let me add my agreement to the comments by Laurence Moran (Genome42), Athel Cornish-Bowden (Athel cb), and Rumraket. There is a dramatic contrast between the views of almost all researchers on molecular evolution, and the views of other genomicists and molecular biologists. The former understand the evidence that most of our genome is "junk DNA". The latter are basing themselves mostly on recent hearsay. This is a sad situation and a major disagreement, and it is outrageous that Wikipedia does not allow even a mention of it in this article. Felsenst (talk) 23:41, 30 July 2022 (UTC) (Joe Felsenstein, Emeritus Professor, Department of Genome Sciences, University of Washington, Seattle (also Department of Biology)).[reply]

I think this is a semantic issue. It is clear that at least some DNA that was formerly called junk has a function. Nevertheless, there remains a large percentage of DNA that has no apparent function, at least no function that has yet been identified. Finally some DNA may not have an immediate function in living organisms, but may still play an evolutionary role.^[1] The term junk DNA is widely used in the literature, hence I think it should at least be mentioned with all the above caveats. Boghog (talk) 19:57, 30 July 2022 (UTC)[reply]

Yes, and take it further - these roles, these potential functions are different for each different type of so-called 'junk'. The hypothesized evolutionary role(s) of LINEs is/are different than the potential evolutionary role(s) of pseudogenes. We don't have an article for 'all of the different functional types of DNA with known roles' that tries to combine a discussion of promoters and coding sequence and rDNA and origins of replication, we discuss the individual roles of each specific type, yet 'junk DNA' is analogously heterogeneous and less helpful than similarly individualized discussion of the various components dismissively clustered in that 'everything else' pseudo-category. Agricolae (talk) 02:27, 31 July 2022 (UTC)[reply]

References

^ Andolfatto P (October 2005). "Adaptive evolution of non-coding DNA in Drosophila". Nature. 437 (7062): 1149–52. doi:10.1038/nature04107. PMID 16237443. Lay summary in: "UCSD Study Shows 'Junk' DNA Has Evolutionary Importance". ScienceDaily. Rockville, MD. 20 October 2005.