Wikipedia:Parser bug reports
Bangs in links in tables wreck the table
Koyaanis Qatsi, Sunday, March 31, 2002
Observe: the following table is coded properly, but displays incorrectly (as you'll see in the second row, if you view source). Piping the link so the bang displays but isn't linked to corrects it:
Vivement dimanche! | Confidentially Yours | 1983 | W | D | ||
Vivement dimanche! | Confidentially Yours | 1983 | W | D |
- Looks like a bug in the reject-bad-links code. I'll work on it... Brion VIBBER, Sunday, March 31, 2002
- Got it. Fixed in CVS, wait for Jimbo to upgrade. Brion VIBBER, Tuesday, April 2, 2002
- Has that been upgraded? It quit linking, but shouldn't the two rows in the table be exactly the same except for that? Koyaanis Qatsi, Sunday, April 7, 2002
Ampersands in links surrounded by italics cause words to drop and the end italics tag to be ignored
Not a common problem, I'm sure. But go to the history of talk:Osama bin Laden and you can see what I'm talking about. The ampersand in the former [[Roger & Me]] was the culprit. Koyaanis Qatsi, Sunday, April 7, 2002
Several asterisks in a row will prevent linewraps (or increase the linewrap length considerably?)
Koyaanis Qatsi, Monday, April 8, 2002
See the history of Talk:Terrists for an example. I doubt this is a common issue, though, since most of us use four dashes. :-)
- This is not a bug, so I'm going to move this to the "fixed" site. Asterisks at the start of a line are used by the wiki software to make bullet lists, which can be nested. A row of, say, 20 asterisks is asking the software to make 20 nested bullet lists, and it does so correctly. It fails to wrap lines because bullet lists are indented, and when you ask for 20 indents, that line becomes very long to accommodate them, and it takes the rest of the page with it. In short, DON'T DO THAT, because it's not ever going to change. -- Lee Daniel Crocker
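The nesting behaviour described above can be sketched roughly as follows. This is an illustration only, not the actual wiki code: `bullet_depth` and `render_bullets` are hypothetical names, and the naive one-`<ul>`-per-asterisk output is an assumption about how such a renderer might work.

```python
def bullet_depth(line: str) -> int:
    """Count the leading asterisks, i.e. the requested nesting depth."""
    return len(line) - len(line.lstrip("*"))


def render_bullets(lines):
    """Render '*'-prefixed lines as nested <ul> lists (hypothetical renderer)."""
    out, depth = [], 0
    for line in lines:
        want = bullet_depth(line)
        while depth < want:      # open one list per extra asterisk
            out.append("<ul>")
            depth += 1
        while depth > want:      # close lists when the depth drops
            out.append("</ul>")
            depth -= 1
        text = line[want:].strip()
        out.append(f"<li>{text}</li>" if want else text)
    out.extend("</ul>" for _ in range(depth))   # close whatever is still open
    return out
```

A row of 20 asterisks therefore opens 20 nested lists, each with its own indent, which is why the line (and the page with it) becomes so wide.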
Table positioned between two paragraphs displays at bottom of page
(possibly related to above bug??) Wednesday, April 10, 2002
If you look at the table in Talk:High_German, you'll see that instead of appearing between the two paragraphs of my note, it leaves a "close table" tag where the table belongs and puts the table at the end of the page. I've double-checked my table code for errors, and can't find any. I've also tried just making one big table, with the first and last paragraphs in their own table rows, but the problem persists. Is this a bug, or am I having a Stupid Attack™? pgdudda
- You're missing a </center> tag; it looks okay after I added that in. But that did trigger a bug in the parser that caused it to eat the table instead of the center tag... I'll try to fix that, but in the meantime, uh, don't do that. :) Brion VIBBER, Wednesday, April 10, 2002
- Oh, so I *was* having a Stupid Attack™, but at least my Stupid Attack helped uncover another bug. Thanks! :-) pgdudda Thursday, April 11, 2002
Linking error 2/25/02
The Oregon constitution had several articles with multiple spaces in their titles, so the link was "Article II  title here" (two spaces after "II") instead of "Article II title here", and the two links resolve to different locations. Rob Salzman
- Hmm, I think this is semi-fixed. Anyone still seeing these kinds of errors? Brion VIBBER, Friday, April 19, 2002
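One way to make such links resolve to the same page is to normalize titles before looking them up. The sketch below is an assumption about how that could be done, not the fix that was actually applied. `normalize_title` is a hypothetical helper, and treating underscores like spaces is likewise an assumption.

```python
import re


def normalize_title(title: str) -> str:
    """Collapse runs of spaces/underscores so equivalent titles compare equal."""
    return re.sub(r"[ _]+", " ", title).strip()


# Both spellings now map to the same lookup key.
same = normalize_title("Article II  title here") == normalize_title("Article II title here")
```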
STATUS: UNKNOWN
Parser
Parser does not recognize mailto:, news:, ftp: URL scheme (2002/02/20)
The link parser seems not to recognize the mailto: URL scheme (RFC 2368). This prevents a user from creating a clickable e-mail contact point in his user page.
- Actual results: Mail Damian produces no link
- Expected results: a clickable link that opens the user's MUA with a new message addressed to tepples@spamcop.net
- See Urban legend for an example of news:; what had been a working Usenet news link now shows up as [1] Malcolm Farmer
- See VIM for an example of ftp:.
- Fixed it in CVS. We'll have http, ftp, news, mailto. I don't remember if I added gopher as well ;) --Magnus Manske, Sunday, April 14, 2002
STATUS : Solved in CVS
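For illustration, recognizing bracketed external links with a whitelist of schemes might look like the sketch below. It is not the CVS code. The regular expression, the `render_external_links` name, and the `[mailto:... Mail Damian]` markup used in the example are all assumptions.

```python
import html
import re

SCHEMES = ("http", "ftp", "news", "mailto")   # the schemes named in the fix

# [url label] with a whitelisted scheme; the label is optional.
EXTERNAL_LINK = re.compile(
    r"\[((?:%s):[^\s\]]+)(?:\s+([^\]]*))?\]" % "|".join(SCHEMES)
)


def render_external_links(text: str) -> str:
    """Turn [scheme:target label] into an HTML anchor; leave other text alone."""
    def repl(match):
        url = match.group(1)
        label = match.group(2) or url          # fall back to the URL itself
        return '<a href="%s">%s</a>' % (html.escape(url), html.escape(label))
    return EXTERNAL_LINK.sub(repl, text)
```

With this whitelist, `[mailto:tepples@spamcop.net Mail Damian]` becomes a clickable link, while a scheme that is not listed (gopher, say) is left as plain text.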
Last line link in list
(2002/1/29) If the last lines of an article look like this:
* [http://www.yahoo.com/ Yahoo]
then the bottom part of the page ("Main Page | Recent Changes...") will be indented to the left and screwed up. See SandBox for an example. This only happens if all of the following are true:
- we are in a list
- we have a URL link
- The last letter of the URL is /
- The name of the link occurs on the next line
- You are using IE 5.5 on Windows. Netscape 4.76 on Linux does not show the effect.
(2002/3/2) Right now, I see the bug also in Netscape. An example is at the bottom of Duverger's Law. AxelBoldt
- That page renders correctly for me on Mozilla 0.9.8 & Netscape 4.78 (Linux). The example in wikipedia:Sandbox still leaves an indent on the following page contents (which is due to a bug in the wiki-to-html rendering code), but not in the link bar at the bottom (which is now separated by a div tag, so there shouldn't be any interference). Brion VIBBER 2002/03/02
Parser generates extra whitespace
The Bipolar disorder page is full of extra whitespace - looking at the article reveals lots of <p> </p> and <pre> </pre> spans generated.
Similarly, if an otherwise empty line contains some white space, the previous parser took that as a paragraph break, while the new parser treats it as a block of indented nothing, resulting in too much space between the paragraphs.
If whitespace precedes a #, then it is taken to be a numbered list, while before it was taken as a literal # (which is the correct behavior, especially useful for programs). AxelBoldt
STATUS : Solved in CVS
Bad table code can screw up layout
(2002/1/28) In the Quaternions article, the first part of the article appears at the bottom of the page, as do all the QuickBar links. --Zundark
- This was caused by Bad Table Code in the article. There was no closing TR tag for the last row in the table, and an extra open TR tag after the end of the table. I've fixed the article... The parser could probably be made to be able to normalize these things, though (ie, remove table-ish tags not inside <TABLE>...</TABLE>) --Brion Vibber
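A rough sketch of the normalization suggested above, dropping table-ish tags that fall outside any table, could look like this. It is only an illustration under stated assumptions: `strip_stray_table_tags` is a hypothetical name, only TABLE/TR/TD/TH are considered, and it does not try to add missing closing tags.

```python
import re

# Opening or closing table-ish tags, with optional attributes.
TAG = re.compile(r"</?(table|tr|td|th)\b[^>]*>", re.IGNORECASE)


def strip_stray_table_tags(html_text: str) -> str:
    """Remove TR/TD/TH tags (and unmatched </TABLE>) found outside a table."""
    out, pos, depth = [], 0, 0
    for match in TAG.finditer(html_text):
        out.append(html_text[pos:match.start()])   # text before the tag
        pos = match.end()
        name = match.group(1).lower()
        closing = match.group(0).startswith("</")
        if name == "table":
            if closing and depth == 0:
                continue                           # unmatched </table>: drop
            depth += -1 if closing else 1
            out.append(match.group(0))
        elif depth > 0:
            out.append(match.group(0))             # row/cell tag inside a table
        # otherwise: row/cell tag outside any table, dropped
    out.append(html_text[pos:])
    return "".join(out)
```

Applied to the Quaternions case, the extra open TR after the end of the table would be removed, although the missing closing TR for the last row would still need a separate fix.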
Text between a pair of links sometimes omitted from displayed page
For an example of this, see the 14 April 06:28 version of the Leigh Brackett page. The three apostrophes don't put her name into bold properly; and chunks of text between some links are being omitted. The links in the edit text look perfectly normal, with no funny characters. Inserting a carriage return just before the omitted text seems to fix the problem, so it looks as if there's something odd happening in the function that parses the wiki text. Malcolm Farmer, Monday, April 15, 2002
- The missing text after BAD LINKS is an old bug that was fixed a week or so ago, but the fix hasn't been installed yet. The links in question are bad because they have line breaks *in* the links! A line break is not a valid character in an article title. Brion VIBBER, Monday, April 15, 2002
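As a hedged illustration of the point about line breaks, a page could be scanned for links whose titles span more than one line. `find_bad_links` is a hypothetical helper, and the pattern is an assumption rather than the parser's real link syntax.

```python
import re

# Non-greedy match of [[...]]; DOTALL lets the title span line breaks
# so that the broken links can be detected at all.
WIKI_LINK = re.compile(r"\[\[(.*?)\]\]", re.DOTALL)


def find_bad_links(text: str):
    """Return link titles that contain a line break (invalid in a title)."""
    return [title for title in WIKI_LINK.findall(text) if "\n" in title]
```

Running such a check before saving would point an editor at the offending link rather than leaving them to guess why text went missing.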
Parser issues with header lines
The display of Eight queens puzzle is... less than optimal. The problem is that the leading space on a line used to disable the processing of '#': now the Python program example is damaged.
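The earlier behaviour that reporters on this page describe (a leading space keeps '#' literal, and a whitespace-only line is a paragraph break) can be sketched as a line classifier. This is an assumption about the intended rules, not the parser's code, and `classify_line` is a hypothetical name.

```python
def classify_line(line: str) -> str:
    """Decide how a line of wiki text should be treated (illustrative rules)."""
    if not line.strip():
        return "blank"          # empty or whitespace-only: paragraph break
    if line[0] in " \t":
        return "preformatted"   # leading space: keep the text literal
    if line[0] == "#":
        return "numbered"       # '#' in column 0: numbered list item
    if line[0] == "*":
        return "bullet"         # '*' in column 0: bullet list item
    return "paragraph"
```

Checking for leading whitespace before checking for '#' is what keeps an indented Python comment from being turned into a numbered list.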
Definition lists produce invalid HTML, could use some improvement as well
(2002/4/16) Lee Daniel Crocker The line
; term : definition
is rendered as
<DL><dt> term </DL><DL><dt><dd> definition</DL>
Note that neither the "dt" nor "dd" elements are properly closed. Further,
(2002/1/25) Definition lists like:
; Term 1 : Definition 1.
; Term 2 : Definition 2.
each get put in separate <dl> tags, resulting in too much spacing between them. Carey Evans
While we're at it, it would be nice if the DD/DL elements were only closed off on a full blank line (or end of article), and not just a single newline. That would make them more consistent with regular paragraph text, and make articles with long definitions easier to write and edit.
Specifically,
; term : long definition blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah
should be rendered identically to
; term : long definition blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah
This should also be the case for the ULs and OLs created by * and #. Of course, if the first character of a new line within a DD is ";", then close the DD and open a DT; if it is ":", insert an empty DT and open a new DD. When a full blank line is encountered, close both the open DD and the DL. I'll take a look at the parser code to see if that's possible.
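The rules proposed above can be sketched as a small state machine. This is one possible reading of the proposal, not the code that went into CVS. `render_definition_lists` is a hypothetical name, and splitting a one-line entry on " : " is an assumption.

```python
def render_definition_lists(lines):
    """Render ';' and ':' lines as one <dl>, closing it only on a blank line."""
    out = []
    in_dl = False
    open_tag = None            # "dt", "dd", or None

    def close_item():
        nonlocal open_tag
        if open_tag:
            out.append(f"</{open_tag}>")
            open_tag = None

    def open_item(tag, text):
        nonlocal in_dl, open_tag
        if not in_dl:
            out.append("<dl>")
            in_dl = True
        close_item()           # properly close the previous DT or DD
        out.append(f"<{tag}>{text.strip()}")
        open_tag = tag

    def close_list():
        nonlocal in_dl
        close_item()
        if in_dl:
            out.append("</dl>")
            in_dl = False

    for line in lines:
        if not line.strip():
            close_list()                       # full blank line ends the list
        elif line.startswith(";"):
            term, sep, definition = line[1:].partition(" : ")
            open_item("dt", term)
            if sep:
                open_item("dd", definition)
        elif line.startswith(":"):
            if open_tag == "dd":
                open_item("dt", "")            # empty DT, as proposed
            open_item("dd", line[1:])
        elif in_dl:
            out.append(line.strip())           # continuation of the open item
        else:
            out.append(line)
    close_list()
    return out
```

Consecutive entries share a single `<dl>`, and a definition that continues on the next line stays inside the same `<dd>` until a blank line is reached.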
STATUS : Solved in CVS
Character entities in links
Sat Feb 2 00:23:40 UTC 2002: On list of food additives, I have additives like [[&beta;-cyclodextrine]]. When I click on the question mark to create an article about it, I get the Main Page displayed for edit instead. Note that since & is a safe character in URI path segments, escaping it as %26 has no effect.
- This is due to a bug in the code putting too many HTML escapes into the title; if it were working correctly, the %26 escape would indeed have an effect. My recommendation until this is resolved: use β-cyclodextrine ([[beta-cyclodextrine|β-cyclodextrine]]). --Brion Vibber
- There's probably good arguments for actually writing "beta-cyclodextrine" in the article. However, my point about the % escape is that according to RFC 2396, there is no difference between %26 and just & in the path of the URL. --Carey
- Well, there's the RFC and then there's the actual behavior of the software... PHP does not seem to consider %26 to be an ampersand for the purposes of extracting variables from the URL's *query* bit. At least my reading of the RFC agrees with it: '3.4 Query Component ... Within a query component, the characters ..."@", "&", "="... are reserved.' It's not a problem in the path, only in the query when you're e.g. editing the page. --BV
- The URL rewriting to give nice URLs like http://www.wikipedia.com/wiki/MainPage rather than .../wiki.phtml?MainPage makes this a bit more complicated. There's no question mark in the URL for this edit page, so Apache is probably justified in converting %26 to & internally, before processing the Alias or RewriteRule directive, or http://www.wikipedia.com/%77%69%6B%69/ wouldn't work. --Carey
(Ideally the URL would be encoded as %CE%B2-cyclodextrine, the UTF-8 encoding of GREEK SMALL LETTER BETA.)
- Impossible until the database is converted from ISO-8859-1 to UTF-8. --BV
- I would just write <? echo urlencode(recode("h..utf8", $title)) ?>. --Carey
- Yeah, that could probably work as long as titles are normalized internally. I'll try banging the code into place... --BV
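The encoding discussed in this thread can be illustrated in Python (used here instead of the PHP one-liner above so the example is self-contained). `title_to_path_segment` is a hypothetical helper. Decoding the HTML entity first and then percent-encoding the UTF-8 bytes is the approach suggested in the discussion, not necessarily what the software ended up doing.

```python
import html
from urllib.parse import quote, unquote


def title_to_path_segment(raw_title: str) -> str:
    """Decode HTML entities, then percent-encode the title as UTF-8."""
    title = html.unescape(raw_title)       # "&beta;" becomes the character itself
    return quote(title, safe="")           # reserved characters such as & are escaped


beta = title_to_path_segment("&beta;-cyclodextrine")
ampersand = unquote("%26")                  # decoding %26 gives back "&"
```

Here `beta` is `%CE%B2-cyclodextrine`, matching the UTF-8 form mentioned above, and a title such as "Roger & Me" has its ampersand escaped as %26 so it cannot be mistaken for a query separator.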
References: RFC 2396, W3C on i18n of URIs
Problem in "printable version" page?
Please go to Category theory and try the "printable version" link; you will probably see that the word functors remains as a blue link instead of becoming simple italics text. I was unable to spot any sort of difference from other links that would cause this strange behaviour, and I suppose it can be considered a bug, since the printable version should not contain any link in the text part. Daniel M
- Yup, it's a bug, caused by the fact that the link looks like [[functor|<b>functors</b>]]. It's fixed in the development version of the code. AxelBoldt
STATUS : Solved in CVS
I've had this problem show up a few times, and can't really be sure whether it's from the system or my own machine. Most recently it has shown up in famous women in history. On the edit page there is text between Mildred Zaharias and Margaret Thatcher, including a section heading - but when saved these two names appear together on the same line without even a space between them. Eclecticology
- STATUS: DUPLICATE BUG, SEE NUMEROUS REPORTS AT THE TOP OF THE PAGE ABOUT MISSING TEXT AFTER LINKS THAT CONTAIN INVALID CHARACTERS
- Thanks! Perhaps someone with more experience in these matters could start a "troubleshooting" page with step-by-step instructions for fixing this kind of bug on a page.
This is new (since late this afternoon, Pacific Time). The top of pages has this:
Warning: Supplied argument is not a valid MySQL result resource in /home/wiki-newest/work-http/wikiPage.php on line 86
Warning: Supplied argument is not a valid MySQL result resource in /home/wiki-newest/work-http/wikiPage.php on line 88
The "Recent Changes" page doesn't have these lines, but pages that I get to by clicking their names in the Recent Pages page do. Redirected pages have each error message twice. -- Marj Tiefert, Tuesday, May 7, 2002
- Yes, I'm seeing that now too... (Note to other developers about this bug: the running version of wikiPage.php is, I believe, 1.112 from 2002-04-02. The errors are coming up in wikiPage::load() on mysql_fetch_object() and mysql_release() calls when grabbing data out of the unlinked table. Unfortunately, the developer's database access function is still disabled, so I can't test the query manually to see what's going wrong... I haven't seen any similar errors from the preceding query on the linked table.) Brion VIBBER, Tuesday, May 7, 2002
I'm getting the same error on almost all screens of the wikipedia. Yesterday it was reporting errors in lines 84 and 86 and refusing to load screens at all. Today the error has switched to line '99' with this showing up at the top of virtually every screen I got to: Warning: Supplied argument is not a valid MySQL result resource in /home/wiki-newest/work-http/wikiPage.php on line 99 ~KJ Wed 8th May 2002
---
The colours are all wrong as of this morning. View source on Netscape shows the BODY HTML tag has `textcolor=" TEXT" bgcolor=" BGCOLOR"' suggesting that a string hasn't been properly substituted for a variable somewhere. My Netscape browser interprets the bgcolor value as dark green, making the black text difficult to read, and the blue links even harder. Changing skins makes no difference. Malcolm Farmer, Thursday, May 9, 2002