Wikipedia:Parser bug reports

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Malcolm Farmer at 04:24, 9 May 2002 (Page colours have gone awry). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Bangs in links in tables wreck the table Koyaanis Qatsi, Sunday, March 31, 2002

Observe: the following table is coded properly, but displays incorrectly (as you'll see in the second row, if you view source). Piping the link so the bang displays but isn't linked to corrects it:

Vivement dimanche! Confidentially Yours 1983 W D
Vivement dimanche! Confidentially Yours 1983 W D
Looks like a bug in the reject-bad-links code. I'll work on it... Brion VIBBER, Sunday, March 31, 2002
Got it. Fixed in CVS, wait for Jimbo to upgrade. Brion VIBBER, Tuesday, April 2, 2002
Has that been upgraded? It has stopped linking, but shouldn't the two rows in the table be exactly the same except for that? Koyaanis Qatsi, Sunday, April 7, 2002

Ampersands in links surrounded by italics cause words to drop and the end italics tag to be ignored

Not a common problem, I'm sure. But go to the history of talk:Osama bin Laden and you can see what I'm talking about. The ampersand in the former [[Roger & Me]] was the culprit. Koyaanis Qatsi, Sunday, April 7, 2002


Several asterisks in a row will prevent linewraps (or increase the linewrap length considerably?) Koyaanis Qatsi, Monday, April 8, 2002

See the history of Talk:Terrists for an example. I doubt this is a common issue, though, since most of us use four dashes.  :-)

This is not a bug, so I'm going to move this to the "fixed" site. Asterisks at the start of a line are used by the wiki software to make bullet lists, which can be nested. A row of, say, 20 asterisks is asking the software to make 20 nested bullet lists, and it does so correctly. It fails to wrap lines because bullet lists are indented, and when you ask for 20 indents, that line becomes very long to accommodate them, and it takes the rest of the page with it. In short, DON'T DO THAT, because it's not ever going to change. -- Lee Daniel Crocker
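
A rough Python sketch of the behaviour described above, for anyone who wants to see why 20 asterisks mean 20 indents. This is not the wiki software's own code; the function name render_bullet_line is made up for the example, and the nesting is simplified to bare <ul> tags.

```python
def render_bullet_line(line: str) -> str:
    """Render one wiki line, opening one nested <ul> per leading asterisk."""
    # Number of asterisks at the very start of the line = nesting depth.
    depth = len(line) - len(line.lstrip("*"))
    text = line[depth:].strip()
    if depth == 0:
        return text  # not a bullet line at all
    # Each level of nesting adds one more indent in the browser.
    return "<ul>" * depth + "<li>" + text + "</li>" + "</ul>" * depth
```

A line of 20 asterisks therefore asks for 20 levels of indentation, which is what stretches the page.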

Table positioned between two paragraphs displays at bottom of page
(possibly related to above bug??) Wednesday, April 10, 2002

If you look at the table in Talk:High_German, you'll see that instead of appearing between the two paragraphs of my note, it leaves a "close table" tag where the table belongs and puts the table at the end of the page. I've double-checked my table code for errors, and can't find any. I've also tried just making one big table, with the first and last paragraphs in their own table rows, but the problem persists. Is this a bug, or am I having a Stupid Attack™? pgdudda

You're missing a </center> tag; it looks okay after I added that in. But that did trigger a bug in the parser that caused it to eat the table instead of the center tag... I'll try to fix that, but in the meantime, uh, don't do that. :) Brion VIBBER, Wednesday, April 10, 2002
Oh, so I *was* having a Stupid Attack™, but at least my Stupid Attack helped uncover another bug. Thanks!  :-) pgdudda Thursday, April 11, 2002



Linking error 2/25/02

The Oregon constitution had several articles with multiple spaces in their titles, so the link was Article II (two spaces before this) title here instead of Article II title here, and the two links resolve to different locations. Rob Salzman

Hmm, I think this is semi-fixed. Anyone still seeing these kinds of errors? Brion VIBBER, Friday, April 19, 2002
STATUS: UNKNOWN
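
One way links like these could be made to agree, sketched in Python on the assumption that titles are normalized before lookup: collapse every run of whitespace to a single space. normalize_title is a hypothetical helper for illustration, not the function the wiki code actually uses.

```python
import re

def normalize_title(title: str) -> str:
    """Collapse runs of whitespace to one space and trim the ends."""
    return re.sub(r"\s+", " ", title).strip()
```

With this, a title typed with two spaces and the same title typed with one space would resolve to the same page.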

Parser

Parser does not recognize mailto:, news:, ftp: URL schemes (2002/02/20)

The link parser seems not to recognize the mailto: URL scheme (RFC 2368). This prevents a user from creating a clickable e-mail contact point in his user page.

  • Actual results: Mail Damian produces no link
  • Expected results: a clickable link that opens the user's MUA with a new message addressed to tepples@spamcop.net

--Damian Yerrick

Fixed it in CVS. We'll have http, ftp, news, mailto. I don't remember if I added gopher as well ;) --Magnus Manske, Sunday, April 14, 2002
STATUS : Solved in CVS
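
For illustration only, here is a minimal Python sketch of scheme-aware recognition of bracketed external links. It assumes the [url label] form shown elsewhere on this page and the four schemes listed in the fix; the names SCHEMES, EXTERNAL_LINK and render_external_links are invented for the example, and the real fix in CVS may look quite different.

```python
import re

# Schemes named in the fix above; gopher is left out because the report is unsure about it.
SCHEMES = ("http", "ftp", "news", "mailto")

# [url label] where url starts with a known scheme; the label is optional.
EXTERNAL_LINK = re.compile(
    r"\[((?:%s):[^\s\]]+)(?:\s+([^\]]+))?\]" % "|".join(SCHEMES)
)

def render_external_links(text: str) -> str:
    """Turn recognized [url label] links into anchors; leave everything else alone."""
    def repl(m):
        url = m.group(1)
        label = m.group(2) or url  # fall back to the URL when no label is given
        # Note: no HTML escaping is done here; a real parser would need it.
        return '<a href="%s">%s</a>' % (url, label)
    return EXTERNAL_LINK.sub(repl, text)
```

Links with a scheme that is not in the list are passed through untouched, which is the behaviour the original report complained about for mailto:.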

Last line link in list

(2002/1/29) If the last lines of an article look like this:

* [http://www.yahoo.com/
Yahoo]

then the bottom part of the page ("Main Page | Recent Changes...") will be indented to the left and screwed up. See SandBox for an example. This only happens if all of the following are true:

  1. we are in a list
  2. we have a URL link
  3. The last letter of the URL is /
  4. The name of the link occurs on the next line
  5. You are using IE 5.5 on Windows. Netscape 4.76 on Linux does not show the effect.

AxelBoldt

(2002/3/2) Right now, I see the bug also in Netscape. An example is at the bottom of Duverger's Law. AxelBoldt

That page renders correctly for me on Mozilla 0.9.8 & Netscape 4.78 (Linux). The example in wikipedia:Sandbox still leaves an indent on the following page contents (which is due to a bug in the wiki-to-html rendering code), but not in the link bar at the bottom (which is now separated by a div tag, so there shouldn't be any interference). Brion VIBBER 2002/03/02



Parser generates extra whitespace

The Bipolar disorder page is full of extra whitespace - looking at the article reveals lots of <p> </p> and <pre> </pre> spans generated.

Similarly, if an otherwise empty line contains some white space, the previous parser took that as a paragraph break, while the new parser treats it as a block of indented nothing, resulting in too much space between the paragraphs.

If whitespace precedes a #, then it is taken to be a numbered list, while before it was taken as a literal # (which is the correct behavior, especially useful for programs). AxelBoldt

STATUS : Solved in CVS
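
The expected behaviour in the two reports above can be written down as a small line classifier. This is a Python sketch of the rules as the reporter states them, not the parser's real logic; classify_line and its return labels are hypothetical names.

```python
def classify_line(line: str) -> str:
    """Classify one line of wiki text according to the rules reported above."""
    if line.strip() == "":
        # Blank or whitespace-only: a paragraph break, not an indented block.
        return "paragraph_break"
    if line[0] in " \t":
        # Leading whitespace: literal text, so '#' and '*' are not list markers.
        return "preformatted"
    if line[0] == "#":
        return "numbered_item"
    if line[0] == "*":
        return "bullet_item"
    return "text"
```

The order of the checks matters: the whitespace-only test has to come before the leading-whitespace test, otherwise a line of spaces is treated as an indented block.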

Bad table code can screw up layout

(2002/1/28) In the Quaternions article, the first part of the article appears at the bottom of the page, as do all the QuickBar links. --Zundark

This was caused by Bad Table Code in the article. There was no closing TR tag for the last row in the table, and an extra open TR tag after the end of the table. I've fixed the article... The parser could probably be made to normalize these things, though (i.e., remove table-ish tags not inside <TABLE>...</TABLE>). --Brion Vibber
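
A minimal Python sketch of the normalization suggested above, assuming it is enough to track how many tables are currently open and drop row and cell tags seen outside any table. strip_stray_table_tags is a made-up name, and the sketch only removes stray tags; it does not add the missing closing TR.

```python
import re

# Opening or closing table, row and cell tags, with any attributes.
TAG = re.compile(r"</?(table|tr|td|th)\b[^>]*>", re.IGNORECASE)

def strip_stray_table_tags(html: str) -> str:
    """Drop table-ish tags that are not inside <table>...</table>."""
    out = []
    depth = 0  # number of currently open <table> elements
    pos = 0
    for m in TAG.finditer(html):
        out.append(html[pos:m.start()])  # text between tags is always kept
        pos = m.end()
        name = m.group(1).lower()
        closing = m.group(0).startswith("</")
        if name == "table":
            if not closing:
                depth += 1
                out.append(m.group(0))
            elif depth > 0:
                depth -= 1
                out.append(m.group(0))
            # a </table> with no open table is dropped
        elif depth > 0:
            out.append(m.group(0))  # row or cell tag inside a table: keep
        # row or cell tag outside any table: dropped
    out.append(html[pos:])
    return "".join(out)
```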

Text between a pair of links sometimes omitted from displayed page

For an example of this, see the 14 April 06:28 version of the Leigh Brackett page. The three apostrophes don't put her name into bold properly; and chunks of text between some links are being omitted. The links in the edit text look perfectly normal, with no funny characters. Inserting a carriage return just before the omitted text seems to fix the problem, so it looks as if there's something odd happening in the function that parses the wiki text. Malcolm Farmer, Monday, April 15, 2002

The missing text after BAD LINKS is an old bug that was fixed a week or so ago, but the fix hasn't been installed yet. The links in question are bad because they have line breaks *in* the links! A line break is not a valid character in an article title. Brion VIBBER, Monday, April 15, 2002
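
Until the fix is installed, a check like the following Python sketch could help find the offending links in a page's edit text. It assumes only the [[...]] link syntax; LINK and links_with_line_breaks are hypothetical names, not part of the wiki code.

```python
import re

# Anything between [[ and ]], allowing the match to span line breaks.
LINK = re.compile(r"\[\[(.*?)\]\]", re.DOTALL)

def links_with_line_breaks(wikitext: str) -> list:
    """Return the contents of every [[...]] link that contains a line break."""
    return [m.group(1) for m in LINK.finditer(wikitext) if "\n" in m.group(1)]
```

Joining each reported link back onto a single line in the edit box should make the missing text reappear.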

Parser issues with header lines

The display of Eight queens puzzle is... less than optimal. The problem is that the leading space on a line used to disable the processing of '#': now the Python program example is damaged.


Definition lists produce invalid HTML, could use some improvement as well

(2002/4/16) Lee Daniel Crocker The line

; term : definition

is rendered as

<DL><dt> term </DL><DL><dt><dd> definition</DL>

Note that neither the "dt" nor "dd" elements are properly closed. Further,

(2002/1/25) Definition lists like:

Term 1
Definition 1.
Term 2
Definition 2.

each get put in separate <dl> tags, resulting in too much spacing between them. Carey Evans

While we're at it, it would be nice if the DD/DL elements were only closed off on a full blank line (or end of article), and not just a single newline. That would make them more consistent with regular paragraph text, and make articles with long definitions easier to write and edit.

Specifically,

; term
  : long definition blah blah blah blah blah blah blah blah blah
   blah blah blah blah blah blah blah blah blah blah blah blah blah
   blah blah blah blah blah blah
  

should be rendered identically to

; term
  : long definition blah blah blah blah blah blah blah blah blah blah blah
    blah blah blah blah blah blah blah blah blah blah blah blah blah blah
    blah blah blah
  

This should also be the case for the ULs and OLs created by * and #. Of course, if the first character of a new line within a DD is ";", then close the DD and open a DT; if it is ":", insert an empty DT and open a new DD. When a full blank line is encountered, close both the open DD and the DL. I'll take a look at the parser code to see if that's possible.

STATUS : Solved in CVS
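
The rules proposed above can be sketched as a small state machine. This Python version is an illustration under several assumptions, not the code that went into CVS: it strips leading whitespace from each line, it does not handle the one-line "; term : definition" form, and it skips the "insert an empty DT" step for a bare ":" line. render_definition_list is a hypothetical name.

```python
def render_definition_list(lines):
    """Render ';' and ':' lines as one <dl>, closed only by a blank line or end of input."""
    out = []
    open_tag = None  # 'dt', 'dd', or None when no item is open
    in_dl = False

    def close_item():
        nonlocal open_tag
        if open_tag:
            out.append("</%s>" % open_tag)
            open_tag = None

    def close_list():
        nonlocal in_dl
        close_item()
        if in_dl:
            out.append("</dl>")
            in_dl = False

    for raw in lines:
        line = raw.strip()
        if line == "":
            close_list()  # a full blank line ends the item and the list
        elif line[0] in ";:":
            if not in_dl:
                out.append("<dl>")
                in_dl = True
            close_item()  # ';' or ':' closes whatever item was open
            open_tag = "dt" if line[0] == ";" else "dd"
            out.append("<%s>%s" % (open_tag, line[1:].strip()))
        elif open_tag:
            out.append(" " + line)  # a single newline just continues the open item
        else:
            out.append(line)  # ordinary text outside any list
    close_list()  # end of article
    return "".join(out)
```

With this, the two differently wrapped examples above come out as the same HTML, every dt and dd is closed, and consecutive terms share one <dl>.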

Character entities in links

Sat Feb 2 00:23:40 UTC 2002: On list of food additives, I have additives like [[&beta;-cyclodextrine]]. When I click on the question mark to create an article about it, I get the Main Page displayed for edit instead. Note that since & is a safe character in URI path segments, escaping it as %26 has no effect.

This is due to a bug in the code putting too many HTML escapes into the title; if it were working correctly, the %26 escape would indeed have an effect. My recommendation until this is resolved: use β-cyclodextrine ([[beta-cyclodextrine|&beta;-cyclodextrine]]). --Brion Vibber
There's probably good arguments for actually writing "beta-cyclodextrine" in the article. However, my point about the % escape is that according to RFC 2396, there is no difference between %26 and just & in the path of the URL. --Carey
Well, there's the RFC and then there's the actual behavior of the software... PHP does not seem to consider %26 to be an ampersand for the purposes of extracting variables from the URL's *query* bit. At least my reading of the RFC agrees with it: “3.4 Query Component ... Within a query component, the characters ..."@", "&", "="... are reserved.” It's not a problem in the path, only in the query when you're e.g. editing the page. --BV
The URL rewriting to give nice URLs like http://www.wikipedia.com/wiki/MainPage rather than .../wiki.phtml?MainPage makes this a bit more complicated. There's no question mark in the URL for this edit page, so Apache is probably justified in converting %26 to & internally, before processing the Alias or RewriteRule directive, or http://www.wikipedia.com/%77%69%6B%69/ wouldn't work. --Carey

(Ideally the URL would be encoded as %CE%B2-cyclodextrine, the UTF-8 encoding of GREEK SMALL LETTER BETA.)

Impossible until the database is converted from ISO-8859-1 to UTF-8. --BV
I would just write <? echo urlencode(recode("h..utf8", $title)) ?>. --Carey
Yeah, that could probably work as long as titles are normalized internally. I'll try banging the code into place... --BV
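
To make the target encoding concrete, here is a short Python sketch of percent-encoding a title's UTF-8 bytes. It only illustrates the encoding discussed above and says nothing about how the PHP code does it; encode_title and decode_title are hypothetical names.

```python
from urllib.parse import quote, unquote

def encode_title(title: str) -> str:
    """Percent-encode the UTF-8 bytes of a title; nothing but unreserved characters is left bare."""
    return quote(title, safe="")

def decode_title(encoded: str) -> str:
    """Reverse encode_title, decoding the bytes as UTF-8."""
    return unquote(encoded)

# GREEK SMALL LETTER BETA followed by the rest of the title.
print(encode_title("\u03b2-cyclodextrine"))  # %CE%B2-cyclodextrine
print(encode_title("Roger & Me"))            # Roger%20%26%20Me
```

Encoding the ampersand as %26 here keeps it from being read as a separator when the title ends up in the query part of the URL.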

References: RFC 2396, W3C on i18n of URIs


--Carey Evans


Problem in "printable version" page?

Please go to Category theory and try the "printable version" link; you will probably see that the word functors remains as a blue link instead of becoming simple italics text. I was unable to spot any sort of difference from other links that would cause this strange behaviour, and I suppose it can be considered a bug, since the printable version should not contain any link in the text part. Daniel M

Yup, it's a bug, caused by the fact that the link looks like [[functor|<b>functors</b>]]. It's fixed in the development version of the code. AxelBoldt
STATUS : Solved in CVS
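
For the curious, a Python sketch of what the printable version is expected to do with links, assuming (per the report above) that a link should come out as plain italic text. It handles both [[target]] and [[target|label]], including labels that contain markup; PIPED_LINK and delink are invented names, not the development code.

```python
import re

# [[target]] or [[target|label]]; the label may contain HTML such as <b>...</b>.
PIPED_LINK = re.compile(r"\[\[([^\]|]+)(?:\|([^\]]+))?\]\]")

def delink(wikitext: str) -> str:
    """Replace each wiki link with its display text in italics."""
    def repl(m):
        label = m.group(2) or m.group(1)  # use the target when there is no label
        return "<i>%s</i>" % label
    return PIPED_LINK.sub(repl, wikitext)
```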

I've had this problem show up a few times, and can't really be sure whether it's from the system or my own machine. Most recently it has shown up in famous women in history. On the edit page there is text between Mildred Zaharias and Margaret Thatcher, including a section heading - but when saved these two names appear together on the same line without even a space between them. Eclecticology

STATUS: DUPLICATE BUG, SEE NUMEROUS REPORTS AT THE TOP OF THE PAGE ABOUT MISSING TEXT AFTER LINKS THAT CONTAIN INVALID CHARACTERS
Thanks! Perhaps someone with more experience in these matters could start a "troubleshooting" page with step-by-step instructions for fixing this kind of bug on a page.

This is new (since late this afternoon, Pacific Time). The top of pages has this:

Warning: Supplied argument is not a valid MySQL result resource in /home/wiki-newest/work-http/wikiPage.php on line 86

Warning: Supplied argument is not a valid MySQL result resource in /home/wiki-newest/work-http/wikiPage.php on line 88

The "Recent Changes" page doesn't have these lines, but pages that I get to by clicking their names on the Recent Changes page do. Redirected pages have each error message twice. -- Marj Tiefert, Tuesday, May 7, 2002

Yes, I'm seeing that now too... (Note to other developers about this bug: the running version of wikiPage.php is, I believe, 1.112 from 2002-04-02. The errors are coming up in wikiPage::load() on mysql_fetch_object() and mysql_release() calls when grabbing data out of the unlinked table. Unfortunately, the developer's database access function is still disabled, so I can't test the query manually to see what's going wrong... I haven't seen any similar errors from the preceding query on the linked table.) Brion VIBBER, Tuesday, May 7, 2002

I'm getting the same error on almost all screens of the wikipedia. Yesterday it was reporting errors in lines 84 and 86 and refusing to load screens at all. Today the error has switched to line '99', with this showing up at the top of virtually every screen I go to: Warning: Supplied argument is not a valid MySQL result resource in /home/wiki-newest/work-http/wikiPage.php on line 99 ~KJ Wed 8th May 2002


---

The colours are all wrong as of this morning. View source on Netscape shows the BODY HTML tag has `textcolor=" TEXT" bgcolor=" BGCOLOR"' suggesting that a string hasn't been properly substituted for a variable somewhere. My Netscape browser interprets the bgcolor value as dark green, making the black text difficult to read, and the blue links even harder. Changing skins makes no difference. Malcolm Farmer, Thursday, May 9, 2002