Wikipedia talk:Version 1.0 Editorial Team/Index
Intended meaning of importance
There was a disagreement at Talk:NPA personality theory about how to rate the article's quality and importance for Wikipedia:WikiProject Psychology. The quality issue was cleared up pretty quickly, but there was quite a discussion about what exactly "importance" is supposed to mean in this context, and some additional discussion took place at Wikipedia talk:WikiProject Psychology/Assessment. I know that the WikiProjects are not required to use the same assessment scale as the Version 1.0 Editorial Team, but I think we want it to be as consistent as possible. I wanted to ask this group, what exactly are you looking at when you assess an article's importance? How do you intend to use this assessment in your project? I appreciate any feedback you can give us to clear this up! —Cswrye 19:14, 19 October 2006 (UTC)
- For WP:TWP assessments, the rule of thumb I've been using is judging how vital an understanding of the article's topic is to understanding the history and technology of rail transport worldwide. The top-level technology definition articles on {{Train topics}} are the only articles we've marked with Top importance, while at the other end, almost all subway stations are rated as Low importance. So far, only one of the ratings that I've applied has been questioned (the other editor thought an article should be High rather than Mid importance), so the guidelines that I put on the Assessment page seem to have been accepted by the rest of the project. Slambo (Speak) 19:22, 19 October 2006 (UTC)
- As was stated in the NPA theory talk page, we at WP:1.0 are using importance to prioritise articles within a particular subject area. Your importance criteria look excellent to me - they fit within our broad guidelines, but they have been tailored for your particular subject. If we are to produce a DVD with 20,000 articles, perhaps 200 of those might be related to psychology. Which 200 should we pick? A featured article or even good article that is equivalent to Slambo's project's Jordanhill Railway Station (i.e., low importance) may be a nice read, but it has no place in a general encyclopedia. However a topic like behaviorism (high importance) would be appropriate, even if only B-Class quality. We are currently looking to set up a bot that will pull out all of the usable articles with a certain level of importance, while allowing for the scope of the WikiProject (thus, a "High-importance" WikiProject:Psychology article would carry more weight than a "High-importance" WikiProject:Behaviourism article, say). We will also compensate for how a project grades, to allow for a project trying to cram all its articles as Top or High importance. Once we have some pilot trials done, I'll let people here know what's happening.
- Scanning quickly through the comments on the talk page you cite, I see the problem is perhaps one that is more common in psychology than in many fields (my field is chemistry, where things are more hierarchical). There may be a lot of theories out there that are broad in scope, but only a few of them are widely accepted. IMHO the "widely accepted" (and known) should trump the broad scope every time. Would a fresh psychology BS/BSc/BA know about this theory? Would Britannica have an article on this topic? If the answer to both of these is NO then it is Mid or Low. If it's Mid or Low, and you were to look at a newly-written encyclopedia of psychology, would you expect to see it included? If not, then it is probably Low importance; if yes then it would be Mid. If the theory gains popularity, then it could rise in importance over time, but we shouldn't be trying to push things ahead of the psychology community - we can't assume that this particular theory will become important in the future. Does all this seem reasonable? Walkerma 20:38, 19 October 2006 (UTC)
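The rule of thumb above can be written out as a small decision function. This is only a sketch of the questions as asked here: the function name and argument names are made up for illustration, and the source says nothing about how to choose between High and Top, so that branch is left as the project's call.

```python
def suggest_importance(known_to_new_graduate: bool,
                       in_general_encyclopedia: bool,
                       in_subject_encyclopedia: bool) -> str:
    """Suggest an importance band from the three yes/no questions.

    known_to_new_graduate:   would a fresh BS/BSc/BA in the field know the topic?
    in_general_encyclopedia: would a general encyclopedia have an article on it?
    in_subject_encyclopedia: would a newly written encyclopedia of the field include it?
    """
    if known_to_new_graduate or in_general_encyclopedia:
        # The rule of thumb only places a topic in Mid or Low when BOTH
        # answers are "no"; anything else is left to the project (assumption).
        return "above Mid (project's call)"
    # Both answers were "no": Mid if a subject encyclopedia would cover it, else Low.
    return "Mid" if in_subject_encyclopedia else "Low"
```

For example, a theory that new graduates would not know and a general encyclopedia would not cover, but that a specialist encyclopedia would include, comes out as Mid.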
- Makes sense to me. In fact, the way that you portrayed importance is exactly the way that I think it should be assessed. I appreciate the input. —Cswrye 15:19, 27 October 2006 (UTC)
- I also support the suggested approach in general, and in particular for the psychology project. Rfrisbietalk 16:44, 27 October 2006 (UTC)
- Update. The article in question was deleted and it's now a story on Slashdot, where they say (incorrectly) that it's a hoax. --Salix alba (talk) 10:53, 6 November 2006 (UTC)
Non-article class parameters
The following postings were copied by User:Walkerma from the main 1.0 talk page
Can anyone advise on how the non-article class parameters are supposed to work for the purposes of these combined WikiProject/assessment banners being placed on non-article talk pages? I have seen these banners placed on talk pages for relevant categories and templates using class=NA, class=template, class=category, and so on. But the approach doesn't seem completely consistent.
An example is Category:Template-Class_film_articles where the Film WikiProject has grouped templates that have been "rated" template-class. This imprecise wording is avoided if the "NA wording" is used to say that the template is a template and doesn't need rating. Some templates have been set up to do this, but I can't find any examples at the moment. Can anyone remind me where they are, or how to tweak the wording?
Going back to the film non-article parameters. The blurb on Category:Film_articles_by_quality shows that the system has been extended to include other classes such as List, Category and Disambig (I haven't found anyone yet using a "redirect" class to organise redirects, though see Category:Middle-earth redirects). I assume, that like the NA classification, these "non-article" classifications don't appear in the film quality statistics page and other stats pages, which I believe are maintained by a bot. I can understand why it doesn't include them directly, but what is the best way to generate statistics based on these non-article parameters such as NA, category, and template?
An alternative approach is seen at WikiProject Middle-earth, where Template:ME-project is used on article talk pages, Template:ME-category is used on category talk pages, and Template:ME-template is used on template pages.
Is there any reason to prefer putting all the parameters inside one template (as in the Film WikiProject), or to use separate banners (the Middle-earth WikiProject)? I prefer the latter approach, but was wondering if the assessment statistics approach could be adapted to include stats on the number of templates, categories and other non-article pages? Carcharoth 11:32, 21 October 2006 (UTC)
- Another approach is the one I have started at Wikipedia:WikiProject_Middle-earth/Assessment#Page_types, where I am proposing a separate set of "page type" parameters and a separate line in the banner on which to display this parameter. Would this be helpful? Part of the reason for this is that it would be helpful to be able to assess some lists (currently, people tend to mix a "list" parameter into the rating scale), and in some cases to assess some of the larger templates (though this is not essential). Carcharoth 15:00, 21 October 2006 (UTC)
End of copied comments
- Well, {{WPMILHIST}} uses the wording change trick, but it only supports the single "NA" class. If you wanted to have multiple classes, doing it with if statements would probably be too ugly to even contemplate; the cleaner solution would presumably be to use a switch statement for that line in the template:
{{#switch:{{{class}}}
 |FA |A |B |Start
 |Stub= This article has been rated as {{{class}}}-Class.
 |Dab= This article is a disambiguation page, and does not require a rating.
 |Cat= This page is a category, and does not require a rating
 ...
}}
- There doesn't seem to be any particularly good way of getting statistics for these, but I'm not entirely certain why they would be all that useful anyways.
- As far as having a separate field for these is concerned, I don't think it's a good idea. With the possible exception of lists (but those are genuine articles, and should really be assessed as any other article is), the various things that get these other tags can't be meaningfully assigned to any of the quality levels. It's meaningless, in other words, to talk about an "A-Class category" or a "FA-Class disambiguation page". If they're all just going to have "NA" in the first field, though, I don't see the point in introducing a second one; the existing "class" parameter can just as easily be used, as it's not doing anything productive there anyways. Kirill Lokshin 05:38, 22 October 2006 (UTC)
- Thanks for the sample code. That is very helpful. I found that having separate banners was easier and produces the same result, though any general wording changes have to be tweaked over 4-5 templates, but I'll have a look at the coding sometime. Someone also mentioned that some templates are complex enough to need a rating, e.g. Template:Saffir-Simpson_full. Part of the problem is that I am using the example of the Assessment bot (Mathbot) to try and generate statistics about all the pages maintained by a WikiProject, not just the pages that need assessing. That is of more interest to the WikiProject than the assessment project. I agree that class=List is not helpful, but the existence of a separate "Featured-list" process means that Class-FA is not applicable, so another quality parameter is needed for lists. Carcharoth 09:58, 23 October 2006 (UTC)
India project and Trains banner templates have the following class values.
- Disambig or Dab - The article is a disambiguation page.
- Redirect or Redir - The article is a redirect page.
- Template - The page is a template.
- Category or Cat - The page is a category.
- Image or Img - The page is an image.
- List - The page is a list.
- NA - any other than the above.
If you need more information, check out the banner templates. Regards, Ganeshk (talk) 06:41, 22 October 2006 (UTC)
Those classifications have absolutely no bearing whatsoever on Wikipedia 1.0 assessments (which this page is about). Mathbot doesn't read them and it really doesn't matter how they're formatted. That's why there's no standard scheme. I agree with Kirill about class=List and said pretty much the same when somebody mentioned that my plugin doesn't support it: a list is an article, it can be featured, it should be assessed. --kingboyk 11:37, 22 October 2006 (UTC)
- These classifications help the project in grouping them into different categories. I hope Mathbot would someday check them too. :) The banners I think are useful for both 1.0 assessments and a general awareness of article count and quality within the project. Until now, there was no such project-level statistic. Some lists are just list of wiki-links. Those would be tagged class=List. Lists that have good content are given class ratings such as FA, GA etc. -- Ganeshk (talk) 16:44, 22 October 2006 (UTC)
- If they're just links they should be a category or (if they're a red link farm) a worklist in WikiProject space, imho.
- Agreed about the usefulness of banners and Mathbot's work; I'm just giving some historical perspective as to why that part (class=NA etc) isn't standardised. --kingboyk 18:56, 22 October 2006 (UTC)
- I understand that Mathbot is set up to deal with assessment categories. I should have made clearer that I'm using its example to set up similar statistics pages. I really like the stats pages that this bot generates. So would it be better to try and get a separate bot to run over any categories that have been set to track the templates, categories and whatnot that a WikiProject also deals with, and what would be the best way of doing that? It is fairly easy to manually cut and paste categories and set them up as a list in a Wikipedia project page to visually survey as a tree, and also as a more permanent snapshot than the toolserver CategoryTree tool, but I like the idea of getting a bot to do the counting. Carcharoth 09:58, 23 October 2006 (UTC)
- Specific WikiProjects of course have different/more needs than the WP1.0 project. I'd be rather reluctant to modify the current bot script to serve such (diverging) requests. However, I'd be more than happy to give my Perl code (or at least the subroutine for reading categories) to anybody who knows some Perl and would be willing to implement extra features for his/her specific WikiProject (although for somebody starting fresh a better idea may be to use the Python bot framework instead, which is more mature than the Perl bot framework). Oleg Alexandrov (talk) 15:00, 23 October 2006 (UTC)
Thread moved from Wikipedia talk:Version 1.0 Editorial Team/Work via Wikiprojects
Hi,
There's a problem with one of the comments at Wikipedia:Version 1.0 Editorial Team/Ethnic groups articles by quality. The article in question is "Obotrites," but it has weirdness in the comment section... help would be appreciated!--Ling.Nut 15:50, 21 October 2006 (UTC)
PS - it may have something to do with the fact that the comment page for "Rukai people", which probably appeared in that particular slot previously, was deleted. --Ling.Nut 16:01, 21 October 2006 (UTC)
- I'm not sure if Mathbot had formatted the page correctly or not (I think maybe not, so please check the old revision Oleg), but one immediate problem was that Talk:Rukai people/Comments was redlink (it had been deleted). Another potential problem for including comments in tables was that one of your members has a | in his signature. --kingboyk 16:13, 21 October 2006 (UTC)
- That was a bug in my code which was triggered by redlinked comment pages (as remarked above by Kingboyk). It was actually a big bug, I wonder how come it did not cause more trouble. Fixed now. Thanks! Oleg Alexandrov (talk) 15:33, 22 October 2006 (UTC)
Two template suggestions
I would like to make two template suggestions:
- The Jim Thorpe Problem - If you look at the referenced talk page, you'll see banners for the following WikiProjects: Beisbol, Penn, OK, Indigenous peoples, Biography and NFL. It would be good to have a template with a single assessment that allows more than one WP to be listed.
- The Overlapping Problem - If you look at WikiProject California and WikiProject Southern California, or WikiProject Pennsylvania and WikiProject Philadelphia, you'll see that they overlap. It would be good to have a template for the lower group in the hierarchy that allows it to have its articles placed in both WikiProjects.
--evrik 20:03, 24 October 2006 (UTC)
- Yes, I agree, this is a problem we've been discussing here and at WP:WVWP. The problem is the explosion of these templates and assessments. We already have a few hybrid templates - all chemical elements are in Wikipedia 0.5, so we have a joint Version 0.5/Chemistry template. The key here is not the technology - that's easy - the key is to get the projects to talk to one another. I think this is probably best handled through WP:COUNCIL, but that's a pretty new group with little clout as yet, so much of the template consolidation for now will have to happen on a case-by-case basis, with projects agreeing things between each other. Try posting a comment at the relevant WikiProject! Thanks, Walkerma 21:59, 24 October 2006 (UTC)
I found a silly example of this at List of fictional battles (I know, I know...). Given that the list can be expanded almost indefinitely, does each fictional universe WikiProject get to assess the list? But I am sure that better examples of overview articles can be found. Such as Human or Earth - several WikiProjects have probably fought pitched battles over those articles already! :-) I tend towards the share-and-share-alike mentality, but then I found someone had put a WP-Film template on Tom Bombadil, and I went livid! :-) But I do have issues with articles where there are bits about adaptations in films, eg. many of the LotR character articles have a bit about the film adaptation, so Gandalf has a "film-project" tag because there is a small section about Adaptations. For LotR, there are separate film articles, so that is not so much of a problem. But where overlap is so unbalanced, at what point can another WikiProject get a foot in the door, so to speak? Carcharoth 05:45, 25 October 2006 (UTC)
- The fundamental criterion, I think, is whether there is (or should be) content in the article related to the scope of the project. In your examples, Gandalf would be in Films because there's a discussion about him in films, while Bombadil probably isn't (but could be, if editors decide to add discussion of why he was cut from the films to the article). In general, though, the projects themselves are usually quite content to let other projects tag "their" articles; it's mostly the people who don't like the tags that complain. ;-) Kirill Lokshin 05:49, 25 October 2006 (UTC)
- It just seems a bit silly for the film-project tag for Gandalf to have a rating of B-class. Is that a rating for the whole article, or just the paragraph on Gandalf in films? We know it is the film bit, but how does that square with getting a rating that means anything in a list of film-related articles? The Cultural depictions of Joan of Arc mentions films about her. Does that mean that there is any point in the WP-Film people rating the article? I'd say not. Carcharoth 06:01, 25 October 2006 (UTC)
- In practice, it's a rating for the whole article. In many cases, the ratings will match among the various projects (sometimes because the same person will fill out all of them); in other scenarios, however, they may vary depending on what's being looked for. For example, suppose that Cultural depictions of Joan of Arc discussed novels at length but glossed over the films; it might then get a high rating from the Novels project (as everything they looked for would be included) but not from the Films project (members of which would probably be more likely to notice the omissions). Kirill Lokshin 06:08, 25 October 2006 (UTC)
- Right. I've also been considering whether to see if the Disaster management WP has the manpower to assess disaster articles. There, the logical course of action would seem to be to assess those articles that aren't already covered by, say, aviation, or trains, or earthquake, or hurricane, WPs. A kind of, fill-in-the-gaps approach for a broad, overview project. That could also apply to the Meteorology WP mentioned above. So having a parameter in the banner template coding that says "still part of this project, but not rated because a sub/sister project has rated it", or something? Worth thinking about? Carcharoth 06:34, 25 October 2006 (UTC)
...has no assessed articles shown, 327 unassessed, but a total below the importance ratings of 1339. Category:Stub-class Germany articles, for example, contains hundreds of articles and is properly situated in Category:Germany articles by quality which is a subcat of Category:Wikipedia 1.0 assessments. Any idea what's going on? --kingboyk 13:55, 22 October 2006 (UTC)
- It should be "Stub-Class" and not "Stub-class" (uppercase "Class"). I now made the bot accept the lowercase version too [1]. Such things go against the nature of Wikipedia, which is case-sensitive; let us hope that won't introduce more bugs in the bot. Oleg Alexandrov (talk) 16:19, 22 October 2006 (UTC)
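A minimal sketch of the kind of tolerant matching described above, in Python rather than the bot's Perl. The naming pattern "<Grade>-Class <project> articles" is inferred from the category names quoted in this thread, and the function name is hypothetical; this is not the bot's actual code.

```python
import re

# Accept both "-Class" and "-class"; everything else is matched as written.
CLASS_CATEGORY = re.compile(
    r"^(?P<grade>[A-Za-z]+)-[Cc]lass (?P<project>.+) articles$"
)


def parse_quality_category(name):
    """Return (grade, project) for a quality category name, or None if it does not fit."""
    match = CLASS_CATEGORY.match(name.strip())
    if match is None:
        return None
    return match.group("grade"), match.group("project")
```

With this, "Stub-class Germany articles" and "Stub-Class Germany articles" are both read as the Stub grade for Germany, while a parent category such as "Germany articles by quality" is not mistaken for a grade category.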
- Aha. I'd have just fixed the category and sent them a nasty letter personally, but you're the boss! ;) --kingboyk 18:55, 22 October 2006 (UTC)
- That was my first idea too. But then I realized that such things could happen in the future too, and the changes to the code were not big (and luckily it was Sunday morning and I had time to kill :) Oleg Alexandrov (talk)
150 projects milestone
The index currently shows 150 projects are participating in the bot process. Should this go to Signpost? -- Ganeshk (talk) 18:43, 24 October 2006 (UTC)
- It will help recruit more projects to participate too. -- Ganeshk (talk) 18:43, 24 October 2006 (UTC)
- It probably could—I doubt anyone will look at it too closely—but I'll point out that we don't actually have 150 separate projects. WP Beatles, for example, is responsible for a half-dozen different lists from the 150. Kirill Lokshin 18:47, 24 October 2006 (UTC)
- (Edit conflict) I don't think we should send things too often - it's only about a month since they covered us reaching the 100,000 assessed article mark (and now we've passed 150,000!). I think we should hold off until 200 or (preferably) 250 projects, and/or 250,000 assessed articles. I disagree with Kirill about the value of this; I know for certain that some people found out about the assessment program through our recent publicity. Walkerma 18:53, 24 October 2006 (UTC)
- Oops, I wasn't very clear there; I meant people wouldn't look too closely at the fact that we don't actually have 150 projects, not that they wouldn't care about the announcement. ;-) Kirill Lokshin 19:04, 24 October 2006 (UTC)
- Ironically enough, Kirill brings a point that recently reared its head in WP:WPTC: whether it is a good idea to make navigation a bit more hierarchical, and put, instead of Category:Tropical cyclone articles by quality and Category:Meteorology articles by quality as separate entities, Tropical cyclones inside Meteorology, but still existing as separate entities. For the bloody details of the discussion, you can see here, but it would also be helpful for WP Beatles, perhaps. Titoxd(?!?) 05:55, 25 October 2006 (UTC)
- There's also a few empty entries and at least one duplicate (see Wikipedia_talk:Version_1.0_Editorial_Team/Index#Article_counts). --kingboyk 16:11, 5 November 2006 (UTC)
Problem with WP:Dallas
I can't figure out why articles aren't showing up in Category:Dallas articles with comments. An example of a page that should is Talk:Oak Lawn, Dallas per Talk:Oak Lawn, Dallas/Comments and coding from {{WikiProject Dallas}}. Any help/advice would be great!! drumguy8800 C T 19:30, 26 October 2006 (UTC)
- This is fixed. I had to add a little space before the sort key. That did the trick. Regards, Ganeshk (talk) 04:07, 27 October 2006 (UTC)
Mixed up entries on WP:HV
I don't know why but it looks like a few entries are mixed up for WP:HV. Please let us know if there is a bug in our code and if there's any good way to avoid this. The relevant entries are: Wikipedia:Version 1.0 Editorial Team/Heraldry and vexillology articles by quality/1 (articles 2 and 3), and Wikipedia:Version 1.0 Editorial Team/Heraldry and vexillology articles by quality/2 (articles 387 and 388) and Wikipedia:Version 1.0 Editorial Team/Heraldry and vexillology articles by quality/4 (articles 37 and 38 and 391 and 392). Regards. Valentinian (talk) / (contribs) 09:27, 27 October 2006 (UTC)
- The problem appears when an article has a /Comments page (coming from another wikiproject assessment, since WP:HV doesn't use comments). It looks like there's a }} missing when the comments are transcluded into the worklist, e.g. (for the first example above)
{{assessment | page=[[Elias Ashmole]] [http://en.wikipedia.org/w/index.php?title=Elias_Ashmole&oldid=75298981 ] | importance= | date=[[October 6]], [[2006]] | class={{FA-Class}} | version=0.5 | comments={{Talk:Elias Ashmole/Comments }}
- Editing the page to add an extra }} after /Comments and before the closing }} of the {{assessment}} template fixes the problem for the current instance of the page, but this will of course get overwritten with the next update. --Dr pda 12:03, 27 October 2006 (UTC)
- That's a bot bug. The bot was adding the "}}" just fine, the problem is that when it was reading its own output next time it was not reading the "}}" in. I fixed the bug, see here. Thanks for the report. Oleg Alexandrov (talk) 05:31, 28 October 2006 (UTC)
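One way to avoid this class of bug when reading an entry back is to find the end of the {{assessment}} call by counting nested braces, so that the "}}" of a transcluded /Comments page is not taken for the end of the entry. The following is a rough sketch under that assumption, not the bot's actual fix; the function name is made up, and it does not handle triple-brace parameters such as {{{class}}}.

```python
def find_template_end(text, start):
    """Return the index just past the '}}' that closes the template opening at `start`.

    Nested '{{ ... }}' pairs (for example a transcluded /Comments page) are
    skipped by tracking the nesting depth.
    """
    if not text.startswith("{{", start):
        raise ValueError("no template opens at this position")
    depth = 0
    i = start
    while i < len(text) - 1:
        pair = text[i:i + 2]
        if pair == "{{":
            depth += 1
            i += 2
        elif pair == "}}":
            depth -= 1
            i += 2
            if depth == 0:
                return i  # matching close of the outermost template
        else:
            i += 1
    raise ValueError("unbalanced braces: template never closes")
```

An entry that is missing its final "}}", as in the example quoted above, raises an error instead of being silently read short.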
- Thanks for the help. Valentinian (talk) / (contribs) 07:54, 28 October 2006 (UTC)
Article counts
I was wondering if the bot ought to ignore WikiProjects which have 0 articles? (Index · Statistics · Log)
Also, since you collect the stats for every project on every run, it might be cool to add columns to the table on this page for the total number of articles (per project) and the number unassessed? --kingboyk 15:39, 18 October 2006 (UTC)
- Bump. I found another empty list today (Index · Statistics · Log; see also Index · Statistics · Log). --kingboyk 16:07, 5 November 2006 (UTC)
- I would think that if you want the stats for a project, you could just click on the stats link, available on every line in the index page. I'd think that having the total stats only in the index page should be enough.
- All I can say is that Mortal Kombat created the assessment page only in the middle of last month, and Taiwan only a few days ago, by the histories. I think maybe the time has come to remind Mortal Kombat that they said they were going to assess their articles, and that they might lose the option if they don't start, and Taiwan could certainly be told that we're ready for them to start assessing whenever they are, but I'd give them at least a month each before actually removing them. Badbilltucker 14:25, 6 November 2006 (UTC)
- Thanks for checking these, Bill. IMHO only reminders are in order at this stage, but since these categories aren't doing any harm I don't think we should start using threats for quite a while yet! Walkerma 16:32, 6 November 2006 (UTC)
- About empty projects, I don't see what would be gained by ignoring projects which have zero articles. They are very few and the bot does not use much energy in updating those, as there is nothing to update. :) Oleg Alexandrov (talk) 16:26, 5 November 2006 (UTC)
- Just to prove I've too much time on my hands I made a combined Project league table.
- Results are interesting, showing different approaches of some projects to grading. Some add the template to all their articles and then get on with grading; others only grade their best articles. --Salix alba (talk) 19:10, 5 November 2006 (UTC)
- ...and some use a bot (which gives large unassessed and auto-assessed-as stub numbers); some projects have assessed all the articles within their scope that they know about so have 0 unassessed both in the chart and literally. --kingboyk 19:18, 5 November 2006 (UTC)
- And you say that you have too much time on your hands? ;) Titoxd(?!?) 19:00, 6 November 2006 (UTC)
- How about % of articles at each grade (including unassessed)? That might be interesting. --kingboyk 19:16, 5 November 2006 (UTC)
- OK, I've added this to User:Salix alba/Project league table. --Salix alba (talk) 19:40, 5 November 2006 (UTC)
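For anyone wanting to reproduce the percentage columns, the arithmetic is just each grade's count divided by the project total, with unassessed articles counted in that total. A small sketch; the function name is made up for illustration.

```python
def grade_percentages(counts):
    """Return each grade's share of the project total, in percent (one decimal place).

    `counts` maps a grade name (including "Unassessed") to a number of articles.
    """
    total = sum(counts.values())
    if total == 0:
        # An empty project: avoid dividing by zero and report 0% everywhere.
        return {grade: 0.0 for grade in counts}
    return {grade: round(100.0 * n / total, 1) for grade, n in counts.items()}
```

As a worked case with made-up numbers, a project with 2 FA, 8 B, 30 Start, 50 Stub and 10 unassessed articles has 100 articles in total, so the shares are 2%, 8%, 30%, 50% and 10%.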
- Nice work. Now we need to ask Oleg nicely and perhaps Mathbot will write it out every day? :) --kingboyk 19:47, 5 November 2006 (UTC)
- Is it going to be useful? So far I see a large table with all kinds of stats which people may wonder about once in a while but which would not be worth updating every day, I would think. OK, if people think it would be worth updating every day, I could do it.
- By the way, Salix alba, actually it may be a nice challenge for you to write a Perl bot to read those pages, make the data go through the script you made already, and publish the data back. What do you think? Oleg Alexandrov (talk) 03:50, 6 November 2006 (UTC)
- I can help you set up the bot. :) Oleg Alexandrov (talk) 03:51, 6 November 2006 (UTC)
- Useful? Probably not, Oleg, no - I just thought it was cool in a geeky kind of way :) --kingboyk 11:48, 6 November 2006 (UTC)
- Kind of marginally useful. The table tells us a few useful things: there are still 33 unassessed V 0.5 articles, and there are only 22 countries articles. Both of these are things which could be quickly fixed. Having an overview makes it easier to spot such items. I know that User:Lincer is currently tagging random articles and he might find it a bit more effective to target some of the projects with big unassessed categories.
- The table does not really need to be updated daily; a monthly update would probably suffice.
- Yes, I have thought about getting a bot account, probably to help make lists of mathematics articles by field, without having to bug Oleg too much. --Salix alba (talk) 17:04, 6 November 2006 (UTC)
- Good luck with that. Setting up a bot is very easy as soon as you install WWW::Mediawiki::Client, which has a lot of dependencies. I'd say it is worth you giving it a try. :) Oleg Alexandrov (talk) 05:41, 7 November 2006 (UTC)
- I converted the table to an OpenOffice spreadsheet format. You can find it at File:Project league table.sxc at Commons. This would help in running queries on the data. Regards, Ganeshk (talk) 03:24, 7 November 2006 (UTC)
Bot went mad
Today the bot was editing just fine for a while, until here, then it started mass-blanking the articles. That is something I can't explain: if there is an error in the script, then everything should be wrong, and if there is no error, then all the pages should be right. I really must go to bed now, so I just killed the bot and will look into this tomorrow. Sorry, I don't know what is going on. Oleg Alexandrov (talk) 06:06, 7 November 2006 (UTC)
As if one can sleep. :) I found the problem; the html source code for categories changed suddenly in a very subtle way, but enough to confuse the bot. I fixed it now. Here are some issues to think about.
- What to do with the roughly hundred and sixty blanked pages (starting from the link above -- album articles by quality). The admin rollback applied to an article would do a mass revert not only of the last mathbot edit, but of all the mathbot edits to that article until a non-mathbot edit. Can the javascript tool do better?
- I modified the subroutine which collects articles and subcategories from a category to just die if something is suspicious and not ruin everything. Whether this will take care of all the problems is not yet absolutely certain. The moral is that parsing data from fickle html source code is a bad idea in principle. Any ideas? Oleg Alexandrov (talk) 06:51, 7 November 2006 (UTC)
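The "die if something is suspicious" idea in the second item can be sketched as a guard that runs before any page is written. What counts as suspicious is not stated above, so the two checks here (a listing that is suddenly empty, or one that has shrunk by more than a set fraction) and the names `check_listing`, `SuspiciousListing` and `max_drop` are all assumptions for illustration, written in Python rather than the bot's Perl.

```python
class SuspiciousListing(Exception):
    """Raised when a freshly parsed category listing looks wrong."""


def check_listing(new_titles, previous_count, max_drop=0.5):
    """Return `new_titles` unchanged, or raise instead of letting the bot write a page.

    previous_count: number of articles listed on the last successful run.
    max_drop:       largest fraction of articles allowed to vanish between runs
                    (a hypothetical threshold, not taken from the bot).
    """
    if previous_count > 0 and not new_titles:
        raise SuspiciousListing(
            "category parsed as empty but had %d articles last run" % previous_count
        )
    if len(new_titles) < previous_count * (1 - max_drop):
        raise SuspiciousListing(
            "category shrank from %d to %d articles" % (previous_count, len(new_titles))
        )
    return new_titles
```

A guard like this stops the run at the first bad category, so a change in the page source leads to a failed update rather than a blanked worklist. A project that really is new or empty (previous count of 0) passes through untouched.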
PS I will not run the bot until something is done about item 1 above. Restarting the bot would of course regenerate all pages, but it will lose the history link information (see for example the first column here for what I mean). Oleg Alexandrov (talk) 06:54, 7 November 2006 (UTC)
- Glad you found the problem! There's nothing worse than an inexplicable bug! I wonder if we went through the pages with AWB, semi-manually reverting the Mathbot changes? If it's only 160 pages, that's doable isn't it? Walkerma 06:58, 7 November 2006 (UTC)
- From my reading of WP:BOT, there are various query APIs available (meta:API and User:Yurik/Query API); these might be more stable than directly reading the categories. --Salix alba (talk) 08:20, 7 November 2006 (UTC)
- I reverted a few projects manually that I am part of that got the boot. But I think we need to revert to save the history. Shane (talk/contrib) 08:33, 7 November 2006 (UTC)
- Beware of the rouge bot network, it'll start with a few innocent edits, then... Just spamming a little, Scoo 08:58, 7 November 2006 (UTC)
- I reverted today's edits to WP:DALLAS's list. You never know what you got 'til it's gone (er, temporarily whacked) ;) drumguy8800 C T 10:23, 7 November 2006 (UTC)
- I reverted a few but I'm not sure what to revert. Should we revert all of the last Mathbot edits made to the one before? This would include log, statistics, and the list (and its subpage). Cbrown1023 14:11, 7 November 2006 (UTC)
- What needs to be reverted is all the blanked pages (take the diff by looking at bot's contributions and see if the page got blanked). I will do a bunch of them by hand tonight, any help is of course appreciated. Oleg Alexandrov (talk) 17:17, 7 November 2006 (UTC)
- OK, I produced a list of what I believe are the corrupted pages, here. I've tried using AWB to revert these to the previous version, but I don't know how to do this. I know it can be done because I've seen lots of people use AWB to revert to the previous version after vandalism/blanking, which is in effect what is needed here. Can someone who knows AWB perhaps try this? I presume that once you set things up you can revert the problem in a few minutes. Thanks, Walkerma 18:01, 7 November 2006 (UTC)
- I don't think AWB can revert; it can only edit the current version. Wikipedia:Tools/Navigation popups can be handy for reverting, though it's a two-step process: first you need to do a diff or hist, and then hover over the revision you want and click revert. --Salix alba (talk) 18:19, 7 November 2006 (UTC)
- OK, those people must use some other tool as well. Anyway, good news. Looking at that list, and also at Mathbot's "contributions", it is clear that Oleg caught it once it reached the Firefly project's list. So I think it shouldn't be too hard to go through the list of 60 Foobar articles by quality pages manually. I will do that now. Should we also revert the logs, to make sure the Biography log doesn't choke as 100,000 articles are added into it? If we fix those pages, will Mathbot be able to fix the statistics page itself? Walkerma 18:46, 7 November 2006 (UTC)
- All the "by quality" list pages are now OK. It took me exactly 30 minutes to do this using the Auto-Martin Browser. Do we still need to revert the corresponding log pages? Meanwhile, I must get on with other things at work. Walkerma 19:28, 7 November 2006 (UTC)
- Yeah, the logs are borked too. Working on it, but I would appreciate some help... :) Titoxd(?!?) 20:13, 7 November 2006 (UTC)
- I think I reverted all of them now. Please double-check if I missed something... Titoxd(?!?) 21:35, 7 November 2006 (UTC)
- Thanks everybody. I had planned to go and revert after my bot once I finished work (meaning now), and now it appears that I was saved a couple of hours of hard work.
- The bot is fixed now, and again, I programmed it to just die if it can't detect subpages. I wish I could say there would never be problems again, but unfortunately one never knows, and the consequences of the bot going nuts would be hundreds of messed up pages. I guess the best we can do is to monitor the bot and block it if it misbehaves. Other ideas? Oleg Alexandrov (talk) 03:13, 8 November 2006 (UTC)
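A note on the "just die if something is suspicious" guard discussed in this section: the idea can be sketched roughly as below. This is a minimal Python illustration, not Mathbot's actual code. The function names, the regular expression, and the 50% threshold are all assumptions made for the example.

```python
import re


class SuspiciousListing(Exception):
    """Raised when a category listing cannot be parsed with confidence."""


# Hypothetical pattern for member links in a category page. The real
# markup is not shown in this discussion, so this is only a stand-in.
LINK_RE = re.compile(r'<li><a href="/wiki/[^"]+" title="([^"]+)">')


def parse_category_html(html, expect_members=True):
    """Return the page titles found in a category listing.

    If members were expected but none were parsed, raise instead of
    returning an empty list: a changed page layout is a more likely
    cause than a category that was emptied overnight.
    """
    titles = [match.group(1) for match in LINK_RE.finditer(html)]
    if expect_members and not titles:
        raise SuspiciousListing("no members parsed; markup may have changed")
    return titles


def safe_to_save(old_count, new_count, max_drop=0.5):
    """Return False when an update would remove most of a page's entries."""
    if old_count == 0:
        return True
    return new_count >= old_count * (1 - max_drop)
```

The second check is independent of the parser: even if a bad parse slips through, a page that would shrink from a hundred entries to none is not written.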
24-hour bot runs
What'll happen when there're so many projects and updates to do that it takes the bot more than 24 hours to do a run? Then we won't have daily runs anymore. Rlevse 13:04, 9 November 2006 (UTC)
- Well, just because yesterday's bot did not finish running should not affect the run of today's bot. But you are right, very huge runs are not a good thing.
- The script spends most time in fetching the history of each newly added article (or article whose rating changed) and creating a link to a version in history. That thing needs to be done on a per-article basis, while everything else is done on a per category or list basis. Oleg Alexandrov (talk) 16:57, 9 November 2006 (UTC)
- OK, but what about when it takes over 24 hours to do that? Will the bot crash, will we have to change to running only every 2 days, etc? I recall when my project was added there were only 30 projects involved but now there are 170+ and every week it takes longer. That's what made me think of this future possibility. Rlevse 19:06, 9 November 2006 (UTC)
- The bot won't crash in the sense that the two running bot instances won't be editing the same pages at the same time so they won't conflict. But it may crash because one bot already takes around 50% of the memory of my machine when running (it needs to keep all articles in the memory to compute the totals without repetitions). With two bots I may need to run them on different machines on alternate days or something. There could also be other issues coming up when the scale of the project increases. Let's see. :) Oleg Alexandrov (talk) 03:56, 10 November 2006 (UTC)
- Would this make it faster? Titoxd(?!?) 05:02, 10 November 2006 (UTC)
- This will make things faster and more robust I believe. Thanks! I think Salix alba also mentioned this earlier. I will look into implementing this alternative way of collecting categories in the near future. Oleg Alexandrov (talk) 16:11, 10 November 2006 (UTC)
- By the way, Oleg, I've contacted Yurik about this, and hopefully he can tell us if this can be done with the current Query API, or if we have to wait until the MediaWiki API is implemented. Titoxd(?!?) 05:21, 11 November 2006 (UTC)
- After talking with Yurik on IRC tonight, he said that we can request the pages in the assessed categories using the current Query API (as in the link I made above), do the back-end work the bot currently does (for example, determine which pages changed categories, which articles were added to the assessment lists, etc.) and then we can retrieve revision IDs for every change with a separate query, for example, this one for a bunch of hurricane articles: [2]. Now, I am not sure how many articles you can retrieve at once, but if you retrieve, let's say, up to 10 at a time, that should be much faster than what we currently do, and more robust, as well. Just remember that they're phasing out the Query API gradually for the MediaWiki API, so the syntax for the first query may have to be eventually changed. You can also change the format of the query at any time, to be more efficient with Perl. So, now, I shall go to sleep, as it is past 2:00 am here. Titoxd(?!?) 09:25, 11 November 2006 (UTC)
- The real problem right now is the amount of time needed to fetch the history of each newly added article. Right now, about 1/4 of the total articles are involved. As we get more and more articles in the system, if I understand this right, there will be fewer and fewer articles added to the system for the first time, and with any luck the run time will be reduced. I hope. Anyway, that's the impression I got from what he said. Badbilltucker 19:28, 9 November 2006 (UTC)
(Deindent and reply to Titoxd above.) Then perhaps we can wait until they change the syntax, as we don't want the bot to again mass blank everything because it is confused by the change. Oleg Alexandrov (talk) 19:32, 11 November 2006 (UTC)
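A note on the batching idea raised in this section (requesting revision IDs for up to 10 articles at a time rather than one by one): it might look roughly like the sketch below. It assumes nothing about the Query API's actual syntax. `fetch_batch` is a hypothetical caller-supplied function that takes a list of titles and returns a dictionary mapping each title to a revision ID, and the batch size of 10 is simply the figure floated above.

```python
from itertools import islice


def batches(items, size=10):
    """Yield successive lists of at most `size` items."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def fetch_revision_ids(titles, fetch_batch, size=10):
    """Collect {title: revision_id} using one request per batch.

    `fetch_batch` is hypothetical: it stands in for whatever call
    eventually talks to the server.
    """
    result = {}
    for chunk in batches(titles, size):
        result.update(fetch_batch(chunk))
    return result
```

With 25 changed articles this makes 3 requests instead of 25, which is where the speed-up would come from.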
Mathbot down
I am not able to connect to my UCLA machine today, the network is down. I guess that explains why mathbot stopped running in the middle of the night. Oleg Alexandrov (talk) 16:13, 10 November 2006 (UTC)
- My school's network came back up, and the bot should run as usual tonight. Oleg Alexandrov (talk) 03:08, 11 November 2006 (UTC)
- Today the network was down again. Now it is up. The bot should run as usual tonight. Oleg Alexandrov (talk) 19:32, 11 November 2006 (UTC)
Bot bug?
While looking through Wikipedia:Version 1.0 Editorial Team/Tropical cyclone articles by quality/1, I saw that the "edit comment" links are malformed. Am I the only one who saw that? Titoxd(?!?) 05:20, 11 November 2006 (UTC)
- That's a bug, and I guess it spread everywhere. Sorry. I fixed it now. Oleg Alexandrov (talk) 19:32, 11 November 2006 (UTC)
Articles by class and importance
Is it possible for Mathbot to create a table showing the correlation of class and importance? For example:
Table Name Goes Here?
Class \ Importance | Top | High | Mid | Low |
---|---|---|---|---|
FA | # | # | # | # |
A | # | # | # | # |
GA | # | # | # | # |
B | # | # | # | # |
Start | # | # | # | # |
Stub | # | # | # | # |
My thoughts are that it isn't really feasible with the way Mathbot works. I assume it simply loads the categories instead of loading each page, and that implementing this would seriously convolute the code, require an entire rewrite, and cause massive load. But I figured I'd just ask anyway, in case I'm completely wrong. :) thadius856talk 01:29, 14 November 2006 (UTC)
- Implementing this feature will not require a massive rewrite or a massive load. Each article is now an object knowing its quality (FA, A, GA, etc) and its importance (Top, High, Low). I would need to iterate through all the articles in a given project (say Beatles articles) and add to the appropriate cell in the table above.
- However, it is true that implementing this feature will require an amount of work, and it will generate a large amount of pages in addition to existing ones. I would be reluctant to work on this unless people believe that this feature will be extremely helpful to the project and worth updating such pages every day. For now I am not really convinced, for not too-large projects one can do some counting by hand to estimate some numbers in the table above, as the articles are sorted nicely first by quality and then by importance (priority), see for example Wikipedia:Version 1.0 Editorial Team/The Beatles articles by quality. Oleg Alexandrov (talk) 03:21, 14 November 2006 (UTC)
- Couldn't the table go in the statistics page for each project? (I'm still thinking whether this is a good idea or not...) Titoxd(?!?) 03:32, 14 November 2006 (UTC)
- Probably a good idea to put it there, if it's created; it'll keep down the number of extra pages, at the least.
- If this is created, incidentally: would it be possible not to generate an extra table for projects that don't have any importance assessments? It'll be rather unhelpful in those cases. ;-) Kirill Lokshin 04:58, 14 November 2006 (UTC)
- I think this table would be a very nice way to summarise the statistics; as long as the unassessed (NA) and the totals for each type were added in as well, as shown:
Table Name Goes Here?
Class \ Importance | Top | High | Mid | Low | NA | Total |
---|---|---|---|---|---|---|
FA | # | # | # | # | # | # |
A | # | # | # | # | # | # |
GA | # | # | # | # | # | # |
B | # | # | # | # | # | # |
Start | # | # | # | # | # | # |
Stub | # | # | # | # | # | # |
NA | # | # | # | # | # | # |
Total | # | # | # | # | # | # |
- It's worth noting (yes, I realise it may be obvious to some) that this isn't just an alternative format, it actually presents information that is available but is presently hidden in the tables. I presume that this table would go on the statistics page in place of the existing table - is that what is proposed? In that case, one thing I don't understand is why it would involve generating new pages - can someone explain that to me?
- One alternative, to save making Mathbot's code (and Oleg's life!) any more complicated, is to use MartinbotII. MartinbotII has been recruited by yours truly to start generating new tables based on importance and quality criteria - you can see the test results from Chemistry, Physics, Maths and a few from Medicine here. I am proposing that this bot runs weekly, generating lists of articles suitable for WP:1.0. If Martin (not me!) is willing to do so, maybe MartinbotII could generate statistics like this, with an indication (in the table) of which cells are included in the release and which are not. Walkerma 05:35, 14 November 2006 (UTC)
On a side note, I figured I'd update it with some color, just for the sake of it. ;) thadius856talk 05:40, 14 November 2006 (UTC)
x | Top | High | Mid | Low |
---|---|---|---|---|
FA | # | # | # | # |
A | # | # | # | # |
GA | # | # | # | # |
B | # | # | # | # |
Start | # | # | # | # |
Stub | # | # | # | # |
It seems that this idea has good support so I will modify the code which generates the statistics table to include the extra columns as above with quality vs importance.
That is going to make the stats tables wider than now (see Wikipedia:Version 1.0 Editorial Team/A-League player articles by quality statistics for current layout), but I hope that won't be a problem. I will work on this the coming weekend. Oleg Alexandrov (talk) 16:02, 14 November 2006 (UTC)
Table Name Goes Here?
Class \ Importance | Top | High | Mid | Low | NA | Total |
---|---|---|---|---|---|---|
FA | # | # | # | # | # | # |
A | # | # | # | # | # | # |
GA | # | # | # | # | # | # |
B | # | # | # | # | # | # |
Start | # | # | # | # | # | # |
Stub | # | # | # | # | # | # |
NA | # | # | # | # | # | # |
Total | # | # | # | # | # | # |
Is that the combined idea then: the information and the colour? :: Kevinalewis : (Talk Page)/(Desk) 16:26, 14 November 2006 (UTC)
- I only see one problem with that table, Kevin. I assume you're using "NA" to mean "Not Assessed". However, hopefully you remember that articles can be NA class, meaning that they're non-article pages. An example of this would be Category talk:Stub-Class airport articles, while the corresponding category would be Category:Non-article airport pages. Perhaps "???" would work better, as it's already recognized in most 1.0 talk banners. By the way, I don't think the intersection of the two totals columns really has any significance, other than totaling how many class/importance ratings have been given. All the same, removing it looks very tacky.
Assessment Statistics
Class \ Importance | Top | High | Mid | Low | ??? | Total |
---|---|---|---|---|---|---|
FA | # | # | # | # | # | # |
A | # | # | # | # | # | # |
GA | # | # | # | # | # | # |
B | # | # | # | # | # | # |
Start | # | # | # | # | # | # |
Stub | # | # | # | # | # | # |
??? | # | # | # | # | # | # |
Total | # | # | # | # | # | # |
Non-article pages | # |
Obviously a project should concentrate on having their highest importance articles improved to the highest classes (rising numbers in top-left corner and decreasing numbers in bottom-right corner), and this is what I was hoping such a format would promote when I first proposed it. If you look at WP:AIRPORTS/A, for example, you'll notice that our one FA and two GA articles are not necessarily the most important airports in the world by most definitions. thadius856talk 23:46, 14 November 2006 (UTC)
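A note on the approach Oleg describes earlier in this section (iterate over a project's articles and add each one to the appropriate cell): it could be sketched along the following lines. This is an illustrative Python version, not Mathbot's code. The tuple layout, the function names, and the use of "???" for unassessed values are assumptions for the example.

```python
from collections import Counter

QUALITIES = ["FA", "A", "GA", "B", "Start", "Stub", "???"]
IMPORTANCES = ["Top", "High", "Mid", "Low", "???"]


def cross_tabulate(articles):
    """Count articles per (quality, importance) pair.

    `articles` is an iterable of (title, quality, importance) tuples;
    any unrecognised or missing rating is counted under "???".
    """
    counts = Counter()
    for _title, quality, importance in articles:
        q = quality if quality in QUALITIES else "???"
        i = importance if importance in IMPORTANCES else "???"
        counts[(q, i)] += 1
    return counts


def build_table(articles):
    """Return table rows: one per quality class, plus a totals row.

    Each row is [label, count per importance..., row total].
    """
    counts = cross_tabulate(articles)
    rows = []
    for q in QUALITIES:
        cells = [counts[(q, i)] for i in IMPORTANCES]
        rows.append([q] + cells + [sum(cells)])
    column_totals = [
        sum(counts[(q, i)] for q in QUALITIES) for i in IMPORTANCES
    ]
    rows.append(["Total"] + column_totals + [sum(column_totals)])
    return rows
```

A single pass over the articles fills every cell, which is consistent with the point that no extra page loads are needed once each article already knows its quality and importance.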
Category intersection
- Is it possible to link the numbers (#) to sub-pages that contain the articles that relate to both (intersection)? For example, A/Top will link to a new page, "List of A-class articles with Top-importance". I would suggest the new stats table be added to a new page since the existing stats table is used at the bottom of many project navigation boxes. -- Ganeshk (talk) 20:26, 19 November 2006 (UTC)
- I agree with the latter but do you know how many categories the former would create? Cbrown1023 20:36, 19 November 2006 (UTC)
- I was looking for the bot to create sub-pages, not categories. It would be like 24 of them. :) But it would be really useful. How do I find a list of stubs that are of Top importance? Is there a way? -- Ganeshk (talk) 20:39, 19 November 2006 (UTC)
- Maybe the former can be programmed into the project banner templates to create the intersection categories. And the bot could link the numbers to the respective categories. Regards, Ganeshk (talk) 20:46, 19 November 2006 (UTC)
- This idea might help Oleg actually. Similar to how the bot uses the categories to look up class and importance, Mathbot could use just these intersection categories instead of doing the computation. Regards, Ganeshk (talk) 21:06, 19 November 2006 (UTC)
- I modified the India project banner to implement the 30 intersection categories. I can now find out Stub-Class India articles of Top-importance. Now can Mathbot be programmed to read these categories? What do others feel about this? Regards, Ganeshk (talk) 22:40, 19 November 2006 (UTC)
- I personally think the method we use currently is already complicated to set up, so having 30 categories is close to a nightmare, and that it will put a barrier to entry to new WikiProjects who don't want to go through roundabouts to get two or three articles assessed... Titoxd(?!?) 23:40, 19 November 2006 (UTC)
- Could it be made optional? Projects that have these intersection cats will additionally get a new page with the above statistics table? It didn't take me too long to setup these cats. If new projects don't create these cats, the rest of the system still works for them. -- Ganeshk (talk) 01:18, 20 November 2006 (UTC)
- Oh for Wikipedia:Category intersection, which would make these problems go away if implemented. I also have a slight concern that changing the size of the statistics table may mess up some of the project pages; I know the Mathematics and Portugal projects transclude the statistics page into their project pages. --Salix alba (talk) 00:41, 20 November 2006 (UTC)
- I don't like the idea of more links in the table, either with subpages, or with categories. It would be a lot of work to set up, and really, I don't see the gain. And the idea of optional features is not good either, I think; it just creates bloated code with features that few people, if anybody, use. If I see good support from the community for implementing this, I will; otherwise I think it is not worth the trouble. Oleg Alexandrov (talk) 03:06, 20 November 2006 (UTC)
- Well, I personally think it is redundant with the main tables, as they are already sorted by quality, then importance. Any desired combination is already visible through there... Titoxd(?!?) 05:18, 20 November 2006 (UTC)
- The numerical count of combination is visible, but the list of articles behind the count is not. As Salix alba pointed out, Wikipedia:Category intersection may be the answer for this. But it is still in development stage. I do see a need for this however it is finally implemented. Regards, Ganeshk (talk) 19:32, 20 November 2006 (UTC)
- No, I don't mean the summary table; I mean the actual assessment tables, like Wikipedia:Version 1.0 Editorial Team/India articles by quality/1. Those include the sorted info, so the need for those pages is already fulfilled... Titoxd(?!?) 21:08, 20 November 2006 (UTC)
- I don't think Wikipedia:Version 1.0 Editorial Team/India articles by quality/1 is same as what I am asking for. What you point to is just a list of articles sorted by the quality. They don't show the intersection. Can you show me how I can see "A Class articles of Top importance" using those pages? Regards, Ganeshk (talk) 21:33, 20 November 2006 (UTC)
- The pages are ordered by quality and then by importance... so you scroll down to A-Class and then look at which ones say Top-Importance (they are all grouped together). Cbrown1023 22:29, 20 November 2006 (UTC)
Update - trial Category Intersection system is at http://aerik.com/wikintersections.php. Please don't overload it! :-) See Wikipedia talk:Category intersection for details of the person who set that up. Carcharoth 16:03, 21 January 2007 (UTC)
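A note on the two options debated in this section: the lists behind each count can be derived by intersecting the existing quality and importance memberships, without creating any extra categories. The sketch below is only an illustration in Python. The dictionary layout and function name are assumptions, and the level names are placeholders.

```python
from itertools import product


def intersect_by_rating(by_quality, by_importance, quality, importance):
    """Return the sorted titles that carry both ratings.

    `by_quality` and `by_importance` map a rating name to the list of
    article titles in the corresponding category.
    """
    in_quality = set(by_quality.get(quality, ()))
    in_importance = set(by_importance.get(importance, ()))
    return sorted(in_quality & in_importance)


# Six quality classes times five importance levels gives the 30
# combinations mentioned above (the fifth level name is a placeholder).
QUALITIES = ["FA", "A", "GA", "B", "Start", "Stub"]
IMPORTANCES = ["Top", "High", "Mid", "Low", "Unassessed"]
PAIRS = list(product(QUALITIES, IMPORTANCES))
```

For example, `intersect_by_rating(by_quality, by_importance, "Stub", "Top")` would answer the "stubs of Top importance" question directly from the two existing category listings.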
Datestamp changed to GMT
Following a comment by Mike Peel on my talk page, I modified the date stamp at the bottom of lists to GMT from my local time (PST). Today that will cause the bot to jump a day, see here, but from tomorrow on, the datestamp output by the bot will actually correspond to the current day to most users at most times. Oleg Alexandrov (talk) 03:53, 16 November 2006 (UTC)
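To illustrate why the stamp jumps a day: a run in the evening Pacific time already falls on the next calendar day in GMT/UTC. The Python sketch below uses a fixed -8 hour offset for PST (ignoring daylight saving), and the stamp format is an assumption, not the bot's actual format.

```python
from datetime import datetime, timedelta, timezone

# Fixed-offset stand-in for Pacific Standard Time (no daylight saving).
PST = timezone(timedelta(hours=-8), "PST")


def datestamp(moment):
    """Format a timezone-aware datetime as a UTC date stamp."""
    return moment.astimezone(timezone.utc).strftime("%d %B %Y")


# A run at 19:53 PST on 15 November is 03:53 UTC on 16 November.
evening_run = datetime(2006, 11, 15, 19, 53, tzinfo=PST)
local_stamp = evening_run.strftime("%d %B %Y")
utc_stamp = datestamp(evening_run)
```

Here `local_stamp` and `utc_stamp` differ by one day, which is the jump described above.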
Linking Importance categories
Oleg, Just like the class categories link to the related project's respective class categories, could the same be done with the Importance parameter? Right now they show up with no links on the statistics pages. Thanks, Ganeshk (talk) 23:47, 18 November 2006 (UTC)
- Thanks, good point. I will do that. Oleg Alexandrov (talk) 08:52, 19 November 2006 (UTC)
- I implemented that. It will show up in all stats pages tonight when the bot runs as usual. Oleg Alexandrov (talk) 22:06, 16 December 2006 (UTC)
Class and Importance, again...
Just wondering if the update was still scheduled for this weekend. I'm getting rather antsy, but I completely understand if it's been delayed, or trashed entirely. Thanks! thadius856talk 18:52, 19 November 2006 (UTC)
- It is still Sunday here, in Los Angeles. :) If I knew somebody is dying to see that feature, I could have done it earlier. :)
- OK, one can see how things look here.
- I agree with Salix alba's comment (three sections above) that a wider stats table could be a problem for some projects. At the same time, I am reluctant to decide the table appearance on a per-project basis. Today the bot is running with the new (bigger and more detailed) table. Depending on the community feedback I will either have the bot always use the new format or revert to the old one. Oleg Alexandrov (talk) 04:33, 20 November 2006 (UTC)
- Looks great to me! Walkerma 04:37, 20 November 2006 (UTC)
- It looks awesome! One small gripe: would it be possible to put a break in between the project name and "articles" to prevent stretching the first column for longer project names? For example, Adelaide<br />articles or Adelaide<br>articles. thadius856talk 06:01, 20 November 2006 (UTC)
- Agree it looks good. A small point, I'd bold the numbers in the totals row and column, so they stand out more from the other numbers. --Salix alba (talk) 10:04, 20 November 2006 (UTC)
- I implemented these two. They indeed make the table nicer. Oleg Alexandrov (talk) 03:51, 21 November 2006 (UTC)
It looks fine to me, but could the old table still also be generated as a smaller alternative? A lot of the projects don't use the importance criteria at all, and furthermore a lot of them try to transclude the stats table in their sidebars. The new wider table pretty much breaks the nice-looking sidebars. Thanks. Girolamo Savonarola 10:31, 20 November 2006 (UTC)
- That would require generating two different statistics pages for each project. It is something I would be rather reluctant to do; I feel the script is getting a lot of feature creep and is becoming too wasteful of resources. Are two tables really necessary? Oleg Alexandrov (talk) 15:52, 20 November 2006 (UTC)
- Probably not; having more pages would just confuse things. Is there any way around it, though? For example, could the script only write columns to the statistics table if they have a non-zero total? (In other words, a project that didn't use importance ratings would get two columns ["None" and "Total"] rather than the current six.) Kirill Lokshin 16:33, 20 November 2006 (UTC)
- Please? :) Girolamo Savonarola 05:24, 21 November 2006 (UTC)
- I implemented Kirill's suggestion above. Now empty columns won't show up in the table, see for example Wikipedia:Version 1.0 Editorial Team/A-League player articles by quality statistics. That will help to some extent. Oleg Alexandrov (talk) 03:59, 22 November 2006 (UTC)
- Once again, Oleg, you work miracles. That looks excellent. Thank you so much for your time and effort, many of us on Wikipedia are in your debt. Walkerma 05:29, 22 November 2006 (UTC)
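A note on Kirill's suggestion, which Oleg implemented (write a column only if it has a non-zero total): a rough Python sketch follows. The function name and data layout are assumptions, and the "Total" column is always kept.

```python
def drop_empty_columns(header, rows, keep=("Total",)):
    """Remove columns whose cells are all zero.

    `header` is a list of column names and `rows` a list of equally
    long lists of counts. Columns named in `keep` are never dropped.
    """
    kept = [
        index
        for index, name in enumerate(header)
        if name in keep or any(row[index] for row in rows)
    ]
    new_header = [header[index] for index in kept]
    new_rows = [[row[index] for index in kept] for row in rows]
    return new_header, new_rows
```

For a project that assigns no importance ratings, only the "None" and "Total" columns survive, as described above.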
New stats layout
Okay, the new stats layout is good. But it creates problems for some WikiProjects.
A lot of the projects use {{PROJECT articles by quality statistics}} to blend the stats into the project page. Now this new, wider table is destroying the layout. AQu01rius (User • Talk) 16:54, 20 November 2006 (UTC)
- Yes, we expected this would occur in a few cases, but felt the upset in changing over was worthwhile for the extra information. I hope that you can re-format your project page, just as the Math folks did this morning. If you have a problem doing this, let us know at WVWP and we'll help out. Walkerma 17:57, 20 November 2006 (UTC)
- OK, I see a problem with projects like {{WPCanada_Navigation}} who are inserting the stats page by transclusion into their navigation template. Are there any suggestions from the technically smart people (i.e., not me)? Or do we just ask projects not to do this? Walkerma 05:40, 21 November 2006 (UTC)
- On {{WPIndia Navigation}}, I set the navigation box width to 325px like you advised above. That fixed it. -- Ganeshk (talk) 05:57, 21 November 2006 (UTC)
- Thanks! I'm sure you've done wonders for India-Canada relations! Walkerma 06:23, 21 November 2006 (UTC)
This doesn't look good. 325px is too wide, and eats a lot of space (which defeats the purpose of a sidebar). I'll try to figure out another way around this. AQu01rius (User • Talk) 21:04, 21 November 2006 (UTC)
- Nevermind, there's no way around (I thought some <noinclude> tweak may work, but the bot replaces the entire stats page in every update).
The new detailed stats is really not suitable for inline inclusion anymore, and what I'll do is to remove it, and just leave the "view full worklist" link in the sidebar. AQu01rius (User • Talk) 19:25, 23 November 2006 (UTC)
- Heya, can't you make the new layout optional? Prefer the old one. I left a question here, which is probably the wrong place.... sorry! --Ling.Nut 04:25, 24 November 2006 (UTC)
- Perhaps we can switch to the old mode indeed, there seem to be plenty of people who are not happy with the extra columns. Oleg Alexandrov (talk) 05:27, 24 November 2006 (UTC)
- No. Keep it. You're doing amazing work. But is it possible to make it optional? Like <noinclude> the detailed parts. AQu01rius (User • Talk) 21:19, 24 November 2006 (UTC)
- <noinclude> is not a good solution though, as most people (both those who love and those who hate the extra columns) use that template transcluded. (For the record, I don't much care which format is used, but I don't want to complicate the code by introducing per-project preferences.) Oleg Alexandrov (talk) 02:13, 25 November 2006 (UTC)
Canada
{{WikiProject Canada}} not working with the Mathbot assessment project. Lincher 19:31, 20 November 2006 (UTC)
- Ah, I fixed the category name in the template. It should work now. I haven't set up the Importance categorization though. AQu01rius (User • Talk) 06:05, 21 November 2006 (UTC)
- Thanks for the fix, the elements in the template make it too dangerous to touch by a n00b like me. Lincher 20:20, 21 November 2006 (UTC)
- Canada is still a new project, and probably just hasn't had all the categories and such created yet. I know it's a new project; I made the banner and userbox for them. Any objections to having the modifications dropping the assessment criteria reversed, IF the project is organized to work with the bot and regularize assessments? Badbilltucker 14:27, 22 November 2006 (UTC)
- I can't quite understand what you're saying ... AQu01rius (User • Talk) 19:28, 23 November 2006 (UTC)
Yurik's query API
I am very tempted to switch to Yurik's API (mentioned a few sections above) for reading categories, not only because it would be faster and less confusing to my bot, but also because the way things are now, parsing HTML, sometimes results in cached (and therefore inaccurate) information. Yurik's API provides minimally formatted output delivered straight from the database, which is as good as it gets.
An example of getting the articles and subcategories in Category:Wikipedia 1.0 assessments is here. That text does not include all subcategories however (there is a limit on the number of subcategories displayed at once, for performance reasons I guess). What I was not able to figure out is what query to use to get the remainder of the category. Any ideas on that? Oleg Alexandrov (talk) 16:25, 23 November 2006 (UTC)
- In the query section of the output
[query] => Array ( [category] => Array ( [next] => Maharashtra articles by quality )
)
the next argument gives the starting point for the next query. You can do a loop, or recursion, testing to see if query section of the output is present. --Salix alba (talk) 17:48, 23 November 2006 (UTC)
- In this instance the next url will be http://en.wikipedia.org/w/query.php?what=category&cptitle=Wikipedia_1.0_assessments&format=txt&cpfrom=Maharashtra_articles_by_quality. --Salix alba (talk) 21:12, 23 November 2006 (UTC)
- Strangely enough I tried that, but I could not make it work. Thanks! Oleg Alexandrov (talk) 05:27, 24 November 2006 (UTC)
- There is still a problem though. In the link you supplied above, Maharashtra_articles_by_quality serves as a tag to move to the next page. However, the actual subcategory Category:Maharashtra_articles_by_quality, is neither in the first page, nor in the second page. The bot then fails to read that category, as you see here. Do you know why? Oleg Alexandrov (talk) 06:25, 24 November 2006 (UTC)
- Curiously it does seem to work. The articles do appear to be unordered, which can be confusing. http://en.wikipedia.org/w/query.php?what=category&cptitle=Wikipedia_1.0_assessments&cpfrom=Maharashtra+articles+by+quality&format=xml works. I seem to get different results putting the format argument before the cpfrom. --Salix alba (talk) 16:29, 24 November 2006 (UTC)
- That's so weird. Writing the same link with underscores instead of plus signs does not give Category:Maharashtra articles by quality as output. Oleg Alexandrov (talk) 02:23, 25 November 2006 (UTC)
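The paging loop Salix alba describes above can be sketched roughly as follows. This is a minimal illustration in Python (the bot itself is written in Perl), not the bot's actual code. The endpoint and the what, cptitle, format and cpfrom parameters are taken from the URLs quoted in this thread; the fetch_page callable is a hypothetical stand-in for whatever downloads a page and extracts the member titles and the "next" marker. Since urlencode writes spaces as plus signs, the sketch produces the form of the link that was reported to work.

```python
from urllib.parse import urlencode

BASE = "http://en.wikipedia.org/w/query.php"  # endpoint as quoted in this thread


def category_url(title, cpfrom=None, fmt="txt"):
    """Build a category query URL; urlencode writes spaces as plus signs."""
    params = {"what": "category", "cptitle": title, "format": fmt}
    if cpfrom is not None:
        params["cpfrom"] = cpfrom
    return BASE + "?" + urlencode(params)


def read_category(title, fetch_page):
    """Collect all member titles by following the 'next' marker.

    fetch_page(url) is a hypothetical callable that returns a pair
    (titles, next_marker); next_marker is None on the last page.
    """
    seen = set()
    members = []
    cpfrom = None
    while True:
        titles, next_marker = fetch_page(category_url(title, cpfrom))
        for t in titles:
            if t not in seen:  # a page boundary may repeat a title
                seen.add(t)
                members.append(t)
        # Stop on the last page, or if the server repeats the same marker.
        if next_marker is None or next_marker == cpfrom:
            break
        cpfrom = next_marker
    return members
```

Deduplicating across pages is a defensive choice here: if the page that starts at the marker also lists the marker's own category, it is still counted only once.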
Switch to Yurik's API
With Salix alba's help I switched to Yurik's API for reading categories (and will do that for history revisions soon also, that will make the code faster). Let us hope that Yurik won't change the API syntax, that would confuse the bot. The new API has the advantage that it always gets up-to-date info, rather than stale html the way it was before. Oleg Alexandrov (talk) 05:42, 25 November 2006 (UTC)
Running the bot on demand
I implemented a web form that allows one to run the bot for an individual project at any time, rather than waiting a good chunk of a day until it is scheduled. I think that could be helpful for new projects as then one could get a quick feedback if the project was set up right. The link is here. Oleg Alexandrov (talk) 05:27, 25 November 2006 (UTC)
- So, as a common courtesy to you and your bot, projects like Biography should not run it at will and should wait for your scheduled update because of their vast number of articles? Cbrown1023 14:03, 25 November 2006 (UTC)
- I guess it would be more of a courtesy to the Wikipedia servers not to overuse the bot. But I don't know, if you have a large project, but you did a lot of changes and want instant gratification, I guess you could go for it. Up to your conscience. :) Oleg Alexandrov (talk) 15:56, 25 November 2006 (UTC)
Bot source code
I posted the source code to the Perl script which updates the lists in the index together with all dependencies and instructions here.
I think it would be a good idea if perhaps somebody could try it out. I am not going anywhere any time soon, but thinking long term, considering how important the Wikipedia 1.0 project is and the amount of work put into this project by the community so far, I think it would be good if the code is public and more people than just me could run it and perhaps even have an idea of how to modify it. Following the instructions over there should take an hour at most, assuming that nothing goes wrong. So, any volunteers? :) Oleg Alexandrov (talk) 05:34, 27 November 2006 (UTC)
Linking of dates
I was just looking through the lists of all articles per WikiProject (xXx articles by quality/#). I'm just wondering why the dates were all wikilinked. It seems to me it doesn't really serve any purpose and WP:MOSDATE seems to hint that dates shouldn't be linked unless they give context, but I get the feeling that's only for ns0.
However, removing the wikilinks seems that it wouldn't trim much of the size of the file off. I was just wondering if perhaps switching to a [[YYYY-MM-DD]] format would keep file sizes down any. It wouldn't change anything on the user's end, since user preferences catch that format as well. thadius856talk|airports|neutrality 06:07, 27 November 2006 (UTC)
Sortable tables
Should we get excited about this commit to SVN? Apparently, we are now able to create sortable tables... Titoxd(?!?) 00:25, 28 November 2006 (UTC)
- Here's an example, using a modified {{assessment header}} found in my sandbox: [3] Titoxd(?!?) 00:57, 28 November 2006 (UTC)
- Very cool! :) Very slow... :( But I'm assuming they're still ironing out a lot of the kinks. Definitely a good step towards resolving pesky issues of list ordering (eg chronological vs. reverse chronological). Girolamo Savonarola 01:15, 28 November 2006 (UTC)
- That's awesome. Except, doesn't it kinda take the place of India's multi-categories. It's easy to see all these this way. I hope Oleg implements it in the work lists when they iron out the kinks. Cbrown1023 01:23, 28 November 2006 (UTC)
Generating some categories automatically
When adding a new project, one should generate those FA-Class, A-Class, Top-importance and other categories. I made that step (semi-)automatic. One can visit Wikipedia:Version 1.0 Editorial Team/Generate categories and specify what categories to create, and the bot will do it for you. Only administrators can use that tool (it requires editing a protected page). That way random people can't just generate any categories they want, and if the categories were generated incorrectly the admin in question can delete them.
One can argue this tool is not that necessary, but I think it can save some work when setting up a project (and for people who did not set up a new project for the bot before there is less to learn this way). Oleg Alexandrov (talk) 06:39, 28 November 2006 (UTC)
Statistics - total number of assessed articles decreasing?
Am I missing something, or have almost 3000 articles just gone missing? See: [4]- Trevor MacInnis (Contribs) 04:49, 29 November 2006 (UTC)
- You're missing a zero; it's almost 30,000 articles gone. It may be a bug with the bot; but, based on prior experience, I suspect that some WikiProject has broken their banner code. I'll see if I can figure out what list these are disappearing from. Kirill Lokshin 04:54, 29 November 2006 (UTC)
- Well, Biography lost about 20,000; I'm not sure where the others are. Kirill Lokshin 05:15, 29 November 2006 (UTC)
- I killed the bot for tonight until this is sorted out. Weird indeed, I am also trying to understand what is going on. Oleg Alexandrov (talk) 05:37, 29 November 2006 (UTC)
- I browsed through a few early-alphabet projects, the only project where I saw a problem was Biography. I notice that the log has been blanked by the bot. I also checked, the statistics show only 200 B-Class articles, but the category contains many more than 200 articles. Walkerma 06:25, 29 November 2006 (UTC)
- Would it be a good idea to run the bot for Biography on demand, to see if it is a problem with the bot, or a problem with the project? Titoxd(?!?) 06:49, 29 November 2006 (UTC)
- Yeah, I think it would be a good idea.
- I further modified the routine which processes the logs to try not to blank them but rather truncate them if they are too big. That may help in the future.
- I still don't know what was going on. Oleg Alexandrov (talk) 16:59, 29 November 2006 (UTC)
I looked at the stats of around 65 of the projects at the end of the list, and also at all biography projects (arts and entertainment, core, military, and other biographical). None had any significant decreases in the numbers, except the fat mother-of-all plain biography articles by quality. That one is always run last, as it is the largest.
If it is a bot bug, it could be subtle, as it shows up very seldom (once, so far). Can't be that the server was down; the bot was programmed to die if it can't repeatedly do an HTTP request. If it can't read the contents of a category, it would also die.
The bot did not crash, otherwise it would not have committed the total stats on that date. By the way, if you look at that diff, you would see that only the B-Class articles and Unassessed articles decreased, and roughly by the same amount as in the problematic Biography stats.
All in all, I don't know what is going on. Maybe the code is sound but we are pushing against the limits of Perl, or computer memory, or who knows what. Either way it appears that the only place articles disappeared from is the biography project (such a monster should not exist to start with). I won't let the bot run today either. But if we discover nothing by tomorrow I guess we could run it again and hope for the best.
This keeps me wondering again. What if one day, for one reason or another, the bot gets really mad? It would be rather hard to reverse the damage. Any comments? Oleg Alexandrov (talk) 03:54, 30 November 2006 (UTC)
- There was one change in {{WPBiography}} yesterday,[5] but I can't figure out how it would affect the listing... perhaps you can decipher something? I don't think we're reaching the limits of the MediaWiki API either, as the last change in SVN was six weeks ago...
- As for bot lunacy: the dirtiest way I could think of was having a different bot (I don't know... MissMathbot or something) update the "last updated" date field in the tables, so if Mr. Mathbot blows stuff up, we can always bot rollback him to the missus' version... Titoxd(?!?) 05:24, 30 November 2006 (UTC)
- I was wondering a similar thing - having an "antidote" bot (AntiMathbot?) ready to do a mass revert of all Mathbot's recent edits if he turns into Madbot. Just don't set such a bot loose against my edits, or I'll turn paranoid! Walkerma 05:29, 30 November 2006 (UTC)
- That would require some programming, parsing history and dates, etc. I thought of this problem too, and I came up with a rather dumb idea. How about just backing up each day's output on my local machine, say for the last 10 days? That's easy to implement, it would require creating a directory for each calendar day (November_16_2006 for example) and doing a save for each page submitted to Wikipedia. I have a few spare gigs on my work computer I could use. How's that? :) Oleg Alexandrov (talk) 05:44, 30 November 2006 (UTC)
- That sounds great to me! If you do that, it should probably be noted on Mathbot's page that these backups are available. Are you always around when Mathbot is running, or does he sometimes run when you're absent? Walkerma 06:38, 30 November 2006 (UTC)
- I will implement the hard drive backup as it is easy. Now, the backups will not be available online, as I don't have enough storage in my world-viewable directory, so only I would be able to do the reverts. Long term, I agree that it is more elegant to revert with a bot, so I will think about that too (although that's of course more work; anybody else willing to write such a bot? :) I am not always around when the bot is running, it runs on its own schedule. So, usually if something is suspicious the bot should be blocked right away. Oleg Alexandrov (talk) 16:03, 30 November 2006 (UTC)
- Would it even need a separate bot? For example, having one thread of the program write the assessment tables, and a different thread with a different bot edit just the one line in the date, as I said above, could work. That way, we can, in the worst-case scenario, admin rollback him by hand. Titoxd(?!?) 05:29, 1 December 2006 (UTC)
- That's simple indeed. But this would imply that each page must be edited twice each day, doubling the history and making the servers store an additional version of each page each day (and gosh, we have many pages). Also, if say we realized that something was wrong not right away but after more than 24 hours, then more than one previous edit would need reversal, and since two bots edited the same page in one day, the admin rollback would not work. The more elegant solution would be I think a script smart enough to actually go through a bot's contribs and reverting those which happened on a given day to given projects. But that's hard to do. :) Oleg Alexandrov (talk) 05:38, 1 December 2006 (UTC)
I am almost finished with the "backup on disk feature", I think I will complete it tomorrow.
I reverted by hand the first 25 pages of Wikipedia:Version 1.0 Editorial Team/Biography articles by quality, containing all FA-Class, A-Class, GA-Class, B-Class and a bunch of Start-Class articles (note by the way the corrupted text at Wikipedia:Version 1.0 Editorial Team/Biography articles by quality/6 (specific version link), I don't understand what is going on).
I tested the bot just in case on the Adelaide project, and nothing strange happens.
So, with the evidence so far that the problem is most likely not with the bot but with the Biography project, I restarted the bot for today. Hopefully nothing wrong will happen. Once the backup feature is finished, even if something wrong happens it will be easier to revert (not that I should become more sloppy as a coder :)
Good night, all. :) Oleg Alexandrov (talk) 05:38, 1 December 2006 (UTC)
- Good night, Oleg. For tomorrow: looking at the Biography table link you gave us, someone is adding huge templates to the Comments pages of some Congressional biographies... could that have an adverse effect? Titoxd(?!?) 05:44, 1 December 2006 (UTC)
- It should not. As far as the bot is concerned, it does not matter what is inside a comments page, all it sees is the link to it. But something in the comments pages is messing up the rendering of tables, that should be fixed eventually. Oleg Alexandrov (talk) 15:55, 1 December 2006 (UTC)
May just be that it has stopped again - this time at about 7:30 at the end of "Albums" :: Kevinalewis : (Talk Page)/(Desk) 10:54, 1 December 2006 (UTC)
- It is happily running now. (I checked a bunch of statistics, and articles are not disappearing.) I guess it appeared to have stopped because it was slowed down by having to read the most recent version of a bunch of newly added articles. Once I switch to Yurik's API for that, it should go much faster.
- By the way, as the bot runs it is writing log info to disk. The log is publicly available here so one can tell if the bot is stuck or died or something. Oleg Alexandrov (talk) 15:55, 1 December 2006 (UTC)
- OK, the back-up feature works now. From now on each list of the form "Foobar articles by quality" will be backed up for the last five days. So, if anything goes wrong with the list, be it bot's fault or not, we can recover the data. I think backing up the logs and the stats is not that important, and I am not sure if going beyond five days is worth it. Oleg Alexandrov (talk) 05:36, 2 December 2006 (UTC)
- Hmm. Biography just recovered about 40,000 articles.[6] That was weird... Titoxd(?!?) 06:54, 2 December 2006 (UTC)
- Why is that weird? It recovered the B class articles lost before, a bunch of unassessed articles, and much more. I think that's fine. Oleg Alexandrov (talk) 17:01, 2 December 2006 (UTC)
- No, the entire loss-recover cycle without us having any clue was weird... that was what I was referring to. :) Titoxd(?!?) 19:19, 2 December 2006 (UTC)
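The on-disk backup described earlier in this thread (one directory per calendar day, such as November_16_2006, with only the last five days kept) could look roughly like the sketch below. It is written in Python for illustration and is not the bot's Perl code; the function names, the ".txt" suffix, the zero-padded day number and the replacement of slashes in subpage names are all assumptions made for the example.

```python
import datetime
import pathlib
import shutil

DIR_FORMAT = "%B_%d_%Y"  # e.g. November_16_2006 (day is zero-padded here)


def prune_old_backups(root, today, keep_days):
    """Delete dated backup directories older than the last keep_days days."""
    cutoff = today - datetime.timedelta(days=keep_days - 1)
    for entry in pathlib.Path(root).iterdir():
        if not entry.is_dir():
            continue
        try:
            entry_day = datetime.datetime.strptime(entry.name, DIR_FORMAT).date()
        except ValueError:
            continue  # not a dated backup directory; leave it alone
        if entry_day < cutoff:
            shutil.rmtree(entry)


def backup_page(root, day, page_name, text, keep_days=5):
    """Save one submitted page under that day's directory, then prune."""
    root = pathlib.Path(root)
    day_dir = root / day.strftime(DIR_FORMAT)
    day_dir.mkdir(parents=True, exist_ok=True)
    # Subpage names contain slashes, which would create nested paths.
    target = day_dir / (page_name.replace("/", "_") + ".txt")
    target.write_text(text, encoding="utf-8")
    prune_old_backups(root, day, keep_days)
    return target
```

Calling backup_page once for every page the bot submits keeps the directory tree bounded, since each call also removes any directory that has fallen outside the retention window.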
Film stats
The film stats didn't update last night so I tried to run the bot on demand. It worked until it got here:
Getting http://en.wikipedia.org/w/query.php?what=category&cptitle=Unassessed+importance+film+articles&format=txt&cpfrom=List+of+films+made+into+television+programs
It just stopped. Do you know what the problem is? Cbrown1023 17:04, 2 December 2006 (UTC)
- I started it "on demand" and it works now. Did you actually keep that window open where the code was running to see what messages it was outputting?
- By the way, I have no idea why the code stopped last night after doing just a few projects. I am trying to figure that out. Oleg Alexandrov (talk) 17:48, 2 December 2006 (UTC)
- About the "on demand" thing yeah, the window is still open right now. (Just in case you need to know something else about the code.) Cbrown1023 17:53, 2 December 2006 (UTC)
- Whatever you did, it worked. It is now fixed (/updated). Cbrown1023 18:26, 2 December 2006 (UTC)
- I could not make the online "on demand" tool work for the film project either. It works though for the "Beatles" project. And I managed to make the film project work from the command line (that is what you saw).
- In short, the codebase seems sound, but when called online for large projects it does not want to work. I will try to think of what is going on. Oleg Alexandrov (talk) 18:30, 2 December 2006 (UTC)
Bot confused again
Biography's in trouble again. ;-) Kirill Lokshin 20:58, 3 December 2006 (UTC)
- I just went through the December 3 log. It appears that only the biography articles have problems. The other log, what the bot actually writes to disk as it does stuff, also has nothing suspicious. I am truly at a loss. I will try to investigate what is going on. The biggest mystery is why other projects are not affected, only this one, which is also by far the biggest? Oleg Alexandrov (talk) 23:40, 3 December 2006 (UTC)
- Is there a way to sort the results of a query.php query? Because if the "by quality" categories are modified (e.g. a page is added to them), then that may jumble the results of the SQL query, because in PHP, database entries do not have any particular order unless specified otherwise, and that may be causing some issues. (I just thought about that right now...) Titoxd(?!?) 00:13, 4 December 2006 (UTC)
- Do you mean to say that if a category is modified while the bot reads subpage by subpage then the bot may not read it correctly? Oleg Alexandrov (talk) 00:22, 4 December 2006 (UTC)
- Yes, kind of. If the category is modified, let's say, Unassessed biography articles, then the order in the SQL database is jumbled, if I remember my PHP correctly. Titoxd(?!?) 01:13, 4 December 2006 (UTC)
I will keep that in mind. But there's got to be more to it. Per the log, the bot started reading the B-Class biography articles, read the articles starting with A, then B, then C, all the way to J, and then simply did not go on. Here's the relevant part in the log
Getting http://en.wikipedia.org/w/query.php?what=category&cptitle=B-Class+biography+articles&format=txt
Getting http://en.wikipedia.org/w/query.php?what=category&cptitle=B-Class+biography+articles&format=txt&cpfrom=Andy+Warhol
Getting http://en.wikipedia.org/w/query.php?what=category&cptitle=B-Class+biography+articles&format=txt&cpfrom=Bo+Schembechler
Getting http://en.wikipedia.org/w/query.php?what=category&cptitle=B-Class+biography+articles&format=txt&cpfrom=Constantine+Maroulis
Getting http://en.wikipedia.org/w/query.php?what=category&cptitle=B-Class+biography+articles&format=txt&cpfrom=El+Greco
Getting http://en.wikipedia.org/w/query.php?what=category&cptitle=B-Class+biography+articles&format=txt&cpfrom=Gessius+Florus
Getting http://en.wikipedia.org/w/query.php?what=category&cptitle=B-Class+biography+articles&format=txt&cpfrom=Isabella%2c+Countess+of+Atholl
Getting http://en.wikipedia.org/w/query.php?what=category&cptitle=B-Class+biography+articles&format=txt&cpfrom=John+H.+Russell%2c+Jr.
Getting http://en.wikipedia.org/w/query.php?what=category&cptitle=Start-Class+biography+articles&format=txt
(look at the last two lines, it just went from B-Class to Start-Class). Even if the B-Class category was jumbled, it should have still yielded some articles beyond J. When I ran the bot manually this afternoon, it easily went beyond J. Oleg Alexandrov (talk) 01:57, 4 December 2006 (UTC)
I was doing some testing for the bot and found articles disappearing, see Wikipedia:Version 1.0 Editorial Team/Pokémon Collaborative Project articles by quality/1 history. That is very strange. I am taking the bot down for today, hopefully one of these days I'll find out what is going on. Oleg Alexandrov (talk) 02:29, 4 December 2006 (UTC)
- Just to clarify what I wrote above. Yesterday I did this test, ran the bot, got this (articles disappearing). Reran the bot again, got this (more articles disappearing), then this (articles popped right back).
- Today, I did exactly the same test here. I ran the bot, and this time I got the correct result here, the bot just put back the articles I removed (with new history revision link, obviously).
- I did not change the code from yesterday to today.
- Now, all I can think about is that what is going on is not bot's fault, but rather, the server does not always give it accurate information about what is inside categories. I cannot come up with any other explanation. Comments? Oleg Alexandrov (talk) 03:31, 5 December 2006 (UTC)
- Ask the people at Wikipedia:Village pump (technical)? They should be able to help if the problem's on Wikipedia's side. Mike Peel 10:17, 5 December 2006 (UTC)
- I don't even know how to phrase the question properly. :) I think Yurik would know this stuff, because we ultimately use his query format. But I decided to chicken out for now and not use it anymore and hope that it would solve the problems (see below). Oleg Alexandrov (talk) 04:09, 6 December 2006 (UTC)
So, the bot has not been working well recently, and I think this started with me switching to Yurik's query format for reading categories. I decided to go back to the old way of reading categories, parsing html source. That also has problems, like sometimes the wiki servers change the format of html (and the bot is confused) and sometimes they serve cached info, but let us see if the switch back will deal with the problems with the biography articles.
I implemented a couple of routines which will hopefully make the bot automatically recover if articles go missing.
The way things are now, if say a bot did not read well a category (be it either its fault or server's fault), some articles will disappear from the worklists. Next time the bot runs it may read that category well and recover an article in the list. What would be lost however, would be the history link and the date (column 1 and 3 here for example).
In addition to doing the backup mentioned somewhere in the previous sections, I now store on disk the history links and dates for the last five days (all in one single file for each project). So, if an article goes missing, and pops up back in a day or two, the bot will check on disk if that article has been around recently. If yes, and if the quality assessment did not change, the bot will recover the old history link and date from disk, so that info won't be lost.
In short, now the bot not only writes backups, it also reads backups, and vital info is not stored only within the Wikipedia worklists but also on disk. Here's a demonstration: I removed a few articles, and the bot put them back without information loss.
I will let the bot run today. Let's see what happens. Oleg Alexandrov (talk) 04:09, 6 December 2006 (UTC)
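The recovery rule described above can be illustrated with a short sketch. It is written in Python rather than the bot's Perl, and the data layout is an assumption made for the example: each list maps an article name to a dictionary with "quality", "history" and "date" keys, and new_entry is a hypothetical callback that builds a fresh history link and date. The five-day expiry of the on-disk cache is left out for brevity.

```python
def update_worklist(fresh, worklist, cache, new_entry):
    """Rebuild a worklist from a fresh category read.

    fresh     -- {article: quality} as read from the categories in this run
    worklist  -- {article: entry} as currently listed on the wiki
    cache     -- {article: entry} kept on disk from recent runs
    new_entry -- hypothetical callable (article, quality) -> entry
    """
    updated = {}
    for article, quality in fresh.items():
        current = worklist.get(article)
        cached = cache.get(article)
        if current is not None and current["quality"] == quality:
            # Still listed with the same rating: nothing to change.
            updated[article] = current
        elif cached is not None and cached["quality"] == quality:
            # Went missing and came back unchanged: restore link and date.
            updated[article] = cached
        else:
            # Genuinely new, or the rating changed: start a new entry.
            updated[article] = new_entry(article, quality)
    return updated
```

Checking the rating before reusing a cached entry matters because a changed assessment should get a new history link and date rather than the old ones.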
Bot error when articles are deleted
When the bot loads the Album articles and looks up their versions, it raises an error whenever it cannot find a version, as with Noxious Saucy Beast and Here We Are (Swizzle Tree). Those have been fixed: I removed the tag because there is no article attached to the talk page. I think this might be what causes the problem. Lincher 07:07, 6 December 2006 (UTC)
- That's right. :( Recently I switched to Yurik's query format also for reading history version information, and in the process I had programmed the bot to die if it can't find a history version. That was meant to test Yurik's query format and I did not get to it because of other recent problems (ironically also caused by the switch to Yurik's format for reading categories).
- Tonight I'll revert to the good html way of finding the history version too. Hopefully this will put all the recent problems behind us. For today, I think one can still use the on demand version of the bot to run it for your specific project unless the bot dies on you as above.
- I am sorry for all the recent problems. I'll do my best to get over that as soon as possible. Oleg Alexandrov (talk) 16:45, 6 December 2006 (UTC)
- Hmm... that caused the bot to die, or did that cause the bot to stop reading that category and skip over to the next one? Titoxd(?!?) 16:49, 6 December 2006 (UTC)
- It literally died, for it didn't go into its normal sleep period and gave me a message that looked like this, the log. Lincher 17:12, 6 December 2006 (UTC)
- To reply to Titoxd, that sometime in the past the bot stopped reading a category and switched to another one, is not a problem of my making (but that the bot died when it should not is an unintended consequence of what I did).
- OK, so I now went back to the old way of reading history links and categories (fixing a bug in my history-reading routine along the way), so hopefully we are back to normal. I did not give up on using Yurik's query features, but I will be much more cautious and will spend more time understanding how it works. Let us see how the bot runs tonight. Oleg Alexandrov (talk) 01:56, 7 December 2006 (UTC)
No update in 3 days
What's up with the bot? There hasn't been a full run in 3 days. Rlevse 23:30, 6 December 2006 (UTC)
- Problems, see the above posts... Cbrown1023 01:36, 7 December 2006 (UTC)
- However, if you would like to get a bot run now, try the automated version. Cbrown1023 01:55, 7 December 2006 (UTC)
Bot did not finish updating the biography articles
... because of a power outage at my work. The power is still out, perhaps it will come back later tonight. Oleg Alexandrov (talk) 01:29, 8 December 2006 (UTC)
- And this grandiously proves Murphy's Law... :| Titoxd(?!?) 03:59, 8 December 2006 (UTC)
- Whatever, as long as it is not bot's fault. :) Oleg Alexandrov (talk) 05:18, 8 December 2006 (UTC)
Article moves
Until now, when an article got renamed, the bot would consider that the old article disappeared and a new article got created. Now I modified the script to actually copy over the history link and date to the new article. I don't know how necessary that is, but I would think it makes more sense that way. Oleg Alexandrov (talk) 05:18, 8 December 2006 (UTC)
- It definitely makes more sense to me. How would the logs record such a change? Currently you see [[Oldname]] removed and [[Newname]] added as two separate entries, would it remain like that? Walkerma 05:32, 8 December 2006 (UTC)
- The bot will say 'Old name moved to new name'. Oleg Alexandrov (talk) 06:12, 8 December 2006 (UTC)
- It would be nice if, when the bot notices an article move without a matching talk page move, it also did that talk page move, since that prevents redirect pages from staying associated with an assessment on the old talk page even though the article is at another place. I.e., [[Article a]] is moved to [[Article b]] but [[Talk:Article a]] isn't moved to [[Talk:Article b]], and a user also adds an assessment to [[Talk:Article b]], making both talk pages full but only Article b containing the article. Lincher 05:55, 8 December 2006 (UTC)
- The bot cannot do page moves, unfortunately.
- Your point is I think that we should eliminate redirects from the lists. So, if a talk page is say B-class, but the article itself is a redirect, the article must be removed from the list. I will think on whether that's possible. Oleg Alexandrov (talk) 06:12, 8 December 2006 (UTC)
- Yes that is what I meant, just couldn't make it concise enough. Don't burn yourself trying it though. Best of luck and great job on the bot. Lincher 16:17, 8 December 2006 (UTC)
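The move handling and the redirect clean-up discussed in this thread could be sketched roughly as follows. This is only an illustration in Python, not the bot's actual code: the `Entry` record, the `apply_move` and `drop_redirects` helpers, and the in-memory dictionary of entries are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """One assessed article as a list might store it (hypothetical shape)."""
    quality: str
    history_link: str
    date: str


def apply_move(entries, log, old_name, new_name):
    """Carry the history link and date over to the new title and log one move."""
    if old_name == new_name or old_name not in entries:
        return
    old = entries.pop(old_name)
    new = entries.get(new_name)
    if new is None:
        # No entry under the new title yet: move the whole record.
        entries[new_name] = old
    else:
        # The new title was already assessed: keep its rating, copy the history.
        new.history_link = old.history_link
        new.date = old.date
    log.append(f"{old_name} moved to {new_name}")


def drop_redirects(entries, redirect_titles):
    """Return only the entries whose article is not a redirect."""
    return {t: e for t, e in entries.items() if t not in redirect_titles}
```

With this shape a rename produces a single log line instead of a "removed" and an "added" pair, and an assessed talk page whose article is only a redirect simply falls out of the list.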
Redlinks
Quite a lot of the articles in the list are redlinks, but their talk pages list the article as B-Class, etc. I wrote a script to tag such talk pages for speedy deletion. They will also show up in Category:Wikipedia 1.0 problematic articles for people to take a closer look. Oleg Alexandrov (talk) 06:12, 8 December 2006 (UTC)
- I see that Mathbot is blocked, due to CSD'ing several inappropriate talk pages. Can I assume this is the reason why? thadius856talk|airports|neutrality 09:35, 8 December 2006 (UTC)
- The articles tagged so far seem to be mostly Talk archives. It would be better if you wrote a script to remove tags from archives. Dev920 (Have a nice day!) 10:31, 8 December 2006 (UTC)
- Sorry! The bot was doing the right thing for a while, then I went to bed, and then the mess started. The bot was rightfully blocked and hopefully I'll learn my lesson to supervise it more properly. My dumb tagging script finished by now, so I unblocked the bot and dealt with the few improperly tagged articles in Category:Wikipedia 1.0 problematic articles which were not cleaned up by others. I also restarted the WP1.0 script. Oleg Alexandrov (talk) 16:14, 8 December 2006 (UTC)
- But for the sake of completeness, and in reply to Dev920 above, most articles were tagged correctly. The archives were a minority, but they were the only ones left in bot's contributions after most of the other talk pages were speeded. Not that it matters of course. Oleg Alexandrov (talk) 16:30, 8 December 2006 (UTC)
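A check along the lines discussed above, flagging assessed talk pages whose article does not exist while leaving talk archives alone, might look like the sketch below. It is a guess at the logic rather than the script that was actually run: the function names are hypothetical, and the "last subpage name starts with 'archive'" rule is an assumed heuristic.

```python
def is_archive(talk_title):
    """Heuristic: the last subpage segment starts with 'archive'."""
    _base, sep, last = talk_title.rpartition("/")
    return bool(sep) and last.lower().startswith("archive")


def pages_to_flag(assessed_talk_pages, existing_articles):
    """Return assessed talk pages whose article is missing, skipping archives."""
    flagged = []
    prefix = "Talk:"
    for talk in assessed_talk_pages:
        if is_archive(talk):
            continue  # archives carry old banners but have no article of their own
        article = talk[len(prefix):] if talk.startswith(prefix) else talk
        if article not in existing_articles:
            flagged.append(talk)
    return flagged
```

Returning the list for a human to review, rather than tagging straight away, would fit the supervision concern raised in the thread.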
Wikipedia Version 0.5
Category:Wikipedia Version 0.5 is very large, I think. Is it perhaps time to split it according to individual projects, say Category:Military history version 0.5 articles? That would require modifying the assessment templates for all the projects (e.g., {{WPMILHIST}}), but I'd think that at some point this may need to be done. Comments? Oleg Alexandrov (talk) 20:30, 9 December 2006 (UTC)
- It already has a bunch of sub-categories for the broad topic areas; I'm not sure why the tag is also putting everything directly into the main category, but it would be trivial to change that. Kirill Lokshin 20:48, 9 December 2006 (UTC)
- There are 10 subcategories (and a "Misc") - such as History, Arts, etc. See {{V0.5}} for a full list. I'm in fact using these to make navigation pages for the CD such as Wikipedia:Version 0.5/Language and Literature. There are around 2000 articles in the main category - I didn't think that would count as huge, does it? I think there was a reason for having a global category, though I forget the reason now - maybe Tito knows. Is there a problem with it? Walkerma 03:20, 10 December 2006 (UTC)
- Each time the bot runs on demand or I need to do some debugging of the code, one needs to wait until all version 0.5 articles are read. Ideally, next to Category:Physics articles by quality, Category:Physics articles by importance, and Category:Physics articles with comments there would also be a Category:Physics version 0.5 articles, so instead of all the version 0.5 articles being read in bulk at the beginning, they would be read separately for each project when needed. Not a big reason of course, but I thought it would be nice if things are that way. Oleg Alexandrov (talk) 05:11, 10 December 2006 (UTC)
- I don't mean that you should reshuffle the entire V0.5 naming scheme. If a physics article shows up both in Category:Physics version 0.5 articles, and in Category:Natural sciences Version 0.5 articles, and in Category:Wikipedia Version 0.5, that would be perfectly fine. The bot would simply read Category:Physics version 0.5 articles to get version information for physics articles, and ignore the other categories. Oleg Alexandrov (talk) 05:17, 10 December 2006 (UTC)
- The reason we had the root category was that it allowed us to catch articles that "fell through the cracks"; similar to the No-Class importance categories many projects have. If anyone can think of a nice way to ensure that articles won't fall through the cracks, then I see no problem with this change... :) Titoxd(?!?) 05:09, 11 December 2006 (UTC)
- Thanks Tito! I thought there was some reason like that. Oleg, Version 0.5 is small enough that we didn't see the need to break down the categories any smaller. We may need to switch to small categories (such as Physics) for Version 0.7 and later releases, as these releases may get pretty big. I will be working on this sort of thing in the next six weeks, and I'll bear it in mind when I'm setting up the new systems for 0.7. Cheers, Walkerma 07:04, 11 December 2006 (UTC)
- Is there any reason we can't just have the tag add a "Miscellaneous" category explicitly when a valid one isn't provided? That seems much more useful for finding things that fell through the cracks than simply putting everything in one category, no? Kirill Lokshin 10:20, 11 December 2006 (UTC)
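The explicit fallback suggested in the last comment can be illustrated with a small sketch. The topic set here is only a placeholder subset, and "Category:Miscellaneous Version 0.5 articles" is a hypothetical category name; the real list of topic areas is whatever {{V0.5}} defines.

```python
# Placeholder subset of topic areas; the real list lives in {{V0.5}}.
KNOWN_TOPICS = {"Arts", "History", "Natural sciences"}

# Hypothetical name for the explicit catch-all category.
FALLBACK_CATEGORY = "Category:Miscellaneous Version 0.5 articles"


def version_category(topic):
    """Pick the topic category, or the catch-all when the topic is missing or invalid."""
    if topic in KNOWN_TOPICS:
        return f"Category:{topic} Version 0.5 articles"
    return FALLBACK_CATEGORY
```

Under this scheme an article with a mistyped or missing topic lands in one small category, so nothing falls through the cracks and the root category no longer has to hold every article.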
(Moved "BozMo's first cut" to Wikipedia talk:Version 1.0 Editorial Team#BozMo's first cut)
Statistics/Logs for Category-Class and Template-Class articles
WikiProject Louisville has recently started seriously doing assessment work. I was wondering if templates and categories covered by our project could also be hit by the bot for the sake of statistics/logs. It would be very helpful. Stevie is the man! Talk • Work 17:18, 16 December 2006 (UTC)
- I think it's been doing that. See Wikipedia:Version 1.0 Editorial Team/Louisville articles by quality and links therein. Oleg Alexandrov (talk) 18:24, 16 December 2006 (UTC)
- Nope. I just went through the lists again, and it's all articles. No categories or templates to be found. Stevie is the man! Talk • Work 01:22, 17 December 2006 (UTC)
- I see what you mean now. I thought the whole purpose of this project is to assess articles by quality, and not templates. I'd suggest you put all the categories/templates either in a category or on a list. Perhaps I am wrong, but I don't think they belong in the list of assessed articles. Oleg Alexandrov (talk) 05:12, 17 December 2006 (UTC)
- Well, categories and templates are just as key to quality in the Wikipedia as articles, and given that we can set the class to 'Cat' or 'Template', it would seem we should be able to track these as well. I would at least like to see numbers for these, even if we don't necessarily assess quality. Stevie is the man! Talk • Work 05:52, 17 December 2006 (UTC)
- You can manually count the number of items in these categories : Category:Category-Class Louisville articles & Category:Template-Class Louisville articles, to have both the category count and the template count. Hope this helps! Lincher 21:04, 18 December 2006 (UTC)
Rating doesn't show up
Something is strange here. Look at the Scouting project tag here: Talk:Stanisław Broniewski, the ratings display correctly, note the talk page show no change in Dec. Then look here under Dec 24, [7], the importance was removed and then here at the bottom the importance doesn't show up: [8]. Why did this happen, the rating did not change but the bot processed it as if it were? Thanks. Rlevse 10:43, 30 December 2006 (UTC)
- This was like this for several days, but has now cleared up. Go figure. Rlevse 01:09, 31 December 2006 (UTC)
- Yeah, I just noticed the same thing. I did not do any changes to the code. I have no idea what is going on. I suspect the server is feeding lies to the bot. :) Oleg Alexandrov (talk) 01:11, 31 December 2006 (UTC)
Making the main stats table 2D
Not so long ago all the stats tables for individual projects have become two-dimensional in the sense of displaying quality vs importance data. Only the main table Wikipedia:Version 1.0 Editorial Team/Statistics is still 1D. I am considering making it 2D also. That would make it like the other tables, and would be one fewer subroutine for me to maintain.
Of course, the bigger size of the table could be a problem, but from what I saw, it shows up only at Wikipedia:Version 1.0 Editorial Team/Work via Wikiprojects and Wikipedia:Version 1.0 Editorial Team/Index as transclusion, and there it could be pushed down or up the page so that its width does not cause problems. Any comments about that? Oleg Alexandrov (talk) 20:13, 1 January 2007 (UTC)
- Out of curiosity: how would it handle articles that had been given different importance ratings by different projects? Such cases are fairly numerous now. Kirill Lokshin 20:25, 1 January 2007 (UTC)
- Currently, when counting for the big table, if an article was encountered once in a project, it would be ignored when it is encountered second time in another project. That means obviously that the importance data is not perfectly accurate, but this avoids repetitions (when an article is counted twice). I don't see a good way to take into account that an article has different importance ratings in different projects. Oleg Alexandrov (talk) 21:00, 1 January 2007 (UTC)
- I generally find that the higher importance rating is for articles within very specific WikiProjects, which can inflate importance that would be given to that article within a general encyclopedia (which the 1.0 release would be). The lower rating is usually given by a wikiproject that is (theoretically) only peripherally interested in the article. I've been guilty of tagging borderline articles for the attention of a WikiProject, usually on the basis that if that WikiProject won't deal with it, no-one else will. I've seen the film WikiProject 'claim' film sections of articles, which is fair enough, but the importance rating will generally relate only to that section, even if the quality rating is for the article as a whole. What happens when quality ratings differ? Carcharoth 12:34, 23 January 2007 (UTC)
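The counting rule described above, where an article is counted for the global table only the first time it is encountered, can be sketched like this. The data layout (a mapping from project name to a list of article, quality, importance triples) and the function name are assumptions made for the example.

```python
from collections import Counter


def global_table(projects):
    """Build quality-by-importance counts, counting each article only once.

    `projects` maps a project name to a list of (article, quality, importance)
    triples. Whichever project is read first decides the rating that is used.
    """
    seen = set()
    counts = Counter()
    for ratings in projects.values():
        for article, quality, importance in ratings:
            if article in seen:
                continue  # already counted under an earlier project's rating
            seen.add(article)
            counts[(quality, importance)] += 1
    return counts
```

The sketch makes the trade-off visible: repetitions are avoided, but when two projects disagree the result depends on the order in which projects are read, which is the inaccuracy acknowledged in the thread.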
"None importance" in cvg
On Wikipedia:Version 1.0 Editorial Team/Computer and video game articles by quality statistics, one of the table headers is "None", which uses {{No-Class}}, but actually refers to Category:Unknown-priority computer and video game articles. Since the statistics page is generated by Mathbot, I didn't just want to change it. Would it be ok for me to go ahead and change that label? JACOPLANE • 2007-01-2 18:03
- I'm also not exactly sure what it should be replaced with. {{Needed-Class}} ? JACOPLANE • 2007-01-2 18:09
- Needed-Class is used by other projects to indicate articles that need to be written. If the text in the table is changed to anything other than None, I would suggest Unknown (or Unk.) similar to the way it says Unassessed in the quality scale. Slambo (Speak) 18:19, 2 January 2007 (UTC)
- Thanks for the response. Just to make sure: if I change the label to "Unknown" that won't mess up Mathbot, right? JACOPLANE • 2007-01-2 18:29
- My guess is that the next bot run would overwrite the column heading back to what it shows now when it updates the counts in each field. I don't know how much work it would be to make such a change on a project-specific basis (I'm guessing that such a change would be impractical), but if the column label is changed globally... I know in WP:TWP it wouldn't make much difference if it said None or Unknown for that column, but I'm curious about other projects' opinions on this before we advocate such a change. Slambo (Speak) 18:39, 2 January 2007 (UTC)
- The bot now treats the "No-importance", "Unknown-importance", "Unassessed-importance", and "Unassigned-importance" as if they were synonymous. That can be changed of course, if there are good reasons for that. Oleg Alexandrov (talk) 21:30, 7 January 2007 (UTC)
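Treating those four labels as synonyms amounts to a small normalisation step, sketched below. The choice of "Unknown" as the single displayed label follows the suggestion made earlier in this thread and is an assumption, as is the function name.

```python
# The four prefixes listed above that are treated as one and the same rating.
UNKNOWN_SYNONYMS = {"no", "unknown", "unassessed", "unassigned"}

SUFFIX = "-importance"


def normalize_importance(label):
    """Map e.g. 'Unassigned-importance' and 'No-importance' to one label."""
    name = label.strip()
    if name.lower().endswith(SUFFIX):
        name = name[: -len(SUFFIX)]
    if name.lower() in UNKNOWN_SYNONYMS:
        return "Unknown"  # assumed display label; the table currently shows "None"
    return name.capitalize()
```

Changing what the column header says would then be a one-line change to the returned label, applied globally rather than per project.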
Disambig Pages
Why are there two different templates for the same thing? (Template:Dab-Class and Template:Disambig-Class) Cbrown1023 18:21, 6 January 2007 (UTC)
- They're both unofficial—the assessment system doesn't keep track of disambiguation pages—so it doesn't matter too much. I suspect we can redirect one to the other with no ill effects. Kirill Lokshin 19:58, 6 January 2007 (UTC)
WP China
Anyone else noticed the monstrous maze of categories created by WP China? For example: Category:Stub-Class China-related articles of High-importance.
Should anyone be in the mood for a mass CFD or indeed a spot of rogue adminship there's a target for you... --kingboyk 19:30, 10 January 2007 (UTC)
- Eh, unless they're causing problems, I wouldn't bother. If the project in question finds them useful, I see no reason to get rid of them. Kirill Lokshin 19:37, 10 January 2007 (UTC)
- I'll bet you a dollar they don't actually find them useful ;) Seemed like a good idea at the time though no doubt.
- Seriously, I think it gives the assessment scheme a bad name if it's seen to grow to such ludicrous proportions. Just my 2c. You can have the other 98c later. --kingboyk 19:40, 10 January 2007 (UTC)
- We ought to ask the project about it, at the least.
- (Obviously, though, I'm a bit biased; I have my own reasons for not wanting people to start taking an axe to assessment categories. ;-) Kirill Lokshin 19:49, 10 January 2007 (UTC)
- Their problem, I'd say. :) Oleg Alexandrov (talk) 03:44, 11 January 2007 (UTC)
- 'oly 'hit, you guys are insane. :) Titoxd(?!?) 05:52, 22 January 2007 (UTC)
WikiProject Canada template problem
This template has a variable "type", which allows the item to be labeled a template/list/category. But it doesn't work if it's assessed NA-class. See Talk:Lieutenant Governors of Nova Scotia for it working, and Template talk:St. John's landmarks for it not working. Any ideas? - Trevor MacInnis (Contribs) 01:05, 13 January 2007 (UTC)
Trial category intersection
For those wanting to intersect importance and rating (eg. to find all the unassessed top-importance articles in a WikiProject), a trial Category Intersection system is at http://aerik.com/wikintersections.php. Please don't overload it! :-) See Wikipedia talk:Category intersection for details of the person who set that up. Carcharoth 16:04, 21 January 2007 (UTC)
- Oh $*%&!! "I'm using a copy of the relevant tables from November, so this isn't live data" - forget that, but hassle whoever can get this system up and running. It would be really good. See what I did at Category:Unassessed Tolkien articles. Carcharoth 16:13, 21 January 2007 (UTC)
- Nice. Cbrown1023 18:54, 21 January 2007 (UTC)
- The developers are somewhat aware of this work: [9]. Titoxd(?!?) 19:37, 21 January 2007 (UTC)
- Would this intersection feature be that useful for this project? Oleg Alexandrov (talk) 05:43, 22 January 2007 (UTC)
- It would generate a link for every row/column intersection in the individual project stats tables. WP India asked for that previously, didn't they? Titoxd(?!?) 05:53, 22 January 2007 (UTC)
- And WP China seem to be doing by hand (see a few sections above). Carcharoth 23:23, 22 January 2007 (UTC)
- The reason I want this is that if you look at the stats box currently transcluded at the top right on Wikipedia talk:WikiProject Middle-earth, you can see that we made an initial pass over the ~1200 articles to find around 330 that are of top, high and mid importance. Most of the other 870 or so are of low importance or perma-stubs that will be merged. Picking out these articles as a priority was easier than assessing them at the same time, though with hindsight assessing both importance and class at the same time would have been best. Anyway, the situation yesterday was that we had 969 unassessed articles, and I wanted to pick out from those the ones that had been rated important, without wasting time by clicking on the already assessed ones. I eventually did a manual intersection using Excel to compare lists from the unassessed category and the importance category. The result was the lists at Wikipedia:WikiProject Middle-earth/Assessment/Current work. I would have much preferred not to do those lists manually (it only took 30 minutes or so), as an intersection would allow people to work on the same dynamically generated intersection without needing to manually update a list. Also, I could have used Wikipedia:Version 1.0 Editorial Team/Tolkien articles by quality/1, but this was out-of-date as assessment work had been done that day. Working from the categories was the only option other than waiting for the bot to update the list. This type of set-up probably only applies to WikiProjects with large number of articles needing to be organised, and with large numbers of stubs, and needing to pull out a core set of articles. WikiProject Biography springs to mind. Look at Wikipedia:Version 1.0 Editorial Team/Biography articles by quality statistics. They've pulled out 200 top-importance articles. But say that they eventually have another, lower tier of 1000 mid-importance articles (I know they've deprecated that, but this is an example). 
If there were 1000 unassessed mid-importance articles, how would they be separated from the 133177 other unassessed articles? How would someone find those 1000 articles that one person had labelled as of mid-importance, to enable them to assess these mid-importance articles? Maybe the Film WikiProject is a better example. See Wikipedia:Version 1.0 Editorial Team/Film articles by quality statistics. Now, can you see why someone might want to find out what the 52 high-importance stubs are, and work on getting them up to start level at least? Or the seven top-importance Good Articles, and work on improving those. Do you see what I mean by these examples? If your bot could provide links to those numbers, that would be so awesome. It could even just link to right section of the list here, if you can organise those lists by sections, rather than cut into the 45 lists at Wikipedia:Version 1.0 Editorial Team/Film articles by quality. The list is not ideal though, as that is only updated daily. A dynamically generated category intersection would still be best, as it automatically updates as people work on assessing articles. Carcharoth 11:29, 22 January 2007 (UTC)
- To pull out a single suggestion from that rather long post, is it possible to get the bot to add div-id tags to label the points where the following transitions are in the list: FA-top, FA-high, FA-mid, FA-low, FA-unknown, A-top, A-high, etc (for all 35 permutations down to unassessed-unknown)? Then, when it writes the number into the table, it would do it in the form [[PAGENAME#FA-top|NUMBER]], but put no link if the number was 0. That sounds terribly complicated, doesn't it, especially as PAGENAME varies depending on where the cut-off point between pages is. Don't worry, I'm sure Category Intersection won't be that far away, and I don't mind doing manual intersections for now, waiting a day and copying off the list after the bot updates. Carcharoth 02:50, 23 January 2007 (UTC)
- OK, so if I understand it correctly, the category intersection thing does not yet work. It will be rather simple to modify the stats table to have links to the intersections, I will work on it when that tool comes live. You are right in that the div-id tags looks like it would be complicated to implement, and the fact that it would be just a temporary solution makes me even more reluctant to work on it. Will it take long until the category intersection thing works? Oleg Alexandrov (talk) 04:13, 23 January 2007 (UTC)
- I wouldn't like to say. Single unified log-in and stable versions are touted as the big things that developers are working on at the moment. After that, I don't really know. I would hazard a guess at anything from a few months to a few years, depending on what time the developers have free (and I don't know any developers, this is just from memories of what I've read elsewhere). It is possible that technical problems delay it indefinitely, but I really, really hope not. There is meta:DynamicPageList (installed on WikiNews), and meta:DynamicPageList2 (intended for WikiNews), but those are not (yet) available on en-Wikipedia. Carcharoth 12:43, 23 January 2007 (UTC)
- Hi - sorry to take a while to get over here. The issue is entirely performance. The implementation of category intersection I'm testing may have good enough performance for en, but honestly, I think it's probably borderline. Performance is why DPL isn't installed on en, too (conjecture on my part, but from a point of some knowledge - the SQL DPL uses gets really bogged down with large datasets). I think we're right there though - I'm not a full-fledged developer; this is my first and only real contribution to the codebase, but I think everyone thinks I'm on the right track. I'm going to collect more data with this test script, and also write one that uses Lucene. I'm sorry though - I don't have solid plans to update the data; it took me a while to download and then build the table I'm using for testing.--Aerik 17:25, 23 January 2007 (UTC)
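Two of the ideas in this thread are easy to illustrate offline: the manual intersection of two category listings, and the proposed table cell of the form [[PAGENAME#FA-top|NUMBER]] with no link when the number is 0. The following is only a sketch with hypothetical function names; it works on lists of titles that have already been fetched and says nothing about how a live intersection would perform.

```python
def intersect(category_a, category_b):
    """Titles present in both category listings, in sorted order."""
    return sorted(set(category_a) & set(category_b))


def table_cell(page_name, quality, importance, count):
    """Format one stats-table cell, linking to an anchor unless the count is 0."""
    if count == 0:
        return "0"
    return f"[[{page_name}#{quality}-{importance}|{count}]]"
```

For example, intersecting the unassessed category with a high-importance category gives the same worklist that was built by hand in Excel, and `table_cell` shows why the anchor approach is awkward: the caller still has to know which subpage each quality and importance section ended up on.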
Bot missed an article?
Does the bot often miss articles? It seemed to miss Elvish languages. I assessed it here at 21:32 on 21 January 2007. The bot updated the list Wikipedia:Version 1.0 Editorial Team/Tolkien articles by quality/1 with this edit at 22:20 on 22 January 2007, but the article is still listed as unassessed? Anything to worry about? Carcharoth 23:21, 22 January 2007 (UTC)
- There are two templates on that page, so it appears in both categories. The bot went with the unassessed as that is where it would likely get more attention. Cbrown1023 01:12, 23 January 2007 (UTC)
- ROTFL! I've come across that before. So easy to miss that. At least I'll know next time! Any way to scan for duplicate templates? Carcharoth 02:42, 23 January 2007 (UTC)
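One way to scan for this situation, assuming the contents of each quality category have already been listed, is to look for talk pages that appear under more than one class. The function name and the input layout below are hypothetical.

```python
def pages_in_multiple_classes(categories):
    """Find talk pages listed under more than one quality class.

    `categories` maps a quality class name to an iterable of talk page titles.
    Returns {title: [classes...]} for the titles that appear more than once.
    """
    found = {}
    for quality, pages in categories.items():
        for page in pages:
            found.setdefault(page, []).append(quality)
    return {page: qs for page, qs in found.items() if len(qs) > 1}
```

A page that shows up twice usually carries two banners with different ratings, which is the case described above.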
Bot down for today
... due to scheduled computer network downtime at my work. The bot should run tomorrow as usual. Oleg Alexandrov (talk) 05:04, 24 January 2007 (UTC)
Quarter million articles assessed!
I see that we have finally made it to 250,000 articles assessed! Not bad for about 8 months' work. Hats off to all of those hard-working people across 300+ projects, as well as to Oleg for his patience and dedication! We should celebrate and publicise this achievement. Walkerma 07:20, 28 January 2007 (UTC)
Caribbean unassessed articles category
The statistics page for the Caribbean WikiProject links "Unassessed" to the empty Category:Unassessed-Class Caribbean articles, but it should link to Category:Unassessed Caribbean articles, which is where the unassessed articles actually are. I've tried changing the link by hand, but mathbot changed it back with the next update. Anyone know how to fix this? Jwillbur 21:33, 3 February 2007 (UTC)
- Those two categories were duplicating each other. In that case, the bot links to whichever it finds first. I deleted one of them and now the bot links to the other one. Oleg Alexandrov (talk) 18:19, 4 February 2007 (UTC)
Bot rename
I created a new bot account, WP 1.0 bot, which I am considering using instead of Mathbot to update the WP 1.0 pages. That is because updating these pages takes so many edits that Mathbot's supposedly mathematical edits can barely be seen in its contributions.
Nothing should change but the bot name. Oleg Alexandrov (talk) 22:28, 4 February 2007 (UTC)
- We'll probably need to update all the places where Mathbot is explicitly named as updating the statistics; but that shouldn't be a problem.
- Please don't forget to clear the new bot account with the approval board before moving operations to it, incidentally. ;-) Kirill Lokshin 22:33, 4 February 2007 (UTC)
- I am now asking for approval at Wikipedia:Bots/Requests for approval/WP 1.0 bot. I could not find any places where mathbot is mentioned by name, but perhaps I did not know where to look. Oleg Alexandrov (talk) 22:41, 4 February 2007 (UTC)
- OK, mathbot was mentioned at Wikipedia:Release Version and I fixed that. Oleg Alexandrov (talk) 22:51, 4 February 2007 (UTC)
Comments are welcome at Wikipedia:Bots/Requests for approval/WP 1.0 bot on the frequency of bot runs. Oleg Alexandrov (talk) 00:22, 5 February 2007 (UTC)
new WP 1.0 bot performance
seems a tad slower if that is possible. Quite a bit of slippage from the first days updates! :: Kevinalewis : (Talk Page)/(Desk) 13:49, 6 February 2007 (UTC)
- Per the discussion at Wikipedia:Bots/Requests for approval/WP 1.0 bot I had the bot edit a page every 10 seconds only, instead of every 5 seconds.
- In a sense, it is to be expected that updating the pages takes longer and longer: we are talking about a number of articles which is a good fraction of a million. My proposal would be to run the bot every other day only. People who are impatient can occasionally run the cgi script which does things on demand, although it seems that it dies out half-way for very large projects. Oleg Alexandrov (talk) 02:18, 7 February 2007 (UTC)
- Well, I still don't completely understand why the read requests have to hold off for two seconds, as these barely registered on the radar for the 18K hits/sec Wikipedia has been getting. Perhaps cutting the delay to 1 sec (or perhaps even 0.5 sec) would be all right? (I still don't like the added constraint for write time, though, but that's a different issue altogether.) Titoxd(?!?) 02:24, 7 February 2007 (UTC)
- The bot was fetching wikicode every 2 seconds and the contents of categories every second. I now made the bot fetch wikicode every second too. Let's see if that helps. I'd be kind of reluctant to fetch things faster. Oleg Alexandrov (talk) 03:27, 7 February 2007 (UTC)
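The delays discussed here (one second between reads, ten seconds between page edits) amount to a simple minimum-interval throttle. The sketch below is an illustration in Python, not the bot's code; the `Throttle` class and the injectable `clock` and `sleep` parameters are assumptions made so the behaviour can be checked without real waiting.

```python
import time


class Throttle:
    """Enforce a minimum number of seconds between consecutive calls to wait()."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


# Intervals taken from the discussion above.
read_throttle = Throttle(1.0)    # fetching wikicode or category contents
write_throttle = Throttle(10.0)  # saving a page
```

Calling `read_throttle.wait()` before each fetch and `write_throttle.wait()` before each save keeps the two limits independent, so the read delay can be tuned without touching the write delay.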
Problem with bot?
Is there something wrong with the bot? It is adding articles to the LGBT log as being unassessed, but most of them already are. Dev920 (Have a nice day!) 20:10, 11 February 2007 (UTC)
- That was a bug, sorry. It was affecting the log only, not the lists themselves. I fixed it now. Thanks. Oleg Alexandrov (talk) 23:08, 11 February 2007 (UTC)
Strange reporting of renames
It seems like the bot is reporting the new name of the page for both the old and new ones, resulting in a bunch of log entries like "X renamed to X"; here, for example. Kirill Lokshin 21:06, 14 February 2007 (UTC)
- So one more bug. Last week I made a lot of changes to the script to make it much easier to translate to other Wikipedia languages. I tested the code (carefully, I thought) but some bugs crept in anyway. I fixed this now. Thanks. Oleg Alexandrov (talk) 03:26, 15 February 2007 (UTC)
Using WP1.0 Bot for GAs
Please read this proposal and leave comments. Thanks, Walkerma 05:01, 22 February 2007 (UTC)
This project was renamed, and this is now handled by Category:Video game articles by quality. The category is listed to be deleted, but I want to make sure you're all done with it first. What is the correct way to remove this from assessment? Please respond on my talk page ... -- Prove It (talk) 15:05, 23 February 2007 (UTC)
- Well, all you need to do is delete Category:Computer and video game articles by quality, Category:Computer and video game articles by importance, Category:Computer and video game articles with comments, all the pages in that category, and all the subpages in Wikipedia:Version 1.0 Editorial Team/Computer and video game articles by quality. Since you guys wanted the rename so badly, I guess you've got to do all that cleanup. :) I will reply on your talk page too. Oleg Alexandrov (talk) 02:48, 24 February 2007 (UTC)
Well, I was afraid of that. So is there any fast way to delete a bunch of subpages at once? -- Prove It (talk) 02:55, 24 February 2007 (UTC)
- I don't think so. You guys wanted the rename, you've got to delete the old names (39 of them, see here). Sorry. :) Oleg Alexandrov (talk) 03:00, 24 February 2007 (UTC)
Proposal to run the bot every 48 hours
For at least two days the bot took around or more than 36 hours to run. I think that we arrived at a time when we should run the bot once every two days instead of every day. Comments? Oleg Alexandrov (talk) 03:08, 24 February 2007 (UTC)
- I'd support that. Or would it be possible to clone the bot and have one just hit the big ones (MILHIST, WPBIO, Album, France, Australia, Film, India, Computer & Video Game) and the other hit the rest?↔NMajdan•talk 23:00, 28 February 2007 (UTC)
- I thought of this too. But then it would not be possible to compute the total number of articles. Well, that can be accomplished by saving things to disk, etc. But I doubt it would be worth the trouble, I think updating the lists every other day should keep things reasonably up to date. Oleg Alexandrov (talk) 04:21, 1 March 2007 (UTC)
- Ok, that makes sense. Go for it, if you haven't already.↔NMajdan•talk 14:54, 2 March 2007 (UTC)
- Due to no objections, the bot has been running every other day for the last few days. Oleg Alexandrov (talk) 16:03, 2 March 2007 (UTC)
Removing the importance part from our projects assessments
The importance rating has caused enough controversy and is not being used to its full potential in the Aircraft project. What would be the easiest way of removing this part from our assessment profile? Can we just delete the related categories and remove the code from the project banner? What will the bot do after this is done? - Trevor MacInnis (Contribs) 21:26, 26 February 2007 (UTC)
- The above will be enough, the bot won't complain. However, the "Importance:None" column will still show up in Wikipedia:Version 1.0 Editorial Team/Aircraft articles by quality statistics, as you see now in Wikipedia:Version 1.0 Editorial Team/Military history articles by quality statistics. Hope that helps. Oleg Alexandrov (talk) 03:09, 27 February 2007 (UTC)
Something broken or I missed a change?
Wikipedia:Version 1.0 Editorial Team/The KLF articles by quality log hasn't been updated in some weeks and it looks like we dropped off the Index too. Has something broken or has there been a change in my absence? --kingboyk 22:52, 28 February 2007 (UTC)
- That's really strange. I can't see any reason why it wouldn't show up; perhaps Oleg can spot something I'm missing. Kirill Lokshin 10:57, 2 March 2007 (UTC)
- Oleg? --kingboyk 18:36, 2 March 2007 (UTC)
- Sorry. I noticed your comment above only this morning, but did not have time to reply. The problem seems to be that while Category:The KLF articles by quality has at the bottom the Category: Wikipedia 1.0 assessments, when actually browsing Category: Wikipedia 1.0 assessments one could not find Category:The KLF articles by quality in there. This sounds paradoxical, but this happens every now and then with pages/categories which have not been edited for a while.
- I did a dummy edit to Category:The KLF articles by quality to make it pop up in the base category, reran the bot, and now it showed up in the index.
- KLF is one of our oldest projects. It is quite likely that other projects whose categories have not been edited for a while will start disappearing. Something to keep an eye on. Oleg Alexandrov (talk) 03:41, 3 March 2007 (UTC)
- No need to apologise Oleg, and thanks ever so much for sorting that out. Just one of those quirks I guess :) Cheers and thanks again. (And yes, if anyone else comes complaining of the same thing we know what to look for now :)) --kingboyk 10:44, 3 March 2007 (UTC)
This page needs a TOC
Can this page be divided up with a {{CompactTOC}}? It'll make looking through it a bit easier, if people are looking to do so. - Trevor MacInnis (Contribs) 22:09, 2 March 2007 (UTC)
- It needs to be archived. Something I can handle later unless somebody else does it before.↔NMajdan•talk 04:41, 3 March 2007 (UTC)
- Sorry, I guess I should clarify, it's not the talk page that I think could use a TOC but Wikipedia:Version 1.0 Editorial Team/Index. - Trevor MacInnis (Contribs) 04:47, 3 March 2007 (UTC)
- Well, there is only one section there, called "See also". Not much for a TOC, right? :) Actually the whole page is a big fat TOC already, all it has is a list of lists. I don't see how adding a TOC would help. Oleg Alexandrov (talk) 16:52, 3 March 2007 (UTC)
- Ok then. I just hate having to scroll,scroll,scroll,scroll,scroll down, oops too far, scroll up, to find the projects under, say, "M". - Trevor MacInnis (Contribs) 17:29, 3 March 2007 (UTC)
- That's correct. But what we need then is not a TOC per se, but rather sections, in other words, the table may need to be split into subtables, for each letter in the alphabet, and each subtable placed into its own section. That should not be hard to implement, I can do it if there is good support for this. Oleg Alexandrov (talk) 18:40, 3 March 2007 (UTC)
- Yes, please, that would be very helpful.--DorisHノート 22:08, 23 March 2007 (UTC)
- Wouldn't ID divs or spans work? As in having <span id="B"/> before the first B item, etc, and {{CompactTOC}} (or better {{CompactTOC8}}) at the top? It would work inside the table, see for example Towns of Alberta#T. --Qyd 21:48, 4 June 2007 (UTC)
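The per-letter split proposed in this thread can be sketched as grouping project names by their initial letter and emitting one section per group. This is a minimal Python illustration only; `split_by_initial` and `render_sections` are hypothetical helpers, the heading markup is an assumption about how the index would be laid out, and names are assumed to be non-empty.

```python
from itertools import groupby


def split_by_initial(project_names):
    """Return {letter: [names]} with names sorted case-insensitively."""
    ordered = sorted(project_names, key=str.casefold)
    return {
        letter: list(group)
        for letter, group in groupby(ordered, key=lambda n: n[0].upper())
    }


def render_sections(project_names):
    """Render one section per letter, each holding its own short list."""
    lines = []
    for letter, names in split_by_initial(project_names).items():
        lines.append(f"== {letter} ==")
        lines.extend(f"* {name}" for name in names)
    return "\n".join(lines)
```

With real sections in place, the page's automatic table of contents (or a compact A to Z one) would have something to link to, which is the point raised above.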
New Projects are not having their stats created
I've added a few projects, one Category:Rotorcraft articles by quality two and a half days ago, the others Category:Red Bull Air Race World Series articles by quality and Category:Gliding articles by quality more recently, and their statistics pages have yet to be created by the bot. They are using the same project banner as the aviation project, {{WPAVIATION}}, in the same way that the Military history project uses the same banner for all its projects. Could someone look over them to see if I missed something that the bot looks for in order to "do its thing". Thanks. - Trevor MacInnis (Contribs) 05:36, 5 March 2007 (UTC)
- Aha! Rotorcraft just got done. I guess it just takes a lot longer than it used to to begin updates. - Trevor MacInnis (Contribs) 05:52, 5 March 2007 (UTC)
Stats
To help out with Wikipedia:WikiProject Biography/Assessment/Assessment Drive could the bot tally up the total of assessed articles? I'm trying to encourage folks to focus on how much they've achieved, not the bogus unassessed number (bogus because nearly 40,000 living person articles - and lord knows how many bios about dead people - don't have any {{WPBiography}} tag). --kingboyk 13:56, 6 March 2007 (UTC)
- Well, the total number of assessed articles is the total number of articles minus the number of unassessed ones. The latter two numbers are in the stats already. I could modify the code generating the stats to print in addition the total of assessed articles also, but then that number will be printed out for each of the 400 projects stats. I can do that if people think that would be helpful overall for the projects. Otherwise as a solution specific to your project one could write a bot to read the current stats, do the subtraction, and post that number in a place where you guys could easily see. Oleg Alexandrov (talk) 16:04, 6 March 2007 (UTC)
- I know how to do the maths Oleg, but I prefer to have machines do these things for me :) We're talking pretty big numbers at WPBio, and at some other Projects. I was thinking a line could be added to the Stats table to complement the (bogus) unassessed count. I'm happy to wait and see if other projects think this would be useful before you decide. Thanks for the reply, as always. --kingboyk 19:38, 6 March 2007 (UTC)
- I am for it. It is a good statistic to have and more reflective of a WikiProject.↔NMajdan•talk 19:40, 6 March 2007 (UTC)
- OK, I will work on this in the weekend. Oleg Alexandrov (talk) 04:15, 8 March 2007 (UTC)
- I agree that it would be helpful. The first thing I do when I look at a stats list is subtract unassessed from total to work out this number - it would be nice to have this actually displayed. Thanks, Walkerma 04:23, 8 March 2007 (UTC)
- Sorry, I did not get to that in the weekend, and I am very consumed by the real life this week. I'll try to do it this coming weekend. Oleg Alexandrov (talk) 03:57, 13 March 2007 (UTC)
- OK, I added the assessed row, as seen at Wikipedia:Version 1.0 Editorial Team/African military history articles by quality statistics. When the bot runs later today, it will add that row to all stats tables. Oleg Alexandrov (talk) 00:10, 18 March 2007 (UTC)
- Thank you Oleg! Much appreciated. --kingboyk 15:31, 22 March 2007 (UTC)
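The "assessed" row requested in this thread is just the total minus the unassessed count. A minimal Python sketch of that arithmetic follows; the function name and the sample numbers in the usage note are made up for illustration and are not real project statistics.

```python
def add_assessed_row(counts):
    """Given per-class counts (including 'Unassessed'), return summary rows.

    'Assessed' is derived as total minus unassessed, as described above.
    """
    total = sum(counts.values())
    unassessed = counts.get("Unassessed", 0)
    return {
        "Assessed": total - unassessed,
        "Unassessed": unassessed,
        "Total": total,
    }
```

For example, with invented counts `{"FA": 2, "B": 10, "Stub": 30, "Unassessed": 8}` the function reports 42 assessed out of 50 in total.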
WP 1.0 bot and article class
I have a question about Wikipedia:Version 1.0 Editorial Team/Numismatic articles by quality statistics. Pages in this project now have a category class, template, dab, etc. These new classes can be found at Category:WikiProject Numismatics articles. Do you think you can upgrade the bot to identify these classes? Thanks. --ChoChoPK (球球PK) (talk | contrib) 08:09, 11 March 2007 (UTC)
- The bot is currently hard coded to accept only FA-Class, GA-Class, A-Class, B-Class, Stub-Class, and Unassessed-Class. The problem with Category-Class is that it contains categories, not articles, which would need special treatment in my code. The problem with new classes in general is that the bot needs to be told for each one how to sort it in the table relative to the other classes.
- All in all, taking into account that there are more than 400 projects, and my code is being translated to other language Wikipedia too, I am very reluctant to expand the code to support project specific needs. Perhaps there are other ways you guys can keep track of those categories? Oleg Alexandrov (talk) 15:24, 11 March 2007 (UTC)
- I understand your difficulty. If I simply add rows to the table, which link to the proper categories, and question marks in place of the count, would you have your bot leave that part alone (a temporary solution for the moment). --ChoChoPK (球球PK) (talk | contrib) 15:35, 11 March 2007 (UTC)
- Well, the statistics table is recreated each time so any changes are overwritten. What you suggest would not be easy to implement. It would require the bot to first read the table, then decide based on some kind of algorithm what to keep and what to overwrite, and then write back the table. Definitely not impossible, but it would make the code much too complicated I think. Oleg Alexandrov (talk) 20:25, 11 March 2007 (UTC)
- These parameters (list, disambig, etc.) are quite distinct from quality classes. I would suggest writing a separate bot to produce this kind of information. This bot might, perhaps, be able to post this type of category information into the /comments subpage, and then WP1.0 Bot would put the information into the main table. Alternatively, you may want to keep such pages out of the main table altogether, and have the new bot produce its own listing of non-article pages, organised by category. If you can get something like this working, please share it with us, because I know others would be interested. Cheers, Walkerma 03:22, 12 March 2007 (UTC)
- I have something like that working here. It's not hard to implement from Oleg's framework. It would be much easier if each attribute had a category assigned to it. CMummert · talk 12:44, 12 March 2007 (UTC)
- That looks very nice- thanks for sharing it! It shows how projects can tailor their output. We may want to use your bot for Version 0.7, where we have existing categories such as history, maths, natural sciences, etc., which are not readable by WP 1.0 bot. Thanks, Walkerma 21:57, 12 March 2007 (UTC)
- CMummert, how about showing some code? :) Oleg Alexandrov (talk) 03:57, 13 March 2007 (UTC)
- Here it is: [10]. It would need some editing to be useful for other people, but the idea is extremely simple. CMummert · talk 00:10, 15 March 2007 (UTC)
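The sorting problem Oleg describes (each class must have a known position relative to the others) can be illustrated with a fixed rank table. This Python sketch is not the bot's code: the class names are the ones listed in his comment, their order here is an assumption, and placing unknown classes last in alphabetical order is just one possible policy.

```python
# Classes named in the comment above; the order is an assumption for illustration.
KNOWN_CLASSES = ["FA", "GA", "A", "B", "Stub", "Unassessed"]
RANK = {name: position for position, name in enumerate(KNOWN_CLASSES)}


def sort_classes(classes):
    """Sort known classes by fixed rank; unknown ones go last, alphabetically."""
    return sorted(classes, key=lambda c: (RANK.get(c, len(RANK)), c))
```

A project-specific class such as Category or Template has no entry in `RANK`, which is why supporting it means deciding, per class, where it belongs in the table.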
FA-Class article count
I believe that somewhere, 17 articles are tagged as FA-Class when they are not. According to Wikipedia:Featured articles there are 1307, but according to Wikipedia:Version 1.0 Editorial Team/Statistics there are 1324. Can the bot be programmed to catch this? Or is this just a problem for the projects involved to correct? - Trevor MacInnis (Contribs) 00:14, 18 March 2007 (UTC)
- I think I can guess the reason for at least some of the discrepancy. When a FA gets first listed, its assessment is clear, and it will probably get upgraded to FA-Class immediately. When a FA gets delisted, the project tags may get left at FA for some time after delisting. I don't think this matters too much - I'm glad the numbers are so close! It shows that pretty much all of our FAs are being tracked by one WikiProject or another!
- I just made a list from Wikipedia:Featured articles using AWB, and it gives 1307 mainspace links from that page that are true article listings. Meanwhile, Category:Wikipedia featured articles gives 1303 mainspace article talk pages. The difference is with the following:
- Unfortunately I can't get a list of the FAs found by the 1.0 Bot directly.
- I think it would be a bad idea to try and get this bot to make checks of this sort, because I think bots should have a clear purpose. This bot is so busy it takes 36 hours to complete one cycle, so we don't want to burden it with more tasks. It's also got a lot of code, we don't want to make it any more complicated. However, I think it might be worth a check to make sure it's not double counting anything, stuff like that. Is there some way to get a list of the FAs it is finding? Or is this what Category:Wikipedia featured articles is supposed to be? Thanks, Walkerma 01:20, 18 March 2007 (UTC)
- I love this last paragraph (about not making the code more complicated, one can always hide behind that :) Now, the bot does not overcount, it keeps a global hash making sure that in the total stats each article shows up only once. Oleg Alexandrov (talk) 04:00, 18 March 2007 (UTC)
- Sure it would be possible to get a list of the articles that the WP 1.0 bot counts - fetch a list of all subcategories of Category:FA-Class_articles and then fetch the contents of those categories. Oleg has released the code for WP 1.0 bot, so this would only take a few extra lines of perl. The following works for me:
    # fetch_articles_cats() is assumed to be provided by the released WP 1.0 bot code.
    my $Root_category = 'FA-Class_articles';
    my (@tmp_cats, @tmp_articles);

    # First pass: collect the per-project subcategories of the root category.
    &fetch_articles_cats($Root_category, \@tmp_cats, \@tmp_articles);

    my @tmp_cats2;
    my %FeaturedArticles;   # hash keys give a duplicate-free set of titles

    # Second pass: collect the pages listed in each subcategory.
    foreach my $cat (@tmp_cats) {
        print "fetch 2 $cat\n";
        &fetch_articles_cats($cat, \@tmp_cats2, \@tmp_articles);
        foreach my $featured_article (@tmp_articles) {
            $FeaturedArticles{$featured_article} = 1;
        }
        print "$cat " . (scalar @tmp_articles) . "\n";
    }
    print "Count: " . (scalar keys %FeaturedArticles) . "\n";
- I count 1370 featured articles this way. Category:Wikipedia featured articles is added by {{ArticleHistory}} if currentstatus is FA. So by cross referencing it would be easy to make a list of the exceptional articles. CMummert · talk 02:13, 18 March 2007 (UTC)
- Here are the impostors this morning:
- Most of them seem to be former FAs where the project rating was not changed. Since the list is so short, I'll go through and fix the project ratings to non-FA. So you may have to look at the history to see what the problems were. CMummert · talk 12:15, 18 March 2007 (UTC)
- There are periodic checks for the self-consistency of WP:FA, Category:Wikipedia featured articles, and transclusions of Template:Featured article, and similar for WP:FFA and its category. I have seen some projects assign "FA" class to project-specific "selected articles" though I think this is just a misunderstanding. Gimmetrow 21:19, 18 March 2007 (UTC)
- The FA-Class tag is also used for featured lists, at least by some projects; that'll cause the numbers not to match up even if everything is consistent. Kirill Lokshin 21:27, 18 March 2007 (UTC)
- There are also Wikipedia:Featured images. I think that WP 1.0 bot is likely to include all three in the "FA-class" count, provided that the project puts its rating template on all of them. CMummert · talk 01:14, 19 March 2007 (UTC)
- Yes, of course, Talk:United States Navy enlisted rates. But there are 230-some featured lists, and the difference mentioned above is less than 70. There are some featured articles without any project tag, too. Gimmetrow 21:36, 18 March 2007 (UTC)
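Two ideas from this thread can be sketched with plain sets: counting each page once across project categories (the "global hash" Oleg mentions) and cross-referencing the project-tagged pages against the official featured list to find mismatches. This is a Python illustration only, not the bot's Perl; the function names are hypothetical and any page titles passed in are the caller's own data.

```python
def unique_pages(pages_by_category):
    """Union of all pages, so a page tagged by several projects counts once."""
    seen = set()
    for pages in pages_by_category.values():
        seen.update(pages)
    return seen


def cross_reference(project_tagged, officially_featured):
    """Report pages that appear in one set but not the other."""
    return {
        "tagged_but_not_featured": sorted(project_tagged - officially_featured),
        "featured_but_untagged": sorted(officially_featured - project_tagged),
    }
```

As the replies above note, a non-empty first list does not always mean an error: featured lists and other featured content carrying an FA-Class project tag would also show up there.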
Unassessed question
Somehow, my project has two unassessed categories. I think this happened when a bunch of empty categories were deleted awhile ago. Which one does the bot look at? My banner places unassessed articles in Category:Unassessed University of Oklahoma articles but the statistics table links to Category:Unassessed-Class University of Oklahoma articles. I want to make sure I delete the correct one. Thanks.↔NMajdan•talk 21:06, 21 March 2007 (UTC)
- I think the bot should be happy with either one. If you delete one of the unassessed categories, the bot will link to the other one in the stats. Oleg Alexandrov (talk) 03:33, 22 March 2007 (UTC)
Total number of articles tracked?
The current total of 701,787 is, I assume, the total number of articles with participating WikiProject tags on them (assessed or not). I know there have been updates on this and similar figures in the Wikipedia Signpost, but can anyone here provide a graph of how the total has varied over time? When is the figure likely to hit 1 million? Is it ever likely to catch up with the total number of articles (currently 1,698,947)? Carcharoth 11:46, 22 March 2007 (UTC)
- If we could answer such questions we'd make a lot of money on the stock market. :) The bot never kept information about the total number of articles each day. That info is of course contained in the history of the page, but would be a pain to extract.
- To keep it simple, this project is less than a year old I think. So perhaps in a year we'll cover all the articles (unless the bot stops working for reasons of scale, that is). Oleg Alexandrov (talk) 14:50, 22 March 2007 (UTC)
- I (painfully ;)) extracted the numbers from the history page. I used 1 week intervals, here's the simple graph I came up with. Yeah, a year sounds like a good estimate. MahangaTalk to me 15:18, 22 March 2007 (UTC)
- Ooh. Nice. Could you stick a green solid area under the blue bit to show total number of assessed articles? Carcharoth 18:15, 22 March 2007 (UTC)
- I wonder if the total number of articles (that must have been graphed somewhere else already) can be put in as a black solid line up above this solid stuff? :-) Carcharoth 18:17, 22 March 2007 (UTC)
- There's a lot of stuff on Wikipedia:Statistics, but no easy way to retrieve that information. I did add the total assessed to the previous graph. That's about the extent of my excel expertise. Hope you like. MahangaTalk to me 20:05, 22 March 2007 (UTC)
- Very nice to see this, thanks! I can see the effect of Kingboyk's stubs script after August 2006, too! Thanks, Walkerma 21:29, 22 March 2007 (UTC)
- Bah! It's not a script, it's a finely tuned piece of low level code! <cough> --kingboyk 22:01, 23 March 2007 (UTC)
- I'd also like to think that last little surge over the past month is due to WPP:BIO's assessment drive.↔NMajdan•talk 19:11, 23 March 2007 (UTC)
- Actually, NMajdan, I had thought the same thing, I think that assessment drive is having a big impact. Also, the huge surge last August came from WP:BIO joining. Walkerma 21:57, 23 March 2007 (UTC)
Next step is to program the bot to generate and update this graph itself. <ow! stop hitting me with that wet trout, Oleg!> :-) Carcharoth 22:53, 22 March 2007 (UTC)
- Heh. Apparently I'm not the only one who does these kinds of things... Titoxd(?!? - cool stuff) 03:15, 23 March 2007 (UTC)
- Awesome. :) Oleg Alexandrov (talk) 03:26, 23 March 2007 (UTC)
That's beautiful. Could you update it every month? Pretty please! --kingboyk 22:00, 23 March 2007 (UTC)
WP 1.0 bot is so slow
WP 1.0 bot is so slow. Can you increase the speed of the bot? I read recently that many bots could increase their speeds. Don't remember where I read that, but I just think your bot is going too slow. --Paracit 02:37, 24 March 2007 (UTC)
- The bot is slow because it has to do a lot of work (458 projects with thousands of articles each). Currently it does a read request each second and a write request each 10 seconds. I was told to not have it write more often than that.
- Also, I'd think that having the bot run every other day is acceptable, no? People who want a quick bot run can do so here. Oleg Alexandrov (talk) 03:14, 24 March 2007 (UTC)
- Note that if you run it from the toolserver, the stress on the servers themselves in both reads and writes is reduced; at least that's what I was told a while ago in #wikimedia-tech. I'm not sure about the details, but it may be something to think about. Titoxd(?!? - cool stuff) 04:25, 24 March 2007 (UTC)
- Really? I have an account on the toolserver which I could use. But I am kind of skeptical that the toolserver has a more intimate relationship with the main database than other computers on the net. If this were for sure, it would be great. Oleg Alexandrov (talk) 04:34, 24 March 2007 (UTC)
- It's not really a database issue (we all know that database replication there sucks); it's more of having a direct connection to the database cluster, bypassing the squid cache servers. Or something like that. However, read and write performance is much higher, AFAIK. Titoxd(?!? - cool stuff) 04:38, 24 March 2007 (UTC)
- There are two things to consider. One is that my script is a resource hog, it can gobble up tens of megabytes of memory (if not more), perhaps taking resources from other programs on the toolserver. Second, and more importantly, from my experience the toolserver can be down every now and then, and even if it is down for a moment, an entire two-day run of the bot is interrupted. But one of these days I'll give it a try moving things to the toolserver, let's see if things become faster. Oleg Alexandrov (talk) 14:57, 25 March 2007 (UTC)
The bot did its last two runs on the toolserver. I can't say if it was faster because neither run was finished. Either the machine was rebooted or the script died or something. I am moving it back to my department's machines. I'll also think of ways to make the script faster. Oleg Alexandrov (talk) 02:44, 11 April 2007 (UTC)
- Make it a class assignment? ;) Titoxd(?!? - cool stuff) 03:39, 11 April 2007 (UTC)
AWB script to create article assessment categories
Every so often I find myself creating a new set of assessment categories for a WP Biography workgroup. It's a dull and repetitive task, so yesterday I knocked up a script in the shape of an AWB plugin to do the job. It's not a bot, it just asks the user for some config info, creates a category list which it adds to the AWB list, and then fills in the categories with some boilerplate text. User can review the text before save and is always in full control.
The plugin should ship with the next version of AWB, and source code (VB.net) is in the AWB subversion repository. Please try it!
Some examples created with this tool: Category:Biography (baronets) articles by quality, Category:Biography (peerage) articles by quality, Category:Biography (peerage) articles by priority. --kingboyk 12:54, 26 March 2007 (UTC)
- There is also my script, at Wikipedia:Version 1.0 Editorial Team/Generate categories which has been working for a while. But that one requires admin privileges. So it's nice to have the AWB alternative. Oleg Alexandrov (talk) 14:37, 26 March 2007 (UTC)
- Bah, didn't know about that. Oh well, as you say there's an alt for non-admins now :) --kingboyk 14:38, 26 March 2007 (UTC)
Intersections etc on WP1.0 bot
- The first two comments here were pasted from Oleg's talk page by TimNelson
More ideas:
- Could you generate a page that lists the 10 pages in Wikipedia with the most WikiProjects attached?
- Could you make it so that it lists the 10 Wikipedia projects that have the most overlap, but no common task force? Eg. if WP Biography and WP Mathematics have a lot of overlap, but no common task force, it might be an indication that a Mathematicians Biography task force should be set up.
-- TimNelson 09:53, 3 April 2007 (UTC)
- Thanks, these are good suggestions. But I have rather little time for coding for the moment, and if I had more, more pressing tasks would be making WP 1.0 bot faster, introducing a table of contents in the index (as requested at WT:1.0/I), and a few others. I can work on this, but it may take me a few weeks to get to it. You can also try to raise this at WT:1.0/I. There are a few perl programmers there who could implement this. Cheers, Oleg Alexandrov (talk) 15:05, 3 April 2007 (UTC)
- I'm another one who's come here wondering what had happened to the bot for the Wine Project ;-/ but after the obligatory thanks for doing this in the first place, and bearing in mind the above comments about time, I thought I'd do a 'me too' on the original intersection idea - and specifically on the Stubs line (if that helps at all with server load), for the purpose of prioritising stub killing. A few weeks ago the Wikiwino stubs were distributed something like 2 Top, 55 High, 300 Mid, 700 Low (something like that anyway), and I ended up doing a quick and dirty VLOOKUP + filter in Excel to identify the Top and High priorities. If a filter for those stubs was a single click away, I'm sure it would help with stub killing. But having the table showing stubs ticking away is great for morale ;-) - thanks again FlagSteward 20:41, 13 April 2007 (UTC)
- It seems to me like what you're talking about is (possibly) a different task. Can I point out m:CatScan, which currently doesn't update from English Wikipedia due to lack of hardware. Possibly you could contact the owner of CatScan and offer money for additional hardware or something -- I dunno. But I'd be interested in seeing CatScan going, for much the same reasons.
- -- TimNelson 10:30, 14 April 2007 (UTC)
- Or you could create a category structure like this Category:India articles by quality and importance on your banner that would give you a category such as this, Category:Stub-Class India articles of Top-importance. Let me know if I need to explain further. Regards, Ganeshk (talk) 14:33, 14 April 2007 (UTC)
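TimNelson's two suggestions at the top of this section (the pages with the most WikiProjects attached, and the project pairs with the most overlap but no shared task force) come down to counting and set intersection. The following is a minimal Python sketch, not code from WP 1.0 bot; the function names, the `project_pages` mapping and the sample data are all made up for illustration.

```python
from collections import Counter
from itertools import combinations


def top_tagged_pages(project_pages, n=10):
    """Return the n pages tagged by the most projects, as (page, count) pairs."""
    counts = Counter()
    for pages in project_pages.values():
        counts.update(set(pages))
    # Highest count first; ties broken alphabetically so the result is stable.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


def top_overlaps(project_pages, existing_task_forces=frozenset(), n=10):
    """Return the n project pairs sharing the most pages as (shared, a, b).

    Pairs listed in existing_task_forces (a set of frozensets of two
    project names) are skipped, since they already have a joint task force.
    """
    overlaps = []
    for a, b in combinations(sorted(project_pages), 2):
        if frozenset((a, b)) in existing_task_forces:
            continue
        shared = len(set(project_pages[a]) & set(project_pages[b]))
        if shared:
            overlaps.append((shared, a, b))
    overlaps.sort(key=lambda t: (-t[0], t[1], t[2]))
    return overlaps[:n]


# Hypothetical sample data: project name -> set of tagged pages.
project_pages = {
    "Biography": {"Leonhard Euler", "Carl Friedrich Gauss", "Louis Pasteur"},
    "Mathematics": {"Leonhard Euler", "Carl Friedrich Gauss", "Group theory"},
    "Wine": {"Louis Pasteur", "Merlot"},
}

print(top_tagged_pages(project_pages, 3))
print(top_overlaps(project_pages))
```

The pairwise loop is quadratic in the number of projects, which should be acceptable for a few hundred projects but would need rethinking if run per task force.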
Interwiki on statistics page
The French Wikipedia now has about 13,000 articles assessed. Is it possible to add an interwiki link to this stats page from Wikipedia:Version 1.0 Editorial Team/Statistics? I'm hesitating because I know that page is edited daily by the bot. Can we add a noinclude section that the bot will ignore? I hope other languages will take off with bot assessments like the English & French, and if so we will want to have interwiki links. We could also use such a section to add the page to a category. Walkerma 05:50, 10 April 2007 (UTC)
- Nice to know the French are doing well. :) Tomorrow I'll modify the bot code to allow a section which the bot won't overwrite. Oleg Alexandrov (talk) 03:39, 11 April 2007 (UTC)
- I modified the code so that text after a bot tag in Wikipedia:Version 1.0 Editorial Team/Statistics will not be modified. I also added the "fr" link at the bottom. Oleg Alexandrov (talk) 02:17, 12 April 2007 (UTC)
- Thanks! I also appreciate your fixing the Hungarian link as well, I'd assumed they'd given up or something! Good to see things under way there, too. Cheers, Walkerma 05:02, 12 April 2007 (UTC)
Why no bot run?
What's wrong with the bot? Many projects haven't had a run since 7 Apr while others have had two runs since then. It's always the projects at the end of the alphabet that lose out. 23:07, 12 April 2007 (UTC)
- Is the bot simply running out of time to process the later projects? I was under the impression that this shouldn't happen with a two-day run. Or has there been some change made to how it operates?
- The lack of updates for nearly a week now is, admittedly, disconcerting. Kirill Lokshin 00:50, 13 April 2007 (UTC)
- See the section #WP 1.0 bot is so slow above. I tried to move the bot to the toolserver to make it faster. The results were sad, the bot never finished its run (the toolserver can't be counted on to be up and running continuously for two days in a row I think). The bot is back to my school's computer from yesterday night and it's been running well since then. Oleg Alexandrov (talk) 02:12, 13 April 2007 (UTC)
- Ah, ok; that explains it. (Pretty sad how unreliable the toolserver is, though.) Kirill Lokshin 02:24, 13 April 2007 (UTC)
- Just as a suggestion, how about maintaining a tidemark as the bot goes through the categories, and then restarting at the tidemark if the last run failed? That way the pain would be shared if there's server problems, and every category would get an update every eg 3 days rather than Aardvarks getting a daily update and Zebras get none at all. Just saying, from the perspective of the Wine project ;-/ FlagSteward 20:45, 13 April 2007 (UTC)
- This is a good idea but now that I moved the bot back to the original server, crashes and interruptions should happen very seldom (judging from past history) so I hope there is no need to implement advanced crash recovery. Oleg Alexandrov (talk) 00:29, 14 April 2007 (UTC)
- Wikipedia:Version 1.0 Editorial Team/Biography articles by quality statistics hasn't been updated for 8 days. Any idea when we might see an update Oleg? (Please don't stress over this, if the answer is "not until next week/month" then c'est la vie :)) --kingboyk 21:56, 13 April 2007 (UTC)
- It will get updated today. The biography is last in the list, because it is by far the hugest category. Oleg Alexandrov (talk) 00:29, 14 April 2007 (UTC)
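FlagSteward's tidemark suggestion in this section can be illustrated with a short Python sketch. It is not how WP 1.0 bot works (Oleg notes above that he does not plan to implement crash recovery); the file name, the JSON layout and the `update` callback are assumptions made for the example.

```python
import json
import os


def load_tidemark(path):
    """Return the last project finished by an interrupted run, or None."""
    if not os.path.exists(path):
        return None
    with open(path, encoding="utf-8") as fh:
        return json.load(fh).get("last_done")


def save_tidemark(path, project):
    """Record that this project has just been updated."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump({"last_done": project}, fh)


def run_order(projects, tidemark):
    """Rotate the project list so the run starts just after the tidemark."""
    if tidemark not in projects:
        return list(projects)
    i = projects.index(tidemark) + 1
    return projects[i:] + projects[:i]


def run(projects, path, update):
    """Update every project once, saving the tidemark after each one."""
    for project in run_order(projects, load_tidemark(path)):
        update(project)  # may raise, or the machine may go down mid-run
        save_tidemark(path, project)
    # The run completed, so clear the tidemark: the next run starts at the top.
    if os.path.exists(path):
        os.remove(path)
```

With this ordering a run that dies after "Mathematics" resumes at the next project and wraps around to the start of the alphabet, so the projects at the end of the list are served first after a failure instead of being skipped again.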
Mathematics grading now produces lists by sub-field
Folks here might be interested in what the Maths wikiproject have done with assessment. We have a field parameter which is used to place the article in a sub field, say algebra or geometry. User:CMummert has now written a bot which reads this field and produces field specific lists like Geometry and topology. A similar scheme could be useful for other wiki projects which have very large number of articles. --Salix alba (talk) 16:25, 22 April 2007 (UTC)
- Dunno if you guys considered this, but you could avoid having to deal with your own bot simply by having the field parameter generate a 1.0-bot-readable category; see, for example, how {{WPMILHIST}} creates assessments for each task force. Kirill Lokshin 16:28, 22 April 2007 (UTC)
- The bigger difficulty is that the math project uses a B+ grade but WP 1.0 does not. The bot does a lot more cross indexing than WP 1.0 does. Start with the table and click on any link that isn't a category link to see it in action. CMummert · talk 21:16, 26 April 2007 (UTC)
- Why would you want an extra grade? --kingboyk 21:47, 26 April 2007 (UTC)
- We found there was a very large gap between start and GA classes. B+ is generally those articles which could be considered closest to being put forward as a GA nom. --Salix alba (talk) 23:17, 26 April 2007 (UTC)
- I see. Fair enough, of course, but I'd have thought "non-standard" gradings would just give you more work. Perhaps you like work, I don't know :) --kingboyk 23:51, 26 April 2007 (UTC)
- That's a fantastic tool, CMummert! OK, Kirill points out one other way to do this, but I think this is a very nice alternative. It's clearly been very well thought out. Would it be possible for us to use this bot for the 1.0 project (outside Math)? We already have ten "fields" (and yes, one of them is Mathematics), the only difference is that our template uses the word "category" instead of "field". I dare say we could do an AWB sweep to change that. I would love to be able to produce a nice list of (say) all of the history articles in Version 0.7. At present all we can do (that I know about) is to use AWB on Category:History Version 0.7 articles and convert it to a list which we can then upload onto a new page. This method was used with Version 0.5 when writing navigation pages. I can imagine that some other WikiProjects may want to use it too - it depends if you're willing to support it. You know that Oleg has his hands full running the WP 1.0 bot.... Thanks, and great work, Walkerma 01:10, 27 April 2007 (UTC)
- (lower indent) The Version 0.7 setup is pretty easy to adapt the script to. I uploaded examples at User:VeblenBot/Version_0.7/MainTable and User:VeblenBot/Version_0.7/History (actually, I uploaded the whole table set). I have no objection to maintaining this script - it's under 500 lines of perl and there should be no need to change it once it is set up correctly. Unfortunately it's not very modular, so configuring it means editing the source, not writing a config file like the WP 1.0 bot. I don't mind running the script for other projects if they are interested. But I would need to put in a bot request before adding to the current automatic workload. CMummert · talk 03:15, 27 April 2007 (UTC)
- Hey, that's amazing! When we set up the category parameter, this was exactly the sort of thing I had in mind! Could you make the bot request? I'm pretty sure others would be interested in using this. Thank you SO much! And I just realised - it's the bot's birthday today - how appropriate! Thank you, Walkerma 04:16, 27 April 2007 (UTC)
- I don't really understand the way that the release version tags are set up. There are 0.5 articles and 0.7 articles - should my tables include both? Right now they only include the 0.7 categories. It would help with the request if other people associated with the release version are interested in the tables/lists. And there are some formatting issues I need to take care of.
- One reason I am suspicious is that my bot counts 2099 articles but Oleg's bot only counts 2067. I have to figure that out. My list is very thorough, which could account for the difference. Does Oleg's bot ignore NA-Class articles? It also seems to ignore the incorrect tag on Talk:Grateful Dead. CMummert · talk 04:56, 27 April 2007 (UTC)
- Everything from Version 0.5 is automatically in Version 0.7, and the template should be set up to say that. As for the discrepancy, there may be some lists that don't show, or something like that. I think that Oleg's bot includes NA-Class. A useful tool is the list-comparer in AWB - I used that a lot to resolve the differences we found when putting together the Version 0.5 listings. You may want to check that things aren't being double counted - not an issue in Maths, but maybe possible if an article has BOTH 0.5 and 0.7 tags on it. Things are in a bit of a state of flux at the moment, we just switched to a new 1.0 template recently, and not everything has been changed over. Also, I haven't checked the log for talk page vandalism recently, but I'll go through that when I get a chance. Many thanks, once again! Walkerma 05:16, 27 April 2007 (UTC)
- The bot does not count NA-Class articles. I was not aware of this class until now. Is it a quality or importance class? Oleg Alexandrov (talk) 06:29, 27 April 2007 (UTC)
- NA-Class is a quality class for articles that don't fit into the quality system, like year and century pages (19th century). In the math project there is also Image-Class and List-Class, but the {{releaseversion}} template doesn't support those. If I subtract from my list that one exception (Talk:Grateful Dead) and the NA class articles then my count matches Oleg's count. WP 1.0 bot must not count Grateful Dead because it's in Category:Version 0.7 articles with invalid quality ratings.
- I can't run AWB because my home and work computers run Linux, but I manage somehow. CMummert · talk 13:30, 27 April 2007 (UTC)
- Since they don't fit into the quality system, then I guess they should not be counted by the bot. By the way, I run Linux at home and work too. :) Oleg Alexandrov (talk) 14:44, 27 April 2007 (UTC)
- Indeed, Oleg, the whole point of NA is that it's not applicable to the assessments system and thus your bot needn't know about it.
- I also use that value to turn off the "priority" rating when a {{WPBiography}} has more than one workgroup active with different priority params (and then kludge it by adding the priority categories manually). --kingboyk 14:54, 27 April 2007 (UTC)
Interest?
Is there interest here in VeblenBot's tables? There are examples at User:VeblenBot/Version_0.7/MainTable and subpages. If there is interest in doing something with them here, I'll put in a bot request to update them daily. CMummert · talk 13:17, 30 April 2007 (UTC)
- If there is good interest in the table, one option could also be to merge CMummert's code into the main WP 1.0 bot code so that all projects could use it. But that of course would need discussion to make sure people agree with that. Oleg Alexandrov (talk) 15:19, 30 April 2007 (UTC)
- I would have no objection to that. The main work that would have to be completed is to find a way to modularize the code to use a per-project configuration file. CMummert · talk 15:50, 30 April 2007 (UTC)
- I would love for VeblenBot to become a standard tool for the 1.0 project. If it can be incorporated nicely into WP 1.0 Bot, that would be great. If it would be more reliable/efficient to keep them separate, let's do that. Either way, it's a really nice feature, thanks, Walkerma 02:38, 1 May 2007 (UTC)
- If there is any project that would like to use the bot, I would be glad to set it up. I need to put in another bot request, though, which means I need to know exactly what is being requested. CMummert · talk 03:09, 7 May 2007 (UTC)
- I was just about to post a request here - you must have read my mind! Please go ahead and set up the bot for:
- All of these use the same 11 categories, see {{WP1.0}} for a list. With the GA project, the "topic" (equiv. to category) parameter is present in the template, but it has not been used so far, so the 11 individual GA categories are currently empty. Without your bot, these categories provided no useful information, but with your bot it will become worthwhile for people to include the category. Once again, thanks! Walkerma 04:35, 7 May 2007 (UTC)
- OK. The table at User:VeblenBot/Version 0.7/MainTable and its subpages should take care of the Release Version project. I'll look into the vital articles project next. Could someone look over the release version pages and let me know if there are any problems? CMummert · talk 15:48, 7 May 2007 (UTC)
- This looks excellent to me, I can't see anything wrong! I'm guessing that it needs both importance and quality for the top table - right? We have very few assessed for importance at present. I love the category table. Thank you very much, this is very helpful. Walkerma 04:13, 8 May 2007 (UTC)
- I put in a request for this function; I expect it should be approved pretty quickly. CMummert · talk 01:39, 9 May 2007 (UTC)
Short log or temporary archive
Hi. WP:WINE has a bit of a problem at the moment - we like to have the assessment log in some inner HTML on our homepage in order to keep an eye on what other people have been up to recently, but thanks to a stub assessment drive in early March our log is currently running at over 300kb, which is slowing down our home page a little. I appreciate that the logs expire after 3 months, but we can't really wait that long. I've had a bit of a poke round the talk archives here but hadn't found anything to match this problem. I've thought of two options :
Manual kludge
I was wondering if it would be OK if I just set up Wikipedia:Version_1.0_Editorial_Team/Wine_articles_by_quality_log/Archive, manually cut out everything over a month old, set up a link to the archive from the main log and then deleted it in two months time - would that break the bot horribly? Doesn't need any work on your part, and just gets us out of this temporary hole.
New feature
A more elegant solution that might be useful on many project portals would be a separate 'shortlog' page, that just had the changes since the last botrun, or the last week or something, plus a link to the main log page. I appreciate this option involves extra coding, but I thought I'd float it.....
Of course there is a third option, to delete the log from our homepage for the next few weeks, but I'd only do that if the 'temporary manual archive' option isn't available. FlagSteward 21:58, 30 April 2007 (UTC)
- The bot will not be affected no matter what you do. It usually takes the log page as it is and just appends to it. Oleg Alexandrov (talk) 04:04, 1 May 2007 (UTC)
- OK - I've now set up Wikipedia:Version_1.0_Editorial_Team/Wine_articles_by_quality_log/Archive to offload 300kb-worth of log ;-/ temporarily. I've appended a note to the end of the log to explain where the old stuff has gone, and a note to the top of the archive explaining where it came from. I hope this is OK - give us a shout if it breaks anything FlagSteward 01:37, 2 May 2007 (UTC)
- Unless I'm missing something, there's no need to do that. Just truncate the log. I know it has a header saying in effect "leave this page alone" but that's really only to discourage people disambiguating links. If the log is too long for transclusion you can truncate it; I've done it often enough and the bot doesn't mind. The old revisions stay in the history, so there's really no need to archive it. --kingboyk 11:25, 3 May 2007 (UTC)
Small update on the bot
I modified the bot code to fetch the latest history version of articles as suggested a while ago by Titoxd and Salix alba (I remember it was both), by doing a query of the form
http://en.wikipedia.org/w/query.php?format=txt&what=revisions&rvlimit=1&titles=Louis+de+Luxembourg%2c+Count+of+Saint-Pol%7CLouis-Alexandre+de+Bourbon%2c+comte+de+Toulouse%7CLouis-S%c3%a9bastien+Lenormand%7CLoup+River+%28France%29%7CLourdes+Medical+Bureau
which does a bunch of articles at the same time (five in this case). The bot should be faster as a result, but in the last several days since it's been running I have not noticed great improvements. Well, at least it does not get slower. :) Oleg Alexandrov (talk) 02:11, 3 May 2007 (UTC)
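For readers who want to see how such a batched query string is put together, here is a small Python sketch. It is not the bot's own code (the bot is not written in Python); it only builds URLs in the shape quoted above and does not fetch anything. The helper names and the batch size parameter are assumptions for illustration.

```python
from urllib.parse import urlencode

# Endpoint and parameters exactly as in the query quoted above.
QUERY_URL = "http://en.wikipedia.org/w/query.php"


def batches(titles, size=5):
    """Yield successive groups of at most `size` titles."""
    for i in range(0, len(titles), size):
        yield titles[i:i + size]


def revision_query_url(titles):
    """Build one URL asking for the latest revision of several titles."""
    params = {
        "format": "txt",
        "what": "revisions",
        "rvlimit": 1,
        # Titles are joined with "|"; urlencode escapes it as %7C and
        # turns spaces into "+", as in the example above.
        "titles": "|".join(titles),
    }
    return QUERY_URL + "?" + urlencode(params)


titles = [
    "Louis de Luxembourg, Count of Saint-Pol",
    "Louis-Alexandre de Bourbon, comte de Toulouse",
    "Louis-Sébastien Lenormand",
    "Loup River (France)",
    "Lourdes Medical Bureau",
    "Louvre",
]
for group in batches(titles):
    print(revision_query_url(group))
```

Batching cuts the number of HTTP requests by roughly the batch size, although, as noted above, that alone may not make the run noticeably faster.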
Proposal to make the bot faster
The bot is taking around three days to do the update nowadays, which is not good. I have a proposal. If we remove the "last updated" tag and the date at the bottom of subpages (see here for an example of what I mean), then the bot won't need to update subpages on which no changes happen except the datestamp. The main indices for each subject would still get their datestamp (like the index Wikipedia:Version 1.0 Editorial Team/Aircraft articles by quality of the above subpage). Would people agree with this? Oleg Alexandrov (talk) 18:31, 19 May 2007 (UTC)
- Well, it's a pain, but I think it's worth it for faster updates. The only exception I'd like to see is that entirely new pages should still be done (ie. if they've never been done before). -- TimNelson 00:54, 21 May 2007 (UTC)
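The effect of the proposal, together with TimNelson's exception for brand-new pages, can be sketched in Python as below. This is a variant rather than the proposal itself: instead of removing the datestamp, it ignores the datestamp line when deciding whether a subpage needs saving, which skips the same edits. The "Last updated:" line format is a guess, not the bot's real output.

```python
import re

# Hypothetical datestamp line; the real subpages may word it differently.
DATESTAMP = re.compile(r"^Last updated:.*$", re.MULTILINE)


def strip_datestamp(text):
    """Return the page text with any datestamp line removed."""
    return DATESTAMP.sub("", text).strip()


def needs_save(old_text, new_text):
    """Decide whether the bot should write new_text to the wiki.

    old_text is None when the page does not exist yet; such pages are
    always saved. Otherwise a save is needed only if something other
    than the datestamp has changed.
    """
    if old_text is None:
        return True
    return strip_datestamp(old_text) != strip_datestamp(new_text)


old = "Article A: Start-Class\nLast updated: 19 May 2007"
new = "Article A: Start-Class\nLast updated: 20 May 2007"
print(needs_save(old, new))   # only the datestamp differs
print(needs_save(None, new))  # page never created before
```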
Bot missed the end of the alphabet AGAIN
On 30 May, the bot didn't make it to the zebras again. Something should be done about the alligators and jackals always getting an update and the sloths and zebras missing out all too often. Rlevse 12:58, 31 May 2007 (UTC)
Articles not in categories
I just added the importance scale to a new task force I helped create. But, no articles are being added to the categories. I know this isn't an issue with the bot but I was wondering if anybody else has run into this issue and what you did to resolve it. I was able to resolve one article by simply removing the rating then re-adding it but that is not a solution for hundreds of articles. For instance, Talk:Tulsa Zoo is properly tagged and the correct category (Category:Mid-importance Tulsa articles) is at the bottom. But if you go to that cat, there is nothing in it. Any ideas?↔NMajdan•talk 14:21, 31 May 2007 (UTC)
- Well, they're there now. Must've been a huge backlog in Wikipedia's queue.↔NMajdan•talk 15:58, 31 May 2007 (UTC)
WP 1.0 bot stopped for now due to a problem
I stopped the bot because there is something wrong with the query which finds articles in a given category. For example, consider the large Category:Stub-Class mathematics articles. To find the articles in there, one has to do several consecutive queries, each giving 200 articles. The following query
works, but if you replace "Cl" at the end by "Cm", so instead of giving the articles starting from "Cl" on, give the articles starting from "Cm" on,
the query gives an error. I contacted Yurik about this. Any ideas on what is going on? Oleg Alexandrov (talk) 02:46, 6 June 2007 (UTC)
- I spent some time looking at the code, but I'm not experienced enough to make any further progress without being able to see the data that query.php gets from the SQL server. I also let Yurik know about it, and there is a note at User_talk:Yurik/Query_API#Categories (copied from VPT) about the issue. — Carl (CBM · talk) 12:35, 6 June 2007 (UTC)
Query.php is now working correctly on the math-related categories. I don't know what was changed to make it work. — Carl (CBM · talk) 01:39, 10 June 2007 (UTC)
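The "several consecutive queries, each giving 200 articles" pattern described at the top of this section can be sketched generically in Python. This is not the bot's code and it does not call query.php; `fetch_page` is a hypothetical callback standing in for the real request, and the fake implementation below exists only so the loop can be run.

```python
def list_category(fetch_page, category, page_size=200):
    """Collect every title in a category, one page of results at a time.

    fetch_page(category, start, limit) is a hypothetical callback that
    returns (titles, next_start); next_start is None on the last page.
    """
    titles = []
    start = None
    while True:
        page, next_start = fetch_page(category, start, page_size)
        titles.extend(page)
        if next_start is None:
            return titles
        if next_start == start:
            # Guard against a server that keeps returning the same page.
            raise RuntimeError("no progress at start key %r" % (start,))
        start = next_start


def make_fake_fetch(all_titles):
    """Build an in-memory stand-in for the real query, for demonstration."""
    ordered = sorted(all_titles)

    def fetch_page(category, start, limit):
        remaining = [t for t in ordered if start is None or t >= start]
        page = remaining[:limit]
        next_start = remaining[limit] if len(remaining) > limit else None
        return page, next_start

    return fetch_page


sample = ["Article %03d" % i for i in range(450)]
result = list_category(make_fake_fetch(sample), "Stub-Class mathematics articles")
print(len(result))
```

A failure like the one reported here, where one particular start key produces an error, would surface as an exception from `fetch_page` partway through the loop, which is why a whole category can be left unprocessed.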