Wikipedia:Bots/Requests for approval

This is an old revision of this page, as edited by Madman (talk | contribs) at 02:01, 23 June 2007 (STBot 8 in trial period). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

New to bots on Wikipedia? Read these primers!

To run a bot on the English Wikipedia, you must first get it approved. Follow the instructions below to add a request. If you are not familiar with programming, consider asking someone else to run a bot for you.

 Instructions for bot operators

Current requests for approval

Requests to add a task to an already-approved bot

Operator: Quadell

Automatic or Manually Assisted: Automatic, supervised

Programming Language(s): Perl (with Perlwikipedia)

Function Summary: Add {{WPBiography}} where there's persondata; then add {{DEFAULTSORT}} where it's obvious, and collect summary info where it's not.

Edit period(s): one time run, or in small batches

Edit rate requested: 6 edits per minute

Already has a bot flag (Y/N): Yes

Function Details: The bot will look through every article that transcludes the WPBiography template (nearly 400,000 articles so far) looking for a sortname (such as "Smith, John") and living/dead info. This might be found within the WPBiography template itself, in persondata, in categories, in a DEFAULTSORT tag -- or in multiple locations which may or may not agree. The bot will log that info (the various listed sortnames and living/dead info for each article) in a text file on my server. While I'm there, in certain limited circumstances, it will standardize this info in the article itself. See User:Polbot/ideas/defaultsort#Detailed specification for more details than you could shake a stick at.

The logfile that is created by this process will (it is hoped) be perused by volunteer humans to pick a proper sortname from among the choices, and a later bot (not requested for authorization at this time) will fix or add DEFAULTSORT info in the articles.

Discussion

Looks good. I'm supporting it. ~ Wikihermit 23:16, 15 June 2007 (UTC)[reply]

Same here, looks like a resourceful task. Good luck. E talk 23:17, 15 June 2007 (UTC)[reply]
Can you provide a link to the discussion? —METS501 (talk) 19:30, 16 June 2007 (UTC)[reply]
    • Yes, sorry, here and here, mostly. The main thing is, right now metadata is handled haphazardly in several ways. If there were a standardized and simplified way of handling metadata, it would definitely be an improvement; but then this bot's work would be overwritten, basically. So I'm waiting to see if it would be better to use the proposed {{UsePersondata}} tag or something similar. – Quadell (talk) (random) 19:56, 16 June 2007 (UTC)[reply]
      • Personally, I feel that we shouldn't wait for the metadata discussions to be concluded. Those discussions will probably turn out to be very long and involved, if only because it will change the way many editors are familiar with handling such metadata, and they will probably object. In any case, it should be possible to carry out edits with this bot and then migrate the data later with no loss of information. Consider it an interim improvement. By the way, where is the discussion about this proposed "UsePersondata" tag? Carcharoth 16:42, 17 June 2007 (UTC)[reply]

A couple edits of what this bot would do would be nice. Consider it a 2 edit trial -- Tawker 19:48, 16 June 2007 (UTC)[reply]

Another point, one of the pressing questions I was hoping this bot would answer, is one of scale. Currently, around 10,000 articles use Persondata, and nearly 400,000 articles use the WPBiography template. What I don't know, and would like to see answered, is how many of these articles use the DEFAULTSORT magic word, and how many use the WPBiography 'listas' parameter (or rather, how many don't use these sort keys), and how many still use individual category pipe-sorting instead? It is entirely possible that the vast majority of the nearly 400,000 biographical articles (say 300,000) lack any sort key whatsoever. If so, only the 100,000 using them would need to be standardised. The other 300,000 would be turfed over to a human project to decide what the appropriate sort key is. Finding out how many articles use the 'listas' parameter would require an edit to the template along the lines of that done in the past at Template talk:Infobox Writer#Magnum Opus, but detecting which biographical articles have category pipes and/or DEFAULTSORT entries is more difficult. I'm going to ask a developer if there is a quick way to do that. Carcharoth 16:41, 17 June 2007 (UTC)[reply]

Update. The discussions are either inconclusive or ongoing. See mw:User talk:Robchurch#DEFAULTSORT and sort keys in general and mw:Talk:API#DEFAULTSORT key, in case anyone with more computing know-how than me (not difficult) gets a bright idea from them. Carcharoth 21:54, 18 June 2007 (UTC)[reply]
  • It appears that it would not be prudent for me to wait on metadata standardization before running this bot. Feel free to approve it for a trial run, if you deem it acceptable. – Quadell (talk) (random) 22:10, 18 June 2007 (UTC)[reply]
  • Looks good. I have a couple of questions. 1) Is the code written and waiting to go? (If it is, I see no reason why we shouldn't have a small trial run; it's always easier to evaluate a trial run than it is to evaluate mere words). 2) Is there not a danger that this will result in mis-sorting? My thinking is that in some categories, the DEFAULTSORT won't apply. Imaginary example: George W. Bush in Category:Presidents would sort as "Bush, George W." In Category:Bush family he would sort as "George W." --kingboyk 21:11, 20 June 2007 (UTC)[reply]
    • Answer #1: The code for reading the pages and making a logfile is written and waiting. I'm still. . . polishing. . . the function that actually changes the pages' wikicode. If I get the green light for a test, I'll have it done within 12 hours. Yes, the code is done and ready to run. Answer #2: This is a very cautious bot. The only situation where it would write a DEFAULTSORT is when there is a "listas" parameter in the WPBiography template, and also when every other sortname (category pipes and Persondata name) give the exact same sortname as the listas. So if one category pipe gives a different sort, then the DEFAULTSORT will not be written. (It will be logged, though, for an editor to look at.) – Quadell (talk) (random) 22:16, 20 June 2007 (UTC)[reply]
    • Another answer to #2: Look at Jeb Bush. That already has DEFAULTSORT and is incorrectly sorted in Category:Bush family. But if you add a pipe-sort to give [[Category:Bush family|Jeb]], that will over-ride the DEFAULTSORT. This is why templates that pipe-sort their categories using [[Category:Random|{{PAGENAME}}]] (so as to avoid talk and user pages being grouped separately) over-ride the DEFAULTSORT magic sort. Incidentally, many of the articles in Category:Bush family are incorrectly pipe-sorted, but the category is unwieldy at the moment anyway, and fails to help people navigate to the area of the Bush family they might be interested in. As Quadell says, the idea is to standardise pages where the sort keys are the same. Where they are different (or absent), humans need to do some checking. Ultimately, only having one sort key in one location would also be a great boon (updating the same information in two locations is silly and inefficient), but that is now part of the bigger metadata debate (which will take a long time to resolve, hence Quadell deciding to go ahead with this for now). Carcharoth 11:43, 21 June 2007 (UTC)[reply]
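
The cautious rule Quadell describes in Answer #2 can be sketched roughly as follows. This is a minimal illustration in Python, not Polbot's actual Perl code: the function name `choose_defaultsort` and its parameters are hypothetical, and the handling of pages that have no other sort keys at all is an assumption, since the discussion does not spell that case out.

```python
from typing import Iterable, Optional


def choose_defaultsort(listas: Optional[str],
                       category_pipes: Iterable[str] = (),
                       persondata_name: Optional[str] = None) -> Optional[str]:
    """Return the sort key to write as DEFAULTSORT, or None to only log the page.

    Hypothetical helper: the key is written only when a 'listas' value exists
    and every other sort key found on the page is exactly equal to it.
    """
    if not listas:
        # No listas parameter in WPBiography: never write, leave it to humans.
        return None

    other_keys = list(category_pipes)
    if persondata_name is not None:
        other_keys.append(persondata_name)

    # Assumption: with no other keys present, listas alone is accepted.
    if all(key == listas for key in other_keys):
        return listas

    # At least one key disagrees (e.g. a "Jeb" pipe in a family category).
    return None


# All keys agree, so the key would be written.
print(choose_defaultsort("Swailes, Donovan", ["Swailes, Donovan"], "Swailes, Donovan"))
# One category pipe differs, so nothing is written and the page is only logged.
print(choose_defaultsort("Bush, Jeb", ["Bush, Jeb", "Jeb"]))
```

The comparison is deliberately an exact string match, mirroring the "exact same sortname" wording above; any normalisation of case or whitespace would be a further design decision that the discussion does not cover.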

Also, people wanted to see 2 or 3 examples of what the code would do. Here it is: [1], [2], and [3]. – Quadell (talk) (random) 02:26, 21 June 2007 (UTC)[reply]

And if you dig through the history of the articles and the talk pages, you can see where people added some parameters but didn't add them all. eg. Category:Living people, but not "living=yes" and "listas" or category pipes, but not DEFAULTSORT. In the Donovan Swailes case, the category piping was present from when the article was created (26 May 2005), the WPBiography template was added on 20 September 2006, the listas parameter added on 18 May 2007, the living=no on 7 June 2007, and the DEFAULTSORT on 21 June 2007. This means that future categories can be added without the need for pipe-sorting. Carcharoth 11:53, 21 June 2007 (UTC)[reply]

It seems there's unanimous support for this. Can I run it? – Quadell (talk) (random) 15:47, 26 June 2007 (UTC)[reply]

Also note the ballpark figure over at User:Polbot/ideas/defaultsort#Further discussion, where it is estimated that around 20,000 articles already contain DEFAULTSORT. So the majority of the bot's work may be just gathering other sortkey data from the other 300,000+ articles. Plus the category stuff as well - not sure what the scale of that part of the bot's work is. Carcharoth 16:54, 26 June 2007 (UTC)[reply]

Approved for trial. Please provide a link to the relevant contributions and/or diffs when the trial is complete. Yeah, go ahead with 50 edits or so. Sorry for the delay. --ST47Talk 14:26, 9 July 2007 (UTC)[reply]

Trial run

OK, if Quadell doesn't mind, I'll come up with a list of 50 articles that include all the plausible combinations of different actions (selecting 50 at random from 350,000 might not really work here), and what I, as a human would do with those articles. If the bot does the right actions, I think we can safely say it passes the Turing test, let alone knows how to do this task. :-) Carcharoth 14:39, 9 July 2007 (UTC)[reply]

That sounds great! Thanks. – Quadell (talk) (random) 14:50, 9 July 2007 (UTC)[reply]
I've made a start at User:Carcharoth/Polbot3 trial run, which were really picked at random from the first 5000 on what links here for WPBiography. I've ensured a scattering of special characters, pages with parentheses, commas, ordinal numbers, funny names, and so on. I even included some that are false positives (they aren't really biographies). Not a comprehensive trial, and I'm now slowly picking my way through them and realising that a random pick tends to find more dead people than living ones... I may need to pop over to Category:Living people for a more representative sample. Carcharoth 15:59, 9 July 2007 (UTC)[reply]
  • I've finished making notes (offline) on what I think should happen when the bot runs over the list of 31 at User:Carcharoth/Polbot3 trial run. Quadell, do you want to see those notes before or after you run the bot over the list? :-) I've included Alan Turing in the list, along with a little surprise (don't look in the edit history, otherwise it will spoil the surprise!). When it is done, could you post a link so the edits and edit summaries can be looked at. Do you know how to set up the URL to link to a set of edits? eg. this is a set of the 5 edits I did leading up to 01:39:18 on 2007/07/09: see here. Depending how many edits the bot makes, you'd set the limit and offset accordingly for Polbot's contributions. Carcharoth 19:59, 9 July 2007 (UTC)[reply]
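
For readers unfamiliar with the limit/offset link Carcharoth mentions, here is a small sketch of how such a URL might be assembled. It assumes the standard `Special:Contributions` query parameters `target`, `limit` and `offset`, with `offset` given as a `YYYYMMDDHHMMSS` timestamp; the helper name `contributions_url` is hypothetical, and the user name in the example is only a placeholder.

```python
from datetime import datetime
from urllib.parse import urlencode


def contributions_url(user: str, limit: int, up_to: datetime) -> str:
    """Build a link to `limit` contributions by `user` leading up to `up_to`.

    Hypothetical helper; assumes the wiki accepts `offset` as a
    YYYYMMDDHHMMSS timestamp on Special:Contributions.
    """
    query = urlencode(
        {
            "title": "Special:Contributions",
            "target": user,
            "limit": limit,
            "offset": up_to.strftime("%Y%m%d%H%M%S"),
        },
        safe=":",  # keep the colon in "Special:Contributions" readable
    )
    return "https://en.wikipedia.org/w/index.php?" + query


# Five edits leading up to 01:39:18 on 9 July 2007, as in the example above.
print(contributions_url("Polbot", 5, datetime(2007, 7, 9, 1, 39, 18)))
```

For a trial of, say, 31 edits, the limit would be set to 31 and the timestamp to the moment the run ended.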
I have run the trial. See all the gooey details at User talk:Carcharoth/Polbot3 trial run. – Quadell (talk) (random) 15:53, 10 July 2007 (UTC)[reply]
Request Expired. Marking request as expired as it was either never transcluded to Wikipedia:BRFA or has had no attention for some time. Richard0612 12:21, 19 February 2009 (UTC)[reply]
The above discussion is preserved as an archive of the debate. Please do not modify it. Subsequent comments should be made in a new section.

The following discussion is an archived debate. Please do not modify it. Subsequent comments should be made in a new section.
The result of the discussion was Speedily Approved.

(Formerly WOPR)

Operator: White Cat

Automatic or Manually Assisted: Supervised & automatic. I will review the former category and new one (as well as the change in RC feed) but probably won't review every individual diff.

Programming Language(s): AWB

Function Summary: Recategorization (based on discussions on medians such as Wikipedia:CFD)

Edit period(s) (e.g. Continuous, daily, one time run): Assisted but automated runs. I will manually input the task (recategorization) and monitor. Each task is to be run only once.

Edit rate requested: 500 edits per hour (really depends on the task)

Already has a bot flag (Y/N): N

Function Details: Find and replace task. Uses of "Category:X" will be replaced with "Category:Y" based on consensus.

Discussion

Um, I'm not entirely clear on what you intend to monitor the RC feed for and, also if you don't mind, could you please explain what you mean by 'based on discussions on medians such as WP:CFD'? -- Seed 2.0 18:22, 11 June 2007 (UTC)[reply]

Say a CFD discussion concludes as a "rename category", the bot would do just that. Or if the discussion concluded with a "delete", the bot would remove the categories. I can use the same function for TFDs - basically any deletion/discussion that may require such a service. The bot would run based on community consensus only.
I will be monitoring the RC feed to check on the bot. I wouldn't be using the bot to monitor the RC feed. I monitor the RC feed to check for possible problems. If, for instance, the bot removes 200 bytes, that's proof of a breakdown (happens rarely with AWB, I believe). I should see the same byte difference during the run of all bot edits. It's just me being extra careful.
-- Cat chi? 19:04, 11 June 2007 (UTC)

I'd just like to clarify, since category deletion wasn't mentioned previously. I assume, since you don't have the sysop bit, that if consensus is to "delete" that your bot will only be removing the category from all pages in the article namespace? What other namespaces may be modified? — Madman bum and angel (talkdesk) 17:30, 12 June 2007 (UTC)[reply]

All namespaces could be modified, most likely, main, template, and user. --ST47Talk 18:13, 12 June 2007 (UTC)[reply]
The bot can handle any namespace except mediawiki namespace. -- Cat chi? 18:16, 12 June 2007 (UTC)
WhiteCat, I presume this bot is to be tasked with Wikipedia:CFD/W work. You might want to check and see if the bot is even needed; there are quite a few that do this task. In addition, user cats tend to be a bit hard for bots to actually do, as some of them are hidden behind noinclude tags in included templates. This makes the category show up on every page that the template is on (say a userbox), but not place the userbox itself into the category, which effectively makes finding the userbox difficult. I have written a tool that does a bit of recursion to identify the userbox; unfortunately it's Windows-only, so if this is an important task, please let me know and I'll write up a ditty that can be run on Linux. I have run this style of bot task before. —— Eagle101Need help? 00:34, 16 June 2007 (UTC)[reply]
Oh I use windows. I use AWB's find and replace function to re-categorize. It works fine IMHO. I can handle it :) -- Cat chi? 00:22, 25 June 2007 (UTC)
Well, I hope you don't mind or take this the wrong way but since that's a bit unconventional (I assume you write your own regexps to handle pipes, etc.), I just have to ask: what experience do you have working with categories? --S up? 00:30, 25 June 2007 (UTC)[reply]
I used AWB to recategorise speedy entries under this account. I do not use regexps (not a whole lot of experience with it) but instead the normal button for "find and replace". Replacing [[Category:Foooo1 with [[Category:Foooo2 is rather trivial for me. It works fine. I monitor every edit the bot makes just in case something doesn't work right. -- Cat chi? 00:35, 25 June 2007 (UTC)
What about links to the category, e.g. Category:WikiProject banners? I'd suggest adding a :? in there. — Madman bum and angel (talkdesk) 17:04, 26 June 2007 (UTC)[reply]
There are those specific exceptions where the code would need to be adjusted. This will be done when necessary :) -- Cat chi? 17:06, 26 June 2007 (UTC)
Why not just add a :? instead? A link to a category can be easy for you to miss. — Madman bum and angel (talkdesk) 17:33, 27 June 2007 (UTC)[reply]
What links to a category and the pages categorised are treated differently in AWB. That's why :) -- Cat chi? 17:58, 27 June 2007 (UTC)
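
To illustrate Madman's `:?` suggestion, the following is a rough regular-expression sketch in Python. It is not what AWB does internally, and White Cat states above that the bot uses plain find and replace rather than regexps; the function name `recategorize` is hypothetical. The optional leading colon lets one pattern cover both category tags such as `[[Category:Foooo1]]` and links such as `[[:Category:Foooo1]]`.

```python
import re


def recategorize(wikitext: str, old: str, new: str) -> str:
    """Replace Category:<old> with Category:<new> in tags and in links.

    Hypothetical sketch: matches an optional leading colon (the ":?" idea)
    and requires the name to end at a pipe or closing brackets, so that
    "Foooo1" does not also match "Foooo10". Case and underscore variants
    of the category name are not handled here.
    """
    pattern = re.compile(
        r"\[\[(:?)Category:" + re.escape(old) + r"(?=\s*(?:\||\]\]))"
    )
    # A function replacement avoids backslash surprises in the new name.
    return pattern.sub(lambda m: "[[" + m.group(1) + "Category:" + new, wikitext)


text = "[[Category:Foooo1|Jeb]] and a link to [[:Category:Foooo1]]"
print(recategorize(text, "Foooo1", "Foooo2"))
```

Any sort key after the pipe is left untouched, so existing pipe-sorting survives the rename.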

Speedily Approved. This is already done by a bunch of bots and uses AWB which makes almost no mistakes. —METS501 (talk) 05:48, 30 June 2007 (UTC)[reply]

The above discussion is preserved as an archive of the debate. Please do not modify it. Subsequent comments should be made in a new section.


Bots in a trial period


Approved requests

Bots that have been approved for operations after a successful BRFA will be listed here for informational purposes. No other approval action is required for these bots. Recently approved requests can be found here (edit), while old requests can be found in the archives.


Unapproved requests

Bots that have been denied for operations will be listed here for informational purposes for at least 7 days before being archived. No other action is required for these bots. Older requests can be found in the Archive.

Expired/withdrawn Requests

These requests have expired, as information required by the operator was not provided. These bots are not authorized to run, but such lack of authorization does not follow from a finding as to merit. A bot that, having been approved for testing, was not tested by an editor, or one for which the results of testing were not posted, for example, would appear here. Bot requests should not be placed here if there is an active discussion ongoing above. Operators whose requests have expired may reactivate their requests at any time. The following list shows recent requests (if any) that have expired, listed here for informational purposes for at least 7 days before being archived. Older requests can be found in the Archive.