Discussion Page

HomePage | RecentChanges | Preferences

The Encyclopaedia has now been locked; contributors must log in to make changes. [more]


Kevan: Hm, the [Wikipedia guidelines on pluralisation] are perhaps relevant, in terms of pluralising titles of pages which "distinguish between multiple distinct instances of related items". I think "Ghost Station" should be singular if it just tells you what a Ghost Station is, but plural if it goes on to list all of the Underground's Ghost Stations. (Although maybe it'd be more useful to split the pages into "Ghost Station" and "List of Ghost Stations".)

So Venbacker Numbers should be singular, but the definition should be rewritten so that it's actually a page about what a Venbacker Number is. Which I've just done. And I've deleted the orphan.

Dunx: Following up on the pluralisation discussion above, I've added a few new entries in the index which have a gerundive form. I've used this for terms such as "boxing" and "cornering" specifically because it distinguishes these actions from the simple nouns.

Index Regeneration

Dunx: I've noticed a few entries in the "Recent Changes" page which haven't made it into the A to Z page. Is index regeneration something which should be automated? Or is this something we should just remember to fix by hand?

(incidentally, hurrah for Kevan for setting this up. Good job, sir)

rab: Also if you make a reference to a non-existent entry in another entry it might be helpful to have a way to track these down so the relevant entries can be researched and compiled.

Kevan: Mm, updating the A-Z needs to be done by hand, although there is a Wiki function that displays [an alphabetised list of all existing pages], which - I presume - shouldn't be too hard to improve the display of, by mucking around with the source code. Although this does list every page in the Wiki, which may or may not be a good thing.

You can also get a list of all pages that link to a page, by clicking on the page's title (eg. [pages that link to 'Token']) - one thing that I've seen on [other Wikis] is for every page to link to the "category" that it belongs in, so that the "backlinks to this category" page serves as an index. (Which would allow us to have an Encyclopaedia category, a Famous Player category as a subset, an Online Player category as a separate group, and so on.) Provided that people remember to add the category links, this would be an easy way to keep things ordered.

As for non-existent entries, the Wiki function [action=links] displays a list of all entries and their outlinks, with unwritten ones clearly visible, if you want to see what we're missing. (I should probably link to the [documentation for this Wiki software], somewhere.) Such known unknowns don't seem entirely unreasonable, though, for an Encyclopaedia Morningtonia.

Dunx: I'd probably go for the categorisation; that would certainly be my preferred coding solution.

Kevan: Yeah, me too, I think, with some script-level tweaking of the backlink displays. I'll see whether I can update all the existing entries automatically...

Kevan: Hm, I've gotten around to giving the [full Wiki index] page some better formatting (basically just removing the search count, and putting letters at the start of each section); this is the same code that's used to generate [back-links for Player Profiles], and now that I've done it, I'm more enthused by the idea of properly categorising all entries (so that we can autogenerate an alphabetical list of Game Terms, and Famous Players, as well as having a single, exhaustive, fully-automatic A-Z index).
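For illustration, the formatting described above (an alphabetical list with a letter at the start of each section) can be sketched in a few lines. This is Python rather than the Wiki's own Perl, and the function name and sample titles are assumptions made for the example, not anything taken from the Wiki source.

```python
from itertools import groupby


def build_index(titles):
    """Return index lines: a letter heading, then the titles filed under it.

    Assumes every title is a non-empty string.
    """
    ordered = sorted(titles, key=str.casefold)  # case-insensitive A to Z
    lines = []
    for letter, group in groupby(ordered, key=lambda title: title[0].upper()):
        lines.append(letter)                      # section heading
        lines.extend("  " + title for title in group)
    return lines


if __name__ == "__main__":
    for line in build_index(["Token", "boxing", "Baker Street", "cornering"]):
        print(line)
```

Feeding the same function only the backlinks of one category page would give a per-category list in the same layout.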

Will also look into auto-appending category keywords to the existing pages, when I've got another moment. Stop me if you've heard this one before.


Dunx: another question - how do you log in on another computer? On the Preferences screen?

Kevan: Yeah, it's just a cookie thing. I don't know whether it stores the password from your first login and demands it in future, or not.

Player Profiles

Kevan: ... Yes, I'd been meaning to do that. I've taken the opportunity to try out entry-based auto-indexing - by adding a simple link to "Player Profiles" at the bottom of all player profile pages, we can link to [this backlinks page] to autogenerate a list of all profiles. It's a bit ugly (and includes itself and the front page), but I should be able to muck around with the Wiki source to change that, eventually. (So we can do the same thing for Famous Players, MC Servers, Game Terminology, and the rest, eventually, as I said further up, somewhere.)

Dunx: Good job.

Kevan: Thanks. I'll sort out something to automatically append category links to previously-written pages, en masse, over the next few days.

What's Next?

Dunx: there are a few holes left in the index (some of them my own making, I'll grant) but what's next? I was going to mine the WMCC97 games first.

Si: The IMCS entry seems to be quite a large hole in the encyclopaedia. I think such an enormous undertaking as writing an entry for IMCS would require some sort of consensus. I'm quite happy to write some sort of draft. Would it be better to post it here for editing, rather than submitting it directly to the encyclopaedia, as there might be some polarised views on how it should be expressed...

Dunx: ... or we could just point the interested reader at imcs.org. Why rewrite what the organisation itself has already written?

Si: Having said that, you don't catch the Encyclopaedia Britannica telling its readers to "look elsewhere". A small point, maybe, but a fair one.

Dunx: good point, well made.

Spam resistance

Dan: I noticed last night (2004/10/31) that the link pixies had been in and been very generous with the Chinese-charset nuggets of clickable goodness. It might be nice for you if you could safely wander off for a few days and be reasonably sure that hordes of old men hadn't been crawling in the windows and throwing garbage all over the place. Is there a spam-proofing patch for UseModWiki? Something like "display a graphic piece of text on the edit page and an md5sum of the text of that graphic in a hidden field. Require the user to enter the text on the graphic into a text field. If on POST, an md5sum of that text doesn't match the contents of the hidden field, apologize in six languages and refuse the post."

Admittedly this is not a trivial thing to do: you have to cache the image somewhere so it can be served in a separate request (saving it to a temp folder that gets periodically cleaned and giving it a name based on the md5sum would do it) so another option would be to simply display the code word in cleartext on the security-through-obscurity principle that spambots won't have been coded for it.
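A minimal sketch of the check described above, in Python purely for illustration (UseModWiki itself is a Perl script, and nothing here comes from its source). It covers only the hashing step: rendering the word as an image, or simply printing it in cleartext, is left out. The function names and the six-character code word are assumptions.

```python
import hashlib
import secrets


def make_challenge():
    """Return (word, digest).

    The word is shown to the person editing; the md5 hex digest of the
    word goes into a hidden field on the edit form.
    """
    word = secrets.token_hex(3)  # six random hex characters (assumed length)
    digest = hashlib.md5(word.encode("utf-8")).hexdigest()
    return word, digest


def check_response(typed, hidden_digest):
    """True if the md5 of what was typed matches the hidden field."""
    typed_digest = hashlib.md5(typed.strip().encode("utf-8")).hexdigest()
    return secrets.compare_digest(typed_digest, hidden_digest)


if __name__ == "__main__":
    word, digest = make_challenge()
    print("code word:", word)
    print("accepted:", check_response(word, digest))
```

Because the hidden field travels with the form, a replayed word and digest pair would still pass, so this only aims at the bots that have not been coded for it.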

Kevan: Can't seem to see any specific patches, although there is [some discussion] of possible countermeasures. I think it depends whether the majority of spam is coming from enslaved human workers, or from fire-and-forget robots - there's not really much we can do if human spammers are visiting to manually ruin half a dozen pages, every so often, because it's (ahem) not easily distinguishable from proper user behaviour.

An elegant solution would just be to react angrily to people trying to post URLs, but I'm not sure there's any useful middle-ground there - if we ban all URLs, then I can't link to the WikiSpam discussion I just linked to; if we redirect them or otherwise obscure them, spammers will be denied their precious PageRank, but we'll still have to wash the graffiti off (given that even if they realise that their links won't be shown as links, after submitting them, they're unlikely to apologise and tidy up on their way out).

Maybe it's worth digging out some more stats on this. If spammers are mostly robots, then it's possible that they're just firing out pre-filled forms at all the Wikis they can find on Google, and that renaming a few form fields (or adding a hidden one which gets checked upon reception) would be enough to shut them up.
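As a rough sketch of that idea (Python for illustration; the field names and token value are invented for the example, not taken from the Wiki's edit form): a bot replaying an old, pre-filled form will not carry the renamed field or the hidden one, so the submission can be refused on arrival.

```python
# All three names below are hypothetical, chosen only for this example.
TEXT_FIELD = "entry_body"   # assumed new name for the edit box
TOKEN_FIELD = "check"       # assumed hidden field added to the form
TOKEN_VALUE = "crescent"    # assumed value written into that hidden field


def accept_submission(form):
    """Accept only forms that use the renamed field and echo the hidden token."""
    if TEXT_FIELD not in form:
        return False  # form was built against the old field names
    return form.get(TOKEN_FIELD) == TOKEN_VALUE


if __name__ == "__main__":
    print(accept_submission({"entry_body": "A new entry", "check": "crescent"}))
    print(accept_submission({"text": "buy things"}))
```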

I get the feeling that it's mostly enslaved human workers in tanks of green goo, though, and that they'll just punch through [Captcha tests] as being a dull part of their job.

Dan: It sounds as if you're up to speed with the state of the art on this subject. As is always the case with these arms races, it's almost innately inconclusive. Personally I'm a bit of a fan of the "surprise" development, where you do something that may not be unbeatable but isn't vulnerable to bots or the presumably dimwitted spam-lackey because you're the only one doing it. It'll work for a while but if it's any good, people will copy you and then the counter-countermeasure will shortly crop up in the wild. Still, this is a problem I have to solve as well for some of my own stuff. Oh, and I suppose I could have helped out in the above instance by actually reverting the affected pages; I think I was caught up in something at the time. I'll be sure to do my part if there's a next time.

Kevan: I've just noticed 500-odd v1agra spams that have been seeping into my horrible, hand-written blog comments system, which was presumably a spam-human briefly analysing my comments code to get a form structure, and feeding it into an enthusiastic wood-chipper. So I've added a few simple "Bayesian" countermeasures to that, to automatically reject anything with certain words or word-fragments or more than a few URLs in it - even if a clever human spammer wastes some time working out what I'm filtering for, they won't be able to put any useful spam in.

So yeah, I'll do something similar for this Wiki, when I've had a chance to pull apart its innards. I imagine that politely denying any changes which attempt to add more than a couple of URL links in a single edit would do the trick.
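A minimal sketch of that kind of filter, in Python for illustration only. The limit of two new URLs and the single blocked fragment are assumptions drawn from the wording above ("more than a couple", "v1agra"), and the function is not the filter that was eventually installed.

```python
import re

URL_PATTERN = re.compile(r"https?://", re.IGNORECASE)
BLOCKED_FRAGMENTS = ("v1agra",)  # illustrative list, not the real one
MAX_NEW_URLS = 2                 # assumed reading of "a couple"


def is_suspicious(old_text, new_text):
    """True if an edit adds too many URLs or contains a blocked fragment."""
    added = len(URL_PATTERN.findall(new_text)) - len(URL_PATTERN.findall(old_text))
    if added > MAX_NEW_URLS:
        return True
    lowered = new_text.lower()
    return any(fragment in lowered for fragment in BLOCKED_FRAGMENTS)


if __name__ == "__main__":
    print(is_suspicious("", "See http://a.example and http://b.example"))
    print(is_suspicious("", "http://a.example http://b.example http://c.example"))
```

Counting only the URLs an edit adds, rather than all URLs on the page, means an existing link-heavy page can still be edited. The word check looks at the whole new text, so a page that merely discusses spam would be rejected as well.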

Kevan: One other thing - anyone who has an admin password should feel free to use the [IP-banning superpower] (which I've only just noticed), if things get stupid.

Dunx: Is it worth restricting editing privileges to registered users? Not very friendly for casual contributors of course, but I think there are fewer casual Wiki contributors than casual MC players.

Another option for URL rejection might be to change the URL markup, and reject any post that uses the standard syntax. V.bad for usability of course.

As for spambots, I am pretty sure that most of the blogspam I get is bot-driven, given that I often get the exact same comment being applied to multiple posts from wildly different IP addresses. And they are certainly up to doing screen scraping for hidden/renamed fields, since I've done all that in my MT installation and I'm still getting a trickle of it. Should upgrade to MT v3, but really don't want to.

Kevan: Yeah, maybe restricted editing is the way forward. I've just locked [another Wiki] because (by its nature) it only had a few regular contributors; it's only loosely locked, though, with an unambiguous clue to the password being given in a help page - enough to confound a robotic spammer, and presumably enough to bore a paid-by-weight human spammer who hasn't got time to browse the site to find out what insect species the Jatok are. Would be easy to add a similar colour-supplement-quiz-question for the Morningtonia.

It seems a great shame to do that here, though - that part of the strength of the Wiki is that any random visitor (whether a stranger, or a veteran in a hurry) can contribute quickly and easily and move on. I'll code in some simple but friendly filtering, first, when I've got a moment, and see whether that stops anything.

Kevan: Simple and friendly filtering is now engaged - any page submissions with "suspicious" content will now be rejected with a cheerful accusation of spamming and an invite to rewrite it. (If spammers take this literally and start working around it, I'll change it to a grumpier system-error message.) Hopefully this shouldn't catch any false-positives, or choke on any existing entries, although we might have trouble with meta-conversations about spam. But we'll know why that's happened, when it's happened.

Let me know if you hit any problems, anyone.

Vanishing early revisions

rab: I note that when I click on 'View revisions' I only seem to get the last ten or so. Have the earlier ones gone for good?

Kevan: Ah, hm, so they have. Looking at the config, there's a "days to keep old revisions" setting, which is currently set to a fortnight, which I've doubled to be a bit more generous, for future generations who aren't so alert to graffiti.
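To make the effect of that setting concrete, here is a small sketch in Python (not the Wiki's own code). The function name, the data layout and the rule that the newest revision always survives are assumptions for the example; only the fortnight and its doubling come from the comment above.

```python
from datetime import datetime, timedelta


def prune_revisions(revisions, now, keep_days):
    """Keep revisions newer than keep_days; revisions are (timestamp, label) pairs."""
    cutoff = now - timedelta(days=keep_days)
    kept = [rev for rev in revisions if rev[0] >= cutoff]
    if not kept and revisions:
        # Assumption: the most recent revision is never thrown away.
        kept = [max(revisions, key=lambda rev: rev[0])]
    return kept


if __name__ == "__main__":
    history = [
        (datetime(2005, 1, 1), "r1"),
        (datetime(2005, 1, 10), "r2"),
        (datetime(2005, 1, 25), "r3"),
    ]
    now = datetime(2005, 1, 31)
    print([label for _, label in prune_revisions(history, now, 14)])
    print([label for _, label in prune_revisions(history, now, 28)])
```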

Stray link tracking

Simons Mith: Is it possible to collate a list of links that have been referred to but not yet created? I came across a couple under Septimus Divergence which I have now added but there are bound to be others. Also, is it possible to automate a scan for 'malformed' links - may I suggest performing a naive search for [ or ] and just see what comes up.

Kevan: The ["links" action] is the nearest thing the Wiki offers - it gives you a list of all pages and their links, with uncreated links being shown as unlinked black text.
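The same information can be boiled down to a list of missing entries. A minimal sketch in Python, assuming the page-to-outlinks listing has already been collected into a dictionary; the function name and the sample titles "Huff" and "Strile" are made up for the example.

```python
def unwritten_links(outlinks):
    """Map each linked-but-unwritten title to the pages that refer to it.

    outlinks maps every existing page title to the list of titles it links to.
    """
    existing = set(outlinks)
    wanted = {}
    for page, targets in outlinks.items():
        for target in targets:
            if target not in existing:
                wanted.setdefault(target, []).append(page)
    return wanted


if __name__ == "__main__":
    sample = {
        "Token": ["Boxing", "Huff"],
        "Boxing": ["Token"],
        "Septimus Divergence": ["Huff", "Strile"],
    }
    for title, sources in unwritten_links(sample).items():
        print(title, "<-", ", ".join(sources))
```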

Scanning for bad markup is less straightforward, and not easily greppable at the server level because each page is stored with all its previous revisions in the same file. I'm not sure it's a priority, though - bad spelling and grammar are as much of an issue as bad markup, and can only be found by randomly browsing entries.
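For what it's worth, the naive bracket search suggested above is short once the current text of a page is in hand. A sketch in Python, assuming the page text has already been separated from its stored revisions (the hard part noted above) and that links never span a line break; the function name is invented for the example.

```python
def unbalanced_bracket_lines(text):
    """Return 1-based numbers of lines whose [ and ] do not pair up."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        depth = 0
        broken = False
        for ch in line:
            if ch == "[":
                depth += 1
            elif ch == "]":
                depth -= 1
                if depth < 0:      # a closing bracket with nothing open
                    broken = True
                    break
        if broken or depth != 0:   # or an opening bracket never closed
            problems.append(number)
    return problems


if __name__ == "__main__":
    page = "See [[Token]] here\nBroken [[Boxing] link\nStray ] bracket\nplain text"
    print(unbalanced_bracket_lines(page))
```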

Simons Mith: Oh, that will do nicely. I am now adding the orphans to the index page. Obviously some should be tracked down and removed from the referring pages, as there are a few typos - but for the moment I am just using the A to Z as a checklist. I will do the corrections in a second pass.

OK, the obvious ones have been done. Highlights and points arising:

OK, I'm done for the moment. I plan to give other people ample chance to comment on what I've done so far before I do any more.




Simons Mith: Um. Etiquette question. Dan's adjustment to his own profile has caused my editor circuits to start firing at maximum. The original write-up was markedly better; modesty has removed not only the excessive Dan-references, which is fair enough, but also useful data on past names of parslow.com. And the reason why the site was called 'Beyond the Eighth Dimension' for a time is also legitimate information. But beyond adjusting the odd link and correcting typos, which I have done in PaulWay's profile, I think rewriting someone else's own profile is definitely Too Much.

Kevan: Given that it's actually the MCiOS page, rather than Dan's personal profile, I think the modesty is a bit out of place - disparaging his programming skills and dismissing innovations as pointless might be endearing when obviously written by Dan, but harsh when read in the neutral voice of the Encyclopaedia. (That of Peter Jones, perhaps.)

So rewording it into flatter statements would be fair, I think. I don't mind if he'd rather keep the Eighth Dimension reference an enigma, though.

Dan: To whoever patched it up (Simons?), my thanks. Perhaps I erred by touching it at all. Having done so it was in for a penny: once my name was in the edit history the idea that it persist in having any language complimentary to myself went from mildly cringe-inducing to unendurable, something I doubt I need to explain to present company. I agree that to convert it to ironically overstated disparagement was an overreaction and just as un-encyclopaedic, and unsuitable unless explicitly attributed. Your version seems to hit the right documentary tone.

Dunx: I've also tidied up my own profile, which was dangerously immodest. I have a question, though: I have just added an entry on White Rose MC, since it is a current location to access the Yorkives. I have put it in current places to play and am not sure about that at all. It is obviously very tightly linked to York and Madeira, but at the same time it seems to me to be a distinct location. Any ideas? Another approach I thought of was to add a new category of "Archives", but since this would be the only entry in that category this seems unsatisfactory.

JLE: I don't know WHO it was that attacked about two dozen entries and removed all but the first half-sentence of each, but I've restored them all from a previous revision. Down with spammers.

Also I hope people like the definitions of "Purist", "Rule Inventor" and "Gadgeteer" as referenced from the page about Mornington Nomic. I have NOT linked these to the main page because I don't believe anybody will have any interest in these three terms unless they are *already* looking at the Mornington Nomic page, and they are not relevant directly to Mornington *Crescent*.

Simons Mith: Spammer's subnet added to banned IP range. Don't think we'll miss various.newmail.ru. Perhaps an IP-banning policy ought to be defined? I'm quite happy to ban an ISP's entire subnet for the first offence, personally but for now I am just wiping out the relevant .D octet.

[JLE] Gadgeteer et al are already on the main page thanks to my placeholder links. I thought that would be what they were, but your explanations are models of succinctness. Not being a Nomic player meself, I would never have put it so well. I disagree that they are Nomic-only concepts, however; the philosophy of many standard MC players matches well with what you have given, although its manifestation in plain MC is (understandably) different.

JLE: Heh. I like to think of CAMREC as self-declared "Purists" but constantly splitting over different values of Purism, while the IMCS tends to be more a battleground between the Rule Inventors and Gadgeteers (the former having been more prominent in shaping CF84 and indeed the Finsbury Option, whereas the latter had altogether too much influence on HP2K.)

Dunx: [JLE] Kudos to you for your prompt work in fixing those apparently malicious breakages. A couple more snuck in between your finishing and Simons' IP banning, which I've patched.

Also, I agree with Simons: excellent definitions of Gadgeteer, Purist and Rule Inventor. I recognise myself in all of those...

Kevan: Looks like we've been hit by them again this morning; I've banned the IP address midway through its tinkering. Oddly, there's no corresponding URL-stuffing attack in the [spam rejection log] (I was wondering if it was a spambot that was deleting a chunk of text before adding links in a second edit, to counter some standard filter), so maybe we're moving into the age of pointless, scripted Wiki vandalism. Heigh ho.

(Hm, Simons' IP ban didn't work, by the way, as the "IP address" being used by the spammers didn't fit the regexp he'd given - it had a load of words on the end. Have changed the regexp in the banned list.)
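A small illustration of how a ban pattern can miss in this way, in Python. The actual patterns and addresses are not given above, so the ones here are invented (using a reserved documentation range), and the trailing words are a guess at the sort of thing that was on the end.

```python
import re

# Hypothetical patterns: the first insists the address ends after the last
# octet, the second only checks how the address begins.
STRICT = re.compile(r"^192\.0\.2\.\d+$")
LOOSE = re.compile(r"^192\.0\.2\.")


def is_banned(pattern, address):
    """True if the reported address string matches the ban pattern."""
    return pattern.search(address) is not None


if __name__ == "__main__":
    reported = "192.0.2.17, unknown via proxy"  # invented example string
    print(is_banned(STRICT, reported))
    print(is_banned(LOOSE, reported))
```

Matching on the leading octets alone also gives the subnet-wide ban mentioned earlier, at the cost of catching every address in that range.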

rab: What a pile of arse, and in the way of these things, complete pointlessness on the part of the attackers. Any chance of a quick "restore previous version" button?

Kevan: The basic Wiki software doesn't have an option for it - I might be able to knock up a shortcut to the edit-the-previous-version URL, or something. Would be a bit hesitant to add a fire-and-forget "restore previous version" button, though, because annoying vandals would use it and think it was funny. Might be good as an admin-only link, though. Will have a look when I've got some spare time.

rab: [Kevan] Yeah, I did wonder about that. Even if the number of clicks could be reduced by one, or a scroll-to-the-bottom alleviated, it would make the clean-up operation less arduous. Mind you, many hands make light work, as Dunx may have just noticed :)

Dunx: Vandalism is the new spam, eh? Maybe it's time to bring up the subject of user registration again...

[rab] I thought they were going quickly! And I'm glad to note that simulpost protection appears to operate cleanly on editing this page too.

Dunx: More spam removal; 19 articles were scribbled over. I don't want to be boring, but can we address the subject of user registration again?

Kevan: I suppose the mad-spambot vandalism has been outnumbering the genuine contributions of passers-by. When I've got some more time and space, I'll implement something along the lines of the advertising-blurb competition question that I've used in [Lexicon] - that the Wiki will require a password for edits to be made, but it'll be something that any leisurely human being could look up in the Encyclopaedia in a minute or two.

Dunx: Goodo. Rather irritatingly, many of the articles I restored yesterday have been scribbled over *again*.

Dan: It occurred to me some time ago that an effective yet dead simple filter for these sorts of things might be a text field into which one types one's response to the prompt "Surname of the nice lady who writes letters to Sir Humphrey." A search won't answer that question, at least not in any highly ranked way, but no one who can't answer it has any business contributing to the EM.

Kevan: Okay, it's now locked, with a link to an explanatory page in the header. The password is very easy to dig out, though - anything more obscure would possibly exclude people who had something to say, but had reached the game through a different route (I don't know how many online players have never heard an episode of ISIHAC). I've been using the really-easy-password thing on Lexicon for a few months, anyway, and we've not had any spam in that time. Spambots would be foiled by anything, and vat-grown human spammers presumably don't want to waste their time digging for an answer that might not be there, when they could be spamming a less-defended Wiki.

Dunx: thanks Kevan. I'll try and provide some actual entries now rather than just spam removal. Incidentally, the last couple of lots of vandalism were not just truncation but real spam entries, presumably crafted in such a way as to defeat the filters. The point is moot now, with any luck.

Generated Indices

Dunx: I know the Player Profiles are generated using a category index. I note that there are a couple of stub links on the front page for rulesets and stations; can we construct category-driven indices for these? Is there a standard way of building these without coding support?

Kevan: The Player Profiles link on the main page actually hard-links to the "backlinks" page for the Player Profiles page (which is just a "la la, don't read this" placeholder). But by putting a link to that page on every player profile, then a list of backlinks to Player Profiles becomes a list of player profiles.

(Or, eh, it nearly does - a list of backlinks also contains any pages which contain the string, rather poorly. That Sims appears in the [backlinks for 'Kevan'], because my name's in a URL on it, with no actual link to my profile.)

But mm, you probably knew that anyway. The standard way of building a new category does involve a lot of boring legwork adding "Categories: Blah" to all the pages you want to put in that category. Although I could automate it on the server, by throwing together a script to append the text to a load of the Wiki's raw data files. (It would mean someone giving me a big list of articles to place in the category, though, obviously.)
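The backlinks behaviour described above, including the stray match on a name inside a URL, can be shown with a short sketch. This is Python for illustration, not the Wiki's code; the page texts are invented, and it assumes links are written in the [[double bracket]] form.

```python
import re

LINK = re.compile(r"\[\[([^\]]+)\]\]")  # assumed [[Title]] link syntax


def backlinks_by_substring(pages, title):
    """Pages whose text contains the title anywhere, URLs included."""
    return sorted(name for name, text in pages.items() if title in text)


def backlinks_by_link(pages, title):
    """Pages that contain an actual [[title]] link."""
    return sorted(name for name, text in pages.items() if title in LINK.findall(text))


if __name__ == "__main__":
    pages = {  # invented page texts
        "Dunx": "A player. [[Player Profiles]]",
        "That Sims": "See http://example.org/Kevan/sims for details.",
        "Dan": "Works with [[Kevan]]. [[Player Profiles]]",
    }
    print(backlinks_by_substring(pages, "Kevan"))
    print(backlinks_by_link(pages, "Kevan"))
    print(backlinks_by_link(pages, "Player Profiles"))
```

The last call is the category trick itself: every profile links to the placeholder page, so that page's backlinks form the list of profiles.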

Simons Mith [Having submission trouble for some reason...] Well, I've partly short-circuited this issue by doing the necessary legwork. First-pass only for now - it's getting late. I've already noticed a couple of items I missed first time through. However, while it is right and fitting that, say, Baker Street appears on the newly-created Station Index, we have a conflict between Baker Street as a station and Baker Street as the name of a ruleset. Suggests to me we need to be more formal in how we reference these definitions. Perhaps [Station - Baker Street]? and [Ruleset - Baker Street]? is the way to go?

Usemod vs MediaWiki

Dan (Rdsmith4): Might I suggest an improved wiki engine? MediaWiki, the one on which run Wikipedia et al., is very effective.

Kevan: It certainly is, but I'm not sure it'd be worth the effort of moving all the content over, given the tiny amount of editing that this Wiki gets, these days.

Dan (yes, that one): In case it isn't obvious that wasn't me up there. Clearly it was obvious to Kevan since he amended the commenter's name. The suggestion to move to MediaWiki because it's very effective seems to imply that Usemod isn't. But it certainly seems to be. And have you ever tried porting content from one Wiki to another? The difficulty varies depending on which ones you're porting between but it's never trivial, and no two wikis entirely agree on what wiki markup actually is. Like most such things it's not worth doing if you haven't actually identified a problem that it solves.

Pave: I've added a few bits (not changed anything though). I think someone (I'll do it if needs be) should go through the current games and sites to build all the players' profiles up - even if it is to just get their names up for now.

Gil: I've added a Gil page for the player profiles, but I can't see how to link it to the Player Profile page.

Dan: Done. It's accomplished by inserting the Player Profiles category token into the text. View the page in edit mode to, um, view.

Simons Mith: Well now, a minor gotcha; [[DrQu+xum]] does not work as a link. Kevan, would you mind confirming that its presence isn't causing any behind-the-scenes breakages? Ta. BTW, if I Categorise Branson, Richard, should he go in the Famous Players section as well? Currently I'm assuming not.

Simons Mith: Another thought that occurs, rather than have entries such as e.g. Red, Blue, Green and others spread to the four corners of the alphabet, perhaps list them as sub-entries under colour? Catch is, how to decide how finely to divide things. Logically the same trick could also be applied to Famous Players, Stations, Rulesets, which would rather defeat the point of having sub-indexes for those areas. I have applied this idea tentatively as an experiment, but feedback would be appreciated. Still, I'm the only one updating at the moment :-)

Last edited May 3, 2007 11:11 pm by Simons Mith (diff)