Wednesday, June 20, 2012


As useful as it is, Wikipedia can undoubtedly be better. I think an important part of this is seeing what it has to offer as a non-traditional encyclopedia. Wikipedia articles represent the evolutionary synthesis of ideas. A new article format could let users see this process. A many-layered dynamic format, as an optional setting, can be much richer than a traditional single-page article format.

We can easily identify what parts of articles get modified the most, and longitudinal observation will reveal a typical article's evolutionary pattern. Using stochastic selection of semantic units (sentences or paragraphs) rated on a "editing-frequency" scale an algorithm could build summaries of larger articles. The summarizing process could be designed to reduce article length by percentage. The most likely sentences or paragraphs to be left out will be those that get edited the most.

It's hard to say when this kind of filtering would be useful, or when it would turn the content into nonsense (assuming is doesn't do so all the time anyway). So, the user would be given a "filter scale", controlling the percentage of summarizing. If the user chooses to change this from the default (no summarizing), they could mark which setting they prefer. An aggregate of these settings will, at minimum, indicate the kind of information people choose to filter out.

A visual layer of editing-frequency metadata  would allow users to see how the filter operates, pulling highly contentious or ephemeral information out while leaving information that has been stable over a long term. Highlighting the semantic units in different colors according to editing frequency will make the changes evident. 

The goal is allowing users to draw their own conclusions about how this public information repository evolves. Whether or not I pursue this myself or not entirely depends on how far I get with perl and other coding languages. It's hard to tell if I'm finding good, entertaining reasons to write code myself, or if I'm doomed to get bored. We'll see.