
Because there are storage costs other than the literal cost of maintaining data on a server.

Is the data accurate and up-to-date? Are the links still viable? Do the citations check out? etc.

It's just like having a project with a huge number of lines of code. It's an important maintenance task to delete outdated or obsolete code - not because you will run out of disk space to store your .c files, but because unused, unmaintained code is very likely to be confusing or wrong.

Remember also the power rule for user contribution: 90% of all users do not contribute anything ever. 9% of users contribute at least once. 1% of users contribute everything else. If an article is not very notable, it's extremely likely that the person who originally contributes it will be the only person who ever contributes anything to the article, and contributing the original is very likely to be the only contribution to the subject they make. This will hold true even if there are some number of "long tail" readers interested in the article, because readers are drastically more common than writers, and infrequent writers drastically more common than writers who consistently attempt to improve the articles they read.

So: Articles with low notability are probably poorly maintained, out of date, and/or made up entirely of the statements of a single person. Making them better imposes a cost which relatively few people in Wikipedia's community actually pay - the cost of taking time to edit something.

All of Wikipedia's pains and stretch marks will become much clearer if you restate the purpose and function of the site. If you approach Wikipedia as a server farm that takes in user contributions and produces a compilation of same, then it's strange to have any limits on contributions, since the cost of storing text data is extremely small. If you approach Wikipedia as a group of people that takes in the members' time and energy and produces agreement between the members, then you see where their problems come from.

Remember, a Wikipedia article gets to stay in a particular state only when everyone who potentially could change it agrees that no changes are necessary. Wikipedia produces consensus, not data. Its limiting factor is the time and patience of the people who work on edits.

If you need consensus - the kind that will not be repeatedly challenged by latecomers - and you are limited in the number of people who will voluntarily "meet" in the editing area to discuss a particular subject, then low-popularity articles are a cost with no benefit. The opinions of a single person - possibly an oddball, since they're the only one who submitted the article - will produce no agreement, and it will consume more time and energy to find people who could hammer out a lasting consensus.



"Wikipedia produces consensus, not data."

That's a great description of what Wikipedia is.

I keep thinking of it as an open source repository of human knowledge, like an infinitely large encyclopedia. But in reality it is more like a forum where useful information "gets voted to top", so to speak.


I think this is also the best way to understand the results of editing wars and the like.

A requirement for consensus is incompatible with individual rights, since people have the right to disagree. For low-interest topics, Wikipedia eventually succeeds anyway because less-motivated contributors become bored or exhausted and stop arguing. For high-interest topics, the search for consensus becomes more and more explicitly political - political as in "determining the governing rules of a body of disparate people", rather than political as in "are Republicans better than Democrats", although the two kinds of 'political' are both present in many edit wars.

Take the cool-off rules, for example - locking an article for a period of time, or forbidding certain people from editing it. This is central to Wikipedia's functioning in highly controversial areas, but "nobody can edit this article today" is a policy, not a datapoint. Politics produces policy - a set of actions that can be acted upon by people who do NOT necessarily agree about the underlying facts.

Nasty edit-wars borrow from the tools of dictatorship and oppression - censorship, propaganda, exclusion from the body politic. It's not a coincidence; as mentioned above, consensus is incompatible with individual rights. You need to limit which individuals are allowed to participate, or negotiate the ways that their participation will be accepted. In a sufficiently small or like-minded group, consensus can work again.

A data server has a failure state where some of the data is lost. Wikipedia's failure state, on the other hand, resembles the internal rupture of a political party in a post-Soviet state. Like communist countries, Wikipedia articles demand consensus from their citizens in order to operate.

Of course, unlike Eastern Bloc revolutions, Wikipedia contributors typically do not get lined up against a wall and shot. =) That's also one of the reasons we tend to put Wikipedia edit wars in either the "tech company struggles with its architecture" bucket or the "people on the Internet are angry, news at 7" bucket, rather than the "fractured political unit experiences internal strife due to irreconcilable demands on limited resources" bucket where they actually belong.


Do you have any ideas for other rules that can increase the useful-output-to-political-conflict ratio?

(Welcome either in this thread or in direct communication; I am trying to devise better computer-mediated systems for this. While conflict and politics are intrinsic to human affairs I think proper system design can reduce the number of zero- or negative-sum interactions.)



