Teaching the hive mind to discriminate

Writing on his blog, Chicago law professor Cass Sunstein invokes the name of the sainted Hayek to endorse the decentralized nature of Wikipedia and other peer-production exercises:

Developing one of the most important ideas of the 20th century, Nobel Prize-winning economist Friedrich Hayek attacked socialist planning on the grounds that no planner could possibly obtain the “dispersed bits” of information held by individual members of society. Hayek insisted that the knowledge of individuals, taken as a whole, is far greater than that of any commission or board, however diligent and expert. The magic of the system of prices and of economic markets is that they incorporate a great deal of diffuse knowledge.

Sunstein fails to appreciate that markets are a special case in group dynamics, where knowledge is maximally distributed. So I pointed that out to him:

Wikipedia is all about process, and because its process is so different from Britannica’s, it’s not really accurate to describe it as an “encyclopedia”. Wikipedia is actually the world’s largest collection of trivia, gossip, received wisdom, rumor, and innuendo. It’s valuable because any large collection of information is valuable, but not in the same way that the verified, expert summaries in an encyclopedia are valuable.

If it’s true that the “knowledge of individuals, taken as a whole, is far greater than that of any commission or board,” it’s also true that the sum of their prejudice, mistaken beliefs, wishful thinking, and conformance to tradition is greater.

All of this is to say that group endeavors like Wikipedia produce breadth but not depth. For some endeavors depth is essential; for others, it's fine to consult the rabble.

Marketing, for example, can gain much by mining the dark corners of Wikipedia; engineering and medicine, not so much, as knowledge is not as widely dispersed at the depths as it is at the surface.

Which brings us back to Hayek. Markets do a great job of bringing information about the wishes of buyers to bear on the consciousness of sellers. Everybody who participates in a market is an expert on the subject of his own wishes or his own product. But when you leave the realm of buying and selling, expertise is not as widely dispersed as participation, and then the decentralized model falls down.

And then we’ve got a little back-and-forth with Tim Wu on Tech Lib, where Wu says:

So it's obviously true that decentralized and centralized systems are better for different things, as RB points out.

One thing I think is interesting, and don't quite understand, is how often, however, humans tend to underestimate the potential of decentralized solutions.

That's what Hayek was getting at in his paper — there's no question that if you put a perfect, planned economy next to an unplanned economy, the planned economy will win. Hands down.

But we aren't good at knowing when information problems will cripple what would have been the better system.

So maybe we're overcompensating, as RB suggests, in the direction of decentralized systems, but I happen to think we have to fight a perfectionist instinct that drives us to over-centralization.

Just ask Napoleon III.

Here’s the essential issue, as I see it: It’s undeniably true that information exists nearly everywhere, hence the potential information present in a large group is greater than that in a small group, and that’s why markets allocate resources better than committees. But it’s also true that misinformation exists nearly everywhere, so there’s also a huge potential for large groups to be misguided.

So the real question about information and group scaling is this: are there procedures for separating good information from false information ("discrimination") that are effective enough to allow groups to be scaled indefinitely without a loss of information quality? It's an article of faith in the Wikipedia "community" that such procedures exist, and that they're essentially self-operative. That's the mythos of "emergence": that systems, including human systems, automatically self-organize in such a way as to reward good behavior and information and purge bad information. This seems to rest on the underlying assumption that, people being basically good, the good will always prevail in any group.

I see no reason to believe that groups have this property, even if one accepts as given the fundamental goodness of the individual. And even if some groups have this property, does it follow that self-selecting groups do? Polling, for example, seems to be pretty accurate when it's done by random sample. But self-selected polling is notoriously inaccurate. If a web site puts up a presidential preference poll and supporters of one candidate or another urge each other to vote, the results are skewed.
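The random-sample versus self-selected distinction is easy to see with a toy simulation. All the numbers here are hypothetical: a population that favors candidate A 52–48, and a self-selected poll in which B's supporters, having urged each other to vote, respond three times as often as A's.

```python
import random

random.seed(42)

# Hypothetical population: 52% prefer candidate A, 48% prefer B.
population = ["A"] * 52_000 + ["B"] * 48_000

# Random sample: every member is equally likely to be polled.
random_sample = random.sample(population, 1_000)
random_estimate = random_sample.count("A") / len(random_sample)

# Self-selected poll: B's supporters are urged to participate,
# so they respond three times as often as A's supporters.
def responds(voter: str) -> bool:
    return random.random() < (0.30 if voter == "B" else 0.10)

self_selected = [v for v in population if responds(v)]
selected_estimate = self_selected.count("A") / len(self_selected)

print(f"random sample:      {random_estimate:.1%} for A")
print(f"self-selected poll: {selected_estimate:.1%} for A")
```

The random sample lands near the true 52%, while the self-selected poll shows A losing badly — around a quarter of respondents — even though A holds the actual majority. Participation, not preference, drives the self-selected result.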

This is what happens in Wikipedia and many open source projects: participation is limited to people with an interest in a particular outcome, and they distort the process to get the desired result. Participation is not automatically tailored to align with expertise, as it is in markets.

The methods we have for separating fact from fiction, such as expert opinion, the scientific method, and random polling, don't scale to arbitrarily large groups.

Hence the work of large groups is suspect.

Banning Wikipedia

The dubious nature of Wikipedia information has come to the attention of the authorities:

When half a dozen students in Neil Waters’s Japanese history class at Middlebury College asserted on exams that the Jesuits supported the Shimabara Rebellion in 17th-century Japan, he knew something was wrong. The Jesuits were in “no position to aid a revolution,” he said; the few of them in Japan were in hiding.

He figured out the problem soon enough. The obscure, though incorrect, information was from Wikipedia, the collaborative online encyclopedia, and the students had picked it up cramming for his exam.

Dr. Waters and other professors in the history department had begun noticing about a year ago that students were citing Wikipedia as a source in their papers. When confronted, many would say that their high school teachers had allowed the practice.

But the errors on the Japanese history test last semester were the last straw. At Dr. Waters’s urging, the Middlebury history department notified its students this month that Wikipedia could not be cited in papers or exams, and that students could not “point to Wikipedia or any similar source that may appear in the future to escape the consequences of errors.”

Kudos to Middlebury College.

Breathtaking stupidity

There’s way too much stupidity in the world to comment on all of it, but sometimes you see something that sets a new standard. The Cato Institute has commissioned Jaron Lanier to explain the Internet, and his contribution makes all the silly drivel written about it in the past look downright serious. Lanier’s main point is that the Internet is a social construct:

I hope I have demonstrated that the Net only exists as a cultural phenomenon, however much it might be veiled by an illusion that it is primarily industrial or technical. If it were truly industrial, it would be impossible, because it would be too expensive to pay all the people who maintain it.

Now it’s silly enough when left-feminist academics say “gender is a social construct” but this is downright hilarious. Lanier had something to do with gaming goggles once upon a time, but he’s basically illiterate and has no special expertise in networking. Cato is obviously over-funded and intent on wasting your time.

If you want to read a futurist of merit, check out Ray Kurzweil, a man of learning and intelligence who certainly won’t waste your time with a bunch of new-age drivel.

Coyote at the Dog Show has read Lanier’s essay, and he’s not impressed either. He mentions Lanier’s seemingly senseless attack on the concept of the “file” in computers. The revolutionary alternative that Lanier proposes is a time-indexed file, something that’s commonplace for video servers. Not exactly revolutionary, and not exactly well-informed.

If you don’t like files, folders, directories, and symbolic links, fine, throw all your stuff into a single common file and be done with it.