Thursday, March 5, 2009

Twitter and Extremistan

Back around October last year I signed up to Twitter, at the strong suggestion of my long-time friend Terry Jones. He read my post The Wisdom of a Random Crowd of One and tweeted it out to his followers, and on to Tim O'Reilly, who in turn tweeted it out to his 17,000 followers (Tim has almost twice that number now). I got 150 visits in one day, when 20 or so would have been expected. You can see the spike in the graph below depicting visitors to my blog.


You might call this data point an outlier, since the closest I have got since then is 68 visits in a day, and I think that came from a title promising too much.

That October visit spike reminded me of the fictitious lands of Mediocristan and Extremistan, introduced in Taleb's Black Swan book. Black Swan events are creatures of Extremistan but we mistake their significance because our personal risk-assessing machinery is domiciled in Mediocristan. Here is one of Taleb's examples to explain the difference between these two lands.

Consider a thought experiment where you assemble a sample of a thousand or so adults into a group and measure their height. You would expect most people to be between 120 cm and 210 cm (or roughly 4 feet to 7 feet tall), yielding a range of about one metre (3 feet) from shortest to tallest. Still, we might find someone outside this range if we happened to select an NBA player or two, so consider widening it to 30 cm to 330 cm (roughly 1 foot to 11 feet). It would be quite rare to meet an adult whose height falls outside this range.


The average height H of an adult male will be between 160 cm and 185 cm (roughly between 5 and 6 feet), depending on the country, and for the sake of argument, let us consider the American average of H = 178 cm (about 5 feet 10 inches). In this case we see that pretty much everyone's height will fall in the range 0 cm to 2*H cm. In other words, a deviation from the average as large as the average itself covers the full spectrum of heights (0 to about 11 and a half feet). And this is what characterizes Mediocristan - the average of the sample is a good guide to the sample itself. We could not add a person to the group and expect a big change in the average, or a new observation outside our 0 to 2*H interval. The same could be said if we selected weight or shoe size instead of height.

But consider changing this physical attribute of the group to a social attribute, such as net financial worth. Here the average of our sample may be, say, $1 million USD. But if we added Bill Gates to the sample, with a net worth of, say, $80 billion USD, then suddenly this one observation dwarfs all others - it's 80,000 times bigger than the current sample average. The previous observations in the data set are not a good indicator for the existence of the "Bill Gates outlier". Welcome to Extremistan, and as observed by Taleb, practically any social measure exhibits Extremistan properties.
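
To make the contrast concrete, here is a minimal Python sketch of the arithmetic, under stated assumptions: a group of exactly 1,000 people, and a hypothetical 230 cm newcomer standing in for the tallest person we might plausibly add. The function name mean_after_adding is mine, for illustration only; the 178 cm, $1 million and $80 billion figures are the ones used in the text.

```python
def mean_after_adding(sample_mean, n, newcomer):
    """Mean of a sample of n values (with the given mean) after one more value joins."""
    return (sample_mean * n + newcomer) / (n + 1)

n = 1000  # assumed group size ("a thousand or so")

# Mediocristan: heights in cm. The 230 cm newcomer is a hypothetical extreme.
height_before = 178.0
height_after = mean_after_adding(height_before, n, 230.0)

# Extremistan: net worth in USD, using the figures from the text.
wealth_before = 1_000_000.0
wealth_after = mean_after_adding(wealth_before, n, 80_000_000_000.0)

print(f"height mean: {height_before:.2f} -> {height_after:.2f} cm")
print(f"wealth mean: {wealth_before:,.0f} -> {wealth_after:,.0f} USD")
print(f"wealth mean grew by a factor of {wealth_after / wealth_before:.1f}")
```

Under these assumptions the average height moves by about half a millimetre, while the average net worth jumps from $1 million to roughly $81 million on the strength of a single new observation.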

John Robb has created a graphic that shows the main differences between Mediocristan and Extremistan for his post on the topic.


The last comparison point, Normal (Gaussian) vs. Pareto curves, is a more familiar expression of the Mediocristan vs. Extremistan debate. There is another interesting article on this issue by John Hagel, but my comments will have to wait for another post. For the moment, Twitter is one of my doors into the Extremistan world.
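
As a rough illustration of the Normal vs. Pareto comparison, the sketch below asks the same question of both curves: how likely is a single observation to exceed twice the average? The parameters are assumptions of mine and not taken from Taleb or Robb: a standard deviation of 7 cm for height, and a Pareto shape of 1.16 (the value usually associated with the 80/20 rule) for the social measure.

```python
import math

def normal_tail(mean, sd, x):
    """P(X > x) for a Normal(mean, sd) variable."""
    return 0.5 * math.erfc((x - mean) / (sd * math.sqrt(2)))

def pareto_tail(x_min, alpha, x):
    """P(X > x) for a Pareto variable with minimum x_min and shape alpha."""
    return 1.0 if x <= x_min else (x_min / x) ** alpha

# Mediocristan: height with mean 178 cm and an assumed sd of 7 cm.
p_height = normal_tail(178.0, 7.0, 2 * 178.0)

# Extremistan: Pareto with assumed shape 1.16; its mean is alpha * x_min / (alpha - 1).
alpha, x_min = 1.16, 1.0
pareto_mean = alpha * x_min / (alpha - 1)
p_social = pareto_tail(x_min, alpha, 2 * pareto_mean)

print(f"P(height > 2 * average)         = {p_height:.3g}")
print(f"P(Pareto measure > 2 * average) = {p_social:.3g}")
```

With these assumed parameters, doubling the average is effectively impossible under the Normal curve, but happens a few percent of the time under the Pareto curve.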

