Power Laws vs Bell Curves

Extremistan and the Tyranny of the Average

~30 min read · Interactive essay

Chapter 1

The Man Who Averaged Everything

In 1835, a Belgian astronomer named Adolphe Quetelet did something that sounds perfectly sensible and turned out to be quietly insane. He measured the chest circumferences of 5,738 Scottish soldiers and plotted them on a graph. The result was a gorgeous, symmetric hump: tall in the middle, tapering on both sides. A bell curve. Quetelet was thrilled. He'd found l'homme moyen, the "average man," and he declared this shape to be the signature of nature itself.1

And to be fair, for chest measurements, he was basically right. If you grab a random Scottish soldier, his chest is probably within a couple of inches of the average. Even the most barrel-chested outlier isn't ten times wider than the scrawniest recruit. Heights work the same way. So do IQ scores, blood pressure readings, and the speeds of molecules in a gas. In these domains, the average is a meaningful summary — the center genuinely holds.

Quetelet's insight was beautiful: deviations from the average are symmetric and bounded. Nobody is negative inches tall. Nobody is a thousand feet tall. The extremes have a leash, and it's short.

His mistake was assuming this was the only game in town.

Because while Quetelet was busy averaging soldiers, the economy was producing something decidedly un-average. By the late 1800s, a tiny sliver of industrialists controlled a staggering fraction of the world's wealth. The distribution of money looked nothing like the distribution of chests. It didn't have a hump. It had a cliff on one side and a tail that stretched out to the horizon, where a handful of names — Rockefeller, Carnegie, Vanderbilt — lived in a statistical ZIP code all their own.

The bell curve is what nature does when she's feeling generous. The power law is what happens when she's feeling dramatic.
· · ·
[Figure: Mediocristan, where everyone is roughly the same height, beside Extremistan, where one giant outlier dominates the average.]

In Mediocristan, no individual can change the story. In Extremistan, a single outlier rewrites it.

Chapter 2

Pareto's Garden and Zipf's Library

The Italian economist Vilfredo Pareto noticed it first — or at least, he was the first to write it down with enough math to be taken seriously. In 1896, he observed that about 80% of Italy's land was owned by 20% of the population. That's memorable enough. But the deeper pattern was that this wasn't just an Italian thing. He checked England, Prussia, and several other countries. Same story everywhere.2

The relationship was scale-free. Zoom in on the top 20% and you'd find that 80% of their wealth was held by 20% of them. It was turtles all the way up. Pareto had stumbled onto a power law — a distribution where the probability of an event is inversely proportional to some power of its size.

Pareto Distribution
$P(X > x) = \left(\frac{x_m}{x}\right)^{\alpha}, \quad x \ge x_m$
where $x_m$ is the minimum value and $\alpha$ (the "tail exponent") controls how fat the tail is. Smaller $\alpha$ = wilder extremes.
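
The scale-free property falls straight out of this formula. Here is a minimal Python sketch, assuming a minimum value of 1 and a tail exponent of 2 purely for illustration (neither number comes from Pareto's data, and `pareto_survival` is a hypothetical helper name). Wherever you start, going ten times further out costs the same factor in probability.

```python
def pareto_survival(x: float, x_min: float = 1.0, alpha: float = 2.0) -> float:
    """P(X > x) for a Pareto distribution; equals 1 for x at or below x_min."""
    if x <= x_min:
        return 1.0
    return (x_min / x) ** alpha

# Scale-free check: the ratio below is 10 ** -alpha regardless of the starting point x.
for x in (2.0, 20.0, 200.0):
    ratio = pareto_survival(10 * x) / pareto_survival(x)
    print(f"x = {x:>5}: P(X > 10x) / P(X > x) = {ratio:.4f}")
```

A bell curve has no such constant ratio: the further out you start, the more brutally each extra step is punished.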

A few decades later, the linguist George Kingsley Zipf found the same pattern in a completely different place: words. He counted word frequencies in English text and discovered that the most common word ("the") appears about twice as often as the second most common ("of"), three times as often as the third, and so on. The nth-ranked word appears with frequency proportional to 1/n.3
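
If frequency is proportional to 1/n, then rank times count should come out roughly constant. The sketch below checks that on a made-up toy text whose counts are perfectly Zipfian by construction; `rank_frequency` is a hypothetical helper, and a real corpus would only follow the pattern approximately.

```python
import re
from collections import Counter

def rank_frequency(text: str, top: int = 5):
    """Return (rank, word, count, rank * count) for the most common words in text."""
    words = re.findall(r"[a-z']+", text.lower())
    most_common = Counter(words).most_common(top)
    return [(rank, word, count, rank * count)
            for rank, (word, count) in enumerate(most_common, start=1)]

# Toy text, constructed so that counts are exactly 60, 30, 20.
toy_text = "the " * 60 + "of " * 30 + "and " * 20

for rank, word, count, product in rank_frequency(toy_text, top=3):
    print(f"rank {rank}: {word!r} appears {count} times, rank x count = {product}")
```

Swap `toy_text` for any long English text you have on hand and the last column should stay in the same ballpark for the first several ranks.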

This is eerie. Language and land ownership have nothing obvious in common, yet they dance to the same mathematical tune. City populations follow it too: New York is roughly twice the size of Los Angeles and three times the size of Chicago. The list goes on and on (earthquake energies, asteroid sizes, web page hits, casualties in wars), as if the universe has a house style for inequality.

The 80/20 Rule and Its Cousins

The "80/20 rule" is really just the most famous instance of a power law. But 80/20 isn't sacred — the exact split depends on the tail exponent α. Sometimes it's 90/10. Sometimes it's 99/1. The principle is always the same: a small number of items account for a disproportionate share of the total.
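
To see how the split depends on α, here is a short sketch that assumes an exact Pareto distribution with α > 1. Under that assumption, the top fraction p of the population holds a share p^(1 − 1/α) of the total. The function names are hypothetical, and real data will rarely follow the formula this cleanly.

```python
import math

def top_share(p: float, alpha: float) -> float:
    """Share of the total held by the top fraction p of a Pareto population."""
    if alpha <= 1:
        raise ValueError("the mean is infinite for alpha <= 1, so shares are undefined")
    return p ** (1 - 1 / alpha)

def alpha_for_split(top_fraction: float, share: float) -> float:
    """Tail exponent at which the top `top_fraction` holds `share` of the total."""
    return 1 / (1 - math.log(share) / math.log(top_fraction))

alpha_80_20 = alpha_for_split(0.20, 0.80)
print(f"80/20 corresponds to alpha = {alpha_80_20:.3f}")
for alpha in (3.0, 2.0, 1.5, alpha_80_20, 1.05):
    print(f"alpha = {alpha:.3f}: top 20% hold {top_share(0.20, alpha):.0%}")
```

The classic 80/20 split corresponds to α of about 1.16. As α drops toward 1, the top 20% swallow nearly everything.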

· · ·
Chapter 3

Mediocristan and Extremistan

Nassim Nicholas Taleb gave the two worlds the names they deserved.4 Mediocristan is the land of the bell curve — the place where heights live, where test scores live, where no single data point can dramatically change the story. If you're computing the average height of a thousand people and you add the tallest person who ever lived, the average barely budges. The individual is subordinate to the collective.

Extremistan is the land of the power law, and it plays by opposite rules. If you're computing the average wealth of a thousand people and Jeff Bezos walks into the room, the average might jump by a factor of a thousand or more. A single observation can dominate, distort, or define the entire sample. The collective is subordinate to the individual.

The Bill Gates bar experiment: Imagine a bar with 50 people in it. Their average wealth is maybe $70,000. Now Bill Gates walks in. The average wealth is suddenly over $2 billion. The median has barely moved. The mean is now useless: it describes nobody in the room, including Gates.
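
The bar experiment takes a few lines to run. The sketch below invents 50 patrons whose wealth is spread evenly around $70,000 and assumes a round $105 billion for Gates. Both are illustrative assumptions, not data.

```python
from statistics import mean, median

# Hypothetical bar: 50 patrons with wealth spread evenly around $70,000.
patrons = [40_600 + 1_200 * i for i in range(50)]
GATES_NET_WORTH = 105_000_000_000  # assumed round figure, for illustration only

before_mean, before_median = mean(patrons), median(patrons)
after = patrons + [GATES_NET_WORTH]
after_mean, after_median = mean(after), median(after)

print(f"before: mean ${before_mean:,.0f}, median ${before_median:,.0f}")
print(f"after:  mean ${after_mean:,.0f}, median ${after_median:,.0f}")
```

The mean jumps from $70,000 to a little over $2 billion, while the median shifts by a few hundred dollars, which is why the median is the safer description of a "typical" patron.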

In Mediocristan, the average is the story. In Extremistan, the average is a fairy tale.

This isn't just philosophy. It has ferocious practical consequences. If you use Gaussian (bell-curve) models to estimate risk in a power-law domain, you will systematically, catastrophically underestimate the probability of extreme events. You'll think a once-in-ten-thousand-years flood is impossible, and then it happens twice in a decade. You'll build financial models that say a single-day market drop of 20% is so unlikely it should never occur in the lifetime of the universe — and then Black Monday 1987 happens.5

· · ·
Chapter 4

Why the Tails Tell the Tale

Here's the mathematical crux, and it's worth pausing to actually stare at it.

In a normal distribution, the probability of being more than k standard deviations from the mean drops off like $e^{-k^2/2}$. That's astonishingly fast. The probability of a 6-sigma event (six standard deviations out) is about 1 in 500 million. A 10-sigma event? Forget it. The numbers become so small they lose physical meaning.
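
You can check these numbers with nothing but the standard library. This sketch uses the two-sided tail (more than k standard deviations from the mean in either direction), computed with the complementary error function. `two_sided_tail` is a hypothetical helper name.

```python
import math

def two_sided_tail(k: float) -> float:
    """P(|Z| > k) for a standard normal variable, via the complementary error function."""
    return math.erfc(k / math.sqrt(2))

for k in (1, 2, 3, 6, 10):
    p = two_sided_tail(k)
    print(f"{k:>2} sigma: p = {p:.3e}, about 1 in {1 / p:,.0f}")
```

Six sigma comes out at roughly 1 in 507 million, and ten sigma at odds with more than twenty digits.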

Tail Behavior Comparison
Bell curve: $P(X > x) \sim e^{-x^2/(2\sigma^2)}$

Power law: $P(X > x) \sim x^{-\alpha}$
Exponential decay vs. polynomial decay. The power law's tail is incomparably fatter.

In a power law with exponent α = 2, an event 10× larger is only 100× less likely. That sounds like a lot until you compare it: in the bell curve world, an event 10× further out (10 standard deviations instead of 1) is roughly $10^{22}$ times less likely. By that measure, the power law is a hundred billion billion times more generous with extreme events.
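
Here is that comparison as a sketch. It reads "10× further out" as going from 1 to 10 standard deviations for the bell curve, and from x = 1 to x = 10 for a Pareto tail with minimum value 1. Both readings are assumptions made for the sake of a concrete number.

```python
import math

def normal_tail(x: float) -> float:
    """One-sided P(Z > x) for a standard normal variable."""
    return 0.5 * math.erfc(x / math.sqrt(2))

def power_law_tail(x: float, alpha: float = 2.0) -> float:
    """P(X > x) for a Pareto tail with minimum value 1."""
    return x ** -alpha

normal_ratio = normal_tail(1) / normal_tail(10)        # how much rarer 10 sigma is than 1 sigma
power_ratio = power_law_tail(1) / power_law_tail(10)   # how much rarer x = 10 is than x = 1

print(f"bell curve: {normal_ratio:.2e} times rarer")
print(f"power law:  {power_ratio:.0f} times rarer")
print(f"gap:        {normal_ratio / power_ratio:.2e}")
```

The bell-curve ratio lands near 2 × 10^22 against the power law's 100, a gap of about twenty orders of magnitude.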

[Figure: What happens when an outlier walks in. Heights (Mediocristan): adding an 8'11" person barely budges the average. Wealth (Extremistan): adding $150B to a group worth about $70K each makes the average explode.]

Adding the tallest human barely changes average height. Adding one billionaire makes the average meaningless.

This is why earthquakes are so dangerous. The Gutenberg-Richter law says earthquake frequency follows a power law in the energy released: for each unit increase in magnitude, earthquakes become about 10× rarer. A magnitude 7 is ten times less likely than a magnitude 6. But a magnitude 8, which releases about 32 times more energy than a 7, is only 10× rarer than a 7. In a bell-curve world, a quake that big would be essentially impossible. In the real world, it's just rare enough to be forgotten between occurrences.6
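
A small sketch makes the trade-off concrete. It assumes a Gutenberg-Richter slope of 1 (which is what "10× rarer per unit of magnitude" implies) and the standard scaling in which energy grows as 10^(1.5 × magnitude). The constant and function names are hypothetical.

```python
B_VALUE = 1.0        # assumed Gutenberg-Richter slope, matching the "10x rarer" rule
ENERGY_SLOPE = 1.5   # standard scaling: energy grows as 10 ** (1.5 * magnitude)

def relative_frequency(m_from: float, m_to: float) -> float:
    """How many times rarer quakes of at least m_to are than quakes of at least m_from."""
    return 10 ** (B_VALUE * (m_to - m_from))

def relative_energy(m_from: float, m_to: float) -> float:
    """How many times more energy a magnitude m_to quake releases than a magnitude m_from one."""
    return 10 ** (ENERGY_SLOPE * (m_to - m_from))

for m in (6, 7, 8):
    print(f"M{m} -> M{m + 1}: {relative_frequency(m, m + 1):.0f}x rarer, "
          f"{relative_energy(m, m + 1):.1f}x the energy")

# Expressed in energy E, frequency falls off as E ** -(B_VALUE / ENERGY_SLOPE): a power law.
print(f"tail exponent in energy: {B_VALUE / ENERGY_SLOPE:.2f}")
```

Under these assumptions the tail exponent in energy is about 0.67. That is below 1, so the rare giant quakes account for most of the total energy released.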

The Deadly Error

When you model a power-law phenomenon with a bell curve, you are telling yourself that extreme events are exponentially rarer than they actually are. This is the error that blew up Long-Term Capital Management, that made the 2008 financial crisis "impossible" according to the models, and that keeps surprising city planners when "hundred-year floods" happen every decade.

· · ·
Chapter 5

See It For Yourself

Mathematics is one of those fields where you can know a fact without feeling it. You can read that power-law tails are "fatter" and nod along without the visceral shock of watching it happen. So let's fix that.

The interactive tool below lets you place a bell curve and a power law side by side and watch what happens as you change their parameters. Pay special attention to the tails — the far-right region where extreme events live. Drag the sliders and see where the two distributions agree and where they violently disagree.

[Interactive: Distribution Visualizer. Compare bell curves and power laws side by side and watch what happens in the tails. Controls: bell curve σ (spread, initially 1.0), power law α (tail exponent, initially 2.0), view range x-max (initially 5). Readout: probability of X > x-max for the bell curve (normal) and the power law (Pareto).]

Notice what happens when you push the view range out to 15 or 20. The bell curve has effectively vanished — its probability is so small your screen can't even render it. Meanwhile the power law is still chugging along, still producing events, still dangerously present. That gap — between "essentially zero" and "rare but real" — is where Black Swans live.
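
If you'd rather see the numbers than the pixels, the sketch below computes the same kind of readout. It assumes a zero-mean normal with spread σ and a Pareto distribution with minimum value 1, which may not match the tool's exact definitions.

```python
import math

def normal_tail(x: float, sigma: float = 1.0) -> float:
    """P(X > x) for a zero-mean normal with standard deviation sigma."""
    return 0.5 * math.erfc(x / (sigma * math.sqrt(2)))

def pareto_tail(x: float, alpha: float = 2.0, x_min: float = 1.0) -> float:
    """P(X > x) for a Pareto distribution with minimum value x_min."""
    return 1.0 if x <= x_min else (x_min / x) ** alpha

for x_max in (5, 10, 15, 20):
    print(f"x-max = {x_max:>2}: bell curve {normal_tail(x_max):.2e}, "
          f"power law {pareto_tail(x_max):.2e}")
```

With σ = 1 and α = 2, the power law still gives a quarter of a percent at x-max = 20, while the bell curve's figure has dozens of zeros after the decimal point.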

Try the "Sample 1000 Points" button. Watch how the bell-curve dots cluster obediently near the center, while the power-law dots occasionally fling themselves off toward the edge of your screen. Those lonely red dots in the far right? Those are the bankruptcies, the pandemics, the market crashes. They're not anomalies. They're the point.
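
The sampling experiment is easy to reproduce offline. This sketch draws 1,000 points from each distribution with Python's `random` module; the fixed seed, σ = 1, α = 2, and the `sample_both` helper are all illustrative choices, not the tool's actual settings.

```python
import random

def sample_both(n: int = 1000, sigma: float = 1.0, alpha: float = 2.0, seed: int = 0):
    """Draw n points from a normal and from a Pareto (minimum value 1) distribution."""
    rng = random.Random(seed)
    normal = [abs(rng.gauss(0.0, sigma)) for _ in range(n)]  # distance from the center
    pareto = [rng.paretovariate(alpha) for _ in range(n)]
    return normal, pareto

normal, pareto = sample_both()
for name, points in (("bell curve", normal), ("power law", pareto)):
    largest = max(points)
    print(f"{name}: largest = {largest:.2f}, "
          f"share of total from largest point = {largest / sum(points):.1%}")
```

Change the seed a few times. The bell curve's largest point hardly varies, while the power law's largest point swings wildly from run to run.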

· · ·
Chapter 6

Black Swans and the Limits of Experience

Before Europeans reached Australia, every swan anyone had ever seen was white. A perfectly reasonable induction: all swans are white. Then in 1697, Dutch explorers sailed into Western Australia and found black ones. The entire edifice of "all swans are white" collapsed with a single observation.7

Taleb turned this into a framework for thinking about extreme events in power-law domains. A Black Swan has three properties: it's an outlier (nothing in the past convincingly pointed to its possibility), it carries extreme impact, and after the fact, we concoct explanations that make it seem predictable in retrospect.8

The critical insight is that Black Swans aren't bugs in the system. In a power-law world, they're features. They are the mathematically inevitable consequence of fat tails. If you live in Extremistan, rare-but-massive events aren't anomalies; they're the primary drivers of the aggregate. A handful of days can account for most of the stock market's annual return. A single earthquake can do most of a century's seismic damage. A single pandemic can kill more people than all the other diseases of the decade combined.

In 2023, Spotify had over 100 million tracks in its library. The top 0.2% of artists earned roughly 90% of all streaming revenue. The median artist earned less than $100 for the year. The mean was pulled so far upward by Taylor Swift and Bad Bunny that it described precisely nobody.9

This is what a power law feels like if you're living inside one. It doesn't feel like an interesting statistical observation. It feels like the universe is rigged.

City sizes tell the same story. There are about 19,000 incorporated cities in the United States. The median population is around 6,000. But New York City has 8.3 million people. The mean is dragged so far upward by a handful of mega-cities that it's meaningless as a description of a "typical" city. In Mediocristan, the mean and the median are friends. In Extremistan, they barely know each other.

· · ·
Chapter 7

So What Do We Do About It?

The first step is diagnosis. You need to figure out which world you're in. If you're measuring physical attributes of organisms (heights, weights, blood cell counts), you're probably in Mediocristan. If you're measuring anything involving networks, human decisions, or multiplicative processes (wealth, fame, city sizes, casualties), assume Extremistan until proven otherwise.

Here's a quick heuristic that works surprisingly well: can a single observation be larger than the sum of all the others? If the answer is no — if you can't imagine one person's height being taller than the combined height of everyone else in the room — you're in Mediocristan. If the answer is yes — one person's wealth can exceed the combined wealth of every other person in the room — you're in Extremistan.10
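
As a rough sketch, the heuristic can be turned into a check on a sample. The two rooms below are made up, and the helper names are hypothetical. A sample can only show that domination did happen, never that it can't, so treat a "no" as weak evidence.

```python
def largest_share(values: list[float]) -> float:
    """Fraction of the total contributed by the single largest observation."""
    return max(values) / sum(values)

def one_can_dominate(values: list[float]) -> bool:
    """True if the largest observation exceeds the sum of all the others."""
    return largest_share(values) > 0.5

# Illustrative, made-up rooms: heights in centimeters, wealth in dollars.
heights_cm = [165, 172, 180, 158, 272]  # 272 cm is about 8'11"
wealth_usd = [70_000, 55_000, 90_000, 62_000, 150_000_000_000]

print(one_can_dominate(heights_cm), f"{largest_share(heights_cm):.1%}")
print(one_can_dominate(wealth_usd), f"{largest_share(wealth_usd):.4%}")
```

Even with a record-setting giant in a room of five, the tallest person contributes under a third of the total height. The billionaire contributes essentially all of the wealth.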

The second step is humility. In Extremistan, your sample is never large enough. The most important event in your domain might not have happened yet. All of recorded financial history might not contain the worst crash that's coming. Your earthquake data going back 200 years might not include a single instance of the magnitude that's overdue. The past is a biased guide to the future when you live on a fat tail.

Practical Rules for Extremistan

1. Never use the average of a power-law distribution to predict individual outcomes.
2. Use the median, not the mean, as your measure of "typical."
3. Build systems that are robust to extremes you haven't yet observed.
4. Be deeply suspicious of any risk model that uses a Gaussian distribution for financial returns, insurance claims, or natural disaster magnitudes.
5. Remember that in Extremistan, the most important data point is the one you haven't seen yet.

Quetelet's bell curve was a genuine breakthrough — in its proper domain. The normal distribution is one of the most beautiful and useful objects in all of mathematics. But beauty has a dark side: it seduces you into thinking it applies everywhere. The real mathematical skill isn't knowing the formulas. It's knowing which formula to reach for. And in a world increasingly dominated by networks, platforms, and compounding effects, the answer is increasingly: not the bell curve.

Jordan Ellenberg writes that mathematics is "the extension of common sense by other means." Here's the common sense: some worlds are gentle, and some worlds are savage. The bell curve is the gentle world's portrait. The power law is the savage one's. And if you mistake one for the other, the results won't be a rounding error. They'll be a catastrophe.

Notes & References

  1. Adolphe Quetelet, A Treatise on Man and the Development of His Faculties (1835). Quetelet's concept of the "average man" was revolutionary in social statistics but later criticized for reifying a mathematical abstraction. For a sharp modern critique, see Todd Rose, The End of Average (2016).
  2. Vilfredo Pareto, Cours d'économie politique (1896). The 80/20 observation was later popularized by Joseph Juran as "the vital few and the trivial many."
  3. George Kingsley Zipf, Human Behavior and the Principle of Least Effort (1949). Zipf's law has been found in dozens of domains beyond linguistics, from protein expression levels to the population of craters on the Moon.
  4. Nassim Nicholas Taleb, The Black Swan: The Impact of the Highly Improbable (Random House, 2007). The Mediocristan/Extremistan framework appears in Part 1.
  5. On Black Monday (October 19, 1987), the Dow Jones Industrial Average fell 22.6% in a single day, an event that Gaussian models assigned a probability of roughly $10^{-148}$. For perspective, there are only about $10^{80}$ atoms in the observable universe.
  6. Beno Gutenberg and Charles F. Richter, "Frequency of earthquakes in California," Bulletin of the Seismological Society of America 34(4), 1944, pp. 185–188.
  7. The historical detail is from Willem de Vlamingh's 1697 expedition to the Swan River in Western Australia.
  8. Taleb's three-property definition of Black Swan events: outlier status, extreme impact, and retrospective (but not prospective) predictability. The Black Swan, Prologue.
  9. Exact streaming revenue figures are proprietary, but multiple analyses (e.g., Music Business Worldwide, 2023) confirm that the top fraction of a percent of artists capture the vast majority of plays. The shape is robustly Pareto.
  10. This "dominance" heuristic is adapted from Taleb's discussion in Antifragile (2012), Chapter 14. If one observation can dominate the sum, the distribution has no well-behaved mean.