All Chapters

The Missing Chapter

The Pigeonhole Principle

The simplest idea that proves impossible things

An extension of Jordan Ellenberg's "How Not to Be Wrong"

Chapter 83

Counting Hairs in London

Here is a fact about London that no one has ever checked, that no one needs to check, and that is as certain as anything in mathematics: right now, at this very moment, there are at least sixty people in London who have exactly the same number of hairs on their head.

Not "roughly the same." Not "probably the same." Exactly the same. Down to the last follicle.

The argument is almost embarrassingly simple. London has about 9 million residents. A human head carries at most about 150,000 hairs — and that's being generous to the most luxuriantly maned among us.1 So there are 150,001 possible hair-counts (from 0 to 150,000), and 9,000,000 heads. Divide 9,000,000 by 150,001 and you get 60. At least one hair-count must be shared by at least 60 Londoners. Pure logic. No microscopes required.

This is the pigeonhole principle, one of the oldest and most useful ideas in combinatorics, and it says something so obvious it hardly feels like it deserves a name: if you put more pigeons into fewer holes, at least one hole must contain more than one pigeon.

The German mathematician Peter Gustav Lejeune Dirichlet formalized this in 1834, calling it the Schubfachprinzip — the "drawer principle," as in: if you have more socks than drawers, some drawer has at least two socks.2 The French call it the principe des tiroirs. The English-speaking world, with characteristic love of birds, settled on "pigeonhole."

It feels like nothing. A tautology dressed up in a name. But here's the thing about the pigeonhole principle: it has teeth. It proves things that sound like they should require sophisticated machinery — deep probability theory, elaborate constructions, exhaustive searches — but actually follow from nothing more than the fact that you can't fit three objects into two boxes without doubling up somewhere.

Hole 1 Hole 2 Hole 3 Hole 4 1 2 3 4 5 5 pigeons, 4 holes Collision guaranteed
No matter how you assign five pigeons to four holes, at least one hole must double up. Dirichlet's principle, 1834.
· · ·

The Explorer

The principle becomes visceral when you watch it happen. Add pigeons below and try — really try — to avoid a collision. You can't. Mathematics won't let you.

Pigeonhole Explorer

All clear — each hole has at most one pigeon.
· · ·

Birthdays, Friendships, and Certainty

You've probably heard of the birthday paradox: in a room of just 23 people, there's a better-than-even chance that two share a birthday. It's a lovely result, but it's probabilistic. With 23 people, there's still a 49.3% chance everyone's birthday is different. You might get lucky.3

The pigeonhole principle doesn't deal in luck. It deals in inevitability. There are 366 possible birthdays (counting February 29). So gather 367 people — any 367 people, in any city, from any walk of life — and two of them must share a birthday. Not probably. Must. The 367th person walks through the door and the universe has no choice: every day is already taken.

The birthday paradox tells you when coincidences become likely. The pigeonhole principle tells you when they become logically unavoidable. Both are useful. But there's something uniquely satisfying about the latter — a kind of mathematical certainty that feels almost indecent in a world full of ambiguity.

Here's another one. Think about the social network of New York City — about 8.3 million people. Each person has some number of friends within the city, ranging from 0 (a true hermit) to 8,299,999 (an impossibly gregarious soul who knows literally everyone else). That's 8.3 million possible friend-counts. It looks like there might be enough pigeonholes for everyone. But wait: if someone has 0 friends, then no one can have 8,299,999 friends (because that would require being friends with the hermit). The extremes exclude each other. So the actual number of possible friend-counts is at most 8,299,999 — one fewer than the population. Pigeonhole: two New Yorkers must have exactly the same number of friends.4

The Friendship Theorem

In any group of n people, at least two must have the same number of friends within the group. This isn't sociology — it's arithmetic. The extremes (0 friends and n−1 friends) are mutually exclusive, leaving only n−1 possible values for n people.

· · ·

Why Decimals Repeat

You learn in school that the decimal expansion of any fraction eventually repeats: 1/7 = 0.142857142857…, 1/3 = 0.333…, 1/11 = 0.090909…. Teachers state this as a fact. Few explain why.

The pigeonhole principle explains why. When you perform long division of, say, 1 by 7, at each step you get a remainder. The possible remainders are 0, 1, 2, 3, 4, 5, and 6 — that's 7 values. If you ever get a remainder of 0, the decimal terminates. Otherwise, after at most 7 steps, you must see a remainder you've seen before (7 pigeons, 6 non-zero holes). And once a remainder repeats, the entire subsequent sequence of digits must repeat too, because long division is deterministic: same remainder, same next digit, same next remainder, forever.

Long division of 1 ÷ 7: remainders must cycle r = 1 → .1 r = 3 → .14 r = 2 → .142 r = 6 → .1428 r = 4 → .14285 r = 5 → .142857 r = 1! → loop After 6 steps, remainder 1 reappears — cycle begins anew Only 6 nonzero remainders possible (pigeonholes) → repetition is inevitable
Long division of 1÷7 produces remainders from a finite set. After at most 6 steps, a remainder must repeat, and the decimal cycle locks in forever.

This is the pigeonhole principle in its purest form: a finite number of possibilities, an inexhaustible process, and therefore a guaranteed repetition. It's the same reason any sequence generated by a finite-state machine — a pseudorandom number generator, a simple cellular automaton — must eventually loop. Finite pigeonholes. Infinite pigeons. QED.

· · ·
The Generalized Form

Beyond Two-in-a-Box

The basic principle says n + 1 pigeons into n holes forces a double-up. The generalized pigeonhole principle sharpens this: if you distribute kn + 1 objects into n containers, at least one container must hold at least k + 1 objects.

Generalized Pigeonhole Principle
kn + 1 objects into n boxes some box has ≥ k + 1
This is why the London hair-count argument gives 60, not just 2.

Back to London: 9,000,000 people, 150,001 possible hair-counts. We can write 9,000,000 = 59 × 150,001 + 149,941. So k = 59, and the generalized principle guarantees at least 60 people sharing one hair-count. Not 2. Not "a few." Sixty. And really, it's probably many more — the principle gives you a floor, not a ceiling.5

· · ·

The Compression Paradox

Now let me show you something that every programmer should know and most don't. People speak casually about "compressing data" as if any file can be made smaller. ZIP, GZIP, LZ4 — the tools feel like magic, turning bloated files into sleek archives. But here's a mathematical fact that no algorithm can escape:

No lossless compression algorithm can make every file shorter.

The proof is the pigeonhole principle, and it fits in a paragraph. Consider all possible files of exactly n bits. There are 2n such files. Now suppose a magical compressor could shrink every one of them to fewer than n bits. The compressed outputs would all be files of at most n − 1 bits. How many such files are there? At most 20 + 21 + … + 2n−1 = 2n − 1. That's one fewer output than inputs. Pigeonhole: two different input files must compress to the same output. But then decompression can't tell them apart, violating "lossless." Contradiction.6

So for every file a compressor manages to shrink, there must exist another file it makes longer (or leaves unchanged). Compression doesn't destroy information; it rearranges it, spending its budget on the files that look like the data it expects (text, images, structured data) and punishing the files that don't (random noise, already-compressed archives). The pigeonhole principle is what makes this a theorem rather than a heuristic.

Compression Paradox Explorer

All 3-bit strings (8 total) must be mapped to shorter outputs for "compression." But there are only 7 possible shorter strings. Try building a mapping — the pigeonhole principle will stop you.

· · ·

Hash Collisions and the Internet

Every time you log into a website, every time your browser verifies an SSL certificate, every time a blockchain validates a transaction, a hash function is at work — a function that takes an input of arbitrary size and maps it to a fixed-size output. SHA-256, for instance, maps any file to a 256-bit string.

There are infinitely many possible inputs. There are exactly 2256 possible outputs. Pigeonhole: collisions are guaranteed. Different inputs that produce the same hash must exist. The entire security of modern cryptography doesn't rest on the assumption that collisions don't exist — they mathematically must — but on the assumption that they're computationally infeasible to find.7

This is the pattern that makes the pigeonhole principle so quietly powerful. It doesn't tell you which two inputs collide. It doesn't tell you how to find them. It just tells you, with the serene confidence of a mathematical truth, that they're out there. Somewhere in the impossibly vast space of all possible documents, there is a PDF of Shakespeare's sonnets and a PDF of your tax return that hash to the same SHA-256 value. Finding that pair would take longer than the age of the universe. But logic guarantees their existence.

· · ·
The Big Picture

Ramsey Theory and the Ultimate Pigeonhole

The pigeonhole principle is a seed. Ramsey theory is the tree that grows from it.

In 1930, the British mathematician Frank Ramsey proved something extraordinary: in any sufficiently large structure, order is unavoidable. Color the edges of a complete graph on 6 vertices with two colors — red and blue. Ramsey proved that you are guaranteed to find a monochromatic triangle: three vertices with all connecting edges the same color.8

R(3,3) = 6: order emerges from chaos However you 2-color the edges, a same-color triangle must appear
Ramsey's theorem guarantees a monochromatic triangle among 6 vertices. The highlighted red triangle cannot be avoided — disorder has limits.

The proof is elegant and rests, at its heart, on the pigeonhole principle. Pick any vertex. It has 5 edges leading to the other vertices. Those 5 edges are colored with 2 colors. By the pigeonhole principle, at least 3 of the edges must be the same color — say red. Now look at the three vertices on the other end of those red edges. If any edge between those three vertices is also red, you've got a red triangle. And if none of them are red, then all three edges between them are blue — and you've got a blue triangle. Either way: monochromatic triangle. Guaranteed.

Ramsey theory generalizes this insight to dizzying heights: larger structures, more colors, higher dimensions. The Ramsey numbers grow so fast that determining even small ones is computationally intractable — Erdős famously quipped that if aliens demanded we compute R(5,5) or be destroyed, we should muster all our resources to compute it, but if they demanded R(6,6), we should just fight the aliens instead.9 But the qualitative result always holds: enough structure guarantees patterns. It's the pigeonhole principle scaled up to the sublime.

· · ·

The Moral

The pigeonhole principle is not deep mathematics. It's barely even a theorem — more like a statement too obvious to deny. And that's precisely what makes it important. In mathematics, as in life, the most powerful ideas are often the ones that feel trivially obvious once stated but whose consequences are anything but trivial.

When someone tells you that a compression algorithm can shrink every file, the pigeonhole principle says no. When someone wonders whether two people in a huge city might share some obscure property, the pigeonhole principle says yes, and here's how many. When a cryptographer designs a hash function, the pigeonhole principle is the adversary they already know exists — the guarantee of collisions they must engineer around rather than hope away.

The pigeonhole principle teaches something Ellenberg cares about deeply: that certainty is possible, even in a messy and ambiguous world — if you know where to look for it.

It's counting. That's all. Just counting. But counting, done carefully, is a form of proof. And proof, in a world drowning in probabilistic hedging and Bayesian updates and "well, it depends," is a rare and beautiful thing.

Notes & References

  1. The average human head has about 100,000 hairs, with blondes averaging closer to 150,000 and redheads about 90,000. The upper bound of 150,000 is generous. See: Krause, K. & Foitzik, K. "Biology of the Hair Follicle," Seminars in Cell & Developmental Biology, 2006.
  2. Dirichlet used the principle (without naming it) in his 1834 paper on Diophantine approximation, proving that for any irrational number α, there are infinitely many fractions p/q with |α − p/q| < 1/q². The name Schubfachprinzip was attached later.
  3. The birthday paradox calculation: P(all different) = 365/365 × 364/365 × … × 343/365 ≈ 0.4927 for 23 people. See Flajolet, P. & Sedgewick, R., Analytic Combinatorics, Cambridge University Press, 2009, §VI.
  4. This argument appears in many combinatorics textbooks. A nice exposition is in Lovász, L., Combinatorial Problems and Exercises, North-Holland, 1993.
  5. The generalized pigeonhole principle: if N objects are placed into k containers, some container has at least ⌈N/k⌉ objects. The hair-count bound of 60 = ⌈9,000,000/150,001⌉.
  6. This is sometimes called the "counting argument" against universal compression. For a thorough treatment, see Cover, T. & Thomas, J., Elements of Information Theory, Wiley, 2006, Chapter 5.
  7. SHA-256 has a 256-bit output space, meaning 2256 ≈ 1.16 × 1077 possible hashes. The birthday attack finds collisions in expected O(2128) operations, still far beyond current computational reach. See Menezes, A., van Oorschot, P. & Vanstone, S., Handbook of Applied Cryptography, CRC Press, 1996.
  8. Ramsey, F.P. "On a Problem of Formal Logic," Proceedings of the London Mathematical Society, 1930. The result was originally a lemma in a paper about mathematical logic — the combinatorial applications came later.
  9. The quote is widely attributed to Erdős. As of 2024, R(5,5) is known only to lie between 43 and 48. R(6,6) is between 102 and 165. See Radziszowski, S. "Small Ramsey Numbers," Electronic Journal of Combinatorics, Dynamic Survey DS1.