
The Missing Chapter

Listening for the Signal in the Static

The world is full of noise pretending to be meaning. The hardest skill in mathematics — and in life — is learning to tell the difference.

An extension of Jordan Ellenberg's "How Not to Be Wrong"

Chapter 1

The Stockbroker Who Could Predict the Future

You receive a letter from a stockbroker. "Next week," it says, "the market will go up." You throw it away. But the market does go up. The next week, another letter: "The market will go down." It goes down. This happens ten weeks in a row. The eleventh letter arrives with an invitation to invest $10,000 in the broker's fund, and by now, some part of your brain is screaming: this person can see the future.

Here's the trick. In week one, the broker sent 10,240 letters — half predicting "up," half predicting "down." After the market went up, the 5,120 people who got the wrong prediction never heard from the broker again. The remaining 5,120 got a new letter. Half said "up," half said "down." Ten rounds later, exactly 10 people have received ten correct predictions in a row. They're astonished. They invest. The broker has manufactured a perfect signal out of pure noise.
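The arithmetic of the scam is easy to check. Here is a minimal sketch (the function name is mine; the numbers are from the story above):

```python
def scam_survivors(n_letters: int, n_rounds: int) -> int:
    """Count recipients who see a correct prediction every round.

    Each round the market goes up or down; whichever happens, the half
    of remaining recipients whose letter guessed wrong never hear from
    the broker again, so the pool simply halves.
    """
    survivors = n_letters
    for _ in range(n_rounds):
        survivors //= 2
    return survivors

print(scam_survivors(10_240, 10))  # -> 10 astonished marks
```

Note that the market's actual behavior never enters the calculation: whichever way it moves, exactly half the letters were right.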

This isn't a thought experiment. Variants of this scam have appeared for decades, from penny-stock tipsters to sports-betting Telegram channels. The scheme works because humans are exquisitely, almost pathologically, tuned to detect patterns — even where no pattern exists. We evolved in a world where the rustle in the grass really might be a predator. The cost of a false positive (running from nothing) was negligible; the cost of a false negative (ignoring a lion) was death. So our brains developed a shoot-first-ask-questions-later pattern detector, and ten thousand years later, we're still using it to interpret stock charts.

The problem isn't that we look for patterns. The problem is that we're too good at finding them. And in a world flooded with data, this ancient gift has become a modern curse.

· · ·
Chapter 2

What Even Is Noise?

Let's be precise. A signal is the part of the data that tells you something true about the world — the underlying pattern, the systematic component, the thing you'd still see if you could measure it a thousand times. Noise is everything else: measurement error, random fluctuation, sampling variation, the butterfly that flapped its wings in Borneo.

Every dataset you have ever encountered is a mixture of signal and noise. When a poll says the president's approval rating is 43%, the signal might be 44% and the noise pushed it down a point. When your bathroom scale says you gained two pounds overnight, the signal (your actual weight change) might be zero, and the noise (water retention, time of day, where you stood on the scale) accounts for the rest. When a school district reports that test scores rose 5% after implementing a new curriculum, the signal — the genuine effect of the curriculum — could be anywhere from −2% to +12%, with noise filling in the gap.

Signal + Noise = What You See
Every observation you make is a cocktail of truth and randomness. The challenge is separating one from the other.

The signal-to-noise ratio (SNR) tells you how much of what you're seeing is real. In physics and engineering, it's a precise quantity: the power of the signal divided by the power of the noise, often measured in decibels. In everyday reasoning, it's a metaphor — but a dangerously useful one. A dataset with high SNR is one where the signal shines through clearly; a dataset with low SNR is a minefield where almost every "pattern" you see is probably noise.

Signal-to-Noise Ratio
SNR = P_signal / P_noise
When the ratio is large, patterns are trustworthy. When it's close to 1 — or below — you're mostly reading tea leaves.
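To make the ratio concrete, here's a toy example (all numbers are my own illustrative choices): a constant signal observed through Gaussian noise. Signal power is the squared mean; noise power is the variance.

```python
import random
import statistics

rng = random.Random(42)
signal = 5.0        # the true, systematic value
noise_sd = 2.0      # standard deviation of the measurement noise

# 100,000 noisy observations of the same underlying signal
samples = [signal + rng.gauss(0, noise_sd) for _ in range(100_000)]

snr_true = signal**2 / noise_sd**2                 # 25 / 4 = 6.25
snr_est = statistics.fmean(samples)**2 / statistics.variance(samples)
print(f"true SNR = {snr_true}, estimated SNR = {snr_est:.2f}")
```

With enough observations the estimate converges on the true ratio; in real life you get the left-hand column of the data and have to guess at the right.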

Here's the thing most people miss: you usually don't know the ratio. You see the noisy data. You don't see the signal and the noise separately, with helpful color-coding. So everything comes down to how you interpret what you see — and that's where we go catastrophically wrong.

· · ·
Chapter 3

Overfitting: The Art of Believing Too Hard

In 2008, Google launched Flu Trends, a system that used search queries to predict flu outbreaks faster than the CDC could. It was brilliant, it was celebrated in Nature, and for a few years, it worked beautifully. Then it started overestimating flu prevalence — by as much as 140%. What went wrong?

Google Flu Trends had committed the cardinal sin of data science: it had overfit the data. The algorithm had found 45 search terms that correlated with flu outbreaks in the training data, but some of those correlations were coincidental — artifacts of noise. "High school basketball" correlated with flu in the training data because basketball season and flu season overlap. But when the timing of basketball searches shifted, the model's predictions went haywire.1

Overfitting is what happens when you mistake noise for signal and then build a model around your mistake. It's like a student who memorizes every answer on a practice test instead of learning the underlying concepts. On the practice test, they score 100%. On the real test, they bomb — because the specific questions changed, and they never learned the general principles.

The more flexible your model — the more parameters it has, the more degrees of freedom — the better it can fit the noise in any particular dataset. A straight line through a scatterplot might miss some nuance, but a 20th-degree polynomial can snake through every single data point, hitting each one perfectly. The line is underfit; the polynomial is overfit. The truth is almost always somewhere in between.

Higher-degree polynomials chase the noise. The true signal is a gentle curve; everything beyond that is wishful thinking.

Picture overfitting in action. The true underlying relationship is a gentle cubic curve; the data points are that curve plus random noise. A linear fit (degree 1) misses the curve; a cubic fit (degree 3) captures the shape nicely; and a 15th-degree polynomial writhes through every data point like a snake with anxiety. It has perfect training accuracy and zero understanding.
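You can reproduce the contrast with nothing but the standard library. In the sketch below (curve, noise level, and sample points are all my own illustrative choices), the "overfit" model is a polynomial forced through every training point via Lagrange interpolation, so its degree is the number of points minus one; the "underfit" model is an ordinary least-squares line.

```python
import random

rng = random.Random(2)
truth = lambda x: x * x                     # the gentle underlying curve

# 9 noisy training points on [-4/3, 4/3]
train = [(k / 3, truth(k / 3) + rng.gauss(0, 0.3)) for k in range(-4, 5)]

def interpolant(x):
    """Degree-8 polynomial through *every* training point (overfit)."""
    total = 0.0
    for i, (xi, yi) in enumerate(train):
        term = yi
        for j, (xj, _) in enumerate(train):
            if i != j:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

def line(x):
    """Ordinary least-squares straight line (underfit for a curve)."""
    n = len(train)
    sx = sum(p[0] for p in train); sy = sum(p[1] for p in train)
    sxx = sum(p[0] ** 2 for p in train); sxy = sum(p[0] * p[1] for p in train)
    b = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    a = (sy - b * sx) / n
    return a + b * x

def mse(model, data):
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)

# Evaluate against the noiseless truth on a fresh grid
test_pts = [(x / 50, truth(x / 50)) for x in range(-60, 61)]
print("train MSE:", mse(interpolant, train), mse(line, train))
print("test  MSE:", mse(interpolant, test_pts), mse(line, test_pts))
```

The interpolant's training error is exactly zero, because it memorized the noise; off the training set, it pays for that memorization.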

This isn't just a data-science problem. It's a human thinking problem. We overfit our lives all the time. You eat sushi before a job interview and get the offer, so now sushi is your "lucky food." Your team wins when you wear a certain jersey, so the jersey has magical powers. You try a new management technique and sales improve, so the technique obviously works. In each case, you've built a story that perfectly explains the past data — but the story is molded around noise, not signal, and it will fail you eventually.

· · ·
Chapter 4

The Noise Bottleneck

There's a profound asymmetry in the signal-noise problem: adding data helps, but not as fast as you'd like.

If you flip a coin 10 times and get 7 heads, you might wonder if it's biased. But 7 out of 10 is entirely within the range of a fair coin — the noise is huge relative to any signal. Flip it 100 times and get 70 heads? Now something is clearly wrong. The noise has shrunk, and the signal (a biased coin) shines through.

The key mathematical fact is this: the noise in an average shrinks as the square root of the sample size. To cut the noise in half, you need four times as much data. To cut it by a factor of ten, you need a hundred times as much. This is the famous 1/√n law, and it explains why so many fields of inquiry are stuck in what I'll call the noise bottleneck.

The Square Root Law
Standard Error = σ / √n
Doubling your precision requires quadrupling your data. Diminishing returns are baked into the mathematics.
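A quick simulation confirms the law. Below we estimate the standard error of the sample mean of fair-coin flips at several sample sizes (the sample sizes and trial count are arbitrary choices of mine); quadrupling n should halve the error.

```python
import random
import statistics

rng = random.Random(0)

def se_of_mean(n, trials=2_000):
    """Empirical standard deviation of the mean of n fair-coin flips."""
    means = [sum(rng.random() < 0.5 for _ in range(n)) / n
             for _ in range(trials)]
    return statistics.pstdev(means)

for n in (25, 100, 400):
    # theory: sigma / sqrt(n) = 0.5 / sqrt(n)
    print(f"n = {n:3d}: empirical SE ~ {se_of_mean(n):.3f}, "
          f"theory = {0.5 / n ** 0.5:.3f}")
```

Each fourfold jump in sample size buys only a twofold improvement in precision, exactly as the formula predicts.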

Consider nutrition science. You want to know if eating blueberries reduces the risk of heart disease. The true effect, if it exists, is probably small — maybe a 3% reduction in risk. But the noise in any individual's health outcome is enormous: genetics, exercise, stress, sleep, a thousand other dietary factors. To detect a 3% signal amid all that noise, you'd need a randomized trial with tens of thousands of participants followed for decades. The noise bottleneck means that small but real effects are extraordinarily expensive to verify — and the temptation to declare victory with underpowered studies is irresistible.2

This is why headline-grabbing nutrition findings so often contradict each other. "Red wine prevents cancer!" "Red wine causes cancer!" Both studies were probably too small to see through the noise. Each one found a flicker and mistook it for a flame.

The Noise Bottleneck Principle

In a noisy domain, the first 80% of apparent understanding comes quickly. The last 20% — the part that separates real insight from superstition — requires exponentially more data, patience, and humility than most people can muster.

· · ·
Chapter 5

Can You Tell Real from Random?

One of the most reliable findings in cognitive psychology is that humans are terrible at recognizing true randomness. If you ask people to write down a "random" sequence of coin flips, they produce something far too orderly — too many alternations, not enough long streaks. Real randomness is clumpier than people expect.

This has a direct consequence for signal detection. When you see a streak in real data — five good quarters in a row, a basketball player making six consecutive three-pointers, a stock rising eight days straight — your brain flags it as meaningful. "Something systematic must be going on." But in truly random data, streaks like these are not just possible; they're inevitable. In 200 coin flips, you'll almost certainly see a run of 7 or 8 identical outcomes in a row. That's not signal. That's just what noise looks like.3
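The claim about streaks can be checked directly. The simulation below (the parameters simply mirror the text) measures the longest run of identical outcomes in 200 fair flips, repeated many times:

```python
import random

def longest_run(flips):
    """Length of the longest streak of identical consecutive outcomes."""
    best = cur = 1
    for prev, nxt in zip(flips, flips[1:]):
        cur = cur + 1 if prev == nxt else 1
        best = max(best, cur)
    return best

rng = random.Random(0)
runs = [longest_run([rng.random() < 0.5 for _ in range(200)])
        for _ in range(10_000)]

avg = sum(runs) / len(runs)
frac7 = sum(r >= 7 for r in runs) / len(runs)
print(f"average longest run: {avg:.1f}")
print(f"share of sequences with a run of 7+: {frac7:.0%}")
```

A large majority of purely random sequences contain a streak of seven or more, which is precisely the kind of "pattern" our brains flag as meaningful.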

Here's a classic test. Generate one sequence of coin flips with a fair random process, and have a human write a second sequence while trying to "look random." Then ask people to pick out the real one.

Most people find this surprisingly hard. The random sequence tends to have longer streaks and more apparent "patterns" than the human-generated one, because humans unconsciously avoid streaks — they think randomness should look balanced. But it doesn't. Noise is wild and clumpy and full of things that look like patterns. That's the whole problem.

· · ·
Chapter 6

The Paradox of Big Data

You might think the solution is obvious: get more data. We live in the age of big data, after all. Surely with enough observations, the signal will emerge?

Not necessarily. More data helps if the data is measuring what you think it's measuring. But in practice, big data introduces its own noise problems. When you have millions of data points and thousands of variables, the number of possible correlations is astronomical — and many of them will be statistically significant by pure chance.

Tyler Vigen's famously absurd collection of "spurious correlations" illustrates this perfectly. The divorce rate in Maine correlates almost perfectly (r = 0.99) with the per capita consumption of margarine. U.S. spending on science correlates with suicides by hanging. These aren't real relationships — they're what happens when you search through enough variables. With enough hay, you'll always find a needle, even if nobody put one there.4

The Multiple Comparisons Trap

Variables tested    Expected false discoveries (at p = 0.05)
20                  1
100                 5
500                 25
1,000               50
10,000              500

At the standard p = 0.05 threshold, 5% of tests on pure noise come up significant by chance, so testing 10,000 noise variables yields, on average, about 500 "significant" results that are nothing but noise.

This is the multiple comparisons problem, and it's one of the central crises of modern science. When researchers test many hypotheses on the same dataset — or when they "explore" the data until something turns up significant — they're fishing in a lake of noise. The fish they catch are often made of static.
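The trap is easy to demonstrate. In the sketch below, every "variable" is pure noise (400 fair coin flips), and each is tested for bias at p = 0.05 with a standard two-sided normal-approximation z-test (the variable counts and flip counts are my own choices):

```python
import math
import random

rng = random.Random(0)

def two_sided_p(heads, n):
    """Normal-approximation p-value for 'is this coin biased?'"""
    z = (heads - n / 2) / math.sqrt(n / 4)
    return math.erfc(abs(z) / math.sqrt(2))

n_vars, n_flips = 10_000, 400
false_positives = 0
for _ in range(n_vars):
    heads = bin(rng.getrandbits(n_flips)).count("1")   # pure noise
    if two_sided_p(heads, n_flips) < 0.05:
        false_positives += 1

print(f"'significant' discoveries among {n_vars} pure-noise variables:",
      false_positives)   # roughly 5% of 10,000
```

Every single one of those "discoveries" is a coin that was, by construction, perfectly fair.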

The statistician John Tukey put it beautifully: "The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data."5

· · ·
Chapter 7

Nate Silver's Razor

In his book The Signal and the Noise, Nate Silver draws a sharp distinction between fields where prediction works and fields where it doesn't. Weather forecasting has improved enormously over the past fifty years — five-day forecasts today are as accurate as one-day forecasts were in the 1980s. Earthquake prediction, by contrast, remains essentially impossible. Why?6

The answer comes down to signal-to-noise ratio and the nature of the system. Weather is chaotic but measurable: we have satellites, weather stations, ocean buoys, and atmospheric models that capture the signal well enough for short-term prediction. Earthquakes are driven by processes deep underground that we can barely observe, making the noise effectively infinite relative to any predictive signal.

Silver proposes what we might call Silver's Razor: before trying to predict something, ask whether the domain has enough signal to support prediction at all. Fields like baseball statistics and weather have enough signal — the data is structured, the sample sizes are large, and the underlying systems are at least partially understood. Fields like macroeconomic forecasting, political punditry, and earthquake prediction are noise-dominated, and anyone who claims high accuracy in these domains is either deluded or selling something.

"The signal is the truth. The noise is what distracts us from the truth."
— Nate Silver

This is where intellectual humility becomes a technical skill. The expert who says "I don't know" in a noisy domain is more calibrated than the pundit who offers confident predictions. Knowing the noise level of your domain isn't pessimism — it's accuracy.

· · ·
Chapter 8

The Signal-to-Noise Ratio of Your Life

Once you start thinking in terms of signal and noise, you see the distinction everywhere — and you start making better decisions.

Consider hiring. A thirty-minute job interview is almost entirely noise. The candidate's nervousness, the interviewer's mood, whether the conversation happened to land on a topic the candidate knows well — these random factors swamp any real signal about job performance. Studies consistently show that unstructured interviews are among the worst predictors of actual job success.7 Google famously found that after four interviews, additional rounds added essentially nothing to its ability to predict a candidate's later performance; the extra data was pure noise.

Or consider the news. On any given day, the stock market moves up or down by some amount, and a financial journalist writes a story explaining why. "Markets fell on concerns about China." "Markets rose on optimism about earnings." But short-term market movements are almost entirely noise — random fluctuations driven by the chaotic interaction of millions of traders. The journalist is writing explanations for random numbers, and you're reading them as if they contain insight.

The signal-noise framework gives you a simple but powerful question to ask about any claim, any pattern, any story: Is there enough signal here to justify the conclusion? Sometimes the answer is yes. A clinical trial with 10,000 participants and a clear result — that's signal. A friend's anecdote about how acupuncture cured their back pain — that's a sample size of one, buried in noise.

Before you believe a pattern, ask:

1. Sample size: Is it big enough for the effect to be distinguishable from noise?

2. Replication: Has anyone else found the same pattern independently?

3. Mechanism: Is there a plausible reason for the relationship, or is it just a correlation?

4. Multiple comparisons: How many patterns were tested before this one was declared significant?

5. Effect size: Even if real, is the effect large enough to matter?

Mathematics, at its best, gives you tools to see the world more clearly. The signal-noise distinction isn't just a technical concept from electrical engineering — it's a way of thinking that applies every time you encounter data, make a judgment, or form a belief. The world is full of noise pretending to be signal, and the only defense is the discipline to ask: How much of what I'm seeing is real?

The answer, more often than you'd like, is: less than you think. And that's not a reason for despair. It's a reason for humility — which, in mathematics as in life, is the beginning of wisdom.

Notes & References

  1. David Lazer et al., "The Parable of Google Flu: Traps in Big Data Analysis," Science 343, no. 6176 (2014): 1203–1205. The study documented how GFT overestimated flu prevalence by 50% over a two-year period, largely due to overfitting to seasonal search terms.
  2. John P. A. Ioannidis, "Why Most Published Research Findings Are False," PLOS Medicine 2, no. 8 (2005): e124. Ioannidis's landmark paper showed that in fields with small sample sizes and small effect sizes, the majority of "significant" findings are likely false positives — noise masquerading as signal.
  3. For the mathematics of runs and streaks in random sequences, see Mark F. Schilling, "The Longest Run of Heads," The College Mathematics Journal 21, no. 3 (1990): 196–207. The expected longest run in n fair coin flips is approximately log₂(n).
  4. Tyler Vigen, Spurious Correlations (New York: Hachette, 2015). The companion website tylervigen.com continues to generate new examples of meaningless correlations between unrelated time series.
  5. John W. Tukey, "Sunset Salvo," The American Statistician 40, no. 1 (1986): 72–76.
  6. Nate Silver, The Signal and the Noise: Why So Many Predictions Fail — but Some Don't (New York: Penguin, 2012). Silver's taxonomy of predictive domains by signal-to-noise ratio remains one of the clearest frameworks for understanding why some fields forecast well and others don't.
  7. Frank L. Schmidt and John E. Hunter, "The Validity and Utility of Selection Methods in Personnel Psychology," Psychological Bulletin 124, no. 2 (1998): 262–274. Unstructured interviews had a validity coefficient of only 0.38 for predicting job performance.