
The Missing Chapter

Newcomb's Problem

The puzzle that broke decision theory in half — and nobody can agree how to pick it up.

An extension of Jordan Ellenberg's "How Not to Be Wrong"

Chapter 28

The Problem That Won't Go Away

In 1969, the philosopher Robert Nozick did something unusual. He presented a simple thought experiment to a room full of brilliant people — and watched them tear each other apart.

The setup is almost childishly simple. There are two boxes in front of you. Box A is transparent and contains $1,000. Box B is opaque. You can either take both boxes, or take only Box B. That's it. That's the whole decision.

The catch — and there's always a catch — is that a supremely accurate predictor called Omega has already made a prediction about what you'll do. If Omega predicted you'd take only Box B, it placed $1,000,000 inside. If Omega predicted you'd take both boxes, it left Box B empty.1

Omega has done this thousands of times before. Its track record is extraordinary — say, 99% accurate. The prediction has already been made. The money is already in the boxes or not. Now you choose.

So: do you take one box, or two?


The setup: Box A always has $1,000. Box B has $1M or nothing — decided before you choose.

The Case for Two Boxes

The two-boxer's argument is crisp and logically airtight. At the moment you choose, the money in Box B is either there or it isn't. Omega made its prediction yesterday. Your choice now can't retroactively change what's in the box. So consider your two scenarios:

If Box B contains $1,000,000: taking both gets you $1,001,000. Taking only B gets you $1,000,000. Both is better by $1,000.

If Box B contains nothing: taking both gets you $1,000. Taking only B gets you $0. Both is better by $1,000.

No matter what, taking both boxes gets you an extra $1,000. This is the dominance principle — one of the bedrock axioms of rational choice. You'd have to be crazy to leave free money on the table.2
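The two scenarios can be checked mechanically. A minimal sketch in Python, with the payoffs taken straight from the setup:

```python
# Payoff for each (state of Box B, action) pair, from the setup above.
payoffs = {
    ("full",  "two-box"): 1_001_000,
    ("full",  "one-box"): 1_000_000,
    ("empty", "two-box"): 1_000,
    ("empty", "one-box"): 0,
}

# Dominance: in every state of the world, two-boxing pays exactly $1,000 more.
for state in ("full", "empty"):
    edge = payoffs[(state, "two-box")] - payoffs[(state, "one-box")]
    print(f"Box B {state}: two-boxing is better by ${edge:,}")  # $1,000 both times
```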

The Case for One Box

The one-boxer looks at the two-boxer with pity. "Yes, yes, very clever," they say. "Now look at the scoreboard."

People who take one box walk away with $1,000,000 almost every time. People who take both boxes walk away with $1,000 almost every time. Omega is right 99% of the time, remember? If you're the kind of person who takes both boxes, Omega almost certainly predicted that, and Box B is empty. You get your precious $1,000. Congratulations.

The one-boxer's logic is simple: be the kind of person Omega predicts will take one box. Those people are millionaires. The others are holding a thousand bucks and a philosophy degree.3

"I have put this problem to a large number of people… To almost everyone it is perfectly clear and obvious what should be done. The difficulty is that these people seem to divide almost evenly on the problem, with large numbers thinking that the opposing half is just being silly."
— Robert Nozick, 1969
· · ·
Chapter 28.1

Play the Game

Theory is nice. Let's get empirical. Below you can play Newcomb's Problem against a simulated predictor. Adjust its accuracy and see what happens over many rounds. If the dominance argument is right, two-boxing should always be better. If the one-boxers are right, the predictor's accuracy should matter — a lot.

🎲 Newcomb's Box Game

[Interactive game: slide the predictor's accuracy (default 90%), choose one box or two each round, and watch the running tallies of rounds played, total earned, average per round, and wins for each strategy.]

Try this

Play 20+ rounds with each strategy. Then slide the accuracy down to 50% (coin-flip predictor) and try again. Notice when the two-boxer starts winning? That's the crux: the argument for one-boxing depends entirely on the predictor being good. The argument for two-boxing doesn't depend on the predictor at all.
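If you can't run the interactive game, the same experiment is easy to simulate. A sketch (the `play` helper and its parameters are illustrative, not the game's actual code):

```python
import random

def play(strategy: str, accuracy: float, rounds: int, seed: int = 0) -> float:
    """Average payoff per round for a fixed strategy against a noisy predictor."""
    rng = random.Random(seed)
    total = 0
    for _ in range(rounds):
        # The predictor names your actual strategy with probability `accuracy`.
        correct = rng.random() < accuracy
        predicted = strategy if correct else (
            "one-box" if strategy == "two-box" else "two-box")
        box_b = 1_000_000 if predicted == "one-box" else 0
        total += box_b if strategy == "one-box" else box_b + 1_000
    return total / rounds

for acc in (0.99, 0.90, 0.50):
    print(f"accuracy {acc:.0%}: "
          f"one-box ${play('one-box', acc, 100_000):,.0f}/round, "
          f"two-box ${play('two-box', acc, 100_000):,.0f}/round")
```

At 99% accuracy the one-boxer's average dwarfs the two-boxer's; at 50% the predictor is pure noise and the two-boxer's extra $1,000 per round is all that matters.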

· · ·
Chapter 28.2

The Schism in Decision Theory

Newcomb's Problem isn't a party trick. It exposed a fault line that runs through the foundations of how we think about rational choice. On one side: causal decision theory (CDT). On the other: evidential decision theory (EDT).

CDT says: evaluate your options by their causal consequences. Your choice right now can't cause the money to appear or disappear in Box B. Omega already decided. So take both boxes — your act has no causal power over the prediction.4

EDT says: evaluate your options by what they tell you about the world. If you're the kind of person who one-boxes, that's evidence that Omega predicted you'd one-box, which is evidence Box B has a million dollars. Choosing one box is correlated with being rich. Choose one box.

This isn't a minor quibble. CDT and EDT are different theories of rationality. They disagree about what it means to make a good decision. And in Newcomb's Problem, they give opposite answers.
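The structural difference can be made concrete. In this sketch (assuming a 99% predictor, as in the text), EDT conditions the probability that Box B is full on your act, while CDT uses one and the same credence `q` for both acts:

```python
ACC = 0.99  # assumed predictor accuracy, as in the text

# EDT: your act is evidence about the prediction, so the probability
# that Box B is full is conditioned on what you choose.
edt_one = ACC * 1_000_000                 # P(full | one-box) = 0.99
edt_two = (1 - ACC) * 1_000_000 + 1_000   # P(full | two-box) = 0.01

# CDT: the box is already full or empty. Whatever credence q you assign
# to "Box B is full", it is the same for both acts, so two-boxing
# always comes out ahead by exactly the sure $1,000.
def cdt_payoffs(q: float) -> tuple[float, float]:
    return q * 1_000_000, q * 1_000_000 + 1_000

print(f"EDT: one-box ${edt_one:,.0f} vs two-box ${edt_two:,.0f}")
for q in (0.0, 0.5, 0.99):
    one, two = cdt_payoffs(q)
    print(f"CDT with q={q}: two-box leads by ${two - one:,.0f}")
```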

CAUSAL (CDT): two-box. Asks "What does my act cause to happen?" Rests on the dominance principle. Champions: Lewis, Joyce, Gibbard & Harper. Payoff: ~$1,000.

EVIDENTIAL (EDT): one-box. Asks "What does my act tell me about the world?" Rests on maximizing expected value. Champions: Jeffrey, Horgan, Ahmed. Payoff: ~$1,000,000.

Two theories of rationality, one impossible problem, zero consensus since 1969.

The Predictor Doesn't Need to Be God

Here's what makes people dismiss the problem too quickly: "Perfect predictors don't exist." True. But they don't need to be perfect. They just need to be good enough.

Think about someone who knows you extremely well — a spouse, a parent, a close friend. If they had to predict whether you'd take one box or two, they'd probably be right more than 50% of the time. In many cases, much more. You're predictable. Sorry. We all are.5

And here's the thing: as long as the predictor's accuracy exceeds a certain threshold (about 50.05%, as you can verify in the game above), one-boxing has a higher expected payoff. The dominance argument is logically valid — but it conflicts with the expected-value calculation. Something has to give.
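The 50.05% threshold falls out of setting the two expected values equal: p · $1,000,000 = (1 − p) · $1,000,000 + $1,000, which gives p = $1,001,000 / $2,000,000 = 0.5005. A quick check with exact arithmetic:

```python
from fractions import Fraction

MILLION, THOUSAND = 1_000_000, 1_000

def ev_one_box(p):  # expected payoff when Omega is right with probability p
    return p * MILLION

def ev_two_box(p):  # Box B is full only when Omega is wrong about you
    return (1 - p) * MILLION + THOUSAND

# Break-even: p * 1M = (1 - p) * 1M + 1k  =>  p = 1,001,000 / 2,000,000
threshold = Fraction(MILLION + THOUSAND, 2 * MILLION)
print(float(threshold))  # 0.5005

assert ev_one_box(threshold) == ev_two_box(threshold)                 # break-even
assert ev_one_box(Fraction(51, 100)) > ev_two_box(Fraction(51, 100))  # above: one-box
assert ev_one_box(Fraction(1, 2)) < ev_two_box(Fraction(1, 2))        # coin flip: two-box
```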

· · ·
Chapter 28.3

Where Do You Stand?

Enough theory. Time to commit. Are you a one-boxer or a two-boxer?

📊 The Great Newcomb Poll

[Interactive poll: choose your side and see how other readers split.]

· · ·
Chapter 28.4

Newcomb's Problem Is Everywhere

You might think this is just philosophers having fun. But Newcomb-like structures show up constantly in real life — wherever your decision is correlated with someone else's prediction of it.

"My single vote won't change the outcome, so why bother?" This is two-boxing logic applied to democracy. Your vote doesn't cause millions of like-minded people to also vote — but it's correlated with them. If you stay home, you're the kind of person whose demographic stays home. If everyone in your reference class reasons the same way (and they tend to), the group outcome depends on what the individual decides. The one-boxer goes to the polls.6

Nuclear deterrence is a Newcomb problem in disguise. If the missiles are already flying, retaliating causes nothing but extra destruction — two-boxing logic says don't push the button. But the whole point of deterrence is being the kind of agent that would retaliate. You need to be a one-boxer — committed to a strategy that looks irrational once the moment arrives — precisely so that the moment never arrives.

Every time you keep a promise when breaking it would benefit you in the moment, you're one-boxing. The payoff of being known as trustworthy — being predicted to cooperate — outweighs the $1,000 you could grab right now by defecting. Two-boxers in business don't get invited back.

Kavka's Toxin Puzzle

Philosopher Gregory Kavka sharpened the blade further in 1983. A billionaire offers you $1,000,000 if you intend, at midnight tonight, to drink a mildly unpleasant (but not dangerous) toxin tomorrow. You don't actually have to drink it — you just have to genuinely intend to. But notice: once midnight passes and you've won the money, you have no reason to drink. And if you know you won't drink it, you can't genuinely intend to. And if you can't intend to, you don't get the money.7

Kavka's Toxin reveals the same crack as Newcomb's Problem: can a rational agent commit to a strategy that's suboptimal at the moment of execution, because the commitment itself is what generates the payoff?

Functional Decision Theory: A Way Out?

In 2017, Eliezer Yudkowsky and Nate Soares proposed functional decision theory (FDT), which tries to dissolve the Newcomb stalemate. The idea: don't ask "what should I do now?" but rather "what decision procedure should I be running?"8

FDT says: you should one-box, not because your choice causes Omega to put money in the box, and not merely because one-boxing is evidence the money is there, but because Omega's prediction and your choice are both outputs of the same underlying computation — your decision algorithm. When you choose, you're choosing on behalf of every copy and simulation of your algorithm, including the one Omega ran when making the prediction.

It's a clever move. It respects the causal structure (your choice doesn't retroactively cause the money to appear) while still recommending one-boxing (because your algorithm is what Omega was reading). Whether it fully works is still debated — but it shows that after fifty years, new ideas are still emerging from Nozick's deceptively simple puzzle.
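FDT's central move can be caricatured in a few lines. If Omega's prediction is literally a run of your decision procedure, and that procedure is deterministic, then the prediction and your later choice cannot come apart. A toy model (all names here are illustrative, not from the FDT paper):

```python
def omega_predict(decision_algorithm):
    # Omega's "prediction" is literally a run of your decision procedure.
    return decision_algorithm()

def run_round(decision_algorithm):
    predicted = omega_predict(decision_algorithm)        # Omega simulates you...
    box_b = 1_000_000 if predicted == "one-box" else 0   # ...and fills Box B
    choice = decision_algorithm()                        # later, you choose
    return box_b if choice == "one-box" else box_b + 1_000

one_boxer = lambda: "one-box"
two_boxer = lambda: "two-box"

print(run_round(one_boxer))  # 1000000
print(run_round(two_boxer))  # 1000
```

In this model the only lever you have is which algorithm you are, and the one-boxing algorithm earns more every single time.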

[Diagram: your decision algorithm feeds both Omega's simulation and your actual choice; FDT's insight is that both are outputs of the same function.]

Functional Decision Theory: your choice and Omega's prediction are both running the same code.

· · ·
Chapter 28.5

What Nozick Knew

Robert Nozick, the man who started all this, never told anyone which side he was on. "I have not," he admitted, "been able to make myself feel sure that I should take one box." He found the arguments for two-boxing compelling. He also found the arguments for one-boxing compelling. He thought anyone who didn't feel the pull of both sides wasn't thinking hard enough.

And maybe that's the real lesson. Newcomb's Problem isn't a bug in decision theory — it's a feature of the world. Some situations genuinely put our principles in conflict. The dominance principle (never leave free money on the table) and the expected value principle (maximize your average payoff) usually agree. Newcomb's Problem is the stress test that shows they don't always.

You can pick a side. You probably already have. But if you're not at least a little uncomfortable with your choice, you might want to think again.

The boxes are waiting.

Notes & References

  1. Robert Nozick, "Newcomb's Problem and Two Principles of Choice," in Essays in Honor of Carl G. Hempel, ed. Nicholas Rescher (Dordrecht: D. Reidel, 1969), 114–146. The problem was originally invented by physicist William Newcomb and shared with Nozick, who brought it to philosophical fame.
  2. The dominance argument goes back to the foundations of game theory. See R. Duncan Luce and Howard Raiffa, Games and Decisions (New York: Wiley, 1957). The principle: if strategy A is at least as good as strategy B in every possible state of the world, and strictly better in at least one, choose A.
  3. David Lewis made the most forceful philosophical case for two-boxing in "Causal Decision Theory," Australasian Journal of Philosophy 59, no. 1 (1981): 5–30. He lost no sleep over one-boxers getting rich.
  4. Allan Gibbard and William Harper, "Counterfactuals and Two Kinds of Expected Utility," in Foundations and Applications of Decision Theory, eds. C.A. Hooker, J.J. Leach, and E.F. McClennen (Dordrecht: D. Reidel, 1978), 125–162.
  5. Philip Tetlock's work on superforecasters shows that human prediction can be remarkably accurate with the right methods and training. See Philip Tetlock and Dan Gardner, Superforecasting: The Art and Science of Prediction (New York: Crown, 2015).
  6. The voting-as-Newcomb analogy is developed in Isaac Levi, "A Note on Newcombmania," Journal of Philosophy 79, no. 6 (1982): 337–342, and more recently in debates on "Evidential Cooperation in Large Worlds" by Oesterheld (2017).
  7. Gregory Kavka, "The Toxin Puzzle," Analysis 43, no. 1 (1983): 33–36.
  8. Eliezer Yudkowsky and Nate Soares, "Functional Decision Theory: A New Theory of Instrumental Rationality," arXiv preprint arXiv:1710.05060 (2017).