All Chapters

The Missing Chapter

The Million-Dollar Gap

Why checking an answer is easy but finding one might be impossibly hard

An extension of Jordan Ellenberg's "How Not to Be Wrong"

Chapter 58

The Sudoku Asymmetry

Here's something you already know, even if you've never put it into words. Hand someone a completed Sudoku puzzle — every row, column, and 3×3 box filled with the digits 1 through 9 — and they can check it in about thirty seconds. Scan the rows, scan the columns, scan the boxes. Done. Either it's valid or it isn't.

Now hand someone a blank Sudoku and ask them to solve it. A hard one — the kind with only 17 given digits, the minimum needed to guarantee a unique solution. Depending on their skill, you're looking at twenty minutes, an hour, or an evening of increasingly desperate pencil marks. Some people never finish.

This gap — between the difficulty of checking an answer and the difficulty of finding one — is not a quirk of Sudoku. It is perhaps the deepest unsolved problem in all of mathematics, and someone will pay you one million dollars if you can resolve it.

The question has a name that sounds like a text message abbreviation: P vs NP. It sits on the Clay Mathematics Institute's list of Millennium Prize Problems alongside the Riemann Hypothesis and the Navier-Stokes equations — the most important open questions in mathematics, each carrying a million-dollar bounty.1 And unlike some of those other problems, which feel hermetically sealed inside pure mathematics, P vs NP reaches out and grabs the real world by the throat. If P equals NP, then every encryption system protecting your bank account can be cracked. Every protein-folding problem in biology becomes trivial. Every optimization problem in logistics, scheduling, and circuit design has an efficient solution. Mathematical creativity itself — the act of finding proofs — becomes no harder than checking them.

If P doesn't equal NP — which is what almost every computer scientist believes — then we live in a universe where some problems are inherently, provably, irreducibly hard. Not hard because we haven't been clever enough yet. Hard because hardness is baked into the structure of computation itself.

Let's figure out what this actually means.

· · ·
Chapter 58

What Computers Can Do Quickly

Mathematicians love to classify things. Biologists have kingdoms and phyla; chemists have the periodic table; and computer scientists have complexity classes — categories for problems based on how hard they are to solve.

The most important class is called P, which stands for "polynomial time." A problem is in P if there's an algorithm that solves it in a number of steps that's a polynomial function of the input size. Sorting a list of n numbers? You can do it in roughly n log n steps — that's polynomial (better than polynomial, actually). Finding the shortest path between two points in a network? Dijkstra's algorithm handles it in polynomial time. Determining whether a number is prime? After centuries of uncertainty, a trio of Indian computer scientists proved in 2002 that this, too, is in P.2

Polynomial time is the computer scientist's definition of "feasible." An algorithm that runs in n² steps is slow for large inputs, but it's fundamentally tractable — double the input size and you quadruple the work. Manageable. An algorithm that runs in 2n steps, on the other hand, is catastrophic. Add one more element to your input and you double the total work. For n = 100, that's more steps than there are atoms in the observable universe.

Input size (n) Steps 2ⁿ n log n Polynomial (fast) Exponential (doom)
The polynomial vs exponential divide: a modest gap for small inputs becomes an unbridgeable chasm.
Chapter 58

The Art of Checking

Now meet NP, which — and I cannot stress this enough — does not stand for "not polynomial." It stands for "nondeterministic polynomial time," a name that makes sense only if you've taken a graduate course in theoretical computer science and are the sort of person who names things badly.3

Here's a friendlier definition: a problem is in NP if, whenever someone hands you a proposed solution, you can check whether it's correct in polynomial time. You might have no idea how to find the solution efficiently, but if an angel whispered the answer in your ear, you could verify it fast.

Every problem in P is automatically in NP — if you can solve something quickly, you can certainly check an answer quickly. (Just solve it yourself and compare.) So P ⊆ NP. The million-dollar question is whether the reverse is also true. Is every problem that's easy to check also easy to solve?

"If P = NP, then the world would be a profoundly different place than we usually assume it to be. Everyone who could appreciate a symphony would be Mozart; everyone who could follow a step-by-step argument would be Gauss."
— Scott Aaronson4

Try it yourself. Below is a graph coloring puzzle — you need to color every node so that no two connected nodes share the same color. Checking a coloring takes an instant: just scan each edge. Finding a valid coloring? That's where the sweat starts.

Interactive: NP Verification vs Solving

Click a node, then click a color to paint it. Try to color the graph so no connected nodes share a color. The timer tracks how long you take — verification is instant.

Nodes:
0.0s
Select a node, then pick a color.

Did you feel it? With five nodes, it's almost trivial. With twelve, you start backtracking. With eighteen, you're making strategic decisions, painting yourself into corners, erasing and retrying. And yet checking any proposed coloring — valid or not — takes the same negligible instant regardless of size. That asymmetry is the heart of P vs NP.

· · ·
Chapter 58

The Master Key: NP-Completeness

In 1971, Stephen Cook proved something remarkable. He showed that there exists a single problem — the Boolean satisfiability problem, or SAT — with a very special property: if you could solve SAT in polynomial time, you could solve every problem in NP in polynomial time.5 Not just some NP problems. All of them. SAT is what Cook called "NP-complete" — it is, in a precise technical sense, the hardest problem in NP.

A SAT problem looks like this: you have a bunch of Boolean variables (true or false), and a formula built from ANDs, ORs, and NOTs. Can you find an assignment of true/false values that makes the whole formula true? Simple to state, maddening to solve.

A year later, Richard Karp showed that twenty-one other famous problems were also NP-complete.6 Graph coloring. The traveling salesman problem. Bin packing. Integer programming. Each one is a master key: solve any single one efficiently, and you've solved them all. Prove any single one can't be solved efficiently, and none of them can be.

NP P Sorting Shortest path Primality NP-Complete SAT Graph coloring TSP P = NP ?
The landscape of complexity (assuming P ≠ NP). NP-complete problems sit at the hardest edge of NP.

This is what makes NP-completeness so powerful and so unnerving. These twenty-one problems — now thousands, since more are discovered regularly — are all secretly the same problem wearing different disguises. Solve one, solve them all. Fail at one, fail at all. They're connected by reductions: mechanical translations that convert any instance of one problem into an instance of another, preserving the answer.

Seeing a Reduction

The interactive below shows this in action. We take a small SAT formula — a few clauses with Boolean variables — and mechanically convert it into a graph coloring problem. The key insight: the SAT formula is satisfiable if and only if the resulting graph is colorable with 3 colors. Solve either one and you've solved both.

Interactive: SAT → Graph Coloring Reduction

Toggle the Boolean variables below. When the SAT formula is satisfied, the graph coloring is automatically valid — and vice versa. They're the same problem in disguise.

SAT Instance
reduces to
Graph Coloring
· · ·
Chapter 58

What If P = NP?

Let's indulge the fantasy for a moment. Suppose someone proves P = NP constructively — not just showing they're equal, but providing an actual efficient algorithm for SAT. What happens?

Cryptography collapses. RSA encryption, which protects your online banking, relies on the assumption that factoring large numbers is hard. If P = NP, it isn't. Every encrypted message ever sent could be retroactively decoded. Bitcoin's security evaporates.7

Drug discovery accelerates by decades. Protein folding — predicting a protein's 3D structure from its amino acid sequence — is believed to be NP-hard. Efficient algorithms would let us design drugs by computation alone.

Creativity becomes mechanical. Finding a mathematical proof is (essentially) an NP problem: the proof itself is the certificate, checkable in polynomial time. If P = NP, there's an efficient algorithm that finds proofs. Mathematicians become obsolete — or at least, the part of mathematics that involves inspired leaps rather than systematic search.

Optimization is solved. Scheduling, routing, packing, planning — every logistics problem that companies spend billions on becomes computationally trivial.

This would be, without exaggeration, the most transformative intellectual discovery in human history. It would change everything from warfare to weather prediction to Wall Street. Which is one reason most experts suspect it isn't true.

Chapter 58

Why Everyone Believes P ≠ NP (But Nobody Can Prove It)

The consensus among theoretical computer scientists is overwhelming: P ≠ NP. In a 2019 survey, 88% of complexity theorists said they believed P ≠ NP, with only about 1% believing P = NP.8 But belief is not proof, and the gap between intuition and theorem has never been wider.

Why do they believe it? Partly induction — thousands of smart people have tried for fifty years to find polynomial-time algorithms for NP-complete problems, and nobody has succeeded. Partly because the consequences of P = NP are so extreme that they feel implausible. And partly because there's a deep aesthetic conviction that finding should be harder than checking, that the creative act of constructing a solution should demand more than the clerical act of verifying one.

But aesthetics is not mathematics. To prove P ≠ NP, you'd need to show that for some specific NP problem, no polynomial-time algorithm exists. Not that we haven't found one. That one cannot exist. You need a proof of impossibility — and those are notoriously difficult.

Why it's so hard to prove

In 1994, Razborov and Rudich proved that a vast class of proof strategies — called "natural proofs" — cannot resolve P vs NP (assuming certain cryptographic assumptions hold). Baker, Gill, and Solovay had already shown in 1975 that "relativizing" proofs can't work either. These are called barrier results: they don't tell us the answer, but they tell us that whole avenues of attack are dead ends. Any successful proof must use genuinely new ideas.

In August 2010, Vinay Deolalikar, a researcher at HP Labs, circulated a 100-page manuscript claiming to prove P ≠ NP. The paper generated immediate excitement — and equally immediate scrutiny. Within weeks, a collaborative effort by mathematicians and computer scientists (conducted largely on blogs and wikis) identified fundamental gaps in the argument. The proof did not survive peer review. The million dollars remains unclaimed.

Living with Hardness

If we can't prove P ≠ NP, and we can't solve NP-complete problems efficiently, what do we actually do? This is where computer science gets practical, even beautiful.

We approximate. For the traveling salesman problem, we can't find the shortest route, but we can find one that's guaranteed to be within 50% of optimal. For many real-world instances, that's good enough.

We use heuristics. SAT solvers — programs that tackle the Boolean satisfiability problem — can handle formulas with millions of variables, even though SAT is NP-complete. The worst case is exponential, but the typical case, for the kind of structured formulas that arise in practice, is often manageable.

We randomize. Sometimes flipping coins helps. Randomized algorithms can solve certain problems faster than any known deterministic algorithm, and for many NP-hard problems, random restarts and stochastic local search are the best tools we have.

And sometimes we just change the problem. Can't optimize globally? Optimize locally. Can't find the exact answer? Find a probabilistically good one. Can't solve the worst case? Solve the average case. Theoretical hardness is a wall, but practical computer science is very good at finding windows.

· · ·
Chapter 58

Why This Matters to You

You might think P vs NP is an abstraction that concerns only theorists. It isn't. Every time you buy something online, the security of the transaction depends on the assumption that P ≠ NP. Every time a logistics company routes its delivery trucks, it's navigating an NP-hard problem with heuristic algorithms that work despite the theoretical intractability. Every time a pharmaceutical company folds a protein or an airline schedules its crews, it's bumping up against the same fundamental barrier.

Cryptography Logistics Biology AI & Proofs Everything rides on P vs NP
From online banking to drug design, the P vs NP question underlies the modern world.

P vs NP is, at bottom, a question about the nature of creativity. Can insight be automated? Is the flash of genius — the "aha!" of finding a proof, composing a symphony, cracking a code — fundamentally different from the plodding work of checking whether someone else's insight is correct? Most of us feel, intuitively, that it is. That creating is harder than critiquing. That the artist and the critic are engaged in different kinds of labor.

P vs NP asks whether that intuition can be made rigorous. Whether mathematics itself can prove that discovery is harder than verification. And the fact that we've been trying for over fifty years and still can't — that the question remains open, tantalizing, maddening — tells us something profound about the limits of human knowledge.

We live in the gap between P and NP. We assume the gap is real. We build our entire digital civilization on that assumption. And we cannot prove it.

That should keep you up at night. Or at least make you appreciate, the next time you buy something online, that your credit card number is protected by a conjecture.

Notes & References

  1. The Clay Mathematics Institute announced the seven Millennium Prize Problems in 2000, each carrying a $1,000,000 prize. As of 2024, only the Poincaré Conjecture has been solved (by Grigori Perelman, who declined the prize). See claymath.org/millennium-problems.
  2. Agrawal, M., Kayal, N., & Saxena, N. (2004). "PRIMES is in P." Annals of Mathematics, 160(2), 781–793. The AKS primality test was the first deterministic polynomial-time algorithm for primality testing.
  3. The "nondeterministic" refers to a theoretical model of computation — the nondeterministic Turing machine — which can magically "guess" the right answer at each step. A problem is in NP if such a machine can solve it in polynomial time. Equivalently (and more usefully), it's in NP if a deterministic machine can verify a solution in polynomial time.
  4. Aaronson, S. (2006). "Reasons to Believe." Blog post on Shtetl-Optimized. This quote has become one of the most cited informal descriptions of the P vs NP question's significance.
  5. Cook, S. A. (1971). "The complexity of theorem-proving procedures." Proceedings of the Third Annual ACM Symposium on Theory of Computing, 151–158. Leonid Levin independently proved a similar result in the Soviet Union.
  6. Karp, R. M. (1972). "Reducibility among combinatorial problems." In Complexity of Computer Computations, Plenum Press, 85–103. The 21 problems include satisfiability, clique, vertex cover, Hamiltonian cycle, graph coloring, and many others.
  7. More precisely, RSA relies on the difficulty of factoring, which is in NP (and co-NP) but is not known to be NP-complete. However, if P = NP, factoring would certainly be in P. Bitcoin's proof-of-work (SHA-256 hashing) and ECDSA signatures would likewise collapse.
  8. Gasarch, W. I. (2019). "Guest Column: The Third P =? NP Poll." ACM SIGACT News, 50(1), 38–59. This is the third in a series of polls conducted in 2002, 2012, and 2019.