The Missing Chapter

Benford's Law II: The Forensic Frontier

How digit patterns catch tax cheats, election fraud, and the people who think they're clever enough to fake the numbers.

An extension of Jordan Ellenberg's "How Not to Be Wrong"

Chapter 50

The Accountant Who Could Smell a Lie

In 1993, something funny was going on in the Brooklyn Democratic primary. Payments to poll workers — the people who staffed voting stations — looked… off. Not obviously wrong, not cartoon-villain-stuffing-cash-in-suitcases wrong, but subtly, mathematically wrong. And one man noticed.

His name was Mark Nigrini, a South African–born accountant working on his PhD at the University of Cincinnati, and he had an unusual obsession: the first digits of numbers.¹ While most of us glance at a spreadsheet and see dollar amounts, Nigrini saw digit distributions. And the Brooklyn poll worker payments had far too many numbers starting with 7 and 8, digits that should be relatively rare in naturally occurring financial data.

If you read Chapter 8, you already know why. Benford's Law tells us that in many real-world datasets, the leading digit isn't uniformly distributed. The digit 1 appears as the first digit about 30.1% of the time, while 9 shows up only 4.6% of the time. This isn't a curiosity or a coincidence — it's a mathematical consequence of how numbers grow, scale, and accumulate in the real world.

But here's what Chapter 8 didn't tell you: Benford's Law isn't just a neat party trick. It's a weapon.

Nigrini's analysis of the 1993 Brooklyn Democratic primary revealed that payments to poll workers deviated significantly from Benford's expected distribution. The excess of high first digits (7s and 8s) suggested that someone had been fabricating payment amounts — probably picking numbers that "felt" random but betrayed the human inability to mimic natural digit distributions. The case became a landmark example of digital analysis in forensic accounting.

Nigrini's PhD thesis, completed in 1992, laid the groundwork for an entirely new field: digital analysis in forensic accounting.² His insight was beautifully simple. If you suspect someone is fabricating financial data, don't look for the fraud itself; look at the digits. Genuine numbers of the right kind tend to follow Benford's Law. Fabricated numbers, invented by humans who think they know what "random" looks like, almost never do.

The IRS took notice. So did the Big Four accounting firms. Today, Benford's Law analysis is a standard first-pass screening tool in forensic audits. Deloitte, PricewaterhouseCoopers, Ernst & Young, and KPMG all use some version of it. The logic is ruthlessly efficient: run the numbers through a Benford filter, flag the accounts that deviate, and send the auditors where they're most likely to find something.³

· · ·
The Second Digit

Where the Real Magic Happens

Here's something the fraudsters eventually figured out: if Benford's Law catches you by checking first digits, maybe you should make sure your fabricated numbers have convincing first digits. Stuff more 1s in there, fewer 9s. Problem solved, right?

Wrong. Because Benford's Law has a sequel, and it's even more powerful.

The second digit of a number — the one most people don't even think about — follows its own Benford-like distribution. And almost nobody remembers to fake that one.

The second-digit distribution is subtler than the first. While first digits range from the dramatic 30.1% for 1 down to 4.6% for 9, second digits are more uniform — ranging from about 12.0% for 0 down to 8.5% for 9. The differences are smaller, but they're real, and they're remarkably consistent across genuine datasets.

[Figure: Benford's Expected Digit Distributions. Two bar charts: first digit (1 through 9), peaking at 30.1% for 1, and second digit (0 through 9), peaking at 12.0% for 0.]
First digits show dramatic variation; second digits are subtler but still non-uniform — and harder to fake.
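
If you'd like to see where those percentages come from, here is a minimal Python sketch. It assumes the standard Benford formula P(d) = log₁₀(1 + 1/d) for first digits, and gets the second-digit probabilities by summing over every possible first digit. The function names are made up for this illustration.

```python
import math

def benford_first_digit(d: int) -> float:
    """Expected share of numbers whose first digit is d (d = 1..9)."""
    return math.log10(1 + 1 / d)

def benford_second_digit(d: int) -> float:
    """Expected share of numbers whose second digit is d (d = 0..9).

    Sums the two-digit Benford probability over all nine possible first digits.
    """
    return sum(math.log10(1 + 1 / (10 * first + d)) for first in range(1, 10))

first = {d: benford_first_digit(d) for d in range(1, 10)}
second = {d: benford_second_digit(d) for d in range(0, 10)}

for d, p in first.items():
    print(f"first digit  {d}: {p:.1%}")
for d, p in second.items():
    print(f"second digit {d}: {p:.1%}")
```

The first-digit column falls steeply from 1 to 9, while the second-digit column only drifts down gently from 0 to 9, which is the contrast the chart is meant to show.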

This is why second-digit analysis is the forensic accountant's secret weapon. A clever fraudster might doctor first digits to match Benford's curve. But correcting second digits too? That requires knowing the expected distribution and having the discipline to apply it across thousands of entries. Almost nobody does.
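
To make that concrete, here is a small simulation sketch. It invents a hypothetical fraudster who is careful enough to draw first digits with Benford weights but picks every later digit uniformly at random, which is one plausible way a person might fake "random" amounts. Everything here (the fraudster model, the four-digit amounts, the function names) is an assumption for illustration, not a description of any real case.

```python
import math
import random

rng = random.Random(42)  # fixed seed so the run is repeatable

FIRST = [math.log10(1 + 1 / d) for d in range(1, 10)]
SECOND = [
    sum(math.log10(1 + 1 / (10 * f + d)) for f in range(1, 10))
    for d in range(10)
]

def fabricate_amount() -> int:
    """Hypothetical fake: Benford-weighted first digit, uniform remaining digits."""
    first = rng.choices(range(1, 10), weights=FIRST)[0]
    rest = "".join(str(rng.randrange(10)) for _ in range(3))
    return int(str(first) + rest)

amounts = [fabricate_amount() for _ in range(50_000)]

def max_gap(position: int, expected, digits) -> float:
    """Largest absolute gap between observed and expected digit shares."""
    observed = [
        sum(1 for a in amounts if int(str(a)[position]) == d) / len(amounts)
        for d in digits
    ]
    return max(abs(o - e) for o, e in zip(observed, expected))

first_gap = max_gap(0, FIRST, range(1, 10))
second_gap = max_gap(1, SECOND, range(10))
print(f"largest first-digit gap:  {first_gap:.4f}")
print(f"largest second-digit gap: {second_gap:.4f}")
```

In this setup the first digits should sit close to Benford while the second digits, being flat at 10% each, should miss the expected 12.0% for 0 by a visibly larger margin.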

The Statistical Tests

Of course, "this looks off" doesn't hold up in court. You need numbers to back up your numbers. Two tests dominate forensic Benford analysis:

Chi-Squared Test for Benford Compliance
χ² = Σ (observed − expected)² / expected
Sum over all digits. Higher χ² means greater deviation from Benford's Law.

The chi-squared test compares your observed digit counts against the counts Benford's Law predicts for a sample of the same size. A high chi-squared value means the data deviates significantly from what nature would produce. For first-digit analysis with 9 categories (8 degrees of freedom), a chi-squared above 15.51 (at the 5% significance level) raises the red flag.
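
Here is a rough sketch of that calculation in Python. The two sets of counts are invented for illustration (each sums to 1,000), and the function and constant names are hypothetical; only the formula and the 15.51 cutoff come from the text.

```python
import math

BENFORD_FIRST = [math.log10(1 + 1 / d) for d in range(1, 10)]
CRITICAL_5PCT_DF8 = 15.51  # cutoff quoted in the text: 9 categories, 5% level

def chi_squared_first_digit(counts) -> float:
    """counts[i] is how many values have first digit i + 1."""
    n = sum(counts)
    total = 0.0
    for observed, p in zip(counts, BENFORD_FIRST):
        expected = n * p  # expected COUNT, not a percentage
        total += (observed - expected) ** 2 / expected
    return total

# Hypothetical first-digit counts, made up for this example.
conforming = [301, 176, 125, 97, 79, 67, 58, 51, 46]
suspicious = [250, 150, 110, 90, 70, 60, 110, 110, 50]  # too many 7s and 8s

for name, counts in [("conforming", conforming), ("suspicious", suspicious)]:
    stat = chi_squared_first_digit(counts)
    verdict = "flag" if stat > CRITICAL_5PCT_DF8 else "pass"
    print(f"{name}: chi-squared = {stat:.2f} -> {verdict}")
```

Note that the statistic is computed on counts. Feeding it percentages instead would silently change the answer, because the test's sensitivity depends on how many observations stand behind each share.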

But chi-squared has a weakness: it's sensitive to sample size. With a very large dataset, even tiny (and meaningless) deviations will produce a "significant" result. Enter the MAD test — Mean Absolute Deviation.

Mean Absolute Deviation (MAD)
MAD = (1/K) Σ |observed − expected|
K = number of digit categories; observed and expected are proportions (0.301, not 30.1). Unlike χ², MAD does not grow with sample size.

Nigrini himself proposed the MAD thresholds that are still standard today.⁴ For first-digit tests, a MAD below 0.006 is "close conformity," 0.006–0.012 is "acceptable," 0.012–0.015 is "marginally acceptable," and above 0.015 is "nonconformity." For second digits, the thresholds are tighter because the expected distribution is flatter.
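
A short sketch shows both points at once: how MAD maps onto those bands, and why it sidesteps chi-squared's sample-size problem. The counts are invented, the function names are hypothetical, and MAD is computed on proportions (0.301 rather than 30.1), which is the scale on which the thresholds above make sense.

```python
import math

BENFORD_FIRST = [math.log10(1 + 1 / d) for d in range(1, 10)]

def mad_first_digit(counts) -> float:
    """Mean absolute deviation between observed and Benford proportions."""
    n = sum(counts)
    gaps = [abs(c / n - p) for c, p in zip(counts, BENFORD_FIRST)]
    return sum(gaps) / len(gaps)

def chi_squared(counts) -> float:
    n = sum(counts)
    return sum((c - n * p) ** 2 / (n * p) for c, p in zip(counts, BENFORD_FIRST))

def conformity_band(mad: float) -> str:
    """First-digit bands as quoted in the text."""
    if mad <= 0.006:
        return "close conformity"
    if mad <= 0.012:
        return "acceptable"
    if mad <= 0.015:
        return "marginally acceptable"
    return "nonconformity"

# Hypothetical counts: slightly off Benford, n = 1,000.
small = [310, 170, 120, 100, 80, 65, 60, 50, 45]
# Exactly the same proportions with 1,000 times as many records.
large = [c * 1000 for c in small]

for name, counts in [("n = 1,000", small), ("n = 1,000,000", large)]:
    mad = mad_first_digit(counts)
    print(f"{name}: chi-squared = {chi_squared(counts):.1f}, "
          f"MAD = {mad:.4f} ({conformity_band(mad)})")
```

The proportions are identical in both runs, so MAD and its verdict do not move, while chi-squared scales up by the same factor of 1,000 as the sample and crosses the 15.51 cutoff on sheer volume.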

· · ·
Elections

When Digits Meet Democracy

In 2009, something strange happened in Iran's presidential election. Mahmoud Ahmadinejad won by a suspiciously large margin, and the streets of Tehran erupted in protest. Statisticians, naturally, reached for Benford's Law.

Walter Mebane, a political scientist at the University of Michigan, had spent years developing a specific test: second-digit Benford analysis applied to vote counts from individual precincts.⁵ The idea is elegant. In a genuine election, the number of votes per candidate per precinct spans enough orders of magnitude that Benford's Law should apply. If someone is stuffing ballots (adding fictitious votes to precincts), the second digits of the inflated totals will drift away from the expected distribution.
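
The mechanics of a second-digit test can be sketched in a few lines. This is not Mebane's actual implementation, just an illustration of the idea: pull the second digit out of each precinct total, tally the ten digits, and compare against the second-digit Benford shares with a chi-squared statistic. With ten categories there are nine degrees of freedom, so the 5% critical value is about 16.92. The precinct totals and function names below are invented.

```python
import math

BENFORD_SECOND = [
    sum(math.log10(1 + 1 / (10 * first + d)) for first in range(1, 10))
    for d in range(10)
]

def second_digit(count: int):
    """Second significant digit of a vote count, or None if it has only one digit."""
    s = str(abs(count))
    return int(s[1]) if len(s) >= 2 else None

def second_digit_chi_squared(vote_counts) -> float:
    digits = [d for d in map(second_digit, vote_counts) if d is not None]
    n = len(digits)
    observed = [digits.count(d) for d in range(10)]
    return sum((o - n * p) ** 2 / (n * p) for o, p in zip(observed, BENFORD_SECOND))

# Hypothetical precinct totals, invented purely to show the mechanics.
# A real analysis would need thousands of precincts, not a dozen.
precinct_votes = [412, 1038, 87, 256, 1911, 640, 73, 1204, 358, 529, 7, 990]

print([second_digit(v) for v in precinct_votes])
print(f"chi-squared = {second_digit_chi_squared(precinct_votes):.2f}")
```

Single-digit counts are skipped because they have no second digit, which is one reason very small precincts contribute nothing to this kind of test.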

Mebane applied his test to the Iranian results and found significant deviations. The same technique flagged anomalies in Russia's 2011 parliamentary elections, where United Russia's vote totals showed suspicious digit patterns in certain regions.

Case closed? Not so fast.

The Deckert Warning

In 2011, Joseph Deckert, Mikhail Myagkov, and Peter Ordeshook published a devastating critique.⁶ They applied the same Benford tests to elections that nobody disputed (stable Western democracies with clean track records) and found that those elections also deviated from Benford's Law. The uncomfortable truth: precinct-level vote counts don't always span enough orders of magnitude for Benford's Law to reliably apply. A deviation from Benford is not proof of fraud. It's a clue at best, and a false alarm at worst.

This is perhaps the most important lesson in all of forensic statistics: a statistical anomaly is not proof of wrongdoing. Benford analysis can tell you where to look. It cannot tell you what happened. The digit distribution is a smoke detector, not a fire investigator. And sometimes, the toast just burned.

[Figure: Benford Analysis: Evidence Hierarchy. A funnel with three tiers: all accounts / precincts (100%); Benford flags, i.e. anomalies (~10-20%); confirmed fraud, investigation needed (~2-5%).]
Benford analysis narrows the search space — it doesn't deliver verdicts.

· · ·
Boundaries

When Benford Breaks Down

Not all numbers bow to Benford. And knowing when it doesn't apply is just as important as knowing when it does.

Benford's Law works on data that spans multiple orders of magnitude and arises from multiplicative processes. Corporate revenues, population figures, river lengths, physical constants, and street addresses all tend to be Benford-compliant. But consider phone numbers. They don't span orders of magnitude; they're assigned within fixed ranges. ZIP codes? Same problem. Human heights in centimeters? They cluster between 150 and 200, a small fraction of a single order of magnitude and nowhere near enough for Benford to emerge.

The rule of thumb: if the data couldn't plausibly include values from 1 to at least 1,000 (three orders of magnitude), be skeptical of Benford-based conclusions.
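
One quick way to apply that rule of thumb is to measure how many orders of magnitude a dataset actually covers: take the base-10 logarithm of the ratio between its largest and smallest positive values. The sketch below does exactly that on two invented samples; the helper name and the data are hypothetical.

```python
import math

def orders_of_magnitude(values) -> float:
    """Base-10 log of max/min over the positive values in the data."""
    positive = [v for v in values if v > 0]
    return math.log10(max(positive) / min(positive))

# Hypothetical samples, made up for illustration.
heights_cm = [150, 162, 171, 178, 185, 200]
revenues = [1_200, 48_000, 910_000, 3_500_000, 72_000_000]

for name, data in [("heights (cm)", heights_cm), ("revenues ($)", revenues)]:
    span = orders_of_magnitude(data)
    verdict = "plausible Benford candidate" if span >= 3 else "be skeptical"
    print(f"{name}: spans {span:.2f} orders of magnitude -> {verdict}")
```

The heights cover barely an eighth of one order of magnitude, while the revenues cover nearly five. A wide span is necessary for Benford to be a sensible benchmark, though it is not by itself a guarantee of conformity.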

The Deeper Mathematics: Scale Invariance

Why does Benford's Law work at all? The deep answer is scale invariance. Imagine you have a Benford-distributed dataset of revenues in dollars. Now convert everything to euros, yen, or Bitcoin. Multiply every number by some constant. The resulting data still follows Benford's Law.⁷

This is extraordinary, and a uniqueness result makes it stronger still: Benford's Law is the only significant-digit distribution that is invariant under multiplication by an arbitrary constant. It doesn't matter what units you use. It doesn't even matter what base you use: Benford's Law has analogues in base 2, base 8, base 16, any base. Together with scale invariance, this base invariance is what makes it a genuine mathematical law rather than an empirical curiosity.
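
You can watch scale invariance happen with a small experiment. The sketch below generates synthetic "revenues" that are uniform on a logarithmic scale across six orders of magnitude (a common way to produce Benford-like data, and an assumption of this example), converts them with a made-up exchange rate of 0.85, and compares first-digit shares before and after.

```python
import math
import random

def first_digit(x: float) -> int:
    """Leading significant digit of a positive number, read off scientific notation."""
    return int(f"{x:.15e}"[0])

def first_digit_shares(values):
    counts = [0] * 9
    for v in values:
        counts[first_digit(v) - 1] += 1
    return [c / len(values) for c in counts]

def max_gap(shares, expected) -> float:
    return max(abs(s - e) for s, e in zip(shares, expected))

rng = random.Random(0)  # fixed seed so the run is repeatable
benford = [math.log10(1 + 1 / d) for d in range(1, 10)]

# Synthetic revenues, log-uniform between 1 and 1,000,000 (illustrative only).
dollars = [10 ** rng.uniform(0, 6) for _ in range(50_000)]
euros = [0.85 * x for x in dollars]  # hypothetical exchange rate

print(f"dollars: max gap from Benford = {max_gap(first_digit_shares(dollars), benford):.4f}")
print(f"euros:   max gap from Benford = {max_gap(first_digit_shares(euros), benford):.4f}")
```

Both gaps should be small and of similar size, reflecting only sampling noise. Try swapping 0.85 for any other positive constant and the picture should not change.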

[Figure: Scale Invariance: Multiply by Any Constant. Histogram of Revenue ($), multiplied by 0.85 to give Revenue (€). Same shape. Always. That's scale invariance. P(d) = log₁₀(1 + 1/d)]
Benford's Law is the unique distribution unchanged by multiplication — it doesn't care about your units.

· · ·
Modern Era

COVID, Crypto, and the Digital Dragnet

In 2020, as COVID-19 case counts poured in from every country, researchers turned to Benford's Law to check whether national reporting was honest.⁸ Countries with transparent health systems (Germany, South Korea, the US) produced case counts that closely matched Benford's expected distribution. Other countries showed curious deviations, suggesting possible underreporting or data manipulation. It wasn't proof, but it was a useful signal in a noisy, politicized information landscape.

Cryptocurrency has opened another frontier. Blockchain transactions are public, and the amounts transacted follow Benford's Law beautifully — until someone starts wash trading (buying from yourself to inflate volumes). Several exchanges have been flagged by Benford analysis for suspiciously uniform transaction amounts, a hallmark of automated fake trading.

Corporate financial statements remain the bread and butter. When Enron's books were later analyzed using Benford's Law, the deviations were screaming. Not subtly off — dramatically wrong, especially in second-digit distributions. The numbers had been fabricated wholesale, and the digits knew.

This is the enduring power of Nigrini's insight from that Brooklyn primary in 1993: you can lie about the numbers, but you can't lie about the digits. The pattern is too deep, too mathematical, too woven into the fabric of how numbers work. It's not that fraudsters aren't smart. It's that Benford's Law is smarter.

Or rather — it's that mathematics doesn't need to be smart. It just needs to be right.

Notes & References

  1. Mark J. Nigrini, "The Detection of Income Tax Evasion Through an Analysis of Digital Distributions," PhD dissertation, University of Cincinnati, 1992.
  2. Mark J. Nigrini, Benford's Law: Applications for Forensic Accounting, Auditing, and Fraud Detection (Hoboken: Wiley, 2012).
  3. Durtschi, Hillison, and Pacini, "The Effective Use of Benford's Law to Assist in Detecting Fraud in Accounting Data," Journal of Forensic Accounting 5 (2004): 17–34.
  4. Nigrini, "A Taxpayer Compliance Application of Benford's Law," Journal of the American Taxation Association 18, no. 1 (1996): 72–91. The MAD conformity thresholds for first digits: ≤0.006 (close), 0.006–0.012 (acceptable), 0.012–0.015 (marginal), >0.015 (nonconformity).
  5. Walter R. Mebane Jr., "Election Forensics: Vote Counts and Benford's Law," working paper, University of Michigan, 2006. Updated version in Political Analysis 19, no. 3 (2011).
  6. Joseph Deckert, Mikhail Myagkov, and Peter C. Ordeshook, "Benford's Law and the Detection of Election Fraud," Political Analysis 19, no. 3 (2011): 245–268.
  7. Theodore P. Hill, "A Statistical Derivation of the Significant-Digit Law," Statistical Science 10, no. 4 (1995): 354–363.
  8. Christoffer Koch and Ken Okamura, "Benford's Law and COVID-19 Reporting," Economics Letters 196 (2020): 109573.