If you watch Numb3rs on CBS, you’ll have noticed a rather bizarre discussion last night of Simpson’s Paradox, which was alleged to say that combining two series of numbers into a single series can change their order (it doesn’t really say that, but that’s beside the point). The example given was David Justice’s and Derek Jeter’s batting averages in 1995 and 1997. In each year, Justice had a better average than Jeter, but for the total of the two years, Jeter was alleged to have had a better average. It’s not hard to figure out how this could be true, but it wasn’t. The actual numbers for those years are these:
         Justice   H/AB       Jeter   H/AB
         -------              -----
1995     .253      104/411    .250    12/48
1997     .329      163/495    .291    190/654
==========================================
Comb.    .295      267/906    .288    202/702
Justice’s numbers, Jeter’s numbers
If Jeter had hit better in 1997, much closer to Justice’s average, it would have been true, because Jeter had very few at-bats in 1995 and many more at-bats in 1997 than Justice. For some bizarre reason, the show used fictitious numbers that didn’t even add up, alleging that Justice hit .321 and .329 for a combined average of .298 (a combined average has to land somewhere between the two seasons’ averages, so .298 is impossible).
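For anyone who wants to check the arithmetic, here is a rough Python sketch that recomputes the averages from the hits and at-bats quoted above and tests whether the order flips when the two seasons are pooled. The function names (`average`, `combined`, `shows_paradox`) are my own, and the last step is purely hypothetical: it keeps Jeter’s 654 at-bats in 1997 fixed and varies his hit total to see which values would have produced the reversal.

```python
from fractions import Fraction

# (hits, at-bats) per season, copied from the figures quoted above
justice = {1995: (104, 411), 1997: (163, 495)}
jeter = {1995: (12, 48), 1997: (190, 654)}


def average(hits, at_bats):
    """Batting average as an exact fraction."""
    return Fraction(hits, at_bats)


def combined(seasons):
    """Pooled average: total hits over total at-bats, not a mean of averages."""
    hits = sum(h for h, _ in seasons.values())
    at_bats = sum(ab for _, ab in seasons.values())
    return Fraction(hits, at_bats)


def shows_paradox(a, b):
    """True if a beats b in every season but b beats a once seasons are pooled."""
    every_season = all(average(*a[y]) > average(*b[y]) for y in a)
    return every_season and combined(b) > combined(a)


for name, seasons in (("Justice", justice), ("Jeter", jeter)):
    for year, (h, ab) in sorted(seasons.items()):
        print(f"{name} {year}: {float(average(h, ab)):.3f}")
    print(f"{name} combined: {float(combined(seasons)):.3f}")

print("Reversal with the real numbers?", shows_paradox(justice, jeter))

# Hypothetical: vary Jeter's 1997 hit total, holding his 654 at-bats fixed
flips = [h for h in range(655)
         if shows_paradox(justice, {1995: (12, 48), 1997: (h, 654)})]
print("Hypothetical 1997 hit totals that flip the order:",
      flips[0], "to", flips[-1])
```

Using exact fractions avoids any rounding argument about who is ahead. With the real numbers there is no reversal; the hypothetical loop only shows how many 1997 hits Jeter would have needed, given the same at-bats, for the show’s claim to hold.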
How a show that’s supposed to be so math-oriented can screw up arithmetic so badly would be a mystery if it weren’t for the fact that mathematicians are notoriously bad at basic arithmetic.
H/T Amnesia, who also got it wrong.
UPDATE: Aha! Reader Brian Thomas explains it all. See comments.
The problem is explained by considering the 3 years 95-97, which also illustrate the paradox nicely:
Somebody in the show deleted the 1995 data and wrote 1995 beside the 1996 data. This is simply a sloppy editing error (likely intended to simplify the chart). Blame the editors.