Numb3rs botches Simpson’s Paradox

If you watch Numb3rs on CBS, you’ll have noticed a rather bizarre discussion last night of Simpson’s Paradox, which was alleged to say that combining two series of numbers into a single series can change their order (it doesn’t really say that, but that’s beside the point). The example given was David Justice’s and Derek Jeter’s batting averages in 1995 and 1997. In each year, Justice had a better average than Jeter, but for the total of the two years, Jeter was alleged to have had the better average. It’s not hard to figure out how this could be true, but it wasn’t. The actual numbers for those years are these:

           Justice             Jeter
           AVG     H/AB        AVG     H/AB
           ----    -------     ----    -------
1995       .253    104/411     .250     12/48
1997       .329    163/495     .291    190/654
Comb.      .295    267/906     .288    202/702

Justice’s numbers, Jeter’s numbers

If Jeter had hit better in 1997, much closer to Justice’s average, it would have been true, because Jeter had very few at bats in 1995 and many more at bats in 1997 than Justice did. For some bizarre reason, the show used fictitious numbers that didn’t even add up, alleging that Justice hit .321 and .329 for a combined average of .298.
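For anyone who wants to check the arithmetic, here is a rough Python sketch that recomputes the averages from the hits and at bats in the table above. The helper names (`avg`, `combined`) and the search over hypothetical 1997 hit totals are my own illustration, not anything from the show; the search just asks how many hits Jeter would have needed in his 654 at bats in 1997 to trail Justice that year and still lead on the two-year total.

```python
from fractions import Fraction

# (hits, at bats) per season, copied from the table above
justice = {1995: (104, 411), 1997: (163, 495)}
jeter = {1995: (12, 48), 1997: (190, 654)}


def avg(hits, at_bats):
    """Batting average as an exact fraction, to avoid rounding surprises."""
    return Fraction(hits, at_bats)


def combined(seasons):
    """Average over all seasons: total hits divided by total at bats."""
    hits = sum(h for h, _ in seasons.values())
    at_bats = sum(ab for _, ab in seasons.values())
    return avg(hits, at_bats)


for year in sorted(justice):
    print(year, f"{float(avg(*justice[year])):.3f}", f"{float(avg(*jeter[year])):.3f}")
print("Comb.", f"{float(combined(justice)):.3f}", f"{float(combined(jeter)):.3f}")

# Hypothetical: 1997 hit totals for Jeter (same 654 at bats) that would keep
# him below Justice in 1997 yet put him ahead on the combined average.
reversing_hits = [
    h
    for h in range(0, 655)
    if avg(h, 654) < avg(*justice[1997])
    and combined({1995: jeter[1995], 1997: (h, 654)}) > combined(justice)
]
print("Reversal needs", min(reversing_hits), "to", max(reversing_hits), "hits")
```

With the real 190 hits, Justice stays ahead in both seasons and on the total. Under these assumptions the reversal would have needed somewhere between 195 and 215 hits in 1997, that is, an average of roughly .298 or better.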

How a show that’s supposed to be so math-oriented can screw up arithmetic so badly would be a mystery if it weren’t for the fact that mathematicians are notoriously bad at basic arithmetic.

H/T Amnesia, who also got it wrong.

UPDATE: Aha! Reader Brian Thomas explains it all. See comments.

One thought on “Numb3rs botches Simpson’s Paradox”

  1. The problem is explained by considering the 3 years 95-97, which also illustrate the paradox nicely:

    Year        Jeter                Justice
    1995        12/48    = .250      104/411  = .253
    1996        183/582  = .314      45/140   = .321
    1997        190/654  = .291      163/495  = .329
    Combined    385/1284 = .300      312/1046 = .298

    Somebody in the show deleted the 1995 data and wrote 1995 beside the 1996 data. This is simply a sloppy editing error (likely intended to simplify the chart). Blame the editors.
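The three-season figures in the comment can be checked with a short Python sketch. The variable names are illustrative only; the hits and at bats are copied from the comment's table. It also combines Justice's 1996 and 1997 lines on their own, which suggests why the .321 and .329 shown on screen could not produce .298 by themselves.

```python
from fractions import Fraction

# (hits, at bats) per season, copied from the comment's table
jeter = {1995: (12, 48), 1996: (183, 582), 1997: (190, 654)}
justice = {1995: (104, 411), 1996: (45, 140), 1997: (163, 495)}


def combined(seasons):
    """Average over all seasons: total hits divided by total at bats."""
    hits = sum(h for h, _ in seasons.values())
    at_bats = sum(ab for _, ab in seasons.values())
    return Fraction(hits, at_bats)


# Justice has the higher average in each individual season...
justice_wins_every_year = all(
    Fraction(*justice[y]) > Fraction(*jeter[y]) for y in justice
)
# ...but Jeter has the higher average over the three seasons together.
jeter_wins_combined = combined(jeter) > combined(justice)

print("Justice ahead every year:", justice_wins_every_year)
print("Jeter ahead combined:", jeter_wins_combined)
print(f"Jeter {float(combined(jeter)):.3f}, Justice {float(combined(justice)):.3f}")

# Justice's 1996 and 1997 seasons alone, i.e. the two averages the show displayed
justice_96_97 = combined({y: justice[y] for y in (1996, 1997)})
print(f"Justice 1996-97 only: {float(justice_96_97):.3f}")
```

Both checks come out true for these figures, so the three seasons together do show the reversal, while the 1996 and 1997 lines alone combine to about .328 rather than .298.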

