# How to Lie with Statistics



It's always best to use the maximum possible precision. This way people know they can trust your data.

Round numbers would lead people to believe you were just making things up.

Source:

How to Lie with Statistics, Darrell Huff and Irving Geis

Statistics is not trusted because it shouldn't be: it is incredibly easy to lie with statistics. There's even a tutorial:

https://www.amazon.com/How-Lie-Statistics-Darrell-Huff/dp/03...

Statistics is ultimately counting, and is therefore incredibly vulnerable to discretion in choosing "what counts". Take the unemployment rate as one example. When people realize that it's not equivalent to "people who do not have a job", how can you complain that they trust statistics less?
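To make the "what counts" point concrete, here is a minimal sketch with hypothetical population numbers (the figures and category names are invented for illustration; the headline-style rate counts only active job seekers, while a broader rate also counts people who want work but have stopped looking):

```python
# Hypothetical numbers to illustrate how "what counts" changes the rate.
employed = 60
searching = 5          # unemployed and actively looking for work
discouraged = 10       # want a job but have stopped looking
not_seeking = 25       # retired, students, etc.

# Headline-style rate: only active seekers count as unemployed,
# and discouraged workers drop out of the labor force entirely.
labor_force = employed + searching
headline_rate = searching / labor_force

# Broader rate: everyone without a job who wants one counts.
broad_rate = (searching + discouraged) / (labor_force + discouraged)

print(f"headline: {headline_rate:.1%}")  # 7.7%
print(f"broad:    {broad_rate:.1%}")     # 20.0%
```

Same people, same facts; the definition alone moves the number from under 8% to 20%.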

Expert authority is in decline because it should be, as there is an increasing body of evidence that experts, from politics to medicine, have almost no advantage in forecasting power over the average person.

https://www.amazon.com/Expert-Political-Judgment-Good-Know/d...

Why should "experts" (often just pundits) have any authority when they have consistently demonstrated they deserve very little?

Finally, the political slant of this article, going along with the decried "fake news", blaming the election results on these declines in authority, is pathetic. It's basically an extension of "the other side is filled with stupids" and has no credibility, no matter how you dress it in professional journalistic veneer.

How to Lie with Statistics is a classic on this:

I like the story of the graph a lot more than the graph itself.

The two things I would like to see are:

- Debt per person who attended college (or perhaps per person who graduated). This would answer the implied question: "What if we're just getting more people going to college?"

- The salary axis should start at 0. This would put the relative movement of salaries in a more accurate context.

I don't think fixing these changes the story of "the long-term cost of college is going up, while the short-term benefits are going down," but when I see tricks out of "How to Lie with Statistics"[1], my BS detector goes off.

[1]

How to Lie with Statistics (http://www.amazon.com/How-Lie-Statistics-Darrell-Huff/dp/039...) is a short, enjoyable read. It doesn't tell you how to do statistics, but it gives some warning about common problems.
Unfortunately, sites like snopes and politifact also succumb to the same type of bias as the right-wing sites. See this article where politifact was handed propaganda material from the Clinton foundation and parroted it without doing any actual checking about AIDS drugs the Clinton foundation was funding:

https://www.amazon.com/How-Lie-Statistics-Darrell-Huff/dp/03...

Everyone has an agenda, follow the money, and trust no one. Whether it's right-wing like Alex Jones or Left-wing like the Tampa Bay Times a.k.a politifact, you need to be suspicious and do your own research if you want the truth.

You should read this book. It's short, succinct, and shows one problem with evidence: your view of the same data set can be skewed through clever manipulation.

A few examples are in order.

There are many instances in advertising where you want to show the average value of something, say the average weight loss for your new diet pill. "Average 20 pounds lost!"

Well, that's quite a trick. What average? They're likely to choose the mean rather than the median, because the mean is more sensitive to extreme values and would inflate the "average" for the same data set. They'll never tell you which average they used.

There's a second trick in the example. 20 pounds lost? In what time span? Without specifying, which advertisers generally don't, it's not even clear if the pill is more effective than a proper diet.
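The mean-versus-median trick above can be sketched in a few lines (the trial results below are invented for illustration; a few outsized successes drag the mean far above what the typical dieter experienced):

```python
import statistics

# Hypothetical weight-loss results in pounds; three big losers skew the mean.
losses = [2, 3, 3, 4, 5, 5, 6, 40, 65, 67]

mean_loss = statistics.mean(losses)      # 20.0 -- the ad's "average"
median_loss = statistics.median(losses)  # 5.0  -- the typical result

print(f"Average {mean_loss:.0f} pounds lost!")        # what the ad says
print(f"Typical dieter lost {median_loss} pounds")    # what most people got
```

Both numbers are honest "averages" of the same data; only one of them sells pills.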

Another common example of how to skew perception: the choice of axes on graphs. Say the GDP falls from 50,000 to 49,000 per capita for a country. If you choose the axis of the plot to range from 48,500 to 50,500 or so, it'll look like a catastrophic drop. If you choose the axis to range from 0 to 100,000, the drop will look insignificant. If you plot on a logarithmic scale, it might be hard to tell there's even a difference!
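The axis trick is easy to quantify: the same 1,000-unit drop occupies a wildly different share of the visible chart depending on the limits chosen. A minimal sketch (the helper function is mine, not from the book):

```python
# GDP per capita falls from 50,000 to 49,000: a 2% decline.
drop = 50_000 - 49_000

def visual_share(axis_min, axis_max):
    """Fraction of the plot height that the drop spans."""
    return drop / (axis_max - axis_min)

print(f"{visual_share(48_500, 50_500):.0%}")  # 50% of the chart: catastrophe!
print(f"{visual_share(0, 100_000):.0%}")      # 1% of the chart: barely visible
```

The underlying change is identical; the narrow axis makes it fill half the chart, while the zero-based axis makes it nearly invisible.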

There are lots more examples. The problem is that data can be manipulated in tricky ways to reach whatever conclusion you want. Peer review in science is a counter-measure to this, which generally doesn't exist in politics.

> Not significantly (8% in 2000 vs 7.4% in 2010) but it gives more context than just "there are more in 2010" in my opinion

If you torture the statistics enough, they'll generally confess to whatever you want them to. There's even an entire book on this sort of thing. (To be fair, the book's intent isn't so much to teach you how to do it, but more to teach you how to spot it and make you think about such things when reading other people's work. That said, reading it would probably help you do it better if that was your goal.)

That said, this particular example is a relatively mild one (and may even be accidental rather than intentional) -- there are much, much worse abuses out there.

Not directly political, but a prerequisite:

https://toptalkedbooks.com/amzn/0393310728

One of the most important books I've read.