Teschler on Topic
Leland Teschler • Executive Editor
Here’s a quick way to make yourself unpopular and bruise egos among researchers in your field: Check the statistics in their research findings for errors.
It turns out that researchers in many fields aren’t particularly good statisticians. So when they apply statistical tools to data they’ve collected, they often screw up the math or draw the wrong conclusions from their calculations.
In particular, researchers are prone to find statistical significance in results where there really isn’t any. So warns Steve Ziliak, an economics professor at Roosevelt University. Ziliak coauthored a book called The Cult of Statistical Significance in which he warns that researchers frequently misuse Student’s t-test and p-values. Ziliak combed through papers published in numerous prestigious economics, operations research, and medical journals. He found many instances of researchers treating statistical significance as if it were the same thing as correlation.
The distinction between the two concepts isn’t just a matter of pedantic statistical minutiae. In medical research, for example, confusion about significance levels can lead to rejecting good drugs in favor of alternatives that are less effective.
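One way to see why statistical significance is not the same as a meaningful result: with a large enough sample, even a trivially small effect produces an impressive test statistic. The sketch below (my illustration, not from Ziliak’s book) computes a one-sample t statistic for a hypothetical pain score; the numbers are invented for demonstration.

```python
import math

def t_statistic(mean_diff, sd, n):
    """One-sample t statistic: how many standard errors the
    observed mean difference lies from zero."""
    return mean_diff / (sd / math.sqrt(n))

# A clinically trivial 0.1-point improvement (sd = 10) looks
# unremarkable at n = 100 but "highly significant" at n = 1,000,000,
# even though the effect itself never changed.
small = t_statistic(0.1, 10.0, 100)        # t = 0.1
large = t_statistic(0.1, 10.0, 1_000_000)  # t = 10.0
```

The t statistic grows with the square root of the sample size, so a big-enough study can make almost any nonzero difference statistically significant without it mattering in practice.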
There is also evidence that technical personnel aren’t particularly good at catching simple math errors in their work. So says Stanford Associate Professor Kristin Sainani. Writing in the journal of the American Academy of Physical Medicine and Rehabilitation, she says statistical errors are surprisingly common in biomedical literature, and many of them are detectable simply by running the numbers given in the paper.
Common sense and simple arithmetic are often all that’s required to find problems. In one paper, Sainani says she found numerical problems just by scanning the numbers in a table—they didn’t add up. In another case, Sainani noticed that a non-surgical pain treatment claimed a drastic, and implausibly large, reduction in pain. A little sleuthing revealed the researchers had confused standard error with standard deviation when compiling results.
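The standard-error-versus-standard-deviation mix-up Sainani caught is easy to reproduce. A minimal sketch (my own illustration, with made-up sample values) of the two quantities, which differ by a factor of the square root of the sample size:

```python
import math

def summarize(values):
    """Return mean, sample standard deviation, and standard error
    of the mean for a list of measurements."""
    n = len(values)
    mean = sum(values) / n
    # Standard deviation: spread of the individual measurements.
    sd = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    # Standard error: uncertainty in the mean itself, always smaller.
    se = sd / math.sqrt(n)
    return mean, sd, se

mean, sd, se = summarize([2, 4, 4, 4, 5, 5, 7, 9])
```

Because the standard error is the standard deviation divided by √n, reporting SE where SD belongs makes the data look far less variable than it is, which is how a pain-reduction effect can come out looking implausibly large relative to the apparent spread.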
I’m sure the researchers who made these blunders were embarrassed when confronted with their mistakes. Interestingly, free tools on the internet now make it possible for almost anyone to point out mathematical errors to the chagrin of those who made them.
One in this category is called Statcheck (statcheck.io), which extracts statistics from papers and checks them for internal consistency. Another free online calculator, Grim (the granularity-related inconsistency of means test, www.prepubmed.org/grim_test/), flags impossible mean values. When the reported mean and sample size are entered, Grim tells you whether they are consistent or inconsistent, i.e., whether that mean can actually be computed if the samples are all whole numbers.
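The idea behind the Grim test fits in a few lines. This sketch (my own reconstruction of the published test logic, not code from prepubmed.org) checks whether a reported mean is achievable when the underlying n values are integers:

```python
def grim_consistent(reported_mean, n, decimals=2):
    """Return True if a mean reported to `decimals` places could
    arise from n whole-number samples (the GRIM test idea)."""
    target = round(reported_mean, decimals)
    # A true mean must be (integer total) / n, so find the integer
    # total nearest the reported mean and check neighbors too,
    # to guard against rounding edge cases.
    total = round(reported_mean * n)
    for t in (total - 1, total, total + 1):
        if round(t / n, decimals) == target:
            return True
    return False

# With n = 28 integer scores, a mean of 5.18 is possible (145/28),
# but 5.19 is not: 145/28 rounds to 5.18 and 146/28 to 5.21.
grim_consistent(5.18, 28)  # True
grim_consistent(5.19, 28)  # False
```

The test works because n integers can only produce means that are multiples of 1/n; any reported mean that falls between those multiples, after rounding, cannot be real.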
An additional fun online tool is WebPlotDigitizer (apps.automeris.io/wpd/), which examines a plot you upload as an image and extracts the x and y values, among other things. Given a forest plot (a graphical display of estimated results from a number of scientific studies addressing the same question), the tool also extracts parameters such as means and confidence intervals. Given histograms or bar charts, it figures percentages and can calculate angles and distances from images.
These tools can open the door to all kinds of mischief. Those who are especially industrious might aspire to the achievements of English anesthetist John Carlisle. When he can’t sleep, he goes through data in published clinical trials looking for problems. According to Scientific American, Carlisle’s part-time efforts have led to the retraction or correction of hundreds of papers and have helped end the careers of three scientists who faked data outright.
Not bad results for just a hobby. DW