SoylentNews is people

posted by martyb on Wednesday July 26 2017, @10:39AM   Printer-friendly
from the probably-a-good-idea dept.

Statistician Valen Johnson and 71 other researchers have proposed a redefinition of statistical significance in order to cut down on irreproducible results, especially those in the biomedical sciences. They propose "to change the default P-value threshold for statistical significance for claims of new discoveries from 0.05 to 0.005" in a preprint article that will be published in an upcoming issue of Nature Human Behavior:

A megateam of reproducibility-minded scientists is renewing a controversial proposal to raise the standard for statistical significance in research studies. They want researchers to dump the long-standing use of a probability value (p-value) of less than 0.05 as the gold standard for significant results, and replace it with the much stiffer p-value threshold of 0.005.

Backers of the change, which has been floated before, say it could dramatically reduce the reporting of false-positive results—studies that claim to find an effect when there is none—and so make more studies reproducible. And they note that researchers in some fields, including genome analysis, have already made a similar switch with beneficial results.

"If we're going to be in a world where the research community expects some strict cutoff ... it's better that that threshold be .005 than .05. That's an improvement over the status quo," says behavioral economist Daniel Benjamin of the University of Southern California in Los Angeles, first author on the new paper, which was posted 22 July as a preprint article [open, DOI: 10.17605/OSF.IO/MKY9J] [DX] on PsyArXiv and is slated for an upcoming issue of Nature Human Behavior. "It seemed like this was something that was doable and easy, and had worked in other fields."

But other scientists reject the idea of any absolute threshold for significance. And some biomedical researchers worry the approach could needlessly drive up the costs of drug trials. "I can't be very enthusiastic about it," says biostatistician Stephen Senn of the Luxembourg Institute of Health in Strassen. "I don't think they've really worked out the practical implications of what they're talking about."

They have proposed a P-value of 0.005 because it corresponds to Bayes factors between approximately 14 and 26 in favor of H1 (the alternative hypothesis), indicating "substantial" to "strong" evidence, and because it would reduce the false positive rate to levels they have judged to be reasonable "in many fields".
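The 14–26 range comes from the Bayes-factor calibrations worked out in the paper. A quick way to see where numbers of this size come from is the classic Sellke–Bayarri–Berger upper bound on the Bayes factor implied by a given p-value — a related calibration used here only as an illustration, not the paper's exact derivation:

```python
import math

def bayes_factor_bound(p):
    """Sellke-Bayarri-Berger upper bound on the Bayes factor for H1 over H0
    implied by a p-value: BF <= 1 / (-e * p * ln p), valid for p < 1/e."""
    return 1.0 / (-math.e * p * math.log(p))

for p in (0.05, 0.005):
    print(f"p = {p}: Bayes factor bound ~ {bayes_factor_bound(p):.1f}")
# p = 0.05  -> ~2.5  (at best "weak" evidence for H1)
# p = 0.005 -> ~13.9 (right at the lower end of the quoted 14-26 range)
```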

Is this good enough? Is it a good start?

OSF project page. If you have trouble downloading the PDF, use this link.


Original Submission

  • (Score: 2) by moondrake on Wednesday July 26 2017, @02:27PM (5 children)

    by moondrake (2658) on Wednesday July 26 2017, @02:27PM (#544648)

    I have only skimmed the paper so far, but it seems completely unworkable to me. I was surprised to see so many biologists on the author list. Any biologist with a decent understanding of statistics knows that it is simply not feasible to reach that level of significance for many real effects. I also think that statistics sometimes makes too many idealized assumptions about biological, i.e. real, populations and our ability to sample from them.

    Let me give a simple example of the kind of problem we already deal with (under the very lenient p < 0.05): growth rate often depends on some metabolic process, whose speed you can measure by various methods. Now we apply an inhibitor of this metabolism at such a low concentration that, although the true effect is 20%, the inhibition is too small relative to the natural variation between individuals to be significant. Yet after 30 days of exponential growth driven by that non-significant difference in metabolism, the treated individuals are significantly different in size, by 200%.
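A minimal simulation sketch of this scenario (all rates, noise levels, and sample sizes invented for illustration): the 20% metabolic inhibition is buried in between-individual noise, while 30 days of compounded exponential growth makes the size difference obvious.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n = 8                              # individuals per group (typical bench-scale n)

true_ctrl, true_trt = 0.10, 0.08   # per-day growth rates; treated inhibited by 20%

# Metabolism assay: the 0.02/day true difference is swamped by
# between-individual variation.
meas_ctrl = rng.normal(true_ctrl, 0.05, n)
meas_trt = rng.normal(true_trt, 0.05, n)

# Size after 30 days of exponential growth at the true rates,
# with modest extra noise on final size.
size_ctrl = np.exp(true_ctrl * 30) * rng.lognormal(0.0, 0.1, n)
size_trt = np.exp(true_trt * 30) * rng.lognormal(0.0, 0.1, n)

p_metab = stats.ttest_ind(meas_ctrl, meas_trt).pvalue  # usually well above 0.05
p_size = stats.ttest_ind(size_ctrl, size_trt).pvalue   # usually tiny

print(f"metabolism p = {p_metab:.3f}, size p = {p_size:.2e}")
```

The same underlying effect is invisible in one measurement and overwhelming in the other, which is the commenter's point about fixed thresholds.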

    But wait, you say: let's just measure more individuals to shrink our standard error. Unfortunately, for many real experiments, running that many samples introduces problems of its own (they cannot all use the same material, be done on the same day, by the same person, or with the same organisms). You can work around some of these, but not all of them (the numbers needed are staggering). So instead you do several different experiments that all point to the same thing, and argue that an effect likely exists even though your p is 0.1 — well, screw that.

    And sometimes, where it is possible to collect a ridiculous amount of data, you end up with all kinds of things that are statistically significant (a 0.0000001% difference can be significant, given enough data) without being relevant at all.
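The flip side is easy to show with a deterministic back-of-the-envelope z-test (numbers invented): with a large enough n, a difference of a ten-thousandth of a standard deviation sails past even the proposed 0.005 bar.

```python
import math

# Invented numbers: a minuscule true difference between two group means,
# expressed in units of a known standard deviation, and an absurdly large n.
effect = 1e-4           # 0.01% of one standard deviation
n = 10_000_000_000      # observations per group

# Two-sample z statistic with known unit variance: z = effect / sqrt(2/n)
z = effect / math.sqrt(2.0 / n)
p = math.erfc(z / math.sqrt(2.0))   # two-sided normal p-value

print(f"z = {z:.2f}, p = {p:.1e}")  # z ~ 7.1, p ~ 1.5e-12
```

Statistical significance at any threshold, with an effect no one would care about.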

    I see more benefit in making people understand what a p-value actually means, or in talking about likelihood ratios, than in forcing the field into a definition of significance under which nothing of interest is significant anymore.

  • (Score: 0) by Anonymous Coward on Wednesday July 26 2017, @04:01PM (4 children)

    by Anonymous Coward on Wednesday July 26 2017, @04:01PM (#544696)

    It's bizarre that you seem to understand that "significance" is a function of sample size but still think there is some point in determining it. All it measures is how much effort you are willing/able to put into collecting data in support of your idea (i.e., it measures the strength of your prior belief about whether an "effect" is positive or negative). The entire exercise is pointless.

    • (Score: 2) by moondrake on Wednesday July 26 2017, @07:07PM (3 children)

      by moondrake (2658) on Wednesday July 26 2017, @07:07PM (#544803)

      Is it bizarre that I think that? Is it not exactly what the people in this paper are proposing?

      I agree that there is no point to it, but you won't be a very successful scientist with that attitude.

      • (Score: 0) by Anonymous Coward on Wednesday July 26 2017, @10:49PM (2 children)

        by Anonymous Coward on Wednesday July 26 2017, @10:49PM (#544906)

        Yes, it is bizarre that the authors think this as well. Reading their paper, though, it sounds like many of them don't actually want NHST around either. They figure this will cut down on BS somewhat by making things slightly harder for those who don't know what they are doing.

        Also, I quit medical research for literally this reason. It was too depressing and pointless. They said it was too complicated, but even if you do a good job (come up with a "real" model of what is going on and test that), they just ask whether the group comparison was statistically significant anyway. Doing medical research right now is a waste of your career.

        • (Score: 0) by Anonymous Coward on Thursday July 27 2017, @08:43AM (1 child)

          by Anonymous Coward on Thursday July 27 2017, @08:43AM (#545076)

          You're not wrong.

          One disturbing thing I am noticing is tons of shitty, shitty studies that weakly confirm older studies but don't cite them. Instead they cite 40 articles from 2015 and later written by their fellow countrymen. It feels like a "yellow" washing of science. The sheer number of publications and citations swamps the literature and pads résumés. Publication count was always slightly dodgy, but citation count used to be somewhat reliable. Now citation counts are becoming garbage, because this new generation of "patriotic" scientists cites its own countrymen at 10x the rate of others.

          There's barely anything worth reading, and yet more and more of it is being published.

          • (Score: 0) by Anonymous Coward on Thursday July 27 2017, @03:55PM

            by Anonymous Coward on Thursday July 27 2017, @03:55PM (#545231)

            Meh, I've seen the same BS from all cultures. Some are just more sophisticated than others at producing the junk, thanks to steps they memorized in school.