posted by martyb on Friday April 19 2019, @06:34PM
from the significant-change dept.

In science, the success of an experiment is often determined by a measure called "statistical significance." A result is considered to be "significant" if the difference observed in the experiment between groups (of people, plants, animals and so on) would be very unlikely if no difference actually exists. The common cutoff for "very unlikely" is that you'd see a difference as big or bigger only 5 percent of the time if it wasn't really there — a cutoff that might seem, at first blush, very strict.

It sounds esoteric, but statistical significance has been used to draw a bright line between experimental success and failure. Achieving an experimental result with statistical significance often determines if a scientist's paper gets published or if further research gets funded. That makes the measure far too important in deciding research priorities, statisticians say, and so it's time to throw it in the trash.

More than 800 statisticians and scientists are calling for an end to judging studies by statistical significance in a March 20 comment published in Nature. An accompanying March 20 special issue of the American Statistician makes the manifesto crystal clear in its introduction: "'statistically significant' — don't say it and don't use it."

There is good reason to want to scrap statistical significance. But with so much research now built around the concept, it's unclear how — or with what other measures — the scientific community could replace it. The American Statistician offers a full 43 articles exploring what scientific life might look like without this measure in the mix.
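
A rough illustration of that 5 percent cutoff, with invented numbers: treat it as a simulation question. If two groups of 50 really come from the same population, how often does sampling alone produce a gap at least as big as the one observed? A minimal Python sketch:

    import random

    # Invented numbers: two groups of 50 and an observed difference in means of 2.0,
    # with both groups drawn from one population in which no real difference exists.
    population = [random.gauss(10, 5) for _ in range(100000)]
    observed_diff = 2.0
    trials = 10000

    as_big = 0
    for _ in range(trials):
        a = random.sample(population, 50)
        b = random.sample(population, 50)
        if abs(sum(a) / 50 - sum(b) / 50) >= observed_diff:
            as_big += 1

    # "Statistically significant" at the usual threshold means this fraction is below 0.05.
    print(as_big / trials)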

Statistical Significance

Is it time for "P ≤ 0.05" to be abandoned or changed?


Original Submission

This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • (Score: 3, Insightful) by ikanreed on Friday April 19 2019, @06:49PM (2 children)

    by ikanreed (3164) Subscriber Badge on Friday April 19 2019, @06:49PM (#832244) Journal

    Decentralize.

    I've read the back and forth on this. The biggest contingent just wants to make "chance of happening randomly" less relevant. It's still going to be something you'll want to analyze. You say "Oh wow, I found an effect size of 100% in this sample," but then your sample size is 3, and when you do the p analysis you find it could happen randomly within your distribution 1 out of 5 times.
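
    A concrete, made-up version of those numbers: if all 3 subjects show the effect, but the outcome already happens about 60% of the time under the null, the "perfect" result still shows up by chance roughly one time in five.

        # Invented numbers: a "100% effect" in a sample of 3, against a null where the
        # outcome occurs 60% of the time on its own.
        p_null = 0.6
        n = 3
        p_value = p_null ** n   # chance of 3 hits out of 3 under the null
        print(p_value)          # 0.216, roughly 1 in 5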

    The main problem we have is that "significant" is frequently not significant in the real and intuitive sense, in that it doesn't inform us of something predictive.

    My opinion is that the thing to do is up front hypotheses, before any analysis or data collection is done. It would do more to cure the p-hacking than any amount of stricture about what kinds of analysis are "good enough".

    • (Score: 0) by Anonymous Coward on Friday April 19 2019, @07:57PM

      by Anonymous Coward on Friday April 19 2019, @07:57PM (#832263)

      What does "happen randomly" mean? There are always multiple models of "random chance" available derived from different assumptions. Eg, binomial vs poisson binomial.

      https://en.m.wikipedia.org/wiki/Binomial_distribution [wikipedia.org]
      https://en.m.wikipedia.org/wiki/Poisson_binomial_distribution [wikipedia.org]

      You are testing the validity of the assumptions, not chance.
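
      A sketch of that point with illustrative numbers: the same observed count gets a different tail probability under each model of "chance", even though both models expect the same number of successes on average.

          from math import comb

          # Hypothetical data: 10 trials, 8 observed "successes".
          # Model A: binomial, every trial has success probability 0.5.
          # Model B: Poisson binomial, trials have different probabilities with the same mean.
          probs_b = [0.1, 0.2, 0.3, 0.4, 0.5, 0.5, 0.6, 0.7, 0.8, 0.9]

          def binomial_tail(n, k, p):
              return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

          def poisson_binomial_tail(probs, k):
              dist = [1.0]                       # distribution over the number of successes
              for p in probs:
                  new = [0.0] * (len(dist) + 1)
                  for successes, mass in enumerate(dist):
                      new[successes] += mass * (1 - p)
                      new[successes + 1] += mass * p
                  dist = new
              return sum(dist[k:])

          print(binomial_tail(10, 8, 0.5))          # ~0.055
          print(poisson_binomial_tail(probs_b, 8))  # a different value, same expected count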

    • (Score: 5, Insightful) by jb on Saturday April 20 2019, @01:07AM

      by jb (338) on Saturday April 20 2019, @01:07AM (#832389)

      My opinion is that the thing to do is up front hypotheses, before any analysis or data collection is done. It would do more to cure the p-hacking than any amount of stricture about what kinds of analysis are "good enough".

      Precisely. And that's the way things were for decades (or even centuries, depending on which field of science you're interested in), until the current fad of "junk science" took off.

      The problem is right there in the opening sentence of TFS:

      In science, the success of an experiment is often determined by a measure called "statistical significance."

      When I was at university (a long time ago), doing that would have earned me a fail. It was drummed into us over & over again that inferential statistics (of any kind) are only useful as a sanity check, after the fact. Reversing the order of things (by just running a bunch of stats on an existing data set then manufacturing a hypothesis afterwards to fit the strongest statistical result) was regarded, quite rightly, as cheating, since such "results" are meaningless.
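
      A toy demonstration of why, using invented data that is pure noise: screen enough variables against an outcome and a few will clear the p < 0.05 bar on their own, ready to have a "hypothesis" written around them after the fact.

          import random

          # Pure noise: 40 candidate "predictors", one outcome, no real effects anywhere.
          n, k = 50, 40
          outcome = [random.gauss(0, 1) for _ in range(n)]

          def correlation(x, y):
              mx, my = sum(x) / len(x), sum(y) / len(y)
              sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
              sxx = sum((a - mx) ** 2 for a in x)
              syy = sum((b - my) ** 2 for b in y)
              return sxy / (sxx * syy) ** 0.5

          # |r| > ~0.28 is roughly the p < 0.05 threshold for n = 50.
          hits = sum(
              1 for _ in range(k)
              if abs(correlation([random.gauss(0, 1) for _ in range(n)], outcome)) > 0.28
          )
          print(hits)  # typically a couple of "significant" correlations, all meaningless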

      There's nothing wrong with using suitable statistics to help confirm the validity of a properly designed experiment after its completion. But using them to come up with a proposition to "test" (it's no test at all by then) is more akin to astrology than science...

  • (Score: 0) by Anonymous Coward on Friday April 19 2019, @07:01PM (1 child)

    by Anonymous Coward on Friday April 19 2019, @07:01PM (#832247)

    A group of people uses a tool badly, and then just wants to throw it away and replace it with... nothing?
    Why don't they just use their new methodology and see how it fares in the long run? There could even be a statistical measurement of how good it is.
    Also, why does "everyone" have to drop this tool at the same time? I don't understand.

    • (Score: 2) by takyon on Friday April 19 2019, @07:04PM

      by takyon (881) <takyonNO@SPAMsoylentnews.org> on Friday April 19 2019, @07:04PM (#832249) Journal

      why does "everyone" have to drop this tool at the same time?

      I wouldn't bet on that happening. Every journal will have its own policy. Maybe Springer Nature, AAAS, etc. will set the policy for large groups of journals, but I still would not expect a consensus on this.

      --
      [SIG] 10/28/2017: Soylent Upgrade v14 [soylentnews.org]
  • (Score: 0) by Anonymous Coward on Friday April 19 2019, @07:01PM (7 children)

    by Anonymous Coward on Friday April 19 2019, @07:01PM (#832248)

    "you'd see a difference as big or bigger only 5 percent of the time if it wasn't really there —"

    The problem to begin with is that they should be testing predictions of their theories, not looking for "differences". Academic research is so fucked... I was just watching The Big Short and it's exactly like the financial markets around 2006. How do I make money off this?

    • (Score: 2) by ikanreed on Friday April 19 2019, @07:11PM (6 children)

      by ikanreed (3164) Subscriber Badge on Friday April 19 2019, @07:11PM (#832251) Journal

      How do I make money off this?

      Be the publisher: mandate that dozens upon dozens of grad students have to review each article for their advisor who's "too busy" to do it, charge institutions as if every byte of data costs you a brick of gold to serve, and pretend you're a proud institution protecting science from the riff-raff.

      • (Score: 0) by Anonymous Coward on Friday April 19 2019, @07:25PM (2 children)

        by Anonymous Coward on Friday April 19 2019, @07:25PM (#832256)

        How do *I*, the common peasant, make money off this?

        • (Score: 1) by khallow on Friday April 19 2019, @10:39PM (1 child)

          by khallow (3766) Subscriber Badge on Friday April 19 2019, @10:39PM (#832341) Journal

          Research shows that bullshit in your diet reduces colon cancer. Buy my $19.95 book to learn how you can get more bullshit in your diet. But beware the unhealthy bullshit, which I'll describe in the sequel to my first book, also $19.95, which will shorten your life instead. And of course, there's the high-protein bullshit diet I outline in my third book, again for $19.95, where I describe how you can add decades to your life, but only if you buy all three books and follow the nebulous instructions exactly.

          • (Score: 1, Touché) by Anonymous Coward on Friday April 19 2019, @11:45PM

            by Anonymous Coward on Friday April 19 2019, @11:45PM (#832373)

            That is making money off BS, albeit a different variant. I want to know how to make money off the eventual collapse of the current "mainstream" BS.

      • (Score: 0) by Anonymous Coward on Friday April 19 2019, @07:38PM (2 children)

        by Anonymous Coward on Friday April 19 2019, @07:38PM (#832259)

        No, I want to make money off the eventual "crash" of this government-propped-up bubble in BS research.

        • (Score: 2) by ikanreed on Saturday April 20 2019, @01:58AM (1 child)

          by ikanreed (3164) Subscriber Badge on Saturday April 20 2019, @01:58AM (#832399) Journal

          Be really good at timing, and short-sell those same publishers?

          • (Score: 0) by Anonymous Coward on Saturday April 20 2019, @04:50AM

            by Anonymous Coward on Saturday April 20 2019, @04:50AM (#832449)

            The publishers are just a cynical parasite that accounts for ~1% of the fake value.

  • (Score: 0) by Anonymous Coward on Friday April 19 2019, @07:27PM (11 children)

    by Anonymous Coward on Friday April 19 2019, @07:27PM (#832257)

    The letter is signed by a bunch of medical scientists. Their problems with statistical significance do not apply to science generally. Physics uses a .0000003 cutoff to determine statistical significance, and it's not a problem. Medical research has the problem because .05 is such a weak result.
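
    For reference, the .0000003 figure is the one-sided tail probability of a five-sigma deviation on a normal distribution; a quick check of the two conventions side by side, using only the Python standard library:

        from statistics import NormalDist

        print(1 - NormalDist().cdf(5))     # ~0.0000003: the five-sigma cutoff used in physics
        print(1 - NormalDist().cdf(1.96))  # ~0.025 one-sided (~0.05 two-sided): the medical convention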

    • (Score: 0) by Anonymous Coward on Friday April 19 2019, @07:36PM (10 children)

      by Anonymous Coward on Friday April 19 2019, @07:36PM (#832258)

      http://nautil.us/blog/the-present-phase-of-stagnation-in-the-foundations-of-physics-is-not-normal [nautil.us]

      Progress in physics stopped in the 1970s, right when they adopted NHST (testing something besides the predictions of their theory). Since then all they've done is "verify" what they already "knew".

      It destroys every field that adopts it because it is pseudoscience. In science you predict something and hope to *not* observe a "significant" deviation.

      • (Score: 1, Funny) by Anonymous Coward on Friday April 19 2019, @08:21PM (2 children)

        by Anonymous Coward on Friday April 19 2019, @08:21PM (#832273)

        Wow! The Null Hypothesis AC! Didn't see that coming! What were the odds of him posting in this thread, I wonder? Statistically significant, I bet. P=1.

        • (Score: 0) by Anonymous Coward on Friday April 19 2019, @08:35PM (1 child)

          by Anonymous Coward on Friday April 19 2019, @08:35PM (#832279)

          Wow, the AC with nothing of value to say about the topic.

          • (Score: 0) by Anonymous Coward on Friday April 19 2019, @09:48PM

            by Anonymous Coward on Friday April 19 2019, @09:48PM (#832315)

            The Null Hypothesis is the AC! Thereby proving that the Parent AC is not statistically significant.

      • (Score: 1, Informative) by Anonymous Coward on Friday April 19 2019, @08:38PM (6 children)

        by Anonymous Coward on Friday April 19 2019, @08:38PM (#832281)

        Progress in physics slowed because the low-hanging fruit has been picked, and successful fields like electrical engineering get spun off. Do you think there is an endless supply of potential discoveries like electromagnetism, lasers, and transistors?

        Before gravitational waves were measured, it was not known how many events would be visible. Some people thought none at all. Statistical significance was used to prove that the measurements were meaningful. Now the term is gauche? A Nobel Prize was recently given for work that required it.

        • (Score: 0) by Anonymous Coward on Friday April 19 2019, @10:05PM (5 children)

          by Anonymous Coward on Friday April 19 2019, @10:05PM (#832324)

          Physicists said all the low-hanging fruit was picked in 1900 too. It might be true one day, but usually it's just an excuse.

          https://hsm.stackexchange.com/questions/2129/who-said-that-essentially-everything-in-theoretical-physics-had-already-been-dis [stackexchange.com]

          • (Score: 3, Insightful) by Coward, Anonymous on Friday April 19 2019, @10:28PM (2 children)

            by Coward, Anonymous (7017) on Friday April 19 2019, @10:28PM (#832337) Journal

            Do you have any evidence that physicists now are less intelligent or creative than a century ago? There are many more of them, yet they are producing fewer fundamental discoveries. That is some pretty strong evidence that the low-hanging fruit has been picked.

            Null experiments are something to do when you don't know what else to do. I'm not a big fan myself, and on the list of Nobel Prizes in physics [wikipedia.org] I didn't see any for null-hypothesis testing. Physics is not dominated by this approach.

            • (Score: 0) by Anonymous Coward on Friday April 19 2019, @11:48PM

              by Anonymous Coward on Friday April 19 2019, @11:48PM (#832374)

              Yes, there are many more of them, but they are wasting their time on NHST-based experiments, or are confused about which evidence a theory actually needs to explain because of the BS that NHST generates.

            • (Score: 3, Interesting) by RamiK on Saturday April 20 2019, @04:34AM

              by RamiK (1813) on Saturday April 20 2019, @04:34AM (#832445)

              Do you have any evidence that physicists now are less intelligent or creative than a century ago?

              A century ago you had John von Neumann and far fewer scientists per capita (and far fewer people alive). In the spirit of the topic, it averages out to smarter overall.

              --
              compiling...
          • (Score: 2) by hendrikboom on Saturday April 20 2019, @02:20AM (1 child)

            by hendrikboom (1125) Subscriber Badge on Saturday April 20 2019, @02:20AM (#832415) Homepage Journal

            Gravitational telescopes and particle colliders are sufficiently expensive that their fruit cannot be called low-hanging.

            • (Score: 0) by Anonymous Coward on Saturday April 20 2019, @11:38AM

              by Anonymous Coward on Saturday April 20 2019, @11:38AM (#832510)

              The need to rely on brute force could also indicate a lack of cleverness.

  • (Score: 5, Informative) by NotSanguine on Friday April 19 2019, @07:53PM (3 children)

    To include the link to the "March 20 comment published in Nature" [nature.com]? Especially since the link was included in the original text quoted in TFS?

    That comment [nature.com] doesn't actually call for folks to stop using p-values, rather they call for such p-values not to be used as arbiters of valid vs. invalid:

    Let’s be clear about what must stop: we should never conclude there is ‘no difference’ or ‘no association’ just because a P value is larger than a threshold such as 0.05 or, equivalently, because a confidence interval includes zero. Neither should we conclude that two studies conflict because one had a statistically significant result and the other did not. These errors waste research efforts and misinform policy decisions.

    And they certainly don't call for an end to using p-values.
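
    A toy example of the error they describe, with invented numbers: two studies can estimate exactly the same effect and still land on opposite sides of the 0.05 line, simply because one has a wider confidence interval.

        from statistics import NormalDist

        # Invented numbers: both studies estimate the same risk difference of 0.10;
        # study A is larger, so its standard error (and confidence interval) is smaller.
        studies = {"A": (0.10, 0.04), "B": (0.10, 0.07)}   # (estimate, standard error)
        z95 = NormalDist().inv_cdf(0.975)

        for name, (est, se) in studies.items():
            lo, hi = est - z95 * se, est + z95 * se
            p = 2 * (1 - NormalDist().cdf(est / se))
            verdict = "significant" if p < 0.05 else "not significant"
            print(f"Study {name}: 95% CI ({lo:.2f}, {hi:.2f}), p = {p:.3f} -> {verdict}")

        # Same estimate in both studies; declaring that they "conflict" would be the mistake.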

    Personally, I think that low p-values should be treated as Isaac Asimov pointed out [quotationspage.com] in another context:

    The most exciting phrase to hear in science, the one that heralds new discoveries, is not 'Eureka!' (I found it!) but 'That's funny ...'

    Such results should prompt additional analysis and (sorely lacking as well) attempts at replication.

    --
    No, no, you're not thinking; you're just being logical. --Niels Bohr
    • (Score: 2) by edIII on Friday April 19 2019, @08:06PM

      by edIII (791) on Friday April 19 2019, @08:06PM (#832268)

      The most exciting phrase to hear in science, the one that heralds new discoveries, is not 'Eureka!' (I found it!) but 'That's funny ...'

      This happens a lot more often than people may think too. Accidental Scientific Discoveries [mentalfloss.com]

      --
      Technically, lunchtime is at any moment. It's just a wave function.
    • (Score: 0) by Anonymous Coward on Friday April 19 2019, @08:08PM (1 child)

      by Anonymous Coward on Friday April 19 2019, @08:08PM (#832269)

      Using p m-values isn't the same as using statistical significance. Checking for a statistically significant difference is a misuse of p-values, unless you have a theoretical reason to predict such a thing.

      • (Score: 0) by Anonymous Coward on Friday April 19 2019, @08:10PM

        by Anonymous Coward on Friday April 19 2019, @08:10PM (#832270)

        p m-values -> p-values

  • (Score: 0) by Anonymous Coward on Friday April 19 2019, @08:02PM (6 children)

    by Anonymous Coward on Friday April 19 2019, @08:02PM (#832266)

    A statistical study, by definition, only provides evidence for correlations, but everyone talking about these studies always assumes that causation is proven. Read an article about any published study, and you will always find causative prescriptions attached - do this, eat that, sleep more, vote democrat, etc. These prescriptions are never justified based on the evidence of correlation found in these studies. Causation research is rare and you always have to read the actual paper to find out whether causation was investigated.

    Really, the best thing to do at this point is to stop reporting any correlation studies altogether. Yes, they are still useful to guide further causation studies, but non-scientists just become confused and take statistics as dogma. Ban it and the world will be a better place.

    • (Score: 0) by Anonymous Coward on Friday April 19 2019, @08:14PM (4 children)

      by Anonymous Coward on Friday April 19 2019, @08:14PM (#832271)

      What is an example of a "causation study"?

      • (Score: 0) by Anonymous Coward on Friday April 19 2019, @08:47PM

        by Anonymous Coward on Friday April 19 2019, @08:47PM (#832285)

        > What is an example of a "causation study"?

        Well, Google thinks that it's a statistical study like this (first hit using your sentence as the search string):

        https://www.fmcsa.dot.gov/safety/research-and-analysis/large-truck-crash-causation-study-ltccs-analysis-series-using-ltccs [dot.gov]

        The Large Truck Crash Causation Study (LTCCS) was undertaken jointly by the Federal Motor Carrier Safety Administration (FMCSA) and the National Highway Traffic Safety Administration (NHTSA). The LTCCS is based on a nationally representative sample of nearly 1,000 injury and fatal crashes involving large trucks that occurred between April 2001 and December 2003. The data collected provide a detailed description of the physical events of each crash, along with an unprecedented amount of information about all the vehicles and drivers, weather and roadway conditions, and trucking companies involved in the crashes.

        But my interpretation of your sentence requires an actual experiment using the classic version of the scientific method -- hypothesis and so on.

      • (Score: 5, Insightful) by Thexalon on Friday April 19 2019, @09:13PM (2 children)

        by Thexalon (636) on Friday April 19 2019, @09:13PM (#832296)

        A causation study would be one that demonstrates the process by which A leads to B by doing A to one group while ensuring A doesn't happen to another group and seeing if B happens.

        They're harder to do in a lot of sciences because:
        A. We don't have a few copies of Earth sitting around to use for experiments.
        B. We don't have an easy way of moving stars, planets, and other really large objects around.
        C. Ethics boards are kinda keen on human test subjects surviving the experiment.
        D. It's really hard to isolate some things, because people are complicated.
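
        A minimal sketch of that design, with invented effect sizes: even when a hidden trait also drives B, randomly assigning A balances that trait across the two groups, so the difference in B rates can be read causally.

            import random

            # Invented numbers: a hidden trait raises the chance of B; doing A adds 15 points.
            def has_trait():
                return random.random() < 0.4

            def b_happens(did_a, trait):
                base = 0.5 if trait else 0.2
                return random.random() < base + (0.15 if did_a else 0.0)

            people = [has_trait() for _ in range(10000)]
            random.shuffle(people)
            treated, control = people[:5000], people[5000:]

            rate_treated = sum(b_happens(True, t) for t in treated) / len(treated)
            rate_control = sum(b_happens(False, t) for t in control) / len(control)
            print(round(rate_treated - rate_control, 3))  # ~0.15: the effect of A, trait balanced out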

        --
        The only thing that stops a bad guy with a compiler is a good guy with a compiler.
        • (Score: 0) by Anonymous Coward on Saturday April 20 2019, @04:50AM (1 child)

          by Anonymous Coward on Saturday April 20 2019, @04:50AM (#832450)

          Those are all true for large-scale sociology conundrums.

          However, there are areas where correlation is consistently taken as causation when a causation study would be more appropriate. For example, a lot of the studies about the effects of cannabis are suspect for this reason. I want to know what the downsides actually are if I'm, say, a cancer patient weighing it against opioids, or if I have anxiety and I'm weighing it against an SSRI. I'm not interested in mental illness being correlated, mostly because of the consistent dismissal that causation could be the other way around, i.e. self-medication.

          (And I really would like to know that. I switched from [legal] cannabis to bupropion so I could quit smoking. I did not expect bupropion to actually be effective as an anti-depressant as well, so I'm pleasantly surprised that it also has that effect for me. Now I want to know whether bupropion causes hypertension or if it's merely correlated with hypertension, and I want to know whether cannabis causes mental illness or is merely correlated with it. I can't make an objective decision without causation being established.)

          • (Score: 2) by Thexalon on Monday April 22 2019, @07:34PM

            by Thexalon (636) on Monday April 22 2019, @07:34PM (#833498)

            Medical research usually runs into problems with (D): People are complicated, which makes effects hard to isolate.

            For example, is it the cannabis versus the opioids versus something else, the level of sunlight and thus Vitamin D, the pesticides used on what they had for dinner last Tuesday, etc.

            --
            The only thing that stops a bad guy with a compiler is a good guy with a compiler.
    • (Score: 1, Informative) by Anonymous Coward on Friday April 19 2019, @09:02PM

      by Anonymous Coward on Friday April 19 2019, @09:02PM (#832293)

      Oh, because authors of scientific publications need to worry about how non-scientists will misinterpret them? That's not how the world works.

  • (Score: 0) by Anonymous Coward on Friday April 19 2019, @08:33PM (3 children)

    by Anonymous Coward on Friday April 19 2019, @08:33PM (#832277)

    Generally speaking, this complaint is calling for providing more information than just p (significance) to judge an experiment or statistic. To properly interpret any statistical conclusion, you need:

    Findings: conclusion based on hypothesis
    p: significance
    n: population size
    population descriptor (remember, "In mice," from a few days ago?)

    (It would also be nice to include experiment methodology, [relative] standard deviation, and/or correlation coefficient, but that would require slightly more audience understanding.)

    This is as true with clickbait medical papers as it is with clickbait political poll results. Those are "damned statistics" instead of useful information.

    A lot of the time now, when publishing new information, the only things used in headlines are Findings. Significance may be in the article. You may need to read the abstract to get the population descriptor and the actual paper to get population size. No one should need to dig through so many layers to get to the truth about statistics that are being put on display. Summarize it all or be ignored.

    • (Score: 0) by Anonymous Coward on Friday April 19 2019, @08:38PM (2 children)

      by Anonymous Coward on Friday April 19 2019, @08:38PM (#832282)

      So you need the significance and sample size, but not effect size?

      • (Score: 0) by Anonymous Coward on Friday April 19 2019, @10:00PM (1 child)

        by Anonymous Coward on Friday April 19 2019, @10:00PM (#832321)

        That's another of those "requires greater audience understanding" bits. The stuff I listed as mandatory can be written into a short, one-sentence summary, whereas effect size, along with correlation and deviation, goes in the second or third sentence. Also, as effect size is usually used in studies of studies, it isn't always applicable. The complaint is about reporting of medical findings, the most sensational reporting of which concerns initial findings with no follow-up. (I work in manufacturing QA and product/test-method development, so effect size is not valid in all my work either.)

        Completely made-up example:
        "In 100 studies of 100+ people, on average, 5 people polled actually understand the meaning of the phrase, "relative standard deviation" (p = 0.05); more people should study statistics.
        These polls were conducted outside shopping malls, were one-question, long-answer verbal surveys, and included only those people who would answer the question. Those who would not answer were asked why, and their results were recorded and categorized, if possible; over multiple studies, it is noted that as much as 50% of those asked did not answer the target question, which is considered acceptable by the polling industry."

        Effect size is also usually buried in the full-text paper alongside either methodology or findings, topic-dependent.

        The more actual, non-speculative information reported in an article about a finding, the better.
        It sure would be nice if reporters understood what they were reporting on such that their news service was more effective in providing accurate information.

        • (Score: 0) by Anonymous Coward on Friday April 19 2019, @10:09PM

          by Anonymous Coward on Friday April 19 2019, @10:09PM (#832328)

          I was making fun of you, since you only need two of the three to get the other. The point of a p-value is to normalize effect size to sample size.
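
          A back-of-the-envelope sketch of that relationship (one-sample case, normal approximation, purely illustrative numbers): fix any two of effect size, sample size, and p-value and the third follows.

              from math import sqrt
              from statistics import NormalDist

              def p_from(effect_size, n):
                  z = effect_size * sqrt(n)        # standardized effect scaled by sample size
                  return 2 * (1 - NormalDist().cdf(abs(z)))

              print(p_from(0.5, 10))      # medium effect, tiny sample  -> p ~ 0.11
              print(p_from(0.5, 100))     # same effect, larger sample  -> p ~ 0.0000006
              print(p_from(0.05, 10000))  # trivial effect, huge sample -> the same p ~ 0.0000006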

  • (Score: 2, Insightful) by Anonymous Coward on Friday April 19 2019, @08:38PM

    by Anonymous Coward on Friday April 19 2019, @08:38PM (#832280)

    "Achieving an experimental result with statistical significance often determines if a scientist's paper gets published or if further research gets funded."

  • (Score: 3, Interesting) by jmorris on Friday April 19 2019, @08:50PM (1 child)

    by jmorris (4844) on Friday April 19 2019, @08:50PM (#832288)

    The problem of overdependence on dodgy use of statistics runs deeper than just using too big of a p-value cutoff.

    Start by reading William Briggs's writings on the subject Classic Posts [wmbriggs.com]. Scroll down to the Probability & Statistics section and read a few at random. If you actually have a mind oriented toward science you will lose a few hours there. It is worth it. It is not required that you even agree with everything there, but most of it is fascinating.

    And if you really want to have your worldview challenged, go read Thomas Carlyle's Chartism [google.com] for an unpopular take on the basic error behind most use of statistics and charts.

    • (Score: 2, Insightful) by pTamok on Friday April 19 2019, @09:18PM

      by pTamok (3042) on Friday April 19 2019, @09:18PM (#832298)

      Part of the issue is people not understanding the tools they are using. The 'throw a dataset at a bunch of analysis programs and see what sticks' approach is used by far too many people.

      Just as many computer programs written by scientists to aid their work turn out to be badly written, so statistical analysis done by people who are experts in their field but have had little or no education in statistics often turns out to be flawed.

      The issue is not so much whether a p-value is less than 0.05, but whether the statistical analysis is correct, relevant, and contextually aware. I am not an expert in statistics. I know my ignorance of this topic is embarrassingly large, but at least I know that I should not opine on areas I am so profoundly ignorant in. Unfortunately, many researchers are not so self-aware.

      Significant results should be reproducible. This seems to be a fairly basic requirement, yet studies of the reproducibility of research have found a worrying lack of it. e.g. Nature Human Behaviour: A manifesto for reproducible science [nature.com]

  • (Score: 4, Funny) by aristarchus on Friday April 19 2019, @09:13PM (5 children)

    by aristarchus (2645) on Friday April 19 2019, @09:13PM (#832295) Journal

    Alright,

    More than 800 statisticians and scientists

    How statistically significant is this? Is it like something one might read on Quillette? Inquiring minds want to know.

    According to the United States Bureau of Labor Statistics, as of 2014, 26,970 jobs were classified as statistician in the United States.

    https://en.wikipedia.org/wiki/Statistician [wikipedia.org]

    The current population of the United States of America is 328,621,262 as of Friday, April 19, 2019, based on the latest United Nations estimates.
    the United States population is equivalent to 4.27% of the total world population.

    https://www.worldometers.info/world-population/us-population/ [worldometers.info]

    26,970/328,621,262=.0000820701613640568 or 0.00820701613641%, so,

    As of February 2019, the total population of the world exceeds 7.71 billion people

    http://worldpopulationreview.com/ [worldpopulationreview.com]

    So we are looking at around 2,581,838.1 Statisticians, world wide. We add in Scientists.
    From Unesco [unesco.org]:

    There were 7.8 million full-time equivalent researchers in 2013, representing growth of 21% since 2007. Researchers accounted for 0.1% of the global population.

      2,581,838.1+7,800,000= 10,381,838

    And 800 out of those ten million are calling for the end of the term "statistically significant".

    • (Score: 5, Funny) by Bot on Friday April 19 2019, @10:29PM

      by Bot (3902) on Friday April 19 2019, @10:29PM (#832338) Journal

      >And 800 out of those ten million are calling for the end of the term "statistically significant".

      True true, but appeal to rationality didn't work for the default choice of init on Linux systems, so it might not work for science in general either.

      --
      Account abandoned.
    • (Score: 0) by Anonymous Coward on Saturday April 20 2019, @12:55PM (3 children)

      by Anonymous Coward on Saturday April 20 2019, @12:55PM (#832542)

      You are treating this like a random sample; it isn't.

      • (Score: 2) by aristarchus on Saturday April 20 2019, @07:52PM (2 children)

        by aristarchus (2645) on Saturday April 20 2019, @07:52PM (#832674) Journal

        It is a self-selected sample of a rather large population. Kind of like a Fox News Poll.

        • (Score: 0) by Anonymous Coward on Saturday April 20 2019, @10:55PM (1 child)

          by Anonymous Coward on Saturday April 20 2019, @10:55PM (#832751)

          They don't treat themselves as a sample of anything; only you do, because you don't get it.

          • (Score: 0) by Anonymous Coward on Saturday April 20 2019, @10:58PM

            by Anonymous Coward on Saturday April 20 2019, @10:58PM (#832754)

            BTW, I say that as someone who would never sign this.

  • (Score: 3, Insightful) by Bot on Friday April 19 2019, @10:26PM (1 child)

    by Bot (3902) on Friday April 19 2019, @10:26PM (#832336) Journal

    Finally, science removes the burden of proof from its back and becomes a fully featured religion.
    Basically, the revolution was a path to eradicate old lifestyles and the aristocracy. Now that the replacement bureaucracy is technocratically getting bolted in place, the path (which, like any revolution in the astronomical sense, is more or less circular) turns back towards ignorance and servitude. But hey, its colors are the rainbow's and its hymns are happy hippy stuff to chant around a fire, so it doesn't look like the old grey dark ages...

    --
    Account abandoned.
    • (Score: 0) by Anonymous Coward on Saturday April 20 2019, @11:50AM

      by Anonymous Coward on Saturday April 20 2019, @11:50AM (#832511)

      Read the other comments here: the burden of proof was removed when researchers started testing a null hypothesis rather than their own hypothesis. When this happened varies by topic, but generally the 1940s-1970s is when it all started going to shit.

  • (Score: 2, Interesting) by unhandyandy on Saturday April 20 2019, @02:49AM (1 child)

    by unhandyandy (4405) on Saturday April 20 2019, @02:49AM (#832432)

    Perhaps the problem is that, after almost a century, the number of experiments performed today is several orders of magnitude greater than when 0.05 was enshrined as the right p-value. So inevitably, when say 1,000 experiments are performed, around 50 of them will seem to have "statistical significance" just due to chance.

    • (Score: 0) by Anonymous Coward on Saturday April 20 2019, @11:53AM

      by Anonymous Coward on Saturday April 20 2019, @11:53AM (#832513)

      Then you would expect an increase in "good" studies too. What has happened is only an increase in crappy studies, to the point that 50-90% cannot even be replicated. Of the rest, most are probably misinterpreted too.

  • (Score: 2) by Entropy on Saturday April 20 2019, @03:40PM

    by Entropy (4228) on Saturday April 20 2019, @03:40PM (#832600)

    So let's change them so they do show what we want.
