Humanity would understand very little about cancer, and be hard-pressed to find cures, without scientific research. But what if, when teams recreated each other's research, they didn't arrive at the same result?
That's what the Reproducibility Project: Cancer Biology, run by the Center for Open Science, is attempting to find out: redo parts of 50 important cancer studies and compare the results. The project released its first five replications today, and it turns out that not all of the data matches up. At least once in every paper, a result reported as statistically significant (the way scientists calculate whether an effect is caused by more than chance alone) was not statistically significant in the replicated study. In two of the cases, the differences between the initial and replicated studies were even more striking, giving the Center for Open Science researchers cause for concern.
"I was surprised by the results because of all that homework that we did" to make sure the studies were being reproduced accurately, Tim Errington, Project Manager at the Center for Open Science, told Gizmodo. "We thought we were crossing every T and dotting every I... Seeing some of these experimental systems not behave the same was something I was not expecting to happen."
(Score: 2) by bradley13 on Saturday February 04 2017, @01:48PM
...except the way people understand them. The lazy see a p-value and think: "here's a result". This is, of course, wrong.
What a significant p really means is: Hey, look - this might be something interesting. We should look into this further.
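The point can be made with a minimal simulation (my own sketch, not from the article): even when two samples come from the exact same distribution, a "significant" difference at the usual p < 0.05 threshold still shows up about 5% of the time by chance alone, which is why a single significant result is a pointer to look further, not a conclusion.

```python
import random
import statistics

random.seed(0)

def false_positive_rate(n_trials=2000, n=30):
    """Draw two samples from the SAME normal distribution and count how
    often their difference looks 'significant' anyway."""
    hits = 0
    for _ in range(n_trials):
        a = [random.gauss(0, 1) for _ in range(n)]
        b = [random.gauss(0, 1) for _ in range(n)]
        # Unequal-variance (Welch-style) t statistic
        va, vb = statistics.variance(a), statistics.variance(b)
        t = (statistics.mean(a) - statistics.mean(b)) / ((va / n + vb / n) ** 0.5)
        if abs(t) > 2.0:  # roughly the p < 0.05 cutoff at this sample size
            hits += 1
    return hits / n_trials

print(false_positive_rate())  # hovers around 0.05, as the threshold promises
```

Run many "experiments" like this and roughly one in twenty will clear the significance bar with no real effect present, which is exactly the failure mode replication is meant to catch.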
Everyone is somebody else's weirdo.
(Score: 0) by Anonymous Coward on Saturday February 04 2017, @01:55PM
It depends how you use it:
https://meehl.dl.umn.edu/sites/g/files/pua1696/f/074theorytestingparadox.pdf [umn.edu]
http://www.biorxiv.org/content/biorxiv/early/2016/12/20/095570.full.pdf [biorxiv.org]
(Score: 2) by TheRaven on Saturday February 04 2017, @04:25PM
sudo mod me up
(Score: 3, Insightful) by mhajicek on Saturday February 04 2017, @05:53PM
Perhaps it should become standard to publish the full data set, so that anyone can do their own statistical analysis.
The spacelike surfaces of time foliations can have a cusp at the surface of discontinuity. - P. Hajicek
(Score: 2) by deimtee on Saturday February 04 2017, @10:57PM
I agree that publishing all the data would be best, but in human studies you may run into patient confidentiality problems. At the very least there would be extra work involved in anonymizing patient data. (However, any animal-based studies should be regarded as very suspect if they don't make the full dataset available.)
If you cough while drinking cheap red wine it really cleans out your sinuses.
(Score: 0) by Anonymous Coward on Saturday February 04 2017, @09:56PM
I think I've only ever seen one paper use Welch's t-test, which actually does account for this.
The t-test that R uses by default is Welch's t-test. I don't think there is even a useful case where Student's t-test should be preferred over Welch's.
In many biological cases one would do an ANOVA anyway, instead of multiple t-tests.
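For anyone working in Python rather than R, note that the default is reversed: SciPy's `ttest_ind` assumes equal variances unless you pass `equal_var=False`, so Welch's test has to be requested explicitly. A small sketch (assuming NumPy and SciPy are installed) showing both tests plus a one-way ANOVA for the three-group case:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
a = rng.normal(0.0, 1.0, size=30)  # group with variance 1
b = rng.normal(0.5, 3.0, size=30)  # group with variance 9

# Student's t-test assumes equal variances; Welch's does not.
student = stats.ttest_ind(a, b)                    # SciPy default: equal_var=True
welch = stats.ttest_ind(a, b, equal_var=False)     # Welch's t-test, R's default

# For three or more groups, a one-way ANOVA replaces multiple pairwise t-tests.
c = rng.normal(0.0, 1.0, size=30)
anova = stats.f_oneway(a, b, c)

print(student.pvalue, welch.pvalue, anova.pvalue)
```

With unequal variances like these, the two t-tests report different p-values, which is precisely why defaulting to Welch's is the safer habit.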