posted by martyb on Thursday February 21 2019, @01:29AM   Printer-friendly
from the search-and-research dept.

Machine-learning techniques used by thousands of scientists to analyse data are producing results that are misleading and often completely wrong.

Dr Genevera Allen from Rice University in Houston said that the increased use of such systems was contributing to a "crisis in science".

She warned scientists that if they didn't improve their techniques they would be wasting both time and money. Her research was presented at the American Association for the Advancement of Science in Washington.

A growing amount of scientific research involves using machine learning software to analyse data that has already been collected. This happens across many subject areas, ranging from biomedical research to astronomy. The data sets are often very large and expensive to collect.

[...] "There is general recognition of a reproducibility crisis in science right now. I would venture to argue that a huge part of that does come from the use of machine learning techniques in science."

The "reproducibility crisis" in science refers to the alarming number of research results that are not repeated when another group of scientists tries the same experiment. It suggests that the initial results may have been wrong. One analysis suggested that up to 85% of all biomedical research carried out in the world is wasted effort.

It is a crisis that has been growing for two decades and has come about because experiments are not designed well enough to ensure that the scientists don't fool themselves and see what they want to see in the results.

https://www.bbc.com/news/science-environment-47267081


Original Submission

This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • (Score: 0) by Anonymous Coward on Thursday February 21 2019, @01:53AM (7 children)

    by Anonymous Coward on Thursday February 21 2019, @01:53AM (#804326)

    You tell machines to do the science for you, are you a scientist?

    I bet this affects the "social science" the most.

    • (Score: 0) by Anonymous Coward on Thursday February 21 2019, @02:51AM (3 children)

      by Anonymous Coward on Thursday February 21 2019, @02:51AM (#804340)

      > You tell machines to do the science for you, are you a scientist?

I tried something similar on the famous artist Claes Oldenburg once, at a college art opening. One of the pieces on display was this one (or very similar):
      http://infinitemiledetroit.com/Claes_Oldenburgs_Giant_Three-Way_Plug_and_the_Issue_of_Projective_Vision.html [infinitemiledetroit.com] There was also a similar sized cube tap made from fabric (soft sculpture), and several other large metal sculptures.

My question was: did you have fun making these things after you conceived of them? His answer was that he didn't make anything; he jobbed it all out to specialist fabrication shops.

      ** You hire craftspeople to build your sculptures, are you an artist?

      • (Score: 1, Insightful) by Anonymous Coward on Thursday February 21 2019, @03:26AM (1 child)

        by Anonymous Coward on Thursday February 21 2019, @03:26AM (#804352)

At least the "artist" jobbed out to - note - "specialist" fabrication shops. Most of this AI Science is DIY with some quick intro to Python Jupyter coked up on the researcher's own little notebook. Or in the cases of super-computer abuse, well, faulty premises at best, programmed in as the expected result. No surprises here: human error amplified by several orders of magnitude by machine iteration.

        • (Score: 0) by Anonymous Coward on Thursday February 21 2019, @03:43AM

          by Anonymous Coward on Thursday February 21 2019, @03:43AM (#804359)

          > coked up on the researcher's own little notebook

          Ha ha, I see what you did there! (assuming you meant cooked).

      • (Score: 0) by Anonymous Coward on Thursday February 21 2019, @03:31AM

        by Anonymous Coward on Thursday February 21 2019, @03:31AM (#804354)

In this case, he was open about it, though he should have listed his subcontractors in the credits for the works.

        Rather different from "scientists" that use some "smart" software tools that they don't understand, to validate their hypothesis.

        In the end, it comes down to: Do you even understand what the fuck you are arguing?

    • (Score: 0) by Anonymous Coward on Thursday February 21 2019, @09:53AM (2 children)

      by Anonymous Coward on Thursday February 21 2019, @09:53AM (#804446)

Dark Matter, after what - 200 failed detection attempts? - tells me that the stupid is in all of us half-insane monkeys, not just your chosen 'out' group.

      • (Score: 1, Funny) by Anonymous Coward on Thursday February 21 2019, @02:36PM

        by Anonymous Coward on Thursday February 21 2019, @02:36PM (#804518)

        I think I should start collecting bookmarks on all those super-convinced "dark matter does not exist" posts. I still regret that I didn't do it on all those "gravitational waves do not exist" posts back before they got detected.

      • (Score: 0) by Anonymous Coward on Thursday February 21 2019, @06:56PM

        by Anonymous Coward on Thursday February 21 2019, @06:56PM (#804626)

Oooh, let's mix up Dark Matter and AI in a pot/cauldron, add some strawberries and plug the wires into a fresh cuppa tea. It's the New Science, like the New Math that edumacated a generation. What can go wrong?

  • (Score: 5, Insightful) by gznork26 on Thursday February 21 2019, @02:11AM (9 children)

    by gznork26 (1159) on Thursday February 21 2019, @02:11AM (#804331) Homepage Journal

    The lack of properly defined experimental procedures predates AI, and has nothing to do with the tools used to analyze the data. In the 90s, my wife the programmer took courses in psychopharmacology. While reading through research papers, she kept pointing out flaws in the reasoning that would have been obvious to the researchers if they had learned to program before drafting experimental procedures. The same problem happens in other fields as well, most notably the law.

To address the problem, start by teaching the practitioners how to think as logically as a programmer, which means paying attention to the ELSE condition.

    --
    Khipu were Turing complete.
    • (Score: 1, Insightful) by Anonymous Coward on Thursday February 21 2019, @02:32AM (3 children)

      by Anonymous Coward on Thursday February 21 2019, @02:32AM (#804337)

      But! But! But!
      (sputters coffee)
      ALL our use cases would have this code:
      ELSE
          rejectHypothesis // oh sh!t there goes the funding
      ENDIF

      • (Score: 1) by khallow on Thursday February 21 2019, @01:19PM (2 children)

        by khallow (3766) Subscriber Badge on Thursday February 21 2019, @01:19PM (#804485) Journal
I'm sure that bug would be fixed real quick. I bet it's the same reason that null hypothesis significance testing ("NHST") became so popular: it's easy to come up with stuff that appears to be results, which lowers the effort required to keep the funding going.
        • (Score: 0) by Anonymous Coward on Thursday February 21 2019, @06:59PM (1 child)

          by Anonymous Coward on Thursday February 21 2019, @06:59PM (#804628)

          Bugfixed "their" code:
          ELSE
                reHypothesize(newBS,AImagic) // we keeps the funding, maybe gets moare muney
                goto 10
          ENDIF
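khallow's point about NHST can be sketched concretely. A minimal, hypothetical simulation (pure Python, not from any comment here): run enough significance tests on pure noise and roughly one in twenty comes back "significant" at p < 0.05.

```python
# Hedged sketch of the NHST failure mode: every "experiment" below compares
# two groups drawn from the SAME distribution, so any "significant"
# difference is a false positive by construction.
import random
import statistics

random.seed(0)

def fake_experiment(n=50):
    a = [random.gauss(0, 1) for _ in range(n)]
    b = [random.gauss(0, 1) for _ in range(n)]
    # z statistic for the difference of means (true variance is 1 here)
    z = (statistics.mean(a) - statistics.mean(b)) / (2 / n) ** 0.5
    return abs(z) > 1.96  # "significant" at p < 0.05, two-sided

trials = 2000
false_positives = sum(fake_experiment() for _ in range(trials))
print(f"{false_positives}/{trials} 'significant' results from pure noise")
```

Publish only the ~5% of trials that clear the threshold and the literature looks full of "results".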

    • (Score: 2) by KilroySmith on Thursday February 21 2019, @03:14AM

      by KilroySmith (2113) on Thursday February 21 2019, @03:14AM (#804347)
    • (Score: 2) by driverless on Thursday February 21 2019, @08:39AM (2 children)

      by driverless (4770) on Thursday February 21 2019, @08:39AM (#804430)

Seen the same thing in, um, "social sciences". A friend of mine, a statistician, took a few papers in the field some years ago, but dropped it after he started re-casting the lectures as "which basic flaws in methodology were used to produce these results?" In other words, he had unconsciously switched from absorbing the material to looking out for all the errors in it instead, sort of like when you watch a so-bad-it's-good, unintentionally comedic SciFi film.

      • (Score: 0) by Anonymous Coward on Thursday February 21 2019, @10:25AM (1 child)

        by Anonymous Coward on Thursday February 21 2019, @10:25AM (#804455)

        In other words he had unconsciously switched from absorbing the material to looking out for all the errors in it instead

I'm sorry, but what? The first thing you'll learn about science is that people who read the papers are actually interested in the procedures, and they try to work out whether the procedure as written is better than what they themselves would have thought of. The first thing you try to determine is whether the other side made mistakes. Then you ask them to clarify if something looks suspicious.

No one who is serious about science is "absorbing the material". "Absorbing the material" happens in grade school.

Also, this has nothing to do specifically with social sciences. Just look at Biology or Pharmacology or many medical fields. There is a reason many require a basic understanding of physics for their degree, but I guess they all think error analysis is too hard after that.

        • (Score: 3, Touché) by driverless on Thursday February 21 2019, @12:10PM

          by driverless (4770) on Thursday February 21 2019, @12:10PM (#804466)

          There is no one that is serious about science that is "absorbing the material". "Absorbing the material" happens in grade school.

          Or in classes at University, specifically the classes that I mentioned the friend of mine was taking.

    • (Score: 2) by VLM on Thursday February 21 2019, @01:01PM

      by VLM (445) on Thursday February 21 2019, @01:01PM (#804481)

I am not disagreeing with you - it's a valid way to look at the problem - but there is a second way: any EE who's ever implemented a modem demodulator (in software, or, LOL, hardware) knows all about bit error rates vs SNR.

So a communications-EE interpretation of statistics-hacking is: if you're willing to accept a ridiculously bad bit error rate (and let's face it, academics are rewarded for maximizing the amount of toilet paper published, not for publishing true stuff), then you can pull any signal out of a large enough sample of noise. It is literally the million-monkeys thing: a trillion loosely coupled electrons making thermal noise does in fact generate the works of Shakespeare, if you're willing to post-process a large enough sample of that noise with your demodulator.

      Understanding the future really does belong to the EE who can code.
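VLM's modem analogy can be made literal with a toy sketch (hypothetical, not from the comment): search a long enough stream of random bits and any "message" you insist on can be "detected".

```python
# Hedged illustration: with a large enough sample of noise and no penalty
# for false detections, we "find" whatever pattern we went looking for.
import random

random.seed(1)
target = "1011001110001011"  # the 16-bit "signal" we insist is in the noise
noise = "".join(random.choice("01") for _ in range(2_000_000))

pos = noise.find(target)
print(f"found {target!r} in pure noise at offset {pos}")
```

A 16-bit pattern turns up about once every 2^16 ≈ 65,536 positions on average, so two million random bits all but guarantee a "detection".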

  • (Score: 0) by Anonymous Coward on Thursday February 21 2019, @02:19AM (1 child)

    by Anonymous Coward on Thursday February 21 2019, @02:19AM (#804332)

    You use tools of which you have no clue how the damn things work, I mean not even the rudimentary principles of its working,

    What do you expect?

    • (Score: 0) by Anonymous Coward on Thursday February 21 2019, @02:29AM

      by Anonymous Coward on Thursday February 21 2019, @02:29AM (#804335)

At least this time around they are admitting the "Science" factor is bunkum, not screaming in your face "But it is (proven) Science!" when you merely point out that more research would be a good idea. Time for a new term and yet another English word to get invented - I give you here and now: "scaince."
scaince, n. science attempted with the involvement of Artificial Intelligence.

  • (Score: 0) by Anonymous Coward on Thursday February 21 2019, @03:28AM

    by Anonymous Coward on Thursday February 21 2019, @03:28AM (#804353)

    reproducibility matters not at all.

When medical research devolved into making huge productions out of tiny statistical increases or decreases in whatever - "significant" only in the sense of some chosen statistical method and never in the common sense of the word - it's no wonder the effects, even if real, easily drown in the noise. But who cares, when the results, properly presented, get to sell another $BILLIONS of $OVERPRICED_DRUG?
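The "significant only in the statistical sense" complaint has a simple mechanism: with a large enough sample, an effect far too small to matter clinically still clears p < 0.05 by a mile. A hedged pure-Python sketch (all numbers made up for illustration):

```python
# Hedged sketch: a 0.03-standard-deviation "improvement" - clinically
# negligible - becomes hugely "statistically significant" at large n.
import random
import statistics

random.seed(2)
n = 200_000
placebo = [random.gauss(0.00, 1) for _ in range(n)]
drug = [random.gauss(0.03, 1) for _ in range(n)]  # tiny true effect

diff = statistics.mean(drug) - statistics.mean(placebo)
z = diff / (2 / n) ** 0.5  # z statistic for the difference of means
print(f"effect = {diff:.4f} sd, z = {z:.1f}")
```

The z statistic lands far beyond any significance threshold even though the effect itself is a few hundredths of a standard deviation - statistical significance, not practical significance.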

  • (Score: 2) by captain normal on Thursday February 21 2019, @05:15AM

    by captain normal (2205) on Thursday February 21 2019, @05:15AM (#804391)

    Garbage in > garbage out.

    --
"Everyone is entitled to his own opinion, but not to his own facts." --Daniel Patrick Moynihan
  • (Score: 3, Insightful) by PiMuNu on Thursday February 21 2019, @07:46AM

    by PiMuNu (3823) on Thursday February 21 2019, @07:46AM (#804417)

    Machine learning is of course a big pile of overhyped balls. But the author demonstrates an equivalent level of s**t-talking with statements like:

    > "I would venture to argue that a huge part of that does come from the use of machine learning techniques in science."

    As someone who does multivariate analysis, I take personal offence from statements like this which strongly imply that my results, which the author has never seen or read, are wrong. Where is the author's evidence? Machine learning, like any other sort of analysis tool, is useful when used correctly.

  • (Score: 2) by VLM on Thursday February 21 2019, @01:13PM (1 child)

    by VLM (445) on Thursday February 21 2019, @01:13PM (#804483)

    A growing amount of scientific research involves ... analyse data that has already been collected.

    https://en.wikipedia.org/wiki/Scholasticism [wikipedia.org]

    Everything old is new again...

    To quote the wikipedia

    a program of employing that method in articulating and defending dogma in an increasingly pluralistic context

    Paraphrased, the entire topic is something like "Our theological dogma is obsolete and infertile WRT new ideas, so to look busy we'll logic chop the hell out of old stuff to bamboozle the natives into thinking we're doing something important"

Sorta like the old saying that "bad money drives out good", the problem for academia is that scholasticism chases out actual research. Running a bunch of regressions in R or Python on the same old data never discovers anything nearly as useful as building the world's biggest telescope or proton collider, but it is a hell of a lot cheaper, so we end up with a lot more journals publishing toilet paper and a lot fewer concrete-and-steel research buildings. Kinda a game of chicken: "somebody should do real research, but it's hard and expensive, so I'll play an abstract video game; hope somebody out there does actual research..."

    • (Score: 0) by Anonymous Coward on Friday February 22 2019, @12:51AM

      by Anonymous Coward on Friday February 22 2019, @12:51AM (#804806)

      I was thinking more along the lines of this https://en.wikipedia.org/wiki/Spline_interpolation [wikipedia.org]

      Walk off the edges and your data goes somewhere else and usually not anywhere near reality.

Even within the fitted range you can get wild results (depending on the algorithm you use). Close to existing data points you get decent results, but the further you get from the 'real data', the stranger the output becomes.

      Anyone who has taken algebra 3/4 should know this.
