SoylentNews is people

posted by chromas on Monday January 14 2019, @02:22AM
from the Surely-you-jest,-Dr.-Feynman dept.

Probably not that good of an article, but it actually exists, only at Wired, so it is certain that it probably is worth reading. But only if you go in with no preconceptions.

Nobel laureate Richard Feynman once asked his Caltech students to calculate the probability that, if he walked outside the classroom, the first car in the parking lot would have a specific license plate, say 6ZNA74. Assuming every number and letter are equally likely and determined independently, the students estimated the probability to be less than 1 in 17 million. When the students finished their calculations, Feynman revealed that the correct probability was 1: He had seen this license plate on his way into class. Something extremely unlikely is not unlikely at all if it has already happened.
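
The students' estimate can be reproduced with a few lines of arithmetic. This is only a sketch: it assumes the plate follows the layout of the example 6ZNA74 (one digit, three letters, two digits), with every position drawn uniformly and independently as the quote states. The variable names are illustrative.

```python
# Count the plates that match the layout of "6ZNA74":
# digit, letter, letter, letter, digit, digit (an assumed format).
DIGITS = 10
LETTERS = 26

layout = "DLLLDD"  # D = digit, L = letter
combinations = 1
for position in layout:
    combinations *= DIGITS if position == "D" else LETTERS

# Probability of one specific plate *before* anyone has looked.
prior_probability = 1 / combinations
print(f"possible plates: {combinations:,}")
print(f"chance of one specific plate: 1 in {combinations:,}")
```

Under these assumptions there are 17,576,000 possible plates, which matches "less than 1 in 17 million". That number only describes the situation before the plate was observed.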

Bayesian probability is all well and good, until it runs up against actuality. But the point here is all about having a Beautiful Mind or π, and seeing patterns everywhere, and how if you see them in Big Data, the patterns are bigger. But no less crazy.

The Feynman trap—ransacking data for patterns without any preconceived idea of what one is looking for—is the Achilles heel of studies based on data mining. Finding something unusual or surprising after it has already occurred is neither unusual nor surprising. Patterns are sure to be found, and are likely to be misleading, absurd, or worse.

This approach to "science" can certainly lead to interesting results, as in this particular study:

A standard neuroscience experiment involves showing a volunteer in an MRI machine various images and asking questions about the images. The measurements are noisy, picking up magnetic signals from the environment and from variations in the density of fatty tissue in different parts of the brain. Sometimes they miss brain activity; sometimes they suggest activity where there is none.

A Dartmouth graduate student used an MRI machine to study the brain activity of a salmon as it was shown photographs and asked questions. The most interesting thing about the study was not that a salmon was studied, but that the salmon was dead. Yep, a dead salmon purchased at a local market was put into the MRI machine, and some patterns were discovered. There were inevitably patterns—and they were invariably meaningless.

Brings to mind (brains!) a certain Irish myth of the Salmon of Knowledge, and the parallel formation of the posthumous Salmon of Doubt by Douglas Adams.

The problem has become endemic nowadays because powerful computers are so good at plundering Big Data. Data miners have found correlations between Twitter words or Google search queries and criminal activity, heart attacks, stock prices, election outcomes, Bitcoin prices, and soccer matches. You might think I am making these examples up. I am not.

There are even stronger correlations with purely random numbers. It is Big Data Hubris to think that data-mined correlations must be meaningful. Finding an unusual pattern in Big Data is no more convincing (or useful) than finding an unusual license plate outside Feynman's classroom.
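
The claim about random numbers is easy to try. The sketch below is an illustration, not the article's method: it correlates one series of pure noise against many other independent noise series and reports the strongest match. The sample sizes, the seed, and the 0.5 cutoff are arbitrary assumptions.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable
N_POINTS = 20           # observations per series (assumed)
N_SERIES = 1000         # unrelated candidate "predictors" (assumed)

# The quantity we pretend to explain: pure noise.
target = [rng.gauss(0, 1) for _ in range(N_POINTS)]

# Correlate the target with many series of independent noise.
correlations = [
    pearson(target, [rng.gauss(0, 1) for _ in range(N_POINTS)])
    for _ in range(N_SERIES)
]

strongest = max(correlations, key=abs)
impressive = sum(1 for r in correlations if abs(r) > 0.5)
print(f"strongest correlation found: {strongest:+.2f}")
print(f"series with |r| > 0.5: {impressive} of {N_SERIES}")
```

None of the series has any relationship to the target, yet searching enough of them reliably turns up a few that look strongly correlated.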

New Myth: Big Data and the MRIed Dead Salmon of Pattern Imagination.


  • (Score: 3, Insightful) by DrkShadow on Monday January 14 2019, @02:54AM (16 children)

    by DrkShadow (1404) on Monday January 14 2019, @02:54AM (#786228)

    I believe this would be "overfitting". If you're going to train on data and take your conclusions from that data, well duh, you're going to get a model that fits _that_data_. This is why you separate your training set and your test set. (You can probably call "big data" a "machine learning" task, and so the preceding applies.)

    If you get a model showing that apples are equal to oranges because they're both round, and limes, too, then fine. It should fall apart when you start looking at melons. Whatever. If you're going to try and generate a model from a given set of data, all you're getting is the least complex model (mean squared?) for your data -- and this is a perfectly valid model for the extent of your data.

    If you're going to try and apply your model to other data, it will either fall apart or fit. If it falls apart, throw it away. If it fits, then try and figure out _why_ it fits. You don't need to have the "why" before you have a computer generate a plausible model. If it keeps fitting data, try and figure out why. Keeps fitting -- not one time, and not one dataset. Look at the relations and understand why. We do this as humans because our data sets are so much more vast than anything we feed into a computer that we can spot the false positives that result from insufficient data.
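
    A rough sketch of the training-set/test-set idea, under made-up assumptions: the "model" is simply whichever of many random features correlates best with the target on the training half, and it is then checked on data it has never seen. All names, sizes, and the seed are hypothetical.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(1)  # fixed seed so the run is repeatable
N_TRAIN, N_TEST, N_FEATURES = 20, 20, 500  # assumed sizes
n_total = N_TRAIN + N_TEST

# Target and features are all independent noise: there is nothing to learn.
target = [rng.gauss(0, 1) for _ in range(n_total)]
features = [
    [rng.gauss(0, 1) for _ in range(n_total)] for _ in range(N_FEATURES)
]


def train_correlation(feature):
    """Correlation with the target on the training half only."""
    return pearson(feature[:N_TRAIN], target[:N_TRAIN])


# "Fit": pick the feature that looks best on the training data.
best = max(features, key=lambda f: abs(train_correlation(f)))

r_train = train_correlation(best)
# "Validate": score the same feature on the held-out half.
r_test = pearson(best[N_TRAIN:], target[N_TRAIN:])
print(f"training correlation: {r_train:+.2f}")
print(f"held-out correlation: {r_test:+.2f}")
```

    The training correlation looks strong because it was selected for. On the held-out half the same feature will typically fall back toward zero, which is the "falls apart, throw it away" case.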

  • (Score: 3, Interesting) by Arik on Monday January 14 2019, @03:16AM (15 children)

    by Arik (4543) on Monday January 14 2019, @03:16AM (#786238) Journal
    You're missing the point.

    There is no 'chance' in nature. It's an abstraction, a mathematical way of expressing our uncertainty, and of constraining it.

    What's the chance that the next coin I flip will come up heads? Conventionally we say it's 50/50, but in fact it's either 0 or 100. *We* just don't know which yet.

    --
    If laughter is the best medicine, who are the best doctors?
    • (Score: 2) by c0lo on Monday January 14 2019, @03:37AM (6 children)

      by c0lo (156) Subscriber Badge on Monday January 14 2019, @03:37AM (#786245) Journal

      There is no 'chance' in nature. It's an abstraction, a mathematical way of expressing our uncertainty, and of constraining it.

      Inviting the magister to open a discussion about this position, the 'free will' (or 'predestination') and the consequences on the ethics of the everyday life.

      --
      https://www.youtube.com/watch?v=aoFiw2jMy-0 https://soylentnews.org/~MichaelDavidCrawford
      • (Score: 2) by aristarchus on Monday January 14 2019, @07:23AM (5 children)

        by aristarchus (2645) on Monday January 14 2019, @07:23AM (#786335) Journal

        Oh, c0lo, we do not want to go there! Can you imagine the implications, if the universe as it were, were to be the way it had to be, and the TMB was not just an annoying fact of reality, but a necessary consequence of it? That way lies madness. There is only one thing to do, and that is to punch the Nazis in the face. Random enough to argue for free will, and also random enough to suggest to the Nazis that they could be wrong? But as a Buddhist, I have to believe in the laws of Karma, which either confirm or totally negate the doctrine of free will. So again, provocation denied.

        • (Score: 2) by c0lo on Monday January 14 2019, @07:46AM (4 children)

          by c0lo (156) Subscriber Badge on Monday January 14 2019, @07:46AM (#786345) Journal

          Can you imagine the implications, if the universe as it were, were to be the way it had to be, and the TMB was not just an annoying fact of reality, but a necessary consequence of it?

          Yes, I can.

          There is only one thing to do, and that is to punch the Nazis in the face.

          See? You can imagine it too.

          Random enough to argue for free will, and also random enough to suggest to the Nazis that they could be wrong?

          Nothing random in it, that's the inexorable destiny! (he says)
          It's only that the Nazis are such special snowflakes (they can't help it, though, they are so predestined) they can't take it as a (white) man.

          But as a Buddhist, I have to believe in the laws of Karma

          Nothing in Buddhism says you can't punch a Nazi in the face, it only asks you to do it with disregard for the fruits of your action.
          E.g. a rock falling on the Nazi's face will not count as bad Karma for the rock, 'cause the rock is dispassionate in its actions and derives no benefit from them.

          (Large grin)

          --
          https://www.youtube.com/watch?v=aoFiw2jMy-0 https://soylentnews.org/~MichaelDavidCrawford
          • (Score: 2) by aristarchus on Monday January 14 2019, @08:00AM (3 children)

            by aristarchus (2645) on Monday January 14 2019, @08:00AM (#786351) Journal

            Not sure where I am drawn on this: my fist as a rock falling on the face of a Nazi, with no passion or intent; or my fist falling upon the face of the Nazi as a Buddhist call to awakening, or a Christian act of kindness toward my errant brother. And I think that the Compassionate Buddhist Nazi Face-punch is only slightly less suspect than the Christian one? I say, when carrying water and chopping wood, one should just carry water and chop wood. And when one is punching Nazis in the face, one should just punch Nazis in the face, without regard to consequences or conversion. Just do it, as Nike says. Nike, by the way, is the Goddess of Victory. And sometimes, the only way to win is not to fight, but just to punch Nazis in the face.

            • (Score: 2) by c0lo on Monday January 14 2019, @08:51AM (2 children)

              by c0lo (156) Subscriber Badge on Monday January 14 2019, @08:51AM (#786364) Journal

              Not sure where I am drawn on this: my fist as a rock falling on the face of a Nazi, with no passion or intent; or my fist falling upon the face of the Nazi as a Buddhist call to awakening, or a Christian act of kindness toward my errant brother...

              If that Nazi didn't ask for awakening or kindness, it is because his destiny didn't allow it.
              Thus, the only proper way [wikipedia.org] to do it is as "right work done well". The Nike way, yes.

              (grin)

              --
              https://www.youtube.com/watch?v=aoFiw2jMy-0 https://soylentnews.org/~MichaelDavidCrawford
              • (Score: 2) by aristarchus on Monday January 14 2019, @09:14AM (1 child)

                by aristarchus (2645) on Monday January 14 2019, @09:14AM (#786377) Journal

                But how do we tell the Nazis destined to enlightenment by our fist, from those who are not! This is an eternal question, posed most greatly by Mitchell and Webb in their sketch "Nazis" [wikia.com], also on Youtube Nazi Channel, Are we the Baddies? [youtube.com] Best to take off, and nuke them from orbit, or punch them in the face.

                • (Score: 2) by c0lo on Monday January 14 2019, @09:49AM

                  by c0lo (156) Subscriber Badge on Monday January 14 2019, @09:49AM (#786389) Journal

                  But how do we tell the Nazis destined to enlightenment by our fist, from those who are not!

                  Why, do you make any distinction between the water you carry or the wood you chop?

                  You can be at peace (with yourself): the water you'll be carrying tomorrow is not the same water you carry today, 'cause it flows together with the wood and everything; so, the Nazi you'll punch tomorrow will not be the same one you punch today.
                  Pretty much like every man deserves a cup of wine and a cup of wine transforms a man into a new one, your work will change the Nazi, but that shouldn't deter you from your predestined work nor your detachment from the passions (but not the world).

                  (grin)

                  --
                  https://www.youtube.com/watch?v=aoFiw2jMy-0 https://soylentnews.org/~MichaelDavidCrawford
    • (Score: 2) by DrkShadow on Monday January 14 2019, @04:03AM (5 children)

      by DrkShadow (1404) on Monday January 14 2019, @04:03AM (#786256)

      Unless you're the almighty Gosh, _we_ don't know whether it will be heads or tails. We don't try to know what the next of your coin tosses will be. It feels like you're convoluting the issue.

      Big Data works only in Big Generalities. It works in statistics, not in absolutes. Given that we can't know the absolute inputs, and don't have the Absolute System, all we can do is try to fit the observed outputs, and infer the cause (and maybe check it), and infer what might happen next given known inputs and a non-absolute model. It will only ever be approximation. Its promise is never that it will be reality, all anyone is hoping for is that it will be a good match, the more data we give it. (Corporations have used this to great increases in wealth, if you want a measure for its efficacy.)

      The Stallman quote is a bad one. He said probability one -- not accounting for someone leaving early, or even leaving before him, or being late out of the previous class and not having left yet. The probability is not one, and we/he do not know whether the car is currently there or not, or will be there at the end of class or not. (Further, he said "A specific" and not "this specific", which is a technical difference but I digress.) The car is "There" or "Not There", right, but we don't know the state it's in. (Doesn't, say, quantum mechanics act a lot seemingly on probabilities?..)

      Suppose there is no chance in nature. Then what is your proposal given we can't exactly define "nature" or calculate based on it? The big data that we use is only as biased as the data provided, and that's the best anyone can do. If you want to make it lopsided based on assumptions, go for it -- to me that's additional data that you're feeding in (correct or not) causing the same calculations to be made. (Probability of your guaranteed falsehood: zero. Is your falsehood actually false?...)

      • (Score: 2) by Pslytely Psycho on Monday January 14 2019, @05:51AM (1 child)

        by Pslytely Psycho (1218) on Monday January 14 2019, @05:51AM (#786296)

        "The car is "There" or "Not There", right, but we don't know the state it's in"
        Oh, I get it, it's a Schrödinger's Car type of thing....

        Ok, I'm sorry, I'm leaving now.....

        --
        Alex Jones lawyer inspires new TV series: CSI Moron Division.
        • (Score: 1, Funny) by Anonymous Coward on Monday January 14 2019, @04:42PM

          by Anonymous Coward on Monday January 14 2019, @04:42PM (#786477)

          Schrodinger and Heisenberg were pulled over.

          Policeman asks if he knew how fast he was going.

          Heisenberg says no, but he was quite confident as to where he was.

          Policeman asks to look in the trunk, "Did you know you had a dead cat in here?"

          Schrodinger says, "well, I do NOW!"

      • (Score: 1, Insightful) by Anonymous Coward on Monday January 14 2019, @06:14AM

        by Anonymous Coward on Monday January 14 2019, @06:14AM (#786309)

        Unless you're the almighty Gosh, _we_ don't know whether it will be heads or tails. We don't try to know what the next of your coin tosses will be. It feels like you're convoluting the issue.

        You could know for certain if you knew all the variables (force of the flip, wind, etc). Probability is just describing our uncertainty, not anything about the coin flip.

      • (Score: 0) by Anonymous Coward on Monday January 14 2019, @07:25AM

        by Anonymous Coward on Monday January 14 2019, @07:25AM (#786336)

        "Stallman" ?? You seem to have switched gurus, oh Shadow of Dark. Do try to keep up.

      • (Score: 3, Interesting) by Arik on Monday January 14 2019, @01:31PM

        by Arik (4543) on Monday January 14 2019, @01:31PM (#786428) Journal
        "Unless you're the almighty Gosh, _we_ don't know whether it will be heads or tails. We don't try to know what the next of your coin tosses will be."

        Yes, that's what I said.

        "It feels like you're convoluting the issue."

        That may be, but in fact I'm doing the opposite.

        "Big Data works only in Big Generalities. It works in statistics, not in absolutes. "

        Very true. And that's problematic because most people don't understand statistics.

        I mean, frankly, *I* definitely don't understand statistics. Not completely, or anything like it. But I have a basic grounding, and with that, it's quite conspicuous that most people really do not have even that. But what makes it worse is virtually everyone *thinks* they understand it. All you have to do to see this being used to manipulate people is turn on the tv, or turn off adblock. Or listen to just about any politician or political candidate. Fundamental confusions related to statistics are reliable tools in the hands of marketers who probably don't even understand what they are doing themselves.

        "Its promise is never that it will be reality, all anyone is hoping for is that it will be a good match, the more data we give it"

        Oh, no, the press releases tend to cross the line. But even the more believable claim is still suspect, likely false. They're still limited by GIGO. This is cargo cult statistics, really, just keep throwing bad data into the mix and hope the algorithm magically turns it good. It doesn't work that way. More garbage in just makes for more garbage out.

        "The Stallman quote is a bad one."

        You mean the Feynman quote?

        "He said probability one -- not accounting for someone leaving early, or even leaving before him, or being late out of the previous class and not having left yet. The probability is not one, and we/he do not know whether the car is currently there or not, or will be there at the end of class or not."

        Yes, we do, as the instance to which he referred had already occurred - he knew, and we know, that the probability was indeed 1.00 - actual fact versus speculative estimation.

        "The big data that we use is only as biased as the data provided, and that's the best anyone can do. "

        ??? No it's not, and you're not making any sense.

        You don't need huge amounts of data to do a good statistical analysis - relatively small datasets are not a problem if they are clean. The real challenge is cleaner data and more accurate analysis.

        --
        If laughter is the best medicine, who are the best doctors?
    • (Score: 2, Informative) by pTamok on Monday January 14 2019, @10:59AM (1 child)

      by pTamok (3042) on Monday January 14 2019, @10:59AM (#786405)

      What's the chance that the next coin I flip will come up heads? Conventionally we say it's 50/50, but in fact it's either 0 or 100. *We* just don't know which yet.

      That depends on the coin. If it is an American nickel, the chance is about 1 in 6000 that it will land balanced on its edge. [harvard.edu] That means the conventional chances are about 5999/12000 heads, 5999/12000 tails, and 2/12000 edge.

      With regard to 'we just don't know yet', if you believe in the 'many worlds' formulation of quantum mechanics*, then all possible outcomes of the coin toss happen, and we merely experience one of them in our history. Therefore there is a remarkably large number of 'we's, each of which has a different experience of the universe, as each has a different world line.

      Alternatively, you can formulate a reasonable theory that our experience of time passing is simply a consequence of passing at constant speed along a fourth static dimension, all of which exists 'simultaneously', which has two consequences: first that all apparent possible changes are actually predetermined by the 4 dimensional static structure of the universe; secondly as a result of this predetermination, there is no such thing as free will. This approach is known as chronogeometric fatalism, or chronogeometric determinism.

      The interpretation of Quantum Mechanics known as Pilot Wave Theory [wikipedia.org], or de Broglie-Bohm Theory [wikipedia.org] is deterministic, but controversial. I think most physicists tend to support the Copenhagen interpretation, or something similar which incorporate the Born rule, and therefore have probabilistic outcomes - in other words there is 'chance' in nature.

      So at present, it appears it is possible to argue reasonably both for (in principle) being able to know the results of a coin toss in advance, or not. The mathematical formulations of QM 'work' in the sense they describe the expected outcomes of experiments, and those descriptions are accurate. Interpreting the mathematics of QM is a subject of great debate.

      Your original point suggests you would support a Bayesian interpretation of QM [wikipedia.org].

      In the macroscopic world, if you flip an unbiased coin a sufficient number of times, you can be pretty certain that the proportion of heads to tails will be very close to 1:1 [askamathematician.com], even though the actual sequence of results can contain arbitrarily long sequences of solely heads or solely tails [sas.com].
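
      Both halves of that statement can be seen in a quick simulation. This is a sketch only: it uses a pseudo-random generator as a stand-in for an unbiased coin, and the number of flips and the seed are arbitrary assumptions.

```python
import random

rng = random.Random(42)  # fixed seed so the run is repeatable
N_FLIPS = 100_000        # assumed number of tosses

# True = heads, False = tails; each flip is independent and fair.
flips = [rng.random() < 0.5 for _ in range(N_FLIPS)]

heads_fraction = sum(flips) / N_FLIPS

# Longest run of identical consecutive outcomes (all heads or all tails).
longest_run = 1
current_run = 1
for previous, current in zip(flips, flips[1:]):
    current_run = current_run + 1 if current == previous else 1
    longest_run = max(longest_run, current_run)

print(f"fraction of heads: {heads_fraction:.4f}")
print(f"longest run of identical results: {longest_run}")
```

      The fraction of heads settles very close to one half, while the longest streak keeps growing slowly with the number of flips, on the order of the base-2 logarithm of the flip count.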

      *In some (highly selective) polls, a significant number of physicists choose this as their preferred interpretation of Quantum Mechanics [quora.com]. While it is not accepted by all QM physicists, it is 'respectable'.

      • (Score: 3, Informative) by PiMuNu on Monday January 14 2019, @03:38PM

        by PiMuNu (3823) on Monday January 14 2019, @03:38PM (#786457)

        > If it is an American nickel, the chance is about 1 in 6000 that it will land balanced on its edge.

        FTFAbstract: "with randomized initial conditions"; it depends on the distribution of randomized initial conditions.

        > Quantum mechanics [blah blah] and therefore have probabilistic outcomes

        This is irrelevant, the probability that the coin toss is not determined by initial conditions is vanishingly small from quantum mechanics. A coin toss is essentially classical.