SoylentNews is people

Evidence for accurate predictions based on machine learning algorithms?

Accepted submission by Anonymous Coward at 2015-09-09 15:05:13
Answers

I've been looking into jobs for data analysts, sometimes called data scientists. I see that lots of money is being thrown at people to take "big data" (e.g., millions of data points for hundreds of different variables) and plug it into a sort of black-box algorithm.

Roughly, these algorithms look at how the various inputs are correlated with each other and with some outcome of interest, then assign a set of model parameters (sometimes called coefficients) that minimizes some kind of error metric. Some percentage of the dataset is used for training, and the rest is "held out" for testing. The model is then said to make an "accurate prediction" when it fits the testing dataset relatively well (i.e., the error metric is below some threshold).
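To make that procedure concrete, here is a minimal sketch of it. Everything specific is my own assumption for illustration: the synthetic data (y = 2 + 3x plus noise), the 80/20 split, squared error as the metric, and the threshold of 2.0. A real "big data" model would have hundreds of inputs and a fancier fitting routine, but the shape of the workflow is the same.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns the coefficients (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def mse(params, xs, ys):
    """Mean squared error of the fitted line on the given data."""
    a, b = params
    return sum((a + b * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

rng = random.Random(0)

# Synthetic data (an assumption for illustration): y = 2 + 3x + Gaussian noise.
xs = [rng.uniform(0, 10) for _ in range(1000)]
ys = [2 + 3 * x + rng.gauss(0, 1) for x in xs]

# 80% of the dataset is used for training, the remaining 20% is held out.
split = int(0.8 * len(xs))
train_x, test_x = xs[:split], xs[split:]
train_y, test_y = ys[:split], ys[split:]

# Choose the coefficients that minimize squared error on the training part only.
params = fit_line(train_x, train_y)

# "Accurate prediction" here just means the held-out error is below a threshold.
THRESHOLD = 2.0  # arbitrary cutoff, chosen for this example
test_error = mse(params, test_x, test_y)
verdict = "accurate" if test_error < THRESHOLD else "not accurate"
print("coefficients:", params)
print("held-out MSE:", test_error, "->", verdict)
```

Note that the held-out points come from the same process as the training points, which is exactly why the fit looks good here.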

It is then assumed that the relationship between future input variables and the outcome will be similar to that observed for the testing dataset. Based on this, business/policy decisions are made. There are some simpler situations, like facial recognition, where I would be optimistic regarding this final assumption. However, I doubt it is even approximately true when it comes to human behavior, and I never seem to see any actual predictive skill being assessed. See, for example, the press release associated with this story: https://soylentnews.org/article.pl?sid=15/09/08/1437220.
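Here is a toy sketch of what I am worried about. The numbers are invented purely for illustration: I assume the slope relating input to outcome is 3 in the historical data and has changed to 1 by the time new data arrives. I am not claiming real human behavior shifts this way, only showing that a good held-out score says nothing about it.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns the coefficients (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b

def mse(params, xs, ys):
    """Mean squared error of the fitted line on the given data."""
    a, b = params
    return sum((a + b * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def simulate(rng, n, slope):
    """Hypothetical data generator: y = 2 + slope*x + Gaussian noise."""
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [2 + slope * x + rng.gauss(0, 1) for x in xs]
    return xs, ys

rng = random.Random(1)

# Historical data, available when the model is developed.
past_x, past_y = simulate(rng, 1000, slope=3.0)
# Later data, unavailable at development time; the relationship has drifted (assumed).
future_x, future_y = simulate(rng, 200, slope=1.0)

# Fit on 80% of the historical data, then freeze the parameters (no tweaking).
split = 800
frozen = fit_line(past_x[:split], past_y[:split])

# Score the frozen model on the held-out historical data and on the later data.
holdout_mse = mse(frozen, past_x[split:], past_y[split:])
future_mse = mse(frozen, future_x, future_y)
print("held-out MSE (same period):", holdout_mse)
print("MSE on later data (frozen parameters):", future_mse)
```

The held-out error stays small because those points come from the same period as the training data, while the error on the later data is far larger. Only the second number measures predictive skill in the sense I am asking about.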

What examples are there of these "predictive" machine learning algorithms being accurate? I mean using the same parameters (no tweaking) on new data that was unavailable at the time the model was developed. If you have an algorithm that really works, there should be a webpage listing all the predictions. It'd be easy to prove you know what you are doing. Right?

