Large datasets and predictive analytics software are a fertile field for innovation, but while excellent open source tools like SciPy, R, etc. are freely available, the datasets are not. A Computerworld article notes that large publicly available data collections are so scarce that a database Netflix released for a competition half a decade ago is still constantly used in computer science research.
Australia's government does provide an easy way to find, access and reuse some public datasets, but most public and private databases are siloed away from experimenters. The Open Data Handbook offers some guidelines for defining openness in data, but offers little on how to drive organisations to make their datasets available.
So do we need a GPL for data, and if so, what would it look like?
(Score: 2) by wantkitteh on Friday March 20 2015, @02:51PM
I don't think that's quite what the original article is referring to - you want data that's already available to be free as in beer, rather than free as in speech, which is more the issue at hand. What I'm trying to ask is whether you include other people's personal, private data in the collection of data that your government has assembled and should be available for free. The argument "I contributed towards my neighbourhood's government-subsidised drug rehab and mental health facilities, I should be entitled to all their data!" is indicative of a pretty sick attitude, so I hope that's not what you mean.
(Score: 2) by Phoenix666 on Friday March 20 2015, @06:22PM
No, that's not what I mean. Privacy is big with me. I think there would be value in an anonymized db of everyone's DNA, for example, because it would do so much for archaeology, epidemiology, etc., but I have no trust in the government whatsoever so scratch that idea.
Washington DC delenda est.
(Score: 2) by wantkitteh on Friday March 20 2015, @08:43PM
Ok, misunderstanding cleared up ;) See comments/links elsewhere in this comment section for details on how hard anonymising data really is.