
SoylentNews is people

posted by janrinok on Wednesday March 18 2015, @06:51PM
from the data-is-power dept.

Large datasets and predictive analytics software are a fertile field for innovation, but while excellent open source tools like SciPy, R, etc. are freely available, the datasets are not. A Computerworld article notes that large publicly available data collections are so scarce that a database Netflix released for a competition half a decade ago is still constantly used in computer science research.

Australia's government does provide an easy way to find, access and reuse some public datasets, but most public and private databases are siloed away from experimenters. The Open Data Handbook offers some guidelines for defining openness in data, but offers little to drive organisations to make their datasets available.

So do we need a GPL for data, and if so, what would it look like?

 
This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • (Score: 2, Insightful) by Nerdfest on Thursday March 19 2015, @12:50AM

    by Nerdfest (80) on Thursday March 19 2015, @12:50AM (#159666)

    Your intentions are not particularly relevant. Once the data is out there, it is available to people wishing to pay for it, or to those who exploit the security of those holding it. Once someone new has it, the problem just gets bigger. No, it's not *about* money; money is just the difference between 'open' and 'not open'. In one case anyone can get the data, and in the other, anyone with money, connections, or partnerships can get the data. Yes, it's easier and cheaper when the data is open, but it's really no different in the end. The only things that stop all data from being open are security, privacy policies (where they are respected and actually respect privacy) and laws. Only *one* of those things needs to fail for control of the data to be lost.

  • (Score: 2) by wantkitteh on Thursday March 19 2015, @04:11AM

    by wantkitteh (3362) on Thursday March 19 2015, @04:11AM (#159748) Homepage Journal

    So your argument is that, because all someone has to do to get hold of private data is actively commit a crime, all data should be open? That's got to be the most retarded argument for anything being open that I've ever heard in my life. How about you dox yourself to prove your point?

    • (Score: 2) by Nerdfest on Thursday March 19 2015, @09:59AM

      by Nerdfest (80) on Thursday March 19 2015, @09:59AM (#159840)

      What I'm saying is that disparate data sources can be combined to reduce anonymity whether they're open or not. Just because the data's not open and you don't know who has what data doesn't mean it isn't happening. It just means that you don't know who's doing it.

      • (Score: 1, Insightful) by Anonymous Coward on Thursday March 19 2015, @12:19PM

        by Anonymous Coward on Thursday March 19 2015, @12:19PM (#159910)

        And open-access data means you still don't know who's doing it, and now the barrier for those unknown people to do it is significantly reduced.

      • (Score: 2) by wantkitteh on Thursday March 19 2015, @02:50PM

        by wantkitteh (3362) on Thursday March 19 2015, @02:50PM (#159970) Homepage Journal

        "...disparate data sources can be combined to reduce anonymity whether they're open or not."

        Okay - put your data where your mouth is - d0x yourself. Release every scrap of data you can about yourself under a Creative Commons license. It can all be used whether it's open or not, right? It won't make any difference to you if you just make it more convenient for everyone to access, right? Do it or stfufe.