
SoylentNews is people

posted by janrinok on Wednesday March 18 2015, @06:51PM
from the data-is-power dept.

Large datasets and predictive analytics software are a fertile field for innovation, but while excellent open source tools like SciPy, R, etc. are freely available, the datasets are not. A Computerworld article notes that large publicly available data collections are so scarce that a database Netflix released for a competition half a decade ago is still constantly used in computer science research.

Australia's government does provide an easy way to find, access and reuse some public datasets, but most public and private databases are siloed away from experimenters. The Open Data Handbook offers some guidelines for defining openness in data, but offers little on how to drive organisations to make their datasets available.

So do we need a GPL for data, and if so, what would it look like?

 
  • (Score: 1, Insightful) by Anonymous Coward on Wednesday March 18 2015, @09:36PM (#159609)

    1. An open-compliant dataset is one which is human-readable in its raw form, where practical, for instance as comma-separated values. Examples here would be the Project Gutenberg index or the VAST small molecule database.

        That was easy.

    That's the current fashion, which could change. It's like saying that all character text APIs should use UTF-8. Well, that's certainly a popular choice, but there are special circumstances (people in the Far East might not want to pay the 30-50 percent size penalty if most of their data consists of Han characters) and technology advances.
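To put a rough number on that size penalty, here is a minimal sketch that compares the encoded size of the same text in UTF-8 and UTF-16. The function name `encoded_sizes` and the sample strings are assumptions made up for illustration, and the comparison against UTF-16 is itself an assumption about which alternative encoding is meant.

```python
def encoded_sizes(text: str) -> dict:
    """Return byte counts for UTF-8 and UTF-16, plus the UTF-8 size penalty.

    'utf-16-le' is used so that no byte-order mark is added to the count.
    A negative penalty means UTF-8 is the smaller of the two.
    """
    utf8 = len(text.encode("utf-8"))
    utf16 = len(text.encode("utf-16-le"))
    penalty_pct = 100.0 * (utf8 - utf16) / utf16
    return {"utf8": utf8, "utf16": utf16, "penalty_pct": penalty_pct}


if __name__ == "__main__":
    # Hypothetical samples: pure Han text, and Han text mixed with ASCII.
    for sample in ("開放資料", "Open data 開放資料"):
        print(repr(sample), encoded_sizes(sample))
```

Common Han characters take three bytes each in UTF-8 and two in UTF-16, so text made up purely of them sits at the top of the quoted range. Any ASCII mixed in (markup, digits, field separators) pulls the figure down and can reverse it, since ASCII costs one byte in UTF-8 and two in UTF-16.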
