SoylentNews is people

posted by Fnord666 on Tuesday October 24 2017, @09:02AM
from the one-license-to-rule-them-all dept.

The Linux Foundation has created one open-data licence framework to rule them all, allowing users to collaborate on data-driven projects.

Today at the Open Source Summit in Prague, executive director Jim Zemlin announced the Community Data License Agreement, which is designed for non-proprietary data.

The org says data producers can now share the goods "with greater clarity about what recipients may do with it".

One branch "puts terms in place to ensure that downstream recipients can use and modify that data, and are also required to share their changes", while the other does not oblige users to share those changes.

The idea is to accelerate machine learning in open source.

This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • (Score: 2) by looorg (578) on Tuesday October 24 2017, @08:58PM (#587094)

    I'm skeptical, but perhaps this will be nice for machine-gathered data: machines that just automatically create large amounts of sensor data all day, every day. But that might be one of the few areas where this is actually viable, mostly because the data gathered will be so useless and boring.

    For anything gathered by hand (surveys, observations and such) there will just be too much work and too many obstacles and risks involved in giving it away to anyone. Certainly so when it's data about people. In my mind I have quickly gone over all the projects I remember and have been working on for the last couple of years, and there would be so much work to be able to distribute the data afterwards that it just wouldn't be viable from a time and economic standpoint. Most concerning, though, is that anonymity might go completely out the window. Things would just have to be purged and stripped down to the bare essentials. It's already hard to get people to answer surveys, and if we just gave the data away afterwards I strongly suspect that the response rate would plummet. In some sense they answer our surveys because they trust us not to fuck them over afterwards.

    For anything that gathers data on people the ethics board will probably have a fit; just trying to get past them would probably be a complete nightmare. It's already starting to be a pain in the arse, where you have to adjust questions and answers to fit various criteria. If they were told we would just hand out all the raw data afterwards to anyone with an internet connection, I think they might have a stroke right there on the spot, and a big red NO stamp all over the application would be sure to follow.

    Various companies and commercial entities wouldn't want to give their data away since it reveals to their competition what they are doing. So for them there might be other ethical or security concerns. But that is probably beside the point, since the article states that the license is designed for "non-proprietary data".

    But even for non-commercial data it will probably boil down to the economic aspect, as gathering data costs money, a lot of money. Just giving it away doesn't make sense in that regard. Not that your data would necessarily sell, but just making it ready for the public is a full-time effort. So, as previously noted, I suspect that the free data will be generic auto-created data, or things that are so old they're borderline worthless. Even if Google starts scanning data as they do with books, it might not all make sense without human intervention.

    The org says data producers can now share the goods "with greater clarity about what recipients may do with it".

    This sentence doesn't make any kind of sense to me. If I produce data and then give it away, I lose all control and I have no idea what the recipients might or might not do with it. They might use the data all wrong, since they might not take into account what the data was gathered for and how, and just interpret the results so as to fit some agenda of theirs. I can't take it back at that point, but they can use me for their purpose. They might produce shit with it and drag my name through the mud. I don't have time to sit and monitor the world for people who grab my data from the internet and then write retractions if people do fucked up things with it.