SoylentNews is people

posted by martyb on Tuesday March 01 2022, @02:22AM
from the acquisition-for-30x-annual-recurring-revenue dept.

The Free Software Foundation (FSF) has published five of the white papers it funded regarding questions about Microsoft Copilot. After Microsoft acquired GitHub, it set up a machine learning system called Copilot to cull through its archive of software. The approach chosen, and even the basic activity itself, raise many questions, starting with those of licensing.

Microsoft GitHub's announcement of an AI-driven Service as a Software Substitute (SaaSS) program called Copilot -- which uses machine learning to autocomplete code for developers as they write software -- immediately raised serious questions for the free software movement and our ability to safeguard user and developer freedom. We felt these questions needed to be addressed, as a variety of serious implications were foreseen for the free software community and developers who use GitHub. These inquiries -- and others possibly yet to be discovered -- needed to be reviewed in depth.

In our call for papers, we set forth several areas of interest. Most of these areas centered around copyright law, questions of ownership for AI-generated code, and legal impacts for GitHub authors who use a GNU or other copyleft license(s) for their works. We are pleased to announce the community-provided research into these areas, and much more.

First, we want to thank everyone who participated by sending in their papers. We received a healthy response of twenty-two papers from members of the community. The papers weighed in on the multiple areas of interest we had indicated in our announcement. Using an anonymous review process, we concluded there were five papers that would be best suited to inform the community and foster critical conversations to help guide our actions in the search for solutions.

These five submissions are not ranked, and we decided it best to just let the papers speak for themselves. The papers contain opinions with which the Free Software Foundation (FSF) may or may not agree, and any views expressed by the authors do not necessarily represent the FSF. They were selected because we thought they advanced discussion of important questions, and did so clearly. To that end, the FSF is not providing any summaries of the papers or elaborating on our developing positions until we can learn further, through the community, how best to view the situation.

The FSF has also arranged upcoming discussions regarding these white papers. Microsoft bought GitHub in 2018 for $7.5 billion in stock, which, had it been real money, would have been 30 times GitHub's annual recurring revenue.

Previously:
(2021) GitHub's Automatic Coding Tool Rests on Untested Legal Ground
(2020) GitHub Revamps Copyright Takedown Policy After Restoring YouTube-dl
(2018) Microsoft Agrees to Acquire GitHub... for $7.5 Billion [Updated]
(2014) Atom, GitHub's Editor Now Open Source


Original Submission

Related Stories

Atom, GitHub's Editor Now Open Source 20 comments

GitHub announced today that the editor it has been working on is now open source.

Today, we're excited to announce that we are open-sourcing Atom under the MIT License. We see Atom as a perfect complement to GitHub's primary mission of building better software by working together. Atom is a long-term investment, and GitHub will continue to support its development with a dedicated team going forward. But we also know that we can't achieve our vision for Atom alone. As Emacs and Vim have demonstrated over the past three decades, if you want to build a thriving, long-lasting community around a text editor, it has to be open source.

I have been using the Atom beta as my primary editor for the past few weeks and have been very happy with it.

It is currently only available for the Mac, but it is based on Chromium and Node, and "Windows and Linux releases are on the roadmap."

Microsoft Agrees to Acquire GitHub... for $7.5 Billion [Updated] 105 comments

[Update 20180604 @ 14:00 UTC: Acquisition confirmed. Microsoft is paying $7.5 billion in stock. Coverage at Microsoft, Security Week, The Register, and The Verge. Also, see the Microsoft blog post. --martyb]

Microsoft has reportedly acquired GitHub

Microsoft has reportedly acquired GitHub, and could announce the deal as early as Monday. Bloomberg reports that the software giant has agreed to acquire GitHub, and that the company chose Microsoft partly because of CEO Satya Nadella. Business Insider first reported that Microsoft had been in talks with GitHub recently.

Time to move off GitHub?

Previously: Microsoft Holds Acquisition Talks with Github

An AC also submitted Bloomberg's article.


Original Submission #1 Original Submission #2

GitHub Revamps Copyright Takedown Policy After Restoring YouTube-dl 17 comments

GitHub revamps copyright takedown policy after restoring YouTube-dl:

The source code for YouTube-dl, a tool you can use to download videos from YouTube, is back up on GitHub after the code repository took it down in October following a DMCA complaint from the Recording Industry Association of America (RIAA). Citing a letter from the Electronic Frontier Foundation (the EFF), GitHub says it ultimately found that the RIAA's complaint didn't have any merit.

[...]

This is the best possible outcome of the RIAA's attack on youtube-dl. Good on @GitHub for standing up for developers against DMCA § 1201 abuses.

The @EFF did amazing work representing the project, and you should read their letter: https://t.co/Whh0cKTgIF https://t.co/BT1aovWZx7

— Filippo Valsorda 💚🤍❤️ ✊ (@FiloSottile) November 16, 2020

If there's a silver lining to the episode, it's that GitHub is implementing new policies to avoid a repeat of the situation moving forward. [...]

GitHub is also establishing a $1 million defense fund to provide legal aid to developers against suspect section 1201 claims, as well as doubling down on its lobbying work to amend the DMCA and other similar copyright laws across the world.

GitHub’s Automatic Coding Tool Rests on Untested Legal Ground 73 comments

GitHub’s automatic coding tool rests on untested legal ground:

The Copilot tool has been trained on mountains of publicly available code

[...] When GitHub announced Copilot on June 29, the company said that the algorithm had been trained on publicly available code posted to GitHub. Nat Friedman, GitHub’s CEO, has written on forums like Hacker News and Twitter that the company is legally in the clear. “Training machine learning models on publicly available data is considered fair use across the machine learning community,” the Copilot page says.

But the legal question isn’t as settled as Friedman makes it sound — and the confusion reaches far beyond just GitHub. Artificial intelligence algorithms only function due to massive amounts of data they analyze, and much of that data comes from the open internet. An easy example would be ImageNet, perhaps the most influential AI training dataset, which is entirely made up of publicly available images that ImageNet creators do not own. If a court were to say that using this easily accessible data isn’t legal, it could make training AI systems vastly more expensive and less transparent.

Despite GitHub’s assertion, there is no direct legal precedent in the US that upholds publicly available training data as fair use, according to Mark Lemley and Bryan Casey of Stanford Law School, who published a paper last year about AI datasets and fair use in the Texas Law Review.

[...] And there are past cases to support that opinion, they say. They consider the Google Books case, in which Google downloaded and indexed more than 20 million books to create a literary search database, to be similar to training an algorithm. The Supreme Court upheld Google’s fair use claim, on the grounds that the new tool was transformative of the original work and broadly beneficial to readers and authors.

Microsoft’s GitHub Copilot Met with Backlash from Open Source Copyright Advocates:

This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • (Score: 4, Insightful) by JoeMerchant on Tuesday March 01 2022, @02:54AM

    by JoeMerchant (3937) on Tuesday March 01 2022, @02:54AM (#1225758)

    The Melancholy Elephants of C++ may be more numerous, with greater combinatorial space than the ones plying a 12 tone musical scale, but in the end... there are only so many ways to write certain phrases of code.

    Much of my best Qt code is lifted verbatim from the API documentation. It's clear, readable, and anybody who doesn't understand what it is doing can copy it into a Google search and get the full API documentation around the snippet.

    I suppose I should try Copilot before taking a position, but until Copilot starts coughing up nuggets like: "It looks like you're writing a music playlist database, would you like to try one of these?" copying little code snippets is like accidentally quoting movie lines - there are so many movies already made that you can't avoid repeating short phrases from several every time you speak.

    --
    Ukraine is still not part of Russia. https://www.newsweek.com/russian-state-tv-ukraine-war-dirty-bomb-putin-1754428
  • (Score: 5, Interesting) by bzipitidoo on Tuesday March 01 2022, @03:46AM

    by bzipitidoo (4388) on Tuesday March 01 2022, @03:46AM (#1225774) Journal

    Sounds like they treated the effort like they were publishing a scientific journal that is highly regarded and therefore too picky. Were any of those 22 papers garbage? I suspect a lot more than 5 of them were excellent. Like, maybe 20 out of 22, and the other 2 weren't bad, just not superlative.

    Peer reviewed journals are stack rankers from Hell.

  • (Score: -1, Spam) by Anonymous Coward on Tuesday March 01 2022, @06:20AM (1 child)

    by Anonymous Coward on Tuesday March 01 2022, @06:20AM (#1225794)

    Has Aristarchus weighed in on this, yet? I would respect his opinion, if it were allowed, by the censors of SoylentNews. They seem to have also been bought out by Micro$oft, and in the words of Edward Longshanks, "some turned for much less".

    • (Score: 2, Insightful) by Anonymous Coward on Tuesday March 01 2022, @10:48AM

      by Anonymous Coward on Tuesday March 01 2022, @10:48AM (#1225815)

      Dude, no one cares.

      They were given an option to opt out for several months as penance. Or leave forever.

      Continually whining ON EVERY FUCKING THREAD is not going to win them any friends.

  • (Score: 5, Interesting) by Mojibake Tengu on Tuesday March 01 2022, @07:16AM (4 children)

    by Mojibake Tengu (8598) Subscriber Badge on Tuesday March 01 2022, @07:16AM (#1225804) Journal

    This new phenomenon will not stay limited to just Open Source and Microsoft. IBM is already receding in dread, for they understand what they achieved, but the can of worms is wide open.

    It's the FSF now in panic, but every aspect of all human creative domains will be affected. All kinds of human engineering and technology could be harvested for proven concepts by AI. No legal constructions could prevent an AIP revolution.

    Embrace!

    - What is your logical position, girls?
    - We both are your enemy. Defend yourself.

    -- 绝命响应 Jué mìng xiǎngyìng

    --
    The edge of 太玄 cannot be defined, for it is beyond every aspect of design
    • (Score: 3, Insightful) by JoeMerchant on Tuesday March 01 2022, @10:45AM (3 children)

      by JoeMerchant (3937) on Tuesday March 01 2022, @10:45AM (#1225814)

      Isn't this just the scientific method, in a nutshell?

      Whether you say "harvested for proven concepts by AI" or "standing on the shoulders of giants" the outcome is the same.

      Progress leads to better tools, better tools lead to more progress. Did people feel the same way when scribes started doing the work of bards more effectively?

      --
      Ukraine is still not part of Russia. https://www.newsweek.com/russian-state-tv-ukraine-war-dirty-bomb-putin-1754428
      • (Score: 3, Disagree) by Mojibake Tengu on Tuesday March 01 2022, @01:41PM (2 children)

        by Mojibake Tengu (8598) Subscriber Badge on Tuesday March 01 2022, @01:41PM (#1225832) Journal

        But this situation is different to all previous epochs. Now it is the first time in known history when the technology transcends human mental abilities.

        This is not just some trivial science and progress; the impact of synthetic thinking on governance methods and religion is unpredictable at this moment.
        A complete crush of the current social model is imminent.

        --
        The edge of 太玄 cannot be defined, for it is beyond every aspect of design
        • (Score: 3, Insightful) by JoeMerchant on Tuesday March 01 2022, @05:46PM

          by JoeMerchant (3937) on Tuesday March 01 2022, @05:46PM (#1225904)

          I would say that a complete crush of the current social model was imminent upon the introduction of fire, farming, pottery, metalworking, written language, ocean/upwind sailing, gunpowder, external then internal combustion engines, the telegraph, wireless communication, heavier than air flight, electric power, refrigeration, interstate class highway systems, "the bomb", artificial satellites, computers and databases, the internet, smartphones, etc.

          Every one of those turned existing society on its head, transforming long standing social models or obliterating them entirely across much of the human world. Most people of the day didn't fully comprehend the new technology when it rolled out, nor its true implications for the near and distant future.

          What is called AI today is just existing algorithms becoming a little surprisingly capable of making distinctions in large datasets when scaled up to large numbers of "trained coefficients." It's not magic, just math.

          --
          Ukraine is still not part of Russia. https://www.newsweek.com/russian-state-tv-ukraine-war-dirty-bomb-putin-1754428
        • (Score: 4, Insightful) by Anonymous Coward on Tuesday March 01 2022, @07:42PM

          by Anonymous Coward on Tuesday March 01 2022, @07:42PM (#1225938)

          Now it is the first time in known history when the technology transcends human mental abilities.

          The technology transcended them when a trader made the first abacus, and when a scribe recorded the first story. Nothing new.
          The problem is, now it is the technology of lying that finally transcended human abilities to challenge it. :(
          The broadcast bullshit, then the astroturf bullshit, then the targeted bullshit, then the echo chambers of bullshit. Our brains' antiquated firmware has proven itself critically vulnerable to visual footage ("seeing is believing", for nothing on Earth could produce illusions before the cinema), and is now exploited left and right, by "the left" and "the right" and everyone else and their pets (literally, cue cat videos). And the global madness inevitably came. :(

          As to so-called "AI", wake me up when the thing becomes able to understand what it does; if a nuke does not put me to eternal sleep before that, which now feels far more probable than a glorified pattern recognizer magically achieving sentience.
          "GitHub Copilot tries to understand your intent and to generate the best code it can, but the code it suggests may not always work, or even make sense." https://copilot.github.com/#faqs

  • (Score: 0) by Anonymous Coward on Tuesday March 01 2022, @07:31AM

    by Anonymous Coward on Tuesday March 01 2022, @07:31AM (#1225806)

    That's a headline from the future. If you can't swindle them out, you buy them out.

  • (Score: 2) by Frosty Piss on Tuesday March 01 2022, @12:49PM (5 children)

    by Frosty Piss (4971) on Tuesday March 01 2022, @12:49PM (#1225828)

    Are there viable alternatives to MS GitHub?

    • (Score: 1, Informative) by Anonymous Coward on Tuesday March 01 2022, @01:19PM (2 children)

      by Anonymous Coward on Tuesday March 01 2022, @01:19PM (#1225830)

      Yes, gitlab and bitbucket come to mind.

      • (Score: 0) by Anonymous Coward on Wednesday March 02 2022, @02:05AM

        by Anonymous Coward on Wednesday March 02 2022, @02:05AM (#1226014)

        I've used both, github is still better in all the ways that count.

      • (Score: 3, Informative) by hendrikboom on Wednesday March 02 2022, @02:18AM

        by hendrikboom (1125) Subscriber Badge on Wednesday March 02 2022, @02:18AM (#1226016) Homepage Journal

        And gitea

    • (Score: 3, Touché) by iWantToKeepAnon on Tuesday March 01 2022, @08:31PM

      by iWantToKeepAnon (686) Subscriber Badge on Tuesday March 01 2022, @08:31PM (#1225953) Homepage Journal
      --
      "Happy families are all alike; every unhappy family is unhappy in its own way." -- Anna Karenina by Leo Tolstoy
    • (Score: 2) by jb on Wednesday March 02 2022, @03:23AM

      by jb (338) on Wednesday March 02 2022, @03:23AM (#1226033)

      Yes, just host your own source repository (like we all did for decades before behemoths like sourceforge & github came along).

      In fact, for all but the most complex projects, git is overkill and even something as simple (and commensurately more robust) as cvs will suffice.
