As projected here back in October, there is now a class action lawsuit, albeit in its earliest stages, against Microsoft over its blatant license violation through its use of the M$ GitHub Copilot tool. The software project, Copilot, strips copyright licensing and attribution from existing copyrighted code on an unprecedented scale. The class action lawsuit insists that machine learning algorithms, often marketed as "Artificial Intelligence", are not exempt from copyright law nor are the wielders of such tools.
The $9 billion in damages is arrived at through scale. When M$ Copilot rips code without attribution and strips the copyright license from it, it violates the DMCA three times. So if only 1% of its 1.2M users receive such output, the licenses were breached 12k times, which translates to 36k DMCA violations, at a very low-ball estimate.
"If each user receives just one Output that violates Section 1202 throughout their time using Copilot (up to fifteen months for the earliest adopters), then GitHub and OpenAI have violated the DMCA 3,600,000 times. At minimum statutory damages of $2500 per violation, that translates to $9,000,000,000," the litigants stated.
Besides open-source licenses and DMCA (§ 1202, which forbids the removal of copyright-management information), the lawsuit alleges violation of GitHub's terms of service and privacy policies, the California Consumer Privacy Act (CCPA), and other laws.
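The damages arithmetic quoted above can be checked with a minimal sketch. The function name and parameters are hypothetical and used only for illustration; the inputs (1.2M users, three violations per offending output, $2500 minimum statutory damages) are the figures quoted above, and the share of affected users is left as a variable.

```python
def dmca_exposure(users, percent_affected, violations_per_output=3,
                  damages_per_violation=2500):
    """Return (affected users, DMCA violations, minimum damages in USD).

    Hypothetical helper for illustration only. Assumes each affected
    user receives exactly one offending output, as the complaint's
    own estimate does.
    """
    # Integer arithmetic avoids floating-point rounding on the percentage.
    affected = users * percent_affected // 100
    violations = affected * violations_per_output
    return affected, violations, violations * damages_per_violation


if __name__ == "__main__":
    # The complaint's scenario: every one of the 1.2M users is affected once.
    print(dmca_exposure(1_200_000, 100))
    # The summary's low-ball scenario: only 1% of users are affected.
    print(dmca_exposure(1_200_000, 1))
```

With every user affected once, this reproduces the 3,600,000 violations and $9,000,000,000 cited by the litigants; with 1% affected it gives the 12k breaches and 36k violations mentioned in the summary.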
The suit is on twelve (12) counts:
– Violation of the DMCA.
– Breach of contract (two counts).
– Tortious interference.
– Fraud.
– False designation of origin.
– Unjust enrichment.
– Unfair competition.
– Violation of privacy act.
– Negligence.
– Civil conspiracy.
– Declaratory relief.
Furthermore, these actions are contrary to what GitHub stood for prior to its sale to M$ and indicate yet another step in ongoing attempts by M$ to undermine and sabotage Free and Open Source Software and the supporting communities.
Previously:
(2022) GitHub Copilot May Steer Microsoft Into a Copyright Lawsuit
(2022) Give Up GitHub: The Time Has Come!
(2021) GitHub's Automatic Coding Tool Rests on Untested Legal Ground
(Score: 3, Interesting) by sjames on Friday January 06 2023, @05:43PM (5 children)
I have heard the counter-argument, but both programming and natural language are filled with pat phrases. For example, I'll bet you read "Your point is well taken" at some point before the first time you said or wrote it. That's not an accusation, it's just how language works. It's also to be expected of a synthetic neural network.
The person who wrote the code that CoPilot learned from probably did so because their own naturally occurring neural net distilled down many variants that it was exposed to and identified that particular variation as a canonical phrase.
I'm almost 100% certain that I have written a line of code at some point in my career that was identical to a line someone else wrote and that neither of us is aware of it, simply because it was a good concise way to express the thought.
(Score: 2, Interesting) by shrewdsheep on Friday January 06 2023, @05:52PM (4 children)
Indeed one line wouldn't qualify and it would be impossible to find the primordial line anyhow. This is precisely what will be tested in court: how many lines qualify as plagiarism. For me, it would be about 10 lines of code but it is really statistics: how many lines make a unique snippet of code.
(Score: 4, Interesting) by HiThere on Friday January 06 2023, @06:13PM
It probably will end up being something that stupid, but that's a really stupid measure of whether anything significant was copied. And even more of whether anything significant that was original to the author was copied.
And if it's successful, Knuth can sue every programmer in existence for their entire worth.
Javascript is what you use to allow unknown third parties to run software you have no idea about on your computer.
(Score: 1, Informative) by Anonymous Coward on Friday January 06 2023, @06:44PM
> about 10 lines of code
Or about a half-line of APL...
(Score: 3, Interesting) by turgid on Friday January 06 2023, @09:30PM
Something that troubles me is the concept of accessors in OOP languages, getters and setters. They're boilerplate, yet they consume many lines of code. Does copying one of those constitute plagiarism? My next question is "What have we been doing for the last 40 years?" Why don't OOP languages provide operators for this purpose? Why are we writing this code by hand?
I refuse to engage in a battle of wits with an unarmed opponent [wikipedia.org].
(Score: 3, Interesting) by RS3 on Sunday January 08 2023, @05:27AM
All this is making me wonder: are they looking at source code? Or object / binary executable? 'Cause you could copy the binary, toss in some nops here and there, recalculate the checksum, and I think it'd be difficult to detect. Or you could make the source look very different. I dunno, it's a mess; there are no easy or simple answers.