Apparently chardet got Claude to rewrite the entire codebase from LGPL to MIT?
-
@thomasjwebb @Foxboron @scy In the US, at least, human authorship is required for copyright, and if you try to copyright something that's a mix of AI-generated and human-generated material, then generally only the human-generated part is copyrightable.
This is separate from the LLMs emitting text other people have written, so at *best* this code can't be licensed because it's not copyrightable, and at worst it's license laundering and there's precedent (IIRC) for stomping on that hard.
-
@aud It's at least not systems code, so there's not a lot of potential for buffer overflow and other memory unsafety exploits, but yeah. No. chardet is not a small surface area.
@xgranade@wandering.shop There's just no way that's a good idea. I'm pretty sure a human who tried to push a 15K rewrite into most libraries would be yelled at forever and the PR rejected, or asked to be broken into smaller PRs, because it's just such a large change in one go and no one can possibly fit that entire thing into their head.
It doesn't magically become a good idea just because Claude shat it out.
-
@Foxboron Yeah but that's what I mean: Just because the end result is not copyrightable, does that automatically mean that it can't be a copyright violation?
Like, changing the format or medium of something is not a copyrightable work.
So, by that logic, if I take a copyrighted MP3 and convert it to AAC and publish that, my AAC is not copyrightable, but it's not a copyright violation to take it and publish it?
That's what I mean.
@scy @Foxboron It's a bit complicated, actually. IANAL, but this is what I understand:
- The music notation is copyrightable, individual notes are not. A sequence of notes is debatable, and it depends highly on recognizability AFAIK.
- A music recording is copyrightable. Playing that music in a distinctly different arrangement, less of an issue.
- Arguably, a change in digital format is either still the same recording, or sufficiently indistinguishable from it.
- Copyright has an ancient...
-
@scy @Foxboron ... naming and goes back to a time where making copies and distributing them was the hard part.
This is a non-problem in the digital age, which is why it's fine to create backup copies of copyrighted works, so long as the people accessing them are always the people who purchased or licensed an original copy.
So LLMs training on GPL is not itself a copyright violation, and them reproducing similar code isn't either, but then publishing such sufficiently similar code is.
-
Apparently chardet got Claude to rewrite the entire codebase from LGPL to MIT?
https://github.com/chardet/chardet/releases/tag/7.0.0
That is one way to launder GPL code I guess?
@Foxboron Fun, one of the fundamental problems I have with this technology!
-
@Foxboron It looks like this was the PR?
https://github.com/chardet/chardet/pull/322
Even aside from the ethical and moral issues with LLMs, it doesn't seem optimal that a 15k-line PR affecting almost a million dependent repos (if GitHub's count is to be believed) was up for only three days before getting merged in.
@xgranade
They have been the upstream maintainer for years, so I don't see any huge issue with that. I would have done the same probably?
-
@Foxboron Posted an unkind reply and deleted, sorry. I'm getting frustrated with the whole AI thing today, and I'm not being my best self. I should probably just step offline for a bit.
This is just so... frustrating.
-
@scy
US courts are leaning towards LLM-generated code being fundamentally not copyrightable. This is a different problem from the moral issues I have with this.
@Foxboron @scy@chaos.social That'd be the US system. Then there's the various Euro systems that differ substantially. I'm certainly curious how this will turn out.
On the other hand: it'd require that those who can enforce their rights here actually do so.
Although IP rights are normally enforced pretty harshly, even on consumers (anyone remember the days of the torrent C&D letters, or the traditional find-and-ban-the-infringing-exhibitor days at Computex et al.?), they're effectively completely ignored in FOSS.
There is virtually no education for biz, CS or law students on this topic, let alone mandatory ed. Presenting the possibilities and rights to those who hold them is often dismissed by them, especially by developers on the younger side or those who are still in a "hobby" / "non commercial" stage. Only for them to complain about sustainability and demand funding shortly after.
Instead we see demands to throw substantial amounts of tax money at random FOSS projects, based on more or less random criteria and evaluators. Which will totally scale, right?
Virtually every company that was enforced against in terms of FOSS compliance ended up consciously allocating resources to FOSS in various ways. There are a lot of companies and they are a renewable resource in a functional economy.
But what do I know, rite? I just see the cases.
/rant
-
@Foxboron today's new term "code laundering" I'll keep that one 😆
-
@xgranade
Yes. But let's not clutch pearls over how an understaffed FOSS project decides to merge their work.