Apparently chardet got Claude to rewrite the entire codebase from LGPL to MIT?
-
@Foxboron https://fosdem.org/2026/schedule/event/SUVS7G-lets_end_open_source_together_with_this_one_simple_trick/ didn't watch this talk yet, but seems relevant!
EDIT: just watched it. Note: _loads_ of genAI video... feels like my brain is a bit broken. But entertaining. Goes through the history of copyright (from books in the 1700s) through to cleanrooming in the 1970s and then strongly makes the point that cleanrooming is "almost free" now.
True to the talk title, the talk offers no solutions, ending with "this is the end of open source as we know it" :/
@Foxboron the presenters have a live demo: https://malus.sh/
-
Apparently chardet got Claude to rewrite the entire codebase from LGPL to MIT?
https://github.com/chardet/chardet/releases/tag/7.0.0
That is one way to launder GPL code I guess?
@Foxboron Except the output can't be copyrighted and so the result is public domain. It can't even be licensed anymore.
-
YUP
copyright is for humans, not automata ―hard or soft.
so, ironically, the prompts are copyrightable but not the output.
so anything you want to copyright should not be prompted into a corporate regurgitation machine, including so-called grammar checkers.
-
@Foxboron Went ahead and added an issue since you can't apply an MIT license to public domain LLM output.
-
@Foxboron that's... not copyrightable, therefore not licensable?
-
@Foxboron@chaos.social I don't think this is going to hold up in court. Whatever you instructed that LLM to make can definitely be classified as a derivative work under the LGPL.
-
Sure, but we are not really looking at, nor discussing, a case where the LLM spits out something verbatim from another project here.
@Foxboron @joshbressers @scy verbatim isn’t the question here, the question is infringement. is the output here substantially derivative of previous versions of chardet to the point that it could be considered infringing? US copyright precedent is a muddled mess and I think this could implicate at least one unresolved circuit split. I don’t know what the answer will be but I know I wouldn’t want to be standing in the blast radius of that decision
-
@Foxboron If you can't copyright it, you can't license the copyright. Interesting times.
-
@Foxboron It looks like this was the PR?
https://github.com/chardet/chardet/pull/322
Even aside from the ethical and moral issues with LLMs, it doesn't seem optimal that a 15k line PR affecting almost a million dependent repos (if GitHub's count is to be believed) was up for three days before getting merged in.
-
@xgranade@wandering.shop @Foxboron@chaos.social 3 days, little review, 15K lines, on a library that seems to perform operations on text input?
yeah, that's a big fucking "no" from a security standpoint. How many goddamn security issues are waiting in the wings here?
-
@aud It's at least not systems code, so there's not a lot of potential for buffer overflow and other memory unsafety exploits, but yeah. No. chardet is not a small surface area.
-
@thomasjwebb @Foxboron @scy In the US, at least, human authorship is required for copyright, and if you try to copyright something that's a mix of AI and human generated then generally only the human generated part is copyrightable.
This is separate from the LLMs emitting text other people have written, so at *best* this code can't be licensed because it's not copyrightable, and at worst it's license laundering and there's precedent (IIRC) for stomping on that hard.
-
@xgranade@wandering.shop There's just no way that's a good idea. I'm pretty sure a human who tried to push a 15K-line rewrite into most libraries would be yelled at forever and have the PR rejected, or be asked to break it into smaller PRs, because it's just such a large change in one go and no one can possibly fit that entire thing into their head.
It doesn't magically become a good idea just because Claude shat it out.
-
@Foxboron Yeah but that's what I mean: Just because the end result is not copyrightable, does that automatically mean that it can't be a copyright violation?
Like, changing the format or medium of something is not a copyrightable work.
So, by that logic, if I take a copyrighted MP3 and convert it to AAC and publish that, my AAC is not copyrightable, but it's not a copyright violation to take it and publish it?
That's what I mean.
@scy @Foxboron It's a bit complicated, actually. IANAL, but this is what I understand:
- The music notation is copyrightable, individual notes are not. A sequence of notes is debatable, and it depends highly on recognizability AFAIK.
- A music recording is copyrightable. Playing that music in a distinctly different arrangement, less of an issue.
- Arguably, a change in digital format is either still the same recording, or sufficiently indistinguishable from it.
- Copyright has an ancient...
-
@scy @Foxboron ... naming and goes back to a time where making copies and distributing them was the hard part.
This is a non-problem in the digital age, which is why it's fine to create backup copies of copyrighted works, so long as the people accessing them are always the people who have purchased/licensed an original copy.
So LLMs training on GPL code is not itself a copyright violation, and their reproducing similar code isn't either, but publishing such sufficiently similar code is.
-
@Foxboron Fun, one of the fundamental problems I have with this technology!