This is bad.
-
There's already copyright case law regarding LLM-generated text.
Judges have ruled it is not human authored and therefore not subject to copyright.
The latest one I read said that you must state which portions were generated and exclude those sections from the claimed copyright.
So "human put LLM code chunks together" is likely only protected for the arrangement of the chunks and not any of the code itself. (Not a lawyer, making a reasonable guess based on lots and lots of copyright knowledge and case law for things like remixing and collage work.)
@pathunstrom @SnoopJ @xgranade that matches my understanding (from reading about rulings), but it’s not clear to me what that means when an LLM reproduces already copyrighted material. Does the prior copyright mean the output can be a violation even tho it can’t be copyrighted itself? Does the non-copyrightability of output override the previous ownership? That sounds absurd, but to me treating LLM output as anything but a derivative work was already absurd
-
@ShadSterling @xgranade my understanding is that if you distribute someone else's protected work¹, you have infringed by distributing (that portion of) their work, full stop.
the particular means by which the infringement occurred are AFAIK entirely irrelevant to legal standing (i.e. the right for the owner of that work to sue the infringer), but the cases @pathunstrom is referring to may represent a gap between my understanding and the current practice of law in the US.
---
¹ or more precisely in this case: enough of someone else's protected work that they can make a convincing argument that it *is* their protected work in court, since outputs of all the models people talk about are in some sense *always* built from the protected works of others
-
@SnoopJ @xgranade @pathunstrom that’s what I would have expected before the (IMO nonsensical) rulings about LLM outputs; AFAIK, whether LLM output can be infringing in that way has not yet been tested in court. I don’t know what to expect when such a case is heard, and the way things have been going I’m not looking forward to finding out
-
@xgranade @ireneista "do you have five million dollars of disposable income to fund an alternative to the PSF" is a good place to start, if you want to frame it as a "hostile fork" situation. the only solution is to get involved in the messy process of politics and governance and try to figure out a way to negotiate a durable peace
@glyph @xgranade @ireneista why do we need an alternative to Pumpkin Spice Farts? And why does it have to cost so much?
-
I'm gonna be real with folks here. I fucked up, and bad, with my participation in the open-slopware list. As a result, I'm not the right person to do it, but there has to be some kind of accounting for what damage AI is doing to open source.
For all the whinging about "supply chains" over the past few years, it *is* a problem when your code suddenly depends on AI, even if only indirectly.
@xgranade why do you consider open-slopware a mistake, btw?
-
@outfrost I don't, per se, but I consider the way I participated in it to be a mistake, and one that got people hurt. Any time you make a list of people, no matter your intentions, that takes caution --- caution that I did not personally put into practice. I don't get to hide behind my intentions on that.
-
@xgranade gotcha, so it's not about the main list itself, but callouts on specific maintainers?
-
@xgranade I also dislike it, but the cat's out of the bag: even if it wasn't allowed, people would still be using it, just without revealing it
@MissingClara @xgranade That's a bad argument against having a policy. Policy is a statement of who does and doesn't belong. If they're using it without revealing it, trying to launder code with fraudulent provenance into your project, that's highly malicious behavior worthy of a ban from the project once they're caught. And it's a signal to your good contributors that the project is healthy and not going to be turned into irreparable garbage by slop bros.
-
@xgranade gotcha, so it's not about the main list itself, but callouts on specific maintainers?
@outfrost It's complicated. There's very valid critiques of the list, and also bad faith misrepresentations of what the list was. I can only speak to my own actions, given how complex everything got.
-
@iampytest1 @theorangetheme The idea of putting a noreply email address on your commits is extremely funny to me. What exactly is the point of putting an email address on the commit message at all, then? It's not supposed to be your ID badge. What do you think is the reason that standard was created?
-
@outfrost It's complicated. There's very valid critiques of the list, and also bad faith misrepresentations of what the list was. I can only speak to my own actions, given how complex everything got.
@outfrost I put more of those thoughts together when I first apologized, and when I also tried to be clear about what parts I did not and still do not apologize for.
-
@iampytest1 @theorangetheme The idea of putting a noreply email address on your commits is extremely funny to me. What exactly is the point of putting an email address on the commit message at all, then? It's not supposed to be your ID badge. What do you think is the reason that standard was created?
@iampytest1 @theorangetheme Just another example of how LLMs degrade code quality. If a commit is "authored" by Claude, there is absolutely no one accountable for that code. If you want to reach out to the committer, it goes to an unmonitored email address. Great! Very healthy for our systems.
-
@joXn I'm not here to critique specific individuals, and I'll ask that you don't use my replies to do so either. In particular, this is a large systemic problem across OSS, and while I can think of a few specific bad actors making things worse on purpose, I don't think that's the most common modality by far.
Besides, if having worked for msft is an automatic disqualification, I'll disclose that I worked there about 5.5 years.
-
@iampytest1 @theorangetheme Just another example of how LLMs degrade code quality. If a commit is "authored" by Claude, there is absolutely no one accountable for that code. If you want to reach out to the committer, it goes to an unmonitored email address. Great! Very healthy for our systems.
@iampytest1 @theorangetheme The only purpose behind it is anthropomorphizing Claude Code in order to sell "AI". You wouldn't put Visual Studio as a co-author on your commit. But because Claude is supposed to be "a person" (read: slave), we pretend this tool is an equal author, even though they don't exist and it's impossible to contact them.
-
@ancoghlan I mean, yes, but also AI is DDoSing the heck out of that process? More, the point I was getting at is that I wasn't able to find any policies against AI-generated code in the first place, so there's very little in the way of safeguards to prevent more such commits in the future.
@xgranade @ancoghlan I don't know that any denial of service has *yet* taken place in CPython, but it's definitely happening in OSS writ large.
I'm thinking again of Naomi Ceder's excellent PyCon US talk about the gift economies of open source and her words of caution to guard against financialized interests… I don't know how to square these contributions in that perspective beyond the amount of personal trust invested in people with the commit bit.
I respect and trust Serhiy and Gregory quite a bit! If such a commit were authored by a core developer whose work I am less personally familiar with, I might feel very differently…
-
@jo My thread detailing what I do and do not apologize for may be a good start?
-
@xgranade @ancoghlan I don't know that any denial of service has *yet* taken place in CPython, but it's definitely happening in OSS writ large.
I'm thinking again of Naomi Ceder's excellent PyCon US talk about the gift economies of open source and her words of caution to guard against financialized interests… I don't know how to square these contributions in that perspective beyond the amount of personal trust invested in people with the commit bit.
I respect and trust Serhiy and Gregory quite a bit! If such a commit were authored by a core developer whose work I am less personally familiar with, I might feel very differently…
@SnoopJ @xgranade @ancoghlan Examples of this happening for reference purposes: https://www.pcgamer.com/software/platforms/open-source-game-engine-godot-is-drowning-in-ai-slop-code-contributions-i-dont-know-how-long-we-can-keep-it-up/ https://arxiv.org/abs/2601.15494
-
@outfrost I put more of those thoughts together when I first apologized, and when I also tried to be clear about what parts I did not and still do not apologize for.
@xgranade i see, thanks a lot for the context!
-
@xgranade @ancoghlan I don't know that any denial of service has *yet* taken place in CPython, but it's definitely happening in OSS writ large.
I'm thinking again of Naomi Ceder's excellent PyCon US talk about the gift economies of open source and her words of caution to guard against financialized interests… I don't know how to square these contributions in that perspective beyond the amount of personal trust invested in people with the commit bit.
I respect and trust Serhiy and Gregory quite a bit! If such a commit were authored by a core developer whose work I am less personally familiar with, I might feel very differently…
@xgranade @ancoghlan I definitely cannot see how the use of the tool could be scaled up from where it is *right now* to "a substantial fraction of contributions use it" without burning up more human time.
I hope that the claim of ownership signified by the CLA is taken seriously in the inevitable-under-scale case of someone who contributes code they *don't* own, but I guess it's easy to wander into hypotheticals here.
I can say for sure it is a net negative in my view of the project and will probably say so in the next PSF survey, but I don't know how active I'm going to be about that displeasure until then. Guess it depends on what else changes.
-
Part of the problem with doing so is.... well, now what? It's not like a Python project can just... stop being a Python project?
But I think it's important to at least understand the scope of the problem.
@xgranade doesn’t micropython also run on Unix?
If you don’t depend on all those C extensions, maybe that will do…?