This is bad.
-
@joXn I'm not here to critique specific individuals, and I'll ask that you don't use my replies to do so either. In particular, this is a large systemic problem across OSS, and while I can think of a few specific bad actors making things worse on purpose, I don't think that's the most common modality by far.
Besides, if having worked for msft is an automatic disqualification, I'll disclose that I worked there for about 5.5 years.
@xgranade Thank you for clearly expressing a boundary here, which I will be glad to respect.
My point was not that “working for Microsoft” is disqualifying — I also worked there for about that long (on a Python-related product, too!).
It goes more to a broader question of what puts any OSS project squarely in the LLM “blast radius”—when a significant set of senior contributors really believe in AI as a “useful tool”, that will be more “useful” with “improvement”, that’s probably a problem.
-
@ancoghlan I mean, yes, but also AI is DDoSing the heck out of that process? More to the point, what I was getting at is that I wasn't able to find any policies against AI-generated code in the first place, so there's very little in the way of safeguards to prevent more such commits in the future.
@ancoghlan @xgranade not only are AI people DDoSing that process, the process was never intended to review plausible-looking extrusions of GitHub paste from your own tools. The threat is new, increasingly documented as a cognitive hazard, and it's scary to see people with commit access to the backbone of the ecosystem just YOLO it with hubris.
-
@joXn Thanks, I really appreciate it. Anyway, yeah, I agree that that's a problem when there's no active opposition from people with the ability to influence the direction of a project. That's a very broad problem in OSS right now, though... I'm just sad that it's hit this close to home.
-
@xgranade @ancoghlan I don't know that any denial of service has *yet* taken place in CPython, but it's definitely happening in OSS writ large.
I'm thinking again of Naomi Ceder's excellent PyCon US talk about the gift economies of open source and her words of caution to guard against financialized interests… I don't know how to square these contributions with that perspective, beyond the amount of personal trust invested in people with the commit bit.
I respect and trust Serhiy and Gregory quite a bit! If such a commit were authored by a core developer whose work I am less personally familiar with, I might feel very differently…
@SnoopJ @xgranade We're definitely getting hit by slop PRs as well (easiest to see by looking at the list of closed PRs in the CPython repo). Unfortunately, I think the initial LLM guidance was born out of a Discord discussion, so I don't have a handy link for the rationale behind it. https://github.com/python/devguide/issues/1513 has *some* info, but not the original examples where we saw LLMs being used to good effect by non-native English speakers to participate more easily in English-only threads.
-
@SnoopJ @xgranade @pathunstrom that’s what I would have expected before the (IMO nonsensical) rulings about LLM outputs; AFAIK, whether LLM output can be infringing in that way has not yet been tested in court. I don’t know what to expect when such a case is heard, and the way things have been going I’m not looking forward to finding out
@ShadSterling @SnoopJ @xgranade the last one I heard might happen settled almost immediately. (I want to say Disney was on the plaintiff side and you need to be extra confident to run defense against them)
-
@ancoghlan @xgranade I have experienced the assistive effect it has when operating across a language barrier¹ and can't deny that it's very real.
Appreciate the peek behind the curtains.
---
¹ though gosh I wish we had specialist machine translator models doing this very important work instead of generalists who might just pepper in bad facts for non-linguistic reasons >.<
-
As a second addendum, since this has come up in several reply threads: the number of commits is limited so far, and doesn't date back past December 5, 2025, as far as I'm aware.
The Python-specific part of that broader problem, at least to my mind, is that I don't see a mechanism for limiting the exposure to those commits, or for preventing further and more expansive commits in the future.
Addendum the third, amplifying something I've said a few times now in replies: as much as this is not about Python *in particular* so much as Python *as an example* of a systemic problem, it's also not about individual people so much as about OSS communities writ large.
There are absolutely bad-faith actors at play here who have made active efforts to make things worse, but I don't think that's common, nor do I think pointing fingers at individual contributors here is a useful strategy.
-
Naming individual people is sometimes justified, and I have done so in other contexts, but it's best done with extreme caution. For calling out the *systemic problem* of slopware vendors making incursions into OSS processes, it's just not worth the potential harm; that's not good praxis imho.
-
@xgranade Other fun complications: OSS consumers actively paying maintainers of their dependencies is good and something we want to encourage. Ditto for offering maintainers free access to tools that make their lives easier. Yet some of those consumers and tools pose a credible threat to the entire edifice of open source collaboration.
I'm a former defence contractor that works for a retail bank so the moral ground I'm standing on is far from high, but even I can see we're deep in a quagmire.
-
@ancoghlan I mean, I get it, yeah. My PhD was largely funded by the US military, and was at an institution that is abusive in the extreme. My postdoc was a bit better, but not by a *lot*.
All that said, I do think that it isn't unreasonable to consider AI as being fashware, given its eugenicist origin and given the ways it's funded and developed. I get some folks are stuck in jobs that make and push it, but also, it's important to resist incursion into OSS spaces?
-
@MissingClara @xgranade That's a bad argument against having a policy. Policy is a statement of who does and doesn't belong. If they're using it without revealing it, trying to launder code with fraudulent provenance into your project, that's highly malicious behavior worthy of a ban from the project once they're caught. And it's a signal to your good contributors that the project is healthy and not going to be turned into irreparable garbage by slop bros.
@dalias @xgranade we already ban those slop spammers. that leaves reasonable contributors. but no, it isn't a good argument: some highly knowledgeable folks I know are experimenting with LLMs as a tool, with success. while I dislike it for ethical reasons, there is no objective technical reason to reject those contributions.
-
@MissingClara @dalias I disagree with the use of AI on an ethical basis, and also believe that there's an objective technical reason to reject AI-generated code. In particular, LLMs do not model the behavior of code, and do not reason about its likely effects. The code similarly isn't designed in any meaningful way. Those properties aren't compatible with sound engineering?
-
@MissingClara @dalias (afk for a while due to other life things, not trying to ignore you, sorry)
-
@MissingClara @xgranade There absolutely are technical reasons. Lack of provenance. Lack of ability to license the code compatibly with the project. High probability of easily missed errors - the slop machines excel at making things that look deceptively right in form but aren't. People who don't understand these problems are not "highly knowledgeable". They're folks who've already sabotaged their own cognitive faculties and whom it's not safe to trust.
-
@dalias @MissingClara I don't even entirely disagree, but I wouldn't necessarily characterize people who use AI tools as not being "highly knowledgeable," given the sheer amount of disinformation that AI companies have put out about how (and indeed, if!) they work. Well-meaning experts can and have come under misapprehensions due to that disinfo.
It's part of why a strong stance banning AI-generated code is so important, even at a technical level. It helps counter that disinformation.
-
@dalias @MissingClara (still mostly afk, but trying to pop back on occasionally as best as i can)
-
@dalias @MissingClara Anyway, that is part of why I don't think going after specific individuals is the most productive thing, even aside from the potential harm. AI in OSS is a systemic problem, irrespective of the expertise or lack thereof of the people using AI products.
-
@dalias @xgranade I can concede the provenance and licensing points, but on the last point, that can also be true of regular user contributions. to be a good maintainer you need to be able to catch those errors, so it doesn't matter whether they come from a real user or an LLM, especially in a project like Python, where the smallest thing can break a lot of code downstream.
-
@MissingClara @xgranade If someone actually wrote it and it looks right, you have whatever mental process they went thru writing it plus the "looks right" as reasons it's likely right. If it was LLM slop, you only have how it looks.