This is bad.
-
Naming individual people is sometimes justified, and I have done so in other contexts, but it's best done with extreme caution. For calling out the *systemic problem* of slopware vendors making incursions into OSS processes, it's just not worth the potential harm; that's not good praxis, imho.
@xgranade Other fun complications: OSS consumers actively paying maintainers of their dependencies is good and something we want to encourage. Ditto for offering maintainers free access to tools that make their lives easier. Yet some of those consumers and tools pose a credible threat to the entire edifice of open source collaboration.
I'm a former defence contractor who works for a retail bank, so the moral ground I'm standing on is far from high, but even I can see we're deep in a quagmire.
-
@xgranade Other fun complications: OSS consumers actively paying maintainers of their dependencies is good and something we want to encourage. Ditto for offering maintainers free access to tools that make their lives easier. Yet some of those consumers and tools pose a credible threat to the entire edifice of open source collaboration.
I'm a former defence contractor who works for a retail bank, so the moral ground I'm standing on is far from high, but even I can see we're deep in a quagmire.
@ancoghlan I mean, I get it, yeah. My PhD was largely funded by the US military, and was at an institution that is abusive in the extreme. My postdoc was a bit better, but not by a *lot*.
All that said, I do think that it isn't unreasonable to consider AI as being fashware, given its eugenicist origin and given the ways it's funded and developed. I get some folks are stuck in jobs that make and push it, but also, it's important to resist incursion into OSS spaces?
-
@MissingClara @xgranade That's a bad argument against having a policy. Policy is a statement of who does and doesn't belong. If they're using it without revealing it, trying to launder code with fraudulent provenance into your project, that's highly malicious behavior worthy of a ban from the project once they're caught. And it's a signal to your good contributors that the project is healthy and not going to be turned into irreparable garbage by slop bros.
@dalias @xgranade we already ban those slop spammers. that leaves reasonable contributors, but no, it isn't a good argument. some highly knowledgeable folks I know are experimenting with LLMs as a tool, with success; while I dislike it for ethical reasons, there is no objective technical reason to reject those contributions.
-
@dalias @xgranade we already ban those slop spammers. that leaves reasonable contributors, but no, it isn't a good argument. some highly knowledgeable folks I know are experimenting with LLMs as a tool, with success; while I dislike it for ethical reasons, there is no objective technical reason to reject those contributions.
@MissingClara @dalias I disagree with the use of AI on an ethical basis, and also believe that there's an objective technical reason to reject AI-generated code. In particular, LLMs do not model the behavior of code, and do not reason about its likely effects. The code similarly isn't designed in any meaningful way. Those properties aren't compatible with sound engineering?
-
@MissingClara @dalias I disagree with the use of AI on an ethical basis, and also believe that there's an objective technical reason to reject AI-generated code. In particular, LLMs do not model the behavior of code, and do not reason about its likely effects. The code similarly isn't designed in any meaningful way. Those properties aren't compatible with sound engineering?
-
@MissingClara @dalias (afk for a while due to other life things, not trying to ignore you, sorry)
-
@dalias @xgranade we already ban those slop spammers. that leaves reasonable contributors, but no, it isn't a good argument. some highly knowledgeable folks I know are experimenting with LLMs as a tool, with success; while I dislike it for ethical reasons, there is no objective technical reason to reject those contributions.
@MissingClara @xgranade There absolutely are technical reasons. Lack of provenance. Lack of ability to license the code compatibly with the project. High probability of easily missed errors - the slop machines excel at making things that look deceptively right in form but aren't. People who don't understand these problems are not "highly knowledgeable". They're folks who've already sabotaged their own cognitive faculties and whom it's not safe to trust.
-
@MissingClara @xgranade There absolutely are technical reasons. Lack of provenance. Lack of ability to license the code compatibly with the project. High probability of easily missed errors - the slop machines excel at making things that look deceptively right in form but aren't. People who don't understand these problems are not "highly knowledgeable". They're folks who've already sabotaged their own cognitive faculties and whom it's not safe to trust.
@dalias @MissingClara I don't even entirely disagree, but I wouldn't necessarily characterize people who use AI tools as not being "highly knowledgeable," given the sheer amount of disinformation that AI companies have put out about how (and indeed, if!) they work. Well-meaning experts can be, and have been, misled by that disinfo.
It's part of why a strong stance banning AI-generated code is so important, even at a technical level. It helps counter that disinformation.
-
@dalias @MissingClara I don't even entirely disagree, but I wouldn't necessarily characterize people who use AI tools as not being "highly knowledgeable," given the sheer amount of disinformation that AI companies have put out about how (and indeed, if!) they work. Well-meaning experts can be, and have been, misled by that disinfo.
It's part of why a strong stance banning AI-generated code is so important, even at a technical level. It helps counter that disinformation.
@dalias @MissingClara (still mostly afk, but trying to pop back on occasionally as best as i can)
-
@dalias @MissingClara I don't even entirely disagree, but I wouldn't necessarily characterize people who use AI tools as not being "highly knowledgeable," given the sheer amount of disinformation that AI companies have put out about how (and indeed, if!) they work. Well-meaning experts can be, and have been, misled by that disinfo.
It's part of why a strong stance banning AI-generated code is so important, even at a technical level. It helps counter that disinformation.
@dalias @MissingClara Anyway, that's part of why I don't think going after specific individuals is the most productive thing, even aside from the potential harm. AI in OSS is a systemic problem, irrespective of the expertise or lack thereof of the people using AI products.
-
@MissingClara @xgranade There absolutely are technical reasons. Lack of provenance. Lack of ability to license the code compatibly with the project. High probability of easily missed errors - the slop machines excel at making things that look deceptively right in form but aren't. People who don't understand these problems are not "highly knowledgeable". They're folks who've already sabotaged their own cognitive faculties and whom it's not safe to trust.
@dalias @xgranade I can concede the provenance and licensing points, but on the last point, that can also be true of regular user contributions. to be a good maintainer you need to be able to catch those errors, so it doesn't matter whether they come from a real user or an LLM. especially in a project like Python, where the smallest thing can break a lot of code downstream.
-
@dalias @xgranade I can concede the provenance and licensing points, but on the last point, that can also be true of regular user contributions. to be a good maintainer you need to be able to catch those errors, so it doesn't matter whether they come from a real user or an LLM. especially in a project like Python, where the smallest thing can break a lot of code downstream.
@MissingClara @xgranade If someone actually wrote it and it looks right, you have whatever mental process they went thru writing it plus the "looks right" as reasons it's likely right. If it was LLM slop, you only have how it looks.
-
@MissingClara @xgranade If someone actually wrote it and it looks right, you have whatever mental process they went thru writing it plus the "looks right" as reasons it's likely right. If it was LLM slop, you only have how it looks.
@dalias @MissingClara I don't want to sell anyone reviewing code short; my objection isn't that people aren't doing enough to review LLM-extruded code.
It's that reviewing extruded code is *at best* an effective mitigation of a risk that shouldn't have been there in the first place. With human-written code, you at least have some marginal, tiny trust that the author is competent, has some concept of how the code works, and so forth, but *none* of that exists for AI.
-
@dalias @MissingClara I don't want to sell anyone reviewing code short; my objection isn't that people aren't doing enough to review LLM-extruded code.
It's that reviewing extruded code is *at best* an effective mitigation of a risk that shouldn't have been there in the first place. With human-written code, you at least have some marginal, tiny trust that the author is competent, has some concept of how the code works, and so forth, but *none* of that exists for AI.
@dalias @MissingClara From that point, I guess I'm making a "just because the hole is there doesn't mean we should keep digging" kind of argument. Code review, even by excellent and competent reviewers, has at best *reduced* but not eliminated defects and vulnerabilities in code.
Code review is incredibly difficult; it's why we rely so much on having incredibly competent reviewers.
-
@dalias @xgranade I can concede of the provenance and licensing, but on the last point, that can also be true from regular user contributions. to be a good maintainer you need to be able catch those errors, so it doesn't matter if it comes from a real user or a LLM. especially in a project like Python, where the smallest thing can break a lot of code downstream.
@MissingClara The intent of a real contributor is to provide a good, safe, working code change. The goal of an LLM is to deceive you into believing that it produced good, safe, working code.
Given how important details are in something like Python, how far do you want to ride this conflict of interest?
-
@MissingClara @xgranade If someone actually wrote it and it looks right, you have whatever mental process they went thru writing it plus the "looks right" as reasons it's likely right. If it was LLM slop, you only have how it looks.
-
@MissingClara The intent of a real contributor is to provide a good, safe, working code change. The goal of an LLM is to deceive you into believing that it produced good, safe, working code.
Given how important details are in something like Python, how far do you want to ride this conflict of interest?
-
@MissingClara @xgranade That process still has some error rate. So your overall error rate is going to be much higher when the code has no provenance and is just slop than when a defect requires both you and the author to have made mistakes in your thought processes at the same time.
-
@dalias @MissingClara From that point, I guess I'm making a "just because the hole is there doesn't mean we should keep digging" kind of argument. Code review, even by excellent and competent reviewers, has at best *reduced* but not eliminated defects and vulnerabilities in code.
Code review is incredibly difficult; it's why we rely so much on having incredibly competent reviewers.
@xgranade @dalias I understand, but at least for Python specifically, the scrutiny applied to contributions is enough to catch the kinds of issues you're mentioning. if these contributions are being accepted, it's because it was practical to review contributions that made use of LLMs; if that stops being the case, they will stop being accepted. that said, I still disagree with it on an ethical level.
-
@xgranade @dalias I understand, but at least for Python specifically, the scrutiny applied to contributions is enough to catch the kinds of issues you're mentioning. if these contributions are being accepted, it's because it was practical to review contributions that made use of LLMs; if that stops being the case, they will stop being accepted. that said, I still disagree with it on an ethical level.
@MissingClara @dalias I'll reserve a more detailed disagreement here, as I'm not sure it's productive at this point. Suffice to say, I do not agree with the use of LLMs at an ethical *or technical* level.
That said, I do want to pull back slightly — my original point, as per "I'm not trying to pick on Python here," is that OSS *in general* is under significant threat from AI products, not that Python in particular is worse off than the field in general.