I guess what I'm trying to get at is that if *any* amount of AI code is considered uncopyrightable, that would become a poison pill for any project that has had any amount of AI code contributed to it.
It's not like every line of code authored by an LLM has a label that says: "I was written by an LLM." If I'm not mistaken, there are OSS projects like the Linux kernel which will accept PRs that were partially authored by LLMs. I don't see how that could be untangled.
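There is no such label today, but a convention could exist: git commit messages already carry machine-readable trailers (e.g. `Signed-off-by:`), and a hypothetical `Assisted-by:` trailer could be scanned the same way. A toy sketch (the trailer name is invented, not any project's actual policy):

```python
def find_trailer(message: str, key: str) -> list[str]:
    # Scan the final paragraph of a commit message for `Key: value`
    # trailers, matching the key case-insensitively.
    last_para = message.strip().split("\n\n")[-1]
    values = []
    for line in last_para.splitlines():
        if ":" in line:
            k, _, v = line.partition(":")
            if k.strip().lower() == key.lower():
                values.append(v.strip())
    return values

msg = """Add retry logic to the fetch helper

Assisted-by: SomeLLM v2 (hypothetical trailer)
Signed-off-by: A. Contributor <a@example.com>
"""

print(find_trailer(msg, "assisted-by"))  # → ['SomeLLM v2 (hypothetical trailer)']
```

Of course, this only helps with code that contributors *choose* to label, which is exactly the untangling problem: nothing forces the trailer to be there.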
-
@yosh @soph That part doesn't really seem like a problem, honestly, as I understand it.
It's already the case that Linux kernel contributors (like those of most OSS projects) retain copyright on their contributions. The "Linux kernel" can't sue anyone for copyright infringement; only the specific copyright holders can, for the code they own.
A particular contributor's contributions being public domain is presumably similar, as far as actual copyright enforcement goes, to that person not being interested in joining as a plaintiff in a copyright lawsuit.
(Of course, if the LLM's output were found to be *infringing* that could be a bigger problem.)
-
@yosh @soph Or perhaps rather it *is* a problem, but it's an existing problem, not a new one. For most projects.
The GNU project in contrast generally wants copyright assignment from contributors exactly to help avoid this sort of issue with license enforcement: https://www.gnu.org/licenses/why-assign.html
-
Ah ok! In practice I expect there is likely going to be a pretty big difference between the two.
Once you get down to brass tacks: if a human is the one driving, then it becomes hard to come up with language that bans LLMs but does not also ban things like compilers and digital cameras.
Because both of those are also instances of: "I pressed a button and it automatically generated binary output – none of which was produced directly by me."
@yosh @soph From a copyright perspective, object code is a direct translation of the human-written source code. It's a 'derivative work', like any other translation. In copyright there is a distinction between the idea and the expression of the idea, so if you say 'I have an idea for some code' and the AI does the work, the result is not copyrightable. It may, however, be basically plagiarizing other work in the process. One expects this will not be the last we hear from lawyers.
-
I guess I am slightly more cynical about copyright law. I view it as a tool by capital, for capital — doubly so in countries like the US where bribery is legal in all but name.
Right now the stock market is fully leveraged on AI. I don't see how the US would ever find itself in a position where a Supreme Court ruling would ever intentionally put the entire economy in peril.
@yosh@yosh.is @soph I see this argument all the time: that A.I. will somehow end the copyright laws and we'll somehow magically end up in a better situation. I think this is an extremely naive stance. If copyright law is "a tool of capital", why would its successor, brought about by A.I. tech-capitalism, not be equally as bad, or even worse?
-
Can we just put it bluntly?
If you're vibe-coding open source, you are *not* doing open source.
To do open source, you must be creating source code that has clear provenance *and* that is IP you have full rights to offer under a compatible license. As is quickly becoming clear, that second requirement is getting tested and failing legal checks in places like the US.
(Asking in order to be corrected)
My understanding is that the current precedent and the position of the US Copyright Office is that human authorship must be present for a work to be copyrighted.
Wouldn't that settle the question of what the appropriate usage rights are (at least in the US), e.g. that it's free to use?
Or is this one of those things where there's a very specific definition of FOSS/OSS that I'm blurring/ignorant of?
I ask because I recently got corrected on the difference between "SCOTUS declined to hear the case, so a lower court's decision stands" and "it is set as official national legal precedent", and don't want to continue to make a similar mistake.
-
@yosh @soph A test I've been working with for that distinction is: "when it produces bugs, can you or someone else go find what went wrong and fix it?"
This clearly tells apart compilers, linters and dictionary-based spellcheckers from AI tools. (It may also accidentally place non-FLOSS tools, especially online ones, in the AI category, but that's not too bad, as you can't disprove that they belong there.)
-
@ids1024 @yosh @soph the point is that if the LLM's output is not copyrightable, then for that code, the GPL does not apply, because in order for a license to mean anything, _someone_ must first hold the copyright.
That means that if we somehow get to a point where *all* of the code in the kernel was produced by an LLM, then anyone could ship a binary of that version of the kernel without shipping source code, and GNU/FSF wouldn't be able to do anything about it.
-
Oddly enough, the people who can't be bothered to write their own code ALSO can't be bothered to grapple with the responsibilities and legal repercussions of their actions.
Go fig.
-
@soph I'm not an expert at all when it comes to licenses, but I think that if I were to release code that was heavily vibe-coded, I would feel compelled to release it into the public domain.
@PierricD @soph if it was heavily vibe-coded, then the non-vibe-coded parts could be released into the public domain, but the vibe-coded parts (if I understand the court rulings correctly) never belonged to anyone in the first place.
It's as if every LLM releases each "work" that it produces into the public domain at the exact microsecond that it emits that "work".
-
Sorry, I guess I don't quite understand how you're making this distinction? If gcc produces a bug, I don't actually know how to fix gcc to make it stop producing that bug? Or is that not what you meant?
Or are you trying to say that one category of tools is deterministic and the other is not? Because that's true; but arguably also kind of the point.
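The determinism point can be made concrete with a toy sketch (stand-in functions, not real tools): a compiler-like transformation is a pure function of its input, while a sampler is not.

```python
import random

def compile_like(src: str) -> str:
    # Stand-in for a compiler: a pure, deterministic transformation.
    # Same input, same output, every time; a wrong output can be
    # traced back to a specific rule and fixed.
    return src.strip().upper()

def llm_like(prompt: str) -> str:
    # Stand-in for an LLM: the output is *sampled*, so repeated calls
    # with an identical prompt need not agree, and there is no fixed
    # rule to point at when the output is wrong.
    words = ["foo", "bar", "baz", "qux"]
    rng = random.Random()  # deliberately unseeded
    return " ".join(rng.choice(words) for _ in range(6))

assert compile_like("int main;") == compile_like("int main;")  # always holds
print(llm_like("write a function") == llm_like("write a function"))  # not guaranteed
```

(Of course, real inference can be made reproducible by pinning the seed and temperature; the practical difference is that nobody can read the "rule" the model applied the way they can read a compiler pass.)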
-
I'd contest that this is not clear, either. If it can be shown that something an LLM created has clear lineage back to a copyrighted work, then we have a clear breach of the original license.
This needs to be the case to support the original license and the spirit of that license. Tools that are just doing a glorified copy/paste are violating those original licenses.
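On "clear lineage": for near-verbatim copying, even a crude textual similarity check makes the lineage visible. A minimal sketch using Python's standard `difflib` (the 0.8 threshold is an arbitrary illustration, not a legal standard):

```python
import difflib

def similarity(candidate: str, original: str) -> float:
    # Ratio in [0.0, 1.0]; 1.0 means the two texts are identical.
    return difflib.SequenceMatcher(None, candidate, original).ratio()

licensed = "def gcd(a, b):\n    while b:\n        a, b = b, a % b\n    return a\n"
emitted  = "def gcd(x, y):\n    while y:\n        x, y = y, x % y\n    return x\n"

score = similarity(emitted, licensed)
print(f"similarity: {score:.2f}")
if score > 0.8:
    print("near-verbatim: lineage back to the original is plausible")
```

Real provenance tooling would compare tokens or ASTs rather than raw characters, and a high score by itself says nothing about the *direction* of copying; it only flags cases worth a human look.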
-
@soph Interesting: do you have a citation for vibe-code failing legal checks? No issue if you don't, I can hop on Google and find it, but this would be the first I'd heard of it (most of what I've heard is that LLM output is uncopyrightable, not that it violates someone else's copyright).
The reason I ask is that I'm aware of corporate contexts where it's being used, and if the courts are leaning towards declaring these outputs copyright violations, that could have significant implications. (Declaring them non-copyrightable probably matters less: in practice, most companies protect source code as a trade secret, because it's hard to prove provenance in a court of law... or by moving so fast that a competitor exfiltrating an out-of-date chunk of source isn't super useful, or would require the competitor to also have the hardware / architecture the code "lives" in.)