@soph I'm not an expert at all when it comes to licenses, but I think that if I were to release code that was heavily vibe-coded, I would feel compelled to release it into the public domain.
Unfortunately, that would also be problematic. If parts of the generated code come from licensed code, and the connection to the original is clear when the two are compared, then publishing it as public domain would potentially violate the original license.
-
Can we just put it bluntly?
If you're vibe-coding open source, you are *not* doing open source.
To do open source, you must be creating source code that both has clear provenance *and* is IP you have full rights to offer under a compatible license. As is quickly becoming clear, that second condition is getting tested and failing legal checks in places like the US.
Am I right that by "vibe-coding" you mean "generating code, with little to no human involvement in the process"? Which would be different from "using tools to generate code, but with a human actively in the loop".
I believe the crux of the case in the US was that the defendant claimed they did not create the works, a machine did, and because non-humans cannot claim IP protections they lost the case. Or did I misunderstand something about that case?
-
Am I right that by "vibe-coding" you mean "generating code, with little to no human involvement in the process"? Which would be different from "using tools to generate code, but with a human actively in the loop".
I believe the crux of the case in the US was that the defendant claimed they did not create the works, a machine did, and because non-humans cannot claim IP protections they lost the case. Or did I misunderstand something about that case?
I guess I am slightly more cynical about copyright law. I view it as a tool by capital, for capital — doubly so in countries like the US where bribery is legal in all but name.
Right now the stock market is fully leveraged on AI. I don't see the US Supreme Court ever handing down a ruling that intentionally puts the entire economy in peril.
-
I guess I am slightly more cynical about copyright law. I view it as a tool by capital, for capital — doubly so in countries like the US where bribery is legal in all but name.
Right now the stock market is fully leveraged on AI. I don't see the US Supreme Court ever handing down a ruling that intentionally puts the entire economy in peril.
Perhaps, though the recent ruling could have massive impacts. I suspect you're still right, at least until it becomes convenient for the people in power to enforce things and attack their enemies with it.
-
Am I right that by "vibe-coding" you mean "generating code, with little to no human involvement in the process"? Which would be different from "using tools to generate code, but with a human actively in the loop".
I believe the crux of the case in the US was that the defendant claimed they did not create the works, a machine did, and because non-humans cannot claim IP protections they lost the case. Or did I misunderstand something about that case?
No, I mean generating code using an LLM at all. Though it'll be up to the lawyers how it's applied. If the machine is the one generating the code, even if you're the one telling it what to generate, then it's still producing things you didn't type or research yourself, etc.
-
No, I mean generating code using an LLM at all. Though it'll be up to the lawyers how it's applied. If the machine is the one generating the code, even if you're the one telling it what to generate, then it's still producing things you didn't type or research yourself, etc.
Ah ok! In practice I expect a pretty big difference between the two.
Once you get down to brass tacks: if a human is the one driving, then it becomes hard to come up with language that bans LLMs but does not also ban things like compilers and digital cameras.
Because both of those are also instances of: "I pressed a button and it automatically generated binary output – none of which was produced directly by me."
-
Ah ok! In practice I expect a pretty big difference between the two.
Once you get down to brass tacks: if a human is the one driving, then it becomes hard to come up with language that bans LLMs but does not also ban things like compilers and digital cameras.
Because both of those are also instances of: "I pressed a button and it automatically generated binary output – none of which was produced directly by me."
This is a bit different from those examples, though. In the case of code, we're talking about the source code itself, regardless of what tools are later applied to it.
If the code itself cannot be copyrighted, then how it plays into the IP required to participate in open source becomes the issue.
-
This is a bit different from those examples, though. In the case of code, we're talking about the source code itself, regardless of what tools are later applied to it.
If the code itself cannot be copyrighted, then how it plays into the IP required to participate in open source becomes the issue.
The funniest outcome for sure is that the resulting ruling would make all software de facto illegal.
"To the maintainers of this open-source project. We are the big co legal department. We would like to get your written sign-off that no 'AI assistive tooling' has ever been used in this project. It is important for our supply chain. We expect a reply within 5 days."
If anything AI has touched becomes devoid of legal protections, then that would probably implode the tech sector overnight.
-
The funniest outcome for sure is that the resulting ruling would make all software de facto illegal.
"To the maintainers of this open-source project. We are the big co legal department. We would like to get your written sign-off that no 'AI assistive tooling' has ever been used in this project. It is important for our supply chain. We expect a reply within 5 days."
If anything AI has touched becomes devoid of legal protections, then that would probably implode the tech sector overnight.
I think you're maybe saying something a bit more grandiose than what I'm trying to get at, which is the code itself being generated by AI.
Here I'm not talking about things like autogenerated version bumps, but truly code produced by models trained on unknown IP.
If some projects need to roll back and try again, I don't think that would be devastating. Sure, newer projects might suffer, but there was plenty of waste when the hype was around stuff like blockchain, too.
-
I think you're maybe saying something a bit more grandiose than what I'm trying to get at, which is the code itself being generated by AI.
Here I'm not talking about things like autogenerated version bumps, but truly code produced by models trained on unknown IP.
If some projects need to roll back and try again, I don't think that would be devastating. Sure, newer projects might suffer, but there was plenty of waste when the hype was around stuff like blockchain, too.
I guess what I'm trying to get at is that if *any* amount of AI code is considered uncopyrightable, that would become a poison pill for any project that has had any amount of AI code contributed to it.
It's not like every line of code authored by an LLM has a label that says: "I was written by an LLM." If I'm not mistaken there are OSS projects like the Linux kernel which will accept PRs that were partially authored by LLMs. I don't see how that could be untangled.
-
I guess what I'm trying to get at is that if *any* amount of AI code is considered uncopyrightable, that would become a poison pill for any project that has had any amount of AI code contributed to it.
It's not like every line of code authored by an LLM has a label that says: "I was written by an LLM." If I'm not mistaken there are OSS projects like the Linux kernel which will accept PRs that were partially authored by LLMs. I don't see how that could be untangled.
@yosh @soph That part doesn't really seem like a problem, honestly. As I understand it.
It's already the case that Linux kernel contributors (as in most OSS projects) retain copyright on their contributions. The "Linux kernel" can't sue anyone for copyright infringement; only the specific copyright holders can, for the code they own.
As far as actual copyright enforcement goes, a particular contributor's contributions being public domain is presumably similar to that person simply not being interested in joining a copyright lawsuit as a plaintiff.
(Of course, if the LLM's output were found to be *infringing*, that could be a bigger problem.)
-
@yosh @soph That part doesn't really seem like a problem, honestly. As I understand it.
It's already the case that Linux kernel contributors (as in most OSS projects) retain copyright on their contributions. The "Linux kernel" can't sue anyone for copyright infringement; only the specific copyright holders can, for the code they own.
As far as actual copyright enforcement goes, a particular contributor's contributions being public domain is presumably similar to that person simply not being interested in joining a copyright lawsuit as a plaintiff.
(Of course, if the LLM's output were found to be *infringing*, that could be a bigger problem.)
@yosh @soph Or perhaps rather it *is* a problem, but an existing one, not a new one. For most projects.
The GNU project, in contrast, generally wants copyright assignment from contributors precisely to help avoid this sort of issue with license enforcement: https://www.gnu.org/licenses/why-assign.html
-
Ah ok! In practice I expect a pretty big difference between the two.
Once you get down to brass tacks: if a human is the one driving, then it becomes hard to come up with language that bans LLMs but does not also ban things like compilers and digital cameras.
Because both of those are also instances of: "I pressed a button and it automatically generated binary output – none of which was produced directly by me."
@yosh @soph From a copyright perspective, object code is a direct translation of the human-written source code. It's a 'derivative work', like any other translation. In copyright, there is a distinction between the idea and the expression of the idea, so if you say 'I have an idea for some code' and then the AI does the work, the work is not copyrightable. It may, however, be basically plagiarizing other work in the process. One expects this will not be the last we hear from lawyers.
-
I guess I am slightly more cynical about copyright law. I view it as a tool by capital, for capital — doubly so in countries like the US where bribery is legal in all but name.
Right now the stock market is fully leveraged on AI. I don't see the US Supreme Court ever handing down a ruling that intentionally puts the entire economy in peril.
@yosh@yosh.is @soph I see this argument all the time: that A.I. will somehow end copyright law and we'll somehow magically end up in a better situation. I think this is an extremely naive stance. If copyright law is "a tool of capital", why would its successor, brought about by A.I. tech-capitalism, not be equally as bad, or even worse?
-
Can we just put it bluntly?
If you're vibe-coding open source, you are *not* doing open source.
To do open source, you must be creating source code that both has clear provenance *and* is IP you have full rights to offer under a compatible license. As is quickly becoming clear, that second condition is getting tested and failing legal checks in places like the US.
(Asking in order to be corrected.)
My understanding is that the current precedent and the position of the US Copyright Office is that human authorship must be present for a work to be copyrighted.
Wouldn't that be the endpoint for determining the appropriate usage rights (at least in the US), i.e. that it's free to use?
Or is this one of those things where there's a very specific definition of FOSS/OSS that I'm blurring/ignorant of?
-
(Asking in order to be corrected.)
My understanding is that the current precedent and the position of the US Copyright Office is that human authorship must be present for a work to be copyrighted.
Wouldn't that be the endpoint for determining the appropriate usage rights (at least in the US), i.e. that it's free to use?
Or is this one of those things where there's a very specific definition of FOSS/OSS that I'm blurring/ignorant of?
I ask because I recently got corrected on the difference between "SCOTUS declined to hear the case, so a lower court's decision stands" and "it is set as official national legal precedent", and don't want to continue to make a similar mistake.
-
Ah ok! In practice I expect a pretty big difference between the two.
Once you get down to brass tacks: if a human is the one driving, then it becomes hard to come up with language that bans LLMs but does not also ban things like compilers and digital cameras.
Because both of those are also instances of: "I pressed a button and it automatically generated binary output – none of which was produced directly by me."
@yosh @soph Something I've been working with for that distinction is: "it produces bugs; can you or someone else go find what went wrong and fix it?"
This cleanly tells apart compilers, linters, and dictionary-based spellcheckers from AI tools. (It may also accidentally place non-FLOSS tools, especially online ones, in the AI category, but that's not too bad, since you can't disprove that they are.)