oh someone finally offered cory doctorow a big enough sack of cash to do a cory doctorow?
-
@Da_Gut that's incorrect for all of the local, supposedly open source models I know of
all of the research I've read on this has easily extracted verbatim plagiarized text from the models, because all of them have their origins in the same sources, usually Facebook's leaked llama model or deepseek (which itself took from previous models). it isn't possible for LLM models to be trained by anything other than a billion dollar company or a state operating like one.
@Da_Gut this is such a common thing that one of the multibillion dollar AI companies posted their official policy on it: using crafted prompts to extract the training data from an existing model of theirs is IP theft. of IP that was stolen from every web page ever.
that's one of the mechanisms that will be used to make local LLMs a boon for the AI industrial complex rather than a threat. these supposed alternatives are already highly dependent on the system they claim to oppose.
-
@zzt I've had to shut down a website I've been running since 1995. I've left a single page that basically says fuck AI. This was a teeny niche site with minuscule traffic. LLM scrapers drove my hosting fees into the hundreds. My ISP said they wouldn't charge me, all I had to do was destroy a lifetime of work and remove anything the LLM could scrape.
-
oh good, the "you're just doing purity culture" thing is already taking hold over on bluesky
so the line is now supposed to be that local LLMs are good and moral and SaaS LLMs are bad, when local LLMs come from the same fucked system that's also actively making it impossible to buy computing hardware powerful enough to run even a shitty local LLM? is that about right? I'm supposed to clap cause someone with money is running a plagiarism machine but slower and shittier on their desktop?
@zzt I love (hate) the "open source" LLMs lie
-
@zzt oh, I did not know that. So if you disconnect it from the Internet, it ceases to work?
@Da_Gut the factors I'm talking about aren't technical ones, they're social and systemic. specifically:
- local LLMs are worse than cloud ones, and necessarily must always be. it isn't possible for independent development of models to happen, and LLMs are already on an intentionally fast deprecation cycle. old models aren't viewed as useful by anybody.
- it's very easy for established companies to take action against local models as IP theft, and they've already laid the groundwork for this
@Da_Gut
- the hardware necessary to run local LLMs is being bought up en masse, in advance of its production, by AI companies, making local LLMs an avenue available only to the wealthy
- in any case, no such thing as an open source LLM exists. they are all binary blobs derived from proprietary and plagiarized data.
-
@MissConstrue @zzt but that is really annoying!!!!
-
@zzt I have this vague feeling that there's some underlying need for "true" sapient AI that's feeding this.
-
@zzt Me personally, I'd love to have a sentient android friend, but reality just doesn't support that right now. Maybe one day far in the future someone will crack real AI, but these models are nowhere near real cognition.
-
thx for telling me that everything I have hosted on the web getting repeatedly scraped to death by what would previously be considered a massive attack but is now being carried out by the largest corporations in the world is normal, actually. hope they give us good licensing terms on our data, uhhh no wait their IP, once they're done killing and buying all the original data sources
@zzt
What did he do this time?
-
@ohmu this is a pretty good summary: https://hol.ogra.ph/notes/aiyjrab16ujh21v2
-
@zzt
Thank you kindly
-
@zzt but they fixed the one problem they pretended to listen to the detractors about! Surely, that means we can purely focus on how effective we say it is at what it does, right?
obligatory /s