Yesterday Cory Doctorow argued that refusal to use LLMs was mere "neoliberal purity culture".
-
@FediThing @pluralistic @tante i feel similarly. big tech has taken the notion of AI and LLMs as a cue/excuse to mount a global campaign of public manipulation, pump gazillions of $ into a speculative project, and convince everyone it's inevitable tech that has to be put in every bag of potato chips. the backlash is that anything bearing the name AI or LLM becomes poisonous plague, and people unfollow anyone who's touched it in any way or talks about it in any way other than "it's fascist tech, i'm putting a filter in my feed!" (while it IS fascist tech, because it's in the hands of fascists).
in my view the problem is not what LLMs are (what kind of tech), but how they are used and what they extract from the planet when Big Tech uses them in this monstrous, harmful way. of course the line is blurry and tech can't be separated from the political, but... AI is not intelligent (Big Tech wants you to believe that), and LLMs are not capable of intelligence or learning (Big Tech wants you to believe that).
so i feel like a big chunk of the anger and hate should really be directed at the techno-oligarchs, and only partially, and much more critically, at the actual algorithms in play. it's not LLMs that are harming the planet, but the extraction: these companies, who are absolutely evil and doing whatever the hell they want, unchecked and unregulated.
or as varoufakis said to tim nguyen: "we don't want to get rid of your tech or company (google). we want to socialize your company in order to use it more productively" and, if i may add, safely and beneficially, for everyone and not just a few.
@prinlu @FediThing @pluralistic @tante I agree with most of what's been said in this thread, but on a very practical level, I'm curious what training data was used for the model behind @pluralistic's typo-checking ollama?
for me, that training data is key here. was its use in training consensually permitted?
because as I understand it, LLMs need vast amounts of training data, and I'm just not sure how you would get access to such data consensually. would love to be enlightened about this :)
-
@pluralistic @simonzerafa @tante
I'll reiterate my response. When you *alone* do it... no big deal.
When a couple of million do it ON THEIR OWN LAPTOPS... problem.
@clintruin @simonzerafa @tante
OK, sorry, I was under the impression that I was having a discussion with someone who understands this issue.
You are completely, empirically, technically wrong.
Checking the punctuation on a document on your laptop uses less electricity than watching a YouTube video.
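For scale, here is a rough back-of-envelope version of that claim. Every constant below is an assumption drawn from typical published figures, not a measurement of anyone's actual setup:

```
# Back-of-envelope: local LLM punctuation check vs. an hour of video.
# Every constant is an assumption, not a measurement.
LAPTOP_DRAW_W = 60        # assumed laptop draw under LLM load (watts)
CHECK_SECONDS = 60        # assumed time to check one document
STREAM_DEVICE_W = 30      # assumed laptop draw while streaming (watts)
STREAM_NETWORK_WH = 20    # assumed network + datacenter energy per hour of video

llm_check_wh = LAPTOP_DRAW_W * CHECK_SECONDS / 3600
video_hour_wh = STREAM_DEVICE_W + STREAM_NETWORK_WH

print(f"one local punctuation check: ~{llm_check_wh:.1f} Wh")
print(f"one hour of streamed video:  ~{video_hour_wh} Wh")
print(f"the video costs ~{video_hour_wh / llm_check_wh:.0f}x more")
```

Under those assumptions the local check comes out around fifty times cheaper. The constants can be argued, but the gap is wide enough that reasonable substitutions don't flip the ordering.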
-
Yesterday Cory Doctorow argued that refusal to use LLMs was mere "neoliberal purity culture". I think his argument is a strawman, doesn't align with his own actions and delegitimizes important political actions we need to make in order to build a better cyberphysical world.
EDIT: Discussions under this are fine, but I do not want this to turn into an ad hominem attack on Cory. Be fucking respectful
https://tante.cc/2026/02/20/acting-ethical-in-an-imperfect-world/
@tante
while we're pointing out logical inconsistencies... there is zero reason to stop masking in an ongoing pandemic, especially as someone who previously acknowledged the benefits
nothing has changed to make this a rational choice, and it can't be said to be in solidarity with disabled people (or folks in general)
-
@tante People like Cory who mock others for their disabilities are not worth paying attention to.
-
@bazkie @prinlu @FediThing @tante
I do not accept the premise that scraping for training data is unethical (leaving aside questions of overloading others' servers).
This is how every search engine works. It's how computational linguistics works. It's how the Internet Archive works.
Making transient copies of other people's work to perform mathematical analysis on them isn't just acceptable, it's an unalloyed good and should be encouraged:
https://pluralistic.net/2023/09/17/how-to-think-about-scraping/
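As a concrete illustration of "a transient copy plus mathematical analysis", here is a minimal sketch. The URL is a placeholder, and a polite scraper would also honor robots.txt and rate limits:

```
# Minimal sketch: fetch one page (a transient copy), compute word
# frequencies (the "mathematical analysis"), keep only the statistics.
import re
import urllib.request
from collections import Counter

url = "https://example.com/some-post"   # hypothetical page
with urllib.request.urlopen(url) as resp:
    html = resp.read().decode("utf-8", errors="replace")

text = re.sub(r"<[^>]+>", " ", html)    # crude tag stripping
words = re.findall(r"[a-z']+", text.lower())
freqs = Counter(words)

# The copy is discarded when `html` goes out of scope;
# only the aggregate statistics survive.
print(freqs.most_common(10))
```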
-
@reflex @shiri @pluralistic @tante
Seems like Cory's local punctuation and grammar checker is such an example, no?
@mastodonmigration
it's the "copyright" issue, the outlook that unless everyone who posted anything that was used receives a check for a hefty sum then it's unethical.Copyright is in quotes because it's not really a violation of copyright (the LLMs are not producing whole copies of copywritten materials without basically being forced) nor is it a violation of the intent of copyright (people are confused, copyright was never intended to give artists total control, it's just to ensure new art continues to be created).
-
@tante
I partly agree with Cory and partly not.
Refusing to use resource-gobbling, datacenter-hosted LLMs makes perfect sense. I'd just as soon heat my house by burning kittens. It is also a rational political statement.

Refusing to use an LLM hosted on my own iron is also a political statement, as well as a personal choice. I don't give a hoot about ideological purity; I just distrust clankers, and don't want to get into the habit of depending on them. (Besides, they offer me nothing I cannot as easily do for myself.)
-
@pluralistic I don't think mink fur or LLMs are comparable to criticizing the origins of the internet or transistors. It's the process that produced mink fur and LLMs that is destructive, not merely that they're made by bad people.
For example, LLM crawlers regularly take down independent websites like Codeberg, DDoSing them and threatening the small web. You may say "but my LLM is frozen in time, it's not part of that scraping now", but it would not remain useful without updates.
@skyfaller@jawns.club @pluralistic@mamot.fr @FediThing@social.chinwag.org @tante@tldr.nettime.org This is precisely it; it's about the process, not their distance from Altman, Amodei, et al. (which the Ollama project and those like it achieve).
The LLM models themselves are, per this analogy, still almost entirely of the mink-corpse variety, and I think it's a stretch to scream "purity!" at everyone giving you the stink eye for the coat you're wearing.
It's not impossible to have and use a model, locally hosted and energy-efficient, that wasn't directly birthed by mass theft and human abuse (or training directly off of models that were). And having models that aren't, that are genuinely open, is great! That's how the wickedness gets purged and the underlying tech gets liberated.
Maybe your coat is indeed synthetic; that much is still unclear, because so far all the arguing has focused on the store you got it from and the monsters that operate the worst outlets.
-
@pluralistic @simonzerafa @tante
Fair enough, Cory. You're gonna do what you want regardless of my accuracy or inaccuracy anyway. And maybe I've misunderstood this. The same way many, many others will.
But visualize this:
"Hey...I just read Cory Doctrow uses an LLM to check his writing."
"Really?"
"Yeah, it's true."
"Cool, maybe what I've read about ChatGPT is wrong too..." -
I'd actually take this a step further and say that technologies ARE social arrangements.
@lrhodes I agree, I believe that we do encode our values into our technology. Particularly with what we code and what we use to code or write.
-
@pluralistic @prinlu @FediThing @tante I think the difference from search engines is how an LLM reproduces the training data...
as a thought experiment: what if I scraped all your blogposts, then started a blog that makes Cory Doctorow-styled blogposts, and it ended up more popular than your OG blog because I throw billions in marketing money at it?
would you find that ethical? would you find it acceptable?
a further thought experiment: let's say you lose most of your income as a result and have to stop blogging and start flipping burgers at McDonald's.
your blog would stop existing, and so, eventually, would my copycat blog - or at least, it would stop producing novel blogposts.
this kind of effect is real and will very much hinder cultural development, if not grind it to a halt.
that is a problem - this is culturally unsustainable.
-
@clintruin @simonzerafa @tante
This is an absurd argument.
"I just read about a thing that is fine, but I wasn't paying close attention, so maybe something bad is good?"
Come.
On.
-
@tante Since I assume all the #Epstein documents have been scraped into all the LLM models by now, I'd love to see an example of LLM tech being used for good.
Show me the list of Epstein co-conspirators.
Show me names of who helped them escape accountability, and how they did it.
Show me who raped children. Their names, addresses, passport photos.
Then I will believe LLMs and "AI" have delivered a benefit.
-
@pluralistic @simonzerafa @tante
Maybe...
Maybe not. You have a good day.
-
@FediThing @bazkie @prinlu @tante
There are tons of private search engines, indices, and analysis projects that don't direct traffic to other works.
I could scrape the web for a compilation of "websites no one should visit, ever." That's not "labor theft."
-
> I am not clear on how this connects to discussing origins of technologies
Because the arguments against running an LLM on your own computer boil down to, "The LLM was made by bad people, or in bad ways."
This is a purity culture standard, a "fruit of the poisoned tree" argument, and while it is often dressed up in objectivity ("I don't use the fruit of the poisoned tree"), it is just special pleading ("the fruits of the poisoned tree that I use don't count, because __").
-
@correl @skyfaller @FediThing @tante
More fruit of the poisoned tree.
"This isn't bad, but it has bad things in its origin. The things I use *also* have bad things in their origin, but that's OK, because those bad things are different because [reasons]."
This is the inevitable, pointless dead-end of purity culture.
-
@onepict Yeah, code is a pretty literal manifestation of that principle, right?
And one of the major advantages of AI from an ideological point of view is that it allows the provider to write their values into *other people's code*.
-
@pluralistic @FediThing @tante
What's the difference between your argument here and "Slavery is OK because I didn't kidnap the slaves; I just inherited them from my dad." ??
Because there are no slaves in this instance. Because no one is being harmed or asked to do any work, or being deprived of anything, or adversely affected in *any articulable way*.
But yeah, in every other regard, this is exactly the same as enslaving people.
Sure.
-
@clintruin @simonzerafa @tante
You are laboring under a misapprehension.
I will reiterate my question, with all caps for emphasis.
Which "couple million people" suffer harm when I run a model ON MY LAPTOP?
@pluralistic @clintruin @simonzerafa @tante
Which "couple million people" suffer harm when I run a model ON MY LAPTOP?
Anyone who's hosting a website and getting hammered by the bots that scrape content to train the models on. We are the ones who keep getting hurt.
Whether you run it locally or not makes little difference. The models were trained, training very likely involved scraping, and the scraping continues to be a problem to this day. Not because of ethical concerns, but technical ones: a constant 100 req/sec, 24/7, with waves of over 2.5k req/sec, may not sound like much in this day and age, but at around 2.5k req/sec (sustained for about a week!) my cheap VPS's two vCPUs are bogged down just dealing with the TLS handshakes, let alone serving anything.
That is a cost many seem to forget. It costs bandwidth, CPU, and human effort to keep things online under the crawler DDoS - and often cold, hard cash too.
Ask Codeberg or LWN how they fare under crawler load, and imagine someone who just wants to have their stuff online having to deal with similar abuse.
That is the suffering you enable when using any LLM, even a local one.
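To put those numbers in perspective, a quick calculation using the request rates from the post above; the average response size is an assumption:

```
# Scale of the crawler load described above. Request rates come from
# the post; the average response size is an assumption.
BASELINE_RPS = 100       # constant background load, 24/7
WAVE_RPS = 2500          # peak waves
WAVE_DAYS = 7            # "sustained for about a week"
AVG_RESPONSE_KB = 50     # assumed average response size

DAY = 86400
baseline_per_day = BASELINE_RPS * DAY
wave_requests = WAVE_RPS * DAY * WAVE_DAYS
wave_tb = wave_requests * AVG_RESPONSE_KB / 1e9  # KB -> TB

print(f"baseline: {baseline_per_day:,} requests/day")
print(f"one week-long wave: {wave_requests:,} requests, ~{wave_tb:.0f} TB served")
```

Even before counting TLS handshake CPU, a week-long wave at those rates works out to around 1.5 billion requests, and on the assumed response size tens of terabytes served; that is a bandwidth bill few hobbyist budgets absorb.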