Yesterday Cory Doctorow argued that refusal to use LLMs was mere "neoliberal purity culture".
-
@FediThing @bazkie @prinlu @tante
The argument was literally, "It's not OK to check the punctuation in *your own work* if the punctuation checker was created by examining other people's work, because performing mathematical analysis on other people's work is *per se* unethical."
@FediThing @bazkie @prinlu @tante
By this standard the OED is unethical.
-
@bazkie @prinlu @FediThing @tante
First: checking for punctuation errors and other typos *in my own work* with a model running on *my own laptop* has nothing - not one single, solitary thing - in common with your example.
Nothing.
Literally, nothing.
But second: I literally license my work for commercial republication and it is widely republished in commercial outlets without any payment or notice to me.
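For context, the use case described above - checking punctuation in your own draft with a model running locally - might look something like the following minimal sketch. It assumes a locally installed Ollama instance serving its HTTP API on localhost:11434 and an already-pulled model named "llama3"; those specifics are illustrative assumptions, not details from this thread.

```python
# Minimal sketch (illustrative only): ask a locally hosted model, via Ollama's
# HTTP API, to flag punctuation errors and typos in your own draft.
# Assumes Ollama is running on localhost:11434 and the "llama3" model is pulled.
import json
import urllib.request

def check_punctuation(draft: str, model: str = "llama3") -> str:
    prompt = (
        "List any punctuation errors or typos in the following text. "
        "Do not rewrite it; just point out the issues.\n\n" + draft
    )
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

if __name__ == "__main__":
    print(check_punctuation("Its a fine day , isnt it?"))
```

Nothing leaves the laptop in this sketch: the draft goes to a local process and only the model's notes come back.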
@pluralistic But then you consented to that, right? You are in control of that.
Also, my example IS similar - after all, it's data scraped without consent, used to create another work. The typo-checker changes your blogpost based on my training data, in the same way my copycat blog changes 'my' works based on your training data.
Sure, it's on a way different scale - deliberately, to more clearly show the principle - but it's the same thing.
-
Yesterday Cory Doctorow argued that refusal to use LLMs was mere "neoliberal purity culture". I think his argument is a straw man, doesn't align with his own actions, and delegitimizes important political actions we need to take in order to build a better cyberphysical world.
EDIT: Discussions under this are fine, but I do not want this to turn into an ad hominem attack on Cory. Be fucking respectful.
https://tante.cc/2026/02/20/acting-ethical-in-an-imperfect-world/
@tante I appreciate this post. I have gotten into similar discussions of purity culture around generative AI use (me being against using AI) and you articulate many of the feelings I have about it well.
-
@pluralistic But then you consented to that, right? You are in control of that.
Also, my example IS similar - after all, it's data scraped without consent, used to create another work. The typo-checker changes your blogpost based on my training data, in the same way my copycat blog changes 'my' works based on your training data.
Sure, it's on a way different scale - deliberately, to more clearly show the principle - but it's the same thing.
Should we ban the OED?
There is literally no way to study language itself without acquiring vast corpora of existing language, and no one in the history of scholarship has ever obtained permission to construct such a corpus.
-
@FediThing @bazkie @prinlu @tante
Once again, you are replying to a thread that started when someone wrote that using an LLM to check the punctuation in your own work is ethically impermissible because no one should assemble corpora of other people's works for analytical purposes under any circumstances, ever.
-
@FediThing @bazkie @prinlu @tante
The argument was literally, "It's not OK to check the punctuation in *your own work* if the punctuation checker was created by examining other people's work, because performing mathematical analysis on other people's work is *per se* unethical."
@pluralistic @FediThing @prinlu @tante I'd say "because performing [automated, mass-scale] mathematical analysis on other people's work [without their consent] [with the goal of augmenting one's own work] is *per se* unethical" - and in that case, it's a statement I would agree with.
-
@FediThing @bazkie @prinlu @tante
Once again, you are replying to a thread that started when someone wrote that using an LLM to check the punctuation in your own work is ethically impermissible because no one should assemble corpora of other people's works for analytical purposes under any circumstances, ever.
@pluralistic @FediThing @prinlu @tante sure, but I'm responding here specifically to your statement that scraping for training isn't unethical per se.
-
@pluralistic @FediThing @prinlu @tante I'd say "because performing [automated, mass-scale] mathematical analysis on other people's work [without their consent] [with the goal of augmenting one's own work] is *per se* unethical" - and in that case, it's a statement I would agree with.
@bazkie @FediThing @prinlu @tante
You've literally just made the case against:
* Dictionaries
* Encyclopedias
* Bibliographies
And also the entire field of computational linguistics.
If that's your position, fine, we have nothing more to say to one another because I think that's a very, very bad position.
-
@pluralistic @FediThing @prinlu @tante sure, but I'm responding here specifically to your statement that scraping for training isn't unethical per se.
@pluralistic @FediThing @prinlu @tante you keep conveniently reframing "mass automated non-consensual scraping with the goal of helping produce works" as "analytical purposes", and I find that to be in rather bad faith
-
There is no virtue in being constrained or regulated per se.
Regulation isn't a good unto itself.
Regulation that is itself good - drawn up for a good purpose, designed to be administrable, and then competently administered - is good.
@pluralistic @herrLorenz @tante Of course! Agreed.
The overlap ends around *when* reasons are "good" enough. Laws about how to treat other people are relatively easy.
But until enough people see rivers on fire, regulations on *doing certain things* aren't imposed, despite many people saying "hey, this isn't good" decades prior.
Not reining in/regulating until after *foreseeable* catastrophes results in all kinds of shit shows (from the MIC, to urban sprawl, to plastics, to tax laws, etc.)
-
@bazkie @FediThing @prinlu @tante
You've literally just made the case against:
* Dictionaries
* Encyclopedias
* Bibliographies
And also the entire field of computational linguistics.
If that's your position, fine, we have nothing more to say to one another because I think that's a very, very bad position.
@pluralistic @FediThing @prinlu @tante I did not make that case, as you'd see if you'd properly read my [additions] to the statement.
Making dictionaries etc. isn't automated at mass scale the way feeding training data to LLMs is.
It's a very human job that involves a lot of expertise and takes a lot of time.
-
@Colman @FediThing @tante That's interesting. I've never wondered that about you.
@pluralistic @Colman @FediThing @tante
This is... disappointing. To be fair, I'm disappointed in almost everyone in this thread for engaging in schoolyard shit throwing, but you're much higher in status and your shit sticks. Have a conversation. Figure out where these views can commingle. Find common understanding, or you risk using your high status to fracture an already unstable alliance of people who want technology to operate safely and for the benefit of our shared humanity.
Do better.
-
Yesterday Cory Doctorow argued that refusal to use LLMs was mere "neoliberal purity culture". I think his argument is a straw man, doesn't align with his own actions, and delegitimizes important political actions we need to take in order to build a better cyberphysical world.
EDIT: Discussions under this are fine, but I do not want this to turn into an ad hominem attack on Cory. Be fucking respectful.
https://tante.cc/2026/02/20/acting-ethical-in-an-imperfect-world/
@tante the enshittification’s gone to his head, I guess
I’ll say people who go out of their way to be unethical and complain about “purity culture” when confronted about it are fucking annoying at best
no more respect for that guy
-
@correl @skyfaller @FediThing @tante
More fruit of the poisoned tree.
"This isn't bad, but it has bad things in its origin. The things I use *also* have bad things in their origin, but that's OK, because those bad things are different because [reasons]."
This is the inevitable, pointless dead-end of purity culture.
@pluralistic This seems like whataboutism. Valid criticisms can come from people who don't behave perfectly, because otherwise no one would be able to criticize anything. Similarly, we can criticize society while participating in it.
The point I'd like to make (that doesn't seem to be landing) is that LLMs aren't just made by bad people, but are also made through harmful processes. Harm dealt mostly during creation can be better than continuing harm, but still harmful.
-
Hmmmm... How about this perspective?
An LLM is just a programming technique. The ethics of using an LLM depend on the type of use and the source of the data it was trained on.
Using LLMs to search the universe for dark matter using survey telescope data or to identify drug efficacy using anonymized public health records is simply using the latest technology for a good purpose. Cory's use seems like this.
LLMs trained on stolen data to create derivative work? That's just theft.
@mastodonmigration @tante nope nope nope nope, stop with the slopologism
"Using LLMs to search the universe for dark matter using survey telescope data or to identify drug efficacy using anonymized public health records is simply using the latest technology for a good purpose."
that’s the broader field of machine learning, not LLMs.
LLMs are by and large unethical, cognitive-decline-inducing generators of soulless slop.
-
@mastodonmigration @tante nope nope nope nope, stop with the slopologism
"Using LLMs to search the universe for dark matter using survey telescope data or to identify drug efficacy using anonymized public health records is simply using the latest technology for a good purpose."
that’s the broader field of machine learning, not LLMs.
LLMs are by and large unethical, cognitive-decline-inducing generators of soulless slop.
@mastodonmigration @tante oh and for fuck’s sake, there’s no such thing as “stealing data”; copying is not theft, and a progressive individual should understand that. you have better angles to complain about slop generators from
-
@pluralistic This seems like whataboutism. Valid criticisms can come from people who don't behave perfectly, because otherwise no one would be able to criticize anything. Similarly, we can criticize society while participating in it.
The point I'd like to make (that doesn't seem to be landing) is that LLMs aren't just made by bad people, but are also made through harmful processes. Harm dealt mostly during creation can be better than continuing harm, but still harmful.
@pluralistic @correl @FediThing @tante In the climate crisis we are often concerned about "embodied emissions": things made with fossil fuels that may not use fossil fuels once they're created. If we don't change our fossil-fuel-using production systems, those embodied emissions could be enough to kill us.
I'd say that the literal and figurative embodied emissions of even local LLMs are sufficient to make them problematic to use. Individuals avoiding them is insufficient but necessary.
-
@zenkat Good point. I will admit, if we didn't live in a techbro-feudal slopworld, I probably wouldn't mind a non-consensually trained typo-checker LLM all that much.
-
@FediThing @tante This is the use-case that is under discussion.
@pluralistic @FediThing @tante you’re attempting to legitimize use of an unethical technology for something you don’t actually need a plausible-sounding-wall-of-text generator for
it goes beyond “it’s made by bad people in bad ways”. it’s a “”tool”” that actively causes cognitive decline and psychosis and sucks the soul out of everything it touches. and mind you, promoting and legitimizing it is an act of support for those bad people and their bad ways. your deflection is typical of someone with no regard for ethics
“I installed Ollama” instantly gives a person away as a techbro
- your not-so-friendly not-so-neighborhood “””liberal”””
-
@bazkie @prinlu @FediThing @tante
I do not accept the premise that scraping for training data is unethical (leaving aside questions of overloading others' servers).
This is how every search engine works. It's how computational linguistics works. It's how the Internet Archive works.
Making transient copies of other people's work to perform mathematical analysis on them isn't just acceptable, it's an unalloyed good and should be encouraged:
https://pluralistic.net/2023/09/17/how-to-think-about-scraping/
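As a concrete illustration of what "transient copies for mathematical analysis" means here, the sketch below does the kind of thing a search indexer or corpus-linguistics tool does: fetch a page, compute simple statistics over a copy held only in memory, and keep nothing but the derived numbers. The URL and function names are illustrative placeholders, not anything taken from the linked essay.

```python
# Minimal sketch (illustrative only): make a transient in-memory copy of a
# public page and perform simple mathematical analysis on it (word frequencies).
# Only the derived statistics survive; the copy is discarded when the function returns.
import re
import urllib.request
from collections import Counter

def word_frequencies(url: str, top_n: int = 10) -> list[tuple[str, int]]:
    with urllib.request.urlopen(url) as resp:
        html = resp.read().decode("utf-8", errors="replace")  # transient copy
    text = re.sub(r"<[^>]+>", " ", html)          # crude markup stripping
    words = re.findall(r"[a-z']+", text.lower())  # tokenize into lowercase words
    return Counter(words).most_common(top_n)      # keep only the statistics

if __name__ == "__main__":
    # Placeholder URL; a well-behaved tool would also respect robots.txt and rate limits.
    print(word_frequencies("https://example.com/"))
```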
@pluralistic @bazkie @prinlu @FediThing @tante Is DDoSing independent websites an unalloyed good?