LLMs are spam generators. That is all.
They're designed to generate plausibly human-like text well enough to pass a generic Turing Test. That's why people believe they're "intelligent".
But really, all they are is spam generators.
We have hit the spamularity.
-
@cstross LLMs are replacing every aspect of civilization - institutions, process knowledge, organizational competence - with civilization fan-fiction.
@cstross Because so many people boosted my post, I want to explore this topic in more detail.
First, a thought experiment: suppose we had cheaply scalable mind control, even if only partially effective (LLMs are not mind control, but they touch some of the same wires). Democracy would end, since the controllers could sway elections at will. The so-called free market would instantly become a command economy. Science, and any other activity requiring independent thought, would be a dead letter.
-
@cstross In the book _Rainbows End_, Vinge proposes a form of mind control called "YGBM" technology, which stands for "You Gotta Believe Me", a way to get affected individuals to accept arbitrary propositions as truth.
What this has in common with LLMs is technologically enhanced plausibility.
-
@talin I was going to mention "Rainbows End"!
-
@cstross Now, we know from the earliest days of machine learning that algorithms are capable of exploiting any loophole in their fitness function with seemingly godlike competence.
Unfortunately, we can't make a fitness function for intelligence, truthfulness, or integrity. What we can do, at great expense, is make a fitness function for plausibility.
When an LLM goes through its reinforcement learning, its behavior is rewarded based on whether some human reviewer believed the result.
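A toy sketch of that reward loop (every name and number here is invented for illustration; no real lab's RLHF pipeline looks like this). The "reviewer" scores how believable a candidate answer sounds, with no access to the truth, and the "training" drifts toward whatever scores highest:
```python
# Toy illustration: the training signal is a reviewer's *belief*, not truth.
import random

CANDIDATES = [
    "the study found no measurable effect",  # true but dull
    "a landmark study proved a 20% boost",   # false but confident
    "results were mixed; see the appendix",  # honest hedging
]

def reviewer_reward(text: str) -> float:
    """Stand-in for a human rater: scores how believable the text sounds,
    with no access to whether the claim is actually true."""
    score = 0.0
    if "proved" in text or "landmark" in text:
        score += 1.0   # confidence reads as plausibility
    if "mixed" in text or "appendix" in text:
        score -= 0.5   # hedging reads as weakness
    return score

# "Training": preference weights drift toward whatever the reviewer believes.
weights = [1.0] * len(CANDIDATES)
for _ in range(2000):
    i = random.choices(range(len(CANDIDATES)), weights=weights)[0]
    weights[i] = max(weights[i] + 0.1 * reviewer_reward(CANDIDATES[i]), 0.01)

# Almost always prints the confident falsehood: the believable answer wins.
print(CANDIDATES[weights.index(max(weights))])
```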
-
@cstross As LLMs improve, they will get better at chasing their fitness function, that is, at convincing people that what they say is true. This is what LLMs have in common with YGBM.
Those who know history know that the invention of cheap printing sparked off centuries of bloody religious conflict. In this regard, LLMs are like "hold my beer".
-
Yet Saudi Arabia alone is laundering $100 billion of its looted national treasury through spam generation?
Perhaps 10% is going to the spam-making software; the rest is being spent on a fossil-fuel-funded fascist movement, complete with civil rights erosions, state surveillance platforms, goon squads, concentration camps, and financial fraud.
We continue to underestimate how badly the fossil fuel industry wants permanent rule.
https://www.bloomberg.com/news/articles/2025-09-16/ai-deals-saudi-arabia-eyes-artificial-intelligence-partnership-with-pe-firms
https://www.semafor.com/article/11/07/2025/uae-says-its-invested-148b-in-ai-since-2024
@Npars01 Yep, 💯
-
@Npars01 @cstross
Spam-making software assists the Nazis too.
"The ideal subject of totalitarian rule is not the convinced Nazi or the convinced Communist, but people for whom the distinction between fact and fiction (i.e., the reality of experience) and the distinction between true and false (i.e., the standards of thought) no longer exist." ~ Hannah Arendt
-
@cstross Well, they are statistical models of language. To be precise, statistical models of the corpus they were trained on, shaped by a number of other choices: the representation, what cleanups were applied, and so on.
Yes, one way to use these models, the one that seems to fascinate most people (but IMHO not necessarily the most useful, depending upon what one wants to achieve), is to complete a prefix with a statistically plausible ending.
Generally, they are a compelling NLP technology.
But yes, if you generate text with an LLM, you generally get a "probable" text, given the training data and the hyperparameters. And let's be honest: people who rely upon an LLM's training data as a "search engine data corpus" have a problem.
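A toy sketch of "complete a prefix with a statistically plausible ending": a word-bigram model, which is to a transformer LLM what a paper plane is to a jet, but the statistical flavour is the same:
```python
# Toy bigram "language model": count word pairs in a tiny corpus, then
# complete a prefix by repeatedly sampling a frequency-weighted next word.
import random
from collections import Counter, defaultdict

corpus = ("the cat sat on the mat . the dog sat on the rug . "
          "the cat chased the dog .").split()

counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def complete(prefix: str, n: int = 5) -> str:
    """Extend the prefix with up to n statistically plausible words."""
    words = prefix.split()
    for _ in range(n):
        followers = counts.get(words[-1])
        if not followers:
            break
        words.append(random.choices(list(followers),
                                     weights=followers.values())[0])
    return " ".join(words)

print(complete("the cat"))  # e.g. "the cat sat on the mat ."
```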
🙅
-
@VictimOfSimony @ViennaMike@mastodon.social If I was suddenly planetary overlord, I'd ban the advertising industry. Seriously, it's cultural poison. Doesn't matter whether it's cheap spam or brand marketing for YSL, it's garbage all round, designed to manipulate us into doing things we wouldn't otherwise do (for someone else's profit).
As a #PublicPolicy wonk I'm already remembering how different our common law legal systems have become in just three centuries. Over here they're still worried about the anti-porn #Censorship case where the #SupremeCourt refused to define what they were censoring. They famously said, "I know it when I see it," grumble grumble prurient use in commerce, harrumph typical opinion of ordinary firmness, followed by thirty years of chipping off corners without admitting they were sort of making shit up.
-
@davidgerard @cstross The article you posted included a very useful study. However, contrary to the article, that study is NOT the only study on the topic. Other, far larger, more comprehensive, studies show measured, meaningful productivity improvements. You asked for measurements, so here they are: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4945566 and https://www.youtube.com/watch?v=tbDDYKRFjhk The Stanford study found that the average productivity boost is significant (~20%) but some teams see productivity decrease.
-
@cstross I understand what you mean, but I’d say they’re simply databases. Spam generator is one of their many uses.
-
@ViennaMike @cstross your paper's key metric is pull requests, which is addressed in the link you answered but didn't read (one jira leaves, three jiras enter). but thanks anyway.
(over and over i see this from AI guys, rebuttals that ignore the question asked)
-
@cstross Kessler Syndrome for the internet
-
@ThePowerNap @dragonfrog @cstross @ViennaMike I suspect that it'll be good for the ML space -- the conflation of LLM prompting with the harder skills required to actually train a model has been disastrous for the market. The industry isn't sophisticated enough for people to distinguish between the two skillsets, so the latter market has just been absolutely deleted in Australia.
@ludicity @dragonfrog @cstross @ViennaMike
I'm kinda betting on this, but it doesn't mean the next couple of years won't be painful.
-
@ThePowerNap @dragonfrog @cstross @ViennaMike The best purpose of an LLM system is to provide a natural language interface to a complex but not mission-critical system such as home automation, media libraries, and switchboards.
Machine learning, on the other hand, should lean into seeing without eyes (is a dramatic jump in intensity cancer?).
Both of these cases should be powered through a single IEC C13 socket, not some GW data centre.
@NefariousCelt @dragonfrog @cstross @ViennaMike
95% agree. I'm currently working on an LLM project that is interfacing with something a touch more than home automation. Aside from fine-tuning the model and putting constraints on the output (think https://github.com/dottxt-ai/outlines), you can use existing approved software as validation of LLM output.
All of this requires in-depth knowledge, patience and exhaustive testing.
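A minimal sketch of that "approved software as validation" pattern (all names here are hypothetical, and this is not the outlines API, just the general shape): the model only proposes an action, and deterministic, already-trusted code decides whether it runs.
```python
# Hypothetical sketch: the LLM may only *propose* an action; a deterministic,
# already-approved validator decides whether it is executed.
import json

ALLOWED_DEVICES = {"living_room_lamp", "thermostat"}
ALLOWED_ACTIONS = {"on", "off", "set"}

def validate(proposal: str) -> dict:
    """Parse a model-proposed action and check it against fixed rules."""
    action = json.loads(proposal)  # malformed output fails here
    if action.get("device") not in ALLOWED_DEVICES:
        raise ValueError(f"unknown device: {action.get('device')!r}")
    if action.get("action") not in ALLOWED_ACTIONS:
        raise ValueError(f"disallowed action: {action.get('action')!r}")
    if action["action"] == "set" and not 5 <= action.get("value", -1) <= 30:
        raise ValueError("setpoint outside the safe range")
    return action

# llm_output stands in for whatever the (constrained) model returned.
llm_output = '{"device": "thermostat", "action": "set", "value": 21}'
print(validate(llm_output))  # only a validated action would be executed
```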
-
@ThePowerNap @dragonfrog @cstross @ViennaMike I suspect I am coming from a very risk-averse standpoint. The "summarise this" use case is what scares me in LLM marketing. I have been burned by poor software requirement specifications, and language simplification can do that in multiple sectors. But yeah, mapping natural language to a fixed set of goals, that's fine. Trust AI with my life? No. Sudden random steering inputs, WTAF Stellantis.
-
@talin @cstross I don't think there was ever a time in human history when critical thinkers were more than a small minority of the population. LLMs are unprecedented in the scale at which they can produce misinformation. Who knows? Exposed to so much bullshit, people may become more critical of what they see. It's already happening with images and videos, right?
-
@dragonfrog @ThePowerNap @cstross @ViennaMike I have occasionally been using an AI tool called Phind. It was basically a "help with coding" tool. It recently relaunched itself as a generic AI tool… and then closed a month later.
I can't help but wonder… is this the start of the great collapse?
@leadegroot @dragonfrog @ThePowerNap
I can certainly see a bubble bursting, but I'm not sure that Phind is a sign. It's hard to compete with those who have billions to develop core models, and you're not going to hit any profit home runs reselling access to others' models.
-
@cstross
The Turing Test will not fail an LLM for the same reason humans are ready to accept them as persons: humans evolved with a hardcoded tendency to perceive any language-using process as a cognitive entity and to build a mental model of it. Rejecting that behavior takes a metacognitive ability that has not been taught to most people, so the problem isn't rampant stupidity (look at how many clearly intelligent AI researchers fail this test) but rather ignorance (sometimes wilful, granted).