Spent the day talking to works council members about "AI".
-
@glyph@mastodon.social @tante@tldr.nettime.org this is the thing that drives me a little batty: "AI", or (mis)applied statistics, is just... well, statistics. And all these "AI experts" never even try to use any sort of metric, much less a statistically rigorous method, to gauge if the damn thing works or not...
@aud @tante @glyph well they do have metrics, it's just that they're generally ad-hoc and terrible metrics
and even when they aren't, Goodhart's Law ensures that relying on them turns the exercise into farce relatively soon.
arguably that kind of farce is the entire history of the false spring: "simply scale it up" worked surprisingly well, then worked surprisingly well again, and therefore we can extrapolate that it will work forever and [financial irresponsibility] and oops now it's not working anymore oh shit oh fuck uhhhh AGENTS, we're doing agents now! Yea, that's the ticket. (and so on)
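[Editor's note: a toy sketch of the Goodhart's Law point above, with entirely made-up numbers and a hypothetical "keyword hits" proxy — not any real evaluation. Once a proxy metric becomes the optimization target, maximizing it stops tracking the quality it was meant to measure.]

```python
def true_quality(answer_len, keyword_hits):
    # What we pretend actually matters: relevance, penalized for bloat.
    return keyword_hits - 0.01 * answer_len

def proxy_metric(answer_len, keyword_hits):
    # The ad-hoc metric someone chose to report: keyword matches only.
    return keyword_hits

# Candidate outputs as (answer_len, keyword_hits). The 500-word answer
# games the proxy by stuffing keywords into a bloated response.
candidates = [(50, 2), (80, 3), (500, 5)]

best_by_proxy = max(candidates, key=lambda c: proxy_metric(*c))
best_by_truth = max(candidates, key=lambda c: true_quality(*c))

print(best_by_proxy)  # (500, 5): the proxy rewards the bloated answer
print(best_by_truth)  # (80, 3): the actually better answer loses the benchmark
```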
-
Spent the day talking to works council members about "AI". And it's kinda wild hearing their stories from the field: management is 100% in "AI can do everything" fantasy land and makes huge plans for how to use "AI" to cut workers, while real projects that supposedly can do 50% of a specific task end up being able to do 8%. And they still go live. It's fucking bonkers. CEOs are really not okay.
@tante unfortunately and increasingly, management is most interested in whatever looks good in PowerPoint rather than in how their product performs in the real world.
-
@aud @tante @glyph the addition of "vision heads" has always been the brightest example of this to me, and came sooner than the craze for "agents".
They ran out of runway to scale up on text alone but clearly adding more parameters was the thing that needed doing. Bolting an entire vision system to the side of the model sure does add a lot of parameters and keeps you on the curve of projected growth.
It doesn't really solve any problems in a way that might generate revenue, but it demos quite well and a good demo is all you've ever really needed to separate tech speculators from their cash, *particularly* the ones gambling on "AI" at any point in tech history.
-
@SnoopJ@hachyderm.io @tante@tldr.nettime.org @glyph@mastodon.social ah, I meant for the boosters who are "seeing huge gains"; it's always anecdotal and then any outside measurements of it contradict said anecdotal claims...
but also, yes, what you just said, x1000. Even the earlier "measurements" were horseshit: "we tested this by making it generate answers {for an extremely well-documented standardized test whose answers appear many times in the training corpus} and it got a grade of 45%!" which they claim is good, except that's actually a failing grade, which they never seem to mention...
-
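[Editor's note: a crude sketch of the contamination problem described above — if benchmark questions appear near-verbatim in the training corpus, a score measures memorization, not capability. The whitespace tokenization, 8-gram threshold, and all data here are hypothetical illustrations, not any real leak-detection pipeline.]

```python
def ngrams(text, n=8):
    # Split on whitespace and collect all contiguous n-word sequences.
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def contamination_rate(test_items, corpus, n=8):
    # Flag any test item sharing at least one n-gram with the corpus.
    corpus_grams = ngrams(corpus, n)
    flagged = sum(1 for item in test_items if ngrams(item, n) & corpus_grams)
    return flagged / len(test_items)

corpus = ("question 42 from the 2019 exam what is the capital of france "
          "the answer is paris as everyone knows plus lots of other text")
test_items = [
    "question 42 from the 2019 exam what is the capital of france",  # leaked
    "a genuinely novel question nobody has asked before in this form here",
]
print(contamination_rate(test_items, corpus))  # 0.5: half the test set leaked
```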
But it was super fun to lead them through a "this is how you can force reasonable evaluation on 'AI' projects which kills most of them" framework and see how they felt empowered and able to actually do their job again.
@tante do you have a link to that framework?
Also: https://labornotes.org/2026/03/four-union-strategies-fight-ai
-
@tante yeah it's a real "YOU HAD ONE JOB" situation
-
@emma haven't formalized it fully so it's not written up anywhere. It's in my head and a few phrases on slides right now.
-
@SnoopJ@hachyderm.io @tante@tldr.nettime.org @glyph@mastodon.social and now we have "so many models to choose from", so we get to play double extra bonus round roulette! Don't just vary your prompts, change models! Infinite combinatorics! You'll never run out of parameters to fiddle with! Burn those tokens, burn em good!
-
@otherdog @tante I guess I'll drop the link again just for reference, in case you haven't seen it. I didn't do so above because I feel like I post this every single day now, to the point where the self-promotion feels shameful. But it remains painfully, almost nauseatingly relevant, so here you go: https://blog.glyph.im/2025/08/futzing-fraction.html
-
cwebber@social.coop shared this topic