Speaking of AI. We're in a tight spot. "We", as in, tech workers, I mean.
A bunch of things are true and they all pull in different directions.
- It works. I'm not really willing to entertain the argument that it doesn't anymore, after having built several large and small projects with it. For coding, it is a lever that can provide a dramatic productivity increase. I'm comfortable saying 3x+, overall. Maximalists are saying 10x or 100x. Even if it's only 3x, that's industry-shaking.
1/?
-
Oh, also, I have skin in the game. I'm not just randomly dismissing the ethical concerns, I'm right in the middle of them.
A book I wrote was among those pirated by Anthropic. I'm getting ~$1500 (and my publisher is getting the other ~$1500) from the settlement. And, since I have a bunch of code in Open Source projects spanning decades, I'm sure my code is also in the training data for all of them.
I'm not ecstatic about it. But, it's where we are and I don't imagine I can do much about it.
-
@swelljoe I learned a weird thing about my stuff and Anthropic https://berryvilleiml.com/2025/12/05/the-anthropic-copyright-settlement-is-telling/
-
@swelljoe As a non-user of AI (lucky), my impression was that in the areas it works well in -- repetitive codebases that resemble ones in the training dataset -- the productivity increase also incurs technical debt at a rate higher than if you'd gotten some junior coders to do it; is that wrong?
-
@noplasticshower yeah, I'm confident that many, if not all, of the major models have ingested the thousands of posts I've made to the forum I maintain for the OSS projects I work on. I feel ambivalent about that. On one hand, if someone is asking ChatGPT for help with my software, I'd rather it give reasonable answers than dangerous ones.
But it also sucks that the way they do that kind of thing involves periodically DDoSing my websites and blatantly disregarding copyright or licenses.
-
@clayote it is quite wrong, as of October of last year, when the current crop of models arrived. With Opus 4.5, Codex 5.2, and Gemini 3, used in an agentic context (e.g. Claude Code), they're not limited to simple/repetitive code or code that is prominent in the training data.
The training data is "the entire internet and all of public GitHub", so it knows every language, every framework. Yeah, it's better at simple CRUD apps in TypeScript, but it also kicks my ass in my best languages.
-
@clayote I mean, there are still problems it can't solve, but that set is much smaller than you would think if you last looked at it seriously any time up until a few months ago. The models now can search the web, instrument software so they can test without human intervention, and plan quite large/complicated projects for implementation across several context windows.
When driven by an expert, there is very little it can't do, and it does it all very, very rapidly.
-
@swelljoe That's interesting considering that if I'm not mistaken (based on your work on Webmin/Virtualmin), one of your best languages is Perl. I never seriously got into Perl, but it has a reputation for being quite expressive. So in theory, you should be able to express what you want directly in the language. It feels wrong that giving instructions to an LLM, in ambiguous natural language, and having it grind away, is kicking your ass even in a language like Perl. Like a failure of PL design.
-
@swelljoe yup. But if you use it as a tool to assist you, you can assist yourself. I found that out yesterday.
https://berryvilleiml.com/2026/02/18/using-gemini-in-the-silver-bullet-reboot/
-
@matt "quantity has a quality all its own". Maybe I can write better code, given sufficient time. I can certainly write more concise code (especially in Perl).
But, the models write code an order of magnitude faster than I can, and they can write code 24/7. And, honestly, it's pretty good code, most of the time.
It's still true that the hardest part is deciding what to make rather than making it, but writing software with the AI is now drastically easier than doing it all myself.
-
@matt "quantity has a quality all its own". Maybe I can write better code, given sufficient time. I can certainly write more concise code (especially in Perl).
But, the models write code an order of magnitude faster than I can, and they can write code 24/7. And, honestly, it's pretty good code, most of the time.
It's still true that the hardest part is deciding what to make rather than making it, but it's drastically easier to write software now with the AI than doing it myself.
@swelljoe What scares me is the thought of having to *review* all that code (not yours specifically, just in general as usage of coding agents ramps up). Given that LLMs can write code faster than we can, they can certainly write it way faster than I can read it.
-
@matt that's the other thing I feel uneasy about. You can't realistically review it all. At least not in the sense we usually mean by reviewing code, not if you want the velocity gains of using AI.
You can insist on extensive static analysis and 100% unit test coverage; it never complains about busy work. You can let another AI review the code. I've been doing Copilot code review when checking in code, which also costs some velocity, and it rarely catches real bugs; more often it flags misunderstandings.
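To make that a bit more concrete, here's a minimal sketch (not my actual setup; ruff, pytest, and the 100% threshold are placeholder choices) of the kind of machine-checkable gate an agent can be pointed at, where a non-zero exit is the signal to keep working:

```python
#!/usr/bin/env python3
"""Sketch of a "success criteria" gate: static analysis plus a coverage floor.

Assumptions: ruff, pytest, and pytest-cov are installed, and the code under
test lives in src/. Swap in whatever linter/test runner a project really uses.
"""
import subprocess
import sys

CHECKS = [
    ["ruff", "check", "."],  # static analysis / lint
    ["pytest", "-q", "--cov=src", "--cov-fail-under=100"],  # tests + coverage floor
]


def main() -> int:
    for cmd in CHECKS:
        print("running:", " ".join(cmd))
        result = subprocess.run(cmd)
        if result.returncode != 0:
            # A non-zero exit is what the agent loops on until everything passes.
            return result.returncode
    print("all checks passed")
    return 0


if __name__ == "__main__":
    sys.exit(main())
```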
-
@matt there's a theory that very strict languages (e.g. Rust) are a great fit for AI, because the AI doesn't mind fighting the borrow checker, and the strictness of the language protects against many classes of bug.
I think lack of verification (as in "trust but verify") is why a lot of folks have bad experiences with it (I mean, even after the crossover point where the models and agents got really good). If you give a model clear success criteria, it'll hammer on the problem until those criteria are met.
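As a rough sketch of what "clear success criteria" can look like (the apply_discount function here is hypothetical, with a naive version included only so the file runs on its own): write the acceptance checks as executable tests up front, so "done" means they pass rather than the diff merely looking plausible.

```python
# Sketch: encode the success criteria as executable tests before handing the
# task to an agent. apply_discount is a hypothetical stand-in for whatever
# behaviour you actually care about; the naive implementation below exists
# only so this file is self-contained and runnable with pytest.
import pytest


def apply_discount(total: float, discount: float) -> float:
    """Hypothetical function the agent would be asked to implement or fix."""
    if discount < 0:
        raise ValueError("discount cannot be negative")
    return max(total - discount, 0.0)


def test_discount_never_goes_below_zero():
    # Success criterion: no change may produce a negative invoice amount.
    assert apply_discount(total=50.00, discount=80.00) == 0.00


def test_negative_discount_is_rejected():
    # Success criterion: bad input fails loudly instead of silently passing.
    with pytest.raises(ValueError):
        apply_discount(total=50.00, discount=-5.00)
```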
-
@swelljoe If the LLM misunderstands that often when doing code review, then it could also misunderstand in the direction of letting legitimate bugs through, right?
Sounds to me like we're lowering the bar on quality because business people, in response to what AI boosters are selling, are demanding that we pump out more and more, faster and faster.
I mean, do what you have to do to hold onto your job, but I think I'll keep resisting as long as I can.
-
@matt I didn't say it was often. It's just more often than it finds real, serious bugs. It's mostly things like "it assumed I wanted to keep backward compatibility with an old API endpoint, even though there are no consumers of the old API and killing old code is actually a benefit".
I'm not convinced it's writing more bugs than I would have written in the same amount of code or that my bugs are more likely to be caught in review. I don't write 100% test coverage for any of my projects...