This is one of the worst takes from LLM enthusiasts.
-
RE: https://mastodon.social/@stroughtonsmith/116030136026775832
This is one of the worst takes from LLM enthusiasts.
Compilers are deterministic, extremely well tested, made out of incredibly detailed specifications debated for months and properly formalized.
LLMs are random content generators with a whole lot of automatically trained heuristics. They can produce literally anything. Not a single person who built them can predict what the output will be for a given input.
Comparing both is a display of ignorance and dishonesty.
@arroz LLMs are a compiler in the same way that my 3-year-old with a bunch of crayons is a camera.
-
@petros Of course this doesn't mean you can't have a tool that assists you with hard and repetitive work. If someone is scanning documents from the VI century for historical preservation, having a tool that helps with identifying characters worn out by time, the several aspects of translation and interpretation, etc., might help. But that's not something that does the job by itself. The historian is the central piece of that puzzle, with the necessary knowledge and context for doing it.
@arroz In this case there are invoices and purchase orders coming as PDF, unstructured data.
Currently there is OCR software and manual data entry. Both make mistakes, so there is always "double keying". If the result is the same, it is considered right. Otherwise it goes to review.
Now there are 2 LLMs that do the "keying" job. Both get it ca. 90% right.
A difference from compilers: two compilers do not create the same machine code, so one cannot compare the two results and decide the output is right.
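A minimal sketch of the "double keying" check described above, assuming each keying (OCR, human, or LLM) yields a flat record of string fields. The field names, the `normalize` step, and the function names are hypothetical and not taken from the thread. Note that agreement only shows the two keyings match; it does not catch the case where both make the same mistake.

```python
from typing import Optional

# Hypothetical record shape: the thread does not say which fields are extracted.
Record = dict[str, str]


def normalize(record: Record) -> Record:
    """Trim and lowercase values so trivial differences do not trigger a review."""
    return {key: value.strip().lower() for key, value in record.items()}


def double_key(first: Record, second: Record) -> Optional[Record]:
    """Return the agreed record, or None if the keyings differ and need review."""
    a, b = normalize(first), normalize(second)
    return a if a == b else None


# Two keyings that agree after normalization are accepted.
keying_a = {"part": "Screw M4", "quantity": "25"}
keying_b = {"part": "screw m4 ", "quantity": "25"}
accepted = double_key(keying_a, keying_b)

# A mismatch (25 vs 250 screws) is sent to review instead.
keying_c = {"part": "screw m4", "quantity": "250"}
needs_review = double_key(keying_a, keying_c) is None
```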
-
@arroz Also, if there still is an error in one invoice and purchase order, it is usually not catastrophic. You get 250 screws instead of 25.. that happened even before we had computers. It's annoying but.. well, magic doesn't happen, sh** does ;-)
Given that we work on behalf of customers, we need to have an acceptably low error rate, of course.
-
@arroz Had a genAI-curious colleague voice this exact take last week.
I pointed out the same things you did, but honestly they're so eager to believe that I don't think they internalized the difference...
Another, koolaid-drinking colleague replied "well sometimes compilers are not deterministic!!!", as if finding a compiler bug every 15 years was the same as an LLM crapping out every prompt.
-
@petros What you need is to get rid of the PDFs and deploy an online store.
What is the failure rate of the traditional OCRs compared to the LLMs? And how modern were those OCRs? Modern OCR from the last 5 years or so has a success rate way higher than 90%. And are the failures in the OCR itself or in interpreting the context (aka knowing how to read the invoice or order, not just identifying the right characters)?
-
@arroz @stroughtonsmith
Jesus fucking Christ, these people are incompetent idiots. I'm even more glad to be out of the programming business given that these are the morons with whom I'd be interacting. Everything is going to go to shit.
@mtconleyuk @arroz @stroughtonsmith can we please go back to talking with each other instead of shouting? Please make your point without insulting somebody who made his point!
-
Vibe coded skyscrapers.
@Orb2069 @aspensmonster @zzt @arroz
Soon coming to an earthquake zone near you!
-
@arroz But why generate code at all. Just execute the prompts directly. Suits me...
-
@arroz even if LLMs were comparable, people do review the output of compilers
-
@arroz I don't have the exact numbers of "traditional" OCR but it will be around 90% as well. And, yes, you are right, the issue is not to get the letters right, it's to make it structured information. With OCR it needs templating which tells the OCR where to find an address, what to do with multiple lines and pages etc. Every new format requires that work again.
LLMs are "smarter" in that regard.
Fun fact, rookie error: sending a T&C page to an LLM. It chews on it forever..
-
@arroz @binford2k some people already understood this in 2016: https://www.commitstrip.com/en/2016/08/25/a-very-comprehensive-and-precise-spec/
-
@arroz And, yeah, why there are so many companies that send these PDFs, God knows. I worked in the automotive industry until 2015 and they still faxed orders.. And it's not Australia only, e.g. just recently we "OCRed" a big Canadian company's invoices.
-
@arroz I've had a horrible idea... Why are we building LLMs that output C, Python, etc when we could be building LLMs that produce bytecode? More efficient and completely unauditable!
-
@arroz he claims to "make apps and break things"...
-
@nils_ballmann @arroz @binford2k what one faces when doing formal verification of LLM output. However, LLMs might enable us to write larger formally verified systems in practice. LLMs could help with the spec writing and validation as well. We'll see.
LLMs are basically generators in neuro-symbolic hybrid systems. And many people like to use them for productivity. I.e. a component or tool. No reason to get emotional about it. Like humans, LLMs are unreliable but still useful.
-
@arroz well, except gcc -Ofast, obviously
Notable that dynamic code generation has fallen out of favour in database engines (select -> assembly-> machine code) with SIMD opcodes being the replacement because it's a nightmare to debug when a failure happens inside generated code
AVX512 opcodes support breakpoints and debugging if you add them through intrinsics
-
@arroz I'd actually hazard a guess that there are more assembly programmers alive today than at any time in history.
