Piero Bosio Social Web Site Personale

Social forum federated with the rest of the world. Instances don't count, people do.

Language models cannot reliably distinguish belief from knowledge and fact

gert@poliversity.it

Language models cannot reliably distinguish belief from knowledge and fact
Abstract
-----------
«As language models (LMs) increasingly infiltrate into high-stakes domains such as law, medicine, journalism and science, their ability to distinguish belief from knowledge, and fact from fiction, becomes imperative. Failure to make such distinctions can mislead diagnoses, distort judicial judgments and amplify misinformation. Here we evaluate 24 cutting-edge LMs using a new KaBLE benchmark of 13,000 questions across 13 epistemic tasks. Our findings reveal crucial limitations. In particular, all models tested systematically fail to acknowledge first-person false beliefs, with GPT-4o dropping from 98.2% to 64.4% accuracy and DeepSeek R1 plummeting from over 90% to 14.4%. Further, models process third-person false beliefs with substantially higher accuracy (95% for newer models; 79% for older ones) than first-person false beliefs (62.6% for newer; 52.5% for older), revealing a troubling attribution bias. We also find that, while recent models show competence in recursive knowledge tasks, they still rely on inconsistent reasoning strategies, suggesting superficial pattern matching rather than robust epistemic understanding. Most models lack a robust understanding of the factive nature of knowledge, that knowledge inherently requires truth. These limitations necessitate urgent improvements before deploying LMs in high-stakes domains where epistemic distinctions are crucial.»
#ai #LLMs #epistemology #knowledge
https://www.nature.com/articles/s42256-025-01113-8
