Safe As 33: Is ChatGPT bullsh**ing you? How Large Language Models aim to be convincing rather than truthful

Large Language Models, like ChatGPT, have amazing capabilities. But are their responses, which aim to be convincing human text, more indicative of BS? That is, responses that are indifferent to the truth? If they are, what are the practical implications? Today’s paper is: Hicks, M. T., Humphries, J., & Slater, J. (2024). ChatGPT is bullshit. Ethics and…

Can chatbots provide more social connection than humans?

Can chatbots provide more social connection than humans? Possibly, provided that they don’t “claim too much humanity”. Three study protocols, with 801, 201 and 401 participants, had people engage with AI social chatbots. The authors note that the long-term consequences of social chatbot use are unknown, but are important to study since “hundreds of millions of people…

Enhancing AI-Assisted Group Decision Making through LLM-Powered Devil’s Advocate

Can LLMs effectively play devil’s advocate, enhancing group decisions? Something I’ve been working on lately is AI as a co-agent for cognitive diversity / requisite imagination. Here’s a study which explored an LLM as a devil’s advocate, and I’ll post another study next week on AI and red teaming. [Though this study relied on…

Endoscopist De-Skilling after Exposure to Artificial Intelligence in Colonoscopy: A Multicenter Observational Study

Does AI use contribute to de-skilling? Probably, according to this study of endoscopists. The study compared outcomes from >1.4k patients who underwent non-AI-assisted colonoscopy before and after AI implementation. Background: · A recent meta-analysis of 20 randomised trials “showed an absolute 8.1% increase in ADR [Adenoma detection rate] with the use of AI during colonoscopy…

The impact of generative AI on critical thinking skills: a systematic review, conceptual framework and future research directions

How do generative AI (GenAI) models affect critical thinking skills? This systematic review unpacked 68 studies to explore the good and the bad. GenAI models are “machine-learning algorithms, usually transformer-based large-language models (LLMs), that generate new text, code…

BEWARE OF BOTSHIT: HOW TO MANAGE THE EPISTEMIC RISKS OF GENERATIVE CHATBOTS

Really interesting discussion paper on the premise of ‘botshit’: the AI version of bullshit. I can’t do this paper justice – it’s 16 pages, so I can only cover a few extracts. Recommend reading the full paper. Tl;dr: generative chatbots predict responses rather than knowing the meaning of their responses, and hence, “produce coherent-sounding but…

How generative AI reshapes construction and built environment: The good, the bad, and the ugly

This paper discusses some of the good, bad and ugly of GenAI use in construction. GenAI is “poised to fundamentally transform the Construction and Built Environment (CBE) industry” but is also a “dual-edged sword, offering immense benefits while simultaneously posing considerable difficulties and potential pitfalls”. Not a summary – just a few extracts: The Good: · GenAI…

Large language models powered system safety assessment: applying STPA and FRAM

An AI, STPA and FRAM walk into a bar…ok, that’s all I’ve got. This study used ChatGPT-4o and Gemini to apply STPA and FRAM to analyse the “liquid hydrogen (LH2) aircraft refuelling process, which is not a well-known process, that presents unique challenges in hazard identification”. One of several studies applying LLMs to safety…

Call Me A Jerk: Persuading AI to Comply with Objectionable Requests

Can LLMs be persuaded to act like d*cks? A really interesting study from Meincke et al. found that human persuasion techniques also work on LLMs. They tested how “classic persuasion principles like authority, commitment, and unity can dramatically increase an AI’s likelihood to comply with requests they are designed to refuse”. I’m drawing from their study…

Bullshit vs Botshit: what’s the difference?

A couple more extracts from Hannigan et al.’s paper on ‘botshit’. Bullshit is “Human-generated content that has no regard for the truth, which a human then uses in communication and decision-making tasks”. Botshit is “Chatbot generated content that is not grounded in truth (i.e., hallucinations) and is then uncritically used by a human in communication…