Tag: artificial intelligence
AI deception: A survey of examples, risks, and potential solutions
This study explored how “a range of current AI systems have learned how to deceive humans”. Extracts: · “One part of the problem is inaccurate AI systems, such as chatbots whose confabulations are often assumed to be truthful by unsuspecting users” · “It is difficult to talk about deception in AI systems without psychologizing them. In humans,… Continue reading AI deception: A survey of examples, risks, and potential solutions
Agentic Misalignment: How LLMs could be insider threats (Anthropic research)
AI and malicious compliance. This research from Anthropic has done the rounds, but it’s quite interesting. In controlled experiments (not real-world applications), they found that AI models could resort to “malicious insider behaviors when that was the only way to avoid replacement or achieve their goals—including blackmailing officials and leaking sensitive information to competitors”. Some extracts:… Continue reading Agentic Misalignment: How LLMs could be insider threats (Anthropic research)
Safe As 33: Is ChatGPT bullsh**ing you? How Large Language Models aim to be convincing rather than truthful
Large Language Models, like ChatGPT, have amazing capabilities. But are their responses, which aim to be convincing human text, more indicative of BS? That is, of responses that are indifferent to the truth? If they are, what are the practical implications? Today’s paper is: Hicks, M. T., Humphries, J., & Slater, J. (2024). ChatGPT is bullshit. Ethics and… Continue reading Safe As 33: Is ChatGPT bullsh**ing you? How Large Language Models aim to be convincing rather than truthful
Enhancing AI-Assisted Group Decision Making through LLM-Powered Devil’s Advocate
Can LLMs effectively play devil’s advocate, enhancing group decisions? Something I’ve been working on lately is AI as a co-agent for cognitive diversity / requisite imagination. Here’s a study which explored an LLM as a devil’s advocate, and I’ll post another study next week on AI and red teaming. [Though this study relied on… Continue reading Enhancing AI-Assisted Group Decision Making through LLM-Powered Devil’s Advocate
BEWARE OF BOTSHIT: HOW TO MANAGE THE EPISTEMIC RISKS OF GENERATIVE CHATBOTS
Really interesting discussion paper on the premise of ‘botshit’: the AI version of bullshit. I can’t do this paper justice – it’s 16 pages, so I can only cover a few extracts. I recommend reading the full paper. TL;DR: generative chatbots predict responses rather than knowing the meaning of their responses, and hence “produce coherent-sounding but… Continue reading BEWARE OF BOTSHIT: HOW TO MANAGE THE EPISTEMIC RISKS OF GENERATIVE CHATBOTS
Human Factors and Ergonomics in Industry 5.0—A Systematic Literature Review
This open access article may interest people – it explored the future of human factors/ergonomics in Industry 5.0 (I5.0). This isn’t a summary, but you can read the full paper freely. Some extracts:
Study link: https://doi.org/10.3390/app15042123
LinkedIn post: https://www.linkedin.com/posts/benhutchinson2_this-open-access-article-may-interest-people-activity-7300617102564933632-WGPj?utm_source=share&utm_medium=member_desktop&rcm=ACoAAAeWwekBvsvDLB8o-zfeeLOQ66VbGXbOpJU