Tuning into whispered frequencies: Harnessing Large Language Models to detect Weak Signals in complex socio-technical systems

This study evaluated whether LLMs can support a scaled, systematic analysis of survey data about worker adaptive practices, to foster weak signal identification. That is, can LLMs help identify weak signals from large-scale data? In this case, the data were textual descriptions of frontline personnel's adaptive behaviours during everyday operations, obtained via survey.
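A minimal sketch of the general idea, assuming the OpenAI Python client and an invented prompt and survey excerpts; the study's actual method, model, and data are not reproduced here:

```python
# Minimal sketch: screening survey free-text for candidate weak signals with an LLM.
# Assumes the OpenAI Python client (pip install openai) and an API key in the
# environment; model name and prompt wording are illustrative, not the study's.
from openai import OpenAI

client = OpenAI()

PROMPT = (
    "You review frontline workers' descriptions of everyday adaptive practices. "
    "Decide whether the text hints at a 'weak signal' of emerging risk or drift "
    "(e.g. workarounds, normalised deviations, near-miss conditions). "
    "Answer YES or NO, then give a one-sentence justification."
)

def screen_response(survey_text: str) -> str:
    """Ask the model to flag a single survey response as a candidate weak signal."""
    reply = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[
            {"role": "system", "content": PROMPT},
            {"role": "user", "content": survey_text},
        ],
        temperature=0,  # keep screening output as deterministic as possible
    )
    return reply.choices[0].message.content

# Hypothetical survey excerpts, not data from the study.
responses = [
    "We usually skip the second isolation check when the night shift is short-staffed.",
    "The new permit form is easier to fill in than the old one.",
]
for text in responses:
    print(screen_response(text))
```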

Safer Systems: People Training or System Tuning?

Hollnagel discusses the role of training in complex systems. Shared under an open access licence. Extracts:
· "Safety is usually seen as a problem when it is absent rather than when it is present, where accidents, incidents, and the like…"

Does using AI make us smarter, or just more confident?

This episode covers a recent study on how generative AI affects our "metacognition" – our ability to judge our own performance. Researchers tracked hundreds of people solving logical reasoning problems with and without AI.
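As a toy illustration of the metacognition idea, one can compare self-rated confidence against actual accuracy; all numbers below are invented, not the study's:

```python
# Toy illustration: the gap between self-rated confidence and actual accuracy.
# A positive gap means overconfidence. Numbers are made up; the study's design
# and data differ.
def overconfidence(confidences: list[float], correct: list[bool]) -> float:
    """Mean self-rated confidence minus actual accuracy."""
    accuracy = sum(correct) / len(correct)
    mean_conf = sum(confidences) / len(confidences)
    return mean_conf - accuracy

# Hypothetical pattern: with AI assistance, self-ratings rise while accuracy
# barely moves, so the overconfidence gap widens.
print(overconfidence([0.7, 0.6, 0.8], [True, False, True]))   # without AI
print(overconfidence([0.9, 0.9, 0.95], [True, False, True]))  # with AI
```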

Improving Construction Site Safety with Large Language Models: A Performance Analysis

This preliminary proof-of-concept study explored the effectiveness of GPT-4o in visual hazard recognition on construction sites, contrasting its performance against OHS experts. The source material was static images from Google and real construction sites (not real-time video analysis). The LLM and the experts were asked to rate the hazard, justify their judgement, and assess the immediate issues, use…
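A hedged sketch of the kind of vision call such a study might make, using the OpenAI Python client; the prompt wording, model choice, and file path are illustrative assumptions, not the paper's protocol:

```python
# Sketch: sending a static construction-site image to GPT-4o and asking for a
# hazard rating plus a justification. File path and prompt are hypothetical.
import base64
from openai import OpenAI

client = OpenAI()

def rate_hazard(image_path: str) -> str:
    """Ask a vision-capable model to rate and justify hazards in one image."""
    with open(image_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode()
    reply = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": ("Rate the overall hazard level in this construction "
                          "scene (low/medium/high), justify your rating, and "
                          "list the most immediate safety issues.")},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
            ],
        }],
    )
    return reply.choices[0].message.content

print(rate_hazard("site_photo.jpg"))  # hypothetical file
```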

Safe As: Can AI make your doctors worse at their job?

Can #AI make your doctor worse at their job? This multicentre study compared physician ADR (adenoma detection rate) before and after AI-assisted detection – and then after removing the AI assistance. What do you think – will the overall benefits of AI outweigh the negative and unintended skill drops in people?
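For context, ADR is simply the share of screening colonoscopies in which at least one adenoma is found. A toy calculation with invented counts, not the study's data:

```python
# Adenoma Detection Rate (ADR): proportion of screening colonoscopies where at
# least one adenoma is detected. All counts below are hypothetical.
def adr(colonoscopies_with_adenoma: int, total_colonoscopies: int) -> float:
    return colonoscopies_with_adenoma / total_colonoscopies

print(f"before AI:        {adr(226, 800):.1%}")  # hypothetical baseline
print(f"with AI:          {adr(284, 800):.1%}")  # hypothetical uplift
print(f"after AI removed: {adr(180, 800):.1%}")  # hypothetical skill drop
```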

AI: Structural vs Algorithmic Hallucinations

There are several typologies that sort different types of hallucinations – this is just one I recently saw. It suggests that structural hallucinations are an inherent part of the mathematical and logical structure of the #LLM, and not a glitch or a bad prompt. LLMs are probabilistic engines, with no understanding…
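A toy illustration of that "probabilistic engine" point: the next token is sampled from a probability distribution, so a fluent-but-wrong continuation can be emitted whenever it carries enough probability mass. The distribution below is invented:

```python
# Toy next-token sampling. If wrong continuations hold real probability mass,
# they will be emitted some fraction of the time - by construction of the
# distribution, not because of a retrieval failure or a bad prompt.
import random

# Hypothetical distribution for "The Moon landing happened in ..."
next_token_probs = {
    "1969": 0.55,       # factual continuation
    "1968": 0.25,       # plausible but wrong
    "the 1970s": 0.20,  # vague and wrong
}

def sample(probs: dict[str, float]) -> str:
    return random.choices(list(probs), weights=list(probs.values()), k=1)[0]

# Roughly 45% of samples here are wrong, purely by the maths of sampling.
print(sample(next_token_probs))
```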

AI deception: A survey of examples, risks, and potential solutions

This study explored how "a range of current AI systems have learned how to deceive humans". Extracts:
· "One part of the problem is inaccurate AI systems, such as chatbots whose confabulations are often assumed to be truthful by unsuspecting users"
· "It is difficult to talk about deception in AI systems without psychologizing them. In humans,…

Agentic Misalignment: How LLMs could be insider threats (Anthropic research)

AI and malicious compliance. This research from Anthropic has done the rounds, but it's quite interesting. In controlled experiments (not real-world applications), they found that AI models could resort to "malicious insider behaviors when that was the only way to avoid replacement or achieve their goals—including blackmailing officials and leaking sensitive information to competitors". Some extracts:…

Large Language Models in Lung Cancer: Systematic Review

This systematic review of 28 studies explored the application of LLMs to lung cancer (LC) care and management. Probably few surprises here. And it's focused mostly on LLMs rather than specialised AI models. Extracts:
· The review identified 7 primary application domains of LLMs in LC: auxiliary diagnosis, information extraction, question answering, scientific research, medical education, nursing…

From transcript to insights: Summarizing safety culture interviews with LLMs

How well does OpenAI o1 work for summarising 'safety culture' interviews, and how does it compare to human notes? This study did just that. Extracts:
· They assessed correctness via exhaustiveness (comparison of LLM claims vs human interviewer notes), consistency (comparison of LLM claims between subsequent…
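As a rough sketch of the consistency idea, one could compare the claim sets extracted from two subsequent LLM runs over the same transcript using a naive set-overlap measure; the claims and the Jaccard choice below are illustrative assumptions, not the paper's actual procedure:

```python
# Naive run-to-run consistency check: Jaccard overlap between the sets of
# claims extracted from two LLM summarisation runs. Claims are hypothetical.
def jaccard(claims_a: set[str], claims_b: set[str]) -> float:
    if not claims_a and not claims_b:
        return 1.0
    return len(claims_a & claims_b) / len(claims_a | claims_b)

run1 = {"reporting is encouraged", "supervisors act on near misses"}
run2 = {"reporting is encouraged", "training is refreshed annually"}
print(f"run-to-run consistency: {jaccard(run1, run2):.2f}")
```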