Can ChatGPT exceed humans in construction project risk management?

This study pitted ChatGPT 4 against competent construction personnel (project and site managers, engineers, etc.) in a project risk management task.

Specifically, the authors compared how the AI model and the human participants performed three tasks on a construction project case study:

·      Identify and list the potential project risks

·      Determine which risks are most critical and analyse them

·      Describe how these risks are to be controlled

Subject matter experts (SMEs) then evaluated the responses from both the human participants and the AI, and the AI was also asked to evaluate the responses.

[* Note: While this experiment pitted them against each other, the authors of course recognise that the AI’s strength lies more in supporting or supplementing human experts, as discussed in the paper]

Some findings:

·      “ChatGPT has demonstrated a superior ability to generate comprehensive risk management plans, with its quantitative scores significantly surpassing the human average”

·       “Nonetheless, the AI model’s strategies are found to lack practicality and specificity, areas where human expertise excels”

·      “the results of our research revealed indications of the superiority of ChatGPT over human experts in the test comparison”

·       “This finding undermines the narrative of human dominance in complex decision-making tasks and reveals the potential of generative AI to support traditional CPRM processes alongside human experts”

·      “ChatGPT achieved significantly better quantitative evaluations than humans both when reviewed by human reviewers and ChatGPT itself. Human experts average 5.7, while ChatGPT averages a score of 8.6 … on the exercise based on human review”

·      ChatGPT scored the human responses higher than the human reviewers did. It also scored its own responses higher for risk identification and analysis, whereas the human reviewers rated the AI answers more highly for risk control

·      “Some evaluators praised ChatGPT’s ability to provide comprehensive and detailed risk assessments, suggesting that generative AI systems have the potential to excel in this area”

·      However, “an equally common concern was the specificity and practicality of risk management”, with reviewers criticising the AI “responses for failing to tailor their risk management strategies to the specific circumstances of the construction project”

·      Likewise, “While the AI responses were praised for their breadth, they were criticized for being too general and lacking actionable strategies indicating a need for human experts’ tacit knowledge”

·      Some people were concerned that ChatGPT’s overly detailed approach could be a distraction because of the overabundance of information

·      “While ChatGPT demonstrates an impressive ability to comprehensively analyze and present risks within the test set, it also shows a potential gap in providing practical, context-specific, easily implementable strategies and seemingly lacks the implicit knowledge some human respondents could showcase”

·      “AI models such as ChatGPT may not necessarily surpass human capabilities in managing construction project risks in their current state, but they offer promising potential to enhance human performance when used as complementary tools”

·      “AI models are currently limited in their ability to access and interpret implicit data, such as personal experience with nuanced and complex underlying assumptions and hidden correlations, or unarticulated expertise and judgment, a dimension where human experts hold an advantage”

·      “While humans have inherent limitations, such as time and motivational constraints that could have potentially influenced their performance, AI models like ChatGPT operate devoid of such restrictions. These disparities suggest that the proposed optimal approach is to leverage AI in CPRM, where its capabilities excel, and freeing human expert resources to concentrate on areas where they demonstrate superior capabilities”

Ref: Nyqvist, R., Peltokorpi, A., & Seppänen, O. (2024). Can ChatGPT exceed humans in construction project risk management? Engineering, Construction and Architectural Management, 31(13), 223-243.

Study link: https://www.emerald.com/insight/content/doi/10.1108/ECAM-08-2023-0819/full/pdf

My site with more reviews: https://safety177496371.wordpress.com

LinkedIn post: https://www.linkedin.com/posts/benhutchinson2_this-study-pit-chatgpt-4-versus-competent-activity-7254960815768809472-4xZY?utm_source=share&utm_medium=member_desktop
