New research suggests that generative AI has the potential to speed up diagnosis and reduce waiting times in the emergency department.


SwissCognitive Guest Blogger:  HennyGe Wichers, PhD – “ChatGPT rivals doctors at suggesting diagnosis in ER”


In December 2022, most of us were oblivious to ChatGPT. But the large language model (LLM) was already taking the United States Medical Licensing Exam, a three-part test aspiring doctors take between medical school and residency. The AI did well, scoring at or near the passing threshold for all three exams – without any special training. The chatbot also offered sensible and insightful explanations for its answers.

But now there’s evidence suggesting that ChatGPT can also work in the real world. Researchers Dr Hidde ten Berg and Dr Steef Kurstjens piloted the LLM in the Emergency Department of Jeroen Bosch Hospital in the Netherlands. They published their findings in the Annals of Emergency Medicine on September 9, 2023, and will present their work at the European Emergency Medicine Congress in Barcelona on September 17-20.

* * *

“Like a lot of people, we have been trying out ChatGPT and we were intrigued to see how well it worked for examining some complex diagnostic cases. So, we set up a study to assess how well the chatbot worked compared to doctors with a collection of emergency medicine cases from daily practice,” Dr Ten Berg explains.

ChatGPT and GPT-4 performed well in generating lists of possible diagnoses and suggesting the most likely option. The results produced by the LLMs showed a lot of overlap with actual doctors’ lists of potential diagnoses.

“Simply put, this indicates that ChatGPT was able [to] suggest medical diagnoses much like a human doctor would,” Dr Ten Berg adds in an interview ahead of the Congress this weekend.

The researchers asked the LLM to suggest differential diagnoses for 30 patients who attended the Emergency Department in early 2022. At the time, a doctor examined each patient on arrival at the hospital and made notes of their assessment. Patients then underwent standard laboratory tests and were assigned to a treating physician, who recorded potential diagnoses and decided on additional tests. All patients were discharged with a confirmed diagnosis, and letters to their General Practitioner confirmed the details.

For the study, a fresh set of doctors reviewed each case and devised five possible diagnoses for each patient, picking one as the most likely. Initially, they only used the available notes. Then, they looked at the test results and made revisions if the new information changed their opinion. Finally, the research team entered each case into ChatGPT and GPT-4 three times.

Using only the notes, human doctors included the correct diagnosis in their top 5 for 83% of cases. But the LLMs got impressive scores, too: ChatGPT achieved 77% and GPT-4 87%. Once the lab test results were added, physicians’ accuracy increased to 87% and ChatGPT got a near-perfect 97%. GPT-4 remained at 87%.

Fig 1: Percentage of cases with correct diagnosis in the top 5 (researchers’ image)
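To put those percentages in perspective, it helps to translate them back into patients. The short Python sketch below does that arithmetic using only the figures reported above. It assumes each percentage was calculated over all 30 cases, which the article does not confirm; because each case was entered into the LLMs three times, the study's real denominators may differ. The names `reported` and `implied_count` are illustrative, not taken from the study.

```python
# Assumption: every reported percentage is a share of the 30 study cases.
N_CASES = 30

# Top-5 inclusion rates (in percent) as reported in the article.
reported = {
    ("physicians", "notes only"): 83,
    ("ChatGPT", "notes only"): 77,
    ("GPT-4", "notes only"): 87,
    ("physicians", "notes + labs"): 87,
    ("ChatGPT", "notes + labs"): 97,
    ("GPT-4", "notes + labs"): 87,
}


def implied_count(percent, n=N_CASES):
    """Return the whole number of cases closest to `percent` of `n`."""
    return round(percent * n / 100)


# Convert each reported percentage into an implied number of cases.
implied = {key: implied_count(pct) for key, pct in reported.items()}

for (who, info), count in implied.items():
    print(f"{who:<10} {info:<13} {count}/{N_CASES} cases")
```

Under that assumption, the spread between the lowest and highest scores is six patients out of 30, so with a sample this small a single case moves a score by about three percentage points.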

In some cases, the AI outperformed the physician. Dr Ten Berg illustrates: “For example, we included a case of a patient presenting with joint pain that was alleviated with painkillers, but redness, joint pain and swelling always recurred. In the previous days, the patient had a fever and sore throat. A few times there was a discolouration of the fingertips. Based on the physical exam and additional tests, the doctors thought the most likely diagnosis was probably rheumatic fever, but ChatGPT was correct with its most likely diagnosis of vasculitis.”

* * *

These results suggest that “there is potential here for saving time and reducing waiting times in the emergency department,” according to Dr Ten Berg. He adds that the benefit of using AI could be supporting doctors with less experience, or it could help spot rare diseases.

But it’s important to remember that ChatGPT is not a medical device, and there are concerns over privacy when using AI with medical data.

“We are a long way from using ChatGPT in the clinic, but it’s vital that we explore new technology and consider how it could be used to help doctors and their patients,” said Youri Yordanov, who was not involved in the research. The professor at the St. Antoine Hospital emergency department (APHP Paris) in France and Chair of the European Society for Emergency Medicine added: “I look forward to more research in this area and hope that it might ultimately support the work of busy health professionals.”

The study adds to a growing literature highlighting the role AI can play in personalised treatments and using electronic health records for (preventive) diagnostics.

* * *

Source: Annals of Emergency Medicine via EurekAlert!

About the Author:

HennyGe Wichers is a technology science writer and reporter. For her PhD, she researched misinformation in social networks. She now writes more broadly about artificial intelligence and its social impacts.