As generative artificial intelligence (AI) programs, such as ChatGPT, continue to evolve and improve, there is growing interest in how the technology could be used in healthcare — including whether it could replace doctors. According to a new study published in JAMA Internal Medicine, ChatGPT may be better at answering patients' questions than physicians, scoring higher on both quality and empathy.
For the study, researchers collected patient questions and responses from doctors through Reddit's AskDocs subreddit. In this community, which has 452,000 members, people can ask medical questions and receive answers from verified healthcare professionals. In total, 195 exchanges were collected from AskDocs.
Then, the researchers posed the original questions to ChatGPT and recorded each of its responses. Both sets of responses were then evaluated by a panel of three licensed physicians.
The physicians were asked to choose which response they thought was better and judge each response on the quality of its information and the empathy/bedside manner given. Each assessment was scored on a five-tier scale that ranged from "very poor" to "very good" (quality) or "not empathetic" to "very empathetic" (empathy).
Overall, the panel preferred ChatGPT's responses over physicians' in 78.6% of the 585 completed evaluations. ChatGPT's responses were also longer than the doctors' responses, ranging from 168 to 245 words compared to 17 to 62 words for the physicians.
The AI responses were also rated significantly higher on both quality and empathy. On average, ChatGPT's responses scored a 4 on quality and a 4.67 on empathy, while the physician responses scored a 3.33 on quality and a 2.33 on empathy. In total, ChatGPT had 3.6 times more high-quality responses and 9.8 times more empathetic responses than physicians.
"ChatGPT provides a better answer," said John Ayers, vice chief of innovation with the division of infectious disease and global public health at the Qualcomm Institute at University of California, San Diego, who led the study. "I think of our study as a phase zero study, and it clearly shows that ChatGPT wins in a landslide compared to physicians, and I wouldn't say we expected that at all."
Although ChatGPT scored higher than physicians, Ayers emphasized that "[t]his doesn't mean AI will replace your physicians." Instead, he said that "it does mean a physician using AI can potentially respond to more messages with higher-quality responses and more empathy."
Unlike physicians, who are often pressed for time and struggling with burnout, ChatGPT can more easily craft a detailed and empathetic response — a draft a doctor can then use to enhance their actual reply.
"AI messaging is not operating in a constraint," Ayers said. "That's the new value-added of AI-assisted medicine. Doctors will spend less time over verbs and nouns and conjugation, and more time actually delivering health care."
In addition, Ayers noted that AI programs could improve patient outcomes and "be a game changer for public health."
In an accompanying editorial, Jonathan Chen, an assistant professor at the Stanford University School of Medicine, and co-authors said that, while there are ways for physicians to implement AI technology into clinical practice, it's important to do so carefully since there's a risk of exacerbating existing biases or creating other harms.
"Medicine is much more than just processing information and associating words with concepts; it is ascribing meaning to those concepts while connecting with patients as a trusted partner to build healthier lives," the authors wrote.
Overall, Ayers said that, even with the potential risks that come with AI technology, he's "pretty optimistic about what it could do for people's health."
Currently, three health systems — UC San Diego Health, UW Health, and Stanford Health Care — have already begun or are planning to start new ChatGPT pilots in their organizations. In the programs, healthcare workers will test whether AI helps reduce the time they spend answering patients' online questions.
According to the Wall Street Journal, when a doctor clicks on a message from a patient, ChatGPT instantly generates a draft reply. To create the reply, the AI uses information from the patient's message, as well as an abbreviated version of their electronic medical history. Physicians can then either use the draft as the basis of their response to the patient or start with a completely blank reply.
So far, ChatGPT is only being used on certain questions, such as prescription requests or requests for documentation. It is not being used for any questions requesting medical advice.
Marlene Millen, a primary care physician at UC San Diego Health, said that while early AI-generated responses have required heavy editing, "it is helping us get it started," particularly by saving time that would have been spent pulling up a patient's chart.
Millen noted that healthcare staff so far seem to be excited about the AI program's potential uses, particularly as they struggle to accommodate more and more patient messages. "Doctors are so burnt out that they are looking for any kind of hope," she said. (Mozes, HealthDay/U.S. News & World Report, 4/28; Adams, MedCity News, 4/30; Tilley, Daily Mail, 4/28; Subbaraman, Wall Street Journal, 4/28; Reed, Axios, 5/1; McPhillips, CNN, 4/28; DePeau-Wilson, MedPage Today, 4/28)