Summary

Top 13 papers analyzed

LLMs have achieved significant success in interacting with humans. However, recent studies have revealed that these models often suffer from hallucinations, leading to overly confident but incorrect judgments. This limits their application in the medical domain, where tasks require the utmost accuracy. This paper introduces an automated evaluation framework that assesses the practical capabilities of LLMs as virtual doctors during multi-turn consultations. Consultation tasks are designed to require LLMs to be aware of what they do not know, to inquire about missing medical information from patients, and to ultimately make diagnoses. To evaluate the performance of LLMs on these tasks, a benchmark is proposed by reformulating medical multiple-choice questions from the United States Medical Licensing Examinations (USMLE), and comprehensive evaluation metrics are developed and evaluated on three constructed test sets. A medical consultation training set is further constructed to improve the consultation ability of LLMs. The results of the experiments show that fine-tuning with the training set can alleviate hallucinations and improve LLMs' performance on the proposed benchmark.

Healthcare Copilot is a framework that enhances language models for medical consultations. It has three components, covering patient interactions, memory, and dialogue processing. It significantly improves inquiry capability, fluency, accuracy, and safety in medical conversations.

Published By:

ArXiv - arXiv.org

2024

Cited By:

5

Scientists developed a specialized language model called ChatDoctor to improve the accuracy of medical advice given by large language models. By refining the model using real-world patient-doctor interactions and allowing it to access reliable online and offline sources, ChatDoctor significantly improved its understanding of patient inquiries and ability to provide accurate advice.

Published By:

Cureus

2023

Cited By:

203

GPT-4 demonstrates medical accuracy comparable to that of human experts when providing medical advice in cardiology. However, human experts outperform GPT-4 in specific categories, highlighting limitations in its clinical judgment. Further research and improvements in language models are necessary for safe and effective use in medical consultations.

Published By:

JMIR Med Educ - JMIR Medical Education

2024

Cited By:

0

The study assesses the effectiveness of language models in responding to inquiries from autistic individuals in a Chinese setting. The results show that while physicians' responses were superior overall, language models can provide valuable guidance and demonstrate empathy. Further research and optimization are needed for integration in clinical settings.

Published By:

J Med Internet Res - Journal of Medical Internet Research

2023

Cited By:

1

DISC-MedLLM is a solution that uses Large Language Models to provide accurate medical responses in conversational healthcare services. It surpasses existing models by using high-quality datasets and demonstrates effectiveness in bridging the gap between general language models and medical consultation.

Published By:

ArXiv - arXiv.org

2023

Cited By:

39

LLM-AMT is a system designed to enhance the proficiency of language models in the medical field. By incorporating authoritative medical textbooks, LLM-AMT improves response quality and outperforms specialized models trained on medical corpora.

Published By:

ArXiv - arXiv.org

2023

Cited By:

37

Language models have been successful in interacting with humans but often make incorrect judgments due to hallucinations. This paper introduces an evaluation framework to assess language models as virtual doctors, with a focus on their ability to make accurate diagnoses. Fine-tuning with a medical consultation training set improves the models' performance and reduces hallucinations.

Published By:

ArXiv - arXiv.org

2023

Cited By:

4

Large language models (LLMs) trained by deep learning algorithms can generate text responses to user prompts, leading to potential use in clinical consultation. However, limitations like confabulations, lack of contextual awareness, and biases prevent safe clinical deployment. Infectious diseases clinicians should engage with LLMs to advocate for responsible use in specialist care.

Published By:

Clin Infect Dis - Clinical Infectious Diseases

2023

Cited By:

26

Large language models like ChatGPT are increasingly being used as information sources in medicine, but a study found that their performance in answering clinical case-based questions in otorhinolaryngology (ORL) was inferior to that of ORL consultants. Although ChatGPT provided longer and more coherent answers, their medical adequacy and conciseness were significantly lower than those of the ORL consultants' answers.

Published By:

JMIR Med Educ - JMIR Medical Education

2023

Cited By:

9

NLP in healthcare is usually focused on patient-centered services, but the potential for NLP to assist inexperienced doctors in communication skills is largely unexplored. "ChatCoach" is a human-AI framework that allows medical learners to practice dialogues with a patient agent and receive structured feedback from a coach agent, improving their skills.

Published By:

Annu Meet Assoc Comput Linguistics - Annual Meeting of the Association for Computational Linguistics

2024

Cited By:

1

The development of large language models like ChatGPT raises concerns in various domains, including healthcare and bioethics. These concerns include issues of informed consent, the risk of medical deepfakes and misinformation, competition law, user challenges, algorithmic monoculture, and environmental impact. However, an analysis of ChatGPT's impact on bioethics should also consider how it affects the interpersonal, critical, and reason-giving aspect of healthcare.

Published By:

Am J Bioeth - American Journal of Bioethics

2023

Cited By:

2

The study compared the use of a digital health app along with standard care to standard care alone for patients undergoing ACL surgery. The results showed that using the app in addition to standard care significantly improved pain and symptoms before and after surgery. The app is considered a safe and effective tool for prehabilitation and rehabilitation in ACL surgery.

Published By:

Knee Surg Sport Traumatol Arthrosc - Knee Surgery, Sports Traumatology, Arthroscopy

2024

Cited By:

1

Hospitalized psychiatric patients perceive medical consultations as important in bridging the gap between psychiatric and somatic treatment, providing a sense of security and acknowledgment. However, memory impairment and psychiatric treatment hinder the full utilization of these consultations. Support from psychiatric staff is crucial for initiating somatic interventions and improving patient health.

Published By:

Nord J Psychiatry - Nordic Journal of Psychiatry

2024

Cited By:

0