Gemini Goes to Med School:
Exploring the Capabilities of Multimodal Large Language Models on Medical Challenge Problems & Hallucinations
In this study, we evaluate Google’s Gemini, a state-of-the-art multimodal large language model, across several benchmarks in the medical domain, including:
- Medical reasoning (MultiMedQA)
- Hallucination detection (Med-HALT)
- Medical Visual Question Answering (VQA)