PROJECTS
Evaluation of LLM Accuracy and Consistency in the Registered Dietitian Exam
This project evaluates how leading large language models (GPT-4o, Claude 3.5, and Gemini 1.5) perform on 1,050 Registered Dietitian exam questions across diverse nutrition domains. We test several prompting strategies, including Zero-Shot, Chain of Thought, Self-Consistency, and Retrieval-Augmented Prompting, and measure the accuracy and consistency of the resulting answers. The results show how model choice and prompt design affect reliability in diet and nutrition applications, informing the design of safer, more effective AI-driven nutrition support systems.
