Abstract
Questions are a fundamental tool for acquiring information, from children’s learning to complex tasks. Recent work has shown that the informativeness of questions generated by large language models (LLMs) can be improved through Direct Preference Optimization (DPO) and Expected Information Gain (EIG). In this study, we evaluate a DPO-trained model in the setting of medical interviews. We find that DPO training improves success rates in medical interviews, demonstrating the broader applicability of this approach.