hoodsen's comments

hoodsen · 2025-12-10T15:40:26 1765381226

Do you have plans to improve the quality of the LLM-as-judge, in order to achieve better parity with human clinician annotators? For example, by fine-tuning models? I'm thinking that the comparative clinician judgements themselves would make useful fine-tuning material.

RicardoRei · 2025-12-10T15:47:53 1765381673

Yep yep. It's something we have to study, and it's likely we can improve the LLM-as-a-judge further.

Same thing for the patient LLM. We can probably fine-tune an LLM to do a better job at simulating patients.

Those two components of our framework have room for improvement.