> The OCR returns confidence scores. If the LLM reasons over OCR output that the engine is not confident about, we flag it in the UI for the end user and ask a human to review before moving forward.
This seems helpful, but what if the flagging system misses an error? Do you measure the accuracy of your various systems on your customer data? These are typically the more challenging aspects of integrating ML in healthcare.
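For illustration, a minimal sketch of the kind of confidence gate the quoted comment describes, assuming the OCR engine reports per-token confidence scores in [0, 1]; the `OcrToken` type, `flag_for_review` helper, and 0.85 threshold are all hypothetical:

```python
# Hypothetical sketch of confidence-based flagging for human review.
# Assumes the OCR engine returns per-token text with a confidence score
# in [0, 1]; all names and the 0.85 threshold are illustrative.
from dataclasses import dataclass

@dataclass
class OcrToken:
    text: str
    confidence: float  # per-token score reported by the OCR engine

def flag_for_review(tokens: list[OcrToken], threshold: float = 0.85) -> list[OcrToken]:
    """Return the tokens the OCR engine was not confident about.

    Any non-empty result would route the document to a human reviewer
    before the LLM's output is acted on.
    """
    return [t for t in tokens if t.confidence < threshold]

tokens = [OcrToken("metformin", 0.97), OcrToken("500mg", 0.62)]
low_confidence = flag_for_review(tokens)
if low_confidence:
    # Surface the uncertain spans in the UI and pause the pipeline.
    print("Needs human review:", [t.text for t in low_confidence])
```

The threshold is exactly the kind of parameter the reply is probing: unless it is calibrated against measured accuracy on real customer data, a gate like this can silently pass errors through.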