In my tests I found Tesseract quite good for regular text documents. For other kinds of text it's not great.
As for using models - there are some good small language models as well, and of course LLMs.
I sorta feel, though, that if one needs complex OCR or a vision model for layout, one should either opt for a commercial solution that abstracts away the deployment and GPU management, or build one's own system.
For most use cases involving text documents, though, my subjective opinion is that Tesseract is sufficient.