
> So, in order to extract data from them, one needs not only good PDF parser but an OCR engine too.

You can go further. Invoices often contain blocks of free text that spell out important terms, such as shipping times, insurance, and warranties. To build something that works universally, you also need very good natural language processing.


