
To add to what the other commenter said: the dictionary would get so big because every variation of a stem becomes its own token (cat, cats, sit, sitting, etc.). Also, any out-of-dictionary words or combo words, e.g. "cat bed", couldn't be represented at all.
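
Rough sketch of what I mean, assuming a naive whitespace-split, word-level dictionary. The names `build_vocab`, `encode` and the `-1` unknown id are made up for illustration and aren't from any real tokenizer:

```python
def build_vocab(corpus):
    """Give every distinct whitespace-separated word its own id."""
    vocab = {}
    for sentence in corpus:
        for word in sentence.lower().split():
            # setdefault only assigns a new id the first time a word is seen
            vocab.setdefault(word, len(vocab))
    return vocab


def encode(text, vocab, unk_id=-1):
    """Map words to ids; anything not in the dictionary collapses to unk_id."""
    return [vocab.get(word, unk_id) for word in text.lower().split()]


corpus = ["the cat sat", "the cats sit", "a cat is sitting"]
vocab = build_vocab(corpus)

# "cat"/"cats" and "sit"/"sitting" share stems but get unrelated ids,
# so every inflection costs one more dictionary entry.
print(len(vocab), vocab["cat"], vocab["cats"])

# "bed" was never seen, so "cat bed" loses half its meaning to the unknown id.
print(encode("the cat bed", vocab))
```

Subword schemes get around both problems by splitting rare or unseen words into smaller pieces that are in the dictionary.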

