Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

A byte tokenized model is naturally 100% multi-lingual in all languages in its data set. There just isn't a lot of reason for teams to spend the extra training time to build that sort of model.


Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: