
> [Mar 2024] An MoE model based on the v5 Eagle 2T model

(note: approximate date) Hyped about this! This could strike a powerful balance between performance and a reasonably low environmental and per-token cost. Improved coverage of Scandinavian languages along with it would be cool, but I guess we'll see.

And yeah, I think a true revolution will happen (or might already be happening) when we realize the value of training data and how to structure and balance its content optimally for training.
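By "balance its content" I mean something like explicit mixture weights over data sources. A minimal sketch of that idea is below; the source names and weights are made up for illustration, not any model's actual recipe.

```python
# Sketch: sample training documents from per-source corpora according to
# explicit mixture weights. All names/weights here are hypothetical.
import random

# Hypothetical mixture weights (fractions of the training stream).
mixture = {
    "english_web": 0.70,
    "code": 0.15,
    "scandinavian": 0.10,  # e.g. sv/da/no/is subsets
    "other": 0.05,
}

def sample_source(weights: dict[str, float]) -> str:
    """Pick a corpus name with probability proportional to its weight."""
    names = list(weights)
    return random.choices(names, weights=[weights[n] for n in names], k=1)[0]

# Build a toy training stream of 10 documents by repeatedly picking a source.
stream = [sample_source(mixture) for _ in range(10)]
print(stream)
```

Tuning those weights (how much of each language, how much code, etc.) is exactly the kind of data-curation work I'd expect to matter more and more.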



Maybe look into the Nordic Pile? There are some datasets there.



