
This can be adapted to a new speaker using only 5 seconds of reference audio: https://google.github.io/tacotron/publications/speaker_adapt... https://arxiv.org/pdf/1806.04558.pdf

It's been mentioned a bit already, but I thought it was worth calling out. This may be one of the lowest-overhead ways to start experimenting, at least in terms of data collection.


