Hacker News

I don't think anybody following OpenAI's feature releases will be caught off guard by ChatGPT becoming multi-modal. The app already features voice input. That still translates voice into text before sending, but it works so well that you basically never need to check or correct anything. If anything, you've probably already wondered why it doesn't reply with a voice as well.

And the ability to ingest images was a highlight, and much of the hype, of the GPT-4 announcement back in March: https://openai.com/research/gpt-4



One of the original training sets for the BERT series is called 'BookCorpus', assembled by ordinary grad students for natural language processing research. Part of the content was specifically and exactly purposed to "align" movies and video with written text. That is partly why it contains several thousand teen romance novels and ordinary paperback-style storytelling content. What else is in there? Inquiring minds want to know.



