Hacker News

It’s wild to see the discoveries being made in ML research. Like most of these ‘discoveries,’ it makes a fair amount of sense once you think about it: of course the model isn’t going to spit out random noise for random input; it’s been trained to generate realistic-looking images.

Still, I think it’s an interesting discovery, because I don’t think anyone could have predicted it.

One of my favorite examples is the classification model that identifies an apple with a sticker on it reading “pear” as a pear. It makes sense in hindsight, but it’s still surprising when you first see it.



> One of my favorite examples is the classification model that identifies an apple with a sticker on it reading “pear” as a pear. It makes sense in hindsight, but it’s still surprising when you first see it.

That classification model (CLIP) is the first stage of this image generator (DALL-E). Interestingly, this shows that CLIP doesn’t treat the text and the object as exactly the same thing, or at least that that’s not the full story, because DALL-E itself doesn’t confuse the two.

However, other CLIP-guided image-generation models do tend to start writing the prompt as text into the image if you push them too hard.



