Hacker News

It’s wild to see the discoveries being made in ML research. Like most of these ‘discoveries,’ it makes a fair amount of sense once you think about it: of course the model isn’t going to spit out random noise for random input; it’s been trained to generate realistic-looking images.

Still, I think it’s an interesting discovery, because I don’t think anyone could have predicted it.

One of my favorite examples is the classification model that identifies an apple with a sticker on it reading “pear” as a pear. It makes sense in hindsight, but it’s still surprising when you first see it.



> One of my favorite examples is the classification model that identifies an apple with a sticker on it reading “pear” as a pear. It makes sense in hindsight, but it’s still surprising when you first see it.

That classification model (CLIP) is the first stage of this image generator (DALL-E). Interestingly, this shows that CLIP doesn’t treat the text and the object as exactly the same thing, or at least that that’s not the full story, because DALL-E itself doesn’t confuse the two.

However, other CLIP-guided image-generation models do tend to start writing the prompt as text into the image if you push them too hard.



