Even if datasets were completely taken over by AI-generated images, there would still be selection pressure towards better models, because humans would remain in the loop: choosing which AI-generated images (or videos, text, etc.) to use for their content, keeping the better ones and discarding the not-so-good ones. The input dataset for the next generation of models would therefore not be the raw output of the previous generation(s), but a manually selected subset of their best outputs. However, since humans can only pick from what the current models already produce, this seems like it could only push the models towards a local maximum, so there might be interesting opportunities outside of it.
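
To make the local-maximum worry concrete, here is a toy sketch of the loop described above: generate outputs, let "humans" keep the best fraction, retrain on what was kept. Everything in it is an assumption made up for illustration. The "model" is just a Gaussian over a single number, `human_preference` is a hypothetical two-peaked score (a lower peak at 0, a higher one at 5), and "retraining" means moving the mean to the average of the kept samples. Real models and real human curation are far messier.

```python
import math
import random


def human_preference(x):
    """Hypothetical human score: a lower peak at x=0 and a higher peak at x=5."""
    return math.exp(-(x - 0.0) ** 2) + 2.0 * math.exp(-(x - 5.0) ** 2)


def run_generations(start_mean, n_generations=30, n_samples=200,
                    keep_fraction=0.2, spread=0.3, seed=0):
    """Repeat generate -> human selection -> retrain, and return the final mean.

    The 'model' is a Gaussian with a fixed spread (an assumption), so each
    generation can only produce outputs close to what it already produces.
    """
    rng = random.Random(seed)
    mean = start_mean
    n_keep = int(n_samples * keep_fraction)
    for _ in range(n_generations):
        # The current model generates a batch of outputs.
        outputs = [rng.gauss(mean, spread) for _ in range(n_samples)]
        # Humans keep the outputs they like best and discard the rest.
        outputs.sort(key=human_preference, reverse=True)
        kept = outputs[:n_keep]
        # The next generation is fit only to the kept subset.
        mean = sum(kept) / len(kept)
    return mean


if __name__ == "__main__":
    print("started at -1.0, ended near", round(run_generations(-1.0), 2))
    print("started at  4.0, ended near", round(run_generations(4.0), 2))
```

In this toy setup, a model that starts near the lower peak climbs to it and stays there, because no sample ever lands close enough to the higher peak to be selected. A model that starts within reach of the higher peak climbs to that one instead. Selection does improve things, but only relative to the neighbourhood the model already covers.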