From what I understand, the problem is not really scrapers "pounding" the service with thousands of requests for the same things over and over, but that they are scraping the whole of Wikipedia, including heavy content like video that is rarely accessed.
If that is the case, I find it a little concerning that Wikipedia's model relies on most resources rarely being accessed.
Otherwise, if my understanding is wrong, it would mean AI companies are constantly re-scraping the same content to check for changes, like a search engine would, but that makes little sense to me since I'd guess models are only trained once every few months at most.
I also don't understand why they weren't already running into this problem with the constant crawling that search engines have always done...