Possession laws are pretty strict and hard to decode. I wouldn't want to be the test case in court. The idea of "poisoning" a dataset is an interesting theoretical question. But in practice, I just want to judge how likely it is that the dataset is poisoned by the presence of images. If it is, then there's not much I can do with it.
Nonsense. Gwern doesn't need to do anything for anyone.
It's an interesting issue, and a way investigators may be attacked, but it's their responsibility alone. There exists data. This is that data. The data may bite. Touch the data at your own risk.
Guess what, laws aren't universal! Unless gwern has a complete understanding of your jurisdiction and can somehow guess how you plan to use the data, he cannot know what is legal and what isn't. The burden lies on you.
It's a general black market, not just drugs. For example, one of the sites described on that page is PEDOFUNDING, "A crowdfunding site for child pornography." Now the dump isn't supposed to contain any images, but it's hard to be 100% sure. In any case, whatever risk there might be seems to be clearly implied in the name and description there.