It's a nice idea, but so few sites set up equivalent data endpoints well that the returns on doing the work to consume them this way are vanishingly small.
Plus, the feeds might not carry the same content. When I used RSS more heavily, some of my favorite sites only posted summaries in their feeds, so I had to read the HTML pages anyway. How would a scraper know whether that's the case?
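If you wanted a rough answer to that question: here's a sketch of one heuristic using the feedparser library, assuming full-text feeds populate content:encoded (or at least run much longer than a teaser); the 1000-character cutoff is an arbitrary guess, not anything standard.

    import feedparser  # pip install feedparser

    def feed_looks_full_text(feed_url, min_chars=1000):
        """Rough guess at whether a feed carries full articles or just teasers."""
        d = feedparser.parse(feed_url)
        if not d.entries:
            return False
        entry = d.entries[0]
        # Full-text feeds usually populate <content:encoded>, which
        # feedparser exposes as entry.content; summary-only feeds don't.
        if entry.get("content"):
            body = entry.content[0].value
        else:
            body = entry.get("summary", "")
        return len(body) >= min_chars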
The real problem is that the explosion of scrapers that ignore robots.txt has put a heavy burden on all sites, with or without APIs.
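Honoring robots.txt costs almost nothing, for what it's worth. A minimal sketch using Python's stdlib robotparser, with a hypothetical user-agent string:

    import urllib.request
    from urllib.parse import urlsplit
    from urllib.robotparser import RobotFileParser

    USER_AGENT = "MyScraper/1.0"  # hypothetical user-agent string

    def polite_fetch(url):
        """Fetch url only if the site's robots.txt allows our user agent."""
        parts = urlsplit(url)
        rp = RobotFileParser()
        rp.set_url(f"{parts.scheme}://{parts.netloc}/robots.txt")
        rp.read()
        if not rp.can_fetch(USER_AGENT, url):
            raise PermissionError(f"robots.txt disallows {url}")
        req = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
        with urllib.request.urlopen(req) as resp:
            return resp.read()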
Roughly 43-44% of websites run WordPress, and many non-WP sites have public APIs too. And legality aside, honoring robots.txt is just the kind and courteous thing to do.
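For the WordPress case specifically, the REST API ships with core, so there's often nothing to scrape at all. A sketch, assuming the site hasn't disabled the default /wp-json routes:

    import json
    import urllib.request

    def wp_recent_posts(site, count=5):
        """Pull recent posts from a stock WordPress REST endpoint."""
        url = f"{site}/wp-json/wp/v2/posts?per_page={count}&_fields=date,title,link"
        with urllib.request.urlopen(url) as resp:
            return json.load(resp)

    for post in wp_recent_posts("https://example.com"):  # hypothetical site
        print(post["date"], post["title"]["rendered"], post["link"])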
If a site uses GraphQL, then it's worth learning: GraphQL endpoints are often poorly secured, and you can pull interesting information out of them.
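Introspection is usually the first thing to try. A sketch using only the stdlib, assuming a hypothetical /graphql endpoint that hasn't disabled introspection:

    import json
    import urllib.request

    def introspect(endpoint):
        """Ask a GraphQL endpoint to describe its own schema."""
        query = "{ __schema { types { name fields { name } } } }"
        req = urllib.request.Request(
            endpoint,
            data=json.dumps({"query": query}).encode(),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)

    schema = introspect("https://example.com/graphql")  # hypothetical endpoint
    for t in schema["data"]["__schema"]["types"]:
        print(t["name"])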