This doesn't give you any insight into the provenance, but Philip James has an awesome talk[0] where he uses Datasette to collect city government documents into a more structured format for analysis. Lots of clever automation to extract data from PDFs. Really amazing work for surfacing civic decisions.
[0] https://pyvideo.org/pybay-2024/automate-your-city-data-with-... "Automate Your City Data with Python"