The Hellscape that is Scraping Legislative Data as an Open Source Project

by Rylie Johnson | at Minnebar 17

Collecting complex legislative data from hundreds of government websites isn't all sunshine & rainbows. We'll cover some of the strangest things we've encountered while scraping these sites, from awful APIs to entire government offices disappearing. Most official sources have been easy enough to find, but others have been trickier, with at least one coming from an address scribbled on a napkin. Since we're also an open source project, we struggle with things like getting helpful issue submissions and ensuring clarity around the project ecosystem.

Open States strives to improve civic engagement at the state and federal level by providing data and tools about legislatures. We aggregate legislative information from all 50 states, Congress, Washington, D.C., and Puerto Rico. This information is then standardized, cleaned, and published for free. At the cost of our sanity.

This talk is recommended for folks interested in scraping government data or contributing to open source civic tech projects, and for those who enjoy hearing about other people’s misery.

Beginner

Rylie Johnson

Rylie is a Software Engineer with Plural and a Maintainer of Open States, where they aim to improve civic engagement at the state and federal level by providing data and tools regarding those legislatures.

They currently live in Minneapolis with their pit bull Frankie. Outside of work, they enjoy practicing circus arts, spending too much money at bookstores, and tripping the light fantastic on any dance floor.