Socrata-to-Seafowl sync job

This repository contains the code used to sync the data from our index of open data on Socrata into a Seafowl instance.

This data will power the SocFeed app in the future.

In the meantime, see the Observable notebook that showcases this dataset.

How it works

  • Every night (currently on demand), we request an export of the new snapshots of Socrata's Discovery API from Splitgraph in Parquet format
  • This gives us a pre-signed S3 URL to download the file
  • We run CREATE EXTERNAL TABLE on Seafowl with this URL and append the data to a history table, so the file never has to be downloaded to the GitHub Actions runner
  • Then, we run a script (not dbt yet; actual dbt support is coming soon!) that creates the derived tables (monthly/weekly/daily summaries) used by the SocFeed app
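
The load step above can be sketched roughly as follows. This is a minimal illustration, not the job's actual code: the table names `socrata_snapshot` and `socrata_history` and both helper functions are hypothetical, and the sketch assumes that Seafowl places external tables in a `staging` schema and accepts queries as a JSON body of the form `{"query": ...}`. It only builds the statements and the request body; sending them to a Seafowl instance is left out.

```python
import json


def quote_literal(value: str) -> str:
    """Quote a string as a SQL literal, doubling any embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_sync_statements(
    presigned_url: str,
    staging_table: str = "socrata_snapshot",  # hypothetical name
    history_table: str = "socrata_history",   # hypothetical name
) -> list[str]:
    """Return the SQL statements for one sync run, in execution order."""
    return [
        # Point Seafowl at the remote Parquet file instead of downloading it locally.
        f"CREATE EXTERNAL TABLE {staging_table} "
        f"STORED AS PARQUET LOCATION {quote_literal(presigned_url)}",
        # Append the snapshot to the history table
        # (assumes external tables live in the "staging" schema).
        f"INSERT INTO {history_table} SELECT * FROM staging.{staging_table}",
    ]


def build_request_body(statement: str) -> str:
    """Wrap one statement in the JSON body assumed for Seafowl's query endpoint."""
    return json.dumps({"query": statement})


if __name__ == "__main__":
    # Placeholder URL; a real run would use the pre-signed S3 URL from Splitgraph.
    for stmt in build_sync_statements("https://example.com/snapshot.parquet"):
        print(build_request_body(stmt))
```

Keeping the statement builder separate from the HTTP call makes it easy to inspect what a run would execute before pointing it at a real instance.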