Commit

kingfisher-collect(incremental): Load the prepared data into a table, to normalize the data used by Power BI
jpmckinney committed May 16, 2024
1 parent af441b7 commit bc1b2a8
Showing 1 changed file with 9 additions and 0 deletions.
9 changes: 9 additions & 0 deletions salt/kingfisher/collect/files/bi/cron.sh
@@ -31,6 +31,15 @@ psql -U kingfisher_collect -h localhost -t -c 'SELECT data FROM {{ crawl.spider

{{ userdir }}/bin/manage.py json-to-csv -q {{ scratchdir }}/{{ crawl.spider }}.json {{ scratchdir }}/{{ crawl.spider }}.csv

psql -U kingfisher_collect -h localhost -q \
-c "BEGIN" \
-c "DROP TABLE {{ crawl.spider }}_clean" \
-c "CREATE TABLE {{ crawl.spider }}_clean (data jsonb)" \
-c "\copy {{ crawl.spider }}_clean (data) FROM stdin" \
-c "CREATE INDEX idx_{{ crawl.spider }}_clean ON {{ crawl.spider }}_clean (cast(data->>'date' as text))" \
-c "END" \
< {{scratchdir}}/{{crawl.spider}}.out.jsonl

psql -U kingfisher_collect -h localhost -q \
-c "BEGIN" \
-c "DELETE FROM {{ crawl.spider }}_result" \
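
To make the added hunk easier to review, the following is a minimal sketch of the statements it would send to psql once the `{{ crawl.spider }}` placeholder is rendered. The function name `render_clean_sql` and the spider name `example_spider` are hypothetical and used only for illustration; neither appears in the repository. The sketch only prints the statements and does not connect to a database.

```shell
#!/bin/sh
# Sketch: print, one per line, the statements that the added psql call
# passes via -c, with the Jinja placeholder replaced by a spider name.
# "render_clean_sql" is a hypothetical helper, not part of cron.sh.
render_clean_sql() {
    spider="$1"
    # printf '%s\n' prints each argument verbatim on its own line,
    # so the backslash in \copy is preserved.
    printf '%s\n' \
        "BEGIN" \
        "DROP TABLE ${spider}_clean" \
        "CREATE TABLE ${spider}_clean (data jsonb)" \
        "\\copy ${spider}_clean (data) FROM stdin" \
        "CREATE INDEX idx_${spider}_clean ON ${spider}_clean (cast(data->>'date' as text))" \
        "END"
}

# Usage with a hypothetical spider name:
render_clean_sql example_spider
```

One point worth checking when reading the hunk: `DROP TABLE` without `IF EXISTS` raises an error when the table is absent, so the script presumably relies on the `_clean` table having been created on an earlier run or elsewhere. That is an inference, not something the diff shows.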