This repository contains materials for my talk at the Data Wranglers DC meetup on January 7, 2015.
I've also given prior talks on SQL to the same group:
- August 6, 2014, materials at nihonjinrxs/dwdc-august2014
- June 4, 2014, materials at nihonjinrxs/dwdc-june2014
The talk consists of two major pieces:
- Introductions to the following (via slide deck and presentation):
- The JavaScript Object Notation (JSON) Data Interchange Format (see IETF RFC 7159)
- Briefly: relational data, SQL, and PostgreSQL
- PostgreSQL's implementation in versions 9.2 through 9.4 of JSON types and JSON operators and functions
- A few examples (via PostgreSQL 9.4 flavored SQL code and demonstration):
- Some demonstration code that shows how to use the JSON operators and JSON processing functions in PostgreSQL 9.4. Data used in the demo code was generated with Mockaroo.
- An example that pulls job opening data from the DigitalGov Jobs API and processes it into tabular data.
- An example that pulls weather data from the OpenWeatherMap API and processes it into tabular data.
Note that the focus of this talk is on data preprocessing tasks (also called wrangling or munging), and not on analysis tasks. The goal of this work is to demonstrate some ways to use PostgreSQL's capabilities in version 9.4 (released December 18, 2014) to process JSON formatted data into tabular datasets amenable to future analysis.
Folders are as follows:
- A slide deck (
./slides
) in Apple Keynote, PDF and HTML formats - A set of PostgreSQL 9.4 flavor SQL scripts (
./code
) that create the local PostgreSQL database objects demonstrating working with thejson
andjsonb
types, including:- Some demonstration code that shows how to use the JSON operators and JSON processing functions in PostgreSQL 9.4. Mockaroo settings and mock data are also included.
- An example that pulls job opening data from the DigitalGov Jobs API and processes it into tabular data.
- An example that pulls weather data from the OpenWeatherMap API and processes it into tabular data.
I recommend that anyone wishing to understand what I've done and presented here should tackle these pieces in order, starting with the slide deck. The code is best followed in the order presented: demo code, followed by Jobs API code, followed by OpenWeatherMap API code.
This work and the opinions expressed here are my own, and do not purport to represent the views of my current or former employers.