- Set up Apache Toree Jupyter notebook.
- Download raw data from: https://rapidsai.github.io/demos/datasets/mortgage-data
- Run Mortgage ETL job.
- Set up Apache Toree Jupyter notebook.
- Install `cudatoolkit` and `numba` (a `conda` example is provided, but you can also use `pip`):

      conda install numba
      conda install cudatoolkit
- Download raw data:

      wget https://s3.amazonaws.com/nyc-tlc/trip+data/yellow_tripdata_20{09..16}-{01..12}.csv
- Run Taxi ETL job.
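
The `wget` command in the Taxi download step relies on shell brace expansion to name 96 monthly files (2009-01 through 2016-12). Zero-padded ranges such as `{01..12}` depend on the shell (for example, bash 4 or later), so the following is a minimal sketch, assuming only a POSIX shell, that prints the same URLs with explicit loops. The function name `list_taxi_urls` is hypothetical and not part of the original instructions.

```shell
# Hypothetical helper: print one URL per month for 2009-01 .. 2016-12,
# matching what the brace expansion in the wget command is meant to produce.
list_taxi_urls() {
  base="https://s3.amazonaws.com/nyc-tlc/trip+data"
  year=2009
  while [ "$year" -le 2016 ]; do
    month=1
    while [ "$month" -le 12 ]; do
      # %02d zero-pads the month so that 1 becomes 01
      printf '%s/yellow_tripdata_%d-%02d.csv\n' "$base" "$year" "$month"
      month=$((month + 1))
    done
    year=$((year + 1))
  done
}
```

Running `list_taxi_urls` on its own lets you inspect the list before downloading anything; piping it into `wget -i -` would fetch the files one by one.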