This project shows how to integrate the DataHub schema registry with Spark and Prophecy. It provides two main functions:
- Browse datasets: browse and select existing datasets listed in DataHub, synchronizing their schema and other metadata.
- Sync dataset: to save dataset details to DataHub, click "Sync To Datahub" to push the schema, descriptions, and other metadata.

Before you start, you need:
- A deployed instance of DataHub. See here for a guide to deploying DataHub.
- A DataHub bearer token to authenticate requests. See here for a guide to generating authentication tokens.

Edit the following file, either in Prophecy or on GitHub: project/gems/prophecysamples_datahubtemplate/gems/DatahubTable.py
Set these variables to the values for your DataHub instance:
DATAHUB_BASE_URL = "<INSERT_DATAHUB_URL>"
DATAHUB_TOKEN = "<INSERT_DATAHUB_TOKEN>"
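
As a rough illustration of how these two values are typically used, the sketch below builds a bearer-authenticated POST request from them. It is not the code in DatahubTable.py. The environment variable names, the helper functions `auth_headers` and `build_request`, and the `/api/graphql` path are assumptions made for this example; check them against your own DataHub deployment.

```python
import os
import urllib.request

# Placeholders copied from the settings above. Reading them from environment
# variables is an assumption of this sketch, not something the project does.
DATAHUB_BASE_URL = os.environ.get("DATAHUB_BASE_URL", "<INSERT_DATAHUB_URL>")
DATAHUB_TOKEN = os.environ.get("DATAHUB_TOKEN", "<INSERT_DATAHUB_TOKEN>")


def auth_headers(token: str) -> dict:
    """Build headers for a JSON request authenticated with a bearer token."""
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }


def build_request(base_url: str, token: str, path: str, body: bytes) -> urllib.request.Request:
    """Prepare (but do not send) a POST request to base_url + path.

    `path` is hypothetical here, for example "/api/graphql".
    """
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    return urllib.request.Request(
        url, data=body, headers=auth_headers(token), method="POST"
    )
```

Calling `build_request(DATAHUB_BASE_URL, DATAHUB_TOKEN, "/api/graphql", b"{}")` only prepares the request; nothing is sent until it is passed to `urllib.request.urlopen`. Keeping the token out of the source file, for example in an environment variable as above, avoids committing it to GitHub.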