Amundsen is a metadata-driven application for improving the productivity of data analysts, data scientists, and engineers when interacting with data. It does that today by indexing data resources (tables, dashboards, streams, etc.) and powering a page-rank style search based on usage patterns (e.g. highly queried tables show up earlier than less queried tables). Think of it as Google search for data. The project is named after Norwegian explorer Roald Amundsen, the first person to reach the South Pole.
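To make the idea of usage-based ranking concrete, here is a minimal sketch. It is not Amundsen's actual ranking algorithm: it swaps in a plain substring match plus a sort on a hypothetical `query_count` field, and the `Table` class, the `search` function, and the sample catalog are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Table:
    name: str
    query_count: int  # hypothetical usage signal, e.g. number of recent queries


def search(tables, term):
    """Return tables whose name contains `term`, most queried first."""
    term = term.lower()
    matches = [t for t in tables if term in t.name.lower()]
    # Sort by usage (descending); break ties alphabetically for stable output.
    return sorted(matches, key=lambda t: (-t.query_count, t.name))


# Made-up catalog, for illustration only.
catalog = [
    Table("core.rides", query_count=950),
    Table("staging.rides_tmp", query_count=12),
    Table("core.drivers", query_count=400),
    Table("analytics.rides_daily", query_count=310),
]

results = [t.name for t in search(catalog, "rides")]
```

Here `core.rides` is listed first because it is queried most often, and the rarely used `staging.rides_tmp` comes last.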
The frontend service uses a separate search service to let users search for data resources, and a separate metadata service to let them view and edit metadata for a given resource. It is a Flask application with a React frontend.
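The split of responsibilities can be sketched roughly as follows. This is a simplified illustration, not the real implementation: the actual services are separate processes that the frontend calls over the network, whereas here they are in-memory stand-ins. All class and method names (`Frontend`, `InMemorySearchService`, `InMemoryMetadataService`, and so on) are hypothetical.

```python
class InMemorySearchService:
    """Stand-in for the search service: finds resource keys matching a term."""

    def __init__(self, keys):
        self._keys = list(keys)

    def search(self, term):
        return [key for key in self._keys if term.lower() in key.lower()]


class InMemoryMetadataService:
    """Stand-in for the metadata service: stores editable descriptions."""

    def __init__(self, descriptions):
        self._descriptions = dict(descriptions)

    def get_description(self, key):
        return self._descriptions[key]

    def set_description(self, key, text):
        self._descriptions[key] = text


class Frontend:
    """Delegates search to one backend and metadata reads/writes to another."""

    def __init__(self, search_service, metadata_service):
        self._search = search_service
        self._metadata = metadata_service

    def search(self, term):
        return self._search.search(term)

    def view(self, key):
        return {"key": key, "description": self._metadata.get_description(key)}

    def edit(self, key, text):
        self._metadata.set_description(key, text)


frontend = Frontend(
    InMemorySearchService(["core.rides", "core.drivers"]),
    InMemoryMetadataService({
        "core.rides": "One row per ride.",
        "core.drivers": "One row per driver.",
    }),
)

hits = frontend.search("rides")                           # handled by the search backend
frontend.edit("core.rides", "One row per completed ride.")  # handled by the metadata backend
detail = frontend.view("core.rides")
```

Keeping the two backends behind separate objects mirrors the idea that search and metadata can be deployed and scaled independently of the frontend.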
For information about Amundsen and our other services, visit the main repository README.md. Please also see our instructions for a quick start setup of Amundsen with dummy data, and an overview of the architecture.
- Python >= 3.5
- Node = v8.x.x or v10.x.x (v11.x.x has compatibility issues)
- npm >= 6.x.x
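If you want to check these requirements from a script, something along these lines may help. It is only a sketch: the helper names (`parse_version`, `python_ok`, `node_ok`, `npm_ok`) are made up for this example, and it assumes you pass in the version strings yourself, for instance the output of `node --version` and `npm --version`.

```python
import re
import sys


def parse_version(text):
    """Turn a string such as 'v10.15.3' or '6.4.1' into a tuple of ints."""
    match = re.match(r"v?(\d+)\.(\d+)\.(\d+)", text.strip())
    if match is None:
        raise ValueError("unrecognized version string: {!r}".format(text))
    return tuple(int(part) for part in match.groups())


def python_ok(version_info=sys.version_info):
    # Python >= 3.5
    return tuple(version_info[:2]) >= (3, 5)


def node_ok(version_text):
    # Only the v8 and v10 release lines are listed as supported.
    return parse_version(version_text)[0] in (8, 10)


def npm_ok(version_text):
    # npm >= 6.x.x
    return parse_version(version_text)[0] >= 6
```

For example, `node_ok("v10.15.3")` is true, while `node_ok("v11.0.0")` is false because v11 is listed as having compatibility issues.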
Please note that the mock images are for demonstration purposes only.
- Landing Page: The landing page for Amundsen, including (1) search bars and (2) popular tables.
- Table Detail Page: Visualization of a Hive / Redshift table.
- Column detail: Visualization of the columns of a Hive / Redshift table, including an optional stats display.
- Data Preview Page: Visualization of a table data preview, which could integrate with Apache Superset.
Please visit the Installation guideline to learn how to install Amundsen.
Please visit the Configuration doc to learn how to configure Amundsen's various environment settings (local vs. production).
Please visit Developer guidelines if you want to build Amundsen in your local environment.