Evaluate tooling options of user portal #3972

Closed
bendnorman opened this issue Nov 22, 2024 · 1 comment


bendnorman commented Nov 22, 2024

Background

We conducted a few Superset user tests (#3855) and compiled the feedback in this Google doc. We agreed Superset mostly meets our requirements and is an improvement on Datasette, but we identified a few concerns:

  • Superset is not a super customizable tool. If we wanted to make UI or feature changes in the future, we'd need to fork the repo and get familiar with the sprawling JavaScript codebase. We're generally worried we'll outgrow the features and design goals of Superset.
  • Permissions: Superset meets our core requirement, which is to allow people to easily discover PUDL tables and download subsets of the data using a UI. We were hoping to also allow users to create their own dashboards and charts, given Superset is a full BI tool. However, we realized that all users can see all saved charts and dashboards. This isn't ideal from a privacy, reputation and UX perspective. We would probably need to disable dashboard and chart making.
  • Programmatically creating table dashboards is not well supported. I was able to use the Superset dashboard YAML files and the API to write a script that programmatically creates dashboards for all our tables. However, the process for changing the template dashboard and applying those changes to all the table dashboards is cumbersome. We'd have to adjust the template dashboard in Superset, download the YAML config file, figure out which pieces of the YAML file need to be parameterized, delete all the old table dashboards and then rerun the script.
  • Superset doesn't provide many features for making the data more discoverable. However, we managed to include the data dictionary in the welcome dashboard.
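
To make the template workflow in the third concern more concrete, here is a minimal sketch of rendering one parameterized config per table. It assumes a hypothetical, heavily simplified template: the keys (`dashboard_title`, `slug`, `description`) and the `render_dashboards` helper are illustrative only and are not Superset's actual export schema or the script mentioned above.

```python
from string import Template

# Hypothetical, simplified dashboard template. A real Superset export has
# many more fields; these keys are placeholders for illustration.
DASHBOARD_TEMPLATE = Template(
    "dashboard_title: $table_name\n"
    "slug: $slug\n"
    "description: Preview and download for $table_name\n"
)


def render_dashboards(table_names):
    """Return a mapping of table name to rendered dashboard config text."""
    rendered = {}
    for name in table_names:
        rendered[name] = DASHBOARD_TEMPLATE.substitute(
            table_name=name,
            slug=name.replace("_", "-"),
        )
    return rendered


if __name__ == "__main__":
    # Table names here are made up for the example.
    configs = render_dashboards(["example_table_a", "example_table_b"])
    print(configs["example_table_a"])
```

The cumbersome part described above is that every change to the template means re-deriving the parameterized pieces by hand and then regenerating and replacing every table dashboard.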

Generally, it feels like we're forcing Superset to be an open data portal when it's really designed to be a great BI tool for internal data. Therefore we're going to do another pass at tool research before moving forward. That being said, Superset without letting users create their own dashboards and charts is still a solid backup option for improving our data delivery.

Requirements

We started this project at the end of June 2024. Since then, we've talked to a ton of energy data practitioners, which changed our original requirements for this tool. Here are our requirements for this current research:

Must haves

  • Functionalities
    • Searchable list of all available tables
    • Access to data that’s currently not in SQLite (current Parquet files)
    • Display of table and column level descriptions/metadata
    • Filtering data and downloading as a CSV
    • Table preview
    • Updates when we update our data
    • Allow Catalyst to create visualizations and share them publicly
  • Metrics
    • Collect user emails + collect metrics on table usage (not necessarily connected)
    • User self-registration (vs. us adding people manually) + recording user emails
  • Infrastructure
    • <5s for query response when there are 15 concurrent users
    • Operable by multiple Catalyst team members
    • Manage the UI as code so we don’t have to deal with a UI and we can version control the changes
  • Operations
    • Ability to have stable costs (e.g., by limiting usage)
    • We aren’t on the hook for 24/7 on-call maintenance - i.e. we are OK with it going down
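
One way to check the "<5s with 15 concurrent users" requirement during evaluation is a small timing harness. This is a rough sketch, assuming each candidate tool can be exercised through some callable `run_query` (hypothetical here; the sleep below is a stand-in, not a real portal query). It only measures one simultaneous burst, not sustained load.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def measure_latencies(run_query, n_users=15):
    """Call `run_query` once per simulated user, concurrently, timing each call."""

    def timed_call(_):
        start = time.perf_counter()
        run_query()
        return time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=n_users) as pool:
        return list(pool.map(timed_call, range(n_users)))


def meets_target(latencies, limit_seconds=5.0):
    """True when every simulated user got a response within the limit."""
    return max(latencies) < limit_seconds


if __name__ == "__main__":
    # Stand-in for a real query against a candidate tool (hypothetical).
    def fake_query():
        time.sleep(0.05)

    results = measure_latencies(fake_query)
    print(f"slowest response: {max(results):.3f}s, ok={meets_target(results)}")
```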

Nice to have

  1. Users have the ability to make charts
  2. Ability to customize / configure the UI - e.g., make a dashboard, add explanatory text, having some modularity
  3. Correlating users and their usage
  4. Ability to create limited public view
  5. Users can export and/or visualizations

UI Requirements

These feel like sub-requirements but I think are important to include:

  • Can add and remove columns. During testing some users felt overwhelmed when scrolling through our tables with hundreds of columns.
  • Filter using a dropdown menu with autocomplete. During testing some users were frustrated by the lack of autocomplete in filter fields.
  • Can filter on any column
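
The three UI requirements above boil down to a few small operations on tabular data. The sketch below shows them on in-memory rows using only the standard library; the function names and sample columns are made up for illustration and say nothing about how any of the candidate tools implement these features.

```python
import csv
import io


def suggest_values(rows, column, prefix, limit=10):
    """Autocomplete: distinct values in `column` that start with `prefix`."""
    prefix = prefix.lower()
    matches = {
        str(row[column])
        for row in rows
        if str(row[column]).lower().startswith(prefix)
    }
    return sorted(matches)[:limit]


def filter_rows(rows, filters):
    """Keep rows that match every column/value pair in `filters`."""
    return [
        row for row in rows
        if all(row[col] == value for col, value in filters.items())
    ]


def to_csv(rows, columns):
    """Serialize only the chosen columns to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=columns, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Made-up sample rows, not real PUDL data.
    sample = [
        {"state": "CO", "fuel": "gas", "capacity_mw": 100},
        {"state": "CA", "fuel": "solar", "capacity_mw": 50},
        {"state": "CO", "fuel": "wind", "capacity_mw": 75},
    ]
    print(suggest_values(sample, "state", "c"))
    print(to_csv(filter_rows(sample, {"state": "CO"}), ["state", "fuel"]))
```

Dropping columns at export time (`extrasaction="ignore"`) mirrors the "add and remove columns" requirement: the user picks a subset and everything else is left out of the download.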

Tools to explore

BI/portal tools

  • Metabase
  • Redash
  • CKAN
  • ... (please add more here)

Python web tools

  • Dash
  • Reflex
  • NiceGUI
  • Taipy
  • ... (please add more here)

Parquet front ends

  • Perspective
  • ... (please add more here)

Process

  1. Agree on the requirements listed above
  2. Agree on a list of tools to explore
  3. Synchronously evaluate the tools on the requirements

@bendnorman moved this from New to Backlog in Catalyst Megaproject Nov 22, 2024
@bendnorman moved this from Backlog to In progress in Catalyst Megaproject Nov 27, 2024
@jdangerx moved this from In progress to Done in Catalyst Megaproject Dec 17, 2024
@jdangerx closed this as completed by moving to Done in Catalyst Megaproject Dec 17, 2024

@jdangerx

Decision made to spend a little time on the homegrown user portal to see where we can go with that, and then weigh that against Superset.
