Expose issues in Parquet format via datasette #105

Open
Ben-Hodgkiss opened this issue Oct 18, 2024 · 1 comment

Ben-Hodgkiss commented Oct 18, 2024

Overview

Expose issues in Parquet format via datasette

Background
Following the design proposal for an internal API, we would like to prove some technology choices, including the use of FastAPI with DuckDB accessing Parquet on S3.

This work was identified during the spike on API design.

By exposing the issues in Parquet format via datasette, we will learn whether Parquet is an appropriate format for consumption via datasette as well as via a new internal API.

@ssadhu-sl did a spike on using Parquet in datasette here, which contains a fork of the datasette Parquet plugin that we can start with.
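
For context, the "DuckDB accessing Parquet on S3" choice mentioned above could look roughly like the sketch below. This is only an illustration, not the agreed design: the function names, the bucket URL in the usage note, and the row limit are all hypothetical, and S3 credentials and region would still need to be configured for a private bucket.

```python
def build_issue_query(parquet_url: str, limit: int = 10) -> str:
    """Return DuckDB SQL that reads rows from a Parquet file (hypothetical helper)."""
    if limit < 1:
        raise ValueError("limit must be positive")
    # Escape single quotes so the URL is a safe SQL string literal.
    safe_url = parquet_url.replace("'", "''")
    return f"SELECT * FROM read_parquet('{safe_url}') LIMIT {int(limit)}"


def fetch_issues(parquet_url: str, limit: int = 10):
    """Run the query with DuckDB; requires the `duckdb` package to be installed."""
    import duckdb  # imported lazily so the query builder works without it

    con = duckdb.connect()  # in-memory database
    # The httpfs extension lets DuckDB read s3:// and https:// URLs.
    con.execute("INSTALL httpfs")
    con.execute("LOAD httpfs")
    return con.execute(build_issue_query(parquet_url, limit)).fetchall()
```

Something like `fetch_issues("s3://example-bucket/issue.parquet")` (a made-up path) would then return the first rows as tuples, which a FastAPI route could serialise.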

Tech Approach

  • Add Parquet plugin to datasette
  • Modify Parquet plugin as necessary to work alongside our existing datasette configuration
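
As a rough starting point for the plugin step, the sketch below generates the metadata fragment that would point the plugin at a directory of Parquet files. It assumes the configuration shape described in the upstream datasette-parquet README (a `plugins` entry keyed by database name with a `directory` setting); the fork from the spike may differ, and the database name and path here are hypothetical.

```python
import json


def parquet_plugin_metadata(db_name: str, directory: str) -> dict:
    """Build a Datasette metadata fragment for the Parquet plugin.

    ASSUMPTION: follows the upstream datasette-parquet README layout;
    check the forked plugin before relying on these keys.
    """
    return {
        "plugins": {
            "datasette-parquet": {
                # db_name becomes the database name shown in Datasette.
                db_name: {"directory": directory}
            }
        }
    }


if __name__ == "__main__":
    # Print a fragment that could be merged into our existing metadata.json.
    print(json.dumps(parquet_plugin_metadata("issues", "/data/issue"), indent=2))
```

Generating the fragment rather than hand-editing it should make it easier to merge into the existing datasette configuration without clobbering other plugin settings.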

Acceptance Criteria/Tests

  • Code merged within datasette repo
  • Datasette exposes issues in Parquet format in the production environment

Ticket Management - DELETE this section once completed

  • Complete all relevant tags - make sure Infrastructure is tagged so it is picked up by our filters!
  • Complete the time estimate field
  • Make sure you have a PR link in the Overview above.
  • If relevant, link to the relevant OKR as an attachment.
  • Link to any tickets in other boards that are dependent on it.
@Ben-Hodgkiss Ben-Hodgkiss converted this from a draft issue Oct 18, 2024
@Ben-Hodgkiss Ben-Hodgkiss moved this from Refine, Prioritise & Plan to Backlog in Infrastructure Oct 22, 2024
@Ben-Hodgkiss

@CarlosCoelhoSL - when picking this up, please liaise with @cpcundill about work he's already done on #102 as it may have done some of the ticket already. Also worth speaking with @eveleighoj to see if we can link the new "Expectations" work on this ticket into Parquet and Datasette at the same time.

@CarlosCoelhoSL CarlosCoelhoSL self-assigned this Nov 12, 2024
Projects
Status: In Development