Simplify User DataSource #82

Open
mdr223 opened this issue Jan 27, 2025 · 0 comments
Assignees: mdr223
Labels: enhancement (New feature or request)

Comments

Collaborator

mdr223 commented Jan 27, 2025

Writing a custom user DataSource in PZ is too difficult. It is even more complicated for validation DataSources, which are needed for sentinel (optimized) execution.

Users should only need to override a single get_item (or __iter__) method which returns or yields the outputs of the DataSource. In my opinion, users should not need to know how to construct (or use) PZ's DataRecord class.

Instead, my proposal is the following:

  • users implement the get_item method, which takes an integer idx and returns a dict
  • the dict keys are the field names for the record to be returned (which must match those in the DataSource schema); the values are the values for those fields
  • PZ takes the output dict and converts it into a DataRecord on behalf of the user
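To make the proposal concrete, here is a rough sketch of how the split between user code and framework code could look. Everything in it is hypothetical: SketchDataSource, SketchRecord, get_record, and the tuple-of-names schema are stand-ins invented for illustration, not PZ's actual classes or API.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class SketchRecord:
    """Hypothetical stand-in for PZ's DataRecord; not the real class."""
    fields: dict[str, Any]
    source_idx: int


class SketchDataSource:
    """Hypothetical base class: users override only __len__ and get_item."""

    # Field names the source promises to return (stand-in for a PZ schema).
    schema: tuple[str, ...] = ()

    def __len__(self) -> int:
        raise NotImplementedError

    def get_item(self, idx: int) -> dict[str, Any]:
        raise NotImplementedError

    def get_record(self, idx: int) -> SketchRecord:
        """Framework side: validate the user's dict and wrap it in a record."""
        item = self.get_item(idx)
        if set(item) != set(self.schema):
            raise ValueError(
                f"item {idx} has fields {sorted(item)}, expected {sorted(self.schema)}"
            )
        return SketchRecord(fields=dict(item), source_idx=idx)

    def __iter__(self):
        # Iteration is derived from get_item, so users never write it.
        for idx in range(len(self)):
            yield self.get_record(idx)


class EmailSource(SketchDataSource):
    """User side: plain dicts only, no record construction."""
    schema = ("sender", "subject")

    def __init__(self, emails):
        self.emails = emails

    def __len__(self):
        return len(self.emails)

    def get_item(self, idx):
        sender, subject = self.emails[idx]
        return {"sender": sender, "subject": subject}
```

With this shape, iterating over EmailSource([("a@x.org", "hi")]) produces wrapped records, while the user-written subclass only ever deals with plain dicts. A dict whose keys do not match the schema fails early with a ValueError.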

A further step could be to eliminate the need for users to specify a Schema at all; I will explore this option as part of the work on this issue.
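One possible direction for dropping the explicit Schema, sketched under the assumption that field names and types can be read off the first dict a source returns. The helpers infer_schema and check_against are hypothetical names for illustration and do not exist in PZ.

```python
from typing import Any


def infer_schema(item: dict[str, Any]) -> dict[str, type]:
    """Map each field name to the Python type of its value in a sample item."""
    return {name: type(value) for name, value in item.items()}


def check_against(schema: dict[str, type], item: dict[str, Any]) -> bool:
    """Return True if the item has the same fields with matching value types."""
    return item.keys() == schema.keys() and all(
        isinstance(item[name], expected) for name, expected in schema.items()
    )
```

The inferred schema would come from the item at index 0 and later items would be checked against it. This only captures names and Python types, so anything richer (such as field descriptions) would still need to be supplied some other way.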

@mdr223 mdr223 added the enhancement New feature or request label Jan 27, 2025
@mdr223 mdr223 self-assigned this Jan 27, 2025