Partition github api related dagster assets #2415

Open
ravenac95 opened this issue Oct 25, 2024 · 0 comments
Describe the feature you'd like to request

Currently, ossd__repositories and ossd__sbom each have a long queue to work through before they are complete. If an error occurs, processing starts again at the top of the queue instead of resuming where it left off. We should partition the projects dataframe that is used as input to both of these assets so that the process can be checkpointed.

Describe the solution you'd like

Split the projects dataframe into partitions.
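
A rough sketch of what this could look like, not a proposed implementation. It uses only the standard library, with a list of dicts standing in for the projects dataframe. The `project_id` column, the partition count, and all function names are assumptions made up for illustration. In Dagster, the resulting keys could be fed to something like `StaticPartitionsDefinition`, and the `completed` set here is only a stand-in for whatever per-partition materialization state Dagster tracks.

```python
import hashlib
from collections import defaultdict

NUM_PARTITIONS = 8  # assumption: tune to queue length and API rate limits


def partition_key_for(project_id: str, num_partitions: int = NUM_PARTITIONS) -> str:
    """Map a project to a stable partition key using a deterministic hash."""
    digest = hashlib.sha256(project_id.encode("utf-8")).hexdigest()
    return f"part_{int(digest, 16) % num_partitions:02d}"


def split_projects(projects, num_partitions: int = NUM_PARTITIONS):
    """Group project rows (dicts with a hypothetical 'project_id') by partition key."""
    buckets = defaultdict(list)
    for row in projects:
        key = partition_key_for(row["project_id"], num_partitions)
        buckets[key].append(row)
    return dict(buckets)


def run_with_checkpoints(buckets, process, completed):
    """Process each partition once, skipping keys already in `completed`.

    If `process` raises, partitions finished so far stay recorded in
    `completed`, so a retry resumes instead of starting from the top.
    """
    for key in sorted(buckets):
        if key in completed:
            continue
        process(buckets[key])  # may raise, e.g. on a GitHub API error
        completed.add(key)
    return completed
```

Hashing the project identifier, rather than slicing by row position, keeps a project in the same partition between runs even if the dataframe order changes, as long as the partition count stays fixed.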

Describe alternatives you've considered

If this doesn't work as intended, we will need some kind of external state to control restarts.

Labels
None yet
Projects
Status: Backlog
Development

No branches or pull requests

2 participants