Integrate BigQuery Dataset #2

Open
3 tasks
import-pandas-as-numpy opened this issue Jun 25, 2023 · 0 comments
Labels
enhancement New feature or request

Comments

@import-pandas-as-numpy
Member

import-pandas-as-numpy commented Jun 25, 2023

Add BigQuery as a data source to replace RSS feeds for updated and new packages.

  • Establish an organizational Google Services account. This is required, but the queries should stay within the free tier.
  • Query the BigQuery dataset for the relevant metadata (package title, name, download URL).
  • Pass the appropriately structured results to the Dragonfly API for distribution to clients.
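
The query step could look roughly like the sketch below. It is only an illustration under several assumptions: that the dataset is the public PyPI table `bigquery-public-data.pypi.distribution_metadata`, that the `name`, `version`, `filename`, `path`, and `upload_time` columns are the ones we want, and that a download URL can be formed by prefixing `path` with `https://files.pythonhosted.org/`. The helper names (`build_query`, `to_record`, `fetch_recent`) are hypothetical, and the record shape is not the Dragonfly API's actual schema.

```python
# Assumed table and download host; neither is confirmed by this issue.
PYPI_TABLE = "bigquery-public-data.pypi.distribution_metadata"
FILES_HOST = "https://files.pythonhosted.org/"


def build_query(lookback_hours: int) -> str:
    """Return SQL selecting distributions uploaded in the last N hours."""
    hours = int(lookback_hours)  # only an integer ever reaches the SQL text
    if hours <= 0:
        raise ValueError("lookback_hours must be positive")
    # Selecting only the needed columns keeps the bytes scanned down.
    return (
        "SELECT name, version, filename, path, upload_time "
        f"FROM `{PYPI_TABLE}` "
        "WHERE upload_time >= "
        f"TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL {hours} HOUR) "
        "ORDER BY upload_time"
    )


def to_record(row) -> dict:
    """Convert one result row (anything indexable by column name) to a dict."""
    return {
        "name": row["name"],
        "version": row["version"],
        "filename": row["filename"],
        "download_url": FILES_HOST + row["path"],
    }


def fetch_recent(client, lookback_hours: int = 1) -> list[dict]:
    """Run the query using a google.cloud.bigquery.Client-like object."""
    rows = client.query(build_query(lookback_hours)).result()
    return [to_record(row) for row in rows]
```

With the `google-cloud-bigquery` package installed and credentials configured, `fetch_recent(bigquery.Client())` would return one record per distribution file uploaded in the last hour.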

This will spawn a follow-up issue to link distributions together. The BigQuery dataset tracks each distribution as a separate row, but the current models expect all distributions of a package together, so that malicious behavior present in only one distribution of a package is not missed.
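
As a rough idea of what linking distributions might involve, the sketch below groups per-distribution rows by package name and version. The record keys (`name`, `version`, `download_url`) and the function name `group_distributions` are hypothetical and only meant to illustrate the grouping; the real models may need a different shape.

```python
from collections import defaultdict


def group_distributions(records):
    """Group per-distribution records into one entry per (name, version)."""
    grouped = defaultdict(list)
    for record in records:
        key = (record["name"], record["version"])
        grouped[key].append(record["download_url"])
    return dict(grouped)


# Hypothetical rows: one sdist and one wheel of the same release.
rows = [
    {"name": "example", "version": "1.0", "download_url": "https://files.example/example-1.0.tar.gz"},
    {"name": "example", "version": "1.0", "download_url": "https://files.example/example-1.0-py3-none-any.whl"},
    {"name": "other", "version": "0.2", "download_url": "https://files.example/other-0.2.tar.gz"},
]
packages = group_distributions(rows)
```

Here `packages[("example", "1.0")]` holds both download URLs, so a client could scan every distribution of that release together.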

Projects
Status: 📋 Backlog
Development

No branches or pull requests

1 participant