Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OP: High level search on dependencies of past RF projects #2276

Closed
ccerv1 opened this issue Sep 30, 2024 · 4 comments
Closed

OP: High level search on dependencies of past RF projects #2276

ccerv1 opened this issue Sep 30, 2024 · 4 comments
Assignees
Labels
c:analytics Analytics and metrics

Comments

@ccerv1
Copy link
Member

ccerv1 commented Sep 30, 2024

What would you like to see? Please provide a clear and concise description of what you want

From Jonas:

What would be the lift to put together a v2 of the "downstream impact" model for packages that you put together a while back? Link here. Just having up to date data with all the new Superchain projects that are in OSO would be great. This could give a good basis to reason about parts of the RF Dev tooling round

Is there any additional context you can share? Feel free to include notes or screenshots that can help us research and track the issue.

See also #31

Is there a write-up or doc associated with this issue? Please link to it.

No response

@ccerv1 ccerv1 self-assigned this Sep 30, 2024
@github-project-automation github-project-automation bot moved this to Backlog in OSO Sep 30, 2024
@ccerv1 ccerv1 added the c:analytics Analytics and metrics label Sep 30, 2024
@ccerv1
Copy link
Member Author

ccerv1 commented Oct 3, 2024

OK, this is super cool! And it's working decently for NPM packages at least.

See results here:
https://docs.google.com/spreadsheets/d/1iLETROqdcGtZMBzGB2CwDnvJvZo_RP85GNDF6kFVeDo/edit?usp=sharing

First, I run a basic query to get the current snapshot of dependencies:

SELECT
  `SnapshotAt` AS `time`,
  `System` as event_source,
  `Name` AS from_artifact_namespace,
  CONCAT(`Name`, "_", `Version`) as from_artifact_name,
  `Dependency`.`Name` AS to_artifact_namespace,
  CONCAT(`Dependency`.`Name`, "_", `Dependency`.`Version`) as to_artifact_name,    
FROM `bigquery-public-data.deps_dev_v1.Dependencies`
WHERE
  `MinimumDepth` = 1  -- direct deps only
  AND TIMESTAMP_TRUNC(`SnapshotAt`, DAY) > TIMESTAMP("2024-09-18") -- arbitrary cutoff date
  AND UPPER(`System`) = 'NPM'

(i save this into a temp table, since it's a huge dataset)

Then I run the following:


with repos as (
  select distinct
    pbc.project_name,
    concat(rm.artifact_namespace, '/', rm.artifact_name) as repo_name
  from `oso.projects_by_collection_v1` pbc
  join `oso.int_repo_metrics_by_project` rm
    on pbc.project_id = rm.project_id
  where pbc.collection_name = 'optimism'
), 
packages as (
  select
    replace(to_artifact_namespace, '@', '') AS dependency,
    replace(from_artifact_namespace, '@', '') AS dependent_package
  from `static_data_sources.deps_dev_playground`
)

select distinct
  repos.project_name,
  packages.dependent_package,
  packages.dependency
from repos
join packages on repos.repo_name = packages.dependent_package
order by project_name

@ccerv1 ccerv1 closed this as completed Oct 3, 2024
@github-project-automation github-project-automation bot moved this from In Progress to Done in OSO Oct 3, 2024
@JSeiferth
Copy link

@ccerv1 on the v0 results, how many onchain projects were included?
Anything about the results that suprised you?

@ccerv1 ccerv1 reopened this Oct 9, 2024
@github-project-automation github-project-automation bot moved this from Done to Needs Review in OSO Oct 9, 2024
@ccerv1
Copy link
Member Author

ccerv1 commented Oct 9, 2024

@Jabolol have a look at my project matching logic above. What would you do differently?

@ccerv1
Copy link
Member Author

ccerv1 commented Oct 14, 2024

Closing out again in favor of #2338

@ccerv1 ccerv1 closed this as completed Oct 14, 2024
@github-project-automation github-project-automation bot moved this from Needs Review to Done in OSO Oct 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c:analytics Analytics and metrics
Projects
Status: Done
Development

No branches or pull requests

2 participants