Curate contributors from *all* repos across intermine #17

Open
yochannah opened this issue Aug 14, 2019 · 3 comments

@yochannah
Member

Mini application wishlist: we could use the GitHub API to curate the contributors from all of the InterMine GitHub repos and output them in the JSON format used by http://intermine.org/contributors/ (sample JSON: https://raw.githubusercontent.com/intermine/intermine-homepage-2017/master/content/contributors/community.md, which is manually curated at the moment).

If this needs a GitHub token, it might be a good idea to start with a tiny Node app similar to https://github.com/yochannah/first-ticket-finder

@Iamhomkar

Hello, I would like to work on this issue. I went through the GitHub API and have come up with two ways of implementing it.

  • GitHub Webhooks
    GitHub provides a PullRequestEvent under its webhooks. We could register a POST URL at server startup so that our server is notified when a PR is merged, and retrieve the contributor's profile from that event. For storage, we could use node-cache to cache the contributors, with a Set keyed on email ID to identify contributors uniquely. The cache would be updated on (possibly) every PR by a new user.

  • GitHub API
    The other way is to retrieve all repos under intermine with one API call and then get all contributors for each of those repos. In this case we could clear the cache once a month or every 15 days and repopulate it from scratch with the above two API calls, much like a cron job. Both calls (repos for an org and contributors per repo) are paginated, with up to 100 entries per page.
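
A minimal sketch of the API approach, in TypeScript for a small Node app. It is an illustration under assumptions, not an agreed design: `PageFetcher`, `fetchAllPages` and `collectContributors` are hypothetical names, the HTTP call is injected so the crawl can run offline, and contributors are keyed on their GitHub `login`. The two endpoints used are `GET /orgs/{org}/repos` and `GET /repos/{owner}/{repo}/contributors`. The JSON layout used by the contributors page is not reproduced here.

```typescript
// Only the fields this sketch reads from the two GitHub REST responses.
interface Repo { name: string; }
interface Contributor { login: string; html_url: string; contributions: number; }

// Hypothetical injection point: returns the parsed JSON array for one page.
// A real fetcher would send the token and return [] for an empty (204) reply.
type PageFetcher = (url: string) => Promise<unknown[]>;

const API = "https://api.github.com";
const PER_PAGE = 100; // largest page size the API accepts

// Keep requesting pages until one comes back shorter than PER_PAGE.
async function fetchAllPages<T>(baseUrl: string, fetchPage: PageFetcher): Promise<T[]> {
  const items: T[] = [];
  for (let page = 1; ; page++) {
    const batch = (await fetchPage(`${baseUrl}?per_page=${PER_PAGE}&page=${page}`)) as T[];
    items.push(...batch);
    if (batch.length < PER_PAGE) return items;
  }
}

// List every repo in the org, then merge the contributors of each repo,
// summing contribution counts for people who appear in several repos.
async function collectContributors(org: string, fetchPage: PageFetcher): Promise<Contributor[]> {
  const repos = await fetchAllPages<Repo>(`${API}/orgs/${org}/repos`, fetchPage);
  const byLogin = new Map<string, Contributor>();
  for (const repo of repos) {
    const people = await fetchAllPages<Contributor>(
      `${API}/repos/${org}/${repo.name}/contributors`, fetchPage);
    for (const person of people) {
      const seen = byLogin.get(person.login);
      byLogin.set(person.login, seen
        ? { ...seen, contributions: seen.contributions + person.contributions }
        : { ...person });
    }
  }
  // Most active contributors first.
  return Array.from(byLogin.values()).sort((a, b) => b.contributions - a.contributions);
}
```

A scheduled job could call `collectContributors("intermine", realFetcher)` once a month and replace the cached list with the result, where `realFetcher` is whatever HTTP wrapper the app ends up using.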

Can you suggest how to proceed?

@yochannah
Member Author

@Omkar-Halikar Wow, you've really thought this through beautifully! I think the API method is perhaps more likely to be useful, as it will find all contributors, whereas the PR one will find new contributors only. Sound good?

I've created an empty repo and given you write access: https://github.com/intermine/cross-repo-contributor-list - if you have any questions at all, mention @yochannah, tweet @yoyehudi, pop by to say hi on chat, or, if needed, email [email protected].

@Iamhomkar

Thanks for pointing out the issue with the PR approach. I will keep you posted.

2 participants