Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generate contributor SVG widgets #4

Merged
merged 4 commits into from
Aug 7, 2019
Merged

Generate contributor SVG widgets #4

merged 4 commits into from
Aug 7, 2019

Conversation

Beanow
Copy link
Member

@Beanow Beanow commented Jul 15, 2019

This PR adds an early version of the SourceCred widgets.
This widgets tool is intended for various styles of "widgets" like markdown tables or images like this SVG, see sourcecred/sourcecred#1047.

Specifically it will generate an SVG with the all-time top 50 contributors over 4.5 cred (based on the new timeline calculation), which we can use for placing in READMEs.

The order of avatars is descending by cred, taking into account our weights defined at weights.toml

Example from sfosc/sfosc today:
example

This file would be made available at
https://sfosc.org/sourcecred/widgets/sfosc-sfosc-contributors.svg
And update along with the scores on a daily cronjob, so we can include this file from the README.md

@Beanow
Copy link
Member Author

Beanow commented Jul 18, 2019

@vsoch @nothingismagick are you able to help review this?

@vsoch
Copy link

vsoch commented Jul 18, 2019

yep!

Copy link

@vsoch vsoch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After this first round of comments, I'd be happy to test this out.

@@ -0,0 +1,8 @@
# THIS IS AN AUTOGENERATED FILE. DO NOT EDIT THIS FILE DIRECTLY.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to include the yarn lock file? At least for jekyll, I usually don't add the Gemfile.lock because it adds more issues than helps.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of the time for node projects yes you should. Because the package.json dependencies allow you to install newer semver feature or patch updates at random.

  "dependencies": {
    "toml": "^3.0.0"
  }

(^3.0.0 matches toml 3.1.3 just fine)

Committing a lock file ensures you have consistent environments between developers / CI jobs.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. Push the lockfile.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, 2 vs. 1 here! I should actually be advocating to include it, reproducibility etc.. Maybe this wouldn't be an issue if we interact via a container anyway.


export SOURCECRED_DIRECTORY=$(pwd)/sourcecred_data
SOURCECRED_GITHUB_TOKEN=`cat secrets/token`
REPOS=`cat repositories.txt`
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably check here that this secrets file exists.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And then of course exit if the variables aren't what we expect.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The local-* scripts are mostly debugging helpers. But you're right it would be useful for others if it's consistent and produces clear errors like this. Going to improve those a little.


cd mkweights
yarn
cd ..
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, so I guess my worry here is that there are quite a few dependencies. For example, not everyone has yarn. and node is a beast. Is this generated on a user computer, locally? I'm thinking either we can provide a Dockerfile (and container?) to issue commands with (and the one dependency is docker of course) or the generation can be done in CI that has a recipe to install everything. Thoughts?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, so I guess my worry here is that there are quite a few dependencies.

That is definitely true. I'm involved with a couple of discussions to make depending on sourcecred simpler. For example sourcecred/widgets#8

For example, not everyone has yarn. and node is a beast.

With regards to yarn. Sourcecred has a yarn lockfile committed but not the npm package-lock.json counterpart. Thus your dependencies would be yarn + npm + node if I had kept the npm lock variant. That's why I switch to yarn here.

Is this generated on a user computer, locally?

As local-* is intended for debugging it's useful to have node installed. And yarn is pretty stable so npm install -g yarn would be sufficient if you do have node and npm.

The official node docker container has yarn installed. In .drone.yml I use node:10 as that's the active LTS node version. Should you prefer avoiding node you might try using node:10 or node:12 (current stable and tested by sourcecred test suite) and including the repository root as a bind volume. Personally I've not tested the local-* with this yet. But it's essentially what our Drone build does.

I'm thinking either we can provide a Dockerfile (and container?) to issue commands with (and the one dependency is docker of course) or the generation can be done in CI that has a recipe to install everything. Thoughts?

That is what this PR will do. The .drone.yml file configures the job run on a daily interval. That calls scripts/rebuild-site.sh with node:10 as the main docker image to do so.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not worried about lots of dependencies per sé - but then I have all of these installed on every machine I work on.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If local is for debugging, and there is a configuration with Drone to run programatically otherwise, I'm good here. What we don't want is an entirely un-reproducible thing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes the main script used for CI is scripts/rebuild-site.sh
Which is also usable locally.

yarn install
yarn backend
for repo in $REPOS; do
SOURCECRED_GITHUB_TOKEN=$SOURCECRED_GITHUB_TOKEN node bin/sourcecred.js load $repo --weights ../.weights.json
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you walk me through what is going on here?

  • node mkweights create some original file, (and weights.toml is the configuration of weights?
  • yarn install installs, what does backend do?
  • then we iterate through the repos, and we issue a load command. Are the repos already included in weight.json (and we are loading) or are we adding data to weights.json for the repo?

Probably some comments in these sections would fit the bill, so others / future you can come back and remember what it does :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

node mkweights create some original file, (and weights.toml is the configuration of weights?

Yes. Sourcecred has a json weights format that is more verbose than the toml file format I'm using. I created the toml format as a more intuitive one to discuss. And lines 7-11 are to convert the toml format to the json format.

yarn install installs, what does backend do?

Yarn has a cli syntax yarn <script_name> so this is the equivalent of npm run <script_name>. In this case the script backend is defined here. Which will use webpack and babel to transliterate the sourcecred source code (using flow and esm) to javascript code that can be executed by node 10 or 12.

then we iterate through the repos, and we issue a load command. Are the repos already included in weight.json (and we are loading) or are we adding data to weights.json for the repo?

The load command fetches github data and runs the sourcecred scoring algorithm. We pass the weights arguments as configuration values. These weights essentially quantify: "How important is a comment? How important is a commit? How important is a PR?" ... etc.


That is the context to what these commands entail.

In broad stroke the workflow is:

  • Convert our simplified weights.toml to the weights.json format used by SourceCred.
  • yarn install + yarn backend to set up SourceCred.
  • sourcecred/bin/sourcecred.js load ... to fetch github data and score it using our weights.
  • sourcecred/bin/sourcecred.js scores ... to export those scores in a format we can consume.
  • widgets/bin/contributor-wall-svg.js to consume those scores and generate the SVG.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One final question here - just want to check that there is no kind of PI in the GitHub data?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is a good question. My assumption would be that because no special permissions are required to fetch user info it could only be public data. However many people people use their full name and an email and make it public. So that may be PI either way. I don't think they are stored, only the ID and username for graphs and avatars and username for widgets. Though I would need to check.

yarn
export SVG_MIN_CRED=4.5
export SVG_MAX_USERS=50
for repo in $REPOS; do
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is fairly intuitive!

@Beanow
Copy link
Member Author

Beanow commented Jul 18, 2019

Added the extra comments and checks to the local-* debugging scripts.

Copy link

@nothingismagick nothingismagick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like it - but still think that having more than just an avatar would be nice (here is an example from a webpack page):
Screen Shot 2019-07-19 at 10 07 34

I know that having the name is actually something that would belong to the sourcecred/widgets project - but wanted to note that here.

@@ -1,3 +1,6 @@
[submodule "sourcecred"]
path = sourcecred
url = https://github.com/Beanow/sourcecred.git
[submodule "widgets"]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With submodules we need to be explicit about how to clone / fork and install submodules in the documentation. Some people might not immediately know that they need to recurse:

$ git clone --recurse-submodules

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DEVELOPING.md does make mention of it recommending git submodule init > git submodule update.

There is an open issue to do away with the need for submodules upstream, or rather simplify depending on sourcecred: sourcecred/widgets#8

@@ -0,0 +1,8 @@
# THIS IS AN AUTOGENERATED FILE. DO NOT EDIT THIS FILE DIRECTLY.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. Push the lockfile.


cd mkweights
yarn
cd ..

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not worried about lots of dependencies per sé - but then I have all of these installed on every machine I work on.

@Beanow
Copy link
Member Author

Beanow commented Jul 19, 2019

@nothingismagick agreed, variants with more information would be useful and implemented in the widgets package.

Feedback welcome, as expanding on this would be great for the utility of the package. And achieving the example you gave should be fairly easy if I dig into the SVG spec a little bit.

For SFOSC perhaps we should take it up to an issue in the main repo to discuss what we would like to have?

@vsoch
Copy link

vsoch commented Jul 19, 2019

I personally like the square graphics (without names) but I'm open to have my mind changed! @nothingismagick what do you think having the names adds?

@nothingismagick
Copy link

It removes the cult of personality. And furthermore, there are people with no personally identifiable, so you end up with stuff like this:
quasar_contributors_demo

@vsoch
Copy link

vsoch commented Jul 19, 2019

I kind of like the grid, even with the missing pictures here and there :) It looks like a zoom in of one of those paintings / pictures that is made entirely of other pictures. Whoa! That would be an awesome integration for sourcecred - to make a picture based on matching pixels in some template image.

@Beanow
Copy link
Member Author

Beanow commented Aug 7, 2019

@vsoch your review is marked requesting changes. Was there a particular point I hadn't addressed yet?

Copy link

@vsoch vsoch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it looks good! I had a bunch of original comments, all of which you addressed, and I mostly just forgot to approve :) Approved!

@Beanow
Copy link
Member Author

Beanow commented Aug 7, 2019

Thanks!

I gave this another test run before merging and noticed the output CNAME file contained 'sfosc.org' instead of sfosc.org and fixed this.

@Beanow Beanow merged commit ffa8cad into master Aug 7, 2019
Beanow added a commit to Beanow/sfosc that referenced this pull request Aug 7, 2019
It's updated daily along with the sourcecred data.
See sfosc/sourcecred#4
@Beanow Beanow deleted the pr/contributors-svg branch August 22, 2019 10:48
Beanow added a commit to sfosc/sfosc that referenced this pull request Aug 27, 2019
It's updated daily along with the sourcecred data.
See sfosc/sourcecred#4
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants