Generate contributor SVG widgets #4

Merged
merged 4 commits into from
Aug 7, 2019
Changes from 2 commits
4 changes: 1 addition & 3 deletions .drone.yml
@@ -24,9 +24,7 @@ steps:
- mkdir -p ~/.ssh
- ssh-keyscan -H github.com >> ~/.ssh/known_hosts
- echo "$SSH_BOT_KEY" > ~/.ssh/id_rsa && chmod 0600 ~/.ssh/id_rsa
- mkdir site
- ./scripts/prepare_weights.sh
- sourcecred/scripts/build_static_site.sh --target ./site --repo sfosc/sfosc --repo sfosc/wizard --cname sfosc.org --weights "$(pwd)/.weights.json"
- ./scripts/rebuild-site.sh

- name: commit-and-push
image: docker:git
3 changes: 2 additions & 1 deletion .gitignore
@@ -1,3 +1,4 @@
/sourcecred_data
/secrets
.weights.json
/site
.weights.json
3 changes: 3 additions & 0 deletions .gitmodules
@@ -1,3 +1,6 @@
[submodule "sourcecred"]
path = sourcecred
url = https://github.com/Beanow/sourcecred.git
[submodule "widgets"]


With submodules we need to be explicit about how to clone / fork and install submodules in the documentation. Some people might not immediately know that they need to recurse:

$ git clone --recurse-submodules

Member Author

DEVELOPING.md does mention it, recommending git submodule init followed by git submodule update.

There is an open issue to do away with the need for submodules upstream, or rather simplify depending on sourcecred: sourcecred/widgets#8
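
For anyone who cloned without --recurse-submodules, a small guard can make the problem obvious. The following is only a sketch and not part of this PR: the helper names list_submodule_paths and missing_submodules are hypothetical, and it assumes the simple "path = ..." layout shown in the .gitmodules diff here.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of this PR): report submodule directories
# listed in a .gitmodules file that have not been checked out yet.

# Print every "path = ..." value found in a .gitmodules-style file.
list_submodule_paths() {
  sed -n 's/^[[:space:]]*path[[:space:]]*=[[:space:]]*//p' "$1"
}

# Print the submodule paths (relative to the root in $2) that are missing or empty.
missing_submodules() {
  gitmodules="$1"
  root="$2"
  list_submodule_paths "$gitmodules" | while IFS= read -r path; do
    if [ -z "$(ls -A "$root/$path" 2>/dev/null)" ]; then
      printf '%s\n' "$path"
    fi
  done
}

# Example call from the repository root:
#   missing_submodules .gitmodules .
```

If this prints anything, running git submodule update --init --recursive in the repository root should populate the missing directories.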

path = widgets
url = https://github.com/sourcecred/widgets.git
1 change: 1 addition & 0 deletions README.md
@@ -9,6 +9,7 @@ Currently looks at:

- https://github.com/sfosc/sfosc
- https://github.com/sfosc/wizard
- https://github.com/sfosc/sourcecred

## Weight configuration

12 changes: 0 additions & 12 deletions mkweights/package-lock.json

This file was deleted.

8 changes: 8 additions & 0 deletions mkweights/yarn.lock
@@ -0,0 +1,8 @@
# THIS IS AN AUTOGENERATED FILE. DO NOT EDIT THIS FILE DIRECTLY.

Do we want to include the yarn lock file? At least for Jekyll, I usually don't add the Gemfile.lock because it causes more problems than it solves.

Member Author

For Node projects, most of the time, yes, you should. The version ranges in package.json allow newer semver minor (feature) or patch releases to be installed whenever dependencies are resolved.

  "dependencies": {
    "toml": "^3.0.0"
  }

(^3.0.0 matches toml 3.1.3 just fine)

Committing a lock file ensures you have consistent environments between developers / CI jobs.
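
To make the range behaviour concrete, here is a rough sketch of how a caret range is matched. It is an approximation for illustration, not the resolver yarn or npm actually use: caret_matches is a hypothetical function, and it assumes plain MAJOR.MINOR.PATCH versions with a major version of 1 or higher and no pre-release tags.

```shell
#!/usr/bin/env bash
# Illustrative only: approximates how a caret range such as ^3.0.0 is matched.
# Assumes plain MAJOR.MINOR.PATCH versions with MAJOR >= 1 (no pre-release tags).

# caret_matches BASE CANDIDATE: succeed when CANDIDATE satisfies ^BASE.
caret_matches() {
  base="$1"
  candidate="$2"
  base_major="${base%%.*}"
  base_rest="${base#*.}"
  base_minor="${base_rest%%.*}"
  base_patch="${base_rest#*.}"
  cand_major="${candidate%%.*}"
  cand_rest="${candidate#*.}"
  cand_minor="${cand_rest%%.*}"
  cand_patch="${cand_rest#*.}"

  # A caret range never crosses a major version boundary.
  [ "$cand_major" -eq "$base_major" ] || return 1

  # Within the same major version, the candidate must not be older than the base.
  if [ "$cand_minor" -ne "$base_minor" ]; then
    [ "$cand_minor" -gt "$base_minor" ]
    return
  fi
  [ "$cand_patch" -ge "$base_patch" ]
}

caret_matches 3.0.0 3.1.3 && echo "3.1.3 satisfies ^3.0.0"
caret_matches 3.0.0 4.0.0 || echo "4.0.0 does not satisfy ^3.0.0"
```

With yarn.lock committed, CI can additionally run yarn install --frozen-lockfile, which fails rather than rewriting the lockfile when it no longer matches package.json.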


I agree. Push the lockfile.


Alright, 2 vs. 1 here! I should actually be advocating to include it, for reproducibility etc. Maybe this wouldn't be an issue if we interacted via a container anyway.

# yarn lockfile v1


toml@^3.0.0:
version "3.0.0"
resolved "https://registry.yarnpkg.com/toml/-/toml-3.0.0.tgz#342160f1af1904ec9d204d03a5d61222d762c5ee"
integrity sha512-y/mWCZinnvxjTKYhJ+pYxwD0mRLVvOtdS2Awbgxln6iEnt4rk0yBxeSBHkGJcPucRiG0e55mwWp+g/05rsrd6w==
3 changes: 3 additions & 0 deletions repositories.txt
@@ -0,0 +1,3 @@
sfosc/sfosc
sfosc/wizard
sfosc/sourcecred
9 changes: 5 additions & 4 deletions scripts/local-debug.sh
@@ -2,18 +2,19 @@

export SOURCECRED_DIRECTORY=$(pwd)/sourcecred_data
SOURCECRED_GITHUB_TOKEN=`cat secrets/token`
REPOS=`cat repositories.txt`

cd mkweights
npm i
yarn
cd ..

cat ./weights.toml | node mkweights > .weights.json

cd sourcecred
yarn install
yarn backend

SOURCECRED_GITHUB_TOKEN=$SOURCECRED_GITHUB_TOKEN node bin/sourcecred.js load sfosc/sfosc --weights ../.weights.json
SOURCECRED_GITHUB_TOKEN=$SOURCECRED_GITHUB_TOKEN node bin/sourcecred.js load sfosc/wizard --weights ../.weights.json
for repo in $REPOS; do
    SOURCECRED_GITHUB_TOKEN=$SOURCECRED_GITHUB_TOKEN node bin/sourcecred.js load $repo --weights ../.weights.json
done

yarn start
28 changes: 28 additions & 0 deletions scripts/local-generate-svg.sh
@@ -0,0 +1,28 @@
#!/usr/bin/env bash

export SOURCECRED_DIRECTORY=$(pwd)/sourcecred_data
SOURCECRED_GITHUB_TOKEN=`cat secrets/token`
REPOS=`cat repositories.txt`

We should probably check here that this secrets file exists.


And then of course exit if the variables aren't what we expect.

Member Author

The local-* scripts are mostly debugging helpers. But you're right, it would be useful for others if they were consistent and produced clear errors like this. I'm going to improve those a little.
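
One possible shape for that check, as a sketch only: read_token is a hypothetical helper (it is not in this PR) that refuses to continue when the secrets file is missing or empty, instead of passing an empty token on to SourceCred.

```shell
#!/usr/bin/env bash
# Sketch of the checks suggested above for the local-* scripts.
# read_token is a hypothetical helper, not code from this PR.

# Print the token stored in the given file. Fail with a clear message when the
# file is missing or contains only whitespace.
read_token() {
  token_file="$1"
  if [ ! -f "$token_file" ]; then
    printf >&2 'fatal: %s not found; create it with a GitHub token inside\n' "$token_file"
    return 1
  fi
  token="$(tr -d '[:space:]' < "$token_file")"
  if [ -z "$token" ]; then
    printf >&2 'fatal: %s is empty\n' "$token_file"
    return 1
  fi
  printf '%s\n' "$token"
}

# Intended use near the top of a local-* script:
#   SOURCECRED_GITHUB_TOKEN="$(read_token secrets/token)" || exit 1
```

The same pattern would apply to repositories.txt, so that both inputs fail early with a readable message.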


cd mkweights
yarn
cd ..

Hmm, so I guess my worry here is that there are quite a few dependencies. For example, not everyone has yarn, and node is a beast. Is this generated on a user computer, locally? I'm thinking either we can provide a Dockerfile (and container?) to issue commands with (and the one dependency is docker of course) or the generation can be done in CI that has a recipe to install everything. Thoughts?

Member Author

Hmm, so I guess my worry here is that there are quite a few dependencies.

That is definitely true. I'm involved with a couple of discussions to make depending on sourcecred simpler. For example sourcecred/widgets#8

For example, not everyone has yarn, and node is a beast.

Regarding yarn: SourceCred has a yarn lockfile committed but not the npm package-lock.json counterpart. If I had kept the npm lock variant, your dependencies would be yarn + npm + node. That's why I switched to yarn here.

Is this generated on a user computer, locally?

As local-* is intended for debugging, it's useful to have node installed. And yarn is pretty stable, so npm install -g yarn would be sufficient if you do have node and npm.

The official node docker image has yarn installed. In .drone.yml I use node:10 as that's the active LTS node version. Should you prefer to avoid installing node locally, you might try running the node:10 or node:12 image (current stable, and tested by the sourcecred test suite) with the repository root mounted as a bind volume. Personally I've not tested the local-* scripts this way yet, but it's essentially what our Drone build does.

I'm thinking either we can provide a Dockerfile (and container?) to issue commands with (and the one dependency is docker of course) or the generation can be done in CI that has a recipe to install everything. Thoughts?

That is what this PR will do. The .drone.yml file configures the job that runs on a daily interval. It calls scripts/rebuild-site.sh with node:10 as the main docker image.
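
For a local run without installing node, something along these lines might work. This is an untested sketch: the /work mount point and the docker_rebuild_command name are arbitrary choices, and it assumes SOURCECRED_GITHUB_TOKEN is exported in the calling shell. The function only prints the command so it can be reviewed before running it.

```shell
#!/usr/bin/env bash
# Untested sketch: compose a docker command that runs scripts/rebuild-site.sh
# inside the node:10 image with the repository root bind-mounted.
# The /work mount point and the function name are arbitrary choices.

docker_rebuild_command() {
  repo_root="$1"
  # -e SOURCECRED_GITHUB_TOKEN forwards the variable from the host environment.
  printf 'docker run --rm -e SOURCECRED_GITHUB_TOKEN -v %s:/work -w /work node:10 ./scripts/rebuild-site.sh\n' "$repo_root"
}

# Print the command for the current directory instead of running it.
docker_rebuild_command "$(pwd)"
```

Paths containing spaces would need extra quoting before the printed command is pasted into a shell.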


I am not worried about lots of dependencies per se, but then I have all of these installed on every machine I work on.


If local is for debugging, and there is a configuration with Drone to run programmatically otherwise, I'm good here. What we don't want is an entirely un-reproducible thing.

Member Author

Yes, the main script used for CI is scripts/rebuild-site.sh, which is also usable locally.


cat ./weights.toml | node mkweights > .weights.json

cd sourcecred
yarn install
yarn backend
for repo in $REPOS; do
SOURCECRED_GITHUB_TOKEN=$SOURCECRED_GITHUB_TOKEN node bin/sourcecred.js load $repo --weights ../.weights.json

Could you walk me through what is going on here?

  • node mkweights creates some original file (and weights.toml is the configuration of weights)?
  • yarn install installs; what does backend do?
  • then we iterate through the repos, and we issue a load command. Are the repos already included in weights.json (and we are loading), or are we adding data to weights.json for the repo?

Probably some comments in these sections would fit the bill, so others / future you can come back and remember what it does :)

Member Author

node mkweights creates some original file (and weights.toml is the configuration of weights)?

Yes. SourceCred has a JSON weights format that is more verbose than the TOML format I'm using. I created the TOML format as a more intuitive one to discuss, and lines 7-11 convert the TOML format to the JSON format.

yarn install installs; what does backend do?

Yarn has a CLI syntax yarn <script_name>, which is the equivalent of npm run <script_name>. In this case the script backend is defined here. It uses webpack and babel to transpile the sourcecred source code (written with flow and ESM) into JavaScript code that can be executed by node 10 or 12.

then we iterate through the repos, and we issue a load command. Are the repos already included in weights.json (and we are loading), or are we adding data to weights.json for the repo?

The load command fetches github data and runs the sourcecred scoring algorithm. We pass the weights arguments as configuration values. These weights essentially quantify: "How important is a comment? How important is a commit? How important is a PR?" ... etc.


That is the context for what these commands do.

In broad strokes, the workflow is:

  • Convert our simplified weights.toml to the weights.json format used by SourceCred.
  • yarn install + yarn backend to set up SourceCred.
  • sourcecred/bin/sourcecred.js load ... to fetch github data and score it using our weights.
  • sourcecred/bin/sourcecred.js scores ... to export those scores in a format we can consume.
  • widgets/bin/contributor-wall-svg.js to consume those scores and generate the SVG.
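
The steps above can be sketched as a dry run that only prints the commands in order, without fetching anything. plan_workflow is a hypothetical name, the relative paths are simplified compared to the real scripts, and the input is assumed to be a repositories.txt-style file with one owner/name entry per line.

```shell
#!/usr/bin/env bash
# Dry-run sketch of the workflow listed above: it only prints the commands,
# one per line, for each repository in a repositories.txt-style file.
# plan_workflow is a hypothetical name; blank lines in the file are skipped.

plan_workflow() {
  repos_file="$1"

  # Steps 1 and 2: convert the weights and build SourceCred once.
  echo "cat weights.toml | node mkweights > .weights.json"
  echo "(cd sourcecred && yarn install && yarn backend)"

  # Step 3: load (fetch and score) every repository.
  while IFS= read -r repo || [ -n "$repo" ]; do
    [ -n "$repo" ] || continue
    echo "node sourcecred/bin/sourcecred.js load $repo --weights .weights.json"
  done < "$repos_file"

  # Steps 4 and 5: export the scores and pipe them into the SVG generator.
  while IFS= read -r repo || [ -n "$repo" ]; do
    [ -n "$repo" ] || continue
    slug="$(printf '%s' "$repo" | tr '/' '-')"
    echo "node sourcecred/bin/sourcecred.js scores $repo | widgets/bin/contributor-wall-svg.js > ${slug}-contributors.svg"
  done < "$repos_file"
}

# Example: plan_workflow repositories.txt
```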


One final question here - just want to check that there is no kind of PI in the GitHub data?

Member Author

That is a good question. My assumption would be that, because no special permissions are required to fetch user info, it could only be public data. However, many people use their full name and an email address and make them public, so that may be PI either way. I don't think those are stored, only the ID and username for graphs, and avatars and username for widgets. Though I would need to check.

done
cd ..

cd widgets
yarn
export SVG_MIN_CRED=4.5
export SVG_MAX_USERS=50
for repo in $REPOS; do

This one is fairly intuitive!

echo "Generating ${repo//\//-}-contributors.svg"
node ../sourcecred/bin/sourcecred.js scores $repo | SOURCECRED_GITHUB_TOKEN=$SOURCECRED_GITHUB_TOKEN ./bin/contributor-wall-svg.js > "../${repo//\//-}-contributors.svg"
done
12 changes: 0 additions & 12 deletions scripts/prepare_weights.sh

This file was deleted.

79 changes: 79 additions & 0 deletions scripts/rebuild-site.sh
@@ -0,0 +1,79 @@
#!/usr/bin/env bash

CNAME="${CNAME:-sfosc.org}"
SVG_MIN_CRED=${SVG_MIN_CRED:-4.5}
SVG_MAX_USERS=${SVG_MAX_USERS:-50}

toplevel="$(git -C "$(dirname "$0")" rev-parse --show-toplevel)"
cd "${toplevel}"

die() {
    printf >&2 'fatal: %s\n' "$@"
    exit 1
}

# Check our dependencies.
[ -z "$(which node)" ] && die "Node must be installed and available in \$PATH"
[ -z "$(which yarn)" ] && die "Yarn must be installed and available in \$PATH"

# Make sure we have a token.
[ -z "${SOURCECRED_GITHUB_TOKEN}" ] && die "No SOURCECRED_GITHUB_TOKEN has been set."

# Find our repository list.
[ ! -e "repositories.txt" ] && die "A repositories.txt file is expected in the repository root."
REPOS="$(cat repositories.txt)"

# Rebuild weight overrides from root toml file.
WEIGHTS_OPT=""
[ -e ".weights.json" ] && rm .weights.json
if [ -e "weights.toml" ]; then
    echo "Converting weights.toml"
    cd mkweights
    yarn --production
    cd ..
    cat weights.toml | node mkweights > .weights.json
    # Absolute path: the load step below runs from the sourcecred directory.
    WEIGHTS_OPT="--weights ${toplevel}/.weights.json"
fi

# Rebuild sourcecred dependencies.
echo "Building SourceCred binaries."
cd "${toplevel}/sourcecred"
SOURCECRED_BIN="${toplevel}/sourcecred/bin"
yarn
yarn -s backend --output-path "${SOURCECRED_BIN}"

# Reload repository data.
echo "Loading repository data."
SOURCECRED_DIRECTORY="${toplevel}/sourcecred_data"
for repo in $REPOS; do
    SOURCECRED_DIRECTORY="${SOURCECRED_DIRECTORY}" node "${SOURCECRED_BIN}/sourcecred.js" load "${repo}" $WEIGHTS_OPT
done

# Create static website.
echo "Rebuilding static website"
cd "${toplevel}/sourcecred"
target="${toplevel}/site"
[ -d "${target}" ] && rm -rf "${target}"
yarn -s build --output-path "${target}"

# Import cred data.
mkdir "${target}/api/"
mkdir "${target}/api/v1/"
cp -r "${SOURCECRED_DIRECTORY}" "${target}/api/v1/data"
rm -rf "${target}/api/v1/data/cache"

# Set CNAME.
printf '%s' "${CNAME}" >"${target}/CNAME" # no newline

# Generate widgets.
echo "Generating widgets"
cd "${toplevel}/widgets"
widgets_target="${target}/widgets"
mkdir -p "${widgets_target}"
yarn
for repo in $REPOS; do
    echo "Generating ${repo//\//-}-contributors.svg"
    SOURCECRED_DIRECTORY="${SOURCECRED_DIRECTORY}" node "${SOURCECRED_BIN}/sourcecred.js" scores "${repo}" | \
        SVG_MIN_CRED=$SVG_MIN_CRED SVG_MAX_USERS=$SVG_MAX_USERS \
        ./bin/contributor-wall-svg.js > "${widgets_target}/${repo//\//-}-contributors.svg"
done
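
The configurable values at the top of this script rely on the ${VAR:-default} expansion. Here is a short sketch of how it behaves, including one quoting detail that is easy to miss: inside double quotes, single quotes in the default are kept as literal characters. DEMO_CNAME is a made-up variable used only for this illustration.

```shell
#!/usr/bin/env bash
# Sketch of ${VAR:-default}: use the existing value when it is set and
# non-empty, otherwise fall back to the default.
# DEMO_CNAME is a made-up variable used only for this illustration.

unset DEMO_CNAME
plain="${DEMO_CNAME:-sfosc.org}"
# Inside double quotes the single quotes are literal and end up in the value.
quoted="${DEMO_CNAME:-'sfosc.org'}"

DEMO_CNAME="example.org"
overridden="${DEMO_CNAME:-sfosc.org}"

printf 'plain=%s\nquoted=%s\noverridden=%s\n' "$plain" "$quoted" "$overridden"
```

Under bash, plain holds sfosc.org, quoted holds 'sfosc.org' including the quote characters, and overridden holds example.org.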
1 change: 1 addition & 0 deletions widgets
Submodule widgets added at 5f873e