GitHub to Gist Service Migration
Since RCloud 1.8 we support three back-ends for notebooks: GitHub (and GitHub Enterprise), gitgist (local git repositories) and RCloud Gist Service (a centralized server on top of git repositories). This page describes the process of migrating from the GitHub back-end to the RCloud Gist Service.
The main difference is that GitHub uses its own user management and authentication mechanism, whereas RCloud Gist Service plugs into the RCloud authentication (SessionKeyServer). The benefit is that there is now only a single authentication authority and a single token that governs both execution and notebook access.
Typical GH setup in an enterprise setting:
GitHub <=> RCloud compute instances <-> SessionKeyServer
After migration:
RCloud compute instances <====> RCloud Gist Service
| |
+==> Session Key Server <--+
This means that SKS has to be accessible from both the compute nodes and the gist service. Moreover, if multiple RCloud instances use the same Gist Service, they have to be registered with the Service so that it knows which SKS to authenticate against, e.g.:
RCloud Instance 1 <-->| Gist |<--> RCloud Instance 2
SKS 1 <-------------| Service |----> SKS 2
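The reachability requirements above can be checked before starting the migration. The following is a minimal sketch, assuming bash (for its `/dev/tcp` pseudo-device) and the coreutils `timeout` command are available. The function names are invented for illustration and are not part of RCloud.

```shell
#!/usr/bin/env bash
# Sketch: test whether a TCP port accepts connections.
# Relies on bash's /dev/tcp pseudo-device and the coreutils `timeout` command.

# Succeeds if a TCP connection to host $1, port $2 can be opened within 3 s.
check_port() {
  local host="$1" port="$2"
  # The connection is opened in a child bash and closed when that shell exits.
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}

# Print a one-line verdict for a host and port pair.
report_port() {
  if check_port "$1" "$2"; then
    echo "OK   $1:$2 is reachable"
  else
    echo "FAIL $1:$2 is not reachable"
  fi
}
```

For example, on the Gist Service machine one might run `report_port sks.example.com 4301`, and on each compute node `report_port gist.example.com 13020` as well as `report_port sks.example.com 4301`. The host names are placeholders; the ports are the defaults given in the migration steps on this page.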
When migrating an existing GitHub Enterprise installation, use the following process:
- upgrade to RCloud 1.8
- install github_0.9.9 package (`install.packages("github",,"http://rforge.net")`)
- install Java on the machine/VM that you will be using for the Gist Service. The default port is 13020 so make sure it is accessible from RCloud compute nodes. Unlike GH installations it doesn't need to be client-visible as it only provides API access. Finally, make sure it can reach the SessionKeyServer instances you have (typically on port 4301).
- set GitHub into maintenance mode
- create a backup using GHE backup utilities, preferably from the Gist Service machine
- create destination directory for the Gist Service. A typical choice is `/data/rcloud/data/gist-service` for standard RCloud installations, but it can be any directory.
- use migration script `scripts/migrate-ghe2gists.pl` to copy gists from GHE backup into the Gist Service directory, e.g. `perl scripts/migrate-ghe2gists.pl /shared/ghe/backup/current /data/rcloud/data/gist-service`. Note that this step can take quite a while depending on the disk speed and number of gists that need to be migrated (~30k notebooks at 4Gb take ~45min on a fast RAID array).
- if you have users using private keys you will have to migrate them between authentication methods, because SKS stores keys separately for each method. Note that this can be done at any point as it is independent of the Gist back-end.
  - update to latest SKS sources (typically by running `git pull` in `/data/rcloud/services/SessionKeyServer`) and run `make CopyKeys.jar` in the `SessionKeyServer` directory to create the key migration tool
  - stop SessionKeyServer. All actions below have to be performed as the same user that is normally running SKS.
  - determine which RCloud execution authentication method is used in your installation, e.g. `auth/pam` if you use PAM
  - run `java -jar CopyKeys.jar -d key.db stored auth/pam` where the last argument is your RCloud execution authentication method
  - start SessionKeyServer
- Download the Gist Service (currently https://github.com/MangoTheCat/rcloud-gist-services). Although you can build it yourself, a pre-made binary JAR with configuration is available at https://github.com/att/rcloud/releases/download/rcloud-gist-service-0.3.1-rc/rcloud-gist-service-0.3.1.tar.gz
- Edit `application.yml` to match your setup. The sample is set up to use `/data/rcloud` locations and a local SKS. Typically you will want to check the `gists:` section, in particular `root:`, which should match the directory you used above, and the `keyservers:` section, which should list all SKS instances that will be using this service. The names in the list should match the `github.client.id` as defined in `rcloud.conf` of each instance, or `default`, which is used for all unknown client ids. You can enable SSL if you wish (uses the same JKS format as SKS itself) and also check the location of the log file.
- Start the service: `java -jar rcloud-gist-service-0.3.1.jar`
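The copy from the GHE backup into the Gist Service directory can be wrapped in a few pre-flight checks. This is only a sketch: it assumes it is run from the RCloud source directory (so that `scripts/migrate-ghe2gists.pl` resolves), and the function names and the `DRY_RUN` variable are invented for illustration.

```shell
#!/usr/bin/env bash
# Sketch: pre-flight checks around scripts/migrate-ghe2gists.pl.
# Function names and DRY_RUN are illustrative, not part of RCloud.

# Create the Gist Service root if needed and verify that it is writable.
prepare_gist_root() {
  local root="$1"
  mkdir -p "$root" || return 1
  [ -w "$root" ]
}

# Run the migration script, or only print the command when DRY_RUN=1.
migrate_gists() {
  local backup="$1" root="$2"
  if [ ! -d "$backup" ]; then
    echo "backup directory not found: $backup" >&2
    return 1
  fi
  prepare_gist_root "$root" || return 1
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "perl scripts/migrate-ghe2gists.pl $backup $root"
  else
    perl scripts/migrate-ghe2gists.pl "$backup" "$root"
  fi
}
```

With the example paths from the steps above, `DRY_RUN=1 migrate_gists /shared/ghe/backup/current /data/rcloud/data/gist-service` prints the command that would be run. Note that even a dry run creates the destination directory if it does not exist yet.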
The final step is to configure RCloud to use the new service and restart. Example entries for `rcloud.conf`:
github.client.id: default
github.client.secret: X
github.api.url: https://rcloud.research.att.com:13020/
github.auth.forward: https://rcloud.research.att.com/login_successful.R
github.auth: exec.token
rational.githubgist: true
In detail, `github.auth.forward` must point to the `login_successful.R` entry of your instance, the same one you used when registering the GitHub application. The `github.client.id` can be an arbitrary name, but if you use multiple RCloud instances against one service they must have distinct names and those names must have corresponding entries in the `keyservers:` section. Note that `github.client.secret` must be set to some value although it is not actually used. `github.api.url:` must be set to the gist service URL. Finally, `github.auth:` tells RCloud not to use OAuth but instead to use the token as the execution token, and `rational.githubgist:` disables work-arounds for GitHub-specific idiosyncrasies (race conditions in the API, inability to self-fork/bi-fork etc.). Restart RCloud (the `rcloud-qap` and `rcloud-script` services).
If you get 415 errors (invalid content) on writes then you have not upgraded the github package to 0.9.9.
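Whether the package upgrade took effect can be checked from the shell. The sketch below assumes `Rscript` is on the PATH and uses the same R library as RCloud, that `sort -V` (version sort, as in GNU coreutils) is available, and that versions later than 0.9.9 are also acceptable. The helper names are invented for illustration; `packageVersion()` is the standard R function for querying an installed package's version.

```shell
#!/usr/bin/env bash
# Sketch: verify that the installed R package "github" is at least 0.9.9.

# Succeeds if version $1 is greater than or equal to version $2.
# The smaller of the two versions sorts first, so $1 >= $2 exactly
# when the first line of the sorted output is $2.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Print the installed version of the github package (requires Rscript).
installed_github_version() {
  Rscript -e 'cat(as.character(packageVersion("github")))'
}

# Report whether the installed version meets the requirement.
check_github_package() {
  local have
  have="$(installed_github_version)" || return 1
  if version_ge "$have" "0.9.9"; then
    echo "github package $have is recent enough"
  else
    echo "github package $have is too old, 0.9.9 is needed"
    return 1
  fi
}
```

Running `check_github_package` as the user that runs RCloud shows which version R actually picks up.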
If RCloud complains on connect that it cannot use a read-only gist backend in the main section, then it is likely that you are missing at least one of the entries above. Don't forget to include both `github.client.id:` and `github.client.secret:`, even though the secret is not actually verified.
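A quick way to spot a missing entry is to scan `rcloud.conf` for the keys from the example above. This is a minimal sketch, assuming the one-entry-per-line `key: value` format shown in that example; the function names are invented for illustration.

```shell
#!/usr/bin/env bash
# Sketch: list gist-related keys that are absent from an rcloud.conf file.
# Assumes one "key: value" entry per line, as in the example on this page.

# Succeeds if file $1 has a line whose name part (text before the first
# colon, surrounding blanks ignored) is exactly the key $2.
has_key() {
  awk -F: -v k="$2" '
    { name = $1; gsub(/^[ \t]+|[ \t]+$/, "", name) }
    name == k { found = 1 }
    END { exit !found }
  ' "$1"
}

# Print every expected key that is missing from file $1, one per line.
missing_gist_keys() {
  local conf="$1" key
  for key in github.client.id github.client.secret github.api.url \
             github.auth.forward github.auth rational.githubgist; do
    if ! has_key "$conf" "$key"; then
      echo "$key"
    fi
  done
}
```

Calling `missing_gist_keys` with the path to your `rcloud.conf` prints nothing when all six entries are present. The check only looks at key names, so it does not catch wrong values such as a mistyped service URL.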