Creating WST datasets
Ben Klein edited this page May 16, 2021 · 3 revisions
While most of WST's usefulness comes from the size and scope of the data we can analyze, there are often cases where you want a smaller or more specific selection of repositories to compose a specialized dataset.
All of the code we use to build our own datasets is in this repo, so how can you go about using it?
First, you'll need a database ready to store the output. Follow your favorite guide for installing ArangoDB, then create a database and a user with write access to that database.
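As a rough sketch of the database and user creation step, one option is ArangoDB's HTTP API, which accepts a POST to `/_api/database` with the database name and a list of users. Everything specific here is an assumption for illustration: the names `wst_dataset` and `wst_user`, the password, the `ARANGO_ROOT_PASSWORD` variable, the helper function names, and a server listening on `localhost:8529`. The ArangoDB web interface or `arangosh` would work just as well.

```shell
# Placeholder values (assumptions, not part of WST); substitute your own.
DB_NAME="wst_dataset"
DB_USER="wst_user"
DB_PASS="example-password"

# Build the JSON body for ArangoDB's POST /_api/database endpoint.
# Note: values are inserted verbatim, so avoid quotes or backslashes in them.
create_db_payload() {
    printf '{"name":"%s","users":[{"username":"%s","passwd":"%s"}]}' "$1" "$2" "$3"
}

# Hypothetical helper: send the request to the _system database as root.
# Defined here but only runs when you call it.
create_db() {
    curl -sS -u "root:${ARANGO_ROOT_PASSWORD}" \
        -X POST "http://localhost:8529/_db/_system/_api/database" \
        -H 'Content-Type: application/json' \
        -d "$(create_db_payload "$DB_NAME" "$DB_USER" "$DB_PASS")"
}
```

Calling `create_db` with `ARANGO_ROOT_PASSWORD` set should create the database and the user together; users listed in the request are granted access to the new database.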
To set up the collections, relations, and indexes, the wsyntree-collector command has a subcommand that initializes the database for you:
export WST_DB_URI="http://DB_USERNAME:DB_PASSWORD@localhost:8529/NAME_OF_DATABASE"
wsyntree-collector -v db init
# if you wish to re-initialize a database by **DELETING ALL DATA** and re-creating:
wsyntree-collector -v db init --delete
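If you keep the connection details in separate variables, the URI can be assembled from its parts rather than typed by hand. This is only a sketch: the function name `build_wst_db_uri` and the example values are hypothetical, and the only name taken from the commands above is `WST_DB_URI`.

```shell
# Hypothetical helper: join username, password, host:port, and database name
# into the URI layout used for WST_DB_URI above.
build_wst_db_uri() {
    printf 'http://%s:%s@%s/%s' "$1" "$2" "$3" "$4"
}

# Example values are placeholders (assumptions); substitute your own.
WST_DB_URI="$(build_wst_db_uri "wst_user" "example-password" "localhost:8529" "wst_dataset")"
export WST_DB_URI
```

A password containing characters such as `@`, `:`, or `/` would likely need to be percent-encoded before being placed in the URI.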
TODO: collector design is still being changed during development