swatlab/ml-frameworks-evaluation

Contains ReproduceML, a framework for evaluating ML experiments by reducing the factors that introduce non-determinism.

Building the Docker images

See the file build_images.sh.
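As a rough sketch, assuming build_images.sh wraps plain docker build calls (the Dockerfile names below are illustrative; the image tags are the ones used in the run commands further down):

docker build -f Dockerfile.client -t emiliorivera/ml-frameworks:eval100_client .
docker build -f Dockerfile.server -t emiliorivera/ml-frameworks:eval100_server .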

Update the .dockerignore file if need be, for example if the number of files sent to the Docker daemon during a build becomes too large.
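For example, a minimal .dockerignore (the entries are illustrative) keeps version-control metadata and bulky local artifacts out of the build context:

.git
__pycache__/
*.log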

Running a client

Prerequisites

You should have the following Docker volumes created (see the commands after this list):

  1. pip-cache: Pip caching directory, to avoid re-downloading the same dependencies on every relaunch.
  2. results: Contains the metrics collected by the server (and, technically, by the client as well).
  3. build-vol: Contains the builds of the libraries to use.
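They can be created once with the standard Docker CLI:

docker volume create pip-cache
docker volume create results
docker volume create build-vol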

Configuration is done through environment variables. Look at the template file envs/template.env, which contains these parameters:

BUG_NAME=EXPERIMENT_NAME
CLIENT_MANUAL_DEPENDENCY=COMMIT_SHA
CLIENT_PY_VERSION=3.5.6
EVALUATION_TYPE=
MODEL_LIBRARY=pytorch
CLIENT_LOG_DIR=/results/client
NUMBER_OF_RUNS=2
PY_CACHE_DIR=/pip-cache
DATA_SERVER_ENDPOINT=tcp://IP_ADDRESS_OF_SERVER:90002
USE_BUILD_MKL=1
USE_CUDA=True
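A common way to prepare an evaluation-specific env file is to copy the template and fill in the placeholders (buggy.env is the file name used in the example client command below):

cp envs/template.env envs/buggy.env
# then edit envs/buggy.env: experiment name, commit SHA, server endpoint, etc.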

The following environment variables also need to be set when launching an evaluation:

  1. CHALLENGE: the name of the challenge to use.
  2. NUM_CLASSES: the output size of the network to build, if the network supports it. Usually tied to CHALLENGE.
  3. MODEL_NAME: the name of the model to use.

Here is an example command to launch a client:

sudo docker run --rm \
  --mount source=pip-cache,target=/pip-cache \
  --mount source=results,target=/results \
  --mount source=build-vol,target=/builds \
  --gpus all \
  --env-file /path/to/env/buggy.env \
  --env CHALLENGE=NAME_OF_CHALLENGE \
  --env NUM_CLASSES=NUMBER_OF_POSSIBLE_CLASSES \
  --env MODEL_NAME=MODEL_TO_USE \
  -it emiliorivera/ml-frameworks:eval100_client

Running a server

Prerequisites

You should have the following Docker volumes created (see the command after this list):

  1. pip-cache: Pip caching directory, to avoid re-downloading the same dependencies on every relaunch.
  2. results: Contains the metrics collected by the server (and, technically, by the client as well).
  3. data: Contains the data (datasets) for the challenges.
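If the client volumes were already created above, only the data volume is new:

docker volume create data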

Here is the preferred command to launch a server:

sudo docker run --name eval100_server \
  --mount source=pip-cache,target=/pip-cache \
  --mount source=results,target=/results \
  --mount source=data,target=/data \
  --gpus all \
  --env DATA_SERVER_DATA_ROOT=/data \
  -it emiliorivera/ml-frameworks:eval100_server

Default values are set for the following parameters:

  1. PY_CACHE_DIR to /pip-cache
  2. METRICS_LOG_DIR to /results/server
  3. SEED_CONTROLLER_FILE to /results/server/seed_control
  4. SERVER_LOG_FILE to $METRICS_LOG_DIR/server.log
  5. SERVER_PY_VERSION to 3.6.8
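Any of these can be overridden at launch by adding an extra --env flag to the docker run command above, e.g. (the path here is illustrative):

--env METRICS_LOG_DIR=/results/server_run2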
