The SlipStream job engine uses the CIMI job resource and ZooKeeper as a locking queue. It is designed to scale horizontally across multiple nodes.
Facts:
- Each action is distributed by a standalone distributor.
- More than one distributor for the same action can be started on different nodes, but only one is elected to distribute the job (action).
- The executor loads actions dynamically at startup.
- ZooKeeper is used as a locking queue containing only job UUIDs under /job/entries.
- Running jobs are put in ZooKeeper under /job/taken.
- If the executor is unable to communicate with CIMI, the running job is released (put back in the ZooKeeper queue).
- The action implementation should take care, if necessary, of continuing execution or cleaning up an unfinished running job.
- If the connection with ZooKeeper is lost, the jobs under /job/taken (executing jobs) are released, because these are ephemeral nodes.
- Stopping the executor attempts a graceful shutdown, waiting 2 minutes before killing the process. Each thread that finishes its running action will not take a new one.
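The take/consume/release semantics described above can be sketched in plain Python, independent of ZooKeeper. This is an illustrative model only (class and attribute names are assumptions, not the engine's actual code):

```python
from collections import deque

class LockingQueue:
    """In-memory sketch of the /job/entries + /job/taken semantics."""

    def __init__(self):
        self.entries = deque()  # waiting job UUIDs (/job/entries)
        self.taken = set()      # running job UUIDs (/job/taken)

    def put(self, job_uuid):
        # A distributor queues a new job.
        self.entries.append(job_uuid)

    def get(self):
        # An executor takes the next job: it moves from entries to taken.
        job_uuid = self.entries.popleft()
        self.taken.add(job_uuid)
        return job_uuid

    def consume(self, job_uuid):
        # Job finished successfully: remove it from taken for good.
        self.taken.discard(job_uuid)

    def release(self, job_uuid):
        # CIMI unreachable: put the job back in the queue for another try.
        self.taken.discard(job_uuid)
        self.entries.append(job_uuid)
```

In the real engine the release on ZooKeeper connection loss needs no explicit call: the /job/taken entries are ephemeral nodes, so they disappear automatically when the executor's session dies.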
Executor:
- Install the SlipStreamJobEngine RPM.
- Create the file /etc/default/slipstream-job-executor with the following content:
DAEMON_ARGS='--ss-url=https://<CIMI_ENDPOINT>:<CIMI_PORT> --ss-user=super --ss-pass=<SUPER_PASS> --zk-hosts=<ZOOKEEPER_ENDPOINT>:<ZOOKEEPER_PORT> --threads=8 --es-hosts-list=<ELASTICSEARCH_ENDPOINTS>'
- Start the service: systemctl start slipstream-job-executor
Distributor:
- Install the SlipStreamJobEngine RPM.
- Create the file /etc/default/slipstream-job-distributor with the following content:
DAEMON_ARGS='--ss-url=https://<CIMI_ENDPOINT>:<CIMI_PORT> --ss-user=super --ss-pass=<SUPER_PASS> --zk-hosts=<ZOOKEEPER_ENDPOINT>:<ZOOKEEPER_PORT>'
- Start the service: systemctl start slipstream-job-distributor@<DISTRIBUTOR_SCRIPT_FILENAME_LAST_PART>
  e.g. systemctl start slipstream-job-distributor@jobs_cleanup.service
To implement a new action for the job executor, create a class equivalent to actions/dummy_test_action.py. Restart the job executor to force it to reload the implemented actions.
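As a rough illustration only (the authoritative template is actions/dummy_test_action.py; the class and method names below are assumptions, not the engine's real API), a new action might look like:

```python
class DummyCleanupAction:
    # Assumed registration key the executor would discover at startup.
    action_name = 'dummy_cleanup'

    def __init__(self, job):
        # `job` stands in for the CIMI job resource handed to the executor.
        self.job = job

    def do_work(self):
        # Perform the action. Remember from the Facts above that the engine
        # may re-run an unfinished job, so this code should be able to
        # continue the execution or clean up a previous attempt.
        return 0  # exit status reported back to the job resource
```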
To create a new action distributor, which creates a CIMI job at a fixed interval, create a class equivalent to scripts/job_distributor_dummy_test_action.py.
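Conceptually, a distributor is a loop that creates a CIMI job every so often. The sketch below assumes a `create_job` callable standing in for the CIMI client; the real template is scripts/job_distributor_dummy_test_action.py:

```python
import time

def distribute(create_job, action_name='dummy_test_action',
               interval=60.0, iterations=None):
    """Create a CIMI job for `action_name` every `interval` seconds.

    `iterations=None` loops forever, as a real distributor would;
    a finite value is handy for testing.
    """
    count = 0
    while iterations is None or count < iterations:
        # `create_job` is a placeholder for posting a job resource to CIMI.
        create_job({'action': action_name, 'state': 'QUEUED'})
        count += 1
        time.sleep(interval)
```

In deployment, several such loops may run on different nodes for the same action; per the Facts above, ZooKeeper-based election ensures only one of them actually distributes the job.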
Check the /var/log/slipstream/log/ folder.
You can get a traceback of all running threads using a tool like pyrasite (https://pyrasite.readthedocs.io):
- Install it: pip install pyrasite
- Get the Python process PID of the executor.
- Connect to a slipstream bash session: su - slipstream
- Open a shell inside the process: pyrasite-shell <PID>
- Print the traceback by entering the code below into the pyrasite REPL:
import sys
import threading
import traceback

for th in threading.enumerate():
    print(th)
    traceback.print_stack(sys._current_frames()[th.ident])
    print()