Ansible roles in the infrastructure-playbook repository

  • The following roles are currently installed on the head and maintenance nodes via the sn06 playbook, the sn07 playbook, and the maintenance node playbook.
  • Each role is classified as head node only, maintenance node only, or both.
  • Head nodes: the nodes that run the Galaxy web server, the Galaxy job handlers, and the Galaxy workflow schedulers. As of 15/02/2023, sn06.galaxyproject.eu and sn07.galaxyproject.eu are the two head nodes; only sn06 is in production.
  • Maintenance node: runs cron jobs, holds the Galaxy codebase and configuration, pushes data to InfluxDB, performs cleanup tasks, syncs the Galaxy codebase to NFS, etc.
Roles Head node(s) only Maintenance node only Both Adds cronjob? Comments Separate repo
usegalaxy_eu.htcondor ✔️ ✔️
ssh_hardening ✔️ ✔️
galaxyproject.gxadmin ✔️ ✔️
ssh-host-sign ✔️
usegalaxy-eu.dynmotd ✔️
usegalaxy-eu.autofs ✔️ ✔️
influxdata.chrony ✔️ ✔️
usegalaxy-eu.autoupdates ✔️ ✔️
galaxyproject.miniconda ✔️ ✔️
geerlingguy.repo-epel ✔️ ✔️
usegalaxy_eu.handy.os_setup ✔️ ✔️
usegalaxy-eu.logrotate ✔️
dj-wasabi.telegraf ✔️ listen_galaxy_routes (statsd) and galaxy_active_users (uses /var/log/nginx/) should be enabled only on the head nodes, via the variable telegraf_plugins_extra
usegalaxy_eu.fs_maintenance ✔️ ✔️ All tasks (HTCondor cron tasks, adding HTCondor scripts, etc.) except fsm_cron_tasks can run on the maintenance node. fsm_cron_tasks is the exception because its gxadmin tasks use Galaxy's log directory /var/log/galaxy for cleanup ✔️
usegalaxy-eu.fix-stuck-handlers ✔️ ✔️ Cron jobs for handlers, schedulers, and gunicorn. Also syncs to NFS (the sync should be removed from this role and run only on the maintenance node; the remaining cron jobs can run on both head nodes)
galaxyproject.cvmfs ✔️ ✔️
hxr.monitor-galaxy-journalctl ✔️
geerlingguy.docker ✔️ ✔️
hxr.aws-cli ✔️
galaxyproject.tiaas2 ✔️ ✔️
usegalaxy-eu.nginx ✔️ ✔️
usegalaxy_eu.ansible_nginx_upload_module ✔️ ✔️
usegalaxy-eu.gapars-galaxy ✔️
usegalaxy_eu.galaxy_systemd ✔️ ✔️
usegalaxy-eu.subdomain-themes ✔️
usegalaxy-eu.log-cleaner ✔️
usegalaxy-eu.error-pages ✔️
usegalaxy-eu.fix-unscheduled-jobs ✔️ ✔️ Runs journalctl on galaxy-handler, then runs gxadmin mutate, and creates a cron job
usegalaxy-eu.galaxy-procstat ✔️
usegalaxy-eu.update-hosts ✔️ ✔️ 1. Uses HTCondor. 2. Writes the list of compute nodes to the file /etc/genders on the head nodes, so this role needs to run only on the head nodes ✔️
galaxyproject.galaxy ✔️ ✔️
usegalaxy-eu.fix-galaxy-server-dir ✔️
hxr.install-to-venv ✔️
usegalaxy_eu.gie_proxy ✔️ ✔️
usegalaxy-eu.fix-ancient-ftp-data ✔️ ✔️
usegalaxy-eu.fix-missing-api-keys ✔️ ✔️
usegalaxy-eu.fix-user-quotas ✔️ ✔️
usegalaxy_eu.tpv_auto_lint ✔️ ✔️
usegalaxy-eu.galaxy-slurp ✔️ ✔️
hxr.postgres-connection ✔️
usegalaxy-eu.tours ✔️
usegalaxy-eu.webhooks ✔️
usegalaxy-eu.rsync-to-nfs ✔️
hxr.galaxy-nonreproducible-tools ✔️
usegalaxy-eu.bashrc ✔️
usegalaxy-eu.monitoring ✔️
hxr.monitor-email ✔️
hxr.monitor-cluster ✔️
usegalaxy-eu.htcondor_release ✔️ ✔️ Releases held HTCondor jobs as a cron task
usegalaxy-eu.fix-unscheduled-workflows ✔️ ✔️
usegalaxy-eu.fix-stop-ITs ✔️ ✔️
usegalaxy-eu.vgcn-monitoring ✔️

Separate repo: whether the role has its own repository or is a local role available only in the infrastructure-playbook repo
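
One way the head/maintenance split in the table could be expressed in a playbook is sketched below. This is an illustration only, not the layout of the actual sn06, sn07, or maintenance playbooks: the inventory group names `headnodes` and `maintenance` are hypothetical, and the placement of each role is an example assignment that should be checked against the table (only the `usegalaxy-eu.update-hosts` placement is taken directly from its comment).

```yaml
---
# Hypothetical sketch: `headnodes` and `maintenance` are assumed inventory
# group names, not names confirmed by the infrastructure-playbook repository.

# Roles applied to both node types (example assignment only).
- name: Roles shared by head and maintenance nodes
  hosts: "headnodes:maintenance"
  become: true
  roles:
    - galaxyproject.gxadmin

# Roles restricted to the head nodes.
- name: Head-node-only roles
  hosts: headnodes
  become: true
  roles:
    # Writes the compute node list to /etc/genders, so per the table
    # comment it only needs to run on the head nodes.
    - usegalaxy-eu.update-hosts

# Roles restricted to the maintenance node (assumed placement).
- name: Maintenance-node-only roles
  hosts: maintenance
  become: true
  roles:
    # Assumption: the NFS sync belongs here, since the maintenance node
    # is described as syncing the Galaxy codebase to NFS.
    - usegalaxy-eu.rsync-to-nfs
```

Splitting by host group rather than by per-task `when:` conditions keeps each role's classification visible in one place, which mirrors the columns of the table.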