---
title: UseGalaxy.EU storage
---

AutoFS

Our storage mounts are managed everywhere with autofs.

On VGCN machines the autofs configuration is defined in the userdata.yml file, while on other machines it is managed by the usegalaxy-eu.autofs Ansible role and variables such as those in group_vars/sno6.yml.

How it works

/etc/auto.master.d/data.autofs has a line like:

/data           /etc/auto.data          nfsvers=3

Note that the autofs configuration above is VERY sensitive to spaces; do not retab it unless you need to. /etc/auto.data looks like:

#name   options                         source
#
0       -rw,hard,nosuid      denbi.svm.bwsfs.uni-freiburg.de:/dnb01/depot/&
1       -rw,hard,nosuid      denbi.svm.bwsfs.uni-freiburg.de:/dnb01/depot/&
2       -rw,hard,nosuid      denbi.svm.bwsfs.uni-freiburg.de:/dnb01/depot/&
3       -rw,hard,nosuid      denbi.svm.bwsfs.uni-freiburg.de:/dnb01/depot/&
4       -rw,hard,nosuid      denbi.svm.bwsfs.uni-freiburg.de:/dnb01/depot/&
5       -rw,hard,nosuid      denbi.svm.bwsfs.uni-freiburg.de:/dnb01/depot/&
6       -rw,hard,nosuid      denbi.svm.bwsfs.uni-freiburg.de:/dnb01/depot/&
7       -rw,hard,nosuid      denbi.svm.bwsfs.uni-freiburg.de:/dnb01/depot/&
dp01    -rw,hard,nosuid      denbi.svm.bwsfs.uni-freiburg.de:/dataplant01
dnb01   -rw,hard,nosuid      ufr-dyn.isi1.public.ads.uni-freiburg.de:/ifs/isi1/ufr/bronze/nfs/denbi/&
dnb02   -rw,hard,nosuid      ufr-dyn.isi1.public.ads.uni-freiburg.de:/ifs/isi1/ufr/bronze/nfs/denbi/&
dnb03   -rw,hard,nosuid      ufr-dyn.isi1.public.ads.uni-freiburg.de:/ifs/isi1/ufr/bronze/nfs/denbi/&
dnb04   -rw,hard,nosuid      ufr-dyn.isi1.public.ads.uni-freiburg.de:/ifs/isi1/ufr/bronze/nfs/denbi/&
dnb05   -rw,hard,nosuid      denbi.svm.bwsfs.uni-freiburg.de:/dnb01/&
dnb06   -rw,hard,nosuid      denbi.svm.bwsfs.uni-freiburg.de:/dnb06
dnb07   -rw,hard,nosuid      denbi.svm.bwsfs.uni-freiburg.de:/dnb07
dnb08   -rw,hard,nosuid      denbi.svm.bwsfs.uni-freiburg.de:/dnb08
db      -rw,hard,nosuid      ufr-dyn.isi1.public.ads.uni-freiburg.de:/ifs/isi1/ufr/bronze/nfs/denbi/&
gxtst   -rw,hard,nosuid      denbi.svm.bwsfs.uni-freiburg.de:/ws01/galaxy-sync/test
gxkey   -rw,hard,nosuid      denbi.svm.bwsfs.uni-freiburg.de:/ws01/galaxy-sync/main
jwd     -rw,hard,nosuid      denbi.svm.bwsfs.uni-freiburg.de:/ws01/&

So dnb01 will be available under /data/dnb01.
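Because autofs mounts on demand, simply accessing a path under /data triggers the corresponding mount. A quick sanity check from any machine that carries this map (dnb01 is just an example share):

```bash
# First access makes autofs mount the share on demand
ls /data/dnb01

# Confirm the NFS source and mount options
mount | grep /data/dnb01
```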

Different kinds of storage

  • managed Isilon storage (NFS)
  • managed NetApp storage (NFS, S3 possible)
  • zfs1: big machine (>200 TB) with spinning disks and an SSD cache frontend (self-built)
  • ssds1: SSD-only machine (24 × 1.8 TB) (self-built)

Group-based storage

It is possible to assign storage to dedicated Galaxy user groups. For example, the dp01 storage above is dedicated to the DataPLANT project and can only be used by researchers associated with the dataplant Galaxy group. This works via our dynamic job submission system (total-perspective-vortex): all jobs pass through these rules, and we added a special one for the dataplant group. The drawback is that, at the moment, you cannot easily assign multiple storage backends or different weights to one group.
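A minimal sketch of such a rule, assuming TPV's YAML rule syntax; the IDs, the role check, and the object store name are illustrative placeholders, not the production rule:

```yaml
# Hypothetical TPV rule: send jobs from users in the "dataplant" group
# to the object store backed by /data/dp01. Names are illustrative only.
tools:
  .*:
    rules:
      - id: dataplant_group_storage
        # `user` is the Galaxy user object exposed to TPV rule expressions
        if: user is not None and 'dataplant' in [r.name for r in user.all_roles()]
        params:
          object_store_id: dataplant01
```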

Sync

We have /usr/bin/galaxy-sync-to-nfs on sn04, created by this Ansible role, which takes care of synchronizing Galaxy data from sn04 to the storage used by the computational cluster.

Currently, the script is invoked:

  • by the handler in the Galaxy playbook.
  • by Jenkins, as a downstream project at the end of tool installation. See install_tools.
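If a sync ever needs to be triggered by hand (for example after an out-of-band change), the script can also be run directly on sn04. This is only a sketch; the script's exact behaviour and required privileges depend on how the Ansible role templated it:

```bash
# On sn04: run the installed sync script manually
sudo /usr/bin/galaxy-sync-to-nfs
```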

Cluster and Mounts (WIP)

Adding new storage/mount points to Galaxy is not trivial, since many machines are involved.

After adding a DNS A record to infrastructure/dns.tf, it is sufficient for most machines to add the mount point to infrastructure-playbook/group_vars/all.yml in the autofs_conf_files section.
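A hedged sketch of such an entry, assuming autofs_conf_files maps each auto.* map to a list of lines (the dnb09 line is only an example; check the usegalaxy-eu.autofs role for the exact variable structure):

```yaml
# infrastructure-playbook/group_vars/all.yml (sketch)
autofs_conf_files:
  data:
    # ... existing entries ...
    - "dnb09    -rw,hard,nosuid      denbi.svm.bwsfs.uni-freiburg.de:/dnb09"
```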

HOWEVER for

Steps to add a new data (dnbXX) share

  1. Request a new data share (dnbXX) from the storage team (RZ).
  2. Once the new mount point is made available, mount it and test that it actually works and is reachable (pick a worker node and try it).
  3. The following places must be updated to roll out the new data share and to migrate away from the previous one:
    1. Add the new mount point to the mounts repository. An example PR can be found here.
    2. Update the currently running vgcnbwc-worker-* nodes with the new mount using pssh:
      pssh -h /etc/pssh/cloud -l centos -i 'sudo su -c "echo \"dnb09    -rw,hard,nosuid,nconnect=2      denbi.svm.bwsfs.uni-freiburg.de:/dnb09\" >> /etc/auto.data"'
    3. Verify that the mount is successful:
      pssh -h /etc/pssh/cloud -l centos -i 'ls -l /data/dnb09/'
    4. Then update object_store_conf.xml, for example as shown here (a sketch is also given after this list).
    5. Once everything is merged, run the Jenkins job (sn06 playbook project) to deploy the new data share.
    6. Monitor the changes and the handler logs to make sure that there are no errors.
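For step 4, a minimal sketch of what the new backend entry in object_store_conf.xml could look like, assuming Galaxy's distributed disk object store; the ID, weight, and paths are illustrative, not the production values:

```xml
<object_store type="distributed">
    <backends>
        <!-- existing backends ... -->
        <!-- illustrative entry for the new share -->
        <backend id="dnb09" type="disk" weight="1">
            <files_dir path="/data/dnb09/galaxy_db/files"/>
            <extra_dir type="temp" path="/data/dnb09/galaxy_db/tmp"/>
            <extra_dir type="job_work" path="/data/dnb09/galaxy_db/job_working_directory"/>
        </backend>
    </backends>
</object_store>
```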

NFS export policies

  • The export rules are:
fr1-cl2::> export-policy rule show -vserver denbi -fields protocol,clientmatch,rorule,rwrule,superuser -policyname denbi     
vserver policyname ruleindex protocol clientmatch     rorule rwrule superuser 
------- ---------- --------- -------- --------------- ------ ------ --------- 
denbi   denbi      1         nfs3     132.230.223.238 sys    sys    any
denbi   denbi      1         nfs3     132.230.223.239 sys    sys    any       
denbi   denbi      3         nfs3     10.5.68.0/24    sys    sys    any       
2 entries were displayed.

fr1-cl2::> export-policy rule show -vserver denbi -fields protocol,clientmatch,rorule,rwrule,superuser -policyname denbi-svc 
vserver policyname ruleindex protocol clientmatch     rorule rwrule superuser 
------- ---------- --------- -------- --------------- ------ ------ --------- 
denbi   denbi-svc  1         nfs3     132.230.180.148 sys    sys    sys       

fr1-cl2::> export-policy rule show -vserver denbi -fields protocol,clientmatch,rorule,rwrule,superuser -policyname denbi-ws  
vserver policyname ruleindex protocol clientmatch     rorule rwrule superuser 
------- ---------- --------- -------- --------------- ------ ------ --------- 
denbi   denbi-ws   1         nfs3     132.230.223.238 sys    sys    any
denbi   denbi-ws   1         nfs3     132.230.223.239 sys    sys    any       
denbi   denbi-ws   3         nfs3     10.5.68.0/24    sys    sys    any       
denbi   denbi-ws   4         nfs3     132.230.223.213 sys    sys    none      
3 entries were displayed.

fr1-cl2::> export-policy rule show -vserver denbi -fields protocol,clientmatch,rorule,rwrule,superuser -policyname birna     
vserver policyname ruleindex protocol clientmatch      rorule rwrule superuser 
------- ---------- --------- -------- ---------------- ------ ------ --------- 
denbi   birna      1         nfs3     132.230.153.0/28 sys    sys    any       
denbi   birna      2         nfs3     10.5.68.0/24     sys    sys    none      
2 entries were displayed.
  • INFO:
    • The policy denbi is used for all dnbXX volumes, denbi-svc for the svc01 volume, denbi-ws for the galaxy_sync, ws01 and ws02 volumes, and birna for the birna01 volume (a quick client-side check is shown below).
    • A superuser value of any corresponds to no_root_squash: the root account on the machines with IPs 132.230.223.238/239 and on machines within the subnet 10.5.68.0/24 can access (read and write) those volumes as root.
    • Do not use the shares (jwd and jwd03f) exported via ws01 and ws02; they will be removed soon (as of 14.06.2023).
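To check from a client which exports it can actually see, showmount against the filer is a quick sanity check (the hostname is taken from the mount table below; this only works if showmount support is enabled on the SVM):

```bash
# List the exports the NetApp SVM offers to this client
showmount -e denbi.svm.bwsfs.uni-freiburg.de
```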

The following table gives an overview of the different mount points and where they are used:

| Mountpoint | Physical machine | Export | Purpose | sn05 / sn06 / sn07 / incoming / celery / VGCN |
| --- | --- | --- | --- | --- |
| /data/jwd | NetApp A400 | denbi.svm.bwsfs.uni-freiburg.de:/ws01/& | job working dir | ✔️ ✔️ ✔️ ✔️ ✔️ ✔️ |
| /data/jwd01 | Spinning disks with SSD cache (self-built) | zfs1.galaxyproject.eu:/export/& | job working dir | ✔️ ✔️ ✔️ ✔️ ✔️ ✔️ |
| /data/jwd02f | Full SSD (self-built) | zfs2f.galaxyproject.eu:/export/& | job working dir (full-flash) | ✔️ ✔️ ✔️ ✔️ ✔️ |
| /data/jwd03f | NetApp A400 flash | denbi.svm.bwsfs.uni-freiburg.de:/ws02/& | job working dir (full-flash) | ✔️ ✔️ ✔️ ✔️ ✔️ |
| /data/jwd04 | Full SSD (self-built; from here on no "f" for flash in the name) | zfs3f.galaxyproject.eu:/export/& | job working dir (full-flash) | ✔️ ✔️ ✔️ ✔️ ✔️ |
| /data/jwd05e | Full SSD (self-built; no "f" for flash in the name; "e" = encrypted) | zfs3f.galaxyproject.eu:/export/& | job working dir (full-flash) | ✔️ ✔️ ✔️ ✔️ |
| /opt/galaxy | NetApp A400 | denbi.svm.bwsfs.uni-freiburg.de:/ws01/galaxy-sync/main | galaxy root | ✔️ ✔️ |
| /usr/local/tools | NetApp A400 | denbi.svm.bwsfs.uni-freiburg.de:/dnb01/tools | tool dir | ✔️ ✔️ ✔️ |
| /data/gxtst | NetApp A400 | denbi.svm.bwsfs.uni-freiburg.de:/ws01/galaxy-sync/test | | ✔️ ✔️ ✔️ ✔️ |
| /data/gxkey | NetApp A400 | denbi.svm.bwsfs.uni-freiburg.de:/ws01/galaxy-sync/main | | ✔️ ✔️ ✔️ ✔️ |
| /data/galaxy-sync | NetApp A400 | denbi.svm.bwsfs.uni-freiburg.de:/galaxy-sync/main | Galaxy root (Galaxy's codebase) | ✔️ ✔️ ✔️ |
| /data/db | Isilon | ufr-dyn.isi1.public.ads.uni-freiburg.de:/ifs/isi1/ufr/bronze/nfs/denbi/& | | ✔️ ✔️ ✔️ ✔️ ✔️ |
| /data/0 | NetApp A400 | denbi.svm.bwsfs.uni-freiburg.de:/dnb01/depot/& | storage (old) | ✔️ ✔️ ✔️ ✔️ ✔️ |
| /data/1 | NetApp A400 | denbi.svm.bwsfs.uni-freiburg.de:/dnb01/depot/& | storage (old) | ✔️ ✔️ ✔️ ✔️ ✔️ |
| /data/2 | NetApp A400 | denbi.svm.bwsfs.uni-freiburg.de:/dnb01/depot/& | storage (old) | ✔️ ✔️ ✔️ ✔️ ✔️ |
| /data/3 | NetApp A400 | denbi.svm.bwsfs.uni-freiburg.de:/dnb01/depot/& | storage (old) | ✔️ ✔️ ✔️ ✔️ ✔️ |
| /data/4 | NetApp A400 | denbi.svm.bwsfs.uni-freiburg.de:/dnb01/depot/& | storage (old) | ✔️ ✔️ ✔️ ✔️ ✔️ |
| /data/5 | NetApp A400 | denbi.svm.bwsfs.uni-freiburg.de:/dnb01/depot/& | storage (old) | ✔️ ✔️ ✔️ ✔️ ✔️ |
| /data/6 | NetApp A400 | denbi.svm.bwsfs.uni-freiburg.de:/dnb01/depot/& | storage (old) | ✔️ ✔️ ✔️ ✔️ ✔️ |
| /data/7 | NetApp A400 | denbi.svm.bwsfs.uni-freiburg.de:/dnb01/depot/& | storage (old) | ✔️ ✔️ ✔️ ✔️ ✔️ |
| /data/dnb01 | NetApp A400 / future Isilon | ufr-dyn.isi1.public.ads.uni-freiburg.de:/ifs/isi1/ufr/bronze/nfs/denbi/& | storage (old) | ✔️ ✔️ ✔️ ✔️ ✔️ |
| /data/dnb02 | NetApp A400 / future Isilon | ufr-dyn.isi1.public.ads.uni-freiburg.de:/ifs/isi1/ufr/bronze/nfs/denbi/& | storage (old) | ✔️ ✔️ ✔️ ✔️ ✔️ |
| /data/dnb03 | Isilon | ufr-dyn.isi1.public.ads.uni-freiburg.de:/ifs/isi1/ufr/bronze/nfs/denbi/& | storage (old) | ✔️ ✔️ ✔️ ✔️ ✔️ |
| /data/dnb04 | Isilon | ufr-dyn.isi1.public.ads.uni-freiburg.de:/ifs/isi1/ufr/bronze/nfs/denbi/& | storage (old) | ✔️ ✔️ ✔️ ✔️ ✔️ |
| /data/dnb05 | NetApp A400 | denbi.svm.bwsfs.uni-freiburg.de:/dnb01/& | storage (old) | ✔️ ✔️ ✔️ ✔️ ✔️ |
| /data/dnb06 | NetApp A400 | denbi.svm.bwsfs.uni-freiburg.de:/dnb06 | storage (old) | ✔️ ✔️ ✔️ ✔️ ✔️ |
| /data/dnb07 | NetApp A400 | denbi.svm.bwsfs.uni-freiburg.de:/dnb07 | currently used | ✔️ ✔️ ✔️ ✔️ ✔️ |
| /data/dnb08 | NetApp A400 | denbi.svm.bwsfs.uni-freiburg.de:/dnb08 | currently used | ✔️ ✔️ ✔️ ✔️ ✔️ |
| /data/dnb09 | NetApp A400 | denbi.svm.bwsfs.uni-freiburg.de:/dnb09 | unused | ✔️ ✔️ ✔️ ✔️ ✔️ |
| /data/dp01 | NetApp A400 | denbi.svm.bwsfs.uni-freiburg.de:/dataplant01 | special storage for the DataPLANT group | ✔️ ✔️ ✔️ ✔️ ✔️ |

In this table, "old" means the storage is still used to read existing datasets but no new data is written to it.