
Deploy Xcache via a Singularity Container


Build a Singularity container image with Xcache built-in

The recipe file xcache.singularity.def can be used to build a Singularity image with Xcache installed and preconfigured. For Singularity 2.4, the command is

sudo singularity build -w xcache.img xcache.singularity.def

(For older versions of Singularity, the build command may be different, but the recipe file will likely be the same.) The -w flag tells Singularity to build an ext3 based image, as opposed to the default, smaller, read-only squashfs based image. An ext3 image can be used with Singularity 2.3 and runs on both RHEL6/CentOS6 and 7 platforms.
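
For comparison, a default read-only squashfs image can be built by simply omitting -w (the image name here is just an example):

 sudo singularity build xcache.simg xcache.singularity.def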

The Singularity build process checks the following directories

/cvmfs/oasis.opensciencegrid.org/mis/osg-wn-client/current/el7-x86_64/etc/grid-security
/cvmfs/oasis.opensciencegrid.org/mis/osg-wn-client/current/el7-x86_64/etc/vomsdir

They will be used to populate the /etc/grid-security/{certificates,vomsdir} directories. If they are not available, those directories will be left empty. (This can be changed in the recipe file, for example to use the build host's /etc/grid-security directory.) If needed, bind mounts (below) can be used at container startup to make the content of these two directories available to the container.
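
For example, if the CVMFS paths were not available at build time, the host's copies can be supplied at startup with extra bind mounts. This is only a sketch; the host-side paths are assumptions and should be adjusted for your site:

 singularity run \
   -B /etc/grid-security/certificates:/etc/grid-security/certificates \
   -B /etc/grid-security/vomsdir:/etc/grid-security/vomsdir \
   -B /my_filesystem:/data xcache.img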

Run the container

At minimum, you need a dedicated file system that is fully writable by the user who runs the container. The filesystem should be bind mounted to /data. The command is

 singularity run -B /my_filesystem:/data xcache.img

With Singularity 2.3 and older, -H ${HOME}:/mnt may also be needed. If you only have a network filesystem available for cache space, using a loop-mounted file system image may give you better performance.
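
A rough sketch of setting up such a loop-mounted image, run as root; the backing-file path and size are examples only:

 dd if=/dev/zero of=/nfs_share/xcache-data.img bs=1M count=100000   # ~100 GB backing file on the network filesystem
 mkfs.ext4 -F /nfs_share/xcache-data.img                            # create a local-style filesystem inside the file
 mkdir -p /my_filesystem
 mount -o loop /nfs_share/xcache-data.img /my_filesystem            # then bind mount /my_filesystem to /data as above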

The above command returns immediately, leaving an xrootd process running in the background. Using ps -ef you can see the xrootd process (with ppid=1) but not the Singularity process; this is expected. To stop the Xcache, type pkill xrootd.
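
For example, to verify the cache is running and then stop it:

 ps -ef | grep xrootd    # the xrootd process should show ppid=1
 pkill xrootd            # stop the cache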

In some cases (and in most ATLAS use cases), the container needs a continuously updated X509 user proxy certificate in order to fetch from the data source. The container will look for the proxy in the default location /tmp/x509up_u$(id -u).
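
A typical way to create such a proxy in that default location; the VO name and lifetime below are examples, not requirements:

 voms-proxy-init -voms atlas -valid 96:00   # writes the proxy to /tmp/x509up_u$(id -u) by default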

Bind mount options

The container can utilize the following bind mounts:

  • /var/run/x509up: If the data source requires GSI security and the X509 user proxy certificate to be used by the container isn't in the standard location (above), you can bind mount it to /var/run/x509up.
  • /etc/grid-security/{certificates,vomsdir}: the contents baked in at build time can be overridden by bind mounts.
  • /etc/xrootd/xcache.cfg: If you want to use your own xrootd/Xcache configuration file, you can bind mount it to /etc/xrootd/xcache.cfg. Otherwise the container will automatically generate (and use) a configuration file, saved to /tmp/xcache.cfg. The container makes a best guess about how much memory the cache should use and about the low and high water marks for disk space cleaning. The automatically generated configuration file is a good starting point. A combined example of these bind mounts is shown after this list.
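
A minimal sketch combining these options; the host-side paths are assumptions and should be replaced with your own proxy and configuration file:

 singularity run \
   -B /my_filesystem:/data \
   -B /home/xcacheuser/myproxy:/var/run/x509up \
   -B /home/xcacheuser/my_xcache.cfg:/etc/xrootd/xcache.cfg \
   xcache.img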

Start a cluster of Xcache

Just like regular xrootd, you can build an Xcache cluster. To start a cluster, set the environment variable XCACHE_RDR to the full DNS name of the redirector host (hostname -f). Do this on all machines, redirector and data servers alike, and then run the Singularity command (above) on all of them. This will start both xrootd and cmsd processes on all machines. To stop, type pkill xrootd and pkill cmsd.
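
For example, on every node of the cluster (redirector and data servers alike), assuming the redirector's full DNS name is xcache-redirector.example.org and that the exported variable is visible inside the container (Singularity passes the host environment through by default):

 export XCACHE_RDR=xcache-redirector.example.org   # output of hostname -f on the redirector
 singularity run -B /my_filesystem:/data xcache.img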

Note: on the redirector host, the bind mount to /data is still needed. Only log files will appear there, so the space usage will be small. There is no need to provide an X509 user proxy certificate on the redirector host.