# XDMoD Integration

One of the challenges of the OpenShift/NERC integration is accommodating existing tools that NERC already uses. This document covers usage accounting through XDMoD.

## Overview

XDMoD is a usage accounting tool whose data structures and functionality have proven suitable for NERC. Its UI has multiple features that existing NERC users have grown accustomed to, including:

- Different admin and PI views
- A variety of output formats
- Report generation

However, there is a drawback to using XDMoD: the codebase is hard to decipher, which makes it difficult to reliably estimate the time needed to extend it to accommodate OpenShift. Fortunately, an alternative approach to OpenShift integration is available: simply reusing one of the existing data structures for OpenShift data. This approach was discussed with the XDMoD developers, who raised no objections.

## Event Based vs Job Based

XDMoD has two approaches to collecting data for a resource:

- Job Based: Each data entry encapsulates all the known information about a single job.
- Event Based: Each data entry represents an event, such as the creation or deletion of a VM.

Both methods allow XDMoD to reconstruct the state of a system and view the consumption of computing resources at a particular point in time. Which method is better for OpenShift?

OpenShift jobs - pods - can run for long periods of time, which makes it difficult to create a single job entry in a database. However, an event-based approach is also problematic: the OpenShift CLI does not have strong event-querying capabilities, and the information contained within an event lacks the level of detail required by XDMoD.

One possible alternative is to take periodic samples of data and treat each sample as a completed job. The question becomes whether this method of data gathering results in usable output as viewed from the XDMoD UI.

The remainder of this document explores this approach.

## OpenShift Data Collection

OpenShift metrics are stored in Prometheus, and can be queried using Prometheus’s query language: PromQL.

The following PromQL query retrieves metric data for an OpenShift cluster. That data is averaged over the past hour using one minute samples and aggregated by namespace.

- `avg_over_time(sum by (namespace) (<metric>)[1h:1m])`

That query can be combined with the following OpenShift REST API call to produce values for each hour in a day.

- `/api/v1/query_range?query=<query>&start=<date_string>T00:00:00Z&end=<date_string>T23:59:59Z&step=3600s`
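
As a rough sketch of how a collection script might use that endpoint, the example below queries hourly, namespace-aggregated averages for a single metric over one day. The Prometheus route, bearer token, and date are placeholders; the actual values depend on the cluster setup.

```python
import requests

# Placeholder values; the real monitoring route and token depend on the cluster.
PROM_URL = "https://prometheus-k8s-openshift-monitoring.apps.example.com"
TOKEN = "<bearer-token>"  # a token with permission to query cluster metrics

def query_metric_by_hour(metric, date_string):
    """Return hourly, namespace-aggregated averages for one metric on one day."""
    query = f"avg_over_time(sum by (namespace) ({metric})[1h:1m])"
    response = requests.get(
        f"{PROM_URL}/api/v1/query_range",
        params={
            "query": query,
            "start": f"{date_string}T00:00:00Z",
            "end": f"{date_string}T23:59:59Z",
            "step": "3600s",
        },
        headers={"Authorization": f"Bearer {TOKEN}"},
    )
    response.raise_for_status()
    # Each result entry carries a namespace label and a list of (timestamp, value) pairs.
    return response.json()["data"]["result"]

hourly_cpu_requests = query_metric_by_hour(
    "kube_pod_init_container_resource_requests_cpu_cores", "2021-06-01"
)
```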

We can use the above to query the following metrics over a day:

- `kube_pod_init_container_resource_requests_cpu_cores`
  - The number of CPU cores requested by an init container.
- `kube_pod_init_container_resource_limits_cpu_cores`
  - The CPU core limit set for an init container.
- `kube_pod_init_container_resource_requests_memory_bytes`
  - Bytes of memory requested by an init container.
- `kube_pod_init_container_resource_limits_memory_bytes`
  - The memory limit, in bytes, set for an init container.
- `kube_deployment_status_replicas`
  - The number of replicas per deployment.

The mapping of these values into the XDMoD data structure is described below.

## Annotations

Some of the job information required by XDMoD is not a metric, but simply additional information about the namespace. That information can be attached when creating a namespace through the use of annotations.

Kristi Nikolla has already created a patch that adds OpenShift support to the ColdFront OpenStack plugin. This support is similar to that for OpenStack, allowing for the activation and deactivation of an allocation, as well as the association and dissociation of a user from the allocation. This code can be updated to also set annotations by passing in the following:

               "metadata": {
                   "annotations": {
                       "cf_pi": <pi_username>
                       "cf_project_id": <project_id>
                   }
               }
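
As an illustration only (not necessarily how the ColdFront patch itself constructs the request), a namespace carrying these annotations could be created with the Kubernetes Python client roughly as follows; the namespace name is a placeholder:

```python
from kubernetes import client, config

config.load_kube_config()  # or in-cluster configuration, depending on where this runs

namespace = client.V1Namespace(
    metadata=client.V1ObjectMeta(
        name="example-project",  # hypothetical namespace name
        annotations={
            "cf_pi": "pi_username",
            "cf_project_id": "project_id",
        },
    )
)
client.CoreV1Api().create_namespace(body=namespace)
```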

These annotations can later be queried through the OpenShift Python client.
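
For example, a minimal sketch of reading the annotations back with the dynamic client (the namespace name is again a placeholder):

```python
from kubernetes import client, config
from openshift.dynamic import DynamicClient

config.load_kube_config()
dyn_client = DynamicClient(client.ApiClient())

# Look up the namespace and pull its annotations out of the returned resource.
namespaces = dyn_client.resources.get(api_version="v1", kind="Namespace")
ns = namespaces.get(name="example-project")

annotations = ns.to_dict()["metadata"].get("annotations", {})
pi = annotations.get("cf_pi")
project_id = annotations.get("cf_project_id")
```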

## Corresponding OpenShift Data with XDMoD Data Structures

The XDMoD data structure that accepts job data is the one used for Slurm. This table shows one possible correspondence between Slurm and OpenShift data:

| Slurm | OpenShift Equivalent |
| --- | --- |
| job_id | autogenerated by script |
| job_id_raw | autogenerated by script |
| cluster_name | openshift cluster environment variable |
| partition_name | blank |
| qos_name | blank |
| account_name | cf_pi annotation |
| group_name | cf_project_id annotation |
| gid_number | blank |
| user_name | openshift namespace |
| uid_number | blank |
| submit_time | set to start_time |
| eligible_time | set to start_time |
| start_time | beginning of report time |
| end_time | end of report time |
| elapsed | end_time - start_time |
| exit_code | blank |
| state | RUNNING |
| nnodes | kube_deployment_status_replicas |
| ncpus | kube_pod_init_container_resource_requests_cpu_cores |
| req_cpus | kube_pod_init_container_resource_limits_cpu_cores |
| req_mem | kube_pod_init_container_resource_limits_memory_bytes |
| req_tres | cpu=<req_cpu>,mem=<req_mem>,node=<req_pods> |
| alloc_tres | set to req_tres |
| timelimit | set to elapsed |
| node_list | blank |
| job_name | openshift pod name |

In order to retrieve the above OpenShift information in the Slurm format needed by XDMoD, we can create a script that pulls the required data from Prometheus and OpenShift and formats it appropriately. XDMoD can then “shred” and “ingest” that data, allowing it to be viewed in its GUI. Other existing XDMoD functions - such as the automatic generation of reports - should then also be accessible for this OpenShift data.
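
As a sketch of what the output step of such a script might look like, the function below assembles one pipe-delimited record per namespace per reporting interval, following the field order in the table above. The exact format and field order expected by XDMoD's Slurm shredder would need to be confirmed before use, and all example values are illustrative.

```python
from datetime import datetime, timedelta
from uuid import uuid4

def slurm_record(namespace, pod_name, cluster_name, annotations, metrics, start, end):
    """Assemble one pseudo-Slurm accounting record for a namespace sample.

    `annotations` holds the cf_pi / cf_project_id namespace annotations and
    `metrics` holds the hourly Prometheus values keyed by metric name; the
    field order mirrors the mapping table above.
    """
    elapsed = int((end - start).total_seconds())
    req_cpus = metrics["kube_pod_init_container_resource_limits_cpu_cores"]
    req_mem = metrics["kube_pod_init_container_resource_limits_memory_bytes"]
    req_pods = metrics["kube_deployment_status_replicas"]
    req_tres = f"cpu={req_cpus},mem={req_mem},node={req_pods}"
    job_id = uuid4().int % 2**31  # autogenerated by the script

    fields = [
        job_id,                                 # job_id
        job_id,                                 # job_id_raw
        cluster_name,                           # cluster_name
        "",                                     # partition_name
        "",                                     # qos_name
        annotations.get("cf_pi", ""),           # account_name
        annotations.get("cf_project_id", ""),   # group_name
        "",                                     # gid_number
        namespace,                              # user_name
        "",                                     # uid_number
        start.isoformat(),                      # submit_time (set to start_time)
        start.isoformat(),                      # eligible_time (set to start_time)
        start.isoformat(),                      # start_time
        end.isoformat(),                        # end_time
        elapsed,                                # elapsed
        "",                                     # exit_code
        "RUNNING",                              # state
        req_pods,                               # nnodes
        metrics["kube_pod_init_container_resource_requests_cpu_cores"],  # ncpus
        req_cpus,                               # req_cpus
        req_mem,                                # req_mem
        req_tres,                               # req_tres
        req_tres,                               # alloc_tres (set to req_tres)
        elapsed,                                # timelimit (set to elapsed)
        "",                                     # node_list
        pod_name,                               # job_name
    ]
    return "|".join(str(field) for field in fields)

# Example: one record for the hour starting 2021-06-01T00:00:00Z (values are illustrative).
start = datetime(2021, 6, 1, 0, 0, 0)
print(slurm_record(
    namespace="example-project",
    pod_name="example-pod",
    cluster_name="example-cluster",  # placeholder cluster name
    annotations={"cf_pi": "pi_username", "cf_project_id": "project_id"},
    metrics={
        "kube_pod_init_container_resource_requests_cpu_cores": 2,
        "kube_pod_init_container_resource_limits_cpu_cores": 4,
        "kube_pod_init_container_resource_limits_memory_bytes": 4 * 1024**3,
        "kube_deployment_status_replicas": 3,
    },
    start=start,
    end=start + timedelta(hours=1),
))
```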