How do we define the technical capabilities of a given TES API #188

Open
patmagee opened this issue Sep 22, 2022 · 2 comments

@patmagee

TES provides an abstraction layer on top of the smallest unit of work in the workflow execution stack and interfaces directly with some sort of compute infrastructure. At the moment, users of the TES API need to know and understand the technical constraints of the underlying compute environment in order to know what capabilities a given TES API has. This is a problem insofar as it leaks implementation details and requires the user to have that information before using the API.

If we push the boundaries of WES and TES, the natural conclusion is federating work submitted to WES across different TES backends that fit the mould of the requested task resources/capabilities. This implies A LOT of machine -> machine interaction, where a WES API determines where to send work. In order to accomplish this, we really need a way to describe the complete list of technical capabilities of a TES backend. #186 is a good example of a specific technical ability that would need to be described in some way. Other examples would be GPU support (and what kind of GPUs), the range of CPUs, supported data types, etc.

You could imagine that a workflow run through WES could take each individual task (in CWL/WDL parlance) and use heuristics to map it onto a particular TES backend. I know that some work has been done by @uniqueg and his team on building a Gateway TES that plays a similar role, but I wonder what would be required to make any TES API able to participate in this gateway approach.
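
To make the per-task mapping idea concrete, here is a minimal sketch of one possible heuristic. Everything in it is an assumption for illustration: the `Backend` record, the `gpu` flag on a task, and the "prefer non-GPU backends, then the smallest one that fits" rule are hypothetical and not part of TES or WES. Only the `cpu_cores` and `ram_gb` keys are meant to echo TES resource requests.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backend:
    """Hypothetical capability record for one TES backend."""
    name: str
    max_cpu_cores: int
    max_ram_gb: float
    has_gpu: bool = False


def fits(resources, backend):
    """Return True if the backend can satisfy the requested resources."""
    return (
        resources.get("cpu_cores", 1) <= backend.max_cpu_cores
        and resources.get("ram_gb", 0) <= backend.max_ram_gb
        # "gpu" is a hypothetical request flag used only in this sketch.
        and (not resources.get("gpu", False) or backend.has_gpu)
    )


def assign_tasks(tasks, backends):
    """Map each task name to a backend name, or None if nothing fits."""
    assignment = {}
    for task_name, resources in tasks.items():
        eligible = [b for b in backends if fits(resources, b)]
        # Naive heuristic: keep GPU backends free, then pick the smallest fit.
        eligible.sort(key=lambda b: (b.has_gpu, b.max_cpu_cores))
        assignment[task_name] = eligible[0].name if eligible else None
    return assignment


backends = [
    Backend("hpc-cluster", max_cpu_cores=64, max_ram_gb=512),
    Backend("gpu-cloud", max_cpu_cores=16, max_ram_gb=128, has_gpu=True),
    Backend("small-vm", max_cpu_cores=4, max_ram_gb=16),
]
tasks = {
    "align": {"cpu_cores": 8, "ram_gb": 32},
    "call-variants": {"cpu_cores": 4, "ram_gb": 16, "gpu": True},
    "report": {"cpu_cores": 1, "ram_gb": 2},
    "assemble": {"cpu_cores": 128, "ram_gb": 1024},
}
print(assign_tasks(tasks, backends))
```

A task that no backend can serve maps to `None`, which is the case a WES would have to surface to the user. The sketch only works if every backend publishes the same fields, which is the gap this issue is about.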

@uniqueg
Contributor

uniqueg commented Sep 22, 2022

Thanks @patmagee! I/we agree that this is a very important discussion to be had. The ability to provide arbitrary backend params in v1.1, coupled with the ability to provide some resource requirements and the broadcasting of some capabilities (supported storage protocols) via the service info endpoint might serve as a blueprint/starting point for that.

To address the constraints you mentioned, in a next step we could try to agree on a controlled vocabulary for capabilities and resource requirements, ideally starting with those that can be mapped well across a wide range of backends. We could then extend the service info accordingly to broadcast these capabilities. A nice side effect is that one could find appropriate TES instances dynamically via the Service Registry API.
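
As a rough illustration of what that could look like, the sketch below filters a list of service-info documents by capabilities expressed in a shared vocabulary. The vocabulary terms, the `capabilities` block, and the min/max range shape are all assumptions made up for this example; none of them exist in the TES specification today. The `storage` list stands in for the storage protocols that service info already broadcasts, with illustrative values.

```python
# Hypothetical controlled vocabulary; no terms have actually been agreed on.
VOCABULARY = {"cpu_cores", "ram_gb", "disk_gb", "gpu_count"}

# Hypothetical service-info documents, e.g. as collected from a registry.
registry = [
    {
        "name": "tes-hpc",
        "storage": ["file://", "s3://"],
        "capabilities": {  # assumed extension, not in the TES spec
            "cpu_cores": {"min": 1, "max": 64},
            "ram_gb": {"min": 1, "max": 512},
        },
    },
    {
        "name": "tes-gpu",
        "storage": ["s3://"],
        "capabilities": {
            "cpu_cores": {"min": 1, "max": 16},
            "ram_gb": {"min": 1, "max": 128},
            "gpu_count": {"min": 0, "max": 4},
        },
    },
]


def find_instances(requirements, services):
    """Return names of services whose advertised ranges cover every requirement."""
    unknown = set(requirements) - VOCABULARY
    if unknown:
        # Reject terms outside the vocabulary instead of silently ignoring them.
        raise ValueError(f"terms outside the vocabulary: {sorted(unknown)}")
    matches = []
    for info in services:
        caps = info.get("capabilities", {})
        if all(
            term in caps and caps[term]["min"] <= wanted <= caps[term]["max"]
            for term, wanted in requirements.items()
        ):
            matches.append(info["name"])
    return matches


print(find_instances({"cpu_cores": 8, "gpu_count": 2}, registry))
```

In this sketch a service that does not advertise a term is treated as not supporting it, which is one possible convention among several.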

Apart from the gateway TES, we have previously also worked on a task distribution logic that takes into account the location of the data to try to send the compute to the data. It also considers costs and expected completion time. To make that work, we have deployed a GET /tasks/info endpoint as a sidecar next to our TES instances. Essentially, sending the resource requirements to that endpoint will tell the client the expected costs and queueing time. It's a very naive model, but it did address, in a first attempt, the use cases of minimizing data transfer and balancing loads over multiple TES services.

For more details, have a look here: https://github.com/elixir-cloud-aai/TEStribute
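
For readers who want a feel for the idea without reading the repository, here is a minimal ranking sketch. It is not the TEStribute algorithm: the response shape (`cost`, `queue_time_s`), the weights, and the normalization are assumptions chosen for illustration, and the estimates are hard-coded rather than fetched from a sidecar endpoint.

```python
def rank_instances(estimates, cost_weight=0.5, time_weight=0.5):
    """Order TES instance names by a weighted, normalized score (lower is better).

    `estimates` maps an instance name to a dict with hypothetical keys
    "cost" and "queue_time_s", as a sidecar endpoint might report them.
    """
    if not estimates:
        return []
    # Normalize by the largest value so cost and time are comparable.
    max_cost = max(e["cost"] for e in estimates.values()) or 1.0
    max_time = max(e["queue_time_s"] for e in estimates.values()) or 1.0

    def score(name):
        e = estimates[name]
        return (
            cost_weight * e["cost"] / max_cost
            + time_weight * e["queue_time_s"] / max_time
        )

    return sorted(estimates, key=score)


# Made-up estimates for three hypothetical TES instances.
estimates = {
    "tes-a": {"cost": 10.0, "queue_time_s": 600},
    "tes-b": {"cost": 4.0, "queue_time_s": 1200},
    "tes-c": {"cost": 8.0, "queue_time_s": 60},
}
print(rank_instances(estimates))
print(rank_instances(estimates, cost_weight=1.0, time_weight=0.0))
```

Shifting the weights lets a client trade cost against waiting time. A data-locality term could be added to the score in the same way.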

@vsmalladi vsmalladi added this to the 1.2 milestone Jan 12, 2023
@uniqueg
Contributor

uniqueg commented Sep 20, 2024

See here for related cross-standards issue (and proposal): ga4gh/TASC#45
