We need to integrate compute/network with the iRODS data grid. To start, we will configure the KINC Pegasus workflow to ingest input data from the SciDAS iRODS deployment. We envision raw data being transferred from NCBI into iRODS using replication policies prior to workflow execution, but for now let's assume that the data is already in iRODS.
See email thread initiated by @feltus and response from Mats@ISI.
-----Original Message-----
From: Mats Rynge [mailto:[email protected]]
Sent: Thursday, September 21, 2017 12:48 PM
To: Alex Feltus [email protected]
Cc: Claris Castillo [email protected]; Ficklin, Stephen Patrick ([email protected]) [email protected]; William Poehlman ([email protected]) [email protected]
Subject: Re: iRODS and OSG-GEM/OSG-KINC Workflows
For the SciDAS project, we now have iRODS installations at WSU, RENCI (of course), and CU. We are loading it up with genomes and will be moving data into the iRODS zone from NCBI and our own file systems. Then we want to pull from iRODS into Pegasus workflows to process on OSG (+Cloudlab and Chameleon).
QUESTION: Is it possible to pull data from a remote iRODS zone directly into the Pegasus workflow at osgconnect and then run on OSG?
Yes, just put it in URL form by prepending irods:// to the path. This lets Pegasus know it is an iRODS location. Then add an "irodsPassword" to your configuration file:
https://pegasus.isi.edu/documentation/cred_staging.php#irods_cred
It has been a while since we did this implementation, so it might need some refreshing - but that should be easy.
What workflow is this for? For GEM, data would probably have to be pulled in to the submit host and split up. If you have more of a one-input-per-job setup, you can do staging bypass and have the jobs start up and pull directly from iRODS. You have to be careful with this approach, as you can easily end up with hundreds or thousands of clients interacting with your data store.
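As a rough illustration of the "prepend irods:// to the path" step from the reply above, here is a minimal Python sketch. The helper names and the example paths are hypothetical, and whether the URL needs a host component before the path is an assumption to check against the Pegasus documentation; the sketch only does the literal prefixing described in the email.

```python
from posixpath import basename

IRODS_SCHEME = "irods://"


def to_irods_url(path: str) -> str:
    """Prepend the irods:// scheme unless the path already carries it."""
    if path.startswith(IRODS_SCHEME):
        return path
    return IRODS_SCHEME + path


def build_input_map(paths):
    """Map each logical file name (the basename) to its irods:// URL."""
    return {basename(p): to_irods_url(p) for p in paths}


if __name__ == "__main__":
    # Hypothetical iRODS paths, for illustration only.
    example_paths = [
        "/exampleZone/home/kinc/sample_a.txt",
        "/exampleZone/home/kinc/sample_b.txt",
    ]
    for lfn, url in build_input_map(example_paths).items():
        print(lfn, url)
```

The resulting logical-name-to-URL pairs are what would go into the workflow's input file listing; the iRODS password itself belongs in the credential configuration described at the link above, not in the URLs.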