GL4U: Amplicon Seq 2023 Pilot ACCESS Request and Content Setup
- Summary
- Requesting resources
- Transferring credits
- Requesting an increase in allotted quotas
- Setting up instances
- Creating links for participants
- Deleting instances
ACCESS is an NSF-funded resource that can provide cloud computing for research and educational purposes. If you are an educator who wants to run the GL4U: Amplicon Seq 2023 bootcamp for your class, this page will walk you through the process of requesting resources and then how to utilize them. If you have any questions or hit any problems, please don't hesitate to reach out to [email protected] for help 🙂
The process is detailed below, but all that is required for submitting a request is filling out a form that includes:
- A couple of paragraphs providing an overview of the purpose (here for educational purposes) and how ACCESS would be used
- A CV of the submitting PI
- ACCESS - an NSF-funded program that we apply to in order to get cloud-computing resources
- Jetstream2 - one of the primary computing infrastructures ACCESS works with (and the one most suitable for requests like this)
- Exosphere - a web-interface for managing Jetstream2 computing resources
Click to Login to ACCESS at the top right of this page: https://access-ci.org/. There is no separate account-creation step; instead, you can choose from a few methods to authenticate your identity, such as ORCID or Google. You may also need to set up two-factor authentication.
Once logged in, the next step is to submit a request. There are different types offered, listed here. At the time of putting this page together, "Explore ACCESS" is the most appropriate for educational purposes.
To begin the process, while logged in, go to the opportunities page, and click to "SUBMIT AN EXPLORE ACCESS REQUEST".
There you will need to enter a few things. An example of each is presented below, but these should be adjusted for your scenario. Anything not listed here can be skipped on the form:
Running an Amplicon Sequencing Bioinformatics Course
General overview
I am a professor and would like to utilize the NASA GeneLab (https://genelab.nasa.gov/) GeneLab for Colleges and Universities (GL4U; https://github.com/nasa/GeneLab-Training/tree/main/GL4U) amplicon sequencing training materials with my students. This will likely take place over just a week or two, and an EXPLORE ACCESS allocation would provide sufficient resources to be able to provide the same computing environment to all participants.

How I plan to use ACCESS

I intend to use Indiana Jetstream2, managed through the Exosphere website, and to create individual m3.medium instances for each participant based off the publicly available "GL4U-amplicon-2023" image.

Thank you for your consideration and any help!
space biology, bioinformatics
check box for "Classroom or training activities"
"Other Biological Sciences"
Here is where you should upload your CV.
After filling out the above and attaching your CV, click SUBMIT at the bottom.
Once the request has been approved, you will need to transfer the credits from ACCESS to Jetstream2. While logged in at https://access-ci.org/, go to https://allocations.access-ci.org/requests to transfer credits.
For the appropriate allocation, select "Choose New Action", then "Exchange". On the next screen, choose Indiana Jetstream2 as the Resource, click "Add Resource", enter all credits, add anything to the comment box (it is required), then click Submit.
The starting allotted quotas will typically only allow up to around 10 concurrent instances. If you will have more active participants than that, you need to submit a request to increase the allotted quotas.
To do this, log into Jetstream2 using the same identity authentication you used above to log in to ACCESS, then click "Add allocation", then "Add ACCESS Account", and once verified, select the allocation to be added to Jetstream2.
Once the allocation is visible on Jetstream2 when you are logged in, go to this support page to build an email as described next: https://jetstream2.exosphere.app/exosphere/getsupport
Select the button for "An Allocation", then modify the following text to specify your allocation (e.g., "BIO######", which you can get from this ACCESS page) and how many concurrent instances you will need (an instance is a computer, so plan 1 for each participant):
Hi there,
We plan to use this allocation (<YOUR ALLOCATION ID>) with <YOUR TOTAL STUDENTS/PARTICIPANTS> concurrent m3.medium instances for a bioinformatics course we are running.
Could you please help with increasing the allotted quotas so that we will be able to run up to <YOUR TOTAL STUDENTS/PARTICIPANTS> m3.medium instances concurrently on this allocation, including cores, ram, volume, ports, available IP addresses, and whatever else would be required?
Thank you for any help!
Then click to "Build Support Request", copy the contents of the text window, and paste it in an email to "[email protected]" with the subject header "[Jetstream2] Support Request From Exosphere for Jetstream2".
Once the above is all taken care of, you can begin setting up instances.
There is extensive documentation on Jetstream2 here: https://docs.jetstream-cloud.org/ui/exo/exo/
It is a lot of material, and the Jetstream2 folks are super-responsive to requests for help, but, as mentioned above, feel free to reach out to [email protected] too.
Log into Jetstream2, select the appropriate allocation, then:
- choose "Create" at the top-right, then "Instance"
- select "By Image", click the text window to search by name, search for GL4U-amplicon-2023, and choose "Create Instance" on the "GL4U-amplicon-2023" image
- give the instance a name, like "Amplicon-Course"
- select "m3.medium"
- move the slider to create as many instances as needed; if you need more than the max that can be created at one time, repeat this process in as many batches as needed
- click "Advanced Options", and click to "Assign a public IP address to this instance"
- at the bottom is a "Boot Script", select the entire text and delete it, then replace it with the following:
```yaml
#cloud-config
users:
  - default
  - name: exouser
    shell: /bin/bash
    groups: sudo, admin
    sudo: ['ALL=(ALL) NOPASSWD:ALL']
    {ssh-authorized-keys}
  - name: gl4u
    shell: /bin/bash
    groups: users
    lock_passwd: false
    passwd: $1$V9O.SGtD$LqHZP91jT/Sjhax8kWSQF1
ssh_pwauth: true
package_update: true
package_upgrade: {install-os-updates}
packages:
  - git
{write-files}
bootcmd:
  # have this here and in runcmd so if shelved and restarted it should launch jupyter again (here alone didn't work for me)
  - sudo -u gl4u -H sh -c "bash /opt/jupyter-boot.sh"
runcmd:
  # downloading and unpacking notebooks
  - sudo -u gl4u -H sh -c "curl -L -o ~/GL4U-2023-amplicon-bootcamp-notebooks.zip https://figshare.com/ndownloader/files/41500734"
  - sudo -u gl4u -H sh -c "unzip ~/GL4U-2023-amplicon-bootcamp-notebooks.zip -d ~/"
  - sudo -u gl4u -H sh -c "rm ~/GL4U-2023-amplicon-bootcamp-notebooks.zip"
  # next is launching jupyter lab
  - sudo -u gl4u -H sh -c "bash /opt/jupyter-boot.sh"
  # this is sourced here to set the R kernel to be findable by jupyter (haven't gotten it to work any other way yet...)
  - sudo -u gl4u -H sh -c "bash -ic \". ~/.bashrc\""
  - echo on > /proc/sys/kernel/printk_devkmsg || true  # Disable console rate limiting for distros that use kmsg
  - sleep 1  # Ensures that console log output from any previous command completes before the following command begins
  - >-
    echo '{"status":"running", "epoch": '$(date '+%s')'000}' | tee --append /dev/console > /dev/kmsg || true
  - chmod 640 /var/log/cloud-init-output.log
  - {create-cluster-command}
  - (which apt-get && apt-get install -y python3-venv) # Install python3-venv on Debian-based platforms
  - (which yum && yum install -y python3) # Install python3 on RHEL-based platforms
  - |-
    python3 -m venv /opt/ansible-venv
    . /opt/ansible-venv/bin/activate
    pip install --upgrade pip
    pip install ansible-core
    ansible-pull \
      --url "{instance-config-mgt-repo-url}" \
      --checkout "{instance-config-mgt-repo-checkout}" \
      --directory /opt/instance-config-mgt \
      -i /opt/instance-config-mgt/ansible/hosts \
      -e "{ansible-extra-vars}" \
      /opt/instance-config-mgt/ansible/playbook.yml
  - ANSIBLE_RETURN_CODE=$?
  - if [ $ANSIBLE_RETURN_CODE -eq 0 ]; then STATUS="complete"; else STATUS="error"; fi
  - sleep 1 # Ensures that console log output from any previous commands complete before the following command begins
  - >-
    echo '{"status":"'$STATUS'", "epoch": '$(date '+%s')'000}' | tee --append /dev/console > /dev/kmsg || true
mount_default_fields: [None, None, "ext4", "user,exec,rw,auto,nofail,x-systemd.makefs,x-systemd.automount", "0", "2"]
mounts:
  - [ /dev/sdb, /media/volume/sdb ]
  - [ /dev/sdc, /media/volume/sdc ]
  - [ /dev/sdd, /media/volume/sdd ]
  - [ /dev/sde, /media/volume/sde ]
  - [ /dev/sdf, /media/volume/sdf ]
  - [ /dev/vdb, /media/volume/vdb ]
  - [ /dev/vdc, /media/volume/vdc ]
  - [ /dev/vdd, /media/volume/vdd ]
  - [ /dev/vde, /media/volume/vde ]
  - [ /dev/vdf, /media/volume/vdf ]
```
Then click "Create".
The instances tend to build in just a few minutes. You can see the status indicator of the instances on the Jetstream2 allocation page start at "Building", changing through a few stages, and ending at "Ready" when they are ready.
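Before sending out links, you may want to sanity-check that each instance's JupyterLab server is actually answering. This is a minimal sketch, not part of the course materials: it assumes you have the list of instance IPs handy and that Jupyter listens on port 8000 as configured above; the function name and timeout are illustrative.

```python
import socket

def jupyter_reachable(ip: str, port: int = 8000, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to ip:port succeeds within timeout."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:  # covers refused connections, timeouts, and bad hostnames
        return False

# Example: check a list of instance IPs (replace with your real IPs)
for ip in ["203.0.113.10", "203.0.113.11"]:
    status = "up" if jupyter_reachable(ip, timeout=1.0) else "not reachable yet"
    print(f"{ip}: {status}")
```

A simple TCP check like this only confirms something is listening; opening the link in a browser is still the definitive test.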
Each instance has its own IP address, and that IP address can be used to provide a link for participants to access their own cloud-computing environment through a web browser. The examples below use the mock IP address "XXX.XXX.XXX.XX", so you would need to substitute each instance's actual IP, but this is what the links would look like:
A link structured like this would take you to the base Jupyter lab environment: http://XXX.XXX.XXX.XX:8000
A link structured like this would take you to the opening overview notebook page: http://XXX.XXX.XXX.XX:8000/lab/tree/00-overview.ipynb
Each user has the same user name (gl4u) and password (gl4u2023). These instances are ephemeral and are not meant to hold anything that needs to be secure.
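With many participants, generating the links programmatically avoids copy-paste typos. Here is a small sketch, assuming you have collected the instance IPs into a list; the helper name and example IPs are hypothetical, while the port and notebook path come from the link examples above.

```python
def participant_links(ips, port=8000, notebook="00-overview.ipynb"):
    """For each instance IP, build the base JupyterLab URL and the opening-notebook URL."""
    return [
        (f"http://{ip}:{port}", f"http://{ip}:{port}/lab/tree/{notebook}")
        for ip in ips
    ]

# Example with mock IPs -- substitute each instance's real public IP
for base_url, overview_url in participant_links(["203.0.113.10", "203.0.113.11"]):
    print(base_url, overview_url)
```

You could then paste the printed pairs into a roster spreadsheet, one row per participant.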
When finished, you can select multiple instances on the Instance page of Jetstream2 for the appropriate allocation, and choose to delete the instances.
Detailed Jetstream2 documentation can be found at https://docs.jetstream-cloud.org/, and feel free to reach out to [email protected] 🙂