Allow for custom attributes and read type description of fastq #102

Open
wants to merge 1 commit into
base: master

cgirardot

Implements #86

Allows adding "schema_attribute[tag]" columns (e.g. sample_attribute[treatment]) to the input schema tables (TSV only), where schema is one of 'sample', 'run', 'experiment' or 'study'; for example, a new sample_attribute[treatment] column in ena_sample.tsv. These extra headers are picked up during XML generation and written to the generated XML as an ATTRIBUTE sequence (the templates were modified accordingly). For samples, only the default ERC000011 was modified to support these additional attributes. Units are not yet supported.
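To illustrate the header convention, here is a minimal sketch of how a `schema_attribute[tag]` column could be recognised in a TSV row and turned into an attribute sequence. It is not the code in this PR (which uses templates): the helper names `split_attribute_columns` and `build_attribute_block` are hypothetical, and the `SAMPLE_ATTRIBUTES` / `SAMPLE_ATTRIBUTE` / `TAG` / `VALUE` element names are assumed from the SRA XML schema.

```python
import re
import xml.etree.ElementTree as ET

# Matches headers such as "sample_attribute[treatment]".
ATTRIBUTE_HEADER = re.compile(r"^(sample|run|experiment|study)_attribute\[(.+)\]$")


def split_attribute_columns(row):
    """Separate regular columns from schema_attribute[tag] columns of one row.

    Hypothetical helper: returns (regular_columns, attributes), where
    attributes is a list of (schema, tag, value) tuples. Empty cells are skipped.
    """
    regular, attributes = {}, []
    for header, value in row.items():
        match = ATTRIBUTE_HEADER.match(header)
        if match is None:
            regular[header] = value
        elif value != "":
            attributes.append((match.group(1), match.group(2), value))
    return regular, attributes


def build_attribute_block(schema, attributes):
    """Build e.g. <SAMPLE_ATTRIBUTES> with one child per (tag, value) pair.

    Element names are an assumption based on the SRA schema; units are
    deliberately left out, as in the PR description.
    """
    prefix = schema.upper()
    parent = ET.Element(f"{prefix}_ATTRIBUTES")
    for attr_schema, tag, value in attributes:
        if attr_schema != schema:
            continue
        node = ET.SubElement(parent, f"{prefix}_ATTRIBUTE")
        ET.SubElement(node, "TAG").text = tag
        ET.SubElement(node, "VALUE").text = value
    return parent


row = {"alias": "s1", "sample_attribute[treatment]": "heat"}
regular, attributes = split_attribute_columns(row)
block = build_attribute_block("sample", attributes)
print(ET.tostring(block, encoding="unicode"))
```

Keeping the attribute columns separate from the regular ones means existing columns are processed exactly as before, and the extra tags only appear when the table actually contains them.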

Additionally, support for read_type and read_label (as new headers in ena_run.tsv) is added to the run XML for files of type fastq. This covers single-cell situations where more than two fastq files are available, in which case ENA requires read_type to be described. Multiple values can be passed as comma-separated values, e.g. paired,cell_barcode.
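As a rough sketch of the comma-separated handling, the snippet below splits a `read_type` cell and attaches one `READ_TYPE` child per value to a `FILE` element, only when the file type is fastq. The function names are hypothetical, and placing `READ_TYPE` directly under `FILE` is an assumption about the run XML layout rather than a quote of the templates in this PR.

```python
import xml.etree.ElementTree as ET


def split_csv_cell(cell):
    """Split a cell such as "paired,cell_barcode" into trimmed, non-empty values."""
    return [item.strip() for item in cell.split(",") if item.strip()]


def build_file_element(filename, filetype, read_types_cell=""):
    """Build a FILE element; READ_TYPE children are only added for fastq files.

    Hypothetical helper: the element layout is an assumption, and checksum
    attributes are omitted to keep the sketch short.
    """
    file_node = ET.Element("FILE", filename=filename, filetype=filetype)
    if filetype == "fastq":
        for read_type in split_csv_cell(read_types_cell):
            ET.SubElement(file_node, "READ_TYPE").text = read_type
    return file_node


node = build_file_element("cells_R1.fastq.gz", "fastq", "paired,cell_barcode")
print(ET.tostring(node, encoding="unicode"))
```

An empty or missing `read_type` cell yields a plain `FILE` element, so run tables without the new headers are unaffected.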

Limitations: read_label is not fully supported, as this would require supporting SpotDescriptorType in the run XML, and it is unclear how this information could be passed. Basic support for SPOT_DECODE_SPEC with a READ_SPEC using BASE_COORD (see SRA.common.xsd) could be provided with:

  • headers like READ_SPEC 1...READ_SPEC n where the header number is the READ_SPEC's READ_INDEX
  • value would be formatted like READ_LABEL:READ_CLASS:READ_TYPE:BASE_COORD:SPOT_LENGTH. For example: UMI1:Application Read:Other:1:8.
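The proposed column format could be parsed along these lines. This is only a sketch of the idea above, not something implemented in this PR: the `ReadSpec` class and `parse_read_spec` function are hypothetical, and the header is assumed to be written as `READ_SPEC` followed by a space and the index.

```python
import re
from dataclasses import dataclass

# Assumed header spelling: "READ_SPEC 1", "READ_SPEC 2", ...
READ_SPEC_HEADER = re.compile(r"^READ_SPEC (\d+)$")


@dataclass
class ReadSpec:
    read_index: int
    read_label: str
    read_class: str
    read_type: str
    base_coord: int
    spot_length: int


def parse_read_spec(header, value):
    """Parse one READ_SPEC column into a ReadSpec.

    The value is expected as READ_LABEL:READ_CLASS:READ_TYPE:BASE_COORD:SPOT_LENGTH.
    """
    match = READ_SPEC_HEADER.match(header.strip())
    if match is None:
        raise ValueError(f"not a READ_SPEC header: {header!r}")
    fields = value.split(":")
    if len(fields) != 5:
        raise ValueError(f"expected 5 colon-separated fields, got {len(fields)}: {value!r}")
    label, read_class, read_type, base_coord, spot_length = fields
    return ReadSpec(
        read_index=int(match.group(1)),
        read_label=label,
        read_class=read_class,
        read_type=read_type,
        base_coord=int(base_coord),
        spot_length=int(spot_length),
    )


spec = parse_read_spec("READ_SPEC 1", "UMI1:Application Read:Other:1:8")
print(spec)
```

Rejecting values that do not have exactly five fields keeps malformed cells from silently producing an invalid spot descriptor; a label containing a colon would need an escaping rule that the proposal does not define.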

…e[tag]' and inject each of these tags as a schema_attribute XML sequence. Additionally, support for read_type and read_label (not fully supported yet though) is added to the run XML (when present in the run ena table)