Implement wildcard queries with N #53

Open
sdelatorrep opened this issue Jan 19, 2021 · 0 comments
Labels: enhancement (Improvement of a feature)

sdelatorrep commented Jan 19, 2021

In the spec (see here), we can read:

    ReferenceBases:
      description: |
        Reference bases for this variant (starting from `start`). 
        Accepted values: [ACGTN]*. N is a wildcard, that denotes the 
        position of any base, and can be used as a standalone base of any 
        type or within a partially known sequence. For example a sequence 
        where the first and last bases are known, but the middle portion can 
        exhibit countless variations of [ACGT], or the bases are unknown: 
        ANNT the Ns can take any form of [ACGT], which makes both ACCT 
        and ATGT (or any other combination) viable sequences.
      type: string
      pattern: '^([ACGTN]+)$'
    AlternateBases:
      description: |
        The bases that appear instead of the reference bases. Accepted 
        values: [ACGTN]*. N is a wildcard, that denotes the position of any 
        base, and can be used as a standalone base of any type or within a 
        partially known sequence. For example a sequence where the first and 
        last bases are known, but the middle portion can exhibit countless 
        variations of [ACGT], or the bases are unknown: ANNT the Ns can take 
        any form of [ACGT], which makes both ACCT and ATGT (or any 
        other combination) viable sequences.
        
        Categorical variant queries, e.g. such *not* being represented through 
        sequence & position,  make use of the `variantType` parameter.
        
        Optional: either `alternateBases` or `variantType` is required.
      type: string
      pattern: '^([ACGTN]+)$'

Implement this use of N as a wildcard for [ACGT] in both the `referenceBases` and `alternateBases` parameters.
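
A minimal sketch of the matching semantics described in the spec excerpt, written in Python for illustration only. The function names (`bases_to_regex`, `bases_match`) are hypothetical and not part of this project. It assumes that each `N` in the query stands for exactly one of A, C, G or T, and that stored sequences contain only concrete bases.

```python
import re

# Accepted query alphabet, following the spec's pattern '^([ACGTN]+)$'.
_VALID_QUERY = re.compile(r"[ACGTN]+")


def bases_to_regex(query: str) -> "re.Pattern[str]":
    """Translate a bases query into a compiled regex.

    Assumption: every N matches exactly one base out of A, C, G, T.
    """
    if not _VALID_QUERY.fullmatch(query):
        raise ValueError(f"invalid bases query: {query!r}")
    # Known bases are kept literally; each N becomes a one-base character class.
    return re.compile(query.replace("N", "[ACGT]"))


def bases_match(query: str, stored: str) -> bool:
    """Return True if the stored sequence is a viable form of the query."""
    return bases_to_regex(query).fullmatch(stored) is not None


if __name__ == "__main__":
    print(bases_match("ANNT", "ACCT"))  # True
    print(bases_match("ANNT", "ATGT"))  # True
    print(bases_match("ANNT", "ACT"))   # False: length differs
    print(bases_match("ANNT", "ACCG"))  # False: last base differs
```

If the lookup runs inside a database instead, the same translation could target that engine's pattern syntax. Whether a stored `N` should also be matched by a query `N` is left open here and would need to be decided against the spec.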

sdelatorrep added the enhancement (Improvement of a feature) label on Jan 19, 2021