Utilize review.tsv #164

Open
1 task
joeflack4 opened this issue Nov 8, 2024 · 4 comments
Labels
omim qc quality control

Comments

@joeflack4

joeflack4 commented Nov 8, 2024

Overview

In #158, review.tsv was added to the list of artefacts. We should utilize it.

How to utilize

Copy/paste: review.tsv in README 2024/11/07

## Release files
...
- `review.tsv`: Special cases to consider for manual review

About review.tsv

Columns:

  • classCode: integer: ID of review case class
  • classShortName: string (camelCase): describing the review case class
  • value: any: Some form of data to review
  • comment: string (optional)
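A minimal sketch of how a consumer might read these columns, assuming the file is a plain tab-separated file with a header row. The `ReviewCase` dataclass and `read_review_cases` function are hypothetical names for illustration, not something that exists in the repo, and the sample omits the optional `comment` value.

```python
import csv
import io
from dataclasses import dataclass
from typing import List, Optional, TextIO


@dataclass
class ReviewCase:
    """One row of review.tsv (hypothetical container, for illustration)."""
    class_code: int
    class_short_name: str
    value: str
    comment: Optional[str] = None


def read_review_cases(handle: TextIO) -> List[ReviewCase]:
    """Parse tab-separated review rows; an empty or missing comment becomes None."""
    reader = csv.DictReader(handle, delimiter="\t")
    cases = []
    for row in reader:
        cases.append(
            ReviewCase(
                class_code=int(row["classCode"]),
                class_short_name=row["classShortName"],
                value=row["value"],
                comment=row.get("comment") or None,
            )
        )
    return cases


# One row copied from the current review.tsv listing in this issue.
SAMPLE = (
    "classCode\tclassShortName\tvalue\tcomment\n"
    "1\tcausalDigenicBut1Assoc\t"
    "OMIM:619151: AMED syndrome, digenic (Gene: OMIM:103710)\t\n"
)

cases = read_review_cases(io.StringIO(SAMPLE))
print(cases[0])
```

With a real file, `open("review.tsv", newline="")` would replace the `io.StringIO` wrapper.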

1. causalD2gButMarkedDigenic

This review case covers what would otherwise be considered a valid disease-gene relationship, except that its label
unusually includes 'digenic' even though it has only 1 association. OMIM does not guarantee the data quality of its
disease-gene associations marked 'digenic', so for any of these entries, either:
(a) it is not 'digenic': OMIM should remove that from the label, and Mondo can either make an explicit exception to add
the relationship or wait until OMIM fixes the issue, at which point it will be added automatically; or
(b) it is in fact 'digenic', and OMIM should add the missing 2nd gene association.
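The rule described above could be sketched roughly as follows. This is an illustration only, not the detection logic actually used to build review.tsv: the function name `find_digenic_single_assoc` and both input dictionaries are assumptions, and the `EX:` identifiers are made-up placeholders. The AMED syndrome entry is taken from the current review.tsv listing in this issue.

```python
from typing import Dict, List, Tuple


def find_digenic_single_assoc(
    labels: Dict[str, str],
    causal_genes: Dict[str, List[str]],
) -> List[Tuple[str, str, str]]:
    """Flag diseases whose label says 'digenic' but that have exactly one gene association."""
    flagged = []
    for disease_id, label in labels.items():
        gene_ids = causal_genes.get(disease_id, [])
        if "digenic" in label.lower() and len(gene_ids) == 1:
            flagged.append((disease_id, label, gene_ids[0]))
    return flagged


labels = {
    "OMIM:619151": "AMED syndrome, digenic",
    "EX:0001": "Placeholder disease, digenic",  # hypothetical: two genes, not flagged
    "EX:0002": "Placeholder disease",  # hypothetical: no 'digenic' in label, not flagged
}
causal_genes = {
    "OMIM:619151": ["OMIM:103710"],
    "EX:0001": ["EX:G1", "EX:G2"],
    "EX:0002": ["EX:G3"],
}

flagged = find_digenic_single_assoc(labels, causal_genes)
for disease_id, label, gene_id in flagged:
    print(f"{disease_id}: {label} (Gene: {gene_id})")
```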

Markdown table of current review.tsv

From:

| classCode | classShortName | value |
| --- | --- | --- |
| 1 | causalDigenicBut1Assoc | OMIM:619151: AMED syndrome, digenic (Gene: OMIM:103710) |
| 1 | causalDigenicBut1Assoc | OMIM:108770: Atrial standstill, digenic (GJA5/SCN5A) (Gene: OMIM:121013) |
| 1 | causalDigenicBut1Assoc | OMIM:158901: Facioscapulohumeral muscular dystrophy 2, digenic (Gene: OMIM:614982) |
| 1 | causalDigenicBut1Assoc | OMIM:620040: Dyskeratosis congenita, digenic (Gene: OMIM:188350) |
| 1 | causalDigenicBut1Assoc | OMIM:619478: Facioscapulohumeral muscular dystrophy 4, digenic (Gene: OMIM:602900) |
| 1 | causalDigenicBut1Assoc | OMIM:256040: Proteasome-associated autoinflammatory syndrome 1 and digenic forms (Gene: OMIM:177046) |

Options

  • a. Relevant mondo repo pipelines (e.g. mondo#8108, "Update OMIM gene references in Mondo", for gene-disease associations) can examine it, filter for the specific review classes that apply to that pipeline, and print a message.
  • b. An SOP can tell the reviewer to check it.
  • c. Send an email when a release is run.
    • Probably not this. Too many emails, because releases run weekly and are often also run temporarily in PRs. And these review cases are not likely to go away for some time.
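As a rough sketch of option 'a', a pipeline could filter review.tsv down to the classes it cares about and print one message per case. Everything named here is an assumption for illustration: `review_messages`, the `RELEVANT_CLASSES` set, the message format, and the second sample row (a made-up class that exists only to show the filtering).

```python
import csv
import io
from typing import List, Set, TextIO

# Assumption: the gene-disease pipeline only cares about this review class.
RELEVANT_CLASSES = {"causalDigenicBut1Assoc"}


def review_messages(handle: TextIO, relevant: Set[str]) -> List[str]:
    """Return one human-readable message per review row whose class is relevant."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [
        f"REVIEW [{row['classCode']}: {row['classShortName']}] {row['value']}"
        for row in reader
        if row["classShortName"] in relevant
    ]


SAMPLE = (
    "classCode\tclassShortName\tvalue\n"
    "1\tcausalDigenicBut1Assoc\t"
    "OMIM:620040: Dyskeratosis congenita, digenic (Gene: OMIM:188350)\n"
    "2\thypotheticalOtherClass\tplaceholder value\n"  # made-up row, filtered out
)

messages = review_messages(io.StringIO(SAMPLE), RELEVANT_CLASSES)
for message in messages:
    print(message)
```

Printing rather than failing keeps the pipeline run green while still surfacing the cases to whoever reads the log, which matches the expectation that these cases may persist for some time.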

joeflack4 added the labels omim, analysis, QA / test, and qc, then removed QA / test and analysis (Nov 8, 2024).

@joeflack4

@twhetzel I marked as very high priority but feel free to change.

@joeflack4

@matentzn FYI, I think you will find this interesting. And also I think Trish and I are leaning towards option 'a'. If so, then this is likely something you will be working into your mondo repo source pipelines.

@joeflack4

@twhetzel Also, regarding the documentation for this "review class" causalD2gButMarkedDigenic, one detail I did not add is that Joanna told us something to the effect that they "don't chase digenic cases closely". So the data quality on these cases can't be assured.

Also a reminder to both of us: we are not importing digenic disease associations, but for these cases, it may be that they are not digenic, since there is only 1 association. If that's the case, we do care about them.

@joeflack4

Additional documentation consideration:

  • Could have the GitHub action, when it is creating a release, add that same snippet from the readme into the release comments.
