Reactome Release #85 #215
Comments
@dustine32 @deustp01 I have spent the day, when not in meetings, QCing the models with respect to the tickets we decided to cover. I'm willing to give this a thumbs up, keeping in mind the caveats about the reactions that are outside the main pathway. I'm going to toss this back into your court, @dustine32, for the ShEx checks. |
@ukemi Thank you for the quick testing! Here is the ShEx report for this run: Looks like 161 models fail so hopefully it's an obvious systemic cause that's easy to fix. |
Checking the file:
|
Say NO to drugs: Row 19 in the Main report spreadsheet
... is a drug metabolism pathway, as are all children of the "Drug ADME" superpathway. Someday, even if drugs remain out of scope for GO-CAM, we may want to see if useful in-scope xenobiotic metabolism can be mined from these pathways but for now, I think it would be OK to exclude the entire "Drug ADME" superpathway from the collection used as inputs for GO-CAM, in the same way that we exclude all children of the "Disease" superpathway. |
Interesting dilemma that we should discuss. GO does consider response to drugs in scope at the moment, e.g. response to cisplatin. Would these pathways then count as acceptable? It's not the action of a drug. I'm not sure what the current plan is. I do think at some point someone is going to see that the drug pathways have a use, even if it's outside the scope of GO. |
@ukemi Yeah, it could be I need to update minerva or it may be using a cached ShEx spec. I'm taking a look now! |
@ukemi Ok, (sigh) it was an older (by a few days) version of the ShEx spec causing issues. Specifically, the I'm rerunning the checks after also updating minerva and will have new, better results soon! |
True - drug "ADME" (absorption, distribution, metabolism, excretion) may well be in scope for GO. But if we go in that direction, children of GO:0042221 response to chemical "Any process that results in a change in state or activity of a cell or an organism (in terms of movement, secretion, enzyme production, gene expression, etc.) as a result of a chemical stimulus," may be a poorer fit than children of GO:0006805 xenobiotic metabolic process "The chemical reactions and pathways involving a xenobiotic compound, a compound foreign to the organism exposed to it. It may be synthesized by another organism (like ampicillin) or it can be a synthetic chemical." The definition of GO:0042221 to me implies phenomena in the realm of physiology / homeostasis - how does the organism restore a steady state after a chemical perturbation, while GO:0006805 lets us focus narrowly on metabolic processes that occur in response to introduction of a foreign substance, maybe with some help from children of GO:0042908 xenobiotic transport. This sounds like a strategy question and maybe of broad enough interest to be discussed on an ontology call. Also scope creep for this ticket - maybe a new "say maybe to drugs" ticket? |
Will this have any effect on the lists of unexpected ChEBI instances in the current version of the main table? Or (I guess) is this a separate issue, and all those ChEBI's still need to be sorted out? |
It should not sweep the violators under the rug. It should only filter out the models that I see passing the ShEx when I run it 'live' from Noctua. The ones with chemical violators still come up as 'Invalid'. @deustp01, is the column I'm filling in on the spreadsheet useful to you? I'm doing some other QC today, but if it is useful I will continue to fill it out. |
@ukemi @deustp01 Here is the new, fixed ShEx report: Only 86 fails! @deustp01 Right, the unrecognized ChEBI classes are a separate issue and shouldn't change with the fixed ShEx report. |
I'm confused. The previous "Main Report June-22-2023" spreadsheet has a column (column J) "suspected reason for failure". In the latest version main_report.txt, this column is gone. But this is the column I was using to get the ChEBI IDs of the molecules that need to be investigated to find wrong charge states and stereochemistry, and stealth drugs. Should I continue to work from the June 22 spreadsheet? Also, @ukemi, what version of the spreadsheet are you working on and what column are you putting comments in? Or are those comments things like the entry in row 5 of the June 22 spreadsheet "This one passes when I run the reasoner in Noctua?????", in which case I'm already looking in the right place. For now, I will work from ChEBI numbers in column J of the June 22 spreadsheet and build my own document that lists each ChEBI ID, what Reactome pathway / GO-CAM model it is associated with, what problem I found, and how I resolved it in the Reactome central database. OK? |
That's the column I added and was editing by hand. I guess that answers my question about whether it was useful. :) |
@deustp01, I'll continue to work in the old spreadsheet, but will use the new one as a guide. If the column is blank, it passes and is excluded from the new run. Does that make sense? |
Yes, definitely. Also, do I remember right that not all ChEBI instances are included in NEO, but only a subset of plausible ones? So that would explain row 2 in the table, where we annotated a weird small molecule product, CHEBI:142614 - 5-guanidinohydantoin, generated when a modified base is removed from damaged DNA, and the fix will be to add CHEBI:142614 to the list of permitted small molecules? |
You remember correctly. Chemicals are only loaded if up until this point we needed to use them. So I suspect the failures will be of two flavors: ones that are correct but we never needed them before and ones where the charge state/etc is wrong and the correct form is available. Once you have vetted the violators, we will send a list of the needed chemicals to @balhoff so we can include them in the ChEBI load. |
@deustp01 I'm getting through the list. I stopped at the Lewis Blood group today. I should be able to finish the rest by tomorrow before I head out. |
@ukemi I'm at row 69 - KEAP1-NFE2L2 pathway - many, many small molecules with wrong charges where we will need to ask for a new ChEBI instance to get it right, as well as many cases where I expect the needed ChEBI is not on Jim's list. Is there an easy way to check that list ourselves? I'm keeping my notes so far in an Excel spreadsheet on my laptop just because that makes moving between windows easy. If it's helpful, I could make my results so far into a third sheet of the "Main Report" Google doc, and add new rows as I progress. |
Kind of a hacky solution, but if you go to the Noctua interface and go to a model, you can just go to a Reactome one. Click on add individual over at the left. Enter the ChEBI identifier and see if it autocompletes to the chemical. If it does, then NEO knows about it. If it doesn't then we will have to add it. |
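If a plain list of the ChEBI IDs already loaded into NEO were available, the one-at-a-time autocomplete check described above could be done in bulk. The following is only a minimal sketch under that assumption: the thread does not say such an export exists, `partition_by_known` is a hypothetical helper, and the `neo_ids` contents are made up for illustration (only CHEBI:142614 comes from this thread).

```python
def partition_by_known(candidate_ids, known_ids):
    """Split candidate ChEBI IDs into those present in known_ids and those missing.

    IDs are compared after trimming whitespace and upper-casing, so
    'chebi:15377 ' and 'CHEBI:15377' count as the same identifier.
    """
    known = {i.strip().upper() for i in known_ids if i.strip()}
    present, missing = [], []
    for raw in candidate_ids:
        cid = raw.strip().upper()
        if not cid:
            continue  # skip blank lines
        (present if cid in known else missing).append(cid)
    return present, missing


# Hypothetical inputs: neo_ids stands in for whatever list NEO is built from.
neo_ids = ["CHEBI:15377", "CHEBI:15378"]
to_check = ["CHEBI:142614", "chebi:15377 "]

present, missing = partition_by_known(to_check, neo_ids)
print("already loaded:", present)
print("need to request:", missing)
```

The `missing` list is what would go into a request to add chemicals; it does not tell you whether an ID has the right charge state, which still needs manual review.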
@deustp01 It's more work, but are you also keeping Rhea in the loop with these? We might as well make sure that that coordination stays in alignment or it may come back to haunt us. |
My fantasy is that corrections propagate into the Reactome released database, yielding reactions that now match Rhea, and these matches propagate into Rhea. The reality is that the first step happens reliably, and matching to Rhea and propagation of the match into Rhea are do-able. We (you, me, Dustin) need to have a conversation with Adam Wright to figure out how to get the do-able stuff to happen, maybe bringing in Alan Bridge from Rhea to help with figuring out what we need to export for Rhea to pick up. In a big majority of the charge state problems I'm finding, ChEBI doesn't even have a term for the pH 7.3 form - ChEBI made the existing term in response to a pH-ignorant request from us - so a first step will be to get ChEBI terms. Onwards! |
This is much better than checking them ourselves in Noctua, guessing that processing such a list is easy for @balhoff . Any requests for what goes in the list besides the ChEBI ID, and for how it's formatted beyond one ID per line of plain text file? |
Makes perfect sense. I still would love to see the 3-resource alignment. I just finished the list. There were a couple of models that still seemed to pass when I ran the reasoner on my end. I'm going to try to figure out what's happening with the transporters and the ShEx violations. It looks like some things are failing because chemicals aren't being recognized as chemicals. They are resolving to ChEBI identifiers that are recognized, but still throwing a violation. Just so we have a record, here is how I check the things on this report:
|
> This is much better than checking them ourselves in Noctua, guessing that processing such a list is easy for @balhoff . Any requests for what goes in the list besides the ChEBI ID, and for how it's formatted beyond one ID per line of plain text file?

They get added to an import text file in this format: We can probably just open an ontology ticket and ask an editor to do it. I haven't done anything like this in so long, I'm a bit unsure of myself. Plus opening the ticket and tagging it would create a record of our work. |
Complete agreement here - this is essential, so all the maneuvering is aimed at using us effectively to get there. |
I've finished my review, summarized in two new worksheets added to the main report Google doc. The first sheet lists each GO-CAM, in the order they are listed on the first worksheet, for which there were ChEBI-related issues, with a separate row for each ChEBI instance that includes my guess as to what's going on. I expect that in most cases, the ChEBI IDs are not on Jim's list to build into NEO. But many of these IDs, as noted, also now point to molecules that are not in their correct pH 7.3 charge state. My opinion is that rather than temporarily populating NEO with incorrect ChEBI molecules to support current Rhea-noncompliant Reactome models, we should fix Reactome. A complication is that in most cases ChEBI does not yet have an entry for the pH 7.3 form of the molecule, so a first step will be to get the needed ChEBI instances, then fix Reactome and map the fixed reactions to Rhea, then fix NEO. In the second added worksheet, I sorted the first one on the ChEBI ID column and edited to get a list of all 153 ChEBI IDs we are concerned with - substantial work, but realistic to get done perhaps even in time for the next Reactome release if ChEBI IDs are easy to get. @ukemi @dustine32 sanity check please. |
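The "sort on the ChEBI ID column and edit down to a unique list" step could also be scripted. This is a rough sketch, not what was actually done for the spreadsheet: it assumes the IDs appear in free-text cells in the form `CHEBI:<digits>`, and the `notes` rows, `unique_chebi_ids` and `to_plain_text` are hypothetical names and data for illustration. The output format follows the "one ID per line of plain text file" suggestion made earlier in the thread.

```python
import re

# Matches identifiers such as CHEBI:142614, case-insensitively.
CHEBI_RE = re.compile(r"CHEBI:(\d+)", re.IGNORECASE)


def unique_chebi_ids(cells):
    """Collect every ChEBI ID mentioned in the cells, deduplicated, sorted by number."""
    numbers = set()
    for cell in cells:
        for match in CHEBI_RE.finditer(cell):
            numbers.add(int(match.group(1)))
    return [f"CHEBI:{n}" for n in sorted(numbers)]


def to_plain_text(ids):
    """Render the IDs one per line, ending with a newline."""
    return "\n".join(ids) + "\n"


# Hypothetical spreadsheet cells; only CHEBI:142614 is taken from this thread.
notes = [
    "CHEBI:142614 not recognized",
    "example note mentioning chebi:15377 and CHEBI:142614 again",
    "",
]

ids = unique_chebi_ids(notes)
print(to_plain_text(ids), end="")
```

Sorting by the numeric part rather than as strings keeps shorter IDs ahead of longer ones, which makes the list easier to compare against the hand-edited worksheet.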
- [X] - Replace transports_or_maintains_localization_of relations with has_primary_input (#200). I believe we also still need to update the ShEx: Update ShEx rules for transporter activity (go-shapes#273)
- [ ] - Remove (don't load) models that are a single node (#202)
- [X] - Change the representation of regulatory reactions to use 'has_small_molecule_regulator' activity (#204). Checked in glycolysis model.
- [ ] - Molecular Events should not be entities (#119)
- [X] - Convert chemicals to ChEBI rather than Reacto (#221)
- [X] - Replace relation rules to parallel this ticket. /noctua/issues/813
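
Two of the checklist items above (the relation replacement in #200 and the single-node filter in #202) are simple to state as transformations. The sketch below only illustrates the intended logic on a toy in-memory structure. It is not the conversion pipeline's code: real GO-CAM models are OWL, the relations there are identified by ontology IDs rather than these label strings, and `models`, `rewrite_relations` and `is_single_node` are hypothetical names.

```python
OLD_RELATION = "transports_or_maintains_localization_of"
NEW_RELATION = "has_primary_input"


def rewrite_relations(edges):
    """Return a copy of (subject, relation, object) edges with the old relation replaced."""
    return [(s, NEW_RELATION if p == OLD_RELATION else p, o) for s, p, o in edges]


def is_single_node(nodes):
    """A model with at most one distinct node is not worth loading."""
    return len(set(nodes)) <= 1


# Toy models, for illustration only.
models = {
    "model_a": {
        "nodes": ["activity_1", "chemical_1"],
        "edges": [("activity_1", OLD_RELATION, "chemical_1")],
    },
    "model_b": {"nodes": ["activity_2"], "edges": []},
}

# Skip single-node models, rewrite relations in the rest.
loaded = {
    name: {"nodes": m["nodes"], "edges": rewrite_relations(m["edges"])}
    for name, m in models.items()
    if not is_single_node(m["nodes"])
}
print(loaded)
```

Here `model_b` is dropped because it has a single node, and the one edge in `model_a` ends up using `has_primary_input`.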