Investigate Scov2 polyprotein listed as a species in Noctua #486

Open
vanaukenk opened this issue May 27, 2022 · 13 comments

@vanaukenk

During the QC checks for bringing Noctua up after the 2022-05-26 outage, I noticed a suspicious entry, pp1ab Scov2, in the list of species:

image

I thought pp1ab was a polyprotein and that's how it looks in noctua-amigo:

image

@balhoff @tmushayahama - can you take a look to see why this entry is included as a species? Thanks.

Also tagging @kltm

@balhoff
Member

balhoff commented May 27, 2022

@tmushayahama how is that list created? (What service does it call to get it?) 'pp1ab Scov2' does not look like a taxon, at least not in the latest NEO file.

@vanaukenk
Author

@balhoff, @tmushayahama uses the taxon API from Minerva (/taxa).

@kltm
Member

kltm commented May 27, 2022

As a hint, noting that the /taxa API is returning:
{ id: "http://identifiers.org/uniprot/P0DTD1", label: "pp1ab Scov2" }
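
A rough way to spot entries like this in the /taxa output is to flag anything whose id is not an NCBITaxon IRI. This is only a sketch: the `http://purl.obolibrary.org/obo/NCBITaxon_` prefix, the `suspicious_taxa` helper, and the first sample entry (including its label) are assumptions for illustration, not something taken from Minerva itself. Only the second entry comes from the response quoted above.

```python
# Assumed IRI prefix for well-formed taxon entries; adjust to whatever
# the service really returns for genuine taxa.
NCBITAXON_IRI_PREFIX = "http://purl.obolibrary.org/obo/NCBITaxon_"


def suspicious_taxa(entries, prefix=NCBITAXON_IRI_PREFIX):
    """Return the entries whose id does not start with the expected prefix."""
    return [e for e in entries if not e.get("id", "").startswith(prefix)]


# Hypothetical sample: the first entry is made up for contrast,
# the second is the entry reported in this issue.
sample = [
    {"id": "http://purl.obolibrary.org/obo/NCBITaxon_2697049",
     "label": "SARS-CoV-2 (hypothetical label)"},
    {"id": "http://identifiers.org/uniprot/P0DTD1", "label": "pp1ab Scov2"},
]

for entry in suspicious_taxa(sample):
    print(entry["id"], entry["label"])
```

Run against the real response, a check like this would list every non-taxon entry at once rather than relying on someone noticing it in the species dropdown.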

@kltm
Member

kltm commented May 27, 2022

Noting this stanza, found in neo.obo:

[Term]
id: UniProtKB:P0DTD1-PRO_0000449619
name: nsp1 Scov2
synonym: "nsp1" BROAD []
synonym: "P0DTD1-PRO_0000449619" RELATED []
synonym: "protein" RELATED []
is_a: CHEBI:33695
relationship: has_gene_template PR:000050270%7CUniProtKB%3AP0DTD1-PRO_0000449635%7CPRO_0000449635
relationship: in_taxon UniProtKB:P0DTD1 ! pp1ab Scov2
property_value: https://w3id.org/biolink/vocab/category https://w3id.org/biolink/vocab/GeneProduct
property_value: https://w3id.org/biolink/vocab/category https://w3id.org/biolink/vocab/MacromolecularMachine

I don't believe in_taxon is supposed to work like that.
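
For what it's worth, a quick scan for this kind of stanza could look like the sketch below. It assumes that every `in_taxon` target should be an `NCBITaxon:` CURIE, and it uses plain line matching rather than a real OBO parser. The helper names are hypothetical, and the stanza is abbreviated from the one above.

```python
# Abbreviated copy of the stanza quoted above.
STANZA = """\
[Term]
id: UniProtKB:P0DTD1-PRO_0000449619
name: nsp1 Scov2
is_a: CHEBI:33695
relationship: in_taxon UniProtKB:P0DTD1 ! pp1ab Scov2
"""

IN_TAXON_TAG = "relationship: in_taxon "


def in_taxon_targets(stanza):
    """Collect the target IDs of all in_taxon relationships in a stanza."""
    targets = []
    for line in stanza.splitlines():
        if line.startswith(IN_TAXON_TAG):
            value = line[len(IN_TAXON_TAG):]
            # Drop the trailing "! label" comment, if present.
            targets.append(value.split("!", 1)[0].strip())
    return targets


def bad_in_taxon_targets(stanza, prefix="NCBITaxon:"):
    """Return in_taxon targets that do not use the assumed taxon prefix."""
    return [t for t in in_taxon_targets(stanza) if not t.startswith(prefix)]


print(bad_in_taxon_targets(STANZA))
```

For a whole neo.obo file, the same check would need to be applied stanza by stanza, or done with a proper OBO library instead.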

@kltm
Member

kltm commented May 27, 2022

@kltm
Member

kltm commented May 27, 2022

It looks like the taxon is off by one for GPI 1.2?

UniProtKB	P0DTD1-PRO_0000449619	nsp1	Host translation inhibitor nsp1|P0DTD1(1-180)|rep/Clv:nsp1 (SARS2)|PRO_0000449619|nsp1 (SARS2)|UniProtKB:P0DTD1, 1-180|leader protein (SARS2)|UniProtKB:P0DTC1, 1-180|non-structural protein 1 (SARS2)|nsp-1|ns1|ns-1|host translation inhibitor nsp1|Severe acute respiratory syndrome (SARS) coronavirus nonstructural protein 1	protein	taxon:2697049	UniProtKB:P0DTD1	PR:000050270|UniProtKB:P0DTD1-PRO_0000449635|PRO_0000449635

http://geneontology.org/docs/gene-product-information-gpi-format/
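
To make the suspected shift concrete, here is a minimal sketch that reads the line positionally. It assumes the GPI 1.2 column order from the page linked above (DB, DB_Object_ID, DB_Object_Symbol, DB_Object_Name, DB_Object_Synonym(s), DB_Object_Type, Taxon, Parent_Object_ID, DB_Xref(s), Properties). The `parse_gpi_line` helper is hypothetical, and the pipe-separated field is shortened from the line above for readability.

```python
# Column names assumed from the GPI 1.2 documentation.
GPI_1_2_COLUMNS = [
    "DB", "DB_Object_ID", "DB_Object_Symbol", "DB_Object_Name",
    "DB_Object_Synonyms", "DB_Object_Type", "Taxon",
    "Parent_Object_ID", "DB_Xrefs", "Properties",
]


def parse_gpi_line(line):
    """Map tab-separated fields onto GPI 1.2 column names by position."""
    fields = line.rstrip("\n").split("\t")
    return dict(zip(GPI_1_2_COLUMNS, fields))


# Same field layout as the line above; the fourth field is shortened here.
line = "\t".join([
    "UniProtKB",
    "P0DTD1-PRO_0000449619",
    "nsp1",
    "Host translation inhibitor nsp1|P0DTD1(1-180)",
    "protein",
    "taxon:2697049",
    "UniProtKB:P0DTD1",
    "PR:000050270|UniProtKB:P0DTD1-PRO_0000449635|PRO_0000449635",
])

record = parse_gpi_line(line)
print(record["DB_Object_Type"])  # taxon:2697049
print(record["Taxon"])           # UniProtKB:P0DTD1
```

Read this way, the line has eight fields, the pipe-separated list sits where the name is expected, and everything after it lands one column early. That would explain a positional reader picking up UniProtKB:P0DTD1 as the taxon.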

@kltm
Member

kltm commented May 27, 2022

Related to geneontology/go-site#1431

@balhoff
Member

balhoff commented May 27, 2022

@kltm it seems like you found the problem. But in the neo.owl I downloaded yesterday, I saw in_taxon NCBITaxon:2697049. I wonder why there is a discrepancy?

@kltm
Member

kltm commented May 27, 2022

@balhoff Yeah, there's some stuff I'm not sure about here, especially as that file has not been touched in years, so I'm not sure why it's a problem now.
I'm tagging upstream contributors @cmungall and @justaddcoffee to confirm the format for GPI 1.2.

@kltm
Member

kltm commented Jun 8, 2022

Per @cmungall, we can go ahead and manually fix this file ourselves upstream.

@kltm
Member

kltm commented Jun 8, 2022

@kltm
Member

kltm commented Jun 10, 2022

If we understand this correctly, this should be fixed in the next NEO release.

@kltm
Member

kltm commented Sep 24, 2022

Hm. Apparently not. It is still appearing on the Noctua landing page.

4 participants