From 9497d6c7789d40b94d1200fa4ae43f942ce3cdfa Mon Sep 17 00:00:00 2001
From: khituras
Date: Sun, 15 Jan 2023 18:36:23 +0100
Subject: [PATCH] Resolves #224.

---
 .../src/main/resources/de/julielab/gepi/webapp/pages/FAQ.tml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gepi/gepi-webapp/src/main/resources/de/julielab/gepi/webapp/pages/FAQ.tml b/gepi/gepi-webapp/src/main/resources/de/julielab/gepi/webapp/pages/FAQ.tml
index b3bbd9ce..4e1ac361 100644
--- a/gepi/gepi-webapp/src/main/resources/de/julielab/gepi/webapp/pages/FAQ.tml
+++ b/gepi/gepi-webapp/src/main/resources/de/julielab/gepi/webapp/pages/FAQ.tml
@@ -68,7 +68,7 @@ How exactly are gene names mapped to gene IDs in a GePI query?
- We match the names after a normalization step to the NCBI Gene symbols in our database.
+ We match the input names after a normalization step to the NCBI Gene symbols in our database.
 The normalization step includes lower-casing of the name and the removal of punctuation and white spaces so that, for example, il2 and il-2 are both mapped to IL2. Gene name matching will often find multiple matches in our database despite the fact that we use the NCBI gene_orthologs file to create single representatives for orthologous genes. Sometimes not all species are (yet) included in the file. Since the genes that exist in several species often carry the same name, this could result in multiple input matches. It is also possible that the normalization causes multiple symbols to match. For this reason, the symbol mapping table in the statistics element of the result dashboard shows the most frequent target name for an input gene name. Still, all found elements will be searched for in GePI. If this leads to unwanted results, it is recommended to use canonical gene IDs in the query.