Odd time out #25

Open
ch-sander opened this issue Oct 9, 2024 · 4 comments

Comments

@ch-sander
Contributor

Maybe this is a problem with Oxigraph rather than shmarql? @Tpt

Endpoint: https://dataria.org/sparql

Query

SELECT DISTINCT (COUNT(DISTINCT ?object_8) AS ?object_8_count) ?place_4 ?place_4_label WHERE {
  ?place_1 (<http://www.graceful17.org/ontology/falls_within>+) ?place_4.
  ?place_4 <http://www.graceful17.org/ontology/called> ?place_4_label;
    <http://www.graceful17.org/ontology/has_main_type> <http://www.graceful17.org/resources/type_77>.
  ?place_1 ^<http://www.graceful17.org/ontology/primary_place> ?institution_6.
  ?institution_6 <http://www.graceful17.org/ontology/holds_immaterial_object> ?object_8.
  ?object_8 <http://www.graceful17.org/ontology/called> ?object_8_label.
  ?object_8  <http://www.graceful17.org/ontology/has_main_type> <http://www.graceful17.org/resources/type_460>.
}
GROUP BY ?place_4 ?place_4_label
ORDER BY DESC (?object_8_count)
LIMIT 10000

Result

504 Gateway Time-out
nginx/1.22.0 (Ubuntu)

Solution

Removing the triple pattern <http://www.graceful17.org/ontology/has_main_type> <http://www.graceful17.org/resources/type_460>, or making it OPTIONAL, returns a result in 0.3 seconds. It also works when that pattern is wrapped in a subquery:

  {
    SELECT DISTINCT ?object_8 WHERE {
      ?object_8 <http://www.graceful17.org/ontology/has_main_type> <http://www.graceful17.org/resources/type_460>.
    }
  }
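
For anyone trying to reproduce this, here is a minimal Python sketch of the full query with the subquery in place, plus a prepared HTTP request for it. The issue does not say where exactly the subquery was put, so substituting it for the plain `type_460` triple is an assumption, as are the helper names `build_query` and `build_request`. The request follows the SPARQL 1.1 Protocol form-encoded POST and is only built here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "https://dataria.org/sparql"  # endpoint quoted in the issue
ONT = "http://www.graceful17.org/ontology/"
RES = "http://www.graceful17.org/resources/"


def build_query() -> str:
    """Return the original query with the type_460 pattern inside a subquery.

    Assumption: the subquery replaces the plain triple pattern in place.
    """
    return f"""\
SELECT DISTINCT (COUNT(DISTINCT ?object_8) AS ?object_8_count) ?place_4 ?place_4_label WHERE {{
  ?place_1 (<{ONT}falls_within>+) ?place_4.
  ?place_4 <{ONT}called> ?place_4_label;
    <{ONT}has_main_type> <{RES}type_77>.
  ?place_1 ^<{ONT}primary_place> ?institution_6.
  ?institution_6 <{ONT}holds_immaterial_object> ?object_8.
  ?object_8 <{ONT}called> ?object_8_label.
  {{
    SELECT DISTINCT ?object_8 WHERE {{
      ?object_8 <{ONT}has_main_type> <{RES}type_460>.
    }}
  }}
}}
GROUP BY ?place_4 ?place_4_label
ORDER BY DESC (?object_8_count)
LIMIT 10000
"""


def build_request(query: str, endpoint: str = ENDPOINT) -> Request:
    """Prepare (but do not send) a form-encoded SPARQL POST request."""
    body = urlencode({"query": query}).encode("utf-8")
    headers = {
        "Content-Type": "application/x-www-form-urlencoded",
        "Accept": "application/sparql-results+json",
    }
    return Request(endpoint, data=body, headers=headers, method="POST")
```

Passing the result to `urllib.request.urlopen(build_request(build_query()), timeout=30)` would send it; the `timeout` there is a client-side limit and is independent of any proxy timeout.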

I can DESCRIBE both a ?object_8 and <http://www.graceful17.org/resources/type_460> -- the predicate <http://www.graceful17.org/ontology/has_main_type> exists!

It might be something on my end -- just wanted to check if I need to dig deeper in my data and config or if it could be out of my (immediate) control...

@ch-sander
Contributor Author

ch-sander commented Oct 9, 2024

Before the timeout the CPU is at 100% (memory is fine), so it's desperately trying to get to the results

@Tpt

Tpt commented Oct 9, 2024

The timeout itself is likely because of the reverse-proxy you are using. There is no timeout in Oxigraph server.

As for why the query execution is so slow: Oxigraph likely picks a bad join ordering, which blows up the computation time. Subqueries are indeed a way to game the join reordering system.
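
To illustrate why join ordering matters so much, here is a toy Python sketch. It does not model Oxigraph's optimizer or the data behind this endpoint; the three relations, their sizes, and the helper names `hash_join` and `run_plan` are all made up for the example. It joins the same inputs in two different orders and records the size of each intermediate result.

```python
def hash_join(left, right, key):
    """Join two lists of dict bindings on one shared variable."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


def run_plan(relations, keys):
    """Join relations left to right; return final rows and intermediate sizes."""
    current = relations[0]
    sizes = []
    for rel, key in zip(relations[1:], keys):
        current = hash_join(current, rel, key)
        sizes.append(len(current))
    return current, sizes


# Hypothetical data: 100 institutions holding 50 objects each (5000 rows),
# one place per institution, and only 5 objects with the selective type.
holds = [{"inst": i, "obj": i * 50 + j} for i in range(100) for j in range(50)]
located = [{"inst": i, "place": i % 10} for i in range(100)]
typed = [{"obj": o} for o in range(0, 5000, 1000)]

# Selective pattern applied last: a large intermediate result is built first.
late_rows, late_sizes = run_plan([holds, located, typed], ["inst", "obj"])
# Selective pattern applied first: every intermediate result stays tiny.
early_rows, early_sizes = run_plan([typed, holds, located], ["obj", "inst"])

print(late_sizes)   # [5000, 5]
print(early_sizes)  # [5, 5]
```

Both plans return the same five rows, but the first one materializes 5000 intermediate rows to get there. With a transitive property path in the query, the gap between a good and a bad order can be far larger than in this toy.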

@ch-sander
Contributor Author

Thanks @Tpt. Yes, the timeout comes from the proxy, but the real issue is that the query runs for more than 60 s (even if I set the timeout to 10 minutes).

I can optimize the queries by hand, but with the visual query builder Sparnatural that is problematic, since it constructs the SPARQL for the user...

probably optimizing "bad join ordering" on oxigraph's end is not an easy fix?
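
One possible workaround for generated queries is to rewrite them before they reach the store. The Python sketch below is only an illustration under stated assumptions: `isolate_pattern` is a hypothetical helper, not part of Sparnatural, shmarql, or Oxigraph, and it uses a naive regex that only matches a triple written out in full as `?var <predicate> <object>.` (it will not catch patterns abbreviated with `;`).

```python
import re


def isolate_pattern(query: str, subject_var: str, predicate: str, obj: str) -> str:
    """Wrap the first `?subject_var <predicate> <obj>.` triple in a subquery."""
    pattern = re.compile(
        r"\?" + re.escape(subject_var)
        + r"\s+<" + re.escape(predicate) + r">"
        + r"\s+<" + re.escape(obj) + r">\s*\."
    )
    triple = f"?{subject_var} <{predicate}> <{obj}>."
    replacement = f"{{ SELECT DISTINCT ?{subject_var} WHERE {{ {triple} }} }}"
    # A callable replacement avoids any backslash/group interpretation.
    new_query, count = pattern.subn(lambda _m: replacement, query, count=1)
    if count == 0:
        raise ValueError("triple pattern not found in query")
    return new_query


generated = """SELECT ?object_8 WHERE {
  ?institution_6 <http://www.graceful17.org/ontology/holds_immaterial_object> ?object_8.
  ?object_8  <http://www.graceful17.org/ontology/has_main_type> <http://www.graceful17.org/resources/type_460>.
}"""

rewritten = isolate_pattern(
    generated,
    "object_8",
    "http://www.graceful17.org/ontology/has_main_type",
    "http://www.graceful17.org/resources/type_460",
)
print(rewritten)
```

A robust version would work on a parsed query rather than on text, and would need some rule for deciding which patterns are worth isolating.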

@Tpt

Tpt commented Oct 9, 2024

> probably optimizing "bad join ordering" on oxigraph's end is not an easy fix?

It's indeed not an easy fix. It would still be great to put some work into this area, because the current reordering algorithm is very bad. However, it's an endless topic: join reordering is a very hard problem that one can put dozens of years of work into.
