Panache Hibernate adapting query count based on distinct keyword #38316

rysurd · 2024-01-20T15:51:10Z

DavideD

It seems fine, but I wouldn't merge it without a test case.

rysurd · 2024-01-23T15:37:15Z

It seems fine, but I wouldn't merge it without a test case.

I think there are errors in tests that I need to check and fix. And on top on that add a specific test indeed. This is still a work in progress I suppose :)

DavideD · 2024-01-23T15:45:08Z

Thanks @rysurd, that's fine.

But it's usually better to send a PR when one also has a test case. It saves everybody's time.

rysurd · 2024-01-23T23:30:40Z

Hello @DavideD ! I updated my PR with two main changes ;

Moved the query change from the select distinct to the from clause
Slightly adjusted the integration tests to overcome the issue I had during previous tests (like here https://github.com/rysurd/quarkus/actions/runs/7631080457/job/20788624212#step:16:3215). This issue is the following :

java.lang.IllegalArgumentException: org.hibernate.query.SemanticException: Select item at position 1 in select list has no alias (aliases are required in CTEs and in subqueries occurring in from clause)

I also added one another test in TestEndpoint for projection ; we also check distinct query for projection on Long, just like in the original issue.

DavideD

Unless I'm missing something, this solution doesn't seem correct.

The issue is about generating a query like:

select count(distinct entity.id) from ...

but you are creating the following query instead:

select count(*) from (select distinct entity.id from ... )

Also, if existing tests don't work any more after applying your fix, it means there's something wrong with your solution. You shouldn't change them to make your solution work.

If we apply this solution, what do you think it will happen to the apps running distinct queries without aliases after the upgrade?

DavideD · 2024-01-24T08:03:44Z

integration-tests/hibernate-orm-panache/src/main/java/io/quarkus/it/panache/TestEndpoint.java

@@ -1326,7 +1326,7 @@ public String testProjection() {

        PanacheQuery<CatProjectionBean> projectionDistinctQuery = Cat
                // The spaces at the beginning are intentional
-                .find("   SELECT   disTINct  c.name, cast(null as string), SUM(c.weight) from Cat c where name = :name group by name  ",
+                .find("   SELECT   disTINct  c.name as foo, cast(null as string) as bar, SUM(c.weight) as baz from Cat c where name = :name group by name  ",


This is suspicious, why this test suddenly requires a change?

Yeah, this should keep working.

DavideD · 2024-01-24T08:04:01Z

integration-tests/hibernate-orm-panache/src/main/java/io/quarkus/it/panache/TestEndpoint.java

        // We are checking that not everything gets lowercased
        PanacheQuery<CatProjectionBean> letterCaseQuery = Cat
                // The spaces at the beginning are intentional
-                .find("   SELECT   disTINct  'GARFIELD', 'JoN ArBuCkLe' from Cat c where name = :NamE group by name  ",
+                .find("   SELECT   disTINct  'GARFIELD' as foo, 'JoN ArBuCkLe' as bar from Cat c where name = :NamE group by name  ",


This is suspicious, why this test suddenly requires a change?

rysurd · 2024-01-24T08:27:02Z

Unless I'm missing something, this solution doesn't seem correct.

The issue is about generating a query like:
select count(distinct entity.id) from ...
but you are creating the following query instead:
select count(*) from (select distinct entity.id from ... )

So indeed I wanted to implement a query like the first one you suggested. It work for cases where you want to distinct count on one column. However it fails when we have aggregate function (like SUM) or when we want to count distinct on multiple columns. If you plainly put the input select clauses in the distinct statement you can end up having a query like this for instance

select count(distinct entity.id, SUM(entity.weight)) from ...

It'll generate an error (This one https://github.com/rysurd/quarkus/actions/runs/7628145894/job/20778994932#step:16:3219). So this is why I went rather for moving the distinct statement in a subquery in the from statement. Maybe (probably) I'm the one missing out something here that would make this work 😄

Also, if existing tests don't work any more after applying your fix, it means there's something wrong with your solution.
You shouldn't change them to make your solution work.

If we apply this solution, what do you think it will happen to the apps running distinct queries without aliases after the upgrade?

I'd agree that something can be wrong with the solution, but this could also mean we didn't took some edge case into consideration when writing the tests, in can happen sometimes. In any case if you can help me understand the previous problem about aggregate function in subquery I'd appreciate ! I'll keep working on this issue and will try to find another solution.

Thanks a lot!

EDIT : Also I'm not sure we can have distinct count on multiple columns right from the select statement with Hibernate

DavideD · 2024-01-24T09:58:58Z

... but this could also mean we didn't took some edge case into consideration when writing the tests,

It seems we did in this case, otherwise CI wouldn't fail.

So indeed I wanted to implement a query like the first one you suggested. It work for most case however it fails when we have aggregate function (like SUM) in the select statement

I see now.
I'm not a fan of this solution but I guess it's the simplest compromise ¯_(ツ)_/¯

Alternatives I can think of is to split by , and remove all the fields starting with an aggregate functions. This won't work without having a proper parser because people can still include a text string with anything in the select clause, or use escape chars.

Or, add an attribute to .count(), so that a user can specify the argument of the count function in any cirumnstance.

@gsmet, If we decide to go for this approach, this is a breaking change that will make queries not using an alias fail.
So, at the very least, it shouldn't be included in a micro.

In any case if you can help me understand the previous problem about aggregate function in subquery I'd appreciate

I'm not sure, but I'm assuming that when having subqueries, you have to specify which field of which entity you refer to because you could have a reference to the same entity multiple times.

rysurd · 2024-01-24T10:58:14Z

Or, add an attribute to .count(), so that a user can specify the argument of the count function in any cirumnstance.

I think this could be a solution, I'd even like to try to implement it just to see. But yeah, I think unless we missed something there is no straightforward solution to the problem ... Let's see what @gsmet thinks.

DavideD · 2024-01-24T11:00:48Z

Alternatives I can think of is to split by , and remove all the fields starting with an aggregate functions. This won't work without having a proper parser because people can still include a text string with anything in the select clause, or use escape chars.

This solution is not really different from what we did in other places, though

FroMage · 2024-01-24T11:11:22Z

Hello.

While I agree this is a bug, I'm skeptical about what the behaviour of .project(Long.class) should be, because the docs state:

Defines a projection class. This will transform the returned values into instances of the given type using the following mapping rules:

If your query already selects some specific columns (starts with select distinct? a, b, c…) then we transform it into a query of the form: select distinct? new ProjectionClass(a, b, c)…. There must be a matching constructor that accepts the selected column types, in the right order.

If your query does not select any specific column (starts with from…) then we transform it into a query of the form: select new ProjectionClass(a, b, c…) from… where we fetch the list of selected columns from your projection class' single constructor, using its parameter names (or their ProjectedFieldName annotations), in the same order as the constructor.
If this is already a project query of the form select distinct? new…, we throw a PanacheQueryException

At this point, it's already unclear what projections do in the case of a no-op. Should we really generate select distinct new java.lang.Long(entity.id)…?

I don't even know if this works.

In any case, there's definitely a bug because since this is a no-op, the count method should be the same as the original one, and it's not the case.

Now, given that the logic is duplicated in ORM, HR and Mongo, a fix should also include the other queries. Though in the case of Mongo, it doesn't appear to need a fix.

As to whether we need a real parser for this, I'm seriously starting to wonder if we should not start a discussion with @gavinking about the new shorter queries supported in HQL, and adding feature methods like project and count to any query types of ORM/HR if they're missing there. And let them deal with the parsing and all. This is bound to be better for both ORM and us.

FroMage · 2024-01-24T11:12:47Z

integration-tests/hibernate-orm-panache/src/main/java/io/quarkus/it/panache/TestEndpoint.java

@@ -1326,7 +1326,7 @@ public String testProjection() {

        PanacheQuery<CatProjectionBean> projectionDistinctQuery = Cat
                // The spaces at the beginning are intentional
-                .find("   SELECT   disTINct  c.name, cast(null as string), SUM(c.weight) from Cat c where name = :name group by name  ",
+                .find("   SELECT   disTINct  c.name as foo, cast(null as string) as bar, SUM(c.weight) as baz from Cat c where name = :name group by name  ",


Yeah, this should keep working.

FroMage · 2024-01-24T11:13:14Z

integration-tests/hibernate-orm-panache/src/main/java/io/quarkus/it/panache/TestEndpoint.java

        // We are checking that not everything gets lowercased
        PanacheQuery<CatProjectionBean> letterCaseQuery = Cat
                // The spaces at the beginning are intentional
-                .find("   SELECT   disTINct  'GARFIELD', 'JoN ArBuCkLe' from Cat c where name = :NamE group by name  ",
+                .find("   SELECT   disTINct  'GARFIELD' as foo, 'JoN ArBuCkLe' as bar from Cat c where name = :NamE group by name  ",


DavideD · 2024-01-24T12:15:44Z

While I agree this is a bug, I'm skeptical about what the behaviour of .project(Long.class)

Ah, I missed this.
I'm not sure what's going on in this case.

As to whether we need a real parser for this, I'm seriously starting to wonder if we should not start a discussion with @gavinking about the new shorter queries supported in HQL, and adding feature methods like project and count to any query types of ORM/HR if they're missing there. And let them deal with the parsing and all.

If possible (I'm not familiar with the new shorter form), I think that would be the best approach.

IMHO, Panache should focus on providing a better programmatic DSL (maybe using Criteria and the JPA metamodel, or other libraries underneath) and leave strings parsing to the existing libraries that have been refined over the many years they've been around.

rysurd · 2024-01-24T13:48:54Z

As to whether we need a real parser for this, I'm seriously starting to wonder if we should not start a discussion with @gavinking about the new shorter queries supported in HQL, and adding feature methods like project and count to any query types of ORM/HR if they're missing there. And let them deal with the parsing and all. This is bound to be better for both ORM and us.

Do you happen to have any documentation or guide on how to implement these new and shorter HQL queries? Is it done by using Criteria and JPA metamodel as @DavideD suggested before? If indeed it can allow to not deal with SQL parsing we could benefit from it.

FroMage · 2024-01-24T14:49:46Z

This is a discussion I should have with @gavinking at some point.

quarkus-bot bot added the area/panache label Jan 20, 2024

rysurd mentioned this pull request Jan 20, 2024

PanacheQuery count method ignoring distinct after using project #38033

Open

gsmet requested review from loicmathieu, DavideD and FroMage January 23, 2024 13:56

DavideD suggested changes Jan 23, 2024

View reviewed changes

rysurd force-pushed the ISSUES-38033 branch from 6a6762b to ecfef57 Compare January 23, 2024 15:36

rysurd marked this pull request as draft January 23, 2024 15:41

rysurd force-pushed the ISSUES-38033 branch 2 times, most recently from 87406db to 49aaeb6 Compare January 23, 2024 22:23

Panache Hibernate adapting query count based on distinct keyword

e8dbd9a

rysurd force-pushed the ISSUES-38033 branch from 49aaeb6 to e8dbd9a Compare January 23, 2024 23:27

rysurd marked this pull request as ready for review January 23, 2024 23:34

DavideD suggested changes Jan 24, 2024

View reviewed changes

FroMage requested changes Jan 24, 2024

View reviewed changes

gsmet added the triage/on-ice Frozen until external concerns are resolved label Jun 7, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Panache Hibernate adapting query count based on distinct keyword #38316

Panache Hibernate adapting query count based on distinct keyword #38316

rysurd commented Jan 20, 2024

DavideD left a comment

rysurd commented Jan 23, 2024 •

edited

Loading

DavideD commented Jan 23, 2024 •

edited

Loading

rysurd commented Jan 23, 2024

DavideD left a comment

DavideD Jan 24, 2024

FroMage Jan 24, 2024

DavideD Jan 24, 2024

FroMage Jan 24, 2024

rysurd commented Jan 24, 2024 •

edited

Loading

DavideD commented Jan 24, 2024 •

edited

Loading

rysurd commented Jan 24, 2024 •

edited

Loading

DavideD commented Jan 24, 2024

FroMage commented Jan 24, 2024

FroMage Jan 24, 2024

FroMage Jan 24, 2024

DavideD commented Jan 24, 2024

rysurd commented Jan 24, 2024

FroMage commented Jan 24, 2024

Panache Hibernate adapting query count based on distinct keyword #38316

Are you sure you want to change the base?

Panache Hibernate adapting query count based on distinct keyword #38316

Conversation

rysurd commented Jan 20, 2024

DavideD left a comment

Choose a reason for hiding this comment

rysurd commented Jan 23, 2024 • edited Loading

DavideD commented Jan 23, 2024 • edited Loading

rysurd commented Jan 23, 2024

DavideD left a comment

Choose a reason for hiding this comment

DavideD Jan 24, 2024

Choose a reason for hiding this comment

FroMage Jan 24, 2024

Choose a reason for hiding this comment

DavideD Jan 24, 2024

Choose a reason for hiding this comment

FroMage Jan 24, 2024

Choose a reason for hiding this comment

rysurd commented Jan 24, 2024 • edited Loading

DavideD commented Jan 24, 2024 • edited Loading

rysurd commented Jan 24, 2024 • edited Loading

DavideD commented Jan 24, 2024

FroMage commented Jan 24, 2024

FroMage Jan 24, 2024

Choose a reason for hiding this comment

FroMage Jan 24, 2024

Choose a reason for hiding this comment

DavideD commented Jan 24, 2024

rysurd commented Jan 24, 2024

FroMage commented Jan 24, 2024

rysurd commented Jan 23, 2024 •

edited

Loading

DavideD commented Jan 23, 2024 •

edited

Loading

rysurd commented Jan 24, 2024 •

edited

Loading

DavideD commented Jan 24, 2024 •

edited

Loading

rysurd commented Jan 24, 2024 •

edited

Loading