-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Utility classes to make it easier to use sandbox facet API for most common cases #14237
base: main
Are you sure you want to change the base?
Conversation
Simplifies requesting taxonomy and long range facets Makes sure we don't count docs two times for the same index field when there are multiple requests Returns FacetResult - to make it easier to migrate for classic facets module users TODOs: - better names - maybe use names suggested in TK - implement for all existing facet cutters - add comprehensive tests - fix total value where possible in tests
…s: reuse some code, make more abstract
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a glance at everything and didn't see anything objectionable. It's great if we're making this module easier to use. I have some non-blocking concerns that this is a lot of new code, with more abstractions, and it doesn't have a user waiting to use it. There are arguments against those concerns as well and I personally like the new aggregations, but I thought it was worth raising the point.
@@ -130,6 +135,88 @@ void index() throws IOException { | |||
IOUtils.close(indexWriter, taxoWriter); | |||
} | |||
|
|||
/** | |||
* Example for {@link FacetBuilder} usage - simple API that provides results in a format very | |||
* similar to classic facets module. It doesn't give all flexibility available with {@link |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This really does look more familiar! Can we highlight it somewhere more prominent as well? Maybe in the faceting guide.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea, I'll do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be a good idea to have a bidirectional link for the shared examples in the classic and new aggregation modules?
Edit: nvm, I think the examples are a bit different.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for creating the FacetBuilder
and FacetOrchestrator
abstractions - they do really simplify the usage of the new API.
} | ||
|
||
@Override | ||
FacetCutter createFacetCutter() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be a problem that this is called create
, giving the impression that we get a new cutter? Reusing the old one could be problematic, couldn't it, if the user expects a new cutter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah you are right, this method doesn't have to create FacetCutter, e.g. FacetCutter might be created in advance and returned by the method. Renamed to getFacetCutter
* See the License for the specific language governing permissions and | ||
* limitations under the License. | ||
*/ | ||
package org.apache.lucene.sandbox.facet.utils; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we have a package for the facet builders?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want to keep *FacetOrchstrator
s in the same package as FacetBuilder
s, so that we can keep some FacetBuilder
method package-private. So the only file that maybe doesn't belong in the same package is ComparableUtils.java
. At the same time the idea of ComparableUtils is similar - make it easier for users to use the new module, so I thought we could reduce number of packages and keep meaningful package structure at the same time... But I don't feel strongly about it. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I get your argument. Let's keep it as is then.
Thanks for reviewing @stefanvodita !
I agree with the concern, at the same time it might be one of these chicken and egg cases when users avoid using the feature just because it doesn't have simple enough API for simple use cases? One use case for the new utils I have in mind is luceneutil perf tests - it puts me off that we need 90 lines to test performance of the new API, while for existing facets module it is only two lines. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for addressing my comments! I'm up for merging this, while keeping in mind the concerns we discussed.
+ topDocs.scoreDocs.length); | ||
|
||
//// Results | ||
FacetResult autorResults = authorFacetBuilder.getResult(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nit] authorResults instead of autorResults?
.collect(query, ds); | ||
|
||
//// Results | ||
FacetResult autorResults = authorFacetBuilder.getResult(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nit] authorResults instead of autorResults?
"Price", | ||
new LongRange("0-10", 0, true, 10, true), | ||
new LongRange("10-20", 10, true, 20, true)); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To maintain output consistency between different examples, we could add something equivalent to what we have in simpleFacetsWithSearch
? Or, we could remove it from there.
System.out.println(
"Search results: totalHits: "
+ topDocs.totalHits
+ ", collected hits: "
+ topDocs.scoreDocs.length);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did this intentionally, to demonstrate that there are methods to do facet only and facets with search (examples that haveWithSearch
suffix also compute/print TopDocs). We can rename this example to simpleFacetsOnlyWithDrillSideways
maybe?
|
||
//// Search and collect | ||
TopDocs topDocs = | ||
new FacetOrchestrator() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do we handle cases where more than one FacetBuilder
instances specify topN?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Each FacetBuilder
can have its own topN value set by withTopN()
, which is used to compute FacetBuilder#getResult
import org.apache.lucene.sandbox.facet.labels.RangeOrdToLabel; | ||
|
||
/** {@link FacetBuilder} factory for long range facets. */ | ||
public final class LongRangeFacetBuilder { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why don't we make LongRangeFacetBuilder
extend CommonFacetBuilder
? The create
method returning something of a different type is a bit confusing to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea is that CommonFacetBuilder can be used to compute results for long range facets, so there is no need to create a new FacetBuilder subclass. But I thought it would be nice to have a utility class (I should have added a private constructor to make it static-only, will do) that helps to get CommonFacetBuilder instance.
I can remove the class and let users rely on examples, or rename it to RangeFacetBuilder
and add similar methods for double
ranges. WDYT?
* Common {@link FacetBuilder} that works with any {@link FacetCutter} and {@link OrdToLabel} | ||
* provided by user. | ||
*/ | ||
public final class CommonFacetBuilder extends BaseFacetBuilder<CommonFacetBuilder> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure when I should be using CommonFacetBuilder
vs BaseFacetBuilder
? Some examples, even in the documentation would help? For e.g, why can't TaxonomyFacetBuilder
extend CommonFacetBuilder
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BaseFacetBuilder is abstract package-private class that implements common functionality that different FacetBuilder implementations can reuse.
CommonFacetBuilder is a final class, kind of generic facet builder that can be used with any FacetCutter, but as a consequence it lacks some facet type specific functionality. E.g. using it for taxonomy facets makes sense only if we don't need to specify parent path and don't need to compute overall value. The idea is that creating FacetBuilder should not be required for every facet type, just for the types that require some custom logic, such as taxonomy facets.
WDYT?
@@ -130,6 +135,88 @@ void index() throws IOException { | |||
IOUtils.close(indexWriter, taxoWriter); | |||
} | |||
|
|||
/** | |||
* Example for {@link FacetBuilder} usage - simple API that provides results in a format very | |||
* similar to classic facets module. It doesn't give all flexibility available with {@link |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for creating the FacetBuilder
and FacetOrchestrator
abstractions - they do really simplify the usage of the new API.
In the initial sandbox facet module PR @gsmiller suggested adding helpers to make common tasks easier.
This implementation uses existing type
FacetResult
to return results. The type has limitations, e.g. it supports only one value per label. At the same time, it is simple, users are already familiar with it, and it is probably good enough for the most common tasks. Users who need more complex results can use the lower level API FacetCutter, FacetRecorder, etc to get multiple aggregations per label, to implement custom sort order or any other non-trivial faceting task.It's a draft, if people are ok with the approach overall I will add more unit tests and javadocs.