-
Notifications
You must be signed in to change notification settings - Fork 89
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix paramsBoostedFields and paramsFilterFields #25
base: master
Are you sure you want to change the base?
Conversation
According to documentation: http://actionml.com/docs/ur_config fields: optional, default = none. An array of ... . Positive biases are boosts any negative number will filter out any results that do not contain the values in the field name. When adding fields to algorithm parameters with a negative bias (-1) they will not be filtered, but instead added as boosted fields with a negative value. Providing a positive number will cause this to be a filter. If you look at the annotations, someone has tried to fix this but instead of swapping the ">" and "<", he moved "=" instead. I have corrected this issue
Do you have a test to prove correct behavior? See examples/integration-test for how to do this. It should be easy, to create and add more tests. Also I believe the test is already included. The "=" is because 0 is reserved for a new meaning in the UR v0.6.0, namely as an exclusion filter. Bias in v 0.6.0:
The next release will be timed to coincide with PIO 0.11.0 (being voted on) and Mahout v0.113.0 (being assembled for release). Watch for a new branch labeled v0.6.0-SNAPSHOT. |
Good point, I couldn't find any tests where algorithm params include the fields attribute. I will try to create a new test to cover it. |
it is very difficult to test params in engine.json in an automated way. So this is a case of fields in engine.json? That helps to know since they have had minimal use or testing, except the time stamp filters. Putting a field in engine.json means it will be applied to every query, which is not very useful. Why include data for the items if they are never needed as recs? I was thinking of deprecating that but if you need it, let's make sure it works as advertized. I'd be happy for a very simple test run on it's own not integrated with |
I don't use the fields parameter currently. I'm using the template and was reviewing your code for a better understanding of the algorithm and its opportunities. This just caught my eye. One use case I can imagine for the fields parameter is setting default boosts and filters for items. Since query parameters override this, it would make sense to describe common filters here too. However this could cause issues on the application side as the results are silently boosted and filtered which may not always be a good practice. Fox example in my case (movie recommendations) I would populate a field isAdult = yes/no (or perhaps K-rating) to each item. By default I want to exclude all adult content from recommendations, unless I explicitly say so in the query (adult recommendations feed). Currenly I have to add isAdult = [no] to each query (which is not a problem). Another case is when I want to set default boosts to item attributes (ie title boost 1.5, description 1, cast 1.2, etc). What do you think where these should be described? (application for each query, algo params fields, somewhere else). Ofcourse this would be a separate test with special data.
I think there might be a broken case if you add a fields with negative bias (filter) and in the query add it as a boost. In this case the query boost would not override fields filter as they're added in separate sections of the ES query (should, must_not). I see 2 options: |
Description:
According to documentation: http://actionml.com/docs/ur_config
fields: optional, default = none. An array of ... . Positive biases are boosts any negative number will filter out any results that do not contain the values in the field name.
Problem:
When adding fields to algorithm parameters with a negative bias (-1) they will not be filtered, but instead added as boosted fields with a negative value. Providing a positive number will cause this to be a filter.
If you look at the annotations, someone has tried to fix this but instead of swapping the ">" and "<", he moved "=" instead.
Solution:
I have corrected this issue so that paramsFields are consistent with queryFields. (see queryBoostedFields below)