About sparse embedding: should upper_bound be calculated with max_value * query_value on this dim, instead of only max_value on this dim? #36717
Replies: 2 comments 1 reply
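I have one more question I would like to ask: so the key point is that only the first cursor whose head doc_id is not equal to the head doc_id of cursors[pivot] gets skipped?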
Hi @ldak4747. Regarding your first question: as mentioned in my reply in #36711, "query_value is multiplied onto max_score when the cursor is created", so there is no need to multiply it again. Regarding the second question: this is only an implementation choice and does not affect correctness. Possible strategies could be:
Any strategy is ok.
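To make the first point concrete, here is a minimal sketch of the idea, not the actual code in sparse_inverted_index.h. `Cursor`, `make_cursors` and `find_pivot` are hypothetical names used only for illustration, and the sketch leaves out the ordering of cursors by current doc_id that a real WAND loop depends on. It only shows that once the query weight is folded into max_score at cursor creation, the pivot loop can sum max_score directly.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a posting-list cursor; not the real knowhere type.
struct Cursor {
    // Assumed to be pre-scaled: (max index value on this dim) * (query value on this dim).
    float max_score;
};

// Build one cursor per query dimension, folding the query weight into the bound.
inline std::vector<Cursor> make_cursors(const std::vector<float>& dim_max_values,
                                        const std::vector<float>& query_values) {
    assert(dim_max_values.size() == query_values.size());
    std::vector<Cursor> cursors;
    cursors.reserve(query_values.size());
    for (std::size_t i = 0; i < query_values.size(); ++i) {
        cursors.push_back(Cursor{dim_max_values[i] * query_values[i]});
    }
    return cursors;
}

// Accumulate the pre-scaled bounds until they exceed the threshold.
// Returns the pivot index, or cursors.size() when no pivot is found.
inline std::size_t find_pivot(const std::vector<Cursor>& cursors, float threshold) {
    float upper_bound = 0.0f;
    for (std::size_t pivot = 0; pivot < cursors.size(); ++pivot) {
        upper_bound += cursors[pivot].max_score;  // no second multiplication by query_value
        if (upper_bound > threshold) {
            return pivot;
        }
    }
    return cursors.size();
}
```

With per-dimension maxima {0.9, 0.5} and query values {2, 4}, the stored bounds are 1.8 and 2.0, so a large query value already widens the bound on its dimension before the pivot loop runs.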
In src/index/sparse/sparse_inverted_index.h, InvertedIndex::search_wand calculates upper_bound while traversing the cursors with the code "upper_bound += cursors[pivot]->max_score();". Should this instead be "upper_bound += (cursors[pivot]->max_score() * query_value);" (pseudocode)?
For example, suppose index_value on one dim ranges from 0 to 1, but query_value on that dim is 1e10. The inner product contribution of this dim can then be very large: the gap between 0.1 * 1e10 and 0.9 * 1e10 is huge, whereas the gap between 0.1 and 0.9 alone may be too small to prevent found_pivot = false.
my idea is: