diff --git a/en/phased-ranking.html b/en/phased-ranking.html
index 04abf0e9de..a4b7b7c630 100644
--- a/en/phased-ranking.html
+++ b/en/phased-ranking.html
@@ -48,7 +48,7 @@
-

First-phase ranking on content nodes

+

First-phase ranking on content nodes

Normally, you will always start by having one ranking expression that is evaluated on the content nodes. This is configured in
@@ -67,7 +67,7 @@

First-phase ranking on content nodes

-

Two-phase ranking on content nodes

+

Two-phase ranking on content nodes

While some use cases only require one (simple) first-phase ranking expression, for more advanced use cases it's possible to
@@ -111,9 +111,9 @@

Two-phase ranking on content nodes

-

Using a global-phase expression

+

Using a global-phase expression

- Using a rank expressions configured as a
+ Using a rank expression configured as a
global-phase in the rank-profile section of a schema, you can add a ranking phase that will run in the stateless container after
@@ -208,7 +208,7 @@

Using a global-phase expression

In the above example, the my_expensive_function will be evaluated on the content nodes
- for the 50 top ranking documents from the first-phase so that the global-phase does not need to re-evaluate.
+ for the 50 top-ranking documents from the first-phase so that the global-phase does not need to re-evaluate.

@@ -232,10 +232,10 @@

Cross-hit norm
reranks (see configuration above). This means that first, the input (my_function_or_feature) is computed or extracted from each hit that global-phase will
- rerank; then the normalization step is applied; afterwards when
- computing the actual global-phase expression the normalized output
+ rerank; then the normalization step is applied; afterwards, when
+ computing the actual global-phase expression, the normalized output
is used.
- As an example, assume some text fields with bm25 enabled, an onnx
+ As an example, assume some text fields with bm25 enabled, an ONNX
model (from the example in the previous section), and a "popularity" numeric attribute:

@@ -256,7 +256,7 @@

Cross-hit norm
}

- The normalize_linear normalizer takes a single argument which must be
+ The normalize_linear normalizer takes a single argument, which must be
a rank-feature or the name of a function. It computes the maximum and minimum values of that input and scales linearly to the range [0, 1], basically using the formula output = (input - min) / (max - min)
@@ -265,9 +265,9 @@

Cross-hit norm
The reciprocal_rank normalizer takes one or two arguments; the first must be a rank-feature or the name of a function, while the second (if present) must be a numerical constant, called k with default value 60.0.
- It sorts the input values and finds their rank (so highest score gets
+ It sorts the input values and finds their rank (so the highest score gets
rank 1, next highest 2, and so on). The output from reciprocal_rank is computed
- with the formula output = 1.0 / (k + rank) so note that even the best
+ with the formula output = 1.0 / (k + rank), so note that even the best
input only gets 1.0 / 61 = 0.016393 as output with the default k.
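
Reviewer note: the two normalizer formulas quoted in these hunks can be checked with a small Python sketch. This only illustrates the formulas as stated in the text and is not Vespa's implementation. The function names simply mirror the normalizer names. Returning 0.0 when all inputs are equal, and breaking ties by input order, are assumptions made here because the text does not specify either case.

```python
def normalize_linear(scores):
    """Scale scores linearly to [0, 1]: (input - min) / (max - min)."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # Assumption: the text does not say what happens when max == min.
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def reciprocal_rank(scores, k=60.0):
    """Map each score to 1.0 / (k + rank), where the highest score has rank 1."""
    # Indices sorted by descending score; ties keep input order (assumption).
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    out = [0.0] * len(scores)
    for rank, i in enumerate(order, start=1):
        out[i] = 1.0 / (k + rank)
    return out


if __name__ == "__main__":
    hits = [0.3, 0.9, 0.5]  # hypothetical first-phase scores for three hits
    print(normalize_linear(hits))
    print(reciprocal_rank(hits))
```

With the default k, the best hit in any result set receives 1.0 / 61, which matches the 0.016393 figure quoted in the hunk.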

@@ -302,9 +302,9 @@

Stateless re-ranking

The number of hits is limited by the query api hits parameter and maxHits setting.
- The hits available for container level re-ranking are the global top ranking hits
- after content nodes have retrieved and ranked the hits
- and global top ranking hits have been found by merging the responses from the content nodes.
+ The hits available for container-level re-ranking are the global top-ranking hits
+ after content nodes have retrieved and ranked the hits,
+ and global top-ranking hits have been found by merging the responses from the content nodes.

@@ -320,7 +320,7 @@

Top-K Query Operators

The nearest neighbor search operator is also a top-k
- retrieval operator and the two operators can be combined in the same query.
+ retrieval operator, and the two operators can be combined in the same query.

@@ -329,9 +329,9 @@

Choosing phased ranking functions

A good quality ranking expression will for most applications consume too much CPU to be runnable on all retrieved or matched documents within the latency budget/SLA.
-The application ranking function should hence in most cases be a second phase function.
-The task then becomes to find a first phase function,
-which correlates sufficiently well with the second phase function.
+The application ranking function should hence in most cases be a second-phase function.
+The task then becomes to find a first-phase function,
+which correlates sufficiently well with the second-phase function.

@@ -469,7 +469,7 @@

Rank phase statistics

Usage

- The framework is flexible in use, the normal use case is:
+ The framework is flexible in use; the normal use case is:

  1.
@@ -507,7 +507,7 @@

    Usage

     field match_count type long {
    -    indexing: 7 | to_long | attribute | summary   # Initialized to 7 for a new document. Default is 0.
    +    indexing: 7 | to_long | attribute | summary   # Initialized to 7 for a new document. The default is 0.
         attribute: mutable
     }