Commit

Merge pull request #997 from vespa-engine/kkraune-patch-1
fix format and language
kkraune authored Dec 19, 2024
2 parents 9d8a797 + bdbeca7 commit 7df4dd1
Showing 1 changed file with 7 additions and 7 deletions.
@@ -20,9 +20,9 @@
"\n",
"This guide illustrates how to feed multiple passages per Vespa document (long-context)\n",
"\n",
-"- Compress token vectors using binarization compatible with Vespa unpackbits\n",
+"- Compress token vectors using binarization compatible with Vespa `unpack_bits`\n",
"- Use Vespa hex feed format for binary vectors with mixed vespa tensors\n",
-"- How to query Vespa with the colbert query tensor representation\n",
+"- How to query Vespa with the ColBERT query tensor representation\n",
"\n",
"Read more about [Vespa Long-Context ColBERT](https://blog.vespa.ai/announcing-long-context-colbert-in-vespa/).\n",
"\n",
@@ -46,7 +46,7 @@
"id": "17d765d7",
"metadata": {},
"source": [
-"Load a checkpoint with colbert and obtain document and query embeddings\n"
+"Load a checkpoint with ColBERT and obtain document and query embeddings\n"
]
},
{
@@ -102,7 +102,7 @@
"id": "23b2e1f4",
"metadata": {},
"source": [
-"See the shape of the colbert document embeddings:"
+"See the shape of the ColBERT document embeddings:"
]
},
{
@@ -155,7 +155,7 @@
"source": [
"The query is always padded to 32 so in the above we have 32 query token vectors.\n",
"\n",
-"Routines for binarization and output in Vespa tensor format that can be used in queries and in JSON feed.\n"
+"Routines for binarization and output in Vespa tensor format that can be used in queries and JSON feed.\n"
]
},
{
@@ -172,7 +172,7 @@
"\n",
"\n",
"def binarize_token_vectors_hex(vectors: torch.Tensor) -> Dict[str, str]:\n",
-" # Notice axix=2 to pack the bits in the last dimension which is the token level vectors\n",
+" # Notice axix=2 to pack the bits in the last dimension, which is the token level vectors\n",
" binarized_token_vectors = np.packbits(np.where(vectors > 0, 1, 0), axis=2).astype(\n",
" np.int8\n",
" )\n",
@@ -440,7 +440,7 @@
"id": "cebada8d",
"metadata": {},
"source": [
-"### Querying Vespa with colbert tensors \n"
+"### Querying Vespa with ColBERT tensors \n"
]
},
{
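
For readers unfamiliar with the binarization step touched by this diff, the idea can be sketched in plain NumPy. This is only an illustration under stated assumptions, not the notebook's routine: the function name `binarize_to_hex` is hypothetical, the input is assumed to be a single passage of shape `(num_tokens, dim)` with `dim` divisible by 8 (the notebook packs a 3-D tensor along `axis=2`), and the output is keyed by token index rather than by the notebook's tensor addresses.

```python
from binascii import hexlify
from typing import Dict

import numpy as np


def binarize_to_hex(vectors: np.ndarray) -> Dict[int, str]:
    """Hypothetical helper: sign-binarize token vectors and hex-encode them.

    Assumes `vectors` has shape (num_tokens, dim) with dim divisible by 8.
    """
    # 1 where the component is positive, 0 otherwise
    bits = np.where(vectors > 0, 1, 0).astype(np.uint8)
    # Pack 8 bits into one byte along the last (embedding) axis, then
    # reinterpret as int8, the cell type used for packed binary vectors
    packed = np.packbits(bits, axis=1).astype(np.int8)
    # One hex string per token vector
    return {i: hexlify(row.tobytes()).decode("utf-8") for i, row in enumerate(packed)}


example = np.array(
    [
        [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0],  # bits 10101010
        [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5],  # bits 11111111
    ]
)
hex_vectors = binarize_to_hex(example)
```

Each 8-dimensional float vector collapses to a single byte here, which is why a 128-dimensional token vector needs only 16 `int8` cells after packing.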

0 comments on commit 7df4dd1
