From 21297c4872932428269e046d4a5993d058cd8d2e Mon Sep 17 00:00:00 2001 From: Robert Shelton Date: Thu, 24 Oct 2024 16:06:01 -0400 Subject: [PATCH] Add hybrid search notebook (#35) * init * update hybrid search notebook * fix colab link * extend hybrid search examples * another pass through * update with redispy release * slight tweak on aggregations * fix >= --------- Co-authored-by: Tyler Hutcherson --- python-recipes/vector-search/01_redisvl.ipynb | 5 +- .../vector-search/02_hybrid_search.ipynb | 1236 +++++++++++++++++ 2 files changed, 1237 insertions(+), 4 deletions(-) create mode 100644 python-recipes/vector-search/02_hybrid_search.ipynb diff --git a/python-recipes/vector-search/01_redisvl.ipynb b/python-recipes/vector-search/01_redisvl.ipynb index c47e943..52484e0 100644 --- a/python-recipes/vector-search/01_redisvl.ipynb +++ b/python-recipes/vector-search/01_redisvl.ipynb @@ -252,9 +252,6 @@ "metadata": {}, "outputs": [], "source": [ - "# from redis.commands.search.field import VectorField, TagField, NumericField, TextField\n", - "# from redis.commands.search.indexDefinition import IndexDefinition, IndexType\n", - "\n", "from redisvl.schema import IndexSchema\n", "from redisvl.index import SearchIndex\n", "\n", @@ -1070,7 +1067,7 @@ ], "metadata": { "kernelspec": { - "display_name": "Python 3 (ipykernel)", + "display_name": "Python 3", "language": "python", "name": "python3" }, diff --git a/python-recipes/vector-search/02_hybrid_search.ipynb b/python-recipes/vector-search/02_hybrid_search.ipynb new file mode 100644 index 0000000..8aa488f --- /dev/null +++ b/python-recipes/vector-search/02_hybrid_search.ipynb @@ -0,0 +1,1236 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "![Redis](https://redis.io/wp-content/uploads/2024/04/Logotype.svg?auto=webp&quality=85,75&width=120)\n", + "# Implementing hybrid search with Redis\n", + "\n", + "Hybrid search is all about combining lexical search with semantic vector search to improve 
result relevancy. This notebook will cover 3 different hybrid search strategies with Redis:\n", + "\n", + "1. Linear combination of scores from lexical search (BM25) and vector search (Cosine Distance) with the aggregation API\n", + "2. Client-Side Reciprocal Rank Fusion (RRF)\n", + "3. Client-Side Reranking with a cross-encoder model\n", + "\n", + ">Note: Additional work is planned within the Redis core and ecosystem to add more flexible hybrid search capabilities in the future.\n", + "\n", + "## Let's Begin!\n", + "\"Open\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Install Packages" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# NBVAL_SKIP\n", + "%pip install -q \"redisvl>=0.3.5\" sentence-transformers pandas \"redis>=5.2.0\"" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Data/Index Preparation\n", + " \n", + "In this section:\n", + "\n", + "1. We prepare the data necessary for our hybrid search implementations by loading a collection of movies. Each movie object contains the following attributes:\n", + " - `title`\n", + " - `rating`\n", + " - `description`\n", + " - `genre`\n", + " \n", + "2. We generate vector embeddings from the movie descriptions. This allows users to perform searches that rely not only on exact matches but also on semantic relevance, helping them find movies that align closely with their interests.\n", + "\n", + "3. After preparing the data, we populate a search index with these movie records, enabling efficient querying based on both lexical and vector-based search techniques." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Running remotely or in Colab? Run this cell to download the necessary dataset."
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# NBVAL_SKIP\n", + "!git clone https://github.com/redis-developer/redis-ai-resources.git temp_repo\n", + "!mv temp_repo/python-recipes/vector-search/resources .\n", + "!rm -rf temp_repo" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Install Redis Stack\n", + "\n", + "Later in this tutorial, Redis will be used to store, index, and query vector\n", + "embeddings and full text fields. **We need to have a Redis\n", + "instance available.**\n", + "\n", + "#### Local Redis\n", + "Use the shell script below to download, extract, and install [Redis Stack](https://redis.io/docs/getting-started/install-stack/) directly from the Redis package archive." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# NBVAL_SKIP\n", + "%%sh\n", + "curl -fsSL https://packages.redis.io/gpg | sudo gpg --dearmor -o /usr/share/keyrings/redis-archive-keyring.gpg\n", + "echo \"deb [signed-by=/usr/share/keyrings/redis-archive-keyring.gpg] https://packages.redis.io/deb $(lsb_release -cs) main\" | sudo tee /etc/apt/sources.list.d/redis.list\n", + "sudo apt-get update > /dev/null 2>&1\n", + "sudo apt-get install redis-stack-server > /dev/null 2>&1\n", + "redis-stack-server --daemonize yes" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Alternative Redis Access (Cloud, Docker, other)\n", + "There are many ways to get the necessary redis-stack instance running\n", + "1. On cloud, deploy a [FREE instance of Redis in the cloud](https://redis.com/try-free/). Or, if you have your\n", + "own version of Redis Enterprise running, that works too!\n", + "2. Per OS, [see the docs](https://redis.io/docs/latest/operate/oss_and_stack/install/install-stack/)\n", + "3. 
With docker: `docker run -d --name redis-stack-server -p 6379:6379 redis/redis-stack-server:latest`" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Define the Redis Connection URL\n", + "\n", + "By default this notebook connects to the local instance of Redis Stack. **If you have your own Redis Enterprise instance** - replace REDIS_PASSWORD, REDIS_HOST and REDIS_PORT values with your own." + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": {}, + "outputs": [], + "source": [ + "import os\n", + "import warnings\n", + "\n", + "warnings.filterwarnings('ignore')\n", + "\n", + "# Replace values below with your own if using Redis Cloud instance\n", + "REDIS_HOST = os.getenv(\"REDIS_HOST\", \"localhost\") # ex: \"redis-18374.c253.us-central1-1.gce.cloud.redislabs.com\"\n", + "REDIS_PORT = os.getenv(\"REDIS_PORT\", \"6379\") # ex: 18374\n", + "REDIS_PASSWORD = os.getenv(\"REDIS_PASSWORD\", \"\") # ex: \"1TNxTEdYRDgIDKM2gDfasupCADXXXX\"\n", + "\n", + "# If SSL is enabled on the endpoint, use rediss:// as the URL prefix\n", + "REDIS_URL = f\"redis://:{REDIS_PASSWORD}@{REDIS_HOST}:{REDIS_PORT}\"" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Create redis client, load data, generate embeddings" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [], + "source": [ + "from redis import Redis\n", + "\n", + "client = Redis.from_url(REDIS_URL)" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [], + "source": [ + "import json\n", + "\n", + "with open(\"resources/movies.json\", 'r') as file:\n", + " movies = json.load(file)" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": {}, + "outputs": [], + "source": [ + "from redisvl.utils.vectorize import HFTextVectorizer\n", + "\n", + "# load model for embedding our movie descriptions\n", + "model = 
HFTextVectorizer('sentence-transformers/all-MiniLM-L6-v2')" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": {}, + "outputs": [], + "source": [ + "movie_data = [\n", + " {\n", + " **movie,\n", + " \"description_vector\": model.embed(movie[\"description\"], as_buffer=True, dtype=\"float32\")\n", + " } for movie in movies\n", + "]" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "[{'title': 'Explosive Pursuit',\n", + " 'genre': 'action',\n", + " 'rating': 7,\n", + " 'description': 'A daring cop chases a notorious criminal across the city in a high-stakes game of cat and mouse.',\n", + " 'description_vector': b'\\x9bf|=\\x0e`\\n;\"\\x92\\xb7;<\\xcb~\\xbd\\xfad\\xce\\xbb\\xc3\\x16J=V\\xa7?=\\xedv\\x95\\xaa\\x1c=\\xfd\\xee\\x89<\\xbd\\xb0-<\\x82\\xb2\\x9f\\xbc[\\x0b\\xc3\\xbd\\x98NR=xl\\xf7\\xbcN>\\x17\\xbe#\\x12\\x05\\xb99u\\xbf<\\xb0\\xe0b\\xba\\xd3\\xa6\\xa8\\xbdx\\xdc\\xec\\xbcRc%=\\xe4\\xe7r\\xbb\\x1eOG=?(\\x85=o@\\xa2\\xbc2Z\\xd0\\xbdC%K\\xbd\\xb9\\xed\\x94\\xbcR\\xddH=\\x92&F<\\xc6*\\xec<\\x90\\xd8\\x8d\\xbd\\xcbZ\\x98<\\t\\xa3\\xa3=>g3\\xbd&\\xcd\\xbd\\xbd\\x95$\\xf7;\\xfd\\xf4z=\\xfc\\xb4\\x8c=\\x85\\x0e\\xc6\\xbdnI\\x90\\xbdJ\\x16\\xbd;s\\xe7\\x0c\\xbd 3\\xc9\\xbc\\x85\\xf8\\xbb\\xbc\\xbf&u\\xbb5\\x8f\\xca<\\x05\\x80J=\\x0f\\xaf*=\\x8bOU\\xbd\\xc8\\xf0\\x95\\xbc\\x1d\\x02\\x19=)\\xf4K<\\xcb\\xc2\\t=F\\x83\\xac=\\x9f\\xd7\\xb8\\xbd\\xf2\\xb5\\x9c\\xbdB\\x85\\x18=\\x96d&=-3\\xf8<\\xfa\\xf7\\x88<\\x16v\\xf2\\xbb-=[\\xbd\\xf7\\xac\\xee\\xbb5:A\\xbd\\xd9d\\x19\\xbdrd\\xf2\\xbb!\\xbax;\\xdc;O<\\xb61,\\xbc\\xed\\xae\\xae=^\\x00-\\xbc\\x1a\\x06\\xae\\xbda\\xd6\\x1a=\\xcc\\xbf\\xcd=\\x1f\\x150=\\xcf\\xf1\\x9d\\xbc\\xa9GK=\\xaa\\xb8 
=\\xb4\\xf1I\\xbd\"e\\x9e\\xbbF\\x8b\\xf7:\\x94\\xf8\\x1c=\\xa9\\xba\\xde<\\xcco\\x16\\xbb\\xe6]p\\xbb\\xbb\\xd5<<\\xac\\x95\\xa3\\xb8\\xc29s<&4&\\x10\\x90\\xbbvt\\xb9\\xbb\\x00\\xc9\\xb9\\xbb\\xfehk=\\x9a\\r\\xad<3f\\xa8\\xbd\\xbd]\\xcc=\\x15\\xe0 \\xbe\\xc74/\\xbd{f\\xf7\\xbcQ\\x9av=\\x11\\x0cq<,\\xda\\x1c\\xbd\\x01\\t\\x8b<\\xf0n\\xa6\\xbc\\xe4t\\x86<\\x82\\x87\\x19=v\\xae\\xe4\\xbc4m^\\xbc\\nV\\x0e\\xbd\\x81\\xb0\\xe3\\xbc\\xd3FU;\\xaaG|\\xbdW\\xfb\\x8b\\xbd\\x7f\\x81*\\xbdy\\x83\\xf4={\\xb7\\x10;\\x15!\\x0e\\xbd\\xfa\\xd3\\xb4=\\x15&\\x15\\xbdM\\x86\\x83=m$:\\xbdv\\x1bF\\xbd\\xa2?\\x14\\xbe\\xc5\\x8f(\\xbd\\xe3O\\x89\\xbd\\x17\\xae\\xd4<\\xa3\\x12\\xc3=\\xaf\\x05O\\xbd\\x7f\\x8ep\\xbc!\\xb5\\xac\\xbc\\xc4\\x9ee\\xbd9\\x8es;[a\\xc1;\\xd2\\xfaB\\xbd\\xf9#\\xfe:\\x90\\xe6\\xf4=\\xb2\\x15*<~\\xf8\\x1b=\\x01\\xfcV\\xbd\\xcf\\xd1\\r=*\\xee\\x06=\\x18u\\xba\\xbd\\x02\\xa4\\xd6<\\xf8\\xeb\\xd9;\\xc49/=\\xa8\\xc2\\x85=u\\x0b\"=\\xe9i\\xef<4\\xe8c=\\xfa2\\x08\\xbe\\xd4\\x12;=,VW;\\x15\\xa4b<\\xb0\\x9d\\xb7<\\x95r;\\xbd{z\\x91\\xbcI\\x00<\\xbd\\x18\\x1a\\xa3<\\xf9J%\\xbc\\n\\xe7\\xbf\\xbbr\\x87\\x12=\\x97\\x1d\\x95=\\x83|\\xfd\\xbc\\xed\\xf1\\xd1\\xbd%z\\x84;\\xcb\\tu=c\\x8ai\\x85<\\xa29,=\\xbb\\xf5\\xdf\\xba\\xa0\\x14:\\xbdL9\\x08\\xbd\\x02\\x0c\\xbe\\xbcr\\xb9\\x9a<\\xab_6=\\x17Ub\\xbd\\xa4\\xb7#=[\\xee\\xa2\\xbag\\x95\\xe1\\xbc\\xfc\\xefX=\\xa2u\\x11=>\\xd86=\\xb8\\x06\\x9f\\xbc(\\xe5\\xf0<#\\x15t=\\xa0\\xaf\\xd0\\xbbeK-=\\xd5H\\x11\\xbd\\xd2\\x036=\\xff\\x15\\xd8<0x\\xfd\\xbcO\\x10\\x9b=\\xb8\\xdf_\\xbc\\xbe\\xff\\x03\\xbd\\xfbD\\xaa=\\xc5\\xab\\x0b\\xbd!$\\xe6\\xbc7\\x0cr=v\\xbc\\x99=\\xb6\\xae\\xa6<\\x1e\\x9b$\\xbd\\x98y\\x06\\xbd\\xe2\\xcf\\xde=\\xefX\\x8f=g%\\r\\xbd\\xbby\\x0e\\xbc4\\xe0\\t<\\'\\tI=\\xf8w\\x10\\xbd\\xfc\\xd4;\\xbd\\x82\\x0f\\xd9<\\xcd\\xe8\\x93\\xbb\\\\\\xdf\\xba\\xbd\\\\ 
c=|\\x9b\\x97;\\x19u\\xe0\\xbc\\x9a\\x10\\x9e\\xbdr\\xf4~=e\\x9ehh\\xa6\\xaf<\\xc4\\x8b\\x83\\xbb\\x19\\x1e\\x17\\xbd\\x87L*\\xbds\\x08m\\xbc\\xfcV\\x989C\\xf9\\xc2\\xbd\\x00g\\x11=\\xcf\\xdc\\xd7\\xbd\\xc9\\xfax<\\xa2\\xc0\\xa9;t\\xd6\\xc8\\xbb@1I\\xbd\\x19\\x7f\\x0c\\xbd\\x87P\\xb8\\xba\\x0e\\x14\\xf1\\xbc\\x9f\\xf2\\xca\\xbd\\xf5uA\\xbc\\xb6\\xf9<;\\x1e\\x0e\\x9d\\xbb{\\xd1r\\xbd\\xd4\\xc3}\\xbc\\xc6\\xc0\\xe5\\xbd\\x05\\x18\\xf4=\\xaaTp\\xbd!gC<\\xe5:\\x16\\xbd1|\\x19\\xbb\\xe3.\\xbf<\\xea$5=QGl=1\\xbd\\\\=bGE\\xbc\\xae\\xb8\\x85\\xbd\\xd2\\xd8Y\\xbd\\x17\\xfb\\xff;0\\r\\x88=\\x8f\\xe1\\xab=\\x84{@\\xbd\\x11O\\xc6\\xbb\\xba$o=\\x0e#\\xf4\\xbdk\\x98\\xde=\\x96~0>\\x82 \\x98\\xbc|\\xd9\\x03\\xbe\\xaek\\x8a\\xbd\\xa1l/=\\xd1ul\\xbd$\\xfb\\xd5\\x07\\xcb\\xe9\\xcd\\xbc\\xf1\\x17>\\xbdO\\xc0\\x83\\xbc=\\x1bY\\xbd>\\xd8\\x94\\xbd\\xc0/\\x1d\\xbc4M\\x07\\xbeN\\xdd\\x8f=+\\x08\\xc1\\xbcV\\xe6NJ\\x8f\\x7f<\\xccE\\xb5\\xbd\\x1aF\\x05=a@/=\\xa0\\xad1\\xbd \\xb1\\x8a=\\x14u\\x04\\xbc\\x9cI \\xbd9\\x8b\\x9b\\xbd\\x8bF\\xc4=\\xf7\\xf7;K\\xa6\\x05\\xbd\\x9du\\xe8<\\xb4\\x88N=\\xab\\x13\\x07\\xbd\\xef_`\\xbdS\\xc7\\x99\\xbd\\xd7\\x92\\xb9\\xd8)=\\x12G\\xe1\\xbd\\xden\\x18<\\xabem\\xbd\\xc4\\x9a8\\xbdh\\nL=`\\xbd8=U\\xe1\\xe1<\\x01\\xa0-\\xbb\\xa2v\\xab<\\xfeD(\\xbc\\xc0\\xfcy<\\x11y\\x96\\xbd\\xa8\\t\\xbf\\xbdIu\\xf8:\\x9a\\x1b:='},\n", + " {'title': 'Fast & Furious 9',\n", + " 'genre': 'action',\n", + " 'rating': 6,\n", + " 'description': 'Dom and his crew face off against a high-tech enemy with advanced weapons and technology.',\n", + " 'description_vector': 
b'&\\xa5\\xc7\\xbc\\xf7,\\xa2==\\x19H\\xbcF\\xc6t\\xbd\\xa3\\xa2C=\\x15\\x0f\\x18\\xbc\\xc8Kz=\\xeb\\x13\\xa0=\\xe5\\xe1\\x8c\\xbd\\xc3\\x84&=wZ\\x07=\\xbf\\xa8M\\xbc\\xb0\\xfaq=d\\x8b\\xe3\\xbc\\xdb\\xa3A\\xbd)\\'\\x13\\xbd\\x00\\x84\\x8a=\\xfb\\x9e\\xdd;@&s=\\x9b0l<\\xcbS\\x03\\xbcQ\\xf1:\\xbc\\xe6\\x07\\x14=u\\r\\x03\\xbd\\xa8\\x18\\xb6\\xbd\\xc5\\xf0\\xbf=b(\\xae=4t\\x91\\xbd\\xfc\\x96n\\xbc\\xc8>\\xbb\\xbc\\xb6\\x87\\t=\\x7f\\xc0\\xda\\xbc\\x8d\\xf6@\\xbcf\\xcd\\'\\xbci\\x9a\\x10\\xbe\\x00\\x98\\xaf=\\x9c\\x8f\\xd1\\xbc$\\xa4C=$\\xee)\\xbc\\x80g\\x9d\\xbcm6\\x98\\xbd\\x00\\x01\\x8a\\xbd\\xc9l\\x15=2\\x19\\x03\\xbd\\xf1\\xba\\xd5<\\x0b\\x8b\\xa2\\xbc\\x80K\\x8a=\\xf7\\'h<\\x89\\xe2\\n\\xbdX\\xd4\\xcd<\\x03?9\\xbcZ\\x1eh=\\xcc\\xa8a=\\xc7\\xcd\\xbf\\xbb)\\x00;=jK\\x9e=\\x95\\x84\\x97\\xbdv\\x82\\xb3=\\xa1\\xd8\\xb8;\\xd3\\xa6j<\\x87\\xdd\\x9b\\xbc3\\x03}\\xbd\\xbc\\xa3\\xdc\\xe1\\xd1Q\\xbdU\\x15\\xcf\\xbc\\x13\\x0c\\xb0\\xbc3\\xc8\\xfc<\\x04\\x8d\\x98=t\\x0e9=O3K=K\\xf2\\xcd\\xbc\\xdf\\x04E\\xbd\\xfc\\x987;\\x9e\\x9ct\\xbd\\xbfy|=\\xf8\\xd2\\x80<\\x00\\xa4\\x0c<\\x01\\x0e\\x18>\\x11\\x14q\\xbdi\\xe6Q=qR:\\xbd\\xbf\\xd4k\\xbd\\xbdX\\x81=\\x00|\\x98\\xbc\\n\\xbe\\xaf\\xbd\\xc6\\xe4\\xc6=\\xf4\\xc7\\x8e\\xbd_\\xd9\\xff\\xbc\\xc6\\xe50\\xbd_-\\xaa\\xbc\\x16\\xdf\\x92;p\\x9e\\xc2\\xc0XB=L\\xb5\\x99\\xbb\\x086\\x90\\xbc\\xab\\x99\\x98=\\x8a\\xb16\\xbc\\xcaE\\xba\\xbd\\x93\\x93W=\\xe7\\r\\xe9<\\xbf\\xb7\\x8e=\\xf0X\\xa9=\\xf2;\\x18\\xba{U\\x15\\xbd\\xefH\\x00\\xbd\\x12g\\xa2;\\x81\\xb0\\xb3\\xbd\\x8f\\x8c =T\\x7fv\\xbb\\x08y\\x84\\xbc\\xba\\'\\xd8\\xba1\\x92\\xa5\\xbc5\\x1b)\\xbc\\x803\\xae\\xbb\"O\\x95<\\xe4\\x82\\x9d\\xbc{O\\x08 str:\n", + " \"\"\"Convert a raw user query to a redis full text query joined by ORs\"\"\"\n", + " tokens = [token.strip().strip(\",\").lower() for token in user_query.split()]\n", + " return \" | \".join([token for token in tokens if token not in stopwords])\n", + "\n", + "# Example\n", + "tokenize_query(user_query)" + ] + }, + { + 
"cell_type": "markdown", + "metadata": {}, + "source": [ + "Next, we need methods to create vector search and full-text search queries:" + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "metadata": {}, + "outputs": [], + "source": [ + "# Function to create a vector query using RedisVL helpers for ease of use\n", + "from redisvl.query import VectorQuery, FilterQuery\n", + "from redisvl.query.filter import Text\n", + "from redisvl.redis.utils import convert_bytes, make_dict\n", + "\n", + "\n", + "def make_vector_query(user_query: str, num_results: int, filters = None) -> VectorQuery:\n", + " \"\"\"Generate a Redis vector query given user query string.\"\"\"\n", + " vector = model.embed(user_query, as_buffer=True, dtype=\"float32\")\n", + " query = VectorQuery(\n", + " vector=vector,\n", + " vector_field_name=\"description_vector\",\n", + " num_results=num_results,\n", + " return_fields=[\"title\", \"description\"]\n", + " )\n", + " if filters:\n", + " query.set_filter(filters)\n", + " \n", + " return query\n", + "\n", + "\n", + "def make_ft_query(text_field: str, user_query: str, num_results: int) -> FilterQuery:\n", + " \"\"\"Generate a Redis full-text query given a user query string.\"\"\"\n", + " return FilterQuery(\n", + " filter_expression=f\"~({Text(text_field) % tokenize_query(user_query)})\",\n", + " num_results=num_results,\n", + " return_fields=[\"title\", \"description\"],\n", + " dialect=4,\n", + " ).scorer(\"BM25\").with_scores()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 1. Linear Combination using Aggregation API\n", + "\n", + "The goal of this technique is to calculate a weighted sum of the BM25 score for our provided text search and the cosine distance between vectors calculated via a KNN vector query. 
This is possible in Redis using the [aggregations API](https://redis.io/docs/latest/develop/interact/search-and-query/advanced-concepts/aggregations/), as of `Redis 7.4.x` (search version `2.10.5`), within a single database call.\n", + "\n", + "In Redis, the aggregations API lets you group, sort, and transform your result data, much as you would with group-by and sum operations in other database paradigms.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "First, we build a base `VectorQuery` that runs a KNN-style vector search and test it below:" + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "[{'id': 'movie:dba67e0f8f4f45e38ba58533a7e70ec3',\n", + " 'vector_distance': '0.643690049648',\n", + " 'title': 'The Incredibles',\n", + " 'description': \"A family of undercover superheroes, while trying to live the quiet suburban life, are forced into action to save the world. Bob Parr (Mr. Incredible) and his wife Helen (Elastigirl) were among the world's greatest crime fighters, but now they must assume civilian identities and retreat to the suburbs to live a 'normal' life with their three children.
However, the family's desire to help the world pulls them back into action when they face a new and dangerous enemy.\"},\n", + " {'id': 'movie:0d8537e75af24af6b118f4629c2758a3',\n", + " 'vector_distance': '0.668439269066',\n", + " 'title': 'Explosive Pursuit',\n", + " 'description': 'A daring cop chases a notorious criminal across the city in a high-stakes game of cat and mouse.'},\n", + " {'id': 'movie:b81aad8ca262422cb80ba725b17afce4',\n", + " 'vector_distance': '0.698122382164',\n", + " 'title': 'Mad Max: Fury Road',\n", + " 'description': \"In a post-apocalyptic wasteland, Max teams up with Furiosa to escape a tyrant's clutches and find freedom.\"}]" + ] + }, + "execution_count": 12, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "query = make_vector_query(user_query, num_results=3)\n", + "\n", + "# Check standard vector search results\n", + "index.query(query)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Next, we add a full-text search predicate using RedisVL helpers and our user-query tokenizer:" + ] + }, + { + "cell_type": "code", + "execution_count": 13, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "'(~@description:(action | adventure | movie | great | fighting | scenes | crime | busting | superheroes | magic))=>[KNN 3 @description_vector $vector AS vector_distance]'" + ] + }, + "execution_count": 13, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "base_full_text_query = str(Text(\"description\") % tokenize_query(user_query))\n", + "\n", + "# Add the optional flag, \"~\", so that this doesn't also act as a strict text filter\n", + "full_text_query = f\"(~{base_full_text_query})\"\n", + "\n", + "\n", + "# Add full-text predicate to the vector query \n", + "query.set_filter(full_text_query)\n", + "query.query_string()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**The query string above combines both full-text 
search and a vector search.** This will be passed to the aggregation API, which combines the two scores using a simple weighted sum before a final sort and truncation.\n", + "\n", + "Note: the following query requires `redis-py >= 5.2.0`." + ] + }, + { + "cell_type": "code", + "execution_count": 14, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "[{'vector_distance': '0.643690049648',\n", + " '__score': '0.968066079387',\n", + " 'title': 'The Incredibles',\n", + " 'description': \"A family of undercover superheroes, while trying to live the quiet suburban life, are forced into action to save the world. Bob Parr (Mr. Incredible) and his wife Helen (Elastigirl) were among the world's greatest crime fighters, but now they must assume civilian identities and retreat to the suburbs to live a 'normal' life with their three children. However, the family's desire to help the world pulls them back into action when they face a new and dangerous enemy.\",\n", + " 'cosine_similarity': '0.678154975176',\n", + " 'bm25_score': '0.968066079387',\n", + " 'hybrid_score': '0.765128306439'},\n", + " {'vector_distance': '0.668439269066',\n", + " '__score': '0',\n", + " 'title': 'Explosive Pursuit',\n", + " 'description': 'A daring cop chases a notorious criminal across the city in a high-stakes game of cat and mouse.',\n", + " 'cosine_similarity': '0.665780365467',\n", + " 'bm25_score': '0',\n", + " 'hybrid_score': '0.466046255827'},\n", + " {'vector_distance': '0.698122382164',\n", + " '__score': '0',\n", + " 'title': 'Mad Max: Fury Road',\n", + " 'description': \"In a post-apocalyptic wasteland, Max teams up with Furiosa to escape a tyrant's clutches and find freedom.\",\n", + " 'cosine_similarity': '0.650938808918',\n", + " 'bm25_score': '0',\n", + " 'hybrid_score': '0.455657166243'}]" + ] + }, + "execution_count": 14, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "from typing import Any, Dict, List\n", + "from
redis.commands.search.aggregation import AggregateRequest, Desc\n", + "\n", + "# Build the aggregation request\n", + "req = (\n", + " AggregateRequest(query.query_string())\n", + " .scorer(\"BM25\")\n", + " .add_scores()\n", + " .apply(cosine_similarity=\"(2 - @vector_distance)/2\", bm25_score=\"@__score\")\n", + " .apply(hybrid_score=\"0.3*@bm25_score + 0.7*@cosine_similarity\")\n", + " .load(\"title\", \"description\", \"cosine_similarity\", \"bm25_score\", \"hybrid_score\")\n", + " .sort_by(Desc(\"@hybrid_score\"), max=3)\n", + " .dialect(4)\n", + ")\n", + "\n", + "# Run the query\n", + "res = index.aggregate(req, query_params={'vector': query._vector})\n", + "\n", + "# Perform output parsing\n", + "[make_dict(row) for row in convert_bytes(res.rows)]\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Notes on aggregate query syntax:\n", + "- `.scorer`: specifies the scoring function to use (BM25 in this case)\n", + " - [see docs](https://redis.io/docs/latest/develop/interact/search-and-query/advanced-concepts/scoring/) for all available scorers\n", + "- `.add_scores`: adds the text-search score to each result as `@__score`\n", + "- `.apply`: computes new fields from arithmetic expressions, which you can customize for your use case\n", + "- `.load`: specifies the fields to return (all of them in this case)\n", + "- `.sort_by`: sorts the output by the hybrid score and keeps the top 3 results (`max=3`)\n", + "- `.dialect`: specifies the query dialect to use." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "For simplicity, we now define a function that performs the entire operation from start to finish."
+ ] + }, + { + "cell_type": "code", + "execution_count": 15, + "metadata": {}, + "outputs": [], + "source": [ + "def linear_combo(user_query: str, alpha: float, num_results: int = 3) -> List[tuple]:\n", + " # Add the optional flag, \"~\", so that this doesn't also act as a strict text filter\n", + " text = f\"(~{Text('description') % tokenize_query(user_query)})\"\n", + "\n", + " # Build vector query\n", + " query = make_vector_query(user_query, num_results=num_results, filters=text)\n", + " \n", + " # Build aggregation\n", + " req = (\n", + " AggregateRequest(query.query_string())\n", + " .scorer(\"BM25\")\n", + " .add_scores()\n", + " .apply(cosine_similarity=\"(2 - @vector_distance)/2\", bm25_score=\"@__score\")\n", + " .apply(hybrid_score=f\"{1-alpha}*@bm25_score + {alpha}*@cosine_similarity\")\n", + " .sort_by(Desc(\"@hybrid_score\"), max=num_results)\n", + " .load(\"title\", \"description\", \"cosine_similarity\", \"bm25_score\", \"hybrid_score\")\n", + " .dialect(4)\n", + " )\n", + "\n", + " # Run the query\n", + " res = index.aggregate(req, query_params={'vector': query._vector})\n", + "\n", + " # Perform output parsing\n", + " if res:\n", + " movies = [make_dict(row) for row in convert_bytes(res.rows)]\n", + " return [(movie[\"title\"], movie[\"hybrid_score\"]) for movie in movies]" + ] + }, + { + "cell_type": "code", + "execution_count": 16, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "[('The Incredibles', '0.765128306439'),\n", + " ('Explosive Pursuit', '0.466046255827'),\n", + " ('Mad Max: Fury Road', '0.455657166243'),\n", + " ('The Dark Knight', '0.452280691266'),\n", + " ('Despicable Me', '0.448826777935'),\n", + " ('Inception', '0.434456580877')]" + ] + }, + "execution_count": 16, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Test it out\n", + "\n", + "# 70% of the hybrid search score based on cosine similarity\n", + "linear_combo(user_query, alpha=0.7, num_results=6)" + ] + }, + {
+ "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 2. Client-side fusion with RRF\n", + "\n", + "Instead of relying on document scores like cosine similarity and BM25/TFIDF, we can fetch items and focus on their rank. This rank can be utilized to create a new ranking metric known as [Reciprocal Rank Fusion (RRF)](https://plg.uwaterloo.ca/~gvcormac/cormacksigir09-rrf.pdf). RRF is powerful because it can handle ranked lists of different length, scores of different scales, and other complexities.\n", + "\n", + "Although Redis does not currently support RRF natively, we can easily implement it on the client side." + ] + }, + { + "cell_type": "code", + "execution_count": 17, + "metadata": {}, + "outputs": [], + "source": [ + "def fuse_rankings_rrf(*ranked_lists, weights=None, k=60):\n", + " \"\"\"\n", + " Perform Weighted Reciprocal Rank Fusion on N number of ordered lists.\n", + " \"\"\"\n", + " item_scores = {}\n", + " \n", + " if weights is None:\n", + " weights = [1.0] * len(ranked_lists)\n", + " else:\n", + " assert len(weights) == len(ranked_lists), \"Number of weights must match number of ranked lists\"\n", + " assert all(0 <= w <= 1 for w in weights), \"Weights must be between 0 and 1\"\n", + " \n", + " for ranked_list, weight in zip(ranked_lists, weights):\n", + " for rank, item in enumerate(ranked_list, start=1):\n", + " if item not in item_scores:\n", + " item_scores[item] = 0\n", + " item_scores[item] += weight * (1 / (rank + k))\n", + " \n", + " # Sort items by their weighted RRF scores in descending order\n", + " return sorted(item_scores.items(), key=lambda x: x[1], reverse=True)" + ] + }, + { + "cell_type": "code", + "execution_count": 18, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "[(2, 0.04814747488101534),\n", + " (1, 0.032266458495966696),\n", + " (6, 0.03200204813108039),\n", + " (5, 0.01639344262295082),\n", + " (4, 0.016129032258064516),\n", + " (3, 0.015873015873015872),\n", + " (7, 0.015625),\n", + " (8, 
0.015384615384615385)]" + ] + }, + "execution_count": 18, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Below is a simple example of RRF over a few lists of numbers\n", + "fuse_rankings_rrf([1, 2, 3], [2, 4, 6, 7, 8], [5, 6, 1, 2])" + ] + }, + { + "cell_type": "code", + "execution_count": 19, + "metadata": {}, + "outputs": [], + "source": [ + "def weighted_rrf(\n", + " user_query: str,\n", + " alpha: float = 0.5,\n", + " num_results: int = 4,\n", + " k: int = 60,\n", + ") -> List[Dict[str, Any]]:\n", + " \"\"\"Implemented client-side RRF after querying from Redis.\"\"\"\n", + " # Create the vector query\n", + " vector_query = make_vector_query(user_query, num_results=len(movie_data))\n", + "\n", + " # Create the full-text query\n", + " full_text_query = make_ft_query(\"description\", user_query, num_results=len(movie_data))\n", + "\n", + " # Run queries individually\n", + " vector_query_results = index.query(vector_query)\n", + " full_text_query_results = index.query(full_text_query)\n", + "\n", + " # Extract titles from results\n", + " vector_titles = [movie[\"title\"] for movie in vector_query_results]\n", + " full_text_titles = [movie[\"title\"] for movie in full_text_query_results]\n", + "\n", + " # Perform weighted RRF\n", + " return fuse_rankings_rrf(vector_titles, full_text_titles, weights=[alpha, 1-alpha], k=k)[:num_results]" + ] + }, + { + "cell_type": "code", + "execution_count": 20, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "[('The Incredibles', 0.016009221311475412),\n", + " ('Explosive Pursuit', 0.01575682382133995),\n", + " ('Mad Max: Fury Road', 0.015079365079365078),\n", + " ('Finding Nemo', 0.015008960573476702),\n", + " ('Fast & Furious 9', 0.014925373134328358),\n", + " ('The Dark Knight', 0.014854753521126762)]" + ] + }, + "execution_count": 20, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Test it out!\n", + "weighted_rrf(user_query, num_results=6)" + 
] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now suppose we want to give more weight to the vector search rankings, boosting the contribution of semantic similarity to the final rank:" + ] + }, + { + "cell_type": "code", + "execution_count": 21, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "[('The Incredibles', 0.016162909836065574),\n", + " ('Explosive Pursuit', 0.015905707196029777),\n", + " ('Mad Max: Fury Road', 0.015396825396825395),\n", + " ('The Dark Knight', 0.015162852112676057),\n", + " ('Fast & Furious 9', 0.014925373134328356),\n", + " ('Inception', 0.014715649647156496)]" + ] + }, + "execution_count": 21, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "weighted_rrf(user_query, alpha=0.7, num_results=6)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 3. Client-side reranking\n", + "\n", + "An alternative approach to RRF is to simply use an external reranker to order the final recommendations. RedisVL has built-in integrations with a few popular reranking modules."
+ ] + }, + { + "cell_type": "code", + "execution_count": 22, + "metadata": {}, + "outputs": [ + { + "data": { + "application/vnd.jupyter.widget-view+json": { + "model_id": "349ccd5a976f4283866adfc290ab85ea", + "version_major": 2, + "version_minor": 0 + }, + "text/plain": [ + "config.json: 0%| | 0.00/794 [00:00 List[Dict[str, Any]]:\n", + " \"\"\"Rerank the candidates based on the user query with an external model/module.\"\"\"\n", + " # Create the vector query\n", + " vector_query = make_vector_query(user_query, num_results=num_results)\n", + "\n", + " # Create the full-text query\n", + " full_text_query = make_ft_query(\"description\", user_query, num_results=num_results)\n", + "\n", + " # Run queries individually\n", + " vector_query_results = index.query(vector_query)\n", + " full_text_query_results = index.query(full_text_query)\n", + "\n", + " # Assemble list of potential movie candidates with their IDs\n", + " movie_map = {}\n", + " for movie in vector_query_results + full_text_query_results:\n", + " candidate = f\"Title: {movie['title']}. 
Description: {movie['description']}\"\n", + " if candidate not in movie_map:\n", + " movie_map[candidate] = movie\n", + "\n", + " # Rerank candidates\n", + " reranked_movies, scores = reranker.rank(\n", + " query=user_query,\n", + " docs=list(movie_map.keys()),\n", + " limit=num_results,\n", + " return_score=True\n", + " )\n", + "\n", + " # Map the reranked candidates back to their movie titles, paired with scores\n", + " return [\n", + " (movie_map[movie['content']][\"title\"], score)\n", + " for movie, score in zip(reranked_movies, scores)\n", + " ]\n" + ] + }, + { + "cell_type": "code", + "execution_count": 23, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "[('The Incredibles', -0.45268189907073975),\n", + " ('The Dark Knight', -7.411877632141113),\n", + " ('Explosive Pursuit', -8.751346588134766),\n", + " ('Mad Max: Fury Road', -7.049145698547363),\n", + " ('Aladdin', -9.638406753540039),\n", + " ('Despicable Me', -9.797615051269531)]" + ] + }, + "execution_count": 23, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Test it out!\n", + "rerank(user_query, num_results=6)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "This technique is much slower than simple RRF because it runs an additional cross-encoder model to rerank the results. It can be fairly computationally expensive, but it is tunable once the use case is clear (how many items to retrieve? how many items to rerank? model acceleration via GPU?)." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Comparing Approaches\n", + "\n", + "Each approach has strengths and weaknesses, and each may work better in some use cases than in others. Below we will run through a sample of user queries and generate matches for each using different hybrid search techniques."
+ ] + }, + { + "cell_type": "code", + "execution_count": 24, + "metadata": {}, + "outputs": [], + "source": [ + "movie_user_queries = [\n", + " \"I'm in the mood for a high-rated action movie with a complex plot\",\n", + " \"What's a funny animated film about unlikely friendships?\",\n", + " \"Any movies featuring superheroes or extraordinary abilities\", \n", + " \"I want to watch a thrilling movie with spies or secret agents\",\n", + " \"Are there any comedies set in unusual locations or environments?\",\n", + " \"Find me an action-packed movie with car chases or explosions\",\n", + " \"What's a good family-friendly movie with talking animals?\",\n", + " \"I'm looking for a film that combines action and mind-bending concepts\",\n", + " \"Suggest a movie with a strong female lead character\",\n", + " \"What are some movies that involve heists or elaborate plans?\",\n", + " \"I need a feel-good movie about personal growth or transformation\",\n", + " \"Are there any films that blend comedy with action elements?\", \n", + " \"Show me movies set in dystopian or post-apocalyptic worlds\",\n", + " \"I'm interested in a movie with themes of revenge or justice\",\n", + " \"What are some visually stunning movies with impressive special effects?\"\n", + "]" + ] + }, + { + "cell_type": "code", + "execution_count": 25, + "metadata": {}, + "outputs": [], + "source": [ + "import pandas as pd\n", + "\n", + "\n", + "rankings = pd.DataFrame()\n", + "rankings[\"queries\"] = movie_user_queries\n", + "\n", + "# First, add new columns to the DataFrame\n", + "rankings[\"hf-cross-encoder\"] = \"\"\n", + "rankings[\"rrf\"] = \"\"\n", + "rankings[\"linear-combo-bm25-cosine\"] = \"\"" + ] + }, + { + "cell_type": "code", + "execution_count": 26, + "metadata": {}, + "outputs": [], + "source": [ + "# Now iterate through the queries and add results\n", + "for i, user_query in enumerate(movie_user_queries):\n", + " rankings.at[i, \"hf-cross-encoder\"] = rerank(user_query, num_results=4)\n", + 
" rankings.at[i, \"rrf\"] = weighted_rrf(user_query, alpha=0.7, num_results=4)\n", + " rankings.at[i, \"linear-combo-bm25-cosine\"] = linear_combo(user_query, alpha=0.7, num_results=4)" + ] + }, + { + "cell_type": "code", + "execution_count": 27, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
<div>\n", + "<table border=\"1\" class=\"dataframe\">\n", + " <thead>\n",
+ " <tr style=\"text-align: right;\">\n", + " <th></th>\n", + " <th>queries</th>\n", + " <th>hf-cross-encoder</th>\n", + " <th>rrf</th>\n", + " <th>linear-combo-bm25-cosine</th>\n", + " </tr>\n", + " </thead>\n", + " <tbody>\n",
+ " <tr>\n", + " <th>0</th>\n", + " <td>I'm in the mood for a high-rated action movie ...</td>\n", + " <td>[(Explosive Pursuit, -11.244140625), (Mad Max:...</td>\n", + " <td>[(The Incredibles, 0.016029143897996357), (Mad...</td>\n", + " <td>[(The Incredibles, 0.552392209158), (Despicabl...</td>\n", + " </tr>\n",
+ " <tr>\n", + " <th>1</th>\n", + " <td>What's a funny animated film about unlikely fr...</td>\n", + " <td>[(Despicable Me, -10.441911697387695), (The In...</td>\n", + " <td>[(Black Widow, 0.015625), (The Incredibles, 0....</td>\n", + " <td>[(The Incredibles, 0.454752022028), (Despicabl...</td>\n", + " </tr>\n",
+ " <tr>\n", + " <th>2</th>\n", + " <td>Any movies featuring superheroes or extraordin...</td>\n", + " <td>[(The Incredibles, -3.6648106575012207), (The ...</td>\n", + " <td>[(The Incredibles, 0.01639344262295082), (Mad ...</td>\n", + " <td>[(The Incredibles, 0.603234936448), (The Aveng...</td>\n", + " </tr>\n",
+ " <tr>\n", + " <th>3</th>\n", + " <td>I want to watch a thrilling movie with spies o...</td>\n", + " <td>[(The Incredibles, -10.843631744384766), (Expl...</td>\n", + " <td>[(Skyfall, 0.01631411951348493), (Explosive Pu...</td>\n", + " <td>[(Skyfall, 0.44384047389), (Despicable Me, 0.4...</td>\n", + " </tr>\n",
+ " <tr>\n", + " <th>4</th>\n", + " <td>Are there any comedies set in unusual location...</td>\n", + " <td>[(The Incredibles, -11.45376968383789), (Explo...</td>\n", + " <td>[(Madagascar, 0.015272878190495952), (Explosiv...</td>\n", + " <td>[(Madagascar, 0.442132177949), (Despicable Me,...</td>\n", + " </tr>\n",
+ " </tbody>\n", + "</table>\n", + "</div>
" + ], + "text/plain": [ + " queries \\\n", + "0 I'm in the mood for a high-rated action movie ... \n", + "1 What's a funny animated film about unlikely fr... \n", + "2 Any movies featuring superheroes or extraordin... \n", + "3 I want to watch a thrilling movie with spies o... \n", + "4 Are there any comedies set in unusual location... \n", + "\n", + " hf-cross-encoder \\\n", + "0 [(Explosive Pursuit, -11.244140625), (Mad Max:... \n", + "1 [(Despicable Me, -10.441911697387695), (The In... \n", + "2 [(The Incredibles, -3.6648106575012207), (The ... \n", + "3 [(The Incredibles, -10.843631744384766), (Expl... \n", + "4 [(The Incredibles, -11.45376968383789), (Explo... \n", + "\n", + " rrf \\\n", + "0 [(The Incredibles, 0.016029143897996357), (Mad... \n", + "1 [(Black Widow, 0.015625), (The Incredibles, 0.... \n", + "2 [(The Incredibles, 0.01639344262295082), (Mad ... \n", + "3 [(Skyfall, 0.01631411951348493), (Explosive Pu... \n", + "4 [(Madagascar, 0.015272878190495952), (Explosiv... \n", + "\n", + " linear-combo-bm25-cosine \n", + "0 [(The Incredibles, 0.552392209158), (Despicabl... \n", + "1 [(The Incredibles, 0.454752022028), (Despicabl... \n", + "2 [(The Incredibles, 0.603234936448), (The Aveng... \n", + "3 [(Skyfall, 0.44384047389), (Despicable Me, 0.4... \n", + "4 [(Madagascar, 0.442132177949), (Despicable Me,... 
" + ] + }, + "execution_count": 27, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "rankings.head()" + ] + }, + { + "cell_type": "code", + "execution_count": 28, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array(['Show me movies set in dystopian or post-apocalyptic worlds',\n", + " list([('Mad Max: Fury Road', -3.490626335144043), ('Despicable Me', -11.051526069641113), ('The Incredibles', -11.315656661987305), ('Black Widow', -10.880638122558594)]),\n", + " list([('Mad Max: Fury Road', 0.01602086438152012), ('Skyfall', 0.015607940446650124), ('The Incredibles', 0.015237691001697792), ('Black Widow', 0.01513526119402985)]),\n", + " list([('Mad Max: Fury Road', '0.452238571644'), ('The Incredibles', '0.445061546564'), ('Madagascar', '0.41901564002'), ('Despicable Me', '0.416218408942')])],\n", + " dtype=object)" + ] + }, + "execution_count": 28, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "rankings.loc[12].values" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Wrap up\n", + "That's a wrap! Hopefully from this you were able to learn:\n", + "- How to implement simple vector search queries in Redis\n", + "- How to implement vector search queries with full-text filters\n", + "- How to implement hybrid search queries using the Redis aggregation API\n", + "- How to perform client-side fusion and reranking techniques" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.9" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +}