Commit

updated
Signed-off-by: Francisco Javier Arceo <[email protected]>
franciscojavierarceo committed May 11, 2024
1 parent 7422a9c commit 839e07b
Showing 2 changed files with 42 additions and 11 deletions.
40 changes: 40 additions & 0 deletions module_4_rag/README.md
@@ -52,3 +52,43 @@ flowchart TD;
C[Materialize Online] --> D[Retrieval Augmented Generation];
```

# Results

The demo below embeds a question, retrieves the three closest documents from the online store, and shows the retrieved data.

```python
import pandas as pd

from feast import FeatureStore
from batch_score_documents import run_model, TOKENIZER, MODEL
from transformers import AutoTokenizer, AutoModel

df = pd.read_parquet("./feature_repo/data/city_wikipedia_summaries_with_embeddings.parquet")

store = FeatureStore(repo_path=".")

# Prepare a query vector
question = "the most populous city in the U.S. state of Texas?"

tokenizer = AutoTokenizer.from_pretrained(TOKENIZER)
model = AutoModel.from_pretrained(MODEL)
query_embedding = run_model(question, tokenizer, model)
query = query_embedding.detach().cpu().numpy().tolist()[0]

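# Retrieve the top-k documents closest to the query embedding
features = store.retrieve_online_documents(
    feature="city_embeddings:Embeddings",
    query=query,
    top_k=3
)

# Collect the response into a DataFrame for display
# (assumes the response object exposes to_dict(), as Feast's OnlineResponse does)
features_df = pd.DataFrame.from_dict(features.to_dict())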
```
Displaying the retrieved documents as a DataFrame (`features_df`) shows:

```
$features_df
                                          Embeddings  distance
0  [0.11749928444623947, -0.04684492573142052, 0....  0.935567
1  [0.10329511761665344, -0.07897591590881348, 0....  0.939936
2  [0.11634305864572525, -0.10321836173534393, -0...  0.983343
```
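
To make the `distance` column easier to interpret, here is a minimal sketch of top-k retrieval over a handful of toy vectors. It assumes Euclidean (L2) distance, which may not be the metric the online store actually uses, and the helper names `l2_distance` and `top_k_documents` as well as the toy vectors are hypothetical and not part of Feast or this repository.

```python
import math
from typing import List, Sequence, Tuple


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k_documents(
    query: Sequence[float],
    documents: Sequence[Sequence[float]],
    k: int,
) -> List[Tuple[int, float]]:
    """Return (index, distance) pairs for the k documents closest to the query."""
    scored = [(i, l2_distance(query, doc)) for i, doc in enumerate(documents)]
    # Smaller distance means a closer match, so sort ascending.
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]


# Toy 3-dimensional embeddings (hypothetical values, not taken from the dataset).
toy_documents = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 0.0, 1.0],
]
toy_query = [1.0, 0.0, 0.0]

nearest = top_k_documents(toy_query, toy_documents, k=3)
for index, distance in nearest:
    print(index, round(distance, 6))
```

Under this assumption, rows are ordered from the closest match to the farthest, which is consistent with the ascending `distance` values in the output above.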
13 changes: 2 additions & 11 deletions module_4_rag/module_4.ipynb
@@ -38,8 +38,9 @@
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import pandas as pd\n",
"import warnings\n",
"from feast import FeatureStore\n",
"\n",
"from batch_score_documents import run_model, TOKENIZER, MODEL\n",
"from transformers import AutoTokenizer, AutoModel"
@@ -164,15 +165,6 @@
"df.head()"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"import os"
]
},
{
"cell_type": "code",
"execution_count": 5,
@@ -380,7 +372,6 @@
}
],
"source": [
"from feast import FeatureStore\n",
"store = FeatureStore(repo_path=\".\")"
]
},
