Update url links in rig/rag notebooks (#7)
* update rag

* update rig

* update flash to pro
shifucun authored Sep 5, 2024
1 parent c1f0d7a commit 05f518f
Showing 2 changed files with 24 additions and 32 deletions.
37 changes: 16 additions & 21 deletions notebooks/data_gemma_rag.ipynb
@@ -7235,9 +7235,9 @@
 "\n",
 "* **Hugging Face Token**. To obtain the token, login to your Hugging Face account [token settings](https://huggingface.co/settings/tokens) to create a new token. Copy this token and store it on the colab notebook `Secrets` section with Name `HF_TOKEN`.\n",
 "\n",
-"* **Data Commons API Key**. Register for an API key from Data Commons [API key portal](https://apikey.datacommons.org). Once you get the API key, store it on the colab notebook `Secrets` section with Name `DC_API_KEY`.\n",
+"* **Data Commons API Key**. Register for an API key from Data Commons [API key portal](https://apikeys.datacommons.org). Once you get the API key, store it on the colab notebook `Secrets` section with Name `DC_API_KEY`.\n",
 "\n",
-"* **Gemini 1.5 Flash API Key**. Register for an API key from [Google AI Studio](https://aistudio.google.com/app/apikey). Once you get the API key, store it on the colab notebook `Secrets` section with Name `GEMINI_API_KEY`\n",
+"* **Gemini 1.5 Pro API Key**. Register for an API key from [Google AI Studio](https://aistudio.google.com/app/apikey). Once you get the API key, store it on the colab notebook `Secrets` section with Name `GEMINI_API_KEY`\n",
 "\n",
 "\n",
 "\n",
@@ -7249,7 +7249,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 1,
+"execution_count": null,
 "metadata": {
 "id": "FpL9Rqb_PfxS",
 "collapsed": true,
@@ -7271,7 +7271,7 @@
 }
 ],
 "source": [
-"!pip install -q git+https://github.com/shifucun/dc-llm-tools@fix\n",
+"!pip install -q git+https://github.com/datacommonsorg/llm-tools\n",
 "!pip install -q bitsandbytes accelerate"
 ]
 },
@@ -7300,9 +7300,9 @@
 "DC_API_KEY = userdata.get('DC_API_KEY')\n",
 "dc = dg.DataCommons(api_key=DC_API_KEY)\n",
 "\n",
-"# Get Gemini 1.5 Flash mode\n",
+"# Get Gemini 1.5 Pro model\n",
 "GEMINI_API_KEY = userdata.get('GEMINI_API_KEY')\n",
-"gemini_model = dg.GoogleAIStudio(model='gemini-1.5-flash', api_keys=[GEMINI_API_KEY])\n",
+"gemini_model = dg.GoogleAIStudio(model='gemini-1.5-pro', api_keys=[GEMINI_API_KEY])\n",
 "\n",
 "\n",
 "# Get finetuned Gemma2 model from HuggingFace\n",
@@ -7567,7 +7567,7 @@
 "outputId": "df86556b-3283-4821-81e4-0b393ff2ae15",
 "collapsed": true
 },
-"execution_count": 2,
+"execution_count": null,
 "outputs": [
 {
 "output_type": "display_data",
@@ -7887,7 +7887,7 @@
 "id": "jZRYhyuGkIJg",
 "cellView": "form"
 },
-"execution_count": 3,
+"execution_count": null,
 "outputs": []
 },
 {
@@ -7916,11 +7916,7 @@
 "cell_type": "code",
 "source": [
 "print(f\"[QUERY]: {QUERY}\\n\")\n",
-"ans = dg.RAGFlow(llm_question=hfm,\n",
-" llm_answer=gemini_model,\n",
-" data_fetcher=dc,\n",
-" in_context=False).query(query=QUERY)\n",
-"\n",
+"ans = dg.RAGFlow(llm_question=hfm, llm_answer=gemini_model, data_fetcher=dc).query(query=QUERY)\n",
 "print(ans.answer())"
 ],
 "metadata": {
@@ -7930,7 +7926,7 @@
 "id": "6hawX4Eg1knC",
 "outputId": "646cd744-3634-487a-8e00-90e4037c8e17"
 },
-"execution_count": 4,
+"execution_count": null,
 "outputs": [
 {
 "output_type": "stream",
@@ -8134,14 +8130,13 @@
 "\n",
 "Here's how RAG works:\n",
 "\n",
-"User Query: A user submits a query to the LLM.\n",
-"Query Analysis & Data Commons Query Generation: The DataGemma model (based on the Gemma 2 (27B) model and fully fine-tuned for this RAG task) analyzes the user's query and generates a corresponding query (or queries) in natural language that can be understood by Data Commons' existing natural language interface.\n",
-"Data Retrieval from Data Commons: Data Commons is queried using this natural language query, and relevant data tables, source information, and links are retrieved.\n",
-"Augmented Prompt: The retrieved information is added to the original user query, creating an augmented prompt.\n",
-"Final Response Generation: A larger LLM (Gemini 1.5 Pro) uses this augmented prompt, including the retrieved data, to generate a comprehensive and grounded response.\n",
-"\n",
+"1. User Query: A user submits a query to the LLM.\n",
+"2. Query Analysis & Data Commons Query Generation: The DataGemma model (based on the Gemma 2 (27B) model and fully fine-tuned for this RAG task) analyzes the user's query and generates a corresponding query (or queries) in natural language that can be understood by Data Commons' existing natural language interface.\n",
+"3. Data Retrieval from Data Commons: Data Commons is queried using this natural language query, and relevant data tables, source information, and links are retrieved.\n",
+"4. Augmented Prompt: The retrieved information is added to the original user query, creating an augmented prompt.\n",
+"5. Final Response Generation: A larger LLM (Gemini 1.5 Pro) uses this augmented prompt, including the retrieved data, to generate a comprehensive and grounded response.\n",
 "\n",
-"In the above example, 14 questions being asked of Data Commons (eg \"What is the population of Sunnyvale?\") and corresponding data table are retrieved. The data in these table is used to compose the final response with coherent information and insight."
+"In the above example, 14 questions are asked of Data Commons (eg \"What is the population of Sunnyvale?\") and corresponding data tables are retrieved. The data in these table is used to compose the final response with coherent information and insight."
 ],
 "metadata": {
 "id": "waKfp9VumJn7"
19 changes: 8 additions & 11 deletions notebooks/data_gemma_rig.ipynb
@@ -7235,7 +7235,7 @@
 "\n",
 "* **Hugging Face Token**. To obtain the token, login to your Hugging Face account [token settings](https://huggingface.co/settings/tokens) to create a new token. Copy this token and store it on the colab notebook `Secrets` section with Name `HF_TOKEN`.\n",
 "\n",
-"* **Data Commons API Key**. Register for an API key from Data Commons [API key portal](https://apikey.datacommons.org). Once you get the API key, store it on the colab notebook `Secrets` section with Name `DC_API_KEY`.\n",
+"* **Data Commons API Key**. Register for an API key from Data Commons [API key portal](https://apikeys.datacommons.org). Once you get the API key, store it on the colab notebook `Secrets` section with Name `DC_API_KEY`.\n",
 "\n",
 "Then install the required libraries."
 ],
@@ -7245,7 +7245,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 1,
+"execution_count": null,
 "metadata": {
 "id": "FpL9Rqb_PfxS",
 "collapsed": true,
@@ -7267,7 +7267,7 @@
 }
 ],
 "source": [
-"!pip install -q git+https://github.com/shifucun/dc-llm-tools@fix\n",
+"!pip install -q git+https://github.com/datacommonsorg/llm-tools\n",
 "!pip install -q bitsandbytes accelerate"
 ]
 },
@@ -7559,7 +7559,7 @@
 "outputId": "08cd29ea-7c49-454f-e7ec-fddac64c3e69",
 "collapsed": true
 },
-"execution_count": 2,
+"execution_count": null,
 "outputs": [
 {
 "output_type": "display_data",
@@ -7879,7 +7879,7 @@
 "id": "jZRYhyuGkIJg",
 "cellView": "form"
 },
-"execution_count": 5,
+"execution_count": null,
 "outputs": []
 },
 {
@@ -7892,7 +7892,7 @@
 "cellView": "form",
 "id": "hB-DKl0BxlN7"
 },
-"execution_count": 3,
+"execution_count": null,
 "outputs": []
 },
 {
@@ -7908,10 +7908,7 @@
 "cell_type": "code",
 "source": [
 "print(f\"[QUERY]: {QUERY}\\n\")\n",
-"ans = dg.RIGFlow(llm=hfm,\n",
-" data_fetcher=dc,\n",
-" in_context=False).query(query=QUERY)\n",
-"\n",
+"ans = dg.RIGFlow(llm=hfm, data_fetcher=dc, in_context=False).query(query=QUERY)\n",
 "print(ans.answer())"
 ],
 "metadata": {
@@ -7921,7 +7918,7 @@
 "id": "6hawX4Eg1knC",
 "outputId": "84cb32be-f0bb-4428-a330-2403baf21c9c"
 },
-"execution_count": 6,
+"execution_count": null,
 "outputs": [
 {
 "output_type": "stream",
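
Note for readers of this commit: the five RAG steps listed in the `data_gemma_rag.ipynb` markdown cell can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the `dg.RAGFlow` implementation. `RagResult`, `rag_answer`, and the three `toy_*` callables are hypothetical names standing in for the fine-tuned Gemma model, the Data Commons fetcher, and Gemini, and the table text is a placeholder rather than real data.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class RagResult:
    questions: List[str]    # natural-language questions sent to the data source
    tables: Dict[str, str]  # retrieved table text, keyed by question
    prompt: str             # augmented prompt given to the answering model
    answer: str             # final generated response


def rag_answer(
    query: str,
    ask_questions: Callable[[str], List[str]],
    fetch_table: Callable[[str], Optional[str]],
    generate_answer: Callable[[str], str],
) -> RagResult:
    # Step 1: the user query arrives as `query`.
    # Step 2: a question model turns it into data-source questions.
    questions = ask_questions(query)

    # Step 3: fetch a table per question, skipping questions with no data.
    tables: Dict[str, str] = {}
    for question in questions:
        table = fetch_table(question)
        if table is not None:
            tables[question] = table

    # Step 4: prepend the retrieved tables to the original query.
    context = "\n\n".join(f"Q: {q}\n{t}" for q, t in tables.items())
    prompt = f"{context}\n\nUsing only the data above, answer: {query}"

    # Step 5: a second model answers from the augmented prompt.
    return RagResult(questions, tables, prompt, generate_answer(prompt))


# Hypothetical stand-ins so the sketch runs without any model or API key.
def toy_questions(query: str) -> List[str]:
    return ["What is the population of Sunnyvale?"]


def toy_fetch(question: str) -> Optional[str]:
    return "place | population\nSunnyvale | <value from Data Commons>"


def toy_answer(prompt: str) -> str:
    return f"(answer grounded in {prompt.count('Q: ')} table(s))"


result = rag_answer("Has Sunnyvale grown over time?",
                    toy_questions, toy_fetch, toy_answer)
print(result.answer)
```

Passing the three components as callables mirrors the shape of the notebook call in this diff, where `llm_question`, `data_fetcher`, and `llm_answer` are supplied separately, so each stage can be swapped or stubbed on its own.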
