minor fixes #291

Merged: 2 commits, Jul 28, 2024
@@ -41,13 +41,13 @@
" 2. We use the instruction-finetuned LLM to generate multiple responses and have LLMs rank them based on given preference criteria\n",
" 3. We use an LLM to generate preferred and dispreferred responses given certain preference criteria\n",
"- In this notebook, we consider approach 3\n",
"- This notebook uses a 70 billion parameter Llama 3.1-Instruct model through ollama to generate preference labels for an instruction dataset\n",
"- This notebook uses a 70 billion parameters Llama 3.1-Instruct model through ollama to generate preference labels for an instruction dataset\n",
"- The expected format of the instruction dataset is as follows:\n",
"\n",
"\n",
"### Input\n",
"\n",
"```python\n",
"```json\n",
"[\n",
" {\n",
" \"instruction\": \"What is the state capital of California?\",\n",
@@ -71,7 +71,7 @@
"\n",
"The output dataset will look as follows, where more polite responses are preferred (`'chosen'`), and more impolite responses are dispreferred (`'rejected'`):\n",
"\n",
"```python\n",
"```json\n",
"[\n",
" {\n",
" \"instruction\": \"What is the state capital of California?\",\n",
@@ -98,7 +98,7 @@
"]\n",
"```\n",
"\n",
"### Ouput\n",
"### Output\n",
"\n",
"\n",
"\n",
@@ -135,7 +135,7 @@
"id": "8bcdcb34-ac75-4f4f-9505-3ce0666c42d5",
"metadata": {},
"source": [
"## Installing Ollama and Downloading Llama 3"
"## Installing Ollama and Downloading Llama 3.1"
]
},
{
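For context: once Ollama is installed and the Llama 3.1 model has been downloaded, the notebook's queries go through the local Ollama server. A minimal sketch of such a call is shown below; the endpoint URL, the `llama3.1:70b` model tag, the option values, and the `query_model` helper name are illustrative assumptions, not the notebook's exact code.

```python
import json
import urllib.request


def query_model(prompt, model="llama3.1:70b", url="http://localhost:11434/api/chat"):
    # Build a chat-style request for the local Ollama server
    # (assumes `ollama serve` is running and the model has already been pulled).
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # return one JSON object instead of a token stream
        "options": {"temperature": 0.0, "seed": 123},  # keep outputs reproducible
    }
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        result = json.loads(response.read().decode("utf-8"))
    return result["message"]["content"]


print(query_model("What do llamas eat?"))
```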
@@ -353,7 +353,7 @@
"source": [
"from pathlib import Path\n",
"\n",
"json_file = Path(\"..\") / \"01_main-chapter-code\" / \"instruction-data.json\"\n",
"json_file = Path(\"..\", \"01_main-chapter-code\", \"instruction-data.json\")\n",
"\n",
"with open(json_file, \"r\") as file:\n",
" json_data = json.load(file)\n",
@@ -498,7 +498,7 @@
"metadata": {},
"source": [
"- If we find that the generated responses above look reasonable, we can go to the next step and apply the prompt to the whole dataset\n",
"- Here, we add a `'chosen`' key for the preferred response and a `'rejected'` response for the dispreferred response"
"- Here, we add a `'chosen'` key for the preferred response and a `'rejected'` response for the dispreferred response"
]
},
{
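The loop that applies this to the whole dataset is not part of this excerpt, but based on the description above it could look roughly like the sketch below. It reuses the hypothetical `query_model` helper sketched earlier; the prompt wording, the per-entry random choice between a polite and an impolite rewrite, and the `'output'` field name are assumptions for illustration.

```python
import random

from tqdm import tqdm


def generate_preference_pairs(json_data, model="llama3.1:70b"):
    # For each entry, ask the model for a more polite (or more impolite) rewrite
    # of the original output, then store the pair under 'chosen'/'rejected'.
    dataset = []
    for entry in tqdm(json_data, desc="Generating preference pairs"):
        politeness = random.choice(["polite", "impolite"])
        prompt = (
            f"Given the input `{entry['instruction']}` "
            f"and correct output `{entry['output']}`, "
            f"slightly rewrite the output to be more {politeness}. "
            "Keep the content otherwise unchanged. "
            "Only return the rewritten response and nothing else."
        )
        rewritten = query_model(prompt, model=model)  # local Ollama call (sketched earlier)
        new_entry = dict(entry)
        if politeness == "polite":
            new_entry["chosen"] = rewritten          # polite rewrite is preferred
            new_entry["rejected"] = entry["output"]  # original response is dispreferred
        else:
            new_entry["chosen"] = entry["output"]    # original response is preferred
            new_entry["rejected"] = rewritten        # impolite rewrite is dispreferred
        dataset.append(new_entry)
    return dataset
```

Randomizing whether the polite or the impolite variant is generated for each entry keeps the dataset balanced, so the preferred answer is not always the model-rewritten one.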