Add main example (chat UI) (#99)

* bootstrap main example + mockup UI
* first working version
* more features
* add CI build step
* add more models
* run prettier
* more models
* fix build
* test ci
* ok
ngxson · Aug 1, 2024 · b0f6d66
1 parent 83a2b4e

Showing 49 changed files with 6,734 additions and 417 deletions.
diff --git a/.github/workflows/generate-docs.yml b/.github/workflows/generate-docs.yml
@@ -1,4 +1,4 @@
-name: Deploy docs to GitHub Pages
+name: Deploy docs and demo to GitHub Pages
 
 on:
   # Runs on pushes targeting the default branch
@@ -44,6 +44,14 @@ jobs:
           rm .gitignore
           rm -rf node_modules
 
+      - name: Build main example
+        working-directory: ./examples/main
+        run: |
+          npm i
+          npm run build
+          rm .gitignore
+          rm -rf node_modules
+
       - name: Upload artifact
         uses: actions/upload-pages-artifact@v3
         with:

diff --git a/.prettierignore b/.prettierignore
@@ -0,0 +1,32 @@
+**/.vscode
+**/.github
+**/.git
+**/.svn
+**/.hg
+**/node_modules
+**/dist
+**/docs
+
+/llama.cpp
+
+/examples/advanced
+/examples/basic
+/examples/embeddings
+
+/scripts
+/esm
+/models
+/build
+
+/src/multi-thread
+/src/single-thread
+
+*.md
+*.mdx
+*.json
+*.lock
+*.yml
+*.cpp
+*.hpp
+
+*.config.js
diff --git a/README.md b/README.md
@@ -2,7 +2,9 @@
 
 ![](./README_banner.png)
 
-Another WebAssembly binding for [llama.cpp](https://github.com/ggerganov/llama.cpp). Inspired by [tangledgroup/llama-cpp-wasm](https://github.com/tangledgroup/llama-cpp-wasm), but unlike it, **Wllama** aims to supports **low-level API** like (de)tokenization, embeddings,...
+WebAssembly binding for [llama.cpp](https://github.com/ggerganov/llama.cpp)
+
+👉 [Try the demo app](https://github.ngxson.com/wllama/examples/main/dist/)
 
 ## Recent changes
 
@@ -38,6 +40,10 @@ Limitations:
 
 ## Demo and documentations
 
+**Main demo**: https://github.ngxson.com/wllama/examples/main/dist/
+
+![](./assets/screenshot_0.png)
+
 **Documentation:** https://github.ngxson.com/wllama/docs/
 
 Demo:
@@ -114,11 +120,11 @@ Cases where we want to split the model:
 - Due to [size restriction of ArrayBuffer](https://stackoverflow.com/questions/17823225/do-arraybuffers-have-a-maximum-length), the size limitation of a file is 2GB. If your model is bigger than 2GB, you can split the model into small files.
 - Even with a small model, splitting into chunks allows the browser to download multiple chunks in parallel, thus making the download process a bit faster.
 
-We use `gguf-split` to split a big gguf file into smaller files. You can download the pre-built binary via [llama.cpp release page](https://github.com/ggerganov/llama.cpp/releases):
+We use `llama-gguf-split` to split a big gguf file into smaller files. You can download the pre-built binary via [llama.cpp release page](https://github.com/ggerganov/llama.cpp/releases):
 
 ```bash
 # Split the model into chunks of 512 Megabytes
-./gguf-split --split-max-size 512M ./my_model.gguf ./my_model
+./llama-gguf-split --split-max-size 512M ./my_model.gguf ./my_model
 ```
 
 This will output files ending with `-00001-of-00003.gguf`, `-00002-of-00003.gguf`, and so on.
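
As a reviewer's aside on the naming scheme described in the README text above: the shard suffixes follow a `-NNNNN-of-NNNNN.gguf` pattern with five-digit, one-based indices. A minimal sketch of how a client might enumerate the expected file names is below. `shardNames` is a hypothetical helper written for illustration; it is not part of wllama or llama.cpp, and the total shard count is assumed to be known in advance.

```typescript
// Hypothetical helper (not a wllama API): lists the file names that a
// split into `total` shards is expected to produce for a given prefix.
function shardNames(prefix: string, total: number): string[] {
  if (!Number.isInteger(total) || total < 1) {
    throw new RangeError("total must be a positive integer");
  }
  // Indices are one-based and zero-padded to five digits.
  const pad = (n: number): string => String(n).padStart(5, "0");
  return Array.from(
    { length: total },
    (_, i) => `${prefix}-${pad(i + 1)}-of-${pad(total)}.gguf`
  );
}

// Usage: the three shards from the README example.
const names = shardNames("./my_model", 3);
console.log(names.join("\n"));
```

Building the list up front makes it straightforward to request all shards in parallel, which is the download benefit the README mentions.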

diff --git a/actions.hpp b/actions.hpp
@@ -255,6 +255,8 @@ json action_load(app_t &app, json &body)
       {"token_bos", llama_token_bos(app.model)},
       {"token_eos", llama_token_eos(app.model)},
       {"token_eot", llama_token_eot(app.model)},
+      {"add_bos_token", llama_add_bos_token(app.model) == 1},
+      {"add_eos_token", llama_add_eos_token(app.model) == 1},
       {"has_encoder", llama_model_has_encoder(app.model)},
       {"token_decoder_start", llama_model_decoder_start_token(app.model)},
   };
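
Reviewer's note on the two new fields: the `== 1` comparison collapses the native return value into a plain boolean, so the JavaScript side only ever sees `true` or `false`. A minimal sketch of how a caller might consume these flags follows. The `LoadedModelMeta` interface and `wrapTokens` function are hypothetical names used for illustration only; the field names mirror the JSON keys in this hunk, and the actual usage in the chat UI may differ.

```typescript
// Hypothetical shape mirroring a subset of the keys returned by action_load.
interface LoadedModelMeta {
  token_bos: number;
  token_eos: number;
  add_bos_token: boolean;
  add_eos_token: boolean;
}

// Hypothetical helper: adds BOS/EOS around a token sequence only when the
// model metadata asks for it, and avoids duplicating a token already present.
function wrapTokens(tokens: number[], meta: LoadedModelMeta): number[] {
  const out = [...tokens];
  if (meta.add_bos_token && out[0] !== meta.token_bos) {
    out.unshift(meta.token_bos);
  }
  if (meta.add_eos_token && out[out.length - 1] !== meta.token_eos) {
    out.push(meta.token_eos);
  }
  return out;
}

// Usage with made-up token ids: BOS is requested, EOS is not.
const meta: LoadedModelMeta = {
  token_bos: 1,
  token_eos: 2,
  add_bos_token: true,
  add_eos_token: false,
};
console.log(wrapTokens([10, 11, 12], meta));
```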

diff --git a/assets/screenshot_0.png b/assets/screenshot_0.png
diff --git a/examples/main/.eslintrc.cjs b/examples/main/.eslintrc.cjs
@@ -0,0 +1,18 @@
+module.exports = {
+  root: true,
+  env: { browser: true, es2020: true },
+  extends: [
+    'eslint:recommended',
+    'plugin:@typescript-eslint/recommended',
+    'plugin:react-hooks/recommended',
+  ],
+  ignorePatterns: ['dist', '.eslintrc.cjs'],
+  parser: '@typescript-eslint/parser',
+  plugins: ['react-refresh'],
+  rules: {
+    'react-refresh/only-export-components': [
+      'warn',
+      { allowConstantExport: true },
+    ],
+  },
+};
diff --git a/examples/main/.gitignore b/examples/main/.gitignore
@@ -0,0 +1,24 @@
+# Logs
+logs
+*.log
+npm-debug.log*
+yarn-debug.log*
+yarn-error.log*
+pnpm-debug.log*
+lerna-debug.log*
+
+node_modules
+dist
+dist-ssr
+*.local
+
+# Editor directories and files
+.vscode/*
+!.vscode/extensions.json
+.idea
+.DS_Store
+*.suo
+*.ntvs*
+*.njsproj
+*.sln
+*.sw?
diff --git a/examples/main/README.md b/examples/main/README.md
@@ -0,0 +1,8 @@
+# wllama main example
+
+TODO:
+- Chat auto scroll
+- Load local gguf
+- Add log screen
+- Switching theme
+- Warning limitations on mobile
diff --git a/examples/main/index.html b/examples/main/index.html
@@ -0,0 +1,13 @@
+<!doctype html>
+<html lang="en">
+  <head>
+    <meta charset="UTF-8" />
+    <link rel="icon" type="image/svg+xml" href="/wllama.png" />
+    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
+    <title>wllama</title>
+  </head>
+  <body>
+    <div id="root"></div>
+    <script type="module" src="/src/main.tsx"></script>
+  </body>
+</html>