Interactivity Overhaul (User Interface & Model Instrumentation & Network Comms) (#1054)

# Interactivity Overhaul
> What you want, when you want.
>
> -- <cite>some guidance developer (circa 2024)</cite>

![Screenshare of updated UI in Jupyter
notebook](https://github.com/user-attachments/assets/d65bafd4-5c9e-41c2-b5a2-4043e7cbfaae)

## Overview

This PR is the first of many focusing on interactivity. It introduces an
updated user interface for notebooks, new instrumentation for models,
and a corresponding network layer to handle bidirectional communication
between the IPython kernel and the JavaScript client. To support this,
models have reworked rendering and added tracing logic to better support
replays where required.

This PR also functions as a foundational step towards near-future work,
including rendering across various environments (e.g. terminal support
as TUI and append-only outputs), upgraded benchmarking and model
inspection.

### TL;DR

We added a lot of code to support better model metrics and
visualization. We are getting ready for multimedia streaming, and want
users to be able to deeply inspect models without overheating the
computer.

### Acknowledgements

Big shoutouts to: 
- Loc (co-developed this PR): model instrumentation & metrics.
- Jingya: consult & sketches on enhanced UI design.
- Harsha: overall feedback & collab on prototypes.

### Running this PR
- `cd packages/python/stitch && pip install -e .`
- Go run a notebook.

## User Interface

Design principle: **All visibility. No magic.**

Overall we're trying to show as much as we can on model outputs. When
debugging outputs, there can be real ugliness that is often hidden away,
including tokenization concerns and critical points that may dictate the
rest of the output. This need for inspection increases as users begin to
define their own structured decoding grammars, where unexpected
overconstraints can occur during development.

The old user interface, which displays HTML as a side effect in
notebooks when models compute, has been replaced with a custom Jupyter
widget (see Network Communications for more detail) that hosts an
interactive sandboxed iframe. We still support a legacy mode if users
prefer the previous UI.

**Before**
<img width="651" alt="image"
src="https://github.com/user-attachments/assets/89b91c60-e428-43bb-ab41-d7ab34c65483">

**After**
<img width="353" alt="image"
src="https://github.com/user-attachments/assets/49964f2f-aef2-4faf-b631-5b1898677dd3">

We're getting more information into the output at the expense of text
density. There is simply more going on, and to keep it legible we've
increased text size and spacing, compensating for the two visual
elements (highlighting and underlines) that are used to convey token
info for scanning. A general metrics bar is also displayed for
discoverability of token reduction and other efficiency metrics relevant
when prompt engineering for reduced costs.

When users want further detail on tokens, we support a tooltip that
contains the top 5 alternate token candidates alongside exact values for
the visual elements. Highlighting has been applied to candidates,
accentuating tokens that include spaces.

We use a monospace typeface so that data format outputs can be
inspected more quickly (e.g. vertical alignment can matter for balancing
braces and indentation).

As users learn a system, a UI built for easier discoverability can come
at the cost of productivity. We've made all visual components optional
to keep our power users in the flow, and in the future we intend to
allow users to define defaults to fully support this.

For legacy mode (modeled after the previous UI), users can execute
`guidance.legacy_mode(True)` at the start of their notebook.

![image](https://github.com/user-attachments/assets/df48d9cf-8f2f-4aac-9791-fc9d407359f1)
*Old school cool*.

### The Code

- Added
  - `guidance.visual` module. Handles renderer creation (stitch or HTML
    display) and all required messaging. This also handles Jupyter cell
    change detection for deciding when widgets need to be instantiated or
    reset.
  - `guidance.trace` module. Tracks the model inputs & outputs of an
    engine. Important for replaying to clients.
  - `graphpaper-inline` NPM package. This handles all client-side
    rendering and messaging. Written with Svelte/TypeScript/Tailwind/D3.

- Changed
  - Rendering logic has been stripped from the `Model` class and
    delegated to its `Renderer` member where possible.
  - Relevant state logic has been augmented for inputs & outputs, and is
    stored within the engine for tracing across models.
  - Role processing across guidance has been thinned. The `Model` class
    now generates role opener and closer text directly from its
    respective chat template.

## Instrumentation

Instrumentation is key for model inspection, debugging and
cost-sensitive prompt engineering. This includes backing the new UI.
Metrics are now collected for both general compute resources
(CPU/GPU/RAM) and model tokens (including token counts/reduction,
latency, type, backtracking).
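
As a rough illustration of collecting compute metrics without competing with the model for resources, the sketch below samples from a separate process and hands the samples back when stopped. This is not the `Monitor` class from this PR: `MetricsMonitor`, `_sample_metrics` and the use of the 1-minute load average as a stand-in for CPU/GPU/RAM readings are all assumptions for illustration, and the `fork` start method makes it Unix-only.

```python
import multiprocessing as mp
import os
import time


def _sample_metrics() -> dict:
    """Hypothetical sampler: the 1-minute load average stands in for CPU/GPU/RAM."""
    try:
        load_1m = os.getloadavg()[0]
    except OSError:
        load_1m = None  # load average is not obtainable on this system
    return {"time": time.time(), "load_1m": load_1m}


def _monitor_loop(stop_event, conn, interval: float) -> None:
    """Runs in the child process: sample until asked to stop, then send all samples."""
    samples = []
    while True:
        samples.append(_sample_metrics())
        # wait() returns True once the parent sets the stop event.
        if stop_event.wait(interval):
            break
    conn.send(samples)
    conn.close()


class MetricsMonitor:
    """Illustrative monitor (not the PR's Monitor) that samples in its own process."""

    def __init__(self, interval: float = 0.05) -> None:
        ctx = mp.get_context("fork")  # assumption: a Unix-like platform
        self._stop = ctx.Event()
        # One-way pipe: the parent receives, the child sends.
        self._recv_conn, send_conn = ctx.Pipe(duplex=False)
        self._proc = ctx.Process(
            target=_monitor_loop,
            args=(self._stop, send_conn, interval),
            daemon=True,
        )

    def start(self) -> None:
        self._proc.start()

    def stop(self) -> list:
        self._stop.set()
        samples = self._recv_conn.recv()  # read before join so the child never blocks
        self._proc.join()
        return samples
```

A caller would wrap model execution with `monitor.start()` and `samples = monitor.stop()`, then aggregate the samples for display.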

### The Code

* Added (metric collection feature)
  * Add `Monitor` class in `_model.py` to collect common metrics (CPU, RAM,
    GPU utilization, etc.)
    * `Monitor` runs in a separate process to avoid competing for
      resources with the model/engine process
  * `Model` now keeps stats of current input/output/backtrack tokens
  * At the end of a notebook cell's execution, we collect the probability
    of each token in the final model state, along with associated
    per-token stats such as
    * Latency
    * Whether the token was generated, force-forwarded or from user input

* Changed:
  * Replaced `get_next_token` with `get_next_token_with_top_k` to keep
    track of the issued token along with its associated top-k tokens (both
    constrained and unconstrained). Data is stored in the `EngineOutput`
    class.
  * `Model` now has a `VisBytesChunk` object to keep track of which part
    of the chunk is from user input, generated by the engine or
    force-forwarded by the parser. `VisBytesChunk` also stores the list of
    `EngineOutput` objects generated by the engine during chunk
    generation. This makes it easier to check whether tokens in the final
    state were generated, force-forwarded or from user input.
  * Add `get_per_token_topk_probs` function in `Engine` class to calculate
    the probability of each token in the token list. This function is used
    at the end of cell execution to calculate the probabilities of the
    model state in unconstrained mode.
  * Add `get_per_token_stats` function in `Model` class to report stats
    for each token in the model state in unconstrained mode. Stats include
    issued token, probability, latency, top-k, and masked-top-k if
    available. Data from `get_per_token_stats` is reported to the UI for
    the new visualization.
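
To make the idea of tracking top-k candidates in both constrained and unconstrained form concrete, here is a minimal sketch over a toy vocabulary. It is not the engine's implementation: `TokenCandidate`, `softmax` and `top_k_with_mask` are hypothetical names, and the choice to report constrained candidates with their unconstrained probabilities (no renormalization) is an assumption.

```python
import math
from dataclasses import dataclass


@dataclass
class TokenCandidate:
    token: str
    prob: float
    is_masked: bool  # True if the grammar would disallow this token


def softmax(logits: dict) -> dict:
    """Numerically stable softmax over a token -> logit mapping."""
    peak = max(logits.values())
    exps = {tok: math.exp(val - peak) for tok, val in logits.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


def top_k_with_mask(logits: dict, allowed: set, k: int = 5):
    """Return (unconstrained top-k, constrained top-k) candidate lists.

    The unconstrained list ranks every token and flags those the grammar
    masks out; the constrained list ranks only tokens in `allowed`.
    """
    probs = softmax(logits)
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    unconstrained = [
        TokenCandidate(tok, p, tok not in allowed) for tok, p in ranked[:k]
    ]
    constrained = [
        TokenCandidate(tok, p, False) for tok, p in ranked if tok in allowed
    ][:k]
    return unconstrained, constrained
```

For example, with logits `{"a": 2.0, "b": 1.0, "c": 0.0}` and `allowed={"b", "c"}`, the unconstrained list leads with `"a"` (flagged as masked) while the constrained list leads with `"b"`, which is the kind of gap a tooltip can surface when a grammar overconstrains the output.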

## Network Communications

We have two emerging requirements that will impact future guidance
development. One, streaming multimedia around language models
(audio/video). Two, user interactivity within the UI, requesting more
data or computation that may not be feasible to pre-fetch or
pre-calculate for a static client.

For user interactivity from UI to Python, it's also important that we
cover as many notebook environments as possible. Each cloud notebook
provider has its own quirks, which complicates client development.
Some providers love resizing cell outputs indefinitely, others refuse to
display HTML unless it's secured away in an isolated iframe.

All in all, we need a solution that is isolated, somewhat available
across providers and can allow streams of messages between server
(Jupyter Python kernel) and client (cell output with a touch of JS).

### Stitch
> It's 3:15AM, bi-directional comms was a mistake.
>
> -- <cite>some guidance developer, minutes prior to passing out (circa
2024)</cite>

`stitch` is an auxiliary package we've created that handles
bi-directional communication between a web client and a Jupyter Python
kernel. It does this by creating a thin custom Jupyter widget that
handles messages between the kernel and a sandboxed iframe hosting the
web client. It looks something like this:

`python code` -> `kernel-side jupyter widget` -> `kernel comms (ZMQ)` ->
`client-side jupyter widget` -> `window event message` -> `sandboxed
iframe` -> `web client (graphpaper-inline)`

This package drives messages between the `guidance.visual` module and
the `graphpaper-inline` client. All messages are streamed to allow
near-real-time rendering within a notebook. Bi-directional comms are
used to repair the display if initial messages have been missed (the
client will request a full replay when it notices the first message it
receives has a non-zero identifier).
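
The replay rule described above can be sketched in a few lines. This is a simplified in-memory model, not the actual `stitch` or `graphpaper-inline` code: `Message`, `KernelSide` and `ClientSide` are hypothetical names, and the assumption is that message identifiers start at zero and increase by one per message.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    message_id: int
    payload: str


@dataclass
class KernelSide:
    """Hypothetical kernel-side sender that keeps a trace for replays."""
    history: list = field(default_factory=list)

    def emit(self, payload: str) -> Message:
        msg = Message(message_id=len(self.history), payload=payload)
        self.history.append(msg)
        return msg

    def replay(self) -> list:
        return list(self.history)


@dataclass
class ClientSide:
    """Hypothetical client that repairs its display if it joined late."""
    rendered: list = field(default_factory=list)
    seen_first: bool = False

    def receive(self, msg: Message, request_replay) -> None:
        if not self.seen_first:
            self.seen_first = True
            if msg.message_id != 0:
                # Earlier messages were missed: rebuild from a full replay,
                # which already includes the message just received.
                self.rendered = [m.payload for m in request_replay()]
                return
        self.rendered.append(msg.payload)
```

If the kernel emits `"a"` and `"b"` before the client is listening and the client then receives `"c"` (identifier 2), it requests a replay and ends up rendering `["a", "b", "c"]`; later messages are appended as usual.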

### The Code

- Added
  - `stitch` Python package. Can be found at `packages/python/stitch`.

## Future work

We wanted to shoot for the stars, and ended up in the ocean. The
following will occur after this PR.

Near future tasks:
- User defaults for UI
- Terminal support (non-interactive & shell)
- Restyle
- Richer visualizations
- Memory re-architecture (broader than this PR)
- Interactive support for multimedia
- Guidance quality-of-life (visual diff testing)

---------

Signed-off-by: Loc Huynh <[email protected]>
Signed-off-by: JC1DA <[email protected]>
Co-authored-by: Loc Huynh <[email protected]>
Co-authored-by: Loc Huynh <[email protected]>
Co-authored-by: Hudson Cooper <[email protected]>
4 people authored Dec 21, 2024
1 parent 326bc1c commit 727e832
Showing 111 changed files with 22,140 additions and 373 deletions.
2 changes: 1 addition & 1 deletion .gitignore
@@ -2,13 +2,13 @@ notebooks/local_scratch
__pycache__/
.vscode
.vs
.idea/
/build
/dist
*.egg-info
*.diskcache
.ipynb_checkpoints
node_modules
/client
.eggs/
.env
.DS_Store
1 change: 1 addition & 0 deletions MANIFEST.in
@@ -0,0 +1 @@
include resources/graphpaper-inline.html
3 changes: 3 additions & 0 deletions client/graphpaper-inline/.gitignore
@@ -0,0 +1,3 @@
node_modules/
build/
.DS_Store
3 changes: 3 additions & 0 deletions client/graphpaper-inline/TODO.txt
@@ -0,0 +1,3 @@
- Remove CDN font links (googlefonts)
- Image integration
- Testing
5 changes: 5 additions & 0 deletions client/graphpaper-inline/build-to-guidance.sh
@@ -0,0 +1,5 @@
#!/bin/bash
set -x

npm run build
cp dist/index.html ../../guidance/resources/graphpaper-inline.html
2 changes: 2 additions & 0 deletions client/graphpaper-inline/dist/.gitignore
@@ -0,0 +1,2 @@
*
!.gitignore
40 changes: 40 additions & 0 deletions client/graphpaper-inline/package.json
@@ -0,0 +1,40 @@
{
  "name": "graphpaper",
  "version": "0.0.1",
  "scripts": {
    "build": "rollup -c",
    "dev": "rollup -c -w",
    "start": "sirv dist"
  },
  "devDependencies": {
    "@rollup/plugin-commonjs": "^26.0.1",
    "@rollup/plugin-node-resolve": "^15.2.3",
    "@rollup/plugin-terser": "^0.4.4",
    "@rollup/plugin-typescript": "^11.1.6",
    "@types/d3-scale": "^4.0.8",
    "@types/d3-scale-chromatic": "^3.0.3",
    "@types/dompurify": "^3.0.5",
    "autoprefixer": "^10.4.20",
    "cssnano": "^7.0.5",
    "postcss": "^8.4.41",
    "rollup": "^4.21.0",
    "rollup-plugin-copy": "^3.5.0",
    "rollup-plugin-html-bundle": "^0.0.3",
    "rollup-plugin-livereload": "^2.0.5",
    "rollup-plugin-postcss": "^4.0.2",
    "rollup-plugin-serve": "^1.1.1",
    "rollup-plugin-svelte": "^7.2.2",
    "sirv-cli": "^2.0.2",
    "svelte": "^4.2.18",
    "svelte-preprocess": "^6.0.2",
    "tailwindcss": "^3.4.10",
    "tslib": "^2.6.3",
    "typescript": "^5.5.4"
  },
  "dependencies": {
    "d3-interpolate": "^3.0.1",
    "d3-scale": "^4.0.2",
    "d3-scale-chromatic": "^3.1.0",
    "dompurify": "^3.1.7"
  }
}