Design proposal: Chat Completions API (rev. 1.1) #144

dlqqq opened this issue Jan 4, 2025 · 14 comments

@dlqqq
Member

dlqqq commented Jan 4, 2025

Description

This issue proposes a design for a new Chat Completions API. This API will allow consumer extensions to provide completions for the user's current input from the UI. In this context, a consumer extension is any frontend + server extension that intends to provide completions for substrings in the chat input.

Motivation

Suppose a user types / in the chat with Jupyter AI installed. Today, Jupyter Chat responds by showing a menu of chat completions:

[Screenshot: menu of chat completions shown after typing /]

The opening of this completions menu is triggered simply by typing /. However, the current implementation only allows a single "trigger character" (/). This means that @ commands in Jupyter AI cannot be autocompleted. Furthermore, the completer makes a network call every time a user types /.

This design aims to:

  1. Allow multiple completers to provide completions,
  2. Allow triggering patterns to be strictly defined, and
  3. Generate completions in a way which minimizes network calls.

To help explain the proposed design, this document will start from the perspective of a consumer extension, then work backwards towards the necessary changes in Jupyter Chat.

Step 1: Define a new IChatCompleter interface

To register completions for partial inputs, a consumer extension must provide a set of chat completers. A chat completer is a JavaScript/TypeScript class which provides:

  • id (property): Defines a unique ID for this chat completer. We will see why this is useful later.

  • regex (property): Defines a regex which matches any incomplete input.

    • Each regex should end with $ to ensure this regex only matches partial inputs just typed by the user. Without $, the completer may generate completions for commands which were already typed.
  • async initialize(): Promise<void>: Called and awaited by Jupyter Chat.

  • async getCompletions(match: string): Promise<ChatCompletion[]>: Defines a method which accepts a substring matched by its regex, and returns a list of potential completions for that input. This list may be empty.

    • It would be helpful to think of this method as just returning a list of completions for the user's input. We return a list of objects to allow each completion to have metadata, such as the description & icon.

It's important to note that a consumer extension may provide more than 1 completer. This allows extensions to provide completions for different commands which aren't easily captured by a single regex. For example, Jupyter AI can have a completer for / commands and another completer for @ commands.

Jupyter Chat will define a new IChatCompleter interface which chat completers must implement, shown below.

import { LabIcon } from "@jupyterlab/ui-components";

type ChatCompletion = {
    // e.g. "/ask" if the input was `/`
    value: string;
    
    // if set, use this as the label. otherwise use `value`.
    label?: string;
    
    // if set, show this as a subtitle.
    description?: string;
    
    // identifies which icon should be used, if any.
    // Jupyter Chat should choose a default if one is not provided.
    icon?: LabIcon;
}

interface IChatCompleter {
    id: string;
    regex: string;
    // async methods are declared in an interface by their Promise return type
    initialize(): Promise<void>;
    getCompletions(match: string): Promise<ChatCompletion[]>;
}

The consumer extension will construct/instantiate the class itself before providing it to Jupyter Chat. Jupyter Chat will call await initialize() on each completer on init. The details of this will be discussed later.

To define a chat completer, a consumer extension should implement the IChatCompleter interface. Here is an example of how Jupyter AI may implement a chat completer to provide completions for its slash commands:

import { IChatCompleter, ChatCompletion } from "@jupyter/chat";
import { AiService } from "@jupyter-ai/core";

class SlashCommandCompleter implements IChatCompleter {
    public id: string = "jai-slash-commands";

    /**
     * Matches when:
     * - any partial slash command appears at start of input
     * - the partial slash command is immediately followed by end of input
     *
     * Examples:
     * - "/" => matched
     * - "/le" => matched
     * - "/learn" => matched
     * - "/learn " (note the space) => not matched
     * - "what does /help do?" => not matched
     */
    // regex source string (no enclosing slashes); backslashes must be
    // escaped inside a string literal
    public regex: string = "^/\\w*$";

    // used to cache list of slash commands
    private _slash_commands?: ChatCompletion[];

    async initialize(): Promise<void> {
        const commands: any[] = await AiService.listSlashCommands();
        // process list of slash commands into list of potential completions
        // cache this under this._slash_commands
        this._slash_commands = ...
    }

    async getCompletions(match: string): Promise<ChatCompletion[]> {
        // return completions by filtering the cached list
        // (no network call needed!)
        // yields an empty list if initialize() has not finished yet
        return (this._slash_commands ?? []).filter(
            cmd => cmd.value.startsWith(match)
        );
    }
}

Step 2: Create a new completers registry

For Jupyter Chat to be aware of completers in other extensions, the consumer extension must register each of its chat completers with a ChatCompletersRegistry object on init. This registry is a simple class which will provide the following methods:

  • add_completer(completer: IChatCompleter): void: adds a completer to its memory. A completer is said to be registered after this method is called on it.
  • get_completers(): IChatCompleter[]: returns a list of all registered completers.
  • init_completers(): void: calls await initialize() on all registered completers.
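
A minimal sketch of such a registry, for illustration only. It assumes the IChatCompleter shape from Step 1, redeclared locally (without the icon field) so the block is self-contained. Rejecting duplicate IDs, returning a Promise from init_completers(), and logging a failed initialize() instead of rethrowing it are all assumptions, not part of the proposal.

```typescript
// Simplified local stand-ins for the types from Step 1.
type ChatCompletion = { value: string; label?: string; description?: string };

interface IChatCompleter {
  id: string;
  regex: string;
  initialize(): Promise<void>;
  getCompletions(match: string): Promise<ChatCompletion[]>;
}

class ChatCompletersRegistry {
  // Keyed by completer ID so that each ID is registered at most once
  // (assumption: the proposal only says IDs are unique).
  private _completers = new Map<string, IChatCompleter>();

  add_completer(completer: IChatCompleter): void {
    if (this._completers.has(completer.id)) {
      throw new Error(`Completer "${completer.id}" is already registered.`);
    }
    this._completers.set(completer.id, completer);
  }

  get_completers(): IChatCompleter[] {
    return Array.from(this._completers.values());
  }

  // Initializes all registered completers concurrently. A rejected
  // initialize() is logged so the remaining completers still initialize.
  async init_completers(): Promise<void> {
    await Promise.all(
      this.get_completers().map(completer =>
        completer.initialize().catch(err => {
          console.warn(`Completer "${completer.id}" failed to initialize:`, err);
        })
      )
    );
  }
}
```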

To provide access to this ChatCompletersRegistry object, Jupyter Chat will define a plugin which provides an IChatCompletersRegistry token. When consumer extensions require this token in their frontend plugins, they receive a reference to the ChatCompletersRegistry singleton initialized by Jupyter Chat, allowing them to register their completers. This system of providing & consuming tokens to build modular applications is common to all of JupyterLab's frontend.

Jupyter Chat already defines an IAutocompletionRegistry using a similar approach, used by Jupyter AI to provide completion for / commands. Because an implementation reference is already available, we will not go into detail here. It is sufficient to know that at this point, we have a way of allowing consumer extensions to define multiple completers and provide them to Jupyter Chat for use.

Step 3: Integrate new chat completions API

From the example SlashCommandCompleter implementation in Step 1, we can piece together how the application should behave:

  1. On init, each consumer extension instantiates its completers and adds them to the ChatCompletersRegistry singleton, provided by Jupyter Chat.

  2. Jupyter Chat should call ChatCompletersRegistry.init_completers() in the background.

  3. Perform the following on input changes:

    • Take the substring ending in the user's cursor, and store this as a local variable, e.g. partial_input.

    • For each completer, test partial_input against the completer's regex. If a match m is found, call getCompletions(m). Store a reference to this Promise.

    • Add a callback to the Promise to append the new completions to the existing list of completions.

    • If a completion is accepted, replace the substring of the input matched by the completer's regex with the completion.

    • If a user ignores completions and continues typing, cancel all Promises and return to 3).
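
The input-change procedure above could be sketched roughly as follows. This is an illustration under stated assumptions, not the Jupyter Chat implementation: CompletionsController and its members are hypothetical names, the types from Step 1 are redeclared locally, and since a Promise cannot actually be cancelled, "cancelling" is approximated by ignoring results that belong to an older input change.

```typescript
// Simplified local stand-ins for the types from Step 1.
type ChatCompletion = { value: string; label?: string; description?: string };

interface IChatCompleter {
  id: string;
  regex: string;
  initialize(): Promise<void>;
  getCompletions(match: string): Promise<ChatCompletion[]>;
}

// Hypothetical controller that owns the list shown in the completions menu.
class CompletionsController {
  // Completions collected for the most recent input change.
  completions: ChatCompletion[] = [];
  // Incremented on every input change; used to detect stale results.
  private _generation = 0;
  private _completers: IChatCompleter[];

  constructor(completers: IChatCompleter[]) {
    this._completers = completers;
  }

  // Call on every input change. Resolves once all triggered completers settle.
  async onInputChange(input: string, cursor: number): Promise<void> {
    const generation = ++this._generation;
    this.completions = [];
    // Substring ending at the user's cursor.
    const partialInput = input.slice(0, cursor);

    const pending = this._completers.map(async completer => {
      const m = new RegExp(completer.regex).exec(partialInput);
      if (m === null) {
        return;
      }
      const results = await completer.getCompletions(m[0]);
      // Drop results if the user has typed again in the meantime.
      if (generation === this._generation) {
        this.completions.push(...results);
      }
    });
    await Promise.all(pending);
  }
}
```

Each completer's results are appended as soon as its own Promise resolves, so a slow completer does not delay the entries of a fast one.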

The frontend implementation may debounce how frequently it tests the input against each regex, as testing an input against multiple regexes may be expensive. However, I think it is important we test the performance as-is first before making an optimization, since debouncing any callback adds a fixed amount of latency (the debounce delay).

Conclusion

The IChatCompleter interface defined in Step 1 and the ChatCompletersRegistry defined in Step 2 give consumer extensions a way of defining and providing chat completers. This interface and registry together define the Chat Completions API. Step 3 of this document provides guidance on how to use the new chat completions API to provide better completions in Jupyter Chat.

Benefits & applications

  • Because completers live in the frontend, they may not need to make a network call when triggered by the input. Some completers may allow completions to be statically defined (e.g. emoji names) and others may only need to make a network call at init (e.g. slash commands).

  • Because completers live in the frontend, they can choose to use any API to communicate with the server. If a Python-only API is required, a custom server handler can be defined to provide the same capabilities to the completer.

  • Completers are uniquely identified by their id, so two completers can use the same regex but yield two different sets of completions.

    • Application: Another extension could use the same / command regex to provide completions for its own custom / commands.

    • Application: @ can trigger multiple completers; one may provide usernames of other users in the chat, and another may provide the @ commands available in Jupyter AI (e.g. @file).

  • A completion doesn't need to share a prefix with the substring that triggered completions.

    • Application: Define a completer that matches $ and returns the completion \\$. Pressing "Enter" to accept the completion allows a user to easily type a literal dollar sign instead of opening math mode. If typing math was the user's intention, typing any character other than "Enter" hides the \\$ completion and allows math to be written.
  • Regex allows the triggering of completions to be strictly controlled. This means that "complete-able" suffixes don't need some unique identifier like / or @.

    • Application: Define a completer that matches ./ following whitespace and returns filenames for the current directory. For example, this could trigger the completions ./README.md, ./pyproject.toml, etc.

    • Application: Define a completer that matches : following whitespace and returns a list of emojis.
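
As an illustration of two of the points above (statically defined completions, and completions that don't share a prefix with the triggering substring), here is a hedged sketch of an emoji completer. The class name, the ID, and the tiny emoji table are hypothetical, and the types from Step 1 are redeclared locally. The regex uses a lookbehind so that the matched substring excludes the preceding whitespace; this is one possible choice, not something the proposal prescribes.

```typescript
// Simplified local stand-ins for the types from Step 1.
type ChatCompletion = { value: string; label?: string; description?: string };

interface IChatCompleter {
  id: string;
  regex: string;
  initialize(): Promise<void>;
  getCompletions(match: string): Promise<ChatCompletion[]>;
}

// Hypothetical static table; a real completer would ship a full list.
const EMOJIS: Record<string, string> = {
  ":smile:": "😄",
  ":rocket:": "🚀",
  ":tada:": "🎉"
};

class EmojiCompleter implements IChatCompleter {
  id = "example-emoji";

  // Matches ":" plus any word characters, only at the start of the input
  // or after whitespace, and only when the match ends the tested string.
  regex = "(?<=^|\\s):\\w*$";

  // Nothing to fetch: the completions are statically defined.
  async initialize(): Promise<void> {}

  async getCompletions(match: string): Promise<ChatCompletion[]> {
    return Object.keys(EMOJIS)
      .filter(name => name.startsWith(match))
      // The inserted value is the emoji itself; the name is only the label.
      .map(name => ({ value: EMOJIS[name], label: name }));
  }
}
```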

Shortcomings & risks

  • This design doesn't provide a clear way for a completer to open a custom UI instead of adding another completion entry.

    • Risk: If we don't address this shortcoming and this design makes it into Jupyter Chat v1, then we would likely need a major release to implement this in the future.

    • From @mlucool in Design proposal: Chat Completions API (rev. 0) #143: "I think a file completer would want a different experience than an variable one. As an example, for the @var completer, we envisioned users could click on the variable and interact with it. For example, maybe it lets the user have a preview of what will be sent or maybe it lets the user specify some parameters (e.g. you want the verbose mode of a specific variable). While these are only half-formed ideas, it's good to not restrict."

    • I agree that this could bring a lot of user benefit. At the same time, I have to be mindful of the engineering effort to implement this, as some stakeholders would like Jupyter AI v3.0.0 released by March. @mlucool Let's briefly discuss whether this is something we should do once you're back on Monday.

If a major revision of this design is needed, I will close this issue, revise the design, and open a new issue with a bumped revision number.

@dlqqq dlqqq added the enhancement New feature or request label Jan 4, 2025
@brichet
Collaborator

brichet commented Jan 6, 2025

Thanks @dlqqq for this very complete proposal.
Indeed, it should be addressed before v1 to avoid needing a major release to include it.

👍 for defining it in the frontend. In addition to the list of benefits, I would add the compatibility with extensions running in Jupyterlite, without server extensions.

Some details of the current state of the completion

  • a registry already exists, but it must be improved to fit the requirement of this proposal.
  • a token is provided by jupyterlab-chat to access this registry.
  • the autocompletion only uses the 'current' AutoCompletion from the registry (the last one registered), and opens only if the input is an exact string, here

Comments on the proposal

  • I wonder if the ChatCompletion should allow a props field (as it is currently), to let the extension handle the options of the Autocomplete component from MUI (if we aim to keep this component in future). This is used in jupyter-ai to customise the rendering.
    The benefit of not allowing it is to ensure that all suggestions look the same.
  • Does the IChatCompletion need both regex and getCompletions() options ? It seems that the regex would be used by the getCompletions() function, but maybe there are other use cases.

@mlucool

mlucool commented Jan 6, 2025

Each regex should end with $ to ensure this regex only matches partial inputs just typed by the user. Without $, the completer may generate completions for commands which were already typed.

Maybe I am misunderstanding, but I think this requirement may be too strict. Example:
User starts with:
Can you pivot @my_df around username?
Now they see that they meant to use my_other_df and go back to edit (| marks the cursor):
Can you pivot @my_o| around username

Would this still trigger the @ autocomplete?


Can you sketch out how one may use the @var? I suspect the completers need a reference back to the current chat. Examples:

  1. What notebook(s) is the chat attached to?
  2. What was the previous convo history (e.g. you could imagine that we want the default autocomplete list for @ to include anything recently used at the top)
  3. Is there a way to make a command stand out from just text? e.g. @definedvar we may want with a gray background typovar in red?
  4. How is the completer linked to the processing? For @var we later need to pull data. Is the expectation that parsing will need to be done with a similar regexp?

@dlqqq
Member Author

dlqqq commented Jan 6, 2025

@brichet I think we should avoid adding features specific to the current frontend component library (Material UI), as that may change in the future. I don't think we should add a props field to ChatCompletion unless there's a strong use-case which justifies it. Omitting this also helps keep chat completions looking consistent in the UI, as you mentioned.

Does the IChatCompletion need both regex and getCompletions() options ? It seems that the regex would be used by the getCompletions() function, but maybe there are other use cases.

IChatCompleter exposes the regex to tell Jupyter Chat when to trigger completions. The getCompletions() methods in turn accepts the matched string to generate completions. Chat completers will always need a way of knowing when to trigger & how to produce completions, so I think both the property & the method are necessary.

You may be conflating the IChatCompleter (the interface implemented by chat completers) and the ChatCompletion (the struct representing an acceptable completion) types. They do have similar and possibly confusing names, so I'm open to new suggestions here.

@dlqqq
Member Author

dlqqq commented Jan 6, 2025

@mlucool

Maybe I am misunderstanding, but I think this requirement may be too strict. Example: ...

Thanks for calling this use-case out. You're right that if we take the entire input string and match it against a regex, the regex will not match an @ command in the middle of the input.

I think we can handle this use-case with a simple modification. Instead of testing the whole input string against each regex, take the substring of the input up to the user's cursor position, then test that substring against each regex. This effectively changes $ in regexes to match "user's cursor" instead of just "end of input". This way, a regex ending in $ can still match at most 1 substring (the original intent behind specifying $), while still generating completions if the user moves the cursor to a previous @ command.

Let me know if this sounds reasonable, in which case I'll patch the existing design.
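
To make the modification concrete, here is a rough sketch of matching against the substring up to the cursor and of replacing the matched substring when a completion is accepted. The helper names matchAtCursor and acceptCompletion are hypothetical, and the @ regex is only an example.

```typescript
// Tests `regex` against the input up to the cursor, so that `$` in the
// regex effectively means "at the user's cursor".
function matchAtCursor(
  input: string,
  cursor: number,
  regex: string
): RegExpExecArray | null {
  return new RegExp(regex).exec(input.slice(0, cursor));
}

// Replaces the substring matched at the cursor with the accepted completion.
// Text after the cursor is kept unchanged.
function acceptCompletion(
  input: string,
  cursor: number,
  regex: string,
  value: string
): string {
  const m = matchAtCursor(input, cursor, regex);
  if (m === null) {
    return input;
  }
  return input.slice(0, m.index) + value + input.slice(cursor);
}

// Example from the discussion: the cursor sits right after "@my_o".
const exampleInput = "Can you pivot @my_o around username";
const exampleCursor = exampleInput.indexOf("@my_o") + "@my_o".length;
const exampleResult = acceptCompletion(
  exampleInput,
  exampleCursor,
  "@\\w*$",
  "@my_other_df"
);
// exampleResult is "Can you pivot @my_other_df around username"
```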

@dlqqq
Member Author

dlqqq commented Jan 6, 2025

@mlucool

  1. What notebook(s) is the chat attached to?
  2. What was the previous convo history (e.g. you could imagine that we want the default autocomplete list for @ to include anything recently used at the top)

I haven't designed how we can "link" 1+ files to chats yet, so retrieving this data from the completer is still an open question. We could pass a reference of the current YChat model (which stores the state of the entire chat) to getCompletions(). This would change its type signature to:

function getCompletions(ychat: YChat, match: string): ChatCompletion[]

This by itself would allow for use-case 2), since a completer can call YChat#getMessages() to get all previous messages. I was hoping that the linked files design would also make linked files accessible directly from the YChat object, but we can't be confident about this until we get some clarity on how we link files.

I will do some experimentation today to explore ways we can get a reference to any arbitrary files accessible from the YChat object.

  3. Is there a way to make a command stand out from just text? e.g. @definedvar we may want with a gray background typovar in red?

The current design doesn't allow for this, but we can allow completers to return some kind of sentinel value to indicate that this is an invalid input. For example, if a completer returns null instead of [...], Jupyter Chat should highlight in the autocomplete menu that the current input is invalid and set it to red in the autocomplete menu.

I'm hesitant on allowing completers to arbitrarily define how their completions are rendered, as that allows other extensions a little too much freedom to deviate from Jupyter's UI design. @ellisonbg has highlighted this concern in the past.

However, allowing completers to define how their completions are rendered is certainly possible. We can modify the type signature of getCompletions() to allow this:

function getCompletions(match: string): (ChatCompletion | JSX.Element)[]

For each generated completion:

  • If it is of type ChatCompletion, then let Jupyter Chat handle its rendering in the completions menu.
  • Otherwise (if it is a React element), then render that element directly in the completions menu.

  4. How is the completer linked to the processing? For @var we later need to pull data. Is the expectation that parsing will need to be done with a similar regexp?

Currently, yes, the expectation is that command parsing & handling is done separately. The issue is that the concept of "chat commands" doesn't exist in Jupyter Chat. Jupyter AI exclusively defines how messages are parsed and handled, which is all currently done in the backend.

We could extend this design to be more general, i.e. build a Chat Commands API instead of just a Chat Completions API. This was my original intent, but I scaled back the design after I realized how much of an overhaul this would entail. I think that this may be worthwhile, but also that we would need to first identify what specific & significant benefits this change would provide.

@mlucool

mlucool commented Jan 7, 2025

I think we can handle this use-case with a simple modification. Instead of testing the whole input string against each regex, take the substring of the input up to the user's cursor position, then test that substring against each regex. This effectively changes $ in regexes to match "user's cursor" instead of just "end of input". This way, a regex ending in $ can still match at most 1 substring (the original intent behind specifying $), while still generating completions if the user moves the cursor to a previous @ command.

I think that sounds fine. Checking how slack handles middle edits (@abc|def), it just pushes characters after the autocomplete to start a new word

I'm hesitant on allowing completers to arbitrarily define how their completions are rendered, as that allows other extensions a little too much freedom to deviate from Jupyter's UI design.

As long as files and variables have really great UX, I don't have a strong view on others defining their completion.

function getCompletions(match: string): (ChatCompletion | JSX.Element)[]

Won't this make it hard to let things edit later? Or does editing change the state back to text?

@dlqqq
Member Author

dlqqq commented Jan 8, 2025

@mlucool

I think that sounds fine. Checking how slack handles middle edits (@abc|def), it just pushes characters after the autocomplete to start a new word

Thanks for confirming. I've patched this revision of the design.

Won't this make it hard to let things edit later? Or does editing change the state back to text?

Can you clarify what you mean here?

On another note: I really appreciate how much thought your team has put into building the best UX for including files & variables. To build a design that allows for this, it would be helpful to have a visualization of the UX your team is envisioning. This will allow me to know what additional capabilities are required from the completer. Even a pen-and-paper sketch showing the UI across a user's actions would be tremendously helpful.

Perhaps @govinda18 has some materials to share?

@dlqqq dlqqq changed the title Design proposal: Chat Completions API (rev. 1) Design proposal: Chat Completions API (rev. 1.1) Jan 8, 2025
@dlqqq
Member Author

dlqqq commented Jan 8, 2025

One shortcoming of this design is that the ChatCompleter doesn't provide handling / parsing after the user accepts a completion. However, after selecting a variable/file with an @ command, we ideally want some way of "attaching" the variable/file to the chat input.

I'm going to think more deeply about how we can unify most of the chat command logic in the frontend. In the meantime, I've opened this issue to track the need to support message attachments: #147

@mlucool

mlucool commented Jan 9, 2025

Won't this make it hard to let things edit later? Or does editing change the state back to text?

Can you clarify what you mean here?

Given function getCompletions(match: string): (ChatCompletion | JSX.Element)[], I can return <img src='something'\>. Now I put my cursor on this and want to edit it again. How do I change it to something else? Do I just delete the element?

On another note: I really appreciate how much thought your team has put into building the best UX for including files & variables. To build a design that allows for this, it would be helpful to have a visualization of the UX your team is envisioning. This will allow me to know what additional capabilities are required from the completer. Even a pen-and-paper sketch showing the UI across a user's actions would be tremendously helpful.

We don't yet have mocks for variables. The key thoughts so far include:

  1. Make it easy for users to select a variable. Aside from typing, we want this to feel like the best options are as little work away as possible (think Gmail and emails not being sorted, but common things up top). This could be predictive (e.g. ghost text) or heuristic-based (last used in chat, the kernel?, what's in the active cell, etc.)
  2. Make it easy for users to understand how a variable will be represented to the LLM. Give them abilities to alter that representation (e.g. verbose). Representations need not be text (e.g. a chart for a multimodal model)
  3. It needs to work beyond top level globals (e.g. imagine a debugger breakpoint where you get a chat to help you debug, you'd want to pass in local variables too. Maybe the same is true with foo['bar'])

Files to me are a bit easier. See what the top other assistants (Cursor, Copilot, Windsurf) are doing and assume if it's useful there, we want something at least as good :)

@krassowski
Member

Some quick thoughts:

  • Is initialize() required? In the example, AiService.listSlashCommands could be invoked in the constructor and saved on this._slash_commands. While it is not set, getCompletions could just return an empty list.
  • if getCompletions took an object, it could be easily extended in the future. I mean instead of getCompletions(ychat: YChat, match: string) something like:
    getCompletions(request: ICompletionRequest)
    
    interface ICompletionRequest {
       ychat: YChat;
       match: string;
    }
  • I would push regex down to getCompletions as an implementation detail to minimize API surface; I struggle to see the value in doing a two-step approach here, we will just end up parsing twice
  • In the case of attaching a preview of variable, it may not be available (it may take time to render); I would discourage the idea of getCompletions returning either ChatCompletion or JSX.Element because we will need two or three different information pieces:
    • how to render the completion in the dropdown (i.e. label/description, or in the most general case a JSX.Element)
    • how to render the completion in the chat
    • probably more things we did not think of yet

@dlqqq
Member Author

dlqqq commented Jan 10, 2025

@mlucool

Representations need not be text (e.g. a chart for a multimodal model)

Let's call this feature rich command previews for the sake of discussion.

Thank you for sharing more details about your team's vision for chat completion! This is helpful. I will take this into account while writing a draft implementation to explore different approaches to providing rich command previews. The draft implementation will help us find the right solution to use here.

Now I put my cursor on this and want to edit it again. How do I change it to something else? Do I just delete the element?

I'm still a little lost about what you mean by "changing it". I'm assuming you mean changing the user input, since we haven't had any discussion for editing/deleting command previews. I'll explore an example here and propose how this would affect the command previews shown in the completions menu.

Let | denote the user's cursor and suppose that there are two local variables: users_dataframe and users_csv. Suppose they type:

Write code to add an "Age" column to @users_dataframe|

This shows just 1 command preview for users_dataframe. If they want to change this, the user should press Backspace repeatedly until their input reads as:

Write code to add an "Age" column to @users|

This now shows 2 command previews: users_dataframe and users_csv. Then, suppose the user types more:

Write code to add an "Age" column to @users_cs|

This now just shows the command preview users_csv.

Does this clarify the behavior?

@dlqqq
Member Author

dlqqq commented Jan 10, 2025

@krassowski Wow, thanks for the great feedback!

Is initialize() required? In the example, AiService.listSlashCommands could be invoked in the constructor and saved on this._slash_commands. While it is not set, getCompletions could just return an empty list.

The rationale is that async functions are easily called in the constructor, but we can't await a constructor easily. It seems easier to bundle any & all async init tasks into a single function which can be awaited. I know there are some Lumino data structures which are used in JupyterLab for this use-case, but I don't see the value in preferring those over simply awaiting an async function.

Jupyter Chat will be the only extension calling await initialize() on each completer, so I think this way of initializing completers is OK.

if getCompletions took an object, it could be easily extended in the future.

Thanks, I agree with you. It's more future-proof to use only one argument with an object type that can be extended as needed. I'll queue this change for the next revision.

I would push regex down to getCompletions as an implementation detail to minimize API surface; I struggle to see the value in doing a two-step approach here, we will just end up parsing twice

Yeah, this also makes sense. Passing the entire input string to each completer is equivalent in performance, and it also simplifies the mental model of how completers work. I'll queue this change for the next revision.
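
For discussion, a rough sketch of what a completer might look like once both queued changes are applied (a single request object, and the regex kept as an implementation detail). Everything here is an assumption about a revision that has not been written yet: the input and cursor fields of ICompletionRequest, the class name, and the hard-coded command list are hypothetical.

```typescript
// Simplified local stand-in for the type from Step 1.
type ChatCompletion = { value: string; label?: string; description?: string };

// Hypothetical request shape; more fields (e.g. a chat model reference)
// could be added later without changing the method signature.
interface ICompletionRequest {
  input: string; // the entire chat input
  cursor: number; // the user's cursor position within `input`
}

interface IChatCompleter {
  id: string;
  getCompletions(request: ICompletionRequest): Promise<ChatCompletion[]>;
}

class AtCommandCompleter implements IChatCompleter {
  id = "example-at-commands";

  // Hypothetical command list, hard-coded for illustration.
  private _commands = ["@file", "@var"];

  // The regex is now private to the completer.
  private _regex = /@\w*$/;

  async getCompletions(request: ICompletionRequest): Promise<ChatCompletion[]> {
    // Match against the input up to the cursor, as in the current revision.
    const m = this._regex.exec(request.input.slice(0, request.cursor));
    if (m === null) {
      return [];
    }
    return this._commands
      .filter(command => command.startsWith(m[0]))
      .map(value => ({ value }));
  }
}
```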

In the case of attaching a preview of variable, it may not be available (it may take time to render); I would discourage the idea of getCompletions returning either ChatCompletion or JSX.Element because we will need two or three different information pieces

I agree that this feature is currently ambiguous in its precise behavior & may introduce "unknown unknown" risk. Therefore, I'm writing a draft implementation which we can all use to test different strategies and iterate on this further.

My thinking is that we may find a way to support rich (image/multimedia) command previews in the completions menu as we experiment. If this proves too challenging however, we may just want to do this later in a future release of Jupyter Chat.

@krassowski
Member

I know there are some Lumino data structures which are used in JupyterLab for this use-case.

Not sure what you are referring to here.

The rationale is that async functions are easily called in the constructor, but we can't await a constructor easily.

True. But then you have a disconnect between initialization and construction logic which can bite you later. A common pattern for async initialization would be:

interface IX {
  ready: Promise<void>;
}
class X implements IX {
  constructor() {
    this._ready = this._initialize();
  }
  get ready(): Promise<void> {
    return this._ready;
  }
  private async _initialize() {
    // do stuff
  }
  private _ready: Promise<void>;
}

This reduces the API contract a little bit and avoids the risk of initialize being called multiple times. Still, not quite sure why you want to await for initialization. I mean maybe for logging errors in case of initialization timeout it would be fine, but otherwise you would risk getting stuck when awaiting for initialize or ready call which never resolves, right?
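
If the caller does want to wait on ready without risking getting stuck, one option is to bound the wait. A minimal sketch, assuming a hypothetical helper named readyWithin; the timeout value and what to do on timeout (e.g. log and continue) are left to the caller.

```typescript
// Resolves to true if `ready` resolves within `ms` milliseconds, or to
// false if the timeout elapses first. Rejections of `ready` are propagated.
function readyWithin(ready: Promise<void>, ms: number): Promise<boolean> {
  return new Promise<boolean>((resolve, reject) => {
    const timer = setTimeout(() => resolve(false), ms);
    ready.then(
      () => {
        clearTimeout(timer);
        resolve(true);
      },
      err => {
        clearTimeout(timer);
        reject(err);
      }
    );
  });
}
```

With this, a caller could write something like `if (!(await readyWithin(x.ready, 5000))) { /* log a timeout */ }` instead of awaiting x.ready directly.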

Anyways, this is a minor detail and more me saying "I did that before and it bit me" than "it must be done this way".

@dlqqq
Member Author

dlqqq commented Jan 11, 2025

@krassowski Ah, thanks for clarifying. I must've been confusing your suggestion with something else. This makes much more sense to me now seeing your example. I agree this is better, so let's also queue this for the next revision. 👍

Thanks for taking the time to leave all this helpful feedback!

I actually can't own the draft implementation right now; I need to upgrade Jupyter AI to use langchain~=0.3 instead of langchain~=0.2. We're getting some complaints from users & dependents who want to use the latest LangChain with Jupyter AI v2. Given that we need to upgrade for v3 anyways, I think this takes priority.

Others, please feel free to own the draft implementation of this design while I work on the LangChain upgrade. Any contribution would be appreciated! 🤗 Just leave a note in this thread for others to avoid duplicate work.
