
[Idea]: Persona-Based Muxing #1055

Open
aponcedeleonch opened this issue Feb 14, 2025 · 8 comments
@aponcedeleonch
Contributor

aponcedeleonch commented Feb 14, 2025

Enhance CodeGate’s muxing functionality to support user-defined “personas,” allowing the system to classify incoming requests based on a persona and then route them to an LLM chosen by the user. For instance, a “Frontend React Expert” persona might be manually mapped to a favorite advanced LLM, while a “Backend Microservices Guru” persona could be routed to a lightweight local model—entirely at the user’s discretion.

Why Is This Feature Important?

  1. Fine-Grained Control. Users maintain complete authority over which LLM handles requests for a given persona (e.g., “Frontend React Expert” → Model X, “Backend Microservices Guru” → Model Y).
  2. Better Alignment With Developer Expertise. By defining personas that capture specific skill sets or roles within a project, responses can be more targeted and relevant to the given domain, without sacrificing user choice in model selection.
  3. Resource and Cost Efficiency. Users decide exactly when to employ advanced or specialized models, and when to default to smaller, cost-effective ones, based on the persona’s needs.

Possible Solution

Persona Definitions

Store personas in CodeGate configuration (similar to how CodeGate currently handles different providers).

Examples:

  • Frontend React Expert: Focuses on UI and React-specific queries.
  • Backend Microservices Guru: Focuses on scalability, architecture, and performance.
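As a rough sketch, persona definitions could live alongside provider configuration roughly like this (field names are illustrative assumptions, not an existing CodeGate schema):

```python
# Hypothetical sketch of persona definitions in CodeGate configuration.
# Names and fields are illustrative, not an existing schema.
from dataclasses import dataclass


@dataclass
class Persona:
    name: str
    description: str  # free text the classifier matches prompts against


PERSONAS = [
    Persona(
        name="frontend-react-expert",
        description="UI work and React-specific queries: components, hooks, state.",
    ),
    Persona(
        name="backend-microservices-guru",
        description="Scalability, service architecture, and performance.",
    ),
]
```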

Local LLM Classifier

A small, local model quickly inspects incoming prompts to determine which persona best fits.
Example: "How do I optimize state management in my React app?" → Frontend React Expert.
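As a toy sketch of that step, plain word overlap can stand in for the real classifier (the scoring below is deliberately naive; CodeGate would presumably use a small local LLM or an embedding model instead):

```python
# Toy persona classifier: scores each persona description by word overlap
# with the prompt and returns the best-matching persona name. A real
# implementation would use a small local LLM or embeddings, not this.
def classify(prompt: str, personas: dict[str, str]) -> str:
    prompt_words = set(prompt.lower().split())

    def score(name: str) -> int:
        return len(prompt_words & set(personas[name].lower().split()))

    return max(personas, key=score)


personas = {
    "frontend-react-expert": "react ui state management components hooks",
    "backend-microservices-guru": "scalability architecture performance services",
}
print(classify("How do I optimize state management in my React app?", personas))
# prints "frontend-react-expert"
```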

User-Defined LLM Routing

After classification, CodeGate routes requests to the LLM the user has configured for that persona.
Example: Frontend React Expert → [User-selected advanced model].
Users can easily update which LLM is tied to each persona at any time.
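The mapping itself could be as simple as a user-editable dict with a catch-all default (the model identifiers below are placeholders the user would choose, not a recommendation):

```python
# Sketch of user-defined persona -> model routing with a catch-all default.
# Model identifiers are placeholders; the user configures these themselves.
ROUTES = {
    "frontend-react-expert": "anthropic/claude-3-5-sonnet",
    "backend-microservices-guru": "ollama/qwen2.5-coder",
}
DEFAULT_MODEL = "openai/gpt-4o"


def route(persona: str) -> str:
    """Return the user-configured model for a persona, or the default."""
    return ROUTES.get(persona, DEFAULT_MODEL)
```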

Challenges & Considerations

  1. Classifier Accuracy. Ensuring the local LLM correctly identifies the right persona. Misclassifications could lead to irrelevant or suboptimal answers—even if the correct LLM is specified.
  2. Performance & Latency. Running a local model for classification adds a small overhead, which must be optimized to avoid bottlenecks in large-scale or rapid-fire scenarios. An alternative would be to use one of the user-defined providers, although that would mean CodeGate consumes more of the user's tokens, which might be unexpected and leave a bad impression.
  3. User Experience. Providing a clear interface or config structure for defining personas and selecting their corresponding LLMs. Ensuring that changes to persona-LLM mappings are intuitive and quick to implement.
  4. Extensibility. Potential to introduce more advanced persona logic in the future (e.g., dynamic persona creation).

Additional Context

No response

@aponcedeleonch aponcedeleonch changed the title [idea]: [Idea]: Persona-Based Muxing Feb 14, 2025
@kantord
Member

kantord commented Feb 14, 2025

A small, local model quickly inspects incoming prompts to determine which persona best fits.
Example: "How do I optimize state management in my React app?" → Frontend React Expert.

It is perhaps slightly off-topic, but I was wondering whether the same logic could be used to help the user take maximum advantage of CodeGate features.

For instance, by detecting that the user is typing a lot of manual instructions to convince the model to write code in a specific style, we could let them know that they can avoid this by creating a custom prompt.

(That is just one example, it's probably applicable to most features)

@kantord
Member

kantord commented Feb 14, 2025

@dashtangui was also looking into Cline's "memory bank" concept which is (as I understand) another take on the same idea.

She mentioned that there is a certain convenience to being able to specify these files in your repository, and even share it with your workmates simply using git.

I believe this is something we should look into as well, but she can probably explain the idea better.

@kantord
Member

kantord commented Feb 14, 2025

IMO it is a great idea!

I was also thinking that we could have an LLM tool for configuring the different personas, similar to how the "Custom GPT" feature works in ChatGPT. That feature lets you chat with an LLM agent that creates the configuration/persona for you.

This is a feature I have used in the past to create customized assistants for different development tasks (such as generating integration tests based on a screenshot, matching a certain testing approach/code style) and it works pretty well!

@aponcedeleonch
Contributor Author

aponcedeleonch commented Feb 14, 2025

It is perhaps slightly off-topic, but I was wondering whether the same logic could be used to help the user take maximum advantage of CodeGate features.
For instance, by detecting that the user is typing a lot of manual instructions to convince the model to write code in a specific style, we could let them know that they can avoid this by creating a custom prompt.

@kantord Yes, that's correct. Even workspace creation and management could be done in-chat with the help of an LLM. The main blockers at the moment are two things:

  1. We don't have a local LLM in CodeGate that we can leverage to detect all of these natural-language requests.
  2. UX. We need to be careful when implementing this kind of feature because latency could take a big hit.

Number 1 should be solved if we go with this approach of using an LLM to detect personas.


Cline's "memory bank" concept which is (as I understand) another take on the same idea.

I quickly browsed the link. Memory bank solves a different problem IMO. Persona muxing is more a way of routing requests, whereas memory bank is a way of managing the context of the conversation itself. But memory bank does seem interesting, and we could explore whether there's some intersection.


I was also thinking if we could have an LLM tool for configuring the different personas. Similarly to how the "Custom GPT" feature works in ChatGPT. That feature allows you to chat with an LLM agent which will create a configuration/persona

That's a nice idea. We can start simple and have user-specified personas but definitely that would be a nice feature as well.

@kantord
Member

kantord commented Feb 14, 2025

I quickly browsed the link. Memory bank solves a different problem IMO. Persona muxing is more a way of routing requests, whereas memory bank is a way of managing the context of the conversation itself. But memory bank does seem interesting, and we could explore whether there's some intersection.

Maybe I'm confused, though: is the idea also based on using different custom prompts, or is it 100% just about the model/provider? In any case, I think that if we have logic for personas, we definitely should have features for customizing the system prompt as well.

In any case, I get your point that it's a different use case. I guess the memory bank is also more "project-based" than "persona-based" 🤔 but I'm not sure there is a natural separation point between the two.

@lukehinds
Contributor

lukehinds commented Feb 18, 2025

I imagine this would be more architect (DeepSeek / o1 reasoning models) vs. coder (Qwen, Claude Sonnet 3.5) rather than a JavaScript or database expert. Isn't a stack-specific persona more of a prompt thing?

@aponcedeleonch
Contributor Author

@kantord Sorry, I missed your last comment! Yes, the initial idea is 100% about:

  1. Classify a request prompt into a persona
  2. Routing the request prompt to a model/provider based on the persona identified.

But for sure, a natural evolution of the feature would be the possibility of adding a custom system prompt based on the persona, similar to what we're doing right now with workspaces.

@aponcedeleonch
Contributor Author

@lukehinds Yes, exactly: the personas are meant to be more general and not so specific. Maybe the examples I chose in the issue description are not the best.

We could also bundle CodeGate with a set of sample personas, so users only have to configure the muxing rules to route persona-identified requests to their preferred models.

@aponcedeleonch aponcedeleonch self-assigned this Mar 3, 2025
aponcedeleonch added a commit that referenced this issue Mar 3, 2025
Related to: #1055

For the current implementation of muxing we only need
to match a single Persona at a time. For example:
1. mux1 -> persona Architect -> openai o1
2. mux2 -> catch all -> openai gpt4o

In the above case we would only need to know if the request
matches the persona `Architect`. It's not needed to match
any extra personas even if they exist in DB.

This PR introduces what's necessary to do the above without
actually wiring in muxing rules. The PR:
- Creates the persona table in DB
- Adds methods to write and read to the new persona table
- Implements a function to check whether a query matches the specified persona

For more details on the personas and the queries, see the unit tests.
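The single-persona check described above could, for instance, compare an embedding of the query against the persona's embedding with a similarity threshold. A minimal sketch, assuming embeddings are produced elsewhere (the threshold value is an arbitrary placeholder, not a tuned number):

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


def query_matches_persona(query_vec, persona_vec, threshold=0.75) -> bool:
    # True when the query embedding is close enough to the persona embedding.
    # 0.75 is an arbitrary placeholder threshold.
    return cosine(query_vec, persona_vec) >= threshold
```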