
[Idea]: Persona-Based Muxing #1055

Open
aponcedeleonch opened this issue Feb 14, 2025 · 8 comments
@aponcedeleonch
Contributor

aponcedeleonch commented Feb 14, 2025

Enhance CodeGate’s muxing functionality to support user-defined “personas,” allowing the system to classify incoming requests based on a persona and then route them to an LLM chosen by the user. For instance, a “Frontend React Expert” persona might be manually mapped to a favorite advanced LLM, while a “Backend Microservices Guru” persona could be routed to a lightweight local model—entirely at the user’s discretion.

Why Is This Feature Important?

  1. Fine-Grained Control. Users maintain complete authority over which LLM handles requests for a given persona (e.g., “Frontend React Expert” → Model X, “Backend Microservices Guru” → Model Y).
  2. Better Alignment With Developer Expertise. By defining personas that capture specific skill sets or roles within a project, responses can be more targeted and relevant to the given domain, without sacrificing user choice in model selection.
  3. Resource and Cost Efficiency. Users decide exactly when to employ advanced or specialized models, and when to default to smaller, cost-effective ones, based on the persona’s needs.

Possible Solution

Persona Definitions

Store personas in CodeGate configuration (similar to how CodeGate currently handles different providers).

Examples:

  • Frontend React Expert: Focuses on UI and React-specific queries.
  • Backend Microservices Guru: Focuses on scalability, architecture, and performance.
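As a rough sketch, persona definitions could live alongside provider configuration roughly like this (field names are illustrative assumptions, not an existing CodeGate schema):

```python
# Hypothetical sketch of persona definitions in CodeGate configuration.
# Names and fields are illustrative, not an existing schema.
from dataclasses import dataclass


@dataclass
class Persona:
    name: str
    description: str  # free text the classifier matches prompts against


PERSONAS = [
    Persona(
        name="frontend-react-expert",
        description="UI work and React-specific queries: components, hooks, state.",
    ),
    Persona(
        name="backend-microservices-guru",
        description="Scalability, service architecture, and performance.",
    ),
]
```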

Local LLM Classifier

A small, local model quickly inspects incoming prompts to determine which persona best fits.
Example: "How do I optimize state management in my React app?" → Frontend React Expert.
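As a toy sketch of that step, plain word overlap can stand in for the real classifier (the scoring below is deliberately naive; CodeGate would presumably use a small local LLM or an embedding model instead):

```python
# Toy persona classifier: scores each persona description by word overlap
# with the prompt and returns the best-matching persona name. A real
# implementation would use a small local LLM or embeddings, not this.
def classify(prompt: str, personas: dict[str, str]) -> str:
    prompt_words = set(prompt.lower().split())

    def score(name: str) -> int:
        return len(prompt_words & set(personas[name].lower().split()))

    return max(personas, key=score)


personas = {
    "frontend-react-expert": "react ui state management components hooks",
    "backend-microservices-guru": "scalability architecture performance services",
}
print(classify("How do I optimize state management in my React app?", personas))
# prints "frontend-react-expert"
```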

User-Defined LLM Routing

After classification, CodeGate routes requests to the LLM the user has configured for that persona.
Example: Frontend React Expert → [User-selected advanced model].
Users can easily update which LLM is tied to each persona at any time.
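The mapping itself could be as simple as a user-editable dict with a catch-all default (the model identifiers below are placeholders the user would choose, not a recommendation):

```python
# Sketch of user-defined persona -> model routing with a catch-all default.
# Model identifiers are placeholders; the user configures these themselves.
ROUTES = {
    "frontend-react-expert": "anthropic/claude-3-5-sonnet",
    "backend-microservices-guru": "ollama/qwen2.5-coder",
}
DEFAULT_MODEL = "openai/gpt-4o"


def route(persona: str) -> str:
    """Return the user-configured model for a persona, or the default."""
    return ROUTES.get(persona, DEFAULT_MODEL)
```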

Challenges & Considerations

  1. Classifier Accuracy. Ensuring the local LLM correctly identifies the right persona. Misclassifications could lead to irrelevant or suboptimal answers—even if the correct LLM is specified.
  2. Performance & Latency. Running a local model for classification adds a small overhead, which must be optimized to avoid bottlenecks in large-scale or rapid-fire scenarios. An alternative would be to use one of the user-defined providers, although that would mean CodeGate consumes more of the user's tokens, which might be unexpected and leave a bad impression.
  3. User Experience. Providing a clear interface or config structure for defining personas and selecting their corresponding LLMs. Ensuring that changes to persona-LLM mappings are intuitive and quick to implement.
  4. Extensibility. Potential to introduce more advanced persona logic in the future (e.g., dynamic persona creation).

Additional Context

No response

@aponcedeleonch aponcedeleonch changed the title [idea]: [Idea]: Persona-Based Muxing Feb 14, 2025
@kantord
Member

kantord commented Feb 14, 2025

A small, local model quickly inspects incoming prompts to determine which persona best fits.
Example: "How do I optimize state management in my React app?" → Frontend React Expert.

It is perhaps slightly off-topic, but I was wondering whether the same logic could be used to help the user take maximum advantage of CodeGate features.

For instance, by detecting that the user is typing a lot of manual instructions to convince the model to write code in a specific style, we could let them know that they can avoid this by creating a custom prompt.

(That is just one example, it's probably applicable to most features)

@kantord
Member

kantord commented Feb 14, 2025

@dashtangui was also looking into Cline's "memory bank" concept which is (as I understand) another take on the same idea.

She mentioned that there is a certain convenience to being able to specify these files in your repository, and even share it with your workmates simply using git.

I believe this is something we should look into as well, but she can probably explain the idea better.

@kantord
Member

kantord commented Feb 14, 2025

IMO it is a great idea!

I was also thinking that we could have an LLM tool for configuring the different personas, similar to how the "Custom GPT" feature works in ChatGPT. That feature lets you chat with an LLM agent that creates the configuration/persona for you.

This is a feature I have used in the past to create customized assistants for different development tasks (such as generating integration tests based on a screenshot, matching a certain testing approach/code style) and it works pretty well!

@aponcedeleonch
Contributor Author

aponcedeleonch commented Feb 14, 2025

It is perhaps slightly off-topic, but I was wondering whether the same logic could be used to help the user take maximum advantage of CodeGate features.
For instance, by detecting that the user is typing a lot of manual instructions to convince the model to write code in a specific style, we could let them know that they can avoid this by creating a custom prompt.

@kantord Yes, that's correct. Even workspace creation and management could be done in-chat with the help of an LLM. The main blockers at the moment are two things:

  1. We don't have a local LLM in CodeGate that we can leverage to detect all of these natural-language requests.
  2. UX. We need to be careful when implementing this kind of feature because latency could take a big hit.

Number 1 should be solved if we go with this approach of using an LLM to detect personas.


Cline's "memory bank" concept which is (as I understand) another take on the same idea.

I quickly browsed the link. Memory bank solves a different problem IMO. Persona muxing is more a way of routing requests, whereas memory bank is a way of managing the context of the conversation itself. But memory bank does seem interesting, and we could explore whether there's some intersection.


I was also thinking if we could have an LLM tool for configuring the different personas. Similarly to how the "Custom GPT" feature works in ChatGPT. That feature allows you to chat with an LLM agent which will create a configuration/persona

That's a nice idea. We can start simple and have user-specified personas but definitely that would be a nice feature as well.

@kantord
Member

kantord commented Feb 14, 2025

I quickly browsed the link. Memory bank solves a different problem IMO. Persona muxing is more a way of routing requests, whereas memory bank is a way of managing the context of the conversation itself. But memory bank does seem interesting, and we could explore whether there's some intersection.

Maybe I'm confused, though: is the idea also based on using different custom prompts, or is it 100% just about the model/provider? In any case, I think that if we have logic for personas, we definitely should have features for customizing the system prompt as well.

In any case, I get your point that it's a different use case. I guess the memory bank is also more "project-based" than "persona-based" 🤔 but I'm not sure there is a natural separation point between the two.

@lukehinds
Contributor

lukehinds commented Feb 18, 2025

I imagine this would be more architect (DeepSeek / o1 reasoning models) vs. coder (Qwen, Claude Sonnet 3.5) rather than a JavaScript or database expert. Isn't a stack-specific persona more of a prompt thing?

@aponcedeleonch
Contributor Author

@kantord Sorry, I missed your last comment! Yes, the initial idea is 100% about:

  1. Classify a request prompt into a persona
  2. Routing the request prompt to a model/provider based on the persona identified.

But for sure, a natural evolution of the feature would be the possibility of adding a custom system prompt based on the persona, similar to what we're doing right now with workspaces.

@aponcedeleonch
Contributor Author

@lukehinds Yes, exactly: the personas are meant to be more general and not so specific. Maybe the examples I chose in the issue description are not the best.

We could also bundle CodeGate with a set of sample personas, so users only have to configure the muxing rules to route persona-identified requests to their preferred models.

@aponcedeleonch aponcedeleonch self-assigned this Mar 3, 2025
aponcedeleonch added a commit that referenced this issue Mar 3, 2025
Related to: #1055

For the current implementation of muxing we only need
to match a single Persona at a time. For example:
1. mux1 -> persona Architect -> openai o1
2. mux2 -> catch all -> openai gpt4o

In the above case we would only need to know if the request
matches the persona `Architect`. It's not needed to match
any extra personas even if they exist in DB.

This PR introduces what's necessary to do the above without
actually wiring in muxing rules. The PR:
- Creates the persona table in DB
- Adds methods to write and read to the new persona table
- Implements a function to check whether a query matches the specified persona

For more details on the personas and the queries, see the unit tests.
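The single-persona check described above could, for instance, compare an embedding of the query against the persona's embedding with a similarity threshold. A minimal sketch, assuming embeddings are produced elsewhere (the threshold value is an arbitrary placeholder, not a tuned number):

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


def query_matches_persona(query_vec, persona_vec, threshold=0.75) -> bool:
    # True when the query embedding is close enough to the persona embedding.
    # 0.75 is an arbitrary placeholder threshold.
    return cosine(query_vec, persona_vec) >= threshold
```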