Is it possible to add vocabulary word lists to this implementation of the model? #8

Open
HugoPfeffer opened this issue Sep 9, 2024 · 1 comment

Comments

@HugoPfeffer

See the title.

I'm not very experienced, so this may be a basic question, but can anyone help me?

@xmontero

xmontero commented Sep 20, 2024

My use case is the following:

  • I own a travel agency, and in meetings we often mention the names of providers, the name of our own company, etc. We frequently use Catalan (language code "ca") as our internal language for meetings.
  • These names usually get "converted" into dictionary words. Say a provider is named "T. E. Ball" and the phrase is "Call the T. E. Ball guys"; it might be transcribed as "Call the table guys". (The real case is in Catalan; I have just translated it into English here.)

So I want to pass a list of "own words" so that, when something in the audio sounds similar to one of those written spellings, it is spelled as in the "custom dictionary".
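To make the idea concrete, here is a minimal sketch of a post-processing pass over the finished transcript. It is not part of this project; the names `sound_key` and `apply_custom_words` are hypothetical, and the "sounds similar" test is a deliberately crude assumption (drop accents and vowels, collapse repeated letters) rather than a real phonetic algorithm.

```python
import unicodedata

VOWELS = set("aeiouy")

def sound_key(text):
    """Crude sound key (an assumption, not a real phonetic algorithm):
    strip accents, keep letters only, drop vowels, collapse repeated letters."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    key = []
    for ch in decomposed:
        # Combining accents and punctuation are not alphabetic, so they are skipped.
        if not ch.isalpha() or ch in VOWELS:
            continue
        if not key or key[-1] != ch:
            key.append(ch)
    return "".join(key)

def apply_custom_words(transcript, custom_words, max_span=3):
    """Replace runs of up to max_span words whose sound key equals the key
    of a custom word with that custom word's spelling."""
    keys = {sound_key(word): word for word in custom_words}
    words = transcript.split()
    out = []
    i = 0
    while i < len(words):
        # Try the longest run first so multi-word matches win over single words.
        for span in range(min(max_span, len(words) - i), 0, -1):
            key = sound_key(" ".join(words[i:i + span]))
            if key and key in keys:
                out.append(keys[key])
                i += span
                break
        else:
            out.append(words[i])
            i += 1
    return " ".join(out)

custom_words = ["T. E. Ball", "Mr. Watanabe"]
print(apply_custom_words("Call the table guys", custom_words))
# prints: Call the T. E. Ball guys
```

A key this coarse will over-match (any word with the same consonant skeleton gets replaced) and it drops punctuation attached to a replaced word, so it only illustrates the shape of the feature. Biasing the model during decoding would likely work better, if the implementation allows it.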

I also think that, given that LLMs rely on "proximity" between vectors, a list of words along with their definitions might help the model decode the context. Something like:

definitions:
    T. E. Ball: "The name of a provider in Japan"
    Mr. Watanabe: "The boss of T. E. Ball"
    [...]
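
One way such a definitions list might be used is to flatten it into a short context string that is handed to the model before transcription. This is only a sketch under the assumption that the transcription tool accepts some free-text prompt or context; I have not confirmed that this implementation does. The function name `build_context_prompt` and the `max_chars` budget are hypothetical.

```python
def build_context_prompt(definitions, max_chars=600):
    """Flatten a {term: meaning} mapping into one glossary string.

    Entries are added in order until the (hypothetical) character budget
    would be exceeded, so no entry is ever cut in half."""
    prompt = "Glossary:"
    for term, meaning in definitions.items():
        entry = f" {term} ({meaning})."
        if len(prompt) + len(entry) > max_chars:
            break
        prompt += entry
    return prompt

definitions = {
    "T. E. Ball": "The name of a provider in Japan",
    "Mr. Watanabe": "The boss of T. E. Ball",
}
print(build_context_prompt(definitions))
```

Whether a prompt like this actually changes the spelling in the output depends entirely on the model and on how the implementation feeds it context.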

Does it make sense?
