On-Prem LLM Model Selector
Install the transformers, torch, and sentencepiece libraries (sentencepiece is typically required by the T5 tokenizers) for loading and running the Hugging Face models:
pip install transformers torch sentencepiece
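To confirm the environment is ready before downloading anything, a quick sanity check can be run (a minimal sketch; the versions printed will vary with your installation):
import torch
import transformers

print("transformers:", transformers.__version__)
print("torch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())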
Python code for downloading the t5-small model from Hugging Face (download_t5small_model.py)
Make sure a ./models/t5-small folder exists.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
model_name = "t5-small" # The model you want to use
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
# Save the model locally
model.save_pretrained("./models/t5-small")
tokenizer.save_pretrained("./models/t5-small")
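To verify the download landed correctly, the model can be reloaded strictly from the local folder (a minimal sketch using the local_files_only flag, which fails if any file is missing):
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Reload from disk only; no network access is attempted
model = AutoModelForSeq2SeqLM.from_pretrained("./models/t5-small", local_files_only=True)
tokenizer = AutoTokenizer.from_pretrained("./models/t5-small", local_files_only=True)
print("Loaded", model.config.model_type, "with", model.num_parameters(), "parameters")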
Python code for downloading the fastchat-t5-3b model from Hugging Face (download_fastchat-t5-3b_model.py)
Make sure a ./models/fastchat-t5-3b-v1.0 folder exists.
Make sure you are logged in to Hugging Face:
huggingface-cli login
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
model_name = "fastchat-t5-3b-v1.0" # The model you want to use
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
# Save the model locally
model.save_pretrained("./models/fastchat-t5-3b-v1.0")
tokenizer.save_pretrained("./models/fastchat-t5-3b-v1.0")
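If the interactive CLI prompt is inconvenient (for example in automated setups), the login can also be done programmatically via huggingface_hub (a sketch; the token string is a placeholder):
from huggingface_hub import login

login(token="YOUR HUGGING FACE TOKEN")  # placeholder; avoid hard-coding real tokens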
Python code for downloading the Llama-3.2-1B model from Hugging Face (download_Llama-3.2-1B_model.py)
Make sure a ./models/Llama-3.2-1B folder exists.
You need to request access to this gated model first: https://huggingface.co/meta-llama/Llama-3.2-1B
Make sure you are logged in to Hugging Face:
huggingface-cli login
from transformers import AutoModelForCausalLM, AutoTokenizer
model_name = "meta-llama/Llama-3.2-1B" # The model you want to use
token = "YOUR HUGGING FACE TOKEN" # placeholder; optional if huggingface-cli login was used
# Llama is a causal (decoder-only) model, so use AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained(model_name, token=token)
tokenizer = AutoTokenizer.from_pretrained(model_name, token=token)
# Save the model locally
model.save_pretrained("./models/Llama-3.2-1B")
tokenizer.save_pretrained("./models/Llama-3.2-1B")
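As an alternative to loading and re-saving the model, huggingface_hub can mirror the whole repository in one call (a sketch, assuming the same folder layout as above):
from huggingface_hub import snapshot_download

# Downloads every file in the repo straight into the target folder
snapshot_download(
    repo_id="meta-llama/Llama-3.2-1B",
    local_dir="./models/Llama-3.2-1B",
    token="YOUR HUGGING FACE TOKEN",  # placeholder; omit if already logged in
)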
- Run each download script, e.g. python3 download_t5small_model.py, python3 download_fastchat-t5-3b_model.py, and python3 download_Llama-3.2-1B_model.py
Python code for chatting with a model (chat_with_model.py)
from transformers import AutoModelForCausalLM, AutoModelForSeq2SeqLM, AutoTokenizer, AutoConfig
import sys

def generate_response(model_name, user_input):
    # Load the configuration to check the model type
    config = AutoConfig.from_pretrained(f'./models/{model_name}')
    # Choose the model class based on the configuration
    if config.model_type == "llama":
        model = AutoModelForCausalLM.from_pretrained(f'./models/{model_name}')
    else:
        model = AutoModelForSeq2SeqLM.from_pretrained(f'./models/{model_name}')
    # Load the tokenizer (the same call works for both model types)
    tokenizer = AutoTokenizer.from_pretrained(f'./models/{model_name}')
    # Tokenize the user input
    inputs = tokenizer(user_input, return_tensors="pt")
    # Generate a response (pass the attention mask along with the input ids)
    outputs = model.generate(**inputs, max_length=150, num_return_sequences=1)
    # Decode the response
    response = tokenizer.decode(outputs[0], skip_special_tokens=True)
    return response

if __name__ == "__main__":
    model_name = sys.argv[1]  # Folder name under ./models, e.g. "t5-small" or "Llama-3.2-1B"
    user_input = sys.argv[2]  # User input (message)
    print(generate_response(model_name, user_input))
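The same function can also be driven interactively; a minimal sketch, assuming the script above is saved as chat_with_model.py (the name the test commands below use):
from chat_with_model import generate_response

model_name = "t5-small"  # any folder under ./models
while True:
    user_input = input("> ")
    if user_input.strip().lower() in ("quit", "exit"):
        break
    print(generate_response(model_name, user_input))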
- Test the models:
python3 chat_with_model.py t5-small "What is new?"
python3 chat_with_model.py fastchat-t5-3b-v1.0 "What is new?"
python3 chat_with_model.py Llama-3.2-1B "What is new?"
- Run the Rails server