Our new method of loading models into our inferencing engines requires an initContainer that uses Zarf Injection to mount a PV into the inferencing engine's container. This initContainer + PV is, in effect, another container that must be approved through hardening processes.

This spike is to investigate separating the injection out into its own container for hardening, including the model files, and then running that package through the IronBank hardening process. The model we will use is defenseunicorns/Hermes-2-Pro-Mistral-7B-4bit-32g-GPTQ, and the targeted backend will be the vLLM backend.
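As a rough illustration of the shape this could take, here is a minimal Pod sketch: the model files are baked into their own (separately hardened) image, an initContainer copies them into a shared volume, and the vLLM container reads them from that mount. All image names, paths, and container names below are hypothetical placeholders, not the actual Zarf Injection mechanism or IronBank image references.

```yaml
# Hypothetical sketch only — image names, paths, and volume type are assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: vllm-inference
spec:
  initContainers:
    - name: model-injector                # hypothetical hardened model container
      image: <hardened-model-image>       # image containing only the model files
      command: ["cp", "-r", "/models/.", "/mnt/models/"]
      volumeMounts:
        - name: model-store
          mountPath: /mnt/models
  containers:
    - name: vllm
      image: <hardened-vllm-image>        # hardened vLLM backend image
      args: ["--model", "/mnt/models/Hermes-2-Pro-Mistral-7B-4bit-32g-GPTQ"]
      volumeMounts:
        - name: model-store
          mountPath: /mnt/models
          readOnly: true
  volumes:
    - name: model-store
      persistentVolumeClaim:
        claimName: model-store-pvc        # could also be an emptyDir, depending on the design
```

Whether the shared volume is a PVC (as in the current Zarf Injection approach) or an ephemeral volume is one of the questions this spike would need to answer.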