Our new method of loading models into our inferencing engines requires an initContainer that uses Zarf Injection to mount a PV into the inferencing engine's container. This initContainer + PV is, in effect, another container that must be approved through hardening processes.

This spike is to investigate separating the injection out into its own container for hardening, including the model files, and then running that package through the IronBank hardening process. The model we will use is defenseunicorns/Hermes-2-Pro-Mistral-7B-4bit-32g-GPTQ, and the targeted backend will be the vLLM backend.
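As a rough illustration of the shape this could take, here is a minimal Pod sketch: the model files are baked into their own (separately hardened) image, an initContainer copies them into a shared volume, and the vLLM container reads them from that mount. All image names, paths, and container names below are hypothetical placeholders, not the actual Zarf Injection mechanism or IronBank image references.

```yaml
# Hypothetical sketch only — image names, paths, and volume type are assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: vllm-inference
spec:
  initContainers:
    - name: model-injector                # hypothetical hardened model container
      image: <hardened-model-image>       # image containing only the model files
      command: ["cp", "-r", "/models/.", "/mnt/models/"]
      volumeMounts:
        - name: model-store
          mountPath: /mnt/models
  containers:
    - name: vllm
      image: <hardened-vllm-image>        # hardened vLLM backend image
      args: ["--model", "/mnt/models/Hermes-2-Pro-Mistral-7B-4bit-32g-GPTQ"]
      volumeMounts:
        - name: model-store
          mountPath: /mnt/models
          readOnly: true
  volumes:
    - name: model-store
      persistentVolumeClaim:
        claimName: model-store-pvc        # could also be an emptyDir, depending on the design
```

Whether the shared volume is a PVC (as in the current Zarf Injection approach) or an ephemeral volume is one of the questions this spike would need to answer.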