Commit
Forward arguments from TGI launcher to the model (#28)
* Include revision
* Expose match_batch_size as envvar for TGI entrypoint
* Remove Intellij files from git
* Remove unused variable in entrypoint
* again
* Fix TGI_MAX_INPUT_LENGTH to TGI_MAX_INPUT_TOKENS to stay in tokens
* Let's allow to use specific TGI commit
* Delete comments
* Makes it possible to install specific commit of TGI also in tgi_test
* Oops missing one file
* leverage forwarded variables from the launcher to allocate the model
* Fix invalid variable name
* Add missing find-links argument to make the dependend tests running
* Update tests with new args
* Revert using git and use curl + github archive
* let's define max-batch-prefill-tokens too
* Let's map the model_id to the value provided by
* Remove overriding TGI entrypoint
1 parent e663b13, commit c7fe483
Showing 14 changed files with 84 additions and 33 deletions.
    @@ -132,4 +132,5 @@ dmypy.json
     # Models
     *.pt
     
    -.vscode
    +.vscode
    +.idea/
    @@ -1,17 +1,44 @@
     #!/bin/bash
     
    -if [[ -z "${HF_MODEL_ID}" ]]; then
    -  echo "HF_MODEL_ID must be set"
    +# Hugging Face Hub related
    +if [[ -z "${MODEL_ID}" ]]; then
    +  echo "MODEL_ID must be set"
       exit 1
     fi
    -export MODEL_ID="${HF_MODEL_ID}"
    +export MODEL_ID="${MODEL_ID}"
     
    -if [[ -n "${HF_MODEL_REVISION}" ]]; then
    -  export REVISION="${HF_MODEL_REVISION}"
    +# TGI related
    +if [[ -n "${TGI_MAX_CONCURRENT_REQUESTS}" ]]; then
    +  export TGI_MAX_CONCURRENT_REQUESTS="${TGI_MAX_CONCURRENT_REQUESTS}"
    +else
    +  export TGI_MAX_CONCURRENT_REQUESTS 4
     fi
     
    -if [[ -n "${HF_MODEL_TRUST_REMOTE_CODE}" ]]; then
    -  export TRUST_REMOTE_CODE="${HF_MODEL_TRUST_REMOTE_CODE}"
    +if [[ -n "${TGI_MAX_BATCH_SIZE}" ]]; then
    +  export TGI_MAX_BATCH_SIZE="${TGI_MAX_BATCH_SIZE}"
    +else
    +  export TGI_MAX_BATCH_SIZE 1
     fi
     
    -text-generation-launcher --port 8080
    +if [[ -n "${TGI_MAX_INPUT_TOKENS}" ]]; then
    +  export TGI_MAX_INPUT_TOKENS="${TGI_MAX_INPUT_TOKENS}"
    +else
    +  export TGI_MAX_INPUT_TOKENS 128
    +fi
    +
    +if [[ -n "${TGI_MAX_TOTAL_TOKENS}" ]]; then
    +  export TGI_MAX_TOTAL_TOKENS="${TGI_MAX_TOTAL_TOKENS}"
    +else
    +  export TGI_MAX_TOTAL_TOKENS 256
    +fi
    +
    +TGI_MAX_BATCH_PREFILL_TOKENS=$(( TGI_MAX_BATCH_SIZE*TGI_MAX_INPUT_TOKENS ))
    +
    +text-generation-launcher --port 8080 \
    +  --max-concurrent-requests ${TGI_MAX_CONCURRENT_REQUESTS} \
    +  --max-batch-size ${TGI_MAX_BATCH_SIZE} \
    +  --max-batch-prefill-tokens ${TGI_MAX_BATCH_PREFILL_TOKENS} \
    +  --max-input-tokens ${TGI_MAX_INPUT_TOKENS} \
    +  --max-total-tokens ${TGI_MAX_TOTAL_TOKENS} \
    +  --model-id ${MODEL_ID}
    +
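Review note on the `else` branches in the entrypoint diff: in Bash, `export NAME 4` does not assign anything. It passes `4` to `export` as a second name, which Bash rejects as an invalid identifier, so the variable stays unset when the caller did not provide it. The arithmetic for `TGI_MAX_BATCH_PREFILL_TOKENS` would then treat the unset names as 0. A minimal sketch of the same defaults using `${VAR:-default}` parameter expansion might look like the following; the function name `resolve_tgi_defaults` is hypothetical and not part of the commit, and the default values are copied from the diff.

```shell
#!/bin/bash
# Sketch only: resolve_tgi_defaults is a hypothetical helper, not part of the commit.
# Default values (4, 1, 128, 256) are the ones that appear in the diff.
resolve_tgi_defaults() {
  # ${VAR:-default} keeps a non-empty caller-supplied value, else uses the default.
  export TGI_MAX_CONCURRENT_REQUESTS="${TGI_MAX_CONCURRENT_REQUESTS:-4}"
  export TGI_MAX_BATCH_SIZE="${TGI_MAX_BATCH_SIZE:-1}"
  export TGI_MAX_INPUT_TOKENS="${TGI_MAX_INPUT_TOKENS:-128}"
  export TGI_MAX_TOTAL_TOKENS="${TGI_MAX_TOTAL_TOKENS:-256}"

  # Same product as in the diff: batch size times input tokens.
  TGI_MAX_BATCH_PREFILL_TOKENS=$(( TGI_MAX_BATCH_SIZE * TGI_MAX_INPUT_TOKENS ))
}
```

With none of the variables set, calling `resolve_tgi_defaults` leaves `TGI_MAX_BATCH_PREFILL_TOKENS` at 128 (1 times 128). Quoting the expansions in the launcher call, for example `--max-batch-size "${TGI_MAX_BATCH_SIZE}"`, would also keep an empty value from silently dropping an argument.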