
[Usage]: File Access Error When Using RunAI Model Streamer with S3 in VLLM #12311

Open · nskumz opened this issue Jan 22, 2025 · 4 comments · May be fixed by #12353
Labels: usage (How to use vllm)
nskumz commented Jan 22, 2025

Your current environment

I am encountering a persistent issue when attempting to serve a model from an S3 bucket using the vllm serve command with the --load-format runai_streamer option. Despite having proper access to the S3 bucket and all required files being present, the process fails with a "File access error." Below are the details of the issue:

Command Used:
vllm serve s3://hip-general/benchmark-model-loading/ --load-format runai_streamer

Error Message:
Exception: Could not send runai_request to libstreamer due to: b'File access error'

Environment Details:
vLLM version: 0.6.6
Python version: 3.12
RunAI Model Streamer version: 0.11.2
S3 Region: us-west-2


Files in S3 Bucket:
config.json
generation_config.json
model-00001-of-00004.safetensors
model-00002-of-00004.safetensors
model-00003-of-00004.safetensors
model-00004-of-00004.safetensors
model.safetensors.index.json
special_tokens_map.json
tokenizer.json
tokenizer_config.json
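
For reference, the bucket contents listed above can be sanity-checked with the AWS CLI (assuming it is configured with the same credentials the streamer will use):

aws s3 ls s3://hip-general/benchmark-model-loading/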

My deployment file is:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: benchmark-model-8b
  namespace: workload
spec:
  replicas: 1
  selector:
    matchLabels:
      app: benchmark-model-8b
  strategy:
    type: Recreate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: benchmark-model-8b
    spec:
      containers:
      - command:
        - sh
        - -c
        - exec tail -f /dev/null
        env:
        - name: HF_HOME
          value: /huggingface
        - name: HUGGINGFACE_HUB_CACHE
          value: /huggingface/hub
        - name: HF_HUB_ENABLE_HF_TRANSFER
          value: "False"
        - name: HUGGING_FACE_HUB_TOKEN
          value: ""
        image: vllm/vllm-openai:v0.6.6
        imagePullPolicy: IfNotPresent
        name: benchmark-model-8b
        ports:
        - containerPort: 8888
          name: http
          protocol: TCP
        resources:
          limits:
            nvidia.com/gpu: "1"
          requests:
            cpu: "5"
            memory: 128Gi
        securityContext:
          capabilities:
            add:
            - SYS_ADMIN
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /huggingface
          name: hf-volume
        - mountPath: /dev/shm
          name: dshm
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
      volumes:
      - name: hf-volume
        persistentVolumeClaim:
          claimName: benchmark-model-pvc
      - emptyDir:
          medium: Memory
          sizeLimit: 90Gi
        name: dshm

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.

nskumz commented Jan 22, 2025

Could you please pick this one up, @omer-dayan?


noa-neria commented Jan 22, 2025

Hi,

Try passing the credentials as environment variables on the command line:
AWS_ACCESS_KEY_ID=my_key AWS_SECRET_ACCESS_KEY=my_secret vllm serve s3://hip-general/benchmark-model-loading/ --load-format runai_streamer

Our implementation uses the AWS S3 C++ SDK, which applies AWS's default credential provider chain and is aligned with the AWS CLI.
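
Since your setup runs inside a Kubernetes Deployment, the same variables could instead be set on the container itself; a minimal sketch, assuming a hypothetical Secret named s3-credentials in the same namespace:

env:
- name: AWS_ACCESS_KEY_ID
  valueFrom:
    secretKeyRef:
      name: s3-credentials        # hypothetical Secret holding the S3 keys
      key: access-key-id
- name: AWS_SECRET_ACCESS_KEY
  valueFrom:
    secretKeyRef:
      name: s3-credentials
      key: secret-access-key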

To diagnose the problem, you can check the AWS trace logs. Add the following environment variable to the command line:
RUNAI_STREAMER_S3_TRACE=1

Trace logs are written to a file in the location of the executable (where vllm is running).
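
For example, a trace-enabled run with explicit credentials might look like this (placeholder key values):

RUNAI_STREAMER_S3_TRACE=1 AWS_ACCESS_KEY_ID=my_key AWS_SECRET_ACCESS_KEY=my_secret vllm serve s3://hip-general/benchmark-model-loading/ --load-format runai_streamer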

There can be various reasons why the AWS CLI succeeds but the SDK does not, for example:

  • Credentials issues
    The SDK may not be resolving credentials the same way the AWS CLI does.

    • Ensure the environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN) are correctly set if used.
    • By default, the SDK uses the default profile unless another is specified. If using the shared credentials file, ensure the AWS_PROFILE environment variable is set correctly, or that the default profile is configured correctly.
    • If using an IAM role (e.g., on EC2), ensure the instance or container has the correct permissions attached.
    • If using a credentials file, the SDK might not be looking in the same location as the CLI. Pass the correct location, e.g. AWS_SHARED_CREDENTIALS_FILE=~/.aws/credentials
    • If using a credentials file, verify its format, since the SDK is stricter than the CLI. Avoid trailing spaces or malformed entries.
  • Region mismatch
    Check the trace logs for a line similar to Resolved region: us-east-1 and compare it to the CLI region (aws configure get region). See the sketch below for pinning the region explicitly.
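
If the two regions differ, one option is to pin the region on the command line; a sketch using the standard AWS variables (which of the two a given SDK version honors can vary):

AWS_REGION=us-west-2 AWS_DEFAULT_REGION=us-west-2 vllm serve s3://hip-general/benchmark-model-loading/ --load-format runai_streamer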


nskumz commented Jan 23, 2025

Thanks for the quick response @noa-neria. As you suggested, I configured it using this command:
AWS_ACCESS_KEY_ID="ASIAXYKJSZPEDFHGSUHDF2" AWS_SECRET_ACCESS_KEY="8CltEJHjedfjkjfhWDJHHuue/h" vllm serve s3://hip-general/benchmark-model-loading/ --load-format runai_streamer

With this, I get the error below:
huggingface_hub.errors.HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name': '/tmp/tmpx3kfctt4'. Use repo_type argument if needed.

I have also passed the required arguments such as --model <>, but I still get this error.
Could you please advise how to resolve it?

omer-dayan (Contributor) commented

Hey @nskumz
Thanks for the report!

It is indeed a bug.
I opened a PR for fixing it:
#12353

For now, a workaround is to remove the trailing "/" at the end of the path:
s3://hip-general/benchmark-model-loading/ -> s3://hip-general/benchmark-model-loading
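
Applied to the original command, that becomes:

vllm serve s3://hip-general/benchmark-model-loading --load-format runai_streamer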

Sorry for the inconvenience.
