System Info
CUDA: 12.4
System: Ubuntu 22.04.5 LTS
GPU: 4 x A10 (23 GB each)
Image: xprobe/xinference:latest
image ID: 7a1223f1d698
vllm: 0.6.0
vllm-flash-attn: 2.6.1
Running Xinference with Docker?
Version info
version 0.16.1
pip packages' information:
accelerate 0.34.0
aiofiles 23.2.1
aiohappyeyeballs 2.4.0
aiohttp 3.10.5
aioprometheus 23.12.0
aiosignal 1.3.1
aliyun-python-sdk-core 2.16.0
aliyun-python-sdk-kms 2.16.5
altair 5.4.1
annotated-types 0.7.0
anthropic 0.37.1
antlr4-python3-runtime 4.9.3
anyio 4.4.0
argcomplete 3.5.1
async-timeout 4.0.3
attrdict 2.0.1
attrs 24.2.0
audioread 3.0.1
auto_gptq 0.7.1
autoawq 0.2.5
autoawq_kernels 0.0.6
av 13.1.0
bcrypt 4.2.0
beautifulsoup4 4.12.3
bitsandbytes 0.44.1
black 24.10.0
boto3 1.28.64
botocore 1.31.85
cdifflib 1.2.6
certifi 2019.11.28
cffi 1.17.1
chardet 3.0.4
charset-normalizer 3.3.2
chattts 0.2.0
click 8.1.7
cloudpickle 3.0.0
colorama 0.4.6
coloredlogs 15.0.1
conformer 0.3.2
contourpy 1.3.0
controlnet-aux 0.0.7
crcmod 1.7
cryptography 43.0.3
cycler 0.12.1
Cython 3.0.11
datamodel-code-generator 0.26.2
datasets 2.21.0
dbus-python 1.2.16
decorator 5.1.1
DeepCache 0.1.1
diffusers 0.31.0
dill 0.3.8
diskcache 5.6.3
distro 1.9.0
distro-info 0.23+ubuntu1.1
dnspython 2.7.0
ecdsa 0.19.0
editdistance 0.8.1
einops 0.8.0
einx 0.3.0
email_validator 2.2.0
encodec 0.1.1
eva-decord 0.6.1
exceptiongroup 1.2.2
fastapi 0.112.2
ffmpy 0.4.0
filelock 3.15.4
FlagEmbedding 1.2.11
flashinfer 0.1.6+cu124torch2.4
flatbuffers 24.3.25
fonttools 4.54.1
frozendict 2.4.6
frozenlist 1.4.1
fsspec 2024.6.1
funasr 1.1.12
fvcore 0.1.5.post20221221
gdown 5.2.0
gekko 1.2.1
genson 1.3.0
gguf 0.9.1
gradio 4.26.0
gradio_client 0.15.1
h11 0.14.0
hf_transfer 0.1.8
hiredis 3.0.0
httpcore 1.0.5
httptools 0.6.1
httpx 0.27.2
huggingface-hub 0.24.6
humanfriendly 10.0
hydra-core 1.3.2
HyperPyYAML 1.2.2
idna 2.8
imageio 2.36.0
imageio-ffmpeg 0.5.1
importlib_metadata 8.4.0
importlib_resources 6.4.5
inflect 5.6.2
interegular 0.3.3
iopath 0.1.10
isort 5.13.2
jaconv 0.4.0
jamo 0.4.1
jieba 0.42.1
Jinja2 3.1.4
jiter 0.5.0
jj-pytorchvideo 0.1.5
jmespath 0.10.0
joblib 1.4.2
jsonschema 4.23.0
jsonschema-specifications 2023.12.1
kaldiio 2.18.0
kiwisolver 1.4.7
lark 1.2.2
lazy_loader 0.4
libnacl 2.1.0
librosa 0.10.2.post1
lightning 2.4.0
lightning-utilities 0.11.8
litellm 1.50.4
llama_cpp_python 0.2.90
llvmlite 0.43.0
lm-format-enforcer 0.10.6
loguru 0.7.2
loralib 0.1.2
markdown-it-py 3.0.0
MarkupSafe 2.1.5
matplotlib 3.9.2
mdurl 0.1.2
mistral_common 1.3.4
modelscope 1.17.1
mpmath 1.3.0
msgpack 1.0.8
msgspec 0.18.6
multidict 6.0.5
multiprocess 0.70.16
mypy-extensions 1.0.0
narwhals 1.10.0
natsort 8.4.0
nemo_text_processing 1.0.2
nest-asyncio 1.6.0
networkx 3.3
numba 0.60.0
numpy 1.26.4
nvidia-cublas-cu12 12.1.3.1
nvidia-cuda-cupti-cu12 12.1.105
nvidia-cuda-nvrtc-cu12 12.1.105
nvidia-cuda-runtime-cu12 12.1.105
nvidia-cudnn-cu12 9.1.0.70
nvidia-cufft-cu12 11.0.2.54
nvidia-curand-cu12 10.3.2.106
nvidia-cusolver-cu12 11.4.5.107
nvidia-cusparse-cu12 12.1.0.106
nvidia-ml-py 12.560.30
nvidia-nccl-cu12 2.20.5
nvidia-nvjitlink-cu12 12.6.68
nvidia-nvtx-cu12 12.1.105
omegaconf 2.3.0
onnxruntime-gpu 1.16.0
openai 1.52.2
opencv-contrib-python-headless 4.10.0.84
opencv-python 4.10.0.84
optimum 1.23.2
orjson 3.10.10
ormsgpack 1.6.0
oss2 2.19.0
outlines 0.0.46
packaging 24.1
pandas 2.2.2
parameterized 0.9.0
partial-json-parser 0.2.1.1.post4
passlib 1.7.4
pathspec 0.12.1
peft 0.13.2
pillow 10.4.0
pip 24.2
platformdirs 4.3.6
plumbum 1.9.0
pooch 1.8.2
portalocker 2.10.1
prometheus_client 0.20.0
prometheus-fastapi-instrumentator 7.0.0
protobuf 5.28.0
psutil 6.0.0
py-cpuinfo 9.0.0
pyairports 2.1.1
pyarrow 17.0.0
pyasn1 0.6.1
pybase16384 0.3.7
pycountry 24.6.1
pycparser 2.22
pycryptodome 3.21.0
pydantic 2.8.2
pydantic_core 2.20.1
pydub 0.25.1
Pygments 2.18.0
PyGObject 3.36.0
pynini 2.1.5
pynndescent 0.5.13
pyparsing 3.2.0
PySocks 1.7.1
python-apt 2.0.1+ubuntu0.20.4.1
python-dateutil 2.9.0.post0
python-dotenv 1.0.1
python-jose 3.3.0
python-multipart 0.0.12
pytorch-lightning 2.4.0
pytorch-wpe 0.0.1
pytz 2024.1
PyYAML 6.0.2
pyzmq 26.2.0
quantile-python 1.1
qwen-vl-utils 0.0.8
ray 2.35.0
redis 5.2.0
referencing 0.35.1
regex 2024.7.24
requests 2.32.3
requests-unixsocket 0.2.0
rich 13.9.3
rouge 1.0.1
rpds-py 0.20.0
rpyc 6.0.1
rsa 4.9
ruamel.yaml 0.18.6
ruamel.yaml.clib 0.2.12
ruff 0.7.1
s3transfer 0.7.0
sacremoses 0.1.1
safetensors 0.4.4
scikit-image 0.24.0
scikit-learn 1.5.2
scipy 1.14.1
semantic-version 2.10.0
sentence-transformers 3.2.1
sentencepiece 0.2.0
setuptools 75.2.0
sglang 0.3.4.post1
shellingham 1.5.4
six 1.14.0
sniffio 1.3.1
soundfile 0.12.1
soupsieve 2.6
soxr 0.5.0.post1
sse-starlette 2.1.3
starlette 0.38.4
sympy 1.13.2
tabulate 0.9.0
tblib 3.0.0
tensorboardX 2.6.2.2
tensorizer 2.9.0
termcolor 2.5.0
threadpoolctl 3.5.0
tifffile 2024.9.20
tiktoken 0.7.0
timm 1.0.11
tokenizers 0.20.1
toml 0.10.2
tomli 2.0.2
tomlkit 0.12.0
torch 2.4.0
torch-complex 0.4.4
torchaudio 2.4.0
torchmetrics 1.5.1
torchvision 0.19.0
tqdm 4.66.5
transformers 4.45.2
transformers-stream-generator 0.0.5
triton 3.0.0
typer 0.11.1
typing_extensions 4.12.2
tzdata 2024.1
umap-learn 0.5.6
unattended-upgrades 0.1
urllib3 2.0.7
uvicorn 0.30.6
uvloop 0.20.0
vector-quantize-pytorch 1.18.5
verovio 4.3.1
vllm 0.6.0
vllm-flash-attn 2.6.1
vocos 0.1.0
watchfiles 0.24.0
websockets 11.0.3
WeTextProcessing 1.0.3
wget 3.2
wheel 0.34.2
wrapt 1.16.0
xformers 0.0.27.post2
xinference 0.16.1
xoscar 0.3.3
xxhash 3.5.0
yacs 0.1.8
yarl 1.9.9
zipp 3.20.1
zmq 0.0.0
zstandard 0.23.0
The command used to start Xinference
docker run -d -p 9997:9997 -e XINFERENCE_HOME=/data -e XINFERENCE_MODEL_SRC=modelscope -v /opt/dlami/nvme/models:/data --gpus all xprobe/xinference:latest xinference-local -H 0.0.0.0
Reproduction
command
xinference launch -en vllm --size-in-billions 72 --model-format awq --gpu-idx 0,1,2,3 --n-gpu 4 --quantization Int4 -n qwen2.5-instruct --max_model_len 4096 --gpu_memory_utilization 0.99 --NCCL_DEBUG=INFO
result
Traceback (most recent call last):
File "/usr/local/bin/xinference", line 8, in
sys.exit(cli())
File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1157, in call
return self.main(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1078, in main
rv = self.invoke(ctx)
File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1688, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1434, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 783, in invoke
return __callback(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/click/decorators.py", line 33, in new_func
return f(get_current_context(), *args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/xinference/deploy/cmdline.py", line 845, in model_launch
kwargs[ctx.args[i][2:]] = handle_click_args_type(ctx.args[i + 1])
IndexError: list index out of range
Expected behavior
The qwen2.5-instruct AWQ Int4 model should launch successfully on my device.
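A possible explanation, offered as a guess from the traceback rather than a confirmed diagnosis: the failing line reads ctx.args[i + 1], which suggests the extra launch options are consumed as "--key value" pairs. The command above passes --NCCL_DEBUG=INFO as a single token, so the list of extra arguments has an odd length and the last lookup runs past the end. The sketch below reproduces that pattern with two hypothetical helpers, parse_pairwise and parse_tolerant; neither is Xinference code, and the loop shape is an assumption inferred from the single line shown in the traceback.

```python
from typing import Dict, List


def parse_pairwise(args: List[str]) -> Dict[str, str]:
    """Hypothetical stand-in for the loop in the traceback.

    Assumes every option arrives as two tokens: `--key` followed by `value`.
    """
    kwargs: Dict[str, str] = {}
    for i in range(0, len(args), 2):
        # With an odd number of tokens, args[i + 1] does not exist
        # on the last iteration and Python raises IndexError.
        kwargs[args[i][2:]] = args[i + 1]
    return kwargs


def parse_tolerant(args: List[str]) -> Dict[str, str]:
    """Hypothetical variant that accepts both `--key value` and `--key=value`."""
    kwargs: Dict[str, str] = {}
    i = 0
    while i < len(args):
        token = args[i]
        if not token.startswith("--"):
            raise ValueError(f"expected an option starting with '--', got {token!r}")
        if "=" in token:
            # Single-token form: split on the first '=' only.
            key, value = token[2:].split("=", 1)
            i += 1
        elif i + 1 < len(args):
            # Two-token form: the next token is the value.
            key, value = token[2:], args[i + 1]
            i += 2
        else:
            raise ValueError(f"missing value for option {token!r}")
        kwargs[key] = value
    return kwargs


# Extra (non-declared) options from the launch command in this report.
extra_args = [
    "--max_model_len", "4096",
    "--gpu_memory_utilization", "0.99",
    "--NCCL_DEBUG=INFO",
]
```

If this reading is right, writing every extra option as two space-separated tokens should avoid the IndexError. Separately, NCCL_DEBUG is normally read by NCCL as an environment variable, so setting it on the container (for example with docker run -e NCCL_DEBUG=INFO) may be the more appropriate way to enable it than passing it as a launch option.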