
Crash on AVX instruction on Gemini Lake #12739

Open
rgov opened this issue Jan 23, 2025 · 4 comments
rgov commented Jan 23, 2025

I'm attempting to run the intelanalytics/ipex-llm-inference-cpp-xpu container on an Intel Celeron J4125 (Gemini Lake) CPU with integrated UHD Graphics 600 GPU.

When the ollama_llama_server program runs, it crashes due to an illegal instruction:

time=2025-01-23T17:46:10.943+08:00 level=INFO source=runner.go:963 msg="starting go runner"
time=2025-01-23T17:46:10.943+08:00 level=INFO source=runner.go:964 msg=system info="AVX = 1 | AVX_VNNI = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | AVX512_BF16 = 0 | FMA = 1 | NEON = 0 | SVE = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | RISCV_VECT = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | MATMUL_INT8 = 0 | LLAMAFILE = 1 | cgo(gcc)" threads=4
SIGILL: illegal instruction
PC=0x7f544807fd16 m=4 sigcode=2
signal arrived during cgo execution
instruction bytes: 0xc5 0xf9 0x6e 0xc3 0x8d 0x53 0x3 0xc5 0xf9 0xc4 0xc0 0x1 0xc5 0xf9 0xc4 0xc1

(gdb) bt
#0  0x00007ffff547fd16 in ggml_init () from /llm/ollama/libollama_ggml.so
#1  0x00007ffff7468422 in llama_backend_init () from /llm/ollama/libollama_llama.so

(gdb) x/4i $rip
=> 0x7ffff547fd16 <ggml_init+118>:	vmovd  %ebx,%xmm0
   0x7ffff547fd1a <ggml_init+122>:	lea    0x3(%rbx),%edx
   0x7ffff547fd1d <ggml_init+125>:	vpinsrw $0x1,%eax,%xmm0,%xmm0
   0x7ffff547fd22 <ggml_init+130>:	vpinsrw $0x2,%ecx,%xmm0,%xmm0

It looks like libollama_ggml.so is compiled with AVX instructions, but my processor doesn't support AVX. The faulting vmovd/vpinsrw instructions are VEX-encoded AVX forms (note the 0xc5 VEX prefix in the instruction bytes above). The "AVX = 1" log line is misleading: it reports how the library was compiled, not what the host CPU supports. Indeed, avx is absent from the CPU flags:

$ cat /proc/cpuinfo
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg cx16 xtpr pdcm sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave rdrand lahf_lm 3dnowprefetch intel_pt ssbd ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust smep erms mpx rdseed smap clflushopt sha_ni xsaveopt xsavec xgetbv1 dtherm ida arat pln pts rdpid md_clear arch_capabilities
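As a quick preflight (my own suggestion, not something the container does), the host can be checked for AVX before launching the server:

```shell
# Preflight sketch: verify the host CPU advertises AVX before
# launching an AVX-compiled binary. grep -w matches the whole
# word "avx", so it will not false-positive on "avx2".
if grep -qw avx /proc/cpuinfo; then
    echo "AVX supported"
else
    echo "AVX missing: an AVX build will die with SIGILL"
fi
```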

Do you intend to support CPUs without AVX? Or where can I find out how the files under /usr/local/lib/python3.11/dist-packages/bigdl/cpp/libs/ are compiled so I can rebuild them without AVX instructions?

rgov commented Jan 23, 2025

It looks like this was tracked by ollama/ollama#2187 and was fixed as of v0.5.2, whereas the container currently ships with v0.5.1.

There's a related commit in Ollama: ollama/ollama@667a2ba:

We build the GPU libraries with AVX enabled to ensure that if not all
layers fit on the GPU we get better performance in a mixed mode.
If the user is using a virtualization/emulation system that lacks AVX
this used to result in an illegal instruction error and crash before this
fix. ...

In ollama/ollama@4879a23 the restriction was removed.

sgwhat (Contributor) commented Jan 24, 2025

Hi @rgov, we are releasing IPEX-LLM Ollama v0.5.4, and we will inform you once it is complete.

rgov commented Jan 24, 2025

Thanks @sgwhat. You would have to build Ollama with CUSTOM_CPU_FLAGS='' to produce a runner without AVX instructions. I'm not sure of the invocation for building multiple runners with different processor-feature variants.
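Something like this might work for the no-AVX case (a hypothetical sketch, assuming the build honors CUSTOM_CPU_FLAGS from the environment; I haven't verified the exact invocation):

```shell
# Hypothetical sketch: build a baseline (no-AVX) runner by clearing
# CUSTOM_CPU_FLAGS, which otherwise carries feature flags such as avx.
# Assumes the Makefile forwards the variable; not verified.
CUSTOM_CPU_FLAGS='' make -j"$(nproc)"
```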

rgov commented Jan 24, 2025

I don't think it's going to work on Gemini Lake anyway: oneAPI requires "Intel® UHD Graphics for 11th generation Intel processors or newer," and according to Wikipedia this processor's graphics are generation 9.5.
