Benchmarking code for running quantized kernels from vLLM and other libraries

neuralmagic/quant_kernel_benchmarks

Example Usage

Run the benchmark (generates a .pkl file with the results)

python benchmark_kernels.py --act-type bfloat16 --kernels torch_fp16,machete,fbgemm_i4,marlin,gemlite model_bench

Plot the results

python plot/plot_normalized_runtime.py <generated_file>.pkl --highlight machete
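Inspecting the results file

The README does not document the layout of the generated .pkl file, so before writing custom analysis it can help to look at what the file contains. The snippet below is a minimal sketch using only the Python standard library. The helper name `describe_results` is hypothetical and not part of this repository, and the sketch makes no assumption about the stored structure beyond it being a pickled Python object. If the file holds objects from other libraries (for example PyTorch), those libraries must be importable when loading.

```python
import pickle
from pathlib import Path


def describe_results(path):
    """Return a short text summary of the top-level object in a pickle file.

    Hypothetical helper for illustration; it is not part of this repository.
    """
    with Path(path).open("rb") as f:
        # Only unpickle files you generated yourself: pickle can run code on load.
        obj = pickle.load(f)

    lines = [f"type: {type(obj).__name__}"]
    if isinstance(obj, dict):
        # Show the keys so you can see how the results are grouped.
        lines.append(f"keys: {sorted(map(str, obj))}")
    elif isinstance(obj, (list, tuple)):
        lines.append(f"length: {len(obj)}")
        if obj:
            lines.append(f"first element type: {type(obj[0]).__name__}")
    return "\n".join(lines)
```

Call it as print(describe_results("<generated_file>.pkl")), substituting the file name produced by the benchmark run.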
