[Feature Request] Default backend should be changed to tilelang instead of tvm before the v0.0.1 release
#252
Comments
Refer to release plan #150.
In most benchmark cases, the tl backend matches or outperforms the tir backend. With dequantization at m=32 or m=128, tl is slower than tir, mainly because block reduction is not yet implemented for tl. To reproduce the results:

#!/usr/bin/env bash
set -euo pipefail
test_shapes="$(python3 <<EOF
import json
benchmark_shapes = [
(1, 16384, 16384),
(32, 16384, 16384),
(128, 16384, 16384),
(512, 16384, 16384),
(16384, 16384, 16384),
]
op_configs = [
{"A_dtype": "float16", "W_dtype": "float16", "accum_dtype": "float16", "out_dtype": "float16"},
{"A_dtype": "float16", "W_dtype": "int4", "accum_dtype": "float16", "out_dtype": "float16", "with_scaling": False},
{"A_dtype": "float16", "W_dtype": "int4", "accum_dtype": "float16", "out_dtype": "float16", "with_scaling": True, "group_size": -1},
{"A_dtype": "int8", "W_dtype": "int8", "accum_dtype": "int32", "out_dtype": "int8"},
{"A_dtype": "int8", "W_dtype": "int2", "accum_dtype": "int32", "out_dtype": "int8"},
]
op_config = "MatmulConfig"
op_class = "Matmul"
configs = []
for shape in benchmark_shapes:
    for config in op_configs:
        input_args = list(shape)
        input_args.append(config["A_dtype"])
        input_args.append(config["W_dtype"])
        input_args.append(config["out_dtype"])
        input_args.append(config["accum_dtype"])
        input_args.append("nt")  # layout
        input_args.append(False)  # with_bias
        input_args.append(-1 if "group_size" not in config else config["group_size"])
        input_args.append(False if "with_scaling" not in config else config["with_scaling"])
        input_args.append(False if "with_zeros" not in config else config["with_zeros"])
        input_args.append(None if "zeros_mode" not in config else config["zeros_mode"])
        configs.append([op_config, op_class, input_args])
print(json.dumps(configs))
EOF
)"
echo "Running benchmark with test shapes:"
python3 -c "import json; configs = json.loads('$test_shapes'); [print(c) for c in configs]"
mkdir -p benchmark_logs
# backends=("tir" "tl")
backends=("tl")
for backend in "${backends[@]}"; do
log_file="benchmark_logs/${backend}_benchmark.log"
echo "Running benchmark for backend '${backend}'"
cmd="python ./benchmark/operators/benchmark_bitblas_matmul.py --backend ${backend} --test_shapes '${test_shapes}'"
echo "Running command: $cmd"
bash -c "$cmd 2>&1 | tee ${log_file}"
echo "Logs for backend '${backend}' written to ${log_file}"
done
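
For orientation, the embedded Python in the script above expands every (shape, config) pair into one positional argument list and serializes the result as JSON. The sketch below reproduces that expansion for a single entry. `build_args` is a hypothetical helper written for illustration only and is not part of BitBLAS; the argument order is copied from the script, not from the library's documentation.

```python
import json


def build_args(shape, config):
    """Hypothetical helper: mirror the argument order used in the script above."""
    return list(shape) + [
        config["A_dtype"],
        config["W_dtype"],
        config["out_dtype"],
        config["accum_dtype"],
        "nt",                                # layout
        False,                               # with_bias
        config.get("group_size", -1),        # -1 when the config omits it
        config.get("with_scaling", False),
        config.get("with_zeros", False),
        config.get("zeros_mode"),            # None when the config omits it
    ]


# One of the dequantize cases from the script: m=32, int4 weights with scaling.
config = {
    "A_dtype": "float16",
    "W_dtype": "int4",
    "accum_dtype": "float16",
    "out_dtype": "float16",
    "with_scaling": True,
    "group_size": -1,
}
entry = ["MatmulConfig", "Matmul", build_args((32, 16384, 16384), config)]
print(json.dumps(entry))
# ["MatmulConfig", "Matmul", [32, 16384, 16384, "float16", "int4", "float16", "float16", "nt", false, -1, true, false, null]]
```

With 5 shapes and 5 operator configs, the script passes 25 such entries to `--test_shapes`.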
TL with Split K support performance:
Let's open a new pull request to change the default backend to tilelang.
Closed, as this was changed in PR #270.
We propose changing the default backend from tvm to tilelang before the v0.0.1 release. The tilelang backend has demonstrated compatibility with all current operators (e.g., Matmul, Flash Attention) and offers significant performance advantages.
TODO Items