[Dev] Enhance Operator Cache to support multi-thread environments #205
This pull request introduces several changes to improve thread safety, enhance scheduler functionality, and refactor code for better readability and maintainability. The most important changes include adding a lock to the `OperatorCache` class, modifying the `ThreadPoolExecutor` to use a variable number of workers, and introducing a new fine-grained matrix multiplication scheduler.

**Thread Safety Enhancements:**

- `bitblas/cache/operator.py`: Added a `cache_locker` using `threading.RLock` to synchronize access to the cache in methods such as `add`, `get`, `clear`, and `save_into_database` (see the sketch below).
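
A minimal sketch of the locking pattern, assuming a dictionary-backed cache; apart from `cache_locker` and the method names listed above, the attribute names and method bodies are illustrative, not the actual bitblas code:

```python
import threading

class OperatorCache:
    """Sketch of an operator cache guarded by a re-entrant lock.

    A re-entrant lock (RLock) is used so that a method holding the lock
    can call another locked method on the same thread without deadlocking.
    """

    def __init__(self):
        self.cache = {}  # illustrative backing store
        self.cache_locker = threading.RLock()

    def add(self, config, op_inst):
        with self.cache_locker:
            self.cache[config] = op_inst

    def get(self, config):
        with self.cache_locker:
            return self.cache.get(config)

    def clear(self):
        with self.cache_locker:
            self.cache.clear()

    def save_into_database(self, database_path=None):
        # Snapshot under the lock so concurrent add()/clear() calls cannot
        # mutate the cache while it is being serialized.
        with self.cache_locker:
            snapshot = dict(self.cache)
        # ... persist `snapshot` to `database_path` (omitted) ...
```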
**Scheduler Improvements:**

- `bitblas/base/utils.py`: Modified the `ThreadPoolExecutor` to use a variable number of workers (`max_workers`) instead of a fixed count of 4 (see the first sketch after this list).
- `bitblas/ops/base_scheduler.py`: Added a `get_hardware_aware_configs` method that raises `NotImplementedError`, making hardware-aware tuning an explicit, overridable contract.
- `bitblas/ops/general_matmul/tilelang/dense/matmul_simt.py`: Introduced a new `MatmulFineGrainSIMTScheduler` class for fine-grained SIMT matrix multiplication scheduling (see the second sketch after this list).
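
First, a sketch of the variable-worker pool change; the function name `compile_candidates` and the CPU-count default are assumptions made for illustration, not the actual bitblas API:

```python
import os
from concurrent.futures import ThreadPoolExecutor

def compile_candidates(candidates, build_fn, max_workers=None):
    # Previously the pool was fixed at 4 workers; letting callers pass
    # max_workers (defaulting here to the CPU count) lets compilation
    # scale with the available hardware.
    if max_workers is None:
        max_workers = os.cpu_count() or 4
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(build_fn, candidates))
```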
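Second, a sketch of the base-scheduler contract and how the new SIMT scheduler would slot into it; the method signature (`arch`, `topk`) is assumed for illustration:

```python
class BaseScheduler:
    def get_hardware_aware_configs(self, arch=None, topk=10):
        # Schedulers that support hardware-aware tuning override this.
        # Raising NotImplementedError in the base class makes the contract
        # explicit instead of failing later with an AttributeError.
        raise NotImplementedError(
            f"{type(self).__name__} does not support hardware-aware tuning")

class MatmulFineGrainSIMTScheduler(BaseScheduler):
    def get_hardware_aware_configs(self, arch=None, topk=10):
        # Enumerate SIMT-friendly tile configurations for `arch` (omitted).
        ...
```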
**Code Refactoring:**

- `bitblas/ops/general_matmul/tilelang/dense/matmul_tensorcore.py`: Renamed from `matmul.py`; added imports and methods for hardware-aware configurations.
- `bitblas/ops/operator.py`: Refactored multiple methods for better readability, including `apply_fast_tuning`, `hardware_aware_finetune`, and `_build_default_module`.
**Additional Changes:**

- `bitblas/ops/general_matmul/tilelang/dense/__init__.py`: Updated imports to include `matmul_simt` and `matmul_tensorcore`.
- `bitblas/ops/operator.py`: Added an import for `tl_apply_and_build` from `bitblas.tl.tuner` (see the sketch below).
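
For reference, a sketch of what the import-only changes might look like; the module paths and `tl_apply_and_build` come from this description, while the exact symbols re-exported from the new modules are assumptions:

```python
# bitblas/ops/general_matmul/tilelang/dense/__init__.py
from .matmul_simt import MatmulFineGrainSIMTScheduler  # noqa: F401
from . import matmul_tensorcore  # noqa: F401  (re-exported symbols assumed)

# bitblas/ops/operator.py
from bitblas.tl.tuner import tl_apply_and_build
```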