[Dev] Enhance Operator Cache to support multi-thread environments #205
This pull request introduces several changes to improve thread safety, enhance scheduler functionality, and refactor code for better readability and maintainability. The most important changes include adding a lock to the `OperatorCache` class, modifying the `ThreadPoolExecutor` to use a variable number of workers, and introducing a new fine-grained matrix multiplication scheduler.

**Thread Safety Enhancements:**

- `bitblas/cache/operator.py`: Added a `cache_locker` using `threading.RLock` to synchronize access to the cache in methods such as `add`, `get`, `clear`, and `save_into_database` (see the sketch below).
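
A minimal sketch of the locking pattern, assuming a dictionary-backed cache; apart from `cache_locker` and the method names listed above, the attribute names and method bodies are illustrative, not the actual bitblas code:

```python
import threading

class OperatorCache:
    """Sketch of an operator cache guarded by a re-entrant lock.

    A re-entrant lock (RLock) is used so that a method holding the lock
    can call another locked method on the same thread without deadlocking.
    """

    def __init__(self):
        self.cache = {}  # illustrative backing store
        self.cache_locker = threading.RLock()

    def add(self, config, op_inst):
        with self.cache_locker:
            self.cache[config] = op_inst

    def get(self, config):
        with self.cache_locker:
            return self.cache.get(config)

    def clear(self):
        with self.cache_locker:
            self.cache.clear()

    def save_into_database(self, database_path=None):
        # Snapshot under the lock so concurrent add()/clear() calls cannot
        # mutate the cache while it is being serialized.
        with self.cache_locker:
            snapshot = dict(self.cache)
        # ... persist `snapshot` to `database_path` (omitted) ...
```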
**Scheduler Improvements:**

- `bitblas/base/utils.py`: Modified the `ThreadPoolExecutor` to use a variable number of workers (`max_workers`) instead of a fixed count of 4 (see the first sketch after this list).
- `bitblas/ops/base_scheduler.py`: Added a `get_hardware_aware_configs` method that raises `NotImplementedError`, making hardware-aware tuning an explicit, overridable contract.
- `bitblas/ops/general_matmul/tilelang/dense/matmul_simt.py`: Introduced a new `MatmulFineGrainSIMTScheduler` class for fine-grained SIMT matrix multiplication scheduling (see the second sketch after this list).
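
First, a sketch of the variable-worker pool change; the function name `compile_candidates` and the CPU-count default are assumptions made for illustration, not the actual bitblas API:

```python
import os
from concurrent.futures import ThreadPoolExecutor

def compile_candidates(candidates, build_fn, max_workers=None):
    # Previously the pool was fixed at 4 workers; letting callers pass
    # max_workers (defaulting here to the CPU count) lets compilation
    # scale with the available hardware.
    if max_workers is None:
        max_workers = os.cpu_count() or 4
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(build_fn, candidates))
```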
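Second, a sketch of the base-scheduler contract and how the new SIMT scheduler would slot into it; the method signature (`arch`, `topk`) is assumed for illustration:

```python
class BaseScheduler:
    def get_hardware_aware_configs(self, arch=None, topk=10):
        # Schedulers that support hardware-aware tuning override this.
        # Raising NotImplementedError in the base class makes the contract
        # explicit instead of failing later with an AttributeError.
        raise NotImplementedError(
            f"{type(self).__name__} does not support hardware-aware tuning")

class MatmulFineGrainSIMTScheduler(BaseScheduler):
    def get_hardware_aware_configs(self, arch=None, topk=10):
        # Enumerate SIMT-friendly tile configurations for `arch` (omitted).
        ...
```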
**Code Refactoring:**

- `bitblas/ops/general_matmul/tilelang/dense/matmul_tensorcore.py`: Renamed from `matmul.py`; added imports and methods for hardware-aware configurations.
- `bitblas/ops/operator.py`: Refactored multiple methods for better readability, including `apply_fast_tuning`, `hardware_aware_finetune`, and `_build_default_module`.
**Additional Changes:**

- `bitblas/ops/general_matmul/tilelang/dense/__init__.py`: Updated imports to include `matmul_simt` and `matmul_tensorcore`.
- `bitblas/ops/operator.py`: Added an import for `tl_apply_and_build` from `bitblas.tl.tuner` (see the sketch below).
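
For reference, a sketch of what the import-only changes might look like; the module paths and `tl_apply_and_build` come from this description, while the exact symbols re-exported from the new modules are assumptions:

```python
# bitblas/ops/general_matmul/tilelang/dense/__init__.py
from .matmul_simt import MatmulFineGrainSIMTScheduler  # noqa: F401
from . import matmul_tensorcore  # noqa: F401  (re-exported symbols assumed)

# bitblas/ops/operator.py
from bitblas.tl.tuner import tl_apply_and_build
```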