# Sphinx build info version 1
# This file records the configuration used when building these files. When it is not found, a full rebuild will be done.
config: 580ffefb3400b8d12e26a069df89b475
tags: 645f666f9bcd5a90fca523b33c5a78b7

Machine-Learning System Exploration Tools
==========================================

MASE is a machine learning compiler based on PyTorch FX, maintained by researchers at Imperial College London. We provide a set of tools for inference and training optimization of state-of-the-art language and vision models. The following features are supported, among others:

- **Quantization Search**: mixed-precision quantization of any PyTorch model. We support `microscaling <https://arxiv.org/abs/2310.10537>`__ and other numerical formats, at various granularities.

- **Quantization-Aware Training (QAT)**: finetuning quantized models to minimize accuracy loss.

- **Hardware Generation**: automatic generation of high-performance FPGA accelerators for arbitrary PyTorch models, through the Emit Verilog flow.

- **Distributed Deployment**: automatic parallelization of models across distributed GPU clusters, based on the `Alpa <https://arxiv.org/abs/2201.12023>`__ algorithm.

For more details, refer to the `Tutorials <https://deepwok.github.io/mase/modules/documentation/tutorials.html>`_. If you enjoy using the framework, you can support us by starring the repository on `GitHub <https://github.com/DeepWok/mase>`__!

Efficient AI Optimization
----------------------------------------------------

MASE provides a set of composable, modular tools for optimizing AI models. They can be applied to inference, training, or both; they target a range of hardware, including CPUs, GPUs, and FPGAs; and they cover applications such as computer vision, natural language processing, and speech recognition.

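As a rough sketch of what such a flow can look like, a model is wrapped into a ``MaseGraph`` and optimized by chaining analysis and transform passes. The ``MaseGraph`` import location, the ``ToyNet`` model, the ``dummy_in`` argument and the quantization configuration keys below are illustrative assumptions rather than a definitive recipe; see the Machop API reference for the actual pass interfaces.

.. code-block:: python

    import torch
    from chop import MaseGraph
    from chop.passes.graph import (
        init_metadata_analysis_pass,
        add_common_metadata_analysis_pass,
        quantize_transform_pass,
    )

    # Toy model used only for illustration.
    class ToyNet(torch.nn.Module):
        def __init__(self):
            super().__init__()
            self.fc = torch.nn.Linear(16, 16)

        def forward(self, x):
            return torch.relu(self.fc(x))

    mg = MaseGraph(ToyNet())

    # Analysis passes annotate the traced FX graph with metadata; each pass
    # returns the (possibly updated) graph together with an info dictionary.
    mg, _ = init_metadata_analysis_pass(mg)
    mg, _ = add_common_metadata_analysis_pass(
        mg, pass_args={"dummy_in": {"x": torch.randn(1, 16)}}
    )

    # A transform pass then rewrites the graph, here swapping Linear layers
    # for fixed-point quantized equivalents (config keys are illustrative).
    quant_config = {
        "by": "type",
        "default": {"config": {"name": None}},
        "linear": {
            "config": {
                "name": "integer",
                "data_in_width": 8, "data_in_frac_width": 4,
                "weight_width": 8, "weight_frac_width": 4,
                "bias_width": 8, "bias_frac_width": 4,
            }
        },
    }
    mg, _ = quantize_transform_pass(mg, pass_args=quant_config)

Because passes consume and return the graph in a uniform way, they compose naturally into larger flows such as the quantization search mentioned above.
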
Hardware Generation
----------------------------------------------------

Machine learning accelerators have been used extensively to compute models with high performance and low power. Unfortunately, the development pace of ML models is much faster than the accelerator design cycle, leading to frequent changes in the hardware architecture requirements, rendering many accelerators obsolete. Existing design tools and frameworks can provide quick accelerator prototyping, but only for a limited range of models that fit into a single hardware device. With the emergence of large language models such as GPT-3, there is an increased need for hardware prototyping of large models within a many-accelerator system to ensure the hardware can scale with ever-growing model sizes.

.. image:: ../imgs/mase_overview.png
   :alt: logo
   :align: center

MASE provides an efficient and scalable approach for exploring accelerator systems for large ML models by mapping them directly onto a streaming accelerator system. Over a set of ML models, MASE can achieve better energy efficiency than GPUs when computing inference for recent transformer models.

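The sketch below, continuing from the graph ``mg`` prepared in the optimization sketch above, shows roughly where hardware generation sits in a pass pipeline. The ``emit_verilog_top_transform_pass`` name, its import location and its arguments are assumptions about the Emit Verilog flow rather than a documented interface.

.. code-block:: python

    from chop.passes.graph import (
        add_hardware_metadata_analysis_pass,
        emit_verilog_top_transform_pass,
    )

    # Hardware metadata records, per node, the information the Verilog
    # emitter needs (target toolchain, interface widths, parallelism, ...).
    mg, _ = add_hardware_metadata_analysis_pass(mg)

    # Emit a top-level hardware design for the annotated graph. In practice
    # the graph is normally quantized to fixed point first, as in the sketch
    # above; the pass name and its (empty) arguments are assumptions here.
    mg, _ = emit_verilog_top_transform_pass(mg)
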
Documentation
----------------------------------------------------

For more details, explore the documentation:

.. toctree::
   :maxdepth: 1
   :caption: Overview

   modules/documentation/installation
   modules/documentation/quickstart
   modules/documentation/tutorials
   modules/documentation/health
   modules/documentation/specifications

.. toctree::
   :maxdepth: 2
   :caption: Machop API

   modules/machop

.. toctree::
   :maxdepth: 1
   :caption: Mase Components

   modules/hardware/hardware_documentation

.. toctree::
   :maxdepth: 1
   :caption: Advanced Deep Learning Systems

   modules/adls_2024
   modules/adls_2023

Advanced Deep Learning Systems: 2023/2024
==========================================

The widespread adoption of deep learning methods has been largely driven by the availability of easy-to-use systems such as PyTorch and TensorFlow. However, it is less common for users to explore the internals of the libraries and understand how they function, as well as how to optimize the high-level code for hardware systems. When deep learning algorithms are deployed into custom hardware, they are often modified to run faster and more efficiently. This module will provide you with the basic concepts and principles of modern deep learning systems, and explore how optimizations can be applied from both the software and hardware aspects of the system stack.

Learning Outcomes
------------------------------

On successful completion of this module, you'll be able to:

1. Analyze the design principles of modern machine learning systems

2. Reason about the mapping of high-level Python code in PyTorch or TensorFlow onto actual hardware (such as GPUs and FPGAs)

3. Assess the potential benefits of software and hardware optimizations

4. Compare and contrast how various vision and language models can benefit from different optimizations and from being mapped to hardware

Syllabus
------------------------------

This module covers the following topics:

1. Introduction to modern ML systems and frameworks, ML models and their characteristics (Transformers, Convolutional Networks, 3D CNNs, Vision Transformers, Graph Neural Networks and Generative models such as VAEs and Diffusion Models) (3 hours)

2. Modern ML Compilers (including the concept of Computational Graphs, Parallelism and Graph-level optimization) (2 hours)

3. Model Compression (including Low Rank Approximation, Pruning, Quantization and Adaptive Compute), Hardware acceleration (including Commodity hardware, Custom hardware and MLPerf), Automated Machine Learning (including Network Architecture Search, Reinforcement Learning based NAS, Gradient-based NAS and Weight-sharing) (4 hours)

4. Deep Learning Training (including Backpropagation, Scalability, Data parallel vs. Model parallel and Multi-GPU/Node training) (2 hours) and Systems for various Deep Learning paradigms (including Federated Learning and Large Scale ML on the Cloud) (1 hour)

.. toctree::
   :maxdepth: 1
   :caption: Lab Materials

   labs_2023/lab1
   labs_2023/lab2
   labs_2023/lab3
   labs_2023/lab4-hardware
   labs_2023/lab4-software

.. toctree::
   :maxdepth: 1
   :caption: Additional Resources

   labs_2023/setup_docker_env

Advanced Deep Learning Systems: 2024/2025
=========================================

The widespread adoption of deep learning methods has been largely driven by the availability of easy-to-use systems such as PyTorch and TensorFlow. However, it is less common for users to explore the internals of the libraries and understand how they function, as well as how to optimize the high-level code for hardware systems. When deep learning algorithms are deployed into custom hardware, they are often modified to run faster and more efficiently. This module will provide you with the basic concepts and principles of modern deep learning systems, and explore how optimizations can be applied from both the software and hardware aspects of the system stack.

Learning Outcomes
------------------------------

On successful completion of this module, you'll be able to:

1. Analyze the design principles of modern machine learning systems

2. Reason about the mapping of high-level Python code in PyTorch or TensorFlow onto actual hardware (such as GPUs and FPGAs)

3. Assess the potential benefits of software and hardware optimizations

4. Compare and contrast how various vision and language models can benefit from different optimizations and from being mapped to hardware

Syllabus
------------------------------

This module covers the following topics:

1. Introduction to modern ML systems and frameworks, ML models and their characteristics (Transformers, Convolutional Networks, 3D CNNs, Vision Transformers, Graph Neural Networks and Generative models such as VAEs and Diffusion Models) (3 hours)

2. Modern ML Compilers (including the concept of Computational Graphs, Parallelism and Graph-level optimization) (2 hours)

3. Model Compression (including Low Rank Approximation, Pruning, Quantization and Adaptive Compute), Hardware acceleration (including Commodity hardware, Custom hardware and MLPerf), Automated Machine Learning (including Network Architecture Search, Reinforcement Learning based NAS, Gradient-based NAS and Weight-sharing) (4 hours)

4. Deep Learning Training (including Backpropagation, Scalability, Data parallel vs. Model parallel and Multi-GPU/Node training) (2 hours) and Systems for various Deep Learning paradigms (including Federated Learning and Large Scale ML on the Cloud) (1 hour)

.. toctree::
   :maxdepth: 1
   :caption: Lab Materials

   labs_2024/lab_0_introduction
   labs_2024/lab_1_compression
   labs_2024/lab_2_nas
   labs_2024/lab_3_mixed_precision_search
   labs_2024/lab4-hardware
   labs_2024/lab4-software

.. toctree::
   :maxdepth: 1
   :caption: Additional Resources

   labs_2024/setup_docker_env

chop.actions
====================

chop.actions.train
-------------------------

.. automodule:: chop.actions.train
   :members:
   :undoc-members:
   :show-inheritance:

chop.actions.test
-------------------------

.. automodule:: chop.actions.test
   :members:
   :undoc-members:
   :show-inheritance:

chop.actions.transform
-------------------------

.. automodule:: chop.actions.transform
   :members:
   :undoc-members:
   :show-inheritance:

chop.actions.search
---------------------------------

.. automodule:: chop.actions.search.search
   :members:
   :undoc-members:
   :show-inheritance:

chop.passes.graph.analysis.add\_metadata
========================================

add\_common\_metadata\_analysis\_pass
-------------------------------------

.. autofunction:: chop.passes.graph.analysis.add_metadata.add_common_metadata.add_common_metadata_analysis_pass

add\_software\_metadata\_analysis\_pass
---------------------------------------

.. autofunction:: chop.passes.graph.analysis.add_metadata.add_software_metadata.add_software_metadata_analysis_pass

add\_hardware\_metadata\_analysis\_pass
---------------------------------------

.. autofunction:: chop.passes.graph.analysis.add_metadata.add_hardware_metadata.add_hardware_metadata_analysis_pass

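A minimal usage sketch, assuming the usual MASE pass convention of ``pass(graph, pass_args)`` returning the updated graph and an info dictionary. The ``MaseGraph`` import location, the ``ToyNet`` model and the ``dummy_in`` key are illustrative assumptions; refer to the autogenerated signatures above for the authoritative interfaces.

.. code-block:: python

    import torch
    from chop import MaseGraph
    from chop.passes.graph.analysis.init_metadata import init_metadata_analysis_pass
    from chop.passes.graph.analysis.add_metadata.add_common_metadata import (
        add_common_metadata_analysis_pass,
    )
    from chop.passes.graph.analysis.add_metadata.add_software_metadata import (
        add_software_metadata_analysis_pass,
    )
    from chop.passes.graph.analysis.add_metadata.add_hardware_metadata import (
        add_hardware_metadata_analysis_pass,
    )

    # Toy model used only for illustration.
    class ToyNet(torch.nn.Module):
        def __init__(self):
            super().__init__()
            self.fc = torch.nn.Linear(4, 4)

        def forward(self, x):
            return torch.relu(self.fc(x))

    mg = MaseGraph(ToyNet())

    # init_metadata attaches an empty metadata entry to every node, then
    # each add_*_metadata pass fills in its own view of the graph.
    mg, _ = init_metadata_analysis_pass(mg)
    mg, _ = add_common_metadata_analysis_pass(
        mg, pass_args={"dummy_in": {"x": torch.randn(1, 4)}}
    )
    mg, _ = add_software_metadata_analysis_pass(mg)
    mg, _ = add_hardware_metadata_analysis_pass(mg)
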
chop.passes.graph.analysis.autosharding
========================================

.. autosharding\_analysis\_pass
.. -------------------------------------
.. .. autofunction:: chop.passes.graph.analysis.autosharding.autosharding_analysis_pass

.. alpa\_autosharding\_pass
.. ---------------------------------------
.. .. autofunction:: chop.passes.graph.analysis.autosharding.alpa.alpa_autosharding_pass

.. alpa\_intra\_op\_sharding\_pass
.. ---------------------------------------
.. .. autofunction:: chop.passes.graph.analysis.autosharding.alpa_intra_operator.alpa_intra_op_sharding_pass

chop.passes.graph.analysis.init\_metadata
====================================================================

init\_metadata\_analysis\_pass
--------------------------------------------------------------

.. autofunction:: chop.passes.graph.analysis.init_metadata.init_metadata_analysis_pass

chop.passes.graph.pruning
===================================================

add\_pruning\_metadata\_analysis\_pass
--------------------------------------

.. autofunction:: chop.passes.graph.analysis.pruning.calculate_sparsity.add_pruning_metadata_analysis_pass

add\_natural\_sparsity\_metadata\_analysis\_pass
------------------------------------------------

.. autofunction:: chop.passes.graph.analysis.pruning.calculate_natural_sparsity.add_natural_sparsity_metadata_analysis_pass

hook\_inspection\_analysis\_pass
--------------------------------

.. autofunction:: chop.passes.graph.analysis.pruning.hook_inspector.hook_inspection_analysis_pass

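A minimal sketch of inspecting natural sparsity, assuming the ``(graph, pass_args)`` pass convention. The ``MaseGraph`` import location, the ``ToyNet`` model and the ``dummy_in``/``add_value`` keys are assumptions, and in a full pipeline the metadata passes documented under ``add_metadata`` would normally run first.

.. code-block:: python

    import torch
    from chop import MaseGraph
    from chop.passes.graph.analysis.pruning.calculate_natural_sparsity import (
        add_natural_sparsity_metadata_analysis_pass,
    )

    # Toy model used only for illustration.
    class ToyNet(torch.nn.Module):
        def __init__(self):
            super().__init__()
            self.fc = torch.nn.Linear(8, 8)

        def forward(self, x):
            return torch.relu(self.fc(x))

    mg = MaseGraph(ToyNet())

    # Records, per node, how many weight and activation entries are already
    # zero before any explicit pruning is applied ("natural" sparsity).
    mg, sparsity_info = add_natural_sparsity_metadata_analysis_pass(
        mg, pass_args={"dummy_in": {"x": torch.randn(1, 8)}, "add_value": False}
    )
    print(sparsity_info)
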
chop.passes.graph.calculate\_avg\_bits\_mg\_analysis\_pass
==========================================================

calculate\_avg\_bits\_mg\_analysis\_pass
----------------------------------------

.. autofunction:: chop.passes.graph.analysis.quantization.calculate_avg_bits.calculate_avg_bits_mg_analysis_pass

chop.passes.graph.analysis.report
=================================

report_graph_analysis_pass
---------------------------

.. autofunction:: chop.passes.graph.analysis.report.report_graph.report_graph_analysis_pass

report_node_hardware_type_analysis_pass
---------------------------------------

.. autofunction:: chop.passes.graph.analysis.report.report_node.report_node_hardware_type_analysis_pass

report_node_meta_param_analysis_pass
------------------------------------

.. autofunction:: chop.passes.graph.analysis.report.report_node.report_node_meta_param_analysis_pass

report_node_shape_analysis_pass
-------------------------------

.. autofunction:: chop.passes.graph.analysis.report.report_node.report_node_shape_analysis_pass

report_node_type_analysis_pass
------------------------------

.. autofunction:: chop.passes.graph.analysis.report.report_node.report_node_type_analysis_pass

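A minimal sketch, assuming the ``(graph, pass_args)`` pass convention; the ``MaseGraph`` import location and the ``ToyNet`` model are assumptions used only to give the pass something to report on.

.. code-block:: python

    import torch
    from chop import MaseGraph
    from chop.passes.graph.analysis.report.report_graph import (
        report_graph_analysis_pass,
    )

    # Toy model used only for illustration.
    class ToyNet(torch.nn.Module):
        def __init__(self):
            super().__init__()
            self.fc = torch.nn.Linear(8, 8)

        def forward(self, x):
            return torch.relu(self.fc(x))

    mg = MaseGraph(ToyNet())
    # Prints a summary of the traced FX graph (node list, op types, ...).
    mg, _ = report_graph_analysis_pass(mg)
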
chop.passes.graph.analysis.runtime
==================================

runtime\_analysis\_pass
-------------------------------

.. autofunction:: chop.passes.graph.analysis.runtime.runtime_analysis_pass

chop.passes.graph.analysis.statistical\_profiler.profile_statistics
====================================================================

profile\_statistics\_analysis\_pass
-----------------------------------

.. autofunction:: chop.passes.graph.analysis.statistical_profiler.profile_statistics.profile_statistics_analysis_pass

chop.passes.graph.analysis.verify.verify
========================================

verify\_metadata\_analysis\_pass
-----------------------------------------------------------

.. autofunction:: chop.passes.graph.analysis.verify.verify.verify_metadata_analysis_pass

verify\_common\_metadata\_analysis\_pass
-----------------------------------------------------------

.. autofunction:: chop.passes.graph.analysis.verify.verify.verify_common_metadata_analysis_pass

verify\_software\_metadata\_analysis\_pass
-----------------------------------------------------------

.. autofunction:: chop.passes.graph.analysis.verify.verify.verify_software_metadata_analysis_pass

verify\_hardware\_metadata\_analysis\_pass
-----------------------------------------------------------

.. autofunction:: chop.passes.graph.analysis.verify.verify.verify_hardware_metadata_analysis_pass

chop.dataset
====================

chop.dataset.nerf
-------------------------

.. automodule:: chop.dataset.nerf
   :members:
   :undoc-members:
   :show-inheritance:

chop.dataset.nlp
-------------------------

.. automodule:: chop.dataset.nlp
   :members:
   :undoc-members:
   :show-inheritance:

chop.dataset.physical
-------------------------

.. automodule:: chop.dataset.physical
   :members:
   :undoc-members:
   :show-inheritance:

chop.dataset.vision
-------------------------

.. automodule:: chop.dataset.vision
   :members:
   :undoc-members:
   :show-inheritance:

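A rough usage sketch. ``MaseDataModule``, its constructor arguments and the dataset and model names used here are assumptions about how the data modules are typically driven, not a documented interface; refer to the module listings above for the actual API.

.. code-block:: python

    from chop.dataset import MaseDataModule

    # Dataset and model names are placeholders; any supported pair works.
    data_module = MaseDataModule(
        name="cifar10",
        batch_size=64,
        model_name="resnet18",
        num_workers=0,
    )
    data_module.prepare_data()
    data_module.setup()

    # Standard LightningDataModule accessors give PyTorch dataloaders.
    train_loader = data_module.train_dataloader()
    x, y = next(iter(train_loader))
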