MONAI_Design
# Table of Contents
- Introduction
- Design philosophy
- Routes to using MONAI
- How MONAI adds value
- High level MONAI design
- Design pages
This page is the base page of the MONAI design document. If you intend to contribute to the MONAI code base, it is important that you understand the design and design philosophy of MONAI, so that we can keep MONAI coherent and successful as it grows.
This document is separated into a number of sections:
- Design philosophy - the type of framework we want MONAI to be and why
- High-level design - critical elements of design that sit at the architecture layer and above
- Design conventions - design and code-level conventions that should be adhered to if possible when considering new applications, networks, layers, mechanisms, etc.
- Application design conventions
- Network design conventions
- Design discussions - highlighting design decisions for optional extensions to the standard MONAI componentry
Our design philosophy is to build a framework that is, above all else, unopinionated. This is a central tenet of the MONAI philosophy, and it makes MONAI easy to approach for new users who come to the framework from different directions.
We want a design for MONAI that closely follows the philosophy of pytorch and ignite, the two deep learning (DL) packages on which MONAI is being built. When we use the term 'unopinionated', we mean the following:
- The design should not apply radically different design decisions to those made by pytorch and ignite, except where there is a very strong reason to do so
- The design should provide useful extensions to pytorch and ignite such that they should appear to be the natural ways that pytorch / ignite would provide such functionality
- The fundamental nature of pytorch / ignite should not be hidden from the MONAI user
- An existing code-base should be extensible using MONAI without requiring a rewrite to conform to the 'MONAI way'
Everything that we add in terms of layers, networks, applications and so forth should be as if they were written for those frameworks by those teams. This is the vanilla version of all functionality, and all functionality that we add should be present in vanilla form. This way, there is no impedance mismatch to be overcome by someone who wants to dip their toe into MONAI on a project.
Specialised functionality is provided to make a user's life easier by providing useful types, abstractions and capabilities that go above and beyond pytorch / ignite philosophy. All such functionality should be opt-in and, in general, wrap the vanilla functionality to maximise code-reuse. One such example is the creation of preprocessing wrappers that are specialised to given types.
We expect users to arrive at MONAI through one of three main routes. Our goal is to make it easy for them to use MONAI whichever way they choose to access it, and for them to be able to adopt MONAI gradually without ever facing a sudden, steep learning curve.
The user is initially using networks through configuration files and not interacting with the python API. Everything is done through modification of modular, intuitive configuration options:
- The user adopts a 'MONAIRun' configured network as the starting point and makes changes to the configuration file to use with their particular dataset, potentially with a set of pre-trained weights
- The user eventually wants to tweak the existing application and make new changes that are not reflected in the config file
The user has written networks in pytorch and wants to make use of MONAI functionality so that they don't have to write that code themselves. They want to use MONAI because it doesn't force them to rewrite their code-base:
- The user adds some piece of missing functionality. Likely candidates are:
- Pre-processing functionality, including use of asynchronous IO
- Adaptive learning rate functionality
- Network telemetry to their output of choice
- Encapsulated trainers / evaluators
- The user gradually adopts more and more elements of MONAI as they evolve their network sophistication, looking through the set of MONAI applications and reducing their line count as they adopt functionality
The user wants to write a network from scratch and wishes to use MONAI to do so:
- The user looks at existing applications and adapts one to their liking, or
- The user looks at existing applications and writes a new application from the ground up
MONAI adds value by making simple things easy and hard things possible. It does so in a way that is entirely compatible and in keeping with the design of pytorch itself, the ignite framework that sits on top of pytorch, and other extensions to pytorch such as torchvision. While MONAI should always represent a relatively thin, unopinionated wrapper around pytorch / ignite, there are particular areas where MONAI can easily add value.
Pre-processing is an important area where both Clara Train and NiftyNet have provided functionality that is conspicuously absent from tensorflow / pytorch. Sophisticated pre-processing pipelines are essential to achieving the best model performance, and they are hard to write, especially once we take into account complexities such as extending augmentation to 3D and 4D datasets. Implementing non-linear deformations, for example, is often outside the skill set of otherwise skilled deep learning practitioners. The same can be said of using asynchronicity to minimise the time spent waiting for IO to complete when reading source data from disk.
Pre-processing is also gateway functionality to a more comprehensive use of the MONAI framework by people who have existing pytorch codebases; it is largely additive in its usage. The underlying functionality should be exposed in a completely vanilla fashion, and then wrapped by more sophisticated handlers that result in natural-looking sequences of function calls. Under the hood, we can make use of generators to provide caching, asynchrony, and other mechanisms to limit IO latency. We can also provide typed wrappers that reduce noise by using types, such as MedicalImage, that bundle together related pieces of information.
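The layering described above can be sketched as follows. This is only an illustration under stated assumptions: `rescale_intensity`, `rescale_image` and `cached` are hypothetical names, the fields of `MedicalImage` are guesses at what such a type might bundle, and plain Python lists stand in for tensors so that the sketch has no dependencies.

```python
from dataclasses import dataclass, field


def rescale_intensity(pixels, out_min=0.0, out_max=1.0):
    """Vanilla function: linearly rescale a flat list of intensities."""
    lo, hi = min(pixels), max(pixels)
    if hi == lo:
        # A constant image has no range to stretch; map it to out_min.
        return [float(out_min)] * len(pixels)
    scale = (out_max - out_min) / (hi - lo)
    return [out_min + (p - lo) * scale for p in pixels]


@dataclass
class MedicalImage:
    """Hypothetical typed wrapper bundling pixels with related information."""
    pixels: list
    spacing: tuple = (1.0, 1.0)
    metadata: dict = field(default_factory=dict)


def rescale_image(image, out_min=0.0, out_max=1.0):
    """Opt-in wrapper: reuses the vanilla function and carries the rest along."""
    return MedicalImage(
        pixels=rescale_intensity(image.pixels, out_min, out_max),
        spacing=image.spacing,
        metadata=dict(image.metadata),
    )


def cached(load, keys):
    """Generator yielding load(key) per key, loading each distinct key once."""
    cache = {}
    for key in keys:
        if key not in cache:
            cache[key] = load(key)
        yield cache[key]
```

A user with an existing codebase can call `rescale_intensity` directly on their own data; the typed wrapper and the caching generator are additive and can be adopted later.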
One of ignite's key value propositions is that it provides engines; these are training and evaluation loops with a flexible set of hooks that allow a user to build applications in only a few lines of code, while retaining the ability to customise the behaviour of the engine through event handlers. These are accessed by natural-looking factory functions, and this is a good starting point on which to build a broader set of training and evaluation loops specific to medical applications.
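To make the engine idea concrete, here is a minimal, dependency-free sketch of a loop with event hooks. It is not ignite's API: `MiniEngine` and its string event names are hypothetical, and only the overall shape (a step function plus handlers attached to events) is meant to mirror the description above.

```python
from collections import defaultdict


class MiniEngine:
    """Hypothetical stand-in for an engine: a loop that fires events."""

    def __init__(self, step):
        self.step = step                    # called as step(engine, batch)
        self.handlers = defaultdict(list)   # event name -> list of handlers
        self.epoch = 0
        self.output = None

    def on(self, event):
        """Decorator that registers a handler for the named event."""
        def register(handler):
            self.handlers[event].append(handler)
            return handler
        return register

    def _fire(self, event):
        for handler in self.handlers[event]:
            handler(self)

    def run(self, data, max_epochs=1):
        for epoch in range(1, max_epochs + 1):
            self.epoch = epoch
            for batch in data:
                self.output = self.step(self, batch)
                self._fire("iteration_completed")
            self._fire("epoch_completed")
        return self.output


# The "training step" here just averages a batch; a real step would run
# a forward pass, a loss computation and an optimiser update.
engine = MiniEngine(lambda eng, batch: sum(batch) / len(batch))
log = []


@engine.on("epoch_completed")
def record(eng):
    log.append((eng.epoch, eng.output))


engine.run([[1, 2, 3], [4, 6]], max_epochs=2)
```

The loop itself never changes; behaviour such as logging, checkpointing or validation is added by attaching handlers, which is what makes the components composable.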
One of the design goals for MONAI is that, as we add to the set of engines that are available for training and evaluation, we continuously refactor towards a useful set of composable components by which new engine training and evaluation loops are built.
- TODO: whole paragraph is vague, refine and add content
There are several techniques in deep learning that can be considered as higher-order deep learning techniques. Ensembling and hyperparameter searches, for example, can be implemented either as pre-processing style functionality or as engine helpers that iteratively run engines themselves.
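As a sketch of the engine-helper style, a hyperparameter search can be written as a generator of candidate settings plus a helper that runs one trial per candidate. The names `grid` and `search` are hypothetical, and `run_trial` stands in for whatever builds and runs an engine and returns a validation score (lower is better here, which is an assumption).

```python
from itertools import product


def grid(space):
    """Yield one dict per combination of the values listed in `space`."""
    names = list(space)
    for values in product(*(space[name] for name in names)):
        yield dict(zip(names, values))


def search(run_trial, candidates):
    """Run one trial per candidate; return (best_score, best_params) or None."""
    best = None
    for params in candidates:
        score = run_trial(params)
        if best is None or score < best[0]:
            best = (score, params)
    return best


# Toy objective standing in for "train an engine and report validation loss".
def run_trial(params):
    return (params["lr"] - 0.01) ** 2 + params["dropout"]


best = search(run_trial, grid({"lr": [0.1, 0.01], "dropout": [0.0, 0.5]}))
```

Because candidates are produced lazily, the same `search` helper works unchanged with a random or adaptive generator in place of `grid`.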
Mainstream features such as adaptive learning rates are also candidate pieces of composable engine functionality.
- TODO: code examples and more detail for all of the following subsections
In order to meet the design goals set down by the user pathways, the MONAI codebase should demonstrate what we consider to be best-practice uses of the API. One of its strengths should be the rich set of applications that are developed through the components that we build, as they represent exemplars for others to follow. There are a number of aspects to be considered here:
- Consistent validation of parameters
- Clear and consistent reporting of validation failures
- Consistent network configuration capabilities and conventions for doing so
- Consistent network customisation capabilities and conventions for doing so
Consistent validation of parameters is very important in python modules; duck typing means that errors are often found far from the API call site at which they might first have been validated, which leads to obscure exceptions and unnecessary time spent debugging. Writing validation code for every function is onerous, but validation helper functions for all of the common parameter types are a simple solution. These are function calls that perform sensible checks on parameters based both on their types and on the concepts that they represent. For example, a float parameter must be a floating point value of some description, but a learning rate is a float that must also be positive (under normal circumstances).
- TODO: more examples
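The float / learning rate example above might look like the following. This is a minimal sketch, not an existing MONAI API: the helper names and the message format are assumptions. The concept-level check is built on top of the type-level check, so both report failures in the same way.

```python
import math
from numbers import Real


def validate_float(name, value):
    """Type-level check: accept any finite real number and return it as a float."""
    # bool is a subclass of int, so it is rejected explicitly.
    if isinstance(value, bool) or not isinstance(value, Real):
        raise TypeError(
            f"'{name}' must be a real number, got {type(value).__name__}"
        )
    value = float(value)
    if not math.isfinite(value):
        raise ValueError(f"'{name}' must be finite, got {value!r}")
    return value


def validate_learning_rate(name, value):
    """Concept-level check: a learning rate is a float that is also positive."""
    value = validate_float(name, value)
    if value <= 0.0:
        raise ValueError(f"'{name}' must be positive, got {value!r}")
    return value


def make_optimiser_settings(learning_rate):
    """Hypothetical call site: validate at the API boundary, then proceed."""
    return {"lr": validate_learning_rate("learning_rate", learning_rate)}
```

A call such as `make_optimiser_settings("0.1")` then fails immediately with a message naming the offending parameter, rather than failing later inside the training loop.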
Another benefit to the provision of parameter validation methods is that they can also ensure consistency and clarity in the reporting of API contract violations.
- TODO: more detail required
Network configuration & customisation convention / consistency
Networks that become part of MONAI should be built to what we consider to be current best practices with respect to configurability. This configurability should be presented and used in consistent ways across networks, where applicable. The following aspects are all considered configurable:
- Channel counts
- 2D / 3D versions of networks *1
- Weight initialisation
- Normalisation layer
- Dropout
- Activation
- Number of layers in layered networks
- Block repeat counts
- Convolutional block definition / override
- *1 TODO: potentially controversial; discuss
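One way to present these options consistently is a single, validated configuration object shared across networks. The sketch below is an assumption for discussion, not a settled design: `NetworkConfig`, its field names and its default values are all hypothetical, and it records choices as plain values rather than constructing any pytorch layers.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class NetworkConfig:
    """Hypothetical bundle of the configurable aspects listed above."""
    spatial_dims: int = 2            # 2D / 3D version of the network
    channels: tuple = (16, 32, 64)   # channel count per layer; length = layer count
    block_repeats: int = 1           # how many times each block is repeated
    norm: str = "batch"              # normalisation layer
    activation: str = "relu"         # activation
    dropout: float = 0.0             # dropout probability
    weight_init: str = "kaiming"     # weight initialisation scheme

    def __post_init__(self):
        if self.spatial_dims not in (2, 3):
            raise ValueError(f"'spatial_dims' must be 2 or 3, got {self.spatial_dims!r}")
        if not self.channels or any(c <= 0 for c in self.channels):
            raise ValueError(f"'channels' must be non-empty and positive, got {self.channels!r}")
        if self.block_repeats < 1:
            raise ValueError(f"'block_repeats' must be >= 1, got {self.block_repeats!r}")
        if not 0.0 <= self.dropout < 1.0:
            raise ValueError(f"'dropout' must be in [0, 1), got {self.dropout!r}")


# Customisation starts from the defaults and overrides only what differs.
default_config = NetworkConfig()
volumetric_config = replace(default_config, spatial_dims=3, dropout=0.1)
```

Because `replace` builds a new object through the normal constructor, every customised configuration goes through the same validation as the defaults.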
This section documents the concrete designs / design proposals for different aspects of MONAI.
Each concrete design aspect is covered in its own sub-page, linked below:
- TODO: fill this section out, including examples
Vanilla pytorch networks are often defined to work with a specific image format. When a network is required to function with a number of different image formats, there is no pytorch mechanism for doing so.
Typically, such configuration should happen at model instantiation time rather than model execution time.
We can use factories / factory methods to do this effectively, and default to 4D BFHW if not specified.
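A factory that fixes the image format at instantiation time could be sketched as below. Several things here are assumptions: the name `conv_factory`, the `"BFDHW"` string for a 3D layout, and the `ConvSpec` stand-in, which only records what would be built. With pytorch, the factory would return `nn.Conv2d` or `nn.Conv3d` instances instead.

```python
from dataclasses import dataclass

# Number of spatial dimensions implied by each layout string.
# "BFHW" is the default named in the text; "BFDHW" is an assumed 3D counterpart.
SPATIAL_DIMS = {"BFHW": 2, "BFDHW": 3}


@dataclass(frozen=True)
class ConvSpec:
    """Stand-in for a real convolution layer; it only records the choices made."""
    spatial_dims: int
    in_channels: int
    out_channels: int
    kernel_size: tuple


def conv_factory(image_format="BFHW"):
    """Return a constructor for convolutions matching the given image format."""
    try:
        dims = SPATIAL_DIMS[image_format]
    except KeyError:
        known = ", ".join(sorted(SPATIAL_DIMS))
        raise ValueError(
            f"'image_format' must be one of {known}, got {image_format!r}"
        ) from None

    def make_conv(in_channels, out_channels, kernel_size=3):
        return ConvSpec(dims, in_channels, out_channels, (kernel_size,) * dims)

    return make_conv


# The format is chosen once, when the model is built, not on every forward pass.
make_conv = conv_factory()            # defaults to BFHW
first_layer = make_conv(1, 8)
volumetric_layer = conv_factory("BFDHW")(1, 8)
```

The network definition only ever calls `make_conv`, so the same definition serves both formats without any branching at execution time.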
Pre-processing steps
- Pre-preprocessing and one time caching
- Patch vs image wide preprocessing
- 'Data echoing' https://arxiv.org/abs/1907.05550
- Batch level augmentation https://arxiv.org/abs/1901.09335
Losses
- Selection from across Clara Train / NiftyNet custom losses
Networks
- Neural architecture search
- Ensembling - see also engines / engine helpers
Engines
- Q learning
- PixelRNN / PixelCNN
- Features
- Mini batching - along with pre-processing
- Mini-batch persistency https://arxiv.org/abs/1806.07353
Engine Helpers
- Mini batching - along with pre-processing
- Ensembling
- Network meta training
- Hyper parameter search - generator based
- Federated learning
- Learning without forgetting
Adaptive losses
- For Asynchronicity - MR vs CT different preprocessing pipelines - Need for modularity
- Conditional training
- Data sink solutions (csv, images, tables)
- Allow for different processing (folder wise, individual subject)
Optimisers
- Medical imaging specific optimisers
- Hybrid optimisers