MIOpen v2.0.0

@daniellowell daniellowell released this 08 Jul 17:30
· 1798 commits to master since this release

Notes:

  • This release contains several new features including an immediate mode for selecting convolutions, bfloat16 support, new layers, modes, and algorithms.
  • MIOpenDriver, a tool for benchmarking and developing kernels, is now shipped with MIOpen.
  • BFloat16 is now supported in HIP and requires an updated rocBLAS as a GEMM backend.
  • Immediate mode API now provides the ability to quickly obtain a convolution kernel.
  • MIOpen now contains HIP source kernels and implements the ImplicitGEMM kernels. This is a new feature and is currently disabled by default. Use the environment variable "MIOPEN_DEBUG_CONV_IMPLICIT_GEMM=1" to activate this feature. ImplicitGEMM requires a HIP version of at least 1.5.9211.
  • A new "loss" category of layers has been added, of which CTC loss is the first. See the API reference for more details.
  • 2.0 is the last release with active support for gfx803 architectures. In future releases, MIOpen will not actively debug and develop new features specifically for gfx803.
  • The System Find-Db in-memory cache is disabled by default. Please see the build instructions to enable this feature.
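
The ImplicitGEMM opt-in noted above is normally done by exporting the variable in the shell before launching the application. As a rough sketch, the same thing can be done from inside a process. The helper names below are hypothetical and not part of MIOpen, the code assumes a POSIX system that provides `setenv`, and it assumes the variable has to be set before the first MIOpen call is made.

```cpp
#include <stdlib.h>  // setenv (POSIX), getenv
#include <string.h>  // strcmp

// Hypothetical helper, not part of MIOpen: opt this process in to the
// ImplicitGEMM kernels by setting the variable named in the release notes.
// Assumption: this runs before the first MIOpen call in the process.
bool enable_implicit_gemm() {
    // The third argument (1) overwrites any value that is already set.
    return setenv("MIOPEN_DEBUG_CONV_IMPLICIT_GEMM", "1", 1) == 0;
}

// Hypothetical helper: report whether the variable is currently set to "1".
bool implicit_gemm_requested() {
    const char* value = getenv("MIOPEN_DEBUG_CONV_IMPLICIT_GEMM");
    return value != nullptr && strcmp(value, "1") == 0;
}
```

Calling `enable_implicit_gemm()` at the top of `main`, before any MIOpen handle is created, would mirror exporting the variable in the shell.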

Changes:

  • Added support for bfloat16 datatype in convolutions
  • Added softmax channel mode and new softmax version 2 API
  • Added fast / accurate / log softmax algorithms
  • Added new implicit GEMM convolution algorithm for forward and backwards data passes, disabled by default
  • Added int32 datatype support for output tensors in int8 convolutions
  • Added immediate mode for finding the best convolution kernel for a given configuration
  • Added a Find-Db infrastructure which stashes results of find on a user's system
  • Added a shipped System Find-Db containing offline-run Find() results
  • Added an additional, faster batch norm assembly kernel for fp16
  • Added CTC loss layer
  • Added MIOpenDriver as a default component in MIOpen's build #34
  • Fixed C compatibility for boolean types in C API #103
  • Fixed incorrect calculation in per-activation batch norm backwards pass #104
  • Fixed bug #95 with asm batch norm ISA
  • Fixed IsApplicable bug in Conv3x3Asm for group convolutions
  • Improved performance of 1x1 stride 2 fp32 convolutions in the forward and backwards data passes
  • Improved 3-D convolution stability
  • Improved applicability of direct convolution backwards weights for 2x2, 5x10, and 5x20 filter sizes
  • Improved maintainability in kernels and cpp code
  • Updated rocBLAS minimum version to branch master-rocm-2.6
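
To illustrate how the fast, accurate, and log softmax algorithms listed above differ numerically, here is a small CPU reference sketch. It is not MIOpen code, and the enum and function names are hypothetical. It assumes that "fast" means the direct formula, "accurate" means subtracting the maximum input before exponentiating, and "log" means returning the logarithm of the softmax.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical names for illustration only; not the MIOpen API.
enum class SoftmaxAlgo { Fast, Accurate, Log };

// Softmax over a single non-empty vector of values.
std::vector<double> softmax(const std::vector<double>& x, SoftmaxAlgo algo) {
    assert(!x.empty());
    // Fast: exponentiate the inputs directly.
    // Accurate and Log: subtract the maximum first so exp() cannot overflow.
    const double shift = (algo == SoftmaxAlgo::Fast)
                             ? 0.0
                             : *std::max_element(x.begin(), x.end());
    double sum = 0.0;
    for (double v : x) sum += std::exp(v - shift);

    std::vector<double> y(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        if (algo == SoftmaxAlgo::Log)
            y[i] = (x[i] - shift) - std::log(sum);  // log of the softmax value
        else
            y[i] = std::exp(x[i] - shift) / sum;
    }
    return y;
}
```

For moderate inputs the fast and accurate variants agree to rounding error. For very large inputs the direct formula can overflow in `exp`, which is the case the max-subtracted form avoids.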