Baptiste Wicht edited this page May 10, 2017 · 3 revisions

ETL can be configured with several macros:

  • ETL_VECTORIZE_IMPL: Enable vectorized implementations of some algorithms
  • ETL_VECTORIZE_EXPR: Allow ETL to vectorize all expressions when possible (for instance r = a + b / c)
  • ETL_VECTORIZE_FULL: Shortcut for ETL_VECTORIZE_IMPL and ETL_VECTORIZE_EXPR
  • ETL_BLAS_MODE: Allow the use of a BLAS library (through cblas). A BLAS library must be installed and compilation flags must be configured accordingly.
  • ETL_MKL_MODE: Allow the use of Intel MKL-specific functions to tune some algorithms (currently only in-place matrix transposition and FFT)
  • ETL_STRICT_DIV: Prevent ETL from using multiplication to perform division
  • ETL_NO_UNROLL_VECT: Tell ETL not to unroll vectorized loops
  • ETL_NO_UNROLL_NON_VECT: Tell ETL not to unroll normal loops
  • ETL_CONV_VALID_FFT: Allow ETL to use FFT for valid convolutions
  • ETL_PARALLEL: Allow ETL to automatically parallelize some expressions
  • ETL_CUBLAS_MODE: Allow the use of the NVIDIA cuBLAS library. cuBLAS must be installed and compilation flags must be configured accordingly.
  • ETL_CUFFT_MODE: Allow the use of the NVIDIA cuFFT library. cuFFT must be installed and compilation flags must be configured accordingly.
  • ETL_CUDNN_MODE: Allow the use of the NVIDIA cuDNN library. cuDNN must be installed and compilation flags must be configured accordingly.
  • ETL_MAX_WORKSPACE: Set the maximum amount of memory (in bytes) that ETL is allowed to allocate to speed up some operations
  • ETL_CUDNN_MAX_WORKSPACE: Set the maximum amount of GPU memory (in bytes) that ETL is allowed to allocate to speed up some CUDNN operations
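
As a rough illustration of how the macros listed above are typically supplied, the sketch below defines two of them before the point where the library header would be included, then reads them back with the preprocessor. It is not ETL's actual implementation: the include path, the workspace value, the fallback value, and the `cfg_*` constant names are all assumptions made for this example. In a real build the macros would more commonly be passed on the compiler command line (for instance `-DETL_VECTORIZE_FULL`).

```cpp
// Sketch only: mimics how a header-only library can read configuration macros.
// The macros are defined here for illustration; in practice they are usually
// given as compiler flags such as -DETL_VECTORIZE_FULL.
#define ETL_VECTORIZE_FULL
#define ETL_MAX_WORKSPACE (64UL * 1024 * 1024) // hypothetical value: 64 MiB, in bytes

// Configuration macros must be visible before the library header is included.
// #include "etl/etl.hpp" // assumed include path, commented out so the sketch stands alone

#include <cstddef>

// Hypothetical expansion of the ETL_VECTORIZE_FULL shortcut described in the list.
#ifdef ETL_VECTORIZE_FULL
#  ifndef ETL_VECTORIZE_IMPL
#    define ETL_VECTORIZE_IMPL
#  endif
#  ifndef ETL_VECTORIZE_EXPR
#    define ETL_VECTORIZE_EXPR
#  endif
#endif

// Hypothetical constants (not part of ETL) that turn the macros into typed values.
#ifdef ETL_VECTORIZE_IMPL
constexpr bool cfg_vectorize_impl = true;
#else
constexpr bool cfg_vectorize_impl = false;
#endif

#ifdef ETL_VECTORIZE_EXPR
constexpr bool cfg_vectorize_expr = true;
#else
constexpr bool cfg_vectorize_expr = false;
#endif

#ifdef ETL_STRICT_DIV
constexpr bool cfg_strict_div = true;
#else
constexpr bool cfg_strict_div = false;
#endif

#ifdef ETL_MAX_WORKSPACE
constexpr std::size_t cfg_max_workspace = ETL_MAX_WORKSPACE;
#else
constexpr std::size_t cfg_max_workspace = 0; // placeholder, not ETL's real default
#endif

// The shortcut should have switched on both vectorization options.
static_assert(cfg_vectorize_impl && cfg_vectorize_expr,
              "ETL_VECTORIZE_FULL implies ETL_VECTORIZE_IMPL and ETL_VECTORIZE_EXPR");
```

Because the constants are `constexpr`, a wrong or missing flag shows up at compile time rather than as a silent change in runtime behavior. Removing the first `#define` and compiling again would make the `static_assert` fail, which is a quick way to check that a build system really forwards the intended flags.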